Data Analytics

At XRCI, the Data Analytics lab converts data into actionable decisions and creates innovative solutions that touch the lives of people in positive ways. The research group studies new machine learning algorithms and statistical techniques for making predictions and for understanding the large amounts of unstructured data generated across diverse Xerox Services businesses, with the aim of deriving new actionable insights.
The group looks at challenging research problems in core machine learning and statistical techniques, speech and signal processing, text and graph analytics, and vision and multimedia analysis, in the context of the healthcare, transportation, education and customer care application domains.
The Data Analytics lab has excellent researchers in machine learning, statistics, stochastic analysis, graph algorithms, speech and signal processing, image and video processing, and text analytics. The team publishes regularly and helps organize prestigious conferences in its areas of expertise. The lab also has several collaborative projects with academia in India and abroad. The following are the key areas of research within analytics:

Project Themes

  • Healthcare
  • Transportation
  • Education
  • Audio Analytics
  • Big Data Analytics
  • Making Analytics Simpler
  • Collaborations

Analytics for healthcare

Analytics can play an important role in healthcare services by changing the way hospitals work. Analytics is used extensively in the areas of clinical decision support, treatment planning, and preventive medical care. XRCI’s healthcare systems use novel statistical models and algorithms for disease risk modelling, personalized treatment planning, population-level analysis, analysis of physiological signals and prediction of emerging medical complications.

The Data Analytics lab works on novel methods for monitoring health vitals and for disease diagnostics. The research group has developed technology for remote, non-contact, non-intrusive capture of body vitals by using camera-based devices and applying sophisticated signal processing and image/video processing to the data stream.

The Data Analytics team is investigating the use of different types of cameras – simple webcam, thermal camera, depth camera and hyperspectral camera – for detecting respiratory rate, heart rate, oxygen saturation and other parameters. The lab’s flagship project is a low-cost, early-warning, non-intrusive breast cancer detection system, which is intended to help save women’s lives through earlier detection.
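
To illustrate the kind of signal processing that camera-based heart-rate detection involves, here is a minimal sketch. It is not XRCI's method: it assumes the video has already been reduced to one average skin-brightness value per frame, and it simply picks the strongest frequency in a plausible heart-rate band. The function name, the band limits and the synthetic trace are all hypothetical.

```python
import math

def estimate_heart_rate_bpm(samples, fps, min_bpm=40, max_bpm=180):
    """Return the dominant frequency (beats per minute) of a brightness trace."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant (DC) level
    best_bpm, best_power = None, -1.0
    for bpm in range(min_bpm, max_bpm + 1):
        freq_hz = bpm / 60.0
        # Project the trace onto a cosine and a sine at this candidate frequency
        re = sum(c * math.cos(2 * math.pi * freq_hz * i / fps)
                 for i, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * freq_hz * i / fps)
                 for i, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm

# Synthetic trace (assumed data): 10 s at 30 fps, a 1.2 Hz (72 bpm) pulse
# riding on a slow brightness drift.
fps = 30
trace = [100 + 0.5 * math.sin(2 * math.pi * 1.2 * i / fps) + 0.002 * i
         for i in range(300)]
estimated = estimate_heart_rate_bpm(trace, fps)
```

On this synthetic trace the estimate should land at or very near 72 bpm. A real system would also need face or skin tracking, motion compensation and noise filtering, none of which is shown here.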

Analytics for transportation

Analytics is used extensively to provide insights and solve problems related to public transport systems. XRCI’s solutions are developed by applying sophisticated statistical models to data collected from the transport systems. These models are also used to predict future events, which supports decision-making in real time.

XRCI has combined novel optimization techniques with stochastic models to prescribe efficient schedules and route plans for public transportation. The system uses video as a smart sensor for information that is not electronically available (such as traffic information, demand, etc.).

Analytics for education

Quality education is one of the pressing needs of emerging markets, particularly India. XRCI believes that technology-enabled Massive Open Online Courses (MOOCs) and Open Education Resources (OERs) can be utilized to provide a personalized educational experience. With this as the focus of current research, XRCI is looking at ways of enabling the right content to reach students with diverse learning abilities using simpler devices.

Audio analytics

XRCI is developing technology to understand speech and audio using state-of-the-art speech recognition systems that are heavily based on Hidden Markov Models with Gaussian Mixture Models (HMM-GMM) and on new acoustic modeling paradigms. The primary goal of the study is to enable high-performance, actionable, large-scale speech analytics at medium to deep granularity on large call volumes.
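
As a rough illustration of the HMM-GMM idea mentioned above, the sketch below decodes a toy sequence of per-frame energy values into "silence" and "speech" states. Each state scores a frame with a one-dimensional Gaussian mixture, and Viterbi decoding finds the most likely state sequence. All state names, probabilities, mixture parameters and frame values are made-up assumptions for the example; real recognizers use multi-dimensional acoustic features and far larger models.

```python
import math

def gmm_log_likelihood(x, components):
    """Log-density of scalar x under a 1-D Gaussian mixture.

    components: list of (weight, mean, variance); weights sum to 1.
    """
    total = 0.0
    for weight, mean, var in components:
        density = math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
        total += weight * density
    return math.log(total + 1e-300)  # guard against log(0)

def viterbi(observations, states, log_start, log_trans, emissions):
    """Most likely hidden-state sequence for the observations."""
    # scores[s] = best log-probability of any path ending in state s
    scores = {s: log_start[s] + gmm_log_likelihood(observations[0], emissions[s])
              for s in states}
    backpointers = []
    for x in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: scores[p] + log_trans[p][s])
            new_scores[s] = (scores[prev] + log_trans[prev][s]
                             + gmm_log_likelihood(x, emissions[s]))
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state
    state = max(states, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Toy model (all numbers are illustrative assumptions)
states = ["silence", "speech"]
log_start = {"silence": math.log(0.8), "speech": math.log(0.2)}
log_trans = {
    "silence": {"silence": math.log(0.9), "speech": math.log(0.1)},
    "speech": {"silence": math.log(0.1), "speech": math.log(0.9)},
}
emissions = {
    "silence": [(1.0, 0.0, 1.0)],                  # single Gaussian
    "speech": [(0.5, 4.0, 1.0), (0.5, 7.0, 1.5)],  # two-component mixture
}
frame_energies = [0.1, -0.3, 0.4, 4.2, 6.8, 5.1, 0.2]
decoded = viterbi(frame_energies, states, log_start, log_trans, emissions)
```

With these toy parameters the low-energy frames are labeled "silence" and the three high-energy frames "speech". Working in log-probabilities avoids numerical underflow on long sequences.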

In addition to traditional automatic speech recognition, XRCI is interested in understanding the meta information available in a live conversation or discourse and the adaptation of language and acoustic models to suit a speaker and the domain of discourse.

Big Data analytics platform

The term Big Data refers to the massive volume of digital information, available in both structured and unstructured form, that is integrated from multiple, diverse, and dynamic sources. Big Data is commonly defined as data that exhibits the 4-V properties: volume, velocity, variety and veracity. Big Data applications can be implemented in almost all spheres (business, government, consumer) and in application areas such as web mining, social media analytics, mobile data personalization, customer care, healthcare, eGovernance and data center management.

Big Data provides an opportunity for intelligent algorithms to analyze data and generate useful, actionable insights with previously unachievable levels of sophistication, speed and accuracy, which makes the combination of Big Data and analytics increasingly important. At XRCI, in addition to using popular technologies for crunching Big Data such as NoSQL, Hadoop and MapReduce, researchers are looking at different ways of representing and analyzing Big Data to make it amenable to valuable insight discovery.
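
For readers unfamiliar with the MapReduce model mentioned above, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to word counting. It only illustrates the programming model and does not use Hadoop or any XRCI system; the function names and the sample records are assumptions made for the example.

```python
from collections import defaultdict

def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1

def shuffle(pairs):
    """Group all emitted values by key, as the framework would between stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)

# Hypothetical input records
records = ["late bus on route 5", "bus full on route 5", "late train"]
mapped = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

In a real cluster the map and reduce calls run in parallel on different machines, and the shuffle step moves data between them; here everything runs in one process to keep the structure visible.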

Making analytics simpler to use

At Xerox Research, an important goal is an easy-to-use, scalable and efficient mechanism for performing data analysis that provides insights and answers to business intelligence questions. Such a service would be valuable to both experts – by automating much of their work – and non-experts – by providing a mechanism to solve problems without having to gain analytics expertise. The system internally calculates the hardware and software resources required to solve the problem optimally within the given cost and time constraints. The results of the analysis can be easily interpreted by an expert, who can then recommend ways to improve the customer’s business.


Collaborations

The Analytics lab at XRCI collaborates with universities through internship programs, faculty sabbaticals, formal joint collaboration projects and pilots. XRCI is currently working closely with a number of academic institutes, including Indian Institute of Science, Indian Institute of Technology (Delhi & Bombay), University of Helsinki, University of Texas Austin, Massachusetts Institute of Technology (MIT) and Sai Vidya Institute of Technology (SVIT). These projects are in areas such as machine learning, speech processing, semantic analysis and cloud computing, and include piloting of XRCI’s novel education solutions. XRCI also collaborates with hospitals and medical institutes such as Manipal Kasturba University, St. John’s Medical College and cancer hospitals to develop realistic algorithms for disease diagnostics.