DCAL Data Centre & Analytics Lab

Operation Theatre Scheduling and Optimization - 1st Place

Problem statement

An operation theatre is one of the most important divisions of a hospital. As a result, scheduling the surgeries to be performed in the operation theatres (OTs) is a critical daily operational task. Currently at Columbia Asia Hospital, Hebbal (CAH), OT scheduling is done manually. This is a time-consuming and inefficient process that results in suboptimal use of the available OTs and resources.

The objective of this project is to optimize Operation Theatre (OT) scheduling so that CAH can manage high volumes of patients, reduce the inefficiencies of manual processes and ensure optimal utilization of the OTs. OT optimization is intended to help CAH achieve the following goals:

Work more efficiently: use existing capacity better, allocate resources better
Understand the trend of clinical surgeries and the utilization of OTs
Draw up strategies to increase the trend in clinical surgeries
Add capacity: physical space, staff, equipment, specialties, doctors

The process of OT scheduling broadly consists of two steps. The first is a day-wise or week-wise plan that assigns a specific date and time to each patient awaiting surgery; the second involves sequencing and scheduling the surgeries on a given date in the available OTs.

Approach

A descriptive analysis of the OT data was carried out to understand the dynamics of the current OT scheduling process. This analysis, together with descriptive statistics, gave further insight into the data and helped in providing actionable suggestions to CAH.

Forecasting number of surgeries

Basic monthly and weekly forecasting was done to predict the number of surgeries to be performed at CAH. This forecast is taken as an input to the scheduling model.

Optimization

The project covers both scheduling and optimization. For scheduling, the first task is to assign each surgery to an OT and then sequence the surgeries so that the overall schedule is optimized, which is the primary goal of the model. A mathematical constraint/goal programming model was developed, factoring in constraints such as emergency cases, clinical priority, availability of equipment and doctors, OPD schedule, surgery difficulty and post-surgery maintenance time.

To include the secondary goal, the objective function was modified to introduce the cost of surgery. This ensured that, when surgeries contended for the same slot, the surgeries with the highest revenue were scheduled, thus maximizing the overall revenue. This was done to test the model with respect to cost parameters.

To solve this scheduling problem, CP Optimizer (CPO) from IBM CPLEX was used because it is very powerful for scheduling problems of this nature. CPO models are easy to write and run as declarative mathematical models. Concepts such as intervals and functions make the model easy to build, fast and maintainable, so CPO is a good out-of-the-box solution for real-world scheduling problems.

Outcome

The proposed solution was built and tested with simulated data for the following scenarios:

As-is manual schedule for a highly utilized week was run through our model and the results compared
Tested for scalability with a 100% increase in surgery volume
Tested for scalability with double the number of OTs
Tested for OPD schedule flexibility of doctors

A comparison between the base and simulated models was conducted to demonstrate scalability. The optimal schedules, presented as Gantt charts, showed that the model is scalable and dynamic enough to support an increase in the number of surgeries as well as additional OTs.

Impact

In conclusion, the proposed solution will enable hospitals to increase patient satisfaction by reducing wait time while effectively utilizing the OTs with limited resources. The implications of this project are as follows:

Increased service rate (number of patients served)
No-wait flow process (waiting time reduction)
Increased resource utilization by closely packing surgeries
Dynamism in every aspect (start and end times of surgeries, equipment availability, emergency prioritization, surgery type prioritization, OT-surgery preference, support for flexible doctor schedules, support for flexible OT working hours per day, support for flexible recovery time for supra-major surgeries)
Revenue analysis for the increase in surgeries: because resources such as OTs, equipment and staff stay the same while revenue increases, profitability increases even more

Forecasting of spare parts for Spicejet - 2nd Place

Problem statement

According to the International Air Transport Association (IATA) Maintenance Cost Task Force, airlines spend around USD 67.6 billion on MRO, representing 9.5% of total operational cost. A significant part of this is attributed to spare parts. SpiceJet, a leading airline in India, undertook this project to optimize the cost of spare parts while ensuring the airworthiness of the fleet. Availability of the right spare parts during scheduled and unscheduled maintenance drives the availability of aircraft for flight. SpiceJet is also expanding, adding new aircraft and routes. With this strategic decision, new factors are expected to come into play that affect the maintenance schedules of the aircraft and the consumption pattern of spare parts. SpiceJet would like to build models that forecast the annual consumption of spare parts and the minimum level of inventory it should maintain.

Approach

Aircraft use thousands of parts. The objective of this study is to categorize the spare parts based on their consumption pattern. Multiple techniques were used for selection of parts: FSN, VED, SDE and HML. Further, the Syntetos and Boylan scheme was used to categorize parts by demand pattern. Within the identified categories, exemplary parts were picked to verify forecasting accuracy. The past consumption trends of the selected parts were used to predict consumption for the subsequent months. Different forecasting techniques were used to build the models, and the best model was selected based on predefined criteria.
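
As an illustration of the Syntetos and Boylan categorization, the sketch below classifies a demand history using the average inter-demand interval (ADI) and the squared coefficient of variation (CV²) of the non-zero demand sizes, with the commonly cited cut-offs of 1.32 and 0.49. How the project computed these quantities is not stated in the source; estimating ADI as periods divided by non-zero periods and using the population standard deviation are assumptions, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def classify_demand(series, adi_cut=1.32, cv2_cut=0.49):
    """Return 'smooth', 'erratic', 'intermittent' or 'lumpy' for a demand history."""
    nonzero = [x for x in series if x > 0]
    if not nonzero:
        return "no demand"
    # ADI: average number of periods per non-zero demand (assumed estimator).
    adi = len(series) / len(nonzero)
    # CV^2 of the non-zero demand sizes.
    cv2 = (pstdev(nonzero) / mean(nonzero)) ** 2
    if adi < adi_cut:
        return "smooth" if cv2 < cv2_cut else "erratic"
    return "intermittent" if cv2 < cv2_cut else "lumpy"


if __name__ == "__main__":
    print(classify_demand([5, 6, 5, 4, 6, 5]))
    print(classify_demand([0, 0, 3, 0, 0, 3, 0, 3]))
```

The category then suggests a forecasting family: smoothing methods tend to suit smooth demand, while intermittent and lumpy parts are usually better served by methods designed for sparse demand.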

For consumables and expendables, the following techniques were used:

SES (Single Exponential Smoothing): For parts with no trend
Holt's Method: For parts with trend
Holt-Winters: For parts with trend and seasonality
ARIMA, Auto ARIMA: For data with non-stationarity
ARIMAX: Data with non-stationarity and influencing factors
Dynamic regression: Forecast using predictive variables
Croston's Method: For forecasting of parts with intermittent demand pattern
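
Two of the techniques listed above are small enough to sketch in plain Python: single exponential smoothing, and Croston's method, which smooths the non-zero demand sizes and the intervals between them separately and forecasts their ratio. This is a minimal textbook-style sketch, not the project's implementation; the smoothing constants and the initialization with the first observation are assumptions.

```python
def ses(series, alpha=0.3):
    """Single exponential smoothing: returns the one-step-ahead forecast."""
    level = series[0]  # initialize with the first observation (assumption)
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


def croston(series, alpha=0.1):
    """Croston's method: returns expected demand per period, or None if no demand."""
    z = p = None  # smoothed demand size and smoothed inter-demand interval
    q = 1         # periods elapsed since the last non-zero demand
    for x in series:
        if x > 0:
            if z is None:
                z, p = float(x), float(q)  # initialize at the first demand
            else:
                z += alpha * (x - z)
                p += alpha * (q - p)
            q = 1
        else:
            q += 1
    return None if z is None else z / p


if __name__ == "__main__":
    print(ses([12, 14, 13, 15, 14]))
    print(croston([0, 0, 4, 0, 0, 4, 0, 0, 4]))
```

Both estimates in Croston's method are updated only in periods with demand, which is what keeps long runs of zeros from dragging the forecast toward zero.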

The following features were used in the ARIMAX and dynamic regression models:

Flying Hours per month
Spare parts used across years
No. of cycles
Avg. age of aircraft by month
No. of aircraft

Although in theory these techniques can also be used to predict rotables, this work takes an explanatory approach to the rotables. Past studies have applied ANN techniques to the time series elements to predict spare parts. In this study, the past data is used to extract features. These features are then used with ANN, random forest and Bayesian networks to predict spares consumption. One exemplary part is picked to demonstrate the methodology.

Outcome

The models built for consumables and expendables gave an error (MAPE) ranging from 7% to 25%. From an annual forecast perspective, the total expected consumption was arrived at. The transceiver was identified as the topmost rotable and picked for predicting spares consumption. Based on the past data, the consumption pattern of this part was found to be highly biased. Feature extraction techniques identified the age of the aircraft, the weather condition (with month as a proxy), the landing destination and the number of flights operated as the key features influencing transceiver consumption. Dimension reduction was done by transforming the data to monthly consumption, and the transceiver consumption per month was taken as the target variable of a multinomial model. The random forest and Bayesian models predicted with good accuracy on the train data. The ANN model performed best on the test data, at 70% accuracy. When the test data was added to the train data, the accuracy of the model improved, and it was concluded that deep learning helps in such models because the changing conditions are also used for learning.

Impact

Overall, the project gave SpiceJet a concept that it can use to forecast the consumption pattern of various spare parts and leverage for annual planning and inventory management. The models created provide a good linkage with the various factors that account for fleet expansion.

Analysis of the transceiver indicated that consumption was linked not just with weather (monsoon season) but also with the number of trips and the age of the aircraft, which emerged as the major features for failure. Deep learning models were able to predict expected consumption with more than 90% accuracy.

Hence, past data can help the business plan accurately for future spares requirements and avoid flight delays. Implementing this requires restructuring the data to suit the model. It also requires a user interface, so that a horizontal deployment to other parts can be taken up in the future and the end user can work with the model directly.

Every Drop Counts: Unleashing the prospective locations for Water Harvesting- 1st Place

Published at IML 2017 Conference and is available at ACM Digital Library

Problem statement

Water is at the heart of the ‘Sustainable Development Goals’ set by the United Nations, which aim to balance the three dimensions of development: Environment, Social & Economic. But with changing climatic patterns, untimely rains, prolonged dry spells, depleting ground water & drought making every drop of water extremely precious, the need of the hour is to gauge & work towards a major aspect of water harvesting: ‘Catchment’. This study presents a structured & meticulous approach that uses 'Geospatial Analytics' to identify prospective locations for Water Harvesting in arid & semi-arid parts of the country for sustainable development.

Approach

The objective of this study is to explore potential sites for water harvesting by considering several environmental and socio-economic factors. We have harnessed the power of Geospatial Analytics on Remote Sensing (RS) & Geographic Information System (GIS) data to implement the Analytic Hierarchy Process (AHP), a GIS-based Multi-Criteria Decision Making (MCDM) method that combines & transforms spatial data (input) into a resultant decision (output).

After a detailed study of the area under consideration, we gathered geospatial & environmental data to begin our solution strategy. Satellite images capture geospatial data, including the water index (NDWI), the vegetation index (NDVI) & elevation, which can be mapped to a specific latitude and longitude. However, the rainfall data and socio-economic data, which are crucial in determining prospective locations, were collected from the available national data repositories.

After the feasibility study, we found that the Analytic Hierarchy Process (AHP) is a classical land suitability analysis procedure that gives a systematic approach to making proper decisions for site selection. GIS-based AHP coupled with MCDM can be regarded as a process that combines geographical (spatial) data and the decision maker’s preferences to obtain information for decision making.
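
To make the AHP step concrete, the sketch below derives criterion weights from a pairwise comparison matrix by power iteration on the principal eigenvector, and computes Saaty's consistency ratio (CR) from the standard random index table. The three criteria and the comparison values in the example are hypothetical; the source does not list the actual matrix or weights used in the study.

```python
# Saaty's random consistency index for matrices of size 3 to 9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):  # power iteration toward the principal eigenvector
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lam_max = sum(aw[i] / w[i] for i in range(n)) / n  # principal eigenvalue estimate
    ci = (lam_max - n) / (n - 1)                       # consistency index
    return w, ci / RANDOM_INDEX[n]


if __name__ == "__main__":
    # Hypothetical judgments: rainfall vs elevation vs NDWI.
    comparisons = [
        [1.0, 2.0, 6.0],
        [1 / 2, 1.0, 3.0],
        [1 / 6, 1 / 3, 1.0],
    ]
    weights, cr = ahp_weights(comparisons)
    print([round(x, 3) for x in weights], round(cr, 3))
```

A CR below roughly 0.1 is conventionally taken to mean the judgments are consistent enough to use; the resulting weights are then applied to the normalized raster layers to score each candidate location.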

Outcome

To measure the accuracy of the result set, the ideal scenario would have been to have the updated latitude & longitude of the existing water tanks in Bellary District (the area of study), complemented by manual surveillance. But as this field of study is still at an early stage, the data is not available. Hence, we had to rely on reverse geocoding in Google Maps, analyzing each location based on the weights assigned to each of the criteria. We see that, out of the top 5 locations:

Two of them completely conform to the existing water reservoir locations.
Two of the locations are in close proximity to water trenches/sources, making them apt locations for placing water tanks.
One location (optimal as per the result set) does not seem to be in a favorable location for placing a water tank.

Impact

Due to global warming & the rapid pace of climate change, the world has been witnessing perennial climatic uncertainty & India is no stranger to it. The impact of recurring droughts in Indian states has been such that Kerala declared all districts drought-hit, with a 34% monsoon deficiency. We need to be mindful of the impact of the changing climate on agriculture, and of its devastating consequences for farmers as well as for the food security of everyone else.

Even in the areas that receive ample rainfall, much of the rainwater either runs off or evaporates leaving the land parched & dry for most of the year.

Around 62% of India’s people depend on agriculture, and six out of ten farmers depend on rain for irrigation and are at high risk from the vagaries of changing weather patterns. India’s experience with the adverse consequences of climate change, uncertain rainfall & lack of rainwater harvesting sites goes beyond the vulnerable farm sector, touching every aspect of the Indian economy, livelihood & growth.

While a multitude of measures are being practiced & need to be implemented to put a cap on global warming & bring an end to climate uncertainty, the colossal importance of rainwater harvesting can never be underestimated. Identifying & building a rainwater catchment area & harvesting site to place water tanks comes at an enormous cost, so identifying the prospective locations for these water tanks is of key importance to yield its manifold benefits across all walks of life.

Prediction of Customer Churn - An innovative approach using Logistic regression and Markov Model - 2nd Place

Problem statement

In any membership-driven firm, the membership fee is one of the key sources of revenue that keep the business profitable. In a membership format, it is important for the firm to keep a check on inactive members and on members who are unlikely to renew their membership. This in turn allows the management to react to the needs of these members through directed campaigns, improved assortment, etc., to retain customers. This led the business to focus on a customer churn model. The client in context is a major US retail firm.

The business wanted to develop a predictive analytical solution to identify members most likely to not renew their membership, based on transaction, membership history, and demographics data.

Approach

From the problem statement provided, it was quite evident that the team had to develop a classification model to tackle the issue. Hence, several approaches came into consideration:

Exploratory Data Analysis:

Explore the transactional, visit and demographic data.
Build inferences from the above.
Outlier treatment and missing data treatment.

Transaction Gap Analysis:

Define churn for deleted members.
Use KS statistics and Log Odds ratio to find the churn rate.

Logistic regression and Machine Learning algorithms:

Classify DELETED ~ (ACTIVE+EXPIRED)
Use Random Forest and sampling methods for better accuracy.
Use Random Forest because the data is imbalanced (~85% Active vs ~14% Deleted).
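
The sampling step mentioned above can be sketched as random down-sampling of the majority class before a classifier is trained. This is a minimal illustration in plain Python, not the project's pipeline; the `status` field name, the row layout and the fixed seed are hypothetical. The balanced rows would then be passed to a random forest implementation of choice.

```python
import random


def downsample(rows, label_key="status", minority="DELETED", seed=0):
    """Keep every minority row and an equally sized random sample of the rest."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    minority_rows = [r for r in rows if r[label_key] == minority]
    majority_rows = [r for r in rows if r[label_key] != minority]
    k = min(len(minority_rows), len(majority_rows))
    balanced = minority_rows + rng.sample(majority_rows, k)
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    members = [{"id": i, "status": "ACTIVE"} for i in range(85)]
    members += [{"id": 100 + i, "status": "DELETED"} for i in range(15)]
    balanced = downsample(members)
    print(len(balanced), sum(r["status"] == "DELETED" for r in balanced))
```

Down-sampling discards majority-class rows, so it trades some information for a training set in which the churned members are not drowned out; model quality should still be judged on data with the original class mix.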

Markov Chain:

Identify stages of customer churn.
Segment customers based on the RFM model.
Calculate the time to absorption.
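
The time to absorption in the last step has a standard closed form for an absorbing Markov chain: if Q holds the transition probabilities among the transient (non-churned) states, the expected number of periods before absorption is t = (I − Q)⁻¹·1. The sketch below solves that linear system with Gauss-Jordan elimination in plain Python. The two states and their probabilities in the example are hypothetical, not the client's estimates.

```python
def time_to_absorption(Q):
    """Expected steps to absorption from each transient state: solve (I - Q) t = 1."""
    n = len(Q)
    # Augmented matrix [I - Q | 1].
    a = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)] + [1.0]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        pv = a[col][col]
        a[col] = [x / pv for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0.0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n] for row in a]


if __name__ == "__main__":
    # Hypothetical transient states: high transacting, medium transacting.
    # Each row sums to 0.8; the remaining 0.2 is the probability of churning.
    Q = [[0.5, 0.3],
         [0.2, 0.6]]
    print(time_to_absorption(Q))
```

States with the smallest expected time are the ones closest to churn, which is how the segments to target first can be ranked.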

Outcome

The team developed two models: a regression model using machine learning techniques and a Markov model including RFM metrics. Both models determine churn and help the client build a strategy to retain customers.

Among the classification models built with regression and machine learning techniques, Random Forest with down-sampling proved to be the best approach, giving the best model accuracy while also meeting the other considerations for model development.

From the Markov model, the derived time to absorption showed that members in certain active transition states had the least time before moving into the churn state.

The firm might want to target members in these states (high transacting and medium transacting) who are likely to churn, but have contributed to the revenue by their transaction behavior.

The team recommends that the client take advantage of both models. The regression model tells whether a customer/member is likely to churn. When this data is fed to the Markov model, it specifies the state the customer currently stands in.

Impact

The team analyzed the transactional data of the customers and found that close to 80% of the firm's overall revenue came from high transacting members. The models developed are able to predict the customers most likely to churn, and specifically the cohort that contributes most of the revenue: the high transacting members.

The firm can identify these customers and keep them from moving out of the membership cycle. This will enable the firm to improve membership loyalty and retain the revenue contributed by these otherwise outgoing customers.

Persistency Analytics – Developing a Model for Predicting Persistency of Policies - 3rd Place

Problem statement

Persistency of policies is an important issue for life insurance companies. The problem statement was to develop a predictive model that could predict whether a policy will pay its next premium and to provide a strategy for improving persistency based on the model output.

Approach

The approach taken was to use predictive techniques such as logistic regression, random forest and GBM to predict the persistency of policies.

Outcome

The outcome of the project was a usable predictive model that could classify the policies due for payment into 3 categories: High, medium and low probability of payment. As a part of the project, strategy for each of these categories was also given based on the probabilities given by the model.

Impact

The impact of the project will be an increase in the persistency rates of the company and optimal usage of its resources to increase persistency.

Optimal Gate Assignment for a Major Metro Airport in India - 1st Place

Problem statement

Assigning gates to flights is one of the key activities of an airport operator. It requires the planning team to use the available gates optimally so as to maintain service-level agreements with airline operators while improving passenger service delivery. For all major international airport operators, optimal resource scheduling and capacity utilization are critical factors in being able to serve a larger number of flights and ensure smooth passenger transit at improved cost of operations.

Approach

A mixed integer programming model was developed to generate the daily schedule while adhering to the assignment policies and constraints (aircraft size, security requirements, no overlap due to close arrival/departure times, airline priority etc.), leveraging flight schedule, maintenance, aircraft size and airport gate specification data. As an additional consideration, the study also evaluated a hypothetical future scenario of limited gate capacity at the airport (due to increased growth).
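
One way such a model can be written down is shown below. This is a generic gate assignment formulation given as a sketch, not the project's actual model, and every symbol is an assumption: F is the set of flights, G the set of gates, x_{fg} is 1 if flight f is assigned to gate g, p_{fg} is a preference score (for example reflecting airline priority), C is the set of flight pairs whose gate occupancy windows overlap (including any buffer), and A is the set of allowed flight-gate pairs after aircraft size and security requirements are applied.

```latex
\begin{align*}
\max \quad & \sum_{f \in F} \sum_{g \in G} p_{fg}\, x_{fg} \\
\text{s.t.} \quad & \sum_{g \in G} x_{fg} = 1 && \forall f \in F \\
& x_{fg} + x_{f'g} \le 1 && \forall g \in G,\ \forall (f, f') \in C \\
& x_{fg} = 0 && \forall (f, g) \notin A \\
& x_{fg} \in \{0, 1\} && \forall f \in F,\ \forall g \in G
\end{align*}
```

The first constraint assigns every flight to exactly one gate, the second prevents two overlapping flights from sharing a gate, and the third enforces compatibility. A limited-capacity scenario can be explored by shrinking G and checking whether the model remains feasible.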

Outcome

The model results indicated that all flights were assigned to gates while adhering to the constraints. They also showed that, after all assignments, spare capacity was available. The assignment (i.e. number of airplanes at the airport) at peak traffic time is well under the available capacity (number of gates), indicating opportunity for expansion.

Impact

The airport management was considering expanding its operations by adding more gates; however, it was identified that the problem was not a scarcity of resources (gates) but the stochasticity of the actual schedule. This was a huge cost saving for them.

iD. Special – Demand forecasting for home-made fresh food - 2nd Place

Problem statement

iD Fresh Food (India) Private Ltd. is a leading ready-to-cook and ready-to-eat packaged food company. The company wanted to know how much should be loaded into each vehicle for the following day when a salesman starts his beat journey. This information would then enable a macro view of business operations over a month and consequently help in production planning and operations for future periods. Mumbai, one of iD's main operating regions, where it supplies about 8 different SKUs using 25+ beats, was selected for the study.

Approach

Multiple forecasting techniques were tested to determine the best-performing options. Finally, an automated forecasting engine was developed that selects the best forecasting technique for a given combination of SKU and beat.
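
The selection logic of such an engine can be sketched as a rolling-origin backtest: for each SKU and beat combination, every candidate method forecasts the last few days one step at a time, and the method with the lowest mean absolute error wins. The candidate methods, the error metric, the holdout length and the sample data below are all assumptions for illustration; the source does not say which techniques or criteria the engine used.

```python
def naive(history):
    return history[-1]


def moving_average(history, window=3):
    recent = history[-window:]
    return sum(recent) / len(recent)


def ses(history, alpha=0.3):
    level = history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


def backtest_mae(series, method, holdout=3):
    """Mean absolute one-step-ahead error over the last `holdout` points."""
    errors = [abs(series[t] - method(series[:t]))
              for t in range(len(series) - holdout, len(series))]
    return sum(errors) / len(errors)


def pick_best(series_by_key, methods, holdout=3):
    """Map each (SKU, beat) key to the name of its best-scoring method."""
    best = {}
    for key, series in series_by_key.items():
        scores = {name: backtest_mae(series, fn, holdout)
                  for name, fn in methods.items()}
        best[key] = min(scores, key=scores.get)
    return best


if __name__ == "__main__":
    methods = {"naive": naive, "moving_average": moving_average, "ses": ses}
    demand = {  # hypothetical daily volumes per (SKU, beat)
        ("SKU-1", "beat-01"): [1, 2, 3, 4, 5, 6, 7, 8],
        ("SKU-1", "beat-02"): [10, 20, 10, 20, 10, 20, 10, 20],
    }
    print(pick_best(demand, methods))
```

Keeping the methods in a dictionary means new techniques can be added to the engine without changing the selection code.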

Outcome

The automated forecasting engine can determine forecasts that predict the volume per SKU to be loaded into the van used for supply on a particular beat for a particular day.

Impact

The forecasting technique previously followed by the company was purely gut feel or a naive method: it used recent supply data to project supply for the current day. This project attempted to recommend an alternative analytical technique that forecasts demand at an SKU level per beat.

Value Analytics in Motor Insurance - 3rd place

Problem statement

Motor insurance accounted for 39.41% of the gross direct premiums till September 2015 in the Rs.78,000 crore premium per annum non-life insurance industry in India. This project aims to address the industry-wide challenge of operational profitability for Motor Insurance, focusing on E-business channel.

Approach

1. Designing an optimum media plan to improve effectiveness of the marketing spend, with a higher ROI and reduced cost of customer acquisition (COA) using regression analysis.
2. Identifying loss-making segments by evaluating Customer Lifetime Value, and thereby providing recommendations for an optimal customer mix using stochastic models.

Outcome

A regression analysis of COA against variables such as the media used to procure business, average advertisement positioning and average ticket size, followed by a linear optimization model, was designed for the optimal media mix plan. Customer Lifetime Value (CLV) was calculated based on the number of times a customer renews a vehicle policy over the life of the insured vehicle. Additional information, such as age of vehicle and survival rate based on the number of times the policy was renewed, was also used. The potential claims amount was then estimated using a combination of Poisson and Gamma distributions and incorporated into the CLV.
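
A minimal sketch of how expected claims can enter a CLV calculation is given below. It assumes, as is conventional but not confirmed by the source, that the Poisson distribution models the number of claims per year and the Gamma distribution models the size of each claim, independently of each other, so that the expected annual claims amount is the Poisson rate times the Gamma mean (shape × scale). The constant renewal probability, the discounting scheme and all numbers are hypothetical.

```python
def expected_annual_claims(claim_rate, sev_shape, sev_scale):
    """Poisson mean claim count times Gamma mean claim size (shape * scale)."""
    return claim_rate * sev_shape * sev_scale


def customer_lifetime_value(premium, renewal_prob, years,
                            claim_rate, sev_shape, sev_scale, discount=0.0):
    """Sum of expected yearly margins, weighted by the chance the policy is still active."""
    margin = premium - expected_annual_claims(claim_rate, sev_shape, sev_scale)
    clv, survival = 0.0, 1.0
    for t in range(years):
        clv += survival * margin / (1 + discount) ** t
        survival *= renewal_prob  # probability of renewing for another year
    return clv


if __name__ == "__main__":
    # Hypothetical policy: premium 5000, 50% renewal, 0.1 claims/year, mean claim 20000.
    print(customer_lifetime_value(5000, 0.5, 3, 0.1, 2.0, 10000))
```

A segment whose expected claims exceed its premium produces a negative margin in every year, which is how loss-making segments show up in this kind of calculation.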

Impact

The company successfully applied this technique in its operations. The project gave the company recommendations, using the optimization model to simulate different promotion strategies and decide on the best approach to reduce cost of acquisition.

Corporate Connect

Best Projects of 2018

Operation Theatre Scheduling and Optimization - 1st Place
Forecasting of spare parts for Spicejet - 2nd Place

Best Projects of 2017

Every Drop Counts: Unleashing the prospective locations for Water Harvesting- 1st Place
Prediction of Customer Churn - An innovative approach using Logistic regression and Markov Model - 2nd Place
Persistency Analytics – Developing a Model for Predicting Persistency of Policies - 3rd Place

Best Projects of 2016

Optimal Gate Assignment for a Major Metro Airport in India - 1st Place
iD. Special – Demand forecasting for home-made fresh food - 2nd Place
Value Analytics in Motor Insurance - 3rd place
