Big Data Analytics (Batch 4)
Overview
A triad of terms captures the essence of "big data": volume, velocity and variety. The volume and pace at which data is created can challenge existing computing infrastructure. For example, every flight of a Boeing 777 can generate up to 1 terabyte (~1000 gigabytes) of data. Making sense of this data is imperative for decision-making and troubleshooting. The theory of bounded rationality proposed by Nobel Laureate Herbert Simon is ever more significant today with the increased complexity of business problems: the human mind is constrained in its capacity to evaluate alternatives and has limited time to reach conclusions.
Organizations large and small are forced to grapple with problems of big data, which challenge the existing tenets of data science and computing technologies. Techniques in predictive analytics rely heavily on the validity of statistical concepts such as independent and identically distributed (IID) random variables and the central limit theorem (CLT). When dealing with large volumes, the validity of these assumptions becomes questionable. Straightforward tasks such as interpreting descriptive statistics have their share of issues. We begin to question the utility of summary measures and diagrams.
Algorithms that work well on "small" datasets crumble when the size of the data extends into the terabytes. Time series techniques must be revamped to handle streaming data in continuous time. Social media messages are unstructured, and have data formats that are unfit to be represented by traditional databases. While these may appear to be difficult problems, there has been tremendous progress in analyzing such data. Columnar databases have significantly boosted query speeds. File systems can seamlessly distribute datasets on multiple hard drives, and facilitate analytics on them in real time. Finally, the free and open source nature of big data platforms promotes their rapid adoption.
Programme Objective
This program is designed to equip its participants with an in-depth knowledge of Big Data Analytics (BDA). The principal advantage of our program over other offerings is that IIMB houses a high performance cluster with 10+2 nodes dedicated to big data processing.
We will use real case studies to illustrate the applications of key concepts. At the end of the course, the participants will be able to:
- Appreciate the emergence of business analytics and big data as a competitive strategy.
- Analyse datasets by applying techniques from statistics, operations research, machine learning, deep learning, network analysis and data mining.
- Process unstructured data such as social media messages and machine generated clickstream logs.
- Have a working knowledge of languages, platforms and tools that support statistical analysis and visualisation (R/Python), distributed computing (Hadoop/Spark) and network analysis (Gephi).
- Apply the theories, techniques and tools to solve problems from a wide variety of industries such as manufacturing, services, retail, software, banking and finance, sports, pharmaceuticals, and aerospace.
Who Should Attend
This certificate program will equip the participants with a large suite of analytical tools, as well as prepare them for corporate roles in analytics based consulting in marketing, operations, supply chain management, finance, insurance and general management in various industries. The course is suitable for those who are already working in analytics, and wish to enhance their knowledge. We also welcome participants with a strong analytical aptitude, who would like to start their career in analytics.
Eligibility Criteria
The participants should have a Bachelor's degree in engineering/science/commerce or arts with mathematics as one of the subjects during their Bachelor's programme. The preferred work experience is 3 years; in exceptional cases, applicants with less than 3 years of experience are admitted into the programme. It is essential that the applicants have programming knowledge.
Selection Process
After submitting their applications online, candidates shall be short-listed for an on-campus test. Questions on the test will examine the candidate’s grasp of basic quantitative concepts. As prior preparation, the candidates are advised to enrol in a beginner edX course dedicated to Statistics and complete the exercises.
Based on a combination of the test score, past academic performance, quality of work experience and fit for an analytics career, candidates will be called in for a face-to-face interview. The test and interviews will be conducted in succession during June 2018.
Programme Directors
Professor U Dinesh Kumar
http://www.iimb.ernet.in/user/62/dinesh-kumar
Professor Shankar Venkatagiri
http://www.iimb.ernet.in/user/139/shankar-venkatagiri
Professor Pulak Ghosh
http://www.iimb.ernet.in/user/165/pulak-ghosh
Program Design
Module-1: Foundations of Data Science
Learn how to summarize, analyze, and interpret data, as well as to communicate the results using data visualisation. Get introduced to two key platforms for machine learning: R and Python.
Module-2: Predictive Analytics
Understand how regression and causal forecasting models can be used to analyse real-life business problems such as prediction, classification and discrete choice problems.
Module-3: Introduction to Machine Learning
Get acquainted with a variety of supervised and unsupervised methods, and recommender systems.
Module-4: Prescriptive Analytics: Optimisation
Construct mathematical models for managerial decision situations and use freely available Excel Solver and OPL to obtain solutions and interpret the results.
Module-5: Big Data Eco-system
Use Spark/Hadoop extensively to set up and solve problems involving large datasets.
Module-6: Unstructured Data
Examine the network approach to data representation and analysis. Analyse unstructured social media messages and clickstream data.
Module-7: Deep Learning and AI
Familiarise yourself with neural networks and deep learning frameworks such as TensorFlow and GraphLab.
Module-8: Advanced Machine Learning
Study ensemble methods and penalised regression, clustering, text analytics, spatio-temporal analysis, association rule mining and Monte Carlo simulation.
Module-9: Advanced Big Data Analytics I
The participant is introduced to advanced regression and dimension reduction techniques.
Module-10: Advanced Big Data Analytics II
Bayesian approach to big data, blockchain and policy.
Program Charges
Programme Fee*:
The programme fee is Rs. 5,75,000/- + GST (at applicable rates) per participant, payable in three installments as per the following schedule.
Rs. 2,35,000/- + Applicable GST : I installment on admission
Rs. 1,70,000/- + Applicable GST : II installment on or before 04 October 2018
Rs. 1,70,000/- + Applicable GST : III installment on or before 16 December 2018
Please Note: *Please add GST at prevailing rates to the programme fee.
All enrollments are subject to review and approval by the programme director. Joining Instructions will be shared with the organization if sponsored or to the participants on selection.
Kindly do not make your travel plans unless you receive the offer letter from IIMB.
Note :
- The programme fee should be received at the Executive Education Office, before the programme commencement date.
- In case of withdrawals, the fee will be refunded only if a request is received at least 15 days prior to the start of the programme.
- If a nomination is not accepted, the fee will be refunded to the person / organization concerned.
Programme Schedule
Important Dates
Business Analytics & Intelligence (BAI-13)
Overview
The theory of bounded rationality proposed by Nobel Laureate Herbert Simon is ever more significant today, given the increasing complexity of business problems, the limited ability of the human mind to analyse alternative solutions and the limited time available for decision making. The introduction of enterprise resource planning (ERP) systems has ensured the availability of data in many organizations; however, traditional ERP systems lacked the data analysis capabilities that can assist management in decision making. Business Analytics is a set of techniques and processes that can be used to analyse data to improve business performance through fact-based decision-making. Business Analytics and Business Intelligence create capabilities for companies to compete in the market effectively. Business Analytics and Big Data have become one of the main functional areas in most companies. Analytics companies develop the ability to support their decisions through analytic reasoning using a variety of statistical and mathematical techniques. Thomas Davenport, in his book “Competing on Analytics: The New Science of Winning”, claims that a significant proportion of high-performance companies have high analytical skills among their personnel. On the other hand, a recent study has also revealed that more than 59% of organizations do not have the information required for decision making.
In a recent article based on a survey of nearly 3000 executives, MIT Sloan Management Review reported that there is a striking correlation between an organization’s analytics sophistication and its competitive performance. The biggest obstacle to adopting analytics is the lack of know-how about using it to improve business performance. Business Analytics uses statistical, operations research and management tools to drive business performance. Many companies offer similar products and services to customers based on similar design and technology, and find it difficult to differentiate their products and services from those of their competitors. However, companies such as Amazon, Google, HP, Netflix, Procter & Gamble and Capital One use analytics as a competitive strategy. Business Analytics helps companies find their most profitable customers and allows them to justify their marketing effort, especially when competition is very high. For instance, Capital One has managed a profit of close to $1 billion in its credit card business in the recent past, whereas many of its competitors have shown losses of several millions in the credit card business. There is significant evidence from the corporate world that the ability to make better decisions improves with analytical skills. This course is designed to provide in-depth knowledge of business analytics techniques and their applications in improving business processes and decision making.
Course Content
The course consists of ten modules and a project. The modules and their contents are discussed in the following paragraphs. Case-based teaching will be used for all the modules using case studies from IIMB, Harvard Business School (HBS), Darden, Ivey, and Kellogg. A significant proportion of the cases used in the course are published by IIMB faculty through Harvard Business Publishing. A few of them are published by students from previous batches based on their project work. IIMB distributes more than 25 analytics cases on Indian and multinational companies through Harvard Business Publishing, which are used by more than 250 institutions across more than 60 countries.
Module-1
Foundations of Data Science:
Data Visualization and Interpretation (6 days)
The process of fact-based decision making requires managers to know how to summarize, analyse, conduct hypothesis tests, interpret and communicate data using data visualization and descriptive statistics techniques to facilitate decision making. Statistical analysis is a fundamental method of quantitative reasoning that is extensively used for decision making. This module is aimed at providing participants with the most often used methods of statistical analysis along with appropriate statistical tests. The module is oriented towards application without compromising the theoretical aspects.
Foundations of Data Science Module Contents
- Introduction to data science; Different types and scales of data (ratio, interval, nominal and ordinal); Data summarization and visualization methods; Tables, Graphs, Charts, Histograms, Frequency distributions, Relative frequency; Measures of central tendency and dispersion; Box Plot; Chebyshev’s Inequality.
- Data visualization and story telling with data.
- Basic probability concepts, Conditional probability, Bayes Theorem, Probability distributions, Continuous and discrete distributions, Binomial Distribution, Uniform Distribution, Exponential Distribution, Normal distribution, Central Limit Theorem, Sequential decision making, Decision tree.
- Sampling and estimation: Estimation problems, Point and interval estimates, Confidence Intervals
- Hypothesis testing: Constructing a hypothesis test; Null and alternate hypotheses; Test Statistic; Type I and Type II Error; Level of significance, Power of a test, ANOVA
- Test for goodness of fit, Non-parametric tests.
- Introduction to R and Python
Case Studies:
1. Central Parking Solutions Private Limited (IIMB Case)
2. A Dean's Dilemma: To Admit or Not to Admit (IIMB Case)
Module-2
Data Preprocessing and Imputation (2 days)
Quality of the data is important for success of any analytics project.
Anecdotal evidence suggests that more than 80% of the time taken for an analytics project is spent on data preparation and data imputation. In this short module, we discuss the data preparation and imputation techniques that must be applied before advanced analytics tools can be used.
Contents
Data quality check, data cleaning and imputation. K Nearest Neighbours (KNN) algorithm for data imputation.
Case Study: Analytics in HR— Predicting Job Acceptance (IIMB Case)
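As an illustration of the KNN imputation idea named in this module's contents, the following is a minimal sketch in plain Python. It assumes numeric columns, Euclidean distance over the known columns, and at least one complete row; the function name `knn_impute` and the sample data are hypothetical and are not taken from the course material.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest complete rows.

    Distance is Euclidean, computed only over the columns that are
    present in the row being imputed. Assumes at least one complete row.
    """
    complete = [r for r in rows if None not in r]
    imputed = []
    for row in rows:
        if None not in row:
            imputed.append(list(row))
            continue
        known = [i for i, v in enumerate(row) if v is not None]
        # Rank the complete rows by their distance on the known columns.
        neighbours = sorted(
            complete,
            key=lambda c: math.sqrt(sum((c[i] - row[i]) ** 2 for i in known)),
        )[:k]
        filled = [
            v if v is not None else sum(n[i] for n in neighbours) / len(neighbours)
            for i, v in enumerate(row)
        ]
        imputed.append(filled)
    return imputed

# Hypothetical data: (years of experience, offered salary in lakhs).
data = [[2.0, 6.0], [3.0, 8.0], [10.0, 30.0], [2.5, None]]
result = knn_impute(data, k=2)
print(result[3])
```

In practice the columns would usually be scaled before distances are computed, so that no single variable dominates the neighbour search.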
Module-3
Predictive Analytics: Supervised Learning Algorithms (6 Days)
Predictive analytics models predict the occurrence of future events such as demand for a product, revenue, customer churn, employee attrition, fraud, default in loan repayment, etc. based on historical data. In many business problems, we deal with data on several variables, sometimes more than the number of observations. Regression models help us understand the relationships among these variables and how the relationships can be exploited to make decisions using supervised learning algorithms. The primary objective of this module is to understand how regression and causal forecasting models can be used to analyse real-life business problems such as prediction, classification and discrete choice problems. The focus will be on case-based practical problem-solving using predictive analytics techniques and on interpreting model outputs. The participants will be exposed to software tools such as MS Excel, R, Python, SPSS, and SAS, and will learn how to use them to perform regression, logistic regression and forecasting.
Predictive Analytics Module Contents
- Regression model building framework: Problem definition, Data pre-processing; model building; Diagnostics and Validation
- Simple linear regression: Coefficient of determination, Significance tests for predictor variables, Residual analysis, Confidence and Prediction intervals
- Multiple linear regression: Coefficient of multiple determination, Interpretation of regression coefficients, Categorical variables, heteroscedasticity, Multi-collinearity, outliers, Autoregression and Transformation of variables, Regression Model Building
- Logistic and Multinomial Regression: Coefficient of multiple determination, Interpretation of regression coefficients, Categorical variables, heteroscedasticity, Multi-collinearity, outliers, Autoregression and Transformation of variables, Regression Model Building
- Forecasting: Moving average, Exponential smoothing, Causal models
- Application of predictive analytics in retail, direct marketing, health care, financial services, insurance, supply chain, etc.
Case Studies:
1. Pricing of players in the Indian Premier League (IIMB Case)
2. Package Pricing at Mission Hospital (IIMB Case)
3. Colonial Broadcasting Company (HBS Case)
4. Pedigree vs Grit: Predicting Mutual Fund Manager Performance (Kellogg Case)
5. Breaking Barriers – Micro-Mortgage Analytics (IIMB Case)
6. A Game of Two Halves: In-Play Betting in Football (IIMB Case)
7. HR Analytics – Predicting Probability of Renege (IIMB Case)
8. Predicting Demand for Food at Apollo Hospital (IIMB Case)
9. Predicting Earnings Manipulations by Indian firms using machine learning algorithms (IIMB Case)
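The two simplest forecasting techniques listed in this module, moving average and exponential smoothing, can be sketched as follows. This is a minimal illustration in plain Python, not course code: the function names and the demand series are hypothetical, and the smoothing recursion is initialised with the first observation, which is one common convention among several.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window

def exponential_smoothing_forecast(series, alpha):
    """Simple exponential smoothing: F[t+1] = alpha * Y[t] + (1 - alpha) * F[t].

    The first forecast is initialised to the first observation.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    forecast = series[0]
    for y in series:
        forecast = alpha * y + (1 - alpha) * forecast
    return forecast

# Hypothetical monthly demand figures.
demand = [100, 110, 120, 130]
ma = moving_average_forecast(demand, window=2)
ses = exponential_smoothing_forecast(demand, alpha=0.5)
print(ma, ses)
```

A larger `alpha` gives more weight to recent observations, while a longer `window` smooths out more of the short-term variation.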
Module-4
Optimization Analytics (Prescriptive Analytics) (5 Days)
Optimization models are core tools in prescriptive analytics and are used to arrive at optimal or near-optimal decisions for a given set of managerial objectives under various constraints. Optimization techniques such as gradient descent play an important role in many machine learning algorithms. Optimization is an integral part of operations analytics with specific applications in operations and supply chain management. The objective of the module is to acquaint participants with the construction of mathematical models for managerial decision situations and the use of the freely available Excel Solver to obtain solutions and interpret the results.
Optimization Analytics Module Contents
- Introduction to Operations Research (OR), linear programming (LP), formulating decision problems using linear programming, interpreting the results and sensitivity analysis. Concepts of shadow price and reduced cost.
- Multi-period LP models. Applications of linear programming in product mix, blending, cutting stock, transportation, transshipment, assignment, scheduling, planning and revenue management problems. Network models and project planning.
- Integer Programming (IP) problems, mixed-integer and zero-one programming. Applications of IP in capital budgeting, location decisions, contracts.
- Multi-criteria decision making (MCDM) techniques: Goal Programming (GP) and analytic hierarchy process (AHP) and applications of GP and AHP in solving problems with multiple objectives.
- Non-linear programming, portfolio theory, gradient descent technique.
Case Studies:
1. Merton Truck Company (HBS Case)
2. Supply Chain Optimization at Madurai Aavin Milk Dairy (IIMB Case)
3. Red Brand Canners (Stanford Case)
4. Managing Linen at Apollo Hospitals (IIMB Case)
5. Case on Airline Operations (IIMB Case)
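To give a flavour of the gradient descent technique mentioned in this module, here is a minimal sketch that minimises a one-variable quadratic. The objective function, learning rate and step count are illustrative assumptions chosen so that the iteration converges; they are not drawn from the course.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# Illustrative objective: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# The minimum is at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(minimum, 6))
```

With this objective each step shrinks the distance to the minimum by a constant factor of 0.8, so the choice of learning rate directly controls how quickly (or whether) the method converges.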
Module-5
Stochastic Models (Reinforcement Learning Algorithms) with Applications in Marketing and Retail Analytics (5 days)
Stochastic models offer a powerful analytical approach to model and examine complex problems in the domains of finance, retail, marketing, operations and economics under uncertainty. In management as well as in business, many measurements change with time and are inherently random in nature. Stochastic models can be used to model and measure changes in metrics used for finance, marketing, operations, supply chain, etc. over a period of time. The objective of this module is to provide an introduction to stochastic processes and their applications to business and management. Stochastic models are also the basis for reinforcement learning algorithms.
Our approach will be non-measure theoretic, with an emphasis on the applications of stochastic process models using case studies.
Stochastic Models Module Contents
- Introduction to stochastic models, Markov models, Classification of states, Steady-state probability estimation, Brand switching and loyalty modelling, Market share estimation in the short and long run. Google's ranking algorithm.
- Poisson process, Cumulative Poisson process, Applications of Poisson and cumulative Poisson in operations, marketing and insurance. Measuring effectiveness of retail promotions, warranty analytics.
- Monte Carlo simulation.
- Reinforcement Learning Algorithms: Dynamic Programming; Markov decision process, Applications of Markov decision process in sequential decision making.
Case Studies:
1. Customer Analytics at Flipkart (IIMB Case)
2. Browser Wars : Microsoft Vs Netscape (Darden Case)
3. Consumer Choices between House Brands and National Brands in Detergent Purchase at Reliance Retail (IIMB Case)
4. MNB ONE Credit Card Portfolio (Darden Case)
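The steady-state probability estimation and long-run market share topics in this module can be illustrated with a small Markov chain. The sketch below assumes a hypothetical two-brand switching matrix and an ergodic chain (so that repeated multiplication converges); the function name `steady_state` is an illustrative choice.

```python
def steady_state(P, iterations=1000):
    """Approximate the stationary distribution pi with pi = pi * P.

    P is a row-stochastic transition matrix given as a list of lists.
    Assumes the chain is ergodic so that the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from equal market shares
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical brand-switching matrix: rows are the current brand,
# columns are the brand bought next period.
P = [
    [0.9, 0.1],
    [0.2, 0.8],
]
shares = steady_state(P)
print([round(s, 4) for s in shares])
```

For this matrix the balance equation 0.1 * pi_1 = 0.2 * pi_2 gives long-run shares of 2/3 and 1/3, which the iteration reproduces.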
Advanced Analytics Modules (Modules 6, 7, 8, 9 and 10)
Advanced analytical tools will be taught in five modules. The participants will be exposed to complex decision making scenarios under uncertainty and will learn how to deal with such problems using advanced tools and Big Data. A dedicated module on machine learning algorithms will expose the students to the recent advancements in analytics and big data.
Discussion problems will be drawn from many sectors such as finance, banking, insurance, IT, ITeS, retail, service, manufacturing, pharmaceuticals, etc.
Module-6
Advanced Analytics-1
Data Reduction, Advanced Forecasting and Operations Analytics (5 Days)
- Principal component analysis, Factor analysis, Conjoint analysis, Discriminant analysis.
- Auto-Regressive Integrated Moving Average (ARIMA) models, ARIMAX.
- Supply chain analytics
- Six Sigma as a problem solving methodology, DMAIC and DMADV methodology, Six Sigma Tool Box: Seven quality tools, Quality function deployment (QFD), SIPOC, Statistical process control, Value stream mapping, TRIZ
- Classification and regression trees (CART), Chi-squared automatic interaction detector (CHAID)
- Lean thinking: Lean manufacturing, Value stream mapping
Case Studies:
1. Apollo Hospitals: Differentiation through Hospitality (IIMB Case)
2. Dean's Dilemma: To Admit or Not to Admit (IIMB Case)
3. Dosa King – A Standardized Masala Dosa for Every Indian (IIMB Case)
4. Delivering Doors in a Window – Supply Chain Management at Hindustan Aeronautics Limited (IIMB Case)
Module-7
Advanced Analytics-2
Big Data Analytics (2 days)
Big Data is defined using volume of data, velocity at which the data is created, and variety in the data. Sources of Big Data include social networks, telecom and mobile services, healthcare and public systems and machine generated data. In this module, we introduce the Big Data technologies and challenges.
Contents : Introduction to Big Data; sources of Big Data; Big Data technologies: Hadoop distributed file system; Employing Hadoop MapReduce; Statistical Analysis of Big Data.
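The MapReduce programming model named in the contents above can be illustrated with the classic word count. The sketch below runs in a single Python process and only mimics the map, shuffle and reduce phases; it does not use the Hadoop API, and the function names and sample documents are hypothetical.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in the document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group the emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values for each key to obtain the final counts."""
    return {key: sum(values) for key, values in grouped.items()}

# Hypothetical input split into two "documents".
documents = ["big data big insights", "data drives decisions"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts)
```

In a real cluster the map and reduce functions run in parallel on different nodes, and the shuffle step moves data across the network.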
Module-8
Advanced Analytics-3
Machine Learning Algorithms (3 days)
This module introduces the participant to machine learning algorithms such as bagging and boosting, recommender systems, clustering, text analytics, spatio-temporal analysis, association rule mining and Monte Carlo simulation.
Contents
- Introduction to machine learning, different types of machine learning algorithms.
- Recommender Systems, Collaborative Filtering: Cosine Similarity, Jaccard Coefficient.
- Advanced recommender system.
- Bootstrap Aggregating (Bagging), Random forest, Adaptive boosting, gradient boosting
- Support vector machine and Neural Network
Case Studies:
1. Predicting Earnings Manipulation by Indian Companies Using Machine Learning Algorithms (IIMB Case)
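The two similarity measures listed under collaborative filtering in this module, cosine similarity and the Jaccard coefficient, can be sketched as follows. The rating vectors and item sets are hypothetical, and returning 0.0 for empty or all-zero inputs is a convention assumed here for convenience.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for an all-zero vector
    return dot / (norm_a * norm_b)

def jaccard_coefficient(s, t):
    """Size of the intersection divided by the size of the union."""
    s, t = set(s), set(t)
    union = s | t
    return len(s & t) / len(union) if union else 0.0

# Hypothetical ratings given by two users to the same four items.
user_a = [5, 3, 0, 1]
user_b = [4, 0, 0, 1]
cos = cosine_similarity(user_a, user_b)

# Hypothetical sets of items purchased by two users.
jac = jaccard_coefficient({"tea", "milk", "sugar"}, {"milk", "sugar", "bread"})
print(round(cos, 4), jac)
```

Cosine similarity suits numeric ratings, whereas the Jaccard coefficient suits binary data such as whether an item was purchased at all.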
Module-9
Advanced Analytics-4
- Introduction to neural networks; rule based expert systems
- Introduction to artificial neural networks (ANN); Neuron as computing element; Perceptron: McCulloch-Pitts model; Back-propagation algorithm; Multi-layer Neural Networks.
- Deep learning algorithms: Convolutional networks; Recurrent nets; Auto-encoders.
- Game theory: Two-person zero sum game, dynamic games
- Deep Learning Platform: H2O.ai; Dato GraphLab; TensorFlow
Module-10
Advanced Analytics-5
- Dynamic pricing and revenue management, high dimensional data analysis, financial data analysis and prediction.
- Survival analysis and its applications: Life tables, Kaplan Meier estimates, Proportional hazards, Predictive hazard modelling using customer history data
- Analytics in finance, Discounted cash flows (DCF), Profitability analysis. Asset performance: Sharpe ratio, Calmar ratio, Value at risk (VaR), Brownian motion process, Pricing options and Black–Scholes formula
- Game theory: Two-person zero sum game, dynamic games
- Analysis of unstructured data: text mining and sentiment analysis, analysis of machine generated data
Case Studies:
1. 1920 Evil Returns – Bollywood and Social Media Marketing
2. Markdown optimization at Indian Retail Store
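The Kaplan Meier estimate listed under survival analysis in this module can be sketched in a few lines. The example below is an illustrative assumption, not course code: the function name, the churn durations and the censoring flags are hypothetical, and observations censored at a given time are treated as still at risk at that time.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each distinct event time.

    durations: time until the event or until censoring.
    observed:  1 if the event occurred, 0 if the observation was censored.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Collect every observation that shares this time point.
        while i < len(data) and data[i][0] == t:
            events += data[i][1]
            removed += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical customer data: months until churn, 1 = churned, 0 = censored.
durations = [2, 3, 3, 5, 8]
observed = [1, 1, 0, 1, 0]
curve = kaplan_meier(durations, observed)
print([(t, round(s, 2)) for t, s in curve])
```

Each step multiplies the running survival probability by the fraction of at-risk customers who did not churn at that time, which is how censored customers still contribute information up to the point they drop out of observation.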
Details
Target Audience
The course will benefit executives, project leaders and senior managers working in various sectors. The course is designed for professionals who would like to improve ROI for their companies using analytics.
Key Benefits/Take Aways
The course is suitable for those who are already working in analytics and wish to enhance their knowledge, as well as for those who have an analytical aptitude and would like to start a new career in analytics.
Eligibility Criteria
The participants should have a Bachelor’s degree in engineering/science/commerce or arts with mathematics as one of the subjects during their Bachelor’s programme. The preferred work experience is 3 years; in exceptional cases, applicants with less than 3 years of experience are admitted into the programme.
Selection Process
Candidates will be short-listed for interview based on an online aptitude test, their past academic performance, quality of work experience and fitness for an analytics job.
Programme Directors
Professor U Dinesh Kumar
http://www.iimb.ernet.in/user/62/dinesh-kumar
Professor Rajluxmi V Murthy
Programme Charges
Programme Fee*:
Rs TBD* + GST (as applicable) per participant. The fee is payable in three installments as per the following schedule.
Rs. TBD + Applicable GST : I installment on admission
Rs. TBD + Applicable GST : II installment on or before 4 August 2018
Rs. TBD + Applicable GST : III installment on or before 3 November 2018
Please Note : *Please add GST at prevailing rates to the programme fee.
All enrollments are subject to review and approval by the programme director. Joining Instructions will be shared with the organization if sponsored or to the participants on selection.
Kindly do not make your travel plans unless you receive the offer letter from IIMB.
Note
- The programme fee should be received at the Executive Education Office, before the programme commencement date.
- In case of withdrawals, the fee will be refunded only if a request is received at least 15 days prior to the start of the programme.
- If a nomination is not accepted, the fee will be refunded to the person / organization concerned.
Program Schedule
Online Classes
On-campus Classes
The on-campus classes will be held on the IIMB campus in Bangalore.
Off-Campus (Online classes)
A limited number of seats are available for online participants. In order to be eligible, the candidates must be residing outside of Bangalore. The candidates from SAARC countries and other neighbouring countries in Asia are also encouraged to apply.
Those interested in applying for the online option (Off Campus) must indicate their choice at the time of applying for the programme. It is not possible to change from Off Campus to On Campus mode of delivery after the selection list has been announced.
Online Mode of delivery
The participants who opt for online classes have to come to the campus for the on-campus sessions along with the other participants of the programme. The online classes are streamed via the Internet to the desktop or laptop of the participant.
The Infrastructure at the Remote end
The participants who sign up for the online classes have to invest in the following hardware / software:
- A Windows PC with a minimum of quad-core processor
- Windows 8.0 or higher OS
- Internet Explorer
- A PC that is free from virus
- A high-speed internet connection ( 2 Mbps or higher ) – It has to be a DSL connection. Wireless Broadband can be used but only as a backup to a regular DSL line.
- The PC and the broadband modem must be powered by a UPS or inverter that provides a minimum of 4 hours of backup. (This figure is only indicative. If the power supply in your area is notoriously bad, you might consider investing in a higher-capacity inverter that supports 9-10 hours of backup for the PC and broadband modem.)
- And finally a quiet room with a table and chair. It is preferable that you have a 16" / 32"/ 42" Monitor to display the classroom video.
If you do not meet these infrastructure requirements, we strongly discourage you from applying for online classes.
IIMB Executive Education reserves the right to reject the candidature of an online applicant if the bandwidth tests reveal that the infrastructure at the remote end falls short of requirements.
Important Dates
How to Apply
- Please log on to the online registration portal HERE to register and apply online.
- The programme is listed under the “Long Duration Programme” tab. Do feel free to get back to us if you need any clarification.
- Email: lathap@iimb.ac.in Mobile: 91-895 128 1603