
5TH INTERNATIONAL CONFERENCE ON BUSINESS ANALYTICS AND INTELLIGENCE
11 – 13 DECEMBER, 2017
Table of Contents
PREDICTIVE ANALYTICS
BANKING, FINANCIAL SERVICES AND INSURANCE
1. HIDDEN MARKOV MODELS FOR STOCK PRICE PREDICTION - A STUDY (p. 3)
2. CREDIT SCORING USING HYBRID INTELLIGENT TECHNIQUE: AN APPLICATION TO RETAIL BANKING (p. 4)
3. A ROBUST PREDICTIVE MODEL FOR STOCK PRICE FORECASTING (p. 5)
4. ADVANCED DATA ANALYTICS SOLUTION TRANSFORMING THE TARGETING STRATEGY FOR A MID-SIZED FINANCIAL COMPANY IN USA (p. 6)
5. A STUDY ON INDIAN STOCK MARKET USING TEXTUAL AND OPINION MINING ANALYSIS OF FINANCIAL NEWS HEADLINES WITH SPECIAL REFERENCE TO BUSINESS STANDARD NEWSPAPER (p. 8)
6. IMPACT OF INVESTOR SENTIMENT ON STOCK PRICES (p. 9)
7. FINANCIAL REPORTING WITH KNOWLEDGE CONVERSION AND NATURAL LANGUAGE PROCESSING (p. 10)
8. IDENTIFYING EARLY SIGNS OF DEMAND SHRINKAGE OF A PRODUCT (p. 11)
9. THE IMPACT OF VOLATILITY INDEX ON INDIAN STOCK MARKET (p. 12)
10. SPEED OF ADJUSTMENT TOWARDS TARGET CAPITAL STRUCTURE: AN INDIAN PERSPECTIVE (p. 13)
11. FORECASTING AND DETERMINATION OF GOLD PRICE IN THE WORLD GOLD MARKETS (p. 14)
12. ANALYSIS AND PREDICTION OF INDIAN ECONOMIC GROWTH BEFORE AND AFTER GST (p. 16)
13. ONLINE VERSUS OFFLINE BANKING SERVICES USAGE: APPLICATION OF MULTINOMIAL LOGISTIC REGRESSION (p. 17)
14. PROPENSITY TO DEFAULT MODEL FOR ONLINE CREDIT MARKETPLACE (p. 19)
15. CLASSIFICATION MODELLING USING DECISION TREE (p. 20)
16. DEEP LEARNING AND FRAUD DETECTION IN CREDIT CARD TRANSACTIONS (p. 21)

MANUFACTURING
17. FORECASTING OF ENERGY FOR MANUFACTURERS USING ARIMA AND NEURAL NETWORK (p. 23)
18. ANOMALOUSLY POTENTIAL FRAUDULENT LOGISTICS – A TANGANYIKAN SUPPLY CHAIN ANALYTICS (p. 24)
19. FUTURE OF PREDICTIVE MAINTENANCE: IOT BASED APPROACH (p. 25)
20. CONTROL CHART PATTERN RECOGNITION USING STATISTICAL AND SHAPE FEATURES (p. 26)
21. A NEURO FUZZY MODEL FOR OEE PREDICTION (p. 27)

RETAIL
22. STAKEHOLDERS FRAMEWORK FOR ANALYZING RELATIONSHIP BETWEEN THE SUSTAINABILITY METRICS OF SUPPLY CHAIN (p. 29)
23. AN INTEGRATED MANOVA AND ANP APPROACH FOR SUPPLIER EVALUATION WITH FUZZY DATA (p. 31)
24. DEEP DIVE ANALYSIS OF CUSTOMER EXPERIENCE WITH MACHINE LEARNING TECHNIQUES (p. 32)
25. A CASE STUDY ON PREDICTION OF VISUAL INVENTORY USING BRAND, STORE AND CITY PROFILE VARIABLES (p. 34)
26. CAPTURING DEMAND TRANSFERENCE IN RETAIL - A STATISTICAL APPROACH (p. 36)
27. IDENTIFYING ITEM SUBSTITUTES - A SCALABLE, MACHINE LEARNING BASED APPROACH (p. 38)
28. IMPACT OF SOCIAL CONTEXT ON ONLINE SHOPPING IN THE SELECT ASIAN COUNTRIES (INDIA, MALAYSIA, SINGAPORE, UZBEKISTAN AND THAILAND) (p. 40)
29. A STUDY ON THE IMPACT OF SOCIAL MEDIA, SECURITY RISKS & REPUTATION OF THE E-RETAILER ON BUYING INTENTIONS OF THE YOUTH THROUGH TRUST IN ONLINE BUYING: A STRUCTURAL EQUATION MODELING APPROACH (p. 41)
30. INTELLIGENT CATEGORIZATION OF PRODUCT RECOMMENDATIONS FOR ENHANCED CUSTOMER EXPERIENCE (p. 42)
31. CS360 – CLOUD SECURE 360 THROUGH BEHAVIORAL ANALYSIS (EXTENDED 360 CLOUD SECURITY USING DEEP LEARNING FOR BEHAVIORAL ANALYSIS) (p. 43)
32. EXPLORATION OF CUSTOMER ONLINE SHOPPING EXPERIENCE TYPES AND THEIR EFFECTS ON CUSTOMER SATISFACTION (p. 44)
33. USER REVIEWS AGGREGATION AND SUMMARIZATION AGENT TO AID E-COMMERCE CONSUMERS (p. 45)
34. RESPONSE MODELLING WITH K-NN AND XNB (p. 46)
35. RESPONSE MODELLING USING SUPPORT VECTOR MACHINES & NAÏVE BAYES (p. 47)

SERVICES
36. NOVEL METHODS FOR MONITORING SOCIAL MEDIA PROPENSITY AND NETWORK ACTIVITIES (p. 49)
37. PUBLISHER SUBSCRIBER CHURN PREDICTION MODEL USING LOGISTIC AND MARKOV (p. 50)
38. USING TEXT MINING AND SENTIMENT ANALYSIS TO STUDY CORRELATION BETWEEN PAGE ENGAGEMENT AND ARTICLES (p. 51)
39. FEATURE SELECTION IN SPARSE MATRICES (p. 52)
40. COMPARISON OF PERFORMANCE OF INDIAN AVIATION SERVICE PROVIDERS USING MULTI-CRITERIA DECISION MODELS (p. 53)
41. CYBER SECURITY: A GRAPH BASED APPROACH (p. 55)
42. A GLOOMY GROWTH OF BPO INDUSTRY: EMPLOYEE ATTRITION ANALYSIS (p. 56)
43. CONSUMER ADOPTION OF MOBILE PAYMENTS IN INDIA POST DEMONETISATION (p. 57)
44. I-OPS - THE INTELLIGENT OPERATIONAL MATRIX (p. 58)
45. COLLABORATIVE, CO-CREATIVE AND SUSTAINABLE FOOD DISTRIBUTION SYSTEM BY HARNESSING INTELLIGENT URBANIZATION (p. 59)
46. ANTECEDENTS OF ENTREPRENEURIAL ORIENTATION AND THEIR ROLE IN FOSTERING ENTREPRENEURIAL INTENTIONS AMONG UNIVERSITY GRADUATES: AN SEM APPROACH (p. 60)
47. AHA (p. 62)
48. PUBLIC TRANSPORT TRACKING SYSTEM USING IoT (p. 63)
49. THE IMPACT OF FACEBOOK ADVERTISEMENT, FACEBOOK REVIEW AND FACEBOOK PAGE ON ONLINE SHOPPING INTENTION (p. 64)
50. SMART HEALTH CARE (p. 65)
51. CRIME ANALYSIS ACROSS MAJOR CITIES OF INDIA WITH TWITTER (p. 66)
52. DEEP LEARNING BASED DECEASED-DONOR KIDNEY ALLOCATION MODEL FOR INDIA (p. 67)

OTHERS
53. STAKEHOLDERS FRAMEWORK FOR ANALYZING RELATIONSHIP BETWEEN THE SUSTAINABILITY METRICS OF SUPPLY CHAIN (p. 70)
54. STATE WISE PANEL DATA STUDY OF USAGE OF TOILETS IN RURAL INDIA (p. 72)
55. PENALTY: A GAME OF CHOICES (p. 73)
56. KABADDI: FROM AN INTUITIVE TO A QUANTITATIVE APPROACH FOR ANALYSIS, PREDICTIONS AND STRATEGY (p. 74)
57. FEATURE RANKING USING ASYMMETRIC INFORMATION INDEX IN CASE OF CONTINUOUS TARGET VARIABLE (p. 75)
58. GREEN PURCHASE INTENTION AND BRAND EQUITY: A RETRACE (p. 76)
59. PREDICTION OF VIOLENCE IN CIVIL UNREST AREAS OF JAMMU AND KASHMIR (p. 77)
60. COST VIABILITY OF RENEWABLE POWER PROJECT IN EDUCATIONAL INSTITUTE IN PUNE: CASE STUDY (p. 78)
61. ANALYSIS OF EDUCATIONAL PROGRESS OF INDIA USING DATA VISUALISATION (p. 79)
62. DETECTING AND PREVENTING FRAUD WITH APACHE FLINK (p. 80)
63. CLINICAL INTELLIGENCE AND INSIGHTS (p. 81)
64. RELIABILITY AND VALIDITY OF CORPORATE SOCIAL IDENTITY DISCLOSURE USING TEXT ANALYTICS (p. 83)
65. DECODING THE PEOPLE’S SENTIMENTS: DEMONETISATION AND ELECTIONS (p. 84)
66. RECOMMENDER SYSTEM TO INCREASE ENGAGEMENT FOR SPENDERS & NON-SPENDERS OF FREEMIUM MOBILE GAMES (p. 85)
67. CORPORATE OWNERSHIP STRUCTURE AND ITS DETERMINANTS: EVIDENCE FROM INDIA (p. 86)
68. COUNTER TERRORISM IN THE AGE OF ARTIFICIAL INTELLIGENCE: THE CASE OF INDIA (p. 87)
69. APPLICATION OF ANALYTICS PERFORMANCE MANAGEMENT (p. 88)
70. CALCULATING TRUST SCORES ON FACEBOOK AND TWITTER (p. 89)
71. SDC-MINER: AN ASSOCIATION RULE MINING ALGORITHM FOR CROWDMINING OF UNCERTAIN DATA USING APACHE SPARK (p. 91)
72. ANALYSIS OF ADVERTISING MEDIA IN TERMS OF EFFECTIVENESS AND TRUSTWORTHINESS (CUSTOMER PERCEPTION) (p. 92)
73. FACTOR ANALYSIS: THE MOST RELEVANT TOOL IN EXPLORATORY RESEARCH (p. 93)
74. NEW APPROACHES OF RESUME SECTIONING FOR AUTOMATING TALENT ACQUISITION (p. 94)
75. SENTIMENT ANALYSIS OF TWITTER DATA FOR DEMONETIZATION IN INDIA (p. 95)
76. A STUDY ON FEASIBILITY ANALYSIS OF INVESTMENT ON COCONUT PLANTATION IN KARNATAKA (p. 96)
77. SENTIMENT ANALYSIS ON DEMONETIZATION BY GOVT OF INDIA (p. 97)
78. QUALITY MANAGEMENT IN POWDER COATING PROCESS: A SIX SIGMA APPROACH (p. 98)
78 QUALITY MANAGEMENT IN POWDER COATING PROCESS: A SIX SIGMA APPROACH 98
PRESCRIPTIVE
BANKING, FINANCIAL SERVICES AND INSURANCE (BFSI)
79. ELUCIDATION OF THE DYNAMICS OF CROSS-MARKET CLUSTERING AND CONNECTEDNESS IN ASIAN REGION: AN MST AND HIERARCHICAL CLUSTERING APPROACH (p. 101)
80. HETEROGENEITY IN THE RESOLUTION OF BANK FAILURES: A LATENT CLASS APPROACH (p. 102)

RETAIL
81. CRITERIA CLASSIFICATION FOR COST AND QUALITY ASSESSMENT OF SUPPLIERS (p. 104)
82. A DEA APPROACH TO EVALUATE THE EFFICIENCY OF RETAILERS (p. 106)
83. SUPPLY-DEMAND DRIVEN OPTIMAL PRODUCTION OF GREEN PRODUCT VARIANTS (p. 107)
84. STITCHING PROCESS IN THE APPAREL INDUSTRY: FUZZY DMAIC (p. 109)
85. PERFORMANCE EVALUATION OF SUSTAINABLE INNOVATION PRACTICES USING BEST WORST METHOD (p. 111)
86. ONLINE ORDER FULFILMENT: APPROACH TO MAXIMIZE RESOURCE UTILIZATION (p. 113)
87. MARKET ENTRY STRATEGY FOR LAUNCHING A NEW DRUG (p. 114)
88. A CHANCE CONSTRAINT BASED LOW CARBON SUSTAINABLE SUPPLY CHAIN CONFIGURATION FOR AN FMCG PRODUCT (p. 115)
89. RETAIL ANALYTICS: HARNESSING BIG DATA FOR GROWTH IN RETAIL INDUSTRY (p. 116)
90. FORMULATION & PILOT FOR INCREASING THE TCI IN F&V CATEGORY IN HYPERMARKET (p. 118)

SERVICES
91. OPTIMAL ADVERTISEMENT PLACEMENT IN TELEVISION IN A TIME WINDOW (p. 120)
92. STOCHASTIC MODEL FOR FORECASTING CUSTOMER EQUITY: MOBILE SERVICE PROVIDER CASE (p. 121)
93. A MULTI-SERVER INFINITE CAPACITY MARKOVIAN FEEDBACK QUEUING SYSTEM WITH REVERSE BALKING (p. 122)
94. IMPLEMENTATION OF WATER CYCLE ALGORITHM FOR MODELLING AND OPTIMIZATION OF SUPPLY CHAIN NETWORK (p. 123)
95. PLAYER ACTIVITY AND FREEMIUM BEHAVIOR IN AN ONLINE ROLE PLAYING GAME: A JOINT MODEL APPROACH (p. 124)
96. ESCALATING CREATIVE BUDGETARY MODEL TOWARDS WET GARBAGE EJECTION BY MULTI-CRITERION DECISION ANALYSIS (p. 125)
97. A CONVEX MODEL DATABASE APPROACH TO RAILWAY SCHEDULE VALIDATION (p. 127)

OTHERS
98. FUZZY MULTI-CRITERIA APPROACH FOR JOINT PERFORMANCE EVALUATION IN A DEA PROBLEM (p. 129)
99. SUSTAINABLE LOGISTICS PROVIDER EVALUATION USING ROUGH SET THEORY AND AHP (p. 130)
100. CRITERIA CLASSIFICATION FOR COST AND QUALITY ASSESSMENT OF SUPPLIERS (p. 132)
101. TRAVEL TIME PREDICTION USING BIG DATA ANALYTICS (p. 134)
102. ADAPTATION OF TRADITIONAL NEWSVENDOR MODEL FOR VARIABLE PER-UNIT-COST OF UNDERSTOCKING AND OVERSTOCKING (p. 135)
103. USAGE OF RESOURCE CALENDAR AND BINARY INTEGER PROGRAMMING TO SCHEDULE JOBS IN A WASTE MANAGEMENT SCENARIO (p. 136)
104. SURVIVAL ANALYSIS IN SUPPLY CHAINS USING PROBIT STICK BREAKING PROCESS (p. 137)
105. CULTIVATION OF ORGANIC CROP USING THE METHOD OF GOAL PROGRAMMING (p. 138)
106. ASSESSING THE IMPACT OF CATASTROPHIC FLOOD EVENTS ON A TERRAIN (p. 140)
107. A HEURISTIC OPTIMIZATION SOLUTION FOR THE SELECTION OF TRANSFORMATION FUNCTIONS FOR MEDIA CHANNELS IN MMM (p. 141)
ARTIFICIAL INTELLIGENCE
BANKING, FINANCIAL SERVICE AND INSURANCE (BFSI)
108. APPLICATION OF MACHINE LEARNING IN INSURANCE UNDERWRITING (p. 144)
109. GENEROUS: GENERATE KNOWLEDGE GRAPH FROM UNSTRUCTURED TEXT (p. 145)
110. AUTOMATED TRADING OF BITCOINS USING DEEP LEARNING (p. 146)

MANUFACTURING
111. ACTIVE DEEP LEARNING ON BIG DATA FOR ASSEMBLY PLANT FOR IMPROVING ROBOTIC ENERGY EFFICIENCY (p. 148)
112. MACHINE LEARNING APPROACH FOR OCR OF DOT CODE IN TIRES (p. 149)

RETAIL
113. AN EXPERT SYSTEM FOR PERFUME SELECTION USING ARTIFICIAL NEURAL NETWORK (p. 151)
114. PRAGMATIC ANALYSIS OF CUSTOMER SENTIMENTS TOWARDS FMCG BRANDS - PATANJALI AND NESTLE (p. 152)
115. A FACIAL RECOGNITION BASED APPROACH TO LEVERAGE CCTV VIDEO DATA FOR REAL TIME CUSTOMER LEVEL PROACTIVE ACTIONS AS WELL AS MEASURING VARIATION OF STOCHASTIC DISTRIBUTION AT ORGANIZATIONAL LEVEL (p. 153)
116. NEURAL NETWORK ADOPTION IN TIME SERIES FORECASTING - A COMPARATIVE ANALYSIS (p. 155)

SERVICES
117. ANALYSIS OF MONOGENIC HUMAN ACTIVITY RECOGNITION DATA USING DATA MINING ALGORITHMS (p. 157)
118. ADVENT OF ARTIFICIAL INTELLIGENCE IN CUSTOMER EXPERIENCE TRANSFORMATION (p. 158)
119. NEURAL NETWORK BASED APPROACH FOR TOURISM DEMAND FORECASTING (p. 160)
120. IRIS – A COGNITIVE DECISION ENGINE (p. 161)
121.
122. ANOMALY DETECTION IN NETWORKING LOGS USING UNSUPERVISED AUTOENCODER LEARNING (p. 163)
123. TEXT ANALYTICS FOR RELATIONSHIP EXTRACTION TO CONVERT SENTENCES TO EQUATIONS (p. 164)
124. FACIAL COMPOSITE USING DNA DECODING (p. 165)
125. Q-MAP: CLINICAL CONCEPT MINING WITH PHRASE SENSE DISAMBIGUATION (p. 166)
126. DIABETIC EYE DISEASE DETECTION (p. 167)
127. CUSTOMER SUCCESS USING DEEP LEARNING (p. 169)
128. CISCO SERVICES DIGITIZATION USING CHAT BOTS AND MACHINE LEARNING (p. 170)
129. A NEURAL NETWORK BASED ATTRITION PREDICTION MODEL FOR A MANUFACTURING PLANT (p. 172)
130. A NOVEL MACHINE LEARNING BASED APPROACH FOR GENERIC FORM PROCESSING (p. 173)
131. TRACTABLE MACHINE LEARNING FRAMEWORK FOR IoT SENSOR DATA (p. 174)
132. A SECURE PROTOCOL FOR HIGH DIMENSIONAL BIGDATA PROVIDING DATA PRIVACY (p. 176)
133. LAWBO: A SMART LAWYER CHATBOT (p. 177)
134. MARKET WATCH - MOBILE APPLICATION SENTIMENT ANALYSIS FOR UNDERSTANDING THE VOICE OF THE APP USERS (p. 178)

OTHERS
135. CONVERSATIONAL USER INTERFACE: A PARADIGM SHIFT IN HUMAN-COMPUTER INTERACTION (p. 180)
136. THE EXISTENCE OF ANALYTICAL MYOPIA AND NEED FOR BIG DATA PSYCHOLOGISTS IN BUSINESS INTELLIGENCE (p. 181)
137. ALGORITHMIC BIAS – CAN MACHINE LEARNING GO WRONG? (p. 182)
138. TRUST AND EXPLAINABILITY FOR ARTIFICIAL INTELLIGENCE (p. 185)
139. DEEP AUTOREGRESSIVE IMMUNE SYSTEM LEARNING NEURAL CORRELATES OF DECISION MAKING (p. 186)
140. AI ENABLED CONVERSATIONAL ENTITY FOR PRODUCT RECOMMENDATION (p. 187)
141. EFFECT OF PCA AND APPLICATION OF MACHINE LEARNING ALGORITHMS ON PLANT LEAF CLASSIFICATION (p. 189)
142. APPLICATIONS OF DEEP LEARNING TO AUTONOMOUS VEHICLES (p. 190)
143. TEXT ANALYTICS IN SOCIAL STREAMS USING ARTIFICIAL NEURAL NETWORKS (p. 191)
144. STACKING WITH DYNAMIC WEIGHTS ON BASE MODELS (p. 192)
145. DETECTION OF MALWARE APPLICATIONS IN MOBILE DEVICES USING SUPERVISED MACHINE LEARNING ALGORITHMS (p. 193)
146. USER AUTHENTICATION USING STEGANOGRAPHY FOR BIG DATA IN MOBILE DATA CENTER (p. 194)
PREDICTIVE ANALYTICS

BANKING, FINANCIAL SERVICES AND INSURANCE

BFSI

HIDDEN MARKOV MODELS FOR STOCK PRICE PREDICTION
-A STUDY

Prof. P.V. Chandrika


Assistant Professor
Welingkar Institute of Management
Bangalore, India
Email: chandrika.hr@gmail.com
Dr. Hema Doreswamy*
Associate Professor
Welingkar Institute of Management
Bangalore, India
Email: hema.doreswamy@welingkar.org

Abstract

The objective of investment activity is to get a reasonably good return at a certain level of risk. The definition of a reasonably good return and an acceptable risk level changes with the age, job profile, growth opportunities and investment objectives of the investor. Risk and return always go hand in hand, yet every investor wants to make good returns. The list of investment options open to an investor starts with treasury bills, which carry near-zero risk and an assured return, and ends with investments in the equity shares of listed companies, where both risk and return are unpredictable. But for a good return and capital appreciation, investment in stocks, either directly or through mutual funds, is inevitable. The only option available to investors and fund managers is to analyze stock prices over a previous period and try to predict the stock movement in the future; this helps with hold, buy or sell decisions. Predicting stock prices started as early as the 1950s, when financial economists began using probability theory and statistics heavily to predict stock prices. The efficient market hypothesis (EMH), technical analysis and fundamental analysis are a few very popular approaches. But predicting stock prices for a future period has never been highly successful: stock market collapses, with investors losing huge amounts of money, are a regular feature. Also, a few of the widely discussed and practiced stock market prediction models are not fully accepted by certain value investors; Warren Buffett, for example, does not believe in the EMH. Many new tools and techniques are being experimented with to predict stock prices with more accuracy. The Hidden Markov Model (HMM) is one such statistical method used to predict stock prices. In this article the authors build an HMM and apply it to 5 stocks listed on the Bombay Stock Exchange and the National Stock Exchange: Asian Paints, Dr. Reddy's Laboratories Ltd, Hero MotoCorp, Larsen and Toubro Ltd and Infosys. A time frame of 10 years, from 2006 to 2016, is considered for the study.
Keywords: Investment, Risk and Return, Statistical Models, Hidden Markov Model (HMM)
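
As a rough illustration of the kind of model the abstract describes (not the authors' implementation), a regime-switching Gaussian HMM can be fitted to daily log returns with the hmmlearn library; the input file name, the choice of three hidden states and the one-step-ahead read-out are all assumptions made for this sketch:

    import numpy as np
    import pandas as pd
    from hmmlearn.hmm import GaussianHMM  # pip install hmmlearn

    # Hypothetical input file: daily closing prices for one stock, e.g. Infosys.
    prices = pd.read_csv("infosys_2006_2016.csv", parse_dates=["Date"],
                         index_col="Date")["Close"]
    returns = np.log(prices).diff().dropna().to_numpy().reshape(-1, 1)

    # Fit a 3-state Gaussian HMM; each hidden state captures a market regime
    # (e.g. bearish / sideways / bullish) with its own return mean and variance.
    hmm = GaussianHMM(n_components=3, covariance_type="diag",
                      n_iter=200, random_state=42)
    hmm.fit(returns)
    states = hmm.predict(returns)  # most likely regime on each trading day

    # A naive one-step-ahead read-out: expected next-day return given today's state.
    next_state_probs = hmm.transmat_[states[-1]]
    print("expected next-day log return:", float(next_state_probs @ hmm.means_[:, 0]))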

CREDIT SCORING USING HYBRID INTELLIGENT TECHNIQUE: AN
APPLICATION TO RETAIL BANKING

Sanjay Nagtode (a), Hema Date (b)

(a) Fellow, National Institute of Industrial Engineering (NITIE), Mumbai 400087, India
(b) Professor, National Institute of Industrial Engineering (NITIE), Mumbai 400087, India

Abstract

Credit scoring has been regarded as a core appraisal tool of banking during the last few decades, and it has been widely investigated in the area of finance in general and the banking sector in particular. Credit risk decisions are key determinants in managing the Non-Performing Assets (NPAs) of a bank, because huge losses result from wrong decisions.
During the past few years, many researchers have carried out studies using various statistical tools and data mining techniques to build intelligent systems for evaluating and improving the accuracy of credit scoring models. Most of them have predicted a dichotomous output - whether the customer will repay the loan (GOOD credit) or will default on repayment of the loan (BAD credit), increasing the bank's Non-Performing Assets and thereby affecting its profitability. Non-Performing Assets have thus been a major area of concern for the banking industry over the years.
This research proposes a quantitative model for the prediction of loan default using hybrid intelligent techniques for evaluating credit risk. A two-stage CART-MLP (Classification and Regression Tree - Multi-Layer Perceptron) model is proposed for the prediction of loan default, as per the Prudential Norms on Income Recognition, Asset Classification (IRAC) and Provisioning pertaining to advances as defined by the RBI, and the study explores whether the proposed model outperforms other traditional techniques and models. In the proposed model, credit risk is classified into three categories as per the IRAC norms of the RBI, viz. Standard, Sub-Standard, and Doubtful & Loss. The study uses a dataset containing account information, demographic information and loan repayment details of the customers of a bank. The model is benchmarked against various other popular models and techniques used so far for the evaluation of credit scores, to see whether the proposed hybrid CART-MLP model yields better classification accuracy and significantly lower misclassification.
Keywords: Credit Scoring, Multinomial Logistic Regression (MLR), Discriminant Analysis (DA), Classification and Regression Tree (CART), Multi-Layer Perceptron (MLP), Non-Performing Assets (NPA)

A ROBUST PREDICTIVE MODEL FOR STOCK PRICE
FORECASTING

Jaydip Sen
Professor
Praxis Business School
Kolkata, INDIA
jaydip@praxis.ac.in

Tamal Datta Chaudhuri


Professor
Calcutta Business School
Kolkata, INDIA
tamalc@calcuttabusinessschool.org

Abstract

Prediction of the future movement of stock prices has been the subject of much research. On one hand, we have proponents of the Efficient Market Hypothesis who claim that stock prices cannot be predicted accurately. On the other hand, there are propositions showing that, if appropriately modelled, stock prices can be predicted fairly accurately; this latter line of work has focused on the choice of variables, appropriate functional forms and forecasting techniques. This work proposes a granular approach to stock price prediction by combining statistical and machine learning methods with some concepts advanced in the literature on technical analysis. The objective of our work is to take daily stock price data at 5-minute intervals from the National Stock Exchange (NSE) in India and develop a forecasting framework for stock prices. Our contention is that such a granular approach can model the inherent dynamics and can be fine-tuned for immediate forecasting. Six different techniques, three regression-based and three classification-based, are applied to model and predict the stock price movement of two stocks listed on the NSE - Tata Steel and Hero Moto. Extensive results are provided on the performance of these forecasting techniques for both stocks.

Keywords: Stock Price Prediction, Multivariate Regression, Logistic Regression, Decision Tree, Artificial Neural Networks.
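
A compressed sketch of the regression-versus-classification framing on lagged 5-minute returns; the data here is synthetic noise, so the scores below only demonstrate the mechanics, not the paper's results:

    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression

    rng = np.random.default_rng(0)
    r = rng.normal(0, 0.001, 5000)    # stand-in series of 5-minute log returns

    lags = 5                          # features: the previous five 5-minute returns
    X = np.column_stack([r[i:len(r) - lags + i] for i in range(lags)])
    y_next = r[lags:]                 # regression target: next-interval return
    y_dir = (y_next > 0).astype(int)  # classification target: up/down direction

    cut = 4000                        # simple chronological train/test split
    reg = LinearRegression().fit(X[:cut], y_next[:cut])
    clf = LogisticRegression().fit(X[:cut], y_dir[:cut])
    print("R^2 (regression):", reg.score(X[cut:], y_next[cut:]))
    print("direction accuracy (classification):", clf.score(X[cut:], y_dir[cut:]))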

ADVANCED DATA ANALYTICS SOLUTION TRANSFORMING THE
TARGETING STRATEGY FOR A MID-SIZED FINANCIAL COMPANY
IN USA

Analyttica Datalab Inc., Bangalore, India


Pallavi Goswami (Pallavi.goswami@analyttica.com)
Tanmoy Das (Tanmoy.das@analyttica.com)
Subhadra Dutta (Subhadra.dutta@analyttica.com)
Satyabrata Samanta (Satyabrata.samanta@analyttica.com)

Abstract

Analyttica is a niche data science and advanced business analytics company focused on the Banking and Financial Services (BFS) industry. We create incremental business impact for our clients by developing custom, innovative solutions for them in the predictive and prescriptive analytics space.
The client for which this solution was developed is a US-based, mid-sized Non-Banking Financial Company (NBFC) that operates in a niche market segment. The NBFC is in the business of extending credit (in the form of personal loans) to customers who are indexed high in terms of consumer credit risk. They wanted to leverage the power of advanced data analytics to optimize their business actions for marketing efficiency and risk management, and Analyttica was engaged to achieve this. The business integrated advanced analytics and machine learning to design a solution that entailed identification of the right prospects for targeting, leading to optimization of marketing and operational cost. This also mitigated the long-term credit losses incurred from customers who have a tendency to default on loans.
Consumer credit bureau (CCB) data was used for building the solution; the CCB is the single largest, most comprehensive, reliable and effective source of external data in the USA. An ensemble solution approach was undertaken that involved innovative feature extraction, dimensionality reduction and the application of advanced statistical and machine learning techniques. Although the solution was custom made for the NBFC, the design turned out to be repeatable, scalable and customizable for similar business objectives in the consumer lending space.
Around 1000 parameters were procured from the CCB. Time series attributes were leveraged to create 270 additional custom derived variables through a heuristic feature extraction approach, which extracted insights from the customer's behavioral trend across four custom feature categories: (1) Magnitude, (2) Velocity, (3) Acceleration and (4) Index. A further 130 custom variables were created to capture the interactions amongst key customer characteristics. These additional 400 variables helped in understanding the dynamics of customer behaviour better. This exercise was followed by dimensionality reduction, to reduce the bureau cost and enable long-term maintenance of the solution.
Techniques such as Principal Component Analysis (PCA), variable clustering and Information Value were used for dimensionality reduction to arrive at an optimum set of attributes. The number of attributes was reduced to one-third while losing only one-tenth of the information, which reduced the CCB procurement cost for the NBFC by 60%. This was followed by building an ensemble machine learning solution that involved response and risk segmentation using techniques such as CHAID, and modeling using techniques such as logistic regression. Sampling, in-time and out-of-time validation ensured the robustness and reliability of the solution. The solution improved the customer response rate by 40%, decreased the cost per acquisition by 20% and significantly mitigated the overall credit risk of the portfolio.
Analyttica's in-house point-and-click advanced data analytics and machine learning platform, Analyttica Treasure Hunt (ATH), was leveraged as a tool to solve this business problem. As a continuous improvement approach, ATH is also being used as a sandbox/data-lab environment to continuously experiment and identify opportunities to improve the solution by evaluating the applicability of new and different statistical approaches and machine learning algorithms.

Keywords: Consumer Credit Information, Custom Feature Extraction, Machine Learning Techniques, Response Segmentation, Risk Segmentation
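
As an illustration of the variance-retention trade-off described above (roughly two-thirds fewer attributes for one-tenth information loss), scikit-learn's PCA can be asked directly for the number of components that preserve 90% of the variance; the correlated data below is synthetic, not bureau data:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(1)
    # Synthetic stand-in for heavily correlated bureau and derived attributes.
    latent = rng.normal(size=(5000, 60))
    X = latent @ rng.normal(size=(60, 300)) + 0.1 * rng.normal(size=(5000, 300))

    # Keep just enough principal components to retain 90% of the total variance,
    # i.e. lose only about one-tenth of the information in the attributes.
    pca = PCA(n_components=0.90).fit(StandardScaler().fit_transform(X))
    print(f"{X.shape[1]} attributes -> {pca.n_components_} components "
          f"({pca.explained_variance_ratio_.sum():.1%} variance retained)")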

A STUDY ON INDIAN STOCK MARKET USING TEXTUAL AND
OPINION MINING ANALYSIS OF FINANCIAL NEWS HEADLINES
WITH SPECIAL REFERENCE TO BUSINESS STANDARD
NEWSPAPER
A. Pappu Rajan (1), S. Suresh (2)
(1) Associate Professor
St. Joseph's Institute of Management
St. Joseph's College (Autonomous)
Tiruchirappalli, Tamil Nadu
Ap_rajan2001@yahoo.com
(2) Assistant Professor
St. Joseph's Institute of Management
St. Joseph's College (Autonomous)
Tiruchirappalli, Tamil Nadu

Abstract

It is a known fact that news and stock prices are closely related, and news usually has a great influence on stock market investment. There has been much research aimed at identifying that relationship or predicting stock market movements using news analysis. This research work introduces a method of text and opinion mining to analyze the financial headlines of the Business Standard newspaper and predict the rise and fall of the Bombay Stock Exchange. A system is designed and implemented to predict stock price trends for the time immediately after the publication of news articles. This system consists mainly of three components: the first gathers news articles and stock prices; the second categorizes the news articles into predefined categories; and the third applies appropriate analysis strategies depending on the category of the news article. Thus this study focuses on analyzing the trend of the stock market based on the financial news articles released on a particular day.

Keywords: News analytics, Business Analytics, Web sentiment analytics.
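
A minimal sketch of the categorize-and-analyze components, using a TF-IDF bag-of-words classifier as a generic stand-in for the paper's method; the tiny labelled headline sample is invented purely for illustration:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Invented labelled headlines: 1 = likely positive for the index, 0 = negative.
    headlines = [
        "Sensex surges as bank earnings beat estimates",
        "Rupee slides to record low, markets tumble",
        "IT majors post strong quarterly profit growth",
        "Inflation spike triggers broad sell-off on bourses",
    ]
    labels = [1, 0, 1, 0]

    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    model.fit(headlines, labels)
    print(model.predict(["Auto sales jump on festive demand"]))  # expected: [1]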

IMPACT OF INVESTOR SENTIMENT ON STOCK PRICES
Prajwal Eachempati
FPM, IT SYSTEMS
Indian Institute of Management Rohtak
Rohtak, Haryana
fpm03.007@iimrohtak.ac.in

Dr. Praveen Ranjan Srivastava*


Faculty, IT Systems
Indian Institute of Management Rohtak
Rohtak, Haryana
praveen.ranjan@iimrohtak.ac.in

Abstract
Capital markets are influenced by several forces: environmental factors like fiscal and monetary policies and budgetary matters; leading, lagging and coincident economic indicators; the industry affiliations of a company and its management policies; internal financial and non-financial strengths, weaknesses, opportunities and threats identified from time to time; the composition of its equity shareholders; and a host of other factors playing in the markets, the media, global news and expert views. It has always remained a puzzle why the equity share prices of a company and market moods cannot be predicted with precision. Of late, algorithmic trading methods are being used to predict the market with near accuracy, but they have a long way to go. In this context, studies have attempted to interpolate stock market analysis with Behavioural Finance to plug the gap in price valuation attributed to the human sentiment of market players. But retail investors constitute only 3.3% of the Indian market, and since technical analysis deals with the impact of trade volume (demand-supply) on share price movement, the sentiments expressed may not match the trade volume: the analyst has no evidence that the sentiments captured cover most of the investors contributing to the trade volume, which undermines studies that explained the valuation gap through human sentiment alone. A SEBI report, however, showed that though individual investors are small, they contributed a larger volume of turnover, due either to a larger number of trades by retail investors, or to the shares traded by retail investors being of higher average value, or to a combination of the two; this trend appears to be on the decline from 2013-14. The objective of this paper is to validate the impact of investor sentiment on stock prices and ascertain whether it is a sufficient factor for accurate prediction. For this, an event-based study pertaining to nine listed companies has been carried out to evaluate the impact of event-based sentiments on stock prices through regression analysis. Sufficiency is refuted, and the need for an integrated analysis for the prediction of stock prices is emphasized in this paper.

Keywords: sentiment, indices, stocks, machine learning, Q-ratio

FINANCIAL REPORTING WITH KNOWLEDGE CONVERSION AND
NATURAL LANGUAGE PROCESSING

Amit Mitra
Associate Consultant, CGI
Bengaluru, India
amit.mitra@cgi.com
Siddharth Prem Kumar*
Software Engineer, CGI
Chennai, India
siddharth.premkumar@cgi.com
Sheifali Agarwal *
Software Engineer, CGI
Chennai, India
sheifali.agarwal@cgi.com
Sinu Antony *
Associate Consultant, CGI
Bangalore, India
sinu.antony@cgi.com

Abstract

Financial reporting is the most important reporting module for insurance companies. Currently, insurance companies rely either on commercial off-the-shelf (COTS) reporting products or on bespoke, custom-developed reporting solutions. The turnaround time for generating new reports, whether on legacy or even modern COTS core insurance products, varies from weeks to months. Although we are witnessing self-service Business Intelligence commercial products, these too require a certain amount of training for business professionals. Sometimes requests are pushed to the IT division so that data scientists can help untangle the complexities of the data behind the COTS reporting or self-service reporting tools.
As regulatory pressure on insurance companies grows continuously, the need for fast reporting grows with it. But this is not sustainable with traditional COTS or even self-service reporting tools. These tools predominantly rely on structured query languages for RDBMSs, or in some cases NoSQL databases, which require a certain amount of training not only for Information Technology professionals but also for business users. And when it comes to the insurance industry, the complexity of financial services regulations on top of SQL makes fast reporting nearly impossible on existing legacy systems or enterprise platforms that rely primarily on relational databases.
We took up a research project to query an insurance RDBMS by converting insurance knowledge-based queries into SQL using natural language processing and neural machine translation methodologies built on open-source technologies. The paper describes the problem, the processing pipeline, data preparation, the test setup, results, conclusions and the scope for future work.

IDENTIFYING EARLY SIGNS OF DEMAND SHRINKAGE OF A
PRODUCT
Veeranagouda Goudra
Senior Business Intelligence Analyst
Rockwell Collins, Bangalore, India
veerustat@gmail.com
Ravi H.S*
Senior Data Scientist
Dell Pvt. Ltd.
Bangalore, India
ravistat@gmail.com

Bhimangouda Biradar*
Lead Analyst
Epsilon Pvt. Ltd., Bangalore, India
biradarbhimu@gmail.com
Thejeswini. R*
Analyst
Fidelity Investments, Bangalore, India
thejeswini.r@fmr.com
Nithin Reddy A. R*
Analyst, Ambertag Analytics Pvt. Ltd.
Bangalore, India
nithinreddyar@gmail.com

Abstract

The biggest hurdle for the supply chain industry is tracing the early signs of demand shrinkage of a product. The need here is to proactively identify the stage at which a product earns no more profit for the manufacturer because of its shrinking demand in the market. The approach is to identify homogeneous products and fit a Weibull curve to those product families, and finally to estimate the distribution parameters to identify the end of a product's life cycle in terms of market demand. Ultimately, the paper will help manufacturers decide the point at which to stop investing in such products, so as to avoid the multiple costs involved in the process and to divert the same budget to emerging products of the same family.

Keywords: Supply chain, profit, shrinking demand, clustering, distributions
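
A small sketch of the curve-fitting step, fitting a Weibull distribution to a product family's demand-life data with SciPy; the simulated lifetimes and SciPy's parameterisation are assumptions of this sketch, not the paper's:

    from scipy import stats

    # Simulated stand-in: ages (in months) at which demand effectively ended
    # for items in one homogeneous product family.
    lifetimes = stats.weibull_min.rvs(c=1.8, scale=36.0, size=400, random_state=2)

    # Estimate the shape (c) and scale parameters; fixing loc=0 is usual for lifetimes.
    c, loc, scale = stats.weibull_min.fit(lifetimes, floc=0)
    print(f"shape={c:.2f}, scale={scale:.1f} months")

    # E.g. the age by which 95% of the family's demand life is over:
    print("95th percentile of demand life:", stats.weibull_min.ppf(0.95, c, loc, scale))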

THE IMPACT OF VOLATILITY INDEX ON INDIAN STOCK
MARKET

Siddharth Haran
Student
Christ University
Bengaluru, India
siddharth.haran@bba.christuniversity.in
Dr. Sangeetha R
Associate Professor
Christ University
Bengaluru, India
sangeetha.r@christuniversity.in

Abstract

Since the Cold War, Indo-U.S. relations have only improved, and trade relations between the two countries are at an all-time high, with an evident impact on the Indian stock market. The Chicago Board Options Exchange Market Volatility Index (VIX) depicts stock market volatility in America and indicates investors' expectations of the stock market. Departing from previous studies of the stock markets in India and the United States, this paper focuses on the VIX to determine its impact on the Nifty50 index. The aim is to determine whether the VIX influences the Nifty50 index and to analyse the extent to which the VIX influences the rate of return of the Indian stock market. Using the GARCH econometric model, the correlation between the VIX and the volatility of the Indian stock market will be estimated, along with whether a leverage effect exists between them. Using Python, the future trend of the indices will be determined. The results will support both better management of Indian stock market vibrancy and strategic recommendations regarding investment. They will also provide policy makers with a finer understanding of the volatility of the stock markets in India and help them make effective decisions on the same.

Keywords: Volatility Index (VIX), Indo-U.S. stock markets, GARCH, Python, Nifty50.
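
A minimal sketch of the GARCH step with the Python arch package; synthetic returns stand in for the Nifty50 series, and, as noted in the comment, adding an asymmetry term is one standard way to test for the leverage effect:

    import numpy as np
    from arch import arch_model  # pip install arch

    rng = np.random.default_rng(3)
    returns = rng.standard_t(df=6, size=1500)  # stand-in daily returns, in percent

    # GARCH(1,1) with Student-t errors; adding o=1 gives GJR-GARCH asymmetry terms,
    # which let one test whether negative shocks raise volatility more (leverage).
    res = arch_model(returns, vol="GARCH", p=1, q=1, dist="t").fit(disp="off")
    print(res.summary())

    forecast = res.forecast(horizon=5)
    print(forecast.variance.iloc[-1])  # 5-step-ahead conditional variance path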

SPEED OF ADJUSTMENT TOWARDS TARGET CAPITAL
STRUCTURE: AN INDIAN PERSPECTIVE

Selva Kumar. D* and Dinabandhu Bag

Abstract

On a positive note, Indian firms do have target debt ratios, and they adjust towards their targets faster over time owing to a higher speed of adjustment: the half-lives (in years) for the total debt ratio (1.50), the long-term debt ratio (1.82) and the short-term debt ratio (1.21) indicate amicable target points. This signifies that long-term debt plays a vital role for Indian firms in meeting their immediate investment opportunities, expansion and diversification initiatives. The study documents a statistically significant negative association of leverage with firm size, firm performance, promoters' shareholding and debtors turnover, exhibiting facets of the pecking-order theory of financing. Further, tangibility, non-debt tax shields, operating costs, R&D intensity and liquidity show mixed results across the leverage measures, depicting shifts from short-term to long-term debt.
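
For context, under the conventional partial-adjustment formulation (a note added here, not part of the study), the half-life h maps to an annual speed of adjustment \lambda as

    h = \frac{\ln 0.5}{\ln(1 - \lambda)} \quad\Longleftrightarrow\quad \lambda = 1 - 0.5^{1/h}

so, assuming that formulation, the reported half-lives of 1.50, 1.82 and 1.21 years would correspond to adjustment speeds of roughly 0.37, 0.32 and 0.44 per year for total, long-term and short-term debt respectively.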

FORECASTING AND DETERMINATION OF GOLD PRICE IN THE
WORLD GOLD MARKETS
Dr. S. Maria Immanuvel
Assistant Professor of Finance
St. Joseph’s Institute of Management
28/1 Primrose road, Off M G Road
Bangalore – 560025, India
E Mail : mariaimmanuvel@sjim.edu.in
Mobile: +91 94876 29563

Dr. D. Lazar
Professor
Department of Commerce
School of Management
Pondicherry University
Puducherry – 605 014, India
E Mail : lazar.dani@gmail.com
Mobile: +91 9486650016

Prof. Rajendra Desai


Dean
Management Development Centre
St Aloysius Institute of Management & Information Technology (AIMIT)
Kotekar Post, Madoor,Beeri
Mangalore-575022
E Mail : raj@staloysius.ac.in
Mobile: +91 98865 38504

Abstract

Gold is a universal commodity traded throughout the world, and the purchase of gold in countries like India is strongly interlinked with cultural and religious beliefs. This study is an attempt to find out whether the volume of gold consumption has any significant impact on the world gold spot price, known as the LBMA AM fix and PM fix prices. Six major gold-consuming countries and regions - India, Europe, the USA, the Middle East, China and Japan - are included in the empirical analysis. The study used quarterly data for the period from January 1994 to June 2017. An ARIMA model is used for price forecasting, and the VAR Granger Causality / Block Exogeneity Wald test is used to find the impact of the volume of gold demand on world gold prices. The study concludes that the volume of gold demand significantly affects the world gold spot prices, the LBMA AM fix and PM fix prices. All the countries together influence the world gold price; an individual effect is felt only from India and China, and not from any other country. The forecasting results suggest that both ARMA(1,1) and ARIMA(1,1,1) produce more or less similar predictions, and the actual values are well within the forecast intervals. The forecasting accuracy is validated with the mean absolute error, mean absolute percentage error and root mean squared error.

Key Words: World Gold Market, Gold demand, Wald test, Forecasting, ARIMA
JEL Classification Code: G15, Q02, Q31
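
A skeletal statsmodels version of the two tools named, ARIMA(1,1,1) forecasting and a VAR Granger causality / Wald test; the series, variable names and lag order are placeholders, not the study's data:

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.api import VAR
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(4)
    n = 94  # quarterly observations, 1994Q1 to 2017Q2
    gold_price = pd.Series(np.cumsum(rng.normal(2, 10, n)) + 400)  # stand-in LBMA fix
    demand_india = pd.Series(rng.normal(200, 30, n))               # stand-in demand volume

    # ARIMA(1,1,1) forecast of the spot price, four quarters ahead.
    fit = ARIMA(gold_price, order=(1, 1, 1)).fit()
    print(fit.forecast(steps=4))

    # VAR Granger causality / block exogeneity Wald test: does demand move price?
    data = pd.concat({"price": gold_price.diff(), "demand": demand_india}, axis=1).dropna()
    var_fit = VAR(data).fit(maxlags=4)
    print(var_fit.test_causality("price", ["demand"], kind="wald").summary())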

ANALYSIS AND PREDICTION OF INDIAN ECONOMIC GROWTH
BEFORE AND AFTER GST
Abstract

This study analyzes the efficiency of the GST taxation system by comparing it with the previous taxation systems levied by the government over the years. The main focus of our study is the analysis of the different types of taxes before the implementation of GST, over the years 2000 to 2016, based on the various factors that were dominant over that period, and their effect on the gross domestic product of our country. Datasets with a wide scope for analysis are obtained by considering the necessary attributes, and the GDP of the nation is predicted, as if GST had not been introduced, for the years 2017, 2020, 2025, 2030, 2035 and 2040. This is implemented using the trend for the year 2017-18. The predictions of the indirect and direct tax totals are compared with the current GST tax rates. The inference is whether the current GST will be effective and help the economic growth of our country, or whether continuing with the previous taxation system would have been a wiser choice by the government.
Keywords: GDP, Direct tax, Indirect tax, GST, Growth rate.

ONLINE VERSUS OFFLINE BANKING SERVICES USAGE:
APPLICATION OF MULTINOMIAL LOGISTIC REGRESSION
Jinal Kirit Parikh
Assistant Professor
Amrut Mody School of Management, Ahmedabad University
Ahmedabad, India
jinal.parikh@ahduni.edu.in

Abstract

BACKGROUND
In the last few years, extensive research has been conducted to test the implementation of
technology and its adoption by the customers. When users come in contact with a new
technology, there are several factors that influence the usage and adoption of technology
driven services. There are some studies that analyse IT characteristics such as usefulness, ease
of use and/or security (Davis, 1989; Yu et al., 2005) whereas others focus on the emotions and
experiences of users (Agarwal and Prasad, 2000; Fiore and Kim, 2007).
The Government of India is undergoing financial reforms to make the banking services sector
more robust by adopting the use of technology. In one of the research studies on Indian
consumers by Chawla and Joshi, mobile users were segmented into three clusters based on
their perceptions of various factors influencing mobile banking. The segments were called
technology adoption (TA) leaders, TA followers and TA laggards. Attitude and Intentions
toward mobile banking varied significantly across all the three segments where TA leaders
had the most favorable attitudes and intentions followed by TA followers, and TA laggards
(Chawla & Joshi, 2017). In another study, five major influencing factors viz. perceived risk,
compatibility of software, customer profile, external threat and relative advantage were
considered. Perceived usefulness, perceived relative advantage, perceived risk, social impact and security still remain concerns when it comes to m-banking and e-banking. An m-banking or e-banking experience for a customer can be enhanced if factors such as time, convenience, ease and safety are taken care of by the service-providing firm (Bhatt, 2016).

THE NEED
Based on literature review and background, the study has been undertaken with the following
objectives:
(i) To explore the variables considered by customers when using online versus offline banking services.
(ii) To analyse whether there is a significant difference between customers' usage of the two modes of banking services, viz. online and offline.
(iii) To find the relative importance of these variables in distinguishing between the usage of online versus offline banking services.

METHODOLOGY
Research Design
Data collection

The data were obtained in India through a survey conducted using personal, telephonic and email interviewing techniques. In order to ensure representativeness of the population, respondents across all categories of age, gender, income and occupation were chosen to be part of the study. A total of 512 respondents were contacted, which resulted in 147 completed responses.

Data Analysis
Data analysis has been conducted using Multinomial Logistic Regression through SPSS.

EXPECTED FINDINGS

The results will help banking service providers understand whether they need similar or different strategies for these two categories of Indian consumers in order to differentiate themselves. They will also indicate the relative importance of each of the variables perceived by customers in the usage of the online versus offline mode of banking services.

Keywords: online banking, offline banking, multinomial logistic regression, customer perceptions
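
A bare-bones stand-in for the multinomial logistic regression step (the study itself runs it in SPSS); scikit-learn's LogisticRegression handles the multinomial outcome directly, and every variable below is an invented placeholder:

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(5)
    n = 147  # completed responses in the study
    X = pd.DataFrame({
        "perceived_ease": rng.integers(1, 6, n),   # invented Likert-style variables
        "perceived_risk": rng.integers(1, 6, n),
        "age": rng.integers(18, 70, n),
    })
    usage = rng.choice(["online", "offline", "both"], n)  # multinomial outcome

    clf = LogisticRegression(max_iter=1000).fit(X, usage)
    # Per-class coefficient magnitudes give a first cut at relative variable importance.
    print(pd.DataFrame(clf.coef_, index=clf.classes_, columns=X.columns))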

PROPENSITY TO DEFAULT MODEL FOR ONLINE CREDIT
MARKETPLACE

Pavanraj Talawar
PGDM in Business Analytics,
REVA University, Bengaluru
Pavanrajt.BA01@reva.edu.in

Lakshmi D
PGDM in Business Analytics,
REVA University, Bengaluru
Lakshmid.BA01@reva.edu.in

Abstract

Default risk is the chance that companies or individuals will be unable to make the required payments on their debt obligations. Lenders and investors are exposed to default risk in virtually all forms of credit extension; in the event of a default, lenders may lose out on periodic interest payments and, in many cases, the entire principal amount. The objective of this study is to build a 'Propensity to Default' model which can predict the probability of default. We have also tried to understand the important variables that explain default behaviour. For the purpose of our study, we have used the loan approval data of Lending Club, an online credit marketplace. Lending Club is the world's largest online credit marketplace, facilitating personal loans, business loans, and financing for elective medical procedures. The company operates fully online with no branch infrastructure, and uses technology to lower costs. The predictive models are built on actual data collected on the customers of this lending platform over a period of 4 years (2007 - 2011). We have used various Python libraries to perform detailed Exploratory Data Analysis (EDA) to display the vital statistics of different features. We have also built multiple models (Logistic Regression, Random Forest, Deep Learning, etc.) to compare and select the most appropriate and accurate model. These predictive models can be used by online credit marketplaces to optimize their loan repayment success by targeting the right borrowers and modifying their loan structures if required.

Keywords: Default risk, Propensity to Default, Lending Club
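
An abbreviated sketch of the modelling comparison described; the file name and column handling follow the public Lending Club 2007-2011 export, but both should be treated as assumptions of this sketch:

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # Assumed file and columns following the public 2007-2011 Lending Club export.
    df = pd.read_csv("loan_2007_2011.csv", low_memory=False)
    df = df[df["loan_status"].isin(["Fully Paid", "Charged Off"])]
    y = (df["loan_status"] == "Charged Off").astype(int)  # 1 = defaulted

    cols = ["loan_amnt", "int_rate", "annual_inc", "dti"]  # tiny illustrative subset
    X = df[cols].copy()
    X["int_rate"] = X["int_rate"].astype(str).str.rstrip("%")  # "10.65%" -> 10.65
    X = X.apply(pd.to_numeric, errors="coerce").fillna(0)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    for model in (LogisticRegression(max_iter=1000),
                  RandomForestClassifier(n_estimators=200, random_state=0)):
        prob = model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
        print(type(model).__name__, "AUC:", round(roc_auc_score(y_te, prob), 3))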

CLASSIFICATION MODELLING USING DECISION TREE

Bonnie Bernard
Student, PGDM-Business Analytics
Senior Manager - Transformation Lead
TATA Consultancy Services
REVA University
Bangalore, India
bonnieb.ba01@reva.edu.in

Abstract

One of the operational challenges that investment banks face today is maintaining highly accurate and consistent data across various systems and databases. Investment banks are generally slow in adopting newer, more robust platforms and systems. Though several such next-gen platforms are offered by service providers in the market, due to high cost and legacy information these transitions span anywhere between three and five years. A simple use case such as onboarding a new customer onto the bank's trading platform involves, on average, six to seven different downstream systems, and all of this is done manually. Though in recent years investment banks have been opening up to the concept of Robotic Process Automation, and in some cases leveraging Artificial Intelligence, the potential for big data analytics is very prominent in the days and years to come.
This paper covers a simple use case for predicting transactional errors in processing for an onboarding-cum-fulfilment function across various products such as Equities, Fixed Income, Forex and Money Markets, and Over-the-Counter trades.
The data set comprises the records of transactions performed by the team over the last 6 months, with about 11 variables. The target variable is 'Returnflag' (0 = FALSE / 1 = TRUE), and the data is imbalanced: the class '1' is significantly under-represented (0.75%).
Due to its simplicity and ease of understanding by non-analytics stakeholders, i.e. the operations staff, the decision tree algorithm is used as the technique for building the classification model. Identifying potentially bad transactions will help the process control the number of bad transactions flowing into the system. This will significantly reduce remediation effort and improve first-pass yield, which is one of the key performance indicators of the process.

Keywords: Error Prediction, Decision Tree, classification model for imbalanced data, over- and under-sampling
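
A small sketch of a decision tree on data with a roughly 0.75% positive class, combining naive over-sampling of the rare class with class weighting; the data is synthetic and every threshold is illustrative:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # Synthetic stand-in: 11 features, positives ~0.75% as in the use case.
    X, y = make_classification(n_samples=40000, n_features=11, weights=[0.9925],
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    # Naive over-sampling: repeat minority rows so the tree sees them often enough.
    pos = np.where(y_tr == 1)[0]
    idx = np.concatenate([np.arange(len(y_tr)), np.repeat(pos, 20)])
    tree = DecisionTreeClassifier(max_depth=6, class_weight="balanced", random_state=0)
    tree.fit(X_tr[idx], y_tr[idx])
    print(classification_report(y_te, tree.predict(X_te), digits=3))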

DEEP LEARNING AND FRAUD DETECTION IN CREDIT CARD
TRANSACTIONS
Dinesh Ghanta
Student, PGDM in Business Analytics
RACE, REVA University
Bangalore, India
Dineshg.ba01@reva.edu.in

Kavitha Mailengi*
Student, PGDM in Business Analytics
RACE, REVA University
Bangalore, India
Kavitham.ba01@reva.edu.in

Ratnakar Pandey
India Head, Analytics and Data Science
Kabbage
ratnakarpandey20@gmail.com

Abstract

As per a recent article from CNBC News, around 15.4 million American consumers were victims of identity theft or other fraudulent activities in 2016, resulting in almost $16 billion in losses. Needless to say, real-time fraud detection has become an imperative for the global banking and financial services industry to survive in a very competitive and fast-changing environment.

Currently, most of the players in the BFSI industry use machine learning techniques such as logistic regression, decision trees, Elastic Net and Gradient Boosting Trees to identify potential fraud cases. However, with the ever-evolving modus operandi of fraudsters, we need to deploy self-learning/deep learning algorithms such as Autoencoders, LSTMs and other deep neural networks to stay a step ahead of the fraudsters.

In this paper, we have built an algorithm using a combination of two different deep learning networks, Multilayer Perceptrons and Autoencoders, to classify the labelled fraud cases of a European credit card company and achieve the highest levels of precision and recall. By adjusting hyper-parameters through grid search and using the Autoencoder output as input data, we are able to get good results with the MLP technique.

Keywords: Fraud Detection, Keras, Autoencoders, Multilayer Perceptrons, Artificial Neural Network (ANN)
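
A condensed Keras sketch of the autoencoder-plus-MLP combination described; the layer sizes, the 30-feature input echoing the public European card dataset, and the random stand-in data are all assumptions of this sketch:

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    n_features = 30  # e.g. Time, V1..V28, Amount in the public European card data
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, n_features)).astype("float32")  # stand-in data
    y = (rng.random(5000) < 0.02).astype("float32")            # rare fraud label

    # Autoencoder: learns a compressed representation of (mostly normal) transactions.
    inp = keras.Input(shape=(n_features,))
    code = layers.Dense(7, activation="relu")(layers.Dense(14, activation="relu")(inp))
    out = layers.Dense(n_features)(layers.Dense(14, activation="relu")(code))
    ae = keras.Model(inp, out)
    ae.compile(optimizer="adam", loss="mse")
    ae.fit(X, X, epochs=5, batch_size=256, verbose=0)

    # MLP classifier fed with the autoencoder's bottleneck encoding.
    encoder = keras.Model(inp, code)
    mlp = keras.Sequential([
        keras.Input(shape=(7,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    mlp.compile(optimizer="adam", loss="binary_crossentropy",
                metrics=[keras.metrics.Precision(), keras.metrics.Recall()])
    mlp.fit(encoder.predict(X, verbose=0), y, epochs=5, batch_size=256, verbose=0)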

MANUFACTURING

FORECASTING OF ENERGY FOR MANUFACTURERS USING
ARIMA AND NEURAL NETWORK
Iram Naim*
Research Scholar
Department of Polymer and Process Engineering, IIT, Roorkee, India
iram.naim03cs@gmail.com
Tripti Mahara
Assistant Professor
Department of Polymer and Process Engineering, IIT, Roorkee, India
triptimahara@gmail.com
Surendra Kumar
Additional General Manager
Department of Maintenances and Services, CFFP, Bharat Heavy Electrical Limited
Haridwar, India
suren@bhelhwr.co.in

Abstract

Coal, oil and natural gas are fossil energy resources available worldwide. These are non-renewable sources of energy, used mainly for electricity generation and as fuel for heating and transportation. Coal and oil are the traditional energy resources, but the use of natural gas as an energy resource for industrial consumption has recently increased substantially worldwide, the main reason being that natural gas is a cleaner fuel than its traditional counterparts. Forecasting industrial natural gas consumption is a desirable activity for an organization that utilizes this energy source extensively: it not only helps in efficient system operations but also leads to effective procurement strategies and tactical planning. The work here focuses on developing a forecasting model for CFFP, a plant of BHEL, to predict industrial natural gas consumption at the organization level. CFFP is a casting and forging manufacturing unit of BHEL at Haridwar and consumes energy in the form of natural gas for its production operations. This study is important because presently there is no system to predict the plant's natural gas consumption; procurement of natural gas from gas providers is done on the basis of demand that is anticipated manually. Thus, a forecast of the natural gas requirement is an essential tool that will aid CFFP in deciding its future requirements. As per the literature, ARIMA is one of the most successful time series models for short-term forecasting; a Neural Network is another successful machine learning approach used for short-term forecasting. Models were developed using ARIMA and a Neural Network to forecast the natural gas consumption requirement on a monthly basis. 30 observations are selected as the in-sample period to capture the existing pattern in the series, and a 3-month-ahead prediction is made. The accuracy measures used for performance checking are MSE, RMSE and MAPE. The results reveal that the performance of the two models is comparable, but ARIMA performs better: the MAPE of ARIMA is 3.61% compared to 4.26% for the Neural Network. Therefore, ARIMA is judged the most suitable technique to predict the monthly consumption of natural gas for CFFP and improve its procurement decisions.

Keywords: Forecasting, ARIMA, Neural Network, Natural Gas
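
A toy version of the comparison, ARIMA versus a lag-feature neural network scored by MAPE; the 30 in-sample points and 3-month horizon follow the abstract, while the series itself is synthetic:

    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(6)
    t = np.arange(33)
    series = 100 + 2 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 3, 33)
    train, test = series[:30], series[30:]          # 30 in-sample, 3-ahead test

    def mape(actual, pred):
        return float(np.mean(np.abs((actual - pred) / actual)) * 100)

    arima_pred = ARIMA(train, order=(1, 1, 1)).fit().forecast(steps=3)

    lags = 3                                        # NN trained on 3 lagged values
    Xtr = np.array([train[i:i + lags] for i in range(len(train) - lags)])
    nn = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000,
                      random_state=0).fit(Xtr, train[lags:])
    window, nn_pred = list(train[-lags:]), []
    for _ in range(3):                              # recursive multi-step forecast
        nxt = float(nn.predict([window[-lags:]])[0])
        nn_pred.append(nxt)
        window.append(nxt)

    print("ARIMA MAPE: %.2f%%  NN MAPE: %.2f%%"
          % (mape(test, arima_pred), mape(test, np.array(nn_pred))))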


ANOMALOUSLY POTENTIAL FRAUDULENT LOGISTICS – A
TANGANYIKAN SUPPLY CHAIN ANALYTICS
Dr Benny J. Godwin
Assistant Professor, Institute of Management
Christ University - Bangalore
e-Mail – bennygodwin@gmail.com

Abstract

Although a supply chain network engenders enormous operational and logistics data, extracting precise fraudulent information from it is never unequivocal. The primary purpose of this research is to detect anomalous and potentially fraudulent transactions during the cotton ginning process. Continuous monitoring systems using substantial analytics often produce tediously drawn-out reports which demand painstaking post-analysis. This paper also examines the antidromic role of third-party logistics in distribution channels and their impact on warehousing processes. Data were obtained from a prominent real-world cotton ginning organization located in Tanganyika/Tanzania, East Africa. During the ginning season under study, the organization had its business operations, the purchase of seed cotton and the export of lint bales, in the following geographical regions: Igunga, Singida, Kishapu, Kwimba, Maswa, Magu, Iramba, Shinyanga, and Dar-es-Salaam. The instances of anomalous activities among legitimate business transactions during the seed cotton purchase and lint bale delivery process are mapped using supply chain analytics - supply chain monitoring techniques and a wavelet-based multi-scale Principal Component Analysis technique. The results provide valuable insights into understanding the anomalous activities in the supply chain process, which collectively contribute to the overall effectiveness of the cotton ginning process. As this paper focuses on fraudulent activities, generalizing the findings to other categories of fraud analytics must be done with caution. This paper is, to the authors' knowledge, the first of its kind published in scholarly journals to elucidate cotton ginning fraud analytics using supply chain analytics.

Keywords: Fraud Analytics, Supply Chain Analytics, and Third-Party Logistics

FUTURE OF PREDICTIVE MAINTENANCE: IOT BASED APPROACH
Milind Khare*
Senior Business Analyst
Tata Consultancy Services, Bangalore, India
milind.khare@tcs.com
Premanand Raju
Manager
Tata Consultancy Services, Bangalore, India
premanand.raju@tcs.com
Dr. Anuj Prakash
Functional Consultant
Tata Consultancy Services, Bangalore, India
anuj.prakash@tcs.com

Abstract
Nowadays, every manufacturing firm is seeking loyal customers, as they are the key to winning the market, but winning loyal customers is not easy in a highly competitive market. To win customer loyalty, firms should not only manufacture quality products but also provide after-sales services efficiently. After-sales services include technical support for the usage of products, supplying consumables, and maintenance activities to uphold high up-time of the product. Maintenance activities are crucial, as a significant share of the world's GDP is spent addressing or fixing breakdowns and failures. Moreover, once a breakdown is reported to the maintenance team, it takes significant time to repair or replace the system. Therefore, it is very necessary for manufacturing firms to maintain equipment in such a way that maintenance starts before any breakdown, to maintain higher uptime and minimize costs. If maintenance is done strictly according to the supplier manuals, there is a chance that useful part life is still left, which may increase the cost of maintenance services. Hence, firms are looking for technologies which can provide signals or alarms about future breakdowns. Emerging technologies like machine-to-machine communication, sensor technologies and cloud-based platforms for delivering maintenance solutions make up the Internet of Things. The Internet of Things (IoT) is a facilitator of a virtual structure for such service computing, and it works by integrating data storage devices, smart monitoring devices, different analytical tools, visualization platforms and delivery to the client.
In this paper, we use sensor-based IoT data to anticipate future failures of any component of a machine. From the IoT data, we want to predict the sensor values at which a component will tend to fail. We therefore deploy a machine learning technique to estimate the time to the next failure of any product on the basis of the current condition of the machine. The prediction also indicates the type of failure, i.e. mechanical failure, basic inspection required, electronic failure, etc. According to the type of failure, the system assigns the available field personnel using a knowledge-base system. A simulated numerical example has been generated to test the proposed approach for IoT-based maintenance. The results show at what time, at which location, and which type of failure will occur, and consequently field personnel are assigned using the knowledge base.

Keywords: Internet of Things, Predictive Maintenance, Survival Analysis, Machine Learning.
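
A simplified sketch of the prediction step, regressing time-to-next-failure on current sensor readings and classifying the failure type; the simulated data below echoes the paper's use of a simulated example but is not the paper's:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(7)
    n = 3000
    sensors = rng.normal(size=(n, 4))  # e.g. temperature, vibration, current, pressure
    ttf = np.exp(2.5 - 0.8 * sensors[:, 1] + rng.normal(0, 0.3, n))  # hours to failure
    ftype = rng.choice(["mechanical", "electronic", "inspection"], n)

    X_tr, X_te, t_tr, t_te, f_tr, f_te = train_test_split(
        sensors, ttf, ftype, random_state=0)

    reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, t_tr)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, f_tr)

    # For a machine streaming fresh sensor values: predict when and what will fail;
    # a knowledge base would then map the failure type to available field personnel.
    reading = X_te[:1]
    print("hours to failure:", reg.predict(reading)[0],
          "| type:", clf.predict(reading)[0])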

CONTROL CHART PATTERN RECOGNITION USING STATISTICAL
AND SHAPE FEATURES
Ajay Kumar
Sr. Business Analyst
Tata Consultancy Services
Bangalore, India
ajay.33@tcs.com
Anurag Gupta
Manager
Tata Consultancy Services
Bangalore, India
gupta.anurag6@tcs.com
Abstract
Statistical process control (SPC) techniques are widely used in the manufacturing industry to continuously monitor process parameters and improve product quality. In SPC, control charts are one of the basic tools for monitoring the mean and range of the various parameters which affect quality. Pattern recognition on control charts can provide deep insights into issues in the manufacturing process: for example, a trend in a particular process parameter could indicate associated tooling wear, whereas a large shift could represent an equipment breakdown. Moreover, for manufacturing processes dealing with large batch sizes, it is very difficult to manually monitor the control charts and identify unnatural patterns visually. It therefore becomes important to use an intelligent system which automatically detects any variations in the output and alerts users accordingly.

This paper presents a new approach to recognizing different types of control chart patterns (CCPs), e.g. normal, cyclic, trends and shifts, together with the confidence of the trend or shift. The approach uses statistical features such as the mean, median, quartiles, standard deviation, regression and residuals, and shape features such as the best-fit line, slope, mean line and intercept difference, to detect variations; a confidence-score-based rule induction then classifies these variations into patterns such as cyclic, upward trend, downward trend, upward shift, downward shift and normal. The confidence score is used to further classify trends and shifts by magnitude, i.e. low, moderate and high. Users can customize the system to generate alerts according to the type of variation in the quality parameters and its magnitude. This approach has been applied in a semiconductor manufacturing process where the data volume is huge due to large batch sizes and a large number of quality control parameters; the algorithm was able to detect patterns in control charts successfully without any manual intervention. The approach does not require a time-consuming training process and is much faster than conventional machine learning algorithms like Artificial Neural Networks (ANNs). The proposed method can be generalized to detect patterns in various types of time series or continuous data streams in other manufacturing functions, such as production, and in other industries, such as medicine and finance.
Keywords: Control Charts, Pattern Recognition, Statistical Process Control, Statistical and Shape Features

26
A NEURO FUZZY MODEL FOR OEE PREDICTION

Dr. S. K. Sudarsanam, Dr. Umasankar V., Chintada Anusha


VIT Business School Chennai
School of Mechanical and Building Sciences
Vellore Institute of Technology, Chennai - 600127, India
Email ID: Sudarsanam.sk@vit.ac.in

Abstract
Overall equipment effectiveness (OEE) is an essential measure for monitoring and measuring the effectiveness of capital equipment. It involves three important factors, namely availability, performance (a measure of efficiency) and quality (a measure of effectiveness). This paper aims to develop a reliable model for OEE in industry, accounting for the ambiguity of data, by using a fuzzy model, a neural network model and a neuro-fuzzy model.
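For reference, OEE itself is simply the product of the three factors; a quick worked example with illustrative numbers:

```python
# Worked example: OEE = Availability x Performance x Quality (illustrative numbers).
def oee(availability, performance, quality):
    return availability * performance * quality

# e.g. machine up 90% of planned time, running at 95% of ideal rate, 98% good parts
print(f"OEE = {oee(0.90, 0.95, 0.98):.1%}")   # -> OEE = 83.8%
```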

Keywords: Overall Equipment Effectiveness, Availability, Performance, Quality, Fuzzy Logic, Neural Network, Neural Fuzzy Design Toolbox, MatLab 8.0, Fuzzy Controller, Rule Viewer, Image Viewer, Rule Editor, Fuzzy Neural Network Model, RStudio, Cluster Analysis.

27
RETAIL

28
STAKEHOLDERS FRAMEWORK FOR ANALYZING RELATIONSHIP
BETWEEN THE SUSTAINABILITY METRICS OF SUPPLY CHAIN.

Rahul Solanki*
Research Scholar
Department of Operational Research,
University of Delhi, Delhi
solanki.rahul1470@gmail.com

Jyoti Dhingra Darbari


Research Scholar
Department of Operational Research,
University of Delhi, Delhi
jydbr@hotmail.com

Vernika Agarwal
Research Scholar
Department of Operational Research,
University of Delhi, Delhi
vernika.agarwal@gmail.com

P.C. Jha
Professor,
Department of Operational Research,
University of Delhi, Delhi
jhapc@yahoo.com

Abstract
Organizations are now compelled to embrace sustainability in their supply chains (SCs) due to strict government regulations and increasing pressure from social organizations. The sustainability-focused supply chain is an extension of the green supply chain, as it considers social criteria along with economic and green criteria. Unfortunately, firms in India are more inclined towards the economic and environmental fronts, with little attention paid to the social aspect. Existing studies suggest that the major impediments to social sustainability (SS) relate to firms' focus on maximizing economic productivity and reducing costs, and to the lack of government rules regarding social injustice and labor laws. Hence, within the domain of the sustainable supply chain, SS is generally perceived as a mere economic burden or, in a few cases, just a compulsion, even though it can positively enable other sustainability initiatives. To disambiguate the notion, this paper brings forth relevant dimensions of SS in terms of various SC factors, which can further stimulate the overall sustainability of the SC.
To achieve the aforementioned objective, the study aims to examine the following:
(i) Which important social aspects can be adopted by any manufacturing firm to be socially responsible towards all its stakeholders?
(ii) How do these social aspects affect the economic and environmental factors of sustainability?
(iii) Which of the dimensions of SS considered in the study are aligned with the prospective SC performance outcomes of the firm?
(iv) How are the identified relevant SS dimensions measured and managed?

29
The methodological framework of the study entails identification of social dimensions through an extensive literature survey. Various SC executives of the firm are consulted and their opinions gathered through semi-structured interviews and a questionnaire. The data collected is then analyzed using Interpretive Structural Modelling (ISM) and MICMAC analysis to extract the social aspects that foster economic sustainability not only for the manufacturing firm, but for the suppliers and customers as well. The responses gathered from the experts aid in classifying the relationship between SS and the eco-environmental performance of the firm as a 'negative relationship', 'no relationship' or 'positive relationship'. The positively related social aspects are further investigated to understand and numerically quantify the degree of influence, using a numerical scale of 1-5. The findings of the integrated multi-criteria method suggest that 'Ethics', 'Health and Safety', 'Labor Rights', 'Wages' and 'Education and Training' can positively lead to job satisfaction, employee retention, trust with stakeholders and a clean and healthy environment, which consequently enhance the financial and environmental performance of the firm. The results also provide a baseline for managers seeking to build a socially responsive supply chain and help them identify pertinent social aspects to focus on for achieving sustainability at all three levels of the supply chain.
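For readers unfamiliar with the ISM step, its core computation is the transitive closure of the expert-elicited adjacency matrix, from which the MICMAC driving and dependence powers are read off; a minimal sketch on a hypothetical four-factor matrix:

```python
# Sketch of the ISM reachability step on a hypothetical 4-factor adjacency matrix
# (1 in cell [i][j] means experts judge factor i to influence factor j).
import numpy as np

A = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1]])

R = A.copy()
for _ in range(len(A)):                 # transitive closure (final reachability matrix)
    R = ((R + R @ A) > 0).astype(int)

print("reachability matrix:\n", R)
print("driving power (row sums):", R.sum(axis=1))     # used in the MICMAC plot
print("dependence power (col sums):", R.sum(axis=0))
```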

Key words: Social Sustainability, Sustainable Supply Chain, ISM, Stakeholders.

30
AN INTEGRATED MANOVA AND ANP APPROACH FOR SUPPLIER
EVALUATION WITH FUZZY DATA
Nidhi Bhayana*
Research Scholar
Department of Operational Research, University of Delhi
Delhi, India
bhayananidhi@yahoo.com

Anshu Gupta
Assistant Professor
School of Business, Public Policy and Social Entrepreneurship, Ambedkar University
Delhi, India
anshu@aud.ac.in

Kanika Gandhi
Post – Doctoral Fellow
School of Engineering Science, University of Skövde
Sweden
gandhi.kanika@gmail.com

P. C. Jha
Professor
Department of Operational Research, University of Delhi
Delhi, India
jhapc@yahoo.com

Abstract

Effective supplier selection is a critical issue for the success of any organization in today's business environment. Efficient sourcing requires building long-term relationships with suppliers while minimizing purchasing risk. Several supplier selection methodologies have been proposed in the literature, most of which select suppliers based on several measures of performance; limited consideration has been given to the selection of the performance measures themselves. This study proposes a three-level filtering approach for supplier selection. The first two stages of the approach result in the selection of appropriate variables for the study, which are then used in the third stage to evaluate the suppliers. First, using the Delphi method, the variables relevant to supplier selection for a particular problem are extracted from the variables identified through an extensive literature survey. To check the significance of the differences between suppliers' performance on the extracted variables, MANOVA is performed. The variables on which suppliers differ significantly are then used for supplier evaluation using a fuzzy ANP approach. The application and validity of the proposed approach are illustrated with a case study.

Keywords: Delphi Method, MANOVA, Fuzzy, ANP, Supplier Selection.

31
DEEP DIVE ANALYSIS OF CUSTOMER EXPERIENCE WITH
MACHINE LEARNING TECHNIQUES

Manvinder Ghumman
Advisor
Dell, Bangalore, India
Manvinder_ghumman@dell.com

Parvathi Patnaik*
Advisor
Dell, Bangalore, India
Parvathi_patnaik@dell.com
Sumathi Subramanian
Advisor, Dell
Bangalore, India
ssumcy@gmail.com

Service and care for high technology products is an integrated journey from start to finish, with multiple touch points. Even the most incidental transaction has a lasting impact on the most powerful of brands: it shapes customers' purchase and expansion decisions and influences other customers' future buying patterns as well. In this small, connected world, further augmented by social media, word of mouth spreads across the globe instantaneously, while competition waits to provide equal or better service in a highly commoditized high-technology product space.
Customer retention is practical and worthwhile in the long run compared to acquiring new customers. The need of the hour is to keep tabs on the customer experience throughout the product purchase life cycle, even before a buying decision is made. And once a purchase is made, the customer experience needs to be carefully monitored to ensure a smooth experience. As the customer base of any mature organization increases, exceptional cases arise where the customer writes directly to the executives, which is in a way better than customers voicing their opinion elsewhere. Once such a communication is received, an escalation management team works to resolve it, much like a healthcare emergency unit. This whole process is very reactive.
The aim of this paper is to proactively identify potential customer escalations before they reach the executives. We do this by systematically studying the history of all customer touch points with nearly eighty metrics, from sales to order management to delivery, across care, tech support and online support, right from the moment the customer contacts the organization. For the entire analysis, we considered only those data points where the customers bought directly from the organization; data points where customers bought from retail outlets are excluded. This study is for the individual customer segment and not for large commercial customers, where the strategies are different. We study the history of the most recent fourteen days and the past one year. We used SQL and Excel for data preparation

32
and R/JMP for statistical models. We have further used conditional inference trees and
Kohonen algorithms for profiling the customer experience across the cycle.

With these machine learning models we arrive at an aggregate "customer experience score" across customer touch points. We further study the patterns in the scores with univariate, bivariate and detailed exploratory analysis. With our analysis, the escalation team is able to identify customers by risk score, together with a profiling reason that identifies the topmost escalation drivers. By proactively reaching out to customers based on these scores, a mutually beneficial customer relationship is maintained.
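A Kohonen-style profiling step can be sketched with the third-party minisom package (the study itself used R/JMP); the touch-point metrics here are synthetic stand-ins.

```python
# Sketch of Kohonen-style profiling with the third-party `minisom` package
# (pip install minisom). Synthetic stand-in for the ~80 touch-point metrics.
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 8))            # 300 customers x 8 illustrative metrics

som = MiniSom(5, 5, X.shape[1], sigma=1.0, learning_rate=0.5, random_seed=3)
som.train_random(X, 1000)                # fit the 5x5 map

# Each customer lands on a map cell; cells act as experience profiles.
cells = [som.winner(x) for x in X]
print("first five customer profiles:", cells[:5])
```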

Keywords: Random Forests, Kohonen algorithms, Decision Trees, Customer Care, R statistical models

33
A CASE STUDY ON PREDICTION OF VISUAL INVENTORY USING
BRAND, STORE AND CITY PROFILE VARIABLES

Aashna Sultania
PGDM Final Year,
St. Joseph’s Institute of Management,
Bangalore – 560025
E-mail: aashna.0193@gmail
Vivek Sunkara
PGDM Final Year,
St. Joseph’s Institute of Management,
Bangalore – 560025
E-mail: vickysunkari@gmail.com
Prima
PGDM Final Year,
St. Joseph’s Institute of Management,
Bangalore – 560025
E-mail: prima_rosie@yahoo.com
Prof. Rajendra Desai
Faculty
St. Joseph’s Institute of Management,
Bangalore - 560025
E-mail: raju.ashu@gmail.com
Dr. Avil Saldanha
Faculty
St. Joseph’s Institute of Management,
Bangalore – 560025
E-mail: avilsaldanha@gmail.com
Aman Bajaj
Insights Team, HRG, Bangalore
E-mail: amanbajaj853@gmail.com
Roland Nonis
Insights Team, HRG, Bangalore
E-mail: nonisroland3@gmail.com

Abstract

We attempt to create an analytics model to predict Share of Visual Inventory (SOVI) at the soft-drink category level, utilizing store-, city- and country-level data collected over 18 months (2016, 2017) by a soft drinks firm headquartered in the US. All the data was from stores in countries near the US. Most of the variables in the dataset were categorical and consisted of store profile (city, country, channel, sub-channel), brand, packaging size, category,

34
display cooler ownership and type. Our attempt is to use the above variables and some derived
variables to detect a pattern in the brand, store and display data to predict SOVI at a category
of drinks level.
At a brand level this data was too fragmented, with very low individual SOVI numbers, to be of much use. Besides, business use of the prediction is more relevant at a category level. A potential business use of our model is in the planning and estimation of stocking for new territories based on store, brand and display parameters: a firm can estimate demand at an aggregate level and utilize the model for planning stocking based on replenishment policies. The model would also be useful in detecting anomalies in current SOVI patterns for more detailed investigation; stores exhibiting drastically lower SOVI than the model predicts can be explored to understand the underlying reasons.

We categorized SOVI numbers into three classes: below 5%, 5-10% and above 10%. We created classification models using Naïve Bayes, Logistic Regression and Random Forest algorithms. The initial results are encouraging, with model accuracy of up to 70%, confirming the existence of a pattern in SOVI at the soft-drink category level.
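A minimal sketch of the modelling step follows; the categorical data and column names are illustrative stand-ins, since the original dataset is proprietary.

```python
# Sketch: 3-class SOVI classification from categorical store/brand features.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.DataFrame({
    "channel":   ["modern", "traditional", "modern", "kiosk"] * 50,
    "pack_size": ["small", "large", "medium", "small"] * 50,
    "cooler":    ["own", "shared", "own", "none"] * 50,
    "sovi_band": ["<5%", "5-10%", ">10%", "<5%"] * 50,
})

X = pd.get_dummies(df.drop(columns="sovi_band"))   # one-hot encode categoricals
y = df["sovi_band"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```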

Keywords: SOVI, KPI, Naïve Bayes, Logistic Regression, Inventory

35
CAPTURING DEMAND TRANSFERENCE IN RETAIL - A
STATISTICAL APPROACH
Omker Mahalanobish
Statistical Analyst, Walmart Labs, Bengaluru, India
omker.mahalanobish@walmart.com
Souraj Mishra
Statistical Analyst, Walmart Labs, Bengaluru, India
souraj.mishra@walmart.com
Amlan Das
Statistical Analyst, Walmart Labs, Bengaluru, India
amlan.das@walmart.com
Subhasish Misra*
Associate Data Scientist, Walmart Labs, Bengaluru, India
subhasish.misra@walmart.com

Abstract
Background:

While an item substitution measure provides the direction, demand transference quantifies the magnitude of demand that may get transferred to an item (a) when its substitute is deleted, or (b) when it is introduced in a store and cannibalizes similar items.

This, hence, is an important input into assortment optimization. If an item is predicted to exhibit a good extent of transference, then we may be more certain about deleting it (provided it is a below-average performer in terms of sales). Conversely, we should be careful about deleting a very incremental item (one with low demand transference), since we would lose the bulk of its demand.

Note that transference is not explicitly observed; it is latent. Our methodology explains how we capture it.

Method:

Data: POS, promotions & item attribute data is harnessed for this process.

Modeling:

• Regression models (in a longitudinal setup) are used to estimate demand for an item –
among other explanatory variables we have one that accounts for cannibalization effect
of similar items.
• The cannibalization term uses the attribute data to calculate item similarity. Its value
changes depending on presence/absence of similar items and is the instrument through
which demand transference seeps into this model.
• The modeling process is designed to automatically take care of complications such as
multicollinearity and sundry regression violations.
• Since each store is unique in terms of the consumer demand pattern these models have
been estimated at a store x substitutable community level.

36
• This means that for a category with 10+ substitutable communities across 4,000+ stores, we are estimating 40,000+ models, using parallelization techniques in Hadoop.
In conclusion, these models predict the extent of transference (i.e. if an item "i1" in the pre-delete scenario was selling 100 units, what amount of its demand would get transferred to its substitutes, say, "i2", "i3", "i4"), at an individual store level as well as for the overall US.

Expected outcome:

The methodology has been successfully tested for multiple food and consumables categories, as well as general merchandising categories, in the US; efforts are on towards making this one of the standard processes for estimating demand transference. The entire process, despite involving sophisticated modeling, has been scaled (across all stores), automated and productized in an easy-to-use manner for the business user.
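The demand model can be sketched as a log-linear regression whose cannibalization regressor is a similarity-weighted measure of substitute presence; the data and coefficients below are synthetic illustrations, not the production model.

```python
# Sketch of the demand model with a cannibalization regressor:
# cannib_t = sum over similar items j of similarity(i, j) * available_j(t).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 120                                   # weeks of POS data for one item/store
price = rng.uniform(0.8, 1.2, n)
promo = rng.integers(0, 2, n)
cannib = rng.uniform(0.0, 3.0, n)         # similarity-weighted substitute presence

log_units = 4.0 - 1.5 * np.log(price) + 0.4 * promo - 0.25 * cannib \
            + rng.normal(scale=0.1, size=n)

X = sm.add_constant(np.column_stack([np.log(price), promo, cannib]))
fit = sm.OLS(log_units, X).fit()
print(fit.params)   # the cannib coefficient drives the transference estimate
```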

Keywords: Regression, Cannibalization, Retail, Parallelization, Forecasting

37
IDENTIFYING ITEM SUBSTITUTES- A SCALABLE, MACHINE
LEARNING BASED APPROACH

Subhasish Misra
Associate Data Scientist @WalmartLabs Bengaluru, India
subhasish.misra@walmart.com
Arunita Das
Senior Statistical Analyst @WalmartLabs Bengaluru, India
arunita.das@walmart.com
Amlan Jyoti Das
Statistical Analyst @WalmartLabs Bengaluru, India
amlan.das@walmart.com
Bodhisattwa Prasad Majumder
Statistical Analyst @WalmartLabs Bengaluru, India
bodhisattwa.majumdar@walmart.com

Abstract
Background: Shelf space in a brick-and-mortar store is constrained, necessitating selection of only the most optimal items. One way to do this is to delist items with low sales; but if such an item has no substitute, then we may lose the customers who come exclusively for it. It is imperative, then, to identify substitute (and non-substitute) item pairs for a retailer.

Within substitutes, consumer behaviour quirks result in two classes of substitutes:

• Traditional substitutes: satisfy the same consumer need state (e.g. Diet Pepsi v/s Diet Coke in the carbonated soft drink category).
• Variety substitutes: satisfy the variety-seeking tendencies of households, e.g. buying a Pepsi (cola) and also a Fanta (orangeade) in the same transaction.

Differentiating between traditional and variety substitutes is important. Variety substitutes aid basket-building behaviour and hence bring in more sales. Ideally, then, an item can be deleted only if it has poor sales performance, has traditional substitutes and is not a variety substitute to many items. On the other hand, we would be cautious about removing an item with poor performance and few or no traditional substitutes but which is a strong variety substitute (to some or many items).

Aim: Our goal was to help optimize assortment by classifying any given item pair (for a given category) into one of three mutually exclusive and exhaustive classes: non-substitute, traditional substitute or variety substitute.

Method & expected findings: This involved careful scaling-up considerations (across categories), as well as creativity in data analysis. We had manually tagged data for one category; however, given the scaling-up aspect, we refrained from using a supervised model (tagged data for another category may not be made available, since tagging by an expert is time consuming). We adopted an unsupervised approach instead. The broad steps followed were:

Feature engineering: Used to transform transaction data, item attributes, price &

38
demographics to build a holistic profile of an item pair. We extensively used association rule mining here.

Exploratory analysis: Done to understand which features differentiate between the classes. Also used variable reduction techniques to de-noise the data.

Modelling: We used clustering to split all given item pairs into two categories, then profiled the clusters (based on the EDA results) to identify which is the substitute and which the non-substitute cluster. We then clustered again within the pairs identified as substitutes to get two groups and identified, as earlier, which is the variety cluster and which the traditional one. Finally, we scored each item pair on the strength of being a traditional / variety substitute.

Validation: Checked accuracy (for the category with tagged data), compared against classification- and heuristic-based models and established superiority over them. Consulted and validated with category experts for the other categories.

Business Outcome: Currently the output from this methodology is being used in conjunction with loyalty and performance measures to optimize assortment decisions. The pilot has already been run for more than 30 categories from the food and consumables section, and the insights from this process have been well received by the business.
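The two-stage clustering can be sketched as follows; the item-pair features (attribute similarity, co-purchase lift, price gap) are hypothetical stand-ins for the engineered features described above.

```python
# Sketch of the two-stage clustering: pairs -> substitute vs non-substitute,
# then substitutes -> traditional vs variety. Features are illustrative.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
# columns: attribute similarity, co-purchase lift, price gap (per item pair)
pairs = rng.uniform(size=(1000, 3))

stage1 = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pairs)
# Profile clusters: call the one with higher mean attribute similarity "substitute".
sub_label = int(np.argmax([pairs[stage1.labels_ == k, 0].mean() for k in (0, 1)]))
subs = pairs[stage1.labels_ == sub_label]

stage2 = KMeans(n_clusters=2, n_init=10, random_state=0).fit(subs)
# Profile again: higher co-purchase lift suggests the "variety" cluster.
variety = int(np.argmax([subs[stage2.labels_ == k, 1].mean() for k in (0, 1)]))
print("substitute pairs:", len(subs), "| variety cluster id:", variety)
```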

Keywords: Retail, Assortment, Substitutes, clustering, scoring

39
IMPACT OF SOCIAL CONTEXT ON ONLINE SHOPPING IN THE
SELECT ASIAN COUNTRIES
(INDIA, MALAYSIA, SINGAPORE, UZBEKISTAN AND THAILAND)

Mr. R. N. Balamurugan*, Dr. D. Jublee, Mr. M. Sathish and Mr. Abilash

Abstract

Growth of the Internet and technology has helped marketers focus on electronic commerce. Many factors influence online shopping behaviour; this paper deals with how social context, such as family, friends, reviews, external factors and interpersonal factors, impacts planned and unplanned purchases in online shopping. The descriptive study was conducted using a structured survey questionnaire; data was collected both online and offline, with snowball sampling. The Wald-Wolfowitz runs test was conducted to ensure randomness. The required sample size for the structural equation model, F-tests and t-tests was determined and justified using G*Power analysis. Online consumer shopping behaviour arises from attitude, trust, shopping enjoyment and shopping experience, which are directly influenced by social context. This study found that Indian, Singaporean and Malaysian students are careful before making a purchase: they think twice before going for it, keep a shopping list, are particular about e-stores, and will not purchase without a plan. However, in Singapore and Uzbekistan, unplanned purchases happen to a large extent due to high spending patterns and impulsive purchasing. Since the internet and e-stores are new to Uzbekistan, the younger generation is tempted to browse online and get suggestions and reviews from friends.
Key words: Social context, Planned Purchase, Unplanned Purchase, Online shopping

40
A STUDY ON THE IMPACT OF SOCIAL MEDIA, SECURITY RISKS &
REPUTATION OF THE E-RETAILER ON BUYING INTENTIONS OF
THE YOUTH THROUGH TRUST IN ONLINE BUYING: A
STRUCTURAL EQUATION MODELING APPROACH

Dr. Vinay Kumar


Associate Professor
Thakur Institute of Management Studies & Research, Mumbai
Mail: dr.vinaykpune@gmail.com

Abstract

The purpose of the study is to explore the factors influencing customer buying intention in Internet shopping. Several factors, such as security, the firm's reputation, privacy and trust, that influence customer intention to purchase from e-commerce sites were analyzed. Factors such as these, which are commonly considered to influence purchasing intentions in online shopping in other countries, were hypothesized to hold in the case of Maharashtra, India. A random sample of 287 Maharashtrian people who had bought goods/services through e-commerce sites at least once was collected via online questionnaires. To test the hypotheses, the data were examined using Structural Equation Modeling (SEM), which is basically a combination of Confirmatory Factor Analysis (CFA) and linear regression (path analysis). The results suggest that privacy, security, the firm's reputation and trust affect online purchase intention significantly; close attention needs to be paid to these factors to increase online sales. The most significant influence comes from trust. Maharashtrian people still lack trust in online commerce, so it is very important to gain customer trust to increase sales. E-commerce business owners are encouraged to develop sites that meet the expectations of potential customers, provide privacy, offer an environment of secure online transactions and increase the reputation of the vendor. This paper outlines the key factors influencing online shopping intention in Maharashtra and pioneers the building of an integrated research framework to understand how consumers make purchase decisions in online shopping, a relatively new way of shopping in the country.
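Since the abstract describes SEM as CFA combined with path analysis, the structural half can be sketched as a pair of regressions on factor scores; the data below is synthetic and the coefficients are illustrative.

```python
# Sketch of the path-analysis half of SEM as two regressions on factor scores.
# Data is synthetic; in the study, scores come from the CFA measurement model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 287
security, privacy, reputation = rng.normal(size=(3, n))
trust = 0.4 * security + 0.3 * privacy + 0.2 * reputation + rng.normal(0, 0.5, n)
intention = 0.6 * trust + 0.2 * security + rng.normal(0, 0.5, n)

# Path 1: antecedents -> trust
m1 = sm.OLS(trust, sm.add_constant(np.column_stack([security, privacy, reputation]))).fit()
# Path 2: trust (plus security) -> purchase intention
m2 = sm.OLS(intention, sm.add_constant(np.column_stack([trust, security]))).fit()
print(m1.params, m2.params, sep="\n")
```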

Keywords: E-commerce; Online purchasing intention; Structural Equations Modeling; Regression; Confirmatory Factor Analysis

41
INTELLIGENT CATEGORIZATION OF PRODUCT
RECOMMENDATIONS FOR ENHANCED CUSTOMER EXPERIENCE

Ladle Patel*
Data Scientist, Analytics, Genpact
Bangalore, India
ladle.patel@genpact.com
Anupriya Beniwal
Jr. Data Scientist, Analytics, Genpact
Bangalore, India
anupriya.beniwal@genpact.com
Chirag Jain
Sr. Data Scientist, Analytics, Genpact
Bangalore, India
chirag.jain4@genpact.com

Abstract

The last five years have witnessed data science making quantum leaps into a multitude of domains spanning healthcare, customer analytics, education, manufacturing, and so forth. Today, a personalized product recommendation is considered an integral part of the customer experience journey. There are several data filtering tools that make use of algorithms and data to recommend the most relevant items to customers; content-based systems, collaborative filtering, hybrid recommenders and association rules are the most common algorithms used for this purpose. However, there are practical challenges with these algorithms: either consumers ignore their recommendations, or the sales team sees no value in them owing to familiarity with the customer's requirements and preferences from past experience. This paper articulates an approach to overcome these shortcomings by intelligently categorizing the recommendations generated by existing algorithms. In this research, we present intelligent categorization of recommendations into three types of opportunities, viz. 'Default', 'Linked' and 'Hidden'. 'Default' opportunities are generic recommendations that are independent of a customer's past purchases. 'Linked' opportunities are obvious recommendations that are easy to identify from past experience of the domain. 'Hidden' opportunities go beyond the 'Default' and 'Linked' opportunities and may not be known even to the sales team. To make more accurate personalized recommendations, the pipeline consists of an ensemble of hybrid recommendation and association rule algorithms, followed by a categorization module that additionally classifies the recommendations. The overall process significantly enhances the customer experience, as the sales team uses the categorization to strategize their communication with clients.
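One plausible reading of the categorization module, sketched with hypothetical signals and thresholds (the paper does not publish its exact rules):

```python
# Hypothetical sketch of the Default/Linked/Hidden categorization step.
# Signals and thresholds are illustrative assumptions, not the paper's rules.
def categorize(rec, global_popularity, assoc_confidence):
    """Classify one recommendation produced by the upstream ensemble."""
    if global_popularity > 0.8:          # recommended to almost everyone
        return "Default"
    if assoc_confidence > 0.6:           # strongly implied by past purchases
        return "Linked"
    return "Hidden"                      # personalized and non-obvious

print(categorize("printer ink", global_popularity=0.9, assoc_confidence=0.2))  # Default
print(categorize("toner",       global_popularity=0.3, assoc_confidence=0.7))  # Linked
print(categorize("3D filament", global_popularity=0.1, assoc_confidence=0.2))  # Hidden
```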

Keywords: Product Recommendation, Categorization, Default, Linked, Hidden

42
CS360 – CLOUD SECURE 360 THROUGH BEHAVIORAL ANALYSIS
(EXTENDED 360 CLOUD SECURITY USING DEEP LEARNING FOR BEHAVIORAL
ANALYSIS)

Thiruchendhil Arasu
IT Director, Dell Innovation Technology
QE Performance Engineering DELL- Bangalore

Dr. E. George Dharma Prakash Raj


School of Computer Science and Engineering
Bharathidasan University

B. Prashanth
Performance Engineer
DELL - Bangalore
Trichy

Abstract
Cloud computing provides on-demand access to affordable hardware and software platforms. Application services hosted on single or multiple cloud provider platforms have diverse characteristics that require extensive security mechanisms to help control the quality of service. This paper discusses cloud security during the peak load of an e-commerce website deployed on the cloud for the holiday sales in the United States. The website is expected to receive ~1.8M (million) page views during the peak hour of the holiday sales. It becomes critical to ensure that the hits coming to the page are real customer hits and not due to Distributed Denial of Service (DDoS) attacks. Hence, a security mechanism and its implementation are key to making the sites reliable and scalable, supporting maximum revenue and the best customer experience during the holiday sales, thereby improving the brand image of the company. The model this paper proposes aims to predict the magnitude of the DDoS attack that can occur on a DHCP server when the DHCP server is servicing customers with different patterns of behavior. Behaviors are defined for individuals based on their online activity, depending on what kinds of websites they visit. Scores are generated for customers based on the websites they visit, and in turn scores are generated for DHCP servers based on the scores of the customers that they service.
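The two-level scoring can be sketched directly from that description; the site-category risk weights below are illustrative assumptions.

```python
# Sketch: customer scores from visited-site categories, then DHCP server scores
# as aggregates of the customers they service. Weights are illustrative.
SITE_RISK = {"shopping": 0.1, "news": 0.1, "gaming": 0.3, "darknet": 0.9}

def customer_score(visits):
    """Mean risk of the site categories a customer visits."""
    return sum(SITE_RISK.get(v, 0.5) for v in visits) / len(visits)

customers = {
    "c1": ["shopping", "news"],
    "c2": ["gaming", "darknet", "darknet"],
}
dhcp_servers = {"dhcp-a": ["c1"], "dhcp-b": ["c1", "c2"]}

scores = {c: customer_score(v) for c, v in customers.items()}
for server, served in dhcp_servers.items():
    print(server, round(sum(scores[c] for c in served) / len(served), 2))
```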

Keywords: Cloud Computing; Data Privacy; Data Protection; Security; Virtualization; Monitoring; Deep Learning; Predictive Analytics.

43
EXPLORATION OF CUSTOMER ONLINE SHOPPING EXPERIENCE
TYPES AND THEIR EFFECTS ON CUSTOMER SATISFACTION

Aditya Shankar Mishra


Associate Professor
IBS Hyderabad (I.F.H.E. University), Telangana, India
adityamishra@ibsindia.org

Pankaj Kumar Mohanty


Research Scholar
IBS Hyderabad (I.F.H.E. University), Telangana, India
pankaj.mohanty@ibsindia.org

ABSTRACT

Customer experience in online shopping involves both direct and indirect interaction between the customer and the online platforms across multiple touch points along the customer journey (pre-purchase, purchase and post-purchase). A customer's online shopping experience is not confined to the particular product usage experience, but also includes the many other customer-brand experiences that happen during the customer journey (e.g., interaction with service personnel, the website interface, delivery time, product reviews, advertisements and so on). Hence, providing a high-quality customer experience on an online shopping platform is considered a primary driver of customer satisfaction. Research shows that customer experiences created by online shopping platforms have an impact on customers' cognitive as well as affective responses. Specifically, the literature suggests that customer satisfaction is created through a positive customer experience. However, empirical investigation of the relative impact of different types of online shopping experience on customer satisfaction is limited. Therefore, in the present study, we investigated the relative impact of different types of online customer experience on customer satisfaction in the context of online shopping. First, with the help of a thorough literature review, we identified the initial pool of items for the online customer experience. Second, a series of FGDs (focus group discussions) and personal interviews were conducted with academic experts and actual online consumers to arrive at the final pool of survey items. Third, principal component analysis was used to identify the major components of online shopping experience. To check the reliability and validity of the shopping experience scale, we employed a confirmatory factor analysis procedure to confirm the factor structure. Finally, the study used multiple regression analysis to measure the differential impacts of different shopping experiences on customer satisfaction.
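Steps three and five of that pipeline can be sketched as follows, on synthetic survey responses with placeholder item names.

```python
# Sketch: PCA on survey items, then regression of satisfaction on components.
# Synthetic 5-point-scale responses; item names are placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
items = rng.integers(1, 6, size=(200, 12)).astype(float)   # 200 respondents x 12 items
satisfaction = items.mean(axis=1) + rng.normal(0, 0.3, 200)

pca = PCA(n_components=3)                 # extract major experience components
components = pca.fit_transform(items)
print("explained variance:", pca.explained_variance_ratio_.round(2))

reg = LinearRegression().fit(components, satisfaction)
print("component effects on satisfaction:", reg.coef_.round(3))
```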
Keywords: Customer Experience, Satisfaction, Online Shopping, Shopping Platform,
Customer Touch Point

44
USER REVIEWS AGGREGATION AND SUMMARIZATION AGENT
TO AID E-COMMERCE CONSUMERS

(Research in Progress)
Vimal Kumar M
Research Scholar
Indian Institute of Management Tiruchirappalli, India
Vimalkumar.f15005@iimtrichy.ac.in
Venakteswara Rao B*
Technical Lead
TurningPoint Software Solutions Pvt. Ltd, Bengaluru, India
Venkateswara.battumarthi@tpgsi.com

Abstract

The era of digital and social media is flooded with information in a variety of digital forms, and at times e-commerce consumers feel overloaded with the information. Such an environment can be expected to reduce the quality and efficiency of consumers' purchase decision making. Decision support systems are sets of tools that assist the consumer in making better decisions by analysing options and recommending decisions; online recommender systems can be considered one such tool. A recent line of discussion on such decision support mechanisms ventures into utilizing the collective intelligence of consumers through their online reviews, and there exists a body of literature emphasizing the importance of, and mechanisms for, processing user reviews. One such mechanism is review summarization. However, most of the discussion on review summarization considers reviews available from a single site, which can be a limitation: different e-commerce sites, catering to different segments of users, generate different types of reviews. For instance, the reviews available on Amazon.com, catering to US consumers, may not be available on Flipkart.com, catering to Indian consumers. There is also enormous scope for improving existing review summarization mechanisms. Hence, this research attempts to contribute to the existing body of knowledge in two ways: first, we propose to aggregate and summarize user reviews from different e-commerce channels, and second, we aim to provide a novel mechanism to summarize user reviews of products. The paper is structured as follows. First, with the assistance of many existing algorithms for web scraping and natural language processing, the paper proposes an algorithm for summarizing consumer reviews. Then, the study compares the proposed algorithm with existing text summarization approaches and lists its advantages. Further, the paper implements a working prototype of the algorithm and tests it with actual product reviews. Finally, the paper proposes future directions for improving the readability of the summaries using abstractive summarization methods and for testing the efficiency of the aid in improving the quality and effectiveness of consumers' decisions in the online environment.
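Since the proposed algorithm itself is not given in the abstract, the sketch below shows a generic frequency-based extractive summarizer of the kind the paper benchmarks against, pooling reviews from two channels.

```python
# Generic frequency-based extractive summarizer (a baseline sketch,
# not the paper's proposed algorithm). Pools reviews from several channels.
import re
from collections import Counter

def summarize(reviews, n_sentences=2):
    text = " ".join(reviews)
    sentences = re.split(r"(?<=[.!?])\s+", text)
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(words)
    # Score each sentence by the average frequency of its words.
    def score(s):
        toks = re.findall(r"[a-z']+", s.lower())
        return sum(freq[t] for t in toks) / max(len(toks), 1)
    return sorted(sentences, key=score, reverse=True)[:n_sentences]

site_a = ["Battery life is great. Camera is average."]
site_b = ["Great battery, lasts two days! Screen scratches easily."]
print(summarize(site_a + site_b))
```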

Keywords: Review Summarization, e-commerce, User Reviews, Review Aggregation, Natural Language Processing, eWoM
45
RESPONSE MODELLING WITH K-NN AND XNB

Dr Jay B. Simha
CTO and Head, Analytics, Abiba Systems
Advisor and Professor,
REVA Academy for Corporate Excellence (RACE)
REVA University
Bengaluru, India
jay.b.simha@abibasystems.com

Dr Shinu Abhi
Professor and Director
REVA Academy for Corporate Excellence (RACE)
REVA University
Bengaluru, India
shinuabhi@reva.edu.in

Abstract

Response modelling is one of the important predictive modelling techniques used to gain insights into responses or behaviours such as repeat purchases by customers. In this work, a lazy learning method, k-NN, and a generative model, xNB, are used to evaluate a dataset of repeat purchase behaviour for an e-commerce business. An RFM data view of a real-world dataset is used for performance evaluation. Different values of k for k-NN are evaluated for suitability and robustness. Tree-augmented naïve Bayes and BayesNet classifiers are also explored and compared with simple naïve Bayes as the baseline. The results of the experiment are discussed, along with additional work planned for the project.
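A minimal sketch of the k-NN part on an RFM view (synthetic data standing in for the real-world dataset):

```python
# Sketch: k-NN over scaled RFM features for repeat-purchase response.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
rfm = np.column_stack([rng.integers(1, 365, 400),    # recency (days)
                       rng.integers(1, 50, 400),     # frequency
                       rng.uniform(5, 500, 400)])    # monetary
repeat = (rfm[:, 1] > 10).astype(int)                # toy response label

X = StandardScaler().fit_transform(rfm)              # k-NN needs scaled features
for k in (3, 5, 11):
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, repeat, cv=5).mean()
    print(f"k={k}: cv accuracy {acc:.2f}")
```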

Key-Words: Response, k-NN, Bayesian classifier, Scalability, RFM

46
RESPONSE MODELLING USING SUPPORT VECTOR MACHINES &
NAÏVE BAYES

Kanchan Wali
Student, PGDM-Business Analytics, REVA University
Business Analyst, Analytics Edge Pvt Ltd.
Bangalore, India
Kanchanwali.BA01@reva.edu.in
Mutturaj Baradol
Student, PGDM-Business Analytics, REVA University
Senior Engineer, Rakuten India Inc.
Bangalore, India
Mutturajb.BA01@reva.edu.in

Abstract

Predictive analytics has become the crucial differentiating factor for highly competitive online retail companies in creating optimised solutions to present the "right offer, right person, right time" to their customers. Response modelling is one of the important predictive modelling techniques used to gain insights into responses or behaviours such as repeat purchases by customers. It uses data mining techniques to find similarities among respondents in historical data to predict who is likely, or not likely, to respond in the near future.

Our experiment involves supervised learning methods: a Support Vector Machine (SVM) and Naïve Bayes are applied to evaluate a dataset of repeat purchase behaviour for an e-commerce business. We use Naïve Bayes as a baseline model and then extend our work to a more advanced classifier, the SVM, to evaluate predictive accuracy. The support vector machine is a linear classifier that finds a hyperplane separating positive samples from negative samples.

The dataset used for this paper contains transaction data for an online retail shop, from which customer-level RFM (Recency, Frequency and Monetary) data is extracted for analysis. This paper also provides various evaluation measures for response models in terms of predictive accuracy, computational efficiency and misclassification error. The results of the experiments are discussed, enabling the retailer to deploy the right models with superior performance.
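The baseline-versus-SVM comparison can be sketched as follows, with synthetic RFM data standing in for the shop's transactions.

```python
# Sketch: Naive Bayes baseline vs linear SVM on RFM features (synthetic data).
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(9)
rfm = np.column_stack([rng.integers(1, 365, 400),
                       rng.integers(1, 50, 400),
                       rng.uniform(5, 500, 400)])
y = (rfm[:, 0] < 90).astype(int)          # toy label: recent buyers respond

X = StandardScaler().fit_transform(rfm)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, model in [("Naive Bayes", GaussianNB()), ("SVM", SVC(kernel="linear"))]:
    model.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, model.predict(X_te)))
```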

Keywords: Response Modelling, Support Vector Machine, Naïve Bayes, Performance accuracy.

47
SERVICES

48
NOVEL METHODS FOR MONITORING SOCIAL MEDIA
PROPENSITY AND NETWORK ACTIVITIES

Hari Bhaskar Sankaranarayanan


Director, Engineering
Amadeus Software Labs
Bangalore, India
hari.sankaranarayanan@amadeus.com
Veeresh Erched
Director, Engineering
Amadeus Software Labs
Bangalore, India
veeresh.erched@amadeus.com

Abstract

The advent of social media has enabled people to establish personal connections, connect with communities, express their likes and dislikes, and engage in meaningful exchanges and interactions such as sharing, commenting and reacting to news and happenings in life. While the social network effects are highly positive, the network also enables high-risk activities, including group uprisings, the establishment of nefarious forums and misuse by anti-social elements for spreading false rumors, provocation with hidden agendas and inflammatory remarks. This poses challenges for various agencies to contain such activity in time, before it goes out of hand; moreover, it is impractical to monitor everything, everyone and every act on social media, and network behavior introduces complexity in identifying and tracking such activities in real time. In this research paper, we propose an experiment with models using fuzzy logic and social network data mining that aims to understand network behavior through parameters such as the types of pages and interests that have the propensity to attract more followers based on religion, beliefs and type of fan base. The model also assesses propensity based on the quality of posts, measured not merely by reactions and shares but also by engagement: the types of people engaging and their demographics, classified using fuzzy logic methods and association rule mining. We also apply clustering techniques to identify categories of followers and of people reacting to them, based on the topics mined. The paper proposes new metrics, such as a post propensity index, a page network activity index and a follower classification index, to identify anomalies and potential risks and so help track and monitor social media better. Scale is achieved by processing the network activities through a set of indices and measuring them in real time through trend analysis. A case study of a social uprising is discussed with this methodology, to reason about how social media mobilized many thousands of people within a short time frame towards a common cause.
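The proposed indices are not formally defined in the abstract; purely as a hypothetical illustration, a post propensity index might weight raw engagement by the classification of the engaging users:

```python
# Hypothetical illustration of a post propensity index: engagement weighted by
# the classification of the engaging users. Weights and definitions are
# assumptions, not the paper's published metric.
FOLLOWER_WEIGHT = {"casual": 0.2, "engaged": 0.6, "high-risk": 1.0}

def post_propensity_index(reactions, shares, engagers):
    base = reactions + 2 * shares                      # shares amplify reach
    weight = sum(FOLLOWER_WEIGHT[e] for e in engagers) / len(engagers)
    return base * weight

print(post_propensity_index(120, 40, ["casual", "engaged", "high-risk"]))  # 120.0
```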

Keywords: social media, network, profile, big data, analytics

49
PUBLISHER SUBSCRIBER CHURN PREDICTION MODEL USING
LOGISTIC AND MARKOV

Priyanka Sindhwani
Assistant Manager
TimeInc.India, Bengaluru
Priyanka.sindhwani@timeinc.com

Abstract

This paper analyses subscription data for a leading publishing house in the UK to develop a predictive customer churn model. The use of predictive techniques helps in successfully identifying the parameters leading to churn and correctly classifying churners. A further objective covered in this paper is to use an RM customer segmentation technique to classify customers of different brands, and to use a Markov absorption process to calculate the time till absorption and customer lifetime value.
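The Markov absorption computation follows the standard absorbing-chain algebra; the sketch below uses hypothetical transition probabilities between engagement states, with churn as the absorbing state.

```python
# Sketch: absorbing Markov chain for time-to-churn and a simple CLV.
# Transition probabilities and margins are hypothetical, not the publisher's.
import numpy as np

# Transient states: active, at-risk; absorbing state: churned.
Q = np.array([[0.80, 0.15],                # transitions among transient states
              [0.30, 0.50]])
N = np.linalg.inv(np.eye(2) - Q)           # fundamental matrix
time_to_absorption = N @ np.ones(2)        # expected periods until churn
print("expected periods before churn:", time_to_absorption.round(2))

margin = np.array([10.0, 4.0])             # per-period margin per state (assumed)
clv = N @ margin                           # expected lifetime value by start state
print("CLV by starting state:", clv.round(2))
```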
Keywords: Customer retention, Logistic, Markov, CLV

50
USING TEXT MINING AND SENTIMENT ANALYSIS TO STUDY
CORRELATION BETWEEN PAGE ENGAGEMENT AND ARTICLES

Sushrut Tendulkar
Sr. Analyst
TimeInc. India
Bengaluru, Karnataka
Sushrut[dot]Tendulkar[at]timeinc[dot]com

Abstract
Time Inc. is one of the largest media companies, bringing 150+ million users to its websites every month, with more than 1,000 articles published every day. There are multiple ways to drive users to the website, e.g. SEO, recommendation engines on partner websites, social media, etc. But the real challenge lies in getting users engaged on the page: engagement improves loyalty and repeat visits, which in turn are directly correlated with ad revenue. This paper tries to understand the different factors and sentiments that could affect page engagement for each article.
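A minimal sketch of the core analysis, correlating per-article sentiment with an engagement metric; the data, the dwell-time metric and the sentiment scores are synthetic assumptions, since the abstract does not specify the tooling.

```python
# Sketch: correlate article sentiment with page engagement (synthetic data).
import numpy as np
import pandas as pd

rng = np.random.default_rng(10)
df = pd.DataFrame({
    "sentiment": rng.uniform(-1, 1, 200),            # per-article sentiment score
    "word_count": rng.integers(300, 2500, 200),
})
# Toy engagement: longer, more positive articles keep readers slightly longer.
df["dwell_time"] = 60 + 20 * df["sentiment"] + 0.02 * df["word_count"] \
                   + rng.normal(0, 10, 200)

print(df.corr()["dwell_time"].round(2))
```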

51
FEATURE SELECTION IN SPARSE MATRICES

Rahul Kumar
Data Scientist
Media iQ Digital
Bangalore, India
rahulkumar@mediaiqdigital.com
Vatsal Srivastava
Data Scientist
Media iQ Digital
Bangalore, India
vatsal@mediaiqdigital.com
Manish Pathak
Media iQ Digital
Bangalore, India
manishpathak@mediaiqdigital.com

Abstract
Feature selection, as a pre-processing step to machine learning, is effective in reducing
dimensionality, removing irrelevant data, increasing learning accuracy, and improving result
comprehensibility. There are two main approaches for feature selection: wrapper methods, in
which the features are selected using the supervised learning algorithm, and filter methods, in
which the selection of features is independent of any learning algorithm.
However, most of these techniques use feature scoring algorithms that make basic assumptions about the distribution of the data, such as normality, a balanced distribution of classes, or a non-sparse (dense) dataset. Data generated in the real world rarely follows such strict criteria. In some cases, such as digital advertising, the generated data matrix is actually very sparse and follows no distinct distribution. For this reason, we have come up with a new approach to feature selection for cases where the datasets do not follow the above-mentioned assumptions. Our methodology also presents an approach to the problem of skewness of data. The efficiency and effectiveness of our methods are then demonstrated by comparison with other well-known statistical techniques such as ANOVA, mutual information, KL divergence, Fisher score, Bayes' error and Chi-square. The dataset used for validation is a real-world user-browsing-history dataset used for ad-campaign targeting; it has very high dimensionality and is highly sparse as well. Our approach reduces the number of features to a significant degree without compromising the accuracy of the final predictions.
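For context, one of the baseline filters the authors compare against, the chi-square test, runs natively on sparse matrices; a sketch on random sparse data:

```python
# Sketch: chi-square filter on a sparse matrix, one of the baselines compared.
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(11)
X = sparse_random(1000, 5000, density=0.001, random_state=11)  # very sparse
y = rng.integers(0, 2, 1000)                                   # click / no click

selector = SelectKBest(chi2, k=100).fit(X, y)   # keep the 100 best features
print("selected feature indices:", selector.get_support(indices=True)[:10])
```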

Keywords: Feature Selection, Sparse Matrices, Filter Methods.

52
COMPARISON OF PERFORMANCE OF INDIAN AVIATION
SERVICE PROVIDERS USING MULTI-CRITERIA DECISION
MODELS
Mihir Dash*
Professor & Head of Department
Department of Quantitative Methods
School of Business, Alliance University,
Chikkahagade Cross, Anekal Road,
Anekal, Bangalore-562106
mihir@alliance.edu.in, +91 - 9945182465

Abstract

The Indian aviation industry is one of the fastest growing in the world, with passenger revenue growth of 20.5%, followed by China (12.1%) and the Russian Federation (8.8%). Some of the driving factors for the growth and expansion of the Indian
aviation industry include the low-cost carriers, modern airports, foreign direct investments in
domestic airlines, information technology interventions, and a growing emphasis on regional
connectivity. Currently, the Indian civil aviation industry is amongst the top ten in the world,
with a market size of around US$16 billion. According to the FICCI-KPMG report on Indian
aviation, India has a vision of becoming the third largest aviation market by 2020, and the
largest by 2030.
Airlines are classified as low-cost carriers and full-service carriers. Low-cost carriers, also known as discount airlines or low-cost airlines, offer lower fares in exchange for fewer passenger comforts. Low-cost carriers have had a great impact on the aviation industry (Sabre, 2010): their deep market penetration caused conventional carriers to cut flights, close hubs and even abandon service to some cities. Some of the low-cost carriers in the Indian aviation industry include Indigo, Spice Jet, and Go Air. Full-service carriers, on the other hand, provide a wide range of additional comforts, such as entertainment, food, drinks, and so on. There is room for growth in the full-service carrier segment as increasing prosperity leads to demand for quality in-flight services. Some of the full-service carriers in the Indian aviation industry include Jet Airways, Air India, and so on. Some of the new entrants in the industry include Air Asia, Vistara, and Air Costa.

Competitive rivalry is very high in the Indian aviation industry, due to the entry of low-cost carriers and the high operating costs. Also, the industry is regulated more on the supply side than the demand side, so airlines are not free to choose in which markets to operate and which segments to target. The various airlines are competing for the same customers, in terms of price, technology, in-flight entertainment, customer service, and so on.

The objective of the study is to compare the performance of players in the Indian aviation
industry. The performance indicators for aviation service providers considered in the study
include market share, operational efficiency, punctuality, reliability, and customer
satisfaction. The study uses the multi-criteria decision models Technique for Order of
Preference by Similarity to Ideal Solution (TOPSIS) and Analytic Hierarchy Process (AHP) to

53
benchmark performance in the Indian aviation industry. The sample selected for the study comprises the top seven aviation service providers, viz. Indigo Airlines, Jet Airways, Air India, Spice Jet, Go Air, Air Asia, and Vistara.
The results of the analysis would enable the aviation service providers to identify their strengths
and weaknesses, and take appropriate steps to improve their performance.
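A compact TOPSIS computation is sketched below on hypothetical airline scores and weights (all treated as benefit criteria), purely for illustration.

```python
# Sketch of TOPSIS on hypothetical airline scores (rows: airlines; columns:
# market share, punctuality, customer satisfaction; all benefit criteria).
import numpy as np

X = np.array([[40.0, 85.0, 4.1],
              [15.0, 78.0, 3.9],
              [12.0, 70.0, 3.5]])
w = np.array([0.4, 0.3, 0.3])                     # criteria weights (assumed)

V = w * X / np.linalg.norm(X, axis=0)             # weighted normalized matrix
ideal, anti = V.max(axis=0), V.min(axis=0)        # ideal and anti-ideal solutions
d_pos = np.linalg.norm(V - ideal, axis=1)
d_neg = np.linalg.norm(V - anti, axis=1)
closeness = d_neg / (d_pos + d_neg)               # rank airlines by this score
print(closeness.round(3))
```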

Keywords: Indian aviation industry, performance, multi-criteria TOPSIS.

54
CYBER SECURITY: A GRAPH BASED APPROACH

Garima Makkar
Senior Business Analyst,
Tata Consultancy Services,
Bangalore
+91-9811144199; garima.makkar@tcs.com
Malini Jayaraman
Senior Business Analyst,
Tata Consultancy Services,
Bangalore
+91-8197555991; malini.jayaraman@tcs.com
Sonam Sharma
Senior Business Analyst,
Tata Consultancy Services,
Bangalore
+91-8376801196; Sonam.sharma2@tcs.com

Abstract

The increasing usage and connectedness of computer networks has made security over the Internet very important. Every user in the industry faces some type of threat every day, and the interesting part lies in the procedure an organization follows to deal with these day-to-day challenges. Cybercrime, including point-of-sale (POS) intrusions, cyberespionage, etc., is growing in number as well as in sophistication, posing a mounting challenge for large enterprises to move beyond the capabilities of traditional methods such as antivirus software. The conventional methodologies of enterprises are being outstripped by new complications in the relationships among contractors, competitive intelligence sharing and regional distribution. Many enterprises these days resort to different security products, driven by policies, that generate security logs. These records contain information about the activities going on in the network systems and are the first source consulted at the time of an attack. However, the data produced is usually so large in volume that it is difficult to rely on traditional intrusion detection systems. This paper presents our experimentation on big log data to detect known as well as unknown attacks occurring in an organization. For this study, we use data from MACCDC, collected using the Bro software, which captures all network activities. As the underlying structure is one of connected systems, we have taken a graph-based approach: we loaded the data into a graph database as connected components and, using a graph-based clustering approach, identified anomalous system behaviours.
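A sketch of the graph step on toy connection logs follows; networkx stands in for the graph database, and the anomaly rule is an illustrative simplification of the clustering-based detection.

```python
# Sketch: build a connection graph from toy logs, cluster it, flag outliers.
# networkx stands in for the graph database; the anomaly rule is illustrative.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

logs = [("10.0.0.1", "10.0.0.2"), ("10.0.0.2", "10.0.0.3"),
        ("10.0.0.1", "10.0.0.3"), ("10.0.0.9", "10.0.0.2"),
        ("10.0.0.9", "10.0.0.3"), ("10.0.0.9", "10.0.0.1"),
        ("10.0.0.9", "10.0.0.4"), ("10.0.0.9", "10.0.0.5")]

G = nx.Graph(logs)
clusters = list(greedy_modularity_communities(G))
print("clusters:", [sorted(c) for c in clusters])

# Flag hosts whose degree is far above average (possible scanner / DDoS bot).
avg = sum(d for _, d in G.degree()) / G.number_of_nodes()
print("anomalous hosts:", [n for n, d in G.degree() if d > 1.5 * avg])
```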

Keywords: Cyber, Graph Database, Intrusion Detection.

55
A GLOOMY GROWTH OF BPO INDUSTRY: EMPLOYEE
ATTRITION ANALYSIS

Madhuri S*
Senior Business Analyst
Tata Consultancy Services Ltd.
Bangalore, India
madhuri.6@tcs.com
Naga Chaitanya Garimella
Manager
Tata Consultancy Services Ltd.
Bangalore, India
nagachaitanya.garimella@tcs.com
Dr. Anuj Prakash*
Functional Consultant
Tata Consultancy Services Ltd.
Bangalore, India
anuj.prakash@tcs.com

Abstract
To face the challenge of rising costs, most firms are looking at outsourcing various services; it is by now a well understood, tried and tested process for lowering costs. Business process outsourcing leads to a market where the parent company is in one place, the product manufacturer or service provider in another, and the goods or services are delivered at yet other places. The next era will therefore be the era of virtualization, but human resources will remain an essential part of any process. The outsourcing business has a high growth rate, but this growth is being stultified by the high rate of human resource attrition, which creates a manpower crisis for any business process. In this paper, we propose a methodology that scores each employee on the basis of various parameters and predicts the employee's attrition within the next six months. Human resources are an important corporate asset, representing collective expertise, innovation, leadership, and entrepreneurial and managerial skills. Anticipating attrition is important because industry spends heavily on developing employees' skills, motivating them to high levels of performance and maintaining their commitment. We therefore considered a significant number of variables related to employee satisfaction, work-life balance, daily commute to the office, the employee's social, financial and marital status, and professional relationships with co-workers and managers. We estimate the probability of attrition of each individual within the next six months and also identify the critical variables that adversely impact attrition. The proposed approach has been applied to a numerical example and tested; the results are promising and will help managers understand their human assets well in advance.
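A minimal sketch of the scoring step, using synthetic HR features with hypothetical names and a logistic model to produce the attrition probability:

```python
# Sketch: score employees with a six-month attrition probability.
# Features are hypothetical stand-ins for the paper's satisfaction/commute/etc.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(12)
n = 500
satisfaction = rng.uniform(1, 5, n)
commute_km = rng.uniform(1, 40, n)
tenure_yrs = rng.uniform(0, 10, n)
# Toy labels: unhappy employees with long commutes leave more often.
p_leave = 1 / (1 + np.exp(2 * satisfaction - 0.05 * commute_km - 2))
left = (rng.uniform(size=n) < p_leave).astype(int)

X = np.column_stack([satisfaction, commute_km, tenure_yrs])
model = LogisticRegression().fit(X, left)

employee = [[2.0, 35.0, 1.5]]             # unhappy, long commute, short tenure
print("attrition probability:", model.predict_proba(employee)[0, 1].round(2))
```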

Keywords: Business Process Outsourcing, Human Resource, Attrition, Churn Modelling.

56
CONSUMER ADOPTION OF MOBILE PAYMENTS IN INDIA POST
DEMONETISATION
Roshny Unnikrishnan
Assistant Professor
Department of Management studies
PES Institute of technology, Bangalore South Campus
Bangalore, India
roshnyunnikrishnan@gmail.com,roshnyunnikrishnan@pes.edu
Dr Lakshmi Jagannathan
Professor
Department of Management studies
Dayananda Sagar College of engineering
Bangalore, India
lakshmi.quality@gmail.com
Abstract
Consumer adoption of mobile payments is not widely prevalent in India, in spite of the existence of the financial and regulatory infrastructure and the presence of various mobile payment services. Even when the benefits derived in comparison with other cash and non-cash options are high, the rate of adoption of mobile payments is low. It is assumed that consumers have concerns about the level of risk undertaken in mobile-based money transactions, as expressed by one respondent in a survey on mobile payments: "We want our money safer than our selfies". Studies on mobile-based money transactions in India are highly relevant, and the need of the hour, in the post-demonetisation era. Post demonetisation, mobile payments have emerged as the second largest form of cashless transaction by volume in India, second only to debit card transactions. There are various initiatives from the government to promote digital payments and move away from a cash-based economy, as the cost of cash-based transactions is significantly higher than that of electronic payments. The current study aims to evaluate, from the consumer perspective, the existence and extent of significant associations between the predictor constructs (perceived ease of use, perceived usefulness, trust, perceived risk and social influence) and the dependent construct, behavioural intention towards adoption of mobile payments, in the post-demonetisation era. The study also intends to segment consumers based on their degree of behavioural intention towards adopting mobile payments, and to identify consumer segments based on the likelihood of intention to adopt. Multiple regression, cluster analysis and multinomial logistic regression with bootstrapping are proposed to be conducted on 500 responses on a 7-point scale.
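The proposed multinomial step can be sketched as follows on synthetic 7-point-scale responses, with the construct names taken from the study.

```python
# Sketch: multinomial logistic regression of intention segment on the
# predictor constructs (synthetic 7-point-scale data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(13)
n = 500
ease, useful, trust, risk, social = rng.integers(1, 8, size=(5, n)).astype(float)
score = ease + useful + 2 * trust - risk + social
segment = np.digitize(score, np.quantile(score, [0.33, 0.66]))  # low/mid/high intention

X = np.column_stack([ease, useful, trust, risk, social])
model = LogisticRegression(multi_class="multinomial", max_iter=1000).fit(X, segment)
print("class probabilities for one respondent:",
      model.predict_proba([[6, 6, 7, 2, 5]]).round(2))
```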

Keywords: Technology adoption, Mobile payments, Behavioural intention, Multinomial logistic regression

57
I-OPs - THE INTELLIGENT OPERATIONAL MATRIX

Ravikumar Kubusada
Software Engineer, CGI, Bangalore, India
ravikumar.kubusada@cgi.com
Akhilesh Hiremath
Software Engineer, CGI, Bangalore, India
akhilesh.hiremath@cgi.com
Pradeep Kotha
Lead Analyst, CGI, Bangalore, India
pradeep.kotha@cgi.com
Pranathi Rao
Consultant, CGI, Bangalore, India
pranathi.rao@cgi.com

Abstract
Customer Experience is the key differentiator in any industry. Currently it outranks the product
and price as differentiating criteria. Enhancing customer experience is a key factor and a
challenge that every industry is facing today. In current industrial trends, main focus is on
gathering the past data and current activities and trying to conclude with the prediction about
the customer. The need of the hour is to not only achieve Customer growth/retention but also
improving the operational efficiency. In simple words “High Operational Efficiency is the
secret ingredient for the business growth” The main objective of I-OPs is to design a
framework across the service industry. Every industry has its own set of customers and their
requests. The current trend in industry is to assign the requests through a round-robin method.
The main focus of this research is to allocate requests to Client Service Representatives (CSRs)
based on their ranking and priority of the request. Customer Analytics will be the drive to assign
the priority to a customer request. I-OPs will create multiple ranked groups of CSRs based on
their performance using Co-occurrence algorithm. Multi-Criteria Decision making algorithm
would be applied to resolve any conflicts on the priority ranking. This matrix would help faster
service of the requests and also keep backlogs and idle times at bare minimum. The entire DNA
of the matrix allocation landscape would be made available through a rich visualization API
for the COO Community to derive insights for better decision making. The value-add from I-
OPs would be Customer Satisfaction resulting in further business growth
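Purely as a hypothetical illustration of the allocation idea (not the paper's co-occurrence or multi-criteria algorithms), highest-priority requests can be matched to the best-ranked available CSRs with two heaps:

```python
# Hypothetical sketch: highest-priority requests go to the best-ranked
# available CSRs (heap-based matching; not the paper's exact algorithm).
import heapq

csrs = [(-0.92, "csr_alice"), (-0.85, "csr_raj"), (-0.61, "csr_lee")]  # -rank score
requests = [(-3, "billing dispute"), (-1, "address change"), (-2, "refund")]

heapq.heapify(csrs)       # best CSR first (max-heap via negation)
heapq.heapify(requests)   # highest-priority request first

while csrs and requests:
    _, csr = heapq.heappop(csrs)
    _, req = heapq.heappop(requests)
    print(f"{req!r} -> {csr}")
```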

Keywords: Recommender System, Service Analytics, Multi-Criteria Decision Making, Co-Occurrence Matrix, Customer Analytics

58
COLLABORATIVE, CO CREATIVE AND SUSTAINABLE FOOD
DISTRIBUTION SYSTEM BY HARNESSING INTELLIGENT
URBANIZATION.
Dr. Sanjukta Ghosh

Abstract

The rise of urbanization is transforming food systems in many areas, including production on the farm, processing and packaging, distribution and retail, and consumption at the table (Seto and Ramankutty, 2016). The food and agricultural industry is progressively moving towards sustainable production processes with a core consideration of consumer health and safety, due to the frequent occurrence of food-borne diseases. Various researchers have observed that a cultural shift towards local sourcing and short food supply chains has been suggested as a positive economic factor for rural regions (Van Der Loo et al., 2015). Therefore, there is a need for a more transparent, informative and efficient supply chain system, developed through collaboration and co-creation, to address these critical and crucial social issues in the food distribution network. This study aimed to develop a framework for a collaborative and co-creative food distribution system by harnessing intelligent urbanization. The study was designed through a unique combination of qualitative and quantitative techniques: items were generated through open coding under grounded theory and validated through Confirmatory Factor Analysis. The framework will be beneficial for organic food producers in developing a digital platform, which will lead to an effective and innovative supply chain. This study will also help service and experience designers to explore this opportunity and create a platform for small-scale farmers, contributing to their livelihood development.
Key words: Organic Food, Supply Chain, Scale Development, Grounded Theory,
Affinity Diagramming.

59
ANTECEDENTS OF ENTREPRENEURIAL ORIENTATION AND
THEIR ROLE IN FOSTERING ENTREPRENEURIAL INTENTIONS
AMONG UNIVERSITY GRADUATES: AN SEM APPROACH

Swagatika Sahoo*
Doctoral Scholar
School of Management
National Institute of Technology Rourkela,
Odisha, India, 769008
asr.swagatika@gmail.com
Dr. Rajeev Kumar Panda
Assistant Professor
School of Management
National Institute of Technology Rourkela,
Odisha, India, 769008
rkpanda@nitrkl.ac.in

Abstract
Entrepreneurial intentions, being central to the entrepreneurial process, have been a focal point
of discussion for academicians, researchers and policymakers across the globe. This scenario
has led to a steep rise in policies surrounding entrepreneurship development through technology-
based startup creation in India and other developing economies. However, these
policies cannot achieve their desired goals and global benchmarks due to a lack of understanding
of the critical factors and appropriate directions for enactment.

Entrepreneurial research has identified a variety of psycho-personal, demographic and
environmental factors affecting entrepreneurial intentions of young individuals across diverse
socio-political and economic contexts. Yet, the available evidence proposes that intention to
choose an entrepreneurial career eventually develops from the perception of entrepreneurial
support or barriers from the surrounding environment. Such factors in the entrepreneurial
environment do not impact an individual’s entrepreneurial intentions in isolation; rather they
steer his/her entrepreneurial orientation by fostering the favourable or unfavourable
evaluation of the entrepreneurial career choice.

Driven by the above motivation, the present study explores the key environmental
antecedents affecting the entrepreneurial orientation of technical university graduates. The
proposed hypothesised framework suggests entrepreneurial orientation as the determinant of
entrepreneurial intentions which is shaped by the entrepreneurial environment of young
engineering students in a university setup. The environmental antecedents chosen from the
extant literature are: access to finance, access to business information, social networks and
university support, whereas the perceptual driver affecting students’ entrepreneurial
intention is entrepreneurial orientation. Primary data was collected through a survey of 516
final year undergraduate engineering students across two Centrally Funded Technical
Institutions (CFTIs) in India. The reliability and validity measures of the constructs are
verified through Confirmatory Factor Analysis (CFA), and the proposed hypotheses are
validated using Structural Equation Modeling (SEM).

The results indicate that the environmental antecedents (access to finance, access to business
information, social networks and university support) have significant positive influence on
students’ entrepreneurial orientation, which in turn has a significant positive relationship with
entrepreneurial intentions. Thus, the results of this study suggest that the barriers and support
received from the government and university context significantly impact entrepreneurial
orientation, which in turn affects entrepreneurial intentions of young aspiring entrepreneurs.

This study may enable academicians, researchers, and policymakers to frame policies for
building and promoting a holistic entrepreneurial environment to foster entrepreneurial
intentions of young individuals in India and similar developing-country contexts. The study
can help university administrators and policymakers develop strategies
and effective policies that provide the entrepreneurial support essential for
fostering the entrepreneurial orientation of university students towards entrepreneurship,
thereby assisting them to achieve their career goals and the broader objective of nation-building.

Keywords: Environmental antecedents, Entrepreneurial orientation, Entrepreneurial intentions, University support, Technical university students

61
AHA

Vigneshwaran S R *
Research Scholar
Indian Institute of Science
Bangalore
vigneshwaransr89@gmail.com
Vinodhini R
Student
Indian Institute of Management
Bangalore
Amrutha B
Research Scholar
Indian Institute of Science
Bangalore

Abstract

“Hunger and food insecurity are very complex phenomena affecting people and countries in
different ways,” says Melgar-Quiñonez. There is no single internationally agreed
standard with easily measurable indicators to estimate the number of people who are
undernourished and face food insecurity. Traditionally, to determine whether a
household is food insecure, a survey is conducted to understand how many times the
household had food in a week and what they had. In this research we attempt a
novel approach: measuring the pressure applied on the plate as an indicator to classify
whether a person is hungry or not. The data was captured from students of IISc during
breakfast, lunch and dinner. To carry out this research, a new pressure sensor device that records
the changes in resistance when a force or pressure is applied on the plate was designed. The
readings from this device are then analyzed to diagnose hunger using various machine learning
classification algorithms. This research overcomes the inadequacy of qualitative methods
when it comes to identifying the hunger level of an individual. The proposed approach can be
used as a standard measure across the globe, since the accuracy levels achieved are much higher
than expected.
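
As an illustration of the classification step, the following sketch extracts simple summary features from synthetic pressure traces and cross-validates a random forest; the traces, feature set and labels are stand-in assumptions, not the study’s sensor recordings.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def features(trace):
    # simple descriptors of a pressure time series recorded during a meal
    return [trace.mean(), trace.std(), trace.max(),
            np.abs(np.diff(trace)).sum()]  # total variation ~ eating activity

# synthetic traces: assume "hungry" eaters generate more pressure activity
hungry = [rng.random(120) * 5.0 for _ in range(40)]
not_hungry = [rng.random(120) * 1.5 for _ in range(40)]
X = np.array([features(t) for t in hungry + not_hungry])
y = np.array([1] * 40 + [0] * 40)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # cross-validated accuracy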

Keywords: Hunger; Pressure Sensor; Machine Learning; Feature Extraction; Time Series.

62
PUBLIC TRANSPORT TRACKING SYSTEM USING IoT

Sanjay Adasul
Associate Consultant, CGI, Bengaluru, India
sanjay.adasul@cgi.com
Anish Joseph
Lead Analyst, CGI, Bengaluru, India
a.joseph@cgi.com

Abstract

Traffic congestion is a major challenge in Indian cities, and Bengaluru is no exception.
Various studies have concluded that a substantial share of city commuters must switch to high-
occupancy vehicles (HOVs) to ease traffic congestion. The purpose of the paper is to
identify the key reasons commuters do not opt for BMTC buses and to explore a digital
solution that enables city commuters to switch to HOVs. As part of this research, a sample
survey was conducted among the working population. Analysis and exploration of the
surveyed data revealed various reasons why commuters do not opt for BMTC buses and
whether they would switch to HOVs if key concerns are addressed. An IoT-based solution is
conceptualized to improve the reliability of BMTC buses and to promote efficient carpooling as a
preferred mode of commuting. As a result, fewer vehicles on Bengaluru’s roads
would ease traffic congestion significantly.

Keywords: IoT, Traffic Congestion, BMTC, HOVs

63
THE IMPACT OF FACEBOOK ADVERTISEMENT,
FACEBOOK REVIEW AND FACEBOOK PAGE ON ONLINE
SHOPPING INTENTION

Ms. M. Swapana, Research Scholar, VIT business school, VIT University, Vellore, India –
632 014. Email: swapana.m2014@vit.ac.in
Dr. C. Padmavathy, Senior Assistant Professor, Marketing Division, VIT Business School,
VIT University, Vellore, India – 632 014. Email: padmavathy.c@vit.ac.in
Mr. C. Naveen Kumar, Research Scholar, VIT business school, VIT University, Vellore,
India – 632 014. Email: naveenkumar.c@vit.ac.in

Abstract

This paper reviews the literature to identify crucial factors associated with the relationship
between social media and online shopping intention. Based on these factors, a survey has
been conducted to identify the relationship between Facebook advertisement, Facebook
review, Facebook page and the online shopping intention of consumers. Responses
obtained from a private university are evaluated using principal component analysis to
understand the latent structure of the factors. Finally, three factors with 13 items are
extracted after three iterations. The research findings are supported by the
existing literature, which provides a better understanding of online shoppers.
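
As a minimal sketch of the extraction step described, scikit-learn’s principal component analysis can be run on standardised survey items; the 13-item response matrix below is random stand-in data, not the study’s responses.

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
responses = rng.integers(1, 6, (200, 13)).astype(float)  # 5-point Likert items

Z = StandardScaler().fit_transform(responses)
pca = PCA(n_components=3).fit(Z)  # three factors, as in the study
print(pca.explained_variance_ratio_)           # variance captured by each factor
print(np.abs(pca.components_).argmax(axis=0))  # strongest factor for each item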

Keywords: Online shopping intention, Facebook advertisement, Facebook review, Facebook page, Factor analysis

64
SMART HEALTH CARE

M. Sylaja01, M. Shibani01, T. R. Suruthimai01, P. Leela Varshini02


01 - Student at Thiagarajar College of Engineering (IT)
02 - Student at Thiagarajar College of Engineering (ECE)
sylajatceit@gmail.com, shibanimaran97@gmail.com, shruthimai18@gmail.com,
praleela@gmail.com
Guide name:
Mrs. S. Karthiga,
Associate Professor,
Department of Information Technology,
Thiagarajar College of Engineering,
Madurai

Abstract

India is a country enriched with all types of resources, but when it comes to medical
facilities, and concern for senior citizens in particular, it stands a step behind
other countries. This paper describes our mobile application named
SmartHCare, which especially benefits the health of senior citizens. They can frequently check
their health status and be aware of impending health problems by using the application. As
of now, they need to go to hospitals for medical checkups. Instead, they can use the
mobile application for frequent checkups of pulse rate, blood pressure levels, etc. The
sensor fitted into the mobile phone will analyze their health status. The sensor starts to analyze
the health status only if the fingerprint matches the registered fingerprint. If the
health condition is abnormal, the patient and their emergency contacts will be notified. Also,
the medical reports of the patient after each checkup will be sent regularly by mail to the
emergency contacts of the patient. Though the app is similar in most aspects to
other existing health-related mobile apps, it has certain unique features which are
discussed in later sections of this paper.

Keywords: Internet of Things, Fingerprint Scanner, Sensors, Data Mining, Mobile Application, Health Status, Senior Citizens

65
CRIME ANALYSIS ACROSS MAJOR CITIES OF INDIA WITH
TWITTER

Nisanth T. S
Student
REVA Academy for Corporate Excellence, REVA University
Mainframe Administrator, IBM India
Bangalore, India
NishanthTS.BA02@reva.edu.in
Sonali Sucharita
Student
REVA Academy for Corporate Excellence, REVA University
Technology Lead, Cognizant Technology Solutions
Bangalore, India
sonalis.ba02@reva.edu.in

Abstract

Police traditionally use maps and tacit knowledge about a city to determine which public
areas should be observed to prevent or reduce crime. Most of the time, the accuracy of
such predictions is very low. Crime prevention should be a proactive measure rather than a
reactive one. Data from social networking sites like Twitter can enable the prediction of
future crimes in major cities and is used extensively in countries like the USA.

The data used in this study are Twitter feeds from 5 major cities in India. Based on
latitude and longitude, this study attempts to collect data for the top 10 types of crimes across
these major cities and then compare the results with data from the Government website of the
National Crime Records Bureau (NCRB).

Data collection is planned in three stages: (1) collecting crime data from Twitter, (2) collecting
data from the Government crime records website, and (3) creating a combined dataset with both
the Twitter data and the Government data.

Text mining and sentiment analytics will be used to predict the crime rates in these cities. Our
attempt is to build a better predictive model that can predict crimes likely to
occur in these cities over the next three months.

Keywords: Text Mining, Crime prediction, Twitter, Density

66
DEEP LEARNING BASED DECEASED-DONOR KIDNEY
ALLOCATION MODEL FOR INDIA

Andrea Brian Churchill


Student, MBA in Business Analytics
REVA University
Bangalore, India
andreabrian.c@icloud.com
Dr Brian Mark Churchill
Student, Masters in Renal Transplantation
Liverpool University
Liverpool, UK
brianmarkc@icloud.com
Mutturaj Baradol
Student, PGDM-Business Analytics
Senior Engineer, Rakuten India Inc.
REVA University
Bangalore, India
Mutturajb.BA01@reva.edu.in
Dr Shinu Abhi
Director, Corporate Training
REVA University
Bangalore, India
shinuabhi@reva.edu.in

Abstract

Deep learning, also known as deep structured learning and deep machine learning (DML), is
a subfield of machine learning. It consists of complex algorithms called artificial neural
networks (ANN) and deep neural networks (DNN) that form an architecture consisting of
hierarchically connected multiple processing levels, similar to the structure and function of the
human brain.

There is a huge shortage of organs for transplant. Subjects on the waiting list
for kidney transplantation remain at high risk of mortality from chronic kidney disease
while on dialysis. The longer the wait, the greater the risk of death. Kidneys from deceased
donors are allocated to kidney failure patients on the waiting list on the basis of a number of
criteria, like blood group, tissue matching, age, time on dialysis, urgency, and survival
benefit. The criteria used for allocation may differ between countries.

The organ allocation system needs to be precise and highly efficient, and the best model can
be an artificial intelligence model. Computer-based organ allocation models exist in Europe
and the United States of America. These models are based on scientific principles and
government policies and exemplify an effort to make organs available to the neediest, and
with the best tissue matching. The distance the organ needs to be transported is also taken into
account when allocating an organ, since the shorter the delay in transplanting the organ, the
shorter the cold ischemia time and the lower the chance of delayed graft function.

We aim to develop a deep-learning based artificial intelligence (AI) model built on a deep
neural network hierarchical architecture. Besides helping in the allocation of kidneys, the
deep learning based AI will give HLA matching details and an HLA matching score that will
help clinicians decide the immunosuppression protocol and will give insight into the
success of transplantation. A better HLA match improves the chances of a successful kidney
transplantation.

The deep learning model will use a database of blood group, PRA (panel reactive
antibodies), age, anti-HLA antibodies, and other important variables. The database used to
train and test the model is based on statistical data collected from OPTN (Organ Procurement
and Transplantation Network), USA and Wikipedia. We have built the model based on the
guidelines for organ transplantation from NOTTO (National Organ and Tissue Transplant
Organization, India), as the model aims to help the allocation of kidneys from deceased donors
in India.

We expect to develop a highly efficient allocation tool based on AI that can also give details
of the HLA match, which is vital for the clinical management of the transplanted organ.
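
For illustration, a minimal network of the kind described can be sketched with tf.keras as below; the feature set, labels and architecture are assumptions for the sketch, not the authors’ OPTN-trained model.

import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.random((500, 8))      # e.g. blood-group match, PRA, age, HLA mismatches...
y = rng.integers(0, 2, 500)   # 1 = kidney allocated to this donor-candidate pair

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # allocation propensity
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

# Rank waiting-list candidates for a donor by predicted propensity
print(model.predict(X[:5], verbose=0).ravel())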

Keywords: Deep Learning, Kidney Transplant, Transplantation, Artificial Neural Networks, Deep Neural Networks.

68
OTHERS

69
STAKEHOLDERS FRAMEWORK FOR ANALYZING RELATIONSHIP
BETWEEN THE SUSTAINABILITY METRICS OF SUPPLY CHAIN.

Rahul Solanki*
Research Scholar
Department of Operational Research,
University of Delhi, Delhi
solanki.rahul1470@gmail.com

Jyoti Dhingra Darbari


Research Scholar
Department of Operational Research,
University of Delhi, Delhi
jydbr@hotmail.com

Vernika Agarwal
Research Scholar
Department of Operational Research,
University of Delhi, Delhi
vernika.agarwal@gmail.com

P.C. Jha
Professor,
Department of Operational Research,
University of Delhi, Delhi
jhapc@yahoo.com

Abstract
Organizations are now compelled to embrace sustainability in their supply chains (SCs) due to
strict government regulations and increasing pressure from social organizations. The
sustainability-focused supply chain is an extension of the green supply chain, as it considers
social criteria along with economic and green criteria. Unfortunately, firms in India are more
inclined towards the economic and environmental fronts, with little attention paid to the social
aspect. Existing studies suggest that the major impediments to social sustainability (SS) relate
to firms’ focus on maximizing economic productivity and reducing costs, and to the lack of
government rules regarding social injustice and labor laws. Hence, within the domain of the
sustainable supply chain, SS is generally perceived as a mere economic burden or, in a few
cases, just a compulsion, even though it can positively enable other sustainability initiatives.
To disambiguate the notion, this paper brings forth relevant dimensions of SS in terms of
various SC factors, which can further stimulate the overall sustainability of the SC.
To achieve the aforementioned objective, the study aims to examine the following:
(i) Identify important social aspects that can be adopted by any manufacturing firm to be
socially responsible towards all its stakeholders.
(ii) How do these social aspects affect the economic and environmental factors of
sustainability?

70
(iii) Which of the dimensions of SS considered in the study are aligned with the prospective
SC performance outcomes of the firm?
(iv) How are identified relevant SS dimensions measured and managed?

The methodological framework of the study entails identification of social dimensions through
an extensive literature survey. Various SC executives of the firm are consulted and their opinions
are gathered through semi-structured interviews and a questionnaire. The data collected is
further analyzed using Interpretive Structural Modelling (ISM) and MICMAC analysis to
extract the social aspects which foster economic sustainability not only for the manufacturing
firm but for the suppliers and customers as well. The responses gathered from the experts aid
in classifying the relationship between SS and the eco-environmental performance of the firm
as ‘negative relationship’, ‘no relationship’ or ‘positive relationship’. The positively related
social aspects are further investigated to understand the degree of influence, which is
quantified using a numerical scale of 1–5. The findings of
the integrated multi-criteria method suggest that ‘Ethics’, ‘Health and Safety’, ‘Labor Rights’,
‘Wages’ and ‘Education and Training’ can positively lead to job satisfaction, employee retention,
trust-building with stakeholders and a clean and healthy environment, which consequently
enhances the financial and environmental performance of the firm. The results also provide
a baseline for managers seeking to build a socially responsive supply chain and help them
identify pertinent social aspects to focus upon for achieving sustainability at all three levels
of the supply chain.
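
The ISM step can be illustrated with a short sketch: a hypothetical initial influence matrix over five SS dimensions is closed transitively (Warshall’s algorithm) to obtain the final reachability matrix, from which the MICMAC driving and dependence powers are read off.

import numpy as np

dims = ["Ethics", "Health & Safety", "Labor Rights", "Wages", "Training"]
A = np.array([  # A[i, j] = 1 if dimension i influences dimension j (assumed)
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 0, 1],
], dtype=bool)

R = A.copy()
for k in range(len(dims)):            # Warshall transitive closure
    R |= np.outer(R[:, k], R[k, :])

driving = R.sum(axis=1)     # MICMAC driving power (row sums)
dependence = R.sum(axis=0)  # MICMAC dependence power (column sums)
for d, dr, de in zip(dims, driving, dependence):
    print(f"{d:16s} driving={dr} dependence={de}")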

Key words: Social Sustainability, Sustainable Supply Chain, ISM, Stakeholders.

71
STATE WISE PANEL DATA STUDY OF USAGE OF TOILETS IN
RURAL INDIA

Monika Saxena
Assistant Professor
Amity University
Greater Noida, India
monikasaxena29@rediffmail.com

Abstract
In 2015, India was the fastest growing economy in the world with a 7.5 percent growth rate,
surpassing China’s growth rate of 7 percent; yet, according to the Swachhta Status Report by NSSO
in 2015, the same year, more than half of the rural population (52.1 per cent) still defecated in the
open, which is a major challenge for the Indian economy. After the Government of India launched the
Swachh Bharat Mission-Gramin in 2014, the allocation of funds increased roughly threefold, from
₹2,850 crore to ₹9,000 crore, of which 97 percent has been allocated to IHHL (Individual Household
Latrines). This study focuses on the impact of various factors affecting sanitation facilities and the
usage of toilets in India. The study covers states of India selected on the basis of population and uses
a panel data model to estimate the impact of various factors and how policies have played a major role.
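
As an illustration of the panel specification, the sketch below runs a state fixed-effects regression via entity dummies in statsmodels; the variable names and the synthetic data frame are assumptions, not the study’s data.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
states, years = ["UP", "Bihar", "MP", "Rajasthan"], range(2012, 2017)
df = pd.DataFrame([(s, y) for s in states for y in years], columns=["state", "year"])
df["funds"] = rng.random(len(df)) * 100         # funds released (crore), assumed
df["literacy"] = 50 + rng.random(len(df)) * 30  # rural literacy rate (%), assumed
df["usage"] = 20 + 0.3 * df["funds"] + 0.5 * df["literacy"] + rng.normal(0, 5, len(df))

# State fixed effects absorb time-invariant state characteristics
fe = smf.ols("usage ~ funds + literacy + C(state)", data=df).fit()
print(fe.params[["funds", "literacy"]])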
Keywords: Empirical, Panel Data, Swachh Bharat, IHHL

72
PENALTY: A GAME OF CHOICES

Indranath Mukherjee
AVP – Strategic Analytics
XL Catlin
Gurgaon, India
indranath.mukherjee@xlcatlin.com

Abstract

The phenomenon of penalty kick in the game of football (soccer) is a matter of choice – both
for the penalty taker and the goalkeeper. The penalty taker needs to decide whether to shoot
left, right or down the middle while the goalkeeper chooses whether to dive left, right or stay
put in the center of goal. It can be shown that the players actually play a mixed strategy Nash
Equilibrium (MSNE) game which is consistent with the minimax theorem of von Neumann.
A regression analysis run on 1,407 penalty kicks from the leagues in Spain, England, Italy
and some international games from the period between September 1995 and June 2012 shows
that the probability that a goal will be scored (not scored) for the shot taker (goalkeeper)
should be the same across strategies for each player. Each player’s choices are also serially
independent given constant payoffs across penalty kicks. This means that players must be
concerned only with instantaneous payoffs and there are no intertemporal links between
penalty kicks – the choices should be memoryless. In other words, the choice in one
particular penalty kick must not depend on one’s own previous choices or the opponent’s
previous choices or any other previous actions. This paper discusses the results in detail and
look at a real life application.
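
The equilibrium logic can be made concrete with a simplified 2x2 version of the game; the scoring probabilities below are illustrative, not the paper’s estimates. At the mixed-strategy equilibrium the kicker’s scoring probability is equal across strategies, which is exactly the restriction the regression tests.

import numpy as np

# A[i, j] = P(goal) when the kicker shoots side i and the keeper dives side j
# (0 = left, 1 = right); the entries are assumed for illustration
A = np.array([[0.60, 0.95],
              [0.90, 0.55]])

den = A[0, 0] - A[0, 1] - A[1, 0] + A[1, 1]
p_left = (A[1, 1] - A[1, 0]) / den  # kicker's equilibrium P(shoot left)
q_left = (A[1, 1] - A[0, 1]) / den  # keeper's equilibrium P(dive left)

# Equal scoring probability whichever way the keeper dives, as MSNE requires
score_vs_left = p_left * A[0, 0] + (1 - p_left) * A[1, 0]
score_vs_right = p_left * A[0, 1] + (1 - p_left) * A[1, 1]
print(p_left, q_left, score_vs_left, score_vs_right)  # 0.5, 0.571..., 0.75, 0.75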

Keywords: Football, Penalty Kick, Game Theory, Minimax Theorem, Regression Analysis

73
KABADDI: FROM AN INTUITIVE TO A QUANTITATIVE
APPROACH FOR ANALYSIS, PREDICTIONS AND STRATEGY

Manojkumar Somabhai Parmar


Student, EGMP-43
Indian Institute of Management, Bangalore, India
manojkumar.parmar17@iimb.ac.in; parmarmanojkumar@gmail.com

Abstract

Kabaddi is a contact team sport of Indian-origin. It is a highly strategic game and generates a
significant amount of data due to its rules. However, data generated from kabaddi tournaments
has so far been unused, and coaches and players rely heavily on intuitions to make decisions
and craft strategies. This paper provides a quantitative approach to the game of kabaddi. The
research derives its insights from an analysis performed on data from the 3rd Standard-style
Kabaddi World Cup 2016, organised by the International Kabaddi Federation. The dataset,
which consists of 66 entries over 31 variables from 33 matches, was manually curated. This
paper discusses and provides a quantitative perspective on traditional strategies and
conceptions related to the game of kabaddi such as attack and defence strategies. Multiple
hypotheses are built and validated using Student’s t-test. This paper further provides a
quantitative approach to profile an entire tournament to gain a general understanding of the
strengths of various teams. Additionally, team-specific profiling, through hypotheses testing
and visualisation, is presented to gain a deeper understanding of a team’s behaviour and
performance. This paper also provides multiple models to forecast the winner. The model-
building includes automatic feature selection techniques and variable importance analysis
techniques. Generalised linear model with and without an elastic net, recursive partitioning and
regression tree, conditional inference tree, random forest, support vector machine (linear and
radial) and neural network-based models are built and presented. Ensemble models use
generalised linear model and random forest techniques as the ensemble method to combine the
outcomes of a generalised linear model with elastic net, random forest, and neural network-
based models. The research discusses the comparison between models and their performance
parameters. The research also suggests that the ensemble technique is not able to boost accuracy.
Models achieve 91.67%–100% accuracy on the cross-validation dataset and 78.57%–100% on the test
set. The results presented can be used to design in-game, real-time winning predictions to improve
decision-making, and to design agents and environments to train artificial intelligence via
reinforcement learning.
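
A compact sketch of the model-comparison workflow, with scikit-learn stand-ins for the model families named above and synthetic data in place of the 66 x 31 match dataset (which is not public here):

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.random((66, 31))    # per-team match features (synthetic stand-in)
y = rng.integers(0, 2, 66)  # 1 = team won the match

models = {
    "glm": LogisticRegression(max_iter=1000),
    "rf": RandomForestClassifier(n_estimators=200, random_state=1),
    "svm-rbf": SVC(kernel="rbf"),
    "nnet": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=1),
}
for name, m in models.items():
    acc = cross_val_score(m, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name:8s} cv accuracy = {acc:.2f}")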

Keywords: Kabaddi, Sports Analytics, Predictive Model, Team Profiling

74
FEATURE RANKING USING ASYMMETRIC INFORMATION INDEX
IN CASE OF CONTINUOUS TARGET VARIABLE

Tathagata Mukhopadhyay
Assistant Vice President
Analyttica Datalab Pvt. Ltd.
Bengaluru, India
tathagata.mukhopadhyay@gmail.com

Abstract

There is no doubt about the importance of variable selection or dimensionality reduction in a data
mining exercise. We need to understand the importance of each predictor variable in the prediction
of the target variable. Various researches have taken place around the concept of the entropy of a
variable and the mutual information shared between two variables to evaluate the importance of a
predictor or feature. However, most existing work is either largely theoretical or limited to binary
or categorical target variables. Hence, this paper proposes a measure, the Asymmetric Information
Index (AII), that can be used to quickly measure the importance of a variable or feature as a
predictor of a continuous variable. The measure sought here has to be quick and
easy to calculate, unaffected by the presence of missing data or outliers, unit free (so that variables
with different units can be compared), and robust enough to capture any kind of relationship,
like the Information Value calculation in the case of logistic regression.
After proposing the measure, the paper gives a high-level algorithm and, with some dummy data,
proves its applicability and robustness. The paper shows how the measure is able to
effectively identify different kinds of relationships, e.g. linear, non-linear, heteroscedastic,
homoscedastic, etc. It also shows that the measure is not only easy to compute
but also unaffected by outliers or missing values. Finally, the paper takes a real-life dataset,
first finds the relative ranking of raw variables based on the algorithm, and then shows
that the final model contains the variables selected by the algorithm.
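
AII itself is the paper’s proposal; as an analogous, readily available entropy-based ranking for a continuous target, the sketch below applies scikit-learn’s mutual_info_regression to synthetic features with known linear, non-linear and null relationships.

import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
n = 1000
x_lin = rng.normal(size=n)        # linear predictor
x_nonlin = rng.uniform(-2, 2, n)  # non-linear (quadratic) predictor
x_noise = rng.normal(size=n)      # pure noise
y = 2 * x_lin + x_nonlin ** 2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([x_lin, x_nonlin, x_noise])
mi = mutual_info_regression(X, y, random_state=0)
for name, score in zip(["linear", "nonlinear", "noise"], mi):
    print(f"{name:10s} MI = {score:.3f}")  # the noise feature should rank last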

Keywords: Information Value, Dimensionality Reduction, Entropy Measure, Supervised Learning, Continuous Target Variable

75
GREEN PURCHASE INTENTION AND BRAND EQUITY: A RETRACE

Siddharth Misra, PhD Scholar


sid.misra1983@gmail.com
Dr. Rajeev Kumar Panda, Assistant Professor
panda.rajeevkumar@gmail.com
Ms. Namrata Nanda, Student, MBA
nanda.namrata@gmail.com
School of Management, National Institute of Technology Rourkela, Odisha, India

Abstract

The research starts with the theoretical assumption that environmentally conscious products are
better performers in terms of brand equity and have an improved value proposition compared
to non-green products; it examines the contribution from the consumer side to participation in
environmentally conscious buying. Studies in this field have revealed that customers
are interested in paying extra for better-valued products, but the value is often not the green
attribute but other attributes associated with the brand. People in India have shown
reluctance in buying environmentally conscious products even though there is a level of concern
and consciousness about the prevailing environmental situation, and these products are
always perceived as expensive by price-sensitive consumers. Hence, the motive behind this
study is to identify the right set of antecedents for improving consumer green buying
behaviour. This article further examines the impact of green purchase intention on green
brand equity and discusses interactions such as the mediating and moderating roles of green
purchase behaviour and pro-environmental conditions, respectively. The study concentrates on
Indian consumers and focusses on the purchase of environmentally conscious refrigeration
products in India. This research is based on a reverse study of green attitude on green brand
equity.
The research employs structural equation modelling (SEM) to test the hypotheses and
TOPSIS to rank the attributes. The outcome verifies the positive relationship between
green purchase intention and green brand equity. Hence, the research results indicate and
suggest that practitioners must enhance the attitude towards green purchase by incorporating
pro-environmental conditions associated with the brands. Brands also need to
convert these intentions into actual purchases, which is a behavioural aspect, and this
leads to the enhancement of green brand equity. The result not only confirms the positive
relationship between green purchase intention and green brand equity but also strengthens the
belief that the mediating role of actual purchase and the moderating role of pro-
environmental conditions lead to better performance of brands in terms of equity. The
study suggests that brand managers sensitize pro-environmental conditions and target
consumers with more appropriate tools to persuade them to convert their green purchase
intention into green purchase behaviour.
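
The TOPSIS ranking step can be sketched in a few lines; the decision matrix, weights and criteria directions below are invented for illustration.

import numpy as np

# rows = three refrigerator brands, cols = four green-brand criteria (assumed)
X = np.array([[7.0, 9.0, 6.0, 8.0],
              [8.0, 7.0, 7.0, 6.0],
              [6.0, 8.0, 9.0, 7.0]])
w = np.array([0.3, 0.3, 0.2, 0.2])             # criterion weights
benefit = np.array([True, True, True, False])  # False = cost-type criterion

V = w * X / np.linalg.norm(X, axis=0)  # weighted, vector-normalised matrix
ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
anti = np.where(benefit, V.min(axis=0), V.max(axis=0))

d_pos = np.linalg.norm(V - ideal, axis=1)
d_neg = np.linalg.norm(V - anti, axis=1)
closeness = d_neg / (d_pos + d_neg)  # higher = closer to the ideal solution
print(np.argsort(-closeness))        # ranking of the alternatives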

Keywords: Environmental Consciousness, Brand Equity, Green Purchase Intention, Moderated Mediation

76
PREDICTION OF VIOLENCE IN CIVIL UNREST AREAS OF
JAMMU AND KASHMIR

Aditya Kumar, Vedant Patel, Vivek Kumar


B. Tech Department of Information Technology
National Institute of Technology Karnataka
Surathkal, India
adds.96, vedant.patel031096, vivekkumarsah2@gmail.com
Prof. G. Ram Mohana Reddy
Head of Department, Department of Information Technology
National Institute of Technology Karnataka
Surathkal, India
profgrmreddy@nitk.ac.in

Abstract

Civil unrest events are common happenings in all forms of government, from liberal
democracies to hard-line totalitarian regimes. These events not only cause destabilization of
the socio-economic balance of a region but can also result in a huge loss of life and property.
However, the impact of such occurrences on the society can be contained by proper and
proactive handling of the events, their core origins as well as their stimuli. This gives rise to a
need for continuous risk assessment of upcoming civil unrest events, so that appropriate and
timely decisions can be taken. Forecasting such events can be a very useful tool for social
scientists and policy makers, for it can be used to help peacemakers or governments allocate
limited available resources judiciously, inform NGOs on potential hot-spots to avoid, and
even provide speculative investment opportunities for entrepreneurs and business
corporations. In this work, a novel forecasting model is proposed to supplement real-time,
policy relevant decision making by predicting temporally and geo-spatially nuanced areas of
civil unrest. The core idea proposed in our model is that a thorough analysis of the occurrence
of past events, the general sentiment regarding those events and the demography of the region
can be used to efficiently predict the possibility of future events occurring in a region.
Although it is tough to assess the deep underlying structural impact of the factors of unrest,
with proper modelling and analysis, an approximate measurement or trend can definitely be
provided. This work proposes the use of a linear-regression ARIMA (Auto Regressive
Integrated Moving Average) model on the number of events taking place in a region,
obtained through the machine-coded public dataset GDELT (Global Database of Events,
Language, and Tone), along with an ANN (Artificial Neural Network) that captures the
nonlinear component of the model, a sentiment quotient acquired through Twitter scraping and
a quantitative measure of the socio-economic conditions of the populace, so as to predict the
occurrence of future events. Extensive analysis demonstrates the effectiveness of the
proposed model compared to the naïve forecast model and other existing methodologies.
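
A minimal sketch of the hybrid idea, assuming synthetic weekly event counts: statsmodels’ ARIMA captures the linear component, an MLP models the residuals, and the hybrid forecast is their sum.

import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
events = rng.poisson(5, 200).astype(float)  # weekly event counts (synthetic)

# Stage 1: linear component with ARIMA
arima = ARIMA(events, order=(2, 0, 1)).fit()
resid = arima.resid

# Stage 2: nonlinear component, an ANN on lagged residuals
L = 4
X = np.column_stack([resid[i:len(resid) - L + i] for i in range(L)])
y = resid[L:]
ann = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, y)

# Hybrid one-step forecast = ARIMA forecast + ANN residual correction
linear_part = arima.forecast(steps=1)[0]
nonlinear_part = ann.predict(resid[-L:].reshape(1, -1))[0]
print("hybrid forecast:", linear_part + nonlinear_part)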

Keywords: Event detection, ARIMA, ANN, GDELT, Event prediction

77
COST VIABILITY OF RENEWABLE POWER PROJECT IN
EDUCATIONAL INSTITUTE IN PUNE: CASE STUDY
Prof. Parikshit G. Jamdade
Assistant Professor
PVG’s College of Engineering and Technology
Pune, India
parikshit_jamdade@yahoo.co.in
Mr. Durgesh P. Vibhandik
U. G. Student
PVG’s College of Engineering and Technology
Pune, India
vibhandikdurgesh@gmail.com
Prof. Shrinivas G. Jamdade
Associate Professor
Nowrojee Wadia College
Pune, India
hv_jamdade@yahoo.co.in

Abstract

In India, many efforts and initiatives have been taken by the Government of India to generate power
from renewable energy sources for the sustainable development of business. The aim of this
paper is to examine the cost viability of solar, wind and hybrid power projects. In this paper we
analysed solar radiation and wind speed data to calculate the power generating potential of
the Pune urban area. An ARIMA model is used to confirm the power generating potential of
the Pune urban area. For the case study, load demand data of PVG’s College of Engineering and
Technology, Pune, is taken. The recorded average load of the college is 21,083 kWh per month.
The total installation costs required for the wind, solar, and hybrid power projects are Rs. 5,108,451,
Rs. 6,440,000, and Rs. 5,774,226 respectively. As per our study, the payback period for the wind
power project (WPP), solar power project (SPP) and wind-solar hybrid power project (HPP) is 22
months, 28 months and 25 months respectively. The results show that the wind power project is an
attractive option due to its lower payback period.
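
The payback figures can be checked with simple arithmetic; the monthly saving used below (the full monthly load of 21,083 kWh valued at an assumed tariff of about Rs. 11 per kWh) is back-computed from the reported payback periods rather than stated in the paper.

# Installation costs are from the abstract; the tariff is an assumption
costs = {"wind": 5_108_451, "solar": 6_440_000, "hybrid": 5_774_226}
monthly_saving = 21_083 * 11  # Rs. per month at the assumed tariff

for project, cost in costs.items():
    print(f"{project:6s}: payback = {cost / monthly_saving:4.1f} months")
# wind = 22.0, solar = 27.8, hybrid = 24.9 (matching the reported 22/28/25 months)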

Keywords: Cost Viability, ARIMA Model, Solar Power Project, Wind Power Project,
Hybrid Power Project.

78
ANALYSIS OF EDUCATIONAL PROGRESS OF INDIA USING DATA
VISUALISATION
Prachi Deshmukh-Chaudhari
ISC Analyst
HTS
Bangalore, India
Prachideshmukh77@gmail.com
Abhijeet Ashok Deshmukh
Software Quality Analyst
IBM
Mumbai, India
Abhijeetcareer3@gmail.com

Abstract:
It is said that education is the most powerful weapon which can be used to change the
world. In India, the latest calculated literacy rate is 74%. There are in total 1,516,865 schools,
38,498 colleges and 760 universities in India. In this paper, we analyse some of the
important elements, such as age group, region and gender, and their effect on educational
indicators such as literacy and GER. We have created data visualisations to present some of
these indicators using Tableau, and we analyse the effect of the above-mentioned factors
on these indicators. The data used here is from various statistical publications of the Ministry
of Human Resource Development and the National University of Educational Planning and
Administration.

Keywords: Education, Data visualization, Descriptive analytics, Visual Analytics

79
DETECTING AND PREVENTING FRAUD WITH APACHE FLINK

Abhishek Gupta
Associate Software Engineer
CGI
Bangalore, India
abh.gupta@cgi.com
Rishab Prasad
Associate Software Engineer
CGI
Bangalore, India
rishab.prasad@cgi.com,

Abstract
Facing millions of dollars in fraud losses, companies cannot rely solely on strong user
authentication for online transactions. Once user credentials have been stolen or spoofed,
authentication controls are no longer effective by themselves. Thus, we need a fraud prevention
system as an add-on to prevent fraud in case user credentials have been stolen or
spoofed. This paper proposes an intelligent fraud prevention model which uses Apache Flink
to monitor transactions as they occur in real time. Hence, we can alert customers about likely
nonstandard transactions in real time, which can prevent fraud. While there is a lot of focus
on predictive analytics and the corresponding algorithms, this paper focuses more on how we
leverage the framework to integrate data in real time from multiple channels to enable quick
and fast alerts.
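
A minimal PyFlink sketch of the monitoring idea follows; the event schema, in-memory source and fixed-threshold rule are illustrative assumptions, and a production job would instead read from a streaming source such as Kafka and apply the model’s scoring logic.

from pyflink.datastream import StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()

# (card_id, amount) events; a real job would consume these from a message queue
txns = env.from_collection([
    ("card-1", 40.0), ("card-1", 55.0), ("card-1", 4900.0),
    ("card-2", 20.0), ("card-2", 25.0),
])

# Simple stateless rule for the sketch: flag any amount above a fixed threshold
alerts = txns.filter(lambda t: t[1] > 1000.0)
alerts.print()  # downstream, this would feed a customer-notification channel

env.execute("fraud-alert-sketch")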
Keywords: Big Data Analytics, Fraud Detection, Apache Flink, Real-Time Analytics

80
CLINICAL INTELLIGENCE AND INSIGHTS
Shivam Panicker*1
Senior Associate
Cognizant Technology Solutions Pvt Ltd
Pune, India
Shivam.Panicker@cognizant.com
Harshal Jagdale*2
Senior Associate
Cognizant Technology Solutions Pvt Ltd
Pune, India
Harshal.Jagdale@cognizant.com
Ram Narayan Dash*3
Associate
Cognizant Technology Solutions Pvt Ltd
Pune, India
Ramnarayan.dash@cognizant.com

Abstract
CI (Clinical Intelligence) is a new learning paradigm of the clinical process for health-care
providers. Data science will train clinicians to observe, measure, interpret, diagnose, plan
and deliver the best predictive care. Leveraging the combined power of Clinical Intelligence and
Business Intelligence will improve our nation’s entire healthcare system as well as the medical
and economic wellness of patients.

A first-level implementation of CI, remote health monitoring, is an important application of the
Internet of Things. Through monitoring, we can give adequate healthcare to people who are in
dire need of help. With IoT, devices fitted with sensors notify the concerned healthcare providers
when there is any change in the vital functions of a person. These devices are capable of applying
complex algorithms and analysing the readings so that the patient receives proper attention and
medical care. The collected patient information is stored in the cloud. Through remote monitoring,
patients can significantly reduce the length of hospital stay and perhaps even hospital
re-admission. This kind of intervention is a boon to people living alone, especially seniors: if
there is any interruption in the daily activity of a person, alerts are sent to family members and
the concerned health providers. These monitoring devices are available in wearable form too.
Wearable or home health monitoring devices assisting patients are now common. Such devices can
transmit vital-sign data from a patient’s home to the hospital staff, allowing real-time monitoring
of the patient’s health through wirelessly connected glucometers, scales, and heart-rate and
blood-pressure monitors. Devices helping to monitor ICU procedures in real time are indeed a big
part of IoT.

CI creates a risk score based on the patient’s likelihood of a major event within a selected
period of time, leveraging the health system’s semantic data lake for predictive modelling. We
have used IoT to continuously monitor real-time stream data as the input to the model. Through
the implementation of machine learning and deep learning, and modern data mining, pattern
matching, data visualization and predictive modelling tools, we generate analyses and algorithms
that help clinicians with decision support systems for current and future prediction. We combine
predictive, descriptive, prescriptive, diagnostic and genetic analytics to turn raw data into smart
data, leading to patient-flow analytics that help make better decisions on a timelier basis
irrespective of location.

Keywords: IoT, Predictive Analytics, Deep Learning, Remote Health Care

82
RELIABILITY AND VALIDITY OF CORPORATE SOCIAL IDENTITY
DISCLOSURE USING TEXT ANALYTICS

Suvendu Kr. Pratihari


Ph.D. Scholar, School of Management
National Institute of Technology (NIT) Rourkela, India-769008
suvendupratihari@gmail.com
Dr. Shigufta Hena Uzma
Assistant Professor, School of Management
National Institute of Technology (NIT) Rourkela, India-769008
shigufta.uzma@gmail.com
Abstract

The focus of the study is to report how qualitative text data can be analysed and how validity
and reliability of these data can be assured by using text analytics principles. The paper
demonstrates this by considering the case of the banking sector in India. The data in the study
belong to the text content related to the corporate social identity disclosure by the scheduled
commercial banks (SCB) of India in their websites. The paper is grounded on the role of
corporate social responsibility (CSR) in the formation of corporate social identity, which plays
a significant role in the process of corporate branding. However, for measuring corporate
social identity disclosure in the banking sector, the minimum content of CSR reporting has
been the subject of an extensive and still unresolved debate among scholars. The study
examines and reports the prioritisation of different corporate social identities by the banking
sector in India in endorsing the corporate branding process. The study extracts and analyses
the CSR disclosure content present on the websites of the SCBs of India using text analytics
principles. The data undergo multiple experiments such as ‘Percentage of Agreement’, ‘Scott’s
pi (π)’, ‘Cohen’s Kappa (κ)’, and ‘Krippendorff’s Alpha (α)’ to check the validity and reliability
of the content. Methods like the quartile approach of statistical data analysis and the weighted
average of prioritisation are used to examine and discuss how the banking sector
prioritises its corporate social identity disclosure with respect to different stakeholders.
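
For illustration, percent agreement and Cohen’s kappa for a toy two-coder example can be computed as below; scikit-learn provides cohen_kappa_score, and the labels are hypothetical.

from sklearn.metrics import cohen_kappa_score

coder_a = ["community", "environment", "education", "community", "health", "community"]
coder_b = ["community", "environment", "health",    "community", "health", "education"]

agreement = sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)
kappa = cohen_kappa_score(coder_a, coder_b)  # chance-corrected agreement

print(f"percent agreement: {agreement:.2f}, Cohen's kappa: {kappa:.2f}")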

Keywords: Text Analytics, Content Analysis, Reliability, Validity, Corporate Social Identity,
Corporate Social Responsibility, Banking, India.

83
DECODING THE PEOPLE’S SENTIMENTS: DEMONETISATION
AND ELECTIONS
Lalak Harnathka*
B.Tech Student
Department of Polymer and Process Engineering, IIT Roorkee, India
lalak.harnathka@gmail.com
Abhijit Menon
B.Tech Student
Department of Polymer and Process Engineering, IIT Roorkee, India
abmenon1996@gmail.com
Tripti Mahara
Assistant Professor
Department of Polymer and Process Engineering, IIT Roorkee, India
triptimahara@gmail.com

Abstract

An important part of our information-gathering behavior has always been to find out what other
people think. Prior knowledge about people’s sentiments can even assist a country’s
government in framing future policy decisions. In recent times, people have been using online
platforms like Facebook and Twitter to voice their opinions and sentiments about any
changes taking place around them, including major policy decisions undertaken by the
government. A study of the authenticity of the opinions expressed on these platforms is
required to check whether the sentiments expressed online match the actual scenario, so
that such analyses can be used to frame upcoming policies. On 8 Nov 2016, the Govt. of
India announced the demonetisation of all ₹500 and ₹1000 banknotes of the Mahatma Gandhi
series. The Govt. claimed that the action would curtail the shadow economy and crack down on
the use of illicit and counterfeit cash to fund illegal activity and terrorism. Demonetisation was
easily one of the major policy decisions made by the government after coming to power in
2014. The Uttar Pradesh election held in 2017 was an important milestone for any
alliance/party striving to achieve a majority in Parliament, which would give that
alliance/party a significant say in passing any constitutional amendment or other policy
decisions. This research paper aims to analyze the effects of demonetisation on the UP
Elections 2017 by extracting tweets and applying the Naive Bayes method of sentiment analysis
to analyze people’s sentiments on Twitter. We aim to check whether people linked
demonetisation with the UP elections or not. The analysis found that the majority
of people did not link demonetisation with the UP Elections.
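
A minimal Naive Bayes sentiment pipeline of the kind described can be sketched with scikit-learn; the tweets and labels below are invented placeholders, not the study’s data.

from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

tweets = ["demonetisation will curb black money", "queues at banks, total chaos",
          "bold move by the government", "cash crunch is hurting small traders"]
labels = ["positive", "negative", "positive", "negative"]

clf = make_pipeline(CountVectorizer(), MultinomialNB()).fit(tweets, labels)
print(clf.predict(["long queues but a bold move"]))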

Keywords: Sentiment Analysis, Demonetisation, UP Elections, Naive Bayes

84
RECOMMENDER SYSTEM TO INCREASE ENGAGEMENT FOR
SPENDERS & NON-SPENDERS OF FREEMIUM MOBILE GAMES

Sridhar Vaithianathan
Associate Professor (Analytics)
Institute of Management Technology (IMT), Hyderabad, India
sridhar.v@imthyderabad.edu.in
Sridhar Seshadri
Director, Technology Products, Cognitive Scale, Hyderabad, India
sseshadri@cognitivescale.com
Shreeram Iyer
Principal Consultant, Trace Consulting, Hyderabad, India
shreeram.iyer@primeedge.in

Abstract

Within the freemium mobile game industry, there is a fine balance between retaining players
and converting them. Currently, marketing and player loyalty efforts to combat churn are still
reactive. We have recently developed a predictive modelling system that outlines proactive
approaches for retaining players before a player chooses to quit. This Proof of Concept
(POC) is both scalable and platform (iOS, Android) independent across a mobile portfolio.

Our POC predicts churn and spend propensity with more than 85% accuracy. This
recommender system would improve gamer engagement and increase the monetization of
freemium mobile games. In addition to our research effort, we have developed a simpler
technique for game teams to utilize and monitor metrics that they think are good indicators of
engagement and monetization.
Using our new, simple statistical approach, a freemium games team will be able to assess
engagement and monetization just by looking at weekly player-behaviour correlations and
engagement/disengagement snapshots.
The recommendation system automatically highlights areas of improvement in design and
economy, which helps a game team create a road map for determining the best actions for
various player segments.

This approach, when used in combination with our POC, results in a better reward and
retention strategy for disengaging players and helps us achieve both re-engagement and first-
time conversions.

Keywords: Recommender System, Freemium Mobile Games, Player Churn, Spend Propensity
and Monetization.

85
CORPORATE OWNERSHIP STRUCTURE AND ITS
DETERMINANTS: EVIDENCE FROM INDIA

Brahmadev Panda
Doctoral Research Scholar,
School of Management, NIT Rourkela, India
E mail: brahmadev.panda@gmail.com
Dr. N.M.Leepsa,
Assistant Professor,
School of Management, NIT Rourkela, India
E mail: leepsa.vgsom.iitkgp@gmail.com

Abstract
This paper investigates the factors determining the equity ownership structure of Indian listed
companies. We distinguish two dimensions of ownership structure: ownership
concentration and ownership identities. The measure of ownership concentration used in this
study is the largest blockholder’s stake, while promoters’ equity holdings and institutional equity
holdings are used as the two measures of ownership identities. Panel data regression models
(pooled OLS, fixed-effect and random-effect) and the generalized method of moments (GMM) are
employed to test the research hypotheses, and a sample of the top 100 listed companies from the
BSE (Bombay Stock Exchange) is chosen for a period of seven years, from FY 2009-10 to FY
2015-16. We have selected a range of firm-specific variables and financing-decision variables
to test their effect on the equity ownership structure of Indian companies. We have
considered firm size, profitability, asset utilization ratio, free cash flow and firm value as
firm-specific variables, whereas financing decision, investment decision and dividend decision
are taken as financing-decision variables.
The first finding of the study reveals that firm size has a significant negative effect and firm
value a significant positive effect on the largest blockholder’s stake. It can be inferred that
ownership concentration increases with firm value, while the blockholder’s stake dilutes
with growth in firm size. The second finding suggests that dividend decision,
cash flow and firm value have a positive effect, while firm size and profitability have a negative
impact on promoters’ shareholding. This signifies that better dividend payout, cash flow
and firm value improve the promoters’ ownership stake, while an increase in firm size
adversely affects the promoters’ stake. Thirdly, we found that firm size and
profitability positively influence institutional ownership, while dividend decision has a
negative effect. This indicates that institutional owners are more attracted to firms that
are larger and more profitable, whereas their stake declines in firms that
pay more dividends to shareholders. The inferences drawn from this study will
help equity investors take their investment decisions judiciously. This is a
unique study in an emerging market like India, which can be an informative model for other
emerging markets of the globe.

Key Words: Ownership Structure, Panel data, GMM, Bombay Stock Exchange, India

86
COUNTER TERRORISM IN THE AGE OF ARTIFICIAL
INTELLIGENCE: THE CASE OF INDIA
Vishnu V.M.*
Data Analyst, Interakt Digital Solutions Pvt.Ltd, Chennai
+91-9962648726; Raghavan.vishnu@gmail.com
Mihir Dash
Professor & Head of Department
Department of Quantitative Methods
School of Business, Alliance University,
Chikkahagade Cross, Anekal Road, Bangalore-562106
mihir@alliance.edu.in; +91 – 9945182465

Abstract

Terrorism is a political phenomenon that has had its adherents for centuries, and India in
particular is a constant victim of their activities. The adherents of terror use surprise as their
major weapon, and the difficulty law enforcement faces in breaching the secret world of terrorists
is a major factor that has hampered efforts at curtailing this problem. While today many
problems are being solved with the help of machine learning, counter-terrorism has yet to take
off in this department, at least as far as India is concerned. While it is almost impossible to
pinpoint the exact date of occurrence of an attack, we believe that a model taking certain
parameters into account will help law enforcement take better preventive steps in handling the
problem.

The objective of the study is to model the modus operandi of terrorists across a period of four
decades in India. The aim is to predict which type of terror attack will be carried out at which
place using what weapons.

The analytical techniques to be used include artificial neural networks (ANNs) and Bayesian
models. While Bayesian models have been used in the literature, artificial neural networks have
not been used extensively. Koutsomanis (2014) suggested that human behaviour
can be predicted reliably by artificial intelligence, while Reby et al. (1997) proposed the
use of ANNs as a classification method in the behavioural sciences.

The data for the study includes details about the nature of the target, date of attack, latitude
and longitude, weapons used, and so on for terrorist attacks in India in the period 1979-2015.
The study is expected to find specific patterns in the data that will enable us to predict the kind
of terror attack that will be planned by an aspiring group – whether these will be lone-wolf
attacks or group attacks – as well as the nature of the weapons and target and the risk dates.
This will enable us to take suitable measures to counter the designs of the terrorists and save
precious lives.

Keywords: Terrorism, Artificial Neural Networks, Bayesian Models, Prediction, Behaviour.

87
APPLICATION OF ANALYTICS IN PERFORMANCE MANAGEMENT

Ms. Sunidhi Sumedha Bhosekar


Ph.D Research Scholar
School of Commerce and Management Studies
Dayananda Sagar University, Bengaluru, India
sunidhi.bhosekar@gmail.com
Dr. Anupama Ghoshal*
Assistant Professor
School of Commerce and Management Studies
Dayananda Sagar University, Bengaluru, India
dranupama.2016@gmail.com
Abstract

Purpose – Evaluation of the gap between employees’ performance and expectations with the help
of analytics will provide the stimulus needed to accomplish objectives. Employers will get the
opportunity to customize performance-centric strategies for formulating career-based
performance management by nurturing leadership. This will optimize the overall
performance of the unit, enabling it to earn the desired revenue.
Design/Methodology/Approach – The empirical study covers sectors such as:
1. Management Consulting and Professional Services Organization.
2. Electronic Commerce and Cloud Computing Organization.
3. Banking and Financial Services.
4. Manufacturing Sciences &
5. Universities.

Findings – The Agile System of Talent Management aims to find the gap between management’s
expectations of its employees and vice-versa. It will
also measure the degree of the gap between the two and suggest a tool to bridge it. Based on
experience and observation, functional aspects of employee performance management through
communication, engagement, recognition, and workplace wellness are offered through the Agile
System of Talent Management for exceeding goals using big data technologies.
Practical implications – Creating a high-performance environment at the workplace.
Originality/Value – This study offers new empirical findings which contribute to a
re-conceptualization of the antecedents of the application of analytics to measurement and
assessment issues in performance management. Hence, there is an absolute need for a detailed
study to find a suitable solution through the application of analytics to performance management.

Keywords: Analytics, Performance Management, Agile System of Talent Management.

88
CALCULATING TRUST SCORES ON FACEBOOK AND TWITTER
Manav Jain
Student
Prin. L. N. Welingkar Institute of Management Development and Research,Mumbai
Mumbai,India
manavjain76@gmail.com
Varenya Vikrant*
Student
Prin. L. N. Welingkar Institute of Management Development and Research,Mumbai
Mumbai,India
varenyavikrant@gmail.com
Shivam Deshpande*
Student
Prin. L. N. Welingkar Institute of Management Development and Research,Mumbai
Mumbai,India
shivam.v.deshpande@gmail.com
Abstract
BACKGROUND
All social media platforms allow the creation of profiles and pages. A number of
people follow a page, and by following it they gain the ability
to like and comment on posts. Organisations are setting up social media cells which work to
maintain their brand image. These social media cells can also be used to tarnish the brand value
of a competitor by continuously posting negative reviews. The public perception of brands
can be manipulated by creating deceptive new trends with the help of paid social media cells.

THE NEED

Social media cells are becoming an effective way of tarnishing the brand value of organisations
and individuals. So, identification of these troll accounts is extremely important to get genuine
user feedback and the real perception of the organisation or individual in the minds of
people.

AIM
To identify fake profiles and paid trolls on social media platforms using trust scores.

METHOD

We propose a method to differentiate between genuine comments and fake/paid comments on social
media accounts by determining the trust level of the commenter. The method consolidates all
the operative information of a user who is on at least Facebook or Twitter and assigns a weight
to each field of the consolidated information. An aggregate score is then calculated for each
piece of information, the weighted average of these aggregate scores is computed, and the
trust score is derived from this weighted average. The trust score can be used
to block unwanted users/paid trolls and stop them from
commenting on the page. This will keep social media platforms honest, and genuine
comments will help improve the brand value of organisations and individuals.
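
A sketch of the weighted-average computation, with illustrative field weights and profile scores:

def trust_score(field_scores: dict, weights: dict) -> float:
    """Weighted average of per-field scores, each in [0, 1]."""
    total_w = sum(weights[f] for f in field_scores)
    return sum(field_scores[f] * weights[f] for f in field_scores) / total_w

# Field weights and the example profile below are assumptions for illustration
weights = {"account_age": 0.3, "friend_overlap": 0.25,
           "posting_regularity": 0.2, "profile_completeness": 0.25}
profile = {"account_age": 0.9, "friend_overlap": 0.7,
           "posting_regularity": 0.2, "profile_completeness": 0.8}

print(f"trust score: {trust_score(profile, weights):.2f}")
# commenters scoring below a chosen cutoff could then be blocked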

EXPECTED FINDING
The method is expected to be very helpful in detecting fake profiles and paid trolls, and should
significantly help restore public trust in social media.

Keywords: Sentiment Analysis, Trust Score Calculation, Troll Detection, Weighted Average,
Social Media

90
SDC-MINER: AN ASSOCIATION RULE MINING ALGORITHM FOR
CROWDMINING OF UNCERTAIN DATA USING APACHE SPARK

Sanjay Rathee and Arti Kashyap


School of Computing and Electrical Engineering
Indian Institute of Technology
Mandi 175001, INDIA
E-mail: sanjay_rathee@students.iitmandi.ac.in and arti@iitmandi.ac.in

Abstract
In the current data-driven world, large datasets having relevant information are
priceless for industries. One of the most important large datasets is crowd dataset. A crowd
dataset is one having likeness (or unlikeness) of people about a movie or leader or product etc.
These datasets can be analyzed to understand crowd behavior so that we can use it for making
business strategies.
Extraction of valuable information from these extensive datasets is one of the most important
research problems. Association rule mining is one of the best methods to extract interesting
patterns or information from such large datasets, and finding possible associations between
items in large transaction-based datasets (finding frequent patterns) is its most important part.
Most association rule mining algorithms work on precise transactional data, where any
attribute is either present or absent. But crowd datasets generally have values within a fixed
interval for every attribute (uncertain data); for example, a person can give a movie a rating
between 0 and 5. Therefore, we propose a new distributed association rule mining algorithm,
called SDC-MINER, to analyse transactional datasets having uncertain values. SDC-MINER
uses a distributed algorithm to find frequent patterns and is capable of handling huge datasets
very efficiently. It is implemented on Apache Spark, which provides a highly distributed
in-memory computing environment. SDC-MINER is used to analyze various modified real
and synthetic crowd datasets to understand people's behavior. We conduct in-depth
experiments to gain insight into the effectiveness, efficiency, and scalability of SDC-MINER.
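For readers unfamiliar with mining uncertain data, the following PySpark sketch computes the expected support of an itemset when ratings are normalized to existence probabilities. This is a generic expected-support calculation under an independence assumption, not the SDC-MINER algorithm itself, and the three transactions are invented:

```python
# pip install pyspark -- a generic expected-support sketch for uncertain
# transactions on Spark; it illustrates interval-valued attributes (ratings
# normalized to probabilities), not the SDC-MINER algorithm itself.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("expected-support").getOrCreate()
sc = spark.sparkContext

# each transaction maps items to an existence probability, e.g. rating / 5.0
transactions = sc.parallelize([
    {"m1": 0.8, "m2": 0.6},
    {"m1": 1.0, "m3": 0.4},
    {"m1": 0.2, "m2": 0.9, "m3": 0.6},
])

def expected_contribution(txn, itemset):
    p = 1.0
    for item in itemset:          # independence assumption across items
        p *= txn.get(item, 0.0)
    return p

itemset = frozenset(["m1", "m2"])
exp_support = transactions.map(lambda t: expected_contribution(t, itemset)).sum()
print(exp_support)                # 0.8*0.6 + 0 + 0.2*0.9 = 0.66
```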

Available: https://github.com/sanjaysinghrathi/SDC-Miner

Keywords: MapReduce; Spark; Hadoop; Uncertain Data; Crowd Mining; Frequent Itemset
Mining

ANALYSIS OF ADVERTISING MEDIA IN TERMS OF
EFFECTIVENESS AND TRUSTWORTHINESS (CUSTOMER
PERCEPTION)
Sakshi Saxena 1
sakshi@scmhrd.edu
Research Assistant,
Symbiosis Centre for Management and Human Resource Development, Pune
Tejas Andhare2
tejas_andhare@scmhrd.edu
Post Graduate Diploma in Business Analytics,
Symbiosis Centre for Management and Human Resource Development, Pune

Abstract

The ‘Make in India’ initiative by Prime Minister Mr. Narendra Modi boosted entrepreneurship
in India. The campaign encouraged visionaries from all sectors to come forward and pursue
their lifelong dream of becoming job creators. With this government aid, the campaign
received a tremendous response, with the registration of several small businesses and
start-ups providing innovative products and services. Of all those startups and small
businesses, very few sustained themselves and managed to reach out to their customers.
Despite having quality products, the majority are still struggling to make people aware of their
products and services. This reflects both a small advertising budget and a lack of awareness
of effective yet cheaper advertising media. Many new ways of advertising have emerged. The
current research studies two modern advertising media and their effectiveness: mobile
advertising and guerrilla advertising. A significant part of the study analyses customer
perception of these modern advertising media in terms of effectiveness through primary
research. Ads coming from various sources enjoy different levels of trustworthiness among
people, so the research additionally discusses the trustworthiness of the chosen advertising
media after they fulfil the effectiveness criteria. Altogether, it tries to help these startups and
small businesses with their advertising strategy.

1Primary author, corresponding author


2Co-Author

FACTOR ANALYSIS: THE MOST RELEVANT TOOL IN
EXPLORATORY RESEARCH

V THILAKAM NAGARAJ*
Associate Professor
PSG Institute of Management
Coimbatore, India
thilagam@psgim.ac.in
S R VIGNESHWARAN
Research Scholar
Indian Institute of Science
Bangalore, India

Abstract

When a researcher, in their enthusiasm to cover all aspects of the subject under study, asks
many questions, the result is a huge list of variables. Though this makes the study interesting,
especially in management research where the researcher keeps gaining new insights, the
analysis becomes lengthy and can therefore discourage the reader.
Such a situation arose while studying the interesting topic of why, when and how the
entrepreneurs of Coimbatore chose entrepreneurship as a career. There were many questions
to be asked, and they had not been tested earlier. Factor analysis was used in this situation and
also helped strengthen the findings. This study aims to bring out “Evidence based research of
the superiority of factor analysis as the most relevant tool for exploratory studies”. This paper
studies how factor analysis, used in similar studies conducted at two different time periods,
has furnished similar results and therefore becomes a reliable data analytic tool which can be
used in all types of exploratory research when the number of variables is very high and
reduction is required, first to avoid repetition and second to eliminate unnecessary variables
for better results.

Keywords: Factor analysis; exploratory research; evidence based

NEW APPROACHES OF RESUME SECTIONING FOR AUTOMATING
TALENT ACQUISITION

Mahek Shah Girish K. Palshikar Rajiv Srivastava


TCS Research, Pune TCS Research, Pune TCS Research
shah.mahek@tcs.com gk.palshikar@tcs.com rajiv.srivastava@tcs.com

Tata Research Development and Design Centre,


54 B Hadapsar Industrial Estate, Pune 411013
+91-20-66086333

Abstract

A resume is the vital source of information about a candidate's professional career, and the
Talent Acquisition (TA) group of any organization uses it as the primary input to the selection
process. Resume documents are semi-structured, highly personalized, vary in writing style,
depend on the candidate's language skills, and so on. Generally, TA personnel manually
examine and short-list candidates based on the requirement and their limited technical
understanding. Such a manual shortlisting process is tedious, bias-prone, time-consuming,
non-standard and subjective. To overcome these shortcomings, techniques from text mining,
statistical modeling and machine learning are being used. To perform information extraction
or information retrieval accurately from documents such as resumes, identifying the logically
coherent sections is a critical first step. The identified sections limit the scope for identifying
specific entities and their relationships to a small set of sentences, improving efficiency. To
automate this task of sectioning in a resume document, we propose conditional random fields
(CRF), a popular probabilistic sequence prediction method, for identifying section boundaries.
The CRF-based method improves sectioning accuracy significantly over a baseline method
based on manually identified patterns. The accurate section identification has enabled a
system for automated candidate profile extraction and assessment.
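As a hedged, minimal sketch of the CRF sectioning idea (not the authors' feature set or label scheme), the following Python example uses the sklearn-crfsuite package to label resume lines with BIO-style section tags; the tiny training document and labels are invented:

```python
# pip install sklearn-crfsuite -- a hedged sketch of CRF resume sectioning:
# each resume is a sequence of lines, each line a feature dict, and labels are
# BIO-style section tags. train_docs/train_labels are hypothetical stand-ins.
import sklearn_crfsuite

def line_features(lines, i):
    line = lines[i].strip()
    return {
        "lower": line.lower()[:30],
        "is_upper": line.isupper(),          # headings are often ALL CAPS
        "ends_colon": line.endswith(":"),
        "n_tokens": len(line.split()),
        "prev_blank": i > 0 and not lines[i - 1].strip(),
    }

train_docs = [[
    "EDUCATION", "B.Tech, XYZ University, 2014",
    "EXPERIENCE", "Software Engineer, ABC Ltd (2014-2017)",
]]
train_labels = [["B-EDU", "I-EDU", "B-EXP", "I-EXP"]]

X = [[line_features(doc, i) for i in range(len(doc))] for doc in train_docs]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, train_labels)
print(crf.predict(X))
```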

SENTIMENT ANALYSIS OF TWITTER DATA FOR
DEMONETIZATION IN INDIA

Kaustav Roy
Data Scientist
Tata Consultancy Services
Kolkata, India
kaustav1.r@tcs.com
Debanjan Goswami
Data Analyst
Tata Consultancy Services
Kolkata, India
debanjan.goswami@tcs.com
Abstract
In recent years, a boom has been witnessed in the analysis of opinions from social media,
because it is usually very difficult to obtain such a high volume of opinions through other
normal means of collecting opinion, such as surveys and polls. Twitter is one such social
medium where opinions are expressed, and it exerts a significant influence on any
phenomenon. Hence, an accurate method for predicting sentiments could enable us to
understand the public view, and the impact of an event on the social and economic setup can
be analysed. Demonetization is a well-known phenomenon all over the world, used to refresh
a country's economy and push its financial growth upward for a better future. India, the
fastest growing economy, also confronts massive corruption and is severely victimized by
both internal and cross-border terrorism. To counter this, the Central Government of India
withdrew the two highest currency notes, of 500 INR and 1000 INR, as legally approved
entities for both online and offline cash transactions, effective from 8 November 2016
onwards. The declaration came on the eve of 8 November 2016 without any prior notice and
took the entire nation by surprise. The aim of this study is to perform text mining on social
media to examine the impact of this demonetization from public opinions, which in turn
could be used by the concerned authorities to smoothen the governmental process. The study
was performed using SAS Enterprise Miner and R, with the aim of achieving the following:
(1) to extract Twitter data related to the demonetization of India using various hashtags; (2) to
create refined clusters of descriptive terms corresponding to the various sentiments involved;
(3) to analyse the sentiments of the created clusters to find whether the demonetization act
has public support up to the date of data collection. The main focus of this paper is to analyse
the sentiments expressed on demonetization on Twitter so that public opinions and views are
extracted, analysed and used to understand the negative and positive impact of this act on the
people of India.
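The study itself used SAS Enterprise Miner and R; purely as a stand-in, the Python sketch below illustrates the same cluster-then-score workflow with TF-IDF features, k-means clusters and the VADER sentiment lexicon on a few invented tweets:

```python
# pip install scikit-learn vaderSentiment -- a Python stand-in for the
# cluster-then-score workflow (the study used SAS Enterprise Miner and R);
# the tweets below are invented examples, not collected data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

tweets = [
    "demonetization is a bold strike on black money",
    "great move, corruption will reduce",
    "standing in the ATM queue for three hours, terrible",
    "note ban has crushed small traders",
    "cashless india is the future",
    "no cash for daily wages, people are suffering",
]
X = TfidfVectorizer(stop_words="english").fit_transform(tweets)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

sia = SentimentIntensityAnalyzer()
for k in range(2):
    members = [t for t, c in zip(tweets, labels) if c == k]
    avg = sum(sia.polarity_scores(t)["compound"] for t in members) / max(len(members), 1)
    print(f"cluster {k}: avg sentiment {avg:+.2f}, size {len(members)}")
```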

Keywords: Twitter, Opinion Mining, Sentimental Analysis, Demonetization, SAS

A STUDY ON FEASIBILITY ANALYSIS OF INVESTMENT ON
COCONUT PLANTATION IN KARNATAKA

Vasantha Kumar A S
Research Scholar
Sikkim Manipal University
Gangtok, India
vasantha_as@rediffmail.com
Dr Kamala Suganthi Suresh*
Professor
Atria Institute of Technology
Bangaluru, India
kamalasuganthi@gmail.com
Abstract

Agriculture continues to be the backbone of the Indian economy, employing 54.6 percent of
the total workforce. Gambling on the monsoon has left coconut growers in distress as they
take up challenging production enterprises, and they have experienced many hardships with
respect to crop production for decades. Steep increases in input costs and the cost of
cultivation, pests and diseases, scanty rainfall, price fluctuations and the resulting scarcity of
ground water for coconut plantation are a few severe constraints faced by coconut growers
over the years. As a result, the cultivated area is shrinking even for existing plantations, and
the expansion of new area under coconut has almost stagnated in recent years. In view of the
above constraints, the study has been taken up in the major growing areas with the following
objectives: 1) to study the socio-economic status of coconut growers of the study area; 2) to
document the investment pattern; 3) to analyze the feasibility of investment; and 4) to identify
the constraints of coconut growers and suggest relevant measures. A simple random sampling
method would be adopted to enumerate the required primary data from growers in three
districts, namely Tumakuru, Hassan and Chikkamangalore, and secondary data from district
statistical offices. Project investment feasibility analysis techniques such as NPV, IRR, PBP
and the BC ratio would be employed to assess the project. Findings from the study, such as
socio-economic factors and their relation to the adoption of best managerial practices, and
growers' constraints ranging from farm loans to pests and diseases, would be worked out
using frequency distributions and graphical representation. A few relevant suggestions for
improvement would also be documented.

Keywords: Feasibility Analysis, Investment Pattern, Constraints, Coconut

SENTIMENT ANALYSIS ON DEMONETIZATION BY GOVT OF INDIA

Sachin Kakkookal
Student, PGDM-Business Analytics
REVA Academy for Corporate Excellence, REVA University
Bangalore, India
Sachink.ba02@reva.edu.in
Nidhil C. H
Student, PGDM-Business Analytics
REVA Academy for Corporate Excellence, REVA University
Bangalore, India
Nidhil.ba02@reva.edu.in
Abstract

Any system is subject to change, modification and amendment, and the same goes for
government policies. However, all such changes have a downside, where the common man
must face most of the repercussions. This paper analyses one such policy, which has been
trending on social media since November 2016. Since the announcement of demonetization,
the Indian economy has been fluctuating in terms of inflation and GDP rate, which has
affected several small-scale businesses and individuals drastically. This paper is aimed at
reviewing the general implications of demonetization on people. The research is based on
sentiment analysis, or opinion mining, using the Naïve Bayes classification algorithm. In
recent years, microblogging websites have evolved to become a source of varied kinds of
information; Twitter is one such tool. In this paper, the data for analysis is gathered from
Twitter, Facebook and other public forums, and sentiment analysis is then applied using
Natural Language Processing APIs. Sentiment analysis is a method of classifying sentiments
in a given text; it helps us understand how an entity has influenced the minds of the general
population, and is also important for knowing what the general public thinks about
demonetization. It is a computational study of the opinions, sentiments and emotions
expressed in text.
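As a minimal illustration of the Naïve Bayes classification step, here is a scikit-learn sketch; the tiny labeled training sample is invented, not the collected data:

```python
# pip install scikit-learn -- a minimal Naive Bayes sentiment sketch; the
# labeled sample below is invented for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = [
    "demonetization will curb black money, bold move",
    "great step against corruption",
    "long queues at banks, people suffering",
    "small businesses hit hard by note ban",
]
train_labels = ["positive", "positive", "negative", "negative"]

# bag-of-words features feeding a multinomial Naive Bayes classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)
print(model.predict(["cashless economy is the future"]))
```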

Keywords: Demonetization, Sentiment Analysis, Twitter, Opinion Mining, Natural Language
Processing, Naïve Bayes Algorithm

QUALITY MANAGEMENT IN POWDER COATING PROCESS: A SIX
SIGMA APPROACH

Anshu Gupta*
Assistant Professor
School of Business, Public Policy and Social Entrepreneurship, Ambedkar University
Delhi, India
anshu@aud.ac.in
Pallavi Sharma
Research Scholar
Department of Statistics, M.D. University
Rohtak, Haryana, India
pallavisharma.03@gmail.com
S. C. Malik
Professor
Department of Statistics, M.D. University
Rohtak, Haryana, India
sc_malik@rediffmail.com
P. C. Jha
Professor
Department of Operational Research, University of Delhi
Delhi, India
jhapc@yahoo.com

Abstract
The purpose of this study is to improve the productivity of the powder coating process of an
SME manufacturing unit by minimizing process rejections. The study applies the DMAIC Six
Sigma methodology for process quality improvement. It starts with the identification of
defects leading to rejections and the measurement of the baseline process sigma level. Key
process defects are identified using Pareto analysis. Using the Ishikawa diagram and Current
Reality Tree tools, the generic and root causes of the problem are unveiled. Suboptimal
setting of the powder spray process parameters is identified as a root cause. Adopting
Taguchi's design of experiments methodology, optimal spray process parameters are
established, verified and implemented. The performance of the improved process, measured
through sampling after stabilising process operation, exhibited an improvement in the sigma
level of the process.
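As a small illustration of the baseline sigma measurement mentioned above, the sketch below converts hypothetical defect counts into DPMO and a short-term sigma level using the conventional 1.5-sigma shift:

```python
# A small sketch of the baseline-sigma computation: observed defect counts ->
# DPMO -> short-term sigma level (with the conventional 1.5-sigma shift).
# The counts below are hypothetical, not the case-study data.
from scipy.stats import norm

def sigma_level(defects, units, opportunities_per_unit):
    dpmo = defects / (units * opportunities_per_unit) * 1_000_000
    return norm.ppf(1 - dpmo / 1_000_000) + 1.5, dpmo

level, dpmo = sigma_level(defects=185, units=1000, opportunities_per_unit=5)
print(f"DPMO = {dpmo:.0f}, sigma level = {level:.2f}")
```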

Keywords: Six Sigma, DMAIC, Ishikawa Diagram, Taguchi Design of Experiments.

PRESCRIPTIVE

BANKING, FINANCIAL SERVICES AND
INSURANCE
(BFSI)

ELUCIDATION OF THE DYNAMICS OF CROSS-MARKET
CLUSTERING AND CONNECTEDNESS IN ASIAN REGION: AN MST
AND HIERARCHICAL CLUSTERING APPROACH

Biplab Bhattacharjee1, Rounak Singh1, Muhammad Shafi1, Animesh Acharjee1,2,3

1School of Management Studies, National Institute of Technology, Calicut, Kerala, India

2Department of Biochemistry, Sanger Building, University of Cambridge, 80 Tennis Court


Road, Cambridge CB2 1GA, United Kingdom

3Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of
Birmingham, UK

* Corresponding author E-mail: biplabbhattacharjee2010@gmail.com

Abstract
Regional market connectedness and clustering patterns are key deciding aspects in the design
of internationally diversified portfolios. Over the past decades, global investors have taken a
keen interest in investable portfolios targeting emerging markets in Asia. In this context, an
enquiry into the dynamics of Asian cross-market connectivity structures and cluster formation
patterns becomes essential for designing an optimally allocated, regionally diversified
portfolio. We analyse the cross-correlation structures of the market return data of 14 Asian
market indices. The time frame for this study is fourteen years, ranging between 2002 and
2016. We employed a rolling-window approach that produced 151 temporally varying
observations and generated the Minimum Spanning Tree (MST) and Average Linkage based
Hierarchical Clustering plots for each of these observations. Further, we visualize the
connectivity structures in these plots to decipher the formation of clusters, hub nodes and
their associated linkage structures. To identify sets of Asian markets possessing close linkages
to the Indian market index, we deployed a weighted hop-count method and computed the
corresponding scores. Additionally, we examined the change in connectivity and cluster
structures during the 2008 financial crisis. We also calculate significant network parameters,
characterize the dynamically varying network topology using those parameters, and especially
examine these measures during phases of market stress. Finally, we perform Markowitz
optimization on a selected set of Asian indices (the tangling ends of the MST plot of the
observation) for a sample observation and compute the degree of diversification.
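A minimal Python sketch of the MST and average-linkage pipeline follows, using the standard correlation-to-distance transform d = sqrt(2(1 - rho)); the random return matrix merely stands in for the 14 index series:

```python
# A minimal sketch of the MST + average-linkage clustering pipeline on index
# returns, using the standard distance d = sqrt(2 * (1 - rho)); the random
# matrix below stands in for the 14 Asian index return series.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
returns = rng.normal(size=(250, 14))      # 250 days x 14 indices (hypothetical)
rho = np.corrcoef(returns, rowvar=False)
dist = np.sqrt(2.0 * (1.0 - rho))
np.fill_diagonal(dist, 0.0)

mst = minimum_spanning_tree(dist)         # sparse matrix of the MST edges
Z = linkage(squareform(dist, checks=False), method="average")
print("MST edges:", mst.nnz)              # 13 edges connect the 14 indices
print(Z[:3])                              # first merges of the dendrogram
```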

Keywords: Network filtering, Degree of diversification, Applied Graph Theory, Financial
network analysis, MST, Hierarchical Clustering, Markowitz optimization

HETEROGENEITY IN THE RESOLUTION OF BANK FAILURES: A
LATENT CLASS APPROACH

Padma Sharma

Abstract
This paper investigates the resolution of failed banks by the FDIC and uncovers a dichotomy
in the manner in which cases were administered when bank failures were pervasive as against
when they were sporadic. Banks that failed subsequent to unfavourable local economic
conditions had a higher median probability of receiving financial assistance from the FDIC
compared to those that failed in a relatively more favorable economic climate. The response of
the FDIC to bank-level attributes is found to be substantially stronger in the former category
of banks relative to the latter group. While these findings corroborate the conclusions of
Acharya and Yorulmazer (2007b) regarding the onset of a too-many-to-fail effect, they also
bring to light enhanced decision-making processes at the FDIC in the resolution of failed banks
during times of industry-wide distress. I develop a novel Bayesian procedure to estimate latent
class models with ordinal responses to detect unobserved heterogeneity in bank resolution. I
analyze all failures that occurred during 1984-1992 among US banks insured by the FDIC, as
the regulatory landscape and the spate of regional bank crises during this period created
conditions in which crisis-driven and idiosyncratic bank failures occurred contemporaneously.

Keywords: Bayesian latent class, Hierarchical Bayesian model, Efficient MCMC sampling,
Ordinal response, Banking crises

RETAIL

CRITERIA CLASSIFICATION FOR COST AND QUALITY
ASSESSMENT OF SUPPLIERS

Aditi*
Research Scholar
University of Delhi
Delhi, India
aditibajpai.du.or.16@gmail.com

Jyoti Dhingra Darbari


Research Scholar
University of Delhi
Delhi, India
jydbr@hotmail.com

Arshia Kaul
Research Scholar
University of Delhi
Delhi, India
Arshia.kaul@gmail.com

P.C.Jha
Professor
University of Delhi
Delhi, India
jhapc@yahoo.com

While bringing products into the market, firms need to develop efficient and effective
strategies. Multiple functions of a firm need to be integrated to place products in the current
competitive market; one such integration is of the supply chain and quality functions.
However, the integration can have a double-pronged impact on achieving the goals of the
firm and hence must be evaluated appropriately. To begin with, one of the most competitive
strategies for achieving the integration is associating with suppliers who are mutually inclined
towards the goal of quality enhancement. The current study empirically examines the criteria
for the selection of suppliers and the extent to which these criteria impact business
performance. Although ‘Cost’ has always been the driving factor in supplier selection, many
companies now promote ‘Quality’ as central to achieving customer value and a critical
success factor in a competitive environment. Therefore, from the business perspective, the
evaluation and selection process for choosing suppliers who can improve the quality of the
supply chain (SC) must also take into account the various costs associated with each criterion
of evaluation and how the desirable quality level can be achieved at minimum cost. Hence,
measuring and reporting the cost of quality should be considered an important issue for
managers while selecting suppliers. Within this context, the main aim of the study is to
evaluate the criteria for the selection of suppliers and understand their cost and quality
implications for the supply chain. First, the criteria of evaluation are identified and their
weights of importance are calculated with the objectives of minimising associated cost and
maximising quality level; the Analytical Hierarchy Process (AHP) is utilised for this purpose.
Thereafter, the criteria are classified into four groups: Quality, Critical, Complementary and
Costly. The classification helps in understanding how the various criteria impact the cost as
well as the quality parameters of the supplier selection process. The proposed methodology is
validated through a case study of a firm supplying a strategic product to the market.
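As a generic illustration of the AHP weighting step (the 4x4 pairwise-comparison matrix below is invented, not the case-study data), the sketch derives criteria weights from the principal eigenvector and checks the consistency ratio:

```python
# A generic AHP sketch: principal-eigenvector weights plus consistency ratio
# for four supplier-evaluation criteria; the pairwise matrix is hypothetical.
import numpy as np

A = np.array([[1,   3,   5,   2],
              [1/3, 1,   3,   1/2],
              [1/5, 1/3, 1,   1/4],
              [1/2, 2,   4,   1]], dtype=float)

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                              # normalized criteria weights

n = A.shape[0]
ci = (eigvals.real[k] - n) / (n - 1)      # consistency index
ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]       # Saaty's random index
print("weights:", w.round(3), "CR:", round(ci / ri, 3))
```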

Keywords: Supply Chain, Cost of Quality, AHP, Supplier Evaluation

A DEA APPROACH TO EVALUATE THE EFFICIENCY OF
RETAILERS

Nomita Pachar*
Research Scholar
Department of Operational Research, University of Delhi
Delhi, India
nomita.or.du@gmail.com

Anshu Gupta
Assistant Professor
School of Business, Public Policy and Social Entrepreneurship, Ambedkar University
Delhi, India
anshu@aud.ac.in

Jyoti Dhingra Darbari


Research Scholar
Department of Operational Research, University of Delhi
Delhi, India
jydbr@hotmail.com

P. C. Jha
Professor
Department of Operational Research, University of Delhi
Delhi, India
jhapc@yahoo.com

Due to increasing customer orientation and market competition, the demand for creating better
operational efficiencies in the supply chain (SC) is a rising concern. The overall aim of any
firm is to optimize efficiency at every stage of the SC. Retailers play a pivotal role in creating
a link between the upstream stages and end customers. Thus, it is important for retailers to
extend their capabilities in such a way that the SC can leverage their efficiency to create
higher value and competitive advantage. The first step in this direction is to analyse and
measure the overall efficiency of retailers; in the literature, research in this direction is very
limited. This study presents a method for evaluating the performance of retailers considering
economic, environmental as well as social parameters. The data envelopment analysis (DEA)
approach is used to evaluate the efficiencies of retailers, classifying the parameters as inputs
and outputs. The DEA approach is adopted as it not only measures and compares the
efficiencies of the decision making units (DMUs) but also identifies the dimensions for
improvement of individual DMUs. The validity of the study is illustrated with a case study of
an electronics retail supply chain. The implications drawn from the results can enable retailers
to develop strategies for enhancing their performance efficiencies.
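For concreteness, here is a hedged sketch of the input-oriented CCR multiplier model solved as a linear program with SciPy; the retailer input/output data are invented, and the study's actual parameters may differ:

```python
# A sketch of the input-oriented CCR multiplier model, one LP per retailer:
# maximize u'y_o subject to v'x_o = 1 and u'y_j - v'x_j <= 0 for all DMUs j.
# X (inputs) and Y (outputs) below are hypothetical retailer data.
import numpy as np
from scipy.optimize import linprog

X = np.array([[20., 300.], [15., 200.], [30., 400.], [18., 260.]])  # m=2 inputs
Y = np.array([[100.], [80.], [120.], [95.]])                        # s=1 output
n, m = X.shape
s = Y.shape[1]

for o in range(n):
    c = np.concatenate([-Y[o], np.zeros(m)])            # maximize u'y_o
    A_ub = np.hstack([Y, -X])                           # u'y_j - v'x_j <= 0
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n),
                  A_eq=np.concatenate([np.zeros(s), X[o]]).reshape(1, -1),
                  b_eq=[1.0])                           # normalisation v'x_o = 1
    print(f"DMU {o}: efficiency = {-res.fun:.3f}")
```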

Keywords: Retailers, Efficiency, DEA.

SUPPLY-DEMAND DRIVEN OPTIMAL PRODUCTION OF GREEN
PRODUCT VARIANTS

Akansha Jain*
Research Scholar
Department of Operational Research, University of Delhi
Delhi, India
akansha.269@gmail.com

Arshia Kaul
Research Scholar
Department of Operational Research, University of Delhi
Delhi, India
arshia.kaul@gmail.com

Jyoti Dhingra Darbari


Research Scholar
Department of Operational Research, University of Delhi
Delhi, India
jydbr@hotmail.com

P. C. Jha
Professor
Department of Operational Research, University of Delhi
Delhi, India
jhapc@yahoo.com

Growing consciousness of customers towards the environment has led to an increase in the
demand for green products in the consumer market. It is thus essential for a manufacturing
firm to cater to this changing trend and deliver products that match the new demands. The
marketing function of the firm determines customer demand, helping supply chain managers
understand what and how much they must deliver. Besides redesigning the supply chain to
orient towards customer needs, regulations imposed by the government on manufacturers
also make it imperative for them to produce products which are environmentally viable. Thus,
firms must redesign their supply chain network in integration with the marketing function in
order to satisfy changing customer demands as well as adhere to regulations, and the redesign
must be such that the firm achieves long-term profitability. Within this context, we propose a
mathematical model which addresses these concerns of the manufacturer based on the
emerging demand for greenness. The model determines the number of green and non-green
variants to be manufactured, with the objectives of maximizing profit and enhancing the
green score. The mixed integer programming formulation explores the viable options for
green manufacturing based on the demand of different customer segments and selects
modules while satisfying additional supply chain constraints. The mathematical model is
coded using the optimization software LINGO 11.0 and solved under demand, capacity and
other system constraints. A realistic example problem is constructed to demonstrate the
applicability of the proposed model.

The findings of the study can assist the manufacturer in determining the market share of green
customers and in retaining green customers by producing product variants according to their
expectations. As the proposed model is entirely driven by demand, which is assumed to be
known, future studies can treat demand as fuzzy so as to incorporate ambiguity. Additionally,
to ensure an increase in the demand for green products, the objective of maximizing the total
market share of green products can also be considered along with the objective of increasing
profit margin.
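The model itself was coded in LINGO 11.0; purely as an illustration of the variant-selection structure, the toy PuLP sketch below chooses integer quantities of hypothetical green and non-green variants to maximize profit plus a weighted green score under demand and capacity limits:

```python
# pip install pulp -- a toy mixed-integer sketch of the variant-selection idea;
# all variant names, profits, scores, demands and capacities are hypothetical,
# not the authors' LINGO model.
import pulp

variants = ["green_A", "green_B", "nongreen_A"]
profit = {"green_A": 40, "green_B": 55, "nongreen_A": 35}
green_score = {"green_A": 8, "green_B": 10, "nongreen_A": 0}
demand = {"green_A": 500, "green_B": 300, "nongreen_A": 800}
capacity, w = 1000, 2.0                      # w trades green score vs profit

prob = pulp.LpProblem("green_variants", pulp.LpMaximize)
q = {v: pulp.LpVariable(f"q_{v}", lowBound=0, cat="Integer") for v in variants}
prob += pulp.lpSum((profit[v] + w * green_score[v]) * q[v] for v in variants)
prob += pulp.lpSum(q.values()) <= capacity   # shared production capacity
for v in variants:
    prob += q[v] <= demand[v]                # segment demand limits
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({v: int(q[v].value()) for v in variants})
```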

Keywords: SCM, Marketing, Optimization, Green, Product Variants.

STITCHING PROCESS IN THE APPAREL INDUSTRY: FUZZY
DMAIC

Reena Nupur*
Research Scholar
Department of Applied Mathematics, Gautam Buddha University
Greater Noida, India
reenanupur1981@gmail.com

ArshiaKaul
Research Scholar
Department of Operational Research, University of Delhi
Delhi, India
arshia.kaul@gmail.com

Sushil Kumar
Assistant Professor
Department of Applied Mathematics, Gautam Buddha University
Greater Noida, India
sushil12@gmail.com

P.C. Jha
Professor
Department of Operational Research, University of Delhi
Delhi, India
jhapc@yahoo.com

Present economic conditions such as global competition, reduced profit margins, customer
demand for high-quality products, product variety and reduced lead times have a major
impact on manufacturing industries. The increased demand for higher quality at lower prices
has made it imperative for manufacturing industries to develop strategies such that customers
are not lost to competition. In this research we consider the case of the production process in
the apparel industry, where reworks are quite common and lead to an unnecessary increase in
production cost. It is thus the need of the hour for an apparel manufacturer to identify,
quantify and eliminate sources of variation with well-executed quality control plans. Six
Sigma is a well-structured, data-driven process improvement methodology used by many
industries and academicians to improve the quality of processes, products and services. In our
study we implement Six Sigma through the Define-Measure-Analyse-Improve-Control
(DMAIC) approach. The proposed integrated framework applies Six Sigma tools and metrics
such as check sheets, cause-and-effect diagrams, Pareto charts, defects per million
opportunities (DPMO) and sigma quality level under DMAIC to understand the problems and
status of the stitching process and to reduce the defects occurring in the final product. Further,
statistical process control (SPC) is applied from time to time in the DMAIC approach to
assess the stability of the ongoing process. In addition to the Six Sigma metrics and SPC, the
novelty of the paper lies in the use of fuzzy process capability analysis (FPCA) and the fuzzy
analytic hierarchy process (FAHP) in the measure and analyze phases of DMAIC to improve
the Six Sigma metrics. FPCA is used to deal with uncertainty while measuring the capability
of the process, and FAHP is applied to prioritize the critical causes of defect types among all
possible causes listed in the fishbone diagram to make the final decision. An empirical
real-life study of the stitching process in the apparel industry is presented to show the
application of the proposed methodology in real-life situations.

Keywords: Six Sigma, DMAIC, Fuzzy AHP, Fuzzy PCA, Apparel industry

PERFORMANCE EVALUATION OF SUSTAINABLE INNOVATION
PRACTICES USING BEST WORST METHOD

Jyoti Dhingra Darbari*


Research Scholar
Department of Operational Research,
University of Delhi,
Delhi, India
jydbr@hotmail.com

Rashi Sharma
Research Scholar
Department of Operational Research
University of Delhi,
Delhi, India
rashilakhanpal9@gmail.com

Garima Agrawal
Research Scholar
Department of Operational Research
University of Delhi,
Delhi, India
garimagrawal9@gmail.com

P.C.Jha
Professor
Department of Operational Research
University of Delhi,
Delhi, India
jhapc@yahoo.com

The study proposes an analytical review of the integrated sustainable policy framework
adopted by a flour mill company, Delhi Flour Mills Ltd. The company has the tough task of
striking a balance between complying with the sustainability regulations enforced by the
government and the pressure exerted by stakeholders for a productive supply chain. This has
led to the evolution of several sustainability innovation practices (SIPs), such as ‘Increasing
sustainability awareness’, ‘Adoption of pollution reduction measures’, ‘Adoption of water
conservation measures’, ‘Recyclable packaging’, ‘Use of energy efficient equipment’,
‘Sustainable employment practices’, ‘Appropriate quality measures’, ‘Introduction of
sustainable food safety measures’ and ‘Rewards for sustainable supply’. The introduction of
these practices has influenced decisions within the strategic, operational and tactical domains
of the company. As a consequence, the challenge which now lies ahead for the company is to
investigate and identify the SIPs that have had a meaningful impact on its sustainable
performance. In view of this, the existing policies can be modified subject to the constraints
of the cost-efficiency of the ongoing SIPs and their sustainable productivity. Performance
evaluation of the SIPs is done in terms of whether there has been a significant reduction of
total energy use and greenhouse emissions and an enhancement of social well-being within
the budgetary constraints of the company. The objective of the present study is to develop a
decision making framework for assessing the performance level of the SIPs, keeping in mind
the stakeholders' interest, and to arrive at key decisions for broadening the spectrum of
sustainability. Since sustainability is a multi-criteria concept, a recently developed
multi-criteria method called the Best-Worst Method is used to assess the relevance of each
SIP relative to the others and within the context of overall sustainability. To evaluate and
improve the current practices and processes, the decision makers (DMs) are asked to identify
the best (most desirable) and the worst (least desirable) SIPs, followed by pairwise vector
comparisons of the best and worst SIPs with the others. The weights of the SIPs are generated
by solving a max-min model. Then, the weights of economic, environmental and social
performance are obtained using the same procedure. Finally, the significance of the SIPs is
obtained in the aggregation phase in terms of their importance to the organizational
sustainability of the supply chain. The consistency ratio is also checked for the reliability of
the comparisons made.
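For readers unfamiliar with the method, the sketch below solves a linearized form of the Best-Worst max-min model (minimizing the maximum deviation xi) with SciPy; the best-to-others and others-to-worst comparison vectors are hypothetical, not the company's elicited judgments:

```python
# A sketch of the linearized Best-Worst Method: minimize xi subject to
# |w_B - a_Bj * w_j| <= xi, |w_j - a_jW * w_W| <= xi and sum(w) = 1.
# The comparison vectors below are hypothetical.
import numpy as np
from scipy.optimize import linprog

a_B = np.array([1., 2., 4., 8.])   # best-to-others (criterion 0 is the best)
a_W = np.array([8., 4., 2., 1.])   # others-to-worst (criterion 3 is the worst)
n, B, W = 4, 0, 3

c = np.zeros(n + 1); c[-1] = 1.0   # variables: w_0..w_3, xi; minimize xi
A_ub, b_ub = [], []
for j in range(n):
    for sign in (1.0, -1.0):
        row = np.zeros(n + 1)      # sign*(w_B - a_Bj * w_j) - xi <= 0
        row[B] += sign; row[j] -= sign * a_B[j]; row[-1] = -1.0
        A_ub.append(row); b_ub.append(0.0)
        row = np.zeros(n + 1)      # sign*(w_j - a_jW * w_W) - xi <= 0
        row[j] += sign; row[W] -= sign * a_W[j]; row[-1] = -1.0
        A_ub.append(row); b_ub.append(0.0)

res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
              A_eq=[np.append(np.ones(n), 0.0)], b_eq=[1.0])
print("weights:", res.x[:n].round(3), "xi*:", round(res.x[-1], 4))
```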
The final inference obtained is that ‘Increasing sustainability awareness’ is the most important
sustainability initiative, with maximum social as well as environmental impact at minimum
economic input. The outcomes of this study will help industry managers, decision-makers and
practitioners decide where to channelize their resources during the next stage of
implementation of SIPs, with the objective of enhancing sustainability in their supply chain
and moving towards sustainable development.

Keywords: Sustainable Innovation Practices, Supply Chain management, Delhi Flour Mills,
Best-Worst Method

ONLINE ORDER FULFILMENT: APPROACH TO MAXIMIZE
RESOURCE UTILIZATION

Dharmender Yadav1 Avneet Saxena2


1,2 TCS, Think Campus, Electronic City, Phase-2, Bangalore
1 dharmender.yadav@tcs.com, +91-7620121274; 2 avneet.saxena@tcs.com, +91-9742823027

Abstract

Online grocery shopping has been gaining importance and popularity in recent times due to
the evolution of fast internet and online shopping applications. This is especially observed in
urban areas, where consumers either order online for home delivery or order online and pick
up at a nearby retail store. This online demand is expected to grow fivefold in the next ten
years. Online ordering not only saves valuable time but also allows customers to avail more
discounts. Besides the many advantages of the online process, there are several challenges for
retailers to manage: for instance, online ordering requires an easy ordering portal, efficient
picking of SKUs, maximal resource utilization in the additional picking process, and packing
and delivery to the end customer location without wastage. The major challenge for retailers
is to fulfill demand at minimum cost, including transportation, manpower and operational
costs. The process from customer order placement through order picking to delivery raises
many major questions, for example: from which retail store or warehouse should material be
picked, how can items be picked in minimum time with minimum resources, how can the
picking material and buckets required be minimized, and how can the number of vehicles
used be minimized. In this paper, our focus is on maximizing the utilization of buckets by
allocating items to buckets in such a way that the minimum number of buckets is used. We
recommend a mixed integer mathematical optimization model to achieve this objective. We
have also compared our results with traditional solutions such as the next-fit heuristic. A case
example is highlighted to demonstrate the benefit of our approach.
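As a point of reference for the next-fit comparison, the sketch below contrasts the next-fit heuristic with a simple first-fit-decreasing packing on invented item volumes; the paper's MILP model itself is not reproduced here:

```python
# A small sketch of the bucket-allocation (bin-packing) idea: the next-fit
# heuristic versus first-fit-decreasing; item volumes are hypothetical.
def next_fit(volumes, capacity):
    buckets, current = [], 0.0
    for v in volumes:
        if current + v > capacity:       # open a new bucket when the item
            buckets.append(current)      # no longer fits in the current one
            current = 0.0
        current += v
    buckets.append(current)
    return len(buckets)

def first_fit_decreasing(volumes, capacity):
    buckets = []
    for v in sorted(volumes, reverse=True):
        for i, load in enumerate(buckets):
            if load + v <= capacity:     # place into the first bucket that fits
                buckets[i] += v
                break
        else:
            buckets.append(v)
    return len(buckets)

items = [0.6, 0.5, 0.5, 0.4, 0.4, 0.3, 0.3]
print(next_fit(items, 1.0), first_fit_decreasing(items, 1.0))   # 4 vs 3 buckets
```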

Keywords: Bucket, Online grocery, Mathematical optimization model

MARKET ENTRY STRATEGY FOR LAUNCHING A NEW DRUG

Nagaraju Utukuri
Manager
Genpact
Bangalore, India
nagaraju.utukuri@genpact.com
Subhash Ajmani
Assistant Vice President
Genpact
Bangalore, India
subhash.ajmani@genpact.com
S. Ravi Kumar
Senior Manager
Genpact
Bangalore, India
ravi.kumars@genpact.com

Abstract

In a world of increasing pressure on margins, growing competition and more targeted
launches, companies face the question of which markets to enter to maximize revenue and
reach. The number of Americans with CNS (Central Nervous System) diseases is growing at
an exponential rate year on year, as the proportion of the U.S. population aged 65 and older
continues to increase with the ageing of the baby boomer generation.
This paper discusses the different factors a pharmaceutical company should consider when
carving out a market entry strategy for a drug. Some of the key factors are market/competitor
assessment, identification of unmet needs, differentiation and positioning, identification of
Key Opinion Leaders, consideration of pre-launch factors such as buzz in social media and
which states to focus on, and quantification of the risks and benefits of a drug. States in the
U.S. are very heterogeneous; they are diverse in terms of openness to addressing some of the
CNS diseases. A thorough understanding of these differences across states, and carving out
the right market entry strategy that can accommodate this heterogeneity, is also critical to
maximizing the reach and revenue for the drug manufacturer.
The paper focuses on quantifying the influence score of physicians using the PageRank
algorithm to identify the Key Opinion Leaders to target during market launch, and on
quantifying the risk-benefit profile of a drug, which would help in the differentiation and
positioning of the drug and the prioritization of the states.
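A minimal sketch of the influence-scoring step, using the PageRank implementation in the networkx package on an invented physician network, follows; the edge list is hypothetical:

```python
# pip install networkx -- a minimal sketch of scoring physician influence with
# PageRank on a referral/interaction network; all edges are hypothetical.
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("dr_a", "dr_b"), ("dr_c", "dr_b"), ("dr_d", "dr_b"),  # many point to dr_b
    ("dr_b", "dr_e"), ("dr_e", "dr_a"), ("dr_d", "dr_e"),
])
scores = nx.pagerank(G, alpha=0.85)
kols = sorted(scores, key=scores.get, reverse=True)[:3]    # top influence scores
print(kols, {k: round(scores[k], 3) for k in kols})
```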

Keywords: Market entry strategy, Sales Forecasting, Key Opinion Leaders, Risk Benefit
Profile of a drug, State funding, State Openness, Policy Adaption, Epidemiology.

A CHANCE CONSTRAINT BASED LOW CARBON SUSTAINABLE
SUPPLY CHAIN CONFIGURATION FOR AN FMCG PRODUCT

Remica Aggarwal, S.P. Singh

Abstract

Realising the growing concern over the harmful aftermath of CO2e emissions for the survival
of species, organisations all over the world are striving hard to attain sustainability with a
clean and green environment. Green supply chain management and new product innovation
and diffusion have become quite popular recently and act as a rich source of competitive edge
for national and multinational companies as well as enterprises. However, research on supply
chain configuration for new products represents a comparatively new trend and needs to be
explored further. The following research proposes a conceptual framework and a
mathematical model to illustrate how an environmental concern, such as a strict carbon
emissions cap, can be integrated with an economic concern, such as NPV maximization, for a
new FMCG product introduction, capturing the uncertain demand phenomenon.
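As a one-line numeric illustration of the chance-constraint idea, the sketch below applies the standard deterministic equivalent for normally distributed demand, P(D <= Q) >= alpha, which reduces to Q >= mu + z_alpha * sigma; the demand parameters are invented:

```python
# The standard deterministic equivalent of a chance constraint on normal
# demand: P(D <= Q) >= alpha  <=>  Q >= mu + z_alpha * sigma.
# The demand parameters below are hypothetical, not the paper's data.
from scipy.stats import norm

mu, sigma, alpha = 10_000.0, 1_500.0, 0.95   # units of the new FMCG product
q_min = mu + norm.ppf(alpha) * sigma
print(f"minimum quantity meeting the {alpha:.0%} service level: {q_min:.0f}")
```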

Key terms: Carbon emissions; CO2e equivalents; Environmental Policy

RETAIL ANALYTICS: HARNESSING BIG DATA FOR GROWTH IN
RETAIL INDUSTRY

Dr. Deepika Saxena


Associate Professor
Jagan Institute of Management Studies, Rohini, Delhi-110085, India
Email: deepika.jims@gmail.com
Ms. C.Komalavalli
Associate Professor
Jagan Institute of Management Studies, Rohini, Delhi-110085, India
Email:komalalvalli@gmail.com
Ms. Chetna Laroiya
Assistant Professor
Jagan Institute of Management Studies, Rohini, Delhi-110085, India
Email:chetnalaroiya@gmail.com

Abstract

Big data is the new buzzword in today’s era which leads to a cultural shift in the manner
retailers connect themselves with their customers, retrieve data and analyse, plan and strategize.
Today’s customer is empowered and has varied needs and expectations with the retailers and
e-tailers, which are formed by experiences across the diverse commercial world. In the world
of e-commerce, customers use data and technology to take control of their shopping
experiences. As technology adoption and multi-channel shopping experiences become the part
and parcel of life of the customers, it is a big challenge for both retailers and e-tailers to retain
the customers by understanding their needs and the expectations. Customers are heterogeneous
in nature with respect to demographics, tastes and preferences, interests, behaviour, purchasing
power etc. Analysing the customer behaviour, mapping them to products, offering discounts,
planning marketing strategies are the major issues for the retailers. Big data analytics in retail
known as ‘retail analytics’ helps to find out the solution of this problem. Retailing is, today,
standing on a platform which is data driven and digitally controlled; data comes from various
sources such as social networking conversation, e-commerce transactions, etc. It’s high time
for the retailers to make use of analytics to understand such data and to strategize their action
plans for future growth. It is time to understand the uses, impact, importance and role of big
data for new business imperatives and leveraging it to transform the business processes,
organizations and gradually the whole industry. This paper focuses on the business
opportunities for the retailers using the retail analytics, and how its use can lead to the growth
of retail industry. The paper discusses the role of retail analytics and various retail strategic
areas such as price optimization, future performance and demand prediction, forecasting trends,
identifying right customers etc. which may lead to the multi-fold increase in the sales revenue.
This paper also discusses the opportunities and challenges in the retail analytics. The present
study is exploratory in nature and uses secondary sources of data collection. The present paper
will help the retailers to understand various facets of retail analytics and big data in improving
customer engagement and satisfaction, operational efficiency, supply chain management and

116
product innovation. This paper help understand the retailer that big data bridges the
technological divide between a traditional store and the level of integration consumers
desire; omni-channel retailing combined with emergent technology.

Keywords: Retail Analytics, Big Data, Decision Making, Sales Maximization, Business
Opportunity

FORMULATION & PILOT FOR INCREASING THE TCI IN
F&V CATEGORY IN HYPERMARKET

Ramani Kulkarni
Student - PGDM Operations
Welingkar Institute of Management Development and Research, Mumbai
ramani.klkrn@gmail.com

Dr. Kavita Kalyandurgmath


Professor - Operations
Welingkar Institute of Management Development and Research, Mumbai
kavita.kalyandurgmath@welingkar.org

Abstract

The Indian retail industry has been evolving continuously over the years, and food and
groceries contribute the largest market share (66.3% as per the IBEF report, June 2017) of all
products retailed in India. Over the years, retail stores have evolved from the local kirana
store to the hypermarket, currently the largest retail format in India. A hypermarket is a
one-stop destination for all the needs of a typical Indian household, and thus it is essential that
a hypermarket keeps an extremely wide product range across all categories.

A few of the main challenges that hypermarkets face due to the large number of SKUs are
shrinkage, the dumping of fruits and vegetables, and generating a higher Total Commercial
Income (TCI). This research paper discusses two important areas of concern to a
hypermarket:

1. To understand the factors responsible for generating TCI [Total Commercial Income]
2. To increase the efficiency of the indent tool and analyse historic data to identify SKUs
which are beneficial for increasing TCI
To ensure minimum dumping and shrinkage, it is highly important to do the indenting as
accurately as possible, and for an accurate indent, the sales forecast also has to be accurate.
The methods currently followed in many hypermarkets are raw and inaccurate, as they do not
follow a scientific approach. Ways of improving this process through better forecasting and a
proper indent tool, resulting in a higher TCI, are discussed below. The model uses analytics to
forecast sales, and indenting is done based on the sales predicted by the forecasting model.
Keywords: Total Commercial Income (TCI), Dump & shrinkage analysis, Stock Keeping
Unit (SKU), Indent, OLAP cube, Scan Margin, Rate of Sales, Perceptual Map, sales
contribution, realized sales, gross rupee margin, Cost of Goods Sold (COGS), Net sales,
Inventory, Offers, Consumer preferences

SERVICES

OPTIMAL ADVERTISEMENT PLACEMENT IN TELEVISION IN A
TIME WINDOW
Sugandha Aggarwal
Assistant Professor
Amity School of Business, Amity University,
NOIDA,U.P., India
sugandha_or@yahoo.com

Arshia Kaul*
Research Scholar
Department of Operational Research, University of Delhi
Delhi, India
arshia.kaul@gmail.com

P.C. Jha
Professor
Department of Operational Research, University of Delhi
Delhi, India
jhapc@yahoo.com

Abstract
To attract customers for purchasing the products of the firms, a very popular strategy is that of
advertising. Over the years the complexities in advertising have also increased with numerous
media now being available in the market. Of the many available mediums for advertising and
many more media which are being added each day, television with its wide reach to the
audiences has maintained its importance for firms advertising for their products. With the
growth in the market, the medium has diversified in terms of what it offers to the customers
such that it has a competitive edge over the mediums. Earlier there were only a few
stations/channels which were aired on television with a limited number of shows which were
being aired on these channels. In turn there were limited options for advertisers to place
advertisements on the channels. With the changes over time, there are many channels available
for advertising and in these channels there are multiple options of placing the advertisements
in these channels. The options of placement of advertisements may vary over type of space,
time slot of day, type of program, type of channel. Also there are certain preferences of
advertisers , wherein they specify the time window of the planning horizon that they would be
like to place their advertisements. Given that these complexities exist when advertising in
television, it is essential to develop a strategy that is lucrative for the channel owners and also
satisfying the requirements of the advertisers. In the proposed mathematical model, the
objective is to maximize the revenue of the channel owner while placement of advertisements
by different advertisers. The model is solved under the constraints of time window preferences
of advertisers, diversification of advertisement placement over the planning horizon , minimum
and maximum frequency constraints and the limits on duration of advertisements. The model
is validated through a real life case study to establish the applicability to real life.

Keywords: Television, Advertising, Optimization, Time Window.

STOCHASTIC MODEL FOR FORECASTING CUSTOMER
EQUITY: MOBILE SERVICE PROVIDER CASE

Titiksha Singh, Pratik Virulkar, Rittwick Maji


Students

Vijayalakshmi Chetlapalli*
Adjunct Faculty

K.S.S Iyer
Honorary Adjunct Professor
Symbiosis Institute of Telecom Management
(Constituent of Symbiosis International University)
Lavale, Pune - 412115, Maharashtra, India.
{titiksha.singh,pratik.virulkar,rittwick.maji,vijayalakshmi,kss.iyer}@sitm.ac.in

Abstract

A MULTI-SERVER INFINITE CAPACITY MARKOVIAN FEEDBACK
QUEUING SYSTEM WITH REVERSE BALKING

Bhupender Kumar Som


Associate Professor, JIMS Rohini, Sec -5, New Delhi – 110085
Email: bksoam@live.com
Sunny Seth
Assistant Professor, JIMS Rohini, Sec -5, New Delhi – 110085
Email: sunnyseth2005@gmail.com

Abstract

A large customer base of any firm signals better quality, value for money, or both, and
functions as a motivating factor for newly arriving customers. This phenomenon can be
observed in many businesses such as restaurants, healthcare, life insurance and investment,
and is termed reverse balking, which is contrary to the classical balking behavior. Reverse
balking results in a higher probability of a customer joining the system as the customer base
increases. This increasing probability of joining puts the service facility under pressure,
which in turn at times results in unsatisfactory and incomplete service. A dissatisfied
customer is termed a feedback customer in the queuing literature.
In order to frame an effective operational policy for such a system, it is essential to measure
the performance of the system. In this paper we combine the above-mentioned contemporary
challenges of reverse balking and feedback to formulate a new multi-server, infinite-capacity
feedback Markovian queuing system with reverse balking. The system is studied in steady
state, the necessary probability measures and measures of performance are derived, and a
numerical analysis of the model is presented. Later, the cost model is developed and an
economic analysis of the model is also presented. Algorithms are written in MATLAB and
MS Excel for the numerical and sensitivity analysis.

Key words: reverse balking, multi-server, queuing theory, feedback queue, infinite capacity.

IMPLEMENTATION OF WATER CYCLE ALGORITHM FOR
MODELLING AND OPTIMIZATION OF SUPPLY CHAIN NETWORK

Santhosh Srinivasan*, Vivek Kumar Chouhan, Shahul Hamid Khan


Indian Institute of Information Technology Design and Manufacturing - Kancheepuram,
Chennai, India, mdm13d001@iiitdm.ac.in,

ABSTRACT

Manufacturers are adopting product recovery and remanufacturing as two important business
strategies to reduce landfill waste and gain economic advantages. Parts are recovered from
used products for recycling, remanufacturing and repair. Manufacturers are forced to switch
to a reverse or Closed Loop Supply Chain (CLSC), as the traditional supply chain doesn't
support product recovery. This paper concentrates on a multi-period, multi-product,
multi-echelon CLSC network with vehicle routing between the distribution hub and retailers.
The CLSC network becomes more realistic when the production and assembly units are
separated, as the market has become global. A Mixed Integer Linear Programming (MILP)
model has been formulated for the network and solved using an improved Water Cycle
Algorithm (iWCA), a population-based algorithm inspired by the natural water cycle. Results
are compared with those of the CPLEX solver, and sensitivity analysis is carried out to draw
some managerial insights from the model.

Keywords: Closed Loop Supply Chain, Multi-period, Mixed Integer Linear Programming,
Remanufacturing, Product Recovery, Water Cycle Algorithm, Meta-heuristic

PLAYER ACTIVITY AND FREEMIUM BEHAVIOR IN AN ONLINE
ROLE PLAYING GAME: A JOINT MODEL APPROACH

Trambak Banerjee∗, Pulak Ghosh and Gourab Mukherjee∗


October 26, 2017

Abstract

In free-to-play multi-player online role-playing games, product managers often rely on player
activity and freemium behavior to boost revenue. However, assessing player behavior in these
settings is often complicated by (i) the presence of an extremely large number of zeros that
pertain to no activity and no purchase activity, (ii) the challenges of variable selection and
parameter estimation in high-dimensional datasets, (iii) pertinent domain expertise and prior
beliefs that must be incorporated into the modelling framework and (iv) the enormity of the
data that is subjected to statistical analyses. In this paper, we introduce an Extreme Zero
Inflated (EZI) Joint Modeling framework that models player activity and freemium behavior
through a joint model of multiple longitudinal outcomes and dropouts. The proposed
framework conducts hierarchical selection of relevant predictors from a large set of potential
predictors and efficiently incorporates prior beliefs and domain expertise into the selection
mechanism via affine constraints on the unknown coefficients. On massive datasets, our
framework leverages the benefits of distributed computing and uses a split-and-conquer
approach to conduct variable selection and estimation, thereby reducing computation time
drastically. On a novel dataset from a multi-player online role-playing game, our framework
adjusts for the extreme prevalence of zeros and identifies co-dependencies across player
activity and freemium behavior. It also reveals that players exhibit idiosyncratic activity
profiles over time, which may have significant implications for designing personalized
promotions. Moreover, our results indicate that freemium promotion, among others, is a key
component that drives player activity and behavior. However, the effect of promotion often
confounds with holidays and weekends, thus complicating managerial decision making
regarding player targeting with online promotions.

Keywords: Freemium Behavior; Hierarchical Variable Selection; Proximal Gradient Descent;
Split-And-Conquer Approach; Regularized Quasi Likelihood

ESCALATING CREATIVE BUDGETARY MODEL TOWARDS WET
GARBAGE EJECTION BY MULTI-CRITERION DECISION
ANALYSIS

K J Ghanashyam
Research Scholar,
Department of Mathematics,
Dayanand Sagar Academy of Technology and Management,
Bangalore, India.
ghanashyamkj@gmail.com
Vatsala G A*
Associate Professor,
Dayanand Sagar Academy of Technology and Management,
Bangalore, India.
dr.vatsala.ga@gmail.com
Jyothi P #
Assistant Professor
City Engineering College,
Bangalore, India.
Jyothi_balu_95@yahoo.co.in
Archana Satkhed
Research Scholar,
Department of Mathematics,
Dayanand Sagar Academy of Technology and Management,
Bangalore, India.
archana.satkhed@gmail.com

In this mechanized life, every task is associated with more than one goal, which leads to
different multi-criteria decision models. Garbage is a predominant problem faced by all
countries, especially developing nations like India. Wastes are categorised into many types,
such as household waste, pharmaceutical/medical waste, solid waste, liquid waste, gaseous
waste, industrial waste, biodegradable waste and non-biodegradable waste. These wastes are
generated by sources such as industries, institutions, construction sites, agriculture,
residential areas and many more. Mismanagement of waste may harm nature; thus, proper
waste management plays an important role and needs to be addressed on a war footing.
Bengaluru, once a ‘pensioners' paradise’, is now a cosmopolitan city facing many challenges.
Among them, solid waste management is predominant, as Bengaluru spreads over 709 km²
and produces around 4000-5000 tons of garbage per day. The total waste generated by the
city contains 60-65% organic waste, and 45-50% of it is produced by daily human activity.
This organic waste can be converted into compost, which can be used for organic farming.
With this as the main goal, the Karnataka Compost Development Corporation Ltd (KCDC)
was established, and it produces city compost and vermicompost using wet garbage as the
source. The primary goal of this paper is to give a complete optimal solution for the financial
management of the wet garbage recycling plant, i.e., the Karnataka Compost Development
Corporation Ltd (KCDC). In this study, we have designed a progressive goal programming
model for the financial management of KCDC to provide an optimal solution using a
Multi-Criterion Decision Making (MCDM) process. Even though the main intention is the
disposal of wet garbage and the promotion of organic farming, KCDC is running at a loss due
to lack of production, poor financial management, etc. Here we discuss reducing extra
expenses, which can then be utilized for promoting organic manure. Thus, we developed a
goal programming model with the main objectives of minimizing expenditure, maximizing
production, and increasing the revenue generated by the sales of different varieties of
compost.
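To make the goal-programming structure concrete, the toy PuLP sketch below introduces under- and over-achievement deviation variables around invented expenditure, production and revenue targets and minimizes the weighted undesirable deviations; it illustrates the model form only, not KCDC's actual figures:

```python
# pip install pulp -- a toy goal-programming sketch: deviation variables
# around expenditure, production and revenue targets, with the weighted sum
# of undesirable deviations minimized. All figures below are invented.
import pulp

prob = pulp.LpProblem("kcdc_goal_programming", pulp.LpMinimize)
city = pulp.LpVariable("city_compost_tons", lowBound=0)
vermi = pulp.LpVariable("vermi_compost_tons", lowBound=0)

# one (under, over) deviation pair per goal
dev = {g: (pulp.LpVariable(f"{g}_under", lowBound=0),
           pulp.LpVariable(f"{g}_over", lowBound=0))
       for g in ("expenditure", "production", "revenue")}

# goal constraints: achievement + under - over == target
prob += 900 * city + 1400 * vermi + dev["expenditure"][0] - dev["expenditure"][1] == 500000
prob += city + vermi + dev["production"][0] - dev["production"][1] == 600
prob += 2500 * city + 4000 * vermi + dev["revenue"][0] - dev["revenue"][1] == 1800000

# penalize overspending, under-production and under-achieved revenue
prob += 1.0 * dev["expenditure"][1] + 2.0 * dev["production"][0] + 1.5 * dev["revenue"][0]
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("city:", city.value(), "vermi:", vermi.value(),
      "objective:", pulp.value(prob.objective))
```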

Keywords: KCDC, MCDM, Goal Programming(GP), Optimisation, Deviations.

A CONVEX MODEL DATABASE APPROACH TO RAILWAY
SCHEDULE VALIDATION

Anushka Chandrababu, Abhilasha Aswal, Sanat R, G.N.Srinivasa Prasanna


International Institute of Information Technology, Bangalore
anushka.babu@iiitb.ac.in, abhilasha.aswal@iiitb.ac.in, sanat.r@iiitb.ac.in,
gnsprasanna@iiitb.ac.in

Abstract

Railway scheduling or timetabling is a complex problem, and solutions in general should
ensure that there are no conflicts between several hundred passenger and freight trains
moving in and out of stations and between block sections. Timetabling algorithms should take
into consideration potential new demand, congestion, available infrastructure, and delays due
to weather conditions, breakdowns, maintenance, etc. We present here a method to validate
railway schedules through the use of a database called the CMdB (Convex Model Database).
The CMdB is suitable for summarizing structured or unstructured big data and can be used
for answering queries for convex optimization problems. Models derived from train arrival
and departure times are stored in the CMdB. Validation of train schedules using these models,
to check whether there are any possible train intersections, is just one of the many
applications of the CMdB.

Keywords: Railway Scheduling, Databases, Convex Models, Uncertainty

OTHERS

FUZZY MULTI-CRITERIA APPROACH FOR JOINT PERFORMANCE
EVALUATION IN A DEA PROBLEM
Riju Chaudhary*
Research Scholar
Department of Mathematics, University of Delhi, Delhi, India
riju.chaudhary@gmail.com

Pankaj Kumar Garg


Associate Professor
Rajdhani College, University of Delhi, Delhi, India
gargpk08@gmail.com

P.C. Jha
Professor
Department of Operations Research, University of Delhi, Delhi, India
jhapc@yahoo.com

Abstract
Business entities all over the world aspire not only to perform to the best of their own
capabilities, but also to outshine similar business entities in the industry. To this end, various
parametric and non-parametric models have been formulated over time to evaluate the
efficiency scores of business entities individually, relative to others.

Among the performance evaluation techniques available, a relatively new data-oriented
approach is Data Envelopment Analysis (DEA), introduced by Charnes et al. By measuring
efficiency as the ratio of the weighted sum of outputs to the weighted sum of inputs, DEA
classifies DMUs into two categories: efficient and inefficient. The efficient units form the
production frontier, which provides reference units for inefficient DMUs to benchmark
against.

A classical efficiency evaluation DEA technique, which enables one to assess the efficiency of
Decision Making Units (DMUs) individually, is extended further to evaluate the performance
of all considered DMUs simultaneously by the Joint Optimization DEA technique.
Classical DEA techniques require precise input-output data and a priori information from the
decision maker. However, real world situations are characterised by unpredictable or
fluctuating behaviour of environmental factors, biased information, long term vision and non-
availability of accurate data. To incorporate the impact of these factors on the input-output
data, it, therefore, becomes imperative to consider DEA models with fuzzy data under fuzzy
environment.

The present paper proposes a Fuzzy Multi-Criteria Approach to the DEA problem which, while
incorporating fuzziness in the parameters, also maximizes the efficiency scores of all the
considered DMUs simultaneously under a fuzzy environment.
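For readers unfamiliar with the crisp baseline that the paper extends, the sketch below solves the classical input-oriented CCR multiplier model per DMU with PuLP on toy data; the fuzzy, joint-optimization extension proposed in the paper is not shown.

```python
# Sketch of the crisp CCR (input-oriented, multiplier form) DEA model for
# a single DMU using PuLP; toy data, no fuzziness, and per-DMU (not joint)
# optimization, i.e. only the classical baseline described above.
import pulp

X = [[2.0], [4.0], [3.0]]   # one input per DMU (toy data)
Y = [[1.0], [2.0], [2.5]]   # one output per DMU

def ccr_efficiency(j0):
    m = pulp.LpProblem("ccr", pulp.LpMaximize)
    u = [pulp.LpVariable(f"u{r}", lowBound=0) for r in range(len(Y[0]))]
    v = [pulp.LpVariable(f"v{i}", lowBound=0) for i in range(len(X[0]))]
    m += pulp.lpSum(u[r] * Y[j0][r] for r in range(len(u)))          # efficiency
    m += pulp.lpSum(v[i] * X[j0][i] for i in range(len(v))) == 1     # normalisation
    for j in range(len(X)):                                          # frontier cap
        m += (pulp.lpSum(u[r] * Y[j][r] for r in range(len(u)))
              <= pulp.lpSum(v[i] * X[j][i] for i in range(len(v))))
    m.solve(pulp.PULP_CBC_CMD(msg=False))
    return pulp.value(m.objective)

print([round(ccr_efficiency(j), 3) for j in range(3)])
```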

Keywords: Data Envelopment Analysis (DEA), Multi-Criteria Optimization, Fuzzy parameters, Fuzzy environment

129
SUSTAINABLE LOGISTICS PROVIDER EVALUATION USING
ROUGH SET THEORY AND AHP

Jyoti Dhingra Darbari


Research Scholar
Department of Operational Research,
University of Delhi
Delhi, India
jydbr@hotmail.com

Vernika Agarwal*
Research Scholar
Department of Operational Research,
University of Delhi
Delhi, India
vernika.agarwal@gmail.com

P C Jha
Professor
Department of Operational Research,
University of Delhi
Delhi, India
jhapc@yahoo.com

Abstract

The growing concern of stakeholders towards the environmental and social impact of
distribution networks has amplified the importance of incorporating logistics sustainability
for automobile spare parts manufacturing companies in India. Most of these manufacturing
companies prefer to outsource their logistics to formalized agents because of the agents'
experience and expertise; it is also financially difficult for companies to acquire such
competencies on their own. In this context, the selection of logistics partners is a crucial
decision, particularly when the decision makers have to handle numerous evaluation criteria
involving all three dimensions of sustainability. Generally, in such a scenario, various
multi-criteria decision-making techniques can be used for the process of evaluation and
selection. However, to make the process of selection less cumbersome and more effective, it is
essential to filter these criteria so that redundant criteria are avoided and the focus is on a
smaller set of candidate criteria which can largely represent the entire set without loss of
information. To achieve this goal, the present paper utilises rough set theory (RST), which
allows for distillation of the set of sustainability criteria to identify the most pertinent
factors based on which the logistics partner can be selected. Another advantage of the rough
set methodology is that it can help in determining rules for the selection process. However,
since the underlying focus of the study is that the selection of partners is done as per the
company's sustainability goals, rather than just generating rules we utilise the well-known
group decision-making technique of the analytic hierarchy process (AHP) for representing the
decision makers' judgements in the selection process. The novelty of this paper therefore lies
in the use of rough set theory integrated with AHP for identifying the most important
sustainability criteria and selecting the logistics providers based on the shortlisted criteria
as per the decision makers' opinion. The study is validated for the case of a manufacturer of
four-wheeler spare parts situated in northern India. At present, the manufacturer is already
working with a few logistics companies. However, the manufacturer wants to improve the
efficiency of its distribution networks, as transportation and distribution operations generate
the maximum environmental impact. Moreover, the manufacturer wants to adopt strategies to
redesign its logistics operations to increase the sustainability of its business operations. In
this context, the manufacturer wants to develop partnerships with third-party logistics
providers which can help in improving environmental performance as well as enhancing
corporate social responsibility (CSR) efficiency. The results demonstrate that RST is a
powerful data analysis technique which can effectively reduce the time and energy of the
decision makers while making pertinent supply chain decisions, especially those involving
numerous criteria. Further, the use of AHP enhances the entire assessment process by
accounting for conflict among the decision makers. The results demonstrate the usefulness of
RST in reducing the complexity of the logistics provider selection problem while taking into
consideration many qualitative and quantitative sustainability criteria.
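As a small illustration of the AHP step, the sketch below derives priority weights from a pairwise comparison matrix via the principal eigenvector and checks Saaty's consistency ratio; the 3x3 judgment matrix is invented, not the study's data.

```python
# Sketch of deriving AHP priority weights from a pairwise comparison
# matrix via the principal eigenvector, with the consistency-ratio check;
# the 3x3 matrix below is an illustrative judgment, not the study's data.
import numpy as np

A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])   # criteria pairwise comparisons

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                       # priority weights

n = A.shape[0]
ci = (eigvals.real[k] - n) / (n - 1)
cr = ci / 0.58                     # random index RI = 0.58 for n = 3
print("weights:", w.round(3), "CR:", round(cr, 3))  # CR < 0.1 => consistent
```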

Keywords: Rough set theory, AHP, third-party logistics provider, Sustainability

131
CRITERIA CLASSIFICATION FOR COST AND QUALITY
ASSESSMENT OF SUPPLIERS
Vernika Agarwal*
Research Scholar
Department of Operational Research,
University of Delhi
Delhi, India
vernika.agarwal@gmail.com

P C Jha
Professor
Department of Operational Research,
University of Delhi
Delhi, India
jhapc@yahoo.com


133
TRAVEL TIME PREDICTION USING BIG DATA ANALYTICS

Hari Bhaskar Sankaranarayanan


Director, Engineering
Amadeus Software Labs
Bangalore, India
hari.sankaranarayanan@amadeus.com
Roshan Khan
Director, Technology
Amadeus Software Labs
Bangalore, India
roshan.khan@amadeus.com

Abstract

Wait times and delays during travel are key factors that impact traveller satisfaction and
experience. Travel time prediction helps the traveller plan the overall journey effectively:
saving time, being better prepared in case of disruptions, and improving overall convenience.
For airports and airlines, it can help to manage staff and resources like counters and kiosks
at the right time with the right capacity levels. The solution can be deployed as a
door-to-door travel time estimate that includes transit, connecting and wait times during the
travel process. It involves crunching real-time information including the traveller's current
location, points of interest, traveller profile, preferences and congestion points like queue
time during check-in, immigration and security. The process includes collecting data from
multiple data sources, processing them in a platform based on the lambda architecture, and
applying machine learning models like multinomial logistic regression, support vector machines
and Bayesian models to predict the wait time and the overall expected travel time. This
research paper will discuss and present the methodology of big data analytics in arriving at
the wait time at various points of travel by ingesting real-time information. The associated
challenges include ascertaining the exact location of the traveller in an indoor setup. In
this paper, we will discuss various approaches for tracking traveller location, including
beacons, sensors, video analytics and crowdsourcing techniques. It will also highlight how the
platform can help airlines and airports enable the traveller to better utilise their time, for
example through notifications on merchandising offers. We will also provide some practical use
cases where the services are exposed in the platform and can be consumed by travel websites
and mobile applications.
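As a stand-in illustration of one of the named model families, the sketch below fits a multinomial logistic regression that maps simple congestion features to a discretised wait-time class; the features, data and thresholds are synthetic assumptions.

```python
# Illustrative sketch: predicting a discretised wait-time class at a
# checkpoint with multinomial logistic regression; all features and
# labels below are synthetic stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
hour = rng.integers(0, 24, n)            # time of day
queue_len = rng.poisson(20, n)           # people in queue
counters = rng.integers(1, 8, n)         # open counters
X = np.column_stack([hour, queue_len, counters])

# Synthetic label: 0 = short, 1 = medium, 2 = long wait.
score = 0.8 * queue_len / counters + 0.3 * ((hour > 6) & (hour < 10))
y = np.digitize(score, [5, 12])

model = LogisticRegression(max_iter=1000).fit(X, y)
print(model.predict([[8, 30, 2]]))       # e.g. morning peak, long queue
```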

Keywords: travel, airport, airlines, big data, prediction

134
ADAPTATION OF TRADITIONAL NEWSVENDOR MODEL FOR
VARIABLE PER-UNIT-COST OF UNDERSTOCKING AND
OVERSTOCKING
Gaurav Nagpal
Research scholar
BITS Pilani
Pilani, India
Gaurav19821@gmail.com

Abstract

This paper extends the well-known newsvendor model to the domain where the cost of
understocking and the cost of overstocking are not constant. The conventional newsvendor
model, used for deriving the optimal stock of a perishable product under uncertain demand,
assumes that the per-unit costs of understocking and overstocking are constant. But in real
life, the cost of understocking per unit may increase with the quantity of deficit, due to the
loss of goodwill created by word of mouth. Also, the cost of overstocking per unit may
increase with the quantity of surplus, due to the scarcity of resources required for holding
the inventories. In this paper, the author adapts the well-known model to such a situation. He
derives the analytical solution and then moves on to use simulation for running a few examples
that consider Cu or Co as either a continuous or a discrete function of the stock gap (deficit
or surplus). The author considers two sub-cases under each category: when the demand follows a
normal distribution, and when the demand follows a uniform distribution.
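Under constant per-unit costs the optimum is the classical critical ratio Q* = F^{-1}(Cu / (Cu + Co)); once Cu and Co grow with the gap, that formula no longer applies directly, and a Monte Carlo search of the kind sketched below (with assumed linear cost escalation and normal demand) illustrates the simulation idea.

```python
# Sketch of the simulation idea: when Cu and Co depend on the size of the
# deficit/surplus, expected cost is estimated by Monte Carlo over a grid
# of order quantities. Cost functions and demand parameters are
# illustrative assumptions, not the paper's examples.
import numpy as np

rng = np.random.default_rng(42)
demand = rng.normal(100, 20, 50_000)          # normal-demand sub-case

def expected_cost(q):
    gap = demand - q
    under = np.maximum(gap, 0)                # units short
    over = np.maximum(-gap, 0)                # units left over
    # Per-unit costs escalate with the gap (assumed linear escalation).
    cu = 4.0 + 0.05 * under                   # goodwill loss grows
    co = 1.0 + 0.02 * over                    # holding scarcity grows
    return np.mean(cu * under + co * over)

grid = np.arange(60, 161)
q_star = grid[np.argmin([expected_cost(q) for q in grid])]
print("near-optimal stock:", q_star)
```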

135
USAGE OF RESOURCE CALENDAR AND BINARY INTEGER
PROGRAMMING TO SCHEDULE JOBS IN A WASTE
MANAGEMENT SCENARIO

1. Sahana Prasad, Research Scholar,


Department of Mathematics and Statistics, Christ University, Bangalore, India
Email: sahana.prasad@res.christuniversity.in, Mob: 9448854135
2. Dr. Fr. Joseph Varghese, Associate Professor,
Department of Mathematics and Statistics, Christ University, Bangalore, India
Email: frjoseph@christuniversity.in

Abstract

Waste management is an important civic activity in any society and the problem of garbage
disposal is a topic of research and planning. This process involves many components and
resources like money, manpower, trucks, personnel etc. There are different jobs like collection,
segregation, processing, transport and so on. Proper scheduling of available resources is
essential to ensure smooth collection, processing and disposal. A scheduling problem involves
scheduling the available resources, subject to constraints, in order to yield an optimal
solution. The various jobs or tasks have deadlines, and the availability of resources may not
be continuous. Availability can be described by a resource calendar, which denotes when
resources are available. Scheduling problems typically work under the assumption that
resources are continuously available. In practice, however, some resources break down or
become unavailable during the processing of jobs, in this case the collection of waste. In
this paper, scheduling is done using binary integer programming for resources which are not
available continuously. Here, we deal with the offline setting, which is defined below. The
different types of jobs, their effort in terms of time taken, and their deadlines are used to
arrive at a schedule using resource calendars.
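A minimal sketch of how such a binary integer program can encode a resource calendar is shown below, using PuLP; the jobs, deadlines and calendar are toy assumptions, and each job is simplified to occupy a single slot.

```python
# Sketch of a binary integer program assigning jobs to time slots subject
# to a resource calendar (1 = truck available) and job deadlines; jobs,
# one-slot durations and the calendar are toy assumptions.
import pulp

slots = range(6)                       # planning horizon of 6 slots
calendar = [1, 1, 0, 1, 1, 1]          # resource availability per slot
jobs = {"collect_A": 3, "collect_B": 4, "segregate": 5}  # deadline slot

m = pulp.LpProblem("waste_schedule", pulp.LpMinimize)
x = {(j, t): pulp.LpVariable(f"x_{j}_{t}", cat="Binary")
     for j in jobs for t in slots}

m += pulp.lpSum(t * x[j, t] for j in jobs for t in slots)  # finish early

for j, due in jobs.items():
    m += pulp.lpSum(x[j, t] for t in slots) == 1           # run each job once
    for t in slots:
        if t > due or calendar[t] == 0:                    # respect deadline
            m += x[j, t] == 0                              # and the calendar
for t in slots:
    m += pulp.lpSum(x[j, t] for j in jobs) <= 1            # one job per slot

m.solve(pulp.PULP_CBC_CMD(msg=False))
print({j: next(t for t in slots if x[j, t].value() == 1) for j in jobs})
```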

Keywords: Waste Management, Resource Calendar, Scheduling, Binary Integer Programming

136
SURVIVAL ANALYSIS IN SUPPLY CHAINS USING PROBIT STICK
BREAKING PROCESS

Authors
Vaibhav Agrawal
3rdYear Undergraduate
Department of Industrial Systems and Engineering, IIT Kharagpur
Phone: +91-9932030171, E-mail: vaibhavagrawal.iitkgp@gmail.com
Shwetank Sharan
3rdYear Undergraduate
Department of Industrial Systems and Engineering, IIT Kharagpur
Phone: +91-9950723810, E-mail: shwetank4amrit@gmail.com
Yerasani Sinjana
Research Scholar
Department of Industrial Systems and Engineering, IIT Kharagpur
Phone: +91-8145989707, E-mail: sinjana91@gmail.com
Manoj Kumar Tiwari *
Professor
Department of Industrial Systems and Engineering, IIT Kharagpur
E-mail: mkt09@hotmail.com

Abstract

In recent decades, supply chains have become larger and more complex, leading to an increased
risk of failure. Disruptions in the production line due to machine breakdowns cause inventory
depletion; inventories run out after a given period, incurring huge losses. In order to
stabilize the system, we have applied the Probit Stick-Breaking Process (PSBP) to perform
survival analysis of the production line. The Statistical Flowgraph Model (SFGM) has been
applied to predict the posterior distribution of production-line disruption, but in that case
the path transmittance is fixed, governed by the prior. PSBP provides flexibility in deciding
priors and is equipped with computational simplicity. It also allows flexible estimation of the
conditional density function of production disruption risk. This method is expected to achieve
a much better forecast of disruption than alternative methods. A more accurate estimation will
provide managers with better choices for system restoration.
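For concreteness, the sketch below computes the probit stick-breaking weights w_k = Phi(a_k) * prod_{j<k} (1 - Phi(a_j)) that underlie the PSBP; in the full model the a_k would depend on covariates, whereas here they are arbitrary illustrative values.

```python
# Sketch of the probit stick-breaking construction: mixture weights are
# built as w_k = Phi(a_k) * prod_{j<k} (1 - Phi(a_j)), with Phi the
# standard normal CDF. Covariate-dependent a_k gives the PSBP its
# flexibility; the a_k below are arbitrary illustrations.
import numpy as np
from scipy.stats import norm

def psbp_weights(alpha):
    p = norm.cdf(np.asarray(alpha))        # break probabilities Phi(a_k)
    remaining = np.concatenate(([1.0], np.cumprod(1 - p[:-1])))
    return p * remaining                   # stick-breaking weights

alpha = [-0.5, 0.2, 1.0, 0.8]              # e.g. linear functions of covariates
w = psbp_weights(alpha)
print(w, "leftover mass:", 1 - w.sum())    # remainder goes to further atoms
```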

Keywords: Probit Stick Breaking Process, Statistical Flowgraph Model, Production line
disruption

137
CULTIVATION OF ORGANIC CROP USING THE METHOD OF
GOAL PROGRAMING

Archana Satkhed
Research Scholar,
Department of Mathematics,
Dayanand Sagar Academy of Technology and Management,
Bangalore, India.
archana.satkhed@gmail.com

Dr.Vatsala G A*
Associate Professor,
Dayanand Sagar Academy of Technology and Management,
Bangalore, India.
dr.vatsala.ga@gmail.com

K J Ghanashyam
Research Scholar,
Department of Mathematics,
Dayanand Sagar Academy of Technology and Management,
Bangalore, India.
ghanashyamkj@gmail.com

Abstract

A large population of India depends on the agriculture sector for its livelihood. At the same
time, due to an expanding population, diminishing available land and a rising service
industry, there is a need to increase production by utilizing the available agricultural
resources. Any small increase in the yield per acre can easily result in a huge increase in
the overall production efficiency of a particular region in a given time.

Globally, wheat is the leading source of vegetable protein in human food, having a higher
protein content than other major cereals in terms of total production, and it was one of the
first crops that could be easily cultivated on a large scale. In this case study, the wheat
crop is considered, with the goals of minimizing the cost of cultivation and maximizing crop
production. Generally, all types of soils have different physical and chemical properties, and
these properties impose certain limitations through the type of water, fertilizer and other
elements which are necessary to maximize crop yield. Due to the excessive increase in chemical
fertilizer usage, health issues like cancer and other diseases are nowadays increasing
rapidly. In view of this, we promote organic fertilizer. Compost, which is also called a soil
enricher and is the key ingredient in organic farming, is one such fertilizer; city compost
and vermicompost are considered for maximizing crop yield. The data is collected from the
University of Agricultural Sciences, Dharwad, considering the above-mentioned aspects.

Agricultural planning problems cannot be dealt with through a single goal of maximizing output
or profit. This paper considers the best utilization of water, minimizing the labor cost,
minimizing chemical fertilizer use by promoting organic fertilizer, and maximizing soil
fertility. To optimize the result, goal programming is used. Goal programming (GP) is also
called a multi-criteria decision model.

Keywords: Fertility, cultivation, yield, manure, organic fertilizer, optimization, goal programming

139
ASSESSING THE IMPACT OF CATASTROPHIC FLOOD EVENTS ON
A TERRAIN

Arijit Saha, Atul Singh, Nahaar N Sayeed, and Nisha Seth


Business Analytics and Intelligence
IIM Bangalore
Bangalore, India
arijit.saha16@alumni.iimb.ac.in
Shabarinath and Adhithya
Earth2Orbit (E2O)
Bangalore, India

Abstract

Flooding is one of the most common and devastating natural disasters, affecting nearly
every population on the globe. In this paper, we present a framework for assessing the flood
risk of a terrain, using geospatial analytics with freely available remote-sensing data and open
source technology stack. The framework uses a flood-inundation model to identify areas that
will be impacted due to overflow of a water body in the terrain. An experimental design
approach of clustering the terrain is used to identify troughs that may get inundated due to
heavy rain. Satellite image analysis of the terrain post heavy rains has been done to identify
how water flows through the terrain in the various clusters. The paper demonstrates the
results of successfully applying the framework on a study area (Bengaluru, India) to identify
the regions with a potential risk of flooding. Such a framework can be extended to be used for
risk analysis in the insurance industry and can also be used for various other areas like urban
planning.
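To illustrate the clustering step on elevation data, the sketch below groups cells of a synthetic digital elevation model by position and (weighted) elevation and ranks the clusters by mean elevation, so the lowest ones flag candidate troughs; the grid is a stand-in for real remote-sensing data.

```python
# Sketch of the clustering step: group cells of a digital elevation model
# by (x, y, elevation) so that low-elevation clusters flag candidate
# troughs; the synthetic 'dem' grid stands in for satellite-derived data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
n = 50
xs, ys = np.meshgrid(np.arange(n), np.arange(n))
dem = 900 + 5 * np.sin(xs / 8.0) + 5 * np.cos(ys / 9.0) + rng.normal(0, 1, (n, n))

# Weight elevation so clusters follow terrain height, not just position.
features = np.column_stack([xs.ravel(), ys.ravel(), 10 * dem.ravel()])
labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(features)

# Rank clusters by mean elevation; the lowest ones are inundation candidates.
means = [(k, dem.ravel()[labels == k].mean()) for k in range(8)]
print(sorted(means, key=lambda kv: kv[1])[:3])
```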

Keywords: Geospatial Analytics, Flood Inundation Model, Clustering, Satellite data

140
A HEURISTIC OPTIMIZATION SOLUTION FOR THE SELECTION
OF TRANSFORMATION FUNCTIONS FOR MEDIA CHANNELS IN
MMM

Analyttica Datalab Inc., Bangalore, India


cogito@analyttica.com

Abstract

Analyttica is a niche data science and advanced business analytics company focused on
providing incremental business impact for our clients by developing custom innovative
solutions for them in the predictive and prescriptive analytics space. With the advent of newer
technology with greater computing power, big data and contemporary media channels,
organizations have realized the importance of marketing mix modelling (MMM) and identified
the many benefits it entails, which give rise to cost-saving opportunities and drive profitability.
Marketers are under increasing pressure to move away from intuition based budgeting
decisions to factual budgeting decisions, substantiated through quantitative evidence. In an
attempt to understand how their marketing activity connects with real movements in sales and
market share, the client, an Australian subsidiary of one of the world's largest automotive
manufacturers, wanted to conduct an MMM exercise and arrive at clearer proof of their Return
on Marketing Investment. The client's marketing strategy was designed around the themes of
Media, Messaging and Brand Equity. Various levers were identified under each of these themes
which are essentially the components of the marketing mix, to help assess the relative impact
of marketing on enquiry and sales of the client’s product. Using historical marketing and sales
data under each of the exhaustive components identified, Analyttica helped estimate the
relative influence of the various components of the marketing mix, while controlling for other
sales drivers such as seasonality. To properly capture the event and longevity of effect of the
media channels on the outcome (in this case unit sales), is paramount for the success of any
MMM application, which is heavily dependent on the suitable selection of the transformations
such as decay (simple, logarithmic, exponential etc.), logistic, ad-stock, and gamma. The
appropriate selection of these transformation functions is highly contextual and driven by the
skills, experience, knowledge and judgment of the modeler. As a surrogate to the knowledge,
skill and experience of the modeler, we developed a heuristic optimization methodology for
selection of the transformation functions for the media channels. The methodology considered
seven transformation functions competing against each other to arrive at the most appropriate
one, that reduced the error of the model. This method is automated and has been developed
into a prototype that can be applied in similar situations for measuring above-the-line
marketing effectiveness and optimization. The above exercise highlighted the potential costs
and benefits of all the components, which were then weighed against one another in order to
build a media mix solution and arrive at the effectiveness of each of the above-the-line
channels, enabling decisions around objective allocation of the marketing budget across channels.
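A stripped-down version of the selection idea is sketched below: a geometric-decay adstock transformation is applied at several candidate decay rates, and the rate minimising the model error is retained; the spend series, response and candidate set are illustrative, and the paper's heuristic searches a much richer space of transformations.

```python
# Sketch of transformation selection by error minimisation: candidate
# adstock decays compete on mean squared error against a synthetic sales
# response. All data and the candidate grid are illustrative.
import numpy as np

rng = np.random.default_rng(1)
spend = rng.gamma(2.0, 50.0, 104)                  # weekly media spend

def adstock(x, decay):
    out = np.zeros_like(x)
    carry = 0.0
    for t, v in enumerate(x):
        carry = v + decay * carry                  # geometric carry-over
        out[t] = carry
    return out

# The hidden "true" response uses decay 0.6; candidates compete on MSE.
sales = 3.0 * adstock(spend, 0.6) + rng.normal(0, 40, spend.size)

best = None
for decay in (0.0, 0.2, 0.4, 0.6, 0.8):
    x = adstock(spend, decay)
    beta = (x @ sales) / (x @ x)                   # one-variable least squares
    mse = np.mean((sales - beta * x) ** 2)
    if best is None or mse < best[1]:
        best = (decay, mse)
print("selected decay:", best[0])
```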

141
ARTIFICIAL INTELLIGENCE

142
BANKING, FINANCIAL SERVICE
AND INSURANCE
(BFSI)

143
APPLICATION OF MACHINE LEARNING IN INSURANCE
UNDERWRITING

Divyanshu Suri, Predictive Modeler, XL CATLIN, India,


Divyanshu.Suri@xlcatlin.com
Mayank Bhardwaj, Predictive Modeler, XL CATLIN, India,
Mayank.Bhardwaj@xlcatlin.com

Abstract

Growing profitable business in a competitive market can be a challenge for all insurers. To
proactively improve overall returns, rather than wait for the market to harden (either by
natural market forces or as a result of a major catastrophe), insurers can seek to gain a
competitive advantage through getting carefully targeted, profitable new business onto the
books, achieving a superior risk-adjusted price for each risk bound, and improving retention
levels and hence lifetime customer value. In this paper, multiple machine learning techniques
for risk segmentation (to achieve the above) are presented, along with their
performance on one of the commercial lines products. The impact of feature engineering,
feature selection and parameter tweaking is explored with the objective of achieving
superior predictive performance.
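Since the keywords name Tweedie GLMs, the sketch below fits scikit-learn's TweedieRegressor (a power between 1 and 2 gives the compound Poisson-gamma family suited to zero-inflated, skewed loss costs) on synthetic data; it is a generic illustration, not the authors' model.

```python
# Sketch of a Tweedie GLM for pure-premium style targets, which are
# zero-inflated and right-skewed; all data below are synthetic.
import numpy as np
from sklearn.linear_model import TweedieRegressor

rng = np.random.default_rng(3)
n = 5000
X = rng.normal(size=(n, 4))                        # rating factors
mu = np.exp(0.1 + 0.5 * X[:, 0] - 0.3 * X[:, 2])   # true mean loss cost
claims = rng.poisson(0.3 * mu)                     # claim counts
severity = rng.gamma(2.0, 100.0, n)
y = claims * severity                              # many exact zeros

# Power between 1 and 2 => compound Poisson-gamma (Tweedie) family.
glm = TweedieRegressor(power=1.5, alpha=0.01, link="log", max_iter=1000)
glm.fit(X, y)
print(glm.coef_.round(3))
```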

Keywords: GLM, Tweedie, Predictive Modeling, Pricing in Insurance, Machine Learning

144
GENEROUS: GENERATE KNOWLEDGE GRAPH FROM
UNSTRUCTURED TEXT

Atul Singh, Mridul Mishra and Abhishek Pandey


Bengaluru, India

Abstract

Financial analysis traditionally depends significantly on a human analyst's ability to extract
and identify discernible insights by manually reading and consuming information from
multiple sources to rate financial instruments, and to identify events that may impact an
instrument's performance. This is primarily a manual process that impedes a firm's ability to
rate and predict the performance of a large number of financial instruments in a real-time,
scalable manner. This paper presents GENEROUS (GeneratE kNowlEdge gRaph from
UnStructured text), a machine learning based algorithm that extracts information from multiple
unstructured data sources to populate a knowledge graph in near real time. The knowledge
graph can be made to learn to respond when certain pre-defined conditions are met, thereby
reducing the need for manual effort by the analyst to handcraft insights. The analyst can
focus only on the relevant content that raised the trigger. GENEROUS customizes and enhances
existing machine learning techniques for Named Entity Recognition (NER) and Information
Extraction (IE). The result of applying the algorithm on a dataset constructed using articles
from the web is presented.
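As a rough illustration of the NER-to-graph step, the sketch below uses spaCy's pre-trained entity recognizer and records co-occurring entities in each sentence as candidate edges; GENEROUS's customized NER and relation extraction would refine these. The model name assumes `python -m spacy download en_core_web_sm` has been run.

```python
# Sketch of NER feeding a knowledge graph: extract entities per sentence
# with spaCy and record co-occurring (entity, entity, sentence) candidate
# edges; relation typing would refine these into typed graph edges.
import spacy
from itertools import combinations

nlp = spacy.load("en_core_web_sm")
text = ("Acme Corp missed its quarterly earnings estimate, "
        "and Moody's placed its bonds under review.")

graph_edges = []
for sent in nlp(text).sents:
    ents = [(e.text, e.label_) for e in sent.ents]
    for (a, _), (b, _) in combinations(ents, 2):
        graph_edges.append((a, b, sent.text.strip()))   # candidate edge

print(graph_edges)
```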

Keywords: Financial Analytics, text analytics, natural language understanding, clustering

145
AUTOMATED TRADING OF BITCOINS USING DEEP LEARNING

Gautam Kumar*
Associate Cognizant Technology Solutions Pvt Ltd, Bangalore, India
Gautam.Kumar7@Cognizant.com
Gunda Sai Thrinath*
Associate
Cognizant Technology Solutions Pvt Ltd, Bangalore, India
SaiThrinath.Gunda@Cognizant.com
Zahoor Ahmed Kazi*
Associate
Cognizant Technology Solutions, Bangalore, India
zahoorahmed.kazi@cognizant.com
Satish Hegde*
Associate Director
Cognizant Technology Solutions Pvt Ltd, Bangalore, India
Satish.Hegde@Cognizant.com

Abstract

In this paper, the authors propose an approach to predict the Bitcoin price using deep-learning
algorithms and hence automate Bitcoin trading. Bitcoin, a decentralized e-currency system,
represents a revolutionary change in the conventional financial system, alluring many users and
media around the world, and it is at the apex of cryptocurrencies in the current market. In this
paper we aim to understand and identify daily trends in the Bitcoin price market while also
identifying optimal features relevant to Bitcoin price prediction. Our data set consists of over
25 features relating to the Bitcoin price, including textual features obtained from various
social media forums, blogs, tweets and payment networks over the course of five years, recorded
daily. Our approach is similar to that of Hinton and Salakhutdinov [6] for training networks
with multiple hidden layers. The model we developed consists of a stack of RBMs (Restricted
Boltzmann Machines), which have no intra-layer connections, unlike a conventional Boltzmann
Machine. Each RBM consists of one layer of visible units (the inputs) and one layer of hidden
units connected by symmetric links. The output of each RBM is fed to the next RBM in the
stack. We train the encoder network layer by layer in a pre-training step. The encoder outputs
a low-dimensional representation of the inputs; the intention is that it retains interesting
features related to the Bitcoin price that are useful for forecasting returns on investment,
but eliminates irrelevant noise. The full network, composed of the encoder and an FFNN
classifier, is initialized with the weights learned by the RBMs. The final step is to
fine-tune the entire network on the labelled examples via back-propagation.
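A compact way to experiment with RBM pre-training is scikit-learn's BernoulliRBM; the sketch below stacks two RBMs ahead of a logistic-regression head as a stand-in for the paper's encoder-plus-FFNN design, on synthetic features, and omits the end-to-end fine-tuning step.

```python
# Sketch of stacked-RBM pre-training with scikit-learn: two BernoulliRBMs
# learn a low-dimensional representation that a classifier consumes.
# Features and the up/down label are synthetic; end-to-end fine-tuning,
# as in the paper, is omitted.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 25))                    # 25 daily features
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(0, 1, 2000) > 0).astype(int)

model = Pipeline([
    ("scale", MinMaxScaler()),                     # RBMs expect [0, 1] inputs
    ("rbm1", BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=20,
                          random_state=0)),
    ("rbm2", BernoulliRBM(n_components=8, learning_rate=0.05, n_iter=20,
                          random_state=0)),
    ("clf", LogisticRegression(max_iter=500)),     # stand-in for the FFNN head
])
model.fit(X, y)
print("train accuracy:", round(model.score(X, y), 3))
```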

Keywords: Bitcoin, RBM, FFNN, back propagation

146
MANUFACTURING

147
ACTIVE DEEP LEARNING ON BIG DATA FOR ASSEMBLY PLANT
FOR IMPROVING ROBOTIC ENERGY EFFICIENCY

Dr. Chiranjiv Roy


Principal Data Scientist
Mercedes Benz Research & Development India Private Ltd.
Bangalore, INDIA
chiranjiv.roy@daimler.com
Mr. Lakshminarayanan Sreenivasan
Senior Data Scientist
Mercedes Benz Research & Development India Private Ltd.
Bangalore, INDIA
reachme@email.com

Abstract

The increasing number of decentralized renewable energy sources together with the growth in
overall electricity consumption introduce many new challenges related to dimensioning of grid
assets and supply-demand balancing. Approximately 40% of the total energy consumption is
used to cover the needs of Large & Medium-scale commercial workshop and assembly plants
for Automobile businesses. To improve the design of the energy infrastructure and the efficient
deployment of resources, new paradigms have to be thought up. Such new paradigms need
automated methods to dynamically predict the energy consumption in plants. At the same time
these methods should be easily expandable to higher levels of aggregation such as
neighbourhood and the power distribution grid. Predicting energy consumption for a Workshop
Assembly Plant is complex due to many influencing factors, such as weather conditions,
performance and settings of heating and cooling systems, and the number of people present. In
this paper, we first define how to get Energy data in Real-Time from Robots & Assembly line
IoT sensors and then investigate a newly developed stochastic model for time series prediction
of energy consumption, namely the Conditional Restricted Boltzmann Machine (CRBM), and
evaluate its performance in the context of building automation systems for energy optimization.
The features are quite volatile, hence we present an approach to define new features using
unsupervised deep learning models such as Generative Adversarial Networks (GANs). The
assessment is made on a real dataset consisting of 7 weeks of hourly-resolution electricity
consumption collected from an automotive assembly plant, which amounts to 20 TB of big data
per working day. Internet of Things (IoT) sensors are the main source of signal data, which is
streamed into cloud or on-premise HIVE storage. The results show that, for the energy
prediction problem solved here, CRBMs outperform Artificial Neural Networks (ANNs), Deep
Belief Networks (DBNs) and Hidden Markov Models (HMMs); applied in hybrid formations, the
models can also take feedback from engineers and analysts for continuous improvement and
become self-intelligent.

Keywords: Energy prediction, Big Data, Active Learning, Deep Learning, Assembly Robots,
GAN, Human Machine Interface, Autoencoders.

148
MACHINE LEARNING APPROACH FOR OCR OF DOT CODE IN
TIRES

Nidhi Narayan1
Cognizant
nidhi.narayan@cognizant.com
Anil John2
Cognizant
anil.john@cognizant.com
Sachidanand Mishra3
Cognizant
sachidanand.mishra@cognizant.com

Abstract

A lot goes into the DOT code of a tire, which is embossed on the sidewall. Information like the
week, year and place of manufacture, the tire size and the Manufacturer's Unique Code is encoded
as part of the DOT code. The U.S. Department of Transportation (DOT) has provided regulations
for Tire Identification Numbers as a combination of eight to thirteen numeric and alphanumeric
characters following "DOT".

It is a Federal requirement that a tire retailer log each tire that leaves the facility against
the tire's DOT code, consumer name and other details. This information is used when there are
recalls, manufacturer defects and other issues in the industry.

Since presently there is no barcode or electronic method to capture the DOT code, each DOT's
details are manually entered and validated. This may take a couple of minutes or more. Various
tire retailers make an average annual shipment of 2,500-3,000 tires.

This paper aims at automating DOT code extraction, which would hugely reduce the man-hours
spent just entering and validating the code. It may also pave the way for various analytics
solutions with scope for upselling.

DOT code retrieval has been tried with different vision APIs provided by various cloud
platforms, but the results have been far from satisfactory. This paper therefore tries to
leverage deep learning along with a range of image pre-processing techniques to extract the
information.

For training purposes, high-pixel images of various types of tires are taken. The images are
then cropped to obtain the DOT code template. Various image pre-processing techniques like
morphological filters and thresholding are applied to enhance the template. A deep learning
model is designed to train on this data. Predictions are then made for various images of tires
to retrieve the DOT code.
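The pre-processing stage described above can be prototyped in a few lines of OpenCV, as sketched below with Otsu thresholding and a morphological closing; 'tire.jpg' is a placeholder path for a cropped DOT-code region.

```python
# Sketch of the pre-processing stage: grayscale, Otsu thresholding and a
# morphological closing to enhance embossed characters before OCR;
# 'tire.jpg' is a placeholder path for a cropped DOT region.
import cv2

img = cv2.imread("tire.jpg")                       # cropped DOT-code template
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)           # suppress rubber texture

# Otsu picks a global threshold suited to low-contrast embossing.
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Morphological closing joins broken strokes of embossed characters.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
cleaned = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

cv2.imwrite("dot_template.png", cleaned)           # input to the deep model
```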

149
RETAIL

150
AN EXPERT SYSTEM FOR PERFUME SELECTION USING
ARTIFICIAL NEURAL NETWORK

Rajesh Kalepu, Doctoral Scholar, rajesh.kalepu@ibsindia.org


Dr. Sindhuja P N, Associate Professor, sindhuja.menon@ibsindia.org
Department of Operations & IT, IBS Hyderabad,
IFHE University, Hyderabad, Telangana – 501203.

Abstract

The objective of this research is to help consumers in India purchase perfumes online, with
the aid of an expert framework program developed using artificial neural networks (ANN). The
expert framework's role is to capture the customer's requirements and predict an appropriate
perfume. To this end, factors in perfume customers' decisions were identified using the Fuzzy
Delphi method, and a back-propagation neural network classification model was developed and
trained on data from 3,025 customers. In addition, to validate the approach, the expert system
program was tested on data from 798 customers. The model achieves a satisfactory
classification rate of 84.48% in classifying consumers' styles. Finally, we compared the
results with logistic regression analysis. The results obtained from the ANN were more
impressive than those of the other statistical methods.

Key Words: Expert systems, Artificial Neural Network, Back-propagation neural network,
Perfume, Fuzzy Delphi

151
PRAGMATIC ANALYSIS OF CUSTOMER SENTIMENTS TOWARDS
FMCG BRANDS- PATANJALI AND NESTLE
Kalpana S. Kumaran*
Assistant Professor (HOD),
Institute for Technology and Management, Mumbai, India
kalpanas@itm.edu
Yogita Rawat*
Assistant Professor,
Institute for Technology and Management, Mumbai, India
yogitan@itm.edu

Abstract

The growing awareness, easier access, and changing lifestyles have been the key growth drivers
for the consumer market across the globe as well as in India. As per the survey, (Feeds, 2017)
the FMCG industry has emerged as the highest paying industry in India with an average annual
cost to company (CTC) of Rs. 11.3 lakh across all levels and functions. The top leaders in
FMCG Indian market are ITC, HUL, Britannia, Nestle India, Dabur, Marico, Patanjali, Godrej,
GlaxoSmithKline, and Colgate-Palmolive (Wikipedia). In less than five years, Patanjali – a
homegrown brand with herbal products – has flooded the Indian FMCG market with more than
500 lines across multiple categories and the company is set to turn over INR 100 billion this
year (WARC, 2017). This made Patanjali a competitor of the Swiss company Nestlé, which was
rated the world's largest FMCG company in terms of revenue, amounting to a staggering 92.36
billion U.S. dollars in 2015 (Statistics & Facts on Nestlé, 2015). There is disruptive change
in the FMCG market due to Patanjali's movement towards herbal and ayurvedic products, earning
a profit of nearly Rs. 5,000 crore as reported in the last quarter (Service, 2016).
This research aims to compare the FMCG Indian market leaders Nestle and Patanjali through
sentiment analysis on the microblogging platform Twitter. Sentiment analysis helps to analyze
customers' behavior and is an approach to classify the sentiments of user reviews, feedback,
posts, tweets and documents as positive (good), negative (bad), or neutral (Kiruthika, 2016).
Micro-blogs are a new communication channel for people to broadcast opinions that they would
not otherwise share using existing channels, namely email, phone, or weblogs (Rosson, 2009). A
micro-blog is a web service that allows the subscriber to broadcast opinions through short
messages (microblogging, 2009). In this study, we have used tweets from the Twitter server to
analyze and compare customer reviews of the companies Patanjali and Nestle using the
open-source tool R. R is one of the best statistical tools and a user-friendly software that
offers packages to analyze data and provide visual representations (Niu, 2014).
The aim of this study is to propose a model to perform sentiment analysis and classify the
sentiments based on emotions, polarity, and word clouds for the Indian brand Patanjali and the
Swiss brand Nestlé. The process includes the creation of datasets (collections of tweets),
construction of a model, and visual representation of the datasets through the application of
knowledge-based techniques using R.
Keywords: Sentiment Analysis, Twitter, Microblog, Emotions, Polarity, Word Cloud, Nestle,
& Patanjali.

152
A FACIAL RECOGNITION BASED APPROACH TO LEVERAGE
CCTV VIDEO DATA FOR REAL TIME CUSTOMER LEVEL
PROACTIVE ACTIONS AS WELL AS MEASURING VARIATION OF
STOCHASTIC DISTRIBUTION AT ORGANIZATIONAL LEVEL
Jitender Pal Singh
Cognizant
Jitenderpal.Singh@cognizant.com
Sushmita Sahu
Cognizant
Sushmita.Sahu@cognizant.com
Jatin Gupta
Cognizant
Jatin.Gupta3@cognizant.com

Abstract
Organizations throughout the world are now moving towards real time action based on
personalization and integration of offline – online space. In this respect, while a lot of analysis
has been done on the transactional data of customers, employees, etc., a huge scope now
remains to complement and connect this transactional data with human behaviors. Human
behavior in turn varies significantly by age, gender, time of day, day of week, etc. Facial
Recognition and Analysis now provides us avenues to tap this rich and influential data on a
real or near real time basis. In this paper, we present an approach to derive a linked data at a
customer or person level by detecting faces, recognizing the face with past stored customer &
prospect data, predicting the gender, predicting age, and facial emotions like joy, sadness,
anger, disgust, etc. This analysis can be summarized to generate stochastic distributions at
portfolio, organization level. This analysis will have a tremendous scope in measuring
customer satisfaction & demand volume and ability to drill down by Gender, Age and Time.
We could also link this with online data to track customers who entered the store, physical
space with their transactional behavior. Thus enabling the organization to take corrective
actions with respect to individual or segment. The applications are pan industry and domain.
For instance, this can be applicable to operational preparedness by anomaly detection if the
distribution differs from expected. Thus providing the triggers for preventive measures rather
than responding post facto. We have tested a range of techniques and tools to develop this
approach and also provide a comparative analysis of the different approaches to achieve the
above objective. We have utilized the ubiquity of the continuous video data being captured
almost all the organizations through CCTV cameras. A range of techniques for prepossessing
of video files and classification techniques have been used. Kernel and Gaussian filters have
been used to correct image quality issues, particularly relevant for low resolution videos. Image
pre-processing techniques have also been used to account for the variation in the person's head
rotation, facial expressions. These have to be aligned at the axis of symmetry of the face. We
used face landmark estimation algorithm to come up with pivot points, each called as a
landmark. We will be training a model to find these landmarks. There are some fundamental
differences between the faces of males and females and these differences will help in
classifying the image into male or female category, Emotions etc. These predictions can be
overlapped with the time stamp and other enterprise data if
relevant. HAAR Cascade Classifier, boosted tree algorithms, Support Vector Machines Simple
and Deep Neural Nets have been the classification algorithms used.
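The detection front end named in the text can be sketched with OpenCV's pre-trained Haar cascade, as below; 'cctv.mp4' is a placeholder path, and the downstream recognition, age, gender and emotion models are omitted.

```python
# Sketch of the detection front end: OpenCV's pre-trained Haar cascade
# locates faces frame by frame in CCTV footage; downstream recognition,
# age, gender and emotion models would consume these crops.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture("cctv.mp4")                 # placeholder path
frame_idx, detections = 0, []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)                  # helps low-res CCTV frames
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        detections.append((frame_idx, x, y, w, h))  # crop for later models
    frame_idx += 1
cap.release()
print(f"{len(detections)} face detections in {frame_idx} frames")
```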

153
Keywords: Facial Recognition, Emotion, offline-online integration, Machine Learning Classifier, Haar Cascade Classifier

154
NEURAL NETWORK ADOPTION IN TIME SERIES FORECASTING-A
COMPARATIVE ANALYSIS
Harleen Kaur Ahuja (Author)
Data Scientist
Pluto7 Inc.
Delhi, India
harleen@pluto7.com
Manjunath Devadas (Co-Author)
CEO
Pluto7 Inc.
California, USA
manju@pluto7.com

Abstract

In today's growing industrial world, understanding trends is important for decision making and
reacting to changes in behavioural patterns. In the supply chain market, forecast accuracy is
critical, as one wrong prediction can lead to either a shortage or an excess. This project
aims at implementing an effective forecasting system that will reduce such service
interruptions to a great extent. The project starts by developing predictions on supply chain
data using a classical time series model, ARIMA, and then gradually moves towards machine
learning using neural networks. A comparative analysis is then done between the results
achieved in these two categories. Over the past few decades, significant development has taken
place in short-term forecasting. Recently, a class of models called recurrent neural networks
has been gaining interest for building powerful neural networks. Since training neural
networks requires extensive GPUs, this project runs the training models on Google Cloud and
leverages its machine learning capabilities.
From the classical linear models, seasonal ARIMA will be implemented to predict future demand
and supply for the next one year. In the machine learning category, a specific type of
recurrent neural network called LSTM (Long Short-Term Memory) will be implemented, with an
end-to-end neural network architecture developed using the TensorFlow framework. A comparative
analysis will be performed between the results obtained from the two models. Deep analysis is
also required to perform hyperparameter tuning in both algorithms to achieve high-accuracy
forecast systems. This is a machine learning project; once implemented, it can be used to
forecast any time-series data.
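A minimal version of the LSTM branch of the comparison is sketched below in TensorFlow/Keras: a univariate series is framed into sliding windows and a small LSTM learns one-step-ahead forecasts; the series and hyperparameters are illustrative.

```python
# Sketch of the LSTM branch: frame a univariate demand series as sliding
# windows and fit a small Keras LSTM; series and settings are illustrative.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
t = np.arange(400, dtype="float32")
series = np.sin(2 * np.pi * t / 52) + 0.1 * rng.normal(size=t.size).astype("float32")

window = 12
X = np.stack([series[i:i + window] for i in range(series.size - window)])
y = series[window:]
X = X[..., None]                                  # (samples, timesteps, features)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(window, 1)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)
print("one-step forecast:", float(model.predict(X[-1:], verbose=0)[0, 0]))
```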

Keywords: Neural Networks, Machine Learning, TensorFlow, Gradient Descent, ARIMA

155
SERVICES

156
ANALYSIS OF MONOGENIC HUMAN ACTIVITY RECOGNITION
DATA USING DATA MINING ALGORITHMS
R.Suganya*
Assistant Professor, Department of IT
Thiagarajar College of Engineering, Madurai, India.
rsuganya@tce.edu
S.Rajaram
Associate Professor, Department of ECE
Thiagarajar College of Engineering, Madurai, India.
rajaram_siva@tce.edu
A.Sheik Abdullah
Assistant Professor, Department of IT
Thiagarajar College of Engineering, Madurai, India.
asait@tce.edu
Fiaz Mohammed Ali
Department of IT
Thiagarajar College of Engineering, Madurai, India.
faizmohammedali@gmail.com

Abstract
Human Activity Recognition (HAR) is an area of growing interest, facilitated by the current
revolution in body-worn sensors. HAR systems retrieve and process contextual (environmental,
spatial, temporal, etc.) data to understand human behavior. Mining behavioral patterns from
such wearable data, along with other available sensory data, has the potential to offer an
objective, insightful service to clinical professionals and healthcare. Activity recognition
applications are used effectively for healthcare and safety applications. This paper proposes
data mining algorithms to automatically recognize monogenic human activities based on four
body-worn accelerometers. Monogenic children often develop abnormal habits, and in some cases
these could be unsafe or even dangerous to themselves. Because of their limited speech
ability, their inexperienced parents may underestimate their physical abilities compared to
their intellectual level and may not realize that they could easily hurt themselves. In this
work, classifier models are built using C4.5, C5.0, Naive Bayes, SVM, QDA, LDA and neural
networks. A comparative study is made of these classifier models based on their accuracy. In
the end, the proposed methodology facilitates accurate classification of monogenic children's
gestures and motions. We further implemented the C4.5 supervised algorithm in a Hadoop
environment for handling large datasets. The analysis is based on the confusion matrix and
accuracy obtained for the 7 data mining algorithms, from which we can infer that each
supervised algorithm implemented shows good performance and accuracy on the monogenic HAR
dataset.

Keywords: data mining algorithms, monogenic human activity recognition data, supervised algorithms, confusion matrix

157
ADVENT OF ARTIFICIAL INTELLIGENCE IN CUSTOMER
EXPERIENCE TRANSFORMATION

SUDHA BHAT
Asst. Vice President
Genpact India Pvt. Ltd.
Bangalore, India
Sudha.bhat2@genpact.com
Mohan Raj
Manager
Genpact India Pvt. Ltd.
Bangalore, India
Mohan.raj@genpact.com

Abstract

Customer Experience is pivotal to any customer service organization. Some organizations have
started to adopt customer experience as a competitive differentiator. At this juncture Customer
experience transformation is at its inflection point. The next wave of growth in this field will
be propelled by Artificial Intelligence. Over the last decade, organizations have heavily
invested in capturing data at various customer touchpoints. Most of the transformation effort
has gone towards how effectively we can collect this customer data and integrate disparate
data sources to provide a unified view of the customer. However, less effort has gone into
monetizing this data gold mine to provide a superior customer experience. Artificial
intelligence powered by machine learning and deep learning techniques will mine this data and
help businesses drive specific outcomes.
In this project we have attempted to create a Customer Experience Score (CES), a framework
based on the actual customer effort incurred instead of relying on survey-based measurement
techniques. Key features of this framework include:

- CES will be a comprehensive measure of customer experience, standardized across channels.
- It includes KPIs from structured data sources like CRM, transactions, etc.
- It also leverages unstructured data sources like speech recordings, chat transcripts and
emails, thereby providing an accurate measure of customer effort.
- The CES framework uses NLP and deep learning models to extract insights from unstructured
data sources.

CES is a one-point solution to monitor, measure and optimize the customer journey. It extends
further by leveraging machine learning models like neural networks to predict the next best
action for the customer. Some of the use cases that will be addressed by our solution:

1. A customer's churn propensity
2. Inclination to buy a product
3. Affinity towards different channels

These will be computed based on historical channel usage and interaction behaviour. The
insights from these models can be readily used by marketing teams to pre-empt customer churn,
identify opportunities to cross-sell/up-sell, and find the optimal time window for running
campaigns in a channel that can maximize the customer response.

Keywords: Customer Journey Analytics, Machine Learning, NLP, Deep Learning, Speech
Analytics

159
NEURAL NETWORK BASED APPROACH FOR TOURISM DEMAND
FORECASTING

Anurag Kulshrestha1 and Abhishek Kulshrestha2


1 Indian Institute of Management Indore, Mumbai Campus,
CBD Belapur, Navi Mumbai, Maharashtra, India
2 Shri Ramswaroop Memorial University, Lucknow, Uttar Pradesh, India
Abstract

Demand forecasting is essential for any business activity. Forecasting tourism demand is
crucial for planning and operational purposes for both the government and the private sector.
This study aims to forecast and compare monthly tourism demand to Singapore utilizing
artificial intelligence based approaches. The monthly data of inbound tourist arrivals to
Singapore was subjected to descriptive analysis. Augmented Dickey-Fuller (ADF),
Phillips-Perron (PP) and Kwiatkowski-Phillips-Schmidt-Shin (KPSS) tests were performed to
check for underlying stationarity of the series. Multi-layer perceptron (MLP), Elman and
radial basis function (RBF) neural networks were utilized to generate one-, two-, four- and
six-month-ahead predictions of tourist arrivals to Singapore from various visitor markets. The
predictive performance of the models was measured with the root mean square error (RMSE) and
mean absolute error (MAE). To confirm the best predictive model, the Diebold-Mariano (DM) test
was performed to compare the forecasting accuracy of the three intelligent models. The ADF, PP
and KPSS tests confirmed the presence of a unit root in the tourism time series dataset. To
make the time series stationary, a first difference of the natural logarithm was taken. The
analysis revealed that RBF networks had lower RMSE and MAE values than the MLP and Elman
neural networks. The DM test further confirmed that the RBF network outperformed the Elman and
MLP neural network models in predicting tourist arrivals over all forecasting horizons and for
most of the countries. The proposed neural network based methodology helps in improving the
forecasting performance of artificial intelligence based approaches for predicting inbound
tourist arrivals, and could be utilized by managers to effectively allocate resources and
frame policies for the tourism sector.
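The stationarity workflow described above can be reproduced in a few lines with statsmodels, as sketched below on a synthetic stand-in for the arrivals series: an ADF test on the raw log series, then on its first difference.

```python
# Sketch of the stationarity step: ADF test on the log series, then on
# its first difference; 'arrivals' is a synthetic stand-in for the
# Singapore inbound-tourism series.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(5)
trend = np.linspace(10, 12, 120)                     # upward drift
season = 0.2 * np.sin(2 * np.pi * np.arange(120) / 12)
arrivals = np.exp(trend + season + rng.normal(0, 0.05, 120))

p_raw = adfuller(np.log(arrivals))[1]
diffed = np.diff(np.log(arrivals))                   # first diff of natural log
p_diff = adfuller(diffed)[1]
print(f"ADF p-value raw: {p_raw:.3f}, differenced: {p_diff:.3f}")
```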
Keywords: Tourism demand; forecasting; artificial neural networks

160
IRIS – A COGNITIVE DECISION ENGINE

Ashwin Rajan
Business Operations Manager
Cisco Systems, India
Bangalore, India
asrajan@cisco.com
Sujay Mendon
Business Operations Manager
Cisco Systems, India
Bangalore, India
sumendon@cisco.com
Suresh Shetty
Senior Manager Business Operations
Cisco Systems, India
Bangalore, India
surshett@cisco.com

Abstract
Interactive Response and Investigative System (IRIS) is a Machine Learning based Cognitive
Decision Engine. It will process business questions from users (think of searching on Google
for different topics) and supply analytical insights and recommendations based on previously
tested models and analytical inputs from all the analysts within the team, stored in the
system.
The engine will be a consolidation of the business and analytical knowledge of multiple
analysts, leading to exponential gains in business insights. It will aid in answering
questions on recurring problems at the click of a button and will aid the analysts by
suggesting correlations, causes and impacts previously unknown. It will serve the additional
benefit of being a repository of business and analyst knowledge, ensuring continuity. Future
benefits include incorporating AI to let the machine self-learn, correlate and provide
insights that the business can accept or reject as golden insights.
Keywords: cognitive, machine-learning, analytics, bot

161
162
ANOMALY DETECTION IN NETWORKING LOGS USING
UNSUPERVISED AUTOENCODER LEARNING

Kaushik Kumar Bar


Founder
Ize Analytics
Bangalore, India
kaush.mat@gmail.com

Abstract

Anomaly detection has been used in many contexts – from identifying flaws in manufacturing
processes to finding suspicious activities in surveillance videos. In this paper,
we attack the problem of anomaly detection in networking logs followed by finding the root
cause of these anomalies in the hardware status of constituent computing machines in the
underlying stack. Such models will find applications in multi-stacked multi-OS virtual
software platforms/infrastructure serviced by multiple vendors, which are very common (e.g.
OpenStack) in cloud computing today – where a failure in any of the layers entails
complicated and time-consuming debugging efforts that often result in incorrect attribution of
root cause, potentially damaging the brand image of benign service components' providers.
This paper proposes a practically implementable solution architecture to the problems above,
which will minimize human intervention in such scenarios. Limitations of more commonly
used techniques for anomaly detection (e.g. clustering based LOF/SOF, multivariate
Gaussian outlier detection, rule based models, supervised learning etc.) are discussed with
respect to the behaviour of anomalies in the domain context. Alternative solution approaches
are explored based on characteristics of data. Finally, we describe the proposed method that
trains an auto-encoder based unsupervised anomaly detector on text data (networking logs),
followed by augmenting the time series data (hardware status) with results from this detector,
followed by training a self-supervised LSTM auto-encoder anomaly detector on the
augmented data. In the problem domain of interest, the proposed method is expected to fare
better than the other techniques that are more prevalent.
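A minimal version of the first stage is sketched below: a dense autoencoder is trained on vectorised 'normal' records and flags test records whose reconstruction error exceeds a high percentile; the featurisation of raw logs and the later LSTM stage are omitted, and all data are synthetic.

```python
# Sketch of autoencoder-based anomaly detection: train on 'normal'
# vectorised log records, then flag records whose reconstruction error
# exceeds a high percentile of the training error. Data are synthetic.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(2)
normal = rng.normal(0, 1, (5000, 30)).astype("float32")   # normal log features
test = np.vstack([rng.normal(0, 1, (95, 30)),
                  rng.normal(4, 1, (5, 30))]).astype("float32")

ae = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(30,)),
    tf.keras.layers.Dense(8, activation="relu"),          # bottleneck
    tf.keras.layers.Dense(30),
])
ae.compile(optimizer="adam", loss="mse")
ae.fit(normal, normal, epochs=10, batch_size=64, verbose=0)

err = np.mean((test - ae.predict(test, verbose=0)) ** 2, axis=1)
threshold = np.percentile(
    np.mean((normal - ae.predict(normal, verbose=0)) ** 2, axis=1), 99)
print("flagged anomalies:", np.where(err > threshold)[0])
```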
Keywords: Anomaly detection, auto-encoder, LSTM, unsupervised deep learning, neural
network.

163
TEXT ANALYTICS FOR RELATIONSHIP EXTRACTION TO
CONVERT SENTENCES TO EQUATIONS

Varsha Rani
Data Scientist
Analytics, Genpact
Gurgaon, India
varsha.rani@genpact.com
Chirag Jain* Senior
Data Scientist
Analytics, Genpact
Bangalore, India
chirag.jain4@genpact.com

Abstract

As Artificial Intelligence is automating most of the repetitive or mundane tasks, every effort
in this direction needs to maintain a balance between precision and recall. This paper targets
an area where high precision is a critical requirement. The authors present an approach to
automate the conversion of relationships described in English sentences to mathematical
equations. The approach tries to understand the structure of the sentence by extracting its
semantic and syntactic features. It extracts entities / variables from the sentences and
identifies the relationships between them to convert into equations. The approach is based on
Natural Language Processing / Text Mining principles, and leverages dictionaries of variables
and relationships. Automated conversion of sentences is challenging due to variations in
people's writing styles. The same relationship between variables can be represented in
multiple ways in different sentences, and may also be spread across multiple consecutive
sentences. Entity resolution and identification of relationships by an information extraction
system is an ambiguous process. The developed solution integrates the naturally structured
layout of sentences to resolve semantically missing or ambiguous elements. Contextual
information from neighbouring words further helps to identify complex relationships. It also
uses customized domain-specific knowledge sources to remove ambiguity. The algorithm also
incorporates co-reference resolution to link relationships and entities spread across multiple
sentences. The developed solution has been tested on more than 500 sentences with a precision
of 90% and a recall of 75%. It is being productized as part of one of Genpact's offerings;
however, it can be adapted to any application by creating a domain-specific database of
variables and relationships.
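A toy, pattern-based version of the sentence-to-equation idea is sketched below; the actual solution uses parsing, dictionaries and co-reference resolution rather than two hard-coded regular expressions.

```python
# Toy sketch of the sentence-to-equation idea using simple patterns; the
# paper's NLP pipeline (parsing, dictionaries, co-reference) is far richer.
import re

RELATIONS = [
    (re.compile(r"(\w+) is (\d+(?:\.\d+)?) times (\w+)"), "{0} = {1} * {2}"),
    (re.compile(r"(\w+) is the sum of (\w+) and (\w+)"), "{0} = {1} + {2}"),
]

def to_equation(sentence):
    s = sentence.lower().rstrip(".")
    for pattern, template in RELATIONS:
        m = pattern.search(s)
        if m:
            return template.format(*m.groups())
    return None

print(to_equation("Revenue is 3 times cost."))          # revenue = 3 * cost
print(to_equation("Profit is the sum of margin and fees."))
```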
Keywords: Text Analytics, NLP, Syntactic Relationship, Automation

164
FACIAL COMPOSITE USING DNA DECODING

T.Gokulrajan
Department of Information Technology
Thiagarajar College of Engineering
Madurai-625015
tgokulrajan@gmai.com
Dr.R.Suganya M.E., Ph.D.,
Assistant Professor
Department of Information Technology
Thiagarajar College of Engineering
Madurai-625015
rsuganya@tce.edu

Abstract

DNA decoding can be of use in forensics and in detecting genetic diseases. DNA is said to
be the simplest molecular representation of a person's characteristics; we can view a person's
DNA as a quaternary representation of those details. DNA has four bases, namely adenine,
thymine, cytosine and guanine, and their corresponding bonds, so a segment of DNA may
contain four pair combinations: A-T, T-A, C-G and G-C. The total amount of related DNA
base pairs on Earth is estimated at 5.0 x 10^(37) and weighs 50 billion tons. A large part of
DNA (more than 98% for humans) is non-coding, meaning that these sections do not serve as
templates for protein sequences. The combinations A-T, T-A, C-G and G-C can be represented
as 0, 1, 2 and 3 for easy computation and operated on with a base-4 number system. DNA
decoding is based on the premise that variations in a person's characteristics depend on the
order in which these four combinations occur in a DNA strand. It compares the DNA
sequences of a group of persons and analyses them against each person's looks and
appearance, which yields a relation between a person's DNA and his appearance. This
relation allows us to infer a person's appearance from his DNA sequence. Based on the
analysis, the DNA is segmented, each segment is characterized for an individual trait and
mapped to an appearance feature, and finally the segments are grouped together to
reconstruct the appearance of the person.
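A minimal sketch of the base-4 encoding described above follows; the pair-to-digit mapping is taken from the abstract, while the reading of a segment as an integer is an illustrative assumption.

```python
# Base-pair to quaternary digit mapping as described in the abstract.
PAIR_TO_DIGIT = {"A-T": 0, "T-A": 1, "C-G": 2, "G-C": 3}

def encode_strand(pairs):
    """Map a list of base pairs to its quaternary digit string."""
    return "".join(str(PAIR_TO_DIGIT[p]) for p in pairs)

def segment_value(digits):
    """Interpret a digit string as a base-4 integer (illustrative)."""
    return int(digits, 4)

strand = ["A-T", "C-G", "G-C", "T-A"]
digits = encode_strand(strand)         # "0231"
print(digits, segment_value(digits))   # 0*64 + 2*16 + 3*4 + 1 = 45
```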

Keywords: Facial Composite, DNA Decoding, DNA Segmentation, Quaternary Computations

165
Q-MAP: CLINICAL CONCEPT MINING WITH PHRASE SENSE
DISAMBIGUATION

Sheikh Shams Azam


Data Scientist
Practo Technologies Pvt. Ltd., Bangalore, India
s.shams.sam@gmail.com
Manoj Raju
Senior Data Scientist
Practo Technologies Pvt. Ltd., Bangalore, India
manoj.raju@practo.com
Venkatesh Pagidimarri
General Manager
Practo Technologies Pvt. Ltd., Bangalore, India
venkatesh.p@practo.com
Vamsi Kasivajjala
Senior Vice President
Practo Technologies Pvt. Ltd., Bangalore, India
vamsi@practo.com

Abstract

Over the past decade, there has been a steep rise in data driven analysis in major areas of
medicine, such as clinical decision support systems, survival analysis, patient similarity
analysis, image analytics etc. There are also various ongoing research efforts in the operational
and financial fields using techniques such as demand forecasting and convex optimization. Most
of the data used in these research applications is well-structured and available in numerical or
categorical formats that can be used in experiments directly. At the opposite end, there
exists a wide expanse of data that is intractable for direct analysis owing to its unstructured
nature. It can be found in the form of discharge summaries, clinical notes and procedural notes,
which are human-written free text and have neither a relational model nor any
standard grammatical structure. An important step in utilizing these texts for such studies
is to transform and process the data to retrieve structured information from the haystack of
irrelevant data using information retrieval and data mining techniques. The unregulated format,
coupled with the massive size of the datasets, makes the mining process a monumental task requiring
robust algorithms supported by ample hardware resources and computing power. In this paper,
we present Q-Map, a simple yet powerful system that can sift through these datasets
to retrieve structured information aggressively and efficiently. It is backed by an effective
mining algorithm based on curated knowledge sources that is both fast and configurable. We
also present its comparative performance against MetaMap, one of the most reputed tools for
medical concept retrieval.
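Q-Map's algorithm is not reproduced here, but the general idea of mining concepts against a curated knowledge source can be sketched as a greedy longest-phrase match over a toy vocabulary; the concept identifiers and matching rule below are illustrative assumptions.

```python
# Illustrative sketch: greedy longest-match lookup of phrases from a
# clinical note against a curated concept dictionary (toy vocabulary).
import re

VOCAB = {
    "type 2 diabetes": "C0011860",
    "diabetes": "C0011849",
    "hypertension": "C0020538",
}
MAX_PHRASE_LEN = 3  # longest phrase in the vocabulary, in tokens

def mine_concepts(text):
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    i, hits = 0, []
    while i < len(tokens):
        # Try the longest candidate phrase first, then shrink.
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in VOCAB:
                hits.append((phrase, VOCAB[phrase]))
                i += n
                break
        else:
            i += 1
    return hits

note = "Patient has Type 2 Diabetes, history of hypertension."
print(mine_concepts(note))
# -> [('type 2 diabetes', 'C0011860'), ('hypertension', 'C0020538')]
```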

Keywords: Information Retrieval (IR), Unified Medical Language System (UMLS), Syntax
Based Analysis, Natural Language Processing (NLP), Medical Informatics

166
DIABETIC EYE DISEASE DETECTION

Chandravadan M. Prajapati
Assistant Professor
SVKM’s Institute of International Studies
Mumbai, India
Chandravadan.Prajapati@svkmiis.ac.in

Abstract

People who have had diabetes for a long period have a very high chance of developing a
medical condition in which the retina of the eye gets damaged. It is a leading cause of
blindness among people aged 20 to 65 years. 80% of new cases could be prevented with
proper monitoring of the eyes and timely treatment. Diabetic retinopathy occurs when
changes in blood glucose levels affect the retinal blood vessels. In some cases, these vessels
swell up and leak fluid into the rear of the eye. In other cases, the cause may be damaged
nerve tissue or abnormal blood vessels growing on the surface of the retina.

167
168
CUSTOMER SUCCESS USING DEEP LEARNING

Shobha Deepthi V, Sumith Reddi Baddam


Vignesh Thangaraju, Chandrasekaran S
Cisco Systems India Pvt Ltd, Bangalore
sobv@cisco.com, sumreddi@cisco.com, vigthang@cisco.com, chands@cisco.com

Abstract
Customer Success has been the top priority for Cisco in transforming to a recurring revenue
business model. For this, we need to shift our paradigm from reactive troubleshooting to
proactively advising our customers. As part of this transformation, various capabilities are
being built to capture customer data and to deploy smart agents that collect information from
customer networks, predict a failure before it happens and advise the customer on the resolution.
Cisco products span both hardware and software. It is trickier to predict a failure or an issue
beforehand in software than in hardware, because hardware has a predefined set of symptoms
for a failure. In software, predicting an issue beforehand means knowing and understanding
what code goes in with each commit, defect or enhancement. In most cases, defects found
during internal testing, which are often neglected, crop up as customer issues at a later point
in time. In this paper, we propose a solution that uses LSTM and CNN models to predict the
potential defects that customers might find after the release of a product. We also predict the
time (in weeks or months) within which the customer might face the issue. This knowledge
helps engineering teams to prioritize the defects and proactively resolve them on time,
thereby advising customers with a patch or an upgrade of the release.
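One plausible formulation, sketched below as an assumption rather than Cisco's implementation, is an LSTM text classifier over tokenized commit/defect descriptions that outputs the probability of a post-release, customer-found defect; the vocabulary size, layer widths and synthetic data are placeholders.

```python
# Illustrative sketch, not Cisco's model: an LSTM text classifier over
# commit/defect descriptions predicting post-release customer impact.
import numpy as np
from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN = 5000, 100                            # placeholder sizes
X = np.random.randint(1, VOCAB_SIZE, size=(256, MAX_LEN))  # token ids
y = np.random.randint(0, 2, size=(256,))                   # 1 = escaped to customer

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 64),
    layers.LSTM(32),
    layers.Dense(1, activation="sigmoid"),   # probability of escape
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=32, verbose=0)
```

A parallel regression head, or a bucketed classifier over the same encoding, could produce the weeks-or-months estimate mentioned above.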

169
CISCO SERVICES DIGITIZATION USING CHAT BOTS AND
MACHINE LEARNING

Jamuna Ranganathan
IT Analyst
Cisco Systems India Pvt Ltd
Bangalore, India
jrangana@cisco.com
Sandip Mohanty
Associate Data Scientist
Cisco Systems India Pvt Ltd
Bangalore, India
sandipm@cisco.com

Abstract
Cisco, the market leader in the networking segment, continues to strengthen its brand by
improving the customer experience and integrating service quality within Cisco organizations
through an initiative called Cisco Services Digitization - Predict RMAs.
Cisco Services ships out a high number of product replacements to customers for some
product lines, incurring costs to Cisco.
The problem statement: understand the customer problem details reported and predict
whether the issue warrants an RMA.
This use case enables a troubleshooting BOT assistant to aid the Cisco Customer Support
Engineers (CSEs) who interact with customers and resolve customer issues effectively.
A Service Request (SR) is created by the customer with a problem description, the symptoms
of the issue and any attempts to troubleshoot and fix it.
Cisco CSEs work on the tickets and determine whether the issue is hardware or software
related. For hardware issues, the CSE sends replacement PIDs to the customer, a process
called Return Material Authorization (RMA). For software issues, the CSEs troubleshoot the
issue to resolve it.
Using machine learning techniques, we built a Prediction Engine that predicts RMA or
no RMA for the SR.
The Prediction Engine is a combination of three models: a Naïve Bayes classifier, a
Logistic Regression model and a Question Tree technique.
The NB model segregates the problem description into keywords by lemmatizing and
tokenizing the text. The keywords are then processed to build a component-symptom
dictionary using NLP through POS tagging. It computes the probability of an RMA versus
no RMA for a given problem description.
The Logistic Regression model works on past customer and product behaviour to compute
its probability score. It considers attributes such as rma_rate_of_customer,

170
rma_rate_of_product, case_rate_of_product, case_rate_of_customer_pid and an issue
classification obtained through topic modelling to calculate the probability of an RMA and
no RMA.
Question Trees leverage past known and frequently occurring issues to narrow down to an
outcome.
The final score is calculated after determining optimum thresholds for all the models through
various iterations over the training data.
The interaction between the Cisco CSE and the BOT happens over Cisco SPARK (a
collaboration tool).
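For concreteness, a minimal sketch of how the three component scores might be blended with a tuned threshold follows; the weights and threshold here are illustrative placeholders, not the production values.

```python
# Illustrative combination of the three component scores; the weights and
# threshold are placeholders tuned on training data in the real system.
def predict_rma(nb_score, lr_score, qt_score,
                weights=(0.4, 0.4, 0.2), threshold=0.5):
    """Each score is P(RMA) in [0, 1] from one component model."""
    final = (weights[0] * nb_score
             + weights[1] * lr_score
             + weights[2] * qt_score)
    return ("RMA" if final >= threshold else "No RMA"), final

label, score = predict_rma(nb_score=0.8, lr_score=0.6, qt_score=0.5)
print(label, round(score, 2))  # RMA 0.66
```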
Keywords: Chat BOT, Cisco SPARK, Naïve-Bayes, Logistic Regression, Question Trees,
Cisco RMA

171
A NEURAL NETWORK BASED ATTRITION PREDICTION MODEL
FOR A MANUFACTURING PLANT
Dr. Pratyush Banerjee*
Visiting Assistant Professor
Department of Management, BITS Pilani
Pilani, Rajasthan, India
pratyush.banerjee@pilani.bits-pilani.ac.in
BhanuTeja Y
Final Year Student, MBA (HR)
IBS Hyderabad
Hyderabad, Telangana India

Abstract
In this study, HR managers from a large manufacturing plant located in South India approached
the authors with their problem of high attrition. The management wanted the authors to develop
a retention prediction model and provided data for 580 employees, of which 464 had left the
plant by 2016 and 116 were still working in 2017. The dataset contained several demographic
variables such as gender and designation, as well as exit-interview survey feedback on quality
of life, career opportunity, leadership perception and general job satisfaction, to name a few.
A supervised multi-layer perceptron neural network algorithm was used with the help of
SPSS 20 to understand the dominant factors that had led to the employee attrition in the firm.
The testing model outperformed the training and holdout models in terms of predicting
attrition, though the accuracy was not very high; all three models were robust in predicting
retention. The final model identified 8 employees from the current lot of 116 existing
employees (7%) as possible churners in the near future. An independent variable importance
analysis gave the top three factors for employee churn as organizational culture, the leadership
style of the middle managers and the reward structure. Recommendations were made to the
management to improve perceptions of organizational culture through a culture rebranding
exercise, to debrief the middle-level managers about the negative impact of their leadership
style on subordinates, and to revisit the reward structure. Certain limitations with respect to
data access reduced the robustness that the model could achieve. Implications for HR
practitioners and managers are discussed at length, covering the possibilities and challenges
of implementing this technique in the corporate world.
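The study was carried out in SPSS 20; for readers who prefer code, a rough scikit-learn analogue of the multi-layer perceptron set-up is sketched below, with a synthetic feature matrix standing in for the confidential exit-interview data.

```python
# Rough scikit-learn analogue of the SPSS multi-layer perceptron; the
# feature matrix here is synthetic, standing in for the confidential data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(580, 10))           # demographics + survey scores
y = np.r_[np.ones(464), np.zeros(116)]   # 1 = left, 0 = stayed

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000,
                                  random_state=0))
clf.fit(X_train, y_train)
print("holdout accuracy:", clf.score(X_test, y_test))

# Flag employees whose predicted attrition probability is high.
at_risk = clf.predict_proba(X_test)[:, 1] > 0.8
print("flagged employees:", int(at_risk.sum()))
```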

172
A NOVEL MACHINE LEARNING BASED APPROACH
FOR GENERIC FORM PROCESSING

Arun Pavuri*
Business Analyst
Analytics, Genpact
Bangalore, India
arun.pavuri@genpact.com
N V S S Koundinya
Jr. Data Scientist
Analytics, Genpact
Bangalore, India
koundinya.nvss@genpact.com
Chirag Jain
Senior Data Scientist
Analytics, Genpact
Bangalore, India
chirag.jain4@genpact.com

Abstract

Recent breakthroughs in Artificial Intelligence (AI) have had a transformative effect in a
number of fields. Automation is an area where AI is playing a key role, and automation today
straddles a variety of application areas within business operations. Form/invoice processing is
a tedious and repetitive process requiring huge investment from corporates for the ecosystem
set-up. Today's automated form processing solutions can only handle standard, preconfigured
templates. However, in the real world, form types are not standard and therefore
require human intervention to understand their structure and extract the relevant information.
The authors present a machine learning based approach to handle such inconsistencies.
Optical character recognition (OCR) has been in use for a number of years to extract
readable content from scanned media, and recent advancements in OCR technology have
enabled information extraction from tabular content in the media while retaining the tabular
structure of the content. Additionally, OCR engines have started capturing metadata about
the content, such as position, font type and size, with additional attributes like bold or
regular, italics, etc. The proposed algorithm leverages this metadata to handle the challenges
of automatic information extraction from inconsistent or new form types. The algorithm also
uses contextual features from various neighbouring blocks of content to identify the relevant
field. Such an application will play a pivotal role in automating the entire form/invoice
processing operation while cutting costs significantly for the business.
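As an illustration of the kind of metadata the approach relies on, the sketch below uses the open-source Tesseract engine via pytesseract (an assumption; the abstract does not name an engine) to pull word positions and read the value to the right of a label.

```python
# Sketch using open-source Tesseract via pytesseract (the abstract does not
# name an engine): extract words with positional metadata, then read the
# value to the right of a label such as "Invoice".
import pytesseract
from pytesseract import Output
from PIL import Image

img = Image.open("invoice.png")            # hypothetical scanned form
data = pytesseract.image_to_data(img, output_type=Output.DICT)

words = [
    {"text": data["text"][i], "left": data["left"][i],
     "top": data["top"][i], "width": data["width"][i]}
    for i in range(len(data["text"])) if data["text"][i].strip()
]

def value_right_of(label, tolerance=10):
    # Find the label word, then the nearest word on roughly the same line.
    anchors = [w for w in words if w["text"].lower() == label.lower()]
    if not anchors:
        return None
    a = anchors[0]
    same_line = [w for w in words
                 if abs(w["top"] - a["top"]) < tolerance
                 and w["left"] > a["left"] + a["width"]]
    return min(same_line, key=lambda w: w["left"])["text"] if same_line else None

print(value_right_of("Invoice"))
```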
Keywords: Artificial Intelligence, Machine Learning, OCR, Ensemble learning

173
TRACTABLE MACHINE LEARNING FRAMEWORK FOR IoT
SENSOR DATA

Bhawani Shankar Leelar*


PhD Scholar
Department of ECE, Indian Institute of Science
Bengaluru, India
bhawanishank@iisc.ac.in
E. S. Shivaleela
Principal Research Scientist
Department of ECE, Indian Institute of Science
Bengaluru, India
lila@iisc.ac.in
T. Srinivas
Associate Professor
Department of ECE, Indian Institute of Science
Bengaluru, India
tsrinu@iisc.ac.in

Abstract
The volume of data acquired from various applications such as medical,
weather, financial, Internet of Things (IoT) etc. poses various challenges in handling,
understanding and predicting the state of the system. In analyzing IoT data, mainly three types
of challenges occur. Firstly, heterogeneous data comes with a lot of
imperfections – errors, missing values, different scales/units etc. – so it must be handled
carefully, because a machine learns different biases for the various components of heterogeneous
data. Secondly, it is hard to evaluate the state of the system when the source is unknown and
it is unclear for which target application the data was collected. Thirdly, with varying data
characteristics, a lot of experimentation is needed to determine which Machine Learning (ML)
model would represent the data best, as a considerable number of ML models are available,
with many inherent minor variations that are better suited only to particular situations. To
address these issues, we propose in this paper a complete framework with a unified language
for data homogeneity, data integration and state prediction. The first challenge, of
different scales and units, is resolved by a Symbol Algebra (SA) approach that assigns symbols
to data values or ranges; this unified language framework can process both continuous and
discrete data. Secondly, we assign states to the system from which the data is collected –
weather data, pollution data, etc. This approach helps in training the ML models, with the
states providing the supervised target when predefined state values are not available. Thirdly, we
create a hypotheses space from popular ML models and use Bayesian Machine Learning
(BML) to tune the predictor variable and select the best model from the hypotheses space to
represent the new data. We have tested our framework on both homogeneous and
heterogeneous data sets, which are freely available from the CityPulse project, to evaluate its
performance in different environments. We created a hypotheses space with 6 ML models –
Logistic Regression (LR), Linear Discriminant Analysis (LDA), k-Nearest Neighbours (k-NN),

174
Classification And Regression Tree (CART), Naive Bayes (NB) and Support Vector
Machine (SVM). We have shown that for the homogeneous data set, SA imposed
an upper limit on efficiency due to discretization loss when compared with the
results obtained without SA, but it yielded an improvement in the cases where efficiency was
low without the use of SA. For heterogeneous data, we obtained improved efficiency when the
models were trained on symbols. We chose the best model for new data from the 12 candidates
(6 with symbols and 6 without) using the proposed BML. This framework can be applied to
any system where a little noise can be tolerated. Our future work is on methods for analyzing
sensitive systems, such as predicting the health status of patients, which involve heterogeneous
data with very low noise tolerance.
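A minimal sketch of scoring a hypothesis space built from the six named model families and keeping the best by cross-validation follows; the synthetic data is a placeholder, and the BML tuning and symbol-algebra preprocessing are not shown.

```python
# Sketch: score a hypothesis space of the six named model families by
# cross-validation and keep the best; the BML tuning itself is not shown.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

hypotheses = {
    "LR": LogisticRegression(max_iter=1000),
    "LDA": LinearDiscriminantAnalysis(),
    "k-NN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=0),
    "NB": GaussianNB(),
    "SVM": SVC(),
}
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in hypotheses.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```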
Keywords: Machine Learning, Big Data, IoT, Unified Language Framework, Predictive
Analysis

175
A SECURE PROTOCOL FOR HIGH DIMENSIONAL BIGDATA
PROVIDING DATA PRIVACY

Prasad S P
Asst. Professor
Dept. of ISE, DSATM
Bangalore, India
prasadsp.prakash@gmail.com
Anitha J
Professor
Dept. of CSE, DSATM
Bangalore, India
anitha.jayapalan@gmail.com

Abstract

Due to recent technological developments, a huge amount of data is generated by social
networking, sensor networks, the Internet, healthcare applications and many other sources.
This data, which may be structured, semi-structured or unstructured, adds challenges to
data storage and processing tasks. During Privacy Preserving Data Processing
(PPDP), the collected data may contain sensitive information about the data owner. Directly
releasing this information for further processing may violate the privacy of the data owner;
hence the data needs to be modified in such a way that it does not disclose any personal
information about the owner. On the other hand, the modified data should still be useful, so
as not to defeat the original purpose of data publishing. The privacy and utility of data are
inversely related to each other. Existing privacy preserving techniques like k-anonymity and
t-closeness focus on anonymizing data that has a fixed schema with a small number of
dimensions. There are various types of attacks on data privacy, such as linkage attacks,
homogeneity attacks and background knowledge attacks. To provide an effective technique
for maintaining data privacy and preventing linkage attacks in big data, this paper proposes a
privacy preserving protocol, UNION, for multi-party data providers with KCL
anonymization. Experiments show that this technique provides better data utility for handling
high dimensional data, and better scalability with respect to data size, than existing
anonymization techniques.
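For context on the k-anonymity baseline discussed above, a minimal pandas sketch of checking k-anonymity over a set of quasi-identifiers follows; the columns are illustrative, and the UNION protocol itself is not reproduced here.

```python
# Illustrative check of k-anonymity over quasi-identifier columns: every
# combination of quasi-identifier values must occur at least k times.
import pandas as pd

df = pd.DataFrame({
    "age_band": ["30-39", "30-39", "30-39", "40-49", "40-49"],
    "zip3": ["560", "560", "560", "110", "110"],
    "disease": ["flu", "flu", "cold", "flu", "cold"],  # sensitive attribute
})

def is_k_anonymous(frame, quasi_ids, k):
    return frame.groupby(quasi_ids).size().min() >= k

print(is_k_anonymous(df, ["age_band", "zip3"], k=2))  # True
print(is_k_anonymous(df, ["age_band", "zip3"], k=3))  # False
```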
Keywords: Big Data, Anonymization, k-Anonymity, t-Closeness, Privacy Preserving Protocol.

176
LAWBO: A SMART LAWYER CHATBOT

Unnamalai N
Data Science Associate
Probyto, Coimbatore, India
unnamalai.n@probyto.com
Kamalika G
Data Science Associate
Probyto, Coimbatore, India
kamalika.g@probyto.com
Shubhashri G
Data Science Intern
Probyto, Coimbatore, India
shubhashri@probyto.com
Karthik Ramasubramanian*
Director
Probyto, Coimbatore, India
karthik@probyto.com
Abhishek Kumar Singh
Director
Probyto, Coimbatore, India
abhishek@probyto.com

Abstract
Artificial intelligence (AI) has evolved to the stage where it can parse intentions and churn
out useful responses to practical queries. Chatbots are AI-driven pieces of software that
converse in human terms. They are not quite ready to pass the Turing test, but ready enough
for many forms of commerce and messaging. With the advent and rise of chatbot
adaptability, the question is not only how to make chatbots but also where to use them next. In
the recent past, chatbots have found applications ranging from travel and personal finance to
productivity and retail. When it comes to conversing and understanding like humans, one of
the most intricate domains for chatbots is the judicial system. One needs to pore over
volumes of legal books and judgement papers to analyze and investigate a case.
"Justice delayed is justice denied!" Time being the most valuable factor in this domain, a chatbot
seems to be a good investment for helping legal professionals save time and effort in probing
a case. LAWBO could guide and give potential ideas in drawing parallels between cases and,
at the same time, answer queries and fetch and derive relevant knowledge from the humongous
amount of legal data and provide it to the lawyers. We use a combination of heuristics applied
to data extracted from Supreme Court judgments using in-house developed, state-of-the-art
parsers, and dynamic memory networks (DMN) for Natural Language Processing (NLP). A DMN
is a neural network architecture that processes input sequences and questions, forms episodic
memories and generates relevant answers, which is essentially how chatbots function. The training
for question answering tasks relies exclusively on trained word vector representations and
input-question-answer triplets generated by our parsers from the judgment papers.
Keywords: Chatbot, Smart Lawyer, Deep Learning, Dynamic Memory Network, Natural
Language Processing, Artificial Intelligence

177
MARKET WATCH - MOBILE APPLICATION SENTIMENT
ANALYSIS FOR UNDERSTANDING THE VOICE OF THE APP USERS

Mr. Rahul Kumar


Senior Data Scientist
Mercedes Benz Research & Development India
Private Ltd.
Bangalore, INDIA
Mr. Lakshminarayanan Sreenivasan
Senior Data Scientist
Mercedes Benz Research & Development India
Private Ltd.
Bangalore, INDIA
Lakshminarayanan.sreenivasan@daimler.com

Abstract

With the number of mobile applications increasing rapidly and users frequently moving from
one application to another, there is a need to capture and understand user sentiment
around the usage of each mobile application. Understanding user
sentiment also helps in improving the mobile app features in subsequent releases.
There is a tremendous increase in users voicing their opinions freely in social media forums
or in comments on the specific mobile applications. These comments are meant to benefit other
users or to caution them against using/installing the mobile app. The user comments come in
the form of free-flowing unstructured text or natural language, and there are now specific tools
available to understand and analyse them in a quick and meaningful manner.
With this paper we try to achieve the following:
how to extract the entities and group them by their usefulness to the business,
how polar the users' opinions are,
how considering words that occur together helps in giving a wholesome
meaning, and finally,
representing the sentiment in a visual dashboard for each mobile
application, segregated by region, device or even the OS version of the app.
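As a toy illustration of the polarity measurement above, the sketch below scores invented app reviews with NLTK's off-the-shelf VADER lexicon; the production pipeline described in this paper would add entity grouping, co-occurrence analysis and the dashboard layer.

```python
# Sketch: polarity scoring of (invented) app reviews with NLTK's VADER
# lexicon; the real pipeline adds entity grouping and dashboards.
import nltk
nltk.download("vader_lexicon", quiet=True)
from nltk.sentiment import SentimentIntensityAnalyzer

reviews = [
    "Love the new navigation screen, very smooth!",
    "App crashes every time I open it after the update.",
]
sia = SentimentIntensityAnalyzer()
for r in reviews:
    score = sia.polarity_scores(r)["compound"]  # -1 (neg) .. +1 (pos)
    print(f"{score:+.2f}  {r}")
```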

Keywords: Natural language Processing (NLP), Big Data, Deep Learning, Sentiment Analysis,
Market watch, Human Machine Interface.

178
OTHERS

179
CONVERSATIONAL USER INTERFACE: A PARADIGM SHIFT IN
HUMAN-COMPUTER INTERACTION

Priya Gupta*
Assistant Professor
Maharaja Agrasen College,
University of Delhi
Delhi, India
pgupta1902@gmail.com

Abstract

In times like today, when the design of technology has to be all-inclusive and usable, it
becomes important to understand why and how conversational user interfaces (CUIs) are
becoming the vanguard of improved human-computer interaction. For example, a
user might not be comfortable with the keypad interface of the mobile screen, say because the
diameter of his/her finger is bigger than the on-screen letters, or might prefer working on
desktops rather than on cell phones for a better user experience. Such issues often hamper
the user experience. CUIs are an alternative to the existing conventional Graphical User
Interfaces (GUIs): users interact with the machine through human-like conversations rather
than buttons or icons/graphics. This paper elucidates the recent technology of CUIs in the
realm of existing personal assistants and reasons out the dire need for CUIs in the present
dynamic technological world, where users not only want to use technology to accomplish
their tasks but rather to communicate with it through human-like responding user interfaces,
fulfilling their needs quickly, with ease and with less effort and frustration, to achieve an
overall satisfying user experience. With 'chatbots' in the picture, verbal interactions can be a
substitute for tedious, confusing or even frustrating manual tasking, which will result in
increased and more satisfied user engagement with computer interfaces. Unlike personal
assistants such as Siri, Google Now and Cortana, CUIs let the users interact 'with' the
interface and not 'at' the interface. Not only is this amalgamation flexible, but it also redefines
the human-computer interface with a whole new perspective on user experience, for instance
enabling visually impaired people to operate a lift, an ATM or even any web app by
communicating with chatbots rather than by conventional touch and button-pushing.
Chatbots can prove to be a boon as interactive assistants in such scenarios and thereby make
way for efficient human-computer interaction, which is the goal. Much has been studied and
said about HCI in the history of computing, and the same holds for chatbots. However, this
study attempts to establish the relationship between these two 'related' but 'never integrated'
concepts of human computing by studying the dependency and effect of one on the other.
This impact of CUIs on human-computer interaction is studied further.

Keywords: Human Computer Interaction, User Experience, Conversational User Interfaces, GUI, Chat Bots

180
THE EXISTENCE OF ANALYTICAL MYOPIA AND NEED FOR BIG
DATA PSYCHOLOGISTS IN BUSINESS INTELLIGENCE

Dhriti Malviya
Student
Narsee Monjee Institute of Management Studies
Mumbai, India
dhriti.malviya@gmail.com
Himanshu Upadhyay
Senior Research Scientist
Florida International University: Applied Research Centre
Miami, USA
upadhyay@fiu.edu
Walter Quintero
Research Scientist
Florida International University: Applied Research Centre
Miami, USA
quinterw@fiu.edu

Abstract

The proliferation of real-time mobile applications and connected devices (smartphones, smart
home devices, wearable accessories like fitness bands etc.) coupled with the advent of social
media, has given rise to a new era of data science. In addition, the emergence of integrated
technologies and cross-application data collection has spelled a golden age of behavioral data
science - an age where data no longer reflects who we are but instead helps determine it. Today
almost all businesses have begun to actively invest in analytics personnel and software so as
to be able to monitor every move of the consumer throughout the day. While
demonstrating ways in which businesses can effectively leverage the heaps of data generated,
this paper introduces the phenomenon of analytical myopia, of which businesses are often
victims. It further explains how psychology can play a role in avoiding this myopia: by
drawing accurate behavioral insights from processed consumer data, testing known
assumptions, and building a foundation and culture for the smooth exploration of data. The
paper thus proposes the concept of big data psychologists, an undeniably crucial concept for
modern-day businesses seeking competitive advantage. The demonstration in the paper is done
using Natural Language Processing (NLP) methods such as text analytics in the R language,
for sentiment analysis of consumer data from social networking sites. Practical use cases of
cross-application data analytics involving the testing of inferences against psychological
theory are described to further illustrate the scope of behavioral data science.

Keywords: Big Data, Natural Language Processing, Sentiment Analysis, Consumer Behavior

181
ALGORITHMIC BIAS – CAN MACHINE LEARNING GO WRONG?

Pranay Tiwari
Senior Analyst – HR Analytics
Bangalore, India
Pranay.tiwari@fmr.com

What if the teacher that taught you how to learn was biased in the first place? Artificial
Intelligence (AI) and Machine Learning (ML) are behind some of the most critical systems we
use – at work, when we communicate, while travelling and while managing our daily chores.
But as AI enabled machines acquire human-like abilities, it seems they are also being
ingrained with biases that one would not generally associate with algorithms.

The Bias of ML algorithms


- After Pokémon Go was released, several users noted that there were fewer Pokémon
locations in primarily black neighbourhoods. Urban Institute researchers found an
average of 55 PokéStops in majority white neighbourhoods and 19 in majority black
neighbourhoods.

- A Google photo application made headlines when it mistakenly tagged black people as
gorillas.

182
Histories of bias may live on in digital platforms and can become part of everyday
algorithmic systems, leading to discrimination in the results the ML algorithms throw out.

The paper looks into the kinds of biases that creep into ML algorithms, their impact, and
how we can overcome these biases and avoid feeding them into the algorithms we create.

Types of ML & Human Biases:


Human biases become part of ML algorithms in three different ways:

Interaction Bias – Microsoft's Twitter bot 'Tay' learned responses from interaction with
Twitter users, who started tweeting all sorts of misogynistic and racist remarks at the bot,
which the bot later started parroting.

Latent Bias – When unobserved patterns in the learning data cause the bias. E.g. on
LinkedIn, a search for a female contact may yield website responses asking if the searcher
meant to search for a similar-sounding man's name.

Selection Bias – When the data used to train the algorithm over-represents certain groups
or scenarios at the expense of others.

Impact of Biased algorithms

When humans feed their biases into the algorithms they create, the programs can amplify
the inequalities of our past and affect the most vulnerable sections of society. If we train a
model on historic human data, it will also learn the biases and stereotypes that the data
carries.

183
How to treat algorithmic bias

Removing bias requires auditing of algorithms and data. This can be done in the following ways:
1. Ensure the data is representative of the population under question (a small sketch of
such a check follows this list)
2. Scrutinize algorithms and raise flags when their integrity comes into question
3. Study the various possible scenarios that could lead to interaction, latent and selection
bias
4. Examine the definition of success – the goals themselves should be unbiased
5. Monitor long-term effects
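As a concrete illustration of the first point, here is a minimal sketch that compares positive-outcome rates across groups and applies the common four-fifths rule of thumb; the data and threshold are illustrative assumptions.

```python
# Illustrative audit for step 1: compare positive-outcome rates across
# groups and flag a large gap (invented data, rule-of-thumb threshold).
from collections import defaultdict

records = [("A", 1), ("A", 1), ("A", 0), ("B", 1), ("B", 0), ("B", 0)]

totals, positives = defaultdict(int), defaultdict(int)
for group, outcome in records:
    totals[group] += 1
    positives[group] += outcome

rates = {g: positives[g] / totals[g] for g in totals}
ratio = min(rates.values()) / max(rates.values())
print(rates, "disparate impact ratio:", round(ratio, 2))
if ratio < 0.8:  # common four-fifths rule of thumb
    print("Warning: outcomes may be biased across groups")
```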
Conclusion
As ML algorithms handle more complex and consequential decisions, organizations may also
want to be proactive about ensuring that their algorithms do good, so that their companies can
use Machine Learning to do well.

Keywords: AI, ML, ML Bias, Algorithm

Disclaimer: The views, opinions, findings or recommendations expressed in this paper
are strictly those of the author. They do not necessarily reflect the views of any specific
industry.

184
TRUST AND EXPLAINABILITY FOR ARTIFICIAL INTELLIGENCE

Ashutosh Verma
Lt Col
Indian Army
avermaa@gmail.com

Abstract
The availability of very high volumes of data and the advances made in techniques to
analyse them have led to several applications in critical areas, providing predictions with high
impact and strategic implications, e.g. military, medical and predictive policing. Instances
have been observed where the predictions made were biased with respect to a legally
protected class such as race, which reduces user trust in Big Data Analytics and AI.
Widespread adoption of Big Data and AI has also been slow because of a sense of mistrust
towards the predictions and the opacity of modern algorithms like DNNs. This paper surveys
work done to highlight the issues with undesirable and biased predictions, measures
proposed to instil trust in applications, and possibilities of adding 'explainability' to
algorithms ab initio, bringing them closer to Explainable Artificial Intelligence (XAI).
XAI has significant advantages in the military domain with respect to the OODA loop, and the
paper discusses the possible effects of XAI in shortening the OODA loop. The paper also aims
to encourage researchers to add 'explainability' to their algorithms, which can considerably
enhance their usability.
Keywords: AI, XAI, Explainability, Trust, OODA, Military, DNN.

185
DEEP AUTOREGRESSIVE IMMUNE SYSTEM LEARNING
NEURAL CORRELATES OF DECISION MAKING

Vishwambhar Pathak
Assistant Professor
Dept of CSE, BIT MESRA Jaipur Off Campus
Jaipur, India
v.pathak@bitmesra.ac.in

Abstract
The paper proposes a deep (multi-layered, distributed processing, time-varying)
autoregressive immune system (DARIS) model for extracting features from multi-channel
EEG signals so as to determine neuromarkers related to decision making by a subject under
certain stimulations affecting perception. The utility of this work lies in the field of
neuroeconomics, which uses regional activity differences and other clues to elucidate the
principles of brain functioning associated with particular cognitive functions, given hypotheses
about decision making under various circumstances, especially those affecting business.
Essentially, the model computes optimal measures of connectivity among the brain regions
activated during the process of decision making by the subject under given experimental
stimulations. This work is based on a reference model, the Dendritic Cell Algorithm for Time-
Series (DCATS), which was shown to be suitable for multiple time series forecasting due to its
robust multilayered distributed recognition mechanism and its conserved comprehensive memory
for the parameters of the multivariate autoregressive model. The model proposed in this
paper extends DCATS to achieve training through the time-varying characteristic parameters of
EEG signals from multiple sensors, in a way that addresses the issues of neural
connectivity analysis. The multilayered mechanism evaluates an ensemble of models with
different measures quantifying the partial and directed coherence relationships among the
extracted components. The immune-ensemble dynamics compares sub-models for their ability
to track parameter changes and for the resolution of their time-frequency representations. The
goodness-of-fit of the model was measured with Schwarz's Bayesian Information Criterion (SBC).
Model validation was performed using simulations with a synthetic dataset. Further, the trained
model was applied to learn cortical region connectivity using the sample dataset available in the
EEGLAB tool, consisting of 32-channel signals recorded while performing the task of
selecting the position of 'Square' and 'Rectangle' objects presented to a single subject.

Keywords: Perceptual Decision Making, Functional Neural Connectivity, Multivariate Autoregressive Model, Dendritic Cell Algorithm (DCA), Deep Learning Artificial Immune System

186
AI ENABLED CONVERSATIONAL ENTITY FOR PRODUCT
RECOMMENDATION

Kajal Negi*
Data Scientist
Analytics, Genpact
Bangalore, India
kajal.negi@genpact.com
Srishti Khatri
Jr. Data Scientist
Analytics, Genpact
Bangalore, India
srishti.khatri@genpact.com
Chirag Jain
Sr. Data Scientist
Analytics, Genpact
Bangalore, India
chirag.jain4@genpact.com

Abstract

With a surge in automation in recent times, Artificial Intelligence (AI) has become an integral
part of our lives. Given the growth of the ecommerce industry in the last five years, improving
the customer's digital experience has become a vital aspect of business. While there is a lot of
manual work currently in place to meet these challenges and needs, 24x7 human support
is very expensive for any corporate to maintain. Chat bots bridge this gap by bringing a fresh
perspective to personalized customer experience while simultaneously serving the purpose.
A chat bot is an automated conversational application that simulates conversation with human
users over the internet. The potential of chat bots has been explored in a variety of use cases to
replace or aid existing systems; however, they still have a long way to go to completely
replicate human behavior. A number of conversational applications are already in place to
automatically accept purchase orders from users over the internet. We have further extended
this online sales use case by enabling an AI driven conversational entity to make real-time
recommendations. The application provides a highly personalized experience for online
purchases, conversing with consumers to upgrade or suggest additional products matching
their past or ongoing purchases. In other words, the developed conversational entity serves
as a smooth channel for up-selling and cross-selling products in the ecommerce market and
provides a communication medium for the recommender system. The entity can go through
multiple loops in the same session and can take in various attributes of the product before the
user confirms a purchase or discontinues. It tries to strike a balance between the user's
requirements and likings, and suggests the best buy while negotiating on various attributes of
the products. The conversational entity broadly covers requests of different types, ranging from
user preferences to suitably fitting products, making it a more flexible platform for attending
to requests. To meet the needs of the
187
recommender system, an ensemble of algorithms works in the back end to generate personalized
suggestions for customers. In the front end, the conversational entity uses machine learning,
text analytics and natural language processing techniques to capture the intent of the user's
input text and analyze it further to extract the information needed to complete the request.

Keywords: Chat Bot, Text Analytics, Artificial Intelligence, Natural Language Processing

188
EFFECT OF PCA AND APPLICATION OF MACHINE LEARNING
ALGORITHMS ON PLANT LEAF CLASSIFICATION

Arun Kumar*
Associate Professor
Sir Padampat Singhania University, Udaipur, India
arunkumarsai@gmail.com
Poonam Saini
Assistant Professor
Sir Padampat Singhania University, Udaipur, India
poonam.saini9@gmail.com

Abstract
The survival of the human species has depended on the use of plants to
satisfy its daily needs; plants provide food, shelter, clothing etc. Biologists
have been studying plants and their species for many centuries, but recent
developments in the name of infrastructure have led to the reckless building of roads,
houses and bridges by felling trees and destroying vegetation. There is therefore an urgent
need to study plants and their species and to classify them taxonomically before they become
extinct. Plants have been studied using their flowers, seeds, stems, leaves and roots, and with
the rapid development of machine intelligence and computer vision, the taxonomic
classification of plants has picked up speed. The aim of the present work is to further improve
taxonomic classification results by applying PCA and machine learning algorithms to dorsal
and ventral plant leaves. After the initial image processing tasks on the
dorsal and ventral leaf images, the hue channel of the HSI images is subjected to feature
extraction using first order statistical features and texture based features. As the
feature set is very large, the features are studied through their principal
components (PCA) and an appropriate subset of features is selected using Random Forest
feature selection. This work uses both the full feature set and the selected subset, each
independently subjected to plant leaf classification on dorsal and ventral leaf images using
machine learning classifiers such as KNN and Random Forest. The results for both kinds of
datasets (i.e. dorsal and ventral leaf images) are compared with works of a similar nature.
In this comparative analysis, it has been observed that the feature set prepared from ventral leaf
images, using either the first order statistical method or the texture feature extraction method,
fares better than its dorsal counterpart when classifying the images of the ten
plant species used in the present work. It has also been observed that combining the first
order statistical features with the texture features improves the predictive accuracy
drastically. The most important outcome of this investigation is that ventral leaf images
can be a better basis for plant species classification using digital images, and that the
fusion of the two feature types does improve the accuracy results.
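A minimal scikit-learn sketch of the described pipeline follows (PCA inspection, Random Forest based feature selection, then KNN and Random Forest classification); the synthetic feature matrix stands in for the extracted leaf features.

```python
# Sketch of the described pipeline on synthetic stand-in features:
# PCA inspection, Random Forest based feature selection, then KNN / RF.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=400, n_features=40, n_informative=10,
                           n_classes=10, n_clusters_per_class=1,
                           random_state=0)

pca = PCA().fit(X)                       # inspect the variance structure
print("components for 95% variance:",
      (pca.explained_variance_ratio_.cumsum() < 0.95).sum() + 1)

selector = SelectFromModel(RandomForestClassifier(random_state=0)).fit(X, y)
X_sel = selector.transform(X)            # RF-importance feature subset

for name, clf in [("KNN", KNeighborsClassifier()),
                  ("RF", RandomForestClassifier(random_state=0))]:
    print(name, cross_val_score(clf, X_sel, y, cv=5).mean().round(3))
```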
Keywords: Random Forest, KNN, Machine Learning Algorithms, Statistical Features, Leaf
Image Classification.

189
APPLICATIONS OF DEEP LEARNING TO AUTONOMOUS
VEHICLES

Vatsal Srivastava
Bangalore, India
vatsalbits@gmail.com
Sumit Binnani
University of California San Diego
San Diego, USA
sbinnani@eng.ucsd.edu

Abstract

Self-driving or autonomous vehicles have gained a lot of traction over the last few years and are
undoubtedly a popular application of recent developments in Deep Learning. One of the
most important tasks any such vehicle faces is the accurate perception of its
surroundings, which is achieved through the use of on-board cameras. In order to successfully
use the information generated by the cameras, an efficient system of computer vision
algorithms must be used. Deep Learning has shown tremendous success in such tasks, and many
of the state-of-the-art algorithms are now powered by Deep Learning. Convolutional Neural
Networks (CNNs) help in the classification of the various objects in the image, like
traffic signs and traffic lights. Similarly, Fully Convolutional Networks (FCNs) help in
semantic segmentation, for example identifying the pixels that correspond to the road
and those that do not. They also help in the recognition and classification of the various objects on
the road, like pedestrians, other vehicles, etc. In this paper, we look at the Deep
Learning techniques currently used in the industry for the successful navigation of
autonomous vehicles. We explore some feature selection and pre-processing
techniques to make the model robust and to reduce the size of the data. We then
present a Deep Learning based approach that can perform various tasks like traffic sign
classification, object detection, semantic segmentation and lane detection. We have used an
openly available dataset of images taken from the camera of an autonomous vehicle for training
and testing our model. Our model is able to identify the lanes, identify the traffic signs and
other vehicles, and classify the traffic signs and traffic lights. It also generalizes well
to dynamic road conditions. We are also working with Udacity to deploy and integrate our
model on a real vehicle and validate the accuracy of our system.
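As an illustration of the traffic-sign classification component only, a minimal Keras CNN is sketched below; the 32x32 input size and 43-class label count (as in the public GTSRB benchmark) are assumptions, and the placeholder arrays stand in for real camera crops.

```python
# Sketch of a CNN traffic-sign classifier; 32x32 RGB inputs and 43 classes
# (as in the public GTSRB benchmark) are assumptions for illustration.
import numpy as np
from tensorflow.keras import layers, models

X = np.random.rand(128, 32, 32, 3)        # placeholder camera crops
y = np.random.randint(0, 43, size=(128,))

model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(43, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=32, verbose=0)
```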

Keywords: Autonomous Vehicle, Deep Learning, Convolutional Neural Networks, Fully Convolutional Networks, Transfer Learning

190
TEXT ANALYTICS IN SOCIAL STREAMS USING ARTIFICIAL
NEURAL NETWORKS

KAVIYA. K & SHANTHINI. K. K
M.Sc. Software Systems
Department of Computing
Coimbatore Institute of Technology
Coimbatore
kaviya14998@gmail.com, shnthnikk@gmail.com
Dr. M. SUJITHRA
Assistant Professor
Department of Computing
Coimbatore Institute of Technology
Coimbatore
sujisrinithi@gmail.com

Abstract

With the rapid increase of online social networks, social media has become a platform for a
huge amount of sentiment data in the form of tweets, blogs, status updates, posts,
etc. Text analytics, also known as text or opinion mining, is a subcategory of Natural Language
Processing (NLP). Information retrieval techniques mainly focus on processing, searching or
analysing the available data, whose contents are mainly opinions, sentiments, appraisals,
attitudes and emotions. The main goal of the proposed work is to apply text analytics to
social media using real-world examples. The proposed approach helps businesses gauge
customers' opinions and preferences, which is helpful in shaping business strategy. For
improved performance, an approach combining neural networks and fuzzy logic is
proposed.

Keywords: Text Analytics, Natural Language Processing, Artificial Neural Network, Fuzzy Logic

191
STACKING WITH DYNAMIC WEIGHTS ON BASE MODELS

Biswaroop Mookherjee*
Manager
Tata Consultancy Services (TCS)
Kolkata, India
biswaroop9000@gmail.com

Abhishek Halder
Senior Business Analyst
Tata Consultancy Services (TCS)
Kolkata, India
abhishek.halder171@gmail.com

Siddhartha Mukherjee
Machine Learning Developer
Tata Consultancy Services (TCS)
Kolkata, India
sidd.mukherjee@gmail.com

Abstract

Different techniques of Statistics and Machine Learning have different specialties, and
accordingly their performance differs across datasets. Stacking is used to combine more
than one technique through a second level model to achieve higher accuracy. The second
level model essentially uses the values predicted by the different base level models as independent
variables, while the dependent variable remains the observed one. Though the fit of the base level
models differs across various parts of the data, the second level model uses the same set of weights
on the base level models over the whole data. We have derived a method in which we replace the
second level model with a linear combination of base model outputs whose weights vary. In our
method, to classify a new observation, we select a part of the data based on a
predefined condition of proximity. Weights are then assigned to the different base models
according to their accuracy on that part of the data. The algorithm applies the same principle to each
new observation; thus, the weights vary. The new ensemble method has been tried on
datasets from different fields and found to give better results than conventional stacking.
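A minimal sketch of the idea follows: for each new observation, the base models are weighted by their accuracy on its nearest training neighbours. The neighbourhood size and the two base models are illustrative choices, not the authors' exact configuration.

```python
# Sketch of dynamically weighted stacking: for each test point, weight the
# base models by their accuracy on its k nearest training neighbours.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

bases = [LogisticRegression(max_iter=1000).fit(X_tr, y_tr),
         DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)]
train_preds = [m.predict(X_tr) for m in bases]

nn = NearestNeighbors(n_neighbors=25).fit(X_tr)

def predict_one(x):
    _, idx = nn.kneighbors([x])
    # Local accuracy of each base model on the neighbourhood -> weights.
    w = np.array([np.mean(p[idx[0]] == y_tr[idx[0]]) for p in train_preds])
    w = w / w.sum()
    probs = sum(wi * m.predict_proba([x])[0] for wi, m in zip(w, bases))
    return probs.argmax()

acc = np.mean([predict_one(x) == t for x, t in zip(X_te, y_te)])
print("dynamic-weight stacking accuracy:", round(acc, 3))
```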

192
DETECTION OF MALWARE APPLICATIONS IN MOBILE DEVICES
USING SUPERVISED MACHINE LEARNING ALGORITHMS

KRITHIKA SHREE. L
II Year, M.Sc. Software Systems
Department of Computing
Coimbatore Institute of Technology
krithikashree1999@gmail.com
Dr. M. SUJITHRA
Assistant Professor
Department of Computing
Coimbatore Institute of Technology
sujisrinithi@gmail.com

Abstract

Mobile devices have become an essential part of our daily life, which has increased the number
of mobile users. Mobile security systems are also very weak, and this invites a number of
hackers to develop mobile malware. Malicious applications are developed at a fast rate with
advanced features, and relying on the current detection trend is inefficient because detecting
them becomes difficult. The proposed framework intends to develop a
machine learning based malware detection system on Android to detect malicious applications
and to enhance the security and privacy of smartphone users. This system exploits supervised
machine learning techniques to distinguish between normal and malware applications. App
store markets and ordinary users can access our detection system for malware detection
through a cloud service.
Keywords: Malware, Goodware, Supervised, Mobile Devices, Malicious Applications, Machine Learning

193
USER AUTHENTICATION USING STEGANOGRAPHY FOR BIG
DATA IN MOBILE DATA CENTER

Abstract

Big Data is a collection of data from various sources. It is very useful and has been successful
in data centers, but keeping such a large amount of data private is one of the main security
issues, and it is closely related to authentication. User authentication and access to data from
multiple locations need to be controlled and secured. In particular, mobile users and social
network users share more and more personal data through their devices. There are many
cryptographic techniques and protocols currently in use to solve the problem of
authentication. This paper proposes an image based password generation method for the
authentication of users in a mobile data center. The proposed research uses the concept of a
soft dipole representation of the image to compute a unique password for each user, and a
steganographic technique for transferring the password securely to the users.
Keywords: Big data, Security, User authentication, Image password, steganography

194
THANK YOU

195
