
Paper 6: Management Information System

Module 20: Data Mining for Decision Support

Principal Investigator: Prof. S P Bansal, Vice Chancellor,
Maharaja Agrasen University, Baddi

Co-Principal Investigator: Prof. Yoginder Verma, Pro-Vice Chancellor,
Central University of Himachal Pradesh, Kangra, H.P.

Paper Coordinator: Prof. Manu Sood, Chairman, Department of Computer Science,
Himachal Pradesh University, Shimla

Content Writer: Dr. Ashish Saihjpal, Assistant Professor,
University Business School, Panjab University (RC), Ludhiana

Description of Module
Subject Name: Management
Paper Name: Management Information System
Module Title: Data Mining for Decision Support
Module Id: Module No 20
Pre-Requisites: Fundamentals of Decision Support Systems
Objectives: To understand the application of Data Mining for Decision Support
Keywords: Clustering, OLAP, Modeling, Warehousing, Decision Support Systems

QUADRANT-I

Module- 20 Data Mining for Decision Support


1. Learning Outcome
2. Introduction
2.1 How does Data Mining Work?
3. Scope of Data Mining
4. The Data Mining Architecture
4.1 Data Mining Methods
5. Data Mining for Decision Support
6. Applications of Data Mining
7. Summary

1. Learning Outcome:
After completing this module, the students will be able to:
 • Understand the fundamentals of Data Mining (DM).
 • Understand the scope of Data Mining.
 • Understand the DM Architecture.
 • List various methods of Data Mining.
 • Get an overview of how Data Mining aids Decision Support.
 • Understand the industry-wide applications of Data Mining.

2. Introduction

Data Mining (DM) is a process that enterprises use to turn unstructured data into relevant
information. It uses software to look for patterns in large batches of data. Businesses can learn more
about their customers and develop strategies in line with business goals so as to have a positive impact on
sales. Data mining depends on effective data collection, warehousing and computer processing.
Exhibit 1: Data Mining
Image Source: http://www.boostit.net

Data Mining helps establish relationships between the micro and macro environmental factors that affect key
performance indicators such as sales, profitability and revenue across industries such as retail, hospitality,
education and communication. For example, it may enable us to analyze micro factors such as the 4
P's of Marketing (Price, Product, Promotion and Place) against macro factors such as
technology, economic conditions and taxation laws in order to forecast sales trends. Hence, Data Mining offers the
capability to organize transactional data and deduce relationships that enable business decision making.
This can be pictured as the Rubik's Cube in Exhibit 1, which is reorganized from an
unstructured state into a systematic, ordered one.

Data mining tools can find solutions to business problems that have traditionally been difficult to resolve.
They can drill down into databases and look for patterns, trends and other predictive information that experts
may overlook as unimportant to the business.

Exhibit 2: Basic Functionality, Data Mining


Image Source: http://maaw.info/images/DataMining.gif
Data mining tools make it possible to predict behavioral patterns, allowing businesses to make prompt and
analytical decisions. Decision Support Systems can be integrated with data mining tools on
existing hardware platforms, and the tools can be installed on operating systems already in place. They are interoperable
and scalable, so they can interface with new products and systems as business needs change.

With its data mining and data analytics capability, Walmart uses what it calls predictive
technology to keep customers informed. Its computer networks hold trillions of bytes of data in what
it calls the Teradata warehouse. This is essentially a set of customer profiles integrated with their
purchasing history and buying patterns. Before Hurricane Frances, Walmart
alerted its customers to the essential items each of them might want to stock. Walmart
could build its inventory beforehand and assure customers that they could pick up the products they
use regularly well in time. This not only prepared customers to bulk purchase routine items
but also unexpectedly increased the sales of various related items. It translated to
higher sales of groceries as well as baby products and products for the elderly.

2.1 How does Data Mining Work?

Data Mining works on the principle of Data Modeling, which translates data into a form that
supports business processes. The data are mined by system experts and users of Information Systems.

Data mining consists of five major elements:

 • Loading of transaction data into the data warehouse.
 • Database Management and Data Archival.
 • Data Access for IS experts and analysts.
 • Data Analysis using Data Mining Software tools.
 • Graphical user interface for interpretation.

Data Mining is the bridge between transaction processing and analytical systems. Based on user queries,
the mining software searches the database for patterns and relationships. The analytical tools it uses are of
various types, such as neural networks and clustering.
Exhibit 3: How data mining works
Image Source: http://technologyspace.weebly.com/uploads/1/1/5/2/11524599/2505000.jpg?426

Exhibit 3 shows the sequence of tasks that form the Data Mining process, from capturing
information from multiple sources to pattern analysis and evaluation for business decision making.
Generally, any of four types of relationships are sought, as shown in Figure 1.

Classes    Clusters    Associations    Sequential Patterns

Figure 1: Types of Data Mining Relationships

 • Classes: Classification uses machine learning to assign data to pre-defined
groups. Linear programming and decision trees are among the techniques used here.
Based on the user's query, the DM tool searches the data and segregates it into separate classes. For example,
based on past trends, how many sales executives are likely to resign within the first year of their
employment?

Exhibit 4: Data Mining – Clusters


Image Source:http://blogs.sas.com/content/subconsciousmusings/files/2016/05/clustering-based-on-similarities.png

 • Clusters: Clustering differs from classification in that the groups are not defined in advance. The tool
itself defines the classes from similarities in the data and places the objects in each class. This can be seen
in Exhibit 4, where customers have been grouped according to their spending patterns. Library management
software searches for books on a particular subject or by a particular author in this way.

Exhibit 5: Associations in Data Mining


Image Source: https://i.ytimg.com/vi/RiFrbyiYpRs/maxresdefault.jpg
 • Associations: Data is mined in a way that reveals associations between items, as shown in Exhibit 5. It can be used
for market basket analysis. People who generally buy frozen snacks are likely to try variants
of sauces and ketchup. People buying dresses are likely to match them with accessories or
shoes. These associations are studied using historical data.

Exhibit 6: Sequential Patterns


Image Source: http://3.bp.blogspot.com

 • Sequential patterns: Data is mined to understand and interpret behavior patterns and trends over a
period of time. This is exemplified in Exhibit 6. For instance, the likelihood that bed linen and
curtains are purchased following each purchase of a new bed may help home furnishers understand the
patterns of consumer purchases.
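
The association relationship described above is usually quantified with two measures: support (how often the items appear together) and confidence (how often the second item appears given the first). The following is a minimal sketch only. The baskets, the function names and the threshold values are assumptions made for this illustration and are not taken from any particular data mining tool.

```python
from itertools import combinations

# Hypothetical shopping baskets (illustrative data, not from the module).
baskets = [
    {"frozen snacks", "ketchup", "sauce"},
    {"frozen snacks", "ketchup"},
    {"dress", "shoes"},
    {"frozen snacks", "sauce"},
    {"dress", "accessories", "shoes"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Share of baskets containing the antecedent that also hold the consequent."""
    base = support(antecedent, baskets)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), baskets) / base


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    items = sorted(set().union(*baskets))
    rules = []
    for a, b in combinations(items, 2):
        # Each pair is checked in both directions: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            s = support({x, y}, baskets)
            c = confidence({x}, {y}, baskets)
            if s >= min_support and c >= min_confidence:
                rules.append((x, y, s, c))
    return rules


for x, y, s, c in pair_rules(baskets):
    print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}")
```

Only pairs of items are examined here to keep the sketch short. Practical tools search larger item sets and typically prune the search using the minimum support threshold.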

Exhibit 7: Basic Blocks of Data Mining Process


Image Source: https://www.youtube.com/watch?v=W44q6qszdqY

3. Scope of Data Mining

Data mining is an important part of the knowledge discovery process. It analyzes enormous sets of data
and yields previously unknown, hidden and useful information and knowledge. Data Mining finds numerous
applications in fields such as healthcare and medicine, transportation, insurance, hospitality
and government. Data Mining can discover new correlations, patterns and trends in vast amounts of
business data stored in data warehouses. Data mining software uses advanced pattern recognition,
algorithms, and mathematical and statistical techniques to sift through mountains of data and extract previously
unknown strategic business information.
Many companies use data mining to:

 • Perform market basket analysis to identify new product bundles.
 • Find the root cause of quality or manufacturing problems.
 • Prevent high rates of customer churn and acquire new customers.
 • Cross-sell to existing customers.
 • Profile and segment customers.

A Data Mining tool enables quick data analysis for business-oriented queries. Because the process is
automated, the otherwise cumbersome task of analyzing large data repositories is reduced considerably. Campaign
management, message broadcasting to user lists and digital marketing are areas where data mining tools can
provide business-oriented results with minimum investment.

Data Mining (DM) tools offer drill-down capabilities in huge data warehouses. Users can fetch business-critical
information at the click of a mouse. The value proposition lies in the fact that DM software can
be installed on existing hardware and software platforms, which maximizes the return on investment.

4. The Data Mining Architecture

Data mining is a very important process in which business-critical, unexplored information is extracted
from large volumes of data. There are a number of components involved in the data mining process.

The following components build up the Data Mining Architecture:

Exhibit 8: Data Mining Architecture


Image Source: http://www.wideskills.com/data-mining-tutorial/data-mining-architecture

 • Data Sources – These are the activities and systems through which data accumulates. Social media, cloud-based
applications, activity-generated data, public information, legacy platforms and points of sale
(PoS) are touch points that generate data, e.g., Apple App Store, iTunes, Google Maps, Ola,
banking apps, email clients and Twitter. Data from these multiple access points is unstructured.
It needs to be organized, filtered and structured before it is passed into the data warehouse server.
Multiple techniques are used for cleaning and integrating the data.

 • Database / Data Warehouse Server – The database or data warehouse server contains a
storehouse of data. The server is responsible for retrieving the relevant data based on the data
mining request of the user.

 • OLAP Server – Online Analytical Processing Server – A subset of Data and Business Analytics,
it comprises relational databases, data mining and reporting capability. Applications of
OLAP include sales funnel reporting and analysis, daily MIS reports and financial reporting.
OLAP tools enable analysis of multidimensional data along its various dimensions. Multiple
queries can be run and databases can be navigated with rapid execution times.

 • Data Mining Engine – It lies at the core of the data mining infrastructure. It is responsible for
carrying out various data mining techniques such as clustering, association and correlation, through
which information is categorised.

 • Pattern Evaluation Modules – These measure the characteristics of a pattern against a threshold
value. They interact with the data mining engine to focus the search on pattern analysis.

 • Graphical User Interface – It provides the interface between the data mining
engine and the end user. It makes the reporting easy to understand and interpret by
displaying results with pictorial tools and diagrams.

 • Knowledge Base – Its task is to provide input to the data mining engine. The pattern
evaluation modules interact with the Knowledge Base on a constant basis. It facilitates knowledge
discovery and supports the data sources in pulling out interesting patterns.
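
The multidimensional analysis attributed to the OLAP server above can be illustrated with two basic operations: a roll-up (totalling a measure over chosen dimensions) and a slice (fixing one dimension to a single value). The sketch below uses plain Python rather than an actual OLAP server. The sales rows, the dimension names and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales amount).
sales = [
    ("North", "Laptop", "Q1", 120),
    ("North", "Phone", "Q1", 80),
    ("South", "Laptop", "Q1", 100),
    ("North", "Laptop", "Q2", 150),
    ("South", "Phone", "Q2", 60),
]

DIMENSIONS = ("region", "product", "quarter")


def roll_up(rows, by):
    """Total the sales measure over the chosen dimensions (a roll-up)."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # the last field is the sales amount
    return dict(totals)


def slice_rows(rows, dimension, value):
    """Keep only the rows where one dimension has a fixed value (a slice)."""
    position = DIMENSIONS.index(dimension)
    return [row for row in rows if row[position] == value]


print(roll_up(sales, ["region"]))
print(roll_up(slice_rows(sales, "quarter", "Q1"), ["product"]))
```

The first call totals sales by region across all products and quarters. The second restricts the data to one quarter before totalling by product, which is the kind of query an analyst would run repeatedly while navigating the data.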

4.1 Data Mining Methods

Exhibit 9 gives an extensive list of data mining techniques. These methods can be classified on
the basis of Statistics, Artificial Intelligence, Operations Research Methods, Neural Networks, Stochastic
Search Methods and combinations of various Statistical Techniques.
Exhibit 9: Data Mining Techniques
Image Source: http://www.constantinereport.com/term_images/data+mining+tools+(1).png

From the gamut of DM tools and techniques the most popular and commonly used ones are:

 • Neural Networks – This technique is loosely modeled on the way a human brain works. It is based on a
collection of connected fundamental units called artificial neurons, analogous to the neurons in a biological brain.

 • Decision Trees – At its core, a decision tree comprises a root node, branches and leaf nodes. It
helps to break down groups of data into multiple subsets. This predictive modeling approach has
multiple uses in data mining. Chi-square Automatic Interaction Detection (CHAID) is a
popular method.

 • Time Series Analysis – It refers to the analysis of data sets arranged in
chronological order. It is used in pattern recognition, weather and meteorological forecasting,
seismology, astronomy, communications and control systems, to name a few.

 • Rule Induction – It is a technique in which general rules are extracted by studying data sets. These
rules may describe local patterns in the data or form a fuller scientific model of the data.

 • Regression – It helps to predict continuous numeric values from the relationships found in a data set.
Regression analysis has multiple applications across various industries. A few examples include
financial forecasting, sales budget planning and environmental analysis.
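
To make the regression method listed above concrete, the following sketch fits a straight line to a short sales history by ordinary least squares and uses it to forecast the next period. The monthly figures and the function names are assumptions made for this illustration. Real forecasting work would normally rely on a statistics package and far more data.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the x values are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Value of the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical history: month number and units sold in that month.
months = [1, 2, 3, 4, 5]
units_sold = [10, 12, 14, 16, 18]

model = fit_line(months, units_sold)
print("forecast for month 6:", predict(model, 6))
```

Because the illustrative history grows by exactly two units a month, the fitted line has a slope of 2 and the forecast for month 6 is 20 units.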
The Card Protection Company – In any industry, each customer is unique and
behaves in a unique way. How, then, can a machine predict the likely behavior or buying pattern
of any single customer?
Data Mining answers this question using the underlying principle of market
segmentation. Accumulating data about each user and maintaining a profile makes
detailed analysis possible. Compiling user attributes such as demographics, past purchases, purchase
schedules, patterns and frequency can help answer a number of questions. It helps to segment
customers with similar characteristics and acquaint them with the best offers.
The Card Protection Company carried out such an exercise. With a huge database of
nearly 7 million customers, it employed DataInsight to carry out customer modeling. The unstructured
data was organized using demographic segmentation profiles. Nearly 300 customer characteristics
were narrowed down to 30. Decision Trees and CHAID analysis helped reveal the best categories for
predicting customer behavior, with classes such as frequent respondents, those who would
respond in a month, those who would shop during sale periods and those who made purchases
once a year. The company also narrowed down the age segments that were most responsive to promotional
offers and coupons. The exercise helped the Card Protection Company tremendously.
Source: http://www.campaignlive.co.uk/article/172622/technique---using-data-mining-market-segmentation
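
CHAID, mentioned in the case above, chooses the customer characteristic on which to split by testing each candidate with the chi-square statistic. The sketch below shows only that core idea: it computes the Pearson chi-square statistic for two hypothetical characteristics and picks the larger one. The counts and names are invented for illustration. A full CHAID implementation also merges categories and compares adjusted p-values, which this sketch omits, so ranking by the raw statistic is only meaningful here because both tables have the same shape.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if the characteristic and the response were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows are categories of the characteristic,
# columns are (responded to offer, did not respond).
candidates = {
    "age_band": [[30, 70], [60, 40]],
    "region": [[45, 55], [45, 55]],
}


def best_split(candidates):
    """Name of the characteristic whose table has the largest chi-square value."""
    return max(candidates, key=lambda name: chi_square(candidates[name]))


print(best_split(candidates))
```

In this invented data the response rate differs sharply between the two age bands but not between the two regions, so age band is selected as the more useful characteristic for segmentation.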

5. Data Mining for Decision Support

Decision support systems (DSS) are defined as interactive application systems intended to help
decision makers utilize data and models in order to identify and solve problems and make decisions. They
incorporate both data and models, and they provide support for decision making but do not replace it.
Exhibit 10 lists various models and techniques used for this purpose.

Exhibit 10: Data Mining for Decision Support


Image Source: http://static1.1.sqspcdn.com
The mission of decision support systems is to improve the effectiveness, rather than the efficiency, of
decisions. A decision support system can take many different forms. Every decision support system is
developed for a specific objective and is based on a particular decision process and set of methods,
techniques and approaches. A DSS is designed in accordance with the decision-making process
and decision problems that it is going to support.

The objective of data mining is to discover relationships, patterns and knowledge hidden in data. It
does so by analyzing data in order to uncover implicit, previously unknown but potentially useful
information.

Data mining is an interdisciplinary field which encompasses statistical, pattern recognition, and machine
learning tools to support the analysis of data and discovery of principles that lie within the data.

Integration of data mining and decision support enhances the capability of the DSS, which can then handle
more complex problems than before. Moreover, this can significantly improve current approaches and create
new ones for problem solving, by enabling the fusion of knowledge from experts and knowledge
extracted from data.

Detecting Fraud Using Data Mining

Exhibit 11: Fraud Detection


Image Source: http://www.statsoft.com/textbook/fraud-detection

With the world converging through the use of technology, information is now available at the click of
a mouse. However, this ease of availability and access needs to be restricted to
the users who actually need the information. Unethical use of business-related information,
unauthorized access and unsecured modes of transmission can lead to the collapse of an
enterprise.
Data Mining has a proven ability to detect instances of fraud across various
industries. Online transactions, e-commerce, internet banking and payments are platforms
vulnerable to fraud and money laundering if not monitored critically. Both of the commonly used
approaches to fraud detection, Machine Learning and Artificial Intelligence, are applied in Data Mining.
They make it possible to recognize instances of fraud and
raise an alarm.
Data mining techniques such as profiling, clustering, classification and time series analysis examine
data sets to study transaction patterns. Unusual log-in information, mismatches in typing patterns and
inappropriate account activity are instances where authentic users can be cautioned. Data Mining
also assists forensic analytics and biometric and retina analysis where logging into secure networks
requires user validation.
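
The profiling idea described above, comparing new activity against a user's usual transaction pattern, can be sketched with a simple statistical rule: flag any amount that lies too many standard deviations away from the account's historical mean. The transaction amounts, the function name and the threshold of 3 are assumptions made for this illustration. Production fraud systems combine many more signals than a single amount.

```python
from statistics import mean, stdev


def flag_unusual(history, new_amounts, threshold=3.0):
    """Return the new amounts that deviate strongly from the account history."""
    average = mean(history)
    spread = stdev(history)  # needs at least two historical amounts
    if spread == 0:
        # Every past amount was identical, so any different amount is unusual.
        return [amount for amount in new_amounts if amount != average]
    return [
        amount
        for amount in new_amounts
        if abs(amount - average) / spread > threshold
    ]


# Hypothetical card history and three new transactions to screen.
history = [42, 38, 51, 45, 40, 47, 39, 44]
incoming = [46, 950, 41]

print(flag_unusual(history, incoming))
```

With this invented history, the two amounts close to the usual spending level pass, while the transaction of 950 is flagged so that the genuine account holder can be cautioned.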
6. Applications of Data Mining

Data Mining has numerous applications in almost all sectors of industry and
academia. Data Mining works on large volumes of data and can help uncover hidden patterns that can be
used for business-critical use cases. Data Mining closely studies transactional data and organizes it into
batches using statistical and computing tools to establish relationships and patterns.

Exhibit 12: Philips Health Suite


Image Source: www.philips.co.in/healthcare/medical-products

Some of these applications are elaborated as under:

 • In Bioinformatics, data mining is used to study structural patterns in proteomic data
and databases. In healthcare, data mining plays an essential role in data visualization and soft
computing. It helps in forecasting trends and ensuring that resources are available to meet patient
demand in various segments. Health care requires precision and accuracy. The automation and
use of Electronic Health Records to store patient records is a common trend. These tools
can compare symptoms, causes and treatments and provide suggestions as per clinical best
practices. Data Mining techniques study the data sources and use the analysis to build predictive
models.
Exhibit 13: Applications of Data Mining
Image Source: http://bpo.rsystems.com

 • This is the era of converging Web technologies, 4G and high-speed connectivity. Establishing
communication networks that transmit voluminous information in a secure, encrypted
way to authorized users is therefore key. Data Mining has extensive usage in the
telecommunications industry. From office automation solutions to networks that enable voice and
data transmission, data mining helps to understand business dynamics better and make decisions.
DM helps to detect trespassing, intrusions and fraud, and to maintain quality of service.

Exhibit 14: Data Mining applications in Financial Forecasting


Image Source: https://www.mitre.org

 • A bank can leverage data on its credit card users to predict which
customers are frequent users and would be keen to purchase a card with new features. Using a
small test mailing, the characteristics of customers with an affinity for the product can be
identified. Recent projects have indicated more than a 20-fold decrease in costs for targeted
mailing campaigns over conventional approaches.
Exhibit 15: Sales & Revenue Management Reporting
Image Source: https://highlyscalable.files.wordpress.com

 • Data mining can be applied to check how customer groups react to a promotion, how effective
promotions are with respect to cost and benefit, which marketing channels have been successful
for different campaigns in the past and so on. By analyzing this kind of information, retailers can
redesign advertisements and promotional activities. Exhibit 15 showcases several use cases
for a Sales and Revenue Management system. This is integrated with a data warehouse and
customer profile data from retail stores.

Exhibit 16: How Data Mining Works for Marketing Reporting


Image Source: http://www.boostit.net

 • Optimizing the price of every product is a difficult task. A number of factors pertaining to
customer demand are considered before pricing. Normally, a price increase leads to lower sales
and to customers adopting alternative products. Data mining can outline the demand for products
and show how a price change for a particular product affects the sales of other
products.
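
The pricing relationship described above, where a price change for one product affects the sales of another, is commonly summarized as a cross-price elasticity: the percentage change in the quantity sold of the other product divided by the percentage change in price. The sketch below computes it from two observations. The figures and the function name are hypothetical, and a real analysis would estimate the relationship from many observations while controlling for other factors.

```python
def cross_price_elasticity(price_before, price_after, other_qty_before, other_qty_after):
    """Percentage change in another product's sales per percentage change in price."""
    price_change = (price_after - price_before) / price_before
    quantity_change = (other_qty_after - other_qty_before) / other_qty_before
    return quantity_change / price_change  # assumes the price actually changed


# Hypothetical observation: product A's price rises from 100 to 110,
# and weekly sales of product B move from 200 to 230 units.
elasticity = cross_price_elasticity(100, 110, 200, 230)
print(round(elasticity, 2))
```

A positive value, as in this invented case, suggests that customers treat the two products as substitutes, while a negative value suggests that the products tend to be bought together.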
7. Summary

As business operations in a firm multiply, human expertise alone is not sufficient. Manual analysis
involves time, effort and considerable cost. The same tasks can be processed with
information systems designed to support business processes. Not only can they work round the clock, but
they can also generate results tirelessly, iteration after iteration.

Data mining is the process of analyzing data in order to find information that is useful for
business. It involves selecting, exploring and modeling large amounts of data to uncover previously
unknown patterns and ultimately arrive at comprehensible information from large databases. Today,
multinational companies and large organizations have operations in many places in the world. Each place
of operation may generate large volumes of data. Decision makers require access to data from all such sources
in order to take strategic decisions. Data mining uses a number of techniques that include statistical analysis,
decision trees, neural networks, rule induction and refinement, and graphic visualization. The combination
of business acumen with the power of data mining techniques can help organizations gain a strategic
advantage in their efforts to optimize customer management.
