Professional Documents
Culture Documents
Lecture 1-2 Introduction To Business Analytics and Machine Learning Analytics
Lecture 1-2 Introduction To Business Analytics and Machine Learning Analytics
Lecture 1-2 Introduction To Business Analytics and Machine Learning Analytics
2
BUSINESS ANALYTICS
None of the above data sets have any meaning until they are given a
CONTEXT and PROCESSED into a useable form
ISBM ©
7
2021
DATA INTO INFORMATION
OR
Data that has been processed into a form that gives it meaning
ISBM ©
9
2021
EXAMPLE 1
Information ???
ISBM ©
10
2021
EXAMPLE 2
Processing
Information ???
ISBM ©
11
2021
EXAMPLE 3
111192, 111234
Raw Data
Information ???
ISBM ©
12
2021
KNOWLEDGE
Volume Variety
The amount The types
of data of data
THE 4 V’S
OF
BIG DATA
Velocity Veracity
The frequency of The quality
data of data
19
BIGDATA SOURCES
VOLUME : SCALE OF DATA
20
ISBM ©
21
2021
VOLUME: SCALE OF DATA
▪ Origin
▪ Authenticity
▪ Trustworthiness
▪ Completeness
▪ Integrity
VALUE
ISBM ©
26
2021
Volume Variety
The amount The types
of data of data
THE 4 V’S
OF
BIG DATA
Veracity
Velocity The quality
The frequency of data of data
And finally, Value. Not all data is necessarily valuable data.
D
CS590
TYPES OF DATA SETS
Record
• Data Matrix
• Document Data
• Transaction Data
Graph
• World Wide Web
• Molecular Structures
Ordered
• Spatial Data
• Temporal Data
• Sequential Data
• Genetic Sequence Data
27
D
CS590
RECORD DATA
28
DATA MATRIX
29
DOCUMENT DATA
timeout
season
coach
game
score
team
ball
lost
pla
wi
n
y
Document 1 3 0 5 0 2 6 0 2 0 2
Document 2 0 7 0 2 1 0 0 3 0 0
Document 3 0 1 0 0 1 2 2 0 3 0
CS590D
30
TRANSACTION DATA
TID Items
1 Bread, Coke, Milk
2 Beer, Bread
3 Beer, Coke, Diaper, Milk
4 Beer, Bread, Diaper, Milk
5 Coke, Diaper, Milk
CS590D
31
D
CS590
GRAPH DATA
<a href="papers/papers.html#bbbb">
Data Mining </a>
<li>
2 <a href="papers/papers.html#aaaa">
Graph Partitioning </a>
<li>
5 1 <a href="papers/papers.html#aaaa">
Parallel Solution of Sparse Linear System of Equations </a>
<li>
2 <a href="papers/papers.html#ffff">
N-Body Computation and Dense Linear System Solvers
5
32
D
CS590
CHEMICAL DATA
33
D
CS590
ORDERED DATA
Sequences of transactions
Items/Events
An element of
the sequence
34
D
CS590
ORDERED DATA
GGTTCCGCCTTCAGCCCCGCGCC
CGCAGGGCCCGCCCCGCGCCGTC
GAGAAGGGCCCGCCTGGCGGGCG
GGGGGAGGCGGGGCCGCCCGAGC
CCAACCGAGTCCGACCAGGTGCC
CCCTCTGCTCGGCCTAGACCTGA
GCTCATTAGGCGGCAGCGGACAG
GCCAAGTAGAACACGCGAAGCGC
TGGGCTGCCTGCTGCGACCAGGG
35
D
CS590
ORDERED DATA
Spatio-Temporal Data
Average Monthly
Temperature of
land and ocean
36
D
CS590
DATA QUALITY
37
DATA SCIENCE OVERVIEW
38
ISBM ©
39
2021
DATA SCIENCE – A VISUAL DEFINITION
40
MODELS
1-41
BUSINESS ANALYTICS APPLICATIONS
⦁ Pricing
◦ setting prices for consumer and industrial goods, government
contracts, and maintenance contracts
⦁ Customer segmentation
◦ identifying and targeting key customer groups in retail, insurance,
and credit card industries
⦁ Merchandising
◦ determining brands to buy, quantities, and allocations
⦁ Location
◦ finding the best location for bank branches and ATMs, or where to
service industrial equipment
⦁ Social Media
◦ understand trends and customer perceptions; assist marketing
managers and product designers
1-43
IMPORTANCE OF BUSINESS ANALYTICS
⦁ Benefits
◦ …reduced costs, better risk management, faster
decisions, better productivity and enhanced bottom-line
performance such as profitability and customer
satisfaction.
⦁ Challenges
◦ …lack of understanding of how to use analytics,
competing business priorities, insufficient analytical skills,
difficulty in getting good data and sharing information,
and not understanding the benefits versus perceived
costs of analytics studies.
Privacy?
SCOPE OF BUSINESS ANALYTICS
🞂 Simulation
🞂 Forecasting
🞂 Scenario and “what-if” analyses
🞂 Optimization
🞂 Text Mining
🞂 Social media, web, and text analytics
▪ SQL various databases
▪ Excel Spreadsheets
▪ Tableau Software Simple drag and drop tools for visualizing data from
▪ spreadsheets and other databases.
▪ IBM Cognos Express An integrated business intelligence and
planning solution designed to meet the needs of midsize companies,
provides reporting, analysis, dashboard, scorecard, planning, budgeting
and forecasting capabilities.
▪ SAS / SPSS / Rapid Miner Predictive modeling and data mining,
visualization, forecasting, optimization and model management, statistical
analysis, text analytics, and more using visual workflows.
▪ R / Python Advanced programing-based data preparation, analytics and
▪ visualization.
⦁ Data: numerical or textual facts and figures that
are collected through some type of measurement
process.
⦁ External
⦁ Economic trends
⦁ Marketing research
⦁ Examples:
◦ A tire pressure gage that consistently reads several pounds of pressure
below the true value is not reliable, although it is valid because it does
measure tire pressure.
◦ The number of calls to a customer service desk might be counted
correctly each day (and thus is a reliable measure) but not valid if it is
used to assess customer dissatisfaction, as many calls may be simple
queries.
◦ A survey question that asks a customer to rate the quality of the food in a
restaurant may be neither reliable (because different customers may
have conflicting perceptions) nor valid (if the intent is to measure
customer satisfaction, as satisfaction generally includes other elements
of service besides food).
⦁ Model - an abstraction or representation of a real
system, idea, or object.
S = AEBECT
where
• S is sales,
• t is time,
• e is the base of natural logarithms, and
• a, b and c are constants that need to be estimated.
LEARN
MODEL
“To try to eliminate risk in business enterprise is futile. Risk is inherent in the
commitment of present resources to future expectations. Indeed, economic
progress can be defined as the ability to take greater risks. The attempt to
eliminate risks, even the attempt to minimize them, can only make them
irrational and unbearable. It can only result in the greatest risk of all: rigidity.”
– Peter Drucker
⦁ Prescriptive decision models help decision
makers identify the best solution.
max. Profit
s.t. Sales >= 0
Sales is integer
⦁ Introduction to Analytics
⦁ Tools
⦁ Data
⦁ Models
⦁ Problem solving with analytics
1. RECOGNIZE A PROBLEM