STAT 1005 Lecture 1 (DR Lau) - Sep 2022
Data Science
Dr Adela Lau
Email: adelalau@hku.hk
What is Data Science?
• Data science combines multiple fields, including
statistics, scientific methods, artificial intelligence
(AI), and data analysis, to extract value from data.
• It collects data from the web, smartphones,
customers, sensors, and other sources to derive
actionable insights.
• It encompasses preparing data for analysis,
including cleansing, aggregating, and manipulating
the data to perform advanced data analysis.
• It reviews the results to uncover patterns and
enable business leaders to draw informed insights.
https://www.oracle.com/hk/data-science/what-is-data-science/
What Skills Are Required for a Data Scientist?
Why Data Science?
• Is it possible for a car to drive itself?
• Yes. Data science uses machine learning to learn how to drive and to give
feedback on each drive.
• The car can make decisions such as slowing down, stopping by itself, and
speeding up.
Why Data Science?
• Self-driving cars can reduce deaths caused by car accidents.
Why Data Science?
Problems:
• Improper route planning
• Poor logistics management in cargo and check-in processes
• Incorrect decisions in human resources, equipment selection, etc.
• Lack of data on flight status and environmental risks
Data science solutions:
• Route planning
• Predictive model of flight delays
• Video analytics of the airport's environmental risks
• Promotional offers
Why Data Science?
https://data-flair.training/blogs/data-science-applications/
How Does Data Science Help?
1. Innovate design and design simulation with virtual reality, permutation
design, and decisions
3. Reduce production and inventory risk with accurate just-in-time
manufacturing and demand prediction
4. Route optimization: vehicle-cargo matching, finding the best route
4. Reduce cargo charges by peak-hour prediction
4. Save last-mile costs: last-mile optimization via efficient vehicle routing
What Can Data Science Do?
Predictive Analysis
What will happen next?
What Can Data Science Do?
Pattern Discovery
Is there any hidden knowledge
in the data?
Statistical Methods for Data Science
Machine Learning Algorithms for Data Science
Data Visualization for Data Science
Data Science Process: Technological View
Success of Data Science!
A Case Analysis of Data Science
• Top 5 Data Science Applications in Business (6.13 mins)
• https://www.youtube.com/watch?v=sHMGIQpqrz0
Problem 1: Which customers should we target in a
marketing campaign?
• Goal: Personalize the marketing campaign
• Data sources: Customer database, transaction
database, marketing campaign database
• Method: Segment customers into groups based on
features such as age group, income, clickstream,
location, demographics, ...
• Techniques: descriptive analytics (e.g., frequency
distribution, mean, mode, ...)
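The segmentation method and descriptive techniques above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer records, the field names (age_group, income, location), and the helper segment_summary are hypothetical and do not come from the lecture.

```python
from collections import Counter, defaultdict
from statistics import mean, mode

# Hypothetical customer records (illustrative values, not lecture data).
customers = [
    {"age_group": "18-29", "income": 28000, "location": "Kowloon"},
    {"age_group": "18-29", "income": 32000, "location": "Kowloon"},
    {"age_group": "30-44", "income": 52000, "location": "Central"},
    {"age_group": "30-44", "income": 48000, "location": "Central"},
    {"age_group": "30-44", "income": 50000, "location": "Kowloon"},
    {"age_group": "45-64", "income": 61000, "location": "New Territories"},
]


def segment_summary(records, key="age_group"):
    """Group records by `key`; report size, mean income, and modal location."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    summary = {}
    for name, members in groups.items():
        summary[name] = {
            "count": len(members),
            "mean_income": mean([m["income"] for m in members]),
            "mode_location": mode([m["location"] for m in members]),
        }
    return summary


# Frequency distribution of customers across age groups.
age_frequency = Counter(c["age_group"] for c in customers)

# Per-segment descriptive statistics.
summary = segment_summary(customers)

for segment, stats in sorted(summary.items()):
    print(segment, stats)
```

Each segment's size, mean income, and most common location give a first picture of which group a campaign might address. Real segmentation would typically use more features and a clustering method.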
Descriptive analytics (e.g., frequency distribution, mean, mode, ...): Which
marketing channel should we use, and what is the trend?
[Figure: Marketing campaign channel analysis of email responses, social media
responses, and web surfing]
Problem 2: How do we recommend products
to customers?
• Goal: Predict customer trends.
1) Which of your products customers are likely to buy, and the reasons they
bought them.
• Data sources: web clickstream data, transaction data
• Method: Develop a recommendation engine to suggest products to customers
◦ Item-to-item collaborative filtering in real time.
◦ Match items a customer has purchased to similar items.
• Techniques: clickstream analysis and classification rules
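As a rough illustration of item-to-item collaborative filtering, the sketch below scores how similar two items are by the overlap of the customers who bought them (cosine similarity on purchase sets), then recommends items similar to what a customer already owns. The purchase data and the function names item_similarities and recommend are hypothetical. Production recommendation engines use more elaborate similarity measures and scale considerations.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical transaction data: customer -> set of purchased items.
purchases = {
    "alice": {"laptop", "mouse", "keyboard"},
    "bob": {"laptop", "mouse"},
    "carol": {"mouse", "keyboard"},
    "dave": {"laptop", "monitor"},
}


def item_similarities(purchases):
    """Cosine similarity between items, based on which customers bought each."""
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    sims = defaultdict(dict)
    for a in buyers:
        for b in buyers:
            if a == b:
                continue
            shared = len(buyers[a] & buyers[b])
            if shared:
                sims[a][b] = shared / sqrt(len(buyers[a]) * len(buyers[b]))
    return sims


def recommend(customer, purchases, sims, top_n=2):
    """Rank items the customer does not own by similarity to owned items."""
    owned = purchases[customer]
    scores = defaultdict(float)
    for item in owned:
        for other, similarity in sims.get(item, {}).items():
            if other not in owned:
                scores[other] += similarity
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


sims = item_similarities(purchases)
recs = recommend("bob", purchases, sims)
print(recs)
```

Because the item similarity table can be computed ahead of time, the per-customer lookup is cheap, which is one reason this approach suits real-time recommendation.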
Clickstream Analysis
Association Rules
Problem 3: How do we predict customer
preferences?
• Goal: Predict customer trends.
1) Which of your products customers are likely to buy, and the reasons they
bought them.
• Data sources: Customer reviews, customer surveys, transaction data
• Method:
1) Which products similar customers will buy
• Techniques: correlation analysis
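Correlation analysis can be illustrated with the Pearson correlation coefficient, which measures the strength of a linear relationship between two variables on a scale from -1 to 1. The sketch below is an assumption-laden example: the two variables (review rating and number of repeat purchases) and their values are invented for illustration, and pearson is a hand-written helper rather than a library call.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: review rating (1-5) and repeat purchases per customer.
ratings = [1, 2, 3, 4, 5]
repeat_purchases = [0, 1, 1, 3, 4]

r = pearson(ratings, repeat_purchases)
print(round(r, 3))
```

A value of r close to 1 would suggest that customers who rate a product highly also tend to buy it again. Correlation alone does not show that one causes the other.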
Correlation Analysis
Problem 4: How do we decide whether to
promote or fire a staff member?
• Goal: Evaluate staff performance.
• Data sources: Transaction data
• Method:
◦ Compare sales performance (e.g., min,
max, and mean of sales per month
by staff, total sales by shop, customer
orders per month by staff, ...)
◦ Predict salesman performance
• Techniques: Descriptive analytics
and regression models
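The two techniques above can be sketched together: descriptive statistics summarize each staff member's monthly sales, and a simple linear regression on the month index gives a rough prediction for the next month. The sales figures and the helper names describe, fit_line, and predict_next are hypothetical. A straight-line trend is an assumption made here for simplicity, not a method stated in the lecture.

```python
from statistics import mean

# Hypothetical monthly sales per staff member (months 1..4).
monthly_sales = {
    "staff_a": [10, 12, 14, 16],
    "staff_b": [20, 18, 19, 17],
}


def describe(values):
    """Min, max, and mean of a list of monthly sales figures."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def fit_line(ys):
    """Least-squares fit of y = intercept + slope * month, months 1..n."""
    if len(ys) < 2:
        raise ValueError("need at least two months of data")
    xs = list(range(1, len(ys) + 1))
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next(ys):
    """Predicted sales for the month after the last observed one."""
    slope, intercept = fit_line(ys)
    return intercept + slope * (len(ys) + 1)


for staff, sales in monthly_sales.items():
    print(staff, describe(sales), round(predict_next(sales), 2))
```

The descriptive summary compares staff on past results, while the fitted slope indicates whether performance is trending up or down.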
Focus in this course
• Wk 8: Sampling Distributions and Correlation Analysis
• Wk 9: Hypothesis testing
• Wk 10: Regression and Prediction
• Wk 11: Classification
Thank you!