UCS551 Chapter 1 - Introduction To Data Analytics



Chapter 1:
Introduction to Data Analytics


Ref: https://www.coursera.org/lecture/data-

1. Definition of data analytics

2. Importance of data analytics
3. Type of data analytics
4. Example of applications
5. Data Science
6. Data Analytics Process
1.1 Definition of data analytics

Data All Around

• Lots of data is being collected
and warehoused
– Web data, e-commerce
– Financial transactions, bank/credit
– Online trading and purchasing
– Social Network

How Much Data Do We Have?

• Google processes 20 PB a day

• Facebook has 60 TB of daily logs
• eBay has 6.5 PB of user data +
50 TB/day
• 1000 genomes project: 200 TB
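
To get a feel for these daily volumes, they can be turned into average per-second rates. The following is a minimal sketch that assumes decimal (SI) units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and takes the figures above at face value; the variable and function names are illustrative only.

```python
# Decimal (SI) units are assumed: 1 TB = 10**12 bytes, 1 PB = 10**15 bytes.
TB = 10**12
PB = 10**15
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_second(bytes_per_day: float) -> float:
    """Average sustained rate implied by a daily volume."""
    return bytes_per_day / SECONDS_PER_DAY


# Daily volumes quoted on the slide.
daily_volumes = {
    "Google (20 PB/day)": 20 * PB,
    "Facebook logs (60 TB/day)": 60 * TB,
    "eBay (50 TB/day)": 50 * TB,
}

# Convert each daily volume to gigabytes per second.
rates_gb_per_s = {
    name: bytes_per_second(volume) / 10**9
    for name, volume in daily_volumes.items()
}

for name, rate in rates_gb_per_s.items():
    print(f"{name}: about {rate:.2f} GB/s on average")
```

Under these assumptions, 20 PB a day works out to roughly 231 GB every second.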

• Data Analytics WordCloud

• What is Data?
Data is a set of values of subjects with respect to qualitative or quantitative variables. Data and
information or knowledge are often used interchangeably; however data becomes information
when it is viewed in context or in post-analysis.

• What is Big Data?

Big Data is any data that is expensive to manage
and hard to extract value from
• Volume
The size of the data, which defines the word “big”
• Velocity
How fast the data can be processed and
accessed (for example, social media posts and
YouTube videos, uploaded in the thousands
every second, should be accessible as early
as possible)
• Variety and Complexity
The diversity of sources, formats, and quality
Big Data has 7 V’s. Can you guess the other 4 V’s?
• Variability
– data which keeps on changing constantly - focus on understanding and
interpreting the correct meanings of raw data
• Veracity
– about making sure the data gathered is accurate and keeping the bad
data away from the systems
• Visualization
– how to present data to the management for decision-making purposes
• Value
– the organization must get some value back for the effort and
resources spent on the other V’s (provided they are handled and
processed correctly)
• Types of Data
– Relational Data (Tables/Transaction/Legacy Data)
– Text Data (Web)
– Semi-structured Data (XML)
– Graph Data (figure: Political Polarization During the 2008 US Presidential Campaign)
– Streaming Data
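
The difference between these types of data is easier to see side by side. The sketch below represents each one with Python's standard library only; all records, names, and values are invented for illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Relational data: fixed columns, one tuple per row (hypothetical purchase table).
purchases = [
    ("alice", "book", 12.50),
    ("bob", "laptop", 899.00),
    ("alice", "pen", 1.20),
]

# Text data: free-form words; a word count is a typical first step.
text = "data analytics turns raw data into insight"
word_counts = Counter(text.split())

# Semi-structured data: XML in which a field (here, email) may be missing.
xml_text = """
<customers>
  <customer name="alice"><email>alice@example.com</email></customer>
  <customer name="bob"/>
</customers>
"""
root = ET.fromstring(xml_text)
emails = {c.get("name"): c.findtext("email") for c in root.findall("customer")}

# Graph data: nodes and directed edges stored as an adjacency list.
follows = {"alice": ["bob"], "bob": ["alice", "carol"], "carol": []}
in_degree = {node: 0 for node in follows}
for targets in follows.values():
    for target in targets:
        in_degree[target] += 1


# Streaming data: records arrive one at a time, so we keep a running
# total instead of storing everything first.
def purchase_stream():
    for row in purchases:
        yield row


running_total = 0.0
for _customer, _item, amount in purchase_stream():
    running_total += amount

print(word_counts["data"], emails, in_degree, round(running_total, 2))
```

Note how the XML record for "bob" simply lacks an email, something a relational table would have to model with an explicit empty column.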


• What is Data Analytics?

– “is a process of inspecting, cleansing, transforming, and modeling data
with the goal of discovering useful information, suggesting conclusions,
and supporting decision-making” - Wikipedia
– “leverage data in a particular functional process (or application) to
enable context-specific insight that is actionable” - Gartner
– the science of analyzing raw data to make conclusions about that
information - Investopedia
– examines large amounts of data to uncover hidden patterns,
correlations and other insights - SAS
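
The four activities in the first definition (inspecting, cleansing, transforming, and modeling) can be sketched on a toy dataset. This is a minimal illustration, not a recommended method: the sales records are invented, and the "model" is just an average change projected one step ahead.

```python
from statistics import mean

# Hypothetical raw records: (month, units_sold).
# None and negative values stand in for bad data.
raw = [
    ("Jan", 120),
    ("Feb", None),
    ("Mar", 150),
    ("Apr", -5),
    ("May", 180),
    ("Jun", 210),
]

# 1. Inspect: count the records with a missing value.
missing = sum(1 for _, units in raw if units is None)

# 2. Cleanse: drop missing and impossible (negative) values.
clean = [(month, units) for month, units in raw if units is not None and units >= 0]

# 3. Transform: change between consecutive valid observations.
changes = [later[1] - earlier[1] for earlier, later in zip(clean, clean[1:])]

# 4. Model: a deliberately crude model that projects the average change
#    forward from the last valid observation.
avg_change = mean(changes)
forecast = clean[-1][1] + avg_change

print(f"missing={missing}, kept={len(clean)}, forecast={forecast}")
```

The forecast supports a decision (for example, how much stock to order), which is the goal the definition names.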

1.2 Importance of data analytics
• Data analytics is important because it helps a business in
any sector optimize its performance.
Building it into the business model means a company
can reduce costs by identifying more
efficient ways of doing business and by storing large
amounts of data. A company or sector can also use data
analytics to make better business decisions and to
analyze customer trends and satisfaction, which
can lead to new (and better) products and services, or at
least provide input or guidelines.

• Value Chain: Analytics shows how existing
information can help the business find its
gold mine, that is, its route to success.
• Knowledge: The insights serve as a guide to
how you can run your business in the near
future and to what the economy already has its
hands on, so that you can gain the benefit
before anyone else.
• Opportunities: Data analytics provides analyzed data that
helps us see opportunities ahead of time, which is
another way of unlocking more options.
1.3 Type of data analytics
There are four types of data analytics:
• Descriptive analytics
– describes what has happened over a given period. Have the number of views gone up? Are
sales stronger this month than last?
• Diagnostic analytics
– focuses more on why something happened. This involves more diverse data inputs and a bit of
hypothesizing. Did the weather affect beer sales? Did that latest marketing campaign impact
• Predictive analytics
– moves to what is likely going to happen in the near term. What happened to sales last time we
had a hot summer? How many weather models predict a hot summer this year?
• Prescriptive analytics
– moves into the territory of suggesting a course of action. If the likelihood of a hot summer as
measured as an average of these five weather models is above 58%, then we should add an
evening shift to the brewery and rent an additional tank to increase output.
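
The brewery example can be written out as a small calculation. The sketch below covers the descriptive, predictive, and prescriptive steps; the sales figures and the five model probabilities are invented for illustration, and only the 58% threshold and the recommended action come from the text above. Diagnostic analytics is left out because it needs more diverse inputs than a toy example can show.

```python
from statistics import mean

# Hypothetical figures, invented for illustration.
sales_last_month = 1000  # barrels
sales_this_month = 1150  # barrels
hot_summer_probabilities = [0.55, 0.62, 0.70, 0.48, 0.65]  # five weather models

# Descriptive: are sales stronger this month than last?
sales_growth = (sales_this_month - sales_last_month) / sales_last_month

# Predictive: how likely is a hot summer, averaged over the five models?
p_hot_summer = mean(hot_summer_probabilities)

# Prescriptive: the rule from the text, with its 58% threshold.
THRESHOLD = 0.58


def recommend(probability: float) -> str:
    """Suggest a course of action from the averaged probability."""
    if probability > THRESHOLD:
        return "add an evening shift and rent an additional tank"
    return "keep current capacity"


print(f"Sales growth: {sales_growth:.0%}")
print(f"Average probability of a hot summer: {p_hot_summer:.0%}")
print(f"Recommendation: {recommend(p_hot_summer)}")
```

With these made-up inputs the average probability is about 60%, which is above the threshold, so the rule recommends increasing output.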
1.4 Example of applications
• Nowadays, data analytics has become an important need in solving
business problems in various fields, including:
– Case 1: Customer Analytics

• Analytics are often used to model customer behavior. For
example, modeling the events that lead to a customer
becoming brand loyal.

– Case 2: Credit Risk Analytics

• Analytics conducted on credit data help risk managers
stay competitive in today’s marketplace. A manager can use
analytics to access real credit data, carry out inference, evaluation and
decision-making, conduct low-default portfolio risk modelling and
stress testing, and build and validate credit risk
management models. Predictive analytics is often used to
model business risk, such as the credit risk associated with a
particular customer.

– Case 3: Retail Analytics

• Analytics for retail forecasts and operations. For example, a retailer may attempt to
predict demand for a trendy new style of shoe by color and sales region.

– Case 4: Marketing Analytics

• Analytics to look at the results of product, pricing, promotion, advertising and
distribution strategies. For example, analytics might show that female customers in
their 20s are 70% more likely to purchase a particular item at price A as compared
to price B.

– Case 5: Business Analytics

• A company would like to identify which of its customers are likely to stop using
its services (to churn). The company can use data analytics to explore and
understand its customers’ behaviour based on its business data.
Based on the results obtained, the company can focus its retention strategy.
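
The churn case (Case 5) can be illustrated with a simple rule-based flag. This is only a sketch under stated assumptions: the `Customer` fields, the 30-day inactivity limit, and the 50% usage-drop ratio are all hypothetical choices, and a real project would more likely fit a predictive model to historical data.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    days_since_last_use: int
    sessions_last_month: int
    sessions_this_month: int


def churn_risk(c: Customer, inactive_days: int = 30, drop_ratio: float = 0.5) -> bool:
    """Flag a customer who has gone quiet or whose usage has dropped sharply.

    The thresholds are illustrative assumptions, not values from the slides.
    """
    inactive = c.days_since_last_use >= inactive_days
    baseline = c.sessions_last_month
    dropped = baseline > 0 and c.sessions_this_month < baseline * drop_ratio
    return inactive or dropped


# Hypothetical business data.
customers = [
    Customer("A", days_since_last_use=3, sessions_last_month=20, sessions_this_month=18),
    Customer("B", days_since_last_use=45, sessions_last_month=10, sessions_this_month=0),
    Customer("C", days_since_last_use=5, sessions_last_month=30, sessions_this_month=10),
]

# Customers the retention strategy should focus on.
at_risk = [c.name for c in customers if churn_risk(c)]
print(at_risk)
```

Customer B is flagged for inactivity and customer C for a sharp drop in usage, while customer A is left alone.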
Example of applications. More…
• Netflix
• https://www.edureka.co/blog/data-
1.5 Data Science
What is Data Science?

Data science is a multi-disciplinary field that
uses scientific methods, processes, algorithms
and systems to extract knowledge and insights
from structured and unstructured data.

Diagram 1: Data science process flow
