
Introduction to Data Science
Module 1
Week 1
Overview of Data Science
Module Objectives
At the end of this module, students must be able to:
1. Explain the meaning of and differentiate the concepts of
big data, data analytics and data science;
2. Differentiate the domain areas of statistics and data
science;
Big Data
What is Big Data?
• refers to extremely large volumes of data that cannot be processed effectively with traditional applications (it usually comprises raw, unaggregated data that is often too large to store in the memory of a single computer)
• immense volumes of both structured and unstructured data that inundate a business on a day-to-day basis
• data that can be analyzed for insights that lead to better decisions and strategic business moves
• (Gartner) “high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation”
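The point about data that does not fit in the memory of a single computer can be illustrated with a minimal sketch. It assumes a hypothetical stream of numeric readings (a generator stands in for a real data source) and computes a mean while holding only one small chunk at a time. The names `read_in_chunks`, `streaming_mean`, and `readings` are illustrative and not part of the module.

```python
from typing import Iterable, Iterator, List


def read_in_chunks(values: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield successive chunks so that only chunk_size values are held at once."""
    chunk: List[float] = []
    for value in values:
        chunk.append(value)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # final, possibly shorter chunk
        yield chunk


def streaming_mean(values: Iterable[float], chunk_size: int = 1000) -> float:
    """Compute the mean while keeping only a running sum and a count."""
    total = 0.0
    count = 0
    for chunk in read_in_chunks(values, chunk_size):
        total += sum(chunk)
        count += len(chunk)
    if count == 0:
        raise ValueError("no data to average")
    return total / count


# Hypothetical data source: a generator, so the full data set is never stored.
readings = (float(i % 10) for i in range(1_000_000))
result = streaming_mean(readings)
print(result)
```

Real big data systems spread this kind of work across many machines; the sketch only shows the basic idea of processing raw data piece by piece instead of loading it all at once.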
The Ten-V’s of Big Data
Common Types of Big Data
Data Science
• deals with unstructured and structured data
• a field that comprises everything related to data cleansing, preparation, and analysis
• the combination of statistics, mathematics, programming, problem-solving, and capturing data in ingenious ways
• the umbrella of techniques used when trying to extract insights and information from data
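The cleansing, preparation, and analysis steps named above can be sketched on a tiny data set. The records, field names, and helper functions (`cleanse`, `prepare`, `analyze`) are assumptions made up for illustration. Real projects typically rely on dedicated libraries and far more careful rules.

```python
from statistics import mean

# Hypothetical raw records: stray whitespace, inconsistent case, a missing score.
raw_records = [
    {"name": " Ana ", "score": "85"},
    {"name": "ben", "score": ""},
    {"name": "Carla", "score": "92"},
    {"name": "ben", "score": "78"},
]


def cleanse(records):
    """Cleansing: drop records with a missing score and tidy up the names."""
    cleaned = []
    for record in records:
        score_text = record["score"].strip()
        if not score_text:
            continue  # skip records that have no score
        cleaned.append({"name": record["name"].strip().title(), "score": score_text})
    return cleaned


def prepare(records):
    """Preparation: convert the score text into numbers ready for analysis."""
    return [{"name": r["name"], "score": float(r["score"])} for r in records]


def analyze(records):
    """Analysis: summarize the prepared data with a simple average."""
    return mean(r["score"] for r in records)


prepared = prepare(cleanse(raw_records))
average_score = analyze(prepared)
print(prepared)
print(average_score)
```

Keeping the three steps as separate functions mirrors the way the definition lists them. In practice the steps are often revisited several times as problems in the data surface.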
The Data Science Process
Data Analytics
• the science of examining raw data with the purpose of drawing conclusions about that information
• involves applying an algorithmic or mechanical process to derive insights (e.g., running through a number of data sets to look for meaningful correlations among them)
• used in a number of industries to allow organizations and companies to make better decisions, as well as to verify or disprove existing theories or models
• its focus lies in inference, which is the process of deriving conclusions based solely on what the researcher already knows
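The example of running through several data sets to look for correlations can be sketched as follows. The data sets (`hours_studied`, `exam_score`, `shoe_size`) are invented for illustration, and the Pearson correlation coefficient is one common choice of measure, assumed here rather than prescribed by the module.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical data sets, one value per student.
datasets = {
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 65, 70, 79],
    "shoe_size": [40, 38, 42, 39, 41],
}

# Mechanically compare every pair of data sets.
correlations = {
    (a, b): pearson(datasets[a], datasets[b])
    for a, b in combinations(datasets, 2)
}

# List the pairs, strongest relationship first.
for pair, r in sorted(correlations.items(), key=lambda item: abs(item[1]), reverse=True):
    print(pair, round(r, 2))
```

A coefficient near 1 or -1 suggests a strong linear relationship, while one near 0 suggests little linear relationship. A strong correlation alone does not show that one variable causes the other.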
Data Analytics
Analytics Value Chain
