Data Science Module 1

Introduction to Data Science

By Gaurav Kumar
Data
Data are measurements or observations that are collected as a source of information.

Data can be generated by computers, by humans, or by a combination of both.
What is Data Science
Data Science is a collection of techniques used to extract value from data.

Data Science techniques rely on finding useful patterns, connections, and relationships within data.
What is Data Science
Data Science is the application of computational and statistical techniques to address or gain insight into some problem in the real world.

Data Science = statistics + data processing + machine learning + scientific enquiry + visualisation + business analytics + big data + ….
What is Data Science
Data Science lies at the intersection of computer science, statistics, and substantive application domains.

From computer science come machine learning and high-performance computing technologies for dealing with scale.

From statistics comes a long tradition of exploratory data analysis, significance testing, and visualisation.

From application domains in business and the sciences come challenges worthy of battle, and evaluation standards to assess when they have been adequately conquered.
Workflow
Obtain data that you hope will answer the question.

Explore the data to understand it.

Clean and prepare the data for analysis.

Perform analysis, model building and testing etc.

Draw conclusions from your work.

Report those conclusions to the relevant stakeholders.
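The workflow steps above can be sketched as a small pipeline. This is only an illustration using the Python standard library: the inline CSV data, the column names, and every function name are hypothetical, and a real project would typically read from files or databases and use richer tools.

```python
import csv
import io
import statistics

# Hypothetical raw data standing in for a real source (file, API, database).
RAW_CSV = """rep,region,sales
Asha,North,120
Ben,South,95
Chen,North,
Dee,South,110
Asha,North,120
"""


def obtain(text):
    """Obtain: read the raw rows from CSV text."""
    return list(csv.DictReader(io.StringIO(text)))


def explore(rows):
    """Explore: count the rows and how many lack a sales value."""
    missing = sum(1 for r in rows if not r["sales"].strip())
    return {"rows": len(rows), "missing_sales": missing}


def clean(rows):
    """Clean: drop rows with missing sales or exact duplicates, convert types."""
    seen, cleaned = set(), []
    for r in rows:
        key = (r["rep"], r["region"], r["sales"])
        if not r["sales"].strip() or key in seen:
            continue
        seen.add(key)
        cleaned.append(
            {"rep": r["rep"], "region": r["region"], "sales": float(r["sales"])}
        )
    return cleaned


def analyse(rows):
    """Analyse: compute mean sales per region."""
    by_region = {}
    for r in rows:
        by_region.setdefault(r["region"], []).append(r["sales"])
    return {region: statistics.mean(values) for region, values in by_region.items()}


def conclude(means):
    """Conclude: name the region with the highest mean sales."""
    best = max(means, key=means.get)
    return f"{best} has the highest mean sales ({means[best]:.1f})."


if __name__ == "__main__":
    rows = obtain(RAW_CSV)
    print("Exploration:", explore(rows))
    means = analyse(clean(rows))
    # Report: here just printed; in practice a document or dashboard.
    print("Report:", conclude(means))
```

Each function maps to one workflow step, so a step can be revisited (for example, cleaning differently after exploring further) without touching the others.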


What is Big Data
Big data refers to a volume of data so large that it cannot be stored or processed by traditional data storage or processing units.

Big data is often generated automatically by a machine of some sort and is usually not in a user-friendly format. The default is to capture everything and worry about what matters later.

Big data exceeds the reach of commonly used hardware environments and software tools to capture, manage, and process it within a tolerable elapsed time for its user population.

Big data refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyse.
Traits of Big Data
Web scraping
https://data-lessons.github.io/library-webscraping-DEPRECATED/01-introduction/
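As a rough illustration of what web scraping involves, the sketch below extracts links from a page using only Python's standard library. The HTML snippet, the `LinkCollector` class, and the `extract_links` helper are all hypothetical; a real scraper would first download the page over HTTP and should respect the site's terms of use.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this over HTTP.
SAMPLE_HTML = """
<html><body>
  <h1>Staff directory</h1>
  <ul>
    <li><a href="/people/asha">Asha</a></li>
    <li><a href="/people/ben">Ben</a></li>
  </ul>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently being read, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return all (href, text) pairs found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(href, text)
```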
Reporting vs Analysis
Data Reporting: Gathering data into one place and presenting it in visual representations.

Data Analysis: Interpreting your data and giving it context.

Data analysis is a more difficult task than data reporting because it requires knowledge about different analytical models and statistical techniques.

Reporting can connect cross-channel data, give comparisons, and make information easier to grasp (e.g. dashboards, charts, and graphs are reports, not analysis reports), whereas analysis understands the data and makes action suggestions.

In reporting, data is organised into summaries, whereas analysis involves inspecting, cleaning, and transforming data before creating models.

Reporting translates data into information, whereas analysis turns information into insights.

Reporting allows users to ask 'What' questions about the data. Ex: 'What is the average performance of our sales team?' Analysis, by contrast, should answer 'Why' questions and 'What can we do about it?'
https://bidataintel.com/2021/05/data-analysis-and-reporting-differences/
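The difference between a 'What' question and a 'Why' question can be illustrated with a small sketch. The records, field meanings, and function names below are hypothetical. The correlation is only a first step towards a 'why' answer and does not by itself establish a cause.

```python
import math
import statistics

# Hypothetical records: (sales rep, calls made, sales closed).
SALES = [("Asha", 40, 12), ("Ben", 25, 7), ("Chen", 55, 16), ("Dee", 30, 9)]


def report_average_sales(records):
    """Reporting: answer 'What is the average performance of our sales team?'"""
    return statistics.mean(sales for _, _, sales in records)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def analyse_calls_vs_sales(records):
    """Analysis: start on 'why' by checking whether call volume tracks sales."""
    calls = [c for _, c, _ in records]
    sales = [s for _, _, s in records]
    return pearson(calls, sales)


if __name__ == "__main__":
    print("Average sales per rep:", report_average_sales(SALES))
    print("Correlation of calls with sales:", round(analyse_calls_vs_sales(SALES), 3))
```

The report is a single summary number. The analysis relates two variables, which is what allows it to suggest an action to investigate, such as whether making more calls leads to more sales.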
