
Data Science and its role in Big Data analytics
Stefano De Francisci

THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION

Eurostat
Outline

1. Data Science, basic concepts
2. A short history
3. A new concept of Science?
4. Big Data as the new frontier of Data Science
5. Data, information, knowledge

Data Science
• INTERDISCIPLINARY
• NEW KINDS OF DATA
• DATA AS PRODUCT
• NEW METHODS FOR MAKING SENSE OF DATA

WIKIPEDIA
Extraction of knowledge from large volumes of data that are structured or unstructured, which is a continuation of the field data mining and predictive analytics, also known as knowledge discovery and data mining (KDD). "Unstructured data" can include emails, videos, photos, social media, and other user-generated content.

MOUT
...[DS includes] mathematics, statistics, data engineering, pattern recognition and learning, advanced computing, visualization, uncertainty modeling, data warehousing, and high performance computing with the goal of extracting meaning from data and creating data products

BERKELEY SCHOOL OF INFORMATION
The field of data science is emerging at the intersection of the fields of social science and statistics, information and computer science, and design

DHAR
First, the raw material, the “data” part of Data Science, is increasingly heterogeneous and unstructured. Second, computers interpret data automatically, making them active agents in the process of sense making.

LOUKIDES (O’REILLY MEDIA)
…merely using data isn’t really what we mean by “data science.” A data application acquires its value from the data itself, and creates more data as a result. It’s not just an application with data; it’s a data product. Data science enables the creation of data products

ROUSE
Data science is the study of where information comes from, what it represents and how it can be turned into a valuable resource in the creation of business and IT strategies

NEW YORK UNIVERSITY
At its core, data science involves using automated methods to analyze massive amounts of data and to extract knowledge from them.
Data Science landscape
Data Science (WIKIPEDIA)

FIELDS
• Nanotechnologies
• Physics
• Robotics
• Mathematics
• Statistics
• Information theory
• Information technology

TECHNIQUES
• Signal processing
• Probability models
• Machine learning
• Statistical learning
• Data mining
• Database
• Data engineering
• Pattern recognition
• AI
• Visualization
• Predictive analytics
• Uncertainty modeling
• Data warehousing
• Data compression
• Computer programming
• High Performance Computing

OBJECTS
Methods that scale to Big Data are of particular interest in data science, although the discipline is not generally considered to be restricted to such data.

APPROACHES
The development of machine learning, a branch of artificial intelligence used to uncover patterns in data from which predictive models can be developed, has enhanced the growth and importance of data science.

Who is a Data Scientist?

Data Scientist (GARTNER)

TASKS
In addition to advanced analytic skills, this individual is also proficient at integrating and preparing large, varied datasets, architecting specialized database and computing environments, and communicating results.

MISSION
A data scientist may or may not have specialized industry knowledge to aid in modeling business problems and with understanding and preparing data.

PROFILE
The data scientist has emerged as a new role, distinct from, but with similarities to, those of business intelligence (BI) analysts and statisticians.

TALENT
Creating value from data requires a range of talents: from data integration and preparation, to architecting specialized computing/database environments, to data mining and intelligent algorithms.

RESPONSIBILITY
An individual responsible for modeling complex business problems, discovering business insights and identifying opportunities through the use of statistical, algorithmic, mining and visualization techniques.

PECULIARITY
Data scientists can be invaluable in generating insights, especially from "big data;" but their unique combination of technical and business skills, together with their heightened demand, makes them difficult to find or cultivate.

D. Laney, L. Kart
[Figure: Venn diagrams of Data Science. Labels: Unicorn, CONWAY, RALEIGH, MOUT, ERICKSON]

[Figure: Venn diagram. Region labels: 4, 5, 6, 7, >7]

Is Data Science a mature science?

Types of domain dealt with by an intellectual enterprise:
(a) topics (facts, data, problems, phenomena, observations, and the like)
(b) methods (techniques, approaches, and so on)
(c) theories (hypotheses, explanations, and so forth)

Features of a new discipline:
(a) To represent an autonomous field (unique topics);
(b) To provide an innovative approach to both traditional and new philosophical topics (original methodologies);
(c) To stand beside other disciplines, offering the systematic treatment of its own conceptual foundations (new theories).

A discipline that attempts to innovate in more than one of these domains simultaneously is premature, as it detaches itself too abruptly from the normal and continuous thread of evolution of its general field (Stent 1972).

Crossroad of
• technical matters
• theoretical issues
• applied problems
• conceptual analyses
to be anyone’s own area of specialisation

As everyone’s concern is nobody’s business

Transdisciplinary (like cybernetics or semiotics) or interdisciplinary (like biochemistry or cognitive science)?

L. Floridi
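[Figure: chart with the values 191, 201, 285 and 195]
http://www.emc.com/collateral/about/news/emc-data-science-study-wp.pdf

Lie factor = (size of effect shown in graphic) / (size of effect in data)
where size of effect = |second value - first value| / first value

                                   second value   first value   value
Size of effect shown in graphic        191            201       0.050
Size of effect in data                 285            195       0.462

Lie factor = 0.050 / 0.462 = 0.108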
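The arithmetic behind these figures can be checked with a minimal sketch. It assumes Tufte's usual definition of the lie factor (effect shown in the graphic divided by effect in the data, each effect being the relative change from the first value) and assumes the slide's columns read "second value, first value". The function names `effect_size` and `lie_factor` are illustrative and not part of the original slides.

```python
def effect_size(first_value: float, second_value: float) -> float:
    """Relative size of the change from first_value to second_value."""
    if first_value == 0:
        raise ValueError("first_value must be non-zero")
    return abs(second_value - first_value) / abs(first_value)


def lie_factor(shown_first: float, shown_second: float,
               data_first: float, data_second: float) -> float:
    """Ratio of the effect drawn in the graphic to the effect present in the data."""
    return effect_size(shown_first, shown_second) / effect_size(data_first, data_second)


# Values read from the slide; the (second, first) column order is an assumption.
shown = effect_size(first_value=201, second_value=191)
data = effect_size(first_value=195, second_value=285)
factor = lie_factor(201, 191, 195, 285)

print(f"{shown:.3f} {data:.3f} {factor:.3f}")  # prints "0.050 0.462 0.108"
```

A lie factor of 1 would mean the graphic represents the data faithfully; a value of about 0.108 suggests the graphic shows roughly a tenth of the change that is present in the data.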
Short History of Data Science
(Loosely based on Gil Press's version)

http://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-science

Steps to a Metaphysics of Data Science
• How does Data Science fit in the context of Knowledge Organization?
• What are its relations with other fields of scientific knowledge?
• Can DS be explained as part of the philosophy of science?

                        Data                 Information                 Knowledge
Scientific context      Data Science         Information Science         Knowledge Science
Philosophical context   Philosophy of Data   Philosophy of Information   Philosophy of Knowledge (Epistemology, Gnoseology)

Beyond Data Science?
Information Science is the study of information and how it is used by people within organisations.

Information Science sits at the intersection of technology, people, and organizations. It is a distinct discipline and has a focus on Information and Communication Technologies (ICT) used by people to manage information within organisations.

Data   Information   Knowledge
Information Science
Scientific context
Philosophical context

http://infosci.otago.ac.nz/what-is-information-science/

Beyond Data Science?

Data   Information   Knowledge
Knowledge Science
Scientific context
Philosophical context

https://www.crcpress.com/Knowledge-Science-Modeling-the-Knowledge-Creation-Process/Nakamori/9781439838365