
Data Science

What is Data Science

• Data science is an interdisciplinary field that uses
scientific methods, processes, algorithms and systems to
extract knowledge and insights from noisy, structured
and unstructured data, and apply knowledge and
actionable insights from data across a broad range of
application domains.

• Data science is related to data mining, machine
learning and big data.

• Data science is a "concept to unify statistics, data
analysis, informatics, and their related methods" in order
to "understand and analyze actual phenomena" with
data.

• It uses techniques and theories drawn from many fields
within the context of mathematics, statistics, computer
science, information science, and domain knowledge.

Types of Data Science Models

| Tasks | Description | Algorithms | Examples |
|---|---|---|---|
| Classification | Predict whether a data point belongs to one of a set of predefined classes. The prediction is based on learning from a known data set. | Decision trees, neural networks, Bayesian models, rule induction, k-nearest neighbors | Assigning voters to known buckets by political party (e.g., soccer moms). Bucketing new customers into one of the known customer groups. |
| Regression | Predict the numeric target label of a data point. The prediction is based on learning from a known data set. | Linear regression, logistic regression | Predicting the unemployment rate for next year. Estimating insurance premiums. |
| Anomaly detection | Predict whether a data point is an outlier compared to the other data points in the data set. | Distance based, density based, local outlier factor (LOF) | Detecting fraudulent credit card transactions. Network intrusion detection. |
| Time series | Predict the value of the target variable for a future time frame based on historical values. | Exponential smoothing, ARIMA, regression | Sales forecasting, production forecasting, virtually any growth phenomenon that needs to be extrapolated. |
| Clustering | Identify natural clusters within the data set based on inherent properties of the data. | k-means, density-based clustering (DBSCAN) | Finding customer segments in a company based on transaction, web, and customer call data. |
| Association analysis | Identify relationships within an itemset based on transaction data. | FP-Growth, Apriori | Finding cross-selling opportunities for a retailer based on transaction purchase history. |
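
To make the classification row concrete, here is a minimal sketch of k-nearest neighbors in plain Python. It is an illustration only, not the course's implementation: the function name `knn_predict`, the toy `customers` data, and the two features (monthly spend, visits per month) are all assumptions made up for this example.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; `features` and `query`
    are equal-length sequences of numbers.
    """
    # Sort the known points by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (monthly_spend, visits_per_month) -> customer group.
customers = [
    ((20.0, 1.0), "occasional"),
    ((25.0, 2.0), "occasional"),
    ((30.0, 1.0), "occasional"),
    ((200.0, 8.0), "loyal"),
    ((220.0, 10.0), "loyal"),
    ((210.0, 9.0), "loyal"),
]

# Bucket a new customer into one of the known groups.
new_customer_group = knn_predict(customers, (205.0, 9.0))
```

In this toy data the spend feature dominates the distance because of its larger scale; in practice the features would usually be rescaled before distances are computed.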
Course outline

Process Basics
• Data Science Process
• Data Exploration
• Model Evaluation

Core Algorithms
• Classification: Decision Trees, Rule Induction, k-Nearest Neighbors, Naïve Bayesian, Artificial Neural Networks, Support Vector Machines, Ensemble Learners
• Regression: Linear Regression, Logistic Regression
• Association Analysis: Apriori, FP-Growth
• Clustering: k-Means, DBSCAN

Common Applications
• Text Mining
• Time Series Forecasting
• Anomaly Detection
• Feature Selection
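
As an illustration of k-Means, listed under Clustering in the outline, the following is a rough sketch of the standard assign-and-update loop (Lloyd's algorithm) in plain Python. The function name `k_means`, the fixed iteration count, the caller-supplied starting centroids, and the toy `points` are assumptions chosen for this example; real implementations usually add random initialization and a convergence check.

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and moving the centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D data with two visually obvious groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]

# Starting centroids are supplied by the caller in this sketch.
centroids, clusters = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
```

The result depends on the starting centroids, which is why k-means is commonly run several times with different initializations.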

Ten applications that build on the concepts of data science, across domains
such as the following:

• Fraud and Risk Detection
• Healthcare
• Internet Search
• Targeted Advertising
• Website Recommendations
• Advanced Image Recognition
• Speech Recognition
• Airline Route Planning
• Gaming
• Augmented Reality
