
● Data science is a field that uses statistical and computational
techniques to extract insights and knowledge from data.
● It encompasses a wide range of tasks, including data cleaning and
preparation, data visualization, statistical modeling, machine learning, and
more.
● Data scientists use these techniques to discover patterns and trends in
data, make predictions, and support decision-making.

Data Science Components:

1. Statistics: Statistics is a way to collect and analyze large amounts of numerical
data and find insights in it.
2. Domain expertise: Domain expertise, meaning specialized knowledge of a particular
area, binds data science together.
3. Data engineering: Data engineering is the part of data science that involves acquiring,
storing, and transforming data.
4. Visualization: Data visualization means representing data in a visual context so
that people can easily understand it.
5. Advanced computing: Advanced computing involves designing, writing, and
maintaining the source code of computer programs.
6. Mathematics: Mathematics is a critical part of data science. Mathematics involves
the study of quantity.
7. Machine learning: Machine learning is the backbone of data science. Machine learning is
about training a machine so that it can act like a human brain.

Data Science Lifecycle


The main phases of the data science life cycle are given below:

1. Discovery: The first phase is discovery. When you start any data science project, you
need to determine its basic requirements, priorities, and budget.

2. Data preparation: Data preparation is also known as data munging. In this phase, we
need to perform the following tasks:

○ Data cleaning

○ Data reduction

○ Data integration

○ Data transformation

3. Model planning: In this phase, we need to determine the various methods and
techniques to establish the relation between input variables.

Common tools used for model planning are:

○ SQL Analysis Services

○ R

○ SAS

4. Model building: In this phase, the process of model building starts. We create
datasets for training and testing purposes.

The following are some common model building tools:

○ SAS Enterprise Miner

○ WEKA

5. Operationalize: In this phase, we deliver the final reports of the project, along
with briefings, code, and technical documents. This phase gives you a clear
overview of the complete project's performance.

6. Communicate results: In this phase, we check whether we have reached the goal that we
set in the initial phase.
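
The data preparation and model building phases above can be illustrated with a minimal Python sketch. It uses only the standard library and covers three of the listed tasks: data cleaning, data transformation, and splitting a dataset for training and testing. The records, field names, and helper functions (`clean`, `scale_income`, `train_test_split`) are hypothetical and chosen for illustration only; a real project would more likely use a dedicated data analysis library.

```python
import random
from statistics import mean

# Hypothetical raw records: one missing age and one duplicated row.
raw_rows = [
    {"id": 1, "age": 34, "income": 52000},
    {"id": 2, "age": None, "income": 61000},
    {"id": 3, "age": 45, "income": 75000},
    {"id": 3, "age": 45, "income": 75000},
    {"id": 4, "age": 29, "income": 48000},
    {"id": 5, "age": 52, "income": 90000},
]


def clean(rows):
    """Data cleaning: drop rows with a repeated id, fill missing ages with the mean age."""
    seen, unique = set(), []
    for row in rows:
        if row["id"] not in seen:
            seen.add(row["id"])
            unique.append(dict(row))  # copy so the raw data stays untouched
    fill_value = mean(r["age"] for r in unique if r["age"] is not None)
    for row in unique:
        if row["age"] is None:
            row["age"] = fill_value
    return unique


def scale_income(rows):
    """Data transformation: min-max scale income into the range [0, 1].

    Assumes the incomes are not all equal (otherwise the range is zero).
    """
    lo = min(r["income"] for r in rows)
    hi = max(r["income"] for r in rows)
    return [{**r, "income_scaled": (r["income"] - lo) / (hi - lo)} for r in rows]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Model building: shuffle a copy and hold out a fraction of rows for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


prepared = scale_income(clean(raw_rows))
train_rows, test_rows = train_test_split(prepared)
print(len(prepared), len(train_rows), len(test_rows))  # prints: 5 4 1
```

Filling a missing value with the mean and holding out 20% of the rows are just two common choices; the right cleaning rule and split ratio depend on the dataset and the goals set in the discovery phase.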

Need for Data Science:

○ With the help of data science technology, we can convert massive amounts of
raw and unstructured data into meaningful insights.

○ Data science technology can handle huge amounts of data.

○ Data science can help with different kinds of predictions, such as those for
surveys and elections.

Applications of Data Science:

○ Image recognition and speech recognition:
When you upload an image on Facebook, you start getting suggestions to tag
your friends. This automatic tagging suggestion uses an image recognition
algorithm, which is part of data science.

○ Gaming world:
EA Sports, Sony, and Nintendo are widely using data science to enhance the user
experience.

○ Internet search:
Search engines use data science technology to make the search experience
better.

○ Transport:
Transport industries are also using data science technology to create self-driving
cars.

○ Healthcare:
In the healthcare sector, data science is providing lots of benefits. Data science
is being used for tasks such as tumor detection and drug discovery.

○ Recommendation systems:
Most companies are using data science technology to create a better
user experience with personalized recommendations.

Types of Data Science Roles:

Data Strategist
Ideally, before a company collects any data, it hires a data strategist, a
senior professional who understands how data can create value for
businesses.

Data Architect
A data architect plans out high-level database structures. This involves
the planning, organization, and management of information within a
firm, ensuring its accuracy.

Data Engineer
Data engineers build the infrastructure, organize tables, and set up the
data to match the use cases defined by the architect.

Data Analyst
Data analysts explore, clean, analyze, visualize, and present information,
providing valuable insights for the business. They typically use SQL to
access the database.

Data Scientist
A data scientist can leverage machine learning and deep learning to create
models and make predictions based on past data.

We can distinguish three main types of data scientists:

○ Traditional data scientists

○ Research scientists

○ Applied scientists

Understanding Data Science History


1962: John W. Tukey, an American mathematician, predicted the rise of a new field in
his now-famous paper "The Future of Data Analysis."
1977: The International Association for Statistical Computing (IASC) was founded, with
the mission of "linking traditional statistics, the knowledge of experts and modern
computer technology, to transform data into knowledge."
1990: The pioneering Knowledge Discovery in Databases workshop was held and the IFCS
was founded.
1994: The emerging phenomenon of "Database Marketing" was recognized in business.
2000: Data science has evolved into a recognized and specialized field, and data
collection is widespread.
2005: Big data makes its debut. Google and Facebook are using big data.
2014: As data became more important and organizations became more interested in
detecting patterns, data science became an important part of businesses.
2015: Artificial intelligence (AI), machine learning, and deep learning all make their
debut in the field of data science.
2018: New regulations are introduced in the field.
2020s: We're seeing further advancements in AI and machine learning.

Data Security Issues:

Data Storage

Fake Data

Data Management

Data Access Control

Data Poisoning

Employee Theft
