
Overview of Data Science

What is Data Science
• The key word in Data Science is Science. The term "science" implies
knowledge gained through systematic study
• Data Science is the generalizable extraction of knowledge from Data
• Data Science is highly inter-disciplinary in nature


Data Science Process

What is Data Analytics

• Data analytics relies heavily on data summarization and visualization,
with a focus on business information
• For a supermarket chain, analytics on sales data can answer questions
such as:
– Top 10 selling products by volume
– Top 10 selling products by value
– Comparative sales volume day wise
– Comparative sales volume store wise
– Top 10 selling products for each store
– Section wise sales figures
– Number of people who made a purchase
– Top 10 most profitable products
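
A few of the supermarket summaries listed above can be sketched in a few lines of Python. This is only a minimal illustration: the `sales` rows, the store and product names, and the helper `top_products` are hypothetical and do not come from any real dataset.

```python
from collections import Counter

# Hypothetical sales lines: (store, product, units_sold, unit_price)
sales = [
    ("S1", "Milk", 10, 1.5),
    ("S1", "Bread", 4, 2.0),
    ("S2", "Milk", 6, 1.5),
    ("S2", "Cheese", 4, 8.0),
    ("S2", "Bread", 5, 2.0),
]

def top_products(rows, n=10, by="volume"):
    """Return the n best-selling products by units ("volume") or revenue ("value")."""
    totals = Counter()
    for _store, product, units, price in rows:
        totals[product] += units if by == "volume" else units * price
    return totals.most_common(n)

def volume_by_store(rows):
    """Total units sold per store, for a store-wise comparison."""
    totals = Counter()
    for store, _product, units, _price in rows:
        totals[store] += units
    return dict(totals)

top_by_volume = top_products(sales, n=2, by="volume")
top_by_value = top_products(sales, n=2, by="value")
store_volume = volume_by_store(sales)
```

Note that the ranking changes with the measure: in this toy data Milk leads by volume while Cheese leads by value, which is why the two lists are kept separate.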

Understanding of Business Analytics

• Business analytics (BA) refers to the skills, technologies, applications
and practices for continuous iterative exploration and investigation of
past business performance to gain insight and drive business planning.
(Beller, Michael J.; Alan Barnett (2009-06-18). "Next Generation Business Analytics".
Lightship Partners LLC.)

• Business analytics focuses on developing new insights and
understanding of business performance based on data and statistical
methods

• Business analytics makes extensive use of data, statistical and
quantitative analysis, explanatory and predictive modelling, and
fact-based management to drive decision making

Need of Business Analytics

• A business problem is a difference between the desired state and the
current state
• We assume that the desired state has a higher level of business
benefits compared to the current state
• The business benefit may be in terms of revenue / customer retention
/ employee satisfaction / ease of operations (productivity
improvement / defect reduction / availability)
• The benefit may be measurable in quantitative terms. At a minimum, it
must be observable
• Business Analytics tries to provide actionable insights and decision
support to bridge the gap between the current and the desired state

Types of Business Analytics – Descriptive Analytics

• Descriptive Analytics answers the question “What has happened”
using historical data.
• It uses exploratory analysis and visualization techniques on data to
gather insights for future actions, e.g. finding how many goods were
sold in the past and designing a promotional campaign if required.
• It is characterized by traditional BI and Visualization
• Gives Hindsight
• Information oriented
• Descriptive Analytics is also a step towards Predictive Analytics

Types of Business Analytics – Predictive Analytics

• The Predictive Analytics approach focuses on modelling and prediction,
with emphasis on the business relevance of the resulting insight
• It forecasts future performance and results based on previous trends
and patterns and current knowledge of the scenario
• It answers the question “What is likely to happen” and so suggests action
• Gives Insight
• Knowledge oriented
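
As a minimal sketch of forecasting from previous trends, the example below fits a straight line to past values by ordinary least squares and extends it one period ahead. The `monthly_sales` figures and the function names are hypothetical. A linear trend is only one of many possible predictive models and is used here purely for illustration.

```python
def fit_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def forecast(values, steps_ahead=1):
    """Extend the fitted trend line steps_ahead periods past the last observation."""
    slope, intercept = fit_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)

# Hypothetical monthly sales figures (units)
monthly_sales = [100, 110, 120, 130, 140]
next_month = forecast(monthly_sales)
```

With this perfectly linear toy series, the fitted slope is 10 units per month, so the forecast for the next month is 150.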

Types of Business Analytics – Prescriptive Analytics

• Prescriptive Analytics is a form of advanced analytics that examines
data to answer the question “What should be done”
• It recommends a decision or action
• Gives Foresight
• Optimization oriented
• Machine Learning is used in Predictive and Prescriptive Analytics
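
To make the "optimization oriented" idea concrete, the sketch below picks the order quantity with the highest expected profit across a few demand scenarios. Everything here is an illustrative assumption: the scenario probabilities, costs, prices and function names are hypothetical, and unsold stock is assumed to have no salvage value.

```python
def expected_profit(order_qty, scenarios, unit_cost, unit_price):
    """Expected profit of ordering order_qty units, given (demand, probability) scenarios."""
    total = 0.0
    for demand, probability in scenarios:
        sold = min(order_qty, demand)  # cannot sell more than was ordered
        total += probability * (sold * unit_price - order_qty * unit_cost)
    return total

def recommend_order(candidates, scenarios, unit_cost, unit_price):
    """Recommend the candidate quantity with the highest expected profit."""
    return max(
        candidates,
        key=lambda qty: expected_profit(qty, scenarios, unit_cost, unit_price),
    )

# Hypothetical demand scenarios: (units demanded, probability)
scenarios = [(80, 0.25), (100, 0.5), (120, 0.25)]
best_qty = recommend_order([80, 100, 120], scenarios, unit_cost=6.0, unit_price=10.0)
```

Here the predictive step supplies the scenarios, and the prescriptive step turns them into a recommended action. In this toy case, ordering 100 units has the highest expected profit.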

Supply Chain Analytics

• Globalization and the availability of alternatives have resulted in product
variance, demand volatility and globally dispersed customers. The base of
suppliers and logistics partners has grown accordingly
• Demand management and inventory management have become a big
challenge
• To serve customer demand, it is important to forecast on a daily or
real-time basis.
• Analytics is applied to these areas to
– Anticipate demand more accurately
– Predict and monitor inventory supply and replenishment
– Plan the inventory flow of goods and services
– Continuously rethink the logistics network and freight optimization
– Optimize overall supply chain cost
– Improve supply chain visibility and responsiveness, and minimize customer impact
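
A minimal sketch of daily demand forecasting feeding a replenishment decision is shown below. It uses a simple moving average and the textbook reorder-point rule (forecast daily demand × lead time + safety stock). The demand figures, lead time, safety stock and stock level are hypothetical, and real supply chain systems typically use richer models.

```python
def moving_average_forecast(daily_demand, window=3):
    """Forecast the next day's demand as the mean of the last `window` days."""
    recent = daily_demand[-window:]
    return sum(recent) / len(recent)

def reorder_point(daily_forecast, lead_time_days, safety_stock):
    """Stock level at which a replenishment order should be placed."""
    return daily_forecast * lead_time_days + safety_stock

# Hypothetical daily demand for one product (units per day)
demand = [40, 44, 38, 42, 46, 44]
daily_forecast = moving_average_forecast(demand, window=3)
rop = reorder_point(daily_forecast, lead_time_days=5, safety_stock=30)

on_hand = 240  # hypothetical current stock level
needs_order = on_hand <= rop
```

Re-running this each day as new demand figures arrive gives the daily forecasting loop described above.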

Health Care Analytics

• Health care is one of the key areas attracting the attention of many
analytics professionals
• In health care, priority areas for the application of analytics are:
– Clinical Decision Support Systems – reduce diagnostic errors and delayed
diagnoses
– Population Health Management – identify patterns that can be replicated
in large populations or geographic areas
– Remote Health Management and Telehealth – using patient data from
insurance claims, lab results, hospital information systems and pharmacies,
identify target groups of patients who will respond to telehealth
– Genomic Research – mapping, sequencing and analysis of DNA helps to
discover subtypes of diseases (e.g. different subtypes of cancer) and
subtypes of population. This helps in developing vaccines and cures
– Readmission – analyse the clinical and non-clinical factors that affect
readmission
– Preventive Health Care – analyse health history to predict disease
and take preventive steps at the right time

Marketing Analytics

• Marketing and sales performance is enhanced using analytics:
– Sales and lead generation: which regions and customer segments to target
– Insights into customer preferences and trends
– Design and monitoring of campaigns and their respective outcomes
– Allocation of the advertisement budget to various channels
– Clickstream analysis to provide customers with personalized
offerings and focused marketing
– Evaluation of industry trends and trending features based on keyword searches
– Loyalty and ongoing engagement, upselling and cross-selling
– Reduction of customer churn
– Analysis of the variables influencing consumer behaviour to support decisions

Human Resource Analytics

• Analytics is applied to the human resource processes of an
organization to improve employee performance and the
business outcome
– Employee attrition: why are our best and most experienced employees
leaving prematurely?
– Forecast workforce requirements and determine how best to fill open
positions
– Improve workforce utilization, employee satisfaction and productivity
– Determine what training should be delivered and when; improve competency
– Reduce the cost of recruitment and retention without compromising on
business goals
– Determine policy effectiveness and design new policies

Data Management and Business Analytics
• Data is one of the most important assets of any enterprise.
• Managing data faces many challenges: volume growth, multiple data
sources, multiple data types, an increase in sites and number of systems,
data retention, missing and noisy data, etc.
• As business gets more competitive, management faces the task of
sifting through enormous amounts of data, hidden in multiple
operational and historical legacy systems, to make informed decisions.
Typical steps are:
– Data Extraction – extract data from multiple data sources
– Data Integration – combine multiple data sources and multiple data types
– Data Cleaning – remove noise and inconsistent data
– Data Selection – retrieve the data relevant to the analysis task
– Data Transformation – transform and consolidate data into a form suitable for
analysis. Perform summary or aggregation as required.
• Efficient Data Management is one of the key challenges faced by
organizations for effective Business Analytics
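
The extraction, integration, cleaning, selection and transformation steps listed above can be sketched as a small pipeline. This is a toy illustration only: the two in-memory "sources", their field names and the helper functions are all hypothetical stand-ins for real operational systems.

```python
# Hypothetical raw records from two sources with different schemas
store_rows = [
    {"sku": "A1", "qty": "3"},
    {"sku": "B2", "qty": ""},   # missing value
    {"sku": "A1", "qty": "x"},  # noisy value
    {"sku": "C3", "qty": "5"},
]
online_rows = [
    {"product": "A1", "units": 2},
    {"product": "B2", "units": 4},
]

def integrate(store, online):
    """Integration: map both sources onto one common schema."""
    unified = [{"sku": r["sku"], "qty": r["qty"], "channel": "store"} for r in store]
    unified += [{"sku": r["product"], "qty": r["units"], "channel": "online"} for r in online]
    return unified

def clean(rows):
    """Cleaning: drop rows whose quantity is missing, non-numeric or negative."""
    cleaned = []
    for row in rows:
        try:
            qty = int(str(row["qty"]))
        except ValueError:
            continue
        if qty >= 0:
            cleaned.append({**row, "qty": qty})
    return cleaned

def select(rows, skus):
    """Selection: keep only the products relevant to the analysis task."""
    return [row for row in rows if row["sku"] in skus]

def transform(rows):
    """Transformation: aggregate total quantity per product."""
    totals = {}
    for row in rows:
        totals[row["sku"]] = totals.get(row["sku"], 0) + row["qty"]
    return totals

# Extraction is simulated here by reading the two in-memory lists above
unified = integrate(store_rows, online_rows)
totals = transform(select(clean(unified), skus={"A1", "B2"}))
```

Keeping each step as a separate function makes it easy to inspect the data between stages, which is often where missing and noisy values are discovered.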
Web Analytics and Business Intelligence

• Web Analytics focuses on the interactions of customers with the
web, and particularly with the company's website.
• Data from the web and from a web server's log can be harvested to
generate useful and actionable business intelligence on existing
and potential customers
• Web Analytics is extremely powerful when data from the web is
integrated with data from customer and sales databases in a
Data Warehouse.
• Web Analytics provides business intelligence used to better
understand the customer's unique needs, interests and patterns
• It also helps to identify technical and navigation website issues
and identify improvements for website design based on user
behaviour patterns
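
As a rough sketch of harvesting a web server log, the example below parses a few lines in the widely used Common Log Format and counts page views, distinct visitors and broken links. The sample log lines, IP addresses and paths are made up for illustration, and the regular expression assumes that log format. Other servers may log differently.

```python
import re
from collections import Counter

# Assumes Common Log Format: ip ident user [time] "METHOD path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical log excerpt
sample_log = """\
203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /products HTTP/1.1" 200 2326
203.0.113.5 - - [10/Oct/2023:13:56:01 +0000] "GET /cart HTTP/1.1" 200 512
198.51.100.7 - - [10/Oct/2023:13:57:12 +0000] "GET /products HTTP/1.1" 200 2326
198.51.100.7 - - [10/Oct/2023:13:57:40 +0000] "GET /checkout HTTP/1.1" 404 128
"""

def summarize(log_text):
    """Count successful page views, distinct visitor IPs and 404 (broken) paths."""
    page_views = Counter()
    visitors = set()
    broken_links = Counter()
    for line in log_text.splitlines():
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        visitors.add(match["ip"])
        if match["status"].startswith("2"):
            page_views[match["path"]] += 1
        elif match["status"] == "404":
            broken_links[match["path"]] += 1
    return page_views, visitors, broken_links

page_views, visitors, broken_links = summarize(sample_log)
```

The page-view counts relate to customer interest, while the 404 counts point at the technical and navigation issues mentioned above.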
Data Warehousing and OLAP

• A Data Warehouse is a data repository that is maintained
separately from an organization’s operational database
• Information from multiple heterogeneous sources is integrated in
advance and stored in a warehouse for direct query and analysis
• Data Warehouses support information processing by providing a solid
platform of consolidated historic data for analysis
• Data Warehousing systems serve users or knowledge workers in the
role of Data Analysis and Decision Making by providing Online
Analytical Processing (OLAP) tools for the interactive analysis
• Such systems can organize and present data in various formats in
order to accommodate the diverse needs of various users
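
Two common OLAP operations, roll-up (aggregating along dimensions) and slice (fixing one dimension to a value), can be sketched over a tiny in-memory fact table. The fact rows, the dimension names and the helper functions are hypothetical. A real warehouse would run such queries through an OLAP engine or SQL rather than plain Python.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales_amount)
DIMENSIONS = ("year", "region", "product")
facts = [
    (2023, "North", "Tea", 120),
    (2023, "North", "Coffee", 80),
    (2023, "South", "Tea", 100),
    (2024, "North", "Tea", 150),
    (2024, "South", "Coffee", 90),
]

def roll_up(rows, by):
    """Aggregate the sales amount over the dimensions named in `by`."""
    indexes = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in indexes)
        totals[key] += row[-1]
    return dict(totals)

def slice_rows(rows, dimension, value):
    """Keep only the rows where one dimension equals a fixed value."""
    index = DIMENSIONS.index(dimension)
    return [row for row in rows if row[index] == value]

by_year = roll_up(facts, by=("year",))                 # coarse view
by_year_region = roll_up(facts, by=("year", "region")) # drill-down view
tea_by_region = roll_up(slice_rows(facts, "product", "Tea"), by=("region",))
```

Moving from `by_year` to `by_year_region` is a drill-down. The reverse direction is a roll-up. Different users can request whichever level of detail suits their needs.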

Data Visualization using R and Excel

• Data visualization is the presentation of data in a pictorial or
graphical format. It enables decision makers to see analytics
presented visually, so they can grasp difficult concepts or
identify new patterns.
• Basic visualization can be done in both R and Excel using the
following chart types:
– Histogram
– Bar / Line Chart
– Pie Chart
– Box plot
– Scatter plot
• R offers a good set of built-in functions and
libraries (such as ggplot2, leaflet, lattice) to build visualizations
and present data.

Data Visualization using Tableau

• Tableau is a very flexible product that enables data analytics through
visualization and summarization, including live visual analytics

Big Data and Data Science

• When data grows so large that traditional database management systems
can no longer handle it, we call it Big Data
• Humans are creating trillions of GB of data every year. This is sometimes termed
the “data deluge”
• Merely keeping up with this flood, and storing the bits that might be useful, is difficult
enough. Analysing it, to spot patterns and extract useful information, is harder still
• Data scientists need to work more and more with Big Data, which often
comprises billions of transactions every day, in order to predict outcomes and
recommend decisions
• E.g. credit card companies need to perform analytics on millions of transactions and
come up with rules so that they can monitor every purchase and identify fraudulent
transactions with high accuracy
• Video surveillance produces hundreds of GB of video footage each day, which
needs to be constantly analyzed for unusual behavior
• Online retailers also analyse millions of purchases to identify customer behavior and
market trends in order to recommend new products
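
To illustrate what a simple monitoring rule for card transactions might look like, the sketch below flags a purchase whose amount lies far from a customer's usual spending, measured in standard deviations. This is a deliberately naive, hypothetical rule with made-up figures. Production fraud detection relies on far richer features and models.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Flag an amount more than `threshold` standard deviations from past spending."""
    if len(history) < 2:
        return False  # not enough history to judge
    average = mean(history)
    spread = stdev(history)
    if spread == 0:
        return amount != average
    return abs(amount - average) / spread > threshold

# Hypothetical past purchase amounts for one cardholder
history = [42.0, 38.5, 51.0, 45.0, 40.0, 47.5]

typical_flag = is_suspicious(history, 44.0)
unusual_flag = is_suspicious(history, 900.0)
```

At Big Data scale the same kind of rule would have to be evaluated on every incoming transaction, which is where the volume challenge described above comes in.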

Data Science as a Strategic Asset

• We live in a world characterized by VUCA – Volatility, Uncertainty,
Complexity and Ambiguity
• Finding actionable insights and making the future more predictable in such
situations is both challenging and rewarding in terms of competitive
edge, cost optimization, innovative solutions and even ensuring
survival in difficult times
• A company that has included Data Science as part of its
decision-making strategy benefits on a continual basis. A Data Scientist is
a key asset to any organization and drives this process
