Lecture 1

Data Warehouses

Data Warehouses - BACHELOR - 2020/21



Course outline
Lecture Goals

Scope

source: qlik.com

Landmark
Scope

source: it-novum.com

Business Advantage

• Enterprises are adept at gathering all sorts of data
 about their customers, prospects, internal business processes, suppliers, partners, and even competitors

• Capturing data, however, is just the beginning

• With a flood of data comes a flood of analytics
 analysing data and using it to make informed business decisions
Business Advantage
• BARC’s BI Trend Monitor 2020 studied the perceived importance of a data-driven culture in companies (n=2,637)
 Data-driven culture was identified as the third most important trend, rising from fifth place in the previous year.
 Creating a data-driven culture is about replacing gut feeling with decisions based on data-derived facts
   simple key figures such as revenue or profit, results from advanced analytics models, or even qualitative data
 The preconditions for establishing a data-driven culture are access to data, governance of usage and quality of data, methodological knowledge on how to analyze data, and appropriate technologies to prepare and analyze data.

source: bi-survey.com
Business Advantage

• Special knowledge about a business situation
 conveys business advantage
 allows for proper responses and decisions

• 21st-century information processing
 we should learn how to think about a company's data as a corporate information asset
   one that can be manipulated in different ways to corporate benefit
Business Advantage

• Information is the fundamental building block of the decision process
 At a certain point organizations enter a stage in which intuitions and hunches are not enough to make optimal business decisions

• It is important to deliver adequate information to decision makers
Data

source: collabion.com, healthcatalyst.com
Data issues
• Heterogeneous applications, unsystematic formats and poor data quality make it difficult to analyse the data
 which impedes decision making

Data issues
• Bad customer data costs companies six percent of their total sales
 according to a UK Royal Mail survey
 https://www.royalmail.com/business/system/files/royal-mail-data-services-insight-report-2018.pdf

• The overall cost of poor data quality for U.S. businesses is put at $3.1 trillion per year
 estimate by IBM
 https://www.entrepreneur.com/article/393161

• The UK Government’s Data Quality Hub estimates that organizations spend between 10% and 30% of their revenue tackling data quality issues.
Data issues
• Here’s a real-world example of poor data quality wreaking havoc despite technically being error-free
 1999: the Mars Climate Orbiter was lost due to a data consistency issue.
 While NASA used metric units, its contractor, Lockheed Martin, used imperial (English) units.
 As a result, Lockheed’s software calculated the Orbiter’s thrust in pounds of force, while NASA’s software ingested the numbers using the metric equivalent, newtons.
 This resulted in the NASA probe dipping 100 miles closer to the planet than expected, causing the Orbiter to either crash on Mars or fly out towards the sun
 NASA doesn’t know, as communication with the probe was already lost
 It resulted in the premature end of NASA’s $327.6 million mission.
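
The unit mismatch described above can be illustrated with a minimal Python sketch. The thrust value and all names are hypothetical and chosen only for illustration; only the conversion factor between pound-force and newtons is a standard constant. The point is that each number is individually valid, yet the two systems disagree because the unit is not part of the data.

```python
# Conversion factor: 1 pound-force expressed in newtons (standard definition).
LBF_TO_NEWTON = 4.4482216152605


def lbf_to_newton(value_lbf: float) -> float:
    """Convert a force given in pound-force to newtons."""
    return value_lbf * LBF_TO_NEWTON


# Hypothetical value produced by the contractor's software, in pound-force.
reported_thrust_lbf = 100.0

# The consuming software assumes newtons and reads the number unchanged.
misread_thrust_n = reported_thrust_lbf

# What the value means once the unit is converted explicitly.
correct_thrust_n = lbf_to_newton(reported_thrust_lbf)

# No single value is malformed, yet the two readings differ by this factor.
error_factor = correct_thrust_n / misread_thrust_n

print(f"misread: {misread_thrust_n:.1f} N, correct: {correct_thrust_n:.1f} N")
print(f"underestimated by a factor of {error_factor:.2f}")
```

In a data warehouse, this kind of problem is usually addressed during integration, by converting every source to one agreed unit before loading.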
Needs

You have a need for specific data, but the system it’s on won’t let you access it in the way you need. This is why data warehouses are created.
 N. Mannon, Blastam
Example

• Consider three systems in an organization
 Accounting and finances
   invoices
 Production
   production costs
 CRM
   marketing costs

• Creating a report that puts together inflows and outflows for a given customer
 is not a trivial task
 Filter and export data from all three systems
 Integrate and clean data
 Create a report
 Data refresh requires a redo
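
The manual process above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three lists stand in for exports from the accounting, production and CRM systems, and all records, field names and the `clean` rule are hypothetical. Note that the whole function has to be rerun whenever any source changes, which is the "data refresh requires a redo" problem.

```python
from collections import defaultdict

# Hypothetical exports from the three systems (inconsistent customer spelling).
invoices = [
    {"customer": "ACME", "amount": 1200.0},
    {"customer": "acme ", "amount": 300.0},
    {"customer": "Globex", "amount": 500.0},
]
production_costs = [
    {"customer": "ACME", "cost": 700.0},
    {"customer": "Globex", "cost": 200.0},
]
marketing_costs = [
    {"customer": "Acme", "cost": 100.0},
]


def clean(name: str) -> str:
    """Normalize a customer name so records from different systems match."""
    return name.strip().upper()


def build_report(invoices, production_costs, marketing_costs):
    """Integrate the three exports into inflow/outflow/balance per customer."""
    totals = defaultdict(lambda: {"inflow": 0.0, "outflow": 0.0})
    for row in invoices:
        totals[clean(row["customer"])]["inflow"] += row["amount"]
    for row in production_costs + marketing_costs:
        totals[clean(row["customer"])]["outflow"] += row["cost"]
    return {
        customer: {**values, "balance": values["inflow"] - values["outflow"]}
        for customer, values in totals.items()
    }


report = build_report(invoices, production_costs, marketing_costs)
for customer, values in sorted(report.items()):
    print(customer, values)
```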
Business Advantage
• Information delivery is possible when
 Proper efficiency/quality measures are defined
 Proper questions are formulated
   Did something happen?
   Why did something happen?
   What is going to happen?
   Unknown data dependencies/relations

• Examples
 Did something happen?
   What was last year’s sales amount for salesperson 202?
   Is the decrease in sales during the summer months this year comparable to the average decrease in past years?
 Why did something happen?
   Is the increase in the sales amount of salesperson 202 due to a general increase in sales?
 What is going to happen?
   Will the sales of our products increase before the upcoming holidays?
 Unknown data dependencies/relations
   Is there some correlation between customers’ demographics and generated profit?
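
Two of the example questions above can be expressed as queries. The sketch below uses Python's built-in `sqlite3` module with an in-memory table; the table layout, the sample rows and the choice of 2020 as "last year" are assumptions made for illustration and do not come from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (salesperson_id INTEGER, sale_year INTEGER, amount REAL)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (202, 2019, 100.0),
        (202, 2020, 150.0),
        (202, 2020, 50.0),
        (305, 2019, 300.0),
        (305, 2020, 400.0),
    ],
)

LAST_YEAR = 2020  # assumption: the year the question refers to


def total_sales(conn, year, salesperson_id=None):
    """Sum sales for a year, optionally restricted to one salesperson."""
    query = "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE sale_year = ?"
    params = [year]
    if salesperson_id is not None:
        query += " AND salesperson_id = ?"
        params.append(salesperson_id)
    return conn.execute(query, params).fetchone()[0]


# "Did something happen?": last year's sales amount of salesperson 202.
last_year_202 = total_sales(conn, LAST_YEAR, 202)

# "Why did something happen?": compare the salesperson's growth with overall growth.
growth_202 = last_year_202 / total_sales(conn, LAST_YEAR - 1, 202)
growth_all = total_sales(conn, LAST_YEAR) / total_sales(conn, LAST_YEAR - 1)

print(last_year_202, growth_202, growth_all)
```

With this sample data the salesperson's growth exceeds the overall growth, so the increase would not be explained by the general trend alone.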
Needs

• Information
 Fundamental element of the decision support process
 Allows for competitive edge
 Human intuition and hunches are not enough for optimal decisions
Needs

• Business Advantage
 Special knowledge about a business situation
   conveys business advantage
   allows for proper responses and decisions
 21st-century information processing
   we should learn how to think about a company's data as a corporate information asset,
   one that can be manipulated in different ways to corporate benefit.
Application Case
Hoyt Highland Partners
• Hoyt Highland Partners
 a marketing intelligence firm that assists health care providers with growing their patient base.

• Hoyt Highland was working with an urgent care clinic client.
 The urgent care clinic faced increased competition from other urgent care operators and convenient care clinics.
 The clinic needed to decide if it should move its location or change marketing practices to increase its income.

• Hoyt Highland identified
 where the most concentrated areas of the clinic's target audience were located
 that 80 percent of the clinic's patients lived within a 5-mile radius of a clinic location
 which clusters were well represented in the urgent care clinic database and which clusters provided the operator with the highest return-on-investment (ROI) potential
 that young families were well represented, but singles and seniors were underrepresented
 that proximity is a top factor in the choice of an urgent care clinic

• This analysis helped the clinic to determine
 that the best course of action was to change its marketing focus rather than to move its clinics.
 Today, the clinic focuses its marketing toward patients who live within a 5-mile radius of a clinic location and toward young families.

• Source:
 "Location, Location, Location", Acxiom, acxiom.com
Application Case
Sense Networks
• Sense Networks
 is one of many companies developing applications to better understand customers' movements.

• One of its applications analyses data on the movements of almost 4 million cell phone users.
 The data come from GPS, cell phone towers, and local Wi-Fi hotspots.
 The data are anonymized, but are still linked together.
 This linkage enables data miners to see clusters of customers getting together at specific locations (bars, restaurants) at specific hours.

• Clustering techniques can be used to identify what types of "tribes" these customers belong to, e.g., business travellers, "young travellers," and so on.
 By analysing data at this level of detail, customer profiles can be built with sufficient granularity to permit targeted marketing and promotions.
 Besides the conventional use of the information to target customers with better precision and appropriate offers, such systems may someday be helpful in studying crime and the spread of disease.

• Sources:
 S. Baker, "The Next Net," BusinessWeek, March 2009, pp. 42-46.
 K. Greene, "Mapping a City's Rhythm," Technology Review, March 2009.
 B. Sheridan, "A Trillion Points of Data," Newsweek, March 9, 2009, pp. 34-37.
Business Intelligence
• Maintaining data, processing information, supporting decision making and enabling analytics require a proper technological ecosystem

• Business intelligence (BI for short)
 is a technology-driven process for analyzing data and presenting information which helps executives, managers and other corporate end users make informed business decisions.

• Business intelligence system
 is designed with the primary goal of extracting important data from an organization's raw data to reveal insights to help a business make faster and more accurate decisions.

• Business intelligence helps an organization become more data-driven.
BI
• Business Intelligence systems
 are tools, technologies and applications which support data integration, analysis and reporting.

• Business intelligence
 is a set of theories, methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information for business purposes.
 can handle large amounts of information to help identify and develop new opportunities.

source: saksoft.com
Approach

source: encorebusiness.com

Approach

• How are Business Intelligence systems implemented?

• Here are the rudimentary steps:
 Step 1) Raw data is extracted from corporate databases. The data could be spread across multiple heterogeneous systems.
 Step 2) The data is cleaned and transformed, then loaded into the data warehouse. Tables can be linked, and data cubes are formed.
 Step 3) Using the BI system, the user can ask queries, request ad-hoc reports or conduct any other analysis.
source: guru99.com
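
The three steps above can be sketched as a toy pipeline in plain Python. This is a simplified illustration, not how a production BI platform works: the two source lists, the field names, the cleaning rules and the dictionary used as a "cube" are all hypothetical.

```python
from collections import defaultdict

# Step 1: raw data spread across two hypothetical heterogeneous source systems.
crm_rows = [
    {"region": " north", "year": "2020", "sales": "100.5"},
    {"region": "South", "year": "2020", "sales": "n/a"},  # unusable value
]
erp_rows = [
    {"region": "NORTH", "year": "2020", "sales": "50"},
    {"region": "south", "year": "2021", "sales": "75"},
]


def extract():
    """Step 1: pull the raw rows from every source."""
    return crm_rows + erp_rows


def transform(rows):
    """Step 2a: clean values and bring them to one consistent format."""
    cleaned = []
    for row in rows:
        try:
            sales = float(row["sales"])
        except ValueError:
            continue  # drop rows whose sales value cannot be parsed
        cleaned.append(
            {
                "region": row["region"].strip().lower(),
                "year": int(row["year"]),
                "sales": sales,
            }
        )
    return cleaned


def load(rows):
    """Step 2b: aggregate into a small cube keyed by (region, year)."""
    cube = defaultdict(float)
    for row in rows:
        cube[(row["region"], row["year"])] += row["sales"]
    return dict(cube)


def query(cube, region=None, year=None):
    """Step 3: an ad-hoc query that sums sales over the selected cells."""
    return sum(
        value
        for (cell_region, cell_year), value in cube.items()
        if (region is None or cell_region == region)
        and (year is None or cell_year == year)
    )


cube = load(transform(extract()))
print(query(cube, region="north"))
print(query(cube, year=2021))
print(query(cube))
```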

Business INTELLIGENCE

• Why the term intelligence?
Solution

• Data Warehouse (DWH)
 enables processing and gathering data from various sources in a coherent model.
 The structure of a data warehouse is optimized so as to simplify and unify data analysis.

source: fusionanalyticsworld.com
Solution

source: passionned.com

Market
• According to the latest report by IMARC Group, titled "Data Warehousing Market: Global Industry Trends, Share, Size, Growth, Opportunity and Forecast 2021-2026," the market is expected to grow at a CAGR of around 11% during 2021-2026.
 The compound annual growth rate (CAGR) is the rate of return (RoR) that would be required for an investment to grow from its beginning balance to its ending balance, assuming the profits were reinvested at the end of each period of the investment’s life span.

• Data warehousing refers to storing, analyzing and reporting structured, semi-structured and unstructured data on an electronic platform.
 It also records historical and current data in a centralized repository for providing business intelligence (BI) and long-term insights.

• Data warehousing consists of load, warehouse and query managers deployed in on-premises, cloud or hybrid environments to extract and analyze data.
Market

• Global Data Warehouse Market size was valued at US$ 23.45 Bn. in 2020

• The total revenue is expected to grow at a CAGR of 10.7% from 2021 to 2027, reaching nearly US$ 47.8 Bn.
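
The CAGR figure quoted above can be checked with a short calculation, using CAGR = (ending value / beginning value)^(1 / periods) - 1. The sketch below is a minimal illustration; it assumes seven yearly compounding periods (from the 2020 valuation to 2027), which is an interpretation of the quoted forecast and not something the source states explicitly.

```python
def cagr(begin_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate between two values over `periods` years."""
    return (end_value / begin_value) ** (1 / periods) - 1


def project(begin_value: float, rate: float, periods: int) -> float:
    """Value after compounding `rate` for `periods` years."""
    return begin_value * (1 + rate) ** periods


# Figures from the market forecast (US$ Bn.); the period count is an assumption.
MARKET_2020 = 23.45
MARKET_2027 = 47.8
PERIODS = 7

implied_rate = cagr(MARKET_2020, MARKET_2027, PERIODS)
projected_2027 = project(MARKET_2020, 0.107, PERIODS)

print(f"implied CAGR: {implied_rate:.1%}")
print(f"projected 2027 market size: US$ {projected_2027:.1f} Bn.")
```

Under this assumption the quoted start value, end value and growth rate are consistent with each other.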
Market
• Based on region
 the market across North America accounted for the largest share in 2019, holding nearly two-fifths of the market.
 the global data warehousing market across Asia-Pacific is expected to register the highest CAGR of 12.5% from 2021 to 2028

• Key market players include
 Actian Corporation, Cloudera, Inc., Amazon.com, Inc., IBM Corporation, Google Inc., Oracle Corporation, Microsoft Corporation, Snowflake, Inc., SAP, and Teradata Corporation.
Business intelligence
• Business intelligence (BI) helps organizations analyze historical and current data, so they can quickly uncover actionable insights for making strategic decisions.

• Business intelligence tools make this possible by processing large data sets across multiple sources and presenting findings in visual formats that are easy to understand and share.
Data warehousing

source: panoply.io

SQL Server ecosystem
• Integration: SQL Server Integration Services
• Storage: SQL Server
• Analysis: SQL Server Analysis Services
• Reporting: SQL Server Reporting Services
Modern trends

• Modernization
 may involve previously untapped features or platforms,
   such as in-memory functions, in-database analytics, clouds, and data types or analytics that are new to your organization.
 Besides the core warehouse, the systems and tools integrated with it need modernization too,
   especially those for analytics, reporting, and data integration.

 Russom P., Modernizing Data Warehouse Infrastructure, TDWI Checklist Report, 2018
Modern trends - automation

source: clicdata.com
Modern trends: Cloud

Modern trends - streams

source: https://dribbble.com/shots/4588750-The-Data-Stream
parikshitag.com
Modern trends - lambda

source: blue-granite.com
Modern trends - MPP

source: microsoft.com
Bibliography

• Rick Sherman,
 Business Intelligence Guidebook: From Data Integration to Analytics, Elsevier Science, 2015

• David Loshin,
 Business Intelligence: The Savvy Manager's Guide, Getting Onboard with Emerging IT, Morgan Kaufmann Publishers, 2003

• Swain Scheps,
 Business Intelligence for Dummies, Wiley Publishing, 2008

• Efraim Turban, Ramesh Sharda, Dursun Delen, David King,
 Business Intelligence (2nd Edition), Prentice Hall, 2010

• Reza Rad,
 Microsoft SQL Server 2014 Business Intelligence Development Beginner's Guide, Packt Publishing, 2014
Thank you
