Download as pdf or txt
Download as pdf or txt
You are on page 1of 40

Data Access

Lecture 2
Conducted by
Ms. Akila Brahmana
Department of ICT
Faculty of Technology
University of Ruhuna
Objectives
 Describe the importance of defining the access strategy early
in the development of the warehouse.
 Identify the different categories of data access tools.
 Identify the different database models that support OLAP
query tools.
 Identify OLAP query techniques.
 Identify non-OLAP tools for data access.
 Identify factors influencing query tool choice.
Data Warehouse vs. Heterogeneous DBMS
 Traditional heterogeneous DB integration:
 Build wrappers/mediators on top of heterogeneous databases
 Query driven approach
 When a query is posed to a client site, a meta-dictionary is used to
translate the query into queries appropriate for individual
heterogeneous sites involved, and the results are integrated into a
global answer set
 Complex information filtering, compete for resources
 Data warehouse: update-driven, high performance
 Information from heterogeneous sources is integrated in
advance and stored in warehouses for direct query and analysis
Data Warehouse vs. Operational DBMS
 OLTP (on-line transaction processing)
 Major task of traditional relational DBMS
 Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration,
accounting, etc.
 OLAP (on-line analytical processing)
 Major task of data warehouse system
 Data analysis and decision making
 Distinct features (OLTP vs. OLAP):
 User and system orientation: customer vs. market
 Data contents: current, detailed vs. historical, consolidated
 Database design: ER + application vs. star + subject
 View: current, local vs. evolutionary, integrated
 Access patterns: update vs. read-only but complex queries
OLTP vs. OLAP
OLTP OLAP
Users clerk, IT professional knowledge worker

Function day to day operations decision support


DB design application-oriented subject-oriented
Data current, up-to-date detailed, flat relational historical, summarized,
isolated multidimensional integrated,
consolidated
Usage repetitive ad-hoc
Access read/write lots of scans
index/hash on prim. key
Unit of work short, simple transaction complex query
# records accessed tens millions
#users thousands Hundreds
DB size 100MB-GB 100GB-TB
Metric transaction throughput query throughput, response
Data Access Tools
 OLAP tools
 Multidimensional data access
 Relational data access
 Standard reporting tools
 Relational data access primarily
 Multidimensional access possibly
 Data mining tools
OLAP and DSS Compared
 OLAP - used for multidimensional analysis
 DSS - provides a system enabling decision making
 OLAP tools provide a DSS capability
 OLAP for the warehouse provides analytical power
OLAP Rules
1. Multidimensional conceptual view
2. Transparency
3. Accessibility
4. Consistent reporting performance
5. Client-server architecture
OLAP Rules

6. Multi-user support
7. Unrestricted cross dimensional operations
8. Intuitive data manipulation
9. Flexible reporting
10. Unlimited dimensions and aggregation levels
Relational Database Model
Table Columns

Rows

Key values to join

The data is found at the intersection of a row


and a column
Multidimensional Database Model

Customer Store
Store

Time Time

SALES FINANCE

Product
GL_Line

The data is found at the intersection of


dimensions
Comparison of Models
Property Relational Multidimensional
Data representation Two dimension Multiple dimensions

Data extraction Specific rows Specific dimensions

Results Tool specific Matrix


Multidimensional Queries
 What are my ten best and worst products, in terms of percentage
growth, in 2019 compared with 2018?
 What is the relative growth of my largest selling product?
 How did each of my products grow last year?
 Chart the four-week moving-average trend for sales over the last
two fiscal years.
 How profitable is each of my sales territories, if overhead is
allocated on a sales versus head count basis?
 Project next quarter’s sales based on the last 38 months’ figures
for the Northern Region.
OLAP Models
MOLAP ROLAP

Desktop

OLAP
Server

Desktop
MD Server OLAP

Database
Server

Warehouse
The MOLAP Model

 The application layer stores data in a DSS Client

multidimensional structure.
 The presentation layer provides the
multidimensional view.
MOLAP
Application Layer
Engine

Warehouse
The MOLAP Model
 Data
DSS Client
 Arrays
 Cached
 Offloaded from server
 Efficient storage and processing MOLAP
Application Layer
Engine
 Complexity hidden from the user
Analysis using pre aggregated
summaries and pre calculated measures
Warehouse
The ROLAP Model

 The warehouse stores atomic data. DSS Client

 The application layer generates SQL for


the two dimensional view.
 The presentation layer provides the
multidimensional view.
ROLAP
Application Layer
Engine

Multiple
SQL

Warehouse
Server
The ROLAP Model
 Data and metadata in Server
DSS Client
 Multidimensional views of data
 High connectivity
 Unlimited ROLAP
Application Layer
Database size
Engine

Query criteria
Multiple

SQL
 Complex SQL generated by tool
Warehouse
Server
OLAP Query Characteristics
 Access to large amounts of data
 Analysis of data relationships by many business criteria
 Analysis of data by time
 Display of data across different dimensions
 Complex calculations using formula
 Quick response
Standard Query Techniques
Customer

Why?
Slice /Dice Time

What? Account
Why?

Why?

Drill-down
Standard Query Techniques

Why? Drill-up

What?
Drill-across
Why?

Why?
Pivoting
Standard Query Techniques
 Roll up (drill-up): summarize data
 by climbing up hierarchy or by dimension reduction

 Drill down (roll down): reverse of roll-up


 from higher level summary to lower level summary or detailed data, or
introducing new dimensions

 Slice and dice:


 project and select
Standard Query Techniques
 Pivot (rotate):
 reorient the cube, visualization, 3D to series of 2D planes.

 Other operations
 drill across: involving (across) more than one fact table
 drill through: through the bottom level of the cube to its backend
relational tables (using SQL)
A Sample Data Cube
Example: Standard Query Techniques
Subject: Sales
Slice
 The Slice OLAP operations takes
one specific dimension from a cube
given and represents a new sub-
cube, which provides information
from another point of view.
 It can create a new sub-cube by
choosing one or more dimensions.
The use of Slice implies the
specified granularity level of the
dimension.
Dice
 OLAP Dice emphasizes two or
more dimensions from a cube
given and suggests a new sub-
cube, as well as Slice operation
does.
Drill Up
 Drill-up is an operation to gather
data from the cube either by
ascending a concept hierarchy for a
dimension or by dimension reduction
in order to receive measures at a less
detailed granularity.

 So that to see a broader perspective


in compliance with the concept
hierarchy a user has to group
columns and unite the values.
Drill down
 Drill-down is an operation
opposite to Drill-up. It is carried
out either by descending a
concept hierarchy for a
dimension or by adding a new
dimension.
 It lets a user deploy highly
detailed data from a less
detailed cube.
Pivot
 This OLAP operation rotates
the axes of a cube to provide
an alternative view of the
data cube.
 Pivot clusters the data with
other dimensions which helps
analyze the performance of a
company or enterprise.
Understanding OLAP Standard Query Techniques
Given a fact table with sales data (for example sales(market#, product#,
time#, amount)) and relevant dimension tables, write an SQL statement
that slices the cube to select sales only in week 2, and dice it by regions.

Assuming that we have the following dimension tables:

Market (Market_ID, City, Region)


Product (Product_ID, Name, Category, Price)
Time (Time_ID, Week, Month, Quarter)
Sales (Market_ID, Product_ID, Time_ID, Amount)
Answer
The following query provides a solution:
SELECT M.Region, SUM (Amount)
FROM Sales S, Time T, Market M
WHERE T.Time_ID = S.Time_ID AND T.Week =“Week2”
AND S.Market_ID = M.Market_ID
GROUP BY M. Region
Understanding OLAP Standard Query Techniques
Calculate the following query
given the table (sales) below:
SELECT Time, Region,
Department, sum(Profit) AS
Profit
FROM sales
GROUP BY ROLLUP(Time,
Region, Department)
Answer
Non-OLAP Tools
 Specialized tools for DSS
 May access relational or multidimensional data
 Access operational and warehouse data
 Ad-hoc query
 Reporting
 Exploration
Non-OLAP Tools
 Spreadsheets

12345.00 100% ABC CO


12780.00 110% GMBH LTD
2345787.00 230% GBUK INC
87877.98 200% FFR ASSOC
5678.00 -10% MCD CO
Choosing a Reporting Architecture
 Business needs
 User adaptability Good

 GUI interface MOLAP

Query
 Computer architecture Performance

 Network architecture ROLAP

OK
 Openness
Simple Complex
 Performance Analysis

 Management
 Enterprise-wide perspective
Tools Comparison
 Relational
 Known environment
 Use with operational and warehouse systems
 No complex analysis functions
 Multidimensional
 Quick access to data
 Extensive libraries of complex functions
 Strong modeling and forecasting capabilities
 Use with operational and warehouse systems
 Difficulty of changing dimensions
Summary

 Data access strategy


 Data access tools
 Database models for OLAP
 OLAP query techniques
 Non-OLAP tools
 Query tool choice
Thank You!

You might also like