Data Warehousing and OLAP
4/15/12
Motivation
In most organizations, data about specific parts of the business is there: lots and lots of data, somewhere, in some form. Data is available, but not information, and not the right information at the right time.
Data warehousing aims to:
bring together information from multiple sources to provide a consistent database source for decision support queries
off-load decision support applications from the operational (on-line transaction) systems
Walmart: 900-CPU, 2,700-disk, 23 TB Teradata system
OLAP server architectures: ROLAP, MOLAP, HOLAP
OLAP operations: roll-up, drill-down, slice and dice
Decision support helps knowledge workers (executives, managers, analysts) make faster and better decisions, answering questions such as:
What were the sales volumes by region and by product category in the last year?
How did the share price of computer manufacturers correlate with quarterly profits over the past 10 years?
Will a 10% discount increase sales volume sufficiently?
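The first question above is a typical aggregation over two dimensions. A minimal sketch of how it could be answered, assuming a hypothetical in-memory list of sales facts (the field layout, region names, and figures are invented for illustration and are not from the slides):

```python
from collections import defaultdict

# Hypothetical sales facts: (year, region, product_category, units_sold)
SALES = [
    (2011, "East", "Laptops", 120),
    (2011, "East", "Printers", 40),
    (2011, "West", "Laptops", 75),
    (2011, "East", "Laptops", 30),
    (2010, "West", "Printers", 55),
]

def sales_volume_by_region_and_category(sales, year):
    """Sum units sold in one year, grouped by (region, category)."""
    totals = defaultdict(int)
    for sale_year, region, category, units in sales:
        if sale_year == year:
            totals[(region, category)] += units
    return dict(totals)

report = sales_volume_by_region_and_category(SALES, 2011)
for (region, category), units in sorted(report.items()):
    print(f"{region:5} {category:9} {units}")
```

Unlike an OLTP request, this query reads every fact for the year rather than a few records located by key.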
OLTP: the main aim is reliable and efficient processing of a large number of transactions while ensuring data consistency.
OLAP: the main aim is efficient multidimensional processing of large data volumes.
Traditional OLTP
Traditionally, DBMSs have been used for on-line transaction processing (OLTP):
order entry: pull up order xx-yy-zz and update the status field
banking: transfer $100 from account X to account Y
Characteristics of OLTP:
clerical data processing tasks
detailed, up-to-date data
structured, repetitive tasks
short transactions are the unit of work
read and/or update a few records
isolation, recovery, and integrity are critical
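The banking example can be sketched as one short transaction that touches two records. This is only an illustration using Python's built-in sqlite3 module; the `accounts` table, its columns, the starting balances, and the `transfer` helper are assumptions, not part of the original material:

```python
import sqlite3

def transfer(conn, source, target, amount):
    """Move `amount` between two accounts as one atomic transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (source,)
        ).fetchone()
        if balance < amount:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, source),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, target),
        )

# Hypothetical schema and starting balances.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("X", 500), ("Y", 200)])
conn.commit()

transfer(conn, "X", "Y", 100)
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(balances)
```

Both updates are looked up by primary key and either both take effect or neither does, which is the integrity property the list above calls critical.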
OLTP vs. OLAP

                  OLTP                                    OLAP
users             clerk, IT professional                  knowledge worker
function          day-to-day operations                   decision support
DB design         application-oriented                    subject-oriented
data                                                      historical, summarized, multidimensional
usage             repetitive
access            read/write, index/hash on primary key
unit of work      short, simple transaction               complex query
records accessed  tens                                    millions
number of users                                           hundreds
"A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process." (W. H. Inmon)
In short: a collection of data that is used primarily in organizational decision making.
Subject-oriented: oriented to the major subject areas of the corporation that have been defined in the data model.
E.g., for an insurance company: customer, product, transaction or activity, policy, claim, account, etc.
Integrated: the data sources are heterogeneous, so there is no consistency in encoding, naming conventions, etc. among them. When data is moved to the warehouse, it is converted to a single consistent form.
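A minimal sketch of such a conversion step, assuming two hypothetical source systems that name and encode the same customer attribute differently (all field names, codes, and the `to_warehouse_row` helper are invented for illustration):

```python
# Hypothetical source-specific encodings mapped to one warehouse encoding.
GENDER_CODES = {
    "m": "M", "male": "M", "1": "M",
    "f": "F", "female": "F", "0": "F",
}

def to_warehouse_row(source_row, field_names):
    """Rename source fields and convert the gender code to the warehouse form.

    field_names maps one source's column names to warehouse column names.
    """
    row = {field_names[name]: value for name, value in source_row.items()}
    row["gender"] = GENDER_CODES[str(row["gender"]).strip().lower()]
    return row

# The same customer as recorded by two hypothetical operational systems.
crm_row = {"cust_id": 17, "sex": "male"}
billing_row = {"CustomerNo": 17, "Gender": 1}

crm_names = {"cust_id": "customer_id", "sex": "gender"}
billing_names = {"CustomerNo": "customer_id", "Gender": "gender"}

converted = [
    to_warehouse_row(crm_row, crm_names),
    to_warehouse_row(billing_row, billing_names),
]
print(converted)
```

After conversion both rows use the same column names and the same code, so warehouse queries do not need to know which source a row came from.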
Nonvolatile: operational data is regularly accessed and manipulated a record at a time, and data is updated in place in the operational environment. Warehouse data is loaded and accessed; updates do not occur in the data warehouse environment.
Time-variant: the time horizon for the data warehouse is significantly longer than that of operational systems.
Operational database: current-value data.
Data warehouse data: nothing more than a sophisticated series of snapshots, each taken at some moment in time.
The key structure of operational data may or may not contain some element of time. The key structure of the data warehouse always contains some element of time.
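The contrast between current-value data and a series of snapshots can be sketched as follows. The `SnapshotStore` class, the account id, the balances, and the dates are hypothetical and only illustrate the idea of a key that always includes time:

```python
from datetime import date

class SnapshotStore:
    """Append-only store whose key always includes a snapshot date."""

    def __init__(self):
        self._rows = {}

    def load(self, account_id, snapshot_date, balance):
        key = (account_id, snapshot_date)  # time is part of every key
        if key in self._rows:
            raise KeyError(f"snapshot {key} already loaded; snapshots are not updated")
        self._rows[key] = balance

    def history(self, account_id):
        """Return all (date, balance) snapshots for one account, oldest first."""
        return sorted((d, b) for (a, d), b in self._rows.items() if a == account_id)

operational = {"X": 500}  # operational side: current value only
warehouse = SnapshotStore()

warehouse.load("X", date(2012, 3, 31), operational["X"])
operational["X"] = 400    # updated in place; the old value is gone here
warehouse.load("X", date(2012, 4, 15), operational["X"])

history = warehouse.history("X")
print(history)
```

The operational dictionary only ever holds the latest balance, while the store keeps both snapshots and refuses to overwrite one that has already been loaded.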
Performance
Special data organization, access methods, and implementation methods are needed to support the multidimensional views and operations typical of OLAP.
Complex OLAP queries would degrade performance for operational transactions.
The concurrency control and recovery modes of OLTP are not compatible with OLAP analysis.
Function
Missing data: decision support requires historical data, which operational DBs do not typically maintain.
Data consolidation: decision support requires consolidation (aggregation, summarization) of data from heterogeneous sources: operational DBs and external sources.
Data quality: different sources typically use inconsistent data representations, codes, and formats.