The Data Warehouse Is A Place Where People Can Access Their Data. The Goals of A Data Warehouse Are As Follows
The data warehouse is a place where people can access their data. The goals of a data warehouse are as follows:
- Access: warehouse retrievals must be fast.
- The data in a data warehouse is consistent.
- Users must be able to slice and dice the data.
- A warehouse must provide easy-to-use browsing tools.
- The data warehouse is a place where we publish used data.
- The quality of the data in the data warehouse is a driver of business reengineering.
4/14/2012
Consistency
Both OLTP and data warehouse systems are greatly concerned with data consistency. OLTP consistency is microscopic: the point of transaction processing is to process a very large number of tiny, atomic transactions without losing any of them. In a data warehouse, consistency is measured globally. We do not care about individual transactions, but we care enormously that the current load of new data is a full and consistent set of data.
What is a Transaction
A serious OLTP System processes thousands or even millions of transactions per day.
A serious data warehouse often will process only one transaction per day, but this transaction contains millions of records and is called a production data load. What we care about is the consistent state of the system we started with before the production data load.
If we are forced to stop the production data load before it is complete, we will not roll back the inserted records. Instead, we will overwrite the entire system with a snapshot of the system taken before the production data load.
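The snapshot-and-overwrite recovery described above can be sketched roughly as follows. This is a minimal illustration using Python's built-in sqlite3 module and an in-memory database as a stand-in for a real warehouse; the `sales` table, the `production_load` function, and the sample rows are all hypothetical and not part of the original material.

```python
import sqlite3


def production_load(conn, rows):
    """Insert every row, or restore the pre-load snapshot if the load stops early."""
    # Take a snapshot of the whole (toy) warehouse before the load begins.
    snapshot = sqlite3.connect(":memory:")
    conn.backup(snapshot)
    try:
        for row in rows:
            conn.execute("INSERT INTO sales (state, units) VALUES (?, ?)", row)
            conn.commit()  # rows are committed as they arrive
        return True
    except sqlite3.Error:
        # Do not undo row by row: overwrite everything with the snapshot.
        conn.rollback()        # close the transaction left open by the failed insert
        snapshot.backup(conn)  # copy the snapshot back over the warehouse
        return False
    finally:
        snapshot.close()


# Hypothetical warehouse holding one row before the load starts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (state TEXT NOT NULL, units INTEGER NOT NULL)")
conn.execute("INSERT INTO sales VALUES ('CA', 5)")
conn.commit()

# The second row violates NOT NULL, so the load stops part-way through.
failed_ok = production_load(conn, [("WA", 10), ("OR", None)])
rows_after_failure = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

In this sketch the first new row is already committed when the load fails, yet the table ends up in its pre-load state because the whole database is replaced by the snapshot.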
You could look at, let's say, sales units, sales dollars, defects, etc.
Suppose that you ask to see a report of your company's Units Sold. Here's what you get: 113
Fact Table
A fact table is a table in the relational data warehouse that stores the detailed values for measures, or facts. For example, a fact table that stores Dollars and Units by State, by Product, and by Month has five columns:
State Product Month Units Dollars
The first 3 columns are Key columns, the remaining two are measure values.
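As a rough illustration of the five-column layout, the sketch below builds such a fact table with Python's sqlite3 module and totals one measure by one key. The table name `sales_fact` and all row values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three key columns followed by two measure columns.
conn.execute("""
    CREATE TABLE sales_fact (
        state   TEXT    NOT NULL,
        product TEXT    NOT NULL,
        month   TEXT    NOT NULL,
        units   INTEGER NOT NULL,
        dollars REAL    NOT NULL,
        PRIMARY KEY (state, product, month)
    )
""")

# Hypothetical sample rows, one per (state, product, month) combination.
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
    [
        ("WA", "Sweet Muffins", "2012-01", 10, 25.0),
        ("WA", "Salt Bread", "2012-01", 4, 12.0),
        ("OR", "Sweet Muffins", "2012-01", 6, 15.0),
    ],
)

# Measures are summed across whichever keys the report leaves out.
units_by_state = dict(
    conn.execute("SELECT state, SUM(units) FROM sales_fact GROUP BY state")
)
```

Here `units_by_state` maps each state to its total units, which is the kind of "slice" a report would request.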
Fact Table
Each column in the fact table should be either a key or a measure.
The key column for a date dimension might be either an integer key or a date.
Dimension Tables
A dimension table contains one row for each leaf-level member of the dimension. For example, a product dimension table with 3 products will have 3 rows. In most cases a dimension table also contains a numeric key column that uniquely identifies each member. This column is the primary key of the dimension table, and it is referenced by the corresponding foreign key column in the fact table.
CHRIS
Dimension Tables
If the dimension is involved in a balanced hierarchy, it will have an additional column that gives the parent for each member. For example, if you have 3 products in a dimension table, each belonging to a particular product subcategory, your table will look like this:

PROD_ID   Prod_Name         SubCategory
589       Sweet Muffins     Muffins
592       Coconut Muffins   Muffins
1218      Salt Bread        Bread
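To show how the parent column is used, here is a small sketch in Python's sqlite3 that loads the three products above into a dimension table, links a fact table to it by key, and rolls units up to the subcategory level. The table names `product_dim` and `sales_fact` and the unit counts are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Dimension table: one row per leaf member, plus a parent (SubCategory) column.
conn.execute("""
    CREATE TABLE product_dim (
        prod_id     INTEGER PRIMARY KEY,
        prod_name   TEXT NOT NULL,
        subcategory TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO product_dim VALUES (?, ?, ?)",
    [
        (589, "Sweet Muffins", "Muffins"),
        (592, "Coconut Muffins", "Muffins"),
        (1218, "Salt Bread", "Bread"),
    ],
)

# Fact table: its foreign key references the dimension's primary key.
conn.execute("""
    CREATE TABLE sales_fact (
        prod_id INTEGER NOT NULL REFERENCES product_dim (prod_id),
        units   INTEGER NOT NULL
    )
""")
# Hypothetical unit counts for each product.
conn.executemany("INSERT INTO sales_fact VALUES (?, ?)", [(589, 10), (592, 5), (1218, 7)])

# Roll the leaf-level facts up to their parent level.
units_by_subcategory = dict(conn.execute("""
    SELECT d.subcategory, SUM(f.units)
    FROM sales_fact AS f
    JOIN product_dim AS d ON d.prod_id = f.prod_id
    GROUP BY d.subcategory
"""))
```

With foreign keys enabled, inserting a fact row for a product that is missing from `product_dim` would be rejected, which is one way the key relationship protects consistency.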
Star Schema
When each dimension is stored in a single table, the database's organization is called a star schema design. When a database's dimensions are stored in a chain of tables, the design is called a snowflake design. A relational database must perform time-consuming joins each time a report executes, and a star design for a dimension requires fewer joins than a snowflake design.
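The difference in join count can be seen in a small side-by-side sketch. It uses Python's sqlite3 with hypothetical table names (`product_star`, `product_snow`, `subcategory`, `sales_fact`) and made-up unit counts; the same report needs one join against the star layout and two against the snowflake layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (prod_id INTEGER, units INTEGER);
    INSERT INTO sales_fact VALUES (589, 10), (592, 5), (1218, 7);

    -- Star: the whole product dimension lives in one table.
    CREATE TABLE product_star (
        prod_id INTEGER PRIMARY KEY, prod_name TEXT, subcategory TEXT);
    INSERT INTO product_star VALUES
        (589, 'Sweet Muffins', 'Muffins'),
        (592, 'Coconut Muffins', 'Muffins'),
        (1218, 'Salt Bread', 'Bread');

    -- Snowflake: the subcategory level is split into its own table.
    CREATE TABLE subcategory (subcat_id INTEGER PRIMARY KEY, subcat_name TEXT);
    INSERT INTO subcategory VALUES (1, 'Muffins'), (2, 'Bread');
    CREATE TABLE product_snow (
        prod_id INTEGER PRIMARY KEY, prod_name TEXT,
        subcat_id INTEGER REFERENCES subcategory (subcat_id));
    INSERT INTO product_snow VALUES
        (589, 'Sweet Muffins', 1),
        (592, 'Coconut Muffins', 1),
        (1218, 'Salt Bread', 2);
""")

# Star design: one join from the fact table to the dimension table.
star_result = conn.execute("""
    SELECT p.subcategory, SUM(f.units)
    FROM sales_fact AS f
    JOIN product_star AS p ON p.prod_id = f.prod_id
    GROUP BY p.subcategory
    ORDER BY p.subcategory
""").fetchall()

# Snowflake design: two joins, following the chain of dimension tables.
snowflake_result = conn.execute("""
    SELECT s.subcat_name, SUM(f.units)
    FROM sales_fact AS f
    JOIN product_snow AS p ON p.prod_id = f.prod_id
    JOIN subcategory AS s ON s.subcat_id = p.subcat_id
    GROUP BY s.subcat_name
    ORDER BY s.subcat_name
""").fetchall()
```

Both queries return the same totals; the snowflake version simply pays for an extra join on every report.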
HOLAP (Hybrid OLAP): a storage option that combines relational and proprietary structures.
Metadata: all the information in the data warehouse environment that is not the actual data itself.
Backing Up and Recovering
Since data warehouse data flows from the legacy system through to the data marts and eventually onto the users' desktops, a real question arises about where to take the necessary snapshots.
Choose the measured facts that will populate each fact table record.