Data Warehousing Tutorial
WWW.PARETOANALYSTS.COM
6/13/2012
The structure of a data warehouse is easier for end users to navigate, understand and query than that of a relational database designed primarily to handle large volumes of transactions.
Data warehouses enable queries that cut across different segments of a company's operation. E.g. production data could be compared against inventory data even if they were originally stored in different databases with different structures.
Queries that would be complex in highly normalized databases can be easier to build and maintain in data warehouses, which also decreases the workload on transaction systems.
Data warehousing is an efficient way to manage and report on data that comes from a variety of sources, is non-uniform and is scattered throughout a company.
Data warehousing is an efficient way to manage demand for lots of information from lots of users.
Data warehousing provides the capability to analyze large amounts of historical data for insights that can give an organization a competitive advantage.
What is OLAP?
OLAP stands for Online Analytical Processing. It uses database tables (fact and dimension tables) to enable multidimensional viewing, analysis and querying of large amounts of data. E.g. OLAP technology could provide management with fast answers to complex queries on their operational data or enable them to analyze their company's historical data for trends and patterns.
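The fact and dimension tables mentioned above can be sketched with Python's built-in sqlite3 module. This is only an illustration, not a real OLAP engine: the table names (dim_time, dim_region, fact_sales), the columns and the sample figures are all hypothetical.

```python
import sqlite3

# Hypothetical star schema: one fact table referencing two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time   (time_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE fact_sales (time_id   INTEGER REFERENCES dim_time(time_id),
                         region_id INTEGER REFERENCES dim_region(region_id),
                         profit    REAL);
""")

# Made-up sample data, for illustration only.
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                 [(1, 2011, 4), (2, 2012, 1)])
conn.executemany("INSERT INTO dim_region VALUES (?, ?)",
                 [(1, "USA"), (2, "Canada")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 1, 100.0), (1, 2, 40.0), (2, 1, 120.0),
                  (2, 1, 30.0), (2, 2, 55.0)])


def profit_by_region_and_quarter(conn):
    """Aggregate the profit measure along the region and time dimensions."""
    return conn.execute("""
        SELECT r.country, t.year, t.quarter, SUM(f.profit)
        FROM fact_sales f
        JOIN dim_time   t ON t.time_id   = f.time_id
        JOIN dim_region r ON r.region_id = f.region_id
        GROUP BY r.country, t.year, t.quarter
        ORDER BY r.country, t.year, t.quarter
    """).fetchall()


rows = profit_by_region_and_quarter(conn)
for row in rows:
    print(row)
```

Each row of the result is one cell of a small two-dimensional view of profit, e.g. the total profit for one country in one quarter.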
What is OLTP?
OLTP stands for Online Transaction Processing. OLTP uses normalized tables to quickly record large numbers of transactions while making sure that each update of data occurs in as few places as possible. Consequently, OLTP databases are designed for recording the daily operations and transactions of a business. E.g. a timecard system that supports a large production environment must successfully record a large number of updates during critical periods like lunch hour, breaks, startup and close of work.
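The timecard example can be sketched as a small transactional write path, again using Python's built-in sqlite3 module. The schema (employee, punch) and the record_punch helper are hypothetical; a production OLTP system would run on a multi-user database server, but the idea of short, all-or-nothing writes against normalized tables is the same.

```python
import sqlite3

# Hypothetical normalized schema: employee details are stored once,
# and each punch row only references the employee by key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE punch (punch_id    INTEGER PRIMARY KEY,
                    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
                    punched_at  TEXT NOT NULL,
                    kind        TEXT NOT NULL CHECK (kind IN ('in', 'out')));
""")
conn.execute("INSERT INTO employee VALUES (1, 'Example Worker')")
conn.commit()


def record_punch(conn, employee_id, punched_at, kind):
    """Record one punch as a single short transaction."""
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute(
            "INSERT INTO punch (employee_id, punched_at, kind) VALUES (?, ?, ?)",
            (employee_id, punched_at, kind),
        )


record_punch(conn, 1, "2012-06-13T12:00:00", "out")
record_punch(conn, 1, "2012-06-13T12:30:00", "in")

# An invalid punch violates the CHECK constraint and is rolled back.
try:
    record_punch(conn, 1, "2012-06-13T17:00:00", "lunch")
except sqlite3.IntegrityError:
    pass

punch_count = conn.execute("SELECT COUNT(*) FROM punch").fetchone()[0]
print(punch_count)
```

Only the two valid punches are stored; the rejected one leaves no partial data behind.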
(profit by month, quarter, year), Region dimension (profit by country, state, city), Product dimension (profit for product1, product2).
space to store them. They store only the definitions and not the data of the referenced source cubes. They are similar to views in relational databases.
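The comparison with relational views can be illustrated with a short sketch using Python's built-in sqlite3 module. It shows only the relational side of the analogy: the database keeps the view's definition, not a copy of the data. The sales table and the profit_by_product view are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (product TEXT, profit REAL);
INSERT INTO sales VALUES ('product1', 10.0), ('product2', 5.0);
CREATE VIEW profit_by_product AS
    SELECT product, SUM(profit) AS total_profit
    FROM sales
    GROUP BY product;
""")

query = "SELECT product, total_profit FROM profit_by_product ORDER BY product"
before = conn.execute(query).fetchall()

# The view holds no rows of its own, so new source rows show up immediately.
conn.execute("INSERT INTO sales VALUES ('product1', 2.5)")
after = conn.execute(query).fetchall()

# What the database actually stores for the view is its definition text.
view_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = 'profit_by_product'"
).fetchone()[0]

print(before)
print(after)
print(view_sql)
```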
finally multiply this result by the number of years of transactions involved. Divide this result by 1024 to convert to kilobytes and by 1024 again to convert to megabytes.

E.g. A data warehouse will store facts about the help provided by a company's product support representatives. The fact table is made up of a composite key of 7 indexes (int data type), including the primary key. The fact table also contains 1 measure of time (datetime data type) and another measure of duration (int data type). 2000 product incidents are recorded each hour in a relational database. A typical work day is 8 hours and support is provided every day of the year. What will be the approximate size of this data warehouse in 5 years?

First calculate the approximate size of a row in bytes (int data type = 4 bytes, datetime data type = 8 bytes):

Size of a row = size of all composite indexes (add the size of all indexes) + size of all measures (add the size of all measures)
Size of a row (bytes) = (4 * 7) + (8 + 4)
Size of a row (bytes) = 40 bytes

Number of rows in fact table (1 year) = (number of transactions per hour) * (8 hours) * (365 days in a year)
Number of rows in fact table (1 year) = (2000 product incidents per hour) * (8 hours) * (365 days in a year)
Number of rows in fact table (1 year) = 2000 * 8 * 365
Number of rows in fact table (1 year) = 5840000

Size of fact table (1 year) = (Number of rows in fact table) * (Size of a row)
Size of fact table (bytes per year) = 5840000 * 40
Size of fact table (bytes per year) = 233600000
Size of fact table (megabytes per year) = 233600000 / (1024 * 1024)
Size of fact table (in megabytes for 5 years) = (233600000 * 5) / (1024 * 1024)
Size of fact table (megabytes) = 1113.89 MB
Size of fact table (gigabytes) = 1113.89 / 1024
Size of fact table (gigabytes) = 1.088 GB
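The arithmetic in the worked example above can be checked with a few lines of Python. This is a minimal sketch: the function name fact_table_size_bytes and its parameters are made up for illustration, and the byte sizes (int = 4, datetime = 8) are the ones assumed in the example.

```python
INT_BYTES = 4        # size of an int column, as assumed in the example
DATETIME_BYTES = 8   # size of a datetime column, as assumed in the example


def fact_table_size_bytes(key_columns, measure_bytes, rows_per_hour,
                          hours_per_day, days_per_year, years):
    """Estimate fact table size: row size * rows per year * number of years."""
    row_bytes = key_columns * INT_BYTES + sum(measure_bytes)
    rows_per_year = rows_per_hour * hours_per_day * days_per_year
    return row_bytes * rows_per_year * years


# 7 int key columns, one datetime measure and one int measure,
# 2000 incidents per hour, 8 hours a day, 365 days a year, for 5 years.
size_bytes = fact_table_size_bytes(
    key_columns=7,
    measure_bytes=[DATETIME_BYTES, INT_BYTES],
    rows_per_hour=2000,
    hours_per_day=8,
    days_per_year=365,
    years=5,
)
size_mb = size_bytes / (1024 * 1024)
size_gb = size_mb / 1024

print(f"{size_bytes} bytes = {size_mb:.2f} MB = {size_gb:.3f} GB")
```

With the figures from the example this gives about 1113.89 MB, or roughly 1.09 GB. Changing any one input (e.g. hours_per_day) shows how sensitive the estimate is to that assumption.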