
Birla Institute of Technology & Science, Pilani Work-Integrated Learning Programmes Division Second Semester 2010-2011 Mid-Semester Test

(EC-1 Regular) Solution
Course No. : SS ZG515
Course Title : DATA WAREHOUSING
Nature of Exam : Closed Book
Weightage : 40%
Duration : 2 Hours
Date of Exam : 05/02/2011 (AN)

1. How are the top-down and bottom-up approaches for building a data warehouse different? Discuss the merits and disadvantages of each approach. [7]

Ans :

The Top-Down Approach

Bill Inmon saw a need to transfer data from diverse OLTP systems into a centralized place where the data could be used for analysis. He insisted that data should be organized into subject-oriented, integrated, non-volatile, and time-variant structures. The data should be accessible at detailed atomic levels by drilling down, or at summarized levels by drilling up. The data marts are treated as subsets of the data warehouse. Each data mart is built for an individual department and is optimized for the analysis needs of that department.

The Bottom-Up Approach

Ralph Kimball designed the data warehouse with the data marts connected to it by a bus structure. The bus structure contains all the common elements used by the data marts, such as conformed dimensions and measures, defined for the enterprise as a whole. He felt that by using these conformed elements, users can query all data marts together. This architecture makes the data warehouse more of a virtual entity than a physical one. All data marts could be located on one server or on different servers across the enterprise, while the data warehouse would be nothing more than the sum total of all the data marts.
Advantages of Top-Down
a. A truly corporate effort, an enterprise view of data
b. Inherently architected, not a union of disparate data marts (DMs)
c. Single, central storage of data about the content
d. Central rules and control
e. May be developed fast using an iterative approach

Disadvantages of Top-Down
f. Takes longer to build even with an iterative method
g. High exposure/risk to failure
h. Needs a high level of cross-functional skills
i. High outlay without proof of concept
j. Difficult to sell this approach to senior management and sponsors

Advantages of Bottom-Up Approach
k. Faster and easier implementation of manageable pieces
l. Favorable ROI and proof of concept
m. Less risk of failure
n. Inherently incremental; can schedule important DMs first
o. Allows the project team to learn and grow

Disadvantages of Bottom-Up Approach
p. Each DM has its own narrow view of data
q. Permeates redundant data in every DM
r. Difficult to integrate if the overall requirements are not considered in the beginning

2. Why is it difficult to capture the effectiveness of the promotion dimension? Why is the promotion dimension not suitable for the inventory data mart? [6]

Ans :

Promotion conditions include temporary price reductions (TPRs), end-aisle displays, newspaper ads, and coupons; combinations of these are common. Promotions are judged on the basis of: [2]
Lift and baseline sales
Time shifting
Cannibalization
Growing the market
Several of these quantities (the baseline, and the sales that were merely shifted in time or taken from other products) cannot be measured directly, so it is difficult to capture the effect of a promotion. [4]

The attributes of the promotion dimension describe typical promotion types (discount, brokerage, joint venture, etc.). These are business attributes tied to sales, so there is no need to include them in the inventory data mart. [2]
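The criteria in answer 2 can be illustrated with a little arithmetic. The sketch below is only an illustration: the function names and all figures are hypothetical, and it assumes the usual reading of lift as sales during the promotion minus the baseline (the sales expected without the promotion).

```python
def promotion_lift(actual_units, baseline_units):
    """Return (lift in units, lift as a percentage of baseline).

    baseline_units is an ESTIMATE of what would have sold without
    the promotion; it is never observed directly.
    """
    lift_units = actual_units - baseline_units
    lift_pct = 100.0 * lift_units / baseline_units
    return lift_units, lift_pct


def net_promotion_gain(lift_units, cannibalized_units, time_shifted_units):
    """Subtract sales taken from other products (cannibalization) and
    sales merely pulled forward from later periods (time shifting).
    Both inputs are estimates as well (hypothetical model)."""
    return lift_units - cannibalized_units - time_shifted_units


# Hypothetical figures for one promoted product in one week.
lift_units, lift_pct = promotion_lift(actual_units=1500, baseline_units=1000)
net_units = net_promotion_gain(lift_units, cannibalized_units=200,
                               time_shifted_units=150)
print(lift_units, lift_pct, net_units)
```

Every input except the actual sales is an estimate, which is one way to see why promotion effectiveness is hard to capture.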

3. Explain the importance of grain in a fact table. How are fact tables sparse? Why do we not store GMROI in the inventory fact table? [8]

Ans :

Grain is the level of detail of the measurements or metrics; it determines what a single row of the fact table represents. [2]

Sparse fact table: rows exist only for combinations of dimension values where something actually happened. For example, in a sales fact table, if there is a holiday or no orders are received or processed, the rows for that day are absent (null). [4]

GMROI (Gross Margin Return On Inventory) is not additive and is therefore not stored in the enhanced fact table. GMROI is calculated from the constituent columns. [2]
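A small worked example shows why a ratio such as GMROI is not additive. The sketch below assumes a simplified definition, gross margin divided by average inventory value; the exact formula used in the course text may differ, and the column names and figures are hypothetical.

```python
# Hypothetical inventory facts for two products (additive components only).
rows = [
    {"product": "A", "gross_margin": 200.0, "avg_inventory_value": 100.0},
    {"product": "B", "gross_margin": 300.0, "avg_inventory_value": 600.0},
]


def gmroi(gross_margin, avg_inventory_value):
    """Simplified GMROI (assumed definition): margin per unit of inventory value."""
    return gross_margin / avg_inventory_value


# Wrong: adding the per-product ratios as if GMROI were an additive fact.
sum_of_ratios = sum(
    gmroi(r["gross_margin"], r["avg_inventory_value"]) for r in rows
)

# Right: add the constituent columns first, then compute the ratio.
total_margin = sum(r["gross_margin"] for r in rows)
total_inventory = sum(r["avg_inventory_value"] for r in rows)
ratio_of_sums = gmroi(total_margin, total_inventory)

print(sum_of_ratios, ratio_of_sums)
```

The two results differ, so a stored GMROI value could not be summed across products. Storing the components and computing the ratio at query time avoids the problem.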

4. Why is it important to transform the data before it is loaded into the data warehouse? List any five types of transformations that are performed. Give an example for each. [6]

Ans :

Source systems are very diverse and disparate. Many systems are old legacy systems running on obsolete databases. Historical changes in data values are not preserved in source operational systems. Source operational systems lack consistency. Data must also be changed to reflect new business conditions. [3]

Major transformation processes that are performed: [3]
1) format revisions
2) decoding of fields
3) character set conversion
4) key restructuring
5) de-duplication
6) merging of information

5. Explain the difference between destructive merge and constructive merge for applying data to the data warehouse repository. When do you use these modes? [6]

Ans :

Both modes belong to the data loading component.

Destructive merge: Incoming data is applied to the target data. If the primary key of an incoming record matches an existing record, the matching record is updated; an incoming record without a match is simply added to the target.

Constructive merge: If the primary key of an incoming record matches an existing record, the existing record is kept, and the incoming record is added and marked as superseding the old one. [4]

These modes are used in the data loading and append processes. [2]

6. Discuss the architectural components of the data warehouse. In which architectural component does OLAP fit in? What is the function of OLAP? [7]
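As a supplementary note to answer 5, the two merge modes can be sketched in a few lines. This is a minimal illustration, not a loader implementation: the target is modeled as an in-memory dictionary keyed by primary key, and the function names and the `is_current` and `loaded_on` fields are hypothetical.

```python
def destructive_merge(target, incoming):
    """target and incoming map primary key -> row.
    A matching key overwrites the old row; a new key is simply added."""
    for key, row in incoming.items():
        target[key] = row
    return target


def constructive_merge(target, incoming, load_date):
    """target maps primary key -> list of row versions.
    Old versions are kept; the incoming row is appended and marked
    as the one that supersedes them."""
    for key, row in incoming.items():
        versions = target.setdefault(key, [])
        for old in versions:
            old["is_current"] = False
        versions.append({**row, "is_current": True, "loaded_on": load_date})
    return target


# Destructive: customer 1 is overwritten, customer 2 is added.
current = destructive_merge(
    {1: {"city": "Pune"}},
    {1: {"city": "Delhi"}, 2: {"city": "Goa"}},
)

# Constructive: the old version of customer 1 survives as history.
history = constructive_merge(
    {1: [{"city": "Pune", "is_current": True, "loaded_on": "2011-01-01"}]},
    {1: {"city": "Delhi"}},
    load_date="2011-02-05",
)
print(current)
print(history)
```

The destructive mode loses the previous value, while the constructive mode preserves it, which matters when the warehouse must keep history.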

Ans :

The data warehouse architectural components are as follows:
Data Content
Complex Analysis and Quick Response
Metadata-driven
At the Data Source - The data staging architectural component governs the transformation, cleansing, and integration of data.

Data Source Layer
This represents the different data sources that feed data into the data warehouse. The data source can be in any format: a plain text file, a relational database, another type of database, an Excel file, and so on can all act as a data source. Many different types of data can be a data source:
Operations data, such as sales data, HR data, product data, inventory data, marketing data, and systems data.
Web server logs with user browsing data.
Internal market research data.
Third-party data, such as census data, demographics data, or survey data.
All these data sources together form the Data Source Layer.

Data Extraction Layer
Data gets pulled from the data source into the data warehouse system. There is likely some minimal data cleansing, but major data transformation is unlikely at this stage.

Staging Area
This is where data sits prior to being scrubbed and transformed into a data warehouse / data mart. Having one common area makes subsequent data processing / integration easier.

ETL Layer
This is where data gains its "intelligence", as logic is applied to transform the data from a transactional nature to an analytical nature. This layer is also where data cleansing happens.

Data Storage Layer
This is where the transformed and cleansed data sits. Based on scope and functionality, three types of entities can be found here: data warehouse, data mart, and operational data store (ODS). In any given system, you may have just one of the three, two of the three, or all three types.

Data Logic Layer
This is where business rules are stored. Business rules stored here do not affect the underlying data transformation rules, but do affect what the report looks like.

Data Presentation Layer
This refers to the information that reaches the users. It can take the form of a tabular / graphical report in a browser, an emailed report that is automatically generated and sent every day, or an alert that warns users of exceptions, among others.

Metadata Layer
This is where information about the data stored in the data warehouse system is kept. A logical data model is an example of something in the metadata layer.

System Operations Layer
This layer includes information on how the data warehouse system operates, such as ETL job status, system performance, and user access history.

OLAP is used for DSS tools that use multidimensional data analysis techniques:
Support for a DSS data store
Data extraction and integration filter
Specialized presentation interface
Multidimensional data analysis
Advanced database support
Easy-to-use end-user interfaces
Support for client/server architecture
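The multidimensional analysis that OLAP provides can be sketched with a tiny in-memory example. The data, the dimension names, and the `aggregate` helper below are hypothetical; a real OLAP tool would run such queries against the data storage layer rather than a Python list.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, region, product, sales_amount).
DIMENSIONS = ("year", "quarter", "region", "product")
facts = [
    ("2010", "Q1", "North", "Pens", 100),
    ("2010", "Q1", "South", "Pens", 150),
    ("2010", "Q2", "North", "Ink", 200),
    ("2011", "Q1", "North", "Pens", 120),
]


def aggregate(facts, by):
    """Sum sales_amount grouped by the chosen dimensions."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # last column is the additive measure
    return dict(totals)


# Summarized level: total sales per year.
by_year = aggregate(facts, by=("year",))

# Drill down: add the quarter dimension for more detail.
by_year_quarter = aggregate(facts, by=("year", "quarter"))

print(by_year)
print(by_year_quarter)
```

Grouping by fewer dimensions gives the summarized view, and adding a dimension drills down to finer detail. The same additive measure supports both because it is stored at the atomic grain.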
