Data Warehouse Development & Schemas
Top-down development
Spiral development approach
ERD based
Inmon insisted that data should be organized into subject-oriented, integrated, non-volatile and time-variant structures.
Detailed data is regularly extracted from the ODS and data marts and temporarily hosted in the staging area for aggregation and summarization, then extracted and loaded into the data warehouse.
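The extract, stage, summarize and load flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row layout (day, region, amount) and the function names stage, summarize and load are hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical detailed rows, as they might arrive from an ODS extract.
ods_rows = [
    {"day": "2024-01-01", "region": "North", "amount": 120.0},
    {"day": "2024-01-01", "region": "North", "amount": 80.0},
    {"day": "2024-01-01", "region": "South", "amount": 50.0},
    {"day": "2024-01-02", "region": "North", "amount": 30.0},
]

def stage(rows):
    """Copy the extracted rows into a temporary staging list."""
    return [dict(r) for r in rows]

def summarize(staged):
    """Aggregate staged detail rows to one total per (day, region)."""
    totals = defaultdict(float)
    for r in staged:
        totals[(r["day"], r["region"])] += r["amount"]
    return dict(totals)

def load(warehouse, summary):
    """Append summarized rows; existing warehouse rows are never updated."""
    for (day, region), total in sorted(summary.items()):
        warehouse.append({"day": day, "region": region, "total_amount": total})
    return warehouse

warehouse = load([], summarize(stage(ods_rows)))
for row in warehouse:
    print(row)
```

The load step only appends, which mirrors the non-volatile property, and every summarized row carries its day, which mirrors the time-variant property.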
The bottom-up approach uses a bus structure.
Plan big, build small.
Build subject-oriented or department-oriented data warehouses (data marts), such as marketing or sales.
This model strikes a good balance between centralized and localized flexibility.
This architecture makes the data warehouse more of a virtual reality than a physical reality: all data marts could be located on one server or on different servers across the enterprise, while the data warehouse would be a virtual entity, nothing more than the sum total of all the data marts.
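One way to picture a warehouse that is "nothing more than the sum total of all the data marts" is the small Python sketch below. The class names DataMart and VirtualWarehouse, the shared region attribute and the sample figures are all assumptions made for illustration. The virtual warehouse stores no rows of its own and answers each query by reading the marts.

```python
class DataMart:
    """A department-level store; rows share a common 'region' dimension."""
    def __init__(self, subject, rows):
        self.subject = subject
        self.rows = rows

class VirtualWarehouse:
    """Holds no data itself; every answer is computed from the marts."""
    def __init__(self, marts):
        self.marts = list(marts)

    def by_region(self):
        report = {}
        for mart in self.marts:
            for row in mart.rows:
                cell = report.setdefault(row["region"], {})
                cell[mart.subject] = cell.get(mart.subject, 0.0) + row["value"]
        return report

# Hypothetical marts; they could live on one server or on several.
sales_mart = DataMart("sales", [
    {"region": "North", "value": 200.0},
    {"region": "South", "value": 50.0},
])
marketing_mart = DataMart("marketing", [
    {"region": "North", "value": 30.0},
])

dw = VirtualWarehouse([sales_mart, marketing_mart])
report = dw.by_region()
print(report)
```

The marts can be combined only because they agree on the meaning of region; that shared dimension plays the role of the bus in this sketch.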
DW Development Approaches
Top-down (Inmon approach) and bottom-up (Kimball approach)
Star schema
A star schema can be simple or complex. A simple star consists of one fact table; a complex star can have more than one fact table.
Dimension attributes are normally descriptive, textual values.
Dimension tables are generally smaller than the fact table.
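A simple star with one fact table can be sketched with Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_date), the columns and the sample rows are illustrative assumptions, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: small, descriptive, textual attributes.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT);
-- Fact table: numeric measures plus one foreign key per dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    units INTEGER,
    revenue REAL
);
INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Furniture');
INSERT INTO dim_date VALUES (1, '2024-01-01', '2024-01'), (2, '2024-02-01', '2024-02');
INSERT INTO fact_sales VALUES (1, 1, 10, 15.0), (1, 2, 4, 6.0), (2, 1, 1, 40.0);
""")

# Each dimension is exactly one join away from the fact table.
star_rows = conn.execute("""
    SELECT p.category, d.month, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d ON d.date_id = f.date_id
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
""").fetchall()
print(star_rows)
```

Note that category is stored directly in dim_product and repeated for every product in that category; this redundancy is what keeps the number of joins small.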
Simple structure -> easy-to-understand schema
Great query effectiveness -> small number of tables to join
Relatively long time to load data into dimension tables -> because of de-normalization and redundant data, the tables can be large
Most commonly used schema in data warehouse implementations -> widely supported by a large number of business intelligence tools
Snowflake schema
The snowflake schema architecture is a more complex variation of the star schema used in a data warehouse, because the tables which describe the dimensions are normalized.
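As a rough illustration of normalized dimensions, the sqlite3 sketch below moves the category attribute out of the product dimension into its own table. All table names, column names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The category attribute is normalized out of the product dimension.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    revenue REAL
);
INSERT INTO dim_category VALUES (1, 'Stationery'), (2, 'Furniture');
INSERT INTO dim_product VALUES (1, 'Pen', 1), (2, 'Pencil', 1), (3, 'Lamp', 2);
INSERT INTO fact_sales VALUES (1, 15.0), (2, 5.0), (3, 40.0);
""")

# Reaching the category name now takes two joins instead of one.
snowflake_rows = conn.execute("""
    SELECT c.category_name, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_category c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
print(snowflake_rows)
```

Each category name is stored once, which removes redundancy, at the price of an extra join in every query that groups by category.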
Fact constellation schema
For each star schema it is possible to construct a fact constellation schema (for example, by splitting the original star schema into several star schemas, each of which describes facts at a different level of the dimension hierarchies).
The fact constellation architecture contains multiple fact tables that share many dimension tables.
The main shortcoming of the fact constellation schema is a more complicated design, because many variants for particular kinds of aggregation must be considered and selected. Moreover, the dimension tables are still large.
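A minimal sqlite3 sketch of two fact tables sharing one dimension table follows. The tables fact_sales and fact_shipments, their columns and the sample data are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One dimension table shared by two fact tables.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    units_sold INTEGER
);
CREATE TABLE fact_shipments (
    product_id INTEGER REFERENCES dim_product(product_id),
    units_shipped INTEGER
);
INSERT INTO dim_product VALUES (1, 'Pen'), (2, 'Lamp');
INSERT INTO fact_sales VALUES (1, 10), (1, 4), (2, 1);
INSERT INTO fact_shipments VALUES (1, 20), (2, 3);
""")

# Aggregate each fact table on its own, then line the totals up
# through the shared dimension. Joining both fact tables directly
# would multiply rows and inflate the sums.
constellation_rows = conn.execute("""
    SELECT p.name,
           (SELECT SUM(s.units_sold) FROM fact_sales s
             WHERE s.product_id = p.product_id),
           (SELECT SUM(h.units_shipped) FROM fact_shipments h
             WHERE h.product_id = p.product_id)
    FROM dim_product p
    ORDER BY p.name
""").fetchall()
print(constellation_rows)
```

Having to decide how each fact table is aggregated before the results are combined is a small instance of the design complexity mentioned above.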
The project must fit with corporate strategy
There must be complete buy-in to the project
It is important to manage user expectations
The data warehouse must be built incrementally
Adaptability must be built in from the start
The project must be managed by both IT and business professionals (a business-supplier relationship must be developed)
Only load data that have been cleansed and are of high quality
Do not overlook training requirements
Be politically aware
MBA IV SEM (SEC-A)
Risks in Implementing DW
No mission or objective
Quality of source data unknown
Skills not in place
Inadequate budget
Lack of supporting software
Source data not understood
Weak sponsor
Users not computer literate
Political problems or turf wars
Unrealistic user expectations
Architectural and design risks
Scope creep and changing requirements
Vendors out of control
Multiple platforms
Key people leaving the project
Loss of the sponsor
Too much new technology
Having to fix an operational system
Geographically distributed environment
Team geography and language culture
Starting with the wrong sponsorship chain
Setting expectations that you cannot meet
Engaging in politically naive behavior
Loading the warehouse with information just because it is available
Believing that data warehousing database design is the same as transactional DB design
Choosing a data warehouse manager who is technology oriented rather than user oriented
(see more on page 356)
The practice of enabling real-time data updates for real-time analysis and real-time decision making is growing rapidly.
Due to its huge size and its intrinsic nature, a DW requires especially strong monitoring in order to sustain its efficiency, productivity and security. The successful administration and management of a data warehouse entails skills and proficiency that go beyond what is required of a traditional database administrator.
Scalability
The amount of data in the warehouse
How quickly the warehouse is expected to grow
The number of concurrent users
The complexity of user queries
Good scalability means that queries and other data-access functions will grow linearly with the size of the warehouse.
Security
Emphasis on security and privacy
Questions?