DWH Concepts (By - P.M.Prasad)
A variation on the staging structure is the addition of data marts to the data
warehouse. The data marts store summarized data for a particular line of business,
making that data easily accessible for specific forms of analysis. For example,
adding data marts can allow a financial analyst to more easily perform detailed
queries on sales data and make predictions about customer behavior. Data marts
make analysis easier by tailoring data specifically to meet the needs of the end user.
A fact is the part of your data that indicates a specific occurrence or transaction. For
example, if your business sells flowers, some facts you would see in your data
warehouse are:
Several numbers can describe each fact, and we call these numbers measures.
Some measures to describe the fact ‘ordered 500 new flower pots from China for
$1500’ are:
• Quantity ordered - 500
• Cost - $1500
When analysts are working with data, they perform calculations on measures (e.g.,
sum, maximum, average) to glean insights. For example, you may want to know the
average number of flower pots you order each month.
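As a rough illustration of calculating on measures, the sketch below sums, takes the maximum of, and averages the quantity and cost measures. The order rows and the variable names are made up for this example; only the 500-pot, $1500 order comes from the text above.

```python
from statistics import mean

# Hypothetical order facts: (month, quantity_ordered, cost_usd).
# Only the first row mirrors the example in the text; the rest are invented.
orders = [
    ("2023-01", 500, 1500.0),
    ("2023-01", 100, 320.0),
    ("2023-02", 300, 900.0),
    ("2023-03", 300, 900.0),
]

# Sum the quantity measure per month.
monthly_quantity = {}
for month, quantity, _cost in orders:
    monthly_quantity[month] = monthly_quantity.get(month, 0) + quantity

# Common calculations on measures: sum, maximum, average.
total_cost = sum(cost for _month, _quantity, cost in orders)
largest_order = max(quantity for _month, quantity, _cost in orders)
average_per_month = mean(monthly_quantity.values())

print(monthly_quantity)
print(total_cost, largest_order, average_per_month)
```

The average is taken over monthly totals rather than over individual orders, which is what "average number of flower pots you order each month" asks for.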
• Time purchased - 1 pm
• Simpler tables make data modification commands faster to write and execute
• Less redundant data means you save on disk space, and so you can collect
and store more data
Denormalization is the process of deliberately adding redundant copies or groups of
data to already normalized data. It is not the same as un-normalized data.
Denormalization improves read performance and makes it much easier to
manipulate tables into the forms you want. When analysts work with data warehouses,
they typically only perform reads on the data. Thus, denormalized data can save
them vast amounts of time and headaches.
Benefits of denormalization:
• Fewer tables minimize the need for table joins, which speeds up data analysts’
workflow and helps them discover more useful insights in the data
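To make the trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (suppliers, orders, orders_wide) and the rows are assumptions made for illustration. The same question is answered once with a join on normalized tables and once with a plain read on a denormalized copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: each supplier's country is stored exactly once.
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        supplier_id INTEGER REFERENCES suppliers(supplier_id),
        quantity INTEGER
    );
    INSERT INTO suppliers VALUES (1, 'China'), (2, 'Netherlands');
    INSERT INTO orders VALUES (10, 1, 500), (11, 2, 120), (12, 1, 250);

    -- Denormalized copy: the country is deliberately repeated on every order row.
    CREATE TABLE orders_wide AS
        SELECT o.order_id, o.quantity, s.country
        FROM orders o JOIN suppliers s ON s.supplier_id = o.supplier_id;
""")

# Reading from the normalized tables needs a join.
normalized = conn.execute("""
    SELECT s.country, SUM(o.quantity)
    FROM orders o JOIN suppliers s ON s.supplier_id = o.supplier_id
    GROUP BY s.country ORDER BY s.country
""").fetchall()

# Reading from the denormalized table needs no join.
denormalized = conn.execute("""
    SELECT country, SUM(quantity)
    FROM orders_wide
    GROUP BY country ORDER BY country
""").fetchall()

print(normalized)
print(denormalized)
```

Both queries return the same totals. The denormalized read is simpler to write, at the cost of storing each country value once per order.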
Data Models
It would be wildly inefficient to store all your data in one massive table. So, your data
warehouse contains many tables that you can join together to get specific
information. The main table is called a fact table, and dimension tables surround it.
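A fact table surrounded by dimension tables might look like the sketch below, again using sqlite3. The schema (fact_orders, dim_product, dim_date) and its contents are hypothetical and not taken from any particular warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the context of each fact.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, month TEXT);

    -- The fact table holds the measures plus a key into each dimension.
    CREATE TABLE fact_orders (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        quantity INTEGER,
        cost REAL
    );

    INSERT INTO dim_product VALUES (1, 'flower pot'), (2, 'rose bush');
    INSERT INTO dim_date VALUES (1, '2023-01'), (2, '2023-02');
    INSERT INTO fact_orders VALUES
        (1, 1, 500, 1500.0),
        (2, 1, 40, 200.0),
        (1, 2, 300, 900.0);
""")

# Join the fact table to its dimensions to get readable, specific information.
rows = conn.execute("""
    SELECT p.name, d.month, SUM(f.quantity), SUM(f.cost)
    FROM fact_orders f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d ON d.date_id = f.date_id
    GROUP BY p.name, d.month
    ORDER BY p.name, d.month
""").fetchall()

for row in rows:
    print(row)
```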
The first step in designing a data warehouse is to build a conceptual data model that
defines the data you want and the high-level relationships between them.
Online analytical processing (OLAP) allows you to run complex read queries and
thus perform a detailed analysis of historical transactional data. OLAP systems help
to analyze the data in the data warehouse.
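One common kind of OLAP-style read query is a roll-up, which totals a measure at different levels of detail. The sketch below imitates that in plain Python. The roll_up helper and the sample rows are assumptions made for illustration, not part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical historical sales facts with three dimensions and one measure.
facts = [
    {"year": 2022, "region": "north", "product": "flower pot", "sales": 100.0},
    {"year": 2022, "region": "south", "product": "flower pot", "sales": 150.0},
    {"year": 2023, "region": "north", "product": "rose bush", "sales": 80.0},
    {"year": 2023, "region": "north", "product": "flower pot", "sales": 120.0},
]


def roll_up(rows, dimensions, measure="sales"):
    """Total the measure for every combination of the given dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


by_year = roll_up(facts, ["year"])                   # coarse view
by_year_region = roll_up(facts, ["year", "region"])  # more detailed view
grand_total = roll_up(facts, [])                     # everything in one cell

print(by_year)
print(by_year_region)
print(grand_total)
```

Passing fewer dimensions gives a coarser summary, and passing more gives a finer one. An OLAP system runs this kind of aggregation over far larger volumes of data.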