Data Warehousing U1&2 Notes
(This PDF covers 99% of the topics in the syllabus for Units 1 and 2 of the
Data Warehousing and Data Mining subject, and its content is written the way
it should be written in the exam)
1. Operational Source –
• An operational source is a data source that consists of
operational data and external data.
• Data can come from relational DBMSs such as Informix
and Oracle.
2. Load Manager –
• The Load Manager performs all operations associated
with the extraction and loading of data into the data
warehouse.
• These tasks include the simple transformation of data to
prepare data for entry into the warehouse.
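The extract, simple transform, and load steps described above can be sketched roughly as follows. This is only an illustration: the record layout, the helper names `simple_transform` and `load`, and the in-memory list standing in for the warehouse are all assumptions and do not come from any real load-manager tool.

```python
# Hypothetical operational records (the "extract" step); field names are assumed.
source_rows = [
    {"id": "1", "city": " delhi ", "amount": "250.50"},
    {"id": "2", "city": "MUMBAI", "amount": "100"},
]

def simple_transform(row):
    """Simple transformation: fix types and normalise text before loading."""
    return {
        "id": int(row["id"]),
        "city": row["city"].strip().title(),
        "amount": float(row["amount"]),
    }

def load(rows, warehouse):
    """Load step: append the prepared rows to the (in-memory) warehouse."""
    warehouse.extend(rows)
    return warehouse

# Extract -> transform -> load into an empty stand-in warehouse.
warehouse = load([simple_transform(r) for r in source_rows], [])
print(warehouse)
```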
Contents:
Fact Table:
• A fact table is the primary table in a dimensional
model. A fact table contains:
   • Measurements/facts
   • Foreign keys to the dimension tables
Dimension table:
• A dimension table contains dimensions of a fact.
• They are joined to the fact table via a foreign key.
• Dimension tables are de-normalized tables.
• The Dimension Attributes are the various columns
in a dimension table.
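The relationship between a fact table and a dimension table can be sketched as below. The table names (`fact_sales`, `dim_product`), the column names, and the numbers are all made up for illustration; the point is only that facts carry measurements plus a foreign key, and dimension attributes are reached through that key.

```python
# Dimension table: one row per product, keyed by its primary key (assumed data).
dim_product = {
    1: {"name": "Pen", "category": "Stationery"},
    2: {"name": "Mouse", "category": "Electronics"},
}

# Fact table: measurements plus a foreign key (product_id) to the dimension.
fact_sales = [
    {"product_id": 1, "units_sold": 10, "revenue": 50.0},
    {"product_id": 2, "units_sold": 3, "revenue": 900.0},
    {"product_id": 1, "units_sold": 5, "revenue": 25.0},
]

# Join each fact row to its dimension row via the foreign key,
# then total the revenue per category (a dimension attribute).
revenue_by_category = {}
for fact in fact_sales:
    category = dim_product[fact["product_id"]]["category"]
    revenue_by_category[category] = (
        revenue_by_category.get(category, 0.0) + fact["revenue"]
    )

print(revenue_by_category)
```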
(This is absolute bull shit, I am leaving this topic for brighter minds so they can
enlighten people)
Metadata
Types of Metadata:
OLAP Operations:
1. Roll-up:
Roll-up performs aggregation on a data cube in either of the
following ways −
• By climbing up a concept hierarchy for a dimension
• By dimension reduction
• Roll-up is performed by climbing up the concept hierarchy
for the dimension location.
• Initially the concept hierarchy was "street < city <
province < country".
• On rolling up, the data is aggregated by ascending the
location hierarchy from the level of city to the level of
country.
• The data is now grouped into countries rather than cities.
• When roll-up is performed by dimension reduction, one or
more dimensions are removed from the data cube.
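Both ways of rolling up can be sketched on a tiny cube held in a dictionary. The cities, the city-to-country mapping, and the sales figures are assumed sample data, and the function names are hypothetical.

```python
from collections import defaultdict

# Assumed fragment of the location hierarchy: city -> country.
city_to_country = {
    "Toronto": "Canada",
    "Vancouver": "Canada",
    "Chicago": "USA",
    "New York": "USA",
}

# Cube cells keyed by (city, quarter); the values are a sales measure.
sales_by_city = {
    ("Toronto", "Q1"): 100,
    ("Vancouver", "Q1"): 50,
    ("Chicago", "Q1"): 70,
    ("New York", "Q1"): 30,
    ("Toronto", "Q2"): 20,
}

def roll_up_location(cells):
    """Roll up by climbing the location hierarchy from city to country."""
    result = defaultdict(int)
    for (city, quarter), sales in cells.items():
        result[(city_to_country[city], quarter)] += sales
    return dict(result)

def roll_up_drop_quarter(cells):
    """Roll up by dimension reduction: remove the quarter dimension."""
    result = defaultdict(int)
    for (location, _quarter), sales in cells.items():
        result[location] += sales
    return dict(result)

sales_by_country = roll_up_location(sales_by_city)
sales_by_city_only = roll_up_drop_quarter(sales_by_city)
print(sales_by_country)
print(sales_by_city_only)
```

In the first result the cells are grouped by country instead of city; in the second the quarter dimension is gone entirely.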
2. Drill-down:
3. Slice:
The slice operation performs a selection on one dimension of a
given cube and provides a new sub-cube.
4. Dice:
Dice performs a selection on two or more dimensions of a given
cube and provides a new sub-cube.
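Slice and dice can be sketched on a small three-dimensional cube. The dimension names (city, quarter, item), the values, and the helper names `slice_cube` and `dice_cube` are assumptions made for this example only.

```python
# Cube cells keyed by (city, quarter, item); values are an assumed sales measure.
cube = {
    ("Toronto", "Q1", "Mobile"): 10,
    ("Toronto", "Q2", "Modem"): 20,
    ("Vancouver", "Q1", "Mobile"): 30,
    ("Chicago", "Q1", "Phone"): 40,
    ("Vancouver", "Q3", "Modem"): 50,
}

def slice_cube(cells, quarter):
    """Slice: fix one dimension (quarter) to a single value and drop it."""
    return {
        (city, item): value
        for (city, q, item), value in cells.items()
        if q == quarter
    }

def dice_cube(cells, cities, quarters, items):
    """Dice: keep cells that match the chosen values on all three dimensions."""
    return {
        key: value
        for key, value in cells.items()
        if key[0] in cities and key[1] in quarters and key[2] in items
    }

q1_slice = slice_cube(cube, "Q1")
diced = dice_cube(cube, {"Toronto", "Vancouver"}, {"Q1", "Q2"}, {"Mobile", "Modem"})
print(q1_slice)
print(diced)
```

The slice has one dimension fewer than the original cube, while the dice keeps all three dimensions but with a smaller range of values on each.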
5. Pivot:
The pivot operation is also known as rotation.
It rotates the data axes in view in order to provide an
alternative presentation of data.
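A rotation of a two-dimensional view can be sketched as swapping its row and column axes. The sample view and the function name `pivot` are assumptions for illustration.

```python
# A 2-D view of the cube: rows are items, columns are cities (assumed data).
view = {
    "Mobile": {"Toronto": 10, "Vancouver": 30},
    "Modem": {"Toronto": 20, "Vancouver": 50},
}

def pivot(table):
    """Rotate the axes: rows become columns and columns become rows."""
    rotated = {}
    for row_label, row in table.items():
        for col_label, value in row.items():
            rotated.setdefault(col_label, {})[row_label] = value
    return rotated

rotated_view = pivot(view)
print(rotated_view)
```

The values themselves do not change; only the presentation does, so pivoting twice gives back the original view.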
DATA PREPROCESSING:
3. Data transformation:
4. Data reduction:
(I failed to find any mining methods for frequent itemsets… so I guess it’s
time for brighter minds to shine, only if they want to… either way I am leaving this
one, I am already tired)
Where,
rxy = Pearson r correlation coefficient between x and y
n = number of observations
xi = value of x (for the ith observation)
yi = value of y (for the ith observation)
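The formula these symbols belong to is not shown in the notes. A commonly used form that needs only n, xi and yi is the sums-based one: rxy = (n Σxiyi − Σxi Σyi) / ( sqrt(n Σxi² − (Σxi)²) × sqrt(n Σyi² − (Σyi)²) ). Assuming that is the form intended here, a minimal sketch of the calculation looks like this (the function name `pearson_r` and the sample values are made up):

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed with the sums-based (computational) form."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (
        math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when x or y has zero variance")
    return numerator / denominator

# y is exactly 2 * x here, so the two variables are perfectly correlated.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(r, 6))
```

A value of r near +1 means a strong positive linear relationship, near -1 a strong negative one, and near 0 little or no linear relationship.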
------------------THE END------------------
This marks the end of the Unit 1 & 2 notes for this subject… Hope it helped.
To those who just DM me demanding notes: either help me make them
quicker or don’t demand, I’ll do what I have to.