
DATA WAREHOUSE & DATA MANAGEMENT
SEMESTER - 6
UNIT - 1

HICOLLEGE.IN
INTRODUCTION TO DATA WAREHOUSING
Imagine a warehouse, but instead of storing boxes, we're stockpiling
information – a massive collection of data specifically designed to help us
understand our business better.
Here's a key difference between a data warehouse and a regular database:
Regular Database: Designed for daily transactions, like recording a sale or
updating customer details. Think of it as a cash register, constantly
processing new information.
Data Warehouse: Focused on analysis, storing historical data for extended
periods. It's like a giant filing cabinet, meticulously organized for easy
retrieval and analysis.
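The contrast can be sketched in a few lines of Python using the standard sqlite3 module. This is only an illustration: the sales table, its columns, and the helper names record_sale and yearly_totals are assumptions made for the example, and a real warehouse would normally be a separate system from the transactional database.

```python
import sqlite3

# One in-memory database stands in for both systems in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "sale_id INTEGER PRIMARY KEY, sale_date TEXT, product TEXT, amount REAL)"
)


def record_sale(sale_date, product, amount):
    """Transaction-style work (regular database): write one sale at a time."""
    conn.execute(
        "INSERT INTO sales (sale_date, product, amount) VALUES (?, ?, ?)",
        (sale_date, product, amount),
    )
    conn.commit()


def yearly_totals():
    """Analysis-style work (data warehouse): summarise history by year."""
    query = (
        "SELECT substr(sale_date, 1, 4) AS year, SUM(amount) "
        "FROM sales GROUP BY year ORDER BY year"
    )
    return conn.execute(query).fetchall()


record_sale("2023-01-15", "notebook", 40.0)
record_sale("2023-02-10", "pen", 10.0)
record_sale("2024-01-20", "notebook", 60.0)

print(yearly_totals())
```

The cash register corresponds to record_sale (many small writes), while the filing cabinet corresponds to yearly_totals (a read over the whole history).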
Benefits of Data Warehousing
Here are some key advantages of using a data warehouse:
Improved Decision Making: Consolidated data provides a unified view,
enabling better-informed choices.
Enhanced Data Analysis: Easier to analyze large datasets and identify trends
and patterns.
Reduced Costs: Saves time and resources by eliminating the need to access
and combine data from multiple sources.
Increased Efficiency: Faster analysis leads to quicker responses and
improved operational efficiency.

HiCollege Click Here For More Notes 03


DIFFERENCE BETWEEN DATABASE SYSTEM
AND DATA WAREHOUSE



COMPELLING NEED FOR DATA WAREHOUSING
Suppose you are the lead project manager of HiCollege.in, and this morning you
must set the company's direction for its next moves. The company's data,
however, is scattered across systems, like missing pieces of a puzzle. To give
the right direction, you need a single place where all of the company's data
can be brought together and related. This is why data warehouses are important:

Centralized Repository: It brings data together from disparate sources into a
single, unified platform.
Consistent Data: The data goes through a transformation process, ensuring
consistency in format and definitions.
Historical Analysis: Data warehouses store historical data, allowing you to track
trends and identify patterns over time.
Faster Insights: Easy access to clean, integrated data enables faster analysis and
quicker decision-making.

DATA WAREHOUSE BUILDING BLOCKS


1. Source Data:
Imagine this as the raw materials for your warehouse. It comes from various
operational systems within your organization:
Sales data (transactions, customer purchases)
Customer data (names, demographics, purchase history)
Marketing data (campaign performance, website clicks)
Financial data (transactions, budgets)
Even external data (social media sentiment, market research)
2. Data Staging Area:
This is a temporary storage space where raw data is prepared before
entering the warehouse. Think of it as a cleaning and sorting station for your
information. Here's what happens:
Data Extraction: Data is pulled from various sources.
Data Cleaning: Errors and inconsistencies are identified and fixed.
Data Transformation: Data is formatted consistently for seamless
integration.
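The three staging steps can be sketched as a small Python pipeline. The two source lists, their field names, and the date formats are hypothetical; they only stand in for operational systems that follow different conventions.

```python
from datetime import datetime

# Hypothetical raw rows from two source systems with different conventions.
SALES_SYSTEM = [
    {"customer": " Asha ", "date": "2024-03-01", "amount": "120.50"},
    {"customer": "Ravi", "date": "2024-03-02", "amount": "n/a"},  # bad amount
]
MARKETING_SYSTEM = [
    {"customer": "MEERA", "date": "05/03/2024", "amount": "75"},
]


def extract():
    """Extraction: pull rows from every source and tag their origin."""
    rows = [dict(r, source="sales") for r in SALES_SYSTEM]
    rows += [dict(r, source="marketing") for r in MARKETING_SYSTEM]
    return rows


def clean(rows):
    """Cleaning: drop rows with a non-numeric amount, trim whitespace."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real pipeline might log or quarantine the row instead
        cleaned.append({**row, "customer": row["customer"].strip(), "amount": amount})
    return cleaned


def parse_date(text):
    """Accept either assumed source format and return an ISO date string."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            pass
    raise ValueError(f"unrecognised date: {text}")


def transform(rows):
    """Transformation: one name style and one date format for all rows."""
    return [
        {**row, "customer": row["customer"].title(), "date": parse_date(row["date"])}
        for row in rows
    ]


staged = transform(clean(extract()))
for row in staged:
    print(row)
```

Here the row with the unusable amount is simply dropped; whether to drop, repair, or flag such rows is a design choice that depends on the business.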



3. Data Storage:
This is the heart of your warehouse, where the transformed data resides.
Here, organization is key! Data can be stored in various ways depending on
your needs:
Relational Databases: Structured data organized in tables with rows and
columns (like customer names and purchase history).
Data Marts: Subset of the data warehouse focused on a specific
department or business function (like a mini-warehouse for the
marketing team).
Data Cubes: Multidimensional view of data allowing for faster analysis of
complex relationships (think of it as a pivot table on steroids).
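To make the last two storage options concrete, here is a minimal Python sketch. The FACTS rows and the functions data_mart and build_cube are illustrative assumptions, not part of any particular product: a mart is shown as a filtered subset, and a cube as totals pre-computed for every combination of the chosen dimensions.

```python
from collections import defaultdict

# Hypothetical warehouse rows: three descriptive fields and one measure.
FACTS = [
    {"department": "marketing", "region": "north", "quarter": "Q1", "amount": 100},
    {"department": "marketing", "region": "south", "quarter": "Q1", "amount": 50},
    {"department": "sales", "region": "north", "quarter": "Q1", "amount": 200},
    {"department": "marketing", "region": "north", "quarter": "Q2", "amount": 70},
]


def data_mart(facts, department):
    """A data mart: the subset of warehouse rows for one department."""
    return [f for f in facts if f["department"] == department]


def build_cube(facts, dimensions):
    """A data cube: amounts summed per combination of dimension values."""
    cube = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        cube[key] += fact["amount"]
    return dict(cube)


marketing_mart = data_mart(FACTS, "marketing")
cube = build_cube(marketing_mart, ("region", "quarter"))
print(cube)
```

Because the totals are computed once and stored, a question such as "marketing spend in the north in Q1" becomes a single dictionary lookup.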
4. Data Transformation Layer (ETL or ELT):
The magic happens here! This layer is the transformer, the architect of your
data. There are two main approaches:
Extract-Transform-Load (ETL): Data is transformed in the staging area
before being loaded into the warehouse.
Extract-Load-Transform (ELT): Data is loaded into the warehouse first,
then transformed within the warehouse itself.
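The difference between the two approaches is mainly where the transformation runs. The sketch below uses Python's sqlite3 module as a stand-in warehouse; the RAW rows and the table names sales_etl, raw_sales, and sales_elt are assumptions made for the example.

```python
import sqlite3

# Hypothetical raw rows: inconsistent name casing, amounts stored as text.
RAW = [("asha", "120.50"), ("RAVI", "80")]


def etl(conn):
    """ETL: transform in Python (the staging area), then load finished rows."""
    transformed = [(name.title(), float(amount)) for name, amount in RAW]
    conn.execute("CREATE TABLE sales_etl (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", transformed)


def elt(conn):
    """ELT: load the raw rows as-is, then transform with SQL in the warehouse."""
    conn.execute("CREATE TABLE raw_sales (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", RAW)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT upper(substr(customer, 1, 1)) || lower(substr(customer, 2)) AS customer, "
        "CAST(amount AS REAL) AS amount "
        "FROM raw_sales"
    )


conn = sqlite3.connect(":memory:")
etl(conn)
elt(conn)

select = "SELECT customer, amount FROM {} ORDER BY customer"
etl_rows = conn.execute(select.format("sales_etl")).fetchall()
elt_rows = conn.execute(select.format("sales_elt")).fetchall()
print(etl_rows)
print(elt_rows)
```

Both routes end with the same clean table. With ELT the untransformed raw_sales table also stays in the warehouse, which is useful when the transformation rules change later.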
5. Metadata:
This is the data about your data – like a detailed catalog for your warehouse.
It describes the structure, meaning, and origin of the data stored in the
warehouse. Think of it as labels on all your data boxes!
6. Information Delivery & Presentation Layer:
This is the access point for users to interact with the data warehouse. It
provides tools for:
Data Visualization: Creating charts, graphs, and dashboards to see trends
and patterns.
Online Analytical Processing (OLAP): Allows users to drill down, slice and
dice the data from multiple perspectives.
Data Mining: Uncovering hidden patterns and insights from the data.
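The OLAP operations named above (slice, dice, drill down) can be sketched over a small in-memory table. The FACTS rows and the function names slice_, dice, and roll_up are hypothetical; real OLAP tools run these operations against pre-built cubes rather than Python lists.

```python
# Hypothetical fact rows with three dimensions and one measure.
FACTS = [
    {"year": 2023, "region": "north", "product": "pen", "amount": 10},
    {"year": 2023, "region": "south", "product": "pen", "amount": 20},
    {"year": 2024, "region": "north", "product": "book", "amount": 30},
    {"year": 2024, "region": "north", "product": "pen", "amount": 40},
]


def slice_(facts, dimension, value):
    """Slice: fix one dimension to a single value."""
    return [f for f in facts if f[dimension] == value]


def dice(facts, **allowed):
    """Dice: keep rows whose dimensions fall inside the allowed value sets."""
    return [f for f in facts if all(f[d] in vals for d, vals in allowed.items())]


def roll_up(facts, dimensions):
    """Sum amounts at the level of detail given by the listed dimensions."""
    totals = {}
    for f in facts:
        key = tuple(f[d] for d in dimensions)
        totals[key] = totals.get(key, 0) + f["amount"]
    return totals


north_only = slice_(FACTS, "region", "north")
pens_2024 = dice(FACTS, year={2024}, product={"pen"})
by_year = roll_up(FACTS, ("year",))                   # coarse view
by_year_region = roll_up(FACTS, ("year", "region"))   # drill down one level
print(by_year)
print(by_year_region)
```

Drilling down is simply the same summary computed with one more dimension, so each yearly total splits into its regional parts.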
7. Management and Control:
Every warehouse needs a good security system! This layer ensures data
security, user access control, and maintains the overall health of the data
warehouse.
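User access control can be illustrated with a very small role check. The roles, table names, and the functions can_read and read_table are assumptions for this sketch; production warehouses rely on the access-control features of the database itself.

```python
# Hypothetical mapping from role to the tables that role may read.
ROLE_PERMISSIONS = {
    "analyst": {"sales_summary"},
    "finance": {"sales_summary", "budgets"},
}

# Hypothetical warehouse contents, keyed by table name.
WAREHOUSE = {
    "sales_summary": [("2024", 60.0)],
    "budgets": [("marketing", 5000.0)],
}


def can_read(role, table):
    """Return True if the role is allowed to read the table."""
    return table in ROLE_PERMISSIONS.get(role, set())


def read_table(role, table):
    """Return the table's rows, or refuse if the role lacks permission."""
    if not can_read(role, table):
        raise PermissionError(f"{role} may not read {table}")
    return WAREHOUSE[table]


print(read_table("finance", "budgets"))
print(can_read("analyst", "budgets"))
```

An unknown role gets an empty permission set, so access is denied by default.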



THREE-TIER ARCHITECTURE (IMP)
1. Presentation Tier (UI Tier):
Imagine this as the shop window of your application. It's the user interface (UI)
where users interact with the application. This tier can include:
Web interfaces (web pages)
Mobile apps
Desktop applications
The presentation tier takes user input, formats it, and sends it to the
application tier. It also displays the processed information back to the user in a
user-friendly format.

2. Application Tier (Business Logic Tier):
Consider this the engine room of your application. It handles the core
functionalities and business logic. This tier:
Receives user input from the presentation tier.
Validates and processes the data.
Communicates with the data tier to retrieve or store information.
Performs calculations and business logic based on the data.
Prepares the processed data for presentation to the user interface.
The application tier acts as an intermediary between the user interface and
the database, ensuring the data is handled correctly.

3. Data Tier (Database Tier):
Think of this as the basement of your application, where all the data is stored.
This tier can include:
Relational databases (like MySQL, Oracle)
NoSQL databases
Data warehouses
The data tier stores, retrieves, and manages the application's data. It interacts
with the application tier based on its requests.
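The division of responsibilities between the three tiers can be sketched as three small Python classes. The class and method names are assumptions for illustration, and an in-memory list stands in for a real database.

```python
class DataTier:
    """Stores and retrieves records; knows nothing about business rules."""

    def __init__(self):
        self._orders = []

    def save_order(self, order):
        self._orders.append(order)

    def all_orders(self):
        return list(self._orders)


class ApplicationTier:
    """Validates input, applies business logic, and talks to the data tier."""

    def __init__(self, data):
        self._data = data

    def place_order(self, customer, quantity, unit_price):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        total = quantity * unit_price
        self._data.save_order({"customer": customer, "total": total})
        return total

    def revenue(self):
        return sum(order["total"] for order in self._data.all_orders())


class PresentationTier:
    """Takes raw user input, formats it, and displays the result."""

    def __init__(self, app):
        self._app = app

    def submit_order_form(self, form):
        total = self._app.place_order(
            form["customer"].strip(),
            int(form["quantity"]),
            float(form["unit_price"]),
        )
        return f"Order placed. Total: {total:.2f}"


data = DataTier()
app = ApplicationTier(data)
ui = PresentationTier(app)
message = ui.submit_order_form({"customer": " Asha ", "quantity": "3", "unit_price": "2.50"})
print(message)
```

The presentation tier never touches the data tier directly. Every request passes through the application tier, so the storage could be swapped without changing the user interface.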


METADATA IN THE DATA WAREHOUSE
In the world of data warehousing, data is king, but there's another crucial player
often working behind the scenes – metadata. Imagine a massive library; books
are the data, but without a well-organized catalog system (metadata), finding
the information you need would be a nightmare.
What is Metadata?
Metadata is essentially "data about data." It provides details and descriptions
about the data stored in your data warehouse. Here's what metadata typically
includes:
Data Structure: Describes how data is organized within the warehouse
(tables, columns, data types).
Data Meaning: Defines the meaning of each data element (e.g.,
"CustomerID" refers to a unique customer identifier).
Data Origin: Tracks where the data came from (e.g., sales database,
marketing system).
Data Transformation Rules: Documents how the data was transformed
before being loaded into the warehouse (e.g., currency conversion).
Data Retention Policies: Specifies how long the data will be stored in the
warehouse.
Data Access Controls: Defines who can access specific data elements within
the warehouse.
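The kinds of metadata listed above can be pictured as a small catalog record. The sketch below uses Python dataclasses; the class names, field names, and sample values are illustrative assumptions, not a standard metadata schema.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnMetadata:
    name: str                     # data structure: column name
    data_type: str                # data structure: column type
    meaning: str                  # data meaning
    origin: str                   # data origin (source system)
    transformation: str = "none"  # data transformation rule


@dataclass
class TableMetadata:
    name: str
    columns: list
    retention_years: int                              # data retention policy
    allowed_roles: set = field(default_factory=set)   # data access controls

    def describe(self, column_name):
        """Return a one-line description of a column, like a catalog entry."""
        for col in self.columns:
            if col.name == column_name:
                return f"{col.name} ({col.data_type}): {col.meaning}; from {col.origin}"
        raise KeyError(column_name)


sales = TableMetadata(
    name="sales",
    columns=[
        ColumnMetadata("CustomerID", "INTEGER", "unique customer identifier", "sales database"),
        ColumnMetadata("Amount", "REAL", "sale value", "sales database", "currency conversion"),
    ],
    retention_years=7,
    allowed_roles={"analyst", "finance"},
)

print(sales.describe("CustomerID"))
```

A user who finds an unfamiliar column can look it up in the catalog instead of guessing from the raw values.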

