Data Warehouse Power Point Presentation


Article: Article Title, Author(s), Date published, …

Course: Foundations of Information Science and Systems (InSS 601)

Reviewed By: xxx

Submitted to: xxx

Month Date, Year


Overview of Data Warehouse

 A data warehouse is a collection of data that supports the decision-making process.
It is subject-oriented, integrated, consistent, and non-volatile, and it shows how data
evolves over time.

 A data warehouse is a place where data is stored for archival, security, and
analysis purposes.

 The data in a warehouse is never updated; it is used only to answer queries from
end users, who are usually decision makers.

A data warehouse has four fundamental characteristics:
 Subject-Oriented: In a data warehouse, data is organized by subject.


 Integration: A data warehouse is expected to be fully integrated, because it stores
data drawn from many different sources.

 Time-Variant: The time dimension is very important, since a data warehouse contains
historical data that supports forecasting and decision making.

 Non-Volatile: End users cannot update or change data once it has been loaded into
the warehouse.
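
The non-volatile and time-variant characteristics can be illustrated with a minimal Python
sketch. The names Row and WarehouseTable are hypothetical and chosen only for illustration;
a real warehouse would enforce these rules in the database rather than in application code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a loaded row cannot be modified afterwards
class Row:
    subject: str   # subject area, e.g. "sales"
    key: str
    value: float
    as_of: date    # time dimension: the date this value refers to


class WarehouseTable:
    """Append-only store: rows are loaded and queried, never updated."""

    def __init__(self):
        self._rows = []

    def load(self, row):
        self._rows.append(row)

    def history(self, subject, key):
        """Return every recorded value for a key, oldest first."""
        matches = [r for r in self._rows if r.subject == subject and r.key == key]
        return sorted(matches, key=lambda r: r.as_of)


table = WarehouseTable()
table.load(Row("sales", "store-1", 100.0, date(2023, 1, 31)))
table.load(Row("sales", "store-1", 120.0, date(2023, 2, 28)))
values = [r.value for r in table.history("sales", "store-1")]
```

Each new measurement is appended as a new row with its own date, so the earlier value stays
available for trend analysis instead of being overwritten.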
Notable Points from the Literature Review
 A data warehouse is typically built on a relational database management system (RDBMS),
but it is designed for query and analysis rather than for transaction processing.

 Once the relevant data is collected and stored in the data warehouse, the potential of the
data warehouse can be enhanced.
 W.H. Inmon describes the evolution of decision support systems and the data warehouse
environment. He also covers data warehouse design, advanced topics related to data
warehousing, and its future.

 Rajan Tiwari gives detailed information about the Data Warehouse, where he describes its
meaning, characteristics, architecture and all the related concepts for building a data
warehouse.
 Data mining is a process that business organizations use to convert raw data into
useful information.

 Matteo Golfarelli and Stefano Rizzi describe the lifecycle of data warehouse systems
and suggest a methodological approach for designing them. They also focus on how data is
extracted from sources, transformed, cleansed, and then used to populate the data
warehouse (DW).
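
The extract, transform, cleanse, and load sequence described above can be sketched in a few
lines of Python. This is only an illustration under simple assumptions: the function names,
the field names (customer, amount), and the two in-memory sources are hypothetical and do
not come from the cited authors.

```python
def extract(sources):
    """Collect raw records from every source into one list."""
    return [record for source in sources for record in source]


def transform(records):
    """Cleanse and standardize: drop incomplete rows, unify formats."""
    cleaned = []
    for rec in records:
        if rec.get("customer") is None or rec.get("amount") is None:
            continue  # cleansing: skip records with missing fields
        cleaned.append({
            "customer": rec["customer"].strip().title(),
            "amount": round(float(rec["amount"]), 2),
        })
    return cleaned


def load(warehouse, records):
    """Append the transformed records to the warehouse table."""
    warehouse.extend(records)
    return warehouse


# Two hypothetical source systems with inconsistent formats.
crm = [{"customer": " alice ", "amount": "10.5"}]
billing = [{"customer": "BOB", "amount": 20}, {"customer": None, "amount": 5}]

warehouse = load([], transform(extract([crm, billing])))
```

After the run, the warehouse list holds two records in one unified format, and the
incomplete billing record has been discarded during cleansing.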
NEED OF DATA WAREHOUSE

 An ordinary database can store megabytes to gigabytes of data and is used for
specific purposes.

 To store larger volumes, such as terabytes of data, storage must be moved to a data
warehouse.

 A transactional database does not lend itself to analytics. To conduct effective
analysis, a company needs to maintain a central data warehouse, so that it can
understand its business by organizing, studying, and using its historical data to
make strategic decisions and analyze upcoming trends.
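
As a rough illustration of the difference, the sketch below runs an analytical query over
historical rows using Python's built-in sqlite3 module. The table name sales, its columns,
and the figures are invented for the example; SQLite stands in here for a real warehouse
engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (2021, "north", 100.0),
        (2021, "south", 80.0),
        (2022, "north", 130.0),
        (2022, "south", 90.0),
    ],
)

# Analytical query: scan the whole history and aggregate it per year
# to expose a trend, instead of reading or updating a single row.
trend = conn.execute(
    "SELECT year, SUM(revenue) FROM sales GROUP BY year ORDER BY year"
).fetchall()
conn.close()
```

A transactional workload would typically touch one row at a time, whereas this query reads
every historical row to produce one summary figure per year.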
ADVANTAGES OF DATA WAREHOUSE

 Repository of historical information for comparative and competitive analysis.

 Ability to enhance data quality and completeness.

 Real-time consolidation of financial data becomes practical.

 The IT costs and staff dedicated to reporting are greatly reduced.

 Allow business process redesign and align with business strategy.


 Give end users freedom to carry out wide-ranging analysis in various manners.

 Simplify the process of data access.

 Identify market trends.

 Reduce operation costs.


DISADVANTAGES OF DATA WAREHOUSE

 The initial cost of building a warehouse is huge.

 Departments may be unwilling to share their data in a central repository, which
raises security and ownership issues in certain sectors.

 It may be unable to capture the amount of data that a particular organization
requires for evaluation.

 Flexibility is low: it is difficult for the company to meet certain requirements,
since wage and salary costs are also rising.
Data Warehouse Approach

A data warehouse can be built with a top-down approach, a bottom-up approach, or a
combination of both.

 TOP-DOWN APPROACH
It begins with planning. It is useful in cases where the technology is mature and well
understood.

 BOTTOM-UP APPROACH
It starts with experiments and prototypes.

 COMBINED APPROACH
An organization can exploit the planned and strategic nature of the top-down approach
while retaining the rapid implementation and opportunistic application of the bottom-up
approach.
DATA WAREHOUSE ARCHITECTURE

 A data warehouse works like a central repository. The information comes from one
or more data sources, such as a transactional system or other relational databases.

 The data can be structured, semi-structured or unstructured.

 Data warehouse architecture is a method of defining the computing and presentation
architectures for end-user computing in an enterprise.

 The data warehouse and its architecture largely depend on the elements of the
business situation.
The following architecture properties are essential for a data warehouse system
(Kelly, 1997):
 Separation: Analytical and transaction processing should be kept separate.

 Scalability: Hardware and software architectures should be easy to upgrade as the
data volume grows.

 Extensibility: The architecture should be able to accommodate new processes and
technologies without a redesign of the entire system.

 Security: Access monitoring is necessary because of the strategic data stored in
the data warehouse.

 Administrability: Data warehouse management should not be difficult.


TYPES OF DATA WAREHOUSE ARCHITECTURE

Data warehouse architectures are mainly of three types:

1. Single-Tier Architecture: It is not often used in practice. Its goal is to minimize
the amount of stored data; to achieve this, it eliminates data redundancy. The weakness
of this architecture is that it cannot meet the requirement for separation between
analytical and transaction processing.

2. Two-Tier Architecture: It highlights the separation between the physically available
sources and the data warehouse. It consists of four sequential data flow stages: Source
Layer, Data Staging, Data Warehouse Layer, and Analysis.

3. Three-Tier Architecture: It consists of the source layer (containing multiple source
systems), the reconciled layer, and the data warehouse layer (containing both the data
warehouse and data marts).
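
The flow through the three tiers can be sketched as follows. This is a simplified
illustration, not a reference design: the function names reconcile and build_data_marts,
the record fields, and the rule that records sharing an id describe the same entity are all
assumptions made for the example.

```python
def reconcile(source_systems):
    """Reconciled layer: merge sources into one consistent record set."""
    merged = {}
    for system in source_systems:
        for rec in system:
            # Assumption: records with the same id describe the same entity,
            # and later sources fill in or override fields from earlier ones.
            merged.setdefault(rec["id"], {}).update(rec)
    return list(merged.values())


def build_data_marts(warehouse, dimension):
    """Warehouse layer: derive one data mart per value of a dimension."""
    marts = {}
    for rec in warehouse:
        marts.setdefault(rec[dimension], []).append(rec)
    return marts


# Source layer: two hypothetical operational systems with overlapping ids.
orders = [{"id": 1, "dept": "retail", "total": 50},
          {"id": 2, "dept": "online", "total": 70}]
crm = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]

warehouse = reconcile([orders, crm])
marts = build_data_marts(warehouse, "dept")
```

Here each department receives its own data mart, and every mart is derived from the same
reconciled data, so the marts stay consistent with one another.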
COMPONENTS of DATA WAREHOUSE

A typical data warehouse consists of five main components: a central data warehouse
database, ETL tools, metadata, access/query tools, and data marts.

 Data Warehouse Database: The central data warehouse database is the cornerstone of
the data warehousing environment. This database is almost always implemented on
relational database management system (RDBMS) technology.

 Sourcing, Acquisition, Clean-up & Transformation Tools (ETL): These tools perform all
the conversions, summarizations, and other changes needed to transform data into a
unified format in the data warehouse.

 Metadata: Metadata is data about the data that defines the data warehouse. It is used
for creating, managing, and maintaining the data warehouse.
 Query Tools (Access Tools): One of the main tasks of data warehousing is to provide
the company with the information needed to make strategic decisions. Users interact
with the data warehouse system through front-end query tools.

 Access tools include query and reporting tools, online analytical processing (OLAP)
tools, and data mining tools. Query and reporting tools can be divided into two
categories: reporting tools and managed query tools. Reporting tools are further
divided into production reporting tools and report writers.
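
To give a feel for what an OLAP tool does, the sketch below performs a simple roll-up in
plain Python: the same facts are summarized at successively coarser levels of detail. The
fact records and the aggregate helper are hypothetical; real OLAP tools run this kind of
operation against the warehouse itself.

```python
from collections import defaultdict

# Hypothetical fact records: (region, product, units sold).
facts = [
    ("north", "laptop", 3),
    ("north", "phone", 5),
    ("south", "laptop", 2),
    ("south", "phone", 4),
]


def aggregate(records, *dims):
    """Sum units grouped by the chosen positions (0 = region, 1 = product)."""
    totals = defaultdict(int)
    for rec in records:
        totals[tuple(rec[d] for d in dims)] += rec[2]
    return dict(totals)


by_region_product = aggregate(facts, 0, 1)  # detailed view
by_region = aggregate(facts, 0)             # roll-up: product dimension removed
grand_total = aggregate(facts)              # roll-up over every dimension
```

Moving from by_region_product to by_region is a roll-up; reading the results in the
opposite direction corresponds to a drill-down.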
FUTURE OF DATA WAREHOUSE

 The future of the data warehouse is going to be cloud-based. To gain good performance,
security, flexibility, and ease of use, organizations are shifting to cloud data
warehousing.

 Many companies have used hybrid, public, and private clouds to start their journey
from data to decision making.

 A few of the cloud data warehouses are Snowflake, BigQuery, Redshift, and Azure SQL
Data Warehouse. With the advantages of hybrid and cloud platforms, the next-generation
data warehouse tends to be more beneficial in three main aspects: storage, computing
infrastructure, and services.
 The data warehouse also assists AI and machine learning to obtain results.

 The next-generation data warehouse market would help build the infrastructure that
supports Big Data.

 As a system, a concept, and a method of providing information about customers,
markets, and companies, the data warehouse will not disappear anytime soon.

 Data warehouses are becoming a progressively important part of the digital world.
