
DATA WAREHOUSING

AND
DATA MINING
PRESENTED BY

N.ASHOK
S.R.ARUN KUMAR
THIRU RAMAKRISHNA NALLAMMAI POLYTECHNIC COLLEGE, DHARAPURAM.
DATA WAREHOUSING
 Data warehousing is defined as a method of
collecting information from various sources and
storing it under a unique model at a single site
and a process of centralized data management
and retrieval.
 Data warehousing represents an ideal vision of
maintaining a central storage area of all
organizational data
 Centralization of data is needed to maximize
user access and analysis.
CHARACTERISTICS

Subject oriented
Integrated
Time variant
Non-volatile
FUNCTIONS

[Diagram: data sources pass through cleaning and transformation into the data warehouse; new updates also feed the data warehouse]


FUNCTIONS
 Gathering data from various sources
 Integrating those data by converting them
to a common integrated schema
 Transforming and Cleaning the data
which improves consistency of data
 Updating the Data
 Summarizing data
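The functions listed above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the two sources, their field names, and the sample values are hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Two hypothetical sources that describe the same kind of sales record
# using different field names and formats (illustrative data only).
source_a = [
    {"cust": "Asha", "amount_inr": "1200", "city": "Erode "},
    {"cust": "Ravi", "amount_inr": "800", "city": "erode"},
]
source_b = [
    {"customer_name": "Mani", "total": 450.0, "location": "Dharapuram"},
    {"customer_name": "Devi", "total": None, "location": "Dharapuram"},
]


def from_a(row):
    """Convert a source A row to the common integrated schema."""
    return {"customer": row["cust"], "amount": float(row["amount_inr"]), "city": row["city"]}


def from_b(row):
    """Convert a source B row to the common integrated schema."""
    return {"customer": row["customer_name"], "amount": row["total"], "city": row["location"]}


def clean(rows):
    """Drop rows with a missing amount and make city names consistent."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue
        cleaned.append({**row, "city": row["city"].strip().title()})
    return cleaned


def summarize(rows):
    """Total the amount per city."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["city"]] += row["amount"]
    return dict(totals)


# Gather and integrate, then clean, then summarize.
integrated = [from_a(r) for r in source_a] + [from_b(r) for r in source_b]
warehouse_rows = clean(integrated)
city_totals = summarize(warehouse_rows)
print(city_totals)
```

Converting every source to one schema before cleaning keeps the cleaning and summarizing code independent of where a row came from.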
LIFE CYCLE

1. Design → 2. Prototype → 3. Deploy → 4. Operate → 5. Enhance → (back to 1. Design)
BUILDING A
DATA WAREHOUSE
 Extract data from multiple sources
 Format the data for consistency within the
warehouse
 Cleaning the data – Ensures validity.
 Converting the schema of data to a
common integrated schema
 Loading the cleaned data into the
warehouse
 Back flushing – Returning the cleaned data
to the source systems
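As a rough sketch of the extract, format, clean, and load steps above, the following Python example uses the standard `sqlite3` module with an in-memory database standing in for the warehouse. The table name `sales`, its columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime

# Hypothetical extracted rows: (date text, item, quantity text).
raw_rows = [
    ("2024-01-05", "Pen ", "10"),
    ("05/01/2024", "book", "3"),
    ("2024-01-06", "pen", "-4"),  # invalid quantity, removed during cleaning
]


def format_date(text):
    """Normalize the two assumed date layouts to ISO format (YYYY-MM-DD)."""
    for pattern in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, pattern).date().isoformat()
        except ValueError:
            continue
    return None


def is_valid(row):
    """A row is valid when its date was recognized and its quantity is positive."""
    sale_date, _item, quantity = row
    return sale_date is not None and quantity > 0


# Format for consistency, then clean to ensure validity.
formatted = [(format_date(d), item.strip().lower(), int(q)) for d, item, q in raw_rows]
cleaned = [row for row in formatted if is_valid(row)]

# Load the cleaned rows into the (in-memory) warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, item TEXT, quantity INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", cleaned)
conn.commit()

loaded = conn.execute(
    "SELECT sale_date, item, quantity FROM sales ORDER BY sale_date, item"
).fetchall()
print(loaded)
```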
OLAP
Online Analytical Processing:
 Analyze data in a multidimensional
format.
 Transforms the data warehouse data into
specific meaningful information.
 Provides User friendly environment for
interactive data analysis
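To make the multidimensional view concrete, here is a small Python sketch of two common OLAP operations, roll-up and slice, over a list of fact rows. The dimensions (year, region, product) and the sales figures are hypothetical; a real OLAP server would do this over the warehouse rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales).
facts = [
    (2023, "South", "pen", 100),
    (2023, "South", "book", 250),
    (2023, "North", "pen", 80),
    (2024, "South", "pen", 120),
    (2024, "North", "book", 300),
]
DIMENSIONS = ("year", "region", "product")
MEASURE_INDEX = 3  # position of the sales figure in each fact row


def roll_up(rows, keep):
    """Aggregate sales over every dimension not named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[MEASURE_INDEX]
    return dict(totals)


def slice_cube(rows, dimension, value):
    """Keep only the rows where one dimension has a fixed value."""
    position = DIMENSIONS.index(dimension)
    return [row for row in rows if row[position] == value]


by_year_region = roll_up(facts, ("year", "region"))
pens_by_year = roll_up(slice_cube(facts, "product", "pen"), ("year",))
print(by_year_region)
print(pens_by_year)
```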
OLAP
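 Structure of OLAP

[Diagram: USER ↔ Front End Tool ↔ OLAP SERVER ↔ DATA WAREHOUSE. The front end tool sends a request to the OLAP server and receives a result set; the OLAP server sends SQL to the data warehouse and receives a result.]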
DATA MINING
 Data mining is the process of extracting
relevant information or finding hidden
knowledge (new information) from large
databases.
 It has been described as "the science of
extracting useful information from large
data sets or databases."
DATA MINING
 Data mining software is one of a number of
analytical tools for analyzing data. It allows users
to analyze data from many different dimensions
or angles, categorize it, and summarize the
relationships identified.
 Technically, data mining is the process of finding
correlations or patterns among dozens of fields
in large relational databases.
DATA MINING

[Diagram, bottom to top: Database 1, Database 2, ..., Database n → Cleaning & Integration → Data Warehouse → Selection and Transformation → Data Mining → Pattern Evaluation and Knowledge Presentation → User]


DATA MINING
Various steps involved in data mining
 Data cleaning – Removes noise and inconsistent data.
 Data integration – Data from various sources is
combined.
 Data selection – Only the data relevant to the analysis
are retrieved from the database.
 Data transformation – The selected data are converted
into forms suitable for mining, for example by summary
or aggregation operations.
 Data mining – Methods are applied to extract patterns
from the data.
 Pattern evaluation – Identifies the patterns of interest
that represent knowledge.
 Knowledge presentation – Presents the mined
knowledge to the user using knowledge representation
techniques.
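The steps above can be sketched as a chain of small Python functions. This is a toy illustration only: the two databases, the age grouping, and the threshold of 500 are all hypothetical, and the "mining" step here is just an average per group rather than a real mining algorithm.

```python
from collections import defaultdict

# Hypothetical customer records from two databases.
db1 = [{"id": 1, "age": 23, "spend": 400.0}, {"id": 2, "age": None, "spend": 150.0}]
db2 = [
    {"id": 3, "age": 41, "spend": 900.0},
    {"id": 4, "age": 45, "spend": 700.0},
    {"id": 5, "age": 27, "spend": 200.0},
]


def clean(rows):
    """Data cleaning: drop records with missing values."""
    return [r for r in rows if r["age"] is not None and r["spend"] is not None]


def integrate(*sources):
    """Data integration: combine records from several sources."""
    return [row for source in sources for row in source]


def select(rows):
    """Data selection: keep only the fields relevant to the analysis."""
    return [(r["age"], r["spend"]) for r in rows]


def transform(pairs):
    """Data transformation: replace the exact age with an age group."""
    return [("under_30" if age < 30 else "30_plus", spend) for age, spend in pairs]


def mine(pairs):
    """Data mining (toy version): average spend per age group."""
    sums, counts = defaultdict(float), defaultdict(int)
    for group, spend in pairs:
        sums[group] += spend
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in sums}


def evaluate(patterns, min_average):
    """Pattern evaluation: keep groups whose average meets the threshold."""
    return {g: avg for g, avg in patterns.items() if avg >= min_average}


def present(patterns):
    """Knowledge presentation: format the result for the user."""
    return [f"{g}: average spend {avg:.2f}" for g, avg in sorted(patterns.items())]


patterns = mine(transform(select(integrate(clean(db1), clean(db2)))))
interesting = evaluate(patterns, min_average=500.0)
report = present(interesting)
print(report)
```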
RELATIONSHIP TO DATA
WAREHOUSING
 Classification – Process of finding a set of
models that describe and distinguish data
classes.
 Clustering – Data items are grouped according
to logical relationships or consumer preferences.
 Sequential patterns – Data is mined to
predict behavior patterns and trends.
 Association – Identifies whether the database
tuples under consideration are associated with each other.
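For the association task, one common way to quantify whether items are associated is with support and confidence. The slides do not name these measures, so treat the sketch below as one possible illustration; the shopping baskets are hypothetical.

```python
# Hypothetical market baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    both = support(set(antecedent) | set(consequent), baskets)
    return both / support(antecedent, baskets)


print(support({"bread", "milk"}, transactions))
print(confidence({"bread"}, {"milk"}, transactions))
print(confidence({"butter"}, {"bread"}, transactions))
```

A rule such as "butter implies bread" is usually reported only when both its support and its confidence exceed thresholds chosen by the analyst.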
APPLICATIONS
MARKETING
 Helps to analyze consumer behavior
based on buying patterns
 Helps to determine marketing techniques
such as advertising, store location, and design
of catalogs
APPLICATIONS
FINANCE
 Helps to analyze the creditworthiness of clients
 Performance analysis of financial
investments such as stocks, bonds, and mutual
funds
 Fraud detection
MANUFACTURING
 Helps to optimize machines, manpower,
materials, the design of the manufacturing
process, and product design.
CONCLUSION
 Data warehousing: In the absence of a data
warehousing architecture, an enormous amount
of redundant information was required to
support the multiple decision support
environments that usually existed.
 Data mining: Many marketers use data mining to
find strong consumer patterns and relationships.
Large organizations and educational institutions
also use data mining to find significant correlations
that can benefit society.
