
DATA WAREHOUSING AND DATA MINING

DATA WAREHOUSING

Data warehousing is combining data from
multiple sources into one comprehensive and
easily manipulated database.

The primary aim for data warehousing is to
provide businesses with analytics results from
data mining, OLAP, scorecarding and reporting.
NEED FOR DATA WAREHOUSING
 Information is now considered the key to all
work.
 Those who gather, analyze, understand, and act
upon information are winners.
 Information has no limits, and it is very hard to
collect from various sources, so we need a data
warehouse from which we can get all the
information.
TODAY'S BUSINESS INFORMATION
DATA WAREHOUSING INCLUDES:
 Retrieving data

 Analyzing data

 Extracting data

 Loading data

 Transforming data

 Managing data
DATA WAREHOUSE ARCHITECTURE

 Data warehousing is designed to provide an
architecture that will make corporate data
accessible and useful to users.
 There is no right or wrong architecture.
 The worthiness of an architecture can be
judged by its use and the concept behind it.
 Data Warehouses can be architected in
many different ways, depending on the
specific needs of a business.
TYPICAL DATA WAREHOUSING ENVIRONMENT
 An operational data store (ODS) is basically
a database that is used as a temporary
storage area for a data warehouse.
 Its primary purpose is handling data that is
currently in use.
 An operational data store contains data that is
constantly updated through the course of
business operations.
 ETL (Extract, Transform, Load) is used to
copy data from:
 ODS to data warehouse staging area.
 Data warehouse staging area to data warehouse.
 Data warehouse to data mart.
 ETL extracts data, transforms values of inconsistent
data, cleanses "bad" data, filters data and loads data
into a target database.
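
The extract, transform and load steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the `COUNTRY_CODES` mapping, the `orders` table and all function names are hypothetical, and an in-memory SQLite database stands in for the target database.

```python
import sqlite3

# Hypothetical rows as they might arrive from an operational data store (ODS).
ODS_ROWS = [
    {"order_id": 1, "country": "usa", "amount": "100.50"},
    {"order_id": 2, "country": "U.S.A.", "amount": "20"},
    {"order_id": 3, "country": "India", "amount": "not-a-number"},  # "bad" data
    {"order_id": 4, "country": "india", "amount": "75.25"},
    {"order_id": 5, "country": "India", "amount": "0"},  # filtered out below
]

# Assumed mapping used to make inconsistent country values consistent.
COUNTRY_CODES = {"usa": "US", "u.s.a.": "US", "india": "IN"}


def extract(rows):
    """Extract: copy rows out of the source without changing them."""
    return [dict(row) for row in rows]


def transform(rows):
    """Transform: standardise values, cleanse bad rows, filter unwanted rows."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # cleanse: drop rows whose amount is not numeric
        if amount <= 0:
            continue  # filter: keep only orders with a positive amount
        country = COUNTRY_CODES.get(row["country"].strip().lower())
        if country is None:
            continue  # cleanse: drop rows with an unknown country
        clean.append((row["order_id"], country, amount))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run the three steps against an in-memory target database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(ODS_ROWS)))
    query = "SELECT order_id, country, amount FROM orders ORDER BY order_id"
    return conn.execute(query).fetchall()
```

Calling `run_etl()` returns only the rows that survived cleansing and filtering, with country values made consistent.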
 The Data Warehouse Staging Area is a
temporary location where data from source
systems is copied.
 It increases the speed of the data warehouse
architecture.
 It is essential because the volume of data is
increasing day by day.
 The purpose of the Data Warehouse is to integrate
corporate data.
 The amount of data in the Data Warehouse is massive.
Data is stored at a very deep level of detail.
 This allows data to be grouped in unimaginable ways.
 A Data Warehouse does not contain all the data in the
organization. Its purpose is to provide the base data
needed by the organization for strategic and tactical
decision making.
 ETL extracts data from the Data Warehouse and sends
it to one or more Data Marts for use by users.
 Data marts act as a shortcut to a data
warehouse, to save time.
 A data mart is just a partition of the data present
in the data warehouse.
 Each Data Mart can contain different combinations
of tables, columns and rows from the Enterprise
Data Warehouse.
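
As a rough illustration of a data mart holding a chosen combination of tables, columns and rows, the sketch below builds one from two warehouse tables. Everything here is assumed for the example: the `customers` and `orders` tables, their contents, the `mart_east_sales` name, and the use of an in-memory SQLite database in place of a real warehouse.

```python
import sqlite3


def build_sales_mart():
    """Create a tiny warehouse, derive a data mart from it, and return the mart rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical warehouse tables.
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT, credit_limit REAL
        );
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL
        );
        INSERT INTO customers VALUES
            (1, 'Asha', 'EAST', 5000), (2, 'Ben', 'WEST', 3000), (3, 'Chen', 'EAST', 8000);
        INSERT INTO orders VALUES
            (10, 1, 250.0), (11, 2, 400.0), (12, 3, 125.0), (13, 1, 50.0);

        -- The data mart: a subset of columns (no credit_limit) and
        -- rows (EAST region only) drawn from both warehouse tables.
        CREATE TABLE mart_east_sales AS
        SELECT c.name, o.order_id, o.amount
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        WHERE c.region = 'EAST';
        """
    )
    query = "SELECT name, order_id, amount FROM mart_east_sales ORDER BY order_id"
    return conn.execute(query).fetchall()
```

Users of the mart query one small table instead of joining and filtering the full warehouse each time.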
REASONS FOR CREATING A DATA MART

 Easy access to frequently needed data.


 Creates a collective view for a group of users.
 Improves user response time.
 Ease of creation.
 Lower cost than implementing a full Data
warehouse
DATA MINING
 The non-trivial extraction of implicit,
previously unknown, and potentially useful
information from large databases.

– Extremely large datasets


– Useful knowledge that can improve
processes
– Cannot be done manually
WHERE HAS IT COME FROM?
MOTIVATION
Databases today are huge:
– More than 1,000,000 entities/records/rows

– From 10 to 10,000 fields/attributes/variables
– Gigabytes and terabytes
Databases are growing at an unprecedented rate.
The corporate world is a cut-throat world:

– Decisions must be made rapidly

– Decisions must be made with maximum
knowledge
HOW DOES DATA MINING WORK?
• Extract, transform, and load transaction data onto the data
warehouse system.
• Store and manage the data in a multidimensional database
system.
• Provide data access to business analysts and information
technology professionals.
• Analyze the data by application software.
• Present the data in a useful format, such as a graph or table.
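
The store, analyze and present steps above can be sketched as follows. This is a simplified illustration, not a real multidimensional database: the `SALES` facts, the three dimension names and both helper functions are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical facts: one amount per (quarter, region, product) combination.
SALES = [
    ("2008-Q1", "EAST", "widgets", 120.0),
    ("2008-Q1", "WEST", "widgets", 80.0),
    ("2008-Q1", "EAST", "gadgets", 200.0),
    ("2008-Q2", "EAST", "widgets", 150.0),
    ("2008-Q2", "WEST", "gadgets", 50.0),
]
DIMENSIONS = ("quarter", "region", "product")


def roll_up(facts, keep):
    """Analyze: total the amounts over every dimension not listed in keep."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for *dims, amount in facts:
        key = tuple(dims[i] for i in positions)
        totals[key] += amount
    return dict(totals)


def as_table(totals, keep):
    """Present: format the totals as a plain-text table."""
    lines = [" | ".join(list(keep) + ["total"])]
    for key in sorted(totals):
        lines.append(" | ".join(list(key) + [f"{totals[key]:.2f}"]))
    return "\n".join(lines)
```

For example, `print(as_table(roll_up(SALES, ("region",)), ("region",)))` prints one total per region, and passing `("quarter", "region")` instead gives a finer breakdown of the same data.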

DATA MINING MEASURES
 Accuracy
 Clarity
 Dirty Data
 Scalability
 Speed
 Validation
TYPICAL APPLICATIONS OF DATA MINING
ADVANTAGES OF DATA MINING
 Engineering and Technology
 Medical Science
 Business
 Combating Terrorism
 Games
 Research and Development
ENGINEERING AND TECHNOLOGY
 In Electrical Power Engineering
- used for condition monitoring of
high voltage electrical equipment
- vibration monitoring and analysis of
transformer on-load tap-changers
 Education
- to concentrate their knowledge
MEDICAL SCIENCE
 Data mining has been widely used in the areas of
bioinformatics and genetics.
 It is used to relate variation in DNA sequences to
variability in disease susceptibility, which is very
important for improving the diagnosis, prevention
and treatment of diseases.
BUSINESS
 In Customer Relationship Management
applications
 It translates data from customer to merchant
accurately
 Distribute Business Processes
 Powerful Tool For Marketing
COMBATING TERRORISM
 Used against terrorism to search records, for
example by Interpol and by the Multistate
Anti-Terrorism Information Exchange (MATRIX)
 Also used in the Secure Flight program, the
Computer Assisted Passenger Prescreening System
and ADVISE (Analysis, Dissemination, Visualization,
Insight, Semantic Enhancement)
GAMES
 Applied to oracles for certain combinatorial games,
also called tablebases (e.g. for 3x3-chess)
 It includes extraction of human-usable
strategies
 Berlekamp in dots-and-boxes and John
Nunn in chess endgames are notable examples
RESEARCH AND DEVELOPMENT
 Helps to develop search algorithms
 It offers huge libraries of graphing and
visualisation software
 Users can easily create models
optimally
LIST OF THE TOP EIGHT DATA-MINING
SOFTWARE VENDORS IN 2008

 Angoss Software
 Infor CRM Epiphany
 Portrait Software
 SAS
 G-Stat
 SPSS
 ThinkAnalytics
 Unica
 Viscovery
