Chapter 5

Business Intelligence: Data Warehousing, Data Acquisition,
Data Mining, Business Analytics, and Visualization

How can organizations leverage all the data
they collect and store?

Answer
• Data warehousing
• Data acquisition (access)
• Data mining
• Online analytical processing (OLAP)
or Business Analytics
• Data Visualization

Data, Information, Knowledge

• Data
– Items that are the most elementary descriptions
of things, events, activities, and transactions
– May be internal or external
• Information
– Organized data that convey meaning and value
• Knowledge
– Processed data or information that conveys
understanding, experience, accumulated
learning and expertise applicable to a problem
or activity

What kinds of data issues do
organizations deal with?
• Multiple sources
• Wide time frame
• Data reduction (aggregation)
• Various levels of detail
• Various amounts of data
• Varying degrees of accuracy
• Need for random access to databases
• Security and private databases
• End-user interface (e.g., the ability to work with two or
more databases at a time)

Preparing data for Warehousing

• Cleanse data
– When populating warehouse
– Data quality action plan
– Best practices for data quality
– Measure results
• Data integrity issues
– Uniformity
– Version
– Completeness check
– Conformity check
– Genealogy or drill-down

Preparing data for Warehousing

• Data Integration
• Access needed to multiple sources
– Often enterprise-wide
– Disparate and heterogeneous databases
– XML becoming language standard
• Web (external data source)
– Intelligent agents
– Document management systems
– Content management systems
• External Commercial databases
– Sell access to specialized databases
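The integration and cleansing steps listed under "Preparing data for Warehousing" can be sketched in a few lines. This is only an illustration under stated assumptions: the two source layouts, the field names, the state lookup table, and the rule that later sources overwrite earlier ones are all hypothetical and do not come from the slides.

```python
# Hypothetical records from two heterogeneous sources (illustrative data only).
crm_rows = [
    {"id": "C1", "name": " Alice Smith ", "state": "ca"},
    {"id": "C2", "name": "Bob Jones", "state": "NY"},
]
billing_rows = [
    {"cust_id": "C2", "name": "Bob Jones", "state": "New York"},
    {"cust_id": "C3", "name": "", "state": "TX"},
]

# Assumed lookup tables for the uniformity and conformity steps.
STATE_CODES = {"new york": "NY", "california": "CA", "texas": "TX"}
VALID_STATES = {"CA", "NY", "TX"}


def to_common_schema(row):
    """Integration: map either source layout onto one schema."""
    return {
        "customer_id": row.get("id") or row.get("cust_id"),
        "name": row["name"].strip(),
        "state": row["state"].strip(),
    }


def make_uniform(record):
    """Uniformity: store every state as a two-letter upper-case code."""
    state = record["state"]
    code = STATE_CODES.get(state.lower(), state.upper())
    return {**record, "state": code}


def integrity_problems(record):
    """Return the names of the integrity checks that this record fails."""
    problems = []
    if not record["customer_id"] or not record["name"]:
        problems.append("completeness")
    if record["state"] not in VALID_STATES:
        problems.append("conformity")
    return problems


def prepare(sources):
    """Integrate all sources, keep clean records, set aside rejected ones."""
    clean, rejected = {}, []
    for rows in sources:
        for row in rows:
            record = make_uniform(to_common_schema(row))
            problems = integrity_problems(record)
            if problems:
                rejected.append((record, problems))
            else:
                # Assumed policy: later sources overwrite earlier ones.
                clean[record["customer_id"]] = record
    return list(clean.values()), rejected


clean_records, rejected_records = prepare([crm_rows, billing_rows])
```

Keeping the rejected records together with the reason for rejection gives a simple way to measure results, one of the cleansing items listed above.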

Database Models
• Hierarchical
– Top down, like inverted tree
– Fields have only one “parent”, each “parent” can have multiple
“children”
– Fast
• Network
– Relationships created through linked lists, using pointers
– “Children” can have multiple “parents”
– Greater flexibility, substantial overhead
• Relational
– Flat, two-dimensional tables with multiple access queries
– Examines relations between multiple tables
– Flexible, quick, and extendable with data independence
• Object oriented
– Data analyzed at conceptual level
– Inheritance, abstraction, encapsulation
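To make the relational model concrete, the sketch below stores data in two flat tables and relates them with a query. It uses Python's built-in sqlite3 module; the table names, columns, and rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database with two flat, two-dimensional tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name TEXT
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    amount REAL
);
INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# The relation between the tables is expressed in the query (a join),
# not in stored pointers as in the hierarchical or network models.
totals = conn.execute("""
SELECT c.name, SUM(o.amount)
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.customer_id
GROUP BY c.name
ORDER BY c.name
""").fetchall()
```

Because the link between customers and orders lives in the query, a new question (for example, orders per customer) needs only a new query, not a new storage structure. This is one reading of the "data independence" point above.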

Database Models

• Multimedia Based
– Multiple data formats
• JPEG, GIF, bitmap, PNG, sound, video, virtual reality
– Requires specific hardware for full feature
availability
• Document Based
– Document storage and management
• Intelligent
– Intelligent agents and ANN
• Inference engines

What is Data Warehousing?

• The process of taking internal and/or external
data, cleansing it, and storing it in a data
warehouse where it can be accessed by various
decision makers in the decision support process.
• A data mart is a subset of a data warehouse
containing the data for one subject area.
• Data warehousing solves the data acquisition or
access problem.
• The end users perform ad hoc query, reporting,
analysis and visualization on the data warehouse
or on one or more data marts.
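The relationship between a warehouse and a data mart can be sketched as follows. This is a minimal illustration, not a warehouse design: the single `facts` table, the `sales_mart` view, the cleansing rule (drop rows with missing values), and all data are assumptions made for the example, using Python's built-in sqlite3 module.

```python
import sqlite3

# Hypothetical source data: (day, region, value). None marks a missing value.
internal_sales = [
    ("2024-01-05", "East", 120.0),
    ("2024-01-06", "West", None),
    ("2024-01-06", "West", 80.0),
]
external_market_size = [("2024-01-31", "East", 1000.0)]

warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
CREATE TABLE facts (
    subject TEXT, day TEXT, region TEXT, measure TEXT, value REAL
);
-- The data mart exposes only one subject area of the warehouse.
CREATE VIEW sales_mart AS
    SELECT day, region, value AS amount
    FROM facts
    WHERE subject = 'sales';
""")


def cleanse(rows):
    """Assumed cleansing rule: drop rows that have any missing field."""
    return [row for row in rows if None not in row]


def load(subject, measure, rows):
    """Cleanse the rows and store them in the warehouse."""
    warehouse.executemany(
        "INSERT INTO facts VALUES (?, ?, ?, ?, ?)",
        [(subject, day, region, measure, value)
         for day, region, value in cleanse(rows)],
    )


load("sales", "amount", internal_sales)
load("market", "market_size", external_market_size)

# An end user's ad hoc query against the data mart.
sales_by_region = warehouse.execute(
    "SELECT region, SUM(amount) FROM sales_mart "
    "GROUP BY region ORDER BY region"
).fetchall()
fact_count = warehouse.execute("SELECT COUNT(*) FROM facts").fetchone()[0]
```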

Business Intelligence and
Analytics

• Business intelligence
– Acquisition of data and information for
use in decision-making activities
• Business analytics
– Models and solution methods
• Data mining
– Applying models and methods to data to
identify patterns and trends

What is Multidimensionality?

• It is the ability to present several dimensions
in one screen or table, e.g., sales by region,
by city, by product, by time period, and by
salesperson (5 dimensions). The presentation
can easily be changed to show different
dimensions.
• Data are organized according to business
standards, not analysts' preferences, so that
they can be sliced and diced to gain new insights.
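A small sketch may help show what "changing the presentation of dimensions" and "slicing and dicing" mean in practice. The records, the `pivot` and `slice_by` helper names, and the figures are hypothetical; only the five dimensions come from the slide.

```python
from collections import defaultdict

# Hypothetical sales records carrying the five dimensions named above.
sales = [
    {"region": "East", "city": "Boston", "product": "Widget",
     "period": "Q1", "salesperson": "Ann", "amount": 100},
    {"region": "East", "city": "Boston", "product": "Gadget",
     "period": "Q1", "salesperson": "Ann", "amount": 50},
    {"region": "East", "city": "Hartford", "product": "Widget",
     "period": "Q2", "salesperson": "Raj", "amount": 70},
    {"region": "West", "city": "Denver", "product": "Widget",
     "period": "Q1", "salesperson": "Lee", "amount": 30},
    {"region": "West", "city": "Denver", "product": "Gadget",
     "period": "Q2", "salesperson": "Lee", "amount": 90},
]


def pivot(records, row_dim, col_dim, measure="amount"):
    """Total the measure in a table of row_dim by col_dim."""
    table = defaultdict(lambda: defaultdict(float))
    for rec in records:
        table[rec[row_dim]][rec[col_dim]] += rec[measure]
    return {row: dict(cols) for row, cols in table.items()}


def slice_by(records, **fixed):
    """Slice: keep only records matching fixed dimension values."""
    return [r for r in records
            if all(r[dim] == value for dim, value in fixed.items())]


# Same data, two presentations: only the chosen dimensions change.
by_region_period = pivot(sales, "region", "period")
q1_by_product_city = pivot(slice_by(sales, period="Q1"), "product", "city")
```

Here `by_region_period` totals sales by region and period, while `q1_by_product_city` first slices to one period and then presents product by city.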

What is OLAP?
• A database-oriented DSS that uses a data warehouse and a
set of tools, usually with multidimensional capabilities, to aid
in reporting, querying, and data analysis.
• Activities performed by end users in OLAP systems
– Specific, open-ended query generation
• SQL
– Requesting ad hoc reports
– Conducting statistical and other (e.g., data mining) analyses
– Building DSS applications
• Modeling and visualization capabilities
• OLAP tools fall into four product groups:
– Multidimensional spreadsheets
– Multidimensional query and report writing tools for standard
RDBMS (e.g., Business Objects)
– Fully multidimensional DBMS
– Visual information access systems
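As a rough illustration of an ad hoc report in an OLAP setting, the sketch below builds a SQL GROUP BY query from whichever dimensions the user picks. It is not how any particular OLAP product works; the `sales` table, the `ad_hoc_report` function, the allowed dimension list, and the data are assumptions for the example, which uses Python's built-in sqlite3 module.

```python
import sqlite3

# Dimensions the user may choose from (assumed for this example).
DIMENSIONS = ("region", "product", "quarter")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (region TEXT, product TEXT, quarter TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("East", "Widget", "Q1", 100),
        ("East", "Gadget", "Q1", 50),
        ("East", "Widget", "Q2", 70),
        ("West", "Widget", "Q1", 30),
        ("West", "Gadget", "Q2", 90),
    ],
)


def ad_hoc_report(connection, dimensions):
    """Total sales grouped by the requested dimensions."""
    unknown = [d for d in dimensions if d not in DIMENSIONS]
    if unknown or not dimensions:
        raise ValueError(f"choose one or more of {DIMENSIONS}")
    # Column names come only from the fixed list above, so building
    # the SQL text from them is safe here.
    cols = ", ".join(dimensions)
    sql = (f"SELECT {cols}, SUM(amount) FROM sales "
           f"GROUP BY {cols} ORDER BY {cols}")
    return connection.execute(sql).fetchall()


by_region = ad_hoc_report(conn, ["region"])
by_product_quarter = ad_hoc_report(conn, ["product", "quarter"])
```

The user states which dimensions matter and the tool writes the SQL, which matches the "specific, open-ended query generation" activity listed above.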

What is Data Mining?

• A term used to describe knowledge discovery in
databases
• Requires accessibility from a user’s workstation
to data that may reside in different locations (e.g.,
corporate warehouse, servers)
• Statistical, mathematical, artificial intelligence, and
machine-learning techniques are used to identify
new patterns in data for knowledge discovery
• Automatic and fast; uses intelligent search


Data Mining (contd.)
• Data mining application classes of problems
– Classification
– Clustering
– Association
– Sequencing
– Regression
– Forecasting
– Others
• Data mining application areas
– Marketing
– Banking
– Insurance
– Health care
– Law enforcement
– Government and defense
– Others
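Of the problem classes listed above, association is easy to illustrate by hand. The sketch below computes the two standard measures for an association rule, support and confidence, over a handful of hypothetical market baskets. The basket contents and function names are made up for the example.

```python
# Hypothetical market baskets (one set of items per transaction).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the transactions."""
    base = support(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base


bread_and_milk = support({"bread", "milk"}, baskets)
bread_implies_milk = confidence({"bread"}, {"milk"}, baskets)
```

With these baskets, bread and milk appear together in 2 of 4 transactions (support 0.5), and 2 of the 3 baskets containing bread also contain milk (confidence 2/3). A marketing application would search many such rules for those with high support and confidence.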

What is an Intelligent Database?

• Provides the user with easy access to data
• Allows complex operations without much user input:
the user specifies what she is looking for, and an intelligent
agent executes the request

• Business Objects is an example of an intelligent database

• “Objects” are created by IS professionals to represent
elements in databases such as customers, products, or
locations. Users click on the objects, and Business Objects
automatically generates and executes the appropriate SQL
queries.

Data Visualization

• Technologies supporting visualization
and interpretation
– Digital images (maps), GIS, GUI, tables,
multidimensionality, graphs, VR (virtual
reality), 3D, animation
– Identify relationships and trends
• Data manipulation allows a real-time
look at performance data

Analytic systems

• Real-time queries and analysis
• Real-time decision-making
• Real-time data warehouses updated
daily or more frequently
– Updates may be made while queries are
active
– Not all data updated continuously
• Deployment of business analytic
applications

GIS

• Computerized system for managing
and manipulating data with digitized
maps
– Geographically oriented
– Geographic spreadsheet for models
– Software allows web access to maps
– Used for modeling and simulations

Web Analytics/Intelligence

• Web analytics
– Application of business analytics to Web
sites
• Web intelligence
– Application of business intelligence
techniques to Web sites
