
Business Intelligence

• BI Meaning
• Components of BI
• BI process
• BI Providers
• Functions of BI server
• BI Capabilities
• BI Users
• BI Infrastructure
• BI Tools
• Others

07/24/23 1
Business Intelligence
• Information systems (IS) generate enormous amounts of operational data that
contain patterns, relationships, clusters, and other information that can
facilitate management, especially planning and forecasting.

• BI systems produce such information from operational data.


• BI is the collective information about…
• Customers
• Competitors
• Business partners
• Competitive environment

• BI is the process of extracting data from an OLAP database and then
analyzing that data for information that you can use to make informed
business decisions and take action.

Components of Business Intelligence (BI) Systems

Three Primary Activities in the BI Process

Acquire Data: Extracted Order Data
• Query
Sales (CustomerName, Contact, Title, Bill Year, Number Orders, Units, Revenue, Source,
PartNumber)
Part (PartNumber, Shipping Weight, Vendor)

Sample Extracted Data: Part Data Table

Analyze Data

Qualifying Parts Query Design

Publish Results: Qualifying Parts Query Results

What Are the Two Functions of a BI Server?
• Management
• Delivery

BI Providers

BI Users

BUSINESS INTELLIGENCE USERS

Power users: producers    Capabilities                       Casual users: consumers
(20% of employees)                                           (80% of employees)
IT developers             Production reports                 Operational employees
Super users               Parameterized reports              Senior managers
Business analysts         Dashboards                         Managers/staff
Analytical modelers       Ad hoc, drill down, search/OLAP    Business analysts

BUSINESS INTELLIGENCE AND ANALYTICS CAPABILITIES

– Production reports: These are predefined reports based on
industry-specific requirements.

– Parameterized reports: Users enter several parameters, as in a
pivot table, to filter data and isolate the impacts of those
parameters. For instance,
• you might enter region and time of day to understand
how sales of a product vary by region and time.
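
The region and time-of-day example can be sketched as a small report function. This is only an illustration: the sales rows, the field layout, and the name `parameterized_report` are hypothetical and do not come from any particular BI product.

```python
# Hypothetical sales rows: (region, hour_of_day, product, units_sold)
SALES = [
    ("North", 9, "Widget", 12),
    ("North", 18, "Widget", 30),
    ("South", 9, "Widget", 7),
    ("South", 18, "Widget", 15),
    ("North", 9, "Gadget", 4),
]


def parameterized_report(rows, product, region=None, hour=None):
    """Total units of `product`, optionally filtered by region and hour.

    A parameter left as None means "do not filter on this dimension".
    """
    total = 0
    for row_region, row_hour, row_product, units in rows:
        if row_product != product:
            continue
        if region is not None and row_region != region:
            continue
        if hour is not None and row_hour != hour:
            continue
        total += units
    return total


all_widgets = parameterized_report(SALES, "Widget")
north_widgets = parameterized_report(SALES, "Widget", region="North")
north_evening = parameterized_report(SALES, "Widget", region="North", hour=18)
```

Changing only the parameters, not the report definition, is what isolates the effect of each one.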

BUSINESS INTELLIGENCE AND ANALYTICS CAPABILITIES

• Dashboards/scorecards: These are visual tools for presenting
performance data defined by users.

• Ad hoc query/search/report creation: These allow users to create
their own reports based on queries and searches.

• Drill down: This is the ability to move from a high-level summary
to a more detailed view.

• Forecasts, scenarios, models: These include the ability to perform
linear forecasting and what-if scenario analysis and to analyze data
using standard statistical tools.

BI Capabilities Example

BI Infrastructure
• BI infrastructure is an array of tools for obtaining useful information
from all the different types of data used by businesses today, including
semi-structured and unstructured big data in vast quantities.
• These capabilities include
– data warehouses and data marts,
– Hadoop,
– in-memory computing, and
– analytical platforms.

Hadoop
• For handling unstructured and semi-structured data
in vast quantities, as well as structured data, organizations are
using Hadoop.

• Hadoop is an open source software framework managed by the
Apache Software Foundation that enables
– distributed parallel processing of huge amounts of data across
inexpensive computers.

Key services:
• Hadoop consists of several key services:
– the Hadoop Distributed File System (HDFS) for data
storage and
– MapReduce for high-performance parallel data
processing.
– HDFS links together the file systems on the numerous
nodes in a Hadoop cluster to turn them into one big file
system.
– Hadoop’s MapReduce was inspired by Google’s
MapReduce system for breaking down processing of huge
datasets and assigning work to the various nodes in a
cluster.
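
The MapReduce idea of breaking a job into pieces and combining the partial results can be sketched in a single Python process. This is not the Hadoop API: the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample text chunks are assumptions made for illustration, and a real cluster would run the map and reduce steps on different nodes.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: combine the values for each key into one total."""
    return {key: sum(values) for key, values in grouped.items()}


# Each chunk stands in for a block of data held on a different node.
chunks = ["big data big clusters", "data nodes", "big nodes"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
```

Because each chunk is mapped independently, the map work can be spread across as many nodes as there are chunks.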
• HBase, Hadoop’s non-relational database, provides rapid
access to the data stored on HDFS and a
transactional platform for running high-scale real-time
applications.
• Hadoop can process large quantities of any kind of data,
including structured transactional data, loosely structured
data such as Facebook and Twitter feeds, complex data such
as Web server log files, and unstructured audio and video
data.
• Hadoop runs on a cluster of inexpensive servers, and
processors can be added or removed as needed. Companies
use Hadoop for analyzing very large

In-Memory Computing
• Another way of facilitating big data analysis is to use in-memory
computing, which relies primarily on a computer’s main memory (RAM) for
data storage.
(Conventional DBMSs use disk storage systems.)

• Users access data stored in the system’s primary memory, thereby eliminating
bottlenecks from retrieving and reading data in a traditional, disk-based
database and dramatically shortening query response times.

• In-memory processing makes it possible for very large sets of data,
amounting to the size of a data mart or small data warehouse, to reside
entirely in memory.

• Leading commercial products for in-memory computing
include SAP’s High Performance Analytics Appliance (HANA)
and Oracle Exalytics.

• Each provides a set of integrated software components,
including in-memory database software and specialized
analytics software, that run on hardware optimized for
in-memory computing work.

• Centrica, a gas and electric utility, uses HANA to quickly
capture and analyze the vast amounts of data generated by
smart meters.

• The company is able to analyze usage every 15 minutes,
giving it a much clearer picture of usage by
neighborhood, home size, type of business served, or building
type.

• HANA also helps Centrica show its customers their energy
usage patterns in real time using online and mob

Analytic platforms
• Commercial database vendors have developed specialized
high-speed analytic platforms using both relational and
non-relational technology that are optimized for analyzing large
datasets.
• These analytic platforms, such as IBM Netezza and Oracle Exadata,
feature preconfigured hardware-software systems that are
specifically designed for query processing and analytics.

Analytic platforms contd…
• For example, IBM Netezza features tightly integrated
database, server, and storage components that
handle complex analytic queries 10 to 100 times
faster than traditional systems.

• Analytic platforms also include in-memory systems
and NoSQL non-relational database management
systems.

BI Tools
1. Reporting Tools
• Integrate data from multiple systems

• Sorting, grouping, summing, averaging, comparing data

2. Data-mining Tools
• Used to discover hidden patterns and relationships
• Use sophisticated statistical techniques such as regression analysis,
decision tree analysis, and market-basket analysis

3. Knowledge-management Tools
• Create value by collecting and sharing human knowledge about products,
product uses, best practices, and other critical knowledge

Reporting Tools
• Reporting tools produce information from data using
five basic operations:
• Sorting
• Grouping
• Calculating
• Filtering
• Formatting
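
The five operations can be sketched on a handful of order records. The records, field names, and the revenue threshold of 100 are hypothetical and chosen only to show each operation once.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical order data; field names are illustrative only.
orders = [
    {"customer": "Ajax", "region": "East", "revenue": 1200.0},
    {"customer": "Baker", "region": "West", "revenue": 300.0},
    {"customer": "Cole", "region": "East", "revenue": 800.0},
    {"customer": "Dunn", "region": "West", "revenue": 50.0},
]

# Filtering: keep only orders worth at least 100.
filtered = [o for o in orders if o["revenue"] >= 100]

# Sorting: order by region (groupby below needs sorted input).
ordered = sorted(filtered, key=itemgetter("region"))

# Grouping and calculating: total revenue per region.
totals = {
    region: sum(o["revenue"] for o in group)
    for region, group in groupby(ordered, key=itemgetter("region"))
}

# Formatting: fixed-width text lines with thousands separators.
report_lines = [f"{region:<6}{total:>10,.2f}" for region, total in totals.items()]
```

Printing `report_lines` one per line gives a simple two-column revenue-by-region report.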

RFM Analysis

RFM analysis allows you to analyze and rank customers according to
purchasing patterns as this figure shows.

• R = how recently a customer purchased your products

• F = how frequently a customer purchases your products

• M = how much money a customer typically spends on your products

RFM Analysis ….

Divides customers into five groups and assigns a score from 1 to 5


• R score 1 = top 20 percent in most recent orders

• R score 5 = bottom 20 percent (longest since last order)

• F score 1 = top 20 percent in most frequent orders

• F score 5 = bottom 20 percent least frequent orders

• M score 1 = top 20 percent in most money spent

• M score 5 = bottom 20 percent in amount of money spent
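
The scoring scheme above can be sketched as follows. The five customers and their figures are made up, the helper name `quintile_scores` is hypothetical, and ties are broken by sort order rather than handled specially.

```python
def quintile_scores(values, higher_is_better=True):
    """Map each customer to a score from 1 (best 20 percent) to 5 (worst)."""
    ranked = sorted(values, key=values.get, reverse=higher_is_better)
    n = len(ranked)
    # Position i in the ranking falls into quintile (i * 5) // n, numbered from 0.
    return {customer: (i * 5) // n + 1 for i, customer in enumerate(ranked)}


# Hypothetical customer data.
customers = {
    "A": {"days_since_order": 3, "orders": 40, "spent": 9000},
    "B": {"days_since_order": 20, "orders": 25, "spent": 7000},
    "C": {"days_since_order": 45, "orders": 12, "spent": 10000},
    "D": {"days_since_order": 90, "orders": 5, "spent": 800},
    "E": {"days_since_order": 200, "orders": 1, "spent": 100},
}

# Recency: fewer days since the last order is better.
r = quintile_scores(
    {c: v["days_since_order"] for c, v in customers.items()},
    higher_is_better=False,
)
f = quintile_scores({c: v["orders"] for c, v in customers.items()})
m = quintile_scores({c: v["spent"] for c, v in customers.items()})

# One (R, F, M) triple per customer.
rfm = {c: (r[c], f[c], m[c]) for c in customers}
```

A customer scoring (1, 1, 2), like customer A here, ordered recently and often and spends well above average.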

Interpreting RFM Score result ….

Online Analytical Processing (OLAP)

• OLAP, a second type of reporting tool, is more generic than
RFM.

• OLAP provides the ability to sum, count, average, and
perform other simple arithmetic operations on groups of
data.

• OLAP gives users the ability to manipulate and analyze large
volumes of data from multiple perspectives.

Online Analytical processing (OLAP)

OLAP Features….
• Dynamic
• User can change report structure
• View online

• Dimension
• Characteristic of a measure, such as purchase date, customer
type, location, or sales region

• Users can drill down into data.
– Divide data into more detail

OLAP operation

i. Drill Down

ii. Consolidation

iii. Slicing and dicing
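
These three operations can be sketched on a tiny fact table. The rows, the dimension names in `DIMS`, and the helpers `aggregate` and `slice_rows` are hypothetical. A real OLAP server would work on a stored cube rather than a Python list.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, revenue)
DIMS = ("year", "quarter", "region")
facts = [
    ("2023", "Q1", "East", 100),
    ("2023", "Q1", "West", 80),
    ("2023", "Q2", "East", 120),
    ("2023", "Q2", "West", 60),
]


def aggregate(rows, by):
    """Sum the revenue measure over the dimensions listed in `by`."""
    positions = [DIMS.index(dim) for dim in by]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]
    return dict(totals)


def slice_rows(rows, dim, value):
    """Keep only the rows where dimension `dim` equals `value`."""
    position = DIMS.index(dim)
    return [row for row in rows if row[position] == value]


# Consolidation: roll everything up to one total per year.
by_year = aggregate(facts, ["year"])

# Drill down: split the yearly total into quarters.
by_quarter = aggregate(facts, ["year", "quarter"])

# Slicing: fix region = East, then view that slice by quarter.
east_by_quarter = aggregate(slice_rows(facts, "region", "East"), ["quarter"])
```

Drill down and consolidation are the same aggregation run at different levels of detail, which is why one helper serves both.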

OLAP Consolidation Data

OLAP Drill Down

Data Mining

• A major use of data warehouse databases

• Data is analyzed to reveal hidden correlations, patterns, and
trends

• Uses a variety of techniques to find hidden patterns and
relationships in large pools of data and infer rules from them
that can be used to predict future behavior and guide
decisions.

Data Mining Tools

i. Query-and-reporting tools – similar to QBE tools, SQL, and
report generators

ii. Intelligent agents – utilize AI tools to help you “discover”
information and trends

iii. Multidimensional analysis (MDA tools) – slice-and-dice
techniques for viewing multidimensional information

iv. Statistical tools – for applying mathematical models to data
warehouse information

Data Mining Tools/Techniques

• Can be of two types

– Supervised data mining: regression and neural networks (NN)

– Unsupervised data mining: decision trees and cluster
analysis

Supervised data mining …

• Model developed before analysis (e.g., regression, NN)

• Statistical techniques used to estimate parameters

• Examples:
• Regression analysis: measures the impact of a set of variables
on one another
• Used for making predictions
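
As a rough sketch of regression used for prediction, the ordinary least-squares line for one predictor can be computed directly. The advertising and sales figures are invented, and `fit_line` is a hypothetical helper, not part of any statistics package.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend vs. sales (both in thousands).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

# The model form (a straight line) is chosen before looking at the data;
# only its parameters are estimated from the data.
slope, intercept = fit_line(ad_spend, sales)

# Prediction for a spend level not in the data.
predicted_sales = slope * 5.0 + intercept
```

Choosing the model first and estimating its parameters afterwards is what makes this a supervised technique in the sense used on this slide.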

Supervised data mining… Neural Network

• Popular supervised data-mining technique used to
predict values and make classifications such as “good
prospect” or “poor prospect” customers

Unsupervised data mining….
• Analysts do not create a model before running the analysis, e.g.,
cluster analysis, decision trees, etc.

• Analysts create hypotheses after the analysis to explain the patterns
found.

• There is no prior model of the patterns and relationships that
might exist. A common statistical technique:
• Cluster analysis to find groups of similar customers from
customer order and demographic data
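
Cluster analysis can be sketched with k-means, one common clustering method. The slides do not name a specific algorithm, so k-means is an assumption here, as are the customer figures and the choice of two starting centroids.

```python
def kmeans(points, centroids, rounds=10):
    """Plain k-means: assign points to the nearest centroid, then re-center."""
    for _ in range(rounds):
        clusters = [[] for _ in centroids]
        for point in points:
            # Squared Euclidean distance from this point to each centroid.
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers as (age, annual_spend) pairs.
customers = [(25, 200), (27, 220), (30, 180), (55, 900), (60, 950), (58, 870)]

# Start from two arbitrary customers; no group labels are supplied.
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
```

The groups come out of the data alone. Explaining them, for example as younger low spenders and older high spenders, is the hypothesis the analyst forms afterwards.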

Unsupervised data mining…Decision tree

• Hierarchical arrangement of criteria that predict
a classification or value

• Basic idea of a decision tree
• Select the attributes most useful for classifying something
on some criteria that create disparate groups
• The more different or pure the groups, the better the
classification

Decision tree

Create Set of If/Then Decision Rules

• If student is a junior and works in a restaurant, then
predict grade > 3.0.
• If student is a senior and is a non-business major,
then predict grade < 3.0.
• If student is a junior and does not work in a
restaurant, then predict grade < 3.0.
• If student is a senior and is a business major, then
make no prediction.
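
The four rules can be written directly as a function. The name `predict_grade` and the argument names are hypothetical, and returning `None` stands for "make no prediction".

```python
def predict_grade(class_standing, works_in_restaurant, business_major):
    """Apply the if/then rules; return a grade range or None."""
    if class_standing == "junior":
        # Juniors are split on restaurant work.
        return "> 3.0" if works_in_restaurant else "< 3.0"
    if class_standing == "senior":
        # Seniors are split on major; business majors get no prediction.
        return None if business_major else "< 3.0"
    # The rules cover only juniors and seniors.
    return None


junior_in_restaurant = predict_grade("junior", True, False)
senior_business = predict_grade("senior", False, True)
```

Each branch of the tree becomes one path through the nested conditions, so the attribute tested first (class standing) sits at the root.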

Market Basket Analysis (MBA)

Market-basket analysis is a data-mining technique for
determining sales patterns.
– Uses statistical methods to identify sales patterns in large volumes of
data
– Shows which products customers tend to buy together

– Used to estimate probability of customer purchase

– Helps identify cross-selling opportunities
• “Customers who bought book X also bought book Y”

Hypothetical sales data for 1,000 transactions at a dive shop

Market-Basket terminologies

Support
• Probability that two items will be bought together
• Fins and masks purchased together 150 times,
thus support for fins and a mask is 150/1,000, or
15 percent
• Support for fins and weights is 60/1,000, or 6
percent
• Support for fins along with a second pair of fins is
10/1,000, or 1 percent
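
The support figures above are joint purchase counts divided by the 1,000 total. A minimal sketch, using only the counts quoted on this slide (the variable names are illustrative):

```python
TOTAL = 1000  # denominator used in the slide's support figures

# Times each pair of items was bought together, as quoted above.
pair_counts = {
    ("fins", "mask"): 150,
    ("fins", "weights"): 60,
    ("fins", "fins"): 10,
}

# Support = joint purchase count / total.
support = {pair: count / TOTAL for pair, count in pair_counts.items()}
```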

Market-Basket terminologies
Confidence
• What proportion of the customers who bought a mask also
bought fins?
• Conditional probability estimate

• Example:
» Probability of buying fins = 28%
» Probability of buying swim mask = 27%
• After buying a mask,
» Probability of buying fins = 150/270, or 55.56%

• Likelihood that a customer will also buy fins almost doubles, from
28% to 55.56%.
• Thus, all sales personnel should try to sell fins to anyone buying a
mask.
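
The 55.56% figure is a conditional probability: the 150 joint purchases of fins and a mask divided by the 270 mask purchases (27% of 1,000). A short check of that arithmetic, with illustrative variable names:

```python
mask_count = 270        # 27% of 1,000 involved a mask
fins_and_mask = 150     # fins and a mask bought together
base_p_fins = 0.28      # probability of buying fins with no other information

# Confidence of fins given a mask: P(fins and mask) / P(mask).
confidence_fins_given_mask = fins_and_mask / mask_count

# How much more likely fins become once a mask is in the basket.
increase = confidence_fins_given_mask / base_p_fins
```

The denominator is the mask count because the condition is "bought a mask".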

Market-Basket terminologies…

Lift
• Ratio of confidence to base probability of buying
item
• Shows how much base probability increases or
decreases when other products are purchased
• Example:
• Lift of fins and a mask is confidence of fins given
a mask, divided by the base probability of fins.
• Lift of fins and a mask is .5556/.28 = 1.98
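
Lift can be wrapped in a small helper so the same calculation applies to any item pair. The function name `lift` and its parameters are hypothetical, and the inputs are the dive-shop figures used in this deck.

```python
def lift(joint_count, given_count, base_probability):
    """Lift = confidence / base probability.

    joint_count:      times both items were bought together
    given_count:      times the conditioning item was bought
    base_probability: unconditional probability of the target item
    """
    confidence = joint_count / given_count
    return confidence / base_probability


# Fins given a mask: 150 joint purchases, 270 mask purchases, P(fins) = 0.28.
lift_fins_given_mask = lift(150, 270, 0.28)
```

A lift above 1 means the purchase of one item raises the probability of the other, and a lift below 1 means it lowers it.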
