
In-Database Analytics

Oracle 10g DB: Data Warehousing, ETL, OLAP, Statistics, Data Mining (BI/DW)


Flashback: RTE
In-Database Analytics
In-Database Statistics
OLAP Option
Data Mining Option
Q&A

Flashback: RTE

RTE (Real-Time Enterprise)?
"An enterprise that competes by using up-to-date information to progressively remove delays to the management and execution of its critical business processes."
- Gartner, Definition of Real Time Enterprise
Real-Time Enterprise: remove delays, up-to-date information

DW Renovation

[Diagram labels: KPI; Fin, MFG, MES; ETL; ODS; RTDW (OLAP + OLTP); BI Portal]

Data in One Place: Information Driven Enterprise

Gap

Business Intelligence: Query and Reporting, OLAP, Data Mining



3 DW : 30TB
DW : 100TB
2,3 PB DW

.

4000 2TB
?



[Diagram labels: OLAP Engine, Data Integration Engine, Data Warehouse, Mining Engine]


Security, Scalability, Availability
SQL

In-Database Analytics

Oracle 10g DB: Data Warehouse, Built-in Statistics, OLAP Option, Data Mining Option

Oracle Business Intelligence
Know More, Do More, Spend Less!

Oracle 10g DB (cube dimensions shown: REGION, PRODUCT, TIME)

Query & Reporting: Oracle BI Solution, BI Beans, Oracle Reports
Drill for Detail: OLAP Option, Spreadsheet Add-In
Mine for New Insights: Oracle Data Mining Option, Spreadsheet Add-In, Statistics, Text Mining
Access & Assemble Data: Oracle Warehouse Builder

In-Database Analytics

Fast scoring : CPU 250
TCO

In-Database Analytics :
: DVD

,

3 3
DVD

,

In-Database Analytics :

1 :
DB

DB

2 : DB


3 :
DB

In-Database Analytics :

SQL
-- Outer query: per predicted-responder group and region, compare purchases
-- before and after 15-Apr-2005 and test whether the change is significant.
select responder, cust_region, count(*) as cnt,
       sum(post_purch - pre_purch) as tot_increase,
       avg(post_purch - pre_purch) as avg_increase,
       stats_t_test_paired(pre_purch, post_purch) as significance
from (
  -- Inner query: one row per customer, scored by the mining model.
  select customers.cust_id, cust_region,
         prediction(campaign_model using *) as responder,
         sum(case when purchase_date < date '2005-04-15' then
               purchase_amt else 0 end) as pre_purch,
         sum(case when purchase_date >= date '2005-04-15' then
               purchase_amt else 0 end) as post_purch
  from customers, sales, products@PRODDB   -- products is read over a database link
  where sales.cust_id = customers.cust_id
    and purchase_date between date '2005-01-15' and date '2005-07-14'
    and sales.prod_id = products.prod_id
    and contains(prod_description, 'DVD') > 0   -- text search on the description
  group by customers.cust_id, cust_region, prediction(campaign_model using *) )
group by rollup (responder, cust_region)
order by 4 desc;
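The `stats_t_test_paired(pre_purch, post_purch)` call in the query above compares two measurements taken on the same customers. As a rough illustration of what a paired t-test works with, the following Python sketch computes the t statistic and degrees of freedom from per-customer differences. It is not Oracle's implementation, it does not compute the significance (p-value) itself, and the function name and sample figures are made up for the example.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired samples.

    Hypothetical helper for illustration only; it mirrors the textbook
    paired t-test, not any specific database function.
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    # Work on the per-pair differences (after minus before).
    diffs = [a - b for b, a in zip(before, after)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Made-up purchase totals for four customers, before and after a cut-off date.
pre_purch = [10.0, 20.0, 30.0, 40.0]
post_purch = [12.0, 25.0, 31.0, 47.0]
t_value, dof = paired_t_statistic(pre_purch, post_purch)
print(f"t = {t_value:.4f}, degrees of freedom = {dof}")
```

A large absolute t value relative to the degrees of freedom suggests the before/after change is unlikely to be noise; turning it into a significance level requires the t distribution, which the database function handles internally.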

In-Database Analytics :

(SQL pipelining)

10g
Ranking functions
rank, dense_rank, cume_dist, percent_rank,
ntile

Window Aggregate functions (moving and cumulative)
Avg, sum, min, max, count, variance, stddev,
first_value, last_value

LAG/LEAD functions
Direct inter-row reference using offsets

Reporting Aggregate functions


Sum, avg, min, max, variance, stddev, count,
ratio_to_report
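
The ranking, window aggregate, and LAG function families listed above can be tried out without an Oracle instance. The sketch below uses Python's built-in `sqlite3` module as a stand-in, on the assumption that the bundled SQLite is version 3.25 or later (the first release with window functions). The `sales` table and its rows are invented for the example, and SQLite's dialect differs from Oracle's in places (for instance, it has no `ratio_to_report`).

```python
import sqlite3

# Invented sample data: (region, month, amount).
rows = [
    ("East", "2005-01", 100.0),
    ("East", "2005-02", 150.0),
    ("East", "2005-03", 120.0),
    ("West", "2005-01", 80.0),
    ("West", "2005-02", 80.0),
    ("West", "2005-03", 200.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table sales (region text, month text, amount real)")
conn.executemany("insert into sales values (?, ?, ?)", rows)

query = """
select region, month, amount,
       -- ranking function: position of each month by amount within its region
       rank() over (partition by region order by amount desc) as amt_rank,
       -- LAG: previous month's amount in the same region (NULL for the first month)
       lag(amount, 1) over (partition by region order by month) as prev_amount,
       -- cumulative window aggregate: running total per region
       sum(amount) over (partition by region order by month
                         rows between unbounded preceding and current row) as running_total,
       -- moving window aggregate: average of the current and previous month
       avg(amount) over (partition by region order by month
                         rows between 1 preceding and current row) as moving_avg
from sales
order by region, month
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
conn.close()
```

Each output row keeps its detail columns and gains the computed columns next to them, which is the main difference from a plain `group by` aggregate.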

Statistical Aggregates
Correlation, linear regression family, covariance

Linear regression
Fitting of an ordinary-least-squares regression
line to a set of number pairs.
Frequently combined with the COVAR_POP,
COVAR_SAMP, and CORR functions.
Note: Statistics and SQL Analytics are included in Oracle
Database Standard Edition
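
To make the linear regression and covariance/correlation aggregates above more concrete, here is a small Python sketch of the arithmetic behind an ordinary-least-squares fit over a set of number pairs. The function name and the sample pairs are made up for illustration, and the code shows the standard formulas rather than any database's internal implementation.

```python
import math


def ols_fit(pairs):
    """Fit y = slope * x + intercept to (x, y) pairs by ordinary least squares.

    Hypothetical helper: returns slope, intercept, population covariance,
    and the Pearson correlation coefficient.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Population covariance and variances (divide by n, not n - 1).
    covar_pop = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    var_pop_x = sum((x - mean_x) ** 2 for x, _ in pairs) / n
    var_pop_y = sum((y - mean_y) ** 2 for _, y in pairs) / n
    if var_pop_x == 0 or var_pop_y == 0:
        raise ValueError("x and y must each vary for the fit to be defined")
    slope = covar_pop / var_pop_x
    intercept = mean_y - slope * mean_x
    corr = covar_pop / math.sqrt(var_pop_x * var_pop_y)
    return {"slope": slope, "intercept": intercept,
            "covar_pop": covar_pop, "corr": corr}


# Made-up pairs that lie exactly on y = 2x + 1.
fit = ols_fit([(1, 3), (2, 5), (3, 7), (4, 9)])
print(fit)
```

The slope is the covariance divided by the variance of x, which is why the regression functions are so often used together with the covariance and correlation aggregates.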

Descriptive Statistics
average, standard deviation, variance, min, max, median
(via percentile_cont), mode, group-by & roll-up
DBMS_STAT_FUNCS: summarizes numerical columns
of a table and returns count, min, max, range, mean,
stats_mode, variance, standard deviation, median,
quantile values, +/- n sigma values, top/bottom 5 values

Correlations
Pearson's correlation coefficients, Spearman's and
Kendall's (both nonparametric).

Cross Tabs
Enhanced with % statistics: chi squared, phi coefficient,
Cramer's V, contingency coefficient, Cohen's kappa

Hypothesis Testing
Student t-test, F-test, Binomial test, Wilcoxon Signed
Ranks test, Chi-square, Mann-Whitney test, Kolmogorov-Smirnov test, One-way ANOVA

Distribution Fitting
Kolmogorov-Smirnov Test, Anderson-Darling Test, Chi-Squared Test, Normal, Uniform, Weibull, Exponential

Pareto Analysis (documented)


80:20 rule, cumulative results table
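
As an illustration of the "cumulative results table" behind a Pareto (80:20) analysis, the following Python sketch sorts categories by contribution and accumulates their share of the total. The function name and the product figures are invented for the example; in the database the same table would typically be built with a cumulative window sum.

```python
def pareto_table(totals):
    """Return rows of (name, value, cumulative_percent), largest value first.

    Hypothetical helper for illustration; `totals` maps a category name
    to its numeric contribution.
    """
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total contribution must be positive")
    running = 0
    rows = []
    for name, value in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        running += value
        rows.append((name, value, round(100 * running / grand_total, 1)))
    return rows


# Made-up revenue per product.
table = pareto_table({"A": 50, "B": 30, "C": 15, "D": 5})
for row in table:
    print(row)
```

Reading down the last column shows how few categories are needed to reach a given share of the total, for example 80%.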

In-Database Statistics


OLAP

SQL Ad-Hoc
OLAP API

10g OLAP Option

DW, OLAP, DB
Ad-Hoc
SQL, OLAP API

Oracle OLAP Platform

Oracle HTML DB, OracleBI Reports, OracleBI Discoverer, OracleBI Spreadsheet Add-In, Oracle BI Beans
Oracle Enterprise Planning & Budgeting, Oracle Demand Planning
Database OLAP Option: Query, Analysis, Planning
Oracle Warehouse Builder, Analytic Workspace Manager

Case Study: Simple Queries

[Bar chart: time to build and time to execute simple queries, compared across Analytic Workspace, 14 MVs, 214 MVs, and 518 MVs]

Case Study: OLAP Queries

[Bar chart: time to build and time to execute OLAP queries, compared across Analytic Workspace, 14 MVs, 214 MVs, and 518 MVs]

Data Mining

Attribute Importance, Classification, Decision Trees, Clustering, Associations, Anomaly Detection

Data Mining

Churn, Basel II, Sarbanes-Oxley

Oracle Data Mining


Oracle mining platform
PL/SQL API
Java API
Oracle Data Miner (GUI)
Spreadsheet Add-In


Attribute importance
Classification, regression & prediction
Anomaly detection
Association rules
Clustering
Nonnegative matrix factorization
BLAST

Oracle Data Mining

Attribute Importance (A1 A2 A3 A4 A5 A6 A7)
Classification & Prediction
Regression

[Decision tree diagram: splits on Income (>$50K / <=$50K), Gender, Age (>35 / <=35), Status (Married / Single), and HH Size (>4 / <=4); leaves are Buy = 0 or Buy = 1]

Oracle Data Mining

Clustering
Association Rules
Feature Extraction (clustering, text mining): F1 F2 F3 F4

Oracle Data Mining 10g R2

Improve ease of use
GUI, Wizards, Mining
SQL & Java
BI

metagroup.com, METAspectrum 60.1. Copyright 2004 META Group, Inc. All rights reserved.


In-Database Analytics

Benefit

H/W, O/S
Grid, RAC, BI
SQL & PL/SQL
RTE
