
Best Practices for Analyzing Objects

Document

Best Practices for Analyzing Objects

Date: Monday, May 16, 2011

Copyright © 2007 TUSC Page 1 of 26


Document Title: Best Practices for Analyzing Objects

Document Filename: best_practices_for_dbms_stats.doc

Configuration
History

Version Date Applied Changes Author(s)


01.00 Feb 2008 Initial document Brian P Michael,
Sr. Consultant

Distribution
History

Version Date Name(s)


01.00




Table of Contents
1 PURPOSE OF THIS DOCUMENT..............................................................................................................................4
2 SUMMARY......................................................................................................................................................................4
3 TABLE AND INDEX STATISTICS ............................................................................................................................6
4 COLUMN STATISTICS................................................................................................................................................6
4.1 WHEN SHOULD HISTOGRAMS BE CREATED....................................................................................6
4.1.1 Example: Small Table ..............................................................................................................................7
4.1.2 Example: Unique/Primary Keys...............................................................................................................7
4.1.3 Example: MyWebSite Field......................................................................................................................7
4.1.4 Example: Age Field...................................................................................................................................7
4.1.5 Example: Name field.................................................................................................................................7
4.2 EVERY APPLICATION IS DIFFERENT..................................................................................................7
5 COLUMN WHERE CLAUSE USAGE........................................................................................................................8
5.1 USING SYS.COL_USAGE$.......................................................................................................................8
5.2 MAINTAINING SYS.COL_USAGE$........................................................................................................9
6 CPU COST MODELING...............................................................................................................................................9
7 DBMS_STATS...............................................................................................................................................................10
8 SETTING DBMS_STATS PARAMETERS...............................................................................................................11
8.1 USING DEFAULTS..................................................................................................................................12
8.2 GETTING PARAMS.................................................................................................................................12
8.3 CONSTANTS............................................................................................................................................12
8.4 ESTIMATE_PERCENT............................................................................................................................13
8.4.1 auto_sample_size....................................................................................................................................13
8.5 CASCADE.................................................................................................................................................13
8.6 METHOD_OPT.........................................................................................................................................14
8.7 DEGREE....................................................................................................................................................14
8.8 GRANULARITY.......................................................................................................................................14
9 COLLECTING TABLE STATISTICS.......................................................................................................................14
10 COLLECTING INDEX STATS................................................................................................................................14
11 COLLECTING COLUMN STATS AND HISTOGRAMS.....................................................................................15
11.1 METHOD_OPT.......................................................................................................................................15
11.2 METHOD_OPT SIZE..............................................................................................................................15
11.2.1 size N.....................................................................................................................................................15
11.2.2 repeat.....................................................................................................................................................15
11.2.3 auto........................................................................................................................................................15
11.2.4 skewonly...............................................................................................................................................15
11.3 METHOD_OPT EXAMPLES.................................................................................................................15
11.3.1 FOR ALL COLUMNS..........................................................................................................................15
11.3.2 FOR ALL COLUMNS SIZE 1 **** Note this is the default value.....................................................16
11.3.3 FOR ALL COLUMNS SIZE 254.........................................................................................................16
11.3.4 FOR ALL INDEXED COLUMNS.......................................................................................................16
11.3.5 FOR ALL INDEXED COLUMNS SIZE 1..........................................................................................16
11.3.6 FOR ALL HIDDEN COLUMNS.........................................................................................................16
11.3.7 FOR ALL HIDDEN COLUMNS SIZE 1.............................................................................................16
11.3.8 FOR COLUMNS COL_A, COL_B......................................................................................................16
11.3.9 FOR COLUMNS COL_A SIZE 1, COL_B SIZE 1.............................................................................16
11.3.10 FOR COLUMN COL_A SIZE 5, COL_B SIZE AUTO, COL_C SIZE 200.....................................16
11.3.11 FOR COLUMNS COL_A SIZE AUTO, COL_B SIZE AUTO.........................................................16
11.3.12 FOR ALL COLUMNS SIZE AUTO..................................................................................................16
11.3.13 FOR COLUMNS SIZE AUTO COL_A, COL_B, COL_C................................................................16
11.3.14 FOR ALL COLUMNS SIZE SKEWONLY.......................................................................................16
11.3.15 FOR ALL COLUMNS SIZE REPEAT..............................................................................................17
12 COLLECTING DICTIONARY AND FIXED OBJECT STATS...........................................................................17
12.1 FIXED OBJECTS STATS.......................................................................................................................17
12.2 DICTIONARY STATS - STATISTICS ON SYS, SYSTEM and OTHER ORACLE COMPONENTS..................17
13 COLLECTING CPU COST MODELING STATS.................................................................................................18
13.1 HOW DO I REVIEW CPU COST MODELING STATISTICS?...........................................................18
13.2 SYS.AUX_STATS$.................................................................................................................................18
13.3 HOW DO I COLLECT STATS FOR CPU COST MODELING?..........................................................19
13.4 WHEN DO I COLLECT NEW SYSTEM STATS?................................................................................19
13.5 SAVING AND RESTORE SYSTEM STATS........................................................................................19
13.6 VIEWING SAVED SYSTEM STATS....................................................................................................19
14 RETENTION OF PREVIOUSLY COLLECTED STATISTICS...........................................................................20
14.1 BACKING UP AND RESTORING STATISTICS USING STATTAB ................................................20
14.1.1 CREATING A STATTAB TABLE......................................................................................................20
14.1.2 SAVING OFF STATISTICS – DBMS_STATS.EXPORT_/IMPORT................................................20
14.1.2.1 Transferring Stats to Another Schema or Database............................................................................20
14.1.3 BACKING UP USAGE INFORMATION...........................................................................................20
14.1.4 VIEWING SAVED STATISTICS USING STATTAB.......................................................................21
14.1.4.1 Viewing Saved Table Statistics..........................................................................................................21
14.1.4.2 Viewing Saved Column Statistics......................................................................................................21
14.1.4.3 Viewing Saved Index Statistics..........................................................................................................21
14.1.4.4 Viewing Saved CPU Statistics...........................................................................................................22
14.1.4.5 sys.aux_stats$.....................................................................................................................................22
14.2 USING 10G RETENTION OF STATISTICS.........................................................................................23
14.2.1.1 Determining How far back we can restore from................................................................................23
14.2.1.2 Getting and Setting the Retention Time.............................................................................................23
14.2.2 RESTORING STATISTICS WITH 10G AUTO RETENTION..........................................................23
14.2.2.1 Restoring Table Stats.........................................................................................................................23
14.2.2.2 Restoring Dictionary Stats.................................................................................................................24
14.2.2.1 Restoring Database Stats....................................................................................................................24
14.2.2.2 Restoring Schema Stats......................................................................................................................24
15 AUTOMATED STATS JOB......................................................................................................................................24
16 LOCKING AND UNLOCKING STATISTIC COLLECTIONS...........................................................................25
17 LIMITATIONS OF DBMS_STATS..........................................................................................................................25
17.1 CHAINED ROWS ..................................................................................................................................25
17.2 VALIDATE STRUCTURE.....................................................................................................................25
18 APPENDIX..................................................................................................................................................................25
18.1 A Note from Metalink on Automatic Undo Retention.............................................................................25
18.2 BIBLIOGRAPHY....................................................................................................................................26


1 PURPOSE OF THIS DOCUMENT


This document describes best practices for the collection of statistics for the Cost Based Optimizer.

2 SUMMARY
The Oracle Cost Based Optimizer (CBO) is a core part of the Oracle technology stack and makes a significant
contribution to the overall performance of the database. The technology was originally obtained from Digital
Equipment Corp following Oracle's purchase of the Rdb database system in 1994. Since then it has been refined
and extended. With Oracle 10g, the original Rule Based Optimizer (RBO) is desupported. It is expected that in
future releases the RBO will disappear altogether and the CBO will be the only query optimization technology
available.
The Oracle Cost Based Optimizer relies on table and object statistics to determine the optimal path to use to fulfill a
user's query. In Oracle releases prior to 10g, there are two optimizers that Oracle can use to create execution
plans for queries. The RULE based optimizer, available in release 9i and lower, applies a set of fairly
simple rules, in a fixed order, to arrive at an execution plan.

The Cost Based Optimizer (CBO), which has been available since Oracle version 7.3, is the only optimizer available
from 10g onward. The CBO uses statistics captured about an object to estimate and choose the
execution plan with the cheapest cost.

In 9i and 10g, additional statistics about the machine, its CPUs and its I/O patterns can also be collected using the
CPU Cost Modeler.

To operate properly, the CBO must have accurate statistics to create the best, and cheapest, execution path for a
query. This white paper helps to clarify how to collect accurate statistics and which options to use.


3 TABLE AND INDEX STATISTICS


Oracle uses a number of statistics to determine the best execution path for any given query. Statistics such as the
number of rows in a table and the average row size are calculated during a table statistics collection.

4 COLUMN STATISTICS
When Oracle calculates the estimated cardinality of an execution path, it assumes that each distinct column
value points to the same number of rows as any other distinct column value.
If the data is highly skewed in favor of one column value over another, Oracle can use this information to obtain a closer
estimate of the number of rows that will be returned.

To map the skewness of a column, Oracle uses two types of distributions: Frequency and Height-Balanced
(or equi-depth). Oracle limits the number of histogram buckets for either distribution type to 254.

A frequency distribution models a precise histogram, recording exactly how many rows contain each single column
value. A frequency distribution can be created only when there are fewer than 255 distinct values for a
column. A frequency histogram can take two forms in Oracle. Each form will show up slightly differently when
querying histogram$. In form #1, Oracle creates one bucket for each distinct value, and places the exact
count of rows for that value in the bucket. In form #2, Oracle uses "bucket subtraction": each bucket
is labeled with the cumulative number of rows up to and including the current value, and stores the actual column value. In this
method, Oracle obtains the number of rows for each distinct value by subtracting the previous bucket number from the
current bucket number.
You can easily identify a "bucket subtraction" histogram because the bucket numbers usually go beyond 255.
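The two forms can be inspected directly in the data dictionary. As a sketch (MY_TABLE and MY_COLUMN are hypothetical names; substitute a table and column on which a histogram has been gathered), the cumulative endpoint numbers of a bucket-subtraction frequency histogram can be unwound like this:

```sql
-- Sketch only: view histogram buckets for a column.
-- For a bucket-subtraction frequency histogram, ENDPOINT_NUMBER is cumulative,
-- so the row count for each value is the difference from the previous bucket.
SELECT endpoint_value,
       endpoint_number,
       endpoint_number
         - LAG(endpoint_number, 1, 0)
             OVER (ORDER BY endpoint_number) AS rows_for_value
  FROM user_tab_histograms
 WHERE table_name  = 'MY_TABLE'
   AND column_name = 'MY_COLUMN'
 ORDER BY endpoint_number;
```

If each bucket instead holds the exact row count (form #1), ROWS_FOR_VALUE and ENDPOINT_NUMBER deltas will simply mirror one another.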

A Height-Balanced distribution (statistically known as an equi-depth histogram) obtains its model by
distributing the distinct column values as evenly as possible across a known number of buckets (hence equi-depth/height-
balanced).

As the number of distinct values approaches the number of rows, and when the number of rows is large, this model
becomes very inaccurate.

4.1 WHEN SHOULD HISTOGRAMS BE CREATED


Histograms should only be created when having a histogram in place is likely to change a potential
execution plan to another plan.

Here are a few examples, based upon the following table:


Create table mywebspace ( MyUniqueKey number,
name varchar2(20),
address varchar2(20),
DOB date,
age number,
city varchar2(30),
state varchar2(2),
county varchar2(30),
mywebsite varchar2(400)
);
Copyright  2007 TUSC Page 6 of 26
Best Practices for Analyzing Objects
Document

Create unique index mykey on mywebspace(myuniquekey);


Create index DOBidx on mywebspace(DOB);
Create index ageidx on mywebspace(age);

The mywebsite field always starts with “http://www.myspace.com/personalwebsites/”


4.1.1 Example: Small Table
The table is very small, say, 10 rows.

Since all the rows probably fit in one block, we would only ever do one I/O operation to retrieve this data. A
histogram would never be necessary and, in point of fact, indexes probably wouldn't be either.
4.1.2 Example: Unique/Primary Keys
Never create a histogram on any UNIQUE or PRIMARY KEY column. The data is 100% evenly
distributed, with each value occurring in exactly one row.
4.1.3 Example: MyWebSite Field
When Oracle creates a histogram on a VARCHAR2 field, only the first 32 characters of the value are used
for creating the histogram.

Since all mywebsite URLs start with exactly the same 40 characters, a histogram could not be used
effectively.

Note also that if UTF8 or other multibyte character sets are used, only the first 16 characters are used.
4.1.4 Example: Age Field
Since this table most likely describes humans, everyone in it is < 254 years old, so a frequency histogram is possible.
A histogram on age might very well be valuable if age varies widely overall but there is a huge
number of 18-year-olds in this table.

Even though an index exists on AGE, with a histogram in place the optimizer might decide to use a FULL table scan when we are
looking for 18-year-olds only.
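A sketch of gathering such a histogram on the mywebspace table defined above (the bucket request and sampling level are illustrative choices, not requirements):

```sql
-- Gather table statistics and request a histogram on the skewed AGE column only.
-- SIZE 254 asks for up to 254 buckets; with fewer than 255 distinct ages this
-- produces a frequency histogram.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,
    tabname          => 'MYWEBSPACE',
    estimate_percent => 100,
    method_opt       => 'FOR COLUMNS AGE SIZE 254',
    cascade          => TRUE);
END;
/
```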
4.1.5 Example: Name field
If the name field were queried as the only column in the where clause, a FULL TABLE SCAN or RANGE
scan would be used, with or without a histogram in place.

Assuming an index on the name field, and assuming the name values are fairly distinct across the data,
the number of distinct values will closely track the number of rows in the table, so a histogram probably wouldn't change the
execution plan or the cardinality estimate.

Therefore, there is no reason to create a histogram on this field.

4.2 EVERY APPLICATION IS DIFFERENT


Regardless of the examples above, every table and every column in each application responds differently to
histograms.

The only guaranteed approach to histogram creation is thorough analysis of the application and the execution
plans for each query.

5 COLUMN WHERE CLAUSE USAGE


Oracle tracks every column used in a where clause, persisted across database restarts, in the SYS.COL_USAGE$ table.
This table stores the object# (from dba_objects), the column# (intcol#, the column_id from dba_tab_columns)
and the timestamp at which the column was last used in a where clause.

The table also tracks how each column was used in the clause: as an equality (a = b), an equijoin (tablea.a =
tableb.b), a non-equijoin (tablea.a <> tableb.b), a range, a LIKE or an IS NULL predicate.

SQL> desc sys.col_usage$


Name Null? Type
----------------------------------------- -------- ----------------------------
OBJ# NUMBER
INTCOL# NUMBER
EQUALITY_PREDS NUMBER
EQUIJOIN_PREDS NUMBER
NONEQUIJOIN_PREDS NUMBER
RANGE_PREDS NUMBER
LIKE_PREDS NUMBER
NULL_PREDS NUMBER
TIMESTAMP DATE

Querying the table looks like the following:


OBJ# INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- ---------
72 1 444 0 0 0 0 0 17-DEC-06
72 2 444 0 0 0 0 0 17-DEC-06
72 3 444 0 0 0 0 0 17-DEC-06
73 1 444 0 0 0 0 0 17-DEC-06

The table can be joined to dba_objects and dba_tab_columns to see the table_name and column_names. Such a
query shows how each column is used in where clauses, and the last date the column was used in a where
clause.
5.1 USING SYS.COL_USAGE$
There are many ways this table can be used. I use the following query to help manually determine which fields
should have histograms collected, making sure not to collect histograms on any column that is UNIQUE or a
PRIMARY KEY.

SELECT TABLE_NAME, COLUMN_NAME, NUM_NULLS, NUM_DISTINCT
  FROM USER_TAB_COLUMNS
 WHERE (TABLE_NAME, COLUMN_NAME) IN
 (
   SELECT DISTINCT TABLE_NAME, COLUMN_NAME FROM (
     SELECT O.OBJECT_NAME TABLE_NAME, C.COLUMN_NAME, CU.TIMESTAMP LAST_USED
       FROM SYS.COL_USAGE$ CU, USER_OBJECTS O, USER_TAB_COLS C
      WHERE O.OBJECT_ID = CU.OBJ#
        AND C.COLUMN_ID = CU.INTCOL#
        AND C.TABLE_NAME = O.OBJECT_NAME
        AND C.DATA_TYPE NOT LIKE '%LOB%'
        AND (CU.EQUALITY_PREDS + CU.EQUIJOIN_PREDS + CU.NONEQUIJOIN_PREDS +
             CU.RANGE_PREDS + CU.LIKE_PREDS) <> 0
        AND CU.TIMESTAMP >= SYSDATE - 60
        AND C.COLUMN_ID <> 1
   ) COLUSAGE
   WHERE NOT EXISTS (
     SELECT 1 FROM USER_CONSTRAINTS C, USER_CONS_COLUMNS CC
      WHERE C.CONSTRAINT_TYPE IN ('P')
        AND CC.CONSTRAINT_NAME = C.CONSTRAINT_NAME
        AND CC.OWNER = C.OWNER
        AND C.TABLE_NAME = COLUSAGE.TABLE_NAME
        AND CC.TABLE_NAME = C.TABLE_NAME
        AND CC.COLUMN_NAME = COLUSAGE.COLUMN_NAME
   )
 )
 AND NUM_DISTINCT > 0
 ORDER BY TABLE_NAME, COLUMN_NAME

5.2 MAINTAINING SYS.COL_USAGE$


To consistently get good collections using "SIZE AUTO", very old information in the SYS.COL_USAGE$ table
must be purged. This table is not maintained properly by Oracle, and I have personally filed an enhancement request
for a dbms_stats call to maintain it.

Although Oracle does not support direct manipulation of the SYS tables, I have found that purging old information,
those records with a timestamp older than six months, helps both the results of the SIZE AUTO option and its
performance.
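As a sketch only, this is the purge just described. It is exactly the unsupported direct manipulation mentioned above, so back up the rows first and test outside production:

```sql
-- Unsupported: direct modification of a SYS dictionary table. Run as SYS,
-- after backing up the rows being removed.
-- Deletes usage records whose last use was more than ~6 months ago.
DELETE FROM sys.col_usage$
 WHERE timestamp < SYSDATE - 180;

COMMIT;
```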

6 CPU COST MODELING


The cost-based model without CPU statistics is based purely on I/O costing: the optimizer chooses the
plan with the lowest estimated number of I/Os.

Starting in 9i, the optimizer includes CPU Cost Modeling, which adds a CPU component to the CBO costing and refines
the I/O costs based upon actual hardware response times for single block and multiblock reads.

In 10g, CPU Cost Modeling is turned on by default, although it uses defaults supplied by Oracle until statistics are
gathered by the DBA.

Once CPU Cost modeling is in place, the optimizer uses the following formula for costing execution plans:

The costing model is a formula that calculates the cost of any statement:

Cost = (#SRds * sreadtim + #MRds * mreadtim + #CPUCycles / cpuspeed) / sreadtim

where:
• #SRds is the number of single block reads
• #MRds is the number of multi block reads
• #CPUCycles is the number of CPU cycles
• sreadtim is the single block read time
• mreadtim is the multi block read time
• cpuspeed is the number of CPU cycles per second


7 DBMS_STATS
DBMS_STATS is the package used to generate cost based optimizer statistics for databases.
The package can be broken down into the 6 categories below, with their associated procedures and functions.

• GATHER Procedures to GATHER statistics


GATHER_DATABASE_STATS Procedures
GATHER_DICTIONARY_STATS Procedure
GATHER_FIXED_OBJECTS_STATS Procedure
GATHER_INDEX_STATS Procedure
GATHER_SCHEMA_STATS Procedures
GATHER_SYSTEM_STATS Procedure
GATHER_TABLE_STATS Procedure

• DELETE Procedure to DELETE generated statistics


DELETE_COLUMN_STATS Procedure
DELETE_DATABASE_STATS Procedure
DELETE_DICTIONARY_STATS Procedure
DELETE_FIXED_OBJECTS_STATS Procedure
DELETE_INDEX_STATS Procedure
DELETE_SCHEMA_STATS Procedure
DELETE_SYSTEM_STATS Procedure
DELETE_TABLE_STATS Procedure

• RETENTION Procedures to SAVE, RESTORE and TRANSFER statistics


RESTORE_DICTIONARY_STATS Procedure
RESTORE_FIXED_OBJECTS_STATS Procedure
RESTORE_SCHEMA_STATS Procedure
RESTORE_SYSTEM_STATS Procedure
RESTORE_TABLE_STATS Procedure

EXPORT_COLUMN_STATS Procedure
EXPORT_DATABASE_STATS Procedure
EXPORT_DICTIONARY_STATS Procedure
EXPORT_FIXED_OBJECTS_STATS Procedure
EXPORT_INDEX_STATS Procedure
EXPORT_SCHEMA_STATS Procedure
EXPORT_SYSTEM_STATS Procedure
EXPORT_TABLE_STATS Procedure

IMPORT_COLUMN_STATS Procedure
IMPORT_DATABASE_STATS Procedure
IMPORT_DICTIONARY_STATS Procedure
IMPORT_FIXED_OBJECTS_STATS Procedure
IMPORT_INDEX_STATS Procedure
IMPORT_SCHEMA_STATS Procedure
IMPORT_SYSTEM_STATS Procedure
IMPORT_TABLE_STATS Procedure

CREATE_STAT_TABLE Procedure
DROP_STAT_TABLE Procedure
PURGE_STATS Procedure
GET_STATS_HISTORY_RETENTION Function
GET_STATS_HISTORY_AVAILABILITY Function
ALTER_STATS_HISTORY_RETENTION Procedure

• LOCKING Procedures to LOCK and UNLOCK statistics


LOCK_SCHEMA_STATS Procedure
LOCK_TABLE_STATS Procedure

UNLOCK_SCHEMA_STATS Procedure
UNLOCK_TABLE_STATS Procedure

• DEFAULTS Procedures to Modify Package DEFAULTS


RESET_PARAM_DEFAULTS Procedure
SET_PARAM Procedure
GET_PARAM Function

• MANUAL Procedures to Manually create or manipulate statistics


PREPARE_COLUMN_VALUES Procedures
PREPARE_COLUMN_VALUES_NVARCHAR2 Procedure
PREPARE_COLUMN_VALUES_ROWID Procedure

SET_COLUMN_STATS Procedures
SET_INDEX_STATS Procedures
SET_SYSTEM_STATS Procedure
SET_TABLE_STATS Procedure

GET_COLUMN_STATS Procedures
GET_INDEX_STATS Procedures
GET_SYSTEM_STATS Procedure
GET_TABLE_STATS Procedure
GENERATE_STATS Procedure

8 SETTING DBMS_STATS PARAMETERS


In 10G, default parameters can be set for the database. When parameters are specified in the call to dbms_stats,
those parameters override the defaults previously set.

Recommended Defaults:
DBMS_STATS.SET_PARAM('CASCADE','TRUE');
DBMS_STATS.SET_PARAM('ESTIMATE_PERCENT','100');
DBMS_STATS.SET_PARAM('DEGREE','NULL');
DBMS_STATS.SET_PARAM('METHOD_OPT','FOR ALL COLUMNS SIZE AUTO');
DBMS_STATS.SET_PARAM('NO_INVALIDATE','FALSE');
DBMS_STATS.SET_PARAM('GRANULARITY','ALL');
DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET','AUTO');

AUTOSTATS_TARGET accepts 3 possible values:


DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET','ALL');
DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET','ORACLE');
DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET','AUTO');

8.1 USING DEFAULTS


If your 10g defaults are set, calls to dbms_stats can be made very simply. For example, to collect schema
stats with defaults:

DBMS_STATS.GATHER_SCHEMA_STATS (OWNNAME=>'SOME_SCHEMA');

In any case, the full options can be specified:

DBMS_STATS.GATHER_SCHEMA_STATS(
    OWNNAME=>'SOME_SCHEMA',
    ESTIMATE_PERCENT=>100,
    METHOD_OPT=>'FOR ALL COLUMNS SIZE 1',
    DEGREE=>4,
    GRANULARITY=>'ALL',
    CASCADE=>TRUE
);

8.2 GETTING PARAMS


You can use a select statement to get the PARAMS.

select
'AUTOSTATS_TARGET:', dbms_stats.get_param('AUTOSTATS_TARGET'),
'GRANULARITY:', dbms_stats.get_param('GRANULARITY'),
'CASCADE:', dbms_stats.get_param('CASCADE'),
'ESTIMATE_PERCENT:', dbms_stats.get_param('ESTIMATE_PERCENT'),
'DEGREE:', dbms_stats.get_param('DEGREE'),
'METHOD_OPT:', dbms_stats.get_param('METHOD_OPT'),
'NO_INVALIDATE:',dbms_stats.get_param('NO_INVALIDATE')
from dual
/

8.3 CONSTANTS

Use the following constant to indicate that auto-sample size algorithms should be used:

AUTO_SAMPLE_SIZE CONSTANT NUMBER;

The constant used to determine the system default degree of parallelism, based on the initialization
parameters, is:

DEFAULT_DEGREE CONSTANT NUMBER;

Use the following constant to let Oracle select the degree of parallelism based on size of the
object, number of CPUs and initialization parameters:

AUTO_DEGREE CONSTANT NUMBER;

Use the following constant to let Oracle decide whether to collect statistics for indexes or not:

AUTO_CASCADE CONSTANT BOOLEAN;

Use the following constant to let Oracle decide when to invalidate dependent cursors:

AUTO_INVALIDATE CONSTANT BOOLEAN;
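A sketch showing these constants passed to a gather call (schema and table names are hypothetical):

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SOME_SCHEMA',
    tabname          => 'SOME_TABLE',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,  -- Oracle picks the sample size
    degree           => DBMS_STATS.AUTO_DEGREE,       -- Oracle picks the parallelism
    cascade          => DBMS_STATS.AUTO_CASCADE,      -- Oracle decides about indexes
    no_invalidate    => DBMS_STATS.AUTO_INVALIDATE);  -- Oracle decides on cursor invalidation
END;
/
```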

8.4 ESTIMATE_PERCENT

ESTIMATE_PERCENT specifies what percentage of the table should be sampled to obtain the statistics.
Higher sampling percentages, up to 100%, are best, but there are many documents that say collecting statistics on
only 10-30% of the table is sufficient.

For most tables, especially tables/columns with fewer than 255 distinct values, it is quite important to collect
statistics based upon 100% of the data in the table.

This is very, very important for the creation of frequency based histograms on columns with < 255 distinct
values. If a value is missing from the sampled rows, it will not get mapped into a frequency distribution.

For extremely large tables, and those tables where a sampling of the data will give a very good statistical
representation of the table, estimating the statistics at lower values can provide a good result in a more efficient
manner.

I tend to recommend a 100% estimate of the data on small tables always, and on databases where
collecting at 100% can be done during the appropriate window. For very large objects, test at different levels. Keep
in mind that a table with 100 million rows of data probably won't shift its statistical representation readily.

8.4.1 auto_sample_size
If this value is used, Oracle determines the best sample size for an object's statistics while performing the
collection. There are different opinions as to what is best for a database, and every application is
different.

I recommend setting ESTIMATE_PERCENT to 100% on all databases where collection at 100% is possible
within the scheduled job's window. When the collections cannot be finished within the window, I suggest
testing AUTO_SAMPLE_SIZE to see if good statistical measures can be obtained for your application.
An alternative to using AUTO_SAMPLE_SIZE is to hard-set ESTIMATE_PERCENT or to break up the
collection into separate jobs.

8.5 CASCADE

When CASCADE is set to TRUE, the statistics will also be collected on all indexes on this table, but, parallel index
statistics creation CAN NOT be used.

I prefer setting CASCADE=FALSE when doing very large tables, and calling the
DBMS_STATS.GATHER_INDEX_STATS specifically for those objects.
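A sketch of that two-step approach, with illustrative object names, might look like:

```sql
BEGIN
  -- Gather table statistics without touching the indexes
  DBMS_STATS.GATHER_TABLE_STATS(
    OWNNAME => 'ABC', TABNAME => 'BIG_TAB',
    ESTIMATE_PERCENT => 100, CASCADE => FALSE);
  -- Then gather each index's statistics separately, in parallel
  DBMS_STATS.GATHER_INDEX_STATS(
    OWNNAME => 'ABC', INDNAME => 'BIG_TAB_IX1', DEGREE => 4);
END;
/
```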

8.6 METHOD_OPT

The METHOD_OPT parameter tells the DBMS_STATS routine whether or not to create histograms for table
columns and how to go about doing it.

The default value for METHOD_OPT calculates column statistics with no histograms.

See the discussion for COLUMN STATISTICS below.

8.7 DEGREE

The DEGREE option sets the degree of parallelism for the collection. Since collections typically run in serial, set
this to the number of CPUs in the system, provided the system is normally quiet during statistics collection.
Otherwise, set it to a reasonable number so as not to over-parallelize the collection.

8.8 GRANULARITY

The GRANULARITY option, which applies only to partitioned and sub-partitioned tables, defines the level at which
statistics are collected.

There are 6 levels: AUTO, ALL, GLOBAL, PARTITION, SUBPARTITION, GLOBAL AND PARTITION.

When using partitions, be very careful about setting METHOD_OPT.

I recommend using multiple passes of collection on large, partitioned, objects.


Collect GLOBAL statistics on the named table, with or without histograms (METHOD_OPT)
LOOP through sub-partitions
    Collect SUBPARTITION statistics with the default METHOD_OPT
LOOP through partitions specifically
    Collect PARTITION statistics individually with the default METHOD_OPT

By breaking the job down, I find the collections finish faster, and with fewer problems, especially with sort space.
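The multi-pass outline above can be sketched in PL/SQL as follows (owner and table names are illustrative; the sub-partition pass would follow the same pattern using dba_tab_subpartitions):

```sql
BEGIN
  -- Pass 1: GLOBAL statistics, with or without histograms
  DBMS_STATS.GATHER_TABLE_STATS(
    OWNNAME => 'ABC', TABNAME => 'BIG_TAB',
    GRANULARITY => 'GLOBAL',
    METHOD_OPT  => 'FOR ALL COLUMNS SIZE AUTO');
  -- Pass 2: each partition individually, default METHOD_OPT
  FOR p IN (SELECT partition_name
              FROM dba_tab_partitions
             WHERE table_owner = 'ABC'
               AND table_name  = 'BIG_TAB') LOOP
    DBMS_STATS.GATHER_TABLE_STATS(
      OWNNAME => 'ABC', TABNAME => 'BIG_TAB',
      PARTNAME => p.partition_name,
      GRANULARITY => 'PARTITION');
  END LOOP;
END;
/
```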

9 COLLECTING TABLE STATISTICS


Table statistics should always be collected at the partition and sub-partition level when applicable.
To collect table level statistics, use DBMS_STATS.GATHER_TABLE_STATS.

DBMS_STATS.GATHER_TABLE_STATS (OWNNAME=>'ABC', TABNAME=>'MY_TAB', ESTIMATE_PERCENT=>100,
                               METHOD_OPT=>'FOR ALL COLUMNS SIZE AUTO');

10 COLLECTING INDEX STATS



As with table statistics, options exist for setting DEGREE, GRANULARITY, and ESTIMATE_PERCENT, as does
the ability to gather at the partition and sub-partition layers.

On very large databases, I recommend breaking the jobs down, manually calling GATHER_INDEX_STATS as
appropriate.
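A sketch of a stand-alone, partition-level index collection (object and partition names are illustrative):

```sql
BEGIN
  DBMS_STATS.GATHER_INDEX_STATS(
    OWNNAME  => 'ABC',
    INDNAME  => 'BIG_TAB_IX1',
    PARTNAME => 'P_2008_01',      -- one partition at a time
    ESTIMATE_PERCENT => 100,
    DEGREE   => 4);
END;
/
```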

11 COLLECTING COLUMN STATS AND HISTOGRAMS


11.1 METHOD_OPT
The METHOD_OPT parameter of DBMS_STATS.GATHER_TABLE_STATS allows for refinement of
histogram collection. Many articles tell you to use AUTO all the time, but I find that each
application tends to be different.

11.2 METHOD_OPT SIZE


There are 4 size options available.
11.2.1 size N
Using SIZE with a number other than 1 will create a histogram on the column with “up to” N buckets.
11.2.2 repeat
Oracle will “refresh” the current column statistics with the same number of buckets as currently used.
11.2.3 auto
Oracle will choose the number of buckets, including NOT creating a histogram, based upon the where
clause usage data stored in sys.col_usage$ AND whether this column’s data is highly skewed.

AUTO is Oracle’s preferred method, although it is not the default.

See section below on Sys.col_usage$

11.2.4 skewonly
Oracle will choose the number of buckets, including NOT creating a histogram, based solely upon whether this
column’s data is highly skewed. It will not look at the column’s where clause usage.

The difference between AUTO and SKEWONLY is simple: SKEWONLY does not review sys.col_usage$ and
looks only at skewness, whereas AUTO investigates both skewness and usage.
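The two settings side by side, as a sketch (table name illustrative):

```sql
BEGIN
  -- AUTO: histograms decided from both data skew and the recorded
  -- predicate usage in sys.col_usage$
  DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'ABC', TABNAME => 'MY_TAB',
    METHOD_OPT => 'FOR ALL COLUMNS SIZE AUTO');

  -- SKEWONLY: histograms decided from data skew alone
  DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'ABC', TABNAME => 'MY_TAB',
    METHOD_OPT => 'FOR ALL COLUMNS SIZE SKEWONLY');
END;
/
```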

11.3 METHOD_OPT EXAMPLES


Below is a description of the different combinations in which this parameter can be used.
Note carefully that when a METHOD_OPT parameter is given but no SIZE value is specified, SIZE 75 is the
default value.

11.3.1 FOR ALL COLUMNS


COLLECTS COLUMN STATS AND DEFAULT 75 BUCKET HISTOGRAMS FOR EVERY COLUMN
IN THE TABLE

11.3.2 FOR ALL COLUMNS SIZE 1 **** Note this is the default value
COLLECTS COLUMN STATS FOR EVERY COLUMN IN THE TABLE, NO HISTOGRAMS
11.3.3 FOR ALL COLUMNS SIZE 254
COLLECTS COLUMN STATS FOR EVERY COLUMN IN THE TABLE, AND UP TO 254 BUCKET
HISTOGRAMS
11.3.4 FOR ALL INDEXED COLUMNS
COLLECTS COLUMN STATS FOR INDEXED COLUMNS ONLY, 75 BUCKET HISTOGRAMS
11.3.5 FOR ALL INDEXED COLUMNS SIZE 1
COLLECTS COLUMN STATS FOR INDEXED COLUMNS ONLY, NO HISTOGRAMS
11.3.6 FOR ALL HIDDEN COLUMNS
COLLECTS STATS ON HIDDEN COLUMNS FOR FUNCTION BASED INDEXES, 75 BUCKET
HISTOGRAMS
11.3.7 FOR ALL HIDDEN COLUMNS SIZE 1
COLLECTS COLUMN STATS FOR HIDDEN COLUMNS FOR FUNCTION BASED INDEXES, NO
HISTOGRAMS
11.3.8 FOR COLUMNS COL_A, COL_B
COLLECTS COLUMN STATS FOR EACH COLUMN LISTED, DEFAULT 75 BUCKET HISTOGRAM
11.3.9 FOR COLUMNS COL_A SIZE 1, COL_B SIZE 1
COLLECTS COLUMN STATS FOR EACH COLUMN LISTED, NO HISTOGRAM
11.3.10 FOR COLUMNS COL_A SIZE 5, COL_B SIZE AUTO, COL_C SIZE 200
COLLECTS COLUMN STATS FOR EACH COLUMN LISTED AND WITH SPECIFIC HISTOGRAM
BUCKET SIZES GIVEN FOR EACH COLUMN
11.3.11 FOR COLUMNS COL_A SIZE AUTO, COL_B SIZE AUTO
COLLECTS COLUMN STATS FOR EACH COLUMN LISTED, HISTOGRAMS BASED ON USAGE
AND SKEW
11.3.12 FOR ALL COLUMNS SIZE AUTO
COLLECTS COLUMN STATS FOR ALL COLUMNS IN THE TABLE, HISTOGRAMS BASED ON
SKEW AND WORKLOAD
11.3.13 FOR COLUMNS SIZE AUTO COL_A, COL_B, COL_C
COLLECTS COLUMN STATS FOR EACH COLUMN LISTED, HISTOGRAMS BASED ON USAGE
AND SKEW
11.3.14 FOR ALL COLUMNS SIZE SKEWONLY
COLLECTS COLUMN STATS FOR ALL COLUMNS IN THE TABLE, HISTOGRAMS CREATED
FOR SKEWED DATA ONLY.

11.3.15 FOR ALL COLUMNS SIZE REPEAT
RE-COLLECT COLUMN STATISTICS ON ALL COLUMNS THAT CURRENTLY HAVE
STATISTICS. RE-COLLECT HISTOGRAMS ON THOSE COLUMNS THAT CURRENTLY HAVE
HISTOGRAMS, AND ALSO USE THE SAME BUCKET SIZE.

12 COLLECTING DICTIONARY AND FIXED OBJECT STATS


Oracle recommends collecting both fixed-object and dictionary statistics.

In Oracle 9i, there was much debate about whether to gather stats on the SYS objects. Some said yes, some said no. My
personal experience was that collecting schema-level stats on SYS in 9i did not work well and was a bad idea.

In 10g, it is quite necessary to collect statistics on all components.

12.1 FIXED OBJECTS STATS


To collect statistics on the X$ (fixed objects), run the following:
DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;

I suggest doing this once a database has been fully populated, and any time a significant number of schema
objects has been created.

To see your fixed objects stats, join the v$fixed_table view to tab_stats$.

select b.name, a.obj#, a.rowcnt, a.blkcnt, a.analyzetime, a.samplesize
from tab_stats$ a, v$fixed_table b
where a.obj# = b.object_id;

In addition, you can delete, export and import fixed objects stats from one system to another using
dbms_stats.delete_fixed_objects_stats, dbms_stats.export_fixed_objects_stats and
dbms_stats.import_fixed_objects_stats.
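As a sketch, saving the fixed-object stats to a previously created STATTAB under an illustrative STATID:

```sql
BEGIN
  -- Export the current fixed-object stats for safekeeping
  DBMS_STATS.EXPORT_FIXED_OBJECTS_STATS(
    STATTAB => 'STATTAB', STATID => 'FIXED_BASE', STATOWN => 'MYSCHEMA');
  -- After copying the STATTAB table to another database, import there:
  -- DBMS_STATS.IMPORT_FIXED_OBJECTS_STATS(
  --   STATTAB => 'STATTAB', STATID => 'FIXED_BASE', STATOWN => 'MYSCHEMA');
END;
/
```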

12.2 DICTIONARY STATS - STATISTICS ON SYS, SYSTEM and OTHER ORACLE COMPONENTS
There are many schemas that ship with the database today: drsys, cmsys, mdsys, wmsys, etc.

The documentation suggests that a call to dbms_stats.gather_dictionary_stats with no arguments will
collect stats on all SYS, SYSTEM, and other dictionary schemas as listed in the “SCHEMA” column of
dba_registry.

I have NOT found this to be the case.

It appears that even though the stated default for the “OPTIONS” parameter is “GATHER”, only explicitly
setting OPTIONS obtains the correct collection; leaving OPTIONS at its default does not.

The documentation also states that you can individually collect statistics on the other components by
specifically giving the comp_id (component id) from dba_registry to the call to gather stats. Without
specifying the “OPTIONS” parameter, this also does not work as expected.

begin
  for c1rec in (select comp_id from dba_registry) loop
    DBMS_STATS.GATHER_DICTIONARY_STATS(
      COMP_ID => c1rec.COMP_ID, OPTIONS => 'GATHER'
    );
  end loop;
end;
/

If selecting the LAST_ANALYZED column from dba_tables shows that the date is still old, verify that the
schema is listed in dba_registry. If it is not, try performing a dbms_stats.gather_schema_stats.

13 COLLECTING CPU COST MODELING STATS

13.1 HOW DO I REVIEW CPU COST MODELING STATISTICS?

CPU Cost modeling information can be viewed using sys.aux_stats$.


These elements can also be manipulated directly, which I have found useful for testing different
scenarios.

To determine if CPU Cost Modeling is active, verify that data is populated in this table.
If cpuspeed is NOT populated, but cpuspeednw IS populated, then CPU Cost modeling is turned on, but
using Oracle Defaults.

For CPU Cost Modeling to function properly, workload statistics must be captured using
dbms_stats.gather_system_stats.
13.2 SYS.AUX_STATS$

Each record in the sys.aux_stats$ table holds a value for the CPU statistics.
The values are defined below:

• iotfrspeed - I/O transfer speed in bytes for each millisecond
• ioseektim - seek time + latency time + operating system overhead time, in milliseconds
• sreadtim - average time to read a single block (random read), in milliseconds
• mreadtim - average time to read an mbrc block at once (sequential read), in milliseconds
• cpuspeed - average number of CPU cycles for each second, in millions, captured for the
workload (statistics collected using 'INTERVAL' or 'START' and 'STOP' options)
• cpuspeednw - average number of CPU cycles for each second, in millions, captured for the
noworkload (statistics collected using 'NOWORKLOAD' option)
• mbrc - average multiblock read count for sequential read, in blocks
• maxthr - maximum I/O system throughput, in bytes/second
• slavethr - average slave I/O throughput, in bytes/second
(From the 10g manual.)

13.3 HOW DO I COLLECT STATS FOR CPU COST MODELING?

A single call to dbms_stats.gather_system_stats with an appropriate interval of a few hours during an average
workload is all that is required to collect statistics.

Statistics are gathered using the DBMS_STATS.GATHER_SYSTEM_STATS call using an interval period, or
manually started and stopped.

To collect statistics for a 2 hour interval, run the following:


DBMS_STATS.GATHER_SYSTEM_STATS (gathering_mode=>'INTERVAL', interval=>120);

13.4 WHEN DO I COLLECT NEW SYSTEM STATS?

CPU “system” stats should be re-collected whenever the average workload for the database shifts, and whenever
there is a change to CPUs and/or I/O hardware and patterns.

This includes most hardware upgrades, including adding HBA cards, NIC cards, CPUs, disk drives, and any external
RAID or SAN hardware and/or configuration change that could have an impact on performance.

13.5 SAVING AND RESTORE SYSTEM STATS

In 9i, CPU cost modeling statistics can be exported using dbms_stats.export_system_stats and then re-imported using
dbms_stats.import_system_stats.

In 10g, system stats can also be restored from recent past collections (as available) using
dbms_stats.restore_system_stats.

13.6 VIEWING SAVED SYSTEM STATS


Collecting stats and saving different versions is convenient when workloads may shift. It is then quite easy to
compare different versions to see the effects of hardware or workload changes on the system stats.

Provided you have exported system stats using dbms_stats.export_system_stats, and provided a “STATID” for that
collection, the following view can be used to view the contents of those stats.

CREATE OR REPLACE VIEW STATTAB_cpu_stats
AS
SELECT
CPU.STATID,
CPU.C1 STATUS,
CPU.C2 START_TIME,
CPU.C3 STOP_TIME,
CPU.N3 CPUSPEED,
CPU.N11 MBRC,
CASE CPU.C4 WHEN 'CPU_SERIO' THEN CPU.N1 END SREADTIME,
CASE CPU.C4 WHEN 'CPU_SERIO' THEN CPU.N2 END MREADTIME,
CASE PARIO.C4 WHEN 'PARIO' THEN PARIO.N1 END MAXTHR,
CASE PARIO.C4 WHEN 'PARIO' THEN PARIO.N2 END SLAVTHR
FROM
STATTAB CPU, STATTAB PARIO
WHERE CPU.TYPE= 'S'

AND CPU.C4 = 'CPU_SERIO'
AND CPU.STATID = PARIO.STATID
AND PARIO.C4 = 'PARIO';

14 RETENTION OF PREVIOUSLY COLLECTED STATISTICS

Time and time again, I encounter customers with problems where the database has become very, very slow.
Analysis reveals that someone recollected statistics the night before, the explain plans are not what they were
yesterday, and there is no backup of yesterday’s statistics. This was a major problem in 8i and 9i, but it has mostly
been erased by the default stats retention history in 10g.

ALWAYS, ALWAYS, ALWAYS: BACK UP YOUR STATISTICS BEFORE COLLECTION.

Oracle provides a table to store copies of statistics, which is easily managed using DBMS_STATS.
Once it is in place, a simple call to DBMS_STATS.EXPORT_xxx_STATS should be made before each collection begins.

In 10g, a default of 31 days of statistics history is kept. I still use the EXPORT functions even in 10g.

14.1 BACKING UP AND RESTORING STATISTICS USING STATTAB


14.1.1 CREATING A STATTAB TABLE
Oracle provides a table to store copies of statistics. This is extremely convenient.

DBMS_STATS.CREATE_STAT_TABLE(ownname=>'MYSCHEMA', stattab=>'STATTAB', tblspace=>'TABLESPACE_NAME');

14.1.2 SAVING OFF STATISTICS – DBMS_STATS.EXPORT_/IMPORT


Use a simple call to DBMS_STATS.EXPORT_xxx_STATS to export all database statistics, or those of a
table, index, partition, etc., and even the CPU stats.
14.1.2.1 Transferring Stats to Another Schema or Database
To move statistics to another database, export those statistics using
DBMS_STATS.EXPORT_xxxx_STATS. Then copy the table to another database using exp, imp,
datapump or db link. Then use DBMS_STATS.IMPORT_xxx_STATS to import those statistics to the
data dictionary.
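A sketch of the whole sequence (the schema, table, and STATID names are illustrative):

```sql
-- On the source database: export the table's stats into the stat table
BEGIN
  DBMS_STATS.EXPORT_TABLE_STATS(
    OWNNAME => 'ABC', TABNAME => 'MY_TAB',
    STATTAB => 'STATTAB', STATID => 'PROD_COPY', STATOWN => 'MYSCHEMA');
END;
/

-- Copy MYSCHEMA.STATTAB to the target database (exp/imp, datapump,
-- or db link), then on the target:
BEGIN
  DBMS_STATS.IMPORT_TABLE_STATS(
    OWNNAME => 'ABC', TABNAME => 'MY_TAB',
    STATTAB => 'STATTAB', STATID => 'PROD_COPY', STATOWN => 'MYSCHEMA');
END;
/
```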

If the schema is different or some tables in the new database don’t exist, YOU MUST manually
manipulate the STATTAB table. To modify the schema these stats are appropriate for, update
STATTAB, setting “C5” column as appropriate.

Delete rows for columns or tables that do not belong. Use the views below to inspect
the statistics in the STATTAB table.
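A sketch of that manual manipulation (the STATID and object names are illustrative; C5 holds the owner and C1 the object name):

```sql
-- Re-point the saved statistics at the target schema
UPDATE stattab SET c5 = 'NEWSCHEMA' WHERE statid = 'PROD_COPY';

-- Remove rows for a table that does not exist in the target
DELETE FROM stattab
 WHERE statid = 'PROD_COPY' AND c1 = 'TABLE_NOT_IN_TARGET';

COMMIT;
```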

14.1.3 BACKING UP USAGE INFORMATION


There are also four other tables that are worth storing copies of: SYS.COL_USAGE$, SYS.AUX_STATS$,
SYS.HISTOGRAM$, and V$OBJECT_USAGE.

It is good practice to make a backup copy of these tables on a regular basis, especially before a
large collection.
14.1.4 VIEWING SAVED STATISTICS USING STATTAB
The best way to understand and use the statistics in the STATTAB table is to use the views below.

To use the following views, add a where clause to select from the appropriate “STATID” that you wish to
view.
14.1.4.1 Viewing Saved Table Statistics
CREATE OR REPLACE VIEW STATTAB_TABLE_STATS
AS
SELECT
STATID,
C5 OWNER,
C1 TABLE_NAME,
C2 PARTITION_NAME,
C3 SUBPART_NAME,
N1 NUM_ROWS,
N2 NUM_BLOCKS,
N3 AVG_ROW_LEN,
N4 SAMPLE_SIZE
FROM
STATTAB
WHERE TYPE= 'T';

14.1.4.2 Viewing Saved Column Statistics


CREATE OR REPLACE VIEW STATTAB_COLUMN_STATS
AS
SELECT
STATID,
C5 OWNER,
C1 TABLE_NAME,
C2 PARTITION_NAME,
C3 SUBPART_NAME,
C4 COLUMN_NAME,
N1 NUM_DISTINCT,
N2 DENSITY,
N4 SAMPLE_SIZE,
N5 NUM_NULLS,
N6 LO_VAL,
N7 HI_VAL,
N8 AVG_COL_LEN,
N10 ENDPOINT_NUMBER,
N11 ENDPOINT_VALUE
FROM
STATTAB
WHERE TYPE= 'C';

14.1.4.3 Viewing Saved Index Statistics


CREATE OR REPLACE VIEW STATTAB_INDEX_STATS
AS
SELECT
STATID,
C5 OWNER,
C1 INDEX_NAME,
C2 PARTITION_NAME,
C3 SUBPART_NAME,
N1 NUM_ROWS,
N2 LEAF_BLOCKS,
N3 DISTINCT_KEYS,
N4 LEAF_BLOCKS_PER_KEY,
N5 DATA_BLOCKS_PER_KEY,
N6 CLUSTERING_FACTOR,
N7 BLEVEL,
N8 SAMPLE_SIZE
FROM
STATTAB
WHERE TYPE= 'I';

14.1.4.4 Viewing Saved CPU Statistics


CREATE OR REPLACE VIEW STATTAB_cpu_stats
AS
SELECT
CPU.STATID,
CPU.C1 STATUS,
CPU.C2 START_TIME,
CPU.C3 STOP_TIME,
CPU.N3 CPUSPEED,
CPU.N11 MBRC,
CASE CPU.C4 WHEN 'CPU_SERIO' THEN CPU.N1 END SREADTIME,
CASE CPU.C4 WHEN 'CPU_SERIO' THEN CPU.N2 END MREADTIME,
CASE PARIO.C4 WHEN 'PARIO' THEN PARIO.N1 END MAXTHR,
CASE PARIO.C4 WHEN 'PARIO' THEN PARIO.N2 END SLAVTHR
FROM
STATTAB CPU, STATTAB PARIO
WHERE CPU.TYPE= 'S'
AND CPU.C4 = 'CPU_SERIO'
AND CPU.STATID = PARIO.STATID
AND PARIO.C4 = 'PARIO';


14.2 USING 10G RETENTION OF STATISTICS

In 10g, backups of statistics are kept automatically.

14.2.1.1 Determining How Far Back We Can Restore From


Oracle 10g maintains availability of statistics for a default period of 31 days.

To identify the retention period and the availability of statistics, queries can be run against dual using
dbms_stats.GET_STATS_HISTORY_AVAILABILITY and
dbms_stats.GET_STATS_HISTORY_RETENTION.

SQL> select dbms_stats.get_stats_history_availability from dual;

GET_STATS_HISTORY_AVAILABILITY
---------------------------------------------------------------------------
16-DEC-07 03.34.26.921000000 PM -06:00

14.2.1.2 Getting and Setting the Retention Time


The default retention time can be changed.
To view the current retention time, the following query can be used:
SQL> select dbms_stats.get_stats_history_retention from dual;

GET_STATS_HISTORY_RETENTION
---------------------------
31

To modify the retention time, run the following:


SQL> exec dbms_stats.alter_stats_history_retention(# of Days);

14.2.2 RESTORING STATISTICS WITH 10G AUTO RETENTION


Below are examples of restoring statistics. Each call takes only a few parameters: mainly the object's
owner, the object name, and the timestamp you wish to restore from.
14.2.2.1 Restoring Table Stats
begin
dbms_stats.restore_table_stats (
'ESCROW1',
'ED_FILE_EXCEPTS',
'01-JAN-08 10.00.00.000000000 AM -06:00');
end;

14.2.2.2 Restoring Dictionary Stats
begin
dbms_stats.restore_dictionary_stats ( '01-JAN-08 10.00.00.000000000 AM -06:00');
end;

14.2.2.3 Restoring Database Stats


begin
dbms_stats.restore_database_stats ( '01-JAN-08 10.00.00.000000000 AM -06:00');
end;

14.2.2.4 Restoring Schema Stats


begin
dbms_stats.restore_schema_stats ('ESCROW1', '01-JAN-08 10.00.00.000000000 AM -06:00');
end;

15 AUTOMATED STATS JOB


In 10g+, a scheduled job exists to automatically gather and maintain statistics for the database.
The script that creates this job is ORACLE_HOME/rdbms/admin/catmwin.sql.

The jobname in the 10G scheduler is “GATHER_STATS_JOB”


The program simply calls dbms_stats.gather_database_stats_job_proc.

SELECT * FROM DBA_SCHEDULER_JOBS WHERE JOB_NAME = 'GATHER_STATS_JOB';

The program name that the job calls is “GATHER_STATS_PROG”

SELECT * FROM DBA_SCHEDULER_PROGRAMS WHERE PROGRAM_NAME = 'GATHER_STATS_PROG';

To see the historical start and end times of the jobs:


select * from dba_optstat_operations order by end_time;

To see the job run details:


select * FROM DBA_SCHEDULER_JOB_RUN_DETAILS where job_name = 'GATHER_STATS_JOB';

To see the running jobs:


select * from dba_scheduler_running_jobs where job_name = 'GATHER_STATS_JOB';

To see the job logs:


select * from dba_scheduler_job_log where job_name = 'GATHER_STATS_JOB';

To disable/enable the job, you can use the dbms_scheduler routines.

BEGIN
DBMS_SCHEDULER.ENABLE('GATHER_STATS_JOB');
DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB');
END;
/

To change any of the constants that the job uses, you can set the following globals:
AUTO_SAMPLE_SIZE CONSTANT NUMBER;
DEFAULT_DEGREE CONSTANT NUMBER;
AUTO_DEGREE CONSTANT NUMBER;
AUTO_CASCADE CONSTANT BOOLEAN;
AUTO_INVALIDATE CONSTANT BOOLEAN;

16 LOCKING AND UNLOCKING STATISTIC COLLECTIONS


To keep the automated job, or other users from collecting statistics on objects, schema and table stats can be locked
using DBMS_STATS.LOCK_xxx_STATS and DBMS_STATS.UNLOCK_xxx_STATS.

This is very useful if you have performed specific collections and do not want the automatic scheduled job to modify
those collections.
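A sketch, with illustrative object names:

```sql
BEGIN
  -- Protect a carefully tuned table from the nightly job
  DBMS_STATS.LOCK_TABLE_STATS(OWNNAME => 'ABC', TABNAME => 'MY_TAB');
  -- Later, when a fresh collection is wanted:
  -- DBMS_STATS.UNLOCK_TABLE_STATS(OWNNAME => 'ABC', TABNAME => 'MY_TAB');
END;
/
```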

17 LIMITATIONS OF DBMS_STATS
17.1 CHAINED ROWS

Periodically, an analysis for chained rows should be run on all tables. This is especially true when Statspack shows a
large number of “table fetch continued row” waits. To analyze for chained rows, the older ANALYZE TABLE xxxx LIST
CHAINED ROWS INTO CHAINED_ROWS command should be run.
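As a sketch (the table name is illustrative), the CHAINED_ROWS table is created once by the utlchain.sql script that ships with the database:

```sql
-- Create the CHAINED_ROWS table (run once)
@?/rdbms/admin/utlchain.sql

-- Record the chained rows of one table
ANALYZE TABLE my_tab LIST CHAINED ROWS INTO chained_rows;

-- Review the affected ROWIDs
SELECT head_rowid FROM chained_rows WHERE table_name = 'MY_TAB';
```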

This report will give you a list of all rows in the table that are chained, along with the ROWIDs of those rows.
It is important to fix tables with many chained rows by rebuilding those rows.

Chained rows statistics DO NOT AFFECT the CBO and therefore have nothing to do with DBMS_STATS.

17.2 VALIDATE STRUCTURE


To validate the structure of an object, ANALYZE TABLE xxxx VALIDATE STRUCTURE must still be used.

DBMS_STATS only performs statistics collections that are relevant to the CBO.
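As a sketch (table name illustrative); with CASCADE, the indexes are verified against the table as well:

```sql
ANALYZE TABLE my_tab VALIDATE STRUCTURE CASCADE;
```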

18 APPENDIX
18.1 A Note from Metalink on Automatic Undo Retention

When an undo tablespace is using NON-AUTOEXTEND datafiles,
V$UNDOSTAT.TUNED_UNDORETENTION may be calculated too high, preventing
undo blocks from being expired and reused. In extreme cases the undo
tablespace could be filled to capacity by these unexpired blocks.

If this fix is applied, an alert may be posted on DBA_ALERT_HISTORY that advises increasing the
space when it is not really necessary. If the user sets their own alert thresholds for undo tablespaces,
the bug may prevent alerts from being produced.

Workaround:
alter system set "_smu_debug_mode" = 33554432;
This causes v$undostat.tuned_undoretention to be calculated as the maximum of:
    maxquerylen secs + 300
    undo_retention specified in init.ora

18.2 BIBLIOGRAPHY
The following documents were consulted in the preparation of this paper:

Cost-Based Oracle Fundamentals - Jonathan Lewis
Metalink Note 114671.1 - Gathering Statistics for the Cost Based Optimizer
Metalink Note 117203.1 - How to Use DBMS_STATS to Move Statistics to a Different Database
Metalink Note 159787.1 - 9i: Import STATISTICS=SAFE
Metalink Note 175258.1 - How to Compute Statistics on Partitioned Tables and Indexes
Metalink Note 236935.1 - Global statistics - An Explanation
Metalink Note 237293.1 - How to Move from ANALYZE to DBMS_STATS - Introduction
Metalink Note 237538.1 - How to Move from ANALYZE to DBMS_STATS on Partitioned Tables
Metalink Note 237901.1 - Gathering Schema or Database Statistics Automatically - Examples
Metalink Note 1031826.6 - Histograms: An Overview
