19c Real-Time and High Frequency Statistics Collection

ORACLE 19C:
REAL-TIME
STATISTICS &
HIGH-FREQUENCY
STATISTICS
COLLECTION
DAVID KURTZ
ACCENTURE ENKITEC GROUP
UKOUG TECHFEST2019
WHO AM I
Accenture Enkitec Group

Performance tuning
• PeopleSoft ERP
• Oracle RDBMS
Book
• www.go-faster.co.uk
• blog.psftdba.com
Oak Table
Copyright © 2019 Accenture All rights reserved. |

AGENDA
2 new 19c features

• Real-Time Statistics
• High-Frequency Automatic Optimizer Statistics Collection
Unfortunately, at the moment, these features are only available on

Engineered Systems
Copyright © 2019 Accenture All rights reserved. | 3

LET'S TALK ABOUT THE OPTIMIZER
The optimizer is a piece of software that works out • Cost-Based Optimizer

how to execute your SQL statements. • Volumetric statistics about tables and
• Oracle 2.3, 1979, rule-based optimizer indexes inform a mathematical model that
• Cost-Based Optimizer introduced Oracle 7.0.12, costs execution plans and finds the
1992 cheapest.
• Since then • Implicit assumption:
• DBMS_STATS package replaced ANALYZE • Execution plan cost α Execution duration
• Lots of automatic behaviour
• Histograms to describe skew • If you provide inaccurate statistics to the
optimizer, then you cannot expect it to
• Maintenance Window produce the correct answer.
• Optimizer Dynamic Sampling • With every release we have seen efforts to
• Statistic Preferences (11gR2) improve the optimizer cost model and
improve the quality of the statistics
• Cardinality/Statistics Feedback (11gR2/12c)
• The optimizer works better with accurate
• Adaptive Plans (12c) statistics.
• Hash based NDV calculation (12c)
ENSOR PARADOX
Dave Ensor • "The only time that it is safe to gather

"the grand old man of performance optimization statistics is when to do so would make no
methodology in the Oracle world" – Jonathan Lewis difference."
• What happens when you have gathered statistics?
• The plan might change
• Or the plan might not change
• What happens if you do not gather statistics?
• The plan might still change because the data
has changed.
• Literal or bind values have moved outside the
range of the column statistics, or into a region
of different skew.
• Why do we gather optimizer statistics?
• So that the plans remain the same!

REAL-TIME STATISTICS

PURPOSE OF REAL-TIME STATISTICS
Oracle Documentation:
• 10.3.3.3 19c SQL Tuning Guide, Part V Optimizer Statistics, Optimizer Statistics Concepts,
• How the Database Gathers Optimizer Statistics, On-line Statistics Gathering, Real-Time
Statistics
"Online statistics … aim to reduce the possibility of the optimizer being misled by stale
statistics.
• 12c introduced online statistics gathering for CREATE TABLE AS SELECT statements
and direct-path inserts.
• 19c introduces real-time statistics, which extend online support to conventional DML
statements.
• Because statistics can go stale between DBMS_STATS jobs, real-time statistics helps the
optimizer generate more optimal plans.
• Real-time statistics augment rather than replace traditional statistics…must continue to
gather statistics regularly using DBMS_STATS"

TABLE MONITORING
Tracks inserts, updates and deletes, as well as whether the table has been truncated, in
memory.
• Introduced in 10g
• Enabled by default since 11g
Used to determined whether statistics are stale
• Drives statistics collection during the maintenance window
Periodically persisted to disk
• by SMON, approximately every 15 minutes
• can be flushed out manually with dbms_stats.flush_database_monitoring_info

WHAT HAPPENS DURING DML?
"When a DML operation is currently modifying a table, Oracle Database dynamically

computes values for the most essential statistics."
• There is no call to DBMS_STATS visible in trace.

• 'OPTIMIZER STATISTICS GATHERING' operation in execution plan
• The changes to the table are tracked in memory and are flushed to the data dictionary
later along with other monitoring data.
• Or they can be flushed with dbms_stats.flush_database_monitoring_info.
• On-line statistics visible in data dictionary views
• DBA_TAB_STATISTICS
• DBA_TAB_COL_STATISTICS
• Additional rows with note: 'STATS_ON_CONVENTIONAL_DML'

DEMONSTRATING REAL-TIME STATISTICS

DEMONSTRATING REAL-TIME STATISTICS
A series of simple tests that

• Build, populate, and then make some more changes
• Then we look at the statistics and some execution plans
This will
• reveal the behaviour of real-time statistics
• and suggest some limitations and risks

TEST: POPULATING A SIMPLE TABLE IN
CONVENTIONAL PATH MODE
connect / as sysdba
alter system flush shared_pool;
A2 starts in the range 1-32, and
connect oe/oe later 1-105, but is highly skewed
drop table t purge;
create table t
(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)
,CONSTRAINT t_pk PRIMARY KEY (a));
insert into t
with n as (select rownum n from dual connect by level <= 1000)
select n.n, ceil(sqrt(n.n)), TO_CHAR(TO_DATE(n.n,'j'),'jsp')
from n;
commit;
insert into t
with n as (select 1000+rownum n from dual connect by level <= 10000)
from n;
commit;
connect / as sysdba
exec dbms_stats.flush_database_monitoring_info;
connect oe/oe

REAL-TIME STATISTICS BEING GATHERED
ON CONVENTIONAL DML
SQL_ID dv6z6pa38uuyk, child number 0

-------------------------------------
insert into t with n as (select 1000+rownum n from dual connect by
level <= 10000) select n.n, ceil(sqrt(n.n)),
TO_CHAR(TO_DATE(n.n,'j'),'jsp') from n
Plan hash value: 761049541
---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | 2 (100)| | | | |
| 1 | LOAD TABLE CONVENTIONAL | T | | | | | | | |
| 2 | OPTIMIZER STATISTICS GATHERING | | 1 | 13 | 2 (0)| 00:00:01 | | | |
| 3 | VIEW | | 1 | 13 | 2 (0)| 00:00:01 | | | |
| 4 | COUNT | | | | | | | | |
| 5 | CONNECT BY WITHOUT FILTERING| | | | | | 2048 | 2048 | 2048 (0)|
| 6 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |
---------------------------------------------------------------------------------------------------------------------

BUT JUST BECAUSE THERE ARE GATHERED
DOESN'T MEAN THEY ARE STORED
Simple Query of Statistics Views

Table Num
Name MON Rows ALL_TABLES
----- --- --------
T YES
Table Num ALL_TAB_STATISTICS

Name Rows BLOCKS NOTES
----- -------- ---------- -------------------------
T
Column Num ALL_TAB_COLUMNS

Name DATA_TYPE Vals LOW_VALUE LOW_V HIGH_VALUE HI_V
------ ---------- -------- ---------------- ---------------- -------------------------------- --------------------------------
A NUMBER
A2 NUMBER
B VARCHAR2 ALL_TAB_COL_STATISTICS
no rows selected

• If you don't have normal statistics, you won't get real-time statistics.

TEST: STATISTICS AUTOMATICALLY
GATHERED ON INITIAL DIRECT-PATH LOAD
create table t
(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)
insert /*+APPEND*/ into t

from n;
commit;
insert into t
from n;

WITHOUT FLUSHING MONITORING INFO
WE HAVE ADDITIONAL COLUMN STATS
Table Num
Name MON Rows
----- --- --------
T YES 1000
Table Num
----- -------- ---------- -------------------------
T 1000 5
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 1000 C102 1 C20B 1000
A2 NUMBER 32 C102 1 C121 32
B VARCHAR2 1000 6569676874 eight 74776F2068756E647265642074776F two hundred two
Table Column Num

Name Name Distinct LOW_VALUE LOW_V HIGH_VALUE HI_V HISTOGRAM
----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
STATS_ON_LOAD
T A2 32 C102 1 C121 32 NONE
STATS_ON_LOAD
T B 1000 6569676874 eight 74776F2068756E647265642074776F two hundred two NONE
STATS_ON_LOAD
T A C102 1 C3020A53 10982
STATS_ON_CONVENTIONAL_DML
T B 6569676874 eight 74776F2074686F7573616E642074776F20687 two thousand two hundred twenty-seven
56E64726564207477656E74792D736576656E
T A2 C102 1 C20206 105

COLUMN STATS FOR CONVENTIONAL DML
BUT NUMBER OF ROWS ON TABLE WRONG
SQL_ID dbxq2k3458mkn, child number 0
-------------------------------------
select count(*) from t where a2 <= 42
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 3 (100)| | 1 |00:00:00.01 | 79 | 5 |
| 1 | SORT AGGREGATE | | 1 | 1 | 13 | | | 1 |00:00:00.01 | 79 | 5 |
|* 2 | TABLE ACCESS FULL| T | 1 | 425 | 5525 | 3 (0)| 00:00:01 | 1764 |00:00:00.01 | 79 | 5 |
------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):

---------------------------------------------------
2 - filter("A2"<=42)
Note
-----
- dynamic statistics used: statistics for conventional DML

FLUSH THE DATABASE MONITORING
INFOMATION
I could wait for the monitoring information to flush naturally.

Or I can flush it manually:

HAVING FLUSHED MONITORING INFORMATION
ALSO HAVE ADDITIONAL TABLE STATS!
Table Num
Name MON Rows
----- --- --------
T YES 1000
Table Num
----- -------- ---------- -------------------------
T 1000 5
T 11000 76 STATS_ON_CONVENTIONAL_DML
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 1000 C102 1 C20B 1000
A2 NUMBER 32 C102 1 C121 32
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
STATS_ON_LOAD
T A2 32 C102 1 C121 32 NONE

STATS_ON_LOAD

STATS_ON_LOAD
T B 6569676874 eight 74776F2074686F7573616E642074776F20687 two thousand two hundred ninety-two

56E64726564206E696E6574792D74776F
T A2 C102 1 C20206 105

T A C102 1 C3020A1F 10930


COLUMN STATS FOR TABLE TOO
SQL_ID 4w1zhvzyh0hb9, child number 0 Added comment to force parse of new

-------------------------------------
select /*B*/ count(*) from t where a2 <= 42 statement, otherwise would get same
plan as before from library cache
---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 22 (100)| | 1 |00:00:00.01 | 78 |
| 1 | SORT AGGREGATE | | 1 | 1 | 13 | | | 1 |00:00:00.01 | 78 |
|* 2 | TABLE ACCESS FULL| T | 1 | 4680 | 60840 | 22 (0)| 00:00:01 | 1764 |00:00:00.01 | 78 |
---------------------------------------------------------------------------------------------------------------------
…
Note
-----

• Real-time column statistics really are real-time because they are immediately visible to
the optimizer
• Real-time table statistics do not become visible until the table monitoring information
has been flushed.
• Creation of real-time statistics does not invalidate cursors.
• Assuming application requires to parse new SQLs, there are several opportunities for
execution plans to change as state of statistics change
• Stale Statistics – under estimate rows
• Real-time column statistics – worse under estimate
• Real-time table statistics – over estimate

TEST: STATISTICS (INCLUDING HISTOGRAM)
EXPLICITLY GATHERED BEFORE CONVENTIONAL DML
create table t
(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)
insert into t
from n;
commit;
exec dbms_stats.gather_table_stats('OE','T' -
,method_opt=>'FOR ALL COLUMNS SIZE AUTO FOR COLUMNS A2 SIZE 254');
insert into t
from n;
commit;

ADDITIONAL COLUMNS STATISTICS ON DML
AND HISTOGRAM ON COLUMN A2
Table Num
Name MON Rows
----- --- --------
T YES 1000
Table Num
----- -------- ---------- -------------------------
T 1000 5
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 1000 C102 1 C20B 1000
A2 NUMBER 32 C102 1 C121 32
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
T A2 32 C102 1 C121 32 FREQUENCY
T A2 C102 1 C20206 105

T B 6569676874 eight 74776F2074686F7573616E642074776F20687 two thousand two hundred eighty-nine

56E64726564206569676874792D6E696E65
T A C102 1 C3020A51 10980


HISTOGRAM HAS TAKEN PRECEDENCE OVER
REAL-TIME COLUMN STATISTICS
SQL_ID dbxq2k3458mkn, child number 0 All the values in histogram <= 32

-------------------------------------
select count(*) from t where a2 <= 42 So optimizer thinks it will get all
1000 rows in table
---------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 13 | | | 1 |00:00:00.01 | 73 |
---------------------------------------------------------------------------------------------------------------------
…
Note
-----

HAVING FLUSHED MONITORING INFORMATION
NOW HAVE REAL-TIME TABLE STATISTICS
Table Num
Name MON Rows
----- --- --------
T YES 1000
Table Num
----- -------- ---------- -------------------------
T 1000 5
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 1000 C102 1 C20B 1000
A2 NUMBER 32 C102 1 C121 32
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
T A2 32 C102 1 C121 32 FREQUENCY
T B 6569676874 eight 74776F2074686F7573616E642074776F20687 two thousand two hundred forty-four

56E6472656420666F7274792D666F7572
T A2 C102 1 C20206 105

T A C102 1 C302094E 10877


HISTOGRAM HAS TAKEN PRECEDENCE OVER
REAL-TIME COLUMN STATISTICS
The optimizer knows there are
11000 rows in the table, but the
SQL_ID dbxq2k3458mkn, child number 0 histogram still says they are all
-------------------------------------
select count(*) from t where a2 <= 42 in the range 1-32, even though
the column max value is 105
---------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 13 | | | 1 |00:00:00.01 | 79 |
|* 2 | TABLE ACCESS FULL| T | 1 | 11000 | 139K| 22 (0)| 00:00:01 | 1764 |00:00:00.01 | 79 |
---------------------------------------------------------------------------------------------------------------------…
Note
-----

Histograms takes precedence over column statistics.

• The proportion of rows returned will be calculated from histogram without reference to
column statistics.
• Real-time statistics never generate statistics that would require scanning the table.
• Don't generate histograms, or update the number of distinct values
• Heavily skewed data may still present challenges even with real-time statistics
On-Line statistics (on direct-path load) can also gather histograms

• _optimizer_gather_stats_on_load_hist=TRUE
• This undocumented parameter is set in Autonomous Data Warehouse
Collect on-line statistics on every direct path load

• _optimizer_gather_stats_on_load_all=TRUE
• Also set in Autonomous Data Warehouse
SHOULD WE COLLECT HISTOGRAMS BY
DEFAULT?
Oracle Default: Jonathan Lewis:

• histograms created automatically where skew • Do not gather histograms
and usage • Unless Frequency or Top-Frequency
• set_global_prefs • Get rid of all "size repeat" in 12c+
• method_opt => 'for all columns size auto' • Change default
• set_table_prefs • set_global_prefs
• method_opt => 'for all columns size auto, for • method_opt => 'for all columns size 1'
columns XXX size 1'
• set_table_prefs
• method_opt => 'for all columns size 1, for
columns XXX size N'
• Write scripts for "unusual" histograms
http://coug.us/wp/wp- https://cdn.ymaws.com/ukoug.org/resource/res
content/uploads/2019/03/MColgan_Best_Practices_ mgr/scotland_19/jonathan_lewis.pdf
for_Manging_statistics.pdf
TEST: FULL EXPLICIT STATISTICS
create table t
(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)

from n;
commit;
insert into t
from n;
commit;
exec dbms_stats.gather_table_stats('OE','T' –
,method_opt=>'FOR ALL COLUMNS SIZE AUTO FOR COLUMNS A2 SIZE 254');

STATISTICS COLLECTION
DELETES REAL-TIME STATISTICS
Table Num
Name MON Rows
----- --- --------
T YES 11000
Table Num Not surprising. Fresh statistics

----- -------- ---------- ------------------------- do not need to be augmented
T 11000 76
by real-time statistics
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 11000 C102 1 C3020B 11000
A2 NUMBER 105 C102 1 C20206 105
B VARCHAR2 11000 6569676874 eight 74776F2074686F7573616E642074776F20687 two thousand two hundred two
56E647265642074776F
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 11000 C102 1 C3020B 11000 NONE
T A2 105 C102 1 C20206 105 FREQUENCY
T B 11000 6569676874 eight 74776F2074686F7573616E642074776F20687 two thousand two hundred two NONE
56E647265642074776F

UP TO DATE FREQUENCY HISTOGRAM PRODUCES
ACCURATE ESTIMATE ON SKEWED DATA
SQL_ID dbxq2k3458mkn, child number 0

-------------------------------------
select count(*) from t where a2 <= 42
---------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 4 | | | 1 |00:00:00.01 | 78 |
---------------------------------------------------------------------------------------------------------------------

TEST: LONE DELETES ARE NOT TRACKED BY
REAL-TIME STATITICS
create table t
(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)

from n;
commit;
delete from t where a > 800;

commit;

NO REAL-TIME STATISTICS AFTER
DELETE ONLY
Table Num
Name MON Rows
----- --- --------
T YES 1000
Table Num
----- -------- ---------- -------------------------
T 1000 5
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 1000 C102 1 C20B 1000
A2 NUMBER 32 C102 1 C121 32
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
STATS_ON_LOAD
T A2 32 C102 1 C121 32 NONE

STATS_ON_LOAD

STATS_ON_LOAD

TEST: UPDATES ARE TRACKED BY REAL-TIME
STATISTICS
create table t
(a number NOT NULL A2 goes from 1-32 to 23-39
,a2 number NOT NULL B is prefixed with z
,b varchar2(2000)

from n;
commit;
update t
set a2 = ceil(sqrt(a+1000))
, b = 'z'||TO_CHAR(TO_DATE(a+1000,'j'),'jsp')
where a <= 500;
commit;
REAL-TIME STATISTICS AFTER UPDATE
Table
Name MON
Num
Rows
Increase in column maximum
----- --- --------
T YES 1000
values have been detected
Table Num
Increase in column minimum
----- -------- ---------- -------------------------
T 1000 5
value not detected
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 1000 C102 1 C20B 1000
A2 NUMBER 32 C102 1 C121 32
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
STATS_ON_LOAD
T A2 32 C102 1 C121 32 NONE

STATS_ON_LOAD

STATS_ON_LOAD
T B 6569676874 eight 7A6F6E652074686F7573616E6420746872656 zone thousand three hundred eleven

52068756E6472656420656C6576656E
T A2 C102 1 C126 37

TEST: UPDATE THEN DELETE IS TRACKED BY
create table t
(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)
from n;
commit;
update t
set a2 = ceil(sqrt(a+1000))
, b = 'z'||TO_CHAR(TO_DATE(a+1000,'j'),'jsp')
where a <= 500;
commit;
delete from t where a > 800;
commit;
REAL-TIME STATISTICS UPDATE THEN DELETE
Table Num
Name MON Rows
----- --- --------
T YES 1000 Reduction in number
Table Num
of rows detected
----- -------- ---------- -------------------------
T
T
1000
800
5
7 STATS_ON_CONVENTIONAL_DML
No real-time column statistics on
A because it was not updated.
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 1000 C102 1 C20B 1000
A2 NUMBER 32 C102 1 C121 32
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
STATS_ON_LOAD
T A2 32 C102 1 C121 32 NONE

STATS_ON_LOAD

STATS_ON_LOAD
T B 6569676874 eight 7A6F6E652074686F7573616E642074776F206 zone thousand two hundred thirty-five

8756E64726564207468697274792D66697665
T A2 C102 1 C126 37

REAL-TIME STATISTICS AFTER UPDATE THEN
DELETE
SQL_ID 1fs1r0dh9pg95, child number 0 Still estimates 160 rows because

-------------------------------------
select count(*) from t where a > 800 20% of range of 1-1000, out of 800 rows.
800*.2=160
--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 4 | | | 1 |00:00:00.01 | 2 |
|* 2 | INDEX RANGE SCAN| T_PK | 1 | 160 | 640 | 2 (0)| 00:00:01 | 0 |00:00:00.01 | 2 |
--------------------------------------------------------------------------------------------------------------------
…
Note
-----

TEST: TRUNCATE DOES NOT REMOVE
STATISTICS
create table t
(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)

from n;
truncate table t;
insert into t
from n;
commit;

REAL-TIME STATISTICS AFTER TRUNCATE
AND INSERT
Table Num
Name MON Rows
----- --- --------
T YES 1000
Table Num
10000 rows √
----- -------- ---------- -------------------------
T 1000 5
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 1000 C102 1 C20B 1000
A2 NUMBER 32 C102 1 C121 32
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
STATS_ON_LOAD
T A2 32 C102 1 C121 32 NONE

STATS_ON_LOAD

STATS_ON_LOAD
T A2 C102 1 C20206 105

T A C102 1 C302095C 10891

T B 6569676874 eight 74776F2074686F7573616E642074776F20687 two thousand two hundred three

56E64726564207468726565
STATS_ON_CONVENTIONAL_DML Copyright © 2019 Accenture All rights reserved. | 43
REAL-TIME STATISTICS AFFECTED BY
STATISTICS ON TRUNCATED DATA
Query returns no rows because data in
SQL_ID 65dtcj7k2w7c8, child number 0 range 1-1000 was truncated.
-------------------------------------
select count(*) from t where a < 501 Statistics suggest A in range 1-11000
So 500/11000*10000=454
--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 13 | | | 1 |00:00:00.01 | 3 |
|* 2 | INDEX RANGE SCAN| T_PK | 1 | 459 | 5967 | 2 (0)| 00:00:01 | 0 |00:00:00.01 | 3 |
--------------------------------------------------------------------------------------------------------------------
…
Note
-----

A FEW TRAPS
It appears that deletes are not tracked by real-time statistics – surely a bug
• Updates are tracked
• Deletes after Updates are tracked
Truncate doesn't delete statistics

• So you can get statistics on truncated data mixed with real-time statistics on new data

TEST: PREDICATE OUTSIDE COLUMN RANGE
STATISTICS – UNIFORM DISTRIBUTION
create table t
(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)
create index t_a2 on t(a2);

from n;
commit;
select /*A*/ count(b) from t where a >= 900;
insert into t
from n;
commit;
select /*C*/ count(b) from t where a >= 1900;
exec dbms_stats.gather_table_stats('OE','T',method_opt=>'FOR ALL COLUMNS SIZE AUTO FOR COLUMNS A2 SIZE 1');
select /*D*/ count(b) from t where a >= 1900;

STATISTICS ON LOAD
SQL_ID b614cm44zbh7n, child number 0

-------------------------------------
select /*A*/ count(b) from t where a >= 900
---------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 26 | | | 1 |00:00:00.01 | 6 |
---------------------------------------------------------------------------------------------------------------------

Table Num
Name MON Rows
----- --- --------
T YES 1000
Table Num
----- -------- ---------- -------------------------
T
T
1000
2000
5
12 STATS_ON_CONVENTIONAL_DML
Max column value of A is 1984,
actually 2000
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 1000 C102 1 C20B 1000
A2 NUMBER 32 C102 1 C121 32
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
STATS_ON_LOAD
T A2 32 C102 1 C121 32 NONE

STATS_ON_LOAD

STATS_ON_LOAD
T B 6569676874 eight 74776F2068756E647265642074776F two hundred two

T A2 C102 1 C12E 45
T A C102 1 C21455 1984


STALE STATISTICS
WITH REAL-TIME STATISTICS
SQL_ID 32cu1x428w7sn, child number 0 Estimate slightly off because maximum column
-------------------------------------
select /*C*/ count(b) from t where a >= 1900 value is recorded as 1984 when actually 2000
Plan hash value: 2966233522 85/1984*2000=86
---------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 29 | | | 1 |00:00:00.01 | 14 |
---------------------------------------------------------------------------------------------------------------------
Note
-----
Good estimate on a column without skew.
This is the problem that real-time statistics
solves very well.

FRESH STATISTICS
SQL_ID 2jv5mfvk6qswb, child number 0 The conventional statistics produce a more accurate
-------------------------------------
select /*D*/ count(b) from t where a >= 1900 estimated number of rows, and consequently we use
Plan hash value: 2053823973 an index scan rather than a full scan
---------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 33 | | | 1 |00:00:00.01 | 4 |
| 2 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 1 | 101 | 3333 | 3 (0)| 00:00:01 | 101 |00:00:00.01 | 4 |
|* 3 | INDEX RANGE SCAN | T_PK | 1 | 101 | | 2 (0)| 00:00:01 | 101 |00:00:00.01 | 2 |
---------------------------------------------------------------------------------------------------------------------------------------

TEST: STATISTICS AND REAL-TIME STATISTICS
ON GLOBAL TEMPORARY TABLES (FROM 12C)
create global temporary table t

(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)
,CONSTRAINT t_pk PRIMARY KEY (a))
on commit preserve rows;

from n;
commit;
insert into t
from n;
commit;

ON-LINE BUT NOT REAL-TIME STATISTICS ON
GLOBAL TEMPORARY TABLE
Table Num
Name MON Rows
----- --- --------
T NO
Table Num
Name Rows BLOCKS NOTES SCOPE
----- -------- ---------- ------------------------- -------
T SHARED
T 1000 5 SESSION
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER
A2 NUMBER
B VARCHAR2
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
T A2 32 C102 1 C121 32 NONE

NO REAL-TIME STATISTICS
ON GLOBAL TEMPORARY TABLES
Concept of real-time statistics doesn't fit with GTTs
Statistics on GTTs have always been problematic.

• Statistics gather on data in one session can be used in other sessions
• 19 introduces session level statistics on GTTs
• Stored in memory, exposed via x$kxttstets
GTTs designed for short-lived storage of data

• If you wait for a database process to flush real-time statistics to disk
• Is the session using the GTTs likely still there?
• Which session's statistics will be stored in memory and flushed?
• Where they being flushed to? And you would have to have conventional statistics
So, it just doesn't happen

LOCK STATISTICS BEFORE DML
create table "OE".t

(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)
insert /*+APPEND*/ into "OE".t

from n;
commit;
exec dbms_stats.lock_table_stats('OE','T');
update "OE".t set b = UPPER(TO_CHAR(TO_DATE(a,'j'),'jsp')) where rownum<=101;
commit;

LOCKING STATISTCS BEFORE DML
PREVENTS REAL-TIME STATISTICS
Table Num
----- -------- ---------- ------------------------- -------
T 1000 5 SHARED
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 1000 C102 1 C20B 1000
A2 NUMBER 32 C102 1 C121 32
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- ------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
STATS_ON_LOAD
T A2 32 C102 1 C121 32 NONE

STATS_ON_LOAD

STATS_ON_LOAD

LOCKING STATISTCS BEFORE DML
PREVENTS REAL-TIME STATISTICS
Locking statistics prevents

• collection of statistics in schema/database-wide operations
• collection of real-time statistics on either direct or conventional-path load.
Locking statistics is used because

• No explicit statistics, perhaps rely on dynamic sampling
• Nothing for real-time statistics to augment
• Manage statistics collection manually, or perhaps in the application.
• No point confusing the issue with real-time statistics.

DISABLING REAL-TIME STATISTICS
Despite potential for problems, I think real-time Documented hints

statistics are useful enhancement.
• Applied to DML statements
However, if you really want, you can disable this
behaviour.
• NO_GATHER_OPTIMIZER_STATISTICS
Two undocumented parameters
• Can be set at system, session, or statement level
• GATHER_OPTIMIZER_STATISTICS
• optimizer online stats gathering for conventional
DML • Doesn't override
_optimizer_gather_stats_on_conventional_dml =
• _optimizer_gather_stats_on_conventional_dml FALSE
= TRUE
• use optimizer statistics gathered for conventional
DML There is no table preference for Real-Time
• _optimizer_use_stats_on_conventional_dml Statistics
= TRUE • It is not possible to control them at table
level

NO_GATHER_OPTIMIZER_STATISTICS
Supress collection of statistics on initial bulk load Supress collection of real-time statistics during
of table conventional DML
• CREATE TABLE … AS SELECT …
• INSERT /*+APPEND*/ … SELECT …
• (into an empty table)
Introduced in Oracle 12.1 Introduced in Oracle 19.1

NO_GATHER_OPTIMIZER_STATISTICS HINT
create table t
(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)
insert /*+APPEND NO_GATHER_OPTIMIZER_STATISTICS*/ into t
from n;
commit;
insert into t
from n;

NO STATISTICS AT ALL
If you don't have normal statistics then you don't get real-time statistics
Table Num
Name MON Rows
----- --- --------
T YES
Table Num
----- -------- ---------- ------------------------- -------
T SHARED
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER
A2 NUMBER
B VARCHAR2
no rows selected

NO_GATHER_OPTIMIZER_STATISTICS HINT
create table t
(a number NOT NULL
,a2 number NOT NULL
,b varchar2(2000)
from n;
commit;
insert /*+NO_GATHER_OPTIMIZER_STATISTICS*/ into t
from n;

STATISTICS ON LOAD ONLY
Table Num
Name MON Rows
----- --- --------
T YES 1000
Table Num
----- -------- ---------- ------------------------- -------
T 1000 5 SHARED
Column Num
------ ---------- -------- ---------------- ---------------- ------------------------------------- -------------------------------------
A NUMBER 1000 C102 1 C20B 1000
A2 NUMBER 32 C102 1 C121 32
Table Column Num

----- ------ -------- ---------------- ---------------- ------------------------------------- ------------------------------------- -------------
NOTES
-------------------------
T A 1000 C102 1 C20B 1000 NONE
STATS_ON_LOAD
T A2 32 C102 1 C121 32 NONE

STATS_ON_LOAD

STATS_ON_LOAD

• In general, more accurate statistics will result in fewer problems if not better
performance.
• Skewed data has always been a challenge
• It is still going to be a challenge with real-time statistics
• However, there is no substitute for maintaining appropriate explicit statistics

• Including during application execution if necessary!

HIGH-FREQUENCY AUTOMATIC OPTIMIZER
STATISTICS COLLECTION

HIGH-FREQUENCY AUTOMATIC STATISTICS
COLLECTION
• Scheduled PL/SQL Procedure: dbms_stats.gather_st_job_continuous_proc
• Very similar to automatic statistics collection job that runs in the maintenance window
(auto optimizer stats collection).
• Not enabled by default. You have to turn it on.
• Applies to whole database.
• Key differences are limited run time, so it can be run more frequently.
• Only one instance can run concurrently. So only one CPU.
• It will run during the maintenance window
• Uses SYS.STATS_TARGET$ to drive processing
• Stalest tables within each object type first
• This also is used to prevent the two jobs running concurrently?
• Can interact with real-time statistics.

CONSEQUENCES AND THREATS
Cursor Invalidation Statistics History

• Caused by collecting statistics • Old statistics and histograms saved to history
• Cursors might disappear quite quickly after a when new collected.
process completes as HFASC occurs • Increased volume of statistics history retained.
• Might increase parse overhead • Increased purge overhead.

LIMITED CONTROL WITH PREFERENCES
You can
• enable/disable job,
• set Execution Frequency,
• set Maximum Run-time,
• Omit tables by locking statistics
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_STATUS','ON');
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_INTERVAL','900');
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_MAX_RUN_TIME','3600');

READ PREFERENCES
TO GET CURRENT SETTINGS
SET LINESIZE 100 pages 99

COLUMN auto_task_status HEADING 'Auto Task|Status' FORMAT a10
COLUMN auto_task_max_run_time HEADING 'Auto Task|Max Run Time' FORMAT a15
COLUMN auto_task_interval HEADING 'Auto Task|Interval' FORMAT a10
SELECT DBMS_STATS.GET_PREFS('AUTO_TASK_STATUS') AS auto_task_status

, DBMS_STATS.GET_PREFS('AUTO_TASK_MAX_RUN_TIME') AS auto_task_max_run_time
, DBMS_STATS.GET_PREFS('AUTO_TASK_INTERVAL') AS auto_task_interval
FROM dual;
Auto Task Auto Task Auto Task

Status Max Run Time Interval Default Settings
---------- --------------- ----------
OFF 3600 900

DBA_AUTO_STAT_EXECUTIONS
with x as (
SELECT x.*
, end_time-start_time diff
, start_time-(LAG(end_time,1) over (order by start_time)) start_lag
FROM DBA_AUTO_STAT_EXECUTIONS x
ORDER BY x.start_time
)
select opid, origin, status
, ((extract( day from start_lag )*24+
extract( hour from start_lag ))*60+
extract( minute from start_lag ))*60+
extract( second from start_lag ) start_lag
, start_time, end_time
, ((extract( day from diff )*24+
extract( hour from diff ))*60+
extract( minute from diff ))*60+
extract( second from diff ) secs
, completed, failed, timed_out, in_progress
from x
Where start_time >= sysdate-1/24
/

DBA_AUTO_STAT_EXECUTIONS
Report on each automatic statistics collection process

Start Exec Timed In
OPID ORIGIN STATUS Lag START_TIME END_TIME Secs COMPLETED FAILED Out Progress
----- -------------------- ----------- -------- ----------------------------------- ----------------------------------- -------- --------- ------ ------ --------
8405 AUTO_TASK COMPLETED 313.873 01-JUL-19 08.16.09.885689 AM -05:00 01-JUL-19 08.16.22.277274 AM -05:00 12.392 45 1 0 0
8468 AUTO_TASK IN PROGRESS 60.339 01-JUL-19 09.09.35.537039 AM -05:00 01-JUL-19 09.09.35.537039 AM -05:00 .000 66 0 0 1

SYS.STATS_TARGET$
COLUMN Description
START_TIME Timestamp when statistics gathering started
END_TIME Timestamp when statistics gathering ended
STALENESS Staleness as a percentage derived from statistics and counts of table modifications statistics, but
limited to ±100
LEAST/GREATEST combination of ROWCNT, INSERTS, INSERTS+DELETE+UPDATES,
ROWCNT+INSERTS-DELETES/(ROWCNT+.01))
Also Derived from DBMS_STATS_INTERNAL.IS_STALE and DBMS_STATS_INTERNAL.BITCLR
OSIZE Size on object in bytes from DBMS_STATS_INTERNAL.GET_TABLE_BLOCK_COUNT or
DBMS_STATS_INTERNAL.SEGMENT_NUMBER_BLOCKS
OBJ# Object number – see SYS.OBJ$
TYPE# Object type – see DBA_OBJECTS or table above
FLAGS ?
STATUS 0=awaiting processing
1=in process
2=complete
?=timed out (because there is a timeout column on DBA_AUTO_STAT_EXECUTIONS
4=failed
SID SID of HFASC process
SERIAL# Serial# of HFASC process
PART# Partition/Sub-partition number
BO# Base Object ID for partitioned object
SYS.STATS_TARGET$
select
s.START_TIME, s.END_TIME
,s.STALENESS
,s.OSIZE
,s.OBJ#, name, NVL(subname,'<NULL>') subname
,s.TYPE#
,decode(s.type#, 0, 'NEXT OBJECT', 1, 'INDEX', 2, 'TABLE', 3, 'CLUSTER',
19, 'TABLE PARTITION', 20, 'INDEX PARTITION',
34, 'TABLE SUBPARTITION', 35, 'INDEX SUBPARTITION',
'UNDEFINED') object_type
,s.FLAGS, s.STATUS
,s.SID, s.SERIAL#
,s.PART#, s.BO#
from sys.stats_target$ s
left outer join sys.obj$ o on o.obj# = s.obj#
Where start_time >= sysdate-10/1440
order by start_time
/
SYS.STATS_TARGET$
START_TIME END_TIME STALENESS OSIZE OBJ#
----------------------------------- ----------------------------------- ---------- ---------- -------
NAME SUBNAME TYPE# OBJECT_TYPE FLAGS STATUS SID
------------------------------ ---------------------------------------- ----- ------------------ ----- ------ -----
SERIAL# PART# BO#
------- ---------- ----------
04-SEP-19 05.45.20.430796 PM +00:00 04-SEP-19 05.45.20.529349 PM +00:00 .4 131072 11576
WRM$_DATABASE_INSTANCE <NULL> 2 TABLE 288 2 17
28450
04-SEP-19 05.45.20.599045 PM +00:00 04-SEP-19 05.45.20.854072 PM +00:00 .5 262144 670

COL_USAGE$ <NULL> 2 TABLE 288 2 17
28450
04-SEP-19 05.45.20.887099 PM +00:00 04-SEP-19 05.45.20.961224 PM +00:00 .4 65536 88757

WRM$_PDB_IN_SNAP WRM$_PDB_IN_SNAP_1884370030_2326 19 TABLE PARTITION 352 2 17
28450 120 11594
04-SEP-19 05.45.20.937960 PM +00:00 04-SEP-19 05.45.20.958548 PM +00:00 -99 819200 88759

WRM$_PDB_IN_SNAP_PK WRM$_PDB_IN_SNAP_1884370030_2326 20 INDEX PARTITION 576 2 11
32191 120 11596
04-SEP-19 05.45.20.982490 PM +00:00 04-SEP-19 05.45.21.213654 PM +00:00 .4 589824 88701

WRH$_SYSMETRIC_HISTORY WRH$_SYSMETRIC_HISTORY_1884370030_2326 19 TABLE PARTITION 352 2 17
28450 321 11343
04-SEP-19 05.45.21.149670 PM +00:00 04-SEP-19 05.45.21.210769 PM +00:00 -99 819200 88703

WRH$_SYSMETRIC_HISTORY_INDEX WRH$_SYSMETRIC_HISTORY_1884370030_2326 20 INDEX PARTITION 576 2 11
32191 321 11345

CONCLUSIONS

BOTH FEATURES ARE EXADATA ONLY
19c Feature available on

• Oracle Database Enterprise Edition on Engineered Systems (EE-ES)
• Oracle Database Exadata Cloud Service (ExaCS)
• Database Licensing Information User Manual, 1.3 Permitted Features, Options, and
Management Packs by Oracle Database Offering, Performance - Table 1-8
I think both features are primarily aimed at Autonomous Transaction Processing Databases
• Not ADW because you mostly direct path load, and don't update HCC objects
• It will also apply to Exadata.
• For example, they will apply to a classical ERP on Exadata
• But why not other platforms? It would help them also.
However, you can reproduce all the behaviour if you set

• alter system set "_exadata_feature_on"=TRUE scope=spfile;
• Obviously, you would NEVER do that in any other than a play database, but it does show
that the feature is there and it works.
CONCLUSIONS
REAL-TIME STATISTICS HIGH-FREQUENCY AUTOMATIC STATISTICS

• Helps to avoid plan regressions caused by COLLECTION
using stale data • It is a continuous maintenance window
• Clearly works well with uniformly distributed • Recollects statistics that are stale
data • Starting with most stale within object type
• Augments existing statistics with simple
metrics:
• Could it repeatedly get stuck on large but
• Row counts stale tables?
• Min/max values
• Nothing that requires a scan of all the data
• No Histograms – doesn't handle skewed data
well
• Based on table monitoring

QUESTIONS
E-mail:
• david.kurtz@accenture.com

19c Real-Time and High Frequency Statistics Collection

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

19c Real-Time and High Frequency Statistics Collection

Uploaded by

Copyright:

Available Formats

ORACLE 19C:

Accenture Enkitec Group

Copyright © 2019 Accenture All rights reserved. |

2 new 19c features

Unfortunately, at the moment, these features are only available on

Copyright © 2019 Accenture All rights reserved. | 3

The optimizer is a piece of software that works out • Cost-Based Optimizer

Dave Ensor • "The only time that it is safe to gather

Copyright © 2019 Accenture All rights reserved. | 6

Copyright © 2019 Accenture All rights reserved. | 7

Copyright © 2019 Accenture All rights reserved. |

Copyright © 2019 Accenture All rights reserved. |

"When a DML operation is currently modifying a table, Oracle Database dynamically

• There is no call to DBMS_STATS visible in trace.

Copyright © 2019 Accenture All rights reserved. | 10

Copyright © 2019 Accenture All rights reserved. | 11

A series of simple tests that

Copyright © 2019 Accenture All rights reserved. | 12

Copyright © 2019 Accenture All rights reserved. | 13

SQL_ID dv6z6pa38uuyk, child number 0

Plan hash value: 761049541

Copyright © 2019 Accenture All rights reserved. | 15

Simple Query of Statistics Views

Table Num ALL_TAB_STATISTICS

Column Num ALL_TAB_COLUMNS

Copyright © 2019 Accenture All rights reserved. | 16

Copyright © 2019 Accenture All rights reserved. |

insert /*+APPEND*/ into t

Copyright © 2019 Accenture All rights reserved. | 18

Table Column Num

Copyright © 2019 Accenture All rights reserved. | 19

Plan hash value: 2966233522

Predicate Information (identified by operation id):

Copyright © 2019 Accenture All rights reserved. | 20

I could wait for the monitoring information to flush naturally.

Copyright © 2019 Accenture All rights reserved. | 21

Table Column Num

T A2 32 C102 1 C121 32 NONE

T B 1000 6569676874 eight 74776F2068756E647265642074776F two hundred two NONE

T B 6569676874 eight 74776F2074686F7573616E642074776F20687 two thousand two hundred ninety-two

T A2 C102 1 C20206 105

T A C102 1 C3020A1F 10930

Copyright © 2019 Accenture All rights reserved. | 22

SQL_ID 4w1zhvzyh0hb9, child number 0 Added comment to force parse of new

Copyright © 2019 Accenture All rights reserved. | 23

Copyright © 2019 Accenture All rights reserved. |

Copyright © 2019 Accenture All rights reserved. |

Table Column Num

T A2 32 C102 1 C121 32 FREQUENCY

T B 1000 6569676874 eight 74776F2068756E647265642074776F two hundred two NONE

T A2 C102 1 C20206 105

T B 6569676874 eight 74776F2074686F7573616E642074776F20687 two thousand two hundred eighty-nine

T A C102 1 C3020A51 10980

Copyright © 2019 Accenture All rights reserved. | 26

SQL_ID dbxq2k3458mkn, child number 0 All the values in histogram <= 32

Copyright © 2019 Accenture All rights reserved. | 27

Table Column Num

T A2 32 C102 1 C121 32 FREQUENCY

T B 1000 6569676874 eight 74776F2068756E647265642074776F two hundred two NONE

T B 6569676874 eight 74776F2074686F7573616E642074776F20687 two thousand two hundred forty-four

T A2 C102 1 C20206 105

T A C102 1 C302094E 10877

Copyright © 2019 Accenture All rights reserved. | 28

insert /+APPEND/ into t

insert /+APPEND/ into t

insert /+APPEND/ into t

insert /+APPEND/ into t

insert /+APPEND/ into t

insert /+APPEND/ into t

insert /+APPEND/ into t