Managing Cost-Based Optimiser Statistics in Peoplesoft: David Kurtz

You might also like

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 32

Managing Cost-Based Optimiser

Statistics in PeopleSoft
David Kurtz
Go-Faster Consultancy Ltd.

david.kurtz@go-faster.co.uk
www.go-faster.co.uk
Who Am I?
• Oracle Database Specialist
– Independent consultant
• System Performance
tuning
– PeopleSoft ERP
– Oracle RDBMS
• Book
– www.psftdba.com
• UKOUG
– PeopleSoft Technology SIG
Committee

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 2


s in PeopleSoft
• If you are an Oracle DBA
– Next Week
– In Birmingham
– 3-days

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 3


s in PeopleSoft
Agenda
• Cost Based Optimiser Statistics
– Permanent Tables
• Database/Schema-Wide Maintenance
– Interim Tables
• Working Storage/Temporary/Reporting
• Populated by Batch
• Statistics Managed by Batch

• I am going to talk about Oracle databases


– The principles apply to all databases
– The specifics are different

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 4


s in PeopleSoft
Cost Based Optimiser
• A mathematical model
– Given a SQL statement
– Uses volumetric statistics
– Determines ‘optimal’ execution
plan
• Like predicting the weather
– Doesn’t always produce the
best answer
• Deficiencies in the model
• Inaccuracies in data (statistics)

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 5


s in PeopleSoft
What do you mean by ‘optimal’?
• An execution plan has a cost
– That cost is an estimate of how
long the query will take.
– In practice that may not be the
best execution plan.
– The Optimiser might not know
enough about the data.
• Skew
• Correlation

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 6


s in PeopleSoft
What do you mean ‘correlation’?
• Imagine table of people
– Attributes include: Star sign, Month of Birth
– How many people born in August?
• 1 in 12
– How many people are Leos?
• 1 in 12
– How many Leos born in August?
• 1 in 144? No.
• 1 in 16, We know that data is correlated

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 7


s in PeopleSoft
What do you mean ‘correlation’?
• Example from HR. • From a real system
931012 rows
PS_JOB
(emplid VARCHAR2(11) 142224 EMPLIDs
… 7456 EFFDTs
,effdt DATE
…)
• There is one day with
77985 rows!

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 8


s in PeopleSoft
General Principles

• Tell the database the truth about the objects


it is working on.
– At least enough of the truth for the Optimiser to
make the right decision
• System Statistics reflect performance of disks
• Object Statistics reasonably up to date
• Histograms to describe skew of individual columns.
– Oracle 11g will build histograms on combinations of
columns

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 9


s in PeopleSoft
Corner Cases

• There will be times when that is not enough


– Hints (which are directives)

• And in extreme cases


– Manually adjust statistics
– ‘Tuning by Cardinality Feedback’ (Wolfgang Breitling)
– http://www.centrexcc.com/papers.html

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 10


s in PeopleSoft
System Statistics
• Introduced in Oracle 9i
• Not automatically collected
– In which case arbitrary defaults used
• Tells Oracle about
– CPU Speed
• Cost of code path
– Single –v- Multi-block Disk Speed
• Relative cost of index –v- scan

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 11


s in PeopleSoft
By default, how does Oracle
maintain statistics?

• Maintenance Window
– New in Oracle 10g
• Previously you had to define your own job
– Default 10pm – 6am weekdays + weekends
– Refresh ‘stale’ object statistics
• Stale means >10% change
• Table monitoring enabled by default
– user_tab_statistics
– dbms_stats.flush_database_monitoring_info
– Automatic Histogram Collection

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 12


s in PeopleSoft
Histograms?
• Statistics include minimum and maximum value
for every column
– Optimiser assumes values evenly distributed in that
range.
• Describe non-uniformities in data distributions
– Frequency –v- Height Balanced
– Up to 254 ‘buckets’
• Parse Overhead
– More input data, more calculations

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 13


s in PeopleSoft
Maintaining Statistics
• What usually happens! • My Advice
– Leave to to default – 90% of time, default is
behaviour fine
• We keep the statistics • Rather like traction
up to date to maintain control on a car
system stability – Take control, at least
– I sometimes see for tables where you
Legacy 9i scripts, that have had to resolve a
collect stats on problem
everything at high
sample size.

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 14


s in PeopleSoft
Taking Control

• Techniques:
– Lock Statistics
– Delete Statistics
– Optimiser Dynamic Sampling
– Modifying PeopleSoft Behaviour

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 15


s in PeopleSoft
Locking Statistics
• New in Oracle 10g
• Table (and indexes) omitted from
database/schema-wide statistics operations
• dbms_stats.gather_table_stats & ANALYZE
commands generate an error when stats are
locked
• dbms_stats has force=>TRUE

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 16


s in PeopleSoft
Optimiser Dynamic Sampling
• Extra statistics collected on-the-fly at parse time.
– Usually for when there are no object statistics
– Useful for when object statistics may lead to errors in
estimation of selectivity.
• Recommendation: Set to Level 4
• default is 2
– Enabled for tables with 2 column predicates
– This helps where columns correlated

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 17


s in PeopleSoft
Working Storage/Reporting Tables

• In an ideal world
– Clear the table
%TruncateTable()
– Populate table
– Gather statistics
%UpdateStats()
– Use tables

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 18


s in PeopleSoft
%UpdateStats
• Application Engine macro
– Invokes DDL Model
– Dynamically defined in PSDDLMODEL
• So you can change this
– Used to be ANALYZE TABLE
– From PT8.48 delivered with
dbms_stats.gather_table_stats
• http://blog.psftdba.com/2007/05/changes-to-ddl-mo
dels-in-peopletools.html

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 19


s in PeopleSoft
DDL Model Problems (1)

%UpdateStats(record name ,[HIGH/LOW])


• Low
– DDL Model specifies 1% sample
• High
– dbms_stats.auto_sample_size
• Dynamically calculated by Oracle
• often less than 1%

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 20


s in PeopleSoft
DDL Model Problems (2)
%UpdateStats(record name ,[HIGH/LOW])
• Low
– FOR ALL COLUMNS SIZE 1
• Removes all histograms
• High
– FOR ALL INDEXED COLUMNS SIZE 1
• Removes histograms from indexed columns
• Leaves histograms and min/max values on
unindexed column values unchanged.
Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 21
s in PeopleSoft
%UpdateStats
• A good idea, but a blunt instrument.
• Replace it with custom PL/SQL package
– wrapper.ps_stats()
• If statistics locked suppress gather
• Doesn’t analyze Global Temporary Tables
• Refreshes stale statistics on partitioned tables
• Meta-data driven to control behaviour per record to override
defaults
– http://blog.psftdba.com/2009/06/controlling-how-updat
estats-collects.html

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 22


s in PeopleSoft
COBOL %UpdateStats
• Expansion hard coded in COBOL
– Still produces ANALYZE in PT8.49
– GP Cobol uses stored statements
STORE GPPSERVC_U_STATH
%UPDATESTATS(PS_GP_PYE_HIST_WRK);
– It is possible to recode statements to call wrapper
STORE GPPSERVC_U_STATH
BEGIN
wrapper.ps_stats(p_ownname=>user,p_tabname=>
'PS_GP_PYE_HIST_WRK'); END;;

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 23


s in PeopleSoft
PeopleSoft Temporary Records
• Permanent Database Tables used for
– Temporary Working Storage
• Application Engine automatically clears the data
out of the table.
– Truncate or Delete
• Neither invalidate statistics.
– Statistics will reflect the previous data
• No point including these tables in maintenance
window
– Statistics gathered will always be out of date.

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 24


s in PeopleSoft
Statistics on Temporary Records
• Delete and lock Statistics
– To omit them from schema/database-wide maintenance
activities
• Optimiser Dynamic Sampling
– So you don’t need to explicitly collect statistics
• But you will sometimes!
• %UpdateStats
– Or Process needs to refresh statistics
– Use wrapper.ps_stats() otherwise ORA-20005

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 25


s in PeopleSoft
Deleting Statistics in
Application Engine Programs

• Delete statistics on temporary tables on


allocation to Application Engine instance
– Trigger on PS_AETEMPTBLMGR
– Ensures that you don’t use statistics collected
by previous Application Engine
• With different process instance number
– http://blog.psftdba.com/2009/04/statistics-
management-for-peoplesoft.html

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 26


s in PeopleSoft
Oracle 10g Statistics Retention
• By default Oracle 10g retains a history of object
statistics for 31 days.
– Global setting, not per object or schema
– dbms_stats writes previous values to
SYS.WRI$_OPTSTAT%HISTORY tables
– Lots of concurrent batches with lots of %UpdateStats?
• you could have a locking problem 31 days after go live.
– http://blog.psftdba.com/2009/06/oracle-10g-statistics-
history-retention.html
– You might choose to disable statistics retention.

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 27


s in PeopleSoft
Statistics on Partitioned Tables

• Partitioning is a strategy that is usually


adopted for very large tables.
• Gathering Statistics can be time consuming
– Have to update partition-level statistics on
every partition
– Then calculate table-level statistics.

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 28


s in PeopleSoft
Time-based Partitioning
• Annual/Monthly/Weekly/Periodic partitions
– Historical Partitions haven’t changed
– Future partitions created in advance are still
empty
– GATHER STALE only available for
database/schema-wide operations
• http://blog.psftdba.com/2009/04/statistics-
management-for-partitioned.html
• Wrapper package can refresh stale partitions only.

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 29


s in PeopleSoft
Conclusion
• Whatever you decide to do about statistics,
make sure it’s your decision.
– Defaults may not be the right decision all of the
time.
• My suggestions in this presentation are all
in production on a large system.
• The wrapper package is my solution to
series of problems

Managing Cost-Based Optimiser Statistic ©2009 www.go-faster.co.uk 30


s in PeopleSoft
Questions?
Managing Cost-Based Optimiser
Statistics in PeopleSoft
David Kurtz
Go-Faster Consultancy Ltd.

david.kurtz@go-faster.co.uk
www.go-faster.co.uk

You might also like