Chapter 2. Statspack Advanced Tuning

At the beginning of February 2005, in the suburbs of Chicago, USA, I had the honor of meeting the famous Oracle database guru
Mr. Richard J. Niemiec, CEO of the consulting company The Ultimate Software Consultants (TUSC). He has profound expertise and
world-renowned achievements in database performance tuning and optimization. He is also a prolific author, and his best-known
book, "Oracle Performance Tuning Tips and Techniques", has for several years been the best-selling book in the performance
tuning category in the United States.
It had been my wish for many years to learn from the master in person. I seized the precious time to ask him about the problems
I was facing, which are also common questions among colleagues in China, concerning RAC, 10g, Oracle combined with Veritas, and
so on. His answers showed a deep understanding of Oracle products, a broad way of approaching specific problems, and rich
experience in combining theory with practice, and they let me appreciate the demeanor of a true master. When we parted, the
master asked me to convey his regards to colleagues in China and entrusted me with bringing his cherished article,
"Statspack Advanced Tuning", to you.
On careful reading, I found that the article has both depth and breadth. In just a few pages it covers most of what arises in
actual work, and I felt I had come to it too late. I therefore read it carefully, translated it carefully, and organized it
carefully. After discussing it with Richard across the ocean and obtaining his consent, I rewrote the parts of this chapter on
the basic concepts and workings of latches.
I hope all colleagues will like this translation and apply it in their work. That would be the best reward and comfort for the
master, and the happiest outcome for me as his "messenger"!

Translator Lu Xueyong

The following is Mr. Richard J. Niemiec's evaluation of this translation:


“I want to give a heartfelt thanks to Lu Xueyong for translating this article. His ability in
Oracle has taken the original article and made it even better. His tremendous Oracle skills are
only exceeded by his character and dedication to helping share the knowledge in China. Thank
you Lu for your enormous contribution and dedication to character! TUSC has always been a
company in the US that has been dedicated to sharing the knowledge as well as a dedication to
doing work with character, I've been lucky to work with Lu who is dedicated to the same in
China!”

24 Oracle Database Performance Optimization

2.1 Statspack Advanced Tuning (Translation)

In Oracle 9i (and 8.1.6), if I could pick my favorite tools for system monitoring and for locating performance problems, but
were limited to two at most, my choices would be Statspack and Enterprise Manager.
Statspack replaces the UTLBSTAT/UTLESTAT scripts provided in previous versions of Oracle and extends them considerably. In the
following text, I focus on solutions to some difficult problems encountered when tuning waits with Statspack. These topics took
me a lot of effort to write up, and beginners may find them difficult and confusing to read.
I therefore suggest that interested beginners start by reading Oracle's related documentation.

2.1.1 Top 5 Wait Events

When you are trying to find and eliminate system bottlenecks quickly, Top 5 Wait Events is probably the section of the report
that reveals the most about the problem. This part of the report lists the top 5 wait events, all wait events, and the
background wait events; identifying the main wait events often helps to direct urgent tuning work. If TIMED_STATISTICS is set
to true, the wait events are sorted by the length of the wait time; if not, they are sorted by the number of waits.
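The ordering rule just described can be sketched in a few lines of Python. This is only an illustration of the rule, not Statspack's own code: the function name `top_wait_events`, the dictionary keys, and the sample figures are all hypothetical.

```python
def top_wait_events(events, timed_statistics, limit=5):
    """Return the leading wait events, ordered as described in the text.

    events: list of dicts with the hypothetical keys "event", "waits", "time_cs".
    timed_statistics: True orders by wait time, False by number of waits.
    """
    sort_key = "time_cs" if timed_statistics else "waits"
    return sorted(events, key=lambda e: e[sort_key], reverse=True)[:limit]


# Made-up sample figures, for illustration only.
sample = [
    {"event": "db file sequential read", "waits": 900, "time_cs": 4000},
    {"event": "latch free", "waits": 1500, "time_cs": 2500},
    {"event": "log file sync", "waits": 300, "time_cs": 5200},
]

by_time = [e["event"] for e in top_wait_events(sample, timed_statistics=True)]
by_count = [e["event"] for e in top_wait_events(sample, timed_statistics=False)]
```

With these sample figures the two orderings differ, which is why it matters to know whether TIMED_STATISTICS was enabled when the snapshots were taken.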
The listing below shows many waits for single-block reads (DB File Sequential Read) and related waits for latches (Latch Free),
together with serious waits for data file writes and log file writes, as well as log file contention. To locate and resolve
the major problems, it is often necessary to study carefully the more detailed information provided in other sections of the
Statspack report.

Top 5 Wait Events
~~~~~~~~~~~~~~~~~
                                                  Wait     % Total
Event                              Waits       Time (cs)   Wt Time
------------------ -------------------------- -----------

Some of the most common problems, their explanations, and their usual solutions are listed below.

DB File Scattered Read

This usually indicates waits associated with full table scans. The blocks read by a full table scan are generally scattered
into various parts of the buffer cache rather than placed contiguously. If this value is large, it may indicate missing or
suppressed indexes. In many cases a full table scan is actually preferable, because it is often more efficient than an index
scan, so when you see these waits, first confirm whether a full table scan is really needed. Small tables can be cached
entirely in the buffer cache to avoid reading them from disk repeatedly.

DB File Sequential Read

This is a metric related to single-block reads, such as index reads. A very large value suggests an inefficient table join
order or a poorly selective index scan. In a well-tuned, transaction-heavy system this value is usually large anyway, so it is
best to study this wait in relation to other known issues that appear in the Statspack report, such as inefficient SQL: for
example, check whether the index scans are necessary and check the join order in multi-table joins. The value of the parameter
DB_CAC… is a determining factor in how often this wait occurs. Problems caused by hash joins show up in PGA memory: they
"devour" a large amount of memory while causing severe sequential read waits or direct read/write waits.

Free Buffer Waits

This entry indicates that the system is waiting because no buffer is currently available in memory. When all SQL statements
are already well tuned, this wait tells you to increase the buffer cache (DB_CACHE_SIZE). The wait may also be caused by SQL
with very poor selectivity: the buffer cache fills with a large number of index blocks, leaving no free buffers for other
statements. A large amount of DML (insert, delete or update), or DBWR not writing fast enough, may also cause this wait; the
buffer cache can then fill with many copies of the same blocks, which is seriously inefficient. To address the problem,
consider speeding up incremental checkpoints, using more DBWR processes, or increasing the number of physical disks.

Buffer Busy Wait

These waits occur because a buffer is being used in a way that cannot be shared, or because the buffer is being read into the
buffer cache by another session. Buffer busy waits should not exceed 1%. Check the buffer wait section of the Statspack report
(or query the v$waitstat view) to see whether the waits are on a segment header. If so, increase the freelist groups for the
segment or widen the gap between PCTFREE and PCTUSED. If the waits are on an undo header, increase the rollback segments; if
they are on an undo block, consider reducing the data density of the table that drives these consistent reads, or increase
DB_CACHE_SIZE. If the waits are on a data block, move data from that block to other blocks to avoid the "hot spot", increase
the freelists of the table, or use locally managed tablespaces (LMTs). If the waits are on an index block, the index can be
rebuilt, partitioned, or converted to a reverse-key index. To avoid buffer busy waits on data blocks, a smaller block size can
also be used: each block then holds fewer rows, so it is no longer as "hot" as before.

When DML (insert, update or delete) is executed, Oracle writes information into the data block, including the transactions
that are "interested" in the state of the block (the Interested Transaction List, or ITL). To reduce waits in this area,
increase the initrans parameter to reserve space in the block for more ITL slots; you can also increase PCTFREE on the table so
that, when the slots reserved by initrans are exhausted, more ITL slots (up to the maxtrans limit) can be allocated
dynamically.

Latch Free

Latches are low-level synchronization locks that maintain the order of certain access and execution operations. The Oracle
server controls concurrent use of objects such as redo threads, tables, and transactions through enqueues, and protects shared
memory structures in the System Global Area (SGA) through latches. Latches are fast and cheap, and are often implemented as a
single memory location. Most latches are exclusive: they grant write access to a single process. Shared latches allow
concurrent read operations on a memory structure. When a latch is requested but is held by another process, a latch free miss
is recorded. Most latch problems are related to not using bind variables (library cache latch), redo generation (redo
allocation latch), buffer cache contention (cache buffers LRU chain), and "hot" blocks in the buffer cache (cache buffers
chains). Some latch waits are caused by software bugs; if you suspect that you have hit such a bug, check the bug reports on
MetaLink (oracle.com/support). A latch miss rate above 0.5% should be studied carefully. I plan to write a detailed article on
latch waits in an upcoming issue of Oracle Magazine, because this topic really needs an article of its own to be explained
clearly.

Enqueue

An enqueue is a lock that protects a shared resource. Locks protect shared resources such as the data in a record, preventing
two people from modifying the record at the same time. The enqueue structure includes a first-in, first-out (FIFO) queuing
mechanism, which differs from the non-FIFO mechanism of Oracle latches prior to version 9i. Enqueue waits are usually for the
ST enqueue, the HW enqueue and the TX4 enqueue.

The ST enqueue is used in space allocation and management for dictionary-managed tablespaces. When dictionary-managed
tablespaces keep causing problems, you can switch to locally managed tablespaces (LMT), preallocate extents, or at least make
the next extent larger. The HW enqueue is used with the high-water mark of a segment; manually allocating extents avoids waits
on it.

TX4 is the most common of the enqueue waits, and it is usually the result of one of the following three problems. The first is
a duplicate in a unique index; a commit or rollback is needed to release the enqueue. The second is multiple concurrent updates
to the same bitmap index fragment. Because a single fragment may contain multiple rowids, when several users try to update the
same fragment a commit or rollback is needed to release the enqueue. The third, and the most likely, is multiple users updating
the same data block at the same time; if no ITL slot is free, a block-level lock results. This problem can be solved by
increasing initrans or maxtrans to accommodate more ITL slots; increasing PCTFREE on the table can also solve it.

Finally, TM locks are table-level locks. If you have foreign keys, be sure to index them to avoid this common locking problem.

Log Buffer Space

This wait occurs when redo is written into the log buffer faster than LGWR can write the contents of the log buffer to the
redo log files, or when log switches are too slow. To solve the problem, increase the size of the redo log files, increase the
log buffer, or use faster disks.

Log File Switch

All commit requests wait on "log file switch (archiving needed)" or "log file switch (checkpoint incomplete)". When this
happens, confirm whether the disk used for archiving is full or too slow. DBWR may be too slow because of I/O, and you may have
to add more or larger redo logs. If DBWR itself is the problem, consider increasing the number of database writer processes.

Log File Sync

When a user commits or rolls back, LGWR flushes the session's redo from the log buffer to the redo log file, and the user
process must wait for the write to complete successfully. To shorten the wait, commit more records at a time (for example,
commit in batches of 50 instead of after every record). Consider putting the redo logs on fast disks, or alternating redo logs
across different physical disks, to reduce the impact of archiving on LGWR. Try not to use RAID 5, because RAID 5 is too slow
for write-heavy applications. Consider file system direct I/O or raw devices; raw devices are very fast for writes.

Idle Events

At the bottom of the wait list you will see some idle wait events, which you can ignore. Idle events are usually listed at the
bottom of each section, such as SQL*Net messages to/from the client and other background timing events. Idle events are stored
in the stats$idle_event table.
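As a rough sketch of what "ignoring" idle events means when you post-process a wait list yourself, the following Python fragment filters them out. The set below is only a small sample of idle event names; on a real system the list comes from the stats$idle_event table, and the helper name `drop_idle_events` is hypothetical.

```python
# A small sample of idle event names; the authoritative list is stored in
# the stats$idle_event table of the Statspack schema.
IDLE_EVENTS = {
    "SQL*Net message from client",
    "SQL*Net message to client",
    "dispatcher timer",
    "pipe get",
}


def drop_idle_events(event_names, idle_events=IDLE_EVENTS):
    """Keep only the events that are not in the idle list, preserving order."""
    return [name for name in event_names if name not in idle_events]


# Hypothetical wait list, in report order.
observed = [
    "db file sequential read",
    "SQL*Net message from client",
    "latch free",
    "pipe get",
]
worth_studying = drop_idle_events(observed)
```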

2.1.2 Wait Time Quick Reference

The waiting problems and possible solutions are summarized in Table 2-1.

Table 2-1 Waiting problems and possible solutions

Wait problem        Possible solution
Sequential Read     Indicates many index reads: tune the code (especially the table joins)
Scattered Read      Indicates many full table scans: tune the code, cache small tables in memory
Free Buffer         Increase DB_CACHE_SIZE, speed up checkpoints, tune the code
Buffer Busy         Segment header: add freelists or freelist groups
Buffer Busy         Data block: separate "hot" data, use reverse-key indexes, use smaller blocks
Buffer Busy         Data block: increase initrans and maxtrans
Buffer Busy         Undo header: increase the rollback segments
Buffer Busy         Undo block: commit more often, increase the rollback segments
Latch Free          Study the latch details (see below)
Enqueue - ST        Use locally managed tablespaces or preallocate large extents
Enqueue - HW        Preallocate extents above the high-water mark
Enqueue - TX4       Increase initrans and maxtrans on the table or index
Enqueue - TM        Index foreign keys; check table locking in the application
Log Buffer Space    Larger log buffer; redo logs on fast disks
Log File Switch     Archive device too slow or too full; add or enlarge redo logs
Log File Sync       Commit more records at a time; faster redo log disks; raw devices
Idle Events         Ignore

Common idle events include the following: dispatcher timer (shared server idle event); lock manager wait for remote message (RAC idle event); pipe get (user
2.2 About Latch

What a latch is, how it is used, and the problems encountered with it were introduced in the Latch Free part above and are not
repeated here.
There are two basic ways to request a latch: "Willing-To-Wait" and "No-Wait". In the former, if the requested latch is busy,
the process that requested it, say process A, requests the latch again after a short while, looping up to the number of times
set by _SPIN_COUNT. On a single-CPU system, _SPIN_COUNT is set to 1: with no other CPU available to run the process that holds
the latch so that it can release it, spinning would achieve nothing except consuming CPU. With more than one CPU, the default
value of _SPIN_COUNT is 2000.
If the latch is still held by another process after the whole spin loop, process A gives up the CPU and goes to sleep; each
sleep corresponds to one wait in the Latch Free wait event. The parameter _MAX_EXPONENTIAL_SLEEP determines the maximum time
process A sleeps before it is allowed to request the latch again. The default value of this parameter is 200 centiseconds
(1 centisecond = 1/100 second). The first sleep lasts 1 centisecond and the second 2 centiseconds; from the second sleep
onward, the sleep time doubles after every 2 sleeps, giving the pattern 1, 2, 2, 4, 4, 8, 8, ... until the upper limit set by
_MAX_EXPONENTIAL_SLEEP is reached.
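The sleep pattern described above can be reproduced with a short Python sketch. It simply encodes the sequence as stated in this chapter (1, 2, 2, 4, 4, 8, 8, ... capped at the limit); the function name `latch_sleep_schedule` is hypothetical, and the code is not a description of Oracle's internal implementation.

```python
def latch_sleep_schedule(sleeps, max_sleep_cs=200):
    """Sleep times in centiseconds for the first `sleeps` sleeps.

    Follows the pattern given in the text: 1, 2, 2, 4, 4, 8, 8, ...
    with each value capped at max_sleep_cs (the _MAX_EXPONENTIAL_SLEEP limit).
    """
    schedule = []
    for i in range(sleeps):
        # i = 0 -> 2**0; i = 1, 2 -> 2**1; i = 3, 4 -> 2**2; and so on.
        exponent = (i + 1) // 2
        schedule.append(min(2 ** exponent, max_sleep_cs))
    return schedule


first_seven = latch_sleep_schedule(7)
total_cs = sum(first_seven)  # total time slept after seven sleeps
```

Under this pattern the sleep time would reach 128 centiseconds at the fourteenth sleep, and the cap of 200 centiseconds applies from the sixteenth sleep onward.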
There are two cases in which a process, say process B, can request a latch in "No-Wait" mode. Here we introduce only the
simpler one. Take the redo copy latch as an example. Because there is more than one latch of this type, process B starts in
"No-Wait" mode when it requests a redo copy latch: if one latch is already held, process B does not wait and request it again,
but immediately requests another one. Only after all latches of this type have been requested without success does the server
process pick one of them and make process B wait for it. "No-Wait" latch requests are reported in the immediate_gets and
immediate_misses columns of the v$latch view and in the latch-related sections of the Statspack report.
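The "No-Wait" request pattern just described can be imitated with ordinary Python locks. This is an analogy only: `threading.Lock` stands in for a latch, the function name `acquire_any` is hypothetical, and the choice of which latch to wait on after every immediate attempt has failed (here simply the first one) is an assumption, since the text does not say how the server picks it.

```python
import threading


def acquire_any(latches, fallback_index=0):
    """Try each latch without waiting; wait on one only if all are busy.

    Returns (index_of_latch_obtained, had_to_wait).
    """
    for index, latch in enumerate(latches):
        # "No-Wait" attempt: returns immediately, True only on success.
        if latch.acquire(blocking=False):
            return index, False
    # Every immediate attempt missed: fall back to waiting on one latch.
    latches[fallback_index].acquire()
    return fallback_index, True


# Three latches of the same type; the first is already held by someone else.
latches = [threading.Lock() for _ in range(3)]
latches[0].acquire()
obtained, waited = acquire_any(latches)
```

In this run the first latch is busy, so the caller immediately obtains the second one without waiting, which corresponds to one immediate miss followed by one immediate get.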
After you have identified latch-related wait events that may be a problem, the Latch Activity section of the Statspack report
provides more specific information about those latches. Get Requests, Pct Get Miss and Avg Slps/Miss (average sleeps per miss)
are statistics for "Willing-To-Wait" latch requests, while NoWait Requests and Pct NoWait Miss are for "No-Wait" latch
requests. Pct Miss for both kinds of latch request should be close to 0.0.
The Latch Sleep Breakdown section of the Statspack report gives further detail: whether the processes requesting these latches
are spinning or sleeping.
When analyzing latch problems, the v$latch view is very helpful, as are the v$latchholder, v$latchname and v$latch_children
views.
Look at the latch sections of the Statspack report described above, or query the v$latch view, to see how many processes had
to wait (latch miss) or sleep (latch sleep) and how many times they had to sleep. Below is part of the Latch Activity section
of a Statspack report in which the library cache latch clearly has a problem (the library cache latch is an example of a
"Willing-To-Wait" latch):

                                           Pct    Avg   Wait                 Pct
                                    Get    Get   Slps   Time       NoWait NoWait
Latch                          Requests   Miss  /Miss    (s)     Requests   Miss
------------------------ -------------- ------ ------ ------ ------------ ------
KCL freelist latch                4,924    0.0                          0
cache buffer handles            968,992    0.0    0.0                   0
cache buffer chains         761,708,539    0.0    0.4          21,519,841    0.0
cache buffers lru chain       8,111,269    0.1    0.8          19,834,466    0.1
library cache                67,602,665    2.2    2.0             213,590    0.8
redo allocation              12,446,986     0.    0.0                   0
redo copy                           320    0.0                 10,335,430    0.1
user lock                         1,973    0.3    1.2                   0
When the "Latch Free" item in the wait events section of a Statspack report shows a high value, the latch sections of the
report will show which latches need to be investigated. The following material will help you review these latch problems.

The library cache latch serializes access to objects in the library cache; it is used whenever SQL or PL/SQL stored
procedures, packages, functions, and triggers are executed. This latch is also used heavily during parse operations. Oracle 8i
relies on a single shared pool latch to protect memory allocation in the library cache, but since 9i there are 7 child latches.
When the shared pool is too small or SQL statements cannot be reused, contention occurs for the "shared pool", "library cache
pin", or "library cache" latches. When SQL statements cannot be reused because bind variables are not used, similar but not
identical statements appear everywhere in the shared pool, and increasing the size of the pool only makes the latch problem
worse. You can set the initialization parameter CURSOR_SHARING=FORCE (or SIMILAR in 9i) to reduce the problems caused by not
using bind variables. However, shared pool and library cache latch problems also occur when the shared pool is set too small
for the number of SQL statements to be processed: to load a SQL or PL/SQL statement into memory, part of the space must first
be freed, and that operation holds the latch exclusively, making other users wait. This contention can be reduced by
increasing the shared pool, or by pinning large SQL or PL/SQL objects in memory with the DBMS_SHARED_POOL.KEEP procedure.

Redo Copy

The number of redo copy latches defaults to 2*CPU_COUNT, but it can be reset with the _LOG_SIMULTANEOUS_COPIES initialization
parameter. Increasing this parameter can help to reduce redo copy latch contention. The redo copy latch is used to copy redo
records from the PGA into the redo log buffer.

Redo Allocation

Contention for the redo allocation latch (which allocates space in the redo log buffer) can be reduced by using the NOLOGGING
option, which reduces the load on the log buffer. Unnecessary commits should also be avoided.

Row Cache Objects

"Row cache objects" latch contention usually means contention in the data dictionary; it may also be a symptom of excessive
parsing of SQL statements that rely on public synonyms. Increasing the size of the shared pool usually resolves the issue. The
same method is often used to solve library cache latch problems, and once those are solved, "row cache objects" latch
contention is usually not a problem at all.

Cache Buffers Chains

The cache buffers chains latch is required when scanning the buffers of the buffer cache in the SGA. "Hot" blocks in the buffer
cache (blocks that are accessed very frequently) cause cache buffers chains latch problems; hot blocks may also be a symptom of
poorly tuned SQL statements. A "hot" row creates a "hot" block, which causes problems for the other rows in the block and for
other blocks that hash to the same chain. To locate a hot block, query the v$latch_children view to determine the address, and
join this view with the x$bh view to determine all the data blocks protected by this latch (this identifies all the data blocks
affected by the hot block). With the file# and dbablk found in x$bh, query dba_extents to determine the affected objects. If
the hot block is in an index, a reverse-key index can move sequentially inserted entries into other blocks, so that they are
no longer concentrated in a series of hot blocks. If the hot block is the root block of the index, a reverse-key index will
not help. Setting _DB_BLOCK_HASH_BUCKETS to the smallest prime number greater than twice the number of buffers
(DB_CACHE_SIZE/DB_BLOCK_SIZE) will usually eliminate this problem. Before Oracle 9i, the default value of this parameter caused
very serious contention for this latch. In Oracle 9i, the parameter is correctly set to a prime number.
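The arithmetic behind this sizing rule is easy to check by hand, or with a few lines of Python. The sketch below only computes the number suggested by the rule in the text; the function names are hypothetical, and whether an underscore parameter should be changed at all on a given system is a separate decision.

```python
def is_prime(n):
    """Trial-division primality test; adequate for values of this size."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def suggested_hash_buckets(db_cache_size, db_block_size):
    """Smallest prime greater than twice the number of buffers."""
    buffers = db_cache_size // db_block_size
    candidate = 2 * buffers + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate


# Example: a 64 MB buffer cache with 8 KB blocks holds 8192 buffers.
buckets = suggested_hash_buckets(64 * 1024 * 1024, 8192)
```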

Cache Buffers LRU Chain

The cache buffers LRU chain latch is used when scanning the LRU (least recently used) chain of all the buffers in the buffer
cache. A buffer cache that is too small, excessive buffer cache throughput, too many in-memory sort operations, and DBWR
falling behind the workload may all cause serious contention for the cache buffers LRU chain latch: tune the queries that cause
excessive logical reads! You can increase the initialization parameter DB_BLOCK_LRU_LATCHES to get multiple LRU latches and so
reduce contention. Generally speaking, non-SMP (symmetric multiprocessor) systems need only a single LRU latch, while on SMP
systems Oracle automatically sets the number of LRU latches to half the number of CPUs. Each database writer process must be
allocated at least one LRU latch. If you need to increase the number of database writer processes, do not forget to increase
the number of LRU latches as well.
The Latch problem and its possible solutions are summarized in Table 2-2.

Table 2-2 Latch problems and possible solutions

Latch problem              Possible solution
Library Cache              Use bind variables; adjust SHARED_POOL_SIZE
Shared Pool                Use bind variables; adjust SHARED_POOL_SIZE
Redo Allocation            Minimize redo generation and avoid unnecessary commits
Redo Copy                  Increase _LOG_SIMULTANEOUS_COPIES
Row Cache Objects          Increase the shared pool
Cache Buffers Chain        Increase _DB_BLOCK_HASH_BUCKETS or set it to a prime number
Cache Buffers LRU Chain    Set DB_BLOCK_LRU_LATCHES or use multiple buffer pools

Any latch with a hit ratio below 99% should be examined carefully. In the past, some latch problems were caused by software
bugs, so remember to check MetaLink for latch-related issues.
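The 99% rule can be applied mechanically once the gets and misses of each latch are known (the v$latch view exposes both as columns). The following Python sketch shows the calculation; the function names are hypothetical, and the miss counts are made-up figures chosen only to approximate the percentages in the sample report above.

```python
def latch_hit_ratio(gets, misses):
    """Hit ratio in percent; a latch that was never requested counts as 100."""
    if gets == 0:
        return 100.0
    return 100.0 * (gets - misses) / gets


def latches_to_check(stats, threshold=99.0):
    """Names of latches whose hit ratio falls below the threshold."""
    return [
        name
        for name, (gets, misses) in stats.items()
        if latch_hit_ratio(gets, misses) < threshold
    ]


# (gets, misses) per latch; the miss counts are illustrative only.
sample_stats = {
    "library cache": (67_602_665, 1_487_259),
    "cache buffers lru chain": (8_111_269, 8_111),
    "redo copy": (320, 0),
}
suspects = latches_to_check(sample_stats)
```

With these figures only the library cache latch falls below the threshold, which matches the reading of the sample report given earlier.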

This chapter has described some of the most common latch problems in detail, including the cache buffers chains, redo copy,
library cache, and cache buffers LRU chain latches.

Reference Information
1. Steve Adams. Oracle 8i Internal Services for Waits, Latches, Locks, and Memory. O'Reilly UK, 2003
2. Oracle Doc ID: 61998.1, 39017.1
3. Connie Dialeris, Graham Wood. Performance Tuning with Statspack White Paper, 2000
4. Notes from Richard Powell, Cecilia Gervasio, Russell Green and Patrick Tearle
5. Kevin Loney, Randy Swanson, Bob Yingst. Statspack checklist, 2002
6. Rich Niemiec. IOUG Masters Tuning Class, 2002
7. Richard J. Niemiec. Oracle Performance Tuning Tips and Techniques. McGraw-Hill, 1999
8. Richard J. Niemiec. Oracle 9i Performance Tuning Tips and Techniques. McGraw-Hill, 2003


About the Author


Richard J. Niemiec, CEO of TUSC (The Ultimate Software Consultants), author of "Oracle
Performance Tuning Tips and Techniques" book.

Translator profile
Lu Xueyong studied combinatorial algorithms in the United States in the 1980s and worked on relational databases at AT&T Bell
Labs, the Chicago Futures Trading Center and other places. Since 1996 he has been a full-time Oracle DBA/developer and later a
senior consultant. In 2001 he returned to work in China and is currently a senior consultant at UT Starcom.
