1.4 Informix Performance Tuning and Troubleshooting

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 88

Informix 11.

7 Bootcamp
Informix Performance Tuning and Troubleshooting
Information Management Technology Ecosystems

© 2010 IBM Corporation


Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
2 © 2010 IBM Corporation
Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
3 © 2010 IBM Corporation
Introduction

• Types of performance problems


• The database server seems unresponsive

• Users are experiencing long processing times

• Performance for a particular query has suddenly slowed

• The Message Log is showing long checkpoint durations

• An Informix utility is processing very slowly

• New connections are slow

4 © 2010 IBM Corporation


Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
5 © 2010 IBM Corporation
Know Your Application

• Important to have an understanding of:


• Business needs
• The supported Database System

• MUST understand:
• User and application activities
• Processing times of users and applications
• Data layout
• Data access
• Business impact of outages
• Frequency of application changes

• Recommendation
• Be involved!!!!

6 © 2010 IBM Corporation


Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
7 © 2010 IBM Corporation
Know Your Environment - OLTP

• OLTP Characteristics
• Small number of rows returned
• Quick response times of queries
• High read and write buffer cache rates
• Short checkpoint durations
• Maximum I/O throughput
• Known / limited “windows” for periodic tasks
• Short “fast recovery” times

• Recommendation
• Important to spend time in areas that will have the greatest
impact on performance – INDEXES
• Understand how the data is searched and create the correct
indexes
8 © 2010 IBM Corporation
Know Your Environment - DSS

• DSS Characteristics
• Large number of rows processed and returned
• Summations/Aggregations common – Large Temp Table Usage
• Optimum memory utilization
• Parallel data queries (PDQ)
• Light scans
• Maximum I/O throughput
• Tend to be queried by front-end tools
• Table Fragmentation
• PDQPRIORITY

• Important to focus on specific areas that will have the greatest


impact on performance – MEMORY
• Optimum memory utilization will have the greatest impact

9 © 2010 IBM Corporation


Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
10 © 2010 IBM Corporation
Be Prepared!

• Understand how the system is supposed to work!

• Properly diagnosing a performance problem will require data


on system behavior

• Capture the relevant server configuration and profile data


when all is well
• Also useful for trending analysis as the system grows

• Having relevant baseline data available makes problem


diagnosis easier and faster

11 © 2010 IBM Corporation


Create a Base Line BEFORE Problems Develop!

Informix Utilities (ONSTAT, ONCHECK, etc )


Run periodically
SMI Tables
Capture
Database Scheduler

UNIX Utilities

Time IO/Sec User CPU% ReadCache


9/1/2002 8:00 41 0.23 99.6
9/1/2002 9:00 63.3 0.41 98.2
9/1/2002 10:00 92.34 0.61 98.3
9/1/2002 11:00 102.2 0.63 98.1
9/1/2002 12:00 50.32 0.55 97.3
9/1/2002 13:00 100.23 0.71 98.1
9/1/2002 14:00 104.2 0.72 98.5
Analysis
Try to have several weeks of data before you need it so that trends and
cycles can be understood.
12 © 2010 IBM Corporation
Benchmark - Collect Performance Information

• Save a history of performance data

• Collect at both peak and average times periodically

• COLLECT:
• Application throughput / response times
• EXPLAIN plans and response times of frequently executed
application queries
• onstats
• Enable ALL configuration statistics and capture relevant data
• onchecks
• Print information on reserved pages, extents and large tables
• Operating system data
• vmstat, ps, iostat, sar, truss, …

13 © 2010 IBM Corporation


Benchmark - Collect Configuration Information

• Save a history of configuration changes

• Useful for answering questions related to what has changed

• COLLECT:
• Instance configuration files
• $ONCONFIG, $INFORMIXSQLHOSTS, adtcfg, sm_versions,
termcap, ixbar.servernum
• Informix environment variables
• Table definitions with storage information using dbschema -ss
• Disk configuration using onstat –d
• Instance and database layout using oncheck –pe
• System information (/etc/hosts, /etc/hosts.equiv, …)
14 © 2010 IBM Corporation
Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
15 © 2010 IBM Corporation
Informix Utilities

• Informix includes many utilities to diagnose performance


problems
• onstat
• System Monitoring Interface (SMI)
• oncheck
• onlog
• Database Scheduler
• OAT
• EXPLAIN outputs
• SQLTRACE
• Operating System Utilities

16 16 © 2010 IBM Corporation


Operating System Utilities
• ipcs (information about • vmstat
Informix shared memory) • Paging/page outs
• Swapping
• Prints information on active inter-
process communication facilities • Page scan rates
• Shared memory segments • Free memory
• Semaphores • time or timex
• Message queue
• Times the running of a process in real
• sar time, user time and system CPU time

• Monitors cpu utilization • iostat


• Reports on disk activity • Provides data about the I/O device
usage of the system
• Reports on memory allocation • Measures and reports
• disk throughput
• ps • how many Kbytes are being transferred
per second
• Process state • number of seeks per second
• Priority of the process • percentage of disk utilization
• Number of physical reads or writes per
• Memory utilization second

17 © 2010 IBM Corporation


Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
18 © 2010 IBM Corporation
Analyzing a Performance Problem – First Steps

• The most basic question that should be asked first is:


Has anything changed on the server?
• Often the trigger of a performance issue is a change in the
configuration or on the load of the system
• Database Server changes?
• instance configuration, environment variables, schema changes, etc.
• Operating system changes?
• OS release, patch, configuration
• Hardware configuration changes?
• Workload changes?
• More data?

• If you can identify the change at this stage can it / should it be


undone
19 © 2010 IBM Corporation
Analyzing a Performance Problem – First Steps

• The second most basic question that should be asked is:


Is there something new/unusual running on the Server?

• Check for the following:


• More than the usual queries
• More transactions
• Larger queries
• Other non-database workloads on the system?
• More/new users/applications than normal
• onstat –u and onstat –g ses
• A utility that should not be running at this time
• Check onstat –g ath to see if there are threads relating to
• Backups and Restores (arcbackup, ontape)
• Loads (onpload)
• Check onstat –g sql for any Update Statistics sessions

20 © 2010 IBM Corporation


Analyzing a Performance Problem – First Steps
• What are the values of some iscommit The number of times OnLine performed a
commit
key server statistics? The number of times a transaction was rolled
isrollbk
back
• onstat -p
ovlock Increments when the value of LOCKS is
database sysmaster; exceeded .
select name, value from sysprofile ovuser Increments when the maximum number of
where name in userthreads is exceeded relative to the setting
( "iscommits", "isrollbacks", of NETTYPE.
"ovlock", "ovuser", “ovbuff” ovbuff The number of times a request was made for a
"latchwts", "buffwts", "lockwts", "ckptwts", buffer in the buffer pool but none was available.
"deadlks", "lktouts",
"fgwrites", "lruwrites", "chunkwrites" ) latchkwts Increments when a userthread must wait to
acquire a latch
fgwrites Foreground write - caused when a buffwts Increments when a userthread must wait to
session needs to have a page from disk acquire a buffer
placed into the buffer pool, but there are
no clean/free buffers available. lockwts Increments when a userthread must wait to
acquire a lock.
lruwrites LRU writes - background writes that
typically occur when the percentage of ckptwts The number of times a thread has had to wait
dirty buffers exceeds the percent that is for a checkpoint to complete before continuing.
specified for lru_max_dirty in the deadlks The number of potential deadlocks situations
BUFFERPOOL configuration parameter. that have been encountered
chunkwrites Chunk writes - commonly performed by lktouts The number of times users experienced Lock
page-cleaner threads during a checkpoint Timeouts

21 © 2010 IBM Corporation


Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
22 © 2010 IBM Corporation
Analyzing a Performance Problem – Disk Layer

Can be seen
as a disk
problem
Resource
Bottleneck Basic Symptoms
Memory • Swapping reported in vmstat
• Activity on swap disks seen in iostat

Resource
Bottleneck Basic Symptoms
Disk • I/O wait seen in vmstat / iostat / perfmon / top
• Disk > 80% busy seen in iostat
• Long device queues for I/O threads as seen in onstat –g ioq
• Low-mid CPU usage seen in vmstat / perfmon
• Which device?? If all else fails, on smaller systems watch the
disk lights!

23 © 2010 IBM Corporation


Is There a Table That is Very Heavily Utilized?
• What tables have the most I/O?
• onstat –t (currently active)

database sysmaster; database sysmaster;


select select tabname[1,8] table,
dbsname, sum(isreads) reads,
tabname, sum(iswrites) writes,
DBINFO ( 'dbspace', partnum ), sum(isrewrites) updates,
lockreqs, lockwts, deadlks, lktouts, sum(isdeletes) deletes
isreads, iswrites, isrewrites, isdeletes, from sysptprof
bufreads, bufwrites, pagreads, where tabname not like "sys%"
pagwrites group by tabname
from sysptprof order by reads desc,
order by isreads desc; writes desc,
updates desc,
-- change this sort to whatever you need to monitor. deletes desc;

• Heavily utilized tables can cause disk contention


• Is the data for the table fragmented on disk?
• oncheck -pt
24 © 2010 IBM Corporation
Is There a Table With Too Many Extents?
• How many extents does each database table have?
• oncheck -pe
database sysmaster;
select
dbsname,
tabname,
count(*) num_of_extents,
sum( pe_size ) total_size
from systabnames, sysptnext
where partnum = pe_partnum
group by 1, 2
order by 3 desc, 4 desc;

• An abnormal (> 50) number of extents can slow queries


• As a result of excessive disk head movement to find and go to
the many extents
• Consider rebuilding or running the defragment task on the table

25 © 2010 IBM Corporation


Is There an Index with too Many Levels?
• How many levels does each Index have?
• oncheck –pT database:table

database sysmaster;
select idxname,
levels
from sysindexes
order by 2 desc;

• The number of index levels may also adversely effects performance


• The more index levels, the more probes Informix needs to get to index leaf
nodes
• Furthermore, if a leaf node gets split or merged, it may take more time for
the whole index to adjust to this change
• If any index has more than 4 levels, consider dropping and rebuilding it to
consolidate its levels or enabling Index Compression and/or adjusting the
Alice settings for better performance
26 © 2010 IBM Corporation
Are many Indexes are NOT Unique?
• How Unique are the Indexes? database sysmaster;
select tabname, idxname,
nrows, nunique
from systables t, sysindexes I
where t.tabid =i.tabid
and t.tabid > 99
and nrows > 0
and nunique > 0

• The higher the nunique percentage, the more unique the index is!
• A highly duplicate index can severely impact performance for
updates and deletes
• Usually seen when there are many deletes or updates of the key values
• Informix must search through all the duplicates until it finds the correct key
to delete or update
• Consider replacing the original index with a composite index that combines
the highly duplicate column and a more unique column
• Composite indexes can also take advantage of Index Self-joins
27 © 2010 IBM Corporation
Are there too many Sequential Scans?
• How many sequential scans are being performed – overall or against
certain tables?
database sysmaster;
select dbsname,
• onstat –g ppf (per table) tabname,
sum(seqscans) total_scans
• onstat –p (instance level) from sysptprof
where seqscans > 0 AND
dbsname not like "sys%"
group by 1, 2
order by 3 desc

• Missing indexes or under-utilized asynchronous data reads can


severely affect performance
• What is the percentage of queries that involve a sequential
scan?
• SeqScanRatio = (seqscans / isam starts) * 100.00
• High buffer waits may also indicate the difference between
RA_PAGES and RA_THRESHOLD is too small
ReadAheadpct = ((RA-pgsused / (idxa-ra + idx-ra + da-ra) ) *100)
28 © 2010 IBM Corporation
Is there excessive I/O in one or more Chunks?
database sysmaster;
• What Chunks have the select name
most I/O? dbspace,
chknum, " Primary " chktype,
• onstat –D fname[15,25] path_name,
• onstat –g iof reads, writes,
pagesread, pageswritten
from syschktab c, sysdbstab d
where c.dbsnum = d.dbsnum
• If the data is not spread union all
evenly across dbspaces select
and chunks, each on name dbspace,
different devices, chknum, "Mirror" chktype,
performance can suffer due fname[15,25] path_name,
reads, writes,
to I/O waits pagesread, pageswritten
from sysmchktab c, sysdbstab d
where c.dbsnum = d.dbsnum
order by 1,2,3;

29 © 2010 IBM Corporation


Are there excessive Disk Page Reads?

• What percent of I/O is from buffers?


• onstat -p databse sysmaster;
-- Get % read cached
select dr.value dskreads, br.value bufreads,
round ((( 1 - ( dr.value / br.value )) *100 ), 2) cached
from sysprofile dr, sysprofile br
where dr.name = "dskreads“ AND
br.name = "bufreads";

-- Get % write cached


select dw.value dskwrites, bw.value bufwrites,
round ((( 1 - ( dw.value / bw.value )) *100 ), 2) cached
from sysprofile dw, sysprofile bw
where dw.name = "dskwrites“ AND
bw.name = "bufwrites"

• In OLTP environments, the Read Cache % should be over 95% and


the Write Cache % over 80%
• Are users wait too often to modify buffer cache pages (too few LRUs)?
• Bufwaits Ratio = (bufwaits / (bufwrits + pagreads)) * 100.00
• Is a subset of the buffer cache being thrashed?
• Buffer Turnover Rate = (((bufwrits + pagreads) / BUFFERS) / Elapsed)
30 © 2010 IBM Corporation
Are the Logical Logs the Bottleneck?

• Is anything using the same disks as the Logs?


• Run oncheck –pe to see what else is in the same dbspace as the logs
• Check if iostat shows > 80 IO/s on a single disk where the logs reside

• Logs can be very performance-sensitive in an OLTP


environments
• A good place to use your best hardware
• Consider moving logical logs to their own dbspace, faster disks, RAID
parallelization with small (e.g. 8k) stripe size, and/or fast controller with
write caching
• Are there an abnormally higher number of transactions?
• Check commit frequency
• Monitor using onstat –l –(numwrits value)
• Unbuffered Logging will also cause more log buffer flushes
• Consider changing to Buffered Logging
• Application snapshot to find out who is committing so frequently
• Monitor using onstat –g tpf (isct value)
31 © 2010 IBM Corporation
Are the Transaction Buffers the Bottleneck?

• Are the Log Buffers a bottleneck?


• Monitor onstat –l

• Small buffers can result in excessive flushing, degrading performance


• Physical Log Buffer
• Observe the bufsize and the pages/io columns
• Check efficiency:
= 75% < 65% > 90%
(pages/io) / (bufsize) Utilized Too Large Too Small
Efficiently

• Logical Log Buffer


• Monitor the logical log buffer usage the same as above
• If unbuffered logging is being used, the flushing of the buffer will depend
on the size of transactions, not the utilization of the buffer

32 © 2010 IBM Corporation


Are there excessive Sort Operations occurring?

• Is there a particular user performing many sorts (onstat –D)?


database sysmaster;
select sid, • Measure session sort Efficiency:
total_sorts, ((total_sorts - dsksorts)/total_sorts)*100
dsksorts,
• Higher percentage Æ more efficient
max_sortdiskspace
from syssesprof;

• Disk sorts are significantly more expensive than memory sorts


• Are many sorts spilling over to Disk?

database sysmaster; • New Informix 11 configuration


select parameter allows configuration of
totalsorts, memory for sorts:
memsorts,
disksorts DS_NONPDQ_QUERY_MEM
from sysprofile onmode –wm/f
where name matches "*sort*";
• Default 128K
33 © 2010 IBM Corporation
Are the TEMP Dbspaces Being Heavily Utilized?
• Are too many TEMP tables the problem?
database sysmaster;
select dbsname,
tabname,
c.owner,name,
ti_nrows,
ti_rowsize
from sysdbspaces a, systabinfo b, systabnames c
where a.dbsnum = ( trunc ( b.ti_partnum/1048576 ) ) AND
b.ti_partnum=c.partnum AND
(bitval(ti_flags,'0x0020')=1 or bitval ( ti_flags, '0x0040') = 1 )

• Sorts and intermediate result sets may be the result of poor plans
• May need to look into queries which require a lost of sorting (SQLTRACE)

• May also look into the potential of adding indexes to reduce sorting
(EXPLAIN)

• Statistics may also be out of date, causing scan / sort plans


34 © 2010 IBM Corporation
Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
35 © 2010 IBM Corporation
Analyzing a Performance Problem – CPU Layer

Can be seen
as a CPU
problem

Resource
Bottleneck Basic Symptoms
Memory • Higher-than-normal system CPU time seen in vmstat
• Lower-than-normal overall CPU consumption seen in vmstat

Resource
Bottleneck Basic Symptoms
CPU • Total CPU utilization near 100% seen in vmstat / sar
• One process steadily consuming (100/N) % of total
CPU time as seen in ps / task manage

36 © 2010 IBM Corporation


Do the Ready or Wait Queues show Waiting Threads?

• Is the Ready Queue consistently showing waiting threads?


• onstat -g rea
• Shows threads that are ready to run/executed
• If threads are accumulating for a class of virtual processors (VPs) for a
period of time
• Add additional VPs of that class to distribute the processing load

• Does the Wait Queue show most Threads waiting on one


or more events or resources?
• onstat -g wai
• Shows threads that waiting for a particular event before they can
continue to run
• Check the STATUS filed to see what the thread is waiting for (on a
condition, I/O, latch)
• Many threads waiting on the same Condition or Latch could point to a
problem area
37 © 2010 IBM Corporation
Is an Informix Process (ONINIT) Consuming All the OS CPU Time?
informix 13977 1 0 10:34:58 ? 0:00 oninit • onstat -g glo (output below)
informix 13978 13977 0 10:34:58 ? 0:00 oninit
informix 13979 13978 0 10:34:58 ? 9:03 oninit
shows the virtual processor
informix 13980 13978 0 10:34:58 ? 0:00 oninit class of the oninit processes.
informix 13981 13978 0 10:34:59 ? 0:00 oninit
informix 13982 13978 0 10:35:00 ? 0:00
Individual oninit
virtual processors:
informix 13995 13978 0 10:35:01
vp ? 0:00 oninit
pid class usercpu syscpu total
informix 13996 13978 0 10:35:02
1 ? 0:00 oninit
13977 cpu 0.16 0.24 0.40
informix 13997 13978 0 10:35:03
2 ? 0:00 oninit
13978 adm 0.02 0.10 0.12
informix 13998 13978 0 10:35:03
3 ? 0:00 oninit
13979 cpu 15.07 0.05 15.12
informix 14000 13978 0 10:35:03
4 ? 0:00 oninit
13980 lio 0.00 0.01 0.01
informix 14001 13978 0 10:35:03
5 ? 0:00 oninit
13981 pio 0.00 0.01 0.01
informix 14002 13978 0 10:35:03
6 ? 0:00 oninit
13982 aio 0.01 0.02 0.03
informix 14003 13978 0 10:35:03
7 ? 0:00 oninit
13995 msc 0.00 0.00 0.00
informix 14004 13978 0 10:35:04
8 ? 0:00 oninit
13996 aio 0.00 0.01 0.01
9 13997 shm 0.00 0.01 0.01
• ps –ef (output above) 10 13998 aio 0.00 0.01 0.01
12 14000 aio 0.00 0.01 0.01
(UNIX command) 13 14001 aio 0.01 0.01 0.02
11 14002 aio 0.00 0.01 0.01
Use TOP or PS 14 14003 aio 0.00 0.01 0.01
(Op Sys commands) 15 14004 aio 0.00 0.01 0.01
to find the highest tot 3.27 0.51 3.78
CPU hog
38 © 2010 IBM Corporation
Is There a THREAD Doing a Lot of Work?
• What THREADs are consuming the most Virtual CPU Time?
What THREADs are consuming the most Virtual CPU Time?
onstat –g cpu
tid name vp Last Run CPU Time #scheds status
*2 lio vp 0 3lio 06/27 13:26:39 28.6397 3749 IO Idle
*3 pio vp 0 4pio 06/27 13:25:09 5.0609 517 IO Idle
*4 aio vp 0 5aio 06/27 13:29:23 31.1610 112645 IO Idle
*5 msc vp 0 6msc 06/27 13:27:57 0.1137 50 IO Idle
*6 aio vp 1 7aio 06/27 13:29:23 19.1152 5524 IO Idle
7 main_loop() 1cpu 06/27 13:31:55 7.1407 678090 sleeping secs: 1
*8 sm_poll 1cpu 06/27 13:31:55 67245.0333 940398 running
9 sm_listen 1cpu 06/27 13:27:57 0.0057 32 sleeping forever
10 sm_discon 1cpu 06/27 13:31:55 2.5516 676641 sleeping secs: 1
11 flush_sub(0) 1cpu 06/27 13:31:55 1.7716 677707 sleeping secs: 1
*12 aio vp 2 8aio 06/27 13:29:23 21.7697 727 IO Idle
*13 aio vp 3 9aio 06/27 13:25:09 23.7650 677 IO Idle
*14 aio vp 4 10aio 06/27 13:25:09 18.0777 1118 IO Idle
16 aslogflush 1cpu 06/27 13:31:55 2.0833 676638 sleeping secs: 1
17 btscanner_0 1cpu 06/27 13:31:35 1.7299 22352 sleeping secs: 31
*18 onmode_mon 1cpu 06/27 13:31:55 2.9390 676641 sleeping secs: 1
*40 dbScheduler 1cpu 06/27 13:29:23 1.5202 3444 sleeping secs: 148
71 sqlexec 2cpu 06/27 14:24:22 54277.9907 2655 running
71 sqlexec 2cpu 06/27 14:24:22 327.3421 2655 IO Wait
ƒ Also check the STATUS of the thread to see what it is doing
ƒ If STATUS = running, find the session ID using onstat –u and then use onstat –g <sesid> to find detailed
information on the session’s activities
39 © 2010 IBM Corporation
Are There Many CPU Intensive Joins Being Performed?
• Find the costs and query plans of all the queries that use
merge join, sequential scan, hash join, and nested loop join
• Memory joins are very CPU intensive
database sysmaster;
select current year to second rundate, s.username, e.sqx_sessionid sid,
e.sqx_estcost estcost,e.sqx_estrows estrows, e.sqx_seqscan seqscan,
e.sqx_mrgjoin mergejoin,e.sqx_dynhashjoin dynhashjoin, e.sqx_tempfile tempfile,
e.sqx_sqlstatement statement
from sysmaster:syssessions s, sysmaster:syssqexplain e
where s.sid = e.sqx_sessionid and e.sqx_iscurrent = 'Y' AND
e.sqx_selflag in ('SQ_SELECT', 'SQ_UPDATE', 'SQ_DELETE') AND
e.sqx_sdbno = 0 AND
and (sqx_mrgjoin >0 OR sqx_seqscan >0 OR sqx_dynhashjoin >1 OR sqx_index > 0);
rundate 2009-01-30 11:05:21
username informix
sid 45
estcost 32301
estrows 1
seqscan 2
mergejoin 0
dynhashjoin 0
tempfile 2
statement select current year to second rundate, s.username, e.sqx_sessionid
sid, e.sqx_estcost estcost,e.sqx_estrows estrows, e.sqx_seqscan seqscan,
e.sqx_mrgjoin mergejoin,e.sqx_dynhashjoin dynhashjoin, e.sqx_tempf ile tempfile,
e.sqx_sqlstatement statement from sysmaster:syssessions s, sysmaster:syssqexplain e

40 © 2010 IBM Corporation


Is There Significant SQL Activity in Shared Memory?

• Are dynamic SQL statements continuously being Prepared?

• Enable Statement Cache if multiple users execute the same SQL


statements
• Configuration: STMT_CACHE 2
• Dynamically; onmode –e on
• Example: 100 people execute an application the same day
and use the statement:
SELECT * FROM ORDERS WHERE order_num = :hostvar

• Statements should exactly match at prepare time


• Informix does not consider SQL statements containing different
literal values in the WHERE clause as an exact match

• SQL Statement Re-Prepare waste significant CPU cycles

41 © 2010 IBM Corporation


Are There Any Queries that are Resource Intensive?
• What are the expensive (HIGH COST) queries?
database sysmaster;
select current year to second rundate,
s.username,
s.pid,
s.hostname,
e.sqx_sessionid sid,
e.sqx_estcost estcost,
e.sqx_estrows estrows,
e.sqx_seqscan seqscan,
e.sqx_tempfile tempfile,
e.sqx_sqlstatement
statement
from sysmaster:syssessions s, sysmaster:syssqexplain e
where s.sid = e.sqx_sessionid AND
e.sqx_iscurrent = 'Y‘ AND
e.sqx_selflag in ('SQ_SELECT', 'SQ_UPDATE', 'SQ_DELETE')
order by e.sqx_estcost desc

• Queries with high estimated costs can monopolize


cpu/memory/disk resources and adversely affect other users
42 © 2010 IBM Corporation
Is Informix Shared Memory Fragmented?

• Check the shared memory bitmap to look for memory


fragmentation
• onstat –g nbm
• During a wait for a resource, the thread will spin (or loop) trying to
acquire the latch
• Identify list of resources where a wait was required to obtain a latch for a
resource
• onstat –g spi
• Consider setting environment variable SPINLOCK_BREAKCNT to 150000
• Consider setting env variable DONTDRAINPOOLS to 1 or onmode –F

• Enable VP_MEMORY_CACHE_KB
• Creates private memory blocks for CPU VP’s (onstat –g vpcache)
• Allows threads to retrieve memory blocks from that private cache rather than
going to the shared memory bitmap (requires spin lock/mutex)

43 © 2010 IBM Corporation


Is More Time Being Spent in System Time Versus User Time?

• High System Time usually means a lot of time spent in activities the
operating system is responsible for
• Device management, especially with older devices
• Network
• The number of interrupts generated by high-volume client-server
applications can be very high
• Memory Management
• Is system memory over allocated?
• Swap activity shown in vmstat / performance monitor
• Reduce memory consumption, e.g. bufferpools, initial virtual segment, etc
• Kernel time spent managing memory can be high for large memory
systems (e.g. 16+ GB)
• Large page support can cut this dramatically
• Rare event - needs to be investigated by system/network admins

44 © 2010 IBM Corporation


Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
45 © 2010 IBM Corporation
Analyzing a Performance Problem – Network Layer

Resource
Bottleneck Basic Symptoms
Network • High client-server network traffic in netstat
• Collision rate > 10% in netstat
• Many Transmit/Receive Errors in netstat
• Memory allocation errors using netstat
• Abnormal number of re-
transmits/timeouts/packets dropped in
netstat

46 © 2010 IBM Corporation


Is the Network Configured Adequately?

• Does network time dominate? • Network Response Time =

• Are there spikes in network time response time seen at client


coinciding with workload phases, minus
etc?
response time seen at server

• Network Configuration
• ‘ping’ can be used to verify network
Parameters at the
lags Operating System Layer
• Available Capacity?
• # of TCP/IP Socket buffers?
• Network bottlenecks can arise from
configuration problems, such as • Socket Buffer size?
network cards accidentally being left • Keepalive/Timeout interval?
set half-duplex, or an incorrect speed • Maximum Connection
setting, etc Requests?

47 © 2010 IBM Corporation


Is Client-Server Database Traffic High?

• Very high client-server network traffic


• Client Load utility, bulk data extraction, LOB manipulation,
very high rate OLTP, etc.
• Configuration issues, e.g. mismatched Ethernet transmission rates
• External factors such as other activity on shared LAN

• Network bottlenecks typically arise from very high volumes of data


being moved around – very large result sets, client load, etc.

• Can client logic be pushed onto the server to reduce traffic?


• Stored procedures?
• Datablades – embed application logic?
• More sophisticated SQL? E.g. predicates to filter result set at the server
• Above can potentially ADD some additional server CPU cost

48 © 2010 IBM Corporation


Are Clients still communicating over the Network?

• Are client sessions hung?


database sysmaster; ƒ Check:
select onstat -g ses <sid>
sid, to find more information
net_client_name,
net_read_cnt, net_write_cnt , about any particular
net_read_bytes, net_write_bytes, user
net_open_time, net_last_read, net_last_write,
net_protocol
from sysnetworkio
• Ensure the Network Listener port is still active
sqlhosts file:
ids_serv_tcp onsoctcp mach1 9088
ids_serv_shm onipcshm mach1 dummy

netstat -a | grep 9088


tcp 0 0 mach1.9088 *.* LISTEN
49 © 2010 IBM Corporation
Are the Network Buffers Under-Configured?
• Have the network buffer queues been exceeded?
• onstat -g ntu / onstat –g ntm Number of times
the free buffer
global network information:
threshold was exceeded
#netscb connects read write q-limits q-exceed alloc/max
6/ 7 58 622 2626 2818/ 10 10/ 0 0/ 0
Individual thread network information (basic):
netscb type thread name sid fd poll reads writes q-nrm q-exp
3fc97bc soctcp sqlexec 5269 66 5 38 38 0/ 1 0/ 0

• Size network buffers to accommodate a typical user/application


request
• Can improve CPU utilization by eliminating the need to break up a
request into multiple messages
• NETTYPE configuration parameter
• Instead of Informix dynamically • IFX_NETBUF_PVTPOOL_SIZE
satisfying requests using the global environment variable
network buffer pool, possible to • IFX_NETBUF_SIZE environment
configure private network buffers variable, and b (client buffer size)
for each session option in the sqlhosts file or registry
50 © 2010 IBM Corporation
Is There a High Number of Database Connection Requests?

• Are there a significant number of new connections in a short


period?
• onstat –g ntu
Example: 110 connections in 19 seconds
IBM Informix Version 9.40.FC6W1 -- On-Line -- Up 2 days 15:55:26
c000000059d931c8 soctcp soctcplst 21 6 10 533103 0 0/ 0 0/ 0 0/ 0
IBM Informix Version 9.40.FC6W1 -- On-Line -- Up 2 days 15:55:45
c000000059d931c8 soctcp soctcplst 21 6 10 533213 0 0/ 0 0/ 0 0/ 0

• For systems that can have 200 or more concurrent network users,
better performance might result from adding more poll threads
• Improve connection throughput for a given interface/protocol
combination
• Allocate additional listener threads (add a new DBSERVERALIAS)
• Add another network-interface card
51 © 2010 IBM Corporation
Is User Authentication the Bottleneck?

• Are MSC VPs unable to keep up with new connection request?


• The MSC VP handles network authentication (e.g.
gethostbyname)

• Check onstat –g glo (Eff column)


• If at or close to 100% effective, try adding another MSC VP

• Beginning with Informix 10, can have multiple MSC VP’s


• Format for VPCLASS in the ONCONFIG
• VPCLASS msc,<#>,noage
• Add another MSC VP dynamically
• onmode -p +1 msc
52 © 2010 IBM Corporation
Are Network Connections Timing Out or Rejected?

• Monitor using
global network information:
onstat –g ntd #netscb connects read write q-free q-limits q-exceed
5/ 7 54 9938 41616 1/ 1 10/ 10 0/ 0
• Check the number of Client Type Calls Accepted Rejected Read Write
accepted vs. rejected sqlexec yes 35 7 714 747
connections srvinfx yes 0 0 0 0
onspace yes 0 0 0 0
• If there are a large onlog yes 0 0 0 0
onparam yes 0 0 0 0
number of rejections oncheck yes 18 0 9107 40806
then either: onload yes 0 0 0 0
onunload yes 0 0 0 0
• The user table has onmonitor yes 1 0 63 63
overflowed (onstat dr_accept yes 0 0 0 0
–p: ontape yes 0 0 0 0
ovuserthreads) srvstat yes 0 0 0 0
asfecho yes 0 0 0 0
• The network is timing listener yes 0 0 54 0
out on the crsamexec yes 0 0 0 0
safe yes 0 0 0 0
connection Totals 54 7 9938 41616
• Consult your System
Administrator
53 © 2010 IBM Corporation
Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
54 © 2010 IBM Corporation
Is There Any Resource Contention on the Server?
• Are there are any sessions waiting on a particular resource!
onstat –u WAIT column
database sysmaster; gives similar information
select username,
sid,
is_wlatch, -- blocked waiting on a latch
is_wlock, -- blocked waiting on a locked record or table
is_wbuff, -- blocked waiting on a buffer
is_wckpt, -- blocked waiting on a checkpoint
is_incrit -- session is in a critical section of
-- transaction (e.g writing to disk)
from syssessions
order by username;

• Use the session ID (sid) from the above output to find more
information on why that session is waiting for one or more
resources
onstat –g ses <sid>
55 © 2010 IBM Corporation
Is Informix Shared Memory Under Configured?

• Are there too many Virtual segments?


• Monitor using onstat –g seg

• More than 3 virtual shared memory segments can indicate that


the initial virtual share memory segment is too small
• The database engine has to constantly allocate additional virtual
sheared memory segments to satisfy user requests
• This can eventually cause severe performance problems

• Consider increasing SHMVIRTSIZE and possibly other Informix


shared memory parameters including EXTSHMADD, SHMADD,
SHMTOTAL

• Use the UNIX ipcs command to see sizes of Informix shared memory

56 © 2010 IBM Corporation


Are the LRU Queues Under-Configured?

• Are there a high number of buffwaits?


• Use onstat –p to monitor
• LRU lock contention can significantly degrade performance
• A thread randomly (hash) chooses an LRU queue to obtain a free buffer to when it
needs to load data in memory from disk
• If the LRU queue is locked by another thread, thread trying to obtain a free buffer has
to wait (buffwaits)

• Is the percent of pages in the modified (m) side of each queue at


checkpoint time greater than LRU MAX DIRTY
• If not, either there are not enough Page Cleaners to perform write requests, not
enough AIO vps or KAIO threads to perform physical writes, or the disk controllers
or the drives themselves are saturated
• Monitor using onstat –R

• Use the BUFFERPOOL configuration parameter to increase the number of


LRU queues
• Consider enabling AUTO_LRU
57 © 2010 IBM Corporation
Are There Any Long or Blocking Checkpoints?
onstat –g ckp
Auto Checkpoins=On RTO_SERVER_RESTART=60 seconds Estimated recovery time 7 seconds

Critical Sections Physical Log Logical Log


Clock Total Flush Block # Ckpt Wait Long # Dirty Dskflu Total Avg Total Avg
Interval Time Trigger LSN Time Time Time Waits Time Time Time Buffers /Sec pages /Sec Pages /Sec
1 18:41:36 Startup 1:f8 0.0 0.0 0.0 0 0.0 0.0 0.0 4 4 3 0 1 0
2 18:41:49 Admin 1:11c12cc 0.3 0.2 0.0 1 0,0 0.0 0.0 2884 2884 1966 162 4549 379
3 18:42:21 Plog 8:188 2.3 2.0 2.0 1 0.0 2.0 2.0 14438 7388 318 10 65442 2181
4 18:42:44 *User
*User 10:19c018 0.0 0.0 0.0 1 0.0 0.0 0.0 39 39 536 21 20412 816
5 18:46:21 RTO 12:188 54.8 54.2 0.0 30 0.6 0.4 0.6 68232 1259 210757 1033 150118 735

Max Plog Max Llog Max Dskflush Avg Dskflush Avg Dirty Blocked
pages/sec pages/sec Time pages/sec pages/sec Time
8796 6581 54 43975 2314 0

Time to flush the buffer pool


What caused
Time for the checkpoint to complete Time Informix blocked due
the checkpoint?
to lack of resources
Time between checkpoints Was the checkpoint blocking? (*)

database sysmaster;
select * from syscheckpoint, sysckptinfo;
58 © 2010 IBM Corporation
Are There Any Long or Blocking Checkpoints?
onstat –g ckp
Auto Checkpoins=On RTO_SERVER_RESTART=60 seconds Estimated recovery time 7 seconds

Critical Sections Physical Log Logical Log


Clock Total Flush Block # Ckpt Wait Long # Dirty Dskflu Total Avg Total Avg
Interval Time Trigger LSN Time Time Time Waits Time Time Time Buffers /Sec pages /Sec Pages /Sec
1 18:41:36 Startup 1:f8 0.0 0.0 0.0 0 0.0 0.0 0.0 4 4 3 0 1 0
2 18:41:49 Admin 1:11c12cc 0.3 0.2 0.0 1 0,0 0.0 0.0 2884 2884 1966 162 4549 379
3 18:42:21 Plog 8:188 2.3 2.0 2.0 1 0.0 2.0 2.0 14438 7388 318 10 65442 2181
4 18:42:44 *User 10:19c018 0.0 0.0 0.0 1 0.0 0.0 0.0 39 39 536 21 20412 816
5 18:46:21 RTO 12:188 54.8 54.2 0.0 30 0.6 0.4 0.6 68232 1259 210757 1033 150118 735

Max Plog Max Llog Max Dskflush Avg Dskflush Avg Dirty Blocked
pages/sec pages/sec Time pages/sec pages/sec Time
8796 6581 54 43975 2314 0 # of Flushes that
were done per
second
# of users who had to wait Average time
to enter a critical section a user had to
wait to enter a # of dirty buffers that needed to
critical section, be flushed at the checkpoint
or to do work
Time spent in the checkpoint waiting for
users to exit critical sections - part of the Longest time a user had to wait to
“blocking” phase” enter a critical section, or to do work

59 © 2010 IBM Corporation


Is a Database Utility Experiencing Performance Issues?
• Update Statistics?
• Ensure PDQ (> 0) to to allocate more memory to it
• DBUPSPACE – is it constraining?
• Index Build?
• Ensure PDQPRIORITY > 0
• Set DBSPACETEMP to allocate multiple Temp spaces for sorting,
temp tables, etc
• Set PSORT_NPROCS to perform sorting in parallel
• Backup or Restore?
• Both ontape and onbar use archive transport buffers in the Informix
virtual segments
• For ontape, the number of transport buffers are fixed
• For onbar, the number of transport buffers are configurable
• BAR_MAX_BACKUP
• BAR_NB_XPORT_COUNT
• BAR_XFER_BUF_SIZE (set to max allowed)
• Monitor using onstat –g stq
60 © 2010 IBM Corporation
Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
61 © 2010 IBM Corporation
Recommendations - Server Performance Tuning

• Save server statistics periodically • KAIO is always BEST (when available)


• Database design and indexes are
• Establish benchmark levels and application issues
metrics
• OS configuration and performance
issues are rare
• Configure multiple monitors,
sensors, and alert levels • However, check the Release
Notes!
• Automate responses to alerts
• Disk failure are a common source
• Raw is still always BEST of performance issues
• Often seen when the disks start
• Use DIRECT_IO if you have to failing
use OS files for data storage
• Network issues manifest as slow
• Separate physical log, logical log, telnet, not responsive to typing
dbspaces targeted on specific • Check for errors reported in the
tables, etc kernel buffer (dmesg) or OS log

62 © 2010 IBM Corporation


Best Practices - Indexes

• Nothing is as important as having the right indexes


• Understanding the interaction between the application code and the
database is vital for building the correct indexes
• Even if a query is using an index, it may not be the correct index!

• One composite index might be more efficient than two or three


singleton indexes

• Make sure that indexes have a purpose, whether it’s to ensure


integrity or support a query

• Create indexes before you create declarative referential integrity

• Try to create indexes that achieve key-only lookups


• Index provides all required data values for the sql statement
• No data row access required through index leaf pages
63 © 2010 IBM Corporation
Summary

• There are no firmly established rules to tune the database!

• Biggest gains likely from fixing application issues


• Followed by Database, OS, Hardware (in order)

• Benchmarking is the key to resolving issues quickly

• Predict performance problems before they happen

• Investigate everything when diagnosing performance


problems

• Look at New Features for help!

64 © 2010 IBM Corporation


65 © 2010 IBM Corporation
Agenda
• Introduction
• Understand your Business
• Know Your Application
• Know Your Environment
• Be Prepared!
• Utilities to Diagnose Performance Problems
• Analyzing a Performance Problem
• First Steps
• Disk
• CPU
• Network
• Informix Specific
• Recommendations and Best Practices
• Appendix
66 © 2010 IBM Corporation
Disk Bottleneck Diagnosis – 10,000 Feet

Inadequate disk configuration / subsystem?

Bad plan giving tablescan? Insufficient sort space? Anything sharing the disks?
• Old statistics? Missing indexes? High transaction rate
• Need more indexes? • Too-frequent commits?
• Insufficient Read Aheads? Bad plan(s) giving excessive • Buffered logging?
• Over-aggressive page cleaning? index scanning? • High data volume?
• LOB reads/writes? • Need more/different indexes? • Logging too much data?

Index Temp
Data Dbspace Dbspace Log
Dbspace Devices

Disk
Bottleneck

Bottleneck
Type?

67 © 2010 IBM Corporation


CPU Bottleneck Diagnosis – 10,000 Feet

Frequent in-bufferpool table scans?


SQL w/o parameter markers? Old device drivers?
Statement cache not enabled? Network interrupts?
Oninit process in an endless loop? Memory management?
Increased workload?

High High
User System
Time Time

CPU
Bottleneck

Bottleneck
Type?

68 © 2010 IBM Corporation


Network Bottleneck Diagnosis – 10,000 Feet

Excessive data flow?


• LOBs
• intermediate data
• Significant number of records
Shared network conflict?
Many new connections?
Slow Authentication?
Inadequate Network Configuration?

Network
Bottleneck

Bottleneck
Type?

69 © 2010 IBM Corporation


Is only one query experiencing a performance problem?

• Is only one query slow?


• Review EXPLAIN information (SET EXPLAIN ON)
• Enable SQLTRACE for that userid and review explain using
onstat –g his
• Check if distribution statistics for that table are current (sysdistrib)

• Methods to Tune individual Query Performance


• PDQPRIORITY or DS_NONPDQ_QUERY_MEM
• Fragmentation
• Using Indexes
• Optimizer Directives
• Index Self-Join (good if lead key has duplicates, enable INDEX_SELFJOIN)
• Filter Selectivity (create Functional indexes, avoid difficult regular
expressions/substrings)
• Join and Sort SQL (avoid repeated sorting, set PSORT_NPROCS - together
with PSORT_DBTEMP, use temp tables)
• Update Statistics (AUS)
70 © 2010 IBM Corporation
Are Many Queries Experiencing Performance Problems?

• Are most or All Queries slow?


• First, eliminate external factors – disk, network, cpu, memory

• Are Statistics out of date?


• Check when Update Statistics was last run? (constructed column)

• Enable SQLTRACE globally and review explains using


onstat –g his

• Enable EXPLAIN_STAT configuration parameter to monitor statistics on


queries
• Use SET EXPLAIN or onmode –Y <sessionid> to create Query Statistics

• Consider using External Optimizer Directives


• Allows DBA to modify the query plan without changing the application
• To create external directives use the SAVE EXTERNAL DIRECTIVE statement

71 © 2010 IBM Corporation


Are some Chunks Not Accessible?
• What is the Status of the Chunks and Dbspaces?
DBSPACE Status
• onstat -d database sysmaster;
select dbspace, -- dbspace name
d.dbsnum, -- dbspace num
is_mirrored, -- dbspace is mirrored 1=Yes 0=No
is_blobspace, -- dbspace is blobspace 1=Yes 0=No
is_temp, -- dbspace is temp 1=Yes 0=No
chknum chunknum, -- CHUNK number
fname device, -- dev path
offset dev_offset, -- dev offset
is_offline, -- Offline 1=Yes 0=No
is_recovering, -- Recovering 1=Yes 0=No
is_blobchunk, -- Blobspace 1=Yes 0=No
is_inconsistent, -- Inconsistent 1=Yes 0=No
chksize Pages_size, -- chuck size in pages
nfree Pages_free, -- chunk free pages
mfname mirror_device, -- mirror dev path
mis_recovering_offse -- mirror recovering 1=Yes 0=No
from sysdbspaces d, syschunks c
where d.dbsnum = c.dbsnum
order by dbsnum, dbspace, chunknum
DBSPACE Usage
database sysmaster;
select name dbspace,
sum(chksize) allocated,
sum(nfree) free,round(((sum(chksize) - sum(nfree))/sum(chksize))*100) pcused
from sysdbspaces d, syschunks c
where d.dbsnum = c.dbsnumgroup by nameorder by name
72 © 2010 IBM Corporation
What are the Specific Locks Currently Held by Each User?
• Find which locks currently held by each user?
database sysmaster;
SELECT d.name[1,13] database_name, "database lock" table_name,
l.type lock_type, l.keynum index_num, HEX(l.rowidlk), s.username user,
s.sid, q.sqs_statement
FROM syslocks l, sysdatabases d, syssessions s, syssqlstat q
WHERE l.rowidlk = d.rowid AND l.owner = s.sid
AND dbsname = 'sysmaster' AND tabname = 'sysdatabases'
AND s.sid = q.sqs_sessionid
UNION ALL
SELECT l.dbsname[1,13], l.tabname, l.type, l.keynum, HEX(l.rowidlk),
s.username, s.sid, q.sqs_statement
FROM syslocks l, syssessions s , syssqlstat q
WHERE l.owner = s.sid
AND s.sid = q.sqs_sessionid
AND dbsname != 'sysmaster'
AND tabname != 'sysdatabases' ORDER BY 4,5;
ƒ Modify the table name in the predicates to get a specific table or all tables.
ƒ Leave out the hostname for a display that fits on one line.
ƒ Add “a.waiter != 0” to get locks with someone waiting.
73 © 2010 IBM Corporation
Is There Significant Lock Contention Activity?

• Is there significant lock contention activity?


• Individual Lock wait time?
• Number of threads/sessions waiting on locks?
• Find the number of locks held by each user?
Types of Locks
database sysmaster ; • B - byte lock
select syssessions.username, • IS - intent shared lock
syslocks.tabname,
• S - shared lock
count(*)
from syslocks, outer syssessions • XS - repeatable read shared key
where syslocks.owner = syssessions.sid • U - update lock
group by 1,2
• IX - intent exclusive lock
• SIX - shared intent exclusive
• X - exclusive lock
• XR - repeatable read exclusive
74 © 2010 IBM Corporation
Is the User Workload Higher?

• Are There More/New Users than normal connected to the Instance?


database sysmaster;
select sysdatabases.name database, -- Database Name
syssessions.username, -- User Name
syssessions.hostname, -- Workstation
syslocks.owner sid, -- Informix Session ID
connected startdate -- convert unix time to date
from syslocks, sysdatabases , outer syssessions
where syslocks.tabname = “sysdatabases” -- Locks on sysdatabases
and syslocks.rowidlk = sysdatabases.rowid -- Join to database
and syslocks.owner = syssessions.sid -- Use session ID
order by 1;

• Command line:
• onstat –u and onstat –g ses

• What is the ACTIVE number of user sessions?


• Last line of onstat –u
75 © 2010 IBM Corporation
Is There a User Monopolizing the Transaction Logs?
• Find User Transaction by Log Usage?
database sysmaster;
select t.username,
t.sid,
tx_logbeg,
tx_loguniq,
tx_logpos
from systrans x, sysrstcb t
where tx_owner = t.address

username sid tx_logbeg tx_loguniq tx_logpos


Informix 1 0 16 892952
lester 53 0 0 0
Informix 12 0 0 0
• Note the difference between the unique id of the log where the user’s
transaction began (tx_logbeg) and the unique id of the log where the
transaction ended/still continuing (tx_loguniq)
76 © 2010 IBM Corporation
Logical Logs Status Information

• What is the status of the Logical Logs?


• Command Line
• onstat –l

A Newly added (and ready to use) database sysmaster;


B Backed up select uniqid,
C Current logical-log file used size_used,
D Marked for deletion To drop the
log file and free its space for reuse, is_used,
you must perform a level-0 backup is_current,
of all storage spaces is_backed_up,
F Free, available for use is_archived
L The most recent checkpoint record from syslogs
U Used order by uniqid

77 © 2010 IBM Corporation


Which Virtual Processor is Consuming the Most CPU Cycles?

• Is a VP other than the CPU VP


consuming the most CPU cycles? vpid pid class usercpu syscpu
1 295 cpu
database sysmaster; 503.26 45.22
select vpid, pid, txt[1,5] class, 2 296 adm 0.45
round( usecs_user, 2) usercpu, 0.72
round( usecs_sys, 2) syscpu 3 297 lio 1.57
from sysvplst a, flags_text b 7.83
where a.class = b.flags AND 4 298 pio 0.35
b.tabname = "sysvplst" 0.21
5 299 aio 7.30
• Inadequate number of VPs to support a 56.16
workload will cause performance issues 6 300 msc 305.04
• onstat –g glo 0.64
• Effective CPU Utilization - Eff column 7 301 aio 3.26
• onstat –g ioq
VPCLASS
23.75
8 302 tliJVP, TLI, SHM,
(CPU, 0.17
• Maximum queue length - maxlen column 0.54SOC, STR)
• Monitor to see if adding more VPS of a 9 305 pio 0.77
particular class would help 1.02

78 © 2010 IBM Corporation


Are Light Scans Being Utilized?

• Check using onstat –g lsc


descriptor address next_lpage next_ppage ppage_left bufcnt look_aside
0 c41edcd0 87 500223 490 1 N
1 c41efe68 85 6000cc 497 1 Y
1 c41f07b8 83 7000cc 497 1 Y

• Performance advantages of using light scans


• Transfers larger blocks of data in one I/O operation
• Bypasses the overhead of the buffer pool when many pages are read
• Prevents frequently accessed pages from being forced out of the
buffer pool when many sequential pages are read for a single DSS
query
• Light scans occur under the following conditions:
• The optimizer chooses a sequential scan of the table.
• The number of pages in the table is greater than the number of buffers
in the buffer pool.
• The isolation level obtains no lock or a shared lock on the table:
79 © 2010 IBM Corporation
Is There Significant Scan Activity in Shared Memory?

• Are there SELECTs with a large amount of rows read, BUT a low
number of physical reads?

• Monitor the bufreads and bufwrites filed of onstat –p to see if


those numbers increase dramatically over a short period of time

• A good indication is if the bufreads or bufwrites values overflow


or wrap often

• Frequent in-memory table scans can consume significant CPU

80 © 2010 IBM Corporation


Which Users are Consuming the Most Resources?

• Monitor resource usage database sysmaster;


select username,
by User syssesprof.sid,
lockreqs, lockwts,
deadlks,
lktouts,
logrecs,
isreads, iswrites, isrewrites, isdeletes,
iscommits, isrollbacks, longtxs,
bufreads, bufwrites,
seqscans,
pagreads, pagwrites,
total_sorts, dsksorts, max_sortdiskspace,
logspused
from syssesprof, syssessions
where syssesprof.sid = syssessions.sid
order by bufreads desc
-- order by what you want to monitor

• Typically, in situations where the CPU usage is very high, it


is usually the result of a few users

81 © 2010 IBM Corporation


Is the High CPU Usage Confined to Only One CPU?

• Is a subset of the CPUs saturated?

• Is there one process exclusively monopolizing one CPU?


• Is NUMCPUVPS set to use as many of the available CPUs as
possible?

• Can the workload be parallelized to use more CPUs?


• Applies to utilities like UPDATE STATISTICS & CREATE INDEX as
well as user applications
• Is PDQ being effectively utilized (onstat –g mgm)?

• A CPU bottleneck doesn’t have to saturate the whole


system - it can also be on a single CPU

82 © 2010 IBM Corporation


Are there any Blocking Locks? database sysmaster;
select t2.dbsname database,
t5.sid SessionID,
t8.txt type,
• Find which Locks are blocking t2.tabname table,
other users? t4.sid lock_sess,
t5.username lock_user,
t6.sid wait_sess,
• Including who those other t7.username wait_user,
t1.rowidr row_id,
users are t1.keynum
from sysmaster:'informix'.syslcktab t1,
• Similar to onstat –k sysmaster:'informix'.systabnames t2,
sysmaster:'informix'.systxptab t3,
sysmaster:'informix'.sysrstcb t4,
• Use the sessionid from the sysmaster:'informix'.sysscblst t5,
sysmaster:'informix'.sysrstcb t6,
results of this query to find what sysmaster:'informix'.sysscblst t7,
the User, who is holding the sysmaster:'informix'.flags_text t8
where t1.owner = t3.address
lock(s) that other sessions are AND t1.partnum = t2.partnum
AND t3.owner = t4.address
waiting for, is doing AND t1.wtlist = t6.address
AND t4.sid = t5.sid
onstat –g ses AND t8.tabname = 'syslcktab'
AND t8.flags = t1.type
<sessionid> AND t6.sid = t7.sid

Locking • High total lock wait time seen in some onstats


• Many processes in lock wait state seen in some onstats
• Low CPU consumption seen in vmstat

83 © 2010 IBM Corporation


Know Your Environment – OLTP + DSS

• OLTP + DSS Characteristics

• Combination of DSS Queries with OLTP transaction


activity

• Number of records returned depends on type of activity

• Maximizing I/O throughput still important

• Memory resources allocated with primarily DSS queries


in mind

84 © 2010 IBM Corporation


Know Your Environment – OLTP + DSS Considerations

• Important to balance resources and loads on the system


• Requires continuous monitoring of server and system resources

• Control users by preventing execution of poorly written


queries which could monopolize all server resources

• Dynamically adjust memory as needed


• Peak Times (OLTP)
• Minimize DS_TOTAL_MEMORY / MAX_PDQPRIORITY
• Non-Peak Times (DSS)
• Maximize DS_TOTAL_MEMORY / MAX_PDQPRIORITY

85 © 2010 IBM Corporation


Recommendations - Server Performance Tuning

• Capture Table and index partition row and extent counts

• Capture Table and index partition I/O statistics

• Capture Dbspace %free space and absolute free space

• Capture Number of pages of each partition in the cache at any


given moment

• Catalog common statements to permit sizing statement cache


and simplify performance tuning

• Track new table creation to efficiently size data dictionary and


distribution caches

• Track propagation of new stored procedures to better size cache

86 © 2010 IBM Corporation


Is Parallel Processing Enabled?

• Are there PDQ Queries processing?


• Environment Variable PDQPRIORITY > 1
• Monitor: onstat –g mgm
• Allows parallel scans of data reads
• Set PSORT_NPROCS configuration parameter
• Specifies the maximum number of threads that Informix can use to
sort a query
• Set to 2 with multiple CPUs; increase as necessary
• Set PSORT_DBTEMP environment variable
• Specifies where Informix writes the temporary files it uses when it
performs a sort
• Specify directories that reside in file systems on different disks
• PDQ can improve performance provided a single query cannot
monopolize all PDQ resources
• Set MAX_PDQPRIORITY < 20 for OLTP servers
87 © 2010 IBM Corporation
Informix 11 Bootcamp
Performance Tuning and Troubleshooting Tools and Techniques
Information Management Partner Technologies

© 2010 IBM Corporation

You might also like