RAC Troubleshooting
juliandyke.com
Agenda
Introduced in Oracle 10.2. Checks cluster configuration:
- stages - verifies all steps for the specified stage have been completed
- components - verifies the specified component has been correctly installed

Supplied with Oracle Clusterware; can also be downloaded from OTN (Linux and Windows). Also works with 10.1 (specify the -10gR1 option). For earlier versions see Metalink Note 135714.1 - Script to Collect RAC Diagnostic Information (racdiag.sql).
On the Red Hat 4 and Enterprise Linux platforms, the following additional RPM is required for CLUVFY:
cvuqdisk-1.0.1-1.rpm
This package is supplied in the clusterware/cluvfy/rpm directory on the Clusterware CD-ROM. It can also be downloaded from OTN. On each node, as the root user, install the RPM using:
rpm -ivh cvuqdisk-1.0.1-1.rpm
For example, to check the configuration before installing Oracle Clusterware on node1 and node2 use:

$ cluvfy stage -pre crsinst -n node1,node2

This checks:
- node reachability
- user equivalence
- administrative privileges
- node connectivity
- shared storage accessibility

If any checks fail, append -verbose to display more information.
Trace files are written to the $CV_HOME/cv/log directory. By default this directory is removed immediately after CLUVFY execution. On Linux/Unix, comment out the following line in runcluvfy.sh:
# $RM -rf $CV_HOME
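As a sketch, the same change can be scripted with sed; the copy of runcluvfy.sh below is mocked in /tmp, since the real path varies by installation:

```shell
# Create a mock copy of runcluvfy.sh containing the cleanup line
cat > /tmp/runcluvfy.sh <<'EOF'
$RM -rf $CV_HOME
EOF
# Comment the cleanup line out rather than deleting it, so the
# trace directory in $CV_HOME survives after the run
sed -i 's|^\$RM -rf \$CV_HOME|# &|' /tmp/runcluvfy.sh
cat /tmp/runcluvfy.sh
```

After the change, the preserved $CV_HOME/cv/log directory can be inspected at leisure.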
Note that it may be necessary to clean up the CRS installation before executing root.sh again.
To enable tracing, change the command at the end of the dbca script to:

# Run DBCA
$JRE_DIR/bin/jre -DORACLE_HOME=$OH -DJDBC_PROTOCOL=thin -mx64m \
    -DTRACING.ENABLED=true -DTRACING.LEVEL=2 \
    -classpath $CLASSPATH oracle.sysman.assistants.dbca.Dbca $ARGUMENTS
Oracle Clusterware
Provides:
- node membership services (CSS)
- resource management services (CRS)
- event management services (EVM)

In Oracle 10.1 and above resources include:
- node applications
- ASM instances
- database instances
- services

Node applications include:
- Virtual IP (VIP)
- listeners
- Oracle Notification Service (ONS)
- Global Services Daemon (GSD)
Node application introduced in Oracle 10.1:
- allows a virtual IP address to be defined for each node
- all applications connect using virtual IP addresses
- if a node fails, its virtual IP address is automatically relocated to another node
- only applies to newly connecting sessions
[Diagram: during normal operation Node 1 runs Instance1 with VIP1/Listener1 and Node 2 runs Instance2 with VIP2/Listener2; after Node 2 fails, VIP2 relocates to Node 1 alongside VIP1.]
On Linux during normal operation, each node will have one VIP address. For example:

[root@server3]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:11:D8:58:05:99
          inet addr:192.168.2.103  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::211:d8ff:fe58:599/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:6814 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10326 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:684579 (668.5 KiB)  TX bytes:1449071 (1.3 MiB)
          Interrupt:217 Base address:0x8800

eth0:1    Link encap:Ethernet  HWaddr 00:11:D8:58:05:99
          inet addr:192.168.2.203  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:217 Base address:0x8800

The resource for the VIP address 192.168.2.203 is initially running on server3.
If Oracle Clusterware on server3 is shut down, the VIP resource is transferred to another node (in this case server11):

[root@server11]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:1D:7D:A3:0A:55
          inet addr:192.168.2.111  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::21d:7dff:fea3:a55/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2792 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4097 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:329891 (322.1 KiB)  TX bytes:593615 (579.7 KiB)
          Interrupt:177 Base address:0x2000

eth0:1    Link encap:Ethernet  HWaddr 00:1D:7D:A3:0A:55
          inet addr:192.168.2.211  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:177 Base address:0x2000

eth0:2    Link encap:Ethernet  HWaddr 00:1D:7D:A3:0A:55
          inet addr:192.168.2.203  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:177 Base address:0x2000
[root@server3]# ./crs_relocate ora.server3.vip -c server3
Attempting to stop `ora.server3.vip` on member `server11`
Stop of `ora.server3.vip` on member `server11` succeeded.
Attempting to start `ora.server3.vip` on member `server3`
Start of `ora.server3.vip` on member `server3` succeeded.

HA Resource          Type          Target    State
-----------          ----          ------    -----
ora.server11.vip     application   ONLINE    ONLINE on server11
ora.server12.vip     application   ONLINE    ONLINE on server12
ora.server3.vip      application   ONLINE    ONLINE on server3
ora.server4.vip      application   ONLINE    ONLINE on server4
In Oracle 10.2, Oracle Clusterware log files are created in the $CRS_HOME/log directory, which can be located on shared storage.
- $CRS_HOME/log contains a subdirectory for each node, e.g. $CRS_HOME/log/server6
- $CRS_HOME/log/<node> contains:
  - the Oracle Clusterware alert log, e.g. alertserver6.log
  - client - log files for OCR applications including CLSCFG, CSS, OCRCHECK, OCRCONFIG, OCRDUMP and OIFCFG
  - crsd - log files for the CRS daemon including crsd.log
  - cssd - log files for the CSS daemon including ocssd.log
  - evmd - log files for the EVM daemon including evmd.log
  - racg - log files for node applications including VIP and ONS
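Because every daemon logs under the same per-node tree, one recursive grep covers them all. A minimal sketch; the CRS home, node name and log contents below are mocked for illustration:

```shell
# Mocked CRS home and node name; point these at a real installation
CRS_HOME=/tmp/crs_demo
NODE=server6
mkdir -p $CRS_HOME/log/$NODE/crsd $CRS_HOME/log/$NODE/cssd
# Sample log lines (illustrative, not real Clusterware output)
echo '2008-06-01 12:00:00.000: [CRSD] CRS-1006: sample error text' \
    > $CRS_HOME/log/$NODE/crsd/crsd.log
echo 'clssgmClientConnectMsg: normal trace line' \
    > $CRS_HOME/log/$NODE/cssd/ocssd.log
# One sweep across crsd, cssd, evmd, racg and client subdirectories
grep -r 'CRS-' $CRS_HOME/log/$NODE
```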
log/
└── <nodename>/
    ├── alert<nodename>.log
    ├── client/
    ├── crsd/
    ├── cssd/
    ├── evmd/
    └── racg/
        ├── racgeut/
        ├── racgimon/
        └── racgmain/
A similar structure exists in the log directory of each Oracle home:

log/
└── <nodename>/
    ├── client/
    └── racg/
        ├── racgeut/
        ├── racgimon/
        ├── racgmain/
        └── racgmdb/
If the OCR or voting disk is not available, error files may be created in /tmp, e.g. /tmp/crsctl.4038

For example, if the OCR cannot be found:

OCR initialization failed accessing OCR device:
PROC-26: Error while accessing the physical storage
Operating System error [No such file or directory] [2]

In this case the OCR is inaccessible, no CRS daemons will start and no errors are written to the log files.

If the voting disk cannot be read:

clsscfg_vhinit: unable(1) to open disk (/dev/raw/raw2)
Internal Error Information:
  Category: 1234
  Operation: scls_block_open
  Location: statfs
  Other: statfs failed /dev/raw/raw2
  Dep: 2
Failure 1 checking the Cluster Synchronization Services voting disk '/dev/raw/raw2'.
Not able to read adequate number of voting disks
Script called on each node by SRVCTL to control resources. A copy of the script exists in each Oracle home:
- $ORA_CRS_HOME/bin/racgwrap
- $ORA_ASM_HOME/bin/racgwrap
- $ORACLE_HOME/bin/racgwrap

The script sets environment variables and invokes the racgmain executable. It is generated from racgwrap.sbs and differs in each home: it sets the $ORACLE_HOME and $ORACLE_BASE environment variables for racgmain, and also sets $LD_LIBRARY_PATH. Enable trace by setting _USR_ORA_DEBUG to 1.
In Unix systems the Oracle SGA is located in one or more operating system shared memory segments:
- each segment is identified by a shared memory key
- the shared memory key is generated by the application
- each shared memory key maps to a shared memory ID
- the shared memory ID is generated by the operating system

Shared memory segments can be displayed using ipcs -m:
[root@server3]# ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      status
0x8a48ff44 131072     oracle     640
0x17d04568 163841     oracle     660
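The owner column makes it easy to pick out the Oracle segments. A sketch using a canned sample of the listing above; on a live system, pipe the real `ipcs -m` output instead of echoing the sample:

```shell
# Canned ipcs -m body (key, shmid, owner, perms); illustrative values
ipcs_sample='0x8a48ff44 131072 oracle 640
0x17d04568 163841 oracle 660
0x00000000 200001 root 600'
# Print key and shmid for segments owned by the oracle user
echo "$ipcs_sample" | awk '$3 == "oracle" { print $1, $2 }'
```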
Oracle generates the shared memory key from the values of $ORACLE_HOME and $ORACLE_SID.
[oracle@server3]$ export ORACLE_SID=PROD1
[oracle@server3]$ sqlplus / as sysdba
...
Connected to an idle instance.

If $ORACLE_HOME or $ORACLE_SID does not match the values used to start the instance, the computed shared memory key differs and SQL*Plus connects to an idle instance.
- Implemented on Unix systems; not required with third-party clusterware
- Implemented on Linux in 10.2.0.4 and above; in 10.2.0.3 and below the hangcheck-timer module is used instead
- Provides hangcheck-timer functionality to maintain cluster integrity
- Behaviour is similar to hangcheck-timer
- Runs as root and is locked in memory
- Failure causes a reboot of the system; see /etc/init.d/init.cssd for the operating system reboot commands
OPROCD takes two parameters:
- -t - timeout value: length of time between executions (milliseconds); normally defaults to 1000
- -m - margin: acceptable margin before rebooting (milliseconds); normally defaults to 500

The parameters are specified in /etc/init.d/init.cssd:

OPROCD_DEFAULT_TIMEOUT=1000
OPROCD_DEFAULT_MARGIN=500

Contact Oracle Support before changing these values.
/etc/init.d/init.cssd can increase OPROCD_DEFAULT_MARGIN based on two CSS variables:
- reboottime (mandatory)
- diagwait (optional)

These values can be obtained using:
[root@server3]# crsctl get css reboottime [root@server3]# crsctl get css diagwait
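A sketch of the resulting arithmetic, assuming the behaviour described in the Metalink notes on diagwait: the margin grows to (diagwait - reboottime) seconds when diagwait is set. The values below are illustrative:

```shell
# Illustrative values, as returned by the crsctl commands above
REBOOTTIME=3            # seconds, from: crsctl get css reboottime
DIAGWAIT=13             # seconds, from: crsctl get css diagwait
OPROCD_DEFAULT_MARGIN=500   # milliseconds

# Margin is raised to (diagwait - reboottime), converted to ms,
# but never below the default (assumption for this sketch)
MARGIN=$(( (DIAGWAIT - REBOOTTIME) * 1000 ))
if [ $MARGIN -lt $OPROCD_DEFAULT_MARGIN ]; then
    MARGIN=$OPROCD_DEFAULT_MARGIN
fi
echo "OPROCD margin: $MARGIN ms"
```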
CSS maintains two heartbeats:
- a network heartbeat across the interconnect
- a disk heartbeat to the voting device

The disk heartbeat has an internal I/O timeout (in seconds) which varies between releases. In Oracle 10.2.0.2 and above the disk heartbeat timeout can be specified by the CSS disktimeout parameter:
- maximum time allowed for a voting file I/O to complete
- if exceeded, the file is marked offline
- defaults to 200 seconds
crsctl get css disktimeout crsctl set css disktimeout <value>
The network heartbeat timeout can be specified by the CSS misscount parameter. Default values (Oracle Clusterware 10.1 and 10.2) are:

Platform    Default
Linux       60 seconds
Unix        30 seconds
Windows     30 seconds

The default value with vendor clusterware is 600 seconds.

crsctl get css misscount
crsctl set css misscount <value>
The relationship between the internal I/O timeout (IOT), MISSCOUNT and DISKTIMEOUT varies between releases:

Version     Description
10.1.0.3    IOT = MISSCOUNT - 15 seconds
10.1.0.4    IOT = MISSCOUNT - 15 seconds
10.1.0.5    IOT = MISSCOUNT - 3 seconds
10.1.0.6    IOT = DISKTIMEOUT during normal operations
            IOT = MISSCOUNT during initial cluster formation or reconfiguration
10.2.0.1    IOT = MISSCOUNT - 3 seconds
10.2.0.2    IOT = DISKTIMEOUT during normal operations
            IOT = MISSCOUNT during initial cluster formation or reconfiguration
If disktimeout is supported, CSS will not evict a node from the cluster when I/O to the voting disk takes more than MISSCOUNT seconds, except during initial cluster formation and shortly before reconfiguration. Nodes will not be evicted as long as voting disk operations complete within DISKTIMEOUT seconds.

Network Heartbeat                     Disk Heartbeat                         Reboot
Completes within MISSCOUNT seconds    Completes within DISKTIMEOUT seconds   No
Completes within MISSCOUNT seconds    Takes more than DISKTIMEOUT seconds    Yes
Takes more than MISSCOUNT seconds     Completes within MISSCOUNT seconds     Yes
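The table above can be sketched as a small decision function (10.2.0.2+ default values assumed; heartbeat durations are illustrative inputs):

```shell
MISSCOUNT=60        # network heartbeat timeout, seconds
DISKTIMEOUT=200     # disk heartbeat timeout, seconds

# $1 = network heartbeat duration (s), $2 = disk heartbeat duration (s)
node_reboots() {
    if [ "$1" -gt "$MISSCOUNT" ] || [ "$2" -gt "$DISKTIMEOUT" ]; then
        echo Yes
    else
        echo No
    fi
}

node_reboots 5 150    # both heartbeats healthy        -> No
node_reboots 5 250    # disk heartbeat too slow        -> Yes
node_reboots 90 20    # network heartbeat missed       -> Yes
```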
CRSCTL can also be used to enable and disable Oracle Clusterware. To enable Clusterware use:

# crsctl enable crs

To disable Clusterware use:

# crsctl disable crs
In Oracle 10.2, CRSCTL can be used to check the current state of the Oracle Clusterware daemons.

To check the current state of all Oracle Clusterware daemons:

# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy

To check the current state of individual Oracle Clusterware daemons:

# crsctl check cssd
CSS appears healthy
# crsctl check crsd
CRS appears healthy
# crsctl check evmd
EVM appears healthy
CRSCTL can be used to manage the CSS voting disk. To check the current location of the voting disk use:

# crsctl query css votedisk

To delete an existing voting disk use:

# crsctl delete css votedisk <path_name>
In Oracle 10.2 and above, Oracle Clusterware debugging can be enabled and disabled for:
- CRS
- CSS
- EVM
- resources
- subcomponents

Debugging can be controlled:
- statically, using environment variables
- dynamically, using CRSCTL

Debug settings can be persisted in the OCR for use in subsequent restarts.
To list the modules available for debugging use:

# crsctl lsmodules crs
# crsctl lsmodules css
# crsctl lsmodules evm

Modules include:

Module      Daemon
CRSCOMM     CRS
CRSD        CRS
CRSEVT      CRS
EVMDMAIN    EVM
EVMEVT      EVM
For example:
# crsctl debug log crs "CRSCOMM:2,COMMCRS:2,COMMNS:2"
Set CRSD Debug Module: CRSCOMM  Level: 2
Set CRSD Debug Module: COMMCRS  Level: 2
Set CRSD Debug Module: COMMNS  Level: 2
The values only apply to the current node. They are stored within the OCR in SYSTEM.crs.debug.<node>.<module>. For example:
# ocrdump -stdout -keyname SYSTEM.crs.debug.vm1.CRSCOMM
For example:
# crsctl debug log res ora.vm1.vip:5
Set Resource Debug Module: ora.vm1.vip  Level: 5
The OCR debug value is stored in USR_ORA_DEBUG. To check the current debug value set in the OCR for ora.vm1.vip use:
# ocrdump -stdout -keyname \ CRS.CUR.ora\!vm1\!vip.USR_ORA_DEBUG
Debugging for CRSD and EVMD can also be configured using environment variables. To enable tracing for all modules use ORA_CRSDEBUG_ALL. For example:
# export ORA_CRSDEBUG_ALL=5
Note that these environment variables have not been implemented in OCSSD or OPROCD
In Oracle 10.1 and above, debugging can also be configured in $ORA_CRS_HOME/srvm/admin/ocrlog.ini. By default this file contains:
# "mesg_logging_level" is the only supported parameter currently.
# level 0 means minimum logging. Only error conditions are logged
mesg_logging_level = 0

# The last appearance of a parameter will override the previous value.
# For example, log level will become 3 when the following value is uncommented.
# Change to log level 3 for detailed logging from Oracle Cluster Registry
# mesg_logging_level = 3

# Component log and trace level specification template
#comploglvl="comp1:3;comp2:4"
#comptrclvl="comp1:2;comp2:1"
Component level logging can also be configured in the OCR For example:
crsctl debug log crs OCRAPI:5;OCRCLI:5;OCRSRV:5;OCRMAS:5;OCRCAC:5
Components include:
- OCRAPI - OCR Abstraction Component
- OCRCAC - OCR Cache Component
- OCRCLI - OCR Client Component
- OCRMAS - OCR Master Thread Component
- OCRMSG - OCR Message Component
- OCRSRV - OCR Server Component
- OCRUTL - OCR Util Component
The CSS dump is written to $ORA_CRS_HOME/log/<node>/cssd/ocssd.log. The dump contents can be made more readable, e.g.:
cut -c58- < ocssd.log > ocssd.dmp
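To see what the cut command does, here is a self-contained demo in which 57 padding characters stand in for the real fixed-width timestamp/thread prefix of each ocssd.log line:

```shell
# Build a line whose first 57 characters are prefix padding
# (the message text is an illustrative CSS trace fragment)
line="$(printf '%057d' 0)clssgmClientConnectMsg: Connect from con(0x1)"
# Strip the prefix exactly as the command above does
echo "$line" | cut -c58-
```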
The olsnodes utility lists all nodes currently running in the cluster. With no arguments olsnodes lists the node names, e.g.:

$ olsnodes
london1
london2

In Oracle 10.2 and above, with the -p argument olsnodes lists node names and private interconnect names:

$ olsnodes -p
london1 london1-priv
london2 london2-priv

In Oracle 10.2 and above, with the -i argument olsnodes lists node names and VIP addresses:

$ olsnodes -i
london1 london1-vip
london2 london2-vip
In Oracle 10.1 and above the OCRCONFIG utility performs various administrative operations on the OCR including:
- displaying backup history
- configuring the backup location
- restoring the OCR from a backup
- exporting the OCR
- importing the OCR
- upgrading the OCR
- downgrading the OCR

In Oracle 10.2 and above OCRCONFIG can also:
- manage OCR mirrors
- overwrite OCR files
- repair OCR files
Options include:

Option        Description                                   Version
-help         Display help message                          10.1+
-showbackup   Display backup history                        10.1+
-backuploc    Change the backup location                    10.1+
-restore      Restore the OCR from an automatic backup      10.1+
-export       Export the OCR contents to a file             10.1+
-import       Import the OCR contents from a file           10.1+
-upgrade      Upgrade the OCR to a later version            10.1+
-downgrade    Downgrade the OCR to an earlier version       10.1+
-replace      Add, replace or remove an OCR mirror          10.2+
-overwrite    Overwrite the OCR configuration               10.2+
-repair       Repair the local OCR configuration            10.2+
In Oracle 10.1 and above the OCR is automatically backed up every four hours:
- the previous three backup copies are retained
- a backup copy is retained from the end of the previous day
- a backup copy is retained from the end of the previous week

Check the node, times and location of previous backups using the -showbackup option of OCRCONFIG, e.g.:
# ocrconfig -showbackup
london1 2005/08/04 11:15:29 /u01/app/oracle/product/10.2.0/crs/cdata/crs
london1 2005/08/03 22:24:32 /u01/app/oracle/product/10.2.0/crs/cdata/crs
london1 2005/08/03 18:24:32 /u01/app/oracle/product/10.2.0/crs/cdata/crs
london1 2005/08/02 18:24:32 /u01/app/oracle/product/10.2.0/crs/cdata/crs
london1 2005/07/31 18:24:32 /u01/app/oracle/product/10.2.0/crs/cdata/crs
ENSURE THAT YOU COPY THE PHYSICAL BACKUPS TO TAPE AND/OR REDUNDANT STORAGE
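A sketch of such a copy, with mocked paths standing in for the real backup directory reported by ocrconfig -showbackup:

```shell
# Mocked backup location; on a real system take this path from
# `ocrconfig -showbackup` (typically $CRS_HOME/cdata/<cluster>)
BACKUP_DIR=/tmp/crs_demo_cdata/crs
SAFE_DIR=/tmp/ocr_backup_copy       # second, independent location
mkdir -p $BACKUP_DIR $SAFE_DIR
# Stand-ins for the automatic backup files
: > $BACKUP_DIR/backup00.ocr
: > $BACKUP_DIR/day.ocr
: > $BACKUP_DIR/week.ocr
# Copy the backups, preserving timestamps
cp -p $BACKUP_DIR/*.ocr $SAFE_DIR
ls $SAFE_DIR
```

Run a copy like this from cron so the safe location stays current between OCR backup cycles.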
To restore the OCR from a physical backup copy, first check that you have a suitable backup using:

# ocrconfig -showbackup

Then, with Oracle Clusterware stopped on all nodes, restore the backup using:

# ocrconfig -restore <file_name>
In Oracle 10.1 and above, you can verify the configuration of the OCR using the OCRCHECK utility
# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262144
         Used space (kbytes)      :       7752
         Available space (kbytes) :     254392
         ID                       : 1093363319
         Device/File Name         : /dev/raw/raw1
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/raw/raw2
                                    Device/File integrity check succeeded
         Cluster registry integrity check succeeded
In Oracle 10.1 this utility does not print the ID and Device/File Name information
In Oracle 10.1 and above, you can dump the contents of the OCR using the OCRDUMP utility For example:
# ocrdump
This command writes its output to a file called OCRDUMPFILE in the current working directory. You can specify an output file name using:
# ocrdump <dump_file_name>
For example:
# ocrdump ocr_cluster1
In Oracle 10.2 and above, you can write OCRDUMP output to stdout For example:
# ocrdump -stdout
In Oracle 10.2 and above, you can optionally restrict output by specifying a key For example:
# ocrdump -stdout SYSTEM # ocrdump -stdout SYSTEM.css # ocrdump -stdout SYSTEM.css.misscount
In Oracle 10.2 and above, you can optionally format output in XML. For example:
# ocrdump -stdout SYSTEM.css.misscount -xml
The CRS_STAT utility reports the current status of resources managed by Oracle Clusterware. Resources include:
- databases
- instances
- services
- ASM instances
- node applications (gsd, ons, vip)
- listeners
If a node has failed, the STATE field will show which node the applications have failed over to
With the -t option, crs_stat lists resources together with their state and the current node
Name           Type           Target    State     Host
------------------------------------------------------------
ora....T1.inst application    ONLINE    ONLINE    server3
ora....T2.inst application    ONLINE    ONLINE    server4
ora....T3.inst application    ONLINE    ONLINE    server11
ora....T4.inst application    ONLINE    ONLINE    server12
ora.TEST.db    application    ONLINE    ONLINE    server3
ora....SM3.asm application    ONLINE    ONLINE    server11
ora....11.lsnr application    ONLINE    ONLINE    server11
ora....r11.gsd application    ONLINE    ONLINE    server11
ora....r11.ons application    ONLINE    ONLINE    server11
ora....r11.vip application    ONLINE    ONLINE    server11
ora....SM4.asm application    ONLINE    ONLINE    server12
ora....12.lsnr application    ONLINE    ONLINE    server12
ora....r12.gsd application    ONLINE    ONLINE    server12
ora....r12.ons application    ONLINE    ONLINE    server12
ora....r12.vip application    ONLINE    ONLINE    server12
With the -ls option, crs_stat lists resources together with their owner, group and permissions.
Name           Owner   Primary PrivGrp  Permission
-----------------------------------------------------------------
ora....T1.inst oracle  oinstall         rwxrwxr--
ora....T2.inst oracle  oinstall         rwxrwxr--
ora....T3.inst oracle  oinstall         rwxrwxr--
ora....T4.inst oracle  oinstall         rwxrwxr--
ora.TEST.db    oracle  oinstall         rwxrwxr--
ora....SM3.asm oracle  oinstall         rwxrwxr--
ora....11.lsnr oracle  oinstall         rwxrwxr--
ora....r11.gsd oracle  oinstall         rwxr-xr--
ora....r11.ons oracle  oinstall         rwxr-xr--
ora....r11.vip root    oinstall         rwxr-xr--
ora....SM4.asm oracle  oinstall         rwxrwxr--
ora....12.lsnr oracle  oinstall         rwxrwxr--
ora....r12.gsd oracle  oinstall         rwxr-xr--
ora....r12.ons oracle  oinstall         rwxr-xr--
ora....r12.vip root    oinstall         rwxr-xr--
CRS_STAT abbreviates resource names. Oracle provides an AWK script that prints complete resource names in Metalink Note 259301.1 - CRS and 10g RAC:
#!/bin/bash
RSC_KEY=$1
QSTAT=-u
AWK=/usr/bin/awk

# Print header
$AWK \
  'BEGIN {printf "%-45s %-10s %-18s\n", "HA Resource", "Target", "State";
          printf "%-45s %-10s %-18s\n", "-----------", "------", "-----";}'

# Parse crs_stat -u output into one line per resource
$ORA_CRS_HOME/bin/crs_stat $QSTAT | $AWK \
  'BEGIN { FS="="; state = 0; }
   $1~/NAME/ && $2~/'$RSC_KEY'/ {appname = $2; state=1};
   state == 0 {next;}
   $1~/TARGET/ && state == 1 {apptarget = $2; state=2;}
   $1~/STATE/ && state == 2 {appstate = $2; state=3;}
   state == 3 {printf "%-45s %-10s %-18s\n", appname, apptarget, appstate; state = 0;}'
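As a quick check of the parsing logic, the awk filter can be fed canned crs_stat -u output; the resource name and node below are illustrative, and the resource-key match is omitted for the demo:

```shell
# Canned crs_stat -u style output: NAME/TARGET/STATE triples
printf 'NAME=ora.server3.vip\nTARGET=ONLINE\nSTATE=ONLINE on server3\n' |
awk 'BEGIN { FS="="; state = 0; }
     $1~/NAME/ {appname = $2; state=1};
     state == 0 {next;}
     $1~/TARGET/ && state == 1 {apptarget = $2; state=2;}
     $1~/STATE/ && state == 2 {appstate = $2; state=3;}
     state == 3 {printf "%-45s %-10s %-18s\n", appname, apptarget, appstate; state=0;}'
```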
The CRS_GETPERM and CRS_SETPERM utilities can be used to check and modify Oracle Clusterware permissions. For example, to change the owner of an instance to oracle and the group to oinstall:

# crs_setperm <resource_name> -o oracle
# crs_setperm <resource_name> -g oinstall
The Oracle Cluster Registry is vulnerable to corruption. Versions experiencing OCR corruptions have included:
- 10.1.0.3
- 10.2.0.2
- 10.2.0.3
- 11.1.0.6

Corruption has also been experienced by many Oracle employees and by about 20% of UKOUG RAC & HA SIG delegates.
- The typical symptom is a "placement error"
- May be related to the configuration of services
- Corruption may occur at an earlier date than the symptom
- May occur when a service is configured on a non-master node
If a mirror is configured:
- restore from the mirror using ocrconfig -overwrite
- see the Administration and Deployment Guide for details

If a backup is available:
- restore from the backup using ocrconfig -restore

If no backup is available:
- rebuild the OCR using the procedure described in Metalink Note 399482.1 - How to recreate OCR/Voting disk accidentally deleted
Rebuild procedure (adapted from Note 399482.1): on each node, shut down Oracle Clusterware:
[root@server3]# crsctl stop crs
Check that all Clusterware processes have stopped. On each node execute rootdelete.sh:

[root@server3]# $ORA_CRS_HOME/install/rootdelete.sh
Note that for a corrupt OCR it may be necessary to zero the OCR. For example:
[root@server3]# dd if=/dev/zero of=/dev/ocr bs=1M
Rebuild procedure (adapted from Note 399482.1), continued: on the first node execute root.sh:
[root@server3]# $ORA_CRS_HOME/root.sh
Use srvctl to add:
- ASM instances
- the database
- instances
- services

Use netca to add the listener. Execute cluvfy to verify the CRS configuration:
[oracle@server4]$ cluvfy stage -post crsinst -n node1,node2
In Oracle 8.0 and above it is possible to specify a module and action for any session. Modules and actions allow inefficient SQL statements to be identified and isolated more efficiently. Modules and actions are reported in:
- STATSPACK / AWR / ASH reports
- V$SESSION
- V$SQL
- V$ACTIVE_SESSION_HISTORY

The current module and action for a session are reported in V$SESSION.MODULE and V$SESSION.ACTION.
Introduced in Oracle 10.1. Contains the following subroutines:
- SESSION_TRACE_ENABLE / SESSION_TRACE_DISABLE
- DATABASE_TRACE_ENABLE / DATABASE_TRACE_DISABLE
- CLIENT_ID_TRACE_ENABLE / CLIENT_ID_TRACE_DISABLE
- CLIENT_ID_STAT_ENABLE / CLIENT_ID_STAT_DISABLE
- SERV_MOD_ACT_TRACE_ENABLE / SERV_MOD_ACT_TRACE_DISABLE
- SERV_MOD_ACT_STAT_ENABLE / SERV_MOD_ACT_STAT_DISABLE
Trace is enabled using the following subroutines:
- SESSION_TRACE_ENABLE
- DATABASE_TRACE_ENABLE
- CLIENT_ID_TRACE_ENABLE
- SERV_MOD_ACT_TRACE_ENABLE

By default event 10046 level 8 trace will be enabled, which includes wait events. In Oracle 11.1 these subroutines have an additional PLAN_STATS parameter which specifies when row source statistics are dumped. Possible values are:
- NEVER
- FIRST_EXECUTION (default)
- ALL_EXECUTIONS
Introduced in Oracle 10.2 To enable trace for the entire database use:
EXECUTE DBMS_MONITOR.DATABASE_TRACE_ENABLE;
Trace can be enabled using client identifiers:
- useful when many sessions connect using the same Oracle user
- useful with connection caches

To set a client identifier use DBMS_SESSION.SET_IDENTIFIER. For example:

BEGIN
  DBMS_SESSION.SET_IDENTIFIER ('CLIENT42');
END;
Trace can be enabled for a specific:
- service
- service and module
- service, module and action

To enable trace for SERVICE1 use:

BEGIN
  DBMS_MONITOR.SERV_MOD_ACT_TRACE_ENABLE (SERVICE_NAME => 'SERVICE1');
END;
The TRACE_TYPE column can be:
- CLIENT_ID
- SERVICE
- SERVICE_MODULE
- SERVICE_MODULE_ACTION
- DATABASE

Currently enabled trace aggregations are reported in DBA_ENABLED_AGGREGATIONS.
In Oracle 11.1 and above the diagnostics area has been redesigned. The diagnostics area is located in $ORACLE_BASE/diag and includes the following top-level directories:
- asm
- clients
- crs
- diagtool
- lsnrctl
- netcman
- ofm
- rdbms
- tnslsnr
The trace directory includes:
- server (foreground) process trace files
- background process trace files
- the alert log (text version)

All trace files and the alert log are written to $ORACLE_BASE/diag/rdbms/<database>/<instance>/trace. For example, for database TEST:

$ORACLE_BASE/diag/rdbms/test/TEST1/trace

BACKGROUND_DUMP_DEST and USER_DUMP_DEST both specify the same trace directory by default; both parameters are deprecated in Oracle 11.1.
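A sketch of how the path is assembled, with the database name lower-cased and the instance name kept as-is (values illustrative):

```shell
# Illustrative values for the path components
ORACLE_BASE=/u01/app/oracle
DB_NAME=TEST        # database name, lower-cased in the path
INSTANCE=TEST1      # instance name, kept as-is
DB_LOWER=$(echo $DB_NAME | tr 'A-Z' 'a-z')
echo $ORACLE_BASE/diag/rdbms/$DB_LOWER/$INSTANCE/trace
```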
V$DIAG_INFO dynamic performance view:
- introduced in Oracle 11.1
- returns values for the following diagnostics:

Name                    Example Value
ADR Base                /u01/app/oracle
ADR Home                /u01/app/oracle/diag/rdbms/test/TEST
Diag Enabled            TRUE
Diag Trace              /u01/app/oracle/diag/rdbms/test/TEST/trace
Diag Alert              /u01/app/oracle/diag/rdbms/test/TEST/alert
Diag Incident           /u01/app/oracle/diag/rdbms/test/TEST/incident
Diag Cdump              /u01/app/oracle/diag/rdbms/test/TEST/cdump
Health Monitor          /u01/app/oracle/diag/rdbms/test/TEST/hm
Default Trace File      /u01/app/oracle/diag/rdbms/test/TEST/trace/TEST_ora_14003.trc
Active Problem Count    1
Active Incident Count   2
By default trace is written to standard output. In Oracle 10.1 and above, the same environment variable (SRVM_TRACE) can be used to trace:
- NETCA
- VIPCA
- SRVCONFIG
- GSDCTL
- CLUVFY
- CLUUTIL
References
Metalink Notes
265769.1 - Troubleshooting CRS Reboots
240001.1 - Troubleshooting CRS root.sh problems
341214.1 - How to cleanup after a failed (or successful) Oracle Clusterware installation
294430.1 - MISSCOUNT Definition and Default Values
357808.1 - CRS Diagnostics
272331.1 - CRS 10g Diagnostic Guide
330358.1 - CRS 10g R2 Diagnostic Collection Guide
331168.1 - Clusterware consolidated logging in 10gR2
357808.1 - Diagnosibility for CRS/EVM/RACG
289690.1 - Data Gathering for Troubleshooting RAC and CRS Issues
284752.1 - Increasing CSS Misscount, Reboottime and Disktimeout
462616.1 - Reconfiguring the CSS disktimeout of 10gR2 Clusterware for proper LUN failover
317628.1 - How to replace a corrupt OCR mirror file
279793.1 - How to restore a lost voting disk in 10g