
11gR2 RAC/Grid Clusterware: Best Practices, Pitfalls, and Lessons Learned
Presented during DOUG meeting held on 10/21/2010 at Dallas, TX

Ramkumar Rajagopal

DBARAC is a specialty database consulting firm based in Austin, Texas, with
expertise in a variety of industries.
Our people are experts in Oracle Real Application Clusters (RAC) solutions
for managing large database systems.
We provide proactive database management services, including but not
limited to in-house and on-shore DBA support, remote DB support, database
maintenance, and backup and recovery.
Our DBA experts provide specialized services in these areas:
Root cause analysis
Capacity planning
Performance tuning
Database migration and consolidation
Broad industry expertise
High-availability RAC database specialists
End-to-end database support

11GR2 Grid clusterware 2


Senior Database Consultant DBARAC

Oracle Database/Applications DBA since 1995
Dell, JP Morgan Chase, Verizon
Presenter at Oracle OpenWorld 2007
Author of Dell Power Solutions articles



Agenda

Node eviction issue in 10g
What is 11GR2 Grid Clusterware?
The Challenges
What's different today?
We've seen this before, smart guy
Architecture and Capacity Planning
Upgrade Paths
Pre-installation best practices
Grid Clusterware Installation
Clusterware Startup sequence
Post Install steps
RAC Database build steps


Why is a node evicted?

Split brain condition

IO fencing

CRS keeps the lowest-numbered node up

Node eviction detection

Root Causes of Node Eviction

Network heartbeat lost

Voting disk problems

cssd is not healthy


Hang check timer

cssd and oclsomon race to suicide


11GR2 Grid Clusterware

Node eviction algorithm is enhanced

Prevent a split-brain problem without rebooting the node


Oracle High Availability Services Daemon

Will still reboot in some cases

Faster relocation of services on node failure in 11GR2


9i/10g RAC Scenario

Several separate versions of databases

Several servers

Space/resource issues

Fewer resources

Provisioning takes time


Top concerns

How many are using 11GR2 Grid clusterware?

Do you have more than one mission-critical database
within a single RAC cluster?
Can you allocate resources dynamically to handle peak
volumes of various application loads without
Issues with using shared infrastructure
Will my database availability and recovery suffer?
Will my database performance suffer?
How do you manage a large clustered environment to meet SLAs
for several applications?

Why 11GR2 Grid CRS?

11GR2 Grid Clusterware is

An Architecture

An IT Strategy
Clusterware & ASM storage deployed together
Many, many Oracle Database Instances
Drives Consolidation



Skilled resources

Meeting SLAs

End-to-end testing not possible

Security Controls

Capacity issues

Higher short-term costs


What's different today?

11gR2 Grid CRS and ASM support:

11gR2, 11gR1, 10gR1, and 10gR2 single instances

Powerful servers, 64-bit OS

Provisioning Framework to deploy

Grid control


11GR2 RAC DB Architecture


Capacity Planning

What are the current requirements?

What are the future growth requirements over the next 6-12 months?

Estimate the hardware requirements needed to meet the demand

Data retention requirements

Archiving and purging


Capacity Planning metrics

Database metrics for capacity planning

CPU & memory Utilization
I/O rates
Device utilization
Queue length
Storage utilization
Response time
Transaction rate
Network Packet loss
Network Bandwidth utilization


Capacity Planning Strategy

Examine existing engagement processes

Examine existing capacity of servers/storage

Define Hardware/database scalability

Provisioning for adding capacity

Integration testing

Large clustered database

SLA requirements


Comparison 10g vs 11GR2

Server consolidation

Database consolidation

Instance consolidation

Storage consolidation


Upgrade Paths

Out-of-place clusterware upgrade

Rolling Upgrade

Oracle 10gR2 - from

Oracle 11gR1 - from


Pre-installation best practices

Network Requirements

Cluster Hardware Requirements

ASM Storage Requirements

Verification Checks


Pre-Installation best practices
Network Configuration

SCAN - Single Client Access Name

Failover - Faster relocation of services
Better Load balancing
MTU (packet size) of the network adapter (NIC)
Forwarder, zone entries and reverse lookup
Ping tests
Two dedicated interconnect switches for redundant interconnects
Run cluvfy


Pre install - Network -
SCAN Configuration


Pre Install Network -
SCANVIP Troubleshooting

SCAN Configuration:
$GRID_HOME/bin/srvctl config scan

SCAN Listener Configuration:

$GRID_HOME/bin/srvctl config scan_listener
SCAN Listener Resource Status:
$GRID_HOME/bin/crsctl stat res -w "TYPE = ora.scan_listener.type"

Local and remote listener parameters


Pre Install -Cluster
Hardware requirements

Same OS/kernel on all servers in the cluster

Minimum 32 GB of RAM
Minimum Swap space 16GB
Minimum Grid Home free space 16GB
Allocate 32 GB of space for each Oracle Home directory (32 GB per
database)
Allocate adequate disk space for centralized backups
Allocate adequate storage for ASM diskgroups DATA and FRA
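
The RAM and swap minimums above can be checked with a short script before installation. This is a minimal sketch, assuming a Linux host that exposes `/proc/meminfo`. The function names `meminfo_gib` and `check_minimum` are hypothetical helpers, not part of any Oracle tool, and the thresholds are the ones listed on this slide.

```shell
#!/bin/sh
# Print a /proc/meminfo field (for example MemTotal or SwapTotal) in whole GiB.
# $1 = field name, $2 = meminfo-format file (defaults to /proc/meminfo)
meminfo_gib() {
    awk -v key="$1:" '$1 == key { printf "%d\n", $2 / 1048576 }' "${2:-/proc/meminfo}"
}

# Compare an actual value with a required minimum and report the result.
# $1 = label, $2 = actual GiB, $3 = required GiB
check_minimum() {
    if [ "$2" -ge "$3" ]; then
        echo "PASS $1: $2 GB (minimum $3 GB)"
    else
        echo "FAIL $1: $2 GB (minimum $3 GB)"
        return 1
    fi
}

# Example usage on a cluster node (thresholds taken from the slide):
#   check_minimum RAM  "$(meminfo_gib MemTotal)"  32
#   check_minimum swap "$(meminfo_gib SwapTotal)" 16
```

MemTotal is usually a little lower than the installed RAM because the kernel reserves some memory, so a host with exactly 32 GB installed may report 31 here; adjust the threshold accordingly.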


Cluster Hardware
requirements continued

Most cases: use UDP over 1 Gigabit Ethernet

For large databases - Infiniband/IP or 10 Gigabit Ethernet

Use OS Bonding/teaming to virtualize interconnect

Set UDP send/receive buffers high enough

Crossover cables are not supported
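
Whether the UDP buffer limits are "high enough" can be checked against a chosen minimum. This is a minimal sketch, assuming input in the `key = value` format printed by `sysctl -a`. The helper name `sysctl_at_least` is hypothetical, and the example threshold of 1048576 for `net.core.wmem_max` is the value that appears in the fixup script output later in this deck; confirm the right minimums for your platform in the installation guide.

```shell
#!/bin/sh
# Read "key = value" lines on stdin and succeed only if the named
# parameter is present and its value is at least the given minimum.
# If the key appears more than once, the last occurrence wins.
# $1 = parameter name, $2 = minimum value
sysctl_at_least() {
    awk -v key="$1" -v min="$2" '
        $1 == key && $2 == "=" { found = 1; last = $3 }
        END { exit !(found && last + 0 >= min + 0) }
    '
}

# Example usage on a cluster node:
#   sysctl -a 2>/dev/null | sysctl_at_least net.core.wmem_max 1048576 \
#       || echo "net.core.wmem_max is below 1048576"
```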


Pre Install - ASM Storage

In 11gR2 ASM diskgroups are used

Grid infrastructure - OCR, voting disk, and ASM spfile.
Database - DATA and FRA.

OCR and voting disks for Grid clusterware

OCR can now be stored in Automatic Storage Management (ASM).

Add a second diskgroup for the OCR using:

- ./ocrconfig -add +DATA02
Change the compatibility of the new diskgroup to 11.2 as follows:
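
One common way to raise a disk group's compatibility is the ASM `compatible.asm` disk group attribute. The sketch below assumes that attribute is what is meant here and only prints the SQL statement so it can be reviewed first. The helper name `asm_compat_sql` is hypothetical, and `DATA02` is the disk group name from the `ocrconfig` example above.

```shell
#!/bin/sh
# Print the ALTER DISKGROUP statement that sets compatible.asm.
# $1 = disk group name, $2 = compatibility level (for example 11.2)
asm_compat_sql() {
    printf "ALTER DISKGROUP %s SET ATTRIBUTE 'compatible.asm' = '%s';\n" "$1" "$2"
}

# Example usage, run as the Grid owner with the ASM environment set
# (assumption: OS authentication as SYSASM is available):
#   asm_compat_sql DATA02 11.2 | sqlplus -S / as sysasm
```

Disk group compatibility can only be advanced, not lowered, so review the printed statement before piping it into `sqlplus`.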


Hardware/Software details

10gR2 architecture: 9 database servers, 25 TB storage

Original Database Version:
Original RAC cluster version:
Original Operating System: Red Hat Linux 5 AS 64-bit
Storage Type: ASM and raw storage
11gR2 Grid architecture: 4 database servers, 40 TB
New Database Version:
New Grid Clusterware/ASM version:
New Operating System: Red Hat Linux 5 AS 64-bit
Data migration steps using RMAN backup and restore and
Data Pump export dump files

11GR2 Migration Steps

Install 11gR2 Grid clusterware and ASM

Install 11gR2 database binaries for each database separately
Create the 11gR2 database
Add additional ASM diskgroups
Install 11GR1/10gR2 database binaries
Create 11GR1/10gR2 databases
Take backup
Restore the data
Verification checks - cluvfy

Before Clusterware installation

./cluvfy stage -pre crsinst -n node1,node2,node3 -verbose

Before Database installation

./cluvfy stage -pre dbinst -n node1,node2,node3 -fixup -verbose
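
The `-n` node list has to be a single comma-separated argument with no spaces, which is easy to get wrong when typing it by hand. This is a minimal sketch that builds the list from separate node names and prints the pre-Clusterware check command for review. The helper names `join_nodes` and `cluvfy_pre_crsinst_cmd` are hypothetical, and the node names are the placeholders used on this slide.

```shell
#!/bin/sh
# Join all arguments with commas and no spaces.
join_nodes() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Print (not run) the pre-Clusterware cluvfy command for the given nodes.
cluvfy_pre_crsinst_cmd() {
    printf './cluvfy stage -pre crsinst -n %s -verbose\n' "$(join_nodes "$@")"
}

# Example usage:
#   cluvfy_pre_crsinst_cmd node1 node2 node3
```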


11gR2 Grid Clusterware Installation

Steps 1 to 14 (slide images not reproduced)

root> /tmp/
Response file being used is
Enable file being used is
Log file location: /tmp/CVU_11.
Setting Kernel Parameters...
fs.file-max = 327679
fs.file-max = 6815744
net.ipv4.ip_local_port_range = 9000 65500
net.core.wmem_max = 262144
net.core.wmem_max = 1048576

Steps 15 and 16 (slide images not reproduced)


cd /home/oracle/oraInventory
[root@oradb-grid1 oraInventory]# ./
Changing permissions of /home/oracle/oraInventory.
Adding read, write permissions for group.
Removing read,write,execute permissions for world.

Changing groupname of /home/oracle/oraInventory to oinstall.

The execution of the script is complete.
[root@oradb-grid1 oraInventory]# cd /u01/app/oracle
[root@oradb-grid1 oracle]# ls
product scripts
[root@oradb-grid1 oracle]# cd product
[root@oradb-grid1 product]# ls
[root@oradb-grid1 product]# cd 11*
[root@oradb-grid1 11.2.0]# ls
[root@oradb-grid1 11.2.0]# cd db*
[root@oradb-grid1 grid]# ls root*
[root@oradb-grid1 grid]# ./
Running Oracle 11g script...



The following environment variables are set as:

ORACLE_HOME= /u01/app/oracle/product/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:

The file "dbhome" already exists in /usr/local/bin. Overwrite it? (y/n)
The file "oraenv" already exists in /usr/local/bin. Overwrite it? (y/n)
The file "coraenv" already exists in /usr/local/bin. Overwrite it? (y/n)
Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of script.
Now product-specific root actions will be performed.
2009-10-02 10:31:44: Parsing the host name
2009-10-02 10:31:44: Checking for super user privileges
2009-10-02 10:31:44: User has super user privileges
Using configuration parameter file:
Creating trace directory


Creating OCR keys for user 'root', privgrp 'root'..

Operation successful.
root wallet
root wallet cert
root cert export
peer wallet
profile reader wallet
pa wallet
peer wallet keys
pa wallet keys
peer cert request
pa cert request
peer cert
pa cert
peer root cert TP
profile reader root cert TP
pa root cert TP
peer pa cert TP
pa peer cert TP
profile reader pa cert TP
profile reader peer cert TP
peer user cert
pa user cert
Adding daemon to inittab


CRS-4123: Oracle High Availability Services has been started.

ohasd is starting
CRS-2672: Attempting to start 'ora.gipcd' on 'oradb-grid1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'oradb-grid1'
CRS-2676: Start of 'ora.gipcd' on 'oradb-grid1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'oradb-grid1'
CRS-2676: Start of 'ora.gpnpd' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'oradb-grid1'
CRS-2676: Start of 'ora.cssdmonitor' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'oradb-grid1'
CRS-2672: Attempting to start 'ora.diskmon' on 'oradb-grid1'
CRS-2676: Start of 'ora.diskmon' on 'oradb-grid1' succeeded
CRS-2676: Start of 'ora.cssd' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'oradb-grid1'
CRS-2676: Start of 'ora.ctssd' on 'oradb-grid1' succeeded

ASM created and started successfully.

DiskGroup DATA created successfully.


clscfg: -install mode specified

Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-2672: Attempting to start 'ora.crsd' on 'oradb-grid1'
CRS-2676: Start of 'ora.crsd' on 'oradb-grid1' succeeded
CRS-4256: Updating the profile
Successful addition of voting disk 659585bf3a834f39bf281fd47e9ed6db.
Successful addition of voting disk 762177cd6f844f25bfc677fb681a02ab.
Successful addition of voting disk 154c17a0de9c4ffdbff9a3b2f22b52f6.
Successfully replaced voting disk group with +DATA.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 659585bf3a834f39bf281fd47e9ed6db (/dev/oracleasm/disks/VOL1) [DATA]
2. ONLINE 762177cd6f844f25bfc677fb681a02ab (/dev/oracleasm/disks/VOL2) [DATA]
3. ONLINE 154c17a0de9c4ffdbff9a3b2f22b52f6 (/dev/oracleasm/disks/VOL4) [DATA]
Located 3 voting disk(s).
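
The voting disk table above is in the format printed by `crsctl query css votedisk`, so the number of online voting disks can be counted from it. This is a minimal sketch, assuming that column layout; the helper name `count_online_votedisks` is hypothetical.

```shell
#!/bin/sh
# Count the rows whose STATE column (second field) is ONLINE.
# Reads the votedisk listing on stdin and prints the count.
count_online_votedisks() {
    awk '$2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Example usage on a cluster node:
#   crsctl query css votedisk | count_online_votedisks
```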

CRS-2673: Attempting to stop 'ora.crsd' on 'oradb-grid1'
CRS-2677: Stop of 'ora.crsd' on 'oradb-grid1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'oradb-grid1'
CRS-2677: Stop of 'ora.asm' on 'oradb-grid1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'oradb-grid1'
CRS-2677: Stop of 'ora.ctssd' on 'oradb-grid1' succeeded
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'oradb-grid1'
CRS-2677: Stop of 'ora.cssdmonitor' on 'oradb-grid1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'oradb-grid1'
CRS-2677: Stop of 'ora.cssd' on 'oradb-grid1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'oradb-grid1'
CRS-2677: Stop of 'ora.gpnpd' on 'oradb-grid1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'oradb-grid1'
CRS-2677: Stop of 'ora.gipcd' on 'oradb-grid1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'oradb-grid1'
CRS-2677: Stop of 'ora.mdnsd' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.mdnsd' on 'oradb-grid1'
CRS-2676: Start of 'ora.mdnsd' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'oradb-grid1'
CRS-2676: Start of 'ora.gipcd' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'oradb-grid1'
CRS-2676: Start of 'ora.gpnpd' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'oradb-grid1'
CRS-2676: Start of 'ora.cssdmonitor' on 'oradb-grid1' succeeded


CRS-2672: Attempting to start 'ora.cssd' on 'oradb-grid1'

CRS-2672: Attempting to start 'ora.diskmon' on 'oradb-grid1'
CRS-2676: Start of 'ora.diskmon' on 'oradb-grid1' succeeded
CRS-2676: Start of 'ora.cssd' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'oradb-grid1'
CRS-2676: Start of 'ora.ctssd' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'oradb-grid1'
CRS-2676: Start of 'ora.asm' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'oradb-grid1'
CRS-2676: Start of 'ora.crsd' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'oradb-grid1'
CRS-2676: Start of 'ora.evmd' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'oradb-grid1'
CRS-2676: Start of 'ora.asm' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.DATA.dg' on 'oradb-grid1'
CRS-2676: Start of 'ora.DATA.dg' on 'oradb-grid1' succeeded
CRS-2672: Attempting to start 'ora.registry.acfs' on 'oradb-grid1'
CRS-2676: Start of 'ora.registry.acfs' on 'oradb-grid1' succeeded


oradb-grid1 2009/10/02 10:37:08 /u01/app/oracle/product/11.2.0/grid/cdata/oradb-

Preparing packages for installation...
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Updating inventory properties for clusterware
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB. Actual 39997 MB Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /home/oracle/oraInventory
'UpdateNodeList' was successful.

oradb-grid2 output :-

[root@oradb-grid2 oraInventory]# pwd

[root@oradb-grid2 oraInventory]# ./
Changing permissions of /home/oracle/oraInventory.

Grid CRS Startup sequence

Startup diagram (image not reproduced); components by row, top to bottom:

DB Resource, SCAN Listener, Services, ONS
networkResource, SCAN VIPs, Node VIPs, ACFS Reg, GCS VIP
crsdRootAgent, crsdOraAgent
mdnsd, gipcd, ASM, gpnpd, evmd
crsd, ctssd, Diskmon, ACFS
oraRootAgent, oraAgent, cssdAgent, cssdMonitor



Clusterware verification

Clusterware processes

$ ps -ef|grep -v grep |grep d.bin

oracle 9824 1 0 Jul14 ? 00:00:00 /u01/app/grid11gR2/bin/oclskd.bin
root 22161 1 0 Jul13 ? 00:00:15 /u01/app/grid11gR2/bin/ohasd.bin reboot
oracle 24161 1 0 Jul13 ? 00:00:00 /u01/app/grid11gR2/bin/mdnsd.bin
oracle 24172 1 0 Jul13 ? 00:00:00 /u01/app/grid11gR2/bin/gipcd.bin
oracle 24183 1 0 Jul13 ? 00:00:03 /u01/app/grid11gR2/bin/gpnpd.bin
oracle 24257 1 0 Jul13 ? 00:01:26 /u01/app/grid11gR2/bin/ocssd.bin
root 24309 1 0 Jul13 ? 00:00:06 /u01/app/grid11gR2/bin/octssd.bin
root 24323 1 0 Jul13 ? 00:01:03 /u01/app/grid11gR2/bin/crsd.bin reboot
root 24346 1 0 Jul13 ? 00:00:00 /u01/app/grid11gR2/bin/oclskd.bin
oracle 24374 1 0 Jul13 ? 00:00:03 /u01/app/grid11gR2/bin/evmd.bin
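
The process listing above can be turned into a quick check that the expected daemons are running. This is a minimal sketch that reads `ps -ef` output on stdin and reports any named binary that does not appear. The helper name `missing_daemons` is hypothetical, and which daemons to require is an assumption to adapt to your cluster.

```shell
#!/bin/sh
# Read `ps -ef` output on stdin; arguments are daemon binary names.
# Prints each name that is not found and returns 1 if any is missing.
missing_daemons() {
    _ps_out=$(cat)
    _missing=0
    for _d in "$@"; do
        case "$_ps_out" in
            *"/$_d"*) ;;                      # binary path present
            *) echo "$_d"; _missing=1 ;;      # not found in the listing
        esac
    done
    return "$_missing"
}

# Example usage on a cluster node:
#   ps -ef | missing_daemons ohasd.bin ocssd.bin crsd.bin evmd.bin
```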


Clusterware verification

Clusterware checks

$ crsctl check crs

CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
$ crsctl check has
CRS-4638: Oracle High Availability Services is online
$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is
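
For scripted monitoring, the `crsctl check crs` output shown above can be reduced to a single pass/fail result. This is a minimal sketch, assuming every healthy component line starts with a `CRS-` message code and ends with "is online", as in the output on this slide. The helper name `crs_all_online` is hypothetical.

```shell
#!/bin/sh
# Read `crsctl check crs` output on stdin. Succeed only if at least one
# CRS- message line is present and every such line ends with "is online".
crs_all_online() {
    awk '
        /^CRS-[0-9]+:/ { seen++; if ($0 !~ /is online[ \t\r]*$/) bad++ }
        END { exit !(seen > 0 && bad == 0) }
    '
}

# Example usage on a cluster node:
#   crsctl check crs | crs_all_online || echo "Clusterware stack is not fully online"
```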


Clusterware verification

Clusterware processes
$ crsctl check cluster -all
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online


Clusterware verification

$ crsctl status resource -t

ora.DATA.dg ASM disk group (new resource)
ONLINE ONLINE rat-rm2-ipfix006
ONLINE ONLINE rat-rm2-ipfix007
ONLINE ONLINE rat-rm2-ipfix008
ONLINE ONLINE rat-rm2-ipfix006
ONLINE ONLINE rat-rm2-ipfix007
ONLINE ONLINE rat-rm2-ipfix008
ONLINE ONLINE rat-rm2-ipfix006 Started
ONLINE ONLINE rat-rm2-ipfix007 Started
ONLINE ONLINE rat-rm2-ipfix008 Started
ora.eons new resource
ONLINE ONLINE rat-rm2-ipfix006
ONLINE ONLINE rat-rm2-ipfix007
ONLINE ONLINE rat-rm2-ipfix008


Post Install Steps

For a one-off patch, unlock the Grid home first

# perl -unlock -crshome
Download and Install the latest Patch Updates
Back Up the Script
Install Cluster Health Management
Install OS Watcher and RACDDT
Check the backups of OCR and voting disks
Lock the Grid home after patch installation



Grid control

Adding/dropping nodes

Automatically discovers services

Policy-based cluster management

Automated Cluster Patching

End to end management of the cluster


Scalability & Flexibility
Server pools of hardware available

Consolidation of hardware and storage

Rapid provisioning of resources to add capacity where it is required

Improved utilization of resources

Better ROI



Reduced Hardware
Improved availability SLAs
Shorter Time to add additional server or storage
Higher Security
Data Sharing & visibility
Better application performance
Centralized backup and archive
Higher ROI through higher utilization
Pride in Ownership, Eliminating the Assembly Line
Bottom line = reduce TCO!



