
Sections to be discussed

• Basic RAC Questions
• RAC Installation Questions
• RAC Upgrade/Patching Questions
• RAC Data Guard Configuration Questions
• RAC Troubleshooting Questions

1) What are Oracle Clusterware processes for 10g on UNIX and Linux?
Cluster Synchronization Services (ocssd):
Manages cluster node membership and runs as the oracle user; failure of this process
results in a node restart.
• CSS provides basic Group Services support; it is a distributed group membership
system that allows applications to coordinate activities to achieve a common result.
• Group services use vendor clusterware group services when they are available.
• Lock services provide the basic cluster-wide serialization locking functions, using
a First In, First Out (FIFO) mechanism to manage locking.
• Node services use the OCR to store data and update the information during
reconfiguration; they also manage the OCR data, which is otherwise static.

Cluster Ready Services (crsd):

The crs process manages cluster resources (which could be a database, an instance, a
service, a Listener, a virtual IP (VIP) address, an application process, and so on) based on
the resource's configuration information that is stored in the OCR. This includes start, stop,
monitor and failover operations. This process runs as the root user.

The CRSd process manages resources, such as starting and stopping the services and
failing over the application resources; it also spawns separate processes to manage
application resources. CRS manages the OCR and stores the current known state of the
cluster; it requires a public, private and VIP interface in order to run. OCSSd provides
synchronization services among nodes: it provides access to the node membership and
enables basic cluster services, including cluster group services and locking. Failure of this
daemon causes the node to be rebooted to avoid split-brain situations.

Event manager daemon (evmd):

A background process that publishes the events that crs creates.

The Event Management Logger runs as the EVMd process. The daemon spawns a
process called evmlogger and generates events when things happen. The evmlogger
spawns new child processes on demand and scans the callout directory to invoke
callouts. Death of the EVMd daemon does not halt the instance; the daemon is restarted.

Process Monitor Daemon (OPROCD):

This process monitors the cluster and provides I/O fencing. OPROCD performs its check,
stops running, and if the wake-up comes later than the expected time, OPROCD resets the
processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting
the node. OPROCD uses the hangcheck timer on Linux platforms.
The OPROCd daemon provides the I/O fencing for the Oracle cluster; it uses the hangcheck
timer or watchdog timer for cluster integrity. It is locked into memory and runs as a
real-time process; failure of this daemon results in the node being rebooted. Fencing is
used to protect the data: if a node has problems, fencing presumes the worst and
restarts the node in question. It is better to be safe than sorry.

RACG (racgmain, racgimon):


Extends clusterware to support Oracle-specific requirements and complex resources. Runs
server callout scripts when FAN events occur.

In 10g, CRS consisted of three major components, as shown in Figure 1. These components
manifested themselves as daemons, which ran out of inittab on Linux/Unix, or as services
on Windows. The three daemons were:

• Oracle Cluster Synchronization Services daemon (CSSD)
• Cluster Ready Services daemon (CRSD), which is the main engine for maintaining
availability of resources
• Event Manager Daemon (EVMD)
Of these three components, CSSD and EVMD ran as user oracle, while CRSD ran as root.
The CSSD was responsible for cluster synchronization, cluster membership, and group
membership; EVMD handled event messaging for the processes; and CRSD managed the
resources. Resource management such as start, stop, and monitor was done using scripts
and processes that came under the RACG label. An example would be racgimon, which
monitored the status of database instances.
In Oracle 11g Release 1, the init-managed stack remained; however, as you see in Figure 2,
the Oracle 11g Release 2 startup and process stacks have completely changed. The CRS
stack has effectively been split into two stacks, with the Oracle High Availability Services
daemon (OHASD) handling the low-level processes and the Cluster Ready Services daemon
(CRSD) handling the higher level resources such as database instances. These two stacks
no longer use the old RACG framework but have a new agent framework to manage their
availability. There is now also a local registry (OLR), managed by OHASD, alongside
the Cluster Registry (OCR), which is managed by CRSD.

CLUSTERWARE PROCESSES in 11g RAC R2 Environment

i) Cluster Ready Services (CRS)

$ ps -ef | grep crs | grep -v grep
root 25863 1 1 Oct27 ? 11:37:32 /opt/oracle/grid/product/11.2.0/bin/crsd.bin reboot
crsd.bin => Responsible for start, stop, monitor and failover of
resources. It maintains the OCR and also restarts resources when a failure occurs.
This applies to RAC systems. For Oracle Restart and ASM, ohasd is used.
ii) Cluster Synchronization Service (CSS)
$ ps -ef | grep -v grep | grep css
root 19541 1 0 Oct27 ? 00:05:55 /opt/oracle/grid/product/11.2.0/bin/cssdmonitor
root 19558 1 0 Oct27 ? 00:05:45 /opt/oracle/grid/product/11.2.0/bin/cssdagent
oragrid 19576 1 6 Oct27 ? 2-19:13:56 /opt/oracle/grid/product/11.2.0/bin/ocssd.bin
cssdmonitor => Monitors node hangs (via oprocd functionality), monitors OCSSD
process hangs (via oclsomon functionality) and monitors vendor clusterware (via vmon
functionality). This is a multithreaded process that runs with elevated priority.
Startup sequence: INIT --> init.ohasd --> ohasd --> ohasd.bin --> cssdmonitor
cssdagent => Spawned by the OHASD process. It takes over the role that oprocd had in 10g and is
responsible for I/O fencing. Killing this process causes a node reboot. It stops, starts and
checks the status of the ocssd.bin daemon.
Startup sequence: INIT --> init.ohasd --> ohasd --> ohasd.bin --> cssdagent
ocssd.bin => Manages cluster node membership and runs as the oragrid user. Failure of this
process results in a node restart.
Startup sequence: INIT --> init.ohasd --> ohasd --> ohasd.bin --> cssdagent --> ocssd -->
ocssd.bin
iii) Event Management (EVM)
$ ps -ef | grep evm | grep -v grep
oragrid 24623 1 0 Oct27 ? 00:30:25 /opt/oracle/grid/product/11.2.0/bin/evmd.bin
oragrid 25934 24623 0 Oct27 ? 00:00:00
/opt/oracle/grid/product/11.2.0/bin/evmlogger.bin -o
/opt/oracle/grid/product/11.2.0/evm/log/evmlogger.info -l
/opt/oracle/grid/product/11.2.0/evm/log/evmlogger.log
evmd.bin => Distributes and communicates some cluster events to all of the cluster
members so that they are aware of cluster changes.
evmlogger.bin => Started by evmd.bin. It reads the configuration files, determines which
events to subscribe to from EVMD, and runs user-defined actions for those events.
iv) Oracle Root Agent
$ ps -ef | grep -v grep | grep orarootagent
root 19395 1 0 Oct17 ? 12:06:57 /opt/oracle/grid/product/11.2.0/bin/orarootagent.bin
root 25853 1 1 Oct17 ? 16:30:45 /opt/oracle/grid/product/11.2.0/bin/orarootagent.bin
orarootagent.bin => A specialized oraagent process that helps crsd manage resources
owned by root, such as the network and the Grid virtual IP address.
The two processes listed above are separate agent instances: one is started by OHASD and the other by CRSD.
v) Cluster Time Synchronization Service (CTSS)
$ ps -ef | grep ctss | grep -v grep
root 24600 1 0 Oct27 ? 00:38:10 /opt/oracle/grid/product/11.2.0/bin/octssd.bin reboot
octssd.bin => Provides time management in a cluster for Oracle Clusterware.
vi) Oracle Agent
$ ps -ef | grep -v grep | grep oraagent
oragrid 5337 1 0 Nov14 ? 00:35:47 /opt/oracle/grid/product/11.2.0/bin/oraagent.bin
oracle 8886 1 1 10:25 ? 00:00:05 /opt/oracle/grid/product/11.2.0/bin/oraagent.bin
oragrid 19481 1 0 Oct27 ? 01:45:19 /opt/oracle/grid/product/11.2.0/bin/oraagent.bin
oraagent.bin => Extends clusterware to support Oracle-specific requirements and complex
resources. This process runs server callout scripts when FAN events occur. This process was
known as RACG in Oracle Clusterware 11g Release 1 (11.1).
ORACLE HIGH AVAILABILITY SERVICES STACK
i) Cluster Logger Service
$ ps -ef | grep -v grep | grep ologgerd
root 24856 1 0 Oct27 ? 01:43:48 /opt/oracle/grid/product/11.2.0/bin/ologgerd -m
mg5hfmr02a -r -d /opt/oracle/grid/product/11.2.0/crf/db/mg5hfmr01a
ologgerd => Receives information from all the nodes in the cluster and persists it in a CHM
repository-based database. This service runs on only two nodes in a cluster.
ii) System Monitor Service (osysmond)
$ ps -ef | grep -v grep | grep osysmond
root 19528 1 0 Oct27 ? 09:42:16 /opt/oracle/grid/product/11.2.0/bin/osysmond
osysmond => The monitoring and operating system metric collection service that sends
the data to the cluster logger service. This service runs on every node in a cluster.
iii) Grid Plug and Play (GPNPD)
$ ps -ef | grep gpn
oragrid 19502 1 0 Oct27 ? 00:21:13 /opt/oracle/grid/product/11.2.0/bin/gpnpd.bin
gpnpd.bin => Provides access to the Grid Plug and Play profile, and coordinates updates to
the profile among the nodes of the cluster to ensure that all of the nodes have the most
recent profile.
iv) Grid Interprocess Communication (GIPC)
$ ps -ef | grep -v grep | grep gipc
oragrid 19516 1 0 Oct27 ? 01:51:41 /opt/oracle/grid/product/11.2.0/bin/gipcd.bin
gipcd.bin => A support daemon that enables Redundant Interconnect Usage.
v) Multicast Domain Name Service (mDNS)
$ ps -ef | grep -v grep | grep dns
oragrid 19493 1 0 Oct27 ? 00:01:18 /opt/oracle/grid/product/11.2.0/bin/mdnsd.bin
mdnsd.bin => Used by Grid Plug and Play to locate profiles in the cluster, as well as by
GNS to perform name resolution. The mDNS process is a background process on Linux,
UNIX and Windows.
vi) Oracle Grid Naming Service (GNS)
$ ps -ef | grep -v grep | grep gns
gnsd.bin => Handles requests sent by external DNS servers, performing name resolution
for names defined by the cluster.
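
A quick way to compare a node against the daemon list above is to scan `ps -ef` output for the binary names. The following is only a minimal sketch, not an Oracle-supplied tool: the function name check_daemons is made up for illustration, and the daemon list is simply the set of process names mentioned in this section.

```shell
# Report which of the 11g R2 clusterware daemons named in this section
# appear in `ps -ef` output supplied on stdin.
# check_daemons is a hypothetical helper name, not part of Oracle Clusterware.
check_daemons() {
    ps_output=$(cat)   # read stdin once so every daemon is checked against it
    for daemon in ohasd.bin crsd.bin ocssd.bin cssdagent cssdmonitor \
                  evmd.bin octssd.bin gpnpd.bin gipcd.bin mdnsd.bin; do
        case "$ps_output" in
            *"/bin/$daemon"*) echo "$daemon: running" ;;
            *)                echo "$daemon: NOT FOUND" ;;
        esac
    done
}
```

On a cluster node it would be used as `ps -ef | check_daemons`. For an authoritative status, `crsctl check crs` is the supported command.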

2) What are Oracle database background processes specific to RAC?


LMS—Global Cache Service Process
LMD—Global Enqueue Service Daemon
LMON—Global Enqueue Service Monitor
LCK—Instance Enqueue Process
DIAG- Diagnostic Daemon

LMS (Lock Manager Server Process) —Global Cache Service Process (GCS):
This is the cache fusion part and the most active process; it handles the consistent copies of
blocks that are transferred between instances. It receives requests from LMD to perform
lock requests. It rolls back any uncommitted transactions. There can be up to ten LMS
processes running, and more can be started dynamically if demand requires it.

They manage lock manager service requests for GCS resources and send them to a service
queue to be handled by the LMSn process. They also handle global deadlock detection and
monitor for lock conversion timeouts.
As a performance gain, you can increase the priority of this process to make sure CPU starvation
does not occur.

You can see the statistics of this daemon by looking at the view X$KJMSDP.

The Global Cache Service Processes (LMSx) are the processes that handle remote Global
Cache Service (GCS) messages.

This process maintains the status of datafiles and of each cached block by recording information
in a Global Resource Directory (GRD). It also controls the flow of messages to
remote instances, manages global data block access and transmits block images
between the buffer caches of different instances. This processing is part of the cache fusion
feature.

Note: Real Application Clusters software provides for up to 10 Global Cache Service
Processes. The number of LMSx processes varies depending on the amount of messaging traffic
among nodes in the cluster.

• The primary job is to transport blocks across the nodes for cache-fusion requests.
• The Lock Manager Server Process is used in Cache Fusion. It enables consistent copies of
blocks to be transferred from a holding instance’s buffer cache to a requesting
instance’s buffer cache.
• It rolls back any uncommitted transactions for any blocks that are being requested for
a consistent read by the remote instance.

LMD (Lock Monitor Daemon)—Global Enqueue Service Daemon (GES):


This manages the enqueue manager service requests for the GCS. It also handles deadlock
detection and remote resource requests from other instances.

You can see the statistics of this daemon by looking at the view X$KJMDDP.

LMON (Lock Monitor Process)—Global Enqueue Service Monitor (GES):


This process manages the GES; it maintains consistency of the GCS memory structures in case
of process death. It is also responsible for cluster reconfiguration and lock reconfiguration
(when a node joins or leaves); it checks for instance deaths and listens for local messaging.

A detailed log file is created that tracks any reconfigurations that have happened.

LCK (Lock Process)—Instance Enqueue Process (Lock Process – GES):


It manages instance resource requests and cross-instance call operations for shared
resources. It builds a list of invalid lock elements and validates lock elements during
recovery.

DIAG (Diagnostic Daemon):


This is a lightweight process; it uses the DIAG framework to monitor the health of the
cluster. It captures information for later diagnosis in the event of failures. It will perform
any necessary recovery if an operational hang is detected.
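
To see which of these RAC-specific background processes are running for an instance, the `ps -ef` output can be filtered by name. This is a rough sketch that assumes the usual `ora_<process>_<SID>` naming of Oracle background processes; the function name rac_bg_processes is invented for illustration.

```shell
# Keep only the RAC-specific background processes (LMS, LMD, LMON, LCK, DIAG)
# from `ps -ef` output supplied on stdin.
# Assumes the ora_<process>_<SID> naming convention; rac_bg_processes is a
# hypothetical helper name.
rac_bg_processes() {
    grep -E 'ora_(lms[0-9a-z]|lmd[0-9]|lmon|lck[0-9]|diag)_'
}
```

Typical use would be `ps -ef | rac_bg_processes`; processes such as PMON or SMON, which also exist in a single-instance database, are filtered out.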

3) What are Oracle Clusterware Components?


Voting Disk:
Oracle RAC uses the voting disk to manage cluster membership by way of a health check,
and to arbitrate cluster ownership among the instances in case of network failures. The voting
disk must reside on shared disk.
You can have up to 32 voting disks in your cluster (Oracle Clusterware 11g Release 2 lowers this limit to 15).

Oracle Cluster Registry (OCR):


Maintains cluster configuration information as well as configuration information about any
cluster database within the cluster. The OCR must reside on shared disk that is accessible
by all of the nodes in your cluster.

4) How do you troubleshoot node reboot?


There are many reasons for a RAC node eviction; a few are listed below:

• Hardware failure: A failure of any of the major hardware components (CPU, RAM,
network interconnect) can cause a node eviction.
• Server overload: A server that is experiencing RAM swapping might trigger a node
eviction. It is important that each node be properly configured.
• Voting disk communications: This can happen when communication with the voting disk
is interrupted, causing the disconnected node to be evicted and rebooted.
• Database issues: If the database (or the ASM instance) is not responding (a database
"hang" condition), then a node eviction may occur.

Troubleshooting:
1. Look at the cssd.log files on both nodes; the second node usually holds more information
when the first node is the one that was evicted. Check the crsd.log file as well.
Analysis:
If you see "Polling" keywords with decreasing percentage values in the cssd.log file, the
eviction is probably due to the network.
If you see "Diskpingout" or something related to -DISK-, then the eviction is because of
a disk timeout.
If the network was the issue, check whether any NICs were down or whether link switching has
happened, and check that the private interconnect is working between both nodes.
Check the OS-level health: /var/log/messages (on Linux)
2. Collect NMON/OS Watcher/RDA reports to confirm whether it was a disk issue or a
network issue. If the reports show heavy memory contention or paging, collect an
AWR report to see what load/SQL was running during that period.
3. The evicted node will have a core dump file generated and system reboot information.
4. Find out whether there was a node reboot and whether it was caused by CRS or by something
else; check the system reboot time.
5. Sometimes an eviction is due to an OS error where the system is in a halted state for a
while, memory is overcommitted or the CPU is 100% used; check the OS/system log files to get
more information.
6. What changed recently? Ask your coworker to open a ticket with Oracle and
upload the logs.
7. Check the health of the clusterware, DB instances, ASM instances, the uptime of all hosts and all
the logs (ASM logs, Grid logs, CRS and ocssd.log, HAS logs, EVM logs, DB instance logs,
OS logs, SAN logs) for that particular timestamp.
8. Run TFA and OSWatcher, and check netstat, ifconfig settings, etc., based on the error messages
found during your RCA.
9. Verify user equivalence between cluster nodes.
10. A major cause of node evictions on our cluster, however, was that the patch levels
were not equal across the two nodes.
Nodes sometimes died completely, without any error whatsoever. It turned out to be a bug in
the installer of the 11.1.0.7.1 PSU.
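
The keyword check from the Analysis step can be sketched as a small shell function. This is only a first-pass triage aid under the assumptions stated in the text: it looks for "Polling" (network) and "Diskpingout" or -DISK- (disk) in whatever log file is passed to it. The function name classify_eviction is made up, and the exact message wording in a given Clusterware version may differ.

```shell
# First-pass triage of a CSS log, using the keywords described above.
# $1: path to the cssd/ocssd log file to inspect.
# Prints one of: network, disk, both, unknown.
# classify_eviction is a hypothetical helper name.
classify_eviction() {
    log_file=$1
    net=0
    disk=0
    # "Polling" messages point at missed network heartbeats.
    grep -qi -e 'polling' "$log_file" && net=1
    # "Diskpingout" or -DISK- messages point at voting disk timeouts.
    grep -qiE -e 'diskpingout|-DISK-' "$log_file" && disk=1

    if [ "$net" -eq 1 ] && [ "$disk" -eq 1 ]; then
        echo both
    elif [ "$net" -eq 1 ]; then
        echo network
    elif [ "$disk" -eq 1 ]; then
        echo disk
    else
        echo unknown
    fi
}
```

A result of "unknown" only means the keywords were not found; the OS logs and the other steps in the list still need to be checked.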

COMMON CAUSES OF NODE EVICTIONS:


1. High CPU or resource consumption on the nodes, which prevents processes such as ocssd,
oprocd (10.2, 11.1), cssdagent or cssdmonitor from executing.
2. Too little memory available at the OS level.
3. Bad huge pages values.
4. Bad network or interconnect delays, heartbeat failure, loss of network connectivity between
nodes.
5. Storage hang (failed I/O against the voting disk).
6. An Oracle bug hit while performing a CRS upgrade; check MOS first for possible bugs with a
particular version. CSS not healthy.
7. All NICs are in the same subnet.
8. Unnecessary entries in the hosts file.
9. An increase in the size of the control file caused instability and instance/node evictions on
10.2.0.4.

Location of Logs:
$ORA_CRS_HOME/crs/log: Contains trace files for the CRS resources.
$ORA_CRS_HOME/crs/init: Contains trace files of the CRS daemon during startup. This is a
good place to start with any CRS login problems.
$ORA_CRS_HOME/css/log : The Cluster Synchronization (CSS) logs indicate all actions such
as reconfigurations, missed check-ins, connects, and disconnects from the client CSS
listener. In some cases, the logger logs messages with the category of auth.crit for the
reboots done by Oracle. This could be used for checking the exact time when the reboot
occurred.
$ORA_CRS_HOME/css/init : Contains core dumps from the Oracle Cluster Synchronization
Service daemon (OCSSd) and the process ID (PID) for the CSS daemon whose death is
treated as fatal. If abnormal restarts for CSS exist, the core files will have the format of
core.
$ORA_CRS_HOME/evm/log: Log files for the Event Manager (EVM) and evmlogger
daemons. Not used as often for debugging as the CRS and CSS directories.
$ORA_CRS_HOME/evm/init: PID and lock files for EVM. Core files for EVM should also be
written here.
$ORA_CRS_HOME/srvm/log: Log files for Oracle Cluster Registry (OCR), which contains the
details at the Oracle cluster level.
$ORA_CRS_HOME/log: Log files for Oracle Clusterware (known as the cluster alert log),
which contains diagnostic messages at the Oracle cluster level. This is available from Oracle
database 10g R2.

For More Details:

Please check Metalink:
Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle
Clusterware Node evictions.

5) How do you backup the OCR?


There is an automatic backup mechanism for OCR. The default location is:
$ORA_CRS_HOME/cdata/cluster_name/

Automatic backups:

a) Oracle Clusterware (CRSD) automatically creates OCR backups every 4 hours.
b) A backup is created for each full day.
c) A backup is created at the end of each week.
d) Oracle Clusterware retains the last three 4-hour backup copies of the OCR.

To display backups:
# ocrconfig -showbackup

To restore a backup:
# ocrconfig -restore backup_file_name

With Oracle RAC 10g Release 2 or later, you can also use the export command:
# ocrconfig -export file_name -s online

With Oracle RAC 11g Release 1 or later, you can do a manual backup of the OCR with the command:
# ocrconfig -manualbackup
6) How do you backup voting disk in 10g and 11g?


10g:
# dd if=voting_disk_name of=backup_file_name

11g:
In 11g Release 2 you no longer have to back up the voting disks. In fact, according to Oracle
documentation, restoring voting disks that were copied using the "dd" or "cp" command
may prevent your clusterware from starting up.

In 11g Release 2, voting disk data is automatically backed up in the OCR whenever
there is a configuration change.

The data is also automatically restored to any voting disk that is added.


How does voting happen?

The CKPT process updates the control file every 3 seconds in an operation known as
the heartbeat.
CKPT writes to a single block that is unique to each instance, so coordination between
instances is not required. This block is called the checkpoint progress record.

All members of the cluster attempt to obtain a lock on a control file record for updating.
The instance that obtains the lock tallies the votes from all members. The group
membership must then conform to the decided (voted) membership before GCS/GES is allowed to
proceed with reconfiguration. The control file record is then stored in the same block as the
heartbeat in the control file checkpoint progress record.

What are NETWORK and DISK HEARTBEAT, and how are they registered in VOTING DISKS/FILES?

• All nodes in the RAC cluster register their heartbeat information in the voting
disks/files. The RAC heartbeat is the polling mechanism that is sent over the cluster
interconnect to ensure all RAC nodes are available.
• Voting disks/files are like an attendance register in which the nodes
mark their attendance (heartbeats).
• The CSSD process on every node makes entries in the voting disk to ascertain the
membership of the node. While marking their own presence, all the nodes also
register information about their ability to communicate with other nodes in the voting
disk. This is called the NETWORK HEARTBEAT.
• The CSSD process on each RAC node maintains the heartbeat in a block the size of one OS
block, in the hot block of the voting disk at a specific offset. The written block has a header area
with the node name. The heartbeat counter increments every second on every write
call. Thus the heartbeats of the various nodes are recorded at different offsets in the voting
disk. This process is called the DISK HEARTBEAT.
• In addition to maintaining its own disk block, each CSSD process also monitors the disk
blocks maintained by the CSSD processes of the other nodes in the cluster. Healthy nodes
have continuous network and disk heartbeats exchanged between the nodes. A break in
the heartbeats indicates a possible error scenario.
• If the disk block is not updated within a short timeout period, the node is considered unhealthy
and may be rebooted to protect the database. In this case, a message to this effect
is written in the KILL BLOCK of the node. Each node reads its KILL BLOCK once per
second; if the kill block has been overwritten, the node commits suicide.
• During reconfiguration (a node leaving or joining), CSSD monitors the heartbeat information
of all nodes and determines whether each node has a disk heartbeat, including those with no
network heartbeat. If no disk heartbeat is detected, then the node is considered
dead.

What Information is stored in VOTING DISK/FILE?

It contains two types of data:

Static data: Information about the nodes in the cluster
Dynamic data: Disk heartbeat logging

It contains important details of cluster node membership, such as:

• Which node is part of the cluster?
• Which node is leaving the cluster?
• Which node is joining the cluster?

Purpose of the voting disk, or why the voting disk is needed

• Voting disks are used by clusterware for health checks.
• They are used by CSS to determine which nodes are currently members of the cluster.
• They are used in concert with other cluster components such as CRS to shut down, fence or
reboot a single node or multiple nodes whenever network communication is lost between any
nodes within the cluster, in order to prevent the split-brain condition in which two or more
instances attempt to control the RAC database, thus protecting the database.
• They are used by CSS to arbitrate (make an authoritative decision) with peers that it is
not able to see over the private interconnect in the event of an outage, allowing it to
salvage (rescue from loss) the largest fully connected sub-cluster for further
operation. During this operation, node membership (NM) makes an entry in the
voting disk to record its vote on availability. Other instances in the cluster do
the same. The three voting disks configured also provide a method to determine
which node in the cluster should survive.
Example: If eviction of one of the nodes becomes necessary because it is unresponsive, then the
node that has two voting disks will start evicting the other node. NM alternates its action
between the heartbeat and the voting disk to determine the availability of the other nodes
in the cluster.

Possible scenarios in voting disks

As we now know, voting disks are used by CSSD. They contain both network and disk
heartbeats from all nodes, and any break in the heartbeat results in eviction of the node
from the cluster. The possible scenarios with missing heartbeats are:
• The network heartbeat is successful, but the disk heartbeat is missed.
• The disk heartbeat is successful, but the network heartbeat is missed.
• Both heartbeats fail.

When a cluster has many nodes, a few more scenarios are possible:

• The nodes have split into N sets of nodes, communicating within the sets but not with
the members of other sets.
• Just one node becomes unhealthy. Nodes with quorum (the minimum number of nodes needed to
make the cluster valid) maintain active membership of the cluster, and the other node(s)
are fenced/rebooted.

Why should we have an ODD number of voting disks?

A node must be able to access more than half of the voting disks at any time.

Example:

• Consider a 2-node cluster with an even number of voting disks, say 2.
• Suppose node 1 is able to access only voting disk 1.
• Node 2 is able to access only voting disk 2.
• In this situation there is no common file where clusterware
can check the heartbeat of both nodes.
• If we have 3 voting disks and both nodes are able to access more than half, i.e. 2
voting disks, there will be at least one disk that is accessed by both nodes.
The clusterware can use this disk to check the heartbeat of the nodes.
• A node that cannot do so is evicted from the cluster by another node that has
access to more than half the voting disks, to maintain the integrity of the cluster.
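
The "more than half" rule is simple integer arithmetic, and writing it out shows why an even number of voting disks buys nothing. The two helper functions below are an illustrative sketch; their names are made up and they are not part of any Oracle tool.

```shell
# Minimum number of voting disks a node must be able to access:
# strictly more than half of the configured disks.
votedisk_majority() {
    echo $(( $1 / 2 + 1 ))
}

# Number of voting disks that can be lost while a node still
# reaches that majority.
votedisk_tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}
```

With 2 disks a node needs both of them, so no failure is tolerated; with 3 disks it needs 2 and tolerates 1 failure; with 4 disks it needs 3 and still tolerates only 1. Going from 3 to 4 disks adds cost without adding resilience, which is why odd counts are used.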

Where voting disks are stored

They can be stored on:
• Raw devices
• A cluster file system supported by Oracle RAC, such as OCFS, Sun Cluster or VERITAS
Cluster Filesystem
• ASM disks (in 11gR2)

When the voting disk is stored in ASM, the question arises of how the voting file on ASM can be
accessed when we want to add a new node to a cluster.

Oracle ASM reserves several blocks at a fixed location on every Oracle ASM disk used for
storing the voting files. As a result, Oracle Clusterware can access the voting disks present
in ASM even if the ASM instance is down, and CSS can continue to maintain the Oracle
cluster even if ASM has failed. The physical location of the voting files on ASM disks is
fixed, i.e. the cluster stack does not rely on a running ASM instance to access the files.

If the voting disk is stored in ASM, the multiplexing of the voting disk is determined by the
redundancy of the diskgroup.

Redundancy of       # of copies of       Minimum # of disks
the diskgroup       voting disk          in the diskgroup
External            1                    1
Normal              3                    3
High                5                    5
7) How do I identify the voting disk location?


#crsctl query css votedisk

8) How do I identify the OCR file location?


Check /var/opt/oracle/ocr.loc or /etc/oracle/ocr.loc (depending on the platform)
Or
#ocrcheck
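
If only the configured location is needed, it can be read straight from ocr.loc. The sketch below assumes the file uses plain key=value lines with an ocrconfig_loc entry; the function name ocr_location is made up for illustration, and the path to ocr.loc is passed in by the caller because it differs by platform.

```shell
# Print the OCR location recorded in an ocr.loc file.
# $1: path to ocr.loc (platform dependent, see above).
# Assumes key=value lines with an "ocrconfig_loc" key; ocr_location is a
# hypothetical helper name.
ocr_location() {
    awk -F= '$1 == "ocrconfig_loc" { print $2 }' "$1"
}
```

Unlike ocrcheck, this only reports where the OCR is configured to be; it says nothing about whether the OCR is healthy.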

9) Is ssh required for normal Oracle RAC operation?

"ssh" is not required for normal Oracle RAC operation. However, "ssh" should be enabled
for Oracle RAC and patchset installation.

10) What is SCAN? How SCAN works? Benefits of SCAN? How to configure SCAN? How
many SCAN Listeners and why?
What is SCAN?
Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle
Database running in a cluster. The benefit is that clients using SCAN do not need to change if
you add or remove nodes in the cluster.

How does SCAN work?


SCAN is a cluster resource managed by CRS, so SCAN is well aware of
what is going on in the cluster. Although Oracle documentation describes SCAN as a
recommendation, it is effectively mandatory because the Oracle 11gR2 OUI will not proceed
without it. SCAN sits on top of the VIPs, but you can connect directly to the local listener if you
would like to bypass SCAN in the client's tnsnames.ora. Clients use the SCAN name in tnsnames.ora
to make the DB connection. The SCAN listener forwards the request to the local listener that is
running on the VIP.

SCAN needs to resolve to one to three IP addresses with the same name. Oracle
recommends using three IP addresses for SCAN in DNS. There will be only three SCAN
listeners, even if the cluster has dozens of nodes. SCAN listeners are started
from the Grid Oracle Home, not the database/RDBMS home. Since SCAN is part of the grid, it can be
used for all the databases in the cluster, so we no longer need to run netca to create listeners in DB
Homes. If the default port, 1521, is used, Oracle instances (PMON) automatically
register with the SCAN listener. Here is a quick look at the load
balancing flow with SCAN, as described in the Oracle documentation:

• The PMON process of each instance registers the database services with the default
listener on the local node and with each SCAN listener, which is specified by the
REMOTE_LISTENER database parameter.
• The Oracle client connects using the SCAN name: myscan:1521/sales.example.com
• The client queries DNS to resolve scan_name.
• The SCAN listener selects the least loaded node (node2 in this example).
• The client connects to the local listener on node2. The local listener starts a
dedicated server process for the connection to the database.
• The client connects directly to the dedicated server process on node2 and accesses
the sales2 database instance.
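
Because everything starts with the client resolving the SCAN name, a simple sanity check is to count how many distinct addresses the name resolves to. The sketch below reads the addresses from stdin so that any resolver tool can feed it; the function name check_scan_addresses is made up, and the 1-to-3 range is the rule stated above.

```shell
# Count the distinct IP addresses a SCAN name resolves to (one per line
# on stdin) and compare the count with the 1-to-3 rule described above.
# check_scan_addresses is a hypothetical helper name.
check_scan_addresses() {
    # sort -u removes duplicates; grep -c . counts the non-empty lines.
    count=$(sort -u | grep -c . || true)
    if [ "$count" -ge 1 ] && [ "$count" -le 3 ]; then
        echo "OK: $count SCAN address(es)"
    else
        echo "WARN: $count SCAN address(es), expected 1 to 3"
    fi
}
```

For example, where dig is installed, `dig +short myscan.example.com | check_scan_addresses` would check the example name used above (the domain is only a placeholder).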

After the installation, two SCAN listeners would be started on one node and another SCAN
listener on another node in a two node cluster.

How to setup DNS for Oracle 11g R2 SCAN?


Oracle recommends three IP Addresses be used for SCAN.

How many SCAN Listeners and why?


There is no hard technical reason behind the number. Oracle recommends a configuration that
has been used and tested in a variety of environments and found stable for performance, scalability
and high availability.

During normal operation, a SCAN listener goes through roughly the following chain of states
(there may be more steps than the ones listed here):
Ready for service –> Busy (Received request from Client) –> Busy (Identifying least loaded
node) –> Busy (Redirecting connection to local listener of least loaded node) –> Busy
(Handing off address of that local listener to client) –> Ready for service

So let’s consider the following points:

-> Having fewer than 2 SCAN listeners would be a concern for HA.
-> Having 2 SCAN listeners would be good for HA.
-> With round-robin processing, a request should reach a listener that is ready to take it.
With only 2 SCAN listeners, a client can end up approaching a SCAN listener
that is still busy and not yet ready to handle its request.
-> So 2 SCAN listeners would be good enough, but the N+1 rule applies: take one
more than what you need for your system. The same applies to the number of control
files: you should be fine with two control files, but it is better to have 2+1.
-> So having 2+1 is good for both HA and scalability.

Finally, according to Oracle, 3 SCAN listeners are sufficient
to handle any peak load of connection requests, even on the largest cluster. If your
environment still faces a bottleneck, you have the option to add more SCAN listeners.

http://www.freeoraclehelp.com/2011/12/scan-setup-for-oracle-11g-release211gr2.html
https://saruamit4.wordpress.com/2013/09/27/how-many-scan-listeners/
http://www.oracle.com/technetwork/products/clustering/overview/scan-129069.pdf

11) What is the purpose of Private Interconnect?


Clusterware uses the private interconnect for cluster synchronization (network heartbeat)
and daemon communication between the clustered nodes. This communication is based
on the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP).
Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches
of participating nodes in the cluster.

12) Why do we have a Virtual IP (VIP) in Oracle RAC?


Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP
timeout period (which can be up to 10 min) before getting an error. As a result, you
don't really have a good HA solution without using VIPs.
When a node fails, the VIP associated with it is automatically failed over to some other node
and new node re-arps the world indicating a new MAC address for the IP. Subsequent
packets sent to the VIP go to the new node, which will send error RST packets back to the
clients. This results in the clients getting errors immediately.

13) What do you do if you see GC CR BLOCK LOST in top 5 Timed Events in AWR Report?
This is most likely due to a fault in the interconnect network.
Check the output of netstat -s.
If you see "fragments dropped" or "packet reassemblies failed", work with your system
administrator to find the fault in the network.
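A minimal sketch of that check, assuming a Linux-style netstat -s whose IP section reports counters worded like "fragments dropped after timeout" or "packet reassembles failed". The exact wording varies by platform, so the patterns are an assumption, and the function name check_ip_reassembly is hypothetical.

```shell
#!/bin/sh
# check_ip_reassembly: read `netstat -s` output on stdin and print any
# counters that point at IP fragment loss or failed reassembly.
# Exit status 0 = suspicious non-zero counters found, 1 = none found.
check_ip_reassembly() {
    awk '
        # Assumed counter wording; a line only counts when its leading
        # number is greater than zero.
        (tolower($0) ~ /fragments dropped|reassembl(y|ies|es) failed/) && ($1 + 0 > 0) {
            sub(/^[ \t]+/, "")
            print
            found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Typical use on a cluster node:
#   netstat -s | check_ip_reassembly && echo "check the interconnect"
```

Counters that keep growing between two runs are a stronger signal than a single non-zero value, so it is worth running the check twice a few minutes apart.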

14) How many nodes are supported in a RAC Database?


Oracle 10g Release 2 supports up to 100 nodes in a cluster using Oracle Clusterware, and up
to 100 instances in a RAC database.

15) Srvctl cannot start an instance and returns errors PRKP-1001 and CRS-0215, but
sqlplus can start it on both nodes. How do you identify the problem?
Set the environment variable SRVM_TRACE to true, and start the instance with srvctl. You
will now get a detailed error stack.
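A small sketch of that step, assuming a POSIX shell. The wrapper name with_srvm_trace and the database and instance names in the usage comment are placeholders, not part of any Oracle tool.

```shell
#!/bin/sh
# with_srvm_trace CMD [ARGS...]: run one command with SRVM_TRACE=true set
# only for that command, so tracing does not leak into the rest of the session.
with_srvm_trace() {
    env SRVM_TRACE=true "$@"
}

# Example (placeholder names):
#   with_srvm_trace srvctl start instance -d orcl -i orcl1
```

Scoping the variable to a single command avoids verbose trace output from every later srvctl call in the same shell.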

16) What is the purpose of the ONS daemon?


The Oracle Notification Service (ONS) daemon is a daemon started by the CRS clusterware
as part of the nodeapps. There is one ONS daemon started per clustered node.
The Oracle Notification Service daemon receives a subset of published clusterware events
via the local EVMD and RACGIMON clusterware daemons and forwards those events to
application subscribers and to the local listeners.

This facilitates:
 FAN (Fast Application Notification), the feature that allows applications to respond to
database state changes.
 In 10gR2, the Load Balancing Advisory, the feature that permits load balancing across
RAC nodes depending on the load on each node. The RDBMS MMON process creates an
advisory for distribution of work every 30 seconds and forwards it via RACGIMON and
ONS to listeners and applications.

17) What is the split-brain scenario?


In Oracle RAC, split-brain is the scenario in which one or more nodes update the database
files without coordinating with the other nodes. In that scenario there is a high
possibility of compromising database integrity and introducing corruption into the
database.

18) What is the role and purpose of voting disk/file in RAC?


In Oracle RAC, the voting disk file is used to determine the state of each node in the cluster.
Each node must write a heartbeat to the voting disk at a predetermined interval (every
1 second), so that the other nodes in the cluster know the node is alive. If a node cannot
register its heartbeat to the voting disk within the stipulated time frame, it is fenced out of
the cluster to avoid a split-brain scenario, which might introduce corruption into the
database. The Oracle Cluster Synchronization Service Daemon (OCSSD) is responsible for
maintaining synchronization of the cluster using the voting disk.

Voting disks record node membership information. Oracle Clusterware uses the voting disk to
determine which instances are members of a cluster. The voting disk must reside on a
shared disk. For high availability, Oracle recommends that you have a minimum of three
voting disks. If you configure a single voting disk, then you should use external mirroring
to provide redundancy. You can have up to 32 voting disks in your cluster.

19) How would you find the interconnect IP address from any node within an Oracle 10g
RAC configuration?
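Use the oifcfg command. For example, oifcfg getif lists each configured interface with its
subnet and its role (public or cluster_interconnect).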

20) How many OCR and voting disks should one have?
For redundancy, one should have at least two OCR disks and three voting disks (raw disk
partitions). These disk partitions should be spread across different physical disks.

21) What is TAF? (Transparent Application Failover)


After an Oracle RAC node crashes—usually from a hardware failure—all new application
transactions are automatically rerouted to a specified backup node.
The challenge in rerouting is to not lose transactions that were "in flight" at the exact
moment of the crash.
One of the requirements of continuous availability is the ability to restart in-flight application
transactions, allowing a failed node to resume processing on another server without
interruption.
Oracle's answer to application failover is a new Oracle Net mechanism dubbed Transparent
Application Failover.
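TAF allows the DBA to configure the type of failover (SESSION or SELECT) and the method
(BASIC or PRECONNECT) for each Oracle Net client. Note that uncommitted DML is rolled
back on failover and must be resubmitted by the application.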

22) What is FAN and FCF?


Fast Connection Failover (FCF) is an Oracle RAC Fast Application Notification (FAN)
client implemented through the connection pool.
The feature requires the use of an Oracle JDBC driver and an Oracle RAC database.

23) What is dynamic remastering? When does dynamic remastering happen?
 Dynamic remastering is the ability to move the ownership (mastership) of a resource from
one instance to another instance in RAC.
 Dynamic resource remastering is used to implement resource affinity for increased
performance.
 Resource affinity optimizes the system in situations where update transactions are
being executed on one instance.
 When activity shifts to another instance, the resource affinity moves correspondingly to
that instance.
 If activity is not localized, then resource ownership is hashed across the instances.
 In 10g, dynamic remastering happens at the file and object level.
 The conditions for remastering are very stringent.
 One instance must touch the resource at least 50 times more than the other instance
within a particular period (say 10 minutes).
 This touch ratio and time can be tuned with the _gc_affinity_limit and _gc_affinity_time
parameters.

24) Why we required to maintain odd number of voting disks?


An odd number of voting disks is used to avoid split brain.
When nodes in the cluster cannot talk to each other, they race to lock the voting disks, and
whichever locks more disks survives. If the number of disks is even, each node might lock
50% of the disks (2 out of 4), and there is then no way to decide which node to evict.
When the number is odd, one node will always hold more disks than the other, so the cluster
can evict the node holding fewer.
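The arithmetic behind the odd-number rule can be sketched as follows. It assumes the majority rule from the Oracle Clusterware documentation, namely that a node must be able to access more than half of the configured voting disks to stay in the cluster; the function names are illustrative only.

```shell
#!/bin/sh
# votedisks_required N: smallest number of voting disks a node must still
# be able to access (a strict majority of N).
votedisks_required() {
    echo $(( $1 / 2 + 1 ))
}

# votedisk_failures_tolerated N: how many voting disks can be lost while
# a strict majority remains accessible.
votedisk_failures_tolerated() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 4 5; do
    echo "disks=$n required=$(votedisks_required "$n") tolerated=$(votedisk_failures_tolerated "$n")"
done
```

Going from three to four disks raises the number required from 2 to 3 without tolerating any additional failure, which is why an even count buys nothing.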

25) How you check the health of Your RAC Database?


The 'crsctl' command can be run as root or as the oracle user to check the clusterware health,
but starting or stopping the clusterware requires root or another privileged user.

$ crsctl check crs

26) If there is some issue with a virtual IP, how will you troubleshoot it? How will you change
the virtual IP?
$ srvctl modify nodeapps -n node_name -A new_address

27) How will you back up your RAC database?
A backup of a RAC database should cover:
 OCR
 Voting disk
 Database files, controlfiles, redolog files & Archive log files

28) Do you have any idea of load balancing in application? How load balancing is done?
http://practicalappsdba.wordpress.com/category/for-master-apps-dbas/

29) Give the usage of srvctl?


srvctl start instance -d db_name -i "inst_name_list" [-o start_options]
srvctl stop instance -d name -i "inst_name_list" [-o stop_options]
srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate
srvctl start database -d name [-o start_options]
srvctl stop database -d name [-o stop_options]
srvctl start database -d orcl -o mount

30) What is GRD?


GRD stands for Global Resource Directory.
The GES and GCS maintain records of the status of each datafile and each cached block
using global resource directory. This process is referred to as cache fusion and helps in data
integrity.

31) Give Details on ACMS?


ACMS stands for Atomic Controlfile Memory Service.
In an Oracle RAC environment, ACMS is an agent that ensures a distributed SGA memory
update; that is, SGA updates are globally committed on success or globally aborted in the
event of a failure.

32) What are the major RAC wait events?


In a RAC environment the buffer cache is global across all instances in the cluster and hence
the processing differs.
The most common wait events related to this are gc cr request and gc buffer busy.

GC CR request: the time it takes to retrieve the data from the remote cache.

Reason: RAC traffic over a slow connection, or inefficient queries. Poorly tuned queries
increase the number of data blocks requested by an Oracle session, and the more blocks
requested, the more often a block must be read from a remote instance via the
interconnect.

GC BUFFER BUSY: It is the time the remote instance locally spends accessing the requested
data block.

33) Give details on GTX0-j


The process provides transparent support for XA global transactions in a RAC environment.
The database autotunes the number of these processes based on the workload of XA global
transactions.

34) Give details on LMON


This process monitors global enqueues and resources across the cluster and performs global
enqueue recovery operations. It is called the Global Enqueue Service Monitor.

35) Give details on LMD
This process is called the Global Enqueue Service Daemon.
It manages incoming remote resource requests within each instance.

36) Give details on LMS


This process is called the Global Cache Service process.
It maintains the status of datafiles and of each cached block by recording information
in the Global Resource Directory (GRD). It also controls the flow of messages to
remote instances, manages global data block access and transmits block images
between the buffer caches of different instances. This processing is part of the Cache
Fusion feature.

37) Give details on LCK0


This process is called the instance enqueue process.
It manages non-Cache Fusion resource requests such as library cache and row cache
requests.

38) Give details on RMSn
This process is called the Oracle RAC management process.
These processes perform manageability tasks for Oracle RAC.
Tasks include the creation of resources related to Oracle RAC when new instances are added
to the cluster.

39) Give details on RSMN


This process is called the Remote Slave Monitor.
It manages background slave process creation and communication on remote
instances. These background slave processes perform tasks on behalf of a coordinating
process running in another instance.

40) How to export and import crs resources while migrating Oracle RAC to new server?
The script below generates srvctl add commands for the database, instance, service and 11g
listener resources from the OCR of the current RAC.

Save the result of the script and run it at new RAC.

for DBNAME in $(srvctl config database)
do
# generate DB resource
srvctl config database -d $DBNAME -a | awk -v dbname="$DBNAME" \
'BEGIN { FS=":" }
$1~/Oracle home/ || $1~/ORACLE_HOME/ {dbhome = "-o" $2}
$1~/Spfile/ || $1~/SPFILE/ {spfile = "-p" $2}
$1~/Disk Groups/ {dg = "-a" $2}
END { if (avail == "-a ") {avail = ""}; printf "%s %s %s %s %s\n", "srvctl add database -d ", dbname, dbhome, spfile, dg }'
# generate Instance resource
srvctl status database -d $DBNAME | awk -v dbname="$DBNAME" \
'$4~/running/ { printf "%s %s %s %s %s %s\n", "srvctl add instance -d ",dbname, " -i ",
$2 ," -n ", $7 }
$5~/running/ { printf "%s %s %s %s %s %s \n", "srvctl add instance -d ",dbname, " -i ",
$2 ," -n ", $8 }'
# modify instance for 10G - ASM dependency
if [ $(echo $ORACLE_HOME | grep "1020" | wc -l ) -eq 1 ]
then
srvctl status database -d $DBNAME | awk -v dbname="$DBNAME" \
'$2~/1$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",dbname, " -i ", $2 ," -s +ASM1" }
$2~/2$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",dbname, " -i ", $2 ," -s +ASM2" }
$2~/3$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",dbname, " -i ", $2 ," -s +ASM3" }
$2~/4$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",dbname, " -i ", $2 ," -s +ASM4" }'
fi
echo "srvctl start database -d $DBNAME"
# Generate Service resource
snamelist=$(srvctl status service -d $DBNAME | awk '{print $2}')
for sname in $snamelist
do
srvctl config service -d $DBNAME -s $sname| awk -v dbname="$DBNAME" -v sname=$sname \
'BEGIN { FS=":"}
$1~/Preferred instances/ {pref = "-r" $2}
$1~/PREF/ {pref = "-r" $2; sub(/AVAIL/, "", pref) }
$1~/Available instances/ {avail = "-a" $2}
$2~/AVAIL/ {avail = "-a" $3}
$1~/Failover type/ {ft = "-e" $2}
$1~/Failover method/ {fm = "-m" $2}
$1~/Runtime Load Balancing Goal/ {g = "-B" $2}
END { if (avail == "-a ") {avail = ""}; printf "%s %s %s %s %s %s %s %s %s %s\n",
"srvctl add service -d ",dbname, "-s ", sname, pref, avail ,ft, fm,g, "-P BASIC"}'
echo "srvctl start service -d $DBNAME -s $sname"
done
done
# Listener at 11G Home. 10G listener can't be added with srvctl.
srvctl config listener | awk \
'BEGIN { FS=":"; state = 0; }
$1~/Name/ {lname = "-l" $2; state=1};
$1~/Home/ && state == 1 {ohome = "-o" $2; state=2;}
$1~/End points/ && state == 2 {lport = "-p " $3; state=3;}
state == 3 {if (ohome != "-o ") {printf "%s %s %s %s\n", "srvctl add listener ", lname,
ohome, lport;} state=0;}'

41) What components in RAC must reside in shared storage?


All datafiles, controlfiles, SPFILEs and redo log files must reside on cluster-aware shared
storage.

42) What is the significance of using cluster-aware shared storage in an Oracle RAC
environment?
All instances of an Oracle RAC database can access all the datafiles, control files, SPFILEs
and redo log files when these files are hosted on cluster-aware shared storage, which is a
group of shared disks.

43) Give few examples for solutions that support cluster storage?
ASM (Automatic Storage Management), raw disk devices, Network File System (NFS), OCFS2
and OCFS (Oracle Cluster File System).

44) How can we configure the cluster interconnect?


Configure User Datagram Protocol (UDP) on Gigabit Ethernet for cluster interconnects.
On UNIX and Linux systems, Oracle Clusterware uses the UDP and RDS (Reliable Datagram
Sockets) protocols. Windows clusters use the TCP protocol.

45) Can we use crossover cables with Oracle Clusterware interconnect?


No, crossover cables are not supported with Oracle Clusterware interconnects.

46) What is the use of cluster interconnect?


Cluster interconnect is used by the Cache fusion for inter instance communication.

47) How do users connect to database in an Oracle RAC environment?


Users can access a RAC database using a client/server configuration or through one or more
middle tiers, with or without connection pooling. Users can use oracle services feature to
connect to database.

48) What is the use of a service in Oracle RAC environment?


Applications should use the services feature to connect to the Oracle database. Services
enable us to define rules and characteristics to control how users and applications
connect to database instances.

49) What are the characteristics controlled by the Oracle services feature?
The characteristics include a unique name, workload balancing and failover options, and
high availability characteristics.

50) What is the significance of VIP address failover?


When a VIP address failover happens, clients that attempt to connect to the VIP address
receive a rapid connection refused error. They don't have to wait for TCP connection
timeout messages.

51) How do we verify that RAC instances are running?


Issue the following query from any one node, connecting through SQL*Plus.
SQL> select * from V$ACTIVE_INSTANCES;

52) What is FAN?


FAN stands for Fast Application Notification and relates to events concerning
instances, services and nodes. It is a notification mechanism that Oracle RAC uses
to notify other processes about configuration and service-level information, including
service status changes such as UP or DOWN events. Applications can respond to FAN
events and take immediate action.

53) Where can we apply FAN UP and DOWN events?


FAN UP and FAN DOWN events can be applied to instances, services and nodes.

State the use of FAN events in case of a cluster configuration change?
During cluster configuration changes, the Oracle RAC high availability framework
publishes a FAN event immediately when a state change occurs in the cluster, so
applications can receive FAN events and react immediately. This saves applications from
having to poll the database to detect a problem after such a state change.

54) What is rolling upgrade? And how to apply rolling patch in RAC?
It is a new ASM feature in Oracle Database 11g. ASM instances in Oracle Database 11g (from
release 11.1) can be upgraded or patched using the rolling upgrade feature. This enables us
to patch or upgrade ASM nodes in a clustered environment without affecting database
availability. During a rolling upgrade we can maintain a functional cluster while one or more
of the nodes in the cluster are running different software versions.

55) Can rolling upgrade be used to upgrade from 10g to 11g database?
No, it can be used only for Oracle database 11g releases (from 11.1).

56) State the initialization parameters that must have same value for every instance in an
Oracle RAC database?

Some initialization parameters are critical at database creation time and must have the
same value on every instance. Their values must be specified in the SPFILE or PFILE for
every instance. The parameters that must be identical on every instance are listed below:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORDFILE
UNDO_MANAGEMENT

57) What is ORA-00603: ORACLE server session terminated by fatal error or ORA-29702:
error occurred in Cluster Group Service operation?
These errors typically appear when the RAC node name is listed on the loopback address
line (127.0.0.1) of the hosts file.

58) Can the DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all instances?


These parameters can be identical on all instances only if these parameter values are set to
zero.

59) What two parameters must be set at the time of starting up an ASM instance in a RAC
environment?
The parameters CLUSTER_DATABASE (set to TRUE) and INSTANCE_TYPE (set to ASM) must be set.

60) What is a CRS resource?


Oracle Clusterware is used to manage high-availability operations in a cluster. Anything that
Oracle Clusterware manages is known as a CRS resource. Some examples of CRS resources
are a database, an instance, a service, a listener, a VIP address, an application process, etc.

61) How does Oracle Clusterware manage CRS resources?


Oracle clusterware manages CRS resources based on the configuration information of CRS
resources stored in OCR (Oracle Cluster Registry).

62) Name some Oracle clusterware tools and their uses?


OIFCFG - Allocate and deallocate network interfaces, and identify the interconnect being used
OCRCONFIG - Command-line tool for managing the Oracle Cluster Registry
OCRDUMP - Dump the contents of the Oracle Cluster Registry to a file
CVU - Cluster Verification Utility, used to verify the cluster configuration and the status
of its components

63) How do we remove ASM from a Oracle RAC environment?


We need to stop and delete the instance in the node first in interactive or silent mode. After
that ASM can be removed using srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name

64) How do we verify that an instance has been removed from OCR after deleting an
instance?
srvctl config database -d database_name
cd CRS_HOME/bin
./crs_stat

65) How do we verify an existing current backup of OCR?


ocrconfig -showbackup

66) What are the types of connection load-balancing?


There are two types of connection load-balancing: server-side load balancing and client-side
load balancing.

67) What is the difference between server-side and client-side connection load balancing?
Client-side load balancing happens on the client, which spreads its connection requests across
the available listeners. In server-side load balancing, the listener uses a load-balancing
advisory to redirect connections to the instance providing the best service.

68) Write a sample script for RMAN for the recovery if all the instance are down. (First
explain the procedure how you will restore)?
Bring all nodes down.
Start one node.
Restore all datafiles and archive logs.
Recover the database from that node.
Open the database.
Bring the other nodes up.
Confirm that all nodes are operational.
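
The procedure above might be scripted along these lines. This is a sketch only: it assumes a usable RMAN backup and a current control file are in place, that complete recovery is possible (so no RESETLOGS), and that the commands run from the single node started first. The function name and the file name in the usage comment are hypothetical.

```shell
#!/bin/sh
# print_rman_restore_cmds: emit an RMAN command file for a full restore
# and recovery, to be run from the one node that is started first.
print_rman_restore_cmds() {
    cat <<'EOF'
run {
  startup mount;
  restore database;
  recover database;
  alter database open;
}
EOF
}

# Example use on the first node (other instances stay down until the end):
#   print_rman_restore_cmds > restore_all.rman
#   rman target / cmdfile=restore_all.rman
#   srvctl start database -d db_name
```

Generating the command file separately makes it easy to review the steps before anything is restored.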

69) Clients are performing some operation and suddenly one of the datafile is experiencing
problem what do you do? If the cluster is a two node?
Take the datafile offline, restore and recover it, and then bring it back online.

70) How to move OCR and Voting disk to new storage device?

Moving OCR
==========
You must be logged in as the root user, because root owns the OCR files.
Also an ocrmirror must be in place before trying to replace the OCR device.

Make sure there is a recent backup of the OCR file before making any changes:

ocrconfig -showbackup

If there is not a recent backup copy of the OCR file, an export can be taken for the current
OCR file. Use the following command to generate an export of the online OCR file:

In 10.2

# ocrconfig -export export_file_name -s online

In 11g

# ocrconfig -manualbackup

The new OCR disk must be owned by root, must be in the oinstall group, and must have
permissions set to 640. Provide at least 100 MB disk space for the OCR.

On one node as root run:

# ocrconfig -replace ocr new_ocr_path
# ocrconfig -replace ocrmirror new_ocrmirror_path

Now run ocrcheck to verify that the OCR is pointing to the new file.

Moving Voting Disk
==================
Note: crsctl votedisk commands must be run as root

Shutdown the Oracle Clusterware (crsctl stop crs as root) on all nodes before making any
modification to the voting disk. Determine the current voting disk location using:

crsctl query css votedisk

Take a backup of all voting disk:

dd if=voting_disk_name of=backup_file_name

To move a voting disk, provide the full path including the file name:

crsctl delete css votedisk path -force
crsctl add css votedisk path -force

After modifying the voting disk, start the Oracle Clusterware stack on all nodes

# crsctl start crs

Verify the voting disk location using

crsctl query css votedisk

71) What is runfixup.sh script in Oracle Clusterware 11g release 2 installation?


With Oracle Clusterware 11g release 2, Oracle Universal Installer (OUI) detects when the
minimum requirements for an installation are not met, and creates shell scripts, called
fixup scripts, to finish incomplete system configuration steps. If OUI detects an incomplete
task, then it generates fixup scripts (runfixup.sh). You can run the fixup script after you
click the Fix and Check Again Button.
The Fixup script does the following:
If necessary sets kernel parameters to values required for successful installation, including:
 Shared memory parameters.
 Open file descriptor and UDP send/receive parameters.
 Sets permissions on the Oracle Inventory (central inventory) directory.
 Reconfigures primary and secondary group memberships for the installation owner, if
necessary, for the Oracle Inventory directory and the operating system privileges
groups.
 Sets shell limits if necessary to required values.

72) When exactly during the installation processes are clusterware components created?
After fulfilling the pre-installation requirements, the basic installation steps to follow are:

1. Invoke the Oracle Universal Installer (OUI)

2. Enter the different information for some components like:


- name of the cluster
- public and private node names
- location for OCR and Voting Disks
- network interfaces used for RAC instances
-etc.

3. After the Summary screen, OUI will start copying under the $CRS_HOME (this is the
$ORACLE_HOME for Oracle Clusterware) in the local node the libraries and executables.
- here we will have the daemons and scripts init.* created and configured properly.
Oracle Clusterware is formed of several daemons, each of which has a special function
inside the stack. Daemons are executed via the init.* scripts (init.cssd, init.crsd and
init.evmd).

- note that for CRS only some client libraries are recreated, but not all the executables (as
for the RDBMS).

4. Later the software is propagated to the rest of the nodes in the cluster and the
oraInventory is updated.

5. The installer will ask to execute root.sh on each node. Until this step the software for
Oracle Clusterware is inside the $CRS_HOME.

Running root.sh will create several components outside the $CRS_HOME:

- OCR and VD will be formatted.

- control files (or SCLS_SRC files ) will be created with the correct contents to start Oracle
Clusterware.

These files are used to control some aspects of Oracle Clusterware like:
- enable/disable processes from the CSSD family (Eg. oprocd, oslsvmon)
- stop the daemons (ocssd.bin, crsd.bin, etc).
- prevent Oracle Clusterware from being started when the machine boots.
- etc.

- /etc/inittab will be updated and the init process is notified.

In order to start the Oracle Clusterware daemons, the init.* scripts first need to be run.
These scripts are executed by the daemon init. To accomplish this some entries must be
created in the file /etc/inittab.

- the different processes init.* (init.cssd, init.crsd, etc) will start the daemons (ocssd.bin,
crsd.bin, etc). When all the daemons are running then we can say that the
installation was successful

- On 10.2 and later, running root.sh on the last node in the cluster also will create the
nodeapps (VIP, GSD and ONS). On 10.1, VIPCA is executed as part of the RAC installation.

6. After running root.sh on each node, we need to continue with the OUI session. After
pressing the 'OK' button OUI will include the information for the public and
cluster_interconnect interfaces. Also CVU (Cluster Verification Utility) will be executed.

73) How do we backup voting disks?

Oracle recommends that you back up your voting disk after the initial cluster creation and
after you complete any node addition or deletion procedures.
 First, as the root user, stop Oracle Clusterware (with the crsctl stop crs command) on all
nodes. Then, determine the current voting disk by issuing the following command:
crsctl query css votedisk
 Then, issue the dd or ocopy command to back up a voting disk, as appropriate.

Syntax for backing up voting disks:
On Linux or UNIX systems:
dd if=voting_disk_name of=backup_file_name
where voting_disk_name is the name of the active voting disk and backup_file_name is the
name of the file to which we want to back up the voting disk contents.
On Windows systems, use the ocopy command:

ocopy voting_disk_name backup_file_name

74) What is the Oracle Recommendation for backing up voting disk?


Oracle recommends us to use the dd command to backup the voting disk with a minimum
block size of 4KB.
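
A minimal sketch of a backup helper that applies the 4 KB block size, assuming a Linux or UNIX host with dd and cmp available. The function name is hypothetical and the paths in the usage comment are placeholders.

```shell
#!/bin/sh
# backup_voting_disk SRC DST: copy a voting disk to a backup file with dd
# using a 4 KB block size, then verify that the copy is byte-identical.
backup_voting_disk() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=4k 2>/dev/null || return 1
    cmp -s "$src" "$dst"
}

# Example (placeholder paths):
#   backup_voting_disk /dev/raw/raw2 /backup/votedisk_raw2.bak
```

The cmp step is an extra safeguard added here; it makes the helper return a non-zero status if the backup does not match the source.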

75) How do you restore a voting disk?


To restore the backup of your voting disk, issue the dd command on Linux and UNIX systems
or the ocopy command on Windows systems.

dd if=backup_file_name of=voting_disk_name

On Windows systems, use the ocopy command:

ocopy backup_file_name voting_disk_name


Where,
backup_file_name is the name of the voting disk backup file
voting_disk_name is the name of the active voting disk

80) How can we add and remove and move multiple voting disks?
If we have multiple voting disks, then we can remove the voting disks and add them back
into our environment using the following commands, where path is the complete path of the
location where the voting disk resides:

Delete vote disk:

crsctl delete css votedisk path

Add vote disk:

crsctl add css votedisk path


crsctl add css votedisk path -force

Move vote disk:


Replacing the path variable with the fully qualified path name for the voting disk we want to
move:

crsctl delete css votedisk path -force


crsctl add css votedisk path -force

81) What should we do after modifying voting disks?


Restart Oracle Clusterware using the crsctl start crs command on all nodes, and verify the
voting disk location using the following command:

crsctl query css votedisk

82) When can we use -force option?


If our cluster is down, then we can include the -force option to modify the voting disk
configuration, without interacting with active Oracle Clusterware daemons. However, using
the -force option while any cluster node is active may corrupt our configuration.

83) What is split brain syndrome?
In a RAC environment, server nodes communicate with each other over a high-speed private
interconnect network. A split brain situation happens when all the links of the private
interconnect fail but the instances are still up and running. Each
instance then thinks that the other nodes/instances are dead and that it should take over
ownership.
In a split brain situation, instances independently access the data and modify the same blocks,
and the database ends up with changed blocks overwritten, which could lead to data
corruption. To avoid this, various algorithms are implemented to handle the split brain scenario.
In RAC, the IMR (Instance Membership Recovery) service is one of the efficient
algorithms used to detect and resolve split-brain syndrome. When one instance fails to
communicate with the other instances, or when one instance becomes inactive for any
reason and is unable to issue the control file heartbeat, the split brain is detected and the
detecting instance evicts the failed instance from the database. This process is called
node eviction.

84) Connections are hanging? What are the possibilities?


External issues - The network being down, Kerberos security issues, SSO or a firewall
issue can cause an Oracle connection to hang.
One way to test this is to set sqlnet.authentication_services= (none) in your sqlnet.ora
file and retry connecting.
Listener is not running - Start by checking the listener (check lsnrctl stat).
No RAM - Over allocation of server resources, usually RAM, whereby there is not enough
RAM to spawn another connection to Oracle.
Contention - It is not uncommon for an end-user session to “hang” when they are trying to
grab a shared data resource that is held by another end-user.

85) FAN in RAC?


With Oracle RAC in place, database client applications can leverage a number of high
availability features including:
Fast Connection Failover (FCF): Allows a client application to be immediately notified of a
planned or unplanned database service outage by subscribing to Fast Application
Notification (FAN) events.
Run-time Connection Load-Balancing: Uses the Oracle RAC Load Balancing Advisory events
to distribute work appropriately across the cluster nodes and to quickly react to changes in
cluster configuration, overworked nodes, or hangs.
Connection Affinity (11g recommended/required): Routes connections to the same database
instance based on previous connections to an instance to limit performance impacts of
switching between instances.
RAC supports web session and transaction-based affinity for different client scenarios.

86) How to find Cluster Interconnect IP address from Oracle Database?


The easiest way to find the cluster interconnect is to view the “hosts” file.
 Query X$KSXPIA
The following query provides the interconnect IP address registered with Oracle database:
SQL> select IP_KSXPIA from x$ksxpia where PUB_KSXPIA = 'N';
This query should be run on all instances to find the private interconnect IP address used on
their respective nodes.

 Query GV$CLUSTER_INTERCONNECTS view


Querying GV$CLUSTER_INTERCONNECTS view lists the interconnect used by all the
participating instances of the RAC database.
SQL> select INST_ID, IP_ADDRESS from GV$CLUSTER_INTERCONNECTS;

87) How to identify the master node in RAC?


Grep crsd log file
# /u1/app/../crsd>grep MASTER crsd.log | tail -1

(or)

cssd >grep -i "master node" ocssd.log | tail -1

OR You can also use V$GES_RESOURCE view to identify the master node.
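
The grep/tail approach above can be sketched in Python for cases where the log has been copied off the node. This is only a minimal sketch: the function name last_matching_line and the sample log lines are made up for illustration, and real ocssd.log entries will look different.

```python
def last_matching_line(lines, needle):
    """Return the last line containing needle (case-insensitive), or None.

    Mirrors the shell pipeline: grep -i "<needle>" file | tail -1
    """
    needle = needle.lower()
    match = None
    for line in lines:
        if needle in line.lower():
            match = line.rstrip("\n")
    return match


# Hypothetical log content, used only to exercise the function.
sample_log = [
    "line A: reconfiguration started\n",
    "line B: master node number 2\n",
    "line C: heartbeat ok\n",
    "line D: Master Node number 1\n",
]

print(last_matching_line(sample_log, "master node"))
```

With a real log, an open file object can be passed in place of sample_log, since the function only iterates over lines.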

88) How to monitor block transfers over the interconnect between nodes in RAC?


The v$cache_transfer and v$file_cache_transfer views are used to examine RAC statistics.
The types of blocks that use the cluster interconnects in a RAC environment are monitored
with the v$cache_transfer series of views:

v$cache_transfer: This view shows the types and classes of blocks that Oracle transfers
over the cluster interconnect on a per-object basis.
The forced_reads and forced_writes columns can be used to determine the types of objects
the RAC instances are sharing.
Values in the forced_writes column show how often a certain block type is transferred out of
a local buffer cache due to the current version being requested by another instance.

89) What is global cache service (GCS) monitoring?


The use of the GCS relative to the number of buffer cache reads, or logical reads can be
estimated by dividing the sum of GCS requests (global cache gets + global cache
converts + global cache cr blocks received + global cache current blocks received ) by
the number of logical reads (consistent gets + db block gets ) for a given statistics
collection interval.
A global cache service request is made when a user attempts to access the buffer
cache to read or modify a data block and the block is not in the local cache.
The inevitable result is a remote cache read, a disk read, or a change of access privileges.
These requests are all related to logical reads; logical reads form a superset of the
global cache service operations.
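
The ratio described above can be worked through with a small Python sketch. The statistic names are the ones quoted in the answer (the names exposed by a given release may differ), and the interval values are invented purely to show the arithmetic.

```python
def gcs_usage_ratio(stats):
    """Ratio of GCS requests to logical reads for one collection interval.

    stats maps a statistic name to its delta over the interval.
    """
    gcs_requests = (
        stats["global cache gets"]
        + stats["global cache converts"]
        + stats["global cache cr blocks received"]
        + stats["global cache current blocks received"]
    )
    logical_reads = stats["consistent gets"] + stats["db block gets"]
    if logical_reads == 0:
        return 0.0  # no logical reads in the interval, nothing to compare
    return gcs_requests / logical_reads


# Invented deltas for one interval (not measurements from a real system).
interval = {
    "global cache gets": 300,
    "global cache converts": 100,
    "global cache cr blocks received": 400,
    "global cache current blocks received": 200,
    "consistent gets": 15000,
    "db block gets": 5000,
}

print(f"{gcs_usage_ratio(interval):.2%}")  # prints 5.00%
```

Here 1,000 GCS requests against 20,000 logical reads means about 5% of logical reads involved the global cache service in that interval.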

90) What are structural changes in 11g R2 RAC?


http://satya-racdba.blogspot.com/2010/07/new-features-in-9i-10g-11g-rac.html
Grid & ASM are on one home,
Voting disk & ocrfile can be on the ASM,
SCAN,
By using srvctl, we can manage diskgroups, home, ons, eons, filesystem, srvpool, server,
scan, scan_listener, gns, vip, oc4j, GSD

91) What is local OCR? And how to identify the location of OLR?
/etc/oracle/local.ocr
/var/opt/oracle/local.ocr

92) How to backup OLR file? And how to check the OLR backup location?

93) If voting disk/OCR file got corrupted and don’t have backups, how to get them?
We have to install Clusterware.

94) Who will manage OCR files?


crsd will manage OCR.

95) Who will take backup of OCR files?


crsd will take backup.
96) What is the use of SCAN IP (SCAN name) and will it provide load balancing?

Ans:
Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster.
The benefit is clients using SCAN do not need to change if you add or remove nodes in the
cluster.

97) How many SCAN listeners will be running? Why three SCAN?
Three SCAN listeners only

98) What is FCF?


Fast Connection Failover provides high availability to FAN integrated clients, such as clients
that use JDBC, OCI, or ODP.NET.
If you configure the client to use fast connection failover, then the client automatically
subscribes to FAN events and can react to database UP and DOWN events.
In response, Oracle gives the client a connection to an active instance that provides the
requested database service.

99) What are nodeapps?


VIP, listener, ONS, GSD

100) What is gsd (Global Service Daemon)?


http://www.datadisk.co.uk/html_docs/rac/rac_cs.htm

Runs on each node with one GSD process per node.


The GSD coordinates with the cluster manager to receive requests from clients such as the
DBCA, EM, and the SRVCTL utility to execute administrative job tasks such as instance
startup or shutdown.
The GSD is not an Oracle instance background process and is therefore not started with the
Oracle instance

101) How to do load balancing in RAC?

Client Side Connect-Time Load Balance:
---------------------------------------
The client load balancing feature enables clients to randomize connection requests among
the listeners.
This is done by client Tnsnames Parameter: LOAD_BALANCE.
Setting (load_balance=yes) instructs SQLNet to progress through the list of listener addresses
in the address_list section of the net service name in a random sequence. Setting it to OFF
instructs SQLNet to try the addresses sequentially until one succeeds.

Client Side Connect-Time failover
-------------------------------------
This is done by client Tnsnames Parameter: FAILOVER
The (failover=on) enables clients to connect to another listener if the initial connection to
the first listener fails. Without connect-time failover, Oracle Net attempts a connection with
only one listener.
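
How LOAD_BALANCE and FAILOVER interact can be sketched as a toy model in Python. This is not Oracle Net code: the function names, the listener host names, and the is_listener_up callback are all hypothetical, and the model only captures the ordering rules described above.

```python
import random


def addresses_to_try(addresses, load_balance, failover, rng=random):
    """Return listener addresses in the order a client would try them."""
    order = list(addresses)
    if load_balance:
        rng.shuffle(order)   # LOAD_BALANCE on: random sequence
    if not failover:
        order = order[:1]    # FAILOVER off: only one listener is attempted
    return order


def connect(addresses, is_listener_up, load_balance=False, failover=True, rng=random):
    """Return the first reachable address, or None if every attempt fails."""
    for address in addresses_to_try(addresses, load_balance, failover, rng):
        if is_listener_up(address):
            return address
    return None


# Hypothetical listener addresses; node1-vip is treated as down.
listeners = ["node1-vip", "node2-vip", "node3-vip"]
down = {"node1-vip"}

print(connect(listeners, lambda a: a not in down))                  # node2-vip
print(connect(listeners, lambda a: a not in down, failover=False))  # None
```

With load balancing off, the first address is always tried first, so a dead first listener is only survivable when failover is on.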

Server Side Listener Connection Load Balancing
-------------------------------------------------
With server-side load balancing, the listener directs a connection request to the best
instance currently providing the service.
Init parameter remote_listener should be set. When set, each instance registers with the
TNS listeners running on all nodes within the cluster.

There are two types of server-side load balancing:
--------------------------------------------------
Load Based - Server side load balancing redirects connections depending on
node load. This is the default.
Session Based - Session based load balancing takes into account the number of sessions
connected to each node and then distributes the connections to balance the number of
sessions across the different nodes.

From 10g release 2 the service can be set up to use the load balancing advisory. This means
connections can be routed using the SERVICE_TIME and THROUGHPUT goals. Connection load
balancing means the goal of a service can be changed, to reflect the type of connections
using the service.

Transparent Application Failover (TAF):
----------------------------------------------
Transparent Application Failover (TAF) is a feature of the Oracle Call Interface (OCI) driver
at client side. It enables the application to automatically reconnect to a database, if the
database instance to which the connection is made fails. In this case, the active transactions
roll back.
Tnsnames Parameter: FAILOVER_MODE

e.g (failover_mode= (type=select) (method=basic))


The failover mode TYPE can be either SESSION or SELECT.

With SESSION, only the session fails over to the next available node. With
SELECT, the select query is also resumed.
TAF can also be configured with just server side service settings by using the dbms_service package.

Fast Connection Failover (FCF):
-----------------------------------
Fast Connection Failover is a feature of Oracle clients that have integrated with FAN HA
Events.
Oracle JDBC Implicit Connection Cache, Oracle Call Interface (OCI), and Oracle Data
Provider for .Net (ODP.Net) include fast connection failover.

With fast connection failover, when a down event is received, cached connections affected
by the down event are immediately marked invalid and cleaned up.

102) What are the uses of services? How to find out the services in cluster?
Applications should use the services to connect to the Oracle database.
Services define rules and characteristics (unique name, workload balancing, failover options,
and high availability) to control how users and applications connect to database instances.

103) How to find out the nodes in cluster (or) how to find out the master node?

# olsnodes -- Whichever node is displayed first is the master node of the cluster.

Select MASTER_NODE from v$ges_resource;

To find out which is the master node, you can see ocssd.log file and search for "master node
number".
103) How to know the public IPs, private IPs, VIPs in RAC?
# olsnodes -n -p -i

104) What is HAS (High Availability Service) and the commands?


HAS includes ASM & database instance and listeners.

crsctl check has


crsctl config has
crsctl disable has
crsctl enable has
crsctl query has releaseversion
crsctl query has softwareversion
crsctl start has
crsctl stop has [-f]

105) What is fencing?


I/O fencing prevents updates by failed instances, detects failure, and prevents split
brain in the cluster.
When a cluster node fails, the failed node needs to be fenced off from all the shared disk
devices or diskgroups. This methodology is called I/O Fencing, sometimes called Disk
Fencing or failure fencing.

106) Why is Clusterware installed as root (why not oracle)?


Oracle Clusterware works closely with the operating system, system administrator access is
required for some of the installation tasks.
In addition, some of the Oracle Clusterware processes must run as the special operating
system user, root.

107) What are the wait events in RAC?


http://satya-racdba.blogspot.com/2012/10/wait-events-in-oracle-rac-wait-events.html
http://orainternals.wordpress.com/2009/12/23/rac-performance-tuning-understanding-global-cache-performance/

gc buffer busy
gc buffer busy acquire
gc current request
gc cr request
gc cr failure
gc current block lost
gc cr block lost
gc current block corrupt
gc cr block corrupt
gc current block busy
gc cr block busy
gc current block congested
gc cr block congested
gc current block 2-way
gc cr block 2-way
gc current block 3-way
gc cr block 3-way
(gc current/cr block n-way, n is number of nodes)
gc current grant 2-way
gc cr grant 2-way
gc current grant busy
gc current grant congested
gc cr grant congested
gc cr multi block read
gc current multi block request
gc cr multi block request
gc cr block build time
gc current block flush time
gc cr block flush time
gc current block send time
gc cr block send time
gc current block pin time
gc domain validation
gc current retry
ges inquiry response
gcs log flush sync

108) What are the initialization parameters that must have the same value for every instance in
an Oracle RAC database?
http://satya-racdba.blogspot.com/2012/09/init-parameters-in-oracle-rac.html

ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORD_FILE
UNDO_MANAGEMENT

109) What is the difference between cr block and cur (current) block?

110) New features in Oracle Clusterware 12c?


Oracle Flex ASM - This feature of Oracle Clusterware 12c claims to reduce per-node
overhead of using ASM instance.
Database instances can now use an ASM instance on a remote node during planned or unplanned
downtime of the local one. ASM metadata requests can be served by a non-local instance of ASM.

ASM Disk Scrubbing - From RAC 12c, ASM comes with disk scrubbing feature so that
logical corruptions can be discovered.
Also Oracle 12c ASM can automatically correct this in normal or high redundancy
diskgroups.

Oracle ASM Disk Resync and Rebalance enhancements.


Application Continuity (AC) - is transparent to the application. If the database
or the infrastructure becomes unavailable, this new feature, which works with JDBC drivers,
masks recoverable outages.
It recovers the database session beneath the application so that the outage appears
to be merely delayed connectivity or execution.
Transaction Guard (improvements of Fast Application Notification).

IPv6 Support - Oracle RAC 12c now supports IPv6 for Client connectivity, Interconnect is
still on IPv4.

Per Subnet multiple SCAN - In RAC 12c, multiple SCANs, one per subnet, can be configured per
cluster.

Each RAC instance opens the Container Database (CDB) as a whole so that versions would
be same for CDB as well as for all of the Pluggable Databases (PDBs). PDBs are also fully
compatible with RAC.

Oracle installer will run root.sh script across nodes. We don't have to run the scripts
manually on all RAC nodes.

new "ghctl" command for patching.

111) New features in Oracle 9i/10g/11g RAC?


http://satya-racdba.blogspot.in/2010/07/new-features-in-9i-10g-11g-rac.html

Oracle Real Application Clusters New features

Oracle 9i RAC:
---------------------
OPS (Oracle Parallel Server) was renamed as RAC
CFS (Cluster File System) was supported
OCFS (Oracle Cluster File System) for Linux and Windows
watchdog timer replaced by hangcheck timer

Oracle 10g R1 RAC:
-------------------
Cluster Manager replaced by CRS
ASM introduced
Concept of Services expanded
ocrcheck introduced
ocrdump introduced
AWR was instance specific

Oracle 10g R2 RAC:
-------------------
CRS was renamed as Clusterware
asmcmd introduced
CLUVFY introduced
OCR and Voting disks can be mirrored
Can use FAN/FCF with TAF for OCI and ODP.NET

Oracle 11g R1 RAC:
---------------------
--> Oracle 11g RAC parallel upgrades - Oracle 11g has rolling upgrade features whereby
a RAC database can be upgraded without any downtime.
-->Hot patching - Zero downtime patch application.
-->Oracle RAC load balancing advisor - Starting from 10g R2 we have RAC load balancing
advisor utility.
11g RAC load balancing advisor is only available with clients who use .NET, ODBC, or the
Oracle Call Interface (OCI).
-->ADDM for RAC - Oracle has incorporated RAC into the automatic database diagnostic
monitor, for cross-node advisories.
The script addmrpt.sql gives a report for a single instance only and does not report on all
instances in RAC; this is known as instance ADDM.
Using the new package DBMS_ADDM, we can generate a report for all instances of RAC;
this is known as database ADDM.
--> Optimized RAC cache fusion protocols - moves on from the general cache fusion
protocols in 10g to deal with specific scenarios where the protocols could be further
optimized.
--> Oracle 11g RAC Grid provisioning - The Oracle grid control provisioning pack allows us
to "blow-out" a RAC node without the time-consuming install, using a pre-installed
"footprint".

Oracle 11g R2 RAC:
-----------------------
--> We can store everything on the ASM. We can store OCR & voting files also on the ASM.
--> ASMCA
--> Single Client Access Name (SCAN) - eliminates the need to change tns entry when
nodes are added to or removed from the Cluster.
RAC instances register to SCAN listeners as remote listeners. SCAN is fully qualified name.
Oracle recommends assigning 3 addresses to SCAN, which create three SCAN listeners.
--> Clusterware components: crfmond, crflogd, GIPCD.
--> AWR is consolidated for the database.
--> 11g Release 2 Real Application Cluster (RAC) has server pooling technologies so it’s
easier to provision and manage database grids.
This update is geared toward dynamically adjusting servers as corporations manage the ebb
and flow between data requirements for datawarehousing and applications. By default,
LOAD_BALANCE is ON.
--> GSD (Global Service Daemon), gsdctl introduced.
--> GPnP profile.
--> Cluster information in an XML profile.
--> Oracle RAC OneNode is a new option that makes it easier to consolidate databases that
aren’t mission critical, but need redundancy.
--> raconeinit - to convert database to RacOneNode.
--> raconefix - to fix RacOneNode database in case of failure.
--> racone2rac - to convert RacOneNode back to RAC.
--> Oracle Restart - the feature of Oracle Grid Infrastructure's High Availability Services
(HAS) to manage associated listeners, ASM instances and Oracle instances.
--> Oracle Omotion - Oracle 11g release2 RAC introduces new feature called Oracle
Omotion, an online migration utility.
This Omotion utility will relocate the instance from one node to another, whenever instance
failure happens.
Omotion utility uses Database Area Network (DAN) to move Oracle instances.
Database Area Network (DAN) technology helps seamless database relocation without losing
transactions.
--> Cluster Time Synchronization Service (CTSS) is a new feature in Oracle 11g R2 RAC,
which is used to synchronize time across the nodes of the cluster. CTSS will be a
replacement of the NTP protocol.
--> Grid Naming Service (GNS) is a new service introduced in Oracle RAC 11g R2. With
GNS, Oracle Clusterware (CRS) can manage Dynamic Host Configuration Protocol
(DHCP) and DNS services for the dynamic node registration and configuration.
--> Cluster interconnect: Used for data blocks, locks, messages, and SCN numbers.
--> Oracle Local Registry (OLR) - From Oracle 11gR2, "Oracle Local Registry (OLR)" is
something new as part of Oracle Clusterware. OLR is the node’s local repository, similar to
OCR (but local), and is managed by OHASD. It contains data of the local node only and is not
shared among other nodes.
--> Multicasting is introduced in 11gR2 for private interconnect traffic.
--> I/O fencing prevents updates by failed instances, detects failure, and prevents
split brain in the cluster. When a cluster node fails, the failed node needs to be fenced off from
all the shared disk devices or diskgroups. This methodology is called I/O Fencing,
sometimes called Disk Fencing or failure fencing.
--> Re-bootless node fencing (restart)? - instead of fast re-booting the node, a graceful
shutdown of the stack is attempted.
--> Clusterware log directories: acfs*
--> HAIP (IC VIP).
--> Redundant interconnects: NIC bonding, HAIP.
--> RAC background processes: DBRM – Database Resource Manager, PING – Response
time agent.
--> Virtual Oracle 11g RAC cluster - Oracle 11g RAC supports virtualization.

112) Can I change a node’s hostname?


Yes, however, the node must be removed and added back to the cluster with the new name.

113) How do I define a service for a Policy-Managed Database?


When you define services for a policy-managed database, you define the service to a server
pool where the database is running. You can define the service as either UNIFORM (running
on all instances in the server pool) or SINGLETON (running on only one instance in the
server pool). For SINGLETON services, Oracle RAC chooses on which instance in the server
pool the service is active. If that instance fails, then the service fails over to another
instance in the server pool. A service can only run in one server pool.
Services for administrator-managed databases continue to be defined by the PREFERRED
and AVAILABLE definitions.
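
The difference between UNIFORM and SINGLETON can be sketched as a toy placement function in Python. This is an illustration only: the function name, the instance names, and the rule of picking the first surviving instance for SINGLETON are assumptions of the sketch (Oracle RAC itself chooses the instance).

```python
def active_instances(cardinality, pool_instances, failed=()):
    """Return the instances on which a service is active.

    cardinality    -- "UNIFORM" or "SINGLETON"
    pool_instances -- instance names in the server pool
    failed         -- instances that are currently down
    """
    survivors = [name for name in pool_instances if name not in failed]
    if cardinality == "UNIFORM":
        return survivors          # active on every running instance in the pool
    if cardinality == "SINGLETON":
        return survivors[:1]      # active on exactly one instance (assumed: the first)
    raise ValueError(f"unknown cardinality: {cardinality!r}")


# Hypothetical server pool with three instances.
pool = ["orcl_1", "orcl_2", "orcl_3"]

print(active_instances("UNIFORM", pool))                       # all three instances
print(active_instances("SINGLETON", pool))                     # one instance
print(active_instances("SINGLETON", pool, failed={"orcl_1"}))  # fails over to another
```

The last call shows the SINGLETON failover described above: when the hosting instance fails, the service moves to another instance in the same server pool.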

114) How do I convert from a Policy-Managed Database to an Administrator-Managed Database?
You cannot directly convert a policy-managed database to an administrator-managed
database. Instead, you can remove the policy-managed configuration using the 'srvctl
remove database' and 'srvctl remove service' commands, and then create a new
administrator-managed database with the 'srvctl add database' command.

115) What is Grid Plug and Play (GPnP)?


Grid Plug and Play (GPnP) eliminates per-node configuration data and the need for explicit
add and delete node steps. This allows a system administrator to take a template system
image and run it on a new node with no further configuration. This removes many manual
operations, reduces the opportunity for errors, and encourages configurations that can be
changed easily. Removal of the per-node configuration makes the nodes easier to replace,
because they do not need to contain individually-managed state.
Grid Plug and Play reduces the cost of installing, configuring, and managing database nodes
by making their per-node state disposable. It allows nodes to be easily replaced with
regenerated state.

116) What is a Server Pool?


Server pools enable the cluster administrator to create a policy which defines how Oracle
Clusterware allocates resources. An Oracle RAC policy-managed database runs in a server
pool. Oracle Clusterware attempts to keep the required number of servers in the server pool
and, therefore, the required number of instances of the Oracle RAC database. A server can
be in only one server pool at any time. However, a database can run in multiple server
pools. Cluster-managed services run in a server pool where they are defined as either
UNIFORM (active on all instances in the server pool) or SINGLETON (active on only one
instance in the server pool).

You should create redo log groups only if you are using administrator-managed databases.
For policy-managed databases, increase the cardinality and when the instance starts, if you
are using Oracle Managed Files and Oracle ASM, then Oracle automatically allocates the
thread, redo, and undo.

If you remove an instance from your Oracle RAC database, then you should disable the
instance’s thread of redo so that Oracle does not have to check the thread during database
recovery.

For policy-managed databases, Oracle automatically allocates the undo tablespace when the
instance starts if you have OMF enabled.

117) What is Run-Time Connection Load Balancing?


The run-time connection load balancing feature enables routing of work requests to an
instance that offers the best performance, minimizing the need to relocate work. To enable
and use run-time connection load balancing, the connection goal must be set to SHORT and
either of the following service-level goals must be set:
· SERVICE_TIME—The Load Balancing Advisory attempts to direct work requests to
instances according to their response time. Load Balancing Advisory data is based on the
elapsed time for work done by connections using the service, as well as available bandwidth
to the service. This goal is best suited for workloads that require varying lengths of time to
complete, for example, an internet shopping system.
· THROUGHPUT—The Load Balancing Advisory measures the percentage of the total
response time that the CPU consumes for the service. This measures the efficiency of an
instance, rather than the response time. This goal is best suited for workloads where each
work request completes in a similar amount of time, for example, a trading system.
Client-side load balancing balances the connection requests across the listeners when the
‘LOAD_BALANCE=ON’ parameter is set. When you set this parameter to ON, Oracle
Database randomly selects an address in the address list, and connects to that node's
listener. This balances client connections across the available SCAN listeners in the cluster.
When clients connect using SCAN, Oracle Net automatically load balances client connection
requests across the three IP addresses you defined for the SCAN, unless you are using
EZConnect.

118) What are the different types of Server-Side Connection Load Balancing?
With server-side load balancing, the SCAN listener directs a connection request to the best
instance currently providing the service by using the load balancing advisory. The two types
of connection load balancing are:
· SHORT—Connections are distributed across instances based on the amount of time that
the service is used. Use the SHORT connection load balancing goal for applications that have
connections of brief duration. When using connection pools that are integrated with FAN, set
the connection load balancing goal to SHORT. SHORT tells the listener to use CPU-based
statistics.
· LONG—Connections are distributed across instances based on the number of sessions in
each instance, for each instance that supports the service. Use the LONG connection load
balancing goal for applications that have connections of long duration. This is typical for
connection pools and SQL*Forms sessions. LONG is the default connection load balancing
goal, and tells the listener to use session-based statistics.
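
The two goals can be contrasted with a toy chooser in Python. This is a sketch only, not the listener's actual algorithm (which relies on load balancing advisory data): the function name, the instance names, and the "sessions" and "cpu_busy" figures are hypothetical stand-ins for session-based and CPU-based statistics.

```python
def pick_instance(goal, instances):
    """Pick an instance name for a new connection.

    instances maps instance name -> {"sessions": int, "cpu_busy": float}.
    """
    if goal == "LONG":
        metric = "sessions"    # session-based statistics
    elif goal == "SHORT":
        metric = "cpu_busy"    # CPU-based statistics
    else:
        raise ValueError(f"unknown goal: {goal!r}")
    # Sort names first so ties are broken deterministically.
    return min(sorted(instances), key=lambda name: instances[name][metric])


# Invented figures: inst1 has more sessions but is less busy than inst2.
stats = {
    "inst1": {"sessions": 40, "cpu_busy": 0.30},
    "inst2": {"sessions": 25, "cpu_busy": 0.85},
}

print(pick_instance("LONG", stats))   # inst2 (fewest sessions)
print(pick_instance("SHORT", stats))  # inst1 (least busy)
```

The same cluster state can therefore send a new connection to different instances depending on the goal configured for the service.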

119) How do I enable the Load Balancing Advisory (LBA)?


To enable the load balancing advisory, use the ‘-B’ option when creating or modifying the
service using the ‘srvctl’ command.

120) How does the database register with the Listener?


When a listener starts after the Oracle instance starts, and the listener is listed for service
registration, registration does not occur until the next time the Oracle Database process
monitor (PMON) discovery routine starts. By default, PMON discovery occurs every 60
seconds.
To override the 60-second delay, use the SQL ‘ALTER SYSTEM REGISTER’ statement. This
statement forces the PMON process to register the service immediately.
If you run this statement while the listener is up and the instance is already registered, or
while the listener is down, then the statement has no effect.

121) Can I configure both failure notifications with Universal Connection Pool (UCP)?
Connection failure notification is redundant with Fast Connection Failover (FCF) as
implemented by the UCP. You should not configure both within the same application.

122) Should I configure Transparent Application Failover (TAF) in my service definition if
using Fast Connection Failover (FCF)?
Do not configure Transparent Application Failover (TAF) with Fast Connection Failover (FCF)
for JDBC clients as TAF processing will interfere with FAN ONS processing.

123) Can I use Fast Connection Failover (FCF) and Transparent Application Failover (TAF)
together?
No. Only one of them should be used at a time.

124) What is the status of Fast Connection Failover (FCF) with Universal Connection Pool
(UCP)?
FCF is now deprecated along with the Implicit Connection Caching in favor of using the
Universal Connection Pool (UCP) for JDBC.

125) Does Fast Connection Failover (FCF) support planned outages?


FCF does not support planned outages like service relocation (reference Doc ID:
1076130.1). It is designed to work for unplanned outages, where a RAC service is preferred
on all the nodes in the cluster and one of the nodes goes down unexpectedly. When a
planned outage like a service relocation is done from one node to the other, FCF does not
work as expected and the result is unpredictable. There is no solution at present for this.
Enhancement request 9495973 has been raised to address this limitation.

126) Should I use the JDBC Thin driver or the JDBC OCI driver?


Oracle thin JDBC driver is usually preferred by application developers because it is cross
platform and has no external dependencies. However some applications require the high-
performance, native C-language based Oracle Call Interface (OCI) driver. This driver is
compatible with FCF and can alternatively use Transparent Application Failover (TAF) which
operates at a lower level than FCF and can automatically resubmit SELECT queries in the
event of a node failure. However for most applications, the ease of deployment of the thin
driver with full FCF support will outweigh any benefits offered by the OCI driver.

127) How do I subscribe to HA Events?


If you are using a client that uses Oracle Streams Advanced Queuing, such as OCI and
ODP.NET clients, to receive FAN events, you must enable the service used by that client to
access the alert notification queue by using the ‘-q’ option via the ‘srvctl’ command.
FAN events are published using ONS and Oracle Streams Advanced Queuing. The service
metrics received from the Oracle RAC load balancing advisory through FAN events for the
service are automatically placed in the Oracle Streams AQ queue table, ALERT_QUEUE.

Use the following query against the internal queue table for load balancing advisory FAN
events to monitor load balancing advisory events generated for an instance:
SET PAGES 60 COLSEP '|' LINES 132 NUM 8 VERIFY OFF FEEDBACK OFF
COLUMN user_data HEADING "AQ Service Metrics" FORMAT A60 WRAP
BREAK ON service_name SKIP 1
SELECT
TO_CHAR (enq_time, 'HH:MI:SS') Enq_time, user_data
FROM sys.sys$service_metrics_tab
ORDER BY 1;

128) What is Connection Affinity?


Connection affinity is a performance feature that allows a connection pool to select
connections that are directed at a specific Oracle RAC instance. The pool uses run-time
connection load balancing (if configured) to select an Oracle RAC instance to create the first
connection and then subsequent connections are created with an affinity to the same
instance.
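
The idea of affinity can be sketched with a toy pool in Python. This is not the UCP API: the class name AffinityPool, its methods, the context key (for example a web session or transaction identifier) and the choose_instance callback are all hypothetical, standing in for the pool's run-time load balancing choice.

```python
class AffinityPool:
    """Toy pool that remembers which instance a context was first sent to."""

    def __init__(self, instances, choose_instance):
        self._instances = list(instances)
        self._choose = choose_instance   # stand-in for run-time load balancing
        self._affinity = {}              # context key -> instance name

    def borrow(self, context):
        """Return the instance for this context, choosing one on first use."""
        if context not in self._affinity:
            self._affinity[context] = self._choose(self._instances)
        return self._affinity[context]

    def release_affinity(self, context):
        """Drop the affinity, e.g. after the transaction ends or a failure event."""
        self._affinity.pop(context, None)


# Hypothetical load figures; the chooser picks the least loaded instance.
load = {"inst1": 0.7, "inst2": 0.2}
pool = AffinityPool(load, lambda names: min(names, key=load.get))

print(pool.borrow("session-42"))  # inst2 is chosen for the first connection
load["inst2"] = 0.9               # load shifts afterwards
print(pool.borrow("session-42"))  # still inst2: affinity is kept
```

Only after release_affinity is called would the next borrow for the same context go through the chooser again.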

129) What types of affinity does Universal Connection Pool (UCP) support?
UCP JDBC connection pools support two types of connection affinity: transaction-based
affinity and Web session affinity.

130) What is Transaction-Based Affinity?


Transaction-based affinity is an affinity to an Oracle RAC instance that can be released by
either the client application or a failure event. Applications typically use this type of affinity
when long-lived affinity to an Oracle RAC instance is desired or when the cost (in terms of
performance) of being redirected to a new Oracle RAC instance is high. Distributed
transactions are a good example of transaction-based affinity. XA connections that are
enlisted in a distributed transaction keep an affinity to the Oracle RAC instance for the
duration of the transaction. In this case, an application would incur a significant
performance cost if a connection is redirected to a different Oracle RAC instance during the
distributed transaction.
Transaction-based affinity is strictly scoped between the application/middle-tier and UCP for
JDBC; therefore, transaction-based affinity only requires that the
setFastConnectionFailoverEnabled property be set to true and does not require complete
FCF configuration. In addition, transaction-based affinity does not technically require run-
time connection load balancing. However, it can help with performance and is usually
enabled regardless. If run-time connection load balancing is not enabled, the connection
pool randomly picks connections.

131) What is Web Session Affinity?


Web session affinity is an affinity to an Oracle RAC instance that can be released by either
the instance, a client application, or a failure event. The Oracle RAC instance uses a hint to
communicate to a connection pool whether affinity has been enabled or disabled on the
instance. An Oracle RAC instance may disable affinity based on many factors, such as
performance or load. If an Oracle RAC instance can no longer support affinity, the
connections in the pool are refreshed to use a new instance and affinity is established once
again.
Applications typically use this type of affinity when short-lived affinity to an Oracle RAC
instance is expected or if the cost (in terms of performance) of being redirected to a new
Oracle RAC instance is minimal. For example, a mail client session might use Web session
affinity to an Oracle RAC instance to increase performance and is relatively unaffected if a
connection is redirected to a different instance.
132) What is recommended for WebLogic Server?
Oracle recommends using WebLogic JDBC multi data sources to handle failover, instead of
connect-time failover.
While connect-time failover does not provide the ability to pre-create connections to
alternate Oracle RAC nodes, multi data sources have multiple connections available at all
times to handle failover.
Transparent Application Failover (TAF) is not supported for any WLS data source. TAF, as
delivered via JDBC is currently not transparent. It is documented to affect some ongoing
query results and PreparedStatements in unpredictable and unrecoverable ways. TAF JDBC
requires specific recovery code at the application level and affects the integrity of
statements that WebLogic might be caching.

133) Do I still need to backup my Oracle Cluster Registry (OCR) and Voting Disks?
You no longer have to back up the voting disk. The voting disk data is automatically backed
up in OCR as part of any configuration change and is automatically restored to any voting
disk added. If all voting disks are corrupted, however, you can restore them.
Oracle Clusterware automatically creates OCR backups every four hours. At any one time,
Oracle Database always retains the last three backup copies of OCR. The CRSD process that
creates the backups also creates and retains an OCR backup for each full day and at the end
of each week. You cannot customize the backup frequencies or the number of files that
Oracle Database retains.

134) How is DBMS_JOB functionality affected by RAC?


DBMS jobs can be set to run either on database (i.e. any active instance), or a specific
instance.

135) What is PARALLEL_FORCE_LOCAL?


By default, the parallel server processes selected to execute a SQL statement can operate
on any or all Oracle RAC nodes in the cluster. By setting PARALLEL_FORCE_LOCAL to TRUE,
the parallel server processes are restricted to just one node, the node where the query
coordinator resides (the node on which the SQL statement was executed). However, in
11.2.0.1 when this parameter is set to TRUE the parallel degree calculations are not being
adjusted correctly to only consider the CPU_COUNT for a single node. The parallel degree
will be calculated based on the RAC-wide CPU_COUNT and not the single node CPU_COUNT.
Due to this bug 9671271 it is not recommended that you set PARALLEL_FORCE_LOCAL to
TRUE in 11.2.0.1, instead you should setup a RAC service to limit where parallel statements
can execute.

136) What is the Service Management Policy?


When you use automatic services in an administrator-managed database, during planned
database startup, services may start on the first instances to start rather than on their
preferred instances. Prior to Oracle RAC 11g Release 2 (11.2), all services worked as
though they were defined with a manual management policy.

137) Why does my user appear across all nodes when querying GV$SESSION when my
service does not span all nodes?
The problem is that you are querying GV$SESSION as the ABC user, and this results in the
"strange" behavior. When you select from gv$session, 2 parallel servers are spawned to query
v$session on each node, and they run as the same user. Hence when you query gv$session
as ABC you see 3 sessions (one real and 2 parallel slaves querying v$session on each
instance). The reason you see 1 on one node and 3 on the other is the order in which
the parallel processes query v$session. Use SYS (or any other user) to query the
sessions of ABC and you will not see this problem.
138) How does Clusterware startup with OCR and Voting Disk in ASM?
The startup sequence has been changed/replaced, now being 2-phased, optimized
approach:
Phase I
· OHASD will startup "local" resources first.
· CSSD uses GPnP profile which stores location of voting disk so no need to access ASM
(voting disk is stored different within ASM than other files so location is known).
Simultaneously,
· ORAAGENT starts up and ASM instance is started (subset of information in OCR is stored in
OLR, enough to startup local resources), and ORAROOTAGENT starts CRSD.
So the 1st phase of Clusterware startup is to essentially start up local resources.
Phase II
· At this point ASM and full OCR information is available and the node is "joined" to cluster.

The following explains the startup sequence of the Grid Infrastructure daemons and their
resources in 11gR2 RAC.
11g RAC (Grid Infrastructure) introduces additional background daemons and agents, and
neither the Oracle documentation nor other blogs describe the sequence clearly.
[Two diagrams of the startup sequence appeared here; either one can be followed.]

Explanation from diagram


OHASD Phase:-
OHASD (Oracle High Availability Services Daemon) starts first and in turn starts the agents
below.

OHASD Agent Phase:-
OHASD Agent starts and in turn starts:
· gipcd: Grid interprocess communication daemon, used for monitoring the cluster interconnect
· mdnsd: Multicast DNS service; resolves DNS requests on behalf of GNS
· gns: The Grid Naming Service (GNS), a gateway between DNS and mdnsd; resolves DNS requests
· gpnpd: Grid Plug and Play daemon. It maintains a profile, similar to the OCR contents, stored
in XML format under $GI_HOME/gpnp/profiles/. OCSSD also reads this profile to find the ASM
disk location so that it can start up without ASM being up. The plug and play profile can also
be distributed across the nodes of the cluster.
· evmd/evmlogger: The EVM service is provided by the evmd daemon, which publishes information
about events happening in the cluster (stop node, start node, start instance, etc.)

cssdagent (cluster synchronization service agent) starts and in turn starts:
· ocssd: Cluster synchronization service daemon, which manages node membership in the cluster.
If cssdagent finds that ocssd is down, it reboots the node to protect data integrity.

cssdmonitor (cluster synchronization service monitor) replaces oprocd and provides I/O fencing.

OHASD orarootagent starts and in turn starts:
· crsd.bin: Cluster Ready Services, which manages high availability of cluster resources
(stopping, starting, failing over, etc.)
· diskmon.bin: Disk monitor (diskdaemon monitor); provides I/O fencing for Exadata storage
· octssd.bin: Cluster Time Synchronization Service; provides time synchronization similar to
NTP, but manages it itself rather than depending on the OS

CRSD Agent Phase:- crsd.bin starts two more agents.

crsd orarootagent (Oracle root agent) starts and in turn starts:
· gns: The Grid Naming Service resource
· gns vip: The virtual IP address used by GNS
· Network: Monitors the additional networks to provide HAIP for the cluster interconnects
· Scan vip: Monitors the SCAN VIP; if it fails or is unreachable, it is failed over to another node
· Node vip: Monitors the node VIP; if it fails or is unreachable, it is failed over to another node

crsd oraagent (Oracle agent) starts and in turn starts the following (in 11gR1 and 10g the same
functionality was handled by the racgmain and racgimon background processes; it is now
managed by the CRS Oracle agent itself):
· ASM & disk groups: Start & monitor the local ASM instance
· ONS: FAN feature, provides notification to interested clients
· eONS: FAN feature, provides notification to interested clients
· SCAN Listener: Start & monitor the SCAN listener
· Node Listener: Start & monitor the node listener (rdbms?)
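
The startup order described above can be turned into a quick health check. The following is a minimal sketch, assuming the daemon names from the explanation show up as command names in the process list (possibly with a .bin suffix). The helper name missing_gi_daemons is hypothetical, and gns is left out because it is only present when GNS is configured.

```shell
# Daemons in roughly the order in which the stack starts them
# (names taken from the explanation above; gns omitted as optional).
GI_DAEMONS="ohasd mdnsd gpnpd gipcd evmd ocssd octssd crsd"

# missing_gi_daemons (hypothetical helper): reads one process name per line
# on stdin and prints every expected daemon that was not seen.
# A name such as "crsd.bin" counts as "crsd".
missing_gi_daemons() {
    seen=$(sed -e 's/\.bin$//' | sort -u)
    for d in $GI_DAEMONS; do
        if ! printf '%s\n' "$seen" | grep -qx "$d"; then
            echo "$d"
        fi
    done
}
```

On a cluster node it could be fed from the process list, for example: ps -eo comm | missing_gi_daemons. An empty result means every listed daemon was found; the first name printed is a reasonable place to start reading logs, since later daemons depend on earlier ones.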

139) what is the Oracle Database Quality of Service Management?


Oracle Database QoS Management is an automated, policy-based product that monitors the
workload requests for an entire system. Oracle Database QoS Management manages the
resources that are shared across applications and adjusts the system configuration to keep
the applications running at the performance levels needed by your business. Oracle
Database QoS Management responds gracefully to changes in system configuration and
demand, thus avoiding additional oscillations in the performance levels of your applications.
If you use Oracle Database Quality of Service Management (Oracle Database QoS
Management), then you cannot have SINGLETON services in a server pool, unless the
maximum size of that server pool is one.

140) Is a re-link required for the Clusterware home after an OS upgrade?


In 11.2, there are some executables in the GRID home that can and should be re-linked
after an OS upgrade. The procedure to do this is:

#> cd GI_HOME/crs/install
#> perl rootcrs.pl -unlock

As the grid infrastructure for a cluster owner:

$> export ORACLE_HOME=Grid_home


$> $GI_HOME/bin/relink

As root again:

#> cd GI_HOME/crs/install
141) How do I determine the “Master” node?
For the cluster synchronization service (CSS), the master can be found by searching
$GI_HOME/log/cssd/ocssd.log. For master of an enqueue resource with Oracle RAC, you can
select from v$ges_resource. There should be a master_node column.
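
Searching the CSS log by hand can be scripted. The sketch below assumes that ocssd.log records reconfigurations with a message containing the text "master node number N"; that wording is an assumption about the log format and may differ between releases. The helper name css_master_node is hypothetical.

```shell
# css_master_node (hypothetical helper): reads ocssd.log text on stdin and
# prints the node number from the last "master node number N" message.
# The message text is an assumption; adjust the pattern if your release
# words it differently.
css_master_node() {
    sed -n 's/.*master node number \([0-9][0-9]*\).*/\1/p' | tail -n 1
}
```

Typical use would be: css_master_node < $GI_HOME/log/cssd/ocssd.log. Only the last match is printed because the master can change after each reconfiguration.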

142) What are the different types of failover mechanisms available?


· JDBC-THIN driver supports Fast Connection Failover (FCF)
· JDBC-OCI driver supports Transparent Application Failover (TAF)
· JDBC-THIN 11gR2 supports Single Client Access Name (SCAN)

143) What is the recommendation on the type of tablespaces?


You should use locally managed, auto-allocate tablespaces. With auto-allocate Oracle
automatically grows the size of the extent depending on segment size, available free space
in the tablespace and other factors. The extent size of a segment starts at 64 KB and grows
to 1 MB when the segment grows past 1 MB, and 8 MB once the segment size exceeds 64
MB. So for a large table, the extent size will automatically grow to be large. The use of
uniform extents is strongly discouraged for two reasons; space wastage and the impact that
wasted space has on scan performance.
For large partitioned objects you should use multiple big file tablespaces to avoid file header
block contention during parallel load operations. File header block contention appears as the
‘gc buffer busy’ enqueue wait event in an AWR report. Checking the buffer wait statistic will
indicate if it is the file header block that is being contended for.
To evenly distribute a partitioned table among multiple big file tablespaces use the STORE
IN clause.

144) What is the major difference between 10g and 11g RAC?
There is not much difference between 10g and 11gR1 RAC, but there is a significant
difference in 11gR2.

In 10g and 11gR1 RAC, the following were managed by Oracle CRS:


 Databases
 Instances
 Applications
 Node Monitoring
 Event Services
 High Availability

From 11gR2 onwards, it is a complete HA stack that manages and provides the following
resources, like other cluster software such as VCS:
 Databases
 Instances
 Applications
 Cluster Management
 Node Management
 Event Services
 High Availability
 Network Management (provides DNS/GNS/MDNSD services on behalf of other
traditional services) and SCAN (Single Client Access Name), HAIP
 Storage Management (with help of ASM and other new ACFS filesystem)
 Time synchronization (rather depending upon traditional NTP)
 Removed OS dependent hang checker etc, manages with own additional monitor
process

145) what are Oracle Cluster Components?


Cluster Interconnect (HAIP)
Shared Storage (OCR/Voting Disk)
Clusterware software

146) What are the Oracle kernel components (that is, how does an Oracle RAC database
differ from a normal single-instance database in terms of binaries and processes)?
The Oracle kernel must be relinked with the RAC option switched on when you convert to
RAC. That is the difference: it enables the RAC background processes such as LMON, LCK,
LMD and LMS.
To turn on RAC
# link the oracle libraries
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk rac_on
# rebuild oracle
$ cd $ORACLE_HOME/bin
$ relink oracle
Oracle RAC is composed of two or more database instances. They are composed of memory
structures and background processes, the same as a single-instance database. Oracle RAC
instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that
enable cache fusion. Oracle RAC instances are composed of the following background
processes:
ACMS—Atomic Controlfile to Memory Service (ACMS)
GTX0-j—Global Transaction Process
LMON—Global Enqueue Service Monitor
LMD—Global Enqueue Service Daemon
LMS—Global Cache Service Process
LCK0—Instance Enqueue Process
RMSn—Oracle RAC Management Processes (RMSn)
RSMN—Remote Slave Monitor

147) what is Clusterware?


Software that provides various interfaces and services for a cluster. Typically, this includes
capabilities that:
 Allow the cluster to be managed as a whole
 Protect the integrity of the cluster
 Maintain a registry of resources across the cluster
 Deal with changes to the cluster
 Provide a common view of resources

148) Under which user (owner) does each process start?


Component                                     Name of the Process              Owner
Oracle High Availability Service              ohasd                            init, root
Cluster Ready Service (CRS)                   crsd (Cluster Ready Services)    root
Cluster Synchronization Service (CSS)         ocssd, cssdmonitor, cssdagent    grid owner
Event Manager (EVM)                           evmd, evmlogger                  grid owner
Cluster Time Synchronization Service (CTSS)   octssd                           root
Oracle Notification Service (ONS)             ons, eons                        grid owner
Oracle Agent                                  oraagent                         grid owner
Oracle Root Agent                             orarootagent                     root
Grid Naming Service (GNS)                     gnsd                             root
Grid Plug and Play (GPnP)                     gpnpd                            grid owner
Multicast domain name service (mDNS)          mdnsd                            grid owner

149) The voting disk and OCR reside in ASM disk groups, but according to the startup
sequence OCSSD starts before ASM. How is that possible?
How does OCSSD start if the voting disk and OCR reside in ASM disk groups?
You might wonder how CSSD, which is required to start the clustered ASM instance, can be
started if voting disks are stored in ASM? This sounds like a chicken-and-egg problem:
without access to the voting disks there is no CSS, hence the node cannot join the cluster.
But without being part of the cluster, CSSD cannot start the ASM instance. To solve this
problem the ASM disk headers have new metadata in 11.2: you can use kfed to read the
header of an ASM disk containing a voting disk. The kfdhdb.vfstart and kfdhdb.vfend fields
tell CSS where to find the voting file. This does not require the ASM instance to be up. Once
the voting disks are located, CSS can access them and joins the cluster.
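
To see these header fields on a real disk, the output of kfed can be filtered. The sketch below assumes kfed prints one field per line in the shape "kfdhdb.vfstart: 96 ; 0x0ec: 0x00000060", and that a vfstart of 0 means the disk holds no voting file; both are assumptions, and the helper names are hypothetical.

```shell
# voting_file_extent (hypothetical helper): reads `kfed read <disk>` output
# on stdin and prints "vfstart vfend". Fails if either field is missing.
voting_file_extent() {
    awk '
        $1 == "kfdhdb.vfstart:" { start = $2 }
        $1 == "kfdhdb.vfend:"   { end = $2 }
        END {
            if (start == "" || end == "") exit 1   # fields not found
            print start, end
        }'
}

# has_voting_file: succeeds when both fields are present and vfstart is
# non-zero (assumed to mean that a voting file is stored on this disk).
has_voting_file() {
    extent=$(voting_file_extent) || return 1
    [ "${extent%% *}" != "0" ]
}
```

For example, kfed read /dev/your_asm_disk | voting_file_extent, where the device path is a placeholder for one of your ASM disks. Neither helper needs the ASM instance to be running, which is the point made above.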

150) How does SCAN work?

 The client connects using the SCAN name of the cluster (all three SCAN IP addresses
resolve, round robin, to the same host name, the SCAN name). In this case the SCAN name
is cluster01-scan.cluster01.example.com
 The request reaches the DNS server in your corporate network, which resolves the name to
one of the three addresses. If GNS (Grid Naming Service) is configured, that is, a subdomain
is delegated in DNS to resolve cluster addresses, the request is handed over to GNS (gnsd).
 Assuming there is no GNS in this case, the request reaches one of the SCAN listeners,
whose end points are configured to the database listeners.
 The database listener receives the request and processes it further.
 When a node is added (Listener 4), clients do not need to know about it or change
anything in their TNS entries (such as the address of the 4th node/instance), because they
only use the SCAN.
 The same applies when a node is deleted.

151) How to troubleshoot instance startup issue?


$ export SRVM_TRACE=TRUE
$ srvctl start instance -d <db_name> -i <instance_name>

152) What is GPNP?

Grid Plug and Play along with GNS provide dynamic

In previous releases, adding or removing servers in a cluster required extensive manual
preparation.

In Oracle Database 11g Release 2, GPnP allows each node to perform the following tasks
dynamically:

 Negotiating appropriate network identities for itself


 Acquiring additional information from a configuration profile
 Configuring or reconfiguring itself using profile data, making host names and
addresses resolvable on the network

For example a domain should contain

 –Cluster name: cluster01


 –Network domain: example.com
 –GPnP domain: cluster01.example.com

To add a node, simply connect the server to the cluster and allow the cluster to configure
the node.

To make this happen, Oracle uses the profile located in
$GI_HOME/gpnp/profiles/peer/profile.xml, which contains the cluster resources, for example
the disk locations of ASM.

This profile is read locally or from a remote machine when a node is plugged into the
cluster, and the node is then added to the cluster dynamically.

153) What file types does ASM support and keep in disk groups?

Control files          Flashback logs            Data Pump dump sets
Data files             DB SPFILE                 Data Guard configuration
Temporary data files   Change tracking bitmaps   RMAN backup sets
Online redo logs       OCR files                 RMAN data file copies
Archive logs           ASM SPFILE                Transport data files

154) what is node listener?


In 11gr2 the listeners will run from Grid Infrastructure software home
 The node listener is a process that helps establish network connections from ASM clients
to the ASM instance.
 Runs by default from the Grid $ORACLE_HOME/bin directory
 Listens on port 1521 by default
 Is the same as a database instance listener
 Is capable of listening for all database instances on the same machine in addition to the
ASM instance
 Can run concurrently with separate database listeners or be replaced by a separate
database listener
 Is named tnslsnr on the Linux platform

155) what is SCAN listener?


A SCAN listener runs in addition to the node listeners. It listens for incoming database
connection requests from clients that arrive through the SCAN IP. Its end points are
configured to the node listeners, and it routes each database connection request to a
particular node listener.

156) what is the difference between CRSCTL and SRVCTL?


crsctl manages clusterware-related operations:
 Starting and stopping Oracle Clusterware
 Enabling and disabling Oracle Clusterware daemons
 Registering cluster resources
srvctl manages Oracle resource–related operations:
 Starting and stopping database instances and services
 Also from 11gR2 manages the cluster resources like network,vip,disks etc

157) How to find the cluster network settings?


To determine the list of interfaces available to the cluster:
$ oifcfg iflist -p -n
To determine the public and private interfaces that have been configured:
$ oifcfg getif
To determine the Virtual IP (VIP) host name, VIP address, VIP subnet mask, and VIP
interface name:
$ srvctl config nodeapps -a
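
The output of oifcfg getif is easy to post-process when you only want one kind of interface. The following is a small sketch, assuming the usual four-column layout (interface name, subnet, scope, role); the helper name interfaces_by_role is hypothetical.

```shell
# interfaces_by_role (hypothetical helper): reads `oifcfg getif` output on
# stdin and prints "interface subnet" for every line whose last column
# equals the requested role (public or cluster_interconnect).
# Assumes the four-column layout: name, subnet, scope, role.
interfaces_by_role() {
    awk -v role="$1" '$NF == role { print $1, $2 }'
}
```

For example, oifcfg getif | interfaces_by_role cluster_interconnect lists only the private interconnect interfaces, which is handy before changing the interconnect.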

158) How to change Cluster interconnect in RAC?


On a single node in the cluster, add the new global interface specification:
$ oifcfg setif -global eth2/192.0.2.0:cluster_interconnect
Verify the changes with oifcfg getif and then stop Clusterware on all nodes by running the
following command as root on each node:
# oifcfg getif
# crsctl stop crs
Assign the network address to the new network adapters on all nodes using ifconfig:
# ifconfig eth2 192.0.2.15 netmask 255.255.255.0 broadcast 192.0.2.255
Remove the former adapter/subnet specification and restart Clusterware:
$ oifcfg delif -global eth1/192.168.1.0
# crsctl start crs
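
The steps above can be collected into a dry-run script so the exact commands can be reviewed before anything is changed. This is only a sketch that mirrors the order given above: the helper names run and change_interconnect are hypothetical, commands are printed rather than executed unless DRY_RUN=0, and the per-node address assignment is left as a comment because it differs on every node.

```shell
# With DRY_RUN=1 (the default) each command is only printed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# change_interconnect NEW_IF NEW_SUBNET OLD_IF OLD_SUBNET
# (hypothetical helper; the oifcfg steps are run on a single node).
change_interconnect() {
    new_if=$1 new_net=$2 old_if=$3 old_net=$4
    run oifcfg setif -global "$new_if/$new_net:cluster_interconnect"
    run oifcfg getif          # verify the new specification
    run crsctl stop crs       # as root, on every node
    # Assign the address to the new adapter on every node here (ifconfig).
    run oifcfg delif -global "$old_if/$old_net"
    run crsctl start crs      # as root, on every node
}
```

Calling change_interconnect eth2 192.0.2.0 eth1 192.168.1.0 prints the five commands from the procedure with the example values filled in.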
159) Managing or Modifying SCAN in Oracle RAC?
To add a SCAN VIP resource:
$ srvctl add scan -n cluster01-scan
To remove Clusterware resources from SCAN VIPs:
$ srvctl remove scan [-f]
To add a SCAN listener resource:
$ srvctl add scan_listener
$ srvctl add scan_listener -p 1521
To remove Clusterware resources from all SCAN listeners:
$ srvctl remove scan_listener [-f]
160) How to check the node connectivity in Oracle Grid Infrastructure?
$ cluvfy comp nodecon -n all -verbose
161) what is OLR? How to backup OLR?

162) what happens during failure events? – How scan connection still works when node
dies? When local listener dies/when scan listener dies?
The main components that make the solution agnostic to node failures are the 3 SCAN
listeners and their related IP addresses. Those listeners are cluster-wide services, meaning
that none of the 3 listeners is bound to a particular node and each can run on any of the
nodes. If one of the nodes crashes, then the following happens:

 The surviving nodes will have at least 1 SCAN listener running
 A short time (about a minute) after the crash, the SCAN listener records are refreshed and
new client requests are no longer forwarded to the crashed node
 The failed SCAN listeners are started on the surviving nodes
 After a minute (the default refresh time), the new SCAN listener instances know about the
DB instances that serve the configured services and start functioning as before.

During the crash itself there is a small timeframe in which client connections may get
errors like the following. However, within 1-2 minutes everything returns to normal:
"ORA-12514: TNS:listener does not currently know of service"

The exact SCAN information refresh process is still to be investigated; Oracle Process
and Notification service may be part of the process, as each of the listeners keeps a
connection to the "127.0.0.1:6100" socket.

What methods are available to keep the time synchronized on all nodes in the cluster?

Either the Network Time Protocol(NTP) can be configured or in 11gr2, Cluster Time
Synchronization Service (CTSS) can be used.

What files components in RAC must reside on shared storage?

Spfiles, ControlFiles, Datafiles and Redolog files should be created on shared storage.

Where does the Clusterware write when there is a network or Storage missed heartbeat?

The network ping failure is written in $CRS_HOME/log

How do you find out what OCR backups are available?

The ocrconfig -showbackup can be run to find out the automatic and manually run backups.
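
When scripting a restore it helps to pick the newest backup automatically. The sketch below assumes each backup line of ocrconfig -showbackup has the form "node YYYY/MM/DD HH:MM:SS /path/file.ocr" and that paths contain no spaces; the helper name latest_ocr_backup is hypothetical.

```shell
# latest_ocr_backup (hypothetical helper): reads `ocrconfig -showbackup`
# output on stdin and prints the path of the most recent backup.
# Assumes lines of the form: node  YYYY/MM/DD  HH:MM:SS  /path/file.ocr
# Lines that do not end in ".ocr" (messages, blank lines) are ignored.
latest_ocr_backup() {
    awk 'NF >= 4 && $NF ~ /\.ocr$/ { print $2, $3, $NF }' |
        LC_ALL=C sort -r |
        head -n 1 |
        awk '{ print $3 }'
}
```

For example, ocrconfig -showbackup | latest_ocr_backup. Sorting on the date and time columns avoids relying on the order in which the tool happens to list the files.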

If your OCR is corrupted what options do have to resolve this?

You can use either the logical or the physical OCR backup copy to restore the Repository.

How do you find out what object has its blocks being shipped across the instance the most?

You can use the dba_hist_seg_stats.


How do we know which database instances are part of a RAC cluster?

You can query the V$ACTIVE_INSTANCES view to determine the member instances of the
RAC cluster.

What is OCLUMON used for in a cluster environment?

OCLUMON is the command-line tool used to query the Cluster Health Monitor (CHM)
repository. CHM stores operating system metrics in the CHM repository for all nodes in a
RAC cluster. It stores information on CPU, memory, process, network and other OS data.
This information can later be retrieved and used to troubleshoot and identify any
cluster-related issues. CHM is a default component of the 11gR2 grid install. The data is
stored in the master repository and replicated to a standby repository on a different node.

What would be the possible performance impact in a cluster if a less powerful node (e.g.
slower CPU’s) is added to the cluster?

All processing will slow down to the CPU speed of the slowest server.

What is the purpose of OLR?

The Oracle Local Registry (OLR) contains information that allows the cluster processes to be
started up while the OCR is in the ASM storage system. Since the ASM file system is
unavailable until the Grid processes are started up, a local copy of the required contents of
the OCR is stored in the OLR.

What is the default memory allocation for ASM?

In 10g the default SGA size is 1G in 11g it is set to 256M and in 12c ASM it is set back to
1G.

How do you backup ASM Metadata?

You can use the ASMCMD md_backup command to back up the ASM diskgroup metadata, and
md_restore to restore the ASM diskgroup configuration in case of ASM diskgroup storage loss.

What files can be stored in the ASM diskgroup?

In 11g the following files can be stored in ASM diskgroups.

Datafiles
Redo logfiles

Spfiles

In 12c the files below can now also be stored in the ASM diskgroup

Password file

What is the ASM POWER_LIMIT?

This is the parameter which controls how much rebalance work (in terms of allocation units)
the ASM instance will attempt at any given time. The default value is 1. In ASM versions
before 11.2.0.2 the maximum value is 11; in later versions, with the COMPATIBLE.ASM disk
group attribute set to 11.2.0.2 or higher, the maximum has been raised to 1024.

What is a rolling upgrade?

A patch is considered rolling if it can be applied to the cluster binaries without having to
shut down the database in a RAC environment. All nodes in the cluster are patched in a
rolling manner, one by one, with only the node being patched unavailable while all
other instances remain open.

What are some of the RAC specific parameters?

Some of the RAC parameters are:

CLUSTER_DATABASE

CLUSTER_DATABASE_INSTANCES

INSTANCE_TYPE (RDBMS or ASM)

ACTIVE_INSTANCE_COUNT

UNDO_MANAGEMENT

What is the future of the Oracle Grid?

The Grid software is becoming more and more capable of not just supporting HA for Oracle
Databases but also other applications including Oracle’s applications. With 12c there are
more features and functionality built-in and it is easier to deploy these pre-built solutions,
available for common Oracle applications.
What components of the Grid should I back up?

The backups should include OLR, OCR and ASM Metadata.

Is there an easy way to verify the inventory for all remote nodes

You can run the opatch lsinventory -all_nodes command from a single node to look at the
inventory details for all nodes in the cluster.

Q What is SCAN?

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster. The benefit is clients using SCAN do not need to change if you add or
remove nodes in the cluster.

Q What is dynamic remastering? When does dynamic remastering happen?

Dynamic remastering is the ability to move the ownership of a resource from one instance to
another instance in RAC. Dynamic resource remastering is used to implement resource
affinity for increased performance. Resource affinity optimizes the system in situations where
update transactions are being executed on one instance. When the activity shifts to another
instance, the resource affinity moves to that instance accordingly. If activity is not
localized, then resource ownership is hashed across the instances.

In 10g dynamic remastering happens at the file and object level. The conditions for
remastering are very stringent: one instance must touch the resource more than 50 times as
often as the other instance within a particular period (say 10 minutes). This touch ratio and
period can be tuned with the _gc_affinity_limit and _gc_affinity_time parameters.
Q How you check the services in RAC Node?

We can check or start the services with the 'srvctl' command. For example, to bring the
load-balanced/TAF service named RAC online:

[oracle@TEST_NODE1 ~]$ srvctl start service -d orcl -s RAC

[oracle@TEST_NODE1 ~]$ crsstat

Q If there is some issue with virtual IP how will you troubleshoot it?How will you change
virtual ip?

To change the VIP (virtual IP) on a RAC node, use the command

[oracle@testnode oracle]$ srvctl modify nodeapps -n node_name -A new_address

Q How you will backup your RAC Database?

Backup strategy of RAC Database:

An RAC Database consists of

1)OCR

2)Voting disk &

3)Database files, controlfiles, redolog files & Archive log files

Q Do you have any idea of load balancing in application?How load balancing is done?

http://practicalappsdba.wordpress.com/category/for-master-apps-dbas/

Q What is RAC?
RAC stands for Real Application Clusters. It is a clustering solution from Oracle Corporation
that ensures high availability of databases by providing instance failover and media failover
features.

Q What is RAC and how is it different from non RAC databases?

RAC stands for Real Application Clusters. You have n instances, each running on its
own separate node and all based on the shared storage. The cluster is the key component and
is a collection of servers operating as one unit. RAC is the best solution for high performance
and high availability. Non-RAC databases have a single point of failure in case of hardware
failure or server crash.

Q Give the usage of srvctl ?

srvctl start instance -d db_name -i "inst_name_list" [-o start_options]

srvctl stop instance -d name -i "inst_name_list" [-o stop_options]

srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate

srvctl start database -d name [-o start_options]

srvctl stop database -d name [-o stop_options]

srvctl start database -d orcl -o mount

Q Mention the Oracle RAC software components ?

Oracle RAC is composed of two or more database instances. They are composed of memory
structures and background processes, the same as a single-instance database. Oracle RAC
instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that
enable cache fusion. Oracle RAC instances are composed of the following background processes:
ACMS—Atomic Controlfile to Memory Service (ACMS)

GTX0-j—Global Transaction Process

LMON—Global Enqueue Service Monitor

LMD—Global Enqueue Service Daemon

LMS—Global Cache Service Process

LCK0—Instance Enqueue Process

RMSn—Oracle RAC Management Processes (RMSn)

RSMN—Remote Slave Monitor

Q What is GRD?

GRD stands for Global Resource Directory. The GES and GCS maintain records of the
status of each datafile and each cached block using the Global Resource Directory. This
process is referred to as cache fusion and helps maintain data integrity.

Q What are the different network components are in 10g RAC?

Public, private, and VIP components.

The private interface is for communication between the nodes of the cluster. The VIP is all
about availability of the application. When a node fails, its VIP fails over to another node.
This is the reason that all applications should be based on the VIP components, meaning that
TNS entries should have the VIP entries in the host list.

Q Give Details on ACMS:

ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment ACMS is
an agent that ensures a distributed SGA memory update, i.e. SGA updates are globally
committed on success or globally aborted in the event of a failure.

Q What are the major RAC wait events?


In a RAC environment the buffer cache is global across all instances in the cluster, and hence
the processing differs. The most common wait events related to this are gc cr request and gc
buffer busy.

GC CR request: the time it takes to retrieve the data from the remote cache.

Reason: RAC traffic using a slow connection, or inefficient queries (poorly tuned queries
increase the number of data blocks requested by an Oracle session; the more blocks
requested, the more often a block will need to be read from a remote instance via the
interconnect).

GC BUFFER BUSY: the time the remote instance locally spends accessing the requested
data block.

Q Give details on GTX0-j

The process provides transparent support for XA global transactions in a RAC
environment. The database autotunes the number of these processes based on the workload
of XA global transactions.

Q Give details on LMON

This process monitors global enqueues and resources across the cluster and performs global
enqueue recovery operations. It is called the Global Enqueue Service Monitor.

Q Give details on LMD

This process is called as global enqueue service daemon. This process manages incoming
remote resource requests within each instance.

Q Give details on LMS


This process is called the Global Cache Service process. It maintains the status of
datafiles and of each cached block by recording information in the Global Resource
Directory (GRD). It also controls the flow of messages to remote instances,
manages global data block access, and transmits block images between the buffer caches of
different instances. This processing is part of the cache fusion feature.

Q Give details on LCK0

This process is called the Instance Enqueue process. It manages non-cache-fusion
resource requests such as library cache and row cache requests.

Q Give details on RMSn

This process is called the Oracle RAC Management process. These processes perform
manageability tasks for Oracle RAC. Tasks include the creation of resources related to Oracle
RAC when new instances are added to the cluster.

Q How to export and import crs resources while migrating Oracle RAC to new server.

The script below generates srvctl add commands for the databases, instances, services and
11g listeners, based on the OCR of the current RAC.

Save the output of the script and run it on the new RAC.

for DBNAME in $(srvctl config database)
do
  # Generate DB resource
  srvctl config database -d "$DBNAME" -a | awk -v dbname="$DBNAME" \
  'BEGIN { FS=":" }
  $1~/Oracle home/ || $1~/ORACLE_HOME/ {dbhome = "-o" $2}
  $1~/Spfile/ || $1~/SPFILE/ {spfile = "-p" $2}
  $1~/Disk Groups/ {dg = "-a" $2}
  END { if (avail == "-a ") {avail = ""}; printf "%s %s %s %s %s\n", "srvctl add database -d ", dbname, dbhome, spfile, dg }'

  # Generate Instance resource (instance name is field 2, node name is the last field used)
  srvctl status database -d "$DBNAME" | awk -v dbname="$DBNAME" \
  '$4~/running/ { printf "%s %s %s %s %s %s\n", "srvctl add instance -d ",dbname, " -i ", $2 ," -n ", $7 }
  $5~/running/ { printf "%s %s %s %s %s %s \n", "srvctl add instance -d ",dbname, " -i ", $2 ," -n ", $8 }'

  # Modify instance for 10G - ASM dependency
  if [ $(echo $ORACLE_HOME | grep "1020" | wc -l ) -eq 1 ]
  then
    srvctl status database -d "$DBNAME" | awk -v dbname="$DBNAME" \
    '$2~/1$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",dbname, " -i ", $2 ," -s +ASM1" }
    $2~/2$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",dbname, " -i ", $2 ," -s +ASM2" }
    $2~/3$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",dbname, " -i ", $2 ," -s +ASM3" }
    $2~/4$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",dbname, " -i ", $2 ," -s +ASM4" }'
  fi

  echo "srvctl start database -d $DBNAME"

  # Generate Service resource
  snamelist=$(srvctl status service -d "$DBNAME" | awk '{print $2}')
  for sname in $snamelist
  do
    srvctl config service -d "$DBNAME" -s "$sname" | awk -v dbname="$DBNAME" -v sname="$sname" \
    'BEGIN { FS=":"}
    $1~/Preferred instances/ {pref = "-r" $2}
    $1~/PREF/ {pref = "-r" $2; sub(/AVAIL/, "", pref) }
    $1~/Available instances/ {avail = "-a" $2}
    $2~/AVAIL/ {avail = "-a" $3}
    $1~/Failover type/ {ft = "-e" $2}
    $1~/Failover method/ {fm = "-m" $2}
    $1~/Runtime Load Balancing Goal/ {g = "-B" $2}
    END { if (avail == "-a ") {avail = ""}; printf "%s %s %s %s %s %s %s %s %s %s\n",
    "srvctl add service -d ",dbname, "-s ", sname, pref, avail ,ft, fm,g, "-P BASIC"}'

    echo "srvctl start service -d $DBNAME -s $sname"
  done
done

# Listener at 11G Home. 10G listener can't be added with srvctl.
srvctl config listener | awk \
'BEGIN { FS=":"; state = 0; }
$1~/Name/ {lname = "-l" $2; state=1};
$1~/Home/ && state == 1 {ohome = "-o" $2; state=2;}
$1~/End points/ && state == 2 {lport = "-p " $3; state=3;}
state == 3 {if (ohome != "-o ") {printf "%s %s %s %s\n", "srvctl add listener ", lname, ohome, lport;} state=0;}'

Q Give details on RSMN

This process is called the Remote Slave Monitor. It manages background slave
process creation and communication on remote instances. It is itself a background slave
process that performs tasks on behalf of a coordinating process running in another
instance.

Q What components in RAC must reside in shared storage?

All datafiles, control files, SPFILEs and redo log files must reside on cluster-aware shared storage.

Q What is the significance of using cluster-aware shared storage in an Oracle RAC
environment?

All instances of an Oracle RAC database can access all the datafiles, control files, SPFILEs and redo log
files when these files are hosted on cluster-aware shared storage, which is a group of
shared disks.

Q Give a few examples of solutions that support cluster storage

ASM (Automatic Storage Management), raw disk devices, network file system (NFS), OCFS2
and OCFS (Oracle Cluster File System).

Q What is an interconnect network?

An interconnect network is a private network that connects all of the servers in a cluster.
The interconnect network uses a switch/multiple switches that only the nodes in the cluster
can access.
Q How can we configure the cluster interconnect?

Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the cluster interconnect. On UNIX
and Linux systems, Oracle Clusterware uses the UDP and RDS (Reliable Data Socket) protocols.
Windows clusters use the TCP protocol.

Q Can we use crossover cables with Oracle Clusterware interconnects?

No, crossover cables are not supported with Oracle Clusterware interconnects.

Q What is the use of cluster interconnect?

The cluster interconnect is used by Cache Fusion for inter-instance communication.

Q How do users connect to database in an Oracle RAC environment?

Users can access a RAC database using a client/server configuration or through one or more
middle tiers, with or without connection pooling. Users can use the Oracle services feature to
connect to the database.

Q What is the use of a service in Oracle RAC environment?

Applications should use the services feature to connect to the Oracle database. Services
enable us to define rules and characteristics to control how users and applications connect
to database instances.

Q What are the characteristics controlled by Oracle services feature?

The characteristics include a unique name, workload balancing and failover options, and high
availability characteristics.
Q What enables the load balancing of applications in RAC?

Oracle Net Services enable the load balancing of application connections across all of the
instances in an Oracle RAC database.

Q What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that client connections use instead of
the standard public IP address. To configure a VIP address, we need to reserve a spare IP
address for each node, and the IP addresses must use the same subnet as the public
network.

Q What is the use of VIP?

If a node fails, then the node's VIP address fails over to another node. On that node the VIP
address can accept TCP connections, but it cannot accept Oracle connections.

Q Give situations under which VIP address failover happens

VIP address failover happens when the node on which the VIP address runs fails, when all
interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected from
the network.

Q What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the VIP address
receive a rapid connection refused error. They don't have to wait for TCP connection timeout
messages.

Q What are the administrative tools used for Oracle RAC environments?

An Oracle RAC cluster can be administered as a single image using OEM (Enterprise
Manager), SQL*Plus, Server Control (SRVCTL), Cluster Verification Utility (CVU), DBCA and NETCA.
Q How do we verify that RAC instances are running?

Issue the following query from any one node, connecting through SQL*Plus.

SQL> connect sys/sys as sysdba

SQL> select * from V$ACTIVE_INSTANCES;

The query gives the instance number under the INST_NUMBER column and host:instance_name
under the INST_NAME column.
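
The same check can be scripted. The following is only a minimal sketch: it assumes OS authentication (/ as sysdba) works for the current user and that sqlplus is on the PATH, and the function name list_active_instances is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints one "INST_NUMBER INST_NAME" pair per line
# for every instance that is currently up.
# Assumes OS authentication (/ as sysdba) and sqlplus on the PATH.
list_active_instances() {
  sqlplus -s "/ as sysdba" <<'SQL'
set heading off feedback off pagesize 0
select inst_number || ' ' || inst_name
  from v$active_instances
 order by inst_number;
exit
SQL
}

# Usage (on any cluster node):
#   list_active_instances
```

The here-document is quoted ('SQL') so that the shell does not try to expand $active_instances as a variable.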

Q What is FAN?

Fast Application Notification (FAN) relates to events concerning
instances, services and nodes. It is a notification mechanism that Oracle RAC uses to notify
other processes about configuration and service level information, including service
status changes such as UP or DOWN events. Applications can respond to FAN events and
take immediate action.

Q Where can we apply FAN UP and DOWN events?

FAN UP and FAN DOWN events can be applied to instances, services and nodes.

Q State the use of FAN events in case of a cluster configuration change?

During cluster configuration changes, the Oracle RAC high availability framework
publishes a FAN event immediately when a state change occurs in the cluster. Applications
can therefore receive FAN events and react immediately. This saves applications from
polling the database and detecting a problem only after such a state change.
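
To illustrate how an application-side script might react to such an event, here is a hedged sketch. It assumes the event reaches the script as positional arguments, with the event type first and name=value pairs after it; that payload layout and the function names fan_field and fan_summary are assumptions made for illustration, not something stated in these notes.

```shell
#!/bin/sh
# Hypothetical FAN event handling sketch.
# Assumption: the event arrives as positional arguments, for example
#   INSTANCE VERSION=1.0 service=PROD database=PROD instance=PROD1 host=node1 status=down

# Print the value of one name=value field from the event arguments.
fan_field() {
  wanted=$1
  shift
  for arg in "$@"; do
    case $arg in
      "$wanted"=*) printf '%s\n' "${arg#*=}"; return 0 ;;
    esac
  done
  return 1
}

# Build a one-line log message for an event.
fan_summary() {
  event_type=$1
  status=$(fan_field status "$@") || status=unknown
  host=$(fan_field host "$@") || host=unknown
  printf '%s on %s is %s\n' "$event_type" "$host" "$status"
}

# Usage inside an event handler script:
#   fan_summary "$@" >> /tmp/fan_events.log
```

Keeping the field lookup separate from the message formatting makes it easy to branch on status (for example, to act only on down events).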

Q Why should we have separate homes for the ASM instance?

It is a good practice to keep the ASM home separate from the database
home (ORACLE_HOME). This helps in upgrading and patching ASM and the Oracle database
software independently of each other. Also, we can deinstall the Oracle database software
independently of the ASM instance.
Q What is the advantage of using ASM?

ASM is the Oracle-recommended storage option for RAC databases because ASM
maximizes performance by managing the storage configuration across the disks. ASM does
this by distributing the database files across all of the available storage within our cluster
database environment.

Q What is rolling upgrade?

It is a new ASM feature in Oracle Database 11g. ASM instances in Oracle Database 11g
(from release 11.1) can be upgraded or patched using the rolling upgrade feature. This enables
us to patch or upgrade ASM nodes in a clustered environment without affecting database
availability. During a rolling upgrade we can maintain a functional cluster while one or more
of the nodes in the cluster are running different software versions.

Q Can rolling upgrade be used to upgrade from 10g to 11g database?

No, it can be used only for Oracle Database 11g releases (from 11.1).

Q State the initialization parameters that must have same value for every instance in an
Oracle RAC database

Some initialization parameters are critical at database creation time and must have the
same values. Their values must be specified in the SPFILE or PFILE for every instance. The
parameters that must be identical on every instance are listed below:

ACTIVE_INSTANCE_COUNT

ARCHIVE_LAG_TARGET

COMPATIBLE

CLUSTER_DATABASE

CLUSTER_DATABASE_INSTANCES

CONTROL_FILES

DB_BLOCK_SIZE

DB_DOMAIN
DB_FILES

DB_NAME

DB_RECOVERY_FILE_DEST

DB_RECOVERY_FILE_DEST_SIZE

DB_UNIQUE_NAME

INSTANCE_TYPE (RDBMS or ASM)

PARALLEL_MAX_SERVERS

REMOTE_LOGIN_PASSWORDFILE

UNDO_MANAGEMENT

Q What is ORA-00603: ORACLE server session terminated by fatal error or ORA-29702:
error occurred in Cluster Group Service operation?

The RAC node name was listed in the loopback address...

Q Must DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all instances?

These parameters must be identical on all instances only if their values are set to
zero.

Q What two parameters must be set at the time of starting up an ASM instance in a RAC
environment?

The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.

Q Mention the components of Oracle clusterware

Oracle clusterware is made up of components like voting disk and Oracle Cluster
Registry(OCR).

Q What is a CRS resource?

Oracle Clusterware is used to manage high-availability operations in a cluster. Anything that
Oracle Clusterware manages is known as a CRS resource. Some examples of CRS resources
are a database, an instance, a service, a listener, a VIP address and an application process.

Q What is the use of OCR?

Oracle clusterware manages CRS resources based on the configuration information of CRS
resources stored in OCR(Oracle Cluster Registry).

Q How does Oracle Clusterware manage CRS resources?

Oracle clusterware manages CRS resources based on the configuration information of CRS
resources stored in OCR(Oracle Cluster Registry).

Q Name some Oracle clusterware tools and their uses?

OIFCFG - Allocates and deallocates network interfaces and identifies the interconnect being used

OCRCONFIG - Command-line tool for managing the Oracle Cluster Registry

OCRDUMP - Dumps the contents of the Oracle Cluster Registry to a file

CVU - Cluster Verification Utility, used to verify the cluster configuration and its components

Q What are the modes of deleting instances from Oracle Real Application Clusters
databases?

We can delete instances using silent mode or interactive mode using DBCA(Database
Configuration Assistant).

Q How do we remove ASM from an Oracle RAC environment?

We need to stop and delete the instance on the node first, in interactive or silent mode. After
that, ASM can be removed using the srvctl tool as follows:

srvctl stop asm -n node_name


srvctl remove asm -n node_name

We can verify if ASM has been removed by issuing the following command:

srvctl config asm -n node_name

Q How do we verify that an instance has been removed from OCR after deleting an
instance?

Issue the following srvctl command:

srvctl config database -d database_name

cd CRS_HOME/bin

./crs_stat

Q How do we verify an existing current backup of OCR?

We can verify the current backup of OCR using the following command:

ocrconfig -showbackup

Q What are the performance views in an Oracle RAC environment?

We have V$ views that are instance specific. In addition we have GV$ views, called global
views, that have an INST_ID column of numeric data type. GV$ views obtain information from
the individual V$ views.

Q What are the types of connection load balancing?

There are two types of connection load balancing: server-side load balancing and client-side
load balancing.

Q What is the difference between server-side and client-side connection load balancing?

Client-side load balancing happens at the client, which spreads connection requests across the
available listeners. In server-side load balancing, the listener uses the load balancing advisory to redirect
connections to the instance providing the best service.

Q What are the three greatest benefits that RAC provides?


The three main benefits are availability, scalability, and the ability to use low cost
commodity hardware. RAC allows an application to scale vertically, by adding CPU, disk and
memory resources to an individual server. But RAC also provides horizontal scalability,
which is achieved by adding new nodes into the cluster. RAC also allows an organization to
bring these resources online as they are needed. This can save a small or midsize
organization a lot of money in the early stages of a project.

In a RAC environment, if a node in the cluster fails, the application continues to run on the
surviving nodes contained in the cluster. If your application is configured correctly, most
users won't even know that the node they were running on became unavailable.

Q What are the major RAC wait events?

In a RAC environment the buffer cache is global across all instances in the cluster and hence
the processing differs. The most common wait events related to this are gc cr request and gc
buffer busy.

GC CR REQUEST: the time it takes to retrieve the data from the remote cache.

Reason: RAC traffic using a slow connection, or inefficient queries. Poorly tuned queries
increase the number of data blocks requested by an Oracle session, and the more blocks are
requested, the more often a block will need to be read from a remote instance via the interconnect.

GC BUFFER BUSY: the time the remote instance locally spends accessing the requested
data block.
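
One way to see how much time each instance spends in these waits is to query GV$SYSTEM_EVENT. The sketch below assumes OS authentication and sqlplus on the PATH; the function name top_gc_waits is hypothetical, and the 'gc%' filter is simply a convenient way to catch the global cache events named above.

```shell
#!/bin/sh
# Hypothetical helper: global cache wait events per instance, largest first.
# Assumes OS authentication (/ as sysdba) and sqlplus on the PATH.
top_gc_waits() {
  sqlplus -s "/ as sysdba" <<'SQL'
set feedback off pagesize 100 linesize 150
select inst_id, event, total_waits, time_waited
  from gv$system_event
 where event like 'gc%'
 order by time_waited desc;
exit
SQL
}

# Usage (on any cluster node):
#   top_gc_waits
```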

Q What are the different network components in Oracle 10g RAC?

We have public, private, and VIP components. Private interfaces are for inter-node
communication. The VIP is all about availability of the application. When a node fails, the VIP
component fails over to some other node; this is the reason that all applications should
be based on VIP components. This means that tns entries should have the VIP entries in the host
list.
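
A tns entry of that shape can be generated with a few lines of shell. This is a sketch only: the function name make_tns_entry, the alias, the service name and the VIP host names in the usage line are all hypothetical.

```shell
#!/bin/sh
# Hypothetical generator for a tnsnames.ora entry that lists every node VIP.
# Usage: make_tns_entry ALIAS SERVICE_NAME PORT VIP_HOST...
make_tns_entry() {
  alias_name=$1
  service=$2
  port=$3
  shift 3
  printf '%s =\n' "$alias_name"
  printf '  (DESCRIPTION =\n'
  printf '    (ADDRESS_LIST =\n'
  printf '      (LOAD_BALANCE = yes)\n'
  printf '      (FAILOVER = on)\n'
  # One ADDRESS line per VIP host name.
  for vip in "$@"; do
    printf '      (ADDRESS = (PROTOCOL = TCP)(HOST = %s)(PORT = %s))\n' "$vip" "$port"
  done
  printf '    )\n'
  printf '    (CONNECT_DATA = (SERVICE_NAME = %s))\n' "$service"
  printf '  )\n'
}

# Usage:
#   make_tns_entry PROD prod_svc 1521 node1-vip node2-vip node3-vip
```

Listing the VIPs rather than the physical host names is what lets a client move on to the next address quickly when a node is down.
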
Q Tune the following RAC DATABASE (DBNAME=PROD) which is 3 node RAC.

        PROD1        PROD2        PROD3

CPU     8            15           8

RAM     32 GB        12 GB        16 GB

What are you looking for here? What tuning information do you expect?

It is a 3 node cluster with different hardware configuration running RAC.

I would put 20% of the memory for Oracle in each node. So that would mean that the SGA
is different in each of the nodes.

Also, since the CPU counts differ, PROD2 can have a higher maximum number of
processes than the others.

But as I said this is just configuration, this is not tuning. Question is not clear.

Q Write a sample script for RMAN for the recovery if all the instance are down.(First explain
the procedure how you will restore)

Bring all nodes down.

Start one Node

Restore all datafiles and archive logs.

Recover 1 Node.

Open the database.

bring other nodes up.

Confirm that all nodes are operational.
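
The steps above could be scripted roughly as follows. This is a sketch under several assumptions: it is run as the oracle user on one node with ORACLE_SID set, the control files and spfile are intact, all required backups and archived logs are available (complete recovery), and the function name restore_rac_database is hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of a full restore with every instance down.
# Assumptions: complete recovery is possible, control files and spfile are
# intact, and ORACLE_SID points at the local instance.
# Usage: restore_rac_database DB_NAME
restore_rac_database() {
  db=$1
  # Make sure every instance is down.
  srvctl stop database -d "$db" -o abort
  # From this node only: mount, restore, recover, open.
  rman target / <<'RMAN' || return 1
startup mount;
restore database;
recover database;
alter database open;
RMAN
  # Bring the remaining instances up, then confirm that all are running.
  srvctl start database -d "$db"
  srvctl status database -d "$db"
}

# Usage:
#   restore_rac_database PROD
```

srvctl may report the instance that RMAN already opened as running; the final status call is there to confirm the state of every instance.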

Q Clients are performing some operation and suddenly one of the datafiles is experiencing
a problem. What do you do? The cluster is a two node one.

Take the datafile offline and recover the datafile.

Q How can you connect to a specific node in a RAC environment?

Ensure that the tnsnames.ora entry you connect with has INSTANCE_NAME specified in it.

Q How to move OCR and Voting disk to new storage device?

Moving OCR

==========

You must be logged in as the root user, because root owns the OCR files. Also an ocrmirror
must be in place before trying to replace the OCR device.

Make sure there is a recent backup of the OCR file before making any changes:

ocrconfig -showbackup

If there is not a recent backup copy of the OCR file, an export can be taken for the current
OCR file. Use the following command to generate an export of the online OCR file:

In 10.2

# ocrconfig -export file_name.dmp -s online

In 11g

# ocrconfig -manualbackup

The new OCR disk must be owned by root, must be in the oinstall group, and must have
permissions set to 640. Provide at least 100 MB disk space for the OCR.

On one node as root run:


# ocrconfig -replace ocr new_ocr_location

# ocrconfig -replace ocrmirror new_ocrmirror_location

Now run ocrcheck to verify that the OCR is pointing to the new file.
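
The OCR steps can be collected into one function. This is a minimal sketch, assuming it runs as root, an OCR mirror already exists, and the new devices already have the ownership and permissions described above; the function name move_ocr and its three arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the OCR relocation steps. Run as root.
# Assumptions: an OCR mirror already exists and the new devices are owned
# by root, in the oinstall group, with permissions 640.
# Usage: move_ocr NEW_OCR_PATH NEW_MIRROR_PATH EXPORT_FILE
move_ocr() {
  new_ocr=$1
  new_mirror=$2
  export_file=$3
  # Show the existing automatic backups.
  ocrconfig -showbackup
  # Take a logical export as a safety net (10.2 syntax).
  ocrconfig -export "$export_file" -s online || return 1
  # Repoint the OCR and its mirror at the new storage.
  ocrconfig -replace ocr "$new_ocr" || return 1
  ocrconfig -replace ocrmirror "$new_mirror" || return 1
  # Verify the new locations.
  ocrcheck
}

# Usage:
#   move_ocr /dev/raw/raw5 /dev/raw/raw6 /tmp/ocr_export.dmp
```

On 11g the export step would be replaced by ocrconfig -manualbackup, as noted above.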

Moving Voting Disk

==================

Note: crsctl votedisk commands must be run as root

Shutdown the Oracle Clusterware (crsctl stop crs as root) on all nodes before making any
modification to the voting disk. Determine the current voting disk location using:

crsctl query css votedisk

Take a backup of all voting disks:

dd if=voting_disk_name of=backup_file_name

To move a voting disk, provide the full path including the file name:

crsctl delete css votedisk path -force

crsctl add css votedisk path -force

After modifying the voting disk, start the Oracle Clusterware stack on all nodes

# crsctl start crs

Verify the voting disk location using


crsctl query css votedisk
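
The voting disk move can be sketched the same way. The function below assumes it is run as root after Oracle Clusterware has been stopped on every node (the -force option is only safe in that state); the name move_votedisk and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the voting disk move. Run as root, and only
# after "crsctl stop crs" has completed on ALL nodes.
# Usage: move_votedisk OLD_PATH NEW_PATH BACKUP_FILE
move_votedisk() {
  old_path=$1
  new_path=$2
  backup_file=$3
  # Record the current configuration.
  crsctl query css votedisk
  # Back up the existing voting disk before touching it.
  dd if="$old_path" of="$backup_file" bs=4k || return 1
  # Remove the old location and register the new one.
  crsctl delete css votedisk "$old_path" -force || return 1
  crsctl add css votedisk "$new_path" -force || return 1
}

# Usage:
#   move_votedisk /dev/raw/raw2 /dev/raw/raw9 /backup/votedisk.bak
# Afterwards run "crsctl start crs" on each node and check the result with
# "crsctl query css votedisk".
```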

Q When exactly during the installation process are clusterware components created?

After fulfilling the pre-installation requirements, the basic installation steps to follow are:

1. Invoke the Oracle Universal Installer (OUI)

2. Enter the different information for some components like:

- name of the cluster

- public and private node names

- location for OCR and Voting Disks

- network interfaces used for RAC instances

-etc.

3. After the Summary screen, OUI will start copying under the $CRS_HOME (this is the
$ORACLE_HOME for Oracle Clusterware) in the local node the libraries and executables.

- here we will have the daemons and scripts init.* created and configured properly.

Oracle Clusterware is formed of several daemons, each of which has a special function
inside the stack. Daemons are executed via the init.* scripts (init.cssd, init.crsd and
init.evmd).

- note that for CRS only some client libraries are recreated, but not all the executables (as
for the RDBMS).

4. Later the software is propagated to the rest of the nodes in the cluster and the
oraInventory is updated.
5. The installer will ask to execute root.sh on each node. Until this step the software for
Oracle Clusterware is inside the $CRS_HOME.

Running root.sh will create several components outside the $CRS_HOME:

- OCR and VD will be formatted.

- control files (or SCLS_SRC files ) will be created with the correct contents to start Oracle
Clusterware.

These files are used to control some aspects of Oracle Clusterware like:

- enable/disable processes from the CSSD family (Eg. oprocd, oslsvmon)

- stop the daemons (ocssd.bin, crsd.bin, etc).

- prevent Oracle Clusterware from being started when the machine boots.

- etc.

- /etc/inittab will be updated and the init process is notified.

In order to start the Oracle Clusterware daemons, the init.* scripts first need to be run.
These scripts are executed by the daemon init. To accomplish this some entries must be
created in the file /etc/inittab.

- the different processes init.* (init.cssd, init.crsd, etc) will start the daemons (ocssd.bin,
crsd.bin, etc). When all the daemons are running then we can say that the installation was
successful

- On 10.2 and later, running root.sh on the last node in the cluster also will create the
nodeapps (VIP, GSD and ONS). On 10.1, VIPCA is executed as part of the RAC installation.
6. After running root.sh on each node, we need to continue with the OUI session. After
pressing the 'OK' button OUI will include the information for the public and
cluster_interconnect interfaces. Also CVU (Cluster Verification Utility) will be executed.

Q What are Oracle Clusterware processes for 10g on Unix and Linux

Cluster Synchronization Services (ocssd) — Manages cluster node membership and runs as
the oracle user; failure of this process results in cluster restart.

Cluster Ready Services (crsd) — The crs process manages cluster resources (which could be
a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application
process, and so on) based on the resource's configuration information that is stored in the
OCR. This includes start, stop, monitor and failover operations. This process runs as the root
user

Event manager daemon (evmd) —A background process that publishes events that crs
creates.

Process Monitor Daemon (OPROCD) - This process monitors the cluster and provides I/O
fencing. OPROCD performs its check, stops running, and if the wake up is beyond the
expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure
results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on
Linux platforms.

RACG (racgmain, racgimon) - Extends clusterware to support Oracle-specific requirements
and complex resources. Runs server callout scripts when FAN events occur.

Q What are Oracle database background processes specific to RAC

•LMS—Global Cache Service Process

•LMD—Global Enqueue Service Daemon


•LMON—Global Enqueue Service Monitor

•LCK0—Instance Enqueue Process

To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy
a query or transaction, Oracle RAC instances use two processes, the Global Cache Service
(GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the
statuses of each data file and each cached block using a Global Resource Directory (GRD).
The GRD contents are distributed across all of the active instances.

Q What are Oracle Clusterware Components

Voting Disk — Oracle RAC uses the voting disk to manage cluster membership by way of a
health check and arbitrates cluster ownership among the instances in case of network
failures. The voting disk must reside on shared disk.

Oracle Cluster Registry (OCR) - Maintains cluster configuration information as well as
configuration information about any cluster database within the cluster. The OCR must
reside on shared disk that is accessible by all of the nodes in your cluster.

Q How do you troubleshoot node reboot

Please check metalink ...

Note 265769.1 Troubleshooting CRS Reboots

Note.559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle
Clusterware Node evictions.

Q How do you backup the OCR


There is an automatic backup mechanism for OCR. The default location is:
$ORA_CRS_HOME/cdata/clustername/

To display backups :

#ocrconfig -showbackup

To restore a backup :

#ocrconfig -restore backup_file.ocr

With Oracle RAC 10g Release 2 or later, you can also use the export command:

#ocrconfig -export file_name.dmp -s online

and use the -import option to restore the contents back.

With Oracle RAC 11g Release 1, you can do a manual backup of the OCR with the
command:

# ocrconfig -manualbackup

Q How do you backup voting disk

#dd if=voting_disk_name of=backup_file_name

Q How do I identify the voting disk location

#crsctl query css votedisk

Q How do I identify the OCR file location

check /var/opt/oracle/ocr.loc or /etc/ocr.loc ( depends upon platform)

or

#ocrcheck
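
If you only need the paths, the ocr.loc file can be read directly. This sketch assumes the file holds name=value lines with the keys ocrconfig_loc and ocrmirrorconfig_loc; the function name ocr_locations is hypothetical, and the file path is passed in because it depends on the platform.

```shell
#!/bin/sh
# Print the OCR and OCR mirror locations recorded in an ocr.loc file.
# Assumption: the file holds name=value lines such as
#   ocrconfig_loc=/dev/raw/raw1
#   ocrmirrorconfig_loc=/dev/raw/raw2
# Usage: ocr_locations /etc/ocr.loc
ocr_locations() {
  awk -F= '$1 == "ocrconfig_loc" || $1 == "ocrmirrorconfig_loc" { print $2 }' "$1"
}
```

ocrcheck remains the more reliable check, because it also reports whether the OCR device is healthy.
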
Q What is SCAN?

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster. The benefit is that clients using SCAN do not need to change if you add or
remove nodes in the cluster.

Q What is the purpose of Private Interconnect ?

Clusterware uses the private interconnect for cluster synchronization (network heartbeat)
and daemon communication between the clustered nodes. This communication is based
on the TCP protocol.

RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP).
Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches
of participating nodes in the cluster.

Q Why do we have a Virtual IP (VIP) in Oracle RAC?

Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP
timeout period (which can be up to 10 min) before getting an error. As a result, you don't
really have a good HA solution without using VIPs.

When a node fails, the VIP associated with it is automatically failed over to some other node
and new node re-arps the world indicating a new MAC address for the IP. Subsequent
packets sent to the VIP go to the new node, which will send error RST packets back to the
clients. This results in the clients getting errors immediately

Q How many nodes are supported in a RAC Database?

Oracle 10g Release 2 supports 100 nodes in a cluster using Oracle Clusterware, and 100 instances
in a RAC database.

Q Srvctl cannot start instance, I get the following error PRKP-1001 CRS-0215, however
sqlplus can start it on both nodes? How do you identify the problem?
Set the environment variable SRVM_TRACE to true and start the instance with srvctl.
You will then get a detailed error stack.
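
A small wrapper keeps the trace setting from leaking into the rest of the session. This is a sketch; the function name srvctl_traced and the database and instance names in the usage line are hypothetical.

```shell
#!/bin/sh
# Run any srvctl command with SRVM tracing switched on for that call only.
# The subshell keeps SRVM_TRACE out of the caller's environment.
srvctl_traced() {
  ( SRVM_TRACE=true; export SRVM_TRACE; srvctl "$@" )
}

# Usage:
#   srvctl_traced start instance -d PROD -i PROD1
```
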

Q what is the purpose of the ONS daemon?

The Oracle Notification Service (ONS) daemon is a daemon started by the CRS clusterware
as part of the nodeapps. There is one ONS daemon started per clustered node.

The Oracle Notification Service daemon receives a subset of published clusterware events via
the local evmd and racgimon clusterware daemons and forwards those events to application
subscribers and to the local listeners.

This is in order to facilitate:

a. the FAN or Fast Application Notification feature, allowing applications to respond to
database state changes.

b. the 10gR2 Load Balancing Advisory, the feature that permits load balancing across
different RAC nodes depending on the load on the different nodes. The RDBMS MMON
creates an advisory for distribution of work every 30 seconds and forwards it via racgimon
and ONS to listeners and applications.

Q What is a voting disk?

A voting disk is a file that manages information about node membership.

Q What are the administrative tasks involved with voting disk?

Following administrative tasks are performed with the voting disk :

1) Backing up voting disks

2) Recovering Voting disks

3) Adding voting disks

4) Deleting voting disks

5) Moving voting disks

Q How do we backup voting disks?

1) Oracle recommends that you back up your voting disk after the initial cluster creation
and after we complete any node addition or deletion procedures.

2) First, as the root user, stop Oracle Clusterware (with the crsctl stop crs command) on all
nodes. Then, determine the current voting disk by issuing the following command:

crsctl query css votedisk

3) Then, issue the dd or ocopy command to back up a voting disk, as appropriate.

Give the syntax of backing up voting disks:-

On Linux or UNIX systems:

dd if=voting_disk_name of=backup_file_name

where,

voting_disk_name is the name of the active voting disk

backup_file_name is the name of the file to which we want to back up the voting disk
contents

On Windows systems, use the ocopy command:

ocopy voting_disk_name backup_file_name

Q What is the Oracle Recommendation for backing up voting disk?

Oracle recommends us to use the dd command to backup the voting disk with a minimum
block size of 4KB.

Q How do you restore a voting disk?

To restore the backup of your voting disk, issue the dd command on Linux and
UNIX systems or the ocopy command on Windows systems.
On Linux or UNIX systems:

dd if=backup_file_name of=voting_disk_name

On Windows systems, use the ocopy command:

ocopy backup_file_name voting_disk_name

where,

backup_file_name is the name of the voting disk backup file

voting_disk_name is the name of the active voting disk

Q How can we add and remove multiple voting disks?

If we have multiple voting disks, then we can remove the voting disks and add them back
into our environment using the following commands, where path is the complete path of the
location where the voting disk resides:

crsctl delete css votedisk path

crsctl add css votedisk path

Q How do we stop Oracle Clusterware? When do we stop it?

Before making any modification to the voting disk, as root user, stop Oracle Clusterware
using the crsctl stop crs command on all nodes.
Q How do we add voting disk?

To add a voting disk, issue the following command as the root user, replacing the path
variable with the fully qualified path name for the voting disk we want to add:

crsctl add css votedisk path -force

Q How do we move voting disks?

To move a voting disk, issue the following commands as the root user, replacing the path
variable with the fully qualified path name for the voting disk we want to move:

crsctl delete css votedisk path -force

crsctl add css votedisk path -force

Q How do we remove voting disks?

To remove a voting disk, issue the following command as the root user, replacing the path
variable with the fully qualified path name for the voting disk we want to remove:

crsctl delete css votedisk path -force

Q What should we do after modifying voting disks?

After modifying the voting disk, restart Oracle Clusterware using the crsctl start crs
command on all nodes, and verify the voting disk location using the following command:

crsctl query css votedisk


Q When can we use -force option?

If our cluster is down, then we can include the -force option to modify the voting disk
configuration, without interacting with active Oracle Clusterware daemons. However, using
the -force option while any cluster node is active may corrupt our configuration.

What are the special background processes for RAC (or) what is the difference between stand-alone
database and RAC database background processes?

DIAG, LCKn, LMD, LMSn, LMON

5. What are structural changes in 11g R2 RAC?

Ans:

http://satya-racdba.blogspot.com/2010/07/new-features-in-9i-10g-11g-rac.html

Grid & ASM are on one home,

Voting disk & ocrfile can be on the ASM,

SCAN,

By using srvctl, we can manage diskgroups, home, ons, eons, filesystem, srvpool, server,
scan, scan_listener, gns, vip, oc4j,

GSD

6. What are the new features in 11g (R2) RAC?

Ans:

http://satya-racdba.blogspot.com/2010/07/new-features-in-9i-10g-11g-rac.html

Grid & ASM are on one home,

Voting disk & ocrfile can be on the ASM,

SCAN,

By using srvctl, we can manage diskgroups, home, ons, eons, filesystem, srvpool, server,
scan, scan_listener, gns, vip, oc4j,

GSD
7. What is cache fusion?

Ans:

Transferring of data between RAC instances by using private network. Cache Fusion is the
remote memory mapping of Oracle buffers, shared between the caches of participating
nodes in the cluster. When a block of data is read from datafile by an instance within the
cluster and another instance is in need of the same block, it is easy to get the block image
from the instance which has the block in its SGA rather than reading from the disk.

8. What is the purpose of Private Interconnect?

Ans:

Clusterware uses the private interconnect for cluster synchronization (network heartbeat)
and daemon communication between the clustered nodes. This communication is based on
the TCP protocol. RAC uses the interconnect for cache fusion (UDP) and inter-process
communication (TCP).

9. What are the Clusterware components?

Ans:

Voting Disk - Oracle RAC uses the voting disk to manage cluster membership by way of a
health check and arbitrates cluster ownership among the instances in case of network
failures. The voting disk must reside on shared disk.

Oracle Cluster Registry (OCR) - Maintains cluster configuration information as well as
configuration information about any cluster database within the cluster. The OCR must
reside on shared disk that is accessible by all of the nodes in your cluster. The daemon
OCSSd manages the configuration info in OCR and maintains the changes to cluster in the
registry.

Virtual IP (VIP) - When a node fails, the VIP associated with it is automatically failed over to
some other node and new node re-arps the world indicating a new MAC address for the IP.
Subsequent packets sent to the VIP go to the new node, which will send error RST packets
back to the clients. This results in the clients getting errors immediately.

crsd – Cluster Ready Services Daemon

cssd – Cluster Synchronization Services Daemon

evmd – Event Manager Daemon

oprocd / hangcheck_timer – Node hang detector


10. What is OCR file?

Ans:

The OCR is the RAC configuration information repository; it holds the cluster node list and
the instance-to-node mapping information. It also holds the Oracle Clusterware resource
profiles for customized applications, and the configuration information of every cluster
database within the cluster. The OCR must reside on shared disk that is accessible by all of
the nodes in your cluster. The CRSD daemon manages the configuration information in the OCR
and records changes to the cluster in the registry.

11. What is Voting file/disk and how many files should be there?

Ans:

The voting disk is a file on a shared cluster file system or a shared raw device. Oracle
Clusterware uses it to determine which nodes are members of the cluster. It is akin to a
quorum disk and helps to avoid the split-brain syndrome: cluster membership is managed by
way of a health check, and the voting disk arbitrates cluster ownership among the instances
in case of network failures. The voting disk must reside on shared disk. An odd number of
voting disks (typically three) should be configured, because a node must be able to access
more than half of them to stay in the cluster.
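As a rough illustration of why an odd number of voting disks is normally configured, the majority rule can be written down in a few lines. The function names are hypothetical, and the sketch assumes the usual rule that a node must see strictly more than half of the configured voting disks.

```python
def has_voting_quorum(accessible: int, configured: int) -> bool:
    """A node keeps quorum only if it sees strictly more than half the disks."""
    return 2 * accessible > configured


def tolerated_disk_failures(configured: int) -> int:
    """Number of voting disks that can be lost while a majority remains."""
    return (configured - 1) // 2


# Three and four voting disks both survive only a single disk failure,
# so the fourth disk adds no extra protection.
failures_by_count = {n: tolerated_disk_failures(n) for n in (1, 2, 3, 4, 5)}
```

With this rule a two-disk configuration is no safer than a single disk, which is why counts of three or five are the common choices.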

12. How to take backup of OCR file?

Ans:

#ocrconfig -manualbackup

#ocrconfig -export file_name.dmp

#ocrdump -backupfile my_file

$cp -p -R /u01/app/crs/cdata /u02/crs_backup/ocrbackup/RAC1

13. How to recover OCR file?

Ans:

#ocrconfig -restore backup_file.ocr

#ocrconfig -import file_name.dmp

14. What is local OCR?


Ans:

/etc/oracle/local.ocr

/var/opt/oracle/local.ocr

15. How to check backup of OCR files?

Ans:

#ocrconfig -showbackup

16. How to take backup of voting file?

Ans:

dd if=/u02/ocfs2/vote/VDFile_0 of=$ORACLE_BASE/bkp/vd/VDFile_0

crsctl backup css votedisk -- from 11g R2

17. How do I identify the voting disk location?

Ans:

# crsctl query css votedisk

18. How do I identify the OCR file location?

Ans:

Check /var/opt/oracle/ocr.loc or /etc/oracle/ocr.loc (depending on the platform), or run:

# ocrcheck
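The ocr.loc file is a small key=value text file, so its contents can also be read programmatically. The sketch below parses a sample string rather than a real file; the sample values are made up for illustration, and the key names (`ocrconfig_loc`, `local_only`) are what is commonly seen in such files and should be treated as an assumption to verify on your own system.

```python
from typing import Dict


def parse_ocr_loc(text: str) -> Dict[str, str]:
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical file contents, for illustration only.
sample = """
# example ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE
"""
settings = parse_ocr_loc(sample)
ocr_location = settings.get("ocrconfig_loc")
```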

19. If the voting disk/OCR file got corrupted and we don’t have backups, how do we get them back?

Ans:

We have to reinstall Clusterware.

20. Who will manage OCR files?

Ans:

crsd will manage OCR.


1. What is cache fusion?

Cache fusion is a mapping of the remote memory of Oracle buffers, shared between the caches
of the participating nodes in the cluster. When one instance has read a block from a data file
and another instance requires the same block, it is faster to get the block image from the
instance that holds the block in its SGA than to read it from disk.

2. What are the components of clusterware?

Oracle Cluster Registry (OCR): It contains information about instances, services, state,
cluster configuration, nodes and, if needed, ASM storage. The OCR must reside on a shared
disk that is accessible by all the nodes in your cluster. The CRSD daemon manages the
configuration in the OCR and records changes to the cluster in the registry.

Voting Disk: It is used to verify whether a node has failed. A node that has become separated
from the majority is forcibly rebooted and, after rebooting, is added back to the surviving
nodes of the cluster. Oracle RAC uses it to maintain cluster membership.

3. What is the purpose of OLR?

The Oracle Local Registry (OLR) contains the information that allows the cluster programs to
start up when the OCR is stored in ASM. Because ASM files are unavailable until the grid
processes are started, a local copy of the required OCR data is kept in the OLR.

4. What is FAN?

FAN stands for Fast Application Notification. It relates to events involving services, nodes
and instances. Oracle RAC uses this notification mechanism to inform other processes about
configuration and service-level information, including service status changes such as UP or
DOWN events. Applications can respond to FAN events and take immediate action.

5. What is SCAN?

SCAN stands for Single Client Access Name. It is a feature introduced in Oracle RAC 11g
Release 2 that provides one name for clients to access an Oracle Database cluster. The benefit
is that client configuration does not need to change when you add or remove nodes in the cluster.
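A SCAN name normally resolves round robin to up to three IP addresses, so successive lookups of the same name land on different SCAN listeners. A minimal sketch of that rotation follows; the addresses are documentation-range placeholders, and `scan_resolver` is a hypothetical helper, not an Oracle API.

```python
from itertools import cycle
from typing import Iterator, List


def scan_resolver(scan_ips: List[str]) -> Iterator[str]:
    """Hand out the SCAN addresses in round-robin order, forever."""
    return cycle(scan_ips)


# Placeholder addresses; a real SCAN is defined in DNS or GNS.
resolver = scan_resolver(["192.0.2.11", "192.0.2.12", "192.0.2.13"])
picks = [next(resolver) for _ in range(4)]
```

The client only ever knows the single name; which nodes sit behind the three addresses can change without touching the client configuration.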

6. What is the difference between instance and crash recovery?

Crash recovery takes place at startup after the instance of a single-node database has
crashed. In a RAC environment the same recovery is performed for a failed instance by the
surviving nodes, and is called instance recovery.

7. What is hangcheck timer?

The hangcheck timer checks the health of the system at regular intervals. If the system
hangs or stops, the node is restarted automatically.

There are two key parameters:

Hangcheck margin: the maximum hang delay that is tolerated before the hangcheck timer
resets the RAC node.

Hangcheck tick: the time period between system health checks. The default is 60 seconds,
but Oracle recommends 30 seconds.
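The interaction of the two parameters can be sketched as a simple check: the timer is expected to fire every tick, and the node is reset only when the observed delay exceeds tick plus margin. The function name is hypothetical, the tick of 30 seconds is the recommended value quoted above, and the margin of 180 seconds is an assumed value used only for illustration.

```python
def node_should_reset(observed_delay_s: float,
                      tick_s: float = 30.0,
                      margin_s: float = 180.0) -> bool:
    """True when the gap since the last health check exceeds tick + margin.

    tick_s   -- expected interval between health checks (hangcheck tick)
    margin_s -- extra hang delay tolerated on top of the tick (assumed value)
    """
    return observed_delay_s > tick_s + margin_s


# With tick=30 and margin=180 the node tolerates a gap of up to 210 seconds.
short_stall = node_should_reset(200)   # still within tolerance
long_hang = node_should_reset(211)     # beyond tolerance, node is reset
```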

8. What is split brain?

When the database nodes in a cluster cannot communicate with each other, they may continue
to process and modify data blocks independently. If more than one instance modifies the same
block, no locking or synchronization of the data blocks takes place, and blocks may be
overwritten by other instances in the cluster. This state is called split brain.

9. What is GRD?

GRD is the Global Resource Directory. GES and GCS use the GRD to record the status of each
cached block and each datafile. This information underpins cache fusion and is used to
maintain data integrity.

10. What is OCR file?

It is the RAC configuration information repository, which holds the cluster node list and the
instance-to-node mapping. It also holds the Oracle Clusterware resource profiles for
customized applications, and the configuration information of every cluster database in the
cluster. The OCR must reside on a shared disk that is accessible by all of the cluster nodes.
The CRSD daemon maintains the configuration information in the OCR and records changes to
the cluster in the registry.

11. What is an interconnect network?

It is a private network that connects all the servers in a cluster. It uses one or more
switches that are accessed only by the nodes in the cluster.

12. What is a raw partition?

It is a portion of a physical disk that is accessed at the lowest possible level. A raw
partition is created when an extended partition is created and logical partitions are
assigned to it without any formatting. Once formatting is complete, it is called a cooked
partition.

13. What is the use of VIP?

When a node fails, its VIP address fails over to another node, where it accepts TCP
connections but not Oracle connections, so clients receive an error immediately.

14. What is load balancing Advisory?


The Load Balancing Advisory is provided to help balance application workload across the
designated resources.

15. What options are available to recover the OCR if it is corrupted?

Either the physical or the logical backup copy of the OCR is used to restore the repository.

1. Where are the Clusterware files stored on a RAC environment?

The Clusterware is installed on each node (in an Oracle Home) and on the shared disks (the
voting disks and the OCR file).

2. Where are the database software files stored on a RAC environment?

The base software is installed on each node of the cluster and the database storage on the
shared disks.

3. What kind of storage we can use for the shared Clusterware files?

- OCFS (Release 1 or 2)

- raw devices

- third party cluster file system such as GPFS or Veritas

4. What kind of storage we can use for the RAC database storage?

- OCFS (Release 1 or 2)

- ASM

- raw devices

- third party cluster file system such as GPFS or Veritas

5. What is a CFS?

A cluster File System (CFS) is a file system that may be accessed (read and write) by all
members in a cluster at the same time. This implies that all members of a cluster have the
same view.

6. What is an OCFS2?

The OCFS2 is the Oracle (version 2) Cluster File System which can be used for the Oracle
Real Application Cluster.

7. Which files can be placed on an Oracle Cluster File System?

- Oracle Software installation (Windows only)


- Oracle files (controlfiles, datafiles, redologs, files described by the bfile datatype)

- Shared configuration files (spfile)

- OCR and voting disk

- Files created by Oracle during runtime

Note: There are some platform specific limitations.

8. Do you know another Cluster Vendor?

HP Tru64 Unix, Veritas, Microsoft

9. How is it possible to install RAC if we don’t have a CFS?

This is possible by using a raw device.

10. What is a raw device?

A raw device is a disk drive that does not yet have a file system set up. Raw devices are
used for Real Application Clusters since they enable the sharing of disks.

11. What is a raw partition?

A raw partition is a portion of a physical disk that is accessed at the lowest possible level. A
raw partition is created when an extended partition is created and logical partitions are
assigned to it without any formatting. Once formatting is complete, it is called cooked
partition.

12. When to use CFS over raw?

A CFS offers:

- Simpler management

- Use of Oracle Managed Files with RAC

- Single Oracle Software installation

- Autoextend enabled on Oracle datafiles

- Uniform accessibility to archive logs in case of physical node failure

- With Oracle_Home on CFS, when you apply Oracle patches CFS guarantees that the
updated Oracle_Home is visible to all nodes in the cluster.

Note: This option is very dependent on the availability of a CFS on your platform.

13. When to use raw over CFS?

- Whenever a CFS is not available or not supported by Oracle.

- When performance is critical: raw devices offer the best performance, with no
intermediate layer between Oracle and the disk.

Note: Autoextend fails on raw devices if the space is exhausted. However, space can be
added online if needed.

14. What is CRS?

Oracle RAC 10g Release 1 introduced Oracle Cluster Ready Services (CRS), a platform-
independent set of system services for cluster environments. In Release 2, Oracle has
renamed this product to Oracle Clusterware.

15. What is VIP IP used for?

It returns a dead connection IMMEDIATELY when its primary node fails. Without a VIP, the
clients have to wait around 10 minutes to receive ORA-3113: “end-of-file on communication
channel”. However, using Transparent Application Failover (TAF) could avoid ORA-3113.

16. Why do we need SSH or RSH configured on the RAC nodes?

SSH (Secure Shell, 10g+) or RSH (Remote Shell, 9i+) allows the “oracle” UNIX account to
connect to another RAC node and copy files or run commands there as the local “oracle” UNIX
account.

17. Are SSH and RSH needed for normal RAC operations?

No. SSH or RSH is needed only for RAC installation, patch set installation and clustered
database creation.

18. Do we have to have Oracle RDBMS on all nodes?

Each node of a cluster that is being used for a clustered database will typically have the
RDBMS and RAC software loaded on it, but not actual data files (these need to be available
via shared disk).

19. What are the restrictions on the SID with a RAC database? Is it limited to 5 characters?

The SID prefix in 10g Release 1 and prior versions was restricted to five characters by
install/ config tools so that an ORACLE_SID of up to max of 5+3=8 characters can be
supported in a RAC environment. The SID prefix is relaxed up to 8 characters in 10g
Release 2, see bug 4024251 for more information.

20. Does Real Application Clusters support heterogeneous platforms?

The Real Application Clusters do not support heterogeneous platforms in the same cluster.

21. Are there any issues for the interconnect when sharing the same switch as the public
network by using VLAN to separate the network?

RAC and Clusterware deployment best practices suggest that the interconnect (private
connection) be deployed on a stand-alone, physically separate, dedicated switch. On a large
shared network the connections could be unstable.

22. What is the Load Balancing Advisory?


To assist in the balancing of application workload across designated resources, Oracle
Database 10g Release 2 provides the Load Balancing Advisory. This Advisory monitors the
current workload activity across the cluster and for each instance where a service is active;
it provides a percentage value of how much of the total workload should be sent to this
instance as well as service quality flag.
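To make the "percentage of the total workload" idea concrete, the sketch below turns a per-instance score into percentages. The scores and the `advisory_percentages` function are hypothetical; the real advisory derives its figures from service metrics inside the database, which this sketch does not attempt to reproduce.

```python
from typing import Dict


def advisory_percentages(capacity: Dict[str, float]) -> Dict[str, float]:
    """Convert per-instance capacity scores into percentages of new work."""
    total = sum(capacity.values())
    if total <= 0:
        raise ValueError("at least one instance must have spare capacity")
    return {inst: round(100.0 * score / total, 1)
            for inst, score in capacity.items()}


# Hypothetical scores: rac1 currently has three times the headroom of rac2.
shares = advisory_percentages({"rac1": 3.0, "rac2": 1.0})
```

A connection pool that subscribes to the advisory would then send roughly three out of four new requests to rac1 until the next update arrives.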

23. How many nodes are supported in a RAC Database?

With 10g Release 2, we support 100 nodes in a cluster using Oracle Clusterware, and 100
instances in a RAC database. Currently DBCA has a bug where it will not go beyond 63
instances. There is also a documentation bug for the max-instances parameter. With 10g
Release 1 the Maximum is 63.

24. What is the Cluster Verification Utility (cluvfy)?

The Cluster Verification Utility (CVU) is a validation tool that you can use to check all the
important components that need to be verified at different stages of deployment in a RAC
environment.

25. What versions of the database can I use the cluster verification utility (cluvfy) with?

The Cluster Verification Utility is released with Oracle Database 10g Release 2 but can also
be used with Oracle Database 10g Release 1.

26. If I am using Vendor Clusterware such as Veritas, IBM, Sun or HP, do I still need Oracle
Clusterware to run Oracle RAC 10g?

Yes. When certified, you can use Vendor Clusterware however you must still install and use
Oracle Clusterware for RAC. Best Practice is to leave Oracle Clusterware to manage RAC. For
details see Metalink Note 332257.1 and for Veritas SFRAC see 397460.1.

27. Is RAC on VMWare supported?

Yes.

28. What is hangcheck timer used for?

The hangcheck timer regularly checks the health of the system. If the system hangs or stops,
the node will be restarted automatically.

There are 2 key parameters for this module:

-> hangcheck-tick: this parameter defines the period of time between checks of system
health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.

-> hangcheck-margin: this defines the maximum hang delay that should be tolerated before
hangcheck-timer resets the RAC node.

29. Is the hangcheck timer still needed with Oracle RAC 10g?

Yes.

30. What files can I put on Linux OCFS2?

For optimal performance, you should only put the following files on Linux OCFS2:

- Datafiles

- Control Files

- Redo Logs

- Archive Logs

- Shared Configuration File (OCR)

- Voting File

- SPFILE

31. Is it possible to use ASM for the OCR and voting disk?

In Oracle 10g, no: the OCR and voting disk must be on raw devices or a CFS (cluster file
system). From 11g Release 2 onwards they can be stored in ASM diskgroups.

32. Can I change the name of my cluster after I have created it when I am using Oracle
Clusterware?

No, you must properly uninstall Oracle Clusterware and then re-install.

33. What is O2CB?

O2CB is the OCFS2 cluster stack. It consists of several services, which must be started
before OCFS2 can be used (to mount or format the file systems).

34. What is the OCR file used for?

The OCR is a file that holds the cluster and RAC configuration.

35. What is the voting disk file used for?

The voting disk is a file that holds and manages the membership information of all the
nodes.

36. What is the recommended method to make backups of a RAC environment?


Use RMAN to back up the database, dd to back up your voting disk, and copies of the OCR
file.

37. What command would you use to check the availability of the RAC system?

crs_stat -t -v (-t -v are optional)

38. What is the minimum number of instances you need to have in order to create a RAC?

You can create a RAC with just one server.

39. Name two specific RAC background processes

RAC processes are: LMON, LMDx, LMSn, LCKx and DIAG.


40. Can you have many database versions in the same RAC?

Yes, but the Clusterware version must be equal to or higher than the highest database version.
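The rule can be checked mechanically by comparing dotted version strings component by component. This is a sketch with hypothetical helper names; it assumes purely numeric version components and says nothing about which specific combinations Oracle actually certifies.

```python
from typing import Iterable, Tuple


def parse_version(version: str, width: int = 5) -> Tuple[int, ...]:
    """Turn '11.2.0.4' into (11, 2, 0, 4, 0), padding with zeros."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def clusterware_supports(clusterware: str, databases: Iterable[str]) -> bool:
    """True if no database version is higher than the Clusterware version."""
    cw = parse_version(clusterware)
    return all(parse_version(db) <= cw for db in databases)


ok = clusterware_supports("11.2.0.4", ["10.2.0.5", "11.2.0.4"])
too_old = clusterware_supports("10.2.0.5", ["11.2.0.4"])
```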

41. What was the previous name of RAC?

OPS: Oracle Parallel Server

42. What RAC component is used for communication between instances?

Private Interconnect.

43. What is the difference between normal views and RAC views?

A RAC view has the prefix ‘G’. For example, GV$SESSION instead of V$SESSION.

44. Which command will we use to manage (stop, start) RAC services in command-line
mode?

srvctl

45. How many alert logs exist in a RAC environment?

One for each instance.

46. What are the Oracle Clusterware components?

Voting Disk - Oracle RAC uses the voting disk to manage cluster membership by way of a
health check and arbitrates cluster ownership among the instances in case of network
failures. The voting disk must reside on shared disk.

Oracle Cluster Registry (OCR) - Maintains cluster configuration information as well as
configuration information about any cluster database within the cluster. The OCR must
reside on shared disk that is accessible by all of the nodes in your cluster.

47. How do you backup voting disk

#dd if=voting_disk_name of=backup_file_name

48. How do I identify the voting disk location

#crsctl query css votedisk

49. How do I identify the OCR file location

check /var/opt/oracle/ocr.loc or /etc/oracle/ocr.loc (depends upon platform)

or

#ocrcheck

50. What is SCAN?

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster. The benefit is that clients using SCAN do not need to change if you add
or remove nodes in the cluster.

What is the main purpose of Oracle Real Application Clusters (RAC)?

Oracle Real Application Clusters (RAC) allows multiple instances on separate servers to run
against a single Oracle database.

It allows any packaged or custom-built application to run against an Oracle database that is
running on a server pool. It provides a very high level of availability, flexibility and
scalability for the application and its data.

If a server in the pool fails, the database continues to run on the remaining servers and the
load is redistributed among them. It also makes it easier for the administrator to maintain
many servers at the same time, through load-balancing techniques and the ability to add more
servers when the load increases.

What is the difference between Crash recovery and Instance recovery?

When an instance crashes in a single node database, crash recovery takes place on startup.
In a RAC environment the same recovery for a failed instance is performed by the surviving
nodes, and is called instance recovery.

What is the interconnect used for?

It is a private network which is used to ship data blocks from one instance to another for
cache fusion. The physical data blocks as well as data dictionary blocks are shared across
this interconnect.

How do you determine what protocol is being used for Interconnect traffic?

One of the ways is to look at the database alert log for the time period when the database
was started up.

What methods are available to keep the time synchronized on all nodes in the cluster?

Either the Network Time Protocol (NTP) can be configured or, in 11gR2, the Cluster Time
Synchronization Service (CTSS) can be used.

What file components in RAC must reside on shared storage?

Spfiles, control files, datafiles and redo log files should be created on shared storage.

Where does the Clusterware write when there is a network or Storage missed heartbeat?

The network ping failure is written in $CRS_HOME/log

How do you find out what OCR backups are available?

The ocrconfig -showbackup can be run to find out the automatic and manually run backups.

If your OCR is corrupted, what options do you have to resolve this?

You can use either the logical or the physical OCR backup copy to restore the Repository.

How do you find out what object has its blocks being shipped across the instance the most?

You can use the DBA_HIST_SEG_STAT view.

What is a VIP in RAC use for?

The VIP is an alternate virtual IP address assigned to each node in a cluster. During a node
failure the VIP of the failed node moves to a surviving node and signals to the application
that the node has gone down. Without the VIP, the application would wait for a TCP timeout
before finding out that the session is no longer live due to the failure.

How do we know which database instances are part of a RAC cluster?

You can query the V$ACTIVE_INSTANCES view to determine the member instances of the
RAC cluster.

What is OCLUMON used for in a cluster environment?

OCLUMON is the command-line tool used to query the Cluster Health Monitor (CHM) repository.
CHM stores operating system metrics in the CHM repository for all nodes in a RAC cluster:
information on CPU, memory, process, network and other OS data. This information can later
be retrieved and used to troubleshoot and identify cluster-related issues. CHM is a default
component of the 11gR2 Grid install. The data is stored in the master repository and
replicated to a standby repository on a different node.

What would be the possible performance impact in a cluster if a less powerful node (e.g.
slower CPU’s) is added to the cluster?

All processing will slow down to the CPU speed of the slowest server.

What is the purpose of OLR?

The Oracle Local Registry (OLR) contains information that allows the cluster processes to be
started up when the OCR is in ASM storage. Since ASM is unavailable until the Grid processes
are started, a local copy of the required contents of the OCR is kept in the OLR.

What is the default memory allocation for ASM?

In 10g the default SGA size is 1G, in 11g it is set to 256M, and in 12c it is set back to
1G.

How do you backup ASM Metadata?

You can use md_backup to back up the ASM diskgroup configuration, and md_restore to restore
it in case of ASM diskgroup storage loss.

What files can be stored in the ASM diskgroup?

In 11g the following files can be stored in ASM diskgroups.

– Datafiles

– Redo logfiles

– Spfiles

In 12c the file below can now also be stored in an ASM diskgroup:

– Password file

What is the ASM POWER_LIMIT?

This is the parameter that controls the speed of a rebalance operation, that is, how many
allocation units the ASM instance will try to rebalance at any given time. The default value
is 1. In ASM versions before 11.2.0.2 the maximum value is 11; in later versions the maximum
is 1024.

What is a rolling upgrade?

A patch is considered rolling if it can be applied to the cluster binaries without shutting
down the whole database in a RAC environment. All nodes in the cluster are patched in a
rolling manner, one by one; only the node being patched is unavailable while all other
instances stay open.

What are some of the RAC specific parameters?


Some of the RAC parameters are:

– CLUSTER_DATABASE

– CLUSTER_DATABASE_INSTANCES

– INSTANCE_TYPE (RDBMS or ASM)

– ACTIVE_INSTANCE_COUNT

– UNDO_MANAGEMENT

What is the future of the Oracle Grid?

The Grid software is becoming more and more capable of not just supporting HA for Oracle
Databases but also other applications including Oracle’s applications. With 12c there are
more features and functionality built-in and it is easier to deploy these pre-built solutions,
available for common Oracle applications.

What components of the Grid should I back up?

The backups should include OLR, OCR and ASM Metadata.

Is there an easy way to verify the inventory for all remote nodes?

You can run the opatch lsinventory -all_nodes command from a single node to look at the
inventory details for all nodes in the cluster.

What is cache fusion?


In a RAC environment, it is the combining of data blocks, which are shipped across the
interconnect from remote database caches (SGA) to the local node, in order to fulfill the
requirements for a transaction (DML, Query of Data Dictionary).

What is split brain?

When database nodes in a cluster are unable to communicate with each other, they may
continue to process and modify the data blocks independently. If the same block is modified
by more than one instance, synchronization/locking of the data blocks does not take place
and blocks may be overwritten by others in the cluster. This state is called split brain.
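To prevent this state, the clusterware has to pick exactly one group of nodes to survive when the interconnect is partitioned. A commonly described rule for Oracle Clusterware is that the largest sub-cluster survives and, on a tie, the sub-cluster containing the lowest-numbered node wins. The text above does not state this rule, so the sketch below treats it as an assumption; `surviving_subcluster` is a hypothetical name.

```python
from typing import List, Set


def surviving_subcluster(subclusters: List[Set[int]]) -> Set[int]:
    """Pick the sub-cluster that stays up after a network partition.

    Assumed rule: most nodes wins; on a tie, the group holding the
    lowest node number wins. Every sub-cluster must be non-empty.
    """
    return max(subclusters, key=lambda nodes: (len(nodes), -min(nodes)))


majority = surviving_subcluster([{1, 2}, {3}])      # larger group survives
tie_break = surviving_subcluster([{3, 4}, {1, 2}])  # tie: group with node 1
```

The nodes outside the surviving group are evicted, so only one set of instances continues to modify the shared data blocks.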

What is the Load Balancing Advisory?

The Load Balancing Advisory is a mechanism through which application workload can be
balanced across the servers.

It monitors the current workload activity across the cluster, for each instance on which a
service is active.

For each such instance it provides a percentage value showing how much of the total workload
should be sent to it, together with a flag indicating the service quality.

The Load Balancing Advisory thus helps to direct work away from heavily loaded servers
towards servers with spare capacity.

Why is Cluster Verification Utility so important in Oracle RAC?

– The Cluster Verification Utility is a tool in Oracle Grid that is used to catch errors by
validating each step of a deployment.

– It verifies changes made to the configuration of the files or the system.

– The tool has a command-line interface and can also validate the configuration input, so
that problems are found before and during installation.

– The tool is used to verify the system prerequisites for Oracle Clusterware, ASM and the
databases.

– If a verification fails, fix-up scripts are available for some checks and can be used to
correct the errors automatically.

What are the components required to manage Oracle Real Application Clusters Database?

Oracle RAC presents a single system image so that the servers can be configured and managed
easily. The database and the installed and configured applications can be managed from one
location.

The components required are as follows:

– Oracle Universal Installer (OUI), which installs the cluster-related software and provides
enterprise-level configuration.

– Database Configuration Assistant (DBCA), which creates and manages the database and its
related functionality and services.

– Database Upgrade Assistant (DBUA), the tool that upgrades the database when required.

1. What is the major difference between 10g and 11g RAC?

There is not much difference between 10g and 11gR1 RAC, but there is a significant
difference in 11gR2.

Up to 11gR1 (including 10g), the following were managed by Oracle CRS:

 Databases
 Instances
 Applications
 Node Monitoring
 Event Services
 High Availability

From 11gR2 onwards, it is a complete HA stack that manages and provides the following
resources, like other cluster software such as VCS:

 Databases
 Instances
 Applications
 Cluster Management
 Node Management
 Event Services
 High Availability
 Network Management (provides DNS/GNS/MDNSD services on behalf of other
traditional services), SCAN (Single Client Access Name) and HAIP
 Storage Management (with the help of ASM and the new ACFS filesystem)
 Time synchronization (rather than depending on traditional NTP)
 The OS-dependent hang checker has been removed; hang detection is handled by its own
additional monitor process

2. What are Oracle Cluster Components?

Cluster Interconnect (HAIP)

Shared Storage (OCR/Voting Disk)

Clusterware software

3. What are Oracle RAC Components?

VIP, Node apps etc.

4. What are the Oracle kernel components (that is, how does an Oracle RAC
database differ from a normal single-instance database in terms of binaries and
processes)?

Basically, the Oracle kernel needs to be linked with the RAC option switched on when you
convert to RAC. That is the difference, as it enables the RAC background processes such as
LMON, LCK, LMD and LMS.

To turn on RAC
# link the oracle libraries
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk rac_on
# rebuild oracle
$ cd $ORACLE_HOME/bin
$ relink oracle

Oracle RAC is composed of two or more database instances. They are composed of memory
structures and background processes, the same as a single-instance database. Oracle RAC
instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that
enable cache fusion. Oracle RAC instances are composed of the following background processes:

ACMS - Atomic Controlfile to Memory Service
GTX0-j - Global Transaction Process
LMON - Global Enqueue Service Monitor
LMD - Global Enqueue Service Daemon
LMS - Global Cache Service Process
LCK0 - Instance Enqueue Process
RMSn - Oracle RAC Management Processes
RSMN - Remote Slave Monitor

5. What is Clusterware?

Software that provides various interfaces and services for a cluster. Typically, this includes
capabilities that:

 Allow the cluster to be managed as a whole
 Protect the integrity of the cluster
 Maintain a registry of resources across the cluster
 Deal with changes to the cluster
 Provide a common view of resources

6. What are the background processes in 11gR2, and what do they do?

Process name: functionality

crsd: The CRS daemon (crsd) manages cluster resources based on configuration
information that is stored in Oracle Cluster Registry (OCR) for each resource.
This includes start, stop, monitor, and failover operations. The crsd process
generates events when the status of a resource changes.

cssd: Cluster Synchronization Service (CSS) manages the cluster configuration
by controlling which nodes are members of the cluster and by notifying
members when a node joins or leaves the cluster. If you are using certified
third-party clusterware, then the CSS processes interface with your clusterware
to manage node membership information. CSS has three separate
processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and the
CSS Monitor (cssdmonitor). The cssdagent process monitors the cluster and
provides input/output fencing. This service formerly was provided by Oracle
Process Monitor daemon (oprocd), also known as OraFenceService on
Windows. A cssdagent failure results in Oracle Clusterware restarting the
node.

diskmon: The Disk Monitor daemon (diskmon) monitors and performs input/output
fencing for Oracle Exadata Storage Server. As Exadata storage can be added
to any Oracle RAC node at any point in time, the diskmon daemon is always
started when ocssd is started.

evmd: Event Manager (EVM) is a background process that publishes Oracle
Clusterware events.

mdnsd: Multicast domain name service (mDNS) allows DNS requests. The mDNS
process is a background process on Linux and UNIX, and a service on
Windows.

gnsd: Oracle Grid Naming Service (GNS) is a gateway between the cluster mDNS
and external DNS servers. The GNS process performs name resolution within
the cluster.

ons: Oracle Notification Service (ONS) is a publish-and-subscribe service for
communicating Fast Application Notification (FAN) events.

oraagent: Extends clusterware to support Oracle-specific requirements and
complex resources. It runs server callout scripts when FAN events occur.
This process was known as RACG in Oracle Clusterware 11g Release 1
(11.1).

orarootagent: The Oracle root agent (orarootagent) is a specialized oraagent process that
helps CRSD manage resources owned by root, such as the network, and the
Grid virtual IP address.

oclskd: The Cluster kill daemon (oclskd) handles instance/node eviction requests that
have been escalated to CSS.

gipcd: The Grid IPC daemon (gipcd) is a helper daemon for the communications
infrastructure.

ctssd: The Cluster time synchronisation daemon (ctssd) manages the time
synchronization between nodes, rather than depending on NTP.

7. Under which user or owner does each process start?

Component                                     Name of the Process               Owner

Oracle High Availability Service              ohasd                             init, root
Cluster Ready Service (CRS)                   crsd (Cluster Ready Services)     root
Cluster Synchronization Service (CSS)         ocssd, cssdmonitor, cssdagent     grid owner
Event Manager (EVM)                           evmd, evmlogger                   grid owner
Cluster Time Synchronization Service (CTSS)   octssd                            root
Oracle Notification Service (ONS)             ons, eons                         grid owner
Oracle Agent                                  oraagent                          grid owner
Oracle Root Agent                             orarootagent                      root
Grid Naming Service (GNS)                     gnsd                              root
Grid Plug and Play (GPnP)                     gpnpd                             grid owner
Multicast domain name service (mDNS)          mdnsd                             grid owner

8. What is the startup sequence in Oracle 11g RAC?

Click here to know more details

9. You said the voting disk and OCR reside in ASM disk groups, but as per the startup
sequence OCSSD starts before ASM. How does OCSSD start if the voting disk and OCR
reside in ASM disk groups?

You might wonder how CSSD, which is required to start the clustered ASM instance, can be
started if voting disks are stored in ASM? This sounds like a chicken-and-egg problem:
without access to the voting disks there is no CSS, hence the node cannot join the cluster.
But without being part of the cluster, CSSD cannot start the ASM instance. To solve this
problem the ASM disk headers have new metadata in 11.2: you can use kfed to read the
header of an ASM disk containing a voting disk. The kfdhdb.vfstart and kfdhdb.vfend fields
tell CSS where to find the voting file. This does not require the ASM instance to be up. Once
the voting disks are located, CSS can access them and joins the cluster.

Source: Pro Oracle Database 11g RAC on Linux- Martin Bach ... - Amazon.com
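
As a rough illustration of the check described above, the sketch below reads the output of kfed read for one disk from standard input and reports the two voting-file fields. The function name voting_file_range is made up for this example. The line layout it expects ("field: value ; offset: hex") and the reading that a zero vfstart means the disk holds no voting file are assumptions, so verify them against your own kfed output.

```shell
# Hypothetical helper: read `kfed read <disk>` output on stdin and report
# the kfdhdb.vfstart / kfdhdb.vfend header fields.
# Assumption: each field is printed as "name:   value ; offset: hex".
voting_file_range() {
    awk -F'[:;]' '
        # $1 is the field name, $2 the decimal value (with padding spaces)
        $1 ~ /^kfdhdb\.vfstart[ \t]*$/ { vs = $2 + 0; seen = 1 }
        $1 ~ /^kfdhdb\.vfend[ \t]*$/   { ve = $2 + 0 }
        END {
            if (!seen)        print "no vfstart field found"
            else if (vs == 0) print "no voting file on this disk"
            else              print "voting file range: " vs "-" ve
        }'
}
```

Typical use would be something like kfed read /dev/mapper/yourdisk | voting_file_range, run as the grid owner while the ASM instance is down or up; the device path here is only a placeholder.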

10. How does SCAN work?


1. The client connects using the SCAN name of the cluster. All three SCAN IP addresses
resolve, round robin, to the same host name (the SCAN name); in this case the SCAN
name is cluster01-scan.cluster01.example.com
2. The request reaches the DNS server in your corporate network, which resolves the
name to one of the three SCAN IP addresses. a. If GNS (Grid Naming Service) is
configured, that is, a subdomain is set up in DNS to resolve cluster addresses, the
request is handed over to GNS (gnsd).
3. Here assume there is no GNS. The request reaches one of the SCAN listeners, whose
end points are configured to the database listeners.
4. The database listener receives the request and processes it further.
5. When a node is added (say, listener 4), clients do not need to know about it or change
anything in their tns entry (the address of the 4th node/instance), as they use only the
SCAN name.
6. The same applies when a node is deleted.

11. What is GNS?

Grid Naming Service is an alternative to managing cluster names in DNS yourself. It acts as
a subdomain of your DNS but is managed by Oracle. With GNS, the connection is routed to
the cluster IP and name resolution is handled internally by the cluster.

12.

13. What file types does ASM support and keep in disk groups?

Control files           Flashback logs          Data Pump dump sets
Data files              DB SPFILE               Data Guard configuration
Temporary data files    RMAN backup sets        Change tracking bitmaps
Online redo logs        RMAN data file copies   OCR files
Archive logs            Transport data files    ASM SPFILE

14. List Key benefits of ASM?

•Stripes files rather than logical volumes
•Provides redundancy on a file basis
•Enables online disk reconfiguration and dynamic rebalancing
•Reduces the time significantly to resynchronize a transient failure by tracking
changes while disk is offline
•Provides adjustable rebalancing speed
•Is cluster-aware
•Supports reading from mirrored copy instead of primary copy for extended clusters
•Is automatically installed as part of the Grid Infrastructure

15. List key benefits of Oracle Grid Infrastructure?

16. List some of the background processes used in ASM.

Process   Description

RBAL      Opens all device files as part of discovery and coordinates the rebalance
          activity
ARBn      One or more slave processes that do the rebalance activity
GMON      Responsible for managing the disk-level activities such as drop or offline and
          advancing the ASM disk group compatibility
MARK      Marks ASM allocation units as stale when needed
Onnn      One or more ASM slave processes forming a pool of connections to the ASM
          instance for exchanging messages
PZ9n      One or more parallel slave processes used in fetching data on clustered ASM
          installation from GV$ views

13. What is node listener?

In 11gR2 the listeners run from the Grid Infrastructure software home.

•The node listener is a process that helps establish network connections from ASM
clients to the ASM instance.
•Runs by default from the Grid $ORACLE_HOME/bin directory
•Listens on port 1521 by default
•Is the same as a database instance listener
•Is capable of listening for all database instances on the same machine in addition to
the ASM instance
•Can run concurrently with separate database listeners or be replaced by a separate
database listener
•Is named tnslsnr on the Linux platform

15. What is SCAN listener?

A SCAN listener runs in addition to the node listeners. It listens for incoming database
connection requests that clients send to a SCAN IP. Its end points are configured to the
node listeners, and it routes each connection request to a particular node listener.

16. What is the difference between CRSCTL and SRVCTL?

crsctl manages clusterware-related operations:

•Starting and stopping Oracle Clusterware
•Enabling and disabling Oracle Clusterware daemons
•Registering cluster resources

srvctl manages Oracle resource-related operations:

•Starting and stopping database instances and services
•From 11gR2, also managing cluster resources such as the network, VIPs, disks etc.

17. How to control Oracle Clusterware?

To start or stop Oracle Clusterware on a specific node:

# crsctl stop crs

# crsctl start crs

To enable or disable Oracle Clusterware on a specific node:

# crsctl enable crs

# crsctl disable crs

19. How to check the cluster (all nodes) status?

To check the viability of Cluster Synchronization Services (CSS) across nodes:


$ crsctl check cluster

CRS-4537: Cluster Ready Services is online

CRS-4529: Cluster Synchronization Services is online

CRS-4533: Event Manager is online

20. How to check the cluster (one node) status?

$ crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4537: Cluster Ready Services is online

CRS-4529: Cluster Synchronization Services is online

CRS-4533: Event Manager is online
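
If you want to use this check in a monitoring script, the output shown above can be tested mechanically. The sketch below is one way to do it: it reads crsctl check crs (or crsctl check cluster) output on standard input and succeeds only when every CRS-nnnn line ends in "is online". The function name crs_all_online is hypothetical, and the rule "anything not ending in is online counts as a problem" is an assumption of this example, not an Oracle-defined contract.

```shell
# Hypothetical helper: read `crsctl check crs` output on stdin.
# Exit status 0 only if at least one CRS-nnnn line was seen and all of
# them end with "is online"; offending lines are printed.
crs_all_online() {
    awk '
        /^CRS-[0-9]+:/ {
            total++
            if ($0 !~ /is online[ \t]*$/) { bad++; print "NOT ONLINE: " $0 }
        }
        END { if (total > 0 && bad == 0) exit 0; else exit 1 }'
}
```

For example: crsctl check crs | crs_all_online || echo "clusterware stack needs attention on $(hostname)".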

21. How to find Voting Disk location?

•To determine the location of the voting disk:

# crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- ---------- ----------

1. ONLINE 8c2e45d734c64f8abf9f136990f3daf8 (ASMDISK01) [DATA]

2. ONLINE 99bc153df3b84fb4bf071d916089fd4a (ASMDISK02) [DATA]

3. ONLINE 0b090b6b19154fc1bf5913bc70340921 (ASMDISK03) [DATA]

Located 3 voting disk(s).
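
The listing above is easy to summarize in a script. As a minimal sketch, the function below counts the voting disks reported as ONLINE; it assumes the column layout shown above (a numbered first column followed by the STATE column), and the name count_online_votedisks is made up for this example.

```shell
# Hypothetical helper: read `crsctl query css votedisk` output on stdin
# and print how many voting disks are in state ONLINE.
# Assumption: data rows look like " 1. ONLINE <id> (<disk>) [<diskgroup>]".
count_online_votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}
```

For example: crsctl query css votedisk | count_online_votedisks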

22. How to find the location of OCR?

•cat /etc/oracle/ocr.loc

ocrconfig_loc=+DATA

local_only=FALSE

•# ocrcheck (also reports OCR integrity)

24. What are types of ASM Mirroring?

Disk Group Type        Supported Mirroring Levels                 Default Mirroring Level

External redundancy    Unprotected (None)                         Unprotected (None)
Normal redundancy      Two-way, Three-way, Unprotected (None)     Two-way
High redundancy        Three-way                                  Three-way
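
To see what the default mirroring levels mean for capacity, here is a small back-of-the-envelope sketch. It divides raw disk group space by the number of copies kept at the default level (1 for external, 2 for normal, 3 for high). The function name asm_usable_gb is hypothetical, and the calculation deliberately ignores ASM metadata and the free space you should reserve for rebalancing after a disk failure, so treat the result as an upper bound.

```shell
# Hypothetical helper: asm_usable_gb RAW_GB REDUNDANCY
# Rough usable capacity at the default mirroring level of the disk group type.
# Assumption: every file uses the default mirroring; reserve space is ignored.
asm_usable_gb() {
    raw_gb=$1
    case "$2" in
        external) copies=1 ;;   # unprotected
        normal)   copies=2 ;;   # two-way mirroring
        high)     copies=3 ;;   # three-way mirroring
        *) echo "unknown redundancy: $2" >&2; return 1 ;;
    esac
    echo $(( raw_gb / copies ))
}
```

For example, asm_usable_gb 300 normal prints 150.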

25. What is ASM Striping?

ASM can use variable size data extents to support larger files, reduce memory
requirements, and improve performance.

Each data extent resides on an individual disk.

Data extents consist of one or more allocation units.

The data extent size is:

•Equal to AU for the first 20,000 extents (0–19999)
•Equal to 4 × AU for the next 20,000 extents (20000–39999)
•Equal to 16 × AU for extents 40,000 and above

ASM stripes files using extents with a coarse method for load balancing or a fine method to
reduce latency.

•Coarse-grained striping is always equal to the effective AU size.
•Fine-grained striping is always equal to 128 KB.
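
The variable extent size rule above can be written down as a tiny function. This is only a sketch of the thresholds listed in this answer; the function name asm_extent_size_aus is made up, and it returns the extent size as a multiple of the allocation unit for a given zero-based extent number.

```shell
# Hypothetical helper: asm_extent_size_aus EXTENT_NUMBER
# Prints the size of that extent as a multiple of the AU, following the
# thresholds listed above (extents are numbered from 0).
asm_extent_size_aus() {
    n=$1
    if [ "$n" -lt 20000 ]; then
        echo 1      # extents 0-19999: 1 x AU
    elif [ "$n" -lt 40000 ]; then
        echo 4      # extents 20000-39999: 4 x AU
    else
        echo 16     # extents 40000 and above: 16 x AU
    fi
}
```

With a 1 MB AU, for instance, extent number 25000 of a file would be 4 MB in size.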

26. How many ASM Diskgroups can be created under one ASM Instance?

ASM imposes the following limits:

•63 disk groups in a storage system
•10,000 ASM disks in a storage system
•Two-terabyte maximum storage for each ASM disk (non-Exadata)
•Four-petabyte maximum storage for each ASM disk (Exadata)
•40-exabyte maximum storage for each storage system
•1 million files for each disk group
•ASM file size limits (database limit is 128 TB):

1. External redundancy maximum file size is 140 PB.
2. Normal redundancy maximum file size is 42 PB.
3. High redundancy maximum file size is 15 PB.

27. How to find the cluster network settings?

To determine the list of interfaces available to the cluster:

$ oifcfg iflist -p -n

To determine the public and private interfaces that have been configured:

$ oifcfg getif

eth0 192.0.2.0 global public

eth1 192.168.1.0 global cluster_interconnect
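
If you need the interconnect interface in a script, the oifcfg getif output above can be filtered as in the sketch below. The function name interconnect_interfaces is hypothetical. It assumes the four-column layout shown above, with the interface type in the last column, and matches any type that contains cluster_interconnect.

```shell
# Hypothetical helper: read `oifcfg getif` output on stdin and print
# "<interface> <subnet>" for every interface used as cluster interconnect.
# Assumption: the interface type is the last whitespace-separated field.
interconnect_interfaces() {
    awk '$NF ~ /cluster_interconnect/ { print $1 " " $2 }'
}
```

For example: oifcfg getif | interconnect_interfaces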

To determine the Virtual IP (VIP) host name, VIP address, VIP subnet mask, and VIP
interface name:

$ srvctl config nodeapps -a

VIP exists.:host01

VIP exists.: /192.0.2.247/192.0.2.247/255.255.255.0/eth0

...

28. How to change Public or VIP Address in RAC Cluster?

Click here for details

29. How to change Cluster interconnect in RAC?

On a single node in the cluster, add the new global interface specification:

$ oifcfg setif -global eth2/192.0.2.0:cluster_interconnect


Verify the changes with oifcfg getif and then stop Clusterware on all nodes by running the
following command as root on each node:

# oifcfg getif

# crsctl stop crs

Assign the network address to the new network adapters on all nodes using ifconfig:

# ifconfig eth2 192.0.2.15 netmask 255.255.255.0 broadcast 192.0.2.255

Remove the former adapter/subnet specification and restart Clusterware:

$ oifcfg delif -global eth1/192.168.1.0

# crsctl start crs

30. Managing or Modifying SCAN in Oracle RAC?

To add a SCAN VIP resource:

$ srvctl add scan -n cluster01-scan

To remove Clusterware resources from SCAN VIPs:

$ srvctl remove scan [-f]

To add a SCAN listener resource:

$ srvctl add scan_listener

$ srvctl add scan_listener -p 1521

To remove Clusterware resources from all SCAN listeners:

$ srvctl remove scan_listener [-f]

31. How to check the node connectivity in Oracle Grid Infrastructure?

$ cluvfy comp nodecon -n all -verbose

32. Can I stop all nodes in one command? Meaning that stopping whole cluster ?

In 10g it is not possible, whereas from 11gR2 it is possible:

[root@pic1]# crsctl start cluster -all


[root@pic2]# crsctl stop cluster -all

33. What is OLR? Which of the following statements regarding the Oracle Local
Registry (OLR) is true?

1.Each cluster node has a local registry for node-specific resources.

2.The OLR should be manually created after installing Grid Infrastructure on each node in
the cluster.
3.One of its functions is to facilitate Clusterware startup in situations where the ASM stores
the OCR and voting disks.

4.You can check the status of the OLR using ocrcheck.

35. How to stop whole cluster with single command

crsctl stop cluster -all (possible only from 11gR2). Please note that crsctl stop cluster
takes either -all or -n <nodename>; without either option it acts on the local node only.
For example:
crsctl stop cluster -all (stops the clusterware stack on all nodes)
crsctl stop cluster -n <nodename> (stops it only on the specified node)
crsctl stop crs (stops the whole stack, including OHASD, on the local node only)

36. CRS is not starting automatically after a node reboot, what you do to make it
happen?

crsctl enable crs (as root)

to disable

crsctl disable crs (as root)

37. What are server pools in 11gr2?

Read here

38. What is policy managed databases in RAC?

Read here

39. What is Load balancing & how does it work?

You must read here & here

40. Describe high level Steps to convert single instance to RAC?

Read here

41. What is the difference between TAF and FAN & FCF? at what conditions you
use them?

1) TAF with tnsnames


TAF (Transparent Application Failover) is a feature of Oracle Net Services for OCI8 clients.
It moves a session to a backup connection if the session fails. With Oracle 10g
Release 2, you can define the TAF policy on the service using the dbms_service package. It
works only with OCI clients. It moves only the session and, if the parameter is set, it
fails over the select statement. For insert, update or delete transactions, the application must
be TAF aware and roll back the transaction. You should also enable FCF on your OCI client
when you use TAF; it makes the failover faster.
Note: TAF will not work with JDBC thin.
2) FAN with tnsnames with aq notifications true
FAN is a feature of Oracle RAC which stands for Fast Application Notification. This allows the
database to notify the client of any change (Node up/down, instance up/down, database
up/down). For integrated clients, inflight transactions are interrupted and an error message
is returned. Inactive connections are terminated.
FCF is the client feature for Oracle Clients that have integrated with FAN to provide fast
failover for connections. Oracle JDBC Implicit Connection Cache, Oracle Data Provider for
.NET (ODP.NET) and Oracle Call Interface are all integrated clients which provide the Fast
Connection Failover feature.
3) FCF, along with FAN when using connection pools
FCF is a feature of Oracle clients that are integrated to receive FAN events and abort inflight
transactions, clean up connections when a down event is received as well as create new
connections when a up event is received. Tomcat or JBOSS can take advantage of FCF if the
Oracle connection pool is used underneath. This can be either UCP (Universal Connection
Pool for JAVA) or ICC (JDBC Implicit Connection Cache). UCP is recommended as ICC will be
deprecated in a future release.

4) ONS, with clusterware either FAN/FCF

ONS is part of the clusterware and is used to propagate messages both between nodes and
to application-tiers
ONS is the foundation for FAN upon which is built FCF.
RAC uses FAN to publish configuration changes and LBA events. Applications can react to
those published events in two ways:
- by using the ONS API (you need to program it)
- by using FCF (automatic when using the JDBC implicit connection cache on the application
server)
You can also respond to FAN events by using server-side callouts, but these run on the
server side (as their name suggests).

Relationship between FAN/FCF/ONS

ONS --> FAN --> FCF


ONS -> sends/receives messages on local and remote nodes.
FAN -> uses ONS to notify other processes about changes in configuration of service level.
FCF -> uses FAN information, working with connection pools (Java and others).

42. Can you add voting disk online? Do you need voting disk backup?

Yes. As per the documentation, if you have multiple voting disks you can add one online. If
you have only one voting disk and it is lost, the cluster will be down; in that case you need
to start CRS in exclusive mode and add the voting disk using

crsctl add votedisk <path>

43. You have lost OCR disk, what is your next step?

The cluster stack will be down because cssd is unable to maintain integrity; this is true
in 10g. From 11gR2 onwards, the crsd stack will be down while ohasd stays up and
running. You can bring the OCR back by restoring an automatic backup or importing a
manual backup.

Read complete steps here

44. What happens when ocssd fails, what is node eviction? how does node eviction
happens? For all answer will be same.
Read here

45. What is virtual IP and how does it works?

Read here

46. Describe some rac wait events you experienced?

Oracle RAC Wait events

and this table,

47. Can you modify VIP address after your cluster installation?

Yes, read here

48. How do you interpret AWR report in RAC instances, what sections in awr report for rac
instances are most important?

Read here.

Update 12-May-2013, Some practical questions added here

1. Viewing Contents in OCR/Voting disks

There are three possible ways to view the OCR contents:
a. ocrdump (or)
b. crs_stat -p (or)
c. By using strings.
Voting disk contents are not persistent and there is normally no need to view them,
because they are overwritten continuously. If you still need to view them, use strings.

2. Server pools - Read in my blog

3. Verifying Cluster Interconnect

Cluster interconnects can be verified by:


i. oifcfg getif
ii. From AWR Report.
iii. show parameter cluster_interconnect
iv. srvctl config network

4. Is the SCAN IP required, or can we disable it?

The SCAN IP can be disabled if it is not required. However, the SCAN IP is mandatory
during the RAC installation. Enabling/disabling the SCAN IP is mostly done in Oracle Apps
environments by the concurrent manager (a kind of job scheduler in Oracle Apps).
To disable the SCAN IP:
i. Do not use the SCAN IP at the client end.
ii. Stop the SCAN listener
srvctl stop scan_listener
iii. Stop SCAN
srvctl stop scan (this will stop the SCAN VIPs)
iv. Disable SCAN and disable the SCAN listener
srvctl disable scan
srvctl disable scan_listener

5. Migrating to a new disk group: scenarios

a. Case 1: Migrating a disk group from one storage to another with the same name
1. Assume the disk group is DATA.
2. Create new disks for DATA on the new storage (EMC).
a) Partitioning and provisioning are done by the storage team, who give you the device
name or mapper path, like /dev/mapper/asakljdlas
3. Add the new disk to disk group DATA
a) alter diskgroup data add disk '/dev/mapper/asakljdlas'
4. Drop the old disks from DATA; rebalancing is done automatically.
If you want, you can speed up the rebalance with alter system set asm_power_limit = 12
for full throttle.
alter diskgroup data drop disk <ASM disk name of the Hitachi disk>
Note: drop disk takes the ASM disk name; you can map a device path to its disk name
using the path and name columns in v$asm_disk.
5. Request the SAN team to detach the old storage (HITACHI).

b. Case 2: Migrating from one disk group to another with a different disk group name
1) Create the disk group with the new name on the new storage.
2) Create the spfile in the new disk group and change the parameters for control files
etc. with scope = spfile.
3) Take a control file backup in format +newdiskgroup
4) Shut down the database, then startup nomount
5) Restore the control file from the backup (the control file is now restored to the new
disk group)
6) Take an RMAN backup as copy of the whole database with the new format.
RMAN> backup as copy database format '+newdiskgroup name';
7) RMAN> switch database to copy;
8) Verify in dba_data_files, dba_temp_files and v$log that all files point to the new
disk group name.

c. Case 3: Migrating a disk group to new storage when no additional disk group is given
1) Take an RMAN backup as copy of all the databases with the new format and place it on
disk.
2) Prepare rename commands from v$log, v$datafile etc. (dynamic queries)
3) Take a backup of the pfile and modify the following to refer to the new disk group name
.control_files
.db_create_file_dest
.db_create_online_log_dest_1
.db_create_online_log_dest_2
.db_recovery_file_dest
4) Stop the database
5) Unmount the disk group
asmcmd umount ORA_DATA
6) Use the renamedg command (11gR2 only) to rename the disk group
renamedg phase=both dgname=ORA_DATA newdgname=NEW_DATA
verbose=true
7) Mount the disk group
asmcmd mount NEW_DATA
8) Start the database in mount mode with the new pfile prepared in step 3
9) Run the rename file scripts generated in step 2
10) Add the disk group to the cluster (if using RAC)
srvctl modify database -d orcl -p +NEW_FRA/orcl/spfileorcl.ora
srvctl modify database -d orcl -a "NEW_DATA"
srvctl config database -d orcl
srvctl start database -d orcl
11) Delete the old disk group resource from the cluster
crsctl delete resource ora.ORA_DATA.dg
12) Open the database.

7. Database rename in RAC, what could be the checklist for you?

a. Record the configuration of all the services that are running on the database.
b. Set cluster_database=FALSE
c. Drop all the services associated with the database.
d. Stop the database
e. Startup mount
f. Use nid to change the DB name.
Generic question: if using ASM, the usual location for a datafile would
be '+DATA/datafile/OLDDBNAME/system01.dbf'.
Does nid change this path too, to reflect the new DB name?
Reportedly yes: with a proper directory structure it creates links to the
original directory structure, '+DATA/datafile/NEWDBNAME/system01.dbf'.
This still has to be tested; we do not have a test bed, but thanks to Anji who
confirmed it will.
g. Change the parameters according to the new database name
h. Change the password file.
i. Stop the database.
j. Mount the database
k. Open the database with resetlogs
l. Create spfile from pfile.
m. Add the database to the cluster.
n. Re-create the services that were dropped prior to the rename.
o. Bounce the database.

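8. How to find the database to which a particular service is attached when you have a large
number of databases running on the server and cannot check them one by one manually

Write a shell script that reads the database names from oratab and loops over them, passing
each DB name to srvctl to get the result.

#!/bin/ksh
# Usage: <script> <service_name>
# Prints the srvctl status line for every database in /etc/oratab on which
# the given service is running.
ORACLE_HOME=<crs_home>      # replace with the path of your CRS home
PATH=$ORACLE_HOME/bin:$PATH
# Append the Oracle libraries to any existing LD_LIBRARY_PATH
LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$ORACLE_HOME/lib
export ORACLE_HOME PATH LD_LIBRARY_PATH
# Skip comment lines in oratab; field 1 (before the first colon) is the DB name
for INSTANCE in $(grep -v "^#" /etc/oratab | cut -f1 -d: -s)
do
    export ORACLE_SID=$INSTANCE
    srvctl status service -d "$INSTANCE" -s "$1" | grep -i "is running"
done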

9. Difference between OHAS and CRS

OHAS is the lower-level cluster stack, which handles some kernel-level tasks such as
managing the network, time synchronization, disks etc., whereas CRS manages resources
such as databases, listeners, applications etc. With both of these, Oracle provides general
high availability clustering services rather than something tied only to databases.

Update, Few More Practical Questions & Answers

How to start cluster

crsctl start crs

crsctl start cluster -all

How to stop cluster

crsctl stop crs --> node

crsctl stop cluster -all

How to automatically start the cluster when reboot

crsctl enable crs

How to disable the crs when reboot

crsctl disable crs

How to check voting Disk

crsctl query css votedisk

How to check ocr

ocrcheck

How to take backup or check backups of ocr


ocrconfig -showbackup

ocrconfig -manualbackup (to take manual backup)

How to check crs versions

crsctl query crs softwareversion

crsctl query crs activeversion

How to check Private/Public Configuration

oifcfg getif

How to modify ocr to a different diskgroup

ocrconfig -add +NEWDG

How to modify voting disk to different diskgroup

crsctl replace votedisk +NEWDG

How to set the Private IP/Public IP/Public

oifcfg setif <IP>

crsctl stop cluster

crsctl start cluster

How to modify scan IP/Public VIP

srvctl stop vip

srvctl stop scan

srvctl stop scan_listener

srvctl stop listener

Change the IP address in network/dns/etc/hosts

srvctl modify vip -n nodename -vip vipname

srvctl modify scan


srvctl start scan_listener

srvctl start listener

How to delete a node from Cluster

On the node to delete:

cd $GRID_HOME/crs/install

./rootcrs.pl -deconfig -force

On another node:

crsctl delete node -n node4

How to add a node to cluster

1. New Node Readiness

a) RPMS Install

b) Kernel Parameters

c) Directory structures creation i.e oracle home

d) oracle user creation

e) File limits

f) NTP setup

2. Set up user equivalence (passwordless connectivity)

From existing node in cluster

$GRID_HOME/sshsetup/sshsetup -user oracle -node node1,node2,node3,newnode

3. Copy GRID HOME from any node to new node

scp -r $GRID_HOME newnode:/u01/app/12.1.0/grid

4. In new node

Remove unwanted log directories in the grid home you just copied
5. Run clone.pl in new node

cd $GRID_HOME/oui/bin/clone.pl -ORACLE_HOME=/u01/oracle/12.1.0/grid

6. Run addnode.sh in existing node (remember this need to be run on existing node not on
new node)

cd $ORACLE_HOME/oui/bin/addnode.sh newnodename newnodevip newnodeprivateip

7. Copy the gpnp profile and crs params to new node from existing node

scp $GRID_HOME/gpnp/peer/profile.xml newnode:$GRID_HOME/gpnp/peer/profile

scp $GRID_HOME/crsconfig/install/crs_configparams
newnode:$GRID_HOME/gpnp/peer/profile

8. On the new node, run root.sh

./root.sh

How to add a database to cluster

srvctl add database -d orcl -o oracle_home -p pfile -s startupoptions

srvctl add instance -d orcl -i orcl1 -n node1

How to check config details in cluster

srvctl config database -d orcl

srvctl config scan

srvctl config service

1) Mention what is cluster?

A cluster is a group of independent but connected servers that behave as a single system.

2) Mention what is Oracle Real Application Clusters?


RAC or Real Application Cluster is a component of the database product that enables the
database to be installed across multiple servers. Oracle RAC uses Oracle Clusterware for the
infrastructure to bind multiple servers, so they operate as a single system.

3) Mention what are the main components of an Oracle RAC system?

The main elements of an Oracle RAC system are,

Shared disk system

Oracle Clusterware

Cluster Interconnects

Oracle Kernel Components

4) Mention what are the benefits of Oracle RAC?

Benefits of RAC is that

Business Continuity and High Availability

Workload Management with least expense

Agility and Scalability

System management and Standardized deployment

5) Mention what are the file storage options provided by Oracle Database for Oracle RAC?

The file storage options provided by Oracle Database for Oracle RAC are,

Automatic Storage Management (ASM)

OCFS2 and Oracle Cluster File System (OCFS)

A network file system

Raw devices

6) Mention what is the volume management techniques used in Oracle RAC?

Volume management techniques used in Oracle RAC is that,

Oracle RAC provides dynamic volume manager. It has a file system that consists of
information of the cluster file system

Cluster file system in Oracle is known as OCFS. It has the connection with the databases
that provide raw devices and command line features.

7) Mention what is new feature in Oracle ASM 12c?

The new feature added in Oracle ASM 12c is Oracle Flex ASM. It is a new ASM deployment
model which increases database instance availability and reduces Oracle ASM related
resource consumption.

8) Mention how Oracle Flex ASM works?

When an Oracle Flex ASM instance fails on a particular node, the ASM service is passed
over to another node in the cluster: the database instances that were using the failed
ASM instance continue by connecting to an ASM instance running on another node.

9) Mention what are the key characteristics of RAC or why to use RAC?

The key characteristics of RAC are,

Reliability: Eliminates the database server from a single point of failure. If an instance fails,
the remaining instances in the cluster remain active and open.

Error Detection: Provides fast detection of problems in the environment. It automatically
recovers from failures, often before users notice that a failure has occurred.

Recoverability: Easy to recover from various types of failures.

Continuous Operations: Provides continuous service for both unplanned and planned
outages.

10) Mention what is the function of Cache Fusion in Oracle RAC?

Cache Fusion lets the instances of an Oracle RAC database share their buffer caches. When
one instance needs a data block that another instance already holds in its SGA, the block
image is shipped across the private interconnect instead of being read from disk. RAC uses
a dedicated private network (the cluster interconnect) for this traffic, and Cache Fusion is
an internal part of the cluster.

11) Mention what is the difference between single instance environment and RAC
environment?

Single instance environment: The instance has its own SGA (System Global Area).
RAC environment: Each instance has its own SGA.

Single instance environment: Datafiles and control files are accessed by only one instance.
RAC environment: Datafiles and control files are shared by all instances.

Single instance environment: Online redo logfiles are dedicated for read/write to only one
instance.
RAC environment: Only one instance can write to an online redo logfile, but other
instances can read it during recovery and archiving.

Single instance environment: The flash recovery log is accessed by only one instance.
RAC environment: The flash recovery log is shared by all instances.

Single instance environment: Alert log and trace files are dedicated to the instance.
RAC environment: Alert log and trace files are private to each instance. Other instances
never write to or read those files.

12) Mention what is split brain syndrome in RAC?

In Oracle RAC, all the instances/servers communicate with each other using a private
network. When the instance members in a RAC cluster fail to ping/connect to each other via
this private network but continue to process data blocks independently, the condition is
referred to as Split Brain Syndrome.

13) What happens if you keep split brain syndrome in RAC unresolved? How can it be
resolved?

If split brain syndrome is left unresolved, there will be data integrity issues: blocks
changed in one instance will not be locked and could be overwritten by another
instance. It is resolved by using the voting disk, which decides which node(s) will survive
and which node(s) will be evicted.
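
One practical consequence of voting-disk arbitration is the familiar majority rule: a node has to be able to access more than half of the configured voting disks to stay in the cluster. That rule is not spelled out in the answer above, so treat it as background for this sketch. The function name votedisk_quorum is hypothetical; it shows how many voting disks a node must reach and how many can be lost for a given total.

```shell
# Hypothetical helper: votedisk_quorum TOTAL_VOTING_DISKS
# Prints the strict-majority count a node must access, and how many
# voting disks can fail before that majority is lost.
votedisk_quorum() {
    total=$1
    quorum=$(( total / 2 + 1 ))     # strict majority (integer division)
    echo "quorum=$quorum tolerated_failures=$(( total - quorum ))"
}
```

With 3 voting disks this gives quorum=2 and one tolerated failure, which is why an odd number of voting disks is the usual choice.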

14) Mention how can you determine what protocol is being used for Interconnect traffic?

To determine what protocol is being used for Interconnect traffic you can look at the
database alert log for the time period when the database was started up.

15) Mention in RAC what files should be created on shared storage?

In RAC ControlFiles, Spfiles, Redolog files, and Datafiles should be created on shared
storage.

16) Mention where does the Clusterware write when there is a network or storage issue?

When there is a network or storage issue the network ping failure is written in
$CRS_HOME/log

17) Mention what are the tools provided in Oracle Enterprise Manager?

Tools provided in Oracle Enterprise Manager are,

Grid Control-

It is used to deliver the centralized management system and provides configuration and
administration capabilities.
It provides the cost reduction plans and provides higher efficiency

Database Control-

It is used as a graphical management tool to manage the database to make it configure


automatically.

It is related to the Oracle Clusterware. It is used to maintain the services of the Oracle RAC.

It also manages the server pools that are being created with the Oracle Clusterware and
provision to manage it from a single place.

18) Mention what is the difference between Instance recovery and Crash recovery?

A crash recovery takes place when an instance crashes in a single node database on
startup. When the same recovery for an instance is performed in RAC environment by the
surviving nodes then it is called Instance recovery.

19) What if your OCR (Oracle Cluster Registry) is corrupted?

if your OCR is corrupted, you can either use the logical or physical OCR backup copy to
restore the repository.

20) Mention what is OLR?

OLR stands for Oracle Local Registry. It holds the information that enables the
cluster programs to start when the OCR is in ASM storage. Until the grid processes are
started, the ASM file is unavailable. In that case a local copy of the OCR data is
required, and it is stored in the OLR.

What is RAC?

RAC stands for Real Application cluster. It is a clustering solution from Oracle Corporation
that ensures high availability of databases by providing instance failover, media failover
features.

How many nodes are supported in a RAC Database?

10g Release 2, support 100 nodes in a cluster using Oracle Clusterware, and 100 instances
in a RAC database.

What is SCAN?
Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster. The benefit is clients using SCAN do not need to change if you add or
remove nodes in the cluster.

Mention the Oracle RAC software components:-

An Oracle RAC environment consists of two or more database instances that open the same
database.

Each instance is composed of memory structures and background processes.
Oracle RAC instances use two services, the Global Enqueue Service (GES) and the Global
Cache Service (GCS), that enable cache fusion.
Oracle RAC instances include the following background processes:
ACMS—Atomic Controlfile to Memory Service (ACMS)
GTX0-j—Global Transaction Process
LMON—Global Enqueue Service Monitor
LMD—Global Enqueue Service Daemon
LMS—Global Cache Service Process
LCK0—Instance Enqueue Process
RMSn—Oracle RAC Management Processes (RMSn)
RSMN—Remote Slave Monitor

What is GRD?

GRD stands for Global Resource Directory. The GES and GCS record the status of each data
file and each cached block in the Global Resource Directory. This bookkeeping underpins
cache fusion and helps maintain data integrity.

Cache Fusion in Detail:-

Oracle RAC is composed of two or more instances. When one instance in the cluster has read
a block of data from a datafile and another instance needs the same block, it is cheaper to
get the block image from the instance that already holds it in its SGA than to read it from
disk. Oracle RAC uses the interconnect for this inter-instance communication. The Global
Cache Service (GCS) and the Global Enqueue Service (GES) coordinate cache fusion.
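
The lookup order described above can be sketched with a toy model. This is only an
illustration under simplifying assumptions, not Oracle's actual protocol: the Cluster class,
its directory dictionary, and the instance names orcl1 and orcl2 are hypothetical, and block
modes, locking, and consistency are ignored entirely.

```python
class Cluster:
    """Toy model: per-instance buffer caches plus a directory of block holders."""

    def __init__(self, instance_names, disk):
        self.caches = {name: {} for name in instance_names}
        self.directory = {}   # block_id -> name of an instance that caches it
        self.disk = disk      # block_id -> block contents
        self.disk_reads = 0

    def read_block(self, instance, block_id):
        cache = self.caches[instance]
        # 1. The block is already in the local buffer cache.
        if block_id in cache:
            return cache[block_id], "local cache"
        # 2. Another instance holds it: copy the image over the interconnect.
        holder = self.directory.get(block_id)
        if holder is not None:
            cache[block_id] = self.caches[holder][block_id]
            return cache[block_id], "interconnect from " + holder
        # 3. Nobody holds it: read from disk and record the holder.
        self.disk_reads += 1
        cache[block_id] = self.disk[block_id]
        self.directory[block_id] = instance
        return cache[block_id], "disk"


cluster = Cluster(["orcl1", "orcl2"], disk={42: "row data"})
print(cluster.read_block("orcl1", 42))   # first reader goes to disk
print(cluster.read_block("orcl2", 42))   # second reader is served by orcl1
print(cluster.disk_reads)                # only one physical read happened
```

The point of the sketch is the single disk read: the second instance is served from the
first instance's cache.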

What are the Oracle database background processes specific to RAC?

•LMS—Global Cache Service Process

•LMD—Global Enqueue Service Daemon

•LMON—Global Enqueue Service Monitor

•LCK0—Instance Enqueue Process

To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy
a query or transaction, Oracle RAC instances use two processes, the Global Cache Service
(GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the
statuses of each data file and each cached block using a Global Resource Directory (GRD).
The GRD contents are distributed across all of the active instances.

RAC Background Processes in Detail.


ACMS in Detail:-

ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment ACMS is
an agent that ensures a distributed SGA memory update, that is, SGA updates are globally
committed on success or globally aborted in the event of a failure.

GTX0-j in Detail:-

The process provides transparent support for XA global transactions in a RAC environment.
The database auto tunes the number of these processes based on the workload of XA global
transactions.

LMON in Detail:-

This process is the Global Enqueue Service Monitor. It monitors global enqueues and
resources across the cluster and performs global enqueue recovery operations.

LMD in Detail:-

This process is the Global Enqueue Service Daemon. It manages incoming remote resource
requests within each instance.

LMS in Detail:-

This process is the Global Cache Service process. It maintains the status of datafiles and
of each cached block by recording information in the Global Resource Directory (GRD). It
also controls the flow of messages to remote instances, manages global data block access,
and transmits block images between the buffer caches of different instances. This
processing is part of the cache fusion feature.

LCK0 in Detail:-

This process is the instance enqueue process. It manages non-cache-fusion resource
requests such as library cache and row cache requests.

RMSn in Detail:-

These are the Oracle RAC management processes. They perform manageability tasks for
Oracle RAC, including the creation of resources related to Oracle RAC when new instances
are added to the cluster.

RSMN in Detail:-

This process is the Remote Slave Monitor. It manages background slave process creation and
communication on remote instances. The background slave processes it manages perform
tasks on behalf of a coordinating process running in another instance.

What are the Oracle Clusterware processes for 10g on Unix and Linux?

Cluster Synchronization Services (ocssd): Manages cluster node membership and runs as
the oracle user; failure of this process results in a node restart.

Cluster Ready Services (crsd): The crs process manages cluster resources (which could be
a database, an instance, a service, a listener, a virtual IP (VIP) address, an application
process, and so on) based on the resource's configuration information that is stored in
the OCR. This includes start, stop, monitor and failover operations. This process runs as the
root user.

Event manager daemon (evmd): A background process that publishes the events that crs
creates.

Process Monitor Daemon (OPROCD): This process monitors the cluster and provides I/O
fencing. OPROCD performs its check, sleeps, and if it wakes up later than the expected
time, it resets the processor and reboots the node. An OPROCD failure results in Oracle
Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.

RACG (racgmain, racgimon): Extends clusterware to support Oracle-specific requirements
and complex resources. Runs server callout scripts when FAN events occur.

What are the Oracle Clusterware components?

Voting Disk: Oracle RAC uses the voting disk to manage cluster membership by way of a
health check, and it arbitrates cluster ownership among the instances in case of network
failures. The voting disk must reside on shared disk.

Oracle Cluster Registry (OCR): Maintains cluster configuration information as well as
configuration information about any cluster database within the cluster. The OCR must
reside on shared disk that is accessible by all of the nodes in your cluster.

What components in RAC must reside in shared storage?

All datafiles, controlfiles, SPFILEs, and redo log files must reside on cluster-aware shared
storage.

What is the significance of using cluster-aware shared storage in an Oracle RAC environment?

All instances of an Oracle RAC database can access all the datafiles, controlfiles, SPFILEs,
and redo log files when these files are hosted on cluster-aware shared storage, which is a
group of shared disks.

Give a few examples of solutions that support cluster storage:-

ASM (Automatic Storage Management),
raw disk devices,
network file system (NFS),
OCFS2 and
OCFS (Oracle Cluster File System).

What is an interconnect network?

An interconnect network is a private network that connects all of the servers in a cluster.
The interconnect network uses a switch/multiple switches that only the nodes in the cluster
can access.

How can we configure the cluster interconnect?

Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the cluster interconnect.
On UNIX and Linux systems, Oracle Clusterware uses the UDP and RDS (Reliable Datagram
Sockets) protocols.
Windows clusters use the TCP protocol.

Can we use crossover cables with Oracle Clusterware interconnect?

No, crossover cables are not supported with Oracle Clusterware interconnects.

What is the use of cluster interconnect?

The cluster interconnect is used by cache fusion for inter-instance communication.

What is the purpose of the private interconnect?

Clusterware uses the private interconnect for cluster synchronization (network heartbeat)
and daemon communication between the clustered nodes. This communication is based
on the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP).
Cache fusion is the remote memory mapping of Oracle buffers, shared between the caches
of participating nodes in the cluster.

How do users connect to database in an Oracle RAC environment?

Users can access a RAC database using a client/server configuration or through one or more
middle tiers, with or without connection pooling. Users can use the Oracle services feature
to connect to the database.

What is the use of a service in Oracle RAC environment?

Applications should use the services feature to connect to the Oracle database. Services
enable us to define rules and characteristics to control how users and applications connect
to database instances.

What are the characteristics controlled by Oracle services feature?

The characteristics include a unique name, workload balancing, failover options, and high
availability.

What enables the load balancing of applications in RAC?

Oracle Net Services enables the load balancing of application connections across all of the
instances in an Oracle RAC database.

What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connections use instead
of the standard public IP address. To configure VIP address, we need to reserve a spare IP
address for each node, and the IP addresses must use the same subnet as the public
network.

What is the use of VIP?

If a node fails, then the node's VIP address fails over to another node, on which
the VIP address can accept TCP connections but cannot accept Oracle connections.

Why do we have a Virtual IP (VIP) in Oracle RAC?

Without using VIPs or FAN, clients connected to a node that died will often wait for
a TCP timeout period (which can be up to 10 min) before getting an error. As a result, you
don't really have a good HA solution without using VIPs.
When a node fails, the VIP associated with it is automatically failed over to some other node
and new node re-arps the world indicating a new MAC address for the IP. Subsequent
packets sent to the VIP go to the new node, which will send error RST packets back to the
clients. This results in the clients getting errors immediately.

Give situations under which VIP address failover happens:-

VIP address failover happens when the node on which the VIP address runs fails, when all
interfaces for the VIP address fail, or when all interfaces for the VIP address are
disconnected from the network.

What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the VIP address
receive a rapid connection refused error. They don't have to wait for TCP connection timeout
messages.

What are the administrative tools used for Oracle RAC environments?

An Oracle RAC cluster can be administered as a single image using the following tools:
OEM (Enterprise Manager),
SQL*PLUS,
Server control (SRVCTL),
Cluster Verification Utility (CLUVFY),
DBCA,
NETCA

How do we verify that RAC instances are running?

Issue the following query from any one node, connecting through SQL*Plus.
SQL>connect sys/sys as sysdba
SQL>select * from V$ACTIVE_INSTANCES;
The query gives the instance number in the INST_NUMBER column and the host and instance
name in the INST_NAME column.

What is FAN?

FAN stands for Fast Application Notification. It covers events related to instances,
services and nodes. It is a notification mechanism that Oracle RAC uses to notify other
processes about configuration and service level information, including service status
changes such as UP or DOWN events. Applications can respond to FAN events and take
immediate action.

Where can we apply FAN UP and DOWN events?

FAN UP and FAN DOWN events can be applied to instances, services and nodes.

State the use of FAN events in case of a cluster configuration change?

During times of cluster configuration changes, Oracle RAC high availability framework
publishes a FAN event immediately when a state change occurs in the cluster. So
applications can receive FAN events and react immediately. This prevents applications from
polling database and detecting a problem after such a state change.

Why should we have separate homes for the ASM instance?

It is a good practice to have ASM home separate from the database home (ORACLE_HOME).
This helps in upgrading and patching ASM and the Oracle database software independent of
each other. Also, we can deinstall the Oracle database software independent of the ASM
instance.

What is the advantage of using ASM?

ASM is the Oracle-recommended storage option for RAC databases because it maximizes
performance by managing the storage configuration across the disks. ASM does this by
distributing the database files across all of the available storage within the cluster
database environment.

What is rolling upgrade?

It is a new ASM feature in Oracle Database 11g. ASM instances in Oracle Database 11g
releases (from 11.1) can be upgraded or patched using the rolling upgrade feature. This
enables us to patch or upgrade ASM nodes in a clustered environment without affecting
database availability. During a rolling upgrade the cluster stays functional while one or
more of its nodes run different software versions.

Can rolling upgrade be used to upgrade from 10g to 11g database?

No, it can be used only for Oracle database 11g releases (from 11.1).

State the initialization parameters that must have same value for every instance in
an Oracle RAC database:-

Some initialization parameters are critical at the database creation time and must have
same values. Their value must be specified in SPFILE or PFILE for every instance. The list of
parameters that must be identical on every instance are given below:

Must DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all instances?

These parameters must be identical on all instances only if their values are set to zero.

What two parameters must be set at the time of starting up an ASM instance in
a RAC environment?

The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.

Mention the components of Oracle Clusterware:-

Oracle Clusterware is made up of components like voting disk and Oracle Cluster Registry
(OCR).

What is a CRS resource?

Oracle Clusterware is used to manage high-availability operations in a cluster. Anything that
Oracle Clusterware manages is known as a CRS resource. Some examples of CRS resources
are a database, an instance, a service, a listener, a VIP address, an application process etc.

What is the use of OCR?

Oracle Clusterware manages CRS resources based on the configuration information
of CRS resources stored in the OCR (Oracle Cluster Registry).

How does an Oracle Clusterware manage CRS resources?

Oracle Clusterware manages CRS resources based on the configuration information
of CRS resources stored in the OCR (Oracle Cluster Registry).

Name some Oracle Clusterware tools and their uses.

OIFCFG - Allocating and deallocating network interfaces, and identifying the interconnect
being used.
OCRCONFIG - Command-line tool for managing the Oracle Cluster Registry.
OCRDUMP - Dumps the contents of the Oracle Cluster Registry for inspection.
CVU - Cluster Verification Utility, used to verify the cluster configuration.

What are the modes of deleting instances from Oracle Real Application cluster
Databases?

We can delete instances using silent mode or interactive mode using DBCA (Database
Configuration Assistant).

How do we remove ASM from an Oracle RAC environment?

We need to stop and delete the instance in the node first in interactive or silent mode. After
that ASM can be removed using srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name

How do we verify that an instance has been removed from OCR after deleting an
instance?

Issue the following srvctl command:
srvctl config database -d database_name
Then check the resource status from the Clusterware home:
cd $CRS_HOME/bin
./crs_stat

How do we verify an existing current backup of OCR?

We can verify the current backup of the OCR using the following command:
ocrconfig -showbackup

What are the performance views in an Oracle RAC environment?

The V$ views are instance specific. In addition there are GV$ views, called global views,
which have an INST_ID column of numeric data type. The GV$ views obtain their information
from the individual V$ views of each instance.

What are the types of connection load-balancing?

There are two types of connection load-balancing: server-side load balancing and client-side
load balancing.

What is the difference between server-side and client-side connection load
balancing?

Client-side load balancing happens at the client, which spreads its connection requests
across the available listeners. With server-side load balancing, the listener uses the
load balancing advisory to redirect connections to the instance providing the best service.
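
The difference can be sketched as two selection functions. This is a simplified
illustration, not Oracle Net's implementation: the function names, the listener addresses,
and the advisory scores are hypothetical, and the server-side rule is reduced to "pick the
instance with the best score".

```python
import random


def client_side_pick(listener_addresses, rng=random):
    """Client-side: the client itself picks one listener address at random."""
    return rng.choice(listener_addresses)


def server_side_pick(advisory_scores):
    """Server-side: the listener picks the instance with the best advisory score."""
    return max(advisory_scores, key=advisory_scores.get)


# Hypothetical listener addresses and advisory scores, for illustration only.
listeners = ["node1-vip:1521", "node2-vip:1521", "node3-vip:1521"]
scores = {"orcl1": 20, "orcl2": 55, "orcl3": 25}

print(client_side_pick(listeners))   # any one of the three listeners
print(server_side_pick(scores))      # orcl2, the highest score
```

The client-side choice needs no knowledge of the load; the server-side choice depends on
information that only the cluster has.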

Give the usage of srvctl:-


srvctl start instance -d db_name -i "inst_name_list" [-o start_options]
srvctl stop instance -d name -i "inst_name_list" [-o stop_options]
srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate
srvctl start database -d name [-o start_options]
srvctl stop database -d name [-o stop_options]
srvctl start database -d orcl -o mount

How do you troubleshoot a node reboot?

Please check Metalink:

Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle
Clusterware Node evictions.

How do you back up the OCR?

There is an automatic backup mechanism for the OCR. The default location is:
$ORA_CRS_HOME/cdata/<cluster_name>/

To display backups:
#ocrconfig -showbackup
To restore a backup:
#ocrconfig -restore backup_file_name

With Oracle RAC 10g Release 2 or later, you can also use the export command:
#ocrconfig -export export_file_name -s online
and use the -import option to restore the contents.
With Oracle RAC 11g Release 1, you can take a manual backup of the OCR with the
command:
#ocrconfig -manualbackup

How do you back up the voting disk?

#dd if=voting_disk_name of=backup_file_name

How do I identify the voting disk location?

#crsctl query css votedisk

How do I identify the OCR file location?

Check /var/opt/oracle/ocr.loc or /etc/oracle/ocr.loc (depending on the platform)
or
#ocrcheck

What is the purpose of the ONS daemon?

The Oracle Notification Service (ONS) daemon is started by the CRS clusterware as part of
the nodeapps. There is one ONS daemon per clustered node.
The ONS daemon receives a subset of published clusterware events via the local evmd and
racgimon Clusterware daemons and forwards those events to application subscribers and to
the local listeners.

This facilitates:

a. FAN (Fast Application Notification), which allows applications to respond to database
state changes.
b. The 10gR2 Load Balancing Advisory, which permits load balancing across different RAC
nodes depending on the load on each node. The RDBMS MMON process creates an advisory for
the distribution of work every 30 seconds and forwards it via racgimon and ONS to listeners
and applications.

Srvctl cannot start an instance and returns the errors PRKP-1001 and CRS-0215,
but SQL*Plus can start it on both nodes. How do you identify the problem?

Set the environment variable SRVM_TRACE to true and start the instance with srvctl again.
You will now get a detailed error stack.

What is (use of) Virtual IP (VIP) in Oracle Real Application Clusters (RAC)?

When installing Oracle 10g/11g R1 RAC, three network interfaces (IPs) are required for each
node in the RAC cluster, they are:
 Public Interface: Used for normal network communications to the node
 Private Interface: Used as the cluster interconnect
 Virtual (Public) Interface: Used for failover and RAC management
When installing Oracle 11g R2 RAC, one more network interface (IP) is required:

 SCAN Interface (IP): Single Client Access Name (SCAN) is a new Oracle Real
Application Clusters (RAC) 11g Release 2 feature, which provides a single name for clients
to access an Oracle Database running in a cluster. The benefit is clients using SCAN do not
need to change if you add or remove nodes in the cluster.

When a client connects to a tns-alias, it uses a TCP connection to an IP address, defined in
the tnsnames.ora file. When using RAC, we define multiple addresses in our tns-alias, to be
able to failover when an IP address, listener or instance is unavailable. TCP timeouts can
differ from platform to platform or implementation to implementation. This makes it difficult
to predict the failover time.

Oracle 10g Cluster Ready Services enables databases to use a Virtual IP address on which
to configure the listener. This feature ensures that Oracle clients fail over quickly when a
node fails. In Oracle Database 10g RAC, the use of a virtual IP address to mask the
individual IP addresses of the clustered nodes is required. The virtual IP addresses are
used to simplify failover and are automatically managed by CRS.

To create a Virtual IP (VIP) address, the Virtual IP Configuration Assistant (VIPCA) is called
from the root.sh script of a RAC install, which then configures the virtual IP addresses for
each node specified during the installation process. To be able to run VIPCA, there
must be unused public IP addresses available for each node that has been configured in the
/etc/hosts file.

Each node needs one public IP address to use as its Virtual IP address for client connections
and for connection failover. This address is in addition to the public host IP address that
the operating system already manages for the node. The Virtual IP must be associated with
the same interface name on every node in the cluster, and the VIP addresses of all nodes
must be on the same subnet as the public host network addresses. The host names for the
VIP addresses must be registered with the domain name server (DNS). Each VIP requires an
unused, resolvable IP address: it must not be in use at the time of installation, because
Oracle manages it internally through the RAC processes. The Virtual IP address does not
require a separate NIC.

Using a virtual IP avoids the TCP/IP timeout problem because Oracle Notification Service
(ONS) maintains communication between the nodes and listeners. When ONS finds a listener
or node down, it notifies the other nodes and listeners. When a new connection is attempted
to the failed node or listener, the virtual IP of the failed node has already moved to a
surviving node, and the session is established on a surviving node. This process does not
wait for a TCP/IP timeout, so new connections are established faster on the surviving
nodes and listeners.

The Virtual IP (VIP) exists for fast connection establishment in failover situations. We can
still use the physical IP address in the listener in Oracle 10g if failover timing is not a
concern, and we can reduce the default TCP/IP timeout using operating system utilities.
Even so, taking advantage of the VIP in an Oracle 10g RAC database is advisable.

What is RAC? What is the benefit of RAC over single instance database?

In Real Application Clusters environments, all nodes concurrently execute transactions
against the same database. Real Application Clusters coordinates each node’s access to the
shared data to provide consistency and integrity.

Benefits:

Improve response time

Improve throughput

High availability

Transparency

What is Oracle RAC One Node?

Oracle RAC One Node is a single instance running on one node of the cluster while the 2nd
node is in cold standby mode. If the instance fails for some reason, RAC One Node
detects it and restarts the instance on the same node, or relocates the instance to the 2nd
node in case there is a failure or fault on the 1st node.
The benefit of this feature is that it provides a cold failover solution and automates
instance relocation, so that no manual intervention is needed. Oracle
introduced this feature with the release of 11gR2 (available with Enterprise Edition).

Advantages of RAC (Real Application Clusters)

Reliability – if one node fails, the database won’t fail

Availability – nodes can be added or replaced without having to shutdown the database

Scalability – more nodes can be added to the cluster as the workload increases

What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connections use instead
of the standard public IP address. To configure VIP address, we need to reserve a spare IP
address for each node, and the IP addresses must use the same subnet as the public
network.

Where are the Clusterware files stored on a RAC environment?

The Clusterware is installed on each node (in an Oracle home) and on the shared disks (the
voting disks and the OCR file).

Where are the database software files stored on a RAC environment?

The base software is installed on each node of the cluster and the database storage on the
shared disks.

What kind of storage can we use for the shared Clusterware files?

OCFS (Release 1 or 2)

Raw devices

Third party cluster file system such as GPFS or Veritas


What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the VIP address
receive a rapid connection refused error. They don’t have to wait for TCP connection timeout
messages.

What is voting disk?

The voting disk is a file that sits in the shared storage area and must be accessible by all
nodes in the cluster. All nodes in the cluster register their heartbeat information in the
voting disk to confirm that they are operational. If the heartbeat information of any node is
not available in the voting disk, that node is evicted from the cluster.

The CSS (Cluster Synchronization Service) daemon in the clusterware maintains the
heartbeat of all nodes to the voting disk. When a node is not able to send its heartbeat to
the voting disk, it reboots itself, which helps avoid the split-brain syndrome.
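
The eviction rule described above can be sketched as a simple timeout check. This is a
minimal illustration only: the function name, the node names, the timestamps, and the
30-second timeout are all hypothetical and are not taken from Oracle's CSS settings.

```python
def nodes_to_evict(last_heartbeat, now, timeout_seconds):
    """Return the nodes whose latest voting-disk heartbeat is older than the timeout."""
    return sorted(
        node for node, stamp in last_heartbeat.items()
        if now - stamp > timeout_seconds
    )


# Hypothetical heartbeat timestamps, in seconds, as recorded in the voting disk.
last_heartbeat = {"node1": 100.0, "node2": 99.5, "node3": 60.0}

# node3 last wrote 41 seconds ago, which exceeds the assumed 30-second timeout.
print(nodes_to_evict(last_heartbeat, now=101.0, timeout_seconds=30))
```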

For high availability, Oracle recommends that you have an odd number of voting disks, with
a minimum of three.

The voting disk is a file on shared storage that manages cluster membership. It reassigns
cluster ownership between the nodes in case of failure.

The voting disk files are used by Oracle Clusterware to determine which nodes are
currently members of the cluster. The voting disk files are also used in concert with other
cluster components such as CRS to maintain the cluster's integrity.

Oracle Database 11g Release 2 provides the ability to store the voting disks in ASM along
with the OCR. Oracle Clusterware can access the OCR and the voting disks present in ASM
even if the ASM instance is down. As a result CSS can continue to maintain the Oracle
cluster even if the ASM instance has failed.

What kind of storage can we use for the RAC database storage?

OCFS (Release 1 or 2)

ASM
raw devices

third party cluster file system such as GPFS or Veritas

What is a CFS?

A cluster File System (CFS) is a file system that may be accessed (read and write) by all
members in a cluster at the same time. This implies that all members of a cluster have the
same view.

What is an OCFS2?

The OCFS2 is the Oracle (version 2) Cluster File System which can be used for the Oracle
Real Application Cluster.

Which files can be placed on an Oracle Cluster File System?

Oracle Software installation (Windows only)

Oracle files (controlfiles, datafiles, redologs, files described by the bfile datatype)

Shared configuration files (spfile)

OCR and voting disk

Files created by Oracle during runtime

Do you know another Cluster Vendor?

HP Tru64 Unix, Veritas, Microsoft

How is it possible to install RAC if we don’t have a CFS?

This is possible by using a raw device.


What is a raw device?

A raw device is a disk drive that does not yet have a file system set up. Raw devices are
used for Real Application Clusters since they enable the sharing of disks.

Why do we need to keep an odd number of voting disks?

Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You
should always configure an odd number of voting disks >= 3. This is because a node must be
able to access more than half of the voting disks; losing half or more of them causes the
entire cluster to fail, so an even number of disks tolerates no more failures than the odd
number below it.
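
The arithmetic behind the odd-number recommendation can be checked with a short sketch. It
assumes the strict-majority rule (more than half of the voting disks must remain
accessible); the function names are hypothetical.

```python
def cluster_survives(voting_disks, accessible):
    """True when strictly more than half of the voting disks are still accessible."""
    return accessible > voting_disks / 2


def tolerated_disk_failures(voting_disks):
    """Largest number of voting disks that can be lost while a majority remains."""
    return (voting_disks - 1) // 2


for n in (1, 2, 3, 4, 5):
    print(n, "voting disks tolerate", tolerated_disk_failures(n), "failure(s)")
```

Under this rule three and four disks both tolerate a single failure, so the fourth disk adds
cost without adding tolerance.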

What is a raw partition?

A raw partition is a portion of a physical disk that is accessed at the lowest possible level. A
raw partition is created when an extended partition is created and logical partitions are
assigned to it without any formatting. Once formatting is complete, it is called cooked
partition.

When to use CFS over raw?

A CFS offers:

– Simpler management

– Use of Oracle Managed Files with RAC

– Single Oracle Software installation

– Autoextend enabled on Oracle datafiles

– Uniform accessibility to archive logs in case of physical node failure

– With Oracle_Home on CFS, when you apply Oracle patches CFS guarantees that the
updated Oracle_Home is visible to all nodes in the cluster.

What is CRS?

Oracle RAC 10g Release 1 introduced Oracle Cluster Ready Services (CRS), a
platform-independent set of system services for cluster environments. In Release 2, Oracle
renamed this product to Oracle Clusterware.

What is the VIP used for?

It returns a dead connection immediately when its primary node fails. Without the VIP,
clients have to wait around 10 minutes to receive ORA-3113: “end of file on
communications channel”. However, using Transparent Application Failover (TAF) could
avoid ORA-3113.

Why do we need to configure SSH or RSH on the RAC nodes?

SSH (Secure Shell, 10g+) or RSH (Remote Shell, 9i+) allows the “oracle” UNIX account to
connect to another RAC node and copy files or run commands there as the local “oracle”
UNIX account.

Is SSH or RSH needed for normal RAC operations?

No. SSH or RSH is needed only for RAC installation, patch set installation and clustered
database creation.

Do we have to have Oracle RDBMS on all nodes?

Each node of a cluster that is being used for a clustered database will typically have the
RDBMS and RAC software loaded on it, but not actual data files (these need to be available
via shared disk).

What are the restrictions on the SID with a RAC database? Is it limited to 5 characters?

The SID prefix in 10g Release 1 and prior versions was restricted to five characters by
install/ config tools so that an ORACLE_SID of up to max of 5+3=8 characters can be
supported in a RAC environment. The SID prefix is relaxed up to 8 characters in 10g
Release 2, see bug 4024251 for more information.
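
The 5+3=8 arithmetic can be sketched as a small validation helper. This is illustrative
only and is not how the Oracle install tools are implemented: the function name and
parameters are hypothetical, and the limits simply restate the 10g Release 1 figures above
(a prefix of at most 5 characters plus an instance number of at most 3 digits).

```python
def build_oracle_sid(sid_prefix, instance_number, max_prefix_len=5, max_number_digits=3):
    """Build an ORACLE_SID from a prefix and an instance number, enforcing length limits."""
    number = str(instance_number)
    if len(sid_prefix) > max_prefix_len:
        raise ValueError(f"SID prefix {sid_prefix!r} exceeds {max_prefix_len} characters")
    if len(number) > max_number_digits:
        raise ValueError(f"instance number {number} exceeds {max_number_digits} digits")
    return sid_prefix + number


print(build_oracle_sid("orcl", 3))            # orcl3
print(len(build_oracle_sid("abcde", 100)))    # 5 + 3 = 8 characters

# Passing max_prefix_len=8 models the relaxed prefix limit of 10g Release 2.
print(build_oracle_sid("finance", 1, max_prefix_len=8))
```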

Does Real Application Clusters support heterogeneous platforms?

Real Application Clusters does not support heterogeneous platforms in the same cluster.

Are there any issues for the interconnect when sharing the same switch as the public
network by using VLAN to separate the network?

RAC and Clusterware deployment best practices suggest that the interconnect (private
connection) be deployed on a stand-alone, physically separate, dedicated switch. On a large
shared network the connections could be unstable.

What is the Load Balancing Advisory?

To assist in the balancing of application workload across designated resources, Oracle
Database 10g Release 2 provides the Load Balancing Advisory. This Advisory monitors the
current workload activity across the cluster, and for each instance where a service is
active it provides a percentage value of how much of the total workload should be sent to
that instance, as well as a service quality flag.
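
How a consumer might act on those percentages can be sketched as a proportional split. This
is a hypothetical illustration, not the algorithm used by the listener or by connection
pools: the function name, the instance names, and the percentages are invented for the
example.

```python
def distribute_connections(advisory_percent, total_connections):
    """Split connections across instances in proportion to the advisory percentages."""
    total = sum(advisory_percent.values())
    shares = {
        inst: total_connections * pct // total
        for inst, pct in advisory_percent.items()
    }
    # Integer division can leave a few connections unassigned; give them to the
    # instances with the highest percentages, one each.
    remainder = total_connections - sum(shares.values())
    ranked = sorted(advisory_percent, key=advisory_percent.get, reverse=True)
    for inst in ranked[:remainder]:
        shares[inst] += 1
    return shares


# Hypothetical advisory: orcl1 should take half of the workload.
advisory = {"orcl1": 50, "orcl2": 30, "orcl3": 20}
print(distribute_connections(advisory, 10))
```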

What is the Cluster Verification Utility (cluvfy)?

The Cluster Verification Utility (CVU) is a validation tool that you can use to check all the
important components that need to be verified at different stages of deployment in a RAC
environment.

Is it possible to use ASM for the OCR and voting disk?

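Before Oracle Database 11g Release 2, no: the OCR and voting disk must be on raw devices or
a CFS (cluster file system). From 11g Release 2 onward, both can be stored in ASM.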

What the OCR file is used for?


OCR is a file that manages the cluster and RAC configuration.

What is the Voting Disk file used for?

The voting disk is a file that contains and manages node membership information for all
nodes in the cluster.

What is the recommended method to make backups of a RAC environment?

Use RMAN to back up the database, dd to back up the voting disk, and hard copies of
the OCR file.

What command would you use to check the availability of the RAC system?

crs_stat -t -v (the -t and -v options are optional). From 11g Release 2 onward, crs_stat is
deprecated in favor of crsctl stat res -t.

What is SCAN?

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster. The benefit is clients using SCAN do not need to change if you add or
remove nodes in the cluster.

What is cache fusion?

In a RAC environment, it is the combining of data blocks, which are shipped across the
interconnect from remote database caches (SGA) to the local node, in order to fulfill the
requirements for a transaction (DML, Query of Data Dictionary).

What is split brain?


When database nodes in a cluster are unable to communicate with each other, they may
continue to process and modify the data blocks independently. If the same block is modified
by more than one instance, synchronization/locking of the data blocks does not take place
and blocks may be overwritten by others in the cluster. This state is called split brain.

What is the difference between Crash recovery and Instance recovery?

When an instance crashes in a single-node database, crash recovery takes place on start-up.
In a RAC environment the same recovery for a failed instance is performed by one of the
surviving instances and is called instance recovery.

What is the interconnect used for?

It is a private network which is used to ship data blocks from one instance to another for
cache fusion. The physical data blocks as well as data dictionary blocks are shared across
this interconnect.

How do you determine what protocol is being used for Interconnect traffic?

One of the ways is to look at the database alert log for the time period when the database
was started up.

What methods are available to keep the time synchronized on all nodes in the cluster?

Either the Network Time Protocol (NTP) can be configured or, in 11gR2, the Cluster Time
Synchronization Service (CTSS) can be used.

What file components in RAC must reside on shared storage?

SPFILEs, control files, data files and redo log files should be created on shared storage.

Where does the clusterware write when there is a network or storage missed heartbeat?

The network ping failure is written under $CRS_HOME/log.

How do you find out what OCR backups are available?

Run ocrconfig -showbackup to list the automatic and manual backups.

If your OCR is corrupted what options do you have to resolve this?

You can use either the logical or the physical OCR backup copy to restore the Repository.

How do you find out which object has its blocks shipped across instances the most?

You can query the DBA_HIST_SEG_STAT view.

What is a VIP in RAC used for?

The VIP is an alternate Virtual IP address assigned to each node in a cluster. During a node
failure the VIP of the failed node moves to the surviving node and relays to the application
that the node has gone down. Without VIP, the application will wait for TCP timeout and
then find out that the session is no longer live due to the failure.

How do we know which database instances are part of a RAC cluster?

You can query the V$ACTIVE_INSTANCES view to determine the member instances of the
RAC cluster.

What is OCLUMON used for in a cluster environment?

OCLUMON is the command-line tool used to query the Cluster Health Monitor (CHM). CHM stores
operating system metrics in the CHM repository for all nodes in a RAC cluster. It stores
information on CPU, memory, process, network and other OS data. This information can later
be retrieved and used to troubleshoot and identify any cluster-related issues.

It is a default component of the 11gR2 grid install. The data is stored in the master
repository and replicated to a standby repository on a different node.

What would be the possible performance impact in a cluster if a less powerful node (e.g.
slower CPUs) is added to the cluster?

All processing will slow down to the CPU speed of the slowest server.

What are some of the RAC specific parameters?

Some of the RAC parameters are:

CLUSTER_DATABASE

CLUSTER_DATABASE_INSTANCES

INSTANCE_TYPE (RDBMS or ASM)

ACTIVE_INSTANCE_COUNT

UNDO_MANAGEMENT

What is the future of the Oracle Grid?

The Grid software is becoming more and more capable of not just supporting HA for Oracle
Databases but also other applications including Oracle’s applications. With 12c there are
more features and functionality built-in and it is easier to deploy these pre-built solutions,
available for common Oracle applications.

What components of the Grid should I back up?

The backups should include OLR, OCR and ASM Metadata.


Is there an easy way to verify the inventory for all remote nodes?

You can run the opatch lsinventory -all_nodes command from a single node to look at the
inventory details for all nodes in the cluster.

What are Oracle RAC software components?

Oracle RAC is composed of two or more database instances. Each is composed of memory
structures and background processes, the same as a single-instance database. Oracle RAC
instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that
enable cache fusion.

Oracle RAC instances are composed of following background processes:

ACMS—Atomic Controlfile to Memory Service (ACMS)

GTX0-j—Global Transaction Process

LMON—Global Enqueue Service Monitor

LMD—Global Enqueue Service Daemon

LMS—Global Cache Service Process

LCK0—Instance Enqueue Process

RMSn—Oracle RAC Management Processes (RMSn)

RSMN—Remote Slave Monitor

What are Oracle database background processes specific to RAC?

LMS—Global Cache Service Process

LMD—Global Enqueue Service Daemon

LMON—Global Enqueue Service Monitor

LCK0—Instance Enqueue Process


Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global
Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file
and each cached block using a Global Resource Directory (GRD). The GRD contents are
distributed across all of the active instances.

What is Cache Fusion?

The transfer of data across instances through the private interconnect is called cache fusion.
Oracle RAC is composed of two or more instances. When a block of data is read from a datafile
by an instance within the cluster and another instance needs the same block, it is faster to
get the block image from the instance that has the block in its SGA than to read it
from disk. To enable inter-instance communication Oracle RAC makes use of
interconnects. The Global Enqueue Service (GES) monitor and the instance enqueue process
manage cache fusion.

What is SCAN? (11gR2 feature)

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster. The benefit is clients using SCAN do not need to change if you add or
remove nodes in the cluster.

What are SCAN components in a cluster?

SCAN Name

SCAN IPs (3)

SCAN Listeners (3)

What is FAN?

Fast Application Notification (FAN) relates to events concerning instances, services and
nodes. It is a notification mechanism that Oracle RAC uses to notify other processes about
configuration and service-level information, including service status changes such as UP or
DOWN events. Applications can respond to FAN events and take immediate action.

How to find location of OCR file when CRS is down?

You may need to find the location of the OCR (Oracle Cluster Registry) while CRS is down.

When the CRS is down:

Look into “ocr.loc” file, location of this file changes depending on the OS:

On Linux: /etc/oracle/ocr.loc

On Solaris: /var/opt/oracle/ocr.loc

When CRS is UP:

Set ASM environment or CRS environment then run the below command:

ocrcheck
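
When CRS is down, the ocr.loc file can also be read programmatically. The Python sketch below parses its key=value lines; the sample content (a +DATA disk group and local_only=FALSE) is illustrative only, and on a real system you would read the file from the OS-specific path listed above.

```python
def parse_ocr_loc(text: str) -> dict:
    """Parse the key=value lines of an ocr.loc file into a dictionary."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Illustrative content only; real values depend on your installation.
sample = "ocrconfig_loc=+DATA\nlocal_only=FALSE\n"
print(parse_ocr_loc(sample)["ocrconfig_loc"])
```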

In a 2-node RAC, how many NICs are used?

2 network cards on each clusterware node

Network Card 1 (with IP address set 1) for public network

Network Card 2 (with IP address set 2) for private network (for inter node communication
between rac nodes used by clusterware and rac database)

What is difference between RAC ip addresses ?

The public IP address is the normal IP address typically used by DBAs and SAs to manage
storage, system and database. It sits on the public network that clients can reach.

The private IP address is used only for internal cluster processing (Cache Fusion), also
known as the interconnect. Private IP addresses are reserved for private networks.

The VIP is used by database applications to enable failover when one cluster node fails. The
purpose of the VIP is to let client connections fail over to surviving nodes in case of a
failure.

Can an application developer access the private IP?

No. The private IP address is used only for internal cluster processing (Cache Fusion).

What are Oracle Clusterware Components?

Voting Disk: Oracle RAC uses the voting disk to manage cluster membership by way of a
health check and arbitrates cluster ownership among the instances in case of network
failures. The voting disk must reside on shared disk.

Oracle Cluster Registry (OCR): Maintains cluster configuration information as well as
configuration information about any cluster database within the cluster. The OCR must
reside on shared disk that is accessible by all of the nodes in your cluster.

What is the purpose of Private Interconnect ?

Clusterware uses the private interconnect for cluster synchronization (network heartbeat)
and daemon communication between the clustered nodes. This communication is based
on the TCP protocol.

RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP).
Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches
of participating nodes in the cluster.

Why do we have a Virtual IP (VIP) in Oracle RAC?

Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP
timeout period (which can be up to 10 min) before getting an error. As a result, you
don't really have a good HA solution without using VIPs.

When a node fails, the VIP associated with it is automatically failed over to some other node
and the new node re-ARPs the world, indicating a new MAC address for the IP. Subsequent
packets sent to the VIP go to the new node, which will send error RST packets back to the
clients. This results in the clients getting errors immediately.

What is dynamic remastering? When does dynamic remastering happen?

Dynamic remastering is the ability to move the ownership of a resource from one instance to
another instance in RAC.

Dynamic resource remastering is used to implement resource affinity for increased
performance.

Resource affinity optimizes the system in situations where update transactions are being
executed on one instance.

When activity shifts to another instance, the resource affinity correspondingly moves to
that instance.

If activity is not localized, then resource ownership is hashed across the instances.

What is RAC and how is it different from non RAC databases?

RAC stands for Real Application Clusters.

You have n instances, each running on its own separate node and all sharing the same
storage.

The cluster is the key component; it is a collection of servers operating as one unit.

RAC is the best solution for high performance and high availability.

A non-RAC database has a single point of failure in case of hardware failure or server crash.

What is GRD?

GRD stands for Global Resource Directory.

The GES and GCS maintain records of the statuses of each datafile and each cached block
using the Global Resource Directory. This process is referred to as cache fusion and helps
maintain data integrity.

What are the major RAC wait events?


In a RAC environment the buffer cache is global across all instances in the cluster and hence
the processing differs. The most common wait events related to this are gc cr request and gc
buffer busy.

GC CR request: the time it takes to retrieve the data from the remote cache.

Reason: RAC traffic using a slow connection, or inefficient queries. Poorly tuned queries
increase the number of data blocks requested by an Oracle session, and the more blocks
requested, the more often a block will need to be read from a remote instance via the
interconnect.

GC BUFFER BUSY: the time the remote instance locally spends accessing the requested
data block.

What is the use of cluster interconnect?

Cluster interconnect is used by the Cache fusion for inter instance communication.

What is the use of a service in Oracle RAC environment?

Applications should use the services feature to connect to the Oracle database. Services
enable us to define rules and characteristics to control how users and applications connect
to database instances.

How do we verify that RAC instances are running?

Issue the following query from any one node, connecting through SQL*Plus:

SQL> connect sys/sys as sysdba

SQL> select * from V$ACTIVE_INSTANCES;

The query returns the instance number in the INST_NUMBER column and host:instance_name
in the INST_NAME column.

How does Oracle Clusterware manage CRS resources?

Oracle clusterware manages CRS resources based on the configuration information of CRS
resources stored in OCR(Oracle Cluster Registry).

How do we remove ASM from a Oracle RAC environment?

We need to stop and delete the instance on the node first, in interactive or silent mode.
After that, ASM can be removed using the srvctl tool as follows:

srvctl stop asm -n node_name

srvctl remove asm -n node_name

We can verify if ASM has been removed by issuing the following command:

srvctl config asm -n node_name

What are the types of connection load-balancing?

There are two types of connection load balancing: server-side load balancing and client-side
load balancing.

What is the difference between server-side and client-side connection load balancing?

Client-side load balancing happens at the client, which picks one of the listener addresses
in its connect descriptor at random. With server-side load balancing, the listener uses the
load balancing advisory to redirect connections to the instance providing the best service.

What is the Oracle Recommendation for backing up voting disk?

Oracle recommends us to use the dd command to backup the voting disk with a minimum
block size of 4KB.

How do you restore a voting disk?


To restore the backup of your voting disk, issue the dd command on Linux and UNIX systems
or the ocopy command on Windows systems.

On Linux or UNIX systems:

dd if=backup_file_name of=voting_disk_name

On Windows systems, use the ocopy command:

ocopy backup_file_name voting_disk_name

where,

backup_file_name is the name of the voting disk backup file

voting_disk_name is the name of the active voting disk
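
The backup and restore commands differ only in the direction of the copy. The Python sketch below builds both dd argument lists; the helper names and the example paths are hypothetical, and bs=4k follows the 4KB minimum block size recommended earlier. It only constructs the arguments and does not run dd.

```python
from typing import List


def voting_disk_backup_cmd(voting_disk: str, backup_file: str) -> List[str]:
    """Build the dd arguments that copy the voting disk to a backup file."""
    return ["dd", f"if={voting_disk}", f"of={backup_file}", "bs=4k"]


def voting_disk_restore_cmd(backup_file: str, voting_disk: str) -> List[str]:
    """Build the dd arguments that copy a backup file back onto the voting disk."""
    return ["dd", f"if={backup_file}", f"of={voting_disk}", "bs=4k"]


# Hypothetical paths, printed rather than executed.
print(" ".join(voting_disk_backup_cmd("/dev/raw/raw1", "/backup/votedisk.bak")))
print(" ".join(voting_disk_restore_cmd("/backup/votedisk.bak", "/dev/raw/raw1")))
```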

1. What is RAC?
RAC stands for Real Application Clusters.
It is a clustering solution from Oracle Corporation that ensures high availability of databases
by providing instance failover and media failover features.
Oracle RAC is a cluster database with a shared cache architecture that overcomes the
limitations of traditional shared-nothing and shared-disk approaches to provide a highly
scalable and available database solution for all the business applications.
Oracle RAC provides the foundation for enterprise grid computing.

2. What is Oracle RAC One Node?


Oracle RAC One Node is a single instance running on one node of the cluster while the 2nd
node is in cold standby mode. If the instance fails for some reason, RAC One Node
detects it and restarts the instance on the same node, or relocates the instance to the 2nd
node in case there is a failure or fault on the 1st node. The benefit of this feature is that
it provides a cold failover solution and automates the instance relocation without any
downtime and without manual intervention. Oracle introduced this feature with the release of
11gR2 (available with Enterprise Edition).

3. What is RAC and how is it different from non RAC databases?


Oracle Real Application clusters allows multiple instances to access a single database; the
instances will be running on multiple nodes.
In Real Application Clusters environments, all nodes concurrently execute transactions
against the same database.
Real Application Clusters coordinates each node's access to the shared data to provide
consistency and integrity.

4. What are the advantages of RAC (Real Application Clusters)?


Reliability - if one node fails, the database won't fail
Availability - nodes can be added or replaced without having to shutdown the database
Scalability - more nodes can be added to the cluster as the workload increases

6. What is Cache Fusion?


Oracle RAC is composed of two or more instances. When a block of data is read from a
datafile by an instance within the cluster and another instance needs the same block, it is
faster to get the block image from the instance that has the block in its SGA than to
read it from disk. To enable inter-instance communication Oracle RAC makes use of
interconnects. The Global Enqueue Service (GES) monitor and the instance enqueue process
manage cache fusion.

7. What command would you use to check the availability of the RAC system?
crs_stat -t -v (-t -v are optional)

8. How do we verify that RAC instances are running?


SQL>select * from V$ACTIVE_INSTANCES;
The query returns the instance number in the INST_NUMBER column and host:instance_name
in the INST_NAME column.

9. How can you connect to a specific node in a RAC environment?


In tnsnames.ora, ensure that the connect descriptor specifies INSTANCE_NAME for the
instance you want to reach.
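
A minimal Python sketch of what such an entry looks like is shown below. The alias, host, service and instance names are hypothetical placeholders; the function simply renders the text of a tnsnames.ora entry with INSTANCE_NAME inside CONNECT_DATA.

```python
def tns_entry(alias: str, host: str, service_name: str,
              instance_name: str, port: int = 1521) -> str:
    """Render a tnsnames.ora entry that pins connections to one instance."""
    return (
        f"{alias} =\n"
        f"  (DESCRIPTION =\n"
        f"    (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))\n"
        f"    (CONNECT_DATA =\n"
        f"      (SERVICE_NAME = {service_name})\n"
        f"      (INSTANCE_NAME = {instance_name})\n"
        f"    )\n"
        f"  )\n"
    )


# Hypothetical names: connect to instance orcl1 through node1's VIP.
print(tns_entry("ORCL1", "node1-vip", "orcl", "orcl1"))
```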

10. Which is the "MASTER NODE" in RAC?


The node with the lowest node number becomes the master node, and dynamic remastering
of the resources takes place.
To find the master node for a particular resource, query v$ges_resource for the
MASTER_NODE column.
To find out which node is the master node, search the ocssd.log file for "master node
number".
When the first master node fails, the surviving node with the lowest node number becomes
the master node.

11. What components in RAC must reside in shared storage?


All data files, control files, SPFILEs, and redo log files must reside on cluster-aware shared
storage.

12. Give a few examples of solutions that support cluster storage.
·ASM (Automatic Storage Management),
·Raw disk devices,
·Network file system (NFS),
·OCFS2 and
·OCFS (Oracle Cluster File System).

13. What are Oracle Cluster Components?


1. Cluster Interconnect (HAIP)
2. Shared Storage (OCR/Voting Disk)
3. Clusterware software
4. Oracle Kernel Components

14. What are Oracle RAC Components?


VIP, Node apps etc.

15. What are Oracle Kernel Components?


The Oracle kernel must be relinked with the RAC option switched on when you convert to RAC.
That is the difference: it enables the RAC-specific background processes such as LMON, LCK,
LMD and LMS.

16. How to turn on RAC?


# link the oracle libraries
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk rac_on
# rebuild oracle
$ cd $ORACLE_HOME/bin
$ relink oracle

17. Disk architecture in RAC?


SAN (Storage Area Networks) - generally using fibre to connect to the SAN
NAS (Network Attached Storage) - generally using a network to connect to the NAS using
either NFS or iSCSI

18. What is Oracle Clusterware?


The Clusterware software allows nodes to communicate with each other and forms the
cluster that makes the nodes work as a single logical server.
The software is run by the Cluster Ready Services (CRS) using the Oracle Cluster Registry
(OCR), which records and maintains the cluster and node membership information, and the
voting disk, which acts as a tiebreaker during communication failures. Consistent heartbeat
information travels across the interconnect to the voting disk when the cluster is running.

19. Real Application Clusters?


Oracle RAC is a cluster database with a shared cache architecture that overcomes the
limitations of traditional shared-nothing and shared-disk approaches to provide a highly
scalable and available database solution for all your business applications. Oracle RAC
provides the foundation for enterprise grid computing.

Oracle’s Real Application Clusters (RAC) option supports the transparent deployment of a
single database across a cluster of servers, providing fault tolerance from hardware failures
or planned outages. Oracle RAC running on clusters provides Oracle’s highest level of
capability in terms of availability, scalability, and low-cost computing.

One database is opened by multiple instances, so the database remains highly available if
an instance crashes.
Cluster software: Oracle Clusterware or products like Veritas Volume Manager are
required to provide the cluster support and allow each node to know which nodes belong to
the cluster and are available. Oracle Clusterware also knows which nodes have
failed and ejects them from the cluster, so that errors on that node can be cleared.

Oracle Clusterware has two key components: the Oracle Cluster Registry (OCR) and the
voting disk.

The cluster registry holds all information about nodes, instances, services and ASM
storage if used. It also contains state information, i.e. whether they are available and up.

The voting disk is used to determine if a node has failed, i.e. become separated from the
majority. If a node is deemed to no longer belong to the majority then it is forcibly rebooted
and will, after the reboot, add itself again to the surviving cluster nodes.

20. What are the Oracle Clusterware key components?


Oracle Clusterware has two key components: the Oracle Cluster Registry (OCR) and the
voting disk.

21. What is Voting Disk and OCR?


Voting Disk
Oracle RAC uses the voting disk to manage cluster membership by way of a health check
and arbitrates cluster ownership among the instances in case of network failures. The voting
disk must reside on shared disk.
A node must be able to access more than half of the voting disks at any time.
For example, if you have 3 voting disks configured, then a node must be able to access at
least two of the voting disks at any time. If a node cannot access the minimum required
number of voting disks it is evicted, or removed, from the cluster.
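
The "more than half" rule can be written as a one-line calculation. The Python sketch below uses hypothetical function names and only models the arithmetic described above, not the Clusterware implementation.

```python
def min_voting_disks_required(configured: int) -> int:
    """Smallest number of voting disks that is more than half of those configured."""
    if configured < 1:
        raise ValueError("at least one voting disk must be configured")
    return configured // 2 + 1


def node_survives(accessible: int, configured: int) -> bool:
    """A node stays in the cluster only if it can access a strict majority."""
    return accessible >= min_voting_disks_required(configured)


# With 3 voting disks a node needs 2; with 5 it needs 3.
for n in (1, 3, 5):
    print(n, min_voting_disks_required(n))
```

This is also why an odd number of voting disks is normally configured: going from 3 to 4 disks raises the required count from 2 to 3 without tolerating any additional disk failures.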

Oracle Cluster Registry (OCR)


The cluster registry holds all information about nodes, instances, services and ASM storage
if used. It also contains state information, i.e. whether they are available and up.
The OCR must reside on shared disk that is accessible by all of the nodes in your cluster.

22. What are the administrative tasks involved with voting disk?
Following administrative tasks are performed with the voting disk :
1) Backing up voting disks
2) Recovering Voting disks
3) Adding voting disks
4) Deleting voting disks
5) Moving voting disks

23. Can you add voting disk online? Do you need voting disk backup?
Yes. As per the documentation, if you have multiple voting disks you can add one online. If
you have only one voting disk and it is lost, the cluster will be down; you then need to
start CRS in exclusive mode and add the voting disk using
crsctl add css votedisk <path>

24. What is the Oracle Recommendation for backing up voting disk?


Oracle recommends us to use the dd command to backup the voting disk with a minimum
block size of 4KB.

25. How do we backup voting disks?


1) Oracle recommends that you back up your voting disk after the initial cluster creation
and after we complete any node addition or deletion procedures.
2) First, as root user, stop Oracle Clusterware (with the crsctl stop crs command) on all
nodes. Then, determine the current voting disk by issuing the following command:
crsctl query css votedisk
3) Then, issue the dd or ocopy command to back up a voting disk, as appropriate.
The syntax for backing up voting disks is:
On Linux or UNIX systems:
dd if=voting_disk_name of=backup_file_name
where,
voting_disk_name is the name of the active voting disk
backup_file_name is the name of the file to which we want to back up the voting disk
contents
On Windows systems, use the ocopy command:
ocopy voting_disk_name backup_file_name

26. How do we verify an existing current backup of OCR?


We can verify the current backup of OCR using the following command: ocrconfig -showbackup.

27. You have lost OCR disk, what is your next step?
The cluster stack will be down because cssd is unable to maintain integrity; this is true
in 10g. From 11gR2 onwards, the crsd stack will be down while ohasd remains up and
running. You can add the OCR back by restoring the automatic backup or importing the
manual backup.

28. What are the major RAC wait events?


In a RAC environment the buffer cache is global across all instances in the cluster and hence
the processing differs. The most common wait events related to this are gc cr request and
gc buffer busy

GC CR request: the time it takes to retrieve the data from the remote cache
Reason: RAC Traffic Using Slow Connection or Inefficient queries (poorly tuned queries will
increase the amount of data blocks requested by an Oracle session. The more blocks
requested typically means the more often a block will need to be read from a remote
instance via the interconnect.)

GC BUFFER BUSY: It is the time the remote instance locally spends accessing the
requested data block.

30. How does OCSSD start first if the voting disk and OCR reside in ASM disk groups?

You might wonder how CSSD, which is required to start the clustered ASM instance, can be
started if voting disks are stored in ASM.

Without access to the voting disks there is no CSS, hence the node cannot join the cluster.
But without being part of the cluster, CSSD cannot start the ASM instance.
To solve this problem the ASM disk headers have new metadata in 11.2:
you can use kfed to read the header of an ASM disk containing a voting disk.
The kfdhdb.vfstart and kfdhdb.vfend fields tell CSS where to find the voting file. This does
not require the ASM instance to be up.
Once the voting disks are located, CSS can access them and joins the cluster.

32. What is gsdctl in RAC? List gsdctl commands in Oracle RAC?

GSDCTL stands for Global Service Daemon Control, we can use gsdctl commands to start,
stop, and obtain the status of the GSD service on any platform.

The options for gsdctl are:-


$ gsdctl start -- To start the GSD service
$ gsdctl stop -- To stop the GSD service
$ gsdctl stat -- To obtain the status of the GSD service

Log file location for gsdctl:


$ORACLE_HOME/srvm/log/gsdaemon_node_name.log

33. How do you troubleshoot node reboot?


Please check Metalink:
Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle
Clusterware Node evictions.

34. Srvctl cannot start the instance and I get the errors PRKP-1001 and CRS-0215, but
SQL*Plus can start it on both nodes. How do you identify the problem?
Set the environment variable SRVM_TRACE to true and start the instance with srvctl.
You will now get a detailed error stack.

35. What are Oracle Clusterware processes for 10g on UNIX and Linux?
Cluster Synchronization Services (ocssd): Manages cluster node membership and
runs as the oracle user; failure of this process results in cluster restart.

Cluster Ready Services (crsd): The crs process manages cluster resources (which could
be a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application
process, and so on) based on the resource's configuration information that is stored in the
OCR. This includes start, stop, monitor and failover operations. This process runs as the root
user.

Event manager daemon (evmd): A background process that publishes events that crs
creates.

Process Monitor Daemon (OPROCD): This process monitors the cluster and provides I/O
fencing. OPROCD performs its check, stops running, and if the wake up is beyond the
expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure
results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on
Linux platforms.

RACG (racgmain, racgimon): Extends Clusterware to support Oracle-specific
requirements and complex resources. Runs server callout scripts when FAN events occur.

36. What are Oracle database background processes specific to RAC?


Oracle RAC is composed of two or more database instances. They are composed of Memory
structures and background processes same as the single instance database. Oracle RAC
instances use two processes GES (Global Enqueue Service), GCS (Global Cache Service)
that enable cache fusion. Oracle RAC instances are composed of following background
processes:
ACMS—Atomic Control-file to Memory Service (ACMS)
GTX0-j—Global Transaction Process
LMON—Global Enqueue Service Monitor
LMD—Global Enqueue Service Daemon
LMS—Global Cache Service Process
LCK0—Instance Enqueue Process
RMSn—Oracle RAC Management Processes (RMSn)
RSMN—Remote Slave Monitor
To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy
a query or transaction, Oracle RAC instances use two processes, the Global Cache Service
(GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the
statuses of each data file and each cached block using a Global Resource Directory (GRD).
The GRD contents are distributed across all of the active instances.

37. What is GRD?


GRD stands for Global Resource Directory. The GES and GCS maintain records of the
statuses of each data-file and each cached block using global resource directory. This
process is referred to as cache fusion and helps in data integrity.

38. What is ACMS?


ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment ACMS
is an agent that ensures a distributed SGA memory update, i.e. SGA updates are globally
committed on success or globally aborted in the event of a failure.

39. What is SCAN listener?


A SCAN listener is an additional listener, alongside the node listeners, that listens for
incoming database connection requests from clients that arrive through the SCAN IP. It has
endpoints configured to the node listeners, and it routes each connection request to a
particular node listener.

SCAN IP can be disabled if not required. However, SCAN IP is mandatory during the RAC
installation. Enabling or disabling SCAN IP is mostly used in Oracle Apps environments by the
concurrent manager (a kind of job scheduler in Oracle Apps).

Steps to disable the SCAN IP,


i. Do not use SCAN IP at the client end.
ii. Stop scan listener
srvctl stop scan_listener
iii.Stop scan
srvctl stop scan (this will stop the scan vip's)
iv. Disable scan and disable scan listener
srvctl disable scan
srvctl disable scan_listener

40. What are the different network components in 10g RAC?
Public, private, and VIP components.
The private interface is for inter-node communication.
The VIP is all about availability of the application. When a node fails, the VIP component
fails over to some other node. This is why all applications should be based on VIP
components, meaning TNS entries should have the VIP entry in the host list.

41. What is an interconnect network?


An interconnect network is a private network that connects all of the servers in a cluster.
The interconnect network uses a switch/multiple switches that only the nodes in the cluster
can access.

42. What is the use of cluster interconnecting?


Cluster interconnect is used by the Cache fusion for inter instance communication.

43. How can we configure the cluster interconnect?


· Configure User Datagram Protocol (UDP) on Gigabit Ethernet for cluster interconnects.
· On UNIX and Linux systems, Oracle Clusterware uses the UDP and RDS (Reliable Datagram
Sockets) protocols.
· Windows clusters use the TCP protocol.

44. What is the purpose of Private Interconnect?


Cluster-ware uses the private interconnect for cluster synchronization (network heartbeat)
and daemon communication between the clustered nodes. This communication is based on
the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP).
Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches
of participating nodes in the cluster.

45. What is a virtual IP address or VIP?


A virtual IP address or VIP is an alternate IP address that the client connections use instead
of the standard public IP address. To configure VIP address, we need to reserve a spare IP
address for each node, and the IP addresses must use the same subnet as the public
network.

46. What is the use of VIP?


If a node fails, then the node's VIP address fails over to another node. On that node the VIP
address can accept TCP connections, but it cannot accept Oracle connections.

47. Why do we have a Virtual IP (VIP) in Oracle RAC?


Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP
timeout period (which can be up to 10 min) before getting an error. As a result, you don't
really have a good HA solution without using VIPs.
When a node fails, the VIP associated with it is automatically failed over to some other node
and new node re-arps the world indicating a new MAC address for the IP. Subsequent
packets sent to the VIP go to the new node, which will send error RST packets back to the
clients. This results in the clients getting errors immediately.

48. Give situations under which VIP address failover happens?


VIP address failover happens when the node on which the VIP address runs fails, when all
interfaces for the VIP address fail, or when all interfaces for the VIP address are
disconnected from the network.

49. What is the significance of VIP address failover?


When a VIP address failover happens, clients that attempt to connect to the VIP address
receive a rapid connection refused error. They don't have to wait for TCP connection timeout
messages.

50. What is the use of a service in Oracle RAC environment?


Applications should use the services feature to connect to the Oracle database. Services
enable us to define rules and characteristics to control how users and applications connect
to database instances.

51. What are the characteristics controlled by Oracle services feature?


The characteristics include a unique name, workload balancing, failover options, and high
availability.

52. What enables the load balancing of applications in RAC?


Oracle Net Services enable the load balancing of application connections across all of the
instances in an Oracle RAC database.

53. What are the types of connection load-balancing?


Connection workload management is one of the key aspects of running RAC instances, because
you want to distribute connections to specific nodes/instances or to those that have less load.
There are two types of connection load balancing:
1. Client-side load balancing (also called connect-time load balancing)
2. Server-side load balancing (also called listener connection load balancing)

54. What is the difference between server-side and client-side connection load
balancing?
Client-side load balancing happens on the client, which spreads its connection requests
across the listed listeners. In server-side load balancing, the listener uses the
load-balancing advisory to redirect connections to the instance providing the best service.

Client-side load balancing: Oracle's client-side load balancing feature enables clients to
randomize connection requests among all the available listeners.

A TNS entry that contains all node entries and sets LOAD_BALANCE=on uses connect-time
load balancing, that is, client-side load balancing. Set the parameter explicitly, since it
is on by default only for a DESCRIPTION_LIST.

Sample Client Side TNS Entry:-

finance =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = myrac2-vip)(PORT = 2042))
    (ADDRESS = (PROTOCOL = TCP)(HOST = myrac1-vip)(PORT = 2042))
    (ADDRESS = (PROTOCOL = TCP)(HOST = myrac3-vip)(PORT = 2042))
    (LOAD_BALANCE = yes)
    (FAILOVER = on)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = FINANCE)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))
    )
  )
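
The behaviour that this entry asks of the client can be sketched in a few lines of Python. This is only an illustration of the idea (random choice among the listed addresses, then moving on to the next address if one is unreachable); the function name connect and the try_connect callback are hypothetical and are not part of Oracle Net.

```python
import random

# Host names taken from the sample TNS entry; everything else is illustrative.
ADDRESSES = ["myrac2-vip", "myrac1-vip", "myrac3-vip"]

def connect(addresses, try_connect, rng=random):
    """Try addresses in random order until one accepts the connection."""
    order = list(addresses)
    rng.shuffle(order)              # connect-time load balancing: random order
    for host in order:
        if try_connect(host):       # connect-time failover: try the next address
            return host
    raise ConnectionError("no listener reachable")

if __name__ == "__main__":
    down = {"myrac1-vip"}           # pretend this node is unavailable
    print(connect(ADDRESSES, lambda host: host not in down))
```

Because the order is random, repeated connections spread roughly evenly over the listeners without the client knowing anything about their load.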

Server-side load balancing: This improves connection performance by balancing the
number of active connections among multiple instances and dispatchers. In a single
instance environment (shared servers), the listener selects the least loaded dispatcher to
handle incoming client requests. In RAC environments, PMON registers the load of every
instance and its dispatchers with the listeners, and the listener uses that load information
to redirect the connection to the least loaded node.

In a RAC environment, the *.remote_listener parameter, which names a TNS entry containing
all node addresses, needs to be set so that PMON can register load information with the
listeners on every node.

Sample listener parameters set in an instance of the RAC cluster:

local_listener=LISTENER_MYRAC1
remote_listener = LISTENERS_MYRACDB
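
As a rough model of the server-side decision, the sketch below picks the instance with the lowest registered load. The instance names, the load figures and the function pick_instance are assumptions made for illustration; the real listener works from the load statistics that the instances register with it.

```python
def pick_instance(load_by_instance):
    """Return the instance with the lowest registered load (ties: name order)."""
    if not load_by_instance:
        raise ValueError("no instance has registered with the listener")
    return min(sorted(load_by_instance), key=load_by_instance.get)

# Hypothetical load figures, as they might be registered with the listener.
registered = {"MYRAC1": 42, "MYRAC2": 17, "MYRAC3": 30}

if __name__ == "__main__":
    print(pick_instance(registered))
```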

55. What are the administrative tools used for Oracle RAC environments?
Oracle RAC cluster can be administered as a single image using the below
· OEM (Enterprise Manager),
· SQL*PLUS,
· Server control (SRVCTL),
· Cluster Verification Utility (CLUVFY),
· DBCA,
· NETCA

56. Name some Oracle Cluster-ware tools and their uses?


·OIFCFG - Allocates and de-allocates network interfaces.
·OCRCONFIG - Command-line tool for managing the Oracle Cluster Registry.
·OCRDUMP - Dumps the contents of the OCR, for example to identify the interconnect being used.
·CVU - Cluster Verification Utility, used to verify the state of cluster components.

57. What is the difference between CRSCTL and SRVCTL?


crsctl manages cluster-ware-related operations:
Starting and stopping Oracle Cluster-ware
Enabling and disabling Oracle Cluster-ware daemons
Registering cluster resources

srvctl manages Oracle resource–related operations:


Starting and stopping database instances and services
From 11gR2, it also manages cluster resources such as network, VIP and disks

58. How do we remove ASM from a Oracle RAC environment?


We need to stop and delete the instance on the node first, in interactive or silent mode.
After that, ASM can be removed using the srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name

59. How do we verify that an instance has been removed from OCR after deleting
an instance?
Issue the following srvctl command:
srvctl config database -d database_name
cd CRS_HOME/bin
./crs_stat

60. What are the modes of deleting instances from Oracle Real Application cluster
Databases?
We can delete instances using silent mode or interactive mode using DBCA (Database
Configuration Assistant).

61. What are the background processes that exist in 11gR2, and what are their functions?
Process Name Functionality
crsd •The CRS daemon (crsd) manages cluster resources based on configuration
information that is stored in Oracle Cluster Registry (OCR) for each resource. This includes
start, stop, monitor, and failover operations. The crsd process generates events when the
status of a resource changes.
cssd •Cluster Synchronization Service (CSS): Manages the cluster configuration by
controlling which nodes are members of the cluster and by notifying members when a node
joins or leaves the cluster. If you are using certified third-party clusterware, then CSS
processes interface with your clusterware to manage node membership information. CSS
has three separate processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and
the CSS Monitor (cssdmonitor). The cssdagent process monitors the cluster and provides
input/output fencing. This service formerly was provided by Oracle Process Monitor daemon
(oprocd), also known as OraFenceService on Windows. A cssdagent failure results in
Oracle Clusterware restarting the node.
diskmon •Disk Monitor daemon (diskmon): Monitors and performs input/output fencing
for Oracle Exadata Storage Server. As Exadata storage can be added to any Oracle RAC
node at any point in time, the diskmon daemon is always started when ocssd is started.
evmd •Event Manager (EVM): Is a background process that publishes Oracle Clusterware
events
mdnsd •Multicast domain name service (mDNS): Allows DNS requests. The mDNS
process is a background process on Linux and UNIX, and a service on Windows.
gnsd •Oracle Grid Naming Service (GNS): Is a gateway between the cluster mDNS and
external DNS servers. The GNS process performs name resolution within the cluster.
ons •Oracle Notification Service (ONS): Is a publish-and-subscribe service for
communicating Fast Application Notification (FAN) events
oraagent •oraagent: Extends cluster-ware to support Oracle-specific requirements and
complex resources. It runs server callout scripts when FAN events occur. This process was
known as RACG in Oracle Clusterware 11g Release 1 (11.1).
orarootagent •Oracle root agent (orarootagent): Is a specialized oraagent process that
helps CRSD manage resources owned by root, such as the network, and the Grid virtual IP
address
oclskd •Cluster kill daemon (oclskd): Handles instance/node eviction requests that have
been escalated to CSS
gipcd •Grid IPC daemon (gipcd): Is a helper daemon for the communications
infrastructure
ctssd •Cluster time synchronization daemon (ctssd): Manages time synchronization
between nodes, rather than depending on NTP

62. Under which user or owner does each process start?


Component                                    Name of the Process            Owner
Oracle High Availability Service             ohasd                          init, root
Cluster Ready Service (CRS)                  crsd                           root
Cluster Synchronization Service (CSS)        ocssd, cssdmonitor, cssdagent  grid owner
Event Manager (EVM)                          evmd, evmlogger                grid owner
Cluster Time Synchronization Service (CTSS)  octssd                         root
Oracle Notification Service (ONS)            ons, eons                      grid owner
Oracle Agent                                 oraagent                       grid owner
Oracle Root Agent                            orarootagent                   root
Grid Naming Service (GNS)                    gnsd                           root
Grid Plug and Play (GPnP)                    gpnpd                          grid owner
Multicast domain name service (mDNS)         mdnsd                          grid owner

63. What is the major difference between 10g and 11g RAC?
There is not much difference between 10g and 11gR1 RAC, but there is a significant
difference in 11gR2.

Prior to 11gR2 (that is, in 10g and 11gR1), the following were managed by Oracle CRS:


Databases
Instances
Applications
Node Monitoring
Event Services
High Availability

From 11gR2 onwards, it is a complete HA stack that manages and provides the following
resources, like other cluster software such as VCS:
Databases
Instances
Applications
Cluster Management
Node Management
Event Services
High Availability
Network Management (provides DNS/GNS/mDNS services on behalf of other traditional
services), SCAN (Single Client Access Name), and HAIP
Storage Management (with the help of ASM and the new ACFS file system)
Time synchronization (rather than depending on traditional NTP)
The OS-dependent hang checker is removed; Clusterware uses its own additional monitor processes

64. What is hang check timer?


The hangcheck-timer module regularly checks the health of the system. If the system hangs
or stops, the node is restarted automatically.
There are 2 key parameters for this module:
-> hangcheck_tick: this parameter defines the period of time between checks of system
health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.
-> hangcheck_margin: this defines the maximum hang delay that should be tolerated
before hangcheck-timer resets the RAC node.
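
How the two parameters interact can be shown with a simplified model. This is a sketch of the decision only, not the kernel module's code; the function should_reset and the margin value of 180 seconds are assumptions chosen for illustration.

```python
def should_reset(expected_wakeup, actual_wakeup, hangcheck_margin):
    """True when the timer woke up later than the tolerated margin."""
    delay = actual_wakeup - expected_wakeup
    return delay > hangcheck_margin

HANGCHECK_TICK = 30      # seconds between checks (the recommended value)
HANGCHECK_MARGIN = 180   # assumed value, for illustration only

last_check = 1000.0
expected = last_check + HANGCHECK_TICK

if __name__ == "__main__":
    print(should_reset(expected, 1031.0, HANGCHECK_MARGIN))   # woke 1 s late
    print(should_reset(expected, 1300.0, HANGCHECK_MARGIN))   # woke 270 s late
```

In this model a node is reset only when the gap since the last check exceeds the tick plus the margin, so a smaller tick detects a hang sooner while the margin controls how much delay is forgiven.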

65. State the initialization parameters that must have same value for every
instance in an Oracle RAC database?
Some initialization parameters are critical at database creation time and must have the
same values on every instance. Their values must be specified in the SPFILE or PFILE for
every instance. The parameters that must be identical on every instance are listed below:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORDFILE
UNDO_MANAGEMENT
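
A check of this rule could look like the following sketch. It assumes the parameter values have already been collected per instance into plain dictionaries (for example from each instance's parameter file); the function mismatched, the instance names and the sample values are hypothetical.

```python
# A subset of the parameters listed above, for brevity.
MUST_MATCH = ["COMPATIBLE", "DB_BLOCK_SIZE", "DB_NAME", "UNDO_MANAGEMENT"]

def mismatched(params_by_instance, names=MUST_MATCH):
    """Return {parameter: {instance: value}} for parameters whose values differ."""
    problems = {}
    for name in names:
        values = {inst: p.get(name) for inst, p in params_by_instance.items()}
        if len(set(values.values())) > 1:
            problems[name] = values
    return problems

# Hypothetical values for a two-instance database.
sample = {
    "rac1": {"COMPATIBLE": "11.2.0", "DB_BLOCK_SIZE": "8192", "DB_NAME": "fin", "UNDO_MANAGEMENT": "AUTO"},
    "rac2": {"COMPATIBLE": "11.2.0", "DB_BLOCK_SIZE": "16384", "DB_NAME": "fin", "UNDO_MANAGEMENT": "AUTO"},
}

if __name__ == "__main__":
    print(mismatched(sample))
```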

----------------------------------------------------------------------------------------

66. What is RAC? What is the benefit of RAC over single instance database?
In Real Application Clusters environments, all nodes concurrently execute transactions
against the same database. Real Application Clusters coordinates each node's access to the
shared data to provide consistency and integrity.
Benefits:
Improve response time
Improve throughput
High availability
Transparency

Advantages of RAC (Real Application Clusters)

Reliability - if one node fails, the database won't fail


Availability - nodes can be added or replaced without having to shutdown the database
Scalability - more nodes can be added to the cluster as the workload increases

67. What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connections use instead
of the standard public IP address. To configure VIP address, we need to reserve a spare IP
address for each node, and the IP addresses must use the same subnet as the public
network.

68. What is the use of VIP?


If a node fails, then the node's VIP address fails over to another node. On that node the VIP
address can accept TCP connections, but it cannot accept Oracle connections.
Situations under which VIP address failover happens:
VIP address failover happens when the node on which the VIP address runs fails, when all
interfaces for the VIP address fail, or when all interfaces for the VIP address are
disconnected from the network.
Using virtual IPs avoids the TCP/IP timeout problem, because Oracle Notification Service
maintains communication between the nodes and listeners.

69. What is the significance of VIP address failover?


When a VIP address failover happens, clients that attempt to connect to the VIP address
receive a rapid connection refused error. They don't have to wait for TCP connection timeout
messages.

70. What is voting disk?


The voting disk is a file that sits in the shared storage area and must be accessible by all
nodes in the cluster. All nodes in the cluster register their heartbeat information in the
voting disk to confirm that they are all operational. If the heartbeat information of any
node is not available in the voting disk, that node is evicted from the cluster. The CSS
(Cluster Synchronization Service) daemon in the clusterware maintains the heartbeat of all
nodes to the voting disk. When a node is not able to send its heartbeat to the voting disk,
it reboots itself, which helps avoid the split-brain syndrome.

For high availability, Oracle recommends that you have an odd number of voting disks,
with a minimum of three.

The voting disk is a file that resides on shared storage and manages cluster membership.
The voting disk reassigns cluster ownership between the nodes in case of failure.

The voting disk files are used by Oracle Clusterware to determine which nodes are
currently members of the cluster. The voting disk files are also used in concert with other
cluster components such as CRS to maintain the cluster's integrity.

Oracle Database 11g Release 2 provides the ability to store the voting disks in ASM along
with the OCR. Oracle Cluster-ware can access the OCR and the voting disks present in ASM
even if the ASM instance is down. As a result CSS can continue to maintain the Oracle
cluster even if the ASM instance has failed.

71. How many voting disks are you maintaining?

By default Oracle will create 3 voting disk files in ASM.

Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You
should always configure an odd number of voting disks >= 3. This is because loss of more
than half your voting disks will cause the entire cluster to fail.

You should plan on allocating 280MB for each voting disk file. For example, if you are using
ASM and external redundancy then you will need to allocate 280MB of disk for the voting
disk. If you are using ASM and normal redundancy you will need 560MB.

72. Why we need to keep odd number of voting disks?


Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You
should always configure an odd number of voting disks >= 3. This is because loss of more
than half your voting disks will cause the entire cluster to fail.
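
The arithmetic behind the odd-number advice can be worked through in a short sketch. It assumes the usual majority rule, namely that a strict majority (more than half) of the voting disks must remain accessible; the function name is illustrative.

```python
def tolerated_failures(voting_disks):
    """Voting disks that can be lost while a strict majority stays accessible."""
    majority = voting_disks // 2 + 1
    return voting_disks - majority

if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, "voting disks tolerate", tolerated_failures(n), "failure(s)")
```

Under this rule, 3 disks tolerate one failure and 4 disks still tolerate only one, so an even count adds storage without adding resilience.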

73. What are Oracle RAC software components?


Oracle RAC is composed of two or more database instances. Each has the same memory
structures and background processes as a single-instance database. Oracle RAC
instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service),
that enable Cache Fusion. Oracle RAC instances also run the following background
processes:
ACMS—Atomic Controlfile to Memory Service (ACMS)
GTX0-j—Global Transaction Process
LMON—Global Enqueue Service Monitor
LMD—Global Enqueue Service Daemon
LMS—Global Cache Service Process
LCK0—Instance Enqueue Process
RMSn—Oracle RAC Management Processes (RMSn)
RSMN—Remote Slave Monitor

74. What are Oracle Cluster-ware processes for 10g?


Cluster Synchronization Services (ocssd): Manages cluster node membership and runs as
the oracle user; failure of this process results in a node restart.
Cluster Ready Services (crsd): The crs process manages cluster resources (which could be
a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application
process, and so on) based on the resource's configuration information that is stored in the
OCR. This includes start, stop, monitor and failover operations. This process runs as the
root user.
Event manager daemon (evmd): A background process that publishes events that crs
creates.
Process Monitor Daemon (OPROCD): This process monitors the cluster and provides I/O
fencing. OPROCD performs its check, stops running, and if the wake up is beyond the
expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure
results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on
Linux platforms.
RACG (racgmain, racgimon): Extends clusterware to support Oracle-specific requirements
and complex resources. Runs server callout scripts when FAN events occur.

75. What are Oracle database background processes specific to RAC?


LMS—Global Cache Service Process
LMD—Global Enqueue Service Daemon
LMON—Global Enqueue Service Monitor
LCK0—Instance Enqueue Process
Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global
Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file
and each cached block using a Global Resource Directory (GRD). The GRD contents are
distributed across all of the active instances.

76. What is Cache Fusion?


The transfer of data blocks between instances over the private interconnect is called
Cache Fusion. Oracle RAC is composed of two or more instances. When a block of data has
been read from a data file by one instance in the cluster and another instance needs the
same block, it is faster to get the block image from the instance that has the block in its
SGA than to read it from disk. To enable inter-instance communication, Oracle RAC makes
use of the interconnect. The Global Cache Service (GCS) and the Global Enqueue Service
(GES) coordinate Cache Fusion.

77. What is SCAN? (11gR2 feature)


Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster. The benefit is that clients using SCAN do not need to change if you
add or remove nodes in the cluster.

SCAN provides a single domain name (via DNS), allowing end users to address a RAC
cluster as if it were a single IP address. SCAN works by replacing a hostname or IP list with
virtual IP addresses (VIPs).

SCAN is meant to provide a single name for all Oracle clients to connect to the cluster
database, irrespective of the number of nodes and node location. Before SCAN, we had to
keep changing the address records in every client's tnsnames.ora whenever a node was added
to or deleted from the cluster.
SCAN eliminates the need to change the TNSNAMES entry when nodes are added to or removed
from the cluster. RAC instances register with the SCAN listeners as remote listeners. Oracle
recommends assigning 3 addresses to SCAN, which creates 3 SCAN listeners, even if the
cluster has dozens of nodes. SCAN is a domain name registered to at least one and up to
three IP addresses, either in DNS (Domain Name Service) or GNS (Grid Naming Service). The
SCAN must resolve to at least one address on the public network. For high availability and
scalability, Oracle recommends configuring the SCAN to resolve to three addresses.
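
The effect of one name resolving to three addresses can be imitated with a toy resolver. This is only a model of round-robin name resolution, not a DNS client; the class RoundRobinName and the three IP addresses are made up for illustration.

```python
from collections import deque

class RoundRobinName:
    """Toy stand-in for a name whose address list rotates on every lookup."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        answer = list(self._addresses)
        self._addresses.rotate(-1)   # the next lookup starts at the next address
        return answer

# Hypothetical SCAN addresses; the client only ever stores the SCAN name.
scan = RoundRobinName(["192.168.10.51", "192.168.10.52", "192.168.10.53"])
first_choices = [scan.resolve()[0] for _ in range(4)]

if __name__ == "__main__":
    print(first_choices)
```

Successive lookups start with a different address each time, which is why clients spread over the SCAN listeners while their own configuration stays unchanged when nodes come and go.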

78. What are SCAN components in a cluster?


1. SCAN name
2. SCAN IPs (3)
3. SCAN listeners (3)

79. What is FAN?


Fast Application Notification (FAN) relates to events concerning instances, services and
nodes. It is a notification mechanism that Oracle RAC uses to notify other processes about
configuration and service-level information, including service status changes such as UP
or DOWN events. Applications can respond to FAN events and take immediate action.

80. What is TAF?


TAF (Transparent Application Failover) is a configuration that allows session failover
between different nodes of a RAC database cluster.
With TAF, if a communication link failure occurs after a
connection is established, the connection fails over to another active node. Any disrupted
transactions are rolled back, and session properties and server-side program variables are
lost. In some cases, if the statement executing at the time of the failover is a Select
statement, that statement may be automatically re-executed on the new connection with
the cursor positioned on the row on which it was positioned prior to the failover.

After an Oracle RAC node crashes—usually from a hardware failure—all new application
transactions are automatically rerouted to a specified backup node. The challenge in
rerouting is to not lose transactions that were "in flight" at the exact moment of the crash.
One of the requirements of continuous availability is the ability to restart in-flight application
transactions, allowing a failed node to resume processing on another server without
interruption. Oracle's answer to application failover is a new Oracle Net mechanism dubbed
Transparent Application Failover. TAF allows the DBA to configure the type and method of
failover for each Oracle Net client.
TAF architecture offers the ability to restart transactions at either the transaction (SELECT)
or session level.
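
The SELECT-level behaviour described above (re-execute the statement, then position the cursor where the client left off) can be pictured with a small sketch. It is a conceptual model only; the function name, the run_query callback and the sample rows are hypothetical and do not represent the Oracle Net implementation.

```python
def fetch_with_select_failover(run_query, rows_already_fetched):
    """Re-execute the query on a new connection and skip rows the client already has."""
    rows = run_query()                    # re-executed on the surviving node
    return rows[rows_already_fetched:]    # resume after the last fetched row

# Hypothetical result set; the session had fetched 2 rows before the node failed.
result_set = ["row1", "row2", "row3", "row4"]
remaining = fetch_with_select_failover(lambda: list(result_set), 2)

if __name__ == "__main__":
    print(remaining)
```

With session-level failover there is no such repositioning: the session is re-created and the application has to reissue its statement.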

81. What are the requirements for Oracle Cluster-ware?


1. External shared disk to store the Oracle Clusterware files (voting disk and Oracle
Cluster Registry - OCR)
2. Two network cards on each clusterware node (and three sets of IP addresses):
Network Card 1 (with IP address set 1) for the public network
Network Card 2 (with IP address set 2) for the private network (for inter-node communication
between RAC nodes, used by the clusterware and the RAC database)
IP address set 3 for the Virtual IP (VIP) (used as the virtual IP address for client
connections and for connection failover)
3. Storage options for the OCR and voting disk - RAW, OCFS2 (Oracle Cluster File System), NFS,

What enables the load balancing of applications in RAC?
Oracle Net Services enable the load balancing of application connections across all of the
instances in an Oracle RAC database.

82. How to find location of OCR file when CRS is down?


The way to find the location of the OCR (Oracle Cluster Registry) depends on whether CRS
is running.
When CRS is down:
Look in the “ocr.loc” file; the location of this file depends on the OS:
On Linux: /etc/oracle/ocr.loc
On Solaris: /var/opt/oracle/ocr.loc

When CRS is up:


Set the ASM or CRS environment, then run the command below:
ocrcheck
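
Reading the file by script is straightforward if it is treated as simple key=value lines. The sketch below works on that assumption; the sample content and the key name ocrconfig_loc are illustrative and may not match the file on a given system.

```python
def read_ocr_loc(text):
    """Parse key=value lines into a dict, ignoring blank lines and comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.strip()] = value.strip()
    return entries

# Assumed sample content; the real file on a given system may differ.
sample = "ocrconfig_loc=+DATA\nlocal_only=FALSE\n"

if __name__ == "__main__":
    print(read_ocr_loc(sample).get("ocrconfig_loc"))
```

On a real node the text would come from the path given above for the platform, for example by opening /etc/oracle/ocr.loc on Linux.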

83. In a 2-node RAC, how many NICs are used?


2 network cards on each clusterware node
Network Card 1 (with IP address set 1) for public network
Network Card 2 (with IP address set 2) for private network (for inter node communication
between rac nodes used by clusterware and rac database)

84. In a 2-node RAC, how many IPs are used?


6 - 3 sets of IP addresses
## eth1-Public: 2
## eth0-Private: 2
## VIP: 2
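
The count generalizes to any number of nodes, as this small sketch shows. It follows the answer above in counting one public, one private and one virtual address per node; SCAN addresses, where used, are not included, and the function name is illustrative.

```python
def ip_addresses_needed(nodes, per_node=("public", "private", "vip")):
    """Count addresses when every node needs one of each listed type."""
    return nodes * len(per_node)

if __name__ == "__main__":
    print(ip_addresses_needed(2))   # two nodes, three address types each
```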

85. How to find IP’s information in RAC?


The IP information is listed in the /etc/hosts file, as shown below:
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
## Public Node names
192.168.10.11 node1-pub.hingu.net node1-pub
192.168.10.22 node2-pub.hingu.net node2-pub
## Private Network (Interconnect)
192.168.0.11 node1-prv node1-prv
192.168.0.22 node2-prv node2-prv
## Private Network (Network Attached Storage)
192.168.1.11 node1-nas node1-nas
192.168.1.22 node2-nas node2-nas
192.168.1.33 nas-server nas-server
## Virtual IPs
192.168.10.111 node1-vip.hingu.net node1-vip
192.168.10.222 node2-vip.hingu.net node2-vip
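
When the hosts file follows a naming convention like the one above, the addresses can be grouped by role with a short script. This sketch assumes the -pub, -prv and -vip suffixes used in the sample; the function hosts_by_suffix is hypothetical and the grouping is only as reliable as the naming convention.

```python
def hosts_by_suffix(hosts_text):
    """Group short host names by their suffix (-pub, -prv, -vip, ...)."""
    groups = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        ip, *names = line.split()
        if not names:
            continue
        short = names[-1]                      # last alias is the short name
        suffix = short.rsplit("-", 1)[-1] if "-" in short else "other"
        groups.setdefault(suffix, []).append((short, ip))
    return groups

sample = """## Public Node names
192.168.10.11 node1-pub.hingu.net node1-pub
192.168.10.22 node2-pub.hingu.net node2-pub
## Virtual IPs
192.168.10.111 node1-vip.hingu.net node1-vip
192.168.10.222 node2-vip.hingu.net node2-vip
"""

if __name__ == "__main__":
    for role, entries in hosts_by_suffix(sample).items():
        print(role, entries)
```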

86. What is difference between RAC ip addresses?


The public IP address is the normal IP address typically used by the DBA and SA to manage
storage, system and database. Public IP addresses are on the network that clients and
administrators can reach.
The private IP address is used only for internal cluster processing (Cache Fusion), also
known as the interconnect. Private IP addresses are reserved for private networks.
The VIP is used by database applications to enable failover when a cluster node fails. The
purpose of the VIP is to let client connections fail over to surviving nodes in case of a
failure.

87. Can an application developer access the private IP?


No. The private IP address is used only for internal cluster processing (Cache Fusion),
also known as the interconnect.

Oracle 11g RAC Interview Questions


1. What is the major difference between 10g and 11g RAC?
Well, there is not much difference between 10g and 11gR1 RAC.
But there is a significant difference in 11gR2.
Prior to 11gR2 (that is, in 10g and 11gR1), the following were managed by Oracle CRS:

 Databases
 Instances
 Applications
 Node Monitoring
 Event Services
 High Availability
From 11gR2 onwards, it is a complete HA stack that manages and provides the
following resources, like other cluster software such as VCS:

 Databases
 Instances
 Applications
 Cluster Management
 Node Management
 Event Services
 High Availability
 Network Management (provides DNS/GNS/mDNS services on behalf of other
traditional services), SCAN (Single Client Access Name), and HAIP
 Storage Management (with the help of ASM and the new ACFS file system)
 Time synchronization (rather than depending on traditional NTP)
 The OS-dependent hang checker is removed; Clusterware uses its own additional
monitor processes
2. What are Oracle Cluster Components?
Cluster Interconnect (HAIP)
Shared Storage (OCR/Voting Disk)
Cluster-ware software
3. What are Oracle RAC Components?
VIP, Node apps etc.
4. What are Oracle Kernel Components (nothing but how does Oracle RAC
database differs than Normal single instance database in terms of Binaries and
process)?
Basically, the Oracle kernel needs to be relinked with the RAC option switched on when you
convert to RAC. That is the difference, as it enables the RAC background processes such as
LMON, LCK, LMD and LMS.
To turn on RAC
# link the oracle libraries
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk rac_on
# rebuild oracle
$ cd $ORACLE_HOME/bin
$ relink oracle
Oracle RAC is composed of two or more database instances. Each has the same memory
structures and background processes as a single-instance database. Oracle RAC
instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service),
that enable Cache Fusion. Oracle RAC instances also run the following background processes:
ACMS—Atomic Controlfile to Memory Service (ACMS)
GTX0-j—Global Transaction Process
LMON—Global Enqueue Service Monitor
LMD—Global Enqueue Service Daemon
LMS—Global Cache Service Process
LCK0—Instance Enqueue Process
RMSn—Oracle RAC Management Processes (RMSn)
RSMN—Remote Slave Monitor
5. What is Clusterware?
Software that provides various interfaces and services for a cluster. Typically, this includes
capabilities that:

 Allow the cluster to be managed as a whole


 Protect the integrity of the cluster
 Maintain a registry of resources across the cluster
 Deal with changes to the cluster
 Provide a common view of resources
6. What are the background processes that exist in 11gR2, and what are their functions?
Process Name Functionality
crsd •The CRS daemon (crsd) manages cluster resources based on configuration
information that is stored in Oracle Cluster Registry (OCR) for each resource. This includes
start, stop, monitor, and failover operations. The crsd process generates events when the
status of a resource changes.
cssd •Cluster Synchronization Service (CSS): Manages the cluster configuration by
controlling which nodes are members of the cluster and by notifying members when a node
joins or leaves the cluster. If you are using certified third-party clusterware, then CSS
processes interface with your clusterware to manage node membership information. CSS
has three separate processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and
the CSS Monitor (cssdmonitor). The cssdagent process monitors the cluster and provides
input/output fencing. This service formerly was provided by Oracle Process Monitor daemon
(oprocd), also known as OraFenceService on Windows. A cssdagent failure results in
Oracle Clusterware restarting the node.
diskmon •Disk Monitor daemon (diskmon): Monitors and performs input/output fencing
for Oracle Exadata Storage Server. As Exadata storage can be added to any Oracle RAC
node at any point in time, the diskmon daemon is always started when ocssd is started.
evmd •Event Manager (EVM): Is a background process that publishes Oracle Clusterware
events
mdnsd •Multicast domain name service (mDNS): Allows DNS requests. The mDNS
process is a background process on Linux and UNIX, and a service on Windows.
gnsd •Oracle Grid Naming Service (GNS): Is a gateway between the cluster mDNS and
external DNS servers. The GNS process performs name resolution within the cluster.
ons •Oracle Notification Service (ONS): Is a publish-and-subscribe service for
communicating Fast Application Notification (FAN) events
oraagent •oraagent: Extends clusterware to support Oracle-specific requirements and
complex resources. It runs server callout scripts when FAN events occur. This process was
known as RACG in Oracle Clusterware 11g Release 1 (11.1).
orarootagent •Oracle root agent (orarootagent): Is a specialized oraagent process that
helps CRSD manage resources owned by root, such as the network, and the Grid virtual IP
address
oclskd •Cluster kill daemon (oclskd): Handles instance/node eviction requests that have
been escalated to CSS
gipcd •Grid IPC daemon (gipcd): Is a helper daemon for the communications
infrastructure
ctssd •Cluster time synchronization daemon (ctssd): Manages time synchronization
between nodes, rather than depending on NTP
7. Under which user or owner does each process start?
Component                                    Name of the Process            Owner
Oracle High Availability Service             ohasd                          init, root
Cluster Ready Service (CRS)                  crsd                           root
Cluster Synchronization Service (CSS)        ocssd, cssdmonitor, cssdagent  grid owner
Event Manager (EVM)                          evmd, evmlogger                grid owner
Cluster Time Synchronization Service (CTSS)  octssd                         root
Oracle Notification Service (ONS)            ons, eons                      grid owner
Oracle Agent                                 oraagent                       grid owner
Oracle Root Agent                            orarootagent                   root
Grid Naming Service (GNS)                    gnsd                           root
Grid Plug and Play (GPnP)                    gpnpd                          grid owner
Multicast domain name service (mDNS)         mdnsd                          grid owner
8. What is the startup sequence in Oracle 11g RAC?
9. You said the voting disk and OCR reside in ASM disk groups, but per the startup
sequence OCSSD starts before ASM. How is that possible?
How does OCSSD start if the voting disk and OCR reside in ASM disk groups?
You might wonder how CSSD, which is required to start the clustered ASM instance, can be
started if voting disks are stored in ASM. This sounds like a chicken-and-egg problem:
without access to the voting disks there is no CSS, hence the node cannot join the cluster.
But without being part of the cluster, CSSD cannot start the ASM instance. To solve this
problem the ASM disk headers have new metadata in 11.2: you can use kfed to read the
header of an ASM disk containing a voting disk. The kfdhdb.vfstart and kfdhdb.vfend fields
tell CSS where to find the voting file. This does not require the ASM instance to be up. Once
the voting disks are located, CSS can access them and join the cluster.
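As a rough illustration of the lookup described above, the sketch below pulls the two header fields out of text shaped like kfed output. The sample text, its numeric values, the function name voting_file_range, and the rule that two zero values mean "no voting file on this disk" are assumptions made for illustration; they are not taken from a real disk.

```python
import re

# Illustrative text shaped like `kfed read <disk>` output; the values are made up.
SAMPLE_KFED_OUTPUT = """\
kfdhdb.dskname:               DATA_0000 ; 0x028: length=9
kfdhdb.vfstart:                      96 ; 0x0ec: 0x00000060
kfdhdb.vfend:                       128 ; 0x0f0: 0x00000080
"""


def voting_file_range(kfed_text):
    """Return (vfstart, vfend) as integers, or None if both are zero or absent."""
    fields = {}
    for line in kfed_text.splitlines():
        match = re.match(r"\s*kfdhdb\.(vfstart|vfend):\s+(\d+)\s*;", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    start = fields.get("vfstart", 0)
    end = fields.get("vfend", 0)
    if start == 0 and end == 0:
        # Assumption: zero in both fields means the disk carries no voting file.
        return None
    return start, end


print(voting_file_range(SAMPLE_KFED_OUTPUT))
```

The point of the sketch is only that the location can be read from the disk header as plain metadata, without any ASM instance being involved.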
10. How does SCAN work?
1. The client connects using the SCAN name of the cluster (all three SCAN IP
addresses resolve, round robin, to the same SCAN name; in this example the
SCAN name is cluster01-scan.cluster01.example.com).
2. The request reaches the DNS server in your corporate network, which resolves
the name to one of the three SCAN IP addresses.
a. If GNS (Grid Naming Service) is configured, that is, a subdomain is set up in
DNS to resolve cluster addresses, the request is handed over to GNS (gnsd).
3. In this example assume there is no GNS. The request reaches one of the SCAN
listeners, whose endpoints are configured to the database (node) listeners.
4. The database listener receives the request and processes it further.
5. When a node is added (say a fourth node with listener 4), clients need not
know about it or change anything in their TNS entry (such as adding the address
of the fourth node or instance), because they only use the SCAN name.
6. The same applies when a node is deleted.
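The flow above can be sketched as a toy simulation. Everything here is hypothetical: the IP addresses, the node names, the RoundRobinDns class, and the simple rotation used to pick a node listener (a real SCAN listener chooses based on load, not in turn). The sketch only shows that the client keeps using the same SCAN name when a node is added.

```python
from itertools import cycle

# Hypothetical SCAN name and addresses, used only for this simulation.
SCAN_NAME = "cluster01-scan.cluster01.example.com"
SCAN_IPS = ["192.0.2.101", "192.0.2.102", "192.0.2.103"]


class RoundRobinDns:
    """Toy resolver that hands out the SCAN IPs in rotation."""

    def __init__(self, name, addresses):
        self.name = name
        self._rotation = cycle(addresses)

    def resolve(self, name):
        if name != self.name:
            raise KeyError(name)
        return next(self._rotation)


def connect(dns, node_listeners, request_number):
    """Resolve the SCAN name, then hand the request to a node listener."""
    scan_ip = dns.resolve(SCAN_NAME)
    # Simplification: node listeners are picked in turn, not by load.
    node = node_listeners[request_number % len(node_listeners)]
    return scan_ip, node


dns = RoundRobinDns(SCAN_NAME, SCAN_IPS)
nodes = ["node1", "node2", "node3"]
first = [connect(dns, nodes, i) for i in range(3)]
nodes.append("node4")  # node addition: the client still uses only SCAN_NAME
after_add = connect(dns, nodes, 3)
print(first, after_add)
```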

13. What file types does ASM support and keep in disk groups?
Control files           Flashback logs          Data Pump dump sets
Data files              DB SPFILE               Data Guard configuration
Temporary data files    RMAN backup sets        Change tracking bitmaps
Online redo logs        RMAN data file copies   OCR files
Archive logs            Transport data files    ASM SPFILE
14. List key benefits of ASM?
• Stripes files rather than logical volumes
• Provides redundancy on a file basis
• Enables online disk reconfiguration and dynamic rebalancing
• Reduces the time significantly to resynchronize a transient failure by tracking
changes while disk is offline
• Provides adjustable rebalancing speed
• Is cluster-aware
• Supports reading from mirrored copy instead of primary copy for extended clusters
• Is automatically installed as part of the Grid Infrastructure
15. List key benefits of Oracle Grid Infrastructure?
16. List some of the background processes used in ASM?
Process   Description
RBAL      Opens all device files as part of discovery and coordinates
          the rebalance activity
ARBn      One or more slave processes that do the rebalance activity
GMON      Responsible for managing the disk-level activities such as
          drop or offline and advancing the ASM disk group
          compatibility
MARK      Marks ASM allocation units as stale when needed
Onnn      One or more ASM slave processes forming a pool of
          connections to the ASM instance for exchanging messages
PZ9n      One or more parallel slave processes used in fetching data
          on clustered ASM installation from GV$ views
13. What is the node listener?
In 11gR2 the listeners run from the Grid Infrastructure software home.
• The node listener is a process that helps establish network connections from ASM
clients to the ASM instance.
• Runs by default from the Grid $ORACLE_HOME/bin directory
• Listens on port 1521 by default
• Is the same as a database instance listener
• Is capable of listening for all database instances on the same machine in addition to
the ASM instance
• Can run concurrently with separate database listeners or be replaced by a separate
database listener
• Is named tnslsnr on the Linux platform
15. What is the SCAN listener?
A SCAN listener runs in addition to the node listeners. It listens for incoming database
connection requests that clients send through the SCAN IP. Its endpoints are configured
to the node listeners, and it routes each database connection request to a particular node
listener.
16. What is the difference between CRSCTL and SRVCTL?
crsctl manages clusterware-related operations:
• Starting and stopping Oracle Clusterware
• Enabling and disabling Oracle Clusterware daemons
• Registering cluster resources
srvctl manages Oracle resource-related operations:
• Starting and stopping database instances and services
• From 11gR2, also managing cluster resources such as the network, VIPs, and disks
17. How to control Oracle Clusterware?
To start or stop Oracle Clusterware on a specific node:
# crsctl stop crs
# crsctl start crs
To enable or disable Oracle Clusterware on a specific node:
# crsctl enable crs
# crsctl disable crs
19. How to check the cluster (all nodes) status?
To check the viability of Cluster Synchronization Services (CSS) across nodes:
$ crsctl check cluster
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
20. How to check the cluster (one node) status?
$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
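When the check has to be scripted across many nodes, the output can be scanned for anything that is not reported as online. The following is a minimal sketch that assumes every healthy status line ends with the words "is online", as in the samples above; the function name offline_services is hypothetical.

```python
import re

# Sample output copied from the `crsctl check crs` example above.
SAMPLE_CHECK_OUTPUT = """\
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
"""


def offline_services(check_output):
    """Return the message text of every CRS-nnnn line that does not end in 'is online'."""
    problems = []
    for line in check_output.splitlines():
        match = re.match(r"CRS-\d+:\s+(.*)", line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a CRS message
        message = match.group(1)
        if not message.endswith("is online"):
            problems.append(message)
    return problems


print(offline_services(SAMPLE_CHECK_OUTPUT))
```

An empty list means every reported component is online; any other result is the list of messages to investigate.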
21. How to find the voting disk location?
•To determine the location of the voting disk:
# crsctl query css votedisk
##  STATE    File Universal Id                File Name    Disk group
--  -----    -----------------                ---------    ----------
 1. ONLINE   8c2e45d734c64f8abf9f136990f3daf8 (ASMDISK01) [DATA]
 2. ONLINE   99bc153df3b84fb4bf071d916089fd4a (ASMDISK02) [DATA]
 3. ONLINE   0b090b6b19154fc1bf5913bc70340921 (ASMDISK03) [DATA]
Located 3 voting disk(s).
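If this output needs to be consumed by a monitoring script, it can be parsed row by row. The sketch below assumes the row layout shown above (number, state, 32-character file id, file name in parentheses, disk group in brackets); the regular expression and the function name parse_votedisks are illustrative choices, not part of any Oracle tool.

```python
import re

# Sample rows copied from the `crsctl query css votedisk` example above.
SAMPLE_VOTEDISK_OUTPUT = """\
 1. ONLINE 8c2e45d734c64f8abf9f136990f3daf8 (ASMDISK01) [DATA]
 2. ONLINE 99bc153df3b84fb4bf071d916089fd4a (ASMDISK02) [DATA]
 3. ONLINE 0b090b6b19154fc1bf5913bc70340921 (ASMDISK03) [DATA]
Located 3 voting disk(s).
"""

# number. STATE file-id (file name) [disk group]
ROW = re.compile(r"\s*(\d+)\.\s+(\w+)\s+([0-9a-f]{32})\s+\((.*?)\)\s+\[(.*?)\]")


def parse_votedisks(output):
    """Return one dict per voting disk row; header and summary lines are ignored."""
    disks = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            disks.append({
                "state": match.group(2),
                "id": match.group(3),
                "file": match.group(4),
                "diskgroup": match.group(5),
            })
    return disks


disks = parse_votedisks(SAMPLE_VOTEDISK_OUTPUT)
online = [d["file"] for d in disks if d["state"] == "ONLINE"]
print(len(disks), online)
```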
22. How to find the location of the OCR?
• Read the OCR location file:
$ cat /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE
• Run ocrcheck, which also reports on OCR integrity:
# ocrcheck


24. What are the types of ASM mirroring?
Disk Group Type       Supported Mirroring Levels               Default Mirroring Level
External redundancy   Unprotected (None)                       Unprotected (None)
Normal redundancy     Two-way, Three-way, Unprotected (None)   Two-way
High redundancy       Three-way                                Three-way
25. What is ASM Striping?
ASM can use variable size data extents to support larger files, reduce memory
requirements, and improve performance.
Each data extent resides on an individual disk.
Data extents consist of one or more allocation units.
The data extent size is:
• Equal to AU for the first 20,000 extents (0–19999)
• Equal to 4 × AU for the next 20,000 extents (20000–39999)
• Equal to 16 × AU for extents 40000 and above
ASM stripes files using extents with a coarse method for load balancing or a fine method to
reduce latency.
• Coarse-grained striping is always equal to the effective AU size.
• Fine-grained striping is always equal to 128 KB.
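The variable extent sizing listed above can be written out as a small calculation. This is a sketch that follows only the three thresholds stated in the answer; the function names are hypothetical and sizes are expressed in allocation units (AU) rather than bytes.

```python
def extent_size_in_au(extent_number):
    """Size of a single extent, in AUs, given its zero-based extent number."""
    if extent_number < 0:
        raise ValueError("extent number must not be negative")
    if extent_number < 20000:
        return 1    # extents 0-19999: 1 x AU
    if extent_number < 40000:
        return 4    # extents 20000-39999: 4 x AU
    return 16       # extents 40000 and above: 16 x AU


def file_size_in_au(extent_count):
    """Total AUs covered by the first `extent_count` extents of a file."""
    first = min(extent_count, 20000)
    second = min(max(extent_count - 20000, 0), 20000)
    third = max(extent_count - 40000, 0)
    return first * 1 + second * 4 + third * 16


print(extent_size_in_au(25000), file_size_in_au(40001))
```

For example, a file with 40,001 extents covers 20,000 + 80,000 + 16 = 100,016 AUs, which shows how larger files need far fewer extents than a fixed one-AU extent size would require.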
26. How many ASM disk groups can be created under one ASM instance?
ASM imposes the following limits:
• 63 disk groups in a storage system
• 10,000 ASM disks in a storage system
• Two-terabyte maximum storage for each ASM disk (non-Exadata)
• Four-petabyte maximum storage for each ASM disk (Exadata)
• 40-exabyte maximum storage for each storage system
• 1 million files for each disk group
• ASM file size limits (database limit is 128 TB):
1. External redundancy maximum file size is 140 PB.
2. Normal redundancy maximum file size is 42 PB.
3. High redundancy maximum file size is 15 PB.
27. How to find the cluster network settings?
To determine the list of interfaces available to the cluster:
$ oifcfg iflist -p -n
To determine the public and private interfaces that have been configured:
$ oifcfg getif
eth0 192.0.2.0 global public
eth1 192.168.1.0 global cluster_interconnect
To determine the Virtual IP (VIP) host name, VIP address, VIP subnet mask, and VIP
interface name:
$ srvctl config nodeapps -a
VIP exists.:host01
VIP exists.: /192.0.2.247/192.0.2.247/255.255.255.0/eth0
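The oifcfg getif output shown above has four whitespace-separated columns, so it is easy to turn into structured data. The sketch below assumes exactly that layout (interface, subnet, scope, role); the function names are hypothetical, and treating the role column as a comma-separated list is an assumption made to be tolerant of combined roles.

```python
# Sample output copied from the `oifcfg getif` example above.
SAMPLE_GETIF_OUTPUT = """\
eth0 192.0.2.0 global public
eth1 192.168.1.0 global cluster_interconnect
"""


def parse_getif(output):
    """Return one dict per interface line; lines with a different shape are skipped."""
    interfaces = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue
        name, subnet, scope, role = parts
        interfaces.append({"name": name, "subnet": subnet, "scope": scope, "role": role})
    return interfaces


def interconnect_interfaces(output):
    """Names of the interfaces registered as the cluster interconnect."""
    return [
        entry["name"]
        for entry in parse_getif(output)
        if "cluster_interconnect" in entry["role"].split(",")
    ]


print(interconnect_interfaces(SAMPLE_GETIF_OUTPUT))
```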

28. How to change Public or VIP Address in RAC Cluster?
29. How to change Cluster interconnect in RAC?
On a single node in the cluster, add the new global interface specification:
$ oifcfg setif -global eth2/192.0.2.0:cluster_interconnect
Verify the changes with oifcfg getif and then stop Clusterware on all nodes by running the
following command as root on each node:
# oifcfg getif
# crsctl stop crs
Assign the network address to the new network adapters on all nodes using ifconfig:
# ifconfig eth2 192.0.2.15 netmask 255.255.255.0 broadcast 192.0.2.255
Remove the former adapter/subnet specification and restart Clusterware:
$ oifcfg delif -global eth1/192.168.1.0
# crsctl start crs
30. Managing or Modifying SCAN in Oracle RAC?
To add a SCAN VIP resource:
$ srvctl add scan -n cluster01-scan
To remove Clusterware resources from SCAN VIPs:
$ srvctl remove scan [-f]
To add a SCAN listener resource:
$ srvctl add scan_listener
$ srvctl add scan_listener -p 1521
To remove Clusterware resources from all SCAN listeners:
$ srvctl remove scan_listener [-f]
31. How to check the node connectivity in Oracle Grid Infrastructure?
$ cluvfy comp nodecon -n all -verbose
32. Can I stop all nodes with one command, that is, stop the whole cluster?
In 10g it is not possible, whereas in 11g it is possible:
[root@pic1]# crsctl start cluster -all
[root@pic2]# crsctl stop cluster -all
33. What is OLR? Which of the following statements regarding the Oracle Local
Registry (OLR) is true?
1. Each cluster node has a local registry for node-specific resources.
2. The OLR should be manually created after installing Grid Infrastructure on each node in
the cluster.
3. One of its functions is to facilitate Clusterware startup in situations where ASM
stores the OCR and voting disks.
4. You can check the status of the OLR using ocrcheck.
34. What is the runfixup.sh script in Oracle Clusterware 11g release 2 installations?
With Oracle Clusterware 11g release 2, Oracle Universal Installer (OUI) detects when the
minimum requirements for an installation are not met, and creates shell scripts, called fixup
scripts, to finish incomplete system configuration steps. If OUI detects an incomplete task,
then it generates fixup scripts (runfixup.sh). You can run the fixup script after you click the
Fix and Check Again button.
The fixup script does the following:
• If necessary, sets kernel parameters to values required for successful installation, including:
  - Shared memory parameters.
  - Open file descriptor and UDP send/receive parameters.
• Sets permissions on the Oracle Inventory (central inventory) directory.
• Reconfigures primary and secondary group memberships for the installation owner, if
necessary, for the Oracle Inventory directory and the operating system privileges groups.
• Sets shell limits if necessary to required values.


35. How to stop the whole cluster with a single command?
crsctl stop cluster -all (possible only from 11gR2). Note that from 11gR2 crsctl has
cluster-wide commands, and the scope depends on the options given, for example:
crsctl stop cluster -all (stops the Clusterware stack on all nodes)
crsctl stop cluster -n <nodename> (stops it only on the specified node)
crsctl stop crs (stops the whole stack, including OHAS, on the local node only)
36. CRS is not starting automatically after a node reboot. What do you do to make it
happen?
crsctl enable crs (as root)
To disable automatic startup:
crsctl disable crs (as root)
37. What are server pools in 11gr2?
38. What is policy managed databases in RAC?
39. What is Load balancing & how does it work?
40. Describe high level Steps to convert single instance to RAC?
41. What is the difference between TAF, FAN and FCF? Under what conditions do you
use them?
1) TAF with tnsnames
TAF (Transparent Application Failover) is a feature of Oracle Net Services for OCI8 clients.
It moves a session to a backup connection if the session fails. With Oracle 10g
Release 2, you can define the TAF policy on the service using the dbms_service package. It
only works with OCI clients. It only moves the session and, if the parameter is set, it
fails over the select statement. For insert, update or delete transactions, the application must
be TAF aware and roll back the transaction. You should also enable FCF on your OCI client
when you use TAF, as it makes the failover faster.
Note: TAF will not work with JDBC thin.
2) FAN with tnsnames with AQ notifications set to true
FAN is a feature of Oracle RAC which stands for Fast Application Notification. This allows the
database to notify the client of any change (Node up/down, instance up/down, database
up/down). For integrated clients, inflight transactions are interrupted and an error message
is returned. Inactive connections are terminated.
FCF is the client feature for Oracle Clients that have integrated with FAN to provide fast
failover for connections. Oracle JDBC Implicit Connection Cache, Oracle Data Provider for
.NET (ODP.NET) and Oracle Call Interface are all integrated clients which provide the Fast
Connection Failover feature.
3) FCF, along with FAN when using connection pools
FCF is a feature of Oracle clients that are integrated to receive FAN events and abort inflight
transactions, clean up connections when a down event is received as well as create new
connections when a up event is received. Tomcat or JBOSS can take advantage of FCF if the
Oracle connection pool is used underneath. This can be either UCP (Universal Connection
Pool for JAVA) or ICC (JDBC Implicit Connection Cache). UCP is recommended as ICC will be
deprecated in a future release.
4) ONS, with clusterware either FAN/FCF
ONS is part of the clusterware and is used to propagate messages both between nodes and
to application tiers.
ONS is the foundation for FAN, upon which FCF is built.
RAC uses FAN to publish configuration changes and LBA events. Applications can react to
those published events in two ways:
- by using the ONS API (you need to program it)
- by using FCF (automatic when using the JDBC implicit connection cache on the application
server)
You can also respond to FAN events by using server-side callouts, but these run on the server
side (as their name suggests).
Relationship between FAN/FCF/ONS
ONS -> FAN -> FCF
ONS -> sends/receives messages on local and remote nodes.
FAN -> uses ONS to notify other processes about changes in configuration of service level.
FCF -> uses FAN information, working with connection pools (Java and others).
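The ONS -> FAN -> FCF layering can be pictured with a toy model. None of the classes below correspond to a real Oracle API: ToyOns stands in for the message transport, the dictionaries stand in for FAN events, and ToyPool stands in for an FCF-enabled connection pool. The instance names are made up.

```python
class ToyOns:
    """Stand-in for ONS: delivers each published event to every subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, event):
        for callback in self._subscribers:
            callback(event)


class ToyPool:
    """Stand-in for an FCF-enabled pool; a connection is modelled as an instance name."""

    def __init__(self, connections):
        self.connections = list(connections)

    def on_fan_event(self, event):
        if event["status"] == "down":
            # Clean up every connection that points at the failed instance.
            self.connections = [c for c in self.connections if c != event["instance"]]
        elif event["status"] == "up":
            # Create a new connection to the instance that came back.
            self.connections.append(event["instance"])


ons = ToyOns()
pool = ToyPool(["orcl1", "orcl1", "orcl2"])
ons.subscribe(pool.on_fan_event)

ons.publish({"instance": "orcl1", "status": "down"})  # FAN-style down event
after_down = list(pool.connections)
ons.publish({"instance": "orcl1", "status": "up"})    # FAN-style up event
print(after_down, pool.connections)
```

The pool never polls the database; it only reacts to the events pushed to it, which is the reason FCF can fail over faster than waiting for a network timeout.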
42. Can you add a voting disk online? Do you need a voting disk backup?
Yes. As per the documentation, if you have multiple voting disks you can add one online. If
you have only one voting disk and it is lost, the cluster will be down; you then need to start
CRS in exclusive mode and add the voting disk using
crsctl add votedisk <path>
43. You have lost the OCR disk, what is your next step?
In 10g the whole cluster stack will be down, because cssd is unable to maintain integrity.
From 11gR2 onwards, only the crsd stack will be down, while ohasd stays up and
running. You can add the OCR back by restoring an automatic backup or importing a manual
backup.
44. What happens when ocssd fails, what is node eviction? How node eviction
does happen? For all answer will be same.
45. What is virtual IP and how does it works?
46. Describe some rac wait events you experienced?
Oracle RAC Wait events and this table,
47. Can you modify VIP address after your cluster installation?
Yes,
48. How do you interpret AWR report in RAC instances, what sections in awr report for rac
instances are most important?
1. Viewing Contents in OCR/Voting disks

There are three possible ways to view the OCR contents:
a. OCRDUMP (or)
b. crs_stat -p (or)
c. By using strings.
Voting disk contents are not persistent and are overwritten continuously, so there is
normally no need to view them. If you still need to, use strings.
2. Server pools – Read in my blog
3. Verifying Cluster Interconnect

Cluster interconnects can be verified by:
i. oifcfg getif
ii. From AWR Report.
iii. show parameter cluster_interconnect
iv. srvctl config network
4. Is the SCAN IP required, or can we disable it?

The SCAN IP can be disabled if not required. However, the SCAN IP is mandatory during
RAC installation. Enabling/disabling the SCAN IP is mostly used in Oracle Apps environments
by the concurrent manager (a kind of job scheduler in Oracle Apps).
To disable the SCAN IP:
i. Do not use SCAN IP at the client end.
ii. Stop scan listener
srvctl stop scan_listener
iii. Stop scan
srvctl stop scan (this will stop the SCAN VIPs)
iv. Disable scan and disable scan listener
srvctl disable scan
srvctl disable scan_listener
5. Migrating to new Disk-group scenarios

a. Case 1: Migrating a disk group from one storage to another with the same name
1. Consider the disk group is DATA.
2. Create new disks for DATA pointing towards the new storage (EMC).
a) Partitioning and provisioning are done by the storage team, who give you the
device name or mapper, such as /dev/mapper/asakljdlas
3. Add the new disk to diskgroup DATA
a) alter diskgroup data add disk '/dev/mapper/asakljdlas'
4. Drop the old disks from DATA; rebalancing is then done automatically.
If you want, you can speed up the rebalance with alter system set asm_power_limit = 12
for full throttle.
alter diskgroup data drop disk <name of the disk on hitachi storage>
Note: v$asm_disk maps each device path (path column) to its ASM disk name (name column).
5. Request the SAN team to detach the old storage (HITACHI).

b. Case 2: Migrating a disk group to another with a different diskgroup name.
1) Create the disk group with the new name in the new storage.
2) Create the spfile in the new diskgroup and change the parameters with scope = spfile
for control files etc.
3) Take a control file backup in format +newdiskgroup
4) Shut down the database, then startup nomount
5) Restore the control file from backup (the control file is now restored to the new diskgroup)
6) Take an RMAN backup as copy of the whole database with the new format.
RMAN> backup as copy database format '+newdiskgroup name';
7) RMAN> switch database to copy;
8) Verify in dba_data_files, dba_temp_files and v$log that all files point to the new
diskgroup name.

c. Case 3: Migrating a disk group to new storage with no additional diskgroup given
1) Take an RMAN backup as copy of the whole database with the new format and place it on
the disk.
2) Prepare rename commands from v$log, v$datafile etc. (dynamic queries)
3) Take a backup of the pfile and modify the following to refer to the new diskgroup name
.control_files
.db_create_file_dest
.db_create_online_log_dest_1
.db_create_online_log_dest_2
.db_recovery_file_dest
4) Stop the database
5) Unmount the diskgroup
asmcmd umount ORA_DATA
6) Use the renamedg command (11gR2 only) to rename it to the new
diskgroup name
renamedg phase=both dgname=ORA_DATA newdgname=NEW_DATA
verbose=true
7) Mount the diskgroup
asmcmd mount NEW_DATA
8) Start the database in mount with the new pfile backed up in step 3
9) Run the rename file scripts generated in step 2
10) Add the diskgroup to the cluster (if using RAC)
srvctl modify database -d orcl -p +NEW_FRA/orcl/spfileorcl.ora
srvctl modify database -d orcl -a "NEW_DATA"
srvctl config database -d orcl
srvctl start database -d orcl
11) Delete the old diskgroup from the cluster
crsctl delete resource ora.ORA_DATA.dg
12) Open the database.
7. Database rename in RAC, what could be the checklist for you?

a. Take the outputs of all the services that are running on the databases.
b. Set cluster_database=FALSE
c. Drop all the services associated with the database.
d. Stop the database
e. Startup mount
f. Use nid to change the DB name.
Generic question: if using ASM, the usual location for a datafile would be
'+DATA/datafile/OLDDBNAME/system01.dbf'.
Does nid change this path too, to reflect the new DB name?
Yes it will; using the proper directory structure, it creates links to the original
directory structure: '+DATA/datafile/NEWDBNAME/system01.dbf'.
This still has to be tested, as we don't have a test bed, but thanks to Anji who confirmed
it will.
g. Change the parameters according to the new database name
h. Change the password file.
i. Stop the database.
j. Mount the database
k. Open the database with resetlogs
l. Create spfile from pfile.
m. Add the database to the cluster.
n. Create the services that were dropped prior to the rename.
o. Bounce the database.
8. How to find the database to which a particular service is attached when you have a large
number of databases running on the server and cannot check them one by one manually?
Write a shell script that reads the database names from oratab and loops over them, passing
each DB name to srvctl to get the result.
#!/bin/ksh
# Usage: pass the service name as the first argument.
# Replace <crs_home> with the Grid Infrastructure home before running.
ORACLE_HOME=<crs_home>
PATH=$ORACLE_HOME/bin:$PATH
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${ORACLE_HOME}/lib
export ORACLE_HOME PATH LD_LIBRARY_PATH
# Skip comment lines in oratab and take the first colon-separated field
for INSTANCE in `grep -v "^#" /etc/oratab | cut -f1 -d: -s`
do
export ORACLE_SID=$INSTANCE
# Print a line only for the databases where the service is running
srvctl status service -d $INSTANCE -s $1 | grep -i "is running"
done
9. Difference between OHAS and CRS?

OHAS is the complete cluster stack, which includes some kernel-level tasks like managing the
network, time synchronization, disks etc., whereas CRS manages resources
like databases, listeners, applications, etc. With both of these Oracle provides high
availability clustering services rather than an affinity only to databases.

NODE EVICTION OVERVIEW

The Oracle Clusterware is designed to perform a node eviction by removing one or more
nodes from the cluster if some critical problem is detected. A critical problem could be a
node not responding via a network heartbeat, a node not responding via a disk heartbeat, a
hung or severely degraded machine, or a hung ocssd.bin process. The purpose of this node
eviction is to maintain the overall health of the cluster by removing bad members.

Starting in 11.2.0.2 RAC (or if you are on Exadata), a node eviction may not actually reboot
the machine. This is called a rebootless restart. In this case we restart most of the
clusterware stack to see if that fixes the unhealthy node.

1.0 - PROCESS ROLES FOR REBOOTS

OCSSD (aka CSS daemon) - This process is spawned by the cssdagent process. It runs in
both vendor clusterware and non-vendor clusterware environments. OCSSD's primary job
is internode health monitoring and RDBMS instance endpoint discovery. The health
monitoring includes a network heartbeat and a disk heartbeat (to the voting files). OCSSD
can also evict a node after escalation of a member kill from a client (such as a database
LMON process). This is a multi-threaded process that runs at an elevated priority and runs
as the Oracle user.

Startup sequence: INIT --> init.ohasd --> ohasd --> ohasd.bin --> cssdagent --> ocssd -->
ocssd.bin

CSSDAGENT - This process is spawned by OHASD and is responsible for spawning the
OCSSD process, monitoring for node hangs (via oprocd functionality), monitoring the
OCSSD process for hangs (via oclsomon functionality), and monitoring vendor clusterware
(via vmon functionality). This is a multi-threaded process that runs at an elevated priority
and runs as the root user.

Startup sequence: INIT --> init.ohasd --> ohasd --> ohasd.bin --> cssdagent

CSSDMONITOR - This process also monitors for node hangs (via oprocd functionality),
monitors the OCSSD process for hangs (via oclsomon functionality), and monitors vendor
clusterware (via vmon functionality). This is a multi-threaded process that runs at an
elevated priority and runs as the root user.

Startup sequence: INIT --> init.ohasd --> ohasd --> ohasd.bin --> cssdmonitor

2.0 - DETERMINING WHICH PROCESS IS RESPONSIBLE FOR A REBOOT

Important files to review:

Clusterware alert log in <GRID_HOME>/log/<nodename>

The cssdagent log(s) in <GRID_HOME>/log/<nodename>/agent/ohasd/oracssdagent_root

The cssdmonitor log(s) in <GRID_HOME>/log/<nodename>/agent/ohasd/oracssdmonitor_root

The ocssd log(s) in <GRID_HOME>/log/<nodename>/cssd

The lastgasp log(s) in /etc/oracle/lastgasp or /var/opt/oracle/lastgasp

IPD/OS or OS Watcher data

'opatch lsinventory -detail' output for the GRID home

Messages files, by platform:

Linux: /var/log/messages

Sun: /var/adm/messages

HP-UX: /var/adm/syslog/syslog.log

IBM: /bin/errpt -a > messages.out

Please refer to the following document which provides information on collecting together
most of the above files:

Document 1513912.1 - TFA Collector - Tool for Enhanced Diagnostic Gathering

11.2 Clusterware evictions should, in most cases, have some kind of meaningful error in the
clusterware alert log. This can be used to determine which process is responsible for the
reboot. Example message from a clusterware alert log:

[ohasd(11243)]CRS-8011:reboot advisory message from host: sta00129, component:
cssagent, with timestamp: L-2009-05-05-10:03:25.340
[ohasd(11243)]CRS-8013:reboot advisory message text: Rebooting after limit 28500
exceeded; disk timeout 27630, network timeout 28500, last heartbeat from CSSD at epoch
seconds 1241543005.340, 4294967295 milliseconds ago based on invariant clock value of
93235653

This particular eviction happened when we had hit the network timeout. CSSD exited and
the cssdagent took action to evict. The cssdagent knows the information in the error
message from local heartbeats made from CSSD.

If no message is in the evicted node's clusterware alert log, check the lastgasp logs on the
local node and/or the clusterware alert logs of other nodes.

3.0 - TROUBLESHOOTING OCSSD EVICTIONS

If you have encountered an OCSSD eviction review common causes in section 3.1 below.

3.1 - COMMON CAUSES OF OCSSD EVICTIONS

Network failure or latency between nodes. It would take 30 consecutive missed checkins (by
default - determined by the CSS misscount) to cause a node eviction.

Problems writing to or reading from the CSS voting disk. If the node cannot perform a disk
heartbeat to the majority of its voting files, then the node will be evicted.

A member kill escalation. For example, database LMON process may request CSS to
remove an instance from the cluster via the instance eviction mechanism. If this times out
it could escalate to a node kill.

An unexpected failure or hang of the OCSSD process, this can be caused by any of the
above issues or something else.

An Oracle bug.
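The first two causes above are threshold rules, which the sketch below models in a deliberately simplified way. It assumes one check-in per heartbeat interval and a strict majority rule for voting files, which is consistent with the "0 of 3 configured voting disks available, need 2" message in the CSS log excerpt in section 3.2. The function names are hypothetical, and real CSS timing is more involved than a simple counter.

```python
def voting_disks_needed(configured):
    """A node must reach a strict majority of the configured voting files."""
    return configured // 2 + 1


def node_survives_disk_check(available, configured):
    """True if enough voting files are reachable for the node to stay in the cluster."""
    return available >= voting_disks_needed(configured)


def evict_for_network(checkins, misscount=30):
    """`checkins` holds one boolean per heartbeat interval (True = check-in seen).

    Returns True as soon as `misscount` consecutive check-ins have been missed.
    """
    missed = 0
    for seen in checkins:
        missed = 0 if seen else missed + 1
        if missed >= misscount:
            return True
    return False


print(voting_disks_needed(3), node_survives_disk_check(0, 3))
print(evict_for_network([False] * 30))
```

Note that a single successful check-in resets the counter, so 29 misses, one success, and another 29 misses do not trigger an eviction in this model.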

3.2 - FILES TO REVIEW AND GATHER FOR OCSSD EVICTIONS

All files from section 2.0 from all cluster nodes. More data may be required.

Example of an eviction due to loss of voting disk:

CSS log:

2012-03-27 22:05:48.693: [ CSSD][1100548416](:CSSNM00018:)clssnmvDiskCheck:
Aborting, 0 of 3 configured voting disks available, need 2

2012-03-27 22:05:48.693: [ CSSD][1100548416]###################################

2012-03-27 22:05:48.693: [ CSSD][1100548416]clssscExit: CSSD aborting from thread
clssnmvDiskPingMonitorThread

OS messages:

Mar 27 22:03:58 choldbr132p kernel: Error:Mpx:All paths to Symm 000190104720 vol 0c71
are dead.

Mar 27 22:03:58 choldbr132p kernel: Error:Mpx:Symm 000190104720 vol 0c71 is dead.

Mar 27 22:03:58 choldbr132p kernel: Buffer I/O error on device sdbig, logical block 0

...

4.0 - TROUBLESHOOTING CSSDAGENT OR CSSDMONITOR EVICTIONS

If you have encountered a CSSDAGENT or CSSDMONITOR eviction review common causes
in section 4.1 below.

4.1 - COMMON CAUSES OF CSSDAGENT OR CSSDMONITOR EVICTIONS

An OS scheduler problem. For example, if the OS is getting locked up in a driver or
hardware or there is excessive amounts of load on the machine (at or near 100% cpu
utilization), thus preventing the scheduler from behaving reasonably.

A thread(s) within the CSS daemon hung.

An Oracle bug.

4.2 - FILES TO REVIEW AND GATHER FOR CSSDAGENT OR CSSDMONITOR EVICTIONS

All files from section 2.0 from all cluster nodes. More data may be required.
Importance of master node in a cluster:

- Master node has the least Node-id in the cluster. Node-ids are assigned to the nodes in
the same order as the nodes join the cluster. Hence, normally the node which joins the
cluster first is the master node.

- CRSd process on the Master node is responsible to initiate the OCR backup as per the
backup policy

- Master node is also responsible to sync OCR cache across the nodes

- CRSd process on the master node reads from and writes to OCR on disk

- In case of node eviction, the cluster is divided into two sub-clusters. The sub-cluster
containing fewer nodes is evicted. But if both sub-clusters have the same number of
nodes, the sub-cluster having the master node survives whereas the other sub-cluster is
evicted.

- When the OCR master (crsd.bin process) stops or restarts for whatever reason, the crsd.bin
on the surviving node with the lowest node number becomes the new OCR master.
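The eviction rule in the list above can be expressed as a short decision function. This is a sketch of the rule exactly as stated in the text (larger sub-cluster wins, a tie goes to the side with the master, and the master is the lowest node id); it assumes two non-empty, disjoint sets of node ids, and the function name is hypothetical.

```python
def surviving_subcluster(subcluster_a, subcluster_b):
    """Return the set of node ids that survives a split into two sub-clusters."""
    a, b = set(subcluster_a), set(subcluster_b)
    if len(a) != len(b):
        # The sub-cluster with fewer nodes is evicted.
        return a if len(a) > len(b) else b
    # Same size: the side holding the master node (lowest node id) survives.
    master = min(a | b)
    return a if master in a else b


print(surviving_subcluster({1}, {2, 3}))
print(surviving_subcluster({1, 4}, {2, 3}))
```

In the first call node 1 is the master but still loses, because size is checked before the master tie-break.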

The following methods can be used to find the OCR master:

1. By searching crsd.l* on all nodes:

grep "OCR MASTER" $ORA_CRS_HOME/log/$HOST/crsd/crsd.l*

2. Query V$GES_RESOURCE to identify the master node.

How to monitor block transfers across the interconnect between nodes in RAC?

The v$cache_transfer and v$file_cache_transfer views are used to examine RAC statistics.

The types of blocks that use the cluster interconnects in a RAC environment are monitored
with the v$cache_transfer series of views:

v$cache_transfer: This view shows the types and classes of blocks that Oracle transfers
over the cluster interconnect on a per-object basis.

The forced_reads and forced_writes columns can be used to determine the types of objects
the RAC instances are sharing.
Values in the forced_writes column show how often a certain block type is transferred out of
a local buffer cache due to the current version being requested by another instance.

Querying the GV$CLUSTER_INTERCONNECTS view lists the interconnect used by all the
participating instances of the RAC database.

SQL> select INST_ID, IP_ADDRESS from GV$CLUSTER_INTERCONNECTS;

INST_ID    IP_ADDRESS
---------- ----------------
1          192.168.261.1
2          192.168.261.2

New Features for Release 2 (11.2)

Oracle Automatic Storage Management and Oracle Clusterware Installation

With Oracle Grid Infrastructure 11g release 2 (11.2), Oracle Automatic Storage
Management (Oracle ASM) and Oracle Clusterware are installed into a single home
directory, which is referred to as the Grid Infrastructure home. Configuration assistants
start after the installer interview process that configures Oracle ASM and Oracle
Clusterware. The installation of the combined products is called Oracle Grid Infrastructure.
However, Oracle Clusterware and Oracle Automatic Storage Management remain
separate products.

Oracle Automatic Storage Management and Oracle Clusterware Files

With this release, Oracle Cluster Registry (OCR) and voting disks can be placed on Oracle
Automatic Storage Management (Oracle ASM).

This feature enables Oracle ASM to provide a unified storage solution, storing all the data
for the clusterware and the database, without the need for third-party volume managers or
cluster filesystems. For new installations, OCR and voting disk files can be placed either on
Oracle ASM or on a cluster file system or NFS system. Installing Oracle Clusterware files on
raw or block devices is no longer supported, unless an existing system is being upgraded.

Fixup Scripts and Grid Infrastructure Checks

With Oracle Clusterware 11g release 2 (11.2), Oracle Universal Installer (OUI) detects when
minimum requirements for installation are not completed, and creates shell script programs,
called fixup scripts, to resolve many incomplete system configuration requirements. If OUI
detects an incomplete task that is marked "fixable", then you can generate the fixup script
for it by clicking the Fix & Check Again button.

The fixup script is generated during installation. You are prompted to run the script as root
in a separate terminal session. When you run the script, it raises kernel values to required
minimums, if necessary, and completes other operating system configuration tasks. You
also can have Cluster Verification Utility (CVU) generate fixup scripts before installation.

Grid Plug and Play

In the past, adding or removing servers in a cluster required extensive manual preparation.
With this release, you can continue to configure server nodes manually or use Grid Plug and
Play to configure them dynamically as nodes are added or removed from the cluster.

Grid Plug and Play reduces the costs of installing, configuring, and managing server nodes
by starting a grid naming service within the cluster to allow each node to perform the
following tasks dynamically:

■ Negotiating appropriate network identities for itself

■ Acquiring additional information it needs to operate from a configuration profile

■ Configuring or reconfiguring itself using profile data, making host names and addresses
resolvable on the network

Because servers perform these tasks dynamically, the number of steps required to add or
delete nodes is minimized.

Improved Input/Output Fencing Processes

Oracle Clusterware 11g release 2 (11.2) replaces the oprocd and Hangcheck processes with
the cluster synchronization service daemon Agent and Monitor to provide more accurate
recognition of hangs and to avoid false termination.

Intelligent Platform Management Interface (IPMI) Integration

Intelligent Platform Management Interface (IPMI) is an industry-standard management
protocol that is included with many servers today. IPMI operates
independently of the operating system, and can operate even if the system is not powered
on. Servers with IPMI contain a baseboard management controller (BMC), which is used to
communicate with the server.

If IPMI is configured, then Oracle Clusterware uses IPMI when node fencing is required and
the server is not responding.

SCAN for Simplified Client Access

With this release, the Single Client Access Name (SCAN) is the host name to provide for all
clients connecting to the cluster. The SCAN is a domain name registered to at least one and
up to three IP addresses, either in the domain name service (DNS) or the Grid Naming
Service (GNS).

The primary benefit of a Single Client Access Name (SCAN) is not having to update client
connection information (such as TNSNAMES.ora) every time you add or remove nodes from
an existing RAC cluster.

Clients use a simple EZconnect string and JDBC connections can use a JDBC thin URL to
access the database, which is done independently of the physical hosts that the database
instances are running on. Additionally, SCAN automatically provides both failover and load
balancing of connections: by default, each new connection is directed to the least busy
instance in the cluster.

It should be noted here that because EZconnect is used with SCAN, the SQLNET.ora file
should include EZconnect as one of the naming methods, for example:

NAMES.DIRECTORY_PATH=(tnsnames,ezconnect,ldap)

An EZconnect string would look like

sqlplus user/pass@mydb-scan:1521/myservice

A JDBC thin string would look like

jdbc:oracle:thin:@//mydb-scan:1521/myservice
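The two connect strings above follow fixed patterns, so they can be generated from the SCAN name, port and service name. The sketch below is only an illustration: the helper names `ezconnect` and `jdbc_thin_url` are hypothetical, and the JDBC form assumes the usual service-name syntax `jdbc:oracle:thin:@//host:port/service`.

```python
def ezconnect(host, service, port=1521):
    """Build an Easy Connect identifier of the form host:port/service."""
    return f"{host}:{port}/{service}"


def jdbc_thin_url(host, service, port=1521):
    """Build a JDBC thin URL that addresses the database by service name."""
    return f"jdbc:oracle:thin:@//{host}:{port}/{service}"


if __name__ == "__main__":
    # The SCAN name and service name are the example values used in the text.
    print("sqlplus user/pass@" + ezconnect("mydb-scan", "myservice"))
    print(jdbc_thin_url("mydb-scan", "myservice"))
```

Because only the SCAN name appears in the string, the same value keeps working when nodes are added to or removed from the cluster.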

It's highly recommended that the clients are Oracle 11g R2 clients, to allow them to fully
take advantage of the failover with the SCAN settings.
The TNSNAMES.ora file now references the SCAN rather than the VIPs, as was done in
previous versions. A TNSNAMES entry would look like this:

MYDB =

(DESCRIPTION=

(ADDRESS=(PROTOCOL=TCP)(HOST=mydb-scan.ORACLE.COM)(PORT=1521))

(CONNECT_DATA=(SERVICE_NAME=myservice.ORACLE.COM)))

There are two methods available for defining the SCAN: use your corporate DNS, or use
the Grid Naming Service (GNS).

DEFINE SCAN USING DNS

To use the DNS method for defining your SCAN, the network administrator must create a
single name that resolves to three separate IP addresses using round-robin algorithms.
Regardless of how many systems are part of your cluster, Oracle recommends that 3 IP
addresses are configured to allow for failover and load-balancing.

It is important that the IP addresses are on the same subnet as the public network for the
server. The other two requirements are that the name (not including the domain suffix) is
15 characters or less in length and that the name can be resolved without using the domain
suffix. Also, the IP addresses should not be specifically assigned to any of the nodes in the
cluster.

You can test the DNS setup by running an nslookup on the scan name two or more times.
Each time, the IP addresses should be returned in a different order:

Syntax: nslookup mydatabase-scan
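The DNS requirements listed above (three addresses, all on the public subnet, a short name of at most 15 characters) can also be checked programmatically. This is a minimal sketch, not an Oracle tool: `resolve` and `check_scan` are hypothetical helper names, and the subnet passed in is an assumption you would replace with your own public network.

```python
import ipaddress
import socket


def resolve(name):
    """Return the IPv4 addresses that DNS currently reports for a host name."""
    return socket.gethostbyname_ex(name)[2]


def check_scan(name, addresses, public_subnet):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    short_name = name.split(".")[0]  # name without the domain suffix
    if len(short_name) > 15:
        problems.append("SCAN name is longer than 15 characters")
    unique = sorted(set(addresses))
    if len(unique) != 3:
        problems.append(f"expected 3 distinct addresses, got {len(unique)}")
    network = ipaddress.ip_network(public_subnet)
    for addr in unique:
        if ipaddress.ip_address(addr) not in network:
            problems.append(f"{addr} is outside {public_subnet}")
    return problems
```

A typical call would be `check_scan("mydatabase-scan", resolve("mydatabase-scan"), "192.168.1.0/24")`. Calling `resolve` several times and comparing the order of the results is one way to confirm the round-robin behaviour.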

DEFINE SCAN USING GRID NAMING SERVICE (GNS)

Using GNS assumes that a DHCP server is running on the public network with enough
available addresses to assign the required IP addresses and the SCAN VIP. Only one static
IP address is required to be configured and it should be in the DNS domain.

DATABASE PARAMETERS FOR SCAN

The database registers each instance with the SCAN listeners using the REMOTE_LISTENER
parameter in the spfile. Oracle 11g R2 RAC databases register with the SCAN listeners only
as remote listeners. Upgraded databases also register with the SCAN listeners via the
REMOTE_LISTENER parameter, and continue to register with the local listener as well. The
LOCAL_LISTENER parameter would be set to the node VIP for upgraded systems.
The REMOTE_LISTENER parameter, rather than being set to an alias in a server-side
TNSNAMES file (as in previous versions), is set simply to the SCAN entry. The alter
command would be:

ALTER SYSTEM SET REMOTE_LISTENER='mydb-scan.oracle.com:1521';

POINTS TO BE NOTED FOR SCAN LISTENER

SOURCE:
http://docs.oracle.com/cd/E11882_01/install.112/e48195/undrstnd.htm#RIWIN610

An Oracle Database 11g release 2 (11.2) database service automatically registers with the
listeners specified in the LOCAL_LISTENER and REMOTE_LISTENER parameters. During
registration, PMON sends information such as the service name, instance names, and
workload information to the listeners. This feature is called service registration.

Services coordinate their sessions by registering their workload, or the amount of work they
are currently handling, with the local listener and the SCAN listeners. Clients are redirected
by the SCAN listener to a local listener on the least-loaded node that is running the instance
for a particular service. This feature is called load balancing. The local listener either directs
the client to a dispatcher process (if the database was configured for shared server), or
directs the client to a dedicated server process.

When a listener starts after the Oracle instance starts, and the listener is available for
service registration, registration does not occur until the next time the Oracle Database
process monitor (PMON) starts its discovery routine. By default, the PMON discovery routine
is started every 60 seconds. To override the 60-second delay, use the SQL statement ALTER
SYSTEM REGISTER. This statement forces PMON to register the service immediately.

Local Listeners

- Starting with Oracle Database 11g release 2 (11.2), the local listener, or default
listener, is located in the Grid home when you have Oracle Grid Infrastructure installed,
under the Grid_home\network\admin directory.

- With Oracle Clusterware 11g release 2 and later, the listener association no longer
requires tnsnames.ora file entries. The listener associations are configured as follows:

· DBCA no longer sets the LOCAL_LISTENER parameter. The Oracle Clusterware agent
that starts the database sets the LOCAL_LISTENER parameter dynamically, and it sets it to
the actual value, not an alias. So listener_alias entries are no longer needed in the
tnsnames.ora file.

· The REMOTE_LISTENER parameter is configured by DBCA to reference the SCAN
and SCAN port, without any need for a tnsnames.ora entry. Oracle Clusterware uses the
Easy Connect naming method with scanname:scanport, so no listener associations for the
REMOTE_LISTENER parameter are needed in the tnsnames.ora file.

Three SCAN addresses are configured for the cluster, and allocated to servers. When a
client issues a connection request using SCAN, the three SCAN addresses are returned to
the client. If the first address fails, then the connection request to the SCAN name fails over
to the next address. Using multiple addresses allows a client to connect to an instance of
the database even if the initial instance has failed.

The net service name does not need to know the physical address of the server on which
the database, database instance, or listener runs. SCAN is resolved by DNS, which returns
three IP addresses to the client. The client then tries each address in succession until a
connection is made.
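The "try each address in succession" behaviour described above can be sketched in a few lines. This is an illustration of the idea only, not the Oracle Net implementation: `connect_with_failover` and `flaky_connect` are hypothetical names, and the addresses are made-up examples.

```python
def connect_with_failover(addresses, connect):
    """Try each address in order and return (address, connection) for the first success."""
    errors = []
    for addr in addresses:
        try:
            return addr, connect(addr)
        except OSError as exc:  # this address is unreachable, try the next one
            errors.append((addr, str(exc)))
    raise ConnectionError(f"all SCAN addresses failed: {errors}")


def flaky_connect(addr):
    """Stand-in for a real connect call: the first address is treated as down."""
    if addr == "10.0.0.1":
        raise OSError("listener not reachable")
    return f"connection to {addr}"


if __name__ == "__main__":
    print(connect_with_failover(["10.0.0.1", "10.0.0.2", "10.0.0.3"], flaky_connect))
```

In a real client the `connect` argument would open a network connection; a stand-in is used here so the loop can be run without a cluster.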

Understanding SCAN

SCAN is a fully qualified name (host name.domain name) that is configured to resolve to all
the addresses allocated for the SCAN listeners.

The default value for SCAN is cluster_name.GNS_sub_domain, or
cluster_name-scan.domain_name if GNS is not used. For example, in a cluster that does not
use GNS, if your cluster name is salesRAC, and your domain is example.com, then the
default SCAN address is salesRAC-scan.example.com:1521.
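The default naming rule can be written as a small function. This sketch simply encodes the two patterns stated in the text; `default_scan` is a hypothetical helper, not an Oracle API.

```python
def default_scan(cluster_name, domain=None, gns_subdomain=None, port=1521):
    """Return the default SCAN address for a cluster, with or without GNS."""
    if gns_subdomain:
        host = f"{cluster_name}.{gns_subdomain}"   # GNS in use
    elif domain:
        host = f"{cluster_name}-scan.{domain}"     # GNS not used
    else:
        raise ValueError("either domain or gns_subdomain is required")
    return f"{host}:{port}"


if __name__ == "__main__":
    print(default_scan("salesRAC", domain="example.com"))
```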

SCAN is configured in DNS to resolve to three IP addresses, and DNS should return the
addresses using a round-robin algorithm. This means that when SCAN is resolved by DNS,
the IP addresses are returned to the client in a different order each time.

- Based on the environment, the following actions occur when you use SCAN to connect to
an Oracle RAC database using a service name.

1. The PMON process of each instance registers the database services with the default
listener on the local node and with each SCAN listener, which is specified by the
REMOTE_LISTENER database parameter.

2. The listeners are dynamically updated on the amount of work being handled by the
instances and dispatchers.

3. The client issues a database connection request using a connect descriptor of the
form:

orausr/@scan_name:1521/sales.example.com

Note:

If you use the Easy Connect naming method, then ensure the sqlnet.ora file on the client
contains EZCONNECT in the list of naming methods specified by
the NAMES.DIRECTORY_PATH parameter.

4. The client uses DNS to resolve scan_name. After DNS returns the three addresses
assigned to SCAN, the client sends a connect request to the first IP address. If the connect
request fails, then the client attempts to connect using the next IP address.

5. When the connect request is successful, the client connects to a SCAN listener for the
cluster which hosts the sales database. The SCAN listener compares the workload of the
instances sales1 and sales2 and the workload of the nodes on which they are running.
Because node2 is less loaded than node1, the SCAN listener selects node2 and sends the
address for the listener on that node back to the client.

6. The client connects to the local listener on node2. The local listener starts a dedicated
server process for the connection to the database. The client connects directly to the
dedicated server process on node2 and accesses the sales2 database instance.
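The selection step in this walkthrough, where the SCAN listener picks the least-loaded node, can be pictured with a toy model. The load figures and field names below are purely hypothetical; the real listener works from the workload information that PMON registers, not from a simple pair of numbers.

```python
def pick_listener(candidates):
    """Pick the (node, instance) pair with the lowest node load, then instance load."""
    best = min(candidates, key=lambda c: (c["node_load"], c["instance_load"]))
    return best["node"], best["instance"]


# Hypothetical workload snapshot for the two-node sales cluster in the text.
SNAPSHOT = [
    {"node": "node1", "instance": "sales1", "node_load": 0.80, "instance_load": 0.70},
    {"node": "node2", "instance": "sales2", "node_load": 0.30, "instance_load": 0.25},
]

if __name__ == "__main__":
    print(pick_listener(SNAPSHOT))
```

With this snapshot the client would be redirected to the local listener on node2, matching the outcome described in the walkthrough.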

SRVCTL Command Enhancements for Patching

With this release, you can use the server control utility SRVCTL to shut down all Oracle
software running within an Oracle home, in preparation for patching. Oracle Grid
Infrastructure patching is automated across all nodes, and patches can be applied in a
multi-node, multi-patch fashion.

Typical Installation Option

To streamline cluster installations, especially for those customers who are new to clustering,
Oracle introduces the Typical Installation path. Typical installation defaults as many options
as possible to those recommended as best practices.

Voting Disk Backup Procedure Change


In prior releases, backing up the voting disks using a dd command was a required
post-installation task. With Oracle Clusterware release 11.2 and later, backing up and
restoring a voting disk using the dd command is not supported. Backing up voting disks
manually is no longer required, because voting disks are backed up automatically in the
OCR as part of any configuration change. Voting disk data is automatically restored to any
added voting disks.

Oracle RAC Background Processes

The GCS and GES processes, and the GRD collaborate to enable Cache Fusion. The Oracle
RAC processes and their identifiers are as follows:

■ ACMS: Atomic Controlfile to Memory Service (ACMS)

In an Oracle RAC environment, the ACMS per-instance process is an agent that contributes
to ensuring a distributed SGA memory update is either globally committed on success or
globally aborted if a failure occurs.

■ GTX0-j: Global Transaction Process

The GTX0-j process provides transparent support for XA global transactions in an Oracle
RAC environment. The database autotunes the number of these processes based on the
workload of XA global transactions.

■ LMD: Global Enqueue Service Daemon (GES)

The LMD process manages incoming remote resource requests within each instance.

■ LMON: Global Enqueue Service Monitor (GES)

The LMON process manages the GES and maintains consistency of the GCS memory
structures in case of process death. It is also responsible for cluster reconfiguration and
lock reconfiguration when a node joins or leaves, checks for instance deaths, and listens
for local messaging. A detailed log file is created that tracks any reconfigurations that
have happened.

■ LMSn: Lock Manager Server process (GCS)

This is the cache fusion part and the most active process; it handles the consistent copies of
blocks that are transferred between instances.

It receives requests from LMD to perform lock requests. It rolls back any uncommitted
transactions.

There can be up to ten LMS processes running, and they can be started dynamically if
demand requires it.
They manage lock manager service requests for GCS resources and send them to a service
queue to be handled by the LMSn process.

It also handles global deadlock detection and monitors for lock conversion timeouts.

As a performance gain, you can increase the priority of this process to make sure CPU
starvation does not occur.

You can see the statistics of this daemon by looking at the view X$KJMSDP.

■ LCK0: Instance Enqueue Process

The LCK0 process manages non-Cache Fusion resource requests such as library and row
cache requests.

■ RMSn: Oracle RAC Management Processes (RMSn)

The RMSn processes perform manageability tasks for Oracle RAC. Tasks accomplished by an
RMSn process include creation of resources related to Oracle RAC when new instances are
added to the clusters.

■ RSMN: Remote Slave Monitor

The RSMN process manages background slave process creation and communication on
remote instances. These background slave processes perform tasks on behalf of a
coordinating process running in another instance.

■ DIAG: Diagnostic Daemon

This is a lightweight process; it uses the DIAG framework to monitor the health of the
cluster. It captures information for later diagnosis in the event of failures. It will perform
any necessary recovery if an operational hang is detected.

Top 5 issues for Instance Eviction (Doc ID 1374110.1)

srvctl status database -d HRPRD

Instance HRPRD1 is running on node hrprddb1

Instance HRPRD2 is running on node hrprddb2

============================================

$crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- --------- ---------


1. ONLINE 8c377cc40f344f0dff8da3d9b0d7d610 (/psvoting01/hrprd_vote) []

2. ONLINE a371af8108beffcfbfb43e12ff17d2c4 (/psvoting02/hrprd_vote) []

3. ONLINE 8b2ee6a3aabcefc6bf61114b6802cc48 (/psvoting03/hrprd_vote) []

Located 3 voting disk(s).
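When monitoring many clusters it can help to parse this listing rather than read it by eye. The following sketch assumes the line layout shown above (number, state, File Universal Id, file name in parentheses, disk group in brackets); `parse_votedisks` and `online_majority` are hypothetical helper names.

```python
import re

# One data line: "1. ONLINE <fuid> (<path>) [<disk group>]"
VOTE_LINE = re.compile(r"^\s*\d+\.\s+(\S+)\s+([0-9a-f]+)\s+\((.*?)\)\s+\[(.*?)\]")


def parse_votedisks(text):
    """Return one dict per voting disk found in 'crsctl query css votedisk' output."""
    disks = []
    for line in text.splitlines():
        match = VOTE_LINE.match(line)
        if match:
            state, fuid, path, group = match.groups()
            disks.append({"state": state, "fuid": fuid, "path": path, "disk_group": group})
    return disks


def online_majority(disks):
    """True when strictly more than half of the listed voting disks are ONLINE."""
    online = sum(1 for d in disks if d["state"] == "ONLINE")
    return online > len(disks) / 2


SAMPLE = """\
1. ONLINE 8c377cc40f344f0dff8da3d9b0d7d610 (/psvoting01/hrprd_vote) []
2. ONLINE a371af8108beffcfbfb43e12ff17d2c4 (/psvoting02/hrprd_vote) []
3. ONLINE 8b2ee6a3aabcefc6bf61114b6802cc48 (/psvoting03/hrprd_vote) []
Located 3 voting disk(s).
"""

if __name__ == "__main__":
    disks = parse_votedisks(SAMPLE)
    print(len(disks), "voting disks, majority online:", online_majority(disks))
```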

Cluster Verification Utility (CVU) command to verify OCR integrity of all of the nodes in your
cluster database:

cluvfy comp ocr -n all -verbose

List the nodes in your cluster by running the following command on one node:

olsnodes

Listing Backup Files

$ocrconfig -showbackup

eldevdb2 2014/06/24 15:09:02 /oracrs/app/11.2.0.3/grid/cdata/hhcpsoft/backup00.ocr

eldevdb2 2014/06/24 11:09:00 /oracrs/app/11.2.0.3/grid/cdata/hhcpsoft/backup01.ocr

eldevdb2 2014/06/24 07:08:58 /oracrs/app/11.2.0.3/grid/cdata/hhcpsoft/backup02.ocr

eldevdb2 2014/06/23 03:08:44 /oracrs/app/11.2.0.3/grid/cdata/hhcpsoft/day.ocr

eldevdb2 2014/06/11 23:06:00 /oracrs/app/11.2.0.3/grid/cdata/hhcpsoft/week.ocr

hrdevdb2 2011/08/18 12:53:04 /oracrs/app/11.2.0/grid/cdata/hhcpsoft/backup_20110818_125304.ocr
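Before a restore you usually want the most recent backup from this listing. The sketch below assumes each backup is printed on one line as node, date, time and path; `parse_backups` is a hypothetical helper and the sample text is copied from the listing above.

```python
from datetime import datetime


def parse_backups(text):
    """Return (timestamp, node, path) tuples from ocrconfig -showbackup output, newest first."""
    backups = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank lines and anything that is not a backup entry
        node, day, clock, path = parts
        try:
            taken = datetime.strptime(f"{day} {clock}", "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue
        backups.append((taken, node, path))
    return sorted(backups, reverse=True)


SAMPLE = """\
eldevdb2 2014/06/24 11:09:00 /oracrs/app/11.2.0.3/grid/cdata/hhcpsoft/backup01.ocr
eldevdb2 2014/06/24 15:09:02 /oracrs/app/11.2.0.3/grid/cdata/hhcpsoft/backup00.ocr
eldevdb2 2014/06/23 03:08:44 /oracrs/app/11.2.0.3/grid/cdata/hhcpsoft/day.ocr
"""

if __name__ == "__main__":
    newest = parse_backups(SAMPLE)[0]
    print("most recent backup:", newest[2])
```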

ocrconfig - Configuration tool for Oracle Cluster/Local Registry.

Synopsis:

ocrconfig [option]

option:
[-local] -export <filename> - Export OCR/OLR contents to a file

[-local] -import <filename> - Import OCR/OLR contents from a file

[-local] -upgrade [<user> [<group>]] - Upgrade OCR from previous version

-downgrade [-version <version string>] - Downgrade OCR to the specified version

[-local] -backuploc <dirname> - Configure OCR/OLR backup location

[-local] -showbackup [auto|manual] - Show OCR/OLR backup information

[-local] -manualbackup - Perform OCR/OLR backup

[-local] -restore <filename> - Restore OCR/OLR from physical backup

-replace <current filename> -replacement <new filename> - Replace a OCR device/file <filename1> with <filename2>

-add <filename> - Add a new OCR device/file

-delete <filename> - Remove a OCR device/file

-overwrite - Overwrite OCR configuration on disk

-repair -add <filename> | -delete <filename> | -replace <current filename> -replacement <new filename> - Repair OCR configuration on the local node

-help - Print out this help information

Run the following command to inspect the contents and verify the integrity of the backup
file:

ocrdump -backupfile backup_file_name

Verify the integrity of OCR:

ocrcheck

Run ocrcheck and if the command returns a failure message, then both the primary OCR
and the OCR mirror have failed.
The OCRCHECK utility displays the version of the OCR's block format, total space available
and used space, OCRID, and the OCR locations that you have configured. OCRCHECK
performs a block-by-block checksum operation for all of the blocks in all of the OCRs that
you have configured. It also returns an individual status for each file as well as a result for
the overall OCR integrity check.

OCRCHECK creates a log file in the directory CRS_home/log/hostname/client. To change
the amount of logging, edit the file CRS_home/srvm/admin/ocrlog.ini.

$./ocrcheck -local

Status of Oracle Local Registry is as follows :

Version : 3

Total space (kbytes) : 262120

Used space (kbytes) : 2584

Available space (kbytes) : 259536

ID : 814444380

Device/File Name : /oracrs/app/11.2.0/grid/cdata/hrprddb2.olr

Device/File integrity check succeeded

Local registry integrity check succeeded

Logical corruption check bypassed due to non-privileged user
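The ocrcheck report is a list of "name : value" lines, which makes it easy to turn into a dictionary for scripted health checks. This is a minimal sketch under that assumption; `parse_ocrcheck` and `used_percent` are hypothetical helper names and the sample is the output shown above.

```python
def parse_ocrcheck(text):
    """Collect 'name : value' pairs from ocrcheck output and flag the integrity result."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    info["integrity_ok"] = "integrity check succeeded" in text
    return info


def used_percent(info):
    """Percentage of the registry space that is in use."""
    return 100.0 * int(info["Used space (kbytes)"]) / int(info["Total space (kbytes)"])


SAMPLE = """\
Status of Oracle Local Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 2584
Available space (kbytes) : 259536
ID : 814444380
Device/File Name : /oracrs/app/11.2.0/grid/cdata/hrprddb2.olr
Device/File integrity check succeeded
Local registry integrity check succeeded
"""

if __name__ == "__main__":
    report = parse_ocrcheck(SAMPLE)
    print(f"{used_percent(report):.2f}% used, integrity ok: {report['integrity_ok']}")
```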

For example, to inspect the contents and verify the integrity of one of the automatic
backups listed earlier:

ocrdump -backupfile /oracrs/app/11.2.0.3/grid/cdata/hhcpsoft/backup01.ocr

The number of voting files you can store in a particular Oracle ASM disk group depends
upon the redundancy of the disk group.

· External redundancy: A disk group with external redundancy can store only one
voting disk

· Normal redundancy: A disk group with normal redundancy stores three voting disks
· High redundancy: A disk group with high redundancy stores five voting disks

To migrate voting disks to Oracle ASM, specify the Oracle ASM disk group name in the
following command:

$ crsctl replace votedisk +asm_disk_group

Backing up Voting Disks

In Oracle Clusterware 11g release 2 (11.2), you no longer have to back up the voting disk.

The voting disk data is automatically backed up in OCR as part of any configuration change
and is automatically restored to any voting disk added. If all voting disks are corrupted,
however, you can restore them as described under Restoring Voting Disks.

Restoring Voting Disks

If all of the voting disks are corrupted, then you can restore them, as follows:

I. Restore OCR: This step is necessary only if OCR is also corrupted or otherwise
unavailable, such as if OCR is on Oracle ASM and the disk group is no longer available.

Restoring Oracle Cluster Registry

If a resource fails, then before attempting to restore OCR, restart the resource.

As a definitive verification that OCR failed, run ocrcheck. If the command returns a
failure message, then both the primary OCR and the OCR mirror have failed. Attempt to
correct the problem using the OCR restoration procedure for your platform.

Use the following procedure to restore OCR on Linux or UNIX systems:


1. List the nodes in your cluster by running the following command on one node:

$ olsnodes

2. Stop Oracle Clusterware by running the following command as root on all of the nodes:

# crsctl stop crs

If the preceding command returns any error due to OCR corruption, stop Oracle Clusterware
by running the following command as root on all of the nodes:

# crsctl stop crs -f

3. If you are restoring OCR to a cluster file system or network file system, then run the
following command as root to restore OCR with an OCR backup that you can identify in
"Listing Backup Files":

# ocrconfig -restore file_name

After you complete this step, proceed to step 10.

4. Start the Oracle Clusterware stack on one node in exclusive mode by running the
following command as root:

# crsctl start crs -excl -nocrs

The -nocrs option ensures that the crsd process and OCR do not start with the rest of the
Oracle Clusterware stack.

Ignore any errors that display.

Check whether crsd is running. If it is, then stop it by running the following command as
root:

# crsctl stop resource ora.crsd -init

Caution:
Do not use the -init flag with any other command.

5. If you want to restore OCR to an Oracle ASM disk group, then you must first create a disk
group using SQL*Plus that has the same name as the disk group you want to restore and
mount it on the local node.

If you cannot mount the disk group locally, then run the following SQL*Plus command:

SQL> drop diskgroup disk_group_name force including contents;

Optionally, if you want to restore OCR to a raw device, then you must run the
ocrconfig -repair -replace command as root, assuming that you have all the necessary
permissions on all nodes to do so and that OCR was not previously on Oracle ASM.

6. Restore OCR with an OCR backup that you can identify in "Listing Backup Files" by
running the following command as root:

# ocrconfig -restore file_name

Notes:

Ensure that the OCR devices that you specify in the OCR configuration exist and that these
OCR devices are valid.

If you configured OCR in an Oracle ASM disk group, then ensure that the Oracle ASM disk
group exists and is mounted.

See Also:

Oracle Grid Infrastructure Installation Guide for information about creating OCRs

Oracle Automatic Storage Management Administrator's Guide for more information about
Oracle ASM disk group management

7. Verify the integrity of OCR:

# ocrcheck

8. Stop Oracle Clusterware on the node where it is running in exclusive mode:

# crsctl stop crs -f

9. Run the ocrconfig -repair -replace command as root on all the nodes in the cluster where
you did not run the ocrconfig -restore command. For example, if you ran the ocrconfig
-restore command on node 1 of a four-node cluster, then you must run the ocrconfig -repair
-replace command on nodes 2, 3, and 4.

10. Start Oracle Clusterware by running the following command as root on all of
the nodes:

# crsctl start crs

11. Verify OCR integrity of all of the cluster nodes that are configured as part of your cluster
by running the following CVU command:

$ cluvfy comp ocr -n all -verbose

II. Run the following command as root from only one node to start the Oracle Clusterware
stack in exclusive mode, which does not require voting files to be present or usable:

# crsctl start crs -excl

III. Run the crsctl query css votedisk command to retrieve the list of voting files currently
defined, similar to the following:

$ crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- --------- ---------

1. ONLINE 7c54856e98474f61bf349401e7c9fb95 (/dev/sdb1) [DATA]

This list may be empty if all voting disks were corrupted, or may have entries that are
marked as status 3 or OFF.

IV. Depending on where you store your voting files, do one of the following:
If the voting disks are stored in Oracle ASM, then run the following command to migrate the
voting disks to the Oracle ASM disk group you specify:

crsctl replace votedisk +asm_disk_group

The Oracle ASM disk group to which you migrate the voting files must exist in Oracle ASM.
You can use this command whether the voting disks were stored in Oracle ASM or some
other storage device.

If you did not store voting disks in Oracle ASM, then run the following command using the
File Universal Identifier (FUID) obtained in the previous step:

$ crsctl delete css votedisk FUID

Add a voting disk, as follows:

$ crsctl add css votedisk path_to_voting_disk

V. Stop the Oracle Clusterware stack as root:

# crsctl stop crs

Note:

If the Oracle Clusterware stack is running in exclusive mode, then use the -f option to force
the shutdown of the stack.

VI. Restart the Oracle Clusterware stack in normal mode as root:

# crsctl start crs
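Steps II to VI above form a fixed command sequence with one branch (ASM or non-ASM storage), which can be laid out as a dry-run plan before touching the cluster. The sketch below only builds and prints the commands named in the text; it executes nothing, and `votedisk_restore_plan` and its parameters are hypothetical names.

```python
def votedisk_restore_plan(asm_disk_group=None, fuids=(), new_paths=()):
    """Return the ordered crsctl commands for steps II to VI; nothing is executed."""
    plan = [
        "crsctl start crs -excl",        # step II: exclusive mode, one node only
        "crsctl query css votedisk",     # step III: list currently defined voting files
    ]
    if asm_disk_group:
        # step IV, ASM case: migrate the voting files to the named disk group
        plan.append(f"crsctl replace votedisk +{asm_disk_group}")
    else:
        # step IV, non-ASM case: delete by FUID, then add the replacement paths
        plan += [f"crsctl delete css votedisk {fuid}" for fuid in fuids]
        plan += [f"crsctl add css votedisk {path}" for path in new_paths]
    plan.append("crsctl stop crs -f")    # step V: -f because the stack is in exclusive mode
    plan.append("crsctl start crs")      # step VI: restart in normal mode
    return plan


if __name__ == "__main__":
    for command in votedisk_restore_plan(asm_disk_group="DATA"):
        print("#", command)
```

Printing the plan first makes it easy to review the sequence with a colleague before running each command by hand as root.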

Voting Files stored in ASM - How many disks per disk group do I need?

If Voting Files are stored in ASM, the ASM disk group that hosts the Voting Files will place
the appropriate number of Voting Files in accordance to the redundancy level. Once Voting
Files are managed in ASM, a manual addition, deletion, or replacement of Voting Files will
fail, since users are not allowed to manually manage Voting Files in ASM.
If the redundancy level of the disk group is set to "external", 1 Voting File is used.

If the redundancy level of the disk group is set to "normal", 3 Voting Files are used.

If the redundancy level of the disk group is set to "high", 5 Voting Files are used.

Note that Oracle Clusterware will store the disk within a disk group that holds the Voting
Files. Oracle Clusterware does not rely on ASM to access the Voting Files.

In addition, note that there can be only one Voting File per failure group. In the above list of
rules, it is assumed that each disk that is supposed to hold a Voting File resides in its own,
dedicated failure group.

In other words, a disk group that is supposed to hold the above mentioned number of
Voting Files needs to have the respective number of failure groups with at least one disk. (1
/ 3 / 5 failure groups with at least one disk)

Consequently, a normal redundancy ASM disk group, which is supposed to hold Voting Files,
requires 3 disks in separate failure groups, while a normal redundancy ASM disk group that
is not used to store Voting Files requires only 2 disks in separate failure groups.
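The rule above reduces to a lookup: each redundancy level implies a number of voting files, and the disk group needs at least that many failure groups. A minimal sketch, with hypothetical names, assuming each failure group counted contains at least one disk:

```python
# Voting files placed by ASM for each disk group redundancy level (from the text).
VOTING_FILES = {"external": 1, "normal": 3, "high": 5}


def can_host_voting_files(redundancy, failure_groups):
    """True if the disk group has enough failure groups for its voting files."""
    return failure_groups >= VOTING_FILES[redundancy]


if __name__ == "__main__":
    # A normal redundancy disk group with only 2 failure groups is fine for data,
    # but not sufficient to hold voting files.
    print(can_host_voting_files("normal", 2))
    print(can_host_voting_files("normal", 3))
```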

What happens if I lose my voting disk(s)?

If you lose half or more of your voting disks, then nodes are evicted from the cluster, or
nodes evict themselves from the cluster. This does not threaten database corruption.
Alternatively, you can use external redundancy, which means you provide redundancy at
the storage level using RAID.

For this reason, when relying on Oracle for the redundancy of your voting disks, Oracle
recommends that customers use 3 or more voting disks in Oracle RAC 10g Release 2. Note:
For best availability, the 3 voting files should be on physically separate disks. An odd
number is recommended because 4 disks are no more highly available than 3: a node needs
access to more than half of the disks, that is, 2 of 3 or 3 of 4, so in either configuration
the cluster fails once 2 disks are lost.

Restoring corrupted voting disks is easy since there isn't any significant persistent data
stored in the voting disk. See the Oracle Clusterware Admin and Deployment Guide for
information on backup and restore of voting disks.

An odd number of voting disks is required for proper clusterware configuration. A node must
be able to access strictly more than half of the voting disks at any time. So, in order to
tolerate a failure of n voting disks, there must be at least 2n+1 configured (n=1 means 3
voting disks).
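The 2n+1 rule and the "more than half" test are simple integer arithmetic. The sketch below works through them; the function names are hypothetical.

```python
def disks_needed(tolerated_failures):
    """Minimum number of voting disks needed to survive n disk failures: 2n + 1."""
    return 2 * tolerated_failures + 1


def tolerated_failures(total_disks):
    """Largest number of lost disks that still leaves a strict majority."""
    return (total_disks - 1) // 2


def node_survives(accessible_disks, total_disks):
    """A node must see strictly more than half of the voting disks."""
    return accessible_disks > total_disks / 2


if __name__ == "__main__":
    for total in (3, 4, 5):
        print(total, "voting disks tolerate", tolerated_failures(total), "failure(s)")
```

Running it shows that 3 and 4 disks both tolerate a single failure, which is why the fourth disk adds no availability.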

Why should we have an odd number of voting disks?

An odd number of voting disks should be configured to provide a method for determining
which nodes in the cluster should survive.

A node must be able to access more than half of the voting disks at any time. For example,
consider a two-node cluster with an even number of voting disks, say 2. Suppose Node1 can
access only voting disk1 and Node2 can access only voting disk2. Then there is no common
file where the clusterware can check the heartbeat of both nodes. If we have 3 voting disks
and both nodes can access more than half of them, that is, 2 voting disks, there will be at
least one disk that is accessible by both nodes. The clusterware can use that disk to check
the heartbeat of both nodes. Hence, each node should be able to access more than half of
the voting disks. A node that cannot do so will be evicted from the cluster by another node
that can access more than half of the voting disks, to maintain the integrity of the cluster.
After the cause of the failure has been corrected and access to the voting disks has been
restored, you can instruct Oracle Clusterware to recover the failed node and restore it to
the cluster.
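The argument above is a counting argument: two sets that each contain more than half of the disks must share at least one disk, whereas two exact halves need not. The brute-force check below illustrates this for small disk counts; the function names are hypothetical and the disks are simply numbered 1 to N.

```python
from itertools import combinations


def majority_sets(total):
    """All sets of disks that contain strictly more than half of the disks."""
    need = total // 2 + 1
    return [set(c) for c in combinations(range(1, total + 1), need)]


def majorities_always_overlap(total):
    """True if every pair of majority sets shares at least one disk."""
    return all(a & b for a, b in combinations(majority_sets(total), 2))


def exact_halves_can_be_disjoint(total):
    """True if two nodes that each see exactly half the disks may share none."""
    halves = [set(c) for c in combinations(range(1, total + 1), total // 2)]
    return any(not (a & b) for a, b in combinations(halves, 2))


if __name__ == "__main__":
    print("3 disks, majorities overlap:", majorities_always_overlap(3))
    print("2 disks, halves can be disjoint:", exact_halves_can_be_disjoint(2))
```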

Loss of more than half your voting disks will cause the entire cluster to fail!!

HOW TO IDENTIFY THE MASTER NODE IN RAC?

Master node has the least Node-id in the cluster.

Node-ids are assigned to the nodes in the same order as the nodes join the cluster.

Hence, normally the node which joins the cluster first is the master node.

- CRSd process on the Master node is responsible to initiate the OCR backup as per the
backup policy

- Master node is also responsible to sync OCR cache across the nodes

- CRSd process on the master node reads from and writes to OCR on disk

- In case of node eviction, the cluster is divided into two sub-clusters. The sub-cluster
containing fewer nodes is evicted. But if both sub-clusters have the same number of
nodes, the sub-cluster containing the master node survives and the other sub-cluster is
evicted.
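The eviction rule in the last point can be expressed as a small decision function. This sketch encodes only the rule as stated here (larger sub-cluster wins, master node breaks a tie); the function name and the node numbers are hypothetical.

```python
def surviving_subcluster(sub_a, sub_b, master):
    """Return the sub-cluster that survives a split, following the rule in the text."""
    if len(sub_a) != len(sub_b):
        return sub_a if len(sub_a) > len(sub_b) else sub_b
    if master in sub_a:
        return sub_a
    if master in sub_b:
        return sub_b
    raise ValueError("master node is not in either sub-cluster")


if __name__ == "__main__":
    # Four-node cluster split two and two; node 1 is the master.
    print(surviving_subcluster([1, 2], [3, 4], master=1))
```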

Oracle Clusterware master information can be found:

- by scanning ocssd logs from various nodes

- by scanning crsd logs from various nodes

- by identifying the node which takes the backup of the OCR

If master node gets evicted/rebooted, another node becomes the master.

HOW TO FIND THE RESOURCE MASTER?

In RAC, every data block is mastered by an instance. Mastering a block simply means that
the master instance keeps track of the state of the block until the next reconfiguration
event.

Remastering can be triggered by:

- Manual remastering

- Resource affinity

- Instance crash

- Method – I gets info about master node from v$gcspfmaster_info using data_object_id

- Method – II gets info about master node from v$dlm_ress and v$ges_enqueue using
resource name in hexadecimal format

- Method – III gets info about master node from x$kjbl with x$le using resource name in
hexadecimal format

– CURRENT SCENARIO -

- 3 node setup

- name of the database – orcl

— SETUP –

SYS@NODE1>create table scott.emp1 as select * from scott.emp;

– Get data_object_id for scott.emp1

SQL> col owner for a10

col data_object_id for 9999999

col object_name for a15

select owner, data_object_id, object_name from dba_objects where owner = 'SCOTT' and
object_name = 'EMP1';

OWNER      DATA_OBJECT_ID OBJECT_NAME
---------- -------------- ---------------
SCOTT               74652 EMP1

For Method-II and Method-III, we need to find out file_id and block_id and hence GCS
resource name in hexadecimal format

– Get File_id and range of block_ids of emp1 table

– It can be seen that emp1 lies in block 523 of file 4.

SQL>select dbms_rowid.rowid_relative_fno(rowid) FILE_NO,

min(dbms_rowid.rowid_block_number(rowid)) MIN_BLOCK_ID,

max(dbms_rowid.rowid_block_number(rowid)) MAX_BLOCK_ID

from scott.emp1

group by dbms_rowid.rowid_relative_fno(rowid);

FILE_NO    MIN_BLOCK_ID MAX_BLOCK_ID
---------- ------------ ------------
         4          523          523

– Find the GCS resource name to be used in the query using the block_id and data_object_id
retrieved above.

x$kjbl.kjblname = resource name in hexadecimal format ([id1],[id2],[type])

x$kjbl.kjblname2 = resource name in decimal format

The hex name will be used to query the resource master using v$dlm_ress, v$ges_enqueue,
x$kjbl and x$le

SQL> col hexname for a25

col resource_name for a15

select b.kjblname hexname, b.kjblname2 resource_name

from x$le a, x$kjbl b

where a.le_kjbl=b.kjbllockp

and a.le_addr = ( select le_addr

from x$bh

where dbablk = 523

and obj = 74652

and class = 1
and state <> 3);

HEXNAME                   RESOURCE_NAME
------------------------- ---------------
[0x20b][0x4],[BL]         523,4,BL
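The hexadecimal name is simply the decimal block number and file number rendered in hex. A minimal sketch, assuming a hypothetical helper named gcs_hexname, that rebuilds the string from the values found earlier (block 523, file 4):

```shell
# Hypothetical helper: build the BL resource name from a decimal block
# number ($1) and file number ($2), in the same form as the HEXNAME
# column shown in the query output above.
gcs_hexname() {
    printf '[0x%x][0x%x],[BL]\n' "$1" "$2"
}

gcs_hexname 523 4   # prints [0x20b][0x4],[BL]
```

This is handy for building the LIKE pattern used in Method II and the kjblname filter used in Method III without first querying x$kjbl.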

– Manually master the EMP1 table to node1 –

SYS@NODE1>oradebug lkdebug -m pkey <objectid>

SYS@NODE1>oradebug lkdebug -m pkey 74652

GET RESOURCE MASTER NAME

Method – I gets info about master node from v$gcspfmaster_info using data_object_id

– Note that current master is node1 (Node numbering starts from 0)

SYS@node1>col object_name for A10

select o.object_name, m.CURRENT_MASTER

from dba_objects o, v$gcspfmaster_info m

where o.data_object_id=74652

and m.data_object_id = 74652 ;

OBJECT_NAM CURRENT_MASTER
---------- --------------
EMP1                    0

Method – II gets info about master node from v$dlm_ress and v$ges_enqueue using
resource name in hexadecimal format

– check that master node is node1 (node numbering starts with 0)

SYS@NODE1>col resource_name for a22

select a.resource_name, a.master_node

from v$dlm_ress a, v$ges_enqueue b

where upper(a.resource_name) = upper(b.resource_name1)

and a.resource_name like '%[0x20b][0x4],[BL]%';

RESOURCE_NAME          MASTER_NODE
---------------------- -----------
[0x20b][0x4],[BL]                0

Method – III gets info about master node from x$kjbl with x$le using resource name in
hexadecimal format

– This SQL joins x$kjbl with x$le to retrieve resource master for a block

– Note that current master is node1(MASTER=0)

SYS@NODE1> select kj.kjblname, kj.kjblname2, kj.kjblmaster master

from (select kjblname, kjblname2, kjblowner, kjblmaster, kjbllockp

from x$kjbl

where kjblname = '[0x20b][0x4],[BL]'

) kj, x$le le

where le.le_kjbl = kj.kjbllockp

order by le.le_addr;

KJBLNAME                       KJBLNAME2                          MASTER
------------------------------ ------------------------------ ----------
[0x20b][0x4],[BL]              523,4,BL                                0

RAC Troubleshooting, RAC LOG FILES

Cluster Ready Services Daemon (crsd) Log Files: CRS_home/log/hostname/crsd

Oracle Cluster Registry (OCR) records

The OCR tools (OCRDUMP, OCRCHECK, OCRCONFIG) record log information in the
following location:

CRS_Home/log/hostname/client

To change the amount of logging, edit the path in the CRS_home/srvm/admin/ocrlog.ini file.

The OCR server records log information in the following location:

CRS_home/log/hostname/crsd

To change the amount of logging, edit the path in the
CRS_home/log/hostname/crsd/crsd.ini file.

Oracle Process Monitor Daemon (OPROCD)


The following path is specific to Linux: /etc/oracle/hostname.oprocd.log

This path is dependent upon the installed Linux or UNIX platform.

Cluster Synchronization Services (CSS): CRS_home/log/hostname/cssd

Event Manager (EVM) information generated by evmd: CRS_home/log/hostname/evmd

Oracle RAC RACG

The Oracle RAC high availability trace files are located in the following two locations:

CRS_home/log/hostname/racg

$ORACLE_HOME/log/hostname/racg

Core files are in subdirectories of the log directory.

Each RACG executable has a subdirectory assigned exclusively for that executable.

The name of the RACG executable subdirectory is the same as the name of the executable.

Enable Threads for remaining 3 nodes

Alter database enable public thread 2;

Alter database enable public thread 3;

Alter database enable public thread 4;

shutdown database on primary node vmohswort018.

Login to below dbnodes as root user and perform mentioned steps

vmohswort022

vmohswort026

vmohswort029

cd /oracrs/oracle/product/112/bin
./crsctl start crs

./crsctl enable crs

Login to vmohswort022 server as orpwor1i osuser

SQL>startup mount;

alter system set undo_tablespace='APPS_UNDOTS2' scope=spfile sid='PWOR1I2';

alter system set undo_tablespace='APPS_UNDOTS3' scope=spfile sid='PWOR1I3';

alter system set undo_tablespace='APPS_UNDOTS4' scope=spfile sid='PWOR1I4';

alter database open;

SQL> alter database disable thread 1;

SQL> alter database enable public thread 1;

SQL> shutdown immediate;

Start DB & DB listener on all RAC nodes

RCONFIG TO CONVERT RAC

1. As the oracle user, navigate to the directory
$ORACLE_HOME/assistants/rconfig/sampleXMLs, and open the file ConvertToRAC.xml using
a text editor, such as vi.

2. Review the ConvertToRAC.xml file, and modify the parameters as required for your
system. The XML sample file contains comment lines that provide instructions for how to
configure the file.

When you have completed making changes, save the file with the syntax filename.xml.
Make a note of the name you select.

3. Navigate to the directory $ORACLE_HOME/bin, and use the following syntax to run the
command rconfig: rconfig input.xml

Where input.xml is the name of the XML input file you configured in step 2.

For example, if you create an input XML file called convert.xml, then enter the following
command:

$ ./rconfig convert.xml

Note:

The Convert verify option in the ConvertToRAC.xml file has three possible values:

Convert verify="YES": rconfig performs checks to ensure that the prerequisites for
single-instance to Oracle RAC conversion have been met before it starts conversion

Convert verify="NO": rconfig does not perform prerequisite checks, and starts conversion

Convert verify="ONLY": rconfig only performs prerequisite checks; it does not start
conversion after completing prerequisite checks
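Before running rconfig it can help to confirm which verify value the input file actually carries. The following is a sketch only: rconfig_verify_mode is a hypothetical helper, and it assumes the attribute appears in the file as Convert verify="..." exactly as written above.

```shell
# Hypothetical helper: print the value of the Convert verify attribute
# (YES, NO or ONLY) found in the rconfig input XML file passed as $1.
# Only the first matching line is reported.
rconfig_verify_mode() {
    sed -n 's/.*Convert verify="\([A-Z]*\)".*/\1/p' "$1" | head -n 1
}

# Usage (convert.xml is the example file name used above):
#   rconfig_verify_mode convert.xml
```

Running with ONLY first and switching to YES once the checks pass keeps the prerequisite check and the conversion as two separate, reviewable steps.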

If the conversion fails, then use the following procedure to recover and
reattempt the conversion:

· Attempt to delete the database using the DBCA delete database option.

· Restore the source database.

· Review the conversion log, and fix any problems it reports that may have caused the
conversion failure. The rconfig log files are under the rconfig directory in
$ORACLE_BASE/cfgtoollogs.

· Reattempt the conversion.

RESTORE VOTE DISK

root@hrdevdb2 /oracrs/app/11.2.0.3/grid/bin =>crsctl query css votedisk

Start CRS in exclusive mode

./crsctl start crs -excl

Query for voting disk

./crsctl query css votedisk


If you did not store voting disks in Oracle ASM, then run the following command using the
File Universal Identifier (FUID) obtained in the previous step:

$ crsctl delete css votedisk FUID

./crsctl delete css votedisk a4a849393fb14f4fbf92cbef0d2d215a

./crsctl delete css votedisk a8906deabdd3ff29ff0df2f382271ce0

Add a voting disk, as follows:

./crsctl add css votedisk /appvotingocrtie/vote/hhcvote1

./crsctl add css votedisk /apptmpctrltrcenv/vote/hhcvote2

./crsctl query css votedisk

root@hrdevdb2 /oracrs/app/11.2.0.3/grid/network/admin
=>/oracrs/app/11.2.0.3/grid/bin/crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- --------- ---------

1. ONLINE 70165ff16d50df98ff135ecded8bd82e (/appvotingocr/vote/hhcvote0) []

2. ONLINE 69e63cb962f84f1cbf3055292d79fe86 (/appvotingocrtie/vote/hhcvote1) []

3. ONLINE 8a844b35ed364f85bfb3ee3ab8262215 (/apptmpctrltrcenv/vote/hhcvote2) []

root@eldevdb2 /apptmpctrltrcenv/vote =>/oracrs/app/11.2.0.3/grid/bin/crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- --------- ---------

1. ONLINE 70165ff16d50df98ff135ecded8bd82e (/appvotingocr/vote/hhcvote0) []

2. ONLINE 69e63cb962f84f1cbf3055292d79fe86 (/appvotingocrtie/vote/hhcvote1) []

3. ONLINE 8a844b35ed364f85bfb3ee3ab8262215 (/apptmpctrltrcenv/vote/hhcvote2) []

Located 3 voting disk(s).

root@eldevdb2 /apptmpctrltrcenv/vote =>/oracrs/app/11.2.0.3/grid/bin/ocrcheck


Status of Oracle Cluster Registry is as follows :

Version : 3

Total space (kbytes) : 262120

Used space (kbytes) : 8916

Available space (kbytes) : 253204

ID : 38393080

Device/File Name : /appvotingocr/ocr/hhcocr

Device/File integrity check succeeded

Device/File Name : /appvotingocrtie/ocr/hhcmirrocr

Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

root@hrdevdb2 /oracrs/app/11.2.0.3/grid/network/admin
=>/oracrs/app/11.2.0.3/grid/bin/ocrcheck

Status of Oracle Cluster Registry is as follows :

Version : 3

Total space (kbytes) : 262120

Used space (kbytes) : 8916

Available space (kbytes) : 253204

ID : 38393080

Device/File Name : /appvotingocr/ocr/hhcocr


Device/File integrity check succeeded

Device/File Name : /appvotingocrtie/ocr/hhcmirrocr

Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

A current read is one where a session reads the current value of the data block from
another instance’s Data Buffer Cache. This current value contains the most up-to-date
committed data. The current read would happen when a second instance needs a data block
that has not been changed. This is often thought of as a read/read situation. The current
read will be seen as any wait event that starts with gc current.

Consistent Read

A consistent read is needed when a particular block is being accessed/modified by
transaction T1 and at the same time another transaction T2 tries to access/read the block.
If T1 has not been committed, T2 needs a consistent read (consistent to the non-modified
state of the database) copy of the block to move ahead. A CR copy is created using the
UNDO data for that block. A sample series of steps for a CR in a normal setup would be:

1. Process tries to read a data block

2. Finds an active transaction in the block

3. Then checks the UNDO segment to see if the transaction has been committed or not

4. If the transaction has been committed, it creates the REDO records and reads the
block
5. If the transaction has not been committed, it creates a CR block for itself using the
UNDO/ROLLBACK information.

Creating a CR image in RAC is a bit different and can come with some I/O overhead.
This is because the UNDO could be spread across instances, so to build a CR copy of
the block the instance might have to visit UNDO segments on other instances and
perform extra I/O.

For the full version, see http://www.dba-oracle.com/t_gupta_oracle_rac_cache_fusion.htm

If the voting disk and OCR reside in ASM disk groups, but OCSSD starts before ASM in the
startup sequence, how can OCSSD start?

You might wonder how CSSD, which is required to start the clustered ASM instance, can be
started if voting disks are stored in ASM? This sounds like a chicken-and-egg problem:
without access to the voting disks there is no CSS, hence the node cannot join the cluster.
But without being part of the cluster, CSSD cannot start the ASM instance. To solve this
problem the ASM disk headers have new metadata in 11.2: you can use kfed to read the
header of an ASM disk containing a voting disk. The kfdhdb.vfstart and kfdhdb.vfend fields
tell CSS where to find the voting file. This does not require the ASM instance to be up. Once
the voting disks are located, CSS can access them and joins the cluster.

What is cache fusion?

In a RAC environment, it is the combining of data blocks, which are shipped across the
interconnect from remote database caches (SGA) to the local node, in order to fulfill the
requirements for a transaction (DML, Query of Data Dictionary).

What is split brain?

When database nodes in a cluster are unable to communicate with each other, they may
continue to process and modify the data blocks independently. If the
same block is modified by more than one instance, synchronization/locking of the data
blocks does not take place and blocks may be overwritten by others in the cluster. This
state is called split brain.

What is the difference between Crash recovery and Instance recovery?

When an instance crashes in a single-node database, crash recovery takes place on
startup. In a RAC environment the same recovery for a failed instance is performed by a
surviving node; this is called instance recovery.

What is the interconnect used for?

It is a private network which is used to ship data blocks from one instance to another for
cache fusion. The physical data blocks as well as data dictionary blocks are shared across
this interconnect.

How do you determine what protocol is being used for Interconnect traffic?

One of the ways is to look at the database alert log for the time period when the database
was started up.

What methods are available to keep the time synchronized on all nodes in the cluster?

Either the Network Time Protocol (NTP) can be configured or, in 11gR2, the Cluster Time
Synchronization Service (CTSS) can be used.

What file components in RAC must reside on shared storage?

Spfiles, control files, datafiles and redo log files should be created on shared storage.

Where does the Clusterware write when there is a network or storage missed heartbeat?

The network ping failure is written in $CRS_HOME/log

How do you find out what OCR backups are available?

The ocrconfig -showbackup can be run to find out the automatic and manually run backups.

If your OCR is corrupted, what options do you have to resolve this?

You can use either the logical or the physical OCR backup copy to restore the Repository.

How do you find out what object has its blocks being shipped across the instance the most?

You can query the DBA_HIST_SEG_STAT view.

What is a VIP in RAC used for?

The VIP is an alternate Virtual IP address assigned to each node in a cluster. During a node
failure the VIP of the failed node moves to the surviving node and relays to the application
that the node has gone down. Without VIP, the application will wait for TCP timeout and
then find out that the session is no longer live due to the failure.

How do we know which database instances are part of a RAC cluster?

You can query the V$ACTIVE_INSTANCES view to determine the member instances of the
RAC cluster.

What is OCLUMON used for in a cluster environment?

OCLUMON is the command-line tool for querying the Cluster Health Monitor (CHM). CHM stores
operating system metrics in the CHM repository for all nodes in a RAC cluster. It stores
information on CPU, memory, process, network and other OS data. This information can later
be retrieved and used to troubleshoot and identify any cluster-related issues. It is a
default component of the 11gR2 grid install. The data is stored in the master repository
and replicated to a standby repository on a different node.

What would be the possible performance impact in a cluster if a less powerful node (e.g.
slower CPU’s) is added to the cluster?

All processing will slow down to the CPU speed of the slowest server.

What is the purpose of OLR?

The Oracle Local Registry (OLR) contains information that allows the cluster processes to be
started when the OCR is in the ASM storage system. Since ASM is unavailable until the Grid
processes are started, a local copy of the required contents of the OCR is stored in the
OLR.

What is the default memory allocation for ASM?

In 10g the default SGA size is 1G, in 11g it is set to 256M, and in 12c ASM it is set back to
1G.

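How do you back up ASM metadata?

You can use the ASMCMD md_backup command to back up the ASM disk group metadata. The backup
can then be used with md_restore to recreate the disk group configuration in case of ASM
disk group storage loss.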

What files can be stored in the ASM diskgroup?

In 11g the following files can be stored in ASM diskgroups.

Datafiles

Redo logfiles

Spfiles

In 12c the file below can also now be stored in an ASM disk group

Password file

What is the ASM POWER_LIMIT?

This is the parameter that controls the speed of a rebalance operation; a higher value lets
the ASM instance move more allocation units at any given time. The default value is 1. In
ASM versions earlier than 11.2.0.2 the maximum value is 11; from 11.2.0.2 onward the
maximum is 1024.

What is a rolling upgrade?

A patch is considered rolling if it can be applied to the cluster binaries without having to
shut down the whole database in a RAC environment. All nodes in the cluster are patched in a
rolling manner, one by one, with only the node being patched unavailable while all
other instances remain open.

What are some of the RAC specific parameters?


Some of the RAC parameters are:

CLUSTER_DATABASE

CLUSTER_DATABASE_INSTANCES

INSTANCE_TYPE (RDBMS or ASM)

ACTIVE_INSTANCE_COUNT

UNDO_MANAGEMENT

What is the future of the Oracle Grid?

The Grid software is becoming more and more capable of not just supporting HA for Oracle
Databases but also other applications including Oracle’s applications. With 12c there are
more features and functionality built-in and it is easier to deploy these pre-built solutions,
available for common Oracle applications.

What components of the Grid should I back up?

The backups should include OLR, OCR and ASM Metadata.

Is there an easy way to verify the inventory for all remote nodes?

You can run the opatch lsinventory -all_nodes command from a single node to look at the
inventory details for all nodes in the cluster.

What is gsdctl in RAC? list gsdctl commands in Oracle RAC?

GSDCTL stands for Global Service Daemon Control. We can use gsdctl commands to start,
stop, and obtain the status of the GSD service on any platform.

The options for gsdctl are:-

$ gsdctl start -- To start the GSD service

$ gsdctl stop -- To stop the GSD service

$ gsdctl stat -- To obtain the status of the GSD service

Log file location for gsdctl:

$ORACLE_HOME/srvm/log/gsdaemon_node_name.log

What is RAC?

RAC stands for Real Application Clusters.

It is a clustering solution from Oracle Corporation that ensures high availability of databases
by providing instance failover and media failover features.

Oracle RAC is a cluster database with a shared cache architecture that overcomes the
limitations of traditional shared-nothing and shared-disk approaches to provide a highly
scalable and available database solution for all the business applications.

Oracle RAC provides the foundation for enterprise grid computing.

What is Oracle RAC One Node?

Oracle RAC One Node is a single instance running on one node of the cluster while the second
node is in cold standby mode. If the instance fails for some reason, RAC One Node
detects it and restarts the instance on the same node, or relocates the instance to the
second node in case there is a failure or fault on the first node. The benefit of this
feature is that it provides a cold failover solution and automates the instance relocation
without any downtime and without manual intervention. Oracle introduced this feature with
the release of 11gR2 (available with Enterprise Edition).

What is RAC and how is it different from non RAC databases?

Oracle Real Application clusters allows multiple instances to access a single database, the
instances will be running on multiple nodes.

In Real Application Clusters environments, all nodes concurrently execute transactions
against the same database.

Real Application Clusters coordinates each node's access to the shared data to provide
consistency and integrity.

What are the advantages of RAC (Real Application Clusters)?

Reliability - if one node fails, the database won't fail

Availability - nodes can be added or replaced without having to shutdown the database

Scalability - more nodes can be added to the cluster as the workload increases

What is Cache Fusion?

Oracle RAC is composed of two or more instances. When a block of data is read from a
datafile by an instance within the cluster and another instance needs the same block,
it is easier to get the block image from the instance which has the block in its SGA than
to read it from disk. To enable inter-instance communication, Oracle RAC makes use
of interconnects. The Global Enqueue Service (GES) monitor and the instance enqueue process
manage cache fusion.

What command would you use to check the availability of the RAC system?

crs_stat -t -v (-t -v are optional)


How do we verify that RAC instances are running?

SQL>select * from V$ACTIVE_INSTANCES;

The query gives the instance number under the INST_NUMBER column and host:instance_name
under the INST_NAME column.

How can you connect to a specific node in a RAC environment?

In tnsnames.ora, ensure that INSTANCE_NAME is specified in the connect descriptor for that instance.

Which is the "MASTER NODE" in RAC?

The node with the lowest node number will become master node and dynamic remastering
of the resources will take place.

To find out the master node for particular resource, you can query v$ges_resource for
MASTER_NODE column.

To find out which is the master node, you can see ocssd.log file and search for "master node
number".

When the current master node fails, the surviving node with the lowest node number becomes
the master node.
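The ocssd.log search mentioned above can be wrapped in a one-line helper. This is a sketch under assumptions: find_master_line is a hypothetical name, the search phrase is the one quoted in the text, and the exact wording of the log line may differ between versions.

```shell
# Hypothetical helper: print the most recent line in a CSS daemon log
# ($1 = path to the log file) that mentions the master node number.
# The match is case-insensitive.
find_master_line() {
    grep -i "master node number" "$1" | tail -n 1
}

# Usage, run on each node (the cssd log directory is listed in the
# RAC LOG FILES section):
#   find_master_line CRS_home/log/hostname/cssd/ocssd.log
```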

What components in RAC must reside in shared storage?

All datafiles, control files, SPFILEs and redo log files must reside on cluster-aware shared storage.

Give few examples for solutions that support cluster storage?

·ASM (automatic storage management),

·Raw disk devices,

·Network file system (NFS),

·OCFS2 and

·OCFS (Oracle Cluster File System).

What are Oracle Cluster Components?

1.Cluster Interconnect (HAIP)


2.Shared Storage (OCR/Voting Disk)

3.Clusterware software

4.Oracle Kernel Components

What are Oracle RAC Components?

VIP, Node apps etc.

What are Oracle Kernel Components?

The Oracle kernel needs to be relinked with the RAC option switched on when you convert to
RAC; this enables the RAC background processes such as LMON, LCK, LMD and LMS.

How to turn on RAC?

# link the oracle libraries

$ cd $ORACLE_HOME/rdbms/lib

$ make -f ins_rdbms.mk rac_on

# rebuild oracle

$ cd $ORACLE_HOME/bin

$ relink oracle

Disk architecture in RAC?

SAN (Storage Area Networks) - generally using fibre to connect to the SAN

NAS (Network Attached Storage) - generally using a network to connect to the NAS using
either NFS or iSCSI

What is Oracle Clusterware?

The Clusterware software allows nodes to communicate with each other and forms the
cluster that makes the nodes work as a single logical server.

The software is run by the Cluster Ready Services (CRS) using the Oracle Cluster Registry
(OCR), which records and maintains the cluster and node membership information, and the
voting disk, which acts as a tiebreaker during communication failures. Consistent heartbeat
information travels across the interconnect to the voting disk when the cluster is running.

Real Application Clusters

Oracle RAC is a cluster database with a shared cache architecture that overcomes the
limitations of traditional shared-nothing and shared-disk approaches to provide a highly
scalable and available database solution for all your business applications. Oracle RAC
provides the foundation for enterprise grid computing.

Oracle’s Real Application Clusters (RAC) option supports the transparent deployment of a
single database across a cluster of servers, providing fault tolerance from hardware failures
or planned outages. Oracle RAC running on clusters provides Oracle’s highest level of
capability in terms of availability, scalability, and low-cost computing.

One database is opened by multiple instances, so the database remains highly available if an
instance crashes.

Cluster Software. Oracle Clusterware or products like Veritas Volume Manager are required
to provide the cluster support and allow each node to know which nodes belong to the
cluster and are available, and with Oracle Clusterware to know which nodes have failed and to
eject them from the cluster, so that errors on that node can be cleared.

Oracle Clusterware has two key components Cluster Registry OCR and Voting Disk.

The cluster registry holds all information about nodes, instances, services and ASM storage
if used. It also contains state information, i.e. whether they are available and up.

The voting disk is used to determine if a node has failed, i.e. become separated from the
majority. If a node is deemed to no longer belong to the majority then it is forcibly rebooted
and, after the reboot, adds itself again to the surviving cluster nodes.

What are the Oracle Clusterware key components?

Oracle Clusterware has two key components Cluster Registry OCR and Voting Disk.

What is Voting Disk and OCR?


Voting Disk

Oracle RAC uses the voting disk to manage cluster membership by way of a health check
and arbitrates cluster ownership among the instances in case of network failures. The voting
disk must reside on shared disk.

A node must be able to access more than half of the voting disks at any time.

For example, if you have 3 voting disks configured, then a node must be able to access at
least two of the voting disks at any time. If a node cannot access the minimum required
number of voting disks it is evicted, or removed, from the cluster.

Oracle Cluster Registry (OCR)

The cluster registry holds all information about nodes, instances, services and ASM storage
if used. It also contains state information, i.e. whether they are available and up.

The OCR must reside on shared disk that is accessible by all of the nodes in your cluster.

What are the administrative tasks involved with voting disk?

Following administrative tasks are performed with the voting disk :

1) Backing up voting disks

2) Recovering Voting disks

3) Adding voting disks

4) Deleting voting disks

5) Moving voting disks

Can you add a voting disk online? Do you need a voting disk backup?

Yes. As per the documentation, if you have multiple voting disks you can add one online. If
you have only one voting disk and it is lost, the cluster will be down; you then need to
start CRS in exclusive mode and add the voting disk using

crsctl add css votedisk

What is the Oracle Recommendation for backing up voting disk?

Oracle recommends using the dd command to back up the voting disk with a minimum
block size of 4KB.

How do we back up voting disks?

1) Oracle recommends that you back up your voting disk after the initial cluster creation
and after we complete any node addition or deletion procedures.

2) First, as root user, stop Oracle Clusterware (with the crsctl stop crs command) on all
nodes. Then, determine the current voting disk by issuing the following command:

crsctl query css votedisk

3) Then, issue the dd or ocopy command to back up a voting disk, as appropriate.

Give the syntax of backing up voting disks:-

On Linux or UNIX systems:

dd if=voting_disk_name of=backup_file_name

where,

voting_disk_name is the name of the active voting disk

backup_file_name is the name of the file to which we want to back up the voting disk
contents

On Windows systems, use the ocopy command:

ocopy voting_disk_name backup_file_name
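For the Linux or UNIX case, the dd syntax can be combined with the 4 KB block size recommended earlier and a simple check that the copy matches. A minimal sketch; backup_votedisk is a hypothetical helper name and both paths are supplied by the caller:

```shell
# Hypothetical helper: copy a voting disk ($1) to a backup file ($2)
# with dd using a 4 KB block size, then compare the two byte for byte.
# Returns 0 only if dd succeeded and the copy is identical.
backup_votedisk() {
    src=$1
    dest=$2
    dd if="$src" of="$dest" bs=4k 2>/dev/null && cmp -s "$src" "$dest"
}

# Usage (run as root with Oracle Clusterware stopped, as described above):
#   backup_votedisk voting_disk_name backup_file_name
```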

How do we verify an existing current backup of OCR?

We can verify the current backup of OCR using the following command: ocrconfig -showbackup

You have lost OCR disk, what is your next step?

In 10g, the cluster stack will be down because cssd is unable to maintain integrity.
From 11gR2 onwards, the crsd stack will be down but ohasd remains up and
running. You can bring the OCR back by restoring an automatic backup or importing a manual
backup.

What are the major RAC wait events?

In a RAC environment the buffer cache is global across all instances in the cluster and hence
the processing differs. The most common wait events related to this are gc cr request and gc
buffer busy.

GC CR request: the time it takes to retrieve the data from the remote cache

Reason: RAC Traffic Using Slow Connection or Inefficient queries (poorly tuned queries will
increase the amount of data blocks requested by an Oracle session. The more blocks
requested typically means the more often a block will need to be read from a remote
instance via the interconnect.)

GC BUFFER BUSY: It is the time the remote instance locally spends accessing the requested
data block.

How do you troubleshoot node reboot?

Please check metalink ...

Note 265769.1 Troubleshooting CRS Reboots

Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle
Clusterware Node evictions.

Srvctl cannot start instance, I get the following error PRKP-1001 CRS-0215, however sqlplus
can start it on both nodes? How do you identify the problem?

Set the environment variable SRVM_TRACE to true, then start the instance with srvctl.
You will now get a detailed error stack.
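As a sketch of that step, the tracing variable and the srvctl call can be wrapped together. start_instance_traced is a hypothetical name, and the database and instance names are passed in by the caller:

```shell
# Hypothetical helper: enable srvctl tracing for this shell session and
# retry the instance start so the detailed error stack is printed.
# $1 = database name, $2 = instance name.
start_instance_traced() {
    export SRVM_TRACE=true
    srvctl start instance -d "$1" -i "$2"
}

# Usage (orcl is the database name used in the examples above; the
# instance name orcl1 is an assumed placeholder):
#   start_instance_traced orcl orcl1
```

Unset SRVM_TRACE afterwards, since it makes every srvctl command in the session verbose.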

What are Oracle Clusterware processes for 10g on Unix and Linux?

Cluster Synchronization Services (ocssd): Manages cluster node membership and runs as
the oracle user; failure of this process results in cluster restart.

Cluster Ready Services (crsd): The crs process manages cluster resources (which could be
a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application
process, and so on) based on the resource's configuration information that is stored in the
OCR. This includes start, stop, monitor and failover operations. This process runs as the root
user.

Event manager daemon (evmd): A background process that publishes events that crs
creates.

Process Monitor Daemon (OPROCD): This process monitors the cluster and provides I/O
fencing. OPROCD performs its check, stops running, and if the wake up is beyond the
expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure
results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on
Linux platforms.

RACG (racgmain, racgimon): Extends clusterware to support Oracle-specific requirements
and complex resources. Runs server callout scripts when FAN events occur.

What are Oracle database background processes specific to RAC?

Oracle RAC is composed of two or more database instances. Each instance has the same memory
structures and background processes as a single-instance database. Oracle RAC
instances use two processes, GES (Global Enqueue Service) and GCS (Global Cache Service), that
enable Cache Fusion. Oracle RAC instances also have the following background processes:

ACMS—Atomic Controlfile to Memory Service (ACMS)

GTX0-j—Global Transaction Process

LMON—Global Enqueue Service Monitor

LMD—Global Enqueue Service Daemon

LMS—Global Cache Service Process

LCK0—Instance Enqueue Process

RMSn—Oracle RAC Management Processes (RMSn)

RSMN—Remote Slave Monitor

To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy
a query or transaction, Oracle RAC instances use two processes, the Global Cache Service
(GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the
statuses of each data file and each cached block using a Global Resource Directory (GRD).
The GRD contents are distributed across all of the active instances.

What is GRD?

GRD stands for Global Resource Directory. The GES and GCS maintain records of the
status of each data file and each cached block using the Global Resource Directory. This process
is referred to as Cache Fusion and helps maintain data integrity.

What is ACMS?

ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment, ACMS is
an agent that ensures a distributed SGA memory update, i.e. SGA updates are globally
committed on success or globally aborted in the event of a failure.

What is SCAN listener?

A SCAN listener is an additional listener, alongside the node listeners, that listens for
incoming database connection requests from clients that arrive through the SCAN IP. It has
endpoints configured for the node listeners and routes each connection request to a particular
node listener.

The SCAN IP can be disabled if it is not required. However, the SCAN IP is mandatory during RAC
installation. Enabling and disabling the SCAN IP is mostly done in Oracle Apps environments for
the concurrent manager (a kind of job scheduler in Oracle Apps).

Steps to disable the SCAN IP:

i. Do not use the SCAN IP at the client end.

ii. Stop the SCAN listener:

srvctl stop scan_listener

iii. Stop the SCAN (this also stops the SCAN VIPs):

srvctl stop scan

iv. Disable the SCAN and the SCAN listener:

srvctl disable scan

srvctl disable scan_listener

What are the different network components in 10g RAC?

Public, private, and VIP components.

Private interfaces are for inter-node communication.

The VIP is all about application availability. When a node fails, its VIP fails
over to another node. For this reason all applications should connect through the VIPs,
which means the TNS entries should list the VIP addresses in the host list.

What is an interconnect network?

An interconnect network is a private network that connects all of the servers in a cluster.
The interconnect network uses a switch/multiple switches that only the nodes in the cluster
can access.

What is the use of cluster interconnect?


Cluster interconnect is used by the Cache fusion for inter instance communication.

How can we configure the cluster interconnect?

· Configure User Datagram Protocol (UDP) on Gigabit Ethernet for cluster interconnects.

· On UNIX and Linux systems, Oracle Clusterware uses the UDP and RDS (Reliable Datagram
Sockets) protocols.

· Windows clusters use the TCP protocol.

What is the purpose of Private Interconnect?

Clusterware uses the private interconnect for cluster synchronization (network heartbeat)
and daemon communication between the clustered nodes. This communication is based
on the TCP protocol.

RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP).
Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches
of participating nodes in the cluster.

What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connections use instead
of the standard public IP address. To configure VIP address, we need to reserve a spare IP
address for each node, and the IP addresses must use the same subnet as the public
network.

What is the use of VIP?

If a node fails, then the node's VIP address fails over to another node on which the VIP
address can accept TCP connections but it cannot accept Oracle connections.

Why do we have a Virtual IP (VIP) in Oracle RAC?

Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP
timeout period (which can be up to 10 min) before getting an error. As a result, you don't
really have a good HA solution without using VIPs.

When a node fails, the VIP associated with it is automatically failed over to some other node
and new node re-arps the world indicating a new MAC address for the IP. Subsequent
packets sent to the VIP go to the new node, which will send error RST packets back to the
clients. This results in the clients getting errors immediately.

Give situations under which VIP address failover happens?

VIP address failover happens when the node on which the VIP address runs fails, when all
interfaces for the VIP address fail, or when all interfaces for the VIP address are
disconnected from the network.

What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the VIP address
receive a rapid connection refused error. They don't have to wait for TCP connection timeout
messages.

What is the use of a service in Oracle RAC environment?

Applications should use the services feature to connect to the Oracle database. Services
enable us to define rules and characteristics to control how users and applications connect
to database instances.

What are the characteristics controlled by Oracle services feature?

The characteristics include a unique name, workload balancing, failover options, and high
availability.

What enables the load balancing of applications in RAC?

Oracle Net Services enable the load balancing of application connections across all of the
instances in an Oracle RAC database.

What are the types of connection load-balancing?

Connection workload management is one of the key aspects of running RAC instances, because
you want to distribute connections to specific nodes or instances, or to those with less load.

There are two types of connection load-balancing:

1.Client Side load balancing (also called as connect time load balancing)

2.Server side load balancing (also called as Listener connection load balancing)

What is the difference between server-side and client-side connection load balancing?

Client-side load balancing happens at the client, which spreads its connection requests across
the listeners. In server-side load balancing, the listener uses the load-balancing advisory to
redirect connections to the instance providing the best service.

Client-side load balancing: Oracle's client-side load balancing feature enables clients to
randomize their connection requests among all the available listeners.

A TNS entry that contains the addresses of all nodes and uses load_balance=on (it is on by
default) uses connect-time load balancing, also called client-side load balancing.

Sample Client Side TNS Entry:-

finance =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = myrac2-vip)(PORT = 2042))
    (ADDRESS = (PROTOCOL = TCP)(HOST = myrac1-vip)(PORT = 2042))
    (ADDRESS = (PROTOCOL = TCP)(HOST = myrac3-vip)(PORT = 2042))
    (LOAD_BALANCE = yes)
    (FAILOVER = ON)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = FINANCE)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))
    )
  )
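
The effect of LOAD_BALANCE and FAILOVER in the sample entry can be pictured with a minimal
Python sketch. It is an illustration only, not Oracle Net code: the function names and the
try_connect callback are hypothetical, and the host names and port are copied from the sample
entry.

```python
import random

# Addresses from the sample "finance" entry: (host, port).
ADDRESSES = [("myrac2-vip", 2042), ("myrac1-vip", 2042), ("myrac3-vip", 2042)]


def pick_address_order(addresses, rng=random):
    """Return the addresses in random order (the idea behind LOAD_BALANCE = yes)."""
    order = list(addresses)
    rng.shuffle(order)
    return order


def connect(addresses, try_connect, rng=random):
    """Try each address in random order until one accepts (the idea behind FAILOVER = ON).

    try_connect is a hypothetical callback that returns True when the
    listener at (host, port) accepts the connection.
    """
    for host, port in pick_address_order(addresses, rng):
        if try_connect(host, port):
            return host
    raise ConnectionError("no listener reachable")
```

Because the order is random for every connection attempt, a large number of clients ends up
spread roughly evenly across the listed listeners without any knowledge of server load.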

Server-side load balancing: This improves connection performance by balancing the
number of active connections among multiple instances and dispatchers. In a single-instance
environment (shared servers), the listener selects the least loaded dispatcher to handle
the incoming client requests. In a RAC environment, PMON on each instance registers the load
of its instance and dispatchers with the listeners, and depending on that load information
the listener redirects the connection to the least loaded node.

In a RAC environment, the *.remote_listener parameter, which names a TNS entry containing
the addresses of all nodes, must be set so that PMON can send these load updates to the
listeners on every node.

Sample settings for an instance of the RAC cluster:

local_listener=LISTENER_MYRAC1

remote_listener = LISTENERS_MYRACDB
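
The selection step of server-side load balancing can be sketched in a few lines of Python.
This is a simplified model under stated assumptions, not the listener's real algorithm: the
instance names and load figures are invented for the example, and a single number stands in
for the load information that the instances register.

```python
def least_loaded(instance_load):
    """Return the name of the instance with the lowest reported load."""
    if not instance_load:
        raise ValueError("no instances registered")
    return min(instance_load, key=instance_load.get)


# Hypothetical load figures, as the instances might have registered them.
loads = {"MYRAC1": 0.72, "MYRAC2": 0.31, "MYRAC3": 0.55}
print(least_loaded(loads))  # MYRAC2
```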

What are the administrative tools used for Oracle RAC environments?

An Oracle RAC cluster can be administered as a single image using the following tools:

· OEM (Enterprise Manager),

· SQL*PLUS,

· Server control (SRVCTL),

· Cluster Verification Utility (CLUVFY),

· DBCA,

· NETCA

Name some Oracle Clusterware tools and their uses?

·OIFCFG - Allocating and deallocating network interfaces.

·OCRCONFIG - Command-line tool for managing Oracle Cluster Registry.

·OCRDUMP - Dumps the contents of the Oracle Cluster Registry; the dump can be used to
identify the interconnect being used.

·CVU - Cluster Verification Utility, used to verify the configuration and status of cluster
components.

What is the difference between CRSCTL and SRVCTL?

crsctl manages clusterware-related operations:

Starting and stopping Oracle Clusterware

Enabling and disabling Oracle Clusterware daemons

Registering cluster resources


srvctl manages Oracle resource–related operations:

Starting and stopping database instances and services

From 11gR2, it also manages cluster resources such as the network, VIPs, and disks.

How do we remove ASM from an Oracle RAC environment?

We first need to stop and delete the instance on the node, in interactive or silent mode.
After that, ASM can be removed using the srvctl tool as follows:

srvctl stop asm -n node_name

srvctl remove asm -n node_name

We can verify if ASM has been removed by issuing the following command:

srvctl config asm -n node_name

How do we verify that an instance has been removed from OCR after deleting an instance?

Issue the following srvctl command:

srvctl config database -d database_name

cd CRS_HOME/bin

./crs_stat

What are the modes of deleting instances from Oracle Real Application Clusters databases?

We can delete instances in silent mode or interactive mode using DBCA (Database
Configuration Assistant).

What are the background processes that exist in 11gR2, and what do they do?

Process Name - Functionality

crsd - The CRS daemon (crsd) manages cluster resources based on configuration
information that is stored in Oracle Cluster Registry (OCR) for each resource. This includes
start, stop, monitor, and failover operations. The crsd process generates events when the
status of a resource changes.

cssd - Cluster Synchronization Service (CSS): Manages the cluster configuration by
controlling which nodes are members of the cluster and by notifying members when a node
joins or leaves the cluster. If you are using certified third-party clusterware, then CSS
processes interface with your clusterware to manage node membership information. CSS
has three separate processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and
the CSS Monitor (cssdmonitor). The cssdagent process monitors the cluster and provides
input/output fencing. This service formerly was provided by Oracle Process Monitor daemon
(oprocd), also known as OraFenceService on Windows. A cssdagent failure results in Oracle
Clusterware restarting the node.

diskmon - Disk Monitor daemon (diskmon): Monitors and performs input/output fencing
for Oracle Exadata Storage Server. As Exadata storage can be added to any Oracle RAC
node at any point in time, the diskmon daemon is always started when ocssd is started.

evmd - Event Manager (EVM): Is a background process that publishes Oracle Clusterware
events.

mdnsd - Multicast domain name service (mDNS): Allows DNS requests. The mDNS
process is a background process on Linux and UNIX, and a service on Windows.

gnsd - Oracle Grid Naming Service (GNS): Is a gateway between the cluster mDNS and
external DNS servers. The GNS process performs name resolution within the cluster.

ons - Oracle Notification Service (ONS): Is a publish-and-subscribe service for
communicating Fast Application Notification (FAN) events.

oraagent - oraagent: Extends clusterware to support Oracle-specific requirements and
complex resources. It runs server callout scripts when FAN events occur. This process was
known as RACG in Oracle Clusterware 11g Release 1 (11.1).

orarootagent - Oracle root agent (orarootagent): Is a specialized oraagent process that
helps CRSD manage resources owned by root, such as the network and the Grid virtual IP
address.

oclskd - Cluster kill daemon (oclskd): Handles instance/node eviction requests that have
been escalated to CSS.

gipcd - Grid IPC daemon (gipcd): Is a helper daemon for the communications
infrastructure.

ctssd - Cluster Time Synchronization daemon (ctssd): Manages time synchronization
between nodes, rather than depending on NTP.

Under which user or owner does each process start?

Component                                     Name of the Process             Owner

Oracle High Availability Service              ohasd                           init, root
Cluster Ready Service (CRS)                   crsd                            root
Cluster Synchronization Service (CSS)         ocssd, cssdmonitor, cssdagent   grid owner
Event Manager (EVM)                           evmd, evmlogger                 grid owner
Cluster Time Synchronization Service (CTSS)   octssd                          root
Oracle Notification Service (ONS)             ons, eons                       grid owner
Oracle Agent                                  oraagent                        grid owner
Oracle Root Agent                             orarootagent                    root
Grid Naming Service (GNS)                     gnsd                            root
Grid Plug and Play (GPnP)                     gpnpd                           grid owner
Multicast domain name service (mDNS)          mdnsd                           grid owner

What is the major difference between 10g and 11g RAC?

There is not much difference between 10g and 11gR1 RAC, but there is a significant
difference in 11gR2.

Prior to 11gR2 (in 10g and 11gR1 RAC), the following were managed by Oracle CRS:

Databases

Instances

Applications

Node Monitoring

Event Services

High Availability

From 11gR2 onwards, it is a complete HA stack that manages and provides the following,
like other cluster software such as VCS:

Databases

Instances

Applications

Cluster Management

Node Management

Event Services

High Availability

Network Management (provides DNS/GNS/MDNSD services on behalf of other traditional
services), SCAN (Single Client Access Name), and HAIP

Storage Management (with the help of ASM and the new ACFS file system)

Time synchronization (rather than depending on traditional NTP)

Removal of the OS-dependent hang checker and similar components, replaced by Clusterware's
own monitor processes

What is hangcheck timer?

The hangcheck timer regularly checks the health of the system. If the system hangs or stops,
the node is restarted automatically.

There are 2 key parameters for this module:

-> hangcheck-tick: this parameter defines the period of time between checks of system
health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.

-> hangcheck-margin: this defines the maximum hang delay that should be tolerated before
hangcheck-timer resets the RAC node.
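
How the two parameters interact can be shown with a small Python sketch. It assumes that the
node is reset when the timer wakes up more than tick plus margin seconds after its previous
wakeup; that rule and the margin value of 180 seconds are assumptions made for this example,
not values stated above.

```python
def should_reset(last_wakeup, now, tick, margin):
    """True if the timer woke up later than tick + margin seconds after the previous wakeup."""
    return (now - last_wakeup) > (tick + margin)


TICK = 30     # seconds between checks (the recommended value quoted above)
MARGIN = 180  # assumed example value for hangcheck-margin

print(should_reset(0, 35, TICK, MARGIN))   # False: woke up 5 seconds late, within the margin
print(should_reset(0, 240, TICK, MARGIN))  # True: woke up 210 seconds late, beyond the margin
```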

State the initialization parameters that must have same value for every instance in an
Oracle RAC database?

Some initialization parameters are critical at database creation time and must have the
same values on every instance. Their values must be specified in the SPFILE or PFILE for
every instance. The parameters that must be identical on every instance are listed below:

ACTIVE_INSTANCE_COUNT

ARCHIVE_LAG_TARGET

COMPATIBLE

CLUSTER_DATABASE

CLUSTER_DATABASE_INSTANCES

CONTROL_FILES

DB_BLOCK_SIZE

DB_DOMAIN

DB_FILES

DB_NAME

DB_RECOVERY_FILE_DEST

DB_RECOVERY_FILE_DEST_SIZE

DB_UNIQUE_NAME

INSTANCE_TYPE (RDBMS or ASM)

PARALLEL_MAX_SERVERS

REMOTE_LOGIN_PASSWORDFILE

UNDO_MANAGEMENT

---------------------------------------------------------------------------------------------

What is RAC? What is the benefit of RAC over single instance database?

In Real Application Clusters environments, all nodes concurrently execute transactions
against the same database. Real Application Clusters coordinates each node's access to the
shared data to provide consistency and integrity.

Benefits:

Improve response time

Improve throughput

High availability

Transparency

Advantages of RAC (Real Application Clusters)

Reliability - if one node fails, the database won't fail

Availability - nodes can be added or replaced without having to shutdown the database

Scalability - more nodes can be added to the cluster as the workload increases

Give situations under which VIP address failover happens:-

VIP address failover happens when the node on which the VIP address runs fails, when all
interfaces for the VIP address fail, or when all interfaces for the VIP address are
disconnected from the network.

Using virtual IPs avoids the TCP/IP timeout problem, because Oracle Notification Service
maintains communication between the nodes and listeners.

What is voting disk?

Voting Disk is a file that sits in the shared storage area and must be accessible by all nodes
in the cluster. All nodes in the cluster register their heartbeat information in the voting
disk to confirm that they are operational. If the heartbeat information of any node is not
available in the voting disk, that node is evicted from the cluster. The CSS (Cluster
Synchronization Service) daemon in the clusterware maintains the heartbeat of all nodes to
the voting disk. When a node is not able to send its heartbeat to the voting disk, it
reboots itself, which helps avoid the split-brain syndrome.

For high availability, Oracle recommends an odd number of voting disks, with a minimum of
three.

The voting disk is a file that resides on shared storage and manages cluster membership. The
voting disk reassigns cluster ownership between the nodes in case of failure.

The voting disk files are used by Oracle Clusterware to determine which nodes are
currently members of the cluster. The voting disk files are also used in concert with other
cluster components such as CRS to maintain the cluster's integrity.

Oracle Database 11g Release 2 provides the ability to store the voting disks in ASM along
with the OCR. Oracle Clusterware can access the OCR and the voting disks present in ASM
even if the ASM instance is down. As a result CSS can continue to maintain the Oracle
cluster even if the ASM instance has failed.

How many voting disks are you maintaining?

http://www.toadworld.com/KNOWLEDGE/KnowledgeXpertforOracle/tabid/648/TopicID/RACR2ARC6/Default.aspx

By default Oracle will create 3 voting disk files in ASM.

Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You
should always configure an odd number of voting disks >= 3. This is because loss of more
than half your voting disks will cause the entire cluster to fail.

You should plan on allocating 280MB for each voting disk file. For example, if you are using
ASM and external redundancy then you will need to allocate 280MB of disk for the voting
disk. If you are using ASM and normal redundancy you will need 560MB.

Why do we need to keep an odd number of voting disks?

Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You
should always configure an odd number of voting disks >= 3. This is because loss of more
than half your voting disks will cause the entire cluster to fail.
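
The arithmetic behind the odd-number recommendation can be sketched in Python. The sketch
assumes that a strict majority (more than half) of the voting disks must remain accessible;
the function name is hypothetical.

```python
def tolerated_failures(voting_disks):
    """Number of voting disks that can be lost while a strict majority remains."""
    if voting_disks < 1:
        raise ValueError("at least one voting disk is required")
    return (voting_disks - 1) // 2


for n in (1, 2, 3, 4, 5):
    print(n, tolerated_failures(n))
# 1 0
# 2 0
# 3 1
# 4 1
# 5 2
```

Under this assumption, four voting disks tolerate no more failures than three, so an even
count adds storage without adding resilience.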

What are Oracle RAC software components?

Oracle RAC is composed of two or more database instances. Each instance has the same memory
structures and background processes as a single-instance database. Oracle RAC
instances use two processes, GES (Global Enqueue Service) and GCS (Global Cache Service), that
enable Cache Fusion. Oracle RAC instances also have the following background processes:

ACMS—Atomic Controlfile to Memory Service (ACMS)

GTX0-j—Global Transaction Process


LMON—Global Enqueue Service Monitor

LMD—Global Enqueue Service Daemon

LMS—Global Cache Service Process

LCK0—Instance Enqueue Process

RMSn—Oracle RAC Management Processes (RMSn)

RSMN—Remote Slave Monitor

What are Oracle database background processes specific to RAC?

LMS—Global Cache Service Process

LMD—Global Enqueue Service Daemon

LMON—Global Enqueue Service Monitor

LCK0—Instance Enqueue Process

Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global
Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file
and each cached block using a Global Resource Directory (GRD). The GRD contents are
distributed across all of the active instances.

What is Cache Fusion?

The transfer of data across instances through the private interconnect is called Cache
Fusion. Oracle RAC is composed of two or more instances. When a block of data has been read
from a datafile by an instance within the cluster and another instance needs the same block,
it is faster to get the block image from the instance that has the block in its SGA than to
read it from disk. To enable inter-instance communication, Oracle RAC makes use of
interconnects. The Global Enqueue Service (GES) monitor and the Instance Enqueue Process
manage Cache Fusion.

What is SCAN? (11gR2 feature)

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster. The benefit is that clients using SCAN do not need to change their
configuration if you add or remove nodes in the cluster.

SCAN provides a single domain name (via DNS), allowing end users to address a RAC
cluster as if it were a single IP address. SCAN works by replacing a hostname or IP list with
virtual IP addresses (VIP).

Single client access name (SCAN) is meant to facilitate single name for all Oracle clients to
connect to the cluster database, irrespective of number of nodes and node location. Until
now, we have to keep adding multiple address records in all clients tnsnames.ora, when a
new node gets added to or deleted from the cluster.

Single Client Access Name (SCAN) eliminates the need to change the TNSNAMES entry when
nodes are added to or removed from the cluster. RAC instances register with SCAN listeners
as remote listeners. Oracle recommends assigning 3 addresses to the SCAN, which creates 3
SCAN listeners even if the cluster has dozens of nodes. SCAN is a domain name
registered to at least one and up to three IP addresses, either in DNS (Domain Name
Service) or GNS (Grid Naming Service). The SCAN must resolve to at least one address on
the public network. For high availability and scalability, Oracle recommends configuring the
SCAN to resolve to three addresses.

http://www.freeoraclehelp.com/2011/12/scan-setup-for-oracle-11g-release211gr2.html

What are SCAN components in a cluster?

1.SCAN Name

2.SCAN IPs (3)

3.SCAN Listeners (3)


What is FAN?

Fast Application Notification (FAN) relates to events concerning instances, services, and
nodes. It is a notification mechanism that Oracle RAC uses to notify other processes about
configuration and service-level information, including service status changes such as UP or
DOWN events. Applications can respond to FAN events and take immediate action.

What is TAF?

TAF (Transparent Application Failover) is a configuration that allows session failover
between different nodes of a RAC database cluster.

If a communication link failure occurs after a connection is established, the connection
fails over to another active node. Any disrupted
transactions are rolled back, and session properties and server-side program variables are
lost. In some cases, if the statement executing at the time of the failover is a SELECT
statement, that statement may be automatically re-executed on the new connection with
the cursor positioned on the row on which it was positioned prior to the failover.

After an Oracle RAC node crashes—usually from a hardware failure—all new application
transactions are automatically rerouted to a specified backup node. The challenge in
rerouting is to not lose transactions that were "in flight" at the exact moment of the crash.
One of the requirements of continuous availability is the ability to restart in-flight application
transactions, allowing a failed node to resume processing on another server without
interruption. Oracle's answer to application failover is a new Oracle Net mechanism dubbed
Transparent Application Failover. TAF allows the DBA to configure the type and method of
failover for each Oracle Net client.

TAF architecture offers the ability to restart transactions at either the transaction (SELECT)
or session level.
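
The FAILOVER_MODE clause in the sample TNS entry earlier sets RETRIES = 180 and DELAY = 5.
The retry behaviour those two values describe can be pictured with the Python sketch below.
It is an illustration under assumptions, not Oracle's implementation: the reconnect function
and the connect callable are hypothetical, and the sketch reads RETRIES as the maximum number
of attempts and DELAY as the wait in seconds between them.

```python
import time


def reconnect(connect, retries=180, delay=5, sleep=time.sleep):
    """Attempt connect() up to `retries` times, waiting `delay` seconds between attempts.

    connect is a hypothetical callable that returns a session object
    or raises ConnectionError while no surviving node accepts the session.
    """
    for attempt in range(1, retries + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == retries:
                raise  # retries are used up: surface the error to the application
            sleep(delay)
```

With the sample values, a client would keep trying for roughly 180 x 5 seconds, about 15
minutes, before giving up.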

What are the requirements for Oracle Clusterware?

1. External shared disk to store the Oracle Clusterware files (voting disk and Oracle Cluster
Registry - OCR)

2. Two network cards on each clusterware node (and three sets of IP addresses):

Network Card 1 (with IP address set 1) for the public network

Network Card 2 (with IP address set 2) for the private network (for inter-node communication
between RAC nodes, used by clusterware and the RAC database)

IP address set 3 for the Virtual IP (VIP) (used as the virtual IP address for client
connections and for connection failover)

3. Storage Option for OCR and Voting Disk - RAW, OCFS2 (Oracle Cluster File System), NFS,
…..

How to find location of OCR file when CRS is down?

Use this when you need to find the location of the OCR (Oracle Cluster Registry) but CRS is
down.

When CRS is down:

Look in the "ocr.loc" file. The location of this file depends on the OS:

On Linux: /etc/oracle/ocr.loc

On Solaris: /var/opt/oracle/ocr.loc

When CRS is UP:

Set ASM environment or CRS environment then run the below command:

ocrcheck
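
Reading the OCR location out of ocr.loc when CRS is down can be sketched in Python as below.
The key names ocrconfig_loc and ocrmirrorconfig_loc and the sample file contents are
assumptions about the usual layout of this file, so check them against a real ocr.loc before
relying on the sketch.

```python
def ocr_locations(ocr_loc_text):
    """Return the OCR locations found in the text of an ocr.loc file."""
    wanted = ("ocrconfig_loc", "ocrmirrorconfig_loc")  # assumed key names
    locations = []
    for line in ocr_loc_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and lines without key=value
        key, value = line.split("=", 1)
        if key.strip() in wanted:
            locations.append(value.strip())
    return locations


# Assumed sample contents, for illustration only.
sample = "ocrconfig_loc=+DATA\nlocal_only=FALSE\n"
print(ocr_locations(sample))  # ['+DATA']
```

On Linux the text would come from reading /etc/oracle/ocr.loc, and on Solaris from
/var/opt/oracle/ocr.loc.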

In a 2-node RAC, how many NICs are you using?

2 network cards on each clusterware node

Network Card 1 (with IP address set 1) for public network

Network Card 2 (with IP address set 2) for private network (for inter node communication
between rac nodes used by clusterware and rac database)

In a 2-node RAC, how many IPs are you using?

6 (three sets of IP addresses, with one address from each set per node)

## eth1-Public: 2

## eth0-Private: 2

## VIP: 2

How do you find IP information in RAC?

Check the /etc/hosts file, which looks like this:

# Do not remove the following line, or various programs

# that requires network functionality will fail.


127.0.0.1 localhost.localdomain localhost

## Public Node names

192.168.10.11 node1-pub.hingu.net node1-pub

192.168.10.22 node2-pub.hingu.net node2-pub

## Private Network (Interconnect)

192.168.0.11 node1-prv node1-prv

192.168.0.22 node2-prv node2-prv

## Private Network (Network Area storage)

192.168.1.11 node1-nas node1-nas

192.168.1.22 node2-nas node2-nas

192.168.1.33 nas-server nas-server

## Virtual IPs

192.168.10.111 node1-vip.hingu.net node1-vip

192.168.10.222 node2-vip.hingu.net node2-vip
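
A hosts file in this format can be turned into a lookup table with a short Python sketch.
The parse_hosts function is a hypothetical helper, and the sample text is a shortened copy of
the listing above.

```python
def parse_hosts(text):
    """Map each host name or alias in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table[name] = ip
    return table


sample = """
## Public Node names
192.168.10.11 node1-pub.hingu.net node1-pub
## Private Network (Interconnect)
192.168.0.11 node1-prv node1-prv
## Virtual IPs
192.168.10.111 node1-vip.hingu.net node1-vip
"""

hosts = parse_hosts(sample)
print(hosts["node1-vip"])  # 192.168.10.111
```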

What is the difference between the RAC IP addresses?

The public IP address is the normal IP address typically used by the DBA and SA to manage
storage, the system, and the database. It sits on the network that clients and administrators
use to reach the node.

The private IP address is used only for internal cluster processing (Cache Fusion), also
known as the interconnect. Private IP addresses are reserved for private networks.

The VIP is used by database applications to enable failover when one cluster node fails. The
purpose of the VIP is to let client connections fail over to surviving nodes in case
of a failure.

Can an application developer access the private IP?

No. The private IP address is used only for internal cluster processing (Cache Fusion), also
known as the interconnect.

What is Oracle RAC One Node?


Oracle RAC One Node is a single instance running on one node of the cluster while the 2nd
node is in cold standby mode. If the instance fails for some reason, RAC One Node
detects it and restarts the instance on the same node, or relocates the instance to the 2nd
node if there is a failure or fault on the 1st node. The benefit of this feature is that it
provides a cold failover solution and automates instance relocation without any
downtime and without manual intervention. Oracle introduced this feature with the
release of 11gR2 (available with Enterprise Edition).

Real Application Clusters:

Oracle RAC is a cluster database with a shared cache architecture that overcomes the
limitations of traditional shared-nothing and shared-disk approaches to provide a highly
scalable and available database solution for all your business applications. Oracle RAC
provides the foundation for enterprise grid computing.

Oracle’s Real Application Clusters (RAC) option supports the transparent deployment of a
single database across a cluster of servers, providing fault tolerance from hardware failures
or planned outages. Oracle RAC running on clusters provides Oracle’s highest level of
capability in terms of availability, scalability, and low-cost computing.

One database is opened by multiple instances, so the database remains highly available if an
instance crashes.

Cluster software: Oracle Clusterware or products like Veritas Volume Manager are required
to provide the cluster support and allow each node to know which nodes belong to the
cluster and are available. With Oracle Clusterware, it also determines which nodes have
failed and ejects them from the cluster, so that errors on those nodes can be cleared.

Oracle Clusterware has two key components: the Oracle Cluster Registry (OCR) and the voting
disk.

The cluster registry holds all information about nodes, instances, services, and ASM storage
if used. It also contains state information, i.e. whether they are available and up.

The voting disk is used to determine if a node has failed, i.e. become separated from the
majority. If a node is deemed to no longer belong to the majority, it is forcibly rebooted
and after the reboot adds itself again to the surviving cluster nodes.
Advantages of RAC (Real Application Clusters)

Reliability – if one node fails, the database won’t fail

Availability – nodes can be added or replaced without having to shutdown the database

Scalability – more nodes can be added to the cluster as the workload increases

What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connections use instead
of the standard public IP address. To configure VIP address, we need to reserve a spare IP
address for each node, and the IP addresses must use the same subnet as the public
network.

What is the use of VIP?

If a node fails, the node's VIP address fails over to another node, where the VIP
address can accept TCP connections but cannot accept Oracle connections.

Give situations under which VIP address failover happens:-

VIP address failover happens when the node on which the VIP address runs fails, when all
interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected
from the network.

Using a virtual IP avoids the TCP/IP timeout problem because Oracle Notification Service
maintains communication between the nodes and listeners.

What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the VIP address
receive a rapid connection refused error. They don't have to wait for TCP connection timeout
messages.

What is voting disk?

The voting disk is a file that sits in the shared storage area and must be accessible by all nodes
in the cluster. All nodes in the cluster register their heartbeat information in the voting
disk to confirm that they are operational. If the heartbeat information of any node is missing
from the voting disk, that node is evicted from the cluster. The CSS (Cluster
Synchronization Services) daemon in the clusterware maintains the heartbeat of all nodes to
the voting disk. When a node is not able to send its heartbeat to the voting disk, it
reboots itself, which helps avoid the split-brain syndrome.

For high availability, Oracle recommends an odd number of voting disks, with a minimum of
three.

The voting disk is a file that resides on shared storage and manages cluster membership. It
is used to reassign cluster ownership between the nodes in case of failure.

The voting disk files are used by Oracle Clusterware to determine which nodes are
currently members of the cluster. The voting disk files are also used in concert with other
cluster components such as CRS to maintain the cluster's integrity.

Oracle Database 11g Release 2 provides the ability to store the voting disks in ASM along
with the OCR. Oracle Clusterware can access the OCR and the voting disks present in ASM
even if the ASM instance is down. As a result CSS can continue to maintain the Oracle
cluster even if the ASM instance has failed.

How many voting disks are you maintaining ?

By default Oracle will create 3 voting disk files in ASM. Oracle expects that you will
configure at least 3 voting disks for redundancy purposes. You should always configure an
odd number of voting disks >= 3. This is because each node must be able to access more than
half of the voting disks; losing half or more of them will cause the entire cluster to fail.

You should plan on allocating 280MB for each voting disk file. For example, if you are using
ASM and external redundancy then you will need to allocate 280MB of disk for the voting
disk. If you are using ASM and normal redundancy you will need 560MB.

Why do we need to keep an odd number of voting disks?

Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You
should always configure an odd number of voting disks >= 3. This is because each node must
be able to access more than half of the voting disks; losing half or more of them will cause
the entire cluster to fail.
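
The arithmetic behind the odd-number advice can be sketched in a few lines of Python. This is a toy calculation, not Oracle code. It assumes the rule that a node must be able to access more than half of the configured voting disks, and the function name tolerated_failures is made up for illustration.

```python
def tolerated_failures(total_disks: int) -> int:
    """Return how many voting disks can be lost while a strict majority survives."""
    if total_disks < 1:
        raise ValueError("at least one voting disk is required")
    majority = total_disks // 2 + 1  # strictly more than half
    return total_disks - majority


# Compare odd and even disk counts.
for n in range(1, 6):
    print(n, "voting disks ->", tolerated_failures(n), "failure(s) tolerated")
```

Under this assumption, 3 and 4 voting disks both tolerate a single failure, so the fourth disk adds cost without adding resilience.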

What are Oracle RAC software components?


Oracle RAC is composed of two or more database instances. They are composed of memory
structures and background processes, the same as a single-instance database. Oracle RAC
instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that
enable Cache Fusion. Oracle RAC instances are composed of the following background
processes:

ACMS—Atomic Controlfile to Memory Service (ACMS)

GTX0-j—Global Transaction Process

LMON—Global Enqueue Service Monitor

LMD—Global Enqueue Service Daemon

LMS—Global Cache Service Process

LCK0—Instance Enqueue Process

RMSn—Oracle RAC Management Processes (RMSn)

RSMN—Remote Slave Monitor

What are Oracle Clusterware processes for 10g ?

Cluster Synchronization Services (ocssd) — Manages cluster node membership and runs as
the oracle user; failure of this process results in cluster restart.

Cluster Ready Services (crsd) — The crs process manages cluster resources (which could be
a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application
process, and so on) based on the resource’s configuration information that is stored in the
OCR. This includes start, stop, monitor and failover operations. This process runs as the root
user

Event manager daemon (evmd) —A background process that publishes events that crs
creates.

Process Monitor Daemon (OPROCD) – This process monitors the cluster and provides I/O
fencing. OPROCD performs its check, stops running, and if the wake up is beyond the
expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure
results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on
Linux platforms.

RACG (racgmain, racgimon) – Extends clusterware to support Oracle-specific requirements
and complex resources. Runs server callout scripts when FAN events occur.

What are Oracle database background processes specific to RAC?

LMS—Global Cache Service Process

LMD—Global Enqueue Service Daemon


LMON—Global Enqueue Service Monitor

LCK0—Instance Enqueue Process

Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global
Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file
and each cached block using a Global Resource Directory (GRD). The GRD contents are
distributed across all of the active instances.

What is Cache Fusion?

The transfer of data across instances through the private interconnect is called Cache Fusion.
Oracle RAC is composed of two or more instances. When a block of data is read from a datafile by an
instance within the cluster and another instance needs the same block, it is faster to
get the block image from the instance which has the block in its SGA than to read it
from disk. To enable inter-instance communication, Oracle RAC makes use of
interconnects. The Global Cache Service (GCS), through the LMS processes, coordinates
these block transfers.
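
The idea of serving a block from another instance's cache before going to disk can be illustrated with a toy Python model. This is only a sketch of the lookup order. The class ToyCluster and its holder dictionary are invented for illustration and stand in very loosely for the buffer caches and the Global Resource Directory; real Cache Fusion also deals with lock modes and block versions, which are ignored here.

```python
class ToyCluster:
    """Toy model of block sharing between instance caches (not Oracle internals)."""

    def __init__(self, instance_names, disk_blocks):
        self.caches = {name: {} for name in instance_names}  # per-instance buffer cache
        self.disk = disk_blocks                               # shared storage
        self.holder = {}                                      # block id -> first instance caching it

    def read(self, instance, block_id):
        """Return (block, source), where source says where the block came from."""
        cache = self.caches[instance]
        if block_id in cache:
            return cache[block_id], "local cache"
        owner = self.holder.get(block_id)
        if owner is not None:
            # Another instance already holds the block: copy it over the interconnect.
            cache[block_id] = self.caches[owner][block_id]
            return cache[block_id], f"interconnect from {owner}"
        # Nobody has it cached yet: read from shared disk and record the holder.
        cache[block_id] = self.disk[block_id]
        self.holder[block_id] = instance
        return cache[block_id], "disk"


cluster = ToyCluster(["inst1", "inst2"], {42: "block 42 contents"})
print(cluster.read("inst1", 42)[1])  # first read goes to disk
print(cluster.read("inst2", 42)[1])  # second instance gets it from inst1
print(cluster.read("inst2", 42)[1])  # now it is in inst2's own cache
```

In this model the first read comes from disk, the second instance receives the block from inst1, and a repeated read is served locally.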

What is SCAN? (11gR2 feature)

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster. The benefit is that clients using SCAN do not need to change if you add or
remove nodes in the cluster.

SCAN provides a single domain name (via DNS), allowing end users to address a RAC
cluster as if it were a single IP address. SCAN works by replacing a hostname or IP list with
virtual IP addresses (VIPs).

Single Client Access Name (SCAN) is meant to provide a single name for all Oracle clients to
connect to the cluster database, irrespective of the number of nodes and node location. Before
SCAN, we had to keep changing the address records in every client's tnsnames.ora whenever a
node was added to or deleted from the cluster.

Single Client Access Name (SCAN) eliminates the need to change the TNSNAMES entry when
nodes are added to or removed from the cluster. RAC instances register with SCAN listeners
as remote listeners. Oracle recommends assigning 3 addresses to SCAN, which will create 3
SCAN listeners, even if the cluster has dozens of nodes. SCAN is a domain name
registered to at least one and up to three IP addresses, either in DNS (Domain Name
Service) or GNS (Grid Naming Service). The SCAN must resolve to at least one address on
the public network. For high availability and scalability, Oracle recommends configuring the
SCAN to resolve to three addresses.
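
To make the tnsnames.ora difference concrete, the Python sketch below builds two connect descriptors: one listing every node VIP, and one using a single SCAN name. The host names node1-vip, node2-vip and rac-scan.example.com, the service name orcl and port 1521 are all assumptions chosen for illustration; the helper functions are not part of any Oracle tool.

```python
def address(host: str, port: int = 1521) -> str:
    """Format one ADDRESS clause of a connect descriptor."""
    return f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))"


def descriptor(hosts, service: str) -> str:
    """Build a connect descriptor with one ADDRESS per host."""
    addresses = "".join(address(h) for h in hosts)
    return (
        f"(DESCRIPTION=(ADDRESS_LIST={addresses})"
        f"(CONNECT_DATA=(SERVICE_NAME={service})))"
    )


# Without SCAN: every node VIP is listed, so the entry changes with the cluster.
without_scan = descriptor(["node1-vip", "node2-vip"], "orcl")

# With SCAN: one name, unchanged when nodes are added or removed.
with_scan = descriptor(["rac-scan.example.com"], "orcl")

print(without_scan)
print(with_scan)
```

Adding a third node changes the first descriptor but not the second, which is the point of SCAN.
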
What are SCAN components in a cluster?

1.SCAN Name

2.SCAN IPs (3)

3.SCAN Listeners (3)

What is FAN?

FAN stands for Fast Application Notification and relates to events concerning
instances, services and nodes. This is a notification mechanism that Oracle RAC uses to notify
other processes about configuration and service level information, including service
status changes such as UP or DOWN events. Applications can respond to FAN events and
take immediate action.

What is TAF?

TAF (Transparent Application Failover) is a configuration that allows session failover
between different nodes of a RAC database cluster.

If a communication link failure occurs after a
connection is established, the connection fails over to another active node. Any disrupted
transactions are rolled back, and session properties and server-side program variables are
lost. In some cases, if the statement executing at the time of the failover is a SELECT
statement, that statement may be automatically re-executed on the new connection with
the cursor positioned on the row on which it was positioned prior to the failover.

After an Oracle RAC node crashes—usually from a hardware failure—all new application
transactions are automatically rerouted to a specified backup node. The challenge in
rerouting is to not lose transactions that were “in flight” at the exact moment of the crash.
One of the requirements of continuous availability is the ability to restart in-flight application
transactions, allowing a failed node to resume processing on another server without
interruption. Oracle’s answer to application failover is a new Oracle Net mechanism dubbed
Transparent Application Failover. TAF allows the DBA to configure the type and method of
failover for each Oracle Net client.

TAF offers failover at either the statement level (TYPE=SELECT, which resumes open queries)
or the session level (TYPE=SESSION).
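
As a rough illustration of how the failover type and method are configured per client, the Python sketch below formats a FAILOVER_MODE clause of the kind placed inside CONNECT_DATA in a connect descriptor. The helper name failover_mode and the default retry and delay values are assumptions for this example, not recommended settings.

```python
VALID_TYPES = {"SESSION", "SELECT", "NONE"}
VALID_METHODS = {"BASIC", "PRECONNECT"}


def failover_mode(taf_type="SELECT", method="BASIC", retries=20, delay=5) -> str:
    """Format a FAILOVER_MODE clause for a connect descriptor."""
    taf_type, method = taf_type.upper(), method.upper()
    if taf_type not in VALID_TYPES:
        raise ValueError(f"unknown TAF type: {taf_type}")
    if method not in VALID_METHODS:
        raise ValueError(f"unknown TAF method: {method}")
    return (
        f"(FAILOVER_MODE=(TYPE={taf_type})(METHOD={method})"
        f"(RETRIES={retries})(DELAY={delay}))"
    )


# Session-level failover with a pre-created backup connection.
print(failover_mode("session", "preconnect"))

# Statement-level failover, reconnecting only when a failure happens.
print(failover_mode())
```

BASIC connects to the backup only at failover time, while PRECONNECT opens the backup connection in advance, trading extra resources for a faster switch.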

What are the requirements for Oracle Clusterware?

1. External shared disk to store the Oracle Clusterware files (voting disk and Oracle Cluster
Registry – OCR)

2. Two network cards on each clusterware node (and three sets of IP addresses) –
Network Card 1 (with IP address set 1) for the public network

Network Card 2 (with IP address set 2) for the private network (for inter-node communication
between RAC nodes, used by the clusterware and the RAC database)

IP address set 3 for the Virtual IP (VIP) (used as the virtual IP address for client connections and for
connection failover)

3. Storage option for the OCR and voting disk – RAW, OCFS2 (Oracle Cluster File System),
NFS, …..

What enables load balancing of applications in RAC?

Oracle Net Services enable the load balancing of application connections across all of the
instances in an Oracle RAC database.

How do you find the location of the OCR file when CRS is down?

When CRS is down:

Look in the "ocr.loc" file. The location of this file depends on the OS:

On Linux: /etc/oracle/ocr.loc

On Solaris: /var/opt/oracle/ocr.loc

When CRS is UP:

Set the ASM or CRS environment, then run the command below:

ocrcheck
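
When CRS is down, the ocr.loc file can be read directly. The Python sketch below parses key=value lines of the kind found in that file. The sample content, including the keys ocrconfig_loc, ocrmirrorconfig_loc and local_only and the raw device paths, is illustrative and should be treated as an assumption; check the actual file on your platform.

```python
def parse_ocr_loc(text: str) -> dict:
    """Parse key=value lines from ocr.loc content into a dictionary."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and malformed lines
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Illustrative content only; a real file would be read from
# /etc/oracle/ocr.loc (Linux) or /var/opt/oracle/ocr.loc (Solaris).
sample = """ocrconfig_loc=/dev/raw/raw1
ocrmirrorconfig_loc=/dev/raw/raw2
local_only=FALSE
"""

print(parse_ocr_loc(sample).get("ocrconfig_loc"))
```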

In a 2-node RAC, how many NICs are you using?

2 network cards on each clusterware node

Network Card 1 (with IP address set 1) for public network

Network Card 2 (with IP address set 2) for private network (for inter node communication
between rac nodes used by clusterware and rac database)

In a 2-node RAC, how many IPs are you using?

6 (three sets of IP addresses, two addresses in each set)

## eth1-Public: 2

## eth0-Private: 2
## VIP: 2

How do you find IP information in RAC?

Check the /etc/hosts file, which looks like the example below:

# Do not remove the following line, or various programs

# that requires network functionality will fail.

127.0.0.1 localhost.localdomain localhost

## Public Node names

192.168.10.11 node1-pub.hingu.net node1-pub

192.168.10.22 node2-pub.hingu.net node2-pub

## Private Network (Interconnect)

192.168.0.11 node1-prv node1-prv

192.168.0.22 node2-prv node2-prv

## Private Network (Network Attached Storage)

192.168.1.11 node1-nas node1-nas

192.168.1.22 node2-nas node2-nas

192.168.1.33 nas-server nas-server

## Virtual IPs

192.168.10.111 node1-vip.hingu.net node1-vip

192.168.10.222 node2-vip.hingu.net node2-vip
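
The same information can be pulled out of a hosts file programmatically. The Python sketch below groups addresses by role using the -pub, -prv and -vip name suffixes from the example above. That naming convention is an assumption of this example (Oracle does not require it), and classify_hosts is a made-up helper.

```python
from collections import defaultdict


def classify_hosts(text: str) -> dict:
    """Group IP addresses by role, inferred from the -pub/-prv/-vip name suffix."""
    roles = {"-pub": "public", "-prv": "private", "-vip": "vip"}
    grouped = defaultdict(list)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for suffix, role in roles.items():
            # Compare against the short host name (text before the first dot).
            if any(name.split(".")[0].endswith(suffix) for name in names):
                grouped[role].append(ip)
                break
    return dict(grouped)


sample = """
## Public Node names
192.168.10.11 node1-pub.hingu.net node1-pub
192.168.10.22 node2-pub.hingu.net node2-pub
## Private Network (Interconnect)
192.168.0.11 node1-prv node1-prv
192.168.0.22 node2-prv node2-prv
## Virtual IPs
192.168.10.111 node1-vip.hingu.net node1-vip
192.168.10.222 node2-vip.hingu.net node2-vip
"""

result = classify_hosts(sample)
print(result)
print(sum(len(ips) for ips in result.values()), "addresses in total")
```

For the two-node sample this yields two public, two private and two virtual addresses, six in total.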

What is the difference between the RAC IP addresses?

The public IP address is the normal IP address typically used by DBAs and SAs to manage storage,
the system and the database. "Public" here means the client-facing network; the addresses do
not have to be Internet-routable.

The private IP address is used only for internal cluster processing (Cache Fusion), also known
as the interconnect. Private IP addresses are reserved for private networks.

The VIP is used by database applications to enable failover when one cluster node fails. The
purpose of the VIP is to let client connections fail over to surviving nodes in case
of a failure.

Can an application developer access the private IP?

No. The private IP address is used only for internal cluster processing (Cache Fusion), also
known as the interconnect.

5) What is GRD? How does this help with cache fusion?

GRD stands for Global Resource Directory. The GES and GCS maintain records of the
statuses of each datafile and each cached block using the Global Resource Directory. This process
is referred to as Cache Fusion and helps maintain data integrity.

6) Give Details on Cache Fusion:-

Oracle RAC is composed of two or more instances. When a block of data is read from a
datafile by an instance within the cluster and another instance needs the same
block, it is faster to get the block image from the instance which has the block in its SGA
than to read it from disk. To enable inter-instance communication, Oracle RAC
makes use of interconnects. The Global Cache Service (GCS), through the LMS processes,
coordinates these block transfers.

7) Give Details on ACMS:-

ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment, ACMS is
an agent that ensures distributed SGA memory updates, i.e. SGA updates are globally
committed on success or globally aborted in the event of a failure.

8) What is clustering?

Clustering is a high availability solution. Clustering makes physically separate servers appear
as a single server to the end user. Clustering provides scalability at all levels: OS, storage,
database, applications, hardware. Clustering makes the application available 24x7x365.

9) Give details on GTX0-j :-

This process provides transparent support for XA global transactions in a RAC
environment. The database autotunes the number of these processes based on the workload
of XA global transactions. The GLOBAL_TXN_PROCESSES setting specifies the initial number of
GTXn background processes per instance. This process is seen only in RAC environments.
The value of GLOBAL_TXN_PROCESSES can range from 1 to 20, and there is no
definite need to set this parameter: the number of processes needed is decided by the Oracle
database automatically and is tuned on demand.

10) Give details on LMON:-

This process monitors instance membership in a RAC environment, detects instance
transitions, and reconfigures GES and GCS resources as needed. It is called the Global Enqueue
Service Monitor process and is primarily used for managing global resources.

11) Give details on LMD:-

While LMON monitors the global enqueue services, LMD is the Global Enqueue Service Daemon
process. This process manages incoming remote resource requests within each instance.
LMD0 in particular processes incoming enqueue request messages. It controls access to
global enqueues.

12) Give details on LMS:-

This process is called the Global Cache Service process. It maintains the statuses of
datafiles and each cached block by recording information in the Global Resource
Directory (GRD). It also controls the flow of messages to remote instances,
manages global data block access and transmits block images between the buffer caches of
different instances. This processing is part of the Cache Fusion feature.

13) Give details on LCK0:-

This process is called the Instance Enqueue process. It manages non-Cache Fusion
resource requests such as library and row cache requests.

14) Give details on RMSn:-

This process is called the Oracle RAC Management process. These processes perform
manageability tasks for Oracle RAC. Tasks include the creation of resources related to Oracle RAC
when new instances are added to the cluster.

15) Give details on RSMN:-

This process is called the Remote Slave Monitor. It manages background slave
process creation and communication on remote instances.
These background slave processes perform tasks on behalf of a coordinating
process running in another instance.

16) What components in RAC must reside in shared storage?

All datafiles, control files, SPFILEs and redo log files must reside on cluster-aware shared storage.

17) What is the significance of using cluster-aware shared storage in an Oracle RAC
environment?

All instances of an Oracle RAC database can access all the datafiles, control files, SPFILEs and
redo log files when these files are hosted on cluster-aware shared storage, which is a group of
shared disks.

18) Give few examples for solutions that support cluster storage:-

ASM (Automatic Storage Management), raw disk devices, network file system (NFS), OCFS2
and OCFS (Oracle Cluster File System).

19) Give details on oracle rac lkdebug utility:-

LKDEBUG is an Oracle-supplied utility. LKDEBUG is integrated with the ORADEBUG utility. To use
LKDEBUG we must log in with the SYSDBA system privilege.

LKDEBUG is used to obtain information about the current state of the GCS and GES structures in
the instance.

To obtain information on LKDEBUG options, issue the following command:

SQL> ORADEBUG LKDEBUG HELP

20) What is an interconnect network?

An interconnect network is a private network that connects all of the servers in a cluster.
The interconnect network uses a switch/multiple switches that only the nodes in the cluster
can access.

21) How can we configure the cluster interconnect?

Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the cluster interconnect. On UNIX
and Linux systems, Oracle Clusterware uses the UDP and RDS (Reliable Datagram Sockets)
protocols. Windows clusters use the TCP protocol.

22) Can we use crossover cables with Oracle Clusterware interconnects?

No, crossover cables are not supported with Oracle Clusterware interconnects.

23) What is the use of cluster interconnect?

Cluster interconnect is used by the Cache fusion for inter instance communication.

24) How do users connect to database in an Oracle RAC environment?

Users can access a RAC database using a client/server configuration or through one or more
middle tiers, with or without connection pooling. Users can use the Oracle services feature to
connect to the database.

25) What is the use of a service in Oracle RAC environment?

Applications should use the services feature to connect to the Oracle database. Services
enable us to define rules and characteristics to control how users and applications connect
to database instances.

26) What are the characteristics controlled by Oracle services feature?

The characteristics include a unique name, workload balancing and failover options, and high
availability characteristics.

27) What enables load balancing of applications in RAC?

Oracle Net Services enable the load balancing of application connections across all of the
instances in an Oracle RAC database.

28) What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connections use instead
of the standard public IP address. To configure VIP address, we need to reserve a spare IP
address for each node, and the IP addresses must use the same subnet as the public
network.

29) What is the use of VIP?

If a node fails, the node's VIP address fails over to another node, where the VIP
address can accept TCP connections but cannot accept Oracle connections.

30) Give situations under which VIP address failover happens:-

VIP address failover happens when the node on which the VIP address runs fails, when all
interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected
from the network.

31) What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the VIP address
receive a rapid connection refused error. They don't have to wait for TCP connection timeout
messages.

32) What are the administrative tools used for Oracle RAC environments?

An Oracle RAC cluster can be administered as a single image using OEM (Enterprise
Manager), SQL*Plus, Server Control (SRVCTL), Cluster Verification Utility (CVU), DBCA and NETCA.

33) How do we verify that RAC instances are running?

Issue the following query from any one node, connecting through SQL*Plus.

SQL> connect sys/sys as sysdba

SQL> select * from V$ACTIVE_INSTANCES;

The query gives the instance number in the INST_NUMBER column and host:instance_name
in the INST_NAME column.

34) What is FAN?

FAN stands for Fast Application Notification and relates to events concerning
instances, services and nodes. This is a notification mechanism that Oracle RAC uses to notify
other processes about configuration and service level information, including service
status changes such as UP or DOWN events. Applications can respond to FAN events and
take immediate action.

35) Where can we apply FAN UP and DOWN events?

FAN UP and FAN DOWN events can be applied to instances, services and nodes.

State the use of FAN events in case of a cluster configuration change?

During cluster configuration changes, the Oracle RAC high availability framework
publishes a FAN event immediately when a state change occurs in the cluster, so
applications can receive FAN events and react immediately. This saves applications from
polling the database and detecting a problem only after such a state change.

36) Why should we have separate homes for ASM instance?

It is good practice to keep the ASM home separate from the database home
(ORACLE_HOME). This helps in upgrading and patching ASM and the Oracle database
software independently of each other. Also, we can deinstall the Oracle database software
independently of the ASM instance.

37) What is the advantage of using ASM?

ASM is the Oracle-recommended storage option for RAC databases, as ASM
maximizes performance by managing the storage configuration across the disks. ASM does
this by distributing the database files across all of the available storage within the cluster
database environment.

38) What is rolling upgrade?

It is a new ASM feature from Database 11g. ASM instances in Oracle Database 11g
(from release 11.1) can be upgraded or patched using the rolling upgrade feature. This enables
us to patch or upgrade ASM nodes in a clustered environment without affecting database
availability. During a rolling upgrade we can maintain a functional cluster while one or more
of the nodes in the cluster are running different software versions.

39) Can rolling upgrade be used to upgrade from 10g to 11g database?

No, it can be used only for Oracle Database 11g releases (from 11.1) and upwards.

40) State the initialization parameters that must have same value for every instance in an
Oracle RAC database:-

Some initialization parameters are critical at database creation time and must have
the same values. Their values must be specified in the SPFILE or PFILE for every instance. The
parameters that must be identical on every instance are listed below:

ACTIVE_INSTANCE_COUNT

ARCHIVE_LAG_TARGET

COMPATIBLE

CLUSTER_DATABASE

CLUSTER_DATABASE_INSTANCES

CONTROL_FILES

DB_BLOCK_SIZE

DB_DOMAIN

DB_FILES

DB_NAME

DB_RECOVERY_FILE_DEST

DB_RECOVERY_FILE_DEST_SIZE

DB_UNIQUE_NAME

INSTANCE_TYPE (RDBMS or ASM)

PARALLEL_MAX_SERVERS

REMOTE_LOGIN_PASSWORD_FILE

UNDO_MANAGEMENT

41) Must DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all instances?

These parameters must be identical on all instances only if their values are set to
zero.

42) What two parameters must be set at the time of starting up an ASM instance in a RAC
environment?

The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.

43) Mention the components of Oracle clusterware:-

Oracle clusterware is made up of components like voting disk and Oracle Cluster
Registry(OCR).

44) What is a CRS resource?

Oracle Clusterware is used to manage high-availability operations in a cluster. Anything that
Oracle Clusterware manages is known as a CRS resource. Some examples of CRS resources
are a database, an instance, a service, a listener, a VIP address, an application process etc.

45) What is the use of OCR?

Oracle clusterware manages CRS resources based on the configuration information of CRS
resources stored in OCR(Oracle Cluster Registry).

46) How does a Oracle Clusterware manage CRS resources?

Oracle clusterware manages CRS resources based on the configuration information of CRS
resources stored in OCR(Oracle Cluster Registry).

47) Name some Oracle clusterware tools and their uses?

OIFCFG – allocating and deallocating network interfaces

OCRCONFIG – Command-line tool for managing Oracle Cluster Registry

OCRDUMP – Dumps the contents of the Oracle Cluster Registry to a file for inspection

CVU – Cluster Verification Utility, used to verify the cluster configuration and its components

48) What are the modes of deleting instances from Oracle Real Application Clusters
databases?

We can delete instances using silent mode or interactive mode using DBCA(Database
Configuration Assistant).

49) How do we remove ASM from a Oracle RAC environment?

We need to stop and delete the instance on the node first, in interactive or silent mode. After
that, ASM can be removed using the srvctl tool as follows:

srvctl stop asm -n node_name

srvctl remove asm -n node_name

We can verify if ASM has been removed by issuing the following command:

srvctl config asm -n node_name

50) How do we verify that an instance has been removed from OCR after deleting an
instance?

Issue the following srvctl command:

srvctl config database -d database_name

cd CRS_HOME/bin

./crs_stat

51) How do we verify an existing current backup of OCR?

We can verify the current backup of the OCR using the following command:
ocrconfig -showbackup

52) What are the performance views in an Oracle RAC environment?

We have V$ views, which are instance-specific. In addition, we have GV$ views, called global
views, which have an INST_ID column of numeric data type. GV$ views obtain information from
the individual V$ views on each instance.

53) What are the types of connection load-balancing?

There are two types of connection load balancing: server-side load balancing and client-side
load balancing.

54) What is the difference between server-side and client-side connection load balancing?

Client-side load balancing happens on the client, which picks one of the listener addresses
in its connect descriptor at random. In server-side load balancing, the listener uses the
load balancing advisory to redirect connections to the instance providing the best service.
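
The difference between the two approaches can be sketched in Python. This is a simplified model under stated assumptions: the client picks a listener address at random, and the listener picks the instance with the lowest load figure. The function names, listener names and load numbers are invented for illustration; the real load balancing advisory works from service-level metrics rather than a single number.

```python
import random


def client_side_pick(listener_addresses, rng=random):
    """Client-side: choose a listener address at random from the address list."""
    return rng.choice(listener_addresses)


def server_side_pick(instance_load):
    """Server-side: the listener hands the connection to the least-loaded instance."""
    return min(instance_load, key=instance_load.get)


listeners = ["node1-vip:1521", "node2-vip:1521"]
load = {"orcl1": 0.80, "orcl2": 0.35}  # hypothetical load figures per instance

print("client contacts:", client_side_pick(listeners))
print("listener routes to:", server_side_pick(load))
```

The two mechanisms are complementary: the first spreads connection requests across listeners, the second places each session on a suitable instance.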

55) Give details on srvm_trace:-

SRVM_TRACE is an Oracle RAC (Real Application Clusters) environment variable.

It is used to debug the Oracle RAC utility srvctl: when it is set to true, srvctl prints a
detailed trace of what it is doing.

56) Give the usage of srvctl:-

srvctl start instance -d db_name -i "inst_name_list" [-o start_options]

srvctl stop instance -d name -i "inst_name_list" [-o stop_options]

srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate

srvctl start database -d name [-o start_options]

srvctl stop database -d name [-o stop_options]

srvctl start database -d orcl -o mount

57) What is an OCRCHECK utility?

The ocrcheck utility is a diagnostic tool used for diagnosing OCR (Oracle Cluster Registry)
problems. It is used to verify the integrity of the OCR.

58) What does an ocrcheck display?

The OCRCHECK utility displays the version of the OCR’s block format, total space available
and used space, OCRID, and the OCR locations that we have configured.

59) How does ocrcheck perform integrity check?

OCRCHECK performs a block-by-block checksum operation for all of the blocks in all of the
OCRs that we have configured. It also returns an individual status for each file as well as a
result for the overall OCR integrity check.

60) Give a sample output of ocrcheck utility:-

Sample of the OCRCHECK utility output:

Status of Oracle Cluster Registry is as follows :

Version : 2

Total space (kbytes) : 262144

Used space (kbytes) : 16256

Available space (kbytes) : 245888

ID : 1918913332

Device/File Name : /dev/raw/raw1

Device/File integrity check succeeded

Device/File Name : /dev/raw/raw2

Device/File integrity check succeeded

Cluster registry integrity check succeeded
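
Output in this shape is easy to check from a script. The Python sketch below parses the sample text above and reports the space figures, the device list and the overall result. The helper name summarize_ocrcheck is made up, and the parsing assumes the labels appear exactly as in this sample; other releases may format the output differently.

```python
import re


def summarize_ocrcheck(output: str) -> dict:
    """Extract space figures, device names and the overall result from ocrcheck text."""
    summary = {}
    labels = (
        ("Total space", "total_kb"),
        ("Used space", "used_kb"),
        ("Available space", "available_kb"),
    )
    for label, key in labels:
        match = re.search(rf"{label} \(kbytes\)\s*:\s*(\d+)", output)
        if match:
            summary[key] = int(match.group(1))
    summary["devices"] = re.findall(r"Device/File Name\s*:\s*(\S+)", output)
    summary["integrity_ok"] = "Cluster registry integrity check succeeded" in output
    return summary


sample = """Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 262144
Used space (kbytes) : 16256
Available space (kbytes) : 245888
ID : 1918913332
Device/File Name : /dev/raw/raw1
Device/File integrity check succeeded
Device/File Name : /dev/raw/raw2
Device/File integrity check succeeded
Cluster registry integrity check succeeded
"""

info = summarize_ocrcheck(sample)
print(info)
print("used:", round(100 * info["used_kb"] / info["total_kb"], 1), "%")
```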

61) Where does an ocrcheck utility create a log file?

OCRCHECK creates a log file in the directory CRS_home/log/hostname/client.

How can we change the amount of logging?

To change amount of logging, edit the file CRS_home/srvm/admin/ocrlog.ini.

62) What is scalability?

Scalability ensures that performance remains good irrespective of increases in
workload. Infrastructure scalability starts from the hardware (physical server), OS, storage and
database, and ends with the application.

A clustering solution provides scalability, as it makes physically separate servers appear as a
single machine to the end user.

Why odd number of Votedisk?

An odd number of disks avoids split brain. When the nodes in a cluster can't talk to each other,
they race to lock the voting disks, and whichever locks more disks survives. If the number of
disks is even, each side might lock 50% of the disks (2 out of 4), and then there is no way to
decide which node to evict.

When the number is odd, one side will hold more disks than the other, and the cluster evicts the
node holding fewer.

How do you troubleshoot node reboot

Please check metalink ...

Note 265769.1 Troubleshooting CRS Reboots

Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle
Clusterware Node evictions.

How do you backup the OCR

There is an automatic backup mechanism for the OCR. The default location is:

$ORA_CRS_HOME/cdata/"clustername"/

To display backups :

#ocrconfig -showbackup

To restore a backup :

#ocrconfig -restore backup_file_name

With Oracle RAC 10g Release 2 or later, you can also use the export command:

#ocrconfig -export file_name -s online

Use the -import option to restore the contents.

With Oracle RAC 11g Release 1, you can do a manual backup of the OCR with the
command:

# ocrconfig -manualbackup

How do you backup voting disk

#dd if=voting_disk_name of=backup_file_name

How do I identify the voting disk location

#crsctl query css votedisk

How do I identify the OCR file location

Check /etc/oracle/ocr.loc (Linux) or /var/opt/oracle/ocr.loc (Solaris), depending on the platform

or

#ocrcheck

What is SCAN?

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access an Oracle Database
running in a cluster. The benefit is that clients using SCAN do not need to change if you add or
remove nodes in the cluster.

What is the purpose of the private interconnect?

Clusterware uses the private interconnect for cluster synchronization (network heartbeat)
and daemon communication between the clustered nodes. This communication is based
on the TCP protocol.

RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP).
Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches
of participating nodes in the cluster.

Why do we have a Virtual IP (VIP) in Oracle RAC?

Without VIPs or FAN, clients connected to a node that died will often wait for a TCP
timeout period (which can be up to 10 minutes) before getting an error. As a result, you
don't really have a good HA solution without using VIPs.

When a node fails, the VIP associated with it is automatically failed over to another node,
and the new node re-ARPs the network, announcing a new MAC address for the IP. Subsequent
packets sent to the VIP go to the new node, which sends error RST packets back to the
clients. As a result, the clients get errors immediately.

How many nodes are supported in a RAC Database?

10g Release 2 supports 100 nodes in a cluster using Oracle Clusterware, and 100 instances
in a RAC database.

srvctl cannot start an instance and returns the errors PRKP-1001 and CRS-0215, but sqlplus
can start it on both nodes. How do you identify the problem?

Set the environment variable SRVM_TRACE to true and start the instance with srvctl.
You will now get a detailed error stack.
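
If you script this check, the same idea looks roughly like the Python sketch below. The
function name srvctl_start_with_trace and the database and instance names are
hypothetical; the sketch only builds the traced command unless run=True is passed, and
running it assumes srvctl is on the PATH.

```python
import os
import subprocess


def srvctl_start_with_trace(db_name, instance_name, run=False):
    """Build (and optionally run) 'srvctl start instance' with SRVM_TRACE set."""
    # Copy the current environment and switch on srvctl tracing.
    env = dict(os.environ, SRVM_TRACE="true")
    command = ["srvctl", "start", "instance", "-d", db_name, "-i", instance_name]
    if run:
        # Only works on a cluster node where srvctl is installed.
        subprocess.run(command, env=env, check=False)
    return command, env


command, env = srvctl_start_with_trace("orcl", "orcl1")
print(" ".join(command))
print(env["SRVM_TRACE"])
```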

What is the purpose of the ONS daemon?

The Oracle Notification Service (ONS) daemon is a daemon started by the CRS clusterware
as part of the nodeapps. One ONS daemon is started per clustered node.

The Oracle Notification Service daemon receives a subset of published clusterware events via
the local evmd and racgimon clusterware daemons and forwards those events to application
subscribers and to the local listeners.

This facilitates:

a. Fast Application Notification (FAN), the feature that allows applications to respond to
database state changes.
b. The 10gR2 Load Balancing Advisory, the feature that permits load balancing across
different RAC nodes depending on the load on each node. The RDBMS MMON process
creates an advisory for the distribution of work every 30 seconds and forwards it via
racgimon and ONS to listeners and applications.
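
As a purely illustrative sketch of how a subscriber might act on such an advisory, the
Python below picks an instance for the next piece of work in proportion to advisory
percentages. The advisory dictionary, the instance names, and the function pick_instance
are hypothetical; this is not how Oracle listeners or connection pools are implemented.

```python
import random


def pick_instance(advisory, rng=random):
    """Pick an instance name with probability proportional to its advisory share."""
    instances = list(advisory)
    weights = [advisory[name] for name in instances]
    return rng.choices(instances, weights=weights, k=1)[0]


# Hypothetical advisory: rac1 is lightly loaded, so it should get most new work.
advisory = {"rac1": 70, "rac2": 30}

rng = random.Random(42)
picks = [pick_instance(advisory, rng) for _ in range(1000)]
print(picks.count("rac1"), picks.count("rac2"))
```

Over many picks, roughly 70% of the work lands on rac1 and 30% on rac2; a new advisory
would simply replace the dictionary.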
