Download as doc, pdf, or txt
Download as doc, pdf, or txt
You are on page 1of 12

PURPOSE

Cluster Verification Utility (aka CVU, command runcluvfy.sh or cluvfy) does very good checking on
the network and name resolution setup, but it may not capture all issues. If the network and name
resolution is not setup properly before installation, it is likely the installation will fail; if network or
name resolution is malfunctioning, likely the clusterware and/or RAC will have issues. The goal of
this note is to provide a list of things to verify regarding the network and name resolution setup for
Grid Infrastructure (clusterware) and RAC.

SCOPE
This document is intended for Oracle Clusterware/RAC Database Administrators and Oracle Support
engineers.

DETAILS
A. Requirement
o Network ping with package size of Network Adapter (NIC) MTU should work on all public and priv
ate network and the time of ping should be small (sub-second).
o IP address 127.0.0.1 should only map to localhost and/or localhost.localdomain, not anything else.
o 127.*.*.* should not be used by any network interface.
o Public NIC name must be same on all nodes.
o Private NIC name should be same in 11gR2 and must be same for pre-11gR2 on all nodes
o Public and private network must not be in link local subnet (169.254.*.*), should be in non-related
separate subnet.
o MTU should be the same for corresponding network on all nodes.
o Network size should be same for corresponding network on all nodes.
o As the private network needs to be directly attached, traceroute should work with a packet size of
NIC MTU without fragmentation or going through the routing table on all private networks in 1 hop.
o Firewall needs to be turned off on the private network.
o For 10.1 to 11.1, name resolution should work for the public, private and virtual names.
o For 11.2 without Grid Naming Service (aka GNS), name resolution should work for all public, virtu
al, and SCAN names; and if SCAN is configured in DNS, it should not be in local hosts file.
o For 11.2.0.2 and above, multicast group 230.0.1.0 should work on private network; with patch 9974
223, both group 230.0.1.0 and 224.0.0.251 are supported. With patch 10411721 (fixed in 11.2.0.3), br
oadcast will be supported. See Multicast/Broadcast section to verify.
o For 11.2.0.1-11.2.0.3, Installer may report a warning if reverse lookup is not setup correctly for pubi
c IP, node VIP, and SCAN VIP, with bug 9574976 fix in 11.2.0.4, the warning shouldn't be there any
more.
o OS level bonding is recommended for the private network for pre-11.2.0.2. Depending on the platfo
rm, you may implement bonding teaming, Etherchannel, IPMP, MultiPrivNIC etc, please consult with
your OS vendor for details. Started from 11.2.0.2, Redundant Interconnect and HAIP is introduced to
provide native support for multiple private network, refer to note 1210883.1 for details.
o The commands verifies jumbo frames if it's configured. To know more about jumbo frames, refer to
note 341788.1

B. Example of what we expect


Example below shows what we expect while validating the network and name resolution setup. As the
network setup is slightly different for 11gR2 and 11gR1 or below, we have both case in the below exa
mple. The difference between 11gR1 or below and 11gR2 is for 11gR1, we need a public name, VIP n
ame, private hostname, and we rely on the private name to find out the private IP for cluster commuun
ication. For 11gR2, we do not rely on the private name anymore, rather the private network is selected
based on the GPnP profile while the clusterware comes up. Assuming a 3-node cluster with the follo
wing node information:

11gR1 or below cluster:


Nodename |Public IP |VIP name |VIP |Private |private IP1 |private IP2
|NIC/MTU | | |Name1 |NIC/MTU |
---------|----------|---------|-----------|--------|----------------------
<node1> |120.X.X.1 |<node1>v |120.X.X.11 |<node1>p |10.X.X.1 |
|eth0/1500 | | | |eth1/1500 |
---------|----------|---------|-----------|--------|----------------------
<node2> |120.X.X.2 |<node2>v |120.0.0.12 |<node2>p |10.X.X.2 |
|eth0/1500 | | | |eth1/1500 |
---------|----------|---------|-----------|--------|----------------------
<node3> |120.X.X.3 |<node3>v |120.X.X.13 |<node3>p |10.X.X.3 |
|eth0/1500 | | | |eth1/1500 |
---------|----------|---------|-----------|--------|----------------------

11gR2 cluster
Nodename |Public IP |VIP name |VIP |private IP1 |
|NIC/MTU | | |NIC/MTU |
---------|----------|---------|-----------|------------|----------
<node1> |120.X.X.1 |<node1>v |120.X.X.11 |10.X.X.1 |
|eth0/1500 | | |eth1/1500 |
---------|----------|---------|-----------|------------|----------
<node2> |120.X.X.2 |<node2>v |120.X.X.12 |10.X.X.2 |
|eth0/1500 | | |eth1/1500 |
---------|----------|---------|-----------|------------|----------
<node3> |120.X.X.3 |<node3>v |120.X.X.13 |10.X.X.3 |
|eth0/1500 | | |eth1/1500 |
---------|----------|---------|-----------|------------|----------

SCAN name |SCAN IP1 |SCAN IP2 |SCAN IP3


----------|-----------|-----------|--------------------
<scanname> |120.X.X.21 |120.X.X.22 |120.X.X.23
----------|-----------|-----------|--------------------

Below is what is needed to be verify on each node - please note the example is from a Linux platform:
1. To find out the MTU
/bin/netstat -in
Kernel Interface table
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0 1500 0 203273 0 0 0 2727 0 0 0 BMRU
In above example MTU is set to 1500 for eth0.

2. To find out the IP address and subnet, compare Broadcast and Netmask on all nodes
/sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:16:3E:11:11:11
inet addr:120.X.X.1 Bcast:120.0.0.127 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1111/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:203245 errors:0 dropped:0 overruns:0 frame:0
TX packets:2681 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:63889908 (60.9 MiB) TX bytes:319837 (312.3 KiB)
In the above example, the IP address for eth0 is 120.X.X.1, broadcast is 120.X.X.127, and net mask is
255.255.255.128, which is subnet of 120.0.0.0 with a maximum of 126 IP addresses. Refer to Section
"Basics of Subnet" for more details.
Note: An active NIC must have both "UP" and "RUNNING" flag; on Solaris, "PHYSRUNNING" will
indicate whether the physical interface is running

3. Run all ping commands twice to make sure result is consistent


Below is an example ping output from node1 public IP to node2 public hostname:
PING <nodename2> (120.X.X.2) from 120.X.X.1 : 1500(1528) bytes of data.
1508 bytes from <nodename2> (120.X.X.2): icmp_seq=1 ttl=64 time=0.742 ms
1508 bytes from rac1 (120.0.0.2): icmp_seq=2 ttl=64 time=0.415 ms

--- rac2 ping statistics ---


2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.415/0.578/0.742/0.165 ms
Please pay attention to the packet loss and time. If it is not 0% packet loss, or if it is not sub-second ti
me, then it indicates there is a problem in the network. Please engage network administrator to check
further.

3.1 Ping all public nodenames from the local public IP with packet size of MTU
/bin/ping -s <MTU> -c 2 -I 120.0.0.1 <nodename1>
/bin/ping -s <MTU> -c 2 -I 120.0.0.1 <nodename1>
/bin/ping -s <MTU> -c 2 -I 120.0.0.1 <nodename2>
/bin/ping -s <MTU> -c 2 -I 120.0.0.1 <nodename2>
/bin/ping -s <MTU> -c 2 -I 120.0.0.1 <nodename3>
/bin/ping -s <MTU> -c 2 -I 120.0.0.1 <nodename3>

3.2.1 Ping all private IP(s) from all local private IP(s) with packet size of MTU
applies to 11gR2 example, private name is optional
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 10.0.0.1
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 10.0.0.1
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 10.0.0.2
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 10.0.0.2
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 10.0.0.3
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 10.0.0.3

3.2.2 Ping all private nodename from local private IP with packet size of MTU
applies to 11gR1 and earlier example
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 rac1p
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 rac1p
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 rac2p
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 rac2p
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 rac3p
/bin/ping -s <MTU> -c 2 -I 10.0.0.1 rac3p

4. Traceroute private network


Example below shows traceroute from node1 private IP to node2 private hostname
# Packet size of MTU - on Linux packet length needs to be MTU - 28 bytes otherwise error send:
Message too long is reported.
# For example with MTU value of 1500 we would use 1472 :
traceroute to <nodename2>p (10.0.0.2), 30 hops max, 1472 byte packets
1 rac2p (10.0.0.2) 0.626 ms 0.567 ms 0.529 ms
MTU size packet traceroute complete in 1 hop without going through the routing table. Output other
than above indicates issue, i.e. when "*" or "!H" presents.
Note: traceroute option "-F" may not work on RHEL3/4 OEL4 due to OS bug, refer to note: 752844.1
for details.
4.1 Traceroute all private IP(s) from all local private IP(s) with :
applies to 11gR2 onwards
/bin/traceroute -s 10.0.0.1 -r -F 10.0.0.1 <MTU-28>
/bin/traceroute -s 10.0.0.1 -r -F 10.0.0.2 <MTU-28>
/bin/traceroute -s 10.0.0.1 -r -F 10.0.0.3 <MTU-28>
If "-F" option does not work, then traceroute without the "-F" parameter but with packet that's triple
the MTU size, i.e.:
/bin/traceroute -s 10.0.0.1 -r 10.0.0.1 <3 x MTU>

4.2 Traceroute all private nodename from local private IP with packet size of MTU
applies to 11gR1 and earlier example
/bin/traceroute -s 10.0.0.1 -r -F <nodename1>p <MTU-28>
/bin/traceroute -s 10.0.0.1 -r -F <nodename2>p <MTU-28>
/bin/traceroute -s 10.0.0.1 -r -F <nodename3>p <MTU-28>
If "-F" option does not work, then run traceroute without the "-F" parameter but with packet that's
triple MTU size, i.e.:
/bin/traceroute -s 10.0.0.1 -r <nodename1>p <3 x MTU>

5. Ping VIP hostname


# Ping of all VIP nodename should resolve to correct IP
# Before the clusterware is installed, ping should be able to resolve VIP nodename but
# should fail as VIP is managed by the clusterware
# After the clusterware is up and running, ping should succeed
/bin/ping -c 2 <nodename1>v
/bin/ping -c 2 <nodename1>v
/bin/ping -c 2 <nodename2>v
/bin/ping -c 2 <nodename2>v
/bin/ping -c 2 <nodename3>v
/bin/ping -c 2 <nodename3>v

6. Ping SCAN name


# applies to 11gR2
# Ping of SCAN name should resolve to correct IP
# Before the clusterware is installed, ping should be able to resolve SCAN name but
# should fail as SCAN VIP is managed by the clusterware
# After the clusterware is up and running, ping should succeed
/bin/ping -s <MTU> -c 2 -I 120.0.0.1 <scanname>
/bin/ping -s <MTU> -c 2 -I 120.0.0.1 <scanname>
/bin/ping -s <MTU> -c 2 -I 120.0.0.1 <scanname>

7. Nslookup VIP hostname and SCAN name


# applies to 11gR2
# To check whether VIP nodename and SCAN name are setup properly in DNS
/usr/bin/nslookup <nodename1>v
/usr/bin/nslookup <nodename2>v
/usr/bin/nslookup <nodename3>v
/usr/bin/nslookup <scanname>

8. To check name resolution order


# /etc/nsswitch.conf on Linux, Solaris and hp-ux, /etc/netsvc.conf on AIX
/bin/grep ^hosts /etc/nsswitch.conf
hosts: dns files
9. To check local hosts file
# If local files is in naming switch setting (nsswitch.conf), to make sure hosts file doesn't have typo or
# misconfiguration, grep all nodename and IP 127.0.0.1 should not map to SCAN, public, private and
# VIP hostname
Public and node VIP:
/bin/grep <nodename1> /etc/hosts
/bin/grep <nodename2> /etc/hosts
/bin/grep <nodename3> /etc/hosts
/bin/grep <nodename1>v /etc/hosts
/bin/grep <nodename2>v /etc/hosts
/bin/grep <nodename3>v /etc/hosts
/bin/grep 120.X.X.1 /etc/hosts
/bin/grep 120.X.X.2 /etc/hosts
/bin/grep 120.X.X.3 /etc/hosts
/bin/grep 120.X.X.11 /etc/hosts
/bin/grep 120.X.X.12 /etc/hosts
/bin/grep 120.X.X.13 /etc/hosts

# pre-11gR2 private example


/bin/grep <nodename1>p /etc/hosts
/bin/grep <nodename2>p /etc/hosts
/bin/grep <nodename3>p /etc/hosts
/bin/grep 10.X.X.1 /etc/hosts
/bin/grep 10.X.X.2 /etc/hosts
/bin/grep 10.X.X.3 /etc/hosts

# 11gR2 private example


/bin/grep 10.X.X.1 /etc/hosts
/bin/grep 10.X.X.2 /etc/hosts
/bin/grep 10.X.X.3 /etc/hosts

# SCAN example
# If SCAN name is setup in DNS, it should not be in local hosts file
/bin/grep <scanname> /etc/hosts
/bin/grep 120.X.X.21 /etc/hosts
/bin/grep 120.X.X.22 /etc/hosts
/bin/grep 120.X.X.23 /etc/hosts

C. Syntax reference
Please refer to below for command syntax on different platform
Linux:
/bin/netstat -in
/sbin/ifconfig
/bin/ping -s <MTU> -c 2 -I source_IP nodename
/bin/traceroute -s source_IP -r -F nodename-priv <MTU-28>
/usr/bin/nslookup

Solaris:
/bin/netstat -in
/usr/sbin/ifconfig -a
/usr/sbin/ping -i source_IP -s nodename <MTU> 2
/usr/sbin/traceroute -s source_IP -r -F nodename-priv <MTU>
/usr/sbin/nslookup
HP-UX:
/usr/bin/netstat -in
/usr/sbin/ifconfig NIC
/usr/sbin/ping -i source_IP nodename <MTU> -n 2
/usr/contrib/bin/traceroute -s source_IP -r -F nodename-priv <MTU>
/bin/nslookup

AIX:
/bin/netstat -in
/usr/sbin/ifconfig -a
/usr/sbin/ping -S source_IP -s <MTU> -c 2 nodename
/bin/traceroute -s source_IP -r nodename-priv <MTU>
/bin/nslookup

Windows:
MTU:
Windows XP: netsh interface ip show interface
Windows Vista/7: netsh interface ipv4 show subinterfaces
ipconfig /all
ping -n 2 -l <MTU-28> -f nodename
tracert
nslookup

D. Multicast
Started with 11.2.0.2, multicast group 230.0.1.0 should work on private network for bootstrapping.
patch 9974223 introduces support for another group 224.0.0.251
Please refer to note 1212703.1 to verify whether multicast is working fine.

As fix for bug 10411721 is included in 11.2.0.3, broadcast is supported for bootstrapping as well as
multicast. When 11.2.0.3 Grid Infrastructure starts up, it will try broadcast, multicast group 230.0.1.0
and 224.0.0.251 simultaneously, if anyone succeeds, it will be able to start.

On hp-ux, if 10 Gigabit Ethernet is used as private network adapter, without driver revision
B.11.31.1011 or later of the 10GigEthr-02 software bundle, multicast may not work. Run "swlist
10GigEthr-02" command to identify the current version on your HP server.

E. Runtime network issues


OSWatcher or Cluster Health Monitor(IPD/OS) can be deployed to capture runtime network issues.

F. Symptoms of network issues


o ping doesn't work, ping packet loss or ping time is too long (not sub-second)
o traceroute doesn't work
o name resolution doesn't work
o traceroute output like:
1 racnode1 (192.168.30.2) 0.223 ms !X 0.201 ms !X 0.193 ms !X
o gipcd.log shows:
2010-11-21 13:00:44.455: [ GIPCNET][1252870464]gipcmodNetworkProcessConnect: [network]
failed connect attempt endp 0xc7c5590 [0000000000000356] { gipcEndpoint : localAddr
'gipc://<nodename3>:08b1-c475-a88e-8387#10.XX.XX.23#27573', remoteAddr
'gipc://<nodename2>:nm_rac-cluster#192.168.0.22#26869', numPend 0, numReady 1, numDone 0,
numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x80612, usrFlags 0x0 }, req 0xc7c5310
[000000000000035f] { gipcConnectRequest : addr 'gipc://<nodename2>:nm_rac-
cluster#192.168.0.22#26869', parentEn
2010-11-21 13:00:44.455: [ GIPCNET][1252870464]gipcmodNetworkProcessConnect: slos op :
sgipcnTcpConnect
2010-11-21 13:00:44.455: [ GIPCNET][1252870464]gipcmodNetworkProcessConnect: slos dep : No
route to host (113)
or
2010-11-04 12:33:22.133: [ GIPCNET][2314] gipcmodNetworkProcessSend: slos op :
sgipcnUdpSend
2010-11-04 12:33:22.133: [ GIPCNET][2314] gipcmodNetworkProcessSend: slos dep : Message too
long (59)
2010-11-04 12:33:22.133: [ GIPCNET][2314] gipcmodNetworkProcessSend: slos loc : sendto
2010-11-04 12:33:22.133: [ GIPCNET][2314] gipcmodNetworkProcessSend: slos info : dwRet
4294967295, addr '19
o ocssd.log shows:
2010-02-03 23:26:25.804: [GIPCXCPT][1206540320]gipcmodGipcPassInitializeNetwork: failed to
find any interfaces in clsinet, ret gipcretFail (1)
2010-02-03 23:26:25.804: [GIPCGMOD][1206540320]gipcmodGipcPassInitializeNetwork:
EXCEPTION[ ret gipcretFail (1) ] failed to determine host from clsinet, using default
..
2010-02-03 23:26:25.810: [ CSSD][1206540320]clsssclsnrsetup: gipcEndpoint failed, rc 39
2010-02-03 23:26:25.811: [ CSSD][1206540320]clssnmOpenGIPCEndp: failed to listen on gipc
addr gipc://rac1:nm_eotcs- ret 39
2010-02-03 23:26:25.811: [ CSSD][1206540320]clssscmain: failed to open gipc endp
or
2010-09-20 11:52:54.014: [ CSSD][1103055168]clssnmvDHBValidateNCopy: node 1, racnode1,
has a disk HB, but no network HB, DHB has rcfg 180441784, wrtcnt, 453, LATS 328297844,
lastSeqNo 452, uniqueness 1284979488, timestamp 1284979973/329344894
2010-09-20 11:52:54.016: [ CSSD][1078421824]clssgmWaitOnEventValue: after CmInfo State val
3, eval 1 waited 0
.. >>>> after a long delay
2010-09-20 12:02:39.578: [ CSSD][1103055168]clssnmvDHBValidateNCopy: node 1,
<nodename1>, has a disk HB, but no network HB, DHB has rcfg 180441784, wrtcnt, 1037, LATS
328883434, lastSeqNo 1036, uniqueness 1284979488, timestamp 1284980558/329930254
2010-09-20 12:02:39.895: [ CSSD][1107286336]clssgmExecuteClientRequest: MAINT recvd from
proc 2 (0xe1ad870)
o crsd.log shows:
2010-11-29 10:52:38.603: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found
to node for 2614824036 ms, node 111ea99b0 { host '<nodename1>', haName '1e0b-174e-37bc-a515',
srcLuid 2612fa8e-3db4fcb7, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0,
lastValidAck 0, sendSeq [55 : 55], createTime 2614768983, flags 0x4 }
2010-11-29 10:52:42.299: [ CRSMAIN][515] Policy Engine is not initialized yet!
2010-11-29 10:52:43.554: [ OCRMAS][3342]proath_connect_master:1: could not yet connect to
master retval1 = 203, retval2 = 203
2010-11-29 10:52:43.554: [ OCRMAS][3342]th_master:110': Could not yet connect to new master [1]
or
2009-12-10 06:28:31.974: [ OCRMAS][20]proath_connect_master:1: could not connect to master
clsc_ret1 = 9, clsc_ret2 = 9
2009-12-10 06:28:31.974: [ OCRMAS][20]th_master:11: Could not connect to the new master
2009-12-10 06:29:01.450: [ CRSMAIN][2] Policy Engine is not initialized yet!
2009-12-10 06:29:31.489: [ CRSMAIN][2] Policy Engine is not initialized yet!
or
2009-12-31 00:42:08.110: [ COMMCRS][10]clsc_receive: (102b03250) Error receiving, ns (12535,
12560), transport (505, 145, 0)
o octssd.log shows:
2011-04-16 02:59:46.943: [ CTSS][1]clsu_get_private_ip_addresses: clsinet_GetNetData failed ().
Return [7]
[ CTSS][1](:ctss_init6:): Failed to call clsu_get_private_ip_addr [7]
gipcmodGipcPassInitializeNetwork: failed to find any interfaces in clsinet, ret gipcretFail (1)
gipcmodGipcPassInitializeNetwork: EXCEPTION[ ret gipcretFail (1) ] failed to determine host from
clsinet, using default
[ CRSCCL][2570585920]No private IP addresses found.
(:CSSNM00008:)clssnmCheckDskInfo: Aborting local node to avoid splitbrain. Cohort of 1 nodes
with leader 2, <nodename2>, is smaller than cohort of 1 nodes led by node 1, <nodename1>, based on
map type 2

On Oracle RAC environment Load Balancing is something which is critical for distribution of connect
ions between the servers we have. Load Balancing Advisory(LBA) is one of the key components for S
CAN listener to decide the best instance for the new incoming connection request. I was just wonderin
g if we can check if Load Balancing Advisory is enabled by default to all the services that run on mult
iple nodes then can we disable this feature? But let us understand exactly how critical is LBA for SCA
N listener and its effects. The figure above explains that SCAN listener might use the data from LBA
and decides the best instance to route the connection, but which type of Load Balancing this falls into?
We have already understood the functionality of SCAN listener in our earlier blog “How does SCAN
listener works in Oracle RAC 11gR2?”

Let me quickly take you through the concepts of Load Balancing types and then we will move ahead
to the activity of controlling it.

Concepts:
Typically, there are two types of load balancing:
Connection Load Balancing (CLB)
Run-time Load Balancing (RTLB)

RTLB has come into existence from 10gR2 which can be either set to “SERVICE_TIME” or “THRO
UGHPUT”. CLB can be configured at Client-Side or at Server-Side. Of which, Server-Side load balan
cing is recommended and have better functionality over Client-Side.

Is RTLB a client-side or server-side load balancing? Well, it is combination of both and majorly on th
e client side. Connection and performance statistics is collected on server side and transferred to the ap
plication server recursively by using ONS (Oracle Notification Service) infrastructure. It might not be
applicable to all the application setups, by default JDBC based applications can use this RTLB feature.

What is the role of LBA? LBA is the key for Run-time Load Balancing. Both these configurations co
mpletely rely on the statistical data provided by LBA (Load Balancing Advisory).

Functionality of RTLB: For a fixed time period the performance statistics collected by LBA (with the
help of PMON of all instances) are pushed into application server through ONS service and we call th
em as LBA events. Every time a new connection request triggers from application it will pass through
the connection cache/pool on the client side and picks up the best instance to connect based on the ser
vice level goal configured.

Yes, now you might agree that Run-Time Load Balancing is more preferred than Connection Load B
alancing as major and important decisions are taken at the client side before connection request reach
es SCAN listener. In this blog, let us run through the process of creating a service, understand the wei
ghtage of LBA and if that can be disabled.
Check out how Runtime load balancing works with SCAN listener in Oracle RACCLICK TO TWEE
T

Case study:
Here is our case study, to create a default service on a 2-node RAC environment and verify the
parameters of load balancing. We will take this study ahead by changing the parameters to understand
the importance of LBA.

1. Creating a default service without specifying the type of load balancing.


[oracle@RAC1 ~]$ srvctl add service -s LBASRV1 -d RACDB -g RACPOOL
[oracle@RAC1 ~]$ srvctl start service -s LBASRV1 -d RACDB
[oracle@RAC1 ~]$ srvctl config service -s LBASRV1 -d RACDB

<top lines removed>


Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
<next lines removed>

Our Findings: When a service is created by default Connection Load Balancing is enabled and set to
“LONG” and Run-Time Load Balancing is disabled.

Does it mean that LBA is not used by SCAN at all? No, as it is server side load balancing LBA statist
ics are used by SCAN listener to perform the load balancing (Keep this on hand). This is the appropri
ate load balancing that happens on the database server at the listener.

2. Now let us edit the configuration of this service and enable Run-Time Load Balancing.
[oracle@RAC1 ~]$ srvctl modify service -d RACDB -s LBASRV1 -B SERVICE_TIME
[oracle@RAC1 ~]$ srvctl config service -d RACDB -s LBASRV1

<top lines removed>


Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: SERVICE_TIME
<removed: same as above>

[oracle@RAC1 ~]$ srvctl modify service -d RACDB -s LBASRV1 -B THROUGHPUT


[oracle@RAC1 ~]$ srvctl config service -d RACDB -s LBASRV1

<removed: same as above>


Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: THROUGHPUT
<next lines removed>

Our Findings: Irrespective of Connection Load Balancing Goal “SHORT” or “LONG” you can enable
Runtime Load Balancing Goal to “SERVICE_TIME” or “THROUGHPUT”.

SERVICE_TIME: Attempts to direct work requests to instances according to response time. Load
balancing advisory data is based on elapsed time for work done in the service plus available bandw
idth to the service.

THROUGHPUT: Attempts to direct work requests according to throughput. The load balancing advis
ory is based on the rate that work is completed in the service plus available bandwidth to the service.
Key point: With Connection Load Balancing goal set to LONG, do not configure Run-time Load Bala
ncing as it is only applicable to applications where next session or job starts only after the current one
ends which is not practical. This is the reason you must have read that Run-time Load Balancing must
be enabled with CLB goal set to SHORT.

Is there a way to disable Load Balancing Advisory at all in Oracle RAC?CLICK TO TWEET

Can we disable CLB?


Disabling CLB on server-side is equivalent to disabling LBA (Load Balancing Advisory). You will
not find an option to do it if you search in “srvctl” help.

Oracle document says that configuring GOAL to NONE will disable Load Balancing Advisory (LBA)
on the service. Let us try doing it and see what exactly it is.

3. I will try to use DBMS_SERVICE package to modify this service and to disable LBA.
SQL> EXECUTE DBMS_SERVICE.MODIFY_SERVICE (service_name => ‘LBASRV1’, goal =>
DBMS_SERVICE.GOAL_NONE);

SQL> select GOAL,CLB_GOAL from dba_services where name=’LBASRV1′;


GOAL CLB_G
———— —–
NONE LONG

But as soon as we restart the service using srvctl GOAL has come back to THROUGHPUT.
[oracle@RAC1 ~]$ srvctl stop service -s LBASRV1 -d RACDB
[oracle@RAC1 ~]$ srvctl start service -s LBASRV1 -d RACDB

SQL> select GOAL,CLB_GOAL from dba_services where name=’LBASRV1′;


GOAL CLB_G
———— —–
THROUGHPUT LONG

You can call this as bug as the configuration changes done in SQL prompt will not be updated in
OCR. So let us try doing it using srvctl.
[oracle@RAC1 ~]$ srvctl stop service -s LBASRV1 -d RACDB
[oracle@RAC1 ~]$ srvctl modify service -s LBASRV1 -d RACDB -B NONE
[oracle@RAC1 ~]$ srvctl config service -s LBASRV1 -d RACDB

<top lines removed>


Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
<next lines removed>

This has come back to the default configuration when we created this service. Yes, RTLB is now disa
bled but will it not use LBA according to the Oracle document?

4. Let us connect to the database now using this service from the client server and check if LBA is
used by SCAN listener.
Client server: C:\Users\Administrator>sqlplus system/########@//192.168.122.#/LBASRV1
Database server:
SQL> select user_data from sys.sys$service_metrics_tab order by enq_time;

Last row of the output:


SYS$RLBTYP(‘LBASRV1’, ‘VERSION=1.0 database=RACDB service=LBASRV1
{ {instance=RACDB_2 percent=50 flag=UNKNOWN aff=FALSE}
{instance=RACDB_1 percent=50 flag=UNKNOWN aff=FALSE} } timestamp=2016-03-17
16:38:38′) Current system time:

SQL> !date
Thu Mar 17 16:49:46 IST 2016
This means that LBA events are not generated when new connection request was handled by SCAN
listener.

5. Now let us change the Connection Load Balancing to “SHORT”, reconnect from client server and
see if LBA is used by SCAN listener.
[oracle@RAC1 ~]$ srvctl modify service -s LBASRV1 -d RACDB -j SHORT
[oracle@RAC1 ~]$ srvctl stop service -s LBASRV1 -d RACDB
[oracle@RAC1 ~]$ srvctl start service -s LBASRV1 -d RACDB
[oracle@RAC1 ~]$ srvctl config service -s LBASRV1 -d RACDB

<top lines removed>


Connection Load Balancing Goal: SHORT
Runtime Load Balancing Goal: NONE
<next lines removed>

Client server: C:\Users\Administrator>sqlplus system/########@//192.168.122.#/LBASRV1

Database server:
SQL> SELECT user_data FROM sys.sys$service_metrics_tab ORDER BY 1 ;
Last row of the output:
SYS$RLBTYP(‘LBASRV1’, ‘VERSION=1.0 database=RACDB service=LBASRV1
{ {instance=RACDB_2 percent=50 flag=UNKNOWN aff=FALSE}
{instance=RACDB_1 percent=50 flag=UNKNOWN aff=FALSE} } timestamp=2016-03-17
16:38:38′) Current system time:

SQL> !date
Thu Mar 17 16:54:46 IST 2016

Even with “SHORT” goal we don’t see any LBA events generated by Oracle RAC. Does it mean that
LBA is not used by SCAN listener when only CLB is configured? Hold on!!! Let us clear this road
block.

6. Let me try to connect from multiple client sessions and see if load is distributed across RAC
instances.
SQL> select inst_id,username from gv$session where username=’SYSTEM’;
INST_ID USERNAME
———- ——————————
1 SYSTEM
2 SYSTEM
2 SYSTEM

Woooh!!!, there is load balancing on server side. Is it with the help of LBA? We have disabled it
before this step, isn’t it?
One of the Oracle documents shown above says that LBA can be disabled by setting GOAL to NONE
and this document says that server-side load balancing uses LBA even when Run-time Load
Balancing is disabled. It is not surprising if you have got confused now 🙂

Summary of our case study:


a) Server-side CLB is better over Client-side CLB and RTLB is best over Server-side CLB.
b) Application should be able to read LBA events published by RAC to make the most out of
RTLB. Mostly JBDC, ODBC, Oracle Data Provider for .NET (ODP.NET) can be configured to
handle RTLB.
c) We can enable (SERVICE_TIME/THROUGHPUT) or disable(NONE) RTLB by using srvctl
command line utility for a service.
d) When RTLB is disabled Load Balancing Advisory EVENTS are disabled, which means that no
notifications are published through ONS service – This is what we have observed by running a
select query on sys$service_metrics view.
e) You cannot disable CLB – it has to be either SHORT/LONG. This also means that it uses Load
Balancing Advisory(LBA) data collected by PMON of all the instances without which service
goal metrics cannot be met.
Now, I understand that LBA is definitely mandatory for SCAN listener in Oracle RAC.

You might also like