Linux Clustering


Installation Manual

Prerequisites:
• Operating system installed on all servers
• Physical connectivity between the shared storage and the
servers (to be done by the hardware vendor)

Configuring the disk on shared storage


• Shut down the server with the init 0 command.
• Switch off the power button.
• Power on the shared storage and wait until the message "startup
complete" appears on the storage.
• Switch on the server.
• When the system detects the first HP Smart Array controller (for the
system), it displays the message "logical drive found".
• When the system detects the second HP Smart Array controller (for the
shared storage), it displays the message "logical drive found" and asks
you to press the F8 key to configure the controller. Press F8.
• Select the 1st option to create a logical drive.
• Select the disks as per the annexure and press Enter to create the
logical drive. For the quorum, select the last 2 disks.
• Press the F8 key to save the configuration.
• Select the 1st option again if you need to configure more disks.
• Press the Esc key to quit the configuration menu.

Creating quorum partition for cluster


Login as root.
Run "fdisk /dev/cciss/c0d2" (where c0 is the controller name and d2 is the
device name; please see the annexure for details).
1. Press "p" to see the partition details.
2. Press "n" to create a new partition.
3. Type "p" for a primary partition.
4. Type "1" for the 1st primary partition.
5. Press Enter to accept the default 1st cylinder.
6. Type "+100M".
7. Press "n" again to create a new partition.
8. Type "p" for a primary partition.
9. Type "2" for the 2nd primary partition.
10. Press Enter to accept the default 1st cylinder.
11. Type "+100M".
12. Press "w" to write and save the partition table.
Page 1 of 25 Telesoft Confidential

13. Press "q" to quit the fdisk menu.
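The interactive steps above can also be scripted by piping the keystrokes to fdisk's standard input. The following is a sketch only: the device path /dev/cciss/c0d2 must be taken from the annexure, and the script refuses to touch anything unless that block device actually exists.

```shell
# Keystroke sequence for fdisk, mirroring steps 1-12 above:
# p=print table, n=new partition, p=primary, 1/2=partition number,
# an empty line accepts the default first cylinder, w=write and save.
DEV=/dev/cciss/c0d2
KEYS='p
n
p
1

+100M
n
p
2

+100M
w
'

if [ -b "$DEV" ]; then
    # Feed the keystrokes to fdisk non-interactively.
    printf '%s' "$KEYS" | fdisk "$DEV"
else
    echo "device $DEV not present; no changes made"
fi
```

Review the sequence carefully before running it; unlike the interactive session, a scripted fdisk gives you no chance to quit with "q" before writing.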

Creating data partition on shared storage for cluster

Login as root.
Run "fdisk /dev/cciss/c0d2" (where c0 is the controller name and d2 is the
device name; please see the annexure for details).
1. Press "p" to see the partition details.
2. Press "n" to create a new partition.
3. Type "p" for a primary partition.
4. Type "1" for the 1st primary partition.
5. Press Enter to accept the default 1st cylinder.
6. Type "+100M".
Repeat this process to create more partitions; see the annexure for the
respective cluster.
7. Press "w" to write and save the partition table.
8. Press "q" to quit the fdisk menu.

Configuring raw devices for cluster


Edit /etc/sysconfig/rawdevices and make the following entries:

/dev/raw/raw1 /dev/cciss/c1d0p1
/dev/raw/raw2 /dev/cciss/c1d0p2

(where c1d0p1 is the device name; please see the annexure for the correct
partition names)
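A quick sanity check of these entries can be scripted: each non-comment line in the rawdevices file should map a /dev/raw/rawN node to a block-device path. This is a sketch, demonstrated here against a sample file rather than the live /etc/sysconfig/rawdevices.

```shell
# Validate rawdevices-style entries: "/dev/raw/rawN  /dev/<block device>".
check_rawdevices() {
    awk '!/^#/ && NF {
        if ($1 !~ /^\/dev\/raw\/raw[0-9]+$/ || $2 !~ /^\/dev\//)
            { print "bad entry: " $0; bad = 1 }
        else
            print "ok: " $1 " -> " $2
    } END { exit bad }' "$1"
}

# Demonstrate on a sample copy (stand-in for /etc/sysconfig/rawdevices).
cat > /tmp/rawdevices.sample <<'EOF'
/dev/raw/raw1 /dev/cciss/c1d0p1
/dev/raw/raw2 /dev/cciss/c1d0p2
EOF
check_rawdevices /tmp/rawdevices.sample
```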

Installing the Red Hat Cluster Suite

The clumanager and redhat-config-cluster packages are required to
configure the Red Hat Cluster Manager. Perform the following instructions
to install the Red Hat Cluster Suite on a Red Hat Enterprise Linux system.
Installation with the Package Management Tool
1. Insert the Red Hat Cluster Suite CD in your CD-ROM drive. If you are
using a graphical desktop, the CD will automatically run the Package
Management Tool. Click Forward to continue.


2. Check the box for the Red Hat Cluster Suite, and click the Details link to
view the package descriptions.


3. While viewing the package group details, check the box next to the
packages to install. Click Close when finished.


4. The Package Management Tool shows an overview of the packages to be
installed. Click Forward to install the packages.
5. When the installation is complete, click Finish to exit the Package
Management Tool.

Installation with rpm


If you are not using a graphical desktop environment, you can install the
packages manually using the rpm utility at a shell prompt. Insert the Red
Hat Cluster Suite CD into the CD-ROM drive. Log into a shell prompt,
change to the RedHat/RPMS/ directory on the CD, and type the following
commands as root (replacing version and arch with the version and
architecture of the packages to install):
rpm -ivh clumanager-version.arch.rpm
rpm -Uvh redhat-config-cluster-version.noarch.rpm

The Cluster Configuration Tool (you need to run this tool on
one server only)
Red Hat Cluster Manager consists of the following RPM packages:
clumanager: This package consists of the software that is responsible for
cluster operation (including the cluster daemons).
redhat-config-cluster: This package contains the Cluster Configuration
Tool and the Cluster Status Tool, which allow for the configuration of the
cluster and the display of the current status of the cluster and its members
and services.
You can use either of the following methods to access the Cluster
Configuration Tool:
Select Main Menu => System Settings => Server Settings => Cluster.
Or, at a shell prompt, type redhat-config-cluster.

The first time that the application is started, the Cluster Configuration Tool
is displayed. After the cluster configuration is complete, the command starts
the Cluster Status Tool by default.

To access the Cluster Configuration Tool from the Cluster Status Tool,
select Cluster => Configure.



The following tabbed sections are available within the Cluster Configuration
Tool:
Members: Use this section to add members to the cluster and optionally
configure a power controller connection for any given member.
Failover Domains: Use this section to establish one or more subsets of the
cluster members for specifying which members are eligible to run a service
in the event of a system failure. (Note that the use of failover domains is
optional.)
Services: Use this section to configure one or more services to be
managed by the cluster. As you specify an application service, the
relationship between the service and its IP address, device special file,
mount point, and NFS exports is represented by a hierarchical structure.
The parent-child relationships in the Cluster Configuration Tool reflect the
organization of the service information in the /etc/cluster.xml file.

Do not manually edit the contents of the /etc/cluster.xml file.


Do not simultaneously run the Cluster Configuration Tool on multiple
members. (It is permissible to run the Cluster Status Tool on more than
one member at a time.) The Cluster Configuration Tool stores information
about the cluster service and daemons, cluster members, and cluster
services in the /etc/cluster.xml configuration file. The cluster configuration
file is created the first time the Cluster Configuration Tool is started.
Save the configuration at any point (using File => Save) while running the
Cluster Configuration Tool. When File => Quit is selected, you are prompted
to save changes if any unsaved changes to the configuration are detected.

The Cluster Configuration Tool uses a hierarchical tree structure to show
relationships between components in the cluster configuration. A triangular
icon to the left of a component name indicates that the component has
children. To expand or collapse the portion of the tree below a component,
click the triangle icon.


Configuring Cluster Daemons


The Red Hat Cluster Manager provides the following daemons to monitor
cluster operation:
clumembd: Cluster membership daemon


Select a failover time of 10 seconds.
Select broadcast heartbeating.

cluquorumd: Cluster quorum daemon
Select the ping interval and enter a ping interval time of 2 seconds.

clurmtabd: Synchronizes NFS mount entries in /var/lib/nfs/rmtab with a
private copy on a service's mount point


clusvcmgrd: Service manager daemon

clulockd: Global lock manager (the only client of this daemon is
clusvcmgrd)

Each of these daemons can be individually configured using the Cluster
Configuration Tool. To access the Cluster Daemon Properties dialog box,
choose Cluster => Daemon Properties.
The following sections explain how to configure cluster daemon properties.
However, note that the default values are applicable to most configurations
and do not need to be changed.


You can specify the following properties for the clumembd daemon:
Log Level: Determines the level of event messages that get logged to the
cluster log file (by default /var/log/messages). Choose the appropriate
logging level from the menu.
Failover Speed: Determines the number of seconds that the cluster service
waits before shutting down a non-responding member (that is, a member
from which no heartbeat is detected). To set the failover speed, drag the
slider bar. The default failover speed is 10 seconds.

Configuring the raw devices in cluster


The rawdevices configuration must be performed on all cluster members,
and all members must use the same raw devices (from the previous
example, /dev/raw/raw1 and /dev/raw/raw2).
To check the raw device configuration on the current cluster member, choose
Cluster => Shared State
in the Cluster Configuration Tool. The Shared State dialog is displayed.


Adding a member to the cluster


1. Ensure that the Members tab is selected and click New. You are prompted
for a member name.
2. Enter the host name of a system on the cluster subnet. Note that each
member must be on the same subnet as the system on which you are
running the Cluster Configuration Tool and must be defined either in DNS
or in each cluster system's /etc/hosts file.
The system on which you are running the Cluster Configuration Tool must
be explicitly added as a cluster member; the system is not automatically
added to the cluster configuration as a result of running the Cluster
Configuration Tool.

3. Leave Enable SWWatchdog checked. (A software watchdog timer
enables a member to reboot itself.)
4. Click OK.
5. Choose File => Save to save the changes to the cluster configuration.
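The /etc/hosts entries required by step 2 might look like the following on each member (the host names and IP addresses here are illustrative only; use the values from the annexure):

```
192.168.10.1    cluster0
192.168.10.2    cluster1
```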
Configuring a Failover Domain


Adding a failover domain to the cluster


1. Select the Failover Domains tab and click New. The Failover Domain
dialog box is displayed.
2. Enter a name for the domain (for example, Oracle) in the Domain Name
field. The name should be descriptive enough to distinguish its purpose
relative to other names used on your network.
3. Check Ordered Failover if you want members to assume control of a
failed service in a particular sequence; preference is indicated by the
member's position in the list of members in the domain, with the most
preferred member at the top.
4. Click Add Members to select the members for this failover domain. The
Failover Domain Member dialog box is displayed. You can choose multiple
members from the list by pressing either the [Shift] key while clicking the
start and end of a range of members, or pressing the [Ctrl] key while
clicking non-contiguous members.
5. When finished selecting members from the list, click OK. The selected
members are displayed in the Failover Domain list.
6. When Ordered Failover is checked, you can rearrange the order of the
members in the domain by dragging a member name in the list box to
the desired position. A thin black line indicates the new row position
when you release the mouse button.
7. When finished, click OK.
8. Choose File => Save to save the changes to the cluster configuration.
To remove a member from a failover domain, follow these steps:
1. On the Failover Domains tab, double-click the name of the domain you
want to modify (or select the domain and click Properties).
2. In the Failover Domain dialog box, click the name of the member you
want to remove from the domain and click Delete Member. (Members must
be deleted one at a time.) You are prompted to confirm the deletion.
3. When finished, click OK.
4. Choose File => Save to save the changes to the cluster configuration.

Add a service to the cluster


1. Select the Services tab and click New. The Service dialog is displayed.


2. Give the service a descriptive Service Name to distinguish its
functionality relative to other services that may run on the cluster.
3. If you want to restrict the members on which this service is able to run,
choose a failover domain from the Failover Domain list.
4. Adjust the quantity in the Check Interval field, which sets the interval (in
seconds) that the cluster infrastructure checks the status of a service. This
field is only applicable if the service script is written to check the status of
a service.
5. Specify a User Script that contains settings for starting, stopping, and
checking the status of a service.
6. Specify service properties, including an available floating IP address (an
address that can be transferred transparently from a failed member to a
running member in the event of failover) and devices (which are
configured as children of the service).

Adding an IP address for a service


To specify a service IP address, follow these steps:
1. On the Services tab of the Cluster Configuration Tool, select the service
you want to configure and click Add Child.
2. Select Add Service IP Address and click OK.
3. Specify an IP address (which must be resolvable by DNS but cannot be
the IP address of a running service). See the annexure for the service IP
address.
4. Optionally specify a netmask and broadcast IP address.
5. Choose File => Save to save the change to the /etc/cluster.xml
configuration file.


Adding a device for a service


1. On the Services tab of the Cluster Configuration Tool, select the service
you want to configure and click Add Child.
2. Select Add Device and click OK.
3. Specify a Device Special File (for example, /dev/cciss/c1d0p1). Each
device must have a unique device special file and a unique mount point
within the cluster. See the annexure for the device special file and mount
point for the respective cluster.
4. Specify the directory on which to mount the device (for example, /data)
in the Mount Point field. This directory should not be listed in /etc/fstab,
as it is automatically mounted by the Red Hat Cluster Manager when the
service is started.
5. Choose a file system type from the FS Type list.
6. Optionally specify Options for the device. If you leave the Options field
blank, the default mount options (rw,suid,dev,exec,auto,nouser,async) are
used. Refer to the mount man page for a complete description of the
available options.
7. Check Force Unmount to force any application that has the specified file
system mounted to be killed prior to disabling or relocating the service
(when the application is running on the same member that is running the
disabled or relocated service).
8. When finished, click OK.
9. Choose File => Save to save the change to the /etc/cluster.xml
configuration file.
Your single-node cluster has now been configured.

To check the cluster service:
1. Once again select Main Menu => System Settings => Server
Settings => Cluster, or run 'redhat-config-cluster' from a terminal.
2. The Cluster Status window will appear.
3. Click the Configure menu and select Start Local Cluster Daemon.
4. The cluster service will be started on this node.
5. The application will be started on this node.
6. Failed and running services will be displayed in the lower window.


Adding a second member to the cluster

Stop the cluster service on this node.
Copy the /etc/cluster.xml file to the second node at the same location
(the /etc directory).

Configuring the log file for the cluster

1. Edit /etc/syslog.conf.
2. Add 'local4.none' to the message-type line
(example: *.info;mail.none;authpriv.none;cron.none;local4.none
/var/log/messages)
3. Add the following 2 lines and save the file:

# cluster messages to be put in the /var/log/cluster.log file

local4.*    /var/log/cluster.log
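After restarting syslogd (for example with "service syslog restart"), the routing can be verified by sending a test message to the local4 facility. A sketch; logger is part of util-linux, and the message only reaches the cluster log file if syslogd is running with the entry above in place.

```shell
MSG="cluster log routing test"
if command -v logger >/dev/null 2>&1; then
    # Send one message to the local4 facility configured above.
    if logger -p local4.info "$MSG" 2>/dev/null; then
        echo "sent: $MSG"
    else
        echo "logger could not reach syslogd"
    fi
else
    echo "logger not available"
fi
```

If routing works, the message appears in the cluster log file configured above.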


Modifying the logrotate file for the cluster

Edit the /etc/logrotate.conf file and add the following entry:

/var/log/cluster.log {
    monthly
    create 0664 root utmp
    rotate 5
    postrotate
        /sbin/killall -HUP syslogd
    endscript
}

Configuring the cluster at startup

Run "chkconfig clumanager on" in the terminal.


Health checkup and managing the Linux server

df -h
Make sure that no partition is more than 80% full.
The output will look like the following (all outputs are examples only):
[root@delhi-oam OAM]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p8 1008M 437M 520M 46% /
/dev/cciss/c0d0p1 97M 15M 77M 17% /boot
/dev/cciss/c0d0p3 5.8G 3.8G 1.8G 69% /data
/dev/cciss/c0d0p10 114G 72G 37G 67% /home
none 1.9G 0 1.9G 0% /dev/shm
/dev/cciss/c0d0p7 1008M 17M 941M 2% /tmp
/dev/cciss/c0d0p5 2.9G 2.1G 666M 77% /usr
/dev/cciss/c0d0p9 483M 53M 406M 12% /usr/local
/dev/cciss/c0d0p6 1008M 909M 49M 95% /var
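The 80% rule can be checked with a short script that parses df output and flags any filesystem above a threshold. This is a sketch, demonstrated on a saved sample of the output above; on a live server feed it the output of df -hP instead.

```shell
# Flag filesystems whose Use% (column 5 of df -hP output) exceeds a limit.
flag_full() {
    awk -v limit="${2:-80}" 'NR > 1 {
        sub(/%/, "", $5)                # strip the % sign from Use%
        if ($5 + 0 > limit) print $6 " is " $5 "% full"
    }' "$1"
}

# Demonstrate on a saved sample; on a live system use: df -hP > /tmp/df.out
cat > /tmp/df.sample <<'EOF'
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p8 1008M 437M 520M 46% /
/dev/cciss/c0d0p6 1008M 909M 49M 95% /var
EOF
flag_full /tmp/df.sample
```

For the sample above this prints "/var is 95% full".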

uptime: to check that the system has not rebooted


[root@delhi-oam OAM]# uptime
15:17:33 up 198 days, 23:47, 13 users, load average: 0.10, 0.08, 0.06

top: to check the system load

[root@subeng proc]# top


5:22pm up 17 days, 6:08, 6 users, load average: 20.07, 19.82, 16.69
369 processes: 348 sleeping, 21 running, 0 zombie, 0 stopped
CPU0 states: 78.0% user, 20.4% system, 78.3% nice, 0.2% idle
CPU1 states: 78.1% user, 21.2% system, 77.4% nice, 0.2% idle
CPU2 states: 80.0% user, 17.2% system, 80.2% nice, 1.4% idle
CPU3 states: 77.3% user, 20.1% system, 77.0% nice, 1.4% idle


Mem: 3727828K av, 3700168K used, 27660K free, 370940K shrd, 177712K buff
Swap: 2096440K av, 47416K used, 2049024K free, 2828716K cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME
COMMAND
28742 root 39 5 141M 141M 14716 R N 22.6 3.8 10:03 jrun
28840 root 39 5 141M 141M 14716 R N 22.4 3.8 4:34 jrun
28955 root 39 5 141M 141M 14716 R N 21.8 3.8 4:59 jrun
28744 root 39 5 141M 141M 14716 R N 21.2 3.8 5:16 jrun
28660 root 39 5 141M 141M 14716 R N 21.0 3.8 6:33 jrun
28934 root 39 5 141M 141M 14716 R N 20.4 3.8 3:39 jrun
28722 root 39 5 141M 141M 14716 R N 20.2 3.8 6:01 jrun
28839 root 39 5 141M 141M 14716 R N 20.2 3.8 7:35 jrun
28658 root 39 5 141M 141M 14716 R N 20.0 3.8 13:22 jrun
28768 root 39 5 141M 141M 14716 R N 19.8 3.8 8:51 jrun
28732 root 39 5 141M 141M 14716 R N 19.4 3.8 5:36 jrun
28667 root 39 5 141M 141M 14716 R N 19.2 3.8 6:19 jrun
28757 root 39 5 141M 141M 14716 R N 18.6 3.8 5:00 jrun
28941 root 39 5 141M 141M 14716 R N 18.6 3.8 3:23 jrun
28931 root 39 5 141M 141M 14716 R N 18.4 3.8 5:21 jrun
28867 root 39 5 141M 141M 14716 R N 17.8 3.8 3:37 jrun
25037 root 39 5 141M 141M 14716 R N 17.6 3.8 49:38 jrun
25041 root 39 5 141M 141M 14716 R N 17.0 3.8 49:09 jrun
28726 root 39 5 141M 141M 14716 R N 16.6 3.8 11:58 jrun
28856 root 39 5 141M 141M 14716 R N 16.4 3.8 6:19 jrun
24926 root 20 5 141M 141M 14716 S N 1.5 3.8 1:50 jrun
25001 root 20 5 141M 141M 14716 S N 1.3 3.8 0:41 jrun
25002 root 20 5 141M 141M 14716 S N 0.9 3.8 0:42 jrun
30149 root 15 0 1240 1240 832 R 0.7 0.0 0:00 top
14573 test 15 0 728 704 624 S 0.1 0.0 4:51 ping
28946 root 21 5 141M 141M 14716 S N 0.1 3.8 0:03 jrun
1 root 15 0 496 448 448 S 0.0 0.0 0:32 init
2 root 15 0 0 0 0 SW 0.0 0.0 0:00 keventd
3 root 15 0 0 0 0 SW 0.0 0.0 0:00 keventd
4 root 15 0 0 0 0 SW 0.0 0.0 0:00 keventd

dmesg: to find the latest status of devices

Any device error is reflected in this output; check the status of devices
such as the NIC interface, disks, and filesystems.
EXT3 FS 2.4-0.9.11, 3 Oct 2001 on sd(8,3), internal journal


EXT3-fs: mounted filesystem with ordered data mode.


kjournald starting. Commit interval 5 seconds
EXT3 FS 2.4-0.9.11, 3 Oct 2001 on sd(8,7), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
parport0: PC-style at 0x378 [PCSPP]
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
tg3.c:v0.97 (Mar 13, 2002)
eth0: Tigon3 [partno(BCM95703A30U) rev 1002 PHY(5703)] (PCIX:100MHz:64-bit)
10/100/1000BaseT Ethernet 00:10:18:0b:d2:75
eth0: Link is up at 100 Mbps, full duplex.
eth0: Flow control is off for TX and off for RX.
You have new mail in /var/spool/mail/root

Managing the cluster
clustat:
This shows the status of all members in the cluster; make sure all nodes
are active. The Last Transition column shows the most recent time the
service was restarted, and the Restarts column shows how many times the
service has restarted since the cluster has been running. A service should
be running with a restart count of 0.
clustat
[root@cluster1 root]# clustat
Cluster Status - BTSL-CLUSTER 15:01:53
Cluster Quorum Incarnation #3
Shared State: Shared Raw Device Driver v1.2

Member Status
------------------ ----------
cluster0 Active
cluster1 Active <-- You are here

Service Status Owner (Last) Last Transition Chk Restarts


-------------- -------- ---------------- --------------- --- --------
oracle started cluster1 01:49:15 Feb 14 120 0

Under ideal conditions the Restarts value should be 0 (zero).
If you find that any service restarted during the last 24 hours, check the
cluster log in /var/log/cluster.log for the root cause.
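The restart check can be automated by parsing the service line of clustat output. A sketch, assuming the layout shown in the transcript above (service name first, restart count in the last column); on a live node you would feed it the output of clustat instead of the sample file.

```shell
# Report any started service whose Restarts count (last column) is non-zero.
check_restarts() {
    awk '/started/ && $NF + 0 > 0 { print $1 ": " $NF " restarts"; found = 1 }
         END { if (!found) print "all services at 0 restarts" }' "$1"
}

# Demonstrate on the sample status line from the transcript above.
cat > /tmp/clustat.sample <<'EOF'
oracle started cluster1 01:49:15 Feb 14 120 0
EOF
check_restarts /tmp/clustat.sample
```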


If you find the status of any node inactive, then log in to the respective
server as root.
Run 'service clumanager status'; this command will tell you whether the
cluster service is running on the server or not. Check
/var/log/cluster.log for details.
• fsck utility

Caution: Before running the fsck utility, make sure the respective
partition is not mounted on any cluster node.
Running fsck on a mounted partition may cause loss of data.

When should you run the fsck utility?
1. Before starting the cluster manually.
Run fsck (on the storage partitions only) every time you start the
cluster manually, to avoid inconsistencies in the data on the storage.
2. If the following message flashes on the screen:
EXT3-fs warning: maximal mount count reached, running e2fsck is
recommended
Then stop the cluster service on both servers by running
"service clumanager stop" (first on the member where the service is not
running).
Make sure the storage partition has been unmounted on both servers by
running the 'mount' command:
[root@delhi-oam OAM]# mount
/dev/cciss/c0d0p8 on / type ext3 (rw)
none on /proc type proc (rw)
usbdevfs on /proc/bus/usb type usbdevfs (rw)
/dev/cciss/c0d0p1 on /boot type ext3 (rw)
/dev/cciss/c0d0p3 on /data type ext3 (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/cciss/c0d0p10 on /home type ext3 (rw)
none on /dev/shm type tmpfs (rw)
/dev/cciss/c0d0p7 on /tmp type ext3 (rw)
/dev/cciss/c0d0p5 on /usr type ext3 (rw)
/dev/cciss/c0d0p9 on /usr/local type ext3 (rw)
/dev/cciss/c0d0p6 on /var type ext3 (rw)
[root@delhi-oam OAM]#
Then run the fsck utility on the shared storage partition.

3. After completing a cold backup, run fsck manually.
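The caution above can be enforced with a small guard that refuses to run fsck while the partition is listed in the mount table. A sketch; the partition path is illustrative (take the real one from the annexure), and the check must pass on both members before fsck is run anywhere.

```shell
# Run fsck on a device only if it is absent from the mount table.
# An alternate mounts file can be passed as $2; defaults to /proc/mounts.
safe_fsck() {
    dev="$1"; mounts="${2:-/proc/mounts}"
    if grep -q "^$dev " "$mounts"; then
        echo "$dev is mounted; refusing to run fsck" >&2
        return 1
    fi
    if [ -b "$dev" ]; then
        fsck -y "$dev"
    else
        echo "$dev not present; nothing to do"
    fi
}

# Example (shared-storage partition name is illustrative):
safe_fsck /dev/cciss/c1d0p1
```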


• Start cluster service


service clumanager start
[root@cluster1 root]# service clumanager start
starting clumanager ok

• stop cluster service


service clumanager stop

[root@cluster1 root]# service clumanager stop


waiting to stop clumanager ok

• check cluster service


service clumanager status
[root@cluster1 root]# service clumanager status
clumembd (pid 8769) is running...
cluquorumd (pid 8767) is running...
clulockd (pid 8775) is running...
clusvcmgrd (pid 8790) is running...
[root@cluster1 root]#

• enable a service in the cluster

clusvcadm -e service_name
[root@cluster1 root]# clusvcadm -e oracle
trying to enable service oracle ……… success

• disable a cluster service

clusvcadm -d service_name
[root@cluster1 root]# clusvcadm -d oracle
trying to disable service oracle ……… success

• relocate a cluster service

clusvcadm -r service_name member_name
[root@cluster1 root]# clusvcadm -r oracle cluster1
member cluster1 trying to relocate service oracle ……… success
