
Debian Lenny HowTo

From ClusterLabs
This page will guide you through installing a Corosync + Pacemaker two-node cluster which is later extended and worked with. The aim is to provide you with a working example of such a cluster. Once you get up to speed using this HowTo you can dive into the more advanced configuration and documentation.

Contents
1 Introduction
2 Installation
 2.1 Backports.org and the Madkiss-repository
 2.2 How to enable the repository
 2.3 Install the packages
3 Initial Configuration
 3.1 Create authkey
 3.2 Edit configfile
 3.3 Enabling corosync
 3.4 Deal with firewall
4 Running corosync
 4.1 Check the status
5 Configure an IP resource
6 Resource operations
 6.1 Put a node in standby and back online again
  6.1.1 Put node1 in standby
  6.1.2 Put node1 online again
 6.2 Migrate the resource to the other node
 6.3 Stop the resource
7 Add another node

Introduction
In this example we will use the following names and IP addresses:

node1 - ip 10.0.0.11 - first node
node2 - ip 10.0.0.12 - second node
virt1 - ip 10.0.0.21 - virtual IP address

Disclaimer: We assume that you are already comfortable working with Debian GNU/Linux and know the security implications of working as root and so on. If you get stuck using this HowTo you might try your luck on the #linux-ha irc channel on freenode.net

Installation
First of all, please install two servers (node1 and node2) with Debian GNU/Linux 5.0 (alias Lenny) and set them both up the way you want them -- finish any non-cluster related changes before fiddling with Pacemaker.

Backports.org and the Madkiss-repository


As of Jul 8, 2010, packages for the whole Linux-HA cluster stack (corosync, openais, heartbeat, cluster-glue, cluster-agents, pacemaker) are available from the official backports repository for Debian GNU/Linux 5.0. They are derived from the official packages in the current "testing" branch of Debian GNU/Linux, currently codenamed Squeeze.

Because of this, the APT repository formerly known as the "Madkiss" repo has largely lost its original purpose; it no longer includes packages for the whole cluster stack. It will, however, continue to exist in order to provide Lenny packages for cases where updated packages have been uploaded to the Debian development branch codenamed "Sid" (also referred to as "Unstable"). Due to the Backports.org policy, packages that want to enter the Backports.org repo must be present in the same version in the testing branch (currently Squeeze). Thus, there might be situations where a more up-to-date version of the Linux-HA cluster stack exists in Unstable but has not yet migrated to testing and accordingly cannot be made available in the Backports.org repository. Packages for Lenny might then be available from the Madkiss repo.

So in order to use Pacemaker on Debian GNU/Linux 5.0 ("Lenny"), please add the Backports.org repository to your APT configuration according to the How-To on this site. This has to be done on all nodes in your cluster.

How to enable the repository


If you already have a backports stanza in your apt sources lists, you should be ok. Otherwise, create a new file /etc/apt/sources.list.d/pacemaker.list that contains:
# Only if you want the Madkiss repo, which may sometimes not include the full stack:
# deb http://people.debian.org/~madkiss/ha lenny main
# Usually you should be ok just using backports:
deb http://backports.debian.org/debian-backports lenny-backports main

If you use the Madkiss repo, you will want to add the Madkiss key to your package system:
apt-key adv --keyserver pgp.mit.edu --recv-key 1CFA3E8CD7145E30

If you omit this step you will get this error:


W: GPG error: http://people.debian.org lenny Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 1CFA3E8CD7145E30

Update the package list


aptitude update

Install the packages

Installing the pacemaker package will install Pacemaker together with Corosync. If you need OpenAIS later on, you can install it as a plugin to Corosync; OpenAIS is needed, for example, for DLM or cLVM, but that is beyond the scope of this HowTo.
aptitude install pacemaker
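Note that packages from backports are usually not selected automatically by APT. If the plain install command does not pull in the backports version, explicitly requesting the backports release should help (a hedged variant of the command above):

aptitude -t lenny-backports install pacemaker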

If you want to run pacemaker on top of Heartbeat 3 instead of Corosync, please use the following command:
aptitude install pacemaker heartbeat

Please note that Corosync will still be installed as a dependency; however, if you set up Heartbeat properly, Corosync can remain unused.
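If you want to double-check which versions were actually installed from backports, a quick package query will show them (purely optional):

dpkg -l pacemaker corosync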

Initial Configuration
Create authkey
To create an authkey for corosync communication between your two nodes do this on the first node:
node1~: sudo corosync-keygen

This creates a key in /etc/corosync/authkey. You need to copy this file to the second node and put it in the /etc/corosync directory with the right permissions. So on the first node:
node1~: scp /etc/corosync/authkey node2:

And on the second node:


node2~: sudo mv ~/authkey /etc/corosync/authkey
node2~: sudo chown root:root /etc/corosync/authkey
node2~: sudo chmod 400 /etc/corosync/authkey
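To double-check that the key arrived intact and with the right permissions, you can compare checksums and ownership on both nodes (just a quick sanity check):

node1~: sudo md5sum /etc/corosync/authkey
node2~: sudo md5sum /etc/corosync/authkey
node2~: ls -l /etc/corosync/authkey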

Edit configfile
Most of the options in the /etc/corosync/corosync.conf file are fine to start with. You must, however, make sure that the nodes can communicate with each other, so adjust this section:
interface {
        # The following values need to be set based on your environment
        ringnumber: 0
        bindnetaddr: 192.168.2.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
}

Change bindnetaddr to your local subnet: with the example addresses used above (10.0.0.11 for node1 and 10.0.0.12 for node2), set bindnetaddr to 10.0.0.0.
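For the example addresses used in this HowTo, the adjusted section would then look like this (only bindnetaddr changes; the multicast defaults are left as they are):

interface {
        ringnumber: 0
        bindnetaddr: 10.0.0.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
}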

Enabling corosync
Corosync is disabled by default, and starting it with the init script will not work until you enable it. To enable Corosync, replace START=no with START=yes in /etc/default/corosync.
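If you prefer to do that from the command line, a one-liner like the following should work (a sketch; verify the file afterwards):

sudo sed -i 's/^START=no/START=yes/' /etc/default/corosync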

Deal with firewall


Make sure you have opened the multicast port for UDP traffic in your firewall. For example, when using shorewall, add this rule to your /etc/shorewall/rules file on both nodes:
# Multicast for pacemaker
ACCEPT net fw udp 5405
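If you use plain iptables instead of shorewall, an equivalent rule might look like this (a sketch; adapt it to your existing ruleset):

iptables -A INPUT -p udp --dport 5405 -j ACCEPT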

Running corosync
Now that you have configured both nodes you can start the cluster on both sides:
node1~: sudo /etc/init.d/corosync start
Starting corosync daemon: corosync.
node2~: sudo /etc/init.d/corosync start
Starting corosync daemon: corosync.

Check the status


To check the corosync status you can look at /var/log/daemon.log. If you take a look at the process list using 'ps auxf' you should see something like this:
root  29980  0.0  0.8  44304  3808 ?  Ssl  20:55  0:00 /usr/sbin/corosync
root  29986  0.0  2.4  10812 10812 ?  SLs  20:55  0:00  \_ /usr/lib/heartbeat/stonithd
102   29987  0.0  0.8  13012  3804 ?  S    20:55  0:00  \_ /usr/lib/heartbeat/cib
root  29988  0.0  0.4   5444  1800 ?  S    20:55  0:00  \_ /usr/lib/heartbeat/lrmd
102   29989  0.0  0.5  12364  2368 ?  S    20:55  0:00  \_ /usr/lib/heartbeat/attrd
102   29990  0.0  0.5   8604  2304 ?  S    20:55  0:00  \_ /usr/lib/heartbeat/pengine
102   29991  0.0  0.6  12648  3080 ?  S    20:55  0:00  \_ /usr/lib/heartbeat/crmd

You can also run the crm_mon tool to get information about the current status of the cluster. We use -V for extra information.
node1~: sudo crm_mon --one-shot -V
crm_mon[7363]: 2009/07/26_22:05:40 ERROR: unpack_resources: No STONITH resources have been defined
crm_mon[7363]: 2009/07/26_22:05:40 ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
crm_mon[7363]: 2009/07/26_22:05:40 ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
============
Last updated: Fri Nov 6 21:03:51 2009
Stack: openais
Current DC: node1 - partition with quorum
Version: 1.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ node1 node2 ]

As you can see, the setup is complaining about STONITH, but that is because we have not configured that part of the cluster yet.

Configure an IP resource
We are now going to configure the Cluster Information Base (CIB) using the Cluster Resource Manager (CRM) command line tool. First we start the crm command line tool:

node1~: sudo crm
crm(live)#

Then we create a copy of the current configuration to edit in; we will commit this copy when we are done editing:
crm(live)# cib new config20090726
INFO: config20090726 shadow CIB created
crm(config20090726)#

Then we go into configuration mode and we show the current config:


crm(config20090726)# configure
crm(config20090726)configure# show
node node1
node node2
property $id="cib-bootstrap-options" \
        dc-version="1.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2"

We now turn off STONITH since we don't need it in this example configuration:
crm(config20090726)configure# property stonith-enabled=false

Now we add our failover IP to the configuration:


crm(config20090726)configure# primitive failover-ip ocf:heartbeat:IPaddr params ip=10.0.0.21 op monitor interval=10s

And lastly, we check that our configuration is valid, commit it to the cluster, and quit the configuration tool:
crm(config20090726)configure# verify
crm(config20090726)configure# end
There are changes pending. Do you want to commit them? y
crm(config20090726)# cib use live
crm(live)# cib commit config20090726
INFO: commited 'config20090726' shadow CIB to the cluster
crm(live)# quit
bye

When we now do a one-shot crm_mon we get:


node1~: sudo crm_mon --one-shot
============
Last updated: Fri Nov 6 21:05:51 2009
Stack: openais
Current DC: node1 - partition with quorum
Version: 1.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1 node2 ]
failover-ip     (ocf::heartbeat:IPaddr):        Started node1
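You can also check on node1 that the virtual address is really up; the exact interface naming depends on your system, so this is just a quick sanity check:

node1~: ip addr show | grep 10.0.0.21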

Resource operations
There are quite a few things you can do with a resource; here are some examples:

Put a node in standby and back online again

Put node1 in standby


When you want to do maintenance on node1 you can put that node in standby mode. That works like this:
node1~: sudo crm
crm(live)# node
crm(live)node# standby
crm(live)node# quit
bye

You can see with crm_mon that the resource actually failed over to the other node:
node1~: sudo crm_mon --one-shot
============
Last updated: Fri Nov 6 21:04:31 2009
Stack: openais
Current DC: node1 - partition with quorum
Version: 1.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Node node1: standby
Online: [ node2 ]
failover-ip     (ocf::heartbeat:IPaddr):        Started node2

Put node1 online again


When the maintenance is over you can bring node1 back online like this:
node1~: sudo crm
crm(live)# node
crm(live)node# online
crm(live)node# bye
bye

Now you can see that the resource has failed back again to node1:
node1~: sudo crm_mon --one-shot
============
Last updated: Fri Nov 6 21:08:22 2009
Stack: openais
Current DC: node1 - partition with quorum
Version: 1.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1 node2 ]
failover-ip     (ocf::heartbeat:IPaddr):        Started node1

Migrate the resource to the other node


You might want the resource to run on another node than the one it is currently running on; this is done with the migrate command. We now tell our cluster to run the IP resource on node2 instead of node1:

node1~: sudo crm
crm(live)# resource
crm(live)resource# list
 failover-ip    (ocf::heartbeat:IPaddr) Started
crm(live)resource# migrate failover-ip node2
crm(live)resource# bye
bye

You can now see that it is running on the other node using crm_mon:
node1~: sudo crm_mon --one-shot

============
Last updated: Fri Nov 6 21:09:45 2009
Stack: openais
Current DC: node1 - partition with quorum
Version: 1.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1 node2 ]
failover-ip     (ocf::heartbeat:IPaddr):        Started node2
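Keep in mind that migrate works by adding a location constraint that prefers node2. Once you no longer need that preference, you can remove it again with unmigrate (a sketch of the corresponding crm session):

node1~: sudo crm
crm(live)# resource
crm(live)resource# unmigrate failover-ip
crm(live)resource# bye
bye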

Stop the resource


You might want to stop your resource or, in other words, make your resource unavailable. That can be done like this:
node1~: sudo crm
crm(live)# resource
crm(live)resource# stop failover-ip
crm(live)resource# bye
bye

Using crm_mon that will look like this:


node1~: sudo crm_mon --one-shot

============
Last updated: Fri Nov 6 21:11:56 2009
Stack: openais
Current DC: node1 - partition with quorum
Version: 1.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1 node2 ]

Note that no running resource is listed here, but the header still shows one configured resource.
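To make the resource available again, simply start it; this is the counterpart of the stop command above (a short sketch):

node1~: sudo crm
crm(live)# resource
crm(live)resource# start failover-ip
crm(live)resource# bye
bye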

Add another node


Now we have a two-node cluster, but you might want to extend your setup by adding a node. We will call this node:

node3 - ip 10.0.0.13 - third node

First install the extra node as described above under 'Installation', then add it to the cluster by copying over the authkey and the corosync configuration, and adjust the firewall if needed.
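The exact steps depend on your setup, but assuming you simply reuse node1's corosync.conf unchanged, the whole procedure boils down to something like this (a sketch):

node1~: scp /etc/corosync/authkey /etc/corosync/corosync.conf node3:
node3~: sudo mv ~/authkey ~/corosync.conf /etc/corosync/
node3~: sudo chown root:root /etc/corosync/authkey
node3~: sudo chmod 400 /etc/corosync/authkey
node3~: sudo sed -i 's/^START=no/START=yes/' /etc/default/corosync
node3~: sudo /etc/init.d/corosync start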
Then check if it all worked:

node1~: crm_mon --one-shot
============
Last updated: Fri Nov 6 21:18:14 2009
Stack: openais
Current DC: node1 - partition with quorum
Version: 1.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1 node2 node3 ]
failover-ip     (ocf::heartbeat:IPaddr):        Started node1
