Download as pdf or txt
Download as pdf or txt
You are on page 1of 26

TheLinux-HAUsersGuide

FlorianHaas<florian.haas@linbit.com> TheLinux-HAProject<linux-ha@lists.linux-ha.org>

TheLinux-HAUsersGuide
by Florian Haas Copyright 2009, 2010 LINBIT HA-Solutions GmbH, The Linux-HA Project

License information
The text of and illustrations in this document are licensed under a Creative Commons AttributionShare Alike 3.0 Unported license ("CC-BY-SA"). A summary of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/. The full license text is available at http://creativecommons.org/licenses/by-sa/3.0/legalcode. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.

Preface ....................................................................................................................... iv I. Introduction to Heartbeat .......................................................................................... 1 1. Heartbeat as a Cluster Messaging Layer ............................................................. 2 2. Components .................................................................................................... 3 2.1. Communication module .......................................................................... 3 2.2. Cluster Consensus Membership ............................................................... 3 2.3. Cluster Plumbing Library ........................................................................ 3 2.4. IPC Library ........................................................................................... 4 2.5. Non-blocking logging daemon ................................................................ 4 II. Installing Heartbeat .................................................................................................. 5 3. Building and installing from source ..................................................................... 6 3.1. Building and installing Cluster Glue from source ......................................... 6 3.1.1. Cluster Glue build prerequisites .................................................... 6 3.1.2. Downloading Cluster Glue sources ................................................ 6 3.1.3. Building Cluster Glue ................................................................... 7 3.1.4. Building Packages ....................................................................... 8 3.2. Building and installing Heartbeat from source ............................................ 8 3.2.1. Heartbeat build prerequisites ....................................................... 8 3.2.2. Downloading Heartbeat sources ................................................... 8 3.2.3. Building Heartbeat ...................................................................... 9 3.2.4. Building Packages ..................................................................... 10 4. Installing pre-built packages ............................................................................ 11 4.1. Debian and Ubuntu .............................................................................. 11 4.2. Fedora, RHEL and CentOS .................................................................... 11 4.3. OpenSUSE and SLES ............................................................................ 11 III. Administrative Tasks .............................................................................................. 12 5. Creating an initial Heartbeat configuration ........................................................ 13 5.1. The ha.cf file ................................................................................... 13 5.2. The authkeys file ............................................................................. 13 5.3. Propagating the cluster configuration to cluster nodes ............................. 14 5.4. Starting Heartbeat services .................................................................. 14 5.5. Where to go from here ........................................................................ 15 6. Upgrading from previous Heartbeat versions ..................................................... 16 6.1. Upgrading from Heartbeat 2.1 clusters not using the CRM ........................ 16 6.1.1. Stopping Heartbeat services ...................................................... 16 6.1.2. Upgrade software ..................................................................... 16 6.1.3. Enabling the Heartbeat cluster to use Pacemaker .......................... 16 6.1.4. Restarting Heartbeat ................................................................. 17 6.2. Upgrading from CRM-enabled Heartbeat 2.1 clusters .............................. 17 6.2.1. Placing the cluster in unmanaged mode ....................................... 17 6.2.2. Backing up the CIB ................................................................... 18 6.2.3. Stopping Heartbeat services ...................................................... 18 6.2.4. Wiping files related to the CRM .................................................. 18 6.2.5. Restoring the CIB ..................................................................... 18 6.2.6. Upgrading software .................................................................. 19 6.2.7. Restarting Heartbeat services .................................................... 19 6.2.8. Returning the cluster to managed mode ...................................... 19 6.2.9. Upgrading the CIB schema ......................................................... 19 IV. Getting Help and Helping Out ................................................................................. 20 7. Reporting problems ........................................................................................ 21 7.1. Mailing Lists ....................................................................................... 21 7.1.1. Linux-HA ................................................................................. 21 7.1.2. Linux-HA-dev .......................................................................... 21 7.2. Bug Tracking System ........................................................................... 21 7.3. IRC .................................................................................................... 21 8. Submitting patches ........................................................................................ 22

iii

Preface
This guide aims to be the definitive reference guide for users of the Heartbeat cluster messaging layer, the Linux-HA cluster resource agents, and other clustering building blocks maintained by the Linux-HA project. We welcome and actively encourage any and all feedback about this document. Please use the linux-ha@lists.linux-ha.org [mailto:linux-ha@lists.linux-ha.org] mailing list for comments, corrections, and suggestions for improvement.

iv

PartI.IntroductiontoHeartbeat

Chapter1.HeartbeatasaCluster MessagingLayer
Heartbeat is a daemon that provides cluster infrastructure (communication and membership) services to its clients. This allows clients to know about the presence (or disappearance!) of peer processes on other machines and to easily exchange messages with them. In order to be useful to users, the Heartbeat daemon needs to be combined with a cluster resource manager (CRM) which has the task of starting and stopping the services (IP addresses, web servers, etc) that cluster will make highly available. The canonical cluster resource manager typically associated with Heartbeat is Pacemaker [http://www.clusterlabs.org], a highly scalable and feature-rich implementation that supports both Heartbeat and, alternatively, the Corosync [http://www.corosync.org] cluster messaging layer.

Note
Up until Heartbeat release 2.1.3, Pacemaker was developed jointly with Heartbeat as part of the Linux-HA umbrella project. After this release, the Pacemaker project was spun off as a separate project and continues as such, while maintaining full support for Heartbeat cluster messaging.

Chapter2.Components
The Heartbeat messaging layer comprises several interlocking but distinct components.

2.1.Communicationmodule
The Heartbeat communication module provides strongly authenticated, locally-ordered multicast messaging over basically any media, IP-based or not. Heartbeat supports cluster communications over the following network link types: unicast UDP over IPv4; broadcast UDP over IPv4; multicast UDP over IPv4; serial link communications.

Warning
Please consider some important caveats with regard to serial link connectivity (see the ha.cf man page, man ha.cf). As a general rule: when in doubt, serial links should be avoided. Heartbeat can detect node failure reliably in less than a half-second. It will register with the system watchdog timer if configured to do so. The heartbeat layer has an API which provides the following classes of services: intra-cluster communication - sending and receiving packets to cluster nodes configuration queries connectivity information (who can the current node hear packets from) - both for queries and state change notifications basic group membership services

2.2.ClusterConsensusMembership
CCM provides strongly connected consensus cluster membership services. It ensures that every node in a computed membership can talk to every other node in this same membership. CCM Implements both an OCF draft membership API, and the SAF AIS membership API. Typically it computes membership in sub-second time.

2.3.ClusterPlumbingLibrary
The Cluster plumbing library is a collection of very useful functions which provide a variety of services used by many of our main components. A few of the major objects provided by this library include: compression API (with underlying compression plugins) Non-blocking logging API memory management oriented to continuously running services

Components Hierarchical name-value pair messaging facility promoting portability and version upgrade compatibility (also provides optional message compression facilities) Signal unification - allowing signals to appear as mainloop events Core dump management utilities - promoting capture of core dumps in a uniform way, and under all circumstances timers (like glib mainloop timers - but they work even when the time of day clock jumps) child process management - death of children causes invocation of process object, with configurable death-of-child messages Triggers (arbitrary events triggered by software) Realtime management - setting and unsetting high priorities, and locked into memory attributes of processes. 64-bit HZ-granularity time manipulation (longclock_t) User id management for security purposes, for processes which need some root privileges. Mainloop integration for IPC, plain file descriptors, signals, etc. This means that all these different event sources are managed and dispatched consistently.

2.4.IPCLibrary
All interprocess communication is performed using a very general IPC library which provides nonblocking access to IPC using a flexible queueing strategy, and includes integrated flow control. This IPC API does not require sockets, but the currently available implementations use UNIX (Local) Domain sockets. This API also includes built-in authentication and authorization of peer processes, and is portable to most POSIX-like OSes. Although use of Glib mainloop with these APIs is not required, Heartbeat provides simple and convenient integration with mainloop.

2.5.Non-blockingloggingdaemon
logd is Heartbeats logging daemon, capable of logging to a syslog daemon, to files, or both. logd never blocks, instead, it discards messages lagging too far behind. Once it is capable of outputting messages again, logd prints a lost message count. Queue sizes are controllable both overall, and on a per-application basis.

PartII.InstallingHeartbeat

Chapter3.Buildingandinstallingfrom source
Building and installing the Heartbeat cluster messaging layer from source amounts to building the following packages: heartbeat itself; the cluster-glue package containing the Heartbeat local resource manager (LRM) and STONITH plugins. Since heartbeat has a build dependency on cluster-glue, it is necessary to build and install cluster-glue first, before one proceeds with building and installing heartbeat.

3.1.BuildingandinstallingClusterGluefrom source
3.1.1.ClusterGluebuildprerequisites
Building Cluster Glue requires the presence of the following tools and libraries on the build system: A C compiler (typically gcc) and associated C development libraries; the flex scanner generator and the bison parser compiler; net-snmp development headers, to enable SNMP related functionality; OpenIPMI development headers, to enable IPMI related functionality; Python (just the language interpreter, not library headers).

Note
This list applies to the default software configuration. If you configure the source with non-standard options, other dependencies may apply.

3.1.2.DownloadingClusterGluesources
Several options are available for retrieving the Cluster Glue source code, for building locally on a target system.

Downloadingareleasetarball
Downloading a released version of Heartbeat as a compressed tarball is equivalent to fetching a tagged snapshot from the Mercurial source code repository. Release tags follow the format glue-x.y.z, where x.y.z is the released version of Cluster Glue you wish to download. If, for example, one wants to download the 1.0.1 release, the correct sequence of commands would be: # wget http://hg.linux-ha.org/glue/archive/glue-1.0.1.tar.bz2 # tar -vxjf glue-1.0.1.tar.bz2

Building and installing from source

DownloadingthelatestMercurialsnapshot
The latest development code is always available in the Mercurial repository as the tip revision. To download a tarball auto-generated from the tip, use this sequence of commands: # wget http://hg.linux-ha.org/glue/archive/tip.tar.bz2 # tar -vxjf tip.tar.bz2

CheckingoutsourcesfromMercurial
This is the method you would apply if you have the Mercurial utilities locally installed. Checking out the sources amounts to cloning the repository: $ hg clone http://hg.linux-ha.org/glue cluster-glue requesting all changes adding changesets adding manifests adding file changes added 12491 changesets with 34830 changes to 2632 files updating working directory 356 files updated, 0 files merged, 0 files removed, 0 files unresolved

3.1.3.BuildingClusterGlue
Building Cluster Glue is an automated process making extensive use of GNU Autotools. When building and installing on the same machine, it usually amounts to just the following sequence of commands: $ $ $ $ ./autogen.sh ./configure make sudo make install

Note
The autogen.sh script is a convenience wrapper around automake, autoheader, autoconf, and libtool. A number of configuration options are supported, and you may tweak some of them to optimize Heartbeat for your system. To retrieve a list of configuration options, you may invoke configure with the --help option. A customized build may thus comprise these steps: $ $ $ $ $ ./autogen.sh ./configure --help ./configure configuration-options make sudo make install

Some typical configuration options that you may wish to set are --prefix, --sysconfdir, and --localstatedir, as shown in this example: $ $ $ $ $ ./autogen.sh ./configure --help ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var make sudo make install

Building and installing from source

3.1.4.BuildingPackages
RPMpackages
RPM spec files are provided in the Cluster Glue source tree both for SuSE and Red Hat based distributions: cluster-glue-suse.spec should be used for OpenSUSE and SLES installations. cluster-glue-fedora.spec is for Fedora, Red Hat Enterprise Linux, and CentOS.

Debianpackages
Debian packaging for Cluster glue is maintained in a Mercurial repository on alioth.debian.org [http://hg.debian.org/hg/debian-ha/cluster-glue/]. Thus, rather than cloning or downloading from the upstream Mercurial repository, use the one hosted on alioth. Once you have checked out or unpacked the source tree from the alioth repository, simply invoke dpkg-buildpackage from the top of the source tree like you would with any other Debian package.

3.2.BuildingandinstallingHeartbeatfrom source
3.2.1.Heartbeatbuildprerequisites
Building Heartbeat requires the presence of the following tools and libraries on the build system: A C compiler (typically gcc) and associated C development libraries; the flex scanner generator and the bison parser compiler; net-snmp development headers, to enable SNMP related functionality; OpenIPMI development headers, to enable IPMI related functionality; Python (just the language interpreter, not library headers) the cluster-glue development headers. See Section3.1, Building and installing Cluster Glue from source [6] for details on how to build these from source.

Note
This list applies to the default software configuration. If you configure the source with non-standard options, other dependencies may apply.

3.2.2.DownloadingHeartbeatsources
Downloadingareleasetarball
Downloading a released version of Heartbeat as a compressed tarball is equivalent to fetching a tagged snapshot from the Mercurial source code repository. Release tags follow the format STABLE-x.y.z, where x.y.z is the released version of Heartbeat you wish to download. If, for example, one wants to download the 3.0.4 release, the correct sequence of commands would be:

Building and installing from source

# wget http://hg.linux-ha.org/dev/archive/STABLE-3.0.4.tar.bz2 # tar -vxjf STABLE-3.0.4.tar.bz2

DownloadingthelatestMercurialsnapshot
The latest development code is always available in the Mercurial repository as the tip revision. To download a tarball auto-generated from the tip, use this sequence of commands: # wget http://hg.linux-ha.org/dev/archive/tip.tar.bz2 # tar -vxjf tip.tar.bz2

CheckingoutsourcesfromMercurial
This is the method you would apply if you have the Mercurial utilities locally installed. Checking out the sources amounts to cloning the repository: $ hg clone http://hg.linux-ha.org/dev heartbeat-dev requesting all changes adding changesets adding manifests adding file changes added 12491 changesets with 34830 changes to 2632 files updating working directory 356 files updated, 0 files merged, 0 files removed, 0 files unresolved

3.2.3.BuildingHeartbeat
Building Heartbeat is an automated process making extensive use of GNU Autotools. When building and installing on the same machine, it usually amounts to just the following sequence of commands: $ $ $ $ ./bootstrap ./configure make sudo make install

Note
The bootstrap script is a convenience wrapper around automake, autoheader, autoconf, and libtool. ConfigureMe is a convenience wrapper for the autoconf-generated configure script. A number of configuration options are supported, and you may tweak some of them to optimize Heartbeat for your system. To retrieve a list of configuration options, you may invoke configure with the --help option. A customized build may thus comprise these steps: $ $ $ $ $ ./bootstrap ./configure --help ./configure <configuration-options> make sudo make install

Some typical configuration options that you may wish to set are --prefix, --sysconfdir, and --localstatedir, as shown in this example: $ ./bootstrap $ ./configure --help $ ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var

Building and installing from source

$ make $ sudo make install

3.2.4.BuildingPackages
RPMpackages
RPM spec files are provided in the Heartbeat source tree both for SuSE and Red Hat based distributions: heartbeat-suse.spec should be used for OpenSUSE and SLES installations. heartbeatfedora.spec is for Fedora, Red Hat Enterprise Linux, and CentOS.

Debianpackages
Debian packaging for Heartbeat is maintained in a Mercurial repository on alioth.debian.org [http://hg.debian.org/hg/debian-ha/heartbeat/]. Thus, rather than cloning or downloading from the upstream Mercurial repository, use the one hosted on alioth. Once you have checked out or unpacked the source tree from the alioth repository, simply invoke dpkg-buildpackage from the top of the source tree like you would with any other Debian package.

10

Chapter4.Installingpre-builtpackages
Cluster Glue and Heartbeat are available as pre-built binary packages on a number of platforms, including Debian (fully included in squeeze and up, backports packages are available for lenny); Ubuntu (since lucid); Fedora (since release 12); OpenSUSE (since release 11). Commercially supported enterprise packages for Red Hat Enterprise Linux and SUSE Linux Enterprise server are available from LINBIT [http://www.linbit.com]. This section describes the steps necessary to install binary packages on these platforms.

4.1.DebianandUbuntu
Installing the cluster-glue and heartbeat packages on Debian and Ubuntu is a straightforward process. Assuming you have the correct package repositories configured for APT, install the two packages with the following commands: aptitude install heartbeat cluster-glue Since you most likely will also want to install Pacemaker (beyond the scope of this manual), do so by issuing the following commands as well: aptitude install cluster-agents pacemaker

4.2.Fedora,RHELandCentOS
On Red Hat platforms, you install the cluster-glue and heartbeat packages with the YUM package manager. Assuming you have the correct package repositories configured in +/etc/ yum.repos.d/, install the two packages with the following commands: yum install heartbeat cluster-glue Since you most likely will also want to install Pacemaker (beyond the scope of this manual), do so by issuing the following commands as well: yum install resource-agents pacemaker

4.3.OpenSUSEandSLES
On SUSE platforms, you install the cluster-glue and heartbeat packages with the Zypper package manager. Assuming you have the correct package repositories configured, install the two packages with the following commands: zypper install heartbeat cluster-glue Since you most likely will also want to install Pacemaker (beyond the scope of this manual), do so by issuing the following commands as well: zypper install resource-agents pacemaker

11

PartIII.AdministrativeTasks

Chapter5.Creatinganinitial Heartbeatconfiguration
For any Heartbeat cluster, the following configuration files must be available: /etc/ha.d/ha.cf the global cluster configuration file. /etc/ha.d/authkeys a file containing keys for mutual node authentication.

5.1.Theha.cffile
The following example is a small and simple ha.cf file: autojoin none mcast bond0 239.0.0.43 694 1 0 bcast eth2 warntime 5 deadtime 15 initdead 60 keepalive 2 node alice node bob pacemaker respawn Setting autojoin to none disables cluster node auto-discovery and requires that cluster nodes be listed explicitly, using the node options. This speeds up cluster start-up in clusters with a fixed small number of nodes. This example assumes that bond0 is the clusters interface to the shared network, and that eth2 is the interface dedicated for DRBD replication between both nodes. Thus, bond0 can be used for Multicast heartbeat, whereas on eth2 broadcast is acceptable as eth2 is not a shared network. The next options configure node failure detection. They set the time after which Heartbeat issues a warning that a no longer available peer node may be dead (warntime), the time after which Heartbeat considers a node confirmed dead (deadtime), and the maximum time it waits for other nodes to check in at cluster startup (initdead). keepalive sets the interval at which Heartbeat keep-alive packets are sent. All these options are given in seconds. The node option identifies cluster members. The option values listed here must match the exact host names of cluster nodes as given by uname -n. pacemaker respawn enables the Pacemaker cluster manager, and ensures that Pacemaker is automatically restarted in case of a failure.

Note
Prior to Heartbeat release 3.0.4, the pacemaker keyword was named crm. Newer versions still retain the old name as a compatibility alias, but pacemaker is the preferred syntax.

5.2.Theauthkeysfile
/etc/ha.d/authkeys contains pre-shared secrets used for mutual cluster node authentication. It should only be readable by root and follows this format: auth <num>

13

Creating an initial Heartbeat configuration <num> <algorithm> <secret> num is a simple key index, starting with 1. Usually, you will only have one key in your authkeys file. algorithm is the signature algorithm being used. You may use either md5 or sha1; the use of crc (a simple cyclic redundancy check, not secure) is not recommended. secret is the actual authentication key. You may create an authkeys file, using a generated secret, with the following shell hack: ( echo -ne "auth 1\n1 sha1 "; \ dd if=/dev/urandom bs=512 count=1 | openssl md5 ) \ > /etc/ha.d/authkeys chmod 0600 /etc/ha.d/authkeys

5.3.Propagatingtheclusterconfigurationto clusternodes
In order to propagate the contents of the ha.cf and authkeys configuration files, you may use the ha_propagate command, which you would invoke using either /usr/lib/heartbeat/ha_propagate or /usr/lib64/heartbeat/ha_propagate This utility will copy the configuration files over to any node listed in /etc/ha.d/ha.cf using scp. It will afterwards also connect to the nodes using ssh and issue chkconfig heartbeat on in order to enable Heartbeat services on system startup.

5.4.StartingHeartbeatservices
The Heartbeat services are started just as you would any other system service on your machine. Depending on your system platform, you might be using one of the following commands: /etc/init.d/heartbeat start service heartbeat start rcheartbeat start Through the pacemaker entry in your ha.cf [13], Heartbeat will now start the Pacemaker daemon (named crmd for historical reasons) along with the rest of its services. After a few seconds, you should be able to detect Heartbeat processes in your process table: # ps -AHfww | grep heartbeat root 2772 1639 0 14:27 root 4175 1 0 Nov08 root 4224 4175 0 Nov08 root 4227 4175 0 Nov08 root 4228 4175 0 Nov08 root 4229 4175 0 Nov08 root 4230 4175 0 Nov08 102 4233 4175 0 Nov08 102 4234 4175 0 Nov08

pts/0 ? ? ? ? ? ? ? ?

00:00:00 00:37:57 00:01:13 00:01:28 00:01:29 00:01:35 00:01:32 00:03:37 00:15:02

grep heartbeat heartbeat: master control proc heartbeat: FIFO reader heartbeat: write: bcast eth2 heartbeat: read: bcast eth2 heartbeat: write: mcast bond heartbeat: read: mcast bond0 /usr/lib/heartbeat/ccm /usr/lib/heartbeat/cib

14

Creating an initial Heartbeat configuration root root 102 102 102 4235 4236 4237 4238 5724 4175 4175 4175 4175 4238 0 0 0 0 0 Nov08 Nov08 Nov08 Nov08 Nov08 ? ? ? ? ? 00:17:14 00:02:48 00:00:54 00:08:32 00:04:47

/usr/lib/heartbeat/lrmd -r /usr/lib/heartbeat/stonithd /usr/lib/heartbeat/attrd /usr/lib/heartbeat/crmd /usr/lib/heartbeat/pengine

Finally, you should also be able to confirm that the cluster is working, via the Pacemaker crm_mon command:

# crm_mon -1 ============ Last updated: Mon Dec 13 14:29:36 2010 Stack: Heartbeat Current DC: alice (083146b9-6e26-4ac8-a705-317095d0ba57) - partition with quorum Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b 2 Nodes configured, unknown expected votes 24 Resources configured. ============ Online: [ alice bob ]

5.5.Wheretogofromhere
Now that you have a working Heartbeat configuration, you will want to continue with configuring Pacemaker and adding cluster resources. The following documents are available for your perusal: Clusters From Scratch [http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/ Clusters_from_Scratch/] is an excellent guide for configuring Pacemaker clusters. The document primarily covers Pacemaker on Corosync, but starting with its chapter "Using Pacemaker Tools" applies identically to Heartbeat clusters. The DRBD Users Guide [http://www.drbd.org/users-guide/] has a chapter dedicated to integrating DRBD with Pacemaker clusters. Several Technical Guides covering Heartbeat/Pacemaker clusters for a variety of applications are available from the LINBIT web site [http://www.linbit.com/en/education/tech-guides].

15

Chapter6.Upgradingfromprevious Heartbeatversions
6.1.UpgradingfromHeartbeat2.1clustersnot usingtheCRM
For Heartbeat 2.1 clusters not using the CRM (i.e., clusters configured with the haresources file, an upgrade to 3.0 involves converting the current configuration to one that is suitable for Pacemaker.

Note
This upgrade procedure does incur application down time. However, when the upgrade is properly planned, tested, and executed, this down time amounts to minutes, possibly even seconds (depending on configuration).

6.1.1.StoppingHeartbeatservices
You should commence the upgrade procedure on your current standby node, that is, the cluster node currently not running any resources. If your cluster is using an active-active configuration (both nodes running resources), select one and issue the following command to transfer all resources to the peer node: # hb_standby Then, and on that node only, stop Heartbeat services: # /etc/init.d/heartbeat stop

6.1.2.Upgradesoftware
While upgrading, it is important to recall that the monolithic Heartbeat 2.1 tree has been split up into modular parts. Thus you will replace Heartbeat with three individual pieces of software: Cluster Glue, Pacemaker, and Heartbeat 3 which comprises just the cluster messaging layer. Upgrading from source: In the unpacked archive that you installed Heartbeat 2.1 from, run make uninstall. Then, install Cluster Glue and Heartbeat. Upgrading using locally built packages: When installing packages manually, uninstall the heartbeat package first. Then install cluster-glue, the version 3 heartbeat package, resource-agents, and pacemaker. Upgrading using a package repository: When upgrading using an APT, YUM, or Zypper repository, you should just be able to run the install command for heartbeat version 3 and pacemaker, and the dependencies will be resolved automatically. Do not restart Heartbeat services at this point.

6.1.3.EnablingtheHeartbeatclustertousePacemaker
The cluster messaging layer must now be instructed to start Pacemaker on cluster start-up. To do so, add

16

Upgrading from previous Heartbeat versions crm respawn to your ha.cf configuration file.

Important
At this point, you should also check your ha.cf file against the ha.cf(5) manual page, and remove any deprecated options. When your ha.cf modifications are complete, copy the file to the peer node.

6.1.4.RestartingHeartbeat
Your cluster is now ready to be restarted in Pacemaker-enabled mode. To do so: 1. Run /etc/init.d/heartbeat stop on your still-active node. This will shutdown your cluster resources. 2. Run /etc/init.d/heartbeat start on your standby node (the one where you created your CIB). This will start the local Heartbeat instance and Pacemaker, and wait for other cluster nodes to check in. 3. Run /etc/init.d/heartbeat start on your the other node. This will start the local Heartbeat instance and Pacemaker, fetch the CIB automatically, and start applications.

6.2.UpgradingfromCRM-enabledHeartbeat 2.1clusters
This section outlines the steps necessary to upgrade a Heartbeat 2.1 cluster with the built-in CRM enabled, to Heartbeat 3.0 with Pacemaker.

Note
When properly planned and executed, the upgrade procedure can be completed in a manner of minutes, with no application down time. It is strongly recommended to read and understand the steps outlined in this section before attempting a production cluster uqpgrade. All commands stated in this section must be run as root. Do not perform the individual steps in parallel on all of your cluster nodes. Instead, complete the procedure on each node before continuing with the next.

6.2.1.Placingtheclusterinunmanagedmode
With this step, the cluster temporarily relinquishes control of its resources. This means that the cluster no longer monitors its nodes or resources for the duration of the upgrade, and will not rectify and application or node failures during this time. Currently running resources, however, will continue to run. # crm_attribute -t crm_config -n is_managed_default -v false In most configurations, individual resources do not set the is_managed attribute individually, and hence the cluster-wide attribute is_managed_default applies to all of them. If in your specific configuration you do have resources that have this attribute set, you should remove it to make sure the default applies: # crm_resource -t primitive -r <resource-id> -p is_managed -D

17

Upgrading from previous Heartbeat versions

6.2.2.BackinguptheCIB
At this point, is is important to save a copy of the Cluster Information Base (CIB). The CIB to save is stored in a file named cib.xml, normally located in /var/lib/heartbeat/crm. # cp /var/lib/heartbeat/crm/cib.xml ~ You need to perform this step on only one node currently connected to the cluster. Do not delete this file, it will be restored later.

6.2.3.StoppingHeartbeatservices
You may now stop Heartbeat with /etc/init.d/heartbeat stop or the preferred command to stop a system service on your distribution (service heartbeat stop, rcheartbeat stop, etc.). In case you are running a legacy version of Heartbeat affected by a shutdown bug, then graceful crmd shutdown will not work properly in unmanaged mode. In this case, after you have initiated a graceful service shutdown with the above command, kill the crmd process manually: use ps -AHfww to retrieve the process ID of crmd; kill crmd with a TERM signal.

6.2.4.WipingfilesrelatedtotheCRM
Warning
Before proceeding with this section, verify that you have created a backup copy of the CIB on one of your cluster nodes, as outlined in Section6.2.2, Backing up the CIB [18]. You should now wipe local CRM-related files from your node. To do so, remove all files from the directory where the CRM stores CIB information, commonly /var/lib/heartbeat/crm. # rm /var/lib/heartbeat/crm/*

6.2.5.RestoringtheCIB
Note
You should only proceed with this step if Heartbeat is still stopped on all cluster nodes, and all cluster nodes have had their CIB contents wiped. If you still have remaining nodes that have a residual CIB configuration, proceed as outlined in Section6.2.4, Wiping files related to the CRM [18]. Restoring the CIB means copying the CIB backup described in Section 6.2.2, Backing up the CIB [18] to the /var/lib/heartbeat/crm directory. # cp ~/cib.xml /var/lib/heartbeat/crm/cib.xml # chown hacluster:haclient /var/lib/heartbeat/crm/cib.xml # chmod 0600 /var/lib/heartbeat/crm/cib.xml You must perform this step on one node only, namely the first node on which you are about to upgrade the cluster software. On all other nodes, the /var/lib/heartbeat/crm directory must remain empty Pacemaker distributes the CIB automatically.

18

Upgrading from previous Heartbeat versions

6.2.6.Upgradingsoftware
While upgrading, it is important to recall that the monolithic Heartbeat 2.1 tree has been split up into modular parts. Thus you will replace Heartbeat with three individual pieces of software: Cluster Glue, Pacemaker, and Heartbeat 3 which comprises just the cluster messaging layer. Upgrading from source: In the unpacked archive that you installed Heartbeat 2.1 from, run make uninstall. Then, install Cluster Glue and Heartbeat. Upgrading using locally built packages: When installing packages manually, uninstall the heartbeat package first. Then install cluster-glue, the version 3 heartbeat package, resource-agents, and pacemaker. Upgrading using a package repository: When upgrading using an APT, YUM, or Zypper repository, you should just be able to run the install command for heartbeat version 3 and pacemaker, and the dependencies will be resolved automatically. If this is the last node to be upgraded in your cluster, and your package management system did not restart Heartbeat services after the software upgrade, you should now proceed to Section 6.2.7, Restarting Heartbeat services [19]. Otherwise, you should move to the next node and proceed as outlined in Section 6.1.1, Stopping Heartbeat services [16] Section6.2.6, Upgrading software [19].

6.2.7.RestartingHeartbeatservices
Note
This step may be omitted in case your package management system automatically restarts Heartbeat services during post-install. First, restart Heartbeat on the node where you restored the CIB (see Section6.2.5, Restoring the CIB [18]) with /etc/init.d/heartbeat start. Then, repeat this command on the remaining cluster nodes. At this time, the cluster is still in unmanaged mode (meaning it does not start, stop, or monitor any resources), the cluster redistributes the old CIB among its nodes, and the cluster is still using the pre-upgrade CIB schema.

6.2.8.Returningtheclustertomanagedmode
Once the cluster software has been upgraded, it is recommended to return the cluster into managed mode: # crm_attribute -t crm_config -n is_managed_default -v true

6.2.9.UpgradingtheCIBschema
Although an upgraded cluster can theoretically operate on the pre-upgrade CIB schema indefinitely, it is strongly recommended to upgrade the CIB to the current schema. To do so, run the following command after cluster communications between all nodes have been reestablished: # cibadmin --upgrade --force

19

PartIV.GettingHelpandHelpingOut

Chapter7.Reportingproblems
This chapter describes the appropriate channels for dealing with bugs and issues releated to the software.

7.1.MailingLists
The Linux-HA project maintains two mailing lists. Both are subscriber-only lists, hosted at lists.linux-ha.org.

7.1.1.Linux-HA
This is the general-purpose community mailing list. Post here if you have a general question about Heartbeat or the resource agents. The post address for this list is linux-ha@lists.linux-ha.org [mailto:linux-ha@lists.linux-ha.org].

7.1.2.Linux-HA-dev
This is the developer mailing list. Post here if you have enhancement suggestions or think you have found a bug. The post address for this list is linux-ha-dev@lists.linux-ha.org [mailto:linuxha-dev@lists.linux-ha.org].

7.2.BugTrackingSystem
The Linux-HA project uses a public Bugzilla installation run by the Linux Foundation at http:// developerbugs.linuxfoundation.org. To report bugs or comment on existing entries, you must create an account, which requires a valid email address.

7.3.IRC
Linux-HA developers can usually be found on the irc.freenode.net server, in the #linuxha channel.

21

Chapter8.Submittingpatches
If you have found a problem in Cluster Glue or Heartbeat, and you are able to fix it, the developers would love to see a patch from you. To do so, please follow the steps outlined in this section. Create a working copy (a Mercurial clone) of the upstream repository with the following command: hg clone http://hg.linux-ha.org/dev heartbeat Create a new Mercurial queue, and a new patchset: cd heartbeat hg qinit hg qnew --edit --force fix-superfrobnication

Note
The --force option rolls your local modications into the patch set you are just creating. In your patch message, be sure to include a meaningful description, for example: High: IPC: fix superfrobnication on 63-bit platforms In the IPC layer, superfrobnication breaks on 63-bit middle-endian Linux platforms when ACPI is disabled. Fencepost align foobar_create_reqqueue() and guard against memory spinlock starvation to fix this. Now the patch set is good for review on the mailing list: hg email --to=linux-ha-dev@lists.linux-ha.org fix-superfrobnication Once your patch has been accepted for merging, one of the upstream developers will push it into the repository. At that point, you can update your checkout from upstream, and remove your own patch set. hg qpop -a hg pull --update hg qdelete fix-superfrobnication

22

You might also like