
WebSphere MQ For iSeries Best Practices Guide

By Jonathan Rumsey
IBM Hursley
E-mail: jrumsey@uk.ibm.com
Date: 31st May 2006

© Copyright IBM Corp., 2006


WebSphere MQ for iSeries Best Practice Guide
Version 1.2

Table of Contents
WebSphere MQ for iSeries Best Practice Guide
1 Introduction
2 WebSphere MQ maintenance and software rollout
  2.1 Software maintenance
  2.2 Managing rollouts
3 General housekeeping
  3.1 Creating and changing objects
  3.2 Journal housekeeping
  3.3 Shared memory cleanup
  3.4 Queue manager shutdown
  3.5 12-Step / Cold Start procedure
  3.6 Backups
    3.6.1 Data
    3.6.2 Object definitions
  3.7 Multiple queue managers
  3.8 Daylight saving time
    3.8.1 Spring time change
    3.8.2 Autumn or fall time change
4 Performance
  4.1 Journal receiver location
  4.2 Journal receiver switching
  4.3 Restart time
  4.4 Channel process pooling
  4.5 IFS Type 2
5 Availability
  5.1 High Availability (HA) clustering
    5.1.1 Remote mirroring
    5.1.2 Backup queue managers
  5.2 High Availability features
    5.2.1 WebSphere MQ clustering
    5.2.2 Client Channel table
  5.3 Network topology
  5.4 Hardware assistance
6 Conclusion
1 Introduction
This document is intended for people who manage WebSphere® MQ 5.3 or 6.0
software on iSeries machines. It has been written for both novice and expert users of
WebSphere MQ and encapsulates general best practice information that has been
collated by the IBM® development and service teams from customer installations. It
is divided into four main sections: WebSphere MQ maintenance and software rollout,
General housekeeping, Performance, and Availability.

This guide should be used in conjunction with, and not as a replacement to, the
WebSphere MQ publications. Full details of the WebSphere MQ administration and
programming interfaces are available in these publications; the latest editions of all
books can be downloaded from the WebSphere MQ Web site.

WebSphere MQ for iSeries – Best Practice Guide Page 3 of 18


2 WebSphere MQ maintenance and software rollout

2.1 Software maintenance


You should ensure that you keep current with maintenance. Preventative maintenance
for WebSphere MQ is delivered by distributing Fix packs (formerly known as CSDs)
to customers as Final PTFs and corrective maintenance is carried out via iSeries Test
Fix delivery. A Fix pack contains a number of cumulative fixes; these PTFs are
usually made available every 3 months. The entire set of PTFs can be ordered using a
single “marker” PTF.

A Test Fix is an OS/400 specific method of delivery. It lets a customer use the Load
PTF (LODPTF) and Apply PTF (APYPTF) CL commands to apply the fix, and the
remove PTF (RMVPTF) CL command to remove it. A Test Fix (unlike a PTF) is not
the final delivery vehicle for the fix. The final fix is delivered in the Fix pack. Test
Fixes cannot be superseded by another PTF (all Test Fixes need to be applied on top
of the last Fix pack), so there is a requirement to remove a Test Fix before applying a
new Fix pack or another Test Fix that fixes the same object.

Producing Test Fixes that cannot supersede other Test Fixes allows WebSphere MQ
Service to target individual fixes to specific customers because there are no
dependencies that force a Test Fix to drag in fixes devised for other customers. This
process is consistent with the other WebSphere MQ platforms on the common code
base (though the fixes are packaged differently). It allows emergency fixes to be
delivered quickly, in a format that allows customers to use standard PTF commands to
apply, remove, and track these fixes, and allows IBM to target individual fixes to
specific customers.

You should not need to apply every Fix pack on the day it is issued, but you should be
aware of what has been released. It is a good idea to check the list of APARs that
have been included in each PTF. Assume that at least one of those APARs could
surface on your system unless you upgrade. Plan to install PTFs on a test machine, and then
migrate to production systems.

If a high impact pervasive problem arises in-between the release of WebSphere MQ
Fix packs, as with other iSeries products, the service team will release a set of HIPER
(High Impact PERvasive) PTFs so a fix can be provided to all customers before the
next Fix pack is released. You should ensure that you are current with both
WebSphere MQ and other iSeries HIPER PTFs. Regularly review the Preventive
Service Planning (PSP) information that is available online to iSeries customers and
which lists all new important PTFs that have been released on the iSeries for each
supported version of the operating system. It also lists any PTFs that are known to be
defective and should not be applied to your system.

New full versions of WebSphere MQ have generally been released every 18-24
months, with an overlap of service of approximately 12-15 months, but sometimes for
a longer period. When a new release arrives make early plans to upgrade to it. It is
advisable to schedule your upgrade before service has been withdrawn for the older
version.

It takes time to test applications and roll the product into production systems, and
last-minute upgrades can run into problems if this testing is not done. We know that some
sites take several months to implement a replacement version, as they go through the
tiers of testing. Problems need to be found early so that you have time to fix these
problems before the final end of service deadline.

The latest version of WebSphere MQ for iSeries at the time of writing (May 2006) is
V6.0. This release contains many functional enhancements over V5.3. It also
contains fixes to all the relevant, known defects that were included in the V5.3 PTFs
available before V6.0 shipped. Future fixes made to V5.3 will also be included in
V6.0 PTFs where appropriate. You can see the WebSphere MQ Support,
Service summary for OS/400 for a summary of the problems fixed in each PTF. This
page also shows the end of service date for each version of WebSphere MQ.

2.2 Managing rollouts


All application and system software should be fully tested, using production
workloads and configurations, on separate machines or partitions.

Many customers have a multi-tier approach, covering development, system testing,
production testing, and real production systems. Programs and configurations are
moved between these environments using locally defined practices. A fundamental
rule is that no changes should be made without change control and audit trail
mechanisms. Do not go into production without properly testing your systems. This
normally means having a spare machine of equivalent processing power that can run
workloads similar to that of the production system. The machines should also, of
course, use identical levels and versions of all software.

While we cannot make specific recommendations, we know that any successful
enterprise will have implemented a staged rollout process, testing at each stage to
ensure quality and service requirements are met. Testing needs to cover not just basic
application function but also performance, capacity and disaster recovery scenarios. A
self-contained, repeatable, regression test suite that is enhanced as necessary will
contribute significantly to a successful rollout.

Logical partitions can be used for some tests, as these can simulate testing on separate
machines. Any recommendations in this document that refer to machines can equally
be applied to separate partitions. The only difference is that the CPU cycles of a
machine can be shared between the partitions, so for example you cannot accurately
test the peak performance of a dedicated machine.

3 General housekeeping
3.1 Creating and changing objects
It is good practice, when creating queue manager objects, to put the object creation
commands into an MQSC script or CL command program. If you always create
WebSphere MQ objects programmatically then you have a record of the queue
manager definitions that allows you to very quickly rebuild an entire queue manager.

When changing WebSphere MQ objects, change the definition of the object in the
program and rerun the program rather than changing the object directly.
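
As an illustrative sketch (the queue, member, and library names are hypothetical),
such a script might contain MQSC definitions that use the REPLACE keyword so the
script can safely be rerun after a definition is changed:

DEFINE QLOCAL('APP.REQUEST') MAXDEPTH(50000) REPLACE
DEFINE QLOCAL('APP.REPLY') DEFPSIST(YES) REPLACE

The script can then be run against the queue manager with the Start MQSC
Commands CL command, for example:

STRMQMMQSC SRCMBR(APPDEFS) SRCFILE(MYLIB/QMQSC) MQMNAME(MYQMGR)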

3.2 Journal housekeeping


On iSeries, WebSphere MQ periodically issues messages AMQ7460 and AMQ7462
to indicate which journals are no longer needed for crash or media recovery. In
WebSphere MQ 6.0 this information is now also available via the DSPMQMSTS CL
command, via the DISPLAY QMSTATUS MQSC command and programmatically
through the MQAI/PCF MQCMD_INQUIRE_Q_MGR_STATUS interface.

Older journal receivers should be deleted or archived once they are not needed by
WebSphere MQ to recover storage and to reduce the number of journal receivers that
MQ needs to read. Ideally, this should be done by an automated task that runs
regularly, the frequency of which should be dependent on the number of journal
receivers used each day.

For a discussion and sample program showing how to automate the Housekeeping of
journal receivers, see the article, Automating Journal Management in WebSphere MQ
for iSeries.
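
For example (the queue manager, library, and receiver names are hypothetical),
once the oldest receiver still required has been identified from the AMQ7460 and
AMQ7462 messages, or from DSPMQMSTS at WebSphere MQ 6.0, an automated CL task
could delete each older receiver:

DSPMQMSTS MQMNAME(MYQMGR)
DLTJRNRCV JRNRCV(QMGRLIB/AMQA000010)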

3.3 Shared memory cleanup


Customers have occasionally experienced problems starting a queue manager that
have led them to clear out the shared memory and semaphore related files that are
found in the IFS file system. The manual deletion of these files is not recommended.

The existence of these files provides a placeholder to ensure that WebSphere MQ
does not attempt to create new shared resources with the same key as existing shared
resources. Manually deleting these files can orphan inter-process communication
(IPC) objects such as shared memory and semaphores, which may result in
unexpected behaviour of WebSphere MQ.

In V5.3 and later releases, shared memory can be safely purged for a single queue
manager by specifying a queue manager name on the command:

ENDMQM MQMNAME(queuemanager) ENDCCTJOB(*YES)

3.4 Queue manager shutdown
Customers have found that they occasionally have trouble shutting down their queue
managers on busy systems within a reasonable period of time. Often this is related to
the time required to end a large number of channel jobs connected into the queue
manager.

The recommended way to quiesce a queue manager is to use the End Message
Queue Manager (ENDMQM) command with ENDCCTJOB(*YES) to end
connected jobs, listeners etc.

In WebSphere MQ 5.3, the ENDCCTJOB(*YES) option forces the queue
manager to record media images (with an implicit RCDMQMIMG) before it is
shut down. The implicit RCDMQMIMG can take some time. Specifying
RCDMQMIMG(*NO) at WebSphere MQ 6.0 prevents this step occurring and so
can greatly reduce the time it takes to end a queue manager, at the expense that
media recovery of a damaged object may take longer to perform.

To speed up the ENDMQM OPTION(*IMMED) ENDCCTJOB(*YES) you
can first issue an ENDMQM OPTION(*CNTRLD). This command returns
immediately and flags the queue manager as ending. Issuing this initial controlled
end of the queue manager at WebSphere MQ 5.3 will prevent the implicit
RCDMQMIMG. The ENDCCTJOB(*YES) will still run, ending the queue
manager immediately and shutting down any jobs connected to the queue manager. If
you bypass the RCDMQMIMG on ENDMQM then you should consider issuing
a manual RCDMQMIMG command when the queue manager restarts to ensure
you have media images of your queue manager objects readily available in the
latest journal receivers.
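
Putting this together, a quiesce sequence might look like the following sketch (the
queue manager name is hypothetical):

ENDMQM MQMNAME(MYQMGR) OPTION(*CNTRLD)
ENDMQM MQMNAME(MYQMGR) OPTION(*IMMED) ENDCCTJOB(*YES)

…and after the next restart, fresh media images can be recorded:

STRMQM MQMNAME(MYQMGR)
RCDMQMIMG OBJ(*ALL) OBJTYPE(*ALL) MQMNAME(MYQMGR)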

Splitting queue managers may also help to alleviate this problem by distributing the
channel jobs between the queue managers (Section 3.7).

3.5 12-Step / Cold Start procedure


In the past, some customers have experienced problems starting their WebSphere MQ
queue managers on iSeries. These problems have manifested themselves as failures
in the Start Message Queue Manager (STRMQM) command, or as apparent “hangs”
while running STRMQM (in fact this is more likely to be STRMQM running
extremely slowly, a problem which has now been resolved).

To recover from these problems, the queue manager is often restarted using a
procedure referred to as the “12-step” or “cold start” procedure. This procedure
involves deleting the existing AMQAJRN journal and associated journal receivers for
the queue manager, and creating new empty journals and journal receivers.

Performing a cold start is not best practice. When journal receivers that are still
required by the queue manager are deleted, it must be understood that messages can
be lost or duplicated.

Normal queue manager start-up processing involves reconciling the data in the
AMQAJRN journal with the data in the IFS queue files. When a queue manager is
restarted using the cold start procedure, there is no journal data to replay, so
WebSphere MQ cannot perform its normal start-up processing.

The start-up reconciliation is required by WebSphere MQ to guarantee integrity. One
of the key reasons is that WebSphere MQ uses a technique called
“write-ahead” logging. This means that whenever WebSphere MQ puts or gets
persistent messages, the put or get is recorded in two ways:

1. With a forced disk write to the AMQAJRN Journal, meaning that WebSphere
MQ waits for OS/400 or i5/OS to confirm that the disk has been physically
updated.

2. With a lazy write to the queue, meaning that WebSphere MQ caches some data
in memory and writes some data to the IFS queue file with an OS/400 unforced
write. The unforced write will be stored in operating system buffers and written
to disk at the operating system's convenience.

The journal data on disk is therefore the master copy of WebSphere MQ data. In
normal circumstances when WebSphere MQ is shut down and restarted, STRMQM
processing ensures that the data in the queue files is brought up to date with the data
in the journals.

If a queue manager or system terminates abnormally, then information about
messages and transactions may exist in the journal, but not in the IFS. Deleting the
journal receivers as part of a cold start deletes the only copy of these messages and
transactions. Using the cold start procedure (deleting the journals and bypassing
any STRMQM reconciliation) could lead to loss, duplication, or corruption of data.

Because the data in the journal receivers is so important to the recovery of MQ data,
we strongly recommend that the journal receivers are stored on RAID protected disk,
and that the cold start is avoided wherever possible.

3.6 Backups
Two methods to consider when planning a backup and recovery strategy for
WebSphere MQ are the data backup and object definition backup. These methods are
complementary, and most enterprises successfully implement a combination of these
two techniques.

3.6.1 Data
It is necessary to quiesce a queue manager before fully backing up its IFS and journal
data. You can, however, take a backup of just the journal data while the queue
manager is running. If the backed-up journal data is restored then it is possible to
fully recover a queue manager and its IFS data.

It is a good idea to perform a full backup of the WebSphere MQ libraries and IFS
directories occasionally (for example, when the queue manager is quiesced to apply
maintenance). Journal backups can be taken more frequently, for example, daily or
weekly.
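
As a sketch, a full backup of a quiesced queue manager might save both the queue
manager library and its IFS directory (the library, queue manager, and tape device
names are hypothetical and depend on your installation):

SAVLIB LIB(QMGRLIB) DEV(TAP01)
SAV DEV('/QSYS.LIB/TAP01.DEVD') +
    OBJ(('/QIBM/UserData/mqm/qmgrs/MYQMGR'))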

3.6.2 Object definitions


Journal data backups will let you recover object definitions and messages up to the
last backup, but if you are using WebSphere MQ for non-persistent messaging, then it
is simpler to just back up your WebSphere MQ queue manager definitions. This lets
you quickly recreate a copy of a queue manager without requiring the messages and
transactions to be restored.

SupportPac MS03 saves the queue manager definitions as MQSC commands that
can be replayed to recreate the objects. The SupportPac can be downloaded from:
http://www.ibm.com/software/integration/support/supportpacs/individual/ms03.html

MS03 does not save or restore the queue manager’s authority records, but a utility
program (AMQOAMD) ships with WebSphere MQ for this purpose. Calling
AMQOAMD with the “-s” flag will dump the authority records in the form of
GRTMQMAUT commands that can be replayed to recreate the authority records. So
to dump the authority commands to a file the following approach can be used:

OVRDBF FILE(STDOUT) TOFILE(&OUTLIB/&OUTFILE) MBR(&OUTMEMB)
CALL AMQOAMD PARM('-m' &MQMNAME '-s')
DLTOVR FILE(STDOUT)

MS03 and AMQOAMD provide a quick and lightweight way of backing up and
restoring a queue manager’s definitions from one machine to another, or even
between queue managers on a single machine.

The alternative to using MS03 is using one of the third-party system management
tools that hold queue manager configurations in a central repository.

3.7 Multiple queue managers


Prior to V5.1, WebSphere MQ on iSeries only supported a single queue manager per
machine, and so, many disparate applications were forced to share a single queue
manager. Despite the introduction of multiple queue manager support in V5.1, many
customers are still using one queue manager for all their applications.

Sharing a single queue manager between applications in this way can lead to some
problems that can be alleviated by allocating individual queue managers to
applications. Changing a WebSphere MQ topology from a shared queue manager to
multiple queue managers can involve changes to applications, and so is not a trivial
undertaking. The following advantages and disadvantages should be reviewed when
considering how your applications are distributed over queue managers.

Advantages of a single queue manager:
• Low overhead - There is a fixed minimum overhead associated with a queue
manager. Dependent on the version, each queue manager can start between
five and eleven jobs before any applications connect. The fewer queue
managers, the fewer system resources are used.

• Management - An increase to the number of queue managers may increase
the management tasks that need to be performed. Running each queue
manager in a separate subsystem is a good strategy to simplify queue manager
maintenance if multiple queue managers are used.

Advantages of multiple queue managers:


• Failure impact - A single queue manager is a single point of failure for all
applications. Multiple queue managers can reduce the impact of a failure to
only the application(s) or location(s) being served by that queue manager.

• Scalability - On a single queue manager system, one high volume application
can adversely affect the performance of all other applications using the queue
manager. Having multiple queue managers gives scope for greater throughput
on a queue manager by queue manager basis. For example, if the performance
of a queue manager is suffering because of a bottleneck when writing to the
journals it is possible to move the journals for that queue manager to a
dedicated disk. WebSphere MQ 6.0 allows you to allocate storage for queue
manager journal data to a specific auxiliary storage pool.

• Availability planning – Multiple queue managers can simplify planning for
high availability, allowing individual queue managers and applications to be
considered as a single failover unit that can be failed over to another machine
without affecting services for other applications.

3.8 Daylight saving time


Changing the system clock to account for daylight saving time causes problems for
WebSphere MQ on iSeries if a queue manager is active because WebSphere MQ uses
timestamps based on the system clock to access data in the queue manager’s journal
entries. If the system clock changes while WebSphere MQ is running, then
WebSphere MQ can fail to access journal data correctly and may prevent queue
manager start-up. It is therefore necessary to quiesce WebSphere MQ before changing
the system clock.

3.8.1 Spring time change


When the clocks go forward one hour in the spring, WebSphere MQ can just be shut
down for the time it takes to adjust the clock. The queue manager can be restarted
immediately after changing the clock and UTC offset.
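
For example, the spring change might be handled with a sequence along these lines
(the queue manager name and values are hypothetical; QTIME and QUTCOFFSET are the
standard i5/OS system values for the time of day and UTC offset):

ENDMQM MQMNAME(MYQMGR) OPTION(*IMMED) ENDCCTJOB(*YES)
CHGSYSVAL SYSVAL(QTIME) VALUE('030000')
CHGSYSVAL SYSVAL(QUTCOFFSET) VALUE('+0100')
STRMQM MQMNAME(MYQMGR)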

3.8.2 Autumn or fall time change


When the clocks go backward in the autumn or fall, you cannot restart the queue
manager immediately after changing the clock backwards. If you do so, WebSphere
MQ could write duplicate timestamps to the journal. You should ensure that
WebSphere MQ is stopped for an hour either before, or after the time change and
UTC offset update, to avoid the problems associated with setting the system clock
backward by an hour.

In environments where downtime must be minimized, an enforced outage of one hour
may not be acceptable. IBM is investigating a solution to this problem, but until a
solution is available the only alternative to quiescing the queue manager for an hour is
to perform a controlled cold start of the system (see section 3.5 for a discussion of the
Cold Start).

A controlled cold start is one where all queues are emptied of any persistent messages
and the queue manager is cleanly shut down. The queue manager journal data can
then be deleted per the cold start procedure. This eliminates the risk of losing
messages, but it still deletes all media recovery information. You will not be able to
recover damaged objects without media recovery information, so you should ensure
that you have backed up your object definitions prior to attempting this (see section
3.6.2). Your IBM service representative will be able to provide details of the cold start
procedure should it be required.

4 Performance
4.1 Journal receiver location
The critical path for performance of persistent messages is usually the update of the
WebSphere MQ log files (journals on iSeries). These updates should ideally be
isolated using operating system facilities so that they are written using dedicated disk
heads. This is achieved by putting the queue manager AMQAJRN journal and its
receivers into a separate Auxiliary Storage Pool; this facility is provided as an option
when creating your queue manager at WebSphere MQ 6.0.
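
As a sketch, the option might be used as follows when creating a queue manager (the
ASP parameter name and pool number shown here are assumptions that should be
checked against the CRTMQM command help for your release):

CRTMQM MQMNAME(MYQMGR) ASP(2)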

4.2 Journal receiver switching


Some customers have encountered performance problems when WebSphere MQ
journal receivers switch. This is because when a receiver switch happens, WebSphere
MQ uses a lock to protect data while it builds an in-memory image of the journal
receiver chain. In extreme cases WebSphere MQ can appear to hang while the
AMQALMPX job builds this image. This problem has been resolved in WebSphere
MQ 5.3 via Fix pack and in WebSphere MQ 6.0 and thus the following advice applies
only to prior releases of the product.

In WebSphere MQ releases prior to 5.3 (CSD#6) there is a direct relationship between
the amount of journal data that is stored on the system and the amount of time
WebSphere MQ takes to perform a journal switch, so it pays to remove redundant
receivers as described in section 3.2.

You can reduce the number of times you switch journals by using a small number of
large journal receivers as opposed to a large number of small journal receivers. The
optimum size for journal receivers depends on workload and the amount of persistent
data passing through the queue manager.

You can avoid journal receiver switches during busy periods by making the journal
receivers large enough to contain a full day's data. At close of day, journal receivers
should be switched manually with the CHGJRN command, specifying JRNRCV(*GEN),
to ensure a new receiver is used the next day.
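
The close-of-day switch can be performed with the following command (the library
name is hypothetical):

CHGJRN JRN(QMGRLIB/AMQAJRN) JRNRCV(*GEN)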

You can define the size of the journal receivers for a queue manager in WebSphere
MQ 6.0 via a parameter on the CRTMQM command. On previous releases you can
change the size of a queue manager journal receiver by creating a new receiver with
the desired size, and attaching it to the journal. All subsequent receivers will be
created with the new size. Use the following commands to do this.

CRTJRNRCV JRNRCV(QMGRLIB/AMQAnnnnnn) THRESHOLD(NEW_SIZE) +
          TEXT('MQM local journal receiver') AUT(*EXCLUDE)
CHGOBJOWN OBJ(QMGRLIB/AMQAnnnnnn) OBJTYPE(*JRNRCV) +
          NEWOWN(QMQM)
CHGJRN JRN(QMGRLIB/AMQAJRN) JRNRCV(QMGRLIB/AMQAnnnnnn)

…where QMGRLIB is the name of your queue manager library, AMQAnnnnnn is
the name of the next journal receiver in sequence, and NEW_SIZE is the new
receiver size.

4.3 Restart time


If a queue manager is not ended with the normal ENDMQM command, the
subsequent STRMQM will take longer than after a clean shutdown.

The queue manager restart time after an abnormal shutdown is heavily dependent on
the amount of work needed to replay journal data and to recover and resolve
transactions that were in-flight when the queue manager shut down.

If queue data files are found to be corrupt when replaying transactions, then the queue
files must also be recovered from media image. Queues are recovered from the last
recorded media image, and all operations (put/get) are replayed from the journals into
this queue.

To reduce restart time:


• Make sure that applications check for “fail if quiescing” when using MQGET,
and roll back their Units of Work.

• Where possible, write your applications so that units of work are short lived.

• Use the RCDMQMIMG command regularly to update the media recovery
images for queues. This will reduce the amount of data replayed if the queue
files become corrupt. Ideally you should record the media image of a queue
when the queue depth is low to reduce the amount of information that needs to
be stored in the journals.
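
For example (the queue manager name is hypothetical), media images for all objects
can be recorded during a quiet period:

RCDMQMIMG OBJ(*ALL) OBJTYPE(*ALL) MQMNAME(MYQMGR)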

4.4 Channel process pooling


The channel process-pooling feature, first introduced in
WebSphere MQ V5.3, lets channels run as threads within a pool of processes. By
default each pooling job can run up to 60 channel threads, which significantly reduces
the number of active jobs in the system.

Channel process pooling is the default behaviour for queue managers created in
WebSphere MQ V5.3 or later releases. You can turn on the channel process pooling
feature for queue managers which were originally migrated from previous releases by
adding the “ThreadedListener=YES” value to the Channels stanza in the qm.ini file.
For example:
Channels:
   ThreadedListener=YES

Important: If you use channel process pooling you must ensure that all channel exits
are thread-safe, as you would with any threaded MCAs.

4.5 IFS Type 2
As WebSphere MQ uses the IFS extensively you may want to consider converting
your file system to use IFS Type 2 directories to improve performance and reliability.

5 Availability

5.1 High Availability (HA) clustering


High Availability (HA) clustering is a general term for systems where a service is
automatically restarted, perhaps on a different box, when a failure is discovered.
WebSphere MQ provides technology for integrating with HA frameworks where the
WebSphere MQ data and journals are stored on a disk that can be accessed by more
than one machine (not necessarily simultaneously). When a machine fails, the disk is
switched to the other machine and the queue manager is restarted. There will be a
short delay while the takeover occurs; queue managers outside the HA cluster will
automatically reconnect as channel retry kicks in, and no persistent messages are lost.

OS/400 V5R2 introduced the ability to place libraries onto disks that can be switched
between two machines (Independent Auxiliary Storage Pools or IASPs). This makes
it possible to develop an HA clustering solution for WebSphere MQ.

IBM have provided a SupportPac that shows how to configure OS/400 HA clusters
with WebSphere MQ.

5.1.1 Remote mirroring


Without a shared disk, and therefore without a “standard” HA integration facility, an
alternative approach to restarting a queue manager on a second machine is to mirror
the contents of the disks holding WebSphere MQ data and logs.

Provided the mirror is precisely synchronized with the original data, then this has
exactly the same availability characteristics as an HA cluster. If, however, the mirror is
an asynchronous process, there is a possibility that journal entries written by the
queue manager might not have been copied before the failure, and therefore that the
rebuilt queue manager image can have missed updates. This could result in lost or
duplicated messages.

There are several vendor products that work using a combination of mirroring disk
files and extracting data out of WebSphere MQ journals. Any true mirror is likely to
introduce a performance impact, as any forced update (flush) to the disk is going to
have to be written to the mirrored system before the queue manager can continue.
This performance hit may or may not be acceptable, depending on customer
requirements; we would recommend running a production-level workload to ensure
performance is adequate and data replication is complete.

5.1.2 Backup queue managers


WebSphere MQ 6.0 introduced an alternative to using a vendor product to mirror a
queue manager: backup queue managers. This feature allows you to periodically update
the backup with journal data and then activate it in a disaster recovery scenario.

To create a backup queue manager, the primary queue manager's IFS directory and
library are saved and restored to a backup system. The backup queue manager is then
designated as such by starting it with the STRMQM REPLAY(*YES) command, which
instructs the queue manager to simply replay the journal data to update the IFS,
rather than perform a full queue manager startup.

Periodically, new journal receivers are copied from the primary queue manager to the
backup and replayed using the STRMQM REPLAY(*YES) command. At this stage the
backup cannot yet be fully started, but if the primary queue manager fails, the
backup queue manager can be activated using STRMQM ACTIVATE(*YES).
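This procedure can be sketched with CL commands. The sketch below is an outline only: the queue manager name SRCQM, its library QMSRCQM, and the save file names are assumptions for illustration, and the actual library name and IFS path depend on how the queue manager was created on your system.

```
/* On the primary system: save the queue manager library and IFS data */
SAVLIB LIB(QMSRCQM) DEV(*SAVF) SAVF(QGPL/MQLIBSAV)
SAV DEV('/QSYS.LIB/QGPL.LIB/MQIFSSAV.FILE') +
    OBJ(('/QIBM/UserData/mqm/qmgrs/SRCQM'))

/* On the backup system: restore both, then replay the journal data   */
STRMQM MQMNAME(SRCQM) REPLAY(*YES)

/* After a primary failure: activate the backup, then start it        */
STRMQM MQMNAME(SRCQM) ACTIVATE(*YES)
STRMQM MQMNAME(SRCQM)
```

The save and replay steps are repeated each time new journal receivers are copied across; activation is performed once, only when the backup is to take over.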

Using the backup queue manager feature does not impact the performance of the
primary queue manager, as remote mirroring does, and in most scenarios it is faster
to start a backup than a mirrored queue manager. However, this solution may not be
appropriate if any message loss is unacceptable, because the backup queue manager
may not be completely up to date with the configuration and message data of the
primary queue manager.

5.2 High Availability features


WebSphere MQ includes a number of features that let applications become more
resilient to queue manager or hardware failures. We consider two in more detail here:
WebSphere MQ queue manager clusters, and the client channel table. Both queue
manager clustering and client channel table reconnection techniques assume that any
one of the queue managers in the cluster / table is capable of delivering the service
that the client needs. By duplicating services on all queue managers, you eliminate
single points of failure.
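As a sketch of such duplication, the same local queue can be defined with a cluster attribute on every queue manager in the cluster; the queue name APP.REQUEST and cluster name INVCLUS below are illustrative.

```
DEFINE QLOCAL(APP.REQUEST) CLUSTER(INVCLUS) DEFBIND(NOTFIXED)
```

With DEFBIND(NOTFIXED), messages put to APP.REQUEST are eligible for workload balancing across every queue manager that hosts an instance of the queue, so the loss of one hosting queue manager does not stop new work from being delivered.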

5.2.1 WebSphere MQ clustering


WebSphere MQ clustering provides two main benefits: reduced administration and
workload distribution (WLM).

The default WLM algorithm routes each inbound message to one of the available
queue managers that contains the target named queue. If a queue manager is not
currently running, then the messages are sent elsewhere; if no queue manager is
running that hosts the named queue, the messages remain on the originating system
until a server queue manager is restarted. The WLM algorithm can be tailored with a
workload manager exit program in any release of WebSphere MQ that supports
clustering, or via various channel and queue parameters, such as rank, priority and
weighting, which were introduced in WebSphere MQ 6.0.
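As an illustrative sketch, the WebSphere MQ 6.0 attributes can be set with MQSC commands such as the following (the channel and queue names are assumptions; higher CLWLPRTY values are preferred, CLWLWGHT skews the proportion of messages a channel receives, and CLWLRANK lets some destinations be chosen ahead of others):

```
ALTER CHANNEL(TO.QM1) CHLTYPE(CLUSRCVR) CLWLPRTY(5) CLWLWGHT(75)
ALTER QLOCAL(APP.REQUEST) CLWLRANK(2) CLWLUSEQ(ANY)
```

CLWLUSEQ(ANY) on the queue allows a local instance and remote cluster instances to be balanced together, rather than the local instance always being preferred.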

This approach allows new work to be injected into the cluster even when some of the
server machines are not running. However, messages that have been sent to a queue
manager that subsequently fails cannot be processed until that queue manager
restarts; such messages are often called 'marooned'. A WebSphere MQ cluster is
especially effective if there are no affinities between individual messages, so that
they can be processed in any sequence. Affinities can be maintained if the application
is written to use options in the MQI, but this has to be a conscious decision based on
the business requirements.



Many customers are using WebSphere MQ clustering successfully in large
configurations; it is our normal recommendation for managing a set of queue
managers that need to share workload. More information about clustering can be
found in the online WebSphere MQ manuals.

5.2.2 Client Channel table


Already available on other platforms, the client channel table was introduced in
WebSphere MQ V5.3 for iSeries as a way of building more resilient client
applications without clustering.

The client channel table is created by the iSeries server queue manager and contains
a list of the client connection (*CLTCN) channels that have been defined on the
server. The table allows client applications to select, at run time, the queue manager
they connect to. If the client uses a wildcard in the queue manager name on the
MQCONN call, the client code picks the first available queue manager from the
pre-configured client channel table.

If the connection to the queue manager subsequently fails for any reason, the client
can be programmed to detect the failure and attempt the connection again. The client
code will pick the next available queue manager from the Client Channel table. In
this way, the client application can recover from server failures.
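As an illustrative sketch, the administrator defines one client connection channel per candidate queue manager (the channel names, host names, and port below are assumptions). The queue manager builds these definitions into the table file, by default AMQCLCHL.TAB, which is copied to the client machine and located there via the MQCHLLIB and MQCHLTAB environment variables.

```
DEFINE CHANNEL(QM1.CLIENT) CHLTYPE(CLNTCONN) TRPTYPE(TCP) +
       CONNAME('host1(1414)') QMNAME(QM1)
DEFINE CHANNEL(QM2.CLIENT) CHLTYPE(CLNTCONN) TRPTYPE(TCP) +
       CONNAME('host2(1414)') QMNAME(QM2)
```

A client application that passes a wildcard queue manager name such as '*' on MQCONN then connects to the first channel in the table whose server is reachable, retrying the next entry on a connection failure.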

5.3 Network topology


There are times when it is appropriate to share workload between multiple systems
even when a single machine could theoretically handle the entire capacity. The
decision to partition the workload here is often made for geographic or organizational
reasons. For example, it might be appropriate to have work executed in a regional
center instead of sending it all to a central location.

There are no fixed rules about "business-driven" partitioning; however, you need to
consider the availability (including bandwidth) of the networks between processing
centers and the amount of inter-server messaging. Using WebSphere MQ clustering,
perhaps with a custom-written workload exit or with the new features introduced in
WebSphere MQ 6.0, is a good way to direct messages to the nearest available
server.

Designing an efficient queue manager and network topology requires an analysis of
the expected message traffic patterns. This activity needs to consider the other
material in this document to ensure good availability and performance of all of the
queue managers in the network, avoiding single points of failure and bottlenecks.



5.4 Hardware assistance
There are a number of hardware technologies that can improve the availability of a
WebSphere MQ system. These include Uninterruptible Power Supplies and RAID
technologies for disks. Use of these facilities is transparent to WebSphere MQ, but
should be considered as part of the planning process when ordering machine
configurations.

6 Conclusion
This article discussed some of the best practices that will help you to get the most out
of WebSphere MQ on iSeries. These practices will help you keep your system up to
date, safely backed up, available, and performing well.

WebSphere MQ is a versatile product that is used in many environments, which
makes it difficult to describe procedures that cover every eventuality. If you have a
"best practice" that is not covered here, the authors would be interested to hear
about it.

IBM and WebSphere are trademarks or registered trademarks of IBM Corporation in the
United States, other countries, or both.
Other company, product, and service names may be trademarks or service marks of others.
IBM copyright and trademark information
