IBM WebSphere MQ for iSeries Best Practices Guide - Middleware News
By Jonathan Rumsey
IBM Hursley
E-mail: jrumsey@uk.ibm.com
Date: 31st May 2006
Table of Contents
WebSphere MQ for iSeries Best Practice Guide
1 Introduction
2 WebSphere MQ maintenance and software rollout
2.1 Software maintenance
2.2 Managing rollouts
3 General housekeeping
3.1 Creating and changing objects
3.2 Journal housekeeping
3.3 Shared memory cleanup
3.4 Queue manager shutdown
3.5 12-Step / Cold Start procedure
3.6 Backups
3.6.1 Data
3.6.2 Object definitions
3.7 Multiple queue managers
3.8 Daylight saving time
3.8.1 Spring time change
3.8.2 Autumn or fall time change
4 Performance
4.1 Journal receiver location
4.2 Journal receiver switching
4.3 Restart time
4.4 Channel process pooling
4.5 IFS Type 2
5 Availability
5.1 High Availability (HA) clustering
5.1.1 Remote mirroring
5.1.2 Backup queue managers
5.2 High Availability features
5.2.1 WebSphere MQ clustering
5.2.2 Client Channel table
5.3 Network topology
5.4 Hardware assistance
6 Conclusion
1 Introduction
This document is intended for people who manage WebSphere® MQ 5.3 or 6.0
software on iSeries machines. It has been written for both novice and expert users of
WebSphere MQ and encapsulates general best practice information that has been
collated by the IBM® development and service teams from customer installations. It
is divided into four main sections: WebSphere MQ maintenance and software rollout,
General housekeeping, Performance, and Availability.
This guide should be used in conjunction with, and not as a replacement for, the
WebSphere MQ publications. Full details of the WebSphere MQ administration and
programming interfaces are available in these publications; the latest editions of all
the books can be downloaded from the WebSphere MQ Web site.
A Test Fix is an OS/400-specific method of delivery. It lets a customer use the Load
PTF (LODPTF) and Apply PTF (APYPTF) CL commands to apply the fix, and the
Remove PTF (RMVPTF) CL command to remove it. A Test Fix (unlike a PTF) is not
the final delivery vehicle for the fix. The final fix is delivered in the Fix pack. Test
Fixes cannot be superseded by another PTF (all Test Fixes need to be applied on top
of the last Fix pack), so there is a requirement to remove a Test Fix before applying a
new Fix pack or another Test Fix that fixes the same object.
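As an illustration only (not a definitive procedure), the following sketch shows how a Test Fix supplied in a save file might be applied and later removed with the standard PTF commands. The licensed program ID 5724B41, the save file QGPL/MQTESTFIX, and the fix ID TF12345 are placeholders; replace them with the values supplied with your fix.

   LODPTF LICPGM(5724B41) DEV(*SAVF) SAVF(QGPL/MQTESTFIX) SELECT(TF12345)
   APYPTF LICPGM(5724B41) SELECT(TF12345) APY(*TEMP)
   /* ...remove the Test Fix before the next Fix pack or an overlapping Test Fix is applied... */
   RMVPTF LICPGM(5724B41) SELECT(TF12345) RMV(*TEMP)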
Producing Test Fixes that cannot supersede other Test Fixes allows WebSphere MQ
Service to target individual fixes to specific customers because there are no
dependencies that force a Test Fix to drag in fixes devised for other customers. This
process is consistent with the other WebSphere MQ platforms on the common code
base (though the fixes are packaged differently). It allows emergency fixes to be
delivered quickly, in a format that allows customers to use standard PTF commands to
apply, remove, and track these fixes, and allows IBM to target individual fixes to
specific customers.
You should not need to apply every Fix pack on the day it is issued, but you should be
aware of what has been released. It is a good idea to check the list of APARs that
have been included in each PTF; assume that at least one of those APARs could affect
your system if you do not upgrade. Plan to install PTFs on a test machine first, and
then migrate them to production systems.
New full versions of WebSphere MQ have generally been released every 18-24
months, with an overlap of service of approximately 12-15 months, but sometimes for
a longer period. When a new release arrives, make early plans to upgrade to it. It is
advisable to schedule your upgrade before service has been withdrawn for the older
version.
The latest version of WebSphere MQ for iSeries at the time of writing (May 2006) is
V6.0. This release contains many functional enhancements over V5.3. It also
contains fixes to all the relevant, known defects that were included in the V5.3 PTFs
available before V6.0 shipped. Future fixes made to V5.3 will also be included in
V6.0 PTFs where appropriate. For a summary of the problems fixed in each PTF, see
the Service summary for OS/400 page on the WebSphere MQ Support site. That page
also shows the end-of-service date for each version of WebSphere MQ.
Logical partitions can be used for some tests, as these can simulate testing on separate
machines. Any recommendations in this document that refer to machines can equally
be applied to separate partitions. The only difference is that the CPU cycles of a
machine can be shared between the partitions, so for example you cannot accurately
test the peak performance of a dedicated machine.
When changing WebSphere MQ objects, change the definition of the object in the
program or script that creates it and rerun that program, rather than altering the object
directly.
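As a minimal sketch, such a program might be a CL source member like the one below, which is edited and rerun whenever a definition changes. The queue manager name TESTQM, the queue names, and the attribute values are purely illustrative; the same approach applies to an MQSC script replayed against the queue manager.

   /* APPQUEUES: illustrative CL program that (re)creates the application queues */
   PGM
     CRTMQMQ QNAME('APP.REQUEST') QTYPE(*LCL) MQMNAME(TESTQM) +
             MAXDEPTH(100000) TEXT('Application request queue') REPLACE(*YES)
     CRTMQMQ QNAME('APP.REPLY') QTYPE(*LCL) MQMNAME(TESTQM) +
             MAXDEPTH(100000) TEXT('Application reply queue') REPLACE(*YES)
   ENDPGM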
Older journal receivers should be deleted or archived once WebSphere MQ no longer
needs them, both to recover disk storage and to reduce the number of journal receivers
that WebSphere MQ has to read. Ideally, this should be done by an automated task
that runs regularly; how often it needs to run depends on the number of journal
receivers used each day.
For a discussion and a sample program showing how to automate the housekeeping of
journal receivers, see the article Automating Journal Management in WebSphere MQ
for iSeries.
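For occasional manual housekeeping, a minimal sketch using standard CL commands is shown below. The library QMTESTQM and the receiver AMQA000005 are hypothetical, and only receivers that WebSphere MQ no longer requires should be removed.

   WRKJRNA JRN(QMTESTQM/AMQAJRN)            /* review the journal and its receiver chain          */
   DLTJRNRCV JRNRCV(QMTESTQM/AMQA000005) +
             DLTOPT(*IGNINQMSG)             /* delete a detached receiver that MQ no longer needs */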
In V5.3 and later releases, shared memory can be safely purged for a single queue
manager by specifying a queue manager name on the ENDMQM command with
ENDCCTJOB(*YES).
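For example (a sketch only; the queue manager name TESTQM is illustrative, and OPTION(*CNTRLD) can be used instead of *IMMED for a controlled shutdown):

   ENDMQM MQMNAME(TESTQM) OPTION(*IMMED) ENDCCTJOB(*YES)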
The recommended way to quiesce a queue manager is to use the End Message
Queue Manager (ENDMQM) command with ENDCCTJOB(*YES) to end
connected jobs, listeners, and so on.
Splitting queue managers may also help to alleviate this problem by distributing the
channel jobs between the queue managers (Section 3.7).
To recover from these problems, the queue manager is often restarted using a
procedure referred to as the “12-step” or “cold start” procedure. This procedure
involves deleting the existing AMQAJRN journal and associated journal receivers for
the queue manager, and creating new empty journals and journal receivers.
Normal queue manager start-up processing involves reconciling the data in the
AMQAJRN journal with the data in the IFS queue files. When a queue manager is
restarted using the cold start procedure, there is no journal data to replay, so
WebSphere MQ cannot perform its normal start-up processing.
1. With a forced disk write to the AMQAJRN Journal, meaning that WebSphere
MQ waits for OS/400 or i5/OS to confirm that the disk has been physically
updated.
2. With a lazy write to the queue, WebSphere MQ caches some data in memory,
and writes some data to the IFS queue file with an OS/400 unforced write.
The unforced write will be stored in operating system buffers and written to
disk at the operating system's convenience.
The journal data on disk is therefore the master copy of WebSphere MQ data. In
normal circumstances when WebSphere MQ is shut down and restarted, STRMQM
processing ensures that the data in the queue files is brought up to date with the data
in the journals.
Because the data in the journal receivers is so important to the recovery of MQ data,
we strongly recommend that the journal receivers are stored on RAID-protected disk
and that a cold start is avoided wherever possible.
3.6 Backups
Two methods to consider when planning a backup and recovery strategy for
WebSphere MQ are the data backup and object definition backup. These methods are
complementary, and most enterprises successfully implement a combination of these
two techniques.
3.6.1 Data
It is necessary to quiesce a queue manager before fully backing up its IFS and journal
data. You can, however, take a backup of just the journal data while the queue manager
is running. If the backed-up journal data is restored, it is possible to fully recover
a queue manager and its IFS data.
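A hedged sketch of a full offline backup follows. It assumes a queue manager named TESTQM whose library is QMTESTQM, whose IFS data is under the default /QIBM/UserData/mqm/qmgrs path, and whose data is saved to pre-created save files in library QGPL; all of these names are illustrative.

   ENDMQM MQMNAME(TESTQM) OPTION(*IMMED) ENDCCTJOB(*YES)   /* quiesce before a full backup    */
   SAVLIB LIB(QMTESTQM) DEV(*SAVF) SAVF(QGPL/MQJRNSAV)     /* journal and journal receivers   */
   SAV DEV('/QSYS.LIB/QGPL.LIB/MQIFSSAV.FILE') +
       OBJ(('/QIBM/UserData/mqm/qmgrs/TESTQM'))            /* queue files and other IFS data  */
   STRMQM MQMNAME(TESTQM)                                   /* restart when the save completes */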
SupportPac MS03 saves the queue manager definitions as MQSC commands that
can be replayed to recreate the objects. The SupportPac can be downloaded from:
http://www.ibm.com/software/integration/support/supportpacs/individual/ms03.html
MS03 does not save or restore the queue manager’s authority records, but a utility
program (AMQOAMD) that ships with WebSphere MQ can be used for this purpose.
Calling AMQOAMD with the “-s” flag dumps the authority records in the form of
GRTMQMAUT commands that can be replayed to recreate them. To dump the
authority commands to a file, the following approach can be used:
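For example (a sketch only; the queue manager name TESTQM, the "-m" flag naming the queue manager, and the output path are assumptions), AMQOAMD can be called from Qshell so that its output is redirected to a stream file:

   STRQSH CMD('/QSYS.LIB/QMQM.LIB/AMQOAMD.PGM -m TESTQM -s > /tmp/testqm.authorities')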
MS03 and AMQOAMD provide a quick and lightweight way of backing up and
restoring a queue manager’s definitions from one machine to another, or even
between queue managers on a single machine.
The alternative to using MS03 is using one of the third-party system management
tools that hold queue manager configurations in a central repository.
Sharing a single queue manager between applications in this way can lead to some
problems that can be alleviated by allocating individual queue managers to
applications. Changing a WebSphere MQ topology from a shared queue manager to
multiple queue managers can involve changes to applications, and so is not a trivial
undertaking. The following advantages and disadvantages should be reviewed when
considering how your applications are distributed over queue managers.
A controlled cold start is one where all queues are emptied of any persistent messages
and the queue manager is cleanly shut down. The queue manager journal data can
then be deleted per the cold start procedure. This eliminates the risk of losing
messages, but it still deletes all media recovery information. You will not be able to
recover damaged objects without media recovery information, so you should ensure
that you have backed up your object definitions prior to attempting this (see section
3.6.2). Your IBM service representative will be able to provide details of the cold start
procedure should it be required.
You can reduce the number of times you switch journals by using a small number of
large journal receivers as opposed to a large number of small journal receivers. The
optimum size for journal receivers depends on workload and the amount of persistent
data passing through the queue manager.
You can avoid journal receiver switches during busy periods by making the journal
receivers large enough to contain a full day’s data. At close of day, journal receivers
should be switched manually with CHGJRN *GEN to ensure a new receiver is used
the next day.
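For example, assuming the queue manager library is QMTESTQM:

   CHGJRN JRN(QMTESTQM/AMQAJRN) JRNRCV(*GEN)   /* detach the current receiver and attach a new one */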
You can define the size of the journal receivers for a queue manager in WebSphere
MQ 6.0 via a parameter on the CRTMQM command. On previous releases you can
change the size of a queue manager journal receiver by creating a new receiver with
the desired size, and attaching it to the journal. All subsequent receivers will be
created with the new size. Use the following commands to do this.
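A sketch of this, again assuming library QMTESTQM, a new receiver named AMQA000100, and a threshold of roughly 1 GB (the THRESHOLD value is specified in kilobytes):

   CRTJRNRCV JRNRCV(QMTESTQM/AMQA000100) THRESHOLD(1000000)   /* create a receiver of the desired size           */
   CHGJRN JRN(QMTESTQM/AMQAJRN) JRNRCV(QMTESTQM/AMQA000100)   /* attach it; subsequent receivers use this size   */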
The queue manager restart time after an abnormal shutdown is heavily dependent on
the amount of work needed to replay journaled operations and to recover and resolve
transactions that were in-flight when the queue manager ended.
If queue data files are found to be corrupt while transactions are being replayed, the
queue files must also be recovered from a media image. Each queue is recovered from
its last recorded media image, and all subsequent operations (puts and gets) are
replayed from the journals on top of that image.
• Where possible, write your applications so that units of work are short-lived.
Channel process pooling is the default behaviour for queue managers created in
WebSphere MQ V5.3 or later releases. You can turn on the channel process pooling
feature for queue managers that were originally migrated from previous releases by
adding the “ThreadedListener=YES” value to the Channels stanza in the qm.ini file.
For example:
Channels:
   ThreadedListener=YES
Important: If you use channel process pooling, you must ensure that all channel exits
are thread-safe, as you would with any threaded MCAs.
OS/400 V5R2 introduced the ability to place libraries onto disks that can be switched
between two machines (Independent Auxiliary Storage Pools or IASPs). This makes
it possible to develop an HA clustering solution for WebSphere MQ.
IBM has provided a SupportPac that shows how to configure OS/400 HA clusters
with WebSphere MQ.
Provided the mirror is precisely synchronized with the original data, it has exactly
the same availability characteristics as an HA cluster. If, however, the mirroring is
asynchronous, there is a possibility that journal entries written by the queue manager
might not have been copied before the failure, and therefore that the rebuilt queue
manager image may have missed updates. This could result in lost or duplicated
messages.
There are several vendor products that work using a combination of mirroring disk
files and extracting data out of WebSphere MQ journals. Any true mirror is likely to
introduce a performance impact, as any forced update (flush) to the disk is going to
have to be written to the mirrored system before the queue manager can continue.
This performance hit may or may not be acceptable, depending on customer
requirements; we would recommend running a production-level workload to ensure
performance is adequate and data replication is complete.
To create a backup queue manager, the primary queue manager's IFS data and library
are saved and restored to a backup system. The backup queue manager is then
designated as such by starting it with the STRMQM REPLAY(*YES) command.
Periodically, new journal receivers are copied from the primary queue manager to the
backup and replayed using the STRMQM REPLAY(*YES) command. At this stage the
backup cannot yet be started normally, but if the primary queue manager fails, the
backup queue manager can be activated using STRMQM ACTIVATE(*YES).
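A hedged outline of the sequence on the backup system, assuming a queue manager named TESTQM whose library and IFS data have already been restored there; the final normal start after activation follows the behaviour of backup queue managers on other platforms and is an assumption here:

   STRMQM MQMNAME(TESTQM) REPLAY(*YES)    /* designate as a backup and replay the restored journal data */
   /* ...copy new journal receivers from the primary at intervals and replay them the same way...       */
   STRMQM MQMNAME(TESTQM) ACTIVATE(*YES)  /* after a primary failure, activate the backup               */
   STRMQM MQMNAME(TESTQM)                 /* then start it as a normal queue manager                    */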
Using the backup queue manager feature does not impact the performance of the
primary queue manager, as remote mirroring can, and in most scenarios it is faster to
start a backup queue manager than a mirrored one. This solution is not appropriate if
the loss of some messages is unacceptable, because the backup queue manager may
not be completely up to date with the configuration and message data of the primary
queue manager.
The default WLM algorithm routes each inbound message to one of the available
queue managers that contains the target named queue. If a queue manager is not
currently running, then the messages are sent elsewhere; if no queue manager is
running that hosts the named queue, the messages remain on the originating system
until a server queue manager is restarted. The WLM algorithm can be tailored with a
cluster workload exit program on any release of WebSphere MQ that supports
clustering, or via various channel and queue parameters, such as rank, priority, and
weighting, which were introduced in WebSphere MQ 6.0.
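For illustration only, a rough sketch of setting these parameters with CL commands follows. The queue, channel, and queue manager names are hypothetical, and the CL keywords and channel-type value are assumed to mirror the corresponding MQSC attributes (CLWLPRTY, CLWLWGHT).

   CHGMQMQ QNAME('APP.REQUEST') MQMNAME(TESTQM) CLWLPRTY(5)   /* prefer this instance of the clustered queue   */
   CHGMQMCHL CHLNAME('TO.TESTQM') CHLTYPE(*CLUSRCVR) +
             MQMNAME(TESTQM) CLWLWGHT(75)                     /* weight more traffic toward this queue manager */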
This approach allows new work to be injected into the cluster even when some of the
server machines are not running. However, messages that have already been sent to a
queue manager that subsequently fails cannot be processed until that queue manager
restarts; these messages are often called ‘marooned’ messages. A WebSphere MQ
cluster is especially effective if there are no affinities between individual messages, so
that they can be processed in any sequence. Affinities can be maintained if the
application is written to use options in the MQI, but this has to be a conscious decision
based on the business requirements.
The Client Channel table is created by the iSeries server queue manager and contains
a list of Client Connection (*CLTCN) channels that have been created on the server.
The table allows client applications to select the queue manager they connect to at run
time via a list. If the client uses a wildcard in the queue manager name on the
MQCONN, the client code will pick the first available queue manager from the
pre-configured Client Channel table.
If the connection to the queue manager subsequently fails for any reason, the client
can be programmed to detect the failure and attempt the connection again. The client
code will pick the next available queue manager from the Client Channel table. In
this way, the client application can recover from server failures.
There are no fixed rules about “business-driven” partitioning; however, you need to
consider the availability (including bandwidth) of the networks between processing
centers and the amount of inter-server messaging. Using WebSphere MQ clustering,
perhaps with a custom-written workload exit or with the new features introduced in
WebSphere MQ 6.0, would be a good way to direct messages to the nearest available
server.
6 Conclusion
This article discussed some of the best practices that will help you to get the most out
of WebSphere MQ on iSeries. These practices will help you keep your system up to
date, safely backed up, available, and performing well.
IBM and WebSphere are trademarks or registered trademarks of IBM Corporation in the
United States, other countries, or both.
Other company, product, and service names may be trademarks or service marks of others.
IBM copyright and trademark information