Professional Documents
Culture Documents
MQ High Availability
MQ High Availability
• IBM MQ provides many capabilities that will keep your data safe and your
business running in the event of failures whether you're running on your
own systems in your data centres, on-premise IaaS, IaaS in a public cloud,
or a hybrid cloud across all of these. This session introduces you to the
solutions available and how they can be effectively used together to build
extremely reliable environments providing HA and DR for messaging on-
premise in in the hybrid cloud.
Introduction
• You can have the best technology in the world, but you have to manage it
correctly
• It is not about High Availability although often the two are related and share
design and implementation choices
‒ “HA is having 2, DR is having them a long way apart”
‒ More seriously, HA is about keeping things running, while DR is about recovering
when HA has failed.
• Designs for HA typically involve a single site for each component of the
overall architecture
• Designs for DR typically involve separate sites
• HA clusters
• Client reconnection
Queue Manager Clusters
• Sharing cluster queues on
multiple queue managers
prevents a queue from being a
SPOF
• Benefits:
Shared
‒ Messages remain available even if a queues
queue manager fails
‒ Pull workload balancing
‒ Apps can connect to the group Queue
manager
Private
queues
Introduction to Failover and MQ
• Instances are the SAME queue manager – only one set of data files
‒ Queue manager data is held in networked storage
Multi-instance Queue Managers
1. Normal MQ MQ
execution Client Client
network
168.0.0.1 168.0.0.2
Machine A Machine B
QM1 QM1
Active can fail-over Standby
instance instance
QM1
networked storage
Owns the queue manager data
Multi-instance Queue Managers
2. Disaster MQ MQ
strikes Client Client
Client network
connections
broken
IPA 168.0.0.2
Machine A Machine B
QM1 QM1
Active locks freed Standby
instance instance
QM1
networked storage
Multi-instance Queue Managers
3. FAILOVER MQ MQ
Client Client
Standby
Client
becomes connection
active network still broken
168.0.0.2
Machine B
QM1
Active
instance
QM1
networked storage
Owns the queue manager data
Multi-instance Queue Managers
4. Recovery MQ MQ
complete Client Client
Client
network connections
reconnect
168.0.0.2
Machine B
QM1
Active
instance
QM1
networked storage
Owns the queue manager data
Dealing with multiple IP addresses
• HA clusters can:
‒ Coordinate multiple resources such as application server, database
‒ Consist of more than two machines
‒ Failover more than once without operator intervention
‒ Takeover IP address as part of failover
‒ Likely to be more resilient in cases of MQ and OS defects
HA clusters
• In HA clusters, queue manager data and logs are placed on a shared disk
‒ Disk is switched between machines during failover
Normal MQ MQ
execution Client Client
network HA cluster
168.0.0.1 168.0.0.2
Machine A Machine B
QM1 QM2
Active Active
QM1 instance
instance
data
and logs
shared disk
QM2
data
and logs
MQ in an HA cluster – Active/active
FAILOVER MQ MQ
Client Client
IP address
takeover
network HA cluster
168.0.0.1 168.0.0.2
QM1
Active
Machine A Machine
instance B
QM2
Active Queue
QM1 instance manager
data
and logs
restarted
shared disk
Shared disk
switched QM2
data
and logs
Multi-instance QM or HA cluster?
• HA cluster
Capable of handling a wider range of failures
Failover historically rather slow, but some HA clusters are improving
Capable of more flexible configurations (eg N+1)
Extra product purchase and skills required
• Storage distinction
• Multi-instance queue manager typically uses NAS
• HA clustered queue manager typically uses SAN
High Availability for the MQ Appliance
Primary Secondary
• Easier config than other HA solutions (no shared file system/shared disks)
• Supports manual failover, e.g. for rolling upgrades
• Replication is synchronous over Ethernet, for 100% fidelity
‒ Routable but not intended for long distances
Setting up HA for the Appliance
• Use the pattern editor to create a virtual machine containing MQ and GPFS
How it works
31
Pattern Builder
32
Pattern Builder – Configuration model
33
Pattern Builder – Active QM
34
Pattern Builder – Standby QM
35
Comparison of Technologies
Shared Queues,
HP NonStop Server continuous continuous
MQ
Clusters none continuous
automatic continuous
HA Clustering,
Multi-instance automatic automatic
No special
support none none
APPLICATIONS AND
AUTO-RECONNECTION
HA applications – MQ connectivity
‒ End abnormally
• Requires:
‒ Threaded client
‒ 7.0.1 server – including z/OS
‒ Full-duplex client communications (SHARECNV >= 1)
Client Configurations for Availability
• http://www.ibm.com/developerworks/websphere/library/techarticles/1303_b
roadhurst/1303_broadhurst.html
Disaster Recovery
What makes a Queue Manager on Dist?
/var/mqm/log/QMGR
Registry Registry
/var/mqm/qmgrs/QMGR
Obj Definiitions
Security
Cluster
QFiles etc
System QMgr
Backups
DB2 CF
What makes up a Queue Manager?
• Pagesets
‒ Updates made “lazily” and brought “up to date” from logs during restart
‒ Start up with an old pageset (restored backup) is not really any different from start up after
queue manager failure
‒ Backup needs to copy page 0 of pageset first (don’t do volume backup!)
• Sometimes 2 data centres are in daily use; back each other up for disasters
‒ Normal workload distributed to the 2 sites
‒ These sites are probably geographically distant
• Disk replication cannot be used between the active and standby instances
of a multi-instance queue manager
‒ Could be used to replicate to a DR site in addition though
DR for the MQ Appliance
Primary Secondary
QM1 QM1
Replication
shared storage
Combining HA and DR – “Active/Active”
Site 1 Site 2
HA Pair
Replication
QM1
HA Pair
QM2 QM2
Replication
Integration with other products
• Only way for guaranteed consistency is disk replication where all logs are
in same group
‒ Otherwise transactional state might be out of sync
Planning and Testing
Planning for Recovery
• Write a DR plan
‒ Document everything – to tedious levels of detail
‒ Include actual commands, not just a description of the operation
Not “Stop MQ”, but “as mqm, run /usr/local/bin/stopmq.sh US.PROD.01”
• Remember that the person executing the plan in a real emergency might
be under-skilled and over-pressured
‒ Plan for no access to phones, email, online docs …
• If Hursley machine room was taken out by a plane missing its landing at
Southampton airport
‒ Could we carry on developing the MQ product
‒ Source code libraries, build machines, test machines …
‒ Could fixes be produced
• LOCLADDR configuration
‒ Not normally used by MQ, but firewalls, IPT and channel exits may inspect it
‒ May need modification if a machine changes address
• Didn’t have monitoring in place that picked this up until users complained
about lack of response
Other Resources
• And afterwards:
‒ Update your plan for the next time
Summary
Copyright © 2016 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission
from IBM.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of
initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS
DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE
USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY.
IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as illustrations of how those customers
have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.
References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in
which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials
and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or
their specific situation.
It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and
interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such
laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law
66
Notices and Disclaimers Con’t.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not
tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products.
Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the
ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT
NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The provision of the information contained h erein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual
property right.
IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®,
FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG,
Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®,
PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®,
StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business
Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM
trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.
67
Thank You
Your Feedback is Important!