
A Positioning Paper

July 2004

Coupling Facility
Configuration Options

David Raften
Raften@us.ibm.com

GF22-5042

Acknowledgments
The authors would like to acknowledge the following subject experts, who reviewed and
commented on this document:

Steve Anania – JES2
Rich Guski – RACF
Dave Anderson – Product Planning
Dick Hannan – IMS
Judy Ruby Brown – DB2
Geoff Miller – PET Test
Angelo Corridori – Planning
Jeff Seidell – Marketing
Ken Davies – CICS
Jimmy Strickland – VSAM (RLS)
Scott Fagen – GRS
Don Sutherland – Product Planning
Rob Geiner – Logger
Nora Sweet – Marketing
Peter Yocom – WLM
Joan Kelley – Performance
Additionally, we would like to give special thanks to the following people who reviewed the
entire document and gave significant comments which added to the value of this work:
Dave Surman – XCF/XES Development
Audrey Helffrich – Engineering System Test
Jeff Josten – DB2 Development
Mike Cox – S/390 Large System Support (WSC)

Poughkeepsie Lab Authors:


Jeff Nick – IBM Distinguished Engineer, Chief Architect for Server Group
Gary King – Senior Technical Staff Member, S/390 Parallel Sysplex Performance
Chitta Rao – Advisory Development Engineer, S/390 Parallel Sysplex Performance
Madeline Nick – Senior Software Engineer, S/390 Parallel Sysplex, PDT Technical Support
David Raften – Advisory Software Engineer, S/390 Parallel Sysplex, PDT Technical Support

Marketing/Sales Authors:
Jim Fyffe – Senior Marketing Specialist, Area 5 S/390 Parallel Sysplex Support


Table of Contents
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Section 1. Coupling Facility Options — Preface . . . . . . . . . . . . . . . . . . . . . Page 5
Coupling Facility Options Available Today . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 6
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 6
Standalone Coupling Facilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 8
Coupling Facility LPAR on a Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 10
Integrated CF Migration Facility (ICMF) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 11
Dynamic Coupling Facility Dispatch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 11
Internal Coupling Facility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 12
Dynamic ICF Expansion Across Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 14
Dynamic ICF Expansion Across ICFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 14
Link Technologies – HiPerLinks/Integrated Cluster Bus and Internal Channel . . . . . . . . . . . Page 15
ISC-3 and ICB-3 Peer Mode Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 16
Dense Wavelength Division Multiplexer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 16
Performance Implications of Each CF Option . . . . . . . . . . . . . . . . . . . . . . . . . . Page 17
Performance of the Standalone CF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 18
Performance of CF on a Server. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 18
Dynamic CF Dispatch Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 19
Performance of Internal Coupling Facility (ICF) . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 20
Performance of Dynamic ICF Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 20
Performance of HiPerLinks, Integrated Cluster Bus (ICB), and Internal Coupling Channel (IC) . . Page 21
z990 CFCC Compatibility Patch and z/OS Compatibility Feature . . . . . . . . . . . . . . . . . . Page 23
CFRM and CFRM Policy Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 23
Performance of the Coupling Facilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 24
Number of CF Engines for Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 24
Comparison of Coupling Efficiencies Across Coupling Technologies . . . . . . . . . . . . . . . Page 25
Cost/Value Implications of Each CF Option . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 28
CF Alternatives, a Customer Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 28
Standalone Coupling Facilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 28
Coupling Facility Processor LPAR on a Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 29
Coupling Facility LPAR With Dynamic CF Dispatch . . . . . . . . . . . . . . . . . . . . . . . . . Page 29
Internal Coupling Facility (ICF) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 30
Dynamic ICF Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 31
Combining an ICF with Standalone CF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 31
System Managed CF Structure Duplexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 33


Section 2. Target Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 34


Resource Sharing Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 34
Resource Sharing: CF Structure Placement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 37
Data Sharing Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 38
Application Enabled Data Sharing: CF Structure Placement . . . . . . . . . . . . . . . . . . . . . Page 39
ICF Use as a Standalone Coupling Facility . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 40
Lock Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 40
Cache Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 40
Transaction Log Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 41
Other CF Structures Requiring Failure Independence . . . . . . . . . . . . . . . . . . . . . . Page 42
Other CF Structures NOT Requiring Failure Independence . . . . . . . . . . . . . . . . . . . . Page 42
Failure Independence with System Managed CF Structure Duplexing . . . . . . . . . . . . . . . Page 43
Section 3: Other Configuration Considerations . . . . . . . . . . . . . . . . . . . . Page 44
CFCC enhanced patch apply . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 44
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 45
Appendix A. Parallel Sysplex Exploiters - Recovery Matrix . . . . . . . . . . . . Page 47
Appendix B. Resource Sharing Exploiters - Function and Value . . . . . . . . . Page 48
Systems Enabled Exploiters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 50
XCF, High Speed Signaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 50
System Logger, OPERLOG and LOGREC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 51
Allocation, Automatic Tape Switching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 51
z/OS GRS Star . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 51
z/OS Security Server (RACF), High-speed Access to Security Profiles . . . . . . . . . . . . . . . Page 52
JES2, Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 52
SmartBatch, Cross System BatchPipes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 52
VTAM GR (non LU6.2) for TSO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 53
DFSMShsm Common Recall Queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 53
Enhanced Catalog Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 53
WebSphere MQ (MQSeries) Shared Message Queues . . . . . . . . . . . . . . . . . . . . . . . . Page 53
Workload Manager (WLM) Support for Multisystem Enclave Management . . . . . . . . . . . . . Page 54
Workload Manager support for IRD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 54
Recommended Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 55
Appendix C. Data Sharing Exploiters - Usage and Value . . . . . . . . . . . . . . Page 57
DB2 V4 or V5, Shared Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 57
DB2 V6 – Duplexing of Cache Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 57
VTAM, Generic Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 58
VTAM, Multi-Node Persistent Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 58


CICS TS R2.2, CICS Log Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 59
VSAM RLS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 59
IMS V5, Shared Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 60
IMS V6, Shared Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 60
IMS/ESA Version 6 (IMS V6) SMQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 60
IMS V6, Shared VSO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 61
VSO DEDB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 61
RRS, z/OS SynchPoint Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 62
Recommended Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 63
Appendix D. Server Packaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 64
IBM eServer zSeries Packaging — CP – SAP – ICF – Options . . . . . . . . . . . . . . Page 64
IBM S/390 G6 Server Packaging — CP – SAP – ICF – Options . . . . . . . . . . . . . . . Page 65
Appendix E. CF Alternatives Questions and Answers . . . . . . . . . . . . . . . . Page 66


Section 1. Coupling Facility Options — Preface


IBM's Parallel Sysplex® clustering technology (Parallel Sysplex) continues to evolve to
provide increased configuration flexibility, improved performance and value to our
customers.

With this increased flexibility in configuring the many Coupling Facility (CF) options, IBM
has recognized the need to provide guidance in the selection of the Coupling Facility option
that best meets a given customer’s needs.

In this paper, we will examine the various Coupling Facility technology alternatives from
several perspectives. We will compare the characteristics of each CF option in terms of
function, inherent availability, performance and value. We will also look at CF structure
placement requirements based on an understanding of CF exploitation by z/OS® and
OS/390® components and subsystems. We will not go into detail on the different availability
recovery scenarios as this is covered in WSC FLASH "Parallel Sysplex Configuration
Planning for Availability" (#9829).

For a copy of the z/OS system test report, please go to the following Web site:
ibm.com/servers/eserver/zseries/zos/integtst/library.html .

More information about the IBM eServer zSeries can be found at:

ibm.com/servers/eserver/zseries/ .

The customer's implementation of a specific CF configuration will be based on the
installation's evaluation of the following aspects of Coupling Facilities/links:
• Performance
• Business Value
• Cost
• Availability
The objective of this white paper is to provide the customer with the information needed to
make a knowledgeable decision. We provide our recommendations for structure placement in
relation to the aspects listed above, but the customer will make the final decision that best
suits his or her requirements and expectations.

This paper is structured into two sections with five appendices. It has been updated to
include the exploitation and support delivered through z/OS R1.4 and the IBM eServer zSeries
990 (z990), with the respective enhanced link technologies (Peer Mode, ICB-3, ICB-4, ISC-3, IC-3).

Section 1: Describes each of the Coupling Facility options available, along with their
performance and cost/value implications.


Section 2: Looks at the target environments for these alternatives and shows the structure
placement for each exploiter of the Coupling Facility.

Summary: High level positioning statement for the different coupling alternatives.

Coupling Facility Options Available Today


Introduction
Perhaps the best way to understand the key differences between the various physical
Coupling Facility (CF) offerings is to start with a basic understanding of what all of the CF
alternatives have in common. For example, much confusion has arisen from the use of loose
terminology such as "CF LPAR" to distinguish a CF partition on a server (such as a 9672 or
zSeries) from a standalone CF model (such as a 2064-100), when in fact all IBM CF microcode
runs within a PR/SM™ Logical Partition (LPAR). This is true regardless of whether the CF
processors are dedicated or shared. All CFs within the same family of servers can run the same
CFCC microcode load, and the CFCC code always runs within a PR/SM LPAR. Thus, when certain
CF configurations are positioned as fit for data sharing production (like a standalone model)
while other CF configurations are characterized as appropriate for test/migration (e.g., an ICF
or a logical partition on a server without System Managed CF Structure Duplexing), one must be
careful not to erroneously assume such positioning is based on fundamental functional
differences.

Further, given the above, it logically follows that the inherent availability characteristics of
the various CF offerings are basically the same. Yet how can this be reconciled with IBM's
recommendation to use one or more standalone CFs for maximum configuration availability?
The answer is that configuring the Parallel Sysplex environment for high availability is most
keenly tied to isolating the CF from the processors on which software exploiters are executing
(also isolating CFs from one another) to remove single points of failure. Configuring a
Parallel Sysplex environment for high availability considering the various CF options will be
discussed in detail later in this paper.


CFCC Levels
The following is a table of some of the CFCC functions available as of May 2003. Not all of
these functions are available on all CF models. For more details on each function, please
refer to the white paper titled "Parallel Sysplex Cluster Technology: IBM's Advantage,"
GF22-5015.

CF Level  Function                                            G3/G4  G5/G6  z800  z900  z890/z990
13        DB2 Castout Performance                               -      -     X     X       X
12        z990 Compatibility                                    -      -     X     X       X
          64-bit CFCC Addressability                            -      -     X     X       X
          Message Time Ordering                                 -      -     -     X       X
          DB2 Performance                                       -      -     X     X       X
          SM Duplexing support for zSeries                      -      -     X     X       X
11        z990 Compatibility                                    -      X     -     -       -
          SM Duplexing Support for 9672 G5/G6/R06               -      X     -     -       -
10        z900 GA2 Level                                        -      -     -     X       -
9         Intelligent Resource Director                         -      -     X     X       X
          IC3 / ICB3 / ISC3 Peer Mode                           -      -     X     X       X
          MQSeries Shared Queues                                -      X     X     X       X
          WLM Multi-System Enclaves                             -      X     X     X       X
8         Dynamic ICF Expansion into shared ICF pool            -      X     X     X       X
          Systems-Managed Rebuild                               X      X     X     X       X
7         Shared ICF partitions on server models                -      X     X     X       X
          DB2 Delete Name optimization                          X      X     X     X       X
6         ICB & IC                                              -      X     X     X       X
          TPF support                                           X      X     X     X       X
5         DB2 cache structure duplexing                         X      X     X     X       X
          DB2 castout performance improvement                   X      X     X     X       X
          Dynamic ICF expansion into shared CP pool             X      X     X     X       X
4         Performance optimization for IMS & VSAM RLS           X      X     X     X       X
          Dynamic CF Dispatching                                X      X     X     X       X
          Internal Coupling Facility                            X      X     X     X       X
          IMS shared message queue extensions                   X      X     X     X       X
3         IMS shared message queue base                         X      X     X     X       X
2         DB2 performance                                       X      X     X     X       X
          VSAM RLS                                              X      X     X     X       X
          255 Connectors / 1023 structures for IMS Batch DL1    X      X     X     X       X
1         Dynamic Alter support                                 X      X     X     X       X
          CICS temporary storage queues                         X      X     X     X       X
          System logger                                         X      X     X     X       X

(X = function available on that machine family)

In summary, it can be seen that the various CF offerings have a great deal in common. So
how do they differ? At a high level, the differences are primarily associated with qualities of
service, such as price/performance, maximum number of systems supported (based on link
connectivity), CF capacity (number of CF processors), and the like. With the above
discussion as background, each of the CF options will now be discussed, focusing on their
distinguishing properties.


Standalone Coupling Facilities (9672 R06, 2064-100, 2066-0CF, 2084-ICF Only)


The standalone CF provides the most "robust" CF capability, as the CPC is wholly dedicated
to running the CFCC microcode — all of the processors, links and memory are for CF use
only. The various standalone CF models provide for maximum Parallel Sysplex connectivity
(typically up to 32 external coupling links are standard, with a maximum of 64 via RPQ on
z900 Model 100s). The maximum number of configurable processors and memory is tied to
the physical limitations of the associated machine family (i.e. 9672 R06 10-way, or z990 with
32 CPs).

This environment, which precludes running any operating system software, allows the
standalone CF to be priced aggressively in comparison to servers. A natural fallout of this
characteristic is that the standalone CF is always failure-isolated from exploiting z/OS
software and the CEC that z/OS is running on for environments without System-Managed CF
Structure Duplexing. Another advantage of this kind of environment is that when
"disruptive" CF upgrade or maintenance must be installed, the upgrade can be done without
having to take down operating system images because there aren’t any on that machine. In
these upgrades, the isolation of the CF images from the operating system is an advantage.
With a backup CF, a standalone CF is already configured for high availability.

9672 R06
The 9672 R06 standalone Coupling Facility supports up to 10 ICF G5 CPs. The R06 can be
up to double the capacity of the previous 9674 Model C05. The R06 is directly upgradeable
to G5 9672 server models. It also supports the Integrated Cluster Bus as well as up to 32
HiPerlinks. There are enhanced availability benefits that Internal Coupling Facilities (ICFs),
starting on the G5, derive from the Transparent CP sparing feature. If an ICF processor fails,
even running dedicated, a spare CP can be dynamically switched in to substitute for the failed
processor, transparent to the CFCC microcode running on that processor.

2064 Model 100


The 2064 Model 100 standalone Coupling Facility improves on the 9672 R06 in terms of
uniprocessor and overall capacity, as it supports up to 9 ICF 2064 CPs running up to 15
logical partitions (LPs). The Model 100 engine speed is approximately double that of the
9672 R06. The Model 100 is directly upgradeable to 2064 Server models and can be
upgraded from a 9672 R06. It also supports a mix of the new Peer Mode coupling links
(ICB-3 and ISC-3) as well as Compatibility Mode links, with a total of 64 external links.


2066 Model 0CF


The z800 0CF is the standalone CF model for the 2066 family. Scalable up to 4 ICF engines
with 32 GB of central storage, this standalone CF can support up to 32 IC links to other CFs
in the CEC (for a test environment), 6 ICB-3 Peer Mode links to connect to other zSeries
processors, or 24 ISC-3s, some of which can optionally be configured to run in compatibility
mode for connectivity to 9672s.

2084 Model 300


The 2084 family of servers includes the following four models:

1. z990-A08: Supports up to 8 engines
2. z990-B16: Supports up to 16 engines
3. z990-C24: Supports up to 24 engines
4. z990-D32: Supports up to 32 engines

Each engine can be ordered as a general purpose CP (to run z/OS, z/VM®, etc.), an Integrated
Facility for Linux (IFL), or an Internal Coupling Facility (ICF). A z990 standalone Coupling
Facility is configured by ordering only ICF engine types. The functions and features available
on such an ICF-only configuration are the same as for any other model z990 configuration.
This includes support for the ICB-4 coupling links.

Message Time Ordering


As processor and Coupling Facility link technologies have improved over the years, the
synchronization tolerance between systems in a Parallel Sysplex cluster has become more
rigorous. To help ensure that any exchanges of time-stamped information between systems in a
sysplex involving the Coupling Facility observe the correct time ordering, time stamps are
now included in the message-transfer protocol between the systems and the Coupling Facility.
Therefore, when a Coupling Facility is configured as an ICF on any model 2C1 through 216
z900, or any z990 and later, the Coupling Facility will require connectivity to the same 9037
Sysplex Timer® that the systems in its Parallel Sysplex cluster are using for time
synchronization. The 2064 2xx models do not include the option of a CF-only model.
However, it is possible to configure CF logical partitions using ICFs and/or general purpose
CPs on this model.

If the ICF is on the same server as a member of its Parallel Sysplex cluster, no additional
Sysplex Timer connectivity is required, since the server already has connectivity to the
Sysplex Timer. However, when an ICF is configured on a z900 model 2C1 through 216,
which does not host any systems in the same Parallel Sysplex cluster, it is necessary to attach
the server to the 9037 Sysplex Timer.
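
The ordering guarantee can be pictured with a small conceptual sketch: a request carries the
sender's time stamp, and the receiver does not act on it until its own Sysplex Timer
synchronized clock has passed that stamp. This is an illustration of the idea only, not the
actual CF link protocol, and the request name used is hypothetical.

    # Conceptual illustration only, not the actual CF message-transfer protocol.
    import time

    def send(request):
        return (time.time(), request)            # stamp the request at the sender

    def process_in_time_order(stamped_request):
        timestamp, request = stamped_request
        while time.time() < timestamp:           # wait out the clocks' sync tolerance
            pass
        print(f"processing '{request}' at or after its time stamp")

    process_in_time_order(send("cache write request"))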


[Figure: Sysplex Timer connectivity for Message Time Ordering — a z900 Model 2xx or later whose
ICF serves z/OS images outside its own Parallel Sysplex cluster requires a new ETR connection
to the 9037 Sysplex Timer.]

Coupling Facility LPAR on a Server


In many regards, running the CFCC in a PR/SM LPAR on a server is the same as running it
on a standalone model. Distinctions are mostly in terms of price/performance, maximum
configuration characteristics, CF link options, and recovery characteristics. 9672s support up
to 32 coupling links as standard. 2064s and 2084s support up to 32 external links together
with up to 32 IC links. The links can be a mixture of sender and receiver channels. CF
processors can run dedicated or shared. The flexibility of the non-standalone models to run
z/OS and CFCC operating systems results in its higher relative price for the CF capability and
also subjects the customer to software license charges for the total MIPS configured on the
processor model, even though some of those MIPS are associated with execution of CFCC
code.

As stated earlier, the ability to run z/OS workloads side by side on the same physical Central
Processor Complex (CPC) with the CF in the same Parallel Sysplex cluster introduces the
potential for single points of failure for certain environments. These considerations will be
explored later.

Prior to the introduction of the Dynamic CF Dispatching algorithms, the CFCC dispatcher
ran in a tight polling loop when idle, as is true for the 9674. Implications of Dynamic CF
Dispatching will be discussed later.


Integrated CF Migration Facility (ICMF)


The ICMF was initially introduced as a test/migration aid for customers wishing to test their
Parallel Sysplex environment on a single machine prior to multisystem deployment. The
ICMF was basically a linkless CF. It runs the same CFCC code as other CF alternatives.
Without real links, commands to the CF partition are intercepted by the PR/SM hipervisor
and passed internally to the CF partition via simulated link buffer manipulation. The ICMF
did not support real coupling links and therefore was a single-machine solution. This is, by
definition, not failure isolated from an exploiting operating system partition. Further, OS/390
partitions that exploit the ICMF have to be defined to PR/SM, which precludes their
attachment to any CF via real coupling links as well.

The dispatching logic in CFCC is sensitive to ICMF execution and does not execute in a tight
polling loop while idle. Rather, the CFCC gives up control of the physical processor when there
are no commands to execute, enabling those processor cycles to be usefully applied to other
logical partitions. This benefit of reduced CFCC overhead is offset by reduced responsiveness
to CF commands issued by OS/390 partitions, since the CF partition
may need to be re-dispatched in order to execute the issued command. Since PR/SM
intercepts all issued CF commands, ICMF has additional performance costs. It is the above
"quality of service" considerations which have branded the ICMF as a test/migration aid only.
With the introduction of IC links, the ICMF option is no longer recommended and in fact is
not available on zSeries servers.

Dynamic Coupling Facility Dispatch


The Dynamic CF Dispatch function, introduced in April 1997, is actually not a different CF at
all, but rather provides an enhanced CFCC dispatcher algorithm for improved
price/performance. For CF partitions other than ICMF, the strict tight polling loop algorithm
is replaced with an intelligent algorithm which heuristically gives up control of the physical
processor voluntarily when there are no CF commands queued for execution at intervals
based on observed arrival rates. This behavior enables the processor to be fully used for
customer workloads until CF processor resource is needed, and even then CF CPU
consumption is limited to the resource level required. As traffic to the CF ramps up, the CF
CPU capacity is dynamically tuned in response to ensure maximum overall system
throughput.
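
The flavor of such a heuristic can be sketched in a few lines. This is not the actual CFCC
dispatcher; the back-off policy and interval values are invented purely to illustrate the
behavior described above (poll while requests arrive, yield the processor and back off while
idle).

    # Illustrative sketch of a dynamic-dispatch style loop: execute queued CF
    # commands while traffic arrives, and voluntarily give up the processor, with
    # a back-off interval driven by observed arrivals, when idle. NOT the actual
    # CFCC algorithm; all intervals and the doubling/halving policy are invented.
    import queue
    import time

    def dynamic_dispatch_loop(commands, iterations=100, min_sleep=0.0001, max_sleep=0.010):
        sleep_interval = max_sleep
        for _ in range(iterations):
            busy = False
            while not commands.empty():
                commands.get()()                 # run one queued CF command
                busy = True
            if busy:
                # Requests are arriving: poll more aggressively to stay responsive.
                sleep_interval = max(min_sleep, sleep_interval / 2)
            else:
                # Idle: yield the shared physical processor to z/OS (or other CF)
                # partitions and stretch the interval before the next poll.
                sleep_interval = min(max_sleep, sleep_interval * 2)
                time.sleep(sleep_interval)

    work = queue.Queue()
    work.put(lambda: print("lock request processed"))
    dynamic_dispatch_loop(work, iterations=5)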

Note that the Dynamic CF Dispatch algorithms apply to all non-dedicated CF processor
configurations, other than ICMF. This includes standalone models, where at customer
configuration option, the Dynamic CF Dispatch algorithm can be activated. Customers
running multiple CF partitions on a single 2064, for example, with very disparate request
rates or priority between the multiple CF partitions (like test and production CF LPAR) find

the new function of significant value. Also, customers using this function on their server
systems, where the engines are shared with z/OS partitions, may "give back" CPU resources
to the z/OS images.

With the Dynamic CF Dispatch mechanism, customers can configure a robust, highly available
environment with a single standalone Coupling Facility and a Dynamic CF Dispatch LPAR as the
"hot standby" second Coupling Facility. This is designed to offer high availability at reduced
cost, eliminating the need to dedicate an entire processor for Coupling Facility use just in
case of primary Coupling Facility failure.

When first delivered, support for the Dynamic CF Dispatch (along with previous shared CP
configurations) capability resulted in conversion of all CF commands from synchronous mode
to asynchronous execution mode, if the issuing OS/390 partition resided on the same
machine as a CF partition running the dynamic dispatch function, and that OS/390 partition
was running on shared processors. This was judiciously done to prevent possible processor
deadlocks. This unfortunately resulted in conversion of requests to asynchronous execution
mode even when the CF commands were targeted to a standalone CF model! This behavior
has resulted in degraded performance in customer configurations running a CF LPAR on a
9672 as a "hot standby" with the primary CF being a standalone model. In response to this
problem, IBM delivered, starting with G3 servers, an enhanced version of the Dynamic CF
Dispatch function which allows requests targeted to CF partitions running on dedicated
processors to be executed synchronously — thus, providing the benefits intended for the
Dynamic CF Dispatch algorithm without the initially observed performance penalty.

Internal Coupling Facility


The design and MultiChip Module (MCM) technology of the zSeries servers provide the
flexibility to configure the PUs for different uses. PUs can be used as Central Processors
(CPs), System Assist Processors (SAPs), zSeries Application Assist Processors (zAAPs),
Internal Coupling Facility processors (ICFs), and Integrated Facility for Linux processors
(IFLs). For the general purpose processor models, the number of CPs and SAPs (which provide
I/O processing) is specified by the model number. This capability provides tremendous
flexibility in establishing the best system for running applications. One PU is always reserved
as a spare.


Introduced on the IBM S/390® G3 servers, the Internal Coupling Facility (ICF) is a CF
partition which is configured to run on what would otherwise be spare processors, not part of
the production model. Special PR/SM microcode precludes the defined ICF processors from
executing non-CFCC code. Changes to the number of ICF processors can only be made
through Miscellaneous Equipment Specification (MES) upgrade. As a result of this behavior,
software license charges are not applicable to the ICF processor. The maximum number of
ICFs that can be defined is as follows:
Server Model        ICF Engines
9672 G3/G4          4
9672 G5             7
9672 G6             11
2066 z800           4
2064 z900 CF        9
2064 z900 Server    15
2086 z890           4
2084 z990 Server    32

The ICF is a highly attractive CF option, because, like the standalone models, it is not subject
to software license charges, while being attractively priced relative to the standalone CFs as a
hardware feature. This is achievable because the cost of the processor machine "nest"
(frame, power and cooling units, etc.) is amortized over the server system as a whole rather
than borne entirely by the CF component of the system, as is the case with a standalone
Coupling Facility.

The ICF can be used as the backup Coupling Facility when a standalone CF is used for the
primary, reducing the need for a second standalone Coupling Facility in a multisystem
Parallel Sysplex configuration. As will be seen later, the ICF can serve as an active
production Coupling Facility in many resource sharing environments (see Section 2: System
Enabled Environment) or with System-Managed CF Structure Duplexing even in a data
sharing environment.

Customers can use the ICF as a production Coupling Facility even in a single-CEC Parallel
Sysplex configuration, running two or more z/OS logical partitions sharing data via the ICF.
This configuration enables customers interested in getting into the Parallel Sysplex
environment to take advantage of its continuous operations protection from software outages.
Individual z/OS partitions can be taken down for maintenance or release upgrade, without
suffering an application outage, through the datasharing provided by the remaining LPARs in
the CEC. The ICF uses external Coupling Facility links (HiPerLinks, ISC-3, ICB, ICB-3, or
ICB-4), or internal IC links. By putting the Coupling Facility in an ICF, both the benefits of
reduced software license charges and reduced hardware costs can be realized.


Dynamic ICF Expansion Across Server


Announced in February 1998, Dynamic ICF Expansion across the server allows an ICF
logical partition to acquire additional processing power from the LPAR pool of shared CPs
being used to execute production and/or test work on the system: in essence, it is a combined
shared/dedicated CP LPAR for CFs. This feature is available to every ICF partition starting
with the G3 server.

With this capability, customers can configure their dedicated ICF capacity to match their
steady state CF processor resource requirements, while providing the flexibility for the ICF
partition to expand into the shared CP processor pool when a spike in CF resource demand
occurs. This might result either from the failure of a standalone CF for which the ICF is a
back-up, or for unusual spikes in workload demand for any reason.

When coupled with the z/OS Workload Manager functions and dynamic workload balancing,
the ICF with Dynamic Expansion provides a form of "self-tuning ecosystem," wherein new
z/OS workload arriving can be dynamically balanced across nodes of the Parallel Sysplex
cluster when unusual spikes in demand for CF resources cause the ICF to expand into the
shared z/OS processor pool.

With Dynamic ICF Expansion capability, the ICF combines the performance benefits of
dedicated CF processors, the software pricing benefits of a standalone CF, and the dynamic
CF processing capacity tuning of the CFCC Dynamic CF Dispatch algorithms.

Dynamic ICF Expansion Across ICFs


Dynamic ICF Expansion across ICFs allows an ICF logical partition to acquire additional
processing power from the LPAR pool of shared CPs being used by other ICFs. This feature
is available to every ICF partition starting with the G5 server. This feature protects the CPU
resources of z/OS logical partitions at the cost of less important ICFs, such as those that
support the test environment.

With this capability, customers can configure their dedicated ICF capacity to match their
steady state CF processor resource requirements, while providing the flexibility for the ICF
partition to expand into the shared CP processor pool when a spike in CF resource demand
occurs. This might result either from the failure of a standalone CF for which the ICF is a
back-up, or for unusual spikes in workload demand for any reason.


Link Technologies – HiPerLinks/Integrated Cluster Bus and Internal Channel


IBM offers an upgrade to the initial InterSystem Coupling (ISC) Coupling Facility links with
HiPerLinks, starting on 9672/9674 G3 servers. These links provide increased efficiency in
the processing of asynchronous requests to the Coupling Facility with increased capacity to
process requests. This can potentially reduce the number of links required in a Parallel
Sysplex cluster with better performance. With HiPerLinks, XCF message delivery times are at
least comparable to that achieved by using channel-to-channel (CTC) pathing. This enables
customers to derive reduced systems management costs through simplified Parallel Sysplex
configuration management, without a compromise in performance. Performance
considerations on XCF vs. CTC links can be found at
ibm.com/support/docview.wss?uid=tss1flash10011 .

The Integrated Cluster Bus is a low cost external coupling link alternative to HiPerLinks
(ISC links) for short distance Coupling Facility connections. It is a 10 meter copper cable (up
to seven meters between machines) using the STI (Self Timed Interface) I/O bus with an
effective data rate of 250 MB/second (maximum speed of 333 MB/second). Available starting
with the G5 servers, the ICB is the core of the Parallel Sysplex Integrated Cluster, offering
both improved CF interconnect bandwidth and dramatically reduced command latency
to/from the CF. With the ICB technology, IBM S/390 clustered systems are able to scale in
capacity without incurring additional overhead as the individual processors increase in engine
speed with each successive processor generation.

The G5 server also introduces support for the Internal Coupling channel (IC), which is a
microcoded "linkless" coupling channel between CF LPARs and z/OS LPARs on the same
CEC. It is designed to eliminate the overhead associated with LPAR-simulation of CF
coupling links previously supported via the Internal Coupling Migration Facility (ICMF),
enabling exceptional performance benefit. Additionally, the IC has significant value over
ICMF beyond the performance characteristics. LPARs using ICs to communicate internally
within a CEC can simultaneously use external Coupling Links to communicate with CFs and
z/OS systems external to the CEC. This flexibility eliminates the need to "wrap" a coupling
link to the same CEC to communicate internally if external communication from the same
partition is also required. Hence, the restrictions associated with ICMF are removed and
internal communication performance to the CFCC LPAR or ICF is greatly improved. The ICs
obsolete ICMF on G5 machines. Please refer to the Performance section of this paper for
details on the added value of these G5 functions.


ISC-3 and ICB-3 Peer Mode Links


Peer Mode supports coupling between zSeries servers and provides both sender and receiver
capability on the same link. This allows a single link to be connected to both a z/OS and a
CF, reducing the number of links needed. Peer links can provide up to seven expanded
buffer sets (compared to two buffers with G5/G6) and protocol improvements . With these
design improvements, the new Peer Mode ISC3 connections perform at 200 MBytes/second
compared to former ISC capability of 100 MBytes/second, and the new Peer Mode ICB3
connections perform at a theoretical peak of one GBytes/second, effective rate of 800 MB/Sec
compared to former ICB capability measured at 250 Mbytes/second . Faster links yield an
increased coupling efficiency. When ISC3s are connected from 10–20 km via an RPQ,
however, they can only support 100 MByte/second throughput. They do not support direct
connections beyond 20 km.

MIF technology can be used with CF links. A single link can be shared by multiple z/OS
images on the same CEC, but it can only go to a single CF; a link cannot be shared between
two or more Coupling Facilities.

Dense Wavelength Division Multiplexer


The Dense Wavelength Division Multiplexer is designed to use advanced dense wavelength
multiplexing technology to transmit up to 32 channels over a single pair of fibers at distances
of up to 50 km (31 miles). This distance can be extended by using optical amplifiers.
The maximum transmission rate for each channel is 2.5 Gigabits per second (Gbps) on each of the
32 wavelengths, for a theoretical total of 96 Gbps. ISC-3 links in Peer Mode run at 2 Gbps,
which is the maximum data transmission rate of coupling links through DWDM. In real-world
terms, by using DWDM technology to multiplex data, an organization could replace up to
128 fibers with just four. Within a Parallel Sysplex environment, although DWDM can be
used to carry signals to and from Coupling Facilities, limitations in External Time Reference
(ETR) technology limit the usefulness to 40 km.

DWDM can provide significant benefits across a number of applications, but is particularly
valuable in that it enables the use of a Geographically Dispersed Parallel Sysplex™ (GDPS®)
for disaster avoidance and recovery. A GDPS consists of a Parallel Sysplex cluster spread
across two or more sites at distances of up to 40 km (25 miles). It provides the ability to
perform a controlled site switch for both planned and unplanned outages, with minimal or no
data loss, while maintaining full data integrity.


Performance Implications of Each CF Option


With the rich and diverse set of Coupling Facilities that are made available by IBM for
Parallel Sysplex functions, it is both interesting and worthwhile to examine the performance
characteristics of these alternative configurations. While each Coupling Facility has some
unique features, the general performance characteristics of the CFs participating in a Parallel
Sysplex cluster are similar. In the general discussion of coupled systems' performance,
internal throughput rate (ITR), internal response time, and coupling efficiency are of interest.
The internal throughput rate for online transaction processing (OLTP) may be denoted as the
number of transactions completed per CPU second. The transaction's internal response time
is a combination of the CPU busy time and the I/O time. The coupling efficiency of a given
system is a measure of its productivity while participating in a Parallel Sysplex environment.
Formally, the coupling efficiency of a Parallel Sysplex cluster may be defined as a ratio of the
coupled systems' ITR to the aggregate ITR of the uncoupled systems.
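
Stated as formulas (the subscripted symbols are introduced here only for notational
convenience):

    % Definitions as stated above.
    \[
      \mathrm{ITR} \;=\; \frac{\text{transactions completed}}{\text{CPU seconds used}},
      \qquad
      \text{coupling efficiency} \;=\;
      \frac{\mathrm{ITR}_{\text{coupled}}}{\mathrm{ITR}_{\text{uncoupled}}}
    \]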

In this prologue the general performance characteristics that are applicable to all of the
Coupling Facility configurations are discussed. In the following subsections, the contrasting
performance characteristics are pointed out for each of the Coupling Facility alternatives.

A small overhead is associated with the enablement of a Parallel Sysplex cluster. This could
be viewed as the cost to manage multiple z/OS images. Roughly, a 3% cost in total capacity
is anticipated when a customer intends to migrate his applications from a single system to
multisystems. This includes overhead for shared DASD, JES2 MAS, and (base) sysplex. This
3% cost is less if any of these options were already configured. On the other hand, the cost of
datasharing in a Parallel Sysplex cluster depends on the characteristics of a customer's
workloads. More specifically, it depends on the frequency and type of CF accesses made by
the requesting systems. Even at a heavy production datasharing request rate (8–10 CF
accesses per million instructions), the datasharing cost in a Parallel Sysplex cluster will
generally be under 10% of the total systems' capacity, resulting in a coupling efficiency of
90%.
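
As a rough, purely illustrative check of those figures, the overhead can be estimated from an
assumed host cost per CF access and an assumed engine speed; neither of those two numbers
comes from this paper.

    # Back-of-the-envelope check of the figures above. The CF access rate comes
    # from the text; the host cost per CF access and the engine speed are assumed,
    # purely illustrative values, not measured IBM numbers.
    accesses_per_million_instr = 9        # "8-10 CF accesses per million instructions"
    host_cost_per_access_us = 30          # assumed z/OS + subsystem + wait cost, microseconds
    engine_speed_mips = 300               # assumed engine speed, million instructions/second

    accesses_per_cpu_second = accesses_per_million_instr * engine_speed_mips
    overhead = accesses_per_cpu_second * host_cost_per_access_us / 1_000_000
    print(f"overhead = {overhead:.1%}, coupling efficiency = {1 - overhead:.1%}")
    # With these assumptions: overhead = 8.1%, coupling efficiency = 91.9%,
    # in the same range as the "under 10% ... 90%" statement above.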

The cost of a CF access is composed of software and hardware components. The software
component includes the subsystem (such as DB2®) and operating system (z/OS) path
lengths to set up for and complete a request, and in general is independent of the CF type.
The hardware component is the CF service time, which includes the time spent in the
requester hardware, in the requester's link adapter, in the coupling link, in the CF’s link
adapter, and in the Coupling Facility Control Code. For CF read/write requests associated
with data transfers, the link speed is of particular importance. The higher the link speed, the
faster the data transfer. The highest effective fiber optic link speed available is 200 million
bytes per second. With the ICB-4 copper STI channel, the effective link speed is 1500
million bytes per second. IC "links" are even faster. The Coupling Facility Control Code
processes the request and responds to the requesting system. The capacity of the CF to
process requests depends on the speed and number of its processors in addition to the
application workload characteristics. It is generally recommended that CFs run below 50%
processor utilization. The reason for this is twofold: 1) as CF utilization grows above 50%,
service times may noticeably degrade, impacting coupling efficiency; 2) if one CF fails, there
will be sufficient capacity in the remaining CF(s) to handle the load.
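
To illustrate how much of the CF service time the data transfer itself can account for, the
sketch below uses the effective link speeds quoted above; the 4 KB payload size is an
assumption chosen only for illustration.

    # Data-transfer portion of CF service time, using the effective link speeds
    # quoted above. The 4 KB payload is an assumed, illustrative size; command
    # latency, CF processing time, and protocol overheads are ignored.
    payload_bytes = 4 * 1024
    link_speeds = {
        "ISC-3 fiber optic link": 200_000_000,   # bytes/second, highest effective fiber speed
        "ICB-4 copper STI link": 1_500_000_000,  # bytes/second, effective rate quoted above
    }
    for name, bytes_per_sec in link_speeds.items():
        transfer_us = payload_bytes / bytes_per_sec * 1_000_000
        print(f"{name}: ~{transfer_us:.1f} microseconds to move 4 KB")
    # ISC-3: ~20.5 microseconds; ICB-4: ~2.7 microseconds.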

The requesting processor will issue CF commands in one of two modes: synchronous or
asynchronous. A CF command is said to be executing synchronously when the requesting
processor spins in an idle microcode loop from the time the request is sent over the link to
the time it receives a response from the Coupling Facility. On the other hand, during the
processing of an asynchronous CF command, the requesting CP does not wait for a response
from the CF, but is free to do other pending work. However, significant software costs are
incurred in performing the task switch and in the processing of the asynchronous CF
command completion. Starting with z/OS 1.2, z/OS will pick the mode that provides the best
coupling efficiency.
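
The trade-off z/OS weighs can be sketched as a simple cost comparison. This is not the actual
z/OS 1.2 heuristic; the cost model and the threshold value are invented solely to illustrate
the decision described above.

    # Illustrative sketch of the synchronous-vs-asynchronous trade-off. NOT the
    # actual z/OS 1.2 algorithm; the fixed asynchronous overhead is an invented value.
    def choose_request_mode(expected_sync_service_us, async_overhead_us=26.0):
        # Synchronous cost: the issuing CP spins for the whole CF service time.
        # Asynchronous cost: a roughly fixed task-switch and completion-processing cost.
        if expected_sync_service_us <= async_overhead_us:
            return "synchronous"
        return "asynchronous"

    print(choose_request_mode(15.0))    # short service time -> spin synchronously
    print(choose_request_mode(120.0))   # long service time (e.g., distance) -> go asynchronous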

The performance characteristics of each of the Coupling Facility alternatives are briefly
described in the following subsections.

Performance of the Standalone CF


Dedicated processors in a logical partition will always provide the highest coupling efficiency
and best CF response times. For CF partitions with shared processors, the coupling efficiency
will be lower and CF response times will be longer, depending on the degree of processor
resource sharing between the CF partitions. Although shared CF CPs are generally not
recommended for production Coupling Facilities, if they are used, they should be weighted to
get at least 50% of a CP for an environment with only simplex or user-managed duplexed
requests. If the coupling environment contains structures that are system-managed duplexed,
the shared CF CPs should be weighted so that 95% of a CP is available.
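
The weighting recommendation follows from simple PR/SM share arithmetic: a shared partition's
entitlement is its weight divided by the sum of the competing weights, times the number of
shared physical processors. The weights below are illustrative values only.

    # A shared CF partition's share of the physical CP pool, from LPAR weights.
    # The weights are illustrative; the point is that a production CF sharing a CP
    # should be entitled to at least 50% of it (95% with system-managed duplexing).
    def cp_share(weights, partition, shared_physical_cps=1):
        return weights[partition] / sum(weights.values()) * shared_physical_cps

    weights = {"PRODCF": 95, "TESTCF": 5}     # one physical CP shared by two CF images
    print(f"PRODCF entitlement: {cp_share(weights, 'PRODCF'):.0%} of a CP")   # 95%
    print(f"TESTCF entitlement: {cp_share(weights, 'TESTCF'):.0%} of a CP")   # 5%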

Performance of CF on a Server
An alternative to a standalone Coupling Facility could be a CF partition defined on a 9672 or
zSeries server. This internal CF may have dedicated processors or it could share processors
with z/OS images on the same CEC. The connectivity between the z/OS partitions and the
CF partition is made possible by "wrapping around" coupling links to the same CEC or by
using the Internal Coupling (IC) channel available starting with the G5 servers. This CF
could also be used as a Coupling Facility to external CECs connected by coupling links.


The performance of an internal CF partition with dedicated processors in a server is similar to
that of a standalone CF because a dedicated LPAR requires only a little LPAR dispatching
overhead. Defining the CF processors as ICFs can provide a less expensive alternative to
standalone CFs, plus the ICF processors do not affect the capacity rating of the box so there
is no increase in the software licensing fees. As was the case with standalone CFs, the
performance of CF partitions with shared processors will result in lower coupling efficiencies
than CF partitions with dedicated processors, depending on the degree of processor resource
sharing. This is due to the fact that sometimes when the CF partition has work to do, there
will be a delay before the LPAR can dispatch it (when all physical processors are already
busy).

An additional impact to CF performance occurs when a CF partition shares CPs with z/OS
images in the same sysplex. Since processors are shared between the z/OS partitions sending
the synchronous CF commands, and the CF partition that is executing them, a deadlock
situation might occur in the process; the z/OS partition and the CF partition may
simultaneously require the same processor resource. Hence, it is necessary for LPAR to treat,
"under the covers," all CF commands as asynchronous when those CF commands stem from z/OS
partitions sharing processors with the CF partition. Note that to the operating system, the CF
command still appears to be processed as it was issued, either synchronously or asynchronously;
thus, there is no additional software path length involved.
However, the CF service times will be degraded, resulting in a small loss in coupling
efficiency.

Some of the negative impacts of sharing processors between Coupling Facilities can be
reduced if one of the sharing CFs does not require optimal performance. This may be the
case, for example, if one of the CFs is a production CF and the other is a test CF. In this
case, by enabling Dynamic CF Dispatching for the test CF, the production CF will obtain
most of the processor resource at the expense of performance in the test CF.

Dedicating ICFs or CPs to a CF partition is just like dedicating CPs to z/OS partitions. For
performance reasons, it is recommended that datasharing production environments run with
only dedicated CF partitions.

Dynamic CF Dispatch Performance


CFCC Release 4 introduced a new, optional, dispatching algorithm for CF partitions defined
with shared processors. With the activation of the dynamic CF dispatch function, the
percentage of the processor used in the CF logical partition is dependent on the request rate
to the CF. When the requirement for CF processing is low, the CFCC runs less frequently
making more processor resource available for the needs of the other LPARs sharing the
processor. When the request rate to the CF increases, the CFCC runs more often, increasing
the share of the processor resource used for this partition. Since the CF is dispatched less
frequently when the traffic to the CF is low, the idle or near-idle CF consumption of the
processor can be less than 2%. However, the CF response times seen by the OS/390 or z/OS
partitions will elongate due to infrequent CF dispatch. As the traffic to the CF ramps up, the
CF processor utilization progressively increases due to more frequent CF dispatch, until
reaching its LPAR weight. Correspondingly, the CF response times seen by the z/OS
partitions will decrease significantly due to the considerable drop in the average wait time for
CF dispatch. In all cases, however, CF response times with Dynamic Dispatch enabled will be
considerably higher than those configurations where Dynamic Dispatch is not enabled.
Therefore, the use of this function should be limited to environments where response time is
not important, such as test and backup CFs.

Performance of Internal Coupling Facility (ICF)


The Internal Coupling Facility can be used as a shared or dedicated CF partition with
external or internal coupling links. The performance of an ICF with external coupling links
is similar to that of a CF partition on a server, with the addition of a small degradation due to
the multiprocessor effects of "turning on" an additional processor (note that these MP effects
are much less than those associated with adding another processor to run additional z/OS
work).

A table later in this chapter provides a comparison of the coupling efficiencies for the CF
alternative configurations described above.

Performance of Dynamic ICF Expansion


While an internal Coupling Facility with a dedicated ICF processor may have enough capacity
to handle a normal CF workload, unexpected CF workload demands may require additional
processing power. These demands can be addressed by using the Dynamic ICF expansion
facility to temporarily allocate a portion of existing CP or ICF processors to this Coupling
Facility. As long as the dedicated ICF processor can handle the load, each shared processor
defined to the partition may consume less than 1% of a processor. When the CF workload
demand grows beyond the ability of the dedicated processors to keep up, the shared
processors will begin to use additional resource attempting to preserve the efficient
processing of CF commands.

It is important, especially if this CF is used for structures that are system-managed duplexed,
that the shared CF have sufficient weight. Please refer to the PR/SM Planning Guide for a
more complete description of these CF options and implementation considerations.


Performance of HiPerLinks, Integrated Cluster Bus (ICB), and Internal Coupling Channel (IC)
High Performance Coupling Links (HiPerLinks), available starting on G3 servers and on the
corresponding Coupling Facility models, improve the performance of S/390 coupling
communications. They facilitate improved channel efficiency and service times in the
processing of the CF requests. HiPerLinks improve the coupling efficiency by one to two
percent in comparison to the performance of the earlier coupling channels. Additionally, the
CF link capacity can be improved by up to 20%.

The Integrated Cluster Bus (ICB) was available starting on the G5 servers and on the
corresponding Coupling Facility models. It provides a bi-directional, high-speed bus
connection between systems, separated by up to seven meters, to carry out coupling
communications by using the G5 server's Self Timed Interface (STI). Since the data transfer
rate of the ICB is much faster than that of the HiPerLinks, the latency of the synchronous CF
requests is reduced, resulting in coupling efficiency improvements.

Note: ICB-2s are not supported for connecting a z990 with either a z900 or z800.

The Internal Coupling (IC) channel available starting on the G5 server provides a high-speed
internal communication link between a z/OS LPAR and a CF LPAR on the same server. In
comparison to the performance of the HiPerLinks, the IC channel can improve the coupling
efficiency of the system by one to four percent.

The zSeries family of processors introduces three additional links: ISC-3, ICB-3, and IC-3.
These links can only be configured in Peer Mode, supporting connections only between
zSeries processors. Peer Mode by itself provides performance improvements. These include:

• 4 KB to 64 KB data buffers
• Protocol changes so fewer acknowledgments are needed for messages. This improves long
distance performance as well as transfers of more than 4 KB of data. 32 KB transfers are
common with VSAM RLS, so this environment should especially see better performance. At
long distances, this is a very significant improvement. The large data buffers in ISC-3 allow
the maximum data move (64 KB) without intermediate link acknowledgments. A 64 KB
data transfer takes 320 microseconds at 200 MBytes/second. At 10 km, one link turnaround
takes 100 microseconds. With larger data buffers, the 15 additional link acknowledgments
(15 x 100 = 1.5 milliseconds) are avoided. So, just for the data transfer and turnaround time,
we go from 1.82 milliseconds to 0.42 milliseconds (see the short calculation sketch after
this list).
• Increase from two to seven buffer sets (subchannels) per link reducing the number of links
required. From an ISC/ICB point of view, the Peer Mode increase from two to seven buffer
sets is actually multiplied by two since both the “sender” and “receiver” functions of the
Peer channel increase the number of buffer sets from two to seven. With today's channels
and tomorrow's workloads, we find that we are often adding more channels because two, or

four, or even six subchannels (requiring one, two, and three ISCs/ICBs/ICs, respectively) are
not enough between a collection of OSs and a CF. Also, the link utilization with ISC and
ICB is extremely low with only two subchannels. So, to better utilize the links and consume
fewer channels, we increased the number of subchannels (= buffers) from two to seven, and
doubled this number again by introducing Peer Mode. So, in reality, an ISC/ICB/IC running
in Peer Mode has seven times the number of buffers that the present channels have.
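
The sketch below simply reproduces the turnaround arithmetic from the second bullet above,
using the exact 64 KB transfer time (about 328 microseconds) rather than the rounded
320 microseconds; the counting of turnarounds follows the text.

    # Link-turnaround arithmetic: a 64 KB transfer at 200 MBytes/second over a
    # 10 km link, with and without the 15 intermediate acknowledgments that the
    # larger peer-mode buffers avoid.
    transfer_bytes = 64 * 1024
    link_bytes_per_sec = 200_000_000
    turnaround_us = 100                                           # one 10 km link turnaround

    transfer_us = transfer_bytes / link_bytes_per_sec * 1_000_000 # ~328 us (text rounds to 320)
    small_buffers_ms = (transfer_us + 15 * turnaround_us) / 1000  # ~1.83 ms (text: 1.82 ms)
    large_buffers_ms = (transfer_us + 1 * turnaround_us) / 1000   # ~0.43 ms (text: 0.42 ms)

    print(f"4 KB buffers (with intermediate acks) : ~{small_buffers_ms:.2f} ms")
    print(f"64 KB buffers (no intermediate acks)  : ~{large_buffers_ms:.2f} ms")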

ISC-3 links are based upon HiPerLinks, but can only be configured in Peer Mode. This link
normally provides double the capacity of HiPerLinks at 200 MB/second while improving
coupling efficiency by 1%. The same physical link is used with ISCs and ISC-3s, but the way
it is defined determines how the link will be used.

ICB-3 links are based upon ICBs. Clocking in at three times the capacity of ICBs, ICB-3s can
provide a maximum of 1000 MB/second capacity. As with ICBs, they only support a
maximum of 7 meters between machines. Although ICB-3s are supported between two
z990s, they are not recommended there, as ICB-4s provide better performance.

ICB-4 links are similar to ICB-3s. They provide a maximum of 2000 MB/second capacity and are
supported starting with the z990. This means they cannot be used to connect to z900s, for
example.

IC-3s on the zSeries (also sometimes called ICs) are similar to ICs. Since by definition an IC
must be within the server, one cannot have a mix of different server types while using ICs.
The capacity of an IC link is directly related to the cycle time of the server. The numbers
listed below are measured from actual workloads; they reflect expected throughput rates in a
production-type environment.

Model        IC            ICB-4         ICB-3        ICB          ISC-3                            ISC
9672 G5/G6   700 MB/sec    -             -            250 MB/sec   -                                100 MB/sec
z800         1125 MB/sec   -             500 MB/sec   -            200 MB/sec                       -
                                                                   (100 MB/sec beyond 10 km;
                                                                   100 MB/sec in Compatibility Mode)
z890         (tbd)         1500 MB/sec   500 MB/sec   -            (same as z800)                   -
z900         1400 MB/sec   -             500 MB/sec   250 MB/sec   (same as z800)                   -
z990         3500 MB/sec   1500 MB/sec   500 MB/sec   250 MB/sec   (same as z800)                   -


z990 CFCC Compatibility Patch and z/OS Compatibility Feature


Max. # Links    IC    ICB-4   ICB-3           ICB   ISC-3   Max. # Links
z900-100 CF     32    -       16              16    32      64      (42 w/RPQ)
z900 Server     32    -       16              8     32      32+32   (12/16 w/ RPQ)
z990            32    16      16              8     48      64
z800            32    -       5 (6 on 0CF)    -     24      26+32
z890            32    8       16              -     48      64

The z990s support 15 LPARs in each processor book. The GA1 version (May 2003)
supports up to two books (A08 and B16 models), and the GA2 version (October 2003)
supports up to four books (C24 and D32 models).

CFRM and CFRM Policy Support


The CFRM support for the z990 processor now allows two-digit LPAR identifiers to be
assigned in the CFRM policy. This extension is essential for allowing more than 15 LPARs to
coexist on the processor.

The LPAR ID is not the same as the partition number. The LPAR identifier is assigned in
the Image Profile on the Support Element and is unique to each LPAR. It is a two-digit
hexadecimal value in the range x'00' to x'3F'.

This update is identical for both compatibility and exploitation support. The LPAR id is also
used in IOS and XCF messages.

The LPAR ID for a Coupling Facility (specified in its image profile on the Support Element)
must correspond with the value specified on the PARTITION() keyword in the CFRM policy.
This is different from earlier zSeries and S/390 processors, where the value specified on the
PARTITION() keyword was the partition number from the IOCDS.

In May 2003, OS/390 R10 and z/OS supported running only in the first Logical Channel
SubSystem (LCSS 0), where the LPAR ID is always 15 or less. This basic z990 support is available through the z/OS
V1R4 z990 Compatibility Support Feature. This orderable, unpriced and optional feature
is required to allow z/OS V1R4 to run on a z990 server. Because of the CFRM support, it is
also required on all systems in a Parallel Sysplex cluster when a z/OS or CF image in the
Parallel Sysplex cluster is running on a z990 and the internal LPAR identifier of the
operating system or CF image is greater than 15. This happens when z/OS is in LCSS 1 or
greater, or a CF is defined when there are more than 15 total LPARs on the z990.
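
As a rough illustration of the rule just described (assumed data, not a configuration tool), the
following sketch checks hypothetical image definitions against the x'00' to x'3F' LPAR ID range and
flags when the compatibility feature would be required across the Parallel Sysplex cluster.

```python
# Illustrative only: restates the rule that the z/OS V1R4 z990 Compatibility Support
# Feature is needed on all systems in the sysplex when any z/OS or CF image on a
# z990 has an internal LPAR ID greater than 15 (x'0F'). All names here are made up.

def lpar_id_valid(lpar_id):
    """z990 LPAR identifiers are two hex digits in the range x'00' to x'3F'."""
    return 0x00 <= lpar_id <= 0x3F

def compat_feature_required_everywhere(images):
    """images: iterable of dicts like {'name': 'CF01', 'on_z990': True, 'lpar_id': 0x1E}."""
    return any(img['on_z990'] and img['lpar_id'] > 0x0F for img in images)

sysplex = [
    {'name': 'SYSA', 'on_z990': True,  'lpar_id': 0x11},   # e.g. z/OS in LCSS 1
    {'name': 'SYSB', 'on_z990': False, 'lpar_id': 0x03},
    {'name': 'CF01', 'on_z990': True,  'lpar_id': 0x1E},
]
print(all(lpar_id_valid(i['lpar_id']) for i in sysplex))    # True
print(compat_feature_required_everywhere(sysplex))          # True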

Similar support is provided with the z990 Compatibility for Selected Releases and the z/OS.e
V1.4 z990 Coexistence feature to allow OS/390 V2.10, z/OS V1.2, z/OS V1.3, and z/OS.e to run
on a z990 or to be in the same Parallel Sysplex cluster as a system on a z990. Although
z/OS.e cannot run on a z990, it can be in the same Parallel Sysplex cluster with a z/OS on a
z990.

Compatibility support is not provided for z/OS V1.1. Therefore, z/OS V1.1 cannot run on a
z990 server. Also, z/OS V1.1 cannot coexist in a Parallel Sysplex cluster if any OS/390, z/OS,
or Coupling Facility image in that sysplex is running on a z990 server.

In October 2003, the z/OS V1.4 z990 Exploitation Support was made available. This
orderable, unpriced and optional feature provides exploitation support for two Logical
Channel SubSystems and 30 LPARs.

Coupling Facility support for the z990 servers is provided on the 9672 G5 and G6
servers with CFCC Level 11, and with a (disruptive) MCL upgrade to CF Level 12 for the
Coupling Facilities on the z800s and z900s. The z990s come with the support already on CF
Level 12 when they are ordered.

Performance of the Coupling Facilities


As mentioned before, the cost of a CF access comprises software and hardware
components. The hardware cost can be minimized by using the most efficient links with
faster engine speeds for the CFs. This reduces the time that z/OS is waiting for the response
while the message is on the CF link and while the CF is busy processing the request. With
this in mind, it becomes obvious that the best coupling efficiency is generally seen with the
newest model CF. z990 ICF-only CFs perform better than z900-100s, which perform better
than 9672 R06s.

Number of CF Engines for Availability


When determining whether the Coupling Facility should be configured with one or more CPs, two
issues arise. The first is capacity. The recommendation is to provide enough capacity for
either Coupling Facility to support all the structures in a failover situation while staying
under 50% busy for optimal response times, and to configure enough engines to make
this so. For optimal performance, it is best to keep the CF within one generation of the
general purpose servers where z/OS is running.
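
A minimal sketch of that failover sizing guideline follows; the per-engine capacity figure is
purely hypothetical, and real CF capacity planning should use measured request rates and utilization.

```python
# A minimal sketch of the sizing guideline above: either CF alone should be able to
# carry the whole sysplex CF load and still stay under about 50% busy. The per-engine
# capacity figure is purely hypothetical.

import math

ASSUMED_OPS_PER_CF_ENGINE = 100_000   # hypothetical CF requests/second per engine

def engines_per_cf(total_cf_ops_per_sec, target_busy=0.50):
    """Engines each CF needs so that one CF can absorb all structures in a failover."""
    required_capacity = total_cf_ops_per_sec / target_busy
    return math.ceil(required_capacity / ASSUMED_OPS_PER_CF_ENGINE)

# Example: 30,000 req/sec across both CFs needs 60,000 ops/sec of capacity in each
# CF to stay under 50% busy after a failover; one engine under this assumption.
print(engines_per_cf(30_000))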


Assuming that one engine is sufficient to provide capacity for the Parallel Sysplex cluster, a
more interesting question is whether an additional engine should be configured for availability. This is not
needed. All CMOS processors since the G5 family (including the R06 CF) have supported
transparent sparing. Transparent CPU Sparing is designed to enable the hardware to activate
a spare PU to replace a failed PU with no involvement from the operating system or the
customer, while preserving the application that was running at the time of the error. If one of
the running CPUs fails and instruction retry is unsuccessful, a transparent sparing operation
is performed. Using the service element, Dynamic CPU Sparing copies state information from
the failed CPU into the spare CPU. To ensure that the action is transparent to the operating
system, special hardware tags the spare with the CPU address of the failed CPU. Now the
spare is ready to begin executing at precisely the instruction where the other CPU failed.
Sparing is executed completely by hardware, with no operating system awareness.
Applications continue to run without noticeable interruption.

With transparent CPU Sparing, even in the event of a hardware failure, a second engine would not
provide significant benefits.

Comparison of Coupling Efficiencies Across Coupling Technologies


As discussed earlier, the coupling efficiency of a Parallel Sysplex cluster, particularly one that
has heavy datasharing, is sensitive to the performance of the operations to the Coupling
Facility. As the IBM CMOS technology has advanced, many improvements in coupling
technology have also been made to ensure excellent coupling efficiencies will be preserved.
The following table and the accompanying plot illustrate the performance of an average
production datasharing workload as a function of processor and coupling technology. Many
customers who have migrated their major applications to datasharing have observed access
rates to their CFs of 8–10 commands per million instructions and this forms the basis of the
values in the table. Workloads that have lower datasharing rates will have even higher
coupling efficiencies with correspondingly smaller differentials between the options.


This chart is based on 9 CF operations per MIPS. You can calculate your own activity by
summing the total requests per second of the two CFs and dividing by the used MIPS of the attached
systems (that is, MIPS rating times CPU busy). The values in the table then scale
linearly; for example, if a customer is running 4.5 CF operations per MIPS, all the
values in the table would be cut in half. (A small worked example follows the table below.)

With z/OS 1.2 and above, Heuristic Synchronous to Asynchronous conversion caps the
values in the table at about 15%.

CF \ Host        G4     G5     G6     z800   z900   z900   z890   z990
                                             1xx    2xx           3xx

C04 - SM         11%    16%    19%    21%    22%    25%    ---    ---
C05/HiPerLink    10%    14%    16%    18%    19%    22%    26%    30%
R06 - HL         9%     12%    14%    16%    17%    19%    22%    26%
R06 - ICB        ---    9%     10%    ---    13%    14%    17%    20%
G5/G6 - IC       ---    8%     8%     ---    ---    ---    ---    ---
z800 - ISC       9%     11%    12%    11%    12%    13%    15%    18%
z800 - ICB/IC    ---    ---    ---    9%     10%    11%    12%    14%
z900 - ISC       9%     11%    12%    10%    11%    12%    14%    16%
z900 - ICB/IC    ---    8%     9%     8%     9%     10%    11%    12%
z890 - ISC       8%     8%     9%     9%     10%    11%    13%    15%
z890 - ICB/IC    ---    8%     8%     7%     8%     8%     9%     10%
z990 - ISC       8%     8%     9%     9%     10%    11%    13%    14%
z990 - ICB/IC    ---    8%     9%     7%     8%     8%     9%     9%

For the zSeries CF rows (z800, z900, z890, z990), the values in the z800, z900 1xx,
z900 2xx, z890, and z990 host columns are for Peer Mode links.
* ICB compatibility mode links are required to connect to G5/G6 ICB links
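
Here is the small worked example of the scaling rule described before the table: the table assumes
roughly 9 CF operations per used MIPS, overheads scale about linearly with the observed rate, and
heuristic sync/async conversion caps the effect near 15%. All of the input numbers below are hypothetical.

```python
# A small worked example of the scaling rule described before the table; inputs are
# hypothetical. The table is based on about 9 CF operations per used MIPS, and
# heuristic sync/async conversion (z/OS 1.2 and above) caps the cost near 15%.

def scaled_coupling_cost(table_value_pct, cf_req_per_sec, mips_rating, cpu_busy,
                         heuristic_cap_pct=15.0):
    used_mips = mips_rating * cpu_busy               # e.g. 4000 MIPS at 50% busy
    ops_per_mips = cf_req_per_sec / used_mips        # total of both CFs
    scaled = table_value_pct * (ops_per_mips / 9.0)  # table is based on 9 ops/MIPS
    return min(scaled, heuristic_cap_pct)

# Table value 20%, 9,000 req/sec to the two CFs, hosts rated 4,000 MIPS at 50% busy:
# 4.5 ops per used MIPS, so the expected coupling cost is about 10%.
print(scaled_coupling_cost(20.0, 9_000, 4_000, 0.50))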

In reviewing the table, one performance aspect of the Parallel Sysplex environment to note
is that the coupling efficiency increases as the older generation processors are coupled to newer
technology Coupling Facilities. Conversely, it decreases when newer generation processors
are coupled to Coupling Facilities of the earlier technologies. However, the magnitude of
change in coupling efficiency depends on the magnitude of change in the processor speed as
well as the technology of the coupling link. The earlier server models are generally slower
than the newer technology processors. Since the coupling efficiency is quite sensitive to the
CF service time, the performance improvement from moving to new coupling technologies
will be more dramatic for the faster than for the slower processors. This effect is clearly
depicted in the table shown. As one traverses from C04 to a z900 model 100, the
performance improvement for 9672 G5 processors is about four percentage points, whereas
it is 15 percentage points for the z900 2xx processors. Thus, it is of increasing importance to
keep coupling technology reasonably current with the latest generation of processors. The
table may be used to quantify the value of moving to a new coupling technology. However, it
should be reiterated that the relative improvements between technologies are given for
observed average production workloads with heavy datasharing. Workloads with lighter
datasharing requirements will have less relative difference between the coupling options.

Customers desiring more Coupling Facility capacity can obtain the processing power by
upgrading the R06 Coupling Facility model to a z900 Model 100. An interesting and unique
aspect of this upgrade is that, like the R06, the z900 Model 100 standalone Coupling Facility
can be further upgraded to a z900 server model. This upgrade path thus culminates in a
powerful server that can hold ICFs and IC-3 channels (and it may already be equipped with
ICB-3s).

A z900 Model 100 cannot be directly upgraded to an ICF-only z990 (2084 Model 300).
Possibly the best approach is a multi-step path: first add a "z/OS" CP to upgrade the
z900 Model 100 to a Model 101 with an ICF. This can then be upgraded to a 2084 Model
301 with an ICF, with the CP defined as an "Unassigned CP" so it is hidden from software cost
impact. Later, this CP can be turned on if extra capacity is needed.

As mentioned earlier, coupling efficiencies with the use of the ICB and IC channels can
improve by several percentage points in comparison to a standalone CF equipped with
HiPerLinks. Peer link technology further improves on this.

Since coupling efficiency is a function of the relative difference in processor speed between
the server and the CF, a good rule of thumb is to keep the CF on a machine comparable to
the server while using the fastest link that can be configured. This can be done by keeping
the Coupling Facility no more than one generation behind the general purpose servers it
supports. An R06 (G5) is sufficient to support z/OS on other G5s or on G6 servers; a
z900-100 CF is sufficient to support z900 and z990 servers, and so on.


Cost/Value Implications of Each CF Option


CF Alternatives, a Customer Perspective
The Coupling Facility (CF) is at the heart of Parallel Sysplex technology. In April 1997, IBM
continued its leadership role in delivering Coupling Facility technology by introducing two
new customer configuration alternatives, namely the Internal Coupling Facility and Dynamic
CF Dispatch. In 1998 IBM introduced Dynamic ICF Expansion and, with the G5 announcement,
IBM introduced the 9672 R06 along with ICBs and ICs. This was followed up in 2000 with
ISC-3, ICB-3, and IC-3 Peer links on the zSeries family of servers, and ICB-4 in 2003. From
a customer perspective let's look at how each of the Coupling Facility configurations can be
viewed. The configuration alternatives we will examine are:

• Standalone Coupling Facility


• Coupling Facility LPAR on a Server
• CF LPAR with Dynamic CF Dispatch
• Internal Coupling Facility (ICF)
• Dynamic ICF Expansion
• System Managed CF Structure Duplexing

Standalone Coupling Facilities


Customers seeking the most highly available production Parallel Sysplex environment should
configure with two standalone Coupling Facilities. Standalone CFs offer an additional layer
of component isolation, hence higher availability. Additionally, given the physical separation
of the CF from the production (workload) processors, CFs can be independently upgraded
and/or maintenance applied as required with minimal planning and no impact to production
workloads. On the other hand, as the general purpose server is upgraded to a new processor
family, any ICFs on that footprint get upgraded "for free." While purchasing separate
standalone CF(s) introduces the expense of an additional footprint, its advantages are upgrade
and availability plus software savings. Given that a standalone CF cannot execute customer
workloads, software charges do not apply. Standalone CFs are priced as an enabling
technology from IBM.

System-Managed CF Structure Duplexing with the CFs configured as ICFs on the production
models does offer a high availability environment without single points of failure and allows the
use of ICB and IC links, but this involves a performance cost as described in the technical
paper GM13-0103.


A balance of availability and performance can be obtained by configuring a single standalone
CF containing structures that require failure isolation, with an ICF that contains structures
not requiring failure isolation. System-Managed CF Structure Duplexing can then be enabled
for those structures where the benefit outweighs the cost. This can typically be for those
structures that have no fast method of recovery such as CICS® named counters, CICS
Temporary Storage, or BatchPipes®.

Coupling Facility Processor LPAR on a Server


Customers seeking a flexible and simple Coupling Facility (CF) configuration can enable an
LPAR on their server with CF Control Code also known as CFCC. This might be particularly
attractive to customers beginning their Parallel Sysplex implementation or those who have
"spare" capacity on their system to support a CF. While this approach does not offer the
same availability as a standalone coupling solution since it does not offer the physical
independence of a separate machine, it is appropriate for (System Enabled) Resource Sharing
production and test application datasharing environments. Unlike the standalone CF and
ICF approach which require additional hardware planning this implementation offers the
advantage of using existing MIPS to enable a Parallel Sysplex environment. Customers can
start with a small amount of capacity and scale the CF up MIPS-wise, based on workload
requirements. The Dynamic CF Dispatch feature (described below) is highly recommended
for this environment. With this implementation, software license charges apply to the CF
MIPS given that the LPAR can be dynamically reconfigured to support production workloads.
This alternative is available on all IBM servers and supports a multiprocessor environment.
Customers seeking a coupling environment for intensive datasharing should closely evaluate
the advantages of a standalone coupling solution or an Internal Coupling Facility (ICF)
implementation.

Coupling Facility LPAR With Dynamic CF Dispatch


With the introduction of Dynamic Coupling Facility Dispatch (DCFD) algorithms in the
CFCC, additional value is provided for running shared CF LPARs. DCFD, an IBM exclusive,
introduces "standby" coupling capability in a logical partition (LPAR). This standby facility
shares engines (CPs) with active workloads. DCFD uses minimal processor resources when in
backup mode. This enables the processor to be fully used for customer workloads until CF
processor resource is needed. Given that the primary role of this capacity is customer
workloads, software charges do apply. However, DCFD eliminates the need to dedicate an
entire processor for Coupling Facility functions just in case of a primary Coupling Facility
failure. It is also of value in standalone CFs where customers want to run both a production
and test CF partition and can easily allocate processor resources between these partitions
based on CF capacity demand. Dynamic Dispatch provides an ideal configuration solution
for periodic testing needs. DCFD is available starting with the 9672 G3 processors.


Internal Coupling Facility (ICF)


The Internal Coupling Facility (ICF) is an IBM exclusive configuration made available first on
the G3 family of processors. ICF is a priced feature that enables a CF on a dedicated "spare"
processor (an engine not in the z/OS pool). Its performance is close to that of a standalone
Coupling Facility. An ICF exclusively executes CFCC code; it is not available for production
workloads and therefore is excluded from software charges. The absence of software charges
delivers an annual recurring savings to our customers as compared to a CFCC LPAR on the
server implementation. ICF is aggressively priced to enable customers to inexpensively
enable a new CF for test, production or as a backup facility. The Internal Coupling Facility is
a very attractively priced configuration alternative. ICF is also a configuration option that IBM
intends to build heavily on. With the addition of the Dynamic ICF Expansion capability, the
ICF has flexibility in terms of dynamic assignment of processor resources that is not available
with any other CF option.

Customers considering an ICF solution should keep the following points in mind.
1. ICF is available on a G6 through an 11-way, and a z900 through a 15-way. Customers with
growing workloads need to be mindful of their growth path.
2. Starting with the z800 and z990 servers, the engines (up to the limit of the specific family)
can be configured as general purpose CPs, IFLs, or ICFs, including the "ICF-only" models.
3. When servers are upgraded to a new generation technology, the ICFs on that footprint are
also taken down. Given that this upgrade requires a Coupling Facility outage along with a
z/OS outage, customers should exercise their pre-planned reconfiguration policies and
procedures. Standalone Coupling Facility upgrades are independent decisions from server
upgrades.
4. When configuring an ICF, memory requirements are an important consideration.
5. ICF uses memory from the 9672 "pool" of memory. Customers adding ICF(s) to their
processor must determine whether additional memory is now required on the processor to
support Coupling Facility functions. The somewhat complicating factor here is how
memory increments are made available. In February 1998 IBM made memory increments
more granular on the G4/G5 family of processors. This change in granularity added to the
price attractiveness of the G4/G5 ICF feature.
In terms of pricing, given typical customer pricing for a standalone CF, ICF will generally be
significantly less expensive than a standalone Coupling Facility solution. When excess 9672
memory must be configured due to restrictions associated with memory increment sizes (on
top of existing 9672 memory), the ICF configuration may cost more than the 9674.

While price can be a driving factor in purchasing decisions, configuration choices should be
made based on availability, growth and performance requirements.


Dynamic ICF Expansion


Announced in February 1998, Dynamic ICF Expansion allows an ICF logical partition to
acquire additional processing power from the LPAR pool of shared CPs being used to execute
production and/or test work on the system or by other ICFs. Dynamic ICF Expansion across
the server is available to every ICF partition starting with the S/390 9672 G3 server.
Dynamic ICF Expansion across ICFs is available starting with the G5 family. With this
capability, customers can configure their dedicated ICF capacity to match their typical CF
processor resource requirements, while providing the flexibility for the ICF partition to
expand into the shared CP processor pool when a spike in CF resource demand occurs. This
might result either from the failure of a standalone CF for which the ICF is a backup, or from
unusual spikes in workload demand for any reason.

Dynamic ICF Expansion combines the performance benefits of dedicated CF processors with
the software pricing benefits of a standalone CF. It also frees customers from added capacity
planning for spikes in CF workload demands.

Combining an ICF with Standalone CF


Many customers choose to take advantage of the availability provided by the standalone CF
with the price benefits of an ICF. This is done by configuring two Coupling Facilities, one as
a standalone and one as an ICF. The only caveat is that without duplexing, any application
datasharing structures requiring failure-isolation should reside on the standalone model for
highest availability. For example, this would include the DB2 SCA and IRLM's lock
structure. Any duplexed structures (such as DB2 Group Buffer Pools or by using
System-Managed CF Structure Duplexing) actually reside on both CFs, so there is no single
point of failure. Any Resource Sharing structures could reside on the ICF without loss of
availability.


More information on how this can be configured is described in a later section. An example
of how this may be configured is as follows:

Standalone CF:
  XCF Path 1
  DB2 SCA
  IRLM Lock
  DB2 GBP0 (P)
  DB2 GBP1 (S)
  DB2 GBP2 (P)

ICF:
  XCF Path 2
  DB2 GBP0 (S)
  DB2 GBP1 (P)
  DB2 GBP2 (S)
  GRS Lock
  ECS (Catalog)
  JES2 Checkpoint
  RACF®
  Logger (OperLog)

It should be noted that if an ICF is configured on a server whose logical partitions are not part
of the Parallel Sysplex cluster to which the ICF belongs, then the ICF acts as a "logical"
standalone Coupling Facility.
An example of this is shown below:

Figure: A server hosting z/OS or OS/390 images that are outside the Parallel Sysplex cluster
also hosts an ICF. The z/OS or OS/390 images in the Parallel Sysplex cluster, running on other
servers, use that ICF, which therefore acts as a "logical" standalone CF.


System Managed CF Structure Duplexing


System Managed CF Duplexing allows any enabled structure to be duplexed across multiple
Coupling Facilities, similar to the support DB2 put into place for its Group Buffer Pools.
With this support, many of the failure isolation issues surrounding data sharing structures are
no longer relevant as there is no single point of failure and a standalone CF is no longer
required. If, for example, an IRLM lock structure is duplexed between two ICFs on different
servers, then a failure of a server containing one of the ICFs together with one of the IRLM
instances does not stop the surviving IRLM(s) from still accessing the duplexed structure on
the other ICF.

There is a performance cost to System Managed CF Structure Duplexing, so the benefits of
being able to run with only ICFs have to be weighed against the added coupling overhead.
More details about this can be found in System Managed CF Duplexing, GM13-0103-00.

Heuristic CF Access Sync to Async Conversion: When a z/OS image sends a command to a
Coupling Facility, it may execute either synchronously or asynchronously with respect to the
z/OS CPU which initiated it. Often, CPU-synchronous execution of CF commands provides
optimal performance and responsiveness, by avoiding the back-end latencies and overhead
that are associated with asynchronous execution of CF requests. However, under some
conditions, it may actually be more efficient for the system to process CF requests
asynchronously, especially if the CF to which the commands are being sent is far away (e.g. at
GDPS distances), is running on slower processor technology than the z/OS system, or is
configured with shared processors or dynamic CF dispatching, is on a low-weighted LPAR
using shared CPs, is overloaded, has overloaded links, and so on. Anything that causes service
times to be long, whether of a permanent or a transient nature, is handled.

z/OS 1.2 can dynamically determine if starting an asynchronous CF request instead of a
synchronous CF request would provide better overall performance and thus self-optimize the
CF accesses. The system will observe the actual performance of synchronous CF requests,
and based on this real-time information, heuristically determine how best to initiate new
requests to the CF.

Sync/async optimizations are granular based on the type of CF request and the amount of
data transfer on the request, so the algorithm factors in these considerations as well.
Customers that may benefit most from this include those that are configuring their Parallel
Sysplex environment to serve as a remote site recovery alternative (i.e. using GDPS), and
those that have configurations where the CPCs running z/OS are significantly faster than the
Coupling Facility to which they are sending requests.
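
The following is a conceptual sketch of this kind of heuristic, not the actual z/OS XES algorithm:
it tracks recent observed synchronous service times per request class and recommends asynchronous
execution once they exceed an assumed breakeven threshold. The class names and thresholds are made up.

```python
# A conceptual sketch of heuristic sync/async conversion; this is not the actual
# z/OS XES algorithm. It tracks recent synchronous service times per request class
# and recommends asynchronous execution once they exceed an assumed breakeven point.

from collections import defaultdict, deque

ASSUMED_BREAKEVEN_US = {('lock', 'small'): 60, ('cache', 'small'): 90, ('cache', 'large'): 120}

class SyncAsyncHeuristic:
    def __init__(self, window=64):
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record_sync_time(self, req_class, service_time_us):
        """Feed back the observed synchronous service time of a completed request."""
        self.samples[req_class].append(service_time_us)

    def should_go_async(self, req_class):
        """Convert new requests to async when recent sync times exceed the breakeven."""
        recent = self.samples[req_class]
        if not recent:
            return False                       # no observations yet: stay synchronous
        average = sum(recent) / len(recent)
        return average > ASSUMED_BREAKEVEN_US.get(req_class, 100)

h = SyncAsyncHeuristic()
for observed in (150, 180, 170):               # e.g. a distant or heavily loaded CF
    h.record_sync_time(('cache', 'large'), observed)
print(h.should_go_async(('cache', 'large')))   # True: start new requests asynchronously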


Section 2. Target Environments


The first half of this document provided a detailed look at coupling alternatives and their
common characteristics. For example, we now know that all Coupling Facilities run in a
Logical Partition (LPAR). This LPAR requires either shared or dedicated processors, storage,
and high-speed channels known as coupling links. For test configurations, CF LPARs with
Dynamic CF Dispatching can be used. Alternatively, ICF LPARs with Internal Coupling (IC)
links provide the added flexibility of supporting multisystem configurations but without
external links that need to be configured.

The standalone Coupling Facility was defined as the most robust production Coupling
Facility configuration option, capable of delivering the highest levels of performance,
connectivity and availability. Several Coupling Facility options exist on the 9672 and zSeries
CMOS systems. The Internal Coupling Facility (ICF) is generally the most attractive
production CF option, given its superior price/performance and flexibility provided with the
introduction of Dynamic ICF Expansion. The ICF is comparable in performance to a
standalone Coupling Facility. With proper configuration planning, the ICF can support a
highly available Parallel Sysplex environment, whether used in combination with a standalone
CF, used exclusively in "suitable" configurations, or used in conjunction with CF
Duplexing. But what constitutes a "suitable" environment for exclusive use of ICFs? When is
a standalone Coupling Facility required for availability reasons? Are there any production
scenarios where a single CF is acceptable?

To answer these questions, it is important to have a clear understanding of two distinct
Parallel Sysplex configuration environments: Resource Sharing (Systems Enabled) and Data
Sharing (Application Enabled). These two Parallel Sysplex environments require different
degrees of availability. Understanding what these terms mean and matching a customer's
business objectives to these basic environments will help simplify the Coupling Facility
configuration process.

Following is an explanation of each environment, its corresponding benefits, and availability
requirements.

Resource Sharing Environment


The Systems Enabled Parallel Sysplex environment represents those customers that have
established a functional Parallel Sysplex environment, but are not currently in production
datasharing with IMS™, DB2, or VSAM/RLS. Systems Enabled is also referred to as Resource
Sharing. This environment is characterized as one in which the Coupling Facility technology
is exploited by key z/OS components to significantly enhance overall Parallel Sysplex
management and execution. In the Parallel Sysplex Resource Sharing environment, CF
exploitation can help deliver substantial benefits in the areas of simplified systems
management, improved performance and increased scalability.

Perhaps the Systems Enabled Resource Sharing environment can be best understood through
consideration of some examples.

First, simplified systems management can be achieved by using XCF Signaling Structures in
the Coupling Facility versus configuration of ESCON® Channel to Channel (CTC)
connections between every pair of systems in the Parallel Sysplex cluster. Anyone who has
defined ESCON CTC Links between more than two systems can certainly appreciate the
simplicity that XCF Signaling Structures can provide. Further, with the coupling technology
improvements delivered with faster engines and CF links on the G5 servers, XCF
communication through the CF could outperform CTC communication as well. More detail
in XCF Performance can be found in the Washington System Center flash:
ftp://ftp.software.ibm.com/software/mktsupport/techdocs/xcfflsh.pdf

Improved performance and scalability is delivered with enablement of GRS Star. With GRS
Star, the traditional ring-mode protocol for enqueue propagation is replaced with a star
topology where the CF becomes the hub. By using the CF, enqueue service times are now
measured in microseconds versus milliseconds. This enqueue response time improvement
can translate to a measurable improvement in overall Parallel Sysplex performance. For
example, a number of customers have reported significant reduction in batch window elapsed
time attributable directly to execution in GRS Star mode in the Parallel Sysplex cluster.

RACF is another example of an exploiter that can provide improved performance and
scalability. RACF is capable of using the Coupling Facility to read and register interest as the
RACF database is referenced. When the Coupling Facility is not used, updates to the RACF
database will result in discarding the entire database cache working set that RACF has built
in common storage within each system. If an installation enables RACF to use the Coupling
Facility, RACF can selectively invalidate changed entries in the database cache working set(s)
thus improving efficiency. Further, RACF will locally cache the results of certain command
operations. When administrative changes occur, such commands need to be executed on
each individual system. The need for this procedure can be eliminated in an XCF
environment by leveraging command propagation. Refreshing each participating system's
copy of the RACF database is still needed. However, this too is simplified when RACF
operates within a Parallel Sysplex cluster by leveraging command propagation. With RACF,
you can take this whole process one step further by caching the RACF database in the
Coupling Facility. RACF has implemented a store-through cache model. As a result,
subsequent read I/Os of changed RACF pages can be satisfied from the Coupling Facility,
eliminating the DASD I/O, providing high-speed access to security profiles across systems.
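
As a toy illustration of the store-through cache model just described (not RACF's actual
implementation), the sketch below writes every update to the backing store as well as to the cache,
so the cache never holds the only copy of changed data and a read hit avoids the disk I/O.

```python
# A toy illustration of a store-through cache (not RACF's implementation): every
# write is hardened to the backing "DASD" as well as the cache, so the cache never
# holds the only copy of changed data and a read hit avoids the DASD I/O entirely.

class StoreThroughCache:
    def __init__(self):
        self.dasd = {}    # stands in for the RACF database on disk
        self.cache = {}   # stands in for the CF cache structure

    def write(self, key, value):
        self.dasd[key] = value    # store-through: the disk copy is always current
        self.cache[key] = value

    def read(self, key):
        if key in self.cache:     # satisfied from the cache, no disk read
            return self.cache[key]
        value = self.dasd.get(key)
        if value is not None:
            self.cache[key] = value
        return value

db = StoreThroughCache()
db.write('profile:USER01', {'group': 'SYS1'})
print(db.read('profile:USER01'))   # served from the cache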


Dynamic assignment of tape drives across systems in a Parallel Sysplex configuration is
facilitated through CF exploitation by the z/OS Allocation component. With this function
enabled, tape drives can be reassigned on demand between systems without operator
intervention. Starting with z/OS 1.3, the Automatic Tape Switching (ATS) Star configuration
requires the GRS Star Lock structure.

Other Resource Sharing CF exploitation examples include VTAM® Generic Resources for
TSO session balancing, JES2 Checkpoint sharing for scalability, System Logger
single-system-image support of sysplex-merged Logrec and Operlog, OS/390 SmartBatch
for improved batch performance, and many others.

As significant as the benefits derived is the ease of migration to a Parallel Sysplex
Resource Sharing environment. Value is delivered by components shipped with OS/390
and z/OS, avoiding the need to concurrently upgrade IMS, DB2, and CICS datasharing
subsystems to latest software levels. Further, planning considerations associated with shared
database performance tuning, handling of transaction affinities, etc., are avoided. This
environment has high value, either as a straightforward migration step toward the full
benefits of Application Enabled Data Sharing or as an end point Parallel Sysplex
configuration goal.

This migration is further simplified by the many tools available either as wizards on the Web
(such as the Parallel Sysplex Configuration Assistant), or as part of the operating system
(msys for Setup).

The availability requirements differ significantly between the Resource Sharing and Data
Sharing environments in terms of CF structure placement. For the installation that plans to
only perform Resource Sharing, CF failure independence is generally not required.

Failure independence describes the need to isolate the CF LPAR from exploiting OS/390
and z/OS system images in order to avoid a single point of failure. This means that the CF
and OS/390 or z/OS systems should not reside on the same physical machine when failure
independence is needed. Stated another way, failure independence is required if
concurrent failure of the CF and one or more exploiting z/OS system images prevents the
CF structure contents from being recovered at all, or causes a lengthy outage due to
recovery from logs.

Failure independence is a function of a given z/OS to CF relationship. For example, all
connectors to a structure on a standalone CF are failure independent. However, with an ICF,
all connections from z/OS images on the same footprint are failure dependent.


Note: The terms failure independence and failure isolation are used interchangeably in
Parallel Sysplex documentation. We decided to use failure independence throughout this
document so as not to confuse this attribute with "system isolation," which occurs during
sysplex partitioning recovery. More guidelines on failure-independence can be found in "Value
of Resource Sharing," GF22-5115.

Resource Sharing: CF Structure Placement


For Resource Sharing environments, servers hosting Internal Coupling Facility features along
with exploiting OS/390 and/or z/OS images are acceptable. For example, if the CF is
exploited by System Logger for Operlog or Logrec, and the CF should fail, no attempt at
recovery of the lost data would be made. Hence, if the machine hosting both the CF and an
exploiting z/OS system image should fail, no failure independence implications result.
Another example is GRS operating in a Star configuration. GRS Star demands two Coupling
Facilities. These Coupling Facilities could be Internal Coupling Facilities on two separate
servers within the Parallel Sysplex cluster. A configuration such as this will not compromise
GRS integrity. This is because in the event a z/OS image terminates, ENQs held by that
image are discarded by GRS. So if both one of the CFs and a z/OS image should fail, CF
Structure rebuild into the alternate CF would not involve recovery of the ENQs held by the
system that died. The surviving GRS images would each repopulate the alternate CF
structure with ENQs they held at the time of failure. Thus, failure independence is not
required.

The above scenarios similarly apply to the other Resource Sharing environment CF
exploiters. Thus, this environment does not demand configuration of a standalone CF (or
more than one).

In the following diagram, the loss of a single CEC that hosts an Internal Coupling Facility will
not result in a sysplex-wide outage. Actual recovery procedures are component dependent
and are covered in great detail in Parallel Sysplex Test and Recovery Planning (GG66-3270).
In general, it is safe to say that a production Resource Sharing environment configured with
two Internal Coupling Facilities can provide cost effective high availability and systems
integrity.

Now, of course, there are exceptions to every rule of thumb, and that is true for CF structure
placement as well. For Resource Sharing we stated that the GRS Star structure does not
require failure independence. This is true. It does not mean, however, that there is no
benefit from having at least one failure independent CF. Recovery for a failed CF will be
faster if it is not first necessary to detect the failure of an exploiting z/OS image and perform
Parallel Sysplex partitioning actions before the CF structure rebuild process can complete.


For Resource Sharing exploiters other than GRS, the impact of a longer CF recovery time is
not significant relative to the value gained in terms of reduced cost and complexity through
leverage of ICFs. For GRS however, the placement of the GRS Star structure warrants
consideration of the cost/complexity savings versus recovery time in a Data Sharing
Environment.

Figure: Structure Placement - Resource Sharing. Two servers, each hosting a z/OS image
(z/OS Image 1 and z/OS Image 2) and an Internal Coupling Facility reached over IC links.
The Resource Sharing structures (XCF Signalling, Logger Operlog and Logrec, RACF Primary
and Secondary, Automatic Tape Switching, VTAM G/R, JES2, GRS Star, MQSeries Shared Queue,
Catalog, WLM IRD, and WLM Multisystem Enclave) are spread across the two Internal Coupling
Facilities.

Data Sharing Environments


The Application Enabled Parallel Sysplex environment represents those customers that have
established a functional Parallel Sysplex cluster and have implemented IMS, DB2 or
VSAM/RLS datasharing. When Application Enabled, the full benefits of the Parallel Sysplex
technology are afforded, including dynamic workload balancing across systems with high
performance, improved availability for both planned and unplanned outages, scalable
workload growth both horizontally and vertically, etc. In this environment, customers are
able to take full advantage of the ability to view their multiple-system environment as a single
logical resource space able to dynamically assign physical system resources to meet workload
goals across the Parallel Sysplex environment. These benefits are the cornerstone of the
S/390 Parallel Sysplex architecture.


Application Enabled Data Sharing: CF Structure Placement


The Application Enabled environment does have more stringent availability characteristics
than the Resource Sharing environment. The datasharing configurations exploit the CF
cache structures and lock structures in ways which involve sophisticated recovery
mechanisms, and some CF structures do require failure independence to minimize recovery
times and outage impact.

Two Coupling Facilities are required to support this environment. Without System Managed
CF Structure Duplexing implemented for the data sharing structures, it is strongly
recommended that this Coupling Facility configuration include at least one standalone CF.
The second CF can run on a server as an Internal Coupling Facility. The standalone CF will
be used to contain the CF structures requiring Failure Independence. The ICF(s) will
support those structures not requiring failure independence as well as serving as a backup in
case of failure of the standalone CF. This should be viewed as a robust configuration for
application enabled datasharing.

There are reasons why NOT to use ICFs in a data sharing environment. These include:

Unplanned Outages: Two standalone CFs provide the highest degree of availability since
there is always the possibility that an ICF or server might fail while the primary (standalone)
CF is unavailable.

Planned Outages: Applying microcode upgrades to support the ICF Coupling Facility may
involve bringing down the entire server. This would affect any z/OS partitions that are
sharing that footprint.

With System Managed CF Structure Duplexing, two images of the structures exist; one on
each of the two CFs. This eliminates the single point of failure when a data sharing structure
is on the same server as one of its connectors. In this configuration, having two ICFs
provides a high availability solution. A cost / benefit analysis of System-Managed Duplexing
needs to be done for each exploiter to determine the value in each environment. More
information can be found in GM13-0103 available off of the Parallel Sysplex Web site at
ibm.com/servers/eserver/zseries/pso .


ICF Use as a Standalone Coupling Facility


Although not the norm, an ICF can provide failure independence when configured on a CPC
that contains the ICF along with any of the following:
1. A z/OS LPAR (instance) that is a member of a different Parallel Sysplex cluster
2. A VM system
3. A network node (CMC)
4. A TPF image
5. A Linux system
These potential configurations will provide failure independence as long as there is no z/OS
instance that is exploiting the ICF on that CPC. The advantage of these configurations is
mostly cost: if the customer has to provide these other functions anyway, i.e., TPF or CMC,
they needn’t dedicate an entire machine to them. In so doing, they do not require a separate
standalone CF to provide a highly available Parallel Sysplex environment.

Lock Structures
Accounts that are in production datasharing must worry about preserving the contents of
their Lock structures used by the lock managers. Should these structures be lost along with
one or more of the z/OS images that were using them, the installation will be forced to
recover from logs. This is a time consuming process that will vary depending on the specific
customer environment. Recovery of the lock structures from logs can be avoided if the CF
structure is isolated from exploiting lock managers, allowing rebuild of the lock structure
contents from in-memory lock information on each of the connected systems. This can also
be achieved through the use of CF Duplexing for the relevant lock structures.

Cache Structures
Cache structure recovery for the datasharing environments differs greatly depending on
whether the cache structure exploiters support CF structure rebuild or not. Both IMS and
VSAM/RLS support CF structure rebuild for their caches, and, given their caches do not
contain changed data, can recover the loss of a cache structure without requiring recovery
from logs. Failure of a datasharing IMS or VSAM/RLS system instance does not affect the
ability to recover the cache structure contents and therefore, CF failure independence is not
demanded for these cache structures.

DB2 Group Buffer Pools and IMS VSO DEDB caches contain changed data and do not
support CF structure rebuild in the event of a CF failure. The IMS VSO DEDB cache
support in IMS Version 6 supports multiple copies of a data area. Each area has its own copy
of the structure. This avoids a single point of failure. As such, it does not demand
failure-isolation between the CF and any sharing IMS instance.


DB2 GBPs support user-managed CF structure duplexing. As DB2 writes to a primary GBP,
it simultaneously writes to the secondary, duplexed GBP. Registering interest and Reads are
only done to the primary structure. Since both GBP instances are kept in synch with each
other in terms of changed data, there are no concerns about the loss of the primary GBP as
the data will still be available in the alternate. With DB2 GBP duplexing, there is no concern
about using an ICF to hold the production data.
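
As a toy sketch of the user-managed duplexing behavior just described (not DB2 code), the example
below writes changed pages to both structure instances, reads only from the primary, and falls back
to the surviving copy if the primary structure is lost. The names and data are made up.

```python
# A toy sketch of user-managed GBP duplexing as described above (not DB2 code):
# changed pages are written to both the primary and the secondary structure, reads
# and registrations go only to the primary, and the secondary takes over if the
# primary structure is lost.

class DuplexedGroupBufferPool:
    def __init__(self):
        self.primary = {}
        self.secondary = {}

    def write_changed_page(self, page_id, data):
        self.primary[page_id] = data     # written to both structures...
        self.secondary[page_id] = data   # ...so neither is a single point of failure

    def read_page(self, page_id):
        return self.primary.get(page_id)  # reads are satisfied from the primary only

    def lose_primary(self):
        self.primary = self.secondary     # surviving copy becomes the new primary
        self.secondary = {}

gbp = DuplexedGroupBufferPool()
gbp.write_changed_page('TS01.page7', b'changed data')
gbp.lose_primary()
print(gbp.read_page('TS01.page7'))        # still available after the CF failure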

As exploiters take advantage of System Managed CF Structure Duplexing, the possibility
exists to eliminate the requirement for a standalone CF. If, for example, the DB2 SCA and
Lock structures were duplexed, then with the primary and secondary structures on different
ICFs, there are no availability issues. A cost / benefit analysis of System-Managed Duplexing
needs to be done for each exploiter to determine the value in each environment. More
information on System Managed CF Structure Duplexing can be found in GM13-0103.

Transaction Log Structures


Several subsystems/components exploit the System Logger to provide logging for transactions
and media recovery. These CF structures warrant failure independence to avoid a single
point of failure for the CF log data. The System Logger component is able to rebuild these
structures from local processor storage if the CF fails. But if the CF and a System Logger
instance should fail, recovery is not possible, unless System Logger Staging Data Sets are in
use — which impacts performance as it requires log records to be forced to disk before
control can be returned for every log write request. Use of CF Duplexing can satisfy this
failure-isolation requirement and can also avoid the need for staging data sets and their
accompanying performance implications.

Subsystems/components which exploit the System Logger and warrant failure independence
include: z/OS Resource Recovery Services, IMS Common Queue Server component for
Shared Message Queues, and CICS system and forward recovery logs.

Figure: Two z/OS V1.2 images, each with DB2, connected to two CFs (CF/ICF). The first CF
holds SCA(P), DB2 Lock(P), GBPx(S), and GBPy(S); the second CF holds SCA(S), DB2 Lock(S),
GBPx(P), and GBPy(P).


Other CF Structures Requiring Failure Independence


A number of other list structure exploiters provide CF failure recovery via CF structure rebuild from
local processor memory contents on each connector. Without System-Managed Duplexing,
failure independence allows full reconstruction of the CF structure contents; a concurrent
failure of a z/OS system image containing an exploiter of these structures would preclude
successful CF structure rebuild. CF structures in this category include:

• VTAM Multi-Node Persistent Sessions


• VTAM Generic Resource structures for LU 6.2 Sync Level 2 sessions
• DB2 SCA

Other CF Structures NOT Requiring Failure Independence


There are a number of CF structures in support of data sharing which do not require failure
independence, either because recovery does not need failure independence to recover the CF
contents, or because recovery for CF structure loss is not possible in any event. CF structures
in this category include:

• IMS Shared Message Queues


• VTAM Generic Resources for non-LU6.2 sync level 2
• CICS Shared Temp Store Queues
Alternatively, these structures support System-Managed CF Structure Duplexing, which
provides full high availability support for them as well. As one can see from the above
discussion, a production datasharing environment has much more stringent availability
requirements than a production Resource Sharing environment.

This means that careful thought must be given regarding structure placement in the event of
a system image failure. A properly configured datasharing environment consisting of a single
standalone Coupling Facility and one Internal Coupling Facility would have structures placed
in the following way.


Structure Placement - Data Sharing

Standalone CF:
  IMS VSO
  IMS Logger
  RRS Logger
  CICS Logger
  VTAM MNPS
  VTAM GR
  VSAM RLS Lock
  DB2 and IMS Lock (IRLM)
  DB2 SCA
  DB2 GBP (Secondary)

ICF (on a server also hosting one of the z/OS images, z/OS A and z/OS B):
  IMS Cache and Shared Queues
  VSAM RLS Cache
  CICS Temp Storage
  VTAM G/R
  IMS VSO
  DB2 GBP (Primary)
  plus Resource Sharing structures

In the above picture, structures that do not require failure independence are placed in the
ICF, while structures that do require failure independence are placed in the standalone CF.
More information on Resource Sharing structures can be found in the white paper:
GF22-5115 "Value of Resource Sharing," linked off of the Parallel Sysplex Web page at
ibm.com/servers/eserver/zseries/pso .
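
The placement rule behind the picture can be sketched as follows; the structure attributes are
made up for illustration and do not replace the exploiter-by-exploiter guidance in Appendix A.

```python
# A minimal sketch of the placement rule in the picture above, with made-up
# structure attributes: without duplexing, structures that need failure independence
# go to the standalone CF; duplexed and Resource Sharing structures can use the ICF.

STRUCTURES = {
    'DB2 SCA':        {'needs_failure_isolation': True,  'duplexed': False},
    'IRLM Lock':      {'needs_failure_isolation': True,  'duplexed': False},
    'DB2 GBP1':       {'needs_failure_isolation': True,  'duplexed': True},
    'GRS Star Lock':  {'needs_failure_isolation': False, 'duplexed': False},
    'XCF Signalling': {'needs_failure_isolation': False, 'duplexed': False},
}

def place(attrs):
    if attrs['duplexed']:
        return 'both CFs (primary and secondary instances)'
    if attrs['needs_failure_isolation']:
        return 'standalone CF'
    return 'ICF'

for name, attrs in STRUCTURES.items():
    print(f'{name:15s} -> {place(attrs)}')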

Failure Independence with System Managed CF Structure Duplexing


For information on System Managed CF Structure Duplexing, please see System Managed CF
Structure Duplexing (GM13-0103) or System Managed CF Structure Duplexing
Implementation Summary both of which are available on the Parallel Sysplex Web site
(ibm.com/servers/eserver/zseries/pso ).


Section 3: Other Configuration Considerations


CFCC enhanced patch apply:
Starting with z890 and z990 servers after April 7, 2004, the CFCC patch apply process can
eliminate the need for a power on reset (POR) of the servers to apply a "disruptive" CFCC
patch. This enhancement is designed to provide you the ability to:

• Selectively apply the new patch to one of possibly several CFs running on the z890 or
z990. For example, if you have a CF that supports a test Parallel Sysplex and a CF that
supports a production Parallel Sysplex on the same server, you now have the ability to
apply a "disruptive" patch to only the test CF without affecting the production CF. After
you have completed testing of the patch, it can be applied to the production CF as
identified in the example.
• Allow all other LPARs on the z990 where a "disruptive" CFCC patch will be applied to
continue to run without being impacted by the application of the "disruptive" CFCC patch.
This enhancement does not change the characteristics of a "concurrent" CFCC patch, but can
significantly enhance the availability characteristics of a "disruptive" CFCC patch by making
it much less disruptive.

Previously, small enhancements or "fixes" to the CFCC were usually distributed as a
"concurrent" patch that could be applied while the CF was running. Occasionally, a CFCC
patch was "disruptive." When such a "disruptive" change needed to be applied to a CF, it
required a POR of the server where the CF was running. This was especially disruptive for
those enterprises that had chosen to use internal CFs, because a POR of the server affects the
CF where the change was to be applied and all other LPARs running on the same server.
Further, if an enterprise was using multiple internal CFs on the same server (to support both
a test and production configurations, for example), there was no way to selectively apply the
"disruptive" patch to just one of the CFs — once applied, all the CFs on the server had the
change. Consequently, the application of the "disruptive" CFCC patch was very disruptive
from an operations and availability perspective.

CFCC Concurrent Patch Apply is a server microcode enhancement and does not require a
specific level of the Coupling Facility.


Summary
In this paper, we have examined the various Coupling Facility technology alternatives from
several perspectives. We compared the characteristics of each CF option in terms of function,
inherent availability and price/performance. We also looked at CF configuration alternatives
in terms of availability requirements based on an understanding of CF exploitation by z/OS
components and subsystems. Given the considerable flexibility IBM offers in selection of CF
technology and configuration options and inherent tradeoffs between qualities of service that
can be selected, there is some risk in trying to condense the content of this document into a
few simple conclusions. However, we will do our best to do just that.

First, we learned that the various CF technology options have a great deal in common in
terms of function and inherent availability characteristics, and that the fundamental decision
points in selecting one option over another have more to do with price/performance and
availability criteria dictated by customer choice of Parallel Sysplex environment (Resource
Sharing versus Data Sharing).

We described the Internal Coupling Facility as a robust, price/performance competitive, and
strategic CF option for many customer configurations, and a technology base built upon with
System-Managed duplexing as a high availability solution. Further, we examined the various
CF technology options in terms of performance characteristics and discussed the significant
improvements in cost/performance value introduced through technology enhancements such
as ICF, Dynamic ICF Expansion, Integrated Cluster Bus, Internal Coupling links, and the
ISC-3, ICB-3, and IC-3 links.

From the perspective of configuring for availability, we gave considerable attention to the
configuration requirements dictated by the recovery capabilities of specific CF exploiters.
While there is much flexibility in the configuration choices customers may make based on
cost versus value tradeoffs, a simple yet important distinction emerged regarding availability
in Resource Sharing versus Data Sharing environments.

Customers interested in the value of a Parallel Sysplex environment can obtain significant
benefit from a Systems Enabled Resource Sharing environment with minimal migration cost.
The value derived from system-level CF exploitation (Resource Sharing), such as improved
RACF and GRS performance, dynamic tape-sharing across the Parallel Sysplex cluster, and
simplified XCF Parallel Sysplex communications, can be easily introduced into customers'
existing configurations.

Further, Resource Sharing environments do not require a standalone CF in order to provide
a highly-available Parallel Sysplex environment nor do they generally require use of CF
Duplexing. For example, a configuration with two Internal CFs provides roughly the same
robustness as a configuration with two standalone CFs for such an environment.

As customers migrate forward to an environment supporting Application Data Sharing to
achieve all of the workload balancing, availability and scalability advantages offered by
Parallel Sysplex technology, a standalone Coupling Facility becomes important from an
availability perspective. This is due to the need for failure independence between the CF and
exploiting z/OS system and subsystem components. System-Managed CF Structure
Duplexing removes the requirement for a standalone Coupling Facility, but this has to be
evaluated against the cost of duplexing.

This paper demonstrated that a single standalone Coupling Facility configured with a second
CF as an ICF or other server CF option can provide high availability at significant cost
reduction in many customer configurations. With System-Managed CF Structure Duplexing,
the configuration can be reduced down to two ICFs while still providing the same high
availability configuration.

Finally, through exploitation of the significant CF technology alternatives and introduction of
ever-evolving new CF options, IBM continues to demonstrate a strategic commitment to
Parallel Sysplex clusters as a fundamental component of the z/Architecture™ business value
proposition.

In the following appendices, detailed information is provided on various aspects of the CF
configuration alternatives which were developed in order to build this paper and to provide
educational material for interested readers. Appendix A covers the recovery behavior of CF
exploiters given configurations with two Coupling Facilities. Appendix B describes the
function and value provided by CF exploiters in a Resource Sharing environment. Appendix
C describes the function and value provided by CF exploiters in an Application Data Sharing
environment. Appendix D covers Parallel Sysplex hardware component tradeoffs and options
in the servers starting with the G3. Finally, Appendix E contains answers to anticipated
questions related to the content of this paper.


Appendix A. Parallel Sysplex Exploiters - Recovery Matrix


This table assumes System-Managed CF Structure Duplexing is not configured. A table showing exploitation of
duplexing can be found at the end of GM13-0103: System Managed CF Structure Duplexing technical paper,
found at the Parallel Sysplex Web site: ibm.com/servers/eserver/zseries/pso.

Assumes a single CF failure or 100% loss of connectivity.

Exploiter (subsystem or system component using the CF) | Function | Structure Type | Failure Independence (Dual CFs) Required? | Failure Impact
DB2 | SCA | List | Yes | Datasharing continues.
DB2 (IRLM) | Serialization | Lock | Yes | Datasharing continues.
DB2 Group Buffer Pools | Shared Catalog/Directory, Shared User Data | Cache | Yes | Shared data using the GBP is unavailable until recovered from logs.
DB2 V6 (Duplexing) Group Buffer Pools | Shared Catalog/Directory, Shared User Data | Cache | Not applicable with duplexed structures | Datasharing continues.
IMS (IRLM) | Serialization | Lock | Yes | Datasharing continues.
IMS OSAM V5 or V6 | Caching | Cache | No | Datasharing continues.
IMS VSAM V5 or V6 | Caching | Cache | No | Datasharing continues.
IMS VSO DEDB | Caching | Cache | N/A | Datasharing continues / duplexed.
IMS CQS V6 | Transaction Balancing | List | No | Transaction balancing continues.
z/OS System Logger (IMS CQS V6) | LSTRMSGQ Shared Message Queue - Merged Log | List | Yes (1) | Datasharing continues.
z/OS System Logger (IMS CQS V6) | LSTREMH Shared EMH - Merged Log | List | Yes (1) | Datasharing continues.
VTAM V4.2/V4.3 (ISTGENERIC) | SSI to Network Generic Resources (GR) | List | Yes (3) | Session balancing continues.
VTAM V4.4 MNPS | Multi-Node Persistent Session | List | Yes | MNPS continues.
z/OS Resource Recovery Services (RRS) | Synch Point Management for OS/390 - Merged Log | List | Yes (1) | High-speed logging continues.
z/OS System Logger | OPERLOG - Merged Log | List | Yes (2) | High-speed logging continues.
z/OS System Logger | LOGREC - Merged Log | List | Yes (2) | High-speed logging continues.
z/OS System Logger (CICS TS R1.0) | DFHLOG - CICS requirement for Emergency Restart and Backout Processing | List | Yes (1) | Datasharing continues.
z/OS System Logger (CICS TS R1.0) | DFHSHUNT - CICS requirement | List | Yes (1) | Datasharing continues.
z/OS System Logger (CICS TS R1.0) | General Forward Recovery | List | Yes (1) | Datasharing continues.
CICS TS R1.0 VSAM/RLS | Serialization (IGWLOCK00) | Lock | Yes | Datasharing continues.
CICS TS R1.0 VSAM/RLS | Caching of VSAM data | Cache | No | Datasharing continues.
CICS TS R1.0 Temp Stor Queues | TempStor Affinities | List | No | Function terminates. Load balancing continues.
CICS Named Counters | Unique Index Generation | List | No | User-managed recovery needed.
CICS Data Tables | Caching of VSAM data | Cache | No | "Scratch-pad" data lost.
z/OS Allocation | Shared Tape (ISGLOCK) | Lock | No | Shared Tape continues.
z/OS GRS | ENQ/DEQ (ISGLOCK) | Lock | No | High-speed ENQ/DEQ continues.
z/OS XCF | High-speed Signaling | List | No | High-speed signaling continues.
RACF V2.1 | Cached Security Database | Cache | No | Normal RACF processing continues.
JES2 V5 | Checkpoint | List | N/A | With Duplex active, JES2 continues.
SmartBatch | Cross Systems BatchPipes | List | N/A | Function not available. Rebuild supported for planned reconfiguration.
WLM | Multi-system Enclaves | List | No | Drop out of multi-system management.
WLM | Intelligent Resource Director (IRD) | List | No | Structure repopulated from WLM data over the next 10 seconds to 2 minutes.
Assumes an active SFM policy; we strongly recommend having an active SFM policy.
SSI - Single System Image
"Datasharing continues" essentially means that rebuild is supported.
Notes:
1. For the System Logger, Failure Independence can be achieved through the use of Staging Data Sets or proper configuration
planning. It is IBM's recommendation NOT to use Staging Data Sets, due to the performance impact.
2. Though failure independence is suggested, it is not strictly necessary. Typical installations could live with the loss of this data.
3. Failure independence is required for LU6.2 SYNCLVL2 only.
For all exploiters, view Failure Independence as the need for the CEC that is hosting the Coupling Facility to be
physically separate from the z/OS or OS/390 image that is hosting the member of the multi-system application.
Please note: In environments where Failure Independence is not required, recovery delays may still occur as a result of
partitioning the failed system out of the Parallel Sysplex environment.

Appendix B. Resource Sharing Exploiters - Function and Value


Systems Enabled represents those customers that have established a functional Parallel Sysplex environment
using a Coupling Facility, but are not currently implementing IMS, DB2 or VSAM/RLS datasharing.

The following exploiters fall into this category.

Exploiter | Function | Benefit
z/OS XCF | High Speed Signaling | Simplified System Definition
z/OS System Logger | OPERLOG and LOGREC logstreams | Improved Systems Management
z/OS Allocation | Shared Tape | Resource sharing
z/OS GRS (GRS Star) | Resource Serialization | Improved ENQ/DEQ performance, availability and scalability
Security Server (RACF) | High-speed access to security profiles | Performance
JES2 | Checkpoint | Systems Management
SmartBatch | Cross Systems BatchPipes | Load Balancing
VTAM GR (non LU6.2) | Generic Resource for TSO | Session Balancing / availability
z/OS RRS | Logger | Coordinate changes across multiple database managers
Catalog | Enhanced Catalog Sharing | Performance
MQSeries | Shared message queues | System Management, capacity, availability, application flexibility
WLM | Multisystem Enclaves | System Management
WLM | Intelligent Resource Director | System Management, Performance

Systems Enabled Exploiters


Efficient recovery in a Systems Enabled environment requires operations to take the
necessary actions to complete the partitioning of failed images out of the Parallel Sysplex
environment. This must be done as quickly as possible. Doing so will allow rebuild
processing to continue (this includes proper COUPLExx interval specification, response to
XCF's failure detection/partitioning messages and use of SFM). Operator training on failure
recognition (i.e., responding to the IXC102A message) and the proper use of the XCF
command is key and should not be overlooked. Installations that run with an active SFM
policy may automate the partitioning process. Further automation could be applied to
improve responsiveness. For additional details on availability items like CF volatility, isolate
value, interval value and a sample recovery time line, please see WSC Flash 9829, "Parallel
Sysplex Configuration Planning for Availability."

A brief description of each exploiter follows along with reasons to consider using them.

XCF, High Speed Signaling


Signaling is the mechanism through which XCF group members communicate in a sysplex.
In a Parallel Sysplex environment, signaling can be achieved through channel to channel
connections, a Coupling Facility, or both. When using IBM HiPerLinks, implementing XCF
signaling through Coupling Facility list structures provides significant advantages in the areas
of systems management and recovery. A signaling path defined through CTC connections
must be exclusively defined as either outbound or inbound, a Coupling Facility list structure
can be used as both. z/OS automatically establishes the paths. If more than one signaling
structure is provided, the recovery from the loss of a signaling structure is automatically done
by Cross System Extended Services (XES).

It is highly recommended to have redundant XCF signaling paths. A combination of CTCs
and list structures may be used, or two sets of CTCs, or two sets of structures. The two
signaling structures should not reside in the same CF; this can be monitored for with
automation routines. The WSC Flash Parallel Sysplex Performance: XCF Performance
Considerations (Version 2) contains more information on configuring and tuning XCF.
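
The behavior just described can be pictured with a small sketch. The Python below is purely
illustrative — the class, path names and failover loop are invented, not XCF code — but it shows why
two signaling structures in two different CFs keep signaling running when one CF is lost.

```python
# Toy model of redundant XCF signaling paths (illustrative only; not XCF code).
# Two list-structure paths are defined in two different CFs; if one CF is lost,
# traffic flows over the surviving path and signaling continues.

class SignalingPath:
    def __init__(self, name, cf):
        self.name = name          # e.g. a signaling structure name (invented)
        self.cf = cf              # the CF hosting the list structure
        self.available = True

    def send(self, message):
        if not self.available:
            raise ConnectionError(f"{self.name} on {self.cf} is unavailable")
        return f"delivered '{message}' via {self.name} ({self.cf})"

def signal(paths, message):
    """Try each defined path in turn; recovery is automatic while one survives."""
    for path in paths:
        try:
            return path.send(message)
        except ConnectionError:
            continue
    raise RuntimeError("no signaling paths available - sysplex-wide problem")

paths = [SignalingPath("IXCSTR1", "CF01"), SignalingPath("IXCSTR2", "CF02")]
print(signal(paths, "XCF group message"))   # normal case
paths[0].available = False                  # simulate loss of CF01
print(signal(paths, "XCF group message"))   # still delivered via CF02
```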

System Logger, OPERLOG and LOGREC


The system logger is a set of services which allows an installation to manage log data across
systems in a Parallel Sysplex environment. The log data is in a log stream which resides on
both a Coupling Facility structure and DASD data sets. System logger also provides for the
use of staging data sets to ensure that the log data is protected from a single point of failure. A
single point of failure can exist because of the way the Parallel Sysplex environment is
configured or because of dynamic changes in the configuration. Using system logger services,
you can merge log data across a Parallel Sysplex environment from the following sources:
• A Logrec logstream. This will allow an installation to maintain a Parallel Sysplex view of
logrec data.
• An Operations logstream. This will allow an installation to maintain a Parallel Sysplex
view of SYSLOG data.
• CICS Transaction Server utilizing VSAM/RLS to provide the installation with VSAM
datasharing in a Parallel Sysplex environment (discussed later).
• IMS Shared Message Queue providing support for transaction load balancing (discussed
later).
• RRS merged logs.
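
The log stream mechanics described above — interim storage in a CF structure with offload to DASD
data sets — can be pictured with a simplified sketch. Everything below is a hypothetical Python model;
the class, thresholds and names are invented and are not the System Logger API.

```python
# Simplified log stream model (not the System Logger API): log blocks are written
# to a CF list structure and offloaded to DASD data sets when a high threshold is
# reached, so the stream spans both interim (CF) and permanent (DASD) storage.

class LogStream:
    def __init__(self, cf_capacity=4, high_offload_pct=75):
        self.cf_entries = []                    # interim storage in the CF structure
        self.dasd_offload = []                  # offload data sets on DASD
        self.threshold = cf_capacity * high_offload_pct // 100

    def write(self, block):
        self.cf_entries.append(block)
        if len(self.cf_entries) > self.threshold:
            self.offload()

    def offload(self):
        # migrate the oldest blocks to DASD, freeing space in the CF structure
        while len(self.cf_entries) > 1:
            self.dasd_offload.append(self.cf_entries.pop(0))

stream = LogStream()
for i in range(6):
    stream.write(f"LOGBLK{i}")
print("in CF:", stream.cf_entries, "| on DASD:", stream.dasd_offload)
```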

Allocation, Automatic Tape Switching


In MVS/ESA™ SP V5.2 and subsequent releases, tape allocation could use a list structure
within the Coupling Facility to provide multisystem management of tape devices. Prior to
MVS/ESA SP 5.2, a tape device could only be online to one system at a time and operator
intervention was needed to move tape devices between systems. Alternatively, the tape device
could be managed by a third party product.

Automatic tape switching extends the MVS™ allocation function whereby a shareable tape
device can be automatically shared among MVS images in a Parallel Sysplex environment.

This enhancement enables customers to displace the third party alternative if they desire.

In z/OS 1.3 (1.2 with APAR support), the use of the IEFAUTOS structure to support
Automatic Tape Switching has been dropped. Instead, the function uses the GRS Star
structure. This helps improve the system management and problem determination
capabilities of the ATS function.

z/OS GRS Star


OS/390 R2 introduced the GRS Star methodology of Global Resource Serialization, which
replaces the traditional ring mode protocol and its potential ring disruptions. The star
configuration is built around a Coupling Facility, which is where the global resource
serialization lock structure resides. By using the Coupling Facility, ENQ/DEQ service times
can be measured in microseconds, not milliseconds, a significant performance
improvement. Installations that currently use GRS ring mode or a third party alternative and
host a fair number of z/OS images should consider migrating to GRS Star for the improved
performance, scalability, and availability it provides. GRS Star is required for the Automatic
Tape Switching (ATS Star) support in z/OS R1.3.
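
As a rough way to see why a star topology is so much faster than a ring, the following sketch
(hypothetical Python; the timings and hashing rule are assumptions, not the real GRS protocol)
contrasts a single lock-table operation in the CF with passing a request around every system in a ring.

```python
# Rough illustration (not the real GRS protocol): in star mode an ENQ is one
# round trip to a lock table in the CF; in ring mode the request must visit
# every system in the ring before it is granted.

CF_ROUND_TRIP_US = 50        # assumed microseconds for one CF lock operation
RING_HOP_MS = 5              # assumed milliseconds to pass the ring token one hop

def star_enq(lock_table, resource, owner):
    """One hashed lookup in the CF lock structure."""
    entry = hash(resource) % len(lock_table)
    if lock_table[entry] is None:
        lock_table[entry] = (resource, owner)
        return CF_ROUND_TRIP_US / 1000.0     # grant time in milliseconds
    return None                              # contention: request would be queued

def ring_enq(n_systems):
    """The request is granted only after every system in the ring has seen it."""
    return n_systems * RING_HOP_MS

lock_table = [None] * 1024
print("star ENQ:", star_enq(lock_table, "SYSDSN.PROD.DATASET", "SYSA"), "ms")
print("ring ENQ:", ring_enq(8), "ms for an 8-system ring")
```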

z/OS Security Server (RACF), High-speed Access to Security Profiles


RACF uses the Coupling Facility to improve performance, using a cache structure to hold
frequently used information from the RACF database. The cache structure is also used
by RACF to perform very efficient, high-speed cross invalidates of changed pages within the
RACF database. Doing so preserves the working set that RACF maintains in common
storage.

RACF is another example of an exploiter that can provide improved performance and
scalability. RACF is capable of using the CF to read and register interest as the RACF
database is referenced. When the CF is not used, updates to the RACF database will result in
discarding the entire database cache working set that RACF has built in common storage
within each system. If an installation enables RACF to use the CF, RACF can now selectively
invalidate only changed entries in the database cache working set(s) thus improving
efficiency. Further, RACF will locally cache the results of certain command operations.
When administrative changes occur, such commands need to be executed on each individual
system.
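
The difference between the two behaviors can be shown with a toy model. The Python below is not
RACF internals — the cache contents and function names are invented — but it contrasts discarding the
whole working set with selectively invalidating only the changed entry.

```python
# Hypothetical sketch (not RACF internals) of full-discard vs. selective
# cross-invalidation of locally cached security database working sets.
import copy

def update_without_cf(local_caches, changed_profile):
    # Every system discards its entire working set and must re-read from DASD.
    for cache in local_caches.values():
        cache.clear()

def update_with_cf(local_caches, changed_profile):
    # The CF cache structure knows which entries are cached where, so only the
    # changed entry is invalidated on the other systems.
    for cache in local_caches.values():
        cache.pop(changed_profile, None)

caches = {
    "SYSA": {"USER01": "profile", "USER02": "profile"},
    "SYSB": {"USER01": "profile", "GROUP1": "profile"},
}
without_cf = copy.deepcopy(caches)
update_without_cf(without_cf, "USER01")
update_with_cf(caches, "USER01")
print("without CF:", without_cf)   # everything gone - full re-read needed
print("with CF   :", caches)       # only USER01 invalidated; working sets survive
```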

JES2, Checkpoint
JES2 supports placing its checkpoint in the Coupling Facility. When this option is selected, a
Coupling Facility list structure is used for the primary checkpoint data set. The alternate
checkpoint data set could reside either on DASD or in the other Coupling Facility. For
recovery, JES2 provides its own recovery mechanism called the JES2 Reconfiguration Dialog.
The dialog is entered when the Coupling Facility hosting the checkpoint structure fails.
Starting with OS/390 R8, rebuild of the JES2 structure is supported. The benefits
of having the JES2 checkpoint in the Coupling Facility include equitable access to the
checkpoint lock across all members within the MAS complex and elimination of the DASD
bottleneck on the checkpoint data set. In addition, members of the MAS are now capable of
identifying who owns the lock in the event of a JES failure.

SmartBatch, Cross System BatchPipes


SmartBatch uses a list structure in a Coupling Facility to enable cross-system BatchPipes.
Cross-system BatchPipes is also known as pipeplex. In order to establish a pipeplex, the
BatchPipes subsystems on each of the z/OS images in the Parallel Sysplex cluster must have
the same name and connectivity to a common Coupling Facility list structure. It is possible to
have multiple pipeplexes within a single Parallel Sysplex cluster by using different subsystem
names. Each pipeplex must have its own list structure.
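
A toy sketch of the cross-system pipe idea follows. This is illustrative Python only, not BatchPipes —
the shared list structure is stood in for by a simple queue — but it shows how a writer step on one
system and a reader step on another can overlap through a common structure.

```python
# Toy cross-system pipe (illustrative; not BatchPipes): a writer job on one
# system appends records to a shared list structure and a reader job on another
# system consumes them, so the two job steps can overlap instead of passing the
# data serially through an intermediate data set.
from collections import deque

pipe = deque()                                   # stands in for the CF list structure

def writer_step(records):
    for record in records:
        pipe.append(record)

def reader_step():
    consumed = []
    while pipe:
        consumed.append(pipe.popleft())
    return consumed

writer_step(["REC1", "REC2", "REC3"])            # job step running on SYSA
print(reader_step())                             # job step on SYSB reads the same pipe
```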

VTAM GR (non LU6.2) for TSO


VTAM V4.2 and subsequent releases provide the ability to assign a generic resource name to
a group of active application programs that provide the same function. The generic resource
name is assigned to multiple programs simultaneously, and VTAM automatically distributes
sessions among these application programs rather than assigning all sessions to a single
resource. Thus, session workloads in the network are balanced. Session distribution is
transparent to the end user.
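
A minimal sketch of the idea (illustrative Python; the instance names and the least-loaded selection
rule are assumptions, not VTAM's actual algorithm, which also takes WLM input into account):

```python
# Illustrative only: a generic resource name maps to several real application
# instances, and each new session is routed to one of them (here: least loaded).

generic_resources = {
    "TSOPROD": {"TSOA": 120, "TSOB": 95, "TSOC": 143},   # instance -> session count
}

def open_session(generic_name):
    instances = generic_resources[generic_name]
    target = min(instances, key=instances.get)   # pick the least-loaded instance
    instances[target] += 1
    return target

for _ in range(3):
    print("session routed to", open_session("TSOPROD"))
```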

DFSMShsm Common Recall Queue


The common recall queue (CRQ) is a single recall queue that is shared by multiple
DFSMShsm™ hosts. The CRQ enables the recall workload to be balanced across each of
these hosts. This queue is implemented through the use of a Coupling Facility (CF) list
structure. Prior to this enhancement, recall requests were processed only by the host on
which they were initiated.

Enhanced Catalog Sharing


In a resource sharing Parallel Sysplex cluster, each image must have read/write access to
both the Master Catalog and User Catalogs. The state of the catalog(s) must be accessible
from all systems in the cluster to allow for data integrity and data accessibility. As the
number of systems in the cluster grows, serialized access to catalogs could have negative
sysplex-wide impact. OS/390 R7 contains enhancements that will improve the performance
of shared catalogs in a Parallel Sysplex environment. With Enhanced Catalog Sharing (ECS),
the catalog control record describing each catalog is copied into the Coupling Facility instead
of DASD. This reduces the number of I/Os to the shared catalogs and improves overall
system performance.

WebSphere MQ (MQSeries) Shared Message Queues


WebSphere® MQ for z/OS V5.3 supports shared queues for persistent and non-persistent
messages stored in Coupling Facility list structures. Applications running on multiple queue
managers in the same queue-sharing group anywhere in the Parallel Sysplex cluster can
access the same shared queues for:

• High availability
• High capacity
• Application flexibility
• Workload balancing
You can set up different queue sharing groups to access different sets of shared queues.

All the shared queues stored in the same list structure can be accessed by applications
running on any queue manager in the same queue-sharing group anywhere in the Parallel
Sysplex cluster. Each queue manager can be a member of a single queue-sharing group only.

The MQSeries structures support System-Managed Rebuild for planned and unplanned
reconfigurations. Failure isolation is not required, since persistent messages are recovered
from the logs and non-persistent messages have no application impact if the queue is lost.
The MQ structures also support System-Managed CF Structure Duplexing for high
availability.

Workload Manager (WLM) Support for Multisystem Enclave Management


The Workload Manager is enhanced in OS/390 Version 2 Release 9 to support Multisystem
Enclaves. An enclave can be thought of as the "anchor point" for a transaction spread across
multiple dispatchable units executing in multiple address spaces. The resources used to
process the work can now be accounted to the work itself (the enclave), as opposed to the
various address spaces where the parts of the transaction may be running. This can now be
managed as a unit across multiple systems. Multisystem Enclaves support provides
system management and performance benefits.

Workload Manager support for IRD


Intelligent Resource Director consists of the functions of LPAR CPU Management, Dynamic
Channel Path Management, and Channel Subsystem Priority Queuing. The scope of
management is the set of all LPARs in the same z900 Server in the same Parallel Sysplex
cluster. The first two functions require a CF structure.

CPU Management provides z/OS WLM with the ability to dynamically alter the logical
partition (LP) weights of each of the logical partitions within the LPAR cluster, thereby
influencing PR/SM so that each logical partition receives the percentage of shared CPU
resources necessary to meet its workload goals.
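
The basic idea can be sketched as a small adjustment loop. The Python below is purely illustrative —
the step size and selection rule are invented, and WLM's real goal-based algorithm is far more
sophisticated — but it shows weight flowing toward the partition missing its goal while the total
weight of the LPAR cluster stays constant.

```python
# Toy model (not the WLM algorithm): shift a small amount of LPAR weight toward
# the partition that is missing its goal the most, keeping the total constant.

def adjust_weights(weights, performance_index, step=5):
    """performance_index > 1.0 means the partition is missing its goal."""
    needy = max(performance_index, key=performance_index.get)
    donor = min(performance_index, key=performance_index.get)
    if performance_index[needy] > 1.0 and weights[donor] > step:
        weights[needy] += step
        weights[donor] -= step
    return weights

weights = {"PRODA": 60, "PRODB": 30, "TEST": 10}
pi = {"PRODA": 1.4, "PRODB": 0.8, "TEST": 0.6}    # PRODA is missing its goal
print(adjust_weights(weights, pi))                # weight flows from TEST to PRODA
```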

Dynamic Channel Path Management (DCM) allows z/OS WLM to dynamically reconfigure
channel path resources between control units within the cluster in order to ensure that
business-critical applications' I/O processing requirements, necessary to meet their workload
goals, are not delayed due to the unavailability or overutilization of channel path resources.

The Workload Manager Multisystem Enclaves and IRD structures support System Managed
Rebuild to move structures and System-Managed duplexing for high availability.
Additionally, for both of these structures, there are no failure isolation requirements.

• If the Multisystem Enclave structure cannot be rebuilt, the system falls back to the
pre-Multisystem Enclave management way of managing the enclaves: Each enclave is
managed on a per-LPAR basis.
• If the IRD structure is unavailable, it gets reallocated on the other CF as soon as its loss is
detected. For LPAR CPU Management, the structure is refreshed within 10 seconds (the
WLM interval). For DCM, the history of CHPID usage is lost, and the system needs to
relearn this for optimal performance. This is done in two minutes. During that time, the
system runs no worse than in pre-DCM mode.

Recommended Configurations
Data Sharing Enabled represents those customers that have established a functional Parallel
Sysplex environment doing database datasharing. Please reference the "Target
Environments" section to understand the overall benefits when running in application
enablement mode.

The following exploiters fall into this category.

Data Sharing Enabled Exploiters

Exploiter | Function | Benefit
DB2 V4/V5 | Shared Data | Increased Capacity beyond single CEC / Availability
DB2 V6 | Shared Data | Increased Capacity, Availability, DB2 Cache Duplexing
DB2 IRLM | Serialization (Multisystem database locking) | Integrity/Serialization of Database datasharing
VTAM | Generic Resource | Availability / Load Balancing
VTAM (Multi-Node Persistent Sessions) | MNPS | Availability
CICS TS R1.0 | CICS LOG Manager | Faster Emergency Restart, simplified systems management
DFSMS R1.3 | Shared VSAM/OSAM | Increased Capacity beyond single CEC / Availability
IMS V5/V6 | Shared Data | Increased Capacity beyond single CEC / Availability
IMS V5/V6 IRLM | Serialization (Multisystem database locking) | Integrity/Serialization of Database datasharing
IMS V6 SMQ | Transaction Balancing | Increased Capacity beyond single CEC / Availability
IMS V6 Shared VSO | Shared Data | Increased Capacity beyond single CEC / Availability
RRS | OS/390 Synch point manager | Multisystem distributed synch point coordination

Appendix C. Data Sharing Exploiters - Usage and Value


DB2 V4 or V5, Shared Data
Before DB2 V4, only the distributed function of DB2 allowed for read and write access
among multiple DB2 subsystems. Data sharing implemented in DB2 V4 allows for read and
write access to DB2 data from different DB2 subsystems residing on multiple CPCs in a
Parallel Sysplex environment.

A DB2 data sharing group is a collection of one or more DB2 subsystems accessing shared
DB2 data. Each DB2 subsystem in a datasharing group is referred to as a member of that
group. All members of the group use the same shared catalog and directory and must reside
in the same Parallel Sysplex cluster.
DB2 data sharing uses the following Coupling Facility structures:
• A Coupling Facility list structure contains the DB2 Shared Communication Area (SCA). Each DB2
member uses the SCA to pass control information throughout the datasharing group. The SCA
contains all database exception status conditions and other information necessary for recovery of
the data sharing group.
• The Coupling Facility lock structure (IRLM) is used to serialize on resources such as table spaces
and pages. It protects shared DB2 resources and allows concurrency.
• One or more Coupling Facility cache structures serve as group buffer pools that can cache
these shared data pages.
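
To make the group buffer pool's role concrete, here is a toy model of page registration and
cross-invalidation. This is not DB2 internals — the class and member names are invented — but it
shows why a changed page written by one member forces the other members to refresh their local copies.

```python
# Toy model (not DB2 internals): members register interest in pages they cache
# locally; when a member writes a changed page to the group buffer pool, the
# other registered members' local copies are cross-invalidated.

class GroupBufferPool:
    def __init__(self):
        self.registered = {}            # page -> set of interested members

    def register(self, page, member):
        self.registered.setdefault(page, set()).add(member)

    def write_changed_page(self, page, writer, local_pools):
        for member in self.registered.get(page, set()):
            if member != writer:
                local_pools[member].discard(page)   # cross-invalidate

gbp = GroupBufferPool()
local = {"DB2A": {"PAGE1", "PAGE2"}, "DB2B": {"PAGE1"}}
for member, pages in local.items():
    for page in pages:
        gbp.register(page, member)

gbp.write_changed_page("PAGE1", writer="DB2A", local_pools=local)
print(local)    # DB2B no longer trusts PAGE1 and will re-read it
```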

For an in-depth discussion on DB2 Data Sharing recovery, consider reviewing DB2 on MVS
Platform: Data Sharing Recovery (SG24-2218).

It is not required to have an SFM policy active in order to rebuild SCA/Lock when a loss of
CF connectivity occurs although, as stated earlier, we strongly recommend running with an
active SFM policy.

DB2 V6 – Duplexing of Cache Structures


DB2 for OS/390 Version 6 delivers datasharing availability enhancements with duplexing
support of the group buffer pools.

This is a major improvement in Parallel Sysplex continuous availability characteristics. It is
designed to eliminate a single point of failure for the DB2 GBP caches in the CF via
DB2-managed duplexing of cache structures. This eliminates the need to perform lengthy log
recovery (reduced from hours down to minutes) in the event of a failure of the CF containing
the DB2 GBP cache structure. With DB2 (user-managed) Cache Duplexing, at the time of a
failure of the cache structure, DB2 automatically switches over to use the duplexed
structure in the other Coupling Facility. This enhancement makes it possible to safely
configure your GBPs in the ICF. DB2 duplexing is available with Version 6 of DB2 and
OS/390 Version 2 Release 6 in the base (R3 with APAR). Additionally, DB2 provides
cache duplexing support earlier via an APAR on DB2 Version 5 (PQ17797).
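
A toy model of the duplexed write and switch-over follows. This is illustrative Python only (the
classes and failure handling are invented, not DB2's implementation), but it shows why losing the CF
that holds the primary GBP no longer forces log recovery.

```python
# Toy model (not DB2's implementation): changed pages are written to both the
# primary and secondary GBP; if the CF holding the primary fails, the member
# switches to the surviving copy instead of running log recovery.

class GBPCopy:
    def __init__(self, cf):
        self.cf, self.pages, self.failed = cf, {}, False

    def write(self, page, data):
        if self.failed:
            raise ConnectionError(f"GBP in {self.cf} lost")
        self.pages[page] = data

class DuplexedGBP:
    def __init__(self, primary, secondary):
        self.primary, self.secondary = primary, secondary

    def write(self, page, data):
        if self.secondary is not None:
            self.secondary.write(page, data)       # keep the secondary copy current
        try:
            self.primary.write(page, data)
        except ConnectionError:
            # switch over: the surviving copy becomes the (simplex) structure in use
            self.primary, self.secondary = self.secondary, None

gbp = DuplexedGBP(GBPCopy("CF01"), GBPCopy("CF02"))
gbp.write("PAGE1", "v1")
gbp.primary.failed = True                          # CF01 fails
gbp.write("PAGE2", "v2")                           # datasharing continues on CF02
print(gbp.primary.cf, sorted(gbp.primary.pages))
```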

VTAM, Generic Resources


VTAM V4.2 and subsequent releases provide the ability to assign a generic resource name to
a group of active application programs that provide the same function. The generic resource
name is assigned to multiple programs simultaneously, and VTAM automatically distributes
sessions among these application programs rather than assigning all sessions to a single
resource. Thus, session workloads in the network are balanced. Session distribution is
transparent to the end user.

IMS Version 6 also has support provided for VTAM Generic Resources to allow session load
balancing across multiple IMS Transaction Managers. It provides a single system image for
the end user. It allows the end user to logon to a generic IMS Transaction Manager name
and VTAM to perform session load balancing by establishing the session with one of the
registered IMS TMs. As the workload grows, IMS customers need the power that distributing
the data and applications across the Parallel Sysplex cluster provides. This growth can be
accomplished while maintaining the single system image for their end users. End users are
able to access applications and data transparently, regardless of which IMS system in the
Parallel Sysplex cluster is processing that request. Generic Resource support allows the end
user to log on to a single generic IMS name that represents multiple systems. This provides
for easy end-user interaction and improves IMS availability. It also improves systems
management with enhanced dynamic routing and session balancing.

VTAM, Multi-Node Persistent Session


SNA network attachment availability is significantly improved in a Parallel Sysplex
configuration through VTAM Multi-Node Persistent Session support in OS/390 R3. With
MNPS, client session state information is preserved in the Coupling Facility across a system
failure, avoiding the need to reestablish sessions when clients relogon to host systems within
the Parallel Sysplex cluster. This significantly reduces the outage time associated with getting
clients reconnected to the server systems.

CICS TS R2.2, CICS LOG Manager


CICS TS R1.0 and subsequent releases exploit the System Logger for CICS backout and
VSAM forward recovery logging as well as auto-journalling and user journalling. Exploiting
the System Logger for the CICS backout log provides faster emergency restart. Exploiting the
System Logger for VSAM forward recovery logs and for journalling allows merging of the
logged data across the Parallel Sysplex cluster in real-time. This simplifies forward recovery
procedures and reduces the time to restore the VSAM files in the event of a media failure.
CICS TS R2.2 enhancements include sign-on retention for persistent sessions, automatic
restart of CICS data sharing servers, and system-managed rebuild of Coupling Facility
structures. These include the Named Counters, Data Tables, and Temporary Storage
structures.

For further details see the CICS Release Guide, GC33-1570 and the CICS Recovery and
Restart Guide, SC33-1698.

VSAM RLS
CICS TS R1.0 and the subsequent releases provide support for VSAM Record Level Sharing.
This function supports full VSAM datasharing, that is, the ability to read and write VSAM data
with integrity from multiple CICS systems in the Parallel Sysplex environment. This
provides increased capacity, improved availability and simplified systems management for
CICS VSAM applications. In addition, the CICS RLS support provides partial datasharing
between CICS systems and batch programs. For further details see the CICS Release Guide,
GC33-1570.

VSAM RLS requires CF structures. A CF lock structure with name IGWLOCK00 must be
defined. In addition, CF cache structures must be defined. The number and names of the
CF cache structures are defined by the customer installation. Refer to publication
DFSMS/MVS® V1R3 DFSMSdfp™ Storage Administration Reference SC26-4920-03 for
details of how to specify the CF cache structure names in the SMS Base Configuration.

VSAM RLS provides nondisruptive rebuild for its CF lock and cache structures. Rebuild may
be used for planned CF structure size change or reconfiguration. In the event of CF failure or
loss of connectivity to CFs containing RLS structures, RLS internally rebuilds the structures
and normal operations continue. However, the lock rebuild requires failure independence.
The lock structure may be System-Managed duplexed for high availability.

Transactional VSAM (DFSMStvs) is available as a priced feature on z/OS 1.4. Based on
VSAM/RLS, it allows full read/write shared access to VSAM data between batch jobs and
CICS online transactions. In addition to the RLS requirements, DFSMStvs requires the
z/OS System Logger.

IMS V5, Shared Data


IMS n-way datasharing is IMS/DB full function support, which allows for more than 2-way
data sharing.

Each z/OS image involved in IMS n-way Parallel Sysplex datasharing must have at least one
IRLM. Each IRLM is connected to the IRLM structure in a Coupling Facility.

With IMS data sharing, the concept of a data sharing group applies. A datasharing group is
defined using the CFNAMES control statement. The IMS subsystems, in both R5 and R6 of
IMS, are members of this group.

Three Coupling Facility structure names must be specified for Parallel Sysplex datasharing.
The names are for the:

• IRLM structure (the IRLM lock table name)


• OSAM structure
• VSAM structure

IMS V5 Parallel Sysplex datasharing uses the OSAM and VSAM structures for buffer
invalidation only.

Like the VSAM/RLS and DB2 counterparts, IMS datasharing can provide increased
application availability and virtually unlimited horizontal growth.

IMS V6, Shared Data


Support is provided to utilize the Coupling Facility for store-through caching (storing
changed data) and for storing clean data (unaltered data) for OSAM databases. In IMS V5,
OSAM/VSAM datasharing did not cache the actual data; it used the structures in directory-only
mode. IMS V6 allows OSAM database buffers to be selectively cached in a central Coupling
Facility for increased performance.
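
The contrast between the two caching modes can be sketched as follows. This is a hypothetical Python
model, not IMS internals — the class names are invented — but it shows that a directory-only structure
can only tell other systems to invalidate a buffer, while a store-through cache can also satisfy the
re-read from the CF.

```python
# Sketch (not IMS internals): a directory-only structure tracks which systems
# hold a buffer so it can be invalidated on update, while a store-through cache
# also keeps the data, so a re-read can be satisfied from the CF instead of DASD.

class DirectoryOnlyCache:
    def __init__(self):
        self.interest = {}                       # block -> systems holding a buffer

    def register(self, block, system):
        self.interest.setdefault(block, set()).add(system)

    def update(self, block, writer):
        return self.interest.get(block, set()) - {writer}   # systems to invalidate

class StoreThroughCache(DirectoryOnlyCache):
    def __init__(self):
        super().__init__()
        self.data = {}

    def update(self, block, writer, new_data=None):
        self.data[block] = new_data              # data is also written to the CF
        return super().update(block, writer)

    def read(self, block):
        return self.data.get(block)              # a hit avoids a DASD I/O

osam = StoreThroughCache()
osam.register("BLK1", "IMSA")
osam.register("BLK1", "IMSB")
print("invalidate on:", osam.update("BLK1", writer="IMSA", new_data="v2"))
print("IMSB re-reads from CF:", osam.read("BLK1"))
```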

IMS/ESA Version 6 (IMS V6) SMQ


The IMS Shared Message Queue (SMQ) solution allows IMS/ESA® customers to take
advantage of the Coupling Facility to store and share the IMS message queues among
multiple IMS TM systems. Incoming messages from an IMS TM on one Central Processor
Complex (CPC) in the Parallel Sysplex cluster could be placed on a shared queue in the
Coupling Facility by the IMS Common Queue Server (CQS) for processing by an IMS TM on
another CPC in the Parallel Sysplex cluster. This can provide increased capacity and
availability for their IMS systems. This SMQ solution takes advantage of the MVS function
Cross System Extended Services (XES) to access the Coupling Facility, and the Automatic
Restart Manager (ARM) to improve availability by automatically restarting the components in
place or on another CPC. With shared queues, a message can be processed by any IMS
sharing the queues. This can result in automatic workload distribution as the messages can
be processed by whichever IMS subsystem has the capacity to handle the message. An IMS
customer can use shared queues to distribute work in the Parallel Sysplex cluster. In a
shared queues environment, all transaction-related messages and message switches are put
out on shared queues, which permits immediate access by all the IMSes in a shared queues
group. This provides an alternative to IMS transfer of messages across MSC links within a
shared queues group. The IMS SMQ solution allows IMS customers to take advantage of the
Parallel Sysplex environment, providing increased capacity, incremental horizontal growth,
availability and workload balancing.
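
A toy model of the pull-based distribution follows (illustrative Python only; the queue and the notion
of "free regions" are simplifications invented for this sketch, not the CQS interfaces):

```python
# Toy shared-queue model (not IMS CQS): any IMS in the shared-queues group can
# take the next message, so work naturally flows to whichever system has capacity.
from collections import deque

shared_queue = deque(f"TRAN{i}" for i in range(6))     # messages on the CF queue

def pull_work(ims_name, free_regions):
    """An IMS with free dependent regions pulls messages off the shared queue."""
    work = []
    while free_regions > 0 and shared_queue:
        work.append((ims_name, shared_queue.popleft()))
        free_regions -= 1
    return work

# IMSB happens to have more spare capacity, so it processes more of the queue.
print(pull_work("IMSA", free_regions=2))
print(pull_work("IMSB", free_regions=4))
```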

Fast Path Shared Expedited Message Handler (EMH): This provides the capability to
process Fast Path transactions in a Parallel Sysplex environment using the Coupling Facility.
This enables workload balancing of Fast Path transactions in a Parallel Sysplex environment
and hence provides increased capacity. This capability can also improve the system and data
availability to the end user since multiple IMS subsystems can have access to the shared
EMH queues in the Coupling Facility. This provides for increased throughput in Fast Path
Message Processing.

IMS V6, Shared VSO


IMS V6 Fastpath support provides block level datasharing of IMS Fast Path Data Entry Data
Base (DEDB) Virtual Storage Option (VSO) databases, using the Coupling Facility. The VSO
structure can be duplexed. This provides for improved workload balancing and offers
increased capacity and growth in moving into Parallel Sysplex environments.

This function, along with other IMS V6 functions, removes previous restrictions on shared Fast
Path DEDBs.

For information on handling the recovery from these situations, refer to IMS/ESA V6
Operations Guide, SC26-8741, or the ITSO redbook S/390 MVS Parallel Sysplex Continuous
Availability SE Guide, SG24-4503.

VSO DEDB
There is no structure rebuild support for VSO DEDB Areas, either manual or automatic.

If two structures are defined for a specific area, then all data is duplexed to each structure,
performed by IMS. This enables each structure to act as a backup for the other in the case of
any connectivity failure from one or all IMSes, structure failure, or CF failure. In any of these
cases, all IMS subsystems are informed and processing continues from the remaining
structure.

The decision whether to define two structures for any specific area is installation-dependent.
As each structure is a store-in cache, there is always the possibility that at any given point in
time, data has not been hardened to DASD. Thus, any loss of a single structure will
inevitably require DB recovery action, which could be time consuming. This problem is
alleviated by the System-Managed CF Structure Duplexing support for these structures.

RRS, z/OS SynchPoint Manager


Many computer resources are so critical to a company's work that the integrity of these
resources must be protected. If changes to the data in the resources are corrupted by a
hardware or software failure, human error, or a catastrophe, the computer must be able to
restore the data. These critical resources are called protected resources or, sometimes,
recoverable resources. Resource recovery is the protection of the resources. Resource
recovery consists of the protocols, often called two-phase commit protocols, and program
interfaces that allow an application program to make consistent changes to multiple protected
resources.

In OS/390 Release 3 (with additional functions in following releases), IBM provided commit
coordination, or synchpoint management, as an integral part of the system through the
OS/390 Recoverable Resource Management Services (RRMS). These services enabled
transactional capabilities in recoverable resource managers. Additionally, these services
provided Protected Communication Resource Managers with distributed transactional
capabilities in many OS/390 application environments.

The Recoverable Resource Management Services consist of three parts for managing
transactional work in OS/390: 1) Context Services for the identification and management of
work requests, 2) Registration Services for identifying a Resource Manager to the system, and
3) Resource Recovery Services or RRS to provide services and commit coordination for all
the protected resource managers participating in a work request's transactional scope in many
OS/390 application environments. Through RRMS capabilities, extended by
communications resource managers, multiple client/server platforms can have distributed
transactional access to OS/390 Transaction and Database servers.

RRS uses five log streams that are shared by the systems in a sysplex. Every MVS image with
RRS running needs access to the Coupling Facility and the DASD on which the system logger
log streams are defined.

Note that you may define RRS log streams as Coupling Facility log streams or as DASD-only
log streams.

The RRS images on different systems in a sysplex run independently but share logstreams to
keep track of the work. If a system fails, RRS on a different system in the sysplex can use the
shared logs to take over the failed system's work.
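
The two-phase commit protocol that RRS coordinates can be sketched in a few lines. The Python below
is purely illustrative — the resource manager classes and the log list are invented, not the RRS
interfaces — but it shows the prepare/commit flow and why the decision is hardened to a log stream.

```python
# Minimal two-phase commit sketch (illustrative; not the RRS interfaces): the
# syncpoint manager asks every resource manager to prepare, and only if all of
# them vote yes does it tell them to commit; otherwise everything is backed out.

def two_phase_commit(resource_managers, log):
    votes = {rm.name: rm.prepare() for rm in resource_managers}   # phase 1
    decision = "commit" if all(votes.values()) else "backout"
    log.append(decision)                       # hardened to the RRS log stream
    for rm in resource_managers:               # phase 2
        rm.commit() if decision == "commit" else rm.backout()
    return decision

class FakeRM:
    def __init__(self, name, can_commit=True):
        self.name, self.can_commit, self.state = name, can_commit, "in-flight"
    def prepare(self):  return self.can_commit
    def commit(self):   self.state = "committed"
    def backout(self):  self.state = "backed out"

log = []
rms = [FakeRM("DB2"), FakeRM("IMS"), FakeRM("MQ", can_commit=False)]
print(two_phase_commit(rms, log), [(rm.name, rm.state) for rm in rms])
```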

Recommended Configurations

Sample Data Sharing Configuration — Without System-Managed CF Structure Duplexing

In this configuration, a standalone CF holds the structures that require failure isolation, while an
ICF holds the remaining structures:

• Standalone CF: DB2 SCA, DB2 GBP (S), DB2 IRLM, IMS IRLM, IMS SMQ Logger, VSAM RLS Lock,
  VTAM GR LU6.2, VTAM MNPS, RRS Logger, CICS Logger
• ICF: CICS Temp Store, CICS Data Tables, CICS Named Counters, IMS VSO, IMS OSAM, IMS VSAM,
  IMS CQS, VSAM RLS Cache, VTAM G/R non-LU6.2, DB2 GBP (P)

This configuration provides:
• Continuous Application Availability
• Unlimited Capacity
• Single System Image
• Application Investment Protection
• Flexible and Scalable Growth

For simplicity, only the Primary duplexed structures are shown. Although IMS OSAM and
VSAM structures do support System Managed CF Structure Duplexing, they are not shown as
duplexed since they obtain little value from this.

Sample Data Sharing Configuration — With System-Managed CF Structure Duplexing

In this configuration, two ICFs are used and the duplexed structures are spread across them:

• ICF 1: IMS OSAM, IMS VSAM, IMS SMQ (P), DB2 GBP (P), IMS CQS (P), CICS Temp Store (P),
  CICS Data Tables (P), CICS Named Counters (P), VSAM RLS Cache (P), VTAM G/R (P)
• ICF 2: IMS VSO (P), IMS IRLM (P), IMS SMQ Logger (P), DB2 SCA (P), DB2 IRLM (P),
  CICS Logger (P), VSAM RLS Lock (P), RRS Logger (P), VTAM MNPS (P)

This configuration provides:
• Continuous Application Availability
• Unlimited Capacity
• Single System Image
• Application Investment Protection
• Flexible and Scalable Growth

Appendix D. Server Packaging


IBM eServer zSeries Packaging — CP – SAP – ICF – Options

Model  PUs  CPs  SAPs (Std + Opt)  Max ICFs  Spare PUs
101 12 1 2+3 8 9
102 12 2 2+3 7 8
103 12 3 2+3 6 7
104 12 4 2+3 5 6
105 12 5 2+3 4 5
106 12 6 2+3 3 4
107 12 7 2+2 2 3
108 12 8 2+1 1 2
109 12 9 2+0 0 1

1C1 20 1 3+5 15 16
1C2 20 2 3+5 14 15
1C3 20 3 3+5 13 14
1C4 20 4 3+5 12 13
1C5 20 5 3+5 11 12
1C6 20 6 3+5 10 11
1C7 20 7 3+5 9 10
1C8 20 8 3+5 8 9
1C9 20 9 3+5 7 8
110 20 10 3+5 6 7
111 20 11 3+5 5 6
112 20 12 3+4 4 5
113 20 13 3+3 3 4
114 20 14 3+2 2 3
115 20 15 3+1 1 2
116 20 16 3+0 0 1

100 12 0 2+0 9 9
The z900 Servers each support up to 32 IC (internal) links. In addition, they support up to 16 ICB-3 and 8 ICB
(Compatibility) links, although RPQs exist to increase the ICB capability to 12 or 16 at the cost of Parallel
channels and OSA connectivity. Up to 42 ISC/ISC-3 links can be installed. The total number of links (IC, ICB,
ICB-3, ISC/ISC3) is a maximum of 64. These numbers hold for all z900 models.

IBM S/390 G6 Server Packaging — CP – SAP – ICF – Options


Model  PUs  CPs  SAPs (Std + Opt)  Max ICFs  Max ICBs  Max STIs  Max HiPerLinks
X17 14 1 2+0 11 18 24 32
X27 14 2 2+0 10 18 24 32
X37 14 3 2+1 9 18 24 32
X47 14 4 2+2 8 18 24 32
X57 14 5 2+3 7 18 24 32
X67 14 6 2+4 6 18 24 32
X77 14 7 2+5 5 18 24 32
X87 14 8 2+4 4 18 24 32
X97 14 9 2+3 3 18 24 32
XX7 14 10 2+2 2 18 24 32
XY7 14 11 2+1 1 18 24 32
XZ7 14 12 2+0 0 18 24 32

Z17 14 1 2+0 11 18 24 32
Z27 14 2 2+0 10 18 24 32
Z37 14 3 2+1 9 18 24 32
Z47 14 4 2+2 8 18 24 32
Z57 14 5 2+3 7 18 24 32
Z67 14 6 2+4 6 18 24 32
Z77 14 7 2+5 5 18 24 32
Z87 14 8 2+4 4 18 24 32
Z97 14 9 2+3 3 18 24 32
ZX7 14 10 2+2 2 18 24 32
ZY7 14 11 2+1 1 18 24 32
ZZ7 14 12 2+0 0 18 24 32

Appendix E. CF Alternatives Questions and Answers


Typical questions and their associated answers relative to IBM's Coupling Facility and its usage.

Question 1 – Does ICF require an LPAR?

Answer – Yes. ICF provides Coupling Facility MIPS by leveraging CPs in the spare pool. The
Coupling Facility Control Code (CFCC) always runs in a Logical Partition – regardless of
how the machine is configured, whether in a standalone CF or not.

Question 2 – How is ICF selected from the HMC?

Answer – On machines with the ICF feature installed, see the Image Profile – Processor Page (on the
HMC). You will notice there is a new option that will allow you to select ICF as a CP
source. You must be in LPAR mode to see the option.

Question 3 – Are ICFs sized the same as standalone CFs?

Answer – Yes, ICF should still be sized in the traditional fashion using formulas and techniques that
are subsystem dependent. ICF provides the MIPS part of the sizing equation – Links and
Storage are still needed. Links are not needed in some cases, e.g., ICMF or use of Internal
Channel (IC).

Question 4 – Should ICF be used in all coupling configurations?

Answer – No, what is very important is to first understand the failure independence that is required by
the customer. For customers doing application datasharing without all the data sharing
structures duplexed, at least one standalone CF is strongly recommended. But an ICF can
be an excellent backup for this standalone CF, or ICFs can be used exclusively in systems-enabled
configurations.

Question 5 – Is adding an ICF subject to the same MP effects as traditional z/OS engines?

Answer – Yes. It is similar to adding another z/OS engine in all ways except in what is being run on it.
The ICF will vie for resources (storage hierarchy, busses within the system, etc.) with
whatever is running on the other engines. The impact is between 0.5% and 1% per
ICF engine. This is less than the effect of adding another CP to a z/OS partition.

Question 6 – If we choose to leverage ICF on an IBM machine, will this affect software pricing?

Answer – No, due to the fact that the CPs required to support the ICF are dedicated to the coupling
function, no IBM software license charges apply to the partition.

Question 7 - Which is better: ICFs or standalone CFs?

Answer - Performance: Since ICFs can share the same footprint as the z/OSs that they connect to,
efficient IC links can be employed. This yields the best performance (with pricing
advantages). For a Resource Sharing-only environment, this is the recommended
configuration. In a Data Sharing environment, many structures require failure-isolation.
Since there is a performance hit to System-Managed CF Structure Duplexing, configuring a
standalone CF as the primary location to hold the structures requiring failure-isolation would
provide the best performance.

Availability: System-Managed CF Structure Duplexing removes the availability concerns of an
ICF-only environment for those structures being duplexed. Generally, not all structures will
be duplexed, due to performance issues. In addition, some microcode upgrades needed
for the Coupling Facility are disruptive and the CEC would need to be re-IMLed. This
affects the z/OS workload. For these reasons, standalone CFs provide better availability.

System Management: Processor family upgrades are sometimes needed to support capacity
requirements. If an ICF is configured on the same footprint, then the ICF would
automatically be upgraded at the same time, certainly keeping it “within one generation” of
the z/OS workload. In addition, fewer footprints to manage, install, power, etc. make ICFs
easier to manage than standalone CFs.

Software Pricing: Since neither standalone CFs nor ICFs can run z/OS, the capacity used for
both of these is not included in the software licensing charge calculations. If
running a “normal LPAR” as a CF using general purpose CPs, then the capacity used for this
coupling facility is assumed to be used for z/OS and it will be used for calculating software
licensing fees.

Overall: In a Resource Sharing-only environment, two ICFs are recommended. Once Data
Sharing is introduced, at least one standalone CF should be considered.

Question 8 – Under which circumstances should I consider enabling Dynamic Dispatch on a coupling
facility? Dynamic Coupling Facility Dispatch (DCFD) is an enhancement to CF LPARs that
consist of shared CP or ICF processors. With Dynamic CF Dispatch enabled, the CFCC
LPAR voluntarily gives up use of the processor when there is no work to do. Then the CFCC
LPAR redispatches itself periodically at dynamically adjusted intervals based on the CF
request rate. This uses very little CP resource, leaving the general purpose CPs to process the
normal production work.

Answer – The tradeoff for a decreased usage of production CP or ICF processors is an increase in
response time from the Coupling Facility with Dynamic Dispatch enabled. Any request that
arrives at the coupling facility when it has given up its use of the processor will wait. So
enabling Dynamic Dispatch should be limited to environments where responsive performance
is not a primary objective, such as a test CF image. It could also be used as a backup CF. If
the primary CF should fail, the backup CF in the production system would sense an
increasing queue of CF work and begin employing the general purpose CPs to execute the
work. As the activity rate to this backup CF increases, its response time improves at the
expense of increased usage of the CP.
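
The interval adjustment can be sketched as follows. The formula below is invented for illustration
(it is not IBM's actual algorithm), but it captures the tradeoff: long intervals at low request rates
save CP, while short intervals at high request rates restore responsiveness.

```python
# Sketch of the idea behind Dynamic CF Dispatch (the formula is invented): when
# few requests arrive, the CF image sleeps longer between dispatches and uses
# little CP; as the request rate rises, the interval shrinks and response time
# improves at the cost of more CP usage.

def next_interval_ms(request_rate_per_sec, min_ms=1, max_ms=500):
    if request_rate_per_sec <= 0:
        return max_ms                        # idle: back way off
    interval = 1000.0 / request_rate_per_sec
    return max(min_ms, min(max_ms, interval))

for rate in (0, 5, 100, 5000):               # e.g. a backup CF picking up work
    print(f"{rate:>5} req/s -> redispatch every {next_interval_ms(rate):.1f} ms")
```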

Question 9 – Can we consider the Internal Coupling Facility (ICF) as an option that can help a customer
to get a low cost entry CF for migration and/or to support his development environment?

Answer – Yes, starting with the G3, the CMOS servers provide the Internal Coupling Facility (ICF)
feature running as a Coupling Facility without increasing software licensing costs. ICF can
work either in CFCC or ICMF mode providing a low cost CF which can be used as an entry
level, test/development or single system production CF.

Question 10 – Why would I want a Parallel Sysplex environment when I have no plans to move to an
application datasharing environment?

Answer – The IT industry is moving as rapidly as possible to clustered computer architectures. The
z/OS Parallel Sysplex environment is a highly advanced, state of the art clustered
computer architecture because of its advanced datasharing technology. An additional benefit
is the Parallel Sysplex License Charge (PSLC) pricing feature. Prior to database datasharing,
customers receive value through the resource sharing capabilities of exploiters like VTAM,
GRS, JES2, System Logger, Catalog,
IRD, etc. This allows fail-over capability and single point of control for systems management.

Question 11 – Is the Coupling Facility a good vehicle for XCF signaling or should we still use CTCs?

Answer – With HiPerLinks available starting on the G3 servers and on the C04 Coupling Facilities,
the performance of XCF signaling through the CF is comparable to CTCs. Use of multiple XCF
signaling structures in multiple CFs is essential for high availability. More information is
available in the WSC Flash Parallel Sysplex Performance: XCF Performance Considerations
(ibm.com/support/docview.wss?uid=tss1flash10011).

Question 12 – Should I use System-Managed CF Structure Duplexing for all of my datasharing structures?

Answer – System-Managed Duplexing has additional performance costs over "simplex" mode. This
includes the additional CF request to the duplexed structure for each update, time for the
duplexed structure to coordinate the request, and z/OS receiving the additional response
and reconciling them. This has to be balanced with the benefits including eliminating the
requirement for a standalone CF.

Question 13 – What other benefits does System-Managed Duplexing provide?

Answer – The primary purpose of System-Managed Duplexing is to provide a rapid recovery in a loss
of connectivity or failure situation. This is especially true in the case of a structure that would
otherwise not support recovery such as CICS Temporary Storage, or MQSeries Shared
Queues.

Question 14 – Where can I get more information on configuring System-Managed Duplexing?

Answer – The technical paper "System-Managed CF Structure Duplexing," GM13-0100, contains
detailed information on what it is, how it works, and the environments where it would provide
the most benefit.

Copyright IBM Corporation 2004

IBM Corporation
Software Communications
Route 100
Somers, NY 10589
U.S.A.

Produced in the United States of America


07/04
All Rights Reserved

IBM, the IBM logo, IBM eServer, the e-business logo, BatchPipes, CICS, DB2, DFSMS/dfp, DFSMShsm,
DFSMS/MVS, ESCON, GDPS, Geographically Dispersed Parallel Sysplex, IMS, IMS/ESA, MQSeries, MVS,
MVS/ESA, OS/390, Parallel Sysplex, PR/SM, RACF, S/390, Sysplex Timer, VTAM, WebSphere, z/OS,z/VM, and
zSeries are trademarks or registered trademarks of International Business Machines Corporation of the United
States, other countries or both.

Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States,
other countries or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Intel is a trademark of Intel Corporation in the United States, other countries or both.

Other company, product and service names may be trademarks or service marks of others.

Information concerning non-IBM products was obtained from the suppliers of their products or their published
announcements. Questions on the capabilities of the non-IBM products should be addressed with the suppliers.

IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our
warranty terms apply.

IBM may not offer the products, services or features discussed in this document in other countries, and the
information may be subject to change without notice. Consult your local IBM business contact for information on
the product or services available in your area.

All statements regarding IBM’s future direction and intent are subject to change or withdrawal without notice, and
represent goals and objectives only.

Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard
IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary
depending upon considerations such as the amount of multiprogramming in the user’s job stream, the
I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given
that an individual user will achieve throughput improvements equivalent to the performance ratios
stated here.

GF22-5042-06
