Customizing IBM Tivoli Workload Scheduler for z/OS V8.2 to Improve Performance

Optimize performance in large Tivoli Workload Scheduler for z/OS environments

Pure mainframe and end-to-end scheduling scenarios

Best practices based on real-life experience
Vasfi Gucer Anna Dawson Art Eisenhour Stefan Franke Clive Kennedy John Misinkavitch Stephen Viola
ibm.com/redbooks
International Technical Support Organization

Customizing IBM Tivoli Workload Scheduler for z/OS V8.2 to Improve Performance

November 2004
SG24-6352-00
Note: Before using this information and the product it supports, read the information in "Notices" on page ix.
First Edition (November 2004) This edition applies to IBM Tivoli Workload Scheduler for z/OS and IBM Tivoli Workload Scheduler Version 8, Release 2.
Copyright International Business Machines Corporation 2004. All rights reserved. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents
Notices
Trademarks

Preface
The team that wrote this redbook
Become a published author
Comments welcome

Chapter 1. Introduction
1.1 Definition of Tivoli Workload Scheduler performance
1.2 Factors affecting performance
1.3 Measuring performance
1.4 Our lab environment
1.5 New functions related to performance
    1.5.1 Multiple first-level domain managers
    1.5.2 Improved SCRIPTLIB parser
    1.5.3 Check server status before Symphony file creation
    1.5.4 Improved joblog retrieval performance

Chapter 2. Tivoli Workload Scheduler for z/OS subtask interaction
2.1 How Tivoli Workload Scheduler for z/OS interacts with the current plan
    2.1.1 The event manager
    2.1.2 Initialization statements affecting event handling
    2.1.3 The general service task
    2.1.4 Initialization statements affecting the general service task
    2.1.5 The normal mode manager
    2.1.6 Initialization parameters that affect the normal mode manager
    2.1.7 The workstation analyzer
    2.1.8 Parameters that affect the workstation analyzer
    2.1.9 Controlling access to the current plan
    2.1.10 Balancing access to the current plan
2.2 Tuning Tivoli Workload Scheduler for z/OS throughput
    2.2.1 Breakdown of the workstation analyzer task
    2.2.2 Improving workstation analyzer throughput
    2.2.3 Software solutions
    2.2.4 Find-a-winner algorithm
2.3 A day in the life of a job

Chapter 3. Optimizing Symphony file creation and distribution
3.1 End-to-end processing and performance issues
3.2 Symphony file creation and distribution
3.3 Preliminary testing (50,000 or fewer FTA jobs)
    3.3.1 FTA tuning parameters (localopts)
    3.3.2 Script library as a PDS in LLA instead of a PDSE
    3.3.3 Centralized scripts
3.4 Initial test results (250,000 FTA jobs)
3.5 How to tune E2E for faster Symphony creation and distribution
    3.5.1 z/OS UNIX System Services tuning and data set placement
    3.5.2 UNIX System Services tuning
    3.5.3 zFS tuning
    3.5.4 TOPOLOGY parameters
    3.5.5 FTA tuning
    3.5.6 Centralized scripts
3.6 Tuning results and recommendations
3.7 Additional tuning changes (z/OS UNIX System Services)
    3.7.1 Using an empty EQQSCLIB
    3.7.2 Defining SHAREOPTION(1) for EQQSCPDS file
    3.7.3 Defining BUFFERSPACE for EQQSCPDS file
3.8 Final tuning results for Symphony creation and distribution
3.9 Recommendations based on tuning results

Chapter 4. Optimizing the UNIX System Services environment
4.1 UNIX System Services overview
    4.1.1 UNIX overview
    4.1.2 What people like about UNIX
    4.1.3 What people do not like about UNIX
    4.1.4 UNIX operating system
    4.1.5 UNIX file system
    4.1.6 MVS and UNIX functional comparison
    4.1.7 z/OS UNIX System Services fundamentals
    4.1.8 Address spaces
    4.1.9 What people like about z/OS UNIX
    4.1.10 What people do not like about z/OS UNIX
4.2 z/OS UNIX performance tuning
4.3 z/OS UNIX file systems
    4.3.1 Hierarchical File System (HFS)
    4.3.2 Network File System (NFS)
    4.3.3 Temporary File System (TFS)
    4.3.4 zSeries File System (zFS)
    4.3.5 zFS aggregates
    4.3.6 Installing a zFS
    4.3.7 Tuning zFS
4.4 HFS and zFS comparison

Chapter 5. Using Tivoli Workload Scheduler for z/OS effectively
5.1 Prioritizing the batch flows
    5.1.1 Why do you need this?
    5.1.2 Latest start time
    5.1.3 Latest start time: Calculation
    5.1.4 Latest start time: Maintaining
    5.1.5 Latest start time: Extra uses
    5.1.6 Earliest start time
    5.1.7 Balancing system resources
    5.1.8 Workload Manager integration
    5.1.9 Input arrival time
    5.1.10 Exploit Tivoli Workload Scheduler for z/OS restart capabilities
5.2 Designing your batch network
5.3 Job streams in an end-to-end environment
5.4 Moving JCL into the JS VSAM files
    5.4.1 Pre-staging JCL tests: Description
    5.4.2 Pre-staging JCL tests: Results tables
    5.4.3 Pre-staging JCL conclusions
5.5 Recommendations
    5.5.1 Pre-stage JCL
    5.5.2 Optimize JCL fetch: LLA
    5.5.3 Optimize JCL fetch: Exits
    5.5.4 Best practices for tuning and usage of resources
    5.5.5 Implement EQQUX004
    5.5.6 Review your tracker and workstation setup
    5.5.7 Review initialization parameters
    5.5.8 Review your z/OS UNIX System Services and JES tuning

Chapter 6. Data store considerations
6.1 What is the data store?
6.2 When to use the data store
6.3 Data store performance evaluation
6.4 Conclusions and recommendations

Chapter 7. Optimizing Job Scheduling Console performance
7.1 Factors affecting the Job Scheduling Console performance
7.2 Applying the latest fixes
7.3 Resource requirements
7.4 Setting the refresh rate
7.5 Setting the buffer size
7.6 Minimize the JSC windows to force the garbage collector to work
7.7 Number of open editors
7.8 Number of open windows
7.9 Applying filters and propagating to JSC users
7.10 Java tuning
7.11 Startup script

Chapter 8. Troubleshooting
8.1 E2E troubleshooting: Installation
    8.1.1 EQQISMKD
    8.1.2 EQQDDDEF
    8.1.3 EQQPCS05
    8.1.4 EQQPH35E message after applying or installing maintenance
8.2 Security issues with E2E
    8.2.1 Duplicate UID
    8.2.2 E2E server user ID not eqqUID
    8.2.3 CP batch user ID not in eqqGID
8.3 E2E PORTNUMBER and CPUTCPIP
    8.3.1 CPUTCPIP not same as nm port
    8.3.2 PORTNUMBER set to PORT reserved for another task
    8.3.3 PORTNUMBER set to PORT already in use
    8.3.4 TOPOLOGY and SERVOPTS PORTNUMBER set to same value
8.4 E2E Symphony switch and distribution problems
    8.4.1 EQQPT52E cannot switch to new Symphony file
    8.4.2 CP batch job for E2E is run on wrong LPAR
    8.4.3 Changing the OPCMASTER that an FTA should use
    8.4.4 No valid Symphony file exists
    8.4.5 DM and FTAs alternate between linked and unlinked
8.5 Other E2E problems
    8.5.1 Delay in Symphony current plan (SCP) processing
    8.5.2 E2E server started before TCP/IP initialized
    8.5.3 Jobs run at wrong time
    8.5.4 CPUTZ defaults to UTC due to invalid setting
    8.5.5 Domain manager (DM) file system full
    8.5.6 CP batch job starting before file formatting has completed
    8.5.7 EQQW086E in controller EQQMLOG
8.6 OMVS limit problems
    8.6.1 MAXFILEPROC value set too low
    8.6.2 MAXPROCSYS value set too low
    8.6.3 MAXUIDS value set too low
8.7 Other useful E2E-related information
    8.7.1 Restarting an E2E FTA from the distributed side
    8.7.2 Adding or removing an E2E FTA
    8.7.3 Reallocating the EQQTWSIN or EQQTWSOU file
    8.7.4 E2E server SYSMDUMP with Language Environment (LE)
8.8 Troubleshooting the data store
8.9 Where to find messages in UNIX System Services
8.10 Where to find messages in an end-to-end environment

Appendix A. Using the EQQUX000 and EQQUX002 exits
The workstation analyzer (WSA) subtask
Improving workstation analyzer throughput
EQQUX000 and EQQUX002: Installation
EQQUX000 and EQQUX002: Implementation
    JCL fetch from specific libraries
    Model JCL fetched from JCL libraries
    Model JCL fetched from storage
    Insertion of JCL cards before first EXEC or PROC statement
    Amend CLASS= on JOBCARD
    General usage notes
How to define data store destinations on a per job basis with EQQUX002

Appendix B. Gathering statistics
Gathering statistics
Changing statistics gathering and more
Using the job tracking log data

Appendix C. Additional material
Locating the Web material
Using the Web material
    System requirements for downloading the Web material
    How to use the Web material

Abbreviations and acronyms

Related publications
IBM Redbooks
Other publications
Online resources
How to get IBM Redbooks
Help from IBM

Index
Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurement may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.
Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both: AIX, CICS, DB2, DFS, Domino, IBM, IMS, Language Environment, Lotus, MVS, NetView, RACF, RAMAC, Redbooks, Redbooks (logo), RMF, Tivoli, VTAM, WebSphere, z/OS, and zSeries.
The following terms are trademarks of other companies: Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. Linux is a trademark of Linus Torvalds in the United States, other countries, or both. Other company, product, and service names may be trademarks or service marks of others.
Preface
Scheduling is generally considered the nucleus of the data center, because the orderly, reliable sequencing and management of process execution is an essential part of IT management. IBM Tivoli Workload Scheduler for z/OS is the strategic IBM product used in many large and midsized customer environments, responsible for scheduling critical batch applications. The performance of Tivoli Workload Scheduler for z/OS is therefore one of the important factors that affect overall satisfaction with the IT services of these companies.

This IBM Redbook covers the techniques that can be used to improve the performance of Tivoli Workload Scheduler for z/OS (including end-to-end scheduling). Many factors might affect the performance of any subsystem. In this book, we confine ourselves to those that are internal to Tivoli Workload Scheduler, can be easily verified and modified, and are likely to apply to the majority of Tivoli Workload Scheduler customer sites.

Although this book is aimed at those very large installations with a batch load of 100,000 or more jobs per day, it is also relevant to installations with a smaller batch workload that are suffering from a shrinking batch window, are trying to maximize the throughput on their existing hardware, or both.
implementation, and exploitation of their batch scheduling environment. She has many years of experience with the IBM Tivoli Workload Scheduler product and has focused on several aspects of the product's performance and usability. She is currently engaged in the migration of the scheduling environments of many IBM customers from other schedulers to IBM Tivoli Workload Scheduler for z/OS.

Art Eisenhour is a certified Consulting IT Specialist in the IBM Americas Advanced Technical Support organization. He is skilled in a number of IBM mainframe products and the integration of these products, which include Tivoli Workload Scheduler for z/OS and the Tivoli Workload Scheduler end-to-end feature, Tivoli NetView and z/OS, Tivoli Business Systems Manager, Systems Automation for z/OS, and automation with Tivoli InfoMan. Art joined IBM in 1965, has experience in developing large systems and network solutions, and has consulted on systems management to customers in industries such as banking, chemical, manufacturing, power utilities, and steel.

Stefan Franke is a Customer Support Engineer of IBM Global Services based in the Central Region Support Center in Mainz, Germany. He is a member of the EMEA Tivoli Workload Scheduler Level 2 Support Team. In 1992, he began to support z/OS System Management software. Since 1994, he has mainly worked for Tivoli Workload Scheduler Support. His areas of expertise include installation and tuning, defect and non-defect problem determination, and on-site customer support.

Clive Kennedy is an IBM Accredited Senior IT Specialist. He has more than 25 years of experience in systems and network management in mainframe, distributed, and end-to-end environments. His current role is as a technical consultant with IBM Software Services for Tivoli, based in the Pan-EMEA team from the U.K. He provides consultancy on Tivoli Workload Scheduler for z/OS in mainframe and end-to-end deployments, as well as other IBM Tivoli software. He is an IBM Certified Tivoli Workload Scheduler Deployment Professional.

John Misinkavitch is an Accredited Senior IT Specialist who has worked in Systems Management for more than 20 years. He has worked as a systems programmer, as a developer, and as a support specialist. He has worked at IBM for five years and now specializes in Tivoli Workload Scheduler for the distributed version and Tivoli Workload Scheduler for z/OS and the Tivoli Workload Scheduler end-to-end feature. He has participated in multiple Tivoli Workload Scheduler installations for both the distributed and end-to-end versions.

Stephen Viola is an Advisory Software Engineer for IBM Tivoli Customer Support, based in Research Triangle Park, North Carolina. He is a member of the Americas Tivoli Workload Scheduler Level 2 Support Team. In 1997, he began to support Tivoli System Management software. Since 2003, he has worked primarily on Tivoli Workload Scheduler for z/OS, especially data store and E2E. Before joining IBM in 1997, he was a systems programmer and
performance analyst. His areas of expertise include installation and tuning, problem determination, and on-site customer support.

Thanks to the following people for their contributions to this project:

Elizabeth Barnes, Budi Darmawan
International Technical Support Organization, Austin Center

Robert Haimowitz
International Technical Support Organization, Poughkeepsie Center

Mark Fantacone, Kim Querner
IBM U.S.

Anna Filomena Bufi, Maria Pia Cagnetta, Marco Cardelli, Rossella Donadeo, Paolo Falsi, Antonio Gallotti, Xavier Giannakopoulos, Vito Longo, Valeria Perticara, Stefano Proietti, Roberto Tomassi
IBM Italy

Hans Olsson
IBM Sweden

Paul B. Eaton
IBM U.K.
Comments welcome
Your comments are important to us!
We want our Redbooks to be as helpful as possible. Send us your comments about this or other Redbooks in one of the following ways:
- Use the online "Contact us" review Redbook form found at:
  ibm.com/redbooks
- Mail your comments to:
  IBM Corporation, International Technical Support Organization
  Dept. JN9B Building 905
  11501 Burnet Road
  Austin, Texas 78758-3493
Chapter 1. Introduction
In this chapter, we first introduce the overall subject of IBM Tivoli Workload Scheduler (Tivoli Workload Scheduler) performance, what might affect it, and how it can be measured. Second, we explain our lab hardware environment. Finally, we talk about the new performance-related enhancements in Tivoli Workload Scheduler for z/OS Version 8.2. We assume that you already have an understanding of the Tivoli Workload Scheduler architecture and plan distribution and job submission processes. However, where appropriate, these are reviewed or expanded upon. We also assume that you have a sound working knowledge of Tivoli Workload Scheduler scheduling techniques and practices. The following topics are covered in this chapter:
- Definition of Tivoli Workload Scheduler performance
- Factors affecting performance
- Measuring performance
- Our lab environment
- New functions related to performance
With that in mind, the factors that can affect performance are summarized in Table 1-1.
Table 1-1 Factors that can affect performance

Daily plan build:
- System software parameters: JES2
- VSAM tuning
- Tivoli Workload Scheduler subsystem parameters

Symphony file creation and distribution:
- System software parameters: JES2, USS
- VSAM tuning
- Tivoli Workload Scheduler for z/OS subsystem parameters
- Tivoli Workload Scheduler fault tolerant agent (FTA) options
- Topology design
- Job stream design

Job submission and status feedback:
- System software parameters: JES2, USS
- VSAM tuning
- Tivoli Workload Scheduler for z/OS subsystem parameters
- Tivoli Workload Scheduler FTA options
- Topology design
- Job stream design
- Tivoli Workload Scheduler for z/OS scheduling parameters and techniques (for example, use of accurate deadline times and durations to correctly prioritize jobs for submission)
- Data store parameters
- Tivoli Workload Scheduler for z/OS exits

User interaction:
- Job stream design
- JSC parameters
In this book, we talk about most of these factors that affect performance. Specifically:
- Refer to 3.5, "How to tune E2E for faster Symphony creation and distribution" on page 33 for optimizing the daily plan build.
- Refer to Chapter 3, "Optimizing Symphony file creation and distribution" on page 27 and Chapter 4, "Optimizing the UNIX System Services environment" on page 53 for best practices for optimizing the Symphony file creation.
- Refer to Chapter 2, "Tivoli Workload Scheduler for z/OS subtask interaction" on page 11, Chapter 5, "Using Tivoli Workload Scheduler for z/OS effectively" on page 97, Chapter 6, "Data store considerations" on page 121, and Appendix A, "Using the EQQUX000 and EQQUX002 exits" on page 177 for optimizing job submission and status feedback.
- Refer to Chapter 7, "Optimizing Job Scheduling Console performance" on page 131 and 5.3, "Job streams in an end-to-end environment" on page 112 for optimizing the user interaction.
We could get the time stamp of the Sinfonia file on the distributed agents to calculate the Symphony distribution times. For the data store, we used the start time of the first job and end time of the last job, plus CPU times and EXCP counts. It should be noted that we used clock time as a relative measurement of the effects of each parameter adjustment, not as an absolute indication of what a customer will achieve. The whole area of user interface performance is problematic. Although measurements and statistics can present the bare facts, performance at the screen is often a matter of user perception; what one user considers acceptable, another might not, and what is considered good or bad performance might be influenced by recent user experiences. In Tivoli Workload Scheduler for z/OS specifically, it should be noted that there are not only differences in performance between the Job Scheduling Console (JSC) and Interactive System Productivity Facility (ISPF) interfaces, but also significant differences in the way they present
information to the user. Users are likely to have a preference for one over the other, based on criteria other than just performance. A user's ability and knowledge of how to respond to information, and the speed with which they can interact with the interface, are just as important, if not more so, than the speed at which information is delivered to the interface. Although our goal was to identify techniques to improve the performance of IBM Tivoli Workload Scheduler, there are other criteria against which Tivoli Workload Scheduler has to be judged, for example, usability. There is little point spending time and effort speeding up the delivery of information to the user interface if the net result is that the data displayed on the interface is so user-unfriendly that it actually impairs the overall performance of the user interaction. We have been mindful of the bigger picture, and where performance tuning might cause conflicts with other functional or usability criteria, we have highlighted this. Due to time constraints, we were unable to specifically perform testing on tuning the performance of the user interfaces; however, we did include in this book a chapter about tuning the Job Scheduling Console (JSC) for better performance (see Chapter 7, "Optimizing Job Scheduling Console performance" on page 131).
Table 1-2 Our lab environment (distributed). [The column layout of this table did not survive; the recoverable details follow.] AIX 5L Version 5.2.0.0 hosts Dallas, Houston, Elpaso, Milan, Helsinki, Stockholm, and Rome (FTA names AXDA, AXAD, AXHO, AXOH, AXEL, AXLE, AXMI, AXIM, AXHE, AXST, AXTS, AXRO, and AXOR; 256 MB to 1024 MB of memory, 512 MB of page space, and one to four processors each); Red Hat Linux 8.0 and Red Hat Linux AS 2.1 hosts Edinburg and Ankara (FTA names LXED, LXDE, LXAN, and LXNA; 512 MB to 1024 MB of memory, 1020 MB to 2048 MB of swap space, P3 0.9 GHz and P4 1.80 GHz processors); and Windows 2000 Server machines (FTA names W2FL, W2IZ, W2ZI, W2LI, W2IL, W2AM, and W2MA; 256 MB to 512 MB of memory on x86 Family 6 and P4 1.80 to 3 GHz processors).
Figure 1-1 Tivoli Workload Scheduler network with two first-level domains: OPCMASTER (z/OS) at the top; first-level domain managers DMZ (DomainZ) and DMY (DomainY), both on AIX; lower-level domains DomainA (AIX), DomainB (HPUX), and DomainC (HPUX); and fault tolerant agents FTA1 (AIX), FTA2 (Linux), FTA3 (Windows 2000), and FTA4 (Solaris).
The information in the SCRPTLIB member must be parsed every time a job is added to the Symphony file (both at Symphony creation and dynamically). In Tivoli Workload Scheduler V8.1, the TSO parser was used, but this caused a major performance issue: up to 70% of the time taken to create a Symphony file was spent parsing the SCRIPTLIB library members. In Tivoli Workload Scheduler V8.2, a new parser has been implemented, significantly reducing the parsing time and, consequently, the Symphony file creation time.
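For reference, a SCRPTLIB (EQQSCLIB) member that defines a non-centralized script is only a few statements long; a minimal sketch (the script path and user ID are illustrative):

JOBREC JOBSCR('/opt/tws/scripts/payroll.sh')  /* script to run on the FTA */
       JOBUSR(tws)                            /* user that runs the job   */

The new parser has to process statements like these for every FTA job added to the Symphony file, which is why the parser change has such a visible effect on plan creation time.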
Chapter 2. Tivoli Workload Scheduler for z/OS subtask interaction
2.1 How Tivoli Workload Scheduler for z/OS interacts with the current plan
In this section, we discuss three of the controller subtasks (the event manager, the workstation analyzer, and the general service task) and how they interact with each other and with the current plan.
The submit events are:

IJ0    Submit synch event
IJ1    Submit job JCL
IJ2    Submit started-task JCL
IJ3    Submit stand-alone cleanup job
IWTO   Submit WTO message
IREL   Submit a release command
User-created events (created with the EQQUSINx subroutines (EQQEVPGM) or the corresponding TSO commands) are sorted between jobs in event-creation order. Events for operations started by the z/OS tracker agents are X-type event records. Events started by distributed agents are 0-type event records. When the event manager gets the current plan lock, it processes each event in turn from its queue. The results of this processing are written to the job tracking log. If an event is received that cannot be matched correctly against an operation in the current plan, the event is placed on a suspend queue. The suspend queue is checked every time the event manager gets the current plan lock. If the event is on the suspend queue for five minutes, the event is either discarded, or if it was suspended due to being received out of order, it is processed and might result in an operation being placed in error status.
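As an illustration, a user-created event can be generated with a batch job such as the following sketch (the library names and the subsystem name TWSC are assumptions):

//USEREVNT JOB (ACCT),'USER EVENT',CLASS=A
//SRSTAT   EXEC PGM=EQQEVPGM
//STEPLIB  DD DISP=SHR,DSN=TWS.V8R20.SEQQLMD0
//EQQMLIB  DD DISP=SHR,DSN=TWS.V8R20.SEQQMSG0
//EQQMLOG  DD SYSOUT=*
//SYSIN    DD *
SRSTAT 'PAYROLL.EXTRACT.READY' SUBSYS(TWSC) AVAIL(YES)
/*

Events such as this join the event manager's queue alongside tracker events and are processed in event-creation order, as described above.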
EWTROPTS
This statement provides options for the event writer task. The following parameters of this statement affect performance.
PRINTEVENTS
Using a value of NO prevents print events (type 4) from being passed to the controller. This is only valid if you do not want to track printing.
STEPEVENTS
Using a value of NZERO or ABEND will reduce the number of step events passed to the controller.
EWSEQNO
Use this parameter when the tracker is connected by the cross-system coupling facility (XCF) or SNA network communication function (NCF), but not when trackers communicate by shared DASD. It causes the tracker to write events to both the event data set and the communications task at the same time, rather than writing them to the data set and then reading them back to the communications task. This speeds up the delivery time of the event from the tracker to the controller (it eliminates two I/O operations per event).
EXITS
CALL04 is a parameter of the EXITS statement.
CALL04
Use this parameter to load EQQUX004 (if you have written one). The exit can be used to filter out events of no interest to the controller before they are sent. An example would be to exclude test jobs that can be identified by a job name prefix.
OPCOPTS
JCCTASK is a parameter of the OPCOPTS statement.
JCCTASK
Do not use the JCC task unnecessarily. The tracker cannot pass job termination events to the controller until after the JCC has processed the joblog. Where JCC must be used, consider its usage carefully. The JCC processes jobs one at a time, so try to have a particular output class for JCC processing and create job-specific table entries, rather than using the general table. If this task is activated, consideration should be given to all the parameters coded for the JCCOPTS statement.
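Pulling the tracker-side options above together, an EQQPARM member for a tracker might look like the following sketch (values are illustrative; EWSEQNO assumes an XCF or NCF connection to the controller):

OPCOPTS  OPCHOST(NO)        /* this address space is a tracker          */
         EWTRTASK(YES)      /* start the event writer                   */
         JCCTASK(NO)        /* no JCC unless output scanning is needed  */
EWTROPTS EWSEQNO(1)         /* write events straight to XCF/NCF         */
         PRINTEVENTS(NO)    /* do not track printing                    */
         STEPEVENTS(NZERO)  /* step-end events only for non-zero codes  */
EXITS    CALL04(YES)        /* load the EQQUX004 event-filtering exit   */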
OPCOPTS
GSTASK is a parameter of the OPCOPTS statement.
GSTASK
This parameter enables you to specify how many dialog requests can be handled simultaneously, up to a maximum of five. The default is 5.
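For example, the controller might code the following (the value shown is the default):

OPCOPTS GSTASK(5)   /* handle up to five dialog requests in parallel */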
JTOPTS
The following parameters of the JTOPTS statement affect normal mode manager performance.
BACKUP
The value used here determines how often the NMM instigates an internal backup of the CP file. A value of NO enables the user to control when this backup takes place by executing the BACKUP command.
MAXJSFILE
The value used here determines how often the NMM instigates an internal backup of the JS file. A value of NO enables the user to control when this backup takes place by executing the BACKUP command. Note that stopping Tivoli Workload Scheduler for z/OS from performing an internal backup automatically, without having a regularly triggered process to replace it, will affect performance and might eventually bring down the controller.
The CP backup should be done no less than four times in any 24-hour period, at times that suit your disaster and contingency plans. The timing of the JS backups will depend on the frequency and timing of manual updates to the run-time JCL, but will probably be less frequent than the CP backups.
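As an illustration, a controller that takes backups only under external control might combine these settings with a BACKUP command scheduled at suitable times (the subsystem name TWSC is an assumption):

JTOPTS BACKUP(NO)     /* CP backup only when externally triggered */
       MAXJSFILE(NO)  /* JS backup only when externally triggered */

The triggered backup itself can then be issued through EQQEVPGM (as in the earlier sketch) or the equivalent TSO command:

BACKUP RESOURCE(CP) SUBSYS(TWSC)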
JTOPTS
QUEUELEN is the parameter of the JTOPTS statement that affects workstation analyzer performance.
QUEUELEN
This parameter defines the maximum number of ready operations that the WSA starts when it gets control of the current plan. The lowest (and default) value is 5.
update to do. When each task gets control, it does as much as it can until it comes to the end of its queue, reaches a specified cut-off point, or is signalled to stop by another task. The amount of time each task holds the lock on the current plan, and how much time it spends waiting, is maintained by Tivoli Workload Scheduler for z/OS and displayed in the started task message log if CPLOCK is included in the STATMSG keyword of the JTOPTS initialization statement. The frequency at which these statistics are issued defaults to the value used on the BACKUP keyword of the JTOPTS initialization statement: the statistics are displayed when more than 50% of that number of events have been received since the last display. The STATIM parameter can be used to set a more suitable frequency for the statistics messages. These parameters are discussed later in this book. Figure 2-1 shows the CP lock process.
Figure 2-1 The current plan lock: only one of the event manager, the workstation analyzer, and the general service task holds the lock at any time. The event manager clears its EVENTS queue; the workstation analyzer works through the READY queue until the QUEUELEN limit is reached, calling the JCL fetch routine for every READY operation that requires JCL; and the general service task then takes its turn.
Only the workstation analyzer, the event manager, or the general service task can have access to the current plan at any one time. Each task must wait for access, although their queue of work might be increasing. The NMM task is not included in Figure 2-1, because its impact on performance is not great.
It is highly unlikely that there will be no wait time showing against any of the tasks, because they generally all have something else to do when they relinquish the CP lock and will enqueue again. The least busy is probably the general service task; however, that is the task used by the external users of Tivoli Workload Scheduler for z/OS, so its responsiveness affects their view of performance. Operators have every right to expect speedy responses. The last thing they want is responses so poor that, when they attempt to stop an operation from running, they find it has already started. Similarly, they need to correct jobs that have failed (and will elongate the batch window) as quickly as possible, not wait an extra five minutes just to re-submit them. The obvious way to achieve this is to keep to a minimum the time that the event manager and the workstation analyzer need to update the current plan. Even when the events have been filtered by the tracker tasks, the event manager nearly always has events to pass on. The time the EM takes is related to the number of events queued. The length of the queue depends on how long the EM must wait for the WSA to release the CP lock. The event manager processes events very quickly, but in order to let the general service task in quickly, the number of events on the queue should be kept to a minimum. You control this through the workstation analyzer. With the WSA, you have a task that can be influenced. You can determine how long it holds the lock and, consequently, how long the other two tasks wait. The QUEUELEN parameter determines how many operations are scheduled by the WSA before the CP lock is released.
But if the number of ready operations on the combined workstations' ready list (the dynamic operation area, or DOA, queue) rises rapidly, some jobs could continually be relegated to the back of the queue, delaying their submission for a considerable time. How much time depends on how long it takes the WSA to select an operation for submission and actually place it in started status, multiplied by the number of jobs in the queue.
The number of jobs in the queue will continually change. This is because new operations are reaching ready status as the event manager completes their predecessors and the resources that operations require are freed up or allocated elsewhere. Striking the correct balance between job submission rate (QUEUELEN) and dialog responses is difficult without a thorough knowledge of the batch cycle and a Tivoli Workload Scheduler for z/OS environment tuned to provide the best WSA throughput possible. It is possible to dynamically change the QUEUELEN value by issuing the F subsys,QUELEN=nnnn command (0 to 9999, a minimum of 5 is enforced).
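A sketch of the related controller settings and the operator command, with illustrative values (the subsystem name TWSC is an assumption):

JTOPTS QUEUELEN(40)     /* start up to 40 ready operations per CP lock */
       STATMSG(CPLOCK)  /* write CP lock hold and wait statistics      */
       STATIM(15)       /* issue the statistics every 15 minutes       */

The limit can then be raised dynamically for a peak submission period with, for example, F TWSC,QUELEN=100, and lowered again when dialog responsiveness matters more than submission rate.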
6. Passes the JCL to the external router for transmission to the appropriate tracker. 7. Checks if the QUEUELEN value has been exceeded (maximum number of loops is QUEUELEN value) and either starts searching the DOA again, or releases the lock. As this list shows, the WSA has quite a lot to do. If the ready queue is large, it has to search it at least six times and fetch the JCL for six operations. This find-a-winner routine is done very quickly, because the DOA queue will almost certainly be in storage. At least 50% of the current plan should be in memory. Due to the frequency with which the DOA queue is referenced, it is unlikely to have been paged out. Fetching the JCL from the JS file should also be quite fast. Although it uses an I/O call, it is to a keyed record in a VSAM file (which should be on a cached unit). It is the fetch of the JCL from the PDS library that takes the most time. Figure 2-2 on page 21 shows the JCL fetch routine. The moving of JCL from the data set (for example, a PDS) to the VSAM file (JS) for further processing can be a major bottleneck, depending on the size of the directories that Tivoli Workload Scheduler for z/OS needs to search.
Figure 2-2 The JCL fetch routine: if the JCL is already in the JS file, it is passed straight to the submit task; if not, and EQQUX002 is used, the exit can supply the JCL; otherwise, the member is located in the EQQJBLIB concatenation and, if found, passed to the submit task.
Regardless of which method is used to improve the speed of promotion of the JCL from the PDS library to the VSAM file, the best improvement that can be made to WSA throughput is to ensure that the JCL for the job is already in the VSAM file when Tivoli Workload Scheduler for z/OS wants to submit it. The most appropriate method of pre-staging JCL into the JS file is by using the Tivoli Workload Scheduler for z/OS program interface (PIF). There are several samples of PIF usage in the SAMPLIB shipped with Tivoli Workload Scheduler for z/OS. However efficient the staging program is, it will still be bound by the performance problems that affect the WSA fetch times. The main problem is that Tivoli Workload Scheduler for z/OS searches the EQQJBLIB concatenation for the JCL. To find a member in the last data set of the concatenation, Tivoli Workload Scheduler for z/OS must read the directories of all preceding PDS libraries. When they contain thousands of members, this can be very time-consuming. In order to circumvent this directory search and go directly to a small operation-specific library, use EQQUX002. For improved performance of EQQUX002, we recommend that you use EQQUX000 (the Tivoli Workload Scheduler for z/OS stop/start exit) to do the open/close routines needed for the JCL PDS libraries. A sample EQQUX000/EQQUX002 combination is documented in "The workstation analyzer (WSA) subtask" on page 178.
[The first rows of the job timeline table in 2.3, "A day in the life of a job," did not survive extraction; the surviving cells show times 18:45, 19:45, 21:00, and 21:26, the event manager (EM) and workstation analyzer (WSA) as the handling tasks, and operation statuses RT, RX, SU, S, SQ, and SS. The table continues below.]
Time   Activity                                                             Task  Status
21:33  The tracker sends the A3S event (step ended) for a non-zero
       completion code back to the controller. This will not be an issue,
       because this particular non-zero code is acceptable and has been
       defined in the noerror list.                                         EM    SS
21:37  The job ends and the job completion event, A3J, is sent to the
       controller.                                                          EM    SS
21:37  The output processing for this job completes in JES, and the A3P
       event is generated and sent to the controller.                       EM    C
Chapter 3. Optimizing Symphony file creation and distribution
Table 3-1 (fragment) Symphony file creation: steps, programs, and messages. The rows for the earlier steps did not survive extraction; the surviving rows are:

Step  Description                     Program
6     Translate to SymUSER            EQQDNTOP
7     Rename SymUSER to Symnew        EQQDNTOP
8     Copy to Symphony and Sinfonia   output translator
9     Verify Symphony                 output translator

The surviving message cells, in order (the first group belongs to the NMM row): EQQPT30I START SWITCHING SYMPHONY; EQQ3106I WAITING FOR SCP; EQQ3107I SCP IS READY: START JOBS ADDITION TO SYMPHONY FILE; EQQ3108I JOBS ADDITION TO SYMPHONY FILE COMPLETED; EQQ3087I THE SYMPHONY FILE HAS BEEN SUCCESSFULLY CREATED; EQQN111I A NEW SYMPHONY FILE HAS BEEN CREATED; (No message); EQQPT31I SYMPHONY SUCCESSFULLY SWITCHED; EQQW090I THE NEW SYMPHONY FILE HAS BEEN SUCCESSFULLY SWITCHED.

a. The Extend LTP step is not a mandatory step to create the Symphony file, but in practice, it is usually performed to reflect the latest changes in the database, such as changing the run cycle of a job stream.
The following steps show the precise order in which the deletes, copies, and renames are done for the Symphony and Sinfonia files (step 8 in Table 3-1 on page 28):
1. Remove (delete) Symold.
2. Rename Symphony to Symold.
3. Copy Symnew to Symphony.
4. Remove Sinfold.
5. Rename Sinfonia to Sinfold.
6. Copy Symnew to Sinfonia.
7. Remove Symnew.

Note that the Sinfonia file is equivalent to the new current plan (NCP). It is not modified after it is created until the next current plan EXTEND (or REPLAN) is done. The Symphony file is modified if, for example, a new FTA job is added to the current plan.

We did each test in the same manner, as shown in the following steps:
1. Unlink the PDM or PDMs if already linked.
2. Run CP refresh (dialog 9.5) if the current plan already exists.
3. Shut down the controller (which shuts down the E2E server).
4. Set up the AD (application) database using the batch loader job.
5. Initialize the WRKDIR, and run the EQQPCS05 job.
6. Change the localopts and TOPOLOGY parameters as needed.
7. Initialize EQQSCPDS, EQQTWSIN, and EQQTWSOU.
8. Start the controller (which starts the E2E server).
9. Run the LTP create (dialog 2.2.7).
10. Run CP EXTEND (dialog 3.2).

The reason for this sequence was that we wanted to ensure that each test was totally independent of any previous tests. In this way, if a performance benefit was seen on any test, it was known that nothing other than the changes made for that test would have influenced the results. To ensure that an overall increase in system performance did not influence the test, each test was repeated several times. The time of day when system backups were run was not used for testing (because degraded performance could occur during this time).

In order to analyze the test results, the following phases were used to make comparisons among tests easier:
1. Start of CP EXTEND until the EQQW090I THE NEW SYMPHONY FILE HAS BEEN SUCCESSFULLY SWITCHED message.
2. EQQW090I message until the PDM (or PDMs) is LINKED and ACTIVE (EQQWL10W WORK STATION xxxx, HAS BEEN SET TO ACTIVE STATUS).
3. From PDM(s) LINKED and ACTIVE until all FTAs are linked and active.

Note that Phase 1 above consists of steps 2 through 9 in Table 3-1 on page 28.
Table 3-3 Effect of using centralized scripts on Symphony create and distribute

Phase                  Non-centralized scripts (time)  Centralized scripts (time)  Difference
Symphony create        2:35                            2:50                        +9.0%
Symphony distribution  2:16                            1:42                        -25.0%
Total                  4:51                            4:32                        -6.5%
Although the Symphony file took longer to create using centralized scripts, the distribution of the Symphony file to the PDM and FTAs was substantially faster. We decided to keep using centralized scripts in the later tests. You can find more information about centralized scripts in 3.5.6, "Centralized scripts" on page 44.
Note that because two copies of Tivoli Workload Scheduler were defined on a single machine, one copy had CPUTCPIP(31758) and CPUUSER(tws); the other copy had CPUTCPIP(15458) and CPUUSER(maestro). Non-centralized scripts were used for all 250,000 jobs (that is, each job had an EQQSCLIB member associated with it). The Hierarchical File System (HFS) was used as the file system for the BINDIR (eqqBINDIR) and WRKDIR (eqqWRKDIR). A normal speed DASD was used (RAMAC RVA), and no attempt was made to spread the key Tivoli Workload Scheduler data sets to different DASD volumes and channels. Table 3-4 shows the results of this initial test.
Table 3-4 Initial (untuned) results for Symphony create and distribute

Phase  Time (hh:mm)  Description
1      01:38         Create and switch Symphony
2      00:06         PDM linked and active
3      00:56         All FTAs linked and active
Total  02:40         Total time
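Note that the CPUTCPIP value coded for a workstation must match the netman port defined in that FTA's localopts file; a fragment using the first port value mentioned above:

# localopts fragment on the FTA (only the netman port shown)
nm port =31758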
3.5 How to tune E2E for faster Symphony creation and distribution
This section demonstrates the areas in which tuning can be performed in an E2E environment and also the results of this tuning. Areas to consider for tuning of an E2E environment include:
- z/OS UNIX System Services tuning and data set placement
- UNIX System Services tuning
- zSeries File System (zFS) tuning
- TOPOLOGY parameters
- FTA tuning
- Centralized scripts
3.5.1 z/OS UNIX System Services tuning and data set placement
The following areas were identified as candidates for z/OS UNIX System Services and data set-related tuning:
- Maximizing the region size for the controller, E2E server, and CP EXTEND job
- Making the E2E server task non-swappable
- Setting the controller, E2E server, and CP EXTEND to high priority through WLM
- Moving files to faster DASD (SHARK 2105-800 versus RVA)
- Spreading key Tivoli Workload Scheduler files onto multiple DASD volumes
- Adding BUFFERSPACE to the EQQSCPDS file
- Using LLA for the EQQJBLIB data set containing centralized scripts
- Making the controller and E2E tasks TRUSTED in RACF
Maximizing the region size for controller, E2E server, and CP EXTEND job
In the initial test, the region size for the controller and E2E server was set at 64 MB. The smallest region size in which the CP EXTEND job would run without an S80A abend was 256 MB, so that size was used. For the tuning test, the region size for all three (controller, E2E server, and CP EXTEND job) was set to 0 MB (unlimited) on the EXEC statement in the JCL, and no IEFUSI exit was in place to limit the region size.
The following z/OS console command was issued to put the change into effect:
SET SCH=xx
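This command activates a SCHEDxx parmlib member. One of the tuning changes listed earlier, making the E2E server task non-swappable, is done with a program properties table (PPT) entry in such a member. A minimal sketch, assuming the server program name EQQSERVR (the program name given in the summary table later in this chapter):

PPT PGMNAME(EQQSERVR)  /* E2E server program                    */
    NOSWAP             /* make its address space non-swappable  */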
Setting controller, E2E server, and CP EXTEND to high priority through WLM
The controller and E2E server tasks were modified to run in the SYSSTC WLM SRVCLASS instead of the default STC (default service class for address spaces) class. Likewise, the CP EXTEND job was changed to high-priority batch (BATCHHI) instead of normal batch.
Spreading key Tivoli Workload Scheduler files onto multiple DASD volumes
The following files were each placed on a separate DASD volume, with no other Tivoli Workload Scheduler files on the volume:
- EQQCKPT
- EQQCP1DS
- EQQCP2DS
- JOBLIB (EQQJBLIB) for centralized scripts
- EQQSCPDS
- EQQSCLIB (script library for non-centralized scripts)
- EQQTWSIN
- EQQTWSOU
This was done by taking a LISTCAT of the SCP file after a CP EXTEND was done and checking the HI-U-RBA value, as shown in Example 3-3.
Example 3-3  Checking the HI-U-RBA value
ALLOCATION
  SPACE-TYPE------CYLINDER
  SPACE-PRI------------200
  SPACE-SEC-------------60
  HI-A-RBA-------144179200
  HI-U-RBA-------126877696
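For reference, output like that in Example 3-3 can be produced with a standard IDCAMS LISTCAT job; this is a sketch using the cluster name from Example 3-4:

//LISTSCP  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  LISTCAT ENTRIES('TWS.SC63.SCP') ALL
/*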
In this example, because HI-U-RBA was set to 126877696, we set the BUFFERSPACE parameter about 10% higher (140000000) to allow for index records, plus some growth. The BUFFERSPACE override was put into the IDCAMS DEFINE for EQQSCPDS (distributed as job EQQPCS06), as shown in Example 3-4.
Example 3-4  IDCAMS DEFINE for EQQSCPDS
 DEFINE +
   CLUSTER ( +
     NAME('TWS.SC63.SCP') +
     REUSE +
     NONSPANNED +
     BUFFERSPACE(140000000) +
     SHR(3) +
     VOL(TWS010) +
     CYLINDERS(200 60) +
   ) +
Note that the default BUFFERSPACE parameter for the same file was just 66048 bytes.
After the CSVLLAxx member is set up, it can be put into effect by issuing the following z/OS console command:
F LLA,UPDATE=xx
In the previous example, TWSRES1 was the user ID of the controller task, and TWSRES9 was the user ID of the E2E server task (eqqUID).
Refer to 4.4, HFS and zFS comparison on page 91 for additional information concerning zFS.
Notes about multiple mailman servers: When setting up multiple mailman servers, do not forget that each mailman server process uses extra CPU resources on the workstation on which it is created, so be careful not to create an excessive number of mailman servers on low-end domain managers.

Configuring extra mailman servers was much more important in the single domain architecture (pre-Tivoli Workload Scheduler V6.1 implementations); multiple domain implementations reduced the requirement for multiple mailman server processes. Some cases where extra mailman servers might be beneficial are:
- Important FTAs that run mission-critical jobs.
- Slow-initializing FTAs at the other end of a slow link. (If you have more than a couple of workstations over a slow link connection to the OPCMASTER, a better idea is to place a remote domain manager to serve those workstations.)
- Unstable workstations: if you have unstable workstations in the network, do not put them under the same mailman server ID as your critical servers.
Example 3-8  TOPOLOGY parameter example
DOMREC  DOMAIN(UK) DOMMNGR(AXTS) DOMPARENT(MASTERDM)
DOMREC  DOMAIN(US) DOMMNGR(AXEL) DOMPARENT(MASTERDM)
CPUREC  CPUNAME(W2MA) CPUTCPIP(31758) CPUUSER(tws) CPUDOMAIN(UK) CPUSERVER(W)
CPUREC  CPUNAME(W2ZI) CPUTCPIP(31758) CPUUSER(tws) CPUDOMAIN(UK) CPUSERVER(A)
CPUREC  CPUNAME(AXOR) CPUTCPIP(31758) CPUUSER(tws) CPUDOMAIN(UK) CPUSERVER(B)
CPUREC  CPUNAME(W2FL) CPUTCPIP(15458) CPUUSER(maestro) CPUDOMAIN(US) CPUSERVER(X)
CPUREC  CPUNAME(W2IZ) CPUTCPIP(15458) CPUUSER(maestro) CPUDOMAIN(US) CPUSERVER(C)
CPUREC  CPUNAME(AXRO) CPUTCPIP(15458) CPUUSER(maestro) CPUDOMAIN(US) CPUSERVER(D)
The size of the network can help you decide whether to use a single domain or multiple domains. If you have a small number of systems or a small number of applications to manage with IBM Tivoli Workload Scheduler, there might not be a need for multiple domains.

Geographic location and the communication speed between locations are among the primary reasons for choosing a multiple domain architecture. One domain for each geographical location is a common configuration. If you choose a single domain architecture, you will be more reliant on the network to maintain continuous processing.

An IBM Tivoli Workload Scheduler network, with either a single domain or multiple domains, gives you the ability to manage IBM Tivoli Workload Scheduler from a single node: the master domain manager. If you want to manage multiple locations separately, you can consider installing a separate IBM Tivoli Workload Scheduler network at each location. Note that some degree of decentralized management is possible in a stand-alone IBM Tivoli Workload Scheduler network by mounting or sharing file systems.

There can be other reasons for choosing a multiple domain configuration, for example, a domain for each building, department, business function, or application.

The degree of interdependence between jobs is an important consideration when designing your IBM Tivoli Workload Scheduler network. If you use multiple domains, you should try to keep interdependent objects in the same domain. This decreases network traffic and takes better advantage of the domain architecture.

What level of fault tolerance is required? An obvious disadvantage of the single domain configuration is the reliance on a single domain manager. In a multi-domain network, the loss of a single domain manager affects only the agents in its own domain.
Recommended localopts changes for FTAs are:
- wr enable compression=yes (compress Symphony/Sinfonia if it is 4 MB or larger)
- sync level=low
- mm cache enable=yes
- mm cache size=512 (the maximum value, unless you are running the FTA on a low-memory machine; see the following Tip about mm cache size)
- bm look=9 (reduce from the default value of 15 and make it lower than bm read)

The following section provides more insight into these parameters.
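Collected into a localopts fragment, the recommendations above would look something like this sketch (check the exact keyword spellings against the localopts file shipped with your agent):

wr enable compression = yes
sync level            = low
mm cache enable       = yes
mm cache size         = 512
bm look               = 9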
Tip: Due to the overhead of compression and decompression, we recommend that you use compression only if Symphony/Sinfonia is 4 MB or larger. When Symphony/Sinfonia compression (wr enable compression) is used, a large number of extra messages are written to stdlist. This can be avoided by applying the fix for APAR IY58566.

Note: This parameter is not used (has no effect) in the Tivoli Workload Scheduler for z/OS localopts.
The improvement from these suggested parameter changes is not in the area of Symphony file creation and distribution, but rather in the execution of the jobs in the Symphony file. The improvement is considerable, however, and these parameter settings will be made the default values by APAR IY59076. We tested making the same TWSCCLog.properties changes to the USS version of the file (in the WRKDIR), and this did not have a measurable effect.
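For reference, the two TWSCCLog.properties settings in question (they are listed again in the summary table at the end of this chapter) are:

twsHnd.logFile.className = ccg_filehandler
tws.loggers.className    = ccg_basiclogger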
Centralized scripts tend to use and exploit Tivoli Workload Scheduler for z/OS functionality and follow that syntax; non-centralized scripts tend to use and exploit Tivoli Workload Scheduler distributed functionality and follow that syntax.

Centralized scripts are downloaded from the controller to the agent (through the USS translator and script downloader tasks) at submission time. The download is necessary every time a job runs; a rerun of the job causes the script to be downloaded again. This has performance implications in two areas:
- The size of the Symphony file, and consequently the speed of distributing it.
- The speed of job submission and dependency resolution, in that using centralized scripts automatically introduces a refer-back to the controller on z/OS.
Effects of centralized scripts on the speed of job submission and dependency resolution
Centralized scripts are downloaded when the job enters the READY status. This has a number of implications:
- It causes a loss of fault tolerance.
- It builds a delay into processing, because the job will not start until the script has been downloaded (the script is downloaded in 1 KB blocks).
- There might be a further delay if the centralized job's predecessor was also a distributed job, because the predecessor's completion status must be fed back to the controller before the controller can set the centralized job to READY (the Symphony file on the FTA will always be ahead of the controller in maintaining the status of jobs running on that FTA).
- Using centralized scripts can cause an increase in network traffic.
However, there are obviously criteria other than performance for deciding whether to use centralized or non-centralized scripts, including:

Non-centralized (local) scripts: Using non-centralized scripts makes it possible for the fault tolerant agent to run local jobs without any connection to the controller on the mainframe. Conversely, if a non-centralized script is updated, it must be updated locally on the agent. Locally placed scripts can be consolidated in a central repository on the mainframe or on a distributed system; on a daily basis, changed or updated scripts can be distributed to the fault tolerant agents where they will be used. By doing this, all scripts reside in a common repository, making modifications easy, and fault tolerance is preserved.

Centralized scripts: This makes it possible to centrally manage all scripts. Conversely, it compromises the fault tolerance of the distributed Tivoli Workload Scheduler network, because the controller must have a connection to the fault tolerant agent to be able to send the script. The centralized script option makes migration from Tivoli OPC tracker agents with centralized scripts to end-to-end scheduling much simpler.

Combination of non-centralized and centralized scripts: The third possibility is to use a combination of both. Here the decision can be based on considerations such as:
- Where a particular FTA is placed in the network
- How stable the network connection is to the FTA
- How fast the connection is to the FTA
- Special requirements for different departments to have dedicated access to their scripts on their local FTAs
For non-centralized scripts, it is still possible to have a centralized repository with the scripts, and then on a daily basis, to distribute changed or updated scripts to the FTAs with non-centralized scripts.
The results of the tuning changes described in 3.5, How to tune E2E for faster Symphony creation and distribution on page 33 are summarized in Table 3-6. Table 3-4 on page 33 shows the pre-tuning results. Note that Table 3-6 shows the effect of a number of z/OS UNIX System Services and DASD tuning changes and the use of two PDMs, each with three mailman servers. Centralized scripts were used, and the EQQJBLIB library containing the centralized scripts was placed in LLA. However, the localopts settings for the FTAs are identical to the earlier test shown in Table 3-4 on page 33.
Table 3-6  Symphony create and distribute results after some tuning

Phase   Time (hh:mm)   Description
1       01:07          Create and switch Symphony
2       00:07          PDM linked and active
3       00:46          All FTAs linked and active
Total   02:00          Total time
This represents a 25% improvement in time over the earlier test with the same number of FTA jobs. However, we felt that further improvements could be made, especially on the z/OS UNIX System Services side (create and switch Symphony or Phase 1). We discuss these changes in the next section.
In other words, there is a performance penalty for having any (even just one) FTA job that uses non-centralized scripts.
Table 3-7  Effect of empty EQQSCLIB on Symphony create time

EQQSCLIB     Phase 1 (mm:ss)   Difference
Non-empty    67:40
Empty        46:20             31.0% less
A 35% improvement, just from changing the shareoption value of one file, was a remarkable result. We tried testing the other VSAM files related to current plan processing, such as the long term plan (LT), the new current plan (NCP), and the active current plans (CP1 and CP2). Although testing showed that the shareoption of the NCP and LT files could be set to 1 without introducing any problems, and that of CP1 and CP2 could be set to 2, there was no improvement in Symphony creation time with these changes in place, so they were not implemented.
We wanted to see how important this tuning change is and also to verify that it is still important even after implementing the VSAM shareoption change (see 3.7.2, Defining SHAREOPTION(1) for EQQSCPDS file on page 48). The test shown in Table 3-8 on page 48 used the BUFFERSPACE override BUFFERSPACE(140000000). A test was done without the BUFFERSPACE override, allowing the default value of 66048 to be used. The results of this test are shown in Table 3-9.
Table 3-9  Effect of BUFFERSPACE for EQQSCPDS on Symphony create

SCP BUFFERSPACE    Phase 1 time (mm:ss)   Difference
140000000          30:00
66048 (default)    36:00                  +20%
This shows that the BUFFERSPACE parameter has a significant impact on the Symphony creation time, even after implementing the SHAREOPTION(1) change.
Table 3-11 on page 50 shows a comparison between Table 3-4 on page 33 (before tuning) and Table 3-10 (after all tuning).
Table 3-11  Comparison of Symphony creation and distribution before and after tuning

Phase   Before tuning (hh:mm)   After tuning (hh:mm)   Difference
1       01:38                   00:30                  -69.4%
2       00:06                   00:07                  +16.7%
3       00:56                   00:46                  -17.9%
Total   02:40                   01:23                  -48.1%
All of the following changes had a major impact:

Type                          Change
FTA                           mm cache enable=yes (localopts)
FTA                           mm cache size=512 (localopts)
FTA                           twsHnd.logFile.className=ccg_filehandler (TWSCCLog.properties)
FTA                           tws.loggers.className=ccg_basiclogger (TWSCCLog.properties)
z/OS UNIX System Services     Make the E2E server (program EQQSERVR) non-swappable
z/OS UNIX System Services     Use REGION=0M for the controller, E2E server, and CP batch jobs
z/OS UNIX System Services     Set WLM priority to HIGH for the controller, E2E server, and CP batch jobs
z/OS UNIX System Services     DASD placement: high-speed DASD, plus spread key Tivoli Workload Scheduler files across multiple volumes and channels
Chapter 4.
- Full multitasking with protected memory: Multiple users can run multiple programs concurrently without interfering with each other.
- Very efficient virtual memory: Many programs can execute with only a small amount of physical memory available.
- Access controls and security: All users must be authenticated with a valid account and password to use the system. All files are owned by particular accounts, and the owner can decide whether others have read or write access to them.
- Productive development environment: For programmers, UNIX offers a rich set of tools and a command language. Commands and utilities can be strung together in unlimited ways to accomplish complex tasks.
- Unified file system: Everything is a file: data, programs, and physical devices. The entire file system appears as a single large tree of nested directories.
Technically, only the kernel and the shell form the operating system, while the utilities have evolved over time to make the operating system more immediately useful to the user.
Kernel
The kernel is the core of the UNIX operating system. It consists of a small collection of software that makes it possible for the operating system to provide other services. The kernel provides four basic types of services:
- Creation and management of processes
- The file system
- Communications
- A means to start the system

Kernel functions are of two broad types: autonomous and responsive. Kernel functions such as allocation of memory and CPU are performed without being explicitly requested by user processes. Other functions of the kernel, such as resource allocation and process creation and management, are initiated by requests from processes. UNIX users do not need to know anything about the kernel, just as TSO users do not need to know anything about MVS.
Processes
A process is the execution of a program. Some operating systems (such as MVS) call the basic unit of execution a job or task. In UNIX, it is called a process. In the UNIX kernel, anything that is done, other than autonomous operations, is done by a process issuing system calls. Processes often spawn other processes (using the fork() system call) that run in parallel with them, accomplish subtasks, and when they are finished, terminate themselves. All processes have an owner. Typically, the human owner of a process is the owner of the account whose login process spawned the process in question. When a process creates or spawns another process, the original process is known as the parent process, and the process it creates is called a child process. The child process inherits the file access and execution privileges belonging to the parent.
Signals
One way that processes communicate with each other and with the kernel is through signals. Signals are used to inform processes of unexpected external events such as a time-out or forced termination of a process. A signal consists of
a prescribed message with a default action embedded in it. Each signal has a unique number associated with it.
Virtual memory
UNIX uses paging and swapping techniques similar to MVS.
Shell
The shell is the interactive environment UNIX users encounter when they log in, similar to what MVS users encounter when they log on to TSO. The shell's prompt is usually visible at the cursor's position on the screen, similar to line mode in a TSO session. To perform work, commands are entered at the prompt. The shell is a command interpreter: it takes each command entered and passes it to the operating system kernel to be acted upon, and the results of this operation are displayed on the screen.

Several shells might be available on a UNIX system for a user to choose from, each with its own strengths and weaknesses. A user can decide to use the default shell or override it. Some of the more common shells are:
- sh: The Bourne shell
- csh: The C shell
- tcsh: The T C shell
- bash: The Bourne Again shell
- ksh: The Korn shell

Each shell also includes its own programming language. Command files, called shell scripts, are used to accomplish a series of tasks; a trivial example follows this overview.

There is also a GUI shell available for UNIX systems, called X-Windows or simply X. This GUI has all the features found on a personal computer. In fact, the version used most commonly on modern UNIX systems (CDE) is made to look very similar to Microsoft Windows.
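As the promised illustration of a shell script (a hypothetical example, not part of the product), a script is simply commands in a file:

#!/bin/sh
# list the ten largest files in the current directory
ls -l | sort -k 5 -n -r | head -10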
All files and directories are contained within the root directory. See Figure 4-1, where the shaded boxes represent directories and the unshaded boxes represent files. This is similar to the Microsoft Windows file system, except that the directory separator is a forward slash (/), compared to the Windows backslash (\). There are slight differences in the arrangement of directories between variants of UNIX; however, the overall structure is basically the same. Note that UNIX is a case-sensitive operating system, so a file called ABC is different from a file called abc.
write [w]
execute [x]
These permissions are applied and tested at three levels: the owning user ID, the owning group, and all other users.
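For example (the file name is hypothetical), the ls -l command displays these permission bits, and chmod changes them:

ls -l myscript      # shows, for example, -rwxr-x---
chmod 750 myscript  # owner: rwx, group: r-x, others: none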
Parameter files
Parameter files are typically stored in the /etc directory. This is similar to SYS1.PARMLIB on an MVS system.
Daemons
A daemon is a program that runs continuously and exists for the purpose of handling periodic service requests that a computer system expects to receive. The daemon program forwards the requests to other programs (or processes) as appropriate. Daemons are similar to started tasks (STCs) in MVS.
Accessing UNIX
To access UNIX interactively, the user has to log in to their user account using the rlogin (remote login) or Telnet interface. The rlogin and Telnet interfaces are similar except rlogin supports access from trusted hosts without requiring a password (thus security people will like this less than Telnet). Most platforms (including Microsoft Windows) include a Telnet command interface. When logging in, remember that UNIX is case-sensitive, so uppercase characters used in the user ID or password are not the same as lowercase characters. UNIX also has a console interface (similar to an MVS console), but the console interface is normally only used by system administrators or computer operators.
UIDs
The user account of a UNIX user is represented in two ways: user name and UID. The user name is an easy-to-remember word, and the UID is a number. This information might be stored in the file /etc/passwd. The UID is typically a number between 0 and 65,535, where 0 through 99 might be reserved. UID=0 has special meaning as the superuser.
Superuser (root)
Superuser is a privileged user (UID=0) who has unrestricted access to the whole system, that is, all commands and all files regardless of their permissions. By convention, the user name for the superuser account is root. Do not confuse the term root here with the root directory in the file system; they are unrelated. The root account is necessary, because many system administration files and programs need to be kept separate from the executables available to non-privileged users. Also, UNIX enables users to set permissions on the files they own, and a system administrator (root) might need to override those permissions.
GIDs
Each UNIX user is also associated with a grouping so that people in the same workgroup can share data. This grouping is represented in two ways: group name and GID. Group name is an easy-to-remember word, and the GID is a number. This information might be stored in the file /etc/group. The GID is typically a number between 0 and 65,535, where 0 through 99 might be reserved. Unlike UID, GID=0 has no special meaning.
The z/OS support for z/OS UNIX enables two open systems interfaces on the z/OS operating system:
- An application program interface (API). The application interface is composed of C interfaces. Some of the C interfaces are managed within the C run-time library (RTL), and others access kernel interfaces to perform authorized system functions on behalf of the unauthorized caller.
- An interactive z/OS shell interface.

Figure 4-3 shows the API and interactive shell open systems interfaces and their relationship.
With the APIs, programs can run in any environment, including in batch jobs, in jobs submitted by TSO/E users, in most other started tasks, or in any other MVS application task environment. The programs can request:
- Only MVS services
- Only z/OS UNIX services
- Both MVS and z/OS UNIX services
The shell interface is an execution environment analogous to TSO/E, with a programming language of shell commands analogous to the REXX language. The shell work consists of:
- Programs run by shell users
- Shell commands and scripts run by shell users
- Shell commands and scripts run as batch jobs

z/OS UNIX has two shells, the z/OS shell and the tcsh shell. They are collectively called the z/OS UNIX shells.
z/OS shell
The z/OS shell is modeled after the UNIX System V shell with some of the features found in the Korn shell. As implemented for z/OS UNIX services, this shell conforms to POSIX standard 1003.2, which has been adopted as ISO/IEC International Standard 9945-2: 1992. The z/OS shell is upward-compatible with the Bourne shell.
tcsh shell
The tcsh shell is an enhanced but completely compatible version of the Berkeley UNIX C shell, csh. It is a command language interpreter usable both as an interactive login shell and a shell script command processor. It includes a command-line editor, programmable word completion, spelling correction, a history mechanism, job control, and a C-like syntax.
Several other z/OS elements and products are used together with z/OS UNIX, including:
- C/C++ Compiler, to compile programs
- Language Environment, to execute the shell and utilities or any other XPG4-compliant shell application
- Data Facility Storage Management Subsystem (DFSMS); HFS is a component of DFSMS
- Security Server for z/OS; RACF is a component of the Security Server
- Resource Measurement Facility (RMF)
- System Display and Search Facility (SDSF)
- Time Sharing Option Extensions (TSO/E)
- z/OS Communications Server (TCP/IP)
- ISPF, to use the dialogs for OEDIT, or ISPF/PDF for the ISPF shell
- BookManager READ/MVS, to use the OHELP online help facility
- Network File System (NFS)
- z/OS Distributed File Service zSeries File System (zFS)
OMVS
The Open MVS (OMVS) address space runs a program that initializes the kernel. The STARTUP_PROC statement in the BPXPRMxx member of SYS1.PARMLIB specifies the name of the OMVS cataloged procedure. We strongly recommend that this procedure name remain at its default value of OMVS, because changing it is likely to impact related functions such as TCP/IP.
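The relevant BPXPRMxx statement, shown here with the recommended default value, is:

STARTUP_PROC(OMVS)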
BPXOINIT
The BPXOINIT address space runs the initialization process. BPXOINIT is also the job name of the initialization process. The BPXOINIT address space has two categories of functions:
- It behaves as PID(1) of a typical UNIX system. This is the parent of /etc/rc, and it inherits orphaned children so that their processes get cleaned up using normal code in the kernel. This task is also the parent of any MVS address space that is dubbed and not created by fork() or spawn(); therefore, TSO/E commands and batch jobs have a parent PID of 1.
- Certain functions that the kernel performs need to be able to make normal kernel calls. This address space is used for these activities (for example, mmap() and user ID alias processing).

The STEPLIB DD statement is propagated from OMVS to BPXOINIT. Therefore, if there is a STEPLIB DD statement in the BPXOINIT procedure, it will not be used if a STEPLIB DD statement was specified in the OMVS procedure.
BPXAS
The BPXAS address spaces are those started by WLM when programs use the fork() or spawn() C function or callable services.
The ISPF shell
The ISPF shell is an interface that heritage MVS users will find most comfortable. It exploits the full-screen capabilities of ISPF.

BPXBATCH
This section presents the following considerations for performance tuning:
- The necessity of z/OS UNIX tuning
- Settings for z/OS UNIX with Workload Manager running in goal mode
- Virtual lookaside facility (VLF)
- Caching through the filecache command
- Choosing the file system type (HFS, TFS, zFS)
- Further tuning tips
- Where to get data to analyze the performance of z/OS UNIX
Note: Beginning with z/OS V1R3, WLM compatibility mode is no longer available.
The nice() and setpriority() kernel functions use definitions in the BPXPRMxx member of SYS1.PARMLIB for performance groups (compatibility mode) and goals (goal mode). These definitions are optional, but if they are not specified, the nice() and setpriority() kernel functions do not change the performance level. If there are applications that require the ability to control the priority of different processes, you must define appropriate priority levels for the application to use. If you have enabled the batch, at, and cron shell functions, you need to define priority groups or goals that are appropriate for running batch jobs, as in a UNIX system.
To achieve these performance improvements, the two VLF classes IRRGMAP and IRRUMAP should be added to the COFVLFxx parmlib member. Figure 4-7 shows what a COFVLFxx parmlib member including the z/OS UNIX information should look like.
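A COFVLFxx member with these classes would look something like the following sketch (the EMAJ values are the ones IBM documents for the RACF UID and GID mapping tables):

CLASS NAME(IRRUMAP)    /* UID-to-user ID mapping table    */
      EMAJ(UMAP)
CLASS NAME(IRRGMAP)    /* GID-to-group name mapping table */
      EMAJ(GMAP)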
The VLF member can be activated by starting VLF using the operator command:
START VLF,SUB=MSTR,NN=xx
When a user requests to invoke z/OS UNIX, the z/OS UNIX kernel asks RACF about the permission. RACF checks whether VLF is active. If it is active, RACF asks VLF for the data of the user who is trying to log in. If the user has already logged in to z/OS UNIX since VLF became active, VLF should know about this user and provide the UID and GID to RACF; RACF then passes the information to z/OS UNIX for processing. However, if VLF is not active, or if the user tries to invoke z/OS UNIX for the first time since VLF became active, RACF has to start I/O to the RACF database to get the information. VLF is able to collect this data in its data spaces if the following two classes were added to the COFVLFxx member in SYS1.PARMLIB:
- IRRUMAP
- IRRGMAP

When working with the z/OS UNIX shell or ISHELL, some commands or functions display output with either the UID or the user ID as the file owner. z/OS UNIX only knows about UIDs and GIDs. Whenever a user ID or group name is displayed, it has been looked up in the RACF database or the mapping tables to find the corresponding user ID for a UID or group for a GID. The end user does not have to be aware that such a mapping/conversion is done.

One thing that causes a bit of confusion is which user ID RACF shows when a file or directory is owned by UID=0. In a system, there will be multiple superusers (defined with UID=0, or to the BPX.SUPERUSER class), and RACF can only pick one of these user IDs from the mapping table to show as the owner. It seems that without VLF active, RACF picks the first superuser user ID, in alphabetical order, that has logged in; with VLF active, RACF picks the first superuser user ID that has logged in since VLF was started. This is not a problem; it just causes some confusion for people to see a file owned by one user ID one day and another user ID another day. This situation happens only for users that share the same UID, which should only be the superusers.
For a detailed description of the filecache command, see z/OS V1R5.0 UNIX System Services Command Reference, SA22-7802.

Tip: The filecache command can be used to cache the Tivoli Workload Scheduler end-to-end server's binary files, such as batchman and mailman. To use the filecache command, you need to explicitly specify the path where the files are located; the binary directory is pointed to by the topology subparameter bindir.
3. Avoid STEPLIBs.
To improve performance for users who log in to the shell with the OMVS command, do not place any STEPLIB or JOBLIB DD statements in login procedures. Specify STEPLIB=none in /etc/profile to avoid excessive searching of STEPLIB data sets (see the fragment at the end of this item).

Be aware of storage consumption. If the system is running in an LPAR or as a VM guest, the storage size should be at least 64 MB; having considerably more than this will not be harmful. Extended common system area (ECSA) storage used by z/OS UNIX is based on the following formula, where n is the number of tasks using z/OS UNIX and m is the number of processes:

(n * 150 bytes) + (m * 500 bytes)

So if your system supports 500 processes and 2,000 threads, z/OS UNIX consumes 550 KB of ECSA storage. In addition to this:
- WLM uses some ECSA for each forked initiator.
- The OMVS address space itself uses 20 KB of ECSA.
- Spawn usage requires approximately 100 KB of ECSA.
- Each process that has a STEPLIB that is propagated from parent to child or across an exec consumes about 200 bytes of ECSA.
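A minimal /etc/profile fragment for the STEPLIB recommendation above:

# avoid searching STEPLIB data sets for every shell command
export STEPLIB=none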
The RMFGAT started task must be associated with a user ID that has an OMVS segment. The following RACF command is used to give RMFGAT a UID and to designate the root directory as its home directory:
ADDUSER RMFGAT DFLTGRP(<OMVSGROUP>) OMVS(UID(4711) HOME('/'))
Gathering options for z/OS UNIX are not included in the default parmlib member for RMF Monitor I; z/OS UNIX data is gathered by Monitor III, not by Monitor I. The Monitor III data gatherer collects z/OS UNIX data for input to the RMF postprocessor. This data can be used to create a z/OS UNIX kernel activity report.

Note: To get a detailed report of most z/OS UNIX activity, you need to collect certain SMF record types: SMF types 34, 35, 74, 80, and 92. For more information, contact your SMF administrator.
A path name identifies a file and consists of directory names and a file name. A fully qualified file name, which consists of the name of each directory in the path to a file plus the file name itself, can be up to 1023 bytes long. The Hierarchical File System allows for file names in mixed case.
The Hierarchical File System (HFS) data set that contains the Hierarchical File System is a z/OS (MVS) data set type (similar to a PDS/E). The files in the Hierarchical File System are sequential files, and they are accessed as byte streams. A record concept does not exist in these files other than the structure defined by an application. The path name is constructed of individual directory names and a file name separated by the forward slash character, for example:
/dir1/dir2/dir3/myfile
Like UNIX, z/OS UNIX is case-sensitive for file and directory names. For example, in the same directory, the file MYFILE is a different file from myfile. HFS data sets and z/OS data sets can reside on the same DASD volume.
zFS provides the following features and benefits:
- Performance gains in many customer environments when accessing files approaching 8 KB in size that are frequently accessed and updated. The access performance of smaller files is equivalent to that of HFS.
- Reduced exposure to loss of updates: zFS writes data blocks asynchronously and does not wait for a sync interval.
- zFS is a logging file system. It logs metadata updates. If a system failure occurs, zFS replays the log when it comes back up to ensure that the file system is consistent.
- Space sharing: multiple zFS file systems can be defined in a single data set, enabling space that becomes available from erasing files in one file system to be used by other file systems in the same data set. This is an optional function that is available only in a non-sysplex environment.
- Read-only cloning of a file system in the same data set. The cloned file system can be made available to users to provide a read-only, point-in-time copy of a file system. This is an optional feature that is available only in a non-sysplex environment.
- Because zFS can be used in a sysplex, users in a sysplex can access zFS data that is owned by another system in the sysplex. If there is a system failure, zFS file systems are auto-moved and can be auto-mounted when the system is back up.

Note: HFS is still required for the z/OS installation. The root file system can be HFS or zFS (zFS support for the root file system became available with z/OS V1R3).
In a UNIX System Services environment, the Physical File Systems are defined in the BPXPRMxx parmlib member. zFS, as a Physical File System, must also be defined there. Figure 4-10 shows all the Physical File Systems that can be defined in a USS environment. The Logical File System (LFS) is called by POSIX programs, non-POSIX z/OS UNIX programs, and VFS servers.
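The commonly documented BPXPRMxx statement that defines zFS as a Physical File System looks like this (a sketch; verify against your own BPXPRMxx):

FILESYSTYPE TYPE(ZFS) ENTRYPOINT(IOEFSCM) ASNAME(ZFS)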
zFS aggregates come in two types:
- Compatibility mode aggregates
- Multiple file system aggregates
Space sharing
Space sharing means that if you have multiple file systems in a single data set and files are removed from one of the file systems, freeing DASD space, another file system can use that space when new files are created. This type of file system is called a multi-file-system aggregate.

The multiple file system aggregate OMVS.EBIS.zFS, shown in Figure 4-13, can contain multiple zFS file systems, which makes space sharing between the zFS file systems within the aggregate possible. The multiple file system aggregate has its own name, assigned when the aggregate is created; it is always the same as the VSAM LDS cluster name. Each zFS in the aggregate has its own file system name, assigned when that particular file system is created. Each zFS also has a predefined maximum size, called the quota.
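A sketch of how file systems might be created inside such an aggregate with the zFSadm command (the file system names and sizes are hypothetical; -size sets each file system's quota):

zFSadm create -filesystem OMVS.EBIS.FS1 -aggregate OMVS.EBIS.zFS -size 40000
zFSadm create -filesystem OMVS.EBIS.FS2 -aggregate OMVS.EBIS.zFS -size 40000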
Metadata cache
The zFS has a cache for file system metadata, which includes directory contents and the data of files that are smaller than the aggregate block size. The setting of this cache size is important to performance because zFS references the file system metadata frequently.
Synchronous reads of metadata increase I/O rates to disk and server response times. Metadata consists of things such as owner, permission bit settings, and data block pointers. The metadata cache is stored in the zFS primary address space; its default size is 32 MB. Because the metadata cache only contains metadata and small files, it normally does not need to be nearly as large as the user file cache.
zFS clones
zFS enables an administrator to make a read-only clone of a file system in the same aggregate. This clone file system can be made available to users to provide a read-only, point-in-time copy of a file system. The clone operation happens relatively quickly and does not take up too much additional space because only the metadata is copied. When a file system is cloned, a copy of the file system is created in the same aggregate, as shown in Figure 4-14. There must be physical space available in the aggregate for the clone to be successful. For the clone to be used, it must be mounted.
zFS recovery
zFS provides a recovery mechanism that uses a zFS file system log to verify or correct the structure of an aggregate. This recovery mechanism is invoked with the ioeagslv utility. When you do a system restart, a recovery program called the salvager uses the zFS log to return consistency to a file system by running recovery on the aggregate on which the file system resides. Recovery consists of reading the log, which contains all the changes made to metadata as a result of the operations done to the aggregate, such as file creation and deletion. If problems are detected in the basic structure of the aggregate, if the log mechanism is damaged, or if the storage medium of the aggregate is damaged, the ioeagslv utility must be used to verify or repair the structure of the aggregate.
Allocating a zFS
In the next example (Example 4-2 on page 85), we show one possible way to allocate a zFS aggregate through the zFSadm command. The zFSadm
command is very powerful for managing zFS aggregates. For a complete list of possible subparameters, see z/OS V1R5.0 Distributed File Service zSeries File System Administration, SC24-5989.
Example 4-2  Defining an aggregate with the zFSadm command
zFSadm define -aggregate TWS.SYS4.PROD.zFS -volume tws003 -megabyte 64 20
IOEZ00248E VSAM linear dataset TWS.SYS4.PROD.zFS successfully created.
Tip: Multiple volumes can be specified when you define the aggregate.
Migrating to zFS
You can migrate or copy existing HFS file systems into empty zFS file systems by using the OMVS pax command. The pax utility reads, writes, and lists archive files. An archive file is a single file containing one or more files and directories; archive files can be HFS files or MVS data sets. A file stored inside an archive is called a component file; similarly, a directory stored inside an archive is called a component directory. The pax command can be used with or without an intermediate archive file. When the data is being copied from an HFS, the file system being accessed must be mounted.

The copytree utility is available from the z/OS UNIX Tools and Toys Web site; it is used for logical copying of z/OS UNIX structures. Beginning with z/OS V1R3, it is distributed as a z/OS UNIX sample in the /samples directory of the root file system.

For the detailed syntax of the pax and copytree commands, see z/OS V1R5.0 UNIX System Services Command Reference, SA22-7802.
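A sketch of copying a mounted HFS into an empty, mounted zFS with pax (the mount points are hypothetical; -pe attempts to preserve attributes such as permissions and ownership):

cd /service/hfs_mountpoint
pax -rw -pe . /service/zfs_mountpoint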
Log file cache
This cache is used to write file record transactions that describe changes to the file system.

Figure 4-15 shows the different types of zFS caches and the options that size them:
meta_cache_size
log_cache_size
tran_cache_size
Specifies the initial number of transactions in the transaction cache. The default is 2000.
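These caches are sized in the zFS IOEFSPRM configuration file; a sketch with illustrative values (only the tran_cache_size default of 2000 comes from the text above):

user_cache_size=256M
meta_cache_size=64M
log_cache_size=32M
tran_cache_size=2000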
Metadata cache
The metadata cache is used to contain all file system metadata, which includes the following:
- All directory contents
- File status information (atime, mtime, size, permission bits, and so on)
- File system structures
- Cached data for files smaller than 7 KB
For HFS, we used the ISPF shell (ISHELL) dialog to create the file system, performing the following steps: 1. First, we selected New HFS from the pull-down menu (Figure 4-19 on page 92).
Tip: It is possible to define and attach a zFS aggregate with ISHELL. 2. Next, we defined space for the HFS allocation (Figure 4-20).
3. Then, we used the ISHELL dialog to mount both file systems (Figure 4-21 on page 93).
4. Finally, we inserted the mountpoint and the file system type (Figure 4-22).
We then compared the execution time of the job that fills the zFS (Example 4-5 on page 94).

Important: If you want to increase the zFS cache, take care not to run out of central storage. If you choose to use fixed cache, be aware that it is reserved for zFS usage only; other applications cannot use the fixed zFS cache. To check your central storage, query the system configuration by entering the /D M command in the System Display and Search Facility (SDSF).
Tip: In our system, we had 4 GB central storage and we used 1.5 GB for the user cache without interfering with any other operating functions or applications.
Example 4-5  Setting the zFS user cache to 1.5 GB
/>zFSadm config -user_cache_size 1536M
IOEZ00300I Successfully set -user_cache_size to 1536M
/>zFSadm configquery -user_cache_size
IOEZ00317I The value for configuration option -user_cache_size is 1536 MB
>confighfs -v 1536
>confighfs -l
HFS Limits
Maximum virtual storage: ______1536(MB)
Minimum fixed storage:   _________0(MB)

Figure 4-23 Changing the HFS buffer size
We ran our tests multiple times with the same HFS and zFS buffer sizes and compared the average execution times of the job. We kept the same values for procs, blkios, typios, and seed to have a consistent basis for comparison.
Comparison results
Table 4-1 on page 96 shows the measured times, in minutes, that it took to fill the file systems with 500 MB of data. As you can see from the table, the execution
time with HFS and different buffer sizes shows no significant performance improvement. The time to fill the zFS with a user cache size of 1.5 GB is almost cut in half (minus 54%).
Table 4-1  Comparison results

File system type   Cache size (MB)   Execution time (minutes)
HFS                256               13.77
HFS                1536              13.59
zFS                256               2.02
zFS                1536              0.56
Conclusion
The comparison test results verified that performance gains (zFS versus HFS) are larger when accessing files larger than 8 KB in size. Even for smaller files, zFS is a better choice than HFS, because it is more reliable, has intelligent space sharing, provides a rich set of tools, and is more scalable than HFS. The following is an excerpt from the z/OS 1.5 announcement letter, available at:
http://www.ibm.com/common/ssi/rep_ca/7/897/ENUS204-017/ENUS204-017.PDF
HFS is expected to continue shipping as part of the operating system and will be supported in accordance with the terms of a customer's applicable support agreement. IBM intends to continue enhancing zFS functionality, including RAS and performance capabilities, in future z/OS releases. All requirements for UNIX file services are expected to be addressed in the context of zFS only.
Chapter 5.
Figure 5-1 Example job network: backup job BUPJOB1 feeds chains of jobs (the JOBONE* and JOBTWO* series) leading to backup job BUPJOB2
When calculating the latest start times for each job, Tivoli Workload Scheduler for z/OS is, in effect, using the latest start time of its successor job in lieu of a deadline time. So if it encounters a more urgent deadline time on an operation in the chain, that deadline will be used instead. Consider our example in Figure 5-1. The path that takes the longest elapsed time, the critical path, from BUPJOB1 is down the right side, where the number of jobs between it and one of the online startup jobs is greatest. But, what if JOBONE1 produces a file that must be sent to an external company by 02:00? Calculating back up the chain from JOBONEK (06:30 deadline), we have a calculated latest start time on JOBONEM of 04:30. This is not as urgent as 02:00, so JOBONE1 will use its 02:00 deadline time instead and get a latest start time of 01:30. This will affect the whole predecessor chain, so now BUPJOB1 has a latest start time of 23:30 the previous day.
In large installations, this type of easily provided information replaces the need for operators to know the entire batch schedule; the number of jobs involved and the rate of change make the accuracy of such knowledge sporadic at best. Using this process immediately integrates new batch processing into the prioritization.

Because each operation has a relative priority in the latest start time, this can be used to sort the error queue so that the most urgent failure is presented at the top of the list. Now even the newest operator will know which job must be fixed first.

For warnings of problems in the batch, late and duration alerting can be switched on. After this data has been entered and the plans are relatively accurate, these alerts should only be issued for real situations. In addition, it becomes easy to create a monitoring job that runs as part of the batch, compares the current time with its latest start time (available as a Tivoli Workload Scheduler for z/OS variable), and issues a warning message if the two times are closer than is prudent.
The input arrival time has no relation with time dependency. It does not indicate when the job stream (application) or the jobs (operations) in the job stream are allowed or expected to run. If you need to give a time restriction for a job, use the Time Restrictions window for that job, as shown in Figure 5-2. Note that the default is No restrictions, which means that there is no time restriction.
The main use of the input arrival time is to resolve external dependencies. External dependencies are resolved backward in time using the input arrival time. It is also used to determine whether a job stream is included in the plan. The input arrival time is part of the key for the job stream in the long-term and current plans; the key is the date and time (hhmm) plus the job stream name. This makes it possible in Tivoli Workload Scheduler for z/OS to have multiple instances of the same job stream in the plan.

Note: This is not possible in the Tivoli Workload Scheduler distributed product.

Input arrival time is also used when listing and sorting job streams in the long-term and current plans. It is called Start time in the Time Restrictions window of the Job Scheduling Console, as shown in Figure 5-3.
Note: Input arrival time (the Start field in Figure 5-3) is not a required field in the JSC; the default is 12:00 AM (00:00), even if this field is left blank.

Let us explain all this with an example (Figure 5-4 on page 106). Assume that we have a 24-hour current plan that starts at 6:00 AM and that there are two job streams with the same name (JS11) in the plan. As required, these job streams, or occurrences, have different input arrival times: 9:00 AM and 5:00 PM, respectively. If there is no other dependency (time dependency or resource dependency), both job streams will run as soon as possible (when they have been selected as eligible by the WSA).
Figure 5-4 Two occurrences of job stream JS11 (input arrival times 9:00 AM and 5:00 PM), each containing jobs JB11 and JB12
Now let us assume that there is another job stream (JS21) in the plan with one job, JB21, which depends on the successful completion of JB11 (Figure 5-5 on page 107). So far so good, but which JB11 will be considered the predecessor of JB21? Here the input arrival time comes into play. To resolve this, Tivoli Workload Scheduler for z/OS scans backward in time until the first predecessor occurrence is found. So, scanning backward from 3:00 PM (the input arrival time of JB21), the JB11 with the input arrival time of 9:00 AM is found. (For readability, we show this job instance as JB11(1) and the other as JB11(2).)
Figure 5-5 JS21 (input arrival time 3:00 PM) with job JB21; scanning backward selects JB11(1), from the 9:00 AM occurrence of JS11, as the predecessor
With this logic, JB11(1) will be considered the predecessor of JB21.

Tip: If the input arrival time of JB21 were, for example, 8:00 AM (or any time before 9:00 AM), Tivoli Workload Scheduler for z/OS would ignore the dependency of JB21 on JB11, because scanning backward from 8:00 AM it would not be able to locate any occurrence of JB11.

Let us further assume that there is another job stream (JS01) in the plan that has one job, JB01 (Figure 5-6 on page 108). This job stream has an input arrival time of 7:00 AM, and its job (JB01) has the following properties:
- It is a predecessor of JB11.
- It has a time dependency of 3:00 PM.
Figure 5-6 JS01 (input arrival time 7:00 AM) with job JB01, a predecessor of JB11 with a 3:00 PM time dependency, alongside the JS11 and JS21 occurrences
Assuming that the current time is, for example, 8:00 AM, and there is no other dependency or factor that prevents a launch, which instance of JB11 will be eligible to run first: JB11(1) with an input arrival time of 9:00 AM, or JB11(2) with an input arrival time of 5:00 PM? The answer is JB11(2), although it has the later input arrival time. The reason is that Tivoli Workload Scheduler for z/OS calculates (by scanning backward from 9:00 AM) that JB01 is the predecessor of JB11(1); the dependency of JB11(2) on JB01 is ignored. This is an important concept: external job dependencies in Tivoli Workload Scheduler for z/OS (and also in Tivoli Workload Scheduler distributed) are ignored if the predecessor jobs are not in the current plan with the jobs that depend on them. In other words, dependencies are not implied.

In most real-life implementations, the input arrival coding is not this complex, because usually only one instance of a job stream (or occurrence) exists in the plan. In that case, there is no need for different input arrival time customizations; it can be the same (or left at the default, 00:00) throughout all job streams. Nevertheless, the input arrival time is there for your use.

Note: Before finishing the input arrival discussion, we want to point out that if more than one run cycle defines a job stream with the same input arrival time, it constitutes a single job stream (or occurrence).
The less connected the 100,000 jobs were to each other, the lower the clock time and the lower the EXCP count. The probable reason for this is the difference in how much of the current plan needs to be in storage for the whole network to be processed. This can also be seen in the processing overheads associated with resolving an internal, as opposed to an external, dependency. When an operation that has successors completes in Tivoli Workload Scheduler for z/OS, both ends of the connection must be resolved. In the case of an internal dependency, the dependent operation (job) will already be in storage with the rest of its application (job stream); an external dependency might be in an application that is not currently in storage and will need to be paged in to do the resolution (Figure 5-7).
Figure 5-7 Sample applications (job streams) with internal and external dependencies: each of JOBA through JOBD in Application A is connected to each of JOBW through JOBZ in Application Z
In terms of processing time, this does not equate to a large delay, but understanding this can help when making decisions about how you build your schedules. Creating the minimal number of external dependencies is good practice anyway. Consider the flowcharts here: in the first (Figure 5-7), we have 16 external dependencies; in the second (Figure 5-8), only one, just by adding a couple of operations on a dummy (non-reporting) workstation.
Figure 5-8 Adding a dummy (non-reporting) workstation: JOBA through JOBD feed a NONR operation, which connects through a second NONR operation to JOBW through JOBZ, leaving a single external dependency
The results in Table 5-3 are for tests done using faster DASD with a very large cache. In fact, the cache was so large, we believe most if not all of the libraries were in storage for the PDSE and LLA tests.
Table 5-3  Results with faster DASD with a very large cache

Configuration           EXCP   CPU     SRB    CLOCK    SERV
4 x PDS                 457k   25.68   0.08   176.00   66497k
4 x PDSE                463k   25.20   0.08   129.81   65254k
4 x PDS + EXITS         455k   25.21   0.08   111.98   65358k
4 x PDS + LLA           455k   25.07   0.08   101.47   64915k
4 x PDS + LLA + EXITS   456k   24.98   0.08    95.12   64761k
4 x PDSE + EXITS        455k   25.02   0.08    94.99   64871k
One additional test was done using a facility provided by the EQQUX002 code we were using (Table 5-4). This enabled us to define some model JCL that was loaded into storage when the Tivoli Workload Scheduler for z/OS controller was started. When JCL was fetched, it was fetched from this storage version. The exit inserted the correct jobname and other elements required by the model from data within the operations record (CPOP) in the current plan.
Table 5-4  Results using the EQQUX002 code shipped with this book

Configuration   EXCP   CPU     SRB    CLOCK   SERV
EQQUX002        455k   25.37   0.08   95.49   65759k
As can be seen from these results, the quickest retrievals were achieved when using PDSE files with the EQQUX000/002 exits, when using PDS files with their directories in LLA together with the EQQUX000/002 exits, or when using the in-storage model JCL facility.
For example, our JCL consisted of four records: a job card that continued over two lines, an EXEC card, and a STEPLIB. For approximately 25,000 members, that equated to 3,330 tracks for a PDSE and only 750 tracks for a PDS. From a space perspective, the cheapest option was obviously the model JCL: because all our jobs followed the same model, only one four-line member was needed. More information about the JCL fetch exits used, how to use them, and where to obtain a copy can be found in Appendix A, Using the EQQUX000 and EQQUX002 exits on page 177.
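As a sketch of the kind of four-record model member described above (the job, procedure, and data set names are hypothetical; the EQQUX002 exit would substitute the correct job name and EXEC details from data in the current plan):

//MODELJ   JOB (ACCT),'TWS MODEL',CLASS=A,
//             MSGCLASS=X
//STEP1    EXEC MYPROC
//STEPLIB  DD  DISP=SHR,DSN=PROD.LOADLIB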
To update such a PDS without an LLA refresh, an LLA update command, F LLA,UPDATE=xx, can first be issued, where the CSVLLAxx member contains a NOFREEZE statement for the PDS that needs to be updated. Then another LLA update command, F LLA,UPDATE=yy, can be issued, where CSVLLAyy contains a FREEZE statement for the source PDS library. By doing the NOFREEZE and then the FREEZE commands, the need for an LLA refresh is avoided.
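A minimal sketch of the two parmlib members, assuming a hypothetical job library named PROD.JOBLIB:

CSVLLAxx:
  NOFREEZE(PROD.JOBLIB)
CSVLLAyy:
  FREEZE(PROD.JOBLIB)

The PDS can then be updated safely between the F LLA,UPDATE=xx and F LLA,UPDATE=yy commands.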
By replacing many job-specific members with model JCL that is valid for many jobs, you also remove all those jobs from the EQQJBLIB or exit DD concatenations.
5.5 Recommendations
In order to ensure that Tivoli Workload Scheduler for z/OS performs well, both in terms of dialog response times and job submission rates, the following recommendations should be implemented. However, it should be noted that although these enhancements can improve the overall throughput of the base product, the amount of work that Tivoli Workload Scheduler for z/OS has to process in any given time frame will always be the overriding factor. The recommendations are listed in the sequence that will provide the most immediate benefit.
Note: The use of LLA for PDS libraries is dependent on the level of z/OS UNIX System Services installed.
- Clear out the JS file at regular intervals. It has a tendency to grow, because jobs that are run only once are never removed. There is a sample in SEQQSAMP (EQQPIFJX) that can be used to delete items that are older than required.
- Consider all methods of reducing the number of members, and their size, within production JCL libraries. Regularly clean the libraries and remove all redundant members.
- Whenever possible, call procedures rather than maintaining large JCL streams in Tivoli Workload Scheduler for z/OS libraries. Use JCL variables to pass specific details to the procedures, where procedural differences are based on data known to Tivoli Workload Scheduler for z/OS, such as the workstation (see the sketch after this list).
- Allow the Tivoli Workload Scheduler for z/OS exit EQQUX002 to create the RDR JCL from a model. This idea is useful, for example, when several (especially hundreds or thousands) of the members within the job library execute a procedure whose name is the same as the jobname (or can be derived from it). Replacing those members with just a few model members (held in storage) and having the exit modify the EXEC card reduces the size of the job library and therefore the workstation analyzer overhead during JCL fetch.
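For example, a single model job could pass the workstation name to a procedure through the supplied JCL variable OWSID; a minimal sketch (the job and procedure names are hypothetical):

//MYJOB    JOB (ACCT),'VIA PROC',CLASS=A
//*%OPC SCAN
//STEP1    EXEC PRODPROC,WSID=&OWSID.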
Consideration should be given to the method of communication used between the controller and the trackers. Of the three methods, XCF gives the best performance; however, its use is possible only in installations with the right hardware and software configurations. Using VTAM (the NCF task) is second in the performance stakes, with shared DASD being the slowest due to its being I/O intensive.
5.5.8 Review your z/OS UNIX System Services and JES tuning
Ensure that your system is tuned to cope with the number of jobs being scheduled by Tivoli Workload Scheduler for z/OS. It does no good to be able to schedule 20 jobs a second if the JES parameters are throttling the systems back and allowing only five jobs per second onto the JES queues. Specifically, review System/390 MVS Parallel Sysplex Continuous Availability Presentation Guide, SG24-4502, paying special attention to the values coded for HOLD and DORMANCY.
Chapter 6.
These DD cards are what Tivoli Workload Scheduler for z/OS inserts into the JCL when the FLOPTS and RCLOPTS statements are in the controller parameters. If, for operational reasons, the actual output of the job itself (as opposed to the JOBLOG information) is needed, a third DD is added when you specify YES for the USER SYSOUT parameter for a specific job when defining the job to the database:
//TIVDSTUS OUTPUT DEST=TWSDSC64
Example 6-2 on page 123 shows the DST parameters. We reference some of them later in the chapter.
Example 6-2 DST parameters
/*********************************************************************/
/* FLOPTS: data store connection                                     */
/*********************************************************************/
FLOPTS
  CTLMEM(TWSDSCTL)            /* XCF NAME OF CTRL DST XCF LINK       */
                              /* MUST BE THE SAME AS CTLMEM IN DSTOPTS (DSTP) */
  DSTGROUP(TWS82GRP)          /* NAME OF DST XCF GROUP               */
  XCFDEST(TWSTSC64.TWSDSC64,TWSTSC63.TWSDSC64,
          TWSTSC65.TWSDSC64,TWSTSC70.TWSDSC64,
          ********.TWSDSC64)
/*********************************************************************/
/* RCLOPTS: Restart and clean up options                             */
/*********************************************************************/
RCLOPTS                       /* RESTART AND CLEANUP FUNCTION OPTIONS */
  CLNJOBPX(TWSCL)             /* CLEANUP JOB PREFIX                  */
  DDALWAYS(DDALW01,DDALW02)   /* ALWAYS RE-EXECUTABLE DDNAMES        */
  DDNEVER(DDNEX01,DDNEX02)    /* NEVER REEXECUTABLE DDNAMES          */
  DDNOREST(DDNRS01,DDNRS02)   /* NOT RESTARTABLE DDNAMES             */
  DSTCLASS(TWSTSC64:D,TWSTSC63:D,TWSTSC65:D,
           TWSTSC70:D,********:D)
  DSTDEST(TWSDEST)            /* SYSOUT DESTINATION                  */
  DSTRMM(NO)                  /* RMM - NO RMM SUPPORT                */
  USERSYS(N)
/*********************************************************************/
TWS.INST.PARM(DSTP):
/*********************************************************************/
DSTOPTS
  CINTERVAL(60)
  CLNPARM(DSCLEAN)
  HOSTCON(XCF)
  MAXSTOL(0)
  MAXSYSL(0)
  NWRITER(16)                 /* Max number of writer tasks, need 17 UDF files */
  DELAYTIME(15)
  DSTLOWJOBID(1)
  DSTHIGHJOBID(30000)
  HDRJOBNAME(JOBNAME)
  HDRSTEPNAME(STEPNAME)
  HDRPROCNAME(PROCSTEP)
  HDRJOBLENGTH(21)
  HDRSTEPLENGTH(30)
  HDRPROCLENGTH(39)
  QTIMEOUT(15)
  SYSDEST(TWSDEST)
  RETRYCOUNTER(1)
  DSTGROUP(TWS82GRP)
  DSTMEM(TWSDSC&SYSCLONE)     /* TWSDSC64                            */
  CTLMEM(TWSDSCTL)
  STOUNSD(Y)
  STORESTRUCMETHOD(DELAYED)
  WINTERVAL(1)                /* Small value, more output can be retrieved */
                              /* Larger value, fewer EXCPs           */
Test description
Our testing was performed on a 4-way Sysplex Multi-Access Spool system with the JES2 checkpoint residing in the coupling facility, as recommended in
System/390 MVS Parallel Sysplex Continuous Availability Presentation Guide, SG24-4502. Our goal was to first test using default values and then to suggest specific changes to those values to improve performance and maximize throughput. We concentrated on three areas:
1. The number of writers and writer output files
   After the DST captures the output from the JES spool, the number of writers determines how many active processes are writing that data to the output files. The number of writer output files determines how many files can be written to concurrently at any point. The rule is that the number of writer tasks (NWRITER) should be less than or equal to the number of files.
2. The wait interval (WINTERVAL)
   The wait interval (in seconds) is the maximum time the DST waits before scanning the JES spool.
3. The storage structure method (STORESTRUCMETHOD)
   This defines when to format the data from the VSAM unstructured (UDF) file to a VSAM structured (SDF) file. The options are deferred or immediate.
In our testing, we used 20,000 jobs for all the runs. We had 80 initiators across the four systems. Each job had five steps, 272 JOBLOG output records, and 88 input statements. We also ran in parallel, with the jobs running and the DST retrieving output at the same time. We did our first test using a wait interval of 1. The results of the number-of-writers-and-files (UDFs) tests are shown in Table 6-1.
Table 6-1 The results of the number of writers and files test

Number of writers, files, and WINTERVAL   CPU time in minutes   Jobs per hour
10 writers, 11 files, WI 30               4.79                  4320
5 writers, 6 files, WI 1                  4.49                  7,256
10 writers, 11 files, WI 5                4.74                  7874
16 writers, 17 files, WI 5                4.82                  9914
10 writers, 11 files, WI 1                -                     -
16 writers, 17 files, WI 1                -                     -
16 writers, 32 files, WI 1                -                     -
The results showed that the optimum configuration was 16 writers (the maximum) and 17 UDFs. Even doubling the number of files did not dramatically change the performance. Table 6-2 shows the results of having the data store pick up only 10,000 JOBLOGs of the output rather than all 20,000 jobs. This was with 16 writers, 17 UDFs, and a WINTERVAL of 1. We provide instructions about how to manage what is destined for the data store in How to define data store destinations on a per job basis with EQQUX002 on page 185.
Table 6-2 Test results with 10,000 JOBLOGs of the output versus all 20,000 jobs

Number of jobs and JOBLOGs         CPU    EXCP    Jobs per hour   Clock time
20,000 jobs and 10,000 retrieved   3.43   1658k   9120            1:05:47
This next set of tests was done by running all the jobs with the DST down, to verify the performance of the DST without the interference of running jobs. After the jobs finished running, the data store was started so that it would process the JOBLOGs on the JES spool. See Table 6-3 on page 127. Because of spool capacity, these tests were all for 10,000 jobs.
Table 6-3 Running all the jobs with the DST down

Number of writers, files, and WINTERVAL   Jobs per hour
5 writers, 6 files, WI 1                  6799
16 writers, 17 files, WI 5                13,859
10 writers, 11 files, WI 1                14808
16 writers, 17 files, WI 1                19943
16 writers, 32 files, WI 1                20,468
16 writers, 17 files, WI 0                24,623
The next test (Table 6-4) ran the DST and showed the change using STORESTRUCMETHOD immediate.
Table 6-4 Test results using STORESTRUCMETHOD immediate

Number of writers, files, and WINTERVAL   CPU      EXCP    Jobs per hour   Clock time
16 writers, 17 files, WI 1                137.68   1778k   3270            3:03:29
The biggest single improvement in data store performance comes at almost no cost apart from a slight increase in CPU usage and disk space: set the maximum number of writers, NWRITER(16), and allocate 17 UDFs. The effect can be observed very easily in the SYSLOG by how many jobs are purged between wait intervals, WINTERVAL(nn):
- NWRITER(05): 16 jobs purged per DST run interval for our test job output of 5 steps and 272 joblog records.
- NWRITER(10): 31 jobs purged per DST run interval.
- NWRITER(16): 49 jobs purged per DST run interval, regardless of whether there are 16, 17, or 32 UDFs.
Note that there is no significant benefit to allocating more UDFs than NWRITER+1, and our limited testing showed no performance benefit in allocating more than 17 UDFs for NWRITER(16). However, it is good to have at least one spare data set allocated in case there is a hardware problem on a disk volume. We did not measure performance for the default value of NWRITER(1) because, having seen the performance of NWRITER(5), it was obvious that it would be unacceptable.
The best way to use your JES spool and take advantage of the job restart and cleanup and joblog retrieval benefits of the data store is to send output to the data store only for those jobs that can benefit from it. Do not set a TWSDST destination for test jobs or jobs for which restart is not an issue. We provide instructions about how to manage what is destined for the data store in How to define data store destinations on a per job basis with EQQUX002 on page 185.
We believed the default WINTERVAL setting of 30 seconds was too high for our tests. Because we had large numbers of jobs, WINTERVAL acts as a polling interval: the lower the interval, the more often the JES queue is scanned for our work. In a site where there is not a lot of work to be processed by the DST, a higher interval means less overhead for JES. We recommend 5 seconds as a general rule, but adjust this based on the DST load and the JES2 MAS dormancy settings. Note that setting WINTERVAL(0) improves the job retrieval rate over WINTERVAL(1), but expect to see substantial daily EXCP usage regardless of how many joblogs are retrieved.
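Taken together, a minimal sketch of the corresponding DSTOPTS keywords (all three appear in Example 6-2; tune WINTERVAL to your own workload), with 17 UDF files allocated to the data store task:

DSTOPTS NWRITER(16)               /* maximum number of writer tasks   */
        WINTERVAL(5)              /* JES spool scan interval, seconds */
        STORESTRUCMETHOD(DELAYED) /* defer UDF-to-SDF restructuring   */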
Everything that is sent to the data store must be processed and stored by the data store, so do not capture what you do not need. You can set up the controller to normally send joblogs to a purge queue and filter for needed jobs using the Tivoli Workload Scheduler for z/OS job fetch exit, EQQUX002. We describe this in How to define data store destinations on a per job basis with EQQUX002 on page 185.
Chapter 7.
720 seconds (12 minutes). Note that in addition to this default value, you can use a separate setting for each filter window.
Tip: Do not set this property value too low. Otherwise, if the displayed list gets very large, the interval between auto-refreshes might be less than the time it takes to actually refresh, and the Job Scheduling Console will appear to lock up. Also, if you are using several detached windows (you can detach up to seven), setting the refresh rate properly becomes even more important.
To adjust the refresh rate:
1. In the Job Scheduling view, select a scheduler engine and click the Properties button in the toolbar. The Scheduler Properties window opens (Figure 7-1).
2. Open the Settings page and alter the Periodic refresh value by entering the number of seconds after which a list display will periodically refresh.
Important: Increasing the frequency that a list is refreshed decreases performance. Unless separate settings are set for each filter window, the default refresh rate will be used.
7.6 Minimize the JSC windows to force the garbage collector to work
Whenever you need to decrease the amount of memory the JSC is using, you can minimize all the JSC windows. This prompts the Java garbage collector to run and release unneeded allocated memory, reducing the memory used by the JSC.
Remember that all these lists can be made available to all JSC users simply by saving the preferences.xml file and propagating it to your JSC users. User preferences are stored in a file named preferences.xml. This file contains the names and the details, including filters, of all the queries (or lists) that were saved during a session. Every time you close the Job Scheduling Console, the preferences.xml file is updated with any queries you saved in, or deleted from, the Job Scheduling Tree.
Note: In the previous version of JSC (JSC 1.2), the globalpreferences.ser file was used for propagating user settings.
The preferences.xml file is saved locally in a user directory:
On UNIX: ~/.tmeconsole/user@hostname_locale
where ~ is the user's HOME directory.
On Windows: C:\Documents and Settings\Administrator\.tmeconsole\user@hostname_locale
Where:
- user is the name of the operating system user that you enter in the User Name field in the JSC logon window. It is followed by the at (@) sign.
- hostname is the name of the system running the connector, followed by the underscore (_) sign.
- locale is the regional settings of the operating system where the connector is installed.
For example, assume that to start the Job Scheduling Console, you log on for the first time to machine fta12, where the connector was installed by user ITWS12. A user directory named ITWS12@fta12_en_US (where en_US stands for English regional settings) is created under the path described above on your workstation. Every time you log on to a different connector, a new user directory is added, and every time you close the Job Scheduling Console, a preferences.xml file is created or updated in the user directory that matches your connection.
Note: The preferences.xml file changes dynamically if the user logs on to the same connector and finds that the regional settings have changed.
If you want to propagate a specific set of queries to new users, copy the relevant preferences.xml file into the path described above on the users' workstations. If you
want to propagate a preference file to existing users, have them replace their own preferences.xml with the one you have prepared.
Note: The preferences.xml file can also be modified with a simple text editor (for example, to create multiple lists with similar characteristics), but unless you are very familiar with the file structure, we recommend that you do not manipulate the file directly and use the JSC instead.
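For example, on UNIX, a prepared file could be propagated with a simple copy into the matching user directory (using the ITWS12@fta12_en_US directory from the earlier example; adjust the path for each user's connector and locale):

cp preferences.xml ~/.tmeconsole/ITWS12@fta12_en_US/preferences.xml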
Example 7-1 JSC startup script
start "JSC" "%JAVAPATH%\bin\javaw" -Xms128m -Xmx256m
  -Dice.pilots.html4.defaultEncoding=UTF8
  -Djava.security.auth.login.config="%CONFIGDIR%"/jaas.config
  -Djava.security.policy="%CONFIGDIR%"/java2.policy
  -Djavax.net.ssl.trustStore="%CONFIGDIR%"/jcf.jks
  -Dinstall.root="%TMEJLIB%" -Dinstall.dir="%TMEJLIB%"
  -Dconfig.dir="%CONFIGDIR%" -classpath "%CLASSPATH%"
  com.tivoli.jsc.views.JSGUILauncher
Tips: The default values (64 for Xms and 256 for Xmx) are average values for all platforms and for average machine configurations. To get better performance, you can change these values based on your machine environment. If, for example, you have a machine with 512 MB of RAM, the values given in Example 7-1 are a good choice, but if you have a machine with 256 MB of RAM, it is better to use -Xms64m -Xmx128m. Messages such as java.lang.OutOfMemoryError in the JSC error.log file indicate that these options (particularly -Xmx) should be increased. If you need more details about these settings, refer to:
http://java.sun.com/j2se/1.3/docs/tooldocs/win32/java.html
Note: Do not forget to make a backup of the JSC startup script before you test the settings.
Chapter 8. Troubleshooting
In this chapter, we discuss troubleshooting for Tivoli Workload Scheduler for z/OS, UNIX System Services, the data store, and E2E, along with some general considerations. Apart from listing the most common problems you might encounter and their resolutions, we also want to familiarize you with where to find information and messages related to troubleshooting. This information is important if you query the IBM Support database or open a PMR to work with IBM Support engineers. We cover the following topics in this chapter:
- E2E troubleshooting: Installation
- Security issues with E2E
- E2E PORTNUMBER and CPUTCPIP
- E2E Symphony switch and distribution problems
- Other E2E problems
- OMVS limit problems
- Other useful E2E-related information
- Troubleshooting the data store
- Where to find messages in UNIX System Services
- Where to find messages in an end-to-end environment
8.1.1 EQQISMKD
The EQQISMKD job invokes the EQQMKDIR REXX exec to create directories needed before the SMP/E APPLY job (EQQAPPLE) is run. The EQQAPPLE job populates the binary files in the eqqBINDIR directory. A path prefix, -PathPrefix-, is specified in the EQQISMKD job, and an installation directory (idir) is specified in the EQQMKDIR EXEC. As distributed, the idir is set as follows:
idir='usr/lpp/TWS/'
If the pathprefix in EQQISMKD is set to / (forward slash), for example, EQQMKDIR /, the resulting directory for the binaries would be:
/usr/lpp/TWS
If instead, a directory of /SERVICE/usr/lpp/TWS was needed, the path prefix in EQQISMKD should be specified as:
EQQMKDIR /SERVICE
The most common error with the EQQISMKD job is caused by ignoring the following warning in the comments of the job JCL:
Ensure the directory specified by -PathPrefix-/usr/lpp exists prior to running this job.
Example 8-1 illustrates this type of problem. We did a test using a directory that did not exist (/twstest) as the path prefix for EQQISMKD.
Example 8-1 Using a directory that did not exist as the path prefix for EQQISMKD
//SYSTSIN DD *
  PROF MSGID
  EQQMKDIR /twstest
/*
Example 8-2 Output of the EQQISMKD job
EQQMKDIR /twstest
The EXEC to create the directories has begun.
It will run for a couple of minutes.
The EQQMKDIR EXEC ran at 23:16:12 on 31 Jul 2004
Directory /twstest/ does not exist.
Please create this directory and resubmit.
We corrected this error by using the MKDIR command to create the directory:
TWSRES9 @ SC63:/>mkdir /twstest
TWSRES9 @ SC63:/>
We ran the EQQISMKD job again, but still got RC=12 and the output shown in Example 8-3.
Example 8-3 Output of the EQQISMKD job after correcting the error
EQQMKDIR /twstest
The EXEC to create the directories has begun.
It will run for a couple of minutes.
The EQQMKDIR EXEC ran at 23:18:35 on 31 Jul 2004

Created the following directories:
==================================
No directories were created

Following directories already exist:
====================================
No directories already existed

Problems creating following directories:
========================================
/twstest/usr/lpp/TWS/
Not created. RC=81 RSN=594003D
/twstest/usr/lpp/TWS/V8R2M0
Not created. RC=81 RSN=594003D
/twstest/usr/lpp/TWS/V8R2M0/bin
Not created. RC=81 RSN=594003D
/twstest/usr/lpp/TWS/V8R2M0/bin/IBM
Not created. RC=81 RSN=594003D
/twstest/usr/lpp/TWS/V8R2M0/catalog
Additional messages:
Please refer to the OS/390 UNIX Messages and Codes book
to interpret the Return and Reason Codes.
Please correct and resubmit.
The EQQMKDIR EXEC has completed with Return Code 12
An easier way to find the meaning of the RSN code 594003D is to use the following command:
tso bpxmtext 594003D
Next, we defined the directories up to the lpp subdirectory level, as indicated in the comments of the EQQISMKD job. After resubmitting the job, we got RC=0.
Example 8-5 Defining subdirectories to prevent the 594003D error
TWSRES9 @ SC63:/u/twsres9>cd /twstest
TWSRES9 @ SC63:/twstest>mkdir usr
TWSRES9 @ SC63:/twstest>cd /twstest/usr
TWSRES9 @ SC63:/twstest/usr>mkdir lpp
TWSRES9 @ SC63:/twstest/usr>cd /twstest/usr/lpp
TWSRES9 @ SC63:/twstest/usr/lpp>
8.1.2 EQQDDDEF
The EQQDDDEF job is run before the SMP/E APPLY job. An important note in the comments of the JCL for this job is shown in Example 8-6 on page 143.
Example 8-6 Output of the EQQDDDEF job: Note in the comments
IN STEP DEFPATH, CHANGE THE -PathPrefix- TO THE APPROPRIATE
HIGH LEVEL DIRECTORY NAME. THE -PathPrefix- STRING MUST MATCH
THE SPECIFICATION FOR THE -PathPrefix- STRING IN THE EQQISMKD JOB.
If no change is performed for the -PathPrefix- string, then
-PathPrefix- will become the high level directory name, which is
probably not what you want.
This is referring to the part of the EQQDDDEF job shown in Example 8-7.
Example 8-7 Part of the EQQDDDEF job
CHANGE PATH('/usr/lpp/TWS/'*,
            '-PathPrefix-/usr/lpp/TWS/'*).
What is not entirely clear is that if -PathPrefix- is set to a slash (/) in the EQQISMKD job, in the EQQDDDEF job, -PathPrefix- must be changed to nulls in order to get the command shown in Example 8-8.
Example 8-8 Part of the EQQDDDEF job
CHANGE PATH('/usr/lpp/TWS/'*,
            '/usr/lpp/TWS/'*).
If -PathPrefix- in EQQISMKD were set to /SERVICE, in EQQDDDEF, -PathPrefix- would also be changed to /SERVICE to get the command shown in Example 8-9.
Example 8-9 Part of the EQQDDDEF job
CHANGE PATH('/usr/lpp/TWS/'*,
            '/SERVICE/usr/lpp/TWS/'*).
8.1.3 EQQPCS05
The EQQPCS05 job creates the work directory (eqqWRKDIR) and creates some of the files that reside in this directory. The security considerations involved in the setup of the EQQPCS05 job are discussed in 8.2, Security issues with E2E on page 145. The key part of the EQQPCS05 job that must be customized is shown in Example 8-10 on page 144.
Example 8-10 The key part of the EQQPCS05 job that must be customized
//STDIN    DD PATH='/twstest/usr/lpp/TWS/V8R2M0/bin/config',
//            PATHOPTS=(ORDONLY)
//STDENV   DD *
eqqBINDIR=/twstest/usr/lpp/TWS/V8R2M0
eqqWRKDIR=/tws/twsae2ew
eqqUID=TWSRES9
eqqGID=TWS810
The PATH and eqqBINDIR both refer to the eqqBINDIR directory created by the EQQISMKD job (see 8.1.1, EQQISMKD on page 140). The eqqWRKDIR is the name of the work directory. eqqGID and eqqUID are discussed in 8.2, Security issues with E2E on page 145. After a correct run of the EQQPCS05 job, the contents of the work directory (from the z/OS UNIX System Services command ls -la) should look similar to the contents shown in Example 8-11.
Example 8-11 Contents of the work directory after a correct run of EQQPCS05
TWSRES9 @ SC63:/tws/twsae2ewz>ls -la
total 502988
drwxrwxrwx   5 TWSRES9 TWS810        832 Jul 30 05:16 .
drwxr-xr-x  16 HAIMO   SYS1         8192 Jul  6 18:39 ..
-rw-------   1 TWSRES9 TWS810        128 Jul 17 22:35 Intercom.msg
-rw-------   1 TWSRES9 TWS810         48 Jul 17 22:35 Mailbox.msg
-rw-rw----   1 TWSRES9 TWS810        839 Jul 31 23:48 NetConf
-rw-rw----   1 TWSRES9 TWS810         48 Jul 17 22:36 NetReq.msg
-rw-------   1 TWSRES9 TWS810  128672256 Jul 17 22:34 Sinfonia
-rw-r--r--   1 TWSRES9 TWS810  128672256 Jul 17 22:35 Symphony
-rw-rw----   1 TWSRES9 TWS810       1118 Jul 31 23:48 TWSCCLog.properties
-rw-rw-rw-   1 TWSRES9 TWS810        720 Jul 17 22:36 Translator.chk
-rw-rw-rw-   1 TWSRES9 TWS810          0 Jul 17 21:22 Translator.wjl
-rw-rw----   1 TWSRES9 TWS810       2743 Jul 31 23:48 localopts
drwxrwx---   2 TWSRES9 TWS810        544 Jul 17 22:33 mozart
drwxrwx---   2 TWSRES9 TWS810        384 Jul 17 22:33 pobox
drwxrwxr-x   4 TWSRES9 TWS810        384 Jul 17 21:21 stdlist
One problem not related to security that we have seen in testing occurs if the work directory mount point is not mounted at the time that the EQQPCS05 job is run. The error messages seen in this case are shown in Example 8-12 on page 145.
Example 8-12 Error messages
You are running as TWSRES9 (68248)
Running configuration with uid=68248
Configuration is running with real userid TWSRES9 (68248)
mkdir: FSUM6404 directory "/tws/twsae2ewz/mozart": EDC5111I Permission denied.
Configure failed executing: mkdir /tws/twsae2ewz/mozart.
Please correct the problem and rerun configure.
If for some reason it is necessary to delete and re-create the work directory (WRKDIR), issue the following command from TSO z/OS UNIX System Services to delete the WRKDIR first, and then run the EQQPCS05 job:
rm -r /WRKDIR
So, if for example WRKDIR is defined as /var/TWS/inst, use this procedure: 1. Issue the following z/OS UNIX System Services command:
rm -r /var/TWS/inst
2. Rerun the EQQPCS05 job to re-create the work directory.
If this is not the case, the following command can be issued to modify the APF attribute for the eqqBINDIR/bin and eqqBINDIR/bin/IBM directories:
extattr +a /directory-name
Example 8-13 Comments in the EQQPCS05 job
This JCL must be executed by a user:
a) with USS uid equal to 0
b) with a user permitted to the BPX.SUPERUSER FACILITY class profile within RACF
c) or with the user specified below in eqqUID and belonging to the group specified in eqqGID
These statements are true. However, they have in many cases been misread to mean that root authority is needed to run EQQPCS05, or for the eqqUID itself. This is untrue. Not only is root authority not needed, but using a root ID (UID(0)) either to run the EQQPCS05 job or as eqqUID causes more problems than using a non-root ID. Here are the key points concerning EQQPCS05 and E2E security:
- eqqUID must be the user ID of the E2E server task.
- eqqGID must be a group to which eqqUID belongs.
- The user ID of the controller task must belong to eqqGID.
- The user ID of all CP batch jobs (EXTEND, REPLAN, and SYMPHONY RENEW) must belong to eqqGID.
- The user ID of any Tivoli Workload Scheduler for z/OS dialog user who is allowed to submit CP batch jobs from the dialog must belong to eqqGID.
- The user ID under which the EQQPCS05 job runs must belong to eqqGID.
- All the user IDs listed above must have a default group (DFLTGRP) that has a valid OMVS segment (that is, a GID is assigned to that group).
- If a non-unique UID is used for eqqUID (that is, a root UID or a non-root UID that is assigned to multiple user IDs), all the user IDs sharing that UID must have a DFLTGRP that has a valid OMVS segment.
The reason that non-unique UIDs can cause a problem is that RACF maintains an in-memory list that relates UIDs to user IDs. For any UID, there can be only one user ID on the list, and if the UID is non-unique, the user ID will vary depending on the last user ID that was referenced for that UID. If a getuid call is issued to obtain the UID/user ID information, and the current user ID on the list has a default group (DFLTGRP) that does not have a valid OMVS segment, the getuid call will fail, causing E2E processing to fail. In the following sections, we describe some examples of common security-related problems in E2E.
146
Customizing IBM Tivoli Workload Scheduler for z/OS V8.2 to Improve Performance
The messages shown in Example 8-15 would also be seen in the USS netman log with duplicate UIDs.
Example 8-15 Excerpt from the netman log
NETMAN:+ AWSEDW026E Error firing child process: Service 2002 for
NETMAN:+ TWS_z/OS/8.2 on OPCMASTER(UNIX) AWSDCJ010E Error opening new
NETMAN:+ stdlist, Error: EDC5123I Is a directory.
In the case that caused these error messages, there were two user IDs, TSTCOPC and TSUOMVS, that both were defined with UID(3). TSTCOPC was used as eqqUID when the EQQPCS05 job was run.
Example 8-16 TSTCOPC and TSUOMVS user IDs
//STDENV DD *
eqqBINDIR=/usr/lpp/TWS/V8R2M0
eqqWRKDIR=/BAYP/twswork
eqqGID=OPCGRP
eqqUID=TSTCOPC
However, the ID TSUOMVS was the ID under which the EQQPCS05 job was run. The E2E server and controller tasks both ran under the TSTCOPC ID. A display of the stdlist and logs directories shows that TSUOMVS was the owner when the owner should have been TSTCOPC (eqqUID). See Example 8-17.
Example 8-17 Display of the stdlist and logs directories
/BAYP/twswork/stdlist # ls -lisa
total 72
3121 16 drwxrwxr-x  4 TSUOMVS OPCGRP 8192 Feb 18 14:50 .
   0 16 drwxrwxrwx  5 TSUOMVS OPCGRP 8192 Feb 18 17:31 ..
3125 16 drwxrwxrwx  2 TSUOMVS OPCGRP 8192 Feb 18 14:50 2004.02.18
3122 16 drwxrwxr-x  2 TSUOMVS OPCGRP 8192 Feb 18 14:50 logs
3124  8 -rwxrwxrwx  1 TSUOMVS OPCGRP   32 Feb 18 16:24 stderr
3123  0 -rwxrwxrwx  1 TSUOMVS OPCGRP    0 Feb 18 16:24 stdout
/BAYP/twswork/stdlist/logs # ls -lisa
total 80
3122 16 drwxrwxr-x  2 TSUOMVS OPCGRP  8192 Feb 18 14:50 .
3121 16 drwxrwxr-x  4 TSUOMVS OPCGRP  8192 Feb 18 14:50 ..
3127  8 -rwxr--r--  1 TSUOMVS OPCGRP  3450 Feb 18 16:29 20040218_E2EMERGE.log
3129 40 -rwxr--r--  1 TSUOMVS OPCGRP 17689 Feb 18 17:35 20040218_NETMAN.log
The way permissions are set up on the logs, only the owner or a root ID can write to them. Because the ownership shows as TSUOMVS while the E2E server is running under ID TSTCOPC (and thus the USS netman process is running under TSTCOPC), the netman process is unable to write to the logs directory, which causes the AWSEDW026E error in the netman log. To avoid this problem, ensure that the eqqUID used in the EQQPCS05 job has a unique OMVS UID. This can be enforced by implementing the support of RACF APAR OW52135 (NEW SUPPORT FOR THE MANAGEMENT OF UNIX UIDS AND GIDS) or by careful assignment of UIDs when RACF IDs are created. Using a root authority ID, it is possible to display the current list of IDs and UIDs from the TSO ISHELL environment as follows:
1. From TSO ISHELL, select SETUP with your cursor, and then enter 7 (ENABLE SU MODE).
2. Select SETUP with your cursor again, and then enter 2 (USER LIST).
3. In the USER LIST display, select FILE with your cursor, and then enter 2 (SORT UID).
4. Select a UID that is not currently in use.
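With the OW52135 support installed, RACF can also assign an unused UID automatically; a minimal sketch, assuming hypothetical UID/GID ranges on the BPX.NEXT.USER profile:

RDEFINE FACILITY BPX.NEXT.USER APPLDATA('10000-19999/10000-19999')
ALTUSER TSTCOPC OMVS(AUTOUID)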
The RACF error messages shown in Example 8-19 on page 149 were issued for the O82S (E2E server) task; however, the task kept running.
Example 8-19 RACF error messages
ICH408I USER(O82C    ) GROUP(OPC     ) NAME(TWS82 CONTROLLER)
  /var/TWS820/inst/Intercom.msg
  CL(FSOBJ   ) FID(01D9F0F1F9F1C5000F04000018F90000)
  INSUFFICIENT AUTHORITY TO OPEN
  ACCESS INTENT(RW-)  ACCESS ALLOWED(GROUP ---)
  EFFECTIVE UID(0000000146)  EFFECTIVE GID(0001000009)
ICH408I USER(O82C    ) GROUP(OPC     ) NAME(TWS82 CONTROLLER)
  /var/TWS820/inst/Mailbox.msg
  CL(FSOBJ   ) FID(01D9F0F1F9F1C5000F04000018F00000)
  INSUFFICIENT AUTHORITY TO OPEN
  ACCESS INTENT(RW-)  ACCESS ALLOWED(GROUP ---)
  EFFECTIVE UID(0000000146)  EFFECTIVE GID(0001000009)
ICH408I USER(O82C    ) GROUP(OPC     ) NAME(TWS82 CONTROLLER)
  /var/TWS820/inst/Intercom.msg
  CL(FSOBJ   ) FID(01D9F0F1F9F1C5000F04000018F90000)
  INSUFFICIENT AUTHORITY TO OPEN
  ACCESS INTENT(RW-)  ACCESS ALLOWED(GROUP ---)
  EFFECTIVE UID(0000000146)  EFFECTIVE GID(0001000009)
The user ID of the server task was corrected using the RACF commands shown in Example 8-21, and then the server task was restarted.
Example 8-21 RACF commands to correct the user ID of the server task
RALTER STARTED O82S.* STDATA(USER(O82STSO) GROUP(OPC) TRUSTED(NO))
SETROPTS RACLIST(STARTED) REFRESH
SETROPTS GENERIC(STARTED) REFRESH
Note that now the user ID of the server task matches eqqUID. However, access to the stdlist/logs directory in USS WRKDIR was still not right, causing the RACF message shown in Example 8-22 on page 150 to be issued continuously in the server JESMSGLG.
Example 8-22 Excerpt from the server JESMSGLG
ICH408I USER(O82STSO ) GROUP(OPC     ) NAME(O82STSO)
  /var/TWS820/inst/stdlist/logs//20040801_E2EMERGE.log
  CL(FSOBJ   ) FID(01D9F0F1F9F1C5000F0400001D360000)
  INSUFFICIENT AUTHORITY TO OPEN
  ACCESS INTENT(-W-)  ACCESS ALLOWED(GROUP R--)
  EFFECTIVE UID(0000000154)  EFFECTIVE GID(0001000009)
Example 8-23 shows that the files in the stdlist/logs directory were still owned by the O82C user ID (the user ID that was used the first time the server task was started).
Example 8-23 Files in the stdlist/logs directory
-rwxr--r--  1 O82C OPC 21978 Aug  1 00:48 20040801_E2EMERGE.log
-rwxr--r--  1 O82C OPC  4364 Aug  1 00:48 20040801_NETMAN.log
SVIOLA:/SYSTEM/var/TWS820/inst/stdlist/logs>
We tried to correct this with the chown command, but this was unsuccessful. To get the server working correctly again, we deleted the wrkdir (using the command rm -r /wrkdir) and then reran the EQQPCS05 job.
To correct this, either change the user ID of the CP batch job to a user ID that is already part of the group eqqGID, or connect the user ID to the group eqqGID with the following RACF command, using the appropriate values for user ID and eqqGID:
connect (userid) group(eqqGID) uacc(update)
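For example, if the CP batch jobs run under the hypothetical user ID TWSBAT and eqqGID is the hypothetical group TWSGRP:

connect (TWSBAT) group(TWSGRP) uacc(update)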
After this is done, when the CP batch job is run again, the messages shown in Example 8-25 should be issued instead of the ones shown in Example 8-24 on page 150.
Example 8-25 Excerpt from EQQMLOG
EQQ3105I A NEW CURRENT PLAN (NCP) HAS BEEN CREATED
EQQ3106I WAITING FOR SCP
EQQ3107I SCP IS READY: START JOBS ADDITION TO SYMPHONY FILE
EQQ3108I JOBS ADDITION TO SYMPHONY FILE COMPLETED
Therefore, if the DM will not link, and the messages shown in Example 8-26 on page 151 are seen in TWSMERGE, the nm port value should be checked and compared to the CPUTCPIP value. In this case, correcting the CPUTCPIP value and running a SYMPHONY RENEW job eliminated the problem. We did another test with the same DM, this time setting CPUTCPIP to 31113.
Example 8-27 Setting CPUTCPIP to 31113
CPUREC CPUNAME(HR82)
       CPUTCPIP(31113)
The TOPOLOGY PORTNUMBER was also set to 31113, which is its normal value:
TOPOLOGY PORTNUMBER(31113)
After cycling the E2E server and running a CP EXTEND, the DM and all the FTAs were LINKED and ACTIVE, which is not what was expected. See Example 8-28.
Example 8-28 Messages showing the DM and all the FTAs LINKED and ACTIVE
EQQMWSLL -------- MODIFYING WORK STATIONS IN THE CURRENT PLAN   Row 1 to 8 of 8
Enter the row command S to select a work station for modification, or
I to browse system information for the destination.

Row  Work  station text            L S T R  Completed    Active  Remaining
cmd  name                                   oper   dur.  oper    oper   dur.
'    HR82  PDM on HORRIBLE         L A C A      4  0.00      0     13   0.05
'    OP82  MVS XAGENT on HORRIBLE  L A C A      0  0.00      0      0   0.00
'    R3X1  SAP XAGENT on HORRIBLE  L A C A      0  0.00      0      0   0
How could the DM be ACTIVE if the CPUTCPIP value was intentionally set to the wrong value? What we found is that there was an FTA in the network that was
set up with nm port=31113. It was actually a master domain manager (MDM) for a Tivoli Workload Scheduler V8.1 distributed-only (not E2E) environment. So our Version 8.2 E2E environment connected to the Version 8.1 MDM as though it were HR82. This illustrates that extreme care must be taken to code the CPUTCPIP values correctly, especially if there are multiple Tivoli Workload Scheduler environments present (for example, a test system and a production system). The localopts nm ipvalidate parameter can be used to prevent the overwrite of the Symphony file due to incorrect parameters being set up. If the following value is specified in localopts, the connection is not allowed if IP validation fails:
nm ipvalidate=full
However, if SSL is active, we recommend that you use the following localopts parameter:
nm ipvalidate=none
After setting PORTNUMBER in TOPOLOGY to 3000 and running a CP EXTEND to create a new Symphony file, there were no obvious indications in the messages that there was a problem with the PORTNUMBER setting. However, the following message appeared in the netman log in USS stdlist/logs:
NETMAN:Listening on 3000 timeout 10 started Sun Aug 1 21:01:57 2004
These messages then occurred repeatedly in the netman log, as shown in Example 8-30.
Example 8-30 Excerpt from the netman log
NETMAN:+ AWSEDW020E Error opening IPC:
NETMAN:AWSDEB001I Getting a new socket: 7
If these messages are seen, and the DM will not link, the following command can be issued to determine that the problem is a reserved TCP/IP port:
TSO NETSTAT PORTLIST
Example 8-31 shows the output, including the values for the PORTNUMBER port (3000).
Example 8-31 Output of the TSO NETSTAT PORTLIST command
EZZ2350I MVS TCP/IP NETSTAT CS V1R5    TCPIP Name: TCPIP
EZZ2795I Port# Prot User     Flags    Range      IP Address
EZZ2796I ----- ---- ----     -----    -----      ----------
EZZ2797I 03000 TCP  CICSTCP  DA
When the E2E server was up, it handled port 424. When the E2E server was down, port 424 was handled by the controller task (which still had TCPIPPORT set to the default value of 424). Because there were some TCP/IP connected trackers defined on that system, message EQQTT11E was issued, because the FTA IP addresses did not match the TCP/IP addresses in the ROUTOPTS parameter.
The OPC connector got the error messages shown in Example 8-34 on page 155, and the JSC would not function.
Example 8-34 Error message for the OPC connector
GJS0005E Cannot load workstation list. Reason: EQQMA11E Cannot allocate
connection EQQMA17E TCP/IP socket I/O error during Recv() call for
"SocketImpl<Binding=dns name/ip address,port=446,localport=4699>" failed
with error: 10054=Connection reset by peer
In order to get the OPC connector and JSC working again, it was necessary to change the TOPOLOGY PORTNUMBER to a different value (not equal to the SERVOPTS PORTNUMBER) and cycle the E2E server task. Note that this problem could occur if the JSC and E2E PROTOCOL functions were implemented in separate tasks (one task E2E only, one task JSC only) if the two PORTNUMBER values were set to the same value.
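For clarity, a sketch of the two statements with distinct ports, using values from our environment (446 for the connector, 31113 for E2E):

SERVOPTS PORTNUMBER(446)     /* JSC/OPC connector port   */
TOPOLOGY PORTNUMBER(31113)   /* E2E server (netman) port */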
The x and y in the message example would be replaced by the actual run number values. Sometimes the problem is resolved by running a SYMPHONY RENEW or CP REPLAN (or CP EXTEND) job. However, there are some other things to check if this does not correct the problem. The EQQPT52E message can be caused if new FTA workstations are added through the Tivoli Workload Scheduler for z/OS dialog, but the TOPOLOGY parameters are not updated with the new CPUREC information. In this case, adding the TOPOLOGY information and running a CP batch job should resolve the problem. EQQPT52E can also occur if there are problems with the ID used to run the CP batch job or the E2E server task. See 8.2, Security issues with E2E on page 145 for a discussion about how the security user IDs should be set up for E2E. One clue that a user ID problem is involved with the EQQPT52E message is if after the CP batch job completes, there is still a file in the WRKDIR whose name is Sym, plus the user ID that the CP batch job runs under. For example, if the CP EXTEND job runs under ID TWSRES9, the file in the WRKDIR would be named SymTWSRES9. If security were set up correctly, the SymTWSRES9 file would have been renamed to Symnew before the CP batch job ended. If the cause of the EQQPT52E still cannot be determined, add the DIAGNOSE statements shown in Example 8-36 to the parm file indicated.
Example 8-36 DIAGNOSE statements added
(1) CONTROLLER:          DIAGNOSE NMMFLAGS('00003000')
(2) BATCH (CP EXTEND):   DIAGNOSE PLANMGRFLAGS('00040000')
(3) SERVER:              DIAGNOSE TPLGYFLAGS(X'181F0000')
Then collect this list of documentation for analysis:
- Controller and server EQQMLOGs
- Output of the CP EXTEND (EQQDNTOP) job
- EQQTWSIN and EQQTWSOU files
- USS stdlist/logs directory (or a tar backup of the entire WRKDIR)
running, the job will hang indefinitely until it is cancelled. The relevant ENQs, which can be seen by issuing the D GRS,RES=(SYSZDRK,*) command, are shown in the following list. The actual subsystem name replaces subn:
- subnEQQTWSIE
- subnEQQTWSOE
- subnEQQTWSIF
- subnEQQTWSOF
If an E2E environment was not intended, or if it is critical to get the CP batch job to complete successfully in order to allow non-E2E processing to proceed, the following changes can be made:
- Comment out the TPLGYPRM keyword in the BATCHOPT statement.
- Comment out the TPLGYSRV parameter in OPCOPTS.
- Cycle (stop and then start) the controller task.
After this, the CP batch job should run correctly, but E2E processing will be disabled.
Example 8-37 Excerpt from EQQMLOG
EQQ3106I WAITING FOR SCP
EQQ3107I SCP IS READY: START JOBS ADDITION TO SYMPHONY FILE
This might be a normal delay, especially if the network of DMs and FTAs is large. After the EQQ3106I message is issued, the server issues STOP commands to OPCMASTER (batchman in the server) and all FT workstations. Then, the server waits until all the stopped processes and FTAs return an acknowledgment and signals to the normal mode manager (NMM) that the process has completed. The NMM applies the JT and finally copies the updated new current plan into CP1, CP2, and SCP. At this point, the message EQQ3107I is issued. However, it is also possible that the problem is not related just to the number of FTAs that need to be stopped. There could be a problem within the TCP/IP network or the communication between Tivoli Workload Scheduler for z/OS and TCP/IP. APAR PQ92466 deals with such a problem.
After TCP/IP was initialized, the following message was written to the server EQQMLOG. However, the server task did not recover correctly.
EQQPH28I THE TCP/IP STACK IS AVAILABLE
The server task had to be cycled in order to connect to TCP/IP correctly. With the PQ90369 fix applied, the messages shown in Example 8-39 on page 160 occur in the server EQQMLOG if the server is started prior to TCP/IP being initialized.
Example 8-39 Excerpt from EQQMLOG
EQQPH09I THE SERVER IS USING THE TCP/IP PROTOCOL
EQQPH18E COMMUNICATION FAILED, THE SOCKET
EQQPH18I SOCKET CALL FAILED WITH ERROR CODE 1036
EQQPH08I TCP/IP IS EITHER INACTIVE OR NOT READY
EQQPH08I CHECK THAT TCP/IP IS AVAILABLE
EQQPH00I SERVER TASK HAS STARTED
EQQPH18E COMMUNICATION FAILED, THE SOCKET
EQQPH18I SOCKET CALL FAILED WITH ERROR CODE 1036
EQQPH08I TCP/IP IS EITHER INACTIVE OR NOT READY
EQQPH08I CHECK THAT TCP/IP IS AVAILABLE
However, after TCP/IP is initialized, the messages shown in Example 8-40 are seen in the server EQQMLOG, and the server processes normally (no need to cycle).
Example 8-40 Excerpt from EQQMLOG
EQQPH28I THE TCP/IP STACK IS AVAILABLE
EQQPH37I SERVER CAN RECEIVE JSC REQUESTS
In addition, a new message has been added to indicate that the server will retry the TCP/IP connection every 60 seconds, as shown in Example 8-41.
Example 8-41 Excerpt from EQQMLOG
EQQPT14W TCPIP stack is down. A connection retry will be attempted in 60 seconds.
Use the information in INFO APAR II13900 to download the latest list of recommended E2E maintenance and make sure that it is applied. Also check the globalopts file of the FTA: Timezone enable must be set to YES.
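That is, globalopts should contain a line like the following (a sketch; verify the exact option spelling in your own globalopts file):

Timezone enable = yes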
When a CP EXTEND was run with the incorrect CPUTZ value, the messages shown in Example 8-42 were seen in the CP batch job EQQMLOG.
Example 8-42 Excerpt from EQQMLOG
EQQ3105I A NEW CURRENT PLAN (NCP) HAS BEEN CREATED
EQQ3103E EQQputcp : UNABLE TO LOAD TIMEZONE INFORMATION FOR /America/Chicago
EQQ3103I TIMEZONE UTC WILL BE USED FOR CPU HR82
EQQ3099E THE REASON OF THE PREVIOUS ERROR IS:
EQQ3099I EDC5129I No such file or directory.
EQQ3106I WAITING FOR SCP
The CP EXTEND runs to completion despite the error. However, any jobs set to run on the affected FTA would start at the wrong time. Therefore, if jobs are running at the wrong time on some FTAs, check the EQQMLOG for the last CP batch job that was run and look for message EQQ3103E.
If repeated AWSBCV082I messages are seen in the USS TWSMERGE log, check the file system for the FTA mentioned in the message and take corrective action if the file system is full or nearly full.
Note that this is the same message that would appear if the E2E server task were down when a CP batch job was started. After verifying that the E2E server task was running, we checked the controller EQQMLOG and found the messages shown in Example 8-43.
Example 8-43 Excerpt from EQQMLOG
EQQW030I A DISK DATA SET WILL BE FORMATTED, DDNAME = EQQTWSIN
EQQW038I A DISK DATA SET HAS BEEN FORMATTED, DDNAME = EQQTWSIN
EQQG001I SUBTASK E2E RECEIVER HAS STARTED
EQQ3120E END-TO-END TRANSLATOR SERVER PROCESS IS NOT AVAILABLE
EQQW038I A DISK DATA SET HAS BEEN FORMATTED, DDNAME = EQQTWSOU
EQQG001I SUBTASK E2E SENDER HAS STARTED
EQQZ193I END-TO-END TRANSLATOR SERVER PROCESS NOW IS AVAILABLE
The JESMSGLG of the controller will also show D37 abends for the files that require formatting, as shown in Example 8-44.
Example 8-44 Excerpt from JESMSGLG
IEC031I D37-04,IFG0554P,TWSA,TWSA,EQQTWSIN,3595,SBOXB6,TWS.SC63.TWSIN
IEC031I D37-04,IFG0554P,TWSA,TWSA,EQQTWSOU,34CB,SBOXA8,TWS.SC63.TWSOU
If an RC=12 occurs in a CP batch job, check to make sure that the job was not submitted before the following message appears in the controller EQQMLOG:
EQQG001I SUBTASK E2E SENDER HAS STARTED
If the batch job was submitted too early, restart the job and check for a successful completion code.
Check whether the PTF for APAR PQ77970 was applied. The ACTION HOLD for this PTF was missing an item, and this information was added in INFO APAR II13859. To avoid the EQQW086E message, use the following procedure:
1. Make sure that there are no JCL downloads and joblog retrievals pending.
2. Stop the Tivoli Workload Scheduler E2E server.
3. Delete the <workdir>/Translator.wjl file.
4. Restart the server; the Translator.wjl file is reallocated automatically.
The next time the controller is started, the EQQW086E message will not appear in the EQQMLOG.
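For example, assuming a hypothetical work directory of /tws/twswork, step 3 could be performed from the z/OS UNIX shell with:

rm /tws/twswork/Translator.wjl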
The next sections illustrate the effect of having the values for MAXFILEPROC, MAXPROCSYS, and MAXUIDS set too low.
Note that the normal value on the system used for the test was 2000, which is set in the BPXPRMxx member in the system PARMLIB. The current value of MAXFILEPROC is also displayed if the following command is issued:
D OMVS,O
With MAXFILEPROC=5 in effect, the E2E server was started, and the messages shown in Example 8-47 were seen in the EQQMLOG.
Example 8-47 Excerpt from EQQMLOG
EQQPH33I THE END-TO-END PROCESSES HAVE BEEN STARTED
EQQZ024I Initializing wait parameters
EQQPH07E THE SERVER STARTER PROCESS ABENDED.
EQQPH07I THE STARTER PROCESS WAS CREATED TO PROCESS E2E REQUESTS
The TWSMERGE log in USS stdlist had the following message for the domain manager (HR82 in this example) and the FTAs:
MAILMAN:+ AWSBCV027I Unlinking from HR82
After correcting the MAXFILEPROC value with the following command and cycling the E2E server, the problem was corrected:
SETOMVS MAXFILEPROC=2000
The conclusion is that if message EQQPH07E is seen in the server EQQMLOG, the value of MAXFILEPROC should be checked and increased if needed.
After this, the E2E server task was started. The messages shown in Example 8-48 appeared in the server EQQMLOG.
Example 8-48 Excerpt from EQQMLOG
EQQZ005I OPC SUBTASK SERVER IS BEING STARTED
EQQPH09I THE SERVER IS USING THE TCP/IP PROTOCOL
EQQPH18E COMMUNICATION FAILED,
EQQPH18I THE SOCKET SOCKET CALL FAILED WITH ERROR CODE 156
EQQPH08I TCP/IP IS EITHER INACTIVE OR NOT READY
EQQPH08I CHECK THAT TCP/IP IS AVAILABLE
EQQPH00I SERVER TASK HAS STARTED
EQQPH18E COMMUNICATION FAILED,
EQQPH18I THE SOCKET SOCKET CALL FAILED WITH ERROR CODE 156
EQQPH08I TCP/IP IS EITHER INACTIVE OR NOT READY
EQQPH08I CHECK THAT TCP/IP IS AVAILABLE
EQQPH35E CANNOT START STARTER PROCESS:
EQQPH35I BPX1ATX FAILED WITH RC=0156, RSN=0B0F0028
We used the TSO BPXMTEXT command to check the meaning of the reason code shown in the EQQPH35I message (0B0F0028):
TSO BPXMTEXT 0B0F0028
To correct the problem, MAXPROCSYS was reset to its normal value with this command, and the E2E server was stopped and started again:
setomvs maxprocsys=200
The following console message appeared immediately, but we ignored it and attempted to start the E2E server anyway:
*BPXI039I SYSTEM LIMIT MAXUIDS HAS REACHED 300% OF ITS CURRENT CAPACITY
These messages are very similar to those issued when MAXPROCSYS was set too low, with the exception of the reason code in the EQQPH35I message, which is 130C0013. We issued the following command to check the meaning of this RSN code:
TSO BPXMTEXT 130C0013
The problem was corrected by issuing the following command and cycling the E2E server task:
setomvs maxuids=50
Example 2:
StartUp
conman start
conman link @!@;noask

Example 3:
StartUp
conman start
conman link @!opcmaster
2. Add a CPUREC definition to the parmlib member specified in the TPLGYMEM parameter of the TOPOLOGY statement.
3. Run a CP REPLAN or EXTEND job to put the new workstation into effect.
4. Note that it is not necessary to cycle the E2E server task or controller.
Use the following procedure to remove an E2E FTA:
1. Delete the workstation from the Tivoli Workload Scheduler for z/OS dialog (panel 1.1.2).
2. Remove the CPUREC definition from the parmlib member specified in the TPLGYMEM parameter of the TOPOLOGY statement.
3. Run a CP REPLAN or EXTEND job to put the workstation change into effect.
4. Note that it is not necessary to cycle the E2E server task or controller.
The contents of the parmlib member ENV will be a single line that refers to the SYSMDUMP data set name:
_BPXK_MDUMP=tws.e2e.sysmdump
After the server task is started with these changes in place, it is necessary to determine the process ID (PID) of the translator process. This can be done by looking at the server EQQMLOG for message EQQPT01I, where bindir is the bin directory (eqqBINDIR):
EQQPT01I Program "/bindir/bin/translator" has been started, pid is nnnnnnnn
Note that the EQQPT01I message is also issued for the netman process. Be sure to use the translator PID. The other way to get the PID for the translator is to issue the following command:
D OMVS,A=ALL
The preceding command displays information for all OMVS processes. By checking the hexadecimal ASID value (ASIDX) for the E2E server in SDSF, the following command can be issued to display information only for the processes owned by the E2E server task:
D OMVS,A=asid
In the previous command, asid is the hexadecimal ASID of the server address space. An example of a display of the OMVS processes for a E2E server task is shown in Example 8-53.
Example 8-53 Example of a display of the OMVS processes for an E2E server task
BPXO040I 20.05.52 DISPLAY OMVS 044
OMVS     000D ACTIVE          OMVS=(01,0A,00)
USER     JOBNAME  ASID        PID       PPID STATE   START     CT_SECS
O82STSO  O82S     003B   50331709   67109052 1F----  20.03.13     .58
  LATCHWAITPID=         0 CMD=/u/u/usr/lpp/TWS/V8R2M0/bin/netman
O82STSO  O82S     003B   67108957          1 MW----  20.03.12     .58
  LATCHWAITPID=         0 CMD=EQQPHTOP
O82STSO  O82S     003B        126   67109046 1F----  20.03.25     .58
  LATCHWAITPID=         0 CMD=/u/u/usr/lpp/TWS/V8R2M0/bin/batchman
O82STSO  O82S     003B   16777357   67109052 HS----  20.03.13     .58
  LATCHWAITPID=         0 CMD=/u/u/usr/lpp/TWS/V8R2M0/bin/translator
O82STSO  O82S     003B   67109046   50331709 1F----  20.03.24     .58
  LATCHWAITPID=         0 CMD=/u/u/usr/lpp/TWS/V8R2M0/bin/mailman
O82STSO  O82S     003B   67109052   67108957 1S----  20.03.12     .58
  LATCHWAITPID=         0 CMD=/u/u/usr/lpp/TWS/V8R2M0/bin/starter
In this example, 16777357 was the PID of the translator process. After the PID of the translator process has been determined by either method, issue the following command to capture the dump with LE information:
F BPXOINIT,DUMP=translator_pid
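For the translator PID shown in Example 8-53, for instance, the command would be:

F BPXOINIT,DUMP=16777357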
The dump will be shown as SEC6 abend in the server address space JESMSGLG, as shown in Example 8-54.
Example 8-54 Excerpt from JESMSGLG
IEA995I SYMPTOM DUMP OUTPUT 083
SYSTEM COMPLETION CODE=EC6  REASON CODE=0D2FFD27
EQQFSD5I SYSOUT DATABASE INITIALIZATION COMPLETE
EQQFSD1I SYSOUT DATABASE ERROR HANDLER TASK STARTED
EQQFSW1I DATA STORE WRITER TASK INITIALIZATION COMPLETED
.............
EQQFSW1I DATA STORE WRITER TASK INITIALIZATION COMPLETED
EQQFCU1I CLEAN UP TASK STARTED
EQQFJK3I DATA STORE JESQUEUE TASK INITIALIZATION COMPLETED
EQQFCU3I CLEAN UP TASK RUNNING
EQQFSR1I DATA STORE READER TASK INITIALIZATION COMPLETED
EQQFCUPI CLEAN UP ISSUED DELETE REQUEST FOR 00000000000 SYSOUTS
EQQFCUPI STRUCTURED   : 00000000000
EQQFCUPI UNSTRUCTURED : 00000000000
EQQFCU4I CLEAN UP TASK IN QUIESCE STATUS
b. Display the XCF group that the controller and DST are using, as shown in Example 8-56. You should see only the controller's and DST's XCF group.
Example 8-56 Displaying the XCF group that the controller and DST are using: Part 1
Controller FLOPTS:
FLOPTS CTLMEM(XCFCDST1)
       DSTGROUP(TWSDSGRP)
       XCFDEST(xxxxxxxx.XCFDDST1)
You are receiving JES resource shortage messages (HASP):
a. Verify that the DST is up (it would also be good to check under SDSF to see that it is getting cycles).
b. Check the DST MLOG for errors such as Files full.
c. Verify that it can retrieve output by logging in to Tivoli Workload Scheduler for z/OS and requesting a completed job.
d. Check the JES class and destination you have defined to the DST to see if they are growing. You can go to SDSF and check the destination to see if the list is growing. You can also use the following JES command to check:
$do jobq,queue=d,dest=twsdst64
If everything seems as it should be, but you still receive the spool messages, you might need to go back and look at the tuning parameters in Chapter 3, Optimizing Symphony file creation and distribution on page 27.
The syslogd configuration file, /etc/syslogd.conf, defines where the various types of messages are routed. Example 8-60 shows an example of the configuration file.
Example 8-60 The syslogd configuration file
# all alert messages (and above priority messages) go to the MVS console
*.alert /dev/console
#
# all authorization messages go to auth.log
auth.* /var/log/%Y/%m/%d/auth.log
#
# all error messages (and above priority messages) go to error.log
*.err /var/log/%Y/%m/%d/error.log
#
# all debug messages (and above priority messages) from telnet
# go to local1.debug
local1.debug /var/log/%Y/%m/%d/local1.debug
As you can see in Example 8-60 on page 172, alert messages are issued at the console, and Telnet debug messages, for example, are routed to the /var/log/ directory. Note: Remember that messages can be located in different places.
In order to list messages online, you can use the LookAt message tool, available at:
http://www.ibm.com/servers/s390/os390/bkserv/lookat/lookat.html
The cccc is a reason code qualifier, used to identify the issuing module (it represents a module ID). The second two bytes are the reason code itself, which is described in the messages books. If the qualifier value is between 0000 and 20FF, and the return code is not A3 or A4, this is a USS reason code. In this situation, you can use the BPXMTEXT procedure to get more information. See also 8.1.1, EQQISMKD on page 140 for an example of the BPXMTEXT command.
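As a minimal sketch, using the reason code from Example 8-54 (whose qualifier, 0D2F, falls in the USS range), you could enter the following from TSO:

BPXMTEXT 0D2FFD27

BPXMTEXT returns the z/OS UNIX description and suggested action for the reason code, which is usually faster than looking it up in the manuals.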
The MVS SYSLOG might contain useful messages if you issued any commands to OMVS, such as the following:
D OMVS,A=ALL
D OMVS,O
D OMVS,L

It is important to ensure that the EQQMLOG files are not purged or overwritten. For example, if you write the EQQMLOGs to SYSOUT, either keep them on the spool or use your SYSOUT archival product to store them. If you write the EQQMLOGs to disk, insert a step in the startup JCL for the task to save the previous EQQMLOG to a generation data group (GDG) or other data set before the step that would overwrite the EQQMLOG file.
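A minimal sketch of such a save step follows; the data set names TWS.CONTROLLER.EQQMLOG and the GDG base TWS.CONTROLLER.MLOGGDG are hypothetical, and the GDG base is assumed to be defined already:

//SAVEMLOG EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//* Copy the previous message log to a new GDG generation
//* before the controller step reuses the EQQMLOG data set
//SYSUT1   DD DISP=SHR,DSN=TWS.CONTROLLER.EQQMLOG
//SYSUT2   DD DSN=TWS.CONTROLLER.MLOGGDG(+1),
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(5,5),RLSE),
//            DCB=TWS.CONTROLLER.EQQMLOG

Place this step ahead of the step that allocates EQQMLOG in the started-task JCL, so that the log from the previous run is preserved before it is overwritten.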
Appendix A.
In order to circumvent this directory search and go directly to a small, operation-specific library, you can use exit EQQUX002. For improved performance of EQQUX002, we recommend that EQQUX000 (the Tivoli Workload Scheduler for z/OS start/stop exit) be used to perform the open and close processing needed for the JCL PDS libraries. In this book, we provide you with both exits. For downloading instructions, refer to Appendix C, Additional material on page 197.
Important: The exits provided with this IBM Redbook are supplied AS IS, with no official support.
All functions are controlled by parameters coded in a member called USERPARM, which should reside in the controller's EQQPARM data set concatenation. The functions are driven by data available to Tivoli Workload Scheduler for z/OS from the CPOC (occurrence), CPOP (operation), or CPWS (workstation) records available to EQQUX002 when it is called. The record you use is identified by the characters OC (CPOC), OP (CPOP), or WS (CPWS) in the statements in USERPARM. These records, and their offsets, are documented in the appendixes of IBM Tivoli Workload Scheduler for z/OS Program Interfaces, SC32-1266. To use any of the following functions, both EXIT00 and EXIT02 must be enabled in the Tivoli Workload Scheduler for z/OS controller initialization deck.
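A minimal sketch of the initialization statements that enable both exits, using the standard EXITS CALLnn/LOADnn keywords (verify the statement against your own initialization deck):

EXITS CALL00(YES) LOAD00(EQQUX000)
      CALL02(YES) LOAD02(EQQUX002)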
The common prefix for the JCL library ddnames is defined by the JCLPREFIX=nnnn statement, where nnnn is the prefix used. When Tivoli Workload Scheduler for z/OS starts, EQQUX000 is called and identifies the ddnames that refer to the JCL libraries by this common prefix. The ddnames are completed with data from the records available to EQQUX002 when the JCL needs to be fetched. For example, you might decide to hold your JCL in libraries defined by ddnames OPCJwsid, where OPCJ is the literal value defined as JCLPREFIX and wsid is the workstation name. There are two methods of identifying the suffix that completes the ddname for each operation's JCL: JCLDEFAULT (required) and JCLOVERRIDE. The format of the two statements is the same, keyword=(xx,nnn,l), where the keyword is JCLDEFAULT or JCLOVERRIDE.
xx is the record type (OC, OP, or WS).
nnn is the decimal offset in the record to the string of data that completes the ddname.
l is the length of the character string.
Notes: The maximum length of a ddname is 8, so ensure that your prefix and suffix together do not exceed this length. The JCLOVERRIDE suffix is optional and is applied if the JCLDEFAULT value resolves to a blank.
For example, you might want to fetch your JCL from the libraries defined by ddnames OPCJwsid, except where the operation's CLASS field has been given a value. See Example A-1.
Example: A-1 JCL code example
JCLPREFIX=OPCJ
JCLDEFAULT=(OP,240,1)
JCLOVERRIDE=(OP,78,4)
Therefore, if the operation is on workstation CPU1 and the class field is blank, the JCL is fetched from the file or files identified by ddname OPCJCPU1: the JCLDEFAULT value resolves to a blank, so the JCLOVERRIDE suffix (the workstation name) is used instead. However, if the class field is set to W, the JCL is fetched from the file or files identified by ddname OPCJW.
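A hedged illustration of the corresponding DD statements in the controller started-task JCL (the data set names are hypothetical):

//OPCJCPU1 DD DISP=SHR,DSN=TWS.JCLLIB.CPU1
//OPCJCPU2 DD DISP=SHR,DSN=TWS.JCLLIB.CPU2
//OPCJW    DD DISP=SHR,DSN=TWS.JCLLIB.CLASSW

EQQUX000 opens these small libraries at startup, and EQQUX002 resolves each operation directly to one of them instead of searching the full EQQJBLIB concatenation.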
Give some thought to the way this is used, so that a sufficient number of model members can be created for your installation. Note that this is a global definition. The model JCL is fetched if the first x characters of the operation name match those defined as the model prefix.
MODELPREFIX=xxxxxxx describes the model member prefix, a literal character value from 1 to 7 characters in length.
MODELSUFFIX=(oo,nnn,l) describes the suffix that is concatenated to the prefix to determine the whole member name, where:
oo is the record used (OC, OP, or WS).
nnn is the offset in that record to the suffix string.
l is the length of the suffix string (ensure that the prefix plus suffix is less than or equal to eight characters).
A specific model member can be selected with a statement of the form MODELx=(oo,nnn,y,member), where:
x is a numeric 1 through 16.
oo is the identifying record type (OC, OP, or WS).
nnn is the offset in the record type to a value (y) that identifies that this model should be used for this operation.
y is the literal that must be found at the offset in the record type for the member to be used.
member is the member name in EQQPARM that contains the model JCL.
For example, MODEL1=(OP,240,A,MYMODELA) would use the JCL found in member MYMODELA if the value at offset 240 (job class) of the CPOP record for this operation is A. The MODELxx statements should be coded in order, from MODEL1 to MODEL16. The jobname on the JOBCARD must be ###MODEL, which is replaced by the matching operation name. To substitute a value into the
model from the occurrence, operation, or workstation record, use the MODELVAR keyword:
MODELVAR=(#varname,oo,nnn,l,a)
Where:
#varname is a 3 to 16 byte string in the JCL to be replaced, the first byte of which must be #.
oo is the record type (OC, OP, or WS) where the replacement value will be found.
nnn is the offset in the record to the start of the value that will replace #varname in its entirety.
l is the length of the string in the record to be used.
a is the action to take if the value at offset nnn is shorter than l, having been delimited by a blank or non-character value: P pads the string with blanks to the full length, and T (the default) truncates the variable at the blank or non-character value.
For example, MODELVAR=(#PROCNAME,OP,63,7,T) would replace the string #PROCNAME with the value found in the CPOP record starting at offset 63 (the second character of the operation name), for up to seven characters. Up to 16 MODELVAR statements can follow each MODELxx statement. If none are present, the MODELxx member is used unchanged (except for the jobname). Substitutions take place in the order in which the MODELVAR statements are specified in the USERPARM member, and all occurrences of the MODELVAR string within the member are substituted.
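To make the mechanics concrete, here is a hedged sketch of a USERPARM fragment together with a hypothetical model member, MYMODELA (the member content and names are illustrative only):

MODEL1=(OP,240,A,MYMODELA)
MODELVAR=(#PROCNAME,OP,63,7,T)

Member MYMODELA in EQQPARM:

//###MODEL JOB (ACCT),'MODEL JCL',CLASS=A
//STEP1    EXEC PROC=#PROCNAME

For a class A operation named PPAYROLL, the jobname ###MODEL would be replaced with PPAYROLL, and #PROCNAME would be replaced with up to seven characters starting at the second character of the operation name, so STEP1 would execute PROC=PAYROLL.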
These are:
JLINEx=string, where x = 1 to 9 and string identifies the line to be inserted into the JCL. It can be 1 to 72 characters long; the first character is aligned in column 1 of the job stream.
ILINEx=(oo,nnn,string) or ILINEx=ALL, where x associates an insert reason with its JLINE. The line can be inserted into every job stream (ALL), or only into those that match the data string (1 to 10 characters in length) found at offset nnn of record oo (OC, OP, or WS).
MLINEx=(oo,nnn,l,string), where x associates this modification with its JLINE. Up to four MLINEx statements can exist, allowing four separate changes to each JLINE, where:
oo is the record (OC, OP, or WS) where the replacement data can be found.
nnn is the offset in the record that marks the start of the replacement data.
l is the length of the replacement data.
string is the literal in the JCL line that should be replaced.
Note: JLINEx, ILINEx, and MLINEx statements must be placed together as a set. For example:
JLINE1=//* JOBNAME JJJJJJJJ was submitted from Application AAAAA
ILINE1=ALL
MLINE1=(OP,62,8,JJJJJJJJ)
MLINE1=(OC,0,16,AAAAA)
JLINE2=//*..................
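With these statements, every job receives a comment line in which JJJJJJJJ is replaced by the jobname (CPOP record, offset 62, length 8) and AAAAA by the application name (CPOC record, offset 0, length 16). Using the job and application names that appear in Examples A-2 and A-3 later in this appendix, the inserted line would read roughly:

//* JOBNAME ATSTBR14 was submitted from Application ADHOC1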
The last function inserts or overrides values on the job card itself. In its statement:
The value of keyword is CLASS (the only keyword currently supported).
xx is OC, OP, or WS, identifying the record that contains the class value data.
c is the action to be taken: M inserts the value if it is missing, A always inserts or overrides the value, and I overrides the value if it is invalid on the job card.
nnn is the offset to the value (no length is required in this case, because a length of 1 is always assumed).
How to define data store destinations on a per job basis with EQQUX002
In this section, we describe how to define Tivoli Workload Scheduler for z/OS data store destinations on a per job basis using EQQUX002. Note that this
assumes that the data store is already customized and running. For more information about the data store, refer to Chapter 6, Data store considerations on page 121. Our configuration has multiple JES spools and therefore multiple data stores. We gave each of our data stores its own SYSDEST, which is not normal and might be problematic in your environment. However, it is worth noting that the technique we describe also works when a primary workstation fails and jobs are rerouted to an alternate workstation that has a different SYSDEST.
To define Tivoli Workload Scheduler for z/OS destinations on a per job basis using EQQUX002:
1. Decide which jobs should be captured.
2. Install the EQQUX000 and EQQUX002 exits provided at our IBM Redbook download site (see Appendix C, Additional material on page 197). Make sure that the load modules are linked into an authorized load library accessible by the Tivoli Workload Scheduler for z/OS controller started task.
3. Decide on a fetch library indicator that tells exit02 which joblib contains the JCL for an operation. If the indicator is empty, as it will be for other jobs, exit02 knows not to perform fetch processing for the job. We chose to use the end of the Operation text field, and we chose fetch library DD names JCLID001 to JCLID004. Therefore, our indicator is any value from 01 to 04 in the last two characters of the text. This will make more sense when you look at the following example.
4. Decide on a data store trigger. We chose to use the AUTOMATIC OPTIONS, Print options: FORM NUMBER field in the operation definition. Operation text can be used as well. We chose the value TWSDST to indicate that we want to use the data store.
5. Decide how to tell the fetch exits the name of the data store DEST. We chose to use the last two characters of the workstation description field. Our data store destinations are TWSDSTCB and TWSDSTFD, so our WS descriptions end in CB or FD.
6. Update the job operation definitions in Tivoli Workload Scheduler for z/OS. This can be done on a mass scale by using the batch command utility, EQQYCAIN, to unload the appropriate ADs for editing, and the batch loader utility, EQQYLTOP, to reload the updated definitions. We selected a limited number of applications through the dialogs as follows:
a. Specify the fetch exit library DD name in the job Operation text field. From Tivoli Workload Scheduler for z/OS dialog 1.4.3, select the Application with m for modify, then OPER, and add the DD name. See Example A-2 on page 187.
Example: A-2 Updating the job operation definitions
Row  Oper      Duration  Job name
cmd  ws   no.  HH.MM.SS
....
     CPU1 005  00.00.01  ATSTBR14
Operation text: TEST_JOB........JOBLIB01  <== the "01" indicates fetch library 01
b. Specify the data store trigger in the FORM NUMBER field. Enter S.4 in the Row cmd field on the left side of the operation line to reach AUTOMATIC OPTIONS, and add the Tivoli Workload Scheduler for z/OS destination. See Example A-3.
Example: A-3 Updating the job operation definitions
Enter/Change data below:
Application   : ADHOC1
Operation     : CPU1 005
Job name      : ATSTBR14
..........
Print options:
FORM NUMBER ===> TWSDST        <== to add JCL
7. Update the workstation definitions as in Example A-4. This step can be bypassed if you do not have multiple data store instances with different SYSDEST names. Go to Tivoli Workload Scheduler for z/OS dialog 1.1.2, select a CPU workstation to associate with a data store destination with m for modify, and update the description. Note that CB indicates which data store destination is used; in this case, it will be TWSDSTCB.
Example: A-4 Updating the workstation definitions
Enter the command R for resources, A for availability, or M for access method
above, or enter/change data below:
Work station name : CPU1
DESCRIPTION ===> JURASSIC_PARK...SYSDEST=TWSDSTCB  <== "CB"
8. Update the fetch exit USERPARM member with the JLINE, ILINE, and MLINE statements that insert the data store OUTPUT statement, as in Example A-5.

Example: A-5 USERPARM statements that add the data store OUTPUT card
* Specify actual Data Store SYSDEST
JLINE2=//TIVDST02 OUTPUT JESDS=ALL,CLASS=O,DEST=TWSDSTXX
* If CPOP operation FORM NUMBER field equals 'TWSDST'
ILINE2=(OP,82,TWSDST)
* Optional MLINE - if different SYSDEST names are used
* Modify the OUTPUT DESTination to specify tracker dest
MLINE2=(WS,34,2,XX)
9. Update the Tivoli Workload Scheduler for z/OS controller and data store parameters as follows:
a. First, update the controller options as shown in Example A-6.
Example: A-6 Controller options
......................
FLOPTS               /* Options for Fetch Job Log (FL) task needed to access DSTs */
  CTLMEM(XCFCDST1)   /* XCF NAME OF CTRL DST XCF LINK                  */
                     /* MUST BE THE SAME AS CTLMEM IN DSTOPTS (EQQDSTP) */
  DSTGROUP(TWSDSGRP) /* NAME OF DST XCF GROUP                          */
  XCFDEST(XCTWSTCB.XCFDDSCB,XCTWSTFD.XCFDDSFD)
********************************************************************
RCLOPTS              /* Restart and clean up options */
  DSTCLASS(XCTWSTCB:Z,XCTWSTFD:Z,********:Z)  /* Z=PURGE QUEUE */
  DSTDEST(TWS82DST)  /* DEAD LETTER SYSOUT DESTINATION */
......................
b. Next, update the data store options on the CB system as shown in Example A-7.
Example: A-7 Data store options
DSTOPTS
......................
  HOSTCON(XCF)
  DSTGROUP(TWSDSGRP)
  CTLMEM(XCFCDST1)
  DSTMEM(XCFDDST1)
  NWRITER(16)
  SYSDEST(TWSDSTCB)
  WINTERVAL(5)
  STORESTRUCMETHOD(DELAYED)
  STOUNSD(Y)
......................
10. Copy the JCL for jobs to be captured by the data store into the fetch exit libraries.
11. Recycle Tivoli Workload Scheduler for z/OS to pick up the new options and the fetch exits' USERPARMs.
Note: A replan is not needed to implement the above changes.
12. Verify that the EQQUX000-02 message shown in Example A-8 is issued, with no errors, when the controller starts.
Example: A-8 EQQUX000-02 message
IEF403I TWSC - STARTED - TIME=........
EQQUX000-02 Action START
If you have not added MODELn statements, you will also see the message shown in Example A-9.
Example: A-9 EQQUXPM0-01W message
EQQUXPM0-01W NO MODELn statements, Modelling In-Storage bypassed
When you run a job for which you have edited the description and form number as selected, you will see the equivalent of the statements shown in Example A-10 added to the JCL.
Example: A-10 Statements added to the JCL
//* EQQUXJ80I 2 cards added below (Inserted by EQQUX002)
//TIVDST01 OUTPUT JESDS=ALL                                         <== normal joblog
//TIVDST02 OUTPUT JESDS=ALL,CLASS=O,DEST=TWSDSTCB                   <== exit02 added DST
//TIVDST00 OUTPUT JESDS=ALL,CLASS=Z,DEST=TWSDSTXX  INSERTED BY TWS  <== purge Q, DST
Other jobs receive the regular Tivoli Workload Scheduler for z/OS inserted output statements, which cause the data store output to be purged, as shown in Example A-11.
Example: A-11 Statements added to the JCL
//TIVDST00 OUTPUT JESDS=ALL,CLASS=Z,DEST=TWSDSTXX  INSERTED BY TWS  <== purge Q, DST
//TIVDSTAL OUTPUT JESDS=ALL                        INSERTED BY TWS  <== normal joblog
Appendix B.
Gathering statistics
In this appendix, we describe how to use the statistics messages and the job tracking logs to analyze your IBM Tivoli Workload Scheduler for z/OS processing. We cover the following topics:
Gathering statistics
Changing statistics gathering and more
Using the job tracking log data
Gathering statistics
The initialization statement JTOPTS has a STATMSG parameter, which has the following four keywords associated with it:
CPLOCK: The event manager subtask issues messages EQQE004 and EQQE005, which describe how often the different tasks have referenced the current plan data set.
WSATASK: The event manager subtask issues messages EQQE008 and EQQE009, which describe statistical information gathered by the workstation analyzer subtask.
EVENTS: The event manager subtask issues messages EQQE000, EQQE006, and EQQE007, which describe how many events were processed and provide statistics for the different event types.
GENSERV: The general service subtask issues messages EQQG010 to EQQG013, which describe how often the different tasks have been processed and how long the general service queue has grown. Tivoli Workload Scheduler for z/OS issues these messages every 30 minutes if any requests have been processed.
Important: The following indicate possible bottlenecks in job submission:
A long hold time for the WSA
A long WAIT time for the EM, NMM, and GS compared to the WSA
The messages for CPLOCK and EVENTS are issued when the number of events processed by OPC since the previous message is greater than half the value of the BACKUP keyword, or 200 if the BACKUP value is NO. These statistics are generated at the same time, at one of the following frequencies:
Every nn minutes, based on the value of the STATIM(nn) parameter of JTOPTS.
Every nnnn events, as supplied by the EVELIM(nnnn) parameter of JTOPTS.
Every n events, where n is approximately half the value of the BACKUP(nnnn) parameter of JTOPTS; if BACKUP has been set to NO, the default value of 400 is used.
These values can be altered at any time using the F subsys,AAAAAA=nn command, where subsys is the Tivoli Workload Scheduler for z/OS controller name, AAAAAA is STATIM or EVELIM, and nn is the value in minutes or events to be used.
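A minimal sketch of the relevant JTOPTS keywords follows; the values are illustrative only, not recommendations:

JTOPTS STATMSG(CPLOCK,EVENTS,WSATASK,GENSERV)
       STATIM(15)
       EVELIM(1500)
       BACKUP(2000)

With settings like these, the statistics messages would be driven either by the 15-minute timer or by the 1500-event threshold, and the CP backup would be taken every 2000 events.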
Deciphering these messages tells you the length of the event manager queue between CP locks, roughly how many jobs were set to started, and how many locks the workstation analyzer took to achieve that total (see Figure B-1 for a sample of these statistics). Collating this information over time should enable you to decide whether throughput is an issue.
Figure B-1 shows output produced by the event manager after a number of events have been processed. The output shows the CP lock relationships between the NMM, WSA, EM, and GS tasks, and a breakdown of the event types received since the last statistics were output. Cumulative data since the controller was started is also shown in Figure B-1. Refer to IBM Tivoli Workload Scheduler for z/OS Messages and Codes, SC32-1267, for a full explanation of the messages issued by the STATMSG parameter. In a perfect world, the TOTAL and Q1 values on the EQQE000I message line would be the same, with zero for all the other queue lengths. That would mean that every time the event manager was passed an event, it was able to process it immediately. Instead, it is likely that your figures will resemble those in Figure B-1, or reflect even longer queues. The event manager queue grows because it is enqueuing on the current plan. This is because the CP is being held
by other tasks, most frequently by the workstation analyzer. The more the WSA has to do, the longer it holds the CP lock. In this example, the most jobs that the WSA could have scheduled is 656; this equates to the number of type 1 events received. Type 1 events are produced when a job is read onto the JES queue through an internal reader, although not necessarily submitted by Tivoli Workload Scheduler for z/OS. The lowest is 137, the number of CP locks held by the WSA, assuming that at least one job was scheduled for each lock. This is quite a difference. Understanding the workload at the time of day that the statistics cover will help, but the estimate would still be fairly inaccurate unless no tasks are started on any controlled image other than through Tivoli Workload Scheduler for z/OS. To know exactly how many operations were scheduled, you need to review the job tracking logs.
The frequency of each of these can be altered by using the following commands:
F subsys,EVELIM=nnnn    Set a new event limit value (0-9999)
F subsys,STATIM=nn      Set a new time frequency (0-99 minutes)
F subsys,DSPSTA         Display the status of statistics messaging
There is another very powerful trace that shows WSA performance; however, exercise care when using this command, because collecting the data imposes a significant overhead on Tivoli Workload Scheduler for z/OS. It should be used only to identify potential performance problems in the WSA. The F subsys,JCLDBG=ON|OFF command activates and deactivates the single JCL trace. For each job handled by the WSA task, information such as the elapsed time in milliseconds needed to handle the job, retrieve the JCL, access the JS VSAM file, and so on, is shown.
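For example, assuming a controller subsystem named TWSC (the started-task name used in Example A-8), you could bracket only the problem window:

F TWSC,JCLDBG=ON
F TWSC,JCLDBG=OFF

Turn the trace on just before the period you want to examine and off again immediately afterward, to keep the overhead to a minimum.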
Appendix C.
Additional material
This IBM Redbook refers to additional material that can be downloaded from the Internet as described here.
Select the Additional materials and open the directory that corresponds with the Redbook form number, SG246352.
STLIST   standard list
TFS      Temporary File System
TMR      Tivoli Management Region
TSO      time sharing option
USS      UNIX System Services
VLF      Virtual Lookaside Facility
VSAM     Virtual Storage Access Method
VTAM     Virtual Telecommunications Access Method
WSA      workstation analyzer
WLM      Workload Manager
WTO      write to operator
XCF      cross-system coupling facility
zFS      zSeries File System (z/OS Distributed File Service)
ZFS      z/OS File System
Related publications
The publications listed in this section are considered particularly suitable for a more detailed discussion of the topics covered in this Redbook.
IBM Redbooks
For information about ordering these publications, see How to get IBM Redbooks on page 202. Note that some of the documents referenced here may be available in softcopy only.
ABCs of z/OS System Programming Volume 9, SG24-6989
End-to-End Scheduling with IBM Tivoli Workload Scheduler Version 8.2, SG24-6624
End-to-End Scheduling with Tivoli Workload Scheduler 8.1, SG24-6022
Maximizing Your OPC/ESA Throughput, SG24-2130
UNIX System Services z/OS Version 1 Release 4 Implementation, SG24-7035
Other publications
These publications are also relevant as further information sources:
IBM Tivoli Workload Scheduler for z/OS Diagnosis Guide and Reference, SC32-1261
IBM Tivoli Workload Scheduler for z/OS: Managing the Workload, SC32-1263
IBM Tivoli Workload Scheduler for z/OS Messages and Codes, SC32-1267
IBM Tivoli Workload Scheduler for z/OS Program Interfaces, SC32-1266
System/390 MVS Parallel Sysplex Continuous Availability Presentation Guide, SG24-4502
z/OS V1R5.0 Distributed File Service zSeries File System Administration, SC24-5989
z/OS V1R5.0 MVS Planning: Workload Management, SA22-7602
z/OS V1R5.0 MVS Programming: Workload Management Services, SA22-7619
z/OS MVS System Codes, SA22-7626
z/OS V1R5.0 UNIX System Services Command Reference, SA22-7802
z/OS UNIX System Services Messages and Codes, SA22-7807
Online resources
These Web sites and URLs are also relevant as further information sources:
IBM Web site that provides z/OS UNIX performance information:
http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1tun.html
Index

A
Accessing z/OS UNIX 65; aggregate 77; APAR database 122; APAR IY58566 44; APAR IY59076 44; APAR OW52135 148; APAR PQ77970 163; APAR PQ90090 160; APAR PQ90369 159; APAR PQ92466 159; API 62; at shell function 68; AWSEDW026E error 148

B
backup domain manager 40; balancing system resources 103; batch 68; batch loader job 32; batch prioritization process 98; BATCHOPT parameters 156; bm look 42, 50; bm read 50; Book Manager 64; Bourne Again shell 57; Bourne shell 57, 63; BPXAS 65; BPXBATCH 66; BPXOINIT 64; BUFFERSPACE 36

C
C shell 63; card images 185; ccg_basiclogger 44; ccg_filehandler 44; CI and CA splits 116; clock time 4; Colony address spaces 65; command files 57; compress Symphony 42; compressed form 43; compression 43-44; continuous processing 41; control area split 21; control interval split 21; controller 34; controller task 37; copytree utility 86; correct priority 98; CP batch job 174; CP EXTEND 150; CP EXTEND job 32, 34; CP REPLAN 150; CPOC 180; CPUREC 28; CPUSERVER parameter 38; CPUTCPIP 151; CPUTZ value 161; critical path 98-99; cron 68; C-shell 57; current plan 28

D
daemons 59; daily plan build 2; daily plan EXTEND job 101; DASD tuning 47; data definition 116; data set cleanup 124; data set concatenation 112; data store (initiators 125; number of writers and writer output files 125; performance evaluation 124; Storestruc method 125; test description 124; Wait interval 125; what is 122; when to use 124); database objects 135; deadline times 101; decompression 44; default JSC list 135; defining BUFFERSPACE 48; designing batch network 109; distributed network 40; DOC APAR PQ77535 145; domain 41; domain manager 151; DOMREC 28; DORMANCY 119; Dub 63; dummy deadline operations 101; dummy jobs 112; dump with LE information 170; Duplicate UID 147; dynamic operation area 18

E
E2E server task 37; E2E server USERID not eqqUID 148; E2EMERGE.log 174; Earliest latest start time 23; Earliest start time 102; empty EQQSCLIB 47; end-to-end environment 2; end-to-end scheduling 43; EQQAPPLE 140; eqqBINDIR directory 140; EQQEVPGM 13; eqqGID 150; EQQISMKD job 140; EQQJBLIB concatenation 180; EQQJCLIB concatenation 115; EQQMKDIR REXX exec 140; EQQPCS05 146; EQQSCLIB 28; eqqUID 146; EQQUSINx subroutines 13; EQQUX000 117, 180; EQQUX000 / 002 combination 22; EQQUX000 and EQQUX002 (implementation 179; installation 179); EQQUX002 117; EQQW086E 163; errnojr 174; etc/syslogd.conf 172; event manager 13; event-creation order 13; executables 60; EXTEND 174; external dependencies 104, 110; external dependency 110

F
fastest possible volumes 117; fault ratio 88; fault tolerance 40, 45-46, 157; fetch library indicator 186; fetch the JCL 181; file system 57, 161; find-a-winner algorithm 23; first-in first-out basis 23; FLOPTS 122

G
garbage collected heap 137; garbage collector 134; general service 113; general service subtask 14; globalopts 157; goalmode 69; good scheduling practices 111; good throughput 19

H
high-priority batch 35

I
IBM Tivoli Workload Scheduler for z/OS xi, 2, 4, 12, 14, 17-18, 20, 28, 45, 122 (HFS 35; sequential files 35; temporary batch files 35; VSAM files 35); IEFUSI exit 34; IMS BMP 103; INFO APAR II13859 163; INFO APAR II13900 161; initial recovery action 109; input arrival coding 108; Input arrival time 103; interactive environment 57; internal dependency 110; IOEAGFMT format utility 85

J
JCCOPTS statement 14; JCL download 163; JCL fetch 180; JCL fetch time 113; JCL fetch times 118; JCL libraries 115; JCL retrieval 178; JCL VSAM repository 112; JCLDEFAULT 180; JCLOVERRIDE 180; JES2 checkpoint 124; job dependencies 40; job output elements 122; Job Scheduling Console (buffer size 134; common plan lists 135; creating query lists 135; factors affecting performance 132; garbage collector 134; instance 134; Java tuning 137; latest fixes 132; open editors 134; open windows 134; propagating to JSC users 135; refresh rate 132; resource requirements 132; Startup script 137); Job Stream Editor 134; job tracking log 13; jobcard 115, 183; joblog retrieval 163; jobname 182-183; JOBREC 124; JS backups 15; JS files 112; JS VSAM file 112; JSC 1.2 startup script 137; JSC 1.3 startup script 137; JSC launcher 137

K
kernel 56

L
Latest start time 99, 101 (calculation 99; extra uses 101; maintaining 101); Limit of feedback and smoothing algorithms 101; Linux 54; LLA REFRESH 116; LLA UPDATE 116; localopts 30-32, 43, 151; log file cache 87; Logical File System 57; long running task 115; long term plan 48; LOOKAT message tool 173

M
macros 179; mailbox message 42; mailman 42; mailman cache 42; mailman failure 42; MAILMAN server 38; mailman server 38; mandatory priority field 99; MAXFILEPROC 165; MAXJSFILE 117; MAXPROCSYS 165; metadata cache 82, 86; metadata update 78; Microsoft Windows 59; missed feedback 101; mm cache enable 31; mm cache mailbox 42; mm cache size 31, 42; MODELPREFIX 182; MODELSUFFIX 182; MODELVAR keyword 183; Monitor III data gatherer 75; monitoring job 102; mozart directory 157; msg files 155; multiple DASD volumes 35; multiple domain structure 40; multiple domains 41; multiple instances of a job stream 104; MVS data sets 86; MVS PPT 34

N
netman log 148; NETMAN.log 174; network outage 158; new current plan 48; nm ipvalidate=full 152; nm ipvalidate=none 153; NM PORT 151; normal mode manager 159; Novell Network Services 67; Number of PDMs 38

O
OGET 91; OPCMASTER 157; operation definitions 186; operations duration time 102; operator instructions 101; OPUT 91; OutofMemoryError 138; output translator 29; overhead 44

P
parallel servers 103; parameter files 59; pax command 86; PDM 38; PDS 178; Performance (acceptable 4; factors affecting 2; how it is measured? 4; Job Scheduling Console 5; what is it? 2); Performance improvements (Job Scheduling Console 132; mm cache size 43; sync level 43; wr enable compression 43); Physical File System 58; PID 64; PIF 178; PIF program 116; PIF requests 113; Plan 134; plan 132, 134-135; plan instances 135; preferences.xml 136; primary domain manager 38; prioritization 102; prioritizing batch flows 98; Priority 8 through 1 23; Priority 9 23; process 56; production topology 157; program interface 14, 178; programming language 57; propagating changes in JSC 135; PTF 37

Q
QUEUELEN parameter 18

R
RACF 34, 37, 64-66; RAMAC RVA 35; RCLOPTS 122; read access only 72; Read-only cloning 78; Redbooks Web site 202 (Contact us xiv); region size 34; release command 13; REPLAN 174; Resource Measurement Facility 64; Restart and cleanup action 23; restart/recovery options 124; restarting an E2E FTA 167; REXX 91; REXX language 63; rlogin 59, 65; root authority 146; root subdirectory 60

S
SAP R/3 67; SCP 29, 35; script library 28; server JESMSGLG (CP Batch USERID not in eqqGID 150); service class 69; set a benchmark 2; SHAREOPTION 48; shareoption 48; SHARK DASD 35; shell 57; shell script 57; Shortest estimated duration 23; Sinfold 29; Sinfonia 43; Sinfonia file 43; single domain 41; single domain structure 40; small Symphony file 112; SMF 63; SMF record types 75; SMP 140, 142; SMP/E APPLY job 140; special resource 103; special resource database 102; special resource planning horizon 102; special resources 102; specific libraries 180; spool capacity 126; spool messages 172; spool space 122; staging program 22; stand-alone cleanup job 13; Start time 104; started tasks 59; started-task JCL 13; STC class 35; stdlist 44, 147, 151; steplib 115; Storestruc method 125; Superuser 60; suspend queue 13; swappable 34; Symnew 29, 156; Symphony 38; Symphony current plan 28; Symphony file size 45; SYMPHONY RENEW 174; Symphony Renew 150; SymUSER 29; sync level 31; sync level=low 37; synch event 13; SYSMDUMP data set name 169; Sysplex Multi-Access Spool 124; System Display and Search Facility 64; system initiators 103

T
T C-shell 57; TCP/IP task 159; tcsh shell 63; Telnet 59, 65; test topology 157; The Open Group 54; time dependency 103; Time Restrictions window 104; TMR server 132; tomaster.msg 155; TOPOLOGY 28; topology definition 32; TOPOLOGY parameter 38; translator process 170; trial plans 102; Troubleshooting (Changing the OPCMASTER 157; CHECKSUBSYS(YES) 156; CP batch USERID not in eqqGID 150; CPUTZ defaults to UTC 161; data store 170; Delay in SCP processing 158; DIAGNOSE statements 156; Domain Manager file system full 161; E2E PORTNUMBER and CPUTCPIP 151; E2E server started before TCP/IP initialized 159; EQQDDDEF job 142; EQQISMKD job 140; EQQPCS05 job 143; EQQPH35E message 145; EQQPT52E 155; installation 140; Jobs run at wrong time 160; link problems 158; MAXFILEPROC value set too low 164; MAXPROCSYS value set too low 165; MAXUIDS value set too low 166; message EQQTT11E 154; No valid Symphony file exists 158; OMVS limit problems 163; other E2E problems 158; root authority 146; root ID 146; Security issues with E2E 145; SERVOPTS PORTNUMBER 154; Symphony distribution 155; Symphony switch 155; TOPOLOGY PORTNUMBER 153; useful E2E information 167; wrong LPAR 156); TSO commands 13; TSO ISHELL 148; TSO OMVS 65; TSO/E users 62; Tuning results 46; TWSCCLog.properties 44; TWSMERGE.log 174

U
undub 63; undubbed 63; Unified File System 55; UNIX (Daemons 59; execute 59; file system 57; functional comparison with MVS 60; functionality 55; GID 60; Kernel 56; overview 54; parameter files 59; permissions 59; processes 56; read 59; root 60; shell 57; signals 56; UIDs 60; virtual memory 57; what people do not like about it 55; what people like about it 54; write 59); UNIX System Services (accessing 65; Address spaces 64; dub 63; file system type 73; file systems 76; NFS 77; TFS 77; fundamentals 61; Further tuning tips 73; Interaction with elements and features of z/OS 63; overview 54; performance tuning 66; tcsh shell 63; undub 63; What people do not like about it 66; What people like about it 66; WLM relationship 67; z/OS shell 63); unlink 157; User file cache 86; user ID 150; USERPARM 180; USRREC 28; USS 28; USS WRKDIR 174

V
very hot batch service class 103; Virtual lookaside facility 67, 70; VSAM 178

W
Wait interval 125; Workload Manager integration 103; workstation analyzer 101; workstation setup 118; wr enable compression 31; writer output files 125; writer tasks 125; writers 125; WTO message 13

X
XCF 14; X-Windows 57

Z
z/OS shell 63; z/OS tracker agents 13; z/OS UNIX reason codes 174; z/OS UNIX shell performance 73; zFS File System (attaching an aggregate 85; configuration file 86; HFS comparison 91; installing 84); zFSadm command 85
Back cover
Customizing IBM Tivoli Workload Scheduler for z/OS V8.2 to Improve Performance
Optimize performance in large Tivoli Workload Scheduler for z/OS environments Pure mainframe and end-to-end scheduling scenarios Best practices based on real-life experience
Scheduling is generally considered the nucleus of the data center, because the orderly, reliable sequencing and management of process execution is an essential part of IT management. IBM Tivoli Workload Scheduler for z/OS is the strategic IBM product used in many large-to-midsized customer environments, responsible for scheduling critical batch applications. The performance of Tivoli Workload Scheduler for z/OS is therefore one of the important factors affecting overall satisfaction with the IT services of these companies. This IBM Redbook covers the techniques that can be used to improve the performance of Tivoli Workload Scheduler for z/OS (including end-to-end scheduling). Many factors can affect the performance of any subsystem. In this book, we confine ourselves to those that are internal to Tivoli Workload Scheduler, or that can be easily verified and modified, and that are likely to apply to the majority of Tivoli Workload Scheduler customer sites. Although this book is aimed at very large installations with a batch load of 100,000 or more jobs per day, it is also relevant to installations with a smaller batch workload that are suffering from a shrinking batch window, trying to maximize throughput on existing hardware, or both.
BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE IBM Redbooks are developed by the IBM International Technical Support Organization. Experts from IBM, Customers and Partners from around the world create timely technical information based on realistic scenarios. Specific recommendations are provided to help you implement IT solutions more effectively in your environment.