iTERA HA: User Guide
Version 6.0
September 12, 2011
Lock Analysis ... 56
Object Sync Filter ... 57
Omit Filter Types ... 60
How to Create an Object Filter Entry ... 60
Filter Objects From User Journals ... 61
Sync Defined Non-Mirrored Objects By Library (4.31 - SYNCNMLIB) ... 62
Sync Non-Mirrored Objects (4.32 - SYNCOBJ) ... 62
Import Information From SAT Lite ... 63
Library Replication ... 64
Work with Libraries ... 64
Library Procedures ... 65
How to determine whether to use the Large Library Syncing Procedures ... 66
Steps to perform prior to a sync or resync ... 66
Specify a new library for mirroring ... 67
Network Sync a Library ... 68
Tape Sync a Library ... 69
Steps to perform after a sync or resync ... 70
Sync a Library That Needs Objects Changed to a Different Journal Than the One Assigned to the Library ... 71
Remove a Library From Being Mirrored by iTERA HA ... 72
Change the Mirror Journal Assigned to the Library ... 73
Change the Journal to Which a Library and All Its Objects is Being Journaled ... 73
Replicate a Library to an ASP Other Than ASP 1 ... 74
Object Procedures ... 76
How to Change the Max Network Sync Size Parameter ... 76
Resync an Object (or Objects) Via the Network ... 77
How to Determine Whether to Use the Large Object Sync Procedure ... 77
Large Object Resync Via Network or Tape ... 79
Change the Journal to Which an Object in a Library is Being Journaled ... 83
Change the Journal Image Being Used by the Object ... 83
System Library AutoSync (4.10) ... 84
Purpose of this guide
This guide is designed to provide an overview of the essential elements for
understanding iTERA HA. It begins with basic terminology definitions, an
introduction and overview to iTERA HA, and then provides step-by-step
instructions essential for configuring the product and initiating replication. It
continues with instruction on implementing the required processes that are
critical in maintaining, monitoring, and auditing iTERA HA. Additional
chapters include instructions for various procedures, such as role swaps, virtual
role swaps, and obtaining and applying iTERA HA PTFs. Consult the iTERA
HA v6.0 Reference Guide for descriptions of screen options, functions and
processes available within iTERA HA.
This guide should be used both as a tool for initial product training and as a
resource for monitoring and maintenance of your iTERA HA environment.
Contact Information
For information about obtaining training on iTERA HA, or if you need help
with any of our products, please contact Vision Solutions CustomerCare at
If you would like additional information about any of Vision Solutions’ other
data management products, please contact our Customer Sales department.
Professional Services
The following items are considered consulting and are offered as a billable
service:
• Role Swap
• IP address changes
• IP connection issues
Services are scheduled based on availability. Allow at least two weeks for your
request to be scheduled.
Term Description
Journaling Basics
The following diagram depicts the basic concepts of IBM journaling. For more
information on journaling, see
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/topic/rzaki/rzaki.pdf.
IMPORTANT
Do not change the defined attributes of these profiles or iTERA HA will
cease to work properly!
Another profile that was automatically created for you is HAxxSUP. HAxxSUP
is a support profile which will allow access to the product using a language
other than the one defined for the CRG in order to facilitate assistance by
CustomerCare or other software service providers. The default language used
by the support profile is English, but it can be changed to any available
language. The support profile is automatically disabled upon creation and can
be enabled only when needed. The password for the profile is defined as
*FRCCHG, which will require a password change the first time it is used. To
assign a language to a profile, refer to the instructions in the iTERA HA v6.0
Advanced Features guide.
IMPORTANT
Failure to comply with these requirements will cause iTERA HA to
work incorrectly!
IMPORTANT
If you need to change the iTERA HA password, do so ONLY within
iTERA HA.
If the subsystems are running, you can use the following procedure on the
primary system to change the password. This procedure will make changes
on the primary and the target and in WRKUSRPRF.
– *ALLOBJ
– *AUDIT
– *IOSYSCFG
– *JOBCTL
– *SAVSYS
– *SECADM
– *SERVICE
– *SPLCTL
6. As a safeguard, you may want to set the initial menu for the
ITERAOWNER profile to *SIGNOFF after your installation of iTERA
HA is complete.
Also at installation, a subsystem was created both on the primary and the target
machines. This subsystem is named E2xxSBS (where xx is the CRG code as
indicated above).
Library Description    Library Name (New iTERA HA Installations)
IMPORTANT
These libraries should not be replicated.
NOTE
Systems that upgraded from an earlier version of the product may use
ITE2 instead of ITHA. This guide and all other documentation will
refer only to ITHA.
Accessing iTERA HA
If you want a user to be able to access iTERA HA without using the iTERA
HA profile, do the following:
1. Add the following libraries to your library list (in this order):
Library Name
ITHAxx
ITHA
ITXP43
The above illustration shows two environments. On the left is the production
or primary environment containing the objects to be mirrored. On the right is
the target (also referred to as a backup node) environment, where the mirror
copy of the objects is kept synchronized and ready to be used in the event of a
role swap or failover (that is, when the backup node becomes the production node).
While the journaling process records the data changes to journal receivers on
the production node, the remote journaling process also sends a copy of the
data changes (journal entries) to the target node via the network using TCP/IP.
(This is done by the operating system, independently of iTERA HA.) The
journal receivers on the primary node contain the actual data changes to the
mirrored objects, so that if for some reason communication between the
systems is interrupted, the i5/OS knows exactly what data was sent to the
target (and what data was not) by automatically restarting remote journaling at
Once the journal entries have been received by the receive component of the
remote journal on the target node, they are written to a journal receiver on the
target node.
The iTERA HA apply job then applies the data changes received by the journal
receiver on the target node to the appropriate objects on the target node. As
this is done, the file is kept synchronized with the production node.
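The flow described above (local journal on the primary, remote journal transport, receiver on the target, apply job) can be pictured with a small toy model. This is only an illustrative sketch: the class and method names are hypothetical, and real remote journaling is performed by the operating system, not by code like this. The sketch assumes that entries carry increasing sequence numbers and that anything not yet sent stays in the primary receiver until the link returns.

```python
from dataclasses import dataclass, field


@dataclass
class JournalEntry:
    sequence: int  # position of the entry in the journal (1-based)
    obj: str       # name of the changed object
    data: str      # the recorded data change


@dataclass
class ToyReplication:
    """Hypothetical toy model of the journaling flow, for illustration only."""
    primary_receiver: list = field(default_factory=list)
    target_receiver: list = field(default_factory=list)
    target_objects: dict = field(default_factory=dict)
    link_up: bool = True
    applied_through: int = 0  # sequence number of the last applied entry

    def journal_change(self, obj, data):
        # The change is always recorded on the primary node first.
        entry = JournalEntry(len(self.primary_receiver) + 1, obj, data)
        self.primary_receiver.append(entry)
        self.send_pending()

    def send_pending(self):
        # Send only the entries the target has not received yet.
        if not self.link_up:
            return
        already_sent = len(self.target_receiver)
        self.target_receiver.extend(self.primary_receiver[already_sent:])

    def apply(self):
        # The apply job writes received changes to the target objects.
        for entry in self.target_receiver[self.applied_through:]:
            self.target_objects[entry.obj] = entry.data
            self.applied_through = entry.sequence
```

In this model, a change journaled while `link_up` is False stays on the primary until `send_pending()` is called again, which mirrors the idea that the primary receivers hold the changes until communication is restored.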
When transactions are applied to objects on the target node, local journals on
the node record the changes to the mirrored objects, writing the transactions to
separate journal receivers, which are regularly deleted by iTERA HA. This
additional journaling process on the backup node is done to ensure that
journaling is active on that system and is ready to go when a Role Swap (or
Failover) is executed. This journaling will be needed upon execution of the
Role Swap (or Failover) to send changes to the former production system (the
new backup node) so that the mirroring process continues to take place. In
addition, there are ZZ Audits monitoring the local journal receivers in order to
ensure data integrity.
Remote journals are also associated with these contingency journals. However,
they remain inactive until a role swap occurs, at which time they will
automatically become activated. This ensures the entire mirroring process can
be quickly re-established between the new production node (formerly the
backup node), and the new backup node (formerly the production node).
Finally, the Object Monitor job processes the IBM system audit journal
QAUDJRN looking for object level changes (changes, deletes, creates,
renames, etc.) in mirrored libraries. If any of these conditions exist, then those
changes are replicated to the target node by iTERA HA.
iTERA HA PTFs
All updates and enhancements for iTERA HA v6.0 are done with IBM-style
PTFs. To successfully update iTERA HA, you must acquire the PTFs from
Vision Solutions’ FTP site, then load and apply them. See “How to Get and
Apply iTERA PTFs” on page 235 for complete instructions on PTF retrieval,
including alternative methods of obtaining PTFs, etc.
Vision Solutions releases a PTF Service Pack about once a month. Customers
subscribed to product notifications on the Vision Solutions Support Central
site will be notified by e-mail when a Service Pack is available.
System Roles
Node Description
NOTE
The term “target” is used to denote either a backup or replicate
node.
Menu System
The Main Menu screen has been organized into sub-categories based on
iTERA HA processes with similar functionality.
Environment Management
• Time and Date Stamps – Indicates when the menu option was last
accessed.
• iTERA HA commands – Any iTERA HA commands that can be used to
perform the same function as a menu option will be listed on the menu on
the right side of the description of the menu option in all capital letters.
Available menu options are based on the role the machine is assigned. When
iTERA HA was set up and configured in your environment, different roles
were assigned to the machines based on whether the machine was a primary
node or a target node (also referred to as production or source and backup).
IMPORTANT
Menu options differ based on the role assigned to the system.
iTERA HA Subsystems
You will have one subsystem per CRG on both the primary machine and the
target machine. If you have multiple CRGs defined, then you will have more
than one subsystem. The subsystem names are based on the two-character
CRG code designated during installation (e.g., E2A1SBS). All iTERA HA jobs
will run in the subsystem (unless otherwise noted during training).
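Because several names in this guide are derived from the two-character CRG code, a small helper can make the pattern explicit. The sketch below is an illustration, not part of iTERA HA: the function name is hypothetical, and it assumes only the patterns stated in this guide (E2xxSBS for the subsystem, HAxxADMIN for the administrative profile, HAxxSUP for the support profile).

```python
def crg_names(crg_code: str) -> dict:
    """Derive the names this guide associates with a CRG code (hypothetical helper)."""
    code = crg_code.upper()
    if len(code) != 2:
        raise ValueError("CRG code must be exactly two characters")
    return {
        "subsystem": f"E2{code}SBS",        # e.g. E2A1SBS
        "admin_profile": f"HA{code}ADMIN",  # e.g. HAB1ADMIN
        "support_profile": f"HA{code}SUP",
    }
```

For example, `crg_names("A1")["subsystem"]` gives `E2A1SBS`, matching the example above.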
IMPORTANT
When starting the subsystems, always start the target machine first,
then the primary machine. When ending the subsystems, always end
the primary machine first, then the target. Although iTERA HA will
be able to recover normal subsystem operation if this sequence is not
followed, it is recommended that you follow this procedure for
optimal streamlined performance.
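The recommended sequence can be summarized in a few lines of code. This is a minimal sketch with hypothetical names; it only orders node names and does not start or end anything on a real system.

```python
def subsystem_order(action: str, primary: str, targets: list) -> list:
    """Return node names in the recommended order for starting or ending subsystems."""
    if action == "start":
        # Start every target machine first, then the primary machine.
        return list(targets) + [primary]
    if action == "end":
        # End the primary machine first, then the targets.
        return [primary] + list(targets)
    raise ValueError("action must be 'start' or 'end'")
```

An operator script could loop over the returned list and run the start or end step on each node in turn.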
Fast Path
• Type 2.12 from any command line within the iTERA HA menu system
and it will start the subsystems for that machine.
IMPORTANT
This will need to be added to the startup procedure that runs after an
IPL (Initial Program Load).
Use the appropriate administrative profile for the CRG. For example,
if the CRG code is B1, then the admin profile for that CRG would
be HAB1ADMIN.
The command E2ENDSBS is used to end the iTERA HA subsystems and all
jobs running under that subsystem.
Fast Path
Type 2.13 from any command line within the iTERA HA menu system and it
will end the subsystems for that machine.
In order for the commands to work, the following libraries must be in your
library list (in this order): ITHAxx, ITHA, ITXP43.
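The library list requirement can be checked mechanically. The following sketch uses hypothetical function names and makes two assumptions that this guide does not state outright: that the xx in ITHAxx is the CRG code, and that other libraries may appear between the required ones as long as the relative order holds.

```python
def required_libraries(crg_code: str) -> list:
    """Libraries that must be present, in this order (xx assumed to be the CRG code)."""
    return [f"ITHA{crg_code.upper()}", "ITHA", "ITXP43"]


def has_required_order(library_list: list, crg_code: str) -> bool:
    """True if all required libraries are present and appear in the required order."""
    positions = []
    for lib in required_libraries(crg_code):
        if lib not in library_list:
            return False
        positions.append(library_list.index(lib))
    # Positions are distinct, so sorted order means the libraries are in sequence.
    return positions == sorted(positions)
```

A library list such as `["QGPL", "ITHAA1", "ITHA", "ITXP43"]` passes for CRG code A1, while a list with ITHA ahead of ITHAA1 does not.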
IMPORTANT
You must start the subsystems on all nodes. Issuing commands or
menu options from just one node will NOT bring up all nodes.
• From the Processing Menu, select option 11, Work with Subsystem Jobs.
Fast Path
• Type 2.11 from any command line within the iTERA HA menu system
and it will display the subsystems and all iTERA HA jobs running for that
machine.
NOTE
Different jobs will be running in the subsystems based on the role
that machine is actively performing.
The first two characters of most iTERA HA job names are the two-character
CRG code (in this case, A1; most basic iTERA HA installations use A1).
• In a MSGW status.
Subsystem Jobs
In the ITHAxx library on both nodes is a file called E2PJOBS. This file holds a
list of jobs that should be running in your subsystems based on the role that
machine is performing. When the subsystems start, those jobs are started
automatically.
NOTE
The following is not a comprehensive list of all jobs. Additional job
information is available in an appendix of the iTERA HA Reference
Guide.
Primary Jobs
The following jobs should always be running on the primary machine.
Additional jobs may be running in your system based on whether certain
features, such as Spool file replication, Configuration Replication, etc., have
been enabled.
Target Jobs
The following jobs should be running continually on the target machines.
Additional jobs may be running in your subsystem based on whether certain
features, such as Spool file Replication and IFS, have been enabled.
ZU_xxxx    Apply jobs for user journals. This job only runs if user
           journals (journals that have been set up by third-party
           software vendors or created outside of iTERA HA)
           have been incorporated into iTERA HA.
IMPORTANT
You must end and restart the iTERA HA subsystems after making
changes on this screen.
Both primary and target jobs are displayed. The System column displays the
node to which the job applies.
The F8=Set Delay key displays the Job Monitor Control screen. Here you can
adjust how often the jobs for the Job Monitor, System Monitor, and Auto Sync
run.
This chapter discusses the journaling technology used in iTERA HA, which
includes local journals, journal receivers, remote journals, and apply process.
NOTE
Doing this step on all systems prior to creating the first journal will
save steps.
2. Review the screen and the parameter descriptions in the table below and
set the values as indicated.
The fields described below are the only fields on this screen that apply to
iTERA HA.
Field Description
Journal Change Frequency    Indicates (in minutes) how often journal receivers will be changed.
This screen displays all journals being used to replicate objects in iTERA HA.
It is primarily used for changing the settings for existing journals.
This option accesses the i5/OS Work with Job Schedule Entries screen, which
displays the scheduled job that is created by the system to manage the deletion
of journal receivers.
Local Journals
Journal Types
Mirror journals are typically created within iTERA HA for the purpose of
mirroring objects (although some user journals may be converted to mirror
journals; see below). When a mirror journal is assigned to a library, all
journaling functions (ending and starting) will be controlled by iTERA HA
processes. For example, adding a filter for an object will end journaling on the
object if it is being journaled by a mirror journal. Only mirror journals are
eligible to be assigned as a default journal.
User journals are journals that have been created outside of iTERA HA, either
for in-house use or for use by third-party applications. The objects are being
journaled for a purpose other than HA. User journals may be imported (or
defined) in iTERA HA in order to mirror the objects. Most journal
maintenance functions for user journals are restricted in iTERA HA.
Journaling is not automatically started or ended for objects in user journals. A
user-created user journal may be converted to a mirror journal so that it may be
fully managed by iTERA HA. User journals are ineligible to be assigned as a
default journal. User journals created by third-party applications should not be
converted to mirror journals. Doing so may cause the application to
malfunction.
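The differences between the two journal types can be condensed into a few rules. The sketch below is an illustrative model with hypothetical class and function names; it encodes only the rules stated in this section and does not reflect how iTERA HA stores journal definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Journal:
    name: str
    kind: str                  # "mirror" or "user"
    third_party: bool = False  # True if created by a third-party application


def can_be_default(journal: Journal) -> bool:
    # Only mirror journals are eligible to be assigned as a default journal.
    return journal.kind == "mirror"


def journaling_managed_by_itera(journal: Journal) -> bool:
    # Starting and ending journaling is controlled only for mirror journals.
    return journal.kind == "mirror"


def safe_to_convert_to_mirror(journal: Journal) -> bool:
    # User-created user journals may be converted; third-party ones should not be.
    return journal.kind == "user" and not journal.third_party
```

For instance, a user journal flagged as `third_party=True` fails `safe_to_convert_to_mirror`, reflecting the warning that converting it may break the owning application.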
This screen allows you to create and manage the local journals that are used by
iTERA HA for the mirroring process.
When accessed on the primary node, you are viewing the local journals that
journal the objects selected for mirroring.
When accessed on the target node, you are viewing the local journals on that
system. These local journals are active, but are not used in replication unless a
role swap or failover is executed. They are also read by the ZZ Audits, which
check for unauthorized updates performed on the target node.
Any changes to journals that affect both the primary and target nodes should
always be executed from 3.1 on the primary node.
Typically, the only time you need to view this menu from the backup node is
to ensure that everything is properly defined prior to a role swap.
IMPORTANT
Verify that the subsystems are active on all nodes (E2SBS). Use 2.11 to
view them, and 2.12 or E2STRSBS to start them.
NOTE
If you have not already set the JRM default parameters in 3.31, do so
before creating the first journal. This must be done on both the
primary machine and the target machine. See “JRM Default
Parameters (3.31)” on page 29 for more information.
2. Select F6=Create Mirror Journal. Set the New Journal Type (use option
D=Data for the journals that will be used for library replication), then
press Enter.
3. Enter the library that contains the user journal, the user journal name and
a description, then press Enter.
5. If the Mirror Status is not ON, select 32=Toggle Mirror Status, then press
Enter.
6. Select Work with Apply Jobs (3.4, target). Ensure the Job Sts field for the
apply job does not display dashes. If it does, review the steps for
troubleshooting the apply job. Contact CustomerCare if unable to resolve.
IMPORTANT
QAUDJRN is not supported for data replication. It should never be
added to the Local Journal Maintenance screen, nor should it ever
have an attached remote journal or apply job. (QAUDJRN is read
and processed through the OBJMON jobs. Adding it to the Local
Journal Maintenance screen will cause conflicts with these jobs.)
Journal Receivers
• 1.1, F16, F7 – Journal Receivers can be accessed via the System Monitor.
• When you access the journal receivers option from the primary node, you
will initially view the status of the journal receivers that are attached to the
local journals on that system.
• When you access this option from the target node, you will initially view
the status of the journal receivers that are associated with the remote
journals.
Primary Node
F7 allows you to toggle between the Local and Remote Receivers on the
primary node.
NOTE
There are no attached receivers because a role swap has never been
executed in this environment.
Target Node
IMPORTANT
Always be aware whether you are viewing Local Receivers or Remote
Receivers.
The F7 key allows you to toggle from the view displaying the receivers attached
to the remote journals on this node to the receivers attached to the local
journals on this node.
This screen allows you to view and manage the remote journal components
that reside on the system you are on.
• When you initially access this option from the primary node, the status of
the “send” component of the remote journals is displayed.
• When you initially access this option from the target node, the status of
the “receive” component of the remote journals is displayed.
NOTE
Pay close attention to the direction of the remote journals and to
whether you are looking at the Send or the Receive portion of the
remote journal.
The F7 key is used to reset the starting point which includes retrieving the
current receivers being used by the journal, as well as the starting sequence
number. By resetting the start point, you will be able to determine how many
entries have been handled by each receiver.
This procedure is commonly used when there are multiple objects to be moved
to a different journal. To transfer a single object to a different journal the Work
with Mirrored Objects screen (4.21, option 7=Transfer Journal) may be used
instead.
NOTE
An object cannot be moved from a mirror journal to a user journal or
from a user journal to a mirror journal.
1. Press the F18=Import Object Change Requests key to select the library
and/or objects to load into the Journal Change Requests screen.
3. Once the journal has been selected, the Journal Change Requests screen is
again displayed. A separate request is generated for each of the objects
within the library being changed. Select the appropriate journal image
option (the default, after images only, or both before and after images),
then press Enter. The Journal Change Requests screen is displayed.
5. Once the request is processed, the current and requested journal column
data will display yellow text. Press F7 to toggle between all, pending, and
completed change requests.
NOTE
Objects that are locked at the time the F8=Submit Change Processor
is executed will remain in a Pending status and iTERA HA will not
automatically reattempt the journal change request. Manually
resubmit the change request by pressing F8=Submit Change
Processor when the object is no longer locked.
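The behavior described in the note above (locked objects stay Pending until the processor is submitted again) can be modeled in a few lines. This is a simplified sketch with hypothetical names and data shapes; the real change processor is started with F8 from the Journal Change Requests screen.

```python
def submit_change_processor(requests: list, locked_objects: set) -> list:
    """Process pending journal change requests once; return objects still pending.

    Each request is a dict with "object" and "status" keys (hypothetical shape).
    Requests for locked objects are skipped and are NOT retried automatically.
    """
    for request in requests:
        if request["status"] != "Pending":
            continue
        if request["object"] in locked_objects:
            continue  # stays Pending until the processor is submitted again
        request["status"] = "Completed"
    return [r["object"] for r in requests if r["status"] == "Pending"]
```

Calling the function a second time, after the lock set has been emptied, completes the remaining requests, which corresponds to pressing F8=Submit Change Processor again.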
7. Press F20=Clear Listed Requests to remove the objects from the display.
• Fast path: Enter 3.4 on any iTERA HA command line on the target
machine.
Once data has been journaled on the primary node and sent to the target node
via remote journaling, the data is ‘applied’ to the necessary files (i.e.,
transactions are applied to the necessary objects) on the target node via the
apply jobs process.
The Apply process runs only on the target machine. One apply process is
created for each journal (both iTERA HA and User journals).
IMPORTANT
Great care should be taken when overriding the sequence number (option
12). While the instructions to perform an override are documented as
part of the various procedures that require it, Vision Solutions does NOT
recommend you perform this procedure without having first consulted
with CustomerCare.
Replication Overview
Library and object replication in iTERA HA uses journaling to record
changes to selected objects (files, data areas, data queues) on the primary
node. Journal changes are sent to remote journals on the target. An apply job
runs on the target node and monitors the journal receivers (which are
attached to the remote journals) for new journal entries, then extracts and
writes the data changes recorded in the journal entries to the appropriate
objects on the target node. For objects that are already being journaled by
other journals, iTERA HA simply uses the existing journals to mirror data
for those objects.
• Refer to the iTERA HA v6.0 Reference Guide for a list of objects, object
replication changes, and system components not supported in iTERA
HA.
• In order for user profiles to be replicated to the target box, there are certain
non-mirrored objects that must first be replicated.
IMPORTANT
Vision Solutions does not support and strongly discourages
replication of system libraries beginning with Q, particularly
QUSRSYS. This OS library contains OS-specific objects and other
objects unique to the machine.
If you must mirror objects from this library, do so solely at your own
risk. Work with your implementation specialist or contact
CustomerCare for additional information.
• From the Main Menu, select option 4 – Library Replication Menu. Select
option 30 – Non-Mirrored Sync definitions.
The following table indicates the sync criteria and exclusions that have
been added to the screen:
IMPORTANT
This step is critical for all Q libraries!
IMPORTANT
Other non-mirrored objects in your system may need to be synced.
Evaluate your systems in order to determine if objects in Q libraries
should be replicated. Use the instructions above to add them.
5. Press F9=Sync NM-Library. Enter *ALL for Library and *ALL for Target
System.
Once user profiles have been replicated to the target, any changes to existing
profiles will be automatically replicated.
User profiles must exist on the target machine before libraries can be replicated.
Identical mapping (F16) is required on all nodes. If more than one target node
is defined, F8 is used to copy maps to other nodes.
NOTE
If you have more than one CRG defined, each user profile should be
replicated only through one CRG.
NOTE
User Profile Replication should already be enabled in the Replication
Options screen (30.23). If, for some reason, the menu option for
User Profile Replication is not displayed in the Non-Library
Replication screen (5), or you cannot access the screen, access the
Replication Options screen (30.23) then select option 6=Enable for
the *USRPRF component.
2. If more than one target node is defined, the following screen is displayed.
Select the BACKUP 1 node (not the REPLICATE node), then press Enter.
4. Press F18=Submit Info Build to view all profiles on the primary system.
5. Select F16=Dft Map and review the pre-defined profile maps on the
primary node.
Several maps are displayed. These maps control the behavior of the
affected user profiles on the target node.
• The *ALL generic entry will ensure all user profiles not handled by any
other entries (generic or specific) are replicated (Add, Delete, Change,
and Password parameters are all set to Yes). Also, the Disable parameter
should be set to Yes for this entry so that general users cannot access the
target node.
• The other pre-defined entries include the E2*, E2CSTMGR, iTERA
HA Admin profiles, ITERADIRE, ITERAOUTQ, ITERAOWNER,
ITERAUSPF, and Q*. The default settings for these entries should be
retained (these profiles should not be disabled on the target).
• iTERA HA will automatically exclude any profiles that begin with the
letter Q. The Q* map entry covers all profiles beginning with the letter
Q (system profiles). However, if you have individual users whose
profiles begin with the letter Q, a map must be created for each, so that
they are included.
a. Press F6=Add to add a new entry to the map. The following screen
is displayed:
b. The screen below indicates how the map entry should look for user
profile “Quick”.
c. Enter the entire profile name in the User Profile or Generic* field.
d. Set the values for Add, Delete, Change, Update, and Disable Remote
Profile to Y, unless this profile should be allowed to generally access
the target node (not recommended).
7. Enter option 1=Select then press Enter to copy the map from the Backup 1
node indicated to the Replicate node.
10. The defaults should generally be left as defined (if needed, consult F1 Help
for information on these fields). However, the setting for Transport Choice
should be considered as follows:
11. Press F7=Add Exit Program to load the exit program with the desired
settings, then F3 to exit.
12. Press F21=Quick Sync to replicate all profiles from the primary node to
the target node (identified in the Target System screen heading).
13. User Profile audits will be automatically executed from the Audit
Command Console on a regular basis (you’ll learn more about this in a
later chapter). They may also be run manually from within the 5.1 screen
as follows:
IMPORTANT
Do not start replicating libraries until all profiles have finished
replicating.
NOTE
If profiles have been replicating and a new map is added, then the
profiles associated with the map will need to be audited or resynced.
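The way map entries interact (a specific entry for one profile, generic entries such as Q*, and the *ALL entry as the fallback) can be sketched as a lookup. The function name is hypothetical, and the precedence used here (exact name first, then the longest matching generic prefix, then *ALL) is an assumption for illustration; the guide states only that *ALL covers profiles not handled by any other entry.

```python
def resolve_map(profile: str, entries: dict):
    """Return the key of the map entry that would govern a profile, or None.

    Assumed precedence: exact name, then longest generic prefix (NAME*), then *ALL.
    """
    name = profile.upper()
    if name in entries:
        return name
    generic = [
        key for key in entries
        if key.endswith("*") and name.startswith(key[:-1])
    ]
    if generic:
        return max(generic, key=len)
    return "*ALL" if "*ALL" in entries else None
```

With entries for `*ALL`, `Q*`, `E2*`, and a specific `QUICK` entry, the profile Quick resolves to its own entry rather than to `Q*`, which is why individual users whose profiles begin with Q need their own map.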
Lock Analysis
1. Select menu option 4.22 on any iTERA HA command line.
NOTE
The lock analysis may take a significant amount of time to complete.
After the lock analysis has completed, the screen will display all objects that are
currently locked. This screen should be reviewed on a weekly basis and objects
that are no longer locked (i.e., the “State” of the object is no longer *EXCL or
*EXCLRD) should be synced using option 9=Sync Object.
Once the object has been synced, the object may be removed from this list
using option 4=Delete Lock Record.
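The weekly review described above amounts to picking out the lock records whose state is no longer *EXCL or *EXCLRD and that have not yet been synced. A minimal sketch, assuming a hypothetical record layout with "object", "state", and optional "synced" keys (the "*NONE" state in the usage note is likewise only a placeholder):

```python
LOCKED_STATES = {"*EXCL", "*EXCLRD"}


def ready_to_sync(lock_records: list) -> list:
    """Return the objects that are no longer locked and still need a sync."""
    return [
        record["object"]
        for record in lock_records
        if record["state"] not in LOCKED_STATES
        and not record.get("synced", False)
    ]
```

The objects returned are the candidates for option 9=Sync Object; once synced, their records can be removed with option 4=Delete Lock Record.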
Prior to a role swap, you will review this screen after the systems have been
quiesced (and all locks have been released) and then sync all objects that have
not been synced.
7=Toggle Perm Lock is used to retain objects on the screen for review and
syncing prior to a role swap. Objects marked for perm lock are not removed
from the screen when the F18=Submit *EXCL Lock Search is performed.
The table under the “Omit Filter Types” heading on page 60 lists the five
key omit types and shows what happens to the object when each filter is
selected. The iTERA HA Reference Guide contains additional detail on
filter behavior.
• Fast path: Enter 4.20 on any iTERA HA command line.
IMPORTANT
Journaling is automatically ended for journalable objects in mirror
journals that match a filter definition. iTERA HA does not start or
end journaling on any object being journaled in a user journal (the
object is being journaled via user journals for some other purpose),
therefore, filters cannot be enforced for those objects. Do not create
filters for objects in user journals.
IMPORTANT
Objects within libraries that have QDFTJRN defined cannot be
filtered. Use the following command to determine whether
QDFTJRN is specified for a library:
WRKOBJ [LibName]/QDFTJRN
IMPORTANT
Objects in mirrored libraries that exist on the target machine but not
on the primary machine, and that have an Omit Type code of 1 or 2,
will be deleted through the audit process, provided the data area
E2AUDDLT is set to Y.
IMPORTANT
The following omit filter is NOT allowed:
Library = *ALL
Object = *ALL
Type = *ALL
Attribute = *ALL
Filter Type = Omit
Omit Type = 2=Omit All
IMPORTANT
Do NOT define a CLRDTA omit filter for:
Object =*ALL
Type = *FILE
Attribute = *ALL
Omit Type = 4
These combined filter settings will clear all files, including DDM
files. If the DDM file is pointing to a file on another system, the files
on the other system will be cleared. This includes the Primary. When
setting up a CLRDTA filter, an acceptable setting for Attribute is PF.
NOTE
During a Virtual Role Swap, changes made to objects that have been
filtered using an IGNCHG filter will be retained.
IMPORTANT
Vision Solutions does not support and strongly discourages
replication of system libraries beginning with Q, particularly
QUSRSYS. This OS library contains OS-specific objects and other
objects unique to the machine. However, we understand that some
third-party application vendors often place objects in this library that
are required on the target systems. The Non-mirrored Library Object
Sync (4.30) should be used for needed objects that exist in QGPL or
QUSRSYS libraries. Check with your third-party application vendor
to identify required objects.
IMPORTANT
Under no circumstances should a filter entry exist for the following:
Library = QSYS
Object = *ALL
Type = *LIB
Filter Type = Include
The existence of this filter (in combination with other factors, such
as performing a product upgrade) has resulted in the need to reinstall
the product.
IMPORTANT
Do NOT define an Include filter for:
Library = QSYS
Object = *ALL
Type = *FILE
Omit Type Code table. Column key:
A = Delete object from target node if it already exists there?
B = Copy the entire object to the target node?
C = Create the object definition on the target and update definition when changes occur?
D = Journal the object on the primary node?
E = Clear the contents of the object on the target node (keep only object definition)?

Omit Type Code   A      B       C    D    E
1 = RMTDLT       Yes*   No      No   No   No
2 = OMTALL       No     No      No   No   No
3 = IGNCHG       No     Yes**   No   No   No
* Deleted during the CHKOBJMTCH audit and only if the data area E2AUDDLT is set to Y.
** Library sync only
*** By default, filtered objects are not journaled. Therefore, changes to the objects can only be tracked
through QAUDJRN, which is processed by the Object Monitor. Changes detected by the Object Monitor
will result in a resync of the objects.
**** Content clearing applies only to physical files and save files.
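As a reading aid, the rows of the omit type table that survive in this guide (codes 1 to 3) and the two footnotes attached to them can be expressed as a small lookup. This is a sketch under stated assumptions, not product code: the function names are hypothetical, codes 4 and 5 are deliberately not modeled because their rows are not shown here, and the logic follows the table as printed (the earlier IMPORTANT note also mentions code 2 in connection with audit deletion).

```python
# Omit type codes whose table rows are available in this guide.
OMIT_TYPE_NAMES = {1: "RMTDLT", 2: "OMTALL", 3: "IGNCHG"}


def _check(omit_type: int) -> None:
    """Reject codes whose behavior is not documented in the surviving rows."""
    if omit_type not in OMIT_TYPE_NAMES:
        raise ValueError(f"omit type {omit_type} is not modeled in this sketch")


def deleted_by_audit(omit_type: int, e2auddlt: str) -> bool:
    """Footnote *: RMTDLT objects are deleted on the target during the
    CHKOBJMTCH audit, and only if data area E2AUDDLT is set to Y."""
    _check(omit_type)
    return omit_type == 1 and e2auddlt == "Y"


def copied_to_target(omit_type: int, library_sync: bool) -> bool:
    """Footnote **: IGNCHG objects are copied in full, during a library sync only."""
    _check(omit_type)
    return omit_type == 3 and library_sync


if __name__ == "__main__":
    for code, name in OMIT_TYPE_NAMES.items():
        print(name, deleted_by_audit(code, "Y"), copied_to_target(code, True))
```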
2. Fill in the display with the appropriate data. Consult the table under the
“Omit Filter Types” heading on page 60 for descriptions of the filter types
available. Consult the iTERA HA v6.0 Reference Guide for more
information on other fields in this window.
In the following example, all files in library “MIKER” will be filtered from
replication to the target node.
1. Select 4.11.
2. Select option 7=Object Detail.
4. Enter an X in both the Journal Status and Syncing fields, then press Enter.
The display returns to the Mirrored Object Maintenance screen. An X is
displayed in both the Jrn and Snc fields.
Before you use this menu option, the library must exist on the target (it can be
empty, but it must exist there).
If you have more than one target system defined, you may press F4 to prompt
for the list of systems defined as targets (and/or replicate nodes).
• Fast path: Enter 4.31 on any iTERA HA command line.
Objects that are selected for synchronization through this screen will be copied
to the target node, but will not be monitored for changes or resynced by iTERA
HA. If you want to resynchronize the objects, you must use this screen again.
Objects can be included or excluded for a library by type of object, by object
name or using wildcard naming conventions.
IMPORTANT
NMO sync is not intended to ensure that libraries defined in this
process are in sync between the primary and the target. It is intended
only to save objects on the primary and restore them to the target. It
will NOT clean up obsolete objects on the target machine.
NOTE
Setting the Save Definition field to *YES will add the object to the
Non-Mirrored Library Object Sync screen (4.30) so that the item
will be monitored for changes and resynced by iTERA HA when
necessary.
NOTE
The library entered must exist on the target node in order for this
option to work.
The results of the analysis will help you identify and understand important
information about your system in relation to syncing using iTERA HA and
will help you identify the libraries and objects that should be selected for
replication. The list of libraries produced by the SAT Lite analysis can be
imported into the iTERA HA All Available Libraries list (4.11). This saves
time because you do not have to run the Library Analyzer (4.11, F18).
Import the list of libraries from SAT Lite into iTERA HA by selecting menu
option 10.3. (This option runs as soon as it is executed. No screen is
displayed.)
Library Replication
In order for iTERA HA to be able to track and maintain objects being
replicated, a number of events must take place:
• The library must be identified and associated with a journal. (The journal
should already be created and all mirroring functions working properly.)
• Objects within the library must be identified and associated with a journal.
• Once the library has been sent across the network or is on tape, it must be
restored on the target node. (At the time of restore on the target, the
objects will also start journaling on the target node.)
NOTE
There is a known issue in relation to joined logical file authorities.
Joined logical file authority cannot be replicated exactly the same to
the target system. This is an OS limitation.
Library Procedures
The following procedures pertain to the mirroring and management of
libraries. Syncing libraries via a network in iTERA HA is a simple and
commonly used procedure, requiring just a few steps. However, because of the
length of time that it can take to complete a tape sync or net sync of a large
library, issues can arise with these types of syncs. Normal iTERA HA processes
(such as auditing and journal receiver deletion) must be considered when
attempting these procedures. Tape syncs where the system is located at an
off-site facility can take from a few hours to several days. In this situation, the
journal receivers must be prevented from being deleted by the journal manager.
Net syncs for large libraries are dependent on a number of factors, including
object size, system bandwidth, etc. Additionally, files in a library that have
logicals with unique key constraints in a different library will require special
handling.
• For normal-sized library syncs via the network, if the library has not
previously been mirrored, begin with the procedure “Specify a new library
for mirroring” on page 67, then sync the library using the procedure
“Network Sync a Library” on page 68.
• For large library syncs and all library tape syncs, if the library has not
previously been mirrored, do the following:
1. Place the Roleswap Monitor job on hold. In E2SBS, select option 3 for the
xx_RSRMON job.
2. End the Audit Command Console on all nodes (1.8, F11=End Auto
Audit).
5. Hold the sync job (xx_SNC_yyy) on the Primary system (2.11 or E2SBS).
6. If syncing a new library, perform the steps in “Specify a new library for
mirroring” on page 67.
7. Perform the tape sync or large net sync indicated in “Network Sync a
Library” on page 68 or “Tape Sync a Library” on page 69.
8. Perform the steps indicated in the section “Steps to perform after a sync or
resync” on page 70.
– To create a new journal and add the new journal for this library, see
“Create A New Mirror Journal (3.1)” on page 35.
– To use a User journal that is not defined to iTERA HA, see “Define a
User Journal to iTERA HA” on page 37.
NOTE
Only mirror journals are available for selection. A user journal can
only be assigned as the default journal if it is changed to a mirror
journal (3.1 opt 8). Once a library is synced, the type of journal
should generally be changed back to USER (3.1, option 2; Jrn Type
parm). Otherwise, if journaling is ended for the selected library,
journaling will be ended for all objects assigned to the user journal.
3. If needed, use the 4.20 (Object Sync Filter) screen on the Primary node to
set up filters for objects.
NOTE
If objects need to be filtered from replication for this library, use
F16=Filters to define the filter. (Filters can only be defined for
objects that are not already being journaled to user journals. For
more information on filters, see “Object Sync Filter” on page 57.)
4. From the Work with Libraries screen, select 4.11, F7, primary, to display
the list of all available libraries. The screen heading will indicate “All
Available Libraries”. Use option 1=Select to select the desired library, then
press Enter. Press F7=Toggle View until the “Mirrored Libraries Only”
view is displayed.
5. If the assigned mirror journal shown for the library is not correct:
IMPORTANT
Do not end the iTERA HA subsystem during the resync and restore
process.
2. For resyncs only, locate the affected library; if the status is Active, select
option 14=Cancel Syncing, then press Enter.
3. Select option 21=Quick NetSync for the library you wish to sync, then
press Enter. Quick syncing assigns the library to the default journal, starts
journaling on the objects, saves the library and restores it to the target
node. The library’s status will change to QuickSnc and then to Active once
the journaling step has completed.
NOTE
Multiple libraries can be selected simultaneously, but be aware of
available disk space, bandwidth, and CPU resources.
4. Press F6=Work with library sync. The Library Syncing Status screen is
displayed. The Status column indicates where the library is in the process
of saving, sending, and restoring to the target. Once the library has been
restored, the status will indicate Syncing.
5. Complete the steps in the section “Steps to perform after a sync or resync”
on page 70.
NOTE
“Steps to perform prior to a sync or resync” on page 66 should be
completed prior to executing this procedure.
IMPORTANT
Do not end the iTERA HA subsystem during the resync and restore
process.
1. Select option 23=Quick Journal. (This assigns the library to the default
journal then starts journaling on all objects.) The status will change from
QuikJrn to Active.
2. Select F6=Work With Library Sync. Press F5 until the status on the left
indicates Ready. Do not continue until after the status for each selected
library has changed to Ready. (This may take some time.)
3. Select option 6=Select to Sync for the libraries then press Enter.
4. Select F18=TapeSync Selected Libraries. Enter the Tape Device and
Volume ID you want to use for the tape sync. (A message may be received
in QSYSOPR instructing you to load the tape. Select option G to go.) A
job will be submitted to the job queue which will finish saving the library
to tape.
5. The sync status in the 4.12 screen will change from Prepping, to Saving, to
On Tape when finished.
8. On the target machine select Restore LibSync from Tape (4.51). A window
will prompt for the name of the tape drive where you loaded the tape.
Enter the tape device and Volume ID, press Enter. The system will start to
read the tape when it is needed. This may take from a few minutes to a few
hours. (A message may be displayed in QSYSOPR instructing you to load
the tape. Select option G to go.) A job will be submitted to the job queue
which will finish restoring the library to the target system.
9. After the sync has completed, on the primary, select Library Syncing Status
(4.12) and verify that the library has been identified and is syncing (status
should indicate SYNCING).
10. Continue with the steps in the section “Steps to perform after a sync or
resync.”
1. On the primary, select Library Syncing Status (4.12) and verify that the
library has been identified and is syncing (status should indicate
SYNCING).
2. In the iTERA HA subsystem release the SYSMON job and the
xx_RSRMON jobs (E2SBS, opt 6).
If needed, this step should be done before you initially sync the library because
you are pre-assigning objects to journals.
If the library is already actively journaling, you can reassign individual objects
from the Mirrored Object Maintenance screen (4.21, option 7=Transfer
Journal). An exclusive lock on the object is required.
1. In the Work with Libraries (4.11) screen, verify that the journal you want
to use is defined as the default journal. If not, change the default journal to
the one you want to use by selecting F15=Change Default Journal.
3. Use option 7=Object Detail on the library to display the Journaled Object
Maintenance screen.
NOTE
Option 7 will display the Journaled Object Maintenance screen only
if the library’s status is Inactive. Otherwise, the Mirrored Object
Maintenance screen (4.21) is displayed.
5. Once objects are loaded, use the Position field to locate the object you want
to change to a different journal.
7. Once all desired objects have been changed, exit this screen by selecting
F3=Exit.
8. Sync the library via network or tape. See “Network Sync a Library” on
page 68 or “Tape Sync a Library” on page 69.
2. On the primary, select option 14=Cancel Syncing. Press Enter. Read the
screen message, then answer with a Y. Press Enter.
3. Select option 4=End Journaling on the library, then press Enter. This
option will submit a job that will try to end journaling for any object that
is journaled to a mirrored journal. The system will not end journaling if
the journal is not a mirrored journal. You will not be able to proceed until
the job finishes (all nodes).
5. Locate the same library on the target node. Select option 4=End
Journaling, then press Enter. This will only end journaling for objects
being journaled to a mirrored journal.
6. Check E2MSGLOG for messages. Look for HAE0118 which indicates the
number of objects for which journaling was ended and the number that
were locked. If objects are locked, end the locks then reattempt the
previous step.
7. On the primary, select option 24=Remove CRG Assignment and press Enter.
This clears any entry in the Status column and removes the library from the
Mirrored Libraries view (the library will still be displayed in the All
Available Libraries view).
If a journal no longer has objects attached to it, you can remove the journal
from iTERA HA. Refer to the iTERA HA v6.0 Advanced Features Guide for
the procedure “Remove a journal from being defined in iTERA HA”.
IMPORTANT
The journal you plan on using must be defined to iTERA HA before
it can be assigned.
- If you need to create a new journal, see “Create A New Mirror
Journal (3.1)” on page 35.
- If you want to add a user journal to iTERA HA, see “Define a User
Journal to iTERA HA” on page 37.
NOTE
This action affects only non-journaled objects added to the library
after the change.
IMPORTANT
This process will end journaling on the selected objects and start
journaling to a different journal. You cannot end journaling if any
process is using the object.
d. Press Enter.
6. Select the journal with option 1=Select with Default Image.
2. Save the library to device and restore to new ASP assignment. (The new
ASP is number 2 for illustration purposes.) Note the Restore to ASP number
parameter below:
3. Create the library on the target node for the desired ASP so the remote
journal jobs can use it.
4. Create the journal receiver and the user journal for the library in the
desired ASP in the same library using the following commands:
CRTJRNRCV JRNRCV(LIBX/RCVASP) ASP(2) TEXT('Receivers for ASP2 Journal')
5. Load new user journal in 3.1 Local Journal Maintenance, F8=Load New
User Journal.
7. Verify that the journal has the mirror process set to “On”. If it is not, use
option 32=Toggle Mirror Status to enable the mirror status. Once it is on,
verify that remote journals have been created using the 3.3 screen.
8. Verify on the target machine that there is an apply job created. If any of the
components are missing, return to 3.1 on the primary machine and
rebuild the components.
9. Verify again that the remote journals and the apply jobs are in place.
10. Select 3.4, Apply Job Maintenance, and start the apply process on the
target machine as follows:
12. From 4.11 on the primary machine, change the default journal using the
F15=Change Default Journal. Select option 1=Select with Default Image
on the journal that you created in the other ASP.
13. Verify that the journal now shows as the default journal on the 4.11 (Work
with Libraries) screen.
14. Resync the library by using the instructions in “Network Sync a Library”
or “Tape Sync a Library” on page 69.
Object Procedures
Syncing objects via a network in iTERA HA is a simple and commonly used
procedure, requiring just a few steps. However, because of the length of time
that it can take to complete a tape sync or net sync of a large object, issues can
arise with these types of object syncs. Normal iTERA HA processes (such as
auditing and journal receiver deletion) must be considered when attempting
these procedures. Tape syncs where the system is located at an off-site facility
can take from a few hours to several days. In this situation, the journal receivers
must be prevented from being deleted by the journal manager. Net syncs for
large objects are dependent on a number of factors, including object size,
system bandwidth, etc. Additionally, files that have logicals with unique key
constraints will require special handling.
• For normal-sized object syncs, use the procedure “Resync an Object (or
Objects) Via the Network” on page 77.
• For large object resyncs via network or tape, review “How to Determine
Whether to Use the Large Object Sync Procedure” on page 77, then
execute the procedure “Large Object Resync Via Network or Tape” on
page 79.
• For normal-sized object resyncs via tape, use the procedure “Large Object
Resync Via Network or Tape” on page 79 and ensure the objects are larger
than the value specified in the Max Network Sync Size parameter (1.1,
F6=Objects Requesting Sync, F18=Update Max Network Size).
1. From within the Objects Requesting Sync screen (1.1, F6) on the primary,
select F18=Update Max Network Size.
NOTE
Entering all 9s (999999) in this field will enable all objects to be
synchronized via the network.
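This guide states that an object must be smaller than the Max Network Sync Size to resync via the network and larger than it to resync via tape, and that a value of all 9s routes every object over the network. The Python sketch below restates that rule; it is an illustration, not product code. The function name is hypothetical, both arguments are assumed to be in the same unit (the guide does not state the unit here), and the case where the size equals the threshold is reported as unspecified because the guide does not cover it.

```python
ALL_NINES = 999999  # value the guide says enables network sync for all objects


def sync_route(object_size: int, max_network_sync_size: int) -> str:
    """Return 'network', 'tape', or 'unspecified' for a resync request.

    Both values are assumed to be expressed in the same unit.
    """
    if max_network_sync_size == ALL_NINES:
        return "network"  # all 9s: every object syncs via the network
    if object_size < max_network_sync_size:
        return "network"  # smaller than the threshold: network sync
    if object_size > max_network_sync_size:
        return "tape"     # larger than the threshold: tape sync
    return "unspecified"  # equal sizes are not addressed by the guide


if __name__ == "__main__":
    print(sync_route(500, 1000))     # network
    print(sync_route(5000, 1000))    # tape
    print(sync_route(5000, 999999))  # network
```

To force a normal-sized object onto tape, the threshold would be lowered below the object's size before the resync is requested, which matches the tape procedure referenced above.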
NOTE
If more than one target node is defined, then to resync to all nodes,
initiate the resync from 4.21, opt 6, primary. If the object is to be
resynced to only one of the target nodes then initiate the resync from
that node using either 1.22, opt 6 or 3.7, opt 6.
• The process that drains data queues within the restore process ends
abnormally. (The data queue will remain in the work library.)
• After syncing, the object immediately falls out of sync. The reason code
“Bad write” is displayed in 1.1, F6.
• Logical files attached to the object being resynced have unique key
constraints. To identify whether a file has logicals with unique key
constraints, do the following:
2. Enter the name of a file and library to query, then press Enter. The
output will be displayed on the screen.
3. Position to the bottom of the screen and locate the Files Dependent
On Specified File section. If no dependent files are listed then the
additional steps are not required. If dependent files exist, each
dependent file will need to be checked for unique key constraints.
Note the name of each dependent file listed in the screen output,
then press F3=Exit to return to a command line.
4. Execute the command DSPFD for each dependent file. For example:
DSPFD ITHA/E2POBJL1
5. If the setting for Unique key values required is *YES, then execute
the “Large Object Resync Via Network or Tape” procedure,
described below.
NOTE
This procedure also includes an optional step for net syncs for
starting up an additional sync job through which the large object will
be processed. By doing this optional step, other object resyncs that
occur as the large object is being processed will not be held up. The
process entails placing the large object on hold, placing the sync job
on hold, submitting the extra sync job to process the large object,
then releasing the standard sync job.
1. On all nodes, select 1.7, Role Swap Readiness Monitor. Position to the
OBJSNCSTS test and select option 2=Change. Enter N for the Run All
Tests and Run Test Cmd Test parameters, then press Enter.
2. On all nodes, select 1.8, F11=End Auto Audit, to end the Audit
Command Console.
3. On all nodes, from the Job Scheduler (WRKJOBSCDE), place all the
iTERA HA audits on hold using option 3.
4. On all nodes, select 3.33. Select option 3=Hold for the XPJRNMGT job.
5. On the primary, select 1.1, F6=Objects Requesting Sync. Review the value
defined for the Max Network Sync Size parameter.
• In order for the object to be resynced via tape, it must be larger than
the value defined.
• In order for the object to be resynced via the network, it must be
smaller than the value defined.
6. If the object is highly active (that is, if it generates more than
approximately 18 million journal entries between the time it is saved and the
time it is restored within the iTERA sync process), do the following:
NOTE
If more than one target node is defined, then to resync to all nodes,
initiate the resync from 4.21, opt 6, primary. If the object is to be
resynced to only one of the target nodes then initiate the resync from
that node using either 1.22, opt 6 or 3.7, opt 6.
8. On the primary, select 1.1, F6. If the object is smaller than the defined
Max Network Sync Size, it will be automatically synced via the network. The
syncing process can be monitored from this screen.
NOTE
If the objects are synced via the network AND there are unique key
constraints, then the apply job for the journal being used to journal
the objects must be held before the restore process has completed.
Select 3.4 on the target, then select option 11=Suspend Process State.
(If unsure which apply job to suspend, check the Mirrored Object
Maintenance screen [4.21] for the journal being used to journal the
objects.)
For network syncs, monitor for additional objects requesting sync. Press
F16=Submit Extra Network Sync to accommodate syncing and to prevent
a bottleneck through a single sync job.
9. Optional: For network syncs only, you may initiate a second sync job so
that the large object resync will run through the second sync job, thus
freeing up the first sync job to process any other sync requests that may
execute during the time the large object is being resynced.
a. Load an initialized tape into the tape drive and make the tape drive
ready.
b. Press F20=Tape Sync.
c. Select the object(s) to be synced via tape with option 1=Select Object
to Save or F21=Select All.
d. Press F6=Create Tape.
e. Specify the Tape Device and Volume ID and press Enter. Or, if using
an advanced tape system, select F4=Select Volume Set, specify the tape
drive, volume ID, and volume set, then press Enter.
f. Press F9=Submit Tape Sync.
g. A message may be displayed in QSYSOPR, indicating to load the tape.
Select option G to go.
h. A job will be submitted to the job queue which will finish saving the
objects to tape. Once this is finished, the sync status for the objects in
the 1.1 F6 screen will indicate that the objects are on tape.
i. Transport the tape to the target system and mount it.
j. If the objects are synced via tape AND there are unique key constraints,
then immediately suspend the apply job for the journal being used to
journal the objects. Select 3.4 on the target, then select option
11=Suspend Process State. (If unsure which apply job to suspend,
check the Mirrored Object Maintenance screen [4.21] for the journal
being used to journal the objects.)
k. On the target machine select 4.52, Restore Object Sync from Tape. In
the window that is displayed, enter the name of the tape drive where
you loaded the tape. Enter the tape device and Volume ID, press Enter.
The system will start to read the tape when it is needed. This may take
from a few minutes to a few hours.
l. A message may be displayed in QSYSOPR instructing you to load the
tape. Select option G to go.
m. A job will be submitted to the job queue which will finish restoring the
objects to the target system.
n. Select 1.1, F6 on any node. The tape sync is complete when the object
is no longer displayed on the screen.
11. For all object resyncs where there are unique key constraints, immediately
after the sync has completed, do the following. (If there are no unique key
constraints, skip to the next step.)
NOTE
Do the remaining steps only after it appears that the sync process has
been successful and the objects have remained in sync for several
hours.
12. On the target, select 4.21, Mirrored Object Maintenance. Position to the
object and verify that the Jrn, Snc, and Omt fields indicate Y, Y, and blank,
respectively. Any other settings will cause the object to be resynced again.
13. On the primary, select 1.7, Role Swap Readiness Monitor. Position to the
OBJSNCSTS test and select option 2=Change. Clear the N for both the
Run All Tests and Run Test Cmd Test parameters, then press Enter.
14. Start the Audit Command Console on all nodes (1.8, F11=Start Auto
Audit, F7=Ignore Time Setup). If audit jobs have been held in the job
scheduler, release them.
15. On all nodes, select 3.33. Select option 6=Release for the XPJRNMGT
job.
3. Select option 1=Select With Default Image on the journal to be used. Press
Enter.
4. The Library/Journal data in the 4.21 screen will be updated with the new
library and journal.
Press Enter.
With the System Library Autosync, replication is scheduled and does not
require an operator to be in attendance. Synchronization definitions are
created for objects at either the library or object level. Each definition
includes a set of either individual objects or entire libraries. A definition
is assigned a relative priority, which determines the order in which it is
synced. A low number indicates a high priority, so the sync definitions
assigned a higher priority will be synced first.
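The priority rule (a lower number means the definition is synced earlier) amounts to an ascending sort on the priority value. The following Python sketch illustrates this; the SyncDefinition class and the definition names are hypothetical, and keeping the original order for definitions with equal priority is an assumption, since the guide does not say how ties are resolved.

```python
from dataclasses import dataclass


@dataclass
class SyncDefinition:
    """Hypothetical stand-in for an AutoSync definition."""
    name: str
    priority: int  # low number = high priority


def sync_order(definitions):
    """Return definitions in the order they would be synced.

    sorted() is stable, so definitions sharing a priority keep their
    original relative order (an assumption for illustration).
    """
    return sorted(definitions, key=lambda d: d.priority)


if __name__ == "__main__":
    defs = [
        SyncDefinition("ARCHIVE_LIBS", 30),
        SyncDefinition("PAYROLL_LIBS", 10),
        SyncDefinition("TEST_LIBS", 20),
    ]
    for d in sync_order(defs):
        print(d.priority, d.name)
```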
1. Access the System Library AutoSync from menu 4.10 on the primary. Press
F8=Selection to display the Library Selection Definitions screen.
2. Press F6=Add Selection to display the Select Libraries for Auto NetSync
screen.
3. Fill in the screen with the desired data. Instructions for only a few of the
key fields are described below. If additional information is needed, consult
the iTERA HA v6.0 Reference Guide.
4. After the AutoSync definition is complete, press Enter. The screen will be
returned to the main AutoSync screen. The Syncing Status will indicate
Defined.
NOTE
At this point, you may choose to immediately begin syncing the
definition or use the System Sync Monitor to schedule syncing for
later. If you want to immediately sync the definition continue with
the next step below. However, if you are using the System Sync
Monitor to initiate and manage replication (recommended), skip to
step 7 on page 88.
NOTE
When using this option, the status should indicate “Defined”. Do not
set the status to “Ready” (F10 or option 11). If it is set to “Ready”, you will
not be able to select option 8, because the Ready status hands control of
the definition to the monitor.
6. Select option 8=Submit Sync for the library selection definition. This will
build the list of libraries that meet the selection criteria. The status will
change from Defined to Loading, to Submitted, to Complete.
7. Select option 5=View Results to view the libraries that are defined in the
AutoSync definition.
NOTE
If Library Max Size was defined in the Library Selection Criteria
section of the Select Libraries for AutoSync screen, then the number of
libraries exceeding the defined limit will be displayed here.
10. After creating all AutoSync definitions, verify they are in Complete status.
11. Press F12 to return to the main Library AutoSync screen (4.10). All libraries
included in any of the definitions created in the previous steps are
displayed with a Defined status.
12. Press F10 to set the status of all libraries to Ready (or select option 1 for
just a few libraries).
Field                   Instruction/Description
Warning Syncing Jobs    Jobs in this field may be in a MSGW or HELD status.
14. The System Syncing Monitor example below shows a reduction in the
number of syncing jobs on the primary and the scheduling activated for
weekdays between 1:00 am and 5:00 am.
16. Once the System Sync Monitor is Active, press F12 to return to the
previous screen.
Press F5 periodically to watch the status of the libraries change as they are
synced to the target node. The list of statuses and their meanings is located
in the iTERA HA v6.0 Reference Guide.
Once the libraries have been processed and applied to the target node, they
are removed from the screen.
By default, only new libraries (i.e., libraries listed in the “New Libraries” view
in the 4.11 screen) are selected in AutoSync definitions. To build an AutoSync
definition for libraries not defined as “new”, do the following:
2. Position to E2LIBATOT.
Since not all users will replicate all non-library replication items, upon first
entering the Non-Library Replication menu, most menu options will not be
visible. They must be activated from the Replication Options menu.
Option     Description
*MQ        MQSeries/WebSphere MQ
*USRPRF    User Profile
           NOTE: The Global State for User Profile Replication
           should already be on.
If the CRG contains more than one target node and you want to restrict
replication for a component to one of those nodes, use option 14=Disable
Local on that node.
The screen below shows several non-library replication options enabled (the
Global State displays “On”).
NOTE
When the Global State is toggled “On” on one node, it will be
automatically toggled “On” on the other defined nodes.
• Directory Entries
• User Profile
Instructions for general setup for each feature in the Non-Library Replication
menu are described in the next several sections.
This option allows you to define the replication of system Directory Entries
between nodes.
There are two types of entries with iTERA HA Directory Entry Replication:
IBM only allows one directory entry to be created per user profile.
The user profile must exist on the node where the IBM i Directory Entry is
being created. If the user profile for the entry being replicated does not exist on
the target node(s) the iTERA HA entry will have an error status when an
attempt is made to replicate it.
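Because an entry goes into error status when its user profile is missing on the target, it can help to compare the two lists of profile names before replication starts. The Python sketch below shows that comparison as a simple set difference. It is illustrative only: the function name and the sample profile names are hypothetical, and the lists are assumed to have been gathered separately on each node.

```python
def entries_that_would_error(entry_profiles, target_profiles):
    """Return the profiles that have a directory entry on the primary
    but do not exist on the target node, in sorted order.

    Names are compared in upper case, assuming profile names are
    case-insensitive.
    """
    existing = {p.upper() for p in target_profiles}
    missing = {p.upper() for p in entry_profiles if p.upper() not in existing}
    return sorted(missing)


if __name__ == "__main__":
    primary_entries = ["JSMITH", "MJONES", "APPUSER"]
    target_users = ["JSMITH", "APPUSER"]
    # Profiles listed here would need to be created on the target first.
    print(entries_that_would_error(primary_entries, target_users))
```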
NOTE
This program takes advantage of the advanced, feature-rich
capabilities of System Architecture. There are several additional
options and function keys available but not visible on the screens. To
display them, select menu option 50.2 (or execute the command
E2EXPLVL), set the experience level to 4, sign off, then sign back
on. The recommended level for general use is 2. These features are
documented in the appendices of the iTERA HA Reference Guide.
3. From the primary node, access the Directory Entries Replication screen
(5.7). The indicator “Exit Point Active” at the top of the screen will display
“No”. (When the Exit Program is not active it means that Directory
Entries are not being replicated.)
NOTE
Disregard the message “Data area E2DICTL in *LIBL not found”.
The data area will be automatically created.
4. Select F20=Control. The screen text indicates that only one CRG can
replicate Directory Entries. Specify the other parameters, then press F7 to
add the Exit Program.
NOTE
Disregard the message “Data area E2DIRPRD in *LIBL not found”.
The data area will be automatically created.
NOTE
Refer to the iTERA HA v6.0 Reference Guide for information on
the fields displayed in this screen.
5. When the Exit Program has been successfully loaded, “YES” will be
displayed in the Exit Point Active field. Press F3 to exit. (The exit program
is automatically enabled on the target node.)
6. Directory entry information from the IBM i must now be copied to a file
in iTERA HA. On the primary node, from the main Directory Entry
screen (5.7) press F18=Submit Info Build to retrieve the local IBM i
directory entry information. Press F5 periodically to refresh the screen
until the entries from the primary node are displayed.
NOTE
Only the primary node’s Directory Entries are displayed on this
screen.
7. When the entries have finished loading, select F16=Dft Map to display the
Directory Entry Mapping screen. The maps control how adds, deletes,
changes, etc. are handled on the target node.
8. The settings for the Q* map prohibit all directory entries beginning with
the letter Q from being replicated to the target node. (IBM directory
entries, which start with the letter Q, should not be replicated.) Separate maps
must be created to replicate any profiles beginning with Q. If there are user
directory entries with names that start with Q, create a map for those
entries. Press F6=Add to create a new map, enter the information for the
directory entry, then press Enter.
NOTE
Adjustments may need to be made to replicated Directory Entries on
the target node in order for the entries to be viable after a role swap is
performed. For example, some third party applications require the
directory entry address to match the system name. Additionally,
SMTP routes may differ from primary to target. If either of these is
the case with your system, you’ll need to create a conversion map to
handle these issues. Follow the instructions in the section Conversion
Utility for Directory Entries. If this is not the case for your system,
continue with the next step.
a. To Quick Sync all entries from the primary to the target system, on the
primary node from within 5.7, F16=Dft Map, select F21=Quick Sync
All.
NOTE
Any changes made to Directory Entries (for example, through
WRKDIRE) will be replicated to the target node.
b. To Quick Sync only select entries, on the primary node, access the
Directory Entries screen (5.7) and verify that the entries you want to
sync appear on the list, select option 21 (Quick Sync) for each entry,
then press Enter.
NOTE
Changes made to individually selected directory entries on the
primary will be replicated to the target node.
The sample entry below will cause the address field in all replicated directory
entries to change to the name of the target node.
1. Select F19=Cvt Map from the main Directory Entries Replication screen.
NOTE
Use the F7=Load Default Map key to automatically create the
commonly used conversions for most systems.
Field Description
3. Select F4 to display the list of available fields. Select option 1 for the
Address field. (The screen automatically returns to the Directory Entry
Conversion display.)
NOTE
Additional information on this screen is located in the Reference
Guide.
NOTE
Before you can access IFS Replication screens, you must enable IFS
replication in Replication Options (30.23 opt 6) under the
Environment & Setup menu.
NOTE
Directory Entries must be replicated before IFS so that QDLS object
ownership information can be replicated to the target. If the
Directory Entry for the object’s owner is not replicated first, the
QDLS object owner information is not replicated.
IMPORTANT
IBM documentation for V5R3 of i5/OS states “The maximum
number of objects that can be associated with one journal is 250,000.
This maximum includes objects whose changes are currently being
journaled, objects for which journaling was ended while the [journal
receivers that are still in the current chain with the currently attached
receiver]. If the number of objects is larger than this maximum,
journaling does not start”.
If you have a directory that has more than 250,000 objects in it you
can either mirror it using object-level replication or journal
subdirectories to separate journals.
IMPORTANT
If syncing is cancelled for a directory on the primary, end
journaling on the equivalent directory on the target; otherwise, the
number of allowable objects in the journal could be exceeded. If the
objects that had previously been mirrored should not reside on the
target, delete them manually.
Object-level replication sends an entire object to the target system if any data
in the file changes.
NOTE
Certain directories and objects within IFS should not be replicated.
Most of these directories are not displayed in the IFS Replication
screens. Consult the iTERA HA v6.0 Reference Guide for a partial
list of IFS object types and directories ineligible for replication.
issue. The actual limit is what is specified in the Journal Object Limit
parameter of the CHGJRN command.
• The display file command for the root directory may also be used to access
a list of all IFS on the system (DSPF ‘/’).
Regardless of the method used, the goal is to control the amount of data going
through the journal. If there are too many directories or excessively large
objects going through one journal, the apply job may fall behind.
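As a rough planning aid, the idea of spreading directories across several journals can be sketched in Python. This is only an illustration and not part of iTERA HA: the function name `plan_journals`, the directory paths, and the use of a simple first-fit grouping are all assumptions, and the 250,000 default merely mirrors the figure quoted above (the actual limit is whatever the journal is configured for). Object counts are only a proxy for journal load, since ended-journaling objects can also count toward the limit.

```python
from typing import Dict, List


def plan_journals(dir_object_counts: Dict[str, int],
                  limit: int = 250_000) -> List[List[str]]:
    """Group directories so that no group exceeds `limit` objects.

    Uses first-fit decreasing: largest directories are placed first,
    each into the first group that still has room for it.
    """
    for path, count in dir_object_counts.items():
        if count > limit:
            # A single directory over the limit cannot share any journal;
            # it needs object-level replication or splitting by subdirectory.
            raise ValueError(f"{path} has {count} objects, over the limit {limit}")

    groups: List[List[str]] = []
    totals: List[int] = []
    ordered = sorted(dir_object_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for path, count in ordered:
        for i, total in enumerate(totals):
            if total + count <= limit:
                groups[i].append(path)
                totals[i] += count
                break
        else:
            # No existing group has room, so start a new one (a new journal).
            groups.append([path])
            totals.append(count)
    return groups
```

Each returned group would correspond to one IFS journal; the counts themselves can be gathered with option 19, described below.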
Option 19 Output
Within IFS Replication, select option 5 to view a file system. Select option 19
for various directories (multi-select is supported). Drill down using option 5 as
needed within subfolders to determine which level of the directory structure to
replicate.
The output for option 19 displays at the bottom of the screen as follows:
Size: 35,852 Nbr Dir: 1 Nbr Files: 1 Nbr SymLnk: 0
With the information about directory path size, the number of subdirectories,
files, and symbolic links, make a determination of how many directories will be
replicated at one time.
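If the option 19 figures are copied from screen captures into notes, a small parser can turn each line into numbers for comparison. The sketch below assumes the line always has the layout of the sample shown above; the function name `parse_option19` and the dictionary keys are hypothetical.

```python
import re
from typing import Dict

# Matches a line shaped like the sample output of option 19.
_OPTION19 = re.compile(
    r"Size:\s*([\d,]+)\s+Nbr Dir:\s*([\d,]+)\s+"
    r"Nbr Files:\s*([\d,]+)\s+Nbr SymLnk:\s*([\d,]+)"
)


def parse_option19(line: str) -> Dict[str, int]:
    """Return the four counters from one option 19 output line."""
    match = _OPTION19.search(line)
    if match is None:
        raise ValueError(f"not an option 19 output line: {line!r}")
    keys = ("size", "dirs", "files", "symlinks")
    # Strip thousands separators before converting to integers.
    return {key: int(value.replace(",", ""))
            for key, value in zip(keys, match.groups())}
```

For example, `parse_option19("Size: 35,852 Nbr Dir: 1 Nbr Files: 1 Nbr SymLnk: 0")` yields a size of 35852 with one directory, one file, and no symbolic links.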
IMPORTANT
This selection process is wholly dependent upon the particular
system and bandwidth. If large directories (or too many directories
at one time) are selected to replicate, a critical bottleneck will occur.
The total number of subdirectories and files/objects does not include any
symbolic links within those directories.
• Look for large directories by checking the total path size, the number of
subdirectories, and the number of files/objects. (Large is relative to your
system.)
• Do screen prints or screen captures of the various screens and note the
intended syncing strategy.
What to Replicate
When IFS is replicated in iTERA HA, any objects located within a replicated
directory, including all subdirectories and objects within those subdirectories,
will be replicated. If a higher-level directory containing, for example,
subdirectories that contain licensing information is replicated, there will be
problems. Drill down into each subdirectory in order to determine if all items
within the directory are eligible for replication.
*DOC Document
*FLR Folder
*DIR Directory
Replication Instructions
Prior to actual replication, one or more IFS journals must be created, the
journaling defaults for the IFS journals must be adjusted, and the first journal
to which IFS objects will be replicated must be designated as the default
journal. These steps are common to both journaling and object-level
replication. Once these steps are completed, continue with either the
“Replicate IFS Using Journaling” on page 115 or “Replicate IFS Using
Object-Level Replication” on page 116.
1. From 3.1 (Work with Local Journals) on the primary, select F6=Create
Mirror Journal.
2. Specify I for the New Journal Type field, then press Enter.
IMPORTANT
The New Journal Type must be “I” for IFS journals.
5. On the target node, select 3.4, Apply Job Maintenance. The apply job for
the journal just created will be displayed with the Jrn State “*ON” and the
Process State “OFF”.
8. Since this is a new Journal, select option 3, “Use the first seq# of the
current attached receiver” then press Enter.
9. When prompted whether to delete prior receivers, answer “Y”. Type the
reason for change, then press Enter.
11. Select option 14=Restart Job to start the apply job. Press F5 periodically. If
the apply job has started correctly, the job status should change to either a
RUN status or an EVTW status.
12. Verify that remote journaling was added. On the primary, (still in 3.1)
select option 5=WRKJRNA for the new journal, then select F16=Work
with remote journal information. The Journal State should be *ACTIVE.
1. On the primary, select 5.2. The Work with File Systems screen is
displayed.
NOTE
If the QDLS directory cannot be accessed, it is due to an authority
issue with the User Profile that is signed on. Add the User Profile to
the Directory Entry (WRKDIRE).
2. Press F7=Journaling Defaults. The defaults displayed are the defaults used
for IFS Replication and generally will not need to be changed. However,
review the Help text (F1), change as needed, and press Enter.
3. The IFS Replication Options screen is displayed. Review the Help text
(F1), change as needed, and press Enter to return to the main screen.
4. Select Option 5=Work with file system to drill down into one of the file
systems.
NOTE
To replicate objects using object-level replication, a default journal
must still be assigned to the object or directory. The assigned journal
is the journal through which the objects will be sent to the target
system.
Only IFS journals are displayed. Select one to be the default journal for the
chosen directory. Remember that all subdirectories will be replicated
(unless specified otherwise in IFS Journaling Defaults). Select option
1=Select with Default Image for the journal and press Enter.
6. The screen is automatically returned to the All Directories view and the
default journal is displayed.
7. Select option 5=Display link for the designated file system to drill down
further to view subdirectories and objects (refer to the replication mapping
strategy).
8. Select option 1=Assign to dft jrn for the directory you want to replicate.
The directory (and all objects within it) is then assigned to the default
journal.
• If using journaling, when replication is started, the directory and all its
objects will be replicated using the assigned journal.
• If using object-level replication, information about the object is sent to
the target node through the journal.
IMPORTANT
Some applications may get errors because they don’t have adequate
authority to the journal to which IFS objects are being journaled. To
avoid this, grant *ALL authority on the journal to *PUBLIC. Do
this after the journal is created and before you start journaling.
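The guide does not name the command used to grant this authority. One common way on IBM i is GRTOBJAUT with object type *JRN; the Python sketch below only assembles that command string so it can be reviewed before it is run. The helper name `grant_public_all` and the library and journal names in the usage note are hypothetical.

```python
def grant_public_all(journal_lib: str, journal_name: str) -> str:
    """Build a GRTOBJAUT command granting *PUBLIC *ALL authority to a journal.

    The command is returned as text only; nothing is executed here.
    """
    lib = journal_lib.strip().upper()
    jrn = journal_name.strip().upper()
    if not lib or not jrn:
        raise ValueError("library and journal name are both required")
    return f"GRTOBJAUT OBJ({lib}/{jrn}) OBJTYPE(*JRN) USER(*PUBLIC) AUT(*ALL)"
```

Calling `grant_public_all("mylib", "ifsjrn")` produces the text `GRTOBJAUT OBJ(MYLIB/IFSJRN) OBJTYPE(*JRN) USER(*PUBLIC) AUT(*ALL)`, which would then be entered on a command line on the node that owns the journal.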
1. Select option 2=Start jrn to start journaling the directory. The Status will
indicate journaling is active.
NOTE
It may take a long time to start journaling on large objects.
NOTE
As an alternative to selecting options 2 and 3 separately, option
4=Quick sync (2,3) may be used to start both journaling and
mirroring for each object to be replicated. Journaling will be fully
started before replication is initiated. Keep in mind that if there are a
large number of objects in the directory, it may take a long time to
start journaling.
NOTE
Options 2, 4 and 8 are not valid for object-level replication.
IFS Audit
The IFS audit reviews the data, attributes, and authorities. The audit will run
automatically via the Audit Command Console; however, to manually audit
selected files or directories, use option 20=Audit from the 5.2 screen or execute
the command E2IFSAUD on the primary. The directory to audit and the
parameters are displayed and may be adjusted, if needed. Press Enter to initiate
the audit.
View the IFS audit results in the Audit Command Console (1.8) on the target
node by selecting option 1 on IFS_REVIEW. The IFS Audit History screen
provides summary information about IFS audit history.
Select option 1 on the report to view. The Display IFS Audit History screen is
displayed.
Place 5 by an object on the next screen to view audit detail on the selected
object.
The report displays audit results by “Passed” and “Failed” statuses. The report
also shows differences and corrected values. See the section “Audit Detail” on
page 185 for additional information on viewing the IFS Audit results.
NOTE
Ensure that synchronization is complete before ending mirroring on
a subdirectory. An indication of complete synchronization is that the
record in the subfile will turn a different color (usually pink) and the
status will be active.
NOTE
This program takes advantage of the advanced, feature-rich
capabilities of System Architecture. There are several additional
option and function keys available but not visible on the screens. To
display them, select menu option 50.2 (or execute the command
E2EXPLVL), set the experience level to 4, sign off, then sign back
on. The recommended level for general use is 2. These features are
documented in the appendices of the iTERA HA Reference Guide.
6. Display the subsystem (E2SBS) and press F5 until the job xx_RPTREP is
displayed.
7. On the target system, select Work with Subsystem Jobs (2.11 or E2SBS).
8. End all xx_OBJMON jobs using option 4, press Enter to confirm. Refresh
the screen and verify the job has ended, then press F12 to exit the screen.
12. Press F18=Submit Analyzer. Press F7=Toggle Views until the All Output
Queues view is displayed. The list will be populated as the Analyzer runs.
13. Select one of the following options for each output queue, then press
Enter.
– 11=Repl (Old and New) Replicates the output queue, existing reports
and new reports as they are added.
– 12=Repl (New only) Replicates the output queue and new reports as
they are added. Existing spool files in the output queue are replicated
only when the audit is run.
When Enter is pressed the Status field will display Active, and the Rep
field will display Yes, indicating that replication is active.
14. Optional: select option 15=Remove New Status for any output queues you
do not wish to replicate. This will remove the “New” indicator from the
Status field.
All components required for running a device being replicated on the primary
node are also automatically replicated to the target node. For example, if a
device which has a controller attached is replicated, the controller is also
automatically replicated. If a printer is attached, and the printer isn’t already
defined on the target node, the printer is replicated automatically.
When this screen is first accessed, no devices will be displayed but the system
will automatically begin building the list. Select F18=Submit Info Build in
order to ensure the list is fully updated and displays all current devices.
NOTE
This program takes advantage of the advanced, feature-rich
capabilities of System Architecture. There are several additional
option and function keys available but not visible on the screens. To
display them, select menu option 50.2 (or execute the command
E2EXPLVL), set the experience level to 4, sign off, then sign back
on. The recommended level for general use is 2. These features are
documented in the appendices of the iTERA HA Reference Guide.
4. After the list is built, the lines, class of service descriptions, network servers,
and modes are displayed in the Config Name column.
NOTE
The *BLANK category contains controllers that do not have
attached lines. Within the *BLANK category for lines is another
*BLANK category for devices that do not have attached controllers
(not displayed in this view).
– The plus sign (+) to the left of a controller indicates there is at least one
device attached to it.
IMPORTANT
Replication can be done globally or individually. Read through the
instructions in the following two sections before determining the
best way to proceed. Complete one or both of the following sections.
A typical setting for the *ALL map is to set adds and changes to Y but to
leave deletes set to N. (The default *ALL map is set to NOT update the
target configuration components when adds, changes, and deletes occur—
this is a precaution in order to prevent someone from unintentionally
sending everything across without considering the ramifications.)
You must determine if you wish to set up a more specific map to replicate
certain controllers and/or devices, or you may skip this step and go back to
the main Configuration Replication screen, where you can select
individual entries to replicate.
In the example below, a map covering devices starting with “BI” (for
billing) and only those with the category of *DSP was created for the
billing department.
This map is defined to update the target node every time adds, deletes, and
changes are detected on the primary node.
NOTE
A list of Categories is available in the Reference Guide.
4. After the desired maps are adjusted and/or created, use F8=Copy Maps to
sync all defined map entries to the target node.
NOTE
The Sts column displays the status of replication. If not all Lines,
Devices, Controllers, Class of Service, Modes, and Network
Descriptions that should be replicated to the target node were
covered by the maps, continue with the next section.
The color-coded data, along with the Sts (Status) column, shows clearly
what has been replicated and what has not. The color codes are as follows:
Color Description
Green Replication has not yet been started for an object for the first time.
The Ovr (Allow Device Override) column specifies whether a particular device
is eligible to be replicated.
• When set to N, the object is not eligible for replication. This value is
typically used in only two circumstances: 1) if you want to ensure that
replication is not accidentally started for an object (that is not currently
being replicated)—even if option 21=Quick Sync is attempted; and 2) if
you want to stop or prevent replication for a device that is associated with
a controller when a Y appears in the Allow Device Override column.
When initially starting replication for an object, type a Y in the Rep column.
Once this screen is exited, the Object Monitor in iTERA HA will recognize
this Y in its next cycle of object monitoring and will then begin replication of
the objects. If, however, you want the object to be replicated immediately, use
option 21=Quick Sync in the adjacent option field and press Enter.
All components required for running a device being replicated on the primary
node are also automatically replicated to the target node. For example, if a
device with an attached controller is replicated, the controller is also
automatically replicated. If a printer is attached, and the printer isn’t already
defined on the target node, the printer is replicated automatically as well.
When you specify Y in the Replicate Change column for an object, it will be
replicated as soon as the Object Monitor detects it (which may take up to 15
minutes).
Two options for syncing immediately:
• Select option 14=Stop Mirror. This will only stop replication of the
selected line, controller, or device. Those devices will remain on the target
node but will not be monitored for updates.
• Select option 4=Stop Mirror and Dlt Rmt Cfg Objects. This option will
not only stop replication of the selected line, controller, or device but it
will also delete the controller and/or devices on the target.
IMPORTANT
iTERA HA v6.0 only supports Job Scheduler Replication from the
primary to the target system. Bi-directional replication is not
supported in this release. However, entries can be manually copied
from the target to the primary using the Remote Job Schedule
Maintenance screen.
One of the convenient tools within Job Scheduler Replication is the ability to
link a job on the primary with the same job on the target node (the job is
placed in Held status on the target node). This should be done prior to
replicating jobs from the primary to the target in order to prevent sending
duplicate jobs to the target node and for easy identification of jobs.
In Job Scheduler Replication, jobs can be copied both to and from the primary
and target nodes, the job schedule entry can be updated, and more.
A convenient way to create a scheduled job that runs on the target is to first
create it on the primary then replicate it over. This allows you to manage the
scheduled job from the primary and have it replicated to the target. See
“Manage Scheduled Target Jobs From the Primary” on page 133 for
additional information.
NOTE
This program takes advantage of the advanced, feature-rich
capabilities of System Architecture. There are several additional
option and function keys available but not visible on the screens. To
display them, select menu option 50.2 (or execute the command
E2EXPLVL), set the experience level to 4, sign off, then sign back
on. The recommended level for general use is 2. These features are
documented in the appendices of the iTERA HA Reference Guide.
3. Enter menu option 5.5 on all nodes the first time Job Scheduler
Replication is started in order to enable iTERA HA to automatically adjust
an audit level (to *CHANGE). This lets QUADJRN know about changes
to the job scheduler.
4. On the primary, select F16=Map to display the Job Schedule Mapping
Search screen. The Job Schedule Mapping Search screen controls how jobs
are handled.
When option 1 is selected for a map, the Job Schedule Map Maintenance
screen is displayed (see below).
Item           Description
Log Changes    When set to Y, logs an entry for each change made to a job
               and every time a job is run. The usual setting is N.
5. Press Enter to accept any changes then F12 to exit. If changes to the map
were made, do the following:
7. Press F10=More Functions, select option 10 Remote Job Scd, then press
F10=Retrieve Remote to retrieve the list of jobs in the job scheduler on the
target node to the primary node. Select F5 to refresh the screen.
8. If the jobs on the target node are the same as those on the primary (either
manually created or already existing), then you have the choice to either
link the jobs as they are, or remove the jobs from the target system.
IMPORTANT
Failure to do this step correctly will result in duplicate jobs being
created on the target.
a. To link the jobs on the primary to the equivalent jobs on the target, from
the main Job Scheduled Entries screen (5.5) select F16=Map to view
the Job Schedule Mapping Search screen, then select F20=Link All.
This will submit a job that will link jobs on the primary to those on the
target.
b. To remove the jobs from the target: either sign on to the target, execute
WRKJOBSCDE, then delete them, or, on the primary, select
F10=More Functions, opt 10 Remote Job Scd, then select option
9=Remove from Backup for all jobs to be removed (to remove them all,
select option 9 for the first entry, then select F13 to repeat the option
for all subsequent jobs in the list).
9. Select F20=Link All from the Job Schedule Mapping Search screen (5.5
F16) to compare jobs between the primary and target and link
corresponding jobs. This will prevent duplicate jobs from being copied to
the target node when the quick sync is performed. Press F12 to return to
the Job Schedule Entry screen.
10. Select either F21=Quick Sync from the Job Schedule Mapping Search
screen (5.5, F16, F21) to replicate all jobs to the target node, or, to
individually select jobs to sync, select option 21=Quick Sync from the Job
Scheduled Entries screen (5.5, opt 21).
When jobs have been synced, they will be color-coded to indicate the
status of replication:
• RED – An error exists. Usually indicates the job doesn’t exist on that
node.
• GREEN – The Job Schedule Entry has no match on the other node.
• BLUE – Job Schedule Entry is actively being replicated.
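The reason for linking before a quick sync can be seen by comparing the two job lists. The Python sketch below is an illustration only: it assumes jobs are matched by name, which this guide does not state, and the function name `reconcile_jobs` and its result keys are hypothetical.

```python
from typing import Dict, Iterable, List


def reconcile_jobs(primary_jobs: Iterable[str],
                   target_jobs: Iterable[str]) -> Dict[str, List[str]]:
    """Classify job schedule entries by where they exist.

    link           - present on both nodes; linking avoids a duplicate
    copy_to_target - present only on the primary; a quick sync sends these
    target_only    - present only on the target; review these manually
    """
    primary = set(primary_jobs)
    target = set(target_jobs)
    return {
        "link": sorted(primary & target),
        "copy_to_target": sorted(primary - target),
        "target_only": sorted(target - primary),
    }
```

Any job in the `link` group that is synced without first being linked (or removed from the target) is the kind of entry that would end up listed twice on the target.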
If you need to work within the actual Job Scheduler, select option
5=WRKJOBSCDE for a job or select F8=WRKJOBSCDE.
If you need to delete an entry from the Job Scheduler, do so from within the
Job Scheduler (option 5=WRKJOBSCDE, then option 4=Remove).
NOTE
Option 4=Delete from list used from the main iTERA HA Job
Scheduled Entries screen will only remove a job from being defined
in iTERA HA but will not delete the actual job from the Job
Scheduler.
Another convenient tool is option 3=Copy Jobs, which allows you to create a
new job on the same node by copying an existing job. Simply select option 3
for the job you wish to copy. The following screen is displayed.
1. Create a map on the primary (5.5, F16=Map). Set the map as follows: Add
job=Y, Change job=Y, Delete job=Y, Default status=SCD.
5. Copy the jobs using option 3=Copy Jobs. Leave the name the same as the
original name; this will copy it to the primary system. Press Enter, then
F12 to exit the screen. The job should now be visible in the 5.5 screen in
syncing status.
7. Put the job on hold using option 3. This will put the job on the primary
on hold so that it does not run. Return using F12.
11. Press F5 to refresh the screen. The job will now be listed twice on the
target.
12. Select option 9=Remove from backup on the job that is not linked. Use
F12 to return to the Job Scheduled Entries screen.
7. Press F5 to refresh the screen. The job will now be listed twice on the
target.
2. Entering *YES for the Prompt Command parameter and pressing Enter will
display each duplicated job schedule entry on the Remove Job Schedule
Entry Screen, as follows:
4. If *NO is used for the Prompt Command parameter then the duplicate
entries are automatically removed; the RMVJOBSCDE screen will not be
displayed.
WebSphere MQ Overview
WebSphere MQ is a communication system that provides assured,
asynchronous, once-only delivery of data across a broad range of hardware and
software platforms. It is used to transfer data from one system to another so
that the systems can interoperate and share data.
Checkpoints are created to establish a point of recovery. They are created at the
point when everything in a queue has been stored in a file.
• Journal entries are the most critical part of MQ in a failover or role swap.
WebSphere MQ Replication
All WebSphere MQ replication is performed on the primary node.
One ‘Controlling’ job runs on the target node (ZM_mqdftjrn) and one apply
job runs for each MQ Queue Manager (ZU_AMQAJRx). There are no
additional jobs run on the primary node.
IMPORTANT
Verify that the user profiles QMQM and QMQADMIN exist on the
target node prior to initiating replication. If these profiles do not
exist and MQ is replicated, all authorities are lost.
6. Applies the messages to the target node, then deletes messages as needed.
NOTE
A “Y” in the Jrn Active column indicates journaling is active.
3. Select F13 to assign the default journal. The New MQ Default Journal
screen is displayed. Select F6 to create a new MQ Journal.
6. Select option 1 on the journal you just created. The Default Journal is now
defined. Select option 1 for each Queue Manager to assign the Default
Journal to the Queue Managers.
8. Press F14 to start journaling the IFS. Note the status of journaling is
displayed in the Status field.
9. Press F15 to start replication. Status messages will display during start up.
F15 starts replication of MQ IFS directories and Queue Manager libraries.
11. The subsystem on the target node will display the MQ active jobs.
NOTE
Other active iTERA HA jobs will also be displayed in the iTERA HA
subsystem.
1. Press F17 to end replication. This will turn off IFS and library replication
for MQ directories and will end the jobs on the target node.
4. Press F3=Exit.
The system values that have been replicated can be audited individually by
using option 21=Audit or all replicated system values may be audited by using
F20=Audit All. SYSV_AUDP automatically runs in the Audit Command
Console and audits all replicated system values.
Entries on this screen in red text (default) are not eligible for replication. This
is also indicated in the Allow Repl column.
If Enter is pressed without specifying an output queue, the Work with All
Output Queues screen is displayed.
Selecting option 5 for a queue displays the Work with Output Queue screen,
where you may perform a variety of tasks on the various output queues listed.
System Monitor
The System Monitor displays the status of the iTERA HA mirroring process.
From this screen, use F16 (Mirror Process Monitor) to easily navigate to
additional mirroring information and even manage journaling components.
NOTE
When displaying the System Monitor, it is important to remember
that the data is current as of the last time the node updated the
data via the system monitor job. The system monitor job regularly
refreshes this information every fifteen minutes. To manually
update the data, press F10 from the primary node.
NOTE
Only selected fields are described in the table below. Consult the
Reference Guide for additional content.
% Total Disk Storage Used    See “Troubleshooting High Disk Space” on page 310.
% Total Used By Receivers
IMPORTANT
Press F10=Update Monitor on the primary in order to view the most
current data.
IMPORTANT
Any NO indicators for Local/Remote Journals Active, Apply Jobs
Active, or Network/Subsystem Active fields need to be investigated
and resolved.
3. Enter the desired time for the System Monitor delay interval.
4. End and restart the iTERA HA Subsystems on both nodes for the
change to take effect.
a. E2ENDSBS - primary first, then target.
b. E2STRSBS - target first, then primary.
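The ordering in step 4 is easy to get backwards, so it can help to write it out as an explicit sequence. The Python sketch below only lists the commands in the order given above and does not run anything; the function name `restart_sequence` and the node names in the usage note are hypothetical.

```python
from typing import List, Tuple


def restart_sequence(primary: str, target: str) -> List[Tuple[str, str]]:
    """Return (command, node) pairs in the order given in step 4.

    Subsystems are ended on the primary first, then started again
    beginning with the target.
    """
    return [
        ("E2ENDSBS", primary),  # end: primary first
        ("E2ENDSBS", target),   # end: then target
        ("E2STRSBS", target),   # start: target first
        ("E2STRSBS", primary),  # start: then primary
    ]
```

For nodes named PROD and BACKUP, the sequence is E2ENDSBS on PROD, E2ENDSBS on BACKUP, E2STRSBS on BACKUP, and finally E2STRSBS on PROD.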
• When F6=Work with App is selected the iTERA HA program that is most
likely to be used to resolve the issue is executed. F6 is the same as selecting
option 5 from the main iTERA HA Message Log screen. The program
name is indicated in the “Solution” line toward the bottom of the screen.
• F7=Work with Solution is the same as selecting option 7 from the main
iTERA HA Message Log screen. It executes the solution most likely to be
used to resolve the issue.
“Message Groups” can be defined for sending a message to more than one
message queue. (The default routing for all messages is to the message log
only.)
While many of the tests executed can be run individually, the monitor provides
a simple, streamlined, automated process from which to manage them. It also
ensures that no tests are skipped or forgotten.
The tests executed by the Role Swap Readiness Monitor are summarized in the
table below (listed alphabetically). For issue resolution information, see “Role
Swap Readiness Monitor Issue Resolution” on page 285. Additional
information on the monitor is located in the iTERA HA v6.0 Reference
Guide.
Data Apply Jobs (APPLYDATA)
    Checks the apply jobs for the data journals to ensure they are active
    on the target machine.
    Displays the Mirroring Process Monitor (1.1, F4). Check the results by
    viewing the Remote Journal screen (F7 twice).

Apply job for the IFS transport journal (APPLYIFS)
    Checks the apply jobs used for IFS Replication to ensure they are
    active on the target machine.

Apply job for the MQ transport journal (APPLYMQ)
    Checks the apply jobs used for MQSeries Replication to ensure they are
    active on the target machine.

Apply job for the transport journal (APPLYTRAN)
    Checks the apply jobs for the transport journals to ensure they are
    active on the target machine.

Audit Monitor (AUDIT_MON)
    Displays the status of the Audit Command Console. All audits must have
    successfully run within their designated run intervals in order to
    indicate an OK status.
    Displays the Audit Command Console.

DDM Test (DDM)
    The test creates and updates a DDM file to the target machine and then
    creates a command to send one from the target to the primary.
    Displays the DDM Commands menu.

Encryption Test (ENCRYPTION)
    The encryption test checks to ensure the licensed programs needed for
    encryption exist on the machine. Encryption programs are needed for
    User Profile Replication.
    Displays the Work with License Information screen.

FTP Test (FTP)
    The FTP test creates a file, FTPs it to the target node, then issues a
    command to do the same from the target node back to the primary.
    Displays the Configure TCP/IP FTP screen. Additional information on
    this IBM screen is accessible via the F1 key.

Heal Test (HEAL)
    Makes sure that there is not a backlog of heal records that have not
    been applied.
    Option 1 for this test is only valid on the target node and displays
    the Heal Status Summary Search screen.

Integrated File System Test (IFS)
    The Integrated File System data can be replicated. They are copied
    using their own journals which are separate from data stored in
    libraries. This test verifies that IFS Replication is current. If the
    number of unprocessed entries exceeds the limit an error status is
    displayed. The warning threshold can be adjusted. This test is only
    available if the component is toggled on in the Replication Options
    screen (30.23).
    N/A

JobQ (JOBQ)
    There will be one JobQ test for each JobQ. Checks that all key jobqs
    and subsystems are active (user-defined as well as Vision
    Solutions-recommended). If the subsystem is not active, the test will
    not run. JobQs are critical in role swaps. They are attached to the
    subsystem so the subsystem must be active.
    Displays a window which allows access to the specified jobq,
    subsystem, or both.

Job Scheduler (JOBSCDE)
    Sent via the transport method. This test is only available if the
    component is toggled on in the Replication Options screen (30.23).
    Displays the Job Scheduler Replication screen.

Library Sync Status (LIBSYNC)
    Checks for any libraries that have had the ‘new’ status cleared, have
    had syncing cancelled, or have had journaling ended on them.
    Displays the Work with Libraries screen (4.11).

System Monitor Thresholds (MONTHR)
    Verifies that all settings in the System Monitor are within the limits
    specified in the System Monitor Alert Limits screen.
    Displays the System Monitor Alert Limits screen (E2MONTHR).

Object Monitor Tests (OBJMON1 through OBJMON3)
    OBJMON1 will display in error if the MONTHR test is in error. OBJMON2
    and OBJMON3 tests verify that the jobs are running and not falling
    behind.
    OBJMON1 displays the System Object Monitor. OBJMON2 and OBJMON3
    display the Object Audit Level Definition screen.

Object Sync Status (OBJSNCSTS)
    The test checks for objects that are not in a normal journal, sync, or
    omit status in the 4.21 screen.
    Invokes the Mirrored Object Maintenance program (4.21).

Object Sync Requests (OSR)
    Checks for any objects currently awaiting synchronization.
    Displays the Objects Requesting Sync screen (1.1, F6) in the context of
    primary or target. If viewing the target node data, press F7 twice.

Record Count Audit Test (RCDCNT)
    Checks objects with record or member count differences reported in the
    1.21 screen. This test runs only on the target node. If it fails, you
    must resolve the issue from the target node.
    When option 1 is selected from the target node, the Record Count Audit
    Maintenance screen is displayed.

Remote Journal Status (RJSTS)
    Checks the status of remote journaling.
    Displays the Remote Journal Maintenance screen (3.3).

Spool Files *OUTQs and reports can be replicated Displays the Output
using this process. The spool file Queue Search screen.
replication uses the transport method.
Reports sent between boxes running
SPLF V5R4 is different than from older
systems.
This test is only available if the
component is toggled on in the
Replication Options screen (30.23).
Remote SQL Test Makes sure that the remote SQL process Displays the
SQL works and is functioning as desired. Configure TCP/IP
screen.
System Values The system values are evaluated to ensure Executes IBM’s
SYSVAL they are set correctly. WRKSYSVAL
command.
User Profiles This test is only available if the Displays the User
USRPRF component is toggled on in the Profile Search screen.
Replication Options screen (30.23).
The Role Swap Readiness Monitor tests are initiated via the xx_RSRMON job
on all nodes. The monitor on each node will display all tests appropriate to the
machine and the status of each test.
The Summary Status field at the top of the screen will indicate OK if all tests
have completed successfully within the past thirty minutes.
• The WRN status (warning) is displayed if at least one test has issues that
should be investigated but it may not be critical to resolve prior to
performing a role swap.
• The ERR status is displayed if at least one job is in error status. Also, if a
mandatory test is in ERR status, it will be displayed in red. Errors should
be investigated and resolved prior to performing a role swap.
Tests that have not finished processing are indicated by two plus signs (++)
or two asterisks (**) in the Status column. Two dashes are displayed for tests
that do not run on that node. When a test status is displayed in reverse image
(highlighted) in the Status column (for both Local and Remote nodes), the test
has not run within the past thirty minutes and the results may be invalid.
To submit all tests (for both Local and Remote nodes), press F18. To run a
single test interactively, select option 7. If a test is run locally on the primary it
Press F5 periodically to update the display until all results are reported.
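The way the Summary Status field is derived from the individual tests can be sketched in a few lines of Python. This is only an illustration of the rules described above, not the product's implementation: the function name, the tuple layout, and the "STALE" label are hypothetical, and the assumption that ERR takes precedence over WRN is ours (the guide does not state how the two combine).

```python
from datetime import datetime, timedelta

# "within the past thirty minutes", as described in the text
FRESH_WINDOW = timedelta(minutes=30)


def summary_status(tests, now):
    """Summarize test results.

    tests: list of (name, status, last_run) tuples, where status is
    "OK", "WRN", or "ERR" and last_run is a datetime.
    """
    statuses = [status for _, status, _ in tests]
    if "ERR" in statuses:          # assumed: errors outrank warnings
        return "ERR"
    if "WRN" in statuses:
        return "WRN"
    # OK requires every test to have completed within the window.
    stale = [name for name, _, last_run in tests
             if now - last_run > FRESH_WINDOW]
    # "STALE" is a hypothetical label, not a status shown by the monitor.
    return "STALE" if stale else "OK"
```

For example, a list in which every test is OK but one last ran 45 minutes ago would not summarize as OK under this sketch.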
Status Displayed    Test Outcome
OK                  Test successful
The previous section introduced the audits that are critical to system upkeep.
In this section we discuss the Monitoring Checklist, which is a very
important system procedure that should be executed at least once daily. Since
the goal of high availability software is to ensure that there is an exact
duplicate of all objects selected for replication residing on the target node at
all times, audits must be executed and the system must be monitored on a
regular basis to ensure this occurs. Compromise on regular execution of
either auditing or monitoring and you will be compromising the reliability
of the software to keep objects in perfect sync.
The items on the Monitoring Checklist are divided into three sections:
Daily, Weekly, and Monthly activities. Run through the checklist at
approximately the same time each day to help ensure consistency, since
system resources can fluctuate throughout the day (for example, “Disk Space
Used” will be much higher during day-end processing than at other times).
The checklist provides a brief description about each step and space to record
the results. Recording the results for a period of time and periodically
referring to them will provide a baseline for comparison of the results.
NOTE
Some checks can be performed from any node, while others must be
performed specifically on the node indicated. An “N/A” will be
indicated in the PRI or TGT column if a check cannot or should not
be performed on that node. If more than one target node is defined
in your CRG, perform the checks on all target nodes, where
applicable.
Access the System Monitor (1.1, all nodes). Check the iTERA HA subsystems on all nodes (F7) ❏ ❏
and ensure they are active, that all the normal jobs are running, and that no jobs have a status of
MSGW. Press F12 to return to the main System Monitor screen. (Refer to the section “iTERA HA
Subsystems” on page 18 for a description of the jobs that should be running.)
On the primary, press F10 to update the System Monitor. Check for current values in the Last ❏ N/A
Update Time fields (for both primary and target). A value displayed in red indicates that at least
twice the preset update time interval has elapsed since the last System Monitor update. If a value
does appear in red, press F10 again. The value should update (and the color should change to
green). If it does not update correctly, verify that the subsystems are active (F7). If not, restart the
subsystems (on all nodes), then return to the System Monitor and press F10 to update (you may
need to update the monitor twice if statuses are not corrected with the first update).
Check disk space usage in the % Total Disk Storage Used fields to make sure you have sufficient ❏
unused disk space and that journal receivers are not using excessive amounts of disk space.
The % Total Used by Receivers typically runs in the 3-5% range, depending on the system and how ❏
long receivers are retained. If greater than that, it indicates that the apply jobs aren’t running
correctly and receivers aren’t being deleted. (Note that you are only checking the primary node, but
you should record the values for both primary and target node usage.)
Verify that all three Local/Remote Journals Active fields display Yes statuses. ❏
If any display *NO, press F16 to access the Mirroring Process Monitor screens on both the primary
and target nodes to view additional information and to access the screens where you can manage
journaling components.
If remote journaling is inactive, then activate it.
Verify that Apply jobs are active. The Apply Jobs Active field should display Yes. If this displays No, ❏
press F16 to access the Mirroring Process Monitor screen on the primary node and check the Apply
Job Status field. To manage the apply job, select 3.4 on the target node.
Verify that the Network/Subsystem Active field (underneath the Backup 1 node column) displays Yes ❏
in both positions. These values indicate whether the systems are able to communicate with each
other.
If the first position value is *NO, do the following:
1. Restart the DDM servers (30.5 on all nodes) then return to the System Monitor and press F10
to update (may need to update twice).
2. Check the status of the xx_RMTCMD job in the iTERA HA subsystem. If the status is MSGW,
answer the message (option 7). If in HLD status, release the job (option 6).
3. Perform the Ping, FTP, DDM check (30.7) on the primary node. Review the E2MSGLOG for
results of the test.
If the second position value is *NO, then it indicates FTP is not active. Perform the Ping, FTP and
DDM Test (30.7) on all nodes, to all nodes.
If these solutions do not work, contact CustomerCare for assistance.
Check Journal Entries Not Applied. If this value is larger than usual, press F16 to access the ❏
Mirroring Process Monitor to determine whether the entries are exposed or pending apply. Exposed
entries are related to communications or to a job generating more transactions than can be handled
by the communication lines. Contact CustomerCare for assistance with fine-tuning this.
Review Objects Requesting Sync on the primary node (1.1, F6) to see if any files are waiting to be ❏
resynchronized. The number of objects requesting resynchronization will appear in the Objects
Requesting Sync field.
If there are excessive objects requesting sync, then press F16=Submit Extra Network Sync as needed
to submit additional sync job(s).
Normally, the system will automatically resynchronize necessary objects. However, you should
verify the following:
1. Review the reason for the objects requesting sync.
2. Review the status; verify that the system is in the process of resyncing.
3. If an object is not resyncing, then on the primary, review the size of the object and verify that it
is smaller than the Max Network Sync size. If not, you will need to either increase the Network
Max Sync Size (F19) or perform a tape sync (F20).
Review the Object Sync History Log (1.1 F6, F17 on the target). Review the history of all objects ❏
requesting resync.
Once issues have been resolved, on the Object Sync History Log screen clear the log by pressing
F20=Clear Log.
Review iTERA HA system messages on all nodes (1.1 F11 or E2MSGLOG). Review messages and ❏ ❏
resolve issues, as needed.
Once messages are resolved (on all nodes), clear the message logs (F16=Clear Info Messages). This
will move all messages to the history log (F18=History). Clear the History Log, as needed, using
F16=Clear Info Messages.
Record Count Check - On the target node, select option 1 on RCDCNT test in the Role Swap ❏
Readiness Monitor or 1.22 to display the Record Count Audit Maintenance screen.
Set the Filtering I/O field to “O”, enter an asterisk “*” in each of the Alloc State and RSync Y/N
fields, then press Enter.
The Allocated Status column displays for each object whether the object could be allocated.
Objects with Y indicate that the record count audit was able to obtain an exclusive lock on the
object, indicating that no other process was changing, deleting, or adding records to the object.
Therefore, the count obtained was accurate. Objects with N indicate that another process has the
object allocated, meaning that adds, deletes, and changes may be occurring. Therefore, the record
count may or may not be accurate. Errors indicated on these objects may or may not be true errors.
The objects should be reaudited.
A blank in the Audit Error column will omit any records that completed normally and will display
only those records that had an error.
The Audit Error column displays for each object whether a record count error was found. If the
field is blank, there was no error for the object. The accuracy of the error is dependent on whether
or not an exclusive lock was obtained. If the Alloc State field is N, then the errors may not be
correct and a reaudit of the object is recommended.
Error types are as follows:
• E – Indicates a count error. The variance between the objects is shown in the Audit Variance
column.
• M – Indicates a member audit error (a disparity was found in the number of object members).
The variance is shown in the Audit Variance column.
• I – Incomplete IBM API info error when the audit was done.
• F – Error in record format.
• H – Indicates that the object is missing records. It should either be resynced or a CRC audit
should be run over the object which will initiate the Heal process.
• + – Data Queue has more entries on the primary node.
• - – Data Queue has more entries on the target node.
• [blank] – Indicates no errors.
For items that display errors, perform the following:
1. Select option 8=Audit (may select several items with errors simultaneously) to re-audit. For
errors with member differences, use option 9=Member Audit, then option 6=Resync.
2. Wait a few moments then press F5 to refresh the screen.
3. If the same objects reappear, select option 6 to resync the objects.
Clear the Record Count Audit screen (F23).
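As a quick reference, the error types and the first follow-up option listed above can be captured in a small lookup. This is a minimal sketch for note-taking or scripting around the checklist; the dictionary and function names are hypothetical and nothing here calls iTERA HA itself.

```python
# Error codes as listed for the Audit Error column.
AUDIT_ERROR_TYPES = {
    "E": "count error (see Audit Variance column)",
    "M": "member audit error (see Audit Variance column)",
    "I": "incomplete IBM API info when the audit was done",
    "F": "error in record format",
    "H": "object is missing records",
    "+": "data queue has more entries on the primary node",
    "-": "data queue has more entries on the target node",
    "": "no errors",
}


def first_action(audit_error):
    """Return the first option to use for an Audit Error value."""
    code = audit_error.strip()
    if code not in AUDIT_ERROR_TYPES:
        raise ValueError(f"unknown audit error code: {audit_error!r}")
    if code == "":
        return "none"
    if code == "M":
        # Member differences: option 9, then option 6.
        return "9=Member Audit, then 6=Resync"
    # All other errors: re-audit first; resync (option 6) if they reappear.
    return "8=Audit"
```

Remember that when the Alloc State is N the reported error may not be accurate, so a re-audit is recommended before resyncing.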
Review the Object Attribute Audit results on the target node (1.23 and E2MSGLOG) ❏
Messages generated by the Object Attribute Audit will be displayed in the iTERA HA Message log,
which you should have reviewed earlier. If not, review them now.
On the target node, select 1.23, then select option 5 for the most recent audit (top of the list),
continue to select option 5 to drill successively down into the object level. Manually resubmit the
audit for any objects with attribute discrepancies.
The IFSMON information in the lower portion of the System Object Monitor screen (1.4, ❏
primary) provides details on how well the xx_IFSMON job is processing requests. It is especially
helpful for those doing a significant amount of object-level replication by helping to determine
whether additional jobs should be run.
The Process Difference field displays the difference between the numbers in the Previous to Process
and Remain to Process fields. If the number is negative, more requests have been processed than
there are waiting to be processed. If positive, the xx_IFSMON job is not keeping up with the
number of requests. You may need to adjust the timeslice and/or priority for the job.
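The arithmetic behind the Process Difference field can be illustrated as follows. The guide does not state the order of subtraction, so this sketch assumes the value is Remain to Process minus Previous to Process, which is the convention under which a positive result means the backlog is growing. The function names and the "steady" label for a zero difference are hypothetical.

```python
def process_difference(previous_to_process, remain_to_process):
    # Assumed sign convention: current backlog minus previous backlog.
    return remain_to_process - previous_to_process


def interpret(difference):
    if difference > 0:
        return "falling behind"   # xx_IFSMON is not keeping up
    if difference < 0:
        return "catching up"      # more processed than waiting
    return "steady"               # not described in the guide; assumed
```

For instance, 500 requests previously and 200 remaining gives a difference of -300, which this sketch reads as the job working through its backlog.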
IFS Audit Review (If you replicate IFS, perform this monitoring check.) ❏ ❏
• Access 1.8 on the primary. If the IFS_AUDIT test is beyond the recommended run time interval,
manually execute it using option 8=Submit.
• Access 1.8 on the target. Select option 1 on IFS_REVIEW to view the Display IFS Audit
History screen.
View the audit details by selecting option 1 in conjunction with the appropriate Fkey. In the
column to the right, indicate the numbers for the following:
– Authority Failures (opt 1 w/F8 [Enter], then opt 5 on any entry listed)
– Attribute Failures (opt 1 w/F9 [Enter], then opt 5 on any entry listed)
– Data Failures (opt 1 w/F10 [Enter], then opt 5 on any entry listed)
NOTE
If replicating QDLS, then the first time option 1 and the appropriate Fkey is selected,
leave the Skip QDLS Paths field blank. The QDLS paths are automatically built. These
paths establish the necessary links to be able to view the QDLS objects within the audits.
It typically takes longer to compile the audit data when building these paths. Once the
QDLS path information has been built, a Y will automatically be placed in the field and
subsequent use of option 1 will skip the build of the QDLS object paths. If you need to
rebuild the QDLS path information, change the Y to N or blank. If not replicating
QDLS, enter Y in the Skip QDLS Paths field.
In Non-Mirrored Object Replication Check (4.30, primary), verify that all objects that need to be ❏
replicated are being replicated. The Last Sync Date/Time data should be current. (This may be
scheduled in a job scheduler to run daily.)
Check Data Queues (1.50.1, primary). If not empty, check E2MSGLOG (not necessary if running ❏
V5R4 or higher).
Check for New Libraries (1.50.4; primary) (This feature is also available from 4.11, F7, F7.) ❏ N/A
The status column will display “New” for any libraries that have not been assigned to a journal or
that have been added after iTERA HA was installed. This status prompts you to determine whether
you want to select the library for mirroring. Typically, just use option 21=Quick NetSync libraries. If
you don’t want to mirror the library and want the “New” status removed, use option 16 next to the
desired library and press the Enter key (or press the F21 key to remove the new status from all
libraries). They will no longer be displayed in the new libraries list.
Check Syncing Libraries for Cross-Library Dependencies (1.50.4, F6; primary) ❏ N/A
Check User Profile Replication (1.50.5.1, primary). Press F16=Dft Map and verify the map is ❏
correct. If modifications are needed, make them and then use F21 to quick sync profiles.
Check the User Profile Replication Errors screen (5.1, target) for failures and resolve as necessary. ❏
Check for New IFS Objects or Directories (1.50.5.2, opt 5, primary) ❏ N/A
1.50.5.2 displays the IFS Work with File Systems screen (as does 5.2). Select option 5 for one of the
file systems in order to drill down into the file structure and locate new IFS objects and directories.
Objects added to a directory currently being replicated can automatically inherit journaling, if
specified in the IFS Journaling Defaults (5.2, F7). (This should have been defined when IFS was
initially replicated.)
You must drill down into the various subfolders in order to spot new objects that need to be
replicated. For this reason, it may be extremely helpful to maintain a map of replicated directories
and objects, indicating the location of subdirectories and objects that should not be replicated.
Check Spool File Replication (1.50.5.3; primary). Select F7 twice to review new output queues and ❏
replicate as needed.
Check Configuration Replication (1.50.5.4; primary). Select F21=Config List. Review for new ❏
items and replicate as needed. Review Configuration Replication errors (if needed, the objects can
be rebuilt from 5.4 on the target).
Check Job Scheduler Replication (1.50.5.5; primary). Ensure that everything that should be ❏
replicating is. Errors can be fixed through 5.5 on the target.
Check WebSphere MQ Replication (1.50.5.6; primary). If replicating WebSphere MQ, ensure that ❏
all queue managers are replicating.
Check Directory Entry Replication (1.50.5.7; all nodes). Review the maps and any errors. Review ❏ ❏
errors in 5.7 on the target.
If a new iTERA HA Service Pack has been released, load and apply the PTFs. ❏ ❏
• Load and apply PTFs for XP (Cross Product) using menu option 10.46 (specifies product ID
7PA2K02; release V4R3M0).
• Load and apply PTFs for iTERA HA using menu option 10.45 (specifies product ID 7PA2K05;
release V6R0M0).
• If using iTERA Alert, load and apply PTFs using menu option 10.47 (specifies product ID
7PA2K25; release V6R0M0).
NOTE
iTERA HA PTF Service Packs are released on a regular basis. E-mail notification is sent to
iTERA HA customers when they become available. (Notify CustomerCare if you are not
receiving these e-mails.)
Load and apply Cross Product PTFs prior to those for iTERA HA.
Verify E2IFSPRGA runs regularly in the job scheduler to purge audit data.
Verify that the non-mirrored library sync command (SYNCNMLIB) is scheduled to run regularly.
Troubleshooting
This filter will display all objects for which journaling is not active. An object
that displays an “X” in the JRN column (the first filter column) indicates that it
has been manually omitted and you may disregard it. Any object that appears
with a value other than X needs to be investigated in order to determine why it
is not being journaled.
This filter will display all objects that are not being synchronized by the system.
For any objects displayed, verify that they should not be synced (i.e., that they
were omitted from synchronization on purpose).
This filter will show all objects that are marked to be omitted. For any objects
displayed, verify that the objects should be omitted. (You may want to set up
an audit exclusion filter to exclude them from the audit so that the objects
don’t appear in this list on a daily basis. To do so from this screen, press F9,
then F6.)
The Audit Command Console also functions as a safety net. Each audit in
the console has a setting that controls the maximum amount of time that
may pass before it is automatically initiated. Once the audit has completed
(whether initiated by the console or from a scheduled job), the results are
reported back to the console. If, for some reason, the audit does not run
within the recommended interval, the status indicator will flash.
If an audit does not run within the maximum allowed run time, an alert is
generated by the console and, depending on the audit and the circumstances
or severity of the alert, is sent either to a message queue, iTERA Alert, or to a
designated messaging application. If sent to iTERA Alert (a separate message
queue monitoring application), a system operator can be notified by e-mail,
text message, message queue, or all three. iTERA Alert documentation is
located in the iTERA HA v6.0 Advanced Features Guide.
Alert intervals are user-defined but may not exceed the maximum allowable
time in which an audit must run. For example, an audit may be set to issue a
standard alert if it has not run in 24 hours and a severe alert if the audit has
not run in 48 hours. The user may lower the time interval in order to be
alerted sooner rather than later, but may not extend the interval beyond the
maximum.
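The alert rules in the preceding paragraph can be sketched as two small functions. This is an illustration under stated assumptions, not the console's logic: the function names are hypothetical, and the 24-hour and 48-hour values are simply the example figures from the text.

```python
def validate_alert_interval(requested_hours, maximum_hours):
    """Alert intervals may be lowered but not extended past the maximum."""
    if requested_hours > maximum_hours:
        raise ValueError("alert interval may not exceed the maximum")
    return requested_hours


def alert_level(hours_since_run, standard_after=24, severe_after=48):
    """Return "severe", "standard", or None for an audit's idle time."""
    if hours_since_run >= severe_after:
        return "severe"
    if hours_since_run >= standard_after:
        return "standard"
    return None
```

With the example values, an audit that last ran 30 hours ago would raise a standard alert, and one that last ran 50 hours ago a severe alert.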
The AUDIT_MON test in the Role Swap Readiness Monitor (1.7) checks
the Audit Command Console to verify whether it is active and whether all
audits have run within their defined intervals. The results of the
AUDIT_MON test are then reported to the System Monitor (1.1). If any
audit is past the Audit Interval, an “ERR” status is displayed. Any warnings
or errors for the Audit Command Console test should be investigated and
resolved promptly. Once issues are resolved within the Audit Command
Console and the test within the Role Swap Readiness Monitor has been
reprocessed, the status for the test will indicate OK for both nodes.
– Menu 1.8
– Menu 6
– 1.7, option 1 on AUDIT_MON
Upon first entering the console, the audits are displayed in a hierarchical order
in relation to the alert priority, but the Auto Audit feature will be turned off.
The initial screen of the Audit Command Console will display all audits
applicable to the node.
NOTE
The audits displayed in the console will vary, depending on
replication options. The default view is to display only active
(applicable) audits (i.e., audits for current replication options that
have been activated). If needed, the entire list of audits is available via
F8=List All Audits.
The following table describes the principal features of the main Audit
Command Console screen.
NOTE
The following table does not contain descriptions for all options,
fields, and functions available on this screen; consult the iTERA HA
v6.0 Reference Guide for additional details.
Item Description
Auto Audit When Auto Audit is set to *ON, audits in the console will be
executed based on the Audit Interval, which can be defined in the
option 2 screen for the audit. (Audits that have exceeded the
specified run interval display in red in the Current column.) The
auto audit feature should remain *OFF until initial syncing is
complete and all audits have been assigned a preferred run time
and have run at least once.
Sort The sort order can be switched between Priority and Audit
(alphabetically by audit name) using F9. Priority is calculated
based on the Sts (status) column (jobs with errors appear at the
top, followed by jobs currently running), then by the Next Audit
Submission column data. Sorting by Audit sorts the list
alphabetically by audit name.
List Press F8 to toggle between Active and All Audits. The default
view is Active. In the All Audits view, audits for components not
being replicated or audits not applicable to that node are
displayed in red text.
1=Work with App Invokes the program that will most likely be used to resolve an
issue. Where option 1 is not valid for an audit, a message will be
displayed at the bottom of the screen.
2=Change Run status results, frequency settings, and scheduling options are
accessed via option 2. This screen is described in more detail in
“Option 2=Change” on page 178.
8=Submit Sends the audit to the job queue for processing. The job will use
the parameters defined in the option 2=Change screen.
Audit Audits applicable to the current node are displayed in green text
(default). When F8=List All Audits is pressed, all audits are
displayed. The audits not applicable to the current node and
audits for components not being replicated are displayed in red.
Audits with names ending in P run only on the primary; most
audits with names ending in T are for viewing the results of
the audit (on the target). (A corresponding
audit with the same name exists on the target node, but is not
applicable to that node and is displayed in red.)
System Role Indicates the system on which the audit runs, e.g., Source, Target,
or Both. Use F8 to toggle between active audits and all audits.
When viewing active audits, only those audits applicable to the
node you are on are displayed. For example, when on the primary
system, only audits with Source and Both are displayed in the
System Role field.
Audit Interval (Type, Current, Maximum) Each audit has a defined maximum run interval. Audits can be
adjusted to run more frequently than the maximum, but not less.
If the audit exceeds the defined run interval, it will be displayed in
red text. The audit interval can be adjusted using option 2.
• Type - Correlates with the value in the Maximum field. For
example, if 24 is specified in the Maximum field and HRS is
specified in Type, the audit will run daily, every twenty-four
hours.
• Current - Possible values include:
– Zero (0) in the default text color if the audit has run within
the number of days or hours indicated in the
Maximum/Type fields. Will display in red text if the audit is
currently running.
– Displays in red text the number of hours or days that the
audit has exceeded its maximum Run Interval.
– Values flashing in red text have exceeded the Alert Interval
specified in option 2.
• Maximum - Correlates with the Run Interval field accessed via
option 2.
Next Audit Submission Indicates the date the audit will next be initiated. The date is
calculated by adding to the date the audit last ran the number of
hours and/or days specified for the Run Interval.
Dashes in the Next Audit Submission field on the target indicate
that the audit does not run on that node, but the audit results are
available by selecting option 1.
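The Next Audit Submission calculation described above amounts to adding the Run Interval to the time the audit last ran. A minimal sketch follows; "HRS" is the Type value mentioned in this guide, while "DAYS" and the function name are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def next_audit_submission(last_run, run_interval, interval_type):
    """Add the Run Interval to the time the audit last ran."""
    if interval_type == "HRS":
        delta = timedelta(hours=run_interval)
    elif interval_type == "DAYS":   # hypothetical label for a day-based Type
        delta = timedelta(days=run_interval)
    else:
        raise ValueError(f"unsupported interval type: {interval_type}")
    return last_run + delta
```

An audit that last ran on September 12 at 02:00 with a 24 HRS interval would, under this sketch, next be submitted on September 13 at 02:00.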
Option 2=Change
Several parameters, such as run status, frequency settings, and scheduling options,
are accessed via option 2.
NOTE
The following table does not contain descriptions for all options,
fields, and functions available on this screen; consult the iTERA HA
v6.0 Reference Guide for additional details.
Item Description
Retain History Indicates the number of days the audit history will be
retained. View audit history using option 5 from the main
console screen.
Submit Window, Execution Days, and Preferred Time The audit will only submit automatically if there is at least
one Y in the Execution Days field and will only execute
during the Submit Window time frame.
Specify a time in Preferred Time if the audit should be
executed at a particular time. It must fall within the Submit
Window and Execution Days in order to be executed.
“0:00 - 0:00” in the Submit Window field (and “0:00” in
Preferred Time) indicates the audit will be executed based on
the Last Run Start time and Run Interval specified.
For all audits except the Object Attribute Audit, the audit
will be submitted in the window and will continue to process
until finished. For the Object Attribute Audit, the audit will
submit and run only during the time length indicated in start
and end time fields. If the audit has not finished processing,
the job is canceled and the next audit run will start with the
library that was being processed when the audit was
canceled. This ensures that all libraries on the system are
audited even when the time allowed to run each day may be
limited. When checking results, you may need to check more
than one audit result entry.
F7=Create Job Displays the IBM Job Schedule Entry screen for the audit
with the minimum recommended run interval settings
pre-populated. Scheduling the audits in the job scheduler is
recommended and helps ensure the audits run precisely at
the same time each interval. While the settings may be
changed, Vision Solutions does not recommend extending
the run frequency beyond the default setting. For example,
you may schedule a job to run more frequently than what has
been programmed in the default, but do not schedule it to
run less frequently.
In order to prevent the audit from being submitted from
both the Audit Command Console and the job scheduler
simultaneously, the scheduled time for the job schedule entry
should be earlier than the time indicated in the Next Sbm
Time field in the option 2 screen.
Do not change the name of the scheduled job. Otherwise,
results for the audit will not be reported back to the console.
F8=View Job Displays the job schedule entry in the IBM Job Scheduler.
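The Submit Window and Execution Days rules from the table above can be sketched as a simple eligibility check. Several details are assumptions made for illustration only: Execution Days is modeled as a seven-character Y/N string starting with Monday, the current day is assumed to need a Y, a window that ends earlier than it starts is treated as spanning midnight, and the function names are hypothetical.

```python
from datetime import datetime, time


def in_submit_window(now, start, end):
    # "0:00 - 0:00" means no window: timing follows Last Run Start
    # and Run Interval instead.
    if start == time(0, 0) and end == time(0, 0):
        return True
    current = now.time()
    if start <= end:
        return start <= current <= end
    # Assumed: a window such as 22:00 - 02:00 spans midnight.
    return current >= start or current <= end


def may_auto_submit(now, execution_days, start, end):
    """execution_days: 7 characters of Y/N, Monday first (assumed layout)."""
    if "Y" not in execution_days:
        return False          # at least one Y is required
    if execution_days[now.weekday()] != "Y":
        return False          # assumed: today must be an execution day
    return in_submit_window(now, start, end)
```

A check like this can help when planning Preferred Time values, since a preferred time outside the Submit Window or on a non-execution day will not be honored.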
IMPORTANT
When scheduling the following audits, ensure there is no run time
overlap. Allow enough time for each audit to complete before the
next one is allowed to start. They should not run concurrent with
each other.
- AUDSTREAM
- OWNER_AUTH
- DBR_AUDIT
- OBJATRAUD
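Before committing start times for these four audits, it can help to check a draft schedule for overlap. The sketch below assumes you supply a start time and an expected run duration for each audit; the durations in the usage example are invented placeholders, not measured or recommended values, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from itertools import combinations


def find_overlaps(schedule):
    """schedule: dict of audit name -> (start datetime, expected duration).

    Returns a list of (name_a, name_b) pairs whose run times overlap.
    """
    overlaps = []
    for (a, (a_start, a_len)), (b, (b_start, b_len)) in combinations(
            sorted(schedule.items()), 2):
        # Two intervals overlap when each starts before the other ends.
        if a_start < b_start + b_len and b_start < a_start + a_len:
            overlaps.append((a, b))
    return overlaps


# Placeholder schedule; substitute run times observed on your own system.
draft = {
    "AUDSTREAM": (datetime(2011, 9, 12, 1, 0), timedelta(hours=2)),
    "OWNER_AUTH": (datetime(2011, 9, 12, 2, 30), timedelta(hours=1)),
    "DBR_AUDIT": (datetime(2011, 9, 12, 4, 0), timedelta(hours=1)),
    "OBJATRAUD": (datetime(2011, 9, 12, 5, 0), timedelta(hours=1)),
}
conflicts = find_overlaps(draft)
```

In this draft, AUDSTREAM would still be running when OWNER_AUTH starts, so the pair is reported and one of the two start times should be moved.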
IMPORTANT
The AUDSTREAM audit runs the following audits:
- CHKOBJMTCH
- JRNOBJJRN
- JRNOBJLST
- RCTCNTALL
- RCTCNTCHG
When the AUDSTREAM audit runs, the status for each of the
individual audits will be updated in the console.
Scheduled Audits
Vision recommends that all audits be scheduled in a job scheduler. However,
the Preferred Time can be used in the option 2 screen. At minimum, schedule
the following in a job scheduler.
Primary node
Audit Description    Audit name as it should appear in the Job Scheduler    Frequency    Schedule Day    Schedule Time
Target node
Audit Description    Audit name as it should appear in the Job Scheduler    Frequency    Schedule Day    Schedule Time
NOTE
Do not attempt to schedule an inapplicable audit (the audit name
will be displayed in red text). Inapplicable audits are only displayed
in the console when the F8=List All Audits key has been pressed.
NOTE
The appendix “iTERA HA Scheduled Audits and Other Jobs” on
page 263 provides a list of all audits in the console. We recommend
that you refer to it when working through the instructions in this
section.
NOTE
The audits displayed will vary, depending on replication options on
your system (e.g., the spool file audit will not display in the list of
active audits if Spool File Replication is not active).
NOTE
Although there are a few exceptions, most audits are scheduled only
on the primary node. Exceptions are noted in the appendix “iTERA
HA Scheduled Audits and Other Jobs” on page 263. Audits that
should not be scheduled on the target are automatically restricted
from being scheduled via the console.
4. The IBM Add Job Schedule Entry window is displayed. Several fields,
such as Job name, Command to run, Frequency, etc., are pre-populated with
the minimum recommended run settings. Verify that the parameters meet
the run criteria for the audit. While some of these settings may be
changed, do not extend the run frequency beyond the default setting for
the scheduled entry. For example, you may schedule a job to run more
frequently than what has been programmed in the default, but do not
schedule it to run less frequently. Additionally, if scheduled to run more
frequently, ensure it does not run concurrent with other audits, as
described previously.
IMPORTANT
Do not change the name of the job schedule entry. In order to have
the job’s run status properly reported back to the Audit Command
Console, the default job name must be used. Failure to keep the
default name will result in the Audit Command Console initiating
the audit independently of the scheduled job, thus having the audit
run more than is necessary and/or conflicting with other audits.
For the Scheduled time field, enter the time so that the audit will have had
sufficient time to complete prior to the time indicated in the Next Sbm
Time field in the option 2 screen of the console (see the highlighted
example in the Audit Command Console screen, below).
5. Press Enter to add the scheduled entry. The display returns to the Audit
Command Console Change screen. If desired, press F8=View Job to view
the scheduled job in the IBM Work with Job Scheduled Entries screen,
then F3 to return to the console.
7. Allow all scheduled audit jobs to run via the job scheduler. (Alternatively,
you may manually execute audits using option 8=Submit.) After the audits
have had a chance to run at least once from the job scheduler, enable the
Audit Command Console using the F11=Start Auto Audit key from the
main Audit Command Console window. The screen below is displayed.
– For any audits that have not already run, specify a time of day for the
remaining audits to start (e.g., when system resources are more
abundant), then press Enter. Those audits will initiate at that time in
conjunction with the specified audit interval.
– To start the console immediately, press F7 to bypass the time interval
restriction. Unless scheduled in a job scheduler, future audits will be
initiated based solely on the specified audit interval and the Next Audit
Submission date and time field will indicate when each audit will run
in the future.
IMPORTANT
The Auto Audit feature must be active on all nodes. If the Auto
Audit feature is off, then audit job status will not be reported back to
the console and the console will not initiate audit jobs for any audit
that may not have run.
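The timing rule in step 4 (the scheduled job should finish before the time shown
in the Next Sbm Time field) can be worked out as a simple calculation. The
following is a minimal sketch in Python and is not part of iTERA HA. The
function name, the estimated run time, and the safety margin are assumptions
for illustration; substitute the run times you observe on your own system.

```python
from datetime import datetime, timedelta

def latest_start(next_sbm_time: datetime,
                 estimated_runtime: timedelta,
                 safety_margin: timedelta = timedelta(minutes=15)) -> datetime:
    """Return the latest job scheduler start time that lets the audit finish,
    with a margin, before the console's Next Sbm Time."""
    if estimated_runtime < timedelta(0) or safety_margin < timedelta(0):
        raise ValueError("durations must not be negative")
    return next_sbm_time - estimated_runtime - safety_margin

# Hypothetical values: Next Sbm Time of 06:00 and an audit that usually
# takes about 90 minutes to complete.
next_sbm = datetime(2011, 9, 12, 6, 0)
print(latest_start(next_sbm, timedelta(minutes=90)).strftime("%H:%M"))  # prints 04:15
```

Any Scheduled time at or before the value returned leaves the audit enough time
to complete, provided the run time estimate holds.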
Audit Detail
The following table contains a brief summary of each audit. For additional
information, consult the iTERA HA v6.0 Reference Guide.
NOTE
If an audit below is not displayed in your systems, it is usually
because the component has not been enabled in the Replication
Options screen (30.23).
ALT_NAMEP: Alternate Name Audit
Ensures that alternate names for files are replicated correctly. Alternate names
are used primarily in SQL applications. The audit submits a job on the primary.
When complete, it then submits a job to the target. Discrepancies are
automatically corrected. The command E2AUDALT can also be used to
perform the audit.
A report of the results of the audit lists any corrections that were made. This
report is available on the target and can be accessed using WRKSPLF on the
iTERA outq. To display the alternate names, use the command DSPFD. A file
may have more than one alternate name.
Results: WRKSPLF on the iTERA outq.
ALVL_AUDIT: Auditing Level
In order to replicate objects correctly, object types must have the correct audit
levels.
The audit reviews all object types and changes the audit level for any object
whose audit level differs from our recommended setting for its object type. The
change date for the objects on the primary will be modified if the audit level is
changed by the audit.
See Object Audit Level Definitions (4.62) in the iTERA HA Reference Guide
for additional details and for a list of audit level definitions that will be checked
by the audit.
Results: Option 1 displays the Object Audit Level Definition screen.
AUDSTREAM: Audit Stream
AUDSTREAM runs the following audits:
• CHKOBJMTCH
• JRNOBJJRN
• JRNOBJLST
• RCDCNTCHG (daily, except Saturday) or RCDCNTALL (Saturday)
If scheduling the AUDSTREAM, do not schedule these individual audits.
When AUDSTREAM runs, the status for each of the individual audits is
updated.
If needed, the source for this program can be found in source file
ITHA/E2.CUST or ITE2/E2.CUST. To customize this program, copy the
source member to source file ITHAxx/E2.CUST or ITE2xx/E2.CUST and
compile the program into ITHAxx or ITE2xx. (If needed, your Vision Solutions
Services Consultant will assist you in modifying the standard programs.)
Do not allow the AUDSTREAM to run while the DBR_AUDIT or the
OWNER_AUTH audits are running.
Results: To verify that the audits ran, check option 5=History, then check for
data in the Record Audit Maintenance screen (1.22) on the target. If 1.22 has
been cleared since the last time the audits ran and there is now more data, it is
a good indication that the audits ran as scheduled. Because it runs several
audits with multiple purposes, option 1 is not valid for the AUDSTREAM.
Option 1 can be used for the individual audits executed by AUDSTREAM.
IMPORTANT
Special configuration is required for the CHKOBJMTCH audit
in environments with multiple target nodes. Contact
CustomerCare for details.
NOTE
The object attribute audit checks for attribute differences.
CST_AUDP: Constraints
This audit is submitted on the primary. It identifies missing constraints and
automatically adds them to the file on the target and updates the target with any
missing or different constraints.
Results: There are no results to display. However, option 1 provides access to
the Work with Constraints screen (4.24).
DBR_AUDIT: Database Relations
This audit submits a job which identifies all physical files being mirrored and
their associated logical files and verifies that all database relations are identical
to those on the primary node.
If the target node does not match the primary node, iTERA HA will
automatically copy missing files to the target node (no other results of the audit
will be reported).
This audit should run at least weekly on the primary node.
Results: When option 1 is selected, the prompted DSPDBR command is
displayed. Enter the appropriate file, library, and output information.
IMPORTANT
Do not allow the DBR_AUDIT audit to run while the
AUDSTREAM or the OWNER_AUTH audits are running.
DEV_AUDP: Device Configuration
Identifies missing configuration objects and attempts to create them on the
target. If unsuccessful, an entry is added to the configuration error screen. This
audit is only available if the replication option is enabled.
Results: On the target, select option 1 on DEV_AUDT or 5.4 screen.
DIRE_AUDP: Directory Entries
Identifies missing directory entries and attempts to create them on the target. If
unsuccessful, then an entry is added to the Directory Entry Errors screen. This
audit is only available if the replication option is enabled.
Results: On the target, select option 1 on DIRE_AUDT or in the 5.7 screen.
EXIST_DEVP: Device Configuration Existence
Checks for devices and controllers that exist on one node but not the other. The
primary purpose is to locate devices and controllers that exist on the target but
not on the primary. This audit is only available if the replication option is
enabled.
Results: On the target, select option 1 on EXIST_DEVT.
EXIST_DIRP: Directory Entry Existence Audits
Checks for directory entries that exist on one node but not the other. The
primary purpose is to locate directory entries that exist on the target but not on
the primary. This audit is only available if the replication option is enabled.
Results: On the target, select option 1 on EXIST_DIRT or 5.7, F19.
EXIST_JBSP: Job Schedule Existence Audit
Checks for job scheduler entries that exist on one node but not the other. The
primary purpose is to locate job scheduler entries that exist on the target but not
on the primary. This audit is only available if the replication option is enabled.
Results: On the target, select option 1 on EXIST_JBST or 5.5, F19.
EXIST_OBJP: Object Existence
This audit checks all libraries (whether being journaled or not) for objects that
exist on one node but not on the other.
However, the primary purpose of the audit is to review libraries not being
replicated to see if any necessary object is missing. There may be objects that
exist in libraries that are not replicated but are needed in order for the
applications to run correctly. These may include *JOBDs, *JOBQs, and *SBSDs.
If desired, objects on the primary but not on the target may be copied by using
option 3=Copy from primary. However, this is the equivalent of a non-mirrored
library object sync, so the object will not be monitored for changes.
Additionally, if a filter is active and the object is copied to the target, it will be
removed when the audit runs. Delete or revise the filter to prevent this from
occurring.
Results: On both the primary and target nodes, select option 1 for the audit.
IMPORTANT
The audit, by design, does not consider filters that may be in
place. One reason for this is that objects missing on the target
could be an indication of an incorrectly defined filter.
NOTE
This audit can take an extensive amount of time to run and
evaluate. Because of this, the default setting for the audit is disabled
from running on both nodes. After initially running the audit,
you may want to disable (hide) it so that indicators will not be
displayed in the console when the maximum audit interval is
reached. However, results (option 1=Work with App) are only
accessible when the audit is enabled (not hidden).
This audit, by design, is hidden. To enable the audit execute the following steps:
1. From within the Audit Console, view all audits by pressing F8=List All
Audits.
2. On the primary node, select option 2=Change for the EXIST_OBJP audit.
3. Press F18=Override.
4. Enter N for the Place audit on hold parameter.
5. Press Enter twice.
6. Repeat these steps for EXIST_OBJT on the target node.
7. On the primary, exit the screen then reenter the Audit Console. Select
option 8=Submit to run the audit. The job AJ_OBJEX is submitted and
runs in the iTERA HA subsystem.
EXIST_OTQP: Output Queue Existence
Checks for output queues that exist on one node but not the other. The primary
purpose is to locate output queues that exist on the target but not on the
primary. This audit is only available if the replication option is enabled.
Results: On the target, select option 1 on EXIST_OTQT.
EXIST_USRP: User Profile Existence
Checks for user profiles that exist on one node but not the other. The primary
purpose is to locate user profiles that exist on the target but not on the primary.
This audit is only available if the replication option is enabled.
Results: On the target, select option 1 on USR_EXSTT.
IFS_AUDIT and IFS_REVIEW: Integrated File System
IFS_AUDIT runs on the primary and executes the E2IFSAUD command. The
IFS_REVIEW audit on the target node provides access to the IFS Audit results.
Select option 1 on the audit to display the IFS Audit History screen. This
screen provides information on how many errors there were on the audit. For
IFS_REVIEW on the target node, select option 1 in conjunction with the
appropriate Fkey which pertains to the failure type for which you want to view
details.
Results: On the target, select option 1 on IFS_REVIEW.
If replicating QDLS, then the first time option 1 and the appropriate Fkey is
selected, leave this field blank. The QDLS paths are automatically built. These
paths establish the necessary links to be able to view the QDLS objects within
the audits. It typically takes longer to compile the audit data when building
these paths.
Once the QDLS path information has been built, a Y will automatically be
placed in the field and subsequent use of option 1 will skip the build of the
QDLS object paths. If you need to rebuild the QDLS path information, change
the Y to N or blank.
If not replicating QDLS, enter a Y to skip the build of QDLS paths. Option 5
will then display the audit details for current, primary, and target values.
This audit is only available if the replication option is enabled.
INSTAL_AUD: Installation Setup Audit
Tests several settings required for correct iTERA HA configuration, such as
system values, data areas, TCP, DDMTCP, and network attributes.
The complete list of tests is documented in the iTERA HA v6.0 Reference
Guide.
You must be current with iTERA HA PTFs to run this audit.
Results: Option 1 on each node displays the Install Audit Log, where
discrepancies can be investigated and resolved.
JOBSCDE_P: Job Scheduler Audit
Identifies missing job scheduler entries and attempts to create them on the
target. If unsuccessful, then an entry is added to the Job Schedule Errors screen.
Results: On the target, select option 1 on JOBSCDE_T or in 5.5.
LF_AUDP: Logical File Attributes
Compares the attributes of logical files on the primary system and changes the
attributes on the target if they are different.
Results: Option 1 displays the i5/OS Programming Development Manager
(PDM) screen. For target node results, use opt 1 on LF_AUDT on the target.
LIB_AUDP: Library Attributes
Compares the library attributes between the primary and target and then
changes the target library to match the primary if necessary. It is executed on
the primary.
Results: On the target, select option 1 on LIB_AUDT.
OBJATRAUD: Object Attributes
Compares attributes for all objects in mirrored libraries between primary and
target. Objects with attribute differences are automatically resynced. This audit
uses the transport journal technology to communicate object information
between source and target systems.
The complete list of attributes audited is documented in the iTERA HA v6.0
Reference Guide.
You must be current with iTERA HA PTFs to run this audit.
Results: 1.23 on target.
Checks to verify that object authority and ownership on the target node is
identical to that on the primary node. If objects on the target node have
differing authority and/or ownership, they are automatically changed to match
the same objects on the primary node.
On the primary, when option 1 is selected for this audit, the prompted
WRKOBJ command is displayed. Enter the appropriate object, library, and
object type information.
These audits test whether the number of records in all mirrored objects on all
nodes match at the same point in time. If they don’t match, iTERA HA will
re-sync the objects (but only if iTERA HA is able to allocate the object). If
iTERA HA is not able to allocate the object, you must decide whether or not to
resync the object.
RCDCNTALL runs on Saturday only and checks all objects.
RCDCNTCHG runs daily, except Saturday, and checks only for objects that
have been changed since the last audit.
SPLF_AUDP: Spool File
Removes extra reports on the target and requests missing reports from the
primary. This job is submitted on the primary, but results are viewed on the
target. This audit is only available if the replication option is enabled.
Results: Option 1 on the SPLF_AUDT audit on the target displays the Work
with Output Queues screen, where you can search for specific output queues.
SRC_AUDP: Source Member Attributes
This audit checks source member attributes. If source files are changed using
PDM and the source type and/or description is changed, then the changes are
not copied to the target. The reason is that no entry is created in the
QAUDJRN. This is a problem that has been reported to IBM. However, IBM
has indicated a fix for this issue is not forthcoming. Due to this system
limitation, it is necessary to compare the source attributes on a regular basis and
change the attributes if different.
Results: Option 1 displays the i5/OS Programming Development Manager
(PDM) screen. For target node results, use opt 1 on SRC_AUDT on the target.
STOR_PRC_P: Stored Procedures and Functions Audit
Stored procedures (functions and procedures) are usually replicated when the
associated file is also replicated, provided the program is ILE. If the program is
OPM or not attached to a file, then the stored procedure will not replicate.
This audit identifies and replicates missing functions and procedures that are
found in replicated libraries and deletes stored procedures that exist on the target
but not on the primary.
The command E2AUDSPRC can be used to perform the audit by one or all
libraries, for functions, procedures, or both, and for a specific function or
procedure. The command can be run in two modes: *REPORT, which
identifies any differences but does not fix them, and *FIX, which fixes the
differences.
You must be current with iTERA HA PTFs to use this audit.
Results: Option 1 on STOR_PRC_T on the target displays the Work with All
Spooled Files screen. Results are only reported if additions, deletions or changes
to stored procedures were performed by the audit.
SYSV_AUDP: System Values Replication
You must be current with iTERA HA PTFs to run this audit. Additionally, the
System Values (*SYSVAL) component must be enabled in the Replication
Options screen (30.23).
Some System Values can be replicated to the target node. This audit verifies that
the system values that are defined to be replicated match. Discrepancies are
automatically corrected.
Select option 21 to audit an individual audit or F20 to audit all.
Results: Results are accessible from 5.8.
TRG_AUDP: Trigger Audit
This audit is submitted on the primary. It identifies missing triggers and
automatically adds them to the file on the target and updates the target with any
missing or different triggers.
Results: Option 1 on the audit on the primary displays the Work with Triggers
(4.23) screen. Option 1 on the TRG_AUDT audit on the target displays the
Trigger Error screen.
USRPRF_AUD: User Profile
Identifies missing users or attribute differences and corrects the users on the
target.
V6R1_AUDIT: OS Upgrade Viability Audit
The i5/OS V6R1 release requires all programs, service programs, and modules
(objects) to be converted before they can be used. Any program, service
program, or module compiled to run on a release prior to V5R1 and where
observability has been removed cannot be converted and therefore cannot be
used. (This issue is similar to the conversion from CISC to RISC systems.)
Depending on how the system value QFRCCVNRST (*SEC-Force conversion
on restore) is set, the objects will not be converted until used, or they will be
converted when they are restored. If the object cannot be converted and the
system value QFRCCVNRST is set from 2 through 7, then the object will not
restore. There will be only a diagnostic message indicating the object could not
restore but no indication as to why it could not restore.
Many vendor applications and user applications will not convert. It is important
that you start auditing your systems well in advance of the upgrade to make sure
you do not have problems.
This audit should be executed on all systems that may be upgraded to V6R1 of
the OS. The audit should run monthly until all objects that cannot be converted
have been addressed and resolved. This audit does not address any programs
stored in the IFS.
Results: See below.
3. Press F10 to submit the audit for all libraries. Once the analysis is
complete (or if the Audit Command Console has already run the audit)
press F7 to view the library summary list.
NOTE
Print capability has been provided (F18, F19, F20) in order to be
able to distribute reports to the various individuals who require
them. Reports can also be printed by library from within the
F7=Summary screen.
4. A “1” in the Search for field for library entries with values in the Problem
Objects field displays the libraries that have objects that cannot be
converted. Review and/or print, as necessary.
Notable information on this screen includes the value in the Can Convert
column. An “N” indicates the object that will not convert. These objects
need to be addressed by the appropriate application developer.
The First Release column indicates the OS version level at which the
objects were compiled. All objects with observability removed must be
compiled at V5R1 or later.
6. Option 1=Select on a library in the Details view will display detail about
the object. Review objects as necessary.
7. Press F12 twice to return to the Conversion Summary screen, then select
option 7=Errors to view Conversion Errors.
8. Run the V6R1 audit regularly until all problems have been resolved.
IMPORTANT
The Audit Command Console will display an error status for the
audit until all conversion errors have been resolved. Consequently,
the Audit Console test in the Role Swap Readiness Monitor will
display an error status.
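The conversion rules described for the V6R1 audit can be summarized in a short
sketch. The Python below only restates the two rules given in the text (an
object compiled for a release prior to V5R1 with observability removed cannot
be converted, and an unconvertible object will not restore when QFRCCVNRST is
set from 2 through 7). It does not call iTERA HA or i5/OS. The function names
and the sample object list are hypothetical.

```python
import re
from typing import Tuple

def release_tuple(release: str) -> Tuple[int, int]:
    """Parse an OS release such as 'V5R1' into (5, 1) so releases can be compared."""
    match = re.fullmatch(r"V(\d+)R(\d+)", release.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized release: {release!r}")
    return int(match.group(1)), int(match.group(2))

def can_convert(first_release: str, observability_removed: bool) -> bool:
    """Rule from the text: compiled prior to V5R1 AND observability removed
    means the object cannot be converted."""
    too_old = release_tuple(first_release) < (5, 1)
    return not (too_old and observability_removed)

def blocked_on_restore(convertible: bool, qfrccvnrst: int) -> bool:
    """Rule from the text: an unconvertible object will not restore when
    QFRCCVNRST is set from 2 through 7. Other cases are not covered here."""
    return (not convertible) and 2 <= qfrccvnrst <= 7

# Hypothetical inventory: (object name, first release, observability removed)
objects = [
    ("PGMA", "V4R5", True),
    ("PGMB", "V4R5", False),
    ("PGMC", "V5R3", True),
]
for name, first_release, removed in objects:
    ok = can_convert(first_release, removed)
    # "Y"/"N" mirrors the Can Convert column described above.
    print(name, "Y" if ok else "N", blocked_on_restore(ok, qfrccvnrst=2))
```

The audit itself remains the authoritative source. The sketch is only meant to
make the interaction between the First Release value, observability, and
QFRCCVNRST easier to follow.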
Other Audits
The following system checks are not part of the Audit Command Console but
should be executed on a weekly basis and prior to a Role Swap.
Overview
This chapter contains an overview of the four types of role swap available in
iTERA HA, detailing the benefits and recommended frequency of each, and
the circumstances in which it is appropriate to use a particular type.
Each time you perform a Role Swap, Virtual Role Swap Test, or Failover,
download the current guide published on Support Central in order to ensure
you have the most up-to-date information available.
Role Swap Type: Virtual Role Swap Test
Attributes:
During the test, the primary system stays active, with no impact on users. Tests
are performed on the backup node.
• Testing can be performed during normal work hours.
• Everything that can be directed to the backup or test system can be tested.
• Records changed during the test are identified using the ZZ-audits and
reversed using the Heal process (except IFS, MQ, and Spool Files).
The following sections provide additional detail about each type of role swap.
During the test, the apply and syncing processes are temporarily suspended on
the target node, even though iTERA HA continues to monitor these processes
on the primary. The updates from the primary continue to accumulate in the
remote journals on the target. The iTERA HA menus on the backup system
indicate you are in virtual mode. Users can execute applications on the backup
node in order to test functionality. Records can be added and deleted from
files. Meanwhile, iTERA HA keeps track of changes made to the systems. The
changes which occur in the virtual environment are not sent to the primary.
While application testing occurs, heal records from the primary are requested
for any objects being changed on the target so that they are in the remote
journal on the backup node and ready to be applied when the virtual test is
ended.
IMPORTANT
Be aware that if files are deleted or members are cleared on the target,
a resync of the deleted/cleared data will occur.
NOTE
Heal is not supported for WebSphere MQ, IFS, and Spool Files.
Additionally, the audits are not currently equipped to identify
changes made to these components during the Virtual Role Swap.
Therefore, these components are not supported in the Virtual Role
Swap and should not be tested. However, if a complete resync of IFS
is practical, it can be planned as part of the process.
IMPORTANT
Keep in mind that since all processing and changes performed in a
Virtual Role Swap will have to be reversed, you must allow sufficient
time for the changes to be rolled back, which takes time. We
recommend you only test a few things with your first one in order to
gauge how much time it will take your system to recover, then
increase the amount of processing with each subsequent test so that
you don’t encounter an unacceptable reversal time.
To execute a Virtual Role Swap, follow the procedure in “Virtual Role Swap
Test and Virtual Role Swap with Communications Test” on page 222 of this
guide.
Role Swap
The role swap process can be executed in order to have the backup node
assume the role of a fully-functional primary node. During the switchover
process, users are redirected to the new primary and normal production is
resumed. When ready to do so, another role swap is performed to return to the
original production node. Some customers prefer to run for only a short period
on their new primary machine and switch back quickly, while others prefer to
run in a swapped condition for an extended period of time.
In our experience, the actual role swap process usually takes less than forty-five
minutes to complete, so user-downtime is greatly minimized.
Prior to performing a role swap, audits and other system checks help ensure the
viability of the backup node assuming the role of the primary node.
1. Verify correct replication and that everything that is needed to run the
system from the new primary is in place and ready to be used.
2. In the event of a failover, a role swap will provide a good idea of what is
required to be fully operational and how long it will take to restore. (A
failover will actually happen faster than a role swap, since in a failover
certain assumptions are made that are not made in a normal role swap.)
3. If you need to take your production node down for maintenance, the
backup assumes the role of a fully-functional primary node, minimizing
user downtime.
NOTE
A replicate node cannot be used in a role swap configuration. If more
than one backup node is defined, the role swap is performed to the
node identified as BACKUP 1. In order to roll to the other backup
node it must first be promoted to BACKUP 1.
To execute a role swap, see “Role Swap Procedure” on page 205 of this guide.
NOTE
Because each environment is unique, you may need to document
additional role swap procedures unique to your environment.
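The node selection rule in the note above (the role swap goes to the node
identified as BACKUP 1, a replicate node cannot take part, and any other backup
must be promoted first) can be modeled as follows. This is an illustrative
Python sketch only, not an iTERA HA interface. The node names, the role labels
other than BACKUP 1, and the assumption that promotion exchanges labels with
the current BACKUP 1 are all hypothetical.

```python
from typing import Dict, Optional

def role_swap_target(nodes: Dict[str, str]) -> Optional[str]:
    """Return the node a role swap would go to: the one labeled BACKUP 1."""
    for name, role in nodes.items():
        if role == "BACKUP 1":
            return name
    return None

def promote_to_backup_1(nodes: Dict[str, str], name: str) -> Dict[str, str]:
    """Return a new mapping in which `name` is BACKUP 1.

    Assumption for illustration: the previous BACKUP 1 takes over the
    promoted node's old label. Only backup nodes can be promoted, so a
    replicate node is rejected.
    """
    role = nodes[name]
    if not role.startswith("BACKUP"):
        raise ValueError(f"{name} has role {role!r} and cannot be promoted")
    updated = dict(nodes)
    for other, other_role in nodes.items():
        if other_role == "BACKUP 1":
            updated[other] = role
    updated[name] = "BACKUP 1"
    return updated

# Hypothetical four-node configuration
nodes = {"SYSA": "PRIMARY", "SYSB": "BACKUP 1",
         "SYSC": "BACKUP 2", "SYSD": "REPLICATE"}
print(role_swap_target(nodes))                               # prints SYSB
print(role_swap_target(promote_to_backup_1(nodes, "SYSC")))  # prints SYSC
```

Promotion itself is performed in iTERA HA. The sketch only shows why the
promotion must happen before the role swap is started.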
Failover
IMPORTANT
A failover should only be executed in the event of failure of the
primary system.
IMPORTANT
If you have experienced an actual failure on your primary node call
Vision Solutions CustomerCare immediately. As part of your
support agreement, Vision Solutions will help you work through a
failover situation.
IMPORTANT
By providing the failover procedure, Vision Solutions does not imply,
recommend, nor suggest that you perform a failover on your own.
To guide you through each role swap process from start to finish, detailed
instructions are included in this guide.
IMPORTANT
Prior to performing any type of role swap, always download a new
copy of this guide from Support Central to ensure you have the latest
version.
(Comparison table columns: Virtual Role Swap; Virtual Role Swap with
Communications Test; Live Role Swap; Failover. First row label: Tests)
IMPORTANT
The Virtual Role Swap Test and Virtual Role Swap with
Communications Test can help prepare your systems for live role
swap and/or failover events. These tests do not replace the need to
perform a live role swap regularly, but can help you to identify many
potential issues that would prevent a role swap or failover from being
executed successfully.
Two-node CRG
Role swap configuration for a standard two-node CRG is depicted in the
following example. The systems can be co-located or in separate data centers.
Three-node CRG
A role swap configuration for a three-node CRG consisting of a source and
target node located in the same data center and a target node located in a
separate off-site location is depicted in the following diagram.
The normal production mode in this example shows the source (upper left)
replicating to target 1 (lower left) and target 2 (upper right).
In high availability mode, a role swap has occurred where target 1 is now the
source and is replicating both to the former source (now target 1) and target 2
off-site.
IMPORTANT
Vision Solutions recommends that you review this complete
document a week prior to performing a role swap in order to refresh
your knowledge, better understand the process, and ensure that you
are aware of, and account for, the unique equipment and processes
within your environment.
IMPORTANT
Document separately any additional steps that you must perform for
your particular environment.
When preparing to perform your first role swap, Vision recommends that you
schedule a System Health Check through our Professional Services department
(this is a billable service). While optional, the System Health Check will help
ensure that iTERA HA is correctly configured and is ready for the role swap.
The System Health Check includes additional role swap training, if needed.
NOTE
If performing a migration, extensive testing of your backup node is
required. Consult with the Professional Services department for
pricing and additional details.
Section One: Role Swap Pre-Planning Procedure (checklist columns: Assigned, Done)
1. Ensure the Monitoring Checklist is run regularly each day up through the day of the
role swap. Correct any issues that exist. Do the steps in the weekly section 24 to 48
hours before the role swap.
2. Ensure audits in the Audit Command Console are running regularly and that all
audits have been run at least 24 hours prior to the role swap. Correct any issues
before proceeding with the role swap.
3. Review the list of all non-IP hardware and update the plan for switching them during
the role swap. If not already done, develop a written plan for handling all non-IP
devices during the role swap. Otherwise, review the list and verify that the actions are
still correct.
4. Review Device Configuration replication to verify that all necessary devices have
been replicated. This could include remote printers, handheld scanners, or other
devices that require special device configuration. Review the device configuration
replication screen (5.4).
5. Verify that you have an alternate sign on method and test it prior to performing the
role swap. Vision strongly recommends that a group of display devices be attached to
QCTL or *BASE. Remember that iTERA HA will be ending the takeover IP address
and you may be ending your version of QINTER. If devices have not been defined to
QCTL or *BASE the console may be used to sign on after the role swap.
6. Using option 30.22 on all systems, review all IP addresses. List what they are used for
and how they will be handled during a role swap. This is especially important for all
user IP addresses listed for the Primary and Backup 1 nodes.
NOTE
If you do not want your Takeover IP address to end on the primary you must
change the IP Mapping in 30.22 on the primary.
7. Document the processes you will use to end and start user jobs. Vision recommends
that you update the E2USRDWN and E2USRUP programs to perform these
functions for you. Copy the E2.CUST file from ITHA (or ITE2) to ITHAxx (or
ITE2xx). Modify and recompile the E2USRDWN and/or E2USRUP CL programs
in the CRG library. E2USRDWN and/or E2USRUP may be different between
nodes.
NOTE
If you are replicating WebSphere MQ, include in your plan the processes for
ending the MQ Queue Managers and starting the MQ Subsystem.
8. Develop and document your test plan. Indicate all testers and what they will be
testing. Review and revise it, as needed.
9. Use the command WRKACTJOB to review and document the user jobs and
processes running on the primary system. Print the screens for post-role swap
comparison. After the role swap process, you will be checking the subsystems on the
new primary to ensure that the requisite jobs are running.
10. Ensure that the iTERA HA software is on the current PTF release level. If not
current, make sure all iTERA HA and Cross Product PTFs have been loaded and
applied the week before the role swap. (Also load and apply iTERA Alert PTFs, if
using iTERA Alert.)
IMPORTANT
Do not load PTF updates during the week of the role swap (unless you are told
by CustomerCare that you need to fix a specific issue).
11. Review the preferred states for all iTERA-managed triggers on all nodes (4.23).
Generally, triggers should have the following:
13. Check the iTERA HA subsystem on all nodes to ensure they will run enough jobs.
The default number of jobs in the iTERA subsystem is 100 (50 for the E2SYSJOBQ
and 50 for E2JOBQ). A large number of journals will increase the need for
additional jobs.
Verify that the Maximum Active Jobs on the primary is set the same as (or higher
than) on the current backup.
a. Execute the following command on both the primary and backup 1:
WRKSBSD E2xxSBS
CLRPFM E2POSR
CLRPFM E2PZZLG
The E2P5101C file contains the heal records. The file exists on all nodes but is only
used on target nodes.
IMPORTANT
If you have more than one CRG, verify that you are signed on to the
correct CRG on all nodes. You can check this by looking at the
iTERA HA Menu (top row, second from the left). Also verify that
you are signed on with the iTERA HA Admin profile.
Section Two: Pre-Role Swap Checks    Assigned    Done
1. On all nodes, review scheduled jobs. Place any scheduled jobs that may submit
during the role swap on hold.
• To hold replicated jobs, use 5.5, F19 on the primary. If you don’t use 5.5 to hold
the jobs (e.g., if you hold them using another method), note which jobs you put
on hold so that you can release them after the role swap (in “Section Eight: Final
Steps”).
IMPORTANT
Do not use 5.5 to hold the jobs on systems where there is more than one
backup node or if there is a replicate node defined.
The purpose of this step is to recycle the job logs so that when the actual role swap is
executed, the subsystems will end more quickly.
3. End the Audit Command Console on all nodes (1.8, F11=End Auto Audit).
4. If this is a migration you may want to save the local journal receivers on the current
primary into a save file.
5. In Non-Mirrored Object Replication Check (4.30, primary), verify that all objects
that need to be replicated are being replicated. The Last Sync Date/Time data
should be current. If not, use F9=Sync NM-Library to sync all objects. (This should
be scheduled to run each day.)
6. If you have a multiple target node environment, use menu 30.7 on the node that
will become the new primary. Test communication to all other target nodes in the
CRG. Correct any issues prior to continuing.
Section Three: Final Pre-Role Swap Checks (30 Minutes Prior)    Assigned    Done
1. Optional: Notify users that the system will be coming down in thirty minutes.
2. Perform the following steps to verify that the system is ready for the role swap. The
items listed below should be checked in the order listed. Any issues encountered
should be investigated and resolved before proceeding.
a. Update the System Monitor (1.1 F10, primary). Check the following:
– Verify that the % Total Disk Storage Used displayed for the Backup 1 node is
well below the acceptable threshold.
– Local/Remote Journals Active, Apply Jobs Active and Network/Subsystem Active
should all indicate YES values (there should be no *NO indicators).
– Journal Entries Not Applied should either be caught up or show a relatively low
number of entries.
– Objects Requesting Sync should be NONE.
– Press F14=Role Swap Readiness.
– The AUDIT_MON test may be in WRN status because the Audit
Command Console has been ended.
– The JOBSCDE test may be in ERR status because the journal manager
job has been placed on hold.
– No other tests should be in WRN or ERR statuses.
b. Review the Record Count Audit results (1.22; backup).
c. Check the Heal Status Summary screen (3.7; backup). Verify there are no
pending Heal requests.
d. On all nodes, review the iTERA HA message log (1.1 F11=E2MSGLOG).
Ensure there are no high severity errors occurring. If so, resolve them. Do not
proceed with the role swap until issues are fully resolved.
e. On all nodes, check the iTERA HA subsystem to ensure it is active and that
there are no jobs in MSGW status (1.1, F7=E2SBS).
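The Role Swap Readiness check in step 2a can be sketched as a small filter. This is only an illustration: the dictionary of test names and statuses is a hypothetical stand-in for what you read off the F14 screen, and the sketch assumes a strict reading in which AUDIT_MON is excused only in WRN status and JOBSCDE only in ERR status.

```python
# Sketch only: iTERA HA does not provide this function. The input dictionary
# is a hypothetical transcription of the Role Swap Readiness screen (F14).

# Tests the checklist allows to be non-OK at this point, with the single
# status each one is permitted to show (assumed strict reading of the text).
EXPECTED_EXCEPTIONS = {
    "AUDIT_MON": "WRN",  # Audit Command Console has been ended
    "JOBSCDE": "ERR",    # journal manager job has been placed on hold
}


def unexpected_readiness_issues(test_statuses):
    """Return the tests in WRN or ERR status that the checklist does not excuse."""
    issues = {}
    for test, status in test_statuses.items():
        if status not in ("WRN", "ERR"):
            continue  # any other status is not a concern for this check
        if EXPECTED_EXCEPTIONS.get(test) == status:
            continue  # explicitly excused by the checklist
        issues[test] = status
    return issues
```

An empty result means the readiness screen matches what the checklist expects. Any entry returned should be investigated before proceeding.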
3. Place the Role Swap Readiness Monitor job on hold on all nodes (E2SBS, opt 3 on
xx_RSRMON).
IMPORTANT
Verify all users are finished processing before ending the subsystems.
IMPORTANT
During the actual role swap, never put the system into a restricted state.
Doing so ends all subsystems and severs communications.
Section Four: User Shutdown    Assigned    Done
a. End all MQ Queue Managers using your company’s established protocols. Make
sure the QMQM subsystem is not active.
b. Verify that the MQ mirroring process is caught up using the Mirror Process
Monitor (1.1, F16=Process Monitor). Verify that there are no exposed entries for
the MQ journals.
c. End iTERA HA WebSphere MQ Replication (5.6, F17=End MQ replication).
a. Select 40.24.
b. Do one of the following:
– To replicate individual objects in the list, select option 9=Sync Object for any
necessary objects
– To replicate all objects in the list, press F10=Sync Roll Request. Press F10 to
confirm replication of all necessary exclusive locked objects.
4. Check Data Queues (required for V5R3, optional for all other OS versions)
Select 40.23. If the message “All data queues are empty” is displayed at the bottom of
the screen then continue with the next step. If the message indicates there is data in
any data queue, review the message log to see which data queue needs to be emptied
(E2MSGLOG – primary). If there are entries in any data queues, determine why. If
applications are still processing them, they must be caught up before executing the
role swap.
5. Verify that there are no other jobs running. (QRB)
Select 40.25, F10=Update Monitor. Verify there is no latency, that there are no
objects requesting sync, and that there are all ‘YES’ indicators.
NOTE
If you replicate WebSphere MQ, the apply jobs and journal status will indicate
*NO.
9. Select 40.22 then press F5=Refresh to verify again that there are no new entries
caused by user jobs (entries from QAUDJRN are acceptable). (QRB)
IMPORTANT
Do not reset!
If there are new entries, it indicates there are still jobs running. Use option 7=Display
New Entries to determine which jobs are still running. If any new entries are from
user jobs, decide whether to end the jobs or allow them to finish processing. User jobs
must be completed prior to executing the role swap.
IMPORTANT
If you attempt a role swap then need to abort it for any reason,
contact CustomerCare immediately for assistance.
IMPORTANT
Section Five: Perform the Role Swap    Assigned    Done
IMPORTANT
Verify that the screen indicates “Role Swap”. If it indicates “Failover”, do not
continue with the role swap. Most likely, it indicates there is a communications
failure between systems. Contact CustomerCare immediately in order to
troubleshoot and resolve the issue.
Review the parameters and set as needed, then press F10 to accept.
NOTE
Vision Solutions recommends setting all parameters on the confirmation
screen to *NO for the first role swap. For subsequent role swaps, set as desired.
IMPORTANT
Do not change any iTERA HA files or data areas, end any iTERA HA
programs, or start the iTERA HA subsystem manually.
Monitor the role swap from the Role Swap Monitor screen on the backup node
until the role swap has completed. The Status field in the middle section will
indicate “Complete” when the role swap has finished.
Section Six: Post Role Swap Activities    Assigned    Done
1. Sign off all nodes, then sign back on to all nodes with the HA admin profile. (This
clears the cache memory and QTEMP.)
2. Review the System Monitor (QRB)
a. 1.1, F10 – primary. Verify that the journals and apply jobs are active. (YES status
in all appropriate columns.) (You may need to update the monitor more than
once if all statuses are not correctly reported.)
b. Press F7=E2SBS. Verify that the subsystem and all necessary jobs are active.
c. Press F14=Role Swap Readiness. Verify that no tests in the Role Swap Readiness
Monitor (with the possible exception of IPCONWRN) are in ERR status.
NOTE
If you are replicating WebSphere MQ, the apply jobs and journal status will
indicate *NO. The Role Swap Readiness Monitor will also show in error.
Verify that the triggers are not in an error status. An error status is indicated by the
Trigger State column showing red text. (If the library name shows in red text, it
indicates the library is no longer being replicated and you do not need to correct the
triggers in that library.) You may need to set other triggers to the node’s preferred
state. For more information on triggers, refer to the iTERA HA v6.0 Reference
Guide.
NOTE
iTERA HA does not modify the state of triggers in non-replicated libraries.
Verify that the constraints are not in an error status. An error status is indicated by
the State column showing red text. (If the library name shows in red text, it indicates
the library is no longer being replicated and you do not need to correct the
constraints in that library.) You may need to set other constraints to the node’s
preferred state.
Constraints that indicate REQ in both the Pri and Bkp fields (under Preferred State
column) cannot be changed. For more information on constraints, refer to the
iTERA HA v6.0 Reference Guide.
NOTE
iTERA HA does not modify the state of constraints in non-replicated libraries.
5. If replicating spool files, release output queues by running the following command:
E2RLSOUTQ
NOTE
This will release all replicated output queues.
7. If using ODBC, JDBC, or any other communication protocol, you may need to
change the new primary *LOCAL RDB entry to be the same as it was on the old
primary. See instructions in Appendix D, “i5/OS Recommendations” of the iTERA
HA v6.0 User Guide.
8. If you are using WebSphere MQ replication, start the process. (QRB)
NOTE
If WebSphere MQ Replication has never been set up on this system, you may
need to assign a default journal using F13 and opt 1, assign the journal to a
queue manager, and start journaling.
Section Seven: User Signon    Assigned    Done
You can redirect the user IP any time after the confirmation screen has been reviewed.
3. Start necessary user subsystems and interfaces, etc. (QRB)
Optional: Call the E2USRUP program, if created. (The CL source program is in the
source file ITHAxx/E2.CUST or ITE2xx/E2.CUST.)
4. Optional: If you specified *NO for the Enable Frozen Profiles parameter in the
confirmation screen, enable them now (5.1, F17 – new primary).
5. Notify users that they can use the system (QRB).
Section Eight: Final Steps    Assigned    Done
1. On all nodes, select option 6 to enter the Audit Console, press F11=Start Auto
Audit, then F7=Ignore Setup Time, to start the Audit Console.
2. Set the job schedule entries. Some environments generate duplicate entries. After
reviewing your environment, if no duplicates are detected, set the job schedule
entries as follows:
NOTE
This option does not work if you have a Replicate node defined to the CRG.
IMPORTANT
If the jobs are not held on the target machine and they run, then any files
affected by those jobs will be out of sync.
• The Virtual Role Swap Test does not affect users on the production
system. However, only limited communications testing can be performed.
• The Virtual Role Swap with Communications Test allows you to perform
all testing available in the Virtual Role Swap, as well as communications.
However, you will be required to restrict the production system so that
users cannot access it during the test.
NOTE
The Virtual Role Swap with Communications Test steps are listed as
optional steps within the Virtual Role Swap Test instructions.
• Execute the Virtual Role Swap procedure on BACKUP 1. See “Execute the
Virtual Role Swap Test” on page 224.
• After testing is complete, end the Virtual Role Swap Test as documented in
the procedure and resume replication. See “End Virtual Role Swap;
Resume Replication” on page 230.
IMPORTANT
Keep in mind that all processing and changes performed in a
Virtual Role Swap must be reversed, so you must allow sufficient
time for the changes to be rolled back. We recommend that you
test only a few things in your first Virtual Role Swap in order to
gauge how long your system takes to recover, then
increase the amount of processing with each subsequent test so that
you don’t encounter an unacceptable reversal time.
NOTE
Heal is not supported for WebSphere MQ, IFS, and Spool Files.
Additionally, the audits are not currently equipped to identify
changes made to these components during the Virtual Role Swap.
Therefore, these components are not supported in the Virtual Role
Swap and should not be tested. However, if a complete resync of IFS
is practical, it can be planned as part of the process.
IMPORTANT
Optional: If you do not run the Library Analyzer weekly as part of the monitoring
checklist then run it on the primary using 4.11, F18=Submit Library Analyzer.
Verify that all eligible libraries that should be replicated are being replicated.
NOTE
When the Virtual Role Swap Test is executed, any objects listed in option 3.7 on
the backup node (Heal Summary; Pending entries), objects requesting sync, and
objects with record count errors may contain incorrect data during testing. The
data in these objects on the backup may be different than on the primary.
2. Know how you are going to have your users access your backup node. ❏ ❏
• If you are doing a Virtual Role Swap Test with Communications Test you will be
redirecting your primary’s user IP address to the backup node. There are many
ways to accomplish this. For example, DNS server, virtual IP address, intelligent
routers, etc.
• If you are executing the standard Virtual Role Swap Test then you will need to
provide a way for the users who will be performing testing to access the backup
node.
3. Verify that the ZZ-Audits are running. On the backup node enter E2SBS. There ❏
should be one ZZ job for each journal, excluding IFS and Transport journals.
IMPORTANT
ZZ-Audits are required in order to perform the Virtual Role Swap Test. If they
are inactive, objects changed during the Virtual Test will not be healed.
4. If user journals are defined in iTERA HA, verify the setting for the parameter ❏ ❏
FIXLENDTA. iTERA HA requires this parameter to be set to *JOBUSRPGM.
5. Turn off Audit Command Console on all nodes using menu 6, F11. ❏ ❏
6. Place the Role Swap Readiness Monitor job on hold on all nodes (E2SBS, opt 3 on ❏ ❏
xx_RSRMON).
7. On the backup node, hold all jobs in the Job Scheduler (including the Journal ❏
Manager job that would normally run on the backup system) that are scheduled to
run during the time frame of the Virtual Role Swap Test.
NOTE
If you are using Job Scheduler Replication you can change the status using that
process (5.5, F20, H=Hold all jobs). This will not work if you have a replicate
node. However, if you are going to use Job Scheduler Replication to hold
and release the jobs on the job scheduler, verify that the Status fields in 5.5 on
the primary (Cur, Pri, and Bkp) are set correctly and understand that the statuses
for all job schedule entries will be changed.
8. On the primary node, only hold the iTERA HA jobs (including the iTERA HA audit ❏
jobs, and the journal manager job) on the Job Scheduler that will run during the time
frame of the Virtual Role Swap Test.
9. Optional: Replicate exclusively allocated objects. Select Work with Locked Objects ❏
(4.22), enter option 9=Sync Object on all records, press enter, and then select
F10=Sync Roll Request to replicate those objects to the backup system.
NOTE
This may not be an option on a Virtual Role Swap Test because users may have
the objects locked.
10. On the primary node, sync non-mirrored objects. (4.30, F9=Sync NM-Library, then ❏
enter the name of the library and target system.) (Note: you can use *ALL for the
library name to replicate all definitions at once.)
NOTE
This should be scheduled to run each day.
11. On primary node, refresh the System Monitor (1.1, F10=Update). Verify that there ❏
are no issues to resolve.
12. If replicating WebSphere MQ, end replication on the primary using 5.6, F17=End ❏
MQ Replication.
NOTE
WebSphere MQ Replication cannot be active during the Virtual Role Swap Test.
IMPORTANT
Do not end the MQ managers on the primary.
13. Review Triggers on all nodes. On all nodes run the Triggers update (4.23, F18=Submit ❏ ❏
Info Build, F8=All Research new Triggers), then review the current status of all the
triggers. Ensure that you understand why they are in their current state. Change the
preferred state, if needed.
14. Review Constraints on all nodes. On all nodes run the Constraints update (4.24, ❏ ❏
F18=Submit Info Build), then review the current status of all the Constraints. Ensure
you understand why they are in their current state. Change the preferred state, if
needed.
15. Execute the Virtual Role Swap Test. From the backup node select the Role Swap ❏
option 40.30, F16=Role Maintenance, F19=Virtual Test.
IMPORTANT
Review the Confirmation Screen. Verify that the screen heading indicates Virtual
Test Role Swap, as indicated:
• For the Process User Startup Program parameter, enter Y if you want to run the
iTERA HA Role Swap Startup program in the “Step 1” option. Otherwise, enter
N and press F10=Initiate Virtual Test Setup.
• The Role Swap Monitor will be displayed and processing will begin. In the Process
section (the center section of the screen), when step 9, Virtual Test Ready, is
displayed with the Status of Complete, you may exit the screen using F3.
NOTE
If it appears to take longer than expected for the completion message to be
displayed, contact CustomerCare.
16. Once you have exited the Role Swap Monitor screen, the message “WARNING: ❏
Virtual Test Role Swap in Progress” will be displayed at the top of the iTERA HA menu
screens. Triggers and constraints have been reset, apply jobs have been suspended, and
objects will not be resynced until recovery from Virtual Role Swap Test mode.
NOTE
The amount of time the recovery takes is in relation to the number of changes
made during the Virtual Test. The more changes made, the longer the recovery
takes. For example, if you plan on testing your day-end or month-end processes,
it could take a significant amount of time to recover. Vision recommends that
your Virtual Role Swap test be limited to one day.
17. Review Triggers on the backup node. Check to ensure the Triggers are set to the ❏
Preferred Primary status (4.23).
18. Review Constraints on all nodes. On the backup node, verify that the Constraints are ❏ ❏
set to the Preferred Primary status (4.24).
IMPORTANT
Perform this step only if performing the Virtual Role Swap with
Communications Test (which tests primary communications). If you choose to
do this, you will be required to restrict the production system so that users
cannot access it during the test. If you DO NOT want to test your primary
communications, skip this step and continue with the next step below.
NOTE
If you can create duplicate communications you can do your complete testing in
a Virtual Role Swap Test.
NOTE
We strongly encourage you to update the iTERA HA User Up and User Down
programs to automate the starting and ending of user jobs, subsystems, etc.
22. The target is now ready for application testing. Notify selected users that they can start ❏
testing applications.
NOTE
Test applications and verify that the license keys work.
IMPORTANT
Perform this step only if you executed the communications testing step in
the above procedure (as part of the Virtual Role Swap Test with
Communications Test). If you DID NOT test your primary
communications, skip this step and continue to step 2 below.
4. End the Virtual Role Swap Test and resume replication. On the backup, select ❏
40.30, F16=Role Maintenance, F19=Virtual Test. Review the Confirmation
screen. Enter Y if you want to run the iTERA HA Role Swap Primary End
program in the “Step 1” option. Otherwise, enter N and press F10.
The Role Swap Monitor will be displayed and processing will begin. Wait until
the Process Steps in the center section of the screen indicate a Complete message.
• The menus will no longer indicate that the system is in Virtual Role Swap Test
mode.
• Replication has resumed on the backup.
• The Heal process has identified any changed records and/or objects that have been
changed during the Virtual Test and has initiated heal requests.
NOTE
Changes made to objects that have been filtered using an IGNCHG filter will be
retained.
NOTE
Allow time for the completion message to be displayed. The amount of time the
recovery takes is in relation to the number of changes made during the test. The
more changes made, the longer the recovery takes. If concerned about the amount
of time it is taking to recover, contact CustomerCare. To verify that the system has
caught up, execute the steps in “How to Check the Progress of the Virtual Role
Swap Recovery”, in the next section.
IMPORTANT
Spool files created during the Virtual Role Swap Test will not be removed from the
backup node.
NOTE
A Replicate node cannot be used in a Role Swap or Virtual Role
Swap test.
NOTE
Disregard the message “Cluster resource group CRGnnnnn does not exist in
cluster CST” if it is displayed.
5. Optional: If the system that was promoted is to stay in the BACKUP 1 role, then ❏
change its Preferred Role to BACKUP 1 using option 9=Promote Preferred Role.
6. On all nodes, clear the following files by executing the commands: ❏ ❏
CLRPFM E2PXMONS
CLRPFM E2PXAMON
2. Compare the sequence number in the E2PQAJ file to the current sequence
number in the journal to estimate how much further the heal has to go.
h. Verify that the number in the E2PQAJ is caught up with the local
receiver sequence that you wrote down.
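The comparison described above amounts to simple arithmetic on two sequence numbers. The following is a minimal sketch. The function and parameter names are hypothetical, the values are assumed to be read manually from the E2PQAJ file and the journal, and it assumes the sequence numbers were not reset between the two readings.

```python
# Sketch only: these helpers are illustrative and are not part of iTERA HA.
# Sequence numbers are assumed to be entered by hand from the screens.

def heal_entries_remaining(e2pqaj_seq, journal_seq):
    """Estimate how many journal entries the heal still has to process."""
    # A negative difference would mean the readings were taken out of order
    # or the sequence was reset; treat that as nothing remaining.
    return max(journal_seq - e2pqaj_seq, 0)


def heal_caught_up(e2pqaj_seq, recorded_receiver_seq):
    """True once E2PQAJ has reached the local receiver sequence written down earlier."""
    return e2pqaj_seq >= recorded_receiver_seq
```

For example, an E2PQAJ sequence of 1200 against a journal sequence of 1500 leaves roughly 300 entries to process.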
Overview
All updates and enhancements for iTERA HA v6.0 and related products are
done with IBM-style PTFs. To successfully update iTERA HA, the PTFs
must be acquired from the Vision Solutions Support Central or FTP site,
then loaded and applied. This document provides step-by-step instructions
for these processes, as well as provides relevant supplementary material in the
appendices, including instructions for removing a PTF.
Product IDs
iTERA HA has two Product IDs:
IMPORTANT
You must retrieve, load, and apply PTFs for both product IDs to
be up to date with the core iTERA HA product.
IMPORTANT
Always retrieve and apply PTFs for the Cross Product library prior
to those for iTERA HA.
IMPORTANT
The subsystems must be ended on all CRGs while applying PTFs.
Additionally, if you have configured iTERA Alert, you must also periodically
load and apply PTFs for that product as well (Product ID 7PA2K25).
The following table shows products, product IDs, and versions that are
updated using these instructions.
Product or Feature    Description    Product ID    Product Version    Relevant Menu Options
NOTE
The Service Pack Announcement document, iTERA HA 6.0 PTF Service Pack
Availability-nn.pdf, available from Support Central, includes the
text of the PTF Cover Letters, as well as any other special
installation instructions. Additionally, the documents
iTERA HA v6.0 PTF Release Report.pdf and PTF Release Report
for Cross Product V4R3.pdf contain the summary details of all cover
letters to date and are also available for download from Support
Central.
• Obtain the ISO image and create a virtual optical device on the IBM i. See
“How to Create a Virtual Optical Drive on the IBM i and Download the
PTF ISO Image” on page 246.
• If you need only a few PTFs, use FTP from your PC to retrieve them from
the Vision Solutions FTP site, then transfer them to your IBM i. See
• Regardless of the method you use to retrieve the PTFs, the instructions in
the section “Load and Apply PTFs” on page 237 should be used to load
and apply them.
The menu options to retrieve and install PTFs (listed in the table above) call
the ITINSTPTF command, and pre-populate some of the required
parameters, such as product ID and product release.
IMPORTANT
The instructions describe the process for the Cross Product first.
Load and apply PTFs for the Cross Product completely prior to
loading and applying PTFs for iTERA HA. If using iTERA Alert,
load and apply PTFs for that product after iTERA HA PTFs have
been applied.
You may load and apply Cross Product PTFs on all nodes
simultaneously, then repeat the instructions for
iTERA HA PTFs, and then again for iTERA Alert PTFs.
Do not start the subsystems until PTFs for both Cross Product and
iTERA HA have been applied on all nodes in the environment.
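The required ordering can be summarized as a small lookup table. This is a sketch for planning purposes only: the product IDs and menu options are the ones given in this chapter, while the function name and the output format are hypothetical.

```python
# Sketch only: a planning aid, not an iTERA HA interface.
# Each entry: (product, product ID, load/apply menu option, display menu option).
PTF_APPLY_ORDER = [
    ("Cross Product", "7PA2K02", "10.46", "10.41"),
    ("iTERA HA", "7PA2K05", "10.45", "10.40"),
    ("iTERA Alert", "7PA2K25", "10.47", "10.42"),
]


def apply_plan(uses_alert):
    """List the products in the order their PTFs should be loaded and applied."""
    steps = []
    for product, product_id, apply_opt, display_opt in PTF_APPLY_ORDER:
        if product == "iTERA Alert" and not uses_alert:
            continue  # iTERA Alert PTFs apply only if that product is configured
        steps.append(
            f"{product} ({product_id}): load and apply with {apply_opt}, "
            f"verify with {display_opt}"
        )
    return steps
```

Printing `apply_plan(True)` line by line gives a checklist that can be repeated on each node.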
1. Sign on to your system (any node) as the iTERA HA user. Select menu
option 10.46 to display the ITINSTPTF command with the parameters
entered for the Cross Product. The following screen is displayed:
IMPORTANT
When repeating these instructions, use menu option 10.45 for
iTERA HA PTFs and 10.47 for iTERA Alert PTFs. These menu
options display the ITINSTPTF screen with Product ID and Release
parameters pre-populated.
2. Set the parameters according to the following table, and press Enter. See
screen below for example.
Parameter                     Description
iTera Product ID              • 7PA2K02 for cross-product PTFs
                              • 7PA2K05 for iTERA HA
                              • 7PA2K25 for iTERA Alert
Display PTFs not on System    Enter *YES to display the PTFs that are not currently
                              on the system.
Get PTF Save Files            Enter *YES to get the PTF save files.
NOTE
Additional details on each parameter are available in the help text
(F1).
F10=Additional Parameters
When *ITERA is specified in the Location of PTF Files parameter, the PTFs
will be retrieved from Vision’s FTP server. Normally, these parameters are
defined when iTERA HA is initially configured and the settings are retained. If
not already configured, specify the following:
• iTera FTP user id / iTera FTP password - Enter the user ID “public”
and password provided by Vision Solutions CustomerCare.
• iTera FTP server address - Enter the Vision Solutions FTP server IP
address: 166.70.109.194.
• Include “sendpasv” - “sendpasv” is an FTP subcommand. If you
have firewall issues, enter *YES for this parameter.
3. Upon pressing Enter, if your system can successfully retrieve PTFs, the
following report is displayed (the actual list of PTFs will vary):
This report displays a list of PTFs that are available for download that are
not already installed on your IBM i. If the list is empty, PTFs are up to date
for this product ID (i.e., no PTFs are displayed because they have all been
applied).
If there are any unapplied PTFs, status messages are displayed indicating
that they are being transferred to your IBM i (e.g., Getting PTF
0XP0009...).
5. Enter menu option 10.41 (Cross Product) to display the PTFs retrieved in
the previous step.
NOTE
Use menu option 10.40 for iTERA HA PTFs and 10.42 for iTERA
Alert PTFs.
NOTE
The Service Pack Announcement document, iTERA HA 6.0 PTF Service Pack
Availability-nn.pdf, available from Support Central, includes the
text of the PTF Cover Letters, as well as any other special
installation instructions. Additionally, the documents
iTERA HA v6.0 PTF Release Report and PTF Release Report for
Cross Product V4R3 contain the summary details of all cover letters
to date and are also available for download from Support Central.
a. Have all iTERA HA users sign off all nodes. (No users can access
any menus while PTFs are being applied.)
b. End the iTERA HA subsystem on all nodes (primary first, then
targets).
c. Sign off all nodes.
d. Sign on to all nodes with the iTERA HA Admin profile.
IMPORTANT
After you sign back on, do not select any iTERA HA menu options
or use any iTERA HA commands other than 10.45, 10.46, and
10.47. Doing so may allocate or lock iTERA HA objects which
could cause problems during the iTERA HA PTF apply process.
7. Enter menu option 10.46. Change the value in the Load and Apply PTFs
field to *YES, as indicated, then press Enter.
NOTE
The example above displays the Cross Product ID and Release
information. To apply iTERA HA PTFs, use menu option 10.45. To
apply iTERA Alert PTFs, use menu option 10.47.
Status messages displayed at the bottom of the screen indicate the PTFs
are being loaded and applied. The PTF files are placed in library
QGPL.
8. Enter menu option 10.41 and press Enter. Verify that the Cross Product
PTFs were correctly applied.
NOTE
Use 10.40 for iTERA HA PTFs and 10.42 for iTERA Alert PTFs.
NOTE
The appropriate status for most PTFs is Temporarily Applied.
PTFs with the status Not Applied indicate that the load process
failed. These will need to be investigated and resolved.
9. Repeat step 1 through step 8 for iTERA HA PTFs (menu option 10.45;
iTERA Product ID 7PA2K05, Release V6R0M0).
10. Repeat step 1 through step 8 for iTERA Alert PTFs (menu option 10.47;
iTERA Product ID 7PA2K25, Release V6R0M0).
11. If you did not apply PTFs on the other nodes simultaneously, repeat step 1
through step 10 on each of the other nodes in the iTERA HA
environment.
12. After the PTFs have been successfully applied, sign off, then sign on using
the iTERA HA Admin profile.
13. Start the iTERA HA subsystems on all nodes (E2STRSBS; targets first,
then primary).
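The verification in step 8 comes down to finding any PTF whose status is Not Applied. A minimal sketch follows, assuming the statuses have been transcribed by hand from the 10.40, 10.41, or 10.42 display into a dictionary. The function name and input shape are hypothetical.

```python
# Sketch only: statuses are assumed to be copied manually from the PTF
# display screens; iTERA HA does not expose this function.

def ptfs_needing_attention(ptf_statuses):
    """Return the PTF IDs whose status is 'Not Applied', sorted for readability."""
    return sorted(
        ptf_id
        for ptf_id, status in ptf_statuses.items()
        if status.strip().lower() == "not applied"
    )
```

Any PTF returned here indicates a failed load and should be investigated before the subsystems are started.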
1. Have all iTERA HA users sign off all nodes, then end the iTERA HA
subsystem on all nodes (primary first, then target).
2. Sign off all nodes, then sign back on to all nodes with the iTERA HA user
profile.
3. Execute the following commands:
APYPTF LICPGM(7PA2K02) RLS(V4R3M0) APY(*PERM) DELAYED(*NO)
4. After the upgrade, menu options 10.46 and 10.45 should be executed to
remove the save files for the permanently applied PTFs using the following
parameters.
Keep in mind that PTFs that have been permanently applied cannot be
removed, so it is a good idea to apply them permanently only when an OS
upgrade is executed.
IMPORTANT
Do not permanently apply iTERA PTFs via an IPL. If you want to
permanently apply PTFs for iTERA products, they must be applied
as indicated above. Permanently applying iTERA PTFs during an
IPL will result in the IPL hanging because the iTERA libraries will
not be in the library list.
IMPORTANT
Do not just copy the PTF *.iso file onto a disc. You must use the
“Burn Image to Disk” option of the burning software. If you don’t
have CD burning software, there is a free ISO image plug-in for
Windows at http://isorecorder.alexfeinman.com/
2. Enter your user ID and password then select Login. If you do not have a
username and password, select Sign up.
3. Click “iTERA HA” from the My Products section in the left panel.
5. Verify that 6.0 is displayed in the Version drop-down box, then select
Downloads.
IMPORTANT
The PTF file may be large. Download time will depend on your
connection speed.
7. After the file has finished downloading, unzip it, then burn the *.iso image file
to disc (CD or DVD, depending on file size).
8. Once you have burned the installation media to disc, follow the
instructions in the chapter “Load and Apply PTFs” on page 237. Before
step 2 of that section, load the disc on the IBM i (ensure there are no
other discs loaded), then specify the disc’s location in step 2.
1. Make certain you are signed on with the iTERA HA User Profile so that
you have all required authorities.
2. Create a virtual optical device by entering the command:
CRTDEVOPT DEVD(OPTVRT01) RSRCNAME(*VRT)
3. Vary on the virtual optical device. To make the device active, enter:
VRYCFG CFGOBJ(OPTVRT01) CFGTYPE(*DEV) STATUS(*ON)
5. Download the zip file of the ISO image to the IBM i. Type the following
commands, pressing enter after each line.
FTP '166.70.109.194'
Login with user name public and use the password provided by
Vision Solutions CustomerCare.
nam 1
NOTE
Ignore any error that may occur on this step.
bin
lcd /home/optvrt01
cd 7PA2K05/V6R0M0 (case-sensitive)
DIR
(case-sensitive)
6. Unzip the downloaded ISO image. On the command line, type the
following commands:
QSH
CD /HOME/OPTVRT01
JAR XF ITHA60_PTF.ZIP
/home/optvrt01/Distribution/7PA2K05/V6R0M0/6.0.14.00/ISO Files/itha60_ptf.iso
7. Add an image catalog entry for the iso image. Type the command:
ADDIMGCLGE IMGCLG(OPTVRT01) FROMFILE('/home/optvrt01/Distribution/7PA2K05/V6R0M0/6.0.nn.00/ISO Files/itha60_ptf.iso')
8. Load the image catalog. Use the command LODIMGCLG. This associates
the virtual optical device to the image catalog.
LODIMGCLG IMGCLG(OPTVRT01) DEV(OPTVRT01)
The image catalog is ready for use. You can now load and apply PTFs using the
instructions in the section “Load and Apply PTFs” on page 237. Make certain
to specify *OPT in the Location of PTF files field to retrieve PTFs; *OPT
points to the optical device you just created.
2. Click on the Windows “Start” button and select “Run”. Type cmd in the
Open field. Click OK. A dialogue box with text similar to the following
appears:
4. At the “C:\” prompt in the dialogue box, type the following command
then press Enter:
ftp ftp.iterainc.com
5. If the connection is successful, you will be prompted for a user name. Type
public and press enter.
cd 7PA2K05/V6R0M0
dir
(this will list the available PTF save files; the cover letter save files end
with “CL”)
quote site na 1
(Ignore the “SITE NA not understood” error message if you get it.)
8. Type the following command for each of the iTERA HA PTFs you wish
to retrieve and press enter:
get Q1HAnnnn c:\iTERAPTFs\Q1HAnnnn.savf
(where nnnn is the four-digit PTF number; if you did not create the
“iTERAPTFs” directory, replace iTERAPTFs with the directory name
you created.)
9. To retrieve iTERA Cross Product library PTFs, type the following and
press enter.
cd /7PA2K02/V4R3M0
dir
10. Type the following command for each of the Cross Product Library PTFs
you wish to retrieve and press enter:
get Q0XPnnnn c:\iTERAPTFs\Q0XPnnnn.savf
11. To end the FTP session, type quit and press enter.
12. At the "C:\" prompt in the dialogue box, type this command:
ftp nnn.nnn.nnn.nnn
cd qgpl
Na 1
put c:\iTERAPTFs\Q1HAnnnn.savf
or
put c:\iTERAPTFs\Q0XPnnnn.savf
(Depending on whether you are sending HA or XP PTFs, and
where nnnn is the PTF number.)
NOTE
When you FTP a file with an extension of “.savf” to the IBM i, the
system automatically treats it as a save file, and you do not need to
first create a save file to receive the data.
16. Repeat step 15 for each additional iTERA HA PTF you wish to send to
your IBM i.
17. To end the FTP session, type quit and press enter.
18. After you have the iTERA HA PTF save file on your IBM i, you are ready
to load it. (Loading the PTF does not make any changes to your system.
It simply registers the PTF to the operating system and stages it for the
actual applying of the changes.)
PTFs are loaded into i5/OS using the LODPTF command. Type LODPTF
on the command line and press F4 to prompt.
Fill in the prompts indicated in the following screen then press Enter:
NOTE
To load iTERA HA PTFs, use 7PA2K05 for the Product ID and
V6R0M0 for the Release. The Save File name starts with Q1.
NOTE
“xx” is either “HA” for iTERA HA PTFs or “XP” for Cross Product
Library PTFs; “nnnn” is the number of the PTF you wish to load.
19. To verify that the PTFs were loaded, use menu option 10.41 (Cross
Product PTFs), 10.40 (iTERA HA PTFs), or 10.42 (iTERA Alert PTFs)
on a command line and press enter. A screen similar to the following is
displayed:
NOTE
If the PTF was loaded properly, it will appear in the list with a status
of “Not applied”.
20. After you have the iTERA HA PTF save file(s) on your IBM i and the
PTFs loaded, you are now ready to apply them. Applying the PTF will
make changes to iTERA HA.
To apply all loaded PTFs, type the following on the command line and
press enter:
To apply a single specific PTF, type the following on the command line
and press enter:
• For iTERA Cross Product PTFs use:
APYPTF LICPGM(7PA2K02) RLS(V4R3M0) SELECT(0XPnnnn)
22. To verify that the PTFs were applied, enter menu option 10.41 (Cross
Product PTFs) or 10.40 (iTERA HA PTFs). A screen similar to the
following is displayed:
NOTE
If the PTF was applied properly, it will appear in the list with a
status of Temporarily applied, or Superseded. (In some cases the
status may be Permanently Applied.) PTFs with the status Not
Applied indicate that the load process failed and they will need to be
resolved.
a. Sign off then sign on to all nodes using the iTERA HA user profile (to
release locks and re-establish the proper library list).
b. Start the iTERA HA subsystem on all nodes (E2STRSBS; target first,
then primary).
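The naming conventions used in the steps above (PTF ID, save file name, product ID, and release) can be summarized in a small helper. The following Python sketch is illustrative only and is not part of iTERA HA. The function names and the default `c:\iTERAPTFs` directory are assumptions taken from the examples in this chapter, and the APYPTF form for iTERA HA PTFs is inferred from the Cross Product example rather than quoted from it.

```python
# Illustrative helper, not part of iTERA HA: derives the names used in this
# chapter from a product area ("HA" or "XP") and a four-digit PTF number.

PRODUCTS = {
    # area: (PTF ID prefix, product ID, release) as given in this chapter
    "HA": ("1HA", "7PA2K05", "V6R0M0"),
    "XP": ("0XP", "7PA2K02", "V4R3M0"),
}


def ptf_names(area, number):
    """Return the PTF ID, save file name, product ID, and release."""
    prefix, product_id, release = PRODUCTS[area]
    ptf_id = f"{prefix}{int(number):04d}"
    return {
        "ptf_id": ptf_id,
        "save_file": "Q" + ptf_id,  # save file name is Q + PTF ID
        "product_id": product_id,
        "release": release,
    }


def ftp_get_line(area, number, pc_dir=r"c:\iTERAPTFs"):
    """Build the FTP 'get' line that copies a PTF save file to the PC."""
    savf = ptf_names(area, number)["save_file"]
    return f"get {savf} {pc_dir}\\{savf}.savf"


def apyptf_command(area, number):
    """Build an APYPTF command for one PTF (the HA form is an assumption)."""
    n = ptf_names(area, number)
    return (f"APYPTF LICPGM({n['product_id']}) RLS({n['release']}) "
            f"SELECT({n['ptf_id']})")


if __name__ == "__main__":
    print(ftp_get_line("XP", 52))
    print(apyptf_command("XP", 52))
```

Generating the lines this way mainly guards against typing the wrong prefix or release for a product area; the commands themselves are still entered as described in the steps.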
Errors
Unable to find PTF list file IPT7PA2K05…
If you receive the screen message “Unable to find PTF list file
IPT7PA2K05…” or similar, the simplest resolution is to ensure that the PTFs
for 7PA2K02 have been retrieved and applied prior to attempting to retrieve
PTFs for 7PA2K05. This will, in most cases, resolve the problem. If you still
receive the error, contact CustomerCare.
• DLTPTF deletes the PTF save files (PTF and cover letter) from the QGPL
library.
Remove PTF:
Use the RMVPTF command to return the PTF environment (programs, etc.) to
the previous level (the level prior to applying the PTF).
NOTE
The Extent of Change parameter must be set to *PERM in
order to prevent problems when reapplying the PTF.
NOTE
PTF number example: 0XPnnnn (where nnnn is the PTF number).
NOTE
PTF number example: 1HAnnnn (where nnnn is the PTF number).
Use the DLTPTF command to delete the save files for the PTF:
NOTE
PTF number example: 0XPnnnn or 1HAnnnn (where nnnn is the
PTF number).
NOTE
For iTERA HA PTFs: Use 7PA2K05 and V6R0M0 for the Product
and Release (the example at the left is for the Cross Product).
Check to see if the PTF was removed. Enter menu option 10.41 (Cross
Product PTFs) or 10.40 (iTERA HA PTFs).
Delete a PTF that has problems applying during the PTF apply
process:
Use this procedure if, during the PTF apply process (NOT the PTF retrieve
process), the status for the PTF indicates *Savefile.
Use the DLTPTF command to delete the PTF. Using this command will delete
the save files for the PTF from QGPL.
NOTE
Prior to the DLTPTF command, you may execute the RMVPTF
command for this PTF but this is not required since the PTF was
never applied nor the iTERA HA product upgraded.
Product Area    PTF Number    Product ID    Release
Retrieve, load, and apply the PTF again. Should the PTF not apply again (i.e.,
status indicates *Save file only), follow the steps below.
Use the IBM command LODPTF to load the PTF (register the PTF to the
operating system).
Product Area    PTF Number    Product ID    Release
Enter menu option 10.41 (Cross Product PTFs) or 10.40 (iTERA HA PTFs)
to check the PTF status.
PTF IPL
Opt ID Status Action
0XP0052 Not applied None
0XP0051 Temporarily applied None
0XP0050 Temporarily applied None
0XP0049 Temporarily applied None
0XP0048 Temporarily applied None
0XP0047 Temporarily applied None
0XP0046 Temporarily applied None
0XP0045 Permanently applied None
0XP0044 Temporarily applied None
More...
NOTE
If the status indicates Not applied, the PTF loaded correctly. If this
is not the case, contact CustomerCare.
Have all iTERA HA users sign off all nodes. End the iTERA HA subsystem on
all nodes (primary first, then target). Sign off all nodes. Sign on to all nodes
with the iTERA HA user profile.
Use the IBM command APYPTF to apply the PTF (this upgrades the iTERA HA
product), specifying the parameters indicated.
Product Area    PTF Number    Product ID    Release
Display the PTF status; 10.41 (Cross Product), or 10.40 (iTERA HA).
PTF IPL
Opt ID Status Action
0XP0052 Temporarily applied None
0XP0051 Superseded None
0XP0050 Temporarily applied None
0XP0049 Temporarily applied None
0XP0048 Temporarily applied None
0XP0047 Temporarily applied None
0XP0046 Temporarily applied None
0XP0045 Permanently applied None
0XP0044 Temporarily applied None
More...
IMPORTANT
If the status indicates Superseded or Temporarily applied, the PTF
applied correctly. If this is not the case, contact CustomerCare.
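The status checks described in this procedure can be expressed as a simple lookup. The Python sketch below is illustrative only: it assumes the status rows have been copied as plain text from menu option 10.40 or 10.41, and the function names are hypothetical. The expected statuses per phase are the ones named in this chapter.

```python
# Illustrative only: checks PTF status rows copied from menu option 10.40 or
# 10.41 against the statuses this chapter describes for each phase.

EXPECTED = {
    "load": {"Not applied"},
    "apply": {"Temporarily applied", "Superseded", "Permanently applied"},
}


def parse_rows(screen_text):
    """Yield (ptf_id, status) pairs from lines such as
    '0XP0052 Not applied None' (the last word is the IPL action)."""
    for line in screen_text.splitlines():
        parts = line.split()
        if len(parts) < 3 or len(parts[0]) != 7:
            continue  # skip headings, 'More...', and blank lines
        yield parts[0], " ".join(parts[1:-1])


def needs_attention(screen_text, phase, ptf_ids):
    """Return the listed PTF IDs whose status is not expected for the phase."""
    statuses = dict(parse_rows(screen_text))
    return [p for p in ptf_ids if statuses.get(p) not in EXPECTED[phase]]
```

A PTF that is missing from the list altogether is also returned, since that case needs follow-up just as an unexpected status does.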
After the PTFs are successfully applied, sign off then sign on to all nodes using
the iTERA HA user profile (to release locks and re-establish the proper library
list), then start the iTERA HA subsystem on all nodes (E2STRSBS; target first,
then primary).
/*========================================================================*/
/* This program will retrieve PTFs from Vision Solutions’ FTP site and */
/* put a notification message into the iTera HA message log for each */
/* unapplied PTF found on your system. It should be run on a periodic */
/* basis from a scheduler (e.g. monthly) */
/* */
/* Note: PTFs are NOT applied by this process, they are only retrieved. */
/* After you are notified that they need to be applied, you are */
/* responsible to review and apply them according to the instructions */
/* contained in the cover letter(s). */
/*========================================================================*/
PGM
DCL VAR(&JobName) TYPE(*CHAR) LEN(32)
DCL VAR(&MsgDesc) TYPE(*CHAR) LEN(25)
DCL VAR(&Pgm) TYPE(*CHAR) LEN(10) +
VALUE('E2GETPTFS')
DCLF FILE(QADSPPTF)
/* Fetch PTFs for the iTera Cross Product Library */
ITINSTPTF PRDID(7PA2K02) RELEASE(V4R3M0) GET(*YES) +
GETSAVF(*YES)
/* Fetch PTFs for iTera HA */
ITINSTPTF PRDID(7PA2K05) RELEASE(V6R0M0) GET(*YES) +
GETSAVF(*YES)
DLTF FILE(QTEMP/NONAPYPTFS)
MONMSG MSGID(CPF0000)
DSPPTF LICPGM(7PA2K02) SELECT(*ALL) RLS(V4R3M0) +
OUTPUT(*OUTFILE) +
OUTFILE(QTEMP/NONAPYPTFS) OUTMBR(*FIRST *ADD)
DSPPTF LICPGM(7PA2K05) SELECT(*ALL) RLS(V6R0M0) +
OUTPUT(*OUTFILE) +
OUTFILE(QTEMP/NONAPYPTFS) OUTMBR(*FIRST *ADD)
OVRDBF FILE(QADSPPTF) TOFILE(QTEMP/NONAPYPTFS)
PRIMEREAD:
RCVF RCDFMT(QSCPTF)
MONMSG MSGID(CPF0864) EXEC(GOTO CMDLBL(END))
MAIN:
IF COND(&SCSTATUS *EQ 'Damaged' *OR &SCSTATUS +
*EQ 'Save file only' *OR &SCSTATUS *EQ +
'Not applied') THEN(DO)
CHGVAR VAR(&JOBNAME) VALUE('Z$GETPTFS found new +
PTF' *BCAT &SCPTFID)
CHGVAR VAR(&MsgDesc) VALUE('Please review and apply')
CALL PGM(E22090RP) PARM(&JOBNAME &MsgDesc &Pgm)
ENDDO
READNEXT:
RCVF RCDFMT(QSCPTF)
MONMSG MSGID(CPF0864) EXEC(GOTO CMDLBL(END))
GOTO CMDLBL(MAIN)
END:
ENDPGM
If you do not have the ability to access our FTP site from your target
node, you can create this CLLE source on the target node with the
following changes. This will enable the target node to bring over all the
new PTFs that have been retrieved to the primary node.
/*========================================================================*/
/* This program will retrieve PTFs from iTera's FTP site and put a */
/* notification message into the iTera HA message log for each unapplied */
/* PTF found on your system. It should be run on a periodic basis */
/* from a scheduler (e.g. monthly). */
/* */
/* Note: PTFs are NOT applied by this process, they are only retrieved. */
/* After you are notified that they need to be applied, you are */
/* responsible to review and apply them according to the instructions */
/* contained in the cover letter(s). */
/*========================================================================*/
PGM
DCL VAR(&JobName) TYPE(*CHAR) LEN(32)
DCL VAR(&MsgDesc) TYPE(*CHAR) LEN(25)
DCL VAR(&Pgm) TYPE(*CHAR) LEN(10) +
VALUE('E2GETPTFS')
DCLF FILE(QADSPPTF)
/* Fetch PTFs for the iTera Cross Product Library */
DLTF FILE(QTEMP/NONAPYPTFS)
MONMSG MSGID(CPF0000)
DSPPTF LICPGM(7PA2K02) SELECT(*ALL) RLS(V4R3M0) +
OUTPUT(*OUTFILE) +
OUTFILE(QTEMP/NONAPYPTFS) OUTMBR(*FIRST *ADD)
DSPPTF LICPGM(7PA2K05) SELECT(*ALL) RLS(V6R0M0) +
OUTPUT(*OUTFILE) +
OUTFILE(QTEMP/NONAPYPTFS) OUTMBR(*FIRST *ADD)
OVRDBF FILE(QADSPPTF) TOFILE(QTEMP/NONAPYPTFS)
PRIMEREAD:
RCVF RCDFMT(QSCPTF)
MONMSG MSGID(CPF0864) EXEC(GOTO CMDLBL(END))
MAIN:
IF COND(&SCSTATUS *EQ 'Damaged' *OR &SCSTATUS +
*EQ 'Save file only' *OR &SCSTATUS *EQ +
'Not applied') THEN(DO)
CHGVAR VAR(&JOBNAME) VALUE('Z$GETPTFS found new +
PTF' *BCAT &SCPTFID)
CHGVAR VAR(&MsgDesc) VALUE('Please review and apply')
CALL PGM(E22090RP) PARM(&JOBNAME &MsgDesc &Pgm)
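ENDDO
READNEXT:
RCVF RCDFMT(QSCPTF)
MONMSG MSGID(CPF0864) EXEC(GOTO CMDLBL(END))
GOTO CMDLBL(MAIN)
END:
ENDPGM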
This program can then be put into the job scheduler on the target
node. Schedule this job later than the same job on the primary to allow
the primary node time to finish the download before the process is
started on the target.
2. Call this program on a periodic basis from any scheduler (we recommend
once a month). The following is an example of how to schedule the job in
the IBM Job Scheduler using the ADDJOBSCDE command:
NOTE
In this example, a job has been set up to run on the second Monday of each
month at 11 AM.
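As a rough sketch of that schedule, the Python helper below works out the date of the second Monday in a month and assembles an ADDJOBSCDE command string for it. The job and program names are supplied by the caller and are placeholders, and the parameter values shown (FRQ, SCDDATE, SCDDAY, RELDAYMON, SCDTIME) are an assumption about how such an entry could be written; prompt ADDJOBSCDE with F4 to confirm them on your release.

```python
import calendar


def second_monday(year, month):
    """Return the date of the second Monday in the given month."""
    mondays = [d for d in calendar.Calendar().itermonthdates(year, month)
               if d.month == month and d.weekday() == calendar.MONDAY]
    return mondays[1]


def addjobscde_command(job, program):
    """Build an ADDJOBSCDE command string for 11 AM on the second Monday.

    The job and program names are placeholders chosen by the caller;
    verify every parameter by prompting ADDJOBSCDE with F4.
    """
    return (f"ADDJOBSCDE JOB({job}) CMD(CALL PGM({program})) "
            "FRQ(*MONTHLY) SCDDATE(*NONE) SCDDAY(*MON) "
            "RELDAYMON(2) SCDTIME(1100)")


if __name__ == "__main__":
    print(second_monday(2011, 9))
    print(addjobscde_command("GETPTFS", "MYLIB/E2GETPTFS"))
```

Here GETPTFS and MYLIB are hypothetical names used only to show the shape of the command.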
If the process finds any PTFs that you have not yet applied to your iTERA HA
application, it will retrieve them (via FTP) and write a message to the iTERA
HA message log notifying you that a new PTF was retrieved.
Messages in the iTERA HA message log look similar to the following:
Display Report
Report width . . . . . : 99
Position to line . . . . . Shift to column . . . . . .
Line ....+....1....+....2....+....3....+....4....+....5....+....6....+....7
MLJOB MLDESC MLPGM
000001 Z$GETPTFS found new PTF 0XP0018 Please review and apply E2GETPTFS
000002 Z$GETPTFS found new PTF 0XP0016 Please review and apply E2GETPTFS
000003 Z$GETPTFS found new PTF 1HA0075 Please review and apply E2GETPTFS
****** ******** End of report ********
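To make the selection rule concrete, the following Python sketch mirrors the status test in the CL program above and produces the same message text shown in the report. It is illustrative only and does not call iTERA HA; the function name is hypothetical, and the (PTF ID, status) pairs stand in for the SCPTFID and SCSTATUS fields read from the DSPPTF outfile.

```python
# Illustrative Python mirror of the status test in the CL program above.
# It does not call iTERA HA; it only shows which PTFs would be reported.

REPORTABLE = {"Damaged", "Save file only", "Not applied"}


def notifications(ptf_rows, pgm="E2GETPTFS"):
    """Return (job text, description, program) tuples for unapplied PTFs.

    ptf_rows is an iterable of (ptf_id, status) pairs, standing in for the
    SCPTFID and SCSTATUS fields read from the DSPPTF outfile.
    """
    return [
        (f"Z$GETPTFS found new PTF {ptf_id}", "Please review and apply", pgm)
        for ptf_id, status in ptf_rows
        if status in REPORTABLE
    ]
```

PTFs with any other status, such as Temporarily applied, produce no message.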
The following iTERA HA audits and other jobs may be scheduled to run in
the IBM Job Scheduler. Other job scheduling software may be used, but it is
not supported by Vision Solutions.
From the Audit Command Console (menu 6), select option 2 for the audit,
then select F7=Create Job. The Add Job Schedule Entry screen is displayed.
Make adjustments as needed, then press Enter to add the scheduled entry.
IMPORTANT
The name of the job in the job scheduler must be entered as
indicated in the table below in order for the Audit Command
Console to update the status correctly.
Replace “xx” with the two-character CRG code. For example, if the
CRG code is A1, then for job E2xxAAUDLV, enter
E2A1AAUDLV for the job name in the job scheduler.
• If the audit has not run within the maximum allowable time, the Audit
Command Console will automatically initiate the job.
• The Audit Command Console will not display audits for the
components that are not enabled in the Replication Options 30.23
screen (e.g., Job Scheduler Replication, Spool File Replication, etc.)
unless F8=List All Audits is used.
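The job-name substitution described above is a plain text replacement. The Python sketch below is a minimal illustration with a hypothetical function name; converting the CRG code to upper case is an assumption, made because job names on the IBM i are normally upper case.

```python
def scheduler_job_name(template, crg_code):
    """Replace the 'xx' placeholder in a job name template with the CRG code."""
    if len(crg_code) != 2:
        raise ValueError("CRG code must be exactly two characters")
    if template.count("xx") != 1:
        raise ValueError("template must contain exactly one 'xx' placeholder")
    # Upper-casing is an assumption; the guide's example code is already A1.
    return template.replace("xx", crg_code.upper())


if __name__ == "__main__":
    print(scheduler_job_name("E2xxAAUDLV", "A1"))
```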
Audit: Audit Stream (AUDSTREAM)
Required?: Daily on primary
Job Scheduler Entry: CALL PGM(E21399CPS)
Name of Job in Job Scheduler: E2xxASTRM
Scheduling Notes: The audit stream runs the following audits, in order:
• CHKOBJMTCH (Check Object Match)
• JRNOBJLST (Journaled Object List)
• JRNOBJJRN (Journal to Object Audit)
• RCDCNTCHG (Record Count – Changed Objects)
If the Audit Stream is scheduled to run daily, these individual audits do not
need to be scheduled separately. However, see Scheduling Notes for RCDCNTALL.
The source for this program can be found in source file ITHA/E2.CUST. If you
choose to customize this program then copy the source member to source file
ITHAxx/E2.CUST and compile the program into ITHAxx.
Special configuration is required for the CHKOBJMTCH audit in environments
with multiple target nodes. Contact CustomerCare for details.

Audit: Check Object Match (CHKOBJMTCH)
Required?: Daily on primary (see Scheduling Notes)
Job Scheduler Entry: CALL PGM(E21306CPS) PARM('*ALL' '*ALL' '*ALL')
Name of Job in Job Scheduler: E2xxAOBJMT
Scheduling Notes: This audit is not required to be scheduled separately if the
AUDSTREAM audit is scheduled to run daily. Special configuration is required in
environments with multiple target nodes. Contact CustomerCare for details.

Audit: Journal to Object Audit (JRNOBJJRN)
Required?: Weekly on primary (see Scheduling Notes)
Job Scheduler Entry: CALL PGM(E21390CPS)
Name of Job in Job Scheduler: E2xxAJROJR
Scheduling Notes: This audit is not required to be scheduled separately if the
AUDSTREAM audit is scheduled to run daily.

Audit: Journaled Objects List Audit (JRNOBJLST)
Required?: Daily on primary (see Scheduling Notes)
Job Scheduler Entry: CALL PGM(E21350CPS)
Name of Job in Job Scheduler: E2xxAJRNOB
Scheduling Notes: This audit is not required to be scheduled separately if the
AUDSTREAM audit is scheduled to run daily.

Audit: Logical File Attribute Audit (LF_AUDP)
Required?: Every other day on primary
Job Scheduler Entry: CALL PGM(E21540CPS) PARM(*ALL *ALL)
Name of Job in Job Scheduler: E2xxAUDLFP

Audit: Record Count – All Objects Audit (RCDCNTALL)
Required?: Weekly on primary (see Scheduling Notes)
Job Scheduler Entry: CALL PGM(E21200CP)
Name of Job in Job Scheduler: E2xxARCDAL
Scheduling Notes: This audit should be scheduled to run after the AUDSTREAM
audit has completed.

Audit: Record Count – Changed Objects Audit (RCDCNTCHG)
Required?: Daily on primary (see Scheduling Notes)
Job Scheduler Entry: CALL PGM(E21210CPS)
Name of Job in Job Scheduler: E2xxARCDCG
Scheduling Notes: This audit is not required to be scheduled separately if the
AUDSTREAM audit is scheduled to run daily.

Audit: Spool File Replication Audit (SPLF_AUDP)
Required?: Every other day on primary if replicating Spool Files
Job Scheduler Entry: CALL PGM(E27131RP) PARM('*ALL' '*ALL')
Name of Job in Job Scheduler: E2xxASPLFP

Audit: V6R1 Conversion Audit (V6R1_AUDIT)
Required?: Monthly on all nodes (see Scheduling Notes)
Job Scheduler Entry: CALL PGM(E27050CPS) PARM(*ALLUSR N)
Name of Job in Job Scheduler: E2xxAV6R1
Scheduling Notes: This audit should run monthly until all objects that would
prevent successful upgrade to i5/OS V6R1 are resolved.
Description    Required?    Job Schedule Entry    Name of Job in Job Scheduler    Scheduling Notes
Introduction
This appendix contains information about upgrading the operating system
of systems running iTERA HA.
iTERA HA OS Compatibility
The following IBM OS versions are compatible with iTERA HA v6.0:
• V5R3, V5R3M5
• V5R4, V5R4M5
• V6R1, V6R1M1
• V7R1
http://www-947.ibm.com/systems/support/i/planning/software/i5osschedule.html
NOTE
iTERA HA v6.0 will run on V5R3 or higher only.
http://www-912.ibm.com/s_dir/slkbase.nsf/recommendedfixes
• Hiper
• DB2 UDB for System i
• Java
• Backup Recovery Solutions
These groups should be installed on all systems in the cluster, and must be at
the same level on all systems. Notify CustomerCare if your systems run
different OS versions.
http://www-912.ibm.com/s_dir/sline003.nsf/GroupPTFs
IBM PTFs
Consult the document IBM Required PTFs [Published Date].pdf, available from
Support Central, for the list of IBM PTFs that are required for your iTERA
HA installation.
iTERA PTFs
Permanently apply all iTERA HA, iTERA Alert, and Cross Product PTFs
prior to upgrading the OS.
IMPORTANT
Do not permanently apply iTERA HA PTFs via an IPL. Permanently
applying iTERA PTFs during an IPL may result in the IPL hanging
because the iTERA libraries will not be in the library list.
IMPORTANT
All iTERA HA, iTERA Alert, and Cross Product PTFs must be
applied permanently prior to upgrading. Failure to do so could result
in losing some or all of the PTFs.
IMPORTANT
iTERA HA does not support the primary node running a higher OS
level. Additionally, errors may be encountered when running
different OS versions even if the primary version is lower than the
target.
• The target node must be running the higher level operating system.
• All objects must be compiled on the primary node using the lower level
operating system. (For example, the primary node could run release V5R3
and the target node run release V5R4.) Running the primary node at the
lower level of the operating system guarantees that this will be done.
• You cannot perform a role swap to the backup node until you are ready to
upgrade the primary node to the higher OS level. However, you can
perform a failover in the case of the primary node going down.
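The rule that the target node must run the same or a higher OS level than the primary can be checked mechanically. The Python sketch below is a minimal illustration with hypothetical function names; it assumes release names of the form VxRy or VxRyMz and treats a missing modification level as 0.

```python
import re


def parse_release(text):
    """Convert 'V5R4' or 'V5R4M5' into a comparable (V, R, M) tuple."""
    match = re.fullmatch(r"V(\d+)R(\d+)(?:M(\d+))?", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized release: {text!r}")
    version, release, mod = match.groups()
    return int(version), int(release), int(mod or 0)


def target_level_ok(primary, target):
    """True when the target's OS level is equal to or higher than the primary's."""
    return parse_release(target) >= parse_release(primary)


if __name__ == "__main__":
    print(target_level_ok("V5R3", "V5R4"))  # primary lower than target
    print(target_level_ok("V6R1", "V5R4"))  # primary higher than target
```

A result of False corresponds to the unsupported case in which the primary node runs the higher OS level.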
1. Perform a role swap using the iTERA HA Role Swap Checklist (see “Role
Swap Procedure” on page 205) and then perform all instructions in the
“Pre-OS Upgrade Instructions” on page 273 and “Post-Upgrade
Instructions” on page 273. (If needed, an onsite visit from a Vision
Solutions Consultant is available as a billable service.)
NOTE
While a role swap is recommended, there are circumstances that may
preclude performing one in conjunction with an OS upgrade.
However, performing a role swap will result in less downtime.
NOTE
Errors may be encountered during or following the role swap due to
the different OS versions. CustomerCare will assist you in working
through the issues. However, some issues cannot be resolved until the
lower OS machine is upgraded.
3. End the iTERA HA subsystems on all nodes (primary first, then target).
4. If you performed a role swap, you may allow users to log on to the new
primary node (formerly the backup).
5. Perform the OS upgrade then restart the node. Do not start application
jobs yet.
6. Restart iTERA HA and let the apply process and object replication process
get caught up.
Post-Upgrade Instructions
1. On all nodes, use the command CHGDDMTCPA (F4 to prompt) to verify
that the password required field is set to *NO. (For V6R1 and later, this
can be set to *USRID.)
IMPORTANT
If updates are done to the database using SQL, there may be
additional objects requesting sync. This is caused by the primary
running on a newer version of the OS than the target. In general, the
target’s OS should be equal to or newer than the primary. Of course,
this is not possible when upgrading the OS.
2. For V6R1, Vision recommends you perform the IBM conversion for
application programs and service programs for objects that are compiled at
lower OS versions. The program conversion command is STROBJCVN.
Refer to IBM documentation for instructions on using this command.
IMPORTANT
This step must be done on all systems after each system has been
upgraded. Failure to do so may result in iTERA resyncing the entire
system.
5. On all nodes, access menu option 30.3 from within iTERA HA and delete
all entries beginning with “E”. However, if the *LOCAL entry begins with
“E”, do not delete it. (These entries will be automatically rebuilt when the
subsystem is restarted.)
10. When all systems are running the same OS version, another role swap may
be performed, if needed.
Optional: To test the application upgrade process on the target node prior to
“going live” on the primary, follow the procedure through step 9 (the “Do
upgrade as instructed by Vendor” step), upgrade target node only, then
perform tests on the application. After successfully testing the application
upgrade, load it on the primary node and continue with step 10 of this
checklist. Any changes to objects on the target node that have been affected
by testing will be resynced.
NOTE
If there are issues with the upgrade that cannot be resolved through
the vendor within a reasonable amount of time, you can opt to not
upgrade the primary, follow any instructions they provide for
removing the application from the target, then resync any libraries
or objects that have been affected.
IMPORTANT
PLEASE read the application’s upgrade documentation very
carefully before starting the upgrade in order to become familiar
with the process.
5. While still in 4.11, select option 4=End Journaling on the libraries being
upgraded to end journaling on the mirror journals. This submits job
xxJE2JRNJ.
a. On the primary, select Work with Local Journals (3.1). Select option
32=Toggle Mirror Status to set the mirror process to OFF for the
USER journals affected. Select Work with Local Journals (3.1,
Primary).
b. Select option 14=Remove Journal to remove the journal for the USER
journals being affected. Follow the application’s upgrade instructions
for handling user journals.
8. In IFS Replication (5.2, Primary), access the appropriate File System
(option 5), then use option 7=Cancel Mirroring, then option 8=End
Journaling for all IFS directories affected by the upgrade.
IMPORTANT
If your vendor requires you to start journaling to user journals,
journaling must be started prior to starting the iTERA HA
subsystems. Failure to do so will cause the iTERA HA subsystems to
start journaling objects to HA journals immediately upon starting
the subsystem.
11. Follow the application’s upgrade instructions for activating the user
journals.
12. Start the iTERA HA subsystems on all nodes (E2STRSBS; target first,
then primary).
13. In the iTERA HA subsystem (E2SBS; primary), place the sync job on
hold: xx_SNC_nnn (where xx is the two-character CRG code and nnn is
the three-character code indicating the target node).
14. In the Work with Libraries screen (4.11, primary) use option 21=Quick
Net Sync to sync the libraries that had previously had syncing cancelled on
them (see step 4).
NOTE
Select only one or two libraries at a time in order to avoid network
congestion.
15. In IFS Replication (5.2, primary), use option 4 to quick sync any IFS
directories that were affected by the upgrade.
NOTE
Select only one or two libraries at a time in order to avoid network
congestion.
16. After all libraries have been synced, release the xx_SNC_nnn job (2.11,
option 6=Release, primary).
17. Release the journal manager job (3.33, option 6=Release; all nodes).
18. Verify that replication has caught up by checking the System Monitor (1.1,
F10=Update Monitor, Primary).
19. Turn the Audit Command Console on (1.8, F11=Start Auto Audit, all
nodes)
20. Release any held iTERA HA audits from the job scheduler
(WRKJOBSCDE, option 6=Release).
IMPORTANT
If you plan on changing the name and/or *LOCAL RDB entry on
more than one node, change one node, then cycle the subsystems on
all nodes twice before you change the name and/or *LOCAL entry on
another node.
IMPORTANT
JDBC and ODBC drivers do not use the system name. They use
the *LOCAL entry in the relational database. You can check the
RDB entry in 30.3.
NOTE
If using APPC connections you must change LCLCPNAME and
LCLLOCNAME.
If you plan on changing the system name on the current primary to the
system name on the current backup, do the following:
2. Change the system name on the current backup to an unused name as per
IBM (this is temporary), unless you are rolling back to the original
primary. In that case, use the original backup name.
6. On the current backup, sign on using the iTERA HA profile and enter
option 30.21, press enter, then F3 to exit. Select 30.22, press enter, then
F3 to exit. Sign on to the current primary, select option 30.21 and check
to see if the backup system name has changed.
8. Change the current primary system name to the primary system name as
per IBM.
11. If using passwords on DDM files, you will need to update the Server
Authority Entries. Use DSPSVRAUTE to display, CHGSVRAUTE to
change.
13. On the current primary, sign on using the iTERA HA profile and enter
option 30.21, press enter, then F3 to exit. Select 30.22, press enter, then
F3 to exit. Sign on to the current backup, select option 30.21 and check to
see if the system name has changed.
• CLRPFM(ITHAxx/E2PXMONS)
• CLRPFM(ITHAxx/E2PXAMON)
• CLRPFM(ITHAxx/E2PTIMH)
15. Check TCP/IP Host Table Entries to see if you need to update the system
that has been renamed (CFGTCP, opt 10).
17. Check TCP/IP Domain Information to see if you need to change the host
name (CFGTCP, opt 12).
18. Change the journal receivers for all IFS journals. On the primary, select
3.2, opt 7=Change Journal Receivers.
19. Restart the IFS Apply jobs. On the backup in option 3.4 reset the apply
jobs to the last sequence of the currently attached receiver (11=Suspend,
12=Override, option 2 to use the last sequence number of the currently
attached receiver, F5 to refresh until active, 14=Restart).
4. Create *LOCAL on the new primary using the value that was in *LOCAL
on the old primary.
5. Create *LOCAL on all other nodes using a value that is different than the
value used on all other nodes.
6. If you are not using passwords on DDM files, select option 30.4 (Check
DDM Attributes) and verify that the field Lowest Authentication Method
(OS V6R1 and later) or Password Required (earlier releases) is set to *NO.
a. Change the related journal receivers. From the primary node, use
menu 3.2 option 7=Change Journal Receiver.
b. Reset the apply jobs (for IFS only). On the backup, select menu 3.4 to
reset the apply jobs to the last sequence of the currently attached receiver
(option 11=Suspend Process State, option 12=Override Next Seq #,
option 2 to use the last sequence number of the currently attached
receiver, F5=Refresh until active, option 14=Restart Job).
NOTE
iTERA HA uses SQL CLI in processing data between nodes. The
*LOCAL name is included in the receiver. If you do not change the
receiver and reset the apply sequence, the apply jobs for IFS will
not start.
IPL Considerations
When executing an IPL on the primary, in order to prevent communications
issues between primary and target, perform the following steps:
4. After the primary is up and running, start the subsystem on the target first,
then the primary.
• Remote system(s):
ADDSVRAUTE USRPRF(E2XXADMIN) SERVER(SYSTEMNAME)
PASSWORD(PASSWORD)
NOTE
If you wish to use different DDMTCPA options, you will need to make
sure that DDM functions between the HA systems.
IBM PTFs
Recommendations for IBM PTFs, PTF Groups, and Cumulative Fix Package
levels are documented on the Recommended Operating System Fixes page on
the Vision Solutions website:
http://portal.visionsolutions.com/RecommendedOSFixes.aspx
The list does not include resolutions of changing the thresholds or ignoring
the tests except under selected conditions.
Message: RMT: Apply [ApplyJobName] is not active
Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the
*JOBQ E2SYSJOBQ.
Resolution: You may need to add more active jobs (ADDJOBQE) on the target.
The subsystem may be down.

Message: RMT: Apply [ApplyJobName] is not active
Possible cause: The job may have been cancelled.
Resolution: Select option 3.4 on the target to restart all ended jobs. Select
option R=Restart Job on the journal to restart the apply job.

Message: RMT: Apply [ApplyJobName] has a message
Resolution: Get job information such as error message and line number of where
the problem occurred. Contact Vision Solutions to report problem. You may need
to end the job and restart or remove a record from a transaction file.
AUDIT_MON
DDM
Message: x transaction records found in the file
Possible cause: The replication program is not running as it should or is not
caught up.
Resolution: Make sure the xx_DEVREP on the primary is running and not in MSGW.
It may be behind if an audit or quick sync was just requested.

Message: RMT: x transaction records found in the file
Possible cause: Old entries are found in the E2PCFEN file on the target system.
This may be due to unapplied transactions prior to a roll.
Resolution: Clear the file E2PCFEN. You may need to cancel the xx_OBJMON job.

Message: x error records found in the file.
Possible cause: Errors were left from a prior role swap.
Resolution: Clear the file E2PCFPC on the primary system.

Message: RMT: x error records found in the file.
Resolution: On the target, select option 5.4 and analyze any errors. If the
errors are old or not a problem, you may delete all entries by pressing F21.

Message: xx_DEVREP is in jobq
Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the
*JOBQ E2SYSJOBQ.
Resolution: You may need to add more active jobs (ADDJOBQE). The subsystem may
be down.

Message: xx_DEVREP is not active
Possible cause: The job may have been cancelled.
Resolution: Select option 2.20 to restart all ended jobs.

Message: xx_DEVREP has a message
Resolution: Get job information such as error message and line number of where
the problem occurred. Contact Vision Solutions to report problem. You may need
to end the job and restart or remove a record from a transaction file.

Message: RMT: Apply ZM_xxTJRx is not active
Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the
*JOBQ E2SYSJOBQ.
Resolution: You may need to add more active jobs (ADDJOBQE) on the target.
The subsystem may be down.

Message: RMT: Apply ZM_xxTJRA is not active
Possible cause: The job may have been cancelled.
Resolution: Select option 3.4 on the target to restart all ended jobs. Select
option R=Restart Job on the journal to restart the apply job.

Message: RMT: Apply ZM_xxTJRx has a message
Resolution: Get job information such as error message and line number of where
the problem occurred. Contact Vision Solutions to report problem. You may need
to end the job and restart or remove a record from a transaction file.
x transaction records The replication program Make sure the xx_DIRREP on the primary is running
found in the file is not running as it and not in MSGW. It may be behind if an audit or
should or is not caught quick sync was just requested.
up.
RMT: x transaction Old entries are found in You can clear the file E2PDIEN. You may need to
records found in the file the E2PDIEN file on the cancel the xx_OBJMON job.
target system. This may
be due to unapplied
transactions prior to a
roll.
x error records found in Errors were left from a Clear the file E2PDIPC on the primary system.
the file. prior role swap.
RMT: x error records On the target, select option 5.7 and analyze and errors.
found in the file. If the errors are old or not a problem, you may delete all
entries by pressing F21.
Missing exit point for xxx
  Possible cause: Directory entry additions and changes are not copied to the target. Automatic replication needs to be restarted.
  Resolution: Auto replication of directory entries is not set up. To set it up:
  1. From 30.23 on all nodes, verify that the Global State for *DIRE is enabled.
  2. Select option 5.7 on the primary system.
  3. Press F7=Control.
  4. Press F7=Add Exit Program.
  5. Press F3 to exit. This should also replicate the process to the target.
  6. On the target, select option 5.7.
  7. Press F7=Control. Verify that the Exit Program is Active.
  8. Reload the directory entry list (5.7, F18) and make sure the list is updated.
  9. Press F20 to run the audit. This will identify missing entries and make changes as necessary.
xx_DIRREP is in jobq
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE). The subsystem may be down.
xx_DIRREP is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 2.20 to restart all ended jobs.
xx_DIRREP has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
RMT: Apply ZM_xxTJRx is not active
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE) on the target. The subsystem may be down.
RMT: Apply ZM_xxTJRA is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 3.4 on the target to restart all ended jobs. Select option R=Restart Job on the journal to restart the apply job.
RMT: Apply ZM_xxTJRx has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
ENCRYPTION
FTP
HEAL
xx_IFSMON is in jobq
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE). The subsystem may be down.
xx_IFSMON is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 2.20 to restart all ended jobs.
xx_IFSMON has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
RMT: Apply ZM_xxIJRx is not active
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE) on the target. The subsystem may be down.
RMT: Apply ZM_xxIJRA is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 3.4 on the target to restart all ended jobs. Select option R=Restart Job on the journal to restart the apply job.
RMT: Apply ZM_xxIJRx has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
IPCONWRN
IPCONWRN test to [remote system] failed
  Possible cause: Connection warning on the IP address.
  Resolution: See “Resolve IP Connection Warnings” on page 300 for routing notes as well as detailed instructions on fixing IP address connection warnings.
iTERA Alert is not set up correctly.
  Resolution: Refer to the iTERA Alert configuration instructions in the iTERA HA v6.0 Advanced Features Guide.
JOBQ
x transaction records found in the file
  Possible cause: The replication program is not running as it should or is not caught up.
  Resolution: Make sure the xx_JBSREP on the primary is running and not in MSGW. It may be behind if an audit or quick sync was just requested.
RMT: x transaction records found in the file
  Possible cause: The replication program is not running as it should or is not caught up.
  Resolution: Make sure the xx_JBSREP on the target is running and not in MSGW. It may be behind if an audit or quick sync was just requested.
x error records found in the file
  Possible cause: Errors were left from a prior role swap.
  Resolution: Clear the file E2PJBSF on the primary system.
RMT: x error records found in the file
  Resolution: On the target, select option 5.4 and analyze any errors. If the errors are old or not a problem, you may delete all entries by pressing F21.
xx_JBSREP is in jobq
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE). The subsystem may be down.
xx_JBSREP is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 2.20 to restart all ended jobs.
xx_JBSREP has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
RMT: Apply ZM_xxTJRx is not active
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE) on the target. The subsystem may be down.
RMT: Apply ZM_xxTJRA is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 3.4 on the target to restart all ended jobs. Select option R=Restart Job on the journal to restart the apply job.
RMT: Apply ZM_xxTJRx has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
MQ (WebSphere MQ Replication)
Library QMQM does not exist
  Resolution: Load library QMQM or turn off *MQ replication in 30.23.
xx_OBJMON is in jobq
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE). The subsystem may be down.
xx_OBJMON is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 2.20 to restart all ended jobs.
xx_OBJMON has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
x transaction records found in the file
  Possible cause: The replication program is not running as it should or is not caught up.
  Resolution: Make sure the OBJMON2 job on the primary is running and not in MSGW. It may be behind if an audit or quick sync was just requested.
RMT: x transaction records found in the file
  Possible cause: Old entries are found in the E2PZCEN file on the target system. This may be due to unapplied transactions prior to a roll. There was an old problem where library changes (or object deletions) were creating entries on the target file.
  Resolution: Clear the file E2PZCEN on the target. You may need to cancel the OBJMON2 job.
OBJMON2 job is in jobq
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE). The subsystem may be down.
OBJMON2 job is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 2.20 to restart all ended jobs.
OBJMON2 job has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
RMT: Apply ZM_xxTJRx is not active
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE) on the target. The subsystem may be down.
RMT: Apply ZM_xxTJRA is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 2.20 on the target to restart all ended jobs.
RMT: Apply ZM_xxTJRx has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
xx_SNC_xx is in jobq
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE). The subsystem may be down.
xx_SNC_xx is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 2.20 to restart all ended jobs.
xx_SNC_xx has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
PING
RJSTS
xx_RMTCMD is in jobq
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE). The subsystem may be down.
xx_RMTCMD is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 2.20 to restart all ended jobs.
xx_RMTCMD has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
x error records found in the file
  Possible cause: Errors were left from a prior role swap.
  Resolution: Clear the file E2PCFPC on the primary system.
xx_RPTREP is in jobq
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE). The subsystem may be down.
xx_RPTREP is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 2.20 to restart all ended jobs.
xx_RPTREP has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
xx_RPTREP is in jobq
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE). The subsystem may be down.
RMT: xx_RPTCHG is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 2.20 on the target to restart all ended jobs.
RMT: xx_RPTCHG has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
RMT: Apply ZM_xxTJRx is not active
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE) on the target. The subsystem may be down.
RMT: Apply ZM_xxTJRA is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 3.4 on the target to restart all ended jobs. Select option R=Restart Job on the journal to restart the apply job.
RMT: Apply ZM_xxTJRx has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
xx_SYSMON is in jobq
  Possible cause: *JOBQ E2SYSJOBQ may be on hold. Too many jobs may be in the *JOBQ E2SYSJOBQ.
  Resolution: You may need to add more active jobs (ADDJOBQE). The subsystem may be down.
xx_SYSMON is not active
  Possible cause: The job may have been cancelled.
  Resolution: Select option 2.20 to restart all ended jobs.
xx_SYSMON has a message
  Resolution: Get job information such as the error message and the line number where the problem occurred. Contact Vision Solutions to report the problem. You may need to end the job and restart it, or remove a record from a transaction file.
QAUDCTL = *NOQTEMP or QAUDCTL = *OBJAUD or QAUDCTL = *AUDLVL
  Possible cause: QAUDJRN is no longer receiving changes. IBM will remove these values under certain circumstances to keep the system up. However, this causes problems. All object adds, changes, and deletes will not be recorded. Also, job schedule changes, report adds, changes, and deletes, and authority changes will not be replicated to the target.
  Resolution:
  1. For each of the applications, reload the list (typically F18): 4.11 Library Maint, 5.3 Outq and Spool File Replication, 5.5 Job Scheduler Replication, 5.4 Configuration Replication.
  2. Run all of the audits.
QAUDLVL = *DELETE
  Possible cause: Deleted objects are not deleted on the target.
  Resolution: Run the object match audit.
QAUDLVL = *OBJMGT
  Possible cause: Object changes are not replicated to the target.
  Resolution: Run the object match audit.
QAUDLVL = *SPLFDTA or QAUDLVL = *PRTDTA
  Possible cause: Reports may not be correct on the target.
  Resolution:
  1. Select 5.3 Spool File Replication.
  2. Take option F18 to load any missing OUTQs.
  3. Perform the spool file audit.
QAUDLVL = *CREATE
  Possible cause: New objects may not be replicated.
  Resolution: Perform the object audit.
Missing exit point for xxx
  Possible cause: The exit point program is not set up for user profile replication. Users are not being copied to the target system.
  Resolution: Select option 5.1 on the primary system. If F9=Start Replication is shown, press F9. If F21=End Replication is shown, press F21, then press F9. To replicate any missing user profiles (and all existing), press F13=Trigger all profiles. (The trigger all profiles process can take a long time, as there is a 3-second delay between profiles.)
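Several rows in the table above point to the same few conditions: the job queue E2SYSJOBQ on hold, the subsystem down, or auditing values missing from QAUDCTL or QAUDLVL. As a minimal sketch, the standard IBM i commands below can be used to check each condition from a command line. The object names E2SYSJOBQ and E2SBS are taken from this guide; resolving them through the library list is an assumption, so qualify them with the iTERA HA library if your library list does not include it.

```cl
/* Is the job queue held, and how many jobs are waiting in it? */
WRKJOBQ JOBQ(E2SYSJOBQ)

/* Release the job queue if it is on hold. */
RLSJOBQ JOBQ(E2SYSJOBQ)

/* Is the iTERA HA subsystem active, and is any job in MSGW? */
WRKACTJOB SBS(E2SBS)

/* Which auditing values are currently set? */
DSPSYSVAL SYSVAL(QAUDCTL)
DSPSYSVAL SYSVAL(QAUDLVL)
```

ADDJOBQE, named in the table, adds a job queue entry to a subsystem description. If the entry already exists, the usual way to allow more active jobs is CHGJOBQE with the MAXACT parameter; which of the two applies depends on how the subsystem was configured at your site.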
Routing Notes
IP address configuration for use in iTERA replication should adhere to the
following:
Websites
• Read important information about and obtain the IBM Schowler Route
instruction information here.
a. From the Role Swap Readiness Monitor on the primary, locate the
IPCONWRN test. If the monitor is inactive, press F10=Activate and
then refresh the screen until results for the IPCONWRN test are
returned. You should see an error as shown below.
Menu option 10.10 on the primary may also be used to check for
connection warnings. If connection warnings exist the following
message is displayed:
b. On the primary, use menu option 30.21 to view the IP addresses that
iTERA should be using for replication.
Cluster Cluster
Node Node Interface Interface Node
Opt Node Id Code Status IP 1 IP 2 State
SYSTEM1 E0CAD62005 ---------- 172.22.10.80 Inactive
SYSTEM2 E0CAD62006 ---------- 172.22.10.79 Inactive
...
After you retrieve this information, press F12 twice to return to the
Cluster and Node Maintenance screen.
c. On the target node, type the command NETSTAT and select option 3.
Review all DRDAs and remote journals in the Remote Port column.
Press F11 twice to verify the IP addresses being used. In the example
below, the DRDA and RMTJOUR jobs should be using 172.22.10.79.
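The NETSTAT menu options used in these steps can also be reached directly from the command line. The following is a sketch using the OPTION parameter of the standard IBM i NETSTAT (WRKTCPSTS) command; it is not taken from the iTERA HA documentation.

```cl
/* Same as NETSTAT option 1: work with TCP/IP interface status. */
NETSTAT OPTION(*IFC)

/* Same as NETSTAT option 2: display TCP/IP route information. */
NETSTAT OPTION(*RTE)

/* Same as NETSTAT option 3: work with TCP/IP connection status. */
NETSTAT OPTION(*CNN)
```

On the connection status display, F11 cycles through the alternate views, as described in the step above.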
a. After verifying that there are connection warnings and the IP addresses
in Netstat are incorrect, end the iTERA HA subsystem on all nodes
using the command E2ENDSBS (primary first, then targets).
b. After the subsystems have ended, select menu option 3.3, Remote
Journal Maintenance on the primary. Use option 14=Inactivate
Remote Journal for each active journal and press Enter. Verify that the
status of each remote journal indicates Inactive in the Remote Send
column.
IMPORTANT
If you’re not sure which IP addresses are in the same IP segment, check the
subnet calculator web link above to verify.
b. Press F12 to exit the Interface screen. On the command line type
Netstat. Select option 2. You will see some *DIRECT routes for the
interfaces on the User/Takeover and HA IP subnet. Make note of these
*DIRECT routes; they will be removed automatically as you create
Schowler routes.
c. Press F12 to exit the Netstat screen. Select option 2 to view the Work
with TCP/IP Routes screen (30.2, option 2). Check the existing routes
to be sure you are not creating duplicate routes.
Make note of the Next Hop on the Default Route. This may be used
later if the systems are on separate networks.
• Press enter. You should see that the route was added.
The completed routing entries are displayed in the Work with TCP/IP Routes
screen, as follows:
g. On the system you are on, use Netstat with option 2 to verify that the
*DIRECT routes are gone. The *DIRECT routes will be replaced
with routes similar to the image below.
h. From the backup system, use Netstat with option 3 and select option 4
on any remaining DRDA and RMTJOUR (under the Local Port
column) to close them. (Do NOT close the ones at the top of the list
with a * for Remote Address and a * for Remote Port.)
4. Restart the iTERA subsystem and verify the new routes are being used.
a. Start the iTERA subsystem on all systems (targets first, then primary).
NOTE
It will take longer than usual to start the subsystem on the target node.
NOTE
Update the System Monitor by selecting option 1.1 on the primary,
then F10=UpdateMonitor. This should activate the remote journals.
(The Yes/No values should display Yes. If not, press F10 again. If the
remote journals still will not start after you press F10 three times, select
F16=Process Monitor, select option 5=WRKJRNA for each journal,
press Enter, press F16=Work with remote journal information, and then
select option 13=Activate.)
b. On the backup node, use Netstat and select option 3. Scroll down to
the DRDAs and remote journals. Press F11 twice to verify the IP
addresses being used. Notice how the IP addresses below match what is
shown in the 30.21 screen shot.
c. On the primary, access the Role Swap Readiness Monitor (1.7) and
locate the IPCONWRN test. If the test is still in WRN status, select
option 7=Run Test.
Sts Sts
Opt Test Description Pri Tgt Results
IPCONWRN IP Connection Warnings OK N/A
d. You can also use menu option 10.10 on the primary to check for
connection warnings. If no warnings exist, the following is displayed:
– What is the job status? (SDQ indicates that a data queue is being
emptied after an object restore.)
– Select option 9=Display Apply Entry.
• If library/object is *UNKNOWN/*UNKNOWN, the entry will
not process. (This issue should be discussed with IBM. See separate
instructions to bypass this entry.)
• If the apply entry will not display, there may be something else
wrong with it. You may have to resend the receiver from primary.
Status Indicator
DLY-99
• Are there any partial receivers? Partial receivers cannot be processed and
will stall the apply job. These need to be deleted. In most cases, when
there is a partial receiver in the chain, all the receivers in the chain will
need to be deleted.
a. Select 1.1 System Monitor
b. Press F10=Update Monitor on the primary to send the receivers
across.
• Is the receiver the apply job needs present? If not, do the following:
a. Select 3.2 Journal Receiver Maintenance on the primary system.
b. Select 5=WRKJRNA.
c. Select F15 Work with Receiver Directory. Is the receiver the apply
job needs present? If not, all the libraries and/or objects for this
journal need to be resynced.
4. Check the subsystem E2SBS on the target system.
IMPORTANT
In most cases, resetting the apply job by overriding the sequence
number will require a resync.
NOTE
Restored receivers do not go back into the chain and cannot be sent
to target via remote journaling.
• A save/restore migration of the backup system was performed. The apply jobs
were active on the target system after the save was done, and the receivers had
also been deleted from the primary. Resync libraries for the related journals.
• Receivers were deleted on the primary and target before they were applied.
• A primary system save/restore upgrade was done incorrectly: journaling and
iTERA were started after production activity had begun. Objects in the
journal will need to be resynced.
• The Apply job is too far behind to catch up. This could be because of
planned or unplanned down time. The decision on whether to execute the
override should be discussed with CustomerCare and should be based on
the number of entries and the available bandwidth. Objects in the journal
will need to be resynced.
Primary System
1. Select the System Monitor
5. Select option 8.
– These are only used on the target system. They are created when needed.
File Description
– These are only used on a backup system. If there has never been a role
swap or failover these should be empty.
File Description
E2P5101C Heal
– Program E21526RP removes old records based on the value in the file
definition found in 50.11. The two file types that allow purges are
History and Log. The purge program E21526RP takes one parameter:
either *ALL or the file name. It can be run at any time.
have a large number of deleted records on the primary. iTERA does not
currently have a procedure to reorganize the files.
File Description
Target System
Perform these steps on the target system.
1. Select 1.1 System Monitor. Check the following for the target node:
– What is the % Total Used By Receivers: _________
NOTE
The remote receivers are not eligible for deletion by the iTERA
Journal Manager until the receiver entries have been applied.
– These are only used on the target system. They are created when
needed.
File Description
– These are only used on a primary system. If there has never been a role
swap or failover these should be empty.
File Description
Program E21526RP removes old records based on the value in the file
definition found in 50.11. The two file types that allow purges are
History and Log. The purge program E21526RP takes one parameter:
either *ALL or the file name. It can be run at any time. xx_PRGLOG is
submitted daily at midnight by xx_OBJMON2 Purge.
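A manual purge run might look like the sketch below. The program name and its single parameter come from the description above. Two things are assumptions: that the program is found through the job's library list, and that the parameter can be passed as a character literal. FILENAME is a hypothetical placeholder, not a real iTERA HA file name.

```cl
/* Purge old records from every file whose type allows purges. */
CALL PGM(E21526RP) PARM('*ALL')

/* Purge one file only. Replace FILENAME (a placeholder) with the file name. */
CALL PGM(E21526RP) PARM('FILENAME')
```

Confirm the expected parameter format with Vision Solutions before running the program outside the scheduled xx_PRGLOG job.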
File Description
E2P5101C Heal
File Description
– Since it isn't always apparent what takes up disk space, the instructions
for creating the IBM Disk Analysis Report are provided below.
• To build the Disk Analysis file, use the following command:
SBMJOB CMD(RTVDSKINF) JOB(RTVDSKINF) JOBQ(QSYSNOMAX)
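RTVDSKINF only collects the disk space data; the report itself is produced afterwards with the standard IBM i PRTDSKINF command. The sequence below is a sketch of that flow. The report types shown are examples, not a recommendation from this guide.

```cl
/* Collect disk space information in batch (command from this guide). */
SBMJOB CMD(RTVDSKINF) JOB(RTVDSKINF) JOBQ(QSYSNOMAX)

/* Check whether the submitted job has finished. It can run for a long time. */
WRKSBMJOB

/* Print a system summary report from the collected data. */
PRTDSKINF RPTTYPE(*SYS)

/* Print a report of disk usage by library. */
PRTDSKINF RPTTYPE(*LIB)
```

The reports are written to spooled files, which can be reviewed with WRKSPLF.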
Troubleshooting DDM
If you have tried all of the troubleshooting steps and you still have a problem
with the server, you need to call IBM! Let them know all the tests you have run
and the results so that they understand that you have checked for all the
common problems.
• Run the Ping, FTP, and DDM Check (30.7) to check to see if all three
work both ways. Results can be reviewed in the iTERA HA message log
(E2MSGLOG).
• Check to see if jobs are running using NETSTAT (is the DDM server up?)
• Check that the DDM attributes do not require a password (30.4, all nodes).
The field “Password required” should say “*NO” unless you have set up
iTERA HA to handle passwords.
• Check to see if DDM has an exit program attached. Use the command
DSPNETA to view the field DDM request access. If a program has been
attached, make sure that the iTERA HA user profile is defined to
the exit program’s application.
• Check to ensure the Relational Database is defined correctly
(WRKRDBDIRE).
• Check JOBQ QSYSNOMAX to ensure that it is not on hold (WRKJOBQ
QSYSNOMAX).
• Try to access data using a DDM file. Use the command “WRKDDMF
ITE2/*all” to find the first file that is attached to the remote file
ITE2/E2PCNOD. Use the DBU or DFU (UPDDTA) to access the
corresponding DDM file. If the Screen requesting the key information
appears, then the DDM file is working.
• Inactivate the Ethernet card (CFGTCP, 1, 10), then reactivate the card
(CFGTCP, 1, 9). If this corrects the problem, it indicates a possible
hardware failure.
• Check to see if you are on the latest IBM OS Cumulative PTF. Check the
IBM Web site for the latest Cumulative PTF.
• Check to see if IBM has any new PTFs. Check the IBM Web site for the
latest PTF.
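Most of the checks in the list above map to a standard IBM i command. The following is a sketch that collects them in one place. The library ITE2 and the job queue QSYSNOMAX are the names used in this guide; the comments describe what to look for and are not iTERA HA output.

```cl
/* Is the DDM server listening, and are DRDA connections present? */
NETSTAT OPTION(*CNN)

/* View network attributes, including DDM request access (DDMACC). */
DSPNETA

/* Review the relational database directory entries. */
WRKRDBDIRE

/* Is the job queue on hold? */
WRKJOBQ JOBQ(QSYSNOMAX)

/* List the DDM files in the iTERA HA library. */
WRKDDMF FILE(ITE2/*ALL)

/* See which cumulative and group PTF levels are installed. */
WRKPTFGRP
```

Recording the result of each command before calling IBM gives support the test history that the introduction to this section asks for.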
Other Troubleshooting
1. End the iTERA HA subsystem on the primary node first (2.13), then the
target node (2.13).