Netware Clustered Servers



Unfortunately, Novell Netware is pretty much dead as an operating system. These pages will not be
updated anymore, but will be retained for a while for the benefit of the faithful who continue to use this
excellent operating system.
Novell did create the Open Enterprise Server, a SUSE Linux-based OS that runs most of the old NetWare server functions.

NETWARE CLUSTERED SERVERS

Netware Clustering Concepts

Novell introduced server clustering in NetWare 5 then enhanced it in NetWare 6. This section
discusses Novell Cluster Services 1.6 from a storage perspective.

A cluster is a group of file servers; servers are often called nodes in Novell documentation. A NetWare 6 cluster contains between 2 and 32 servers. All servers in the cluster must be configured with IP and be on the same IP subnet. All servers in the cluster must be in the same NDS tree, and the NDS tree must be replicated on at least two, but not more than six, servers in the cluster. NetWare 5 and NetWare 6 clusters can coexist in the same NDS tree. Each server must have at least one local disk device for the SYS: volume; you normally connect your data disks to a cluster using a SAN.

Clustering allows services to survive the failure of a server. Any disks that were mounted on the failed server are switched to one of the other servers in the cluster. Any applications that were active on the failed server, and any users who were logged on to it, are switched to another server. This is called failover, and all users typically regain access to their resources in seconds, with no loss of data and usually without having to log in again.
It is also possible to manually invoke a failover if you need to bring down a server for maintenance or
a hardware upgrade.

Novell Cluster Services 1.6 consists of a number of management modules, or NLMs. The storage-related modules are:

• The CLSTRLIB or Cluster Configuration Library stores the NDS cluster data. The first activated node in the cluster uses CLSTRLIB to access NDS eDirectory and becomes the master node for the cluster. CLSTRLIB sends NDS cluster data to all cluster nodes.
• The Cluster Resource Manager (CRM) is responsible for failover of resources after a failure. To do this, CRM needs to track all the cluster's resources and where they are running. The policies on how failover should happen are held in the NDS.
• The Cluster Volume Broker (CVB) keeps track of the NSS configuration of the storage pools and logical volumes for the cluster. If a change is made to NSS for one server, the CVB ensures that the change is replicated across all the nodes in the cluster. The CVB also looks after data integrity: it will veto conflicting operations and enforce the rule that only a single server can access a pool at a time.
• The Cluster System Services (CSS) module looks after data integrity issues for cluster-aware applications that share distributed memory or locks. This basically ensures that storage pools are only active on one node at a time.
• The Split Brain Detector (SBD) really has nothing to do with storage, but is far too good a name to ignore. Each server in the cluster sends out a heartbeat signal every minute, to say 'I'm alive'. If Server1 stops sending its heartbeat, the other servers in the cluster know that Server1 is dead, and start to take over its resources. What happens if Server1 just loses its network connection? Server1 cannot hear the other servers, and starts to take over their resources. The other servers cannot hear Server1, and start to take over its resources, and the whole thing would end up in a mess, except that the Split Brain Detector steps in, marks Server1 out of service, and lets the rest of the cluster take over.

There are six other cluster management NLMs, which are not discussed here.

Cluster commands

You manage the cluster with Cluster commands from the system console. You can see the full list of
cluster commands by typing

HELP CLUSTER

at the console.

Some useful commands are

CLUSTER VIEW

Displays the current node, and a list of nodes, i.e. servers.

CLUSTER RESOURCES

Displays the list of resources managed by the cluster, and which node has ownership of which resource.

You can force a resource to move to a different node with the command

CLUSTER MIGRATE resource-name node-name
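
For example, assuming a cluster-enabled resource called CAV1_SERVER (the virtual server name used in the TSM examples later on this page) and a node called NODE2 (a hypothetical name), the command would look something like

CLUSTER MIGRATE CAV1_SERVER NODE2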

Netware provides a few screens to monitor the cluster operations.


• The Logger screen displays loaded NLMs and NSS operations like enforcement of directory quotas.
• The Cluster Resources screen displays volume mount and dismount messages.

Volumes and Pooling

Pools

Storage Pools are containers for logical volumes. A Cluster Services pool is simply an area of storage space created from the available NetWare partitions. With NSS 3.0, these can be virtual partitions, and can support a mixture of NSS and non-NSS volumes. A storage pool must be either all local or all shared.

A Shared storage pool can only be in use by one cluster node at a time to ensure data integrity. Data
corruption would most likely occur if two or more nodes had access to the same shared storage pool
simultaneously. This is managed by the Cluster System Services NLM.

Failover in Cluster Services 1.6 is by Storage Pool, whereas Netware 5 did failover by volume. If a
shared storage pool is active on a node when the node fails, the cluster automatically migrates the
pool to another node. The clustering software reactivates the pool and remounts the cluster-enabled
logical volumes within that pool.

Cluster Volumes

Inside the storage pools are the logical volumes. The volumes are only visible and accessible when the pool is active. As the logical volumes have no hard size limit, they can request more space from the storage pool as needed. They hold the files and folders for users and applications.

When you define a pool with its volumes, you have to cluster-enable all the volumes. This creates a virtual server for each cluster-enabled volume, with its own server name and IP address. Applications and users access the volume through the virtual server name and IP address rather than through a physical server. This means that if the hosting server fails, and the volume fails over to another server, clients are not affected, and the IP address of the shared disk does not change.
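
As a hedged illustration, assume a cluster-enabled volume CAV1 fronted by a virtual server called CAV1_SERVER (the names used in the TSM examples below), with a DATA directory on it. A client drive mapping refers only to the virtual server, so it keeps working no matter which physical node is hosting the volume:

MAP ROOT G:=CAV1_SERVER/CAV1:DATA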

In NetWare 5.1 the virtual server name was generated by the system, and had the format NDStreename_diskname_SERVER. DNS could not understand the underscores in the name, so the IP addresses had to be hardcoded. NetWare 6 removes this restriction: you can override the default name with a name that DNS can understand.

Logical volumes have a new attribute called Flush On Close, which simply means that when a file is closed the cache is flushed to disk. This means that when you close a file, you can be confident that the data is safely stored on disk, and is not sitting in cache. If a server fails, any data resident in cache will be lost. Flush On Close is set 'ON' on the server, and will have some performance overhead.

NDS, Trustee IDs and GUIDs

The NDS information which is used to identify, name and track all Netware objects is stored by the
CLSTRLIB NLM. Netware 5 had a problem with file control on SAN systems, as some NDS
information was not transferred when a volume was migrated between servers.
The issue was that the trustee IDs for each user object were different for each server. On failover, it
took several minutes to scan the entire file system and translate the trustee IDs to the new server, so
the file trustee IDs were usually not translated at failover. The result was that disk space and directory
space restrictions were not preserved.

In NetWare 6, server-linked trustee IDs are replaced with Globally Unique IDs (GUIDs), which are the same across all servers where the user has trustee rights of any kind. Volumes can now fail over in seconds, and all trustee rights are preserved.

Backups had a similar problem. A file had to be restored to the same server it was backed up from, or
trustee IDs would not match and the file could be corrupted. With NetWare 6 and NCS 1.6 any file can
be backed up from any server and restored by any server without file corruption. The GUID remains
intact, along with the appropriate user restrictions, regardless of the physical server used for the backup and restore operation.

Backing up a Netware Cluster with TSM

Netware Clusters have two types of disk, local disks and clustered disks. The SYS: disk will probably
be local, and you may have others. The local disks are always attached to one particular server, while
the clustered disks can move around the various servers in the cluster. 'Takeover Scripts' are used to
make sure that the disks move cleanly between servers. TSM backups can be 'cluster aware', that is, they can move with the disks as the disks move between the cluster servers.

Actually, it is not the disks that move around between servers but Netware Partitions. To keep things
simple, many sites set up each disk in its own Netware Partition, but your site may have several disks
in each partition. When the TSM manuals refer to a 'cluster group' they really mean a Netware
partition.

The TSM software has to run on a physical server, but there is normally no way to decide ahead of
time which physical server will be hosting a volume.

The key to backing up a cluster volume is that the backup metadata must be available from whichever
server is hosting that volume, so the metadata must be held on the cluster volume. The metadata
includes the dsm.opt file and the password file. The schedlog, errorlog and webclient log also need to
be held on the cluster volume to get continuity between messages as the volume moves between
servers. Every Netware partition needs its own dsm.opt file.
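
For instance, using the hypothetical clustered volume CAV1 from the example further down, the layout on the cluster volume might look like this:

CAV1:\TSM\DSM.OPT
CAV1:\TSM\PASS\
CAV1:\TSM\DSMERROR.LOG
CAV1:\TSM\DSMSCHED.LOG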

Backing up the Local Volumes

Just use a standard TSM install on each of the physical servers. The dsm.opt file should specify
CLUSTERNODE=NO (or miss it out as that is the default). With this setting, if you use a domain of
ALL-LOCAL then it will not see the clustered disks. The NODENAME should be the same as the server name.
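
As a minimal sketch, assuming a physical server called NODE1 (a hypothetical name; any other options are whatever your site normally uses), the local dsm.opt might contain

NODENAME NODE1
DOMAIN ALL-LOCAL
CLUSTERNODE NO
PASSWORDAccess GENERATE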

Backing up the Clustered Volumes

Each Netware partition must be defined to TSM as a separate node, and must have a unique name that is not the same as any physical server name. As each partition will have a virtual server name, it is easiest to use that as the node name.

Allocate a TSM directory on a volume in the partition and copy a dsm.opt file into it. Assuming that you are storing the TSM information on a disk called CAV1, edit the dsm.opt file with the following settings:

NODENAME CAV1_SERVER
DOMAIN CAV1
CLUSTERNODE YES
PASSWORDDIR CAV1:\TSM\PASS\
PASSWORDAccess GENERATE
NWPWFile YES
OPTFILE CAV1:\TSM\DSM.OPT
ERRORLOGName CAV1:\TSM\DSMERROR.LOG
SCHEDLOGName CAV1:\TSM\DSMSCHED.LOG

To set up the passwords, from your first clustered server enter the following commands:

Unload TSAFS, then reload it with TSAFS /cluster=off

dsmc query session -optfile=CAV1:/tsm/dsm.opt


dsmc query tsa -optfile=CAV1:/tsm/dsm.opt
dsmc query tsa nds -optfile=CAV1:/tsm/dsm.opt

Make a copy of dsmcad in the SYS:/Tivoli/tsm/client/ba/ directory and give it a unique name for this volume, say DSMCAD_CAV1, then start the scheduler with

dsmcad_CAV1 -optfile=CAV1:/tsm/dsm.opt

Repeat this for every server in the cluster, and get the DSMCAD command added to the takeover
scripts so the correct DSMCAD is started as a volume moves between clustered servers.
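
As a rough, hedged illustration only (the rest of the load and unload scripts depends on your cluster setup; only the dsmcad copy name and option file path are taken from the example above), the additions might look like this. In the resource load script, after the existing pool activate, mount and IP address commands, start the scheduler:

dsmcad_CAV1 -optfile=CAV1:/tsm/dsm.opt

In the matching unload script, before the pool is deactivated, stop it again:

unload dsmcad_CAV1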

