GPFS Easy


Initial GPFS Cluster Setup

This section provides information on setting up an initial GPFS cluster on an AIX system. The overall flow is: installation, setting up SSH, creating the cluster with mmcrcluster, and verification. Before the cluster can be built, GPFS has to be installed on the required nodes, and passwordless login must be enabled from each node to every other node.

Step 1: Install GPFS

Installing the GPFS filesets is fairly easy for an AIX administrator: use the smitty installp tool and select the filesets, or use the installp command, whichever you prefer. The required filesets are gpfs.base, gpfs.msg.en_US and gpfs.docs.data.
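If you prefer the command line over smitty, a minimal installp invocation might look like the following (a sketch only; the install images are assumed to be in /tmp/gpfs, and lslpp is run afterwards to confirm the filesets are installed):

# installp -agXY -d /tmp/gpfs gpfs.base gpfs.msg.en_US gpfs.docs.data
# lslpp -l "gpfs.*"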

Step 2: Enable passwordless SSH

Either rsh or ssh can be used; ssh is the preferred method as it offers more security. On one of the nodes (preferably the node you plan to use as the primary node), generate an ssh key pair and copy the private and public keys to /root/.ssh on all nodes that are part of the GPFS cluster.

Generate the key pair:

# ssh-keygen -t dsa

Verify that the key pair was generated: there should be two files in /root/.ssh, id_dsa and id_dsa.pub.

Before copying the ssh keys to the other nodes, make sure that the PermitRootLogin parameter in /etc/ssh/sshd_config is set to yes on all nodes. If it is not, change it to yes and then refresh the ssh daemon.
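A quick way to check the parameter and refresh the daemon (a sketch; it assumes sshd is managed by the AIX System Resource Controller):

# grep PermitRootLogin /etc/ssh/sshd_config
PermitRootLogin yes
# stopsrc -s sshd; startsrc -s sshd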

Copy the key pair id_dsa and id_dsa.pub to the same location, /root/.ssh, on all other nodes. Then append the ssh public key to /root/.ssh/authorized_keys on all the nodes, including the primary GPFS node:

# cat /root/.ssh/id_dsa.pub >> /root/.ssh/authorized_keys
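To distribute the key pair and append the public key on every node in one pass, a loop along these lines can be used (a sketch; it assumes the /root/nodelist file created in Step 3 below, and the first copy to each node will prompt for the root password):

# for n in $(cut -d: -f1 /root/nodelist); do
>   scp /root/.ssh/id_dsa /root/.ssh/id_dsa.pub $n:/root/.ssh/
>   ssh $n 'cat /root/.ssh/id_dsa.pub >> /root/.ssh/authorized_keys'
> done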

Step 3: Create a nodelist

Create a nodelist file in the root home directory containing all the node names (FQDN or short name) that will be part of the GPFS cluster. For example, if the node names are node1.test.com (node1), node2.test.com (node2), etc., create a file /root/nodelist and add these node names to the file, one per line (remember to put these entries in /etc/hosts on every node as well).

/root/nodelist is the input file with a list of NodeName:Designation entries. The designations are manager or client, and quorum or nonquorum. (Tip: to make a node a quorum node, specify quorum alone; to make it a quorum node that is a client, specify quorum-client.)

node1:quorum-manager
node2:quorum-manager
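The matching /etc/hosts entries on every node would look like this (a sketch; the IP addresses shown are the ones that appear in the mmlscluster output later in this document):

10.10.19.81   node1.test.com   node1
10.10.19.82   node2.test.com   node2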

Syntax: mmcrcluster -N {NodeDesc[,NodeDesc...] | NodeFile} -p PrimaryServer [-s SecondaryServer] [-r RemoteShellCommand] [-R RemoteFileCopyCommand] [-C ClusterName] [-U DomainName] [-A] [-c ConfigFile]

Example:

# mmcrcluster -N /root/nodelist -p node1 -s node2 -r /usr/bin/ssh -R /usr/bin/scp -C testcluster -A

where /root/nodelist contains the list of nodes, node1 is the primary configuration server (specified by the -p option), node2 is the secondary configuration server (specified by the -s option), ssh is the remote shell used for GPFS command execution (specified by the -r option), scp is the copy command used by GPFS to copy files between nodes (specified by the -R option), testcluster is the cluster name (specified by the -C option), and the -A option indicates that GPFS will start automatically when a node reboots.

Step 4: Verify the status of the cluster using mmlscluster.

# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         testcluster
  GPFS cluster id:           12399838388936568191
  GPFS UID domain:           testcluster
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    node1
  Secondary server:  node2

 Node  Daemon node name  IP address   Admin node name  Designation
---------------------------------------------------------------------
   1   node1             10.10.19.81  node1            quorum-manager
   2   node2             10.10.19.82  node2            quorum-manager

Create GPFS Filesystem using Network shared Disk (NSD)


This section deals with creating the GPFS filesystem using a network shared disk (NSD).

Creating the Network shared Disk (NSD)

The mmcrnsd command is used to add disks (NSDs) to the cluster. First create a disk descriptor file; in this example it is stored as /root/disklist (any file name and location can be used). The format of a disk descriptor is as follows:

DiskName:PrimaryServer:BackupServer:DiskUsage:FailureGroup:DesiredName:StoragePool

Example of a descriptor file:

# cat disks
hdisk1:node1:node2:dataAndMetadata:0:test_nsd

Here hdisk1 is the LUN to be used for the NSD, node1 is the primary server for this NSD, node2 is the backup server, dataAndMetadata indicates that the disk can contain both data and metadata, 0 is the failure group, and test_nsd is the name of the NSD.

After the descriptor file is created, use the mmcrnsd command to create the NSD.

Usage: mmcrnsd -F DescFile [-v {yes | no}]

Eg:

# mmcrnsd -F /root/disklist
mmcrnsd: Processing disk hdisk1
mmcrnsd: 6027-1371 Propagating the cluster configuration data to all affected nodes.
This is an asynchronous process.

Verify that the NSDs were created properly using mmlsnsd.

# mmlsnsd

 File system   Disk name    NSD servers
---------------------------------------------
 (free disk)   test_nsd     node1

After the NSD has been created, the disk descriptor file (/root/disklist in our case) is rewritten and now contains the NSD names. This rewritten disk descriptor file is then used as input to the mmcrfs command.

# cat disks
# hdisk1:node1:node2:dataAndMetadata:0:test_nsd
test_nsd:::dataAndMetadata:0::

Creating the GPFS Filesystem

Before creating the GPFS filesystem, make sure to create the mount point. For example, if /gpfsFS1 is the mount point to be used, create it with mkdir -p /gpfsFS1. Then create the GPFS filesystem using the command below.

# mmcrfs /gpfsFS1 /dev/gpfsFS1 -F /root/disklist -B 64k -m 2 -M 2

This creates a filesystem mounted at /gpfsFS1 with device /dev/gpfsFS1 (whose underlying raw device is the disk listed in /root/disklist), a block size of 64K, and the default (-m) and maximum (-M) number of metadata replicas set to 2. If you add an extra NSD as above and give it failure group 1, you can also specify -r 2 and -R 2, which set the default and maximum number of data replicas to 2 (similar to mirroring). The NSD can now be viewed with the mmlsnsd command:

# mmlsnsd

 File system   Disk name    NSD servers
---------------------------------------------
 gpfs_fs1      test_nsd     node1
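To mount the new filesystem on all nodes right away, a short usage sketch (the device name gpfsFS1 comes from the mmcrfs example above; later sections of this document use gpfs_fs1):

# mmmount gpfsFS1 -a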


GPFS Startup, Shutdown, Status and Adding a New Node


This section explains the following GPFS operations:

To check the status of the GPFS cluster
To start up the GPFS cluster
To shut down the GPFS cluster
To add a new node to a running GPFS cluster
To change the designation of a node in a running GPFS cluster

Steps:

To check the status of the GPFS cluster:

# mmgetstate -aLs

 Node number  Node name  Quorum  Nodes up  Total nodes  GPFS state   Remarks
------------------------------------------------------------------------------
      1       node1         2        2          2       active       quorum node
      2       node2         2        2          2       active       quorum node

 Summary information
---------------------
Number of nodes defined in the cluster:          2
Number of local nodes active in the cluster:     2
Number of remote nodes joined in this cluster:   0
Number of quorum nodes defined in the cluster:   2
Number of quorum nodes active in the cluster:    2
Quorum = 2, Quorum achieved

To shutdown a single node

# mmshutdown -N <nodename>

# mmshutdown -N node1
Wed Dec  3 00:11:44 CDT 2010: 6027-1341 mmshutdown: Starting force unmount of GPFS file systems
Wed Dec  3 00:11:49 CDT 2010: 6027-1344 mmshutdown: Shutting down GPFS daemons
node1:  Shutting down!
node1:  'shutdown' command about to kill process 5701702
Wed Dec  3 00:11:56 CDT 2010: 6027-1345 mmshutdown: Finished

To shutdown all the nodes together

# mmshutdown -N all

To check status of the nodes

# mmgetstate -a

 Node number  Node name  GPFS state
------------------------------------
      1       node1      arbitrating
      2       node2      down

This shows that node2 is either down or GPFS has not been started on it, and hence node1 is arbitrating to establish quorum. The solution is to start GPFS on node2, as described below.

Startup a single node

# mmstartup -N <nodename>

# mmstartup -N node2 Wed Nov 3 00:14:09 CDT 2010: 6027-1642 mmstartup: Starting GPFS ...

# mmgetstate -a

 Node number  Node name  GPFS state
------------------------------------
      1       node1      active
      2       node2      active

To add a new node

mmaddnode -N {NodeDesc[,NodeDesc...] | NodeFile}

You must have root authority.
The command may be run from any node in the GPFS cluster.
Ensure proper authentication (.rhosts or ssh key exchanges) to the new node.
Install GPFS onto the new node.
Decide the designation(s) for the new node, for example Manager | Quorum (a quick pre-check sketch follows this list).
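A quick pre-check before adding the node (a sketch; node3 is the node added in the example below, and the lslpp call simply confirms the GPFS filesets are installed there):

# ssh node3 date
# ssh node3 'lslpp -l "gpfs.*"'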

Eg: To add the node node3 as a quorum node and a manager:

# mmaddnode -N node3:quorum-manager
Wed Nov  3 01:20:35 CDT 2010: 6027-1664 mmaddnode: Processing node node3

mmaddnode: Command successfully completed mmaddnode: 6027-1371 Propagating the cluster configuration data to all affected nodes. This is an asynchronous process.

To check the status of the GPFS cluster:

# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         testcluster
  GPFS cluster id:           12399838388936568191
  GPFS UID domain:           testcluster
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    node1
  Secondary server:  node2

 Node  Daemon node name  IP address   Admin node name  Designation
---------------------------------------------------------------------
   1   node1             10.10.19.81  node1            quorum-manager
   2   node2             10.10.19.82  node2            quorum
   3   node3             10.10.19.83  node3            quorum-manager

To change the designation of a node, use mmchnode. In our case node3 was added as a quorum-manager node; to change its designation to client (removing the manager role):

# mmchnode --client -N node3
Wed Nov  3 00:29:01 CDT 2010: 6027-1664 mmchnode: Processing node node3

mmchnode: 6027-1371 Propagating the cluster configuration data to all affected nodes. This is an asynchronous process.

# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         testcluster
  GPFS cluster id:           12399838388936568191
  GPFS UID domain:           testcluster
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    node1
  Secondary server:  node2

 Node  Daemon node name  IP address   Admin node name  Designation
---------------------------------------------------------------------
   1   node1             10.10.19.81  node1            quorum-manager
   2   node2             10.10.19.82  node2            quorum
   3   node3             10.10.19.83  node3            quorum


GPFS NSD and filesystem operations


This section explains the following actions in more detail, with examples:

To list characteristics of the GPFS filesystem
To add a tiebreaker disk
To mount all the GPFS filesystems
To list all the physical disks which are part of a GPFS filesystem
To display the GPFS filesystem
To unmount a GPFS filesystem from one node
To remove a GPFS filesystem
To remove a disk from the filesystem
To remove the NSD
To replace a disk
To add a new disk to the GPFS filesystem
To suspend a disk
To resume a disk

Steps:

To list characteristics of the GPFS filesystem

# mmlsfs <GPFS filesystem name>

# mmlsfs gpfs_fs1
flag  value              description
----  -----------------  -----------------------------------------------------
 -f   2048               Minimum fragment size in bytes
 -i   512                Inode size in bytes
 -I   8192               Indirect block size in bytes
 -m   1                  Default number of metadata replicas
 -M   2                  Maximum number of metadata replicas
 -r   1                  Default number of data replicas
 -R   2                  Maximum number of data replicas
 -j   cluster            Block allocation type
 -D   nfs4               File locking semantics in effect
 -k   all                ACL semantics in effect
 -a   1048576            Estimated average file size
 -n   32                 Estimated number of nodes that will mount file system
 -B   65536              Block size
 -Q   none               Quotas enforced
      none               Default quotas enabled
 -F   33536              Maximum number of inodes
 -V   10.01 (3.2.1.5)    File system version
 -u   yes                Support for large LUNs?
 -z   no                 Is DMAPI enabled?
 -L   2097152            Logfile size
 -E   yes                Exact mtime mount option
 -S   no                 Suppress atime mount option
 -K   whenpossible       Strict replica allocation option
 -P   system             Disk storage pools in file system
 -d   test_nsd           Disks in file system
 -A   yes                Automatic mount option
 -o   none               Additional mount options
 -T   /gpfs_fs1          Default mount point

To add a tiebreaker disk

Use mmchconfig. Prerequisite: a LUN obtained from the SAN should first be added as an NSD, as described in the section Creating Network shared Disk (NSD), before proceeding.

Eg: To use the NSD test_nsd as a tiebreaker disk:

# mmchconfig tiebreakerDisks="test_nsd"

Eg: To remove the tiebreaker disk:

# mmchconfig tiebreakerDisks=no
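To confirm the current setting, mmlsconfig can be checked (a sketch; the grep is only there to filter the output, and the output line shown is illustrative):

# mmlsconfig | grep -i tiebreaker
tiebreakerDisks test_nsd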

To mount all the GPFS filesystems

# mmmount all -a
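To mount a single filesystem on selected nodes only, a usage sketch (filesystem and node names are taken from the earlier examples):

# mmmount gpfs_fs1 -N node1,node2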

To list all the physical disks which are part of a GPFS filesystem

mmlsnsd

To show the node names, use the -f and -m options:

# mmlsnsd -f gpfs -m

 Disk name   NSD volume ID      Device        Node name   Remarks
---------------------------------------------------------------------
 nsd1        AC1513514CD152BF   /dev/hdisk1   node1
 nsd2        AC1513514CD152C0   /dev/hdisk2   node2
 nsd3        AC1513514CD152C1   /dev/hdisk3   node3
 nsd4        AC1513514CD15352   /dev/hdisk4   node4

To show the failure group info and storage pool info:

# mmlsdisk gpfs
disk         driver  sector  failure  holds     holds                        storage
name         type    size    group    metadata  data   status  availability  pool
------------ ------  ------  -------  --------  -----  ------  ------------  -------
nsd1         nsd     512      -1      yes       yes    ready   up            system
nsd2         nsd     512      -1      yes       yes    ready   up            system
nsd3         nsd     512      -1      no        yes    ready   up            pool1
nsd4         nsd     512      -1      no        yes    ready   up            pool1

To display the GPFS filesystem

# mmdf gpfs
disk             disk size  failure  holds     holds         free KB             free KB
name             in KB      group    metadata  data      in full blocks       in fragments
---------------  ---------  -------  --------  -----  --------------------  ----------------
Disks in storage pool: system (Maximum disk size allowed is 97 GB)
nsd1              10485760       -1  yes       yes       10403328 (99%)         960 ( 0%)
nsd2              10485760       -1  yes       yes       10402304 (99%)         960 ( 0%)
                 ---------                             --------------------  ----------------
(pool total)      20971520                               20805632 (99%)        1920 ( 0%)

To Unmount a GPFS filesystem from one node

mmumount <GPFS filesystem> -N <nodename>

# mmumount gpfs_fs1 -N node1

where gpfs_fs1 is the GPFS filesystem and node1 is the name of the node from where it needs to be unmounted
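To confirm which nodes still have the filesystem mounted, mmlsmount can be used (a short sketch using the same filesystem name):

# mmlsmount gpfs_fs1 -L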

To unmount a GPFS filesystem from all nodes

#mmumount <GPFS filesystem> -a

mmumount gpfs_fs1 -a

To remove a GPFS filesystem

1. # mmumount <GPFS filesystem> -a   -- to unmount the GPFS filesystem from all nodes
2. # mmdelfs <GPFS filesystem> -p    -- to remove the filesystem

Steps with example:

To remove FS gpfs_fs1

Initial output before removing the filesystem

# mmdf gpfs_fs1
disk             disk size  failure  holds     holds         free KB             free KB
name             in KB      group    metadata  data      in full blocks       in fragments
---------------  ---------  -------  --------  -----  --------------------  ----------------
Disks in storage pool: system (Maximum disk size allowed is 104 GB)
test_nsd          10485760        0  yes       yes       10359360 ( 99%)        152 ( 0%)
test_nsd2         10485760        0  yes       yes       10483648 (100%)         62 ( 0%)
test_nsd1         10485760        1  yes       yes       10359360 ( 99%)        160 ( 0%)
                 ---------                             --------------------  ----------------
(pool total)      31457280                               31202368 ( 99%)        374 ( 0%)
                 =========                             ====================  ================
(total)           31457280                               31202368 ( 99%)        374 ( 0%)

Inode Information
-----------------
Number of used inodes:            4042
Number of free inodes:           29494
Number of allocated inodes:      33536
Maximum number of inodes:        33536


1. # mmumount gpfs_fs1 -a

2. # mmdelfs gpfs_fs1 -p
GPFS: 6027-573 All data on following disks of gpfs_fs1 will be destroyed:
    test_nsd
    test_nsd1
    test_nsd2
GPFS: 6027-574 Completed deletion of file system /dev/gpfs_fs1.

The mmlsnsd output after the filesystem was removed shows all NSDs as free disks:

# mmlsnsd

 File system   Disk name    NSD servers
---------------------------------------------
 (free disk)   test_nsd1    directly attached
 (free disk)   test_nsd2    directly attached
 (free disk)   test_nsd     directly attached

To remove a disk from the filesystem

Remember that a disk should only be removed after confirming either that adequate space is left on the other disks that are part of this filesystem (check this with mmdf <GPFS filesystem name>), so that when the disk is removed the data can be restriped across the remaining disks, or that two data replicas are available (check with mmlsfs <GPFS filesystem>).

Syntax: mmdeldisk <GPFS filesystem> <NSD name> -r

The -r option is very important: it restripes and rebalances the data across the other available disks in this filesystem.

# mmdeldisk gpfs_fs1 test_nsd2 -r
Deleting disks ...
Scanning system storage pool
GPFS: 6027-589 Scanning file system metadata, phase 1 ...
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-589 Scanning file system metadata, phase 2 ...
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-589 Scanning file system metadata, phase 3 ...
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-589 Scanning file system metadata, phase 4 ...
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-565 Scanning user file metadata ...
GPFS: 6027-552 Scan completed successfully.
Checking Allocation Map for storage pool 'system'
GPFS: 6027-370 tsdeldisk64 completed.
mmdeldisk: 6027-1371 Propagating the cluster configuration data to all affected nodes.
This is an asynchronous process.

# mmlsnsd

 File system   Disk name    NSD servers
---------------------------------------------
 gpfs_fs1      test_nsd     directly attached
 gpfs_fs1      test_nsd1    directly attached
 (free disk)   test_nsd2    directly attached

To remove the NSD

Remember that you should only remove NSDs that are free.

Syntax: mmdelnsd <NSD name>

# mmlsnsd

 File system   Disk name    NSD servers
---------------------------------------------
 (free disk)   test_nsd1    directly attached
 gpfs_fs1      test_nsd2    directly attached
 gpfs_fs1      test_nsd     directly attached

# mmdelnsd test_nsd1 mmdelnsd: Processing disk test_nsd1 mmdelnsd: 6027-1371 Propagating the cluster configuration data to all affected nodes. This is an asynchronous process.

# mmlsnsd

 File system   Disk name    NSD servers
---------------------------------------------
 gpfs_fs1      test_nsd2    directly attached
 gpfs_fs1      test_nsd     directly attached

To replace the disk

Prerequisites for replacing or adding a new disk: the physical disk / LUN should first be added as an NSD, as described in the Creating Network shared Disk (NSD) section.

Syntax : mmrpldisk <GPFS filesystem name> <NSD to be replaced> <new NSD> -v {yes|no}

-v yes: check whether the new NSD already contains data before using it.
-v no: do not check whether the new NSD already contains data.

In this example there are three existing NSDs (nsd2, nsd3 and nsd4) and a newly added NSD (nsd1) that is not yet part of a GPFS filesystem. This procedure explains how to replace nsd4 with nsd1.

# mmlsnsd

 File system   Disk name    NSD servers
---------------------------------------------
 gpfs_fs1      nsd2         (directly attached)
 gpfs_fs1      nsd3         (directly attached)
 gpfs_fs1      nsd4         (directly attached)
 (free disk)   nsd1         (directly attached)

# mmrpldisk gpfs nsd4 nsd1 -v no
Verifying file system configuration information ...
Replacing nsd4 ...

GPFS: 6027-531 The following disks of gpfs will be formatted on node trlpar06_21:
    nsd1: size 10485760 KB
Extending Allocation Map
Checking Allocation Map for storage pool 'system'
GPFS: 6027-1503 Completed adding disks to file system gpfs_fs1
Scanning system storage pool
GPFS: 6027-589 Scanning file system metadata, phase 1 ...
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-589 Scanning file system metadata, phase 2 ...
Scanning file system metadata for pool1 storage pool
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-589 Scanning file system metadata, phase 3 ...
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-589 Scanning file system metadata, phase 4 ...
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-565 Scanning user file metadata ...
100 % complete on Thu Nov  4 02:14:14 2010
GPFS: 6027-552 Scan completed successfully.
Checking Allocation Map for storage pool 'system'
Done

Check the mmlsnsd output after the activity. Notice that nsd1 has now become part of the gpfs_fs1 filesystem and nsd4 has become a free disk.

# mmlsnsd

 File system   Disk name    NSD servers
---------------------------------------------
 (free disk)   nsd4         (directly attached)
 gpfs_fs1      nsd1         (directly attached)
 gpfs_fs1      nsd2         (directly attached)
 gpfs_fs1      nsd3         (directly attached)

To add a new disk to the GPFS filesystem

Prerequisites for adding a new disk: the physical disk / LUN should first be added as an NSD, as described in the Creating Network shared Disk (NSD) section, and the same disk descriptor file used for creating the NSD has to be used with the -F option.

Step 1: Create a disk descriptor file with any name. Here we use the file name disks, and the disk used is hdisk10.

# cat disks hdisk10:::dataAndMetadata:0:test_nsd10

Step 2: Create the NSD with the new disk hdisk10

# /usr/lpp/mmfs/bin/mmcrnsd -F disks
mmcrnsd: Processing disk hdisk10
mmcrnsd: 6027-1371 Propagating the cluster configuration data to all affected nodes.
This is an asynchronous process.

# mmlsnsd

 File system   Disk name     NSD servers
----------------------------------------------
 gpfs_fs1      test_nsd2     directly attached
 gpfs_fs1      test_nsd      directly attached
 (free disk)   test_nsd10    directly attached

# cat disks
# hdisk10:::dataAndMetadata:0:test_nsd10
test_nsd10:::dataAndMetadata:0::

Step 3: Add the disk to the GPFS filesystem gpfs_fs1

# mmadddisk gpfs_fs1 -F disks

GPFS: 6027-531 The following disks of gpfs_fs1 will be formatted:
    test_nsd10: size 10485760 KB
Extending Allocation Map
Checking Allocation Map for storage pool 'system'
GPFS: 6027-1503 Completed adding disks to file system gpfs_fs1.
mmadddisk: 6027-1371 Propagating the cluster configuration data to all affected nodes.
This is an asynchronous process.
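Note that mmadddisk does not rebalance existing data onto the new disk unless asked to. A follow-up sketch that restripes and rebalances the filesystem across all of its disks, using the filesystem name from this example:

# mmrestripefs gpfs_fs1 -b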

To suspend a disk

This is useful if you suspect a problem with an existing disk and want to stop further writes to that disk.

Syntax: mmchdisk <gpfs filesystem name> suspend -d <nsd name>

# mmchdisk gpfs_fs1 suspend -d nsd4

# mmlsdisk /dev/gpfs_fs1
disk         driver  sector  failure  holds     holds                           storage
name         type    size    group    metadata  data   status     availability  pool
------------ ------  ------  -------  --------  -----  ---------  ------------  -------
nsd4         nsd     512       0      yes       yes    suspended  up            system
nsd2         nsd     512       1      yes       yes    ready      up            system
nsd3         nsd     512       0      no        yes    ready      up            pool1

To resume a disk

Syntax: mmchdisk <gpfs filesystem name> resume -d <nsd name>

# mmlsdisk /dev/gpfs_fs1
disk         driver  sector  failure  holds     holds                           storage
name         type    size    group    metadata  data   status     availability  pool
------------ ------  ------  -------  --------  -----  ---------  ------------  -------
nsd4         nsd     512       0      yes       yes    suspended  up            system
nsd2         nsd     512       1      yes       yes    ready      up            system
nsd3         nsd     512       0      no        yes    ready      up            pool1

# mmchdisk gpfs_fs1 resume -d nsd4

# mmlsdisk /dev/gpfs_fs1
disk         driver  sector  failure  holds     holds                           storage
name         type    size    group    metadata  data   status     availability  pool
------------ ------  ------  -------  --------  -----  ---------  ------------  -------
nsd4         nsd     512       0      yes       yes    ready      up            system
nsd2         nsd     512       1      yes       yes    ready      up            system
nsd3         nsd     512       0      no        yes    ready      up            pool1
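Resume only clears the suspended state. If a disk's availability were also shown as down (for example after a path failure), it would additionally have to be started; a sketch of that follow-up, assuming the same filesystem and disk names as above:

# mmchdisk gpfs_fs1 start -d nsd4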
