Power Systems for AIX III: Advanced Administration and Problem Determination

(Course code AN15)

Instructor Exercises Guide with hints
ERC 2.1

Trademarks
The reader should recognize that the following terms, which appear in the content of this
training document, are official trademarks of IBM or other companies:
IBM® is a registered trademark of International Business Machines Corporation.
The following are trademarks of International Business Machines Corporation in the United
States, or other countries, or both:
AIX 5L™ AIX 6™ AIX®
AS/400® Current® DB2®
DS8000® HACMP™ Initiate®
MWAVE® Power Systems™ Power®
POWER® PowerVM™ POWER6®
POWER7® pSeries® Redbooks®
RS/6000® System p® Tivoli®
Intel is a trademark or registered trademark of Intel Corporation or its subsidiaries in the
United States and other countries.
Windows is a trademark of Microsoft Corporation in the United States, other countries, or
both.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
VMware and the VMware "boxes" logo and design, Virtual SMP and VMotion are registered
trademarks or trademarks (the "Marks") of VMware, Inc. in the United States and/or other
jurisdictions.
Other product and service names might be trademarks of IBM or other companies.

August 2011 edition


The information contained in this document has not been submitted to any formal IBM test and is distributed on an “as is” basis without
any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer
responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While
each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will
result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

© Copyright International Business Machines Corporation 2009, 2011.


This document may not be reproduced in whole or in part without the prior written permission of IBM.
Note to U.S. Government Users — Documentation related to restricted rights — Use, duplication or disclosure is subject to restrictions
set forth in GSA ADP Schedule Contract with IBM Corp.

Contents
Trademarks
Instructor exercises overview
Exercises configuration
Exercise description
Exercise 1. Problem diagnostic information
Exercise 2. The Object Data Manager
Exercise 3. Error monitoring
Exercise 4. Basic Network Installation Manager configuration
Exercise 5. System initialization: Accessing a boot image
Exercise 6. System initialization: rc.boot and inittab
Exercise 7. LVM metadata and related problems
Exercise 8. Disk management procedures
Exercise 9. Install and cloning techniques
Exercise 10. Advanced backup techniques
Exercise 11. Diagnostics
Exercise 12. System dump

Course materials may not be reproduced in whole or in part
without the prior written permission of IBM.
Instructor exercises overview


The objective of the Power Systems for AIX III: Advanced
Administration and Problem Determination exercises is to have
students understand and successfully perform the types of activities
an advanced AIX system administrator encounters on a day-to-day
basis.
Although all of the exercises are considered standalone exercises, a
student who starts an exercise must complete all of its sections. Many
exercises introduce problems and then fix them as part of the exercise.
If an exercise is started but not completed, follow-on exercises may
not work properly or at all.
Before starting the exercises, read the Exercise Description section.
It is included in this guide as well as in the Student Exercises book.
This section explains the format of the different exercise sections. It is
important that this is covered with the students as an overall exercise
introduction before the first exercise is started.
Be sure you understand the equipment configuration for your teach
site before introducing the exercises. The students will be performing
the exercises as root throughout most of this class as they will be
doing system configuration and problem determination activities. The
password for root should be set to ibmaix if no password has been set
yet. Check root's password before the start of class. If it is not
ibmaix, contact the classroom administrator to obtain root's
password. Students must have access to the root password, because
all exercises require them to work as the root user.
The students will also be using a Hardware Management Console (HMC)
to control their LPARs and to access the virtual console for their LPAR.
The lab provisioning organization will provide the instructor with a list
of student user IDs and the passwords to use.
Supplemental exercise scripts have been created and placed in the
/home/workshop directory. Check for the existence of this directory
and its files. If /home/workshop does not exist or is empty, restore
the scripts from the lab files for the class. These files are often
supplied on an instructor exercise diskette included with the instructor
materials. If not, contact the support staff for the lab facility. Without
these scripts, several exercises cannot be done.
Due to the variety of classrooms where this class may be offered, the
equipment available in any one classroom may be different from
another.


The exercises are designed to run on POWER6 or POWER7 systems
or any system running AIX 7.1.
The man pages should already be set up on each system before the
start of class. Many students work with this documentation and they
will complain if the AIX documentation is not configured.

Exercises guidelines
The following guidelines are strongly recommended when teaching this
course. Courses that follow them tend to receive better evaluations.
1. Before you start an exercise, make clear to the students what they
will do in the lab. Provide the goals to the students, and tell them
why this lab is necessary.
2. Tell the students how much time is allowed for the lab. You will find
a timetable on the next page and in the instructor notes.
3. After the lab, review the exercise. The students want confirmation
of what they have done in the lab. Do not jump to the next unit
without a review, because most of your students will be
disappointed if you do not give them this confirmation.

Exercises configuration


For detailed lab setup instructions, refer to the Lab Setup Guide for
this course.

Exercise description


Each exercise in this course is divided into sections as described
below. Select the section that best fits your method of performing
exercises. You may use a combination of these sections as
appropriate.
Exercise Instructions
This section tells you what to accomplish. There are no definitive
details on how to perform the tasks. You are given the opportunity to
work through the exercise given what you learned in the unit
presentation, utilizing the Student Notebook, your past experience,
and maybe a little intuition.
Exercise Instructions with Hints
This section is an exact duplicate of the Exercise Instructions, with
solutions and additional tips for the students added. It is
recommended that most students use the section with hints. There
may be some advanced students who will prefer the challenge of
working without the hints, but they should have the hints at the ready.
Students can use this part to compare their work with the solutions.
When showing the SMIT method to accomplish a task, each line in
bold represents a submenu or selector screen. You will need to press
the Enter key after selecting each item as listed. When you reach the
dialog screen, the field descriptions will be in regular text and the items
you need to fill in will be in bold. Only the items that need to be
changed will be shown, not the entire screen. Once you have reached
the dialog screen portion of SMIT, press Enter ONLY after all indicated
entries have been made.
The SMIT steps will be shown for the ASCII version of SMIT. Under
most circumstances these steps match the steps taken if using the
graphics version of SMIT. The exceptions relate to the use of the
function keys. When instructed to press the F3 key back to a particular
menu, when in graphics SMIT, you will instead click the Cancel box at
the bottom of the screen. When instructed to press the F9 key to shell
out, in graphics mode, simply open another window.
Optional Exercise Parts
Some labs provide additional practice on a particular topic. Specific
details and hints are provided to help step you through the Optional
Exercises, if needed. Not all exercises include Optional Exercises.
Depending on the group, the instructor can decide whether to do them. If
there is time, the students should complete the optional part.

Exercise 1. Problem diagnostic information


(with hints)

Estimated time
00:35

What this exercise is about


This exercise will have you explore system information collection to
support problem diagnostics. You will collect some baseline
information about your lab system. You will explore the information
center for reference code information, and you will practice collecting
snap information.

What you should be able to do


At the end of the lab, you should be able to:
• Obtain configuration information about your system
• Navigate the information center to find reference code information
• Create, compress, and rename a snap file for upload to AIX
Support

Introduction
In this exercise, you will obtain and record information about your
system using some basic administration commands with which you
are probably already familiar. You will also locate reference code
information at the IBM infocenter Web site. Finally you will use snap to
collect information about a problem environment.
You will require root authority to complete this exercise.

Instructor exercise notes


This is the first time the students will be logging in to the systems in the
lab. Provide them with the root password. If the systems are remote,
provide them with their assigned IP addresses/hostnames and explain
any provided tools (WinXP telnet, Hummingbird telnet, PuTTY, and so
on). You might suggest customization such as window size or font size
for ease of viewing.


Common student problems


Many students are not familiar with the commands required to
complete the first portion of this exercise; they prefer working with
SMIT. However, because this portion of the exercise is a review,
you should insist that students use the required commands
rather than SMIT.


Exercise instructions with hints


Preface
Two versions of these instructions are available; one with hints and one without. You
can use either version to complete this exercise. Also, please do not hesitate to ask the
instructor if you have questions.
All exercises of this chapter depend on the availability of specific equipment in your
classroom.
The output shown in the answers is an example. Your output and answers based on the
output may be different.
All hints are marked with a » sign.

Part 1 - Recording system information


__ 1. Connect to your assigned LPAR using telnet protocol and log in as root.
__ 2. Using commands rather than SMIT, collect and record the following information
regarding your system:
__ a. The volume groups on your system:
______________________________________________________________
» # lsvg

__ b. The physical volumes for your system:


______________________________________________________________
» # lspv

__ c. The logical volumes in rootvg on your system:


______________________________________________________________
______________________________________________________________
______________________________________________________________
» # lsvg -l rootvg

__ d. All paging space areas for your system:


______________________________________________________________
» # lsps -a
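The four commands in step 2 can also be run in one pass so that the baseline is kept in a file for later comparison. The following is a minimal sketch, not part of the original lab: the function name collect_baseline and the output file location are assumptions made for illustration, and any command that is not installed is simply noted rather than treated as an error.

```shell
#!/bin/sh
# Sketch: gather the step 2 baseline into a single file.
# collect_baseline and BASELINE_FILE are illustrative names, not AIX tools.
BASELINE_FILE=${TMPDIR:-/tmp}/baseline.txt

collect_baseline() {
    out=$1
    : > "$out"                          # start with an empty file
    # Each entry is one of the hint commands from step 2.
    for cmd in "lsvg" "lspv" "lsvg -l rootvg" "lsps -a"; do
        echo "=== $cmd ===" >> "$out"
        prog=${cmd%% *}                 # first word is the program name
        if command -v "$prog" >/dev/null 2>&1; then
            $cmd >> "$out" 2>&1         # run it and keep its output
        else
            echo "(command not available on this system)" >> "$out"
        fi
    done
}

collect_baseline "$BASELINE_FILE"
```

Students should still type the individual commands at least once, because the point of this part is to review them.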


__ 3. Execute the prtconf command and record the following information:


__ a. Real memory on your system:
______________________________________________________________
__ b. Machine model and serial number of your system:
______________________________________________________________
______________________________________________________________
__ c. Processor type of your system:
______________________________________________________________
__ d. Firmware level of your system:
______________________________________________________________
__ e. LPAR name of your partition:
______________________________________________________________

»# prtconf | pg

__ 4. Identify the logical volumes that reside on your hdisk0.


Write down the command you used:
_________________________________________________________________
» # lspv -l hdisk0

From the fact that the number of LPs is equal to the number of PPs, what can you
conclude?
_________________________________________________________________
» No mirroring
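The LP-to-PP comparison can also be scripted. The sketch below assumes the column layout of lsvg -l rootvg (a volume group line, a header line, then LV NAME, TYPE, LPs, PPs, and further columns), where PPs counts every copy in the volume group. The helper name lv_copies and the sample rows are made up for illustration; a result of 1 means a single copy, that is, no mirroring.

```shell
#!/bin/sh
# Sketch: report copies per logical volume as PPs / LPs.
# Assumes the lsvg -l layout: "<vg>:" line, header line, then data rows with
# LV NAME in column 1, LPs in column 3, and PPs in column 4.
lv_copies() {
    awk 'NR > 2 && $3 > 0 { printf "%s %d\n", $1, $4 / $3 }'
}

# Made-up sample rows for illustration only. On a lab system you would run:
#   lsvg -l rootvg | lv_copies
sample_output() {
    cat <<'EOF'
rootvg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
hd5                 boot       1       1       1    closed/syncd  N/A
hd6                 paging     8       16      2    open/syncd    N/A
hd4                 jfs2       4       4       1    open/syncd    /
EOF
}

sample_output | lv_copies
```

In the sample data, only hd6 shows two copies; every other logical volume has as many PPs as LPs.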

Part 2 - Looking up reference codes


__ 5. Using a Web browser, connect to the IBM Systems Information Centers.
» The Web address to enter is: http://publib.boulder.ibm.com/infocenter/systems
__ 6. In the content area, click IBM Systems Hardware Information Center.
__ 7. On the IBM Power Systems Hardware Information Center page, expand Systems
Hardware Information in the navigation area (left side).
__ 8. Expand Power Systems Information.

__ 9. Expand POWER7 systems.


__ 10. Locate and expand the section on the server model that matches what you recorded
earlier in this exercise.
__ 11. Notice the various categories of product documentation that are available. Expand the
Troubleshooting, service, and support category.
__ 12. Expand Beginning troubleshooting and problem analysis.
__ 13. Click Reference codes.
__ 14. In the POWER7 Information Reference codes page (in the content area), notice the
links to documentation on System Reference Codes (SRC) and Service Request
Numbers (SRN).
Click Progress codes.
__ 15. On the Progress Codes Overview page, click AIX IPL progress codes. This gives
a list of AIX progress codes. Clicking on any one of them provides a brief description
of that code. This course will later cover the codes which are common in diagnosing
AIX boot problems.
__ 16. Return to the IBM Systems Information Centers page (first page that you displayed).
__ 17. Click AIX Information Center (at the bottom of the content area).
__ 18. Locate the section for AIX 7.1 and click the arrow icon.
__ 19. In the navigation area (on the left), expand AIX 7.1 Information.
__ 20. This will display various categories of AIX 7.1 information. Find and then click the
Troubleshooting category.
__ 21. Briefly scan the list of areas covered.
__ 22. Close the browser window.

Part 3 - Creating a snap file


__ 23. Connect to your assigned LPAR using telnet protocol and log in as root.
__ 24. Execute the snap command to collect all information for your system. If there is not
enough space in the /tmp file system, increase the size of /tmp and repeat the snap
execution. Do not collect any dump information to removable media, if prompted.
»# snap -a
__ 25. Once snap has completed the generation of the information files, change your
directory to the testcase directory that was created and create some files in that
directory.
»# cd /tmp/ibmsupt/testcase
»# touch here are four files
__ 26. Change the directory back to your home directory.


»# cd
__ 27. Create a compressed pax file of the snap generated directory tree.
»# snap -c
__ 28. Rename the resulting compressed pax file to the standard naming convention, given
the following assumptions:
• Your PMR# is 12121
• Your branch# is 989
• Use your own country code (if you do not know it, for this class, just use 000)
»# cd /tmp/ibmsupt
»# mv snap.pax.Z pmr12121.b989.c000.snap.pax.Z
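The naming convention used in step 28 follows the pattern pmr<PMR>.b<branch>.c<country>.snap.pax.Z. As a minimal sketch, the name can be assembled from its three parts; snap_name is a hypothetical helper written for this illustration, not an AIX command, and it falls back to the class default country code of 000.

```shell
#!/bin/sh
# Sketch: build the upload name pmr<PMR>.b<branch>.c<country>.snap.pax.Z.
# snap_name is a hypothetical helper, not part of snap itself.
snap_name() {
    pmr=$1
    branch=$2
    country=${3:-000}               # 000 is the class default from step 28
    printf 'pmr%s.b%s.c%s.snap.pax.Z\n' "$pmr" "$branch" "$country"
}

snap_name 12121 989 000
# On the lab system the rename would then be (not run here):
#   mv /tmp/ibmsupt/snap.pax.Z "/tmp/ibmsupt/$(snap_name 12121 989 000)"
```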

End of exercise

Exercise review/wrap-up


1. After the exercise, review the basic commands used in the first portion of the exercise
on the board:
- List Volume Groups - lsvg
- List Physical Volumes - lspv
- List Logical Volumes - lsvg -l rootvg
- List Paging Space - lsps
- Using prtconf to obtain real memory, machine model, serial number, processor
type, firmware level, and LPAR name.
2. Ask the students how the information they found differed between the two
information centers: the IBM Systems Hardware Information Center and the AIX
Information Center.
3. Review the steps in generating a snap file for upload to AIX Support.


Exercise 2. The Object Data Manager


(with hints)

Estimated time
Part 1: 00:30
Part 2: 00:15
Part 3 (optional): 00:15

What this exercise is about


This exercise will review some of the most important ODM files and
how they are used in device configuration. You will use the ODM
command line interface.

What you should be able to do


At the end of the lab, you should be able to:
• Describe some of the most important ODM files
• Use the ODM command line interface
• Explain how ODM classes are used by device configuration
commands

Introduction
This exercise has three parts:
1. Review of device configuration ODM classes (PdDv, PdAt, CuDv,
CuAt, CuDep, CuDvDr)
2. Modifying a device attribute default value
3. Optional Part: Creating self-defined ODM classes
All instructions in this exercise require root authority.


Common student problems


Many students do not use the lsdev, lsattr, and mkdev commands. To show how the
ODM classes are used, we challenge them by using these commands.
Before starting the exercise, you may wish to review the following commands:
• lsdev -P (List predefined devices)
• lsdev -C (List customized devices)
• lsattr -E (List effective attributes)
• lsattr -D (List default attributes)
• lsattr -R (List range or enumeration of valid attribute values)

Known hardware/software problems


If two (or more) students work on one system, the optional part must be executed in
common.


Exercise instructions with hints


Preface
Two versions of these instructions are available; one with hints and one without. You
can use either version to complete this exercise. Also, please do not hesitate to ask the
instructor if you have questions.
All exercises of this chapter depend on the availability of specific equipment in your
classroom.
The output shown in the answers is an example. Your output and answers based on the
output may be different.
All hints are marked with a » sign.

Part 1 - Review of device configuration ODM classes


__ 1. Execute the lsdev command and identify all devices that are supported on your
system. Tell the lsdev command to provide column headers in the output.
What is the command you used?
__________________________________________________________
» # lsdev -P -H | more
Which ODM object class is used by the lsdev command to generate this output?
(You may need to refer to your Student Guide materials.)
__________________________________________________________
» PdDv
__ 2. Execute the lsdev command and identify all disk devices that are currently attached
to your system. Tell the lsdev command to provide column headers in the output.
What is the command you used?
__________________________________________________________
» # lsdev -C -c disk -H
Which ODM object class is used by the lsdev command to generate this output?
__________________________________________________________
» CuDv
__ 3. Request the same listing as above, except customize the reported fields needed to
complete the following list for disk hdisk0:
Name: ___________________________________
Status: ___________________________________
Location: _________________________________

Physical location: ___________________________


Description: _______________________________
» This information can be obtained from the output of the command:
lsdev -C -c disk -F "name status location physloc description",
but answers (particularly the location information) may vary.
__ 4. Use the ODM command line interface and list the ODM object that describes the
hdisk0 disk device. Also, use the ODM command line interface to list the ODM
object that contains the parent adapter’s physical location code as part of its Vital
Product Data information.
What command or commands did you use?
__________________________________________________________
»# odmget -q name=hdisk0 CuDv
»# odmget -q name=<parent-adapter> CuVPD
From the output complete the following list for disk hdisk0:
Status: ___________________________________
Chgstatus: ________________________________
Parent: ___________________________________
Location: _________________________________
Connwhere: _______________________________
PdDvLn: __________________________________
(for parent) CuVPD vpd: _______________________________
» This information can be obtained from the output of the command odmget
-qname=hdisk0 CuDv, but answers (particularly the parent and location information)
may vary. For some devices, such as virtual devices, there may not be any AIX location
code; in that case the physical location code provides the location information.
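When filling in the list above, it can help to pull a single descriptor out of the odmget stanza instead of reading the whole object. The sketch below assumes the usual "descriptor = value" stanza layout; odm_field is a hypothetical helper and the sample stanza is made up, so the values on a lab system will differ.

```shell
#!/bin/sh
# Sketch: extract one descriptor value from odmget stanza output on stdin.
# odm_field is a hypothetical helper, not an ODM command.
odm_field() {
    awk -v key="$1" '$1 == key && $2 == "=" {
        sub(/^[^=]*= */, "")    # drop everything up to and including "= "
        gsub(/"/, "")           # strip the quotes around string values
        print
    }'
}

# Made-up stanza for illustration. On a lab system you would run:
#   odmget -q name=hdisk0 CuDv | odm_field parent
sample_cudv() {
    cat <<'EOF'
CuDv:
        name = "hdisk0"
        status = 1
        chgstatus = 2
        location = ""
        parent = "vscsi0"
        connwhere = "810000000000"
        PdDvLn = "disk/vscsi/vdisk"
EOF
}

sample_cudv | odm_field parent
sample_cudv | odm_field chgstatus
```

The parent value found this way is the name to substitute for <parent-adapter> in the CuVPD query.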
__ 5. Execute the lscfg command and filter for hdisk0. Compare the physical location
code with the ODM information you just displayed. How do they compare?
_____________________________________________________________
______________________________________________________________
» Suggested command is:
# lscfg | grep hdisk0
» The disk physical location code was constructed from the physical location code of the
parent adapter appended with the connwhere of the disk device. The connwhere (used
to locate a device once we know the parent adapter port) is often part of either the AIX
location code or the physical location code.

__ 6. From the previous odmget output and your Student Guide notes (Customized
devices object class), please answer the following question:
What is the meaning of the displayed value of the CuDv descriptor: chgstatus?
__________________________________________________________
» A value of 2 (the expected result) indicates that the status of the disk device has not
changed since the last reboot.
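For reference during the review, the other chgstatus values can be shown alongside the expected value of 2. The mapping below reflects the values commonly documented for the CuDv object class (0 new device, 1 don't care, 2 same, 3 device is missing); treat it as a sketch and confirm against the Student Guide. The function name chgstatus_name is an assumption made for this illustration.

```shell
#!/bin/sh
# Sketch: translate a CuDv chgstatus number into its meaning.
# chgstatus_name is a hypothetical helper; verify the mapping against the
# Student Guide before presenting it.
chgstatus_name() {
    case $1 in
        0) echo "new device" ;;
        1) echo "don't care" ;;
        2) echo "same" ;;
        3) echo "device is missing" ;;
        *) echo "unknown" ;;
    esac
}

chgstatus_name 2
```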
__ 7. List the effective attributes (lsattr) for your hdisk0 device and identify the physical
volume identifier for that disk.
What is the command you used?
__________________________________________________________
» # lsattr -El hdisk0
Write down the physical volume ID of the disk:
pvid: ______________________________________________________
» This value can be obtained from the output of the command lsattr -El hdisk0, but
answers will vary. On two systems previously used to test this exercise, the pvid values
obtained were 0009330f2d01c69f0000000000000000 and
00cee60e58b2d39a0000000000000000. Note that while these are 32-digit values, the
last 16 digits are zeros.
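Students sometimes ask why lspv shows a shorter identifier than lsattr. Because the last 16 digits of the 32-digit attribute value are zeros, the significant part is the first 16 digits. A minimal sketch of that trimming, using one of the example values above; short_pvid is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: keep the first 16 digits of a 32-digit pvid attribute value.
# short_pvid is a hypothetical helper for illustration.
short_pvid() {
    printf '%s\n' "$1" | cut -c1-16
}

short_pvid 0009330f2d01c69f0000000000000000
```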
__ 8. Use the ODM command line interface, and list the ODM object that stores the
physical volume identifier (pvid) device attribute:
What is the command you used?
__________________________________________________________
»Suggested commands are:
# odmget -q "name=hdisk0 and attribute=pvid" CuAt
-OR-
# odmget CuAt | grep -p hdisk0 | grep -p pvid

__ 9. The /dev directory contains the special files to access the devices. Write down the
major and minor number of the special file for hdisk0.
Major number: _____________________
Minor Number: _____________________
» # ls -l /dev/hdisk0
Which ODM class is used to identify the major number and minor number for the
device driver?
__________________________________________________________


» CuDvDr
» Enter the following command to see the relevant entry in the CuDvDr object class:
# odmget -q value3=hdisk0 CuDvDr
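The major and minor numbers can be read from the ls -l line and compared with the CuDvDr entry, where they are typically stored in value1 and value2. The sketch below assumes the common long-listing layout for device files, with the major number followed by a comma in the fifth field and the minor number in the sixth. The sample line is made up and dev_numbers is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: print "major minor" from an ls -l line for a device special file.
# Assumes field 5 is "<major>," and field 6 is "<minor>".
dev_numbers() {
    awk '{ sub(/,/, "", $5); print $5, $6 }'
}

# Made-up sample line. On a lab system you would run:
#   ls -l /dev/hdisk0 | dev_numbers
echo 'brw-------    1 root     system       17,  0 Aug 15 10:12 /dev/hdisk0' | dev_numbers
```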
__ 10. List all your logical volumes that are part of the rootvg.
• What is the command you used?
__________________________________________________________
» # lsvg -l rootvg
• Query the ODM class CuDep and identify all logical volumes that belong to
rootvg.
What is the command you used?
__________________________________________________________
»Suggested commands are:
# odmget -q name=rootvg CuDep | more
-OR-
# odmget -q parent=rootvg CuDv | more

Part 2 - Modifying a device attribute default value


In this part of the exercise, you will back up and then modify the ODM using the
ODM commands. Our example will use an ethernet interface attribute which will not
have much real effect (so it is safe to play with). The remote MTU (remmtu) attribute
is intended to set the Maximum Transmission Unit (MTU) size when transmitting to a
partner on a remote network, but it is superseded by other mechanisms.
__ 11. Display the standard ethernet interface devices. Select one of the interfaces
(defined or available) and record its device name: _________________________
You will use this interface in the remainder of this exercise part. Our instructions will
assume the interface is en1, but you might be using a different one.
»# lsdev -c if

__ 12. Using a high level command, retrieve the en1 effective attributes. What is the value
of the remmtu attribute? _____________________________________________
»# lsattr -E -l en1
» You should find that the value is set to 576.

__ 13. This value is very small. We want to set it to the largest possible value. Run a high
level command to identify the allowable range of values for this attribute. What is the
largest value that we can use? _________________________________________
________________________________________________________________
»# lsattr -R -l en1 -a remmtu
» You should see that the largest allowable value is 1500.

__ 14. We could use a high level command to set the effective value of the attribute for our
interface, but we would have to repeat this each time a new instance of the
device was added. What command would you use to set a new effective value (do
not run it)? ____________________________________________________
_______________________________________________________________
» The command you would use (but do not do this here) to override a default value for a
particular device would be: chdev -l <device name> -a <attribute name>=<new value>

__ 15. Let us verify that the current attribute value is not already an override to the default
value. Use a high level command to retrieve the default attributes for the en1
interface. Is the default the same as the effective attribute value in this instance?
_______________________________________________________________
»# lsattr -D -l en1 -a remmtu
» You should find that the default attribute value is the same as the effective attribute
value.
__ 16. If we change the default value for the attribute, each new instance of the device will
automatically have the preferred value. There is no high level command to modify
the default values. What object class holds the default attribute values?
_______________________________________________________________
» The predefined attributes (PdAt) object class is where device default values are stored.
__ 17. Before you use ODM commands to make this change, first back up the ODM object
class that you will be changing.
» Suggested commands are:
# odmget PdAt > /tmp/PdAt-back
-OR-
# mkdir /tmp/objrepos
# cp /etc/objrepos/* /tmp/objrepos
__ 18. To locate the correct object, you will need to know the class, subclass, and type
values associated with the ethernet interface device. Retrieve the ODM customized
device object for en1 and record the pre-defined device link descriptor value
(PdDvLn):
_______________________________________________________________
»# odmget -q name=en1 CuDv
__ 19. The corresponding descriptor value in the predefined ODM database is the
uniquetype descriptor. Using both the attribute descriptor value of remmtu and the
uniquetype descriptor value to qualify the operation, display the predefined attribute
object for the remote MTU attribute. Be sure that you see one and only one object in
the display. What is the attribute value displayed? __________________
»# odmget -q "uniquetype=if/EN/en AND attribute=remmtu" PdAt

__ 20. Repeat this display, only redirect the output to the file: /tmp/remmtu-object.
»# odmget -q "uniquetype=if/EN/en AND attribute=remmtu" PdAt \
> /tmp/remmtu-object
__ 21. Edit the file you created to change the default value to the maximum value allowed.
»# vi /tmp/remmtu-object
» Change the deflt descriptor value to 1500.
» Write and quit the edit session.
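If you prefer not to use an editor, the same change can be scripted. This is a minimal sketch under stated assumptions, not the method the lab prescribes: the stanza written below is an abbreviated, illustrative stand-in for real odmget output (only the three descriptors named in this exercise are shown), and the scratch file name is hypothetical so that the sketch does not touch /tmp/remmtu-object.

```shell
#!/bin/sh
# Sketch: change the deflt descriptor in a saved stanza without an editor.
# The stanza is an abbreviated, illustrative stand-in for odmget output.
obj=/tmp/remmtu-object.$$
cat > "$obj" <<'EOF'
PdAt:
	uniquetype = "if/EN/en"
	attribute = "remmtu"
	deflt = "576"
EOF

# Rewrite only the line that holds the deflt descriptor; keep its indentation.
sed 's/^\([[:space:]]*deflt = \)".*"/\1"1500"/' "$obj" > "$obj.new"
mv "$obj.new" "$obj"
cat "$obj"
```

Anchoring the pattern on the descriptor name keeps the other lines of the stanza untouched, which matters because odmchange replaces the whole object with the file contents.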
__ 22. Using the same qualification as on the retrieval, replace the ODM object with the
one in your edited file.
»# odmchange -o PdAt \
-q "uniquetype=if/EN/en AND attribute=remmtu" \
/tmp/remmtu-object
__ 23. Use a high level command to verify that the remmtu attribute default value has
changed. You can use either the en1 logical device name or the uniquetype value to
identify the object.
»# lsattr -D -l en1 -a remmtu
__ 24. Display the effective remmtu attribute value for en1. Did it change?
______________________________________________________________
»# lsattr -E -l en1 -a remmtu
» You should see that the effective attribute value has changed. The default value is
effective unless an override has been created with the chdev command. Any interface
that is configured afterward will also use the new default value automatically.

(Optional) Part 3 - Creating self-defined ODM classes

__ 25. Before creating an ODM class you need to specify the descriptors that are contained
in the class.
__ a. Create the directory /tmp/odm to hold the specification file and cd to that
directory.
»Suggested commands are:
# mkdir /tmp/odm
# cd /tmp/odm
__ b. Using an editor, create a file parts.cre (in your new working directory) with the
following class structure:

class parts {
long part_number;
char part_description[128];
char warehouse[4];
long contained_in;
}

__ 26. Create the ODM class using this class structure and check the structure of this
class. Write down the commands you used:
__________________________________________________________
__________________________________________________________
»Suggested commands are:
# odmcreate parts.cre
# odmshow parts
Identify in your present working directory, which new files have been created during
this step.
__________________________________________________________
»# ls
» parts.c and parts.h are the new files that were created.
What do you think is the purpose of these files?
__________________________________________________________
__________________________________________________________
__________________________________________________________
» As an administrator, you use the ODM command line interface to access the ODM
database files. For accessing the ODM from applications or system programs, an ODM
application programming interface (API) exists. These programs need to include the
files that have been created by odmcreate.

Where does the ODM class parts reside?


__________________________________________________________
» /etc/objrepos (if you did not change the ODMDIR variable)
__ 27. Create some objects in ODM class parts, using the following data:

Part Number   Description            Warehouse   Contained In
10001         Wheel                  a12         50001
10003         Frame                  a19         50001
10005         Saddle                 a01         50001
10006         Front wheel brake      a03         50001
10007         Rear wheel brake       a03         50001
50001         City Bike Easy Rider   x99
» # vi parts.add
Insert the data from the preceding table into parts.add, using the format
shown below:
parts:
part_number = "10001"
part_description = "Wheel"
warehouse = "a12"
contained_in = "50001"

parts:
part_number = "10003"
part_description = "Frame"
warehouse = "a19"
contained_in = "50001"
... (Add all the information from the table, using the same format.)
# odmadd parts.add
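Typing six stanzas by hand is error-prone, so the input file can also be generated. The following is a minimal sketch, not part of the original lab: the pipe-separated row layout and the scratch output name are assumptions made for illustration, and the sketch assumes that the top-level part, which has no container, can simply omit the contained_in descriptor. Check how odmadd treats a missing descriptor on your system before relying on that.

```shell
#!/bin/sh
# Sketch: build ODM stanzas for the parts class from pipe-separated rows
# (part_number|part_description|warehouse|contained_in).
out=/tmp/parts.add.$$

while IFS='|' read -r num desc wh parent; do
    printf 'parts:\n'
    printf '\tpart_number = "%s"\n' "$num"
    printf '\tpart_description = "%s"\n' "$desc"
    printf '\twarehouse = "%s"\n' "$wh"
    # Assumption: a part with no container leaves contained_in out.
    if [ -n "$parent" ]; then
        printf '\tcontained_in = "%s"\n' "$parent"
    fi
    printf '\n'
done > "$out" <<'EOF'
10001|Wheel|a12|50001
10003|Frame|a19|50001
10005|Saddle|a01|50001
10006|Front wheel brake|a03|50001
10007|Rear wheel brake|a03|50001
50001|City Bike Easy Rider|x99|
EOF

stanzas=$(grep -c '^parts:' "$out")
echo "wrote $stanzas stanzas to $out"
```

The generated file would then be passed to odmadd in place of the hand-edited parts.add.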
__ 28. List all objects that are contained in part 50001 (the City Bike Easy Rider). Write
down the command you used:
__________________________________________________________
» # odmget -qcontained_in=50001 parts
__ 29. Change the warehouse location for part Wheel to b10.
» Extract the object and place it in a file.
# odmget -qpart_description=Wheel parts > part_change

- Edit the file to change the warehouse location.


# vi part_change
...
warehouse="b10"
...
- At this point, you have two options:
Option 1:
• Remove the old record.
# odmdelete -qpart_description=Wheel -oparts
• Add the modified record.
# odmadd part_change
-OR-
Option 2:
# odmchange -qpart_description=Wheel -oparts part_change
- Verify the change.
# odmget -qpart_description=Wheel parts
__ 30. Remove ODM class parts from the system. Write down the command you used.
__________________________________________________________
» # odmdrop -o parts
__ 31. Use the shutdown command to reboot your system. You do not need to wait for it to
complete; you will log back into your LPAR in the next exercise.
» # shutdown -Fr

End of exercise


Exercise review/wrap-up
1. Ask the students where the following information comes from:
a. Output of lsdev -P command (Answer: Comes from PdDv)
b. Output of lsdev -C command (Answer: Comes from CuDv)
c. Output of lsattr -E command (Answer: Comes from PdAt and CuAt)
d. State of a device (Answer: Comes from CuDv)
e. Physical volume ID of a disk (Answer: Comes from CuAt)
2. What happens if you add a modified object before removing the old object? (Answer:
Will end up with both old object and modified object in object class)

Exercise 3. Error monitoring

(with hints)

Estimated time
Part 1 - 00:20
Part 2 - 00:25

What this exercise is about


This exercise has two parts. In the first part, you will work with the AIX
error logging facility. In the second part, you will work with the syslogd
daemon and the ODM error notification class errnotify.
At the choice of the instructor, this exercise may be broken into
multiple lab sessions. If that is the case, you should stop after
completing each exercise part and not continue with the next part until
the related concepts have been covered in lecture and discussion.

What you should be able to do


At the end of the lab, you should be able to:
• Determine what errors are logged on your machine
• Generate different error reports
• Start concurrent error notification
• Identify errors and warnings sent by the syslogd daemon
• Create and maintain the /etc/syslog.conf file
• Automate error logging with errnotify
• Redirect syslogd messages to the error log

Introduction
In Part 1 of this exercise, you will work with the AIX error logging
facility. You should do this part of the exercise during the first lab
session allotted to this exercise.
In Part 2 of this exercise, you will work with the syslogd daemon and
the ODM error notification class errnotify. You should do this part of
the exercise during the second lab session allotted to this exercise.
You will need root authority to complete this exercise.


Exercise instructions with hints


Preface
Two versions of these instructions are available; one with hints and one without. You
can use either version to complete this exercise. Also, please do not hesitate to ask the
instructor if you have questions.
All exercises of this chapter depend on the availability of specific equipment in your
classroom.
The output shown in the answers is an example. Your output and answers based on the
output may be different.
All hints are marked with a » sign.

Part 1 - Working with the error log


__ 1. Open a terminal emulator window and connect to your assigned LPAR (if you do not
already have a session).
__ 2. Generate a summary report of your system’s error log. Write down the command
that you (or SMIT) used:
____________________________________________________________
» # errpt
__ 3. Generate a detailed report of your system’s error log. Write down the command that
you (or SMIT) used:
____________________________________________________________
» # errpt -a | more
__ 4. Use the date command to obtain the current date and time in the format of:
mmddhhmmyy (month, day, hour, minute, year).
Record the result here: _____________________________________________
Modify this time stamp to reflect a time 1 day earlier.
Record that result here: ____________________________________________
» An example of how this could be done is:
# date +%m%d%H%M%y
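Adjusting the stamp by hand is easy to get wrong at a month or year boundary. The following is a minimal sketch, not part of the original lab, that computes the stamp for one day earlier with plain shell arithmetic. The function name day_before is hypothetical, and the sketch assumes a two-digit year between 2000 and 2099 so that every fourth year is a leap year.

```shell
#!/bin/sh
# day_before: hypothetical helper that prints the mmddhhmmyy stamp for exactly
# one day earlier. Assumes a two-digit year in the range 2000 to 2099.
day_before() {
    stamp=$1
    mm=${stamp%????????}      # first two characters: month
    rest=${stamp#??}
    dd=${rest%??????}         # next two characters: day
    rest=${rest#??}
    hhmm=${rest%??}           # hour and minute are passed through unchanged
    yy=${rest#????}           # last two characters: year
    # Strip one leading zero so the shell does not read 08 or 09 as octal.
    m=${mm#0}; d=${dd#0}; y=${yy#0}
    d=$((d - 1))
    if [ "$d" -eq 0 ]; then
        m=$((m - 1))
        if [ "$m" -eq 0 ]; then
            m=12
            y=$(( (y + 99) % 100 ))
        fi
        case $m in
            4|6|9|11) d=30 ;;
            2) if [ $((y % 4)) -eq 0 ]; then d=29; else d=28; fi ;;
            *) d=31 ;;
        esac
    fi
    printf '%02d%02d%s%02d\n' "$m" "$d" "$hhmm" "$y"
}

day_before 0301123011
```

The result can be used wherever the exercise asks for the starting time, for example by passing "$(day_before "$(date +%m%d%H%M%y)")" as the argument of errpt -s.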

__ 5. Using SMIT, generate the following reports. When prompted, select Filename and
do not request CONCURRENT error reporting.
• A summary report of all errors that occurred during the past 24 hours (place the
date recorded in previous step in the STARTING time interval field). Write down
the command that SMIT executes:

____________________________________________________________
» # smit errpt
After generating the report, press F6 or <Esc-6> to see the command used.
The command SMIT executes is errpt -s 'mmddhhmmyy' (where
mmddhhmmyy were the month, day, hour, minute, and year 24 hours ago).
• A detailed report of all records with an Error Class of S (software). Write down
the command that SMIT executes:
____________________________________________________________
» # smit errpt
After generating the report, press F6 or <Esc-6> to see the command used.
The command SMIT executes is errpt -a -d S.

__ 6. This step requires a windowing workstation where you can open telnet sessions
in multiple windows. In this environment, start a second terminal emulator
with a telnet connection to your LPAR and log in as root. This could be another
PuTTY session or simply a new command window running the telnet
command.
In one window, start up concurrent error logging, using the errpt command. Write
down the command that you used:
____________________________________________________________
» # errpt -c
In the other window, execute the errlogger command to generate an error entry.
Write down the command you used:
____________________________________________________________
» # errlogger "This is a test." (The text shown here is just an example.)
Is the complete error text shown in the error report?
____________________________________________________________
» No. Only the description OPERATOR NOTIFICATION is shown.
Stop concurrent error logging.
» <Ctrl-c>

__ 7. While the summary report does not show the errlogger specified text, the detailed
report should have it.
Run the errpt requesting a detailed report and look for the error records with your
text.

» # errpt -a | more

__ 8. Write down the characteristics of your error log:


LOGFILE: ___________________________________________________
Maximum LOGSIZE: __________________________________________
Memory BUFFER SIZE: _______________________________________
What command have you used to show these characteristics?
____________________________________________________________
» # smit errdemon
-OR-
# /usr/lib/errdemon -l
Notice that the labels for these characteristics used by SMIT are somewhat
different from the labels used when you execute the errdemon command
directly.
__ 9. List the entries that have an error class of operator.
» # errpt -d O
__ 10. Clean up all error entries that have an error class of operator. Write down the
command that you (or SMIT) used:
____________________________________________________________
» # errclear -d O 0
__ 11. Verify that the operator entries are now gone.
» # errpt -d O
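The verification can also be expressed as a count instead of an eyeball check. The following is a minimal sketch, not part of the original lab: the helper name count_class is hypothetical, the sample rows are made-up placeholders rather than real error identifiers, and the sketch assumes the summary layout in which the first line is a header and the fourth column holds the error class.

```shell
#!/bin/sh
# count_class: hypothetical helper that reads an errpt-style summary on stdin
# and prints how many entries have the given class letter in column 4.
count_class() {
    awk -v c="$1" 'NR > 1 && $4 == c { n++ } END { print n + 0 }'
}

# Placeholder summary text for demonstration; identifiers are not real.
sample='IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
XXXXXXX1   0812101511 T O OPERATOR       OPERATOR NOTIFICATION
XXXXXXX2   0812101011 T O OPERATOR       OPERATOR NOTIFICATION
XXXXXXX3   0812093011 P S SYSPROC        PLACEHOLDER SOFTWARE ENTRY'

printf '%s\n' "$sample" | count_class O
```

On the lab system, piping the output of errpt into count_class O should print 0 once the operator entries have been cleared.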

End of part 1

If you are doing “Part 1” of this exercise, stop here. Do not go on to Part
2.

Exercise (Part 1) review/wrap-up


Review the most important things students learned in this part of the exercise:
• Obtaining a summary error log report
• Obtaining a detailed error log report
• Setting up concurrent error logging
• Displaying error log characteristics
• Using errclear


Part 2

If you are doing Part 2 of this exercise, start here.

Section 1: Working with syslogd


__ 12. Back up the current /etc/syslog.conf file to the /tmp directory.
» # cp /etc/syslog.conf /tmp

__ 13. Edit the /etc/syslog.conf file and configure the syslogd daemon to log all daemon
messages to a file with the name /tmp/syslog.debug.
Write down the line that you added to /etc/syslog.conf:
____________________________________________________________
» daemon.debug /tmp/syslog.debug
__ 14. Execute the touch command and create the file /tmp/syslog.debug.
____________________________________________________________
» # touch /tmp/syslog.debug
__ 15. Refresh the syslogd daemon so it will pick up the changes. Write down the
command that you used:
____________________________________________________________
» # refresh -s syslogd
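If the configuration change is scripted rather than made in an editor, it helps to make it repeatable so that running it twice does not add a duplicate rule. The following is a minimal sketch, not part of the original lab: the helper name add_rule is hypothetical, and the demonstration deliberately works on a scratch file instead of the live /etc/syslog.conf.

```shell
#!/bin/sh
# add_rule: hypothetical helper that appends "selector<TAB>action" to a
# syslog.conf-style file unless that exact rule is already present.
add_rule() {
    conf=$1 selector=$2 action=$3
    # Exact field comparison avoids treating "." or "*" as pattern characters.
    if awk -v s="$selector" -v a="$action" \
        '$1 == s && $2 == a { found = 1 } END { exit !found }' "$conf"
    then
        return 0
    fi
    printf '%s\t%s\n' "$selector" "$action" >> "$conf"
}

# Demonstrate on a scratch file rather than the live configuration.
scratch=/tmp/syslog.conf.$$
: > "$scratch"
add_rule "$scratch" daemon.debug /tmp/syslog.debug
add_rule "$scratch" daemon.debug /tmp/syslog.debug   # second call is a no-op
cat "$scratch"
```

After a real change to /etc/syslog.conf, the log file still has to exist and the daemon still has to be refreshed, as in the steps above.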
__ 16. Stop the inetd daemon and restart it in debug mode. Use the appropriate System
Resource Controller command to start the inetd daemon in debug mode (-d flag).
Write down the commands that you used:
____________________________________________________________
____________________________________________________________
»Example commands are:
# stopsrc -s inetd
# startsrc -s inetd -a "-d"
__ 17. Use the telnet command to telnet back to your own system, log in, and then log
back out of the telnet session. This step is performed to log several debug
messages. Use your login name when you telnet to your system.
____________________________________________________________
» # telnet <hostname or IP address of your host>
# exit

__ 18. Analyze the content of the file /tmp/syslog.debug. Many debug messages from the
inetd daemon processes are shown.
____________________________________________________________
» # pg /tmp/syslog.debug

__ 19. Stop the inetd daemon and restart it without debug mode. Use the appropriate
System Resource Controller command to start the inetd daemon. Write down the
commands that you used:
____________________________________________________________
____________________________________________________________
» # stopsrc -s inetd
# startsrc -s inetd

__ 20. Change your /etc/syslog.conf. All messages should be directed to the AIX error log.
Write down what you have changed:
____________________________________________________________
» *.debug errlog

__ 21. Refresh the syslogd subsystem. Write down the command that you used:
____________________________________________________________
» # refresh -s syslogd
__ 22. Generate a syslogd message, for example, use an invalid password during a login.
Check that the message is posted to the error log.
____________________________________________________________
____________________________________________________________
» # login (Use an invalid password.) 
After three bad attempts, you will lose your telnet session. Either go to a spare session
(if you have one) or establish a new telnet session. Log in as root and check the error
log.
# errpt | more
# errpt -a | pg

__ 23. Restore the /etc/syslog.conf file from the backup which you earlier created in the
/tmp directory; then refresh the syslogd subsystem.

» The suggested commands are:


# cp /tmp/syslog.conf /etc
# refresh -s syslogd

Section 2: Error notification with errnotify


This part of the exercise demonstrates how to automate working with the error log.
__ 24. Create an errnotify object that mails a message to root, whenever an operator
message is posted to the errlog. Write down the stanza that you added:
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
»Suggested commands are:
# vi notify.add
errnotify: 
en_name="sample" 
en_persistenceflg=0 
en_class="O" 
en_method="errpt -a -l $1 | mail -s ERRLOG root"
» (Be careful to use the uppercase letter O for Operator.)
# odmadd notify.add
# odmget -q "en_name=sample" errnotify
__ 25. Execute the errlogger command and create an entry in the errlog. Write down the
command that you used:
____________________________________________________________
» # errlogger test entry in the log
__ 26. After a short time, check the mail for the root user. The mail processing is batched
and it could take more than a minute before the mail is delivered.
____________________________________________________________
» # mail 
?t

End of part 2

End of exercise

Exercise (part 2) review/wrap-up


Review the most important things students learned in this part of the exercise:
• Maintaining the syslogd
• Using the errnotify ODM class
• Redirecting syslogd messages to the AIX error log

Exercise 4. Basic Network Installation Manager configuration
(with hints)

Estimated time
00:55

What this exercise is about


Students configure an LPAR to act as a NIM master and server. They
define the other LPARs as client machines, allocate NIM resources to
those clients, and set them up to enable BOS installation.

What you should be able to do


At the end of the lab, you should be able to:
• Configure an LPAR to be a NIM master and server
• Define a NIM client machine and set it up for a BOS installation

Introduction
In this exercise, you will perform the following:
• Configuration of a NIM partition
• Definition of a NIM client machine and setup for a BOS installation

Requirements
• This workbook
• A Windows 2000 or Windows XP workstation with a Web browser
and internet connectivity
• An assigned System p (p5, p6, or p7) environment

Instructor exercise overview


Write all IP addresses and login/password information for the HMCs and the partitions on
the board.

If there is a problem with a NIM operation on a machine object (such as bosinst) and the
student needs to repeat the operation, they may need to first reset the state of the
machine. The command for this is: # nim -F -o reset <machine name>

Exercise instructions with hints


Preface
The exercise depends on the availability of specific equipment in your classroom. Many
of the steps require you to work as a team with the other students who share the same
server LPAR. The steps, near the end, where you define and configure installation
support for their assigned client LPAR can be done individually.
While you will configure your NIM server to support installation of an operating system
on your LPAR, we will not use it for this purpose in this course. Instead, in later lab
exercises, we will reconfigure the NIM server to provide support for network booting into
maintenance mode.
The output shown in the answers is an example. Your output and answers based on the
output may be different.
All hints are marked with a » sign.

Configuring the NIM Master LPAR


The environment, which was installed from the mksysb image, is not yet configured as a NIM
master. You must perform the following steps to create a NIM master environment.
The installed environment does contain a subdirectory which has all of the requirements of
an lppsource NIM resource.
__ 1. Connect to your assigned server LPAR and log in as root.
__ 2. Check the current OS level, including technology level (TL) and service pack (SP).
» $ oslevel -s
» It should be at 7100-00-01.
__ 3. Use the ls command to verify that the /lpp_source7100-00-01 is populated.
»$ ls -l /lpp_source7100-00-01
» There should be, at minimum, an installp directory.
__ 4. Determine if the environment has the required NIM filesets (NIM master and spot)
installed.
» Here is a sample command and what we would expect if all the software was
installed:
# lslpp -L | grep sysmgt.nim
bos.sysmgt.nim.client 7.1.0.0 C F Network Install Manager -
bos.sysmgt.nim.master 7.1.0.0 C F Network Install Manager -
bos.sysmgt.nim.spot 7.1.0.0 C F Network Install Manager - SPOT
» The default system installation only installs the NIM client fileset.
__ 5. If any filesets are missing, install them from the /lpp_source7100-00-01 directory.

» Suggested commands:
# smitty install_all

- Enter /lpp_source7100-00-01 for the source directory.


- Use F4 or <Esc-4> to select the software.
- Select the NIM manager (Master Tools) and spot filesets (F7 or <Esc-7>).
- Press Enter to run the command.

__ 6. Make sure your assigned server LPAR and boot client LPARs are defined in the
/etc/hosts file. If your host names are missing, add them with the correct IP address
resolution.
» Suggested commands:
# host <your server LPAR hostname>
# host <your client LPAR hostname>
(if necessary) # smit hosts
and add your hostnames.

__ 7. Use SMIT to initialize the NIM master. Specify a network name of ent0. The primary
network interface will be en0. Accept all other defaults.
# smit nim
Configure the NIM Environment ->
Advanced Configuration ->
Initialize the NIM Master Only ->

Configure Network Installation Management Master Fileset

Type or select values in entry fields.


Press Enter AFTER making all desired changes.

[Entry Fields]
* Network Name [ent0]
* Primary Network Install Interface [en0] +
Allow Machines to Register Themselves as Clients? [yes] +
Alternate Port Numbers for Network Communications
(reserved values will be used if left blank)
Client Registration [] #
Client Communications [] #

» Standard output shows that the nimesis and nimd subsystems were added and that the
nimesis subsystem was started.

__ 8. Press F3 or <Esc-3> as many times as needed to get back to the main NIM SMIT
screen titled Network Installation Management. If you exited out of SMIT, run
smitty nim.
__ 9. Use SMIT to define an AIX 7.1 TL00 SP01 lppsource object with a name of
lppaix71-00-01. We have already copied the AIX 7.1 filesets from media into
/lpp_source7100-00-01. Specify that the server is master.
# smitty nim
Perform NIM Administration Tasks ->
Manage Resources ->
Define a Resource ->
(from Resource Type menu, select:)
lpp_source = source device for optional product images

Define a Resource

Type or select values in entry fields.


Press Enter AFTER making all desired changes.

[Entry Fields]
* Resource Name [lppaix71-00-01]
* Resource Type lpp_source
* Server of Resource [master] +
* Location of Resource [/lpp_source7100-00-01] /
. . .

» The process checks the designated lppsource path to see whether it has the
minimum requirements (for creating the spot, and other NIM operations).

__ 10. In the following steps, you will create a shared product object tree (spot). This will
require a significant amount of storage. To avoid having any impact on the other file
systems, create a new JFS2 file system with 896 megabytes of space and a mount
point of /spots. Request that it be automatically mounted at system restart and also
mount it now.
» Following are suggested commands:
# crfs -v jfs2 -g rootvg -a size=896M -m /spots -A yes
# mount /spots

__ 11. Press F3 to get back to the Manage Resources screen. If you exit smit, reenter by
running smitty nim.
__ 12. Use SMIT to create a NIM SPOT resource called spot71-00-01. Specify the server
as master. Specify the lppsource you just defined, as the source of the SPOT. Store
the generated SPOT in the recently created /spots directory.
When you are finished entering the values, press Enter. Look in the standard out
and check for errors. This may take up to 20 minutes.
# smitty nim
Perform NIM Administration Tasks ->
Manage Resources ->
Define a Resource ->
(from Resource Type menu, select:)
spot = Shared Product Object Tree - equivalent to /usr file

Define a Resource

Type or select values in entry fields.


Press Enter AFTER making all desired changes.

[TOP] [Entry Fields]


* Resource Name [spot71-00-01]
* Resource Type spot
* Server of Resource [master] +
Source of Install Images [lppaix71-00-01] +
* Location of Resource [/spots] /
. . .
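For reference, the two SMIT panels used for the lppsource and the SPOT correspond to nim define operations on the command line. The following is a minimal sketch that only prints the commands and does not run them. The attribute lists are our assumption of a reasonable equivalent, built from the values entered in the panels, and are not taken from the lab text. Press F6 in SMIT to see the command it actually runs.

```shell
#!/bin/sh
# Sketch: print (do not run) command-line equivalents of the two
# "Define a Resource" panels, using the names chosen in this exercise.
lpp_name=lppaix71-00-01
lpp_dir=/lpp_source7100-00-01
spot_name=spot71-00-01
spot_dir=/spots

# Assumed equivalents; compare with the command SMIT shows under F6.
lpp_cmd="nim -o define -t lpp_source -a server=master -a location=$lpp_dir $lpp_name"
spot_cmd="nim -o define -t spot -a server=master -a source=$lpp_name -a location=$spot_dir $spot_name"

echo "$lpp_cmd"
echo "$spot_cmd"
```

Printing the commands first makes it easy to review the resource names and paths before anything is defined on the NIM master.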

__ 13. The remaining part of this exercise can be done separately by each of the team
members, each configuring for their own assigned client LPAR.
Press F3 as many times as required to get to the Perform NIM Administration Tasks
screen or use smitty nim -> Perform NIM Administration Tasks. From here, we
will create a machine object for the AIX client.
# smitty nim
Perform NIM Administration Tasks ->
Manage Machines ->
Define a Machine ->

__ 14. When prompted, enter your client’s hostname in the Host Name of Machine field
and press Enter. This fails if the hostname cannot be resolved. Make sure your client’s
hostname is in the /etc/hosts file.

Note

In the remaining parts of this exercise, the example commands use the machine names
that were on the development system. Your names are likely different and should be
substituted into the example commands.

__ 15. On the Define a Machine screen, use the following values:


- Machine type = standalone
- Hardware Platform Type = chrp
- Cable Type = N/A or TP
- Communication protocol = nimsh
- accept the defaults for all other fields.
Press Enter to run the command. When the command is done, press F10 or ESC-0
to exit.
Define a Machine

Type or select values in entry fields.


Press Enter AFTER making all desired changes.

[TOP] [Entry Fields]


* NIM Machine Name [sys304_118]
* Machine Type [standalone] +
* Hardware Platform Type [chrp] +
Kernel to use for Network Boot [64] +
Communication Protocol used by client [nimsh] +
Primary Network Install Interface
* Cable Type N/A +
Network Speed Setting [] +
Network Duplex Setting [] +
* NIM Network ent0
* Host Name sys304_118


Note

Next, you will set up NIM to be able to install an operating system. But you must not
actually do an overwrite install in this class. Follow the instructions closely to ensure that
NIM does not automatically initiate an installation. Initiating an installation will destroy
customizations that are needed for the other exercises which follow this one.

__ 16. Now, let’s enable the bos_inst operation for the new NIM client. This sets up the NIM
master’s environment so that resources are made available and so that NIM can
properly handle the installation of this NIM client.
Invoke SMIT with the smitty command and the nim_bosinst fast path:
# smitty nim_bosinst
You are asked to choose a TARGET for the operation. This is asking you to select
the NIM client host on which you want to load AIX. Highlight your partition’s
hostname and press Enter to select it.
You need to enter the following information on the next few entry screens:
__ a. Pick rte for the TYPE of installation.
__ b. Pick an LPP source. There should only be the one you defined earlier.
__ c. Pick a SPOT resource. There should only be the one you defined earlier.

__ 17. This should bring up a dialog panel that looks like:


Install the Base Operating System on Standalone Clients

[TOP] [Entry Fields]


* Installation Target sys304_118
* Installation Type rte
* SPOT spot71-00-01
LPP_SOURCE [lppaix71-00-01] +

Fill in the additional fields as follows:


__ a. Answer yes to the question:
Accept new license agreements?
__ b. Scroll down to the question:
Initiate reboot and installation now?

Set the value to no. To change the value, press F4 or ESC 4, highlight your
choice, and press Enter.
__ c. Press Enter to effect the setup.
__ d. When complete, exit SMIT (F10 or ESC 0).

__ 18. Verify that NIM is now ready to support the base operating system installation
(bos_inst) operation with your client:
• List the attributes of your LPAR’s machine object and look for a Cstate value of
“BOS installation has been enabled”.
• Look for network boot support for your LPAR in /etc/bootptab.

• Verify that the allocated resources are NFS exported.


# lsnim -l <your client’s name>

# lsnim -l sys304_118
sys304_118:
class = machines
type = standalone
connect = shell
platform = chrp
netboot_kernel = 64
if1 = ent0 sys304_118 0
cable_type1 = N/A
Cstate = BOS installation has been enabled
prev_state = ready for a NIM operation
Mstate = currently running
boot = boot
lpp_source = lppaix71-00-01
nim_script = nim_script
spot = spot71-00-01
control = master

(Look for a “Cstate” value of “BOS installation has been enabled”.)

# cat /etc/bootptab
. . .
sys304_118:bf=/tftpboot/sys304_118:ip=10.6.52.118:ht=ethernet:sa=10.6.52.117:sm=255.255.240.0:
sys304_119:bf=/tftpboot/sys304_119:ip=10.6.52.119:ht=ethernet:sa=10.6.52.117:sm=255.255.240.0:

# exportfs
. . .
/export/nim/scripts/sys304_118.script -ro,root=sys304_118,access=sys304_118
/spots/spot71-00-01/spot71-00-01/usr -ro,root=sys304_118:sys304_119,access=sys304_118:sys304_119
/lpp_source7100-00-01 -ro,root=sys304_118:sys304_119,access=sys304_118:sys304_119
/export/nim/scripts/sys304_119.script -ro,root=sys304_119,access=sys304_119
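
The Cstate and bootptab checks in step 18 can also be scripted. The following is a minimal sketch and not part of the course materials: nim_cstate and bootptab_field are hypothetical helper names. Both only parse text from standard input, so they can be tried on saved output first. The bootptab helper assumes each entry sits on a single line, as it does in the file itself.

```shell
#!/bin/sh
# Print the Cstate value found in "lsnim -l <client>" output (read from stdin).
# Lines such as "Cstate_result = ..." are deliberately not matched.
nim_cstate() {
    awk '/^[ \t]*Cstate[ \t]*=/ { sub(/^[^=]*=[ \t]*/, ""); print; exit }'
}

# Print one tag value (ip, sa, sm, bf, ...) for one host from bootptab text.
# $1 = host name, $2 = tag name; bootptab entries are read from stdin.
bootptab_field() {
    awk -F: -v host="$1" -v tag="$2" '
        $1 == host {
            for (i = 2; i <= NF; i++) {
                n = index($i, "=")
                if (n > 0 && substr($i, 1, n - 1) == tag) {
                    print substr($i, n + 1)
                    exit
                }
            }
        }'
}

# Demonstration with values copied from the example output above.
printf '%s\n' 'Cstate = BOS installation has been enabled' \
              'prev_state = ready for a NIM operation' | nim_cstate

entry='sys304_118:bf=/tftpboot/sys304_118:ip=10.6.52.118:ht=ethernet:sa=10.6.52.117:sm=255.255.240.0:'
printf '%s\n' "$entry" | bootptab_field sys304_118 sa
```

On the NIM master, the same helpers could be fed live data, for example lsnim -l sys304_118 | nim_cstate, or bootptab_field sys304_118 ip < /etc/bootptab.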

Note

Do not execute an installation of AIX into your LPAR at this point. Doing so will destroy
customizations which are needed to complete the remaining exercises. Many other
courses provide opportunities to execute new AIX installations from NIM. If you wish to do
this at the end of the course, that would be the best time.

__ 19. While we have set up NIM to overwrite your LPAR with a new image, we are
not going to execute an installation. If we wanted to do this, we would only need to
network boot the LPAR. The prerequisite class and many other classes in the
curriculum will have you network boot a new logical partition in order to install a new
operating system.

End of exercise

Exercise 5. System initialization: Accessing a boot image
(with hints)

Estimated time
Part 1 - 00:15
Part 2 - 00:10
Part 3 - 00:20
Part 4 - 00:20
Part 5 - 00:15
Total - 01:20

What this exercise is about


This exercise will review the hardware boot process of an AIX system
and provide practice in dealing with problems locating and loading a
boot image.

What you should be able to do


At the end of the lab, you should be able to:
• Boot a machine in maintenance mode.
• Repair a corrupted boot logical volume.
• Manage multi-path bootlists.

Introduction
This exercise has five parts:
• Part 1: Identify information for your system.
• Part 2: Prepare NIM server to support maintenance boot.
• Part 3: Validate successful maintenance boot.
• Part 4: Repair a corrupted boot logical volume.
• Part 5: Working with multi-path bootlists.
All instructions in this exercise require root authority.


Requirements
• The program /home/workshop/ex5prob1
• A NIM server that you can customize to support your LPAR.

Instructor exercise overview


Be sure to communicate the IP address of HMCs assigned to the teams. Also, identify the
managed system and LPAR names that are assigned. The remote server facility should
have provided this information.

Common student problems


Many students are not familiar with booting a machine in maintenance mode. We will
emphasize this in this exercise.

Known hardware/software problems


None.

Exercise instructions with hints

Preface
Two versions of these instructions are available; one with hints and one without. You
can use either version to complete this exercise. Also, please do not hesitate to ask the
instructor if you have questions.
All exercises of this chapter depend on the availability of specific equipment in your
classroom.
The output shown in the answers is an example. Your output and answers based on the
output might be different.
All hints are marked with a » sign.

Part 1: Identifying information on your system

__ 1. What is the boot sequence of your system for a normal boot?


Boot device:___________________________
What is the command you used, to determine the bootlist? ________________
» # bootlist -m normal -o

__ 2. Identify which disks are contained within the rootvg:


____________________________________________________________
What command did you use? ___________________________________
» # lsvg -p rootvg

What is the logical volume type of hd5? _________________________


What command did you use? ___________________________________
» # lsvg -l rootvg
» TYPE: boot

Which disk is the bootable disk? (That means the disk that contains the boot
logical volume hd5): ______________________
What command did you use? ___________________________________
» lspv -l hdisk0 (for example) or lslv -m hd5


__ 3. If the bootlist has more than one device, set the normal bootlist so it contains only
the bootable hard disk.
» (if needed) # bootlist -m normal hdisk0
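
To decide whether step 3 is needed, the device names can be pulled out of the bootlist output. This is a sketch only: bootlist_devices is a hypothetical helper name, and the sample lines merely illustrate the usual "device blv=... pathid=..." layout of bootlist -o output. Your output will differ.

```shell
#!/bin/sh
# Print each distinct device name once, in the order it first appears.
# Input: the output of "bootlist -m normal -o" on stdin.
bootlist_devices() {
    awk 'NF > 0 && !seen[$1]++ { print $1 }'
}

# Illustrative sample, not taken from a real system.
sample='hdisk0 blv=hd5 pathid=0
hdisk0 blv=hd5 pathid=1
hdisk1 blv=hd5 pathid=0'

count=$(printf '%s\n' "$sample" | bootlist_devices | wc -l)
# Arithmetic expansion strips any padding that wc adds on some systems.
echo "distinct boot devices: $((count + 0))"
```

On a live system the function would be used as bootlist -m normal -o | bootlist_devices. More than one name in the result means the bootlist needs to be reduced.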

__ 4. The Logical Volume Manager uses names and IDs when storing information.
Complete the following table that maps names to IDs:
rootvg VGID
First disk PVID
Second disk PVID
Be careful, the window might need to be enlarged to see the entire output. The
VGID is 32 characters long - be sure to record all of it.
What command did you use to determine the rootvg VGID? _______________
»# lsvg rootvg

What command did you use to determine the physical volume IDs? __________
»# lspv

__ a. Using odmget, identify the attribute pvid of one of your disks from ODM class
CuAt.
What command did you use?_____________________________________
»odmget -q "name=hdisk0 and attribute=pvid" CuAt

What difference do you see with the ID value? ________________________


» The ODM stores physical volume IDs in a 32-digit field: the 16-digit ID of the disk
followed by 16 zeros. lspv shows just the first 16 digits.
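
The relationship between the two forms of the ID can be sketched in a few lines of shell. The function names are hypothetical and the PVID used here is a made-up value, not one from the lab systems.

```shell
#!/bin/sh
# Convert the 16-digit PVID shown by lspv into the 32-digit form held in the ODM.
pvid_to_odm() {
    printf '%s0000000000000000\n' "$1"
}

# Convert the 32-digit ODM value back into the 16-digit form shown by lspv.
odm_to_pvid() {
    printf '%s\n' "$1" | cut -c1-16
}

pvid=00c35b90a1b2c3d4            # hypothetical PVID for illustration
odm=$(pvid_to_odm "$pvid")
echo "$odm"                      # 32 digits: the PVID plus 16 zeros
odm_to_pvid "$odm"               # back to the original 16 digits
```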

__ 5. Display your configured interfaces. What is your configured Ethernet interface name?
____________________________________________________________

» Suggested commands are:


# ifconfig -a
-or-
# netstat -in

__ 6. Display and record the physical location code of the Ethernet adapter which your
interface is using (the numeric suffix will match).
_____________________________________________________________
» Suggested commands are:
# lscfg | grep ent

Part 2: Preparing NIM to support booting to maintenance mode


To fix various problems in the following lab exercises, you will need to boot to
maintenance mode. Since we are unable to provide every LPAR with an optical or
tape drive (mounted with bootable media), instead you will boot over the network
using NIM.
To support this, you must first configure NIM to provide maintenance boot services.
__ 7. Using your telnet session to your server LPAR (start one if you do not already have
one), log in as root.
__ 8. List the NIM standalone machine objects and locate your client LPAR in the list, by
executing:
# lsnim | grep standalone
Your machine object name should match your LPAR’s hostname.
» Following are the example commands and output:
# lsnim | grep standalone
sys304_118 machines standalone
sys304_119 machines standalone

__ 9. List the attributes of your machine object, by executing:


# lsnim -l <your-machine-object-name>


» Following are the example commands and output:


# lsnim -l sys304_119
sys304_119:
class = machines
type = standalone
connect = shell
platform = chrp
netboot_kernel = mp
if1 = net_en0 sys304_119 0
cable_type1 = N/A
Cstate = ready for a NIM operation
prev_state = currently running
Mstate = not running
cpuid = 00C35B904C00
Cstate_result = success

__ 10. If the Cstate value is not ready for a NIM operation, force reset the state of your
client machine object, by executing:
# nim -o reset -F <your-machine-object-name>
» Following are the example commands and output:
# nim -o reset -F sys304_119
# lsnim -l sys304_119 | grep Cstate
Cstate = ready for a NIM operation
Cstate_result = reset

__ 11. The maintenance boot operation requires that a SPOT is allocated to the machine.
Check that there is a SPOT allocated, by executing:
# lsnim -l <your-machine-object-name> | grep spot
If there is not a SPOT allocated, then allocate one that matches the version and
release of your client LPAR’s operating system, by executing:
# nim -o allocate -a spot=spot71-00-01 <your-machine-object-name>
» Following are the example commands:
# lsnim -l sys304_119 | grep spot
(if needed) # nim -o allocate -a spot=spot71-00-01 sys304_119

__ 12. Invoke the maint_boot operation for your client LPAR, by executing:
# nim -o maint_boot <your-machine-object-name>

» Following is an example command:


# nim -o maint_boot sys304_119

__ 13. Verify that your client LPAR machine object now has a Cstate of maintenance boot
has been enabled, by executing:
# lsnim -l <your-machine-object-name> | grep Cstate
» Following are the example commands and output:
# lsnim -l sys304_119 | grep Cstate
Cstate = maintenance boot has been enabled
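
Steps 10 through 12 amount to a small decision procedure, which can be sketched as a script. This is an illustration under stated assumptions, not a course-supplied tool: plan_maint_boot is a hypothetical helper that reads lsnim -l output from standard input and only prints the nim commands the steps call for. It executes nothing, and the plan reflects the state it was given, so re-check with lsnim after running any of the printed commands.

```shell
#!/bin/sh
# Print the NIM commands needed to enable a maintenance boot.
# $1 = NIM machine object name, $2 = SPOT to allocate if none is allocated.
# Input: "lsnim -l <machine object>" output on stdin.
plan_maint_boot() {
    awk -v client="$1" -v spot="$2" '
        /^[ \t]*Cstate[ \t]*=/ {
            v = $0
            sub(/^[^=]*=[ \t]*/, "", v)   # drop the attribute name and "="
            sub(/[ \t]+$/, "", v)         # drop trailing blanks
            cstate = v
        }
        /^[ \t]*spot[ \t]*=/ { has_spot = 1 }
        END {
            if (cstate != "ready for a NIM operation")
                print "nim -o reset -F " client
            if (!has_spot)
                print "nim -o allocate -a spot=" spot " " client
            print "nim -o maint_boot " client
        }'
}

# Attributes copied from the step 9 example output (no SPOT allocated).
printf '%s\n' \
    'sys304_119:' \
    'Cstate = ready for a NIM operation' \
    'Mstate = not running' \
    'Cstate_result = success' |
    plan_maint_boot sys304_119 spot71-00-01
```

For the sample input, the plan contains the allocate command followed by the maint_boot command, because the client is already in the ready state but has no SPOT allocated.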

Part 3: Booting to maintenance mode


__ 14. Before creating any boot problems, verify that you can boot into maintenance mode
and then reboot back to multiuser mode. This will be crucial to fixing the problem.
The procedure for LPARs, at a high level, is as follows:
__ a. Shut down your AIX operating system in your client LPAR.
__ b. Access the HMC and locate the icon for your client LPAR.
__ c. Activate your client LPAR into SMS mode.
__ d. Network boot your client LPAR into maintenance mode using SMS.
__ e. Shut down your client LPAR from the current maintenance mode.
__ f. Start your client LPAR back up into multi-user mode.
Except for the shutdown of a running AIX operating system, details of this will
depend on the level of HMC with which you are working. This course is written to
expect HMCv7 or later.
Execute the above procedure. Check off each step (above) as you complete it.
If you are not well versed in HMCv7 operations, the details for working with HMCv7
and SMS follow in the next step. They are there for reference and include
procedures which you will not need to complete this step.

__ 15. This step is just for reference to support the previous step and for later steps. Do not
re-execute these procedures at this point. They assume that you are working with
the graphic web interface for HMC version 7.
__ a. At your AIX client LPAR (be sure it is not your server LPAR), shut down your AIX
operating system:


- If you have access to a root level prompt on the client LPAR, execute
shutdown -F from the AIX root level command prompt in your LPAR.
- If you do not have a root level prompt at your client LPAR, then:
• Access the HMC and locate your LPAR as described in substep b.
• From the task menu, select Operations -> Shutdown.
• On the resulting Shutdown Partitions popup, select the Operating System
Immediate button and click OK.
• If the Operating System Immediate button is greyed out, then select the
Immediate option and click OK.
__ b. Access the HMC and locate your LPAR:
1) Start a Web browser on your lab workstation (note that the workstation can
be a portal machine at a remote location).
2) Enter a URL of: https://<IP address of your HMC>.
This will take you to an HMC status window which has three status indicators
and a link with the text:
“Log on and Launch the Hardware Management Console web application”
3) Click the log-on link to launch the HMC logon panel.
4) Enter your assigned HMC user ID and password and click the logon
button. This should launch the HMC Web interface.
5) In the left navigation area click Systems Management. The Systems
Management item should expand to show “Servers” and “Custom Groups.”
6) Click the Servers item. The “Servers” item should expand to show the
managed systems.
7) Click the managed system which is assigned to your team. In the Content
Area on the right, you should see a list of logical partitions defined for your
assigned system.
8) Select your assigned logical partition by clicking the box under “Select” for
your LPAR. After a short delay you should see a small menu icon appear to
the right of your LPAR name, and the Tasks Area on the bottom half of the
panel should update to reflect operations which are appropriate for the
selected target.
9) If you left-click the new menu icon (to the right of the LPAR name), then you
should see a menu which is similar to what you see in the Tasks Area.

__ c. Activate your LPAR into SMS mode:


1) When the partition state is Not Activated, proceed to activate the partition.

2) Select the partition (if not already selected).


3) When the small menu icon appears, click it to show the menu and move your
mouse over the Operations task and then the revealed Activate task.
4) When the Activate subtasks appear, click the Profile subtask.
5) In the pop-up window labeled Activate Logical Partition: <your lpar
name>, click the small box next to Open a terminal window or console
session.
6) Click the Advanced button. This should result in a new pop-up window
labeled “Activate Logical Partition - Advanced.”
7) In the new pop-up window, click the menu icon to the right of “Boot Mode”
and select SMS.
Click OK to exit this pop-up.
8) On the panel that is labeled Activate Logical Partition: <your lpar name>,
click OK. Respond to any security pop-up windows, in a manner which will
continue with the connection establishment.
Respond yes to: Web sites certificate can not be verified. continue?
Respond Run to: Application digital certificate can not be verified. Do you
want to run the application?
Respond No to: Application Components could indicate a security issue.
Block potentially unsafe components from being run?
A virtual terminal window should appear and you should see the system
console displays for a booting system, ending in an SMS menu. (If you do not
see the virtual terminal window, it is likely behind some other window and you
will need to bring it to the foreground).

__ d. Network boot your LPAR into maintenance mode using SMS:


1) From the SMS main menu, select options
2. Setup Remote IPL (Initial Program Load) ->
2) From the list of Network Interface Card (NIC) Adapters, choose the first one
(the one that matches the location code recorded earlier).
3) On newer systems, you will be prompted for the protocols to use. Select
IPv4 and bootp.
4) This should bring up the Network Parameters panel. Select option:
1. IP Parameters
5) On the IP Parameters panel, if the network parameters are already set,
validate that they are correct. (The server IP address, if already set, is likely to
be wrong for this exercise. Set it to the address of your assigned server
LPAR.) If they are not correct, then modify them.
The way to modify the values is to enter the number of the parameter you
want to change, type in the replacement value and then press Enter.
When you are comfortable that the IP Parameters are correct, return to the
previous Network Parameters panel by pressing the <Esc> key.
6) Next use the ping test to see if the parameters allow you to communicate with
the designated server. Select:
3. Ping Test
and
1. Execute Ping Test
If you do not get a “Ping Success” result, then check the status of the server
and your IP Parameter values.
7) Back out to the main menu, using the <Esc> key.
8) From the SMS main menu, select options:
5. Select Boot Options ->
1. Select Install/Boot Device ->
6. Network
When prompted for a network service, select bootp.
Select the device number of your network adapter
9) Then select:
2. Normal Mode Boot
1. Yes (to exit SMS)

You should see the tftp packet count incrementing as it downloads the boot
image. It is not unusual to see one or two retry attempts before the download
is successful; be patient.
Then you should see the system booting up into maintenance mode. It will
prompt you to identify the system console. Type 1 and press Enter.
It will next ask you to identify the language to be used while in maintenance
mode. Type 1 (for English) and press Enter.
It should then display the Maintenance menu.
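
When validating the IP Parameters in substep d, it helps to know whether the client and the NIM server are on the same network, because a gateway address is only needed when they are not. The following sketch computes the network address with shell arithmetic. The function names are hypothetical, and the addresses and mask are the ones from the example bootptab entries in Exercise 4. Octets are assumed to be written without leading zeros.

```shell
#!/bin/sh
# Print the network address for a dotted-quad IPv4 address and subnet mask.
# $1 = address, $2 = mask. The body runs in a subshell so that the IFS
# change does not leak into the caller.
ipv4_network() (
    IFS=.
    set -- $1 $2      # splits both values into eight positional parameters
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

# Succeed if two addresses fall in the same network for the given mask.
# $1 = first address, $2 = second address, $3 = mask.
same_subnet() {
    [ "$(ipv4_network "$1" "$3")" = "$(ipv4_network "$2" "$3")" ]
}

client=10.6.52.118
server=10.6.52.117
mask=255.255.240.0

if same_subnet "$client" "$server" "$mask"; then
    echo "client and server share network $(ipv4_network "$client" "$mask")"
else
    echo "different networks: a gateway address is needed"
fi
```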

__ e. Shut down the partition from maintenance mode:


1) If you have accessed the volume group, simply run sync;sync;halt - this
can be done because there is no disk activity in the current state. If you have
not accessed the volume group (perhaps you are looking at the maintenance
menu), you will have to use the HMC to shut down the LPAR, as follows:
2) On the HMC Content Area, make sure your LPAR (and only your LPAR) is
currently selected.
3) Click the menu icon, move your mouse over the Operations task and then
click the Shutdown subtask. This should result in a pop-up window.
4) In the shutdown window, select Immediate and then click OK. When
prompted, confirm that you want to shut down the partition. Answer no if
asked if you are replacing a cache battery.
The shutdown immediate option is only valid because of the lack of any disk
activity in the current state. From a multiuser mode do not use the HMC
shutdown immediate. The operating system shutdown command is preferable.
5) The partition shutdown is complete when the “Status” field for your LPAR
changes from “Running” to “Not Activated.”

__ f. Start your partition in multiuser mode (normal bootlist). When the partition state
is Not Activated, proceed to activate the partition:
1) Select the partition (if not already selected).
2) When the small menu icon appears, click it to show the menu and move your
mouse over the Operations task.
3) When the subtasks appear, click the Activate subtask.
4) In the pop-up window “Activate Logical Partition: <your lpar name>”, click the
small box next to Open a terminal window or console session (unless you
already have a virtual console window open) and click OK.
5) You should eventually see a login prompt appear in the virtual console
window.

Part 4: Repair a corrupted boot logical volume


__ 16. On your assigned client LPAR, if you have not already tested the boot to
maintenance procedure, do so now. Boot to maintenance mode and then reboot
back to multiuser mode. This verifies that this procedure can be successfully
implemented. You will need it when dealing with the lab problems.


__ 17. If you do not already have a virtual console open to your client LPAR, then open one
now and log in as root. What follows are instructions to open a virtual console:
__ a. Locate and select your LPAR, as described earlier.
__ b. Left-click the menu to the right of your LPAR name.
__ c. Left-click the Console Window item.
__ d. Left-click the Open Terminal Window item.

__ 18. In the virtual console window, execute the program /home/workshop/ex5prob1.


When the prompt is returned, shut down and reboot the system.
»# /home/workshop/ex5prob1
You have successfully broken your machine!
Now, run shutdown -Fr to attempt a reboot.
»# shutdown -Fr

What happens on your system during the reboot? Examine both the HMC
displayed reference code for your LPAR and the virtual console for your LPAR.
______________________________________________________________
______________________________________________________________
______________________________________________________________
______________________________________________________________
» On an LPAR system (using the HMC console):
• Access the HMC and locate your partition as described in the previous
section.
• The state of the partition will depend on the level of code. On older
POWER5 systems you might see a state of open firmware. On newer
systems you might see a state of “starting” with a reference code of
AA060011.
The boot record at the beginning of your partition has been removed.
When an LPAR is unable to locate a boot image, its behavior depends on
the firmware level. On older firmware levels it automatically booted to
SMS, which is a menu front end to the system firmware. At newer
firmware levels, it repeatedly retries the bootlist and displays a message
of:
“No OS image was detected by firmware
At least one disk in the bootlist was not found yet
Firmware is now retrying the entries in the bootlist

Press ctrl-c to stop retrying.”

__ 19. Terminate any looping condition that might exist.


» Press ctrl-c to stop retrying.
» This signal will trigger a boot to SMS mode.

__ 20. Boot to maintenance mode to do the repair.


On the virtual console for your LPAR, you should see an SMS menu (resulting from
termination of the retry attempts to find a boot image).
Use SMS to execute a network boot of your system.
1) From the SMS main menu, select options:
5. Select Boot Options ->
1. Select Install/Boot Device ->
6. Network
When prompted for a network service, select bootp.
Select the device number of your network adapter
2) Then select:
2. Normal Mode Boot
1. Yes (to exit SMS)

You should see the tftp packet count incrementing as it downloads the boot
image. Then you should see the system booting up into maintenance mode.
It will prompt you to identify the system console. Type 1 and press Enter.
It will next ask you to identify the language to be used while in maintenance
mode. Type 1 (for English) and press Enter.
It should then display the Maintenance menu.
If the corresponding NIM machine object is in the correct state, your system should
boot to maintenance mode.

__ 21. Repair the boot logical volume.


The procedure for using the maintenance menu to repair the boot logical volume is
the same for all environments:
__ a. Access the rootvg with all mounted file systems.


» Since you are booting from NIM, you will already be at the Maintenance menu.
» From the Maintenance menu, select option 1, Access a Root Volume Group.
» Type 0 to continue.
» Next, the Access a Root Volume Group screen is displayed. This screen lists all of the
volume groups (root and otherwise) on your system.
» Select the option for the root volume group whose logical volume information you want
to display. If there are multiple volume groups to choose from, choose the one which
matches the VGID that you recorded in Part 1 of this exercise (it is likely identified as
hdisk0).
» After entering your selection, the Volume Group Information screen is displayed.
» Select option 1, Access this volume group and start a shell. Selecting this choice
imports and activates the volume group and mounts the file systems for this root volume
group before providing you with a shell and a system prompt.

__ b. In the maintenance shell, check that hdisk0 is in the normal bootlist. Also check
that the rootvg actually has a boot logical volume on it. Correct if needed.
» The suggested commands are:
# bootlist -o -m normal
# lsvg -l rootvg

__ c. In the maintenance shell, rebuild the boot image on the boot logical volume.
Ensure that your changes are committed to disk. Write down the commands you
used.
____________________________________________________________
____________________________________________________________
» The suggested commands are:
# bosboot -ad /dev/hdisk0 (for example)
# sync
# sync
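
For the check in substep b that rootvg actually has a boot logical volume, the TYPE column of the lsvg -l output can be filtered. A minimal sketch follows: boot_lvs is a hypothetical helper name, and the sample listing is illustrative rather than captured from a lab system.

```shell
#!/bin/sh
# Print the names of logical volumes whose TYPE column is "boot".
# Input: "lsvg -l rootvg" output on stdin.
boot_lvs() {
    awk '$2 == "boot" { print $1 }'
}

# Illustrative sample of lsvg -l output.
sample='rootvg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
hd5                 boot       1       1       1    closed/syncd  N/A
hd6                 paging     8       8       1    open/syncd    N/A
hd4                 jfs2       4       4       1    open/syncd    /'

blv=$(printf '%s\n' "$sample" | boot_lvs)
if [ -n "$blv" ]; then
    echo "boot logical volume found: $blv"
else
    echo "no boot logical volume in this output"
fi
```

In the maintenance shell the function would be used as lsvg -l rootvg | boot_lvs.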

__ 22. If the command executes successfully, reboot your system in normal mode.
» Do a reboot:
# reboot
» We would normally recommend a shutdown command, but since there is no disk
activity in the current state, it is safe to use the reboot command. Also, use of the
shutdown command would generate multiple errors related to the assumption that the
shutdown is being issued from a multi-user mode (which is not true in this situation).

__ 23. When you receive a login prompt, the repair is complete.

Part 5: Working with multi-path bootlists


This section requires that your assigned LPAR have a fibre channel adapter with a LUN
that has been zoned to it. The fibre channel adapter may be either a physical adapter or a
virtual adapter.
When a boot disk is accessed over a storage area network, there can be multiple paths
to that disk. Once AIX is running, path management software can automatically fail over
to an alternate path when there is a problem. On the other hand, the firmware needs to be
explicitly told which paths to use in accessing the disk during the boot process. This
section is about managing the bootlist in that situation.
__ 24. Login to your assigned client LPAR, if not logged in already.
__ 25. List the devices of class disk. Do you see a disk with an AIX location code and a
description which indicates it is fibre channel (FC) attached? What is the name of
that disk? _______________________________________________________
# lsdev -c disk

__ 26. List the configured devices and filter for just disk devices. Look for a disk which has
the world-wide name (ww_name) for the remote storage subsystem’s port as part of
its physical location code. The ww_name will begin with a W, followed by the
hexadecimal identifier. What is the name of the disk?
________________________________________________________________
»# lscfg | grep disk

__ 27. Identify the parent device of the disk. What is the name and description of the parent
device? _________________________________________________________
»# lsparent -C -l hdisk#
-OR-
# lsdev -l hdisk# -F parent

__ 28. Identify the parent device of the device you just described. What is the name and
description of this device?
_________________________________________________________
»# lsparent -C -l fscsi0
-OR-
# lsdev -l fscsi0 -F parent

__ 29. List the paths defined for the previously identified fibre channel attached disk,
requesting the pathid as part of the information. How many are there? Do the paths
relate to the parent device?
_________________________________________________________
»# lspath -l hdisk# -t
» Even though there is only one Fibre Channel adapter, there are multiple paths. That is
because there are multiple alternative paths in the fabric of the SAN.

__ 30. Display the normal mode bootlist. Record the current devices in the bootlist.
_________________________________________________________
»# bootlist -o -m normal

__ 31. Update the normal mode bootlist to include the existing device, the fibre channel
attached disk, and one other non-FC attached disk (such as hdisk1). Keep the
current device as the first in the bootlist order and place the other non-FC attached
disk as the last device.
»# bootlist -m normal hdisk0 hdisk# hdisk1

__ 32. Display the normal mode bootlist. Record the devices in the bootlist.
_________________________________________________________
_________________________________________________________
_________________________________________________________
__________________________________________________________
_________________________________________________________
Do you notice anything odd? __________________________________
_________________________________________________________
»# bootlist -o -m normal
» When you specify a fibre channel attached device for the bootlist without any path
restrictions, an attempt is made to add an entry for each of the known paths. The number
of paths can then fill up the bootlist capacity, preventing a later disk from being included.
» In the current example, the bootlist filled up. On the development system, there were 6
possible paths to the disk and only four could fit into the bootlist. The third (non-FC
attached) disk also could not be included, even though it was specified.

__ 33. Update the normal mode bootlist to have only the original boot device, followed by
only two of the paths to the FC attached disk, and with the other non-FC
attached disk listed last.
»# bootlist -m normal hdisk0 hdisk# pathid=1,3 hdisk1

__ 34. Display the normal mode bootlist. Record the devices in the bootlist.
_________________________________________________________
_________________________________________________________
_________________________________________________________
_________________________________________________________
»# bootlist -o -m normal
» Because you controlled which paths to include for the FC attached disk, you were able
to get a bootlist that included all of the disks that you specified.
__ 35. Update the normal mode bootlist to include only the original device.
»# bootlist -m normal hdisk0

__ 36. Validate that your change was effective.


_________________________________________________________
_________________________________________________________
_________________________________________________________
»# bootlist -o -m normal
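
The truncation seen in steps 32 through 34 can be modelled with a few lines of portable shell. This is only a sketch of the arithmetic, not a description of how the firmware or the bootlist command is implemented. The function name bootlist_fit, the capacity of 5 entries (inferred from the example above, where hdisk0 plus four of the six paths filled the list), and the name hdisk2 standing in for hdisk# are all assumptions made for illustration.

```shell
# bootlist_fit CAPACITY NAME:PATHS ...
# Hypothetical helper: walks the devices in bootlist order and reports how
# many of each device's paths still fit into the remaining capacity.
bootlist_fit() {
    capacity=$1
    shift
    used=0
    for dev in "$@"; do
        name=${dev%%:*}      # text before the colon: device name
        paths=${dev##*:}     # text after the colon: number of paths requested
        free=$((capacity - used))
        if [ "$paths" -le "$free" ]; then
            fit=$paths
        else
            fit=$free
        fi
        echo "$name: $fit of $paths entries"
        used=$((used + fit))
    done
}

# Step 31: all six paths of the FC disk requested (capacity of 5 is assumed).
bootlist_fit 5 hdisk0:1 hdisk2:6 hdisk1:1
# Step 33: only two paths requested, as with pathid=1,3.
bootlist_fit 5 hdisk0:1 hdisk2:2 hdisk1:1
```

In the first call the FC disk consumes every remaining entry and hdisk1 gets none; in the second call every requested entry fits, which matches what steps 32 and 34 show.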

End of exercise

Exercise review/wrap-up
Ask the students the following questions:
1. How does your AIX system know where to boot from?
2. How do you boot a system in maintenance mode?
3. How do you repair a corrupted boot logical volume?

Exercise 6. System initialization: rc.boot and inittab
(with hints)

Estimated time
Part 1 - 00:20
Part 2 - 00:10
Part 3 - 00:10
Total - 00:40

What this exercise is about


This exercise will review the software boot process of an AIX system
and provide practice dealing with problems during rc.boot and init
execution.

What you should be able to do


At the end of the lab, you should be able to:
• Repair a corrupted log logical volume
• Analyze and fix an unknown boot problem

Introduction
This exercise has three parts:
1. Repair a corrupted log logical volume
2. Analyze and fix a boot failure
3. Explore the rc.boot script
All instructions in this exercise require root authority.

Required material
• Program /home/workshop/ex6probl
• Bootable media that matches the version and release of your
system or a NIM server setup that can be used to execute a remote
boot

Common student problems


Many students are not able to describe the purpose of the log logical volume. Before
starting the lab, make it clear to the students how the log logical volume is used in AIX.
In Part 2, students have to fix a common boot failure, a corrupted /etc/inittab file.

Known hardware/software problems


ex6probl makes a copy of the /etc/inittab file before corrupting it. The copy is found in
/tmp/inittab.
In step 8, the logform and fsck commands may fail with the following message if you
forget to specify the correct file system type:
fsck: Cannot find the vfs value for file system /dev/hd<#>

After you leave the shell by typing exit, the system hangs because the subsequent
mount of /dev/hd4 needs the correct file system type. You have to power the system off
and on again.

Exercise instructions with hints


Preface
Two versions of these instructions are available; one with hints and one without. You
can use either version to complete this exercise. Also, please do not hesitate to ask the
instructor if you have questions.
All exercises of this chapter depend on the availability of specific equipment in your
classroom.
The output shown in the answers is an example. Your output and answers based on the
output may be different.
All hints are marked with a » sign.

Part 1 - Repair a corrupted log logical volume


Before starting the lab, read the following paragraph.
Files or directories which are created or updated are stored with their i-nodes and the
superblock of the file system in memory first. Most write requests are handled in
memory first to improve system performance. Later, the data is written to disk, either
due to the syncd daemon (every 60 seconds) or due to a threshold of dirty memory
pages being exceeded (for example every 16KB of changes).
Just before the data is written to disk, these changes to the JFS file systems
(superblock, i-nodes, list of free data blocks, and so forth) are recorded in a log logical
volume. The rootvg uses, by default, the log logical volume /dev/hd8. When the
changes are written to the disk, the JFS transactions are removed from the log logical
volume. This guarantees the integrity of a file system. Until the file system changes are
written to disk, the changes are recorded and held in the log logical volume.
In this part of the lab, we corrupt the JFS log to provoke a boot failure.
__ 1. Check to see if your rootvg file systems are JFS or JFS2. You will need this
information later in this exercise. ____________________________________
» # lsvg -l rootvg

__ 2. Execute the program /home/workshop/ex6probl (the file name ends in a lower
case L). This program may take as long as 30 seconds to run. It will shut down your
machine. As soon as you see the message Halt Completed, switch over to your
Web browser session with the HMC.
» # /home/workshop/ex6probl
__ 3. Attempt to activate your LPAR to a multiuser mode.
» Follow the instructions in the step Start your partition to multiuser mode in Exercise 5,
Part 3.

__ 4. What happens during the reboot? Investigate any reference code that seems to
persist. Examine your Student Guide to find an explanation for the boot failure.
_____________________________________________________________
_____________________________________________________________
» The LED, LCD, or HMC Operator Panel Value shows 0557 and does not change from
that code. A 0557 progress code indicates that the mount of /dev/hd4 (root file system)
failed.

__ 5. Shut down your AIX LPAR from the HMC graphical interface:


__ a. Navigate to the list of LPARs for your assigned server.
__ b. Select your LPAR
__ c. From the tasks menu, select: Operations -> Shutdown
__ d. On the Shut Down Partition panel, select Immediate.
__ e. Click OK.

__ 6. Once the partition is Not Activated, boot your machine to maintenance mode. If
unsure, follow the procedures in Exercise 5, Part 3 (Booting to maintenance mode).

__ 7. From the Maintenance menu, access the rootvg before mounting the file systems.
You need to do this, because mounting the file systems in rootvg will fail due to the
corrupted log logical volume.
» Select your terminal.
» Select your language.
» If booting from media you will see: Welcome to the Base Operating System Installation
and Maintenance. From here, select Start Maintenance Mode for System Recovery.
» From the Maintenance menu, select option 1, Access a Root Volume Group.
» Type 0 to continue.
» The Access a Root Volume Group screen displays. Select the volume group that is
causing the problem.

» Select option 2, Access this volume group and start a shell before mounting file
systems. Notice the error messages while rootvg is varied on. These provide more
clues to the problem:
Importing Volume Group...
imfs: can't find log for volume group rootvg
rootvg
Checking the / filesystem.

fsck: Cannot find the vfs value for file system /dev/hd4
.Checking the /usr filesystem.

fsck: Cannot find the vfs value for file system /dev/hd2
.Exit from this shell to continue the process of accessing the
root volume group.
__ 8. Reformat the journal log logical volume. Be sure to do a file system check for all file
systems that use /dev/hd8. If you like, use set -o emacs or set -o vi to enable
command recall and editing.
» If it is a JFS2 file system:
# logform -V jfs2 /dev/hd8
logform: Destroy /dev/hd8 (y)? y
# fsck -y -V jfs2 /dev/hd1
# fsck -y -V jfs2 /dev/hd2
# fsck -y -V jfs2 /dev/hd3
# fsck -y -V jfs2 /dev/hd4
# fsck -y -V jfs2 /dev/hd9var
# fsck -y -V jfs2 /dev/hd10opt
# fsck -y -V jfs2 /dev/hd11admin
__ 9. Use the sync command to flush your changes from memory to the disk. Shut down
your system and reboot your system in normal mode. Were you able to successfully
reboot? _____________________________________________
»Here are example commands:
# sync
# sync
# reboot
» (Use of reboot is appropriate for the current state of the machine, but the shutdown
command should be used when the machine is in a multiuser mode.)
» If you are unable to shut down the system from the command prompt, then use the
HMC to stop and start your system:
• See the HMC instructions in exercise 5.

» You should find that while you fixed the problem, the boot still fails at a later stage of the
boot sequence.
__ 10. If the reboot failed, determine if it is the same problem already identified or a new
problem. If it is the same problem, go back and figure out what was missed in the fix
procedure.
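
The fsck commands in step 8 follow directly from the lsvg -l rootvg listing recorded in step 1. As a hedged sketch (the function name fsck_cmds and the sample listing are made up for illustration, and the column layout is assumed to match the lsvg -l output shown in this guide), the following portable shell prints one fsck command per file system logical volume. It only prints the commands; it does not run them.

```shell
# fsck_cmds: read lsvg -l style output on stdin and print an fsck command
# for every logical volume of type jfs or jfs2 that has a mount point.
# Hypothetical helper; review the printed commands before running them.
fsck_cmds() {
    awk '($2 == "jfs" || $2 == "jfs2") && $NF != "N/A" {
        printf "fsck -y -V %s /dev/%s\n", $2, $1
    }'
}

# Sample listing (illustrative values, not taken from a real system).
fsck_cmds <<'EOF'
rootvg:
LV NAME   TYPE      LPs  PPs  PVs  LV STATE      MOUNT POINT
hd5       boot      1    1    1    closed/syncd  N/A
hd8       jfs2log   1    1    1    open/syncd    N/A
hd4       jfs2      1    1    1    open/syncd    /
hd2       jfs2      8    8    1    open/syncd    /usr
EOF
```

Log and boot logical volumes are skipped because their TYPE column is neither jfs nor jfs2, and raw logical volumes are skipped because their mount point is N/A.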

Part 2 - Analyze and fix a boot failure


__ 11. What happens during the reboot of the system? Write down the last reference code
that is shown. What type of problem is this indicative of?
______________________________________________________________
______________________________________________________________
______________________________________________________________
» The system stops with a progress code (HMC Reference Code) value of 0553.
This indicates a corrupted /etc/inittab file.
Another possible symptom is a prompt on the AIX system console that asks for
a run level; when you respond with the multi-user run level of “2”, the system
may simply hang with no message or error code.

__ 12. Reboot the system to maintenance mode.


» Use your HMC to shut down your partition and then activate your partition to
maintenance mode as described in Exercise 5, Part 3
(Booting to maintenance mode).

__ 13. Access your machine with the file systems mounted.


» Select your terminal.
» Select your language.
» If booting from media you will see: Welcome to the Base Operating System Installation
and Maintenance. From here, select Start Maintenance Mode for System Recovery.
» From the Maintenance menu, select option 1, Access a Root Volume Group.
» Type 0 to continue.
» The Access a Root Volume Group screen displays. Select the volume group that is
causing the problem.
» Select option 1, Access this volume group and start a shell. Selecting this choice
imports and activates the volume group and mounts the file systems for this root
volume group before providing you with a shell and a system prompt.

__ 14. Examine your system and find the corrupted file that leads to the boot failure.
Be sure to set the TERM variable to lft or vt320, if you are working on a graphical
display. Otherwise vi or SMIT will not work correctly in the maintenance shell.
» Set a usable terminal type. Issue the command: 
# export TERM=vt320
» The corrupted file is /etc/inittab.

__ 15. Repair the corrupted file. You will find an example in your student notebook. If you
are not able to fix the boot failure, contact your instructor.
» Notice that the file has a semi-colon instead of a colon as the first delimiter. Correct this
by manually editing /etc/inittab.

» # vi /etc/inittab
:%s/;/:/g
:wq!
» Shut down your system and reboot your system in normal mode. Your machine should
boot now without any boot failure.
# sync
# sync
# reboot
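
A quick structural check can help locate the damaged entry before editing. The following is a heuristic sketch, not an IBM-supplied tool: the function name check_inittab is hypothetical, it assumes each entry has the form id:runlevel:action:command, and it assumes that lines beginning with a colon are comments. It flags any entry with fewer than four colon-separated fields or with a semicolon in the first field.

```shell
# check_inittab FILE
# Hypothetical helper: print "line-number: text" for every suspicious entry
# and return a non-zero status if at least one was found.
check_inittab() {
    awk -F: '
        /^[ \t]*$/ { next }                 # skip blank lines
        /^:/       { next }                 # skip comment lines (assumed convention)
        NF < 4 || $1 ~ /;/ {                # too few fields, or ";" used as delimiter
            printf "%d: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

In the maintenance shell this could be run as check_inittab /etc/inittab, and the saved copy in /tmp/inittab could be checked the same way for comparison. An entry whose command itself contains colons can still hide a damaged delimiter, so a manual review remains necessary.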

Part 3 - Exploring rc.boot


The real documentation of rc.boot is the script itself. It can be very informative to see
the actual conditions that trigger the display of a particular progress code.

__ 16. Open (in a safe manner) the /sbin/rc.boot file for examination, using a tool that
allows you to search the contents. (Remember to re-position at the start of the file
when doing a search for a new string.)
» # more /sbin/rc.boot

There are two types of progress codes.

- One type identifies that an event has occurred or a task has completed, and is
expected to be a transient display. The rc.boot script uses SHOWLED to display
these.
- The other type is an error condition that stops any further execution of the script and
is intended to display permanently, until the operator terminates that AIX instance.
The rc.boot script uses loopled to display these error codes (actually a loop of
issuing SHOWLED).

__ 17. Search the rc.boot script for where it displays the code 517.
__ a. What mechanism is used to display this code? _______________________
» SHOWLED.
__ b. Is this supposed to be a transient state or a permanent error? ______________
» Transient state.
__ c. According to the context in the script, what is the situation which caused it to be
displayed?
_____________________________________________________________
_____________________________________________________________
» Attempting mount of the root file system.
__ d. Repeat the search to see if there are other places where the code is displayed.
What are the other situations (if any)?
_____________________________________________________________
______________________________________________________________
_____________________________________________________________
» The code appears many times. Each time it is a mount of a different filesystem.

__ 18. Search the rc.boot script for where it displays the code 557.
__ a. What mechanism is used to display this code? _______________________
» loopled.
__ b. Is this supposed to be a transient state or a permanent error? ______________
» Permanent error.
__ c. According to the context in the script, what is the situation which caused it to be
displayed?
_____________________________________________________________
_____________________________________________________________
» Failure to mount the root file system.
__ d. Repeat the search to see if there are other places where the code is displayed.
What are the other situations (if any)?
_____________________________________________________________

______________________________________________________________
_____________________________________________________________
» The code only appears that one time.

__ 19. Search the rc.boot script for where it displays the code 518.
__ a. What mechanism is used to display this code? _______________________
» loopled.
__ b. Is this supposed to be a transient state or a permanent error? ______________
» Permanent error.
__ c. According to the context in the script, what is the situation which caused it to be
displayed?
_____________________________________________________________
_____________________________________________________________
» Failure to mount the /usr file system.
__ d. Repeat the search to see if there are other places where the code is displayed.
What are the other situations (if any)?
_____________________________________________________________
______________________________________________________________
_____________________________________________________________
» The code appears many times. Each time related to the failure to mount a different file
system.

__ 20. Search the rc.boot script for where it displays the code 511.
__ a. What mechanism is used to display this code? _______________________
» SHOWLED
__ b. Is this supposed to be a transient state or a permanent error? ______________
» Transient state.
__ c. According to the context in the script, what is the situation which caused it to be
displayed?
_____________________________________________________________
_____________________________________________________________
» End of device configuration - phase 1.

__ d. Repeat the search to see if there are other places where the code is displayed.
What are the other situations (if any)?
_____________________________________________________________
______________________________________________________________
_____________________________________________________________
» The code appears one more time, at the end of device configuration - phase 2.

__ 21. Search the rc.boot script for where it displays the code 551.
__ a. What mechanism is used to display this code? _______________________
» SHOWLED
__ b. Is this supposed to be a transient state or a permanent error? ______________
» Transient state.
__ c. According to the context in the script, what is the situation which caused it to be
displayed?
_____________________________________________________________
_____________________________________________________________
» IPL varyon is about to be attempted.
__ d. Repeat the search to see if there are other places where the code is displayed.
What are the other situations (if any)?
_____________________________________________________________
______________________________________________________________
_____________________________________________________________
» The code only appears that one time.

__ 22. Search the rc.boot script for where it displays the code 554.
__ a. What mechanism is used to display this code? _______________________
» loopled.
__ b. Is this supposed to be a transient state or a permanent error? ______________
» Permanent error.
__ c. According to the context in the script, what is the situation which caused it to be
displayed?
_____________________________________________________________
_____________________________________________________________

» There is no known previous boot device (as returned by bootinfo -b).
__ d. Repeat the search to see if there are other places where the code is displayed.
What are the other situations (if any)?
_____________________________________________________________
______________________________________________________________
_____________________________________________________________
» The code appears one more time, as part of the case structure related to the iplvaryon
attempt. Notice the other possible error conditions for this operation.

__ 23. You may wish to search the rc.boot script for other progress codes discussed in this
unit. For each, determine the same information:
__ a. What mechanism is used to display this code? _______________________
__ b. Is this supposed to be a transient state or a permanent error? ______________
__ c. According to the context in the script, what is the situation which caused it to be
displayed?
_____________________________________________________________
_____________________________________________________________
__ d. Repeat the search to see if there are other places where the code is displayed.
What are the other situations (if any)?
_____________________________________________________________
______________________________________________________________
_____________________________________________________________
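
The searches in steps 17 through 23 can also be done in one pass with grep. The sketch below is a hedged convenience, not part of the course scripts: the function name list_code is hypothetical, and it assumes that each code appears on the same line as the SHOWLED or loopled call that displays it. It labels a hit as permanent when the line uses loopled and as transient otherwise, following the two types of progress codes described above.

```shell
# list_code FILE CODE
# Hypothetical helper: list every line of FILE that displays CODE through
# SHOWLED or loopled, prefixed with the line number and the kind of display.
list_code() {
    grep -n -i -E "(showled|loopled).*$2" "$1" | while IFS= read -r hit; do
        case $hit in
            *loopled*) printf 'permanent: %s\n' "$hit" ;;
            *)         printf 'transient: %s\n' "$hit" ;;
        esac
    done
}
```

For example, list_code /sbin/rc.boot 557 would list the places where code 557 is displayed. The surrounding lines of the script still have to be read to understand the condition that triggers each display.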

End of exercise

Exercise review/wrap-up
Ask the students the following questions:
1. What does the logform command do? Can logform result in data loss?
Answer: The logform command initializes a logical volume for use as a JFS or
JFS2 log device. Running the logform command on any JFS log device or JFS2
outline or inline log device will destroy all log records on the log device. This
may cause the file system to lose its recovery capability and therefore to lose
file system data.
2. What does a 553 LED indicate in most cases?
Answer: A corrupted /etc/inittab file.

Exercise 7. LVM metadata and related problems
(with hints)

Estimated time
Part 1 - 00:15
Part 2 - 00:15
Part 3 - 00:15
Part 4 - 00:15
Part 5 - 00:20
Total required parts - 01:20
(optional) Part 6 - 00:20

What this exercise is about


In this exercise, you will analyze and fix LVM-related ODM problems.
There will be optional labs for students who desire and have time for
additional lab experiences.

What you should be able to do


At the end of the lab, you should be able to:
• Work with importvg and exportvg and manage importvg issues
• Fix an LVM-related ODM problem involving a user volume group
• Fix an LVM-related ODM problem associated with the rootvg

Introduction
This exercise has six parts:
1. In the first part, you will export and import a volume group
2. In the second part, you will work with importvg issues related to
duplicate logical volume and file system names
3. In the third part, you will fix an LVM ODM problem using the
importvg and exportvg technique.
4. In the fourth part, you will be asked to analyze and fix an LVM ODM
failure by using the rvgrecover procedure.

5. The fifth part is related to the previous part. You are asked to fix
another ODM failure. It is a failure which requires you to use
intermediate level ODM commands to fix the problem.
6. In the last part, which is optional, you will be asked to analyze and
fix an LVM ODM failure manually.
You will need root authority to complete this exercise.

Requirements
• /home/workshop/ex7_corrupt_pvid
• /home/workshop/ex7_corrupt_odm
• /home/workshop/rvgrecover
• /home/workshop/ex7_build_vg
• /home/workshop/ex7_corrupt_odm2
• /home/workshop/ex7_corrupt_odm3

Common student problems


The last (optional) part of the exercise, which involves manually repairing the ODM for a missing
PVID, is very difficult for many students. Students have to analyze an LVM ODM failure, and
they must fix the problem manually. This is a really challenging exercise. Also, make
sure that the students include the 16 trailing zeros on the pvid objects they create.
Note: Do not use rvgrecover to fix this first ODM problem. It will not work. rvgrecover
requires that the PVIDs of the disks are in the ODM, which is not the case in this first part.
In the second section of that same part, students use the rvgrecover script to fix another
LVM-related ODM problem, which is easier.
The script ex7_corrupt_pvid removes the pvid objects from CuAt. Before it does that, it
makes a copy of CuAt to /etc/objrepos/CuAt.$$ and it extracts the pvid objects and
stores them at /tmp/ex5fix.add. The file ex5fix.add contains the same information that the
students are asked to build in the fix.add file.
The ex7_corrupt_odm script also copies CuAt and CuDv prior to altering them. Those
copies are found in /etc/objrepos/CuAt.$$ and CuDv.$$.

Known hardware/software problems


None
Previous issues related to showing a string of ??? in the type column of the lsvg -l rootvg
report have been corrected by making changes to the ex7_corrupt_odm script.

Exercise instructions with hints


Preface
Two versions of these instructions are available; one with hints and one without. You
can use either version to complete this exercise. Also, please do not hesitate to ask the
instructor if you have questions.
All exercises of this chapter depend on the availability of specific equipment in your
classroom.
The output shown in the answers is an example. Your output and answers based on the
output may be different.
All hints are marked with a » sign.

Part 1 - Export and import a volume group


__ 1. Create a new volume group named datavg on a disk that is empty. Check that this
disk does not belong to another volume group. If you are certain that the disk is not
part of a volume group, yet the disk still has an old VGDA on it, you may need to
force the creation of the new volume group. Set the physical partition size to 16 MB.
__ a. Write down the command you executed to create the new volume group:
___________________________________________________________
» To find a disk that is empty:
# lspv
» You may need to remove hdisk2 from rootvg.
» If so, use:
# reducevg rootvg hdisk2
» To create the datavg volume group:
# mkvg -s 16 -y datavg hdisk2
If you get the following error message:
0516-1398 mkvg: The physical volume hdisk2, appears to belong
to another volume group. Use the force option to add this
physical volume to a volume group.
0516-862 mkvg: Unable to create volume group.

Then, use this command:


# mkvg -f -s 16 -y datavg hdisk2

__ 2. Check if the new volume group has been varied on automatically. Write down the
command you used.

_____________________________________________________________
» List the active volume groups:
# lsvg -o
datavg
rootvg

__ 3. Use the fastpath smit mklv to create a logical volume in datavg with the following
characteristics:
__ a. Logical volume name: lv_raw
__ b. Number of logical partitions: 1

» # smit mklv

__ 4. Use the fastpath smit jfs2 to create an enhanced journaled file system in datavg
with the following characteristics:
__ a. Size of file system: 16 MB (65536 512-byte blocks)
__ b. Mount point: /home/mars
Note: Do not build this file system on the lv_raw logical volume created in the
previous step.
»# smit jfs2
Add an Enhanced Journaled File System

__ 5. Verify that the new logical volumes are in datavg with the lsvg command.
__ a. Fill in the following table with the logical volume information in datavg:
»
# lsvg -l datavg
datavg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lv_raw jfs 1 1 1 closed/syncd N/A
loglv00 jfs2log 1 1 1 closed/syncd N/A
fslv00 jfs2 1 1 1 closed/syncd /home/mars

LV NAME TYPE MOUNT POINT

An example of the table, filled in based on the results above:

LV NAME TYPE MOUNT POINT
lv_raw jfs N/A
loglv00 jfs2log N/A
fslv00 jfs2 /home/mars

__ 6. Mount the new file system and create some files in it.

»# mount /home/mars
»# cd /home/mars
»# touch m1 m2 m3

__ 7. Export the datavg volume group from your system.


__ a. Write down all the steps you executed to export the volume group.
____________________________________________________________
____________________________________________________________
____________________________________________________________
»# cd
»# umount /home/mars
»# varyoffvg datavg
»# exportvg datavg

__ 8. Analyze your system to see if it contains any reference to the exported volume
group. For example, check whether the file system which you created exists. (Check
/etc/filesystems.)

» # lsfs
/home/mars does not exist on the system.
» # more /etc/filesystems

/etc/filesystems contains no reference to a file system that has been exported.

__ 9. While AIX seems to have no knowledge of the volume group, its logical volumes, or
its file systems, let's see if the disk has that information in its control blocks.
Directly query the VGDA control block on the disk that was part of the exported VG.
Does it know the name of the volume group? What does it know about the volume
group and its logical volumes?
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
» # lqueryvg -p hdisk2 -At

__ 10. Directly query the LVCB control block for logical volume hd2 on the same disk. What
does it know about the logical volume characteristics? What does it know about the
file system that is related to that logical volume?
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
» # getlvcb -AT hd2

__ 11. Given the information in the VGDA and LVCB control blocks, we should be able to
use that information to rebuild the related LVM ODM objects.
__ a. Import the volume group into your system. Explicitly specify the volume group
name datavg; otherwise, the system will generate a new volume group name.
__ b. Write down the command you executed:
______________________________________________________________
»# importvg -y datavg hdisk2

__ 12. Check whether the imported volume group, datavg, is varied on.

» # lsvg -o 
(datavg should be varied on)

__ a. Check to see if the file system information is back.

»# lsfs
»# more /etc/filesystems
»# mount
References to the filesystem are there but the file system is not mounted.

__ 13. Mount the /home/mars file system.


__ a. Check that no files have been lost.

»# mount /home/mars
»# ls /home/mars

Part 2 - Analyze import messages


In the previous part, Export and import a volume group, the export and import
worked without problems because the logical volume names and the file system mount
point did not already exist on the system during the import of the volume group.
This part shows what happens when the volume group being imported has
logical volume names that already exist on the system.

__ 14. Export the datavg volume group again. Repeat the steps from the last export.
»# cd
»# umount /home/mars
»# varyoffvg datavg
»# exportvg datavg

__ 15. Use the fastpath smit mklv to create a logical volume in rootvg with the following
characteristics:
__ a. Logical volume name: lv_raw
__ b. Number of logical partitions: 1
» # smit mklv

__ 16. Use the fastpath smit jfs2 to create one JFS2 file system in rootvg with the
following characteristics.
__ a. Size of file systems: 16 MB (65536 512-byte blocks)
__ b. Mount points:
- File system 1: /home/mars
»# smit jfs2
Add an Enhanced Journaled File System

__ 17. What are the corresponding logical volume names that have been created for the
file system?
»# lsfs
Logical volume for /home/mars: ______/dev/fslv00 (for example)_________

__ 18. Mount the /home/mars file system, and add a few files to it.
»# mount /home/mars
»# cd /home/mars
»# touch m20 m21 m22

__ 19. At this stage, the following problems will come up when you import the datavg
volume group:
• Logical volumes with the same names as those being imported already exist in rootvg.
• The /home/mars file system already exists in rootvg.
Let's see how importvg will react to this situation.
__ a. Import the datavg volume group into the system.
»# importvg -y datavg hdisk2
0516-530 synclvodm: Logical volume name lv_raw changed to fslv01.
0516-530 synclvodm: Logical volume name fslv00 changed to fslv02.
imfs: mount point "/home/mars" already exists in /etc/filesystems
datavg

__ 20. Write down the new logical volume names that are created for datavg during the
import.
_______________________________________________________________
_______________________________________________________________

_______________________________________________________________
» lv_raw has been changed to fslv01 (Example names)
» fslv00 has been changed to fslv02
» Mount point /home/mars already exists in /etc/filesystems
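
» If a volume group contains many logical volumes, the renames can be picked out of the importvg messages with a small filter. This is a minimal sketch that assumes the 0516-530 message wording shown in the example output above. The helper name list_renamed_lvs is invented for illustration.

```shell
# list_renamed_lvs (hypothetical helper): read importvg messages on stdin and
# print "old -> new" for every 0516-530 rename message.
list_renamed_lvs() {
    awk '$1 == "0516-530" && $7 == "changed" {
        new = $9
        sub(/\.$/, "", new)      # drop the period that ends the message
        print $6 " -> " new
    }'
}

# Sample taken from the example output in this exercise.
# On AIX, one option is:  importvg -y datavg hdisk2 2>&1 | list_renamed_lvs
list_renamed_lvs <<'EOF'
0516-530 synclvodm: Logical volume name lv_raw changed to fslv01.
0516-530 synclvodm: Logical volume name fslv00 changed to fslv02.
imfs: mount point "/home/mars" already exists in /etc/filesystems
datavg
EOF
```

The filter only reports renamed logical volumes. The imfs message about the mount point still has to be read separately.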

__ 21. Another problem that you should see at this stage is that the /home/mars file
system already exists in rootvg.
__ a. To fix this problem, first unmount the /home/mars file system.

»# cd
»# umount /home/mars

__ 22. Mount the file system from datavg over /home/mars. Use the new logical volume
name that was created during the import. You have to specify the log device that is part of
datavg.
__ a. Write down the commands you executed.
_______________________________________________________________
_______________________________________________________________
»# mount -o log=/dev/loglv00 -V jfs2 /dev/fslv02 /home/mars

__ 23. Check the files you have created in /home/mars. They should exist in this directory. 

»# ls /home/mars

__ 24. At the end of this exercise, both file systems should be mounted at the same time.
Start with unmounting /home/mars.
»# umount /home/mars

__ 25. Create a new directory: /datavg/mars. This will be the new mount point for the file
system from datavg.
»# mkdir -p /datavg/mars

__ 26. Create a new stanza in /etc/filesystems that describes the file system from datavg.
You must use the new logical volume names that have been created during the
import of datavg.
»/datavg/mars:
dev = /dev/fslv02
vfs = jfs2
log = /dev/loglv00
mount = false
options = rw
account = false
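
» Instead of typing the stanza by hand, it can be generated and reviewed before it is added to /etc/filesystems. This is a minimal sketch, not part of the course materials. The helper name print_fs_stanza is invented, and the attribute values simply repeat the example stanza above.

```shell
# print_fs_stanza (hypothetical helper): print an /etc/filesystems stanza for
# a JFS2 file system. Arguments: mount point, LV device, log device.
print_fs_stanza() {
    printf '%s:\n' "$1"
    printf '\tdev\t\t= %s\n' "$2"
    printf '\tvfs\t\t= jfs2\n'
    printf '\tlog\t\t= %s\n' "$3"
    printf '\tmount\t\t= false\n'
    printf '\toptions\t\t= rw\n'
    printf '\taccount\t\t= false\n'
}

# Use the logical volume names reported by your own importvg run.
print_fs_stanza /datavg/mars /dev/fslv02 /dev/loglv00
```

Check the output against lsvg -l datavg before appending it to /etc/filesystems, and leave an empty line between the new stanza and the preceding one.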

__ 27. Mount the /datavg/mars file system, and then remount the /home/mars file system.


»# mount /datavg/mars
»# mount /home/mars

__ 28. Verify you can access all the files.


»# ls /datavg/mars
»# ls /home/mars

__ 29. Unmount the /datavg/mars file system.


»# umount /datavg/mars

__ 30. Vary off the datavg volume group.


»# varyoffvg datavg

__ 31. Export the datavg volume group.


»# exportvg datavg

__ 32. Remove the /home/mars file system from the rootvg volume group.
»# umount /home/mars

# rmfs /home/mars

Part 3 - Fixing LVM ODM problems with importvg and exportvg


__ 33. List the physical volumes on your system to verify that hdisk2 is available.

»# lspv

__ 34. The export and import technique can only be used with non-rootvg volume groups.
You have been provided with a script which will create a volume group (using
hdisk2) and a file system with a special naming convention to match the problem
setup script. The script is: /home/workshop/ex7_build_vg. Execute this script.
»# /home/workshop/ex7_build_vg

__ 35. Display the on-line VGs and then list the logical volumes and physical volumes in
the lvmtestvg volume group.
» Suggested commands are:
# lsvg -o
lvmtestvg
rootvg

# lsvg -l lvmtestvg
lvmtestvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvmtestlv jfs2 1 1 1 open/syncd /lvmtestfs
loglv00 jfs2log 1 1 1 open/syncd N/A

# lsvg -p lvmtestvg
lvmtestvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 1092 1090 219..216..218..218..219
__ 36. In the /home/workshop directory, you will find a script called ex7_corrupt_odm2.
Execute this script.
»# /home/workshop/ex7_corrupt_odm2

__ 37. Display the online VGs and then list the logical volumes and physical volumes in the
lvmtestvg. What problems do you see?
_________________________________________________________________
_________________________________________________________________

» Suggested commands are:


# lsvg -o
lvmtestvg
rootvg

# lsvg -l lvmtestvg
lvmtestvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvmtestlv ??? 4 4 1 open/syncd /lvmtestfs
loglv00 jfs2log 1 1 1 open/syncd N/A

# lsvg -p lvmtestvg
lvmtestvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 1092 1090 219..216..218..218..219
» The command is unable to locate the type attribute for the lvmtestlv logical volume.
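
» On a volume group with many logical volumes, the affected entries can be filtered out of the listing. This is a minimal sketch that assumes the lsvg -l layout shown above (volume group line, header line, then one row per logical volume). The helper name list_lvs_without_type is invented for illustration.

```shell
# list_lvs_without_type (hypothetical helper): read "lsvg -l" output on stdin
# and print the logical volumes whose TYPE column shows ???.
list_lvs_without_type() {
    awk 'NR > 2 && $2 == "???" { print $1 }'
}

# Sample taken from the example output in this exercise.
# On AIX use:  lsvg -l lvmtestvg | list_lvs_without_type
list_lvs_without_type <<'EOF'
lvmtestvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvmtestlv ??? 4 4 1 open/syncd /lvmtestfs
loglv00 jfs2log 1 1 1 open/syncd N/A
EOF
```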

__ 38. Try to increase the size of the /lvmtestfs file system. What happened?
_________________________________________________________________
_________________________________________________________________
» Suggested commands are:
# chfs -a size=+1 /lvmtestfs
0516-306 /usr/sbin/getlvodm: Unable to find lvmtestlv in the Device
Configuration Database.
chfs: Cannot get lv id from odm.
» The command fails because it is unable to find the logical volume in the ODM.

__ 39. Try to solve the problem using the exportvg and importvg technique. Remember that
the volume group must be offline. To take the VG offline, all logical volumes in
the volume group must be closed.
» Suggested commands are:
# umount /lvmtestfs
# varyoffvg lvmtestvg
# exportvg lvmtestvg
# importvg -y lvmtestvg hdisk2

__ 40. Display the online VGs and then list the logical volumes and physical volumes in the
lvmtestvg. Did the problem go away?
_________________________________________________________________
_________________________________________________________________

» Suggested commands are:


# lsvg -o
lvmtestvg
rootvg

# lsvg -l lvmtestvg
lvmtestvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvmtestlv jfs2 1 1 1 open/syncd /lvmtestfs
loglv00 jfs2log 1 1 1 open/syncd N/A

# lsvg -p lvmtestvg
lvmtestvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 1092 1090 219..216..218..218..219

__ 41. Try to increase the size of the /lvmtestfs file system. What happened?
_________________________________________________________________
_________________________________________________________________
» Suggested commands are:
# chfs -a size=+1 /lvmtestfs

__ 42. Vary off and then remove the lvmtestvg volume group.
» Suggested commands are:
# reducevg -d lvmtestvg hdisk2

End of part 3

Part 4 - Fix an LVM ODM Problem Using rvgrecover


If the volume group is the rootvg, then we cannot vary off and export the volume group.
Instead, we must use a procedure which is the functional equivalent of the importvg and
exportvg method.
__ 43. List the CuAt ODM objects for the hd2 logical volume and redirect the results to
/tmp/hd2.odm.
» # odmget -q "name=hd2" CuAt > /tmp/hd2.odm

__ 44. Execute the program /home/workshop/ex7_corrupt_odm.


» # /home/workshop/ex7_corrupt_odm

__ 45. Verify the following information:


__ a. Check whether your volume groups are alright. Use lsvg.
» # lsvg

__ b. Check whether your physical volumes are alright. Use lspv. Make note of which
disk is associated with the rootvg. _________________________________
» # lspv

__ c. Check whether your logical volumes are alright. List all logical volumes that are
part of your rootvg. Use lsvg -l rootvg.
» # lsvg -l rootvg
What happens?
________________________________________________________
________________________________________________________
________________________________________________________
» Typically for this problem, the TYPE information for some logical volumes is not shown.
(The string ??? is shown instead.)
The logical volume type is stored in CuAt; so, this result indicates that there
might be a problem with logical volume objects in the CuAt object class in the
ODM.

__ 46. Display information for logical volume hd2. Use lslv hd2.
» # lslv hd2

What happens?
__________________________________________________________
__________________________________________________________
__________________________________________________________
» The following error message is displayed: 
0516-306 lslv: Unable to find hd2 in the Device 
Configuration Database.

__ 47. Analyze the ODM problem by retrieving the CuDv and CuAt ODM objects for logical
volumes hd2 and hd4. Compare the CuAt ODM entries you retrieved and stored
earlier with the ODM objects you are now listing.
What are the exact ODM problems that you discover?
__________________________________________________________
__________________________________________________________
__________________________________________________________
» Use the following commands:
# odmget -q "name=hd2" CuDv
# odmget -q "name=hd4" CuDv
» The results indicate that the logical volumes are missing in CuDv.
# odmget -q "name=hd2" CuAt | more
# odmget -q "name=hd4" CuAt | more
# more /tmp/hd2.odm
» The results indicate that the lvserial_id attributes are missing in CuAt.
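
» The comparison between the saved and the current CuAt listing can also be done mechanically. This is a minimal sketch, not part of the course materials. It assumes the usual odmget stanza layout (one attribute = "..." line per object). The helper names and the sample stanzas are invented for illustration.

```shell
# list_odm_attributes (hypothetical helper): print the sorted attribute names
# found in odmget-style stanza output stored in the file given as $1.
list_odm_attributes() {
    awk '$1 == "attribute" && $2 == "=" { gsub(/"/, "", $3); print $3 }' "$1" | sort
}

# missing_odm_attributes (hypothetical helper): print attribute names that are
# present in the saved file ($1) but absent from the current file ($2).
missing_odm_attributes() {
    before="${TMPDIR:-/tmp}/odm_before.$$"
    after="${TMPDIR:-/tmp}/odm_after.$$"
    list_odm_attributes "$1" > "$before"
    list_odm_attributes "$2" > "$after"
    comm -23 "$before" "$after"
    rm -f "$before" "$after"
}

# Illustrative data only; on AIX compare /tmp/hd2.odm with fresh odmget output.
saved="${TMPDIR:-/tmp}/hd2_saved.$$"
current="${TMPDIR:-/tmp}/hd2_current.$$"
cat > "$saved" <<'EOF'
CuAt:
        name = "hd2"
        attribute = "label"
        value = "/usr"

CuAt:
        name = "hd2"
        attribute = "lvserial_id"
        value = "00c07f7f00004c00.5"
EOF
cat > "$current" <<'EOF'
CuAt:
        name = "hd2"
        attribute = "label"
        value = "/usr"
EOF
missing_odm_attributes "$saved" "$current"
rm -f "$saved" "$current"
```

On the lab system, the second file would be produced with odmget -q "name=hd2" CuAt redirected to a new file after the corruption script has run.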

__ 48. Examine the /home/workshop/rvgrecover script and modify it if necessary to
match your situation (the specified disk must be one in your rootvg).
»# view /home/workshop/rvgrecover
After making any required changes to the script, fix the ODM problem by executing
/home/workshop/rvgrecover. Ignore the error messages. This may take up to one
minute, depending upon the speed of your lab system.
» # /home/workshop/rvgrecover
Check that your ODM problems have been fixed. Repeat lsvg -l rootvg and lslv
hd2. They should work now without problems.

» # lsvg -l rootvg 
# lslv hd2

__ 49. Look into /home/workshop/rvgrecover. Remember that this is not an AIX-provided
command, but rather a procedure for fixing rootvg ODM problems. What two main
steps fix your ODM problem?
» The two main steps are:
• Deleting all rootvg related ODM objects
• Importing new ODM objects by reading the information from the VGDA
and LVCB on the boot disk (importvg)

__ 50. Another approach to solving the same problem is to use intermediate LVM
commands. Recreate the problem which you just fixed and verify that the problem is
installed by listing the logical volumes in the rootvg (type information should be:
???).
» # /home/workshop/ex7_corrupt_odm
» # lsvg -l rootvg

__ 51. Use an intermediate level command to request a synchronization of the LVM
information in the ODM for the rootvg volume group.
»# synclvodm rootvg

__ 52. List the logical volumes in the ODM. Is the problem fixed? __________________
» # lsvg -l rootvg
»You should find that the problem is fixed. The type field is now
corrected.

End of part 4

Part 5 - Using intermediate LVM commands


__ 53. List the physical volumes on your system to verify that hdisk2 is available.
»# lspv

__ 54. The problems we want to create next are more extensive than the last problem, so
we will not use the rootvg for this. Instead, you have been provided with a script
which will create a volume group (using hdisk2) and a file system with a special
naming convention to match the problem setup script. The script is:
/home/workshop/ex7_build_vg. Execute this script.
»# /home/workshop/ex7_build_vg

__ 55. Display the online VGs and then list the logical volumes and physical volumes in the
lvmtestvg volume group. Record the names of the physical volumes:
____________________________________________________________
____________________________________________________________
» Suggested commands are:
# lsvg -o
lvmtestvg
rootvg

# lsvg -l lvmtestvg
lvmtestvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvmtestlv jfs2 1 1 1 open/syncd /lvmtestfs
loglv00 jfs2log 1 1 1 open/syncd N/A

# lsvg -p lvmtestvg
lvmtestvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 1092 1090 219..216..218..218..219

__ 56. Verify that the new /lvmtestfs file system is mounted.


»# mount

__ 57. Create a data file in the /lvmtestfs directory.


»# echo "hello world" > /lvmtestfs/testfile

__ 58. In the /home/workshop directory, you will find a script called ex7_corrupt_odm3.
Execute this script.
»# /home/workshop/ex7_corrupt_odm3
0518-307 odmdelete: 8 objects deleted.
0518-307 odmdelete: 2 objects deleted.
0518-307 odmdelete: 1 objects deleted.
0518-307 odmdelete: 2 objects deleted.
0518-307 odmdelete: 2 objects deleted.

__ 59. Display the online VGs and then list the logical volumes and physical volumes in the
lvmtestvg volume group. What problems did you see?
» Suggested commands are:
# lsvg -o
0516-304 : Unable to find device id 00c07f7f00004c0000000121ad0e4aee in the Device
Configuration Database.
vgid=00c07f7f00004c0000000121ad0e4aee
rootvg

# lsvg -l lvmtestvg
0516-306 : Unable to find volume group lvmtestvg in the Device
Configuration Database.

# lsvg -p lvmtestvg
0516-306 : Unable to find volume group lvmtestvg in the Device
Configuration Database.
» In the example output, it appears that the lsvg command was unable to resolve a
volume group ID and thus unable to list the matching VG name. Attempts to reference
the lvmtestvg volume group name failed on the subsequent two commands.

__ 60. Try to increase the size of the /lvmtestfs file system by one block. Could it be done?
_____________________________________________________________
»# chfs -a size=+1 /lvmtestfs
0516-306 /usr/sbin/getlvodm: Unable to find lvmtestlv in the Device
Configuration Database.
chfs: Cannot get lv id from odm.
» In the example, the attempt failed due to not being able to find the logical volume name
in the ODM.

__ 61. We could try to solve the problem with our exportvg and importvg technique.
Attempt to export lvmtestvg. You first need to close the logical volumes and vary the
VG offline. How far can you get before you experience a problem?
________________________________________________________________
________________________________________________________________
________________________________________________________________
» Suggested commands are:
# umount /lvmtestfs
# varyoffvg lvmtestvg
0516-306 getlvodm: Unable to find volume group lvmtestvg in the Device
Configuration Database.
0516-942 varyoffvg: Unable to vary off volume group lvmtestvg.
» The procedure fails before we can even try to execute the exportvg command, which
requires that the volume group be inactive.

__ 62. Mount the /lvmtestfs filesystem and display the contents of the file you created.
Were you able to access your data? _________________________________
» Suggested commands are:
# mount /lvmtestfs
# cat /lvmtestfs/testfile
hello world
» It is interesting that while the corruption prevents us from executing important LVM
commands, it does not impact our ability to access the user data.

__ 63. Try using an intermediate level command that will synchronize the LVM information
with the ODM. What happened?
_____________________________________________________________
» Suggested commands are:
# synclvodm lvmtestvg
0516-306 : Unable to find volume group lvmtestvg in the Device
Configuration Database.
0516-502 synclvodm: Unable to access volume group lvmtestvg.
» The command requires that certain volume group information be in the ODM. The
corruption deleted this ODM information, preventing us from using the synclvodm
command.

__ 64. Use an intermediate level LVM command to redefine the lvmtestvg volume group in
the ODM. Use the physical volume that belongs to the lvmtestvg, as recorded earlier
in the exercise part.

» Suggested commands are:


# redefinevg -d hdisk2 lvmtestvg

__ 65. Display the online VGs and then list the logical volumes in the lvmtestvg volume
group. What problems do you see?
» Suggested commands are:
# lsvg -o
lvmtestvg
rootvg

# lsvg -l lvmtestvg
lvmtestvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvmtestlv ??? 2 2 1 open/syncd /lvmtestfs
loglv00 jfs2log 1 1 1 open/syncd N/A

# lsvg -p lvmtestvg
lvmtestvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 1092 1089 219..215..218..218..219
» The volume group and its physical volume and logical volume membership are
recovered, but the data for the logical volumes does not appear to be complete, given
the type value of ???.

__ 66. Try to increase the size of the /lvmtestfs file system by one block. Could it be done?
_____________________________________________________________
» # chfs -a size=+1 /lvmtestfs
0516-306 /usr/sbin/getlvodm: Unable to find lvmtestlv in the Device
Configuration Database.
chfs: Cannot get lv id from odm.

» Even though the logical volumes are known to be in the volume group, the logical
volume information needed by the chfs command is still missing. Specifically, it is
missing the logical volume ID.

__ 67. Once again, try using an intermediate level command that will synchronize the LVM
information with the ODM. This failed prior to the execution of the redefinevg
command. What happens when you try it now?
_____________________________________________________________
» Suggested commands are:
# synclvodm lvmtestvg

» It succeeds this time.


» Note that if we had attempted an importvg command as our first fix
attempt after creating the problem, it would have failed with error messages. Even so, the
importvg command would have completed enough repairs to the ODM to allow the
synclvodm command to run successfully. Under the covers, the importvg shell script
executes the redefinevg command.

__ 68. Display the online VGs and then list the logical volumes and physical volumes in the
lvmtestvg. How did the situation change? ____________________________
______________________________________________________________
» Suggested commands are:
# lsvg -o
lvmtestvg
rootvg

# lsvg -l lvmtestvg
lvmtestvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvmtestlv jfs2 2 2 1 open/syncd /lvmtestfs
loglv00 jfs2log 1 1 1 open/syncd N/A

# lsvg -p lvmtestvg
lvmtestvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 1092 1089 219..215..218..218..219
» We no longer see any problems with the displays of LVM information.

__ 69. Try to increase the size of the /lvmtestfs file system by one block. Could it be done?
_____________________________________________________________
» # chfs -a size=+1 /lvmtestfs
Filesystem size changed to 262144

» Looks like the ODM problem is fixed.

__ 70. Unmount the /lvmtestfs file system and remove the lvmtestvg volume group.
_____________________________________________________________

» The suggested commands are:


# umount /lvmtestfs
# reducevg -d lvmtestvg hdisk2

End of part 5

Part 6 (optional): Manually fixing an LVM ODM problem


On rare occasions, LVM problems need to be solved by manually rebuilding missing or
corrupted entries. In these cases, it is important that you have good documentation to
support building them correctly.
__ 71. Execute lspv without any options to list all physical volumes in your system.
Complete the following table.
Disk name PVID Volume group

» # lspv
__ 72. Execute lsvg -p to list all physical volumes that are part of your rootvg.
Complete the following table:
PV_NAME PV STATE TOTAL PPs

» # lsvg -p rootvg
__ 73. Execute odmget -q to see the pvid attribute information stored in ODM for all disks.
Write down the command you used:
_______________________________________________
» # odmget -q "name like hdisk? and attribute=pvid" CuAt

PV_Name PVID

Also, write down the structure of the stanza (that is, information labels) output by the
above command. You will need this information in a later lab step.
__________________________________________
__________________________________________
__________________________________________
__________________________________________
__________________________________________
__________________________________________
__________________________________________
__________________________________________
name =
attribute = "pvid"
value =
type = "R"
generic = "D"
rep = "s"
nls_index = 11
__ 74. Execute the program /home/workshop/ex7_corrupt_pvid.
» # /home/workshop/ex7_corrupt_pvid
__ 75. Repeat the lspv command to list your physical volumes. Complete the table and
compare with the table from step 71.
Disk name PVID Volume group

» # lspv

The PVID value is “none,” and the Volume group value is “None” for
each entry.
__ 76. Repeat the lsvg -p command you used earlier to list the physical volumes in
rootvg.
What is the output from the command?
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
» # lsvg -p rootvg

You get a number of error messages (the number of messages depends on the number
of disks in rootvg) like the following:

0516-304 lsvg: Unable to find device id 00008371b5969c35 in the Device
Configuration Database

The same information is shown as that which was obtained when the command lsvg
-p rootvg was executed earlier, but instead of disk names, the PVIDs are shown.

__ 77. You learned that LVM stores information about volume groups, physical volumes
and logical volumes in the ODM. Consider the output of lspv and lsvg -p. What
data is missing? Where is the problem?
• Volume group objects?
• Physical volume objects?
• Logical volume objects?
Write down what you suspect:
_____________________________________________________________
» Physical volume objects

__ 78. Depending on your suspicion, identify the ODM entries which are shown in your
student notes in Unit 7.
Find out which objects in which ODM class are missing by reviewing the material
from your student notes.
_____________________________________________________________

» Example commands are:
# odmget -q "name like hdisk?" CuDv | pg
# odmget -q "name=hdisk0" CuAt
# odmget -q "name=hdisk1" CuAt
» The PVIDs for all disks in object class CuAt are missing.

__ 79. Before you fix the problem, please consult one VGDA for each of the VGs on your
system and compare the missing information with the data in the VGDA. Be sure
that the information you wrote down in the tables above is correct, otherwise you will
not be able to fix the problem.
What command allows you to query a VGDA?
» # lqueryvg -p hdisk0 -At (or another disk in the appropriate volume group instead of
hdisk0) 

Look for the label “Physical.” The values given after this label are the missing PVIDs.
__ 80. Fix the ODM problem by adding the missing objects into the ODM. Please work very
carefully in this step.
Recover the missing entries for all disks that show a problem in the lspv listing, not
just the ones that are part of a particular volume group.
Use your student notes to find out the layout of the corresponding ODM class. Write
down the steps you executed to fix the problem.
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
» # vi fix.add 

Example objects (Use your information!):
CuAt:
        name = "hdisk0"
        attribute = "pvid"
        value = "00008371b5969c350000000000000000"
        type = "R"
        generic = "D"
        rep = "s"
        nls_index = 11

CuAt:
        name = "hdisk2"
        attribute = "pvid"
        value = "002106699b1dd4440000000000000000"
        type = "R"
        generic = "D"
        rep = "s"
        nls_index = 11

» # odmadd fix.add
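» Optional sketch: typing the stanzas by hand is where most typos creep in. The following Python helper is only an illustration (the function name cuat_pvid_stanza is hypothetical, and Python is not part of the exercise); it builds one stanza in the layout shown above from a 16-digit PVID and appends the 16 trailing zeros. It assumes the remaining field values (type, generic, rep, nls_index) are the ones given in the example objects.

```python
def cuat_pvid_stanza(disk_name, pvid):
    """Return a CuAt stanza for the pvid attribute of one disk.

    pvid is the 16-hex-digit identifier as read from the VGDA;
    the stored value carries 16 additional trailing zeros.
    """
    pvid = pvid.strip().lower()
    if len(pvid) != 16 or any(c not in "0123456789abcdef" for c in pvid):
        raise ValueError("expected 16 hex digits, got %r" % pvid)
    fields = [
        ("name", '"%s"' % disk_name),
        ("attribute", '"pvid"'),
        ("value", '"%s"' % (pvid + "0" * 16)),
        ("type", '"R"'),
        ("generic", '"D"'),
        ("rep", '"s"'),
        ("nls_index", "11"),
    ]
    body = "\n".join("        %s = %s" % (key, value) for key, value in fields)
    return "CuAt:\n" + body + "\n"


# Example with the hdisk0 PVID from the sample objects above.
stanza = cuat_pvid_stanza("hdisk0", "00008371b5969c35")
print(stanza)
```

The printed text can be pasted into fix.add; use the PVIDs from your own system, not the sample values.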
__ 81. Repeat the commands lspv and lsvg -p to check whether your fix works.
If you still have problems, the stanza file you created contains a typo. Find the typo,
delete the objects you just created, and add the fixed file. Did you remember to
include the 16 trailing zeros on your pvid value?
» # lspv
# lsvg -p rootvg
The output from these commands should indicate that your fix worked.

» If your fix does not work:
# odmdelete -o CuAt -q "attribute=pvid"
Fix the typo, and add the fixed objects:
# odmadd fix.add
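
» Optional sketch: before re-adding the file, a quick check of each pvid value can catch the most common typos. This Python helper is an illustration only (pvid_value_problems is a hypothetical name); it assumes the value must be 32 hexadecimal digits ending in 16 zeros, as in the example objects.

```python
import re


def pvid_value_problems(value):
    """Return a list of problems found in a CuAt pvid value string."""
    problems = []
    if len(value) != 32:
        problems.append("length is %d, expected 32" % len(value))
    if not re.fullmatch(r"[0-9a-fA-F]*", value):
        problems.append("contains non-hexadecimal characters")
    if not value.endswith("0" * 16):
        problems.append("missing the 16 trailing zeros")
    return problems


# A correct value and a value where the trailing zeros were forgotten.
print(pvid_value_problems("00008371b5969c350000000000000000"))
print(pvid_value_problems("00008371b5969c35"))
```

An empty list means the value passed all three checks.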

End of exercise

Exercise review/wrap-up


Review the following things after the lab:
1. Review how and when rvgrecover can be used. Emphasize that rvgrecover works
only for the rootvg.
Mention that rvgrecover does not fix all types of LVM-related problems. It requires that
at least the disks are stored correctly with their PVIDs in the ODM.
2. Analyze the problem before you fix it. Always check the volume group, physical volume, and
logical volume objects in the ODM. Proceed exactly as described in the student notes.

Exercise 8. Disk management procedures


(with hints)

Estimated time
Part 1 - 00:20
Part 2 - 00:20
Part 3 - 00:10
Total - 00:50

What this exercise is about


This exercise provides practice in handling disk replacement
procedures and managing issues related to importing a volume group.

What you should be able to do


At the end of the lab, you should be able to:
• Manage quorum and missing disks issues
• Implement the disk replacement procedure for a disk that has not
yet failed

Introduction
This exercise has two main topics:
• Quorum and missing disks
• Disk replacement procedure (rootvg and user VGs)
This exercise requires one disk to be completely empty. This disk will
be used to create a new volume group. This volume group will be
exported and imported.
All instructions in this exercise require root authority.

Instructor exercise overview


None.

Exercise instructions with hints


Preface
Two versions of these instructions are available; one with hints and one without. You
can use either version to complete this exercise. Also, please do not hesitate to ask the
instructor if you have questions.
All exercises of this chapter depend on the availability of specific equipment in your
classroom.
The output shown in the answers is an example. Your output and answers based on the
output may be different.
All hints are marked with a » sign.

Part 1: Working with LVM mirroring and quorum


__ 1. Create a volume group named newvg using one unused disk.
» Check to see which disks are free:
# lspv
hdisk0 00c07f7f59ac0ea9 rootvg active
hdisk1 00c07f7fbcdf4791 datavg
hdisk2 00c07fbf59f134be None
hdisk3 none None

» Create the volume group with one disk: (We will assume that this disk is hdisk2 in the
rest of the exercise examples.)
# mkvg -y newvg hdisk2
newvg
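
» Optional sketch: on a system with many disks, the free disks can be picked out of the lspv listing programmatically. The Python helper below is an illustration (free_disks is a hypothetical name) and assumes the column layout of the sample output above: disk name, PVID, volume group, and an optional state.

```python
LSPV_OUTPUT = """\
hdisk0 00c07f7f59ac0ea9 rootvg active
hdisk1 00c07f7fbcdf4791 datavg
hdisk2 00c07fbf59f134be None
hdisk3 none None
"""


def free_disks(lspv_text):
    """Return the names of disks whose volume group column is None."""
    free = []
    for line in lspv_text.splitlines():
        fields = line.split()
        # Expected columns: name, PVID, volume group, optional state.
        if len(fields) >= 3 and fields[2] == "None":
            free.append(fields[0])
    return free


print(free_disks(LSPV_OUTPUT))
```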

__ 2. Use the lsvg command to find the volume group information for the newvg volume
group.
__ a. Quorum: ________

__ b. Number of VGDAs (VG Descriptors): ____________

__ c. Active physical volumes: ___________


» Command and sample output:
# lsvg newvg
VOLUME GROUP: newvg VG IDENTIFIER:
00c07f7f00004c00000001218da898f3
VG STATE: active PP SIZE: 128 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 546 (69888 megabytes)
MAX LVs: 256 FREE PPs: 546 (69888 megabytes)
LVs: 0 USED PPs: 0 (0 megabytes)
OPEN LVs: 0 QUORUM: 2 (Enabled)
TOTAL PVs: 1 VG DESCRIPTORS: 2
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 1 AUTO ON: yes
MAX PPs per VG: 32512
MAX PPs per PV: 1016 MAX PVs: 32
LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable

» Quorum: 2
» 2 VGDAs

» 1 active physical volume

__ 3. Add a second unused disk to the newvg volume group.


» We will assume that this disk is hdisk3 in the rest of this exercise.
» Command:
# extendvg -f newvg hdisk3

__ 4. Use the lsvg command to find the volume group information for the newvg volume
group.
__ a. Quorum: ____________________

__ b. Number of VGDAs: ____________

__ c. Active physical volumes: ___________


» Command and sample output:
# lsvg newvg
VOLUME GROUP: newvg VG IDENTIFIER:
00c07f7f00004c00000001218da898f3
VG STATE: active PP SIZE: 128 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 625 (80000 megabytes)
MAX LVs: 256 FREE PPs: 625 (80000 megabytes)
LVs: 0 USED PPs: 0 (0 megabytes)
OPEN LVs: 0 QUORUM: 2 (Enabled)
TOTAL PVs: 2 VG DESCRIPTORS: 3
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 2 AUTO ON: yes
MAX PPs per VG: 32512
MAX PPs per PV: 1016 MAX PVs: 32
LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable

» Quorum: 2
» 3 VGDAs
» 2 active physical volumes
» The number of PVs and the number of VGDAs both increased.
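
» Optional sketch: the three values asked for in this step can be pulled out of the lsvg output with a few regular expressions. The Python code below is an illustration (lsvg_summary and the dictionary keys are hypothetical names) and assumes the field labels appear exactly as in the sample output.

```python
import re

# A few lines copied from the sample lsvg output above.
LSVG_OUTPUT = """\
OPEN LVs: 0 QUORUM: 2 (Enabled)
TOTAL PVs: 2 VG DESCRIPTORS: 3
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 2 AUTO ON: yes
"""

FIELDS = {
    "quorum": r"QUORUM:\s+(\d+)",
    "vgdas": r"VG DESCRIPTORS:\s+(\d+)",
    "active_pvs": r"ACTIVE PVs:\s+(\d+)",
}


def lsvg_summary(text):
    """Extract quorum, VGDA count, and active PV count from lsvg output."""
    summary = {}
    for key, pattern in FIELDS.items():
        match = re.search(pattern, text)
        summary[key] = int(match.group(1)) if match else None
    return summary


print(lsvg_summary(LSVG_OUTPUT))
```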

__ 5. Use the lspv command to identify how many VGDAs there are on each disk in the
VG.

__ a. Number of VGDAs on hdisk2: ____________

__ b. Number of VGDAs on hdisk3: ____________

» Commands and sample output:


# lspv hdisk2 | grep -i descriptor
TOTAL PPs: 546 (69888 megabytes) VG DESCRIPTORS: 2

# lspv hdisk3 | grep -i descriptor


TOTAL PPs: 79 (10112 megabytes) VG DESCRIPTORS: 1

» 2 VGDAs on hdisk2
» 1 VGDA on hdisk3
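
» Optional sketch: the VGDA counts seen in steps 2 through 5 follow a simple pattern, which the Python code below models. It is an illustration only (vgda_layout and quorum are hypothetical names) and assumes the usual LVM rule: a single-disk volume group keeps two VGDAs on its disk, a two-disk volume group keeps two on the first disk and one on the second, and larger volume groups keep one per disk. The quorum is taken to be the smallest number that is more than half of all VGDAs.

```python
def vgda_layout(disks):
    """Return a dict mapping each disk name to its number of VGDAs."""
    if not disks:
        raise ValueError("a volume group needs at least one disk")
    if len(disks) == 1:
        return {disks[0]: 2}
    if len(disks) == 2:
        return {disks[0]: 2, disks[1]: 1}
    return {disk: 1 for disk in disks}


def quorum(layout):
    """Smallest VGDA count that is more than half of the total."""
    return sum(layout.values()) // 2 + 1


one_disk = vgda_layout(["hdisk2"])
two_disks = vgda_layout(["hdisk2", "hdisk3"])
print(one_disk, quorum(one_disk))
print(two_disks, quorum(two_disks))
```

With both layouts the computed quorum is 2, which matches the QUORUM field in the two lsvg listings above.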

__ 6. Make the second disk (with only one VGDA) unavailable using the following steps.
__ a. Varyoff the newvg volume group.
» Command:
# varyoffvg newvg

__ b. Make the second disk unavailable using rmdev. Do not delete it from CuDv, just
change the device state from available to defined.
» Use rmdev -l <disk>. Do not use the rmdev -d flag.
» Command:
# rmdev -l hdisk3
hdisk3 Defined

__ 7. Try to vary on the newvg volume group. Did it vary on? _____________________
What is the status of the disk you unconfigured? __________________________

» Command and sample output:


# varyonvg newvg
PV Status: hdisk2 00c07fbf59f134be PVACTIVE
hdisk3 00c07f7f8db00776 PVMISSING
varyonvg: Volume group newvg is varied on.

» The volume group did varyon, but the second disk was missing.

__ 8. Look in the error log file to see if any errors were logged. What were the error labels
for the errors related to this experiment? ________________________________
_________________________________________________________________
» You should only have to look at the first two errors. There were two errors logged:
# errpt -A | pg
---------------------------------------------------------------------------
LABEL: LVM_QUORUMNOQUORUM
Date/Time: Fri May 29 15:09:19 2009
Type: INFO
Resource Name: LIBLVM
Description
Activation of a no quorum volume group without 100% of the disks.
Detail Data
MAJOR/MINOR DEVICE NUMBER
0021 0000
SENSE DATA
00C0 7F7F 0000 4C00 0000 0121 8DA8 98F3 0000 0000 0000 0000 0000 0000 0000 0000
---------------------------------------------------------------------------
LABEL: LVM_MISSPVADDED
Date/Time: Fri May 29 15:09:19 2009
Type: UNKN
Resource Name: LIBLVM
Description
PHYSICAL VOLUME DEFINED AS MISSING
Detail Data
MAJOR/MINOR DEVICE NUMBER
0011 0003
SENSE DATA
0000 0000 0000 0000 0000 0000 0000 0000
---------------------------------------------------------------------------
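
» Optional sketch: when the detailed report is long, the error labels can be extracted from saved errpt output. The Python code below is an illustration (error_labels is a hypothetical name) and assumes each entry starts its label line with "LABEL:" as in the report above.

```python
import re

# Abbreviated copy of the two entries shown above.
ERRPT_OUTPUT = """\
---------------------------------------------------------------------------
LABEL: LVM_QUORUMNOQUORUM
Date/Time: Fri May 29 15:09:19 2009
Type: INFO
Resource Name: LIBLVM
---------------------------------------------------------------------------
LABEL: LVM_MISSPVADDED
Date/Time: Fri May 29 15:09:19 2009
Type: UNKN
Resource Name: LIBLVM
"""


def error_labels(text):
    """Return the error labels in the order they appear in the report."""
    return re.findall(r"^LABEL:\s+(\S+)", text, flags=re.MULTILINE)


print(error_labels(ERRPT_OUTPUT))
```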

__ 9. Bring the second disk in the newvg volume group back to an available state and
verify the state by listing the device.
» Use cfgmgr to bring hdisk3 online. (You might need to run this a couple times.)
# cfgmgr
-OR-
# mkdev -l hdisk3

» Use lsdev to list the state.


# lsdev -l hdisk3
hdisk3 Available Virtual SCSI Disk Drive

__ 10. Display the physical volumes in the newvg volume group. What is the PV STATE of
the second disk?
_______________________________________________________________
» Commands and sample output:
# lsvg -p newvg
newvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 546 546 110..109..109..109..109
hdisk3 missing 79 79 16..16..15..16..16

» The second disk (hdisk3) is still in a missing state.

__ 11. What do you think will bring hdisk3 into an active state in the newvg volume group?
Try your strategy. (Look at the Hints if you do not know.)
__ a. Verify that it worked by running the lsvg -p newvg command.
» Running the varyonvg command works.
» Note: You do not have to vary it off first. You can run varyonvg on a volume group that
is already active. This will refresh the state of the disks in the volume group.
# varyonvg newvg
# lsvg -p newvg
newvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 546 546 110..109..109..109..109
hdisk3 active 79 79 16..16..15..16..16

» The second disk, hdisk3, is now back to an active PV state.
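
» Optional sketch: the PV STATE column of lsvg -p can be checked with a short script, for example to list every disk that is not active. The Python code below is an illustration (pv_states and inactive_pvs are hypothetical names) and assumes the two header lines and column order of the sample output from step 10.

```python
# Sample output from step 10, before varyonvg was run again.
LSVG_P_OUTPUT = """\
newvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 546 546 110..109..109..109..109
hdisk3 missing 79 79 16..16..15..16..16
"""


def pv_states(text):
    """Map each physical volume name to its PV STATE value."""
    states = {}
    # Skip the volume group name line and the column header line.
    for line in text.splitlines()[2:]:
        fields = line.split()
        if len(fields) >= 2:
            states[fields[0]] = fields[1]
    return states


def inactive_pvs(text):
    """Return the names of physical volumes that are not active."""
    return [name for name, state in pv_states(text).items() if state != "active"]


print(pv_states(LSVG_P_OUTPUT))
print(inactive_pvs(LSVG_P_OUTPUT))
```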

In the previous steps, you removed the second disk (which only had one VGDA). In the
following steps, you will remove the first disk (which has two VGDAs).
__ 12. Make the first disk unavailable using the following steps.
__ a. Varyoff the newvg volume group.
» Command:
# varyoffvg newvg

__ 13. Make the first disk unavailable using rmdev. Do not delete it from CuDv.
» Use rmdev -l hdisk2. Do not use the rmdev -d flag.
» Command:
# rmdev -l hdisk2
hdisk2 Defined

__ 14. Clear the AIX error log of previous errors.


» Command:
# errclear 0

__ 15. Try to varyon the newvg volume group. Did it varyon? If not, why?
_____________________________________________________________
_____________________________________________________________
» Command and sample output:
# varyonvg newvg
0516-052 varyonvg: Volume group cannot be varied on without
a quorum. More physical volumes in the group must be
active. Run diagnostics on inactive PVs.

» The volume group did not varyon.


» In a previous step, you removed hdisk3 (the second disk added to the volume group)
and then did a varyon of newvg. The varyon was successful because hdisk3 only had
one active VGDA and hdisk2 had two.
» Why did it fail this time? This time you removed hdisk2, which had two active VGDAs,
leaving only one active VGDA out of three. A quorum requires more than half of the
VGDAs to be available.
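
» Optional sketch: the two varyon attempts in this part can be modeled in a few lines. The Python code below is an illustration only (can_vary_on is a hypothetical name). It assumes that a normal varyon needs more than half of the VGDAs to be reachable and that a forced varyon needs at least one reachable VGDA.

```python
def can_vary_on(layout, available_disks, force=False):
    """Decide whether a varyon would succeed.

    layout maps disk names to their VGDA counts; available_disks is
    the set of disks that are currently in the Available state.
    """
    total = sum(layout.values())
    reachable = sum(
        count for disk, count in layout.items() if disk in available_disks
    )
    if force:
        return reachable >= 1
    return reachable > total / 2


# VGDA layout of newvg as found in step 5.
NEWVG = {"hdisk2": 2, "hdisk3": 1}

print(can_vary_on(NEWVG, {"hdisk2"}))              # hdisk3 unconfigured
print(can_vary_on(NEWVG, {"hdisk3"}))              # hdisk2 unconfigured
print(can_vary_on(NEWVG, {"hdisk3"}, force=True))  # forced varyon
```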

__ 16. Look in the AIX error log to see if any new errors were logged.
» Command:
# errpt -A | pg
» Note: No new errors were logged.

__ 17. Vary on the newvg volume group using the force (-f) flag. What is the state of the
disk which you just unconfigured?
_____________________________________________________________

» The force (-f) option to varyonvg must be used.


# varyonvg -f newvg
PV Status: hdisk2 00c07fbf59f134be PVREMOVED
hdisk3 00c07f7f8db00776 PVACTIVE
varyonvg: Volume group newvg is varied on.
» The disk is in a removed state.

__ 18. Look in the error log file to see if any errors were logged. What were the labels of the
errors which are listed?
______________________________________________________________
» There were two errors logged:
# errpt -A | pg
---------------------------------------------------------------------------
LABEL: LVM_FORCEVARYON
Date/Time: Fri May 29 15:36:06 2009
Type: INFO
Resource Name: LIBLVM
Description
Forced activation of a volume group.
Detail Data
MAJOR/MINOR DEVICE NUMBER
0021 0000
SENSE DATA
00C0 7F7F 0000 4C00 0000 0121 8DA8 98F3 0000 0000 0000 0000 0000 0000 0000 0000
---------------------------------------------------------------------------
LABEL: LVM_MISSPVADDED
Date/Time: Fri May 29 15:36:06 2009
Type: UNKN
Resource Name: LIBLVM
Description
PHYSICAL VOLUME DEFINED AS MISSING
Detail Data
MAJOR/MINOR DEVICE NUMBER
0011 0000
SENSE DATA
0000 0000 0000 0000 0000 0000 0000 0000
---------------------------------------------------------------------------

__ 19. Bring the first disk in the newvg volume group back to an available state and verify
the device state result by listing the device.
» Use cfgmgr to bring hdisk2 online:
# cfgmgr
-OR-
# mkdev -l hdisk2

» Use the lsdev command to list the hdisk2 device state:


# lsdev -l hdisk2
hdisk2 Available Virtual SCSI Disk Drive

__ 20. In the previous scenario, we placed the second disk into a defined state and then
varied the volume group online to cause that physical volume to be in a missing
state. Then, after we brought the second disk back to an available state, the
varyonvg command rebuilt the VGDA information and brought the volume group
back to the original state with both physical volumes active.
In the current scenario, try the varyonvg command, followed by the lsvg -p newvg
command. Did it fix the situation?
_______________________________________________________________
» Commands and sample output:
# varyonvg newvg
# lsvg -p newvg
newvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 removed 546 546 110..109..109..109..109
hdisk3 active 79 79 16..16..15..16..16

» The PV states are the same as they were before. The first physical volume is in a
removed state. It did not fix the problem.

__ 21. Explicitly change the state of the first physical volume to an active state.
» # chpv -v a hdisk2

__ 22. Display the physical volumes in the newvg volume group. Has anything changed?
________________________________________________________________
» Example command and output:
# lsvg -p newvg
newvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 546 546 110..109..109..109..109
hdisk3 active 79 79 16..16..15..16..16

» The first physical volume has changed from a state of removed to a state of active.

__ 23. Prove that the first physical volume is truly active in the volume group by creating a
logical volume with one physical partition allocated on that physical volume.

» Example command and output:


# mklv newvg 1 hdisk2
lv01
# lspv -l hdisk2
hdisk2:
LV NAME LPs PPs DISTRIBUTION MOUNT POINT
lv01 1 1 00..01..00..00..00 N/A
» The chpv command has logic to make the specified Active state truly effective without
having to vary on the volume group.

__ 24. Remove the newvg volume group. Delete any allocated logical volume as needed to
succeed in the VG removal.
» Command:
# reducevg -d newvg hdisk2 hdisk3
Reply yes to any prompts on the removal of logical
volumes.
End of part 1

Part 2 - rootvg disk replacement


Hypothetical scenario:
While you have your application data on SAN disks, your rootvg is on a disk in a disk
bay which is integrated into your server. Recently you have noticed frequent
DISK_ERR4 events in the AIX error log. You have decided to replace the disk, but you
do not want to take down the system. There is already a spare disk in the disk bay.
You will migrate the rootvg content from the failing disk to this spare disk and then
replace the failing disk. (This has scared you enough that you plan to eventually mirror
the rootvg after the bad disk is replaced).
__ 25. Check whether hdisk1 is assigned to a volume group. If it is, remove it from that
volume group.
» To find a disk that is empty:
# lspv
» You may need to remove hdisk1 from a user volume group.
» If so, use (for example):
# reducevg datavg hdisk1

__ 26. Extend the rootvg volume group to include the hdisk1 physical volume. You may
have to use the force flag if there is an old VGDA on the disk.

» # extendvg rootvg hdisk1

__ 27. Migrate only the rootvg boot logical volume from hdisk0 to hdisk1.
» # migratepv -l hd5 hdisk0 hdisk1

__ 28. Regenerate the contents of the boot logical volume. You may need more room in
/tmp; if so, delete unneeded files or increase the /tmp filesystem allocation.
» # bosboot -ad /dev/hdisk1

__ 29. Clear the old boot record of hdisk0.


» # chpv -c hdisk0

__ 30. Update the bootlist to only try to boot off of hdisk1.


» # bootlist -m normal hdisk1

__ 31. Migrate all rootvg logical volumes remaining on hdisk0 to hdisk1.


This will take a few minutes; please be patient.
» # migratepv hdisk0 hdisk1

__ 32. Verify that all of the logical volumes have been moved from hdisk0 to hdisk1. Then,
remove the hdisk0 physical volume from the rootvg volume group.
» Example commands are:
# lspv -l hdisk0
# lspv -l hdisk1
# reducevg rootvg hdisk0
# lsvg -p rootvg

__ 33. Delete hdisk0 from the ODM.


» Example commands are:
# rmdev -d -l hdisk0
# lsdev -Cc disk

__ 34. We will assume that the failing disk has been replaced through a hot swap
procedure. Rediscover and configure the replacement disk.

» Example commands are:


# cfgmgr
# lsdev -Cc disk

__ 35. Later lab exercises have hints which assume that the rootvg resides on hdisk0.

Note

Repeat this procedure to migrate the rootvg back to hdisk0 which will then make your
environment compatible with the later hints.
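
» Optional sketch: when repeating the procedure in the opposite direction, it is easy to swap the disk names in the wrong place. The Python helper below is an illustration (rootvg_migration_commands is a hypothetical name); it only prints the commands from steps 26 through 32 with the source and target disks filled in and does not run anything.

```python
def rootvg_migration_commands(source, target):
    """Return the command sequence of steps 26 through 32 as strings."""
    return [
        "extendvg rootvg %s" % target,                 # step 26
        "migratepv -l hd5 %s %s" % (source, target),   # step 27
        "bosboot -ad /dev/%s" % target,                # step 28
        "chpv -c %s" % source,                         # step 29
        "bootlist -m normal %s" % target,              # step 30
        "migratepv %s %s" % (source, target),          # step 31
        "reducevg rootvg %s" % source,                 # step 32
    ]


# Migrating back: rootvg currently on hdisk1, returning to hdisk0.
commands = rootvg_migration_commands("hdisk1", "hdisk0")
for command in commands:
    print("# " + command)
```

Review the printed list against your own disk names before typing the commands.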

__ 36. Remove hdisk1 from the rootvg volume group.


» # reducevg rootvg hdisk1

Part 3 - User VG Disk replacement procedure


__ 37. Define a new volume group using hdisk2. Check that this disk does not belong to
another volume group. If you are certain that the disk is not part of a volume group,
yet the disk still has an old VGDA on it, you may need to force the creation of the
new volume group. Set the physical partition size to 16 MB.
Write down the command you executed to create the new volume group:
___________________________________________________________
» To find a disk that is empty:
# lspv

» To create the datavg volume group:


# mkvg -s 16 -y datavg hdisk2
If you get the following error message:
0516-1398 mkvg: The physical volume hdisk2, appears to belong
to another volume group. Use the force option to add this
physical volume to a volume group.
0516-862 mkvg: Unable to create volume group.

Then, use this command:


# mkvg -f -s 16 -y datavg hdisk2

__ 38. Create and mount a new JFS2 file system in the new volume group. Allocate the
minimum amount of space with a default mount point of /myfs.
» Example commands are:
# crfs -v jfs2 -g datavg -a size=1 -m /myfs
# mount /myfs

__ 39. Create some files in the new file system by copying configuration files from the /etc
directory.
» Example commands are:
# cp /etc/*.conf /myfs
# ls /myfs

We will pretend that you have the following situation:


You have your user data in the SAN. The storage subsystem was purchased mainly on
the criterion of lowest price. The storage subsystem had performance and functional
problems, resulting in the purchase of a replacement storage subsystem. You have
been assigned to (non-disruptively) move the user volume group off of the old disk and
onto the disk that is backed by a LUN in the new storage subsystem.

__ 40. Check to see that another disk is available (such as hdisk3).


» Example commands are:
# lspv

__ 41. Extend the volume group to include the extra disk.


» Example commands are:
# extendvg datavg hdisk3
# lsvg -p datavg

__ 42. Migrate the data which is on the failing disk to the new disk.
» Example commands are:
# migratepv hdisk2 hdisk3

__ 43. Verify that there are no logical volumes left on the failing disk and that they are now
on the other disk.

» Example commands are:


# lspv -l hdisk2
# lspv -l hdisk3

__ 44. Verify that the data you created is still there.


» Example commands are:
# ls /myfs

__ 45. Remove the old disk from the VG and verify that your volume group now has only
the new disk.
» Example commands are:
# reducevg datavg hdisk2
# lsvg -p datavg

__ 46. Remove the old disk from the ODM customized device database and verify that it
has been deleted from the ODM.
» Example commands are:
# rmdev -dl hdisk2
# lsdev -Cc disk

__ 47. At this point, we will assume that the SAN administrators have created and zoned a
new LUN for our system. Discover and configure the disk. Verify that we now have
an hdisk2 disk.
» Example commands are:
# cfgmgr
# lsdev -Cc disk

__ 48. Finally, let us clean up what we have created in this part of the lab exercises.
__ a. Remove the datavg volume group. You will either need to first remove any logical
volumes or request that the removal of the logical volumes be handled as part of
removing the volume group.
» Example commands are:
# umount /myfs
# reducevg -d datavg hdisk3

End of exercise

Exercise review/wrap-up


Ask the students:

Exercise 9. Install and cloning techniques


(with hints)

Estimated time
Part 1 - 00:35
Part 2 - 00:45
Total - 01:20

What this exercise is about


This exercise provides an opportunity to practice using advanced
techniques for updating AIX (alternate disk and multibos).

What you should be able to do


At the end of the lab, you should be able to:
• Create an alternate rootvg disk and update it with maintenance
without changing the level of the active rootvg
• Create a standby BOS inside the active rootvg and apply
maintenance without changing the level of the active BOS

Introduction
All instructions in this exercise require root authority. There must be
another disk which is large enough to hold the updated rootvg. The
rootvg must have enough free space to hold the standby BOS rootvg
file systems.
The disk assignments on your system can vary from what is in the
exercise hints. Adjust your commands to match your situation.

Common student problems


None to report at this time.

Special instructions
The first two parts of this exercise involve cloning the rootvg, first to an alternate disk
and then to a standby BOS. The clone operations take at least one half hour to
complete. It is strongly advised that this time not be spent staring at the computer
screen or just chatting.
This could be a good time to go to lunch; or, if it is the end of the day, this could be a
good time to stop class and check on the results when class resumes the next
morning.
If it is in the middle of the morning or the middle of the afternoon, then this time should
be used to cover the next lecture and discussion material. The students would pick up
where they left off when that lecture unit or topic is completed. For example, while
waiting on the alt_disk_copy operation to complete, the class could cover multibos.
When the multibos discussion is over, they could finish up the post-clone exercise steps
for alternate disk and then start right in on the multibos exercise.

Exercise instructions with hints


Preface
Two versions of these instructions are available; one with hints and one without. You
can use either version to complete this exercise. Also, please do not hesitate to ask the
instructor if you have questions.
All exercises of this chapter depend on the availability of specific equipment in your
classroom.
The output shown in the answers is an example. Your output and answers based on the
output may be different.
All hints are marked with a » sign.

Part 1 - Creating and working with an alternate rootvg


__ 1. Open a terminal emulation window using telnet protocol to connect to your assigned
server logical partition. Log in as root.
__ 2. AIX 7.1 TL0 SP2 maintenance is stored on your server LPAR in the
/export/AIX_7100-00-02 directory. Verify that there are files under this directory.
» # ls /export/AIX_7100-00-02

__ 3. Check to see if this maintenance directory has been NFS exported to allow
read-only access (root access allowed) from your client LPARs, using standard AIX
system authentications (sys).
» # exportfs

__ 4. If it has not been NFS exported, then set up the NFS export for this directory, with
the characteristics which were described in the previous step. If it has been NFS
exported, but your client LPAR does not have root access permission, then just add
your LPAR to that permission list. Be sure to coordinate with the other students
sharing this server LPAR so you do not try to configure NFS at the same time.
» If not yet NFS exported:
# smitty nfs
Network File System (NFS)
Add a Directory to Exports List

» On the dialog panel:

Pathname of directory to export [/export/AIX_7100-00-02]


. . .
* Security method 1 [sys] +
* Mode to export directory read-only +
Hostname list. If exported read-mostly []
Hosts & netgroups allowed client access []
Hosts allowed root access [<IPaddrs of client LPARs>]

» If exported, but your LPAR does not have root access permission:
# smitty nfs
Network File System (NFS)
Change / Show Attributes of an Exported Directory
» When prompted, provide the name of the exported maintenance directory.
» When prompted, specify version 3.
» In the dialog panel, under Security method 1, add your LPAR IP address to the list
(comma or colon delimited) next to Hosts allowed root access

__ 5. Open a terminal emulation window using telnet protocol to connect to your assigned
client logical partition, if you do not already have one. Log in as root.
__ 6. Check to see if the exported maintenance directory is already mounted to your /mnt
directory mount point, with read-only access. If not, then mount it.
» The suggested commands are:
# mount
» If not already mounted:
# mount -o ro <serverLPAR IPaddr>:/export/AIX_7100-00-02 /mnt

__ 7. Identify the current level of the AIX base operating system (BOS), including the
technology level and the service pack.
» The suggested commands are:
# oslevel -s
7100-00-01-1037
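
» Optional sketch: the oslevel -s string can be split into its parts to compare the running level with the maintenance level in /export/AIX_7100-00-02. The Python code below is an illustration (parse_oslevel and needs_update are hypothetical names). It assumes the fields are, in order, the base level, the technology level, the service pack, and an optional build date field.

```python
def parse_oslevel(level):
    """Return (base, technology_level, service_pack) from an oslevel string."""
    parts = level.strip().split("-")
    if len(parts) not in (3, 4):
        raise ValueError("unexpected oslevel format: %r" % level)
    # The fourth field, when present, is ignored for the comparison.
    return (parts[0], int(parts[1]), int(parts[2]))


def needs_update(current, target):
    """True when the current level is older than the target level."""
    return parse_oslevel(current) < parse_oslevel(target)


print(parse_oslevel("7100-00-01-1037"))
print(needs_update("7100-00-01-1037", "7100-00-02"))
```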

__ 8. Identify a free disk which has a size greater than the used space in the rootvg. (Use
the disk with the smallest numeric suffix, if possible). Record the disk logical device
name here: ______________________________________________

» The suggested commands are:


# lsvg rootvg [examine the USED PPs field]
# lspv
# bootinfo -s hdisk# (for each disk that is not part of a volume group, until you find an
appropriate disk)
» The bootinfo -s output is in megabytes.
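» The size comparison in this step can be sketched as follows. The helper names used_mb and disk_fits are hypothetical, the sample line uses made-up numbers, and the parsing assumes the USED PPs line has the form "USED PPs: N (M megabytes)".

```shell
# used_mb: read "lsvg <vg>" output on stdin and print the USED PPs size in MB.
# disk_fits DISK_MB USED_MB: succeed when the disk is larger than the used space.
# Both names are hypothetical helpers for this exercise.
used_mb() {
    awk '/USED PPs:/ {
        sub(/.*USED PPs:[ ]*[0-9]+ \(/, "")   # drop everything up to "("
        sub(/ megabytes\).*/, "")             # drop the trailing text
        print
    }'
}

disk_fits() {
    [ "$1" -gt "$2" ]
}

# Illustration with made-up numbers (not taken from a lab system):
sample='LVs:                12                       USED PPs:           54 (3456 megabytes)'
need=$(echo "$sample" | used_mb)
if disk_fits 8192 "$need"; then
    echo "an 8192 MB disk is larger than the $need MB in use"
fi
# On AIX: need=$(lsvg rootvg | used_mb); disk_fits "$(bootinfo -s hdisk1)" "$need"
```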

__ 9. Use the alt_disk_copy command to create a clone on the disk that was just
identified. Update the clone with all of the AIX 7.1 TL0 SP2 maintenance as it is
being created.
Notify your instructor that you have initiated the alternate disk copy operation.
The cloning followed by application of maintenance should take a little over 8
minutes to complete. While you are waiting, your instructor will direct you either to
continue with the next part of the exercise, to continue with lecture and discussion,
or to take a break.
» The suggested commands are:
# alt_disk_copy -b update_all -l /mnt -d hdisk1

Note

The exercise hints will assume that the current rootvg disk is hdisk0 and that the alternate
disk is hdisk1. If your situation is different, be careful to specify the correct disks for your
situation.

» If you receive an error message stating that the disk may not be bootable, use the -g
flag to override the check.

__ 10. When the alternate rootvg has been created, display the physical volumes and their
associated volume groups. Is the target disk of the alt_disk_copy operation
identified as the alternate rootvg?
_________________________________________
» The suggested commands are:
# lspv
hdisk0 00c07f6f584cd485 rootvg active
hdisk1 00c07f6f820feea0 altinst_rootvg
» The target disk now is identified as belonging to the altinst_rootvg volume group.
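» To pick the clone disk out of the lspv listing without reading it by eye, a filter along these lines could be used. The helper name disk_of_vg is hypothetical; it assumes the volume group name is the third column, as in the sample output above.

```shell
# disk_of_vg VG: read "lspv" output on stdin and print the disks in volume group VG.
# The helper name is hypothetical.
disk_of_vg() {
    awk -v vg="$1" '$3 == vg { print $1 }'
}

# Sample output from the hint above:
lspv_out='hdisk0 00c07f6f584cd485 rootvg active
hdisk1 00c07f6f820feea0 altinst_rootvg'
echo "$lspv_out" | disk_of_vg altinst_rootvg
# On AIX: lspv | disk_of_vg altinst_rootvg
```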

__ 11. Display the technology and service pack level of the active rootvg. Did it change?
______________________________________________________________
» The suggested commands are:
# oslevel -s
» The oslevel of the current operating system (using rootvg on hdisk0) has not changed.

__ 12. Also display the normal boot list. Is the active or alternate rootvg listed as the first
boot device? ____________________________________________________
» The suggested commands are:
# bootlist -o -m normal
hdisk1 blv=hd5 pathid=0
» The normal bootlist has been changed to have the altinst_rootvg disk as the only device
in the list.

__ 13. If the bootlist does not have the alternate rootvg disk as the first boot device, change
it to boot off of the alternate rootvg.
» The suggested commands are:
(if needed:) # bootlist -m normal hdisk#

__ 14. Reboot your system in a safe manner. You will lose your current connection to the
LPAR during the shutdown phase.
» The suggested commands are:
# shutdown -Fr

__ 15. After the reboot is completed, log back in to your LPAR as the root user, and verify
that the level of the BOS is at the applied TL and SP level.
» A good way to detect when the client is close to being back up is to initiate a ping
command from your server LPAR.
» The suggested commands are:
# oslevel -s
7100-00-02
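» The ping check mentioned above can be wrapped in a small retry loop, sketched here. The function name wait_until and the CLIENT_IP variable are hypothetical. A successful ping only shows that the network stack is answering, so the login services might need a little longer.

```shell
# wait_until TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts. Hypothetical helper name.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0            # the probe command succeeded
        fi
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1                    # gave up after the last attempt
}

# Example, run from the server LPAR (CLIENT_IP is a placeholder you set yourself):
# wait_until 60 5 ping -c 1 "$CLIENT_IP" && echo "client is answering"
```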

__ 16. List the physical volumes. What are the associated volume groups? Did they
change?

» The suggested commands are:
# lspv
hdisk0 00c07f6f584cd485 old_rootvg
hdisk1 00c07f6f820feea0 rootvg active
» In the hint example, the volume group name for hdisk0 has been changed from rootvg
to old_rootvg, while the volume group name for hdisk1 has been changed from
altinst_rootvg to rootvg.

__ 17. Change the bootlist back to using the original boot logical volume and safely reboot
the LPAR.
» The suggested commands are:
# bootlist -m normal hdisk0 (if hdisk0 is the original rootvg disk)
# shutdown -Fr

__ 18. After the reboot, reconnect to your client LPAR and log back in as the root user.
Confirm that the operating system is back to the older level.
» The suggested command is:
# oslevel -s

Part 2 - Creating and working with a standby BOS using multibos


__ 19. Open a terminal emulation window using telnet protocol to connect to your assigned
server logical partition. Log in as root.
__ 20. AIX 7.1 TL0 SP2 maintenance is stored on your server LPAR in the
/export/AIX_7100-00-02 directory. Verify that there are files under this directory.
»# ls /export/AIX_7100-00-02

__ 21. Check to see if this maintenance directory has been NFS exported to allow
read-only access (root access allowed) from your client LPARs, using standard AIX
system authentications (sys).
»# exportfs

__ 22. If it has not been NFS exported, then set up the NFS export for this directory, with
the characteristics described in the previous step. If it has been NFS exported, but
your client LPAR does not have root access permission, then just add your LPAR to
that permission list. Be sure to coordinate with the other students sharing this server
LPAR so you do not try to configure NFS at the same time.
» If not yet NFS exported:
# smitty nfs
Network File System (NFS)
Add a Directory to Exports List
» On the dialog panel:

Pathname of directory to export [/export/AIX_7100-00-02]


. . .
* Security method 1 [sys] +
* Mode to export directory read-only +
Hostname list. If exported read-mostly []
Hosts & netgroups allowed client access []
Hosts allowed root access [<IPaddrs of client LPARs>]

» If exported, but your LPAR does not have root access permission:
# smitty nfs
Network File System (NFS)
Change / Show Attributes of an Exported Directory
» When prompted, provide the name of the exported maintenance directory.
» When prompted, specify version 3.
» In the dialogue panel, under Security method 1, add your LPAR IP address to the
list (comma or colon delimited) next to Hosts allowed root access

__ 23. If you do not already have a session with your assigned client LPAR, open a
terminal emulation window using telnet protocol to connect to your assigned client
logical partition. Log in as root.

__ 24. Mount the exported maintenance directory to your /mnt directory mount point, with
read-only access.
» The suggested commands are:
# mount -o ro <serverLPAR IPaddr>:/export/AIX_7100-00-02 /mnt

__ 25. For the rootvg, determine the amount of space used and the amount of space free.
Record the values here (both in the number of physical partitions and in units of
megabytes):
________________________________________________________________

» The suggested commands are:
# lsvg rootvg

__ 26. The alternate BOS creation will involve cloning the rootvg file systems and other
logical volumes. We need to ensure that there is enough space on the disk holding
the rootvg to receive all of the system-defined logical volumes that will be cloned.
Extending the volume group and configuring multibos to use an additional disk is
non-trivial.
If there is not more free space than the amount of used space in the rootvg, contact
your instructor.
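» The free-versus-used comparison described in this step could be scripted roughly as follows. The helper name enough_free is hypothetical, the sample lines use made-up numbers, and the parsing assumes the FREE PPs and USED PPs fields appear on separate lines of the lsvg output.

```shell
# enough_free: read "lsvg rootvg" output on stdin; succeed when the FREE PPs
# count is greater than the USED PPs count. Hypothetical helper name.
enough_free() {
    awk '
        /FREE PPs:/ { sub(/.*FREE PPs:[ ]*/, ""); free = $0 + 0 }
        /USED PPs:/ { sub(/.*USED PPs:[ ]*/, ""); used = $0 + 0 }
        END { exit (free > used ? 0 : 1) }
    '
}

# Illustration with made-up numbers (not taken from a lab system):
sample='MAX LVs:            256                      FREE PPs:           400 (25600 megabytes)
LVs:                12                       USED PPs:           54 (3456 megabytes)'
if echo "$sample" | enough_free; then
    echo "enough free space for a standby BOS"
else
    echo "contact your instructor"
fi
# On AIX: lsvg rootvg | enough_free
```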

__ 27. Identify the current level (including TL and SP) of the AIX base operating system
(BOS). ___________________________________________________________
» The suggested commands are:
# oslevel -s
7100-00-01-1037

__ 28. Create a user-defined enhanced file system (JFS2) in the rootvg which is one logical
partition in size, and then mount that file system. The default mount point directory
should be: /userfs.
» The suggested commands are:
# crfs -v jfs2 -g rootvg -a size=1 -m /userfs
# mount /userfs
-OR-
# smit jfs2
Add an Enhanced Journaled File System
(select rootvg off of the volume group list)
Add an Enhanced Journaled File System

[TOP] [Entry Fields]


Volume group name rootvg
SIZE of file system
Unit Size 512bytes +
* Number of units [1] #
* MOUNT POINT [/userfs]

(accept the defaults for the other fields)

# mount /userfs

__ 29. Create some files in the new file system.


» The suggested commands are:
# cd /userfs
# touch this will create five files
# cd

__ 30. List the current logical volumes in the rootvg.


» The suggested commands are:
# lsvg -l rootvg
rootvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd5 boot 1 1 1 closed/syncd N/A
hd6 paging 4 4 1 open/syncd N/A
hd8 jfs2log 1 1 1 open/syncd N/A
hd4 jfs2 4 4 1 open/syncd /
hd2 jfs2 34 34 1 open/syncd /usr
hd9var jfs2 2 2 1 open/syncd /var
hd3 jfs2 3 3 1 open/syncd /tmp
hd1 jfs2 2 2 1 open/syncd /home
hd10opt jfs2 1 1 1 open/syncd /opt
hd11admin jfs2 1 1 1 open/syncd /admin
fslv00 jfs2 1 1 1 open/syncd /userfs

__ 31. List the current normal bootlist. What is the first boot device in the list? ___________
» The example commands and output are:
# bootlist -o -m normal
hdisk0 blv=hd5 pathid=0

__ 32. Create a standby BOS and extend file systems as needed. Apply maintenance to
update the BOS to AIX 7.1 TL0 SP2. First run in preview mode; then, run it to
actually create the standby BOS.
Notify your instructor that you have initiated the creation of the standby BOS.
This will likely take a little over 10 minutes to complete.
» The suggested commands are:
# multibos -p -Xsa -l /mnt
# multibos -Xsa -l /mnt

__ 33. When the standby BOS creation is completed, display the logical volumes in the
rootvg. What do you see that is new? _________________________________

______________________________________________________________
______________________________________________________________
» The suggested commands are:
# lsvg -l rootvg
rootvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd5 boot 1 1 1 closed/syncd N/A
hd6 paging 4 4 1 open/syncd N/A
hd8 jfs2log 1 1 1 open/syncd N/A
hd4 jfs2 4 4 1 open/syncd /
hd2 jfs2 34 34 1 open/syncd /usr
hd9var jfs2 2 2 1 open/syncd /var
hd3 jfs2 3 3 1 open/syncd /tmp
hd1 jfs2 2 2 1 open/syncd /home
hd10opt jfs2 1 1 1 open/syncd /opt
hd11admin jfs2 1 1 1 open/syncd /admin
fslv00 jfs2 1 1 1 open/syncd /userfs
bos_hd5 boot 1 1 1 closed/syncd N/A
bos_hd4 jfs2 4 4 1 closed/syncd /bos_inst
bos_hd2 jfs2 34 34 1 closed/syncd /bos_inst/usr
bos_hd9var jfs2 2 2 1 closed/syncd /bos_inst/var
bos_hd10opt jfs2 1 1 1 closed/syncd /bos_inst/opt
» You should see new logical volumes with names prefixed with bos_. For the new logical
volumes which are file systems, you should see that their default mount points are
under the /bos_inst directory.
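» To see at a glance what the standby BOS added, the bos_ rows can be filtered out of the lsvg -l listing and their physical partitions totalled. This is a minimal sketch; the helper name standby_lvs is hypothetical and it assumes PPs is the fourth column, as in the sample output above.

```shell
# standby_lvs: read "lsvg -l rootvg" output on stdin, list each bos_ logical
# volume with its PP count, then print the total. Hypothetical helper name.
standby_lvs() {
    awk '$1 ~ /^bos_/ { print $1, $4; total += $4 }
         END { print "total PPs:", total + 0 }'
}

# Rows taken from the sample output above:
rows='hd5 boot 1 1 1 closed/syncd N/A
fslv00 jfs2 1 1 1 open/syncd /userfs
bos_hd5 boot 1 1 1 closed/syncd N/A
bos_hd4 jfs2 4 4 1 closed/syncd /bos_inst
bos_hd2 jfs2 34 34 1 closed/syncd /bos_inst/usr
bos_hd9var jfs2 2 2 1 closed/syncd /bos_inst/var
bos_hd10opt jfs2 1 1 1 closed/syncd /bos_inst/opt'
echo "$rows" | standby_lvs
# On AIX: lsvg -l rootvg | standby_lvs
```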

__ 34. Was there a new copy of your user-defined file system in the standby BOS?
________________________________________________________________
» You should see that your user-defined file system is available, but it is the same logical
volume rather than a unique copy related to the standby BOS.

__ 35. Display the normal bootlist. How does this differ from what you displayed prior to
standby BOS creation? ____________________________________________
______________________________________________________________
» The example command and output is:
# bootlist -o -m normal
hdisk0 blv=bos_hd5 pathid=0
hdisk0 blv=hd5 pathid=0
» You will notice that the standby BOS has been established as the first boot device. If
this is not what is desired, the multibos command has an option to suppress making this
automatic change.
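» A quick way to check which boot logical volume is first in the list is to pull the blv= value out of the first bootlist entry. The helper name first_blv is hypothetical; it assumes the entry format shown in the sample output above.

```shell
# first_blv: read "bootlist -o -m normal" output on stdin and print the boot
# logical volume named on the first entry. Hypothetical helper name.
first_blv() {
    awk 'NR == 1 {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^blv=/) print substr($i, 5)   # text after "blv="
    }'
}

# Sample output from the hint above:
bootlist_out='hdisk0 blv=bos_hd5 pathid=0
hdisk0 blv=hd5 pathid=0'
echo "$bootlist_out" | first_blv
# On AIX: bootlist -o -m normal | first_blv
```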

__ 36. Display what is currently mounted. Are the standby BOS copies of the non-shared
file systems mounted? ______________________________________________
» The suggested commands are:
# mount
node mounted mounted over vfs date options
-------- --------------- --------------- ------ ------------ ---------------
/dev/hd4 / jfs2 Apr 14 12:05 rw,log=/dev/hd8
/dev/hd2 /usr jfs2 Apr 14 12:05 rw,log=/dev/hd8
/dev/hd9var /var jfs2 Apr 14 12:06 rw,log=/dev/hd8
/dev/hd3 /tmp jfs2 Apr 14 12:06 rw,log=/dev/hd8
/dev/hd1 /home jfs2 Apr 14 12:06 rw,log=/dev/hd8
/dev/hd11admin /admin jfs2 Apr 14 12:06 rw,log=/dev/hd8
/proc /proc procfs Apr 14 12:06 rw
/dev/hd10opt /opt jfs2 Apr 14 12:06 rw,log=/dev/hd8
/dev/fslv00 /userfs jfs2 Apr 14 12:14 rw,log=/dev/hd8
» You should not see any of the standby BOS created file systems listed as mounted.

__ 37. Mount the standby BOS (using the multibos command) and then display what file
systems are mounted. What is the path to the mount points of the standby BOS
unique file systems?
_____________________________________________________________
» The suggested commands are:
# multibos -m
# mount
node mounted mounted over vfs date options
-------- --------------- --------------- ------ ------------ ---------------
/dev/hd4 / jfs2 Apr 14 12:05 rw,log=/dev/hd8
/dev/hd2 /usr jfs2 Apr 14 12:05 rw,log=/dev/hd8
/dev/hd9var /var jfs2 Apr 14 12:06 rw,log=/dev/hd8
/dev/hd3 /tmp jfs2 Apr 14 12:06 rw,log=/dev/hd8
/dev/hd1 /home jfs2 Apr 14 12:06 rw,log=/dev/hd8
/dev/hd11admin /admin jfs2 Apr 14 12:06 rw,log=/dev/hd8
/proc /proc procfs Apr 14 12:06 rw
/dev/hd10opt /opt jfs2 Apr 14 12:06 rw,log=/dev/hd8
/dev/fslv00 /userfs jfs2 Apr 14 12:14 rw,log=/dev/hd8
/dev/bos_hd4 /bos_inst jfs2 Apr 14 15:29 rw,log=/dev/hd8
/dev/bos_hd2 /bos_inst/usr jfs2 Apr 14 15:29 rw,log=/dev/hd8
/dev/bos_hd9var /bos_inst/var jfs2 Apr 14 15:29 rw,log=/dev/hd8
/dev/bos_hd10opt /bos_inst/opt jfs2 Apr 14 15:29 rw,log=/dev/hd8
» The file systems under the /bos_inst directory are now mounted.

__ 38. Change your current working directory to the root of the standby BOS unique file
systems, and then create a new directory called special in the standby BOS
/bos_inst/usr file system. Create some files in /bos_inst/usr/special and then
change your working directory back to the active BOS root directory.

» The suggested commands are:
# cd /bos_inst/usr
# mkdir special
# cd special
# touch this should create eight files in the directory
# cd /

__ 39. Unmount the standby BOS.


» The suggested commands are:
# multibos -u

__ 40. Display the directories under the active BOS /usr directory. Is the new special
directory there? ______________________________________________
» The suggested commands are:
# ls /usr
» The special directory is not shown. This illustrates how customizations of a mounted
standby BOS do not affect the active BOS.

__ 41. Start a standby BOS shell. In the shell, list the OS level (including service pack) and
list the directories under the /usr directory. Then exit the shell. Was the directory
called special shown?
________________________________________________________________
» The suggested commands are:
# multibos -S
MULTIBOS> oslevel -s
7100-00-02-1041
MULTIBOS> ls /usr/special
MULTIBOS> exit
» You should notice that the oslevel is for the standby BOS.
» You should notice that the special directory is not only there, but that it has a path of
/usr/special, rather than a path of /bos_inst/usr/special. This is because of the chroot
environment created for this special multibos shell.

__ 42. Reboot your LPAR.


» The suggested commands are:
# shutdown -Fr

__ 43. After the reboot has completed, establish a new connection and then log in as root.
Display the OS level (including the technology level and the service pack).
» The suggested commands are:
# oslevel -s

__ 44. Display what is mounted. What logical volumes are now mounted to the standard
BOS defined file systems’ mount points? _______________________________
______________________________________________________________
______________________________________________________________
» The suggested commands are:
# mount
node mounted mounted over vfs date options
-------- --------------- --------------- ------ ------------ ---------------
/dev/bos_hd4 / jfs2 Apr 14 15:48 rw,log=/dev/hd8
/dev/bos_hd2 /usr jfs2 Apr 14 15:48 rw,log=/dev/hd8
/dev/bos_hd9var /var jfs2 Apr 14 15:48 rw,log=/dev/hd8
/dev/hd3 /tmp jfs2 Apr 14 15:48 rw,log=/dev/hd8
/dev/hd1 /home jfs2 Apr 14 15:49 rw,log=/dev/hd8
/proc /proc procfs Apr 14 15:49 rw
/dev/bos_hd10opt /opt jfs2 Apr 14 15:49 rw,log=/dev/hd8
/dev/hd11admin /admin jfs2 Apr 14 15:49 rw,log=/dev/hd8
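» The mount points now served by the standby BOS logical volumes can be listed with a filter such as the one below. The helper name standby_mounts is hypothetical; it assumes that for local file systems the device is the first field and the mount point the second, as in the sample output above.

```shell
# standby_mounts: read "mount" output on stdin and print the mount points
# whose device is a bos_ logical volume. Hypothetical helper name.
standby_mounts() {
    awk '$1 ~ /^\/dev\/bos_/ { print $2 }'
}

# Rows taken from the sample output above:
mount_out='/dev/bos_hd4 / jfs2 Apr 14 15:48 rw,log=/dev/hd8
/dev/bos_hd2 /usr jfs2 Apr 14 15:48 rw,log=/dev/hd8
/dev/bos_hd9var /var jfs2 Apr 14 15:48 rw,log=/dev/hd8
/dev/hd3 /tmp jfs2 Apr 14 15:48 rw,log=/dev/hd8
/dev/hd1 /home jfs2 Apr 14 15:49 rw,log=/dev/hd8
/dev/bos_hd10opt /opt jfs2 Apr 14 15:49 rw,log=/dev/hd8
/dev/hd11admin /admin jfs2 Apr 14 15:49 rw,log=/dev/hd8'
echo "$mount_out" | standby_mounts
# On AIX: mount | standby_mounts
```

» In the sample, /, /usr, /var, and /opt come from bos_ logical volumes, while /tmp, /home, and /admin are still the shared originals.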

__ 45. Change the normal bootlist to have the original BOS first in the bootlist, and safely
reboot your LPAR. Be careful to use the correct logical device names for the disk
and boot logical volumes.
» The example commands and output are:
# bootlist -m normal hdisk0 blv=hd5 hdisk0 blv=bos_hd5
# bootlist -o -m normal
hdisk0 blv=hd5 pathid=0
hdisk0 blv=bos_hd5 pathid=0
# shutdown -Fr

__ 46. When the reboot is complete, reconnect to your LPAR and log back in as the root
user. Display the level of the OS, including technology level and service pack.
» The suggested commands are:
# oslevel -s

End of exercise

Exercise review/wrap-up


Review the most important things students learned in this exercise:
1. Clone a rootvg to an alternate disk while applying maintenance.
2. Clone a rootvg to a standby BOS in the same rootvg while also applying maintenance.

Exercise 10. Advanced backup techniques

(with hints)

Estimated time
Part 1 - 00:30
Part 2 - 00:15 (optional)
Part 3 - 00:15
Part 4 - 00:35
Total required time - 01:20

What this exercise is about


This exercise provides an opportunity to practice using advanced
techniques for on-line filesystem backups.

What you should be able to do


At the end of the lab, you should be able to:
• Back up file systems using a snapshot volume group.
• Back up a JFS file system using JFS split copy (optional).
• Back up a JFS2 file system using JFS2 snapshot.

Introduction
All instructions in this exercise require root authority.

Common student problems


None to report at this time.

Special instructions
None.

Exercise instructions with hints


Preface
Two versions of these instructions are available; one with hints and one without. You
can use either version to complete this exercise. Also, please do not hesitate to ask the
instructor if you have questions.
All exercises of this chapter depend on the availability of specific equipment in your
classroom.
The output shown in the answers is an example. Your output and answers based on the
output may be different.
All hints are marked with a » sign.

Part 1 - Using a snapshot volume group


__ 1. In the /home/workshop directory you should find a script named: ex10_build_vgs.
The script creates a mirrored volume group, named testvg, with two included file
systems (one JFS and one JFS2). It also populates these file systems with data
files.
Change directory to /home/workshop and execute the ex10_build_vgs script.
» Following is an example command:
# cd /home/workshop
# ./ex10_build_vgs

__ 2. Display the information for the logical volumes within the created testvg volume
group. Are the file systems mirrored?
_______________________________________________________________
» Following is a suggested command and sample output:
# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testlv2 jfs2 8 16 2 open/syncd /testfs2
testlv jfs 8 16 2 open/syncd /testfs
loglv00 jfs2log 1 2 2 open/syncd N/A
loglv01 jfslog 1 2 2 open/syncd N/A
» In the example output, the ratio of PPs to LPs is 2:1, indicating mirroring.
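» The PPs-to-LPs ratio can be computed for every logical volume at once, as in this sketch. The helper name lv_copies is hypothetical; it assumes the two header lines and the column order shown in the sample output above.

```shell
# lv_copies: read "lsvg -l <vg>" output on stdin and print each logical volume
# with its PPs/LPs ratio (the number of copies). Hypothetical helper name.
lv_copies() {
    # Skip the "vgname:" line and the column header, then divide PPs by LPs.
    awk 'NR > 2 && $3 > 0 { print $1, $4 / $3 }'
}

# Sample output from the hint above:
lsvg_out='testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testlv2 jfs2 8 16 2 open/syncd /testfs2
testlv jfs 8 16 2 open/syncd /testfs
loglv00 jfs2log 1 2 2 open/syncd N/A
loglv01 jfslog 1 2 2 open/syncd N/A'
echo "$lsvg_out" | lv_copies
# On AIX: lsvg -l testvg | lv_copies
```

» A ratio of 2 for every row means each logical volume has two copies.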

__ 3. Display the mapping of logical partitions to physical partitions for both of the created
logical volumes, testlv and testlv2. Record which disk holds the second copy.

_____________________________________________________________
» Following are suggested commands and sample output:
# lslv -m testlv
testlv:/testfs
LP PP1 PV1 PP2 PV2 PP3 PV3
0001 0212 hdisk2 0212 hdisk3
0002 0213 hdisk2 0213 hdisk3
0003 0214 hdisk2 0214 hdisk3
0004 0215 hdisk2 0215 hdisk3
0005 0216 hdisk2 0216 hdisk3
0006 0217 hdisk2 0217 hdisk3
0007 0218 hdisk2 0218 hdisk3
0008 0219 hdisk2 0219 hdisk3

# lslv -m testlv2
testlv2:/testfs2
LP PP1 PV1 PP2 PV2 PP3 PV3
0001 0204 hdisk2 0204 hdisk3
0002 0205 hdisk2 0205 hdisk3
0003 0206 hdisk2 0206 hdisk3
0004 0207 hdisk2 0207 hdisk3
0005 0208 hdisk2 0208 hdisk3
0006 0209 hdisk2 0209 hdisk3
0007 0210 hdisk2 0210 hdisk3
0008 0211 hdisk2 0211 hdisk3
» In the sample output, both logical volumes have their second copies on hdisk3.
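» Rather than scanning every row, the disk holding the second copy can be extracted from the PV2 column. The helper name second_copy_disks is hypothetical; it assumes the two header lines and column order shown in the sample output above.

```shell
# second_copy_disks: read "lslv -m <lv>" output on stdin and print the distinct
# disks listed in the PV2 column. Hypothetical helper name.
second_copy_disks() {
    awk 'NR > 2 && NF >= 5 { print $5 }' | sort -u
}

# First rows of the sample output above:
lslv_out='testlv:/testfs
LP PP1 PV1 PP2 PV2 PP3 PV3
0001 0212 hdisk2 0212 hdisk3
0002 0213 hdisk2 0213 hdisk3
0003 0214 hdisk2 0214 hdisk3'
echo "$lslv_out" | second_copy_disks
# On AIX: lslv -m testlv | second_copy_disks
```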

__ 4. When ready to back up the file system data, you would briefly quiesce the application
and then split the VG. In this class, our method of quiescing is to simply not run any
commands that affect the data while splitting the mirrored VG.
Split the volume group, using the disk which holds the second copy as the snapshot
volume group. Name the new volume group: testvg-snap. Time how long it takes to
create the snapshot volume group.
Once the split is completed, you would un-quiesce and resume application
processing. We will represent application processing, later, with a script to update
some files.
How long did it take to split the mirrored VG? __________________________

» Following is an example command:


# timex splitvg -y testvg-snap -c 2 testvg
real 8.04
user 0.89
sys 0.41
» In the example output, the splitvg command took only 8 seconds. Larger numbers of
physical partitions require more time, but not much more: an 8 GB drive fully
populated with data, involving over one thousand physical partitions, required 17
seconds to complete the split. What lengthens the split is not the amount of data but
the number of logical partitions. This is significant because very large databases
typically use volume groups with a much larger PP size, and so need fewer logical
partitions to contain the same amount of data.
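» The effect of PP size on the partition count is simple arithmetic, sketched below. The helper name pp_count is hypothetical. The 8 MB figure matches the PP SIZE shown in the lsvg hints of this exercise, while the 256 MB figure is only an illustrative larger PP size.

```shell
# pp_count SIZE_MB PP_MB: number of physical partitions needed to hold SIZE_MB
# when each partition is PP_MB in size. Hypothetical helper name.
pp_count() {
    echo $(( ($1 + $2 - 1) / $2 ))   # round up to whole partitions
}

pp_count 8192 8      # 8 GB of data with 8 MB partitions
pp_count 8192 256    # the same data with an illustrative 256 MB PP size
```

» The same 8 GB needs 1024 partitions at 8 MB but only 32 at 256 MB, which is why a larger PP size keeps the split short.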

__ 5. Display the testvg characteristics. Does it identify the status of having created a
snapshot VG? Does it have any stale PVs?
______________________________________________________________
______________________________________________________________
» Following is a suggested command and sample output:
# lsvg testvg
VOLUME GROUP: testvg VG IDENTIFIER:00f6060300004c000000012f21d4122f
VG STATE: active PP SIZE: 8 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 2030 (16240 megabytes)
MAX LVs: 256 FREE PPs: 1994 (15952 megabytes)
LVs: 4 USED PPs: 36 (288 megabytes)
OPEN LVs: 4 QUORUM: 1 (Disabled)
TOTAL PVs: 2 VG DESCRIPTORS: 2
STALE PVs: 1 STALE PPs: 1
ACTIVE PVs: 1 AUTO ON: yes
MAX PPs per VG: 32768 MAX PVs: 1024
LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable
SNAPSHOT VG: testvg-snap
PV RESTRICTION: none
» In the example output, you can identify that this volume group has a snapshot because
the SNAPSHOT VG field lists the name of the snapshot VG. You can also see that one
PV and one PP are marked as stale.

__ 6. Display the testvg-snap volume group characteristics. What information does it
provide about the snapshot situation?
_______________________________________________________________

_______________________________________________________________
» Following is an example command and sample output:
# lsvg testvg-snap
VOLUME GROUP: testvg-snap VG IDENTIFIER:00f6060300004c000000012f21e78f8b
VG STATE: active PP SIZE: 8 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 1015 (8120 megabytes)
MAX LVs: 256 FREE PPs: 997 (7976 megabytes)
LVs: 4 USED PPs: 18 (144 megabytes)
OPEN LVs: 0 QUORUM: 1 (Disabled)
TOTAL PVs: 1 VG DESCRIPTORS: 2
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 1 AUTO ON: yes
MAX PPs per VG: 32768 MAX PVs: 1024
LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable
SNAPSHOT VG: yes PRIMARY VG: testvg
PV RESTRICTION: none
» In the example output, the SNAPSHOT VG field set to yes indicates that this volume
group is a snapshot VG. The PRIMARY VG field identifies which volume group is the
related primary.

__ 7. Display the information for the logical volumes within the testvg-snap volume
group. What names were generated for the new logical volumes and the contained
file systems? Are the new file systems mounted? Is there any indication of
mirroring?
______________________________________________________________
______________________________________________________________
_______________________________________________________________
» Following is a suggested command and sample output:
# lsvg -l testvg-snap
testvg-snap:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
fstestlv2 jfs2 8 8 1 closed/syncd /fs/fs/testfs2
fstestlv jfs 8 8 1 closed/syncd /fs/fs/testfs
fsloglv00 jfs2log 1 1 1 closed/syncd N/A
fsloglv01 jfslog 1 1 1 closed/syncd N/A
» The new logical volume names are the LV names in the primary VG, prefixed with fs.
The file system default mount points (/etc/filesystems stanza labels) are the primary
VG mount points but relative to the /fs/fs directory path. The logical volumes in a
snapshot VG are not aware that there is any mirroring being tracked; that information is
maintained in the primary VG which still knows about both disks.

__ 8. Mount the new file systems in the snapshot VG.


» Following are suggested commands and sample output:
# mount /fs/fs/testfs
Replaying log for /dev/fstestlv.
# mount /fs/fs/testfs2
Replaying log for /dev/fstestlv2.

__ 9. Display file systems and their space utilization, requesting a unit size of one
megabyte. What is the utilization of the test file systems? How many megabytes are
used by each file system?
________________________________________________________________
________________________________________________________________
» Following is a suggested command and sample output:
# df -m | egrep "MB|test"
Filesystem MB blocks Free %Used Iused %Iused Mounted on
/dev/testlv2 64.00 0.00 100% 19 60% /testfs2
/dev/testlv 64.00 0.00 100% 26 1% /testfs
/dev/fstestlv 64.00 0.00 100% 26 1% /fs/fs/testfs
/dev/fstestlv2 64.00 0.00 100% 19 60% /fs/fs/testfs2
» The example output shows all of the test file systems as being 100% utilized. They each
use 64 MB of disk space.

__ 10. Next we want to update the data in one of the logical partitions for each filesystem. A
script has been provided that will do this: ex10_update_files. It updates one file in
each of the file systems, making each file one megabyte smaller.
Execute the script, ex10_update_files.
» Following is a suggested command and sample output:
# ./ex10_update_files
**** creating source files for filesystem population ****
**** updating file data7 in JFS2 filesystem ****
****updating file data7 in JFS filesystem ****

__ 11. Display file systems, requesting a unit size of one megabyte. Were the file systems
in the snapshot VG affected by the update you just executed?

_________________________________________________________________
_________________________________________________________________
» Following is a suggested command and sample output:
# df -m | egrep "MB|test"
Filesystem MB blocks Free %Used Iused %Iused Mounted on
/dev/testlv2 64.00 3.02 96% 19 3% /testfs2
/dev/testlv 64.00 7.02 90% 26 1% /testfs
/dev/fstestlv 64.00 0.00 100% 26 1% /fs/fs/testfs
/dev/fstestlv2 64.00 0.00 100% 19 60% /fs/fs/testfs2
» The example output shows that while the utilization of the file systems in the primary VG
has dropped (due to the updates you just made), the snapshot VG file systems are
unaffected; their utilization is the same as when the snapshot was taken.

__ 12. Display the logical volume characteristics for each of the test file systems in the
testvg volume group. Were some of the physical partitions counted as stale?
_______________________________________________________________
________________________________________________________________
» Following are suggested commands and sample output:
# lslv testlv | grep PPs
LPs: 8 PPs: 16
STALE PPs: 3 BB POLICY: relocatable

# lslv testlv2 | grep PPs


LPs: 8 PPs: 16
STALE PPs: 2 BB POLICY: relocatable
» The example output shows that 3 out of 16 PPs are stale in the JFS file system and that
2 out of 16 PPs are stale in the JFS2 filesystem. Since the rejoining of the VGs only
requires the synchronization of stale PPs, this means that the rejoin will be much faster
than if we had to synchronize all physical partitions.
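» The stale count can be reduced to a single "stale/total" figure per logical volume, as in this sketch. The helper name stale_ratio is hypothetical; it assumes the two lines that the grep PPs filter returns in the sample output above.

```shell
# stale_ratio: read "lslv <lv> | grep PPs" output on stdin and print the
# stale and total physical partition counts as "stale/total".
# Hypothetical helper name.
stale_ratio() {
    awk '$1 == "LPs:"  { total = $4 }    # line: LPs: 8 PPs: 16
         $1 == "STALE" { stale = $3 }    # line: STALE PPs: 3 BB POLICY: ...
         END { print stale "/" total }'
}

# Sample output from the hint above (JFS file system):
lslv_out='LPs: 8 PPs: 16
STALE PPs: 3 BB POLICY: relocatable'
echo "$lslv_out" | stale_ratio
# On AIX: lslv testlv | grep PPs | stale_ratio
```

» Only the stale partitions have to be resynchronized when the volume groups are rejoined.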

__ 13. The backup of the snapshot contents would either be to a remote server or to
removable storage (tape or DVD). In this class, you will back up to your assigned
server LPAR, but you first need to create and access a file system on that server
which is large enough to hold the backups.
__ a. What is the size of the /fs/fs/testfs2 file system? (see your answer in the earlier
step 9)
_____________________________________________________________

__ b. Start a terminal emulation with a new connection to your assigned server LPAR
(if you do not already have one) and log in as root.
__ c. On the server LPAR, create and mount a JFS2 filesystem which is larger than
the size of the /fs/fs/testfs2 file system. Use SMIT (fast path jfs2) or the crfs
command.
Name the file system: back-<your client LPAR name>, to avoid conflict with the
other team sharing the server.
NFS export the new file system (with read-write authority) to your client LPAR
with root access. Be careful to specify the correct IP address for your assigned
client LPAR.
You may use SMIT (fastpath nfs) or the mknfsexp command:
mknfsexp -d <fs to export> -B -S sys -t rw -r <client IP or
hostname>
» The example commands are:
# crfs -v jfs2 -g rootvg -a size=65M -m /back-sys304_118
# mount /back-sys304_118
# mknfsexp -d /back-sys304_118 -B -S sys -t rw -r 10.6.52.118

__ d. Return to your client LPAR session and execute an NFS mount (read-write) of
the file system you just created with a mount point of /mnt.
» # mount -o rw 10.6.52.117:/back-sys304_118 /mnt

__ 14. Back up (relative path) the /fs/fs/testfs2 file system to a backup file in the /mnt
directory. Then verify the names of the files in the backup archive.
In the real world, the amount of data would be much greater and would require an
extensive amount of time to complete.

» Following are suggested commands and sample output:
# cd /fs/fs/testfs2
# find . | backup -q -i -v -f /mnt/vgsnap-fs2.bak
Backing up to /mnt/vgsnap-fs2.bak
Cluster 51200 bytes (100 blocks).
Volume 1 on /mnt/vgsnap-fs2.bak
a 0 .
a 8044544 ./data1
a 4194304 ./data10
. . .
a 4194304 ./data14
a 4186112 ./data15
a 4194304 ./data2
. . .
a 4194304 ./data9
a 0 ./lost+found
total size: 66756608
Done at Mon Apr 4 21:59:45 2011; 130400 blocks on 1 volume(s)
# restore -Tvf /mnt/vgsnap-fs2.bak
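The block count on the "Done at" line of the sample output is consistent with the total size rounded up to whole clusters of 51200 bytes (100 blocks of 512 bytes each). The sketch below only reproduces that arithmetic; it is an observation about the sample numbers, not a statement of how the backup command is implemented, and the function name blocks_for is hypothetical.

```shell
# blocks_for BYTES
# Illustration only: rounds BYTES up to whole 51200-byte clusters and
# prints the equivalent number of 512-byte blocks.
blocks_for() {
    bytes=$1
    cluster=51200                     # cluster size shown in the sample output
    clusters=$(( ($bytes + $cluster - 1) / $cluster ))
    echo $(( $clusters * $cluster / 512 ))
}

blocks_for 66756608   # total size reported in the sample output
```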

__ 15. Unmount the file systems which are in the snapshot VG, and then rejoin the
snapshot VG with the primary VG. The time it takes to join the snapshot VG to the
primary VG depends mainly upon how many PPs were marked as stale during the
existence of the snapshot.
» Following are suggested commands:
# cd /
# umount /fs/fs/testfs
# umount /fs/fs/testfs2
# joinvg testvg

__ 16. List the volume groups to verify that the snapshot VG no longer exists.
» Following is a suggested command and sample output:
# lsvg
rootvg
testvg

__ 17. List the logical volumes in the testvg volume group to verify that it is back to its
normal mirroring with no stale PPs.

» Following is a suggested command and sample output:
# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testlv2 jfs2 8 16 2 open/syncd /testfs2
testlv jfs 8 16 2 open/syncd /testfs
loglv00 jfs2log 1 2 2 open/syncd N/A
loglv01 jfslog 1 2 2 open/syncd N/A

__ 18. Before continuing to other parts of this exercise, remove the testvg volume group by
executing the provided script: ex10_cleanvg.
» Following is a suggested command and sample output:
# cd /home/workshop
# ./ex10_cleanvg
rmlv: Logical volume testlv2 is removed.
rmlv: Logical volume testlv is removed.
rmlv: Logical volume loglv00 is removed.
rmlv: Logical volume loglv01 is removed.
ldeletepv: Volume Group deleted since it contains no physical volumes.

Part 2 - (Optional) Using JFS split copy


__ 19. In the /home/workshop directory you should find a script named: ex10_build_vgs.
The script creates a mirrored volume group, named testvg, with two included file
systems (one JFS and one JFS2). It also populates these file systems with data
files.
Change directory to /home/workshop and execute the ex10_build_vgs script.
» Following is an example command:
# cd /home/workshop
# ./ex10_build_vgs

__ 20. Display the information for the logical volumes within the created testvg volume
group. Are the file systems mirrored?
_______________________________________________________________

» Following is a suggested command and sample output:
# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testlv2 jfs2 8 16 2 open/syncd /testfs2
testlv jfs 8 16 2 open/syncd /testfs
loglv00 jfs2log 1 2 2 open/syncd N/A
loglv01 jfslog 1 2 2 open/syncd N/A
» In the example output, the ratio of PPs to LPs is 2:1, indicating mirroring.
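The PPs-to-LPs check can also be done mechanically. The following is a minimal sketch that reads lsvg -l style output and prints the ratio for each logical volume; it assumes the column layout shown in the sample output (two header lines, LPs in the third column, PPs in the fourth), and the function name lv_copies is made up for illustration.

```shell
# lv_copies: reads `lsvg -l` style output on standard input and prints
# each LV name followed by its PPs-to-LPs ratio (the number of copies).
# Assumes the column layout shown in the sample output above.
lv_copies() {
    awk 'NR > 2 && $3 > 0 { print $1, $4 / $3 }'
}

# Feed it the sample output from this step:
lv_copies <<'EOF'
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testlv2 jfs2 8 16 2 open/syncd /testfs2
testlv jfs 8 16 2 open/syncd /testfs
loglv00 jfs2log 1 2 2 open/syncd N/A
loglv01 jfslog 1 2 2 open/syncd N/A
EOF
```

On a live system the same function could be fed directly, for example with lsvg -l testvg | lv_copies. A ratio of 2 means two copies of each logical partition.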

__ 21. When ready to back up the file system data, you would briefly quiesce the application
and then split the mirror for the file system. In this class, our method of quiescing is
to simply not run any commands that affect the data while splitting the file system.
Split the /testfs file system, using the second copy as the split copy. Use /backup
as the mount point for the split copy. Time how long it took to create the split copy.
Once the split is completed, you would un-quiesce and resume application
processing. We will represent application processing, later, with a script to update
some files.
How long did it take to split the mirror? __________________________
» Following is an example command:
# timex chfs -a splitcopy=/backup -a copy=2 /testfs
testlvcopy00
backup requested(0x100000)...
log redo processing for /dev/testlvcopy00
syncpt record at 2028
end of log 13c58
syncpt record at 2028
syncpt address 2028
number of log records = 102
number of do blocks = 25
number of nodo blocks = 0

real 5.93
user 0.22
sys 0.17

» In the example output, the split copy operation took only about 6 seconds. Larger
numbers of physical partitions require somewhat more time, but not much more: splitting
a copy on an 8 GB drive fully populated with data, involving over one thousand physical
partitions, required only 7 seconds.

__ 22. Display the information for the logical volumes within the testvg volume group. Does
it show a file system for the requested split copy? What name was generated for the
new logical volume and the contained file system? Is the new file system mounted?
______________________________________________________________
______________________________________________________________
______________________________________________________________
» Following is a suggested command and sample output:
# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testlv2 jfs2 8 16 2 open/syncd /testfs2
testlv jfs 8 16 2 open/stale /testfs
loglv00 jfs2log 1 2 2 open/syncd N/A
loglv01 jfslog 1 2 2 open/syncd N/A
testlvcopy00 jfs 0 0 0 open/syncd /backup
» The example output shows a new /backup file system with a default logical volume
name formed from the primary file system’s LV name with the string copy00 appended. It is
notable that the split copy does not have any allocations of its own, since it is only an
additional mapping of a mirror copy in the primary file system. The open state indicates
that the /backup file system is currently mounted.

__ 23. Display the logical volume characteristics for the /testfs file system. Are there
indications that the mirror has been split?
_______________________________________________________________
_______________________________________________________________

» Following are suggested commands and sample output:
# lslv testlv
LOGICAL VOLUME: testlv VOLUME GROUP: testvg
LV IDENTIFIER: 00f6060300004c000000012f2616a670.2 PERMISSION:
read/write
VG STATE: active/complete LV STATE: opened/stale
TYPE: jfs WRITE VERIFY: off
MAX LPs: 512 PP SIZE: 8 megabyte(s)
COPIES: 2 SCHED POLICY: parallel
LPs: 8 PPs: 16
STALE PPs: 8 BB POLICY: relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 1024
MOUNT POINT: /testfs LABEL: /testfs
DEVICE UID: 0 DEVICE GID: 0
DEVICE PERMISSIONS: 432
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?: NO
BACKUP MIRROR COPY: 2
DEVICESUBTYPE : DS_LVZ
COPY 1 MIRROR POOL: None
COPY 2 MIRROR POOL: None
COPY 3 MIRROR POOL: None

» The example output shows that the BACKUP MIRROR COPY field has a value of 2,
indicating that mirror copy 2 is currently being used as a backup mirror copy. It is also
notable that all of the PPs in that mirror copy are identified as stale (8 stale PPs), even
though you have not yet written any changes to the logical partitions.

__ 24. Display the logical volume characteristics for the /backup file system. Are there
indications that this is a split copy?
_______________________________________________________________
_______________________________________________________________

» Following are suggested commands and sample output:
# lslv testlvcopy00
LOGICAL VOLUME: testlvcopy00 VOLUME GROUP: testvg
LV IDENTIFIER: 00f6060300004c000000012f2616a670.5 PERMISSION:
read/write
VG STATE: active/complete LV STATE: opened?
TYPE: jfs WRITE VERIFY: off
MAX LPs: 512 PP SIZE: 8 megabyte(s)
COPIES: SCHED POLICY: backup mirror
LPs: 0 PPs: 0
STALE PPs: 0 BB POLICY: relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 1024
MOUNT POINT: /backup LABEL: /backup
DEVICE UID: 0 DEVICE GID: 0
DEVICE PERMISSIONS: 0
MIRROR WRITE CONSISTENCY: off
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?: NO
PARENT LOGICAL VOLUME: testlv
DEVICESUBTYPE : DS_LVZ
COPY 1 MIRROR POOL: None
COPY 2 MIRROR POOL: None
COPY 3 MIRROR POOL: None

» The example output shows the SCHED POLICY field has a value of backup mirror and
that the PARENT LOGICAL VOLUME field identifies testlv as the parent LV.

__ 25. Display file systems and their space utilization, requesting a unit size of one
megabyte. What is the utilization of the test file systems?
________________________________________________________________

» Following is a suggested command and sample output:
# df -m | egrep "MB|test"
Filesystem MB blocks Free %Used Iused %Iused Mounted on
/dev/testlv2 64.00 0.00 100% 19 60% /testfs2
/dev/testlv 64.00 0.00 100% 26 1% /testfs
/dev/testlvcopy00 64.00 0.00 100% 26 1% /backup
» The example output shows all of the test file systems as being 100% utilized.

__ 26. Next we want to update the data in one of the logical partitions for each filesystem. A
script has been provided that will do this: ex10_update_files. It updates one file in
each of the file systems, making each file one megabyte smaller.
Execute the script, ex10_update_files.
» Following is a suggested command and sample output:
# ./ex10_update_files
**** creating source files for filesystem population ****
**** updating file data7 in JFS2 filesystem ****
****updating file data7 in JFS filesystem ****

__ 27. Display the /testfs and the /backup file systems, requesting a unit size of one
megabyte. Was the /backup file system affected by the update you just executed?
_________________________________________________________________
_________________________________________________________________
» Following is a suggested command and sample output:
# df -m | egrep "MB|test"
Filesystem MB blocks Free %Used Iused %Iused Mounted on
/dev/testlv2 64.00 3.02 96% 19 3% /testfs2
/dev/testlv 64.00 7.02 90% 26 1% /testfs
/dev/testlvcopy00 64.00 0.00 100% 26 1% /backup

» The example output shows that while the primary file systems now use less space
(due to the updates you just made), the split copy file system was unaffected; its
utilization is the same as it was when the split copy was created.

__ 28. Back up (relative path) the files in the split copy file system to a backup file in
the /tmp file system. Then verify the names of the files in the backup archive.
In the real world, the amount of data would be much greater and would require an
extensive amount of time to complete.

» Following are suggested commands and sample output:
# cd /backup
# find . | backup -q -i -v -f /tmp/splitcopy.bak
Backing up to /tmp/splitcopy.bak
Cluster 51200 bytes (100 blocks).
Volume 1 on /tmp/splitcopy.bak
a 0 .
a 0 ./lost+found
a 4194304 ./data1
a 8376320 ./data2
a 8376320 ./data3
a 8376320 ./data4
a 8376320 ./data5
a 8376320 ./data6
a 8376320 ./data7
a 8376320 ./data8
a 2035712 ./data9
total size: 64864256
Done at Tue Apr 5 17:47:43 2011; 126700 blocks on 1 volume(s)

__ 29. Unmount the split copy file system and then rejoin the split copy (by removing the
copy). The time it takes to resynchronize the split copy with the primary copy depends on
the number of allocated PPs in the file system, regardless of how much data was
updated while it was split. For large file systems, this could take a fairly long time,
during which the resynchronization competes for system resources.
» Following are suggested commands:
# cd /
# umount /backup
# rmfs /backup
rmlv: Logical volume testlvcopy00 is removed.

__ 30. List the logical volumes in the testvg volume group to verify that the split copy file
system is gone.
» Following is a suggested command and sample output:
# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testlv2 jfs2 8 16 2 open/syncd /testfs2
testlv jfs 8 16 2 open/syncd /testfs
loglv00 jfs2log 1 2 2 open/syncd N/A
loglv01 jfslog 1 2 2 open/syncd N/A

__ 31. Before continuing to other parts of this exercise, remove the testvg volume group by
executing the provided script: ex10_cleanvg.
» Following is a suggested command and sample output:
# cd /home/workshop
# ./ex10_cleanvg
rmlv: Logical volume testlv2 is removed.
rmlv: Logical volume testlv is removed.
rmlv: Logical volume loglv00 is removed.
rmlv: Logical volume loglv01 is removed.
ldeletepv: Volume Group deleted since it contains no physical volumes.

Part 3 - Using JFS2 snapshots


__ 32. Create and mount an enhanced file system (JFS2), with the following
characteristics:
• Volume Group: rootvg
• Size: 200 MB
• Mount point /myfs
• Internal snapshots: yes
• For all other values, accept the default.
» # crfs -v jfs2 -a isnapshot=yes -g rootvg \
-a size=200M -m /myfs
» # mount /myfs

__ 33. Display the space utilization of the /myfs file system, in megabytes. Record the
amount of free space: __________________________________________
» # df -m /myfs

__ 34. The /home/workshop directory has a script called filegen which will generate 10
files of 10 MB each in a specified directory. Use the script to place files in the file
system you just created and then list the files.
The filegen script accepts a single argument with the path to the directory in which
to place the files.

Here is the code in the script:
for f in 0 1 2 3 4 5 6 7 8 9
do
dd if=/dev/zero bs=10k count=1024 of=${1}/sfile$f
done
» The suggested commands are:
# /home/workshop/filegen /myfs
# ls -l /myfs
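For reference, here is a hedged variant of the script with argument checking added. It is a sketch, not the script shipped in /home/workshop: the function name filegen_sketch and the optional second argument are assumptions introduced for illustration. Each file is written as blocks of 10 KB, so the default of 1024 blocks gives 10 MB per file.

```shell
# filegen_sketch DIR [BLOCKS]
# Hypothetical variant of the filegen script with argument checking.
# Writes sfile0..sfile9 into DIR; each file is BLOCKS blocks of 10 KB
# (BLOCKS defaults to 1024, which gives 10 MB per file).
filegen_sketch() {
    dir=$1
    blocks=${2:-1024}
    if [ ! -d "$dir" ]; then
        echo "usage: filegen_sketch <existing directory> [blocks]" >&2
        return 1
    fi
    for f in 0 1 2 3 4 5 6 7 8 9
    do
        dd if=/dev/zero bs=10k count="$blocks" of="$dir/sfile$f" 2>/dev/null || return 1
    done
}

# Size of one default file: 10 KB x 1024 blocks
echo $(( 10 * 1024 * 1024 ))   # bytes per file
```

Ten such files come to 100 MB, which is why the 200 MB /myfs file system shows a large drop in free space in the next step.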

__ 35. Display the space utilization of the /myfs file system, in megabytes. Record the
amount of free space. ___________________________________________
» # df -m /myfs

__ 36. Create an internal snapshot of the /myfs file system, named mysnap.
» # snapshot -o snapfrom=/myfs -n mysnap

__ 37. Verify the snapshot was created.


» # snapshot -q /myfs

__ 38. Delete all of the files in the /myfs directory.


» # rm /myfs/*
» Ignore any error reported for the lost+found directory.

__ 39. Verify that the files have been deleted.


» # ls /myfs

__ 40. Display the space utilization of the /myfs file system, in megabytes. Record the
amount of free space. __________________________________________
Did the space used in the file system decrease as a result of deleting all the files? Why?
___________________________________________________________
___________________________________________________________
» # df -m /myfs

» The space used in the file system did not decrease significantly. That is because the
deleted files were first copied to the snapshot, which is also part of the file system
space allocation.

__ 41. Access the snapshot and show that the files are still shown there.
» Suggested commands are:
# cd /myfs/.snapshot/mysnap
# ls

__ 42. Restore a single file back to the snappedFS and verify that it is recovered.
» Suggested commands are:
# cp sfile0 /myfs
# ls /myfs

__ 43. Restore all of the /myfs file system contents to what they were when the snapshot
was taken, using the snapshot rollback facility. Verify that all of the files have been
restored.
» Suggested commands are:
# cd /
# unmount /myfs
# rollback -n mysnap /myfs
# mount /myfs
# cd /myfs
# ls

__ 44. Change the directory back to your home directory. Verify that the internal snapshot
for /myfs is gone (should be deleted as part of the rollback operation).
» Suggested commands are:
# cd
# snapshot -q /myfs

__ 45. Create an external snapshot for the /myfs file system, size 100 MB. Record the
name of the created snapshot logical volume
___________________________________
» # snapshot -o snapfrom=/myfs -o size=100M

__ 46. Display the snapshots for /myfs and record the free space for the listed external
snapshot.

____________________________________________________________
» # snapshot -q /myfs

__ 47. Delete all of the files in the /myfs directory.


» # rm /myfs/*

__ 48. Verify that the files have been deleted.


» # ls /myfs

__ 49. Display the space utilization of the listed external snapshot. Record the amount of
free space. ___________________________________________________

Did the snapshot space fill up significantly? What would be the impact of running out
of space in a snapshot?
___________________________________________________________
___________________________________________________________
» # snapshot -q /myfs
» The snapshot filled up significantly, with the original data blocks of the deleted files. If
the external snapshot logical volume had run out of space, the entire snapshot would
have been invalidated and would be unusable.

__ 50. Access the snapshot (mount the external snapshot LV) and show that the files are
still shown there.
» Suggested commands are:
# mkdir /mntsnap
# mount -v jfs2 -o snapshot /dev/<lv_name> /mntsnap
# cd /mntsnap
# ls

__ 51. Restore a single file back to the snappedFS and verify that it is recovered.
» Suggested commands are:
# cp sfile0 /myfs
# ls /myfs

__ 52. Restore all of the /myfs file system contents to what they were when the snapshot
was taken, using the snapshot rollback facility.
Remember that you have to first unmount the snapshot and then unmount the file
system before executing the rollback.
» Suggested commands are:
# cd
# unmount /mntsnap
# unmount /myfs
# rollback -s /myfs /dev/<snapshot_lv_name>

__ 53. Remount the file system /myfs and check that the files have been restored to the
same state as when the snapshot was taken.
» # mount /myfs
» # ls -l /myfs

Part 4 - Using a file system as a recovery source


The previous example of recovering a single file from a mounted JFS2 snapshot was
very simplistic. Using a cp command can often cause problems. In this exercise part,
you will explore both the potential problems and different approaches to handling those
problems.
__ 54. Change your current working directory to the mount point of your /myfs filesystem.
» # cd /myfs

__ 55. Remove any files which are currently stored in the myfs file system (except
lost+found).
» Suggested commands are:
# ls /myfs/*
# rm /myfs/*

__ 56. In /home/workshop, we have provided a script (mk_tree) that will build a directory
tree containing files which are specially built to illustrate potential problems.
Execute the mk_tree script.
» # /home/workshop/mk_tree

__ 57. Generate a recursive long listing of files under /myfs, including the i-node numbers
of the files. Examine this report and answer the following questions:
» # ls -i -l -R /myfs
__ a. What type of file is /myfs/tree/dir1/cmds/mydf?
____________________________
» This and the other files in the same directory are symbolic links to executable files in
/usr/bin.
__ b. For the following files (in /myfs/tree/dir2/dataC), what is the owner, what are the
permissions, and what is the time stamp?
• karim.data ____________________________________________________
» The owner is karim, the permissions are 776, and the time stamp is: Aug 22, 2005.
• michel.data ___________________________________________________
» The owner is michel, the permissions are 775, and the time stamp is: Aug 22, 2005.
• ted.data _______________________________________________________
» The owner is ted, the permissions are 766, and the time stamp is: Aug 22, 2005.
__ c. What is the relationship between /myfs/tree/dir2/dataB/sparseA and
/myfs/tree/dir1/dataA/sparse_file?
_____________________________________________________________
» Examining the inode numbers, you can see that they are hard links to the same file.
__ d. What is the size of the sparse_file in /myfs/tree/dir1/dataA?
_____________________________________________________________
» The displayed size is 183,527,532 bytes or approximately 175 MB.
__ e. What is the actual disk space used by sparse_file? _____________________
» # du -k /myfs/tree/dir1/dataA/sparse_file
» The actual disk space is only in the hundreds of kilobytes. This is because it was built as
a sparse file with no data stored in large extents of the file.
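Two of the observations in this step, hard links sharing one i-node and a sparse file whose listed size far exceeds its disk usage, can be reproduced in a scratch directory. The sketch below is an illustration and not part of the lab: the function names and the scratch path under /tmp are made up, and the figure reported by du depends on the file system in use.

```shell
# inode_of FILE: prints the i-node number of FILE
inode_of() {
    ls -di "$1" | awk '{ print $1 }'
}

# make_sparse FILE: creates a 1 MB file in which only the last byte is written
make_sparse() {
    dd if=/dev/zero of="$1" bs=1 count=1 seek=1048575 2>/dev/null
}

work=/tmp/sparse_demo.$$                 # hypothetical scratch directory
mkdir -p "$work"
make_sparse "$work/sparse_file"
ln "$work/sparse_file" "$work/sparseA"   # second name for the same i-node

ls -il "$work"                           # both names show the same i-node number
wc -c < "$work/sparse_file"              # listed size in bytes
du -k "$work/sparse_file"                # disk space actually allocated, in KB
rm -rf "$work"
```

Comparing the wc and du results for the same file is the same check the lab performs with ls -l and du -k on sparse_file.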

__ 58. Create an internal snapshot of /myfs. Call the snapshot mysnap. Verify that the
new directory tree is shown in the snapshot.
» Suggested commands are:
# snapshot -o snapfrom=/myfs -n mysnap
# ls -R /myfs/.snapshot/mysnap

__ 59. In the snapped filesystem (/myfs), recursively remove /myfs/tree and verify that it is
gone.

» Suggested commands are:
# rm -R /myfs/tree
# ls /myfs

__ 60. Before attempting a recovery of the data, ensure that the /myfs filesystem has more
free space than the listed size of the sparse_file.
» Suggested commands are:
# df -m /myfs
# chfs -a size=+350M /myfs (if necessary)

__ 61. Recover the recently created directory tree from the snapshot, using a recursive cp
command.
» # cp -R /myfs/.snapshot/mysnap/tree /myfs

__ 62. Display a recursive long listing (with inode attributes and inode number) of the files
under /myfs/tree. Examine this report and answer the following questions about
their characteristics, comparing them to your previous answers.
» # ls -ilR /myfs/tree
__ a. What type of file is /myfs/tree/dir1/cmds/mydf?
_____________________________________________________________
» This and the other files in the same directory are ordinary files. They are actual copies
of executable files in /usr/bin. Previously, they were symbolic links.
__ b. For the following files (in /myfs/tree/dir2/dataC), what is the owner, what are the
permissions, and what is the time stamp?
• karim.data ____________________________________________________
» The owner is changed to root, the permissions are changed to 754, and the time stamp
is the current date.
• michel.data ___________________________________________________
» The owner is changed to root, the permissions are 755, and the time stamp is the
current date.
• ted.data _______________________________________________________
» The owner is changed to root, the permissions are 744, and the time stamp is the
current date.

__ c. What is the relationship between /myfs/tree/dir2/dataB/sparseA and
/myfs/tree/dir1/dataA/sparse_file?
______________________________________________________________
» Examining the inode numbers, you can see that they are now separate files, whereas they
used to be hard links to the same file.
__ d. What is the size of the sparse_file in /myfs/tree/dir1/dataA?
______________________________________________________________
» The displayed size is 183,527,532 bytes or approximately 175 MB.
__ e. What is the actual disk space used by sparse_file? _____________________
» # du -m /myfs/tree/dir1/dataA/sparse_file
» The actual disk space used has grown from 252 KB to more than 175 MB. This is
because the cp command wrote hex 00 bytes for data extents which previously were
not stored on disk.

These results are probably not desirable in an actual recovery situation. You will next
use special options with the cp command to avoid some of these problems.
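The permission changes seen in this step match what a plain cp does when it creates new files: the mode of the source file is filtered through the umask of the user doing the copy. The following is a minimal sketch of that arithmetic, assuming a umask of 022 (check the actual value with the umask command). The function name masked_mode is hypothetical.

```shell
# masked_mode MODE UMASK
# Prints the octal mode that results from clearing the UMASK bits in MODE.
# Both arguments are octal digit strings such as 776 and 022.
masked_mode() {
    printf '%o\n' $(( 0$1 & ~0$2 ))
}

masked_mode 776 022   # original mode of karim.data
masked_mode 775 022   # original mode of michel.data
masked_mode 766 022   # original mode of ted.data
```

Under that assumption, the three results agree with the permissions recorded in the answers above (754, 755, and 744).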

__ 63. In the snapped filesystem (/myfs), recursively remove /myfs/tree and verify that it is
gone.
» Suggested commands are:
# rm -R /myfs/tree
# ls /myfs

__ 64. Recover the recently created directory tree from the snapshot, using a recursive cp
command, but requesting that symbolic links be copied and that permissions,
ownerships, and timestamps be preserved.
» # cp -h -p -R /myfs/.snapshot/mysnap/tree /myfs

__ 65. Display a recursive long listing (with inode attributes and inode number) of the files
under /myfs/tree. Examine this report and answer the following questions about
their characteristics, comparing them to your previous answers.
» # ls -ilR /myfs/tree
__ a. What type of file is /myfs/tree/dir1/cmds/mydf?
_____________________________________________________________
» This and the other files in the same directory are symbolic links to executable files in
/usr/bin.

__ b. For the following files (in /myfs/tree/dir2/dataC), what is the owner, what are the
permissions, and what is the time stamp?
• karim.data ____________________________________________________
» The owner is karim, the permissions are 776, and the time stamp is: Aug 22, 2005.
• michel.data ___________________________________________________
» The owner is michel, the permissions are 775, and the time stamp is: Aug 22, 2005.
• ted.data _______________________________________________________
» The owner is ted, the permissions are 766, and the time stamp is: Aug 22, 2005.
__ c. What is the relationship between /myfs/tree/dir2/dataB/sparseA and
/myfs/tree/dir1/dataA/sparse_file?
______________________________________________________________
» They are separate files, whereas they used to be hard links to the same file.
__ d. What is the size of the sparse_file in /myfs/tree/dir1/dataA?
______________________________________________________________
» The displayed size is 183,527,532 bytes or approximately 175 MB.
__ e. What is the actual disk space used by sparse_file? _____________________
» # du -m /myfs/tree/dir1/dataA/sparse_file
» The actual disk space used has grown from 252 KB to more than 175 MB. This is
because the cp command wrote hex 00 bytes for data extents which previously were
not stored on disk.

These results are better than the last copy attempt, but we still have separate files for a
situation where we previously had multiple hard links to a single file and we still have
files losing their sparseness. You will next use the backup and restore utilities to copy
over a directory tree.

__ 66. In the snapped filesystem (/myfs), recursively remove /myfs/tree and verify that it is
gone.
» Suggested commands are:
# rm -R /myfs/tree
# ls /myfs

__ 67. Recover the recently created directory tree from the snapshot by executing a
pipeline which uses the backup and restore commands as filters:
# cd /myfs/.snapshot/mysnap
# find ./tree | backup -i -qf - | (cd /myfs; restore -qvf -)

__ 68. Display a recursive long listing (with inode attributes and inode number) of the files
under /myfs/tree. Examine this report and answer the following questions about
their characteristics, comparing them to your previous answers:
» # ls -ilR /myfs/tree
__ a. What type of file is /myfs/tree/dir1/cmds/mydf?
_____________________________________________________________
» This and the other files in the same directory are symbolic links to executable files in
/usr/bin.
__ b. For the following files (in /myfs/tree/dir2/dataC), what is the owner, what are the
permissions, and what is the time stamp?
• karim.data ____________________________________________________
» The owner is karim, the permissions are 776, and the time stamp is: Aug 22, 2005.
• michel.data ___________________________________________________
» The owner is michel, the permissions are 775, and the time stamp is: Aug 22, 2005.
• ted.data _______________________________________________________
» The owner is ted, the permissions are 766, and the time stamp is: Aug 22, 2005.
__ c. What is the relationship between /myfs/tree/dir2/dataB/sparseA and
/myfs/tree/dir1/dataA/sparse_file?
______________________________________________________________
» They are hard links to the same file.
__ d. What is the size of the sparse_file in /myfs/tree/dir1/dataA?
______________________________________________________________
» The displayed size is 183,527,532 bytes or approximately 175 MB.
__ e. What is the actual disk space used by sparse_file? _____________________
» Suggested commands are:
# cd
# du -k /myfs/tree/dir1/dataA/sparse_file
» The actual disk space used is only 252 KB. This is because the AIX restore utility, by
default, maintains the sparseness of files being restored.
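As a rough cross-check of the two figures above, the allocated space can be expressed as a share of the displayed size. The following is only a sketch: the function name pct_allocated is made up for this illustration, and the inputs are the sample values from this exercise (183,527,532 bytes from ls, 252 KB from du -k).

```shell
# pct_allocated BYTES KB
# Prints the share of the displayed (ls -l) size that is actually
# allocated on disk (du -k), as a percentage with two decimals.
# Hypothetical helper for illustration; it is not part of AIX.
pct_allocated() {
    awk -v bytes="$1" -v kb="$2" \
        'BEGIN { printf "%.2f\n", (kb * 1024 * 100) / bytes }'
}

# Sample values from this exercise; your numbers will differ.
pct_allocated 183527532 252
```

A value far below 100 suggests the file is still sparse after the restore.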

You can see that the best results were from using a backup and restore pipeline. There
are other file characteristics which we did not explore, but for which the backup and
restore utilities are your best friends in ensuring you do not lose any of them.

End of exercise

Exercise review/wrap-up


Review ...

Exercise 11. Diagnostics


(with hints)

Estimated time
Part 1 - 00:13
Part 2 - 00:07
Part 3 - 00:17
Part 4 - 00:18
Total required: 00:55

What this exercise is about


This exercise describes how to use diagnostic routines in several
different modes.

What you should be able to do


At the end of the lab, you should be able to:
• Execute hardware diagnostics in the following modes:
- Concurrent
- Maintenance
- Service (standalone)

Introduction
Diagnostics can provide supplemental information about a hardware
related problem.

Common student problems


If a team has lost their TERM variable customization, they may have
problems interacting with the diag facility menu or dialog panels.
Resetting with:
# export TERM=vt320 (for virtual terminal environment)
(or any other effective emulation) will solve this problem.


Exercise instructions with hints


Preface
Two versions of these instructions are available; one with hints and one without. You
can use either version to complete this exercise. Also, please do not hesitate to ask the
instructor if you have questions.
All exercises of this chapter depend on the availability of specific equipment in your
classroom.
Specifically, they require either a local machine, where access does not depend upon
the network, or a remote LPAR with a physical Ethernet adapter which is accessible
using a virtual terminal (HMC).
The output shown in the answers is an example. Your output and answers based on the
output may be different.
All hints are marked with a » sign.

Part 1: Running diagnostics in multi-user mode


__ 1. Determine if your system has a physical Ethernet adapter port. For the purposes of
this lab exercise, a Logical Host Ethernet Port satisfies this requirement. Record the
name of the adapter: _______________________________________________
You will run diagnostics on this selected adapter in the remainder of this exercise
part. In the following instructions we will assume that this is ent5.
» # lsdev -Cc adapter | grep -i ethernet
» Look for adapters which are not virtual adapters.
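The filtering described in this hint can be sketched as a small pipeline. This is an illustration under assumptions: the function name physical_ethernet is hypothetical, the sample input is invented to resemble lsdev output, and matching on the description text is only a heuristic, so confirm the result against the adapter description and location code.

```shell
# physical_ethernet: reads "lsdev -Cc adapter" style output on stdin and
# prints the names of Ethernet adapters whose description does not
# mention "Virtual". Hypothetical helper for illustration.
physical_ethernet() {
    grep -i ethernet | grep -vi virtual | awk '{ print $1 }'
}

# On an AIX system you might run: lsdev -Cc adapter | physical_ethernet
# Illustrative sample input (not captured from a real system):
physical_ethernet <<'EOF'
ent0 Available Virtual I/O Ethernet Adapter (l-lan)
ent5 Available Logical Host Ethernet Port (lp-hea)
vsa0 Available LPAR Virtual Serial Adapter
EOF
```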

__ 2. Determine if the corresponding Ethernet interface is configured. The corresponding
interface will end with the matching suffix number; for example, if the adapter is
ent5 then the interface will be en5.
»# netstat -in
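The adapter-to-interface naming rule in this step can be written as a one-line mapping. A minimal sketch, assuming the adapter follows the entN naming used in this exercise; the function name iface_for_adapter is hypothetical.

```shell
# iface_for_adapter ADAPTER
# Maps an adapter name such as ent5 to the Ethernet interface name with
# the same suffix number (en5). Hypothetical helper for illustration.
iface_for_adapter() {
    case "$1" in
        ent[0-9]*) echo "en${1#ent}" ;;
        *) echo "not an entN adapter name: $1" >&2; return 1 ;;
    esac
}

iface_for_adapter ent5
```

The resulting name is the one to look for in the first column of the netstat -in report.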

__ 3. If your physical Ethernet adapter’s interface is not configured, then configure it with
a private address which will not conflict with any existing lab subnets. For example,
you might assign it 192.168.252.<your student number>. Check with the instructor if
you are unsure.
»# smitty chinet
or
»# chdev -l en5 -a netaddr=192.168.252.5 -a state=up
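To avoid typing mistakes when substituting the student number, the command from the hint can be assembled by a small function. This is a sketch only: chdev_cmd is a hypothetical name, the function prints the command rather than running it, and the 1 to 254 range check is an assumption about what counts as a usable host number.

```shell
# chdev_cmd IFACE STUDENT_NUMBER
# Prints (does not run) the chdev command from the hint above, using
# 192.168.252.<student number> as the interface address.
chdev_cmd() {
    iface=$1
    num=$2
    case "$num" in
        ''|*[!0-9]*) echo "student number must be numeric" >&2; return 1 ;;
    esac
    if [ "$num" -lt 1 ] || [ "$num" -gt 254 ]; then
        echo "student number must be between 1 and 254" >&2
        return 1
    fi
    echo "chdev -l $iface -a netaddr=192.168.252.$num -a state=up"
}

chdev_cmd en5 5
```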

__ 4. Start up diagnostic routines in concurrent mode and test a communication adapter
that is in use on your system. What happens?
» # diag
• At the FUNCTION SELECTION screen, select Diagnostic Routines.
• You may be asked for your terminal type if it has not been defined.
• In the DIAGNOSTIC MODE SELECTION screen, select System
Verification.
• Select your configured physical communications adapter (for example:
ent1) by using cursor control to position the cursor on the adapter and
pressing Enter.
• Press F7 or <esc-7> to commit
» The communications adapter could not be tested. You should get the message
“No trouble was found. However, the resource was not tested because the
device driver indicated that the resource was in use.”

__ 5. Return to the FUNCTION SELECTION menu. Then, select Diagnostic Routines.
What is the difference between System Verification and Problem Determination?
» Press the Previous Menu key (F3 or <Esc-3>) until you see the FUNCTION
SELECTION screen. Then, select Diagnostic Routines.
» System Verification tests a resource but does not analyze the error log.
Problem Determination tests a resource and also analyzes the error log. It should not be
used after a hardware repair unless the error log has been cleaned up.

__ 6. Return to the FUNCTION SELECTION screen.
Using Task Selection, query the vital product data of one of your physical Ethernet
adapters.
» Press the Previous Menu key (F3 or <Esc-3>) until you see the FUNCTION
SELECTION screen.
• Select Task Selection.
• Select Display Hardware Vital Product Data.
• Select ent1 (or whatever physical adapter is available on your system).
• Press F7 or <esc-7> to commit.
» The amount of information depends upon the type of adapter.

__ 7. Return to the TASKS SELECTION LIST screen.

» Press the Previous Menu key (F3 or <Esc-3>) until you see the TASKS SELECTION
LIST screen.
__ a. Who will be notified when a hardware error is posted to the error log?
» Select Automatic Error Log Analysis and Notification.
» You will see the screen:
AUTOMATIC ERROR LOG ANALYSIS AND NOTIFICATION SERVICE AID

This task controls the error notification mailing list for both
Periodic Diagnostics and Automatic Error Log Analysis. The error
notification mailing list can consist of system users and email
addresses of the form user@domain. Also, this task allows
automatic error log analysis to be disabled or enabled. By
default automatic error log analysis is enabled.

To continue, press 'Enter'.


» Press Enter.
» Select Display the error notification mailing list.
» By default, the list is empty.

__ b. If root was not in the notification list, return to the AUTOMATIC ERROR LOG
ANALYSIS AND NOTIFICATION SERVICE AID screen and add root to the
notification list.
» Press the Previous Menu key (F3 or <Esc-3>) until you see the AUTOMATIC ERROR
LOG ANALYSIS AND NOTIFICATION SERVICE AID screen.
» Select Add to the error notification mailing list.
» You will see the following menu. Add root.
ADD TO THE ERROR NOTIFICATION LIST 802103

Type in an email address (including the user name and domain) or
a system user to be notified of hardware problems, then press
'Commit' to add it to the mailing list.

email address or system user [root] +


» Press F7 or <esc-7> to commit.
» Press F10 or <esc-0> to exit diag.

Part 2: Running diagnostics in single user mode


__ 8. Start up diagnostic routines in single user mode using the following steps:
__ a. You will need to be at the system console to do this. If you are using a remote
LPAR, then first open a virtual terminal to your system from the HMC.
» Follow the instructions on how to start a virtual terminal provided in
Exercise 5, Part 3 for your HMC environment.
__ b. Shut down your system to single user mode. Note: In this particular case, it
would have been sufficient to detach the interface related to the network adapter,
rather than shutting down to single user mode. However, there will be
other situations where you may need to run diagnostics from single user mode or
maintenance mode, or even boot from a diagnostic image provided on CD or
over the network.
»# shutdown -F -m

__ c. At your system console, log in to single user mode using root’s password.
»INIT: SINGLE USER MODE
Password: root’s password

__ d. Start the diagnostics facility.
» # diag

__ 9. Test the communication adapter again in maintenance mode. What happens now?
» At the FUNCTION SELECTION menu, select Diagnostic Routines.
» You may be asked for your terminal type if it has not been defined:
• If you are at a graphics console, set the terminal type to lft (low function
terminal).
• If you are using a remote virtual console, set the terminal type to vt320.
» In the DIAGNOSTIC MODE SELECTION screen, select System Verification.
» Select your communications adapter (ex. ent1).
» Press F7 or <esc-7> to commit.
» This time the communications adapter could be tested. You should get the message No
trouble was found.

__ 10. Exit the diagnostic utility.

» Press F10 or <Esc-0>

Part 3: Running diagnostics in service mode from hard drive


__ 11. Start up the diagnostic utility in service mode from the hard drive using the following
steps:
__ a. Shut down AIX and power off your machine or logical partition:
»# shutdown -F
• For a local server, when you see halt completed use the front panel to
power off the machine.
• For a remote server, when you see halt completed close the virtual
console window:
- The HMC window should show a partition state of shutting down and
eventually not activated.
- If the state stays at running, despite the halt completed message,
then use the HMC to shut down the partition.
Access the HMC and locate your LPAR.
From the task menu, select Operations -> Shutdown.
On the resulting Shutdown Partitions popup, select the Immediate
option and click OK.
- If not using HMCv7, you may need to also close the virtual terminal
from the HMC window in order to start it again later; right-click your
partition and select Close Terminal Connection and confirm when
prompted.

__ b. Boot your system to diagnostics using service mode off the hard drive.
» For a local server:
• Make sure that there is no bootable media in the CD drive.
• Power on the machine from the front panel.
• When the console displays the list of discovered devices (memory,
keyboard, network, scsi, speaker), press F5 (or 5).
» For a remote server LPAR environment:

Using the HMC GUI, boot the server to service mode using the default
bootlist. Since there should not be a bootable CD, the system will boot off the
hard drive into diagnostic mode.
1) When the partition state is Not Activated, proceed to activate the partition.
2) Select the partition (if not already selected).
3) When the small menu icon appears, click it to show the menu and move your
mouse over the tasks: Operations --> Activate, and click Profile.
4) In the pop-up window labeled Activate Logical Partition: <your lpar
name>, click the small box next to Open a terminal window or console
session and also click the Advanced button. This should result in a new
pop-up window labeled Activate Logical Partition - Advanced.
5) In the new pop-up window, click the menu icon to the right of “Boot Mode”
and select Diagnostic with default bootlist.
Click OK to exit this pop-up.
6) On the panel that is labeled Activate Logical Partition: <your lpar name>,
click OK. Respond yes to any security pop-up windows.
A virtual terminal window should appear and you should see the system
console displays for a booting system, ending in a Diagnostics menu. (If you
do not see the virtual terminal window, it is likely behind some other window
and you will need to bring it to the foreground.)
If you already have a virtual terminal window, then a window will pop up
stating this and that it is exiting the new window; locate your existing window.

__ 12. Test the communication adapter again, this time in service mode. What difference did
you see from the previous diagnostics mode?
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
» At the FUNCTION SELECTION menu, select Diagnostic Routines.
» You may be asked for your terminal type if it has not been defined:
• If you are at a graphics console, set the terminal type to lft (low function
terminal).
• If you are using a remote virtual console, set the terminal type to vt320.
» In the DIAGNOSTIC MODE SELECTION screen, select System Verification.
» Select your communications adapter (ex. ent1).
» Press F7 or <esc-7> to commit.

» The communications adapter was able to be tested. Once again, you should see the
message No trouble was found.

__ 13. Exit the diagnostic utility. This will warn you that exiting the diagnostic tool will
generate a shutdown. Press Enter to concur.
» Press F10 or <Esc-0>

__ 14. Once the halt is completed and the partition is Not Activated, boot your system in
normal (multi-user) mode.
• For a local server, when you see halt completed, power off and back on
from the front panel.
• For a remote server, when you see halt completed:
- Close the virtual console window.
- The HMC window should show a partition state of shutting down and
eventually not activated.
- If the state stays at running, then use the HMC to shut down the
partition. Use the procedure for your HMC provided in Exercise 3, Part
3.
- When the HMC shows your LPAR state as not activated, activate the
partition to a multi-user mode (using the normal bootlist).
- If not using HMCv7, you may need to also close the virtual terminal
from the HMC window in order to start it again later; right-click your
partition and select Close Terminal Connection and confirm when
prompted.
__ 15. When AIX finishes booting, log in as root. 
View the contents of the diagnostics log using both the summary format and the
detailed format. Did you find any errors?
» # /usr/lpp/diagnostics/bin/diagrpt -r | more
» # /usr/lpp/diagnostics/bin/diagrpt -a | more

Part 4: Booting to diagnostics using external boot image (NIM server)
When the problem to be diagnosed is an inability to access the boot drive, then you need to
boot off of some external image. Traditionally this would be a diagnostic CD loaded in your
optical drive. On a partitioned machine, it is more common to use a diagnostic boot image
provided by a NIM server.

In this part you will configure the NIM server to provide a diagnostic boot image and then
network boot your machine to use that image.
First you will configure your NIM server.
__ 16. Log in to your NIM server LPAR (if you do not already have a session with it).
__ 17. List the attributes of your client LPAR’s NIM machine object, by executing:
# lsnim | grep machines
# lsnim -l <your-machine-object-name>
» Following are the example commands and output:
# lsnim | grep machines
# lsnim -l sys264_lpar2
sys264_lpar2:
class = machines
type = standalone
connect = shell
platform = chrp
netboot_kernel = mp
if1 = net_en0 sys264_lpar2 0
cable_type1 = N/A
Cstate = ready for a NIM operation
prev_state = currently running
Mstate = not running
cpuid = 00C35B904C00
Cstate_result = success

__ 18. If the Cstate value is not ready for a NIM operation, force reset the state of your
client machine object, by executing:
# nim -o reset -F <your-machine-object-name>
» Following are the example commands and output:
# nim -o reset -F sys264_lpar2
# lsnim -l sys264_lpar2 | grep Cstate
Cstate = ready for a NIM operation
Cstate_result = reset
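The check in this step (reset only when the Cstate is not ready) can be sketched as two small filters over the lsnim -l report. This is an illustration, not part of NIM: the function names cstate_of and needs_reset are hypothetical, and the sample input is a fragment of the example output shown earlier.

```shell
# cstate_of: reads "lsnim -l <object>" output on stdin and prints the
# value of the Cstate attribute (the Cstate_result line is ignored).
cstate_of() {
    awk -F' *= *' '$1 ~ /^ *Cstate$/ { print $2 }'
}

# needs_reset: succeeds (exit 0) when the Cstate read from stdin is
# anything other than "ready for a NIM operation".
needs_reset() {
    [ "$(cstate_of)" != "ready for a NIM operation" ]
}

# On the NIM server you might combine it with the commands above:
#   lsnim -l sys264_lpar2 | needs_reset && nim -o reset -F sys264_lpar2
# Demonstration on a fragment of the sample output:
cstate_of <<'EOF'
   Cstate = ready for a NIM operation
   prev_state = currently running
   Cstate_result = success
EOF
```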

__ 19. The maintenance boot operation requires that a SPOT is allocated to the machine.
Check that there is a SPOT allocated, by executing:
# lsnim -l <your-machine-object-name> | grep spot
If there is not a SPOT allocated, then allocate one that matches the version and
release of your client LPAR’s operating system, by executing:

# nim -o allocate -a spot=spot71-00-01 <your-machine-object-name>


» Following are the example commands:
# lsnim -l sys264_lpar2 | grep spot
(if needed) # nim -o allocate -a spot=spot71-00-01 sys264_lpar2

__ 20. Invoke the diag operation for your client LPAR, by executing:
# nim -o diag <your-machine-object-name>
» Following is an example command:
# nim -o diag sys264_lpar2

__ 21. Verify that your client LPAR machine object now has a Cstate of diag boot
has been enabled, by executing:
# lsnim -l <your-machine-object-name> | grep Cstate
» Following are the example commands and output:
# lsnim -l sys264_lpar2 | grep Cstate
Cstate = diag boot has been enabled

Next you will boot your LPAR into SMS mode and use the SMS menus to network
boot the LPAR, using the NIM server as the boot server.
__ 22. Connect to and log in to your HMC graphical interface, if you do not already have a
session.

__ 23. If your assigned client LPAR is currently running, shut it down in an organized
manner. Once the logical partition is in a Not Activated state, continue to the next
step.

__ 24. When the partition state is Not Activated, proceed to activate your LPAR into SMS
mode.
» Following are detailed instructions on how to boot to SMS mode:
i. Select the partition (if not already selected).
ii. When the small menu icon appears, click it to show the menu and move your
mouse over the Operations task and then the Activate sub-task.
iii. When the next sub-task menu appears, click the Profile sub-task.
iv. In the pop-up window labeled Activate Logical Partition: <your lpar name>,
click the small box next to Open a terminal window or console session and
also click the Advanced button. This should result in a new pop-up window
labeled “Activate Logical Partition - Advanced.”
v. In the new pop-up window, click the menu icon to the right of “Boot Mode” and
select SMS.
vi. Click OK to exit this pop-up.
vii. On the panel that is labeled Activate Logical Partition: <your lpar name>,
click OK. Respond yes to any security pop-up windows.
A virtual terminal window should appear and you should see the system
console displays for a boot system, ending in an SMS menu. (If you do not
see the virtual terminal window, it is likely behind some other window and you
will need to bring it to the foreground).
If you already have a virtual terminal window, then a window will pop up
stating this and that it is exiting the new window; locate your existing window.

__ 25. Network boot your LPAR into diagnostic mode using SMS.
» Following are detailed instructions on using SMS to network boot your LPAR.
i. From the SMS main menu, select options
2. Setup Remote IPL (Initial Program Load) ->
ii. From the list of Network Interface Card (NIC) Adapters, choose the first one (the
one that matches the location code recorded earlier).
iii. On newer systems, you will be prompted on what protocols to use. Select IPv4
and bootp.
iv. This should bring up the Network Parameters panel. Select option
1. IP Parameters
v. On the IP Parameters panel, if the network parameters are already set, validate
that they are correct. (The server IP address, if already set, is likely to be wrong
for this exercise.) If they are not correct, then modify them.
The way to modify the values is to enter the number of the parameter you
want to change, type in the replacement value and then press Enter.
When you are comfortable that the IP Parameters are correct, return to the
previous Network Parameters panel by pressing the <Esc> key.
vi. Next use the ping test to see if the parameters allow you to communicate with
the designated server. Select:
3. Ping Test
and
1. Execute Test

If you do not get a “Ping Success” result, then check the status of the server
and your IP Parameter values.
vii. Back out to the main menu, using the <Esc> key.
viii. From the SMS main menu, select options:
5. Select Boot Options ->
1. Select Install/Boot Device ->
6. Network
When prompted for an IP protocol, select IPv4.
When prompted for a network service, select bootp.
Select the device number of your network adapter.
ix. Then select:
2. Normal Mode Boot
1. Yes (to exit SMS)

You should see the tftp packet count incrementing as it downloads the boot
image. Then you should see the system booting up into maintenance mode.
It will prompt you to identify the system console. Type 1 and press Enter.
It will next ask you to identify the language to be used while in maintenance
mode. Type 1 (for English) and press Enter.
It should then display the Diagnostics menu.

__ 26. Test the communication adapter again in diagnostics mode.

Note

When booting to special modes, you cannot assume that the logical device name for a
resource will be the same as when in a multi-user mode. Use the description and physical
location code attributes to identify the device.

What is the result?


_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
» At the FUNCTION SELECTION menu, select Diagnostic Routines.

» You may be asked for your terminal type if it has not been defined:
• If you are at a graphics console, set the terminal type to lft (low function
terminal).
• If you are using a remote virtual console, set the terminal type to vt320.
» In the DIAGNOSTIC MODE SELECTION screen, select System Verification.
» Select your communications adapter (ex. ent#).
» Press F7 or <esc-7> to commit.
» The communications adapter was able to be tested. Once again, you should see the
message No trouble was found.

__ 27. Exit the diagnostic utility.


» Press F10 or <Esc-0>

__ 28. Boot your system in normal (multi-user) mode.


» Following are detailed instructions on how to boot back to multi-user mode:
• For a local server, when you see halt completed, power off and back on
from the front panel.
• For a remote server, when you see halt completed:
- The HMC window should show a partition state of shutting down and
eventually not activated.
- If the state stays at running, then use the HMC to shut down the
partition. Use the procedure for your HMC provided in Exercise 3, Part
3.
- When the HMC shows your LPAR state as not activated, activate the
partition to a multi-user mode (using the normal bootlist).
- If not using HMCv7, you may need to also close the virtual terminal
from the HMC window in order to start it again later; right-click your
partition and select Close Terminal Connection and confirm when
prompted.

End of exercise


Exercise review/wrap-up
Review the three diagnostic modes: concurrent, maintenance, and service. What can
and cannot be done in each mode?

Exercise 12. System dump


(with hints)

Estimated time
00:40 - Working with the AIX dump facility
00:25 - Working with a dedicated dump logical volume
00:20 - Working with a firmware assisted dump and Initiating a dump
from the HMC
01:25 - total exercise time

What this exercise is about


This exercise allows you to become familiar with the AIX dump facility.
In addition, you will use the snap command to include the dump in the
system data that you would provide to AIX Support. During this
exercise, you will also use the kdb command, but only at a very
introductory level.

What you should be able to do


After completing this exercise, you should be able to:
• Initiate a dump
• Include the dump in data collected by the snap command

Introduction
In this exercise you will create a dump and use the kdb command to
look at that dump.
You will need root authority to complete this exercise.

Common student problems


There must be sufficient resources available to increase /var and /tmp to an appropriate
size in this exercise as the snap and dump facilities require a lot of space. For example, on
a Power 7 system with approximately 1 GB of memory, the compressed size of a system
dump obtained in testing this exercise was approximately 365 MB.


Exercise instructions with hints


Preface
Two versions of these instructions are available; one with hints and one without. You
can use either version to complete this exercise. Also, please do not hesitate to ask the
instructor if you have questions.
All exercises of this chapter depend on the availability of specific equipment in your
classroom.
The output shown in the answers is an example. Your output and answers based on the
output may be different.
All hints are marked with a » sign.
Note: All users must perform this exercise together if there is more than one user on your
system.

Working with the AIX Dump Facility


__ 1. If you do not already have a telnet connection to your client LPAR, start one now.

__ 2. If you do not have a Web browser session with your HMC, establish this before
starting this exercise. Navigate to the panel listing your LPARs. You will need this in
order to observe the reference code.

__ 3. Record the following dump-related settings for your system:


Primary dump device _____________________________
Secondary dump device ___________________________
Copy directory ___________________________________
Dump compression (ON or OFF) _____________________
» # sysdumpdev -l
» On systems running AIX 5L V5.3 or later, the value shown for dump compression
should be ON. This is the default for AIX 5L V5.3 or later. For AIX 6.1 and later, this
cannot be changed.

__ 4. Execute the command to display the estimated size of a dump and record the
estimate you obtain:
_______________________________________________
» # sysdumpdev -e

» On a system with approximately 1 GB of memory that was used in testing this exercise,
the value obtained was approximately 365 MB.
__ 5. Verify that the dump copy directory is large enough to hold the dump size
reported on the previous command.
» # df -m /var
» You could also use the command /usr/lib/ras/dumpcheck -p to check if the size of
the copy directory is large enough. If no message is sent to stdout, then the size is
sufficient.
If there is not enough space, you must increase the size of the corresponding file
system. If necessary, use the chfs command to increase the size of the appropriate
file system, typically /var. After increasing the size, reverify that the filesystem is
large enough.
» # chfs -a size=+##M /var 
Where ## represents the number of megabytes that /var must be increased by to hold
the dump
On a system with approximately 1 GB of memory that was used in testing this
exercise, the value used for ## was 257.
Note that chfs will now accept M (Megabytes) and G (Gigabytes) unit
identifiers for file system size specifications. In our example, the command
chfs -a size=+257M /var can be used to indicate that the size of /var
should be increased by 257 MB.
# /usr/lib/ras/dumpcheck -p
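The value to substitute for ## is the shortfall between the estimated dump size and the free space in the copy directory's file system. A minimal sketch of that arithmetic follows; extra_mb_needed is a hypothetical name, and the 108 MB of free space is an assumed figure chosen so that the result matches the 257 used in testing.

```shell
# extra_mb_needed DUMP_MB FREE_MB
# Prints how many megabytes the file system must grow to hold a dump of
# DUMP_MB, given FREE_MB currently free. Prints 0 if the dump fits.
extra_mb_needed() {
    need=$(( $1 - $2 ))
    [ "$need" -gt 0 ] || need=0
    echo "$need"
}

# 365 MB is the estimate quoted in this exercise; 108 MB free is assumed.
extra_mb_needed 365 108
```

The printed number would then be used in place of ## in the chfs command shown above.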

__ 6. Ensure that the value of the autorestart attribute for sys0 is set to true. (If
autorestart is set to true, the system will reboot after a crash.)
»# lsattr -El sys0 -a autorestart
» # chdev -l sys0 -a autorestart=true (if necessary)

__ 7. Use the command sysdumpstart -p to start a dump to the primary dump device.
Record the time when you executed the command. _______________________
» # sysdumpstart -p
What dump progress code for your LPAR is reported at the HMC for several
minutes after this command is entered? This is referred to as an Operator Panel
Value (pre-HMCv7) or as the Reference Code (HMCv7) in the HMC display
across from your LPAR name.
_______________________________________________
» 0c2 which indicates the dump is in progress.


__ 8. Eventually, the dump will complete and the AIX system will reboot. After the system
reboots, reestablish a telnet session with your LPAR and log in as root. Record the
time when the login prompt appears: __________________________________
How long did the dump require from initiation to reboot completion? ___________

__ 9. Determine and write down the size, uncompressed size, and filename for your
system dump:
- Size: ___________________________________________________________
- Uncompressed size: ______________________________________________
- Dump copy filename: ______________________________________________
» # sysdumpdev -L
Sample output is given below:
Device name: /dev/hd6
Major device number: 10
Minor device number: 2
Size: 71727616 bytes
Uncompressed Size: 874837997 bytes
Date/Time: Sat Apr 23 15:35:02 2011
Dump status: 0
Type of dump: traditional
dump completed successfully
Dump copy filename: /var/adm/ras/vmcore.0.BZ
(if it’s the first dump)

Note that the value shown for Uncompressed Size is much larger than the
value shown for Size. Also note that the .BZ extension means that the
compressed dump cannot be uncompressed using the uncompress command;
the dmpuncompress command is used instead.
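To put a number on the difference between the two sizes, the report can be filtered with awk. This is a rough sketch under the assumption that the report contains "Size:" and "Uncompressed Size:" lines laid out as in the sample output above; dump_sizes is a hypothetical helper name, not an AIX command.

```shell
# Read a dump status report on standard input and print:
#   <compressed bytes> <uncompressed bytes> <compression ratio>
dump_sizes() {
    awk '
        /^[ \t]*Size:/              { size = $2 }    # second field is the byte count
        /^[ \t]*Uncompressed Size:/ { usize = $3 }   # third field is the byte count
        END {
            if (size > 0)
                printf "%s %s %.1f\n", size, usize, usize / size
        }
    '
}

# Example using the two relevant lines from the sample output above.
# On a live system the input would instead come from: sysdumpdev -L | dump_sizes
dump_sizes <<'EOF'
Size: 71727616 bytes
Uncompressed Size: 874837997 bytes
EOF
```

For the sample values, the uncompressed dump is roughly twelve times the size of the compressed one.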

__ 10. Uncompress the dump file (for example, /var/adm/ras/vmcore.#.BZ). When
uncompressing the dump, keep the original compressed file. Based on the
Uncompressed Size just reported, you may need to further increase the size of /var
to accommodate the uncompressed dump (in addition to the compressed dump
that already exists).
» Suggested commands are:
# df -m /var
# chfs -a size=+##M /var

The additional number of megabytes should be greater than the Uncompressed
Size you have recorded.
# dmpuncompress -p /var/adm/ras/vmcore.0.BZ (if it's the first dump)
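The guide's rule (add more than the full Uncompressed Size) is always safe. If space in rootvg is tight, the minimum shortfall can be estimated instead. The sketch below assumes 1 MB = 1048576 bytes and takes the free space as a whole number of megabytes (df -m reports fractions, so round down first); extra_mb is a hypothetical helper name and the free-space figure of 300 MB is invented for illustration.

```shell
# Print how many megabytes must be added to a file system so that
# need_bytes fits into the space that is currently free.
extra_mb() {
    need_bytes=$1
    free_mb=$2
    # Round the byte count up to a whole number of megabytes.
    need_mb=$(( (need_bytes + 1048575) / 1048576 ))
    short=$(( need_mb - free_mb ))
    if [ "$short" -gt 0 ]; then
        echo "$short"
    else
        echo 0
    fi
}

# Example: the sample Uncompressed Size with 300 MB currently free in /var.
extra_mb 874837997 300
```

The printed value is a lower bound for the ## placeholder in the chfs command; adding some headroom on top of it is sensible.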

__ 11. Execute the kdb command on the uncompressed dump that was created. Write
down the command you used:
___________________________________________________________
» # kdb /var/adm/ras/vmcore.0

__ 12. Use the kdb stat and status subcommands to show the system name and time of
the dump, and the threads that were running when the dump occurred. Quit the kdb
command when you are done.

» Sample output is shown below:


(0)> stat
SYSTEM_CONFIGURATION:
CHRP_SMP_PCI POWER_PC POWER_7 machine with 8 available CPU(s)
(64-bit registers)

SYSTEM STATUS:
sysname... AIX
nodename.. sys304_118
release... 1
version... 7
build date Aug 31 2010
build time 16:22:09
label..... 1034B_710
machine... 00F606034C00
nid....... F606034C
time of crash: Sat Apr 23 15:35:03 2011
age of system: 16 hr., 34 min., 46 sec.
xmalloc debug: enabled
FRRs active... 0
FRRs started.. 0

CRASH INFORMATION:
CPU -1 CSA 052AA6C8 at time of crash, error code for LEDs:
00000000

(0)> status
CPU  INTR      TID  TSLOT     PID  PSLOT  PROC_NAME
  0        23800C1    568  B70068    183  sysdumpstart
  1         180031     24   E001C     14  wait
  2         190033     25   F001E     15  wait
  3         1B0037     27  100020     16  wait
  4         1C0039     28  110022     17  wait
  5         1D003B     29  120024     18  wait
  6         1E003D     30  130026     19  wait
  7         1F003F     31  140028     20  wait
8-1023 Disabled

(0)> q

__ 13. Remove the uncompressed dump, but keep the original compressed dump. This will
ensure proper processing of the system dump by the snap command, which you will
use in a subsequent lab step.
» # rm /var/adm/ras/vmcore.0 (if it's the first dump)

__ 14. Check to see how much free space is currently available in /tmp.
If necessary, increase your /tmp file system so that there is at least 210 MB of free
space. We need this space in the next lab step.
Write down the commands you used:
_______________________________________________
_______________________________________________
» # df -m /tmp
If necessary, you can increase the size of /tmp using a command similar to
the following:
# chfs -a size=+100M /tmp

__ 15. Run the command snap -a. This command required a little more than 6 minutes to
complete on our development system.
» # snap -a
Review the output of this command. This output will include a list of various
directories (in /tmp/ibmsupt) to which the snap command writes its output.
In these directories, you will find files with names that end in .snap, which are ASCII
files. Review the content of a few of these files.
» # cd /tmp/ibmsupt
# ls
# cd <subdirectory>
# more <name of selected .snap file>

Working with a dedicated dump logical volume


In this part of the lab, you will create a dedicated dump logical volume, configure the
system to use it, and process a dump in this environment.
__ 16. Remove any dump files currently in /var/adm/ras, if they exist.
» The suggested commands are:
# ls /var/adm/ras/vmcore*
# rm /var/adm/ras/vmcore*

__ 17. List the estimated size of a dump on your system. Record it here:
________________________________________________________________

» A suggested command and example output are:
# sysdumpdev -e
Estimated dump size in bytes: 366162739

__ 18. List the physical partition size for your rootvg.
________________________________________________________________
» A suggested command and example output are:
# lsvg rootvg | grep "PP SIZE"
VG STATE: active PP SIZE: 8 megabyte(s)

__ 19. Calculate how many physical partitions you must allocate to satisfy the estimated
size of a dump. Record it here:
________________________________________________________________
» Using our example output, the estimated dump size is about 366 MB. Dividing
366 MB by 8 MB per PP gives 45.75. Rounded up, this gives a minimum
allocation of 46 physical partitions.
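The same calculation can be scripted as a ceiling division. This is a sketch only: pp_needed is a hypothetical helper name, and it assumes the PP size reported by lsvg is in units of 1048576 bytes. Under that assumption the exact minimum for the example values is slightly lower than the 46 derived above, which treated the estimate as 366 decimal megabytes; allocating 46 simply leaves a little headroom.

```shell
# Print the number of physical partitions needed to hold dump_bytes,
# given the PP size in megabytes (assumed to be 1048576 bytes each).
pp_needed() {
    dump_bytes=$1
    pp_mb=$2
    pp_bytes=$(( pp_mb * 1048576 ))
    # Integer ceiling division: round any partial partition up to a whole one.
    echo $(( (dump_bytes + pp_bytes - 1) / pp_bytes ))
}

# Example: the estimated dump size and PP size from the sample output.
pp_needed 366162739 8
```

Because the estimate from sysdumpdev -e can grow as the system's workload changes, rounding up generously, as the hint does, is the safer choice.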

__ 20. Verify that the file system that holds the copy directory has enough free space, given
the estimated dump size. Increase the filesystem size, if necessary.
» # df -m /var
# chfs -a size=+16M /var

__ 21. Create a logical volume, out of the rootvg, that has more than enough physical
partitions to handle the estimated dump size. Name it dumplv and specify a logical
volume type of sysdump.
» # mklv -y dumplv -t sysdump rootvg 46

__ 22. Permanently define this new logical volume as the primary dump device.
» # sysdumpdev -P -p /dev/dumplv

__ 23. Verify that your new logical volume is defined as the primary dump device.
» # sysdumpdev -l

__ 24. Initiate a dump of your operating system to your primary dump device. On our
development system, this required a little more than 7 minutes to complete.
» # sysdumpstart -p

__ 25. In your HMC GUI, monitor the dump progress by examining the reference code
field across from your client LPAR. 0c2 indicates the dump is in progress.
Eventually, the dump will complete and the AIX system will reboot.

__ 26. After the system reboots, reestablish a telnet session with your LPAR and log in as
root. List the dump status. What was the size of the dump? Was there a Dump copy
filename line in the report? __________________________________________
________________________________________________________________
» # sysdumpdev -L
Sample output is given below:
Device name: /dev/dumplv
Major device number: 10
Minor device number: 13
Size: 100349952 bytes
Uncompressed Size: 882864853 bytes
Date/Time: Wed Apr 6 19:24:47 2011
Dump status: 0
Type of dump: traditional
dump completed successfully

__ 27. Examine the copy directory. Is there a vmcore file located in that directory?
______________________________________________________________
» # ls /var/adm/ras/vmcore*
» There should not be any vmcore files in the directory. The dump may be left in the
dedicated dump device and is not automatically copied to the copy directory.

__ 28. Copy the dump and the current kernel to the dump copy directory.
» # savecore /var/adm/ras /unix
» There should now be a new vmcore file in the directory.

__ 29. Again, examine the copy directory. Is there a vmcore file located in that directory?

______________________________________________________________
» # ls /var/adm/ras/vmcore*

Generating a firmware assisted dump


In this part of the lab, you will request a firmware assisted dump and observe the
differences in dump processing. It assumes that you have a dedicated device as your
primary dump device.
__ 30. If you do not already have a web browser window with a connection to the HMC,
establish one now.
__ 31. If you do not already have a virtual terminal to your assigned client LPAR, establish
one at this point, and log in as root.
__ 32. Remove any dump files currently in /var/adm/ras, if they exist.
» The suggested commands are:
# ls /var/adm/ras/vmcore*
# rm /var/adm/ras/vmcore.0

__ 33. List the current system dump configuration.
What is the current type of dump? _____________________________________
Is the primary dump device a dedicated dump device? _____________________
» The suggested command is:
# sysdumpdev -l

__ 34. Modify your system to permanently use a firmware assisted dump. Is the change
immediately effective? _____________________________________________
» The suggested commands are:
# sysdumpdev -P -t fw-assisted
» The firmware-assisted dump will be configured at the next reboot.

__ 35. List your system's dump configuration to verify the current situation. What type of
dump is the system configured to use? _________________________________
» The suggested commands are:
# sysdumpdev -l

» Since you have not yet rebooted, the current configuration is still for a traditional dump.

__ 36. Shut down and reboot your AIX system. Log back in as root when you receive a login
prompt.
» The suggested commands are:
# shutdown -Fr

__ 37. List your system's dump configuration to verify the current situation. What type of
dump is the system configured to use? _________________________________
» The suggested commands are:
# sysdumpdev -l
» You should see that the dump facility is now configured for a firmware-assisted dump.
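If you want to check the dump type from a script rather than by eye, the configuration listing can be filtered. The sketch below is an assumption-laden illustration: it assumes the listing contains a line that begins with "type of dump" followed by the type as the last word (the sample input shown is an assumed layout, not captured output), and dump_type is a hypothetical helper name.

```shell
# Read a dump configuration listing on standard input and print the
# configured dump type (the last word of the "type of dump" line).
dump_type() {
    awk '/^type of dump/ { print $NF }'
}

# Example with an assumed listing layout.
# On a live system the input would instead come from: sysdumpdev -l | dump_type
dump_type <<'EOF'
primary              /dev/dumplv
copy directory       /var/adm/ras
type of dump         fw-assisted
EOF
```

Running the same filter before and after the reboot shows the point at which the change actually takes effect.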

__ 38. Initiate a dump to your primary device and record the time that you executed the
command. _______________________________________________________
» The suggested commands are:
# sysdumpstart -p

__ 39. Examine the HMC reference code for your client LPAR. What is the progress code?
________________________________________________________________
» The 0cb dump progress code is the fw-assisted equivalent of the traditional 0c2 code: a
dump has been initiated.
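As a quick reference, the two progress codes seen in this exercise can be captured in a small lookup. This is only a sketch: dump_code_meaning is a hypothetical helper name, and it covers only the two codes discussed in this exercise, not the full set of dump progress codes.

```shell
# Print the meaning of a dump progress code as described in this exercise.
dump_code_meaning() {
    case "$1" in
        0c2|0C2) echo "traditional dump in progress" ;;
        0cb|0CB) echo "firmware-assisted dump initiated" ;;
        *)       echo "not covered in this exercise" ;;
    esac
}

# Example: look up the code shown during a firmware-assisted dump.
dump_code_meaning 0cb
```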

__ 40. Continue to observe the reference code field in your HMC. Note the time when the
dump progress code goes away.
________________________________________________________________

__ 41. Continue to observe the reference code field in your HMC. At some point you should
see some text (instead of a code) in the reference field about the dump processing
(just prior to starting the AIX operating system). What is the text?
________________________________________________________________
» The reference code field will eventually display: Fw-Assisted Dump, followed by AIX
is starting and Starting kernel. This illustrates that the Power Hypervisor is handling
the dump processing prior to booting AIX in the LPAR.

__ 42. Watch your virtual console. You may be quick enough to see a SoftRos issued
message about dump processing on your system console.
Record the time when you receive a login prompt. ________________________
How long did it take from dump initiation to the completion of the reboot with login
prompt? _________________________________________________________
How did this compare to the traditional dump processing? __________________
________________________________________________________________

__ 43. Log in as root and display the results of the last dump.
________________________________________________________________
» The suggested commands are:
# sysdumpdev -L
» The dump status does not list the location of the dump copy file as it did with the
traditional dump.

__ 44. Copy the dump and the current kernel to the dump copy directory.
» # savecore /var/adm/ras /unix

__ 45. Examine the copy directory. Is there a vmcore file located in that directory?
______________________________________________________________
» # ls /var/adm/ras/vmcore*

Initiating a dump from the HMC


There are times when the problem to be analyzed is a hang of the operating system. In that
situation you would be unable to initiate a dump with a command entered at a shell prompt.
Instead, you would need to tell the firmware to initiate the dump. In this part of the lab, you
will use your HMC to request a system reset with a dump.
__ 46. If you do not already have a web browser window with a connection to the HMC,
establish one now.
__ 47. If you do not already have a virtual terminal to your assigned client LPAR, establish
one at this point, and log in as root.

__ 48. In your HMC web browser interface, locate and select your assigned client LPAR.

__ 49. From the task menu, select Operations -> Restart.

__ 50. In the pop-up Restart Partition window, select the Dump option and click OK.

__ 51. Examine the HMC reference code for your client LPAR. What is the progress code?
________________________________________________________________

__ 52. You do not need to wait for the dump to complete. Notify the instructor that you are
done with this lab exercise.

End of Exercise

Exercise review/wrap-up
1. Review terms like primary dump device, secondary dump device, and copy directory.
2. Ask students why they would want to use the snap facility.
Answer: The snap facility documents system configuration information that is
needed, along with a dump, in order for support personnel to be able to analyze
the system.
