Course Exercises Guide
with hints
AIX Internals & Performance IV: I/O
Management - Part 2 (Specialized I/O)
Course code AHQV474   ERC 1.1

IBM Systems Software Education


Student Exercises with hints

Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corp., registered in many jurisdictions worldwide.
The following are trademarks of International Business Machines Corporation, registered in many
jurisdictions worldwide:
AIX 6™ AIX® DB2®
DS8000® Express® FlashSystem™
GPFS™ IBM FlashSystem® Power Systems™
Power® PowerHA® PowerSC™
PowerVM® POWER6® POWER7+™
POWER7® POWER8® PurePower System™
SystemMirror® Systems Director VMControl™
Intel is a trademark or registered trademark of Intel Corporation or its subsidiaries in the United
States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Java™ and all Java-based trademarks and logos are trademarks or registered trademarks of
Oracle and/or its affiliates.
VMware and the VMware "boxes" logo and design, Virtual SMP and VMotion are registered
trademarks or trademarks (the "Marks") of VMware, Inc. in the United States and/or other
jurisdictions.
Other product and service names might be trademarks of IBM or other companies.

May 2013 edition


The information contained in this document has not been submitted to any formal IBM test and is distributed on an “as is” basis without
any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer
responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While
each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will
result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

© Copyright International Business Machines Corporation 2013.


This document may not be reproduced in whole or in part without the prior written permission of IBM.
Note to U.S. Government Users — Documentation related to restricted rights — Use, duplication or disclosure is subject to restrictions
set forth in GSA ADP Schedule Contract with IBM Corp.

Contents
Trademarks
Exercises description
Exercise 1. I/O Internals Framework
Exercise 2. Possible Disk I/O Configurations
Exercise 3. Conventional I/O Operations
Exercise 4. Specialized I/O Operations

© Copyright IBM Corp. 2013 Contents iii


Course materials may not be reproduced in whole or in part
without the prior written permission of IBM.

Trademarks
The reader should recognize that the following terms, which appear in the content of this
training document, are official trademarks of IBM or other companies:
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International
Business Machines Corp., registered in many jurisdictions worldwide.
The following are trademarks of International Business Machines Corporation, registered in
many jurisdictions worldwide:
Active Memory™ AIX 6™ AIX®
BladeCenter® DS4000® DS6000™
DS8000® Enterprise Storage Server® POWER Hypervisor™
Power Systems™ Power® PowerVM®
POWER6® POWER7® Storwize®
System Storage®
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Other product and service names might be trademarks of IBM or other companies.


Exercises description


In the exercise instructions you will see each step prefixed by a line.
You may wish to check off each step as you complete it to keep track
of your progress.
Most exercises include required sections which should always be
completed. These may be required before performing later exercises.
Some exercises may also include optional sections that you may wish
to perform if you have sufficient time and want an additional challenge.
This course includes two versions of the course exercises, “with hints”
and “without hints”.
The standard “Exercise instructions” section provides high-level
instructions for the tasks you should perform. You need to apply the
knowledge you gained in the unit presentation to perform the exercise.
The “Exercise instructions with hints” provide more detailed
instructions and hints to help you perform the exercise steps.


Text highlighting
The following text highlighting conventions are used throughout this book:
Bold              Identifies file names, file paths, directories, user names,
                  principals, menu paths and menu selections. Also identifies
                  graphical objects such as buttons, labels and icons that the
                  user selects.
Italics           Identifies links to web sites and publication titles, marks a
                  word or phrase that is meant to stand out from the
                  surrounding text, and identifies parameters whose actual
                  names or values are to be supplied by the user.
Monospace         Identifies attributes, variables, file listings, SMIT menus,
                  code examples and command output that you would see
                  displayed on a terminal, and messages from the system.
Monospace bold    Identifies commands, subroutines, daemons, and text the user
                  would type.


Exercise 1. I/O Internals Framework


(with hints)

What this exercise is about


This exercise covers accessing the remote lab environment used in
this course and it provides you with experience displaying various
items of AIX configuration information.

What you should be able to do


At the end of the exercise, you should be able to:
• Verify that your remote lab environment is accessible
• Display information about the kernel on your system
• Display and examine information regarding hypervisor calls
• Understand how trace can be used to view kernel I/O activity

Requirements
In the normal lab environment for this class, each lab team will be
assigned a logical partition (LPAR) on a managed system. The
assigned logical partition should be running AIX 7.1 and should
normally be on a POWER6 or POWER7 processor-based system.
You will not be sitting directly in front of your lab system. Instead, you
will be using your personal PC to connect to your lab system.


Exercise instructions with hints

Preface
This exercise includes information for you to read, and exercise steps for you to
perform. The following examples illustrate the numbered checklist format used to
identify exercise steps:
__ 1. (This is example step one.) Login to ...
__ 2. (This is example step two.) Execute the following ...
Two versions of these instructions are available: one with hints and one without. You
can use either version to complete this exercise (or flip back and forth between the two
versions). In other words, use these two versions of the exercise in whatever way best
aids your learning. Also, please do not hesitate to ask the instructor if you have
questions.
In some cases, the answer given in a hint may be just an example, and there may be
other correct answers.
All hints are marked by a » sign.


Part 1 - Displaying kernel-related information


__ 1. Log on to your assigned AIX system as root.
__ 2. Use the ls command to verify that /unix is a symbolic link that points to the kernel
image file /usr/lib/boot/unix_64 on your lab system.
» The required command and the expected output are shown below:
# ls -l /unix
lrwxrwxrwx 1 root system 21 Jun 15 15:49 /unix -> /usr/lib/boot/unix_64
This output indicates that /unix is a symbolic link to /usr/lib/boot/unix_64.
__ 3. AIX 7 provides and supports only a 64-bit kernel. Use the -k flag of the prtconf
command to verify that a 64-bit kernel is currently in use on your lab system.
» The required command and the expected output are shown below:
# prtconf -k
Kernel Type: 64-bit
__ 4. AIX 7 runs only on 64-bit hardware. Use the -c flag of the prtconf command to
confirm that the CPU hardware type of your lab system is 64-bit.
» The required command and the expected output are shown below:
# prtconf -c
CPU Type: 64-bit
__ 5. The following command can be used to display information about files used in
building the /unix kernel nucleus:
# what /unix
JFS2 source files are located under the bos/kernel/j2 subdirectory. Thus, evidence
that JFS2 is integrated into the /unix kernel nucleus (rather than being implemented
as a separately loaded kernel extension) can be observed by running the following
command:
# what /unix | grep bos/kernel/j2
Does the output of this command indicate that JFS2 source files are used in building
the /unix kernel nucleus?
» The required command and the first few lines of typical output are shown below:
# what /unix | grep bos/kernel/j2
37 1.17 src/bos/kernel/j2/j2_errlog.c, sysj2, bos61B, b2007_33A0 8/6/07 16:26:17
55 1.110 src/bos/kernel/j2/j2_xtree.c, sysj2, bos61J, 0933A_61J 7/16/09 00:33:46
03 1.23 src/bos/kernel/j2/j2_access.c, sysj2, bos61H, 0911A_61H 2/27/09 17:26:32
15 1.62 src/bos/kernel/j2/j2_dtree.c, sysj2, bos61J, 0925A_61J 5/27/09 13:16:23


26 1.37 src/bos/kernel/j2/j2_ea.c, sysj2, bos61D, d2007_49C2 11/2/07 09:51:33
29 1.7 src/bos/kernel/j2/j2_zombie.c, sysj2, bos61J, 0941A_61J 9/29/09 14:31:25
AIX710_area/1 bos/kernel/j2/j2_remove.c, sysj2, aix710, 1023A_710 2010-05-25T08:52:54-05:00$
AIX710_area/1 bos/kernel/j2/j2_xacl.c, sysj2, aix710, 1011A_710 2010-03-04T20:22:07-06:00$
22 1.21 src/bos/kernel/j2/j2_getattr.c, sysj2, bos61H, 0914A_61H 3/26/09 14:28:20
AIX710_area/1 bos/kernel/j2/j2_mls.c, sysj2, aix710, 1011A_710 2010-03-04T20:23:10-06:00$
55 1.30.1.45 src/bos/kernel/j2/j2_ea2.c, sysj2, bos61J, 0931A_61J 7/10/09 14:36:44
AIX710_area/1 bos/kernel/j2/j2_efs.c, sysj2, aix710, 1019A_710 2010-05-03T18:51:50-05:00$
34 1.12.1.30 src/bos/kernel/j2/j2_open.c, sysj2, bos61J, 0925A_61J 4/23/09 12:43:37
AIX710_area/3 bos/kernel/j2/j2_fstats.c, sysj2, aix710, 1008A_710 2010-02-12T13:13:35-06:00$
1 bos/kernel/j2/j2_rdwr.c, sysj2, aix710, 0950A_710 2009-11-30T13:36:09-06:00$
. . . < some output deleted > . . .
This output indicates that JFS2 source files were used in building the /unix kernel
nucleus.
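When the list of matches is long, it can help to reduce the what output to a sorted list of unique JFS2 source file names. The following is a minimal sketch that reads what-style output from standard input. The function name j2_source_files is a hypothetical helper, not an AIX command.

```shell
# Reduce `what` output (read from standard input) to the unique JFS2
# source file names it mentions. j2_source_files is a hypothetical helper.
j2_source_files() {
    # Keep only the file name that follows bos/kernel/j2/ on each line.
    sed -n 's|.*bos/kernel/j2/\([^,]*\),.*|\1|p' | sort -u
}

# On an AIX system this could be run as:
#   what /unix | j2_source_files
#   what /unix | j2_source_files | wc -l    # number of JFS2 source files
```

A non-empty result indicates that JFS2 source files were used in building the kernel image that was examined.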


Part 2 - Obtaining hypervisor-related information


__ 6. The lparstat command reports LPAR-related information and statistics. When the
lparstat command is invoked with the -h flag (and no other flags or other
parameters), the resulting output will show the percentage of processing time spent
in hypervisor mode (in the %hypv column) and the number of hypervisor calls (in the
hcalls column) since the last time the partition was booted (as well as partition
configuration information and other utilization statistics). Enter the lparstat
command with the -h flag (and no other parameters) to obtain this summary of
hypervisor call activity for your system.
» The command and an example of output are shown below:
# lparstat -h

System configuration: type=Shared mode=Capped smt=On lcpu=2 mem=4096MB psize=2 ent=0.25

%user %sys %wait %idle physc %entc lbusy vcsw phint %hypv hcalls %nsp
----- ----- ------ ------ ----- ----- ------ ----- ----- ------ ------ -----
0.0 0.2 0.0 99.7 0.00 0.5 0.1 1396566823 6674419 0.0 817 99
__ 7. Use the output you just obtained to answer the following questions:
• How many hypervisor calls have been made since your partition was booted?
• What percentage of processing time has been spent in hypervisor mode since
your system was booted?
» The sample output indicates that 817 hypervisor calls were made from the time the
partition was last booted until the time the output was generated. During that time,
0.0% (rounded to the nearest 0.1%) of the processing time was spent in hypervisor
mode. (The values obtained will differ from system to system. Check the output you
obtain to determine the values for your system.)
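Because the lparstat -h report is wide, the two values asked for in this step can also be picked out by column name. The following is a minimal sketch that reads the report from standard input and assumes the layout shown in the sample output (a header row, a dashed separator row, then one data row). The function name hypv_summary is hypothetical.

```shell
# Print the hcalls and %hypv values from `lparstat -h` output on standard
# input. Columns are located by header name rather than by position.
# hypv_summary is a hypothetical helper name.
hypv_summary() {
    awk '
        # Scan until the header row (the one containing "hcalls") is found.
        !found {
            for (i = 1; i <= NF; i++) {
                if ($i == "hcalls") { hc = i; found = 1 }
                if ($i == "%hypv")  { hy = i }
            }
            next
        }
        # Skip the dashed separator row, then report the first data row.
        $1 ~ /^-+$/ { next }
        NF >= hc { printf "hcalls=%s %%hypv=%s\n", $hc, $hy; exit }
    '
}

# On an AIX system this could be run as:
#   lparstat -h | hypv_summary
```

For the sample output shown in the hint, this prints hcalls=817 %hypv=0.0.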
__ 8. When the lparstat command is invoked with the -H flag (and no other flags or
other parameters), the resulting output will show detailed information regarding
hypervisor calls since the last time the partition was booted. The following
information will be displayed for each of the hypervisor calls:
• Number of calls: The number of hypervisor calls of this type made
• %Total Time Spent: Percentage of total time spent in this type of call
• %Hypervisor Time Spent: Percentage of hypervisor time spent in this type of call
• Average Call Time: Average call time for this type of call in nanoseconds
• Maximum Call Time: Maximum call time for this type of call in nanoseconds
Enter the lparstat command with the -H flag (and no other parameters) to obtain
detailed hypervisor call information for your system.


» The command and an example of output are shown below:


# lparstat -H

System configuration: type=Shared mode=Capped smt=On lcpu=2 mem=4096MB psize=2 ent=0.25

Detailed information on Hypervisor Calls

Hypervisor       Number of  %Total Time  %Hypervisor   Avg Call   Max Call
Call                 Calls        Spent   Time Spent   Time(ns)   Time(ns)

remove                 434          0.0          3.0        686       5531
read                    15          0.0          0.0        141        156
nclear_mod               0          0.0          0.0          0          0
page_init             1807          0.0         16.0        873       2187
clear_ref                0          0.0          0.0          0          0
protect                  0          0.0          0.0          0          0
put_tce                 18          0.0          0.1        732       2718
xirr                    14          0.0          0.1        852       1656
eoi                     14          0.0          0.1        408        562
ipi                      0          0.0          0.0          0          0
cppr                    14          0.0          0.0        245        281
asr                      0          0.0          0.0          0          0
others               10423          0.0         21.8        206       6062
enter                 2250          0.0         12.0        528       2187
cede                   258          0.0         40.8      15641     227937
migrate_dma              0          0.0          0.0          0          0
. . . < some output deleted > . . .
__ 9. Use the output you just obtained to answer the following questions:
• How many enter hypervisor calls (used to add entries into the partition page
frame table maintained by the hypervisor) have been made in your partition since
your partition was booted?
• What has been the average call time in nanoseconds for enter hypervisor calls
since your system was booted?
» The sample output indicates that 2250 enter (H_ENTER) hypervisor calls were made
from the time the partition was last booted until the time the output was generated.
During that time, the average call time in nanoseconds for enter hypervisor calls
was 528 nanoseconds. (The values obtained will differ from system to system.
Check the output you obtain to determine the number of enter hypervisor calls and
the average call time for enter hypervisor calls on your system during the
monitored period.)
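The row for a single hypervisor call can also be pulled out of the lparstat -H report with a short filter. The following is a minimal sketch that reads the report from standard input and assumes each data row has the six fields shown in the sample output (call name, number of calls, %total time, %hypervisor time, average time, maximum time). The function name hcall_stats is hypothetical.

```shell
# Print "calls avg_ns" for one hypervisor call from `lparstat -H` output
# read from standard input. hcall_stats is a hypothetical helper name.
# Usage: hcall_stats NAME
hcall_stats() {
    awk -v name="$1" '
        # Data rows have six fields; the two header rows do not.
        NF == 6 && $1 == name { print $2, $5; exit }
    '
}

# On an AIX system this could be run as:
#   lparstat -H | hcall_stats enter
```

For the sample output shown in the hint, the enter row yields 2250 calls with an average call time of 528 nanoseconds.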


Part 3 - Using the trace facility to examine I/O activity


In this section of the exercise, you will use the trace facility to identify and
examine the basic kernel functions that support I/O operations.
__ 10. Change directory to /home/QV474/ex1.
__ 11. In this step of the exercise you will collect some trace events when running the
trcprog program. Enter the following command:
# trace -J syscall,jfs2,vnops,filepvld,vmm -x ./trcprog
__ 12. Format and save the trace report using the following command:
# trcrpt -O exec=on,pid=on,svc=on,timestamp=1 > iotracereport
__ 13. In the next steps of this exercise you will examine the trace report of the
trcprog program that was generated in the previous step.
There are several ways to filter the trace events to find the information you want.
Filtering can be performed on the raw data contained in the log file or by using
filter commands (such as awk, cut, grep, head, and sed). Because the trace report in
this case is small, we recommend a simple visual search using the more or
pg commands, or an editor such as vi or view.
__ a. The trcprog program opens the /unix file. How many microseconds did the
open operation take? _________________
» System call entry and exit events are visible in the trace report with hook IDs 101
and 104, which indicate system call entry point and return from system call
respectively. The hook id 104 event contains the elapsed time of the system call
operation between [square brackets].
» In the example trace report shown below, the open operation took 26
microseconds. The time taken on your lab system may be slightly different.
. . . < some output deleted >. . .
101 trcprog 5832820 kopen 0.000921 kopen LR = D0119BD8
107 trcprog 5832820 kopen 0.000924 lookuppn: /unix
. . . < some output deleted >. . .
15B trcprog 5832820 kopen 0.000947 open fd=3
104 trcprog 5832820 kopen 0.000947 return from kopen [26 usec]
. . . < some output deleted >. . .
» Another way to find the entry and return from events for the kopen system call is
to use the egrep command, as shown below:
# egrep '(^101|^104)' iotracereport | grep kopen
101 trcprog 5832820 kopen 0.000921 kopen LR = D0119BD8
104 trcprog 5832820 kopen 0.000947 return from kopen [26 usec]
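The elapsed time can also be extracted directly from the hook ID 104 events. The following is a minimal sketch that reads a trace report from standard input. It assumes the report was produced with the exec=on,pid=on,svc=on options used in step 12, so that the system call name is the fourth field of each line. The function name syscall_usec is hypothetical.

```shell
# Print the elapsed microseconds reported on each hook 104 (return) event
# for one system call in a trcrpt report read from standard input.
# syscall_usec is a hypothetical helper name.
# Usage: syscall_usec NAME < iotracereport
syscall_usec() {
    awk -v name="$1" '
        $1 == "104" && $4 == name {
            # The elapsed time is the number inside the square brackets.
            if (match($0, /\[[0-9]+ usec\]/))
                print substr($0, RSTART + 1, RLENGTH - 7)
        }
    '
}

# Example:
#   syscall_usec kopen < iotracereport
```

One line is printed per call, so a system call that was issued several times produces several values.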


__ b. What is the file descriptor (fd) value (integer) returned by the open operation to
the trcprog application program that corresponds to the /unix file? __________
» The expected answer is that the file descriptor number is 3.
» The file descriptor can be found just before the hook id 104, as shown in the
example below:
. . . < some output deleted >. . .
15B trcprog 5832820 kopen 0.000947 open fd=3
104 trcprog 5832820 kopen 0.000947 return from kopen [26 usec]
. . . < some output deleted >. . .
» Another way to find the fd value returned by the open operation is to use the
grep command, as shown below:
# grep kopen iotracereport | grep "fd="
15B trcprog 5832820 kopen 0.000947 open fd=3
__ c. How many times did the trcprog application read data from the /unix file?
__________________
Remember the format of the read subroutine is as follows:
read (FileDescriptor, Buffer, NumberBytes)
» As you can see in the following section of an example trace report, the trcprog
application only issues one read operation on the /unix file.
. . . < some output deleted >. . .
101 trcprog 5832820 kread 0.000948 kread LR = D012326C
163 trcprog 5832820 kread 0.000948 read(3,000000002FF21CBC,1000)
. . . < some output deleted >. . .
104 trcprog 5832820 kread 0.000953 return from kread [5 usec]
. . . < some output deleted >. . .
» Another way to see how many times the file has been read is to use the grep
command, as shown below:
# grep "return from kread" iotracereport
104 trcprog 5832820 kread 0.000953 return from kread [5 usec]
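Rather than running one grep per system call, the return events can be tallied in a single pass. The following is a minimal sketch that reads a trace report from standard input and, for each system call, prints the number of hook ID 104 events and the sum of their elapsed times. It assumes the report layout produced in step 12 (system call name in the fourth field). The function name syscall_tally is hypothetical.

```shell
# Summarize hook 104 (return) events in a trcrpt report read from standard
# input: number of calls and total elapsed usec per system call.
# syscall_tally is a hypothetical helper name.
syscall_tally() {
    awk '
        $1 == "104" && match($0, /\[[0-9]+ usec\]/) {
            count[$4]++
            total[$4] += substr($0, RSTART + 1, RLENGTH - 7)
        }
        END {
            for (name in count)
                printf "%s calls=%d usec=%d\n", name, count[name], total[name]
        }
    ' | sort
}

# Example:
#   syscall_tally < iotracereport
```

A line such as kread calls=1 usec=5 confirms that the program issued a single read.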
__ d. Did the trcprog program create any files? ______________


» Yes, the trcprog program created a file called file_just_created, as shown
below:
. . . < some output deleted >. . .
101 trcprog 5832820 creat 0.000954 creat LR = 1000040C
107 trcprog 5832820 creat 0.000955 lookuppn: file_just_created
. . . < some output deleted >. . .
15B trcprog 5832820 creat 0.008626 open fd=4 _FREAD _FCREAT _FTRUNC mode=----w----
104 trcprog 5832820 creat 0.008627 return from creat [7673 usec]
. . . < some output deleted >. . .
» The 15B event indicates that file descriptor 4 is used to reference the newly
created file.
» Another way to see if the program created a file is to use the grep command, as
shown below:
# grep "^101" iotracereport | grep creat
101 trcprog 5832820 creat 0.000954 creat LR = 1000040C
__ e. How many times was a write operation performed to the file_just_created file?
________
How many bytes were written each time? _________
Remember the format of the write subroutine:
write (FileDescriptor, Buffer, NumberBytes)


» As you can see in the following sections of the trace, the file_just_created file
has been written to four times with 0x400 (1024) bytes each time.
. . . < some output deleted >. . .
101 trcprog 5832820 kwrite 0.008628 kwrite LR = D0120104
19C trcprog 5832820 kwrite 0.008628 write(4,000000002FF21CBC,400)
19C trcprog 5832820 kwrite 0.008629 vnop_rdwr_write(vp = F1000A06019F0420, offset = 0000000000000000, length = 0400, flags = 0002, ...) = ...
. . . < some output deleted >. . .
104 trcprog 5832820 kwrite 0.008661 return from kwrite [33 usec]
101 trcprog 5832820 kwrite 0.008662 kwrite LR = D0120104
19C trcprog 5832820 kwrite 0.008662 write(4,000000002FF220BC,400)
19C trcprog 5832820 kwrite 0.008663 vnop_rdwr_write(vp = F1000A06019F0420, offset = 0000000000000400, length = 0400, flags = 0002, ...) = ...
. . . < some output deleted >. . .
104 trcprog 5832820 kwrite 0.008667 return from kwrite [5 usec]
101 trcprog 5832820 kwrite 0.008668 kwrite LR = D0120104
19C trcprog 5832820 kwrite 0.008668 write(4,000000002FF224BC,400)
19C trcprog 5832820 kwrite 0.008669 vnop_rdwr_write(vp = F1000A06019F0420, offset = 0000000000000800, length = 0400, flags = 0002, ...) = ...
. . . < some output deleted >. . .
104 trcprog 5832820 kwrite 0.008673 return from kwrite [5 usec]
101 trcprog 5832820 kwrite 0.008673 kwrite LR = D0120104
19C trcprog 5832820 kwrite 0.008673 write(4,000000002FF228BC,400)
19C trcprog 5832820 kwrite 0.008674 vnop_rdwr_write(vp = F1000A06019F0420, offset = 0000000000000C00, length = 0400, flags = 0002, ...) = ...
. . . < some output deleted >. . .
104 trcprog 5832820 kwrite 0.008678 return from kwrite [5 usec]
. . . < some output deleted >. . .


» Another way to see how many times the file has been written is to use the grep
command, as shown below:
# grep 'write(4' iotracereport
19C trcprog 5832820 kwrite 0.008628 write(4,000000002FF21CBC,400)
19C trcprog 5832820 kwrite 0.008662 write(4,000000002FF220BC,400)
19C trcprog 5832820 kwrite 0.008668 write(4,000000002FF224BC,400)
19C trcprog 5832820 kwrite 0.008673 write(4,000000002FF228BC,400)
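The byte counts in the write events are hexadecimal, so adding them up by eye is error prone. The following is a minimal sketch that counts the write events for one file descriptor and totals the requested bytes. It reads a trace report from standard input and assumes the write(fd,buffer,length) event text appears unbroken on one line, as the grep command above requires. The function name write_totals is hypothetical.

```shell
# Report how many write() events a trcrpt report shows for one file
# descriptor and the total bytes requested (lengths in the report are hex).
# write_totals is a hypothetical helper name.
# Usage: write_totals FD < iotracereport
write_totals() {
    fd=$1
    count=0
    bytes=0
    # Pull the third write() argument (the hex byte count) from each event.
    for len in $(sed -n "s/.*write($fd,[0-9A-Fa-f]*,\([0-9A-Fa-f]*\)).*/\1/p"); do
        count=$(( $count + 1 ))
        bytes=$(( $bytes + 0x$len ))   # 0x prefix makes the shell read hex
    done
    echo "writes=$count bytes=$bytes"
}

# Example:
#   write_totals 4 < iotracereport
```

For the four events shown above, each length of 0x400 is 1024 bytes, so the result is writes=4 bytes=4096.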
__ 14. Examine the following extract from a trace report:
101 trcprog 5832820 kwrite 0.008628 kwrite LR = D0120104
19C trcprog 5832820 kwrite 0.008628 write(4,000000002FF21CBC,400)
19C trcprog 5832820 kwrite 0.008629 vnop_rdwr_write(vp = F1000A06019F0420, offset = 0000000000000000, length = 0400, flags = 0002, ...) = ...
59B trcprog 5832820 kwrite 0.008631 JFS2 IO write: vp = F1000A06019F0420, sid = 8403D0, offset = 0000000000000000, length = 0400
4C3 trcprog 5832820 kwrite 0.008646 VMM WRITE: sid=8403D0 src=FFFFF1000FF21CBC dest=FFFFF00000000000 bytes=0400 basecopy_flags=0008
19C trcprog 5832820 kwrite 0.008661 vnop_rdwr_write(vp = F1000A06019F0420, ext = 0000, ...) = 0000, 0400 bytes moved
104 trcprog 5832820 kwrite 0.008661 return from kwrite [33 usec]
In the table below, indicate which kernel I/O layer each event ID belongs to.
Hook Id   System Call       Virtual File    File System   VMM Layer
          Interface Layer   System Layer    Layer

101

19C

59B

4C3

104

© Copyright IBM Corp. 2013 Exercise 1. I/O Internals Framework 1-11


Course materials may not be reproduced in whole or in part
without the prior written permission of IBM.
Student Exercises with hints

» The table below has been completed with the answers.


Hook Id   System Call       Virtual File    File System   VMM Layer
          Interface Layer   System Layer    Layer
101              X
19C                               X
59B                                              X
4C3                                                           X
104              X
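The layer of each event in the extract can be read from its event text: hook IDs 101 and 104 are the system call entry and return, 19C is a vnode operation (vnop_rdwr_write), 59B is a JFS2 I/O event, and 4C3 is a VMM write. The following minimal sketch encodes that mapping as a lookup. The function names hook_layer and label_layers are hypothetical, and the mapping covers only the five hook IDs in this step.

```shell
# Map a trace hook ID from the step 14 extract to its kernel I/O layer.
# hook_layer and label_layers are hypothetical helper names; only the
# five hook IDs discussed in this step are covered.
hook_layer() {
    case "$1" in
        101|104) echo "System Call Interface Layer" ;;
        19C)     echo "Virtual File System Layer" ;;
        59B)     echo "File System Layer" ;;
        4C3)     echo "VMM Layer" ;;
        *)       echo "unknown" ;;
    esac
}

# Label each event line read from standard input with its layer.
label_layers() {
    while read -r hook rest; do
        printf '%s  %s\n' "$hook" "$(hook_layer "$hook")"
    done
}

# Example:
#   grep kwrite iotracereport | label_layers
```

Reading the labels from top to bottom shows the write request descending from the system call interface to the VMM and then returning.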
__ 15. Examine the following extract from a trace report:
. . . < some output deleted >. . .
106 getty 4325544 0.029628 dispatch: cmd=getty pid=4325544 tid=8781845 priority=60 old_tid=131077 old_priority=255 CPUID=0 [91 usec]
200 getty 4325544 0.029630 resume getty iar=DBFD8 cpuid=00
460 getty 4325544 0.029631 e_assert_wait: tid=8781845 anchor=F1000A00104E18C0 flag=1 lr=14F54
462 getty 4325544 0.029634 e_block_thread: tid=8781845 anchor=F1000A00104E18C0 t_flags=0000 lr=14F54
4B0 getty 4325544 0.029635 undispatch: old_tid=8781845 CPUID=0
10C wait 131076 0.029636 dispatch: idle process pid=131076 tid=131077 priority=255 old_tid=8781845 old_priority=60 CPUID=0
200 wait 131076 0.029637 resume wait iar=92880 cpuid=00
492 wait 131076 0.029638 h_call: start H_CEDE iar=20005 p1=172190164D6383 p2=0000 p3=0000
234 wait 131076 0.039510 clock: iar=0000000000092880 lr=0000000000074DCC [9937 usec]
100 wait 131076 0.039511 DECREMENTER INTERRUPT iar=92880 cpuid=00
100 wait 131076 0.039511 PROCESSING DEFERRED INTERRUPT i_softpri=0400 previous intpri=0B cpuid=00
200 wait 131076 0.039517 resume wait iar=92880 cpuid=00
. . . < some output deleted >. . .
In the table below, indicate which kernel area each event ID belongs to.
Hook Id   Thread        Event        FLIH and   Hypervisor
          Dispatching   Management   Clock      Call Interface

106

200

460

462

4B0

10C

492

234

100

» The table below has been completed with the answers.


Hook Id   Thread        Event        FLIH and   Hypervisor
          Dispatching   Management   Clock      Call Interface
106 X
200 X
460 X
462 X
4B0 X
10C X
492 X
234 X
100 X

__ 16. Let your instructor know that you have completed the exercise.

End of exercise


Exercise 2. Possible Disk I/O Configurations


(with hints)

What this exercise is about


This exercise contains activities in which you will display and examine
information regarding disk resources and device drivers for storage
subsystems.

What you should be able to do


At the end of the exercise, you should be able to:
• Obtain information regarding system resources
• Obtain information about MPIO modules
• Obtain information about storage families and the driver that
manages each family

Requirements
In the normal lab environment for this class, each lab team will be
assigned a logical partition (LPAR) on a managed system. The
assigned logical partition should be running AIX 7.1 and should
normally be on a POWER6 or POWER7 processor-based system.
You will not be sitting directly in front of your lab system. Instead, you
will be using your personal PC to connect to your lab system.


Exercise instructions with hints

Preface
This exercise includes information for you to read, and exercise steps for you to
perform. The following examples illustrate the numbered checklist format used to
identify exercise steps:
__ 1. (This is example step one.) Login to ...
__ 2. (This is example step two.) Execute the following ...
Two versions of these instructions are available: one with hints and one without. You
can use either version to complete this exercise (or flip back and forth between the two
versions). In other words, use these two versions of the exercise in whatever way best
aids your learning. Also, please do not hesitate to ask the instructor if you have
questions.
In some cases, the answer given in a hint may be just an example, and there may be
other correct answers.
All hints are marked by a » sign.


Section 1 - Obtaining information regarding system resources


__ 1. Log on to your assigned AIX system as root.
__ 2. The lparstat command reports logical partition (LPAR) related information and
statistics. Run the lparstat command using the -i option to list details on the
LPAR configuration.
» The command and example output are shown below:
# lparstat -i
Node Name : woolf1
Partition Name : LPAR1
Partition Number : 2
Type : Shared-SMT
Mode : Capped
Entitled Capacity : 0.25
Partition Group-ID : 32770
Shared Pool ID : 0
Online Virtual CPUs : 1
Maximum Virtual CPUs : 1
Minimum Virtual CPUs : 1
Online Memory : 4096 MB
Maximum Memory : 4096 MB
Minimum Memory : 1024 MB
<some output omitted>
Is your system running on a logical partition or on a standalone server?
_________________________________
» Your system is running on a logical partition and the name of the partition in this
example is LPAR1 (Partition Name: LPAR1).
Is your logical partition a dedicated processor partition or a shared processor
partition?
_________________________________
» Your partition is configured as a shared processor partition (Type: Shared-SMT).
How much memory is currently available in your system?
_________________________________
» The amount of memory currently available on your partition is 4096 MB (Online
Memory: 4096 MB).
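The answers to the questions in this step each come from one labeled line of the lparstat -i report, so a small lookup by field name is enough to retrieve them. The following is a minimal sketch that reads the report from standard input and assumes the "Name : value" layout shown in the example output. The function name lpar_field is hypothetical.

```shell
# Print the value of one field from `lparstat -i` output on standard input.
# lpar_field is a hypothetical helper name.
# Usage: lpar_field "Online Memory"
lpar_field() {
    awk -F':' -v key="$1" '
        {
            name = $1
            sub(/[ \t]+$/, "", name)          # trim padding before the colon
            if (name == key) {
                value = substr($0, index($0, ":") + 1)
                sub(/^[ \t]+/, "", value)     # trim the space after the colon
                print value
                exit
            }
        }
    '
}

# On an AIX system this could be run as:
#   lparstat -i | lpar_field "Partition Name"
#   lparstat -i | lpar_field "Type"
#   lparstat -i | lpar_field "Online Memory"
```

With the example output, the three calls return LPAR1, Shared-SMT, and 4096 MB respectively.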
__ 3. The lsdev command displays information about devices in the system and their
characteristics. You can list the disk resources for your partition using the command
lsdev -Cc disk.
Use the lsdev command to list the disk resources for your partition.

» The command and example output are shown below:


# lsdev -Cc disk
hdisk0 Available Virtual SCSI Disk Drive
hdisk1 Available Virtual SCSI Disk Drive
hdisk2 Available Virtual SCSI Disk Drive
How many disks are listed in the output of lsdev -Cc disk on your system?
_____________________________________
» In the normal lab environment for this class, there should be three disks listed.
Are the disks in your system virtual disks?
_____________________________________
» Yes, the disks are virtual disks. A disk resource accessed using a Virtual I/O
Server (VIOS) is described as a "Virtual SCSI Disk Drive".
__ 4. Use the lsdev -Cc adapter command to display the available adapters on your
system. What is the name of the adapter where your virtual disks are connected?
_________________________________
» The command and example output are shown below:
# lsdev -Cc adapter
ent0 Available Logical Host Ethernet Port (lp-hea)
lhea0 Available Logical Host Ethernet Adapter (l-hea)
vsa0 Available LPAR Virtual Serial Adapter
vscsi0 Available Virtual SCSI Client Adapter
» The adapter to which the virtual disks are connected is the virtual adapter
(Virtual SCSI Client Adapter) called vscsi0.
__ 5. The lspv command displays volume information. If the lspv command is entered
with no flags or parameters, the resulting output will list every known volume (disk)
in the system along with its disk name, the physical volume identifier (PVID) if it can
be determined, and the volume group (if any) to which the volume belongs.
» The command and an example of output are shown below:
# lspv
hdisk0 00066bd2f24a3c49 rootvg active
hdisk1 00066bd213398da6 None
hdisk2 none None
How many physical volumes are listed in the output of lspv on your system?
_____________________________
» In the normal lab environment for this class, there should be three volumes
(disks) listed. The number of disks listed should be the same as the number

listed in the output of the lsdev -Cc disk command you entered in a previous
lab step.
How many of the physical volumes listed currently belong to a volume group?
_________
» In the normal lab environment for this class, only one of the disks (hdisk0) listed
will belong to a volume group (rootvg). The other two disks will be placed in a
volume group in a subsequent exercise.
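The last two questions can also be answered with a short filter over the lspv output. This is a minimal sketch, assuming the three-or-four column layout shown in the example (disk name, PVID, volume group, state). The helper names sample_lspv and unassigned_disks are hypothetical, and the sample text stands in for live lspv output.

```shell
#!/bin/sh
# Sample text copied from the lspv example output in this section.
sample_lspv() {
cat <<'EOF'
hdisk0 00066bd2f24a3c49 rootvg active
hdisk1 00066bd213398da6 None
hdisk2 none None
EOF
}

# unassigned_disks
# Print the name of every volume whose volume group column (column 3)
# is "None", meaning the disk does not belong to any volume group.
unassigned_disks() {
    awk '$3 == "None" { print $1 }'
}

sample_lspv | unassigned_disks
```

On the lab system, lspv | unassigned_disks would list the disks that are still free for use in later exercises.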

Section 2 - Obtaining kernel extension-related information


__ 6. Make sure you are still logged in as root.
__ 7. The genkex command can be used to display a list of kernel extensions currently
loaded onto your system and the address, size, and path name for each extension
listed.
Enter the genkex command with no parameters.
» The required command and example output are shown below:
# genkex
Text address Size File
. . . < some output deleted >. . .
f1000000c02d7000 1e000 /usr/lib/drivers/random
5fb0000 220000 /usr/lib/drivers/nfs.ext
5f50000 60000 /usr/lib/drivers/krpc.ext
f1000000c02d5000 2000 /usr/lib/drivers/nfs_kdes.ext
5f20000 20000 /usr/lib/drivers/posix_aiopin
5f00000 20000 /usr/lib/drivers/posix_aio.ext
5ee0000 20000 /usr/lib/drivers/aiopin
5ec0000 20000 /usr/lib/drivers/aio.ext
5e70000 40000 /usr/pmapi/etc/pmsvcs
5e40000 20000 /usr/lib/drivers/perfvmmstat
f1000000c02b9000 1c000 /usr/lib/drivers/ptydd
. . . < some output deleted >. . .
__ 8. Examine the output of the genkex command. For kernel extensions loaded onto the
system, the kernel maintains a linked list consisting of data structures called loader
entries. A loader entry contains the name of the extension, its starting address, and
its size. Observe the file name shown for each kernel extension; it is the full path
name from which the system loads the extension.
Use the file command to examine the aio kernel extension. Is it a 32-bit or 64-bit
executable?
__________________________________
» The required command and example output are shown below:
# file /usr/lib/drivers/aio.ext
/usr/lib/drivers/aio.ext: 64-bit XCOFF executable or object module
not stripped
» The aio kernel extension file is a 64-bit XCOFF executable module. Kernel
extensions are loaded into the kernel address space, which is always a 64-bit
environment with AIX 7.

__ 9. If you want to obtain additional information about a kernel extension listed by the
genkex command, use the following sequence:
__ a. Obtain the fileset name that owns the kernel extension aio.ext by using the
lslpp command with the -w flag.
_______________________________
» The required command and example output are shown below:
# lslpp -w /usr/lib/drivers/aio.ext
File Fileset Type
----------------------------------------------------------
/usr/lib/drivers/aio.ext bos.rte.aio File
__ b. Display the name and the fileset description by using the lslpp command with
the -L flag.
_______________________________
» The required command and example output are shown below:
# lslpp -L | grep aio
bos.rte.aio 7.1.0.1 C F Asynchronous I/O Extension
__ 10. The kdb subcommand lke can also be used to list the loaded kernel extensions.
When invoked with no arguments, the lke subcommand shows a one line summary
of each loader_entry structure in the kernel load list. Enter the lke subcommand at
the kdb prompt and examine the output of this subcommand. Remember that kdb
has a built-in pager; press the <Enter> key to obtain a new page of output.
Note: When kdb starts, it provides an initial display that includes address
information for some key symbols and then provides a prompt. The initial kdb
prompt on your lab system should be (0)>. Various kdb subcommands can be
entered at the kdb prompt.

» The required command and example output are shown below:


# kdb
START END <name>
0000000000001000 00000000057A0000 start+000FD8
F00000002FF47600 F00000002FFDF9C0 __ublock+000000
000000002FF22FF4 000000002FF22FF8 environ+000000
000000002FF22FF8 000000002FF22FFC errno+000000
F1000F0A00000000 F1000F0A10000000 pvproc+000000
F1000F0A10000000 F1000F0A18000000 pvthread+000000
read vscsi_scsi_ptrs OK, ptr = 0xF1000000C01803B0
(0)> lke
ADDRESS FILE FILESIZE FLAGS MODULE NAME
1 F1000000A0644F00 F1000000C02D7000 0001E000 02090252 /usr/lib/drivers/random
2 F1000000A0644000 05F5E000 00001E18 00180248 /unix
3 F1000000A0644D00 05FC0000 00220000 02080252 /usr/lib/drivers/nfs.ext
4 F1000000A0644E00 05F5A000 00001E18 01180248 /unix
5 F1000000A0644B00 05F60000 00060000 02080252 /usr/lib/drivers/krpc.ext
6 F1000000A0644C00 05F56000 00001DD0 00180248 /unix
7 F1000000A0644900 F1000000C02D5000 00002000 02090252 /usr/lib/drivers/nfs_kdes.ext
8 F1000000A0644A00 05F52000 00001DD0 00180248 /unix
. . . < some output deleted >. . .
» It is perfectly normal to see multiple entries for the object named /unix in the
output of the lke subcommand. Certain kernel extensions add system calls to
the kernel. When this happens, a new copy of the system call table is created.
Since the system call table is in the global /unix namespace, this causes a new
loader entry with the name /unix to be created. However, the existence of
multiple /unix entries in the output of the lke subcommand does not mean that
the entire kernel is loaded multiple times.
__ 11. Use the q subcommand of kdb to quit (exit) from kdb.

Section 3 - Obtaining information about MPIO modules


The Multiple Path I/O (MPIO) feature can be used to define alternate paths to a
device for failover purposes and it is installed and configured as part of BOS
installation. MPIO helps provide increased availability of virtual SCSI resources by
allowing for the configuration of redundant paths to the resource.
In order to provide MPIO to AIX client logical partitions, you must have two Virtual
I/O Server logical partitions configured on your system. This procedure assumes
that the disks are already allocated to both the Virtual I/O Server logical partitions
involved in this configuration.
__ 12. Make sure you are still logged in as root.
__ 13. The command lspath displays information about paths to an MPIO capable device.
The lspath -H -l <hdisk#> command displays the status of all possible paths to
a particular disk on your system. How many paths are currently operational on
hdisk0?
» The command and an example of output are shown below:
# lspath -H -l hdisk0
status name parent
Enabled hdisk0 vscsi0
» The hdisk0 disk is not configured as a multi-path I/O capable device, so you will
only see one path available.
__ 14. The devices.common.IBM.mpio.rte fileset contains the default MPIO path control
module (PCM). Use the lslpp command to verify whether it is installed on your system.
» The command and an example of output are shown below:
# lslpp -L | grep mpio
devices.common.IBM.mpio.rte
__ 15. The devices.common.IBM.mpio.rte fileset contains the aixdiskpcmke kernel
extension. Is this kernel extension loaded on your system? The genkex program can
be used to display a list of extensions currently loaded into the kernel on your
system.
» The command and an example of output are shown below:
# genkex | grep aixdiskpcmke
59d0000 30000 /usr/lib/drivers/aixdiskpcmke
» Yes, the aixdiskpcmke kernel extension is loaded into the kernel on your
system.
__ 16. The kdb subcommand lke can also be used to list the loaded kernel extensions.
When invoked with no arguments, the lke subcommand shows a one line summary
of each loader_entry structure in the kernel load list.
Enter the lke subcommand to verify the aixdiskpcmke kernel extension entry.

Note: kdb allows output redirection of subcommands via the operators "|", ">" and
">>". The "|" symbol pipes all output of the command before the symbol, to the
input of the command after the symbol. The ">" operator writes the output of the
command preceding the operator to the file name following the operator; any
existing file is overwritten. The ">>" operator appends the output of the command
preceding the operator to the file name following the operator. This means the
output from kdb commands can be piped to grep to search for a specified pattern.
» The command and an example of output are shown below:
(0)> lke | grep pcm
57 F1000000A063AC00 059D0000 00030000 02080242 /usr/lib/drivers/aixdiskpcmke
(0)>
At which kernel address has the aixdiskpcmke kernel extension been loaded?
______________________________________
» The 3rd output value is the kernel address at which the extension has been
loaded. So in this example, 0x59D0000 is the address at which the aixdiskpcmke
kernel extension has been loaded.
» On POWER6 and newer hardware, on AIX 6 and above, kernel extensions that
are explicitly marked as being storage key safe are loaded into segment 0. lke
will display these addresses as 8 digit hex, with the first digit being a 0. Kernel
extensions that are not explicitly marked as being key safe are loaded into a
different kernel segment. The exact segment number depends on the version of
AIX being used, but generally it will be a very large segment number, such as
F10000000, which results in kdb showing a sixteen digit hex number for the
address.
» The 2nd output value is the address of the kernel loader entry structure that is
being described by the lke command.
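The address formats described in the hints above can be told apart mechanically. The following is a rough sketch based only on that description (8 hex digits with a leading 0 for segment 0, sixteen hex digits otherwise). The function name classify_load_address is hypothetical, and the rule is an assumption that may not hold on every AIX level.

```shell
#!/bin/sh
# classify_load_address ADDR
# ADDR is the load address column from lke output, in hex without "0x".
# Assumption taken from the hint text: 8 digits starting with 0 means the
# extension was loaded into segment 0 (marked storage key safe); 16 digits
# means it was loaded into a different kernel segment.
classify_load_address() {
    addr=$1
    case "$addr" in
        0???????)
            echo "segment 0 (marked storage key safe)" ;;
        ????????????????)
            echo "other kernel segment (not marked key safe)" ;;
        *)
            echo "unrecognized address format" ;;
    esac
}

classify_load_address 059D0000           # aixdiskpcmke in the example
classify_load_address F1000000C02D7000   # random in the earlier lke example
```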
__ 17. Use the q subcommand of kdb to quit (exit) from kdb.
» The command and an example of output are shown below:
(0)> q
Note: You can also enter e at the kdb prompt to terminate a kdb session and return
to the shell prompt. Recall that many kdb subcommands have aliases and that e is
an alias for q.

Section 4 - Obtaining information about storage families and the driver
that manages each family
The manage_disk_drivers command shows information about storage families and the
driver that manages each family. It is also used to change the driver for a storage family.
Each driver has its own characteristics, and a system may have additional drivers
installed besides the ones provided by the base AIX operating system.
The output of manage_disk_drivers has 3 columns of output:
• The first column shows the name of the storage system
• The second indicates the current MPIO driver in use
• The third indicates all supported MPIO drivers for the storage system (a comma
separated list)
__ 18. Make sure you are still logged in as root.
__ 19. Use the manage_disk_drivers -l command to list all the storage families and their
supported drivers.
» The command and an example of output are shown below:
# manage_disk_drivers -l
Device          Present Driver   Driver Options
2810XIV         AIX_AAPCM        AIX_AAPCM,AIX_non_MPIO
DS4100          AIX_APPCM        AIX_APPCM,AIX_fcparray
DS4200          AIX_APPCM        AIX_APPCM,AIX_fcparray
DS4300          AIX_APPCM        AIX_APPCM,AIX_fcparray
DS4500          AIX_APPCM        AIX_APPCM,AIX_fcparray
DS4700          AIX_APPCM        AIX_APPCM,AIX_fcparray
DS4800          AIX_APPCM        AIX_APPCM,AIX_fcparray
DS3950          AIX_APPCM        AIX_APPCM
DS5020          AIX_APPCM        AIX_APPCM
DS5100/DS5300   AIX_APPCM        AIX_APPCM
DS3500          AIX_APPCM        AIX_APPCM
__ 20. Are models of the IBM TotalStorage DS4000 Midrange Disk System supported with
the currently installed drivers?
_____________________________________
» The command and an example of output are shown below:
# manage_disk_drivers -l | grep DS4
DS4100 AIX_APPCM AIX_APPCM,AIX_fcparray
DS4200 AIX_APPCM AIX_APPCM,AIX_fcparray
DS4300 AIX_APPCM AIX_APPCM,AIX_fcparray
DS4500 AIX_APPCM AIX_APPCM,AIX_fcparray
DS4700 AIX_APPCM AIX_APPCM,AIX_fcparray
DS4800 AIX_APPCM AIX_APPCM,AIX_fcparray

» Yes, the default multi-path control module (PCM) supports the IBM TotalStorage
DS4000 Midrange Disk Systems (DS4100, DS4200, DS4300, DS4500, DS4700,
and DS4800).
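Because the third column is a comma-separated list, a short filter can show which storage families have more than one driver to choose from. This is a minimal sketch, assuming the three-column layout described at the start of this section. The helper names sample_drivers and families_with_alternatives are hypothetical, and the sample is an abbreviated copy of the example output rather than live manage_disk_drivers -l output.

```shell
#!/bin/sh
# Abbreviated sample taken from the manage_disk_drivers -l example output.
sample_drivers() {
cat <<'EOF'
Device Present Driver Driver Options
2810XIV AIX_AAPCM AIX_AAPCM,AIX_non_MPIO
DS4100 AIX_APPCM AIX_APPCM,AIX_fcparray
DS3950 AIX_APPCM AIX_APPCM
EOF
}

# families_with_alternatives
# Skip the header line, then print each storage family whose driver
# options column contains a comma (that is, more than one driver).
families_with_alternatives() {
    awk 'NR > 1 && index($3, ",") > 0 {
        print $1 ": " $2 " (options: " $3 ")"
    }'
}

sample_drivers | families_with_alternatives
```

In this sample, DS3950 is not printed because its only supported driver is the one already in use.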
__ 21. Go to the Support Matrix for Subsystem Device Driver, Subsystem Device Driver
Path Control Module, and Subsystem Device Driver Device Specific Module page at
http://www.ibm.com/support/docview.wss?uid=ssg1S7001350. This page is the
entry point for the device drivers’ interoperability matrixes of the following modules
and storage subsystems:
- Subsystem Device Driver Path Control Module (SDDPCM)
- Subsystem Device Driver Device Specific Module (SDDDSM) for ESS
- DS8000
- DS6000
- DS5000
- DS4000
- DS5020
- DS3950
- SVC
- IBM Storwize V7000
- IBM BladeCenter S SAS RAID Controller Module (RSSM)
Click the Support Matrix for AIX SDD link and explore what information is
available.
__ 22. Let your instructor know that you have completed the exercise.

End of exercise

Exercise 3. Conventional I/O Operations


(with hints)

What this exercise is about


This exercise contains activities that will have you examine
conventional I/O read and write operations using the trace facility.

What you should be able to do


At the end of the exercise, you should be able to:
• Examine trace data to analyze conventional read and write I/O
operations

Requirements
In the normal lab environment for this class, each lab team will be
assigned a logical partition (LPAR) on a managed system. The
assigned logical partition should be running AIX 7.1 and should
normally be on a POWER6 or POWER7 processor-based system.
You will not be sitting directly in front of your lab system. Instead, you
will be using your personal PC to connect to your lab system.

Exercise instructions with hints


Part 1 - Setting the exercise environment


__ 1. Log on to your assigned AIX system as root, and change directory to
/home/QV474/ex3.
» The command is shown below:
# cd /home/QV474/ex3
__ 2. Use the lspv command to list the physical volumes (disks) in your system, along
with information about any volume group to which the physical volume belongs.
» The command and an example of output are shown below:
# lspv
hdisk0 00066bd28d9933c3 rootvg active
hdisk1 none None
hdisk2 none None
In the normal lab environment for this class you should have three disks in your
partition. The physical volume hdisk0 should be used for the rootvg volume group,
and hdisk1 and hdisk2 should not be part of any volume group. Let your instructor
know if this is not the case on your system.
__ 3. Use the mkvg command to create the ex3vg volume group, specifying hdisk1 and
hdisk2 as the physical volumes.
» The command and an example of output are shown below:
# mkvg -f -y ex3vg hdisk1 hdisk2
0516-1254 mkvg: Changing the PVID in the ODM.
0516-1254 mkvg: Changing the PVID in the ODM.
ex3vg
__ 4. Use the following mklv command to create a new logical volume in the ex3vg
volume group:
# mklv -y fslv00 -t jfs2 -e x ex3vg 10
__ 5. Use the following crfs command to create a new JFS2 file system called /ex3fs in
the fslv00 logical volume you just created:
# crfs -v jfs2 -d fslv00 -m /ex3fs
__ 6. Mount the newly created /ex3fs file system.
» The command is shown below:
# mount /ex3fs
__ 7. Use the following crfs command to create a new JFS2 file system called /data in
the ex3vg volume group:
# crfs -v jfs2 -g ex3vg -m /data -a size=2G
__ 8. Mount the newly created /data file system.

» The command is shown below:


# mount /data
__ 9. Populate the /ex3fs file system by running the following command:
# /home/QV474/ex3/setup.sh
__ 10. Run the following command to change the default settings of the trace facility:
# trcctl -L 200M -T 10M -o /data/trcdata

Part 2 - Examining I/O operations for app1


__ 11. Run the following command sequence to collect trace data while running the app1
program:
# trace -a ; ./app1 ; trcstop
__ 12. Create a trace report from the collected raw trace data using the following
command:
# trcrpt -O exec=on,tid=on,timestamp=1 -o /data/app1.rpt
__ 13. Change directory to /data.
» The command is shown below:
# cd /data
Use the report information in the app1.rpt report file to answer the following
sequence of questions. For some questions it might be best to use grep to search
the report file for the desired information. For other questions, it might be best to use
an editor such as vi to browse and search the contents of the report. Before using
vi to view the contents of a trace report, if possible you should make the terminal
window as wide as possible. This will make it easier to read the data in the report.
__ 14. How many kopen system calls were made by the app1 program? ____________
» The kopen system call is reported using hook ID 101 (system call entry point).
» One possible method to count the number of kopen system calls made by the
app1 application is shown in the example below:
# grep kopen app1.rpt | grep app1
101 app1 9633975 0.011032 kopen LR = D0119BD8
104 app1 9633975 0.011066 return from kopen [34 usec]
» The expected answer is that the app1 program made one kopen system call.
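The grep pipeline above prints both the entry (101) and return (104) events, so the count has to be read off by eye. The following is a minimal sketch of a filter that counts only entry events, assuming the column order shown in the example (hook ID, process name, thread ID, elapsed time, then the system call name). The helper names sample_trace and count_syscall_entries are hypothetical, and the sample lines are copied from the example output in this exercise.

```shell
#!/bin/sh
# Sample report lines copied from the example output in this exercise.
sample_trace() {
cat <<'EOF'
101 app1 9633975 0.011032 kopen LR = D0119BD8
104 app1 9633975 0.011066 return from kopen [34 usec]
101 app1 9633975 0.011067 kread LR = D012326C
104 app1 9633975 0.011519 return from kread [452 usec]
EOF
}

# count_syscall_entries PROC SYSCALL
# Count hook ID 101 (system call entry) events where the process name
# is PROC and the fifth column is SYSCALL. Prints 0 if none are found.
count_syscall_entries() {
    awk -v proc="$1" -v call="$2" '
        $1 == "101" && $2 == proc && $5 == call { n++ }
        END { print n + 0 }'
}

sample_trace | count_syscall_entries app1 kopen
```

Against the real report, the call would be count_syscall_entries app1 kopen < app1.rpt.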
__ 15. What is the name of the file that was opened by the app1 program? ____________
» From the previous step we observed that the app1 program made one kopen
system call. From the information displayed in the output of grep, we can
determine the timestamp value of the 101 event for the kopen call. We can use
this information to find the relevant area in the app1.rpt file.
» In the following example output, the kopen system call has a timestamp value of
0.011032.
# grep kopen app1.rpt | grep app1
101 app1 9633975 0.011032 kopen LR = D0119BD8
104 app1 9633975 0.011066 return from kopen [34 usec]
» Using this information, we can use vi to examine the app1.rpt file, and then
search for the timestamp value. This will allow us to view the events for the

kopen system call. An example of the event sequence at the start of the kopen
call is shown below:
101 app1 9633975 0.011032 kopen LR = D0119BD8
3B7 app1 9633975 0.011033 SECURITY: refmon exit: rc=0 action=0000000000000015
52F app1 9633975 0.011034 SEC CRED: crref callfrom=0000000000630D98 callfrom2=0000000000631A70 pid=5111824 (app1)
107 app1 9633975 0.011034 lookuppn: /ex3fs/smallfile
» The information shown in the lookuppn (lookup path name) event with hook ID
107 shows the name of the file passed to the kopen call.
» The expected answer is that the app1 program opened a file called
/ex3fs/smallfile.
__ 16. Which file descriptor number was used to reference the file opened by the app1
program? _______________
» The file descriptor information is shown in an event with hook ID 15B just before
the 104 return from kopen event. An example is shown below:
4DF app1 9633975 0.011065 JFS2 iput: vp = F1000A0601CB2820, count = 0001, ino = 0002, nlink = 0003, getcaller = 33D0EC
15B app1 9633975 0.011065 open fd=3 _FWRITE
104 app1 9633975 0.011066 return from kopen [34 usec]
» The expected answer is that file descriptor number 3 is used to reference the file
opened by app1.
__ 17. How many kread system calls were made by the app1 program to read the contents
of the file that was opened? _______________
» We know that file descriptor number 3 is used for the file opened by app1. We
can count the number of read calls using the following command sequence:
# grep "read(3" app1.rpt | grep app1
163 app1 9633975 0.011067 read(3,0000000020000848,800)
» The expected answer is that one read call is issued by app1.
__ 18. What was the read request size used with the first kread system call? ___________
» From the event information obtained while answering the previous question, we
can observe the parameters passed to the read routine:
read(3,0000000020000848,800)
» The third parameter is the read request size. The expected answer is 800
(hexadecimal), which is 2048 bytes in decimal.
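The trace report prints request sizes in hexadecimal, so a quick conversion helper is handy when answering the size questions in this exercise. This is a minimal sketch; the function name hex_to_dec is hypothetical, and it relies only on the standard printf utility accepting a 0x-prefixed constant.

```shell
#!/bin/sh
# hex_to_dec HEX
# Convert a hexadecimal value (written without the 0x prefix, as it
# appears in the trace report) to decimal.
hex_to_dec() {
    printf '%d\n' "0x$1"
}

hex_to_dec 800    # prints 2048
hex_to_dec 17     # prints 23
hex_to_dec 1000   # prints 4096
```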
__ 19. For the first kread system call, was the data requested obtained from the file system
cache? ____________

» From the information displayed in the output of the previous grep command, we
can determine the timestamp value of the 163 event for the read call. We can
use this information to find the relevant area in the app1.rpt file.
» In the following example output, the read system call event has a timestamp
value of 0.011067.
# grep "read(3" app1.rpt | grep app1
163 app1 9633975 0.011067 read(3,0000000020000848,800)
» Using this information, we can use vi to examine the app1.rpt file, and then
search for the timestamp value. This will allow us to view the events for the read
system call. An example of the event sequence at the start of the kread call is
shown below:
101 app1 9633975 0.011067 kread LR = D012326C
163 app1 9633975 0.011067 read(3,0000000020000848,800)
163 app1 9633975 0.011067 vnop_rdwr_read(vp = F1000A0601CE2820, offset = 0000000000000000, length = 0800, flags = 0003, ...) = ...
59B app1 9633975 0.011068 JFS2 IO read: vp = F1000A0601CE2820, sid = 87845E, offset = 0000000000000000, length = 0017
100 app1 9633975 0.011070 DATA ACCESS PAGE FAULT iar=B674 cpuid=00
1B2 app1 9633975 0.011071 VMM pagefault: V.S=0000.87845E client_segment P_DEFAULT 4K large modlist req (type 0)
1B0 app1 9633975 0.011076 VMM page assign: V.S=0000.87845E ppage=23B73 client_segment P_DEFAULT 4K large modlist req (type 0)
» The expected answer is that the event sequence shows a data access page fault
immediately after the JFS2 IO read event. The read request is using offset 0,
reading from the start of the file. The page fault is on page 0 of the segment
being used for the file by the cache. This indicates that the data being requested
by the read call is not currently in the file system cache.
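The reasoning in this hint can be approximated by a filter that looks for a page fault between the JFS2 IO read event and the return from kread. This is a rough heuristic sketch, not a definitive test: it assumes the hook IDs and column order shown in the example (101 entry, 59B JFS2 IO read, 100 page fault, 104 return), and it does not check that the fault is on the segment used for the file. The helper names sample_kread_events and first_kread_cache_status are hypothetical.

```shell
#!/bin/sh
# Abbreviated sample of the kread event sequence from this exercise,
# with each event on a single line.
sample_kread_events() {
cat <<'EOF'
101 app1 9633975 0.011067 kread LR = D012326C
59B app1 9633975 0.011068 JFS2 IO read: vp = F1000A0601CE2820, sid = 87845E, offset = 0000000000000000, length = 0017
100 app1 9633975 0.011070 DATA ACCESS PAGE FAULT iar=B674 cpuid=00
1B2 app1 9633975 0.011071 VMM pagefault: V.S=0000.87845E
104 app1 9633975 0.011519 return from kread [452 usec]
EOF
}

# first_kread_cache_status PROC
# Print "miss" if a page fault (hook 100) follows the JFS2 IO read event
# (hook 59B) before the first kread of PROC returns, "hit" if the kread
# returns without such a fault, or "unknown" if no complete kread is seen.
first_kread_cache_status() {
    awk -v proc="$1" '
        $2 != proc { next }
        $1 == "101" && $5 == "kread" { in_read = 1; next }
        in_read && $1 == "59B" { seen_io = 1; next }
        in_read && seen_io && $1 == "100" && /PAGE FAULT/ { fault = 1 }
        in_read && $1 == "104" && /return from kread/ { done = 1; exit }
        END {
            if (!done) print "unknown"
            else if (fault) print "miss"
            else print "hit"
        }'
}

sample_kread_events | first_kread_cache_status app1
```

A "miss" result corresponds to the expected answer above; browsing the report in vi remains the way to confirm which segment and page the fault refers to.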
__ 20. For the first kread system call, how many bytes were actually read? ____________
» The following example command sequence shows how to determine the
timestamp value associated with the return from kread 104 event for the app1
program:
# grep "return from kread" app1.rpt | grep app1
104 app1 9633975 0.011519 return from kread [452 usec]
» Using this information, we can use vi to examine the app1.rpt file, and then
search for the timestamp value. This will allow us to view the events for the end

of the read system call. An example of the event sequence at the end of the call
is shown below:
163 app1 9633975 0.011518 vnop_rdwr_read(vp = F1000A0601CE2820, ext = 0000, ...) = 0000, 0017 bytes moved
104 app1 9633975 0.011519 return from kread [452 usec]
» The expected answer is that 17 (hexadecimal) bytes (23 in decimal) were read,
as indicated in the information in the 163 event.
__ 21. How many kwrite system calls were made by the app1 program to write information
to the file that was opened? _______________
» We know that file descriptor number 3 is used for the file opened by app1. We
can count the number of write calls using the following command sequence:
# grep "write(3" app1.rpt | grep app1
19C app1 9633975 0.011520 write(3,0000000020000848,1000)
19C app1 9633975 0.011544 write(3,0000000020000848,1000)
» The expected answer is that two write calls are issued by app1.
__ 22. What was the write request size used with the first kwrite system call? ___________
» From the event information obtained while answering the previous question, we
can observe the parameters passed to the write routine:
write(3,0000000020000848,1000)
» The third parameter is the write request size. The expected answer is 1000
(hexadecimal), which is 4096 bytes in decimal.

Part 3 - Cleanup


__ 23. Change directory to /home/QV474/ex3 so that you are no longer in /data.
» The command is shown below:
# cd /home/QV474/ex3
__ 24. Unmount the /data file system.
» The command is shown below:
# unmount /data
__ 25. Unmount the /ex3fs file system.
» The command is shown below:
# unmount /ex3fs
__ 26. Remove the ex3vg volume group and all logical volumes it contains using the
following command:
# reducevg -f -d ex3vg hdisk1 hdisk2
__ 27. Let your instructor know that you have completed the exercise.

End of exercise

Exercise 4. Specialized I/O Operations


(with hints)

What this exercise is about


This exercise contains activities in which you will examine and
compare DIO activities, and monitor AIO usage on JFS2 both with and
without CIO.

What you should be able to do


At the end of the exercise, you should be able to:
• Compare performance of conventional I/O and direct I/O
• Examine filemon reports to analyze direct I/O operations
• Examine trace reports to analyze successful and demoted direct
I/O operations
• Show the initial AIO tunable values
• Monitor AIO server activity
• Compare AIO operations on JFS2 with and without CIO

Requirements
In the normal lab environment for this class, each lab team will be
assigned a logical partition (LPAR) on a managed system. The
assigned logical partition should be running AIX 7.1 and should
normally be on a POWER6 or POWER7 processor-based system.
You will not be sitting directly in front of your lab system. Instead, you
will be using your personal PC to connect to your lab system.

Exercise instructions with hints



Part 1 - Setting the exercise environment


__ 1. Log on to your assigned AIX system as root, and change directory to
/home/QV474/ex4.
» The command is shown below:
# cd /home/QV474/ex4
__ 2. Use the lspv command to list every known volume (disk) in your system along with
the name of the volume group to which the volume belongs.
» The command and an example of output are shown below:
# lspv
hdisk0 00066bd2f24a3c49 rootvg active
hdisk1 00066bd213398da6 None
hdisk2 00066bd2d6aa7b19 None
In the normal lab environment for this class, hdisk2 should be available, and not
assigned to any volume group. Let your instructor know if hdisk2 is not available.
__ 3. Use the mkvg command to create the ex4vg volume group specifying hdisk2 as a
physical volume.
» The command and an example of output are shown below:
# mkvg -f -y ex4vg hdisk2
ex4vg
__ 4. Use the crfs command to create a new JFS2 file system of 1 GB in the ex4vg
volume group. Specify /convio as the mount point.
» The command and an example of output are shown below:
# crfs -v jfs2 -g ex4vg -m /convio -a size=1G
File system created successfully.
1048340 kilobytes total disk space.
New File System size is 2097152
__ 5. Use the crfs command to create a new JFS2 file system of 1 GB in the ex4vg
volume group. This time specify /dio as the mount point.
» The command and an example of output are shown below:
# crfs -v jfs2 -g ex4vg -m /dio -a size=1G
File system created successfully.
1048340 kilobytes total disk space.
New File System size is 2097152
__ 6. Use the crfs command to create a new JFS2 file system of 1 GB in the ex4vg
volume group. This time specify /aio as the mount point.


» The command and an example of output are shown below:


# crfs -v jfs2 -g ex4vg -m /aio -a size=1G
File system created successfully.
1048340 kilobytes total disk space.
New File System size is 2097152
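Note: The figure 2097152 in the crfs output can be checked by hand. The following is a minimal Python sketch of that arithmetic, assuming that crfs reports the new file system size in 512-byte blocks. The function and variable names are illustrative only and are not part of the course materials.

```python
# Assumption: crfs reports "New File System size" in 512-byte blocks.
BLOCK_BYTES = 512


def size_in_blocks(size_bytes, block_bytes=BLOCK_BYTES):
    """Convert a size in bytes to a count of fixed-size blocks."""
    return size_bytes // block_bytes


one_gib = 1024 ** 3  # size=1G requested on the crfs command line
print(size_in_blocks(one_gib))  # prints 2097152
```

The reported 1048340 kilobytes of total disk space is slightly less than the full 1048576 KB. The difference is presumably space that the file system reserves for its own metadata.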
__ 7. Mount the /convio and /aio file systems. Use the default options.
» The commands are shown below:
# mount /convio
# mount /aio
__ 8. Mount the /dio file system with the -o dio option. In this case, all the files in the /dio
directory will be accessed using direct I/O.
» The command is shown below:
# mount -o dio /dio
__ 9. Check if the /convio, /aio, and /dio file systems are mounted correctly.
» The command and an example of output are shown below:
# mount | tail -3
/dev/fslv00 /convio jfs2 Apr 06 04:05 rw,log=/dev/loglv00
/dev/fslv01 /dio jfs2 Apr 06 04:07 rw,dio,log=/dev/loglv00
/dev/fslv03 /aio jfs2 Apr 11 13:01 rw,log=/dev/loglv00
__ 10. In this step you will run the writefile program in two windows at the same time,
and measure the results. The writefile program creates a sequential file with a
block size of 4096 bytes. In one window you will create a file using conventional I/O,
and in the other one using direct I/O.
__ a. Open a second window to your system, login as the root user and change
directory to /home/QV474/ex4.
» The command is shown below:
# cd /home/QV474/ex4
__ b. In this second window, type the following command, but do not hit return to run
it yet.
# time ./writefile 10000 /dio/file
__ c. In the first window, type the following command, but do not hit return to run it
yet.
# time ./writefile 10000 /convio/file
__ d. Now that these commands are set up, hit return on both windows, wait for the
results, and then fill in the table with the time command statistics.

Time command statistics   Conventional I/O (first window)   Direct I/O (second window)
real                      __________                        __________
user                      __________                        __________
sys                       __________                        __________
» The following table contains the example answers:
Time command statistics   Conventional I/O (first window)   Direct I/O (second window)
real                      0m0.94s                           0m40.45s
user                      0m0.08s                           0m0.10s
sys                       0m0.12s                           0m0.33s

__ 11. What is happening with the direct I/O test to make it so slow?
________________________________________________________________
» When a file is written with direct I/O, the file system cache is not used, so
the VMM’s write-behind mechanism is not used either. Every DIO write goes
directly from the user buffer to the physical volume and is treated as a
synchronous write.
» Conventional I/O is a cached I/O that has multiple advantages:
- When writing to a new file or extending an existing file, the write system call
copies the data into the file system cache and then returns to the application.
The page in the cache is written to disk at a later time by one of a number of
different kernel mechanisms.
- The 4 KB file page that is brought into the file system cache when a
single byte is read can be reused by the application on subsequent read
requests.
- Read-ahead can occur with conventional cached I/O, further reducing the
latency of future read requests.
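Note: The example timings can be turned into an approximate cost per write. The following is a minimal Python sketch, assuming that the first argument of writefile (10000) is the number of 4096-byte blocks written. That meaning is inferred from the exercise text and is not documented here. The real times come from the example answers table.

```python
BLOCK_SIZE = 4096     # block size stated in the exercise text
BLOCK_COUNT = 10000   # assumption: first writefile argument is a block count


def ms_per_write(real_seconds, writes):
    """Average elapsed milliseconds per write call."""
    return real_seconds * 1000.0 / writes


conventional = ms_per_write(0.94, BLOCK_COUNT)   # example answer, first window
direct = ms_per_write(40.45, BLOCK_COUNT)        # example answer, second window

print(f"file size:    {BLOCK_SIZE * BLOCK_COUNT} bytes")
print(f"conventional: {conventional:.3f} ms per write")
print(f"direct:       {direct:.3f} ms per write")
```

Under that assumption, each direct I/O write costs about 4 ms of elapsed time, which is consistent with waiting for a physical disk write, while a conventional write returns in under 0.1 ms because it only copies data into the cache.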


Part 2 - Examining successful DIO and demoted DIO scenarios


__ 12. In this section of the exercise, you will examine the performance of direct I/O for two
different scenarios, successful and demoted direct I/O.
Make sure you still have two windows logged in as root in the /home/QV474/ex4
directory.
__ 13. Duplicate the file (/dio/file) that you created in step 10b. Use the following
command:
# cp /dio/file /dio/filecopy
__ 14. In this step you will run the dd command to read (using DIO) in two windows at the
same time, one with successful direct I/O, and the second one with demoted direct
I/O. Notice the block sizes on the demoted DIO command.
__ a. In the first window, type the following command, but do not hit return to run it
yet.
# time dd if=/dio/file of=/dev/null ibs=4096
__ b. In the second window, type the following command, but do not hit return to run
it yet.
# time dd if=/dio/filecopy of=/dev/null ibs=4095
__ c. Now that these commands are set up, hit return on both windows, wait for the
results, and then fill in the table with the time command statistics.

Time command statistics   Successful Direct I/O (first window)   Demoted Direct I/O (second window)
real                      __________                             __________
user                      __________                             __________
sys                       __________                             __________

» The following table contains the example answers:

Time command statistics   Successful Direct I/O (first window)   Demoted Direct I/O (second window)
real                      0m10.16s                               0m10.58s
user                      0m0.35s                                0m0.34s
sys                       0m0.60s                                0m0.72s
__ 15. What is happening with the direct I/O tests (DIO and demoted DIO)?
________________________________________________________________
» The results are similar, with the demoted direct I/O a little bit slower.


» When direct I/O is demoted (in this example, because the alignment
requirement is not satisfied), the requested bytes are copied into the file
system cache and then into the application buffer, incurring the CPU cost of
copying the data twice.
» Also, read-ahead does not occur when direct I/O is demoted.
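Note: The effect of the block size on demotion can be sketched as a simple alignment check. The following Python sketch is a simplification, not the JFS2 implementation. It assumes that a request stays direct only when both its file offset and its length are multiples of 4 KB, and the actual file system may apply further conditions. All names are illustrative.

```python
PAGE = 4096  # 4 KB, the alignment unit used in this exercise


def is_dio_eligible(offset, length, unit=PAGE):
    """Simplified check: offset and length must both be multiples of unit."""
    return length > 0 and offset % unit == 0 and length % unit == 0


def count_eligible(total_bytes, request_size):
    """Count sequential requests of request_size that pass the check."""
    eligible = 0
    offset = 0
    while offset < total_bytes:
        length = min(request_size, total_bytes - offset)
        if is_dio_eligible(offset, length):
            eligible += 1
        offset += length
    return eligible


print(count_eligible(40960, 4096))  # every request is aligned
print(count_eligible(40960, 4095))  # no request is aligned
```

With ibs=4096 every sequential request passes the check. With ibs=4095 none does, which corresponds to the demoted case in the second window.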


Part 3 - Using filemon reports to examine direct I/O writes


__ 16. In this section of the exercise, you will examine reports generated by the filemon
command to analyze successful and demoted direct I/O writes.
If you are not already logged in, login to your assigned system as the root user and
change directory to /home/QV474/ex4.
__ 17. Run the following command to generate a filemon output to report the I/O activity
on successful write operations.
# filemon -O all,detailed -o filemon1.out; dd if=/dio/file \
of=/dio/file1 ibs=1024k obs=256k count=10; trcstop
__ 18. Use the filemon1.out report generated in the previous step to answer the following
questions.
» The command and an example of output are shown below:
# pg filemon1.out
Mon Apr 8 11:52:20 2013
System: AIX 7.1 Node: woolf1 Machine: 00066BD2D900
Cpu utilization: 39.8%
Cpu allocation: 81.8%

Most Active Files


------------------------------------------------------------------
#MBs #opns #rds #wrs file volume:inode
------------------------------------------------------------------
10.0 1 10 0 file /dev/fslv01:4
10.0 1 0 40 file1 /dev/fslv01:5
. . . < some output deleted >. . .
__ a. Most active file being written: ______________________
» file1
__ b. Number of reads from this file: _____________________
» None
__ c. Number of writes to this file: ______________________
» 40


__ d. The I/O was supposed to be DIO. Were there any demoted DIOs? _________
# pg filemon1.out
. . . < some output deleted >. . .
Most Active Segments
------------------------------------------------------------------
#MBs #rpgs #wpgs segid segtype volume:inode
------------------------------------------------------------------
. . . < some output deleted >. . .
» The DIO was not demoted because there is no segment activity caused by the
VMM file system cache. You cannot always tell from the filemon output whether
DIOs were demoted. In those cases, you would need to look at the system
trace file for a deeper analysis.
__ e. What was the most active logical volume? ______________________
» /dev/fslv01 was the most active logical volume.
# pg filemon1.out
. . . < some output deleted >. . .
Most Active Logical Volumes
---------------------------------------------------------------
util #rblk #wblk KB/s volume description
---------------------------------------------------------------
0.87 20480 20480 40923.9 /dev/fslv01 /dio
0.01 0 8 8.0 /dev/loglv00 jfs2log
. . . < some output deleted >. . .
__ f. What was the most active logical volume being written? _________________
» /dev/fslv01 was the most active logical volume being written.
__ g. What was the most active physical volume? ____________________
» /dev/hdisk2 was the most active physical volume.
# pg filemon1.out
. . . < some output deleted >. . .
Most Active Physical Volumes
---------------------------------------------------------------
util #rblk #wblk KB/s volume description
---------------------------------------------------------------
0.88 20480 20488 40931.9 /dev/hdisk2 N/A
. . . < some output deleted >. . .
__ h. What was the utilization of the most active physical volume? _________
» The utilization was 88%.
__ i. For the most active file being written, answer the following questions:


Note: Examine the Detailed File Stats section of the filemon output.
# pg filemon1.out
---------------------------------------------------------------
Detailed File Stats
---------------------------------------------------------------
. . . < some output deleted >. . .
FILE: /dio/file1 volume: /dev/fslv01 inode: 5
opens: 1
total bytes xfrd: 10485760
writes: 40 (0 errs)
write sizes (bytes): avg 262144.0 min 262144 max 262144 sdev 0.0
write times (msec): avg 10.548 min 4.835 max 24.226 sdev 5.880
lseeks: 2
. . . < some output deleted >. . .
a. How many total bytes were transferred? ________________________
» 10485760 bytes.
b. What was the write size, and was it consistent? ____________________
» 262144 bytes. The avg, min, and max were the same, so it was consistent.
c. What was the average write time? __________________________
» 10.548 msec.
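Note: The figures in the filemon1.out sections can be cross-checked against each other. The following is a minimal Python sketch of that check, assuming that the #rblk and #wblk columns count 512-byte blocks. The variable names are illustrative only.

```python
FILEMON_BLOCK = 512  # assumption: #rblk and #wblk are 512-byte blocks

writes = 40            # "writes:" in Detailed File Stats
write_size = 262144    # avg, min, and max write size in bytes
lv_wblk = 20480        # #wblk for /dev/fslv01
log_wblk = 8           # #wblk for /dev/loglv00
pv_wblk = 20488        # #wblk for /dev/hdisk2

total_from_writes = writes * write_size
total_from_blocks = lv_wblk * FILEMON_BLOCK

print(total_from_writes, total_from_blocks)
print(lv_wblk + log_wblk == pv_wblk)
```

Both totals equal the 10485760 bytes reported as "total bytes xfrd". The physical volume shows 8 more written blocks than /dev/fslv01, which matches the 8 blocks written to the JFS2 log.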
__ 19. Run the following command to generate another filemon output to report the I/O
activity on demoted DIO write operations. Notice the output block size of the dd
command (obs=255k).
# filemon -O all,detailed -o filemon2.out; dd if=/dio/file \
of=/dio/file2 ibs=1024k obs=255k count=10; trcstop
__ 20. Use the filemon2.out report generated in the previous step to answer the following
questions.
» The command and an example of output are shown below:
# pg filemon2.out
Mon Apr 8 12:31:18 2013
System: AIX 7.1 Node: woolf1 Machine: 00066BD2D900
Cpu utilization: 42.0%
Cpu allocation: 97.2%

Most Active Files


------------------------------------------------------------------
#MBs #opns #rds #wrs file volume:inode
------------------------------------------------------------------
10.0 1 10 0 file /dev/fslv01:4
10.0 1 0 41 file2 /dev/fslv01:6
. . . < some output deleted >. . .
__ a. Most active file being written: ______________________


» file2.
__ b. Number of reads from this file: ___________
» None.
__ c. Number of writes to this file: ___________
» 41.
__ d. The I/O was supposed to be DIO. Were there any demoted DIOs? ________
# pg filemon2.out
. . . < some output deleted >. . .
Most Active Segments
------------------------------------------------------------------
#MBs #rpgs #wpgs segid segtype volume:inode
------------------------------------------------------------------
10.2 30 2580 81c687 client
. . . < some output deleted >. . .
» Yes, the I/O was demoted. There was a segment associated with I/Os (client
segment type). The segment was probably used by VMM for file system cache
activities. You cannot always tell from the filemon output whether DIOs were
demoted. In those cases, you would need to look at the system trace file for a
deeper analysis.
__ e. What was the most active logical volume? __________________________
# pg filemon2.out
. . . < some output deleted >. . .
Most Active Logical Volumes
------------------------------------------------------------------
util #rblk #wblk KB/s volume description
------------------------------------------------------------------
0.89 20720 20720 35002.5 /dev/fslv01 /dio
0.02 0 16 13.5 /dev/loglv00 jfs2log
. . . < some output deleted >. . .
» /dev/fslv01 is the most active logical volume.
__ f. What was the most active logical volume being written? _______________
» /dev/fslv01 was the most active logical volume being written.


__ g. What was the most active physical volume? ______________________


# pg filemon2.out
. . . < some output deleted >. . .
Most Active Physical Volumes
------------------------------------------------------------------
util #rblk #wblk KB/s volume description
------------------------------------------------------------------
0.91 20720 20736 35016.0 /dev/hdisk2 N/A
. . . < some output deleted >. . .
» /dev/hdisk2 was the most active physical volume.
__ h. What was the utilization of the most active physical volume?_______________
» The utilization was 91%.
__ i. For the most active file being written, answer the following questions:
Note: Examine the Detailed File Stats section of the filemon output.
# pg filemon2.out
------------------------------------------------------------------
Detailed File Stats
------------------------------------------------------------------
. . . < some output deleted >. . .
FILE: /dio/file2 volume: /dev/fslv01 inode: 6
opens: 1
total bytes xfrd: 10485760
writes: 41 (0 errs)
write sizes (bytes): avg 255750.2 min 40960 max 261120 sdev 33961.3
write times (msec): avg 12.824 min 4.250 max 22.335 sdev 5.237
lseeks: 2
. . . < some output deleted >. . .
a. How many total bytes were transferred? ________________________
» 10485760 bytes.
b. What was the write size, and was it consistent? ____________________
» Average size 255750.2 bytes, minimum size 40960 bytes, and maximum size
261120 bytes. The avg, min, and max were not the same, so the write size was
not consistent.
c. What was the average write time? __________________________
» 12.824 msec.
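Note: The write count and write sizes reported for file2 follow from the dd block sizes. The following is a minimal Python sketch of that arithmetic, assuming that dd regroups the 10 input blocks of 1024 KB into full output blocks of obs bytes followed by one final short write. The function name is illustrative only.

```python
def write_sizes(total_bytes, obs):
    """Split total_bytes into output writes of at most obs bytes each."""
    sizes = []
    remaining = total_bytes
    while remaining > 0:
        chunk = min(obs, remaining)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


total = 10 * 1024 * 1024                 # count=10 blocks of ibs=1024k
sizes = write_sizes(total, 255 * 1024)   # obs=255k

print(len(sizes), min(sizes), max(sizes), round(sum(sizes) / len(sizes), 1))
# prints: 41 40960 261120 255750.2
print(max(sizes) % 4096)                 # nonzero: not a multiple of 4 KB
```

The result matches the 41 writes and the avg, min, and max sizes in the Detailed File Stats. The same function with obs=256k gives 40 writes of 262144 bytes, as seen in filemon1.out.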
__ 21. Using the answers from the filemon outputs in the previous steps, fill in the table
below.
                                 filemon1.out    filemon2.out
Most active file being written   __________      __________
#wrs                             __________      __________
Demoted DIOs?                    __________      __________
Most active segment (VSID)       __________      __________
#wpgs                            __________      __________
Most active logical volume       __________      __________
util (logical volume)            __________      __________
Most active physical volume      __________      __________
util (physical volume)           __________      __________
Most active file being written   __________      __________
total bytes xfrd                 __________      __________
write size (avg)                 __________      __________
write msec (avg)                 __________      __________
» The following table contains the example answers:
                                 filemon1.out    filemon2.out
Most active file being written   file1           file2
#wrs                             40              41
Demoted DIOs?                    no              yes
Most active segment (VSID)       (none)          81c687
#wpgs                            (none)          2580
Most active logical volume       /dev/fslv01     /dev/fslv01
util (logical volume)            87%             89%
Most active physical volume      /dev/hdisk2     /dev/hdisk2
util (physical volume)           88%             91%
Most active file being written   file1           file2
total bytes xfrd                 10485760        10485760
write size (avg)                 262144.0        255750.2
write msec (avg)                 10.548          12.824

How would you summarize the differences between the two runs?
________________________________________________
» The second run was slower than the first run (average write time of 12.824
msec compared with 10.548 msec) because its demoted DIO writes had to go
through the VMM, so there was more overhead.
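Note: The size of the difference between the two runs can be expressed as a percentage. The following is a minimal Python sketch using the example values from filemon1.out and filemon2.out. The function name is illustrative only, and your own measurements will differ.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Average write time in msec (Detailed File Stats)
write_ms = percent_change(10.548, 12.824)
# Logical volume throughput in KB/s (Most Active Logical Volumes)
lv_kbps = percent_change(40923.9, 35002.5)

print(f"average write time: {write_ms:+.1f}%")
print(f"/dev/fslv01 KB/s:   {lv_kbps:+.1f}%")
```

For the example data, the demoted run shows an average write time about 21.6% higher and a logical volume throughput about 14.5% lower than the successful DIO run.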


Part 4 - Using a trace report to examine successful direct I/O writes


__ 22. In this section of the exercise, you will examine trace information to analyze
successful direct I/O writes.
If you are not already logged in, login to your assigned system as the root user and
change directory to /home/QV474/ex4.
__ 23. Run the following command to generate a trace output to examine successful DIO
writes.
# trace -J syscall,jfs2,vnops,filepvld,vmm -x "./dio_w -w 1024000 -b 4096 -f \
/dio/file3"; trcrpt -O exec=on,pid=on,svc=on,timestamp=1 -o trace3.out
__ 24. Explore the trace3.out report generated in the previous step.
# pg trace3.out
Mon Apr 8 16:49:05 2013
System: AIX 7.1 Node: woolf1
Machine: 00066BD2D900
Internet Address: 091B198D 9.27.25.141
At trace startup, the system contained 2 cpus, of which 2 were traced.
Buffering: Kernel Heap
This is from a 64-bit kernel.
Tracing only these hooks,
1010,1040,1060,1070,10a0,10b0,1200,1290,12e0,1300,1340,1350
,1360,1390,13a0,13c0,13d0,14c0,1500,1520,1540,1560,15b0,1630,1640,1670,1680,18b0
,1940
,19c0,1a00,1b00,1b10,1b20,1b30,1b40,1b50,1b60,1b70,1b80,1b90,1ba0,1bb0,1bc0,1bd0
,1be0
,1bf0,1d90,1da0,1db0,1dc0,1dd0,2210,2220,2230,2a00,2a10,2a20,2fc0,45a0,45b0,4a50
,4c10
,4c20,4c30,4ca0,4cb0,4cc0,4cd0,4ce0,4cf0,4d00,4d10,4d20,4d30,4d40,4d50,4db0,4de0
,4df0
,4e00,4e10,4e20,4e30,4e40,4ef0,4f00,4f10,4f20,4f90,4fa0,59b0,5c00,5ca0,62c0

trace -J syscall,jfs2,vnops,filepvld,vmm -x ./dio_w -w 1024000 -b 4096 -f /dio/file3
. . . < some output deleted >. . .
__ 25. Use the trace3.out report to answer the following questions.
__ a. What was the command name and PID of the process being executed?
Note: The trace hook ID for the exec system call is 134.
# grep "^134" trace3.out
134 -3145924- 3145924 execve 0.000307 exec:
cmd=./dio_w -w 1024000 -b 4096 -f /dio/file3 pid=3145924
tid=13369371
Command name: _____________________
» dio_w .
PID: _____________________


» 3145924.
__ b. How many read system calls did the command issue? _______________
Note: The trace hook ID for the system call entry is 101.
# grep "^101" trace3.out | grep read | wc -l
» 0 (no read system calls).
__ c. How many write system calls did the command issue? _____________
# grep "^101" trace3.out | grep write | wc -l
» 257 write system calls.
__ d. The dio_w command uses the file descriptor 3 (returned by the open system call)
to write the file blocks. How many write system calls did the dio_w command
issue to file descriptor 3? ___________
Note: The system call syntax is write (FileDescriptor, Buffer, NBytes).
The write system call writes NBytes bytes from the buffer pointed to by Buffer to
the file referred to by FileDescriptor. The FileDescriptor number used to write
into file3 is 3.
# grep "write(3" trace3.out
19C dio_w 3145924 kwrite 0.001396 write(3,0000000020000CA8,1000)
19C dio_w 3145924 kwrite 0.009015 write(3,0000000020000CA8,1000)
19C dio_w 3145924 kwrite 0.012041 write(3,0000000020000CA8,1000)
. . . < some output deleted >. . .
# grep "write(3" trace3.out | wc -l
250
» 250 write system calls.
__ e. What request size was used? ________________
» The request size is the third argument of the write system call, which is
0x1000 in hex or 4096 in decimal.
__ f. What was the timestamp for the first write to the file represented by the file
descriptor 3? _____________________________
# grep "write(3" trace3.out | head -1
19C dio_w 3145924 kwrite 0.001396 write(3,0000000020000CA8,1000)
» 0.001396.
__ g. Edit the trace3.out file using vi, and move to the line that has the timestamp
obtained in the previous step. Then, looking at the JFS2 IO write [...] line
(hook ID 59B), what are the offset, length, and SID (segment ID) values?
# vi trace3.out


» You may need to use the / (slash) subcommand to find the appropriate
timestamp. Example: /0.001396
19C dio_w 3145924 kwrite 0.001396 write(3,0000000020000CA8,1000)
19C dio_w 3145924 kwrite 0.001396 vnop_rdwr_write(vp =
F1000A0601323020, offset = 0000000000000000, length = 1000, flags =
8000003, ...) = ...
59B dio_w 3145924 kwrite 0.001398 JFS2 IO write: vp =
F1000A0601323020, sid = 8503B4, offset = 0000000000000000, length =
1000
5CA dio_w 3145924 kwrite 0.001399 VMSVC XMATTACH: caller=26ABF8
pid=FFFFFFFFFFFFFFFF vaddr=20000CA8 count=1000 segflag=0000
59B dio_w 3145924 kwrite 0.001402 JFS2 IO dio move: vp =
F1000A0601323020, sid = 8503B4, offset = 0000000000000000, length =
1000
59B dio_w 3145924 kwrite 0.001416 JFS2 IO dio devstrat: bplist =
F1000005C0160228, vp = F1000A0601323020, sid = 8503B4, lv blk =
8D8F8, bcount = 1000
. . . < some output deleted >. . .
Offset: ______________
» 0000000000000000.
Length: ______________
» 1000.
SID (segment ID): ______________
» 8503B4.
__ h. Looking in the trace using the SID from the last step, can you confirm that no
DIOs were demoted for the application’s writes? ___________________
» Yes, there was no VMM writing activity for that SID. You also see the following
lines for each write:
JFS2 IO write: . . .
JFS2 IO dio move: . . .
JFS2 IO dio devstrat: . . .
JFS2 IO dio iodone: . . .
» The JFS2 IO dio devstrat and JFS2 IO dio iodone events are shown only if the
DIO was not demoted (see hook ID 59B).
» The excerpt above shows only one write. To confirm that no DIOs were
demoted, you would need to see this sequence for every write.
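Note: The request size and write count can also be extracted from the trace report with a short script instead of grep. The following is a minimal Python sketch that parses the sample lines shown earlier in this part. It assumes that the file descriptor is printed in decimal and the byte count in hex, as the example output suggests, and that the -w value (1024000) is the total number of bytes to write. All names are illustrative.

```python
import re

# Matches write(fd,buffer,nbytes) as printed in the trace report.
WRITE_RE = re.compile(r"write\((\d+),([0-9A-Fa-f]+),([0-9A-Fa-f]+)\)")


def parse_writes(lines, fd):
    """Return the request sizes (in bytes) of write calls to descriptor fd."""
    sizes = []
    for line in lines:
        match = WRITE_RE.search(line)
        if match and int(match.group(1)) == fd:
            sizes.append(int(match.group(3), 16))  # byte count is hex
    return sizes


sample = [
    "19C dio_w 3145924 kwrite 0.001396 write(3,0000000020000CA8,1000)",
    "19C dio_w 3145924 kwrite 0.001396 vnop_rdwr_write(vp =",
    "19C dio_w 3145924 kwrite 0.009015 write(3,0000000020000CA8,1000)",
    "19C dio_w 3145924 kwrite 0.012041 write(3,0000000020000CA8,1000)",
]

print(parse_writes(sample, 3))   # prints [4096, 4096, 4096]
print(1024000 // 4096)           # expected writes to file3: prints 250
```

Running parse_writes over every line of trace3.out should give 250 entries, all equal to 4096, if the trace matches the example output.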


Part 5 - Using a trace report to examine demoted direct I/O writes
__ 26. In this section of the exercise, you will examine trace information to analyze
demoted direct I/O writes.
If you are not already logged in, login to your assigned system as the root user and
make sure you are in the /home/QV474/ex4 directory.
__ 27. Run the following command to generate a trace output to examine demoted DIO
writes.
# trace -J syscall,jfs2,vnops,filepvld,vmm -x "./dio_w -w 1024000 -b 2048 -f \
/dio/file4"; trcrpt -O exec=on,pid=on,svc=on,timestamp=1 -o trace4.out
__ 28. Explore the trace4.out report generated in the previous step.
# pg trace4.out
Tue Apr 9 07:47:44 2013
System: AIX 7.1 Node: woolf1
Machine: 00066BD2D900
Internet Address: 091B198D 9.27.25.141
At trace startup, the system contained 2 cpus, of which 2 were traced.
Buffering: Kernel Heap
This is from a 64-bit kernel.
Tracing only these hooks, 1010,1040,1060,1070,10a0,10b0,1200,1290,12e0,1300,1340
,1350,1360,1390,13a0,13c0,13d0,14c0,1500,1520,1540,1560,15b0,1630,1640,1670,1680
,18b0,1940,19c0,1a00,1b00,1b10,1b20,1b30,1b40,1b50,1b60,1b70,1b80,1b90,1ba0,1bb0
,1bc0,1bd0,1be0,1bf0,1d90,1da0,1db0,1dc0,1dd0,2210,2220,2230,2a00,2a10,2a20,2fc0
,45a0,45b0,4a50,4c10,4c20,4c30,4ca0,4cb0,4cc0,4cd0,4ce0,4cf0,4d00,4d10,4d20,4d30
,4d40,4d50,4db0,4de0,4df0,4e00,4e10,4e20,4e30,4e40,4ef0,4f00,4f10,4f20,4f90,4fa0
,59b0,5c00,5ca0,62c0

trace -J syscall,jfs2,vnops,filepvld,vmm -x ./dio_w -w 1024000 -b 2048 -f /dio/file4
. . . < some output deleted >. . .
__ 29. Use the trace4.out report to answer the following questions.
__ a. What was the command name and PID of the process being executed?
Note: The trace hook ID for the exec system call is 134.
# grep "^134" trace4.out
134 -6815850- 6815850 execve 0.000301 exec: cmd=./dio_w -w 1024000
-b 2048 -f /dio/file4 pid=6815850 tid=13238333
Command name: _____________________
» dio_w .
PID: _____________________
» 6815850.
__ b. The dio_w command uses the file descriptor 3 (returned by the open system call)
to write the file blocks. How many write system calls did the dio_w command
issue to file descriptor 3? _____________


Note: The system call syntax is write (FileDescriptor, Buffer, NBytes).


# grep "write(3" trace4.out
19C dio_w 6815850 kwrite 0.019911 write(3,0000000020000CA8,800)
19C dio_w 6815850 kwrite 0.024017 write(3,0000000020000CA8,800)
19C dio_w 6815850 kwrite 0.028004 write(3,0000000020000CA8,800)
. . . < some output deleted >. . .
#
# grep "write(3" trace4.out | wc -l
500
» 500 write system calls.
__ c. What was the request size used? ____________________
» The request size is the third argument of the write system call, which is 0x800
in hex or 2048 in decimal.
__ d. Why were the DIOs demoted? _____________________
» The request size (2048 bytes) was not a multiple of 4 KB.
__ e. What is the timestamp for the first write to the file represented by the file
descriptor 3? ________________
# grep "write(3" trace4.out | head -1
19C dio_w 6815850 kwrite 0.019911 write(3,0000000020000CA8,800)
» The timestamp is 0.019911.
__ f. Edit the trace4.out file using vi, and move to the line that has the timestamp
obtained in the previous step. Then, looking at the JFS2 IO write [...] line
(hook ID 59B), what are the offset, length, and SID (segment ID) values?
# vi trace4.out


» You may need to use the / (slash) subcommand to find the appropriate
timestamp. Example: /0.019911
19C dio_w 6815850 kwrite 0.019911 write(3,0000000020000CA8,800)
19C dio_w 6815850 kwrite 0.019911 vnop_rdwr_write(vp =
F1000A0601371020, offset = 0000000000000000, length = 0800, flags
= 8000003, ...) = ...
59B dio_w 6815850 kwrite 0.019914 JFS2 IO write: vp =
F1000A0601371020, sid = 8345CD, offset = 0000000000000000, length =
0800
59B dio_w 6815850 kwrite 0.019915 JFS2 IO dio move: vp =
F1000A0601371020, sid = 8345CD, offset = 0000000000000000, length =
0800
4C3 dio_w 6815850 kwrite 0.019922 VMM WRITE: sid=8345CD
src=FFFFF10000000CA8 dest=FFFFF00000000000 bytes=0800
basecopy_flags=0008
59B dio_w 6815850 kwrite 0.020004 JFS2 IO dio demoted: vp =
F1000A0601371020, mode = 0001, bad = 0002, rc = 0000, rc2 = 0000
19C dio_w 6815850 kwrite 0.024013 vnop_rdwr_write(vp =
F1000A0601371020, ext = 0000, ...) = 0000, 0800 bytes moved
104 dio_w 6815850 kwrite 0.024014 return from kwrite [4104 usec]
101 dio_w 6815850 kwrite 0.024016 kwrite LR = D0120104
19C dio_w 6815850 kwrite 0.024017 write(3,0000000020000CA8,800)
19C dio_w 6815850 kwrite 0.024018 vnop_rdwr_write(vp =
F1000A0601371020, offset = 0000000000000800, length = 0800, flags
= 8000003, ...) = ...
59B dio_w 6815850 kwrite 0.024018 JFS2 IO write: vp =
F1000A0601371020, sid = 8345CD, offset = 0000000000000800, length =
0800
. . . < some output deleted >. . .
Offset: _________________
» 0000000000000000.
Length: ______________
» 800.
SID (segment ID): ______________
» 8345CD.
__ g. Looking in the trace using the SID from the last step, can you confirm that DIOs
were demoted for the application’s writes? ___________________


» Yes, there was VMM write activity for that SID.


# grep VMM trace4.out | grep 8345CD | grep "VMM WRITE"
4C3 dio_w 6815850 kwrite 0.019922 VMM WRITE: sid=8345CD
src=FFFFF10000000CA8 dest=FFFFF00000000000 bytes=0800
basecopy_flags=0008
4C3 dio_w 6815850 kwrite 0.024022 VMM WRITE: sid=8345CD
src=FFFFF10000000CA8 dest=FFFFF00000000800 bytes=0800
basecopy_flags=0001
4C3 dio_w 6815850 kwrite 0.028009 VMM WRITE: sid=8345CD
src=FFFFF10000000CA8 dest=FFFFF00000001000 bytes=0800
basecopy_flags=0008
. . . < some output deleted >. . .
» Also, the trace shows a JFS2 IO dio demoted: event for each demoted write.
# grep demoted trace4.out
59B dio_w 6815850 kwrite 0.020004 JFS2 IO dio demoted: vp =
F1000A0601371020, mode = 0001, bad = 0002, rc = 0000, rc2 = 0000
59B dio_w 6815850 kwrite 0.024386 JFS2 IO dio demoted: vp =
F1000A0601371020, mode = 0001, bad = 0002, rc = 0000, rc2 = 0000
59B dio_w 6815850 kwrite 0.028067 JFS2 IO dio demoted: vp =
F1000A0601371020, mode = 0001, bad = 0002, rc = 0000, rc2 = 0000
. . . < some output deleted >. . .
__ h. In order to confirm that all DIO operations were demoted you would need to
analyze the trace sequence for all of the writes. How many demoted writes do
you see in the trace? _______________
# grep demoted trace4.out | wc -l
500
» 500 demoted writes. One for each write system call issued by the application.
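Note: The comparison between Part 4 and Part 5 can be sketched as a small classifier that counts the marker events in a trace report. The following Python sketch relies only on the event names shown in the example output (JFS2 IO dio devstrat for a write that stayed direct, JFS2 IO dio demoted for a demoted write). It assumes one such event per write and that the -w value (1024000) is the total number of bytes written. All names are illustrative.

```python
def classify(lines):
    """Count direct and demoted DIO events in trace report lines."""
    counts = {"direct": 0, "demoted": 0}
    for line in lines:
        if "JFS2 IO dio demoted" in line:
            counts["demoted"] += 1
        elif "JFS2 IO dio devstrat" in line:
            counts["direct"] += 1
    return counts


sample = [
    "59B dio_w 6815850 kwrite 0.019915 JFS2 IO dio move: vp =",
    "4C3 dio_w 6815850 kwrite 0.019922 VMM WRITE: sid=8345CD",
    "59B dio_w 6815850 kwrite 0.020004 JFS2 IO dio demoted: vp =",
    "59B dio_w 6815850 kwrite 0.024386 JFS2 IO dio demoted: vp =",
    "59B dio_w 6815850 kwrite 0.028067 JFS2 IO dio demoted: vp =",
]

print(classify(sample))    # prints {'direct': 0, 'demoted': 3}
print(1024000 // 2048)     # expected writes to file4: prints 500
```

For the full trace4.out, a demoted count equal to the 500 write system calls and a direct count of 0 would indicate that every write was demoted.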


Part 6 - Examining the Asynchronous I/O configuration


__ 30. In this section of the exercise you will examine the AIO subsystem and tunable
parameters. The exercise assumes you are using Legacy AIO. The answers would
be similar if you were using POSIX AIO except the ioo parameters will have
“posix_” prepended to the name.
__ 31. If you are not already logged in, log in to your assigned system as the root user and
change directory to /home/QV474/ex4.
__ 32. Is the AIO subsystem enabled? ______________
» The AIO subsystem is enabled when the system is booted. The AIO kernel
extensions are loaded at system boot:
# ps -k | grep aio
2359380 - 0:00 aioLpool
2621520 - 0:00 aioPpool
__ 33. Are there any AIO servers active? ______________
» No. By default, no AIO servers are started at boot. Notice there are no processes
called aioserver.
# ps -k | grep aioserver
__ 34. Has any AIO server been created since the system booted? ______________
» The ioo tunables, aio_active and posix_aio_active, show if any AIO servers
have been created since the system booted. If the value is 0, no AIO servers
have been created. This is the initial state. No AIO servers are started
automatically at boot time.
# ioo -o aio_active
aio_active = 0
__ 35. Write down the current values for the following ioo tunables:
__ a. aio_minservers: ________________
__ b. aio_maxservers: ________________
__ c. aio_maxreqs: ________________
__ d. aio_server_inactivity: ____________
__ e. aio_active: _____________

» You can use ioo -o <tunable> to list the current value, or ioo -L
<tunable> to list all the information for that tunable:
# ioo -o aio_minservers
aio_minservers = 3

# ioo -o aio_maxservers
aio_maxservers = 30

# ioo -o aio_maxreqs
aio_maxreqs = 131072

# ioo -o aio_server_inactivity
aio_server_inactivity = 300

# ioo -o aio_active
aio_active = 0
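
The five queries can also be gathered in one pass. The following is a minimal sketch, assuming the "name = value" output format shown above. tunable_value is a hypothetical helper, and the sample text stands in for live ioo output so that the loop can be tried on any system.

```shell
#!/bin/sh
# Print the value of tunable $1 from "name = value" lines on stdin.
tunable_value() {
    awk -v name="$1" '$1 == name && $2 == "=" { print $3; exit }'
}

# Sample lines copied from the ioo output above.
sample='aio_minservers = 3
aio_maxservers = 30
aio_maxreqs = 131072
aio_server_inactivity = 300
aio_active = 0'

for t in aio_minservers aio_maxservers aio_maxreqs aio_server_inactivity aio_active
do
    printf '%-22s %s\n' "$t" "$(printf '%s\n' "$sample" | tunable_value "$t")"
done
```

On the lab system, the sample could be replaced by piping ioo -o "$t" into tunable_value "$t".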

Part 7 - Starting the Asynchronous I/O servers


__ 36. If you are not already logged in, log in to your assigned system as the root user and
change directory to /home/QV474/ex4.
__ 37. Check if the /aio file system is currently mounted. If not, then mount it.
» The command and an example of output are shown below:
# mount | grep aio
/dev/fslv03 /aio jfs2 Apr 11 13:01 rw,log=/dev/loglv00
__ 38. Create a file called bigfile of size 127 MB in the /aio file system.
# dd if=/dev/zero of=/aio/bigfile bs=1m count=127
127+0 records in.
127+0 records out.
# ls -l /aio
total 260096
-rw-r--r-- 1 root system 133169152 Apr 11 15:11 bigfile
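The sizes in the listing are consistent with the dd arguments. The following short arithmetic check assumes that bs=1m means 1048576 bytes and that the ls total is reported in 512-byte blocks. Both assumptions agree with the numbers shown but are not stated in the exercise.

```shell
#!/bin/sh
block_size=1048576      # bytes per dd block (bs=1m), assumed to be 1 MiB
count=127               # number of blocks written (count=127)

bytes=$((block_size * count))
blocks_512=$((bytes / 512))

echo "file size in bytes:      $bytes"        # 133169152
echo "size in 512-byte blocks: $blocks_512"   # 260096
```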
__ 39. In this step you will run the ndiskaio program in one window, and the iostat
command in another one to monitor I/O statistics.
__ a. In the first window, type the following command, but do not hit return to run it
yet.
# iostat -AQ 5
__ b. In the second window, type the following command, but do not hit return to run
it yet.
# ./ndiskaio -A -f /aio/bigfile -S -r75 -b4096 -t20 -M20 -X60
The ndiskaio flags and values are:
A: Asynchronous test
f <file>: The file or raw logical volume to use
S: Sequential access
r75: 75% reads
b4096: Block size 4096 bytes
t20: Run for 20 seconds
M20: Create 20 processes
X60: Use a maximum of 60 AIO servers
__ c. Now that these commands are set up, hit return on the second window to run
the ndiskaio program, then hit return on the first window to run the iostat
command.
Note: start the ndiskaio program before iostat. The order is important to make
sure the AIO kernel extension has been activated (used and pinned).

» The command and an example of output are shown below:


# ./ndiskaio -A -f /aio/bigfile -S -r75 -b4096 -t20 -M20 -X60
Command: ./ndiskaio -A -f /aio/bigfile -S -r75 -b4096 -t20 -M20 -X60
Asynchronous Disk test: servers min=0, max=60
No. of processes = 20
I/O type = Sequential
Block size = 4096
Read-WriteRatio: 75:25 = read mostly
Sync type: none = just close the file
Number of files = 1
File size = 33554432 bytes = 32768 KB = 32 MB
Run time = 20 seconds
Snooze % = 0 percent
----> Running test with block Size=4096 (4KB) ....................
Proc - <---Disk IO----> | <---Throughput----> RunTime
Num - TOTAL IO/sec | MB/sec KB/sec Seconds
1 - 4800 238.2 | 0.93 952.79 20.15
2 - 4980 249.0 | 0.97 995.97 20.00
3 - 6060 301.6 | 1.18 1206.51 20.09
4 - 4200 197.3 | 0.77 789.01 21.29
5 - 4860 242.6 | 0.95 970.56 20.03
6 - 4260 212.8 | 0.83 851.16 20.02
7 - 4500 224.9 | 0.88 899.62 20.01
8 - 5040 252.1 | 0.98 1008.46 19.99
9 - 5580 279.4 | 1.09 1117.63 19.97
10 - 4620 230.2 | 0.90 920.88 20.07
11 - 5040 252.5 | 0.99 1010.03 19.96
12 - 4500 223.8 | 0.87 895.07 20.11
13 - 4740 237.4 | 0.93 949.43 19.97
14 - 4440 221.1 | 0.86 884.42 20.08
15 - 4680 233.1 | 0.91 932.24 20.08
16 - 4380 218.9 | 0.86 875.57 20.01
17 - 3900 194.0 | 0.76 776.06 20.10
18 - 4800 238.1 | 0.93 952.43 20.16
19 - 4200 210.3 | 0.82 841.27 19.97
20 - 6120 305.7 | 1.19 1222.80 20.02
TOTALS 95700 4763.0 | 18.61 Seq procs= 20 read= 75% bs= 4KB
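
The TOTALS line is the sum of the per-process rows. The following is a minimal sketch for checking this, assuming the row layout shown above (process number, a dash, then the TOTAL and IO/sec columns). sum_rows is an illustrative name. Applied to all 20 rows of the sample, the sums come to 95700 and 4763.0, matching the TOTALS line.

```shell
#!/bin/sh
# Sum the TOTAL and IO/sec columns of per-process rows ("N - total iops | ...").
# Header lines and the TOTALS line are skipped because their second field is not "-".
sum_rows() {
    awk '$2 == "-" && $1 ~ /^[0-9]+$/ { total += $3; iops += $4; n++ }
         END { printf "%d rows, total=%d, IO/sec=%.1f\n", n, total, iops }'
}

# Three rows copied from the sample output above.
sum_rows <<'EOF'
1 - 4800 238.2 | 0.93 952.79 20.15
2 - 4980 249.0 | 0.97 995.97 20.00
3 - 6060 301.6 | 1.18 1206.51 20.09
EOF
```

For the three rows shown, this prints "3 rows, total=15840, IO/sec=788.8".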

» The command and an example of output are shown below:


# iostat -AQ 5

aio: avgc avfc maxgc maxfc maxreqs avg-cpu: % user % sys % idle % iowait physc % entc
5002.2 0.0 1138 0 60 3.8 30.6 33.0 32.7 0.1 36.2

Queue# Count Filesystems


129 0 /
130 0 /usr
132 0 /var
133 0 /tmp
136 0 /home
137 0 /admin
138 0 /proc
139 0 /opt
140 0 /var/adm/ras/livedump
144 0 /convio
145 0 /dio
147 0 /temp
148 889 /aio

aio: avgc avfc maxgc maxfc maxreqs avg-cpu: % user % sys % idle % iowait physc % entc
5369.7 0.0 1198 0 60 3.5 31.4 32.8 32.3 0.1 36.9

Queue# Count Filesystems


129 0 /
130 0 /usr
132 0 /var
133 0 /tmp
136 0 /home
137 0 /admin
138 0 /proc
139 0 /opt
140 0 /var/adm/ras/livedump
144 0 /convio
145 0 /dio
147 0 /temp
148 1083 /aio
. . . < some output deleted >. . .
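
When the report is long, the Count value for one file system can be picked out of the queue section with awk. The following is a minimal sketch, assuming the three-column "Queue# Count Filesystems" rows shown above. queue_count is a hypothetical helper name.

```shell
#!/bin/sh
# Print the Count column for the file system named in $1,
# reading "Queue# Count Filesystems" rows from stdin.
queue_count() {
    awk -v fs="$1" 'NF == 3 && $3 == fs && $1 ~ /^[0-9]+$/ { print $2 }'
}

# Rows copied from the two intervals of the sample output above.
queue_count /aio <<'EOF'
147 0 /temp
148 889 /aio
147 0 /temp
148 1083 /aio
EOF
```

This prints one value per interval (889, then 1083). On the lab system the input would come from iostat -AQ 5 instead of the here-document.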

__ 40. When the ndiskaio program finishes, stop the iostat command, and then answer
the following questions:
__ a. Has the AIO kernel extension been used and pinned? ______________
» Yes, the ioo parameter, aio_active, has been changed to 1.
# ioo -o aio_active
aio_active = 1
__ b. How many AIO servers were created? _________________
» You can find this answer by looking at the maxreqs field of the iostat -AQ output
or by using the ps command. In our example, 60 AIO servers were created.
# iostat -AQ 5
. . . < some output deleted >. . .
aio: avgc avfc maxgc maxfc maxreqs avg-cpu: % user % sys % idle % iowait physc % entc
5002.2 0.0 1138 0 60 3.8 30.6 33.0 32.7 0.1 36.2
. . . < some output deleted >. . .

# ps -k | grep aioserver | wc -l
60
__ c. How does the number of AIO servers created compare to the minimum and
maximum AIO servers allowed?
» In this sample output, 60 AIO servers were created, which is the maximum allowed.
aio_minservers is 3 (per CPU) and aio_maxservers is 30 (per CPU). There are two
logical CPUs on the system, so the limits are 2 x 3 = 6 and 2 x 30 = 60 servers.
# ioo -o aio_minservers
aio_minservers = 3
# ioo -o aio_maxservers
aio_maxservers = 30
# mpstat -s
System configuration: lcpu=2 ent=0.2 mode=Capped

Proc0
0.08%
cpu0 cpu1
0.06% 0.02%
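
Because both tunables are per-CPU values, the system-wide limits are the tunable values multiplied by the number of logical CPUs. The following is a minimal sketch of that arithmetic, using the sample values above. lcpu_from_mpstat is an illustrative helper that assumes the "lcpu=N" token shown in the mpstat header.

```shell
#!/bin/sh
# Extract the lcpu value from an mpstat "System configuration" line on stdin.
lcpu_from_mpstat() {
    sed -n 's/.*lcpu=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

lcpu=$(echo 'System configuration: lcpu=2 ent=0.2 mode=Capped' | lcpu_from_mpstat)
minservers=3    # aio_minservers, per CPU (from the ioo output above)
maxservers=30   # aio_maxservers, per CPU (from the ioo output above)

echo "system-wide minimum: $((lcpu * minservers))"   # 6
echo "system-wide maximum: $((lcpu * maxservers))"   # 60
```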
__ 41. Once the ndiskaio command finished and if there are no more AIO requests, how
long will all the AIO servers stay active? After that time, how many will remain
active?
» The ioo parameter, aio_server_inactivity, determines how long AIO servers
stay active with no AIO requests to handle. However, once the number of AIO
servers is above the ioo parameter aio_minservers, it will not fall below that
value. In our example, aio_server_inactivity is 300 seconds. After 300
seconds of inactivity, the number of AIO servers will be 6 (aio_minservers is 3
per CPU, and there are two logical CPUs).
__ 42. Change the minimum number of (Legacy AIO) AIO servers to 0.
» Use the ioo command to change aio_minservers to 0:
# ioo -o aio_minservers=0
Setting aio_minservers to 0
__ 43. Has the number of active AIO servers been reduced to 0? If not, why not?
» Use the following command to see the number of active AIO servers:
# ps -k | grep aioserver | wc -l
» If you issue these commands within 300 seconds of the termination of the
ndiskaio command, you will still see active AIO servers. The number of AIO
servers is not automatically reduced to 0. The system waits until the
aio_server_inactivity time is up before deciding if the number of AIO servers
should be reduced.
# ps -k | grep aioserver | wc -l
60
__ 44. Change the aio_server_inactivity time to 30 seconds.
» Use the ioo command to change aio_server_inactivity to 30:
# ioo -o aio_server_inactivity=30
Setting aio_server_inactivity to 30
__ 45. After 30 seconds, has the number of active AIO servers been reduced to 0 (the
minimum number you set it to in the previous step)?
» Each AIO server sleeps for the number of seconds indicated by the
aio_server_inactivity time that was in effect when going to sleep. Changing
the value of this setting will not impact the sleep duration of AIO servers that are
already sleeping. If no AIO requests have been placed on the queue, then no
AIO servers will have been woken up. Therefore each server will be sleeping for
300 seconds. The number of AIO servers you observe will depend on whether
300 seconds has passed since they previously went to sleep.
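
Rather than rerunning ps by hand, the server count can be polled until it drops. The following is a minimal sketch, not part of the course materials. wait_for_servers and its arguments are hypothetical, and the counting command is passed in by name so that the loop can be tried with a stub on a system without AIO.

```shell
#!/bin/sh
# Poll until the server count drops to $1 or below, or until $2 seconds pass.
# $3 is the polling interval in seconds; $4 is the name of a command or
# function that prints the current count.
wait_for_servers() {
    target=$1; timeout=$2; interval=$3; counter=$4
    waited=0
    while :
    do
        n=$($counter)
        if [ "$n" -le "$target" ]; then
            echo "reached $n server(s) after ${waited}s"
            return 0
        fi
        if [ "$waited" -ge "$timeout" ]; then
            echo "still $n server(s) after ${waited}s"
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}

# Counter for the lab system, built from the command used in this exercise.
# tr strips any padding that wc adds around the number.
count_aioservers() {
    ps -k | grep aioserver | wc -l | tr -d ' '
}
```

For example, wait_for_servers 0 330 10 count_aioservers would check every 10 seconds for up to 330 seconds, which covers the 300-second sleep described above.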

Part 8 - Comparing JFS2 AIO and JFS2 AIO with CIO accesses
__ 46. If you are not already logged in, log in to your assigned system as the root user and
change directory to /home/QV474/ex4.
In this section, you will run a program named ndiskaio that generates AIO
requests. You will do this two times, once for conventional AIO and then again for
AIO using CIO accesses, and then compare the results.
__ 47. In the previous part of the exercise, you set aio_minservers to 0 and
aio_server_inactivity to 30 seconds. You will keep these settings so the
number of AIO servers gets set back to 0 after each test. Verify there are no active
AIO servers on the system.
» Use the following command to see the number of active AIO servers:
# ps -k | grep aioserver | wc -l
0
__ 48. Open a second window to your system, log in as the root user, and run the following
command:
# iostat -A 5
__ 49. In the first window, verify the /aio file system is mounted. If it is not, mount it. Then,
run the following command:
# ./ndiskaio -A -f /aio/bigfile -S -r75 -b4096 -t20 -M20 -X60
__ 50. When the ndiskaio program finishes, stop the iostat -A command in the other
window. Record the following values:
From the ndiskaio output:
- Total disk I/O: ___________________________ 62640

- IO/sec : ___________________________________ 3039.8

- MB/sec : __________________________________ 11.87

» Following is a sample run of the ndiskaio program.


# ./ndiskaio -A -f /aio/bigfile -S -r75 -b4096 -t20 -M20 -X60
Command: ./ndiskaio -A -f /aio/bigfile -S -r75 -b4096 -t20 -M20 -X60
Asynchronous Disk test: servers min=0, max=60
No. of processes = 20
I/O type = Sequential
Block size = 4096
Read-WriteRatio: 75:25 = read mostly
Sync type: none = just close the file
Number of files = 1
File size = 33554432 bytes = 32768 KB = 32 MB
Run time = 20 seconds
Snooze % = 0 percent
----> Running test with block Size=4096 (4KB) ....................
Proc - <-----Disk IO----> | <-----Throughput------> RunTime
Num - TOTAL IO/sec | MB/sec KB/sec Seconds
1 - 3960 196.9 | 0.77 787.61 20.11
2 - 960 43.3 | 0.17 173.31 22.16
3 - 1440 63.7 | 0.25 254.66 22.62
4 - 3420 160.0 | 0.62 639.90 21.38
5 - 3900 194.0 | 0.76 776.15 20.10
6 - 60 2.7 | 0.01 10.78 22.27
7 - 4080 204.1 | 0.80 816.24 19.99
8 - 4080 203.2 | 0.79 812.84 20.08
9 - 4080 202.7 | 0.79 810.76 20.13
10 - 2400 103.2 | 0.40 412.72 23.26
11 - 4080 203.4 | 0.79 813.77 20.05
12 - 4080 203.7 | 0.80 814.74 20.03
13 - 4080 202.6 | 0.79 810.55 20.13
14 - 4080 203.8 | 0.80 815.08 20.02
15 - 3000 125.4 | 0.49 501.62 23.92
16 - 1140 49.6 | 0.19 198.25 23.00
17 - 4080 203.4 | 0.79 813.77 20.05
18 - 4080 203.0 | 0.79 811.85 20.10
19 - 4080 204.1 | 0.80 816.23 19.99
20 - 1560 67.1 | 0.26 268.47 23.24
TOTALS 62640 3039.8 | 11.87 Seq procs= 20 read= 75% bs= 4KB
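
The MB/sec figure on the TOTALS line follows from the IO/sec figure and the block size. The following is a minimal sketch of the conversion, assuming that 1 MB is counted as 1048576 bytes, which is consistent with the sample numbers. mb_per_sec is an illustrative name.

```shell
#!/bin/sh
# Convert an I/O rate to MB/sec for a given block size in bytes,
# assuming 1 MB = 1048576 bytes.
mb_per_sec() {
    awk -v iops="$1" -v bs="$2" 'BEGIN { printf "%.2f\n", iops * bs / 1048576 }'
}

mb_per_sec 3039.8 4096    # TOTALS row of the run above; prints 11.87
```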

From the iostat -A output (Use the interval that has the highest values):
- avgc: ___________________________ 3171.2

- avfc: ___________________________ 0.0

- maxgc: __________________________ 839

- maxfc: _________________________ 0

- maxreqs: _______________________ 60

- % user: _________________________ 4.4

- % sys: __________________________ 30.5

- % idle: _________________________ 25.4

- % iowait: _______________________ 39.7

- physc: __________________________ 0.1

- % entc: _________________________ 37.6


» Following is a sample run of the iostat -A 5 command.
. . . < some output deleted >. . .
aio: avgc avfc maxgc maxfc maxreqs avg-cpu: % user % sys % idle % iowait physc % entc
3171.2 0.0 839 0 60 4.4 30.5 25.4 39.7 0.1 37.6

Disks: % tm_act Kbps tps Kb_read Kb_wrtn


hdisk1 0.0 0.0 0.0 0 0
hdisk0 0.0 0.0 0.0 0 0
hdisk2 99.6 2713.9 412.2 3872 9684
. . . < some output deleted >. . .

__ 51. Unmount the /aio file system.


» The command is shown below:
# umount /aio

__ 52. The second test will use a JFS2 file system that is mounted with the CIO option.
Mount the /aio file system with the following command:
# mount -o cio /aio
__ 53. Run the iostat -A 5 command again in the second window.
__ 54. In the first window, run the following command:
# ./ndiskaio -A -f /aio/bigfile -S -r75 -b4096 -t20 -M20 -X60
__ 55. When the ndiskaio program finishes, stop the iostat -A command in the other
window. Record the following values:
From the ndiskaio output:
- Total disk I/O: ___________________________ 18960

- IO/sec: ___________________________________ 935.7

- MB/sec : __________________________________ 3.66

» Following is a sample run of the ndiskaio program.


# ./ndiskaio -A -f /aio/bigfile -S -r75 -b4096 -t20 -M20 -X60
Command: ./ndiskaio -A -f /aio/bigfile -S -r75 -b4096 -t20 -M20 -X60
Asynchronous Disk test: servers min=0, max=60
No. of processes = 20
I/O type = Sequential
Block size = 4096
Read-WriteRatio: 75:25 = read mostly
Sync type: none = just close the file
Number of files = 1
File size = 33554432 bytes = 32768 KB = 32 MB
Run time = 20 seconds
Snooze % = 0 percent
----> Running test with block Size=4096 (4KB) ....................
Proc - <-----Disk IO----> | <-----Throughput------> RunTime
Num - TOTAL IO/sec | MB/sec KB/sec Seconds
1 - 1020 51.0 | 0.20 203.92 20.01
2 - 1020 48.9 | 0.19 195.67 20.85
3 - 960 46.8 | 0.18 187.04 20.53
4 - 1020 51.0 | 0.20 204.12 19.99
5 - 960 46.9 | 0.18 187.42 20.49
6 - 900 44.8 | 0.18 179.38 20.07
7 - 1020 51.0 | 0.20 204.02 20.00
8 - 900 45.0 | 0.18 180.02 20.00
9 - 900 45.0 | 0.18 179.89 20.01
10 - 900 44.7 | 0.17 178.94 20.12
11 - 960 46.9 | 0.18 187.59 20.47
12 - 1020 48.9 | 0.19 195.50 20.87
13 - 900 44.8 | 0.18 179.29 20.08
14 - 900 44.8 | 0.17 179.20 20.09
15 - 960 46.9 | 0.18 187.58 20.47
16 - 900 45.0 | 0.18 180.01 20.00
17 - 900 44.8 | 0.18 179.38 20.07
18 - 960 46.9 | 0.18 187.41 20.49
19 - 900 44.8 | 0.17 179.17 20.09
20 - 960 46.8 | 0.18 187.32 20.50
TOTALS 18960 935.7 | 3.66 Seq procs= 20 read= 75% bs= 4KB

From the iostat -A output (Use the interval that has the highest values):
- avgc: ___________________________ 0.0

- avfc: ___________________________ 0.0

- maxgc: __________________________ 0

- maxfc: _________________________ 0

- maxreqs: _______________________ 0

- % user: _________________________ 8.7

- % sys: __________________________ 25.0

- % idle: _________________________ 66.3

- % iowait: _______________________ 0.0

- physc: __________________________ 0.1

- % entc: _________________________ 38.4


» Following is a sample run of the iostat -A 5 command.
. . . < some output deleted >. . .
aio: avgc avfc maxgc maxfc maxreqs avg-cpu: % user % sys % idle % iowait physc % entc
0.0 0.0 0 0 0 8.7 25.0 66.3 0.0 0.1 38.4

Disks: % tm_act Kbps tps Kb_read Kb_wrtn


hdisk1 0.0 0.0 0.0 0 0
hdisk0 0.4 3.2 0.8 0 16
hdisk2 100.0 3730.1 932.5 14028 4604
. . . < some output deleted >. . .

Transfer your data from the two tests into the table below, then answer the
following questions.

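Output Field      JFS2 AIO      JFS2 AIO with CIO

Total Disk I/O    __________    __________
IO/sec            __________    __________
MB/sec            __________    __________
avgc              __________    __________
avfc              __________    __________
maxgc             __________    __________
maxfc             __________    __________
maxreqs           __________    __________
%user             __________    __________
%sys              __________    __________
%idle             __________    __________
%iowait           __________    __________
physc             __________    __________
%entc             __________    __________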
» Table with values from the sample output:

Output Field      JFS2 AIO      JFS2 AIO with CIO

Total Disk I/O    62640         18960
IO/sec            3039.8        935.7
MB/sec            11.87         3.66
avgc              3171.2        0
avfc              0             0
maxgc             839           0
maxfc             0             0
maxreqs           60            0
%user             4.4           8.7
%sys              30.5          25.0
%idle             25.4          66.3
%iowait           39.7          0
physc             0.1           0.1
%entc             37.6          38.4
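
To put the difference between the two columns into one number per row, the throughput rows can be expressed as ratios. The following is a minimal sketch using the sample values from the table. ratio is an illustrative helper name.

```shell
#!/bin/sh
# Print how many times larger $1 is than $2, to two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

echo "Total disk I/O: $(ratio 62640 18960)x"    # 3.30x
echo "IO/sec:         $(ratio 3039.8 935.7)x"   # 3.25x
echo "MB/sec:         $(ratio 11.87 3.66)x"     # 3.24x
```

In this sample, the run without CIO completed a little over three times as many I/Os as the run with CIO.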

__ d. What was the best access method and why? _______________________


» The JFS2 file system without CIO gave the best performance. Without CIO, the
file system is using read-ahead and the system is handling larger I/O requests.
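
The statement about larger I/O requests can be checked against the hdisk2 lines of the two iostat samples: dividing Kbps by tps gives the average size of a transfer. The following is a rough sketch, assuming that both values cover the same interval. kb_per_transfer is an illustrative name.

```shell
#!/bin/sh
# Average KB per transfer from iostat Kbps and tps values.
kb_per_transfer() {
    awk -v kbps="$1" -v tps="$2" 'BEGIN { printf "%.1f\n", kbps / tps }'
}

echo "hdisk2 without CIO: $(kb_per_transfer 2713.9 412.2) KB per transfer"   # 6.6
echo "hdisk2 with CIO:    $(kb_per_transfer 3730.1 932.5) KB per transfer"   # 4.0
```

With CIO, the average transfer matches the 4 KB block size used by ndiskaio. Without CIO, the average transfer in this sample is larger than one block.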
__ e. Why is the maxreqs field value 0 for JFS2 with CIO? ___________________________
Note: AIO I/Os performed against raw logical volumes or files opened in CIO
mode do not use the aioserver kprocs.
» The maxreqs field specifies the maximum number of asynchronous I/O requests
that can be outstanding at one time, one per AIO server. AIO I/Os performed
against files opened in CIO mode do not use AIO server processes.
__ f. For JFS2 without CIO, why are the avfc and maxfc values zero? _______
» When the data being accessed asynchronously is located in a JFS2 file system,
it is not using fast path and all I/O is routed through the AIO servers.
__ g. Why is there iowait time for the JFS2 without CIO but not for JFS2 with CIO
access? _____________________________________________________
» AIO servers are being used for JFS2 without CIO, so they issue the I/O operations and
wait for them to complete. When a processor has initiated an I/O request and there is
nothing else for the processor to do, it logs that time as iowait.
__ 56. Unmount the /convio, /dio, and /aio file systems.
» The commands are shown below:
# umount /convio
# umount /dio
# umount /aio
__ 57. Remove the /convio, /dio, and /aio file systems and mount points.
» The commands and example output are shown below:
# rmfs -r /convio
rmlv: Logical volume fslv00 is removed.
# rmfs -r /dio
rmlv: Logical volume fslv01 is removed.
# rmfs -r /aio
rmlv: Logical volume fslv02 is removed.
__ 58. Remove physical volume hdisk2 from the volume group ex4vg.
» The command and example output are shown below:
# reducevg -d -f ex4vg hdisk2
rmlv: Logical volume loglv00 is removed.
ldeletepv: Volume Group deleted since it contains no physical
volumes.
__ 59. Let your instructor know that you have completed the exercise.

End of exercise
