
430 Chapter 12 C Programming Tools

Quiz
1. Why is it helpful to have a history of your source code?
2. What’s the definition of a leaf node on an SCCS delta tree?
3. What’s the benefit of the -q option of ar?
4. Can the make utility use object modules stored in an archive file?
5. What does the term “reusable function” mean?
6. Why would you profile an executable file?
7. Describe briefly what the strip utility does.

Exercises
12.1 Create a shared library with a simple function that returns an integer value. Then
write a program to call the function and print its return value. After compiling
and running the program, make a change to the library function, rebuild the
library, and run the program again (without recompiling the main program). What
happens if you rename the function in the shared library? [level: easy]
12.2 Compile “reverse.c” and “palindrome.c” and place them into an archive called
“string.a”. Write a main program in “prompt.c” that prompts the user for a string
and then outputs 1 if the string is a palindrome and 0 otherwise. Create a make
file that links “prompt.o” with the reverse () and palindrome () functions stored
in “string.a”. Use dbx to debug your code if necessary. [level: medium]
12.3 Try a modern debugger such as the Borland C++ source-level debugger. How
does it compare with dbx? [level: medium]

Projects
1. Write a paper that describes how you would use the utilities presented in this
section to help manage a 10-person computing team. [level: medium]
2. Replace the original version of palindrome () stored in “palindrome” with a
pointer-based version. Use SCCS to manage the source code changes and ar to
replace the old version in “string.a”. [level: medium]

CHAPTER 13
Systems Programming

MOTIVATION
If you’re a C programmer and you wish to take advantage of the UNIX multitasking
and interprocess communication facilities, it’s essential that you have a good
knowledge of the UNIX system calls.

PREREQUISITES
In order to understand this chapter, you should have a good working knowledge of C.
For the Internet section of the chapter, it helps if you have read Chapters 9 and 10.

OBJECTIVES
In this chapter, I’ll explain and demonstrate a majority of the UNIX system calls,
including those which support I/O, process management, and interprocess communication.

PRESENTATION
The information is presented in the form of several sample programs, including a shell
designed for the Internet. Most sample code is available on-line. (See the preface for
more information.)

SYSTEM CALLS AND LIBRARY ROUTINES

The following system calls and library routines, listed in alphabetical order, are presented:

accept      fchown          inet_addr    perror
alarm       fcntl           inet_ntoa    pipe
bind        fork            ioctl        read
bzero       fstat           kill         setegid
chdir       ftruncate       lchown       seteuid
chmod       getdents        link         setgid
chown       getegid         listen       setpgid
close       geteuid         lseek        setuid
connect     getgid          lstat        signal
dup         gethostbyname   memset       socket
dup2        gethostname     mknod        stat
execl       getpgid         nice         sync
execlp      getpid          ntohl        truncate
execv       getppid         ntohs        unlink
execvp      getuid          open         wait
exit        htonl           pause        write
fchmod      htons

INTRODUCTION
In order to make use of services such as file creation, process duplication, and
interprocess communication, application programs must “talk” to the operating system.
They can do this via a collection of routines called system calls, which are the
programmer’s functional interface to the UNIX kernel. System calls are just like library
routines, except that they perform a subroutine call directly into the heart of UNIX.
UNIX system calls can be loosely grouped into the following three main categories:
• file management
• process management
• error handling
Interprocess communication (IPC) is, in fact, a subset of file management, since UNIX
treats IPC mechanisms as special files. Figure 13.1 shows a diagram that illustrates the
file management system call hierarchy. The process management system call hierarchy
includes routines for duplicating, differentiating, and terminating processes, as shown in
Figure 13.2. The only system call that supports error handling is perror (), which I’ll
put in a hierarchy just to be consistent. This hierarchy is shown in Figure 13.3.

[Diagram not reproduced. Surviving labels: open, close, read, fcntl, fstat, ftruncate,
listen, socket, gethostbyname, gethostname, htonl, htons, inet_addr, inet_ntoa]
FIGURE 13.1
File management system call hierarchy.

[Diagram not reproduced. Surviving labels: nice, chdir, wait, fork, exec, setgid, getpid,
getppid, setuid, getgid, getrgid, getuid, getruid, alarm, signal, kill, pause]
FIGURE 13.2
Process management system call hierarchy.

[Diagram not reproduced. Surviving label: perror]
FIGURE 13.3
Error handling hierarchy.

In what follows, we cover the system calls shown in these hierarchy diagrams in the
following order:
• Error handling. I start the chapter with a description of perror ().
• Regular file management. This includes information on how to create, open,
close, read, and write regular files. We’ll also see a short overview of STREAMS.
• Process management. Relevant here are how to duplicate, differentiate, suspend,
and terminate processes. Multithreaded processes are discussed briefly.
• Signals. Although the signal facility could be considered a subtopic of either
process management or interprocess communication, it’s a significant enough
topic to warrant a section of its own.
• IPC. Interprocess communication takes place via pipes (both unnamed and
named) and sockets (including information about Internet sockets). Brief
overviews of two newer IPC mechanisms found in some versions of UNIX
(shared memory and semaphores) are presented.
The chapter ends with a source code listing and a discussion of a complete Internet shell,
which is a shell that supports piping and redirection to other Internet shells on remote
hosts. The Internet shell program uses most of the facilities described in this chapter.
ERROR HANDLING: perror ()
Most system calls are capable of failing in some way. For example, the open () system
call will fail if you try to open a nonexistent file for reading. By convention, all system
calls return -1 if an error occurs. However, this doesn’t tell you much about why the
error occurred; the open () system call can fail for one of several different reasons. If
you want to deal with system call errors in a systematic way, you must know about the
following two things:
• errno, a global variable that holds the numeric code of the last system call error
• perror (), a subroutine that describes system call errors
Every process contains a global variable called errno, which is originally set to zero
when the process is created. When a system call error occurs, errno is set to the numeric
code associated with the cause of the error. For example, if you try to open a file that
doesn’t exist for reading, errno is set to 2. The file “/usr/include/sys/errno.h” contains a
list of the predefined error codes. Here’s a snippet of this file:

#define EPERM   1   /* Not owner */
#define ENOENT  2   /* No such file or directory */
#define ESRCH   3   /* No such process */
#define EINTR   4   /* Interrupted system call */
#define EIO     5   /* I/O error */

A successful system call never affects the current value of errno, and an unsuccessful
system call always overwrites the current value of errno. To access errno from your
program, include <errno.h>. The perror () subroutine converts the current value of
errno into an English description and works as shown in Figure 13.4. Your program
should check system calls for a return value of -1 and then deal with the situation
immediately. One of the first things to do, especially during debugging, is to call perror ()
for a description of the error.

Library Routine: void perror (char* str)
perror () displays the string str, followed by a colon, followed by a description of the
last system call error. If there is no error to report, perror () displays the string
“Error 0.” Actually, perror () isn’t a system call; it’s a standard C library routine.
FIGURE 13.4
Description of the perror () library routine.

In the following example, I forced a couple of system call errors to demonstrate
perror () and then demonstrated that errno retained the last system call error code
even after a successful call was made. The only way to reset errno is to manually assign
it to zero.

$ cat showErrno.c
#include <stdio.h>
#include <sys/file.h>
#include <errno.h>
main ()
{
  int fd;
  /* Open a non-existent file to cause an error */
  fd = open ("nonexist.txt", O_RDONLY);
  if (fd == -1) /* fd == -1 => an error occurred */
    {
      printf ("errno = %d\n", errno);
      perror ("main");
    }
  fd = open ("/", O_WRONLY); /* Force a different error */
  if (fd == -1)
    {
      printf ("errno = %d\n", errno);
      perror ("main");
    }
  /* Execute a successful system call */
  fd = open ("nonexist.txt", O_RDONLY | O_CREAT, 0644);
  printf ("errno = %d\n", errno); /* Display after successful call */
  perror ("main");
  errno = 0; /* Manually reset error variable */
  perror ("main");
}

Don’t worry about how open () works; I’ll describe it later in the chapter. Here’s the
output from the program:

$ showErrno                     ...run the program.
errno = 2
main: No such file or directory
errno = 21
main: Is a directory
errno = 21                      ...even after a successful call.
main: Is a directory
main: Error 0

REGULAR FILE MANAGEMENT
My description of file management system calls is split up into four main subsections:
• A primer that describes the main concepts behind UNIX files and file descriptors.
• A description of the basic file management system calls, using a sample program
called “reverse” that reverses the lines of a file.
• An explanation of a few advanced system calls, using a sample program called
“monitor,” which periodically scans directories and displays the names of files
within them that have changed since the last scan.
• A description of the remaining file management system calls, using some
miscellaneous snippets of source code.

A File Management Primer
The file management system calls allow you to manipulate the full collection of regular,
directory, and special files, including the following:
• disk-based files
• terminals
• printers
• interprocess communication facilities, such as pipes and sockets
In most cases, open () is used to initially access or create a file. If open () succeeds, it
returns a small integer called a file descriptor that is used in subsequent I/O operations
on that file. If open () fails, it returns -1. Here’s a snippet of code that illustrates a
typical sequence of events:

int fd; /* File descriptor */
fd = open (fileName, ...); /* Open file, return file descriptor */
if (fd == -1) { /* deal with error condition */ }
fcntl (fd, ...); /* Set some I/O flags if necessary */
read (fd, ...); /* Read from file */
write (fd, ...); /* Write to file */
lseek (fd, ...); /* Seek within file */
close (fd); /* Close the file, freeing file descriptor */

When a process no longer needs to access an open file, it should close it, using the
close () system call. All of a process’ open files are automatically closed when the
process terminates. Although this means that you may often omit an explicit call to
close (), it’s better programming practice to close your files.
File descriptors are numbered sequentially, starting from zero. By convention, the
first three file descriptor values have a special meaning, as shown in Figure 13.5. For
example, the printf () library function always sends its output by means of file
descriptor 1, and scanf () always reads its input via file descriptor 0. When a reference
to a file is closed, the file descriptor is freed and may be reassigned by a subsequent
open (). Most I/O system calls require a file descriptor as their first argument so that
they know which file to operate on.

Value   Meaning
0       standard input (stdin)
1       standard output (stdout)
2       standard error (stderr)
FIGURE 13.5
File descriptor values for standard I/O channels.

A single file may be opened several times and may thus have several file descriptors
associated with it, as shown in Figure 13.6. Each file descriptor has its own private
set of properties, such as the following, that have nothing to do with the file with which
the descriptor is associated:
• A file pointer that records the offset in the file it is reading or writing. When a file
descriptor is created, its file pointer is positioned at offset 0 in the file (the first
character) by default. As the process reads or writes, the file pointer is updated
accordingly. For example, if a process opened a file and then read 10 bytes from
the file, the file pointer would end up positioned at offset 10. If the process then
wrote 20 bytes, the bytes at offset 10..29 in the file would be overwritten, and the
file pointer would end up positioned at offset 30.
• A flag that indicates whether the descriptor should automatically be closed if the
process execs. [exec () is described later in the chapter.]
• A flag that indicates whether all of the output to the file should be appended to
the end of the file.

[Diagram not reproduced.]
FIGURE 13.6
Many file descriptors, one file.

In addition to these properties, the following ones are meaningful only if the file is a
special file such as a pipe or a socket:
• A flag that indicates whether a process should block on input from the file if the
file doesn’t currently contain any input.
• A number that indicates a process ID or process group that should be sent a
SIGIO signal if input becomes available on the file. (Signals and process groups
are discussed later in the chapter.)
The system calls open () and fcntl () allow you to manipulate these flags and are
described later.
First Example: reverse
As a first example, I’ll describe the most basic I/O system calls. Figure 13.7 shows a list
of them, together with brief descriptions of their functions. To illustrate the use of these
system calls, I’ll use a small utility program called “reverse.c”. As well as being a good
vehicle for my presentation, it doubles as a nice example of how to write a UNIX utility.
Figure 13.8 provides a description of reverse, an example of which is the following
commands:

$ cc reverse.c -o reverse          ...compile the program.
$ cat test                         ...list the test file.
Christmas is coming,
The days that grow shorter,
Remind me of seasons I knew in the past.
$ reverse test                     ...reverse the file.
Remind me of seasons I knew in the past.
The days that grow shorter,
Christmas is coming,
$ reverse -c test                  ...reverse the lines too.
.tsap eht ni wenk I snosaes fo em dnimeR
,retrohs worg taht syad ehT
,gnimoc si samtsirhC
$ cat test | reverse               ...pipe output to ‘reverse’.
Remind me of seasons I knew in the past.
The days that grow shorter,
Christmas is coming,
$ _

Name     Function
open     opens/creates a file
read     reads bytes from a file into a buffer
write    writes bytes from a buffer to a file
lseek    moves to a particular offset in a file
close    closes a file
unlink   removes a file
FIGURE 13.7
UNIX system calls for basic I/O operations.

Utility: reverse -c [fileName]
reverse reverses the lines of its input and displays them to standard output. If no file
name is specified, reverse reverses its standard input. When the -c option is used,
reverse also reverses the characters in each line.
FIGURE 13.8
Description of the reverse program.

How reverse Works
The reverse utility works by performing two passes over its input. During the first pass,
it notes the starting offset of each line in the file and stores this information in an array.
During the second pass, it jumps to the start of each line in reverse order, copying it
from the original input file to its standard output.
If no file name is specified on the command line, reverse reads from its standard
input during the first pass and copies it into a temporary file for the second pass. When
the program is finished, the temporary file is removed.
Figure 13.9 shows an overview of the program flow, together with a list of the
functions that are associated with each action and a list of the system calls used by each
step. What follows is a complete listing of “reverse.c,” the source code of reverse. Skim
through the code and then read the description of the system calls that follow. The code
is also available on-line. (See the preface for more information.)

Step  Action                                  Functions              System calls
1     Parse command line.                     parseCommandLine,      open
                                              processOptions
2     If reading from standard input,         pass1                  open
      create temporary file to store input;
      otherwise open input file for reading.
3     Read from file in chunks, storing the   pass1, trackLines      read, write
      starting offset of each line in an
      array. If reading from standard input,
      copy each chunk to the temporary file.
4     Read the input file again, this time    pass2, processLine,    lseek
      backwards, copying each line to         reverseLine
      standard output. Reverse the line if
      the -c option was chosen.
5     Close the file. Delete it if it is a    pass2                  close
      temporary file.
FIGURE 13.9
Description of algorithm used in reverse.c.
reverse.c: Listing
  1  #include <fcntl.h> /* For file mode definitions */
  2  #include <stdio.h>
  3  #include <stdlib.h>
  4  #include <unistd.h> /* For read, write, lseek, close, unlink, getpid */
  5
  6  /* Enumerator */
  7  enum { FALSE, TRUE }; /* Standard false and true values */
  8  enum { STDIN, STDOUT, STDERR }; /* Standard I/O channel indices */
  9
 10
 11  /* #define Statements */
 12  #define BUFFER_SIZE 4096 /* Copy buffer size */
 13  #define NAME_SIZE 12
 14  #define MAX_LINES 100000 /* Max lines in file */
 15
 16
 17  /* Globals */
 18  char *fileName = NULL; /* Points to file name */
 19  char tmpName [NAME_SIZE];
 20  int charOption = FALSE; /* Set to true if -c option is used */
 21  int standardInput = FALSE; /* Set to true if reading stdin */
 22  int lineCount = 0; /* Total number of lines in input */
 23  int lineStart [MAX_LINES]; /* Store offsets of each line */
 24  int fileOffset = 0; /* Current position in input */
 25  int fd; /* File descriptor of input */
 26
 27  /****************************************************************/
 28
 29  main (argc, argv)
 30
 31  int argc;
 32  char* argv [];
 33
 34  {
 35    parseCommandLine (argc,argv); /* Parse command line */
 36    pass1 (); /* Perform first pass through input */
 37    pass2 (); /* Perform second pass through input */
 38    return (/* EXITSUCCESS */ 0); /* Done */
 39  }
 40
 41  /****************************************************************/
 42
 43  parseCommandLine (argc, argv)
 44
 45  int argc;
 46  char* argv [];
 47
 48  /* Parse command line arguments */
 49
 50  {
 51    int i;
 52
 53    for (i= 1; i < argc; i++)
 54      {
 55        if(argv[i][0] == '-')
 56          processOptions (argv[i]);
 57        else if (fileName == NULL)
 58          fileName= argv[i];
 59        else
 60          usageError (); /* An error occurred */
 61      }
 62
 63    standardInput = (fileName == NULL);
 64  }
 65
 66  /****************************************************************/
 67
 68  processOptions (str)
 69
 70  char* str;
 71
 72  /* Parse options */
 73
 74  {
 75    int j;
 76
 77    for (j= 1; str[j] != '\0'; j++)
 78      {
 79        switch(str[j]) /* Switch on command line flag */
 80          {
 81            case'c':
 82              charOption = TRUE;
 83              break;
 84
 85            default:
 86              usageError ();
 87              break;
 88          }
 89      }
 90  }
 91
 92  /****************************************************************/
 93
 94  usageError ()
 95
 96  {
 97    fprintf (stderr, "Usage: reverse -c [filename]\n");
 98    exit (/* EXITFAILURE */ 1);
 99  }
100
101  /****************************************************************/
102
103  pass1 ()
104
105  /* Perform first scan through file */
106
107  {
108    int tmpfd, charsRead, charsWritten;
109    char buffer [BUFFER_SIZE];
110
111    if (standardInput) /* Read from standard input */
112      {
113        fd = STDIN;
114        sprintf (tmpName, ".rev.%d",getpid ()); /* Random name */
115        /* Create temporary file to store copy of input */
116        tmpfd = open (tmpName, O_CREAT | O_RDWR, 0600);
117        if (tmpfd == -1) fatalError ();
118      }
119    else /* Open named file for reading */
120      {
121        fd = open (fileName, O_RDONLY);
122        if (fd == -1) fatalError ();
123      }
124
125    lineStart[0] = 0; /* Offset of first line */
126
127    while (TRUE) /* Read all input */
128      {
129        /* Fill buffer */
130        charsRead = read (fd, buffer, BUFFER_SIZE);
131        if (charsRead == 0) break; /* EOF */
132        if (charsRead == -1) fatalError (); /* Error */
133        trackLines (buffer, charsRead); /* Process line */
134        /* Copy line to temporary file if reading from stdin */
135        if (standardInput)
136          {
137            charsWritten = write (tmpfd, buffer, charsRead);
138            if(charsWritten != charsRead) fatalError ();
139          }
140      }
141
142    /* Store offset of trailing line, if present */
143    lineStart[lineCount + 1] = fileOffset;
144
145    /* If reading from standard input, prepare fd for pass2 */
146    if (standardInput) fd = tmpfd;
147  }
148
149  /****************************************************************/
150
151  trackLines (buffer, charsRead)
152
153  char* buffer;
154  int charsRead;
155
156  /* Store offsets of each line start in buffer */
157
158  {
159    int i;
160
161    for (i = 0; i < charsRead; i++)
162      {
163        ++fileOffset; /* Update current file position */
164        if (buffer[i] == '\n') lineStart[++lineCount] = fileOffset;
165      }
166  }
167
168  /****************************************************************/
169
170  int pass2 ()
171
172  /* Scan input file again, displaying lines in reverse order */
173
174  {
175    int i;
176
177    for (i = lineCount - 1; i >= 0; i--)
178      processLine (i);
179
180    close (fd); /* Close input file */
181    if (standardInput) unlink (tmpName); /* Remove temp file */
182  }
183
184  /****************************************************************/
185
186  processLine (i)
187
188  int i;
189
190  /* Read a line and display it */
191
192  {
193    int charsRead;
194    char buffer [BUFFER_SIZE];
195
196    lseek (fd, lineStart[i], SEEK_SET); /* Find the line and read it */
197    charsRead = read (fd, buffer, lineStart[i+1] - lineStart[i]);
198    /* Reverse line if -c option was selected */
199    if (charOption) reverseLine (buffer, charsRead);
200    write (1, buffer, charsRead); /* Write it to standard output */
201  }
202
203  /****************************************************************/
204
205  reverseLine (buffer, size)
206
207  char* buffer;
208  int size;
209
210  /* Reverse all the characters in the buffer */
211
212  {
213    int start = 0, end = size - 1;
214    char tmp;
215
216    if (buffer[end] == '\n') --end; /* Leave trailing newline */
217
218    /* Swap characters in a pairwise fashion */
219    while (start < end)
220      {
221        tmp = buffer[start];
222        buffer[start] = buffer[end];
223        buffer[end] = tmp;
224        ++start; /* Increment start index */
225        --end; /* Decrement end index */
226      }
227  }
228
229  /****************************************************************/
230
231  fatalError ()
232
233  {
234    perror ("reverse: "); /* Describe error */
235    exit (1);
236  }

Opening a File: open ()
The reverse utility begins by executing parseCommandLine () [line 43], which sets
various flags, depending on which options are chosen. If a filename is specified, the
variable fileName is set to point to the name and standardInput is set to FALSE;
otherwise, fileName is set to NULL and standardInput is set to TRUE. Next, pass1 ()
[line 103] is executed. Pass1 () performs one of the following actions:
• If reverse is reading from standard input, a temporary file is created with read
and write permissions for the owner and no permissions for anyone else (octal
mode 600). The file is opened in read/write mode and is used to store a copy of
the standard input for use during pass 2. During pass 1, the input is taken from
standard input, so the file descriptor fd is set to STDIN, defined to be 0 at the top
of the program. Recall that standard input is always file descriptor zero.
• If reverse is reading from a named file, the file is opened in read-only mode so
that its contents may be read during pass 1, using the file descriptor fd.
Each action uses the open () system call; the first action uses it to create a file, the
second to access an existing file. The open () system call is described in Figure 13.10.

System Call: int open (char* fileName, int mode [,int permissions])
open () allows you to open or create a file for reading or writing. fileName is an
absolute or relative pathname and mode is a bitwise OR of a read/write flag, with or
without some miscellaneous flags. permissions is a number that encodes the value of
the file’s permission flags and should be supplied only when a file is being created. It
is usually written using the octal encoding scheme described in Chapter 2. The
permissions value is affected by the process’ umask value, described in Chapter 4.
The values of the predefined read/write and miscellaneous flags are defined in
“/usr/include/fcntl.h”. The read/write flags are as follows:

FLAG         MEANING
O_RDONLY     Open for read only.
O_WRONLY     Open for write only.
O_RDWR       Open for read and write.

The miscellaneous flags are as follows:

FLAG         MEANING
O_APPEND     Position the file pointer at the end of the file before each write ().
O_CREAT      If the file doesn’t exist, create it and set the owner ID to the
             process’ effective UID. The umask value is used when determining
             the initial permission flag settings.
O_EXCL       If O_CREAT is set and the file exists, then open () fails.
O_NONBLOCK   (Called O_NDELAY on some systems.) This setting works only for
             named pipes. If set, an open for read only will return immediately,
             regardless of whether the write end is open, and an open for write
             only will fail if the read end isn’t open. If clear, an open for read
             only or write only will block until the other end is also open.
O_TRUNC      If the file exists, it is truncated to length zero.

open () returns a nonnegative file descriptor if successful; otherwise, it returns -1.
FIGURE 13.10
Description of the open () system call.
Creating a File
To create a file, use the O_CREAT flag as part of the mode flags, and supply the initial
file permission flag settings as an octal value. For example, lines 114-117 create a
temporary file with read and write permission for the owner and then open the file for
reading and writing:

114  sprintf (tmpName, ".rev.%d", getpid ()); /* Random name */
115  /* Create temporary file to store copy of input */
116  tmpfd = open (tmpName, O_CREAT | O_RDWR, 0600);
117  if (tmpfd == -1) fatalError ();

The getpid () function is a system call that returns the process’ ID (PID) number,
which is guaranteed to be unique among the processes currently running. This is a handy
way to generate unique temporary file names. [For more details on getpid (), see the
“Process Management” section of the chapter.] Note that I chose the name of the
temporary file to begin with a period so that it doesn’t show up in an ls listing. Files
that begin with a period are sometimes known as hidden files.

Opening an Existing File
To open an existing file, specify the mode flags only. Lines 121-122 open a named file
for read only:

121  fd = open (fileName, O_RDONLY);
122  if (fd == -1) fatalError ();

Other Open Flags
The other more complicated flag settings for open (), such as O_NONBLOCK, are
intended for use with pipes, sockets, and STREAMS, all described later in the
chapter. Right now, the O_CREAT flag is probably the only miscellaneous flag that
you’ll need.

Reading From a File: read ()
Once reverse has initialized the file descriptor fd for input, it reads chunks of input and
processes them until the end of the file is reached. To read bytes from a file, reverse
uses the read () system call, which works as shown in Figure 13.11. The read () system
call performs low-level input and has none of the formatting capabilities of scanf ().
The benefit of read () is that it bypasses the additional layer of buffering supplied by
the C library functions and is therefore very fast. Although I could have read one
character of input at a time, that would have resulted in a large number of system calls,
thus slowing down the execution of my program considerably. Instead, I used read () to
read up to BUFFER_SIZE characters at a time. BUFFER_SIZE was chosen to be a
multiple of the disk block size, for efficient copying. Lines 130-132 perform the read and
test the return result:

130  charsRead = read (fd, buffer, BUFFER_SIZE);
131  if (charsRead == 0) break; /* EOF */
132  if (charsRead == -1) fatalError (); /* Error */

System Call: ssize_t read (int fd, void* buf, size_t count)
Note: this synopsis describes how read () operates when reading a regular file. For
information on reading from special files, see later sections of the chapter.
read () copies count bytes from the file referenced by the file descriptor fd into
the buffer buf. The bytes are read starting at the current file position, which is then
updated accordingly.
read () copies as many bytes from the file as it can, up to the number specified
by count, and returns the number of bytes actually copied. If a read () is attempted
after the last byte has already been read, it returns 0, which indicates end of file.
If successful, read () returns the number of bytes that it read; otherwise, it
returns -1.
FIGURE 13.11
Description of the read () system call.

As each chunk of input is read, it is passed to the trackLines () function. This function
scans the input buffer for newlines and stores the offset of the first character in each
line in the lineStart array. The variable fileOffset is used to maintain the current file
offset. The contents of lineStart are used during the second pass.

Writing to a File: write ()
When reverse is reading from standard input, it creates a copy of the input for use
during pass 2. To do this, it sets the file descriptor tmpfd to refer to a temporary file and
then writes each chunk of input to the file during the read loop. To write bytes to a file,
it uses the write () system call, which works as shown in Figure 13.12. The write ()
system call performs low-level output and has none of the formatting capabilities of
printf (). The benefit of write () is that it bypasses the additional layer of buffering
supplied by the C library functions and is therefore very fast. Lines 134-139 perform
the write operation:

134  /* Copy line to temporary file if reading standard input */
135  if (standardInput)
136    {
137      charsWritten = write (tmpfd, buffer, charsRead);
138      if (charsWritten != charsRead) fatalError ();
139    }
448 Chapter 13 Systems Programming Regular File Management 449

System Call: ssize_t write (int fd, void* buf, size_t count)

Note: this synopsis describes how write () operates when writing to a regular file. For
information on writing to special files, see later sections of this chapter.
write () copies count bytes from a buffer buf to the file referenced by the file
descriptor fd. The bytes are written starting at the current file position, which is then
updated accordingly. If the O_APPEND flag was set for fd, the file position is set to
the end of the file before each write.
write () copies as many bytes from the buffer as it can, up to the number specified
by count, and returns the number of bytes actually copied. Your process should
always check the return value. If the return value isn’t count, then the disk probably
filled up and no space was left.
If successful, write () returns the number of bytes that were written; otherwise,
it returns -1.

FIGURE 13.12
Description of the write () system call.

Moving in a File: lseek ()

Once the first pass is completed, the array lineStart contains the offsets of the first
character of each line of the input file. During pass 2, the lines are read in reverse order
and displayed to standard output. In order to read the lines out of sequence, the program
makes use of lseek (), which is a system call that allows a descriptor’s file pointer
to be changed. Figure 13.13 describes lseek ().

System Call: off_t lseek (int fd, off_t offset, int mode)

lseek () allows you to change a descriptor’s current file position. fd is the file
descriptor, offset is a long integer, and mode describes how offset should be interpreted.
The three possible values of mode are defined in “/usr/include/stdio.h” and have
the following meanings:

VALUE       MEANING
SEEK_SET    offset is relative to the start of the file.
SEEK_CUR    offset is relative to the current file position.
SEEK_END    offset is relative to the end of the file.

lseek () fails if you try to move before the start of the file.
If successful, lseek () returns the current file position; otherwise, it returns -1.
On some systems, the modes are defined in “/usr/include/unistd.h.”

FIGURE 13.13
Description of the lseek () system call.

Lines 196-197 seek to the start of a line and then read in all of the characters
in the line. Note that the number of characters to read is calculated by subtracting the
start offset of the current line from the start offset of the next line:

196  lseek (fd, lineStart[i], SEEK_SET); /* Find line and read it */
197  charsRead = read (fd, buffer, lineStart[i+1] - lineStart[i]);

If you want to find out your current location without moving, use an offset value of
zero relative to the current position:

currentOffset = lseek (fd, 0, SEEK_CUR);

If you move past the end of the file and then perform a write (), the kernel automatically
extends the size of the file and treats the intermediate file area as if it were filled
with NULL (ASCII 0) characters. Interestingly enough, it doesn’t allocate disk space
for the intermediate area, which is confirmed by the following example:

$ cat sparse.c                   ...list the test file.
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
main ()
{
  int i, fd;
  /* Create a sparse file */
  fd = open ("sparse.txt", O_CREAT | O_RDWR, 0600);
  write (fd, "sparse", 6);
  lseek (fd, 60006, SEEK_SET);
  write (fd, "file", 4);
  close (fd);
  /* Create a normal file */
  fd = open ("normal.txt", O_CREAT | O_RDWR, 0600);
  write (fd, "normal", 6);
  for (i = 1; i <= 60000; i++)
    write (fd, "\0", 1);
  write (fd, "file", 4);
  close (fd);
}
$ sparse                         ...execute the file.
$ ls -l *.txt                    ...look at the files.
-rw-r--r--  1 glass  60010 Feb 14 15:06 normal.txt
-rw-r--r--  1 glass  60010 Feb 14 15:06 sparse.txt
$ ls -s *.txt                    ...list their block usage.
  60 normal.txt*                 ...uses a full 60 blocks.
   8 sparse.txt*                 ...only uses 8 blocks.
$ _
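The two lseek () idioms described here, querying a position without moving and seeking past the end of a file before writing, can be combined in a small sketch. It is an illustration rather than part of the original programs: fileLength and makeSparseFile are hypothetical helper names, and error handling is kept to a minimum.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the length of an open file in bytes, or -1 on error.
   The descriptor's file position is restored before returning. */
off_t fileLength (int fd)
{
  off_t saved = lseek (fd, 0, SEEK_CUR);      /* Where are we now? */
  off_t length;

  if (saved == -1) return -1;
  length = lseek (fd, 0, SEEK_END);           /* End offset == file size */
  if (length == -1) return -1;
  if (lseek (fd, saved, SEEK_SET) == -1) return -1;  /* Move back */
  return length;
}

/* Create fileName holding "sparse", a gap of gapSize unwritten bytes,
   and then "file". Returns an open descriptor, or -1 on error. */
int makeSparseFile (const char* fileName, off_t gapSize)
{
  int fd = open (fileName, O_CREAT | O_TRUNC | O_RDWR, 0600);

  if (fd == -1) return -1;
  if (write (fd, "sparse", 6) != 6 ||
      lseek (fd, gapSize, SEEK_CUR) == -1 ||  /* Skip past the end */
      write (fd, "file", 4) != 4)
    {
      close (fd);
      return -1;
    }
  return fd;
}
```

Calling makeSparseFile ("sparse.txt", 60000) and then fileLength () on the returned descriptor should report 60010 bytes, the same size that ls -l shows in the session; how many disk blocks the gap occupies depends on the file system.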

Files that contain “gaps” like this are termed “sparse” files; for details on how they are
actually stored, see Chapter 14.

Closing a File: close ()

When pass 2 is over, reverse uses the close () system call to free the input file descriptor.
Figure 13.14 provides a description of close (). Line 180 contains the call to close ():

180  close (fd); /* Close input file */

System Call: int close (int fd)

close () frees the file descriptor fd. If fd is the last file descriptor associated with a
particular open file, the kernel resources associated with the file are deallocated.
When a process terminates, all of its file descriptors are automatically closed, but it’s
better programming practice to close a file when you’re done with it. If you close a
file descriptor that’s already closed, an error occurs.
If successful, close () returns zero; otherwise, it returns -1.

FIGURE 13.14
Description of the close () system call.

Just because a file is closed does not guarantee that the file’s buffers are immediately
flushed to disk; for more information on file buffering, see Chapter 14.

Deleting a File: unlink ()

If reverse reads from standard input, it stores a copy of the input in a temporary file. At
the end of pass 2, it removes this file, using the unlink () system call, which works as
shown in Figure 13.15. Line 181 contains the call to unlink ():

181  if (standardInput) unlink (tmpName); /* Remove temp file */

For more information about hard links, see Chapter 14.

System Call: int unlink (const char* fileName)

unlink () removes the hard link from the name fileName to its file. If fileName is the
last link to the file, the file’s resources are deallocated. In this case, if any process’ file
descriptors are currently associated with the file, the directory entry is removed
immediately, but the file is deallocated only after all of the file descriptors are closed.
This means that an executable file can unlink itself during execution and still
continue to completion.
If successful, unlink () returns zero; otherwise, it returns -1.

FIGURE 13.15
Description of the unlink () system call.

Second Example: monitor

This section contains a description of some more advanced system calls, listed in
Figure 13.16. The use of these calls is demonstrated in the context of a program called
monitor, which allows a user to monitor a series of named files and to obtain information
whenever any of them are modified. Figure 13.17 gives a description of monitor.

Name        Function
stat        obtains status information about a file
fstat       works just like stat
getdents    obtains directory entries

FIGURE 13.16
Advanced UNIX I/O system calls.

Utility: monitor [-t delay] [-l count] { fileName }+

monitor scans all of the specified files every delay seconds and displays information
about any of the specified files that were modified since the last scan. If fileName is
a directory, all of the files inside that directory are scanned. File modification is
indicated in one of three ways:

LABEL       MEANING
ADDED       Indicates that the file was created since the last scan. Every
            file in the file list is given this label during the first scan.
CHANGED     Indicates that the file was modified since the last scan.
DELETED     Indicates that the file was deleted since the last scan.

By default, monitor will scan forever, although you can specify the total number of
scans by using the -l option. The default delay time is 10 seconds between scans,
although this may be overridden by using the -t option.

FIGURE 13.17
Description of the monitor program.

In the following example, I monitored an individual file and a directory, storing the
output of monitor into a temporary file:

% ls                             ...look at home directory.
monitor.c   monitor   tmp/
% ls tmp                         ...look at “tmp” directory.
b

% monitor tmp myFile.txt >& monitor.out &    ...start.
[1] 12841
% cat > tmp/a                    ...create a file in “/tmp”.
hi there
% cat > myFile.txt               ...create “myFile.txt”.
hi there
% cat > myFile.txt               ...change “myFile.txt”.
hi again
% rm tmp/a                       ...delete “tmp/a”.
% jobs                           ...look at jobs.
[1] + Running     monitor tmp myFile.txt >& monitor.out
% kill %1                        ...kill monitor job.
[1] Terminated    monitor tmp myFile.txt >& monitor.out
% cat monitor.out                ...look at output.
ADDED tmp/b size 9 bytes, mod. time = Sun Jan 18 00:38:55 1998
ADDED tmp/a size 9 bytes, mod. time = Fri Feb 13 18:51:09 1998
ADDED myFile.txt size 9 bytes, mod. time = Fri Feb 13 18:51:21 1998
CHANGED myFile.txt size 18 bytes, mod. time = Fri Feb 13 18:51:49 1998
DELETED tmp/a

Notice how the contents of the “monitor.out” file reflected the additions, modifications,
and deletions of the monitored file and directory.

How monitor Works

The monitor utility continually scans the specified files and directories for modifications.
It uses the stat () system call to obtain status information about named files, including
their type and most recent modification time, and uses the getdents () system
call to scan directories. Monitor maintains a status table called stats, which holds the
following information about each file that it finds:

• the name of the file
• the status information obtained by stat ()
• a record of whether the file was present during the current scan and the previous scan

During a scan, monitor processes each file as follows:

• If the file isn’t currently in the scan table, it’s added and the message “ADDED”
  is displayed.
• If the file is already in the scan table and has been modified since the last scan,
  the message “CHANGED” is displayed.

At the end of a scan, all entries that were present during the previous scan, but not
during the current scan, are removed from the table and the message “DELETED” is
displayed.

Following is a complete listing of “monitor.c”, the source code of monitor. Skim
through it and then read the description of the system calls that follow.

monitor.c: Listing

  1  #include <stdio.h>        /* For printf, fprintf */
  2  #include <string.h>       /* For strcmp */
  3  #include <ctype.h>        /* For isdigit */
  4  #include <fcntl.h>        /* For O_RDONLY */
  5  #include <sys/dirent.h>   /* For getdents */
  6  #include <sys/stat.h>     /* For IS macros */
  7  #include <sys/types.h>    /* For mode_t */
  8  #include <time.h>         /* For localtime, asctime */
  9
 10
 11  /* #define Statements */
 12  #define MAX_FILES            100
 13  #define MAX_FILENAME         50
 14  #define NOT_FOUND            -1
 15  #define FOREVER              -1
 16  #define DEFAULT_DELAY_TIME   10
 17  #define DEFAULT_LOOP_COUNT   FOREVER
 18
 19
 20  /* Booleans */
 21  enum { FALSE, TRUE };
 22
 23
 24  /* Status structure, one per file. */
 25  struct statStruct
 26   {
 27     char fileName [MAX_FILENAME]; /* File name */
 28     int lastCycle, thisCycle; /* To detect changes */
 29     struct stat status; /* Information from stat () */
 30   };
 31
 32
 33  /* Globals */
 34  char* fileNames [MAX_FILES]; /* One per file on command line */
 35  int fileCount; /* Count of files on command line */
 36  struct statStruct stats [MAX_FILES]; /* One per matching file */
 37  int loopCount = DEFAULT_LOOP_COUNT; /* Number of times to loop */
 38  int delayTime = DEFAULT_DELAY_TIME; /* Seconds between loops */
 39
 40  /****************************************************************/
 41
 42  main (argc, argv)
 43
 44  int argc;
 45  char* argv [];

 46
 47  {
 48    parseCommandLine (argc, argv); /* Parse command line */
 49    monitorLoop (); /* Execute main monitor loop */
 50    return (/* EXIT_SUCCESS */ 0);
 51  }
 52
 53  /****************************************************************/
 54
 55  parseCommandLine (argc, argv)
 56
 57  int argc;
 58  char* argv [];
 59
 60  /* Parse command line arguments */
 61
 62  {
 63    int i;
 64
 65    for (i = 1; ( (i < argc) && (i < MAX_FILES) ); i++)
 66      {
 67        if (argv[i][0] == '-')
 68          processOptions (argv[i]);
 69        else
 70          fileNames[fileCount++] = argv[i];
 71      }
 72
 73    if (fileCount == 0) usageError ();
 74  }
 75
 76  /****************************************************************/
 77
 78  processOptions (str)
 79
 80  char* str;
 81
 82  /* Parse options */
 83
 84  {
 85    int j;
 86
 87    for (j = 1; str[j] != NULL; j++)
 88      {
 89        switch(str[j]) /* Switch on option letter */
 90          {
 91            case 't':
 92              delayTime = getNumber (str, &j);
 93              break;
 94
 95            case 'l':
 96              loopCount = getNumber (str, &j);
 97              break;
 98          }
 99      }
100  }
101
102  /****************************************************************/
103
104  getNumber (str, i)
105
106  char* str;
107  int* i;
108
109  /* Convert a numeric ASCII option to a number */
110
111  {
112    int number = 0;
113    int digits = 0; /* Count the digits in the number */
114
115    while (isdigit (str[(*i) + 1])) /* Convert chars to ints */
116      {
117        number = number * 10 + str[++(*i)] - '0';
118        ++digits;
119      }
120
121    if (digits == 0) usageError (); /* There must be a number */
122    return (number);
123  }
124
125  /****************************************************************/
126
127  usageError ()
128
129  {
130    fprintf (stderr, "Usage: monitor -t<seconds> -l<loops> {filename}+\n");
131    exit (/* EXIT_FAILURE */ 1);
132  }
133
134  /****************************************************************/
135
136  monitorLoop ()
137
138  /* The main monitor loop */
139
140  {
141    do
142      {

143        monitorFiles (); /* Scan all files */
144        fflush (stdout); /* Flush standard output */
145        fflush (stderr); /* Flush standard error */
146        sleep (delayTime); /* Wait until next loop */
147      }
148    while (loopCount == FOREVER || --loopCount > 0);
149  }
150
151  /****************************************************************/
152
153  monitorFiles ()
154
155  /* Process all files */
156
157  {
158    int i;
159
160    for (i = 0; i < fileCount; i++)
161      monitorFile (fileNames[i]);
162
163    for (i = 0; i < MAX_FILES; i++) /* Update stat array */
164      {
165        if (stats[i].lastCycle && !stats[i].thisCycle)
166          printf ("DELETED %s\n", stats[i].fileName);
167
168        stats[i].lastCycle = stats[i].thisCycle;
169        stats[i].thisCycle = FALSE;
170      }
171  }
172
173  /****************************************************************/
174
175  monitorFile (fileName)
176
177  char* fileName;
178
179  /* Process a single file/directory */
180
181  {
182    struct stat statBuf;
183    mode_t mode;
184    int result;
185
186    result = stat (fileName, &statBuf); /* Obtain file status */
187
188    if (result == -1) /* Status was not available */
189      {
190        fprintf (stderr, "Cannot stat %s\n", fileName);
191        return;
192      }
193
194    mode = statBuf.st_mode; /* Mode of file */
195
196    if (S_ISDIR (mode)) /* Directory */
197      processDirectory (fileName);
198    else if (S_ISREG (mode) || S_ISCHR (mode) || S_ISBLK (mode))
199      updateStat (fileName, &statBuf); /* Regular file */
200  }
201
202  /****************************************************************/
203
204  processDirectory (dirName)
205
206  char* dirName;
207
208  /* Process all files in the named directory */
209
210  {
211    int fd, charsRead;
212    struct dirent dirEntry;
213    char fileName [MAX_FILENAME];
214
215    fd = open (dirName, O_RDONLY); /* Open for reading */
216    if (fd == -1) fatalError ();
217
218    while (TRUE) /* Read all directory entries */
219      {
220        charsRead = getdents (fd, &dirEntry, sizeof (struct dirent));
221        if (charsRead == -1) fatalError ();
222        if (charsRead == 0) break; /* EOF */
223        if (strcmp (dirEntry.d_name, ".") != 0 &&
224            strcmp (dirEntry.d_name, "..") != 0) /* Skip . and .. */
225          {
226            sprintf (fileName, "%s/%s", dirName, dirEntry.d_name);
227            monitorFile (fileName); /* Call recursively */
228          }
229
230        lseek (fd, dirEntry.d_off, SEEK_SET); /* Find next entry */
231      }
232
233    close (fd); /* Close directory */
234  }
235
236  /****************************************************************/
237
238  updateStat (fileName, statBuf)
239
240  char* fileName;
241  struct stat* statBuf;
242

243  /* Add a status entry if necessary */
244
245  {
246    int entryIndex;
247
248    entryIndex = findEntry (fileName); /* Find existing entry */
249
250    if (entryIndex == NOT_FOUND)
251      entryIndex = addEntry (fileName, statBuf); /* Add new entry */
252    else
253      updateEntry (entryIndex, statBuf); /* Update existing entry */
254
255    if (entryIndex != NOT_FOUND)
256      stats[entryIndex].thisCycle = TRUE; /* Update status array */
257  }
258
259  /****************************************************************/
260
261  findEntry (fileName)
262
263  char* fileName;
264
265  /* Locate the index of a named file in the status array */
266
267  {
268    int i;
269
270    for (i = 0; i < MAX_FILES; i++)
271      if (stats[i].lastCycle &&
272          strcmp (stats[i].fileName, fileName) == 0) return (i);
273
274    return (NOT_FOUND);
275  }
276
277  /****************************************************************/
278
279  addEntry (fileName, statBuf)
280
281  char* fileName;
282  struct stat* statBuf;
283
284  /* Add a new entry into the status array */
285
286  {
287    int index;
288
289    index = nextFree (); /* Find the next free entry */
290    if (index == NOT_FOUND) return (NOT_FOUND); /* None left */
291    strcpy (stats[index].fileName, fileName); /* Add filename */
292    stats[index].status = *statBuf; /* Add status information */
293    printf ("ADDED "); /* Notify standard output */
294    printEntry (index); /* Display status information */
295    return (index);
296  }
297
298  /****************************************************************/
299
300  nextFree ()
301
302  /* Return the next free index in the status array */
303
304  {
305    int i;
306
307    for (i = 0; i < MAX_FILES; i++)
308      if (!stats[i].lastCycle && !stats[i].thisCycle) return (i);
309
310    return (NOT_FOUND);
311  }
312
313  /****************************************************************/
314
315  updateEntry (index, statBuf)
316
317  int index;
318  struct stat* statBuf;
319
320  /* Display information if the file has been modified */
321
322  {
323    if (stats[index].status.st_mtime != statBuf->st_mtime)
324      {
325        stats[index].status = *statBuf; /* Store stat information */
326        printf ("CHANGED "); /* Notify standard output */
327        printEntry (index);
328      }
329  }
330
331  /****************************************************************/
332
333  printEntry (index)
334
335  int index;
336
337  /* Display an entry of the status array */
338
339  {
340    printf ("%s ", stats[index].fileName);
341    printStat (&stats[index].status);
342  }

343
344  /****************************************************************/
345
346  printStat (statBuf)
347
348  struct stat* statBuf;
349
350  /* Display a status buffer */
351
352  {
353    printf ("size %lu bytes, mod. time = %s", statBuf->st_size,
354            asctime (localtime (&statBuf->st_mtime)));
355  }
356
357  /****************************************************************/
358
359  fatalError ()
360
361  {
362    perror ("monitor: ");
363    exit (/* EXIT_FAILURE */ 1);
364  }

Obtaining File Information: stat ()

monitor obtains its file information by calling stat (), which works as shown in Figure 13.18.
The monitor utility invokes stat () from monitorFile () [line 175] on line 186:

186  result = stat (fileName, &statBuf); /* Obtain file status */

System Call: int stat (const char* name, struct stat* buf)
             int lstat (const char* name, struct stat* buf)
             int fstat (int fd, struct stat* buf)

stat () fills the buffer buf with information about the file name. The stat structure is
defined in “/usr/include/sys/stat.h”. lstat () returns information about a symbolic link
itself, rather than the file it references. fstat () performs the same function as stat (),
except that it takes the file descriptor of the file to be “stat’ed” as its first parameter.
The stat structure contains the following members:

NAME        MEANING
st_dev      the device number
st_ino      the inode number
st_mode     the file type and permission flags
st_nlink    the hard-link count
st_uid      the user ID
st_gid      the group ID
st_size     the file size
st_atime    the last access time
st_mtime    the last modification time
st_ctime    the last status change time

There are some predefined macros defined in “/usr/include/sys/stat.h” that take
st_mode as their argument and return true (1) for the following file types:

MACRO       RETURNS TRUE FOR FILE TYPE
S_ISDIR     directory
S_ISCHR     character special device
S_ISBLK     block special device
S_ISREG     regular file
S_ISFIFO    pipe

The time fields may be decoded with the standard C library asctime () and localtime ()
subroutines.
stat (), lstat (), and fstat () return 0 if successful and -1 otherwise.

FIGURE 13.18
Description of the stat () system call.

monitor examines the mode of the file using the S_ISDIR, S_ISREG, S_ISCHR, and
S_ISBLK macros, processing directory files, and other files as follows:

• If the file is a directory file, it calls processDirectory () [line 204], which applies
  monitorFile () recursively to each of its directory entries.

• If the file is a regular file, a character special file, or a block special file, monitor
  calls updateStat () [line 238], which either adds or updates the file’s status entry.
  If the status changes in any way, updateEntry () [line 315] is called to display the
  file’s new status. The decoding of the time fields is performed by the localtime ()
  and asctime () routines in printStat () [line 346].

Reading Directory Information: getdents ()

processDirectory () [line 204] opens a directory file for reading and then uses getdents ()
to obtain every entry in the directory, as shown in Figure 13.19. processDirectory () is
careful not to trace into the “.” and “..” directories and uses lseek () to jump from one
directory entry to the next. When the directory has been completely searched, it is closed.

System Call: int getdents (int fd, struct dirent* buf, int structSize)

getdents () reads the directory file with descriptor fd from its current position and
fills the structure pointed to by buf with the next entry. The structure dirent is
defined in “/usr/include/sys/dirent.h” and contains the following fields:

NAME        MEANING
d_ino       the inode number
d_off       the offset of the next directory entry
d_reclen    the length of the directory entry structure
d_nam       the length of the filename

getdents () returns the length of the directory entry when successful, 0 when the last
directory entry has already been read, and -1 in the case of an error.

FIGURE 13.19
Description of the getdents () system call.

Some older systems use the getdirentries () system call instead of getdents (). The
usage of getdirentries () differs somewhat from getdents (); see your system’s man
page for details.

Miscellaneous File Management System Calls

Figure 13.20 gives a brief description of some miscellaneous UNIX file management
system calls.

Name        Function
chown       changes a file’s owner or group
chmod       changes a file’s permission settings
dup         duplicates a file descriptor
dup2        similar to dup
fchown      works just like chown
fchmod      works just like chmod
fcntl       gives access to miscellaneous file characteristics
ftruncate   works just like truncate
ioctl       controls a device
link        creates a hard link
mknod       creates a special file
sync        schedules all file buffers to be flushed to disk
truncate    truncates a file

FIGURE 13.20
UNIX file management system calls.

Changing a File’s Owner or Group: chown () and fchown ()

chown () and fchown () change the owner or group of a file. They work as shown in
Figure 13.21.

System Call: int chown (const char* fileName, uid_t ownerId, gid_t groupId)
             int lchown (const char* fileName, uid_t ownerId, gid_t groupId)
             int fchown (int fd, uid_t ownerId, gid_t groupId)

chown () causes the owner and group IDs of fileName to be changed to ownerId and
groupId, respectively. A value of -1 in a particular field means that its associated
value should remain unchanged. lchown () changes the ownership of a symbolic link
itself, rather than the file the link references.
Only a superuser can change the ownership of a file, and a user may change the
group only to another group of which he or she is a member. If fileName is a
symbolic link, the owner and group of the link are changed instead of the file that the
link is referencing.
fchown () is just like chown () except that it takes an open descriptor as an
argument instead of a filename.
Both functions return -1 if unsuccessful and 0 otherwise.

FIGURE 13.21
Description of the chown (), lchown (), and fchown () system calls.

In the following example, I changed the group of the file “test.txt” from
“music” to “cs,” which has group ID number 62 (for more information about group IDs
and how to locate them, see Chapter 15):

$ cat mychown.c                  ...list the file.
main ()
{
  int flag;
  flag = chown ("test.txt", -1, 62); /* Leave user ID unchanged */
  if (flag == -1) perror ("mychown.c");
}
$ ls -lg test.txt                ...examine file before.
-rw-r--r--  1 glass  music  3 May 25 11:42 test.txt
$ mychown                        ...run program.
$ ls -lg test.txt                ...examine file after.
-rw-r--r--  1 glass  cs     3 May 25 11:42 test.txt
$ _

Changing a File’s Permissions: chmod () and fchmod ()

chmod () and fchmod () change a file’s permission flags. They work as shown in
Figure 13.22. In the following example, I changed the permission flags of the file
“test.txt” to 600 octal, which corresponds to read and write permission for the owner only:

$ cat mychmod.c                  ...list the file.
main ()
{
  int flag;
  flag = chmod ("test.txt", 0600); /* Use an octal encoding */
  if (flag == -1) perror ("mychmod.c");
}
$ ls -l test.txt                 ...examine file before.
-rw-r--r--  1 glass  3 May 25 11:42 test.txt
$ mychmod                        ...run the program.
$ ls -l test.txt                 ...examine file after.
-rw-------  1 glass  3 May 25 11:42 test.txt

System Call: int chmod (const char* fileName, int mode)
             int fchmod (int fd, mode_t mode);

chmod () changes the mode of fileName to mode, usually an octal number as
described in Chapter 2. The “set user ID” and “set group ID” flags have the octal
values 4000 and 2000, respectively. To change a file’s mode, you must either own it or be
a superuser.
fchmod () works just like chmod (), except that it takes an open file descriptor
as an argument instead of a filename.
Both functions return -1 if unsuccessful and 0 otherwise.

FIGURE 13.22
Description of the chmod () system call.

Duplicating a File Descriptor: dup () and dup2 ()

dup () and dup2 () allow you to duplicate file descriptors. They work as shown in
Figure 13.23. Shells use dup2 () to perform redirection and piping. (For examples that
show how this is done, see “Process Management” in this chapter, and study the Internet
shell at the end of the chapter.)

System Call: int dup (int oldFd)
             int dup2 (int oldFd, int newFd)

dup () finds the smallest free file descriptor entry and points it to the same file as
oldFd. dup2 () closes newFd if it’s currently active and then points it to the same file
as oldFd. In both cases, the original and copied file descriptors share the same file
pointer and access mode.
Both functions return the index of the new file descriptor if successful and -1
otherwise.

FIGURE 13.23
Description of the dup () and dup2 () system calls.

In the following example, I created a file called “test.txt” and wrote to it via four
different file descriptors:

• The first file descriptor was the original descriptor.
• The second descriptor was a copy of the first, allocated in slot 4.
• The third descriptor was a copy of the first, allocated in slot 0, which was freed by
  the close (0) statement (the standard input channel).
• The fourth descriptor was a copy of descriptor 3, copied over the existing
  descriptor in slot 2 (the standard error channel).

$ cat mydup.c                    ...list the file.
#include <stdio.h>
#include <fcntl.h>
main ()
{
  int fd1, fd2, fd3;
  fd1 = open ("test.txt", O_RDWR | O_TRUNC);
  printf ("fd1 = %d\n", fd1);
  write (fd1, "what's", 6);
  fd2 = dup (fd1); /* Make a copy of fd1 */
  printf ("fd2 = %d\n", fd2);
  write (fd2, " up", 3);
  close (0); /* Close standard input */
  fd3 = dup (fd1); /* Make another copy of fd1 */
  printf ("fd3 = %d\n", fd3);
  write (0, " doc", 4);

  dup2 (3, 2); /* Duplicate channel 3 to channel 2 */
  write (2, "?\n", 2);
}
$ mydup                          ...run the program.
fd1 = 3
fd2 = 4
fd3 = 0
$ cat test.txt                   ...list the output file.
what's up doc?
$ _

File Descriptor Operations: fcntl ()

fcntl () directly controls the settings of the flags associated with a file descriptor. It
works as shown in Figure 13.24.

System Call: int fcntl (int fd, int cmd, int arg)

fcntl () performs the operation encoded by cmd on the file associated with the file
descriptor fd. arg is an optional argument for cmd. Here are the most common
values of cmd:

VALUE       OPERATION
F_SETFD     Set the close-on-exec flag to the lowest bit of arg (0 or 1).
F_GETFD     Return a number whose lowest bit is 1 if the close-on-exec
            flag is set and 0 otherwise.
F_GETFL     Return a number corresponding to the current file status
            flags and access modes.
F_SETFL     Set the current file status flags to arg.
F_GETOWN    Return the process ID or process group that is currently
            set to receive SIGIO/SIGURG signals. If the value
            returned is positive, it refers to a process ID. If it’s
            negative, its absolute value refers to a process group.
F_SETOWN    Set the process ID or process group that should receive
            SIGIO/SIGURG signals to arg. The encoding scheme is
            as described for F_GETOWN.

fcntl () returns -1 if unsuccessful.

FIGURE 13.24
Description of the fcntl () system call.

In the next example, I opened an existing file for writing and overwrote the initial few
letters with the phrase “hi there.” I then used fcntl () to set the file descriptor’s
APPEND flag, which instructed it to append all further writes. This caused “guys” to be
placed at the end of the file, even though I moved the file position pointer back to the
start with lseek (). The code is as follows:

$ cat myfcntl.c                  ...list the program.
#include <stdio.h>
#include <fcntl.h>
main ()
{
  int fd;
  fd = open ("test.txt", O_WRONLY); /* Open file for writing */
  write (fd, "hi there\n", 9);
  lseek (fd, 0, SEEK_SET); /* Seek to beginning of file */
  fcntl (fd, F_SETFL, O_WRONLY | O_APPEND); /* Set APPEND flag */
  write (fd, " guys\n", 6);
  close (fd);
}
$ cat test.txt                   ...list the original file.
here are the contents of
the original file.
$ myfcntl                        ...run the program.
$ cat test.txt                   ...list the new contents.
hi there
the contents of
the original file.
 guys                            ...note that “guys” is at the end.
$ _

Controlling Devices: ioctl ()

Figure 13.25 describes the operation of ioctl ().

System Call: int ioctl (int fd, int cmd, int arg)

ioctl () performs the operation encoded by cmd on the file associated with the file
descriptor fd. arg is an optional argument for cmd. The valid values of cmd depend
on the device that fd refers to and are typically documented in the manufacturer’s
operating instructions. I therefore supply no examples for this system call.
ioctl () returns -1 if unsuccessful.

FIGURE 13.25
Description of the ioctl () system call.

Creating Hard Links: link ()

link () creates a hard link to an existing file. It works as shown in Figure 13.26. In the
next example, I created the filename “another.txt” and linked it to the file referenced

System Call: mt link (const char* oldPath, const char* newPath) System Call: mt mknod (const char* fileName, mode_t type, dev_t device)
link () creates a new label, newPath, and links it to the same file as the label oldFath. mknod () creates a new regular, directory, or special file called fileName whose type
The hard link count of the associated file is incremented by one. If oldPath and can be one of the following:
newPath reside on different physical devices, a hard link cannot be made and link 0
fails. For more information about hard links, see the description of In in Chapter 3. VALUE MEANING
link () returns —1 if unsuccessful and 0 otherwise.
SJFDIR directory
FIGURE 13.26 S_IFCHR character oriented file
Description of the link () system call. S_IFBLK block oriented file

by the existing name, “original.txt”. I then demonstrated that both labels were linked to the same file. The code is as follows:

$ cat mylink.c                          ...list the program.
#include <unistd.h>
int main (void)
{
  link ("original.txt", "another.txt");
}
$ cat original.txt                      ...list original file.
this is a file.
$ ls -l original.txt another.txt        ...examine files before.
another.txt not found
-rw-r--r-- 1 glass 16 May 25 12:18 original.txt
$ mylink                                ...run the program.
$ ls -l original.txt another.txt        ...examine files after.
-rw-r--r-- 2 glass 16 May 25 12:18 another.txt
-rw-r--r-- 2 glass 16 May 25 12:18 original.txt
$ cat >> another.txt                    ...alter “another.txt”.
hi
$ ls -l original.txt another.txt        ...both labels reflect change.
-rw-r--r-- 2 glass 20 May 25 12:19 another.txt
-rw-r--r-- 2 glass 20 May 25 12:19 original.txt
$ rm original.txt                       ...remove original label.
$ ls -l original.txt another.txt        ...examine labels.
original.txt not found
-rw-r--r-- 1 glass 20 May 25 12:19 another.txt
$ cat another.txt                       ...list contents via other label.
this is a file.
hi
$ _

Creating Special Files: mknod ()

mknod () allows you to create a special file. It works as shown in Figure 13.27. For an example of mknod (), consult the section on named pipes later in the chapter.

S_IFREG    regular file
S_IFIFO    named pipe

If the file is a character- or block-oriented file, then the low-order byte of device should specify the minor device number, and the high-order byte should specify the major device number. (This can vary in different UNIX versions.) In other cases, the value of device is ignored. (For more information on special files, see Chapter 14.)
Only a superuser can use mknod () to create directories, character-oriented files, or block-oriented special files. It is typical now to use the mkdir () system call to create directories.
mknod () returns -1 if unsuccessful and 0 otherwise.

FIGURE 13.27
Description of the mknod () system call.

Flushing the File System Buffers: sync ()

sync () flushes the file system buffers. It works as shown in Figure 13.28.

System Call: void sync ()

sync () schedules all of the file system buffers to be written to disk. (For more information on the buffer system, consult Chapter 14.) sync () should be performed by any programs that bypass the file system buffers and examine the raw file system.
sync () always succeeds.

FIGURE 13.28
Description of the sync () system call.
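Returning to the link () example: the second column of the ls -l listings is the file's hard-link count, and a program can read the same number from the st_nlink field that stat () fills in. The following is a minimal sketch, not taken from the original text; the helper name link_count () is hypothetical.

```c
#include <sys/stat.h>

/* Hypothetical helper: return the hard-link count of path, or -1 on error. */
long link_count (const char* path)
{
  struct stat statBuf;

  if (stat (path, &statBuf) == -1)
    return -1;                      /* stat () failed; errno says why */
  return (long) statBuf.st_nlink;   /* the count that ls -l displays */
}
```

Called on “original.txt” in the session above, such a helper would be expected to return 1 before mylink runs and 2 afterward.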

Truncating a File: truncate () and ftruncate ()

truncate () and ftruncate () set the length of a file. They work as shown in Figure 13.29. In the next example, I set the length of two files to 10 bytes; one of the files was originally shorter than that, and the other was longer. Here is the code:

$ cat truncate.c                        ...list the program.
#include <unistd.h>
int main (void)
{
  truncate ("file1.txt", 10);
  truncate ("file2.txt", 10);
}
$ cat file1.txt                         ...list “file1.txt”.
short
$ cat file2.txt                         ...list “file2.txt”.
long file with lots of letters
$ ls -l file*.txt                       ...examine both files.
-rw-r--r-- 1 glass  6 May 25 12:16 file1.txt
-rw-r--r-- 1 glass 32 May 25 12:17 file2.txt
$ truncate                              ...run the program.
$ ls -l file*.txt                       ...examine both files again.
-rw-r--r-- 1 glass 10 May 25 12:16 file1.txt
-rw-r--r-- 1 glass 10 May 25 12:17 file2.txt
$ cat file1.txt                         ...“file1.txt” is longer.
short
$ cat file2.txt                         ...“file2.txt” is shorter.
long file $ _

System Call: int truncate (const char* fileName, off_t length)
             int ftruncate (int fd, off_t length)

truncate () sets the length of the file fileName to length bytes. If the file is longer than length, it is truncated. If it is shorter than length, it is padded with ASCII nulls.
ftruncate () works just like truncate () except that it takes an open file descriptor as an argument instead of a filename.
Both functions return -1 if unsuccessful and 0 otherwise.

FIGURE 13.29
Description of the truncate () and ftruncate () system calls.

STREAMS

STREAMS is a newer and more generalized I/O facility that was introduced in System V UNIX. STREAMS are most often used to add device drivers to the kernel and provide an interface to the network drivers, among others.
Originally developed by Dennis Ritchie, one of the designers of UNIX, STREAMS provides a full-duplex (two-way) path between kernel space and user process space. The implementation of STREAMS is more generalized than previous I/O mechanisms, making it easier to implement new device drivers. One of the original motivations for STREAMS was to clean up and improve traditional UNIX character I/O sent to terminal devices.
System V-based versions of UNIX also include the Transport Layer Interface (TLI) networking interface to STREAMS drivers, a socketlike interface enabling the STREAMS-based network drivers to communicate with other socket-based programs.

Improvements over traditional UNIX I/O

Traditional UNIX character-based I/O evolved from the early days of UNIX. As with any complex software subsystem, over time, unplanned and poorly architected changes added yet more complexity. Some of the advantage of STREAMS comes simply from the fact that it is newer and can take advantage of lessons learned over the years. This yields a cleaner interface than was available before.
STREAMS also makes adding network protocols easier than having to write the whole driver and all its required parts from scratch. The device-dependent code has been separated into modules so that only the relevant part must be rewritten for each new device. Common I/O housekeeping code (e.g., buffer allocation and management) has been standardized so that each module can leverage services provided by the stream.
STREAMS processing involves sending and receiving streams messages, rather than just doing raw, character-by-character I/O. STREAMS also added flow control and priority processing.

Anatomy of a STREAM

Each STREAM has three parts:
• the stream head, an access point for a user application, for functions, and for data structures representing the STREAM
• modules: code to process data being read or written
• the stream driver, the back-end code that communicates with the specific device
All three run in kernel space, although modules can be added from user space.
The stream head provides the system call interface for a user application. A stream head is created by using the open () system call. The kernel manages any memory allocation required, the upstream and downstream flow of data, queue scheduling, flow control, and error logging.
Data written to the stream head from an application program are in the form of a message that is passed to the first module for processing. This module processes the message and passes the result to the second module. Processing and passing continue until the last module passes the message to the stream driver, which writes the data to the appropriate device. Data coming from the device take the same path in the reverse direction.

STREAM system calls

In addition to the I/O system calls we’ve already seen (ioctl (), open (), close (), read (), and write ()), the following system calls are useful with a stream:
• getmsg (): get a message from a stream

• putmsg (): put a message on a stream
• poll (): poll one or more streams for activity
• isastream (): find out whether a given file descriptor is a stream

PROCESS MANAGEMENT

A UNIX process is a unique instance of a running or runnable program. Every process in a UNIX system has the following attributes:
• some code (a.k.a. text)
• some data
• a stack
• a unique process ID (PID) number
When UNIX is first started, there’s only one visible process in the system. This process is called “init,” and is PID 1. The only way to create a new process in UNIX is to duplicate an existing process, so “init” is the ancestor of all subsequent processes. When a process duplicates, the parent and child processes are virtually identical (except for things like PIDs, PPIDs, and run times); the child’s code, data, and stack are a copy of the parent’s, and it even continues to execute the same code. A child process may, however, replace its code with that of another executable file, thereby differentiating itself from its parent. For example, when “init” starts executing, it quickly duplicates several times. Each of the duplicate child processes then replaces its code from the executable file called “getty,” which is responsible for handling user logins. The process hierarchy therefore looks like that shown in Figure 13.30.

[Figure 13.30 diagram: the parent, init (PID 1), duplicates with fork () to produce three children, getty (PID 4), getty (PID 5), and getty (PID 6), each of which handles a login.]

FIGURE 13.30
The initial process hierarchy.

When a child process terminates, its death is communicated to its parent so that the parent may take some appropriate action. It’s common for a parent process to suspend until one of its children terminates. For example, when a shell executes a utility in the foreground, it duplicates into two shell processes; the child shell process replaces its code with that of the utility, whereas the parent shell waits for the child process to terminate. When the child terminates, the original parent process “awakens” and presents the user with the next shell prompt.
Figure 13.31 provides an illustration of the way that a shell executes a utility; I’ve indicated the system calls that are responsible for each phase of the execution. The Internet shell that I present later in the chapter has the basic process management facilities of classic UNIX shells and is a good place to look for some in-depth coding examples that utilize process-oriented system calls. In the meantime, let’s look at some simple programs that introduce these system calls one by one. The next few subsections describe the system calls shown in Figure 13.32.

[Figure 13.31 diagram: the parent process (PID 34, running shell) duplicates with fork (), giving a parent (PID 34, running shell, waiting for child via wait ()) and a child (PID 35, running shell). The child differentiates with exec () and becomes PID 35 running the utility. The child terminates with exit (); a signal is sent to the parent (PID 34, running shell), which awakens.]

FIGURE 13.31
How a shell runs a utility.

Name       Function
fork       duplicates a process
getpid     obtains a process’ ID number
getppid    obtains a parent process’ ID number
exit       terminates a process
wait       waits for a child process
exec       replaces the code, data, and stack of a process

FIGURE 13.32
UNIX process-oriented system calls.
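The duplicate, differentiate, and wait sequence of Figure 13.31 can be sketched in a few lines of C. This is an illustrative sketch rather than the chapter's Internet shell; the function name run_utility () is hypothetical, and it uses the POSIX waitpid () call and the WIFEXITED and WEXITSTATUS macros from <sys/wait.h>, none of which have been introduced in the text so far.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv, the way a shell
   runs a foreground utility. Returns the utility's exit code, or -1 if
   the child could not be created or did not exit normally. */
int run_utility (char* const argv [])
{
  int status;
  pid_t pid = fork ();                  /* Duplicate */

  if (pid == -1)
    return -1;                          /* fork () failed, no child exists */
  if (pid == 0)
    {
      execvp (argv[0], argv);           /* Differentiate: become the utility */
      _exit (127);                      /* Reached only if execvp () failed */
    }
  if (waitpid (pid, &status, 0) == -1)  /* Parent suspends until child ends */
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

The value 127 for a failed exec is a convention borrowed from common shells, not something the text specifies.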

Creating a New Process: fork ()

A process may duplicate itself by using fork (), which works as shown in Figure 13.33. fork () is a strange system call, because one process (the original) calls it, but two processes (the original and its child) return from it. Both processes continue to run the same code concurrently, but have completely separate stack and data spaces.

System Call: pid_t fork (void)

fork () causes a process to duplicate. The child process is an almost exact duplicate of the original parent process; it inherits a copy of its parent’s code, data, stack, open file descriptors, and signal table. However, the parent and child have different process ID numbers and parent process ID numbers.
If fork () succeeds, it returns the PID of the child to the parent process and returns 0 to the child process. If fork () fails, it returns -1 to the parent process, and no child is created.

FIGURE 13.33
Description of the fork () system call.

Now, that reminds me of a great sci-fi story I read once, about a man who comes across a fascinating booth at a circus. The vendor at the booth tells the man that the booth is a matter replicator: Anyone who walks through the booth is duplicated. But that’s not all: The original person walks out of the booth unharmed, but the duplicate person walks out onto the surface of Mars as a slave of the Martian construction crews. The vendor then tells the man that he’ll be given a million dollars if he allows himself to be replicated, and the man agrees. He happily walks through the machine, looking forward to collecting the million dollars... and walks out onto the surface of Mars. Meanwhile, back on Earth, his duplicate is walking off with a stash of cash. The question is this: If you came across the booth, what would you do?
A process may obtain its own process ID and parent process ID numbers by using the getpid () and getppid () system calls, respectively. Figure 13.34 gives a synopsis of these calls.

System Call: pid_t getpid (void)
             pid_t getppid (void)

getpid () and getppid () return a process’ ID and parent process’ ID numbers, respectively. They always succeed. The parent process ID number of PID 1 is 1.

FIGURE 13.34
Description of the getpid () and getppid () system calls.

To illustrate the operation of fork (), here’s a small program that duplicates and then branches, based on the return value of fork ():

$ cat myfork.c                          ...list the program.
#include <stdio.h>
#include <unistd.h>
int main (void)
{
  int pid;
  printf ("I'm the original process with PID %d and PPID %d.\n",
          getpid (), getppid ());
  pid = fork (); /* Duplicate. Child and parent continue from here */
  if (pid != 0) /* pid is non-zero, so I must be the parent */
    {
      printf ("I'm the parent process with PID %d and PPID %d.\n",
              getpid (), getppid ());
      printf ("My child's PID is %d\n", pid);
    }
  else /* pid is zero, so I must be the child */
    {
      printf ("I'm the child process with PID %d and PPID %d.\n",
              getpid (), getppid ());
    }
  printf ("PID %d terminates.\n", getpid () ); /* Both processes execute this */
}
$ myfork                                ...run the program.
I'm the original process with PID 13292 and PPID 13273.
I'm the parent process with PID 13292 and PPID 13273.
My child's PID is 13293.
I'm the child process with PID 13293 and PPID 13292.
PID 13293 terminates.                   ...child terminates.
PID 13292 terminates.                   ...parent terminates.
$ _

The PPID of the parent refers to the PID of the shell that executed the “myfork” program.
Here is a warning: As you will soon see, it is dangerous for a parent to terminate without waiting for the death of its child. The only reason that the parent doesn’t wait for its child to die in this example is that I haven’t yet described the wait () system call!

Orphan Processes

If a parent dies before its child, the child is automatically adopted by the original “init” process, PID 1. To illustrate this feature, I modified the previous program by inserting a sleep statement into the child’s code. This ensured that the parent process terminated before the child did. Here’s the program and the resultant output:

$ cat orphan.c                          ...list the program.
#include <stdio.h>
#include <unistd.h>
int main (void)

{
  int pid;
  printf ("I'm the original process with PID %d and PPID %d.\n",
          getpid (), getppid ());
  pid = fork (); /* Duplicate. Child and parent continue from here */
  if (pid != 0) /* Branch based on return value from fork () */
    {
      /* pid is non-zero, so I must be the parent */
      printf ("I'm the parent process with PID %d and PPID %d.\n",
              getpid (), getppid ());
      printf ("My child's PID is %d\n", pid);
    }
  else
    {
      /* pid is zero, so I must be the child */
      sleep (5); /* Make sure that the parent terminates first */
      printf ("I'm the child process with PID %d and PPID %d.\n",
              getpid (), getppid ());
    }
  printf ("PID %d terminates.\n", getpid () ); /* Both processes execute this */
}
$ orphan                                ...run the program.
I'm the original process with PID 13364 and PPID 13346.
I'm the parent process with PID 13364 and PPID 13346.
PID 13364 terminates.
I'm the child process with PID 13365 and PPID 1.    ...orphaned!
PID 13365 terminates.
$ _

Figure 13.35 shows an illustration of the orphaning effect.

[Figure 13.35 diagram: the parent dies first; “init” adopts the child, which survives the parent.]

FIGURE 13.35
Process adoption.

Terminating a Process: exit ()

A process may terminate at any time by executing exit (), which works as shown in Figure 13.36.

System Call: void exit (int status)

exit () closes all of a process’ file descriptors, deallocates its code, data, and stack, and then terminates the process. When a child process terminates, it sends its parent a SIGCHLD signal and waits for its termination code status to be accepted. Only the lower eight bits of status are used, so values are limited to 0-255. A process that is waiting for its parent to accept its return code is called a zombie process. A parent accepts a child’s termination code by executing wait (), which is described shortly.
The kernel ensures that all of a terminating process’ children are orphaned and adopted by “init” by setting their PPIDs to 1. The “init” process always accepts its children’s termination codes.
exit () never returns.

FIGURE 13.36
Description of the exit () system call.

The termination code of a child process may be used for a variety of purposes by the parent process. Shells may access the termination code of their last child process via one of their special variables. For example, in the following code, the C shell stores the termination code of the last command in the variable $status:

% cat myexit.c                          ...list the program.
#include <stdio.h>
#include <stdlib.h>
int main (void)
{
  printf ("I'm going to exit with return code 42\n");
  exit (42);
}
% myexit                                ...run the program.
I'm going to exit with return code 42
% echo $status                          ...display the termination code.
42
% _

In all other shells, the return value is returned in the special shell variable $?.

Zombie Processes

A process that terminates cannot leave the system until its parent accepts its return code. If its parent process is already dead, it’ll already have been adopted by the “init” process, which always accepts its children’s return codes. However, if a process’ parent is alive, but never executes a wait (), the child process’ return code will never be accepted and the process will remain a zombie. A zombie process doesn’t have any code, data, or stack, so it doesn’t use up many system resources, but it does continue to inhabit the system’s fixed-size process table. Too many zombie processes can require the system administrator to intervene. (See Chapter 15 for more details.)
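The eight-bit limit on termination codes noted in Figure 13.36 is easy to observe. In the sketch below (my own illustration; the name exit_code_seen_by_parent () is hypothetical), a child exits with a given value and the parent reports what it actually receives. It relies on waitpid (), a POSIX relative of the wait () call, and on the POSIX WIFEXITED and WEXITSTATUS macros.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with code, then return
   the termination code that the parent receives, or -1 on failure. */
int exit_code_seen_by_parent (int code)
{
  int status;
  pid_t pid = fork ();

  if (pid == -1)
    return -1;                          /* No child was created */
  if (pid == 0)
    _exit (code);                       /* Child: terminate immediately */
  if (waitpid (pid, &status, 0) == -1)  /* Parent: accept the code, so the
                                           child does not stay a zombie */
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

With this helper, a child that exits with 42 is reported as 42, whereas a child that exits with 300 is reported as 44, because only the low eight bits (300 modulo 256) survive. The child uses _exit () rather than exit () so that it does not flush any stdio buffers it inherited from the parent.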

The next program created a zombie process, which was indicated in the output from the ps utility. When I killed the parent process, the child was adopted by “init” and allowed to rest in peace. Here is the code:

$ cat zombie.c                          ...list the program.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main (void)
{
  int pid;
  pid = fork (); /* Duplicate */
  if (pid != 0) /* Branch based on return value from fork () */
    {
      while (1) /* Never terminate, and never execute a wait () */
        sleep (1000);
    }
  else
    {
      exit (42); /* Exit with a silly number */
    }
}
$ zombie &                              ...execute the program in the background.
[1] 13545
$ ps                                    ...obtain process status.
  PID TT STAT TIME COMMAND
13535 p2 S    0:00 -ksh (ksh)           ...the shell.
13545 p2 S    0:00 zombie               ...the parent process.
13546 p2 Z    0:00 <defunct>            ...the zombie child.
13547 p2 R    0:00 ps
$ kill 13545                            ...kill the parent process.
[1] Terminated zombie
$ ps                                    ...notice the zombie is gone now.
  PID TT STAT TIME COMMAND
13535 p2 S    0:00 -ksh (ksh)
13548 p2 R    0:00 ps
$ _

Waiting for a Child: wait ()

A parent process may wait for one of its children to terminate and then accept its child’s termination code by executing wait (), described in Figure 13.37. In the next example, the child process terminated before the end of the program by executing an exit () with return code 42. Meanwhile, the parent process executed a wait () and suspended until it received its child’s termination code. At that point, the parent displayed information about its child’s demise and executed the rest of the program. The code is as follows:

$ cat mywait.c                          ...list the program.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
int main (void)
{
  int pid, status, childPid;
  printf ("I'm the parent process and my PID is %d\n", getpid ());
  pid = fork (); /* Duplicate */
  if (pid != 0) /* Branch based on return value from fork () */
    {
      printf ("I'm the parent process with PID %d and PPID %d\n",
              getpid (), getppid ());
      childPid = wait (&status); /* Wait for a child to terminate. */
      printf ("A child with PID %d terminated with exit code %d\n",
              childPid, status >> 8);
    }
  else
    {
      printf ("I'm the child process with PID %d and PPID %d\n",
              getpid (), getppid ());
      exit (42); /* Exit with a silly number */
    }
  printf ("PID %d terminates\n", getpid () ); /* Only the parent gets here */
}
$ mywait                                ...run the program.
I'm the parent process and my PID is 13464
I'm the child process with PID 13465 and PPID 13464
I'm the parent process with PID 13464 and PPID 13409
A child with PID 13465 terminated with exit code 42
PID 13464 terminates
$ _

System Call: pid_t wait (int* status)

wait () causes a process to suspend until one of its children terminates. A successful call to wait () returns the PID of the child that terminated and places a status code into status that is encoded as follows:
• If the rightmost byte of status is zero, the leftmost byte contains the low eight bits of the value returned by the child’s exit () or return ().
• If the rightmost byte is nonzero, the rightmost seven bits are equal to the number of the signal that caused the child to terminate, and the remaining bit of the rightmost byte is set to 1 if the child produced a core dump.
If a process executes a wait () and has no children, wait () returns immediately with -1. If a process executes a wait () and one or more of its children are already zombies, wait () returns immediately with the status of one of the zombies.

FIGURE 13.37
Description of the wait () system call.
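The status encoding in Figure 13.37 can be unpacked with shifts and masks, as mywait.c does with status >> 8. The sketch below follows the traditional layout described in the figure; the struct and function names are hypothetical. Portable code would normally use the POSIX macros WIFEXITED, WEXITSTATUS, WIFSIGNALED, and WTERMSIG from <sys/wait.h> rather than assume this bit layout.

```c
/* Hypothetical decoded form of the status word filled in by wait (). */
struct child_status
{
  int exited;       /* 1 if the child called exit () or returned from main */
  int exit_code;    /* low eight bits of the exit value, valid if exited */
  int signal;       /* terminating signal number, valid if not exited */
  int core_dumped;  /* 1 if a core dump was produced, valid if not exited */
};

/* Decode status, assuming the 16-bit layout described in Figure 13.37. */
struct child_status decode_status (int status)
{
  struct child_status result = { 0, 0, 0, 0 };
  int low = status & 0xFF;                      /* rightmost byte */

  if (low == 0)
    {
      result.exited = 1;
      result.exit_code = (status >> 8) & 0xFF;  /* leftmost byte */
    }
  else
    {
      result.signal = low & 0x7F;               /* rightmost seven bits */
      result.core_dumped = (low & 0x80) != 0;   /* remaining bit */
    }
  return result;
}
```

For the mywait.c run above, the status word would be 42 shifted left by eight bits, which decodes to a normal exit with code 42.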

Differentiating a Process: exec ()

A process may replace its current code, data, and stack with those of another executable file by using one of the exec () family of system calls. When a process executes an exec (), its PID and PPID numbers stay the same; only the code that the process is executing changes. The exec () family works as shown in Figure 13.38. The members of the exec () family listed in the figure aren’t really system calls: rather, they’re C library functions that invoke the execve () system call. execve () is hardly ever used directly, as it contains some rarely used options.

Library Routine: int execl (const char* path, const char* arg0, const char* arg1, ..., const char* argn, NULL)
                 int execv (const char* path, const char* argv[])
                 int execlp (const char* path, const char* arg0, const char* arg1, ..., const char* argn, NULL)
                 int execvp (const char* path, const char* argv[])

The exec () family of library routines replaces the calling process’ code, data, and stack from the executable file whose pathname is stored in path.
execl () is identical to execlp (), and execv () is identical to execvp (), except that execl () and execv () require the absolute or relative pathname of the executable file to be supplied, whereas execlp () and execvp () use the $PATH environment variable to find path.
If the executable file is not found, the system call returns -1; otherwise, the calling process replaces its code, data, and stack from the executable file and starts to execute the new code. A successful exec () never returns.
execl () and execlp () invoke the executable file with the string arguments pointed to by arg1..argn. arg0 must be the name of the executable file itself, and the list of arguments must be terminated with a null.
execv () and execvp () invoke the executable file with the string arguments pointed to by argv[1]..argv[n], where argv[n+1] is NULL. argv[0] must be the name of the executable file itself.

FIGURE 13.38
Description of the execl (), execv (), execlp (), and execvp () library routines.

In the following example, the program displayed a small message and then replaced its code with that of the “ls” executable file:

$ cat myexec.c                          ...list the program.
#include <stdio.h>
#include <unistd.h>
int main (void)
{
  printf ("I'm process %d and I'm about to exec an ls -l\n", getpid ());
  execl ("/bin/ls", "ls", "-l", NULL); /* Execute ls */
  printf ("This line should never be executed\n");
}
$ myexec                                ...run the program.
I'm process 13623 and I'm about to exec an ls -l
total 125
-rw-r--r-- 1 glass   277 Feb 15 00:47 myexec.c
-rwxr-xr-x 1 glass 24576 Feb 15 00:48 myexec
$ _

Note that the execl () was successful and therefore never returned.

Changing Directories: chdir ()

Every process has a current working directory that is used in processing a relative pathname. A child process inherits its current working directory from its parent. For example, when a utility is executed from a shell, its process inherits the shell’s current working directory. To change a process’ current working directory, use chdir (), which works as shown in Figure 13.39.

System Call: int chdir (const char* pathname)

chdir () sets a process’ current working directory to the directory pathname. The process must have execute permission from the directory to succeed.
chdir () returns 0 if successful; otherwise, it returns -1.

FIGURE 13.39
Description of the chdir () system call.

In the following example, the process printed its current working directory before and after executing chdir () by executing pwd, using the system () library routine:

$ cat mychdir.c                         ...list the source code.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main (void)
{
  system ("pwd"); /* Display current working directory */
  chdir ("/"); /* Change working directory to root directory */
  system ("pwd"); /* Display new working directory */
  chdir ("/home/glass"); /* Change again */
  system ("pwd"); /* Display again */
}
$ mychdir                               ...execute the program.
/home/glass
/
/home/glass
$ _
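mychdir.c starts a shell just to run pwd. As an alternative sketch (not from the original text), a process can ask for its own working directory with the POSIX getcwd () library routine and check the return value of chdir () at the same time. The helper name change_and_report () is hypothetical.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: change to dir, then store the new working
   directory in buf (size bytes). Returns 0 if successful, -1 otherwise. */
int change_and_report (const char* dir, char* buf, size_t size)
{
  if (chdir (dir) == -1)
    {
      perror (dir);               /* e.g. no such directory, or no permission */
      return -1;
    }
  if (getcwd (buf, size) == NULL)
    return -1;                    /* e.g. the buffer is too small */
  return 0;
}
```

Because the working directory belongs to the process, a successful call changes it for everything the process does afterward, including any children it later creates.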

Changing Priorities: nice ()

Every process has a priority value between -20 and +19 that affects the amount of CPU time that the process is allocated. In general, the smaller the priority value, the faster the process will run. Only superuser and kernel processes (described in Chapter 14) can have a negative priority value, and login shells start with priority 0.

Library Routine: int nice (int delta)

nice () adds delta to a process’ current priority value. Only a superuser may specify a delta that leads to a negative priority value. Valid priority values lie between -20 and +19. If a delta is specified that takes a priority value beyond a limit, the value is truncated to the limit.
If nice () succeeds, it returns the new nice value; otherwise it returns -1. Note that this can cause problems, since a nice value of -1 is valid.

FIGURE 13.40
Description of the nice () library routine.

A child process inherits its priority value from its parent and may change it by using nice (), described in Figure 13.40. In the following example, the process executed ps commands before and after a couple of nice () calls:

$ cat mynice.c                          ...list the source code.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main (void)
{
  printf ("original priority\n");
  system ("ps"); /* Execute a ps */
  nice (0); /* Add 0 to my priority */
  printf ("running at priority 0\n");
  system ("ps"); /* Execute another ps */
  nice (10); /* Add 10 to my priority */
  printf ("running at priority 10\n");
  system ("ps"); /* Execute the last ps */
}
$ mynice                                ...execute the program.
original priority
  PID TT STAT   TIME COMMAND
15099 p2 S      0:00 -sh (sh)
15206 p2 S      0:00 a.out
15207 p2 S      0:00 sh -c ps
15208 p2 R      0:00 ps
running at priority 0                   ...adding 0 doesn't change it.
  PID TT STAT   TIME COMMAND
15099 p2 S      0:00 -sh (sh)
15206 p2 S      0:00 a.out
15209 p2 S      0:00 sh -c ps
15210 p2 R      0:00 ps
running at priority 10                  ...adding 10 makes them run slower.
  PID TT STAT   TIME COMMAND
15099 p2 S      0:00 -sh (sh)
15206 p2 S N    0:00 a.out
15211 p2 S N    0:00 sh -c ps
15212 p2 R N    0:00 ps
$ _

Note that when the process’ priority value became nonzero, it was flagged with an “N” by ps, together with the sh and ps commands that it created due to the system () library call.

Accessing User and Group IDs

Figure 13.41 shows the system calls that allow you to read a process’ real and effective IDs. Figure 13.42 shows the system calls that allow you to set a process’ real and effective IDs.

System Call: uid_t getuid ()
             uid_t geteuid ()
             gid_t getgid ()
             gid_t getegid ()

getuid () and geteuid () return the calling process’ real and effective user ID, respectively. getgid () and getegid () return the calling process’ real and effective group ID, respectively. The ID numbers correspond to the user and group IDs listed in the “/etc/passwd” and “/etc/group” files.
These calls always succeed.

FIGURE 13.41
Description of the getuid (), geteuid (), getgid (), and getegid () system calls.

Library Routine: int setuid (uid_t id)
                 int seteuid (uid_t id)
                 int setgid (gid_t id)
                 int setegid (gid_t id)

seteuid () (and setegid ()) set the calling process’ effective user (group) ID. setuid () (and setgid ()) set the calling process’ effective and real user (group) IDs to the specified value.
These calls succeed only if executed by a superuser or if id is the real or effective user (group) ID of the calling process. They return 0 if successful; otherwise, they return -1.

FIGURE 13.42
Description of the setuid (), seteuid (), setgid (), and setegid () library routines.
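As a small sketch of how the calls in Figures 13.41 and 13.42 fit together (my illustration, with a hypothetical function name), a program that no longer needs special privileges can set its effective IDs back to its real IDs. Figure 13.42 notes that setting an ID to the process’ own real ID is permitted, so both calls are expected to succeed for any user.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make the effective group and user IDs equal to
   the real ones. Returns 0 if successful, -1 otherwise. */
int drop_to_real_ids (void)
{
  if (setegid (getgid ()) == -1)  /* Group first, while the effective user
                                     ID may still be privileged */
    return -1;
  if (seteuid (getuid ()) == -1)
    return -1;
  return 0;
}
```

After a successful call, geteuid () returns the same value as getuid (), and getegid () the same value as getgid ().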

Sample Program: Background Processing

Next, we will examine a sample program that makes use of fork () and exec () to execute a program in the background. The original process creates a child to exec the specified executable file and then terminates. The orphaned child is automatically adopted by “init.” Here is the code:

$ cat background.c                      ...list the program.
#include <stdio.h>
#include <unistd.h>
int main (int argc, char* argv [])
{
  if (fork () == 0) /* Child */
    {
      execvp (argv[1], &argv[1]); /* Execute other program */
      fprintf (stderr, "Could not execute %s\n", argv[1]);
    }
}
$ background cc mywait.c                ...run the program.
$ ps                                    ...confirm that “cc” is in background.
  PID TT STAT TIME COMMAND
13664 p0 S    0:00 -csh (csh)
13716 p0 R    0:00 ps
13717 p0 D    0:00 cc mywait.c
$ _

Note how I craftily passed the argument list from main () to execvp () by passing &argv[1] as the second argument to execvp (). Note also that I used execvp () instead of execv () so that the program could use $PATH to find the executable file.

Sample Program: Disk Usage

The next programming example uses a novel technique for counting the number of nondirectory files in a hierarchy. When the program is started, its first argument must be the name of the directory to search. The program searches through each entry in the directory, spawning off a new process for each. Each child process either exits with 1 if its associated file is a nondirectory file or repeats the process, summing up the exit codes of its children and exiting with the total count. This technique is interesting, but silly: Not only does it create a large number of processes, which is not particularly efficient, but since it uses the termination code to return the file count, it’s limited to an eight-bit total count. The code is as follows:

$ cat count.c                           ...list the program.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/dirent.h>
#include <sys/stat.h>
#include <sys/wait.h>
long processFile (char* name);
long processDirectory (char* dirName);
int main (int argc, char* argv [])
{
  long count;
  count = processFile (argv[1]);
  printf ("Total number of non-directory files is %ld\n", count);
  return (/* EXIT_SUCCESS */ 0);
}
long processFile (char* name)
{
  struct stat statBuf; /* To hold the return data from stat () */
  mode_t mode;
  int result;
  result = stat (name, &statBuf); /* Stat the specified file */
  if (result == -1) return (0); /* Error */
  mode = statBuf.st_mode; /* Look at the file's mode */
  if (S_ISDIR (mode)) /* Directory */
    return (processDirectory (name));
  else
    return (1); /* A non-directory file was processed */
}
long processDirectory (char* dirName)
{
  int fd, children, i, charsRead, childPid, status;
  long count, totalCount;
  char fileName [100];
  struct dirent dirEntry;
  fd = open (dirName, O_RDONLY); /* Open directory for reading */
  children = 0; /* Initialize child process count */
  while (1) /* Scan directory */
    {
      charsRead = getdents (fd, &dirEntry, sizeof (struct dirent));
      if (charsRead == 0) break; /* End of directory */
      if (strcmp (dirEntry.d_name, ".") != 0 &&
          strcmp (dirEntry.d_name, "..") != 0)
        {
          if (fork () == 0) /* Create a child to process dir. entry */
            {
              sprintf (fileName, "%s/%s", dirName, dirEntry.d_name);
              count = processFile (fileName);
              exit (count);
            }
          else
            ++children; /* Increment count of child processes */
        }
      lseek (fd, dirEntry.d_off, SEEK_SET); /* Jump to next dir.entry */
    }
  close (fd); /* Close directory */
  totalCount = 0; /* Initialize file count */
  for (i = 1; i <= children; i++) /* Wait for children to terminate */
    {
      childPid = wait (&status); /* Accept child's termination code */
      totalCount += (status >> 8); /* Update file count */
    }
  return (totalCount); /* Return number of files in directory */
}
$ ls -F                      ...list current directory.
a.out*         disk.c      fork        tmp/        zombie*
background     myexec.c    myfork.c    mywait.c
background.c   myexit.c    orphan.c    mywait*
count*         myexit*     orphan*     zombie.c
$ ls tmp                     ...list only subdirectory.
a.out*         myexit.c    orphan.c
disk.c
background.c   myexec.c    myfork.c    mywait.c
zombie.c
$ count .                    ...count regular files from ".".
Total number of non-directory files is 25
$ _

Threads

Multiple processes are expensive to create, either anew or by copying an existing process with the fork () system call. Often, a completely new process space is not necessary for a small, yet independent, task in a program. In fact, you may want separate tasks to be able to share some resources in a process, such as memory space, or to share an open device.

When multiprocessor systems became available, it was clear that UNIX needed a better way to take advantage of multiple processors without requiring a new process to be started in order to take advantage of the additional processor. A thread is an abstraction that allows multiple “threads of control” in a single process space. It can almost be thought of as a process within a process (almost). The thread model is similar to the UNIX process model in many ways.

Terminology among some thread implementations can be confusing. You may find the term “lightweight processes” used interchangeably with “thread,” or you may find places where the two terms are used to distinguish subtle differences. In most cases, the idea of lighter weight (i.e., less expensive to create and manage) is what is intended. For the purposes of our high-level examination, we will merely refer to threads.

Since the implementation of thread functionality varies widely in different versions of UNIX, to examine any one would unfairly ignore others, and a complete examination of all current implementations is beyond the scope of this introductory text. We therefore will examine UNIX thread functionality at a high level that is common to all implementations. I recommend that you consult the documentation for your version of UNIX for information on specific system calls.

Thread management

Four major functions make up the common thread management capabilities in most implementations:

• create: create a thread
• join: suspend and wait for a created thread to terminate (similar to the wait () system call between parent and child processes)
• detach: allow the thread to release its resources to the system when it finishes and not require a join (in this case, an exit value is not available)
• terminate: return resources to process

Thread synchronization

In a multithreaded environment, one or more threads can be created to handle specific tasks. If the tasks are unrelated, the threads can be initiated and run to completion. If any part of the task requires information from another task, processing among threads must be synchronized. Synchronization can often be accomplished via standard UNIX IPC mechanisms, but most threads libraries also provide synchronization primitives specific to the use of threads.

A mutex object can be used to manage mutual exclusion among threads. Mutex objects can be created, destroyed, locked, and unlocked. The attributes of a mutex object are shared among threads and are used to let other threads know the state of the thread the mutex object describes. Mutex objects can also be used in conjunction with condition variables, which maintain a value (such as a threshold) to allow more precise management of thread synchronization.

Thread Safety

So now you’ve synchronized the various threads of control in your own program, but what about library functions they call? Does your code need to synchronize its use of a graphics library (for example) to make sure that two separate threads don’t try to write to the same part of the screen at the same time? What about two threads that are using a math library to update shared data? You’ve synchronized your use of your variables, but do the math functions use any shared variables? Is the function reentrant (i.e., can more than one control point be used in the memory space of the function at the same time)?

By asking these questions, you are asking if the library is thread safe: Is it safe to call the functions in these libraries from a multithreaded program? It probably isn’t hard to imagine the kinds of unforeseen problems that can crop up under these circumstances. Unless the vendor or author of the library claims that it is thread safe, you should assume that it is not and write your code accordingly (managing mutually exclusive access to the library among the various threads in your program).

Other process-related system calls that we’ve already examined may be affected by the implementation of threads. For example, each thread maintains its own stack, signal mask, and local storage area. Therefore, it may not always be obvious when a system call applies only to the thread or to the entire process running the thread. It will be important for you to find out what effects your implementation of threads may have on other UNIX system calls.
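To make the create, join, and mutex operations described above a little more concrete, here is a minimal sketch. It assumes the POSIX threads (pthreads) library, which is only one of the implementations the text alludes to; your version of UNIX may differ. The names worker, runWorkers, sharedCounter, and counterLock are hypothetical and chosen only for illustration.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16
#define INCREMENTS  10000

static long sharedCounter = 0;          /* Data shared by all threads */
static pthread_mutex_t counterLock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread adds INCREMENTS to the shared counter, one step at a time. */
static void *worker (void *arg)
{
  int i;
  (void) arg;                           /* Unused */
  for (i = 0; i < INCREMENTS; i++)
    {
      pthread_mutex_lock (&counterLock);    /* Enter critical section */
      ++sharedCounter;
      pthread_mutex_unlock (&counterLock);  /* Leave critical section */
    }
  return NULL;
}

/* Create nThreads workers, join every one that was created, and return */
/* the final counter value, or -1 if the request could not be honored.  */
long runWorkers (int nThreads)
{
  pthread_t tid[MAX_THREADS];
  int created = 0, failed = 0, i;
  if (nThreads < 1 || nThreads > MAX_THREADS)
    return -1;
  sharedCounter = 0;
  for (i = 0; i < nThreads; i++)
    {
      if (pthread_create (&tid[i], NULL, worker, NULL) != 0)
        {
          failed = 1;
          break;
        }
      ++created;
    }
  for (i = 0; i < created; i++)
    pthread_join (tid[i], NULL);        /* Similar to wait () for a child */
  return failed ? -1 : sharedCounter;
}
```

Under these assumptions, runWorkers (4) should return 4 * INCREMENTS. Without the mutex, the increments from different threads could interleave and some updates could be lost.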

Redirection

When a process forks, the child inherits a copy of its parent’s file descriptors. When a process execs, all file descriptors that do not close upon execution remain unaffected, including the standard input, output, and error channels. The UNIX shells use these two pieces of information to implement redirection. For example, say you type the command

$ ls > ls.out

at a terminal. To perform the redirection, the shell performs the following series of actions:

• The parent shell forks and then waits for the child shell to terminate.
• The child shell opens the file “ls.out,” creating or truncating it as necessary.
• The child shell then duplicates the file descriptor of “ls.out” to the standard output file descriptor, number 1, and then closes the original descriptor of “ls.out”. All standard output is therefore redirected to “ls.out”.
• The child shell then exec’s the ls utility. Since file descriptors are inherited during an exec (), all of the standard output of ls goes to “ls.out”.
• When the child shell terminates, the parent resumes. The parent’s file descriptors are unaffected by the child’s actions, as each process maintains its own private descriptor table.

To redirect the standard error channel in addition to standard output, the shell would simply have to duplicate the “ls.out” descriptor twice: once to descriptor 1 and once to descriptor 2.

Following is a small program that does approximately the same kind of redirection as a UNIX shell. When invoked with the name of a file as the first parameter and a command sequence as the remaining parameters, the program “redirect” redirects the standard output of the command to the named file. Here’s the code:

$ cat redirect.c             ...list the program.
#include <stdio.h>
#include <fcntl.h>
main (argc, argv)
int argc;
char* argv [];
{
  int fd;
  /* Open file for redirection */
  fd = open (argv[1], O_CREAT | O_TRUNC | O_WRONLY, 0600);
  dup2 (fd, 1); /* Duplicate descriptor to standard output */
  close (fd); /* Close original descriptor to save descriptor space */
  execvp (argv[2], &argv[2]); /* Invoke program; will inherit stdout */
  perror ("main"); /* Should never execute */
}
$ redirect ls.out ls -l      ...redirect "ls -l" to "ls.out".
$ cat ls.out                 ...list the output file.
total 5
-rw-r-xr-x  1 gglass     0 Feb 15 10:35 ls.out
-rw-r-xr-x  1 gglass   449 Feb 15 10:35 redirect.c
-rwxr-xr-x  1 gglass  3697 Feb 15 10:33 redirect
$ _

The Internet shell described at the end of this chapter has better redirection facilities than the standard UNIX shells; it can even redirect output to another Internet shell on a remote host.

SIGNALS

Programs must sometimes deal with unexpected or unpredictable events, such as any of the following:

• a floating-point error
• a power failure
• an alarm clock “ring” (discussed soon)
• the death of a child process
• a termination request from a user (i.e., a Control-C)
• a suspend request from a user (i.e., a Control-Z)

These kinds of events are sometimes called interrupts, since they must interrupt the regular flow of a program in order to be processed. When UNIX recognizes that such an event has occurred, it sends the corresponding process a signal. There is a unique, numbered signal for each possible event. For example, if a process causes a floating-point error, the kernel sends the offending process signal number 8, as shown in Figure 13.43. The kernel isn’t the only one that can send a signal; any process can send any other process a signal, as long as it has permission. (The rules regarding permissions are discussed shortly.)

FIGURE 13.43
Floating-point error signal.

By means of a special piece of code called a signal handler, a programmer may arrange for a particular signal to be ignored or to be processed. In the latter case, the process that receives the signal suspends its current flow of control, executes the signal handler, and then resumes the original flow of control when the signal handler finishes.

By learning about signals, you can “protect” your programs from Control-C’s, arrange for an alarm clock signal to terminate your program if it takes too long to perform a task, and learn how UNIX uses signals during everyday operations.

The Defined Signals

Signals are defined in “/usr/include/sys/signal.h”. A programmer may choose for a particular signal to trigger a user-supplied signal handler, trigger the default kernel-supplied handler, or be ignored. The default handler usually performs one of the following actions:

• terminates the process and generates a core file (dump)
• terminates the process without generating a core image file (quit)
• ignores and discards the signal (ignore)
• suspends the process (suspend)
• resumes the process

A List of Signals

Figure 13.44 lists the System V predefined signals, along with their macro definitions, numeric values, default actions, and a brief description of each.

Macro      #   Default  Description
SIGHUP     1   quit     hang-up
SIGINT     2   quit     interrupt
SIGQUIT    3   dump     quit
SIGILL     4   dump     invalid instruction
SIGTRAP    5   dump     trace trap (used by debuggers)
SIGABRT    6   dump     abort
SIGEMT     7   dump     emulator trap instruction
SIGFPE     8   dump     arithmetic exception
SIGKILL    9   quit     kill (cannot be caught, blocked, or ignored)
SIGBUS     10  dump     bus error (bad format address)
SIGSEGV    11  dump     segmentation violation (out-of-range address)
SIGSYS     12  dump     bad argument to system call
SIGPIPE    13  quit     write on a pipe or other socket with no one to read it
SIGALRM    14  quit     alarm clock
SIGTERM    15  quit     software termination signal (default signal sent by kill)
SIGUSR1    16  quit     user signal 1
SIGUSR2    17  quit     user signal 2
SIGCHLD    18  ignore   child status changed
SIGPWR     19  ignore   power fail or restart
SIGWINCH   20  ignore   window size change
SIGURG     21  ignore   urgent socket condition
SIGPOLL    22  quit     pollable event
SIGSTOP    23  suspend  stopped (signal)
SIGTSTP    24  suspend  stopped (user)
SIGCONT    25  ignore   continued
SIGTTIN    26  suspend  stopped (tty input)
SIGTTOU    27  suspend  stopped (tty output)
SIGVTALRM  28  quit     virtual timer expired
SIGPROF    29  quit     profiling timer expired
SIGXCPU    30  dump     CPU time limit exceeded
SIGXFSZ    31  dump     file size limit exceeded

FIGURE 13.44
Signals.

Terminal Signals

The easiest way to send a signal to a foreground process is by pressing Control-C or Control-Z from the keyboard. When the terminal driver (the piece of software that supports the terminal) recognizes a Control-C, it sends a SIGINT signal to all of the processes in the current foreground job. Similarly, Control-Z causes the driver to send a SIGTSTP signal to all of the processes in the current foreground job. By default, SIGINT terminates a process and SIGTSTP suspends a process. Later in this section, I’ll show you how to perform similar actions from a C program.
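One practical point about the numeric values in Figure 13.44: they are the System V numbers, and other versions of UNIX may number some signals differently, so it is safer to refer to a signal by its macro name than by a literal number. The following sketch (an illustration only; signalNumber, printSignalNumbers, and the sampleSignals table are hypothetical names) reports the numbers that the local <signal.h> assigns to a handful of the macros.

```c
#include <signal.h>
#include <stdio.h>
#include <string.h>

struct SignalName
{
  const char *name;   /* Macro name, as text */
  int number;         /* Value supplied by the local <signal.h> */
};

static const struct SignalName sampleSignals[] = {
  { "SIGHUP", SIGHUP },   { "SIGINT", SIGINT },   { "SIGFPE", SIGFPE },
  { "SIGKILL", SIGKILL }, { "SIGUSR1", SIGUSR1 }, { "SIGCHLD", SIGCHLD }
};

#define SAMPLE_COUNT ((int) (sizeof (sampleSignals) / sizeof (sampleSignals[0])))

/* Return the local number of a sample signal, or -1 if it is not listed. */
int signalNumber (const char *name)
{
  int i;
  for (i = 0; i < SAMPLE_COUNT; i++)
    if (strcmp (sampleSignals[i].name, name) == 0)
      return sampleSignals[i].number;
  return -1;
}

/* Print one "name number" line per sample signal; return the line count. */
int printSignalNumbers (FILE *out)
{
  int i;
  for (i = 0; i < SAMPLE_COUNT; i++)
    fprintf (out, "%-8s %d\n", sampleSignals[i].name, sampleSignals[i].number);
  return SAMPLE_COUNT;
}
```

Calling printSignalNumbers (stdout) on your own system and comparing the result with the figure is a quick way to see which entries, if any, differ locally.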

Requesting an Alarm Signal: alarm ()

One of the simplest ways to see a signal in action is to arrange for a process to receive an alarm clock signal, SIGALRM, by using alarm (). The default handler for this signal displays the message “Alarm clock” and terminates the process. Figure 13.45 shows how alarm () works. Here’s a small program that uses alarm (), together with its output:

$ cat alarm.c                ...list the program.
#include <stdio.h>
main ()
{
  alarm (3); /* Schedule an alarm signal in three seconds */
  printf ("Looping forever...\n");
  while (1);
  printf ("This line should never be executed\n");
}
$ alarm                      ...run the program.
Looping forever...
Alarm clock                  ...occurs three seconds later.
$ _

The next section shows you how you override a default signal handler and make your program respond specially to a particular signal.

Library Routine: unsigned int alarm (unsigned int count)

alarm () instructs the kernel to send the SIGALRM signal to the calling process after count seconds. If an alarm had already been scheduled, it is overwritten. If count is 0, any pending alarm requests are cancelled.
alarm () returns the number of seconds that remained until a previously scheduled alarm signal was due to be sent, or 0 if no alarm was pending.

FIGURE 13.45
Description of the alarm () library routine.

Handling Signals: signal ()

The last sample program reacted to the alarm signal SIGALRM in the default manner. The signal () system call may be used to override the default action. It works as shown in Figure 13.46. I made the following changes to the previous program so that it caught and processed the SIGALRM signal efficiently:

• I installed my own signal handler, alarmHandler (), by using signal ().
• I made the while loop less draining on the time-sharing system by making use of a system call called pause (). The old version of the while loop had an empty code body that caused it to loop very fast and soak up CPU resources. The new version of the while loop suspends each time through the loop until a signal is received.

Library Routine: void (*signal (int sigCode, void (*func)(int))) (int)

signal () allows a process to specify the action that it will take when a particular signal is received. The parameter sigCode specifies the number of the signal that is to be reprogrammed, and func may be one of several values:

• SIG_IGN, which indicates that the specified signal should be ignored and discarded.
• SIG_DFL, which indicates that the kernel’s default handler should be used.
• an address of a user-defined function, which indicates that the function should be executed when the specified signal arrives.

The valid signal numbers are stored in “/usr/include/signal.h”. The signals SIGKILL and SIGSTOP may not be reprogrammed. A child process inherits the signal settings from its parent during a fork (). When a process performs an exec (), previously ignored signals remain ignored, but installed handlers are set back to the default handler.
With the exception of SIGCHLD, signals are not stacked. This means that if a process is sleeping and three identical signals are sent to it, only one of the signals is actually processed.
signal () returns the previous func value associated with sigCode if successful; otherwise, it returns -1.

FIGURE 13.46
Description of the signal () library routine.

Library Routine: int pause (void)

pause () suspends the calling process and returns when the calling process receives a signal. pause () is most often used to wait efficiently for an alarm signal. It doesn’t return anything useful.

FIGURE 13.47
Description of the pause () library routine.

Figure 13.47 provides a description of pause (). Here’s the updated version of the program:

$ cat handler.c              ...list the program.
#include <stdio.h>
#include <signal.h>
int alarmFlag = 0; /* Global alarm flag */

void alarmHandler (); /* Forward declaration of alarm handler */
/***************************************************************/
main ()
{
  signal (SIGALRM, alarmHandler); /* Install signal handler */
  alarm (3); /* Schedule an alarm signal in three seconds */
  printf ("Looping...\n");
  while (!alarmFlag) /* Loop until flag set */
    {
      pause (); /* Wait for a signal */
    }
  printf ("Loop ends due to alarm signal\n");
}
/***************************************************************/
void alarmHandler ()
{
  printf ("An alarm clock signal was received\n");
  alarmFlag = 1;
}
$ handler                    ...run the program.
Looping...
An alarm clock signal was received   ...occurs three seconds later.
Loop ends due to alarm signal

Protecting Critical Code and Chaining Interrupt Handlers

The same techniques that I just described may be used to protect critical pieces of code against Control-C attacks and other such signals. In these cases, it’s common to save the previous value of the handler so that it can be restored after the critical code has executed. Here’s the source code of a program that protects itself against SIGINT signals:

$ cat critical.c             ...list the program.
#include <stdio.h>
#include <signal.h>
main ()
{
  void (*oldHandler) (); /* To hold old handler value */
  printf ("I can be Control-C'ed\n");
  sleep (3);
  oldHandler = signal (SIGINT, SIG_IGN); /* Ignore Control-C */
  printf ("I'm protected from Control-C now\n");
  sleep (3);
  signal (SIGINT, oldHandler); /* Restore old handler */
  printf ("I can be Control-C'ed again\n");
  sleep (3);
  printf ("Bye!\n");
}
$ critical                   ...run the program.
I can be Control-C'ed
^C                           ...Control-C works here.
$ critical                   ...run the program again.
I can be Control-C'ed
I'm protected from Control-C now
^C                           ...Control-C is ignored.
I can be Control-C'ed again
Bye!

Sending Signals: kill ()

A process may send a signal to another process by using the kill () system call. kill () is a misnomer, since many of the signals that it can send do not terminate a process. It’s called kill () because of historical reasons: The main use of signals when UNIX was first designed was to terminate processes. kill () works as shown in Figure 13.48.

System Call: int kill (pid_t pid, int sigCode)

kill () sends the signal with value sigCode to the process with PID pid. kill () succeeds, and the signal is sent as long as at least one of the following conditions is satisfied:

• The sending process and the receiving process have the same owner.
• The sending process is owned by a superuser.

There are a few variations on the way that kill () works:

• If pid is 0, the signal is sent to all of the processes in the sender’s process group.
• If pid is -1 and the sender is owned by a superuser, the signal is sent to all processes, including the sender.
• If pid is -1 and the sender is not a superuser, the signal is sent to all of the processes owned by the same owner as the sender, excluding the sending process.
• If pid is negative and not -1, the signal is sent to all of the processes in the process group whose ID is the absolute value of pid. (Process groups are discussed later in the chapter.)

If kill () manages to send at least one signal successfully, it returns 0; otherwise, it returns -1.

FIGURE 13.48
Description of the kill () system call.
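As a small illustration of the kill () call described in Figure 13.48, the following sketch forks a child that does nothing but wait for signals, sends it a signal, and then reports which signal ended it. It assumes the POSIX waitpid () call and the WIFSIGNALED and WTERMSIG macros from <sys/wait.h>; the function name killChildWith is hypothetical.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that simply waits for signals, send it sigCode by using  */
/* kill (), and return the number of the signal that terminated it.      */
/* Returns -1 on any error or if the child did not die from a signal.    */
int killChildWith (int sigCode)
{
  int status;
  pid_t pid = fork ();
  if (pid == -1)
    return -1;
  if (pid == 0) /* Child */
    {
      signal (sigCode, SIG_DFL);  /* Use the default action (no effect for SIGKILL) */
      for (;;)
        pause ();                 /* Sleep until a signal arrives */
    }
  /* Parent: same owner as the child, so kill () is permitted */
  if (kill (pid, sigCode) == -1)
    return -1;
  if (waitpid (pid, &status, 0) == -1) /* Accept child's termination code */
    return -1;
  return WIFSIGNALED (status) ? WTERMSIG (status) : -1;
}
```

For example, killChildWith (SIGTERM) should return SIGTERM, because the default action for that signal is to terminate the receiving process. A signal whose default action is to be ignored would leave the child running, so this sketch should only be called with a terminating signal.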

Death of Children

When a parent’s child terminates, the child process sends its parent a SIGCHLD signal. A parent process often installs a handler to deal with this signal, which typically executes a wait () to accept the child’s termination code and let the child “de-zombify.” (This means that the child is completely laid to rest and is no longer a zombie.)

Alternatively, the parent can choose to ignore SIGCHLD signals, in which case the child de-zombifies automatically. One of the socket programs that follows later in the chapter makes use of this feature.

The next example illustrates a SIGCHLD handler and allows a user to limit the amount of time that a command takes to execute. The first parameter of “limit” is the maximum number of seconds that is allowed for execution, and the remaining parameters are the command itself. The program works by performing the following steps:

1. The parent process installs a SIGCHLD handler that is executed when its child process terminates.
2. The parent process forks a child process to execute the command.
3. The parent process sleeps for the specified number of seconds. When it wakes up, it sends its child process a SIGINT signal to kill it.
4. If the child terminates before its parent finishes sleeping, the parent’s SIGCHLD handler is executed, causing the parent to terminate immediately.

Here are the source code and sample output from the program:

$ cat limit.c                ...list the program.
#include <stdio.h>
#include <signal.h>
int delay;
void childHandler ();
/**********************************************************************/
main (argc, argv)
int argc;
char* argv [];
{
  int pid;
  signal (SIGCHLD, childHandler); /* Install death-of-child handler */
  pid = fork (); /* Duplicate */
  if (pid == 0) /* Child */
    {
      execvp (argv[2], &argv[2]); /* Execute command */
      perror ("limit"); /* Should never execute */
    }
  else /* Parent */
    {
      sscanf (argv[1], "%d", &delay); /* Read delay from command line */
      sleep (delay); /* Sleep for the specified number of seconds */
      printf ("Child %d exceeded limit and is being killed\n", pid);
      kill (pid, SIGINT); /* Kill the child */
    }
}
/**********************************************************************/
void childHandler () /* Executed if the child dies before the parent */
{
  int childPid, childStatus;
  childPid = wait (&childStatus); /* Accept child's termination code */
  printf ("Child %d terminated within %d seconds\n", childPid, delay);
  exit (/* EXITSUCCESS */ 0);
}
$ limit 5 ls                 ...run the program; command finishes OK.
a.out     alarm       critical    handler    limit
alarm.c   critical.c  handler.c   limit.c
Child 4030 terminated within 5 seconds
$ limit 4 sleep 100          ...run it again; command takes too long.
Child 4032 exceeded limit and is being killed
$ _

Suspending and Resuming Processes

The SIGSTOP and SIGCONT signals suspend and resume a process, respectively. They are used by the UNIX shells (most shells, except for the Bourne shell) that support job control to implement built-in commands such as stop, fg, and bg.

In the next example, the main program created two children that entered an infinite loop and displayed a message every second. The main program waited for three seconds and then suspended the first child. The second child continued to execute as usual. After another three seconds, the parent restarted the first child, waited a little while longer, and then terminated both children. Here is the code:

$ cat pulse.c                ...list the program.
#include <signal.h>
#include <stdio.h>
main ()
{
  int pid1;
  int pid2;
  pid1 = fork ();
  if (pid1 == 0) /* First child */
    {
      while (1) /* Infinite loop */
        {
          printf ("pid1 is alive\n");
          sleep (1);
        }
    }
  pid2 = fork (); /* Second child */

  if (pid2 == 0)
    {
      while (1) /* Infinite loop */
        {
          printf ("pid2 is alive\n");
          sleep (1);
        }
    }
  sleep (3);
  kill (pid1, SIGSTOP); /* Suspend first child */
  sleep (3);
  kill (pid1, SIGCONT); /* Resume first child */
  sleep (3);
  kill (pid1, SIGINT); /* Kill first child */
  kill (pid2, SIGINT); /* Kill second child */
}
$ pulse                      ...run the program.
pid1 is alive                ...both run in first three seconds.
pid2 is alive
pid1 is alive
pid2 is alive
pid1 is alive
pid2 is alive
pid2 is alive                ...just the second child runs now.
pid2 is alive
pid2 is alive
pid1 is alive                ...the first child is resumed.
pid2 is alive
pid1 is alive
pid2 is alive
pid1 is alive
pid2 is alive
$ _

Process Groups and Control Terminals

When you’re in a shell and you execute a program that creates several children, a single Control-C from the keyboard will normally terminate the program and its children and then return you to the shell. In order to support this kind of behavior, UNIX introduced a few new concepts:

• In addition to having a unique process ID number, every process is a member of a process group. Several processes can be members of the same process group. When a process forks, the child inherits its process group from its parent. A process may change its process group to a new value by using setpgid (). When a process execs, its process group remains the same.
• Every process can have an associated control terminal, typically the terminal where the process was started. When a process forks, the child inherits its control terminal from its parent. When a process execs, its control terminal stays the same.
• Every terminal can be associated with a single control process. When a metacharacter such as Control-C is detected, the terminal sends the appropriate signal to all of the processes in the process group of its control process.
• If a process attempts to read from its control terminal and is not a member of the same process group as the terminal’s control process, the process is sent a SIGTTIN signal, which normally suspends it.

Here’s how a shell uses these features:

• When an interactive shell begins, it is the control process of a terminal and has that terminal as its control terminal. How this occurs is beyond the scope of the book.
• When a shell executes a foreground process, the child shell places itself in a different process group before exec’ing the command and takes control of the terminal. Any signals generated from the terminal thus go to the foreground command rather than the original parent shell. When the foreground command terminates, the original parent shell takes back control of the terminal.
• When a shell executes a background process, the child shell places itself in a different process group before exec’ing, but does not take control of the terminal. Any signals generated from the terminal continue to go to the shell. If the background process tries to read from its control terminal, it is suspended by a SIGTTIN signal.

The diagram in Figure 13.49 illustrates a typical setup. Assume that process 145 and process 230 are the process leaders of background jobs, and that process 171 is the process leader of the foreground job. setpgid () changes a process’ group and works as shown in Figure 13.50. A process may find out its current process group ID by using getpgid (), which works as shown in Figure 13.51.

[Diagram: processes in groups 145, 171, and 230 share the same controlling terminal; the terminal’s control process is 171.]

FIGURE 13.49
Control terminals and process groups.
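The inheritance rule and the effect of setpgid () can be checked without a terminal. In the sketch below (an illustration under the assumption of a POSIX system; childSharesGroup is a hypothetical name), a child compares its own process group with its parent’s and reports the result through its exit code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child and report whether it ends up in the parent's process   */
/* group: 1 if it does, 0 if it does not, -1 on error. If changeGroup   */
/* is nonzero, the child first moves itself into a group of its own.    */
int childSharesGroup (int changeGroup)
{
  int status;
  pid_t parentGroup = getpgid (0); /* Caller's process group */
  pid_t pid = fork ();
  if (pid == -1)
    return -1;
  if (pid == 0) /* Child */
    {
      if (changeGroup)
        setpgid (0, getpid ()); /* Own PID as the new process group ID */
      _exit (getpgid (0) == parentGroup ? 1 : 0);
    }
  if (waitpid (pid, &status, 0) == -1 || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

childSharesGroup (0) should return 1, since a forked child inherits its parent’s process group, while childSharesGroup (1) should return 0, since the child has placed itself in a new group named after its own PID.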

System Call: pid_t setpgid (pid_t pid, pid_t pgrpId)

setpgid () sets the process group ID of the process with PID pid to pgrpId. If pid is zero, the caller’s process group ID is set to pgrpId. In order for setpgid () to succeed and set the process group ID, at least one of the following conditions must be met:

• The caller and the specified process must have the same owner.
• The caller must be owned by a superuser.

When a process wants to start its own unique process group, it typically passes its own process ID number as the second parameter to setpgid ().
If setpgid () fails, it returns -1.

FIGURE 13.50
Description of the setpgid () system call.

System Call: pid_t getpgid (pid_t pid)

getpgid () returns the process group ID of the process with PID pid. If pid is zero, the process group ID of the caller is returned.

FIGURE 13.51
Description of the getpgid () system call.

The next example illustrates the fact that a terminal distributes signals to all of the processes in its control process’ process group. Since the child inherited its process group from its parent, both the parent and child catch the SIGINT signal. The code is as follows:

$ cat pgrp1.c                ...list program.
#include <signal.h>
#include <stdio.h>
void sigintHandler ();
main ()
{
  signal (SIGINT, sigintHandler); /* Handle Control-C */
  if (fork () == 0)
    printf ("Child PID %d PGRP %d waits\n", getpid (), getpgid (0));
  else
    printf ("Parent PID %d PGRP %d waits\n", getpid (), getpgid (0));
  pause (); /* Wait for a signal */
}
void sigintHandler ()
{
  printf ("Process %d got a SIGINT\n", getpid ());
}
$ pgrp1                      ...run the program.
Parent PID 24583 PGRP 24583 waits
Child PID 24584 PGRP 24583 waits
                             ...press Control-C.
Process 24584 got a SIGINT
Process 24583 got a SIGINT
$ _

If a process places itself into a different process group, it is no longer associated with the terminal’s control process and does not receive signals from the terminal. In the following example, the child process is not affected by a Control-C:

$ cat pgrp2.c                ...list the program.
#include <signal.h>
#include <stdio.h>
void sigintHandler ();
main ()
{
  int i;
  signal (SIGINT, sigintHandler); /* Install signal handler */
  if (fork () == 0)
    setpgid (0, getpid ()); /* Place child in its own process group */
  printf ("Process PID %d PGRP %d waits\n", getpid (), getpgid (0));
  for (i = 1; i <= 3; i++) /* Loop three times */
    {
      printf ("Process %d is alive\n", getpid ());
      sleep (1);
    }
}
void sigintHandler ()
{
  printf ("Process %d got a SIGINT\n", getpid ());
  exit (1);
}
$ pgrp2                      ...run the program.
Process PID 24591 PGRP 24591 waits
Process PID 24592 PGRP 24592 waits
Control-C
Process 24591 got a SIGINT   ...parent receives signal.
Process 24592 is alive       ...child carries on.
Process 24592 is alive
Process 24592 is alive
$ _

If a process attempts to read from its control terminal after it disassociates itself from the terminal’s control process, it is sent a SIGTTIN signal, which suspends the receiver

by default. In the following example, I trapped SIGTTIN with my own handler to make the effect a little clearer:

$ cat pgrp3.c                ...list the program.
#include <signal.h>
#include <stdio.h>
#include <sys/termio.h>
#include <fcntl.h>
void sigttinHandler ();
main ()
{
  int status;
  char str [100];
  if (fork () == 0) /* Child */
    {
      signal (SIGTTIN, sigttinHandler); /* Install handler */
      setpgid (0, getpid ()); /* Place myself in a new process group */
      printf ("Enter a string: ");
      scanf ("%s", str); /* Try to read from control terminal */
      printf ("You entered %s\n", str);
    }
  else /* Parent */
    {
      wait (&status); /* Wait for child to terminate */
    }
}
void sigttinHandler ()
{
  printf ("Attempted inappropriate read from control terminal\n");
  exit (1);
}
$ pgrp3                      ...run the program.
Enter a string: Attempted inappropriate read from control terminal

IPC

Interprocess communication (IPC) is the generic term describing how two processes may exchange information with each other. In general, the two processes may be running on the same machine or on different machines, although some IPC mechanisms may support only local usage (e.g., signals and pipes). IPC may be an exchange of data that two or more processes are cooperatively processing, or an exchange of synchronization information that helps two independent, but related, processes schedule work so that they do not destructively overlap.

Pipes

Pipes are an interprocess communication mechanism that allows two or more processes to send information to each other. They are commonly used from within shells to connect the standard output of one utility to the standard input of another. For example, here’s a simple shell command that determines how many users are on a system:

$ who | wc -l

The who utility generates one line of output per user. This output is then “piped” into the wc utility, which, when invoked with the -l option, outputs the total number of lines in its input. Thus, the pipelined command craftily calculates the total number of users by counting the number of lines that who generates. Figure 13.52 shows a diagram of the pipeline.

[Diagram: who, pipe, wc; bytes from “who” flow through the pipe to “wc”.]

FIGURE 13.52
A simple pipe.

It’s important to realize that both the writer process and the reader process of a pipeline execute concurrently; a pipe automatically buffers the output of the writer and suspends the writer if the pipe gets too full. Similarly, if a pipe empties, the reader is suspended until some more output becomes available.

All versions of UNIX support unnamed pipes, which are the kind of pipes that shells use. System V also supports a more powerful kind of pipe called a named pipe. In this section, I’ll show you how to construct each kind of pipe, starting with unnamed pipes.

Unnamed pipes: pipe ()

An unnamed pipe is a unidirectional communication link that automatically buffers its input (the maximum size varies with different versions of UNIX, but is approximately 5K) and may be created using the pipe () system call. Each end of a pipe has an associated file descriptor. The “write” end of the pipe may be written to using write (), and the “read” end may be read from using read (). When a process has finished with a pipe’s file descriptor, it should close it, using close (). Figure 13.53 shows how pipe () works.

If the code

int fd [2];
pipe (fd);

is executed, then the data structures shown in Figure 13.54 will be created. Unnamed pipes are usually used for communication between a parent process and its child, with one process writing and the other process reading. The typical sequence of events is as follows:

1. The parent process creates an unnamed pipe, using pipe ().
2. The parent process forks.
3. The writer closes its read end of the pipe, and the designated reader closes its write end of the pipe.
4. The processes communicate by using write () and read () calls.
5. Each process closes its active pipe descriptor when it’s finished with it.

System Call: int pipe (int fd [2])

pipe () creates an unnamed pipe and returns two file descriptors; the descriptor associated with the “read” end of the pipe is stored in fd [0], and the descriptor associated with the “write” end of the pipe is stored in fd [1].

The following rules apply to processes that read from a pipe:

• If a process reads from a pipe whose write end has been closed, the read () returns a 0, indicating the end of input.
• If a process reads from an empty pipe whose write end is still open, it sleeps until some input becomes available.
• If a process tries to read more bytes from a pipe than are present, all of the pipe’s current contents are returned, and read () returns the number of bytes actually read.

The following rules apply to processes that write to a pipe:

• If a process writes to a pipe whose read end has been closed, the write fails and the writer is sent a SIGPIPE signal. The default action of this signal is to terminate the receiver.
• If a process writes fewer bytes to a pipe than the pipe can hold, the write () is guaranteed to be atomic; that is, the writer process will complete its system call without being preempted by another process. If a process writes more bytes to a pipe than the pipe can hold, no similar guarantees of atomicity apply.

Since access to an unnamed pipe is via the file descriptor mechanism, typically only the process that creates a pipe and its descendants may use the pipe.

2 lseek () has no

Bidirectional communication is possible only by using two pipes. Here’s a small program that uses a pipe to allow the parent to read a message from its child:

$ cat talk.c                 ...list the program.
#include <stdio.h>
#define READ 0 /* The index of the read end of the pipe */
#define WRITE 1 /* The index of the write end of the pipe */
char* phrase = "Stuff this in your pipe and smoke it";
main ()
{
  int fd [2], bytesRead;
  char message [100]; /* Parent process' message buffer */
  pipe (fd); /* Create an unnamed pipe */
  if (fork () == 0) /* Child, writer */
    {
      close (fd[READ]); /* Close unused end */
      write (fd[WRITE], phrase, strlen (phrase) + 1); /* include NULL */
      close (fd[WRITE]); /* Close used end */
    }
  else /* Parent, reader */
    {
      close (fd[WRITE]); /* Close unused end */
      bytesRead = read (fd[READ], message, 100);
meaning when applied to a pipe. printf (“Read %d bytes: %s\n”, bytesRead, message); 7* Send *7
If the kernel cannot allocate enough space for a new pipe, pipe 0 returns —1; close (fd[READJ); / Close usedend *7
otherwise, it returns 0.
$ talk . run
..the program.
FIGURE 13.53 Read 37 bytes: Stuff this in your pipe and smoke it
Description of the pipe () system call. $
Notice that the child included the phrase’s NULL terminator as part of the message so
that the parent could easily display it. When a writer process sends more than one van
fd 01
able-length message into a pipe, it must use a protocol to indicate an end of message to
the reader. Methods for doing this include the following:
fd[lj
Read end
• sendmg the length of a message (in bytes) before sending the message itself
• ending a message with a special character, such as a newline or a NULL

FIGURE 13.54
An unnamed pipe: fd [0] refers to the read end, and fd [1] refers to the write end.

² In advanced situations, it is actually possible to pass file descriptors to unrelated
processes via a pipe.

UNIX shells use unnamed pipes to build pipelines. To do so, they use a trick similar to
the redirection mechanism described in an earlier section to connect the standard output
of one process to the standard input of another. To illustrate this approach, consider a
program that executes two named programs, connecting the standard output of the
first to the standard input of the second. The program doing the connecting assumes
that neither program is invoked with options and that the names of the programs are
listed on the command line. Here’s the code:

$ cat connect.c                       ...list the program.
#include <stdio.h>
#include <unistd.h>   /* For pipe (), fork (), dup2 (), close (), execlp () */
#define READ 0
#define WRITE 1
main (argc, argv)
int argc;
char* argv [];
{
  int fd [2];
  pipe (fd); /* Create an unnamed pipe */
  if (fork () != 0) /* Parent, writer */
    {
      close (fd[READ]); /* Close unused end */
      dup2 (fd[WRITE], 1); /* Duplicate used end to stdout */
      close (fd[WRITE]); /* Close original used end */
      execlp (argv[1], argv[1], NULL); /* Execute writer program */
      perror ("connect"); /* Should never execute */
    }
  else /* Child, reader */
    {
      close (fd[WRITE]); /* Close unused end */
      dup2 (fd[READ], 0); /* Duplicate used end to stdin */
      close (fd[READ]); /* Close original used end */
      execlp (argv[2], argv[2], NULL); /* Execute reader program */
      perror ("connect"); /* Should never execute */
    }
}
$ who                                 ...execute “who” by itself.
gglass   ttyp0   Feb 15 18:45   (xyplex_3)
$ connect who wc                      ...pipe “who” through “wc”.
      1       6      57               ...1 line, 6 words, 57 chars.
$ _

Later in the chapter, we examine a more sophisticated example of unnamed pipes.
Also, the chapter review contains an interesting exercise that involves building a ring
of pipes.

Named pipes

Named pipes [often referred to as first-in, first-out queues (FIFOs)] are less restricted
than unnamed pipes, and offer the following advantages:

• They have a name that exists in the file system.
• They may be used by unrelated processes.
• They exist until they are explicitly deleted.

Unfortunately, named pipes are supported only by System V. All of the rules that I
mentioned in the previous section regarding unnamed pipes apply to named pipes, except
that named pipes have a larger buffer capacity, typically about 40K.
Named pipes exist as special files in the file system and may be created in one of
two ways:

• by using the UNIX mknod utility
• by using the mknod () system call

To create a named pipe using mknod, use the p option. (For more information about
mknod, see Chapter 15.) The mode of the named pipe may be set using chmod, allowing
others to access the pipe that you create. Here’s an example of this procedure,
executed from a Korn shell:

$ mknod myPipe p                      ...create pipe.
$ chmod ug+rw myPipe                  ...update permissions.
$ ls -lg myPipe                       ...examine attributes.
prw-rw----  1 glass    cs           0 Feb 27 12:38 myPipe
$ _

Note that the type of the named pipe is “p” in the ls listing.
To create a named pipe using mknod (), specify S_IFIFO as the file mode. The
mode of the pipe can then be changed by using chmod (). Here’s a snippet of C code
that creates a named pipe with read and write permission for the owner and group:

mknod ("myPipe", S_IFIFO, 0); /* Create a named pipe */
chmod ("myPipe", 0660); /* Modify its permission flags */

Regardless of how you go about creating a named pipe, the end result is the same: A
special file is added into the file system. Once a named pipe is opened using open (),
write () adds data at the start of the FIFO queue, and read () removes data from the
end of the FIFO queue. When a process has finished using a named pipe, it should close
it using close (), and when a named pipe is no longer needed, it should be removed
from the file system via unlink ().
Like an unnamed pipe, a named pipe is intended only for use as a unidirectional
link. Writer processes should open a named pipe for write only, and reader processes
should open one for read only. Although a process could open a named pipe for both
reading and writing, doing so doesn’t have much practical application. Before I show
you a sample program that uses named pipes, here are a couple of special rules
concerning their use:

• If a process tries to open a named pipe for read only and no process currently has
  that file open for writing, the reader will wait until a process opens the file for
  writing, unless O_NONBLOCK/O_NDELAY is set, in which case open () succeeds
  immediately.
• If a process tries to open a named pipe for write only and no process currently
  has that file open for reading, the writer will wait until a process opens the file for
  reading, unless O_NONBLOCK/O_NDELAY is set, in which case open () fails
  immediately.
• Named pipes will not work across a network.

The next example uses two programs, “reader” and “writer”, and works like this:

• A single reader process is executed, creating a named pipe called “aPipe”. The
  process then reads and displays NULL-terminated lines from the pipe until the
  pipe is closed by all of the writing processes.
• One or more writer processes are executed, each of which opens the named pipe
  called “aPipe” and sends three messages to it. If the pipe does not exist when a
  writer tries to open it, the writer retries every second until it succeeds. When all of
  a writer’s messages are sent, the writer closes the pipe and exits.

Following are the source code for each file and some sample output:

Reader program

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h> /* For S_IFIFO */
#include <fcntl.h>
#include <unistd.h>   /* For unlink (), read (), close () */
/*********************************************************************/
main ()
{
  int fd;
  char str[100];
  unlink ("aPipe"); /* Remove named pipe if it already exists */
  mknod ("aPipe", S_IFIFO, 0); /* Create named pipe */
  chmod ("aPipe", 0660); /* Change its permissions */
  fd = open ("aPipe", O_RDONLY); /* Open it for reading */
  while (readLine (fd, str)) /* Display received messages */
    printf ("%s\n", str);
  close (fd); /* Close pipe */
}
/*********************************************************************/
readLine (fd, str)
int fd;
char* str;
/* Read a single NULL-terminated line into str from fd */
/* Return 0 when the end-of-input is reached and 1 otherwise */
{
  int n;
  do /* Read characters until NULL or end-of-input */
    {
      n = read (fd, str, 1); /* Read one character */
    }
  while (n > 0 && *str++ != NULL);
  return (n > 0); /* Return false if end-of-input */
}

Writer program

#include <stdio.h>
#include <string.h>   /* For strlen () */
#include <fcntl.h>
#include <unistd.h>   /* For getpid (), sleep (), write (), close () */
/**********************************************************************/
main ()
{
  int fd, messageLen, i;
  char message [100];
  /* Prepare message */
  sprintf (message, "Hello from PID %d", getpid ());
  messageLen = strlen (message) + 1;
  do /* Keep trying to open the file until successful */
    {
      fd = open ("aPipe", O_WRONLY); /* Open named pipe for writing */
      if (fd == -1) sleep (1); /* Try again in 1 second */
    }
  while (fd == -1);
  for (i = 1; i <= 3; i++) /* Send three messages */
    {
      write (fd, message, messageLen); /* Write message down pipe */
      sleep (3); /* Pause a while */
    }
  close (fd); /* Close pipe descriptor */
}

Sample output

$ reader & writer & writer &          ...start 1 reader, 2 writers.
[1] 4698                              ...reader process.
[2] 4699                              ...first writer process.
[3] 4700                              ...second writer process.
Hello from PID 4699
Hello from PID 4700
Hello from PID 4699
Hello from PID 4700
Hello from PID 4699
Hello from PID 4700
[2] Done writer                       ...first writer exits.
[3] Done writer                       ...second writer exits.
[1] Done reader                       ...reader exits.

Sockets

Sockets are the traditional UNIX interprocess communication mechanism that allows
processes to talk to each other, even if they’re on different machines. It is this
across-network capability that makes sockets so useful. For example, the rlogin utility,
which allows a user on one machine to log into a remote host, is implemented with sockets.
Other common uses of sockets include the following:

• printing a file on one machine from another machine
• transferring files from one machine to another machine

Process communication via sockets is based on the client-server model. One process,
known as a server process, creates a socket whose name is known by other client
processes. These client processes can talk to the server process via a connection to its
named socket. To do this, a client process first creates an unnamed socket and then
requests that it be connected to the server’s named socket. A successful connection
returns one file descriptor to the client and one to the server, both of which may be used
for reading and writing. Note that, unlike pipes, socket connections are bidirectional.
Figure 13.55 illustrates the process.

1. Server creates a named socket.
2. Client creates an unnamed socket and requests a connection.
3. Client makes a connection. Server retains original named socket. (Completed
   connection.)

FIGURE 13.55
The socket connection.

Once a socket connection is made, it’s quite common for the server process to
fork a child process to converse with the client, while the original parent process
continues to accept other client connections. A typical example of this is a remote print
server: The server process first accepts a client that wishes to send a file for printing
and then forks a child to perform the file transfer. The parent process meanwhile waits
for more client print requests.
In what follows, we’ll take a look at these topics:

• the different kinds of sockets
• how a server creates a named socket and waits for connections
• how a client creates an unnamed socket and requests a connection from a server
• how a server and client communicate after a socket connection is made
• how a socket connection is closed
• how a server can create a child process to converse with a client

The Different Kinds of Sockets

The various kinds of sockets may be classified according to three attributes:

• the domain
• the type
• the protocol

Domains

The domain of a socket indicates where the server and client sockets may reside; the
domains that are currently supported include the following:

• AF_UNIX (the clients and server must be in the same machine)
• AF_INET (the clients and server may be anywhere on the Internet)
• AF_NS (the clients and server may be on a XEROX network system)

“AF” stands for “Address Family.” There is a similar set of constants that begin with
“PF” (e.g., PF_UNIX and PF_INET), which stands for “Protocol Family.” Either set
may be used, since they are equivalent. This book contains information about
AF_UNIX and AF_INET sockets, but not AF_NS sockets.

Types

The type of socket determines the type of communication that can exist between the
client and server; the following are the two main types that are currently supported:

• SOCK_STREAM: sequenced, reliable, two-way-connection-based, variable-length
  streams of bytes
• SOCK_DGRAM: like telegrams; connectionless, unreliable, fixed-length messages

Other types that are either in the planning stages or implemented only in some
domains include the following:

• SOCK_SEQPACKET: sequenced, reliable, two-way-connection-based, fixed-length
  packets of bytes
• SOCK_RAW: provides access to internal network protocols and interfaces

This book contains information only on how to use SOCK_STREAM sockets, which
are the most common. SOCK_STREAM sockets are both intuitive and easy to use.

Protocols

The protocol value specifies the low-level means by which the socket type is
implemented. System calls that expect a protocol parameter accept 0 as meaning “the
correct protocol”; in other words, the protocol value is something that you generally
won’t have to worry about. Most systems support protocols other than 0 only as an
optional extra, so I’ll use the default protocol in all the examples.

Writing Socket Programs

Any program that uses sockets must include “/usr/include/sys/types.h” and
“/usr/include/sys/socket.h”. Additional header files must be included on the basis of the
socket domain that you wish to use. The most commonly used domains are shown in
Figure 13.56. Other socket domains are defined in socket.h.

Domain     Additional header files
AF_UNIX    /usr/include/sys/un.h
AF_INET    /usr/include/netinet/in.h
           /usr/include/arpa/inet.h
           /usr/include/netdb.h

FIGURE 13.56
Common socket domains and corresponding header files.

To illustrate clearly the way in which a program that uses sockets is written, I’ll
build my description of socket-oriented system calls around a small client-server
example that uses AF_UNIX sockets. Once I’ve done this, I’ll show you another example
that uses AF_INET sockets. The AF_UNIX example consists of the following two
programs:

• “chef,” the server, which creates a named socket called “recipe” and writes the
  recipe to any clients which request it. The recipe is a collection of variable-length
  NULL-terminated strings.
• “cook,” the client, which connects to the named socket called “recipe” and reads
  the recipe from the server. “Cook” displays the recipe to standard output as it
  reads it, and then it terminates.

The chef server process runs in the background. Any client cook processes that
connect to the server cause it to fork a duplicate server to handle the recipe transfer,
allowing the original server to accept other incoming connections. Here’s some sample
output from the chef-cook example:

$ chef &                              ...run the server in the background.
[1] 5684
$ cook                                ...run a client; display the recipe.
spam, spam, spam, spam,
spam, and spam.
$ cook                                ...run another client; display the recipe.
spam, spam, spam, spam,
spam, and spam.
$ kill %1                             ...kill the server.
[1] Terminated chef

Chef-Cook Listing

This section contains the complete listing of the chef and cook programs. I suggest that
you skim quickly through the code and then read the sections that follow for details on
how the two programs work. In the interests of space, I have purposely left out a great
deal of error checking.

Chef Server

 1  #include <stdio.h>
 2  #include <signal.h>
 3  #include <sys/types.h>
 4  #include <sys/socket.h>
 5  #include <sys/un.h> /* For AF_UNIX sockets */
 6
 7  #define DEFAULT_PROTOCOL 0
 8
 9  /****************************************************************/
10
11  main ()
12
13  {
14    int serverFd, clientFd, serverLen, clientLen;
15    struct sockaddr_un serverUNIXAddress; /* Server address */
16    struct sockaddr_un clientUNIXAddress; /* Client address */
17    struct sockaddr* serverSockAddrPtr; /* Ptr to server address */
18    struct sockaddr* clientSockAddrPtr; /* Ptr to client address */
19
20    /* Ignore death-of-child signals to prevent zombies */
21    signal (SIGCHLD, SIG_IGN);
22
23    serverSockAddrPtr = (struct sockaddr*) &serverUNIXAddress;
24    serverLen = sizeof (serverUNIXAddress);
25
26    clientSockAddrPtr = (struct sockaddr*) &clientUNIXAddress;
27    clientLen = sizeof (clientUNIXAddress);
28
29    /* Create a UNIX socket, bidirectional, default protocol */
30    serverFd = socket (AF_UNIX, SOCK_STREAM, DEFAULT_PROTOCOL);
31    serverUNIXAddress.sun_family = AF_UNIX; /* Set domain type */
32    strcpy (serverUNIXAddress.sun_path, "recipe"); /* Set name */
33    unlink ("recipe"); /* Remove file if it already exists */
34    bind (serverFd, serverSockAddrPtr, serverLen); /* Create file */
35    listen (serverFd, 5); /* Maximum pending connection length */
36
37    while (1) /* Loop forever */
38      {
39        /* Accept a client connection */
40        clientFd = accept (serverFd, clientSockAddrPtr, &clientLen);
41
42        if (fork () == 0) /* Create child to send recipe */
43          {
44            writeRecipe (clientFd); /* Send the recipe */
45            close (clientFd); /* Close the socket */
46            exit (/* EXIT_SUCCESS */ 0); /* Terminate */
47          }
48        else
49          close (clientFd); /* Close the client descriptor */
50      }
51  }
52
53  /****************************************************************/
54
55  writeRecipe (fd)
56
57  int fd;
58
59  {
60    static char* line1 = "spam, spam, spam, spam,";
61    static char* line2 = "spam, and spam.";
62    write (fd, line1, strlen (line1) + 1); /* Write first line */
63    write (fd, line2, strlen (line2) + 1); /* Write second line */
64  }

Cook Client

 1  #include <stdio.h>
 2  #include <signal.h>
 3  #include <sys/types.h>
 4  #include <sys/socket.h>
 5  #include <sys/un.h> /* For AF_UNIX sockets */
 6
 7  #define DEFAULT_PROTOCOL 0
 8
 9  /****************************************************************/
10
11  main ()
12
13  {
14    int clientFd, serverLen, result;
15    struct sockaddr_un serverUNIXAddress;
16    struct sockaddr* serverSockAddrPtr;
17
18    serverSockAddrPtr = (struct sockaddr*) &serverUNIXAddress;
19    serverLen = sizeof (serverUNIXAddress);
20
21    /* Create a UNIX socket, bidirectional, default protocol */
22    clientFd = socket (AF_UNIX, SOCK_STREAM, DEFAULT_PROTOCOL);
23    serverUNIXAddress.sun_family = AF_UNIX; /* Server domain */
24    strcpy (serverUNIXAddress.sun_path, "recipe"); /* Server name */
25
26    do /* Loop until a connection is made with the server */
27      {
28        result = connect (clientFd, serverSockAddrPtr, serverLen);
29        if (result == -1) sleep (1); /* Wait and then try again */
30      }
31    while (result == -1);
32
33    readRecipe (clientFd); /* Read the recipe */
34    close (clientFd); /* Close the socket */
35    exit (/* EXIT_SUCCESS */ 0); /* Done */
36  }
37
38  /**************************************************************/
39
40  readRecipe (fd)
41
42  int fd;
43
44  {
45    char str[200];
46
47    while (readLine (fd, str)) /* Read lines until end-of-input */
48      printf ("%s\n", str); /* Echo line from socket */
49  }
50
51  /**************************************************************/
52
53  readLine (fd, str)
54
55  int fd;
56  char* str;
57
58  /* Read a single NULL-terminated line */
59
60  {
61    int n;
62
63    do /* Read characters until NULL or end-of-input */
64      {
65        n = read (fd, str, 1); /* Read one character */
66      }
67    while (n > 0 && *str++ != NULL);
68    return (n > 0); /* Return false if end-of-input */
69  }

Analyzing the source code

Now that you’ve glanced at the program, it’s time to go back and analyze it. We begin
with the server.

The server

A server is the process that’s responsible for creating a named socket and accepting
connections to it. To accomplish this, the server must use the system calls listed in
Figure 13.57, in the order in which they are shown.

Name      Meaning
socket    creates an unnamed socket
bind      gives the socket a name
listen    specifies the maximum number of pending connections
accept    accepts a socket connection from a client

FIGURE 13.57
System calls used by a typical UNIX daemon process.

Creating a Socket: socket ()

A process may create a socket by using socket (), which works as shown in Figure 13.58.
The chef server creates its unnamed socket on line 30 of the program:

30    serverFd = socket (AF_UNIX, SOCK_STREAM, DEFAULT_PROTOCOL);

System Call: int socket (int domain, int type, int protocol)

socket () creates an unnamed socket of the specified domain, type, and protocol. The
valid values of these parameters were described earlier.
If socket () is successful, it returns a file descriptor associated with the newly
created socket; otherwise, it returns -1.

FIGURE 13.58
Description of the socket () system call.

Naming a Socket: bind ()

Once the server has created an unnamed socket, it must bind it to a name by using
bind (), which works as shown in Figure 13.59. The chef server assigns the sockaddr_un
fields and performs a bind () on lines 31-34:

31    serverUNIXAddress.sun_family = AF_UNIX; /* Set domain type */
32    strcpy (serverUNIXAddress.sun_path, "recipe"); /* Set name */
33    unlink ("recipe"); /* Remove file if it already exists */
34    bind (serverFd, serverSockAddrPtr, serverLen); /* Create file */

System Call: int bind (int fd, const struct sockaddr* address, size_t addressLen)

bind () associates the unnamed socket represented by file descriptor fd with the
socket address stored in address. addressLen must contain the length of the address
structure. The type and value of the incoming address depend on the socket domain.
If the socket is in the AF_UNIX domain, a pointer to a sockaddr_un structure
must be cast to a (sockaddr*) and passed in as address. This structure has two fields
that should be set as follows:

FIELD        ASSIGN THE VALUE
sun_family   AF_UNIX
sun_path     the full UNIX pathname of the socket (absolute or
             relative), up to 108 characters long

If the named AF_UNIX socket already exists, an error occurs, so it’s a good
idea to unlink () a name before attempting to bind to it.
If the socket is in the AF_INET domain, a pointer to a sockaddr_in structure
must be cast to a (sockaddr*) and passed in as address. This structure has four fields,
which should be set as follows:

FIELD        ASSIGN THE VALUE
sin_family   AF_INET
sin_port     the port number of the Internet socket
sin_addr     a structure of type in_addr that holds the Internet address
sin_zero     leave empty

(For more information about Internet ports and addresses, see the Internet-specific
part of this section.)
If bind () succeeds, it returns a 0; otherwise, it returns -1.

FIGURE 13.59
Description of the bind () system call.
Creating a Socket Queue: listen ()

When a server process is servicing a client connection, it’s always possible that
another client will also attempt a connection. The listen () system call allows a process
to specify the number of pending connections that may be queued. It works as shown in
Figure 13.60. The chef server listens to its named socket on line 35:

35    listen (serverFd, 5); /* Maximum pending connection length */

System Call: int listen (int fd, int queueLength)

listen () allows you to specify the maximum number of pending connections on a
socket. The maximum queue length is 5. If a client attempts to connect to a socket
whose queue is full, it is denied access.

FIGURE 13.60
Description of the listen () system call.

Accepting a Client: accept ()

Once a socket has been created and named, and its queue size has been specified, the
final step is to accept client connection requests. To do this, the server must use
accept (), which works as shown in Figure 13.61. The chef server accepts a connection
on line 40:

40    clientFd = accept (serverFd, clientSockAddrPtr, &clientLen);

System Call: int accept (int fd, struct sockaddr* address, int* addressLen)

accept () listens to the named server socket referenced by fd and waits until a client
connection request is received. When this occurs, accept () creates an unnamed socket
with the same attributes as the original named server socket, connects the unnamed
socket to the client’s socket, and returns a new file descriptor that may be
used for communication with the client. The original named server socket may be
used to accept more connections.
The address structure is filled with the address of the client and is normally
used only in conjunction with Internet connections. The addressLen field should
initially be set to point to an integer containing the size of the structure pointed to by
address. When a connection is made, the integer that it points to is set to the actual
size, in bytes, of the resulting address.
If accept () succeeds, it returns a new file descriptor that may be used to talk
with the client; otherwise, it returns -1.

FIGURE 13.61
Description of the accept () system call.

Serving a client

When a client connection succeeds, the most common sequence of events is this:

• The server process forks.
• The parent process closes the newly formed client file descriptor and loops back
  to accept (), ready to service new requests for connection.
• Using read () and write (), the child process talks to the client. When the
  conversation is complete, the child process closes the client file descriptor and exits.

The chef server process takes this series of actions on lines 37-50:

37    while (1) /* Loop forever */
38      {
39        /* Accept a client connection */
40        clientFd = accept (serverFd, clientSockAddrPtr, &clientLen);
41
42        if (fork () == 0) /* Create child to send recipe */
43          {
44            writeRecipe (clientFd); /* Send the recipe */
45            close (clientFd); /* Close the socket */
46            exit (/* EXIT_SUCCESS */ 0); /* Terminate */
47          }
48        else
49          close (clientFd); /* Close the client descriptor */
50      }

Note that the server chose to ignore SIGCHLD signals on line 21 so that its children
could die immediately without requiring the parent to accept their return codes. If the
server had not done this, it would have had to install a SIGCHLD handler, which
would have been more tedious.

The Client

Now that you’ve seen how a server program is written, let’s take a look at the
construction of a client program. A client is a process that’s responsible for creating an
unnamed socket and then attaching it to a named server socket. To accomplish this, it
must use the system calls listed in Figure 13.62, in the order shown.

Name      Meaning
socket    creates an unnamed socket
connect   attaches an unnamed client socket to a named server socket

FIGURE 13.62
System calls used by a typical UNIX client process.

The way in which a client uses socket () to create an unnamed socket is the same as
the way in which the
server uses it. The domain, type, and protocol of the client socket must match those of
the targeted server socket. The cook client process creates its unnamed socket on line 22:

22    clientFd = socket (AF_UNIX, SOCK_STREAM, DEFAULT_PROTOCOL);

Making the Connection: connect ()

To connect to a server’s socket, a client process must fill a structure with the address of
the socket and then use connect (), which works as shown in Figure 13.63. In lines
26-31, the cook client process calls connect () until a successful connection is made:

26    do /* Loop until a connection is made with the server */
27      {
28        result = connect (clientFd, serverSockAddrPtr, serverLen);
29        if (result == -1) sleep (1); /* Wait and then try again */
30      }
31    while (result == -1);

System Call: int connect (int fd, struct sockaddr* address, int addressLen)

connect () attempts to connect to a server socket whose address is contained within
a structure pointed to by address. If successful, fd may be used to communicate with
the server’s socket. The type of structure that address points to must follow the same
rules as those stated in the description of bind ():

• If the socket is in the AF_UNIX domain, a pointer to a sockaddr_un structure
  must be cast to a (sockaddr*) and passed in as address.
• If the socket is in the AF_INET domain, a pointer to a sockaddr_in structure
  must be cast to a (sockaddr*) and passed in as address.

addressLen must be equal to the size of the address structure. (For examples of
Internet clients, see the connect () socket example and the Internet shell program at
the end of the chapter.)
If the connection is made, connect () returns 0. If the server socket doesn’t
exist or its pending queue is currently filled, connect () returns -1.

FIGURE 13.63
Description of the connect () system call.

Communicating via Sockets

Once the server socket and client socket have connected, their file descriptors may be
used by write () and read (). In the sample program, the server uses write () in lines
55-64:

55  writeRecipe (fd)
56
57  int fd;
58
59  {
60    static char* line1 = "spam, spam, spam, spam,";
61    static char* line2 = "spam, and spam.";
62    write (fd, line1, strlen (line1) + 1); /* Write first line */
63    write (fd, line2, strlen (line2) + 1); /* Write second line */
64  }

The client uses read () in lines 53-69:

53  readLine (fd, str)
54
55  int fd;
56  char* str;
57
58  /* Read a single NULL-terminated line */
59
60  {
61    int n;
62
63    do /* Read characters until NULL or end-of-input */
64      {
65        n = read (fd, str, 1); /* Read one character */
66      }
67    while (n > 0 && *str++ != NULL);
68    return (n > 0); /* Return false if end-of-input */
69  }

The server and the client should be careful to close their socket file descriptors when
they are no longer needed.

Internet Sockets

The AF_UNIX sockets that you’ve seen so far are fine for learning about sockets, but they
aren’t where the action is. Most of the useful stuff involves communicating among
machines on the Internet, so the rest of this chapter is dedicated to AF_INET sockets. If
you haven’t already read about networking in Chapter 9, now would be a good time to do so.
An Internet socket is specified by two values: a 32-bit IP address, which specifies
a single unique Internet host, and a 16-bit port number, which specifies a particular
port on the host. This means that an Internet client must know not only the IP address
of the server, but also the server’s port number.
As I mentioned in Chapter 9, several standard port numbers are reserved for system
use. For example, port 13 is always served by a process that echoes the host’s time
of day to any client that’s interested. The first Internet socket example allows you to
connect to port 13 of any Internet host in the world and find out the “remote” time of
day. It allows three kinds of Internet address:

• If you enter “s”, it automatically means the local host.
• If you enter something that starts with a digit, it’s assumed to be an A.B.C.D-format
  IP address and is converted into a 32-bit IP address by software.
• If you enter a string, it's assumed to be a symbolic host name and is converted into a 32-bit IP address by software.

Here's some sample output from the "Internet time" program. The third address that I entered is the IP address of "ddn.nic.mil," the national Internet database server. Notice the one-hour time difference between my local host's time and the database server host's time.

Sample Output

$ inettime                                       ...run the program.
Host name (q = quit, s = self): s                ...what's my time?
Self host name is csservr2
Internet Address = 129.110.42.1
The time on the target port is Fri Mar 27 17:03:50 1998
Host name (q = quit, s = self): wotan            ...what's the time on "wotan"?
Internet Address = 129.110.2.1
The time on the target port is Fri Mar 27 17:03:55 1998
Host name (q = quit, s = self): 192.112.36.5     ...try ddn.nic.mil.
The time on the target port is Fri Mar 27 18:02:02 1998
Host name (q = quit, s = self): q                ...quit program.
$ _

Internet Time Listing

This section contains the complete listing of the Internet time client program. I suggest that you skim through the code and then read the sections that follow for details on how it works.

  1  #include <stdio.h>
  2  #include <signal.h>
  3  #include <ctype.h>
  4  #include <sys/types.h>
  5  #include <sys/socket.h>
  6  #include <netinet/in.h>  /* For AF_INET sockets */
  7  #include <arpa/inet.h>
  8  #include <netdb.h>
  9
 10  #define DAYTIME_PORT 13  /* Standard port no */
 11  #define DEFAULT_PROTOCOL 0
 12
 13  unsigned long promptForINETAddress ();
 14  unsigned long nameToAddr ();
 15
 16  /****************************************************************/
 17
 18  main ()
 19
 20  {
 21    int clientFd; /* Client socket file descriptor */
 22    int serverLen; /* Length of server address structure */
 23    int result; /* From connect () call */
 24    struct sockaddr_in serverINETAddress; /* Server address */
 25    struct sockaddr* serverSockAddrPtr; /* Pointer to address */
 26    unsigned long inetAddress; /* 32-bit IP address */
 27
 28    /* Set the two server variables */
 29    serverSockAddrPtr = (struct sockaddr*) &serverINETAddress;
 30    serverLen = sizeof (serverINETAddress); /* Length of address */
 31
 32    while (1) /* Loop until break */
 33      {
 34        inetAddress = promptForINETAddress (); /* Get 32-bit IP */
 35        if (inetAddress == 0) break; /* Done */
 36        /* Start by zeroing out the entire address structure */
 37        bzero ((char*)&serverINETAddress, sizeof (serverINETAddress));
 38        serverINETAddress.sin_family = AF_INET; /* Use Internet */
 39        serverINETAddress.sin_addr.s_addr = inetAddress; /* IP */
 40        serverINETAddress.sin_port = htons (DAYTIME_PORT);
 41        /* Now create the client socket */
 42        clientFd = socket (AF_INET, SOCK_STREAM, DEFAULT_PROTOCOL);
 43        do /* Loop until a connection is made with the server */
 44          {
 45            result = connect (clientFd,serverSockAddrPtr,serverLen);
 46            if (result == -1) sleep (1); /* Try again in 1 second */
 47          }
 48        while (result == -1);
 49
 50        readTime (clientFd); /* Read the time from the server */
 51        close (clientFd); /* Close the socket */
 52      }
 53
 54    exit (/* EXIT_SUCCESS */ 0);
 55  }
 56
 57  /****************************************************************/
 58
 59  unsigned long promptForINETAddress ()
 60
 61  {
 62    char hostName [100]; /* Name from user: numeric or symbolic */
 63    unsigned long inetAddress; /* 32-bit IP format */
 64
 65    /* Loop until quit or a legal name is entered */
 66    /* If quit, return 0 else return host's IP address */
 67    do
 68      {
 69        printf ("Host name (q = quit, s = self): ");
 70        scanf ("%s", hostName); /* Get name from keyboard */
 71        if (strcmp (hostName, "q") == 0) return (0); /* Quit */
 72        inetAddress = nameToAddr (hostName); /* Convert to IP */
 73        if (inetAddress == 0) printf ("Host name not found\n");
 74      }
 75    while (inetAddress == 0);
 76    return (inetAddress);
 77  }
 78  /****************************************************************/
 79
 80  unsigned long nameToAddr (name)
 81
 82  char* name;
 83
 84  {
 85    char hostName [100];
 86    struct hostent* hostStruct;
 87    struct in_addr* hostNode;
 88
 89    /* Convert name into a 32-bit IP address */
 90
 91    /* If name begins with a digit, assume it's a valid numeric */
 92    /* Internet address of the form A.B.C.D and convert directly */
 93    if (isdigit (name[0])) return (inet_addr (name));
 94
 95    if (strcmp (name, "s") == 0) /* Get host name from database */
 96      {
 97        gethostname (hostName,100);
 98        printf ("Self host name is %s\n", hostName);
 99      }
100    else /* Assume name is a valid symbolic host name */
101      strcpy (hostName, name);
102
103    /* Now obtain address information from database */
104    hostStruct = gethostbyname (hostName);
105    if (hostStruct == NULL) return (0); /* Not Found */
106    /* Extract the IP Address from the hostent structure */
107    hostNode = (struct in_addr*) hostStruct->h_addr;
108    /* Display a readable version for fun */
109    printf ("Internet Address = %s\n", inet_ntoa (*hostNode));
110    return (hostNode->s_addr); /* Return IP address */
111  }
112
113  /****************************************************************/
114
115  readTime (fd)
116
117  int fd;
118
119  {
120    char str [200]; /* Line buffer */
121
122    printf ("The time on the target port is ");
123    while (readLine (fd, str)) /* Read lines until end-of-input */
124      printf ("%s\n", str); /* Echo line from server to user */
125  }
126
127  /****************************************************************/
128
129  readLine (fd, str)
130
131  int fd;
132  char* str;
133
134  /* Read a single NEWLINE-terminated line */
135
136  {
137    int n;
138
139    do /* Read characters until NULL or end-of-input */
140      {
141        n = read (fd, str, 1); /* Read one character */
142      }
143    while (n > 0 && *str++ != '\n');
144    return (n > 0); /* Return false if end-of-input */
145  }

Analyzing the source code

Now that you've had a brief look through the Internet socket source code, it's time to examine the interesting sections. The program focuses mostly on the client side of an Internet connection, so I'll describe that portion first.

Internet clients

The procedure for creating an Internet client is the same as that for creating an AF_UNIX client, except for the initialization of the socket address. Earlier, I mentioned that an Internet socket address structure is of type struct sockaddr_in and has four fields:

• sin_family, the domain of the socket, which should be set to AF_INET
• sin_port, the port number, which in this case is 13
• sin_addr, the 32-bit IP number of the client/server
• sin_zero, which is padding and is not set

In creating the client socket, the only tricky part is determining the server's 32-bit IP address. promptForINETAddress () [line 59] gets the host's name from the user and then invokes nameToAddr () [line 80] to convert it into an IP address. If the user enters a string starting with a digit, inet_addr () is invoked to perform the conversion. It works as shown in Figure 13.64. Note that "network-byte" order is a host-neutral ordering of
bytes in the IP address. This ordering is necessary because regular byte ordering can differ from machine to machine, which would make IP addresses nonportable.

Library Routine: in_addr_t inet_addr (const char* string)

inet_addr () returns the 32-bit IP address that corresponds to the A.B.C.D-format string. The IP address is in network-byte order.

FIGURE 13.64
Description of the inet_addr () library routine.

If string doesn't start with a digit, the next step is to see if the string is "s," which means the local host. The name of the local host is obtained by gethostname () [line 97], which works as shown in Figure 13.65. Once the symbolic name of the host is determined, the program can look it up in the network host file, "/etc/hosts." This is performed by gethostbyname () [line 104], which works as shown in Figure 13.66.

System Call: int gethostname (char* name, int nameLen)

gethostname () sets the character array pointed to by name of length nameLen to a null-terminated string equal to the local host's name.

FIGURE 13.65
Description of the gethostname () system call.

Library Routine: struct hostent* gethostbyname (const char* name)

gethostbyname () searches the "/etc/hosts" file and returns a pointer to a hostent structure that describes the file entry associated with the string name.
If name is not found in the "/etc/hosts" file, NULL is returned.

FIGURE 13.66
Description of the gethostbyname () library routine.

The hostent structure has several fields, but the only one we're interested in is a field of type (struct in_addr*) called h_addr. This field contains the host's associated IP number in a subfield called s_addr. Before returning the IP number, the program displays a string description of the IP address by calling inet_ntoa () [line 109]. inet_ntoa () is described in Figure 13.67.

Library Routine: char* inet_ntoa (struct in_addr address)

inet_ntoa () takes a structure of type in_addr as its argument and returns a pointer to a string that describes the address in the format A.B.C.D.

FIGURE 13.67
Description of the inet_ntoa () library routine.

The final 32-bit address is then returned in line 110. Once the IP address inetAddress has been determined, the client's socket address fields are filled in lines 37-40:

37 bzero ((char*)&serverINETAddress, sizeof (serverINETAddress));
38 serverINETAddress.sin_family = AF_INET; /* Use Internet */
39 serverINETAddress.sin_addr.s_addr = inetAddress; /* IP */
40 serverINETAddress.sin_port = htons (DAYTIME_PORT);

bzero (), described in Figure 13.68, clears the socket address structure's contents before its fields are assigned. The bzero () call had its origins in the Berkeley version of UNIX. The System V equivalent is memset (), described in Figure 13.69. Like the IP address, the port number is converted to a network-byte ordering by htons (), which works as shown in Figure 13.70.

Library Routine: void bzero (void* buffer, size_t length)

bzero () fills the array buffer of size length with zeroes (ASCII NULL).

FIGURE 13.68
Description of the bzero () library routine.

Library Routine: void memset (void* buffer, int value, size_t length)

memset () fills the array buffer of size length with the value of value.

FIGURE 13.69
Description of the memset () library routine.
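The address-filling steps described above (convert the A.B.C.D string, clear the structure, set the family, address, and port) can be sketched without any network access. This is only an illustration under stated assumptions: fillInetAddress () is a hypothetical helper name that does not appear in the listing, and it uses memset () where the listing uses bzero ().

```c
#include <arpa/inet.h>   /* inet_addr, inet_ntoa, htons, ntohs */
#include <netinet/in.h>  /* struct sockaddr_in, in_addr_t */
#include <string.h>      /* memset */
#include <sys/socket.h>  /* AF_INET */

/* Hypothetical helper (not part of the book's listing): fill *addr from a
   dotted-quad string and a host-order port number.
   Returns 0 on success, -1 if dotted is not a valid A.B.C.D address. */
int fillInetAddress (struct sockaddr_in* addr, const char* dotted,
                     unsigned short port)
{
  in_addr_t ip = inet_addr (dotted);     /* Result is in network-byte order */

  if (ip == (in_addr_t) -1) return -1;   /* inet_addr () error value */
  memset (addr, 0, sizeof (*addr));      /* Same effect as bzero () */
  addr->sin_family = AF_INET;            /* Use Internet */
  addr->sin_addr.s_addr = ip;            /* No further conversion needed */
  addr->sin_port = htons (port);         /* Host order -> network order */
  return 0;
}
```

A caller could pass "129.110.42.1" and 13 and then hand the structure to connect (). One caveat of relying on the inet_addr () error value is that the broadcast address 255.255.255.255 is indistinguishable from a malformed string.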

Library Routine: in_addr_t htonl (in_addr_t hostLong)
                 in_port_t htons (in_port_t hostShort)
                 in_addr_t ntohl (in_addr_t networkLong)
                 in_port_t ntohs (in_port_t networkShort)

Each of these functions performs a conversion between a host-format number and a network-format number. For example, htonl () returns the network-format equivalent of the host-format unsigned long hostLong, and ntohs () returns the host-format equivalent of the network-format unsigned short networkShort.

FIGURE 13.70
Description of the htonl (), htons (), ntohl (), and ntohs () library routines.

The final step is to create the client socket and attempt the connection. The code for this is almost the same as for AF_UNIX sockets:

42 clientFd = socket (AF_INET, SOCK_STREAM, DEFAULT_PROTOCOL);
43 do /* Loop until a connection is made with the server */
44 {
45 result = connect (clientFd,serverSockAddrPtr,serverLen);
46 if (result == -1) sleep (1); /* Try again in 1 second */
47 }
48 while (result == -1);

The rest of the program contains nothing new. Now it's time to look at how an Internet server is built.

Internet Servers

Constructing an Internet server is actually pretty easy. The sin_family, sin_port, and sin_zero fields of the socket address structure should be filled in as they were in the client example. The only difference is that the s_addr field should be set to the network-byte-ordered value of the constant INADDR_ANY, which means "accept any incoming client requests." The following example of the procedure used to create a server socket address is a slightly modified version of some code taken from the Internet shell program that ends this chapter:

int serverFd; /* Server socket */
struct sockaddr_in serverINETAddress; /* Server Internet address */
struct sockaddr* serverSockAddrPtr; /* Pointer to server address */
struct sockaddr_in clientINETAddress; /* Client Internet address */
struct sockaddr* clientSockAddrPtr; /* Pointer to client address */
int port = 13; /* Set to the port that you wish to serve */
int serverLen; /* Length of address structure */
serverFd = socket (AF_INET, SOCK_STREAM, DEFAULT_PROTOCOL); /* Create */
serverLen = sizeof (serverINETAddress); /* Length of structure */
bzero ((char*) &serverINETAddress, serverLen); /* Clear structure */
serverINETAddress.sin_family = AF_INET; /* Internet domain */
serverINETAddress.sin_addr.s_addr = htonl (INADDR_ANY); /* Accept all */
serverINETAddress.sin_port = htons (port); /* Server port number */

When the address is created, the socket is bound to the address, and its queue size is specified in the usual way.

serverSockAddrPtr = (struct sockaddr*) &serverINETAddress;
bind (serverFd, serverSockAddrPtr, serverLen);
listen (serverFd, 5);

The final step is to accept client connections. When a successful connection is made, the client socket address is filled with the client's IP address and a new file descriptor is returned.

clientLen = sizeof (clientINETAddress);
clientSockAddrPtr = (struct sockaddr*) &clientINETAddress;
clientFd = accept (serverFd, clientSockAddrPtr, &clientLen);

As you can see, an Internet server's code is very similar to that of an AF_UNIX server. The final example in this chapter is the Internet shell.

Shared Memory

Sharing a segment of memory is a straightforward and intuitive method of allowing two processes on the same machine to share data. The process that allocates the shared memory segment gets an ID back from the call, assuming that the creation of the shared memory segment succeeds. Other processes can then use that ID to access the shared memory segment.
Accessing a shared memory segment is the fastest form of IPC, since no data have to be copied or sent anywhere else. However, because there is just one copy of the data, if more than one process is updating the data, the processes must synchronize their actions to prevent corrupting the data.
The following are some of the common system calls utilized to allocate and use shared memory segments in System V-based versions of UNIX:

• shmget (): allocates a shared memory segment and returns the segment ID
• shmat (): attaches a shared memory segment to the virtual address space of the calling process
• shmdt (): detaches an attached segment from the address space
• shmctl (): allows you to modify attributes (e.g., access permissions) associated with the shared memory segment

After a successful call to shmget (), a shared memory segment exists and can be accessed by means of the ID returned in the call. Note that for any other process to use the same segment, it must also know this ID. The ID can be made available to other processes via another IPC mechanism, or a specific ID can be passed to shmget () to force the use of a specific known ID (with the understanding that the call will fail if that ID has already been used with another shared memory segment).
Once you have obtained a valid ID for a shared memory segment, a call to shmat () will return a pointer to the address in the local process' virtual memory space where the shared memory segment has been attached. You can then use that pointer to index into the block of memory just as you would any other block of memory. (Doing this does presume that you know the format of the data contained in the shared memory segment.) If and when you finish using the shared memory, you can release (detach) it with a call to shmdt () (into which you pass the pointer, not the shared memory segment ID). When the last process to have the shared memory segment attached (which is not necessarily the process that created it) detaches it, the space allocated to the segment is released.

Semaphores

A semaphore is not a communication mechanism of the type we've seen with pipes, sockets, and shared memory. No actual data are sent with a semaphore. Rather, a semaphore is a counter that describes the availability of a resource (which could be a shared memory segment).
A semaphore is created and assigned a value that denotes how many concurrent uses of a resource are allowed. Each time a process gets ready to use a certain resource, it checks the semaphore to see whether the resource is available. If the value of the semaphore is greater than zero, then the resource is available. The process allocates "a unit of the resource," and the semaphore is decremented by one. If the value of the semaphore is zero, the process sleeps until the value is greater than zero (until another process has finished its use of the resource).
Semaphores can be used to exclusively lock something by creating a semaphore with a value of one (as soon as one process uses it, the semaphore value becomes zero). This is known as a binary semaphore. Semaphores can also be used to set a maximum number of concurrent uses of a particular resource.
The System V semaphore is a bit more complex than what I've just described. Semaphores are managed as a list or set of semaphores rather than individually. This provides a method of defining multiple semaphores for a complex locking mechanism, but requires unnecessary overhead when you only wanted one. Semaphore-related system calls include the following:

• semget (): creates a set (an array) of semaphores
• semop (): manipulates a set of semaphores
• semctl (): modifies attributes of a set of semaphores

THE INTERNET SHELL

Have you ever wondered what the inside of a shell looks like? Well, here's a great opportunity to learn how they work and to obtain some source code that could help you to create your own shell. I designed the Internet shell to be a lot like the standard UNIX shells, in the sense that it provides piping and background processing facilities, but I also added some Internet-specific capabilities that the other shells lack.

Restrictions

In order to pack the functionality of the Internet shell into a reasonable size, there are a few restrictions:

• All tokens must be separated by white space (tabs or spaces). This means that instead of writing ls; date you must write ls ; date. The upshot is that the lexical analyzer is very simple.
• Filename substitution (globbing) is not supported. This means that the standard *, ?, and [] metacharacters are not understood.

These features are nice to have in an everyday shell, but their implementation wouldn't have taught you anything significant about how shells work.

Command Syntax

The syntax of an Internet shell command is similar to that of the standard UNIX shells. We describe it formally using BNF notation (note that the redirection symbols < and > are escaped by a \ to prevent ambiguity; see the appendix for a discussion of BNF):

<internetShellCommand> = <sequence> [ & ]
<sequence> = <pipeline> { ; <pipeline> }*
<pipeline> = <simple> { | <simple> }*
<simple> = { <token> }* { <redirection> }*
<redirection> = <fileRedirection> | <socketRedirection>
<fileRedirection> = \> <file> | \>\> <file> | \< <file>
<socketRedirection> = <clientRedirection> | <serverRedirection>
<clientRedirection> = @\>c <socket> | @\<c <socket>
<serverRedirection> = @\>s <socket> | @\<s <socket>
<token> = a string of characters
<file> = a valid UNIX pathname
<socket> = either a UNIX pathname (UNIX domain socket) or
           an Internet socket name of the form hostname.port#

Starting the Internet Shell

I named the Internet shell executable file ish. The Internet shell prompt is a question mark.
When ish is started, it inherits the $PATH environment variable from the shell that invokes it. The value of $PATH may be changed by using the setenv built-in command that is described shortly.
To exit the Internet shell, press Control-D on a line of its own.

Built-in Commands

The Internet shell executes most commands by creating a child shell that execs the specified utility while the parent shell waits for the child. However, some commands are built into the shell and are executed directly. Figure 13.71 lists the built-ins. Built-in commands may be redirected. Before I describe the construction and operation of the Internet shell, let's take a look at a few examples of both regular commands and Internet-specific commands.

Built-in             Function
echo {<token>}*      echoes tokens to the terminal
cd path              changes the shell's working directory to path
getenv name          displays the value of the environment variable name
setenv name value    sets the value of the environment variable name to value

FIGURE 13.71
Internet shell built-in commands.

Some Regular Examples

Here are some examples that illustrate the sequencing, redirection, and piping capabilities of the Internet shell:

$ ish                          ...start shell.
Internet Shell.
? ls                           ...simple command.
ish.c   ish.cs   ish.van   who.socket   who.sort
? ls | wc                      ...pipe.
5 5 41
? who | sort > who.sort &      ...pipe + redirect + background.
[4356]                         ...PID of background process.
? cat who.sort                 ...show redirection worked.
glass     ttyp2   May 28 18:33   (bridge05.utdalla)
posey     ttyp0   May 22 10:19   (blackfoot.utdall)
posey     ttyp1   May 22 10:19   (blackfoot.utdall)
? date ; whoami                ...sequence of commands.
Thu Mar 26 18:36:24 CDT 1998
glass
? echo hi there                ...execute a built-in.
hi there
? getenv PATH                  ...look at PATH env variable.
/usr/local/bin:/usr/ucb:/usr/bin:/bin:/usr/etc
? mail glass < who.sort        ...input redirection works too.
? ^D                           ...exit shell.
$ _

Some Internet Examples

The Internet shell becomes pretty interesting when you examine its socket features. Here's an example that uses a UNIX domain socket to communicate information:

$ ish                          ...start the Internet shell.
Internet Shell.
? who @>s who.sck &            ...server sends output to socket "who.sck".
[2678]
? ls                           ...execute a command for fun.
ish.c    ish.van   who.sock     who.sort
ish.cs   who.sck   who.socket
? sort @<c who.sck             ...client reads input from socket "who.sck".
glass     ttyp2   May 28 18:33   (bridge05.utdalla)
posey     ttyp0   May 22 10:19   (blackfoot.utdall)
posey     ttyp1   May 22 10:19   (blackfoot.utdall)
veerasam  ttyp3   May 28 18:39   (129.110.70.139)
? ^D                           ...quit shell.

The really fun stuff happens when you introduce Internet sockets. The first shell in the following example was run on a host called "csservr2," and the second shell was run on a host called "vanguard":

$ ish                          ...run Internet shell on csservr2.
Internet Shell.
? who @>s 5000 &               ...background server sends output to port 5000.
[7221]
? ^D                           ...quit shell.
$ rlogin vanguard              ...login to vanguard host.
% ish                          ...run Internet shell on vanguard.
Internet Shell.
? sort @<c csservr2.5000       ...client reads input from csservr2,
                               ...port 5000.
IP address = 129.110.42.1      ...echoed by Internet shell.
glass     ttyp2   May 28 18:42   (bridge05.utdalla)   ...output from
posey     ttyp0   May 22 10:19   (blackfoot.utdall)   ...who on
posey     ttyp1   May 22 10:19   (blackfoot.utdall)   ...csservr2!
veerasam  ttyp3   May 28 18:39   (129.110.70.139)
? ^D                           ...quit shell.
% ^D                           ...logout from vanguard.
logout
$ _                            ...back to csservr2.

Figure 13.72 is an illustration of the socket connection.

[Diagram: hosts csservr2 and vanguard]
FIGURE 13.72
Internet shell redirection.

The next example is even more interesting. The first shell uses one socket to talk to the second shell, and the second shell uses another socket to talk to the third:

$ ish                                  ...start shell on csservr2.
Internet Shell.
? who @>s 5001 &                       ...background server sends output to port 5001.
[2001]
? ^D                                   ...quit shell.
$ rlogin vanguard                      ...login to vanguard.
% ish                                  ...start shell on vanguard.
Internet Shell.
? sort @<c csservr2.5001 @>s 5002 &    ...background process reads
[3756]                                 ...input from port 5001 on
                                       ...csservr2 and sends it to
                                       ...local port 5002.
IP address = 129.110.42.1              ...echoed by shell.
? ^D                                   ...quit shell.
% ^D                                   ...logout of vanguard.
logout
$ ish                                  ...start another shell on csservr2.
Internet Shell.
? cat @<c vanguard.5002                ...read input from port 5002 on vanguard.
IP address = 129.110.43.128            ...echoed by the shell.
glass     ttyp2   May 28 18:42   (bridge05.utdalla)
posey     ttyp0   May 22 10:19   (blackfoot.utdall)
posey     ttyp1   May 22 10:19   (blackfoot.utdall)
veerasam  ttyp3   May 28 18:39   (129.110.70.139)
? ^D                                   ...quit shell.
$ _

Figure 13.73 is an illustration of the two socket connections.

[Diagram: ports 5001 and 5002]
FIGURE 13.73
More Internet shell redirection.

How It Works

The operation of the Internet shell can be broken down into several main sections:

• the main command loop
• parsing
• executing built-in commands
• executing pipelines
• executing sequences
• background processing
• dealing with signals
• performing file redirection
• performing socket redirection

We next describe each operation, together with fragments of code and diagrams when necessary. Before you continue, I suggest that you glance through the source code listing at the end of the chapter to familiarize yourself with its overall layout.

The Main Command Loop

When the shell starts, it initializes a signal handler to catch keyboard interrupts and resets an error flag. It then enters commandLoop () [line 167], which prompts the user for a line of input, parses the input, and then executes the command. commandLoop () loops until the user enters Control-D, at which point the shell terminates.

Parsing

To check the command line for errors, the line is first broken down into separate tokens by tokenize () [line 321], which is located in the lexical analyzer section of the source code. tokenize () is called by commandLoop () and fills the global tokens array with pointers to each individual token. For example, if the input line was "ls -l," tokens[0] would point to the string "ls," and tokens[1] would point to the string "-l". Once the line is tokenized, the global token pointer tIndex is set to zero [line 350] in preparation for parsing.
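To make the tokenizing step concrete, here is a minimal sketch of a whitespace-only lexical analyzer of the kind the restrictions allow. It is not the shell's own tokenize (): splitTokens () is a hypothetical name, it relies on the standard strtok () routine, and it fills a caller-supplied array rather than a global one. Only the MAX_TOKENS limit of 100 is taken from the listing.

```c
#include <string.h>  /* strtok */

#define MAX_TOKENS 100  /* Same limit as the shell's listing */

/* Hypothetical stand-in for tokenize (): split line in place on spaces,
   tabs, and newlines, storing a pointer to each token in tokens[].
   Returns the number of tokens stored (at most MAX_TOKENS). */
int splitTokens (char* line, char* tokens[])
{
  int count = 0;
  char* token = strtok (line, " \t\n");  /* First token, or NULL if none */

  while (token != NULL && count < MAX_TOKENS)
    {
      tokens[count++] = token;           /* Points into the caller's line */
      token = strtok (NULL, " \t\n");    /* Continue where we left off */
    }
  return count;
}
```

Given the input "ls -l ; date", this sketch would produce the four tokens "ls", "-l", ";", and "date", which is why the shell insists on white space around metacharacters. Because strtok () overwrites the separators in place, the line buffer must be writable and must outlive the tokens array.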
Parsing is performed in a top-down fashion. The main parser, parseSequence () [line 194], is called from the commandLoop () function. parseSequence () parses each pipeline in the sequence by invoking parsePipeline () and records the information that parsePipeline () returns. Finally, it checks to see whether the sequence is to be executed in the background.
Similarly, parsePipeline () [line 222] parses each simple command in the pipeline by calling parseSimple () and records the information that parseSimple () returns. parseSimple () [line 242] records the tokens in the simple command and then processes any trailing metacharacters, such as >, >>, and @>s.
The information that each of these parsing functions gathers is stored in structures for later use by the execution routines. A struct sequence [line 75] can hold the details of up to five (MAX_PIPES) pipelines, together with a flag indicating whether or not the sequence is to be executed in the background. Each pipeline is recorded in a struct pipeline [line 67], which can record the details of up to five (MAX_SIMPLE) simple commands. Finally, a struct simple [line 52] can hold up to 100 (MAX_TOKENS) tokens, together with several fields that record information related to I/O redirection. If a command is parsed with no errors, the local variable sequence [line 182] is equal to a struct sequence, which holds the analyzed version of the command.
Note that although I could have used pointers to return structures more efficiently, I chose to keep the program as simple as I could in order to focus on its UNIX-specific aspects.

Executing a Command Sequence

The main command loop executes a successfully parsed command by invoking executeSequence () [line 444]. This routine does one of two things:

• If the command is to be executed in the background, it creates a child process to execute the pipelines in sequence; the original parent shell does not wait for the child. Before executing the pipeline, the child restores its original interrupt handler and places itself into a new process group to make it immune from hang-ups and other signals. This ensures that a background process will continue to execute even when the shell is terminated and the user logs out.
• If the command is to be executed in the foreground, the parent shell executes the pipelines in sequence.

In both cases, executePipeline () [line 472] is used to execute each pipeline component of the command sequence.

Executing Pipelines

executePipeline () performs one of two actions:

• If the pipeline is a simple built-in command, it executes the simple command directly, without creating a child process. This is very important. For example, the built-in command cd executes chdir () to change the shell's current working directory. If a child shell were created to execute this built-in command, the original parent shell's working directory would be unaffected, which would be incorrect.
• If the pipeline is more than a simple built-in command, executePipeline () creates a child shell to execute the pipeline; the original parent shell waits for the child to complete its processing. Notice that the parent waits for a specific PID by calling waitForPid () [line 503]. This is because the parent shell might have created some previous children to execute background processes, and it would be incorrect for the parent to resume when one of these background processes terminated. If the pipeline contains only one simple command, then no pipes need to be created, and executeSimple () [line 569] is invoked. Otherwise, executePipes () [line 516], which connects each command with its own pipe, is invoked.

executePipes () is a fairly complicated routine. If the pipeline contains n simple commands, then executePipes () creates n child processes, one for each command, and n-1 pipes to connect the children. Each child reconnects its standard input or output channels to the appropriate pipe and then closes all of the original pipe file descriptors. Each child then executes its associated simple command. Meanwhile, the original process that invoked executePipes () waits for all of its children to terminate.

Executing a Simple Command

executeSimple () redirects the standard input or output channels as necessary and then executes either executeBuiltIn () [line 635] or executePrimitive () [line 596], depending on the category of the command. builtIn () [line 624] returns true if a token is the name of a built-in command. If the command is a built-in, it's possible that it is being executed directly by the shell. To prevent the shell's I/O channels from being altered by redirection, the original standard input and output channels are recorded for later restoration.
executePrimitive () [line 596] simply executes the command using execvp (). Fortunately (but not coincidentally), p->token is already in the form required by execvp (). Built-in functions are executed by executeBuiltIn () using a simple switch statement.

Redirection

redirect () [line 761] performs all of the preprocessing necessary for both file and socket redirection. The basic technique for redirecting the standard I/O channels is the same as the one I described earlier in the chapter. If file redirection is required, dupFd () [line 806] is invoked to create the file with the appropriate mode and to duplicate the standard file descriptor. If socket redirection is required, either server () [line 950] or client () [line 879] is invoked to create the appropriate type of socket connection. These functions manipulate both UNIX-domain and Internet-domain sockets the same way the earlier socket examples did.

Extensions

I think that it could be a lot of fun and fairly educational to add some new features to the Internet shell. If you're interested, see the "Projects" at the end of the chapter for some suggestions.
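As a companion to the description of executePipes (), the following sketch shows the n = 2 case: two children, one pipe, each child reconnecting one standard channel and closing the original pipe descriptors before calling execvp (). It is a simplified illustration rather than the shell's code; runTwoStagePipe () is a hypothetical name, and the use of dup2 () and waitpid () is my assumption about a reasonable modern equivalent of the technique.

```c
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>     /* pipe, fork, dup2, close, execvp, _exit */

/* Hypothetical helper, not the book's executePipes (): run "left | right".
   Returns the exit status of right, or -1 if the pipeline can't be built. */
int runTwoStagePipe (char* left[], char* right[])
{
  int fd[2];
  int status;
  pid_t leftPid, rightPid;

  if (pipe (fd) == -1) return -1;

  leftPid = fork ();
  if (leftPid == 0) /* First child: writer */
    {
      dup2 (fd[1], STDOUT_FILENO);  /* Standard output now feeds the pipe */
      close (fd[0]); close (fd[1]); /* Close original pipe descriptors */
      execvp (left[0], left);
      _exit (127);                  /* Only reached if execvp () fails */
    }

  rightPid = (leftPid == -1) ? -1 : fork ();
  if (rightPid == 0) /* Second child: reader */
    {
      dup2 (fd[0], STDIN_FILENO);   /* Standard input now drains the pipe */
      close (fd[0]); close (fd[1]);
      execvp (right[0], right);
      _exit (127);
    }

  /* Parent: close both ends, or the reader never sees end-of-input */
  close (fd[0]); close (fd[1]);
  if (leftPid != -1) waitpid (leftPid, NULL, 0);
  if (rightPid == -1 || waitpid (rightPid, &status, 0) == -1) return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

Each argument is a NULL-terminated vector in the form execvp () expects, for example { "who", NULL } and { "sort", NULL }. Extending the sketch to n commands means creating n-1 pipes and having every child close all of them after its dup2 () calls, as the text describes.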
Internet Shell Source Code Listing

1  #include <stdio.h>
2  #include <stdlib.h>
3  #include <string.h>
4  #include <signal.h>
5  #include <ctype.h>
6  #include <sys/types.h>
7  #include <fcntl.h>
8  #include <sys/ioctl.h>
9  #include <sys/socket.h>
10  #include <sys/un.h>
11  #include <netinet/in.h>
12  #include <arpa/inet.h>
13  #include <netdb.h>
14
15
16  /* Macros */
17  #define MAX_STRING_LENGTH 200
18  #define MAX_TOKENS 100
19  #define MAX_TOKEN_LENGTH 30
20  #define MAX_SIMPLE 5
21  #define MAX_PIPES 5
22  #define NOT_FOUND -1
23  #define REGULAR -1
24  #define DEFAULT_PERMISSION 0660
25  #define DEFAULT_PROTOCOL 0
26  #define DEFAULT_QUEUE_LENGTH 5
27  #define SOCKET_SLEEP 1
28
29
30  /* Enumerators */
31  enum { FALSE, TRUE };
32  enum metacharacterEnum
33    {
34      SEMICOLON, BACKGROUND, END_OF_LINE, REDIRECT_OUTPUT,
35      REDIRECT_INPUT, APPEND_OUTPUT, PIPE,
36      REDIRECT_OUTPUT_SERVER, REDIRECT_OUTPUT_CLIENT,
37      REDIRECT_INPUT_SERVER, REDIRECT_INPUT_CLIENT
38    };
39  enum builtInEnum { ECHO_BUILTIN, SETENV, GETENV, CD };
40  enum descriptorEnum { STDIN, STDOUT, STDERR };
41  enum pipeEnum { READ, WRITE };
42  enum IOEnum
43    {
44      NO_REDIRECT, FILE_REDIRECT,
45      SERVER_REDIRECT, CLIENT_REDIRECT
46    };
47  enum socketEnum { CLIENT, SERVER };
48  enum { INPUT_SOCKET, OUTPUT_SOCKET };
49
50
51  /* Every simple command has one of these associated with it */
52  struct simple
53    {
54      char* token [MAX_TOKENS]; /* The tokens of the command */
55      int count; /* The number of tokens */
56      int outputRedirect; /* Set to an IOEnum */
57      int inputRedirect; /* Set to an IOEnum */
58      int append; /* Set to true for append mode */
59      char* outputFile; /* Name of output file or NULL if none */
60      char* inputFile; /* Name of input file or NULL if none */
61      char* outputSocket; /* Output socket name or NULL if none */
62      char* inputSocket; /* Name of input socket or NULL if none */
63    };
64
65
66  /* Every pipeline has one of these associated with it */
67  struct pipeline
68    {
69      struct simple simple [MAX_SIMPLE]; /* Commands in pipe */
70      int count; /* The number of simple commands */
71    };
72
73
74  /* Every command sequence has one of these associated with it */
75  struct sequence
76    {
77      struct pipeline pipeline [MAX_PIPES]; /* Pipes in sequence */
78      int count; /* The number of pipes */
79      int background; /* True if this is a background sequence */
80    };
81
82
83  /* Prototypes */
84  struct sequence parseSequence ();
85  struct pipeline parsePipeline ();
86  struct simple parseSimple ();
87  char* nextToken ();
88  char* peekToken ();
89  char* lastToken ();
90  char* getToken ();
91
92
93  /* Globals */
94  char* metacharacters [] = { ";", "&", "\n", ">", "<", ">>",
95    "|", "@>s", "@>c", "@<s", "@<c", "" };
96  char* builtIns [] = { "echo", "setenv", "getenv", "cd", "" };
97  char line [MAX_STRING_LENGTH]; /* The current line */
98  char tokens [MAX_TOKENS] [MAX_TOKEN_LENGTH]; /* Tokens in line */
99  int tokenCount; /* The number of tokens in the current line */
100  int tIndex; /* Index into line: used by lexical analyzer */
101  int errorFlag; /* Set to true when an error occurs */
102
103
104  /* Some forward declarations */
105  void (*originalQuitHandler) ();
106  void quitHandler ();
107
108
109  /* Externals */
110  char **environ; /* Pointer to the environment */
111
112  /****************************************************************/
113
114  main (argc, argv)
115
116  int argc;
117  char* argv [];
118
119  {
120    initialize (); /* Initialize some globals */
121    commandLoop (); /* Accept and process commands */
122    return (/* EXIT_SUCCESS */ 0);
123  }
124
125  /****************************************************************/
126
127  initialize ()
128
129  {
130    printf ("Internet Shell.\n"); /* Introduction */
131    /* Set the Control-C handler to catch keyboard interrupts */
132    originalQuitHandler = signal (SIGINT, quitHandler);
133  }
134
135  /****************************************************************/
136
137  void quitHandler ()
138
139  {
140    /* Control-C handler */
141    printf ("\n");
142    displayPrompt ();
143  }
144
145  /****************************************************************/
146
147  error (str)
148
149  char* str;
150
151  {
152    /* Display str as an error to the standard error channel */
153    fprintf (stderr, "%s", str);
154    errorFlag = TRUE; /* Set error flag */
155  }
156
157  /****************************************************************/
158
159  displayPrompt ()
160
161  {
162    printf ("? ");
163  }
164
165  /****************************************************************/
166
167  commandLoop ()
168
169  {
170    struct sequence sequence;
171
172    /* Accept and process commands until a Control-D occurs */
173    while (TRUE)
174      {
175        displayPrompt ();
176        if (gets (line) == NULL) break; /* Get a line of input */
177        tokenize (); /* Break the input line into tokens */
178        errorFlag = FALSE; /* Reset the error flag */
179
180        if (tokenCount > 1) /* Process any non-empty line */
181          {
182            sequence = parseSequence (); /* Parse the line */
183            /* If no errors occurred during the parsing, */
184            /* execute the command */
185            if (!errorFlag) executeSequence (&sequence);
186          }
187      }
188  }
189
190  /****************************************************************/
191  /* PARSER ROUTINES */
192  /****************************************************************/
193
194  struct sequence parseSequence ()
195
196  {
197    struct sequence q;
198
199    /* Parse a command sequence and return structure description */
200    q.count = 0; /* Number of pipes in the sequence */
201    q.background = FALSE; /* Default is not in background */
202
203    while (TRUE) /* Loop until no semicolon delimiter is found */
204      {
205        q.pipeline[q.count++] = parsePipeline (); /* Parse */
206        if (peekCode () != SEMICOLON) break;
207        nextToken (); /* Flush semicolon delimiter */
208      }
209
210    if (peekCode () == BACKGROUND) /* Sequence is in background */
211      {
212        q.background = TRUE;
213        nextToken (); /* Flush ampersand */
214      }
215
216    getToken (END_OF_LINE); /* Check end-of-line is reached */
217    return (q);
218  }
219
220  /****************************************************************/
221
222  struct pipeline parsePipeline ()
223
224  {
225    struct pipeline p;
226
227    /* Parse a pipeline and return a structure description of it */
228    p.count = 0; /* The number of simple commands in the pipeline */
229
230    while (TRUE) /* Loop until no pipe delimiter is found */
231      {
232        p.simple[p.count++] = parseSimple (); /* Parse command */
233        if (peekCode () != PIPE) break;
234        nextToken (); /* Flush pipe delimiter */
235      }
236
237    return (p);
238  }
239
240  /****************************************************************/
241
242  struct simple parseSimple ()
243
244  {
245    struct simple s;
246    int code;
247    int done;
248
249    /* Parse a simple command and return a structure description */
250    s.count = 0; /* The number of tokens in the simple command */
251    s.outputFile = s.inputFile = NULL;
252    s.inputSocket = s.outputSocket = NULL;
253    s.outputRedirect = s.inputRedirect = NO_REDIRECT; /* Defaults */
254    s.append = FALSE;
255
256    while (peekCode () == REGULAR) /* Store all regular tokens */
257      s.token[s.count++] = nextToken ();
258
259    s.token[s.count] = NULL; /* NULL-terminate token list */
260    done = FALSE;
261
262    /* Parse special metacharacters that follow, like > and >> */
263    do
264      {
265        code = peekCode (); /* Peek at next token */
266
267        switch (code)
268          {
269            case REDIRECT_INPUT: /* < */
270              nextToken ();
271              s.inputFile = getToken (REGULAR);
272              s.inputRedirect = FILE_REDIRECT;
273              break;
274
275            case REDIRECT_OUTPUT: /* > */
276            case APPEND_OUTPUT: /* >> */
277              nextToken ();
278              s.outputFile = getToken (REGULAR);
279              s.outputRedirect = FILE_REDIRECT;
280              s.append = (code == APPEND_OUTPUT);
281              break;
282
283            case REDIRECT_OUTPUT_SERVER: /* @>s */
284              nextToken ();
285              s.outputSocket = getToken (REGULAR);
286              s.outputRedirect = SERVER_REDIRECT;
287              break;
288
289            case REDIRECT_OUTPUT_CLIENT: /* @>c */
290              nextToken ();
291              s.outputSocket = getToken (REGULAR);
292              s.outputRedirect = CLIENT_REDIRECT;
293              break;
294
295            case REDIRECT_INPUT_SERVER: /* @<s */
296              nextToken ();
297              s.inputSocket = getToken (REGULAR);
298              s.inputRedirect = SERVER_REDIRECT;
299              break;
300
301            case REDIRECT_INPUT_CLIENT: /* @<c */
302              nextToken ();
303              s.inputSocket = getToken (REGULAR);
304              s.inputRedirect = CLIENT_REDIRECT;
305              break;
306
307            default:
308              done = TRUE;
309              break;
310          }
311      }
312    while (!done);
313
314    return (s);
315  }
316
317  /****************************************************************/
318  /* LEXICAL ANALYZER ROUTINES */
319  /****************************************************************/
320
321  tokenize ()
322
323  {
324    char* ptr = line; /* Point to the input buffer */
325    char token [MAX_TOKEN_LENGTH]; /* Holds the current token */
326    char* tptr; /* Pointer to current character */
327
328    tIndex = 0; /* Global: points to the current token */
329
330    /* Break the current line of input into tokens */
331    while (TRUE)
332      {
333        tptr = token;
334        while (*ptr == ' ') ++ptr; /* Skip leading spaces */
335        if (*ptr == NULL) break; /* End of line */
336
337        do
338          {
339            *tptr++ = *ptr++;
340          }
341        while (*ptr != ' ' && *ptr != NULL);
342
343        *tptr = NULL;
344        strcpy (tokens[tIndex++], token); /* Store the token */
345      }
346
347    /* Place an end-of-line token at the end of the token list */
348    strcpy (tokens[tIndex++], "\n");
349    tokenCount = tIndex; /* Remember total token count */
350    tIndex = 0; /* Reset token index to start of token list */
351  }
352
353  /****************************************************************/
354
355  char* nextToken ()
356
357  {
358    return (tokens[tIndex++]); /* Return next token in list */
359  }
360
361  /****************************************************************/
362
363  char* lastToken ()
364
365  {
366    return (tokens[tIndex - 1]); /* Return previous token in list */
367  }
368
369  /****************************************************************/
370
371  peekCode ()
372
373  {
374    /* Return a peek at code of the next token in the list */
375    return (tokenCode (peekToken ()));
376  }
377
378  /****************************************************************/
379
380  char* peekToken ()
381
382  {
383    /* Return a peek at the next token in the list */
384    return (tokens[tIndex]);
385  }
386
387  /****************************************************************/
388
389  char* getToken (code)
390
391  int code;
392
393  {
394    char str [MAX_STRING_LENGTH];
395
396    /* Generate error if the code of the next token is not code */
397    /* Otherwise return the token */
398    if (peekCode () != code)
399      {
400        sprintf (str, "Expected %s\n", metacharacters[code]);
401        error (str);
402        return (NULL);
403      }
404    else
405      return (nextToken ());
406  }
407
408  /****************************************************************/
409
410  tokenCode (token)
411
412  char* token;
413
414  {
415    /* Return the index of token in the metacharacter array */
416    return (findString (metacharacters, token));
417  }
418
419  /****************************************************************/
420
421  findString (strs, str)
422
423  char* strs [];
424  char* str;
425
426  {
427    int i = 0;
428
429    /* Return the index of str in the string array strs */
430    /* or NOT_FOUND if it isn't there */
431    while (strcmp (strs[i], "") != 0)
432      if (strcmp (strs[i], str) == 0)
433        return (i);
434      else
435        ++i;
436
437    return (NOT_FOUND); /* Not found */
438  }
439
440  /****************************************************************/
441  /* COMMAND EXECUTION ROUTINES */
442  /****************************************************************/
443
444  executeSequence (p)
445
446  struct sequence* p;
447
448  {
449    int i, result;
450
451    /* Execute a sequence of statements (possibly just one) */
452    if (p->background) /* Execute in background */
453      {
454        if (fork () == 0)
455          {
456            printf ("[%d]\n", getpid ()); /* Display child PID */
457            /* Child process */
458            signal (SIGQUIT, originalQuitHandler); /* Old handler */
459            setpgid (0, getpid ()); /* Change process group */
460            for (i = 0; i < p->count; i++) /* Execute pipelines */
461              executePipeline (&p->pipeline[i]);
462            exit (/* EXIT_SUCCESS */ 0);
463          }
464      }
465    else /* Execute in foreground */
466      for (i = 0; i < p->count; i++) /* Execute each pipeline */
467        executePipeline (&p->pipeline[i]);
468  }
469
470  /****************************************************************/
471
472  executePipeline (p)
473
474  struct pipeline* p;
475
476  {
477    int pid, processGroup, result;
478
479    /* Execute every simple command in pipeline (possibly one) */
480    if (p->count == 1 && builtIn (p->simple[0].token[0]))
481      executeSimple (&p->simple[0]); /* Execute it directly */
482    else
483      {
484        if ((pid = fork ()) == 0)
485          {
486            /* Child shell executes the simple commands */
487            if (p->count == 1)
488              executeSimple (&p->simple[0]); /* Execute command */
489            else
490              executePipes (p); /* Execute more than one command */
491            exit (/* EXIT_SUCCESS */ 0);
492          }
493        else
494          {
495            /* Parent shell waits for child to complete */
496            waitForPID (pid);
497          }
498      }
499  }
500
501  /****************************************************************/
502
503  waitForPID (pid)
504
505  int pid;
506
507  {
508    int status;
509
510    /* Return when the child process with PID pid terminates */
511    while (wait (&status) != pid);
512  }
513
514  /****************************************************************/
515
516  executePipes (p)
517
518  struct pipeline* p;
519
520  {
521    int pipes, status, i;
522    int pipefd [MAX_PIPES] [2];
523
524    /* Execute two or more simple commands connected by pipes */
525    pipes = p->count - 1; /* Number of pipes to build */
526    for (i = 0; i < pipes; i++) /* Build the pipes */
527      pipe (pipefd[i]);
528    for (i = 0; i < p->count; i++) /* Build one process per pipe */
529      {
530        if (fork () != 0) continue;
531        /* Child shell */
532        /* First, connect stdin to pipe if not the first command */
533        if (i != 0) dup2 (pipefd[i-1] [READ], STDIN);
534        /* Second, connect stdout to pipe if not the last command */
535        if (i != p->count - 1) dup2 (pipefd[i] [WRITE], STDOUT);
536        /* Third, close all of the pipes' file descriptors */
537        closeAllPipes (pipefd, pipes);
538        /* Last, execute the simple command */
539        executeSimple (&p->simple[i]);
540        exit (/* EXIT_SUCCESS */ 0);
541      }
542
543    /* The parent shell comes here after forking the children */
544    closeAllPipes (pipefd, pipes);
545    for (i = 0; i < p->count; i++) /* Wait for children to finish */
546      wait (&status);
547  }
548
549  /****************************************************************/
550
551  closeAllPipes (pipefd, pipes)
552
553  int pipefd [] [2];
554  int pipes;
555
556  {
557    int i;
558
559    /* Close every pipe's file descriptors */
560    for (i = 0; i < pipes; i++)
561      {
562        close (pipefd[i] [READ]);
563        close (pipefd[i] [WRITE]);
564      }
565  }
566
567  /****************************************************************/
568
569  executeSimple (p)
570
571  struct simple* p;
572
573  {
574    int copyStdin, copyStdout;
575
576    /* Execute a simple command */
577    if (builtIn (p->token[0])) /* Built-in */
578      {
579        /* The parent shell is executing this, so remember */
580        /* stdin and stdout in case of built-in redirection */
581        copyStdin = dup (STDIN);
582        copyStdout = dup (STDOUT);
583        if (redirect (p)) executeBuiltIn (p); /* Execute built-in */
584        /* Restore stdin and stdout */
585        dup2 (copyStdin, STDIN);
586        dup2 (copyStdout, STDOUT);
587        close (copyStdin);
588        close (copyStdout);
589      }
590    else if (redirect (p)) /* Redirect if necessary */
591      executePrimitive (p); /* Execute primitive command */
592  }
593
594  /****************************************************************/
595
596  executePrimitive (p)
597
598  struct simple* p;
599
600  {
601    /* Execute a command by exec'ing */
602    if (execvp (p->token[0], p->token) == -1)
603      {
604        perror ("ish");
605        exit (/* EXIT_FAILURE */ 1);
606      }
607  }
608
609  /****************************************************************/
610  /* BUILT-IN COMMANDS */
611  /****************************************************************/
612
613  builtInCode (token)
614
615  char* token;
616
617  {
618    /* Return the index of token in the builtIns array */
619    return (findString (builtIns, token));
620  }
621
622  /****************************************************************/
623
624  builtIn (token)
625
626  char* token;
627
628  {
629    /* Return true if token is a built-in */
630    return (builtInCode (token) != NOT_FOUND);
631  }
632
633  /****************************************************************/
634
635  executeBuiltIn (p)
636
637  struct simple* p;
638
639  {
640    /* Execute a single built-in command */
641    switch (builtInCode (p->token[0]))
642      {
643        case CD:
644          executeCd (p);
645          break;
646
647        case ECHO_BUILTIN:
648          executeEcho (p);
649          break;
650
651        case GETENV:
652          executeGetenv (p);
653          break;
654
655        case SETENV:
656          executeSetenv (p);
657          break;
658      }
659  }
660
661  /****************************************************************/
662
663  executeEcho (p)
664
665  struct simple* p;
666
667  {
668    int i;
669
670    /* Echo the tokens in this command */
671    for (i = 1; i < p->count; i++)
672      printf ("%s ", p->token[i]);
673
674    printf ("\n");
675  }
676
677  /****************************************************************/
678
679  executeGetenv (p)
680
681  struct simple* p;
682
683  {
684    char* value;
685
686    /* Echo the value of an environment variable */
687    if (p->count != 2)
688      {
689        error ("Usage: getenv variable\n");
690        return;
691      }
692
693    value = getenv (p->token[1]);
694
695    if (value == NULL)
696      printf ("Environment variable is not currently set\n");
697    else
698      printf ("%s\n", value);
699  }
700
701  /****************************************************************/
702
703  executeSetenv (p)
704
705  struct simple* p;
706
707  {
708    /* Set the value of an environment variable */
709    if (p->count != 3)
710      error ("Usage: setenv variable value\n");
711    else
712      setenv (p->token[1], p->token[2]);
713  }
714
715  /****************************************************************/
716
717  setenv (envName, newValue)
718
719  char* envName;
720  char* newValue;
721
722  {
723    int i = 0;
724    char newStr [MAX_STRING_LENGTH];
725    int len;
726
727    /* Set the environment variable envName to newValue */
728    sprintf (newStr, "%s=%s", envName, newValue);
729    len = strlen (envName) + 1;
730
731    while (environ[i] != NULL)
732      {
733        if (strncmp (environ[i], newStr, len) == 0) break;
734        ++i;
735      }
736
737    if (environ[i] == NULL) environ[i+1] = NULL;
738
739    environ[i] = (char*) malloc (strlen (newStr) + 1);
740    strcpy (environ[i], newStr);
741  }
742
743  /****************************************************************/
744
745  executeCd (p)
746
747  struct simple* p;
748
749  {
750    /* Change directory */
751    if (p->count != 2)
752      error ("Usage: cd path\n");
753    else if (chdir (p->token[1]) == -1)
754      perror ("ish");
755  }
756
757  /****************************************************************/
758  /* REDIRECTION */
759  /****************************************************************/
760
761  redirect (p)
762
763  struct simple* p;
764
765  {
766    int mask;
767
768    /* Perform input redirection */
769    switch (p->inputRedirect)
770      {
771        case FILE_REDIRECT: /* Redirect from a file */
772          if (!dupFd (p->inputFile, O_RDONLY, STDIN)) return (FALSE);
773          break;
774
775        case SERVER_REDIRECT: /* Redirect from a server socket */
776          if (!server (p->inputSocket, INPUT_SOCKET)) return (FALSE);
777          break;
778
779        case CLIENT_REDIRECT: /* Redirect from a client socket */
780          if (!client (p->inputSocket, INPUT_SOCKET)) return (FALSE);
781          break;
782      }
783
784    /* Perform output redirection */
785    switch (p->outputRedirect)
786      {
787        case FILE_REDIRECT: /* Redirect to a file */
788          mask = O_CREAT | O_WRONLY | (p->append ? O_APPEND : O_TRUNC);
789          if (!dupFd (p->outputFile, mask, STDOUT)) return (FALSE);
790          break;
791
792        case SERVER_REDIRECT: /* Redirect to a server socket */
793          if (!server (p->outputSocket, OUTPUT_SOCKET)) return (FALSE);
794          break;
795
796        case CLIENT_REDIRECT: /* Redirect to a client socket */
797          if (!client (p->outputSocket, OUTPUT_SOCKET)) return (FALSE);
798          break;
799      }
800    /* If I got here, then everything went OK */
801    return (TRUE);
802  }
803
804  /****************************************************************/
805
806  dupFd (name, mask, stdFd)
807
808  char* name;
809  int mask, stdFd;
810
811  {
812    int fd;
813
814    /* Duplicate a new file descriptor over stdin/stdout */
815    fd = open (name, mask, DEFAULT_PERMISSION);
816
817    if (fd == -1)
818      {
819        error ("Cannot redirect\n");
820        return (FALSE);
821      }
822
823    dup2 (fd, stdFd); /* Copy over standard file descriptor */
824    close (fd); /* Close other one */
825    return (TRUE);
826  }
827
828  /****************************************************************/
829  /* SOCKET MANAGEMENT */
830  /****************************************************************/
831
832  internetAddress (name)
833
834  char* name;
835
836  {
837    /* If name contains a digit, assume it's an Internet address */
838    return (strpbrk (name, "01234567890") != NULL);
839  }
840
841  /****************************************************************/
842
843  socketRedirect (type)
844
845  int type;
846
847  {
848    return (type == SERVER_REDIRECT || type == CLIENT_REDIRECT);
849  }
850
851  /****************************************************************/
852
853  getHostAndPort (str, name, port)
854
855  char *str, *name;
856  int* port;
857
858  {
859    char *tok1, *tok2;
860
861    /* Decode name and port number from input string of the form */
862    /* NAME.PORT */
863    tok1 = strtok (str, ".");
864    tok2 = strtok (NULL, ".");
865    if (tok2 == NULL) /* Name missing, so assume local host */
866      {
867        strcpy (name, "");
868        sscanf (tok1, "%d", port);
869      }
870    else
871      {
872        strcpy (name, tok1);
873        sscanf (tok2, "%d", port);
874      }
875  }
876
877  /****************************************************************/
878
879  client (name, type)
880
881  char* name;
882  int type;
883
884  {
885    int clientFd, result, internet, domain, serverLen, port;
886    char hostName [100];
887    struct sockaddr_un serverUNIXAddress;
888    struct sockaddr_in serverINETAddress;
889    struct sockaddr* serverSockAddrPtr;
890    struct hostent* hostStruct;
891    struct in_addr* hostNode;
892
893    /* Open a client socket with specified name and type */
894    internet = internetAddress (name); /* Internet socket? */
895    domain = internet ? AF_INET : AF_UNIX; /* Pick domain */
896    /* Create client socket */
897    clientFd = socket (domain, SOCK_STREAM, DEFAULT_PROTOCOL);
898
899    if (clientFd == -1)
900      {
901        perror ("ish");
902        return (FALSE);
903      }
904
905    if (internet) /* Internet socket */
906      {
907        getHostAndPort (name, hostName, &port); /* Get name, port */
908        if (hostName[0] == NULL) gethostname (hostName, 100);
909        serverINETAddress.sin_family = AF_INET; /* Internet */
910        hostStruct = gethostbyname (hostName); /* Find host */
911
912        if (hostStruct == NULL)
913          {
914            perror ("ish");
915            return (FALSE);
916          }
917
918        hostNode = (struct in_addr*) hostStruct->h_addr;
919        printf ("IP address = %s\n", inet_ntoa (*hostNode));
920        serverINETAddress.sin_addr = *hostNode; /* Set IP address */
921        serverINETAddress.sin_port = port; /* Set port */
922        serverSockAddrPtr = (struct sockaddr*) &serverINETAddress;
923        serverLen = sizeof (serverINETAddress);
924      }
925    else /* UNIX domain socket */
926      {
927        serverUNIXAddress.sun_family = AF_UNIX; /* Domain */
928        strcpy (serverUNIXAddress.sun_path, name); /* File name */
929        serverSockAddrPtr = (struct sockaddr*) &serverUNIXAddress;
930        serverLen = sizeof (serverUNIXAddress);
931      }
932
933    do /* Connect to server */
934      {
935        result = connect (clientFd, serverSockAddrPtr, serverLen);
936        if (result == -1) sleep (SOCKET_SLEEP); /* Try again soon */
937      }
938    while (result == -1);
939
940    /* Perform redirection */
941    if (type == OUTPUT_SOCKET) dup2 (clientFd, STDOUT);
942    if (type == INPUT_SOCKET) dup2 (clientFd, STDIN);
943    close (clientFd); /* Close original client file descriptor */
944
945    return (TRUE);
946  }
947
948  /****************************************************************/
949
950  server (name, type)
951
952  char* name;
953  int type;
954
955  {
956    int serverFd, clientFd, serverLen, clientLen;
957    int domain, internet, port;
958    struct sockaddr_un serverUNIXAddress;
959    struct sockaddr_un clientUNIXAddress;
960    struct sockaddr_in serverINETAddress;
961    struct sockaddr_in clientINETAddress;
962    struct sockaddr* serverSockAddrPtr;
963    struct sockaddr* clientSockAddrPtr;
964
965    /* Prepare a server socket */
966    internet = internetAddress (name); /* Internet? */
967    domain = internet ? AF_INET : AF_UNIX; /* Pick domain */
968    /* Create the server socket */
969    serverFd = socket (domain, SOCK_STREAM, DEFAULT_PROTOCOL);
970
971    if (serverFd == -1)
972      {
973        perror ("ish");
974        return (FALSE);
975      }
976
977    if (internet) /* Internet socket */
978      {
979        sscanf (name, "%d", &port); /* Get port number */
980        /* Fill in server socket address fields */
981        serverLen = sizeof (serverINETAddress);
982        bzero ((char*) &serverINETAddress, serverLen);
983        serverINETAddress.sin_family = AF_INET; /* Domain */
984        serverINETAddress.sin_addr.s_addr = htonl (INADDR_ANY);
985        serverINETAddress.sin_port = htons (port); /* Port */
986        serverSockAddrPtr = (struct sockaddr*) &serverINETAddress;
987      }
988    else /* UNIX domain socket */
989      {
990        serverUNIXAddress.sun_family = AF_UNIX; /* Domain */
991        strcpy (serverUNIXAddress.sun_path, name); /* Filename */
992        serverSockAddrPtr = (struct sockaddr*) &serverUNIXAddress;
993        serverLen = sizeof (serverUNIXAddress);
994        unlink (name); /* Delete socket if it already exists */
995      }
996
997    /* Bind to socket address */
998    if (bind (serverFd, serverSockAddrPtr, serverLen) == -1)
999      {
1000        perror ("ish");
1001        return (FALSE);
1002      }
1003
1004    /* Set max pending connection queue length */
1005    if (listen (serverFd, DEFAULT_QUEUE_LENGTH) == -1)
1006      {
1007        perror ("ish");
1008        return (FALSE);
1009      }
1010
1011    if (internet) /* Internet socket */
1012      {
1013        clientLen = sizeof (clientINETAddress);
1014        clientSockAddrPtr = (struct sockaddr*) &clientINETAddress;
1015      }
1016    else /* UNIX domain socket */
1017      {
1018        clientLen = sizeof (clientUNIXAddress);
1019        clientSockAddrPtr = (struct sockaddr*) &clientUNIXAddress;
1020      }
1021
1022    /* Accept a connection */
1023    clientFd = accept (serverFd, clientSockAddrPtr, &clientLen);
1024
1025    close (serverFd); /* Close original server socket */
1026
1027    if (clientFd == -1)
1028      {
1029        perror ("ish");
1030        return (FALSE);
1031      }
1032
1033    /* Perform redirection */
1034    if (type == OUTPUT_SOCKET) dup2 (clientFd, STDOUT);
1035    if (type == INPUT_SOCKET) dup2 (clientFd, STDIN);
1036    close (clientFd); /* Close original client socket */
1037
1038    return (TRUE);
1039  }
1040

CHAPTER REVIEW

Checklist
In this chapter, I described
• all of the common file management system calls
• the system calls for duplicating, terminating and differentiating processes
• how a parent may wait for its children
• the terms orphan and zombie
• threaded processes
• how signals may be trapped and ignored
• the way to kill processes
• how processes may be suspended and resumed
• IPC mechanisms: unnamed pipes, named pipes, shared memory, and semaphores
• the client-server paradigm
• UNIX domain and Internet domain sockets
• the design and operation of an Internet shell

Quiz
1. How can you tell when you’ve reached the end of a file?
2. What is a file descriptor?
3. What’s the quickest way to move to the end of a file?
4. Describe the way that shells implement I/O redirection.
5. What is an orphaned process?
6. How is a task run in two processes different from a task run in two threads?
7. Under what circumstances do zombies accumulate?
8. How can a parent find out how its children died?
9. What’s the difference between execv () and execvp ()?
10. Why is the name of the system call kill () a misnomer?
11. How can you protect critical code?
12. What is the purpose of process groups?
13. What happens when a writer tries to overflow a pipe?
14. How can you create a named pipe?
15. Describe the client-server paradigm.
16. Describe the stages that a client and a server go through to establish a connection.

Exercises
13.1 Write a program that catches all signals sent to it and prints out which signal was sent. Then issue a “kill -9” command to the process. How is SIGKILL different from the other signals? [level: easy]
13.2 Write a program that takes a single integer argument n from the command line and creates a binary tree of processes of depth n. When the tree is created, each process should display the phrase “I am process x” and then terminate. The nodes of the process tree should be numbered according to a breadth-first traversal. For example, if the user enters

$ tree 4     ...build a tree of depth 4.

then the process tree would look like this:

[Figure: a binary tree of processes of depth 4, numbered breadth-first from 1 at the root to 8 through 15 at the leaves.]

The output would be

I am process 1
I am process 2
...
I am process 15

Make sure that the original parent process does not terminate until all of its children have died. This is so that you can terminate the parent and its children from your terminal with Control-C. [level: medium]
13.3 Write a program that creates a ring of three processes connected by pipes. The first process should prompt the user for a string and then send it to the second process. The second process should reverse the string and send it to the third process. The third process should convert the string to uppercase and send it back to the first process. When the first process gets the processed string, it should display it to the terminal. When this is done, all three processes should terminate. Here’s an illustration of the process ring:

[Figure: the ring of three processes connected by pipes.]

Here’s an example of the program in action:

$ ring     ...run the program.
Please enter a string: ole
Processed string is: ELO
$ _

[level: medium]
13.4 Rewrite the “ghoul” exercise of Chapter 5, using the C language. [level: medium]
13.5 Write a program that uses setuid () to allow a user to access a file that he or she would not normally be able to access. [level: medium]

Projects
1. Write a suite of programs that run in parallel and interact to play the “Paper, Scissors, Rock” game. In this game, two players secretly choose either paper, scissors, or rock. They then reveal their choice. A referee decides who wins as follows:
• Paper beats rock (by covering it).
• Rock beats scissors (by blunting it).
• Scissors beats paper (by cutting it).
• Matching choices draw.
The winning player gets a point. In a draw, no points are awarded. Your program should simulate such a game, allowing the user to choose how many iterations are performed, observe the game, and see the final score. Here’s an example of a game:

$ play 3     ...play three iterations.
Paper, Scissors, Rock: 3 iterations
Player 1: ready
Player 2: ready
Go Players [1]
Player 1: Scissors
Player 2: Rock
Player 2 wins
Go Players [2]
Player 1: Paper
Player 2: Rock
Player 1 wins
Go Players [3]
Player 1: Paper
Player 2: Paper
Players draw.
Final score:
Player 1: 1
Player 2: 1
Players Draw
$ _

You should write three programs, which operate as follows:
a. One program is the main program, which forks and execs one referee process and two player processes. It then waits until all three terminate. The main program should check that the command-line parameter that specifies the number of turns is valid and should pass the number to the referee process as a parameter to exec ().
b. One program is a referee program, which plays the role of the server. This program should prepare a socket and then listen for both players to send the string “READY”, which means that they’re ready to make a choice. The referee should then tell each player to make a choice by sending them both the string “GO”. Their responses are read, and their scores calculated and updated. This process should be repeated until all of the turns have been taken, at which point the referee should send both players the string “STOP”, which causes them to terminate.
c. One program is a player program, which plays the role of the client. This program is executed twice by the main program and should start by connecting to the referee’s socket. It should then send the “READY” message. When it receives the “GO” message back from the referee, the player should make a choice and send it as a string to the referee. When the player receives the string “STOP”, it should kill itself.

The three programs will almost certainly share some functions. To do a good job, create a makefile that compiles these common functions separately and links them into the executable files that use them. Don’t avoid sending strings by encoding them as one-byte numbers; that’s part of the problem. [level: medium]
2. Rewrite Project 1, using unnamed pipes instead of sockets. Which program do you think was easier to write? Which is easier to understand? [level: medium]
3. Rewrite Project 1 to allow the players to reside on different Internet machines. Each component of the game should be able to start separately. [level: hard]

CHAPTER 14

UNIX Internals

MOTIVATION
The UNIX operating system was one of the best designed operating systems of its time. Many of the basic underlying operating system concepts embedded in UNIX will continue to be used in some form or fashion for a long time to come. For example, the way that UNIX shares CPUs among competing processes is used in many other operating systems, such as Microsoft Windows. Knowledge of the way in which the system works can aid in designing high-performance UNIX applications. For example, knowledge of the internals of the virtual memory system can help you arrange data structures so that the amount of information transferred between main and secondary
execute this command on vanguard.
memory is minimized. In sum, knowledge of UNIX internals is useful for two purpos
$ referee 5000 .use local port 5000.
. .

es: as a source of reusable information that may help you in designing other similar
.execute this command on csservr2.
systems and to help you design high-performance UNIX applications.
$ player vanguard.5000 .player is on a remote port.
. .

execute this command on wotan.


$ player venguard.5000 .player is on a remote port.
. .

PREREQUISITES
4. The Internet shell is ripe for enhancements. Here is a list of features that would You should already have read Chapter 13. It also helps to have a good knowledge of
be challenging to add: data structures, pointers, and linked lists.

a. The ability to supply an Internet address of the form A.B.C.D. This feature
would actually be easy to add, since my first Internet example already has that OBJECTIVES
capability. [level: easy]
b. Job control features like fg, bg, and jobs. [level: medium] In this chapter, I describe the mechanisms that UNIX uses to support processes, mem
c. Filename substitution using ‘i’, ?, and []. [level: hard] ory management, input/output, and the file system. I also explain the main kernel data
d. A two-way socket feature that connects the standard input and output chan structures and algorithms.
nels of either the keyboard or a specified process to an Internet socket. This
feature would allow you to connect to standard services without the aid of
telnet. [level: hard] PRESENTATION
e. A simple built-in programming language. [level: medium] Various portions of the UNIX system are described in their turn.
f. The ability to refer to any Internet address symbolically. For example, it would
be nice to be able to redirect to “vanguard.utdallas.edu.3000.” [level: medium]
