Systems Programming Using C (File Subsystem) : Intended Schedule


Intended Schedule

Week  Date    Lecture
0     20.04.  Introduction to Operating Systems
1     27.04.  Systems Programming using C (File Subsystem)
2     04.05.  Systems Programming using C (Process Control)
3     11.05.  Processes Scheduling
4     18.05.  Process Synchronization
5     25.05.  Inter Process Communication
6     01.06.  Pfingstmontag (Whit Monday)
7     08.06.  Input / Output
8     15.06.  Memory Management
9     22.06.  Filesystems

Weeks 10 to 13 (29.06., 06.07., 13.07., 20.07.): Special subject: Transactional Memory; Special subject: XQuery your Filesystem; Wrap up session.

Examination dates: 27.07. (first), 12.10. (second).

Hand out and submission: week 0 is course registration. Assignments 1 to 10 follow; each assignment is submitted when the next one is handed out.

1. Systems Programming using C (File Subsystem)

Operating Systems Prof. Dr. Marc H. Scholl DBIS U KN Summer Term 2009

Schedule for today


UNIX System Interface

Last lecture: All operating systems provide services for the programs they run.

UNIX provides its services through a set of system calls.

System calls: functions within the OS that may be called by user programs, e.g. to
execute a new program, open a file, read a file, allocate a region of
memory, get the current time, ...

Today: Systems Programming using C

Practical approach to the UNIX System Interface:

Use system calls for unbuffered file I/O.
Construct higher-level (standard) routines from system calls.
Use library functions for standard I/O.

[Figure: (A) Black Box and (B) White Box views: a user application that addresses the API, the POSIX.1 System Call Interface, and the operating system kernel.]

From Programs to Processes.

Tomorrow: Introduction to the C Programming Language

Focus on differences compared to JAVA

System calls

System calls determine a direct interface to the kernel.

Employed for maximum efficiency.
Access some facility not implemented in the libraries.

The calls available in the interface vary from OS to OS.

Underlying concepts, however, tend to be similar.

System calls accessed by C functions

The definition of the system call interface is in the C language*.
A standard technique on UNIX is for each system call to have
a function of the same name encapsulated in a C library.

Those functions invoke the appropriate kernel service, using
whatever technique is required on the system.

The function may put one or more of the C arguments into
general registers and then execute some machine instruction
that generates a software interrupt in the kernel.

We can consider the system calls as being C functions.

* Regardless of the actual implementation technique used to invoke a call.
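The relationship between a system call and its C library function of the same name can be sketched as follows. This is only an illustration under assumptions: syscall(2) and the SYS_write constant are Linux/BSD facilities rather than part of POSIX.1, and the two helper names are made up for this example.

```c
#define _DEFAULT_SOURCE   /* exposes syscall(2) in <unistd.h> on glibc */
#include <string.h>
#include <sys/syscall.h>  /* SYS_write: system-specific, an assumption here */
#include <unistd.h>

/* Write msg through the C library function write(2). */
ssize_t
write_via_wrapper(int fd, const char *msg)
{
    return write(fd, msg, strlen(msg));
}

/* Write msg by naming the system call number explicitly.
 * The library loads the arguments and traps into the kernel. */
ssize_t
write_via_syscall(int fd, const char *msg)
{
    return (ssize_t)syscall(SYS_write, fd, msg, strlen(msg));
}
```

Both helpers should end up in the same kernel service; ordinary programs use the first form and leave the trap mechanism to the library.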
[Figure: UNIX system architecture (System V). User level: user programs and program libraries. Kernel level: system call interface; file subsystem with buffer cache and character/block device drivers; process control subsystem with inter-process communication (IPC), scheduling and memory management; hardware control. Hardware level: hardware.]

Documentation of system calls

The system call interface has traditionally been documented
in Section 2 of the UNIX Programmer's Manual.

Documentation of library calls

Section 3 of the UNIX Programmer's Manual defines general-purpose
functions available to the programmer.
These functions are not entry points into the kernel.
They may use the kernel's system calls, however.

printf(3): May invoke write(2) to perform output.
atoi(3) (convert ASCII string to integer): does not involve the kernel at all.

Implementor's view (kernel programming): The distinction
between system call and library function is fundamental.
User's perspective (systems programming): Not as critical;
both exist to provide services for application programs.

System Interface Standardization

Standardization of UNIX

During the 1980s the proliferation of UNIX versions and the
differences between them led many large users (such as the
U.S. government) to call for standardization.

Among others, ANSI C and IEEE POSIX emerged.

ANSI: American National Standards Institute
IEEE: Institute of Electrical and Electronics Engineers
POSIX: Portable Operating System Interface for UNIX

Overview of UNIX versions and systems
http://www.levenez.com/unix/

Systems listed on the slide, each with the year given next to it:

1BSD (1978), 2BSD (1979), 3BSD (1980), 4BSD (1980), 4.4BSD Lite 1 (1994), 4.4BSD Lite 2 (1995),
386 BSD (1992), A/UX (1986), Acorn RISC iX (1989), Acorn RISC Unix (1988), AIX (1990), AIX 5L (2000),
AIX PS/2 (1989), AIX/370 (1990), AIX/6000 (1989), AIX/ESA (1991), AIX/RT (1986), AMiX (1990),
AOS Lite (1995), AOS Reno (1992), AppleTV (2007), ArchBSD (1994), ASV (1991), Atari Unix (1989),
BOS (1989), BRL Unix (1979), BSD Net/1 (1988), BSD Net/2 (1991), BSD/386 (1991), BSD/OS (1992),
CB Unix (1978), Chorus (1986), Chorus/MiX (1988), Coherent (1983), CTIX (1987), CXOS (1984),
Darwin (1999), Debian GNU/Hurd (2000), DEC OSF/1 ACP (1995), Dell Unix (1989), DesktopBSD (2005),
Digital Unix (1995), DragonFly BSD (2003), Dynix (1984), Dynix/ptx (1993), ekkoBSD (2003),
Eunice (1977), FireFly BSD (2004), FreeBSD (1993), FreeDarwin (2006), GNU (1986), GNU-Darwin (2001),
Gnuppix GNU/Hurd-L4 (2005), HPBSD (1987),

HP-UX (1983), HP-UX 11i (2000), HP-UX BLS (1991), IBM AOS (1988), IBM IX/370 (1985),
Interactive 386/ix (1985), Interactive IS (1978), iPhone OS X (2007), iPod OS X (2007), IRIS GL2 (1983),
IRIX (1986), Linux (1991), Lites (1994), LSX (1977), Mac OS X (1999), Mac OS X Server (1999),
Mach (1985), MERT (1974), MicroBSD (2002), Mini Unix (1977), Minix (1984), Minix 3 (2005),
Minix-VMD (2000), MIPS OS RISC/os (1985), MirBSD (2002), Mk Linux (1996), Monterey (1998),
more/BSD (1988), mt Xinu (1983), MVS/ESA OpenEdition (1993), NetBSD (1993), NeXTSTEP (1988),
NonStop-UX (1987), Open Desktop (1994), Open UNIX 8 (2001), OpenBSD (1995), OpenDarwin (2003),
OpenServer 5 (1995), OpenSolaris (2005), OPENSTEP (1996), OS/390 OpenEdition (1996), OS/390 Unix (1997),
OSF/1 (1990), PC-BSD (2005), PC/IX (1982), Plan 9 (1986), Plurix (1982), PureDarwin (2007),
PWB (1977), PWB/UNIX (1974), QNX (1984), QNX RTOS (2001), QNX/Neutrino (1996), QUNIX (1981),

ReliantUnix (1997), Rhapsody (1997), RISC iX (1991), RT (1977), SCO UNIX (1994), SCO UnixWare 7 (2002),
SCO Xenix (1984), SCO Xenix System V/386 (1987), Security-Enhanced Linux (2001), Silver OS (2004),
Sinix (1983), Sinix ReliantUnix (1995), Solaris 1 (1990), Solaris 2 (1992), SPIX (1982), SunOS (1982),
Triance OS (2004), Tru64 Unix (1999), Trusted IRIX/B (1995), Trusted Solaris (1998), Trusted Xenix (1991),
TS (1977), Tunis (1981), UCLA Locus (1980), UCLA Secure Unix (1979), Ultrix (1988), Ultrix 32M (1984),
Ultrix-11 (1982), Unicos (1986), Unicos/mk (1996), Unicos/mp (2002), Unicox-max (1993), UNICS (1969),
UniSoft UniPlus (1981), UNIX 32V (1979), UNIX Interactive (1991), UNIX System III (1981),
UNIX System IV (1982), UNIX System V (1983), UNIX System V Release 2 (1984), UNIX System V Release 3 (1986),
UNIX System V Release 4 (1988), UNIX System V/286 (1985), UNIX System V/386 (1986),
UNIX Time-Sharing System (1971), UnixWare (1993), UnixWare 7 (1998), UNSW (1976), USG (1977),
Venix (1982), Xenix OS (1980), Xinu (1984), xMach (1998), z/OS Unix System Services (2001)

The POSIX standard

POSIX refers to a family of related standards.
It was originally used as a synonym for IEEE Std 1003.1-1988;
POSIX.1 emerged as the preferred term.

The latest version of POSIX.1 was published on April 30, 2004.
It is called IEEE Std 1003.1, 2004 Edition (POSIX.1).
It is also known as Single UNIX Specification Version 3 (SUSv3):
http://www.unix.org/version3/

Single UNIX Specification

The latest version of POSIX.1 has been jointly developed by the
IEEE and The Open Group [5].

As such it is both an IEEE and an Open Group Technical Standard:

IEEE Std 1003.1, 2004 Edition
The Open Group Technical Standard Base Specifications, Issue 6
It is also an international standard, ISO/IEC 9945:2003.

The standard is published free of charge on the web as
The Single UNIX Specification, Version 3, 2004 Edition [6].

[5] http://www.opengroup.org/overview/members/membership_list.htm
[6] http://www.unix.org/version3/

Tight coupling of C and standards

The Single UNIX Specification, Version 3, 2004 Edition:

Conceptually, this standard describes a set of fundamental
services needed for the efficient construction of application
programs. Access to these services has been provided by defining
an interface, using the C programming language, a command
interpreter, and common utility programs that establish
standard semantics and syntax.

Readers are expected to be experienced C language programmers.

File Subsystem

The UNIX system interface for File I/O

The ISO C Standard Library provides an I/O interface that is
uniform across operating systems.
On any particular system the routines of the standard library
have to be written in terms of the facilities provided by the
host system.

Next we will describe some UNIX system calls for input and
output, and show an example of how parts of the standard
library can be implemented with them.
Hello World and related system calls

[Figure: UNIX system architecture (System V): user level (user programs, program libraries), kernel level (system call interface, file subsystem, process control subsystem, hardware control), hardware level.]

I/O system calls - Tracing hello world


strace(1): trace system calls & signals

Unbuffered I/O


I/O system calls

Regarding file I/O it all boils down to five system calls:
open, read, write, lseek, and close

We speak of unbuffered I/O in contrast to the standard I/O.

It is called unbuffered because each read or write invokes a system
call, which is executed immediately instead of being collected in a
user-space buffer first (as the standard I/O library does).

Unbuffered I/O is not part of ISO C, but of POSIX.1 and SUSv3.

Standard I/O is provided by the standard C library (ISO C).

Further file-related system calls:
unlink(2), mkdir(2), rmdir(2), link(2), symlink(2), chmod(2),
stat(2), umask(2), chown(2), chflags(2), utimes(2), ...

Everything is a file

In the UNIX operating system, all input and output is done
by reading or writing files.

All peripheral devices, even keyboard and screen, are files in
the filesystem.

This means that a homogeneous interface handles all
communication between a program and peripheral devices.

In the most general case, before we read or write a file, we
must inform the system of our intent to do so, a process
called opening the file.

File Descriptor

When we open an existing file or create a new file, the kernel
returns a file descriptor to the process.

To the kernel all open files are referred to by file descriptors.
A file descriptor is a small non-negative integer.

Whenever I/O is to be done on the file, the file descriptor is
used instead of the name to identify the file.

Each process has a fixed-size descriptor table, which is
guaranteed to have at least n slots.
The call getdtablesize(3) returns the size of this table.
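A minimal sketch of descriptors as small integers follows. The helper names are hypothetical, and sysconf(_SC_OPEN_MAX) is used as an assumed POSIX stand-in for getdtablesize(3), which is not available everywhere.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open path read-only and return the descriptor, or -1 on error. */
int
open_for_reading(const char *path)
{
    return open(path, O_RDONLY);
}

/* Limit on open descriptors per process, as reported by sysconf();
 * assumed here to play the role of getdtablesize(3). */
long
descriptor_table_size(void)
{
    return sysconf(_SC_OPEN_MAX);
}
```

Because the kernel hands out the lowest unused slot of the table, two files opened one after the other receive increasing descriptor values.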

Special File Descriptors

A file descriptor is analogous to the FILE pointer used by
the standard library, or the file object used in JAVA.

All information about an open file is maintained by the system.

Since I/O involving keyboard and screen is so common,
special arrangements exist to make this convenient.

When the command interpreter (the shell) runs a program,
three files are open, with file descriptors*:

    0  standard input (stdin)     line buffered
    1  standard output (stdout)   line buffered
    2  standard error (stderr)    unbuffered

* POSIX.1 replaces the magic numbers 0, 1, and 2 with STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO (unistd.h)

The user can redirect I/O to and from files with < and >:

    $ ./a.out <infile >outfile

In this case, the shell changes the default assignments for
file descriptors 0 and 1 to the named files.
File descriptor 2 normally remains attached to the screen to
display error messages.

In all cases, the file assignments are changed by the shell,
not by the program.

The program does not know where its input comes from nor
where its output goes, as long as it uses file 0 for input and 1
and 2 for output.
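The use of the three standard descriptors might look like the sketch below. The function names and the message texts are made up for illustration.

```c
#include <string.h>
#include <unistd.h>

/* Write a string to the given descriptor; returns bytes written or -1. */
ssize_t
put_string(int fd, const char *s)
{
    return write(fd, s, strlen(s));
}

/* Send a result to standard output and a diagnostic to standard error. */
int
report(void)
{
    if (put_string(STDOUT_FILENO, "result: 42\n") == -1)
        return -1;
    if (put_string(STDERR_FILENO, "warning: demo only\n") == -1)
        return -1;
    return 0;
}
```

If a program calling report() is started as ./a.out >outfile, only the result line should land in outfile, while the warning still goes wherever descriptor 2 points.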

Open a file

When opening a file the system checks for, e.g.,

    the existence of the file
    our permissions to access the file

If we are going to write to a file it may also be necessary
to create it first or to discard its previous contents.

read(2)ing and write(2)ing a file

Input and output use the read and write system calls.
Those are accessed from C programs through two identically
named functions: read(2) and write(2).
For both, the first argument is a file descriptor.
The second argument is a pointer to a buffer in the program
where the data is to go to or come from.
The third argument is the number of bytes to be transferred.

    #include <sys/types.h>
    #include <unistd.h>

    /* Returns: number of bytes read, 0 on EOF,
       -1 on error (errno is set) */
    ssize_t
    read(int d, void *buf, size_t nbytes);

    /* Returns: number of bytes written if OK,
       -1 on error (errno is set) */
    ssize_t
    write(int d, const void *buf, size_t nbytes);
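Since write(2) returns the number of bytes actually written, a caller may have to loop until everything is out. A minimal sketch, with write_all being a hypothetical helper name:

```c
#include <errno.h>
#include <unistd.h>

/* Write exactly n bytes from buf to fd, retrying after short writes
 * and interrupted calls. Returns 0 on success, -1 on error (errno set). */
int
write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;

    while (n > 0) {
        ssize_t w = write(fd, p, n);

        if (w == -1) {
            if (errno == EINTR)
                continue;        /* interrupted before any data: retry */
            return -1;
        }
        p += w;                  /* skip what was written */
        n -= (size_t)w;
    }
    return 0;
}
```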

Higher-level routines build on syscalls

It is instructive to see how system calls, such as read and
write, can be used to construct higher-level routines defined
in the standard library, such as getchar(3) and putchar(3).
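One way such routines could be built directly on read and write is sketched below. These are unbuffered stand-ins with hypothetical names that take the descriptor as a parameter; the real getchar(3) and putchar(3) work on buffered streams.

```c
#include <stdio.h>    /* EOF */
#include <unistd.h>

/* Unbuffered single-character input: one read(2) call per character. */
int
fd_getchar(int fd)
{
    unsigned char c;

    return (read(fd, &c, 1) == 1) ? c : EOF;
}

/* Unbuffered single-character output: one write(2) call per character. */
int
fd_putchar(int fd, int ch)
{
    unsigned char c = (unsigned char)ch;

    return (write(fd, &c, 1) == 1) ? c : EOF;
}
```

Calling fd_getchar(STDIN_FILENO) behaves much like getchar(), at the price of one system call per character.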

... words of wisdom

Computer Science is largely concerned
with an understanding of how low-level
details make it possible to achieve high-level goals.

--- Donald E. Knuth, TAOCP*, p1.3

* The Art of Computer Programming, Volume 1, Fascicle 1, 2005, Addison-Wesley

Unbuffered I/O routines

Open(2)ing a file

A file is opened by calling the open(2) function:

    #include <fcntl.h>

    int
    open(const char *path, int flags, mode_t mode);

path is the name of the file to open or create.

flags specifies a multitude of options (formed by ORing
together one or more constants from fcntl.h (next slide)).

mode holds permission information associated with a file.

If successful, open(2) returns a file descriptor.

Otherwise, a value of -1 is returned and errno is set to
indicate the error (errno.h),
e.g. ENAMETOOLONG: A component of a path name exceeded
MAXNAMLEN, or an entire path name exceeded MAXPATHLEN-1

Flags to open(2)

flags is an int that specifies how the file is to be opened:

    O_RDONLY   Open for reading only.
    O_WRONLY   Open for writing only.
    O_RDWR     Open for reading and writing.

Table: One and only one of these three constants must be specified. [29]

    fd = open(name, O_RDONLY, 0);

In principle, it's an error to try to open a non-existing file.

The system call creat (sic!) is provided to create new files.

But ...

[29] Most implementations define O_RDONLY as 0, O_WRONLY as 1, and O_RDWR as
2, for compatibility with older programs.
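Checking the return value of open(2) and reporting errno could look like the following sketch. The function name open_or_report is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Open path read-only. On failure, print the reason to stderr and
 * return -1 with errno left as set by open(2). */
int
open_or_report(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        int saved = errno;       /* fprintf may change errno */

        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        errno = saved;
    }
    return fd;
}
```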

Creation of new files

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>

    int
    creat(const char *path, mode_t mode);

However, creat is made obsolete by open:

Historically, in early versions of the UNIX system, the second argument to open could
be only 0, 1, or 2. There was no way to open a file that didn't exist. Therefore, a
separate system call, creat, was needed to create new files. With the O_CREAT and
O_TRUNC options now provided by open, a separate creat function is no longer needed:

    open(path, O_CREAT | O_TRUNC | O_WRONLY, mode);

More flags to open(2)

Besides exactly one access mode (O_RDONLY: open for reading only, O_WRONLY: open for
writing only, O_RDWR: open for reading and writing), the following constants are
optional flags to open(2):

    O_APPEND    Append on each write.
    O_CREAT     Create file if it does not exist (requires mode argument).
    O_EXCL      Error if O_CREAT and file exists.
    O_TRUNC     Truncate file length to 0.
    O_NONBLOCK  Do not block on open or for data to become available.
    O_SYNC      Have each write wait for I/O to complete (incl. file attributes).
    O_RSYNC     Let read wait until pending writes to same area are complete.
    O_DSYNC     Have each write wait for I/O to complete (excl. file attributes).

Table: Short description of some more POSIX.1 flags to open. Consult
your system manual for further information & implementation details.
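How O_CREAT and O_EXCL combine can be sketched as below. The helper name and the permission bits (read and write for the owner) are choices made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create path for writing, but only if it does not exist yet.
 * Returns 0 on success, -1 on error (errno == EEXIST if it existed). */
int
create_exclusive(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);

    if (fd == -1)
        return -1;
    return close(fd);
}
```

The existence check and the creation happen in one system call, so two processes racing for the same name cannot both succeed.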

Close(2)ing time

close(2) - delete a descriptor

    #include <unistd.h>

    int
    close(int d);

An open file is closed by calling the close function.
Releases any record locks the process may have on the file.
May be used to not run out of active descriptors per process.
When a process exits, all associated file descriptors are freed.

Returns 0 on success, -1 on failure and sets the global int errno.

close will fail if:

    Argument is not an active descriptor (EBADF)
    An interrupt was received (EINTR)

lseek(2): Reposition file offset

    #include <unistd.h>

    off_t
    lseek(int fildes, off_t offset, int whence);

Every open file has a current file offset.

    Normally, a non-negative integer.
    Measures the number of bytes from the beginning of the file.

read and write normally start at the current file offset.

    They increment the offset by the number of bytes read or written.

By default the offset is initialized to 0 when a file is opened.
Open with O_APPEND is an exception.

Offset interpretation

Interpretation of offset depends on the whence argument:

    whence     offset (re-)position
    SEEK_SET   file's offset is set to offset bytes from the beginning of the file.
    SEEK_CUR   file's offset is set to its current value plus offset. [30]
    SEEK_END   file's offset is set to the size of the file plus offset. [30]

[30] The offset can be negative or positive.

Errors & determine current offset

A successful call to lseek returns the new file offset.
Otherwise, a value of -1 is returned and errno is set.

lseek will fail and the file pointer will remain unchanged if:

    EBADF   fd is not an open file descriptor.
    ESPIPE  fd is associated with a pipe, socket, or FIFO.
    EINVAL  whence is not a proper value or the resulting offset would
            be negative on a filesystem or special device that does
            not allow negative offsets to be used.

To determine the current offset, we can seek with zero offset:

    off_t pos;
    pos = lseek(fd, 0, SEEK_CUR);
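The three whence values can be combined to find the size of an open file without disturbing its current offset. A sketch, with both helper names being hypothetical:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the size of the open file fd in bytes, or -1 if fd cannot seek.
 * The current offset is saved first and restored afterwards. */
off_t
file_size(int fd)
{
    off_t here = lseek(fd, 0, SEEK_CUR);    /* remember current offset */
    off_t end;

    if (here == (off_t)-1)
        return -1;
    end = lseek(fd, 0, SEEK_END);           /* size of the file plus 0 */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)
        return -1;
    return end;
}

/* Convenience wrapper: size of the file named by path, or -1 on error. */
off_t
file_size_of(const char *path)
{
    int fd = open(path, O_RDONLY);
    off_t size;

    if (fd == -1)
        return -1;
    size = file_size(fd);
    close(fd);
    return size;
}
```

In practice stat(2) is the usual way to ask for a file size; the version above is only meant to exercise lseek.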

Determine seeking capabilities

The same technique determines whether a file is capable of seeking:

    #include <err.h>
    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
        /* lseek fails with ESPIPE on pipes, sockets and FIFOs */
        if (lseek(STDIN_FILENO, 0, SEEK_SET) == -1)
            err(errno, "can not seek [%d].", errno);
        else
            printf("seek OK.\n");
        return (0);
    }

    $ ./a.out < /etc/motd
    seek OK.
    $ cat /etc/motd | ./a.out
    a.out: can not seek [29].: Illegal seek

    $ grep 29 /usr/include/sys/errno.h
    #define ESPIPE 29 /* Illegal seek */

Example: Low-Level File Copy

    /* cp: copy f1 to f2 (file copy) */
    #include <err.h>     /* err(3), errx(3): used because the error() helper is not shown */
    #include <fcntl.h>
    #include <stdio.h>   /* BUFSIZ */
    #include <unistd.h>

    #define PERMS 0666   /* assumed value: rw for owner, group, others */

    int
    main(int argc, char *argv[])
    {
        int f1, f2;
        ssize_t n;
        char buf[BUFSIZ];

        if (argc != 3)
            errx(1, "Usage: cp from to");
        if ((f1 = open(argv[1], O_RDONLY, 0)) == -1)
            err(1, "can't open %s", argv[1]);
        if ((f2 = creat(argv[2], PERMS)) == -1)
            err(1, "can't create %s, mode %03o", argv[2], PERMS);
        /* copy one buffer at a time until read reports EOF (0) */
        while ((n = read(f1, buf, BUFSIZ)) > 0)
            if (write(f2, buf, n) != n)
                err(1, "write error on file %s", argv[2]);
        if (n == -1)
            err(1, "read error on file %s", argv[1]);
        close(f1);
        close(f2);
        return (0);
    }

Standard I/O
Unbuffered vs. standard I/O: Streams and FILE Objects

[Figure: UNIX system architecture (System V): user level (user programs, program libraries), kernel level (system call interface, file subsystem, process control subsystem, hardware control), hardware level.]

Unbuffered I/O: File descriptors

So far I/O centered around file descriptors.
When a file is opened, a file descriptor is returned.
It is then used for all subsequent I/O operations.

Standard I/O Library: Streams

Standard I/O centers around streams.
When opening or creating a file we say that we associate a
stream with the file (fopen(3) returns a pointer to FILE).
FILE contains all the information required by the standard
I/O library to manage the stream.

The Standard I/O Library

Input and output functionality of the ISO C standard library:

    Specified by the ISO C standard.
    Has been implemented on many OSs other than UNIX.
    Additional interfaces defined as extensions by SUSv3.

Handles details such as buffer allocation and performing I/O
in optimal-sized chunks (no need to worry about using the
correct buffer size).

Ease of use.

Incidental Remark: Initially written by Dennis Ritchie around 1975.

The FILE object

Typical members of the FILE structure:

    The file descriptor used for actual I/O.
    A pointer to a buffer for the stream.
    The size of the buffer.
    Count of the number of characters currently in the buffer.
    An error flag.

In general there is no need to examine a FILE object; just pass
the pointer as an argument to each standard I/O function.

A pointer with type FILE * is referred to as a file pointer.
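That a stream is a layer on top of a file descriptor can be made visible with fileno(3), a POSIX.1 function that is not part of ISO C. The sketch below uses a hypothetical helper name.

```c
#include <stdio.h>

/* Open path as a stream, read its first character through the stream,
 * and report which file descriptor the stream uses underneath.
 * Returns the descriptor, or -1 if the file cannot be opened. */
int
descriptor_behind(const char *path, int *first_char)
{
    FILE *fp = fopen(path, "r");
    int fd;

    if (fp == NULL)
        return -1;
    fd = fileno(fp);              /* descriptor stored in the FILE object */
    *first_char = getc(fp);       /* EOF if the file is empty */
    fclose(fp);                   /* also closes the descriptor */
    return fd;
}
```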

Automatic Buffering

The standard I/O library provides buffering:

    The goal is to minimize the number of read and write calls.
    The library tries to associate buffering with each stream automatically.
    Applications should not have to worry about it.
    Different buffering modes can lead to confusion.

Buffering

Three types of buffering are provided by the standard I/O library:

    Fully buffered
    Line buffered
    Unbuffered

Fully (block) buffered I/O

Fully buffered I/O as provided by the standard I/O library:

Actual I/O takes place when the standard I/O buffer is filled.

Files residing on disk are normally fully buffered by the library.

The buffer is obtained by one of the I/O functions,
usually by calling malloc(3) the first time I/O takes place.

The term flush describes the writing of a standard I/O buffer.

A buffer can be flushed

    automatically by the standard I/O routines, such as when a buffer fills, or
    explicitly, by using the function fflush(3).

Line buffered I/O

Line buffered I/O provided by the standard I/O library:

Actual I/O takes place when a newline character is
encountered on input or output.

This allows us to output a single character at a time (e.g.,
with fputc(3)), knowing that actual I/O will take place only
when we finish writing each line.

Line buffering is typically used on a stream when it refers to a
terminal (e.g., standard input and standard output).

However, the size of the buffer is fixed, so I/O might take
place if the buffer is filled before a newline is seen.

Unbuffered I/O

Unbuffered I/O:

The standard I/O library does not buffer the characters.

When an output stream is unbuffered, information appears on
the destination file/terminal as soon as it is written (write(2)).

The standard error stream is normally unbuffered.

Any error messages are displayed as quickly as possible
(regardless of whether they contain a newline or not).

ISO C buffering requirements

ISO C requires the following buffering characteristics:

Standard input and output are fully buffered if and only if
they do not refer to an interactive device.

Standard error is never fully buffered.

Should standard input and output be unbuffered or line buffered if
they refer to an interactive device? Should standard error be line
buffered or unbuffered?

System dependent (for instance OpenBSD):

If stdin and stdout refer to a terminal they are line buffered.
Standard error is initially unbuffered.

Turn buffering on and off

    #include <stdio.h>

    void
    setbuf(FILE *stream, char *buf);

setbuf turns buffering on or off.

It may be implemented similar to:

    /* /usr/src/lib/libc/stdio/setbuf.c */
    #include <stdio.h>

    void
    setbuf(FILE *fp, char *buf)
    {
        (void)setvbuf(fp, buf, buf ? _IOFBF : _IONBF, BUFSIZ);
    }

Altering buffer behaviour

    #include <stdio.h>

    int   /* 0 if OK else EOF (but stream is still functional) */
    setvbuf(FILE *stream, char *buf, int mode, size_t size);

setvbuf is used to alter the buffering behavior of a stream.

May only be used after a successful open and before the first I/O.

mode must be one of the following three macros:

    #define _IOFBF 0 /* setvbuf should set fully buffered */
    #define _IOLBF 1 /* setvbuf should set line buffered */
    #define _IONBF 2 /* setvbuf should set unbuffered */

For an unbuffered stream, buf and size are ignored.

For line or fully buffered streams:

    buf and size can optionally specify a buffer and its size.
    If buf is NULL the system chooses an appropriate size [33].

[33] System-dependent: BUFSIZ (stdio.h), st_blksize (sys/stat.h)
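Typical calls to setvbuf might look like the sketch below. The two wrapper names are made up; passing NULL as buf leaves the choice of buffer to the library.

```c
#include <stdio.h>

/* Switch fp to line buffering with a library-allocated buffer.
 * Must be called before the first I/O on fp. Returns 0 if OK. */
int
make_line_buffered(FILE *fp)
{
    return setvbuf(fp, NULL, _IOLBF, BUFSIZ);
}

/* Switch fp to unbuffered mode; buf and size are ignored in this mode. */
int
make_unbuffered(FILE *fp)
{
    return setvbuf(fp, NULL, _IONBF, 0);
}
```

A program that logs progress to a file could, for example, call make_line_buffered() right after fopen so that each finished line reaches the file promptly.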

Buffer options overview

The setbuf and setvbuf functions and their options:

    Function  mode    buf        Buffer & length               Type of buffering
    setbuf            nonnull    user buf of length BUFSIZ     fully buffered or line buffered
                      NULL       (no buffer)                   unbuffered
    setvbuf   _IOFBF  nonnull    user buf of length size       fully buffered
                      NULL       system buffer of apt length   fully buffered
              _IOLBF  nonnull    user buf of length size       line buffered
                      NULL       system buffer of apt length   line buffered
              _IONBF  (ignored)  (no buffer)                   unbuffered

fflush(3) a stream

At any time, a stream can be flushed:

    #include <stdio.h>

    int   /* 0 if OK, EOF on failure and errno set */
    fflush(FILE *stream);

Any unwritten data for the stream is passed to the kernel.

If stream is NULL, all output streams are flushed.

Standard I/O routines

fopen(3) a stream

#include <stdio.h>

FILE *
fopen(const char *path, const char *mode);

FILE *
freopen(const char *path, const char *mode, FILE *stream);

FILE *   /* all: file pointer if OK, NULL on failure with errno set */
fdopen(int fildes, const char *mode);

fopen opens a specified file.
freopen opens a specified file on a specified stream.
    The original stream (if it exists) is always closed.
    Typical use: change the file associated with stderr, stdin, or stdout.
fdopen associates a stream with an existing file descriptor. It is part of POSIX.1, not ISO C, as ISO C standard I/O does not deal with file descriptors.

Modes to Open a Standard I/O Stream

ISO C specifies 15 values for opening a standard I/O stream:

mode                Description
r or rb             open for reading
w or wb             truncate to 0 length or create for writing
a or ab             append; open for writing at end of file, or create for writing
r+ or r+b or rb+    open for reading and writing
w+ or w+b or wb+    truncate to 0 length or create for reading and writing
a+ or a+b or ab+    open or create for reading and writing at end of file

The character b allows a system to differentiate between text and binary files.
    UNIX kernels do not differentiate between these types of files, so on UNIX specifying b has no effect.

With fdopen, the meaning of mode differs slightly: the descriptor has already been opened, so opening for writing does not truncate the file.
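The modes can be tried out with a short sketch. The helpers append_line and count_lines are hypothetical names introduced only for this illustration: the first relies on mode a (create if needed, write at end), the second on mode r (file must already exist).

```c
#include <stdio.h>

/* Hypothetical helper: append one line to a file, creating it if needed.
 * Mode "a" positions every write at the end of the file.
 * Returns 0 on success, -1 on failure. */
int append_line(const char *path, const char *line)
{
    FILE *fp = fopen(path, "a");
    if (fp == NULL)
        return -1;               /* errno describes the failure */
    int rc = (fputs(line, fp) == EOF || fputc('\n', fp) == EOF) ? -1 : 0;
    if (fclose(fp) == EOF)       /* fclose flushes; failure means lost data */
        rc = -1;
    return rc;
}

/* Hypothetical helper: count the newline characters in an existing file.
 * Mode "r" fails if the file does not exist.
 * Returns the count, or -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    long n = 0;
    int c;
    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            n++;
    fclose(fp);
    return n;
}
```

Opening the same file afterwards with mode w would truncate it to length 0, so count_lines would then report no lines.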

Final Remarks on Opening a Stream

Six different ways to open a standard I/O stream:

Restriction                            r    w    a    r+   w+   a+
file must already exist                x              x
previous contents of file discarded         x              x
stream can be read                     x              x    x    x
stream can be written                       x    x    x    x    x
stream can be written only at end                x              x

When creating a new file with mode w or a, there is no way to specify the file's access permission bits, as there is with open(2).
    Any created file will have mode S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH (0666), as modified by the process's umask.

With a file opened for reading and writing (+ sign in mode), reads and writes cannot be arbitrarily intermixed.
    Output shall not be directly followed by input without an intervening fflush or repositioning call (fseek, fsetpos, or rewind).
    Input shall not be directly followed by output without repositioning.
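The intermixing rule can be illustrated with a brief sketch. The helper write_then_read_first is a hypothetical name; it assumes fp was opened for update (for example with w+ or r+) and is positioned at the start of an empty file.

```c
#include <stdio.h>

/* Hypothetical helper: on a stream opened for update, write a string and
 * then read back the first character of the file.
 * Returns that character, or EOF on failure. */
int write_then_read_first(FILE *fp, const char *text)
{
    if (fputs(text, fp) == EOF)
        return EOF;
    /* Output must not be directly followed by input: reposition first.
     * The fseek call also writes out the pending output. */
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return EOF;
    return getc(fp);
}
```

Before writing again after the getc call, the caller would have to reposition the stream once more, for instance with fseek(fp, 0L, SEEK_END).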

Closing a Stream with fclose(3)

#include <stdio.h>

int   /* 0 if OK, else EOF with errno set (no further access to the stream) */
fclose(FILE *stream);

An open stream is closed by calling fclose.
    Any buffered output data is flushed before the file is closed.
    Any buffered input data is discarded.
    Any automatically allocated buffers are released.

When a process terminates normally (calling exit or returning from main), all open standard I/O streams are closed.

Reading and Writing a Stream

Once a stream has been opened, there are three different types of I/O:

Character-at-a-time I/O. Read and write one character at a time, with the standard I/O functions handling all the buffering (if the stream is buffered).

Line-at-a-time I/O. To read or write a line at a time, we use fgets(3) and fputs(3). Each line is terminated with a newline character, and we have to specify the maximum line length we can handle.

Direct I/O (also known as binary I/O, object-at-a-time I/O, or record/structure-oriented I/O). Provided by fread(3) and fwrite(3). For each operation we read or write some number of objects, where each object is of a specified size.

These types of I/O are referred to as unformatted I/O. Formatted I/O is done by functions such as printf or scanf.

Character-at-a-time I/O

Character-at-a-time Input Functions

#include <stdio.h>

int
fgetc(FILE *stream);

int
getc(FILE *stream);

int   /* equivalent to getc() with the argument stdin */
getchar(void);

Each function returns the next character from the stream as an unsigned char converted to int.

The input functions return the same value (EOF) whether an error occurs or end-of-file is reached; feof and ferror are used to distinguish the two cases.
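A minimal sketch of the usual read loop, showing why the result of getc is kept in an int and how ferror separates an error from end-of-file. The helper name count_chars is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: count the characters remaining on a stream.
 * Returns the count, or -1 if the loop ended because of a read error. */
long count_chars(FILE *fp)
{
    long n = 0;
    int c;                       /* int, not char: must be able to hold EOF */
    while ((c = getc(fp)) != EOF)
        n++;
    if (ferror(fp))              /* EOF alone does not say why we stopped */
        return -1;
    return n;                    /* the loop ended at end-of-file */
}
```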

Push-Back Characters

#include <stdio.h>

int   /* c if OK, EOF on failure */
ungetc(int c, FILE *stream);

Characters pushed back are returned by subsequent reads on the stream in reverse order of their pushing (last in, first out).
Only one character of push-back is guaranteed; some implementations allow an effectively unlimited amount as long as there is sufficient memory.
If a character is successfully pushed back, the end-of-file indicator for the stream is cleared.
Pushing back EOF will fail and the stream remains unchanged.
Pushed-back characters do not get written back to the file or device. They are kept in core, i.e., in memory.

Character-at-a-time Output Functions

#include <stdio.h>

int
fputc(int c, FILE *stream);

int
putc(int c, FILE *stream);

int   /* all: c if OK, EOF with errno set on failure */
putchar(int c);

The functions write the character c (converted to an unsigned char) to the output stream.
EOF is returned if a write error occurs, or if an attempt is made to write to a read-only stream.
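Push-back and character output can be sketched together. Both helpers below are hypothetical names chosen for this illustration: read_number peeks one character ahead and returns it to the stream with ungetc, and copy_stream copies the remaining input with getc and putc.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: read an unsigned decimal number from in.
 * The first non-digit is pushed back so the caller still sees it.
 * Returns the number read (0 if no digit was present). */
unsigned long read_number(FILE *in)
{
    unsigned long value = 0;
    int c;
    while ((c = getc(in)) != EOF && isdigit(c))
        value = value * 10 + (unsigned long)(c - '0');
    if (c != EOF)
        ungetc(c, in);           /* one character of push-back is guaranteed */
    return value;
}

/* Hypothetical helper: copy the rest of in to out, one character at a time.
 * Returns 0 on success, EOF on a read or write error. */
int copy_stream(FILE *in, FILE *out)
{
    int c;
    while ((c = getc(in)) != EOF)
        if (putc(c, out) == EOF)
            return EOF;
    return ferror(in) ? EOF : 0;
}
```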

Line-at-a-time I/O

Line-at-a-time Input Functions

#include <stdio.h>

char *
fgets(char *str, int size, FILE *stream);

char *   /* should NEVER be used: unknown buffer size */
gets(char *str);

/* Both return str if OK, NULL on EOF or error */

str specifies the address of the buffer to read the line into.
gets reads from stdin and fgets from stream.
With fgets the size of the buffer is specified.
    The buffer is always null-terminated, i.e., at most size - 1 characters are read. If the line is longer, a partial line is returned. The next call will read what follows.
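The partial-line behaviour of fgets can be sketched as follows. The helper read_line and its return convention are assumptions made for this example, not part of the standard library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf (capacity size, size > 1).
 * Returns  1 if the line ended (trailing newline removed, or last line
 *            of the file without a newline),
 *          0 if buf was filled before a newline was seen (the rest
 *            follows on the next call),
 *         -1 on EOF or error with nothing read. */
int read_line(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) == NULL)
        return -1;
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';     /* strip the newline that fgets keeps */
        return 1;
    }
    /* No newline stored: the buffer was too small, or the file ended. */
    return feof(fp) ? 1 : 0;
}
```

A caller that needs whole lines of arbitrary length would keep calling read_line and concatenating the pieces until it stops returning 0.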

Line-at-a-time Output Functions

#include <stdio.h>

int   /* non-negative on success, EOF on error */
fputs(const char *str, FILE *stream);

int   /* non-negative on success, EOF on error */
puts(const char *str);

fputs writes the string pointed to by str to the stream pointed to by stream. No newline is appended.
puts writes the string str, and a terminating newline character, to the stream stdout.
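Since fputs appends no newline while puts does, a puts-like function for an arbitrary stream has to add the newline itself. The sketch below uses the hypothetical name put_line.

```c
#include <stdio.h>

/* Hypothetical helper: write str followed by a newline to fp, i.e. what
 * puts does for stdout. fputs itself adds no newline.
 * Returns 0 on success, EOF on error. */
int put_line(FILE *fp, const char *str)
{
    if (fputs(str, fp) == EOF)
        return EOF;
    return (putc('\n', fp) == EOF) ? EOF : 0;
}
```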

Binary I/O

Motivation for Binary I/O

Read or write an entire structure at a time.
With character-at-a-time functions, such as getc or putc, we have to loop through an entire structure.
Line-at-a-time functions will not work.
    fputs stops writing when it hits a null byte.
    fgets won't work right on input with null or newline bytes.

Binary I/O Functions

#include <stdio.h>

size_t
fread(void *ptr, size_t size, size_t nmemb, FILE *stream);

size_t
fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
/* Return number of objects read or written */

fread reads nmemb objects, each size bytes long.
    Input is taken from stream and stored at the location ptr.
fwrite writes nmemb objects, each size bytes long, from the location ptr to stream.
Both return the number of objects read or written.
    For read, it can be less than nmemb if an error occurs or end-of-file is reached; ferror and feof must be called to determine which.
    For write, an error has occurred if it is not equal to nmemb.

The functions have two common cases:

Read or write a binary array

float data[10];

if (fwrite(&data[2], sizeof(float), 4, fp) != 4)
    err(1, "fwrite error.");

    size as the size of each element of the array.
    nmemb as the number of elements.

Read or write a structure

struct tuple {
    unsigned int size;
    unsigned int level;
    enum kind kind;
    void *cnt;
} tup;

if (fwrite(&tup, sizeof(tup), 1, fp) != 1)
    err(1, "fwrite error.");

    size as the size of the structure.
    nmemb as one (the number of objects to write).
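A minimal round-trip sketch of the two calls. The record type struct sample and the helpers save_samples and load_samples are hypothetical; the record deliberately contains no pointers, so its bytes remain meaningful when read back by the same program on the same machine.

```c
#include <stdio.h>

/* Hypothetical record type for this sketch (no pointer members). */
struct sample {
    unsigned int id;
    float        value;
};

/* Write n records; returns 0 on success, -1 if fewer than n were written. */
int save_samples(FILE *fp, const struct sample *s, size_t n)
{
    return (fwrite(s, sizeof *s, n, fp) == n) ? 0 : -1;
}

/* Read up to n records; returns the number actually read.
 * A short count means EOF or error; the caller checks feof/ferror. */
size_t load_samples(FILE *fp, struct sample *s, size_t n)
{
    return fread(s, sizeof *s, n, fp);
}
```

Asking load_samples for more records than the file holds yields a short count with the end-of-file flag set, which is the case the return-value rule above describes.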

Fundamental Problems with Binary I/O

Binary formats change between compilers and architectures.
    Binary formats used to store multibyte integers and floating-point values differ among machine architectures.
    The offset of a member within a structure can differ between compilers and systems.
    Even on a single system, the binary layout of a structure can differ, depending on compiler options.

To exchange binary data among different systems, a higher-level protocol is probably the better choice.

Positioning a Stream

There are three ways to position a standard I/O stream:
    ftell and fseek. File position stored as long. (Historic)
    ftello and fseeko. File position stored as off_t. (SUSv3)
    fgetpos and fsetpos. File position stored as fpos_t. (ISO C)

fseek and fseeko work similarly to lseek(2), and the whence options (SEEK_SET etc.) are the same.
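The historic ftell/fseek pair can be sketched as follows. The helper name stream_size is hypothetical, and the sketch assumes a seekable stream on a UNIX system, where seeking to SEEK_END on a regular file is well defined.

```c
#include <stdio.h>

/* Hypothetical helper: determine the size of a seekable stream in bytes
 * and restore the previous position. Returns -1 on failure. */
long stream_size(FILE *fp)
{
    long saved = ftell(fp);              /* remember the current position */
    if (saved == -1L)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)    /* same whence values as lseek(2) */
        return -1;
    long size = ftell(fp);
    if (fseek(fp, saved, SEEK_SET) != 0)
        return -1;
    return size;
}
```

For files whose size may not fit into a long, the same logic would use ftello and fseeko with off_t instead.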

Check Stream Status

#include <stdio.h>

int   /* non-zero if the end-of-file flag is set */
feof(FILE *stream);

int   /* non-zero if the error flag is set */
ferror(FILE *stream);

void
clearerr(FILE *stream);

Most implementations maintain two flags for each stream in the FILE object:
    An error flag.
    An end-of-file flag.
Both flags are cleared by clearerr.

Obtaining a File Descriptor

#include <stdio.h>

int   /* file descriptor associated with the stream */
fileno(FILE *stream);

On UNIX, the standard I/O library ends up calling the low-level I/O routines.
Each standard I/O stream has an associated file descriptor.
fileno can obtain the descriptor (SUSv3, not ISO C).
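A short sketch tying the status flags and fileno together. Both helper names are hypothetical. write_via_descriptor assumes a POSIX system, since it mixes standard I/O with write(2) on the underlying descriptor, and it flushes first so that buffered data is not overtaken by the direct write.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes of msg through the stream's
 * underlying file descriptor using write(2).
 * Returns the number of bytes written, or -1 on failure. */
long write_via_descriptor(FILE *fp, const char *msg, size_t len)
{
    if (fflush(fp) == EOF)               /* empty the stdio buffer first */
        return -1;
    int fd = fileno(fp);                 /* SUSv3, not ISO C */
    if (fd == -1)
        return -1;
    return (long)write(fd, msg, len);
}

/* Hypothetical helper: read to end-of-file, then clear the flags.
 * Returns 1 if the end-of-file flag was set and clearerr reset it. */
int eof_flag_roundtrip(FILE *fp)
{
    while (getc(fp) != EOF)
        ;
    int was_eof = feof(fp) != 0;
    clearerr(fp);                        /* clears both flags */
    return was_eof && !feof(fp) && !ferror(fp);
}
```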

Standard I/O Example

Review
On slide: I/O system calls - Tracing hello world
write(2) is invoked without a previous call to open(2). How so?
Why does the following code yield exit status 13?*

* on some systems, e.g., Linux in the CIP pool

