Chapter 10: File-System Interface

• Chapter 10.1
  • File Concept
  • Access Methods
• Chapter 10.2
  • Directory Structure
  • File-System Mounting
  • File Sharing
  • Protection

Operating System Concepts 10.2 Silberschatz, Galvin and Gagne ©2005


Storage Management

• New block: File Systems, a.k.a. “Storage Management”
• An operating system is often described as a program that manages processes, processors, memory, and storage.
• That is, operating systems control and manage:
  • Processes (both user and system)
  • Processors (the CPUs)
  • Memory (primary, cache, …)
  • Storage (data, programs, directories used for access, etc.)


Storage Management - more
• Disk storage: the primary medium for online (secondary) storage.
  • Contains files: collections of related items defined by the file’s creator.
  • Files are normally grouped into directories for ease of use and reference.
  • Organized in a variety of structures.
• Disk access:
  • sometimes a character at a time; often a block at a time.
  • sometimes sequential; sometimes random.
• Some file systems are dedicated; some are shared.
• Some transfer data asynchronously; others synchronously.
• They differ greatly in speed, depending on the parameters cited above.
• This chapter: the File System Interface.


Objectives of this Chapter:

• To explain the function of file systems
• To describe the interfaces to file systems
• To discuss file-system design tradeoffs, including access methods, file sharing, file locking, and directory structures
• To explore file-system protection


File Concept
• A file system consists of two parts:
  • Files: the actual storage of data on a medium
    • Stored on a sequential or some kind of direct-access storage device.
  • Directory structure: organizes the information needed for access
    • Size, location, logical record length, block size, format, ownership, security, paths to files / directories, etc.
• A file may be defined as a contiguous logical address space, which the operating system maps onto some kind of physical device.
  • Note: ‘logical’ does not mean ‘physical.’
• Almost all storage devices are non-volatile (data remains when power is removed):
  • Magnetic tapes
  • Magnetic disks
  • Optical disks (CDs / DVDs)
  • Jump drives
  • … and others


File Concept
• To a user, a file is the smallest allotment of logical secondary storage. All data is written to a ‘file.’
• Data may be numeric, alphabetic, alphanumeric, or binary.
  • Can be free form (text)
  • Can be rigidly formatted: records.
    • Fixed-length records; variable-length records: Bright Lights application?
• Generally, a file is a sequence of
  • bits,
  • bytes,
  • lines, or
  • records
  … whose meaning is defined by the creator of the file and by how it is used.
• “One man’s program is another man’s data.”


File Concept (continued)
• Data files: many forms and structures
  • Differentiate between a file’s organization and how it may be accessed.
    • They are not the same.
• Program files:
  • Source programs
  • Object files
    • May not be directly executable
    • May be understandable by a ‘linker.’
  • Executable files
    • May be ready for the loader to bring into memory.
• Much of what distinguishes program files from data files is simply how they are used!


File Attributes
• Name: typically the only information kept in human-readable form
  • Usually independent of the process and system that created it,
  • save for possible extensions or types, such as .doc or .ppt.
  • But names are often constrained by the operational environment.
  • NIHP00 …: each position often means something very important in a commercial (non-academic) environment.
• Identifier: unique tag (number) that identifies the file within the file system
  • NIHP00: system code ‘IH’; source programs ‘N’; subsystem ‘P’
  • Programs within the subsystem: 00, 01, …
• Type: needed for systems that support different types
  • .c, .java, .cpp, .exe, .dll, .dat, .wpd, .doc, .xls, .css, …
  • Certain ‘processes’ are brought up to process these files, by type.
• Location: pointer to the file’s location on the device
• Size: current file size, in bytes or, more commonly, in blocks.
• Protection: controls who can read, write, or execute the file
  • Yes: read, write, execute.
• Time, date, and user identification: data for protection, security, and usage monitoring
  • Maybe date last accessed; OPR; security.
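As an illustration of the attributes listed on this slide, the sketch below asks the file system for a few of them through Java's standard java.nio.file API. The class name FileAttributesDemo, the helper describe(), and the temporary file are assumptions made for this example only; they are not part of the original notes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

public class FileAttributesDemo {
    /** Returns a one-line summary of a few attributes kept for the file. */
    public static String describe(Path file) throws IOException {
        // Size and timestamps come from the file's metadata, not its contents.
        BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
        return "name=" + file.getFileName()
                + " size=" + attrs.size() + " bytes"
                + " lastModified=" + attrs.lastModifiedTime()
                + " readable=" + Files.isReadable(file)
                + " writable=" + Files.isWritable(file);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch file, created only so there is something to inspect.
        Path tmp = Files.createTempFile("attributes-demo", ".dat");
        Files.write(tmp, new byte[] {1, 2, 3, 4, 5});
        System.out.println(describe(tmp));
        Files.delete(tmp);
    }
}
```

Which attributes are available beyond this basic set (owner, permission bits, and so on) depends on the underlying file system.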


File Operations
• A file is an “abstract data type.”
  • It has data, whose form depends on its implementation (how it is organized) and its use (how it is accessed and processed), and
  • file operations that can be performed on the data, which depend on how the file is implemented (accessed sequentially, randomly, etc.).
• Let’s look at the six basic operations that can be performed on most files.

Typical File Operations
• Create:
  • Space must be allocated for the file.
  • An entry is added to the disk directory; data is loaded onto the storage device.
• Write:
  • A system call supplies the name of the file and the data to be written.
  • A pointer is usually kept to the place where the next ‘item’ is to be written; the pointer is updated after each write.
• Read:
  • Another system call specifies the file name and the location in memory where the data read is to be placed and, using a pointer, locates the data to be read.
  • The pointer is updated to point to the ‘next’ item to be read.
• The pointer used for both read and write is called the ‘current file-position pointer.’


Typical File Operations – more
• Reposition within file:
  • Moves the file pointer to a specific position / record in the file.
  • Really, this is a file seek.
• Delete:
  • Using the directory, release the file’s space for reuse;
  • Clear the directory entry referring to this file.
• Truncate: often used when recreating a file …
  • Deletes the contents of the file but keeps its attributes.
  • The one attribute that changes is the file length:
  • it is reset to zero, and the file’s space is released.
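The six basic operations can be tried out with java.io.RandomAccessFile, which exposes the current file-position pointer directly. This is a minimal sketch, not a description of how any particular operating system implements the calls; the class name BasicFileOps and the method run() are made up for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class BasicFileOps {
    /** Runs create, write, reposition, read, truncate, delete; returns the byte read back. */
    public static int run(Path path) throws IOException {
        int value;
        // Create/open: mode "rw" creates the file if it does not exist yet.
        try (RandomAccessFile f = new RandomAccessFile(path.toFile(), "rw")) {
            f.write(new byte[] {10, 20, 30, 40}); // write: pointer advances from 0 to 4
            long afterWrite = f.getFilePointer();
            f.seek(2);                            // reposition (file seek)
            value = f.read();                     // read: byte at offset 2, pointer moves to 3
            f.setLength(0);                       // truncate: length reset to zero
            System.out.println("pointer after write = " + afterWrite
                    + ", length after truncate = " + f.length());
        }
        Files.delete(path);                       // delete: the directory entry is removed
        return value;
    }

    public static void main(String[] args) throws IOException {
        Path scratch = Files.createTempFile("basic-ops", ".bin"); // hypothetical scratch file
        System.out.println("byte read after seek(2): " + run(scratch));
    }
}
```

Note that truncation keeps the file (and its other attributes) in place; only the delete at the end removes it from the directory.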


Typical File Operations
• Other operations include:
  • Append data to the end of a file
  • Rename a file
  • Copy a file
  • Other file utilities: get the length of a file, get its attributes, etc.
  • Many OS utilities, such as printing files, allocating space, …
• Some systems open a file implicitly at first reference; others require an explicit open() system call (or a library call such as fopen()).
• Some systems close files automatically when the program terminates; others expect an explicit close().
  • My take: always close your files. Keep things clean.
• open() usually validates the desired mode (read, write, append, …), permissions, and more.
  • Then open() typically returns a pointer to the entry in the open-file table.


File Operations – The Process Itself

• In a multiprogramming environment, there is usually a per-process table (kept with the process control block, PCB) for each running process.
  • It holds the current file pointer for each file the process has opened.
• Interestingly, there is often a system-wide open-file table too, which lists the open files for all running processes.
  • Honeywell, UNISYS
  • PAT (peripheral allocation table) overflow …


Open File Tables
• So there’s an entry in a per-process table and an entry in a system-wide table.
• The system-wide table contains additional information, including an ‘open count.’
• When a file is opened by a process, an entry in that process’s open-file table points to the entry in the system-wide table.
• The system-wide table also keeps track of the processes that have the same file open, should more than a single process be accessing the file.
• close() decreases the open count. When the open count reaches zero, the file’s entry is removed from the system-wide table.
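The two-level bookkeeping described on this slide can be modelled in a few lines. This is a toy simulation under simplifying assumptions (a single map stands in for the system-wide table, and a per-process entry is just an object); the names OpenFileTables, SystemEntry, and ProcessEntry are invented for the sketch and do not correspond to any real kernel structure.

```java
import java.util.HashMap;
import java.util.Map;

public class OpenFileTables {
    /** System-wide entry: shared by every process that has the file open. */
    public static final class SystemEntry {
        final String name;
        int openCount;
        SystemEntry(String name) { this.name = name; }
    }

    /** Per-process entry: a private file pointer plus a reference to the shared entry. */
    public static final class ProcessEntry {
        final SystemEntry shared;
        long filePointer = 0;
        ProcessEntry(SystemEntry shared) { this.shared = shared; }
    }

    private final Map<String, SystemEntry> systemTable = new HashMap<>();

    /** open(): reuse or create the system-wide entry and bump its open count. */
    public ProcessEntry open(String name) {
        SystemEntry entry = systemTable.computeIfAbsent(name, SystemEntry::new);
        entry.openCount++;
        return new ProcessEntry(entry);
    }

    /** close(): decrement the count; drop the system-wide entry when it reaches zero. */
    public void close(ProcessEntry p) {
        if (--p.shared.openCount == 0) {
            systemTable.remove(p.shared.name);
        }
    }

    public int openCount(String name) {
        SystemEntry entry = systemTable.get(name);
        return entry == null ? 0 : entry.openCount;
    }

    public static void main(String[] args) {
        OpenFileTables os = new OpenFileTables();
        ProcessEntry p1 = os.open("payroll.dat"); // hypothetical file name
        ProcessEntry p2 = os.open("payroll.dat");
        p1.filePointer = 100;                     // p2's pointer is unaffected
        System.out.println("open count: " + os.openCount("payroll.dat"));
        os.close(p1);
        os.close(p2);
        System.out.println("open count after both close: " + os.openCount("payroll.dat"));
    }
}
```

Each process keeps its own file pointer, while the open count lives in the shared entry; that split is what lets two processes read the same file at different positions.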


Open File Basic Information

• Data needed to manage open files:
  • File pointer: pointer to the last read/write location, kept per process that has the file open
    • Note: this is needed on systems that do not include a file offset as part of the read() and write() operations.
    • It serves as the current file-position pointer.
  • File-open count: the number of times a file is currently open, so that its entry can be removed from the open-file table when the last process closes it
  • Disk location of the file: a cache of the information needed to access the data
  • Access rights: per-process access-mode information.
    • Each process opens a file in some access mode.


“Open File Locking”
• Provided by some operating systems and file systems
• Particularly useful for files that can be accessed by multiple applications at the same time.
• Mediates access to a file
  • Shared locks: used for reading.
  • Exclusive locks: needed for writing.
    • Only one process at a time can hold the exclusive lock.
• Some OSs provide only exclusive locking, which makes sense.


File Locking Example – Java API
import java.io.*;
import java.nio.channels.*;

public class LockingExample {
    public static final boolean EXCLUSIVE = false;
    public static final boolean SHARED = true;

    public static void main(String[] args) throws IOException {
        FileLock sharedLock = null;
        FileLock exclusiveLock = null;
        RandomAccessFile raf = null;
        try {
            raf = new RandomAccessFile("file.txt", "rw");
            // get the channel for the file
            FileChannel ch = raf.getChannel();
            long half = raf.length() / 2;

            // lock the first half of the file: exclusive
            // (lock takes a start position and a size, not an end position)
            exclusiveLock = ch.lock(0, half, EXCLUSIVE);
            /* Now modify the data . . . needs exclusive access! */
            // release the lock
            exclusiveLock.release();

            // lock the second half of the file: shared
            sharedLock = ch.lock(half, raf.length() - half, SHARED);
            /* Now read the data . . . */
            // release the lock
            sharedLock.release();
        } catch (IOException ioe) {
            System.err.println(ioe);
        } finally {
            // releasing a lock that was already released has no effect
            if (exclusiveLock != null)
                exclusiveLock.release();
            if (sharedLock != null)
                sharedLock.release();
            // close the file only after the locks are released
            if (raf != null)
                raf.close();
        }
    }
}


File Types – Name, Extension
We mentioned several file types earlier. Here are more samples.

A common approach to implementing file types is to include the type as part of the name: name.extension.

The file type tells the operating system which operations can be performed on the file; e.g., .com, .exe, and .bat files can be executed.

.com and .exe files are binary executable files; a .bat file is ASCII text consisting of a series of commands to the operating system.

Certain applications expect files sent to them to be of a certain type, such as .c, .java, or .doc.

File Types – Name, Extension - more
We are very familiar with file types, as we use them all the time.

“These” notes have the extension .ppt, for PowerPoint.

When I open this file by double-clicking on an icon or hot link representing the file, the specific application (PowerPoint) is automatically invoked. Windows has default associations of file types to applications.

Some OSs don’t require an extension and take an extension only as a ‘hint.’


File Structure
• File types indicate the internal structure of the file.
  • Files have structures expected by the programs that process them.
• Typically, there is information (often up front in the file) that the processing program needs to properly handle (load, process, display, etc.) the file in question.
  • It might include where the program is to be loaded, key words, the location of the first instruction, external symbols, and more.
• For every file type the operating system supports, it needs some code to recognize and support that type.
• But new applications may require information structured in ways the operating system does not support, and problems may occur (see the book).
  • This presents some interesting problems.


File Structure – not recognizable formats…
• We may develop an application that creates a file type incompatible with the file types the operating system recognizes and supports.
  • So, what to do?
• Some operating systems support a very limited set of structures and interpret files very simply as, say, a sequence of 8-bit bytes.
  • So ‘something’ must interpret these.
  • The OS allows such files but does not support them directly.
  • Thus each application must include code to interpret such an input file …
• Can you think of any? They are all around us!
  • If you are a Java person, look at all the various I/O options available! They are all a bit different.
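To make the point concrete, here is a sketch of an application that defines its own file format on top of a plain byte sequence and carries its own code to interpret it. The format (a tag, a record count, then the records) and the names RecordFormat, MAGIC, encode(), and decode() are all hypothetical, chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class RecordFormat {
    // Hypothetical tag placed "up front in the file" so the reader can recognize the format.
    static final int MAGIC = 0x52454331;

    /** Lays the values out as: tag, record count, then one 4-byte integer per record. */
    public static byte[] encode(int[] values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(MAGIC);
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        }
        return bytes.toByteArray();
    }

    /** The application, not the OS, turns the raw bytes back into records. */
    public static int[] decode(byte[] raw) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            if (in.readInt() != MAGIC) {
                throw new IOException("not a RecordFormat byte sequence");
            }
            int[] values = new int[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readInt();
            }
            return values;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = encode(new int[] {3, 25, 65});
        System.out.println(raw.length + " bytes on disk, " + decode(raw).length + " records");
    }
}
```

To the operating system these are just bytes; only the application knows that the first eight of them are a header.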


Internal File Structure
• Most systems have well-defined block sizes.
  • These usually depend on the organization of the disk: the sector size or some derivative of the track size.
  • We always read and write in blocks (physical records).
• For a specific file, all blocks are usually the same size, and a whole number of ‘logical records’ is packed into each block.
  • This number is called the ‘blocking factor’: BF = 100 means one hundred logical records per physical record (block).
  • Discuss.
• Why do we read/write ‘blocks’ instead of logical records?
  • Discuss.


Internal File Structure
• Some operating systems define files simply as streams of data bytes.
  • Here, each byte is individually addressable by its offset from the front (or end) of the file.
  • Logical record size = 1 byte.
  • But the system packs and unpacks these bytes into physical disk blocks of, say, 512 bytes per block.
• So,
  • the length of a logical record (the unit of a read() operation),
  • the physical block size (determined by sector size or track length), and
  • the packing technique
  together determine the number of logical records in a physical block (record).
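The relationship between record length, block size, and the number of records per block is simple integer arithmetic. The sketch below assumes fixed-length records that are never split across blocks; the class Blocking and its method names are invented for the example, and the 512-byte block with 100-byte records is just a convenient pair of figures.

```java
public class Blocking {
    /** Blocking factor: whole records that fit in one block (records are not split). */
    public static int blockingFactor(int blockSize, int recordSize) {
        return blockSize / recordSize;
    }

    /** Bytes left over in every block because only whole records are packed. */
    public static int unusedBytesPerBlock(int blockSize, int recordSize) {
        return blockSize % recordSize;
    }

    /** Block (numbered from 0) that holds a given record (numbered from 0). */
    public static long blockOf(long recordNumber, int blockingFactor) {
        return recordNumber / blockingFactor;
    }

    /** Blocks needed for a file of recordCount records; the last block may be partly empty. */
    public static long blocksNeeded(long recordCount, int blockingFactor) {
        return (recordCount + blockingFactor - 1) / blockingFactor;
    }

    public static void main(String[] args) {
        int bf = blockingFactor(512, 100);
        System.out.println("blocking factor: " + bf);
        System.out.println("unused bytes per block: " + unusedBytesPerBlock(512, 100));
        System.out.println("record 12 lives in block " + blockOf(12, bf));
        System.out.println("23 records need " + blocksNeeded(23, bf) + " blocks");
    }
}
```

With these figures, one physical read brings five logical records into memory, which is the main reason I/O is done in blocks rather than record by record.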


Internal File Structure
• Files, nonetheless, are treated as a series of blocks (whatever their size), and the physical I/O behind every logical read() and write() takes place in blocks.
• The ‘first’ read() or write() does not transfer a single logical record.
  • It typically transfers from IRG to IRG (IBG to IBG), that is, between inter-record (inter-block) gaps, or from sector boundary to sector boundary. Much more later.
  • A 512-byte sector, for example, holds five 100-character records …
• Subsequent read() or write() operations (typically) just move a pointer to the next logical record in the block, which sits in the process’s address space.
  • Thus only the physical read of a block results in a physical disk access.
• Naturally, there is likely some internal fragmentation in the last block allocated to a file.
• Data in a file can be accessed in several, but restricted, ways, often dependent upon the file’s ‘organization.’

Access Methods
• Sequential access of a sequentially organized file is the simplest form.
• Information is processed in order, one logical record after another.
• Operations are typically some form of read() or write():
  • read next: reads a record and advances the file pointer.
  • write next: appends to the end of the file and advances the file pointer to the new end of file.
  • reset: some files can be reset to a certain position.
  • Others …
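The three sequential operations can be modelled with an in-memory list standing in for the file. This is only a sketch of the access pattern, assuming that reset returns to the beginning; the class SequentialFile and its method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class SequentialFile {
    private final List<String> records = new ArrayList<>();
    private int cp = 0; // current position, counted in logical records

    /** write next: append at the end of the file and move the pointer to the new end. */
    public void writeNext(String record) {
        records.add(record);
        cp = records.size();
    }

    /** read next: return the record at the pointer and advance; null at end of file. */
    public String readNext() {
        return cp < records.size() ? records.get(cp++) : null;
    }

    /** reset: reposition to the beginning (an assumption made for this sketch). */
    public void reset() {
        cp = 0;
    }

    public static void main(String[] args) {
        SequentialFile f = new SequentialFile();
        f.writeNext("record A");
        f.writeNext("record B");
        f.reset();
        for (String r = f.readNext(); r != null; r = f.readNext()) {
            System.out.println(r);
        }
    }
}
```

There is deliberately no way to jump to record n; reaching it means reading past everything before it.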


Access Methods
• Direct access:
  • The file typically consists of fixed-length logical records.
  • The file is viewed as a numbered sequence of blocks (records).
  • Access may be random; sometimes sequential.
• To retrieve a logical record, a ‘key’ of some sort is derived for it; from the key a block address is computed, and the block containing the logical record is read.
• Blocks are stored according to some kind of key (such as SSAN or account number), and the disk address can be computed by a variety of algorithms.
• Typically we can read any block at random, given its disk address.


Access Methods
• There are many ways that direct access can be effected.
  • Some direct-access approaches allow the programmer to compute a CCTTRR number;
  • Others require the application to compute a ‘relative record number,’ starting with record 0, the first record in the file.
• IBM uses VSAM (Virtual Storage Access Method). Terms:
  • ESDS: Entry-Sequenced Data Set, for sequential files.
  • KSDS: Key-Sequenced Data Set, for indexed sequential files.
    • Uses a primary key such as SSAN or account number, which is then mapped to a physical disk address.
  • RRDS: Relative Record Data Set.
    • Here we compute a relative record number (an integer) algorithmically.
• More later.
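Access by relative record number can be sketched with java.io.RandomAccessFile: with fixed-length records, record n starts at byte n times the record length, so one seek reaches it. The 16-byte record length, the space padding, and the names RelativeRecordFile, writeRecord(), and readRecord() are assumptions made for this example; this is not VSAM.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class RelativeRecordFile {
    static final int RECORD_LENGTH = 16; // hypothetical fixed record length in bytes

    /** Writes text, padded with spaces (or cut) to RECORD_LENGTH, as record number rrn. */
    public static void writeRecord(RandomAccessFile f, long rrn, String text) throws IOException {
        byte[] record = new byte[RECORD_LENGTH];
        Arrays.fill(record, (byte) ' ');
        byte[] data = text.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(data, 0, record, 0, Math.min(data.length, RECORD_LENGTH));
        f.seek(rrn * RECORD_LENGTH); // record 0 is the first record in the file
        f.write(record);
    }

    /** Reads record number rrn directly, without touching the records before it. */
    public static String readRecord(RandomAccessFile f, long rrn) throws IOException {
        byte[] record = new byte[RECORD_LENGTH];
        f.seek(rrn * RECORD_LENGTH);
        f.readFully(record);
        return new String(record, StandardCharsets.US_ASCII).trim();
    }

    public static void main(String[] args) throws IOException {
        Path scratch = Files.createTempFile("relative", ".dat"); // hypothetical scratch file
        try (RandomAccessFile f = new RandomAccessFile(scratch.toFile(), "rw")) {
            writeRecord(f, 0, "alpha");
            writeRecord(f, 1, "beta");
            writeRecord(f, 2, "gamma");
            System.out.println("record 2: " + readRecord(f, 2));
        } finally {
            Files.delete(scratch);
        }
    }
}
```

The operating system still moves whole blocks underneath; the seek only changes which bytes of those blocks the program sees.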


Sequential-access File

cp = current position

Here’s a visual for, perhaps, a tape drive.


For sequential files, access is always sequential as shown above.



Simulation of Sequential Access on a Direct-access File

Some, but not all, direct-access file types permit sequential processing.

File organizations that permit both sequential and random access support random queries for retrievals as well as sequential processing for other requirements, such as reports.

Indexed sequential files (ahead) support both random and sequential access.

Direct-access files normally support only random access (more ahead).


Example of Index Organization and Random Access

This organization requires an index, which contains pointers to the various blocks.

Access requires a search of the index followed by the retrieval of a record from the file.

Logical records are contained within blocks, and it is blocks that are read and written. So, once a block is read from disk, the logical records within it are read sequentially to see whether the desired record lies within the block.

Typically the index (in the figure) holds primary keys and block numbers. Only the highest key in each block is listed, so we are not certain that the desired logical record is actually in the block until the block is retrieved and searched.


Example of Index Organization and Random Access

An indexed sequential file is sorted (ordered) on some index or primary key, such as name (as in the figure) or account number; the key must be unique.

An index of primary keys and disk addresses (kept in memory while the file is active) is then used to locate a logical record. (Actually, each disk address points to a block where the desired logical record ‘may’ be.)

Multiple keys may be used to search the file for a desired record.
Example: the file must be ordered on a unique primary key, such as account number, but we may also retrieve on a unique or non-unique secondary key, such as phone number (unique) or name (non-unique).

Example of Index Organization and Random Access

For very large files, we may have levels of indexes (coarse index and fine index; or index sets, sequence sets, and data sets in IBM terms).

One or more of these indices may be kept in primary memory to reduce I/Os when attempting to access a record.

The indices are searched via a binary search; the retrieved block is then searched sequentially for the desired logical record.
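The two-step lookup (binary search of the index, then a sequential scan of one block) can be sketched as follows. The example assumes a single-level index that stores the highest key of each block, with integer keys and in-memory arrays standing in for disk blocks; the class IndexedLookup and its method names are hypothetical.

```java
import java.util.Arrays;

public class IndexedLookup {
    /**
     * Binary-searches the index, where highestKeys[i] is the highest key in block i,
     * and returns the only block that could hold the key, or -1 if the key is too large.
     */
    public static int findBlock(int[] highestKeys, int key) {
        int pos = Arrays.binarySearch(highestKeys, key);
        // A negative result encodes the insertion point: the first block whose highest key exceeds the key.
        int block = pos >= 0 ? pos : -pos - 1;
        return block < highestKeys.length ? block : -1;
    }

    /** Fetches the candidate block and scans it sequentially for the key. */
    public static boolean contains(int[][] blocks, int[] highestKeys, int key) {
        int block = findBlock(highestKeys, key);
        if (block < 0) {
            return false;
        }
        for (int k : blocks[block]) { // stands in for reading one block from disk
            if (k == key) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[][] blocks = { {3, 8, 15}, {21, 30, 42}, {57, 60, 99} };
        int[] index = {15, 42, 99};
        System.out.println("key 30 found: " + contains(blocks, index, 30));
        System.out.println("key 31 found: " + contains(blocks, index, 31));
    }
}
```

Key 31 shows why the index alone is not conclusive: it points to the second block, and only the scan of that block reveals that the record is absent.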


Example of Relative Files

Relative files are another kind of direct-access file, one not intended for sequential access.

Because of the way the records in the file are created, sequential access, though possible, makes little sense. We typically take a field within a logical record and, based on that field, computationally determine (usually ‘hash’) a relative record number: an integer that gives the ‘relative’ displacement of the logical record from the beginning of the file.

In a relative file, the key to an individual record is thus usually a computed integer, such as 3, 25, 65, or 234, and is not related to the order in which the record was added to the file.

Again, there are other direct-access file types besides indexed sequential and relative files.

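One very simple way to compute a relative record number from a key field is a remainder hash. The sketch below assumes a numeric key (an account number, say) and a file with a fixed number of record slots; the class RelativeHash, the method name, and the slot count are all hypothetical, and real systems use a variety of hashing schemes and must also handle collisions.

```java
public class RelativeHash {
    /** Maps a numeric key to a relative record number in the range 0 .. slots - 1. */
    public static int relativeRecordNumber(long key, int slots) {
        // floorMod keeps the result non-negative even for a negative key.
        return (int) Math.floorMod(key, (long) slots);
    }

    public static void main(String[] args) {
        int slots = 101; // hypothetical number of record slots in the file
        System.out.println("key 234 -> record " + relativeRecordNumber(234, slots));
        System.out.println("key 335 -> record " + relativeRecordNumber(335, slots));
    }
}
```

Keys 234 and 335 land on the same relative record number here, which is exactly the collision case a real relative file has to resolve, and it also shows why the stored order bears no relation to the order of insertion.
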
End of Chapter 10.1
