Unit-4 Os
UNIT-4
UNIT IV: FILE AND I/O SYSTEMS:
File System: File concept, Access Methods, Directory
Structure, File System Structure, i-node, File System
Implementation, Directory Implementation, Allocation
Methods.
I/O System: I/O Hardware, Application I/O Interface,
Kernel I/O subsystem.
File concept:
File:
A file is a named collection of related information that is
recorded on secondary storage such as magnetic disks,
magnetic tapes and optical disks. In general, a file is a
sequence of bits, bytes, lines or records whose meaning
is defined by the file's creator and user.
File Structure:
A file structure must follow a format that the operating
system can understand.
A file has a certain defined structure according to
its type.
A text file is a sequence of characters organized
into lines.
A source file is a sequence of procedures and
functions.
An object file is a sequence of bytes organized
into blocks that are understandable by the
machine.
When an operating system defines different file
structures, it must also contain the code to support
them. UNIX and MS-DOS support a
minimum number of file structures.
File Attributes:
A file's attributes vary from one operating system to
another but typically consist of these:
1. Name:
The symbolic file name is the only information kept in
human readable form.
2. Identifier:
This unique tag, usually a number, identifies the file
within the File system; it is the non-human-readable
name for the file.
3. Type:
This information is needed for systems that support
different types of files.
4. Location:
This information is a pointer to a device and to the
location of the file on that device.
5. Size:
The current size of the file (in bytes, words, or blocks)
and possibly the maximum allowed size are included in
this attribute.
6. Protection:
Access-control information determines who can do
reading, writing, executing, and so on.
7. Time, date, and user identification:
This information may be kept for creation, last
modification, and last use. These data can be useful for
protection, security, and usage monitoring.
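As a rough illustration of the attributes listed above, the following Python sketch reads them for a temporary file using the standard `os.stat` call. The helper names `describe_file` and `demo_attributes` and the file name `notes.txt` are made up for this example; the exact fields available vary between operating systems.

```python
import os
import stat
import tempfile
import time


def describe_file(path):
    """Return a dictionary of common attributes for the file at `path`."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),              # human-readable name
        "identifier": info.st_ino,                   # inode number on Unix-like systems
        "type": "directory" if stat.S_ISDIR(info.st_mode) else "regular file",
        "size_bytes": info.st_size,                  # current size in bytes
        "protection": stat.filemode(info.st_mode),   # e.g. '-rw-------'
        "last_modified": time.ctime(info.st_mtime),
        "last_accessed": time.ctime(info.st_atime),
    }


def demo_attributes():
    """Create a small temporary file and describe it."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "notes.txt")
        with open(path, "wb") as handle:
            handle.write(b"unit four")
        return describe_file(path)
```

Calling `demo_attributes()` returns the name, identifier, type, size, protection string and timestamps of the temporary file.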
File Operations:
Create
Write
Read
Reposition within file
Delete
Truncate
Open
Close
1. Creating a file:
Two steps are necessary to create a file.
First, space in the file system must be found for the file.
Second, an entry for the new file must be made in the
directory.
2. Writing a file:
To write a file, we make a system call specifying both the name
of the file and the information to be written to the file.
Given the name of the file, the system searches the
directory to find the file's location. The system must keep
a write pointer to the location in the file where the next
write is to take place. The write pointer must be updated
whenever a write occurs.
3. Reading a file:
To read from a file, we use a system call that specifies
the name of the file and where (in memory) the next
block of the file should be put. Again, the directory is
searched for the associated entry, and the system needs to
keep a read pointer to the location in the file where the
next read is to take place. Once the read has taken place,
the read pointer is updated.
Because a process is usually either reading from or
writing to a file, the current operation location can be
kept as a per-process current file position pointer. Both
the read and write operations use this same pointer,
saving space and reducing system complexity.
4. Repositioning within a file:
The directory is searched for the appropriate entry, and
the current-file-position pointer is
repositioned to a given value. Repositioning within a file
need not involve any actual I/O. This file operation is
also known as a file seek.
5. Deleting a file:
To delete a file, we search the directory for the named
file. Having found the associated directory entry, we
release all file space, so that it can be reused by other
files, and erase the directory entry.
6. Truncating a file:
The user may want to erase the contents of a file but keep
its attributes. Rather than forcing the user to delete the
file and then recreate it, this function allows all attributes
to remain unchanged—except for file length—but lets
the file be reset to length zero and its file space released.
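The operations described above can be tried out with Python's thin wrappers around the operating system's file calls. This is only a sketch: the function name `demo_file_operations` and the file name `demo.txt` are invented for the example, and a real program would add more error handling.

```python
import os
import tempfile


def demo_file_operations():
    """Create, write, reposition, read, truncate, close and delete a file."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.txt")

    # Create + open: space is found and a directory entry is made.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        os.write(fd, b"hello world")       # write advances the file position to 11
        os.lseek(fd, 6, os.SEEK_SET)       # reposition (seek); no data is transferred
        tail = os.read(fd, 5)              # read uses the same position pointer
        os.ftruncate(fd, 0)                # truncate: length becomes 0, attributes stay
        size_after_truncate = os.fstat(fd).st_size
    finally:
        os.close(fd)                       # close releases the open-file entry

    os.remove(path)                        # delete: free the space, erase the entry
    os.rmdir(directory)
    return tail, size_after_truncate, os.path.exists(path)
```

Note that a single position pointer serves both the write and the read, matching the per-process current-file-position pointer described earlier.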
File types:
Access Methods:
When a file is used, its information must be accessed and
read into computer memory, and there are several ways to
access that information. Some systems provide only
one access method for files. Other systems, such as those
of IBM, support many access methods, and choosing the
right one for a particular application is a major design
problem.
There are three ways to access a file in a computer
system:
Sequential Access
Direct Access
Indexed Sequential Method
Sequential Access:
It is the simplest access method. Information in the file is
processed in order, one record after the other. This mode
of access is by far the most common; for example, editors
and compilers usually access files in this fashion.
Key points:
Data is accessed one record after another, in
order.
A read operation reads the next record and moves
the pointer ahead by one record.
A write operation allocates space, appends to the
end of the file, and moves the pointer to the new
end of the file.
Such a method is reasonable for tape.
searched sequentially to find a specific record,
which can be time-consuming.
Direct Access:
Another method is direct access, also known as the
relative access method. The file is made up of fixed-length
logical records, which allows programs to read and write
records rapidly in no particular order. Direct access is
based on the disk model of a file, since a disk allows
random access to any file block. For direct access, the
file is viewed as a numbered sequence of blocks or
records. Thus, we may read block 14, then read block 59,
and then write block 17. There is no restriction on the
order of reading and writing for a direct access file.
A block number provided by the user to the operating
system is normally a relative block number: the first
relative block of the file is 0, the next is 1, and so on.
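A minimal sketch of direct access with fixed-length records follows. The record length `RECORD_SIZE` and the helper names are assumptions made for this example, and an in-memory `io.BytesIO` object stands in for a disk file. The key idea is that relative record n starts at byte offset n × record size, so no earlier record has to be read first.

```python
import io

RECORD_SIZE = 16  # hypothetical fixed record length in bytes


def write_record(f, n, payload):
    """Write `payload` into relative record n (the first record is 0)."""
    if len(payload) > RECORD_SIZE:
        raise ValueError("payload does not fit in one record")
    f.seek(n * RECORD_SIZE)                       # jump straight to the record
    f.write(payload.ljust(RECORD_SIZE, b"\x00"))  # pad to the fixed length


def read_record(f, n):
    """Read relative record n without touching the records before it."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).rstrip(b"\x00")


def demo_direct_access():
    f = io.BytesIO(bytes(RECORD_SIZE * 60))       # a file of 60 empty records
    write_record(f, 14, b"fourteen")
    write_record(f, 59, b"fifty-nine")
    first = read_record(f, 14)                    # read block 14 ...
    second = read_record(f, 59)                   # ... then block 59 ...
    write_record(f, 17, b"seventeen")             # ... then write block 17
    return first, second, read_record(f, 17)
```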
Advantages of Direct Access Method:
Files can be accessed immediately, which decreases
the average access time.
In the direct access method, accessing a block does
not require traversing all the blocks that come
before it.
Directory Structure:
What is Directory:
On a computer, a directory is used to store, arrange, and
segregate files and folders. It is similar to a telephone
directory in that it contains only a list of names and
information about them (like phone numbers and
addresses) rather than the items themselves. It
uses a hierarchical structure to organise files and
directories. On many computers, directories are referred
to as drawers or folders, by analogy with a
standard filing cabinet in an office. You may, for
instance, create one directory for images and another for
all of your documents. By saving particular file types in
particular folders, you can easily find the type of file
you want.
There are several logical structures of a directory; these
are given below.
Single level directory
Two-level directory
Tree structure or hierarchical directory
Acyclic graph directory
General graph directory structure
The main advantage of a single-level directory is
that it is very simple to implement.
Simple operations like file creation, search,
deletion, and updating are possible with a single-
level directory structure.
The single-level directory is easy to understand
in practice.
Disadvantages of single-level directory:
If the number of files is very large, searching a
particular file is very inefficient.
Segregation of important and unimportant files is
not possible.
The single-level directory is not useful for multi-
user systems.
Two-level directory:
We saw how the single-level directory proves to be
inefficient if multiple users are accessing the system. If
two different users want to create a file with the same
name (say report.doc), a single-level directory does not
allow it.
In a two-level directory structure, there is a master node
that has a separate directory for each user. Each user can
store the files in that directory. It can be practically
thought of as a folder that contains many folders, each
for a particular user, and now each user can store files in
the allocated directory just like a single level directory.
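The two-level idea can be sketched with nested dictionaries: a master directory maps each user to that user's own file directory. The class name `TwoLevelDirectory`, its methods, and the user names and locations in the test are all hypothetical, chosen only to illustrate the structure described above.

```python
class TwoLevelDirectory:
    """Master directory: user name -> user file directory (file name -> location)."""

    def __init__(self):
        self.master = {}

    def add_user(self, user):
        # Each user gets a separate, initially empty, directory.
        self.master.setdefault(user, {})

    def create_file(self, user, name, location):
        user_directory = self.master[user]
        if name in user_directory:
            # Names must still be unique inside one user's directory.
            raise FileExistsError(f"{user} already has a file named {name}")
        user_directory[name] = location

    def lookup(self, user, name):
        # The search is confined to the named user's directory.
        return self.master[user][name]
```

Two users can now both create `report.doc`, because each name is resolved inside its owner's directory only.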
Disadvantages of two-level directory:
One user cannot share a file with another user.
Even though it allows multiple users, a user still
cannot group files of the same type together within
the user directory.
It does not allow users to create subdirectories.
This is how things work on our PCs. We can store some
files inside a folder and also create multiple folders
inside a folder.
Disadvantages of tree-structured directory:
As one user cannot enter another user’s directory,
this restricts sharing of files.
Too many subdirectories may make the search
complicated.
Users cannot modify the root directory’s data.
The solution to this problem is presented by the acyclic-
graph directory. In this type of directory, we can access a
file or a subdirectory from multiple directories. Hence
files can be shared between directories. It is designed in
such a way that multiple directories point to a particular
directory or file with the help of links.
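The sharing-by-links idea can be modelled in a few lines. In this sketch two directories hold entries that refer to the same file object; the link counter is an assumption borrowed from how UNIX hard links are commonly described, not something stated in these notes, and the class names are invented for the example.

```python
class FileNode:
    """A file's contents plus a count of directory entries pointing at it."""

    def __init__(self, data):
        self.data = data
        self.link_count = 0


class Directory:
    def __init__(self):
        self.entries = {}              # name -> FileNode

    def link(self, name, node):
        """Add an entry for an existing file; the file itself is not copied."""
        self.entries[name] = node
        node.link_count += 1

    def unlink(self, name):
        """Remove one entry; the file survives while other links remain."""
        node = self.entries.pop(name)
        node.link_count -= 1
        return node.link_count


def demo_shared_file():
    shared = FileNode("draft v1")
    alice, bob = Directory(), Directory()
    alice.link("plan.txt", shared)
    bob.link("team-plan.txt", shared)            # second path to the same file
    alice.entries["plan.txt"].data = "draft v2"  # a change made through one path ...
    seen_by_bob = bob.entries["team-plan.txt"].data  # ... is visible through the other
    remaining = alice.unlink("plan.txt")
    return seen_by_bob, remaining
```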
General-graph directory:
This is an extension to the acyclic-graph directory. In the
general-graph directory, there can be a cycle inside a
directory.
It costs more than alternative solutions.
Garbage collection is an essential step here.
The image shown below elaborates how the file system
is divided into different layers, and the functionality of
each layer.
in order to store and retrieve the files, the logical
blocks need to be mapped to physical blocks.
This mapping is done by the file organization
module. It is also responsible for free space
management.
Once the file organization module has decided which
physical block the application program needs, it
passes this information to the basic file system. The
basic file system is responsible for issuing the
commands to I/O control in order to fetch those
blocks.
I/O control contains the code used to access the
hard disk. This code is known as the device
drivers. I/O control is also responsible
for handling interrupts.
i-node:
In UNIX-based operating systems, each file is indexed by
an inode. Inodes are special disk blocks that are
created when the file system is created. The number
of files or directories in a file system depends on the
number of inodes in the file system.
An inode includes the following information:
Attributes (permissions, time stamp, ownership
details, etc.) of the file.
A number of direct pointers, which point to the
first 12 blocks of the file.
A single indirect pointer, which points to an index
block. If the file cannot be indexed entirely by the
direct pointers, then the single indirect pointer is
used.
A double indirect pointer, which points to a disk
block that holds pointers to disk blocks that are
themselves index blocks. The double indirect
pointer is used if the file is too big to be indexed
entirely by the direct pointers and the single
indirect pointer.
A triple indirect pointer, which points to a disk
block that holds a collection of pointers. Each of
those pointers points to a disk block that also
holds a collection of pointers, each of which
points to an index block that
contains the pointers to the file blocks.
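To make the pointer levels concrete, the sketch below works out which level of the inode covers a given logical block of a file. The block size (4096 bytes) and pointer size (4 bytes) are assumptions picked for the example; only the 12 direct pointers come from the description above.

```python
def inode_limits(block_size=4096, pointer_size=4, direct_count=12):
    """Return the first logical block NOT covered by each level, in order."""
    pointers_per_block = block_size // pointer_size
    direct_limit = direct_count
    single_limit = direct_limit + pointers_per_block
    double_limit = single_limit + pointers_per_block ** 2
    triple_limit = double_limit + pointers_per_block ** 3
    return direct_limit, single_limit, double_limit, triple_limit


def classify_block(logical_block, **layout):
    """Name the inode pointer level that reaches `logical_block`."""
    if logical_block < 0:
        raise ValueError("logical block numbers start at 0")
    levels = ("direct", "single indirect", "double indirect", "triple indirect")
    for level, limit in zip(levels, inode_limits(**layout)):
        if logical_block < limit:
            return level
    raise ValueError("block is beyond the largest file this inode can index")


def max_file_blocks(**layout):
    """Largest number of data blocks one inode can index."""
    return inode_limits(**layout)[-1]
```

With these assumed sizes, each index block holds 1024 pointers, so blocks 0 to 11 are direct, the next 1024 use the single indirect pointer, and so on.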
File system implementation is the process of designing,
developing, and implementing the software components
that manage the organization, allocation, and access to
files on a storage device in an operating system.
deletion involves removing the file from the disk and
releasing the space it occupies. In some file systems,
deleted files may be recoverable if they have not been
overwritten.
within the file. These operations are useful for random
access and manipulation of specific portions of a file.
Implementation Issues
Disk space management
Consistency checking and error recovery
File locking and concurrency control
Performance optimization
Directory Implementation:
Directory implementation in the operating system can be
done using a singly linked list or a hash table. The
efficiency, reliability, and performance of a file system
are greatly affected by the selection of
directory-allocation and directory-management algorithms.
There are numerous ways in which directories can be
implemented, so we need to choose a directory
implementation algorithm that enhances the
performance of the system.
After searching we can delete that file by
releasing the space allocated to it.
To reuse the directory entry we can mark that
entry as unused or we can append it to the list of
free directories.
To delete a file linked list is the best choice as it
Directory Implementation using Hash Table:
An alternative data structure that can be used for
directory implementation is a hash table. It overcomes
the major drawbacks of directory implementation using a
linked list. In this method, the linked list still stores
the directory entries, but a hash table is used in
combination with it to locate them.
entire list will not be searched on every operation. Using
the keys the hash table entries are checked and when the
file is found it is fetched.
Disadvantage:
The major drawback of using a hash table is that it
generally has a fixed size, and the hash function depends
on that size. Even so, this method is usually faster than
a linear search through an entire directory using a linked list.
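A small sketch of a hashed directory follows. It uses a fixed number of buckets, each holding a short list of entries, so a lookup scans one bucket rather than the whole directory. The class name `HashedDirectory`, the bucket count of 8, and the use of Python's built-in `hash` are choices made for this illustration only.

```python
class HashedDirectory:
    """Directory entries chained in buckets selected by a hash of the file name."""

    def __init__(self, bucket_count=8):
        # Fixed size: the hash function below depends on this number.
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, name):
        return self.buckets[hash(name) % len(self.buckets)]

    def add(self, name, location):
        bucket = self._bucket(name)
        for entry in bucket:
            if entry[0] == name:
                raise FileExistsError(name)
        bucket.append((name, location))   # colliding names share one short list

    def find(self, name):
        for entry_name, location in self._bucket(name):
            if entry_name == name:
                return location
        raise FileNotFoundError(name)

    def remove(self, name):
        bucket = self._bucket(name)
        for i, (entry_name, _) in enumerate(bucket):
            if entry_name == name:
                del bucket[i]
                return
        raise FileNotFoundError(name)
```

Because the table has a fixed number of buckets, a directory that grows far beyond that size ends up with long chains, which is the size dependency mentioned above.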
Allocation Methods:
The allocation methods define how the files are stored in
the disk blocks. There are three main disk space or file
allocation methods.
Contiguous Allocation
Linked Allocation
Indexed Allocation
The main idea behind these methods is to provide:
Efficient disk space utilization.
Fast access to the file blocks.
All three methods have their own advantages and
disadvantages, as discussed below:
Contiguous Allocation
In this scheme, each file occupies a contiguous set of
blocks on the disk. For example, if a file requires n
blocks and is given block b as the starting location,
then the blocks assigned to the file will be b, b+1, b+2,
..., b+n-1. This means that given the starting block
address and the length of the file (in terms of blocks
required), we can determine the blocks occupied by the
file.
The directory entry for a file with contiguous allocation
contains
Address of starting block
Length of the allocated portion.
The file ‘mail’ in the following figure starts from
block 19 with length = 6 blocks. Therefore, it occupies
blocks 19, 20, 21, 22, 23 and 24.
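The arithmetic of contiguous allocation is simple enough to check directly. In this sketch the function names are invented for the example, and k counts blocks within the file starting from 0.

```python
def contiguous_blocks(start, length):
    """All disk blocks of a file that starts at `start` and spans `length` blocks."""
    return list(range(start, start + length))


def block_address(start, length, k):
    """Disk address of the k-th block of the file (k = 0 is the first block)."""
    if not 0 <= k < length:
        raise IndexError("block number is outside the file")
    return start + k
```

For the file ‘mail’ (start 19, length 6), `contiguous_blocks(19, 6)` gives blocks 19 through 24, and `block_address(19, 6, 3)` gives 22 without visiting any other block.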
Advantages:
Both sequential and direct access are supported.
For direct access, the address of the kth block of
a file that starts at block b can easily be obtained
as (b+k), counting the first block as k = 0.
Access is extremely fast since the number of seeks
is minimal because of the contiguous allocation of
file blocks.
Disadvantages:
This method suffers from both internal and
external fragmentation. This makes it inefficient
in terms of disk space utilization.
Increasing file size is difficult because it depends
on the availability of contiguous free space at a
particular instant.
Advantages:
This is very flexible in terms of file size. File size
can be increased easily since the system does not
have to look for a contiguous chunk of disk space.
This method does not suffer from external
fragmentation. This makes it relatively better in
terms of disk space utilization.
Disadvantages:
Because the file blocks are distributed randomly
on the disk, a large number of seeks are needed to
access every block individually. This makes
linked allocation slower.
Pointers required in the linked allocation incur
some extra overhead.
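The cost of following pointers can be seen in a small model of linked allocation. This sketch assumes the usual scheme in which each block stores its data together with the number of the next block; the disk is modelled as a dictionary and the block numbers in the demo are made up.

```python
def read_linked_file(disk, start_block):
    """Follow the chain from `start_block`, returning the data and blocks visited."""
    contents = []
    visited = 0
    block = start_block
    while block is not None:
        data, next_block = disk[block]   # every step is a separate disk access
        contents.append(data)
        visited += 1
        block = next_block               # the pointer is stored inside the block
    return contents, visited


def demo_linked_file():
    # Hypothetical disk: block number -> (data, pointer to next block or None).
    disk = {
        9: ("A", 16),
        16: ("B", 1),
        1: ("C", 10),
        10: ("D", 25),
        25: ("E", None),
    }
    return read_linked_file(disk, 9)
```

Reaching the last block requires visiting every block before it, and each block gives up some space to hold its pointer.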
Indexed Allocation
In this scheme, a special block known as the Index block
contains the pointers to all the blocks occupied by a file.
Each file has its own index block. The ith entry in the
index block contains the disk address of the ith file block.
The directory entry contains the address of the index
block as shown in the image:
Advantages:
This supports direct access to the blocks occupied
by the file and therefore provides fast access to
the file blocks.
It overcomes the problem of external
fragmentation.
Disadvantages:
The pointer overhead for indexed allocation is
greater than that of linked allocation.
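For comparison, here is a sketch of the indexed scheme described in this section: the directory entry gives the address of the index block, and the ith entry of the index block gives the disk address of the ith file block. The dictionary-based disk and the block numbers are assumptions for illustration.

```python
def read_indexed_block(disk, index_block_address, i):
    """Fetch the i-th block of a file through its index block (two lookups)."""
    index_block = disk[index_block_address]   # list of disk addresses
    return disk[index_block[i]]


def demo_indexed_file():
    # Hypothetical disk: block 19 is the index block for a five-block file.
    disk = {
        19: [9, 16, 1, 10, 25],
        9: "A", 16: "B", 1: "C", 10: "D", 25: "E",
    }
    return read_indexed_block(disk, 19, 2), read_indexed_block(disk, 19, 4)
```

Any block is reached in two lookups regardless of its position in the file, at the price of dedicating a whole block to pointers.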
I/O Hardware:
I/O hardware is the set of specialized hardware devices
that the operating system uses to access disk drives,
printers, and other peripherals. These devices are
attached to the motherboard and connected to the
processor by a bus. They often have specialized
controllers that allow them to respond quickly to requests
from the software that drives them or, in some cases, to
commands issued by an application program. This section
discusses I/O hardware basics such as the daisy chain,
expansion bus, controller, memory-mapped I/O, and Direct
Memory Access (DMA).
A daisy chain, an expansion bus, a controller, and a host
adapter are all used to reach the I/O hardware.
A daisy chain connects multiple I/O devices one after
another: device A is cabled to device B, device B to
device C, and only one device in the chain plugs into a
port on the computer. The chain as a whole usually
operates as a bus. An expansion bus connects relatively
slow peripheral devices to the system over a shared set
of wires, so that several devices can be attached without
a separate connection to the processor for each (one per
device). This design allows more than one peripheral
device to be connected at the same time.
A controller is the electronics that operates a port, a
bus, or a device. It manages all incoming data from its
associated port and sends outgoing commands through
that port; it acts as the interface between software and
hardware components such as disk drives or network
adapters.
A host adapter is a controller built as a separate
circuit board that plugs into the computer. It bridges the
system bus and the devices attached to it, and lets
multiple devices be connected at the same time.
1. Polling
Polling is a software technique in which a program
repeatedly checks the status of a device. The device can
be a disk drive or any other peripheral device in the
computer. The program polls the device for information,
such as whether it has data available. Polling can be a
slow way to get data from a device because the program
must keep checking (busy-waiting) until the device
changes state, and it does no other useful work in the
meantime.
In some cases polling may be desirable; for example,
when several items are polled together and only one item
updates its state at any time, the rest simply continue
waiting until that item signals that it has finished
updating (which could take seconds).
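A polling loop can be sketched with a simulated device. Everything here is hypothetical: `SimulatedDevice` stands in for a real status register, and `ready_after` controls how many status checks report "not ready" before data becomes available.

```python
class SimulatedDevice:
    """Stand-in for a device with a status flag and a data register."""

    def __init__(self, ready_after, data):
        self._remaining = ready_after
        self._data = data

    def status_ready(self):
        # Reports busy for the first `ready_after` checks, then ready.
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def read_data(self):
        return self._data


def poll_until_ready(device, max_checks=1000):
    """Busy-wait on the status flag; return the data and the number of checks made."""
    checks = 0
    while checks < max_checks:
        checks += 1
        if device.status_ready():
            return device.read_data(), checks
    raise TimeoutError("device never became ready")
```

The count of checks returned by `poll_until_ready` is a rough measure of the CPU time spent waiting instead of doing other work.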
suspending tasks that depend on the device (e.g.,
stopping a backup in progress).
2. Interrupts
The CPU can be interrupted by several different devices,
such as I/O hardware and peripheral devices. An I/O
device uses an interrupt to notify the CPU that it needs
attention. Several devices may raise interrupts at about
the same time, but only one interrupt is delivered to the
CPU at a time. An interrupt may signal a hardware or
software error on an I/O bus (for example, a disk drive
that has failed, so that transfers depending on it pause
until it is repaired), or it may signal that a device wants
access to memory. In either case the interrupt is handled
by the operating system, so an application program does
not have to be written specifically for one particular
machine.
happening at that moment but also when something has
happened before so they can take appropriate action as
well!
The first step in using interrupts is to set up the interrupt
controller. The operating system (or firmware) does this
at start-up by programming the controller directly. The
relevant parameters include the number of IRQ lines
available (16 is a common value on legacy PC hardware)
and how devices are assigned to each line.
The next step is to assign an interrupt handler to each
device that needs one. The operating system records the
address of each handler in an interrupt vector table, and
the CPU uses the interrupt number to select the handler
to run.
Various applications of I/O Interface:
One application of the I/O interface is that a program can
open a file without knowing details about it, such as
what kind of device it is stored on. New devices can also
be added to a computer system without disrupting the
operating system. The interface abstracts away the
differences among I/O devices by identifying a few general
kinds. Each general kind is accessed through a
standardized set of functions, which is called an interface.
Each operating system has its own standards for the
device-driver interface. A given device may ship with
multiple device drivers, for instance drivers for Windows,
Linux, AIX and Mac OS. Devices vary along several
dimensions, as illustrated in the following table:
Character-stream or Block:
Both character-stream and block devices transfer data in
the form of bytes. The difference between them is that a
character-stream device transfers bytes linearly, one
after another, whereas a block device transfers a whole
block of bytes as a single unit.
Sequential or Random Access:
A sequential device transfers data in a fixed order
determined by the device, whereas the user of a
random-access device can instruct it to seek to any of the
available data storage locations.
Synchronous or Asynchronous:
A synchronous device performs data transfers with
predictable response times, in coordination with
other aspects of the system. An asynchronous device
exhibits irregular or unpredictable response times that
are not coordinated with other computer events.
Sharable or Dedicated:
A sharable device can be used concurrently by several
processes or threads; a dedicated device cannot.
Speed of Operation:
Device speeds range from a few bytes
per second to a few gigabytes per second.
Read-write, read only, write-only:
Different devices perform different operations. Some
support both input and output, but others support only
one data transfer direction, either input or output.
time, response time, and turnaround time for I/O
operations to complete. OS developers implement
scheduling by maintaining a wait queue of requests for
each device, and the I/O scheduler rearranges the order
of the queue to improve the efficiency of the system.
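To see why rearranging the queue can help, the sketch below compares serving disk requests in arrival order with serving the nearest pending request first. The notes do not name a particular algorithm, so shortest-seek-first is used here purely as one example of reordering; the cylinder numbers and the starting head position are made up for the illustration.

```python
def total_head_movement(start, order):
    """Sum of the distances the disk head travels when serving `order` in sequence."""
    total, position = 0, start
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total


def shortest_seek_first(start, requests):
    """Reorder the wait queue so the closest pending request is always served next."""
    pending = list(requests)
    order = []
    position = start
    while pending:
        nearest = min(pending, key=lambda cylinder: abs(cylinder - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


def compare_schedules():
    queue = [98, 183, 37, 122, 14, 124, 65, 67]   # requests in arrival order
    head = 53                                      # current head position
    arrival_cost = total_head_movement(head, queue)
    reordered = shortest_seek_first(head, queue)
    return arrival_cost, total_head_movement(head, reordered), reordered
```

Always choosing the nearest request can leave distant requests waiting a long time, which is why fairness is usually weighed alongside efficiency.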
Buffering
Another important service provided by the I/O subsystem
is buffering. Buffers are used to cope with speed
mismatches, provide adaptation for different data transfer
sizes, and support copy semantics for the application I/O.
A buffer is a memory area that stores data being
transferred between two devices or between a device and
an application.
Caching
Caching is another service provided by the I/O
subsystem. A cache is a region of fast memory that holds
a copy of data; access to the cached copy is more
efficient than access to the original. The main difference
between a buffer and a cache is that a buffer may hold
the only existing copy of a data item, while a cache holds
a copy, on faster storage, of an item that resides elsewhere.
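A read cache can be sketched as a dictionary placed in front of a slower store. The class name `ReadCache` and the `slow_reads` counter are invented for this example, and a plain dictionary stands in for the slow device.

```python
class ReadCache:
    """Keeps copies of blocks already read so that repeat reads skip the slow store."""

    def __init__(self, backing_store):
        self.backing_store = backing_store   # the original data lives here
        self.cached = {}                     # fast copies of some blocks
        self.slow_reads = 0                  # how often the slow store was used

    def read(self, block):
        if block not in self.cached:
            self.slow_reads += 1
            self.cached[block] = self.backing_store[block]
        return self.cached[block]
```

The cached entry is always a copy of data that still exists in the backing store, which is the distinction from a buffer drawn above.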
Spooling and Device Reservation
Spooling and device reservation are also important
services provided by the I/O subsystem. A spool is a
buffer that holds output for a device, such as a printer,
that cannot accept interleaved data streams. Each
application's output is spooled to a separate disk file
instead of being sent directly to the printer. When an
application finishes printing, the spooling system queues
the corresponding spool file for output to the printer.
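The spooling behaviour described above can be modelled in a few lines. In this sketch in-memory lists stand in for the per-application spool files, and the class and method names are hypothetical.

```python
from collections import deque


class Spooler:
    """Collects each application's output separately, then prints whole jobs in turn."""

    def __init__(self):
        self.open_spools = {}        # application -> lines written so far
        self.print_queue = deque()   # finished jobs waiting for the printer

    def write(self, app, line):
        # Output goes to the application's own spool, never straight to the printer.
        self.open_spools.setdefault(app, []).append(line)

    def finish(self, app):
        # When the application is done, its spool is queued as one job.
        self.print_queue.append((app, self.open_spools.pop(app)))

    def print_all(self):
        printed = []
        while self.print_queue:
            app, lines = self.print_queue.popleft()
            printed.extend(f"{app}: {line}" for line in lines)
        return printed
```

Even if two applications write alternately, each job reaches the printer as one uninterrupted run of lines.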
Error Handling
Error handling is another crucial function of the I/O
subsystem, which guards against many kinds of hardware
and application errors. An OS that uses protected
memory can prevent minor mechanical glitches from
causing a complete system failure. Devices and I/O
transfers can fail transiently or permanently, and the OS
can handle such failures in different ways.
I/O Protection
Finally, I/O protection ensures that user processes cannot
issue illegal I/O instructions to disrupt the normal
function of a system. The I/O subsystem prevents such
disruptions by defining all I/O instructions as privileged
instructions. Users cannot issue I/O instructions directly;
they must go through the operating system, which
prevents illegal I/O access.
Conclusion
The kernel I/O subsystem plays an important role in
managing I/O operations efficiently and securely. It
provides various services such as scheduling, buffering,
caching, spooling, device reservation, error handling, and
I/O protection. These services ensure optimal use of
resources and safeguard the system against errant
processes and malicious users. The I/O subsystem is one
of the core components of the OS, and its efficient
functioning is vital for the smooth operation of the system.