
Advanced Operating Systems
B.Tech. [C.S.E.] & [IT]

TEXT BOOK:
1. Advanced Concepts in Operating Systems, Mukesh Singhal, Niranjan G. Shivaratri, Tata McGraw-Hill, 2001

REFERENCE:
1. Distributed Systems, Andrew S. Tanenbaum, Maarten Van Steen, Pearson Prentice Hall, 2nd Edition, 2007

Department of Computer Science & Engineering


JNTUH – C.S.E. & IT - III- I
Aurora’s Scientific, Technological and Research Academy, Hyderabad
Centralized Deadlock Detection Algorithm

Centralized deadlock detection attempts to imitate the non-distributed algorithm through a central coordinator. Each machine maintains a resource graph for its own processes and resources, while the central coordinator maintains the resource utilization graph for the entire system: the global wait-for graph (WFG).

In the centralized approach to deadlock detection, two techniques are used: the completely centralized algorithm and the Ho-Ramamoorthy algorithms (one-phase and two-phase).

Completely Centralized Algorithm:

In a network of n sites, one site is chosen as the control site. This site is responsible for deadlock detection and has control over all resources of the system. If a site requires a resource, it sends a request to the control site; the control site allocates and de-allocates resources and maintains the wait-for graph. At regular intervals it checks the wait-for graph for a cycle. If a cycle exists, the system is declared deadlocked; otherwise the system continues working.
The major drawbacks of this technique are as follows:
[1] A site has to send a request even to use its own resources.
[2] There is a possibility of phantom deadlock.
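The control site's periodic check reduces to cycle detection on the wait-for graph. A minimal sketch (the dict-of-lists graph representation is an assumption of this sketch, not part of the text):

```python
def has_cycle(wfg):
    """Return True if the wait-for graph contains a cycle (a deadlock).

    wfg maps each process to the list of processes it waits for,
    e.g. {'P1': ['P2'], 'P2': ['P1']}.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on current DFS path / done
    color = {}

    def dfs(p):
        color[p] = GREY
        for q in wfg.get(p, []):
            if color.get(q, WHITE) == GREY:   # back edge: cycle found
                return True
            if color.get(q, WHITE) == WHITE and dfs(q):
                return True
        color[p] = BLACK
        return False

    return any(color.get(p, WHITE) == WHITE and dfs(p) for p in wfg)
```

For example, `has_cycle({'P1': ['P2'], 'P2': ['P1']})` reports a deadlock, while a chain that ends at a running process does not.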


Ho-Ramamoorthy Two-Phase Algorithm:

In this technique a resource status table is maintained by the central (control) site. If a cycle is detected, the system is not declared deadlocked at once: because the system is distributed, some resource may be vacated or freed by a site at any instant, so the cycle is checked again. If the cycle is detected a second time, the system is declared deadlocked. This technique reduces the possibility of phantom deadlock, but on the other hand time consumption is higher.
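The two-phase idea can be sketched as follows, with each status-table snapshot modelled as a wait-for graph (the snapshot representation and function names are assumptions of this sketch):

```python
def find_cycle(wfg):
    """Return the frozenset of processes on a cycle of the graph, or None."""
    color, stack = {}, []

    def dfs(p):
        color[p] = 'grey'
        stack.append(p)
        for q in wfg.get(p, []):
            if color.get(q) == 'grey':            # back edge: cycle found
                return frozenset(stack[stack.index(q):])
            if q not in color:
                found = dfs(q)
                if found:
                    return found
        stack.pop()
        color[p] = 'black'
        return None

    for p in list(wfg):
        if p not in color:
            found = dfs(p)
            if found:
                return found
    return None


def two_phase_detect(first_snapshot, second_snapshot):
    """Declare deadlock only if the same cycle appears in two successive
    snapshots of the status table; a cycle seen once may be a phantom."""
    cycle = find_cycle(first_snapshot)
    if cycle is None:
        return False
    return find_cycle(second_snapshot) == cycle
```

A cycle that has disappeared by the second check (a resource was freed in between) is treated as a phantom and no deadlock is declared.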

Ho-Ramamoorthy One-Phase Algorithm:

In this technique both a resource status table and a process status table are maintained by the control site. If the same cycle is detected in both the process and resource tables, the system is declared deadlocked. This technique reduces time consumption, but space complexity increases.

Distributed Deadlock Detection Algorithm

• Central Coordinator (CC)
  – Each machine maintains a local WFG
  – Changes are reported to the CC
  – The CC constructs and analyzes the global WFG
• Problems
  – The coordinator is a performance bottleneck
  – Communication delays may cause phantom deadlocks


• Distributed Approach
  – Detect cycles using probes.
  – If process pi is blocked on pj, it launches probe pi → pj
  – pj sends probe pi → pj → pk along all its request edges, etc.
  – When the probe returns to pi, a cycle is detected
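The probe-forwarding scheme above can be simulated centrally; a minimal sketch in which the message passing is collapsed into a loop (the function name and graph layout are illustrative):

```python
def probe_detects_deadlock(initiator, wait_for):
    """Edge-chasing sketch: the blocked initiator launches a probe that each
    blocked process forwards along all of its request edges; if the probe
    ever comes back to the initiator, a cycle (deadlock) is detected."""
    forwarded = set()                    # processes that already forwarded the probe
    frontier = [initiator]
    while frontier:
        p = frontier.pop()
        for q in wait_for.get(p, []):    # forward probe p -> q
            if q == initiator:
                return True              # probe returned: cycle through initiator
            if q not in forwarded:
                forwarded.add(q)
                frontier.append(q)
    return False
```

The `forwarded` set mirrors the rule that a process forwards a given probe only once, so the simulation terminates even on dense graphs.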


Recovery from Deadlock


• Process termination
  – Kill all processes involved in the deadlock; or
  – Kill one at a time. In what order?
    • By priority: consistent with scheduling
    • By cost of restart: length of recomputation
    • By impact on other processes: critical sections, producer/consumer
• Resource preemption
  – Direct: temporarily remove a resource (e.g., memory)
  – Indirect: roll back to an earlier “checkpoint”


Dynamic Deadlock Avoidance

• Maximum Claim Graph
  – A process indicates the maximum resources it may need
  – Potential request edge pi → Rj (dashed)
  – May turn into a real request edge


Dynamic Deadlock Avoidance

Theorem: if acquisitions that do not produce a completely reducible graph are prevented, all states are safe.

Banker’s algorithm (Dijkstra):
Given a satisfiable request p → R, temporarily grant the request, changing p → R to R → p.
Try to reduce the new claim graph, treating claim edges as actual requests.
If the new claim graph is completely reducible, proceed. If not, reverse the temporary acquisition R → p back to p → R.

Analogy with banking: resources correspond to currencies, allocations correspond to loans, and maximum claims correspond to credit limits.
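The grant-and-reduce test can be sketched for a single resource type, where "reducing" a process means it can finish with the free units and release its allocation (the names and dict representation are assumptions of this sketch):

```python
def is_safe(available, max_claim, alloc):
    """Check whether every process can finish in some order: treat each
    unfinished process's remaining claim as a real request and reduce it
    once enough units are free (claim-graph reduction, single resource)."""
    need = {p: max_claim[p] - alloc[p] for p in max_claim}
    free, done = available, set()
    progress = True
    while progress:
        progress = False
        for p in max_claim:
            if p not in done and need[p] <= free:
                free += alloc[p]          # p finishes and releases its allocation
                done.add(p)
                progress = True
    return len(done) == len(max_claim)


def grant_request(available, max_claim, alloc, p, amount):
    """Temporarily grant the request; keep it only if the new state is
    completely reducible, otherwise reverse the temporary acquisition."""
    alloc[p] += amount
    if is_safe(available - amount, max_claim, alloc):
        return True, available - amount
    alloc[p] -= amount                    # reverse R -> p back to p -> R
    return False, available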

Hierarchical Deadlock Detection Algorithm

In the hierarchical deadlock detection algorithm, sites are arranged in a hierarchical fashion and a site detects deadlocks involving only its descendant sites. Distributed deadlock detection algorithms delegate the responsibility of deadlock detection to individual sites, while in the hierarchical approach there are local detectors at each site which communicate their local wait-for graphs (WFGs) with one another.


Approach:
Deadlocks that are local to a single site are detected at that site using its local WFG. Each site also sends its local WFG to the deadlock detector at the next level. Thus, a distributed deadlock involving two or more sites is detected by the deadlock detector at the lowest level that has control over those sites. In this approach, there are two methods of detection:

1. Ho-Ramamoorthy Algorithm:
* Uses only two levels: master control nodes and cluster control nodes.
* Cluster control nodes detect deadlocks among their members and report dependencies outside their cluster to the master control node.
* The master control node is responsible for detecting inter-cluster deadlocks.

2. Menasce-Muntz Algorithm:
* Leaf controllers are responsible for allocating resources, whereas branch controllers find deadlocks among the resources that their children span in the tree.
* Network congestion can be managed, and node failure is less critical than in a fully centralized scheme.
* Detection can be done by continuous allocation reporting or by periodic allocation reporting.


Advantages:

If the hierarchy coincides with resource access patterns local to a cluster of sites, this approach can provide more efficient deadlock detection than either the centralized or the distributed method. It also reduces the dependence on central sites, thus reducing communication cost.

Disadvantages:

If deadlocks span several clusters, this approach is inefficient. It is also more complicated to implement and would involve nontrivial modifications to lock and transaction manager algorithms.

Unit IV

Multiprocessor System Architectures

Multiprocessor Operating Systems

Distributed File Systems

Multiprocessor System Architectures

Introduction

A multiprocessor system is defined as "a system with more than one processor", i.e., "a number of central processing units linked together to enable parallel processing to take place".

The key objective of a multiprocessor is to boost a system's execution speed. The other objectives are fault tolerance and application matching.

Motivations for Multiprocessor Systems

Enhanced Performance

Multiprocessor systems increase system performance.

Fault Tolerance

A multiprocessor system exhibits graceful performance degradation under processor failures because of the availability of multiple processors.

Multiprocessor System Architectures

According to Flynn's classification, in MIMD (multiple instruction, multiple data) architectures, multiple instruction streams operate on different data streams. In the broadest sense, an MIMD architecture qualifies as a full-fledged multiprocessor system.

A multiprocessor system consists of multiple processors, which execute different programs (or different segments of a program) concurrently. The main memory is typically shared by all the processors. Based on whether a memory location can be directly accessed by a processor or not, there are two types of multiprocessor system.


Tightly Coupled

Loosely Coupled

(Based on whether a memory location can be directly accessed by a processor or not)


Loosely Coupled

In loosely coupled systems, not only is the main memory partitioned and attached to processors, but each processor has its own address space. Therefore, a processor cannot directly access the memory attached to other processors; loosely coupled systems use only message passing for inter-processor communication and synchronization. One example of a loosely coupled system is Intel's Hypercube.


Tightly Coupled

In tightly coupled systems, all processors share the same memory address space and all processors can directly access a global main memory. Tightly coupled systems can use the main memory for inter-processor communication and synchronization.

Examples of commercially available tightly coupled systems are Multimax of Encore Corporation, Flex/32 of Flexible Corporation, and FX of Sequent Computers.


Types of Multiprocessor Systems:

Loosely coupled multiprocessor system

Tightly coupled multiprocessor system

Homogeneous multiprocessor system

Heterogeneous multiprocessor system

Shared memory multiprocessor system

Distributed memory multiprocessor system

Uniform memory access (UMA) system

cc-NUMA system

Hybrid system – shared system memory for global data and local memory for local data


Types of Multiprocessor Systems:

UMA (Uniform Memory Access)

NUMA (Non-Uniform Memory Access)

NORMA (No Remote Memory Access)

(Based on the vicinity and accessibility of the main memory to the processors)


UMA (Uniform Memory Access)

The main memory is located at a central location such that it is equidistant from all the processors in terms of access (in the absence of conflicts). That is, all the processors have the same access time to the main memory. In addition to this centralized shared memory, processors may also have private memories, where they can cache data for higher performance.

Some examples of UMA architectures are Multimax of Encore Corporation, Balance of Sequent, and VAX 8800 of Digital Equipment.


NUMA (Non-Uniform Memory Access)

The main memory is physically partitioned and the partitions are attached to the processors. All the processors share the same memory address space. A processor can directly access the memory attached to any other processor, but the time to access a remote partition is greater than the time to access its own memory partition.

Examples of NUMA architectures are Cm* of CMU and the Butterfly machine of BBN Laboratories.


NORMA (No Remote memory access)

The main memory is physically partitioned and the partitions are attached to the processors. A processor cannot directly access the memory of any other processor. The processors must send messages over the interconnection network to exchange information.

An example of a NORMA architecture is Intel's Hypercube.

Multiprocessor Operating Systems: Introduction

A multiprocessor operating system refers to the use of two or more central processing units (CPUs) within a single computer system. These multiple CPUs are in close communication, sharing the computer bus, memory and other peripheral devices. Such systems are referred to as tightly coupled systems.

 These systems are used when very high speed is required to process a large volume of data.
 These systems are generally used in environments like satellite control, weather forecasting, etc.

(Figure: the basic organization of a multiprocessing system)

Multiprocessor Operating Systems: Structure

A multiprocessing system is based on the symmetric multiprocessing model, in which each processor runs an identical copy of the operating system and these copies communicate with one another. In this system each processor is assigned a specific task.

To employ a multiprocessing operating system effectively, the computer system must have the following:

1. Motherboard support: a motherboard capable of handling multiple processors, i.e., additional sockets or slots for the extra chips and a chipset capable of handling the multiprocessing arrangement.

2. Processor support: processors that are capable of being used in a multiprocessing system.

Distributed File Systems

A distributed file system (DFS) is a special case of a distributed system. It allows multi-computer systems to share files, even when no other IPC or RPC is needed. Sharing devices is a special case of sharing files.

E.g.,
  NFS (Sun's Network File System)
  Windows NT, 2000, XP
  Andrew File System (AFS), etc.

DFS is one of the most common uses of distributed computing.

Goal: provide the common view of a centralized file system, but with a distributed implementation.
  Ability to open and update any file on any machine on the network.
  All of the synchronization issues and capabilities of shared local files.

Naming of Distributed Files

Naming – the mapping between logical and physical objects.

A transparent DFS hides where in the network the file is stored.

Location Transparency – the file name does not reveal the file's physical storage location.
  The file name denotes a specific, hidden set of physical disk blocks.
  Convenient way to share data.
  Could expose the correspondence between component units and machines.

Location Independence – the file name does not need to be changed when the file's physical storage location changes.
  Better file abstraction.
  Promotes sharing the storage space itself.
  Separates the naming hierarchy from the storage-devices hierarchy.

Three Naming Schemes

1. Mount remote directories onto local directories, giving the appearance of a coherent local directory tree.
• Mounted remote directories can be accessed transparently.
• Unix/Linux with NFS; Windows with mapped drives.

2. Files are named by a combination of host name and local name.
• Guarantees a unique system-wide name.
• Windows Network Places, Apollo Domain.

3. Total integration of the component file systems.
• A single global name structure spans all the files in the system.
• If a server is unavailable, some arbitrary set of directories on different machines also becomes unavailable.

Mounting Remote Directories (DFS)

Note: names of files are not unique, as represented by path names.
E.g.,
  Server A sees: /users/steen/mbox
  Client A sees: /remote/vu/mbox
  Client B sees: /work/me/mbox

Consequence: file “names” cannot be passed around haphazardly.
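The mount-table lookup behind this example can be sketched as follows (the function name, table layout, and longest-prefix rule are assumptions of this sketch, not a real NFS interface):

```python
def resolve(path, mounts):
    """Translate a client-local path into (server, server_path) using the
    longest matching mount point, NFS-mount style. mounts maps a local
    mount point to a (server, remote_root) pair."""
    best = None
    for mount_point in mounts:
        if path == mount_point or path.startswith(mount_point + '/'):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return None, path                 # not under any remote mount: local file
    server, server_root = mounts[best]
    return server, server_root + path[len(best):]
```

With Server A's /users/steen mounted at /remote/vu, the client path /remote/vu/mbox resolves to /users/steen/mbox on Server A — which is exactly why the same file has different "names" on different machines.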


Mechanisms for Building Distributed File Systems

DFS – File Access Performance

• Reduce network traffic by retaining recently accessed disk blocks in a local cache.
• Repeated accesses to the same information can be handled locally; all accesses are performed on the cached copy.
• If the needed data is not already cached, a copy is brought from the server to the local cache. Copies of parts of a file may be scattered among different caches.
• Cache-consistency problem – keeping the cached copies consistent with the master file, especially on write operations.

File Caches

In client memory

Performance speed up; faster access

Good when local usage is transient

Enables diskless workstations

On client disk

Good when local usage dominates (e.g., AFS)

Caches larger files

Helps protect clients from server crashes

Cache Update Policies

When does the client update the master file? I.e., when is cached data written from the cache to the file?

Write-through – write data through to disk as soon as possible, i.e., following write() or put(), the same as on local disks.
  Reliable, but poor performance.

Delayed-write – data is cached and then written to the server later.
  Write operations complete quickly; some data may be overwritten in the cache, saving needless network I/O.
  Poor reliability:
    unwritten data may be lost when the client machine crashes;
    inconsistent data.
  Variation – scan the cache at regular intervals and flush dirty blocks.
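The delayed-write policy can be sketched in a few lines (class and method names are illustrative; a dict stands in for the remote file store):

```python
class DelayedWriteCache:
    """Delayed-write sketch: write() only updates the local cache and marks
    the block dirty; flush() (e.g., a periodic scan) sends dirty blocks to
    the server. Repeated writes to one block cost a single transfer."""

    def __init__(self, server_store):
        self.server = server_store       # stands in for the remote file
        self.cache = {}
        self.dirty = set()

    def write(self, block, data):
        self.cache[block] = data         # completes quickly, no network I/O
        self.dirty.add(block)

    def flush(self):
        for block in self.dirty:
            self.server[block] = self.cache[block]
        self.dirty.clear()
```

Note the trade-off stated above: a crash before flush() loses the dirty blocks, which is exactly the poor-reliability case of delayed-write.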

File Consistency

Homework: is the locally cached copy of the data consistent with the master copy?

Client-initiated approach

Client initiates a validity check with server.

Server verifies local data with the master copy

E.g., time stamps, etc.

Server-initiated approach

Server records (parts of) files cached in each client.

When server detects a potential inconsistency, it reacts.

Remote Service vs Caching

Remote Service – all file actions are implemented by the server.
  RPC functions.
  Used for small-memory, diskless machines.
  Particularly applicable if there is a large amount of write activity.

Cached System
  Many “remote” accesses are handled efficiently by the local cache; most are served as fast as local ones.
  Servers are contacted only occasionally, which
    reduces server load and network traffic,
    enhances the potential for scalability, and
    reduces total network overhead.

File Service Semantics

Stateless Service

Avoids state information in server by making each request self-contained.

Each request identifies the file and position in the file.

No need to establish and terminate a connection by open and close operations.

Poor support for locking or synchronization among concurrent accesses
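A self-contained request in the stateless style can be sketched as a single function call that carries everything the server needs (the function and parameter names are assumptions of this sketch):

```python
def stateless_read(files, name, offset, length):
    """Stateless-service sketch: the request itself carries the file name
    and position, so the server needs no open-file table or session state,
    and a newly restarted server can answer it immediately."""
    return files[name][offset:offset + length]
```

Because each call is independent, there is nothing to open or close — which is also why locking across requests is hard to support, as noted above.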


Stateful Service

Client opens a file (as in Unix & Windows).

Server fetches information about file from disk, stores in server memory,

Returns to client a connection identifier unique to client and open file.

Identifier used for subsequent accesses until session ends.

Server must reclaim space used by no longer active clients.

Increased performance, fewer disk accesses.

Server retains knowledge about file

E.g., read ahead next blocks for sequential access

E.g., file locking for managing writes

E.g., Windows uses stateful file service.

Comparison – Server Semantics

Failure Recovery: Stateful server loses all volatile state in a crash.

Restore state by recovery protocol based on a dialog with clients.

Server needs to be aware of crashed client processes

orphan detection and elimination.

Failure Recovery: Stateless server failure and recovery are almost unnoticeable.

Newly restarted server responds to self-contained requests without difficulty.

NFS Implementation

NFS Version 4

(read, write, link, symlink, mkdir, rename, rmdir, readdir, readlink, getattr, setattr, create, remove)

Building DFS

A file system is a refinement of the more general abstraction of permanent storage; databases and object repositories are other examples. A file system defines the naming structure, the characteristics of files, and the set of operations associated with them.

The classification of computer systems and the corresponding file system requirements are given below. Each level subsumes the functionality of the layers below, in addition to the new functionality required by that layer.


File system requirements                                      Computer system
----------------------------------------------------------    ----------------------------
File location, availability                                   Multi user/task, multi site
Security                                                      Multi user/task, single site
Concurrency control                                           Multi task, single user/site
Naming structure, programming interface, mapping, integrity   Single user/task/site

Distributed file systems constitute the highest level of this classification. Multiple users using multiple machines connected by a network use a common file system that must be efficient, secure and robust.


Encapsulation: file systems view the data in files as a collection of uninterpreted bytes, whereas databases contain information about the type of each data item and its relationships with other data items.

Naming: file systems organize files into directory hierarchies. The purpose of these hierarchies is to help humans deal with a large number of named objects. The ratio of search time to usage time is the factor that determines whether access by name is adequate.

Mounting: mount mechanisms allow the binding together of different file namespaces to form a single hierarchical namespace.


Client machines can maintain mount information individually, as is done in Sun's Network File System, or mount information can be maintained at servers, in which case it is possible for every client to see an identical namespace.

Client caching is the architectural feature that contributes the most to performance in a distributed file system.

Bulk data transfer: all data transfer in a network requires the execution of various layers of communication protocols. Caching amortizes the cost of accessing remote servers over several local references to the same information; bulk transfer amortizes the fixed communication protocol overheads over large amounts of data.

Encryption is used for enforcing security in distributed systems.


Stateful and Stateless File Servers

A client operates on files during a session with a file server. A session is an established connection for a sequence of requests and responses between the client and server. During this session a number of items of data may be maintained by the parties, such as the set of open files and their clients, file descriptors and handles, current position pointers, mounting information, lock status, session keys, caches and buffers.

This information is required partly to avoid repeated authentication protocols for each request and repeated directory lookup operations for satisfying each file access during the session. The information may reside partly in clients or servers or both.

Design Issues

AFS Assumptions: Client machines are untrusted

Must prove they act for a specific user

Secure RPC layer

Anonymous “system:anyuser”

Client machines have disks

Can cache whole files over long periods

Write/write and write/read sharing are rare

Most files updated by one user, on one machine

AFS : Andrew File System NFS : Network File System

Design Issues: AFS Cell/Volume Architecture

Cells correspond to administrative groups

Path is a cell

Client machine has cell-server database

protection server handles authentication

volume location server maps volumes to servers

Cells are broken into volumes (miniature file systems)

One user's files, project source tree, ...

Typically stored on one server

Unit of disk quota administration, backup


Why is client-side caching necessary?

What is cached?
  Read-only file data and directory data → easy.
  Data written by the client machine → when is the data written to the server? What happens if the client machine goes down?
  Data that is written by other machines → how to know that the data has changed? How to ensure data consistency?
  Is there any pre-fetching?

Data consistency guarantees are very poor:
  simply unacceptable for some distributed applications;
  productivity apps tend to tolerate such loose consistency.
Different client implementations implement the “prefetching” part differently.
Generally clients do not cache data on local disks.

AFS RPC Procedures

Procedures that are not in NFS

Fetch: return status and optionally data of a file or directory, and place a callback on it

RemoveCallBack: specify a file that the client has flushed from the local machine

BreakCallBack: from server to client, revoke the callback on a file or directory

What should the client do if a callback is revoked?

Store: store the status and optionally data of a file

Rest are similar to NFS calls
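To make the callback flow above concrete, here is a toy sketch in Python. The class and method names are invented for illustration and this is not the real AFS wire protocol: a Store by one client breaks the callbacks held by every other client, which must then re-fetch.

```python
# Toy callback bookkeeping (illustrative names, not the real AFS protocol).
class Server:
    def __init__(self):
        self.data = {}
        self.callbacks = {}              # path -> set of clients holding a callback

    def fetch(self, client, path):
        # Fetch: return the data and register a callback promise for this client.
        self.callbacks.setdefault(path, set()).add(client)
        return self.data.get(path)

    def store(self, client, path, contents):
        self.data[path] = contents
        # BreakCallBack: revoke the callback of every *other* client.
        for other in self.callbacks.get(path, set()) - {client}:
            other.break_callback(path)
        self.callbacks[path] = {client}

    def remove_callback(self, client, path):
        # RemoveCallBack: the client has flushed the file from its cache.
        self.callbacks.get(path, set()).discard(client)

class Client:
    def __init__(self):
        self.cache = {}
    def break_callback(self, path):
        self.cache.pop(path, None)       # revoked: must re-fetch before reuse

server, a, b = Server(), Client(), Client()
server.store(a, "/f", "v1")
a.cache["/f"] = server.fetch(a, "/f")
b.cache["/f"] = server.fetch(b, "/f")
server.store(a, "/f", "v2")              # server breaks b's callback
print("/f" in b.cache)                   # False: b must Fetch again
```

The point of the sketch is the direction of the BreakCallBack call: it flows from server to client, unlike every other call.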

Advanced Operating Systems Failure Recovery

What if the file server fails?

There are two possible approaches to failure recovery

What if the client fails?

What if both the server and the client fail?

Network partition

How to detect it? How to recover from it?

Is there any way to ensure absolute consistency in the presence of a network partition?

Reads

Writes

What if all three fail: network partition, server, client?

Advanced Operating Systems Failure Recovery

Need to maintain the state of the server

If you must keep some state on the server

Understand why and what state the server is keeping

Understand the worst case scenario of no state on the server and see if there are still

ways to meet the correctness goals

Revert to this worst case in each combination of failure cases

Advanced Operating Systems Building DFS

Encapsulation: File systems view data in files as a collection of un-interpreted bytes whereas

databases contain information about the type of each data item and relationships with other

data items.

Naming: File Systems organize files into directory hierarchies. The purpose of these hierarchies

is to help humans deal with a large number of named objects.

The ratio of search time to usage time is the factor that determines whether access by name is

adequate.

Mounting: Mount mechanisms allow the binding together of different file namespaces to form a

single hierarchical namespace.
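A minimal sketch of such a mount mechanism follows; the paths and server names are invented for illustration:

```python
# Mount table binding remote namespaces into one client hierarchy.
MOUNTS = {
    "/home":    "serverA:/export/home",
    "/project": "serverB:/vol/project",
}

def resolve(path):
    """Map a client path to (remote root, remainder) by longest-prefix match."""
    for prefix in sorted(MOUNTS, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            return MOUNTS[prefix], path[len(prefix):] or "/"
    return None, path                    # not under any mount point: local path

print(resolve("/home/alice/notes.txt"))  # ('serverA:/export/home', '/alice/notes.txt')
print(resolve("/etc/hosts"))             # (None, '/etc/hosts')
```

Longest-prefix matching matters when one mount point nests inside another; the client sees a single hierarchy regardless of which server holds each subtree.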

Advanced Operating Systems Building DFS

Client machines can maintain mount information individually as is done in Sun's Network File
System.

Mount information can be maintained at servers in which case it is possible that every client
sees an identical namespace.

Client Caching is the architectural feature which contributes the most to performance in a
distributed file system.

Bulk Data Transfer: any data transfer in a network requires the execution of various layers of
communication protocols, so transferring data in bulk reduces the per-transfer cost.

Caching amortizes the cost of accessing remote servers over several local references to the
same information, bulk transfer amortizes the cost of the fixed communication protocol
overheads.

Encryption is used for enforcing security in distributed systems.

Advanced Operating Systems Building DFS

Stateful and Stateless File Servers

A client operates on files during a session with a file server. A session is an established

connection for a sequence of requests and responses between the client and server. During this

session a number of items of data may be maintained by the parties such as the set of open

files and their clients, file descriptors and handles, current position pointers, mounting

information, lock status, session keys, caches and buffers.

This information is required partly to avoid repeated authentication protocols for each request

and repeated directory lookup operations for satisfying each file access during the session. The

information may reside partly in clients or servers or both.
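The difference between keeping and not keeping this session state can be sketched as two toy servers in Python; the API below is illustrative, not any real protocol. A stateless read is self-describing, while a stateful read relies on a per-session position kept on the server.

```python
# Stateless vs stateful file service, reduced to the read path.
class StatelessServer:
    def __init__(self, files):
        self.files = files
    def read(self, path, offset, count):
        # Every request carries everything needed; nothing survives between calls.
        return self.files[path][offset:offset + count]

class StatefulServer:
    def __init__(self, files):
        self.files = files
        self.sessions = {}               # handle -> [path, current position]
        self.next_handle = 0
    def open(self, path):
        self.next_handle += 1
        self.sessions[self.next_handle] = [path, 0]
        return self.next_handle
    def read(self, handle, count):
        path, pos = self.sessions[handle]
        data = self.files[path][pos:pos + count]
        self.sessions[handle][1] = pos + len(data)   # server tracks the pointer
        return data

files = {"/a": b"hello world"}
print(StatelessServer(files).read("/a", 0, 5))       # b'hello'
srv = StatefulServer(files)
h = srv.open("/a")
print(srv.read(h, 5), srv.read(h, 6))                # b'hello' b' world'
```

If the stateful server crashes, the sessions table is lost and the client's handle becomes meaningless; the stateless server has nothing to lose, which is exactly the recovery trade-off discussed above.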

Advanced Operating Systems Design Issues

AFS Assumptions: Client machines are un-trusted

Must prove they act for a specific user

Secure RPC layer

Anonymous “system:anyuser”

Client machines have disks

Can cache whole files over long periods

Write/write and write/read sharing are rare

Most files updated by one user, on one machine

Advanced Operating Systems Unit IV

End of Unit IV

Advanced Operating Systems Unit V

Distributed Scheduling:

Issues in Load Distributing, Components of a Load Distributed Algorithm, Stability, Load

Distributing Algorithms, Requirements for Load Distributing, Task Migration, Issues in task

Migration

Distributed Shared Memory:

Architecture and Motivation, Algorithms for Implementing DSM, Memory Coherence, Coherence

Protocols, Design Issues

Advanced Operating Systems Distributed Scheduling

Scheduling refers to the execution of non-interactive processes or tasks or jobs at designated

times and places around a network of computers.

Scheduling refers to assigning a resource and a start and end time to a task.

Distributed Scheduling refers to the chaining of different jobs into a coordinated workflow that

spans several computers.

For example, you schedule a processing job on machine1 and machine2, and when these are

finished you need to schedule a job on machine3. This is distributed scheduling.

Advanced Operating Systems Distributed Scheduling

Scheduling is a decision-making process that is used on a regular basis in many manufacturing

and services industries.

It deals with the allocation of resources to tasks over given time periods and its goal is to

optimize one or more objectives.

Implementing a distributed system incurs costs for hardware support and for agreements on service expectations.

Optimizing tasks through proper scheduling helps reduce the overall cost of computation while

increasing the value customers receive.

Advanced Operating Systems System Taxonomy

Parallel Systems

Distributed Systems

Dedicated Systems

Shared Systems

Time Shared e.g. aludra

Space Shared e.g. HPCC cluster

Homogeneous Systems

Heterogeneous Systems

Advanced Operating Systems Scheduling Regimens

Online/Dynamic Scheduling

Offline/Static Scheduling

Resource level Scheduling

Application level Scheduling

Application Taxonomy

Bag of tasks – Independent tasks

Workflows – dependent tasks

Directed Acyclic Graphs (DAGs)

Performance criteria

Completion time (make-span), reliability etc.
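As a small illustration of the make-span criterion for a workflow DAG, the sketch below computes earliest finish times; the graph and task durations are invented for illustration, and unlimited identical machines are assumed.

```python
from functools import lru_cache

# A task can start once all of its predecessors have finished.
DURATION = {"A": 3, "B": 2, "C": 4, "D": 1}
DEPENDS = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

@lru_cache(maxsize=None)
def finish_time(task):
    start = max((finish_time(p) for p in DEPENDS[task]), default=0)
    return start + DURATION[task]

makespan = max(finish_time(t) for t in DURATION)
print(makespan)  # 8, via the critical path A -> C -> D
```

With unlimited machines the make-span equals the length of the critical path; a real scheduler with limited machines can only do worse, which is why the critical path is a useful lower bound.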

Advanced Operating Systems Model

[Figure: Application Model and System Model diagrams]

Advanced Operating Systems Issues In Load Distributing

Scheduling refers to the execution of non-interactive processes or tasks at designated times and places around a network of computers.

Distributed scheduling refers to the chaining of different jobs into a coordinated workflow that spans several computers. For example, you schedule a processing job on computer1 and computer2, and when these are finished you need to schedule a job on computer3; this is distributed scheduling.

Locally Distributed System consists of a collection of autonomous computers, connected by a


LAN.
In a locally distributed system, there is a good possibility that several computers are heavily
loaded while others are idle or lightly loaded.
If we can move jobs around, the overall performance of the system can be maximized.
A Distributed Scheduler is a Resources Management Component of a distributed operating
system that focuses on judiciously and transparently redistributing the Load of the system
among the computers to maximize the overall performance.

A distributed system may have a mix of heavily and lightly loaded system.

Advanced Operating Systems Issues In Load Distributing

“A distributed system may have a mix of heavily and lightly loaded system.”

 Tasks arrive at random intervals.

 CPU service-time requirements are also random.

 Migrating a task to share or balance load can help.

The system may be heterogeneous in terms of CPU speed and resources, and the distributed system may be heterogeneous in terms of the load on different nodes.

Even in a homogeneous distributed system, one node may be idle while a task waits for service at another node.

Advanced Operating Systems Issues In Load Distributing

Consider a system of N identical, independent servers. Let P be the probability that the system is in a state in which at least one task is waiting for service and at least one server is idle, and let ρ be the utilization of each server.

We can estimate P using probabilistic analysis and plot a graph against system utilization.

For moderate system utilization, the value of P is high, i.e., at least one node is idle while a task waits for service elsewhere. Hence, performance can be improved by sharing tasks.
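This claim can be checked with a small Monte Carlo sketch. The parameters below (N = 10 servers, utilization ρ = 0.5) are assumptions for illustration; each server's steady-state M/M/1 queue length is sampled independently.

```python
import random

def estimate_p(n_servers=10, rho=0.5, samples=50_000, seed=42):
    """Estimate P = Pr(at least one task waits while at least one server is idle)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        lengths = []
        for _ in range(n_servers):
            # Steady-state M/M/1 occupancy: Pr(k) = (1 - rho) * rho**k.
            k = 0
            while rng.random() < rho:
                k += 1
            lengths.append(k)
        waiting = any(k >= 2 for k in lengths)   # k >= 2 means a task is queued
        idle = any(k == 0 for k in lengths)
        if waiting and idle:
            hits += 1
    return hits / samples

print(estimate_p())  # high (around 0.94) at this moderate utilization
```

Sweeping rho from low to high values reproduces the shape of the plotted curve: P is small at the extremes (almost nothing queued, or almost nothing idle) and large in between, which is exactly the regime where task sharing pays off.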

Advanced Operating Systems Issues In Load Distributing

LOAD: Load on a system/node can correspond to the queue length of tasks/ processes that

need to be processed.

Queue length of waiting tasks: proportional to task response time, hence a good indicator of
system loads.

Distributing load: Transfer TASKS/PROCESSES among nodes.

If a task transfer (from another node) takes a long time, the node may accept more tasks
during the transfer time.

Causes the node to be highly loaded.

Affects performance.

Artificially increment the queue length when a task is accepted for transfer from a remote node
(to account for the anticipated increase in load). Timeouts can be used in case the task transfer fails.
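The queue-length adjustment with a timeout can be sketched as follows; the class name, method names, and timeout value are illustrative assumptions:

```python
import threading

# Queue-length load metric: bump the count as soon as a remote transfer is
# accepted, and undo the bump if the task never arrives before the timeout.
class NodeLoad:
    def __init__(self, transfer_timeout=5.0):
        self.queue_length = 0
        self.lock = threading.Lock()
        self.transfer_timeout = transfer_timeout

    def accept_transfer(self):
        # Artificially increment load for the in-flight task.
        with self.lock:
            self.queue_length += 1
        timer = threading.Timer(self.transfer_timeout, self._transfer_failed)
        timer.start()
        return timer

    def transfer_arrived(self, timer):
        timer.cancel()                   # task really arrived; keep the increment

    def _transfer_failed(self):
        with self.lock:
            self.queue_length -= 1       # timeout fired: undo the reservation

node = NodeLoad(transfer_timeout=0.1)
t = node.accept_transfer()
node.transfer_arrived(t)
print(node.queue_length)                 # 1: the accepted task counts as load
```

Without the early increment, several senders could all see the same low queue length during the transfer window and flood the node; the timeout keeps a failed transfer from inflating the advertised load forever.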

Advanced Operating Systems Issues In Load Distributing

Types of Algorithms:

Static load distribution algorithms: decisions are hard-coded into the algorithm using a priori
knowledge of the system.

Dynamic load distribution: Use system state information such as task queue length, processor
utilization.

Adaptive load distribution: adapt the approach based on system state. For example, dynamic
distribution algorithms collect load information from nodes even at very high system loads.

Load information collection itself can add load on the system as messages need to be
exchanged.

Adaptive distribution algorithms may stop collecting state information at high loads.

Advanced Operating Systems Issues In Load Distributing

Balancing Vs Sharing

Load balancing:

Equalize load on the participating nodes.

Transfer tasks even if a node is not heavily loaded, so that queue lengths on all nodes are approximately equal.

A larger number of task transfers results, which might degrade performance.

Load sharing:

Reduce burden of an overloaded node.

Transfer tasks only when the queue length exceeds a certain threshold.

Fewer task transfers.

Anticipatory task transfer: transfer tasks from overloaded nodes to ones that are likely to become idle or lightly loaded.

More like load balancing, but with possibly fewer transfers.
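The contrast between balancing and sharing can be sketched with toy decision rules; the threshold value and queue lengths below are invented for illustration:

```python
THRESHOLD = 4   # load-sharing trigger: transfer only above this queue length

def balancing_transfers(queues):
    """Load balancing: move tasks until queue lengths differ by at most 1."""
    moves = 0
    while max(queues) - min(queues) > 1:
        src = queues.index(max(queues))
        dst = queues.index(min(queues))
        queues[src] -= 1
        queues[dst] += 1
        moves += 1
    return moves

def sharing_transfers(queues):
    """Load sharing: move tasks only while some node exceeds THRESHOLD."""
    moves = 0
    while max(queues) > THRESHOLD and min(queues) < THRESHOLD:
        src = queues.index(max(queues))
        dst = queues.index(min(queues))
        queues[src] -= 1
        queues[dst] += 1
        moves += 1
    return moves

print(balancing_transfers([9, 0, 0]))  # 6: equalizes all queues
print(sharing_transfers([9, 0, 0]))    # 5: stops once no node exceeds THRESHOLD
```

Balancing keeps transferring until the queues are nearly equal; sharing stops as soon as the overloaded node is relieved, so it always makes no more transfers than balancing.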

Advanced Operating Systems Components of Load Distribution Algorithm

Distributed scheduling is concerned with distributing the load of a system among the available
resources in a manner that improves overall system performance and maximizes resource
utilization. The basic idea is to transfer tasks from heavily loaded machines to idle or lightly
loaded ones.

Load distributing algorithms can be classified as static, dynamic or adaptive.

Static means decisions on assignment of processes to processors are hardwired into the
algorithm, using a priori knowledge, perhaps gained from analysis of a graph model of the
application.

Dynamic algorithms gather system state information to make scheduling decisions and so can
exploit under-utilized resources of the system at run time but at the expense of having to
gather system information.

An Adaptive algorithm changes the parameters of its algorithm to adapt to system loading
conditions. It may reduce the amount of information required for scheduling decisions when
system load or communication is high.

Advanced Operating Systems Components of Load Distribution Algorithm

Most models assume that the system is fully connected and that every processor can communicate
with every other processor in a number of hops, although generally, because of communication
latency, load scheduling is more practical on tightly coupled networks of homogeneous
processors if the workload involves communicating tasks.

Processor allocation strategies can be Non-Migratory (non-preemptive) or Migratory (preemptive).

Process migration involves the transfer of a running process from one host to another.

It is an expensive and potentially difficult operation to implement. It involves packaging the
task's virtual memory, threads and control block, I/O buffers, messages, signals and other OS
resources into messages which are sent to the remote host, which uses them to reinitiate the
process.

Non migratory algorithms involve the transfer of tasks which have not yet begun, and so do not
have this state information, which reduces the complexities of maintaining transparency.

Advanced Operating Systems Components of Load Distribution Algorithm

Resource queue lengths and particularly CPU Queue Length are good indicators of load as

they correlate well with the task response time.

It is also very easy to measure queue length.

For example, a number of remote hosts could observe simultaneously that a particular site had

a small CPU queue length and could initiate a number of process transfers. This may result in

that site becoming flooded with processes and its first reaction might be to try to move them

elsewhere. As migration is an expensive operation, we do not want to waste resources (CPU

Time and Communication Bandwidth) by making poor choices which result in increased

migration activity.

Advanced Operating Systems Components of Load Distribution Algorithm

A Load Distributing Algorithm has four components:

Transfer policy

Selection policy

Location policy

Information policy
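These four components can be sketched as plug-in policies. The concrete choices below (a sender-initiated threshold algorithm with random polling) are one illustrative combination; all names, thresholds, and the task-id format are invented:

```python
import random

THRESHOLD = 3          # queue length above which a node counts as overloaded
POLL_LIMIT = 5         # max nodes probed by the location policy

class Node:
    def __init__(self, name, queue_length):
        self.name = name
        self.queue_length = queue_length

def transfer_policy(node):
    """Transfer policy: should this node try to transfer a task?"""
    return node.queue_length > THRESHOLD

def selection_policy(node):
    """Selection policy: pick a task to move (here, a placeholder id)."""
    return f"task-{node.queue_length}"

def location_policy(nodes, rng=random.Random(0)):
    """Location policy: probe up to POLL_LIMIT random nodes for a lightly loaded one."""
    for candidate in rng.sample(nodes, min(POLL_LIMIT, len(nodes))):
        if candidate.queue_length < THRESHOLD:
            return candidate
    return None

def information_policy(nodes):
    """Information policy (demand-driven): gather state only when transferring."""
    return {n.name: n.queue_length for n in nodes}

sender = Node("n0", 5)
others = [Node(f"n{i}", i) for i in range(1, 6)]
if transfer_policy(sender):
    target = location_policy(others)
    if target:
        print(f"move {selection_policy(sender)} from {sender.name} to {target.name}")
```

Each policy can be swapped independently: for example, replacing the random polling in the location policy with broadcast probing changes the algorithm's message overhead without touching the other three components.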

Department of Computer Science & Engineering


JNTUH – C.S.E. & IT - III- I
Aurora’s Scientific, Technological and Research Academy, Hyderabad
Advanced Operating Systems Issues In Load Distributing

Scheduling refers to the execution of non-interactive processes or tasks at


designated times and places around a network of computer

Distributed scheduling refers to the chaining of different jobs into a coordinated workflow that spans several computers. For
example: - you schedule a processing job on computer1 and computer2, and when these are finished you need to schedule a
job on computer3, this is distributed scheduling.

******************

Locally Distributed System consists of a collection of autonomous computers, connected by a


LAN.
In a locally distributed system, there is a good possibility that several computers are heavily
loaded while others are ideal or lightly loaded.
If we can move jobs around, the overall performance of the system can be maximized.
A Distributed Scheduler is a Resources Management Component of a distributed operating
system that focuses on judiciously and transparently redistributing the Load of the system
among the computers to maximize the overall performance.

A distributed system may have a mix of heavily and lightly loaded system.

Department of Computer Science & Engineering


JNTUH – C.S.E. & IT - III- I
Aurora’s Scientific, Technological and Research Academy, Hyderabad
Advanced Operating Systems Issues In Load Distributing

“A distributed system may have a mix of heavily and lightly loaded system.”

 Tasks arrive at random intervals.

 CPUs service time requirements are also random.

 Migrating a task to share or balance load can help.

System may be heterogeneous in terms of CPU speed and resources. The distributed system

may be heterogeneous in terms of load in different system.

Even in homogeneous distributed system a system may be idle even when a task is waiting for

service in other system.

Department of Computer Science & Engineering


JNTUH – C.S.E. & IT - III- I
Aurora’s Scientific, Technological and Research Academy, Hyderabad
Advanced Operating Systems Issues In Load Distributing

Consider a system of N identical, independent servers. Let P be the probability that the system

is in state in which at least 1 task is waiting for service and at least 1 server is idle. Let P be

the utilization of each servers.

We can estimate P using probabilistic analysis and plot a graph against system utilization.

For moderate system utilization,

Value of P is high, i.e.

at least 1 node is idle.

Hence, performance can be

improved by sharing of task.

Department of Computer Science & Engineering


JNTUH – C.S.E. & IT - III- I
Aurora’s Scientific, Technological and Research Academy, Hyderabad
Advanced Operating Systems Issues In Load Distributing

LOAD: Load on a system/node can correspond to the queue length of tasks/ processes that

need to be processed.

Queue length of waiting tasks: proportional to task response time, hence a good indicator of
system loads.

Distributing load: Transfer TASKS/PROCESSES among nodes.

If a task transfer (from another node) takes a long time, the node may accept more tasks
during the transfer time.

Causes the node to be highly loaded.

Affects performance.

Artificially increment the queue length when a task is accepted for transfer from remote node
(to account for the proposed increased in load). We can use TIMEOUTS if Task transfer fail.

Department of Computer Science & Engineering


JNTUH – C.S.E. & IT - III- I
Aurora’s Scientific, Technological and Research Academy, Hyderabad
Advanced Operating Systems Issues In Load Distributing

Types of Algorithms:

Static load distribution algorithms: Decisions are hard-coded into an algorithm with knowledge
of system.

Dynamic load distribution: Use system state information such as task queue length, processor
utilization.

Adaptive load distribution: Adapt the approach based on system state.(e.g.) Dynamic
Distribution Algorithms Collect load information from nodes. Even at very high system loads.

Load information collection itself can add load on the system as messages need to be
exchanged.

Adaptive distribution algorithms may stop collecting state information at high loads.

Department of Computer Science & Engineering


JNTUH – C.S.E. & IT - III- I
Aurora’s Scientific, Technological and Research Academy, Hyderabad
Advanced Operating Systems Issues In Load Distributing

Balancing Vs Sharing

Load balancing:

Equalize load on the participating nodes.

Transfer tasks even if a node is not heavily loaded, so that queue lengths on all nodes are
approximately equal.

A larger number of task transfers, which might degrade performance.

Load sharing:

Reduce the burden of an overloaded node.

Transfer tasks only when the queue length exceeds a certain threshold.

A smaller number of task transfers.

Anticipatory task transfer: transfer tasks from overloaded nodes to nodes that are likely to
become idle soon.

More like load balancing, but possibly with fewer transfers.

Advanced Operating Systems Components of Load Distribution Algorithm

Distributed scheduling is concerned with distributing the load of a system among the available
resources in a manner that improves overall system performance and maximizes resource
utilization. The basic idea is to transfer tasks from heavily loaded machines to idle or lightly
loaded ones.

Load distributing algorithms can be classified as static, dynamic or adaptive.

Static means decisions on assignment of processes to processors are hardwired into the
algorithm, using a priori knowledge, perhaps gained from analysis of a graph model of the
application.

Dynamic algorithms gather system state information to make scheduling decisions and so can
exploit under-utilized resources of the system at run time but at the expense of having to
gather system information.

An Adaptive algorithm changes the parameters of its algorithm to adapt to system loading
conditions. It may reduce the amount of information required for scheduling decisions when
system load or communication is high.

Most models assume that the system is fully connected, i.e. that every processor can
communicate, in some number of hops, with every other processor. Generally, because of
communication latency, load scheduling is more practical on tightly coupled networks of
homogeneous processors if the workload involves communicating tasks.

Processor allocation strategies can be Non-Migratory (non-preemptive) or

Migratory (preemptive).

Process migration involves the transfer of a running process from one host to another.

It is an expensive and potentially difficult operation to implement. It involves packaging the

task's virtual memory, threads and control block, I/O buffers, messages, signals and other OS
resources into messages which are sent to the remote host, which uses them to reinitiate the
process.

Non-migratory algorithms involve the transfer of tasks which have not yet begun, and so do not
have this state information, which reduces the complexities of maintaining transparency.

Resource queue lengths and particularly CPU Queue Length are good indicators of load as

they correlate well with the task response time.

It is also very easy to measure queue length.

For example, a number of remote hosts could observe simultaneously that a particular site had

a small CPU queue length and could initiate a number of process transfers. This may result in

that site becoming flooded with processes and its first reaction might be to try to move them

elsewhere. As migration is an expensive operation, we do not want to waste resources (CPU

Time and Communication Bandwidth) by making poor choices which result in increased

migration activity.

Load Balancing Algorithms:

Static
  - Deterministic
  - Probabilistic

Dynamic
  - Centralized
  - Distributed
    - Cooperative
    - Non-Cooperative

A Load Distributing Algorithm has four components:

Transfer Policy

Process Selection Policy

Site Location Policy
  - Sender Initiated
  - Receiver Initiated

Information Policy
  - Demand Driven
  - Periodic
  - State-Change Driven

Transfer Policy

When a process is about to be created, it could run on the local machine or be started

elsewhere.

Bearing in mind that migration is expensive, a good initial choice of location for a process could

eliminate the need for future system activity.

Many policies operate by using a threshold.

If the machine's load is below the threshold then it acts as a potential receiver for remote

tasks.

If the load is above the threshold, then it acts as a sender for new tasks.

Local algorithms using thresholds are simple but may be far from optimal.
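A minimal threshold-based transfer policy might look like the following sketch (the threshold value and function name are illustrative assumptions):

```python
def transfer_policy(queue_length, threshold=3):
    """Classify a node by comparing its queue length to a fixed threshold."""
    if queue_length > threshold:
        return "sender"    # overloaded: try to transfer newly arriving tasks
    if queue_length < threshold:
        return "receiver"  # lightly loaded: willing to accept remote tasks
    return "neutral"       # at the threshold: neither send nor receive
```

Using a single fixed threshold keeps the policy cheap and local, at the cost of ignoring global system state.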

Process Selection Policy

A selection policy chooses a task for transfer.

This decision will be based on the requirement that the overhead involved in the transfer will be

compensated by an improvement in response time for the task and/or the system.

Some means of knowing that a task is long-lived is necessary to avoid needless migration. This

could be based on past history, and a number of other factors could influence the decision.

The size of the task's memory space is the main cost of migration.

Small tasks are more suited.

For efficiency purposes, the number of location dependent calls made by the chosen task

should be minimal because these must be mapped home transparently.

Resources such as a window or an input device may only be available at the task's originating site.
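These criteria can be combined into a simple selection filter. The task fields and cut-off values below are assumptions for illustration, not from the text:

```python
def select_task(tasks, min_expected_runtime=2.0, max_memory_mb=64):
    """Pick a task whose transfer overhead is likely to be repaid.

    Each task is a dict with keys 'expected_runtime' (e.g. estimated from
    past history), 'memory_mb', and 'location_dependent_calls'."""
    candidates = [
        t for t in tasks
        if t["expected_runtime"] >= min_expected_runtime    # long-lived
        and t["memory_mb"] <= max_memory_mb                 # cheap to move
        and t["location_dependent_calls"] == 0              # no home mapping
    ]
    # Among the candidates, prefer the smallest memory footprint.
    return min(candidates, key=lambda t: t["memory_mb"], default=None)
```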

Site Location Policy

Once the transfer policy has decided to send a particular task, the location policy must decide

where the task is to be sent. This will be based on information gathered by the information

policy.

Polling is a widely used SENDER-INITIATED technique. A site polls other sites serially or in

parallel to determine if they are suitable sites for a transfer and/or if they are willing to accept a

transfer. Nodes could be selected at random for polling, or chosen more selectively based on

information gathered during previous polls. The number of sites polled may vary.

A RECEIVER-INITIATED scheme depends on idle machines to announce their availability for

work. The goal of the idle site is to find some work to do. An interesting idea is for it to offer to

do work at a price, leaving the sender to make a cost/performance decision in relation to the

task to be migrated.
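A sender-initiated location policy based on random polling could be sketched as follows; here a dict of queue lengths stands in for the actual poll message exchange, and the parameter names are illustrative:

```python
import random

def locate_receiver(nodes, poll_limit=3, threshold=3, rng=random):
    """Poll up to `poll_limit` randomly chosen nodes; return the first one
    willing to accept a task (queue length below threshold), else None."""
    polled = rng.sample(list(nodes), min(poll_limit, len(nodes)))
    for node in polled:
        if nodes[node] < threshold:   # node replies that it can accept
            return node
    return None                       # no receiver found: run the task locally
```

Capping the number of polls bounds the overhead a sender can impose on the network.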

Information Policy
The information policy decides what information about the states of other nodes should be collected
and where it should be collected from. There are a number of approaches:

Demand Driven
A node collects the state of other nodes only when it wishes to become involved in either sending or
receiving tasks, using sender initiated or receiver initiated polling schemes.
Demand driven policies are inherently adaptive and dynamic as their actions depend on the
system state.

Periodic
Nodes exchange information at fixed intervals. These policies do not adapt their activity to system
state, but each site will have a substantial history over time of global resource usage to guide location
algorithms. Note that the benefits of load distribution are minimal at high system loads and the
periodic exchange of information therefore may be an unnecessary overhead.

State-Change Driven
Nodes disseminate state information whenever their state changes by a certain amount. This state
information could be sent to a centralized load scheduling point or to peers.
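For instance, a state-change-driven policy can be sketched as a node that disseminates its load only after it has drifted by some amount (the class name and `delta` are illustrative assumptions):

```python
class StateChangeDriven:
    """Broadcast a node's load only when it has changed by at least `delta`
    since the last report, limiting information-exchange overhead."""

    def __init__(self, delta=2):
        self.delta = delta
        self.last_reported = 0
        self.sent = []                       # stand-in for messages to peers

    def update(self, load):
        if abs(load - self.last_reported) >= self.delta:
            self.sent.append(load)           # disseminate the new state
            self.last_reported = load
```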

Advanced Operating Systems Stability

The two views of stability are:

The Queuing-Theoretic Perspective

A system is termed unstable if the CPU queues grow without bound; this happens when the

long-term arrival rate of work to the system is greater than the rate at which the system can

perform work.

The Algorithmic Perspective

If an algorithm can perform fruitless actions indefinitely with finite probability, the algorithm is

said to be unstable.
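The queuing-theoretic condition reduces to keeping utilization below one, which can be stated as a one-line check:

```python
def is_stable(arrival_rate, service_rate):
    """Queues stay bounded only if work arrives more slowly than the
    system can complete it, i.e. utilization rho = lambda/mu < 1."""
    rho = arrival_rate / service_rate
    return rho < 1
```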

Advanced Operating Systems Load Distributing Algorithms

Load Distributing Algorithms:

Sender-Initiated Algorithms

Receiver-Initiated Algorithms

Symmetrically Initiated Algorithms

Adaptive Algorithms

Load distribution algorithms can be classified as Static, Dynamic or Adaptive.

Static schemes use a priori information about the system, based on which load is distributed

from one server to another.

The disadvantage of this approach is that it cannot exploit short-term fluctuations in the

system state to improve performance.

This is because static algorithms do not collect the state information of the system. These

algorithms are essentially graph-theory driven, or based on some mathematical programming

aimed at finding an optimal schedule with a minimum cost function.

Dynamic scheduling algorithms collect system state information and make scheduling decisions

based on it. They incur an extra overhead of collecting and storing system state information, but

they yield better performance than static ones.

In dynamic load distribution for homogeneous systems, the scenario of a task waiting at one

server while another server is idle is termed the "wait and idle" (WI) condition. Significantly, for

a distributed system of 20 nodes with a system load between 0.33 and 0.89, the probability of

the WI state is greater than 0.9. Thus, at typical system loads there is always a potential for

improvement in performance, even when nodes and process arrival rates are homogeneous.

Adaptive load balancing algorithms are a special class of dynamic load distribution algorithms,

in that they adapt their activities by dynamically changing the parameters of the algorithm to

suit the changing system state.

Adaptive algorithms use previously gathered information to select which node to query, and also

adjust their thresholds according to that information.

Pre-emptive and Non Pre-emptive Transfers

A pre-emptive transfer involves the transfer of tasks that are partially executed. These transfers

are expensive because the state of the task also needs to be transferred to the new location.

Non pre-emptive transfers involve the transfer of tasks that have not yet started executing. For a

system that experiences wide fluctuations in load and has a high cost for the migration of partly

executed tasks, non pre-emptive transfers are appropriate.

Comparison

Sender-initiated algorithms work well at low system load, but at high system load, when most

of the nodes are senders, they send queries to each other, wasting CPU cycles and incurring

more delay, due to which the system becomes unstable.

This instability occurs with receiver-initiated algorithms when the system load is low and most

nodes are receivers.

Symmetrically initiated algorithms cannot use previously gathered information, and so are

stateless.

Adaptive algorithms use previous information to query a new node and also adjust their

thresholds according to that information.

Advanced Operating Systems Requirements for Load Distributions

Load Balancing – Empirical evidence from some studies has shown that often a small subset of the

processes running on a multiprocessor system accounts for much of the load, and a small amount of

effort spent off-loading such processes may yield a big gain in performance.

Load Sharing – Load sharing algorithms avoid idle time on individual machines when others have

non-empty work queues.

Communication – Network saturation can be caused by heavy communication traffic induced by data

transfers from one task to another, residing on separate hosts.

Fault Tolerance – Long running processes may be considered valuable elements in any system because

of the resource expenditure that has been outlaid.

Application Concurrency – The divide and conquer, or crowd, approach to problem solving decomposes

a problem into a set of smaller problems, similar in nature, and solves them separately.

Advanced Operating Systems Task Migration

Process/Task migration is the transfer of partially executed tasks to another node in a

distributed system i.e. preemptive task transfer. Some references to task migration include the

transfer of processes before execution begins, but the most difficult issues are those related to

preemption.

Task migration is the movement of an executing task from one host processor (source) in

distributed computing system to another (destination).

Task placement is the selection of a host for a new task and the creation

of the task on that host.

Benefits of Task Migration

Load Balancing – Improve performance for a distributed computing system overall, or a

distributed application, by spreading the load more evenly over a set of hosts.

Reduction in Communication Overhead – By locating on one host a group of tasks with

intensive communication amongst them.

Resource Access – Not all resources are available across the network; a task may need to

migrate in order to access a special device, or to satisfy a need for a large amount of physical

memory.

Fault-Tolerance – Allowing long running processes to survive the planned shutdown or failure of

a host.

Advanced Operating Systems Issues in Task Migration

Steps involved in Task Migration

Suspending (freezing) the task on the source

Extracting and transmitting the state of the task to destination

Reconstructing the state on the destination

Deleting the task on the source and resuming the task’s execution on the destination
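The four steps above can be sketched as follows; the Host class and its methods are illustrative assumptions, not a real migration API:

```python
class Host:
    """Toy host holding tasks as name -> state dicts."""
    def __init__(self, name):
        self.name = name
        self.tasks = {}

    def freeze(self, task):
        pass                                  # stop the task's execution

    def extract_state(self, task):
        return self.tasks[task]               # registers, memory, files...

    def reconstruct(self, task, state):
        self.tasks[task] = state              # rebuild the state remotely

    def delete(self, task):
        del self.tasks[task]

    def resume(self, task):
        pass                                  # restart execution remotely


def migrate(task, source, destination):
    source.freeze(task)                       # 1. freeze on the source
    state = source.extract_state(task)        # 2. extract and transmit state
    destination.reconstruct(task, state)      # 3. reconstruct on destination
    source.delete(task)                       # 4. delete on the source...
    destination.resume(task)                  #    ...and resume on destination
```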

Issues

State Transfer

Location Transparency

Structure of a Migration Mechanism

State Transfer

The cost to support remote execution

Freezing the task (as little time as possible)

Obtaining and transferring the state

Unfreezing the task

Residual Dependencies – refer to the amount of resources that the former host of a migrated task

continues to dedicate to servicing requests from the migrated task.

They are undesirable for three reasons: reliability, performance and complexity.

Precopying the State – The bulk of the task state is copied to the new host before freezing the task.

Location-transparent file access mechanism.

Copy-On-Reference – Copy only what the migrated task needs for its execution.

Location Transparency

Task migration should hide the locations of tasks. Location transparency in principle requires

that names (process name, file names) be independent of their locations (host names). Need to

have uniform name space throughout the system.

Structure of a Migration Mechanism

There will be interaction between the task migration mechanism, the memory management

system, the interprocess communication mechanisms, and the file system.

The mechanisms can be designed to be independent of one another, so that if one mechanism's

protocol changes, the others need not change. The migration mechanism can also be turned off

without interfering with the other mechanisms.

Advanced Operating Systems Distributed Shared Memory

Distributed Shared Memory (DSM) is a form of memory architecture where physically

separated memories can be addressed as one logically shared address space.

Advanced Operating Systems DSM – Architecture and Motivation

Distributed Shared Memory (DSM) is a resource management component of a distributed

operating system that implements the shared memory model in distributed systems, which

have no physically shared memory. The shared memory model provides a virtual address space

that is shared among all computers in a distributed system.

In DSM, data is accessed from a shared address space, similar to the way that virtual memory is

accessed.

Data moves between Secondary and Main Memory, as well as, between the Distributed

Main Memories of different nodes. Ownership of pages in memory starts out in some pre-

defined state but changes during the course of normal operation. Ownership changes take place

when data moves from one node to another due to an access by a particular process.

 Hide data movement and provide a simpler abstraction for sharing data. Programmers don't

need to worry about memory transfers between machines like when using the message

passing model.

 Allows the passing of complex structures by reference, simplifying algorithm development

for distributed applications.

 Takes advantage of "locality of reference" by moving the entire page containing the data

referenced rather than just the piece of data.

 Cheaper to build than multiprocessor systems. Ideas can be implemented using normal

hardware and do not require anything complex to connect the shared memory to the

processors.

 Larger memory sizes are available to programs, by combining all physical memory of all

nodes. This large memory will not incur disk latency due to swapping like in traditional

distributed systems.

 An unlimited number of nodes can be used, unlike multiprocessor systems where main memory

is accessed via a common bus, thus limiting the size of the multiprocessor system.

 Programs written for shared memory multiprocessors can be run on DSM systems.

DSM Organization (figure)

Advanced Operating Systems Algorithms for Implementing DSM

Challenges in DSM
How to keep track of the location of remote data?
How to overcome the communication delays and high overhead associated with the references
to remote data?
How to allow "controlled" concurrent accesses to shared data?

Algorithms:

 The Central-Server Algorithm

 The migration algorithm

 The Read-Replication Algorithm

 The Full Replication Algorithm


The Central-Server Algorithm

 A central server maintains all the shared data.

 For a read: the server simply returns the data.

 For a write: the server updates the data and sends an acknowledgment to the client.

 A simple working solution for providing shared memory to distributed applications.

 Low efficiency: the server becomes a bottleneck, and memory access latency is long.

 Data can be distributed across several servers, which then requires a directory that stores the location of each page.
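The request/reply structure above can be sketched as a minimal simulation. `CentralServer` and `Client` are illustrative names, not from the text, and a real system would exchange messages over the network rather than make direct calls.

```python
# Sketch of the central-server algorithm: all shared data lives at one
# server, and every read/write becomes a request/reply exchange with it.
class CentralServer:
    def __init__(self):
        self.pages = {}            # page id -> data held by the server

    def handle_read(self, page):
        # For a read, the server just returns the current data.
        return self.pages.get(page)

    def handle_write(self, page, data):
        # For a write, the server updates the data and acknowledges.
        self.pages[page] = data
        return "ACK"

class Client:
    def __init__(self, server):
        self.server = server       # every access goes through the server

    def read(self, page):
        return self.server.handle_read(page)

    def write(self, page, data):
        assert self.server.handle_write(page, data) == "ACK"

# Usage: two clients share page 0 through the single server (the bottleneck).
server = CentralServer()
a, b = Client(server), Client(server)
a.write(0, "hello")
print(b.read(0))    # the write by a is immediately visible to b
```

Because every access, even to a client's "own" data, crosses to the server, the single server is both the correctness anchor and the performance bottleneck noted above.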


The Migration Algorithm

 Data is shipped to the location of the access request, so subsequent accesses are local.

 For both reads and writes: bring the remote page to the local machine, then perform the operation.

 Keeping track of memory locations: a location service, a home machine for each page, or broadcast.

 Problems: thrashing (pages move between nodes frequently) and false sharing.

 Multiple reads can be costly.
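The migration mechanism can be sketched as follows; `LocationService` stands in for whichever tracking mechanism (location service, home machine, or broadcast) is used, and the names are illustrative.

```python
# Sketch of the migration algorithm: the whole page moves to the node
# that accesses it, so subsequent accesses on that node are local.
class Node:
    def __init__(self, name):
        self.name = name
        self.pages = {}                      # pages currently resident here

class LocationService:
    """Tracks which node currently owns each page."""
    def __init__(self):
        self.owner = {}

    def access(self, node, page):
        # Migrate the page to the requesting node if it is remote.
        current = self.owner.get(page)
        if current is not None and current is not node:
            node.pages[page] = current.pages.pop(page)   # ship the page over
        elif current is None:
            node.pages.setdefault(page, None)            # first touch
        self.owner[page] = node
        return node.pages   # all further reads/writes on `page` are local

svc = LocationService()
n1, n2 = Node("n1"), Node("n2")
svc.access(n1, 7)[7] = 42        # n1 touches page 7; it now lives on n1
svc.access(n2, 7)                # n2's access migrates the page to n2
print(7 in n2.pages, 7 in n1.pages)   # True False -- only one copy exists
```

Note how alternating accesses by `n1` and `n2` would bounce the page back and forth on every access, which is exactly the thrashing problem listed above.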


The Read-Replication Algorithm

 On a read, the page is replicated (and marked as having multiple readers).

 On a write, all copies except one must be updated or invalidated before the write.

 Multiple readers, one writer.

 Allows multiple readers of a page.

 All replica locations must be tracked: a location service or a home machine for each page.
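The multiple-reader/single-writer behavior can be sketched with a copy set per page; `ReadReplicationDSM` and the node names are illustrative, not from the text.

```python
# Sketch of read-replication: reads create replicas (multiple readers),
# while a write first invalidates every other copy (one writer).
class ReadReplicationDSM:
    def __init__(self):
        self.data = {}             # page -> current value
        self.copies = {}           # page -> set of nodes holding a replica

    def read(self, node, page):
        # A read replicates the page at the reader and records the copy.
        self.copies.setdefault(page, set()).add(node)
        return self.data.get(page)

    def write(self, node, page, value):
        # Invalidate all copies except the writer's before the write.
        self.copies[page] = {node}
        self.data[page] = value

dsm = ReadReplicationDSM()
dsm.write("n1", 0, "v1")
dsm.read("n2", 0)
dsm.read("n3", 0)
print(sorted(dsm.copies[0]))   # ['n1', 'n2', 'n3'] -- multiple readers
dsm.write("n2", 0, "v2")
print(sorted(dsm.copies[0]))   # ['n2'] -- one writer after invalidation
```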

The Full Replication Algorithm

 Allows multiple readers and multiple writers.

 Access to the shared memory must be controlled.
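One common way to control concurrent writers under full replication is a global sequencer that totally orders the writes; that mechanism is an assumption for this sketch, not stated in the text, and all names are illustrative.

```python
# Sketch of full replication: every node holds a copy and may write, so
# access must be controlled. Here a global sequencer (an assumed, common
# choice) stamps writes so all replicas apply them in a single order.
import itertools

class Sequencer:
    def __init__(self):
        self._counter = itertools.count()

    def next_seq(self):
        return next(self._counter)

class Replica:
    def __init__(self, seq):
        self.seq = seq
        self.log = []              # (sequence number, page, value)
        self.data = {}

    def write(self, page, value, replicas):
        # Stamp the write, then broadcast it to every replica (including us).
        stamp = self.seq.next_seq()
        for r in replicas:
            r.apply(stamp, page, value)

    def apply(self, stamp, page, value):
        # Re-apply the log in sequence order so all replicas converge.
        self.log.append((stamp, page, value))
        self.log.sort()
        self.data = {}
        for _, p, v in self.log:
            self.data[p] = v

seq = Sequencer()
r1, r2 = Replica(seq), Replica(seq)
group = [r1, r2]
r1.write(0, "a", group)
r2.write(0, "b", group)
print(r1.data[0] == r2.data[0])   # True -- replicas agree on the final value
```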

Advanced Operating Systems Memory Coherence

Memory coherence is an issue that affects the design of computer systems in which two or

more processors or cores share a common area of memory.

In DSM there are two or more processing elements working at the same time, and so it is

possible that they simultaneously access the same memory location. Provided none of them

changes the data in this location, they can share it indefinitely and cache it as they please. But

as soon as one updates the location, the others might work on an out-of-date copy that, e.g.,

resides in their local cache. Consequently, some scheme is required to notify all the processing

elements of changes to shared values; such a scheme is known as a Memory Coherence

Protocol, and if such a protocol is employed the system is said to have a Coherent Memory.


 The set of allowable memory access orderings forms the memory consistency model.

 A memory is coherent if the value returned by a read operation is always the value the programmer expected.

o The strict consistency model is typical of uniprocessors: a read returns the most recently written value.

o Enforcing strict consistency is very costly in distributed systems: it is hard to determine which write was the most recent.

o To improve performance, the memory consistency model must be relaxed.


Relaxed memory consistency models:

 Sequential Consistency: the result of any execution is the same as if the operations of all the processors were executed in some sequential order.

 General Consistency: all copies of a memory location eventually contain the same data once all the writes issued by every processor have completed.

 Weak Consistency: synchronization accesses are sequentially consistent.

 All data accesses must be performed before each synchronization.

 Other consistency models: Processor Consistency, Release Consistency.
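The sequential consistency definition can be checked mechanically on a classic two-processor litmus test. This sketch assumes only that every interleaving preserving each processor's program order is allowed, which is exactly the definition above; with P1 = [x=1; r1=y] and P2 = [y=1; r2=x], the outcome r1=r2=0 never occurs under sequential consistency.

```python
# Enumerate all sequentially consistent executions of a two-processor
# litmus test and collect the possible (r1, r2) outcomes.
P1 = [("w", "x", 1), ("r", "y", "r1")]
P2 = [("w", "y", 1), ("r", "x", "r2")]

def interleavings(a, b):
    # All merges of a and b that preserve each list's internal order.
    if not a:
        yield list(b); return
    if not b:
        yield list(a); return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

outcomes = set()
for schedule in interleavings(P1, P2):
    mem, regs = {"x": 0, "y": 0}, {}
    for op, var, arg in schedule:
        if op == "w":
            mem[var] = arg        # write updates the single shared memory
        else:
            regs[arg] = mem[var]  # read returns the latest written value
    outcomes.add((regs["r1"], regs["r2"]))

print(sorted(outcomes))   # [(0, 1), (1, 0), (1, 1)] -- never (0, 0)
```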

Advanced Operating Systems Coherence Protocols

Coherence Protocols apply Cache Coherence in multiprocessor systems.

The intention is that two clients must never see different values for the same

shared data.

The Protocol must implement the basic requirements for Coherence. It can be

tailor-made for the target system or application.


 Protocols are needed to keep the data replicas consistent.

 There are two basic types of protocols:

 Write-Invalidate Protocol: a write to shared data causes the invalidation of all copies except one before the write proceeds.

 Write-Update Protocol: a write to shared data causes all copies of that data to be updated.

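The two basic protocols can be contrasted on a set of cached copies; the function and cache names below are illustrative, not from the text.

```python
# Sketch contrasting the two basic coherence protocols on cached copies.
def write_invalidate(caches, writer, addr, value):
    # Invalidate every copy except the writer's, then perform the write.
    for name, cache in caches.items():
        if name != writer:
            cache.pop(addr, None)
    caches[writer][addr] = value

def write_update(caches, writer, addr, value):
    # Update every copy that currently holds the address.
    caches[writer][addr] = value
    for cache in caches.values():
        if addr in cache:
            cache[addr] = value

caches = {"c1": {0: "old"}, "c2": {0: "old"}}
write_invalidate(caches, "c1", 0, "new")
print(caches)   # c2's copy is gone; only c1 holds the new value

caches = {"c1": {0: "old"}, "c2": {0: "old"}}
write_update(caches, "c1", 0, "new")
print(caches)   # both copies now hold the new value
```

Invalidation sends a small message but forces other nodes to re-fetch on their next access; update keeps all copies usable but ships the new value everywhere on every write.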


Case Study: Cache Coherence & using Coherence Protocols in the system.

 Write-update protocol.

 General consistency.

 Unit of replication: a page (4 KB).

 Coherence is maintained at the granularity of one word.

 A virtual page corresponds to a list of replicas; one of the replicas is the master copy.

 On a read fault: if the data is in local memory, read it locally; otherwise, send a request to the specified remote node and fetch the data.


 For a write: first update the master copy, then propagate the update to the copies linked on the copy list.

 On a write fault, if the address indicates a remote node, the update request is sent to that node. If the copy there is not the master copy, the update request is forwarded to the node containing the master copy, which applies it and then propagates it further.

 Writes are nonblocking.

 A read blocks until all previous writes have completed.

 A write-fence is used to flush all previous writes.
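The master-copy write path with a write-fence can be sketched as follows; the class and method names are illustrative, and in a real system the propagation step would be asynchronous messaging rather than a loop.

```python
# Sketch of the case-study write path: a write goes to the master copy
# first and is then propagated along the copy list. Writes are queued
# (nonblocking) and a write-fence flushes everything queued so far.
class Page:
    def __init__(self, master, replicas):
        self.master = master        # the master copy of the page
        self.copy_list = replicas   # the other replicas on the copy list
        self.pending = []           # queued, not-yet-propagated writes

    def write(self, offset, value):
        # Nonblocking: update the master copy and queue the propagation.
        self.master[offset] = value
        self.pending.append((offset, value))

    def write_fence(self):
        # Flush all previous writes to every replica on the copy list.
        for offset, value in self.pending:
            for replica in self.copy_list:
                replica[offset] = value
        self.pending.clear()

master, r1, r2 = {}, {}, {}
page = Page(master, [r1, r2])
page.write(0, "x")          # returns immediately; replicas still stale
page.write_fence()          # now every replica reflects the write
print(master == r1 == r2)   # True
```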

Advanced Operating Systems Design Issues

Distributed shared memory (DSM) systems could overcome the major obstacles to the widespread use of distributed-memory multiprocessors, while retaining the attractive features of low cost and good scalability common to distributed-memory machines.

A DSM system allows a natural and portable programming model on distributed-

memory machines, making it possible to construct a relatively inexpensive and

scalable parallel system on which programmers can develop parallel application

codes. Due to its potential advantages, DSM has received increasing attention.

The specific question we are trying to answer is: "Can we determine a set of system design parameters that defines an efficient realization of a distributed shared memory system?"


These design parameters include:

 Invalidate

 Find Data

 Writeback / Writethrough

 Cache Block States

 Contention for Tags

 Enforcing Write Serialization