
CS330: Operating Systems

Lecture 12
Limited Direct Execution
What we have seen so far:
1. Process: each process thinks it owns a CPU.
2. Address space: each process feels it owns a huge address space.
3. File system tree: the user feels like operating on files directly.
What are the OS responsibilities for providing the above virtual notions?

OS performs multiplexing of resources efficiently.


Maintains mapping of virtual view to physical resource.

The OS partitions resources in terms of time and space.


A resource can be space partitioned if it is itself partitionable. If that is not the
case, it has to be partitioned across time, which requires state management of the resource.

Resource Multiplexing: Efficiency/Performance


Resource virtualization should not add excessive overheads. This might happen if the OS
intervenes too much.
This might suggest that efficiency is guaranteed when programs use the resource directly,
without OS intervention.
There is a catch however:
1. Loss of control: a process running in an infinite loop never returns the CPU
2. Isolation issues: a process accessing and modifying OS data structures
Therefore: Some limits to direct access must be enforced ; hence Limited Direct Execution.

Hardware support
The OS cannot enforce limits on processes by itself and still achieve efficiency. OS needs support
from hardware: Privilege levels.

The CPU can execute in two modes: user mode and kernel mode. Some operations (privileged
operations) are allowed only in kernel mode: if executed from user space, the hardware
notifies the OS by raising a fault/trap.

A user process cannot change the privilege level of the CPU by itself when it needs to
invoke the services of the OS. Instead, the hardware provides entry instructions from user
mode which cause a mode switch. The OS defines the handlers for the different entry gates;
these handlers also serve faults/exceptions and device interrupts. Registration of handlers
is privileged.
After boot, the OS configures the handlers for system calls, exceptions/faults and
interrupts. The handler code is invoked when a user-mode process makes a system call,
raises an exception, or an external interrupt arrives.

Lecture 13
Privileged ISA (x86_64)
The support needed from the hardware includes:
(i) CPU privilege levels
(ii) switching between modes- entry points and handlers

x86_64 rings of protection


4 privilege levels (0 is the highest, 3 the lowest)
- Some operations are allowed only at privilege level 0.
- Most OSes use 0 for the kernel and 3 for user.
- Privilege can be enforced in different ways:
a) the instruction is privileged b) the operand is privileged

Privileged instruction: HLT (on Linux x86_64)

#include <stdio.h>

int main() {
    asm("hlt;");   /* privileged: executing this in user mode raises a fault */
    return 0;
}

HLT halts the CPU till the next external interrupt. It is a privileged instruction and
cannot be executed from user space. If it is executed from user space, the Linux kernel
kills the application.

Note: this protection is independent of user-user privilege distinctions (e.g., root user
vs. others). User privilege is a software construct.

Privileged operand: CR3 (on Linux x86_64)

#include <stdio.h>

int main() {
    unsigned long cr3_val;
    /* reading CR3 is allowed only at privilege level 0 */
    asm volatile (
        "mov %%cr3, %0;"
        : "=r" (cr3_val)
        :
    );
    printf("%lx\n", cr3_val);
    return 0;
}

The CR3 register points to the address translation information of the current address
space. Accessing it from user space results in a protection fault.

Interrupt Descriptor Table (IDT): gateway to handlers


The interrupt descriptor table provides a way to define handlers for different events like
external interrupts, faults and system calls by defining descriptors.
Descriptors 0 − 31 correspond to predefined events, e.g., 0: division-by-zero exception.
Descriptors 32 − 255 are OS defined and can be used for hardware and software interrupt
handling.

Defining the Descriptor Table (OS BOOT)


⇒ Each descriptor contains information about handling the event. This includes information
about the privilege level switch, the handler address, etc.
The OS defines the descriptors and loads the IDTR register with the address of the
descriptor table, using LIDT (a privileged instruction).
Note: The 2 least significant bits of the CS register define the mode of execution (0 for
kernel mode, 3 for user mode).
For instance, int 0x80 is the instruction to raise an interrupt for descriptor number
0x80.

System call and INT instruction (gemOS)


INT #N raises a software interrupt; the CPU invokes the handler defined in IDT descriptor
#N. Conventionally, #0x80 is used as the system call entry gate.
The generic system call handler invokes the appropriate system call handler using
- the system call number associated with each system call
- information like the system call number and arguments, which the user process passes
through CPU registers and which is used to invoke the actual handler.

Lecture 15-16
OS Mode Execution
Post Boot OS execution

OS is triggered because of interrupts, exceptions, or system calls.


Since exceptions and interrupts are abrupt, the user process may not be prepared for the
event.
The interrupted program could be corrupted after resumption unless the OS saves the user
execution state on entry and restores it on return.

Does the OS need a separate stack?


OS execution does require a stack for obvious reasons (function calls and returns).

⇒ Can the OS use user stacks?


Not recommended, for security and efficiency reasons:
- The user may have an invalid SP at the time of entry
- The OS would need to erase the used stack area before returning

⇒ The OS has its own stack, and the stack is switched on kernel entry. On x86_64 the
hardware switches the stack pointer to the stack address configured by the OS.

How many OS stacks are required?


A per-process OS stack is required to allow multiple processes to be in OS mode of
execution simultaneously.
→ The OS configures the kernel stack address of the currently executing process in the
hardware (this happens when each process starts executing).
→ The hardware switches the stack pointer on a system call or exception.
What about external interrupts?
External interrupts may arrive even while executing in kernel mode. Therefore the OS uses
separate interrupt stacks for handling interrupts.
How is the user process state preserved on entry to OS and restored in return to user space?
User-kernel context switch
- The CPU saves the execution state onto the kernel stack.
- The OS handler finds the SP switched, with the user state saved (fully or partially,
depending on the architecture).
- The OS executes the event (syscall/interrupt) handler using the kernel stack.
- The kernel stack pointer should be back at its entry-time position after the handler
finishes.
- The CPU loads the user execution state and resumes user execution.
// looks something like this
entry() {
    save_user_regs();        // save user state onto the kernel stack
    invoke_actual_handler(); // syscall/exception/interrupt handler
    restore_user_regs();     // restore user state and return to user mode
}

Which address space does the OS use?


Two possible design choices:
- Use a separate address space for the OS and change the translation information on every
OS entry (inefficient approach)
- Reserve a part of the address space of every process for the OS and protect it using
hardware assistance (commonly used approach)

Lecture 17-18
Process Scheduling
Triggers for process context switch
- The user process can invoke the scheduler through system calls like sched_yield()
- The user process can invoke sleep() to suspend itself
- A process may have to wait for some I/O event, e.g., read() from a file on disk
- The OS gets control back on every syscall and exception. Before returning from the
syscall, the scheduler can de-schedule the process and schedule something else.
- Timer interrupts can be configured to fire periodically or after a configured time. The
OS can invoke the scheduler after handling the interrupt.

Process Context Switch

Assume a process P2 was initially in kernel mode, was de-scheduled some time back by the
scheduler, and has now been scheduled again.

Scheduling

A queue of ready processes is maintained. The scheduler picks a process based on some
scheduling policy and performs a context switch. The outgoing process is put back into the
ready queue (if required).

Most processes perform a mixture of CPU and I/O activities. When a process is waiting for
I/O it is moved to the waiting state. The process becomes ready again when the event
completion is notified (e.g., on a device interrupt).
What if there is no process in the ready queue?
System idle process: there can be instances when there are zero processes in the ready
queue. A special idle process is always present; it halts the CPU (HLT on x86_64).

Classification of scheduler on the basis of invocation


Preemptive v/s Non-preemptive
There are scheduling points which are triggered by the current process's execution
behaviour (non-preemptive):
- process termination
- process explicitly yielding the CPU
- process waits for an I/O event
The OS may invoke the scheduler in other cases (preemptive):
- return from system call (specifically fork())
- after handling an interrupt (specifically the timer interrupt)
Non-preemptive- triggered by the process and Preemptive- OS interjections

Scheduling metrics
Consider the following metrics:
Turnaround time: time of completion − time of arrival [objective: minimize turnaround
time]
Waiting time: sum of time spent in the ready queue [objective: minimize waiting time]
Response time: waiting time before the first execution of the process [objective:
minimize response time]
The average value of a metric represents overall efficiency.
The standard deviation is a measure of the fairness of the scheduling.

Lecture 19-20
Process Scheduling Policies
First Come First Serve (FCFS)
FIFO queue based non-preemptive scheduling.
Eg.:

     AT  CPU Burst
P1    0        100
P2    0         15
P3    0         45

FCFS schedule: | P1 | P2 | P3 |
               0    100  115  120

Avg. Turnaround time = (100 + 115 + 120)/3 = 111.67
Avg. Waiting time = (0 + 100 + 115)/3 = 71.67

Advantage: Easy to implement


Issues with FCFS
- Convoy effect
- Not suitable for interactive applications.

Shortest Job First (SJF)


Select the process with the shortest CPU burst. Pick the next process only when the
current process finishes (non-preemptive).

SJF schedule: | P3 | P2 | P1 |
              0    5    20   120

Avg. Turnaround time = (5 + 20 + 120)/3 = 48.33
Avg. Waiting time = (0 + 5 + 20)/3 = 8.33
SJF is optimal on waiting and turnaround time. However, it is not realistic: the execution
time is not known in advance.

Shortest Time to Completion First (STCF)


Pick the process with the shortest remaining time whenever a new process arrives in the
ready queue (also called Shortest Remaining Time First, SRTF).
Eg.:
     AT  CPU Burst
P1    0        10
P2    1         5
P3    2         2
P4    3         5

STCF schedule: | P1 | P2 | P3 | P2 | P4 | P1 |
               0    1    2    4    8    13   22

Avg. Waiting time = (12 + 2 + 0 + 5)/4 = 4.75
Improves on SJF at the cost of more context switches.

Round Robin Scheduling


Preemptive scheduling with time slicing. The ready queue is maintained as a circular
queue. At the end of the time quantum, if there are other processes in the queue, the
current process goes to the TAIL of the queue and the next process is picked from the
HEAD of the queue.
Design choice: length of the time quantum.
Eg.: Assume the time quantum = 5

     AT  CPU Burst
P1    0        15
P2    3         5
P3    6         5

RR schedule: | P1 | P2 | P1 | P3 | P1 |
             0    5    10   15   20   25

Avg. Waiting time = (10 + 2 + 9)/3 = 7

Priority Scheduling
Select the process with the highest priority; can be preemptive or non-preemptive.
SJF is a priority scheme where the priority is derived from the job length.
Advantage: practical (no assumptions about job lengths)
Disadvantage: starvation
Problem formulation with I/O bursts: most processes require a series of CPU and I/O
bursts. Every CPU burst can be treated as a new process, with the start of the burst as
the arrival time and the burst length as the execution length.

Lecture 21
Static priority based scheduling

Processes are assigned to different queues based on their priority. Processes from the non-empty
highest priority queue is always picked. Different queues may implement schemes.
Main concern: Starvation of low-priority processes.

Multi-level feedback queue


Dynamically adjust priority such that:
- Interactive applications are responsive.
- Shorter jobs do not suffer.
- No starvation
- No user can trick the scheduler.
Basic Multi-level strategy: Pick a process from highest priority queue and apply RR within
queue.
Dynamic Priorities:
- A process is assigned highest priority when it is initially created.
- If the process consumes the slice (scheduler is invoked because of timer), its priority is reduced.
- If the process relinquishes the CPU (I/O wait, etc.), its priority remains the same.

Approximation of SJF
MLFQ can approximate SJF because:
- long-running jobs are moved to low priority queues.
- new jobs are added to the highest priority queue.
A shorter job may have to wait only a small duration before it gets to execute, upper
bounded by (#jobs in the highest priority queue + 1) × time quantum.

MLFQ favours interactive jobs: they maintain the highest priority because they relinquish
the CPU before the quantum expires, while long-running jobs are moved to low priority
queues. Thus in a steady state, interactive jobs compete only with other interactive jobs.

MLFQ - Starvation and other issues


- Long-running processes may starve under the proposed scheme.
- Additionally, permanent priority demotion hurts processes that change their behaviour.
Eg.: a process performing a lot of computation only at its start gets pushed to a low
priority queue permanently.

How to avoid these issues?


- Periodic priority boost: all processes are moved to the high priority queue.
- Priority boost with ageing: recalculate the priority based on the scheduling history of
a process.

Can the user trick the scheduler?


A smart user can keep a long-running process at the highest priority by exploiting the
strategy (assuming the user knows the time quantum).
- Strategy: voluntarily release the CPU just before the time quantum expires. Result:
batch processes compete with interactive processes.
- Core of the issue: the scheduler keeps only binary history about a process (did it use
up its quantum or not?). Advanced MLFQ designs use better accounting and variable quanta.

Scheduling in Real OSes


Scheduling requirements of different processes in the system are different.
→ Real time applications: should meet strict deadlines
→ Interactive processes: responsive scheduling
→ Batch processes: starvation free scheduling
Well-intentioned users should be able to influence the scheduling policy in a positive
manner, while the OS keeps greedy users in check.
Therefore, OS scheduling should provide flexibility while being auto-tuning in nature.

LINUX SCHEDULING CLASSES: Real time applications

Real-time processes always have higher priority than normal processes (preemptive in
nature).
Priority value: 1 − 99 (lower value: higher priority)
SCHED_FIFO: run to completion
SCHED_RR: round robin within a given priority level
The sched_setscheduler syscall defines the scheduling class and priority.

LINUX SCHEDULING CLASSES: Normal applications

SCHED_OTHER: the default policy; OS dynamic priorities and variable time slicing come
into the picture.
SCHED_BATCH: assumed CPU-bound while calculating priorities.
SCHED_IDLE: very low priority jobs.

40 priority levels (100 − 139)


Every process starts with a default priority of 120.
Linux provides the nice() syscall to adjust the static priority.
nice(int x): where x is between −20 and 19

nice(19) moves the process to the lowest priority queue, i.e. 139.
nice(-20) moves the process to the highest priority queue, i.e. 100.

For this scheduling class, the dynamic priority is calculated by the Linux kernel
considering the interactiveness of the process.

Selecting new jobs

A task is picked from the highest-priority non-empty queue. The critical task queue
contains tasks which require immediate attention: hardware events, restarts, etc.
The normal task queue (a.k.a. the fair scheduling class) implements the heuristics to
self-adjust.
If all queues are empty, the swapper task is scheduled (it HLTs the CPU).
