
Assignment (due by midnight IST 25/Oct/19)

Objective: Best C program for multiplying 1024x1024 float matrices on any single core that you normally use. Code must be based on the “triply nested loop algorithm”
Techniques to be attempted: Loop interchange, Blocking, Vectorization
Constraint: No use of any math or linear algebra library routine. No use of transpose
Criterion: Execution time
Submit: Programs + report describing what all you tried, including execution times. Email as a single attachment YourName.tgz to mjt@iisc.ac.in
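As a starting point for the assignment, the baseline triply nested loop and one loop interchange can be sketched as below. This is a minimal sketch, not a tuned solution: the function names, the row-major layout and the flat float arrays are assumptions made for illustration, and no claim is made here about which version is faster on a given machine.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline triply nested loop: C = A * B for n x n row-major float matrices.
   The inner loop reads B down a column, so successive reads are n floats apart. */
void matmul_ijk(size_t n, const float *A, const float *B, float *C)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}

/* Loop interchange (order i, k, j): the same multiply-adds, but the inner
   loop now walks a row of B and a row of C with unit stride. */
void matmul_ikj(size_t n, const float *A, const float *B, float *C)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < n * n; i++)
        C[i] = 0.0f;
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++) {
            float a = A[i * n + k];
            for (size_t j = 0; j < n; j++)
                C[i * n + j] += a * B[k * n + j];
        }
}
```

Both functions compute the same product, so the first can serve as the reference when checking the output of any faster variant.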

In Lecture 6 – Software organization
• Hardware, Process, Operating system
• Operating system
– System calls
– Memory management
• Memory management
– Address translation
– Paged virtual memory
– MMU, TLB

Steps in a Memory Access

Processor: generates a virtual memory address

MMU: translates virtual address into a physical memory address

Cache: looks up the physical address in the cache directory
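The translation step can be illustrated with the address arithmetic involved. The sketch below assumes 4 KiB pages (a 12-bit page offset), which is common but not stated in the slides. The helper names are hypothetical, and the page table or TLB lookup that maps a virtual page number to a frame number is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                    /* assumption: 4 KiB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Virtual page number: the part of the address that gets translated. */
uint64_t vpn_of(uint64_t vaddr)
{
    return vaddr >> PAGE_SHIFT;
}

/* Page offset: passed through translation unchanged. */
uint64_t offset_of(uint64_t vaddr)
{
    return vaddr & (PAGE_SIZE - 1u);
}

/* Physical address, given the frame number that the page table
   (or TLB) holds for this virtual page. */
uint64_t phys_addr(uint64_t frame, uint64_t vaddr)
{
    assert((frame >> (64u - PAGE_SHIFT)) == 0);  /* frame must fit after the shift */
    return (frame << PAGE_SHIFT) | offset_of(vaddr);
}
```

For example, virtual address 0x12345 has page number 0x12 and offset 0x345; if that page is held in frame 0x7, the physical address is 0x7345.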

Putting it all together
• The typical data reference will be translated by the MMU and hit in the L1 data cache
• What can go wrong?
– TLB miss
– Page fault
– Cache miss
• These may require hard disk access, which will take milliseconds

Process Management
• What is a Process?
• Program in execution
• But some programs run as multiple processes
• And the same program can be running multiple times at the same time
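That one program can run as several processes can be seen with fork. The sketch below is only an illustration: run_children is a hypothetical helper, and each child exits immediately instead of doing real work.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create up to n child processes from the running program. Each child
   exits at once with its index as exit status. Returns the number of
   children that were reaped after a normal exit; this is less than n
   if a fork call fails. */
int run_children(int n)
{
    assert(n >= 0 && n < 256);
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;              /* fork failed: stop creating children */
        if (pid == 0)
            _exit(i);           /* child: same program, separate process */
        started++;
    }
    int reaped = 0;
    for (int i = 0; i < started; i++) {
        int status;
        if (waitpid(-1, &status, 0) > 0 && WIFEXITED(status))
            reaped++;
    }
    return reaped;
}
```

After a successful call with n = 3, four processes (the parent and three children) have executed code from the same program.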

Process as a Data Structure
• Operations?
– fork, exit, exec, …
• Data?
– Text, data, stack, heap
• Hardware information
– PC value, register values
• Other information maintained by operating system
– Process id, parent id, user id
– CPU time used by process, in user/system mode
– Memory management info: Page table
– File related info: Open files, file pointers
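The items listed above can be pictured as fields of one record per process. The struct below is a hypothetical sketch for illustration only. Its field names, register count and open-file limit are assumptions and do not reproduce the layout used by any real operating system.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define MAX_OPEN_FILES 16               /* hypothetical limit */

enum proc_state { PROC_READY, PROC_RUNNING, PROC_WAITING };

struct hw_context {                     /* hardware information */
    uint64_t pc;                        /* PC value */
    uint64_t regs[16];                  /* register values (count assumed) */
};

struct pcb {
    pid_t pid, ppid;                    /* process id, parent id */
    uid_t uid;                          /* user id */
    enum proc_state state;
    struct hw_context ctx;
    uint64_t user_ticks, system_ticks;  /* CPU time used, user/system */
    void *page_table;                   /* memory management info */
    int open_files[MAX_OPEN_FILES];     /* file related info; -1 = unused */
};

/* Fill in a fresh record for a process that is ready to run. */
void pcb_init(struct pcb *p, pid_t pid, pid_t ppid, uid_t uid)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
    p->pid = pid;
    p->ppid = ppid;
    p->uid = uid;
    p->state = PROC_READY;
    p->page_table = NULL;
    for (int i = 0; i < MAX_OPEN_FILES; i++)
        p->open_files[i] = -1;
}
```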

Process as an OS Abstraction
• OS abstraction for program execution
• Unit of resource management by OS
• Separate page table for each process
• Unit of sharing of CPU time

Process Management
• What should OS do when a process does something that will involve a long wait?
– File read/write operation, page fault, ...
– Make another process run on the CPU
• Which process?
– Decided by OS process scheduling policy
• Policy objectives?
– Minimize average program execution time
– Fairness

Process Scheduling Policies
• Preemptive vs Non-preemptive
• Preemptive policy: where OS “preempts” the running process from the CPU even though it is not waiting for something
– Idea: give a process some maximum amount of CPU time (“CPU time slice”) before preempting it, for the benefit of the other processes
• Non-preemptive policy: process would yield CPU either due to waiting for something or voluntarily
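The time slice idea can be tried out with a small simulation. The sketch below uses round-robin order, which is one possible preemptive policy and is named here as an assumption; the slides do not prescribe it. It also assumes that all processes are ready at time 0, that none of them ever waits, and that a context switch costs nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Simulate round-robin scheduling with a fixed CPU time slice.
   burst[i]  = total CPU time that process i needs (in time units)
   finish[i] = time at which process i completes (output; 0 if burst is 0)
   Returns 0 on success, -1 if memory allocation fails. */
int round_robin(size_t n, const unsigned *burst, unsigned slice, unsigned *finish)
{
    assert(slice > 0);
    unsigned *remaining = malloc(n * sizeof *remaining);
    if (n > 0 && remaining == NULL)
        return -1;

    size_t left = 0;                    /* processes not yet finished */
    for (size_t i = 0; i < n; i++) {
        remaining[i] = burst[i];
        finish[i] = 0;
        if (burst[i] > 0)
            left++;
    }

    unsigned now = 0;
    while (left > 0) {
        for (size_t i = 0; i < n; i++) {
            if (remaining[i] == 0)
                continue;
            /* Run process i until it finishes or its slice expires. */
            unsigned run = remaining[i] < slice ? remaining[i] : slice;
            now += run;
            remaining[i] -= run;
            if (remaining[i] == 0) {
                finish[i] = now;
                left--;
            }
        }
    }
    free(remaining);
    return 0;
}
```

With CPU demands of 3, 5 and 2 time units and a slice of 2, the processes complete at times 7, 10 and 6 respectively: the shortest one finishes first even though it was started last.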
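
Process State Transition Diagram
[Diagram: three states, Running, Ready and Waiting, with the following transitions]
• Ready → Running: scheduled
• Running → Ready: preempted or yields CPU
• Running → Waiting: waiting for an event to happen
• Waiting → Ready: awaited event happens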
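The diagram can be written down as a small state machine. The enum and function names below are hypothetical and chosen only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

enum proc_state { READY, RUNNING, WAITING };

enum proc_event {
    EV_SCHEDULED,               /* Ready   -> Running */
    EV_PREEMPTED_OR_YIELDED,    /* Running -> Ready   */
    EV_STARTS_WAITING,          /* Running -> Waiting */
    EV_AWAITED_EVENT_HAPPENS    /* Waiting -> Ready   */
};

/* Apply event e to state *s. Returns 1 and updates *s if the diagram
   has such a transition; returns 0 and leaves *s unchanged otherwise. */
int transition(enum proc_state *s, enum proc_event e)
{
    assert(s != NULL);
    switch (*s) {
    case READY:
        if (e == EV_SCHEDULED) { *s = RUNNING; return 1; }
        break;
    case RUNNING:
        if (e == EV_PREEMPTED_OR_YIELDED) { *s = READY; return 1; }
        if (e == EV_STARTS_WAITING) { *s = WAITING; return 1; }
        break;
    case WAITING:
        if (e == EV_AWAITED_EVENT_HAPPENS) { *s = READY; return 1; }
        break;
    }
    return 0;
}
```

Note that the diagram has no direct Waiting to Running transition: a process whose awaited event happens becomes Ready and must be scheduled again.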

Context Switch
• When OS changes which process is currently “Running” on CPU
• Takes some time, as it involves replacing hardware state (in CPU) of previously running process with that of newly scheduled process
– Save HW state of previously running process
– Restore HW state of scheduled process
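The save and restore steps can be mimicked in ordinary C with a simulated CPU. This is only a model: a real context switch is done by privileged, architecture-specific code, and the struct layout and names here are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct hw_state {               /* simulated CPU hardware state */
    uint64_t pc;
    uint64_t sp;
    uint64_t regs[8];           /* register count assumed */
};

struct process {
    int pid;
    struct hw_state saved;      /* state kept while not running */
};

/* Switch the simulated CPU from process prev to process next. */
void context_switch(struct hw_state *cpu, struct process *prev, struct process *next)
{
    assert(cpu != NULL && prev != NULL && next != NULL);
    prev->saved = *cpu;         /* save HW state of previously running process */
    *cpu = next->saved;         /* restore HW state of scheduled process */
}
```

The two struct copies are the reason a context switch takes time: the larger the hardware state, the more there is to save and restore.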

How is the Running Process Preempted?
• OS preemption code must run on CPU
• How does OS get control of CPU from the running process to run its preemption code?
• Hardware timer interrupt
– Hardware generated periodic event
– When it occurs, hardware automatically transfers control to OS code (“timer interrupt handler”)
• Interrupt is an example of a more general phenomenon called an exception
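A user-level analogy of the timer interrupt is a periodic SIGALRM signal: the main code never calls the handler, yet the handler runs each time the timer fires. The sketch below is an analogy only, not how the OS itself is built. The function names are hypothetical; setitimer and sigaction are the standard POSIX calls.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t ticks = 0;

/* Plays the role of the "timer interrupt handler". */
static void on_tick(int signo)
{
    (void)signo;
    ticks++;
}

/* Request a SIGALRM every usec microseconds. Returns 0 on success. */
int start_periodic_timer(long usec)
{
    assert(usec > 0);
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    struct itimerval tv;
    tv.it_interval.tv_sec = usec / 1000000;
    tv.it_interval.tv_usec = usec % 1000000;
    tv.it_value = tv.it_interval;       /* first expiry after one period */
    return setitimer(ITIMER_REAL, &tv, NULL);
}

/* Busy loop that never calls on_tick itself; it ends only because the
   handler keeps being run asynchronously. */
void spin_until_ticks(int n)
{
    while (ticks < n) {
        /* "running process" doing its own work */
    }
}

int tick_count(void)
{
    return (int)ticks;
}
```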

Exceptions
• Certain exceptional events during program execution that are handled by processor HW
• Two kinds of exceptions
• Traps: Software generated
– Page fault, System call, Divide by 0, Invalid opcode, ...
• Interrupts: Hardware generated
– Timer interrupt, keyboard interrupt, disk interrupt, ...
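Of the traps listed, the system call is the one a program triggers on purpose. On Linux this can be made visible with the generic syscall wrapper, which issues a system call by number. The sketch assumes Linux with glibc; raw_getpid is a hypothetical name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Ask the kernel for the process id by system call number instead of
   through the usual getpid() library function. */
long raw_getpid(void)
{
    long pid = syscall(SYS_getpid);
    assert(pid > 0);
    return pid;
}
```

The value returned matches getpid(), since both end up requesting the same service from the OS.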

What Happens on an Exception?
1. Hardware
• Saves processor state
• Transfers control to corresponding piece of OS code, called the exception handler
2. Software (exception handler)
• Takes care of the situation as appropriate
• Ends with “return from exception” instruction
3. Hardware (execution of the “return from exception” instruction, e.g. iret on x86)
• Restores the saved processor state
• Transfers control back to saved PC value

Re-look at Process Lifetime
• Which process has the exception handling time accounted against it?
– Process “Running” at time of exception
• Accounted differently than normal execution of the program’s instructions
– “Running in user mode”
– “Running in kernel/system mode”
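A process can read its own user-mode and system-mode CPU time with the POSIX getrusage call. The struct and function names below are hypothetical wrappers added for this sketch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>

struct cpu_times {
    double user_sec;            /* time spent running in user mode */
    double system_sec;          /* time spent running in kernel/system mode */
};

/* Read the CPU time accounted so far to the calling process.
   Returns 0 on success, -1 on failure. */
int get_cpu_times(struct cpu_times *out)
{
    assert(out != NULL);
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    out->user_sec = (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec / 1e6;
    out->system_sec = (double)ru.ru_stime.tv_sec + (double)ru.ru_stime.tv_usec / 1e6;
    return 0;
}
```

Calling it before and after a region of code and subtracting gives the user and system CPU time of that region separately.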

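Time: CPU vs Elapsed
[Figure: timeline on which the CPU runs processes P1, P2, P3, P1, P3 in turn. Elapsed time (also called wallclock time or “real time”) spans the whole timeline. Process P1 CPU time covers only the two intervals in which P1 runs, shaded to distinguish running in user mode from running in system mode.]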

Issues in Measuring Execution Time
• Elapsed time or CPU time?
• CPU time user or CPU time user+system?
• Resolution? (nsec, usec, msec, sec, ..)
• Timing mechanism? (time, chrone, ..)
• Experimental conditions
• Repetitions

time: Command execution
$ time ./a.out
real: Elapsed time
user: CPU time user
sys: CPU time system

Linux manual: time (1)

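time -v: Command execution
$ time -v ./a.out
Why does this fail? See man bash: in bash, time is a shell reserved word, not the time (1) program, so -v is not treated as an option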

time (1): Command execution
$ /usr/bin/time -v ./a.out

perf: Command execution
$ sudo perf stat -e cycles ./a.out

At 2.4 GHz, 802,280 cycles take about 0.00033 sec
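The conversion from a cycle count to seconds is a single division by the clock frequency. The sketch below assumes the clock stays at one fixed frequency for the whole run, which need not hold on a processor that changes frequency. The function name is hypothetical.

```c
#include <assert.h>

/* Convert a cycle count to seconds at a given clock frequency in Hz. */
double cycles_to_seconds(unsigned long long cycles, double clock_hz)
{
    assert(clock_hz > 0.0);
    return (double)cycles / clock_hz;
}
```

For the figures on the slide, cycles_to_seconds(802280ULL, 2.4e9) gives roughly 0.000334 sec.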

perf: Command execution
$ sudo perf stat -e cycles:u ./a.out

Timing part of a program
Example: Only the time for doing matrix multiplication
Initialize operand matrices
Do multiplication using unoptimized triple loop
Do multiplication using your faster method
Compare the 2 outputs and certify correctness

Linux:
– clock_gettime
– gettimeofday
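Timing only one part of a program with clock_gettime can be sketched as follows. The helper names and the dummy workload are hypothetical; in the assignment the timed call would be the matrix multiplication. In this sketch the choice of clock decides what is measured: CLOCK_MONOTONIC gives elapsed time, while CLOCK_PROCESS_CPUTIME_ID gives the CPU time of the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Difference end - start, in seconds. */
double elapsed_seconds(const struct timespec *start, const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Hypothetical stand-in for the code being timed. */
void dummy_workload(void *arg)
{
    volatile double *sink = arg;
    for (int i = 0; i < 1000000; i++)
        *sink += 1.0;
}

/* Time one call of fn(arg) on the given clock.
   Returns seconds, or -1.0 if the clock cannot be read. */
double time_call(clockid_t clock, void (*fn)(void *), void *arg)
{
    assert(fn != NULL);
    struct timespec t0, t1;
    if (clock_gettime(clock, &t0) != 0)
        return -1.0;
    fn(arg);                            /* only this call is timed */
    if (clock_gettime(clock, &t1) != 0)
        return -1.0;
    return elapsed_seconds(&t0, &t1);
}
```

Initialization of the operands and the correctness check stay outside the timed call, so they do not inflate the measurement.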

Repetitions

Measurement conditions

With 3 other programs running

Guidelines
• Decide Elapsed time / CPU time
• Decide CPU time user / CPU time user+system
• Decide resolution (nsec, usec, msec, sec, ..)
• Decide timing mechanism (time, chrone, ..)
• Measurement conditions: as light a load as possible
• Repetitions: a few, under similar measurement conditions, as far as possible
– report average
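Reporting the average over a few repetitions can be sketched as below. The struct and function names are hypothetical. The minimum and maximum are included as an assumption of this sketch, since a large gap between them suggests that the measurement conditions were not similar across runs.

```c
#include <assert.h>
#include <stddef.h>

struct timing_summary {
    double mean, min, max;      /* all in seconds */
};

/* Summarize n measured execution times (n must be at least 1). */
struct timing_summary summarize(const double *seconds, size_t n)
{
    assert(seconds != NULL && n > 0);
    struct timing_summary s = { 0.0, seconds[0], seconds[0] };
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += seconds[i];
        if (seconds[i] < s.min) s.min = seconds[i];
        if (seconds[i] > s.max) s.max = seconds[i];
    }
    s.mean = sum / (double)n;
    return s;
}
```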

Midterm 2
• Thursday, October 3 10:00-11:30 in this classroom
• Portion: everything covered by me until beginning of this lecture
• Closed book
• No calculator or phone
• Blue/black (not red) pen (not pencil)
