#4 (Memory Management Intro)


CSE 330: Operating Systems

Adil Ahmad
Lecture #4: Memory management (introduction)
Let’s finish up the general OS concepts
Process interacts with the kernel using system calls

▪ Requests made to the OS from a user process for a certain operation (e.g., file
system access, device access, etc.)

▪ Made using special instructions in modern CPUs (e.g., SYSCALL in x86)


▪ These instructions add important security checks

[Figure: syscall switches the CPU from user mode (U-mode/ring-3) to supervisor mode (S-mode/ring-0); sysret switches it back.]

(More details about transitions in lecture #9 onwards.)


Detailed illustration of system call handling

▪ Process executes SYSCALL which tells the CPU a system call is made
▪ Arguments are passed between process and kernel using CPU registers

[Figure: the open() function executes the SYSCALL instruction; the CPU jumps to predefined system call handler code, which uses the system call “table” to reach the “open” handler.]

▪ You are updating this “system call table” in project #1 ☺

▪ After system call is handled, OS executes SYSRET instruction


▪ CPU jumps back to U-mode, returns results, and resumes process
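
As a rough illustration of the flow above, the sketch below asks the kernel for the process ID through glibc's generic syscall() wrapper on Linux, which passes the system call number and arguments in registers and executes the system call instruction for us. The helper name raw_getpid is made up for this example, and the code assumes a Linux system with glibc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <assert.h>
#include <sys/syscall.h>     /* SYS_getpid: the system call number */
#include <sys/types.h>       /* pid_t */
#include <unistd.h>          /* syscall(), getpid() */

/* Hypothetical helper: enter the kernel directly with the "getpid"
 * system call number instead of calling the getpid() library function. */
static pid_t raw_getpid(void)
{
    long ret = syscall(SYS_getpid);   /* U-mode -> S-mode -> back to U-mode */
    assert(ret > 0);                  /* getpid never fails on Linux */
    return (pid_t)ret;
}
```

Calling raw_getpid() should give the same value as the ordinary getpid() library call, since both end up in the same kernel handler.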
Famous file-related system calls in Linux
fd = open(file, permissions, …)
• Open a file for reading, writing, or both
s = close(fd)
• Close an open file
n = read(fd, buf, nbytes)
• Read data from a file into a buffer
n = write(fd, buf, nbytes)
• Write data from a buffer into a file
pos = lseek(fd, offset, whence)
• Move the file pointer
s = stat(name, &buf)
• Get a file’s status info (e.g., size, etc.)
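
A minimal sketch of how these calls fit together, assuming a POSIX system such as Linux. The helper name file_roundtrip and the string "hello" are invented for the example; unlink() is not in the list above and is only used to clean up the temporary file.

```c
#include <assert.h>
#include <fcntl.h>      /* open, O_* flags */
#include <stddef.h>     /* NULL */
#include <sys/stat.h>   /* stat, struct stat */
#include <sys/types.h>
#include <unistd.h>     /* read, write, lseek, close, unlink */

/* Hypothetical helper: writes "hello" to `path`, reads it back into `out`
 * (which must hold at least 6 bytes), and returns the file size reported
 * by stat(), or -1 on any error. */
static long file_roundtrip(const char *path, char *out)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    long size = -1;
    struct stat st;
    if (write(fd, "hello", 5) == 5 &&      /* buffer -> file */
        lseek(fd, 0, SEEK_SET) == 0 &&     /* move file pointer to the start */
        read(fd, out, 5) == 5 &&           /* file -> buffer */
        stat(path, &st) == 0) {            /* file status info */
        out[5] = '\0';
        size = (long)st.st_size;
    }

    close(fd);
    unlink(path);                          /* remove the temporary file */
    return size;
}
```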
Processes also exit to kernel for interrupts/exceptions

▪ System calls → voluntary exit (e.g., process wants to open a file)


▪ Process can also involuntarily be stopped by the system

▪ Two main kinds of involuntary exits:


▪ Interrupts → external event notifiers, e.g.,
➢ Network packet is received at the wireless card

▪ Exceptions → Something abnormal done by the process, e.g.,


➢ Divide-by-0 and now needs to be stopped

(More details post midterm!)


Program and the address space
Recalling the process address space

▪ OS does not show all system memory to a process, rather abstracts a portion of memory to each process

▪ View of memory shown to each process is called its address space


▪ Can also be thought of as the “list of addresses that can be accessed by a process”

What are the different parts (or segments) within a process’ memory?

▪ Generally, four segments: code, data (global), stack, and heap


Can you match the provided variables to the program’s segments?
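
One way to practice the matching, as a sketch: the program below places one object in each of the four segments and records its address. All names here (global_counter, segment_addrs, where_things_live) are made up for the example, and the actual addresses will differ from run to run and system to system.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

int global_counter = 7;                      /* data segment (global) */

struct segment_addrs {
    uintptr_t code, data, stack, heap;
};

/* Hypothetical helper: records one address from each segment. */
struct segment_addrs where_things_live(void)
{
    int local = 0;                           /* stack */
    int *dynamic = malloc(sizeof *dynamic);  /* heap */
    assert(dynamic != NULL);

    struct segment_addrs a = {
        .code  = (uintptr_t)&where_things_live,  /* code: the function itself */
        .data  = (uintptr_t)&global_counter,
        .stack = (uintptr_t)&local,
        .heap  = (uintptr_t)dynamic,
    };
    free(dynamic);
    return a;
}
```

Printing the four fields on a typical Linux system usually shows them far apart from each other, which reflects the separate segments.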
Let’s take an example of a simple program

▪ Process will access several memory regions while executing its code

▪ Let’s see how a C program accesses memory at a high-level

High-level operations performed by CPU:

1. read “x” from memory into a register (edi)


2. add 2 to edi
3. write edi back to the memory location “x”
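
The three steps can be mimicked in C with a toy memory array, as a sketch only. The array `memory`, the function add_two, and the local variable named edi are stand-ins invented for this example; a real CPU works on actual registers and memory addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_WORDS 16
static int32_t memory[MEM_WORDS];   /* toy stand-in for process memory */

/* Performs x = x + 2, where x lives at memory[x_addr]. */
static void add_two(size_t x_addr)
{
    assert(x_addr < MEM_WORDS);
    int32_t edi = memory[x_addr];   /* 1. read x from memory into a "register" */
    edi = edi + 2;                  /* 2. add 2 to the register */
    memory[x_addr] = edi;           /* 3. write the register back to x */
}
```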
Transforming “high-level” into real machine assembly

“High-level” operations for x= x+2


1. read “x” from memory into a register (edi)
2. add 2 to edi
3. write edi back to the memory location “x”

Assembly instructions

Location (address)
of an instruction
Understanding memory access at assembly level

Memory accesses:

➢ Fetch instr. at addr 0x100000fad


➢ Load from address 0x8+rbp (x)

➢ Fetch instr. at addr 0x100000fb0


➢ No memory load (register ops)

➢ Fetch instr. at addr 0x100000fb3


➢ Store to address 0x8+rbp (x)

The OS’ job is to make sure all these memory accesses work “correctly”!
Memory management basics and history
High-level overview of memory management

▪ The OS is responsible for efficiently managing system memory so that it can be divided to serve different processes during runtime

[Figure: DRAM is divided among the NGINX process, the Memcached process, and the OS kernel; the disk also holds NGINX and Memcached contents.]

OS also “oversubscribes” memory by storing overflow contents in the disk (next week)
Start with managing memory for two processes

▪ User creates and executes two copies of the same process

Process A Process B

▪ Could something go wrong here?


▪ They might conflict with each other since they seem to be accessing the same addresses (e.g., 0x100000fad)

▪ Any ideas as to how you might address this problem?


Take-1: static rewriting of the programs

▪ Rewrite the two processes’ addresses to be distinct and load the processes at those new distinct locations
▪ You can do this using a compiler or binary rewriter (ask me offline!)
An example static system memory layout

▪ Each process is rewritten to use its own entirely reserved region that is
decided when it is loaded

DRAM spans 0x0 (start) to 0x10000 (end):
  Process A: 0x1000 to 0x1fff
  Process B: 0x2000 to 0x2fff
Can you think of some problems with the static rewriting approach?

Problem #1: relocation at runtime is very hard

▪ Must constantly rewrite process memory to relocate addresses


▪ Process may write pointers in many (random) places, and each must be relocated

[Figure: Process A shown at two different locations in DRAM, before and after being relocated.]
Problem #2: security problems due to no isolation

▪ Nothing is really preventing a process from accessing any random location (even outside its memory)

[Figure: Process A and Process B next to each other in DRAM, with addresses 0x0 and 0x1000 marked.]

int* malicious_ptr = (int*) 0x2000;
*malicious_ptr = 0x100;

(Recall that *ptr → accessing location of ptr)
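
The lack of isolation can be imitated with a toy model of DRAM, as a sketch. The array `dram`, the region constants, and unchecked_write are invented for this example; the regions follow the earlier static layout in which Process A owns 0x1000 to 0x1fff and Process B owns 0x2000 to 0x2fff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DRAM_SIZE    0x3000
#define PROC_A_START 0x1000              /* A owns 0x1000..0x1fff */
#define PROC_B_START 0x2000              /* B owns 0x2000..0x2fff */

static uint8_t dram[DRAM_SIZE];          /* toy model of physical memory */

/* With static rewriting there is no per-process check: whatever address
 * a process uses goes straight to memory. */
static void unchecked_write(size_t addr, uint8_t value)
{
    assert(addr < DRAM_SIZE);            /* only keeps the simulation in range */
    dram[addr] = value;
}
```

If code running as Process A calls unchecked_write(0x2000, 0x42), the first byte of Process B's region is silently overwritten.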


Take-2: dynamic relocation using base and bounds

▪ Problems with static rewriting: relocation and security. Let’s tackle those.

▪ Imagine that the OS had access to two mechanisms:


➢ Every memory access by a process could be shifted by a “base”
▪ *addr → *(addr + base) → *new_addr
▪ 0x0 → 0x0 + 0x1000 → 0x1000

➢ Every shifted memory access could be checked with a “bound”


▪ *(new_addr) → if (new_addr < bound) *new_addr
▪ 0x1000 → if (0x1000 < 0x2000) access 0x1000

Let’s see how this solves the relocation and security problems!
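
The two mechanisms can be sketched as one small C function. The struct and function names are invented for this example, and the wrap-around check is an extra safety step that the rule above does not mention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct relocation {
    uint64_t base;    /* added to every address the process issues */
    uint64_t bound;   /* the shifted address must be strictly below this */
};

/* Returns true and stores the shifted address in *new_addr if the access
 * is allowed; returns false if it falls outside the bound. */
static bool translate(const struct relocation *r, uint64_t addr,
                      uint64_t *new_addr)
{
    assert(r != NULL && new_addr != NULL);

    uint64_t shifted = addr + r->base;     /* addr -> addr + base */
    if (shifted < addr)                    /* wrapped around: reject (our addition) */
        return false;
    if (shifted < r->bound) {              /* if (new_addr < bound) */
        *new_addr = shifted;
        return true;
    }
    return false;
}
```

With base 0x1000 and bound 0x2000, address 0x0 translates to 0x1000 and is allowed, while address 0x2000 shifts to 0x3000 and is rejected.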
Solving the relocation hurdle with “base”

▪ OS copies all contents to new location and switches “base”


▪ All internal pointers are automatically shifted to new locations

[Figure: DRAM with Base1 and Base2 marked at the regions labeled Process A and Process B. Before relocation an access is *(ptr+base1); after relocation it is *(ptr+base2).]
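
A sketch of relocation with a base, using a toy DRAM array. The names dram, process, proc_byte, and relocate are invented for this example. The point is that a pointer stored inside the process is an offset from the base, so it stays valid after the move.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DRAM_SIZE 0x4000
static uint8_t dram[DRAM_SIZE];            /* toy model of physical memory */

struct process {
    size_t base;                           /* where the process sits in DRAM */
    size_t size;                           /* how many bytes it owns */
};

/* Every access by the process goes through the base: *(ptr + base). */
static uint8_t *proc_byte(const struct process *p, size_t ptr)
{
    assert(ptr < p->size && p->base + p->size <= DRAM_SIZE);
    return &dram[p->base + ptr];
}

/* OS-side relocation: copy all contents, then switch the base. */
static void relocate(struct process *p, size_t new_base)
{
    assert(new_base + p->size <= DRAM_SIZE);
    memmove(&dram[new_base], &dram[p->base], p->size);
    p->base = new_base;
}
```

Nothing inside the process memory is rewritten during relocate(); only the base changes.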
Solving the security hurdle with “bounds”

▪ OS sets bounds and the process cannot access memory beyond them

[Figure: Process A and Process B in DRAM, with the bounds set at the edge of the process’s region.]

int* malicious_ptr = (int*) 0x2000;
*malicious_ptr = 0x100;

Caught/rejected by the bounds check made by the system!
Note on the virtualization of “memory address”

▪ A write to 0x1000 using the “base and bounds” approach


▪ Program’s perspective → 0x1000
▪ OS inserts a “base” of 0x1000
▪ Resulting final address → 0x2000

▪ In this example, we see the “virtualization of address”


▪ Virtual address → the one a process sees (0x1000 in above example)
▪ Physical address → the translated final version (0x2000 in above example)
Implementing base and bounds during each access

▪ Requires support from the CPU hardware to check addresses


▪ Introduced a new unit called the memory management unit (MMU)
▪ MMU provides two new registers: “base” and “bounds”

Without MMU:
*addr → CPU accesses addr

With MMU:
*addr → (i) CPU adds “base” register,
(ii) CPU checks if “new_addr < bounds”
(iii) CPU accesses *new_addr

Should any process be allowed to change base/bounds registers?


▪ Only the kernel (in S-mode/ring-0) is allowed to change these registers
▪ These are “privileged” registers ☺
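
The privilege rule can be modeled in a few lines, as a sketch only. The cpu struct, the mode enum, and set_base_bounds are invented for this example; a real CPU enforces this in hardware and would typically raise an exception rather than return a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum cpu_mode { USER_MODE, SUPERVISOR_MODE };   /* ring-3 vs. ring-0 */

struct cpu {
    enum cpu_mode mode;
    uint64_t base;      /* MMU "base" register */
    uint64_t bounds;    /* MMU "bounds" register */
};

/* Models a privileged register write: it only takes effect when the CPU
 * is in supervisor mode. */
static bool set_base_bounds(struct cpu *c, uint64_t base, uint64_t bounds)
{
    assert(c != NULL);
    if (c->mode != SUPERVISOR_MODE)
        return false;   /* user-mode attempt is refused, registers unchanged */
    c->base = base;
    c->bounds = bounds;
    return true;
}
```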
Can you think of some problems with dynamic relocation?

▪ Programs are not contiguous and have chunks of “free spaces” to grow
▪ E.g., free space between stack and heap regions

[Figure: process address space between Base and Bounds, laid out as Code, Heap, free space, and Stack.]

▪ Simple base and bounds wastes a lot of memory for each process!
▪ Must reserve all the free space beforehand
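
To get a feel for the waste, the arithmetic can be sketched as below. The struct, the function name, and any sizes passed in are hypothetical and not taken from a real program.

```c
#include <assert.h>
#include <stddef.h>

struct usage {
    size_t code, heap, stack;    /* bytes actually in use */
};

/* Bytes reserved between base and bounds that hold no code, heap, or
 * stack data. */
static size_t wasted_bytes(size_t reserved, struct usage u)
{
    size_t used = u.code + u.heap + u.stack;
    assert(used <= reserved);
    return reserved - used;
}
```

For instance, with a made-up reservation of 0x10000 bytes and 0x4000 bytes in use across code, heap, and stack, 0xC000 bytes (three quarters of the reservation) sit idle.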
How exactly would you solve this problem?

Take-3: segmentation, as used in Intel x86

▪ Use multiple base and bounds pairs instead of a single one


▪ Each “base and bound” is called a “segment” in x86 systems

▪ Different program regions (e.g., heap, stack, etc.) can have a segment

▪ Individual segments can be efficiently resized (e.g., grow the stack, etc.)
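
Resizing a single segment can be sketched as changing only that segment's bound. The names below are invented for this example, and the segment is assumed to grow toward higher addresses (as a heap usually does); a stack that grows downward would need the mirrored check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct segment {
    uint64_t base;     /* where the segment starts in physical memory */
    uint64_t limit;    /* size of the segment in bytes (its "bound") */
};

/* Grows `seg` by `extra` bytes, unless that would run into `next_base`,
 * the start of the next occupied region. Other segments are untouched. */
static bool grow_segment(struct segment *seg, uint64_t extra,
                         uint64_t next_base)
{
    assert(seg != NULL);
    if (seg->base + seg->limit + extra > next_base)
        return false;
    seg->limit += extra;
    return true;
}
```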
Examining the logical view of segments
Segmentation requires further changes to the MMU

▪ Naïve idea: have many “base and bound” registers

▪ Any problems?
▪ Requires many (expensive) registers, and the number of segments cannot grow beyond the fixed set of hardware registers

▪ Better idea used in the Intel x86 systems:

▪ A segment table containing variable # of segments


▪ Entry contains a “segment” (i.e., base and bound)

▪ One base-bound register for the segment table


▪ One base-bound register for each selected segment
An illustration of the segmentation hardware

*(ptr) → Find segment for ptr in table → Check limit (bounds) → Access physical location

Give error if over segment limit
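
The hardware steps can be sketched in software as follows. All names are invented for this example, and it assumes an access is given as a segment index plus an offset within that segment; how real x86 hardware selects the segment is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct segment {
    uint64_t base;     /* physical start of the segment */
    uint64_t limit;    /* size of the segment in bytes */
};

struct segment_table {
    const struct segment *entries;   /* variable number of segments */
    size_t count;
};

/* Returns true and stores the physical address in *phys if the access is
 * valid; returns false (the "error" case) otherwise. */
static bool seg_translate(const struct segment_table *t, size_t seg_index,
                          uint64_t offset, uint64_t *phys)
{
    assert(t != NULL && phys != NULL);
    if (seg_index >= t->count)
        return false;                            /* no such segment */
    const struct segment *s = &t->entries[seg_index];  /* find segment in table */
    if (offset >= s->limit)
        return false;                            /* over the segment limit */
    *phys = s->base + offset;                    /* physical location */
    return true;
}
```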
Questions? See you in the next class!
