
SPEDE-2000 Lab Manual, CSU Sacramento 79

Chapter 9. Address Translation and Virtual Memory

You step in the stream
But the water has moved on.
Page not found.
— Computer Haiku error message

The Intel386 and later models include an on-chip paging memory-management unit (MMU). Paging occurs
after the logical address has been resolved to a linear address. If paging is enabled, the linear address
is split into a page number and an offset, the page number is run through the page tables to find a page
frame, and the resulting physical address is sent to the bus interface unit of the CPU. This process
creates a virtual address space.

The first section describes how a logical address (selector and offset) is converted to a physical
address. It starts with an overview of the whole process, then explains the two steps in detail. The next
part describes the paging system, including how the offending linear address is stored in CR2 when a
page fault occurs. The chapter ends with information about setting up virtual address spaces. Many O.S.
textbooks describe the Intel segmentation-paging system; you might also want to consult those texts.

The Address Conversion Process


All addresses on a Pentium CPU begin as logical addresses consisting of a selector and an offset. The
CPU sends a physical address to memory. When paging is enabled, each address goes through two
conversions. A protected-mode, virtual memory OS uses both. First the selector is used to index into a
descriptor table, usually the Global Descriptor Table (GDT). The descriptor provides a base address,
which is added to the segment’s offset, provided by the original logical address. This sum is the linear
address. The offset is compared against the segment’s limit value to ensure the offset is within bounds.
See Figure 9-1 for a picture of this.
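
The segmentation step just described can be sketched in C. This is only an illustration, not SPEDE or FLAMES code: the `seg_desc_t` structure and the `seg_to_linear()` function are hypothetical names, and the limit is simplified to a byte count that the offset must stay below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of a segment descriptor: only the two
 * fields used here. A real descriptor packs these into 8 bytes. */
typedef struct {
    uint32_t base;   /* linear address where the segment starts    */
    uint32_t size;   /* segment size in bytes (simplified "limit") */
} seg_desc_t;

/* Returns 0 and stores the linear address on success, or -1 when the
 * offset is out of bounds (the case where the CPU would fault). */
static int seg_to_linear(const seg_desc_t *d, uint32_t offset,
                         uint32_t *linear)
{
    assert(d && linear);
    if (offset >= d->size)
        return -1;                 /* out of bounds: no address formed */
    *linear = d->base + offset;    /* base + offset = linear address   */
    return 0;
}
```

With a base of zero and the largest possible size, the linear address is simply the offset, which is the arrangement this manual uses.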

With paging enabled (bit 31 in CR0 set), the linear address is really two values: a page frame number
(PFN) and an offset within that page. The Intel Pentium uses a two-level scheme for page numbers, so
the PFN is actually a directory index and a page table index. This scheme reduces the number of page
tables required when there are “holes” (unmapped areas) in the address space. This way a very large
“address space” can be supported with a small amount of physical RAM. Each “page entry” is 4 bytes.
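
As a small sketch of that split, the helpers below pull the three fields out of a 32-bit linear address, using the 10/10/12-bit layout given later in this chapter. The function names are made up for this example; SPEDE supplies its own macros (such as `PAGE_DIRECTORY_NR`) for the same job.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 31..22: index into the page directory. */
static uint32_t dir_index(uint32_t linear)   { return linear >> 22; }

/* Bits 21..12: index into the selected page table. */
static uint32_t table_index(uint32_t linear) { return (linear >> 12) & 0x3FFu; }

/* Bits 11..0: byte offset inside the 4K page. */
static uint32_t page_offset(uint32_t linear) { return linear & 0xFFFu; }

/* Inverse operation: rebuild a linear address from its three fields. */
static uint32_t join_fields(uint32_t dir, uint32_t table, uint32_t offset)
{
    assert(dir < 1024u && table < 1024u && offset < 4096u);
    return (dir << 22) | (table << 12) | offset;
}
```

For example, linear address 0x00403025 has directory index 1, table index 3, and page offset 0x025.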

The descriptor table and the page tables are all located in system memory. To realize one memory
access for a program, the CPU must actually read the descriptor from memory, a page directory entry,
and a page table entry. So for every program memory access, the CPU must perform three additional
accesses. This would really slow down any program. For that reason, the CPU caches as much
information as it can on chip. In normal operation, only three or so selectors are used. When a selector
register is first loaded, the CPU checks to make sure the descriptor is valid, and if so, loads its contents
into the selector’s cache storage (these registers are hidden from the programmer). When the CPU
performs the addition of the segment’s base address and the offset in the logical address, both values
are already inside the CPU.

Even though each page table is 4K, the CPU doesn’t need to read the whole thing to translate a linear
address. It needs only one page entry from the page directory (top level) and one entry from the page
table (second level). A translation lookaside buffer (TLB) is used to cache these entries. It remembers
recent page entries. Each time a new set of page tables is used (e.g., each address space has its own
set), this cache must be flushed (i.e., emptied). This is done automatically by the CPU when CR3 (page
directory base register) is loaded. After the TLB is flushed (i.e., “cold”), the next few memory accesses
each cost extra memory cycles while the page entries are read again.

[Figure 9-1: Overview of Segmentation and Paging. Segmentation: the selector of the logical address
picks a segment descriptor from the Global Descriptor Table (located by GDTR), and the descriptor’s
base is added to the offset to form the linear address. Paging: the linear address is split into Dir,
Table, and Offset fields, which select page entries to form the physical address.]

♦ Segmentation
Figure 9-2 below shows how a segmented protected-mode (logical) address becomes a linear address.
Open arrows indicate base addresses. Each descriptor has four fields of primary interest. The first is
the type information defining it as a code or data segment. Second are the access (permission) bits,
which state if the whole segment can be written (if data) or read (if code). Third is the base linear
address of the segment, and last is the size or limit of the segment.

For 159, all the segments are set up with a base address of zero. This way an offset refers to the same
place in the address space no matter which segment is used. The limit is set to 4GB, so it won’t get in
your way. All this is done by the boot loader, before FLAMES runs.

The CPU register GDTR (global descriptor table register) supplies a base address and limit for the
descriptor table. Using the selector’s upper 13 bits as an index, a descriptor is selected and its limit
(size) field is examined. If the limit is exceeded a general protection fault will occur. This stops the
memory access and terminates the instruction, but the EIP register will point to the faulting instruction
so it can be retried once the OS has recovered from the general protection fault. Note that the LDTR
holds a selector, not a pointer value. Its base and limit come from the descriptor it indicates.

There are a couple of places where an incorrect segment can be referenced. First, the descriptor index
must be within the descriptor table. Bit 2 of the selector is the table indicator, and determines whether
the GDT (zero) or the LDT (one) is used. The segment might also be accessed in an invalid manner,
e.g., writing to a code segment. All these conditions will generate a general protection fault.
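
The selector fields can be pulled apart as in the following sketch. The helper names are invented for this example. Two details come from the Intel architecture rather than from the text above: the low two selector bits hold the requested privilege level, and a descriptor table limit is the offset of the last valid byte in the table.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 15..3: index of the descriptor within its table. */
static unsigned sel_index(uint16_t sel)    { return sel >> 3; }

/* Bit 2: table indicator (0 = GDT, 1 = LDT). */
static unsigned sel_uses_ldt(uint16_t sel) { return (sel >> 2) & 1u; }

/* Bits 1..0: requested privilege level. */
static unsigned sel_rpl(uint16_t sel)      { return sel & 3u; }

/* Each descriptor is 8 bytes. The selector is usable only if the whole
 * descriptor lies at or below table_limit, the offset of the last valid
 * byte of the table. Otherwise the CPU would raise a protection fault. */
static int sel_in_table(uint16_t sel, uint32_t table_limit)
{
    uint32_t last_byte = (uint32_t)sel_index(sel) * 8u + 7u;
    assert(table_limit >= 7u);     /* table holds at least one entry */
    return last_byte <= table_limit;
}
```

For instance, selector 0x08 names entry 1 of the GDT, and in a table with three descriptors (limit 23) selector 0x18 would be rejected.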

[Figure 9-2. Logical to Linear Address Translation (first part). The logical address is a 16-bit
selector (really a 13-bit index) and a 32-bit offset. With TI=0 the selector indexes the Global
Descriptor Table (located by GDTR); otherwise it indexes the Local Descriptor Table (located by
LDTR). Each table holds up to 8,192 entries. The selected code or data descriptor supplies the
segment’s base address, which is added to the offset to form the linear address, and the segment’s
limit, which is compared with the offset: if Offset >= Limit, then SegFault!]

♦ How the Page Tables Work


This section describes how a linear address is translated through the page tables to generate a physical
address. The page directory and all the page tables are stored in main memory. If paging is disabled,
then the linear address is emitted from the CPU as the physical address.

If paging is enabled, the two-level page tables are referenced. As shown in Figure 9-3 below, the linear
address is chopped into three fields (described next). Two of those fields index into page tables with
1024 page table entries (PTE). Each PTE contains a physical base address and some status bits.
Twenty bits form the base address used in the next level down. The base address from the page table
provides the upper 20 address bits of the frame. The CPU will cache portions of the tables in a
Translation Look-aside Buffer (TLB). Thus, if it caches two entries, it can now access a 4K chunk of
linear memory without having to read those parts again.

GENERATING A PHYSICAL ADDRESS


The base address of the segment is added to the offset from the memory reference to generate a
linear address. If paging is enabled, the CPU’s memory interface unit (MIU) gets a chance to change
this address. The linear address is split into three pieces. The top two fields are used as index values
into the page tables for the current address space. The page tables form a sparse, two-level, 1024-ary
tree, anchored by the CPU’s CR3 register (the page directory base register).

CR3 supplies the physical base address of the page directory. Address bits 31 to 22 (the upper 10 bits)
index into the directory to get a page table pointer. Address bits 21 to 12 are used to select the page
table entry with the frame’s base address. This base is combined with the lower 12 bits (page offset) to
get a physical address inside the page frame. Each index is 10 bits, so it can index 1024 different page
entries. Each page entry is 4 bytes, therefore each page table is 4K bytes in size. This is also the size of
a page frame!

[Figure 9-3. Logical to Physical Address Translation (second part). The linear address is split, from
msb to lsb, into a Page Directory Index (10 bits), a Page Table Index (10 bits), and a Page Frame
Offset (12 bits). PDBR (CR3) locates the page directory; the selected page table entry supplies the
frame’s base address, which is combined with the offset to form the physical address.]


When paging is enabled, the two-level page tables are referenced. The upper ten bits index into the
page directory, and the next ten bits index into the page table that the directory entry points to. Each
page table entry (PTE) contains a physical base address and some status bits. Twenty bits form this
physical base address, and they are combined with the lower twelve bits of the linear address (a perfect
match) to finally generate the physical address. (The status bits are masked out when forming an
address.) If either the page directory entry or the PTE is marked not present, a page fault will occur.
Register CR2 will contain the virtual address that caused the fault. (See Section 3-1 in Intel
Architecture Software Developer’s Manual, volume 3 for an overview of this process.)
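
The whole walk can be modeled in a few lines of C. The sketch below is a software simulation for study, not how the hardware or SPEDE is implemented: `sim_mem` is a hypothetical stand-in for physical RAM, `phys_read32()` and `translate()` are invented names, and a failed lookup records the faulting address in a variable that plays the role of CR2.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT   0x001u        /* bit 0 of a page entry           */
#define PTE_ADDR_MASK 0xFFFFF000u   /* upper 20 bits: base address     */
#define SIM_FRAMES    8u            /* simulated RAM size in 4K frames */

static uint32_t sim_mem[SIM_FRAMES * 1024u];   /* hypothetical RAM */

/* Read one aligned 32-bit word of simulated physical memory. */
static uint32_t phys_read32(uint32_t paddr)
{
    assert((paddr & 3u) == 0 && paddr / 4u < SIM_FRAMES * 1024u);
    return sim_mem[paddr / 4u];
}

/* Returns 0 and the physical address, or -1 for a "page fault", in
 * which case *cr2 receives the faulting linear address. */
static int translate(uint32_t cr3, uint32_t linear,
                     uint32_t *phys, uint32_t *cr2)
{
    uint32_t pde, pte;

    /* Level 1: bits 31..22 pick a 4-byte page directory entry. */
    pde = phys_read32((cr3 & PTE_ADDR_MASK) + (linear >> 22) * 4u);
    if (!(pde & PTE_PRESENT)) { *cr2 = linear; return -1; }

    /* Level 2: bits 21..12 pick a 4-byte page table entry. */
    pte = phys_read32((pde & PTE_ADDR_MASK) + ((linear >> 12) & 0x3FFu) * 4u);
    if (!(pte & PTE_PRESENT)) { *cr2 = linear; return -1; }

    /* Mask out the status bits and append the 12-bit page offset. */
    *phys = (pte & PTE_ADDR_MASK) | (linear & 0xFFFu);
    return 0;
}
```

Note that each successful translation costs two extra memory reads, which is exactly the overhead the TLB exists to avoid.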

Use the VERR instruction to verify that a segment is readable through a ring 3 selector.

You may have experienced the target computer spontaneously resetting itself. One cause of this is
loading CR3 with a NULL value. Page frame 0 is not mapped, which causes the CPU to double-fault
when generating an address. The CPU’s response is to shut itself down. The BIOS reacts by either
turning the computer off or rebooting the whole system.

PAGE ENTRY CONTROL BITS


The status bits in each page table entry are important. They are described a few pages below. Each
PTE is split into a 20 bit address and 12 bits of control. Intel has set aside bits 9, 10 and 11 for program
usage; the hardware will not modify them. The most important bit is number 0, the “present” flag. When
set to 0, it tells the MIU this base address in the PTE is not valid. For instance, the present flag is
cleared on the first entry in the first page table. This affects the 4K range from 0K to 4K in the linear
address map. It is used to catch NULL pointer references! (Now you know.) Another important flag is bit
1, which tells if the page frame can be written to (one) or is read-only (zero).
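
The bit positions named above translate directly into masks. The macro and function names in this sketch are illustrative only and are not the ones defined in the SPEDE headers.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT   (1u << 0)     /* bit 0: entry is valid            */
#define PTE_WRITABLE  (1u << 1)     /* bit 1: 1 = writable, 0 = r/o     */
#define PTE_OS_MASK   (7u << 9)     /* bits 9..11: free for program use */
#define PTE_ADDR_MASK 0xFFFFF000u   /* bits 31..12: frame base address  */

static int pte_is_present(uint32_t pte)      { return (pte & PTE_PRESENT) != 0; }
static int pte_is_writable(uint32_t pte)     { return (pte & PTE_WRITABLE) != 0; }
static uint32_t pte_frame_base(uint32_t pte) { return pte & PTE_ADDR_MASK; }

/* Store a 3-bit program-defined tag in bits 9..11, leaving the address
 * and the hardware-defined status bits untouched. */
static uint32_t pte_set_os_bits(uint32_t pte, uint32_t tag)
{
    assert(tag < 8u);
    return (pte & ~PTE_OS_MASK) | (tag << 9);
}

/* Clear the present flag, as is done for the page at linear address 0
 * so that NULL pointer references are caught. */
static uint32_t pte_mark_not_present(uint32_t pte)
{
    return pte & ~PTE_PRESENT;
}
```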

Selectors and Virtual Memory


It isn’t the selectors that give you virtual memory; it’s the page tables and the address translation they
provide. The important piece of data from the selector’s descriptor is the base address in memory. This
address and the offset in the memory access are added together. The sum is used to index into
the page tables (described in more detail below).

♦ Segments
Addresses which are output by the ‘core CPU’ in an Intel Architecture machine are virtual addresses in
segmented form. These addresses are first fed to the Segmentation Unit (SU), which is separate from
the CPU core but is internal to the processor chip (it is part of the circuitry called the “Address
Translation Unit”). The SU translates the segmented address into a “linear” form – that is, an equivalent
address which references (virtual) memory as a contiguous sequence of bytes with linearly increasing
addresses.

The address translation (from segmented to linear form) performed by the SU is controlled by a set of
“Segment Descriptors” contained in a “Descriptor Table” (either the “Global Descriptor Table (GDT)” or
the “Local Descriptor Table (LDT)”) in main memory. There are many different Segment Descriptors –
one for code references, one for data references, another for stack references, and so forth. These
Segment Descriptors must be set up correctly in order for address translation to work properly.

The FLAMES startup code creates a set of Segment Descriptors in main memory – one for code (called
the “Kernel Code Segment”), one for data (the “Kernel Data Segment”) and so forth. The startup code
arranges that when a downloaded program starts running, the CPU can correctly access these
Segment Descriptors (see Appendix B for details). The values in the Segment Descriptors are such that
(virtual) memory appears to the Segmentation Unit translation hardware as one contiguous sequence of
bytes. That is, every segment, regardless of type, has a starting (virtual) address of “zero” and is
4 Gbytes long.
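
To make the “base zero, 4 Gbytes long” arrangement concrete, the sketch below packs the fields of an 8-byte Intel segment descriptor. It is an illustration under stated assumptions: `make_descriptor()` and `descriptor_base()` are invented names, and the access bytes 0x9A (code) and 0x92 (data) with flags 0xC (4K granularity, 32-bit) are the conventional flat-model values. The values FLAMES actually writes are documented in Appendix B and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a base, a 20-bit limit, an access byte, and 4 flag bits into
 * the 8-byte descriptor layout used by the Intel architecture. */
static uint64_t make_descriptor(uint32_t base, uint32_t limit20,
                                uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    assert(limit20 <= 0xFFFFFu && flags <= 0xFu);
    d |= (uint64_t)(limit20 & 0xFFFFu);              /* limit 15..0     */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;         /* base 23..0      */
    d |= (uint64_t)access << 40;                     /* type and access */
    d |= (uint64_t)((limit20 >> 16) & 0xFu) << 48;   /* limit 19..16    */
    d |= (uint64_t)flags << 52;                      /* granularity etc.*/
    d |= (uint64_t)(base >> 24) << 56;               /* base 31..24     */
    return d;
}

/* Recover the 32-bit base address scattered across the descriptor. */
static uint32_t descriptor_base(uint64_t d)
{
    return (uint32_t)((d >> 16) & 0xFFFFFFu) | ((uint32_t)(d >> 56) << 24);
}
```

With 4K granularity, a 20-bit limit of 0xFFFFF covers the full 4 Gbytes, so `make_descriptor(0, 0xFFFFF, 0x9A, 0xC)` describes a flat code segment.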

♦ Page Translation
Linear (virtual) addresses which are output by the Segmentation Unit are fed into the “Paging Unit”. This
piece of hardware uses a translation table to perform a mapping from a given virtual address to the
corresponding physical memory address. If the mapping is able to be successfully completed, the
translated address is output to physical memory (RAM). If not, the Paging Unit generates a Page Fault
interrupt (INT 14) instead.

The translation table is a hierarchical arrangement of translation values. At the top of the hierarchy is
the Page Directory table. Entries in the Page Directory table point to Page Tables, each of which
contains a translation value for a collection of individual virtual space pages. Both the Page Directory
table and the individual Page Tables are stored in main memory; CPU Control Register CR3 contains
the address of the base of the Page Directory table.

The Page Directory Table contains 1024 entries, each of which is 4 bytes. Each entry contains a pointer
to the base of a Page Table, along with some attribute bits. Each Page Table, in turn, contains 1024
4-byte entries, each of which contains a translation value (frame address) of one 4K page of virtual
space (again along with some attribute bits). It is these numbers – 1K of Page Directory entries × 1K of
Page Table entries × 4K pages – which make the size of Virtual Space 4GB.
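
The arithmetic can be checked with a few helper functions. The names here are made up for this sketch; the last helper answers a practical question, namely how many Page Tables are needed to map a given amount of memory.

```c
#include <assert.h>
#include <stdint.h>

#define PG_SIZE           4096u   /* bytes per page                    */
#define ENTRIES_PER_TABLE 1024u   /* entries per directory or table    */

/* One Page Table maps 1024 pages of 4K each: 4 megabytes. */
static uint64_t bytes_per_page_table(void)
{
    return (uint64_t)ENTRIES_PER_TABLE * PG_SIZE;
}

/* One Page Directory maps 1024 Page Tables: 4 gigabytes. */
static uint64_t bytes_per_directory(void)
{
    return ENTRIES_PER_TABLE * bytes_per_page_table();
}

/* Number of Page Tables needed to map mem_bytes, rounded up. */
static uint32_t page_tables_needed(uint64_t mem_bytes)
{
    uint64_t span = bytes_per_page_table();
    assert(mem_bytes <= bytes_per_directory());
    return (uint32_t)((mem_bytes + span - 1u) / span);
}
```

A machine with 16 megabytes of RAM therefore needs four Page Tables plus the Page Directory, or 20K bytes of translation tables in all.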

The following diagram shows the arrangement of the Page Directory and Page Tables, and how a
virtual address is translated by the Paging Unit into a physical address.

[Figure 9-4. Linear to Physical (Paging Tables). CR3 (PDBR) points to the Page Directory, the first
level, which has 1024 4-byte entries (0 to 1023), each mapping 4 megabytes. Each directory entry
points to a Page Table of 1024 4-byte entries whose low bits (11..0) are status bits. The linear
address from the segmentation unit is split into a Page Directory entry (bits 31..22), a Page Table
entry (bits 21..12), and an offset into the page (bits 11..0). The physical address sent to memory is
the frame base address (bits 31..12) followed by the offset into the frame (bits 11..0); the page
offset is never changed.]

The current address mapping is determined by the page directory pointer stored in the CPU’s CR3
register. Use the FLAMES command “CPU” to print out the value. The Page Directory entries and the
Page Table entries have nearly identical formats; they differ in only two bit positions. The Page Directory
entries have the following structure:

Bits     Meaning
31..12   Upper 20 bits of base address of Page Table
11..9    Available for OS use (not used by MMU hardware)
8        Global Page (leave 0)
7        Page Size (0 = 4K)
6        Reserved (0)
5        Accessed (1 = this Page Table has been accessed)
4        Cache Disabled (0)
3        Cache Policy (1 = WriteThrough; 0 = WriteBack)
2        User/Supervisor (0 = Supervisor and Page Table cannot be accessed in CPL3)
1        Read/Write (0 = Page Table is Read-Only)
0        Present (1 = present)

Figure 9-5: Fields of a Page Directory Entry


Page Table entries have the following structure:

Bits     Meaning
31..12   Upper 20 bits of base address of Page (i.e., Frame Number)
11..9    Available for OS use (not used by MMU hardware)
8        Global Page (leave 0)
7        Page Size (0 = 4K)
6        Dirty (1 = Frame contents have been modified)
5        Accessed (1 = Frame has been accessed)
4        Cache Disabled (0)
3        Cache Policy (1 = WriteThrough; 0 = WriteBack)
2        User/Supervisor (0 = Supervisor and Frame cannot be accessed in CPL3)
1        Read/Write (0 = Frame is Read-Only)
0        Present (1 = present)

Figure 9-6: Fields of a Page Table Entry

The logical initial value for a Page Table entry, or a Page Directory entry, for an object which is present
in memory is “base_addr | 0x7”. This indicates a page that is user-mode accessible, writable,
and present. The FLAMES startup code creates an initial Page Directory with these values, then
allocates a Page Table for each 4Mbyte block of installed physical memory and sets the Base Address
value for each Page Table entry to be exactly the same as the corresponding physical frame base
address. That is, it creates what is called “straight-through” or “unity” mapping: every page maps to
the frame of the same address. This is the default address translation in effect when a downloaded
program starts running.
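
A unity mapping of this kind could be built along the following lines. This is a sketch of the idea only, not the FLAMES startup code: the function names are invented, and the tables are plain arrays supplied by the caller rather than frames allocated from physical memory.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES  1024u   /* entries per Page Directory or Page Table */
#define PG_SIZE  4096u   /* bytes per page                           */
#define PG_FLAGS 0x7u    /* user-accessible | writable | present     */

/* Fill one Page Table so that 4Mbyte block number `block` maps every
 * page to the frame at the same address. */
static void fill_unity_table(uint32_t table[ENTRIES], uint32_t block)
{
    uint32_t i;
    assert(block < ENTRIES);
    for (i = 0; i < ENTRIES; i++)
        table[i] = ((block * ENTRIES + i) * PG_SIZE) | PG_FLAGS;
}

/* Point directory slot `block` at the Page Table whose physical base
 * address is table_phys (which must be 4K aligned). */
static void set_dir_entry(uint32_t dir[ENTRIES], uint32_t block,
                          uint32_t table_phys)
{
    assert(block < ENTRIES && (table_phys & (PG_SIZE - 1u)) == 0);
    dir[block] = table_phys | PG_FLAGS;
}
```

After `fill_unity_table(t, 1)`, entry 3 holds 0x00403007: the page at linear address 0x00403000 maps to the frame at physical address 0x00403000.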

Caching policy can be set on a per-page basis. Normally, the policy should be cache enabled with
write-back. If the page maps I/O registers, you would disable the caching. Write-through means the
write must go out to main memory whenever an instruction commands it. Again, this is required when
touching memory-mapped I/O registers. However, it slows down the machine instruction execution rate.
Using write-back, the cache circuits can queue up the memory change, transferring it out to main
memory when there is time. Also, several writes to nearby memory locations can be batched together
and executed as a single write operation.

With write-back, it is sometimes possible that memory will be updated in an order different from what
the machine code expects. (This can also be caused by out-of-order instruction execution.) This
situation is called “weak memory ordering” and occurs on processors that are highly pipelined. If a page
is not cached, then the BIU will write to memory in exactly the order commanded by the instructions.
This is called “strong memory ordering.” Also, several instructions can force all pending writes to
complete before the instruction begins to execute. These are all I/O instructions, IRET, instructions
with a LOCK prefix, and moves to the control registers.

♦ Accessing a Page Table Entry


There are two ways to access an entry in the page directory, depending on how you declare the PDBR
value. You can declare it either as an unsigned 32-bit integer or as a pointer.

#include <spede/machine/page.h>   // For pte_t typedef
#include <spede/assert.h>

pte_t
get_pagedir_ptr( uint32 pdbr, void * virt_addr )   /* FIRST ATTEMPT */
{
    uint    pagedir_index = PAGE_DIRECTORY_NR(virt_addr);
    /* pdbr is an integer, so scale the index by the entry size by hand. */
    pte_t * entryptr = (pte_t *) (pdbr + pagedir_index*sizeof(pte_t));

    /* The page directory must start on a 4K boundary. */
    assert( 0 == (pdbr & PT_OFFSET_MASK) );

    return *entryptr;
}  /* end get_pagedir_ptr() */

This function takes the process’s page-directory base pointer and a virtual address in that process’s
address space. It looks into the page directory and returns an entry. From the return value, the caller
should first check to make sure the entry is good by checking the present bit (bit 0). If OK, then it can
extract the upper 20 bits to form the base address of the page table.

The page-directory base pointer is added to an index value. Since each page-table entry is 4 bytes in
size, the index value must be scaled up by 4. C does this scaling automatically in pointer arithmetic.
However, since the PDBR value here is an integer, we must do the scaling ourselves (but see below). In
both versions, we assert that the lower 12 bits of the page-directory pointer are zero.

This next example shows the same function as above but the PDBR value is a pointer. This simplifies
the code to find the page entry pointer. Since it is a pointer, C will do the index scaling for us.

#include <spede/machine/page.h>   // For pte_t typedef
#include <spede/assert.h>

pte_t
get_pagedir_ptr( pte_t * pdbr, void * virt_addr )   /* SECOND ATTEMPT */
{
    assert( 0 == (PTR2INT(pdbr) & PT_OFFSET_MASK) );
    {
        uint   pagedir_index = PAGE_DIRECTORY_NR(virt_addr);
        pte_t  entry = pdbr[pagedir_index];

        return entry;
    }
}  /* end get_pagedir_ptr() */
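
Whichever version is used, the caller still has to check the returned entry before using it. A minimal sketch of that check follows. It does not use the SPEDE headers: `entry_t` is a stand-in that assumes `pte_t` is a plain 32-bit value, and `page_table_base()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t entry_t;   /* stand-in for pte_t, assumed 32 bits wide */

/* Returns 0 and stores the Page Table's base address when the directory
 * entry is present, or -1 when it is not (a page fault in hardware). */
static int page_table_base(entry_t dir_entry, uint32_t *base)
{
    assert(base);
    if ((dir_entry & 0x1u) == 0)        /* bit 0: present flag       */
        return -1;
    *base = dir_entry & 0xFFFFF000u;    /* keep only the upper 20 bits */
    return 0;
}
```

A status return is used instead of returning the address directly so that a not-present entry cannot be confused with a valid base address.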
