The Complete Memory Hierarchy
Example:
Page size = 4 KB = 2^12 bytes
Physical memory size = 2^18 pages = 2^30 bytes = 1 GB
Virtual memory size = 2^32 bytes = 4 GB
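[Figure: address translation through the page table. The 32-bit virtual address (bits 31 to 0) consists of a 20-bit virtual page number and a 12-bit page offset. The virtual page number indexes the page table (each program has its own); each entry holds a valid bit and an 18-bit physical page number. If the valid bit is 0, the page is not present in memory. The 30-bit physical address (bits 29 to 0) consists of the physical page number and the unchanged 12-bit page offset.]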
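The translation in the example can be sketched in C. This is a minimal illustration, not a description of any real MMU: the names `pte_t`, `vpn_of`, `offset_of`, and `translate` are hypothetical, and the page table is modeled as a flat array indexed by virtual page number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_OFFSET_BITS 12u   /* 4 KB pages: 2^12 bytes */
#define PPN_BITS         18u   /* 2^18 physical pages */
#define PAGE_OFFSET_MASK ((1u << PAGE_OFFSET_BITS) - 1u)

/* Hypothetical page table entry: a valid bit and a physical page number. */
typedef struct {
    bool     valid;
    uint32_t ppn;
} pte_t;

/* Upper 20 bits of the virtual address. */
static uint32_t vpn_of(uint32_t vaddr) { return vaddr >> PAGE_OFFSET_BITS; }

/* Lower 12 bits, copied unchanged into the physical address. */
static uint32_t offset_of(uint32_t vaddr) { return vaddr & PAGE_OFFSET_MASK; }

/* Returns true and stores the 30-bit physical address in *paddr when the
 * page is present; returns false (a page fault) when the valid bit is 0.
 * The caller must supply a table that has an entry for the address's VPN. */
static bool translate(const pte_t *page_table, uint32_t vaddr, uint32_t *paddr)
{
    const pte_t *pte = &page_table[vpn_of(vaddr)];
    if (!pte->valid)
        return false;
    assert(pte->ppn < (1u << PPN_BITS));
    *paddr = (pte->ppn << PAGE_OFFSET_BITS) | offset_of(vaddr);
    return true;
}
```

With these sizes a full table would need 2^20 entries per program; a caller that only touches low virtual pages can pass a much smaller array for experimentation.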
If the TLB is a cache, how does one determine its degree of associativity?
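[Figure: the TLB as a cache of page table entries. Each TLB entry holds a valid bit, a tag, and a physical page number. Each page table entry holds a valid bit and a physical page or disk address, pointing into physical memory or disk storage.]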
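One way to picture the TLB is as a small cache whose tag is the virtual page number. The sketch below assumes a fully associative organization with round-robin replacement; the size `TLB_ENTRIES`, the types, and the function names are all hypothetical choices made for illustration, and a set-associative TLB would instead use some virtual page number bits as an index and the rest as the tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 4u   /* hypothetical size, chosen to keep the example small */

typedef struct {
    bool     valid;
    uint32_t tag;    /* fully associative: the tag is the whole VPN */
    uint32_t ppn;    /* cached translation from the page table */
} tlb_entry_t;

typedef struct {
    tlb_entry_t entry[TLB_ENTRIES];
    unsigned    next;   /* next slot to replace (round-robin) */
} tlb_t;

/* Compare the VPN against every valid entry; true on a TLB hit. */
static bool tlb_lookup(const tlb_t *tlb, uint32_t vpn, uint32_t *ppn)
{
    assert(tlb != NULL && ppn != NULL);
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (tlb->entry[i].valid && tlb->entry[i].tag == vpn) {
            *ppn = tlb->entry[i].ppn;
            return true;
        }
    }
    return false;
}

/* After a TLB miss is resolved from the page table, cache the translation. */
static void tlb_insert(tlb_t *tlb, uint32_t vpn, uint32_t ppn)
{
    assert(tlb != NULL);
    tlb->entry[tlb->next] = (tlb_entry_t){ true, vpn, ppn };
    tlb->next = (tlb->next + 1u) % TLB_ENTRIES;
}
```

A zero-initialized `tlb_t` starts with every entry invalid, so the first lookup of any page misses and falls back to the page table.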
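[Figure: the TLB and the cache working together. The 20-bit virtual page number of the virtual address is translated by the TLB, while the 12-bit page offset passes through unchanged. The resulting physical address supplies the tag and byte offset used to access the cache, whose entries hold a valid bit, a tag, and data.]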
[Figure: flowchart for processing a memory access. Decision points: TLB hit? Write? Cache hit? Write action: write data into cache, update the tag, and put the data and the address into the write buffer.]
TLB / Page table / Cache: Possible?
Miss / Miss / Miss: Yes. The TLB misses, then a page fault occurs; once the page is brought in, the data is still not in the cache.
Miss / Hit / Miss: Yes. The TLB misses, but the page is in memory, though the data is not in the cache.
Hit / Miss / Miss: IMPOSSIBLE. There can't be a TLB hit if the page is not in memory.
Hit / Hit / Miss: Yes. The data is not in the cache, but the page is in memory and we have a TLB hit.
Miss / Miss / Hit: IMPOSSIBLE. The data can't be in the cache if it's not in memory!
Miss / Hit / Hit: Yes. The TLB misses, but the entry is in the page table; when we retry we find the cached data.
Hit / Miss / Hit: IMPOSSIBLE. There can't be a TLB hit if the page is not in memory.
Hit / Hit / Hit: Yes. The TLB hits, so the page is in memory, and the data is in the cache.
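The eight cases reduce to one rule: a combination is impossible exactly when the page is not in memory (a page table miss) yet the TLB or the cache claims a hit. The sketch below encodes that rule; the function names are hypothetical and `true` stands for a hit.

```c
#include <stdbool.h>

/* A TLB hit or a cache hit both require the page to be in memory,
 * so either one combined with a page table miss cannot occur. */
static bool combination_possible(bool tlb_hit, bool page_table_hit, bool cache_hit)
{
    if (!page_table_hit && (tlb_hit || cache_hit))
        return false;
    return true;
}

/* Count how many of the 2^3 hit/miss combinations can occur. */
static int count_possible(void)
{
    int count = 0;
    for (int bits = 0; bits < 8; bits++) {
        bool tlb   = (bits & 1) != 0;
        bool table = (bits & 2) != 0;
        bool cache = (bits & 4) != 0;
        if (combination_possible(tlb, table, cache))
            count++;
    }
    return count;
}
```

Calling `combination_possible` on each row reproduces the table, and `count_possible` confirms that five of the eight combinations can occur.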
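On a context switch:
Save register contents.
Update the page table register.
Flush the TLB (unless TLB entries are tagged with a process id).
Flush the cache? Remember that the cache stores physical addresses, so it does not need to be flushed; a cache addressed with virtual addresses would.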
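The parenthetical about process ids can be illustrated with a small sketch. It assumes a hypothetical TLB whose entries carry a process id next to the virtual page number, so a lookup only matches entries belonging to the running process and stale entries from other processes are harmless. The entry layout, the size `N_TLB_ENTRIES`, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define N_TLB_ENTRIES 8u   /* hypothetical size */

typedef struct {
    bool     valid;
    uint16_t pid;   /* process id tag (an assumption of this sketch) */
    uint32_t vpn;
    uint32_t ppn;
} tagged_entry_t;

/* Hit only when both the process id and the virtual page number match. */
static bool tagged_lookup(const tagged_entry_t *tlb, uint16_t pid,
                          uint32_t vpn, uint32_t *ppn)
{
    assert(tlb != NULL && ppn != NULL);
    for (size_t i = 0; i < N_TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].pid == pid && tlb[i].vpn == vpn) {
            *ppn = tlb[i].ppn;
            return true;
        }
    }
    return false;
}

/* What an untagged TLB must do on every context switch. */
static void tlb_flush(tagged_entry_t *tlb)
{
    assert(tlb != NULL);
    for (size_t i = 0; i < N_TLB_ENTRIES; i++)
        tlb[i].valid = false;
}
```

With tags, two processes can both map virtual page 5 to different physical pages without interfering; without tags, `tlb_flush` has to run on every switch and the new process starts with a cold TLB.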
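Fine Tuning a Program for Memory Performance

If you know how data is organized in memory, you can write your code in a way that minimizes cache misses and page faults.

Row by row:

    for (i=0; i<rows; i++)
        for (j=0; j<columns; j++)
            A[i][j] = A[i][j] + i;

Column by column:

    for (j=0; j<columns; j++)
        for (i=0; i<rows; i++)
            A[i][j] = A[i][j] + i;

[Figure: layout of A in memory, from high addresses to low: A[2][1], A[2][0], A[1][1], A[1][0], A[0][1], A[0][0].]

Because each row of A is stored contiguously, the row-by-row loops touch consecutive addresses, while the column-by-column loops jump a full row between accesses.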
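The difference between the two loop orders can be made concrete by counting how often consecutive accesses land in a different cache line. This is a rough sketch under stated assumptions: a hypothetical 64-byte line (`LINE_BYTES`), 4-byte elements, an array that starts on a line boundary, and no modeling of cache capacity or replacement. The function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES 64u   /* hypothetical cache line size */

/* Byte offset of A[i][j] from the start of a row-major array of int32_t. */
static size_t element_offset(size_t i, size_t j, size_t columns)
{
    return (i * columns + j) * sizeof(int32_t);
}

/* Walk the array in row order (by_rows) or column order and count the
 * accesses that fall in a different line than the previous access. */
static size_t count_line_changes(size_t rows, size_t columns, bool by_rows)
{
    assert(rows > 0 && columns > 0);
    size_t changes = 0;
    size_t prev_line = (size_t)-1;   /* sentinel: no line touched yet */
    size_t outer = by_rows ? rows : columns;
    size_t inner = by_rows ? columns : rows;

    for (size_t a = 0; a < outer; a++) {
        for (size_t b = 0; b < inner; b++) {
            size_t i = by_rows ? a : b;
            size_t j = by_rows ? b : a;
            size_t line = element_offset(i, j, columns) / LINE_BYTES;
            if (line != prev_line)
                changes++;
            prev_line = line;
        }
    }
    return changes;
}
```

For a 16 by 16 array each row fills exactly one line under these assumptions, so row order changes lines once per row while column order changes lines on every access.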