
© 2019 Ethan L. Miller

Caching
Principles of Computer System Design

Cash rules everything around me (GZA)



Improving performance: Caching


❖ Compute the nth Fibonacci number: F(n) = F(n-1) + F(n-2)

    def fib(N):
        if 0 <= N <= 1:
            return 1
        elif N < 0:
            return 0
        return fib(N-1) + fib(N-2)

❖ How long does it take to compute fib(50)?
‣ fib(49) + fib(48)
‣ fib(49) = fib(48) + fib(47)
‣ …
❖ Solution: Memoize!

    fibv = [0] * 1000   # memo table; 0 means "not computed yet"

    def fib(N):
        if 0 <= N <= 1:
            return 1
        elif N < 0:
            return 0
        if fibv[N] == 0:
            fibv[N] = fib(N-1) + fib(N-2)
        return fibv[N]
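To make the cost difference concrete, here is a minimal sketch that counts how often each version's body runs. The names `fib_naive`, `fib_memo`, and `calls` are illustrative, and `functools.lru_cache` stands in for the hand-written memo table; the base cases follow the slide (fib(0) = fib(1) = 1).

```python
from functools import lru_cache

# Illustrative counters: how many times each function body executes.
calls = {"naive": 0, "memo": 0}

def fib_naive(n):
    calls["naive"] += 1
    if n < 0:
        return 0
    if n <= 1:
        return 1
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)  # remembers every result, so each n is computed once
def fib_memo(n):
    calls["memo"] += 1
    if n < 0:
        return 0
    if n <= 1:
        return 1
    return fib_memo(n - 1) + fib_memo(n - 2)
```

Calling `fib_naive(20)` already runs the body tens of thousands of times, while `fib_memo(50)` runs it once per distinct argument.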

CSE 130: Principles of Computer System Design 2



Caching is everywhere

❖ Hardware: CPU uses a cache to access memory faster
❖ Operating system: file system caches data to avoid disk reads
❖ Application: web client uses a cache to avoid re-fetching web pages
❖ Whole systems: modern systems nearly always put a cache in front of the database


Why does caching work?

[Figure: working-set size w(t)]

❖ Working set is the set of items used by the k most recent actions
‣ w(t) is its size at time t, usually much smaller than the total number of items!
❖ If you can cache the whole working set, you win!
❖ But, how do you know what is in the working set???
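Given a recorded trace, the working-set size can be computed directly. A minimal sketch, assuming the trace is a Python list and treating each list position as one "action"; the function name `working_set_sizes` is illustrative.

```python
def working_set_sizes(refs, k):
    """Return w(t) for every position t in the trace: the number of
    distinct items among the k most recent references ending at t."""
    sizes = []
    for t in range(len(refs)):
        window = refs[max(0, t - k + 1): t + 1]  # the last k references
        sizes.append(len(set(window)))
    return sizes
```

For example, `working_set_sizes([1, 2, 1, 3, 1, 2], 3)` returns `[1, 2, 2, 3, 2, 3]`. A live system cannot do this, because it has no recorded trace to look back over cheaply, which is why the policies that follow rely on approximations.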


Why does caching work?

❖ Keep “likely to be referenced” items in the cache


‣ Locality of reference
• Temporal locality: Recent items are likely to be referenced again soon
• Spatial locality: Items “near” recently referenced items are likely to be referenced soon

❖ Caching consists of one or both of:


‣ Keeping recently used items where they are faster to retrieve
‣ Fetching other items that might be referenced soon
❖ How do we track all of this?
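Both halves of that definition fit in a few lines. The sketch below is an assumption-laden toy: `PrefetchingCache`, its `fetch` callback (a stand-in for the slow back end), and the "prefetch the next block numbers" rule are all hypothetical, and the cache never evicts anything.

```python
class PrefetchingCache:
    def __init__(self, fetch, prefetch=1):
        self.fetch = fetch        # hypothetical slow lookup: block number -> data
        self.prefetch = prefetch  # how many following blocks to fetch on a miss
        self.store = {}           # unbounded for simplicity (no eviction)
        self.hits = 0
        self.misses = 0

    def get(self, block):
        if block in self.store:
            # Temporal locality: a recently used block is still here.
            self.hits += 1
        else:
            self.misses += 1
            self.store[block] = self.fetch(block)
            # Spatial locality: also bring in the blocks that follow.
            for nxt in range(block + 1, block + 1 + self.prefetch):
                if nxt not in self.store:
                    self.store[nxt] = self.fetch(nxt)
        return self.store[block]
```

With `prefetch=1`, reading blocks 0, 1, 2, 3 in order misses only on blocks 0 and 2.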


Example: virtual memory

❖ Each process has its own virtual memory space


❖ What if demand for memory exceeds physical memory?
❖ Solution: Paging
‣ Treat physical memory as a cache of all memory
‣ CPU and operating system work in tandem


Caching Terms
❖ Write-Back: Writes update only the cache; the back-end is updated later (e.g., on eviction)
❖ Write-Through: Writes update both the cache and the back-end
❖ Associativity: Which data can be stored in each cache slot?
‣ Fully Associative: each item can be stored anywhere
‣ Direct-mapped: each item can be stored in only one location
‣ N-way associative: each item can be stored in any of N locations
❖ Mechanism: The algorithms and structures used
‣ Namely: how do you map from virtual pages to physical pages?
❖ Policy: The decision of what action to perform
‣ Namely: how do you pick the “right” pages to keep in memory?
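The two write policies differ only in when the back-end sees a write. A minimal sketch, assuming the back-end is a plain dict; the class names and the explicit `flush` method are illustrative, not a real API.

```python
class WriteThroughCache:
    def __init__(self, backend):
        self.backend = backend
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value
        self.backend[key] = value  # back-end is updated on every write


class WriteBackCache:
    def __init__(self, backend):
        self.backend = backend
        self.cache = {}
        self.dirty = set()         # keys changed since the last flush

    def write(self, key, value):
        self.cache[key] = value    # only the cached copy changes
        self.dirty.add(key)

    def flush(self):
        # Push modified entries to the back-end (e.g., before eviction).
        for key in self.dirty:
            self.backend[key] = self.cache[key]
        self.dirty.clear()
```

Write-back saves back-end traffic when the same item is written repeatedly, at the cost of tracking dirty entries and risking loss if the cache disappears before a flush.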

Translating virtual addresses

❖ CPU and OS translate a virtual address (and process ID) into one of:


‣ Address in physical memory — done!
‣ Location on disk — Page Fault, OS gets the page from disk!
‣ Address is not valid — Segmentation fault!

❖ Fetch policy: when are values brought in?


‣ In this case: on-demand (when they are requested)
‣ Imagine predicting pages that might be used soon (CPU caches do this!)
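The three translation outcomes can be sketched as a lookup in a per-process page table. Everything here is an assumption for illustration: the 4096-byte page size, the dict-based table, the `("memory", frame)` / `("disk", block)` entry format, and the exception names.

```python
PAGE_SIZE = 4096  # assumed page size

class PageFault(Exception):
    """Page is valid but currently on disk; the OS must load it."""

class SegmentationFault(Exception):
    """Address is not part of the process's address space."""

def translate(page_table, vaddr):
    """page_table maps virtual page number -> ("memory", frame) or ("disk", block)."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table.get(vpn)
    if entry is None:
        raise SegmentationFault(vaddr)      # address is not valid
    kind, where = entry
    if kind == "memory":
        return where * PAGE_SIZE + offset   # address in physical memory: done
    raise PageFault(vpn)                    # on disk: fetched on demand
```

Because `PageFault` is raised only when the page is actually touched, this models the on-demand fetch policy.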


Page Fault

❖ Select an old page to remove, write it back to disk.

❖ Read new page from disk and place it in the old page’s slot
‣ Update structures that keep track of virtual memory
‣ Careful: there are tricky atomicity issues here
❖ Removal policy: when are values removed?
‣ In this case: on-demand (only remove when necessary)
‣ But, which values should be removed?


Page Replacement Algorithm

❖ Capacity: How many items can the cache fit?
‣ In this case, fixed size
❖ We’d like to evict a page that is unlikely to be used soon, but how?
‣ Track the pages that are accessed
‣ Evict based on some principle.
❖ Lots of choices
‣ Least-recently used (LRU)
‣ First-In-First-Out (FIFO)
‣ Clock

How well can we do?

❖ What’s the best we can possibly do (OPTimal algorithm)?


‣ Assume perfect knowledge of the future
‣ Not realizable in practice (usually)
‣ Useful for comparison: if another algorithm is within 5% of optimal, not much more can be done…
❖ Algorithm: replace the page that will be used furthest in the future
‣ Only works if we know the whole sequence!
‣ Can be approximated by running the program twice
• Once to generate the reference trace (workload)
• Once (or more) to apply the optimal algorithm now that we know the references
❖ Nice, but not achievable in real systems!
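With a recorded reference trace in hand, the optimal policy is short to simulate. A minimal sketch, assuming the trace is a Python list; the name `opt_faults` is illustrative, and the linear scan for each page's next use is chosen for clarity rather than speed.

```python
def opt_faults(refs, frames):
    """Count page faults for the optimal policy on a known trace."""
    resident = set()
    faults = 0
    for i, page in enumerate(refs):
        if page in resident:
            continue
        faults += 1
        if len(resident) >= frames:
            def next_use(p):
                # Position of the next reference to p, or infinity if none.
                try:
                    return refs.index(p, i + 1)
                except ValueError:
                    return float("inf")
            # Evict the page whose next use lies furthest in the future.
            resident.remove(max(resident, key=next_use))
        resident.add(page)
    return faults
```

On the trace 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 this gives 7 faults with 3 frames and 6 with 4, a lower bound to compare other policies against.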


Simple Page Replacement: Dirty pages last

❖ Insight: A page only needs to be written to disk if it was changed.


‣ So, track if a page is clean or dirty!

❖ Mechanism: on each write to a page, mark it as dirty


❖ Policy: Choose clean pages first when choosing the page to evict
❖ Very common optimization

❖ Does this predict the working set?


Not Recently Used + Dirty pages last


❖ Use Temporal locality—likely to use recently used pages again

❖ Mechanism: Each page has a dirty bit and a reference bit
‣ Bits are set when page is referenced and/or modified
‣ Four classes of pages:
• 0: not referenced, not dirty
• 1: not referenced, dirty
• 2: referenced, not dirty
• 3: referenced, dirty
❖ Policy: remove a page from the lowest non-empty class

❖ Problem: Everything drifts towards classes 2 and 3!
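The class number is just the two bits read as a binary value, so victim selection is a minimum over pages. A minimal sketch; the `Page` record and function names are illustrative, and ties within a class go to the first page in the list, which is an arbitrary choice.

```python
from dataclasses import dataclass

@dataclass
class Page:
    name: str
    referenced: bool = False
    dirty: bool = False

def nru_class(page):
    # Class = 2 * referenced + dirty, giving 0..3 as listed above.
    return 2 * page.referenced + page.dirty

def pick_victim(pages):
    """Return a page from the lowest non-empty class."""
    return min(pages, key=nru_class)
```

Nothing in this sketch ever clears `referenced`, which is exactly how everything ends up in classes 2 and 3; a real system would have to reset the bit from time to time.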



First-in, First-Out (FIFO) page replacement

❖ Maintain a linked list of all pages


‣ Maintain the order in which they entered memory
❖ Page at the front of the list is replaced
❖ Advantage: (really) easy to implement
❖ Disadvantage: page in memory the longest may be used frequently
‣ This algorithm forces pages out regardless of usage
‣ Usage may be helpful in determining which pages to keep


Second chance page replacement


❖ Modify FIFO to avoid throwing out heavily used pages
‣ If reference bit is 0, throw the page out
‣ If reference bit is 1
• Reset the reference bit to 0
• Move page to the tail of the list
• Continue search for a free page
❖ Still easy to implement, and better than plain FIFO
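The eviction step can be sketched with a double-ended queue. This assumes the FIFO list is a `collections.deque` of `[page, referenced]` pairs with the oldest page on the left; the function name is illustrative.

```python
from collections import deque

def second_chance_evict(queue):
    """Remove and return the page to evict from a non-empty queue."""
    while True:
        page, referenced = queue.popleft()
        if not referenced:
            return page                # reference bit is 0: throw it out
        # Reference bit is 1: reset it and move the page to the tail.
        queue.append([page, False])
```

If every page has its bit set, the loop clears them all and then evicts the original head, so the policy degrades to plain FIFO.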

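[Figure: FIFO list of pages with load times: A (t=0), B (t=4), C (t=8), D (t=15), E (t=21), F (t=22), G (t=29), H (t=30), and A again at the tail (t=32). Legend: referenced / unreferenced]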


The Clock Algorithm

❖ “Don’t let perfect be the enemy of good” - Voltaire

❖ Mechanism: keep track of “next page” to consider in a “clock”


‣ Track whether each page was used since it was last considered
❖ Policy: evict page if not used since last revolution


Clock algorithm
❖ Same functionality as second chance
❖ Simpler implementation
‣ “Clock” hand points to next page to replace
‣ If R=0, replace page
‣ If R=1, set R=0 and advance the clock hand
❖ Continue until page with R=0 is found
‣ This may involve going all the way around the clock…

[Figure: pages arranged in a circle: A (t=0), B (t=4), C (t=8), D (t=15), E (t=21), F (t=22), G (t=29), H (t=30). Overlay labels show t=32 on A, B, and C, and a new page J (t=32) in D's position. Legend: referenced / unreferenced]
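A minimal sketch of the clock as a fixed array of frames plus a hand index. The `Clock` class and its `access` method are illustrative; setting R=1 when a page is first loaded is an assumption of this sketch, and page names are assumed never to be `None`.

```python
class Clock:
    def __init__(self, frames):
        self.pages = [None] * frames  # None marks an empty frame
        self.ref = [False] * frames   # R bit per frame
        self.hand = 0                 # next frame to consider

    def access(self, page):
        """Reference a page; return True on a hit, False on a fault."""
        if page in self.pages:
            self.ref[self.pages.index(page)] = True
            return True
        # Fault: advance the hand, clearing R bits, until R=0 is found.
        while self.ref[self.hand]:
            self.ref[self.hand] = False
            self.hand = (self.hand + 1) % len(self.pages)
        self.pages[self.hand] = page
        self.ref[self.hand] = True    # assumption: a newly loaded page counts as referenced
        self.hand = (self.hand + 1) % len(self.pages)
        return False
```

No list nodes are moved, only the hand, which is what makes this simpler than second chance while evicting the same pages.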

Least Recently Used (LRU)

❖ Mechanism: keep counter in each page table entry


‣ Global counter increments with each CPU cycle
‣ Copy global counter to PTE counter on a reference to the page
❖ Policy: evict page with lowest counter value, break ties with dirty bit

❖ Problem: Expensive to implement!


‣ Approximate it with NRU and “the clock algorithm”.
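The counter mechanism can be sketched in software, with one simplification: the global counter here advances once per access rather than once per CPU cycle. The `LRUCounter` class is illustrative, and the dirty-bit tie-break is omitted because two pages can never share a counter value in this model.

```python
class LRUCounter:
    def __init__(self, frames):
        self.frames = frames
        self.clock = 0        # stand-in for the global counter
        self.last_used = {}   # page -> counter value at its last reference

    def access(self, page):
        """Reference a page; return True on a hit, False on a fault."""
        self.clock += 1
        hit = page in self.last_used
        if not hit and len(self.last_used) >= self.frames:
            # Evict the page with the lowest counter value.
            victim = min(self.last_used, key=self.last_used.get)
            del self.last_used[victim]
        self.last_used[page] = self.clock  # copy global counter on reference
        return hit
```

The `min` scan over every resident page on each fault, plus a counter update on every reference, hints at why exact LRU is expensive to implement.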


How do you know if your algorithm is any good?

❖ Reference String: sequence of accesses made to the system


‣ Can be synthetic or captured from a real workload
❖ Simulate algorithm over reference string
‣ Vary the number of available pages
‣ Track the number of Hits/Misses


Example Simulation

❖ Reference String: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
❖ Simulate a FIFO cache with 3 pages and with 4 pages.

❖ Did anything weird happen?
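One way to check the answer is to run the simulation. A minimal sketch of a FIFO fault counter; the names `fifo_faults` and `REFS` are illustrative.

```python
from collections import deque

def fifo_faults(refs, frames):
    """Count page faults for FIFO replacement with a fixed number of frames."""
    queue = deque()   # resident pages, oldest on the left
    resident = set()  # same pages, for fast membership tests
    faults = 0
    for page in refs:
        if page in resident:
            continue  # hit: FIFO order does not change
        faults += 1
        if len(queue) == frames:
            resident.remove(queue.popleft())  # evict the oldest page
        queue.append(page)
        resident.add(page)
    return faults

REFS = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for frames in (3, 4):
    print(frames, "frames:", fifo_faults(REFS, frames), "faults")
# prints "3 frames: 9 faults" then "4 frames: 10 faults"
```

Adding a frame makes FIFO fault more often on this string, an effect known as Belady's anomaly.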
