

CACHE

sina@

CACHE Intro

"Caching is a temp location where I store data in (data that I


need it frequently) as the original data is expensive to be

fetched, so I can retrieve it faster.

CACHE Why do we need a cache?

If every request has to go all the way to the original (expensive) storage:

1. The user will get upset, complain, and may never use this application again.

2. The storage layer will pack up its bags and leave your application, and that is a big problem (no place to store the data).


CACHE Cache Hit

1. When the client invokes a request (let's say they want to view product information) and our application receives it, the application needs the product data from our storage (the database), so it first checks the cache.

2. If an entry can be found with a tag matching that of the desired data (say, the product ID), that entry is used instead. This is known as a cache hit (the cache hit is the primary measurement of caching effectiveness; we will discuss that later on).

3. The percentage of accesses that result in cache hits is known as the hit rate or hit ratio of the cache.

CACHE Cache Miss

On the contrary, when the tag is not found in the cache (no match is found), this is known as a cache miss. A trip to the backing storage is made, the data is fetched and placed in the cache, so future requests for it will result in a cache hit. When we encounter a cache miss, there are two possible scenarios:

1. There is free space in the cache (the cache has not reached its limit), so the object that caused the cache miss is retrieved from our storage and inserted into the cache.

2. There is no free space in the cache (the cache has reached its capacity), so the object that caused the miss is fetched from storage, and then we have to decide which object already in the cache to evict in order to make room for the newly retrieved one. This is done by the replacement policy (the caching algorithms), which decides which entry to remove to make more room; these are discussed below.
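To make this hit/miss flow concrete, here is a minimal Python sketch of the get-or-fetch logic described above (the fetch_from_storage helper, the fixed capacity, and the arbitrary eviction are illustrative assumptions, not something prescribed by these slides):

CAPACITY = 3
cache = {}

def fetch_from_storage(product_id):
    # Stand-in for the expensive database / backing-storage lookup.
    return {"id": product_id, "name": "product-%s" % product_id}

def get_product(product_id):
    if product_id in cache:                    # cache hit: tag (product id) found
        return cache[product_id]
    data = fetch_from_storage(product_id)      # cache miss: go to the backing storage
    if len(cache) >= CAPACITY:                 # scenario 2: no free space left
        cache.pop(next(iter(cache)))           # placeholder eviction; a real replacement
                                               # policy (LRU, LFU, ...) chooses the victim
    cache[product_id] = data                   # place the fetched entry in the cache
    return data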

CACHE Storage Cost

When a cache miss occurs, the data is fetched from the backing storage, loaded, and placed in the cache. But how much space does the data we just fetched take up in the cache memory? This is known as the storage cost.

CACHE Retrieval Cost

And when we need to load the data, we need to know how long it takes to load it. This is known as the retrieval cost.
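As a rough illustration of both costs, the sketch below records an approximate storage cost (bytes taken in memory) and retrieval cost (time spent loading) alongside each cached entry; the fetch_from_storage argument is a hypothetical stand-in for the expensive load:

import sys
import time

def cache_with_costs(cache, key, fetch_from_storage):
    # Fetch `key` and record a rough storage cost (bytes) and retrieval cost (seconds).
    start = time.perf_counter()
    value = fetch_from_storage(key)               # the expensive load from backing storage
    retrieval_cost = time.perf_counter() - start  # how long it took to load the data
    storage_cost = sys.getsizeof(value)           # approximate space the entry occupies
    cache[key] = {"value": value,
                  "storage_cost": storage_cost,
                  "retrieval_cost": retrieval_cost}
    return cache[key]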

CACHE Replacement Policy

When a cache miss happens and there is not enough room, the cache ejects some other entry in order to make space for the newly fetched data. The heuristic used to select the entry to eject is known as the replacement policy.
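One way to picture a replacement policy is as a small pluggable heuristic that the cache consults for a victim when it is full; the interface below is only an illustrative sketch (the class and method names are assumptions, not a standard API), and the algorithms on the following slides are all concrete answers to the victim() question:

class ReplacementPolicy:
    # A pluggable heuristic: the cache asks it which entry to eject when full.

    def on_access(self, key):
        # Called whenever `key` is read or inserted, so the policy can track usage.
        raise NotImplementedError

    def victim(self):
        # Return the key to remove in order to make room for a new entry.
        raise NotImplementedError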

Caching Algorithms LFU

Least Frequently Used (LFU):


I am Least Frequently Used; I count how often an entry is needed by incrementing a counter associated with each entry, and I remove the entry with the lowest counter first. I am not that fast, and I am not that good at adaptive actions (that is, keeping the entries that are really needed and discarding the ones that are not, based on the access pattern, or in other words the request pattern).
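A minimal LFU sketch along these lines, keeping a counter per entry and evicting the key with the smallest count (the capacity handling and dictionary layout are assumptions made for illustration):

class LFUCache:
    # Least Frequently Used: count accesses, evict the entry with the smallest count.
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}     # key -> value
        self.counts = {}   # key -> access counter

    def get(self, key):
        if key not in self.data:
            return None                                      # cache miss
        self.counts[key] += 1                                # hit: bump the counter
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)   # least frequently used entry
            del self.data[victim]
            del self.counts[victim]
        self.data[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1

The linear min() scan at eviction time also matches the "I am not that fast" remark above.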


Caching Algorithms LRU

Least Recently Used (LRU):


I am the Least Recently Used cache algorithm; I remove the least recently used items first, the ones that have not been used for the longest time. I require keeping track of what was used when, which is expensive if one wants to make sure that I always discard the least recently used item. Web browsers use me for caching. New items are placed at the top of the cache; when the cache exceeds its size limit, I discard items from the bottom. The trick is that whenever an item is accessed, I move it to the top, so items which are frequently accessed tend to stay in the cache. I can be implemented with either an array or a linked list (with the least recently used entry at the back and the recently used entry at the front). I am fast and I am adaptive; in other words, I can adapt to the data access pattern. I have a large family which completes me, and they are even better than me (I do feel jealous sometimes, but it is OK); some of my family members are LRU2 and 2Q (they were implemented in order to improve LRU caching).
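A minimal LRU sketch using Python's OrderedDict in place of the array or linked list mentioned above (the capacity handling is an assumption for illustration):

from collections import OrderedDict

class LRUCache:
    # Least Recently Used: accessed items move to the "top"; evict from the "bottom".
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # front = least recently used, back = most recent

    def get(self, key):
        if key not in self.items:
            return None                      # miss
        self.items.move_to_end(key)          # hit: move the item to the top
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        elif len(self.items) >= self.capacity:
            self.items.popitem(last=False)   # discard the least recently used item
        self.items[key] = value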

Caching Algorithms LRU2 and 2Q

Least Recently Used 2 (LRU2):


I am Least Recently Used 2; some people call me Least Recently Used Twice, which I like more. I add entries to the cache the second time they are accessed (it takes two accesses to place an entry in the cache); when the cache is full, I remove the entry with the second most recent access.

Because of the need to track the two most recent accesses, my access overhead increases with cache size; if I am applied to a big cache, that can be a disadvantage. In addition, I have to keep track of some items that are not yet in the cache (they have not been requested twice yet). I am better than LRU, and I am also adaptive to access patterns. Hello, I am Two Queues (2Q); I add entries to an LRU cache as they are accessed, and if an entry is accessed again, I move it to a second, larger LRU cache.
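A much-simplified 2Q-style sketch following the description above: new entries go into a small probationary queue, and entries accessed again are promoted to a larger LRU queue (the queue sizes and the plain FIFO eviction of the probationary queue are assumptions of this sketch, not the full published 2Q algorithm):

from collections import OrderedDict

class TwoQueueCache:
    # Simplified 2Q: once-seen entries live in a small probation queue;
    # entries accessed a second time are promoted to a larger main LRU queue.
    def __init__(self, probation_size, main_size):
        self.probation = OrderedDict()   # seen once (FIFO eviction)
        self.main = OrderedDict()        # seen again (LRU eviction)
        self.probation_size = probation_size
        self.main_size = main_size

    def get(self, key):
        if key in self.main:                      # hit in the main LRU queue
            self.main.move_to_end(key)
            return self.main[key]
        if key in self.probation:                 # second access: promote to main
            value = self.probation.pop(key)
            if len(self.main) >= self.main_size:
                self.main.popitem(last=False)     # evict the main queue's LRU entry
            self.main[key] = value
            return value
        return None                               # miss

    def put(self, key, value):
        if self.get(key) is not None:             # already cached; get() promoted it,
            self.main[key] = value                # so it now lives in the main queue
            return
        if len(self.probation) >= self.probation_size:
            self.probation.popitem(last=False)    # oldest once-seen entry leaves
        self.probation[key] = value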


Caching Algorithms ARC

Adaptive Replacement Cache (ARC):


I am Adaptive Replacement Cache; some people say that I balance between LRU and LFU to improve the combined result. Well, that is not 100% true: I am actually made from two LRU lists. One list, say L1, contains entries that have been seen only once recently, while the other list, say L2, contains entries that have been seen at least twice recently. Items that have been seen twice within a short time have a low inter-arrival rate and hence are thought of as high frequency. So we think of L1 as capturing recency and L2 as capturing frequency; that is why most people think I am a balance between LRU and LFU, but that is OK, I am not angry about it. I am considered one of the best-performing replacement algorithms: a self-tuning algorithm and a low-overhead replacement cache. I also keep a history of entries equal to the size of the cache; this is to remember the entries that were removed, and it allows me to see whether a removed entry should have stayed and another one should have been chosen for removal instead. (I really have a bad memory.) And yes, I am fast and adaptive.
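The full ARC algorithm is more involved than these slides can show, but a simplified two-list sketch captures the L1/L2 split and the eviction history described above (this is NOT the complete self-tuning ARC: the ghost history is only recorded here, whereas real ARC uses it to adaptively resize the two lists):

from collections import OrderedDict, deque

class ArcLikeCache:
    # ARC-inspired sketch: L1 captures recency (seen once recently),
    # L2 captures frequency (seen at least twice recently), and a bounded
    # ghost history remembers recently evicted keys.
    def __init__(self, capacity):
        self.capacity = capacity
        self.l1 = OrderedDict()               # seen once recently
        self.l2 = OrderedDict()               # seen at least twice recently
        self.ghost = deque(maxlen=capacity)   # history of evicted keys

    def get(self, key):
        if key in self.l1:                    # second access: recency list -> frequency list
            self.l2[key] = self.l1.pop(key)
            return self.l2[key]
        if key in self.l2:
            self.l2.move_to_end(key)
            return self.l2[key]
        return None                           # miss (the key may still be in self.ghost)

    def put(self, key, value):
        if self.get(key) is not None:         # already cached; it is now in L2
            self.l2[key] = value
            return
        if len(self.l1) + len(self.l2) >= self.capacity:
            victim_list = self.l1 if self.l1 else self.l2
            evicted_key, _ = victim_list.popitem(last=False)
            self.ghost.append(evicted_key)    # remember which entry was thrown out
        self.l1[key] = value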


Caching Algorithms MRU

Most Recently Used (MRU):


I am Most Recently Used; in contrast to LRU, I remove the most recently used items first. You will surely ask me why. Well, let me tell you something: when access is unpredictable, and determining the least recently used entry in the cache is a high time-complexity operation, I am the best choice; that is why. I am quite common in database memory caches: whenever a cached record is used, I move it to the top of the stack, and when there is no room, guess what? I replace the topmost entry with the new entry.
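A minimal MRU sketch, mirroring the LRU one above but evicting from the opposite end (the capacity handling is again an assumption for illustration):

from collections import OrderedDict

class MRUCache:
    # Most Recently Used: in contrast to LRU, evict the item that was used most recently.
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # back = most recently used

    def get(self, key):
        if key not in self.items:
            return None                      # miss
        self.items.move_to_end(key)          # mark as most recently used (top of the stack)
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        elif len(self.items) >= self.capacity:
            self.items.popitem(last=True)    # evict the most recently used entry
        self.items[key] = value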


Caching Algorithms FIFO

First in First out (FIFO):


I am First In First Out; I am a low-overhead algorithm that requires little effort for managing the cache entries. The idea is that I keep track of all the cache entries in a queue, with the most recent entry at the back and the earliest entry at the front. When there is no room and an entry needs to be replaced, I remove the entry at the front of the queue (the oldest entry) and replace it with the currently fetched entry. I am fast, but I am not adaptive.

Hello, I am Second Chance, a modified form of the FIFO replacement algorithm known as the Second Chance replacement algorithm; I am better than FIFO, at little cost for the improvement.

I am Clock, and I am a more efficient version of FIFO than Second Chance, because I do not push the cached entries to the back of the list like Second Chance does, but I perform the same general function as Second Chance.
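A minimal sketch of FIFO and its Second Chance variant as described above (the reference-bit bookkeeping and class layout are illustrative assumptions; Clock behaves like Second Chance but advances a circular pointer instead of moving entries to the back):

from collections import OrderedDict

class FIFOCache:
    # First In First Out: a queue of entries; evict the oldest, regardless of use.
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # front = oldest entry, back = newest entry

    def get(self, key):
        return self.items.get(key)             # a hit does not change the entry's position

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            self.items.popitem(last=False)      # remove the entry at the front (the oldest)
        self.items[key] = value

class SecondChanceCache(FIFOCache):
    # Second Chance: like FIFO, but an entry referenced since it was queued gets
    # one reprieve (it is pushed to the back with its reference bit cleared).
    def __init__(self, capacity):
        super().__init__(capacity)
        self.referenced = {}

    def get(self, key):
        if key in self.items:
            self.referenced[key] = True         # mark as recently referenced
        return self.items.get(key)

    def put(self, key, value):
        if key not in self.items:
            while len(self.items) >= self.capacity:
                old_key, old_value = self.items.popitem(last=False)
                if self.referenced.pop(old_key, False):
                    self.items[old_key] = old_value   # second chance: push to the back
                else:
                    break                             # this one is evicted for real
        self.items[key] = value
        self.referenced.setdefault(key, False)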


CACHE Distributed Caching

Distributed caching:

1. Cached data can be stored in a memory area separate from the caching directory itself (which handles the cache entries and so on), for example across the network or on disk.

2. Distributing the cache allows the cache size to increase.

3. In this case the retrieval cost will also increase, due to the network request time.

4. This will also lead to a hit-ratio increase, due to the larger size of the cache.
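As a small illustration of point 1 (and the retrieval-cost trade-off in point 3), the sketch below keeps the caching directory locally while the cached data itself lives in a separate store reached over the network; the remote store is simulated here with a dictionary and an artificial delay, so all names and numbers are assumptions for illustration only:

import time

class DistributedCache:
    # The directory (which keys are cached) is local; the cached data lives in a
    # separate store reached over the network (simulated with a dict and a delay).
    def __init__(self, remote_store, network_delay=0.005):
        self.directory = set()           # local bookkeeping of cached keys
        self.remote_store = remote_store
        self.network_delay = network_delay

    def _remote_get(self, key):
        time.sleep(self.network_delay)   # retrieval cost grows by the network round trip
        return self.remote_store.get(key)

    def _remote_put(self, key, value):
        time.sleep(self.network_delay)
        self.remote_store[key] = value

    def get(self, key, fetch_from_storage):
        if key in self.directory:              # hit: still pays the network retrieval cost
            return self._remote_get(key)
        value = fetch_from_storage(key)        # miss: go to the original (expensive) storage
        self._remote_put(key, value)
        self.directory.add(key)
        return value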


Thank You

SINA@
Make presentations much more fun
