
Sunplus mMobile Inc.

Cache Introduction
May 17th, 2007
Leon Chen
Outline

• Cache Concepts
• Memory hierarchy
• Cache Architecture
• Cache Policy
• Cache Clean & Flush
• Cache Lockdown
• Cache performance vs. code organization

Cache Concepts

• What is a cache?
  - A small, fast array of memory
  - Reduces the bottleneck of accessing slow memory
• How does it improve performance?
  - Reads a block from main memory and works on that data (read/write) temporarily in the cache
• Drawback
  - Makes the execution time of a program difficult to determine

Memory Hierarchy

[Figure: memory hierarchy, top to bottom: processor core with register file; tightly coupled memory, L1 cache and write buffer; main memory (SRAM, DRAM); flash and board-level NVRAM. Arrows mark the read path and the write path.]

Memory Relationship

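[Figure: processor core, cache, write buffer and main memory. Word and byte accesses between the core and the cache are fast; block transfers between the cache or write buffer and main memory are slow; word and byte accesses from the core directly to main memory are slow.]
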
Unified Instruction and Data Cache

The share of the cache used for instructions and for data can be adjusted flexibly

Harvard Architecture

Instruction fetch and data access in a single clock cycle

Cache Architecture

• How to achieve high-performance access?
  - Time: temporal locality
  - Space: spatial locality
• Elements of a cache
  - Cache controller
  - Cache memory
  - Write buffer

Cache Controller Operations

[Figure: address fields: Tag [31:10] | Set index [9:4] | Data index [3:0]. Cache line layout: Tag | v | d | word3 | word2 | word1 | word0]

• Cache line
• Cache hit
• Cache line fill
  - Reads data from the mapped main-memory address into a cache line
• Eviction
  - Victim: the cache line selected for replacement
  - Eviction is the process of selecting and replacing a victim cache line
• Data streaming
  - While a cache line is being filled, the processor does not have to wait for the remaining words of the line fill to arrive; it continues executing

Cache Organization

[Figure: organization of a 4 KB, 4-way cache. Each way holds 1 KB (offsets 0x000 to 0x3FF) as 64 entries of 4 words, and each entry is laid out as Tag | v | d | word3 | word2 | word1 | word0. The address is split into Tag [31:10], Set index [9:4] and Data index [3:0]. One tag comparator per way feeds a MUX that delivers the hit data or signals a miss.]

[Figure: main memory shown as 1 KB blocks starting at 0x00000000, 0x00000400, 0x00000800 and 0x00000C00. Addresses 0x00000224, 0x00000624 and 0x00000A24 all map to offset 0x224 within a way. The address decodes as Tag = xxxx, Set index = 0x22, Data index = 0x4.]

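
The address split shown on this slide can be sketched in C. This is only an illustration under the slide's parameters (4 KB, 4-way, 4-word lines, 64 entries per way); the type `cache_addr_t` and the helper `cache_decode` are hypothetical names, not part of any ARM header.

```c
#include <stdint.h>

/* Field widths taken from the slide's 4 KB, 4-way cache:
 * 16-byte lines (4 words) -> 4 offset bits, 64 sets -> 6 index bits. */
#define OFFSET_BITS 4u
#define INDEX_BITS  6u

typedef struct {
    uint32_t tag;        /* address bits [31:10] */
    uint32_t set_index;  /* address bits [9:4]   */
    uint32_t data_index; /* address bits [3:0], byte offset within the line */
} cache_addr_t;

/* Hypothetical helper: split a 32-bit address into its cache fields. */
static cache_addr_t cache_decode(uint32_t addr)
{
    cache_addr_t f;
    f.data_index = addr & ((1u << OFFSET_BITS) - 1u);
    f.set_index  = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1u);
    f.tag        = addr >> (OFFSET_BITS + INDEX_BITS);
    return f;
}
```

With these widths, 0x00000224, 0x00000624 and 0x00000A24 all decode to set index 0x22 and data index 0x4, and differ only in the tag (0, 1 and 2). That is why they compete for the same set.
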
Write Buffer

• A very small, fast FIFO memory buffer
  - A few cache lines deep
  - Lets the CPU complete a data write within one cycle
  - Improves performance when
    - the CPU needs to write to main memory (e.g. under a write-through policy)
    - a cache line is evicted: the line becomes available again as soon as possible

Cache Policy

• Write policy: where to store
  - Write-through
    - Writes go to both the cache and main memory
  - Write-back
    - Writes go to the cache only, in conjunction with the dirty bit
• What is the dirty bit?
  - Marks a cache line as inconsistent with main memory (the d field in Tag | v | d | word3 | word2 | word1 | word0)
  - Operations associated with the dirty bit
    - Eviction
    - Clean cache
• The write policy is controlled by the memory attributes (C and B bits)
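
The difference between the two write policies can be sketched with a toy model in C. This is a simplification, not how a real cache controller is built: it tracks a single one-word line, and `line_model_t`, `line_write` and `line_evict` are hypothetical names chosen for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { WRITE_THROUGH, WRITE_BACK } write_policy_t;

/* One-word "cache line" plus the main-memory word it shadows (toy model). */
typedef struct {
    uint32_t cached;  /* copy held in the cache           */
    uint32_t memory;  /* backing word in main memory      */
    bool     valid;   /* v bit: the line holds real data  */
    bool     dirty;   /* d bit: cache differs from memory */
} line_model_t;

/* Handle a write that hits the line. */
static void line_write(line_model_t *l, uint32_t value, write_policy_t policy)
{
    l->cached = value;
    if (policy == WRITE_THROUGH) {
        l->memory = value;  /* update cache and main memory together */
    } else {
        l->dirty = true;    /* write-back: defer the memory update   */
    }
}

/* Evict the line: a dirty line must be written back before it is reused. */
static void line_evict(line_model_t *l)
{
    if (l->valid && l->dirty) {
        l->memory = l->cached;
    }
    l->dirty = false;
    l->valid = false;
}
```

In this model a write-back hit leaves main memory stale until `line_evict` runs, which is the situation the dirty bit records.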

Cache Policy (Cont.)
• Replacement policy: which cache line in a set to replace
  - Round-robin
    - Sequential: increments a victim counter (associated with a cache line)
  - Pseudorandom
    - Randomly selects an increment value and adds it to the victim counter
  - Least recently used (not supported by ARM)
    - Selects the cache line unused for the longest time as the victim
• Allocation policy: when to perform a cache line fill
  - Read allocation
    - A cache line is filled only on a read
    - If the corresponding cache line is not valid (no line fill yet), a write goes to main memory only
  - Read-write allocation
    - If a write misses, the cache line is filled first and the data is then written to the cache
  - ARM1156T2-S supports read allocation only
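
The two victim-counter schemes listed above can be sketched in C. This is an illustration only: the 4-way assumption comes from the earlier 4 KB example, the names `victim_counter_t`, `pick_victim_round_robin` and `pick_victim_pseudorandom` are hypothetical, and the random increment is passed in by the caller rather than generated in hardware.

```c
#include <stdint.h>

#define NUM_WAYS 4u  /* assumption: 4-way set, as in the 4 KB example */

/* Hypothetical victim counter: the way that will be replaced next. */
typedef struct {
    uint32_t victim;
} victim_counter_t;

/* Round-robin: return the current victim way, then advance sequentially. */
static uint32_t pick_victim_round_robin(victim_counter_t *c)
{
    uint32_t way = c->victim;
    c->victim = (c->victim + 1u) % NUM_WAYS;
    return way;
}

/* Pseudorandom: add a caller-supplied increment to the counter and
 * return the way it now points at. */
static uint32_t pick_victim_pseudorandom(victim_counter_t *c, uint32_t increment)
{
    c->victim = (c->victim + increment) % NUM_WAYS;
    return c->victim;
}
```

Starting from zero, repeated round-robin calls walk the ways in the order 0, 1, 2, 3 and then wrap back to 0.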

Invalidate and Clean

• Invalidate cache
  - Clears the valid bit in the affected cache lines
  - Also known as "flush"

• Clean cache
  - Applies to the write-back policy
  - Writes the cache lines whose dirty bit is set to main memory and clears the dirty bit

• The instruction cache needs no clean operation

• When to flush/clean the cache
  - When changing access permissions
  - When changing the cache configuration
  - When remapping virtual addresses
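
The two maintenance operations can be sketched on a toy software model in C. This does not show the real CP15 operations; `toy_line_t`, `cache_clean` and `cache_invalidate` are hypothetical names, and each line holds a single word for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t data;   /* cached word (one word per line, for brevity) */
    bool     valid;  /* v bit */
    bool     dirty;  /* d bit */
} toy_line_t;

/* Clean: write every valid dirty line back to memory and clear its dirty bit.
 * memory[i] is the word backing lines[i] in this toy model. */
static void cache_clean(toy_line_t *lines, uint32_t *memory, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (lines[i].valid && lines[i].dirty) {
            memory[i] = lines[i].data;
            lines[i].dirty = false;
        }
    }
}

/* Invalidate ("flush"): clear the valid bit; the contents are discarded. */
static void cache_invalidate(toy_line_t *lines, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        lines[i].valid = false;
    }
}
```

In this model, calling `cache_invalidate` on a write-back cache without a preceding `cache_clean` would discard any data that exists only in dirty lines.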

Cache lock down
• Avoids the miss penalty
• Lockable at the granularity of a cache way
• Critical code or data
  - Interrupt vector table
  - ISRs
  - Extensively used algorithms
  - Intensively referenced variables
• Locked lines are immune from replacement by the cache controller
• If the program flushes the cache, the lockdown must be rerun to restore the locked contents
• CP15 register c9 (CRn = c9, CRm = c0) locks and unlocks cache ways

Cache performance vs. code organization
• Memory-mapped peripheral regions must reside in a noncached, nonbuffered region
• Place frequently accessed data sequentially
  - Exploits spatial locality
  - E.g. expanding (unrolling) a for loop
• Keep routines small
• Group related data close together
• Linked lists reduce program performance
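
The effect of data placement on line fills can be estimated with a small C model. This is a rough sketch: it assumes a cold cache with 32-byte lines, remembers only the most recently touched line, and the helper name `count_line_changes` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES 32u  /* assumption: 32-byte cache line */

/* Count how many times consecutive accesses move to a different cache line.
 * Each such change costs a line fill in a cold cache (toy model: it only
 * remembers the most recently touched line). */
static size_t count_line_changes(const uint32_t *addrs, size_t n)
{
    size_t fills = 0;
    uint32_t last_line = UINT32_MAX;  /* sentinel: no line touched yet */
    for (size_t i = 0; i < n; i++) {
        uint32_t line = addrs[i] / LINE_BYTES;
        if (line != last_line) {
            fills++;
            last_line = line;
        }
    }
    return fills;
}
```

Eight consecutive words starting at address 0 fall into a single line in this model, whereas eight list nodes scattered 64 bytes apart each land in a different line.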

ARM1156T2-S Cache Characteristics

• Set associativity: one-way for a 1 KB cache, two-way for 2 KB, four-way for all other cache sizes
• Cache line size: 32 bytes
• Supported cache way size
  - maximum is 16 KB
  - minimum is 1 KB
• Unique values of cache lines within a set
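
The way sizes above fix how many cache lines each way holds and how many address bits select a line. A minimal C sketch of that arithmetic follows; the helper names `lines_per_way` and `set_index_bits` are hypothetical, and the way size is assumed to be a power of two.

```c
#include <stdint.h>

#define LINE_BYTES 32u  /* cache line size from the slide */

/* Number of cache lines (sets) in one way of the given size. */
static uint32_t lines_per_way(uint32_t way_bytes)
{
    return way_bytes / LINE_BYTES;
}

/* Number of set-index address bits needed to select one of those lines.
 * Assumes way_bytes is a power of two and at least LINE_BYTES. */
static uint32_t set_index_bits(uint32_t way_bytes)
{
    uint32_t bits = 0;
    for (uint32_t n = lines_per_way(way_bytes); n > 1u; n >>= 1) {
        bits++;
    }
    return bits;
}
```

For the minimum 1 KB way this gives 32 lines and 5 index bits; for the maximum 16 KB way it gives 512 lines and 9 index bits.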
