
Unit 5.

Memory Systems

Basic Concepts
The two primary classifications of memory are:
1) Primary memory
2) Secondary memory

Primary memory is further classified into:
1) RAM (Random Access Memory)
2) ROM (Read-Only Memory)

RAM is further classified into:
a) Static RAM (SRAM)
b) Dynamic RAM (DRAM)

DRAM is further sub-classified into:
i) Synchronous DRAM
ii) Asynchronous DRAM

The maximum size of the memory in any computer is determined by the number of address lines provided by the processor used in the computer. For example, if a processor has 20 address lines, it is capable of addressing 2^20 = 1M (mega) memory locations. The maximum number of bits that can be transferred to or from memory depends on the data lines supported by the processor. From the system standpoint, the memory unit is viewed as a black box. Data transfers between the memory and the processor take place through two processor registers: MAR (memory address register) and MDR (memory data register). If MAR is k bits long and MDR is n bits long, then the memory unit may contain up to 2^k addressable locations. The bus also includes control lines Read/Write (R/W) and MFC (memory function completed) for coordinating data transfers.
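As a quick check of this arithmetic, the following small C program computes the addressable capacity (a minimal sketch; the 20 address lines and 16 data lines are just the example figures above):

    #include <stdio.h>

    int main(void) {
        unsigned k = 20;                      /* address lines (MAR width) */
        unsigned n = 16;                      /* data lines (MDR width)    */
        unsigned long locations = 1UL << k;   /* 2^k addressable locations */
        printf("%u address lines -> %lu locations (1M)\n", k, locations);
        printf("word length = %u bits\n", n);
        return 0;
    }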

[Figure omitted: the processor's MAR drives a k-bit address bus and its MDR an n-bit data bus to a memory of up to 2^k addressable locations with a word length of n bits; control lines R/W and MFC coordinate the transfers.]

Figure 1: Connection of the memory to the processor

The processor reads data from memory by loading the address of the required memory location into the MAR register and setting the R/W line to 1. The memory responds by placing the data from the addressed location onto the data lines and confirms this action by asserting the MFC signal. Upon receipt of MFC, the processor loads the data on the data lines into the MDR register. The processor writes data into a memory location by loading the address of this location into MAR and loading the data into MDR. It indicates that a write operation is involved by setting the R/W line to 0.
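This handshake can be sketched as a toy simulation in C. Everything here (the mem array and the mar/mdr/rw/mfc variables) is illustrative and merely stands in for hardware signals; it is not a real API:

    #include <stdio.h>

    static unsigned mem[1024];     /* toy memory, 2^10 locations */
    static unsigned mar, mdr;      /* address and data registers */
    static int rw, mfc;            /* R/W line and MFC signal    */

    static void memory_cycle(void) {        /* memory side of the bus  */
        if (rw) mdr = mem[mar];             /* read: drive data lines  */
        else    mem[mar] = mdr;             /* write: store MDR        */
        mfc = 1;                            /* assert MFC              */
    }

    unsigned proc_read(unsigned addr) {
        mar = addr; rw = 1; mfc = 0;   /* set R/W = 1 for a read       */
        memory_cycle();                /* memory responds, raises MFC  */
        return mfc ? mdr : 0;          /* on MFC, MDR holds the data   */
    }

    void proc_write(unsigned addr, unsigned data) {
        mar = addr; mdr = data; rw = 0; mfc = 0;  /* R/W = 0: write    */
        memory_cycle();
    }

    int main(void) {
        proc_write(42, 0xBEEF);
        printf("mem[42] = 0x%X\n", proc_read(42));
        return 0;
    }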

Random access memory


Random access memory (RAM) is the best known form of computer memory. RAM is considered "random access" because you can access any memory cell directly if you know the row and column that intersect at that cell; consequently, RAM data can be read or written in any order. RAM consists of memory cells, each representing a single bit of data (logic 1 or logic 0). Memory cells are etched onto a silicon wafer in an array of columns (bit lines) and rows (word lines). The intersection of a bit line and a word line constitutes the address of the memory cell.

RAM is available in many physical forms. Memory chips in desktop computers originally used a pin configuration called dual inline package (DIP). This arrangement was later replaced with memory modules, which consist of memory chips, along with all of the support components, on a separate printed circuit board (PCB) that can be plugged into a special connector (memory bank) on the motherboard. The type of board and connector used for RAM in desktop computers has evolved over the years. First there were proprietary memory modules, then came SIMMs (single in-line memory modules). Two different SIMM types have been widely used: a 30-pin connector version (8-bit bus) and a 72-pin connector version (wider bus, more address lines). As processors grew in speed and bandwidth capability, the industry adopted a new standard, the dual in-line memory module (DIMM). Many brands of notebook computers use the small outline dual in-line memory module (SODIMM). Nowadays memory chips are normally available to the general public only in the form of a card called a module.

Most memory available today is highly reliable, so most systems simply have the memory controller check for errors at start-up and rely on that. Memory chips with built-in error-checking typically use a method known as parity to check for errors. Parity chips have an extra bit for every 8 bits of data. The way parity works is simple; consider even parity. The parity bit is set so that the total number of 1s across the nine bits is even; when the data is read back, the count is checked again, and a mismatch reveals that a single-bit error has occurred. Computers in critical positions need a higher level of fault tolerance. High-end servers often have a form of error-checking known as error-correction code (ECC). The majority of computers sold today use nonparity memory chips. These chips do not provide any type of built-in error checking, but instead rely on the memory controller for error detection.
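A short C sketch of the even-parity scheme just described (the function name and test byte are only for illustration):

    #include <stdio.h>

    /* Even parity: the extra bit is chosen so that the total number
       of 1s in the 9 bits (8 data bits + parity bit) is even. */
    unsigned even_parity_bit(unsigned char byte) {
        unsigned ones = 0;
        for (int i = 0; i < 8; i++)
            ones += (byte >> i) & 1;   /* count the 1 bits          */
        return ones & 1;               /* 1 if the count is odd     */
    }

    int main(void) {
        unsigned char data = 0xB5;     /* 1011 0101: five 1s set    */
        printf("parity bit for 0x%02X = %u\n", data, even_parity_bit(data));
        return 0;
    }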

Dynamic RAM (DRAM)


Static RAMs are fast, but they come at a high cost because their cells require several transistors. Less expensive RAMs can be implemented if simpler cells are used. Such cells do not retain their state indefinitely; hence they are called dynamic RAMs (DRAMs). Dynamic random access memory (DRAM) is the most commonly used computer memory type. In DRAM, a transistor and a capacitor are paired to create a memory cell, and each memory cell represents a single bit of data. The capacitor holds the bit of information (a 0 or a 1) as a charge. The transistor acts as a switch that lets the control circuitry on the memory chip read the capacitor or change its state.

[Figure omitted: a transistor T connects the bit line to a capacitor C; the transistor's gate is driven by the word line.]

Figure 2: A single-transistor dynamic RAM (DRAM) cell

A capacitor is like a small bucket that is able to store electrons. To store a 1 in the memory cell, the bucket is filled with electrons. To store a 0, it is emptied. The problem with the capacitor's bucket is that it has a leak (a full bucket typically empties within a few milliseconds). Therefore, for dynamic memory to work, either the CPU or the memory controller has to come along and recharge all of the capacitors holding a logic 1 before they discharge. To do this, the memory controller reads the memory and then writes it right back. This refresh operation typically happens automatically thousands of times per second.

DRAM works by sending a charge through the appropriate column (CAS) to activate the transistor at each bit in the column. When writing, the row lines contain the state the capacitor should take on. When reading, the sense amplifier determines the level of charge in the capacitor: if it is more than 50 percent, it reads it as a 1; otherwise it reads it as a 0. A counter tracks the refresh sequence based on which rows have been accessed in what order. The time needed for all of this is expressed in nanoseconds (billionths of a second); a memory chip rating of 70 ns means that it takes 70 nanoseconds to completely read and recharge each cell.

The amount of time that RAM takes to write data, or to read it once the request has been received from the processor, is called the access time. Typical access times vary from 9 nanoseconds to 70 nanoseconds, depending on the kind of RAM. Although fewer nanoseconds is better, user-perceived performance depends on coordinating access times with the computer's clock cycles. Access time consists of latency and transfer time; latency is the time needed to coordinate signal timing and to refresh the data after reading it.

A typical DRAM memory access proceeds as follows. To read a memory cell, we place a row address on the address lines (all the address lines together are called the address bus), activate the Row Address Strobe (RAS) line, and wait about 15 ns while the holding circuitry latches the row address. Then we place the column address on the address bus and activate the Column Address Strobe (CAS) line. Finally, we wait for the level-checking circuitry to determine whether the location contains a 0 or a 1; this data appears as a high or low voltage on the data output pin.

DRAMs are classified into two categories: 1. Synchronous DRAMs 2. Asynchronous DRAMs
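The row/column multiplexing in this procedure amounts to splitting one address into two halves. A minimal sketch in C, assuming a hypothetical 4096 x 4096 cell array addressed by 24 bits (the sizes and the sample address are made up for illustration):

    #include <stdio.h>

    int main(void) {
        unsigned addr = 0xA5F3C2;             /* 24-bit memory address      */
        unsigned row  = (addr >> 12) & 0xFFF; /* high half, latched on RAS  */
        unsigned col  = addr & 0xFFF;         /* low half, latched on CAS   */
        printf("address 0x%06X -> row 0x%03X, column 0x%03X\n",
               addr, row, col);
        return 0;
    }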

Synchronous DRAMs
More recent developments in memory technology have resulted in DRAMs whose operation is directly synchronized with a clock signal. Such memories are known as synchronous DRAMs (SDRAMs). The figure below shows the structure of an SDRAM. The cell array is the same as in asynchronous DRAMs. The address and data connections are buffered by means of registers. The output of each sense amplifier is connected to a latch. SDRAMs have several different modes of operation, which can be selected by writing control information into a mode register.

[Figure omitted: a refresh counter and row/column address latches feed a row decoder and a column decoder around the cell array; read/write circuits and latches connect the array to data input and data output registers; a mode register and timing control block takes the clock, RAS, CAS, R/W, and CS signals.]

Figure 3: Synchronous DRAM

Asynchronous DRAMs: In these DRAMs, the timing of the memory device is controlled asynchronously. A specialized memory controller circuit provides the necessary control signals, RAS and CAS, that govern the timing. The processor must take into account the delay in the response of the memory. Such memories are referred to as asynchronous DRAMs.

Static RAM (SRAM)


SRAM consists of memory cells, each representing a single bit of data. In static RAM, a form of flip-flop holds each bit of memory. This kind of flip-flop will hold its state as long as it receives power or until the state is changed by a write signal to that memory cell. A flip-flop for a memory cell takes four to six transistors along with some wiring, which is much more than what a DRAM cell needs. Therefore you get less memory per chip, and that makes static RAM a lot more expensive.

[Figure omitted: two cross-coupled transistors T1 and T2 connect the cell to complementary bit lines b and b'; the word line selects the cell.]

Figure 4: A static RAM cell

There are many variations of SRAM in use. Here are some variations used inside computers:

ASRAM: Async SRAM has been with us since the days of the 386 and is still in place in the L2 cache of many PCs. It is called asynchronous because it is not in sync with the system clock, and therefore the CPU must wait for data requested from the L2 cache. However, the wait is not as long as it is with DRAM.

BSRAM: Burst SRAM (also known as SyncBurst SRAM) is synchronized with the system clock or, in some cases, the cache bus clock. This allows it to be more easily synchronized with any device that accesses it and reduces access waiting time. It is used as the external level-2 cache memory for the Pentium II microprocessor chipset.

PB SRAM: Using burst technology, SRAM requests can be pipelined, that is, collected so that requests within the burst are executed on a nearly instantaneous basis. PB SRAM uses pipelining, and while it lags slightly behind system synchronization speeds, it is an improvement over synchronous SRAM because it is designed to work well with bus speeds of 75 MHz and higher.

Static RAM is typically fast and expensive, so it is typically used to create the CPU's speed-sensitive cache. In addition, SRAM is sometimes used to store data "semi-permanently", so that when the system is not powered up, the data in the SRAM chip is retained with the help of a small backup battery that provides operating power to the memory while the rest of the system is not operating (there are special SRAM ICs that consume very little power when they are not accessed, which makes them suitable for battery-backed applications).

Non-volatile memory: ROM


Non-volatile memory keeps its stored contents even when it is powered down. Read-only memory (ROM) is an integrated circuit programmed with specific data when it is manufactured.

[Figure omitted: a transistor T at the word line/bit line intersection, with point P connected to store a 0 and not connected to store a 1.]

Figure 5: A ROM cell

ROM chips contain a grid of columns and rows. But where the columns and rows intersect, there is a diode to connect the lines if the value is 1. If the value is 0, the lines are not connected at all.

PROM
Programmable read-only memory (PROM) is an integrated non-volatile memory circuit that is manufactured empty and can later be programmed with specific data. The programming can be done only once; after programming, the data is permanently stored in the IC. Blank PROM chips can be bought inexpensively and coded by anyone with a special tool called a programmer. PROM chips have a grid of columns and rows just as ordinary ROMs do. The difference is that every intersection of a column and row in a PROM chip has a fuse connecting them. A charge sent through a column will pass through the fuse in a cell to a grounded row, indicating a value of 1. Since all the cells have a fuse, the initial (blank) state of a PROM chip is all 1s. To change the value of a cell to 0, you use the programmer to send a specific amount of current to the cell. The higher voltage breaks the connection between the column and row by burning out the fuse. This process is known as burning the PROM.

EPROM
Erasable programmable read-only memory (EPROM) chips work like PROM chips, but they can be rewritten many times. An EPROM is constructed with a grid of columns and rows, and the cell at each intersection has two transistors, separated from each other by a thin oxide layer. One of the transistors is known as the floating gate and the other as the control gate. The floating gate's only link to the row (word line) is through the control gate. As long as this link is in place, the cell has a value of 1. Changing the value to 0 requires altering the placement of electrons in the floating gate: an electrical charge, usually 10 to 13 volts, is applied to the floating gate, charging it and thus turning the bit to 0. A blank EPROM has all of the gates fully open, giving each cell a value of 1; programming changes the desired cells to 0. To rewrite an EPROM, you must erase it first. Erasing an EPROM requires a special tool that emits a certain frequency of ultraviolet (UV) light (253.7 nm wavelength). An EPROM eraser is not selective; it erases the entire EPROM. Erasing typically takes several minutes (be careful with the erasing time, because over-erasing can damage the IC). EPROMs are configured using an EPROM programmer that provides voltage at specified levels depending on the type of EPROM used.

EEPROM
Electrically erasable programmable read-only memory (EEPROM) chips can be electrically programmed and erased. EEPROMs are typically changed one byte at a time, and erasing an EEPROM typically takes quite a long time. The drawback of EEPROMs is their speed: EEPROM chips are too slow to use in many products that make quick changes to the data stored on the chip. Typically, EEPROMs are found in electronic devices for storing small amounts of non-volatile data in applications where speed is not the most important concern. Small EEPROMs with serial interfaces are commonly found in many electronic devices.

Flash
Flash memory is a type of EEPROM that uses in-circuit wiring to erase, by applying an electrical field to the entire chip or to predetermined sections of the chip called blocks. Flash memory works much faster than traditional EEPROMs because it writes data in chunks, usually 512 bytes in size, instead of 1 byte at a time. Flash memory has many applications; the PC BIOS chip might be the most common form of flash memory. Removable solid-state storage devices are becoming increasingly popular. SmartMedia and CompactFlash cards are both well known, especially as "electronic film" for digital cameras. Other removable flash memory products include Sony's Memory Stick, PCMCIA memory cards, and memory cards for video game systems.

Speed, Size and Cost


Ideally, computer memory should be fast, large, and inexpensive. Unfortunately, it is impossible to meet all three requirements simultaneously: increased speed and size are achieved at increased cost. Very fast memory systems can be built with SRAM chips, but these chips are expensive, and for cost reasons it is impracticable to build a large main memory using SRAM. The alternative is to use DRAM chips for large main memories. The processor fetches code and data from the main memory to execute the program, but the DRAMs which form the main memory are slower devices, so it is necessary to insert wait states in memory read/write cycles, which reduces the speed of execution. The solution to this problem is to add a small section of SRAM to the memory system alongside the main memory, referred to as cache memory. The program to be executed is loaded in the main memory, but the parts of the program and data currently being accessed are kept in the cache. The cache controller looks after this swapping between main memory and cache memory with the help of the DMA controller; such cache memory is called secondary cache. Recent processors have built-in cache memory, called primary cache. Even so, memory size is still small compared to the demands of large programs with voluminous data. A solution is provided by using secondary storage, mainly magnetic disks and magnetic tapes, to implement large memory spaces at reasonable prices. An efficient computer system cannot rely on a single memory component; instead it employs a memory hierarchy that uses all the different types of memory units. A typical memory hierarchy is illustrated in the figure below:
[Figure omitted: the hierarchy runs from the CPU through the primary cache, secondary cache, main memory, and secondary memory; size increases going down the hierarchy, while speed and cost per bit increase going up.]

Figure 6: Memory hierarchy

Cache Memories Mapping Functions


First-generation processors of each technology (those designed with vacuum tubes around 1950, with integrated circuits around 1965, or as microprocessors around 1980) were generally about the same speed as main memory. On such processors, the naive model of a processor connected directly to memory was perfectly reasonable. By 1970, however, transistorized supercomputers were being built whose central processors were significantly faster than their main memories, and by 1980 the difference had increased, although it took several more decades for the performance gap to reach today's extreme.

The solution to this problem is to use what is called a cache memory between the central processor and the main memory. Cache memory takes advantage of the fact that, with any of the memory technologies available for the past half century, we have had a choice between building large but slow memories or small but fast memories. This was known as far back as 1946, when Burks, Goldstine, and von Neumann proposed the use of a memory hierarchy, with a few fast registers in the central processor at the top of the hierarchy, a large main memory in the middle, and a library of archival data, stored off-line, at the very bottom. A cache memory sits between the central processor and the main memory. During any particular memory cycle, the cache checks the memory address being issued by the processor. If this address matches the address of one of the few memory locations held in the cache, the cache handles the memory cycle very quickly; this is called a cache hit. If the address does not match, then the memory cycle must be satisfied far more slowly by the main memory; this is called a cache miss.

Figure 7: Adding a cache to the naive view
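The benefit of a high hit rate can be made concrete with the usual effective access time formula: average access time = hit time + miss rate * miss penalty. The numbers below are illustrative, not taken from the text:

    #include <stdio.h>

    int main(void) {
        double hit_time = 5.0;     /* ns, cache hit                  */
        double mem_time = 100.0;   /* ns, main memory (miss penalty) */
        double hit_rate = 0.95;    /* 95% of accesses hit the cache  */
        double avg = hit_time + (1.0 - hit_rate) * mem_time;
        printf("average access time = %.1f ns\n", avg);  /* 10.0 ns */
        return 0;
    }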

The correspondence between the main memory and the cache is specified by a mapping function. When the cache is full and a memory word that is not in the cache is referenced, the cache control hardware must decide which block should be removed to create space for the new block; the rule for making this decision constitutes the replacement algorithm.

Mapping Functions
There are three main mapping techniques that decide the cache organization:
1. Direct-mapping technique
2. Associative-mapping technique
3. Set-associative mapping technique

To discuss possible methods for specifying where memory blocks are placed in the cache, we use a specific small example: a cache consisting of 128 blocks of 16 words each, for a total of 2048 (2K) words, and a main memory addressable by a 16-bit address. The main memory has 64K words, which we view as 4K blocks of 16 words each; consecutive addresses refer to consecutive words.

Direct Mapping Technique


In this technique, the 16-bit memory address is divided into three fields, as shown below. The low-order 4 bits select one of the 16 words in a block and constitute the word field. The second field, known as the block field, is 7 bits long; when a new block enters the cache, this 7-bit cache block field determines the cache position in which the block must be stored. The third field is the tag field, which holds the high-order 5 bits of the memory address of the block and identifies which of the 32 main memory blocks that map to a given cache position is currently resident there.

[Tag (5 bits) | Block (7 bits) | Word (4 bits)]

Figure 8: Main memory address

Direct mapping is the simplest technique: each block from the main memory has only one possible location in the cache. Block i of the main memory maps onto block i modulo 128 of the cache. Therefore, whenever one of the main memory blocks 0, 128, 256, ... is loaded into the cache, it is stored in cache block 0; blocks 1, 129, 257, ... are stored in block 1 of the cache, and so on.

[Figure omitted: main memory blocks 0-127 map to the correspondingly numbered cache blocks 0-127, as do blocks 128-255 and every further group of 128; each cache block stores a tag identifying which group its current block came from.]

Figure 9: Direct-mapped cache
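Extracting the three fields is simple bit manipulation. A sketch in C for the example cache above (16-bit address, 4-bit word field, 7-bit block field, 5-bit tag; the sample address is arbitrary):

    #include <stdio.h>

    int main(void) {
        unsigned addr  = 0xA7C3;              /* 16-bit main memory address */
        unsigned word  = addr & 0xF;          /* bits 3..0  : word in block */
        unsigned block = (addr >> 4) & 0x7F;  /* bits 10..4 : cache block   */
        unsigned tag   = (addr >> 11) & 0x1F; /* bits 15..11: tag           */
        printf("addr 0x%04X -> tag %u, block %u, word %u\n",
               addr, tag, block, word);
        return 0;
    }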

Associative Mapping Technique


The figure below shows associative mapping, in which a main memory block can be placed in any cache block position. In this case, 12 tag bits are required to identify a memory block when it is resident in the cache. The tag bits of an address received from the processor are compared to the tag bits of each block of the cache to see if the desired block is present. This is called the associative-mapping technique; it gives complete freedom in choosing the cache location in which to place the memory block.

[Figure omitted: any of the 4096 main memory blocks may occupy any of the 128 cache blocks; the address is split into a 12-bit tag and a 4-bit word field, and each cache block's tag is compared against the address tag.]

Figure 10: Associative-mapped cache
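For the fully associative case, the same 16-bit address splits into only two fields, a 12-bit tag and a 4-bit word field, as a quick C sketch shows (sample address arbitrary):

    #include <stdio.h>

    int main(void) {
        unsigned addr = 0xA7C3;
        unsigned word = addr & 0xF;           /* bits 3..0        */
        unsigned tag  = (addr >> 4) & 0xFFF;  /* bits 15..4 : tag */
        printf("addr 0x%04X -> tag 0x%03X, word %u\n", addr, tag, word);
        return 0;
    }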

Set-Associative Mapping

A combination of the direct- and associative-mapping techniques can also be used. Blocks of the cache are grouped into sets, and the mapping allows a block of main memory to reside in any block of a specific set. In our example, with 64 sets of two blocks each, memory blocks 0, 64, 128, ..., 4032 map into cache set 0, and they can occupy either of the two block positions within this set. Since the cache might contain the desired block, the tag field of the address must be associatively compared to the tags of the two blocks of the set to check if the desired block is present; this two-way associative search is simple to implement.
[Figure omitted: the cache's 128 blocks are grouped into 64 two-block sets; main memory blocks whose numbers are equal modulo 64 map to the same set. The 16-bit main memory address splits into a 6-bit tag, a 6-bit set field, and a 4-bit word field.]

Figure 11: Set-associative mapped cache
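With 64 two-block sets, the address splits into a 6-bit tag, a 6-bit set field, and a 4-bit word field, matching Figure 11. A C sketch of the split (sample address arbitrary):

    #include <stdio.h>

    int main(void) {
        unsigned addr = 0xA7C3;
        unsigned word = addr & 0xF;           /* bits 3..0   : word */
        unsigned set  = (addr >> 4) & 0x3F;   /* bits 9..4   : set  */
        unsigned tag  = (addr >> 10) & 0x3F;  /* bits 15..10 : tag  */
        printf("addr 0x%04X -> tag %u, set %u, word %u\n",
               addr, tag, set, word);
        return 0;
    }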

Replacement Algorithms

In a direct-mapped cache, the position of each block is fixed, hence no replacement strategy is needed. In associative and set-associative caches, when a new block is to be brought into the cache and all the positions that it may occupy are full, the cache controller must decide which of the old blocks to overwrite. This is an important issue because the decision can be a factor in system performance. The objective is to keep in the cache the blocks that are likely to be referenced in the near future. It is not easy to determine which blocks are about to be referenced, but the property of locality of reference gives a clue to a reasonable strategy: when a block is to be overwritten, it is sensible to overwrite the one that has gone the longest time without being referenced. This block is called the least recently used (LRU) block, and the technique is called the LRU replacement algorithm. The LRU algorithm works well for many access patterns, but it can lead to poor performance in some cases. For example, it produces disappointing results when accesses are made to sequential elements of an array that is slightly too large to fit into the cache. The performance of the LRU algorithm can be improved by introducing a small amount of randomness in deciding which block to replace.
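One common way hardware tracks the LRU block is with a small age counter per block. The following C sketch models a single 4-way set; the structure, names, and tag stream are made up for illustration and do not describe any particular cache controller:

    #include <stdio.h>

    #define WAYS 4

    static unsigned tags[WAYS] = {0};
    static unsigned age[WAYS]  = {0, 1, 2, 3};  /* 0 = most recent */

    static void access_block(unsigned tag) {
        int hit = -1, victim = 0;
        for (int i = 0; i < WAYS; i++) {
            if (tags[i] == tag) hit = i;
            if (age[i] > age[victim]) victim = i;  /* oldest block  */
        }
        int target = (hit >= 0) ? hit : victim;
        if (hit < 0) {
            tags[target] = tag;                    /* replace LRU   */
            printf("miss: replacing way %d with tag %u\n", victim, tag);
        }
        for (int i = 0; i < WAYS; i++)   /* blocks used more recently */
            if (age[i] < age[target])    /* than the target get older */
                age[i]++;
        age[target] = 0;                 /* target is now most recent */
    }

    int main(void) {
        unsigned refs[] = {5, 6, 5, 7, 8, 9};  /* sample tag stream  */
        for (int i = 0; i < 6; i++) access_block(refs[i]);
        return 0;
    }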

Virtual Memory
A cache stores a subset of the address space of RAM. An address space is the set of valid addresses; thus, for each address in the cache, there is a corresponding address in RAM. This subset of addresses (and the corresponding copy of the data) changes over time, based on the behavior of your program. The cache keeps the most commonly used sections of RAM where they can be accessed quickly. This is necessary because CPU speeds increase much faster than memory access speeds. If we could access RAM at 3 GHz, there wouldn't be any need for cache, because RAM could keep up; because it can't keep up, we use cache. One way to extend the amount of memory accessible to a program is to use disk: for example, we might use 10 megabytes of disk space while, at any time, only 1 megabyte resides in RAM. In effect, RAM acts like a cache for disk. This idea of extending memory is called virtual memory. It's called "virtual" only because it's not RAM; it doesn't mean it's fake. The real problem with disk is that it's really, really slow to access. If registers can be accessed in 1 nanosecond, cache in 5 ns, and RAM in about 100 ns, then disk is accessed in fractions of seconds. It can be a million times slower to access disk than a register.

The advantage of disk is it's easy to get lots of disk space for a small cost. Still, because disk is so slow to access, we want to avoid accessing disk unnecessarily.

Uses of Virtual Memory


Virtual memory is an old concept. Before computers had cache, they had virtual memory. For a long time, virtual memory appeared only on mainframes; personal computers in the 1980s did not use it. In fact, many good ideas that were in common use in UNIX operating systems, such as preemptive multitasking and virtual memory, didn't appear in personal computer operating systems until the mid 1990s. Initially, virtual memory meant the idea of using disk to extend RAM. Programs wouldn't have to care whether the memory was "real" memory (i.e., RAM) or disk; the operating system and hardware would figure that out. Later on, virtual memory was also used as a means of memory protection. Every program uses a range of addresses called its address space. The assumption of operating systems developers is that no user program can be trusted: user programs will try to destroy themselves, other user programs, and the operating system itself. That seems like a very negative view, but it is how operating systems are designed, and programs need not be deliberately malicious; they can be accidentally malicious (for example, by writing through a pointer that points to garbage memory). Virtual memory can help there too: it can help prevent programs from interfering with other programs. Occasionally you want programs to cooperate and share memory, and virtual memory can also help in that respect.

How Virtual Memory Works?


When a computer is running, many programs are simultaneously sharing the CPU. Each running program, plus the data structures needed to manage it, is called a process. Each process is allocated an address space: a set of valid addresses that can be used. This address space can be changed dynamically; for example, the program might request additional memory (through dynamic memory allocation) from the operating system. If a process tries to access an address that is not part of its address space, an error occurs and the operating system takes over, usually killing the process (core dumps, etc.). How does virtual memory play a role? As a program runs, it generates addresses. Addresses are generated (on RISC machines) in one of three ways:

A load instruction
A store instruction
Fetching an instruction

Loads and stores create data addresses, while fetching an instruction creates instruction addresses. Of course, RAM doesn't distinguish between the two kinds of addresses; it just sees an address. Each address generated by a program is considered virtual and must be translated to a real physical address; thus, address translation is occurring all the time. As you might imagine, this must be handled in hardware if it's to be done efficiently. You might think translating each address from virtual to physical is a crazy idea because of how slow it would be, but address translation is what gives you memory protection, so it's worth the hardware needed.
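Conceptually, the translation splits a virtual address into a page number and an offset and looks the page number up in a page table. A toy C sketch, assuming 4 KB pages and an invented page table (real hardware performs this lookup in an MMU with a TLB):

    #include <stdio.h>

    #define PAGE_BITS 12
    #define PAGE_SIZE (1u << PAGE_BITS)          /* 4 KB pages */

    static unsigned page_table[16] = { 7, 3, 11, 2 };  /* vpage -> frame */

    unsigned translate(unsigned vaddr) {
        unsigned vpage  = vaddr >> PAGE_BITS;        /* virtual page no. */
        unsigned offset = vaddr & (PAGE_SIZE - 1);   /* within-page part */
        return (page_table[vpage] << PAGE_BITS) | offset;
    }

    int main(void) {
        unsigned va = 0x2ABC;   /* page 2, offset 0xABC */
        printf("virtual 0x%04X -> physical 0x%05X\n", va, translate(va));
        return 0;
    }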

Secondary Storage
Electronic data is a sequence of bits. This data can reside in either:

primary storage - main memory (RAM); relatively small, fast access, expensive (cost per MB), volatile (contents go away when power goes off)
secondary storage - disks, tape; large amounts of data, slower access, cheap (cost per MB), persistent (contents remain even when power is off)

We will focus on secondary storage, since the collections of data in databases are usually both too large to fit in primary storage and required to be persistent.

Hard Disks
Features

spinning platter of special material
mechanical arm with a read/write head that must be close to the platter to read/write data
data is stored magnetically (if you'd like to keep your data, it is usually best to avoid using powerful magnets near your hard disk)
sometimes the mechanical arm digs into the platter, resulting in a very bad crash and subsequent loss of data on part of your hard disk
storage capacity is commonly between 2 GB and 11 GB
disks are random access, meaning data can be read/written anywhere on the disk
to read a piece of data, the mechanical arm must be repositioned over the place on the platter where that data is stored; this is called the disk seek, and 8 to 15 milliseconds is a common seek time
once the arm has been positioned, the data transfer rate varies, but is commonly between 1 MB and 10 MB a second
a 5 GB hard disk will cost anywhere from $300 to $1500; there are many options and vendors
SCSI (Small Computer System Interface): special hardware to improve throughput, with transfer rates of 100s of MB per second
solid-state hard disks, with no mechanical parts, are starting to become commercially available; they are generally faster and more expensive
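Seek time and transfer rate combine into the total cost of a read. A back-of-the-envelope C sketch using figures in the ranges quoted above (all numbers illustrative):

    #include <stdio.h>

    int main(void) {
        double seek_ms   = 10.0;            /* one disk seek           */
        double rate_mb_s = 5.0;             /* sustained transfer rate */
        double size_mb   = 64.0 / 1024.0;   /* a 64 KB read            */
        double transfer_ms = size_mb / rate_mb_s * 1000.0;
        printf("total ~ %.1f ms (%.1f seek + %.1f transfer)\n",
               seek_ms + transfer_ms, seek_ms, transfer_ms);
        return 0;
    }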

Diskette or Floppy Disk


spinning platter of special material
information stored magnetically
read/write head positioned by a mechanical arm
storage capacity of a few MBs
random access
seek time from 10 to 40 milliseconds
easily portable

Removable Hard Disk


like a hard disk, but designed to permit the disk and/or disk drive to be removed and slotted into another machine within seconds
more expensive than a hard disk
less reliable

Optical Disks

CD-ROM - read only (books, software releases)
WORM - write once, read many (archival storage)
laser encoding, not magnetic
30-50 ms seek times
640 MB - 17 GB storage capacity
cheaper than hard disks per MB of storage capacity, but slower
portable

Jukeboxes of optical disks are becoming popular for storing really, really large collections of data. The Mercury-20 jukebox (no, I'm not selling these, just using it as a typical example) provides access to up to 150 CD-ROMs, or in other words 94 GB of storage capacity. The Mercury jukebox takes a maximum of four seconds to exchange and load a disc into a drive, 2.5 seconds to spin up and access the data, and 10 seconds to transfer a 6.0 MB file to the computer or server.
