Memory Systems
Basic Concepts
Memory falls into two primary classes: 1) primary memory and 2) secondary memory. Primary memory is divided into 1) RAM (Random Access Memory) and 2) ROM (Read Only Memory). RAM is further classified into a) static RAM (SRAM) and b) dynamic RAM (DRAM). DRAM is in turn subclassified into i) synchronous DRAM and ii) asynchronous DRAM.
The maximum size of the memory in any computer is determined by the number of address lines provided by the processor used in the computer. For example, if the processor has 20 address lines, it is capable of addressing 2^20 = 1M (mega) memory locations. The maximum number of bits that can be transferred to or from the memory in one operation depends on the number of data lines supported by the processor. From the system standpoint, the memory unit is viewed as a black box. Data transfer between the memory and the processor takes place through two processor registers, MAR (memory address register) and MDR (memory data register). If MAR is k bits long and MDR is n bits long, then the memory unit may contain up to 2^k addressable locations, and n bits of data are transferred in each memory cycle. The bus also includes the control lines Read/Write (R/W) and MFC (memory function completed) for coordinating data transfers.
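The relationship between address lines and memory size can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names are made up for illustration, and the capacity calculation assumes that each location is exactly as wide as the data bus.

```python
def addressable_locations(address_lines: int) -> int:
    """Number of distinct locations that k address lines can select (2^k)."""
    return 2 ** address_lines


def max_capacity_bits(address_lines: int, data_lines: int) -> int:
    """Upper bound on memory size in bits.

    Assumption: every location is data_lines bits wide.
    """
    return addressable_locations(address_lines) * data_lines


if __name__ == "__main__":
    # The 20-address-line example from the text: 2^20 locations (1M).
    print(addressable_locations(20))
    # A hypothetical processor with 16 address lines and 8 data lines.
    print(max_capacity_bits(16, 8))
```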
The processor reads data from memory by loading the address of the required memory location into the MAR register and setting the R/W line to 1. The memory responds by placing the data from the addressed location onto the data lines, and confirms this action by asserting the MFC signal. Upon receipt of the MFC signal, the processor loads the data on the data lines into the MDR register. The processor writes data into a memory location by loading the address of this location into MAR and loading the data into MDR. It indicates that a write operation is involved by setting the R/W line to 0.
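The read and write sequences can be mimicked in a small simulation. The sketch below is a toy model, not a description of any real bus: the class and method names are hypothetical, the MFC signal is modeled as a boolean return value, and timing is ignored entirely.

```python
class MemoryUnit:
    """Black-box memory with 2^k locations of n bits each."""

    def __init__(self, k: int, n: int):
        self.word_mask = (1 << n) - 1
        self.cells = [0] * (2 ** k)

    def cycle(self, address, read_write, data_in=None):
        """Perform one transfer. read_write=1 is a read, 0 is a write.

        Returns (data_out, mfc), where mfc stands in for the MFC signal.
        """
        if read_write == 1:
            return self.cells[address], True
        self.cells[address] = data_in & self.word_mask
        return None, True


class Processor:
    """Holds the MAR and MDR registers used for every memory transfer."""

    def __init__(self, memory: MemoryUnit):
        self.memory = memory
        self.mar = 0
        self.mdr = 0

    def read(self, address: int) -> int:
        self.mar = address                                      # load address into MAR
        data, mfc = self.memory.cycle(self.mar, read_write=1)   # R/W = 1
        if mfc:                                                 # wait for MFC
            self.mdr = data                                     # latch data lines into MDR
        return self.mdr

    def write(self, address: int, value: int) -> None:
        self.mar = address                                      # load address into MAR
        self.mdr = value                                        # load data into MDR
        self.memory.cycle(self.mar, read_write=0, data_in=self.mdr)  # R/W = 0


if __name__ == "__main__":
    cpu = Processor(MemoryUnit(k=4, n=8))
    cpu.write(3, 0xAB)
    print(hex(cpu.read(3)))
```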
modules, then came SIMMs (single in-line memory modules). Two different SIMM types have been widely used: a 30-pin connector version (8-bit bus) and a 72-pin connector version (wider bus, more address lines). As processors grew in speed and bandwidth capability, the industry adopted a new standard, the dual in-line memory module (DIMM). Many brands of notebook computers use the small outline dual in-line memory module (SODIMM). Nowadays memory chips are normally available to the general public only in the form of a card called a module.

Most memory available today is highly reliable. Most systems simply have the memory controller check for errors at start-up and rely on that. Memory chips with built-in error checking typically use a method known as parity to check for errors. Parity chips have an extra bit for every 8 bits of data. The way parity works is simple. Let's look at even parity first: the extra bit is set so that the total number of 1s in the nine bits is even, and if the count is odd when the data is read back, an error has occurred.

Computers in critical positions need a higher level of fault tolerance. High-end servers often have a form of error checking known as error-correction code (ECC). The majority of computers sold today use nonparity memory chips. These chips do not provide any type of built-in error checking, but instead rely on the memory controller for error detection.
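As an illustration of the parity scheme mentioned above, the following sketch computes and checks an even parity bit for one byte. Under even parity, the extra bit is chosen so that the nine bits together contain an even number of 1s. The function names are hypothetical; real parity memory does this in hardware.

```python
def even_parity_bit(byte: int) -> int:
    """Return the parity bit that makes the total number of 1s even."""
    ones = bin(byte & 0xFF).count("1")
    return ones % 2


def check_even_parity(byte: int, parity_bit: int) -> bool:
    """Return True when the stored byte plus parity bit look consistent."""
    ones = bin(byte & 0xFF).count("1") + parity_bit
    return ones % 2 == 0


if __name__ == "__main__":
    data = 0b10110000                 # three 1s, so the parity bit is 1
    parity = even_parity_bit(data)
    print(check_even_parity(data, parity))               # stored value is intact
    print(check_even_parity(data ^ 0b00000001, parity))  # one flipped bit is caught
    print(check_even_parity(data ^ 0b00000011, parity))  # two flipped bits go unnoticed
```

The last call shows the limitation of simple parity: it detects an odd number of flipped bits but cannot detect an even number, and it cannot correct anything. That is the gap ECC memory is meant to close.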
Figure 2: A single-transistor dynamic RAM (DRAM) cell
A capacitor is like a small bucket that is able to store electrons. To store a 1 in the memory cell, the bucket is filled with electrons. To store a 0, it is emptied. The problem with the capacitor's bucket is that it leaks: a full bucket usually becomes empty within a few milliseconds. Therefore, for dynamic memory to work, either the CPU or the memory controller has to come along and recharge all of the capacitors holding a logic 1 before they discharge. To do this, the memory controller reads the memory and then writes it right back. This refresh operation typically happens automatically thousands of times per second.

DRAM works by sending a charge through the appropriate column (CAS) to activate the transistor at each bit in the column. When writing, the row lines contain the state the capacitor should take on. When reading, the sense amplifier determines the level of charge in the capacitor. If it is more than 50 percent, it reads it as a 1; otherwise it reads it as a 0. The counter tracks the refresh sequence based on which rows have been accessed in what order. The length of time necessary to do all this is expressed in nanoseconds (billionths of a second). A memory chip rating of 70ns means that it takes 70 nanoseconds to completely read and recharge each cell.

The amount of time that RAM takes to write data, or to read it once the request has been received from the processor, is called the access time. Typical access times vary from 9 nanoseconds to 70 nanoseconds, depending on the kind of RAM. Although a shorter access time is better, user-perceived performance depends on coordinating access times with the computer's clock cycles. Access time consists of latency and transfer time. Latency is the time to coordinate signal timing and refresh data after reading it.
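The leaky-bucket picture and the need for refresh can be illustrated with a toy model. This is a sketch under loud assumptions: the leak is modeled as linear with a made-up rate (real capacitors do not discharge linearly), charge is normalized to the range 0 to 1, and the class name is hypothetical. Only the 50 percent sensing threshold and the read-then-write-back refresh come from the text.

```python
class DramCell:
    """Toy model of one DRAM cell: a leaky capacitor plus a sense amplifier."""

    def __init__(self, leak_per_ms: float = 0.2):
        self.charge = 0.0               # normalized: 1.0 is a full bucket
        self.leak_per_ms = leak_per_ms  # hypothetical leak rate

    def write(self, bit: int) -> None:
        self.charge = 1.0 if bit else 0.0

    def elapse(self, ms: float) -> None:
        """Let the capacitor leak for the given number of milliseconds."""
        self.charge = max(0.0, self.charge - self.leak_per_ms * ms)

    def read(self) -> int:
        """Sense the charge (threshold 50 percent) and write the value back."""
        bit = 1 if self.charge > 0.5 else 0
        self.write(bit)                 # the write-back is what refreshes the cell
        return bit


if __name__ == "__main__":
    refreshed = DramCell()
    refreshed.write(1)
    for _ in range(5):
        refreshed.elapse(2)             # refreshed every 2 ms
        refreshed.read()
    print(refreshed.read())             # the 1 survives

    neglected = DramCell()
    neglected.write(1)
    neglected.elapse(3)                 # left alone too long
    print(neglected.read())             # the 1 has decayed to 0
```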
A typical DRAM memory access procedure is the following. To read a memory cell, we place a row address on the address bus lines (all the address lines together are called an address bus), activate the Row Address Strobe (RAS) line, and wait for 15ns while the holding circuitry latches the row address. Then we place the column address on the address bus and activate the Column Address Strobe (CAS) line. Now we have to wait for the level-checking circuitry to determine whether the location contains a 0 or a 1. This data will appear as a high or low voltage on the data output pin.

DRAMs are classified into two categories: 1. Synchronous DRAMs 2. Asynchronous DRAMs
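The two-step addressing described above (row first, then column, over the same address lines) can be sketched as follows. The split of a flat address into high-order row bits and low-order column bits is an assumption made for illustration, as are the function names; the waiting times are not modeled.

```python
def split_address(address: int, row_bits: int, col_bits: int):
    """Split a flat cell address into (row, column).

    Assumption: the high-order bits select the row and the low-order
    bits select the column.
    """
    col = address & ((1 << col_bits) - 1)
    row = (address >> col_bits) & ((1 << row_bits) - 1)
    return row, col


def read_cell(cell_array, address: int, row_bits: int, col_bits: int) -> int:
    """Read one bit using the RAS-then-CAS sequence."""
    row, col = split_address(address, row_bits, col_bits)
    latched_row = cell_array[row]   # RAS phase: the row address is latched
    return latched_row[col]         # CAS phase: the column picks one bit


if __name__ == "__main__":
    # A hypothetical 16 x 16 cell array holding a single 1.
    array = [[0] * 16 for _ in range(16)]
    array[11][6] = 1
    print(split_address(0b10110110, 4, 4))
    print(read_cell(array, 0b10110110, 4, 4))
```

Sending the address in two halves is what lets a DRAM chip get by with roughly half as many address pins as a flat address would need.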
Synchronous DRAMs
More recent developments in memory technology have resulted in DRAMs whose operation is directly synchronized with a clock signal. Such memories are known as synchronous DRAMs (SDRAMs). The figure below shows the structure of an SDRAM. The cell array is the same as in asynchronous DRAMs. The address and data connections are buffered by means of registers. The output of each sense amplifier is connected to a latch. SDRAMs have several different modes of operation, which can be selected by writing control information into a mode register.
Figure 3: Synchronous DRAM (blocks shown: refresh counter, row decoder, cell array, row/column address, column address latch, column decoder, data)
Asynchronous DRAMs: In the DRAM described earlier, the timing of the memory device is controlled asynchronously. A specialized memory controller circuit provides the necessary control signals, RAS and CAS, that govern the timing. The processor must take into account the delay in the response of the memory. Such memories are referred to as asynchronous DRAMs.
There are many variations of SRAM in use. Here are some variations used inside computers:
ASRAM: Async SRAM has been with us since the days of the 386, and is still in place in the L2 cache of many PCs. It's called asynchronous because it's not in sync with the system clock, and therefore the CPU must wait for data requested from the L2 cache. However, the wait isn't as long as it is with DRAM.

BSRAM: Burst SRAM (also known as Sync Burst SRAM) is synchronized with the system clock or, in some cases, the cache bus clock. This allows it to be more easily synchronized with any device that accesses it and reduces access waiting time. It is used as the external level-2 cache memory for the Pentium II microprocessor chipset.

PB SRAM: Using burst technology, SRAM requests can be pipelined, or collected so that requests within the burst are executed on a nearly instantaneous basis. PB SRAM uses pipelining, and while it's slightly behind system synchronization speeds, it's a possible improvement over Sync SRAM because it's designed to work well with bus speeds of 75 MHz and higher.
Static RAM is typically fast and expensive, so it is typically used to build the CPU's speed-sensitive cache. In addition, SRAM is sometimes used to store data "semi-permanently": when the system is powered down, the data in the SRAM chip is retained with the help of a small backup battery that powers the memory while the rest of the system is not operating. There are special SRAM ICs that consume very little power when they are not accessed, which makes them suitable for battery-backed applications.
ROM chips contain a grid of columns and rows. But where the columns and rows intersect, there is a diode to connect the lines if the value is 1. If the value is 0, then the lines are not connected at all.
PROM
Programmable read-only memory (PROM) is an integrated non-volatile memory circuit that is manufactured blank and can be programmed later with specific data. The programming can be done only once. After programming, the data is permanently stored in the IC. Blank PROM chips can be bought inexpensively and coded by anyone with a special tool called a programmer. PROM chips have a grid of columns and rows just as ordinary ROMs do. The difference is that every intersection of a column and row in a PROM chip has a fuse connecting them. A charge sent through a column will pass through the fuse in a cell to a grounded row, indicating a value of 1. Since all the cells have a fuse, the initial (blank) state of a PROM chip is all 1s. To change the value of a cell to 0, you use a programmer to send a specific amount of current to the cell. The higher voltage breaks the connection between the column and row by burning out the fuse. This process is known as burning the PROM.
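One consequence of the fuse mechanism is that programming can only turn 1s into 0s, never the reverse. The sketch below models that with a bitwise AND. The class is a hypothetical illustration (8-bit words are an assumption), not a model of any particular device or programmer tool.

```python
class Prom:
    """Toy PROM: every bit starts as 1 (fuse intact) and can only be cleared."""

    def __init__(self, words: int, width: int = 8):
        self.width = width
        self.cells = [(1 << width) - 1] * words   # blank chip is all 1s

    def burn(self, address: int, value: int) -> None:
        """Blow the fuses needed to store value.

        A blown fuse cannot be restored, so the result is the AND of the
        old contents and the requested value.
        """
        self.cells[address] &= value & ((1 << self.width) - 1)

    def read(self, address: int) -> int:
        return self.cells[address]


if __name__ == "__main__":
    chip = Prom(words=4)
    print(hex(chip.read(0)))   # blank cell
    chip.burn(0, 0xA5)
    print(hex(chip.read(0)))   # programmed cell
    chip.burn(0, 0xFF)         # trying to set the bits back to 1 has no effect
    print(hex(chip.read(0)))
```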
EPROM
Erasable programmable read-only memory (EPROM) chips work like PROM chips, but they can be rewritten many times. An EPROM is constructed as a grid of columns and rows. In an EPROM, the cell at each intersection has two transistors, separated from each other by a thin oxide layer. One of the transistors is known as the floating gate and the other as the control gate. The floating gate's only link to the row (wordline) is through the control gate. As long as this link is in place, the cell has a value of 1. Changing the value to 0 requires altering the placement of electrons in the floating gate. An electrical charge, usually 10 to 13 volts, is applied to the floating gate to charge it and thus turn the bit to 0. A blank EPROM has all of the gates fully open, giving each cell a value of 1. Programming changes the desired cells to 0.

To rewrite an EPROM, you must erase it first. Erasing an EPROM requires a special tool that emits ultraviolet (UV) light of a specific wavelength (253.7 nm). An EPROM eraser is not selective; it erases the entire EPROM. Erasing an EPROM typically takes several minutes (be careful with the erasing time, because over-erasing can damage the IC). EPROMs are configured using an EPROM programmer that provides voltage at specified levels depending on the type of EPROM used.
EEPROM
Electrically erasable programmable read-only memory (EEPROM) chips can be electrically programmed and erased. EEPROMs are typically changed 1 byte at a time, and erasing an EEPROM typically takes quite long. The drawback of EEPROMs is their speed: they are too slow to use in many products that make quick changes to the data stored on the chip. EEPROMs are typically found in electronic devices for storing small amounts of nonvolatile data in applications where speed is not the most important factor. Small EEPROMs with serial interfaces are found in many electronic devices.
Flash
Flash memory is a type of EEPROM that uses in-circuit wiring to erase by applying an electrical field to the entire chip or to predetermined sections of the chip called blocks. Flash memory works much faster than traditional EEPROMs because it writes data in chunks, usually 512 bytes in size, instead of 1 byte at a time. Flash memory has many applications. The PC BIOS chip might be the most common form of Flash memory. Removable solid-state storage devices are becoming increasingly popular. SmartMedia and CompactFlash cards are both well known, especially as "electronic film" for digital cameras. Other removable Flash memory products include Sony's Memory Stick, PCMCIA memory cards, and memory cards for video game systems.
mainly magnetic disks and magnetic tapes to implement large memory spaces, which are available at reasonable prices. An efficient computer system cannot rely on a single memory component; instead it employs a memory hierarchy that combines all the different types of memory units. A typical memory hierarchy is illustrated in the figure below:
Figure 6: Memory mapping (levels shown from top to bottom: CPU, primary cache, secondary cache, main memory, secondary memory; size increases toward the bottom and speed increases toward the top)
solution to this problem is to use what is called a cache memory between the central processor and the main memory. Cache memory takes advantage of the fact that, with any of the memory technologies available for the past half century, we have had a choice between building large but slow memories or small but fast memories. This was known as far back as 1946, when Burks, Goldstine and von Neumann proposed the use of a memory hierarchy, with a few fast registers in the central processor at the top of the hierarchy, a large main memory in the middle, and a library of archival data, stored off-line, at the very bottom.

A cache memory sits between the central processor and the main memory. During any particular memory cycle, the cache checks the memory address being issued by the processor. If this address matches the address of one of the few memory locations held in the cache, the cache handles the memory cycle very quickly; this is called a cache hit. If the address does not match, the memory cycle must be satisfied far more slowly by the main memory; this is called a cache miss.
The correspondence between the main memory blocks and those in the cache is specified by a mapping function. When the cache is full and a memory word that is not in the cache is referenced, the cache control hardware must decide which block should be removed to create space for the new block. The collection of rules for making this decision constitutes the replacement algorithm.
Mapping Functions
There are three main mapping techniques that determine the cache organization: 1. Direct mapping technique 2. Associative mapping technique 3. Set-associative mapping technique
To discuss possible methods for specifying where memory blocks are placed in the cache, we use a specific small example: a cache consisting of 128 blocks of 16 words each, for a total of 2048 (2K) words, with a main memory that is addressable by a 16-bit address. The main memory has 64K words, which will be viewed as 4K blocks of 16 words each. Consecutive addresses refer to consecutive words.
Direct mapping is the simplest mapping technique: each block of the main memory has only one possible location in the cache. For example, block j of the main memory maps onto block j modulo 128 of the cache. Therefore, whenever one of the main memory blocks 0, 128, 256, ... is loaded into the cache, it is stored in cache block 0. Blocks 1, 129, 257, ... are stored in cache block 1, and so on.
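For the example cache (128 blocks of 16 words, 16-bit addresses), the modulo rule corresponds to slicing the address into three fields. The field widths below are derived from those numbers rather than stated in the text: 4 bits select one of 16 words, 7 bits select one of 128 cache blocks, and the remaining 5 bits form the tag. The function names are hypothetical.

```python
WORD_BITS = 4    # 16 words per block
BLOCK_BITS = 7   # 128 cache blocks
TAG_BITS = 5     # 16 - 7 - 4 bits left over


def cache_block_for(memory_block: int) -> int:
    """Cache block that a main memory block maps to (j modulo 128)."""
    return memory_block % (1 << BLOCK_BITS)


def decode_direct(address: int):
    """Split a 16-bit address into (tag, cache block, word)."""
    word = address & ((1 << WORD_BITS) - 1)
    block = (address >> WORD_BITS) & ((1 << BLOCK_BITS) - 1)
    tag = address >> (WORD_BITS + BLOCK_BITS)
    return tag, block, word


if __name__ == "__main__":
    for memory_block in (0, 128, 256, 1, 129, 257):
        print(memory_block, "->", cache_block_for(memory_block))
    # Word 3 of main memory block 129 lives at address 129 * 16 + 3.
    print(decode_direct(129 * 16 + 3))
```

The tag is what distinguishes memory blocks 1, 129, 257 and so on once they share cache block 1: the cache stores the tag alongside the block and compares it on every access.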
Figure 9: Direct Mapped Cache
Set-Associative Mapping
Set-associative mapping combines the direct and associative mapping techniques. Blocks of the cache are grouped into sets, and the mapping allows a block of main memory to reside in any block of a specific set. With two blocks per set, the 128-block cache has 64 sets. In this case memory blocks 0, 64, 128, ..., 4032 map into cache set 0, and they can occupy either of the two block positions within this set. The set field of the address determines which set of the cache might contain the desired block. The tag field of the address must then be associatively compared to the tags of the two blocks of the set to check whether the desired block is present. This two-way associative search is simple to implement.
Figure 11: Set-Associative Mapped Cache (two blocks per set, sets 0 to 63; main memory address fields: 6-bit tag, 6-bit set, 4-bit word)
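The set-associative lookup can be sketched for the same example: 64 sets of two blocks, with the 16-bit address split into a 6-bit tag, a 6-bit set field and a 4-bit word field. The class and function names are hypothetical. The text does not say which of the two blocks is replaced on a miss, so this sketch evicts the block that was loaded first; that choice is an assumption.

```python
WORD_BITS = 4
SET_BITS = 6
NUM_SETS = 1 << SET_BITS   # 64 sets
WAYS = 2                   # two blocks per set


def decode_set_assoc(address: int):
    """Split a 16-bit address into (tag, set index, word)."""
    word = address & ((1 << WORD_BITS) - 1)
    set_index = (address >> WORD_BITS) & (NUM_SETS - 1)
    tag = address >> (WORD_BITS + SET_BITS)
    return tag, set_index, word


class SetAssociativeCache:
    """Tracks only the tags held in each set; block data is not modeled."""

    def __init__(self):
        self.sets = [[] for _ in range(NUM_SETS)]

    def access(self, address: int) -> bool:
        """Return True on a hit, False on a miss (the block is then loaded)."""
        tag, set_index, _ = decode_set_assoc(address)
        tags = self.sets[set_index]
        if tag in tags:            # compare against both tags in the set
            return True
        if len(tags) == WAYS:
            tags.pop(0)            # assumed policy: evict the oldest-loaded block
        tags.append(tag)
        return False


if __name__ == "__main__":
    cache = SetAssociativeCache()
    # Blocks 0, 64 and 128 all map to set 0 (16 words per block).
    for block in (0, 64, 0, 64, 128, 0):
        print(block, cache.access(block * 16))
```

Blocks 0 and 64 can coexist in set 0, which a direct-mapped cache with the same set index would not allow. Loading block 128 then forces one of them out.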
Replacement Algorithms
In a direct-mapped cache, the position of each block is fixed, hence no replacement strategy exists. In associative and set-associative caches, when a new block is to be brought into the cache and all the positions that it may occupy are full, the cache controller must decide which of the old blocks to overwrite. This is an important issue because the decision can be a significant factor in system performance. The objective is to keep blocks in the cache that are likely to be referenced in the near future. It is not easy to determine which blocks are about to be referenced, but the property of locality of reference gives a clue to a reasonable strategy. When a block is to be overwritten, it is sensible to overwrite the one that has gone the longest time without being referenced. This block is called the least recently used (LRU) block, and the technique is called the LRU replacement algorithm. The LRU algorithm has been used extensively and works well for many access patterns, but it can lead to poor performance in some cases. For example, it produces disappointing results when accesses are made to sequential elements of an array that is slightly too large to fit into the cache. The performance of the LRU algorithm can be improved by introducing a small amount of randomness in deciding which block to replace.
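A minimal LRU sketch, assuming a fully associative cache that tracks whole blocks, is shown below. It uses Python's `collections.OrderedDict` to keep blocks in recency order; the class name, the `hit_count` helper and the access traces are made up for illustration. Hardware implementations use counters or similar bookkeeping instead of a dictionary.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative cache of block numbers with LRU replacement."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks = OrderedDict()   # oldest (least recently used) first

    def access(self, block: int) -> bool:
        """Return True on a hit; on a miss, load the block and return False."""
        if block in self.blocks:
            self.blocks.move_to_end(block)        # now the most recently used
            return True
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)       # evict the LRU block
        self.blocks[block] = True
        return False


def hit_count(capacity: int, trace) -> int:
    """Number of hits produced by replaying trace on an empty cache."""
    cache = LRUCache(capacity)
    return sum(cache.access(block) for block in trace)


if __name__ == "__main__":
    # Three sequential passes over an array that spans 5 blocks.
    sequential = list(range(5)) * 3
    print(hit_count(5, sequential))   # the array fits in the cache
    print(hit_count(4, sequential))   # the array is one block too large
```

The second call reproduces the weakness described above: with room for only four of the five blocks, LRU always evicts the block that is needed next, so every access misses.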
Virtual Memory
A cache stores a subset of the address space of RAM. An address space is the set of valid addresses. Thus, for each address in cache, there is a corresponding address in RAM. This subset of addresses (and the corresponding copy of the data) changes over time, based on the behavior of your program. Cache is used to keep the most commonly used sections of RAM where they can be accessed quickly. This is necessary because CPU speeds increase much faster than the speed of memory access. If we could access RAM at 3 GHz, there wouldn't be any need for cache, because RAM could keep up. Because it can't keep up, we use cache.

One way to extend the amount of memory accessible by a program is to use disk. For example, a program can use 10 Megs of disk space while, at any time, only 1 Meg resides in RAM. In effect, RAM acts like a cache for disk. This idea of extending memory is called virtual memory. It's called "virtual" only because it's not RAM. It doesn't mean it's fake.

The real problem with disk is that it's really, really slow to access. If registers can be accessed in 1 nanosecond, cache in 5 ns, and RAM in about 100 ns, then disk is accessed in fractions of a second. It can be a million times slower to access disk than a register.
The advantage of disk is it's easy to get lots of disk space for a small cost. Still, because disk is so slow to access, we want to avoid accessing disk unnecessarily.
Load and store instructions generate data addresses, while fetching an instruction generates an instruction address. Of course, RAM doesn't distinguish between the two kinds of addresses. It just sees an address. Each address generated by a program is considered virtual. It must be translated to a real physical address. Thus, address translation is occurring all the time. As you might imagine, this must be handled in hardware if it's to be done efficiently. You might think translating each address from virtual to physical is a crazy idea, because of how slow it is. However, you get memory protection from address translation, so it's worth the hardware needed to do it.
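Address translation can be sketched with a lookup table. The text does not say how translation is organized, so the details here are assumptions: memory is divided into fixed-size pages of 4096 bytes, a per-program table maps virtual page numbers to physical page numbers, and an access to an unmapped page raises an error. All names are hypothetical, and real systems do this in hardware.

```python
PAGE_SIZE = 4096   # assumed page size in bytes


class PageFault(Exception):
    """Raised when a program touches a virtual page it has no mapping for."""


def translate(virtual_address: int, page_table: dict) -> int:
    """Translate a virtual address to a physical address.

    page_table maps virtual page numbers to physical page numbers.
    """
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        # Either the page is on disk or the program has no right to it.
        raise PageFault(f"no mapping for virtual page {page}")
    return page_table[page] * PAGE_SIZE + offset


if __name__ == "__main__":
    table = {0: 5, 1: 2}   # hypothetical mapping for one program
    print(translate(10, table))
    print(translate(PAGE_SIZE + 7, table))
    try:
        translate(3 * PAGE_SIZE, table)
    except PageFault as fault:
        print("fault:", fault)
```

The fault path is where memory protection comes from: a program can only reach physical memory that its own table maps, so it cannot read or overwrite memory belonging to another program.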
Secondary Storage
Electronic data is a sequence of bits. This data can reside in either:

primary storage: main memory (RAM); relatively small, fast access, expensive (cost per MB), volatile (contents go away when the power goes off)
secondary storage: disks and tape; large amounts of data, slower access, cheap (cost per MB), persistent (contents remain even when the power is off)
We will focus on secondary storage since the collections of data in databases are usually both too large to fit in primary storage and must be persistent.
Hard Disks
Features
Spinning platter of special material.
Mechanical arm with a read/write head, which must be close to the platter to read or write data.
Data is stored magnetically (if you'd like to keep your data, it is usually best to avoid using powerful magnets near your hard disk).
Sometimes the mechanical arm digs into the platter, resulting in a very bad crash and subsequent loss of data on part of your hard disk.
Storage capacity is commonly between 2GB and 11GB.
Disks are random access, meaning data can be read or written anywhere on the disk.
To read a piece of data, the mechanical arm must be repositioned over the place on the platter where that data is stored. This is called the disk seek; 8 to 15 milliseconds is a common seek time.
Once the arm has been positioned, the data transfer rate varies, but is commonly between 1MB and 10MB a second.
A 5GB hard disk will cost anywhere from $300 to $1500; there are many options and vendors.
SCSI (Small Computer System Interface) is special hardware to improve throughput, with transfer rates of 100s of MB per second.
Solid state hard disks, with no mechanical parts, are starting to become commercially available; they are generally faster and more expensive.
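The seek time and transfer rate figures above can be combined into a rough estimate of how long a read takes. The sketch below is back-of-the-envelope arithmetic only: the function name is hypothetical, and the model ignores rotational delay and any caching, neither of which is discussed in the list.

```python
def disk_read_time_ms(size_mb: float, seek_ms: float, transfer_mb_per_s: float) -> float:
    """Estimated time for one read: one seek plus the transfer itself.

    Assumption: the data is contiguous, so only a single seek is needed.
    """
    transfer_ms = size_mb * 1000.0 / transfer_mb_per_s
    return seek_ms + transfer_ms


if __name__ == "__main__":
    # A 1 MB read on a fast and a slow disk from the ranges quoted above.
    print(disk_read_time_ms(1, seek_ms=8, transfer_mb_per_s=10))
    print(disk_read_time_ms(1, seek_ms=15, transfer_mb_per_s=1))
    # A tiny read is dominated by the seek.
    print(disk_read_time_ms(0.001, seek_ms=8, transfer_mb_per_s=10))
```

For small reads the seek dominates, which is why scattered accesses are much more expensive than reading the same amount of data sequentially.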
Spinning platter of special material.
Information is stored magnetically.
Read/write head positioned by a mechanical arm.
Storage capacity of a few MB.
Random access.
Seek time from 10 to 40 milliseconds.
Easily portable.
Like a hard disk, but designed to permit the disk and/or disk drive to be removed and slotted into another machine within seconds.
More expensive than a hard disk.
Less reliable.
Optical Disks
CD-ROM: read only (books, software releases).
WORM: write once, read many (archival storage).
Laser encoding, not magnetic.
30-50 ms seek times.
640MB - 17GB storage capacity.
Cheaper than hard disks per MB of storage capacity, but slower.
Portable.

Jukeboxes of optical disks are becoming popular for storing really, really large collections of data. The Mercury-20 jukebox (no, I'm not selling these, just using it as a typical example) provides access to up to 150 CD-ROMs, or in other words 94GB of storage capacity. The Mercury jukebox takes a maximum of four seconds to exchange and load a disc into a drive, 2.5 seconds to spin up and access the data, and 10 seconds to transfer a 6.0 MB file to the computer or server.
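The jukebox figures can be sanity-checked with a little arithmetic. In the sketch below, the per-disc capacity of 640 MB is an assumption taken from the low end of the capacity range listed above, and the function names are hypothetical; the timing values are the ones quoted in the text.

```python
def jukebox_capacity_gb(discs: int, mb_per_disc: float = 640) -> float:
    """Total capacity in GB, assuming every disc holds mb_per_disc megabytes."""
    return discs * mb_per_disc / 1024


def jukebox_fetch_seconds(exchange_s: float = 4.0,
                          spin_up_s: float = 2.5,
                          transfer_s: float = 10.0) -> float:
    """Worst-case time to fetch the example file from a disc not yet loaded."""
    return exchange_s + spin_up_s + transfer_s


if __name__ == "__main__":
    print(jukebox_capacity_gb(150))        # close to the 94GB quoted
    print(jukebox_fetch_seconds())         # total seconds for the 6.0 MB file
    print(6.0 / 10.0)                      # effective transfer rate in MB per second
```

Under these assumptions, fetching a 6.0 MB file from an unloaded disc takes far longer than the same read from a hard disk, which is the price paid for the jukebox's low cost per MB.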