Sense Amplifiers
I. INTRODUCTION
The sense amplifier topologies compared are the current latched sense amplifier (CLSA) and the voltage latched sense amplifier (VLSA). The methods used to measure the different comparison metrics are explained, along with a presentation of the results.
The paper is organized as follows: Section II presents an overview of the design principles of a small signal array; Section III briefly explains the topology and operation of the sense amplifiers; Section IV explains the measurement of the different comparison metrics; and Section V concludes the paper.
II. BASIC DESIGN PRINCIPLE OF SMALL SIGNAL ARRAY
Six transistor cell (6T) based memory is widely used for embedded memory due to its small area [3] and relatively fast access time. One important design parameter for memory, especially when using 6T, is area utilization, which compares the actual memory cell area to the total area of the memory. If the complete area of the 32KB design is 150000 um2, then the utilization is 33%. Typical values are 25-40% for level 1 (L1) caches and 60-70% for L2. The reason for this difference is that L1 is built for performance with smaller capacity, while for L2 the emphasis is on density and less on speed. Using a sense amplifier makes it possible to have many cells in the same column (Figure 1), which share the pre-charge, write driver, and sense amplifier circuits and so reduce the overhead. With a sense amplifier, the bitline does not need a full voltage swing to transfer the data value from the cell to the output, but only a small differential (0.2*Vdd), which lowers power during read access.
Figure 2: Timing relationship between main control signals of a 6T memory array (signals shown: trigger, BL & BLB with differential Vbl, Ta, SA_en, dout)
The main timing signals and their relative trigger times are shown in Figure 2. An access is normally triggered from a clock edge, which de-asserts the pre-charge (turns it off) and asserts the WL signal (turns it on) to access the cell. The self-timed Sa_en signal is asserted through tracking circuitry once a sufficient voltage difference has developed between BL and BLB. The Sa_en signal enables the sense amplifier to sense the voltage difference, which depends on the data stored in the 6T cell. The sense amplifier also stores the data value in a latch to make it available for downstream logic. The time from WL to Sa_en (Ta) is the array access time and is programmable through the tracking circuit. This programmability enables a tradeoff between timing, power and yield.
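The area utilization arithmetic in Section II can be sketched in a few lines. This is only an illustration: the function names are hypothetical, 32KB is assumed to mean 32*1024 bytes with no parity or redundancy bits, and the per-bit cell area is back-calculated from the 33% and 150000 um2 figures rather than taken from the paper.

```python
# Area utilization = (number of bits * bit-cell area) / total macro area.
# Assumptions (not from the paper): 32KB = 32 * 1024 bytes, no parity or
# redundancy bits, and hypothetical function names.

def utilization(cell_area_um2: float, capacity_bytes: int, macro_area_um2: float) -> float:
    """Fraction of the macro area occupied by bit cells."""
    bits = capacity_bytes * 8
    return bits * cell_area_um2 / macro_area_um2


def implied_cell_area(util: float, capacity_bytes: int, macro_area_um2: float) -> float:
    """Bit-cell area (um2) implied by a given utilization and macro area."""
    bits = capacity_bytes * 8
    return util * macro_area_um2 / bits


capacity = 32 * 1024      # bytes, assumed meaning of "32KB"
macro_area = 150_000.0    # um2, from the text
cell = implied_cell_area(0.33, capacity, macro_area)

print(f"implied bit-cell area: {cell:.3f} um2")
print(f"utilization check:     {utilization(cell, capacity, macro_area):.2%}")
```

Under these assumptions the implied bit-cell area comes out a little under 0.19 um2; a different bit count (for example with parity bits) would change that figure.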
Figure 3: a) CLSA sense amplifier schematic and layout showing matching transistors b) VLSA sense amplifier
schematic and layout
III. CLSA & VLSA SENSE AMPLIFIERS
The CLSA and VLSA sense amplifier topologies are shown in Figure 3a and 3b respectively. Both sense amplifiers have the ‘bit’ and ‘bitb’ inputs connected to the column bitlines of the SRAM. The sense input is actuated once the requisite voltage differential has developed on the bitlines. Each sense amplifier has cross-coupled inverters that convert the voltage differential at their inputs on the bitlines to a full swing at the outputs. The output inverters in each sense amplifier drive the downstream logic and also serve to isolate the internal nodes of the sense amplifier from the external load.
Figure 3a shows the CLSA sense amplifier design. Since this is a current latched design, the bitlines drive the gates of transistors M9 and M10. Transistors M1, M4, M5 and M8 are the precharge transistors. Transistors M2, M6 and M3, M7 form the inverter pair that resolves the bitline differential voltage. Traditionally this topology has been used because the memory bitlines drive a high impedance (gate), so full discharge of the array bitline due to timing mismatch is not a concern.
Figure 3b shows the VLSA sense amplifier design (schematic and layout in the 28nm design rules). M2-M5 and M3-M6 form the inverters that resolve the differential voltage on the bitlines to a full swing at the output. The internal nodes of this design are precharged through the bitlines. The obvious advantage of this topology over the CLSA is the lower number of transistors needed, which means faster access and a smaller footprint. The challenge in using this topology has been the race condition on the isolation signal that decouples the sense amplifier bitline (sol, sor) from the array bitline (bit, bitb). If the sense amplifier is enabled while M1 and M4 are on, the memory bitline (bit or bitb) could be discharged to logic 0. In traditional designs, a signal other than sen (an isolate signal) is used to control M1 and M4, which makes it hard to match the sense and isolate operations; in our design, the same signal that enables the sense amplifier also isolates the array bitlines.
Before triggering of the sense amplifier, transistor M7 is off and pass transistors M1 and M4 are on. As the differential develops on the bitlines, it also develops on the internal nodes of the sense amplifier, ‘sol’ and ‘sor’. When the sense signal ‘saenb’ is asserted, the cross-coupled inverters formed by M2-M5 and M3-M6 amplify this differential voltage to a full-swing output.
IV. COMPARISON METRICS
Simulations are performed on the two sense amplifier designs using the full post-layout netlist and the same design flow used to qualify production-level designs. Both designs receive the same stimuli and drive the same load for comparison. The metrics chosen for comparison help to describe the aspects of sense amplifier operation that are useful from a design perspective. Table I shows the comparative results of these simulations.
TABLE I: LP TECHNOLOGY SENSE AMPLIFIER METRIC COMPARISON
Metric                       VLSA/CLSA
Area                         0.65x
Bitline input capacitance    3.2x
Sense enable capacitance     1.5x
TABLE II: DELAY AND LEAKAGE COMPARISON FOR CLSA AND VLSA FOR BOTH LP AND HPM
Delay      CLSA    VLSA
LP         3.03    1
HPM        1.56    0.63
Leakage    CLSA    VLSA
LP         0.28    1
HPM        3.36    1.79
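The Table II entries (delay: LP 3.03 vs 1, HPM 1.56 vs 0.63; leakage: LP 0.28 vs 1, HPM 3.36 vs 1.79, CLSA vs VLSA in each case) can be turned into per-technology CLSA/VLSA ratios. The sketch below assumes the entries are unitless values normalized to a common reference, which the table suggests but does not state. The dictionary layout and function name are hypothetical.

```python
# Table II values as printed (CLSA vs VLSA, per technology).
# Assumption: entries are unitless and share one normalization within a row,
# so the CLSA/VLSA ratio in each row is meaningful.
TABLE_II = {
    "delay": {
        "LP": {"CLSA": 3.03, "VLSA": 1.0},
        "HPM": {"CLSA": 1.56, "VLSA": 0.63},
    },
    "leakage": {
        "LP": {"CLSA": 0.28, "VLSA": 1.0},
        "HPM": {"CLSA": 3.36, "VLSA": 1.79},
    },
}


def clsa_over_vlsa(metric: str, tech: str) -> float:
    """Ratio of the CLSA entry to the VLSA entry for one metric and technology."""
    row = TABLE_II[metric][tech]
    return row["CLSA"] / row["VLSA"]


for metric, techs in TABLE_II.items():
    for tech in techs:
        print(f"{metric:8s} {tech:4s} CLSA/VLSA = {clsa_over_vlsa(metric, tech):.2f}")
```

A ratio above 1 means the CLSA entry is larger (slower, or leakier) than the VLSA entry for that technology; a ratio below 1 means the opposite.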
Figure 5: Normalized required offset voltage for HPM process a) CLSA and b) VLSA sense amplifiers across process corners,
voltages and temperatures (VLSA requires half the offset compared to CLSA at the same PVT corner)
Figure 6. Increase in offset voltage for LP tech sense amp compared to HPM (a) CLSA (b) VLSA
Figure 7. Increase in offset voltage for cold vs. hot temperature (a) LP CLSA (b) LP VLSA
greater than that in the HPM technology. Each plot in the
figure is for a particular combination of supply voltage and
temperature, across the five different process corners. As can
be seen, the required offset voltage for the CLSA topology is
up to 30% lower in the HPM technology as compared to that in
the LP technology. Similarly, the required offset voltage is
about 45% greater in the LP technology for the VLSA design
as compared to the same design in HPM technology.
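The two percentages quoted above use opposite baselines: the CLSA figure is expressed as "lower in HPM" and the VLSA figure as "greater in LP". To compare them, one can be converted to the other's baseline. The helper names below are hypothetical, and the conversion is plain arithmetic rather than anything taken from the paper's data.

```python
# "x lower in HPM than LP" means HPM = (1 - x) * LP, so LP = HPM / (1 - x).
# "y greater in LP than HPM" means LP = (1 + y) * HPM, so HPM = LP / (1 + y).

def lower_to_greater(frac_lower: float) -> float:
    """Convert 'A is frac_lower below B' into 'B is this fraction above A'."""
    return 1.0 / (1.0 - frac_lower) - 1.0


def greater_to_lower(frac_greater: float) -> float:
    """Convert 'B is frac_greater above A' into 'A is this fraction below B'."""
    return 1.0 - 1.0 / (1.0 + frac_greater)


# CLSA: up to 30% lower in HPM, restated with HPM as the baseline.
print(f"CLSA: LP is up to {lower_to_greater(0.30):.1%} greater than HPM")
# VLSA: about 45% greater in LP, restated with LP as the baseline.
print(f"VLSA: HPM is about {greater_to_lower(0.45):.1%} lower than LP")
```

On a common baseline the two topologies show a similar LP-to-HPM gap: roughly 43% versus 45% when expressed as "greater in LP".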
H. Temperature Sensitivity
Increased chip transistor density in advanced process technology nodes also leads to higher chip operating temperatures. It is thus pertinent to know the impact that temperature has on the required minimum offset voltage. Figure 7 shows the extent to which the offset voltage increases at cold as compared to hot temperatures for the LP technology CLSA and VLSA designs. As can be seen, at the higher voltage of 1.26V, the required offset voltage is higher by 15%-20% for the CLSA design. In the HPM technology, this trend is reversed, with larger increases in offset voltage seen at lower operating voltages (see Fig. 8): at low voltages, the CLSA design offset voltage is higher by more than 35% at cold temperatures as compared to hot. There is not much variation in required offset voltage with temperature for the VLSA design in HPM technology.
Figure 6: Percentage increase of required offset voltage for -30C compared to the same PV at 125C (hot)
Figure 8: Increase in offset voltage for cold vs. hot temperature for LP technology (a) CLSA design (b) VLSA design
V. SUMMARY & CONCLUSION
This publication describes the simulation methodology used to compare the VLSA and CLSA sense amplifier designs. The simulation results clearly show the advantage of the VLSA design over the CLSA design. The faster speed of operation and lower input differential required by the VLSA design make it an ideal choice for high speed, low power datapath design. The traditional design complexity arising from the use of VLSA has also been addressed by using one signal both to enable the sense amplifier and to isolate the array bitline from the sense amp bitline (as shown in Figure 3b).
REFERENCES
[1] J. Yuan et al., “Performance Elements for 28nm Gate Length Bulk Devices with Gate First High-k Metal Gate”, Solid-State and Integrated Circuit Technology, 2010, 10th IEEE International Conference on, pp. 66–69.
[2] Wu et al., “A Highly Manufacturable 28nm CMOS Low Power Platform Technology with Fully Functional 64Mb SRAM Using Dual/Triple Gate Oxide Process”, VLSI Technology, 2009 Symposium on, pp. 210–211.
[3] N. Weste and D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective, Addison-Wesley, 2005.
[4] International Technology Roadmap for Semiconductors (ITRS), itrs.net.
[5] Baker Mohammad, Martin Saint-Laurent, Paul Bassett, and Jacob Abraham, “Cache Design for Low Power and High Yield”, IEEE International Symposium on Quality Electronic Design (ISQED), March 2008, pp. 103-107, San Jose, CA, USA.
[6] Baker Mohammad and Jacob Abraham, “A Reduced Voltage Swing Circuit Using a Single Supply to Enable Lower Voltage Operation for SRAM-based Memory”, Microelectronics Journal, Elsevier, December 2011.