Professional Documents
Culture Documents
Hybrid Latch-Type Offset Tolerant Sense Amplifier For Low-Voltage Srams
Hybrid Latch-Type Offset Tolerant Sense Amplifier For Low-Voltage Srams
Abstract— Sense amplifier (SA) input-referred offset often frequency, and power/energy consumption [1]. Among all
dictates the minimum required differential input (VBL-min ) and these related characteristics, the SA sensing delay, minimum
is an important factor in realizing low-voltage static random required differential input voltage, VBL-min (resolution), and
access memories. This paper presents a HYbrid latch-type Sense energy of the read/write operations are the most important [2].
Amplifier (HYSA-QZ), where the bitline signals are supplied
to multiple internal nodes to significantly reduce VBL-min .
A 65-nm CMOS test chip with arrays of the HYSA-QZ, two
II. BACKGROUND
intermediate formulations of the HYSA-QZ, conventional current Two popular SA topologies are the Voltage Latch
latch SA (CLSA), and conventional voltage latch SA (VLSA) SA (VLSA) and the Current Latch SA (CLSA). The choice of
were fabricated. Measurements over 5120 SAs of each type show specific topology is dependent on the technology used [3].
that the HYSA-QZ implemented with regular-VT transistors To make a reliable decision, an SA requires a minimum
require 50.0%, and 22.8% lower VBL-min with 6.5% (or 4.5%) worst-case differential signal (VBL-min), which should be
and 30.7% (or 18.8%) of total gate (or layout) area overhead
greater than the SA’s input-referred offset voltage (VOS ) [4].
compared to CLSA and VLSA at 0.4 V, respectively. Iso-gate-
area offset improvement was substantiated with Pelgrom’s mis- The SA’s VOS is determined by the VT mismatches of the
match model, where the HYSA-QZ with regular-VT transistors sensing and input transistors [5]. Abu Rahma et al. reported
showed 46.6% and 7.7% improvements in measured standard that larger standard deviation of input referred offset distri-
deviation of offset distribution compared with CLSA and VLSA, bution (σ O S ) has a significant negative impact on SRAM
respectively. Measured VBL-min for the HYSA-QZ remains speed and read access yield, Yread [6]. Analysis carried out
stable and low over a temperature range from 0 °C to 75 °C by Abu-Rahma et al. [6] shows that for a 28 nm, 16 Mb
at 0.4 V. Moreover, an additional 13.0% reduction in VBL-min SRAM with Yread of 97%, every 1 mV increase in σ O S of the
was measured in the HYSA-QZ when using low-VT transistors. SA requires a 10 mV increase in the VBL-min . Minimizing
Finally, the HYSA-QZ operates reliably at VDD-min of 260 mV VBL-min improves energy consumption as it takes less time to
in 25 °C.
develop a smaller differential voltage on the highly capacitive
Index Terms— Comparator, Offset Tolerant Latch, offset bitlines, which experience less discharge per read access. The
cancellation, sense amplifier, variation tolerant circuits, SRAM. SA’s VOS arises from the mismatches in the gain factor, the
drain current, the threshold voltage VT , and the layout of
I. I NTRODUCTION the devices used [7], [8]. Among these, VT mismatch has
been identified as the dominant contributing factor to VOS .
S TATIC Random Access Memories (SRAMs) often occupy
a significantly large area of System on Chips (SOCs)
and consequently affect their energy consumption, yield, and
Shah et al. [9] concluded that the VT mismatch between
the NMOS sensing pair mostly determines the CLSA’s VOS .
Similarly, the work on VLSA [10] reported that the majority of
reliability. A sense amplifier (SA) is an important circuit that VOS is contributed by the VT mismatch in NMOS pair when
reads and amplifies the data stored in an SRAM cell. The the bitlines are precharged to VDD . Unfortunately, aggressive
characteristics of an SA determine several important SRAM device scaling has resulted in increased device variations and
metrics, including minimum operating voltage, maximum read contributed to larger VOS in SAs [11], [12].
Manuscript received May 7, 2018; revised October 11, 2018 and The main target for this work is to investigate into offset tol-
December 25, 2018; accepted February 10, 2019. Date of publication March 5, erant low-voltage SA topologies that can leverage low-voltage
2019; date of current version June 18, 2019. This work was supported in SRAMs to enable wide range of battery-operated mobile,
part by the Natural Sciences and Engineering Research Council of Canada sensory and implantable SOCs with stringent energy con-
(NSERC) under Grant NSERC-RGPIN-205034-2012 052714. This paper
was recommended by Associate Editor A. Cilardo. (Corresponding author:
straints. In this paper, we propose a differential SA architecture
Dhruv Patel.) that reduces the required VBL-min while offering reliable
D. Patel was with the Electrical and Computer Engineering Department, operation from the nominal supply down to the subthreshold
University of Waterloo, Waterloo, ON N2L 3G1, Canada. He is now with supply range. The proposed SA combines both CLSA and
the Electrical and Computer Engineering Department, University of Toronto, VLSA features by applying the differential input signal to
Toronto, ON M5S 3G4, Canada (e-mail: dhruv.patel@isl.utoronto.ca).
A. Neale was with the Electrical and Computer Engineering Department, multiple sensing nodes, and thus referred to it as the HYbrid
University of Waterloo, Waterloo, ON N2L 3G1, Canada. He is now with SA (HYSA). This also makes our proposed differential HYSA
Intel Corporation, Hillsboro, OR 97124 USA (e-mail: ajneale@uwaterloo.ca). compatible with most of the symmetrical fully differential
D. Wright and M. Sachdev are with the Electrical and Computer Engineer- SRAM cells i.e. 6T, 8T, 10T etc.
ing Department, University of Waterloo, Waterloo, ON N2L 3G1, Canada
(e-mail: derek.wright@uwaterloo.ca; msachdev@uwaterloo.ca).
The rest of this paper is outlined as follows. Section III
Color versions of one or more of the figures in this paper are available introduces the proposed SA, HYSA-QZ as a succession
online at http://ieeexplore.ieee.org. from the conventional topologies with their description, and
Digital Object Identifier 10.1109/TCSI.2019.2899314 simplified circuit analysis and proving the improvement in
1549-8328 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
2520 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS–I: REGULAR PAPERS, VOL. 66, NO. 7, JULY 2019
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
PATEL et al.: HYBRID LATCH-TYPE OFFSET TOLERANT SENSE AMPLIFIER FOR LOW-VOLTAGE SRAMs 2521
Fig. 3. Simplified circuit analysis showing how applying bitlines to various internal nodes over each of the topologies studied in this work is effectively
overcoming the VT mismatches. The analysis consists of comparing resulted differential over drive voltages (VOV ) and differential drain-to-source voltages
(VDS ) for an applied VBL on a given SA topology. The analysis was performed just at the moment when the SAE is asserted high (0 → 1 transition).
As apparent from Fig. 2 (a), an HYSA with appropriate SEL topologies, the expected outcome through the left branches
signals can be reduced to a CLSA. In all CLSA and HYSA (Q) is ‘1’ (VDD ) and through the right branches (QB) is
configurations, the BL/BLB are always applied to the gates ‘0’ (GND). Hence, favorably, left branches should exhibit
of the N3/N4 transistors. In literature, the SA proposed in the higher impedance to GND trying to hold to VDD and should
work of [13] is similar to the L-CLSA-Z which additionally drain less current compared to right branches. On the other
applies BL/BLB signals to Z/ZB nodes. On the other hand, hand, right branches should exhibit lower impedance to GND
the proposed SAs in the works of [14]–[16] are similar to trying to drain QB to GND and should drain more current
the L-HYSA-Q which additionally applies BL/BLB signals compared to left branches. Therefore, the preset or pre-charge
to Q/QB nodes, but not to Z/ZB nodes. In our proposed conditions where the transistors in the right branches have
R/L-HYSA-QZ, we apply BL/BLB signals to both Q/QB higher VOV (indicating higher current draining capability) and
and Z/ZB nodes simultaneously to further improve the offset lower VDS (indicating lower impedance to GND) compared to
tolerance and in fact, the best possible offset tolerance among the transistors in the left branches is desirable. This dictates
all analyzed topologies in this work. The significance and that higher magnitudes of VOV:2-1/VOV:4-3 with +ve
the effect of applying BL/BLB signals to both Q/QB and polarity and higher magnitudes of VDS:2-1 /VDS:4-3 with
Z/ZB nodes will be shown throughout this paper by the means –ve polarity is desired resulting in higher I with desirable
of offset tolerance analysis, simulations and detailed on-chip polarity working against mismatches; this ultimately dictates
measurements. topology’s ability to tolerate disadvantageous mismatch
The rest of this section provides a substantiation with conditions.
simplified circuit analysis showing HYSA-QZ having higher Calculated VOV and VDS in Fig. 3 are put into three
offset tolerance compared to other topologies. The circuit categories: (1) working with mismatch (Bad), (2) working
analysis shown in Fig. 3 evaluates differential overdrive volt- against mismatch to neutralize the mismatch effect (Neutral)
ages (VOV ) and differential drain-to-source voltages (VDS ) and (3) Compensating mismatch to 0 and/or giving additional
on the NMOS transistor pairs, N1/N2 and N3/N4 at the instant offset tolerance (Good). VLSA: The required VBL-min must
when the SAE signal makes a 0→1 transition. VOV and be at least equal to VT:2-1 to neutralize the mismatch effect
VDS of NMOS transistor pairs are analyzed as they are with some aid from VDS:2-1 pre-setting 2-1 in favor of
the key parameters for creating differential transconductance the desired decision. CLSA: It has VOV:2-1 compensated
(gm ) and differential impedance (), respectively giving to 0 where the mismatch due to VT:2-1 works in favor
rise to differential current (I) in the left and right branches of presetting 4-3 but works against presetting 2−1 .
of a given latch-type SA. The polarity of I set by the applied Coincidental advantages due to VT:2-1 are bound to random-
VBL and other preset conditions is of importance in making ness and therefore not assured. CLSA will typically require
a correct decision at the moment when the SAE is asserted. VBL-min of at least VT:4-3 for making a correct decision,
Hence, VOV and VDS are analyzed at this instant. requiring it to neutralize VOV:4-3 . CLSA-Z: It has applied
Assumptions and definitions used in this analysis VBL working in favor of presetting 4-3 giving assured
in Fig. 3 are shown in the first row. To impose worst-case aid in desirable initial amplification phase, but works against
disadvantageous mismatch conditions, VT mismatches (VT ) presetting 2-1 which is relatively less of a concern after
are assumed such that VT2 > VT1 and VT4 > VT3 inducing the sufficient signal amplification on Z/ZB is achieved [17].
I in undesirable opposite polarities. Given these mismatch It will require VBL-min of around VT:4-3 to neutralize
assumptions and applied VBL according to the topology VOV:4-3 or little larger to overcome worst-case VT:2-1
configurations (see third and fourth rows), VOV:2-1/VOV:4-3 mismatch. HYSA-Q: It has coincidental aid from VT:2-1 to
and VDS:2-1 /VDS:4-3 are calculated as shown in fifth and pre-set 4-3 in the desired polarity but is diminished by
sixth rows, respectively. Based on the applied BL/BLB in all the intentionally applied VBL . However, it has its VOV:2-1
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
2522 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS–I: REGULAR PAPERS, VOL. 66, NO. 7, JULY 2019
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
PATEL et al.: HYBRID LATCH-TYPE OFFSET TOLERANT SENSE AMPLIFIER FOR LOW-VOLTAGE SRAMs 2523
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
2524 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS–I: REGULAR PAPERS, VOL. 66, NO. 7, JULY 2019
Fig. 8. Simulated offset tolerance comparison across SAE pulse width with
Fig. 7. Simulated offset tolerance comparison across SAE rise time with
VDD = 0.4 V and VBL = 40 mV at TT/25 °C. Simulations were performed VDD = 0.4 V and VBL = 40 mV at TT/25 °C. Simulations were performed
with transistor noise and VRMS−noise = 5 mV on all VDD , bitlines and SAE with transistor noise and VRMS−noise = 5 mV on all VDD , bitlines and SAE
signals. (a) Offset tolerance due to VT mismatches of N4/N3 (VT:4-3 ) only.
signals. (a) Offset tolerance due to VT mismatches of N4/N3 (VT:4-3 ) only.
(b) Offset tolerance due to N1/N2 VT mismatches (VT:2-1 ) only. (b) Offset tolerance due to N1/N2 VT mismatches (VT:2-1 ) only.
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
PATEL et al.: HYBRID LATCH-TYPE OFFSET TOLERANT SENSE AMPLIFIER FOR LOW-VOLTAGE SRAMs 2525
Fig. 10. (a) Simulated sensing delay across VDD . (b) Normalized sensing Fig. 12. (a) Simulated dynamic power. (b) Normalized dynamic power
delay w.r.t R-VLSA at a given VDD . performed at TT/25 °C with VBL = w.r.t R-VLSA at a given VDD . Performed at VBL = −40mV, FCLK =
−40 mV. 3.33 MHz, TT/25 °C.
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
2526 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS–I: REGULAR PAPERS, VOL. 66, NO. 7, JULY 2019
V. M EASUREMENT R ESULTS
SA arrays of L-HYSAs/CLSAs, R-HYSAs/CLSAs and
R-VLSA cells with different configurable modes were imple-
mented in a 65 nm-GP CMOS process shown in Fig. 13 with
the test chip and the test bench setup. Similar to the works
of [6] and [10], each SA array was organized with 64 rows,
and 8 columns containing total 512 cells that are individually
addressable with a row and a column decoder. The test chip
architecture for characterizing SA VOS is shown in Fig. 14.
Each row slice contains SAs, pre-charge circuitry, pull-down
NMOS transistors, and a level shifter with a latch. BL and
BLB signals are routed vertically throughout the array. All
the control signals for the mode selection transistors and the
multiplexers were driven by 1.0 V supply to minimize the tim-
ing uncertainty of control signals. The measurements consist
of two key aspects of characterization: (1) Directly measured
VOS statistics across VDD and temperatures. (2) Comparison
of different topologies on reliability/reproducible results over
clock frequency and the operating temperature at a given VDD .
Fig. 15 (a) and (b) show simulated and measured Cumula-
tive Distributions (CDF) of topologies analyzed in this work Fig. 15. (a) Simulated cumulative distribution of 512 SAs. (b) Measured
at 0.4 V at 25 °C, respectively. MC simulations and mea- cumulative distribution of 5120 SAs across 10 ICs at 1 mV of VBL step
sured CDF plots across 10 ICs with a VBL step resolution resolution.
of 1 mV show that the VOS distribution is tighter for
proposed topologies compared to others with same VT flavor. respectively. For example, 512 MC simulations in Fig. 16
Simulated and measured σ O S values in Fig. 16 (a) and (b) were (a) show that R-HYSA-QZ has σ O S of 7.0 mV compared
obtained by curve fitting (non-linear least-square minimization to R-CLSA of 11.7 mV, or R-VLSA of 9.6 mV. A similar
method) probability density function curves derived from their trend is observed in Fig. 16 (b) over 5120 measured samples
respective CDF curves illustrated in Fig. 15 (a) and (b), from 10 ICs with σ O S of 9.1 mV, 18.1 mV and 11.4 mV
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
PATEL et al.: HYBRID LATCH-TYPE OFFSET TOLERANT SENSE AMPLIFIER FOR LOW-VOLTAGE SRAMs 2527
Fig. 16. (a) Simulated probability density of VOS across 512 SAs.
(b) Measured probability density of VOS across 5120 SAs (10 ICs) at 1 mV
of VBL step resolution. Fig. 17. Statistics of measured within-die and inter-die offset distribution.
(a) Measured σ O S across each of the 10 ICs. (b) Measured μ O S across each
of the 10 ICs. (512 samples per IC).
for R-HYSA-QZ, R-CLSA and R-VLSA, respectively. The across VDD from their respective PDF curves to understand
differences between measurements and simulations could be the improvement over gate area overhead explained in the
explained by the differences in the sample size, excluded lay- next paragraph. VBL-min characteristics of the SAs were also
out parasitics and the inaccuracy of the mismatch model used characterized across temperature between 0 °C – 75 °C at
in MC simulations. Fig. 17 (a) and (b) show the measured σ O S 0.4 V at 10 MHz. As depicted in Fig. 19, measured VBL-min
and the mean of offset distribution (μ O S ) of each individual shows that all HYSA topologies have relatively more stable
ICs with VDD = 0.4 V at 25 °C, respectively, distinguishing and lower VBL-min across temperature compared to CLSA
the impact of within-die and inter-die variations. It was evident and VLSA. Overall, the measured relative trends were in
that σ O S of the proposed R/L-HYSA-QZ remained relatively good agreement with the circuit theory predictions, the offset
low and stable (deviated less than |±1 mV|) across each of the tolerance and the MC simulations of VT . They showed that
10 ICs. The μ O S of each topologies remained within 0 mV – the HYSA topologies had increased offset tolerance as more
6 mV across all 10 ICs. internal nodes get precharged with bitlines and increased even
A similar measured statistical analysis was carried out further by using L-VT transistors. As was also predicted in
over VDD between 0.4 V to 1.0 V at 10 MHz. For Section IV, applying bitlines at Q/QB nodes is far more
each topology on each IC, worst case VBL (VBL-min ) effective (∼4.5x at 0.4 V) than applying bitlines at Z/ZB
required for 100% yield (0 failures in both logic ‘1’ and nodes. R-VLSA had relatively more gate area spent on its
‘0’ from 512 SAs) was extracted. Subsequently, the average critical NMOS pair resulting in lower mismatch compared to
of VBL-min from 10 ICs, μVBL-min was computed for each other SAs. Therefore, despite having low offset tolerance as
VDD . Fig. 18 (a) summarizes these measurements, and each shown in Figs. 6-8, R-VLSA achieves lower VBL-min and
data point in the figure represents this value. For exam- lower σ O S compared to, for example, L-CLSA-Z, where it
ple, for R-HYSA-QZ, the μVB L−min at 0.4 V is 27.8 mV provides only comparable offset tolerance as R-VLSA.
as compared to 56.0 mV and 36.0 mV of R-CLSA and Due to the gate area penalty in the proposed HYSA-QZ,
R-VLSA, respectively. Based on these measured results we we analyze measured σ O S from Fig. 18 (c) while substanti-
conclude that R-HYSA-QZ requires 50.0% and 22.8%, smaller ating iso-gate area improvement. Pileggi et al. [10] reported
VBL-min at the cost of 6.5% and 30.7% gate area over- that VLSA σ O S accurately follows Pelgrom’s model [12]
head compared to R-CLSA and R-VLSA, respectively. Also, described by σ (VT ) = √AT0 , where AT0 , W and L are the
using L-VT transistors in L-HYSA-QZ lowered μVBL-min WL
by 13.0% (56.5% w.r.t R-CLSA & 32.9% w.r.t. R-VLSA) area proportionality constant, and width and length of two
compared to using R-VT transistors in R-HYSA-QZ. To vali- devices with close proximity, respectively. Considering this,
date the inter-die consistency of VBL-min, standard deviation adding 30.7% extra
area toR-VLSA’s NMOS pair would give
of VBL-min (σVBL-min ) across 10 ICs was computed and maximum of 1 − √ 1 ≈ 12.5% σ O S improvement in
1.307
shown in Fig. 18 (b). Again, the proposed HYSA-QZs having R-VLSA whereas R-HYSA-QZ improved measured σ O S by
relatively low σVB L−min confirms inter-die consistency in offset 20.2%; therefore resulting in net iso-gate-area improvement
tolerance. Also, as shown in Fig. 18 (c), σ O S were extracted of 7.7% at 0.4 V. However, at super-threshold supply, i.e.
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
2528 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS–I: REGULAR PAPERS, VOL. 66, NO. 7, JULY 2019
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
PATEL et al.: HYBRID LATCH-TYPE OFFSET TOLERANT SENSE AMPLIFIER FOR LOW-VOLTAGE SRAMs 2529
Fig. 20. Measured Frequency vs VDD shmoo plots (a-e): (a) R-VLSA vs R-HYSA-QZ (b) R-CLSA vs R-HYSA-QZ (c) L-CLSA-Z vs L-HYSA-QZ
(d) L-HYSA-Q vs L-HYSA-QZ (e) R-HYSA-QZ vs L-HYSA-QZ. Measured Temperature vs VDD shmoo plots (f-j): (f) R-VLSA vs R-HYSA-QZ
(g) R-CLSA vs R-HYSA-QZ (h) L-CLSA-Z vs L-HYSA-QZ (i) L-HYSA-Q vs L-HYSA-QZ (j) R-HYSA-QZ vs L-HYSA-Q.
R-CLSA with VDD−min = 420 mV at −5 °C). The VDD−min L-HYSA-Z and L-CLSA. For the calculation of Pdy-read ,
for most topologies was mainly dictated by higher temperature VDD = 0.4 V, FCLK = 3.33 MHz, measured σ O S from
as it reduces VT (see Fig. 21 (b)) further increasing leakage Fig. 18 (c) and simulated values of Pdy-SA with appropriate
and hence deteriorating the yield. However, in general terms, VBL-min calculated from eq. (4) were used. For example, for
the VDD−min trend across temperature at a given FCLK for the ζ factor of 8 mV/mV, Fig. 22 (a) shows the predicted
SAs in R-VT and L-VT flavors is dictated by the balance Pdy-read improvement with proposed R/L-HYSA-QZ. It shows
between yield affected by both leakage at higher temperatures the Pdy-read improvement across different values of CBL where
and reduced sensing delay due to increased VT at lower the 0% crossover point is annotated to highlight the minimum
temperatures (see Fig. 21 (b)). CBL requirement for Pdy-read improvement > 0%. At CBL
of 150 fF, R-HYSA-QZ results in Pdy-read improvement of
VI. S ENSE A MPLIFIERS C OMPARISON 11% and 45% compared to R-VLSA and R-CLSA, respec-
The resulting Pdy-read with the consideration of increased tively. Also, L-HYSA-QZ results in Pdy-read improvement
Pdy-SA in the proposed R/L-HYSA-QZ topologies needs to be of 5%, 3.5%, 38%, and 46% compared to R-HYSA-QZ,
analyzed and compared with the resulting Pdy-read with other L-HYSA-Q, L-CLSA-Z and L-CLSA, respectively. The ζ
SAs. The Pdy-read can be quantified as shown in eq. (1) where factor is subject to vary depending on the SRAM memory
the Pdy-BL/BLB is the power consumption due to the bitline size, target Yread and the technology related effects such
discharge defined by eq. (2) and the Pdy-SA is the SA dynamic as proximity, stress effects, VT variations, and active area
power analyzed in Section IV and also defined by eq. (3). rounding [6]. Therefore, the minimum CBL crossover points
Note that other dynamic power components of the read access were recorded as shown in Fig. 22 (b) for varied values of ζ
such as word-line drivers and control/timing are ignored in this between 4 mV/mV to 15 mV/mV. The values for minimum
comparative analysis as they would remain constant regardless CBL crossover points reduces with the increased values of
of the choice of an SA. As emphasized by the work of [6], ζ , and for the lower values of ζ , minimum CBL crossover
the amount of VBL discharge in eq. (4) is determined by the points are within typical CBL range for an SRAM macro.
combination of σ O S of a given SA and the ζ factor defined in This assures Pdy-read savings with proposed R/L-HYSA-QZ
eq. (5). The ζ factor is indicated by the amount of VBL-min topologies around threshold region. This analysis also high-
required per σ O S to meet certain Yread target for particular lights that utilizing L-VT flavor transistors in proposed HYSA-
SRAM size in a given technology. For example, in the work QZ topology ultimately provides Pdy-read savings compared to
of [6], for 28-nm CMOS and 16 Mb SRAM, the ζ factor was using R-VT transistors around threshold region.
found to be ∼10mV/mV for 97 % yield target. To further see the impact of the proposed R/L-HYSA-QZ
Since the Pdy-read improvement is a function of bitline with nominal supply in superthreshold region of operation,
capacitance (CBL ), the Pdy-read improvement is analyzed across the analysis from Fig. 22 (a) was repeated with VDD = 1 V
CBL comparing R-HYSA-QZ with R-VLSA and R-CLSA, and ζ = 8 mV /mV , and is shown in Fig. 22 (c).
and comparing L-HYSA-QZ with R-HYSA-QZ, L-HYSA-Q, From Fig. 22 (c) it was evident that the R/L-HYSA-QZ can
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
2530 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS–I: REGULAR PAPERS, VOL. 66, NO. 7, JULY 2019
Fig. 21. (a) VDD−min at FCLK = 5 MHz from the shmoo plots in Fig. 20.
(extracted from the measured shmoo plots.) (b) 5k MC simulations of across
temperature for critical NMOS transistor in R-VT and L-VT flavors.
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
PATEL et al.: HYBRID LATCH-TYPE OFFSET TOLERANT SENSE AMPLIFIER FOR LOW-VOLTAGE SRAMs 2531
TABLE II
C OMPARISON W ITH THE S TATE - OF - THE -A RT O FFSET T OLERANT S ENSE A MPLIFIERS
precharging internal nodes with bitlines in HYSA-QZ are Reported offset improvement with respective gate area penalty
simpler timing and relatively stable offset improvement over is justified by Pelgrom’s model where R-HYSA-QZ achieves
wider supply range (0.4 V – 1 V) as evident in Fig. 18 (c). 46.6% and 7.7% iso-gate-area improvement in σ O S compared
The on-chip work in [16], similar to L-HYSA-Q, brings down to R-CLSA and R-VLSA, respectively. Moreover, measured
BER from 2.8% to 1% with the layout area penalty of 16.8% shmoo plots showed that R-HYSA-QZ, R-CLSA and R-VLSA
compared to CLSA while offering record low subthreshold required VDD−min of 0.26 V, 0.30 V and 0.26 V at 25 °C,
operation at 140 mV. The SA in [16] was implemented in low- respectively. Utilizing L-VT transistors in L-HYSA-QZ further
leakage and Low-Power (LP) technology whereas this work lowered its VBL-min by 13.0% and improved its operational
was implemented in regular General-Purpose (GP) technology coverage across temperature and frequency at 0.4 V.
where the leakage is relatively higher due to significant Finally, the overall SRAM Pdy-read benefit was analyzed
degradation in ION /IOFF ratio, ultimately constraining the with measured VOS statistics and resulting Pdy-SA of all SAs.
data reproducibility in this work. The proposed R/L-HYSA- It suggested that offset improvement in proposed HYSA-QZ
QZ topology is simple to implement without any timing or ultimately leads to an overall SRAM Pdy-read improvement
layout complexity and can easily be scaled with technology. compared to other analyzed SAs around threshold region. For
Also, it does not depend on the body-biasing effect or capacitor the nominal supply operations, the choice between R-VLSA,
non-linearity. Thus, conventional SAs, VLSA and CLSA can R-HYSA-QZ and L-HYSA-QZ is subject to targeted CBL of
simply be replaced by the R/L-HYSA-QZ while offering an SRAM macro and the technology.
reliable and higher offset tolerance around threshold region.
VII. C ONCLUSION ACKNOWLEDGEMENTS
The offset tolerance in proposed HYbrid Latch-Type Sense The authors would like to thank David Nairn of University
Amplifier, HYSA-QZ for low-voltage SRAMs is significantly of Waterloo and Marcel Pelgrom of Pelgrom Consulting for
increased by pre-charging multiple internal nodes with bitline their insightful comments.
signals. Along with HYSA-QZ, two intermediate formulations
of HYSA-QZ (CLSA-Z and HYSA-Q), and conventional R EFERENCES
CLSA and VLSA topologies were also analyzed and imple-
mented in 65 nm-GP CMOS test chips. Offset measurements [1] K. Zhang, K. Hose, V. De, and B. Senyk, “The scaling of data sensing
were performed across 5120 SAs from 10 ICs. It showed schemes for high speed cache design in sub-0.18 μ/m technologies,” in
that HYSA-QZ achieves 50.0% and 22.8% lower VBL-min Symp. VLSI Circuits Dig. Tech. Papers, Honolulu, HI, USA, Jun. 2000,
pp. 226–227.
at 0.4 V with 6.5% (or 4.5%) and 30.7% (or 18.8%) total
[2] R. Houle, “Simple statistical analysis techniques to determine minimum
gate (or layout) area overhead compared to R-CLSA and sense amp set times,” in Proc. IEEE Custom Integr. Circuits Conf.,
R-VLSA implemented with R-VT transistors, respectively. San Jose, CA, USA, Sep. 2007, pp. 37–40.
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.
2532 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS–I: REGULAR PAPERS, VOL. 66, NO. 7, JULY 2019
[3] B. Mohammad, P. Dadabhoy, K. Lin, and P. Bassett, “Comparative study Dhruv Patel (M’12) received the B.A.Sc. degree
of current mode and voltage mode sense amplifier used for 28 nm (Hons.) in electrical engineering from the Univer-
SRAM,” in Proc. 24th Int. Conf. Microelectron. (ICM), Algiers, Algeria, sity of Waterloo in 2016. He currently pursuing
Dec. 2012, pp. 1–6. the M.A.Sc. degree with the University of Toronto
[4] B. Wicht, T. Nirschl, and D. Schmitt-Landsiedel, “Yield and speed in the area of mixed-signal integrated circuits for
optimization of a latch-type voltage sense amplifier,” IEEE J. Solid-State optical links. During the bachelor’s degree, he was
Circuits, vol. 39, no. 7, pp. 1148–1158, Jul. 2004. involved in variation tolerant sub-threshold SRAM
[5] A. Bhavnagarwala et al., “Fluctuation limits & scaling opportunities circuits research. He has previously held internships
for CMOS SRAM cells,” in IEDM Tech. Dig., Washington, DC, USA, at Apple, Arista, BlackBerry, and Christie Digital.
Dec. 2005, pp. 659–662. He has received two NSERC undergraduate student
[6] M. H. Abu-Rahma et al., “Characterization of SRAM sense amplifier research assistantship awards. He has received the
input offset for yield prediction in 28 nm CMOS,” in Proc. IEEE Custom Queen Elizabeth II Graduate Scholarship and the Edward Rogers Sr. Graduate
Integr. Circuits Conf. (CICC), San Jose, CA, USA, Sep. 2011, pp. 1–4. Scholarship from 2016 to 2018 for the M.A.Sc. degree.
[7] S. J. Lovett, G. A. Gibbs, and A. Pancholy, “Yield and matching Adam Neale (M’08) received the B.A.Sc., M.A.Sc.,
implications for static RAM memory array sense-amplifier design,” and Ph.D. degrees in electrical and computer engi-
IEEE J. Solid-State Circuits, vol. 35, no. 8, pp. 1200–1204, Aug. 2000. neering from the University of Waterloo in 2008,
[8] D. G. Laurent, “Sense amplifier signal margins and process sensitivities 2010, and 2015, respectively. He is currently a Qual-
[DRAM],” IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 49, ity and a Reliability Engineer with Intel Corporation
no. 3, pp. 269–275, Mar. 2002. in Hillsboro, OR, USA, where he is responsible
[9] J. S. Shah, D. Nairn, and M. Sachdev, “An energy-efficient offset- for developing radiation-induced upset reliability
cancelling sense amplifier,” IEEE Trans. Circuits Syst. II, Exp. Briefs, models. He was a recipient of the Best Poster
vol. 60, no. 8, pp. 477–481, Aug. 2013. Paper Award from the 2014 IEEE Custom Inte-
[10] L. Pileggi, G. Keskin, X. Li, K. Mai, and J. Proesel, “Mismatch analysis grated Circuits Conference for his work on multiple
and statistical design at 65 nm and below,” in Proc. IEEE Custom Integr. adjacent-bit upset error correcting codes for SRAMs.
Circuits Conf., San Jose, CA, USA, Sep. 2008, pp. 9–12.
[11] J. F. Ryan and B. H. Calhoun, “Minimizing offset for latching Derek Wright (M’09) received the B.A.Sc. and
voltage-mode sense amplifiers for sub-threshold operation,” in Proc. M.A.Sc. degrees in electrical and computer engi-
9th Int. Symp. Quality Electron. Design (ISQED), San Jose, CA, USA, neering from the University of Waterloo in 2003 and
Mar. 2008, pp. 127–132. 2005, respectively, and the Ph.D. degree in collab-
[12] M. J. M. Pelgrom, A. C. J. Duinmaijer, and A. P. G. Welbers, “Matching orative electrical and biomedical engineering from
properties of MOS transistors,” IEEE J. Solid-State Circuits, vol. 24, the University of Toronto in 2010. He is currently
no. 5, pp. 1433–1439, Oct. 1989. a Faculty Member of the Electrical and Computer
[13] S. H. Dhong et al., “A 4.8GHz fully pipelined embedded SRAM in the Engineering Department, University of Waterloo.
streaming processor of a CELL processor,” in IEEE ISSCC Dig. Tech. His research interests include low-power digital cir-
Papers, San Francisco, CA, USA, vol. 1, Feb. 2005, pp. 486–612. cuit design, multidomain modeling and simulation,
[14] N. Chen and R. Chaba, “Dual sensing current latched sense amplifier,” and biomedical devices, circuits, and systems.
U.S. Patent 12 731 623, Mar. 3, 2010. Manoj Sachdev (M’85–SM’97–F’12) received the
[15] M.-F. Tsai, J.-H. Tsai, M.-L. Fan, P. Su, and C.-T. Chuang, “Variation B.E. degree (Hons.) in electronics and communica-
tolerant CLSAs for nanoscale bulk-CMOS and FinFET SRAM,” in Proc. tion engineering from IIT Roorkee, India, and the
IEEE Asia Pacific Conf. Circuits Syst., Kaohsiung, Taiwan, Dec. 2012, Ph.D. degree from Brunel University, U.K.
pp. 471–474. He was with Semiconductor Complex Ltd.,
[16] K. Sarfraz, J. He, and M. Chan, “A 140-mV variation-tolerant deep Chandigarh, India, from 1984 to 1989, where he
sub-threshold SRAM in 65-nm CMOS,” IEEE J. Solid-State Circuits, designed CMOS integrated circuits. From 1989 to
vol. 52, no. 8, pp. 2215–2220, Aug. 2017. 1992, he was the ASIC Division of SGS-Thomson,
[17] B. Razavi, “The StrongARM latch [A circuit for all seasons],” IEEE Agrate, Italy. In 1992, he joined Philips Research
Solid State Circuits Mag., vol. 7, no. 2, pp. 12–17, 2015. Laboratories, Eindhoven, The Netherlands, where he
[18] A. Abidi and H. Xu, “Understanding the regenerative comparator researched on various aspects of VLSI testing and
circuit,” in Proc. IEEE Custom Integr. Circuits Conf., San Jose, CA, manufacturing. He is currently a Professor with the Electrical and Computer
USA, Sep. 2014, pp. 1–8. Engineering Department, University of Waterloo, Canada. He has contributed
[19] P.-F. Chiu, B. Zimmer, and B. Nikolić, “A double-tail sense amplifier for five books and two book chapters. He has co-authored over 200 technical
low-voltage SRAM in 28 nm technology,” in Proc. IEEE Asian Solid- articles in conferences and journals. He holds over 30 granted and several
State Circuits Conf. (A-SSCC), Toyama, Japan, Nov. 2016, pp. 181–184. pending U.S. patents on various aspects of VLSI circuit design, reliability, and
[20] M. E. Sinangil and A. P. Chandrakasan, “Application-specific SRAM test. His research interests include low-power and high-performance digital
design using output prediction to reduce bit-line switching activ- circuit design, mixed-signal circuit design, and the test and manufacturing
ity and statistically gated sense amplifiers for up to 1.9 × Lower issues of integrated circuits.
energy/access,” IEEE J. Solid-State Circuits, vol. 49, no. 1, pp. 107–117, Dr. Sachdev is a fellow of the Engineering Institute of Canada and the
Jan. 2014. Canadian Academy of Engineering. He has served on several conference
[21] X. Liang, K. Turgay, and D. Brooks, “Architectural power models for committees. He was a member of the Technical Program Committee of the
sram and cam structures based on hybrid analytical/empirical tech- IEEE Custom Integrated Circuits Conference from 2008 to 2014. Since 2016,
niques,” in Proc. IEEE/ACM Int. Conf. Comput. Aided Design, San Jose, he has been a Program Committee Member of the IEEE Design and Test
CA, USA, Nov. 2007, pp. 824–830. in Europe Conference. He, his students, and his colleagues have received
[22] B. Giridhar, N. Pinckney, D. Sylvester, and D. Blaauw, “13.7 A reconfig- several international awards. He has received the Best Paper Award from the
urable sense amplifier with auto-zero calibration and pre-amplification IEEE European Design and Test Conference in 1997. He has received the
in 28 nm CMOS,” in IEEE ISSCC Dig. Tech. Papers, San Francisco, Best Panel Award from the IEEE VLSI Test Symposium in 2004. He was
CA, USA, Feb. 2014, pp. 242–243. a co-recipient of the Honorable Mention Award from the IEEE International
[23] M. E. Sinangil et al., “A 28 nm 2 Mbit 6 T SRAM with highly con- Test Conference in 1998, the Best Paper Award from the IEEE International
figurable low-voltage write-ability assist implementation and capacitor- Symposium on Quality Electronics Design in 2011, and the Best Poster Award
based sense-amplifier input offset compensation,” IEEE J. Solid-State from the IEEE Custom Integrated Circuits Conference in 2015. He was
Circuits, vol. 51, no. 2, pp. 557–567, Feb. 2016. the Technical Program Co-Chair of the IEEE Mixed-signal Test Workshop
[24] Y. Sinangil and A. P. Chandrakasan, “A 128 Kbit SRAM with an in 1999. He has also served as the Technical Program Chair for the IEEE
embedded energy monitoring circuit and sense-amplifier offset compen- IDDQ Test Workshop in 1999 and 2000. He was the Program Chair of
sation using body biasing,” IEEE J. Solid-State Circuits, vol. 49, no. 11, Microsystems and Nanoelectronics Research Conference, Ottawa, Canada,
pp. 2730–2739, Nov. 2014. in 2008 and 2009. From 2007 to 2008, he was an Associate Editor of the
[25] D. Patel and M. Sachdev, “0.23-V sample-boost-latch-based offset IEEE T RANSACTIONS ON V EHICULAR T ECHNOLOGY. He currently serves
tolerant sense amplifier,” in IEEE Solid-State Circuits Lett., vol. 1, no. 1, on the Editorial Board for the Journal of Electronic Testing: Theory and
pp. 6–9, Jan. 2018. Applications.
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on December 15,2021 at 15:52:22 UTC from IEEE Xplore. Restrictions apply.