
Scan Operation

Scan Chains
• Scan Enable:
– When active, allows scan data to enter the registers.
• Scan Input port:
– Data is loaded into scan cells.
• Scan Output port:
– Data is read by shifting data out.
Scan Based Designs
• Normal mode:
– Sequential elements perform regular system functions.
• Scan mode:
– Sequential elements are connected into one or more shift registers.
– Circuit appears combinational.
Full Scan
• Full scan is a scan design methodology that replaces all memory elements in the design with their scannable equivalents and then stitches (connects) them into scan chains.
Benefits
• Highly automated process.
• Highly effective, predictable method.
• Ease of use.
• Assured quality.
Partial Scan
• Partial scan is a scan design methodology where only a percentage of the storage elements in the design are replaced by their scannable equivalents and stitched into scan chains.
Benefits
• Reduced impact on area.
• Reduced impact on timing.
• More flexibility in trading off overhead against fault coverage.
Scan operation
How it works
• Select scan mode
• Shift in scan cell values
• Select normal mode
• Force primary input values
• Measure primary outputs
• Capture circuit response into scan cells
• Select scan mode
• Shift out scan data, shift in next set of scan cell values
Scan and ATPG Training

Contents
❑ Recap of previous class
❑ Design for Testability Basics
❑ Scan Cells Designs
❑ Scan Architectures
❑ Scan Design Rules
❑ Summary
❑ Assignments

Yield and Defect Levels

$DL = 1 - Y^{(1-d)}$

• DL = defect level = (defective parts sold) / (total parts sold)
• Y = yield = (defect-free parts fabricated) / (total parts fabricated)
• d = test coverage

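The defect-level relation above can be evaluated numerically. The following is a minimal sketch; the function names and the sample yield and coverage values are illustrative assumptions, not figures from the deck.

```python
def defect_level(yield_fraction: float, coverage: float) -> float:
    """Return DL = 1 - Y**(1 - d) for yield Y and test coverage d (both in 0..1)."""
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield must be in (0, 1]")
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be in [0, 1]")
    return 1.0 - yield_fraction ** (1.0 - coverage)


def defects_per_million(yield_fraction: float, coverage: float) -> float:
    """Express the defect level as defective parts per million shipped."""
    return defect_level(yield_fraction, coverage) * 1e6


if __name__ == "__main__":
    # Hypothetical process: 90% yield, coverage swept from 90% to 100%.
    for d in (0.90, 0.99, 1.00):
        print(f"d = {d:.2f}: DPM = {defects_per_million(0.90, d):.0f}")
```

With perfect coverage (d = 1) the defect level is 0; with no testing (d = 0) it equals 1 - Y, the fraction of defective parts fabricated.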
Functional vs. Structural Test
Functional: 3 + 5 = 8
[Figure: 1-bit adder with carry, gate-level schematic with labeled nodes]
Fault list:
a s-a-0    a s-a-1
b s-a-0    b s-a-1
e s-a-0    e s-a-1
f s-a-0    f s-a-1
r s-a-0    r s-a-1
t s-a-0    t s-a-1
s s-a-0    s s-a-1
c s-a-0    c s-a-1
Stuck-At Fault Model
• Fault models are logic targets for defects.
• A fault is detected:
– When a difference is observed between a “good” and a “faulty” circuit.
• Most common fault model:
– Most defects are detected with the stuck-at fault model.
– A terminal of a gate is permanently stuck at 0 or 1.

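The detection criterion above (a difference between the good and the faulty circuit) can be illustrated with a small simulation. This is a sketch on a hypothetical circuit, y = (a AND b) OR c with internal node e = a AND b; the circuit and the node names are assumptions chosen for illustration and are not the adder from the earlier slide.

```python
from itertools import product


def simulate(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c; fault is None or a (node, stuck_value) pair."""
    def force(node, value):
        # A stuck-at fault overrides whatever value the node would carry.
        if fault is not None and fault[0] == node:
            return fault[1]
        return value

    a, b, c = force("a", a), force("b", b), force("c", c)
    e = force("e", a & b)
    return force("y", e | c)


def detecting_vectors(fault):
    """Return every input vector whose output differs between good and faulty circuit."""
    return [v for v in product((0, 1), repeat=3)
            if simulate(*v) != simulate(*v, fault=fault)]


if __name__ == "__main__":
    print(detecting_vectors(("e", 0)))  # vectors that detect e stuck-at-0
    print(detecting_vectors(("e", 1)))  # vectors that detect e stuck-at-1
```

Detecting e stuck-at-0 needs a = b = 1 to activate the fault and c = 0 to propagate it to the output, which is why only one vector qualifies.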
What Is Testability?
• The ability to put a design into a known initial state, and then control and observe internal signal values.
• A highly testable design:
– A circuit that can be placed into a known initial state.
– PIs are controllable.
– POs are observable and measurable.
• Circuit with DFFs replaced with MUX scan:
– A highly testable design.

Design For Test (DFT)
• The act of adding logic / features to enhance the testability of the design.
• With the incorporation of DFT, the IC has 2 modes of operation:
– Test mode
– Functional mode

Why Design-for-Test?
• Increased Productivity:
– Shorter time-to-market
– Reduced design cycle
– Reduced cost
• Improved Quality:
– Reduced defects per million (DPM)
– Improved quality of test

Design for Testability Basics
❑ Design for testability (DFT) refers to those design techniques that make test generation and test application cost-effective.
❑ Ad hoc DFT
▪ Effects are local and not systematic
▪ Not methodical
▪ Difficult to predict
❑ Structured DFT
▪ Easily incorporated and budgeted
▪ Yields the desired results
▪ Easy to automate
Ad Hoc Approach
❑ Typical ad hoc DFT techniques
▪ Insert test points
▪ Avoid asynchronous set/reset for storage elements
▪ Avoid combinational feedback loops
▪ Avoid redundant logic
▪ Avoid asynchronous logic
▪ Partition a large circuit into small blocks

Ad Hoc Approach – Test Point Insertion
[Figure: Observation point insertion. Low-observability nodes in the logic circuit feed observation points OP1, OP2, and OP3, whose SI/SO ports are chained into an observation shift register controlled by SE and CK.]
• OP2 shows the structure of an observation point, which is composed of a multiplexer (MUX) and a D flip-flop.

Ad Hoc Approach – Test Point Insertion
[Figure: Control point insertion. Low-controllability nodes A, B, and C in the logic circuit are driven through control points CP1, CP2, and CP3, whose SI/SO ports are chained into a control shift register controlled by TM and CK.]
• A MUX is inserted between the source and destination ends of the original connection.
• During normal operation, TM = 0, such that the value from the source end drives the destination end through the 0 port of the MUX.
• During test, TM = 1, such that the value from the D flip-flop drives the destination end through the 1 port of the MUX.

Ad-Hoc DFT Methods
• Good design practices learnt through experience are used as guidelines:
– Avoid asynchronous (unclocked) feedback.
– Make flip-flops initializable.
– Avoid redundant gates. Avoid large fan-in gates.
– Provide test control for difficult-to-control signals.
– Avoid gated clocks.
– Consider ATE requirements (tristates, etc.)
• Design reviews conducted by experts or design auditing tools.
• Disadvantages of ad-hoc DFT methods:
– Experts and tools not always available.
– Test generation is often manual with no guarantee of high fault coverage.
– Design iterations may be necessary.
Structured Approach
❑ Scan design
▪ Convert the sequential design into a scan design
▪ Three modes of operation
– Normal mode
• All test signals are turned off
• The scan design operates in the original functional configuration
– Shift mode
– Capture mode
• In both shift and capture modes, a test mode signal TM is
often used to turn on all test-related fixes

Structured Approach - Scan Design
[Figure: Difficulty in testing a sequential circuit. Combinational logic with inputs X1 to X3 and outputs Y1 and Y2, flip-flops FF1 to FF3 clocked by CK, and a stuck-at fault f inside the logic.]
• Assume that a stuck-at fault f in the combinational logic requires the primary input X3, flip-flop FF2, and flip-flop FF3 to be set to 0, 1, and 0.
• The main difficulty in testing a sequential circuit stems from the fact that it is difficult to control and observe the internal state of the circuit.

Structured Approach - Scan Design
[Figure: A shift register composed of n scan cells; the test stimulus is applied through it and the test response is uploaded from it.]
Scan insertion:
1. Converting selected storage elements in the design into scan cells.
2. Stitching them together to form scan chains.
How to detect stuck-at fault f:
(1) switching to shift mode and shifting in the desired test stimulus, 1 and 0, to FF2 and FF3, respectively
(2) driving a 0 onto primary input X3
(3) switching to capture mode and applying one clock pulse to capture the fault effect into FF1
(4) switching back to shift mode and shifting out the test response stored in FF1, FF2, and FF3 for comparison with the expected response.

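The four detection steps above can be sketched as a tiny simulation. Everything about the logic here is an assumption made for illustration: the fault site is modeled as a node computing (NOT X3) AND FF2 AND (NOT FF3), the fault is taken to be stuck-at-0, and the next-state functions are invented. Only the shift / capture / shift sequence follows the slide.

```python
def shift(chain, bits_in):
    """Shift mode: clock one bit per cycle into the chain; return the bits seen at SO."""
    bits_out = []
    for bit in bits_in:
        bits_out.append(chain[-1])      # SO is the output of the last scan cell
        chain[:] = [bit] + chain[:-1]   # every cell takes its predecessor's value
    return bits_out


def capture(chain, x3, fault_active):
    """Capture mode: one clock pulse loads the logic response into the scan cells."""
    ff1, ff2, ff3 = chain
    node_f = (1 - x3) & ff2 & (1 - ff3)  # hypothetical fault site
    if fault_active:
        node_f = 0                       # assumed stuck-at-0 fault
    chain[:] = [node_f, ff1, ff2]        # hypothetical next-state functions


def run_scan_test(fault_active):
    chain = [0, 0, 0]                        # [FF1, FF2, FF3]
    stimulus = [0, 1, 0]                     # FF1 is a don't-care, FF2 = 1, FF3 = 0
    shift(chain, list(reversed(stimulus)))   # (1) first bit in ends up in FF3
    capture(chain, x3=0, fault_active=fault_active)  # (2) drive X3 = 0, (3) capture
    response = shift(chain, [0, 0, 0])       # (4) shift out: FF3 first, FF1 last
    return response[::-1]                    # reorder to [FF1, FF2, FF3]


if __name__ == "__main__":
    print("good  :", run_scan_test(False))
    print("faulty:", run_scan_test(True))
```

The two responses differ in the FF1 position, which is where the fault effect was captured, so comparing the shifted-out response with the expected one exposes the fault.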
Scan Design
• Circuit is designed using pre-specified design rules.
• Test structure (hardware) is added to the verified design:
– Add a test control (TC) primary input.
– Replace flip-flops by scan flip-flops (SFF) and connect to form one or more shift registers in the test mode.
– Make input/output of each scan shift register controllable/observable from PI/PO.
• Use combinational ATPG to obtain tests for all testable faults in the combinational logic.
• Add shift register tests and convert ATPG tests into scan sequences for use in manufacturing test.

Scan Cell Design
❑ A scan cell has two inputs: data input and scan input
▪ In normal/capture mode, data input is selected to update the output
▪ In shift mode, scan input is selected to update the output
❑ Three widely used scan cell designs
▪ Muxed-D Scan Cell
▪ Clocked-Scan Cell
▪ LSSD Scan Cell

Muxed-D Scan Cell
[Figure: Edge-triggered muxed-D scan cell. A multiplexer selects DI (port 0) or SI (port 1) under SE and drives the D input of a flip-flop clocked by CK, with output Q/SO.]
• This scan cell is composed of a D flip-flop and a multiplexer.
• The multiplexer uses an additional scan enable input SE to select between the data input DI and the scan input SI.

Muxed-D Scan Cell
[Figure: Edge-triggered muxed-D scan cell design and operation. Waveforms for CK, SE, DI (D1 to D4), SI (T1 to T4), and Q/SO.]
• In normal/capture mode, SE is set to 0. The value present at the data input DI is captured into the internal D flip-flop when a rising clock edge is applied.
• In shift mode, SE is set to 1. The scan input SI is used to shift in new data to the D flip-flop, while the content of the D flip-flop is being shifted out.

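The muxed-D behaviour described above can be captured in a very small behavioural model. This is only a sketch: the class and method names are assumptions, and each call to `clock` stands for one rising edge of CK.

```python
class MuxedDScanCell:
    """Behavioural model of an edge-triggered muxed-D scan cell."""

    def __init__(self):
        self.q = 0  # Q/SO output of the internal D flip-flop

    def clock(self, di, si, se):
        """Apply one rising CK edge: SE = 0 selects DI, SE = 1 selects SI."""
        self.q = si if se else di
        return self.q


if __name__ == "__main__":
    cell = MuxedDScanCell()
    print(cell.clock(di=1, si=0, se=0))  # normal/capture mode: DI is captured
    print(cell.clock(di=1, si=0, se=1))  # shift mode: SI is shifted in
```

Chaining several such cells, with each SI fed from the previous cell's Q/SO, gives the shift register behaviour used during scan.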
Muxed-D Scan Cell
[Figure: Level-sensitive/edge-triggered muxed-D scan cell design.]
• This scan cell is composed of a multiplexer, a D latch, and a D flip-flop.
• In this case, shift operation is conducted in an edge-triggered manner, while normal operation and capture operation are conducted in a level-sensitive manner.

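Scan Flip-Flop (SFF)
[Figure: Scan flip-flop built from a MUX (the logic overhead) that selects between D and SD under SE, followed by a D flip-flop made of a master latch and a slave latch. Timing: the master latch is open in one phase of CK and the slave latch in the other; SE selects scan mode (SD selected) or normal mode (D selected).]
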
Clocked-Scan Cell
[Figure: Clocked-scan cell design and operation, with data input DI, scan input SI, output Q/SO, data clock DCK, and shift clock SCK.]
• In the clocked-scan cell, input selection is conducted using two independent clocks, DCK and SCK.
• In normal/capture mode, the data clock DCK is used to capture the contents present at the data input DI into the clocked-scan cell.
• In shift mode, the shift clock SCK is used to shift in new data from the scan input SI into the clocked-scan cell, while the content of the clocked-scan cell is being shifted out.

LSSD Scan Cell
[Figure: Polarity-hold SRL (shift register latch). Latches L1 and L2 with inputs D, C, I, A, and B and outputs +L1 and +L2.]
• An LSSD scan cell is used for level-sensitive, latch-based designs.
• This scan cell contains two latches, a master 2-port D latch L1 and a slave D latch L2. Clocks C, A and B are used to select between the data input D and the scan input I to drive +L1 and +L2.
• In an LSSD design, either +L1 or +L2 can be used to drive the combinational logic of the design.

LSSD Scan Cell
[Figure: Polarity-hold SRL design and operation. Waveforms for clocks C, A, and B, data input D (D1 to D4), scan input (T1 to T4), +L1, and +L2.]
• In order to guarantee race-free operation, clocks A, B, and C are applied in a non-overlapping manner.
• The master latch L1 uses the system clock C to latch system data from the data input D and to output this data onto +L1. Clock B is used after clock A to latch the system data from latch L1 and to output this data onto +L2.

Comparing Three Scan Cell Designs

Scan cell         | Advantages | Disadvantages
Muxed-D scan cell | Compatibility with modern designs; comprehensive support provided by existing design automation tools | Adds a multiplexer delay
Clocked-scan cell | No performance degradation | Requires additional shift clock routing
LSSD scan cell    | Inserts scan into a latch-based design; guaranteed to be race-free | Increases routing complexity

Scan Architectures
❑ Full-Scan Design
▪ All or almost all storage elements are converted into scan cells and combinational ATPG is used for test generation
❑ Partial-Scan Design
▪ A subset of storage elements is converted into scan cells and sequential ATPG is typically used for test generation
❑ Random-Access Scan Design
▪ A random addressing mechanism, instead of serial scan chains, is used to provide direct access to read or write any scan cell
Full-Scan Design
❑ All storage elements are replaced with scan cells
▪ All inputs can be controlled
▪ All outputs can be observed
❑ Advantage:
▪ Converts sequential ATPG into combinational ATPG
❑ Almost full-scan design
▪ A small percentage of storage elements are not
replaced with scan cells
– For performance reasons
• Storage elements that lie on critical paths
– For functional reasons
• Storage elements driven by a small clock domain that are
deemed too insignificant to be worth the additional scan
insertion effort

Muxed-D Full-Scan Design
[Figure: Sequential circuit example. Combinational logic with inputs X1 to X3 and outputs Y1 and Y2, and three D flip-flops FF1 to FF3 clocked by CK.]
• The three D flip-flops, FF1, FF2 and FF3, are replaced with three muxed-D scan cells, SFF1, SFF2 and SFF3, respectively.

Adding Scan Structure
[Figure: Combinational logic with PI and PO; three SFFs chained from SCANIN to SCANOUT and controlled by TC or TCK.]
Not shown: CK or MCK/SCK feed all SFFs.
Muxed-D Full-Scan Design
[Figure (a): Muxed-D full-scan circuit. PI (X1 to X3), PO (Y1, Y2), PPI, PPO, and scan cells SFF1 to SFF3 sharing SE and CK.]
• To form a scan chain, the scan inputs SI of SFF2 and SFF3 are connected to the output Q of the previous scan cell, SFF1 and SFF2, respectively.
• In addition, the scan input SI of the first scan cell SFF1 is connected to the primary input SI, and the output Q of the last scan cell SFF3 is connected to the primary output SO.

Muxed-D Full-Scan Design
[Figure (b): Test operations. Timing diagram of PI, SE, CK, SFF1.Q, SFF2.Q, SFF3.Q, and PO for two test vectors V1 and V2, showing when PI and PPI values are applied and when PO and PPO values are observed.]
S: shift operation / C: capture operation / H: hold cycle

Muxed-D Full-Scan Design
• Primary inputs (PIs)
– the external inputs to the circuit
– can be set to any required logic values
– are set directly in parallel from the external inputs
• Pseudo primary inputs (PPIs)
– the scan cell outputs
– can be set to any required logic values
– are set serially through scan chain inputs
• Primary outputs (POs)
– the external outputs of the circuit
– can be observed
– are observed directly in parallel from the external outputs
• Pseudo primary outputs (PPOs)
– the scan cell inputs
– can be observed
– are observed serially through scan chain outputs
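Comb. Test Vectors
[Figure: Combinational logic with primary inputs PI (I1, I2), primary outputs PO (O1, O2), present state (S1, S2) loaded through SCANIN, and next state (N1, N2) unloaded through SCANOUT; TC is the test control input.]
[Figure: Scan sequence timing for PI (I1, I2), SCANIN (S1, S2), TC, PO (O1, O2), and SCANOUT (N1, N2); the remaining bit positions are don't-care or random bits.]
Sequence length = $(n_{comb} + 1)\, n_{sff} + n_{comb}$ clock periods
• $n_{comb}$ = number of combinational vectors
• $n_{sff}$ = number of scan flip-flops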
Testing Scan Register
• Scan register must be tested prior to application of scan test sequences.
• A shift sequence 00110011 . . . of length $n_{sff} + 4$ in scan mode (TC = 0) produces 00, 01, 11 and 10 transitions in all flip-flops and observes the result at SCANOUT output.
• Total scan test length: $(n_{comb} + 1)\, n_{sff} + n_{comb} + n_{sff} + 4$ clock periods.
– Example: 2,000 scan flip-flops, 500 comb. vectors, total scan test length ~ $10^6$ clocks.
– Multiple scan registers reduce test length.

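The two length formulas above are easy to check numerically. The sketch below uses function names of my own choosing; the inputs come from the deck (the 2,000 flip-flop example, and the s5378 figures of 585 vectors and 179 scan flip-flops quoted in the ATPG example).

```python
def scan_sequence_length(n_comb: int, n_sff: int) -> int:
    """Clock periods needed to apply n_comb combinational vectors through one chain."""
    return (n_comb + 1) * n_sff + n_comb


def total_scan_test_length(n_comb: int, n_sff: int) -> int:
    """Add the scan register test (a shift sequence of length n_sff + 4)."""
    return scan_sequence_length(n_comb, n_sff) + n_sff + 4


if __name__ == "__main__":
    print(total_scan_test_length(500, 2000))  # the 2,000 flip-flop example
    print(total_scan_test_length(585, 179))   # the s5378 full-scan figures
```

The first call gives 1,004,504 clocks, which is the "about 10^6" quoted on the slide. The second reproduces the 105,662 scan sequence length reported for s5378.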
Multiple Scan Registers
• Scan flip-flops can be distributed among any number of shift registers, each having a separate scanin and scanout pin.
• Test sequence length is determined by the longest scan shift register.
• Just one test control (TC) pin is essential.
[Figure: Combinational logic with shared PI/SCANIN and PO/SCANOUT pins; a MUX controlled by TC drives the shared output pin; CK clocks the SFFs.]
Scan Overheads
• IO pins: One pin necessary.
• Area overhead:
– Gate overhead = $[4\, n_{sff} / (n_g + 10\, n_{sff})] \times 100\%$, where $n_g$ = comb. gates and $n_{sff}$ = flip-flops. Example: $n_g$ = 100k gates, $n_{sff}$ = 2k flip-flops, overhead = 6.7%.
– More accurate estimate must consider scan wiring and layout area.
• Performance overhead:
– Multiplexer delay added in combinational path; approx. two gate-delays.
– Flip-flop output loading due to one additional fanout; approx. 5-6%.

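The gate-overhead estimate above can be reproduced in a few lines. The function name is an assumption; the formula counts 10 gates per non-scan flip-flop and 4 extra gates per scan flip-flop, as the slide's expression implies.

```python
def gate_overhead_percent(n_g: int, n_sff: int) -> float:
    """Gate overhead of scan insertion: 4 extra gates per flip-flop, as a percentage."""
    return 4 * n_sff / (n_g + 10 * n_sff) * 100.0


if __name__ == "__main__":
    print(round(gate_overhead_percent(100_000, 2_000), 1))  # slide example
    print(round(gate_overhead_percent(2_781, 179), 2))      # s5378 gate counts
```

The first value rounds to the 6.7% given on the slide, and the second matches the 15.66% gate overhead listed for s5378 in the ATPG example.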
Hierarchical Scan
• Scan flip-flops are chained within subnetworks before chaining subnetworks.
• Advantages:
– Automatic scan insertion in netlist
– Circuit hierarchy preserved, which helps in debugging and design changes
• Disadvantage: Non-optimum chip layout.
[Figure: The same four scan flip-flops (SFF1 to SFF4) chained between Scanin and Scanout, shown as a hierarchical netlist and as a flat layout.]
Optimum Scan Layout
[Figure: Chip layout with IO pads, flip-flop cells, SFF cells, routing channels, and interconnects between SCANIN, TC, and SCANOUT. Active areas: XY and X'Y'.]

Scan Area Overhead
Linear dimensions of active area:
$X = (C + S) / r$
$X' = (C + S + \alpha S) / r$
$Y' = Y + r y = Y + Y (1 - \beta) / T$
Area overhead:
$\frac{X'Y' - XY}{XY} \times 100\% = \left[ (1 + \alpha s)\left(1 + \frac{1 - \beta}{T}\right) - 1 \right] \times 100\% \approx \left( \alpha s + \frac{1 - \beta}{T} \right) \times 100\%$
where
• y = track dimension, wire width + separation
• C = total comb. cell width
• S = total non-scan FF cell width
• s = fractional FF cell area = S/(C+S)
• α = SFF cell width fractional increase
• r = number of cell rows or routing channels
• β = routing fraction in active area
• T = cell height in track dimension y
Example: Scan Layout
• 2,000-gate CMOS chip
• Fractional area under flip-flop cells, s = 0.478
• Scan flip-flop (SFF) cell width increase, α = 0.25
• Routing area fraction, β = 0.471
• Cell height in routing tracks, T = 10
• Calculated overhead = 17.24%
• Actual measured data:

Scan implementation | Area overhead | Normalized clock rate
None                | 0.0           | 1.00
Hierarchical        | 16.93%        | 0.87
Optimum layout      | 11.90%        | 0.91
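The calculated overhead in this example can be reproduced from the layout parameters. The sketch below assumes the simplified estimate, overhead = width increase × flip-flop area fraction + (1 − routing fraction) / T, which matches the 17.24% on the slide. The `exact` option keeps the product of the width and height growth factors and is included only for comparison; the function and parameter names are assumptions.

```python
def scan_area_overhead_percent(s, width_increase, routing_fraction, t, exact=False):
    """Estimate layout area overhead of scan as a percentage.

    s: fractional area under flip-flop cells
    width_increase: fractional width increase of a scan flip-flop cell
    routing_fraction: fraction of the active area used for routing
    t: cell height in routing tracks
    """
    width_growth = width_increase * s            # X'/X - 1
    height_growth = (1.0 - routing_fraction) / t # Y'/Y - 1
    if exact:
        return ((1.0 + width_growth) * (1.0 + height_growth) - 1.0) * 100.0
    return (width_growth + height_growth) * 100.0


if __name__ == "__main__":
    print(round(scan_area_overhead_percent(0.478, 0.25, 0.471, 10), 2))
    print(round(scan_area_overhead_percent(0.478, 0.25, 0.471, 10, exact=True), 2))
```

The simplified estimate prints 17.24, in line with the slide; the exact product form is slightly higher because it keeps the cross term.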
ATPG Example: S5378

                                              | Original | Full-scan
Number of combinational gates                 | 2,781    | 2,781
Number of non-scan flip-flops (10 gates each) | 179      | 0
Number of scan flip-flops (14 gates each)     | 0        | 179
Gate overhead                                 | 0.0%     | 15.66%
Number of faults                              | 4,603    | 4,603
PI/PO for ATPG                                | 35/49    | 214/228
Fault coverage                                | 70.0%    | 99.1%
Fault efficiency                              | 70.9%    | 100.0%
CPU time on SUN Ultra II, 200MHz processor    | 5,533 s  | 5 s
Number of ATPG vectors                        | 414      | 585
Scan sequence length                          | 414      | 105,662
Timing and Power
• Small delays in scan path and clock skew can cause race condition.
• Large delays in scan path require slower scan clock.
• Dynamic multiplexers: Skew between TC and its complement (TC-bar) can cause momentary shorting of D and SD inputs.
• Random signal activity in combinational circuit during scan can cause excessive power dissipation.
Muxed-D Full-Scan Design

Circuit operation type | Scan cell mode | TM | SE
Normal                 | Normal         | 0  | 0
Shift operation        | Shift          | 1  | 1
Capture operation      | Capture        | 1  | 0

Table: Circuit operation type and scan cell mode

Clocked Full-Scan Design
[Figure: Clocked full-scan circuit. Scan cells SFF1 to SFF3 are chained from SI to SO, each clocked by DCK and SCK.]
• In a muxed-D full-scan circuit, a scan enable signal SE is used.
• In a clocked full-scan design, two operations are distinguished by properly applying the two independent clocks SCK and DCK during shift mode and capture mode.

LSSD Full-Scan Design
❑ Single-latch design
❑ Double-latch design

LSSD Full-Scan Design
[Figure: Single-latch design. Combinational logic 1 and 2 with inputs X1 to X3 and outputs Y1 and Y2; scan cells SRL1 to SRL3 are chained from SI to SO through I and +L2 and clocked by C1, C2, A, and B.]
• The output port +L1 of the master latch L1 is used to drive the combinational logic of the design.
• In this case, the slave latch L2 is used only for scan testing.

LSSD Full-Scan Design
[Figure: Double-latch design. Combinational logic with inputs X1 to X3 and outputs Y1 and Y2; scan cells SRL1 to SRL3 form a chain from SI to SO; clocks C1, A, and "C2 or B".]
• In normal mode, the C1 and C2 clocks are used in a non-overlapping manner.
• During the shift operation, clocks A and B are applied in a non-overlapping manner, and the scan cells SRL1 ~ SRL3 form a single scan chain from SI to SO.
• During the capture operation, clocks C1 and C2 are applied to load the test response from the combinational logic into the scan cells.

LSSD Design Rules
❑ All storage elements must be polarity-hold latches.
❑ The latches are controlled by two or more non-overlapping clocks.
❑ A set of clock primary inputs must follow three conditions:
▪ All clock inputs to SRLs must be inactive when clock PIs are inactive
▪ The clock input to any SRL must be controlled from one or more clock primary inputs
▪ No clock can be ANDed with another clock or its complement
❑ Clock primary inputs must not feed the data inputs to SRLs either directly or through combinational logic.
❑ Each system latch must be part of an SRL, and each SRL must be part of a scan chain.
❑ A scan state exists under certain conditions:
▪ Each SRL or scan out SO is a function of only the preceding SRL or scan input SI in its scan chain during the scan operation
▪ All clocks except the shift clocks are disabled at the SRL clock inputs

Partial-Scan
Design
❑ Was once used in the industry long before
full-scan design became the dominant scan
architecture.
❑ Can also be implemented using muxed-D
scan cells, clocked-scan cells, or LSSD scan
cells.
❑ Either combinational ATPG or sequential
ATPG can be used.

Partial-Scan Design
[Figure: An example of muxed-D partial-scan design. SFF1 and SFF3 are chained from SI to SO, and FF2 is a non-scan flip-flop.]
• A scan chain is constructed with two scan cells SFF1 and SFF3, while flip-flop FF2 is left out.
• It is possible to reduce the test generation complexity by splitting the single clock into two separate clocks, one for controlling all scan cells, the other for controlling all non-scan storage elements. However, this may result in additional complexity of routing two separate clock trees during physical implementation.

Partial-Scan Design
❑ Scan cell selection
▪ A functional partitioning approach
– A circuit is composed of a data path portion and a control portion
– Storage elements on the data path are left out of the scan cell replacement process
– Storage elements on the control path can be replaced with scan cells
▪ A pipelined or feed-forward partial-scan design approach
– Make the sequential circuit feedback-free by selecting the storage elements to break all sequential feedback loops
– First construct a structure graph for the sequential circuit
▪ A balanced partial-scan design approach
– Use a target sequential depth to simplify the test generation process for the pipelined or feed-forward partial-scan design

Partial-Scan Design - Structure Graph
❑ A feedback-free sequential circuit
▪ Use a directed acyclic graph (DAG)
▪ The maximum level in the structure graph is referred to as sequential depth
❑ A sequential circuit containing feedback loops
▪ Use a directed cyclic graph (DCG)

Sequential circuit and its structure graph
[Figure: (a) Sequential circuit with flip-flops FF1 to FF5 and combinational blocks C1 to C3; (b) structure graph with nodes 1 to 5.]
• Sequential depth is 3
• The sequential depth of a circuit is equal to the maximum number of clock cycles that need to be applied in order to control and observe values to and from all non-scan storage elements
• The sequential depth of a full-scan circuit is 0

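The sequential depth of a feedback-free structure graph can be computed as the number of nodes on its longest path. The sketch below is an illustration only: the edge list is a guess at the five-node graph in the figure (the exact edges are not recoverable from the slide text), and the function name is an assumption.

```python
def sequential_depth(successors):
    """Return the maximum level of a structure graph given as {node: [successor, ...]}.

    Only non-scan storage elements appear as nodes, so an empty graph
    (a full-scan circuit) has depth 0. A feedback loop raises ValueError.
    """
    level = {}
    visiting = set()

    def longest_from(node):
        if node in level:
            return level[node]
        if node in visiting:
            raise ValueError("graph contains a feedback loop")
        visiting.add(node)
        depth = 1 + max((longest_from(n) for n in successors.get(node, ())), default=0)
        visiting.discard(node)
        level[node] = depth
        return depth

    return max((longest_from(n) for n in successors), default=0)


if __name__ == "__main__":
    # Hypothetical edges for flip-flops 1..5: 1->3, 2->3, 2->4, 3->5, 4->5.
    graph = {1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}
    print(sequential_depth(graph))  # longest path, e.g. 1 -> 3 -> 5
    print(sequential_depth({}))     # full-scan circuit: no non-scan elements
```

Breaking feedback loops by scanning selected flip-flops removes those nodes from the graph, which is what turns a cyclic graph into one this function accepts.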
Partial-Scan
Design
❑ Advantage:
▪ Reduce silicon area overhead
▪ Reduce performance degradation
❑ Disadvantage:
▪ Can result in lower fault coverage
▪ Longer test generation time
▪ Offers less support for debug, diagnosis
and failure analysis

60
Random-Access Scan Design
❑ Advantages of RAS:
▪ Can control or observe individual scan cells without affecting others
▪ Reduces test power dissipation
▪ Simplifies the process of performing delay test
❑ Disadvantages of traditional RAS:
▪ High overhead in scan design and routing
▪ No guarantee to reduce the test application time
❑ Progressive Random-Access Scan (PRAS) was proposed to alleviate the disadvantages in traditional RAS

Traditional random-access scan architecture
[Figure: Combinational logic with PI and PO; a two-dimensional array of scan cells (SC) addressed by a row (X) decoder and a column (Y) decoder, driven by an address shift register (AI), with CK, SI, SCK, and SO.]
• All scan cells are organized into a two-dimensional array. A ⌈log2 n⌉-bit address shift register, where n is the total number of scan cells, is used to specify which scan cell to access.

Progressive Random-Access Scan (PRAS)
[Figure: PRAS scan cell design, with data input D, output Q, scan data lines SD, row enable RE, and clock Φ.]
• Structure is similar to that of a static random access memory (SRAM) cell or a grid-addressable latch.
• In normal mode, all horizontal row enable (RE) signals are set to 0, forcing each scan cell to act as a normal D flip-flop.
• In test mode, to capture the test response from D, the RE signal is set to 0 and a pulse is applied on clock Φ, which causes the value on D to be loaded into the scan cell.

Progressive Random-Access Scan (PRAS)
[Figure: PRAS architecture. A two-dimensional array of scan cells (SC) around the combinational logic, with a row enable shift register, column line drivers, a column address decoder (CA), sense amplifiers and a MISR, and test control logic driven by TM, SI/SO, and CK.]
• Rows are enabled in a fixed order.
• It is only necessary to supply a column address to specify which scan cell in an enabled row to access.

PRAS - test procedure
For each test vector, the test stimulus application and test response compression are conducted in an interleaved manner when the test mode signal TM is enabled.

    for each test vector vi (i = 1, 2, …, N) {
        /* Test stimulus application */
        /* Test response compression */
        enable TM;
        for each row rj (j = 1, 2, …, m) {
            read all scan cells in rj / update MISR;
            for each scan cell SC in rj
                /* v(SC): current value of SC */
                /* vi(SC): value of SC in vi */
                if v(SC) ≠ vi(SC)
                    update SC;
        }
        /* Test response acquisition */
        disable TM;
        apply the normal clock;
    }
    scan-out MISR as the final test response;

Scan Design Rules
• Use only clocked D-type flip-flops for all state variables.
• At least one PI pin must be available for test; more pins, if available, can be used.
• All clocks must be controlled from PIs.
• Clocks must not feed data inputs of flip-flops.

Scan Compression &
Compression Techniques
Necessity of Scan
Compression
• Scan compression has become a necessity for meeting test cost and
quality requirements of today’s nanometer designs.
• When considering a scan compression technology, several key factors
should be considered in order to ensure that the compression
technology does not take anything away from the existing high quality,
low cost test.
Some of the key areas are:
- Impact on test quality (test coverage)
- Data and time compression (tolerance to X sources)
- Low pin count testing (to enable multi-site testing)
- Area and layout overhead
- Diagnostics and impact on manufacturing flow
Why
Compression
• Semiconductor companies realized a need for compression because of rising
tester costs.
• The test pattern data volume exceeded the tester memory, requiring pattern
reloads and excessive test application time.
• Over time, that need has been supplemented with the necessity to improve
test quality.
• New fault models and additional test patterns are needed to detect new types
of defects and meet the quality levels of nanometer designs.
• The undesirable option of pattern truncation results in lower test coverage and
ultimately an increase in defective parts per million (DPPM) that are shipped to
customers.
• Therefore, in order to avoid an increase in test escapes due to low test quality,
the industry has recognized an inevitable need for test pattern compression.
Uncompressed Scan Vs Compressed
Scan
Goals of Scan
Compression
Given that the goals of scan pattern compression are to lower tester costs and
maintain high test quality, we need to identify the specific requirements for an
effective compression technique.
• Test Cost
• Reduce the requirement of scan data memory
• Reduce test application time per part
• Reduce the number of required scan channels
• Reduce simulation time for serial load patterns
• Test Quality
• Ability to support and compress all pattern types to fit within tester memory
• Ability to support and compress patterns for several different fault models
• Ability to maintain high at-speed test coverage in the presence of many X sources
• Diagnostics of compressed scan patterns
Scan Compression Techniques
• The key requirement of any compression technology is
preservation of high test quality when compared to standard
(uncompressed) patterns.
• Several technologies have been developed over the years in
order to meet the compression goals outlined in the previous
section (Test Cost & Test Quality)

•Scan Compression Techniques:


1. Virtual Scan ---> SynTest Technologies
2. Adaptive Scan ---> Synopsys
3. On-Product Multiple Input Signature Register (OP-MISR) ---> Cadence
4. Embedded Deterministic Test (EDT) ---> Mentor Graphics
Embedded Deterministic Test
(EDT)
• EDT was developed primarily to reduce the testing time and test data
volume on large multimillion gate designs that need testing for
several fault models.
• Mentor Graphics TestKompress (TK) is the tool that can generate the
decompressor and compactor logic at the RTL level.
• The architecture consists of a decompressor and a compactor
logic embedded on the chip.
• The decompressor drives the scan chain inputs and the
compactor connects from the scan chain outputs.
• The EDT logic is inserted only in the scan path.
• In the presence of EDT logic we can have a large number of
very short scan chains.
EDT
Architecture
EDT Architecture -
Decompressor
• The decompressor consists of a ring generator
• The outputs of the ring generator flops will connect to scan chain
inputs through a phase shifter consisting of XOR gates.
• Creation of the compressed stimuli from a test pattern consists of
solving a set of linear equations based on the ring generator
polynomial and the phase shifter connections.
• Inputs to the ring generator are driven from the compressed stimuli
on the ATE.
• The input side is called a continuous-flow ring generator (Figure 3). It is similar to a linear feedback shift register (LFSR) in that it can produce random data, but here the device is used to decode compressed data with every shift of scan channel values.
EDT Architecture - Compactor
• The output response compactor consists of an XOR tree and the masking logic.
• The masking logic consists of a pattern mask register, decoder, and AND gates before the XOR tree.
• The logic values for the pattern mask register are loaded from the compressed pattern data on the ATE.
• The masking logic and the XOR tree compactor have the ability to handle any number of unknowns (Xs) from the scan chains without any modifications to the functional logic.
• The masking logic will also eliminate the effects of fault aliasing through the XOR tree.
• The decompressor/compactor logic implemented in TestKompress can also perform fault diagnosis using the same compressed patterns that are used on the ATE.
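The role of the AND-gate masking ahead of the XOR tree can be shown with a toy model. This is a heavily simplified sketch, not the TestKompress implementation: it compacts one shift cycle of scan chain outputs, represents an unknown as the string "X", and takes the mask as a plain list of enable bits (the real logic derives the mask from a pattern mask register and decoder).

```python
def compact(chain_outputs, mask):
    """XOR-compact one shift cycle of scan chain outputs after AND-gate masking.

    chain_outputs: list of 0, 1, or "X" (unknown), one entry per scan chain
    mask: list of 0/1 enables; a 0 forces that chain's contribution to 0
    Returns 0 or 1, or "X" if an unmasked unknown reaches the XOR tree.
    """
    result = 0
    for value, enable in zip(chain_outputs, mask):
        if not enable:
            continue          # masked chain: the AND gate outputs 0
        if value == "X":
            return "X"        # an unmasked X corrupts the compacted output
        result ^= value
    return result


if __name__ == "__main__":
    outputs = [1, 0, "X", 1]
    print(compact(outputs, [1, 1, 1, 1]))  # X propagates to the output
    print(compact(outputs, [1, 1, 0, 1]))  # chain 2 masked: known value again
```

Masking the chain that carries the X restores an observable value, so a fault effect on one of the other chains in that cycle still flips the compacted output.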
Advantages of EDT Architecture
• Best encoding capacity.
• A single scan channel can be used to obtain time and data compressions of more than 100x.
• No routing congestion, as there are no high fan-out nets.
• Modular implementation can be used for easy block-level implementation.
• All faults that propagate to scan cells are guaranteed to be detected by the automated masking capability of the compactor, even in the presence of any number of Xs and fault aliasing.
• No need to generate any top-up ATPG patterns, as the test coverage in the compressed and bypass modes is the same.
• Compressed patterns can be directly diagnosed for failures found on the tester, and there is no loss of diagnostic resolution compared with bypass mode patterns.
Limitations of EDT
• The ring generator, phase shifter, XOR tree compactor, and X-masking logic generally contribute an area overhead of up to 1%.
Scan Compression
Ratio
• Designs require on-chip test hardware to compress the time and
memory of automatic test pattern generation (ATPG) tests to
manageable budgets.
• This on-chip test hardware is generally referred to as scan
compression or simply compression.
• The ratio of the number of scan channels to the external
FULLSCAN
chains is the target compression ratio.
• When the scan chains are properly balanced, you can reduce test
time and test data volume close to the target ratio.
• In many designs, using the right architecture, DFT engineers can
expect to achieve 200X compression efficiency or more, translating
into equivalent test time and data volume savings.
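The arithmetic behind the target ratio can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the design size (100,000 scan cells), the 8 external chains, and the 800 internal scan channels are hypothetical numbers, the chains are assumed to be perfectly balanced, and the pattern count is assumed to be the same with and without compression.

```python
import math


def target_compression_ratio(internal_channels: int, external_chains: int) -> float:
    """Ratio of internal scan channels to external full-scan chains."""
    return internal_channels / external_chains


def shift_cycles_per_pattern(scan_cells: int, chains: int) -> int:
    """Shift cycles needed to load one pattern, assuming balanced chains."""
    return math.ceil(scan_cells / chains)


# Hypothetical design: every number below is an assumption for illustration.
scan_cells = 100_000
external_chains = 8
internal_channels = 800

ratio = target_compression_ratio(internal_channels, external_chains)
uncompressed = shift_cycles_per_pattern(scan_cells, external_chains)
compressed = shift_cycles_per_pattern(scan_cells, internal_channels)

print(f"target ratio: {ratio:.0f}x")
print(f"shift cycles per pattern: {uncompressed} -> {compressed}")
```

With balanced chains the shift length per pattern shrinks by the same factor as the target ratio, which is why chain balancing matters for reaching it.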
Hierarchical Compression
• When designs are very large and contain multiple IP blocks, hierarchical
physical synthesis and implementation is the preferred approach.
• In this case, it would be quite difficult, and in some situations impossible, to implement a single compression logic for the design.
• Power, timing, routing, and area considerations would have a much bigger impact on DFT. A hierarchical approach to compression is also desired.
• In hierarchical compression architecture, multiple levels of compression are implemented.
• The lowest-level blocks would have scan channels, with compression logic
placed around them. Many compressed blocks are then further compressed
at the next level until the chip-level I/Os are accessible.
• With this approach, test budget goals and physical parameters can be met at
each of the compressed blocks, reducing the overall impact at the chip
level.
Hierarchical Compression
• Hierarchical compression is suitable for large designs that use hierarchical methodologies.
• The choice of pattern generation would depend upon the rest of the DFT architecture, including
IEEE 1500 usage and test partitioning plans.
Thank You
ATPG
December 2015
Agenda
• Introduction
• Why test?
• Verification vs. testing
• Faults, errors, and failures
• Scenario of manufacturing test
• Fault models
  - Stuck-at fault model
  - Transition fault model
  - Path delay fault model
  - Iddq fault model
  - Bridging fault model
Agenda (Cont.)
• Fault Activation & Propagation
• Common ATPG Terminology
  - Fault Equivalence
  - Fault Dominance
  - Fault Collapsing
• Why ATPG?
• System for Test Generation
• Steps involved in ATPG
• Fault Classes
  - Testable
  - Untestable
Agenda (Cont.)
• Coverage
  - Test Coverage
  - Fault Coverage
• Introduction to TestKompress
  - What is TestKompress
  - TestKompress ATPG flow
• ATPG tool inputs and outputs
Introduction
• ATPG stands for Automatic Test Pattern Generation.
• Goal: generate a small set of effective vectors at a low computational cost.
• ASICs made with a synthesis tool are especially hard for manual test generation, because human insight is missing in a machine-generated netlist.
Why test?
• Testing is the process of determining whether a device or circuit
  - is functioning correctly, or
  - is defective.
• A device can be defective because it does not function
  - as designed, or
  - as specified.
• Guarantee IC quality
  - Not only working devices, but also reliable devices.
Verification vs. Testing
[Figure: Specifications → Design code/Netlist → Silicon/physical device. The first step is labeled Verification, the second Testing.]
Faults, Errors and Failures

 Fault
 A physical defect within a circuit or a system
 May or may not cause a system failure
 Error
 Manifestation of a fault that results in incorrect circuit (system)
outputs or states.
 Caused by faults
 Failure
 Deviation of a circuit or system from its specified behavior
 Caused by an error
Fault ---> Error ---> Failure

Scenario of Manufacturing Test
[Figure: Test vectors are applied to the manufactured circuits. A comparator checks the circuit responses against the correct responses and reports pass/fail.]


Fault Models
• A fault model is a mathematical model of faulty behaviour.
• It can be used to assess the compliance of a circuit with various criteria. For example,
  - Structural compliance can be verified by using a stuck-at fault model.
  - Timing compliance can be verified by using a delay fault model.
  - Current leakage compliance can be verified by using a bridging fault model.
• It identifies target faults.
• It models the faults that are most likely to occur.
Fault Models


The fault models commonly used in industry:
- Stuck-at fault model
- Transition fault model
- Path-delay fault model
- Bridging fault model
- Iddq fault model
Stuck-at Fault Model

Assumptions:
• Only one line is faulty
• The faulty line is permanently set to 0 or 1
• The fault can be at an input or output of a gate
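The single-fault assumption keeps the fault list small. As a rough illustration (the 10-line circuit is a hypothetical example, not taken from the slides), a circuit with n lines has 2n single stuck-at faults, whereas allowing any combination of lines to be stuck at 0 or 1 gives 3^n - 1 multiple faults:

```python
def single_stuck_at_count(lines: int) -> int:
    """Each line can be stuck-at-0 or stuck-at-1, one line at a time."""
    return 2 * lines


def multiple_stuck_at_count(lines: int) -> int:
    """Each line is fault-free, s-a-0, or s-a-1; exclude the all-fault-free case."""
    return 3 ** lines - 1


n = 10  # hypothetical number of lines
print(single_stuck_at_count(n), multiple_stuck_at_count(n))
```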
Delay Faults
• Transition faults
  Model a slow-to-rise or slow-to-fall transition at a logic gate
• Path delay faults
  Model a slow-to-rise or slow-to-fall transition along some path(s) from primary inputs to primary outputs
  - Cover transition and gate delay faults.
Transition Fault Model
• The delay of every path through the faulty gate exceeds the specified cycle time
• Two types
  - Slow-to-rise (STR) and slow-to-fall (STF)
• Example: six faults in an AND gate with pins A, B, and C
  - A slow-to-rise, A slow-to-fall
  - B slow-to-rise, B slow-to-fall
  - C slow-to-rise, C slow-to-fall
• A two-pattern test is required
  - The first pattern, P1, initializes the node
  - The second pattern, P2, causes the transition (launches the value)
• The transition fault (TF) model is a very popular delay fault model
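The two-pattern idea can be sketched in Python. This is a simplified behavioural model under stated assumptions: the gate is a two-input AND with inputs A and B, only input-pin transition faults are modelled, and a slow pin is assumed to still hold its P1 value when the output is captured. The function and parameter names are hypothetical.

```python
def and_gate(a: int, b: int) -> int:
    return a & b


def captured_output(p1: dict, p2: dict, slow_pin: str = None, kind: str = None) -> int:
    """Output captured one at-speed cycle after P2 is launched.

    slow_pin/kind select an input transition fault: kind is "STR"
    (slow-to-rise) or "STF" (slow-to-fall). A slow pin keeps its P1 value
    if P1 -> P2 makes the matching transition on that pin.
    """
    pins = dict(p2)
    if slow_pin is not None:
        old, new = p1[slow_pin], p2[slow_pin]
        rising = (old, new) == (0, 1)
        falling = (old, new) == (1, 0)
        if (kind == "STR" and rising) or (kind == "STF" and falling):
            pins[slow_pin] = old  # the transition has not arrived yet
    return and_gate(pins["A"], pins["B"])


# P1 initializes A to 0; P2 launches a rising transition on A with B = 1,
# so the transition is observable at the output.
p1 = {"A": 0, "B": 1}
p2 = {"A": 1, "B": 1}
good = captured_output(p1, p2)
faulty = captured_output(p1, p2, slow_pin="A", kind="STR")
print("A slow-to-rise detected:", good != faulty)
```

The same pattern pair does not detect A slow-to-fall, because it never launches a falling transition on A. That fault needs the pair in the opposite order.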
Two Methods of Transition Delay Tests
• Launch-on-Shift (LOS)
• Launch-on-Capture (LOC)
Comparison of both methods

                Launch-on-Capture (LOC)                     Launch-on-Shift (LOS)
Advantages      • No constraints for scan enable signal     • High fault coverage
                                                            • Compact test patterns
Disadvantages   • Medium fault coverage                     • Deassertion of the ScanEnable signal
                • More test patterns needed                   is timing critical
Path Delay Fault Model
Delay of at least one sensitizable path exceeds specified clock cycle time
 Associated with a Path (e.g. A-B-C-Z)
 More complicated than gate-delay fault
IDDQ Fault Model
• This is similar to the stuck-at fault model, but instead of measuring the voltage we measure the current in a CMOS design in the quiescent state.

• Ideally, there is supposed to be only a low leakage current in the silicon at steady state, but a large leakage current flows from VDD to GND in faulty devices due to manufacturing defects. The quiescent current from VDD to VSS is shown in the figure.
Bridging Fault Model
• An unintended short between two signals that creates wired logic
• Shorts within a logic gate are not bridging faults
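A minimal sketch of what "wired logic" means for a bridge, assuming the two common behavioural models, wired-AND and wired-OR (the slides do not say which model is used, so both are shown; the function names are hypothetical):

```python
def bridged_values(x: int, y: int, model: str = "wired-AND") -> tuple:
    """Values seen on two shorted signals under a wired-logic model."""
    if model == "wired-AND":
        v = x & y
    elif model == "wired-OR":
        v = x | y
    else:
        raise ValueError(f"unknown bridge model: {model}")
    return v, v


def activating_patterns(model: str) -> list:
    """Driven value pairs for which the bridge changes at least one signal."""
    return [(x, y) for x in (0, 1) for y in (0, 1)
            if bridged_values(x, y, model) != (x, y)]


print(activating_patterns("wired-AND"))
print(activating_patterns("wired-OR"))
```

Under either model the bridge is only activated when the two signals are driven to opposite values.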
Fault Activation & Propagation

• A fault can be detected if it is activated and propagated

• Fault activation: create different values at the fault site in the faulty and fault-free circuits

• Fault propagation: the fault effect is propagated (sensitized) to a primary output

• Line e is sensitized to the output if the output value changes when line e changes
• A path composed of sensitized lines is called a sensitized path
Example
• e stuck-at 0 is activated because c=1,
 good e=1; faulty e=0
 e stuck-at 0 fault is propagated to output f
 e and f are sensitized
 path e-f is a sensitized path
 So, e stuck-at 0 is detected
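Activation and propagation can be checked by simulating the good and the faulty circuit side by side. The circuit in the slide's figure is not available in the text, so the sketch below assumes a hypothetical one: line e is driven by input c, d = a AND b, and the output is f = d OR e.

```python
def circuit(a: int, b: int, c: int, e_stuck_at=None) -> tuple:
    """Hypothetical circuit: e = c, d = a AND b, f = d OR e. Returns (e, f)."""
    e = c
    if e_stuck_at is not None:
        e = e_stuck_at  # inject the stuck-at fault on line e
    d = a & b
    f = d | e
    return e, f


def activated_and_propagated(pattern: tuple, stuck_value: int) -> tuple:
    good_e, good_f = circuit(*pattern)
    bad_e, bad_f = circuit(*pattern, e_stuck_at=stuck_value)
    activated = good_e != bad_e    # different values at the fault site
    propagated = good_f != bad_f   # difference visible at the primary output
    return activated, propagated


print(activated_and_propagated((0, 0, 1), 0))  # c=1 activates; d=0 lets it through
print(activated_and_propagated((1, 1, 1), 0))  # activated, but d=1 masks it at the OR
```

The second pattern shows why activation alone is not enough: the OR gate's other input must hold its non-controlling value for line e to be sensitized to f.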
Common ATPG Terminology
 Fault Equivalence

 Fault Dominance

 Fault Collapsing
Fault Equivalence
• AND gate
  - all s-a-0 faults are equivalent
• OR gate
  - all s-a-1 faults are equivalent
• NAND gate
  - all input s-a-0 faults and the output s-a-1 fault are equivalent
• NOR gate
  - all input s-a-1 faults and the output s-a-0 fault are equivalent
• Inverter
  - input s-a-1 and output s-a-0 are equivalent
  - input s-a-0 and output s-a-1 are equivalent
Equivalence Rules
Fault Collapsing
• Reducing the set of faults to test for by using equivalence classes is called fault collapsing
• If a set of faults is functionally equivalent, a test that detects one of them detects all of them, so only one fault per class needs to be targeted
• Simple example: 2-input NAND gate
  Input stuck-at-0 is equivalent to output stuck-at-1, so these faults collapse into one
Equivalence Example
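The equivalence rules can be checked by brute force: two faults are equivalent when they turn the gate into the same faulty function. The sketch below is an illustration for two-input gates only. The pin names a, b (inputs) and z (output) are assumptions.

```python
from itertools import product

GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
}

# All single stuck-at faults on a two-input gate: (pin, stuck value).
FAULTS = [(pin, v) for pin in ("a", "b", "z") for v in (0, 1)]


def faulty_truth_table(gate: str, fault: tuple) -> tuple:
    """Output of the gate for all input combinations with the fault injected."""
    pin, value = fault
    table = []
    for a, b in product((0, 1), repeat=2):
        if pin == "a":
            a = value
        if pin == "b":
            b = value
        z = GATES[gate](a, b)
        if pin == "z":
            z = value
        table.append(z)
    return tuple(table)


def equivalence_classes(gate: str) -> list:
    """Group faults that produce identical faulty truth tables."""
    classes = {}
    for fault in FAULTS:
        classes.setdefault(faulty_truth_table(gate, fault), []).append(fault)
    return list(classes.values())


for gate in GATES:
    classes = equivalence_classes(gate)
    print(gate, f"{len(FAULTS)} faults collapse to {len(classes)}:", classes)
```

Keeping one representative per class is the collapsed fault list. For each of these two-input gates, six faults collapse to four.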
Fault Dominance
 AND gate
 Output s-a-1 dominates any input s-a-1
 NAND gate
 Output s-a-0 dominates any input s-a-1
 OR gate
 Output s-a-0 dominates any input s-a-0
 NOR gate
 Output s-a-1 dominates any input s-a-0
Dominance Example
Why ATPG?
• Test generation can be the longest phase of the design cycle if done manually.
• ASICs made with a synthesis tool are especially hard for manual test generation, because human insight is missing in a machine-generated netlist.
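System for Test Generation
[Figure: flowchart. Select the next fault from the fault list; the test generator produces vectors, which are added to the test set; the fault simulator fault-simulates the test set vectors; all detected faults are removed from the fault list.]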
Steps Involved in ATPG
• Generating test patterns.
• Running fault simulation to determine which faults are detected by a particular set of test patterns.
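The generate-then-fault-simulate loop can be sketched in Python. This is a toy illustration, not how a production ATPG tool works: the circuit f = (a AND b) OR c is hypothetical, and the "test generator" is an exhaustive search standing in for a real algorithm. What it does show is the loop structure: pick a target fault, generate a pattern, fault-simulate it, and drop every fault it detects.

```python
from itertools import product

NETS = ("a", "b", "c", "d", "f")  # hypothetical circuit: d = a AND b, f = d OR c


def simulate(pattern: tuple, fault: tuple = None) -> int:
    """Evaluate output f; fault is (net, stuck_value) or None for the good circuit."""
    def inject(net, value):
        return fault[1] if fault and fault[0] == net else value

    a = inject("a", pattern[0])
    b = inject("b", pattern[1])
    c = inject("c", pattern[2])
    d = inject("d", a & b)
    return inject("f", d | c)


def generate_test(fault: tuple):
    """Stand-in test generator: first pattern whose output differs under the fault."""
    for pattern in product((0, 1), repeat=3):
        if simulate(pattern) != simulate(pattern, fault):
            return pattern
    return None  # no pattern exists for this fault


def fault_simulate(pattern: tuple, faults: list) -> set:
    """Return every fault in the list that this pattern detects."""
    return {f for f in faults if simulate(pattern) != simulate(pattern, f)}


def run_atpg() -> tuple:
    remaining = [(net, v) for net in NETS for v in (0, 1)]
    test_set, undetected = [], []
    while remaining:
        target = remaining[0]                 # select next fault
        pattern = generate_test(target)       # test generator
        if pattern is None:
            undetected.append(target)
            remaining.pop(0)
            continue
        test_set.append(pattern)              # add vector to test set
        detected = fault_simulate(pattern, remaining)
        remaining = [f for f in remaining if f not in detected]  # fault dropping
    return test_set, undetected


patterns, undetected = run_atpg()
print("patterns:", patterns)
print("undetected faults:", undetected)
```

Because each pattern usually detects several faults, the final test set is much smaller than the fault list. This is the reason for fault-simulating every new vector before choosing the next target.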
Fault Class Hierarchy
• Testable (TE)
– Detected (DT)
- DET_Simulation (DS)
- DET_Implication (DI)
– POSDET (PD)
- POSDET_Untestable (PU)
- POSDET_Testable (PT)
– Atpg_untestable (AU)
– Undetected (UD)
- Uncontrolled (UC)
- Unobserved (UO)
• Untestable (UT)
– Unused (UU)
– Tied (TI)
– Blocked (BL)
– Redundant (RE)
Testable Faults
 Faults that cannot be proven untestable
 Subcategorized as:
Detected (DT)
 DET_Simulation (DS)
 DET_Implication (DI)
POSDET (PD)
 POSDET_Untestable (PU)
 POSDET_Testable (PT)
ATPG_Untestable (AU)
UNDetected (UD)
 UNControlled (UC)
 UNObserved (UO)
• Detected (DT)
 The detected fault class includes all faults that the ATPG process identifies
as detected
 The detected fault class contains two subclasses
 DET_Simulation (DS) - faults detected when the tool performs fault simulation
 DET_Implication (DI) - faults detected when the tool performs learning analysis
Posdet (PD)
 The Posdet, or possible-detected, fault class includes all faults that fault
simulation identifies as possible-detected but not hard detected
 A possible-detected fault results from a 0-X or 1-X difference at an
observation point
• The Posdet class contains two subclasses
  - POSDET_Testable (PT): potentially detectable Posdet faults. PT faults result when the tool cannot prove the 0-X or 1-X difference is the only possible outcome. A higher abort limit may reduce the number of these faults
  - POSDET_Untestable (PU): proven ATPG_Untestable and hard undetectable Posdet faults
ATPG_Untestable (AU)
The ATPG_untestable fault class includes all faults for which the test
generator is unable to find a pattern to create a test, and yet cannot prove
the fault redundant
 Testable faults become ATPG_untestable faults because of constraints, or
limitations, placed on the ATPG tool (such as a pin constraint or an
insufficient sequential depth)
 These faults may be possible-detectable, or detectable, if you remove
some constraint, or change some limitation, on the test generator (such as
removing a pin constraint or changing the sequential depth)
Undetected Faults (UD)
• The undetected fault class includes undetected faults that cannot be proven untestable or ATPG_untestable.
• The undetected class contains two subclasses
  - Uncontrolled (UC): undetected faults which, during pattern simulation, never achieve the value at the point of the fault required for fault detection; that is, they are uncontrolled.
  - Unobserved (UO): faults whose effects do not propagate to an observable point.
Untestable Faults
• Faults for which no pattern can exist to either detect or possible-detect them
• They cannot cause functional failures, so the tools exclude them when calculating test coverage
 Subcategorized as
Unused (UU)
Tied (TI)
Blocked (BL)
Redundant (RE)
Untestable Faults
[Figures: examples of an unused fault, a tied fault, a redundant fault, and a blocked fault.]


Coverage
• Test coverage: a measure of test quality, defined as the percentage of all testable faults that the test pattern set detects.
  Test coverage = detected faults / testable faults
• Fault coverage: the percentage of detected faults among all faults, testable and untestable.
  Fault coverage = detected faults / (testable faults + untestable faults)
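The two formulas differ only in the denominator, which a short calculation makes visible. The fault counts below are made-up numbers for illustration, and the sketch follows the slide's simplified formulas. A real tool's report may apply further adjustments, such as partial credit for possible-detected faults.

```python
def coverage(detected: int, testable: int, untestable: int) -> tuple:
    """Return (test coverage %, fault coverage %) per the formulas above."""
    test_cov = 100.0 * detected / testable
    fault_cov = 100.0 * detected / (testable + untestable)
    return test_cov, fault_cov


# Hypothetical fault counts.
test_cov, fault_cov = coverage(detected=9500, testable=9800, untestable=200)
print(f"test coverage:  {test_cov:.2f}%")
print(f"fault coverage: {fault_cov:.2f}%")
```

Test coverage is never lower than fault coverage, because untestable faults only enlarge the fault-coverage denominator.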
Introduction To Testkompress


What is Testkompress

Testkompress ATPG Flow
What is TestKompress
• TestKompress is an automatic test pattern generation tool from Mentor Graphics.
• Supports stuck-at, IDDQ, and transition delay fault models.
• Can read netlists in Verilog.
• Can write patterns in WGL, STIL, and Verilog.
Testkompress ATPG Flow
ATPG TOOL inputs & outputs
// When we are doing Bypass mode
set edt off
add_scan_groups grp1 ./testprocfile/bypass.testproc
//SCAN Chain
set_edt_options -channels 2
set_edt_pins input_channel 1 SCAN_IN[0]
set_edt_pins input_channel 2 SCAN_IN[1]
set_edt_pins output_channel 1 SCAN_OUT[0]
set_edt_pins output_channel 2 SCAN_OUT[1]
//pins
add_clocks 0 CLOCK
add_clocks 1 RESET
add_pin_constraints TM C0
add_pin_constraints SE C0
// Put FastScan into ATPG mode, which executes the DRC checks and
// allows test generation commands:
set system mode atpg
report scan cells > ./report/scan_cells.rpt
report_drc_rules -fail > ./report/drc_rules_fail.rpt
add_faults -all
// Generate patterns for stuck-at faults: set the fault type to stuck-at
set fault type stuck // Stuck/Transition/Iddq
set_abort_limit 30
create_patterns
write_patterns ./SIM/scan_parallel/scan_parallel.v -scan -parallel -verilog -replace
write_patterns ./SIM/chain_serial/chain_serial.v -chain -serial -verilog -replace
REPort FAults -Class AU > ./report/AU_fault.rpt
report faults > stuckat_faults.fault_list
Test Procedure File
o timeplate
o load_unload
o shift
o capture
o test_setup
timeplate tp_slow =
force_pi 0;
measure_po 100;
pulse CLK 200 400;
pulse TCK 200 400;
period 400;
end;
timeplate tp_fast =
force_pi 0;
measure_po 10;
pulse CLK 20 40;
period 40;
end;
procedure test_setup =
timeplate tp_slow ;
cycle = // cycle 1
force RESET 1 ;
force TCK 0;
force CLK 0;
end ;
cycle = // cycle 2
//Reset
force RESET 0;
end ;
cycle = // cycle 3
//Reset
force RESET 1;
end ;
//reset JTAG
cycle = // cycle 4
force TRST 0;
pulse TCK ;
end;
cycle = // cycle 5
force TRST 0;
pulse TCK ;
end;
cycle = // cycle 6
force TRST 1;
pulse TCK ;
end;
end;
procedure load_unload =
scan_group grp1;
timeplate tp_slow ;
cycle =
force clk 0;
force SE 1;
force TM 1;
end;
apply shift 127;
end;
procedure shift =
timeplate tp_slow ;
cycle =
force_sci ;
measure_sco ;
pulse TCLK1 ;
end;
end;
procedure capture =
timeplate tp_fast ;
cycle =
pulse TCLK1;
end;
cycle =
pulse TCLK1;
end;

end;
procedure capture TCLK1_TCLK2 =
launch_capture_pair TCLK1 TCLK2;
cycle = // cycle 1
force TCLK1 0;
force TCLK2 0;
pulse TCLK1;
end;
cycle = // cycle 2
pulse TCLK2;
end;
end; // end of capture procedure
THANK YOU
