CAD4Soc - 2024 - Singlepage - 115 - End
History
• SPICE history started in 1968 at the University of California, Berkeley
• 1968: Predecessor CANCER (Computer Analysis of Nonlinear Circuits, Excluding Radiation)
• 1971: Release of SPICE (improved CANCER) into the public domain (responsible: Prof. D. Pederson, who died in 2004)
• 1975: SPICE2
• Because the source code of SPICE was available to everyone, many commercial versions were developed by the industry
• 1983: SPICE2G.6: Up to that point, SPICE was written in FORTRAN
• 1993: SPICE3f: First mature release written in C
• Today: SPICE3f5 is the newest version from Berkeley, also available as source code: http://bwrc.eecs.berkeley.edu/Classes/IcBook/SPICE/
• Today, LTSpice is the choice for free SPICE simulation. The software is hosted by Analog Devices: https://www.analog.com/en/resources/design-tools-and-calculators/ltspice-simulator.html
VOLTAGE DIVIDER
VIN 1 0 DC 12
R1 1 2 3000
R2 2 0 1K
C1 2 0 10N
.OP
.END
• Letters immediately following a number that are not scale factors are ignored, also letters immediately following a scale factor
• 1000, 1000.0, 1000HZ, 1E3, 1.0E3, 1KHZ, 1K all represent the same number
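The number rule above can be sketched in a few lines of Python. This is only an illustration, not the parser of any SPICE implementation; the function name `parse_spice_number` is hypothetical, and the scale-factor table follows the classic SPICE conventions (T, G, MEG, K, MIL, M, U, N, P, F).

```python
import re

# Classic SPICE scale factors (case-insensitive).
SCALE = {"T": 1e12, "G": 1e9, "MEG": 1e6, "K": 1e3, "MIL": 25.4e-6,
         "M": 1e-3, "U": 1e-6, "N": 1e-9, "P": 1e-12, "F": 1e-15}
# MEG and MIL must be tested before the single letter M.
ORDER = ("MEG", "MIL", "T", "G", "K", "M", "U", "N", "P", "F")

# Numeric part (with optional exponent) followed by an optional letter suffix.
_NUMBER = re.compile(r"^([+-]?(?:\d+\.?\d*|\.\d+)(?:E[+-]?\d+)?)([A-Z]*)$")


def parse_spice_number(text):
    """Hypothetical helper: convert a SPICE-style number string to a float."""
    match = _NUMBER.match(text.strip().upper())
    if match is None:
        raise ValueError(f"not a SPICE number: {text!r}")
    value = float(match.group(1))
    suffix = match.group(2)
    for key in ORDER:
        if suffix.startswith(key):
            # Letters after the scale factor (e.g. HZ in 1KHZ) are ignored.
            return value * SCALE[key]
    # Letters that are not a scale factor (e.g. HZ in 1000HZ) are ignored.
    return value


for token in ("1000", "1000.0", "1000HZ", "1E3", "1.0E3", "1KHZ", "1K"):
    print(token, "->", parse_spice_number(token))
```

All seven spellings from the slide evaluate to the same value, and the `C1 2 0 10N` line of the voltage divider netlist reads as 10 nF.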
BSIM4.6.5 (newest version for < 65nm transistors) has > 400 parameters, 2/3 non-physical
• What’s the power consumption in a block?
• How fast can the load capacitance be charged?
• What’s the threshold voltage of a certain gate?
• Frequency domain: small signal analysis (.ac), noise analysis (.noise)
• DC domain: operating point analysis (.op), transfer function analysis (.tf, .dc), two-port network parameters
• Algorithms usually solve this problem in three steps
• Take out the “differential”:
  Reduction of nonlinear ADEs into a series of nonlinear algebraic equations at a series of discrete time points by using numerical integration methods (Euler, Trapezoid, Gear Shichman, ... method)
  Timestep (delta between time points) is crucial for accuracy!
• Take out the “nonlinear”:
  Reduction of the nonlinear algebraic equations into a series of linear equations using linearization methods (Newton-Raphson, ... method)
  Accuracy is determined by number of linearization iterations
• Solve the linear equation system:
  Using methods like LU factorization
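• One-dimensional example (figure: source voltage u, conductance G, node voltage u1, diode current ID):
  $(u_1 - u)\,G + I_D = 0$
  $f(u_1) = u_1 G + I_S\,(\exp(u_1/u_T) - 1) - u\,G = 0$
  i.e.: find the solution $u_1$ of the nonlinear equation $f(u_1) = 0$
• Linearize the equation around the starting value $u_{10}$ using the derivative $f'(u_{10})$, and find the solution $u_{11}$ of the linearized equation $f(u_{10}) + f'(u_{10})\,(u_{11} - u_{10}) = 0$
• Check if the accuracy criterion is met: $|u_{10} - u_{11}| < \varepsilon$; if not, repeat with $u_{11}$ as the new starting value
[Figure: f(u1) with the tangents f´(u10) and f´(u11)]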
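The linearization loop for this one-dimensional example can be sketched as follows. The component values (G, I_S, u_T, u) and the starting value are assumptions chosen for illustration; the slide gives none. The update rule is the standard Newton-Raphson step, and the stop criterion is the |u10 - u11| < ε test from the slide.

```python
import math

G = 1e-3      # series conductance in S (assumed value)
I_S = 1e-14   # diode saturation current in A (assumed value)
U_T = 0.025   # thermal voltage in V (assumed value)
U = 5.0       # source voltage in V (assumed value)


def f(u1):
    """Node equation f(u1) = u1*G + IS*(exp(u1/uT) - 1) - u*G."""
    return u1 * G + I_S * (math.exp(u1 / U_T) - 1.0) - U * G


def df(u1):
    """Derivative f'(u1) used for the linearization."""
    return G + I_S / U_T * math.exp(u1 / U_T)


def newton(u_start, eps=1e-9, max_iter=100):
    """Iterate until two successive solutions differ by less than eps."""
    u_old = u_start
    for iteration in range(1, max_iter + 1):
        u_new = u_old - f(u_old) / df(u_old)   # zero of the tangent at u_old
        if abs(u_old - u_new) < eps:
            return u_new, iteration
        u_old = u_new
    raise RuntimeError("Newton-Raphson did not converge")


u1, n_iter = newton(0.6)
print(f"u1 = {u1:.4f} V after {n_iter} iterations")
```

A poor starting value costs many iterations for the exponential diode characteristic, which is one reason simulators limit the step taken per iteration.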
[Figure: transient analysis flowchart with the decisions "Convergence?" and "End of time interval?" and the step "Increment time"]
• Many loops and iterations to solve a huge matrix numerically
  • Resulting in rather long run-times even for small circuits
  • Limiting the size of circuit to be simulated (memory ...
• Accuracy is controlled by the methods applied (integration and linearization) and certain error criteria
  • Higher accuracy requirements (that is, smaller time-steps, more linearization iterations by using lower ...
• Well, did you think about…
  • the algorithms used for solving circuit equations
  • parasitic resistors, capacitors (coupling), inductances, ...
  • parasitics in power supply network
  • …
DESIGN STRATEGIES
• Simulate
• Structured design: Top-down
  • Understand the top-level specification, which must be complete
  • There is hardly any pure „100%“ project out there done exclusively in a „Top-down manner“, since it is impossible to 100% specify complex systems
• Structured design: Bottom-up
  • Construct your system/sub-system from primitives
  • Assemble your sub-systems to form the system
  • This will only work for small examples, since you will simply not be able to master complex systems with this methodology ....
• ... to what you know/learned from an early implementation
• Verify your top-level design (top-level, either complete, or cross-sections, or with abstracts)
• Strategy 2: Fast Tree Design for Associative Operations
• An associative operation op is defined as one for which: (a op b) op c = a op (b op c)
• At each level of the tree the op operations are performed simultaneously and their results are op’ed at the next higher level, and so forth
• E.g. of assoc. oper: +, *, and, or, xor
• Strategy 3: Speculative Computation
• If there is a data dependency between two or more portions of a computation (which may be obtained using D&C), don’t wait for the “previous” computation to finish before starting the next one
• All the different outputs of the different copies of B are Mux’ed using the previous stage A’s output
• E.g. design: Carry-Select Adder (each stage performs two additions, one for a carry-in of 0 and another for a carry-in of 1 from the previous stage)
[Figure (a): Original design: block A followed by block B (inputs x, y; output z). Time = T(A) + T(B)]
[Figure (b): Speculative computation: four copies B(0,0), B(0,1), B(1,0), B(1,1) run in parallel with A and feed a 4:1 Mux. Time = max(T(A), T(B)) + T(Mux). Works well when T(A) ≈ T(B) and T(A) >> T(Mux)]
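A behavioural sketch of the carry-select adder named above may help. Each block computes both candidate sums (carry-in 0 and carry-in 1) without waiting, and the actual carry of the previous block only selects one of them. The block width of 4 bits and the function names are illustrative assumptions.

```python
def ripple_add(a, b, carry_in, width):
    """Add two width-bit numbers; return (sum, carry_out)."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask) + carry_in
    return total & mask, total >> width


def carry_select_add(a, b, width=16, block=4):
    """Speculative addition: both results per block, selected by the carry."""
    block_mask = (1 << block) - 1
    result, carry = 0, 0
    for shift in range(0, width, block):
        a_blk = (a >> shift) & block_mask
        b_blk = (b >> shift) & block_mask
        # In hardware these two additions run in parallel with earlier blocks.
        sum0, carry0 = ripple_add(a_blk, b_blk, 0, block)
        sum1, carry1 = ripple_add(a_blk, b_blk, 1, block)
        # The multiplexer: the previous carry picks the valid candidate.
        blk_sum, carry = (sum1, carry1) if carry else (sum0, carry0)
        result |= blk_sum << shift
    return result, carry


print(carry_select_add(1234, 4321))
print(carry_select_add(0xFFFF, 1))
```

In software the two candidate additions run one after the other, so the sketch shows the data flow only, not the speed-up.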
• Strategy 4: Best of both worlds: Average and worst case delay
  • Use 2 circuits with different worst-case and average-case behaviors
  • Use the first available output
• Get the best of both (ave-case, worst-case) worlds
  • In the above schematic, we get the good ave case performance of unary division (assuming uniformly distributed inputs) w/o the disadvantage of its bad worst-case performance
• Strategy 5: Pipelining
[Figure: original circuit or datapath and its conversion to a simple level-partitioned pipeline with Stage 1, Stage 2, ..., Stage k (level partition may not always be possible, but other pipelineable partitions may be)]
00000000 00000001 0
--------------------
00000001 00000000 1
----------------------
11111111 11111111 0
……….
……….
• Approach 2(b): Structural algorithmic approach:
  • Be more innovative, think of the structure/properties of the computational problem
  • E.g., think if the problem can be solved in a hierarchical or divide-&-conquer (D&C) manner:
• D&C approach: See if the problem can be “broken up” into 2 or more smaller subproblems that can be “stitched-up” to give a solution to the parent problem
  • Do this recursively for each large subproblem until the subproblems are small enough for a TT-based solution
  • If the subproblems are of a similar kind (but of smaller size) to the root problem, then the breakup and stitching will also be similar
[Figure: root problem A is split into subproblems A1 and A2; the solutions to A1 and A2 are stitched up to form the complete solution to A]
Example 2: Design of a Parity Detection Circuit
Solution 1: A linearly connected chain of XORs:
[Figure: 16-bit parity circuits with intermediate signals w(1,1), w(1,0) and output w(0,0) = f]
• No concurrency in Solution 1: the actual problem has available concurrency, though, and it is not exploited well in the above “linear” design
• Complete sequentialization leading to a delay that is linear in the # of bits n (delay = n*td), td = delay of 1 gate
• All the available concurrency is exploited in Solution 2: a parity tree.
• Question: When can we have a tree-structured circuit for a chain of the same operation on multiple operands?
• Answer: (1) First of all when the operation makes sense for any # of operands. (2) It should be possible to break it down into smaller operations. (3) Finally, when the operation is associative. An operation “x” is said to be associative if:
  a x b x c = (a x b) x c = a x (b x c).
• Thus if we have 4 operands a x b x c x d, we can either perform this as a x (b x (c x d)) [getting a linear delay of 3 units] or as (a x b) x (c x d) [getting a logarithmic (base 2) delay of 2 units and exploiting the available concurrency due to the fact that “x” is associative].
• Let $f(x_{n-1}, \ldots, x_0)$ be an associative function.
• What is the D&C principle involved in the design of an n-bit xor/parity function? Can it also lead automatically to a tree-based ckt?
• Using the D&C approach for an associative operation results in the stitch up function being the same as the original function (not the case for non-assoc. operations), but w/ a constant # of operands (2, if the orig problem is broken into 2 subproblems)
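The difference between the linear chain and the D&C tree can be made concrete with a small sketch that returns both the parity and the number of XOR levels on the longest path. The function names are illustrative; the stitch-up step is the same XOR as the original operation, as stated above.

```python
def parity_chain(bits):
    """Solution 1: linear chain of XORs; depth grows linearly with n."""
    result, depth = bits[0], 0
    for bit in bits[1:]:
        result ^= bit
        depth += 1          # every XOR waits for the previous one
    return result, depth


def parity_tree(bits):
    """Solution 2: divide and conquer; the stitch-up is a single XOR."""
    if len(bits) == 1:
        return bits[0], 0
    mid = len(bits) // 2
    left, depth_left = parity_tree(bits[:mid])     # subproblem A1
    right, depth_right = parity_tree(bits[mid:])   # subproblem A2
    return left ^ right, max(depth_left, depth_right) + 1


word = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]   # 16 example bits
print("chain:", parity_chain(word))
print("tree: ", parity_tree(word))
```

For 16 inputs the chain has 15 XOR gates in series, while the tree needs only 4 levels.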
Remember Semester 3 of your Bachelor class, when everything was smooth & easy? All values were constant values …
• Have a look at the circuit on the top-right. Calculate the current I and the voltage VO (Q-point). You may assume the following values for the power supply and transistor parameters:
  • VTN=0.75V, VTP=-0.75V
• Determine the Q-point of the transistor circuit shown on the bottom-right. Assume the following values:
  • V=-15V
  • R=75Ω
  • W/L=1/1
  • VTN=0.75V, VTP=-0.75V
As a designer, you have to live with these deviations… within a large range of parameter changes …
[Figure: measured parameter VTS_0H_N10x016_P05 (nf 10x0.16µm²), y axis from 140 to 400, plotted over date-TVM from 15-Oct-2003 to 10-Aug-2004]
• The fab engineers and the design engineers “sign” a contract
  • Fab engineers promise a nominal value for essential design parameters (e.g. VTN=0.380V)
  • If this is an established technology, then the fab engineers are able to give upper and lower bounds, including some statistics (e.g. my 3σ lower and upper bounds are VTN=0.295V and VTN=0.476V; whereas my 4σ lower and upper bounds are VTN=0.275V and VTN=0.500V)
• Anyhow, the fab engineers will try their best to run their fab as promised to the design engineers
• The design engineers will do their best to design circuits that will work within the specified range, if the parameters vary within the specified range
[Figure: gate with input a and output y]
Will your circuit work for all cases? You don´t know ...
Nominal case: a 0->1 change on a brings a 1->0 transition on y after 5ns
• What effects slow down and what speeds up silicon based semiconductors?
  • Voltage: The higher the voltage, the faster a transistor is switching …
  • Temperature: transistors can switch faster if they are cold, slower if they are hot
  • Process: this is how the transistor is made out of implants, misalignment of different layers, ….
• In the early times of digital simulation, simply the nominal time value was taken and multiplied by the so-called PVT factors
• For a great process, high voltage and low temperature the delay will only be 5ns*0.9*0.85*0.85=3.25ns
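The derating arithmetic is a plain product of the nominal delay and one factor each for process, voltage and temperature. The sketch below reproduces the best-case number from the slide; the worst-case factors 1.1, 1.15 and 1.15 are the ones used in the case study of this deck. The function name `derate` is illustrative.

```python
def derate(nominal_ns, process, voltage, temperature):
    """Scale a nominal delay by the three PVT factors."""
    return nominal_ns * process * voltage * temperature


best_case = derate(5.0, 0.9, 0.85, 0.85)    # great process, high V, low T
worst_case = derate(5.0, 1.1, 1.15, 1.15)   # poor process, low V, high T

print(f"best case:  {best_case:.2f} ns")
print(f"worst case: {worst_case:.2f} ns")
```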
[Figure: case-study circuit between D-FF1 and D-FF2 with signals a, b, c, s, s_bar, a_sel, b_sel, w and clk]
Summer Term | TU Darmstadt | Integrated Electronic Systems Lab 167
Case Study
• You have to look for the longest path first. For the nominal
case:
[Figure sequence: in the case-study circuit (D-FF1, D-FF2; signals a, b, c, s, s_bar, a_sel, b_sel, w, clk) a signal change is traced step by step along the longest path; it arrives at t=26ns]
• But wait, the circuit also needs to work for the max case. So let´s assume the PVT factors we used a couple of slides ago. Then we have to multiply the critical path with the max. PVT factor: 26ns*1.1*1.15*1.15=37.82ns
• This is then called a negative slack of 3.82ns
• Does this mean the circuit will not work?
• Not exactly, but you will experience certain fails in the application, or a lower yield. But overall, most companies do not accept this. For high volume products, this is not acceptable at all.
• What to do?
• Upsize the standard cells with larger output drivers (e.g. an inverter with only td=4ns, an OR & AND with only td=6.5ns will do the job)
[Figure sequence: the same circuit driven by D-FF3; a signal change is traced step by step along the shortest path to D-FF2]
• What was the earliest time the signal was allowed to arrive?
[Figure: shortest path in the case-study circuit; the signal arrives at t=13ns]
Case Study
• But wait, the circuit also needs to work for the min case. So let´s assume the PVT factors we used a couple of slides ago. Then we have to multiply the shortest path with the min. PVT factor: 13ns*0.9*0.85*0.85=8.45ns
• But the hold time of a D-FF tells you that the input must remain stable for thold after the rising edge of the clock, otherwise the output is not safely switched. In our case thold=9ns (ok, not derated, so let´s assume this is the case for min/nom/max ….)
• What to do?
• So let´s summarize: A circuit works from a timing point of view if
  • The longest path in the circuit does not cause any setup violation, even if the worst (=max) PVT derating is assumed
  • The shortest path in the circuit does not cause any hold violation, even if the best (=min) PVT derating is assumed
• In case of a setup violation, simply upsize cells in the max-path to be faster
• In case of a hold violation, simply insert buffers in the min-path to be slower
• This is exactly what Place & Route CAD software is doing
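Both checks from the summary can be written down with the numbers of the case study. The 26 ns and 13 ns path delays, the derating factors and thold = 9 ns come from the slides. The required arrival time of 34 ns is an assumption: it is inferred from the stated negative slack of 3.82 ns and is not given explicitly. Function names are illustrative.

```python
MAX_FACTOR = 1.1 * 1.15 * 1.15     # worst-case (max) PVT derating
MIN_FACTOR = 0.9 * 0.85 * 0.85     # best-case (min) PVT derating


def setup_slack(longest_path_ns, required_ns):
    """Positive if the derated longest path arrives early enough."""
    return required_ns - longest_path_ns * MAX_FACTOR


def hold_slack(shortest_path_ns, t_hold_ns):
    """Positive if the derated shortest path arrives after the hold window."""
    return shortest_path_ns * MIN_FACTOR - t_hold_ns


REQUIRED_NS = 34.0   # assumption inferred from the 3.82 ns negative slack
T_HOLD_NS = 9.0      # hold time from the slides (not derated)

print(f"setup slack: {setup_slack(26.0, REQUIRED_NS):+.2f} ns")
print(f"hold slack:  {hold_slack(13.0, T_HOLD_NS):+.2f} ns")
```

Both slacks come out negative, so the circuit as drawn has a setup violation on the max path and a hold violation on the min path.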
• Yes, there is. It is called Static Timing Analysis (=STA)
• It is a piece of CAD software that simply traverses all possible paths, finds the min/max paths and checks for violations
• The software simply reports a setup/hold margin / slack for min/nom/max. In case of a violation, it provides an example path …
• Sounds too good to be true, huh? Indeed, there are a few drawbacks …
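The path traversal can be sketched on a tiny timing graph. The netlist and all delay values below are hypothetical and are not the delays of the case study; the sketch only shows how earliest and latest arrival times propagate from a start point through every fan-in edge.

```python
# Hypothetical timing graph: node -> list of (predecessor, delay in ns).
FANIN = {
    "inv":   [("start", 5.0)],
    "and_a": [("start", 8.0)],
    "and_b": [("inv", 8.0)],
    "or":    [("and_a", 8.0), ("and_b", 8.0)],
    "end":   [("or", 0.0)],
}


def arrival_times(fanin, start):
    """Return (earliest, latest) arrival time per node of an acyclic graph."""
    earliest, latest = {start: 0.0}, {start: 0.0}

    def visit(node):
        if node in earliest:
            return
        for pred, _ in fanin[node]:
            visit(pred)                      # predecessors first
        earliest[node] = min(earliest[p] + d for p, d in fanin[node])
        latest[node] = max(latest[p] + d for p, d in fanin[node])

    for node in fanin:
        visit(node)
    return earliest, latest


earliest, latest = arrival_times(FANIN, "start")
print("min path to end:", earliest["end"], "ns")   # compare with hold time
print("max path to end:", latest["end"], "ns")     # compare with setup limit
```

Note that the traversal never asks whether a path can actually be activated by the logic, which is the source of the false-path problem.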
• Not all paths in a circuit are meant to transmit data signals. Typical examples are reset wires …
  Usually they belong to a long net and are very slow and asynchronous, but since the timing model of the D-FF models this path, STA cannot distinguish between data paths and control paths.
[Figure: case-study circuit (D-FF3, D-FF2; signals a, b, c, s, s_bar, a_sel, b_sel, w, clk) with an additional reset wire]
Evaluating Circuits Using STA
• With STA, you save a lot of time since you do not have to create test cases
• But you have to spend some time to eliminate so-called “false paths”
• Another drawback: suppose your logic circuit is designed so smartly that the cases for which a timing violation would occur will never happen … This cannot be detected by STA …
• Using our case study you have seen that even a rather slow design with fclk=25MHz can have problems to work.
• But good news: It is quite unlikely that all stdcells in the min or max path follow the same pessimism. (SSTA)
[Figure: the circuit designer between process models, analysis tools, methods and the 100% parametric yield target; annotations: analysis often not done, tools with bugs, methods with limitations, awareness ???]
A responsible-minded designer is the only one who can cope with all the obstacles and uncertainties
[Figure: distributions P1(x), P2(x), P3(x) over x from -6 to 6, with the specification limits LSL and USL]
$C_p = \dfrac{USL - LSL}{6\sigma}$
• Cp compares the specified range with the standard deviation
[Figures: further distributions P1(x), P2(x), P3(x) over x from -6 to 6 with peak values between 0.4 and 0.8]
• One might argue Cpk = 1 would be manufacturable as well
  • Just with a little yield loss of 0.27% for a Gaussian distribution
  • Y = 0.9973, 1-Y = 0.0027 = 0.27%
• And there are more parameters than only this one
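The yield that belongs to a given capability index can be checked numerically. The sketch assumes a Gaussian parameter distribution and uses the common textbook definition Cpk = min(USL - μ, μ - LSL) / (3σ); that definition is an assumption here, since the slide's own formula is not legible. For limits at ±3σ around the mean it evaluates to about 0.9973, close to the value quoted above.

```python
import math


def cp(usl, lsl, sigma):
    """Process capability: specified range over six standard deviations."""
    return (usl - lsl) / (6.0 * sigma)


def cpk(usl, lsl, mu, sigma):
    """Textbook Cpk (assumed definition): distance to the nearer limit."""
    return min(usl - mu, mu - lsl) / (3.0 * sigma)


def gaussian_yield(usl, lsl, mu, sigma):
    """Fraction of a Gaussian distribution that lies inside [lsl, usl]."""
    def phi(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return phi((usl - mu) / sigma) - phi((lsl - mu) / sigma)


y = gaussian_yield(usl=3.0, lsl=-3.0, mu=0.0, sigma=1.0)
print(f"Cp = {cp(3.0, -3.0, 1.0):.2f}, Cpk = {cpk(3.0, -3.0, 0.0, 1.0):.2f}")
print(f"Y = {y:.4f}, yield loss = {100.0 * (1.0 - y):.2f}%")
```

With ten independent parameters, each at Cpk = 1, the combined yield is roughly y**10, which illustrates why a single parameter at this level is already uncomfortable.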
• Statistical process control is a closed loop procedure
  • A wafer fab can and will react when things go wrong
• Design for manufacturing is an open loop procedure
  • Prediction of Cpk based on models and assumptions
  • Hence design has to cope with various uncertainties
  • Pls. reconsider, just one critically designed parameter can noticeably pull down the yield of a whole product
• Design verification can be managed either closed loop or open loop
  • Each update of model parameters starts a new verification cycle (Redesign?) or
  • The design is made once, robust enough to tolerate a parameter change
What’s your preference? It’s up to you to make a reasonable choice!
• Device models and nominal model parameters
  • Model equations which can fit to the measured characteristics
  • Are all known effects implemented (halo, well proximity, STI)?
  • Best possible fitting of model parameters: no or small tradeoffs
  But if the model doesn’t fit then the parameters can’t fit as well
• Process statistic and statistical models (Corner and MC)
  • Maturity and sufficient stability of the manufacturing process
  • Complete monitoring of all relevant process parameters
  • Statistical models based on these process monitoring results
Output resistance accuracy significantly improved! But still a trade-off between modeling at Vbs= 0V and Vbs= VDD necessary, since the
DITS-output resistance model has no parameter for body bias effect.
• Algorithms and simulation methodology
  • Limitations of Worst Case method
  How many simulations do I need to get a confident yield estimate?
• Design has to live with manifold uncertainties
  • Sufficient design margin is the only way to cope
  • The model parameter window needs to be reasonably wider than the assumed process ...
Verification of Yield
Robustness and Manufacturability Estimation
Nominal Case | Digital Corners | Analog Corners | Uniform MC | Alternative MC | Standard MC
[Figure: specification region in the x-y plane]
Specification    Test
x > -5           pass
x + y < 9        pass
x > -3           fail
x + y < 4        fail
x² + y > -2      “pass”
n     2^n
1     2
2     4
3     8
4     16
5     32
10    1024
• The good news
  • Performance typically depends on a small subset of the parameters only
    (But the subset is generally unknown w/o detailed knowledge of the circuit)
  • Hence it is not necessary to simulate all 2^n corners, but only 16…64 well-selected ones
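The exponential growth of the corner count follows directly from taking every min/max combination. A short sketch, with hypothetical parameter names, enumerates the full corner set:

```python
from itertools import product


def corners(parameters):
    """Enumerate every combination of the given per-parameter extremes."""
    names = list(parameters)
    return [dict(zip(names, combo))
            for combo in product(*(parameters[name] for name in names))]


# Hypothetical example with three parameters and two extremes each.
pvt = {
    "process": ("min", "max"),
    "voltage": ("min", "max"),
    "temperature": ("min", "max"),
}

all_corners = corners(pvt)
print(len(all_corners), "corners for", len(pvt), "parameters")
for corner in all_corners:
    print(corner)
```

With ten such parameters the same call would already return 1024 corners, which is why a well-selected subset is simulated in practice.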
All other circuits & 4.5 Sigma 4.5 Sigma 4.5 Sigma 3 Sigma
standard blocks
• General method
  • needs a statistical model for the distribution function of the (BSIM) model parameters
  • a random generator derives the actual simulation ...
  • ... measurements respectively
  • ... the specification
  • estimate the yield: Y = (Number_of_Pass / N) * 100%
[Figure: random generator feeding the simulations; histograms of the simulated result over the range -0.03 to 0.03 with pass and fail regions; example result YP = 98%]
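The yield formula can be exercised with a toy Monte Carlo loop. The circuit simulation is replaced by a single Gaussian random draw, which is purely a stand-in; the specification limits, the seed and the function name are assumptions for illustration.

```python
import random


def monte_carlo_yield(n_runs, spec_low, spec_high, mu=0.0, sigma=1.0, seed=1):
    """Estimate Y = Number_of_Pass / N * 100% from n_runs random samples."""
    rng = random.Random(seed)           # fixed seed keeps the run repeatable
    n_pass = 0
    for _ in range(n_runs):
        performance = rng.gauss(mu, sigma)   # stand-in for one simulation
        if spec_low <= performance <= spec_high:
            n_pass += 1
    return 100.0 * n_pass / n_runs


estimate = monte_carlo_yield(10_000, spec_low=-2.0, spec_high=2.0)
print(f"estimated yield: {estimate:.2f}%")
```

For limits at ±2σ the exact Gaussian yield is about 95.45%, so the estimate should land near that value; repeating the run with other seeds shows the statistical scatter of the estimate.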
• it is not possible in general to extrapolate the yield from the σ
  • exception: the performance distribution is known in advance
    but then brute force Monte-Carlo is actually not needed at all
• Required number of MC runs is usually very high
• Apply a uniform parameter distribution rather than Gaussian
  • Results in an equal screening of the whole parameter range
  • Helps to identify design weakness w/r to PVT
  • Very powerful method, but it doesn’t allow estimating the yield
• How confident (reliable) is the yield estimation?
  • Depends on the yield itself
  • Depends on the confidence interval, i.e. a range ...
  • Depends on the confidence level (95%, 99%) i.e. ...
  • But does not depend on the simulated circuit or ...
• Rule of thumb for a confident yield estimate
  • There should be about 9...10 fails inside the ...
  • Rule of thumb: about 10/(1-Y), i.e. 1000 MC ...
  • 100 runs for 90%, 1000 runs for 99%, 10.000 ...
  • We can simulate single blocks / specifications
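The rule of thumb translates into a one-line calculation: to see about ten fails, roughly 10/(1-Y) runs are needed. The function name and the default of ten expected fails are illustrative.

```python
def required_mc_runs(target_yield, expected_fails=10):
    """Rule of thumb: number of runs so that about expected_fails fails occur."""
    return round(expected_fails / (1.0 - target_yield))


for target in (0.90, 0.99, 0.999):
    print(f"Y = {target:.1%}: about {required_mc_runs(target)} runs")
```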
• Suppose the fab finally fabricates your chip. How do you know your chip works perfectly fine and you can sell it?
• The fab has to run a so-called „Test“ procedure, but exhaustively testing a chip is out of the question for cost reasons
• Rule of thumb: without a special repair step, no fabricated DRAM or CPU initially works! Malfunctions have to be identified and repaired
• You have to find a smart way of checking that all flip-flops/registers are working: DfT (Design for Test, already to be applied at design level)
• Circuit is designed using pre-specified design rules.
• Test structure (hardware) is added to the verified design:
  • Add a test control (TC) primary input.
• Use combinational ATPG to obtain tests for all testable faults in the combinational logic.
• Add shift register tests and convert ATPG tests into scan sequences for use in manufacturing test
• Memory elements (latches and flip-flops) are designed so that they can be reconfigured dynamically to form a shift register R during testing
• Test data transferred serially to and from R making memory state completely controllable and observable
• N/T = 1: Scan in test pattern, hold appropriate bit pattern on controllable primary inputs
• Scan provides complete controllability and observability
• Use only clocked D-type of flip-flops for all state variables.
• All clocks must be controlled from PIs.
• Advantages:
• Design automation
STDCELLS, FLOORPLANNING,
PLACEMENT, ROUTING
• In a library several primitive cells are available
  • Predefined Input/Output blocks (Pad-Cells)
• If available, more complex macros are also available
  • SRAM/eDRAM
• Tracks form a grid for routing.
• Spacing between tracks is center-to-center distance between wires.
• Track spacing depends on wire layer used.
• Pitch: height of cell.
• All cells have same pitch, may have different widths.
• VDD, VSS connections are designed to run through cells.
• Different layers are (generally) used for horizontal and vertical wires.
• Horizontal and vertical can be routed relatively independently.
• Placement of cells determines placement of pins.
• Pin placement determines difficulty of routing problem.
• Generate candidates, evaluate area and speed.
• Improve candidate without starting from scratch.
• To generate a candidate:
• place gates in a row;
Starting point:
• Blocks may be placed at different rotations and reflections.
• Wiring plan must consider area and delay of critical signals.
• Nodes are channels, edges placed between two channels that touch.
• Wire out of end of one channel creates pin on side of next channel:
• Two-point nets are easy to design:
  • resize wires;
  • add buffers;
  • size transistors.
• Buffer insertion in a sized Steiner Tree: More complex than placing buffers along a transmission line:
  • complex topology;
  • unbalanced trees;
  • differing timing requirements at the leaves.
• Neighboring wires influence each other: Crosstalk (XTalk)
• Capacitive coupling introduces crosstalk; capacitance and slew-rate determine the impact
• Crosstalk slows down signals to static gates
• Crosstalk can be controlled by methodological and optimization techniques, but at the cost of larger area
• How to estimate delays induced by crosstalk?
• Coupling effects depend on relative switching time of nets.
VERILOG
• 1982: HiLo is a popular hardware description language (by GenRad)
• 1984: Phil Moorby (who was co-developing HiLo) invents Verilog
• 1986: The Verilog-XL simulator (by Gateway) is the most powerful simulator for digital circuits
• 1990: Cadence acquires Gateway and now owns Verilog and the Verilog-XL simulator. At the same time, Synopsys is pushing towards top-down logic synthesis.
• 1991: Cadence „opens“ the Verilog language by founding the OVI (Open Verilog International) initiative for developing and standardizing the HDL.
• 1992: Many companies offer Verilog simulators
• 1995: The Verilog LRM provided by OVI becomes the IEEE standard 1364.
• 2001: Latest version of Verilog is called Verilog-2001 (IEEE1364-2001)
• Several extensions are available towards verification and system description
• 1981: The United States Department of Defense recognizes the need for an HDL to „overcome the hardware life cycle crisis“. Sponsored with more than US$ 30 million
• 1983-85: Development of VHDL by IBM, Intermetrics and TI
• 1986: The DoD transfers all rights of VHDL to the IEEE
• 1994: Publication of the VHDL-1993 standard
• 2000: Publication of the VHDL-2000 standard
• 2002: Publication of the VHDL-2002 standard
• 2007: Publication of the VHDL Procedural Language Application Interface standard (VHDL 1076c-2007)
• 2009: Revised standard (VHDL 1076-2008)
[Figure: abstraction levels (System, Algorithm, Logic, Gate, Layout, Device) and the ranges covered by SystemVerilog, SystemC, VHDL, Verilog and VITAL]
• Ideal for coding testbenches & verification. Not accepted for RT synthesis by industry
• Ideal for coding hardware. The standard for synthesis results on GL. Preferred by US&AP companies
There´s plenty of free software available for trying out Verilog at home:
• Modelsim (various download sites, also for FPGA, digital logic etc.)
• Verilog is perfectly suited for the description of gate-level netlists. Therefore de facto all tool-written netlists on gate level are in the Verilog language. But it is also possible to design on gate level in Verilog:

  module mux2 (in1, in2, sel, out);
    output out;
    input in1, in2, sel;

    and a1(a1_o, in1, sel);
    not n1(n1_o, sel);
    and a2(a2_o, in2, n1_o);
    or o1(out, a1_o, a2_o);
  endmodule

[Figure: gate-level schematic of mux2 with gates a1, n1, a2, o1 and internal nets a1_o, n1_o, a2_o]
• Results of simulation
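[Figure: testbench consisting of the test generator & monitor (instance tgam of testgen_and_monitor) and the device under test (instance dut); their ports s1, s2 and out are connected through the wires w1, w2 and wo]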
[Figure: multiplexer schematic with gates a1, a2, o1, internal nets a1_o, a2_o and the inverted select ~sel]
in1  in2  sel  out
1    0    0    0
1    1    0    1
0    0    1    0
0    1    1    0
1    0    1    1
1    1    1    1
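The truth table can be cross-checked against the gate-level netlist of mux2 with a few lines of Python. This is only a bit-level model of the four gates, using the net names from the netlist; it is not a Verilog simulation.

```python
def mux2(in1, in2, sel):
    """Bit-level model of the mux2 gates: a1, n1, a2 and o1."""
    a1_o = in1 & sel        # and a1(a1_o, in1, sel)
    n1_o = 1 - sel          # not n1(n1_o, sel)
    a2_o = in2 & n1_o       # and a2(a2_o, in2, n1_o)
    return a1_o | a2_o      # or  o1(out, a1_o, a2_o)


# Rows of the truth table: (in1, in2, sel, out)
TRUTH_TABLE = [
    (1, 0, 0, 0),
    (1, 1, 0, 1),
    (0, 0, 1, 0),
    (0, 1, 1, 0),
    (1, 0, 1, 1),
    (1, 1, 1, 1),
]

for in1, in2, sel, expected in TRUTH_TABLE:
    print(in1, in2, sel, "->", mux2(in1, in2, sel), "expected", expected)
```

The model confirms that sel=1 passes in1 and sel=0 passes in2.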
• The behavior of a block can also be expressed in terms of procedural statements, rather than gate-level primitives
• The statement always @(in1, in2, sel) tells the simulator to suspend execution until at least one of the three inputs changes
• The always construct continuously repeats its statement, never exiting or stopping.
• Don´t get confused that „out“ is now a reg. Looking ...
• The basic essence of a behavioral model is the process:
  • Independent thread of control
  • Think of a complex system as a large set of independent, but communicating processes
• In contrast to always: The initial construct is executed only once, otherwise similar
• Concurrent processes „live“/„happen“ at the same time: not one after the other.
  • One model waits for an event that happens ...
• In a behavioral model, time is not existent! Behavioral statements (if, loop, while, ....) take zero time to execute
• Time only advances in a process if a wait, @ or delay is executed
• What stops a process? Only a delay (#10), wait, or @ statement. If this is missing, the process will run forever.
• An initial block will only execute once. It will always start at time=0 (even if there is more than one initial block in the system).
• An always process will execute forever
endmodule
Majority of the three
inputs
initial
begin
qbar = 0; // blocking assignment
#100; // wait for 100ns
qbar = 1;
end
• Nonblocking assignment is done with "<="
• Value of the left-hand side (lhs) changes only after the rhs has been evaluated
• Only inside a process
• Only to wire
• Processes can be blocked the following way: • Wait for a signal to change
@(sel_bar);
...
if ((a > b) && (c < b))
  // then-statement goes here
else if (a > d)
  // else-if statement goes here
else
  // else clause goes here

Else / else if is
  a) Optional
  b) Always matched to the nearest “if”

i = 16;
while(i)      // a “zero” would be interpreted as “false”
begin
  // do something useful here
  i = i - 1;  // the condition must change inside your process!
end
module a_very_abstract_dram;
always
begin
read_spd; // read timing/latency settings
forever
begin
get_commands_from_mem_ctrl;
end
end
endmodule
Control Structures
begin:break
  for(i=0; i<n; i=i+1)
  begin:continue
    if(a==0)
      disable continue;  // proceed with i=i+1, but stay in loop
    ... // other statements
    if(a==b)
      disable break;     // exit the loop
    ... // other statements
  end
end
always
begin
  @(posedge clk)
    ir <= dram[pc];   // get the ir from a dram address
  @(posedge clk)
    case (ir[15:13])
      3'b000: pc <= dram[ir[12:0]];
      3'b001: pc <= pc + dram[ir[12:0]];
      3'b010: acc <= -dram[ir[12:0]];
      ...
    endcase
  pc <= pc + 1;
end
Much more readable, isn't it?
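module decoder;
  reg [7:0] reg1;
  initial   // initial instead of always: without a timing control an always block would loop forever at time 0
  begin
    reg1 = 8'bx1x0x1x0;
    casex(reg1)
      8'b001100xx: $display("Case 1");
      8'b1100xx00: $display("Case 2");   // this item matches
      8'b00xx0011: $display("Case 3");
      8'bxx001100: $display("Case 4");
    endcase
  end
endmodule
Result: Case 2 …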
• Using modules you are able to partition large pieces of code. But modules imply structural boundaries. If this
is not the case, you may use functions and tasks.
• Example (see next page):
module advanced_decoder_with_function;
reg signed [15:0] m [0:8191];// signed 8192 x 16 bit memory
reg signed [12:0] pc; // signed 13 bit program counter
reg signed [12:0] acc; // signed 13 bit accumulator
reg ck; // a clock signal
always
begin: executeInstructions
reg [15:0] ir; // 16 bit instruction register
@(posedge ck)
ir <= m [pc];
@(posedge ck)
case (ir [15:13]) // as before
3'b111: acc <= multiply(acc, m [ir [12:0]]);
endcase
pc <= pc + 1;
end
module advanced_decoder_with_task;
reg signed [15:0] m [0:8191];// signed 8192 x 16 bit memory
reg signed[12:0] pc; // signed 13 bit program counter
reg signed[12:0] acc; // signed 13 bit accumulator
reg ck; // a clock signal
always
begin: executeInstructions
reg [15:0] ir; // 16 bit instruction register
@(posedge ck)
ir <= m [pc];
@(posedge ck)
case (ir [15:13]) // as before
3'b111 : multiply (acc, m [ir [12:0]]);
endcase
pc <= pc + 1;
end
• We learned that functions and tasks are similar to software functions and procedures. Their main goal is to make code more readable.
• However, there are differences you must know as a programmer:
Calling
  Task: A task call is a separate procedural statement. It cannot be called from a continuous statement.
  Function: A function call is an operand in an expression. It is called from within the expression and returns a value used in the expression. Functions may be called within procedural and continuous assignment statements.
I/O
  Task: Can have zero or more arguments of any kind.
  Function: Has at least one input, but no inouts or outputs. At least one value is returned.
Calling other tasks or functions
  Task: A task may enable other tasks and functions.
  Function: A function can enable other functions, but not other tasks.
Storage
  Task: Storage of the inputs, outputs and internally declared variables is static; concurrent calls share the storage. Exception: if the task is declared automatic, then the storage is dynamic and each call gets its own copy.
  Function: Storage of the inputs and internally declared variables is static. If the function is declared automatic, then the storage is dynamic and recursive calls get their own copies.
Returned values
  Task: A task does not return values. If inout or output ports are changed, then this is copied back at the end of the task execution.
  Function: A function returns a single value to the expression that called it. The value to be returned is assigned to the function identifier within the function.
• Building up system hierarchically is essentially needed for mastering the complexity. We need to understand how to
address variables and where data are known across hierarchies.
[Figure: hierarchy example: module top with reg r, w, a named block (begin ... end) y and a task t; Block1 (instantiated as b) with reg s; Block2 (instantiated as d) with reg s]
only be accessed in the local scope! • Functions and tasks used in many parts of your design
• Moore Machines
• The outputs depend only on the current state
• The next state depends on current state and inputs
• Mealy Machines
• The outputs depend on current state and inputs
• The next state depends on current state and inputs
[Figure: block diagrams of the two machine types, each built from next-state logic, a state register and output logic, with the signals inputs, state, next state and outputs]
• Current state and next state are encoded binary (in the example: 3 bits)
• ... be covered by exactly one line in the table (not more, not less)
  000  1 x  101  0 0
  001  1 0  010  0 1
  ...  ...  ...  ...
• The arrows indicate which state is taken in the next cycle, depending on the inputs and
the current state
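[Figure: Moore state diagram with the states 00, 01 and 11; transitions labeled a=0, a=1 and always. Notation: each state shows the current state S1S0 and the assigned output value x]
current state S1S0   inputs a         next state S1'S0'   outputs x
00                   0                00                  0
00                   1                01                  0
01                   0                00                  0
01                   1                11                  1
11                   x (don't care)   00                  0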
[Figure: Mealy state diagram with the states 00, 01 and 11; transitions labeled a = 0 / x 0, a = 1 / x 1 and always / x 1]
current state S1S0   inputs a         next state S1'S0'   outputs x
00                   0                00                  0
00                   1                01                  1
01                   0                00                  0
01                   1                11                  1
11                   x (don't care)   00                  0
• Because the outputs of a Mealy Machine also depend on the inputs, the values assigned
to them are annotated at the transitions
• The notation is: input condition / output assignment
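A state transition table maps directly onto a lookup structure. The sketch below transcribes the rows of the Mealy table above (with the don't-care row expanded for a=0 and a=1) and steps through an input sequence. It is a Python illustration of the table semantics, not Verilog; the function name and the example input sequence are assumptions.

```python
# (current state, input a) -> (next state, output x), from the Mealy table.
MEALY = {
    ("00", 0): ("00", 0),
    ("00", 1): ("01", 1),
    ("01", 0): ("00", 0),
    ("01", 1): ("11", 1),
    ("11", 0): ("00", 0),   # don't-care row, expanded for a = 0
    ("11", 1): ("00", 0),   # don't-care row, expanded for a = 1
}


def run_mealy(inputs, state="00"):
    """Apply one input per clock cycle; collect the Mealy outputs."""
    outputs = []
    for a in inputs:
        state, x = MEALY[(state, a)]   # output depends on state AND input
        outputs.append(x)
    return state, outputs


final_state, outputs = run_mealy([1, 1, 0, 0, 1])
print("final state:", final_state, "outputs:", outputs)
```

Because every (state, input) pair appears exactly once as a key, the dictionary also enforces the rule that each combination is covered by exactly one table line.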
• The encoding of the states plays a key role for the implementation of an FSM
• It influences the complexity of the logic functions, the … of the logic)
• The optimum choice depends on the technology used (ASIC, PLA, FPGA, etc.) as well as on the given design goals
• Advantages:
• therefore the name "one hot" encoding
• particularly advantageous for FPGA implementations
• fewer glitches
• Disadvantages:
[Figure: logic blocks feeding flip-flops (FF)]
[Figure: state diagram with Reset and the states A, B, C, D, E, F; transitions labeled 0/000, 0/010, 1/100, 1/101, 1/110 and ?/101]
endmodule
Summer Term | TU Darmstadt | Integrated Electronic Systems Lab 324
Concurrent Processes
endmodule
<event_control>
::= @ <identifier>
||= @ ( <event_expression> )
<event_expression>
::= <expression>
||= posedge <scalar_event_expression>
||= negedge <scalar_event_expression>
||= <event_expression> or <event_expression>
• A more abstract form of event control is the named event statement. It allows a trigger to be sent to another part of the design. We will use a Fibonacci number generator as one example.
• The Fibonacci sequence is defined by $F_n = F_{n-1} + F_{n-2}$ with $F_0 = 0$ and $F_1 = 1$.
• The first numbers of the Fibonacci sequence are: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ...
module numberGen(number);
output reg [15:0] number = 0;
always
begin
@ng.ready //wait for event signal
count = startingValue;
oldNum = 1;
for (fibNum = 0; count != 0; count = count - 1)
begin
temp = fibNum;
fibNum = fibNum + oldNum;
oldNum = temp;
end
$display ("%d, fibNum=%d", $time, fibNum);
end
endmodule
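The listing above waits on the event ng.ready, but the declaration and the triggering of that event are not shown. A minimal self-contained sketch of the mechanism follows. Everything is kept in one hypothetical module, and the delays of 50 time units are assumptions:

```verilog
module named_event_demo;
  event ready;                 // named event: carries no value, only a trigger
  reg [15:0] number;
  reg [15:0] seen;

  // Producer: updates the data, then triggers the event with ->
  initial begin
    number = 0;
    repeat (3) begin
      #50 number = number + 1;
      #50 -> ready;
    end
  end

  // Consumer: blocks at @ready until the event is triggered
  always begin
    @ready;
    seen = number;
    $display("%0d: received number=%0d", $time, seen);
  end
endmodule
```

When producer and consumer live in different modules, the consumer refers to the event by its hierarchical name, as in @ng.ready in the listing above.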
• The previous example may lead to problems, since the consumer never signals to the producer that it has successfully accepted the data. A consumer/producer handshake mechanism overcomes this deficiency:
[Figure: producer/consumer handshake with the signals prodRdy and consRdy]
[Figure: bus timing with the signals clock, rwLine, addrLines and dataLines. The bus master drives rwLine=0 and puts addresses on addrLines; the bus slave writes data on dataLines to the specified address.]
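A minimal sketch of such a four-phase handshake with the signals prodRdy and consRdy, using the level-sensitive wait statement. The module name, the data width and the delay of 10 time units for producing an item are assumptions:

```verilog
module handshake_demo;
  reg [7:0] data, copy;
  reg prodRdy, consRdy;

  // Producer
  initial begin
    prodRdy = 0;
    data    = 0;
    forever begin
      #10 data = data + 1;          // produce the next item (placeholder work)
      wait (consRdy) prodRdy = 1;   // consumer is free: announce valid data
      wait (!consRdy) prodRdy = 0;  // consumer has taken it: withdraw
    end
  end

  // Consumer
  initial begin
    consRdy = 1;
    forever begin
      wait (prodRdy) copy = data;   // take the data
      consRdy = 0;                  // acknowledge
      wait (!prodRdy) consRdy = 1;  // ready for the next item
    end
  end

  initial #45 $finish;
endmodule
```

Since wait is level sensitive, neither side can miss the other's signal, regardless of the order in which the two processes are executed within a time step.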
• The question is now, which gate and switch level primitives are built into Verilog?
table
0 00 : 0;
0 01 : 0;
0 10 : 0;
0 11 : 1;
1 00 : 0;
1 01 : 1;
1 10 : 1;
1 11 : 1;
endtable
endprimitive
table
0 0? : 0;
0 ?0 : 0;
? 00 : 0;
? 11 : 1;
1 ?1 : 1;
1 1? : 1;
endtable
endprimitive
primitive latch
(output reg q,
input clock, data);
table
// clock data state output
0 1 : ? : 1;
0 0 : ? : 0;
1 ? : ? : -;
endtable
endprimitive
primitive dEdgeFF
(output reg q,
input clock, data);
table
// clock data state output
(01) 0 : ? : 0;
(01) 1 : ? : 1;
(0x) 1 : 1 : 1;
(0x) 0 : 0 : 0;
(?0) ? : ? : -;
? (??) : ? : -;
endtable
endprimitive
User Defined Primitives
• Exactly one output port is allowed, but multiple inputs
• The output port must be the first port to be listed
• All primitive ports are scalar; vectors are not allowed
• Only the logic values 0, 1 and x can be specified on inputs and output. The z value cannot be specified; if it is applied to an input, it is treated as x
Symbol   Meaning               Remark
0        Logic 0
1        Logic 1
x        Unknown
?        Can be 0, 1, and x    Cannot be used in the output field
b        Can be 0 and 1        Cannot be used in the output field
-        No change             May only be given in the output field of a sequential primitive
r        Same as (01)
f        Same as (10)
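As an illustration of the ? shorthand, here is a minimal sketch of a combinational user defined primitive. The 2-to-1 multiplexer mux2 and its small testbench are hypothetical examples, not taken from the slides:

```verilog
primitive mux2 (output y, input sel, a, b);
  table
  // sel a b : y
      0  0 ? : 0;
      0  1 ? : 1;
      1  ? 0 : 0;
      1  ? 1 : 1;
      x  0 0 : 0;   // both data inputs agree, so sel does not matter
      x  1 1 : 1;
  endtable
endprimitive

module mux2_tb;
  reg sel, a, b;
  wire y;
  mux2 m (y, sel, a, b);
  initial begin
    sel = 0; a = 1; b = 0; #1 $display("sel=0          -> y=%b", y);
    sel = 1;               #1 $display("sel=1          -> y=%b", y);
    sel = 1'bx; b = 1;     #1 $display("sel=x, a=b=1   -> y=%b", y);
  end
endmodule
```

Any input combination that is not covered by a table row produces x at the output, which is why the last two rows are worth adding.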
• Wires (or nets or interconnect) are simply used to connect ports or signals. Usually they do not store values, but transmit values that are driven on them by structural elements such as gate outputs, assign statements and registers in a behavioral model.
• In Verilog, you can declare wires using the keyword wire, and also assign a delay to this interconnect:
• Wires can be declared implicitly. If an identifier appears in the port list of a gate primitive instance or a module instantiation, or on the LHS of a continuous assignment, it will be implicitly declared as a net.
• By default, an implicitly declared net will be of type "wire", and will have the default width of the connected module (or gate primitive port).
• You may change or even turn off this default by setting the compiler directive `default_nettype
• Note that this is a fundamental difference between Verilog and VHDL: in VHDL you must declare everything that you use, whereas Verilog is much more "get the work done". Be aware that typing errors in port names (e.g. 0 instead of o) do not automatically lead to an error in Verilog
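A minimal sketch of how implicit net declarations can be switched off with the standard compiler directive `default_nettype. The module and_or is a hypothetical example:

```verilog
`default_nettype none      // from here on, undeclared identifiers are errors

module and_or (output wire y, input wire a, input wire b, input wire c);
  wire ab;                 // must now be declared explicitly
  and g1 (ab, a, b);
  or  g2 (y, ab, c);       // a typo such as "a8" instead of "ab" is now
                           // reported instead of silently creating a new net
endmodule

`default_nettype wire      // restore the default for files compiled afterwards
```

Note that with `default_nettype none the ports also need an explicit net type, which is why they are written as input wire and output wire here.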
• A net declared as wand is 0 if any of its drivers is 0. In the example, the output c is therefore 1 only if both a and b are 0. Output d is an ordinary wire with two drivers: it will be unknown if any input is unknown or z, or if a = !b
module wired_and_example
(input a, b, output wand c, output d);
not (c,a);
not (c,b);
not (d,a);
not (d,b);
endmodule
[Figure: two inverters driving c from a and b, and two inverters driving d from a and b]
• Verilog enables you to even model hardware on the transistor level, but of course still only for digital
signals. Have a look at the following MOS shift register:
[Figure: MOS shift register with input in, internal nodes wa1, wa2, wa3 and the clocks phase1 and phase2]
module waveShReg;
wire shiftout; //net to receive circuit output value
reg shiftin; //register to drive value into circuit
reg phase1, phase2; //clock driving values
parameter d = 100; //define the waveform time step
initial
begin :main
shiftin = 0; //initialize waveform input stimulus
phase1 = 0;
phase2 = 0;
• Sometimes hardware is designed to overwrite signals by stronger drivers. Verilog enables you to model
this. Have a look at the following SRAM example:
[Figure: 4-T SRAM cell with bitline and wordline; signals address, write, dataIn and dataOut; gates g1 to g5; nets w1, w3 and w4]
initial begin
#d dis;
#d address = 1; #d dis;
#d dataIn = 1; #d dis;
#d write = 1; #d dis;
#d write = 0; #d dis;
#d write = 'bx; #d dis;
#d address = 'bx; #d dis;
#d address = 1; #d dis;
#d write = 0; #d dis;
end
• The realistic delay of a circuit can only be modeled if each component can be assigned a realistic logic delay. Verilog offers a wide variety of options to model hardware up to a so-called "sign-off" point.
module triStateLatch
(output qOut, nQOut,
input clock, data, enable);
tri qOut, nQOut;
not #5 (ndata, data);
nand #(3,5) d(wa, data, clock),
nd(wb, ndata, clock);
nand #(12, 15) qQ(q, nq, wa),
nQ(nq, q, wb);
bufif1 #(3, 7, 13) qDrive (qOut, q, enable),
nQDrive(nQOut, nq, enable);
endmodule
[Figure: schematic of triStateLatch with the gates d, nd, qQ, nQ, qDrive and nQDrive, inputs data, clock and enable, and outputs qOut and nQOut]
• Up to now, we have used time units without knowing their meaning. The compiler directive `timescale specifies the time unit and the time precision
• You may use s for seconds, ms for milliseconds, us for microseconds, ns for nanoseconds, ps for picoseconds and fs for femtoseconds
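A minimal sketch of the directive in use. The chosen values (1 ns unit, 100 ps precision) and the module name are assumptions for illustration:

```verilog
`timescale 1ns / 100ps    // time unit = 1 ns, time precision = 100 ps

module timescale_demo;
  reg a;
  initial begin
    a = 0;
    #2.56 a = 1;          // 2.56 ns is rounded to the precision, i.e. 2.6 ns
    $display("a changed at t = %f ns", $realtime);
  end
endmodule
```

The first value scales every delay written in the module, the second one determines how finely delays are rounded.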
• The previous way of instantiating eight xors was quite cumbersome.... A nicer way is:
endmodule
module register_bank
(output [7:0] q,
input [7:0] d,
input clock, clear);
endmodule
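A minimal sketch of arrays of instances. The modules xor8, dff and register_bank_sketch are hypothetical; in particular, the flip-flop dff is an assumed building block that is not shown on the slide:

```verilog
// Eight xor gates x[7] ... x[0] created by one statement. The 8-bit
// connections are split bit by bit across the instances.
module xor8 (output [7:0] xout, input [7:0] xin1, xin2);
  xor x[7:0] (xout, xin1, xin2);
endmodule

// Assumed 1-bit flip-flop with asynchronous clear
module dff (output reg q, input d, clock, clear);
  always @(posedge clock or posedge clear)
    if (clear) q <= 1'b0;
    else       q <= d;
endmodule

// Eight flip-flops: q and d are distributed bit by bit, the scalar
// signals clock and clear are connected to every instance.
module register_bank_sketch
  (output [7:0] q,
   input  [7:0] d,
   input  clock, clear);
  dff r[7:0] (q, d, clock, clear);
endmodule
```

A connection whose width equals the port width is fanned out to all instances, while a connection whose width equals port width times number of instances is sliced.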
• Arrays of instances are limited to quite simple repetitive structures. The generate command is much more powerful:
module xorGen
#(parameter width = 4,
delay = 10)
(output [1:width] xout,
input [1:width] xin1, xin2);
generate
genvar i;
for(i=1; i<=width; i=i+1)begin :xi
assign #delay xout[i] = xin1[i] ^ xin2[i];
end
endgenerate
endmodule
• Consider modeling an n-bit adder (with n>1) that also has condition code outputs to indicate if the result was negative, produced a carry, or produced a 2's complement overflow.
• In this case, not all generated instances of the adder are connected the same way
• You may then use if-then-else and case statements in the for-loop to generate these differences.
• Half Adder:
• Can be used to calculate the sum of two bits A1 and A2.
$C = A_1 \wedge A_2$
$S = A_1 \oplus A_2$
• For adding binary numbers having a bitwidth of more than one single bit.
• These equations can be realized either by logic gates (AND, OR, XOR) or by two half-adders and an OR gate.
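The realization with two half adders and an OR gate can be sketched as follows. Module and port names are chosen for illustration only:

```verilog
module half_adder (output s, c, input a1, a2);
  xor (s, a1, a2);   // S = A1 xor A2
  and (c, a1, a2);   // C = A1 and A2
endmodule

module full_adder (output s, cOut, input a1, a2, cIn);
  wire s1, c1, c2;
  half_adder ha1 (s1, c1, a1, a2);    // add the two operand bits
  half_adder ha2 (s,  c2, s1, cIn);   // add the incoming carry
  or (cOut, c1, c2);                  // a carry from either stage
endmodule
```

The two partial carries c1 and c2 can never be 1 at the same time, so a simple OR gate is sufficient to combine them.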
Parallel Adders
• Chained full-adders where the carry "ripples" through the whole chain from the LSB to the MSB.
module adderWithConditionCodes
#(parameter width = 1)
(output reg [width-1:0] sum,
output reg cOut, neg, overFlow,
input [width-1:0] a, b,
input cIn);
generate
genvar i;
for (i = 0; i <= width-1; i=i+1) begin: stage
case(i)
0: begin
always @(*) begin
sum[i] = a[i] ^ b[i] ^ cIn;
c[i] = a[i] & b[i] | b[i]& cIn | a[i] & cIn;
end
end
• With gate-level primitives, the complete model is always sensitive to input changes. In other words: inputs are always evaluated and determine whether the output changes.
• Any input change at any time will cause the gate instance to evaluate its output.
• Let's assume that a scheduled event in our gate-level timing example has not yet been executed (the model was still busy with a scheduled event, e.g. due to the delay of 2 time units). If a new event is now generated for the output of that element, the previously scheduled event will be cancelled and the new one will be put in the event queue instead.
• If a pulse at a gate's input is shorter than the propagation time, the output of the gate will not change
• Inertial delay is the minimum time a set of inputs must be present for a change in the output to be seen
• Verilog gate models have, by definition, inertial delays just greater than their propagation delay, so watch out!
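The cancellation of scheduled events can be observed with a minimal sketch. The buffer with a delay of 5 time units and the pulse widths are assumptions chosen for illustration:

```verilog
module inertial_demo;
  reg a;
  wire y;
  buf #5 g (y, a);              // gate with 5 time units propagation delay

  initial begin
    $monitor("%0d: a=%b y=%b", $time, a, y);
    a = 0;
    #10 a = 1;  #2 a = 0;       // 2-unit pulse: shorter than the gate delay
    #20 a = 1;  #8 a = 0;       // 8-unit pulse: longer than the gate delay
    #20 $finish;
  end
endmodule
```

The first pulse never reaches y, because the event scheduled for t = 15 is cancelled when a falls again at t = 12. The second pulse appears at y delayed by 5 time units.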
• Using behavioral models, you have to specify the sensitivity list yourself
• The sensitivities are context dependent – you decide to which input your model is sensitive!
• The model below is only sensitive to clock edges which are more than 5 time units apart
• The procedural timing model of Verilog does not cancel events in the event queue.
• If there are multiple events scheduled for the same time, the execution order is indeterminate. Therefore
you must avoid writing such bad code:
[Figure: gate-level example circuit with the inputs i1, i2 and sel, the internal signals selbar, s1 and s2, and the output result; gates sel_inverter, in_gate2 and out_gate; annotated delays 8 ns, 10 ns and 12 ns]

• Event queue for t = 0 ns
Time     10 ns   10 ns    12 ns   15 ns   30 ns   70 ns   100 ns
Signal   i1      selbar   s2      s1      i2      sel     i1
Value    1       1        0       0       0       1       0

• Event queue for t = 10 ns
Time     12 ns   15 ns   22 ns   30 ns   70 ns   100 ns
Signal   s2      s1      s2      i2      sel     i1
Value    0       0       1       0       1       0

Signal values shown in the figure: selbar U, s1 U, ..., s2 U, result U
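• The following figure shows the basic elements of an event-driven simulator
[Figure: event-driven simulator. The scheduler removes all current-time events from the time-ordered event list; the network connections pass them as inputs to the gate models; the gate outputs are fed back to the scheduler.]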
Left example:       Right example:
begin               begin
a = #2 b;           a <= #2 b;
c = #2 a;           c <= #2 a;
end                 end
In the left example, b is evaluated and stored in a temporary variable, and the update event for a is delayed by 2 time units. The process continues only when this update event has been executed (a = 1). The same holds for c, so c gets the value 1 after 4 time units.
In the right example, b is evaluated in the first line, an event is scheduled 2 time units in the future, and the process continues immediately. The second line evaluates a, which still has its old value 0, and schedules another event (for c), also 2 time units in the future. Therefore, c is 0 after 2 time units.
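Both cases can be compared in one simulation. This is a minimal sketch that assumes the initial values b = 1 and a = c = 0, which the explanation above implies; the variable names with suffix 1 and 2 are hypothetical:

```verilog
module assign_delay_demo;
  reg a1, c1;     // driven with blocking assignments (left example)
  reg a2, c2;     // driven with non-blocking assignments (right example)
  reg b;

  initial begin
    b = 1; a1 = 0; c1 = 0; a2 = 0; c2 = 0;
    fork
      begin            // left example: finishes at t = 4
        a1 = #2 b;
        c1 = #2 a1;
      end
      begin            // right example: both updates happen at t = 2
        a2 <= #2 b;
        c2 <= #2 a2;
      end
    join
    #1 $display("t=%0d: blocking a=%b c=%b, non-blocking a=%b c=%b",
                $time, a1, c1, a2, c2);
  end
endmodule
```

With blocking assignments c follows the new value of a, with non-blocking assignments c receives the value a had when the statement was evaluated.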
• In fact, the event scheduler deals differently with "regular" and non-blocking events. The following table tries to shed some light onto this:
Statement     Meaning
a = #4 b;     This is like a = #0 b; except that the update event and the continuation of the process are scheduled 4 time units in the future.
a <= #4 b;    This is like a <= #0 b; except that a will not be updated (using a non-blocking update event) until 4 time units in the future.
#4 a <= b;    Wait 4 time units before doing the action for a <= b; The value assigned to a will be the value of b 4 time units later.
• Mixing gate level instances and behavioral code might make you lose your head ...
module blackhole
(output reg f,
input a, b);
reg q;
initial
f = 0;
always
@ (posedge a)
#10 q = b;
not (qbar, q);
always
@ q
f = qbar;
endmodule
Synthesizable Verilog
• Most Verilog language constructs are synthesizable (i.e. can be implemented as hardware gates). Which constructs are supported may even depend on the synthesis tool used (e.g. Synopsys DC, or FPGAExpress)
• In general, the following constructs are not supported by a synthesis tool:
• Wait construct
• Repeat, fork, join
• Data types: time, real, realtime
• User defined primitives
• Initial (a one-time sequential active flow)
• Delay operator (#)
• Switch level primitives: *mos where * is n, p, c, rn, rp, rc; pullup, pulldown; *tran+ where * is (null), r and + (null),
if0, if1 with both * and + not (null)
• 7-signal strength (or higher) logic values
• Tri-State Net definitions (such as triand,trior, tri0, tri1, trireg; but wand, wor, supply0, supply1 are!)
• Some operators (/, %, ===, !==)
• When you start writing HDL think about the hardware you wish to
produce.
Latches:
1. If possible, avoid using latches in your design. Using latches can be more difficult to design correctly and to
verify.
3. You can avoid inferred latches by using any of the following coding techniques.
• Assign default values at the beginning of a process
Latches:
In Verilog, latches are synthesized for a variable when all the following statements are true.
1. Assignment to the variable occurs in at least one but not all of the branches of a Verilog control statement.
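A minimal sketch contrasting an inferred latch with the default-value technique. Both modules are hypothetical examples:

```verilog
// Incomplete assignment: y keeps its old value when en == 0,
// so a synthesis tool infers a latch.
module latch_inferred (output reg y, input en, d);
  always @(*)
    if (en) y = d;
endmodule

// Default value at the beginning of the process: y is assigned on
// every path, so only combinational logic is inferred.
module no_latch (output reg y, input en, d);
  always @(*) begin
    y = 1'b0;          // default assignment
    if (en) y = d;
  end
endmodule
```

In simulation the difference shows up when en returns to 0: the first module holds the last value of d, the second one returns to 0.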
Latches:
In Verilog, flip-flops are inferred when the event list of an always statement contains edges, i.e. posedge clock or negedge clock.
1. Is easy to synthesize.
Asynchronous Reset:
signal like a clock. Usually, a tree of buffers is inserted at place and route.
3. Must be synchronously de-asserted in order to ensure that all flops exit the reset condition on the
same clock. Otherwise, state machines can reset into invalid states.
4. For both VHDL and Verilog, the asynchronous signal must be in the sensitivity list of the process (VHDL) or always statement (Verilog).
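A minimal sketch of both points: a flip-flop with an active-low asynchronous reset in the event list, and a common two-stage circuit that asserts the reset asynchronously but de-asserts it synchronously. Module and signal names are assumptions:

```verilog
// Flip-flop with asynchronous, active-low reset: the reset signal
// appears in the event list next to the clock.
module dff_async_reset (output reg q, input d, clock, reset_n);
  always @(posedge clock or negedge reset_n)
    if (!reset_n) q <= 1'b0;
    else          q <= d;
endmodule

// Reset synchronizer: reset_n_sync goes low immediately with reset_n,
// but returns to high only after two rising clock edges.
module reset_sync (output reg reset_n_sync, input clock, reset_n);
  reg meta;
  always @(posedge clock or negedge reset_n)
    if (!reset_n) {reset_n_sync, meta} <= 2'b00;
    else          {reset_n_sync, meta} <= {meta, 1'b1};
endmodule
```

Feeding reset_n_sync to the reset inputs of all flip-flops of a clock domain lets them leave the reset condition on the same clock edge.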
Combinatorial Logic:
1. Envision the combinational circuit that will be synthesized.
2. Avoid combinational feedback, that is, the looping of combinational processes.
3. When modeling purely combinational logic, ensure signals are assigned in every branch of conditional signal assignments.
4. Ensure the sensitivity list of process statements in VHDL and the event list of always statements in Verilog are complete.
5. For VHDL, do not include the after clause in a signal assignment. This clutters the code and makes it harder to read.
7. For Verilog, the always statement is supported by synthesis. The initial statement is not.
Verilog only.
1. Blocking assignments execute in sequential order, non-blocking assignments execute concurrently.
2. When writing synthesizable code, use non-blocking assignments in sequential blocks (-> always @ (posedge clock) blocks).
VHDL only.
• During simulation, signal assignments are scheduled for execution in the next simulation cycle.
• Variable assignments take effect immediately, and they take place in the order in which they appear in the code.
• A case statement infers a single-level multiplexer.
• An if-then-else statement infers a priority-encoded, cascaded combination of multiplexers.
Case vs. if-then-else:
For VHDL and Verilog
for combinational logic from a case statement, ensure that
3. The default branch of a Verilog case statement is essential to ensure all branch values are covered and to avoid inferring latches.
4. The others branch (the default case branch in VHDL) is optional to ensure all branch values are covered.
Overview
[Figure: design flow. HDL coding (Verilog, System Verilog, …) feeds the synthesis (e.g. Synopsys Design Vision), which is controlled by Synopsys Design Constraints (SDC) and the Common Power Format (CPF). Synthesis produces the netlist (the actual circuit as Verilog netlist) and an SDF file (contains delay information). Place and route (e.g. Cadence Innovus), controlled by PnR constraints, produces the GDSII (the finished layout).]
Constraining Synthesis
Constraining
• Clock domain crossings in which double synchronizer logic has been added
• Registers that might be written once at power up
• False paths for static signals due to mode merging
• Reset or test logic
• Ignore paths between the write and asynchronous read clocks of an
asynchronous distributed RAM (when applicable)
Syntax:
set_false_path
[-setup]                   Command applies only to setup paths
[-hold]                    Command applies only to hold paths
[-rise]                    Command applies only to rising edges
[-fall]                    Command applies only to falling edges
[-from object_list]        False path originates at this object
[-to object_list]          False path ends at this object
[-through object_list]     False path passes through this object
[-fall_from object_list]   Only falling edges originating at this object
…
[-comment string]          Add a comment to your false path
Examples:
• Example 2: False paths for static signals arising due to merging of modes: Suppose you have a structure as shown in the picture below. You have two modes, and the path to the multiplexer output is different depending upon the mode. However, in order to cover timing for both modes, you have to keep the "Mode select bit" unconstrained. This results in paths also being formed through the multiplexer select. You can specify set_false_path through the select of the multiplexer, as it will be static in both modes, if there are no special timing requirements related to mode transition on this signal.
Specifically speaking, for the scenario shown:
Examples:
In this example, the registers are reset by the negative edge: all registers can be reset asynchronously (no timing checks during reset assertion). However, the de-assertion from their reset state must occur synchronously (as the command only applies to the falling edge).
Clock constraining:
create_clock
-period period_value     Describes the clock period
[source_objects]         Source of the clock, e.g. [get_ports X] or [get_nets X]
[-name clock_name]       Sets the name of the clock signal
[-waveform edge_list]    Used to define a custom clock waveform
[-add]                   Generate more than one clock from the same source
[-comment string]        Add a comment to your clock
• Setting jitter of clocks
set_clock_uncertainty
[-from | -rise_from | -fall_from clock]   Start point of the uncertainty
[-to | -rise_to | -fall_to clock]         End point of the uncertainty
[-setup]                                  Command applies to setup time
[-hold]                                   Command applies to hold time
[-rise]                                   Command applies to rise of clk
[-fall]                                   Command applies to fall of clk
uncertainty_value                         The actual value
[object_list]                             Clocks the command applies to
Remark: -from & -to are obsolete
• Setting latency from clock source (e.g. PLL) to clocked devices
set_clock_latency
[-rise] Command applies to rising edge
[-fall] Command applies to falling edge
[-min] Used to set a min value
[-max] Used to set a max value
[-source] Set the delay of (off-chip) source to clock
[-late] Set the option for the late edge of the clk
[-early] Set the option for the early edge of the clk
[-clock clock_list] List of the clocks this option applies to
delay The actual delay
object_list The source of the clock
Port constraining:
set_input_delay
[-clock clock_name] Specify the reference Clock
[-clock_fall] Sets this command to be referring to clocks neg edge
[-level_sensitive]
[-rise]
[-fall]
[-max]
[-min]
[-add_delay]
delay_value
port_pin_list
Chapter 10
VERILOG-AMS
• Analog signals are continuous: the value of a signal at any point may be any value from a continuous range of values
[Figure: Verilog-AMS in relation to Verilog-HDL and Verilog-A]
• Top-down design flow has been discussed before
• With heterogeneous systems and/or multidisciplinary components, Verilog-AMS offers analog, mixed-signal and digital language constructs that allow the creation of an abstract system-level model, and its simulation in de facto one simulator
• For efficiency reasons, there is no commercial simulator available that does not partition the analog (time-continuous) and the digital (time-discrete) part, but this is quite well hidden from the user
Annotations: disciplines.vams is a collection of physical signal types; r is a real valued parameter; V(p,n) and I(p,n) are the across and through quantities.
`include "disciplines.vams"
analog
V(p,n) <+ r*I(p,n);
endmodule
`include "disciplines.vams"
analog
I(p,n) <+ c*ddt(V(p,n));
endmodule
`include "disciplines.vams"
In Verilog-A the description of a constant-valued voltage or current source looks like this:
Note: direction is only output here!
`include "disciplines.vams"
module vsrc (p,n);
parameter real dc=0; // DC voltage (V)
output p, n;
electrical p, n;
analog
V(p,n) <+ dc;
endmodule
`include "disciplines.vams"
analog
I(p,n) <+ dc;
endmodule
`include "disciplines.vams"
analog
V(p,n) <+ gain*V(ps, ns);
endmodule
`include "disciplines.vams"
analog
I(p,n) <+ gain*I(ps, ns);
endmodule
In Verilog-A the description of a structural model (voltage source and resistor) looks like this:
`include "disciplines.vams"
`include "vsrc.vams"
`include "resistor.vams"
module smpl_ckt;
electrical n;
ground gnd;
Note that every circuit has one node designated as ground or reference node. This node defines zero potential for all
disciplines (and has no discipline itself).
`include "disciplines.vams"
analog begin
@(cross(V(a,c) + I(a,c), 0));
if ((V(a,c) + I(a,c)) > 0)
V(a,c) <+ 0;
else
I(a,c) <+ 0;
end
endmodule
Adding voltage and current looks strange, but this is a very robust way to check if the
diode is in quadrant one!
RC chain:
SPICE netlist:
*RC Circuit
R1 in out 10k
C1 out gnd 10u
Verilog-A netlist:
// RC Circuit
module RC(in, out);
inout in;
inout out;
electrical in;
electrical out;
ground gnd;
resistor #(.r(10k)) r1 (in, out);
capacitor #(.c(10u)) c1(out,gnd);
endmodule
Contribution statements are not the only way to assign values to analog signals. Sometimes a method called indirect branch assignment is helpful. One example is the ideal opamp (infinite gain, infinite input resistance, zero output resistance, …). We know that Vin = 0 (the difference voltage of the two input pins), and that the output adjusts to a voltage according to the feedback. Here is the code:
`include "disciplines.vams"
analog begin
V(out): V(in) == 0;
end
endmodule
The statement V(out): V(in) == 0; is an indirect branch assignment which reads "drive V(out) so that V(in) == 0". This means that V(out) is driven by a voltage source, and the source voltage is assigned in such a way that the given equation is satisfied. Any branches in the equation, such as V(in), are only probed, but not driven!
The left-hand side of this indirect branch assignment must be either an access function (such as V(out)), or ddt or idt applied to an access function. The tolerance for the equation is taken from the argument on the left side of the equality operator.
Modeling a real opamp with imperfections (offset, finite gain, finite slew rate, frequency behaviour, limits, …) is much more difficult. Many examples of this are available online.
The key is to add, remove or limit the currents or voltages at the boundary of the opamp. Example:
• Quantization: Mapping of a continuous signal into a set of discrete ranges
• Coding: Assignment of a binary code to each discrete range
$V_{FS} = K V_{REF}$
Change $V_{REF}$ until the unknown $v_x$ is determined within 0.5 LSB error:
$$v_x = V_{FS} \sum_{i=1}^{n} b_i 2^{-i} \pm \frac{V_{FS}}{2^{n+1}}$$
Source: Sedra&Smith: Microelectronic Circuits.
1. Linear quantization
2. Nonlinear quantization: µ-law and A-law methods
- Combination of linear quantization and nonlinear
compression/expansion
e.g. Compander chip used to perform both compression and expansion for speech samples
3. Delta modulation (Oversampling)
4. Sigma-delta modulation (Oversampling)
5. Adaptive differential quantization(e.g. ADPCM), ...
[Figure: A/D conversion chain: analog input, anti-aliasing filter (LPF), sample & hold circuit (fT), ADC (quantization + coding), digital output]
• Offset error
• Missing code: DLE > 1 LSB
• Nonmonotonicity: input voltage increases, output code decreases
Good ADC: DLE < 0.5 LSB, no missing code
Source: Sedra&Smith: Microelectronic Circuits.
In Verilog-A the description of an ideal A/D converter model looks like this:
`include "disciplines.vams"
`include "constants.vams"
module ideal_adc(in,clk,out);
input in,clk;
output [0:adc_size-1] out;
voltage in,clk,out;
real sample,thresh;
real result[0:adc_size-1];
integer i;
analog
begin
@(cross(V(clk)-clk_vth, +1))
begin
sample = V(in);
thresh = fullscale/2;
for(i=adc_size-1;i>=0;i=i-1)
begin
if (sample > thresh)
begin
result[i] = out_high;
sample = sample - thresh;
end
else result[i] = out_low;
sample = 2*sample;
end
end
V(out) <+ transition(result,delay_,trise,tfall);
end
endmodule
Magnetic
Magnetomotive Force (A-turn) | Magnetic Flux (Wb) | Energy (J)
Magnetomotive Force (A-turn) | Magnetic Flux Rate (Wb/s) | Power (W)
Thermal
Temperature | Entropy Flow (W/K) | Power (W)
Temperature | Heat (J) | Energy-Temperature (JK)
Temperature | Heat Flow (J/s) | Power-Temperature (WK)
Radiant
Luminous Intensity (cd) | Optical Flux (lm) | (cd2 sr)
`include "disciplines.vams"
`include "vsrc.vams"
module test;
ground gnd;
inout shaft, p, n;
rotational_omega shaft;
electrical p, n;
analog begin
V(p,n) <+ km*Omega(shaft)+r*I(p,n)+l*ddt(I(p,n));
Tau(shaft) <+ kf*I(p,n) - d*Omega(shaft) - j*ddt(Omega(shaft));
end
endmodule