
PD – TRAINING

Topic: CTS
Authors: Nilesh Ingale & P. Ravikumar
Date: 08-11-2012

Confidential Information: Do not share or


photocopy without prior written approval 1
INTRODUCTION and SCOPE
After completing this unit, you should be able to:

1.) List the status of the design prior to CTS
2.) Set up the design for clock tree synthesis
3.) Identify implicit clock tree start/end points and when explicit modifications are needed
4.) Control the constraints and targets used by CTS
5.) Execute the recommended clock tree synthesis and optimization flow
6.) Analyze timing and clock specifications post CTS


Why Clocks?
• Clocks provide the means to synchronize events
  - By allowing events to happen at known timing boundaries, we can sequence these events
• Clocks greatly simplify the building of state machines
• No need to worry about variable delay through combinational logic (CL)
  - All signals are held until the clock edge (the clock imposes the worst-case delay)

Prior to Clock Tree Synthesis (pre-CTS)
 Clock buffer tree is typically not built yet
 Clock input ports are connected directly to all FF clock pins

What does CTS do?
 Inserts buffers to meet skew, latency and transition goals

 Goals are set by the user, the vendor and/or PnR tool.

What does Timing Analysis do Pre-CTS?
 Timing analysis assumes ‘ideal’ clock networks by default ->
zero skew, latency and transition
 Ignores buffers even if they are
present
 This would produce overly
optimistic timing results

Modeling Clock Tree Effects Pre-CTS
 Your SDC constraints should contain
the following set of commands for
each clock domain:
set_clock_uncertainty
set_clock_latency
set_clock_transition
 The clocks are still considered
to be ‘ideal’, but the zero values
are overridden by the values
specified by these SDC commands
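As a minimal sketch of the per-domain constraint set listed above, the Python helper below prints the three SDC commands for one clock. The clock name `clk_core` and all numeric values are hypothetical placeholders, not values from this training; the helper itself is illustrative and not part of any tool.

```python
def pre_cts_sdc(clock, uncertainty_ns, latency_ns, transition_ns):
    """Return the three pre-CTS clock-modeling SDC commands for one clock."""
    target = f"[get_clocks {clock}]"
    return [
        f"set_clock_uncertainty {uncertainty_ns} {target}",  # skew/jitter margin
        f"set_clock_latency {latency_ns} {target}",          # estimated insertion delay
        f"set_clock_transition {transition_ns} {target}",    # estimated clock slew
    ]

# Hypothetical clock name and values, for illustration only.
sdc_lines = pre_cts_sdc("clk_core", 0.2, 1.0, 0.1)
print("\n".join(sdc_lines))
```

Repeat the call once per clock domain so that every ideal clock carries all three estimates.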

Design Status, Start of CTS Phase
 Placement – completed
 Power and ground nets – prerouted
 Estimated congestion – acceptable
 Estimated timing – acceptable (~0ns slack)
 Estimated max cap/transition – no violations
 High fanout nets:
 Reset, Scan Enable synthesized with buffers
 Clocks are still not buffered

Is the Design Ready for CTS?
• check_physical_design -for_cts checks that:
  - the design is placed
  - clocks have been defined
  - clock roots are not hierarchical pins
• check_clock_tree checks and warns if:
  - a clock source pin is a hierarchical pin
  - a generated clock has an improperly specified master clock
  - a clock tree has no synchronous pins
  - there are multiple clocks per register

Starting Point before CTS

Basic Terminology
 PLL
 Clock Period
 Clock Latency
 Source Latency
 Network Latency
 Clock Uncertainty
 Setup & Hold
 Constraints
 Max Capacitance
 Max Fanout
 Max Transition
 Skew
 Global skew
 Local skew
 Useful skew
Clock Period
• In synchronous designs, data transfer between functional elements is synchronized by clock signals
• Clock signals are generated externally (e.g., by a PLL)
 Clock period equation

clock period ≥ Td + Tskew + Tsu

Td : Longest path delay through combinational logic
Tskew : Clock skew
Tsu : Setup time of the synchronizing elements
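A small worked example of the clock period relation, written as a Python sketch. The three delay values are made-up numbers in nanoseconds, chosen only to show the arithmetic.

```python
def min_clock_period(t_d, t_skew, t_su):
    """Smallest period that satisfies: period >= Td + Tskew + Tsu."""
    return t_d + t_skew + t_su

# Hypothetical values in ns: longest logic path, clock skew, flop setup time.
period_ns = min_clock_period(t_d=3.2, t_skew=0.3, t_su=0.1)
fmax_mhz = 1000.0 / period_ns  # 1000 / period[ns] gives MHz

print(f"minimum period = {period_ns:.2f} ns, fmax = {fmax_mhz:.1f} MHz")
```

Every picosecond of skew adds directly to the minimum period, which is why skew is treated as a first-order target during CTS.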

Clock Skew
 Clock skew is the maximum difference in the arrival time of a
clock signal at two different components.
 Clock skew forces designers to use a large time period between
clock pulses. This makes the system slower.
 So, in addition to other objectives, clock skew should be
minimized during clock routing.

Clock Skew Causes
• Designed (unavoidable) variations: mismatch in buffer load sizes and interconnect lengths
• Process variation: process spread across the die yields different Leff (effective channel length), Tox (oxide thickness), etc. values
• Temperature gradients: change MOSFET performance across the die
• IR voltage drop in the power supply: changes MOSFET performance across the die

Clock Latency
• Clock source latency is the delay from the clock source to the clock definition port of your design
• Clock network latency is the delay from the clock definition port to the clock sinks of your design
  - It is also known as insertion delay (the standard term)

Zero Skew Methodologies
• Global Skew
– Achieve zero skew between any 2 synchronous pins, without considering logic relationships
Skew global = LP – SP (longest minus shortest clock path delay)
Skew global = 3 - 2 = 1

• Local Skew
– Achieve zero skew between 2 synchronous pins, while considering logic relationships (only pins that share a data path are compared)
Skew local = LP – SP
Skew local = 2.5 - 2 = 0.5
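The two skew definitions can be checked with a short Python sketch. The latencies below reproduce the slide's numbers (2, 2.5 and 3), but the assignment of those values to FF1..FF3 and the choice of which flops are logically related are assumptions, since the original figure is not available.

```python
def global_skew(latency):
    """Longest minus shortest clock latency over all sinks."""
    return max(latency.values()) - min(latency.values())

def local_skew(latency, related_pairs):
    """Worst latency difference over sink pairs that share a data path."""
    return max(abs(latency[a] - latency[b]) for a, b in related_pairs)

# Assumed clock latencies per flop, and an assumed launch/capture relationship.
latency = {"FF1": 2.0, "FF2": 2.5, "FF3": 3.0}
related = [("FF1", "FF2")]

g_skew = global_skew(latency)
l_skew = local_skew(latency, related)
print(f"global skew = {g_skew}, local skew = {l_skew}")
```

Local skew can never exceed global skew, because it takes the maximum over a subset of the same sink pairs.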

Useful Skew
• Skewing a clock to improve timing
• Determines clock insertion values based on logic
path delay
• Evenly distributes slack to adjacent paths
Useful Skew = D source – D target
Setup Slack = T – skew – data path
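A hedged numeric sketch of the two formulas above, in Python. It uses three flops in a row (FF1 to FF2 to FF3) with made-up delays in ns, and ignores setup time exactly as the slide formula does. Delaying the clock at FF2 moves slack from the short path to the long one.

```python
def setup_slack(period, d_source, d_target, data_path):
    """Setup slack = T - skew - data path, with skew = D_source - D_target."""
    skew = d_source - d_target
    return period - skew - data_path

T = 10                        # clock period (hypothetical)
PATH_A, PATH_B = 11, 7        # FF1->FF2 and FF2->FF3 data path delays (hypothetical)

# Balanced tree: every flop sees the same clock insertion delay.
balanced = (setup_slack(T, 1, 1, PATH_A), setup_slack(T, 1, 1, PATH_B))

# Useful skew: delay the FF2 clock by 2, leave FF1 and FF3 unchanged.
skewed = (setup_slack(T, 1, 3, PATH_A), setup_slack(T, 3, 1, PATH_B))

print("balanced slacks:", balanced)
print("skewed slacks:  ", skewed)
```

In this example the failing path borrows time from its neighbor, so both paths end with equal positive slack.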

Clock Design Problem
 What are the main concerns for clock design?
 Skew
 No. 1 concern for clock networks
 For increased clock frequency, skew may contribute over
10% of the system cycle time
 Power
 very important, as clock is a major power consumer!
 It switches at every clock cycle!
 Clock Consumes ~ 40% of the power.
 Noise
 Clock is often a very strong aggressor
 May need shielding
 Delay
 Not really important
 But slew rate is important (sharp transition)
Clock Design Considerations
 Clock signal is global in nature, so clock nets are usually
very long.
 Significant interconnect capacitance and resistance
 So what are the techniques?
 Routing
 Clock tree versus clock mesh (grid)
 Balance skew and total wire length
 Buffer insertion
 Clock buffers to reduce clock skew, delay, and distortion
in waveform.
 Wire sizing
 To further tune the clock tree/mesh

Clock Jitter
• Variation in clock arrival time at the inputs of a sequencing element
• Has random and deterministic components
• Varies from cycle to cycle
• Contrast with clock skew, which measures the average difference in arrival times of the clock at two different sequencing elements
• Period jitter: variation in the clock period when referenced to an ideal clock
• Cycle-to-cycle jitter: variation in the next edge when referenced to the previous edge
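The two jitter measures can be sketched in Python from a list of rising-edge timestamps. Reporting the peak deviation is one common convention and is an assumption here, as are the sample edge times (in ns).

```python
def periods(edges):
    """Measured periods between consecutive rising edges."""
    return [b - a for a, b in zip(edges, edges[1:])]

def period_jitter(edges, ideal_period):
    """Peak deviation of any measured period from the ideal period."""
    return max(abs(p - ideal_period) for p in periods(edges))

def cycle_to_cycle_jitter(edges):
    """Peak difference between two adjacent measured periods."""
    p = periods(edges)
    return max(abs(b - a) for a, b in zip(p, p[1:]))

# Hypothetical edge times for a nominal 10 ns clock; the second edge is late.
edges = [0.0, 10.25, 20.0, 30.0]

pj = period_jitter(edges, ideal_period=10.0)
c2c = cycle_to_cycle_jitter(edges)
print(f"period jitter = {pj} ns, cycle-to-cycle jitter = {c2c} ns")
```

A single late edge lengthens one period and shortens the next, so the cycle-to-cycle value here is twice the period jitter.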

Clock Jitter: Flip flops

What causes clock jitter?

Clock Distribution Network
• General goals of clock distribution
  - Deliver the clock to all memory elements with acceptable skew
  - Deliver clock edges with acceptable sharpness
• Clocking network design is one of the greatest challenges in the design of a large chip
  - Can consume up to 1/3 of chip power
  - Requires accurate signal delay
  - Requires signal integrity
  - Subject to uncertainty and variation across processes and operating conditions

Clock design Components
 Oscillator
 Dividers
 Buffers
 Strong drivers
 Reduce delay
 Signal integrity / slew rate
 Interconnects
 Balanced trees, meshes, etc.
 Shielding (e.g., for crosstalk reduction)
 Non-tree links / feedback loops

Clock Distribution Objective
 Minimum / bounded skew
 performance / hold time requirements
 Guaranteed slew rate / signal integrity
 Small insertion delay
 Robustness under process / operating condition variation
 Minimum cell / routing area
 Minimum power consumption

Clock Distribution Robustness Subject to
 Radically different loading (flip-flop density)
 Across the die
 ECO (Engineering Change Order)
 Interconnect coupling
 Signal integrity
 Delay variation
 Process variation
 From lot-to-lot
 Across the die
 Buffers
 Metal width
 Supply voltage variation across the die
 Both static IR drop
 Dynamic voltage drop
 Temperature
Issues in Clock Distribution Network Design
• Skew
• Process, voltage, and temperature
• Data dependence
• Noise coupling
• Load balancing
• Power, CV²f (consumes up to 1/3 of total chip power)
• Clock gating
• Flexibility/Tunability
• Compactness: fit into existing layout/design
• Facilitate ECO
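The CV²f power term in the list above can be illustrated with a short Python sketch. The capacitance, voltage, frequency and the 0.1 data activity factor are hypothetical values; the activity factor of 1.0 for the clock reflects that a clock net charges and discharges every cycle.

```python
def dynamic_power(c_farads, v_volts, f_hz, activity=1.0):
    """Switching power P = activity * C * V^2 * f, in watts."""
    return activity * c_farads * v_volts ** 2 * f_hz

# Hypothetical clock network: 200 pF total, 1.0 V supply, 500 MHz.
clock_power = dynamic_power(200e-12, 1.0, 500e6)

# The same capacitance on data nets that toggle in about 10% of cycles.
data_power = dynamic_power(200e-12, 1.0, 500e6, activity=0.1)

print(f"clock: {clock_power * 1e3:.0f} mW, data: {data_power * 1e3:.0f} mW")
```

Because power scales linearly with f and with the activity factor, gating the clock of idle registers removes their share of this term.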

Clock Tree Synthesis

CTS Goals
 Meet the clock tree Design Rule Constraints (DRC):
 Maximum transition delay
 Maximum load capacitance
 Maximum fanout
 Maximum buffer levels

 Meet the clock tree targets:


 Maximum skew
 Min/Max insertion delay
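As an illustration of how the DRC limits and targets above act as pass/fail checks, here is a minimal Python sketch. The dictionary keys, limits and measured values are all hypothetical; a real flow would take them from the tool's clock tree report.

```python
def clock_tree_violations(measured, limits):
    """Return the names of the CTS DRCs and targets that are not met."""
    upper_bounded = ("max_transition", "max_capacitance", "max_fanout",
                     "max_levels", "max_skew")
    failed = [name for name in upper_bounded if measured[name] > limits[name]]
    lo, hi = limits["insertion_delay"]        # min/max insertion delay window
    if not lo <= measured["insertion_delay"] <= hi:
        failed.append("insertion_delay")
    return failed

# Hypothetical limits and results (ns, pF, counts).
limits = {"max_transition": 0.15, "max_capacitance": 0.10, "max_fanout": 32,
          "max_levels": 12, "max_skew": 0.05, "insertion_delay": (0.5, 1.2)}
measured = {"max_transition": 0.12, "max_capacitance": 0.08, "max_fanout": 24,
            "max_levels": 9, "max_skew": 0.07, "insertion_delay": 1.4}

violations = clock_tree_violations(measured, limits)
print("violations:", violations)
```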

Clock Tree Synthesis (CTS) (1/2)

Clock Tree Synthesis (CTS) (2/2)

Where does the Clock Tree Begin and End?

Define Clock Root Attributes (1/2)
 When the clock root is a primary port of a block
 Ensure that an appropriate driving cell is defined
set_driving_cell
 The synthesis constraints may include a weak driving cell for
all inputs, including the clock port
 Because the clock is ideal during synthesis it has no effect
on design QoR
 But a weak driver on the clock port affects clock tree QoR
during CTS

Define Clock Root Attributes (2/2)
 When the clock root is a primary port, but at the CHIP level
through an IO-PAD
 Ensure that an appropriate input transition is defined
set_input_transition

Stop, Float and Exclude Pins

Leaf Pins
leaf_pins: Define specific pins as leaves, i.e. stop tracing the clock when they are encountered

Generated and Gated Clocks

Skew Balancing not Required?

User-defined or Explicit Stop Pins

Defining an Explicit Stop Pin

Defining an Explicit Float Pin

Preserving Pre-Existing Clock Trees

Impact of Preexisting Clock Cells

• Any preexisting clock buffers and cells are counted as clock gate levels
• Any clock gate level is considered a balancing point, therefore…
• Preexisting clock buffers/inverters might create unnecessary clock levels for CTS
• Use remove_clock_tree to remove existing clock buffers
  - This will generally lead to higher quality clock trees

Specifying Skew / Insertion Delay Targets
 Refer to CTS script.

Set Buffer/Inverter Selection Lists
• To limit CTS to a list of buffers/inverters used for specific optimizations:

Command:

• There is no priority on how CTS uses the members from each list
• If a list is not specified, all buffers/inverters in the library without dont_use attributes are used

Make sure the references are in target_library

When Clock Tree DRCs are Used

Non-Default Clock Routing

• The PnR tool can route the clocks using non-default routing rules, e.g. double-spacing, double-width, shielding
• Non-default rules are often used to “harden” the clock, e.g. to make the clock routes less sensitive to crosstalk or EM effects

NDR Recommendations
• Always route clocks on metal 3 and above
• Avoid NDR on clock sinks:
set_clock_tree_options -use_default_routing_for_sinks 1
• Avoid NDR on Metal 1
  - The router may have trouble accessing metal 1 pins on buffers and gates
• Put NDR on pitch; try to avoid blind double spacing
  - Preserves routing resources and keeps preroute RC estimation accurate
• Consider double width to reduce resistance
• Consider double via to reduce resistance and improve yield

Effects of Clock Tree Synthesis
 Clock buffers added
 Congestion may increase
 Non clock cells may have been
moved to less ideal locations
 Can introduce new timing
and max tran/cap violations

Post CTS / Optimization
• clock_opt -only_psyn
  - Reduces disturbances to other cells as much as possible
  - Performs logical and placement optimizations to fix possible timing and max tran/cap violations, based on propagated clock arrivals
 To enable hold time fixing

 To prioritize TNS over WNS, set:

 To prioritize min over max, set:

Minimize Hold Time Violations in Scan Paths: Reordering
• Reorders the scan chain to minimize crossings between clock buffers
• Can reduce unnecessary hold time violations in the scan chain

Recommended Flow

All CTS-built clocks are propagated automatically, so there is no need to use the “set_propagated_clock” command!

Analysis using the CTS GUI
 CTS browser
 Properties and attributes on clock tree objects
 Traversing clock tree levels
 Symbols for CTS objects like buffers, gates and sinks
 CTS schematic
 Trace forward/backward in schematic view
 Collapses all sinks in the fanout of a CTS buffer for clearer
CTS schematic
 Highlight CTS objects in the layout view
 Clock arrival histogram

Analyzing CTS Results
• report_clock_tree
-summary
-settings
-...
  - Reports max global skew, late/early insertion delay, number of levels in the clock tree, number of clock tree references (buffers), and clock DRC violations
• report_clock_timing
  - Reports actual, relevant skew, latency, interclock latency, etc. for paths that are related
  - Example: report_clock_timing -type skew

What about CTS Operating Conditions?
• What happens when building the clock tree using min_max?
  - The tree is compiled in -max, then analyzed in -min
  - If the skew analyzed in -min is not worse than the skew in -max, compiling with -min_max will not make much difference
  - If the skew analyzed in -min is worse than that in -max, then compiling in -min_max will build a tree with a better skew in -min at the cost of a possibly worse skew in -max
• In summary, compiling in -min_max builds a tree with less skew variation when analyzed in both -min and -max
• The skew will not be better than that of a tree compiled and analyzed in -max

Clock Tree Optimization
Perform additional Clock Tree Optimization as
necessary to further improve clock skew.

Invoke CTS: Core Command

clock_opt use recommendation
• Using clock_opt in the following manner has been found to be more flexible across designs and flows:

clock_opt -only_cts -no_clock_route
analyze…
clock_opt -only_psyn -no_clock_route
analyze…
route_group -all_clock_nets

Clock Tree Optimization Techniques
• Buffer/Gate Sizing
• Buffer/Gate Relocation
• Level Adjustment
• Reconfiguration
• Delay Insertion
• Dummy Load Insertion

Gate/Buffer Sizing
• Sizes up or down buffers/gates to improve both
skew and insertion delay
• These are LEQ cells extracted by the tool
• Users can limit some buffers/gates in the LEQ lists

Gate/Buffer Relocation
• Physically moves cells to reduce skew and
insertion delay
• Calls Overlap/Removal engine

Level Adjustment
• Adjusting a pin to its upper or lower logic
equivalent net

Reconfiguration
• Re-clustering of sequential logic
• Buffer placement performed after re-clustering –
runtime intensive
• Recommended for small clock trees

Delay Insertion
• Works on low fan-out nets where no clock tree is
inserted
• Delay cells may be specified by users or extracted
by the tool

Dummy Load Insertion
• Load balance function
• Uses a cell's input capacitance to increase loading
• Dummy Load cells may be specified by users or
extracted by the tool
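The load-balancing idea can be sketched in Python as follows: find how much capacitance each lighter branch is missing relative to the heaviest one, then convert that into a count of dummy cells. The branch names, capacitances (in fF) and the 5 fF dummy cell input capacitance are hypothetical, and rounding to the nearest whole cell is a simplification.

```python
def dummy_cells_needed(branch_caps_ff, dummy_cell_cap_ff):
    """Dummy cells to add per branch so that its load approaches the heaviest branch."""
    target = max(branch_caps_ff.values())
    return {name: round((target - cap) / dummy_cell_cap_ff)
            for name, cap in branch_caps_ff.items()}

# Hypothetical loads seen by three sibling clock buffers, in fF.
branch_caps = {"buf_a": 40, "buf_b": 25, "buf_c": 40}

plan = dummy_cells_needed(branch_caps, dummy_cell_cap_ff=5)
print(plan)
```

Matching the loads makes sibling buffers see similar delay, at the cost of extra switched capacitance and therefore extra clock power.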

(Embedded) Clock Tree Optimization

Balancing Multiple Synchronous Clocks

Inter-Clock Delay Balancing

Inter-Clock Delay Balancing with Offset

SDC Latencies

CTS does not respect SDC latencies by default!

If you need your insertion delays to match the SDC-provided latencies, perform clock tree balancing

Note: Insertion delay will not be minimized if the given SDC latency is less than the initial CTS insertion delay

CTS Checklist
• Prerequisite Check
  - Make sure the design is legally placed
  - Make sure all the clocks and clock constraints are defined
  - Is the source of each generated clock really a clock source (make sure that there is a create_clock defined on the source net)?
  - Can create_generated_clock trace back along a real path to the clock source? If not, the sinks of the generated clock will not be balanced with the sinks of the source.
  - Clock definitions on hierarchical ports are not supported during clock tree synthesis; if any such definitions exist, redefine the clock on the output pin of the driver of the hierarchical port.

CTS Checklist Contd…
• Clock Exceptions Check
  - If you use set_clock_tree_exceptions to specify a particular pin as a stop_pin, float_pin or exclude_pin, the last one specified takes precedence.
  - Clock-related attributes and nondefault rules are propagated in spite of dont_touch_subtrees being specified; use set cts_traverse_dont_touch_subtrees false to override this feature.

CTS Checklist Contd…
• Timer-Related Check
  - Use report_disable_timing to make sure that the disabled timing arcs are intentional.
  - Use report_case_analysis to make sure that the set_case_analysis settings are intentional and make sense; use remove_case_analysis to remove the incorrect ones.
  - The set_timing_derate command is ignored by clock tree synthesis and report_clock_tree; use report_timing_derate to check. The report_clock_timing and report_timing commands honor set_timing_derate.
  - The message "Invalid phase delay at pin xx/yy" implies a problem; open a STAR. This message is printed only in debug mode (set cts_use_debug_mode true).

Clock Tree Synthesis Best Practices
1.) Big insertion delays
• Are there delay cells in the design?
Check whether there are delay cells in the netlist of the current design. These could be causing delay that cannot be optimized, so that CTS builds clock trees which match all other paths to this worst insertion delay.
• Are there cells marked "don’t touch" in the design?
There could be cells in the design that are marked "don’t touch", which prevents CTS from deleting them and building optimal clock trees.
• Can the floorplan be modified to be more clock friendly?
Sometimes it helps to consider CTS (and timing) as a constraint for floorplanning. Long skinny channels leading to more long skinny placement channels will cause problems for both timing optimization and CTS. Consider using soft blockages or refloorplanning.
CTS Best Practices Contd...
• Can you define new create_clocks that will assist CTS (divide and conquer)?
Many times, running CTS on the main clock pin is not the optimal way to build clock trees. It may help to divide the clock tree based on the floorplan and the sync pins and build sub-clocks, then define the sync pins and build the upper main clock.
• Are the syncPins defined correctly for macros?
It is a good idea to check the syncPins file to see if the sync pins make sense. Also check that the numbers are accurate and that the time units are correct.
• If there are ignore pins in the design, are they defined as ignore pins?
If there are ignore pins in the design, make sure you define these as ignore pins before running CTS.

CTS Best Practices Contd...
• Have you used varRouteRules and propagated them with astMarkClockTree?
Defining varRouteRules helps to reduce the insertion delay. Define shielding and double (or greater) width rules for clock nets, and propagate them using astMarkClockTree.
• Are the CTU buffers marked as "dont use"?
Some technologies use clock tree buffers. Make sure you are using these only for your clock tree. Also make sure they are not marked "dont use".
• Be creative and use different CTS intParams to get better results.
There are several CTS options in the form that you can try changing to get better or more desirable CTS results.

CTS Best Practices Contd...
 Use the Block option in CTS in the first attempt.
This usually gives better insertion and skew results. If your design
is less than 5% std cell utilization try the Top option.
 CTO is designed to work on skew and will not reduce insertion
delay once it is built.
 Try providing a higher skew goal during CTS.
Use inverters only to build the clock tree if possible.
 Define variable route rules with greater than default widths and
clearance and also shield the clock nets.
Then propagate these rules using astMarkClockTree.  This will
help insertion delay.

CTS Best Practices for Unreasonable Skew
 Do you have derived clocks that do not need skew matching?
If you have clocks that get divided and some branches do not need
skew balancing with the rest, then build clock trees for them
separately and do not allow skew calculation between them.
You can define sync pins or ignore pins at cross-over points.
 Look closely at your worst path(s) for possible culprits.
It is quite likely that some of your worst paths have an issue which
is preventing CTS from optimizing them and is causing all other
paths to get delay added to match the insertion delay or better
skew.

CTS doesn't run properly?
 Are the SDC constraints loaded and is create_clock defined?
If there are no create_clock statements in the SDC file loaded,
CTS will not run. Make sure you have at least one create_clock
in your SDC file. It is good practice to have set_clock_transition,
set_clock_latency, and set_clock_uncertainty also defined. For
the SDC latency values to be honored, the intParam
axSetIntParam "acts" "clock uncertainty goal" 1 should be set.
CTS uses constraints in the CTS form as first priority, then it
uses the constraints in the intParams, and then it uses SDC
constraints. Having these in the SDC file will also enable the
timer to account for your skew and insertion delay in
optimization steps.

 Build the clock tree on lower clocks, then define the sync pins
and run CTS on next level up (divide and conquer).
This is a good practice when building clock trees.  Always
remember to define sync pins if you need them.
 astSetDontTouch ?clock_buffers.list? #f done?
If your CTS buffers have a "dont use" property in your library, you
need to set that to false.
 Are the clock nets marked "dont touch" or is set_case_analysis
defined?
Occasionally you may end up with a "dont touch" property on your clock net as a result of your analysis. Make sure you reset this using the astMarkClockTree command. Also, if your SDC constraints have a set_case_analysis defined that disables the clock net, CTS will not build clock trees.
 Is create_clock defined on a non-physical hierarchical pin?
If you define create_clock on a pin that is not present physically and is only present in the hierarchical netlist, CTS will not be able to run.
 Try different CTS options and use the one that gives the best
results.
As always, it is a good idea to experiment and try out different CTS
options and intParams to get the best result.

Clock Distribution Structures

Grids
• Gridded clock distribution was common on earlier DEC Alpha microprocessors
• Advantages:
  - Skew is determined by grid density and is not too sensitive to load position
  - Clock signals are available everywhere
  - Tolerant to process variations
  - Usually yields extremely low skew values
• Disadvantages:
  - Huge amount of wiring and power
  - To minimize such penalties, the grid pitch must be made coarser, which loses the grid advantage

H-Tree
 H-tree
 One large central driver, recursive structure to match
wirelengths
 Halve wire width at branching points to reduce reflections
 Disadvantages
 Slew degradation along long RC paths
 Unrealistically large central driver
- Clock drivers can create large temperature
gradients.
 Non-uniform load distribution
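The recursive, wirelength-matched structure of the H-tree described above can be sketched in Python. This is a geometric illustration only (no buffers, no RC); the root position, the initial arm length of 4 units and the 2 levels are arbitrary choices.

```python
def h_tree_sinks(x, y, half, levels, path=0.0):
    """Return (x, y, wire_length_from_root) for every sink of an H-tree.

    Each level draws one 'H': a horizontal bar of length 2*half centered
    on (x, y), then a vertical stroke of +/- half at each end of the bar.
    """
    if levels == 0:
        return [(x, y, path)]
    sinks = []
    for dx in (-half, half):          # two ends of the horizontal bar
        for dy in (-half, half):      # two tips of each vertical stroke
            sinks += h_tree_sinks(x + dx, y + dy, half / 2,
                                  levels - 1, path + abs(dx) + abs(dy))
    return sinks

sinks = h_tree_sinks(0.0, 0.0, half=4.0, levels=2)
lengths = {length for _, _, length in sinks}
print(f"{len(sinks)} sinks, distinct root-to-sink wire lengths: {lengths}")
```

Every sink ends up at the same wire distance from the root, which is the source of the nominally zero skew; real loads are rarely this uniform.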

Buffered H-tree
• Advantages
  - Ideally zero-skew
  - Can be low power (depending on skew requirements)
  - Low area (silicon and wiring)
  - CAD tool friendly (regular)

• Disadvantages
  - Sensitive to process variations
    Devices: want same-size buffers at each level of the tree
    Wires: want similar segment lengths on each layer in each source-sink path
  - Local clocking loads are inherently non-uniform

Clock tree Mesh
 Clock meshes are homogeneous shorted grids of metal that are
driven by many clock drivers. The purpose of a clock mesh is to
reduce clock skew in both nominal designs and designs across
variations such as on-chip variation (OCV), chip-to-chip
variation, and local power fluctuations.

Benefits of Meshes
 Deterministic since shielded all the way down to rib distribution
 No ECO placement required: all buffers preplaced before block
placement
 Low latency since uses shorted (= ganged, parallel) drivers,
therefore lower skew
 ECO placements of FFs later do not require rebalancing of tree
 “Idealized” clocking environment for “concurrent dance” of RTL
design and timing convergence

Problems with Meshes
 Burn more power at low frequencies
 Blocks more routing resources (solution: integrated power
distribution with ribs can provide shielding for ‘free’)
 Difficult for ‘spare’ clock domains that will not tolerate regioning
 Post placement (and routing) tuning required
 No ‘beneficial skew’ possible
 Clock gating only easy at root
 Fighting tools to do analysis:
 Clumped buffers a problem in Static Timing Analysis tools
 Large shorted meshes a problem for STA tools
 What does Elmore delay calculation look like for a non-tree?
 Need full extraction and SPICE-like simulation to determine skew
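To see why a shorted mesh is awkward for Elmore-style delay calculation, it helps to look at the tree case first. The Python sketch below computes Elmore delay on a small RC tree; the node names and the R (kΩ) and C (pF) values are hypothetical. The recursion relies on every node having exactly one parent, which is the property a mesh with loops does not have.

```python
def elmore_delays(parent, r, c):
    """Elmore delay from the root to every node of an RC tree.

    parent[n]: parent of node n (None for the root)
    r[n]:      resistance of the wire from parent[n] into n
    c[n]:      capacitance lumped at node n
    """
    children = {n: [] for n in parent}
    for n, p in parent.items():
        if p is not None:
            children[p].append(n)

    def downstream_cap(n):
        # All capacitance that must be charged through the wire into n.
        return c[n] + sum(downstream_cap(k) for k in children[n])

    def delay(n):
        if parent[n] is None:
            return 0.0
        return delay(parent[n]) + r[n] * downstream_cap(n)

    return {n: delay(n) for n in parent}

# Hypothetical tree: root -> a -> {s1, s2}, in kOhm and pF (so delays are in ns).
parent = {"root": None, "a": "root", "s1": "a", "s2": "a"}
r = {"a": 1, "s1": 2, "s2": 2}
c = {"root": 0, "a": 1, "s1": 3, "s2": 3}

delays = elmore_delays(parent, r, c)
print(delays)
```

In a mesh, a sink is reachable over many parallel paths, so "the wire into n" and "downstream capacitance" are no longer well defined, which is why the slide points to full extraction and SPICE-like simulation.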

Hybrid Structure
 Balanced tree on the top
 Mesh in the middle
 Minimize skew
 Steiner minimum tree at the bottom
 Minimize cost
 Facilitate ECO

ASSIGNMENTS
1.) Report QoR before starting CTS and after CTS.
- Congestion number.
- Setup / hold, TNS.
- Area.
- Number of flops.

2.) Derive clock tree target constraints for the leon block.

3.) Build the clock tree with minimum insertion delay of 40%. Show the relation between insertion delay and skew with values.

4.) Report clock tree transition with different transition settings: 10%, 5%, 4%.

5.) Optimize the clock tree with different CTS optimization techniques. Use two of them. Report the QoR.