
My Swiss Army Knife of Static Timing Analysis

Wan Chong Khor, Kean Hon Liew

Intel Corporation
Bayan Lepas, Penang, Malaysia

www.intel.com

ABSTRACT
Times have changed, and technology has evolved so fast that the old ways of doing things are either
obsolete or no longer practical. Designing SoCs has never been so complicated, which has spurred
multiple companies to create all kinds of Electronic Design Automation (EDA) tools to help
designers accomplish their work. In this sense, EDA tools are almost indispensable. However, they
are not perfect; in other words, they are not always built in ways that suit the designers. This
paper presents several innovations and enhancements to the existing EDA static timing analysis
(STA) flows that make them more useful to designers and significantly ease their complicated
tasks. This collection is essentially the Swiss army knife of static timing analysis.
SNUG 2019

Table of Contents
1. Introduction
2. The Enhanced Tools
2.1 Tight DMSA Hold Fix
2.2 Regional DMSA Buffer Removal
2.3 Clock Push or Pull Analysis
2.4 Violation Summary Merge and Count
2.5 Debug Friendly Transitive Reporting
2.6 ECO Friendly Path Reporting
3. Results
4. Conclusions
5. References

Table of Figures
Figure 1. Tight DMSA Hold Fix Options
Figure 2. Hold Fix with PBA Setup Margin
Figure 3. Regional DMSA Buffer Removal Options
Figure 4. Clock Pull Analysis for Hold Violation
Figure 5. Native EDA Transitive Report
Figure 6. Debug Friendly Transitive Report
Figure 7. Timing Trace from ECO Friendly Path Reporting
Figure 8. EDA Hold Fix versus Tight DMSA Hold Fix
Figure 9. Fix Log from Tight DMSA Hold Fix
Figure 10. Congested Region Selected
Figure 11. Tool Identified Buffers Removal
Figure 12. Congested Region Cleared

Table of Tables
Table 1. Merged Violations Summary Report


1. Introduction
Like a Swiss army knife, the innovations and enhancements to the EDA static timing analysis flows
presented in this paper are handy, simple-to-use tools that assist designers in their SoC design
tasks. The goal is to make this collection of tools an essential part of STA, streamlining the
design work. The main tools discussed in this paper are: the tight DMSA (distributed
multi-scenario analysis) hold fix tool, a fine-grained hold fix tool that can use the PBA setup
margin for hold fixing; the regional DMSA buffer removal tool, which enables localized removal of
redundant buffers; the clock push or pull analysis tool, which helps designers make data-driven
decisions on clock ECOs; the violation summary merge and count tool, which groups violations for
easier triaging; the debug friendly transitive reporting tool, which eases path tracing for static
values set on pins; and the ECO friendly path reporting tool, which shows designers directly the
timing margins available for timing ECOs. How these tools work is discussed using actual design
data, redacted to protect confidentiality, to demonstrate their effectiveness.

2. The Enhanced Tools


2.1 Tight DMSA Hold Fix
Conventionally, the hold timing fix flows in EDA tools consider only the graph-based analysis (GBA)
setup margin. There are also times when the EDA tools give up on fixing hold violations for
reasons that are not apparent to the designer. The tight DMSA hold fix in this STA toolbox gives
designers a much more controllable and fine-grained hold fixing solution.

Figure 1. Tight DMSA Hold Fix Options

Figure 1 above shows the options of the tight DMSA hold fix solution. Being a DMSA tool, it can
perform common or endpoint-based hold fixes across multiple scenarios, similar to other EDA hold
fix solutions. However, this solution gives designers finer control. As described in its options,
it enhances conventional EDA hold fixing by allowing endpoint-specific hold margins as fix
targets. In addition, designers can use the path-based analysis (PBA) setup margin for hold fixes,
a feature not available in EDA hold fixing. This is very useful when the more pessimistic GBA
setup margin is not enough for hold buffer insertion but the insertion is still possible using the
PBA setup margin. As shown in Figure 2 below, the path has a min hold slack of -15ps, which should
be fixable by inserting a hold buffer with 15ps of min delay. However, with a min-to-max hold
buffer delay ratio of 2.5, that buffer costs about 37.5ps of setup margin in the max scenario, and
the GBA max setup margin is not enough to accommodate it. The min hold violation is still fixable
if the PBA max setup margin of 60ps is considered instead.


Figure 2. Hold Fix with PBA Setup Margin
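
The arithmetic behind this example can be sketched in a few lines of Python. This is only an
illustration and not the tool's implementation: the function names are hypothetical, and the GBA
setup margin of 30ps is an assumed value chosen for the sketch, since the text gives only the
-15ps hold slack, the 2.5 ratio and the 60ps PBA margin.

```python
def max_corner_cost_ps(hold_slack_ps: float, min_to_max_ratio: float) -> float:
    """Setup margin consumed in the max scenario by a buffer that adds just
    enough min-scenario delay to cover the hold violation."""
    return abs(hold_slack_ps) * min_to_max_ratio


def hold_fix_is_feasible(hold_slack_ps: float, setup_margin_ps: float,
                         min_to_max_ratio: float = 2.5) -> bool:
    """True if the available setup margin can absorb the hold buffer."""
    return setup_margin_ps >= max_corner_cost_ps(hold_slack_ps, min_to_max_ratio)


HOLD_SLACK_PS = -15.0        # from the example in the text
PBA_SETUP_MARGIN_PS = 60.0   # from the example in the text
GBA_SETUP_MARGIN_PS = 30.0   # ASSUMED value, for illustration only

cost_ps = max_corner_cost_ps(HOLD_SLACK_PS, 2.5)
print(f"setup margin needed in max scenario: {cost_ps} ps")
print("fixable with GBA margin:", hold_fix_is_feasible(HOLD_SLACK_PS, GBA_SETUP_MARGIN_PS))
print("fixable with PBA margin:", hold_fix_is_feasible(HOLD_SLACK_PS, PBA_SETUP_MARGIN_PS))
```

Under these assumptions the buffer needs 37.5ps of setup margin, which the assumed GBA margin
cannot provide but the 60ps PBA margin can.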

The tight DMSA hold fix discussed here is not meant to replace EDA hold fix solutions. Instead, it
complements them by offsetting some of their limitations and giving designers a bit more
flexibility. The recommended usage model is therefore to converge the hold timing violations with
EDA solutions first, then use the tight DMSA hold fix to resolve the final few stubborn hold
violations.

2.2 Regional DMSA Buffer Removal


Often, especially very close to tape-in, designers find that high utilization or routing
congestion prevents them from implementing the last functional or timing ECOs. All the cells that
can be downsized have already been downsized to recover as much area as possible. Fortunately,
some paths may still have enough hold margin that redundant buffers can be removed. Some EDA tools
provide buffer removal solutions; however, their main objective is to optimize the design for
power, and they always target the whole design. Buffer removal on the whole design may not be
feasible for a design that is close to tape-in, as designers normally want to keep the design
stable and avoid surprises. Hence, the regional DMSA buffer removal tool discussed here helps
designers resolve localized congestion.

Figure 3. Regional DMSA Buffer Removal Options

Figure 3 shows the options of the regional DMSA buffer removal tool. Designers can specify the
hold margin a buffer must have before the tool may remove it. As shown, designers can also
restrict buffer removal to a localized region by specifying the lower left and upper right
coordinates of a bounding box. Only buffers that have enough hold margin and are located within
the bounding box are considered for removal. To ensure the hold margin is met, the tool avoids
removing multiple hold buffers within the same path in a single iteration. In addition, being a
DMSA-based tool, it ensures that the hold margins in all scenarios are met before a buffer is
considered for removal. As far as design rules are concerned, the tool also checks that the max
capacitance and max transition requirements are still met after buffer removal.
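
The selection rules described above can be sketched as follows. This is a simplified illustration
under stated assumptions, not the tool itself: the Buffer record, its field names and the sample
data are hypothetical, each buffer is assumed to sit on a single timing path, and the max
capacitance and max transition checks are left out.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    name: str
    x: float
    y: float
    path_id: str                  # hypothetical ID of the path through the buffer
    hold_slack_by_scenario: dict  # scenario name -> hold slack in ps


def select_removal_candidates(buffers, lower_left, upper_right, min_hold_margin_ps):
    """Pick buffers inside the bounding box whose hold slack meets the
    threshold in every scenario, taking at most one buffer per path."""
    (x0, y0), (x1, y1) = lower_left, upper_right
    picked, used_paths = [], set()
    for buf in buffers:
        inside = x0 <= buf.x <= x1 and y0 <= buf.y <= y1
        worst_hold = min(buf.hold_slack_by_scenario.values())
        if not inside or worst_hold < min_hold_margin_ps:
            continue
        if buf.path_id in used_paths:
            continue  # one buffer per path per iteration
        used_paths.add(buf.path_id)
        picked.append(buf.name)
    return picked


buffers = [
    Buffer("buf_a", 10, 10, "p1", {"scn_1": 40, "scn_2": 25}),
    Buffer("buf_b", 12, 11, "p1", {"scn_1": 50, "scn_2": 30}),  # same path as buf_a
    Buffer("buf_c", 15, 18, "p2", {"scn_1": 35, "scn_2": 5}),   # fails in scn_2
    Buffer("buf_d", 90, 90, "p3", {"scn_1": 60, "scn_2": 60}),  # outside the box
    Buffer("buf_e", 5, 19, "p4", {"scn_1": 22, "scn_2": 21}),
]
candidates = select_removal_candidates(buffers, (0, 0), (20, 20), 20)
print(candidates)
```

In a real flow the remaining buffers on a path would be reconsidered in a later iteration, after
timing has been updated with the first removal.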


2.3 Clock Push or Pull Analysis


Often, designers must manually assess and implement timing fixes for violations that the EDA
timing fix solutions cannot fix. Depending on the nature of the timing violations, it is sometimes
more efficient to implement the fixes on the clock network instead of the data paths. For example,
if many endpoints violate setup timing from a single startpoint, it makes more sense to shorten
the launch clock latency to that startpoint than to upsize or swap thousands of combinational
cells between the startpoint and all its endpoints. However, before performing timing fixes on the
clock network, whether pushing out or pulling in the clock, designers must ensure that there are
enough setup and hold margins on the affected paths. The clock push or pull analysis tool
discussed in this section lets designers assess the feasibility of a clock push or pull when such
an approach is considered.
The clock push or pull analysis tool works in a simple way. Designers provide the point at which
they intend to insert clock buffers to push or pull the clock network, together with the desired
delay. The point of intent can be either the clock pin of a sequential cell or any point within
the clock network. For a clock push on a sequential cell clock pin, the tool checks whether the
setup margins on the output pins of the sequential cell are enough for the desired delay. It also
checks whether the hold margins on all the input pins of the sequential cell are adequate to
absorb that delay. For a clock pull, the tool checks whether there are enough hold margins on the
output pins of the sequential cell and enough setup margins on its input pins for the delay to be
pulled in. If the point of intent is a pin on the clock network, the tool collects the sequential
cells in the fanout cone of that point and checks the setup and hold margins of all the affected
sequential cells before reporting whether the clock push or pull is safe.
Figure 4 below shows an example of a hold-violating endpoint with a slack of -0.109704ns. If the
designers decide to pull in the clock latency of the violating endpoint to fix the hold violation,
they can use this tool to assess whether all the affected margins are enough. In this case, the
clock latency is to be shortened by 0.110ns, and the tool evaluates the setup margins of the input
pins and the hold margin of the output pin. The tool concludes that the clock pull should not
cause new violations, as all the setup and hold margins of the pins are enough to absorb the
0.110ns change.

Figure 4. Clock Pull Analysis for Hold Violation
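
The push and pull checks described in this section can be sketched as below. This is a minimal
illustration, not the tool's code: the SeqCell record and the function name are hypothetical, and
every margin in the example is an assumed value except the -0.109704ns hold slack and the 0.110ns
shift taken from the text.

```python
from dataclasses import dataclass


@dataclass
class SeqCell:
    name: str
    input_setup_ns: float   # worst setup slack of paths ending at the data inputs
    input_hold_ns: float    # worst hold slack of paths ending at the data inputs
    output_setup_ns: float  # worst setup slack of paths starting at the outputs
    output_hold_ns: float   # worst hold slack of paths starting at the outputs


def clock_shift_is_safe(cells, delay_ns, direction):
    """Check every affected sequential cell. A push eats output setup and
    input hold margin; a pull eats output hold and input setup margin."""
    failures = []
    for cell in cells:
        if direction == "push":
            checks = {"output setup": cell.output_setup_ns,
                      "input hold": cell.input_hold_ns}
        elif direction == "pull":
            checks = {"output hold": cell.output_hold_ns,
                      "input setup": cell.input_setup_ns}
        else:
            raise ValueError("direction must be 'push' or 'pull'")
        for label, margin in checks.items():
            if margin < delay_ns:
                failures.append((cell.name, label, margin))
    return not failures, failures


# Hold-violating endpoint; all margins other than input_hold_ns are assumed.
reg_a = SeqCell("reg_a", input_setup_ns=0.350, input_hold_ns=-0.109704,
                output_setup_ns=0.200, output_hold_ns=0.150)

pull_ok, pull_failures = clock_shift_is_safe([reg_a], 0.110, "pull")
push_ok, push_failures = clock_shift_is_safe([reg_a], 0.110, "push")
print("pull safe:", pull_ok, pull_failures)
print("push safe:", push_ok, push_failures)
```

When the point of intent is a pin inside the clock network, the same check would run over the list
of sequential cells in its fanout cone.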


2.4 Violation Summary Merge and Count


Analyzing timing violations can be a daunting task, especially when a design is still in the early
phase of the design cycle. The violation count can easily reach hundreds of thousands. Triaging
the timing violations path by path is almost impossible without some form of categorization and
merging. This paper shares a violation summary merge and count tool that helps designers merge
timing violations of the same nature. Timing violations are merged when they meet all of the
criteria below:
• Similar type of timing startpoints ignoring numerical indices
• Similar type of timing endpoints ignoring numerical indices
• Same launch clock
• Same capture clock
When multiple timing violations meet the above criteria, the tool merges them by substituting the
numerical indices of all the timing startpoints and endpoints with a wildcard character and
reporting them as a single item instead of multiple lines. The tool also keeps the original count
of the violations in the group in a column called Occurrence. As for the violating slacks, the
tool keeps the worst negative slack (WNS) of the merged group. Table 1 below shows an example of
the merged violations summary report generated by the tool. Both the startpoints and endpoints are
merged by substituting their numerical indices with wildcard characters. The numerical indices of
the clocks are not merged, however, as those clocks might be totally unrelated to each other. As
shown in Table 1, the WNS of each merged group is preserved, and the Occurrence column tracks the
total violation count of that group.

Table 1. Merged Violations Summary Report

Startpoint                         Endpoint                           StartClock      EndClock        WNS       Occurrence
pardttc/lp_*_stage/lane_reg_*/clk  pardttc/xbar_stage/conf_reg_*_*/d  VP_c_clk_6_CLK  VP_c_clk_6_CLK  -0.16981  32
pardttc/lp_*_stage/link_reg_*/clk  pardttc/xbar_stage/stat_reg_*_*/d  VP_c_clk_4_CLK  VP_c_clk_4_CLK  -0.11945  32
pardttc/lp_*_mux/lane_reg_*/clk    pardttc/lp_*_mux/conf_reg_*/d      VP_c_clk_7_CLK  VP_c_clk_7_CLK  -0.07334  14
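
The merge rule can be sketched in Python as shown below. This is an assumption-laden illustration
rather than the tool's implementation: the function names and the sample pin and clock names are
hypothetical, and treating every run of digits as a numerical index is a simplification that would
also rewrite digits that belong to a cell name.

```python
import re


def wildcard_indices(pin_name):
    """Replace each run of digits in a pin name with a wildcard."""
    return re.sub(r"\d+", "*", pin_name)


def merge_violations(violations):
    """violations: iterable of (startpoint, endpoint, launch_clk, capture_clk, slack).
    Returns rows of (startpoint, endpoint, launch_clk, capture_clk, wns, occurrence),
    worst group first. Clock names are compared as-is and never wildcarded."""
    groups = {}
    for start, end, launch, capture, slack in violations:
        key = (wildcard_indices(start), wildcard_indices(end), launch, capture)
        wns, count = groups.get(key, (slack, 0))
        groups[key] = (min(wns, slack), count + 1)
    rows = [key + value for key, value in groups.items()]
    return sorted(rows, key=lambda row: row[4])


violations = [
    ("blk/lp_0_stage/lane_reg_3/clk", "blk/xbar/conf_reg_1_2/d", "clk_6", "clk_6", -0.120),
    ("blk/lp_1_stage/lane_reg_7/clk", "blk/xbar/conf_reg_0_5/d", "clk_6", "clk_6", -0.170),
    ("blk/lp_1_stage/lane_reg_7/clk", "blk/xbar/conf_reg_0_5/d", "clk_4", "clk_4", -0.050),
]
merged = merge_violations(violations)
for row in merged:
    print(row)
```

The first two violations collapse into one row with an occurrence of 2, while the third stays
separate because its clocks differ.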

2.5 Debug Friendly Transitive Reporting


A timing path is inferred when combinational cells span between two sequential cells that are
clocked by synchronous clocks. Designers can use the EDA tool's timing reporting commands to
inspect the setup or hold margins of such a timing path, provided that it is not set as a false
path, disabled, or asserted to a static value. When a timing path is disabled or asserted to a
static value, designers cannot even inspect its connectivity, as the EDA tool usually reports it
as no path without showing the trace. This is very inconvenient when designers want to trace the
connectivity of the timing path, for example to examine clock propagation, or, when the path was
mistakenly set to a static value, to find the pin on which the static value is set.
Designers can certainly use the transitive fanin or fanout commands to trace the connectivity.
However, the debug friendly transitive reporting tool makes this even easier. Using a path
asserted with a static value of 0 as an example, Figure 5 below shows the native EDA transitive
fanout report:


Figure 5. Native EDA Transitive Report

Figure 6 below shows the transitive fanout report of the same path produced by the debug friendly
transitive reporting tool. This report adds information showing the exact pins on which the static
value of 0 is applied, reported as “CASE: 0”, which greatly eases the debug effort. In addition to
showing the static value of the pins, if any, the enhanced report also shows whether any clock
propagates through those pins. In this case, all the pins listed in the report are purely data
pins without any clocks propagating through them, so they are reported as “CLK: N/A”.

Figure 6. Debug Friendly Transitive Report
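
The idea of the enhanced report can be sketched as a fanout walk that annotates each pin. This is
only an illustration under stated assumptions: the toy netlist, the pin names and the function
name are hypothetical, and the line layout merely imitates the “CASE” and “CLK” fields described
above rather than reproducing the tool's actual output.

```python
from collections import deque


def annotated_fanout(start_pin, fanout, case_values, clocks):
    """Breadth-first walk of the fanout of start_pin. Each line shows the
    static (case) value and the propagated clock of a pin, or N/A."""
    lines, seen, queue = [], {start_pin}, deque([(start_pin, 0)])
    while queue:
        pin, level = queue.popleft()
        case = case_values.get(pin, "N/A")
        clk = clocks.get(pin, "N/A")
        lines.append(f"{'  ' * level}{pin}  CASE: {case}  CLK: {clk}")
        for nxt in fanout.get(pin, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, level + 1))
    return lines


# Hypothetical netlist: a static 0 appears at the AND gate output.
fanout = {"u_cfg/q": ["u_and/a"], "u_and/a": ["u_and/z"], "u_and/z": ["u_reg/d"]}
case_values = {"u_and/z": 0, "u_reg/d": 0}
clocks = {}  # no clock reaches these data pins

report = annotated_fanout("u_cfg/q", fanout, case_values, clocks)
print("\n".join(report))
```

Reading the annotated trace from the top, the first pin that shows a case value marks where the
static value enters the path.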

2.6 ECO Friendly Path Reporting


When designers trace through timing paths, especially when analyzing paths for manual setup and
hold fixes, they often want to know whether there is enough margin on the pins where they intend
to implement timing fixes. For example, while reviewing the pins of a min hold violation,
designers often need to separately report the max timing margins of those pins to decide the
optimum hold fix location. The ECO friendly path reporting tool enhances the native EDA path
reporting command by annotating the timing margins at all the pins of the reported path. For a min
timing path, the setup timing margins are annotated at each pin. In other words, designers can
tell the worst-case setup margin on a particular pin and decide whether that pin is suitable for
hold buffer insertion. Likewise, when a max timing path is reported, all the pins are annotated
with their worst-case hold timing margins, making it easy for designers to check before upsizing
cells or swapping them for faster ones, which might cause hold violations if not checked properly.
Figure 7 below shows a hold timing path reported with the ECO friendly path reporting tool. Beside
each non-hierarchical pin, the setup margin is annotated, making it simple for designers to decide
which pin has enough setup margin to absorb the hold fix.

Figure 7. Timing Trace from ECO Friendly Path Reporting
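
How such an annotated trace supports the choice of a hold fix location can be sketched as follows.
This is a hedged illustration, not the tool's logic: the function names, pin names and setup
margins are hypothetical, and the 37.5ps requirement is an assumed figure for the setup margin
that the intended hold buffer would consume.

```python
def annotate_hold_path(path_pins, worst_setup_slack_ps, required_ps):
    """Pair each pin of a hold path with its worst setup slack and a flag
    telling whether that slack can absorb the hold fix."""
    report = []
    for pin in path_pins:
        margin = worst_setup_slack_ps[pin]
        report.append((pin, margin, margin >= required_ps))
    return report


def best_fix_pin(report):
    """Pin with the largest setup margin among those that fit, or None."""
    fitting = [(pin, margin) for pin, margin, fits in report if fits]
    return max(fitting, key=lambda item: item[1])[0] if fitting else None


path_pins = ["ff_a/q", "u1/z", "u2/z", "ff_b/d"]
worst_setup_slack_ps = {"ff_a/q": 20.0, "u1/z": 45.0, "u2/z": 80.0, "ff_b/d": 60.0}

report = annotate_hold_path(path_pins, worst_setup_slack_ps, required_ps=37.5)
for pin, margin, fits in report:
    print(f"{pin:<8} setup margin {margin:6.1f} ps  {'ok' if fits else 'too tight'}")
print("suggested fix location:", best_fix_pin(report))
```

Picking the pin with the largest margin is just one possible policy; a designer may prefer a
different pin for placement or routing reasons.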

3. Results
The results of the tight DMSA hold fix and the regional DMSA buffer removal on sample designs are
discussed in this section.
For the tight DMSA hold fix, a design with 139 hold-violating endpoints and a WNS of -9.47ps was
used as the test case. With the native EDA timing ECO fix flow, 8 violations were not fixed, and
further investigation showed that the GBA setup margins of the pins on those paths were not enough
to absorb the hold fixes. Using the tight DMSA hold fix tool, all 139 hold violations were fixed,
as it was able to use the tight PBA setup margins of the pins for hold fix buffer insertion, as
shown in Figure 8.


Figure 8. EDA Hold Fix versus Tight DMSA Hold Fix

Figure 9 below shows an excerpt from the tight DMSA hold fix log. The highlighted setup slacks are
based on PBA in the max scenario, whereas the hold slacks to be fixed are based on the min
scenario. The setup slacks were just enough for the hold buffer insertion, considering that the
hold fix delays required were around 2.5 times the hold slacks when converted from the min
scenario to the max scenario. The EDA hold fix solution could not fix these paths because the
setup margins it considered were based on GBA.

Figure 9. Fix Log from Tight DMSA Hold Fix

For the regional DMSA buffer removal test case, a design with a moderate level of cell density
congestion was used to demonstrate the effectiveness of the tool.

The region of the design selected for the regional DMSA buffer removal test case is circled in the
cell density congestion map shown in Figure 10. Notice the patches of solid yellow, indicating
rather congested cell density. Typically, designers want to avoid cell density congestion for
reasons such as timing and functional ECO-ability, routability, and so on.

Figure 10. Congested Region Selected

The regional DMSA buffer removal tool was used to resolve congestion in the selected region
instead of the whole design. Figure 11 shows the buffers identified for removal, all of which were
confined within the region.

Figure 11. Tool Identified Buffers Removal

Figure 12 shows the cell density congestion map again after the regional DMSA buffer removal.
Notice that the solid yellow patches are gone in the selected region. The tool has successfully
removed redundant buffers to resolve the cell density congestion issue, while maintaining good
hold timing slacks and still adhering to the max capacitance and max transition requirements.

Figure 12. Congested Region Cleared

4. Conclusions
In summary, the collection of enhanced tools presented in this paper, namely the tight DMSA hold
fix, the regional DMSA buffer removal, the clock push or pull analysis, the violation summary
merge and count, the debug friendly transitive reporting and the ECO friendly path reporting, can
be considered the Swiss army knife of static timing analysis. The tools have proven simple and
effective, streamlining the design work and increasing the day-to-day productivity of the
designers. This collection of enhanced tools should be maintained as a baseline, so that new
innovations and enhanced tools can always be added to it.

5. References
[1] Chris M Hotz, “An Efficient Bottleneck based Hold-fixing Flow”, Intel 2012
[2] Wan Chong Khor, “Buffer Harvest - A Contingent Hold Fix Methodology for Metal ECO”, Intel 2011
[3] Oren Kol, “Turning-Point (CDS PT-ECO): A Uniform Combined PrimeTime based Fixer”, Intel 2015
[4] Yair Regev, “Spoon - PV Massive Run and Result Analysis Tool”, Intel 2015
