ICCII N-2017.09 Opto Placement Training


IC Compiler II 2017.09 Release Incremental Training


Placement and Optimization

© 2017 Synopsys, Inc.


Overview

• Placement
–Buffering Aware Placement
–Embedded CDR Ultra Effort
–Wide Cell Modeling
–Layer Aware Congestion Modeling

• Optimization
–Global Route Based Optimization
–Advanced HFN Buffering
–The remove_buffer_trees Command Enhancements


Buffering Aware Placement (BAP)


Problem Description

• In the place_opt command flow, the initial placement is wirelength-driven.

• This can lead to poor placement of registers, resulting in unfixable timing in the
later stages of the place_opt command, because subsequent calls to the
coarse placer are incremental.

• However, simply running timing-driven placement as the initial placement is
also problematic.
–For instance, high-fanout nets dominate the timing and appear critical.
–Also, the timing-driven coarse placer cannot account for buffering performed by the
optimization engine later in the flow.
–Consequently, the placer tries to shorten nets unnecessarily, leading to
suboptimal results.


Solution
• Buffering-aware placement (BAP) models the timing of unoptimized netlists
and passes the modeled timing to the core placement engine,
allowing the placer to better handle high-fanout nets and take into account
later buffering by the optimization engine.


User Interface

• New option for the stand-alone create_placement command

create_placement -buffering_aware_timing_driven

–If you use both the -buffering_aware_timing_driven and -timing_driven
options, only the -buffering_aware_timing_driven option is applied

• New application option to enable buffering-aware placement in the place_opt
command

set_app_options -list {place_opt.initial_place.buffering_aware true}
place_opt


Limitations

• BAP is not supported in the SPG flow

• The buffering-aware placement (BAP) feature cannot be used concurrently
with the congestion-driven restructuring (CDR) feature
–When you apply both features, CDR is skipped
–The example flows that follow show how you can run both the
BAP and CDR features


Flow Examples With BAP and CDR
Flow #1
create_placement ;# For CDR
create_placement -buffering_aware_timing_driven ;# For BAP
place_opt -from initial_drc

Flow #2
create_placement ;# For CDR
create_placement -buffering_aware_timing_driven ;# For BAP
place_opt -from initial_drc -to initial_drc
create_placement -timing_driven -congestion -incremental
place_opt -from initial_drc


Embed Congestion Driven Restructuring (CDR)
Ultra Effort


Solution
• In version N-2017.09, the tool introduces effort levels for the embed congestion-driven
restructuring strategy, similar to those of the original congestion-driven restructuring
strategy

Version                  CDR strategy   Supported effort levels
M-2016.12-SP5 or older   Original CDR   Low, medium, high, and ultra
M-2016.12-SP5 or older   Embed CDR      Medium only
N-2017.09                Original CDR   Low, medium, high, and ultra
N-2017.09                Embed CDR      Low, medium, high, and ultra

• The low and high effort settings differ from the medium effort setting by importing a
smaller or larger set of permutable pin arrays into the placer
• The ultra effort mode imports into the coarse placer not just the leaf-level pin arrays of the
associative commutative trees but the whole trees, allowing rewiring of connections
along multiple cuts inside these trees during placement.


User Interface

• To change the CDR strategy and its effort level, use the following application
options:

set_app_options -name place.coarse.cong_restruct_strategy -value embed
–The default is embed

set_app_options -name place.coarse.cong_restruct_effort -value <effort>
–Valid values for <effort> are low, medium, high, and ultra
–The default is medium


Limitation
• Original CDR ultra effort versus embed CDR ultra effort:
–Wirelength: Original CDR ultra effort is comparable to or slightly better than embed CDR
ultra effort
–Runtime: Embed CDR ultra effort runs faster than original CDR ultra effort

• CDR is not supported in the SPG flow

• CDR does not work with timing-driven placement, which is enabled by using the
-buffering_aware_timing_driven or -timing_driven option of the
create_placement command
–When you apply both features, CDR is skipped
–The example flows that follow show how you can run both features


Flow Examples With BAP and CDR
Flow #1
set_app_options -name place.coarse.cong_restruct_strategy -value embed
set_app_options -name place.coarse.cong_restruct_effort -value ultra
create_placement ;# for CDR
create_placement -buffering_aware_timing_driven ;# for BAP
place_opt -from initial_drc

Flow #2
set_app_options -name place.coarse.cong_restruct_strategy -value embed
set_app_options -name place.coarse.cong_restruct_effort -value ultra
create_placement ;# for CDR
create_placement -buffering_aware_timing_driven ;# for BAP
place_opt -from initial_drc -to initial_drc
create_placement -timing_driven -congestion -incremental
place_opt -from initial_drc


Wide Cell Modeling


• Wide cell modeling improves the placement of wide cells in advanced technology nodes
–The placer must respond to the density of wide cells, together with the details of the PG
network, to ensure that these cells can be placed and are not subject to large
displacements because of how they must straddle or fit between the PG structures
–This feature helps improve congestion response and convergence on timing QoR
when there are many wide cell instances and PG structures with a narrow pitch on the lower
layers of the design


Default Placement Result
Too Many Wide Cells are Placed in Dense Regions


Default Placement Result – After Legalization
Many Single-Row Cells Pushed Down to The Lower Area


Default Placement Result – After Legalization
Congestion Issues With Very Large Displacement


Controlling Wide Cell Density
Basic Idea

• Regardless of cell height, only one cell that is wider than half the M1 PG pitch can fit between
the PG straps
– This feature makes the coarse placer aware of the density limits imposed by these cells and
automatically determines which cell types to consider

• For cells that are unable to straddle PG nets, the tool treats any cell that is wider than half the
M1 PG pitch as if its width were equal to the PG pitch
– In the slide figure, the M1 PG pitch is 1.311 um. To make the coarse placer aware of the density
limit in this area, the tool treats any cell that is unable to straddle the PG nets and is wider than
0.6555 um as if it were 1.311 um wide

• The density of wide cells is controlled together with the density of other cells to
ensure that both can fit as needed

Controlling Wide Cell Density
User Interface

• To enable this feature, use the following application option setting:

set_app_options -list {place.coarse.wide_cell_use_model true}

• The default is false


With Wide Cell Modeling – After Legalization
No Large Cell Displacement



With Wide Cell Modeling – After Legalization
No Congestion Issue


Layer Aware Congestion Modeling


Layer-Aware Congestion Modeling

• Layer-aware congestion modeling improves the accuracy of congestion reduction in coarse
placement
– Instead of always considering the congestion impact of all layers together, this feature also
considers congestion on the layers required for pin access as a separate factor
– Congestion reduction in the coarse placer responds to the worse per-GCell result of
    – All-layer congestion
    – Low (pin-access) layer congestion

– This helps ensure that the placer attempts to provide congestion relief in areas of the design
where upper-layer routing resources are available but the low-layer resources needed to make the
pin connections are scarce


Layer-Aware Congestion Reduction

• This feature looks at the worst usage of:
– The lowest 6 non-ignored routable layers (typically 3 horizontal and 3 vertical) with congestion maps
– All non-ignored routable layers with congestion maps
• The coarse placer expands cells more in areas where usage values are higher, thereby
reducing the congestion in those areas.
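
The "worst usage" selection above can be illustrated with a small plain-Tcl model. This is a sketch under stated assumptions, not the tool's implementation: the proc names are hypothetical, and averaging the per-layer usage ratios is an assumed way of combining layers, since the slides do not say how usage is aggregated.

```tcl
# Illustrative model only; these are not tool commands.
# Assumption: usage over a group of layers is the average of per-layer ratios.
proc average {values} {
    set sum 0.0
    foreach v $values { set sum [expr {$sum + $v}] }
    return [expr {$sum / [llength $values]}]
}

# layer_usage: per-layer usage ratios for one GCell, lowest routable layer first.
# Returns the worse of the low-layer usage and the all-layer usage.
proc gcell_usage {layer_usage {num_low_layers 6}} {
    set low [lrange $layer_usage 0 [expr {$num_low_layers - 1}]]
    return [expr {max([average $low], [average $layer_usage])}]
}

# Low layers are overloaded while upper layers are nearly empty:
# the low-layer figure drives the result.
puts [gcell_usage {1.2 1.1 1.0 0.9 1.0 1.1 0.2 0.2 0.1 0.1}]
```

In this model, a GCell whose upper layers are free but whose pin-access layers are full still reports high usage, which is the case the feature is meant to catch.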

Figure: default routing density map compared with the map with layer-aware congestion reduction


Layer Aware Congestion Modeling
User Interface

• To enable this feature, use the following application option setting:

set_app_options -list {place.coarse.congestion_layer_aware true}

• The default is false


Global Route Based Optimization
Version M-2016.02-SP2

Optimization CAE
February 2017

Overview
• Global Route Based Optimization (GRO)
– Optimization based on global routing and GR timing
– Includes setup and hold delay optimization, logic DRC fixing, and leakage optimization (if enabled)
– Optimization moves considered are similar to those done by final-mode route_opt with the addition of
a rebuffering capability
– Available as a flow stage within the clock_opt command

• Most designs are expected to benefit from global route based optimization
– Improved timing, area, and leakage.
– Improved buffering topology, as buffering is based on global route layer assignment.
– Global route based optimization is not expected to either improve or degrade congestion.


User Interface
• An additional stage named global_route_opt, which performs global routing and optimization, is
added to the clock_opt command
– The global_route_opt stage is the final stage in the clock_opt command, after virtual route based
optimizations
– You can specify the global_route_opt stage by using the -from and -to options of the clock_opt
command

• The execution of the global_route_opt stage of the clock_opt command is controlled by the
following application option setting:
set_app_options -name clock_opt.flow.enable_global_route_opt -value true
– The default is false
– Existing flows do not see any change unless this application option is explicitly enabled

• By default, after the global_route_opt stage of the clock_opt command has been run, any
further global routing is skipped, including
– The global route call within the route_auto command
– Atomic global route calls made by using the route_global command
– Incremental global route calls
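
As a minimal sketch of the points above, the following sequence enables the stage and runs it. It uses only the commands, option, and stage name given on this slide; the order of the top-level commands follows the flow slides and may differ in your scripts.

```tcl
# Enable the global_route_opt stage (the default is false).
set_app_options -name clock_opt.flow.enable_global_route_opt -value true

# A full clock_opt run now ends with the global_route_opt stage.
clock_opt

# Alternative, if the earlier clock_opt stages have already been run:
# clock_opt -from global_route_opt -to global_route_opt

# The global route call inside route_auto is skipped by default after GRO.
route_auto
```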


Default clock_opt Flow
• Global routing and optimization flow is not enabled by default
– Existing flows will see no change until the user explicitly enables global route based optimization

IC Compiler II flow: place_opt -> clock_opt -> route_auto -> route_opt

clock_opt command flow: build_clock -> route_clock -> final_opto -> global_route_opt
(global_route_opt is skipped: stage is not enabled)


clock_opt With GRO and Signal Routing
• When enabled by using the clock_opt.flow.enable_global_route_opt application option, the
tool executes the global_route_opt stage of the clock_opt command
• Upon completion of the global_route_opt stage, the output of the clock_opt command is fully
global routed, and the subsequent signal routing flow automatically starts with track routing

IC Compiler II flow: place_opt -> clock_opt -> signal routing -> route_opt

clock_opt command flow: build_clock -> route_clock -> final_opto -> global_route_opt
(global routing + optimization stage included)

Signal routing: route_auto, or route_global followed by route_track and route_detail
(signal routing automatically starts with track routing)


Guidelines and Expectations
• All routing-related application options must be applied before you run the clock_opt command
with the global_route_opt stage enabled
– Check whether the existing flow sets any application options just before signal routing. If so,
move them ahead of the clock_opt command if it runs the global_route_opt stage

• Scenarios used during GRO should include those used during virtual route based optimization
and postroute optimization
– At least one active scenario should have leakage power active
– Enable the route_opt.flow.enable_power application option

• QoR improvement should be evaluated after postroute optimization
– Area and leakage optimizations in GRO can cause some signal integrity effects at detail routing,
which are recovered in the postroute optimization flow

• Prerouting of signal nets, using the route_group command, should be done before executing
GRO

• If it is necessary to run global routing after GRO has completed, you can do so
– Use incremental global routing, if possible, so as not to invalidate the optimization
that was done based on the existing global route
– Use the application option shown in the following example to allow global routing after GRO
set_app_options -name route.global.force_rerun_after_global_route_opt -value true
route_global -reuse_existing_global_route true
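
The ordering guidelines in this section can be summarized as a script skeleton. This is a sketch only: the commented steps are placeholders for design-specific settings, and the application options are the ones named in the guidelines.

```tcl
# (a) Preroute signal nets with route_group here, before GRO
#     (arguments are design specific and not shown).

# (b) Apply every routing-related application option here, ahead of clock_opt,
#     including any that the existing flow sets just before signal routing.

# (c) Enable power optimization, as recommended for GRO.
set_app_options -name route_opt.flow.enable_power -value true

# (d) Enable GRO and run clock_opt; global_route_opt runs as its final stage.
set_app_options -name clock_opt.flow.enable_global_route_opt -value true
clock_opt
```

Keeping steps (a) and (b) ahead of clock_opt matters because the optimization in the global_route_opt stage is based on the global route produced under those settings.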


Advanced High Fanout Net (HFN) Buffering
M-2016.12-SP4
On-by-Default

Overview

• Advanced high fanout net (HFN) buffering is the new buffer tree synthesis engine that is used
for initial high fanout net synthesis in the initial_drc stage of the place_opt command
– This is not an incremental enhancement to the existing buffering. It is a completely different
technology in all aspects, such as buffer selection, buffer tree and driver sizing, topology generation,
clustering, and so on.
– It still leverages the same core IC Compiler II timer, extractor, and MV engines

• The following is the QoR trend measured at the end of the flow (green: better, blue: same, red:
worse)

Columns: Timing | Logical DRCs | Area | Buffer Tree Quality (WL, CI-Ratio) | Wirelength |
Routability (Congestion / DRCs) | Dynamic Power | Leakage Power
Values: -15% (TNS PM) | -0.7% | -3% (wirelength) | -1% | -11% (DRCs) | -1.3% | -1.3%


User Interface
• Advanced HFN buffering is on by default starting with version M-2016.12-SP4
– You can disable it by using the following setting before you run the place_opt command:
set_app_options -name opt.buffering.enable_advanced_buffering -value false
– Only the buffering during the initial_drc stage of the place_opt command is affected by this
application option

IC Compiler II flow: place_opt -> clock_opt -> route_auto -> route_opt

place_opt command flow: initial_place -> initial_drc -> initial_opto -> final_place -> final_opto

Optional two-pass placement flow: create_placement -incremental -use_seed_locs -> initial_drc


place_opt Flow
Interaction With Existing Application Options

• The following buffering-related application options have no effect on the
advanced HFN buffering engine in the initial_drc stage of the place_opt
command
–opt.common.drc_mode_buffering
–opt.common.buffering_for_advanced_technology


Identifying Advanced HFN Buffering

• To identify if advanced HFN buffering was performed, search for the word ORB
in the place_opt command output.

• The following example output shows such lines, which are printed by the
advanced HFN buffering engine:

ORB: scenario scen1 corner max
ORB: Nominal = 0.012846 Design MT = 0.500000 Target = 0.073370 (5.712 nominal) MaxRC = 0.046031
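
The log search described above can be automated with a few lines of plain Tcl. This is a minimal sketch: the proc name find_orb_lines and the log file name in the usage comment are hypothetical, and it assumes the place_opt output has been saved to a text file.

```tcl
# Return the lines of a log file that start with "ORB:".
proc find_orb_lines {log_path} {
    set fh [open $log_path r]
    set lines [split [read $fh] "\n"]
    close $fh
    set hits {}
    foreach line $lines {
        # Ignore leading whitespace so indented log lines still match.
        if {[string match "ORB:*" [string trimleft $line]]} {
            lappend hits $line
        }
    }
    return $hits
}

# Usage (place_opt.log is a hypothetical file name):
# set orb [find_orb_lines place_opt.log]
# if {[llength $orb] > 0} { puts "Advanced HFN buffering was run" }
```

Matching on the "ORB:" prefix rather than on the bare word avoids false hits from unrelated text that happens to contain those letters.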


SPG Flow

• A default SPG flow enabled by using the place_opt.flow.do_spg application option does not
run the initial_drc stage of the place_opt command.
– Therefore, advanced HFN buffering is skipped, even if it is enabled.

• However, if your flow differs from the default flow, it might include the initial_drc stage of
the place_opt command.
– You can check your log file to see whether advanced HFN buffering was run, as described previously


The remove_buffer_trees Command
Enhancements


Simplified User Interface

• To simplify the user interface, the following options of the remove_buffer_trees command
have been removed in the N-2017.09 release:
-hfs_fanout_threshold
– By default, the command removes all buffer trees with a fanout larger than one
– This behavior is unchanged
– Partial buffer trees can still be removed as before, based on drivers and loads, by using the -from or
-sources_of option
-no_clustering
– By default, the command now removes as many buffers and inverters as possible, without splitting
inverters
-verbose
– The default command output now reports the number of buffers removed
and the number of buffers that were not removed
– For the buffers that were not removed, the output indicates whether they were kept because of
dont_touch or size_only attribute settings
