tb-01-gibbons-pres-snps
Arm-Synopsys Collaboration
Summary
Partnering for Arm Powered Products
• System SW Dev., Software Stack
• Virtual Prototypes: Virtualizer™ Tool Set & Development Kits (VDKs) for Pre-RTL SW Dev. Using Arm Fast Models (v7 & v8), Arch. Design with Platform Architect, Coverity & Defensics SW Signoff
• Physical Prototypes: HAPS® Support for Arm Cores, Hybrid Prototypes, Connection to Juno Arm Development Platform, Prototyping Methodology
• Hybrid Emulation, AMBA Transactors
Images source: arm.com
POP
• Artisan Physical IP
• Reference Scripts: CPU optimized RTL-GDS scripts
• Physical IP support for Synopsys
• POP Landing Team Support
https://www.arm.com/products/physical-ip/pop-ip
© 2018 Synopsys, Inc. 8
Why use Arm POP IP?
• Reduced risks (instead of technology & schedule risks): POP IP is developed and tuned in synergy with RTL over several iterations. All physical IP & implementation issues have been identified and solved by EAC date.
• Optimized PPA (instead of non-optimized PPA): With a proven track record, Arm POP IP delivers optimized PPA.
Arm-Synopsys Collaboration
Summary
• Best starting point and most comprehensive solution for Arm CPU implementation with Synopsys EDA tools
• Created in collaboration with Arm for a specific core, configuration, constraints and Artisan physical IP
• Available for Arm Cortex-A76, -A75, -A55, -A73, -A72, -A53, -A57 etc.
• Download from SolvNet (www.synopsys.com/Arm)
• Contact Synopsys for
– QIKs for advanced Arm cores
– Expert services help, available from QuickStart to core hardening
*Publicly announced Armv8 cores
Synopsys QIK – High Level View
DCG/ICC2 Based Physical Implementation
• Inputs: RTL, Libraries, Constraints
• Design Compiler® Graphical
• IC Compiler™ II: place_opt, clock_opt, route_opt
• RedHawk™: IR Drop
• PrimeTime™: ECO
• Incremental timing-driven
• place_opt CCD
• Buffer-aware placement
• PrimeTime delay calc in route_opt
• Path-based opt. in route_opt
• IC Compiler II multibit register banking and de-banking
• Level shifter (LS)/Enabled LS banking
• High effort leakage flow
• For absolute best FMAX, look into processor data flow
– OOTB placement is good, but can be further tuned
– Guide module placement to better align with expected data flow
Figure: Sub-Module Bounds. More structure to Instruction, Execution and FPU logic.
Figure: IC Compiler II CCD. Slack-based adjustment in CCD place_opt, CCD clock_opt and CCD route_opt; clock path adjusted for timing ECO in PrimeTime ADV.
Figure: "Only CCD" versus "Manual skewing + CCD", compared on WNS, NVP, TNS (ns) and Leakage (mW). Other labels: RAM, LS.
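The charts above are labelled with WNS, TNS and NVP. As a reminder of what those three numbers summarize, here is a minimal sketch that derives them from a list of endpoint slacks. The function name, the sample slack values and the choice to report WNS as 0 when nothing violates are assumptions made for illustration; they do not come from the slides or from any Synopsys report format.

```python
def timing_summary(slacks_ns):
    """Summarize endpoint slacks (in ns) as (WNS, TNS, NVP).

    WNS: worst negative slack (0.0 here if no endpoint violates).
    TNS: total negative slack, the sum of all negative slacks.
    NVP: number of violating paths (endpoints with negative slack).
    """
    violating = [s for s in slacks_ns if s < 0]
    wns = min(violating) if violating else 0.0
    tns = sum(violating)
    nvp = len(violating)
    return wns, tns, nvp


# Hypothetical slacks for four endpoints, in ns.
example_slacks = [-0.5, 0.25, -0.25, 1.0]
wns, tns, nvp = timing_summary(example_slacks)
print(f"WNS={wns} ns, TNS={tns} ns, NVP={nvp}")
```

Two runs can share the same WNS while differing widely in TNS and NVP, which is presumably why the comparison tracks all three.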
• At previous nodes, we needed a well-crafted recipe to maximize performance and minimize power
– Used library analysis to decide VT / channel length classes available at each optimization stage
• At 7nm, tool improvements and Artisan library characteristics result in a vastly simplified recipe
Figure: leakage by drive strength, 16nm versus 7nm. Very different leakage distribution.
Improved Leakage Optimization
Figure: Power Optimized CPU, 7nm Vt Class Trials. Bars: TNS (ULVT), TNS (All VT), LKG (ULVT), LKG (All VT).
Figure: PT-ECO flow. Boxes: StarRC, PrimeTime SI/PX, fix_eco_timing, ICC II ECO, PT-ECO Result.
Figure: when Fachieved > Fconstrained, Apply Negative Uncertainty (-5ps, -10ps, -15ps) with StarRC, PrimeTime SI/PX, fix_eco_power, fix_eco_timing and ICC II ECO.
Figure: Max Frequency & leakage versus % FMAX (horizontal axis 100% to 70%, vertical axis 0% to 80%).
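The "Apply Negative Uncertainty" labels (-5ps, -10ps, -15ps) rest on ordinary setup-slack arithmetic: clock uncertainty is subtracted from the time available to a path, so a negative value adds slack. The sketch below only illustrates that arithmetic. The period, arrival and setup numbers are hypothetical, and exactly how the slide's flow uses the extra slack (for example, as headroom for fix_eco_power) is not recoverable from the extracted text.

```python
def setup_slack_ps(period_ps, arrival_ps, setup_ps, uncertainty_ps):
    """Setup slack in ps for a single-cycle path with an ideal clock.

    required time = period - uncertainty - setup
    slack         = required time - arrival
    """
    return period_ps - uncertainty_ps - setup_ps - arrival_ps


# Hypothetical path: 400 ps clock, data arrives at 370 ps, 20 ps setup time.
PERIOD_PS, ARRIVAL_PS, SETUP_PS = 400, 370, 20

# Uncertainty steps from the slide (0 added as the baseline).
for uncertainty_ps in (0, -5, -10, -15):
    slack = setup_slack_ps(PERIOD_PS, ARRIVAL_PS, SETUP_PS, uncertainty_ps)
    print(f"uncertainty {uncertainty_ps:>4} ps -> setup slack {slack} ps")
```

Each 5 ps step of negative uncertainty shows up as 5 ps of additional setup slack on every path in that clock domain.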
• Prevention & fixing with signoff engines empowers implementation engineers earlier in flow
• Perfect correlation to rail signoff since same engines are used for signoff
• Seamless and push-button integration saves schedule and reduces iterations
• QIK integration
– Configures RedHawk runs from existing ICC II setup files
– Defines scenarios in which to do analysis and load apl files
– Loads timing/physical data
– Creates tap_layers: needed to define physical PG connections
– Runs rail analysis
QIK provides complete script to set up and run RedHawk rail analysis within ICC II
Template structure enables easy modification for different libraries/CPUs
Figure: Implementation (Design Compiler Graphical, IC Compiler II with RedHawk Analysis Fusion): Power Integrity Convergence, Block. Signoff (StarRC, PrimeTime, RedHawk): Power Integrity Signoff, Block & Full-chip.
Crosstalk Mitigation
• In advanced node designs, crosstalk effects are often related to pin density effects
• Manage placement densities through cell clumping and spreading
– Automated mode in ICC II
– Line up DCG to the same value as in ICC II
Use automated mode for all parameters until you decide to push it one way or another

Tool     Density and Congestion Settings            Value
DCG      placer_max_cell_density                    70
ICC II   place.coarse.max_density                   70
DCG      target_routing_density                     60
ICC II   place.coarse.target_routing_density        60
ICC II   place.coarse.pin_density_aware             true
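The advice to "line up DCG to the same value as in ICC II" can be checked mechanically. The sketch below is a plain-Python consistency check, not a tool script: the setting names and the values 70 and 60 are taken from the slide's table (reading each value as shared by the DCG and ICC II rows next to it, which is an assumption), while the dictionaries, the pairing list and the function name are hypothetical.

```python
# Settings as read from the slide's table; the shared 70/60 values are an assumption.
DCG_SETTINGS = {
    "placer_max_cell_density": 70,
    "target_routing_density": 60,
}
ICC2_SETTINGS = {
    "place.coarse.max_density": 70,
    "place.coarse.target_routing_density": 60,
    "place.coarse.pin_density_aware": True,
}

# Hypothetical pairing of each DCG variable with its ICC II counterpart.
PAIRS = [
    ("placer_max_cell_density", "place.coarse.max_density"),
    ("target_routing_density", "place.coarse.target_routing_density"),
]


def mismatched_pairs(dcg, icc2, pairs=PAIRS):
    """Return the (dcg_name, icc2_name) pairs whose values differ or are missing."""
    return [
        (d, i)
        for d, i in pairs
        if d not in dcg or i not in icc2 or dcg[d] != icc2[i]
    ]


if __name__ == "__main__":
    problems = mismatched_pairs(DCG_SETTINGS, ICC2_SETTINGS)
    print("aligned" if not problems else f"mismatched: {problems}")
```

A check like this could run before synthesis so that a density change made in one tool's setup is not silently left out of the other.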
Figure: long paths in main data flow (L2 to L1), fixed with blockage settings.
Figure: TNS at route_opt, SI versus Non-SI (axis up to 400).
• New to this QIK is inclusion of a full set of ATPG scripts
– Support added for TetraMAX and TetraMAX II
Scripts: Tmax_maxtb.tcl, Tmax_update_spf.tcl, Tmax_stuck.tcl, Tmax_diagnosis.tcl, Tmax2_stuck.tcl, Tmax_debug_config.tcl
Figure: Single Top-level CODEC serving big Core, LITTLE Core and DSU (A55), with I/O shared to identical CPU cores. Numbered ranges shown: 1 to 21, 1 to 24, 25 to 32.
Figure: Stuck-at (TMAX), Stuck-at (TMAX II), Transition (TMAX) and Transition (TMAX II) for Power-opt. Core and Perf-opt. Core (axis 95% to 99%), alongside % Test Time Reduction for Power-opt. Core and Perf-opt. Core (axis 0% to 80%).
Figure: TNS (NS) with a percentage axis (0% to 100%) across implementation stages: Synthesis, place_opt, clock_opt, route_opt, Signoff ECO 1, Signoff ECO 2.
Figure: TNS (NS) with a percentage axis (0% to 240%) across implementation stages: Synthesis, place_opt, clock_opt, route_opt, Signoff, Signoff ECO1.