Automating Sequential Clock Gating With PowerPro CG
White Paper
INTRODUCTION
Clock gating is a common Register Transfer Level (RTL) power optimization. Today, RTL synthesis tools identify and automate simple, combinational clock gating. However, greater power savings can be achieved through sequential clock gating optimizations. Until recently, sequential clock gating required manual identification and implementation by expert hardware designers. Now, with the availability of RTL power optimization tools, designers have access to advanced, automated low-power design techniques, eliminating the need for manual methods that are often difficult and error-prone. This white paper describes sequential analysis and its application to clock gating. It gives an example of sequential clock gating, followed by a case study of reducing power in a digital signal correlation block using an automated RTL power optimization tool.
Sequential clock gating saves dynamic power by reducing power dissipation in the clock tree and associated registers. Additionally, switching activity in downstream combinational logic and registers is eliminated, further reducing power dissipation. The keys to sequential clock gating are understanding the sequential nature of the design and identifying the correct enable conditions.
[Figure 1: Non-optimized and clock gated datapath. Signals shown: din, d_1, d_2, vld, vld_2; CG marks a clock gate.]
To demonstrate sequential clock gating, the diagram in Figure 1 shows a datapath in both its non-optimized and clock-gated forms. In the example, data flows through two computational stages before being latched into the output register dout. The output of dout is held based on the signal vld_2. The clock gate on dout is a simple combinational substitution of the feedback loop. Sequential clock gating on d_1 and d_2 requires sequential analysis to propagate the data hold condition backwards, disabling the unused computations in previous cycles.
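The backward propagation of the hold condition can be illustrated with a small cycle-level model. The Python sketch below is not taken from the white paper or from any tool output. It assumes that vld is delayed by one register per stage (the intermediate name vld_1 is hypothetical), that d_1 may be enabled by vld and d_2 by vld_1, and that the two computational stages are arbitrary placeholder functions. Under those assumptions it checks that dout behaves identically with and without sequential gating, and counts how often each register is actually clocked.

```python
from dataclasses import dataclass


def stage1(x: int) -> int:
    """Placeholder for the first computational stage (assumption)."""
    return (3 * x + 1) & 0xFF


def stage2(x: int) -> int:
    """Placeholder for the second computational stage (assumption)."""
    return (x ^ 0x5A) & 0xFF


@dataclass
class Regs:
    d_1: int = 0
    d_2: int = 0
    dout: int = 0
    vld_1: bool = False  # hypothetical name: vld delayed by one cycle
    vld_2: bool = False  # vld delayed by two cycles


def simulate(stimulus, sequential_gating: bool):
    """Run the datapath for one (din, vld) pair per cycle.

    Returns the per-cycle dout values and the number of cycles in which
    each register was clocked (loaded with a new value).
    """
    r = Regs()
    dout_trace = []
    loads = {"d_1": 0, "d_2": 0, "dout": 0}
    for din, vld in stimulus:
        # Combinational gating on dout: load only when vld_2 is set.
        en_dout = r.vld_2
        # Sequential gating: the hold condition propagated back one and
        # two cycles, so stages feeding an unused result are not clocked.
        en_d1 = vld if sequential_gating else True
        en_d2 = r.vld_1 if sequential_gating else True
        loads["d_1"] += int(en_d1)
        loads["d_2"] += int(en_d2)
        loads["dout"] += int(en_dout)
        # All registers update together on the clock edge.
        r = Regs(
            d_1=stage1(din) if en_d1 else r.d_1,
            d_2=stage2(r.d_1) if en_d2 else r.d_2,
            dout=r.d_2 if en_dout else r.dout,
            vld_1=vld,
            vld_2=r.vld_1,
        )
        dout_trace.append(r.dout)
    return dout_trace, loads


# Valid data arrives on one cycle in four.
stimulus = [((7 * i) % 256, i % 4 == 0) for i in range(16)]
base_trace, base_loads = simulate(stimulus, sequential_gating=False)
gated_trace, gated_loads = simulate(stimulus, sequential_gating=True)

print("dout identical:", base_trace == gated_trace)
print("register loads without sequential gating:", base_loads)
print("register loads with sequential gating:   ", gated_loads)
```

In this model the number of load events stands in for clock activity at each register. It says nothing about actual power, which depends on the implementation and on the clock tree.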
In the waveform for the clock-gated datapath in Figure 2, the yellow check marks show the cycles during which the clock to register dout is gated. Similarly, the red check marks show the additional switching eliminated by sequential clock gating on d_1 and d_2.
4. CASE STUDY
A correlator function is commonly used in pattern recognition algorithms. The correlator measures the similarity of two signals. In this case, it was used to find features in an unknown signal by comparing it to a known one at different times. The original design consumed 964 µW and already had 44% of its registers clock gated. To reduce power in the correlator, the design team added sequential clock gating to the RTL code using PowerPro CG from Calypto Design Systems. This automated RTL power optimization tool uses sequential analysis technology to identify clock gating optimizations. Its cost-driven optimizations take into account area, timing and static power while evaluating sequential transformations. The design team ran the correlator block through the software, which identified many sequential clock gating opportunities.
Figure 3 shows one of the sequential clock gating transformations found in the correlator. In this case the output of register Q is used only when the output of register B is zero. The value of register B comes from register A in the previous cycle. Understanding the temporal relationship between register A and register Q, it becomes clear that register Q can be clock gated whenever register B is zero. Identifying this sequential relationship and recognizing the opportunity for clock gating requires sequential analysis; power-aware RTL synthesis tools won't find these clock gating opportunities. The tool generated new, functionally equivalent RTL code with register A driving the enable logic of register Q. Overall, more than 50 sequential transformations were implemented to produce a low-power version of the correlator RTL. This code was run through RTL synthesis to measure power, timing and area. The results showed a 24% power reduction, with the number of clock-gated registers increasing by 60%. Results from RTL synthesis showed that total area was unchanged and timing slack went from 1402 ps in the original design to 1378 ps in the new design.
[Figure 3: Sequential clock gating transformation in the correlator. Registers shown: Reg A, Reg B, Reg Q.]
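The relationship between registers A, B and Q described for Figure 3 can be sketched as a small cycle-level model. This is one possible reading, not the tool's generated RTL. It assumes that B simply captures A every cycle, that Q's output is observed only in cycles where B is zero, and that Q's enable is therefore derived from A one cycle earlier (Q loads only when A is zero). The compute function, the stimulus and the -1 "ignored" marker are hypothetical placeholders.

```python
def compute(data: int) -> int:
    """Placeholder for the logic feeding register Q (assumption)."""
    return (data * 5 + 3) & 0xFF


def run(stimulus, gate_q: bool):
    """Simulate registers A, B and Q for one (a_in, data) pair per cycle.

    Returns the values a downstream consumer observes and the number of
    cycles in which Q was clocked.
    """
    a = b = q = 0
    observed = []
    q_loads = 0
    for a_in, data in stimulus:
        # The consumer reads Q only when B is zero; -1 marks "ignored".
        observed.append(q if b == 0 else -1)
        # A becomes B on the next edge, so A == 0 now means Q will be
        # read next cycle. With gating, Q is loaded only in that case.
        q_en = (a == 0) if gate_q else True
        q_loads += int(q_en)
        # All registers update together on the clock edge.
        a, b, q = a_in, a, (compute(data) if q_en else q)
    return observed, q_loads


stimulus = [(i % 3, (11 * i) % 256) for i in range(12)]
base_observed, base_q_loads = run(stimulus, gate_q=False)
gated_observed, gated_q_loads = run(stimulus, gate_q=True)

print("observed values identical:", base_observed == gated_observed)
print("Q loads without gating:", base_q_loads)
print("Q loads with gating:   ", gated_q_loads)
```

The values held in Q differ between the two runs, but only in cycles where the consumer ignores Q, which is why the observed behavior is unchanged.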
Conclusions
Sequential analysis of RTL identifies sequential clock gating optimizations that reduce dynamic power without changing functionality. PowerPro CG automates the sequential clock gating process. In the correlator case study it reduced power by 24% with no change in total area and a timing slack change of only 24 ps (1402 ps to 1378 ps).
Calypto Europe
Tel: +44.1344.310673
info_eu@calypto.com
Calypto India
Tel: +91 120 472.1500
info_in@calypto.com
Calypto Japan
Tel: +81.45.470.2070
info_jp@calypto.com
Calypto China
Tel: +86.10.6805.8081
info_cn@calypto.com
Calypto Korea
Tel: +82.2.488.3538
info_kr@calypto.com
2012 This document contains information that is proprietary to Calypto Design Systems, Inc. and may be duplicated in whole or in part by the original recipient for internal business purposes only, provided that this entire notice appears in all copies. In accepting this document, the recipient agrees to make every reasonable effort to prevent unauthorized use of this information. Calypto, Catapult, SLEC, PowerPro and Enabling ESL are trademarks of Calypto Design Systems, Inc. All other trademarks are property of their respective owners. WP-0008 04-2012