
2001 IC/CAD Contest

Problem 6: Clock Tree Synthesis


Source: Faraday Technology Corp.
January 2, 2001
Revised on March 16, 2001
(Revised to make the description of input format conform to the on-line test cases.)

I. Introduction
In synchronous design, the clock net needs to be routed with great precision since the clock net delay
also determines the maximum clock frequency at which a chip can operate. An important issue in
clock net design is buffering, which is necessary to control clock skew and delay. The clock skew is
the maximum difference among the arrival times of the clock signals. The objective of this problem
is to re-select the buffers in the clock tree so that the given clock skew constraint is met and, at the
same time, the clock delay (the maximum arrival time among clock signals) is minimized.


 


In this problem, let us assume that the circuit design, including the clock tree, has been placed and
routed. All information about the wire/net resistance and capacitance is also available. Then, one can
obtain the arrival time of a clock pin by computing the delay from the clock root to the clock pin. The
delay includes the delay of the buffers in the path and the delay of the wires/nets. When calculating the delay
of a buffer, one needs to find the total fanout capacitance of the buffer and the input slew of the buffer.
With these two numbers ready, the delay and the output slew of the buffer can be looked up from the
tables in the (timing) technology library. The resistance and capacitance of a wire/net can be found in
the Cadence RSPF file, which contains the parasitics extracted from the post-layout clock tree. After obtaining
the delays of all buffers and wires/nets, the arrival times of all clock pins can be computed. The clock
delay is the largest arrival time among the clock pins, and the clock skew is the difference between the
largest and the smallest arrival times among the clock pins. By selecting different sizes for buffers,
one can adjust the arrival times of the clock signals to meet the clock skew constraint and to minimize the
clock delay.
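The definitions above can be illustrated with a minimal Python sketch. It assumes the per-stage delays (buffer delay plus wire delay) have already been computed; the tree shape, the node names, and the numbers are hypothetical and chosen only for illustration.

```python
def arrival_times(children, stage_delay, root):
    """Accumulate delay from the clock root down to every leaf (clock pin).

    children maps a node to the list of nodes it drives; stage_delay maps
    (parent, child) to the buffer delay of parent plus the wire delay to child.
    """
    arrivals = {}
    stack = [(root, 0.0)]
    while stack:
        node, t = stack.pop()
        kids = children.get(node, [])
        if not kids:
            arrivals[node] = t  # a leaf: record its clock arrival time
        for kid in kids:
            stack.append((kid, t + stage_delay[(node, kid)]))
    return arrivals


def clock_delay_and_skew(arrivals):
    """Clock delay is the largest arrival time; skew is largest minus smallest."""
    latest = max(arrivals.values())
    earliest = min(arrivals.values())
    return latest, latest - earliest


# Hypothetical two-level tree with delays in ns.
children = {"root": ["B1", "B2"], "B1": ["FF1", "FF2"], "B2": ["FF3"]}
stage_delay = {
    ("root", "B1"): 0.5, ("root", "B2"): 0.25,
    ("B1", "FF1"): 0.5, ("B1", "FF2"): 0.75, ("B2", "FF3"): 0.5,
}
arrivals = arrival_times(children, stage_delay, "root")
delay, skew = clock_delay_and_skew(arrivals)
```

With these numbers the clock delay is 1.25 ns and the skew is 0.5 ns, so the tree would violate a skew constraint of 0.3 ns and some buffers would need to be re-sized.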

II. Delay Calculation Using Non-linear Delay Model (NLDM)


The delay of a buffer will be calculated using a non-linear delay table lookup model. The
non-linear delay model characterizes a cell's delay over a variety of input slew rates and output load
capacitances. The results form a table, with input transition and output load capacitance as the
deciding factors for calculating the resultant cell delay. Figure 1 shows how the delays
and slew rates are interpolated in the non-linear delay model. If the lookup point falls within a
square of the table in the library, then the delay is computed using interpolation techniques as in Figure 1:
the values of the surrounding four points are used to determine the delay value numerically.
To complete this problem, you are required to write a delay calculator and a timing
analysis tool that computes the delay of all the paths.
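One common way to realize the four-point lookup is bilinear interpolation, sketched below in Python. The function name, the axis layout (rows indexed by input slew, columns by load capacitance), and the behavior outside the table (linear extrapolation from the nearest cell) are assumptions of this sketch, not something the problem statement prescribes.

```python
from bisect import bisect_right


def nldm_lookup(slew_axis, load_axis, table, slew, load):
    """Bilinear interpolation of table[i][j] at (slew_axis[i], load_axis[j]).

    Both axes must be sorted and hold at least two points. Points outside
    the table are extrapolated from the nearest cell (an assumption).
    """
    def bracket(axis, x):
        # Index i with axis[i] <= x <= axis[i + 1], clamped to the table edges.
        i = bisect_right(axis, x) - 1
        return max(0, min(i, len(axis) - 2))

    i = bracket(slew_axis, slew)
    j = bracket(load_axis, load)
    s0, s1 = slew_axis[i], slew_axis[i + 1]
    c0, c1 = load_axis[j], load_axis[j + 1]
    ts = (slew - s0) / (s1 - s0)  # fractional position along the slew axis
    tc = (load - c0) / (c1 - c0)  # fractional position along the load axis
    d00, d01 = table[i][j], table[i][j + 1]
    d10, d11 = table[i + 1][j], table[i + 1][j + 1]
    return (d00 * (1 - ts) * (1 - tc) + d01 * (1 - ts) * tc
            + d10 * ts * (1 - tc) + d11 * ts * tc)


# Hypothetical 2x2 table: slew in ns, load in pF, delay in ns.
slews = [0.0, 1.0]
loads = [0.0, 2.0]
delays = [[1.0, 3.0],
          [2.0, 4.0]]
mid = nldm_lookup(slews, loads, delays, 0.5, 1.0)
```

The same routine can serve for both the delay tables and the output-slew tables, since they share the (input slew, load capacitance) indexing described above.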
For example, in Figure 2, consider computing the delay of Buffer A and the net delay from
Buffer A to Buffer B, where the values of C1, C2, C3, R1, and R2 can be extracted from the RSPF file.
First, the total fanout capacitance of the buffer is given by
Total_Capacitance = C1 + C2 + the input capacitance of Buffer B.
Together with the input slew value of Buffer A, the delay of Buffer A can be looked up from
the timing library. Note that if Buffer A has more than one fanout, the input capacitances of the other
buffers also need to be added. The net delay between Buffer A and Buffer B can be computed as
R2*C3.
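The two formulas of this example can be written down directly. The following Python sketch follows them as stated; the function names, the numeric values, and the unit convention (pF and kilo-ohms, so that R*C comes out in ns) are assumptions made for illustration.

```python
def total_fanout_capacitance(c1, c2, sink_input_caps):
    """Load seen by the driving buffer: C1 + C2 + input capacitance of every fanout."""
    return c1 + c2 + sum(sink_input_caps)


def net_delay(r2, c3):
    """Wire delay from the driver to one sink, taken as R2 * C3."""
    return r2 * c3


# Hypothetical values: capacitances in pF, resistance in kilo-ohms.
load = total_fanout_capacitance(0.010, 0.020, [0.0022])
wire = net_delay(0.5, 0.004)
```

Passing the fanout input capacitances as a list covers the multi-fanout case mentioned above without changing the formula.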

Figure 1. Nonlinear Delay Calculation

[Schematic labels: Buffer A (output pin O), C1, R1, C2, E1, R2, C3, Buffer B]

Figure 2. Delay calculation with net delay

III. Assumptions and Input/Output Specification


1. Assumptions
(1). The leaf nodes are all the same kind of D flip-flop, with an input capacitance of
0.0032 pF.
(2). The clock comes from outside the chip and is assumed to be a nearly ideal clock with
a slew of 0.1 ns.

2. Inputs
(1). A text file (single line) that describes the clock tree skew constraint.
(2). A text file (with net description format) that describes the clock tree.
(3). A Cadence RSPF file that describes the post-layout clock tree parasitic extraction result.
(4). The timing library of buffers/inverters.

3. Output
(1). A text file (with net description format) that describes the clock tree.

IV. Examples of Input Formats


(1). A text file (single line) that describes the clock tree skew constraint (p6-constraint.in).
Format:
Clock <clockNetName> <skew>
Example:
Clock CLK1 0.3
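Reading this single-line file is straightforward. The sketch below is one possible Python parser; the function name and the error handling are hypothetical choices, and it assumes the three fields are separated by whitespace as in the example.

```python
def parse_constraint(line):
    """Parse 'Clock <clockNetName> <skew>' into (net_name, skew)."""
    fields = line.split()
    if len(fields) != 3 or fields[0] != "Clock":
        raise ValueError("expected: Clock <clockNetName> <skew>")
    return fields[1], float(fields[2])


net_name, max_skew = parse_constraint("Clock CLK1 0.3")
```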

(2). A text file (with net description format) that describes the clock tree (p6-*.ct).
The text file contains a series of net description blocks. Each net description block
corresponds to a net in the physical design. A net description block has three fields named
driver, net and fanout. The first string in a driver field indicates the name of the driver and the
second string gives the name of its output pin. The string in a net field gives the name of the
net. There can be several fanouts. For each fanout, the first string gives the name of the fanout cell, the second
string gives its corresponding library cell (reference), and the third string gives the input pin
name of the cell connected to the net. Also, the first net description block gives the root of the
clock tree. The following is an example of a net description block.

[Net description block listing. Schematic labels: cell L2_I1 (pin O), net L2_I1_69, cell L3_I4 (pin I), cell L3_I3 (pin I), cell U779 (pin CK)]

Figure 3. A net description block

The example in Figure 3 shows that the output pin O of cell L2_I1 drives net
L2_I1_69. There are three cells in the fanout of cell L2_I1: L3_I4, L3_I3, and
U779. Cell L3_I4 is mapped to the library cell BUF3CK, whose input pin I is connected to net
L2_I1_69. Similarly, cell U779 is mapped to the library cell DFF, whose input pin CK is also
connected to L2_I1_69.
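One possible in-memory representation of a net description block is sketched below in Python. The class and field names are hypothetical, the concrete file syntax is not parsed here, and the library cell of L3_I3 is an assumption because the text does not state it; the U779 pin name follows the label in Figure 3.

```python
from dataclasses import dataclass, field


@dataclass
class Fanout:
    cell: str      # name of the fanout cell
    lib_cell: str  # library cell (reference) it is mapped to
    pin: str       # input pin connected to the net


@dataclass
class NetBlock:
    driver: str      # name of the driving cell
    driver_pin: str  # output pin of the driver
    net: str         # net name
    fanouts: list = field(default_factory=list)


def build_children(blocks):
    """Map each driver cell to the names of the cells it drives."""
    return {b.driver: [f.cell for f in b.fanouts] for b in blocks}


# The block of Figure 3; BUF3CK for L3_I3 is assumed, not given in the text.
block = NetBlock(
    driver="L2_I1", driver_pin="O", net="L2_I1_69",
    fanouts=[
        Fanout("L3_I4", "BUF3CK", "I"),
        Fanout("L3_I3", "BUF3CK", "I"),
        Fanout("U779", "DFF", "CK"),
    ],
)
children = build_children([block])
```

Since the first block in the file gives the root of the clock tree, the driver of the first parsed block can be used as the starting point of a top-down traversal.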

(3). A Cadence RSPF file that describes the post-layout clock tree parasitic extraction result
(p6-*.rspf).

The following shows an example of RSPF (Reduced Standard Parasitic Format). The R
and C values can be extracted from the file.

[RSPF example listing; see the p6-*.rspf test cases]
(4). The timing library of buffer/inverter (p6-timing.tlf).
The following example shows the timing library of a buffer. (The file p6-timing.tlf posted at the
contest web site gives the library of a set of buffers for the contest.) The input capacitance of the
buffer can be extracted from the line
Pin(I Pintype(Data) Pindir(Input) Timing_Props(Pin_Cap(0.002200)))
The rise and fall delays of the buffer can be obtained from the rise-delay and fall-delay tables
of the library. In addition, the rise and fall slews of the buffer can be found in the
corresponding rise-slew and fall-slew tables.
(Notice that we will use only the library listed in the file p6-timing.tlf to test all
benchmark circuits, both those available on-line and those reserved. So we suggest that you
include it in your package.)
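Extracting the input capacitance from a pin declaration such as the Pin(...) line quoted above can be sketched with a regular expression. This Python sketch assumes each input pin is declared on a single line in the same shape as that example; it is not a full TLF parser, and the function name is hypothetical.

```python
import re

# Matches e.g. Pin(I Pintype(Data) Pindir(Input) Timing_Props(Pin_Cap(0.002200)))
PIN_CAP_RE = re.compile(
    r"Pin\(\s*(\w+).*?Pindir\(Input\).*?Pin_Cap\(\s*([0-9.eE+-]+)\s*\)"
)


def input_pin_caps(tlf_text):
    """Return {pin_name: capacitance} for every single-line input pin declaration."""
    return {m.group(1): float(m.group(2)) for m in PIN_CAP_RE.finditer(tlf_text)}


line = "Pin(I Pintype(Data) Pindir(Input) Timing_Props(Pin_Cap(0.002200)))"
caps = input_pin_caps(line)
```

The value obtained this way is the buffer input capacitance that enters the total fanout capacitance of the driving buffer.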

[TLF example listing for one buffer; see p6-timing.tlf]

V. Evaluation

The skew improvement
The clock delay
Accuracy of the cell delay calculation
The number of changed clock buffers
CPU time

