
2008 IEEE Pacific-Asia Workshop on Computational Intelligence and Industrial Application

Congestion Aware High Level Synthesis Combined with Floorplanning


Junhua Wu, Chunmei Ma, Baogui Huang
College of Computer Science, Qufu Normal University
shdwjh@163.com

Abstract

As VLSI designs grow larger and more complicated, routing congestion becomes more and more difficult to handle. In the traditional design flow, very little physical information is available at the high-level synthesis stage. To decrease routing congestion and improve circuit performance, we propose a routing congestion aware high level synthesis method based on the simulated annealing (SA) algorithm. The method has two loops: the inner loop is a practical iteration, and the outer loop is a gradual cooling process. After each high level synthesis run, we accept or reject the new result according to the cost function value, which is obtained by fore-placement. Experimental results show that our method is more effective than the traditional method and the CRSF (Congestion driven Re-Synthesis after Floorplanning) algorithm.

1. Introduction

With increasing design complexity and advances in fabrication technology, electronic design automation methods face new challenges such as delay, area, local congestion and routability. Some local routing congestion can be avoided by multilayer routing, but this still cannot satisfy the high performance demands of VLSI [1]. Local routing congestion leads to a shortage of routing resources, and can even make signal wires unroutable in some regions of the circuit. In the traditional design flow, local wire congestion cannot be detected until the detailed routing stage. On the other hand, the high level synthesis result decides the topological interconnection of the cells, which affects the routing distribution dramatically. Considering physical factors during high level synthesis is a new approach to the congestion problem.

Since congestion optimization is NP-hard, no efficient algorithm in physical design guarantees an even routing distribution. If routing congestion arises in the physical design stage under the given constraints, the whole design flow has to restart from the very beginning. In the traditional design flow, little wiring-related work is done in high-level synthesis, since very little or no physical information is available at this stage.

There are many floorplanning driven high level synthesis algorithms. A probabilistic approach to integrated datapath allocation and floorplanning is presented in [2]. A layout driven resource sharing algorithm for high-level synthesis is proposed in [3]. Interconnection delay is handled in [4] by using floorplanning information during high level synthesis. Low power is considered in the floorplanning driven high level synthesis algorithms of [5-7]. However, these are not aimed at the routing congestion problem. Routing congestion was considered at the logic synthesis stage in [8, 9]. So far, little work has addressed reducing overall congestion for better routability at the high level synthesis stage. The CRSF method [10] is effective for this problem. In this paper, a novel high level synthesis method that deals with routing congestion is presented. Our method handles congestion from a global view and is not limited to local congestion. It tries to distribute the routing demands evenly before physical design begins. Experimental results show that this approach is more effective than CRSF.

The remainder of the paper is organized as follows. A motivating example is given in Section 2 to illustrate the purpose of the method. The high level synthesis algorithm and the cost function model used in our method are described in Sections 3 and 4 respectively. The routing congestion aware high level synthesis method is presented in Section 5. Experimental results demonstrating the effectiveness of the method are given in Section 6. We conclude in Section 7.

2. Motivation

After high level synthesis, we obtain the necessary physical location information by fore-placement. From this information we can predict the interconnection wires, find the places where congestion is likely, and modify and adjust the high level synthesis result. Consequently, the wires are distributed evenly on the chip and the likelihood of local wire congestion is further reduced. This is a good foundation for the subsequent placement and routing. The ultimate goal of high level synthesis here is to obtain an optimal or good scheme in which the interconnection wires are distributed as evenly as possible on the chip.

The simple example in Figure 1 shows how high level synthesis results alter the routing distribution of the floorplan. Figure 1(a) and (c) are the results of high level scheduling and allocation, and Figure 1(b) and (d) are their corresponding floorplanning results with

978-0-7695-3490-9/08 $25.00 © 2008 IEEE 935


DOI 10.1109/PACIIA.2008.205
interconnection information. In the example, there are four operations, A, B, C and D, which perform r1 + r2, r2 + r4, r3 + r5 and r2 + r5 respectively.

Figure 1 Example of the impact of high level synthesis results on floorplanning: (a) and (c) scheduling and allocation results; (b) and (d) the corresponding floorplans

In Figure 1(a), operation A is scheduled to control step 1 and allocated to module m1 (an adder); that is, adder m1 executes r1 + r2 at step 1. Similarly, B is allocated to adder m1 and executed at step 2, C is allocated to adder m3 and executed at step 3, and D is allocated to adder m1 and executed at step 4. Figure 1(a) corresponds to an RTL (register transfer level) circuit. The locations of the cells are determined by floorplanning, which yields Figure 1(b). It is easy to see that the routing is congested around adder m1. We can change the high level synthesis result from (a) to (c), and the floorplanning result changes from (b) to (d) correspondingly. The potential congestion is now removed and the routing demands are distributed much more evenly.

Figure 1 clearly shows that different scheduling and allocation schemes in high level synthesis may lead to quite different routing distributions. This observation is the premise of our method. Many practical cases are not as simple as this example, and more complex scheduling and allocation are often needed to solve the routing congestion problem and obtain a high quality circuit design.

3. High level synthesis method

The genetic algorithm (GA) is widely used in digital circuit design. Although a genetic algorithm is not guaranteed to give the optimal solution, it can usually give a feasible solution in acceptable time.

To solve a problem with a genetic algorithm, the initial population is composed of random solutions. To evaluate each new member generated from the population, a fitness function is necessary. Child individuals inherit the merits of their parents through genetic operations such as selection, crossover and mutation, which are defined as follows.
• Selection: replicate the most successful solutions found in the population at a rate proportional to their relative quality.
• Crossover: split two parent solutions into two parts at the same crossover point, then exchange the second parts to form new solutions.
• Mutation: randomly alter a gene of the chromosome.

To solve the synthesis problem, three genetic algorithms are implemented in this paper. The first is used for DFG (Data Flow Graph) scheduling. In this genetic algorithm, chromosomes are integer arrays whose length is equal to the number of nodes in the DFG. S[i] = j denotes that the i-th node is scheduled at the j-th cycle. The population size for this problem is the number of chromosomes. Each gene is a random integer between the ASAP and ALAP time steps of the corresponding node. For this problem, the fitness function gives better fitness to chromosomes with lower latency.

The other two genetic algorithms are used for hardware allocation for a given SDFG (Scheduled Data Flow Graph). One of them allocates a functional unit to each node; the other maps each variable of the SDFG to a register. The chromosomes of both genetic algorithms are integer arrays. The chromosome length for module allocation is equal to the number of nodes in the SDFG, and the chromosome length for register allocation is equal to the number of variables. The fitness functions of the two problems rate chromosomes according to their area overhead: the lower the area overhead, the higher the fitness.

4. Probabilistic congestion estimation

Given a placed netlist, we discretize the core area with a homogeneous rectangular mesh. We analyze the congestion for every grid in the mesh. The number of grids in the mesh can be either a fixed number or a variable that depends on the core area and process technology parameters.

The capacity of a grid is the number of available routing tracks within the grid. The horizontal capacity of a grid is the number of available horizontal routing tracks, and the vertical

capacity of a grid is defined as the number of available vertical routing tracks. The usage of a grid is defined as the number of used routing tracks within the grid. The horizontal usage of a grid is the number of used horizontal routing tracks, and the vertical usage of a grid is the number of used vertical routing tracks.

Assume the pins of a two-pin net are located at the lower left and upper right corners of its bounding box, and that the net covers an m × n mesh, where m and n are the number of rows and the number of columns. We define F(m, n) as the total number of possible ways to optimally route a two-pin net covering the mesh, where C(a, b) denotes the binomial coefficient:

$$F(m, 1) = F(1, n) = 1$$
$$F(m, n) = F(m-1, n) + F(m, n-1)$$
$$F(m, n) = F(n, m)$$
$$F(m, n) = C(m+n-2,\ m-1) = C(m+n-2,\ n-1)$$

Let Px(i, j) and Py(i, j) represent the probabilities of horizontal and vertical usage for this net in grid (i, j):

$$
P_x(i,j) = \frac{1}{F(m,n)} \times
\begin{cases}
F(m,\ n-1) & \text{case a} \\
1 & \text{case b} \\
F(m-i+1,\ n-1) & \text{case c} \\
\dfrac{F(m,\ n-j+1) + F(m,\ n-j)}{2} & \text{case d} \\
\dfrac{F(i,j)\,F(m-i+1,\ n-j) + F(i,\ j-1)\,F(m-i+1,\ n-j+1)}{2} & \text{case e}
\end{cases}
\tag{1}
$$

$$
P_y(i,j) = \frac{1}{F(m,n)} \times
\begin{cases}
F(m-1,\ n) & \text{case a} \\
1 & \text{case b} \\
\dfrac{F(m-i+1,\ n) + F(m-i,\ n)}{2} & \text{case c} \\
F(m-1,\ n-j+1) & \text{case d} \\
\dfrac{F(i,j)\,F(m-i,\ n-j+1) + F(i-1,\ j)\,F(m-i+1,\ n-j+1)}{2} & \text{case e}
\end{cases}
\tag{2}
$$

case a: i = 1, j = 1
case b: i = 1, j = n
case c: 1 < i < m, j = 1
case d: i = 1, 1 < j < n
case e: other

Using formulas (1) and (2), we can obtain the routing demand of every global edge, and in the same way the routing demand of the whole chip. We can then find the edges whose routing demand overflows.

Since our optimization objective is to distribute the routing as evenly as possible, the cost function indicates how evenly the routing demands are distributed. The objective cost function is given by formulas (3) and (4):

$$
c_i =
\begin{cases}
0 & d_i < \bar{d} \\
d_i / \bar{d} & d_i \ge \bar{d}
\end{cases}
\tag{3}
$$

$$\mathrm{Cost} = \sum_i c_i \tag{4}$$

where d_i is the routing demand on global edge i, \bar{d} is the average routing demand, and c_i is the cost on global edge i. The larger the value of c_i, the more the routing demand exceeds the average value. Formula (4) obtains the cost of a high level synthesis scheme by accumulating the costs of all global edges.

5. Congestion aware high level synthesis algorithm

The congestion aware high level synthesis algorithm proposed in this paper is based on the simulated annealing (SA) algorithm. The high level synthesis algorithm and the method of computing the cost function were introduced in Sections 3 and 4. The flow chart of the algorithm is shown in Figure 2. The algorithm is a double loop: the outer loop is a gradual cooling process, and the inner loop is a practical iteration. The iteration count and the rejection function are determined by the current temperature. Each time the loop is executed, we obtain a new high level synthesis result, update the routing demand of the global edges accordingly, and compute the new cost function value. If the cost function value of the new solution is less than or equal to the current cost function value, we accept it. Otherwise, we accept it with probability exp((cost - new cost) / (k * T)), where T is the current temperature and k is a scaling constant.

Figure 2 Flow chart of the congestion aware high level synthesis method

6. Experimental results

The experiments were carried out on a Pentium 4 PC with a 3.0 GHz clock frequency and 512 MB of memory. We implemented our methods in C. Three benchmarks are used with our high level synthesis method: the 11th order FIR filter FIR11, the 7th order IIR filter IIR7, and the elliptic wave filter EWF. We compared our method with the traditional method and the CRSF method of [10].
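As a rough illustration of the cost model in formulas (3) and (4) and of the acceptance rule in Section 5, the following Python sketch may help. It is not the C implementation used for the experiments. The function names, the geometric cooling schedule (`alpha`), the fixed number of iterations per temperature, and the `neighbor` and `demands_of` callables are all assumptions made for the example; the paper only states that the iteration count and rejection function depend on the current temperature.

```python
import math
import random


def edge_cost(demand, mean_demand):
    """Formula (3): zero below the average demand, demand / average otherwise."""
    if demand < mean_demand:
        return 0.0
    return demand / mean_demand


def total_cost(demands):
    """Formula (4): sum of the per-edge costs over all global edges."""
    mean_demand = sum(demands) / len(demands)
    if mean_demand == 0:
        return 0.0  # no routing demand at all
    return sum(edge_cost(d, mean_demand) for d in demands)


def accept(cost, new_cost, temperature, k=1.0, rng=random):
    """Accept if not worse; otherwise with probability exp((cost - new_cost) / (k * T))."""
    if new_cost <= cost:
        return True
    return rng.random() < math.exp((cost - new_cost) / (k * temperature))


def anneal(state, neighbor, demands_of, t_start=10.0, t_end=0.1,
           alpha=0.9, iters_per_temp=20, k=1.0, rng=random):
    """Double loop: outer cooling, inner iterations at a fixed temperature.

    neighbor(state, rng) -> candidate synthesis result (hypothetical helper)
    demands_of(state)    -> list of routing demands per global edge (hypothetical)
    """
    cost = total_cost(demands_of(state))
    t = t_start
    while t > t_end:                       # outer loop: gradual cooling (assumed geometric)
        for _ in range(iters_per_temp):    # inner loop: assumed fixed iteration count
            candidate = neighbor(state, rng)
            new_cost = total_cost(demands_of(candidate))
            if accept(cost, new_cost, t, k, rng):
                state, cost = candidate, new_cost
        t *= alpha
    return state, cost
```

For example, `total_cost([1, 1, 1, 5])` evaluates to 2.5: the average demand is 2, so only the last edge is at or above the average and contributes 5 / 2.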

Table 1 Maximum congestion value

              FIR11    IIR7     EWF
Traditional   91.4     112.2    24.5
CRSF          52.9     62.9     20.6
Our method    49.7     58.6     19.1

Table 2 Routing demand distribution of FIR11 (number of global edges per routing demand range)

              0~10     11~25    26~50    >50
Traditional   3768     204      25       14
CRSF          3908     94       7        2
Our method    3941     65       5        0

Table 3 Routing demand distribution of IIR7 (number of global edges per routing demand range)

              0~10     11~25    26~50    >50
Traditional   1127     131      28       14
CRSF          1134     137      26       3
Our method    1139     140      20       1

Table 4 Routing demand distribution of EWF (number of global edges per routing demand range)

              0~5      6~15     16~22    >22
Traditional   2133     211      12       2
CRSF          2235     118      5        0
Our method    2234     120      4        0

Table 1 shows that the maximum congestion value of our algorithm is about 6% less than that of CRSF, and about 40% less than that of the traditional method. In our algorithm, each global edge is assigned a value indicating its routing demand. For benchmark FIR11, Table 2 shows the number of global edges whose routing demand falls within each range. The number of global edges with high routing demand decreases, while the number of global edges with low routing demand increases in our algorithm. For FIR11 in Table 2, the edges with the highest routing demand (>50) disappear completely, and the edges with medium routing demand are also reduced, while the edges with low routing demand increase compared with the CRSF algorithm. The data show that our method distributes the routing demands much more evenly on the chip, without congestion in local areas. Experimental results for benchmarks IIR7 and EWF are shown in Tables 3 and 4, which also indicate that the routing demands are more evenly distributed by our algorithm than by the traditional method and CRSF.

7. Conclusion

This paper proposes a novel routing congestion aware high level synthesis algorithm. A probabilistic congestion estimation model is given first, and then a routing congestion method based on simulated annealing and genetic algorithms is proposed. Experimental data show that our algorithm can distribute the wires evenly on the chip before physical design begins. The maximum congestion value is about 6% less than that of CRSF, and about 40% less than that of the traditional method.

References

[1] Jinan Lou, Shashidhar Thakur, Shankar Krishnamoorthy, Henry S. Sheng, Estimating routing congestion using probabilistic analysis, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 21, No. 1, 2002
[2] Vijay Sundaresan, Ranga Vemuri, A novel approach to performance-oriented datapath allocation and floorplanning, Proceedings of the 2006 Emerging VLSI Technologies and Architectures, 2006
[3] Um Junhyung, Kim Jae-hoon, Kim Taewhan, Layout driven resource sharing in high-level synthesis, Proceedings of International Conference of Computer Aided Design, San Jose, 2002: 614-618
[4] Y. Wang, J. Bian, Q. Wu and H. Hu, Reallocation and rescheduling after floor-planning for timing optimization, ASIC, 2003, Proceedings 5th International Conference, pp. 212-215, Vol. 1, 2003
[5] A. Stammermann, D. Helms, M. Schulte, A. Schulz and W. Nebel, Binding allocation and floorplanning in low power high-level synthesis, Computer Aided Design, 2003, ICCAD, International Conference, pp. 544-550, 2003
[6] Mohamed A. Elgamel, Magdy A. Bayoumi, On low power high level synthesis using Genetic Algorithm, Proceedings of the 9th International Conference on Electronics, Circuits and Systems, Vol. 2, pp. 725-728, 2002
[7] Lin Zhong, Niraj K. Jha, Interconnect-aware low-power high-level synthesis, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2005, 24(3): 336-351
[8] Davide Pandini, Lawrence T. Pileggi, and Andrzej J. Strojwas, Congestion aware logic synthesis, Proceedings of the 2002 Design, Automation and Test in Europe Conference and Exhibition, 2002
[9] Thomas Kutzschebauch, Leon Stok, Congestion aware layout driven logic synthesis, Proceedings of the International Conference on Computer-Aided Design, 2001, 216-223
[10] Wang W J, Bian J N, Wang Y F, A congestion driven re-synthesis method after floorplanning, Proceedings of International Conference on Communications, Circuits and Systems, 2005: 1220-1240
