
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 51, NO. 10, OCTOBER 2004

Parasitic-Aware RF Circuit Design and Optimization


Jinho Park, Kiyong Choi, and David J. Allstot, Fellow, IEEE

Abstract—RF circuit synthesis techniques based on particle swarm optimization and adaptive simulated annealing with tunneling are described, and comparisons of parasitic-aware designs of an RF distributed amplifier and a nonlinear power amplifier are presented. Synthesized in 0.35-µm digital CMOS using a single 3.3-V power supply, the designs provide an 8-dB gain and 8-GHz bandwidth for a four-stage distributed amplifier, and 1.2-W output power with 55% drain efficiency at 900 MHz for a three-stage power amplifier. A standard circuit simulator, HSPICE or SPECTRE, embedded in an optimization loop is used to evaluate cost functions. The proposed design and optimization methodology is computationally efficient and robust in searching complex multidimensional design spaces.

Index Terms—RF integrated circuits, RF circuit synthesis, simulated annealing, particle swarm optimization (PSO).

I. INTRODUCTION

THE burgeoning demand for system-on-chip (SOC) solutions has stimulated intense worldwide research on CMOS RF integrated circuits. However, the issue of parasitics associated with CMOS active components that are inferior to their GaAs and SiGe counterparts, and on-chip passive components on lossy silicon substrates, has impeded development. Without the use of parasitic-aware design and optimization tools, many RF circuits are difficult to synthesize in fine-line CMOS technology owing to detuning effects associated with omnipresent nonlinear parasitic components.

Several synthesis approaches that optimize the performance of CMOS RF and analog integrated circuits using the simulated annealing algorithm have been developed [1]–[3]. In contrast to conventional gradient-descent optimization methods, simulated annealing avoids being trapped in local minima in the design space using a hill climbing heuristic that is analogous to the physical process of annealing a solid [4], [5]. In that process, molten metal is poured into a mold at a high temperature and then cooled so that the constituent atoms organize into a minimum energy state. Ideally, simulated annealing circuit synthesis begins at a high temperature and cools at an optimum rate to find the lowest cost solution in the design space. In complex problems, however, it often gets trapped in local minima when the temperature cools too quickly so that hill climbing is disabled prematurely. Moreover, it is inherently slow because it approaches the global minimum only after conditionally accepting a costlier solution. In this paper, parasitic-aware RF circuit design and optimization techniques using particle swarm optimization (PSO) and adaptive simulated annealing with tunneling (ASAT) are presented.

PSO draws inspiration from the observations of social scientists of group behavior; e.g., birds flying in search of food compete as individuals but cooperate as a flock to adjust their flight trajectories [6]. PSO initially positions solution candidates (i.e., particles) randomly in a multidimensional design space. At each iterative step, the particles exchange information to update their movements toward the global minimum. Since PSO operates concurrently on a population of particles rather than a single solution candidate, it is potentially faster and more robust than simulated annealing [7].

It is difficult to predict the requisite coefficient values for conventional simulated annealing; e.g., the temperature coefficient Temp used in the cooling schedule is notoriously problem dependent. The ASAT heuristic avoids this classical problem by adaptively controlling Temp to eliminate unnecessary hill climbing iterations. A local search algorithm also increases the computational efficiency of the ASAT algorithm.

The general parasitic-aware synthesis flow for RF integrated circuits includes three major functional blocks (Fig. 1). The optimization core comprises ASAT or PSO, the circuit simulator is HSPICE or SPECTRE, and the parasitic modeling block is a compact model generator (e.g., [19]). In Section II, simulated annealing is briefly reviewed and the PSO and ASAT approaches are motivated. Parasitic-aware RF circuit synthesis using ASAT and PSO optimization cores is described in Sections III and IV, respectively. Comparisons between simulated annealing and ASAT for a simple cost function are also made in Section III. Section IV details the PSO paradigm and offers insights into the selection of its parameter values. A compact model generation method as applied to on-chip spiral and bond wire inductors is reviewed in Section V. In Section VI, PSO and ASAT are compared to simulated annealing in the parasitic-aware synthesis of two CMOS RF circuits: a four-stage distributed amplifier and a three-stage power amplifier. The criteria for comparison are computational efficiency, robustness, and balance between global and local search capabilities. Conclusions are given in Section VII.

Fig. 1. Parasitic-aware design and optimization flow. The circuit simulator is HSPICE or SPECTRE. A compact model generator provides the parasitic modeling function. The optimizer uses ASAT or PSO.

Fig. 2. Final four-stage CMOS RF distributed amplifier after parasitic-aware synthesis [8], [9].

Manuscript received January 9, 2003; revised April 12, 2004. This work was supported in part by the U.S. Defense Advanced Research Project Agencies NeoCAD Program under Grant N66001-01-1-8919, by the National Science Foundation under Grant CCR-0086032 and Grant CCR-0120255, by the Semiconductor Research Corporation under Grant 2000-HJ-771 and Grant 2001-HJ-926, and by the National Semiconductor, Corp., and Texas Instruments, Inc. This paper was recommended by Associate Editor A. Hajimiri.

J. Park and K. Choi were with the Department of Electrical Engineering, University of Washington, Seattle, WA 98195-2500 USA. They are now with Marvell Semiconductors, Sunnyvale, CA 94089 USA.

D. J. Allstot is with the Department of Electrical Engineering, University of Washington, Seattle, WA 98195-2500 USA (e-mail: allstot@ee.washington.edu).

Digital Object Identifier 10.1109/TCSI.2004.835691

1057-7122/04$20.00 © 2004 IEEE

II. SIMULATED ANNEALING SYNTHESIS


A four-stage distributed amplifier in 0.6-µm CMOS with 6.5-dB gain over a 0.5–5.5-GHz bandwidth demonstrates the essential need for parasitic-aware RF circuit synthesis [8]; Fig. 2 details the final design obtained using simulated annealing optimization. The simulated frequency responses of Fig. 3 show good agreement between the analytic design without parasitics and the final parasitic-aware design, which actually achieves superior pass-band characteristics. Without parasitic-aware optimization, parasitics de-tune the design as shown. Measurements on CMOS prototypes show close agreement with simulations [8].

The parasitic-aware synthesis flow comprises three major functional blocks (Fig. 1). The circuit simulator evaluates candidate solutions using HSPICE or SPECTRE using complex active and passive device models; it is the most expensive computational function. For example, the CMOS four-stage distributed amplifier of Fig. 2 [8] was optimized using a simulated annealing optimization core. Although there were only nine design variables, each optimization run required about 5000 iterations and one day of CPU time on a Hewlett-Packard 712/100 Unix workstation. For more complex RF circuit designs, the number of design variables is expected to grow exponentially [9]. Hence, more efficient optimization methods are needed to reduce cost by reducing the requisite number of iterations and circuit simulations.

This paper presents an ASAT heuristic and a PSO technique that are more efficient and robust than simulated annealing. Synthesis results for a power amplifier and a distributed amplifier are also described.

Fig. 3. Simulated gain and phase responses of a four-stage CMOS distributed amplifier [8], [9]. There is good agreement between the analytic design excluding parasitics and the final design including parasitics after parasitic-aware optimization. The analytic design with parasitics before optimization exhibits unacceptable errors.

Fig. 4. (a) Hill climbing in conventional simulated annealing. (b) Tunneling in ASAT for a cost function versus design variable X.

Fig. 5. Flowchart for the tunneling heuristic used in the ASAT algorithm.

III. ASAT

The ASAT heuristic includes a tunneling technique, a local optimization method, and an adaptive temperature coefficient (Temp) algorithm.

A. Tunneling Technique

A well-known advantage of simulated annealing is its hill climbing capability determined by the temperature coefficient, Temp, in the cooling schedule and the slope of the cost function. Showing a simple cost function that depends on a single design variable X, Fig. 4(a) depicts conventional hill climbing wherein there is a nonzero probability of escaping local minima in search of the global minimum. However, when Temp is low and the hill is steep, many iterations are wasted traversing from A to B though there are only higher cost candidate solutions between them. Using the technique of Fig. 4(b), the optimizer senses a hill climb at A and invokes tunneling through it.

The tunneling process is controlled by two empirical coefficients: tunneling threshold and tunneling radius. The tunneling threshold determines the beginning of a climb where tunneling is invoked. Tunneling radius denotes the maximum distance that a simulation point tunnels in the design space. The tunneling radius is more critical. If it is too small, the probability of tunneling is low, but if it is too large, the simulation point tunnels so far that the optimum is not found. Fig. 5 charts the tunneling heuristic. When tunneling occurs, a fraction of tunneling radius is added to the previous design variable values to determine the new values. To illustrate its performance, a cost function is defined as

COST = … (1)

Fig. 6 validates the tunneling technique. In contrast to simulated annealing [Fig. 6(a)], the tunneling algorithm in Fig. 6(b) wastes little time crawling over cost-function hilltops, explores all possible minima, and frequently finds the optimum solution.
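The tunneling idea of Fig. 4(b) can be illustrated with a minimal one-variable sketch. It is not the authors' implementation: the function names, the uniform random jump, the use of a cost-difference threshold to sense a climb, and the geometric cooling schedule are all assumptions made for illustration. Only the roles of the tunneling threshold and tunneling radius follow the text.

```python
import math
import random

def metropolis_accept(delta_cost, temp, rng):
    """Conventional hill climbing: always accept downhill, sometimes uphill."""
    if delta_cost <= 0:
        return True
    return rng.random() < math.exp(-delta_cost / temp)

def tunnel(x, radius, rng, lo, hi):
    """Jump by a random fraction of the tunneling radius, clipped to [lo, hi]."""
    step = rng.uniform(-1.0, 1.0) * radius  # assumed: uniform fraction of radius
    return min(hi, max(lo, x + step))

def anneal_with_tunneling(cost, x0, lo, hi, temp=1.0, cooling=0.95,
                          step=0.1, threshold=0.05, radius=2.0,
                          iters=2000, seed=0):
    """Simulated annealing on one variable with an assumed tunneling step."""
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best_x, best_f = x, fx
    for _ in range(iters):
        cand = min(hi, max(lo, x + rng.uniform(-step, step)))
        fc = cost(cand)
        if fc - fx > threshold:
            # A climb is sensed: tunnel instead of crawling over the hill.
            cand = tunnel(x, radius, rng, lo, hi)
            fc = cost(cand)
        if metropolis_accept(fc - fx, temp, rng):
            x, fx = cand, fc
        if fx < best_f:
            best_x, best_f = x, fx
        temp = max(temp * cooling, 1e-9)  # assumed geometric cooling schedule
    return best_x, best_f
```

In this sketch a tunneled candidate still passes through the acceptance test, so a jump that lands on a much worse point is usually rejected once Temp is low.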

Fig. 6. Results from 30 simulation runs. (a) Simulated annealing wastes iterations crawling over cost function hilltops and fails to find the optimum. (b) The ASAT algorithm effectively explores all minima and frequently finds the global minimum.

Fig. 7. Improved local search strategy. The history vector $\vec{y}$ is defined between the previous and present design points. The next simulation point is a weighted sum of the history vector $\vec{y}$ and the random vector $\vec{x}$. If Temp is small, the history component dominates and vice-versa.

Fig. 8. When the slope of the cost function hill is steep, Temp should be large and vice-versa.

B. Local Optimization Method

As simulated annealing optimization progresses, Temp is decreased in compliance with the cooling schedule to emphasize local, rather than global, search capabilities. However, even when Temp is small, it determines the values of the design variables randomly at the next simulation point. Consequently, it is notoriously inefficient as it approaches an acceptable solution.

An improved local search strategy uses a history vector $\vec{y}$ defined between the previous and present simulation points (Fig. 7). In simulated annealing, a fraction of random vector $\vec{x}$ is added to the present design vector to determine the next state; herein, a weighted sum of vectors $\vec{x}$ and $\vec{y}$ defines the next point. A temperature-dependent weighting factor determines the balance between local and global search capabilities: if Temp is large, the random component dominates, which emphasizes the global search capability and vice-versa. The next optimization point is accepted

… stepsize … (2)

where the weighting factor is proportional to Temp.

C. Adaptive Temperature Coefficient Heuristic

Temp is critical in simulated annealing in minimizing the number of iterations. Fig. 8 relates Temp and the cost function slope. If Temp is too large, global search dominates and iterations are wasted, and if Temp is too small, local search dominates and the optimizer gets stuck in a local minimum. The optimum value of Temp depends on RF circuit topology; i.e., high Temp is required for circuits with steep cost functions and vice-versa [10], [17]. Greater computational efficiency is achieved adapting Temp using an estimate of the cost function slope [18]. Moreover, the adaptive method eliminates the need to guess an initial value of Temp. An implicit relationship between Temp and the cost function slope is given in (3a). Temp is adapted by replacing the random number with a fixed parameter, solving for Temp, and averaging as in (3b)

… cost … cost_old … (3a)

… cost … cost_old … (3b)

Fig. 9 charts a flow for adaptively determining Temp. For a given topology, experience shows that an estimate of Temp obtained from iterations within the first optimization loop is sufficiently accurate for further optimization cycles. All positive cost-function differences between iteration points within the first loop are weighted and averaged to estimate Temp. (Of course, Temp can be estimated using additional loops.)

Whereas the adaptive Temp algorithm eliminates the need for an initial value of Temp, an empirical parameter is introduced. It represents the initial probability of hill climbing and ranges from 0.70–0.90; i.e., it determines the balance between global and local search capabilities. An advantage of this approach is that this parameter is not topology dependent as is Temp.

Fig. 9. Flowchart of the adaptive temperature coefficient heuristic used in the ASAT algorithm.
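One plausible reading of the adaptive estimate can be sketched in a few lines. The sketch assumes that the implicit relationship (3a) is the usual Metropolis rule, in which an uphill move of size Δcost is accepted when a random number falls below exp(−Δcost/Temp). Fixing that probability at `p_init` and solving gives Temp = −Δcost / ln(`p_init`) for each positive cost difference. The name `estimate_temp`, the argument names, and the equal weighting of the differences are assumptions; the weighting actually used in (3b) is not recoverable from the text.

```python
import math

def estimate_temp(costs, p_init=0.8):
    """Estimate Temp from the positive cost differences in a sequence of costs.

    Assumes Metropolis acceptance, exp(-delta / Temp) = p_init, averaged with
    equal weights over all uphill steps (the paper's weights are unknown).
    """
    if not 0.0 < p_init < 1.0:
        raise ValueError("p_init must lie strictly between 0 and 1")
    uphill = [b - a for a, b in zip(costs, costs[1:]) if b > a]
    if not uphill:
        raise ValueError("no uphill moves observed; Temp cannot be estimated")
    mean_climb = sum(uphill) / len(uphill)
    return -mean_climb / math.log(p_init)
```

Under these assumptions a steeper cost function (larger uphill differences) yields a larger Temp, which matches the trend of Fig. 8.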

IV. PSO

PSO works with a large population of potential solution candidates called particles. Hence, the inherently parallel approach of PSO is potentially faster than simulated annealing. PSO also gains advantages over alternative population-based optimization algorithms owing to its unique swarming capabilities. Moreover, its describing equations are simple and easily implemented in the core optimization block.

A. PSO Theory

Social scientists have observed that swarming in search of food differs from other animal behaviors in that individuals benefit from the discoveries and experiences of all other group members [6]. Swarming behavior is observed in flocks of birds, schools of fish, swarms of bees, etc. PSO mimics the swarming behavior [7]. An obvious advantage is its simplicity; it is easily implemented in the core optimizer using the describing equations in Fig. 10.

Fig. 10. Describing equations for PSO. The position of a particle at the next iteration step is given by the sum of its current position and velocity vectors. The next velocity vector is a sum of weighted inertia, and randomly weighted competition and cooperation vectors.

The motion of an object in PSO is represented as the vector sum of present position and velocity vectors (Fig. 10). The second equation that details the algorithm for updating the particle velocity comprises three vectors: inertia, competition, and cooperation. The cooperation vector links the current position of a particle to the position of the particle that enjoys the best global position (Gbest); it is weighted using a uniformly distributed random function. Each member of the group gains knowledge of the globally best position by cooperating and communicating with all other particles. The competition vector links the current position of a particle to its personal best position (Pbest); it is weighted using a second uniformly distributed random function. The competition factor describes the tendency of a particle to explore the vicinity of its own personal best position. Finally, the inertia vector represents the tendency of a particle to maintain its current velocity; it is weighted by a constant w. PSO cleverly combines inertia, competition, and cooperation in an optimum fashion so that the particles swarm to the best solution.

Whereas it is clear that cooperation among particles is essential for finding the global optimum solution, the need for the inertia and competition is not obvious; the main reason for including them is to avoid trapping in local minima. To appreciate this point, consider a four-particle example (Fig. 11) in which cooperation (Fig. 10) is activated but inertia and competition are disabled. The probability of finding the global optimum without being trapped in a local minimum depends on the initial positions from which the particles fly straight toward the known best position. In the example, the particles converge on a local minimum without ever experiencing the global optimum. This behavior is reminiscent of gradient-descent algorithms.

Fig. 11. Four-particle PSO example with the cooperation factor enabled and the inertia and competition factors disabled. Performance depends on the initial particle positions. For the initial positions shown, the particles miss the global optimum and converge on a local minimum.

Trapping in local minima is avoided using the complete PSO formulation (Fig. 10). In Fig. 12(a), four particles are positioned with the rightmost particle initially occupying the best position. The particles move according to a randomly weighted sum of inertia, competition, and cooperation vectors. Notice that PSO encourages the particles to travel in different directions to explore different regions of the design space. After the first iteration, the leftmost particle experiences the lowest cost, and the positions after a second iteration are shown in Fig. 12(b); PSO allowed the rightmost particle to escape the local minimum. All particles quickly swarm toward the optimum solution.

The weighting of the inertia, competition, and cooperation factors is important in determining the efficiency and robustness of PSO. Because it is intuitively appealing to assign the same weight to each vector, the three weighting factors are chosen to have average values of one. In some cases, the inertia weight w is adjusted to be less than 1 for faster convergence. This issue is revisited later.

B. PSO Procedure

PSO begins with the population of particles assigned random initial positions in the design space. Each particle is also given an initial random velocity (Fig. 13). As PSO progresses, each particle keeps track of its best solution, Pbest, while the whole group keeps track of the overall best solution experienced by any particle, Gbest. Particle swarming continues until a sufficiently low-cost solution is found, or a maximum number of iterations are executed. PSO for parasitic-aware synthesis is charted in Fig. 14.

C. PSO Parameters

The inertia weighting factor w is important in determining the balance between global and local search capabilities. If it is too large, PSO emphasizes global searching and is slow, and if it is too small, it emphasizes local searching and gets trapped in local minima. Another important optimization parameter not shown in Fig. 10 is v, which limits the maximum particle velocity and effectively limits its range of movement between iterations. The size of the population in PSO is less critical than in other population-based algorithms: two to three times the number of design variables is generally effective [6].

Nonlinear power amplifier (22 particles) and RF distributed amplifier (55 particles) designs are used to investigate the impact of the parameters on PSO optimization. Twelve synthesis runs were performed for each circuit for nine combinations of v and w. The failure rate, the percentage of runs that did not find an acceptable solution within the maximum number of iterations (5000 for the distributed amplifier and 20 000 for the power amplifier), is plotted in Fig. 15. In this comparison, an iteration in PSO is an update of only one particle. Statistics of the number of iterations versus parameter values are shown in Fig. 16; in these examples, v = 0.1 and w = 0.80 are optimum. Additional details are presented in Section VI.

V. COMPACT MODELS FOR PASSIVE COMPONENTS

On-chip inductors, transformers, micro-strip transmission lines, and coplanar wave-guides are ubiquitous in impedance matching networks, resonant circuits, etc. The parasitic-aware synthesis paradigm ensures that parasitic components do not limit circuit performance. Hence, parasitic modeling is a key component as indicated in Fig. 1.

The use of on-chip spiral inductors provides increased integration. However, their performance is inferior to their off-chip counterparts owing to parasitic elements. Fig. 17 shows cross-sectional and top views of a square spiral inductor [10]–[12]. Process and design parameter information is needed to develop accurate compact circuit models. Process information includes oxide thickness between the metal layers and the substrate, metal thickness, substrate resistivity, etc. Design parameters are metal width W, number of turns N, center spacing S, and metal line spacing D. The relationships between the parasitic component values and the design and process parameters are complex and difficult to model accurately. Fig. 18 displays the measured impedance versus frequency for a spiral inductor in 0.35 µm CMOS. Design parameters for the metal-3 spiral are: N = 6.25 turns, W = 15 µm, S = 101.4 µm, and D = 1.2 µm. The spiral provides 9 nH of inductance with a peak Q of 4 at 3.2 GHz.

Fig. 12. PSO example. (a) At the initial positions the rightmost particle has the lowest cost. For the first iteration, the particles move based on the weighted inertia, cooperation, and competition factors. The leftmost particle now occupies the best position. (b) Particle positions after the second iteration. PSO encourages the rightmost particle to escape the local minimum. Note particle swarming toward the optimum.

Fig. 13. Initial conditions for PSO. In this example, the four particles are assigned random initial positions and velocities.

Fig. 14. Flowchart of PSO as used in the core optimization block of a parasitic-aware RF circuit synthesis tool.

Fig. 15. Performance results for 12 parasitic-aware PSO runs versus v and w. (a) Four-stage distributed amplifier. (b) Three-stage power amplifier. The failure rate is the percentage of synthesis runs for which an acceptable solution is not found within the maximum number of iterations. v = 0.1 and w = 0.80 work well for both examples.
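The PSO flow charted in Fig. 14, with the position and velocity updates of Fig. 10, might be sketched as follows. This is a generic PSO written under stated assumptions, not the authors' tool: the coefficients `c1 = c2 = 2.0` are chosen so that the randomly weighted competition and cooperation terms average to one, the velocity limit is interpreted as a fraction `v_frac` of each variable's range, positions are clipped to the bounds, and `cost` stands in for a circuit-simulator call.

```python
import random

def pso(cost, bounds, n_particles=20, w=0.8, c1=2.0, c2=2.0,
        v_frac=0.1, iters=200, seed=0):
    """Minimize cost over box bounds with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Assumed interpretation: velocity limit is a fraction of each range.
    v_max = [v_frac * (hi - lo) for lo, hi in bounds]
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-vm, vm) for vm in v_max]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # personal best positions
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]  # global best so far
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                inertia = w * vel[i][d]
                competition = c1 * rng.random() * (pbest[i][d] - pos[i][d])
                cooperation = c2 * rng.random() * (gbest[d] - pos[i][d])
                v = inertia + competition + cooperation
                v = max(-v_max[d], min(v_max[d], v))   # clamp the velocity
                vel[i][d] = v
                lo, hi = bounds[d]
                pos[i][d] = max(lo, min(hi, pos[i][d] + v))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost
```

Note that one pass of the outer loop here updates every particle, whereas the comparison in the text counts the update of a single particle as one iteration.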
Fig. 16. Statistics for v and w. (a) Average number of iterations. (b) Standard deviation of the number of iterations for the four-stage distributed amplifier and the nonlinear power amplifier.

Fig. 17. Cross-sectional (left) and top (right) views of a parasitic-laden 2 turn monolithic square spiral inductor. A compact π-model circuit is also shown.

Fig. 18. Measured impedance of a square spiral inductor in 0.35 µm CMOS. Design parameters for the metal-3 spiral are: N = 6.25 turns, W = 15 µm, S = 101.4 µm, and D = 1.2 µm.

Fig. 19. Parasitic component values for the compact π-model of Fig. 17 versus square spiral inductance in 0.35 µm CMOS. Design parameters for the metal-3 spirals are: W = 15 µm, S = 101.4 µm, and D = 1.2 µm. "x" denotes a measured or MOMENTUM simulation value; a solid line indicates the describing equation obtained using the polyfit function of MATLAB. Representative polynomials are shown.

Fig. 20. Bond wire inductor connected between two on-chip bonding pads. Key parasitics that need to be modeled include the N-well and substrate resistances and capacitances.

Fig. 21. Four-stage CMOS distributed amplifier with artificial LC gate and drain delay lines.

Parasitic-aware optimization requires parasitic component values expressed as functions of the design variables over a desired range of values. Fig. 19 shows the parasitic component values versus inductance from 1 to 18 nH using the same design parameters as above. Several steps are used to generate Fig. 19.

1) Fabricate and measure small, medium, and large inductors that span the desired range of inductances.
2) Calibrate MOMENTUM so that its results match the three sets of measured results.
3) Use MOMENTUM (calibrated) to determine parasitic component values for several inductances interpolating the range.
4) Finally, use the polyfit command in MATLAB, or a similar curve-fitting tool, to fit equations to the measured and simulated points (Fig. 19).

The fitting equations are compatible with HSPICE or SPECTRE netlist descriptions that enable on-the-fly compact model generation for use in the optimization loop.
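The curve-fitting step can be illustrated with NumPy's `polyfit`, used here as a stand-in for the MATLAB command named above. The sample points below are invented placeholders generated from an arbitrary quadratic; they are not measured or simulated values from the paper. The quadratic order and the helper names are also assumptions.

```python
import numpy as np

# Placeholder samples: inductance in nH and one parasitic value in ohms.
# These numbers are invented for illustration and are not from the paper.
inductance_nh = np.array([1.0, 3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
series_res_ohm = 0.02 * inductance_nh**2 + 0.9 * inductance_nh + 1.5

def fit_parasitic(l_nh, values, degree=2):
    """Return polynomial coefficients, highest power first."""
    return np.polyfit(l_nh, values, degree)

def eval_parasitic(coeffs, l_nh):
    """Evaluate the fitted compact-model expression at inductance l_nh."""
    return np.polyval(coeffs, l_nh)

coeffs = fit_parasitic(inductance_nh, series_res_ohm)
r_at_9nh = eval_parasitic(coeffs, 9.0)
```

In an optimization loop, each fitted polynomial would be evaluated at the candidate inductance and the resulting parasitic values written into the netlist before the simulator is called.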
A bond wire inductor has several advantages. It is often made of gold, rather than aluminum, and its radius is large (about 30 µm) compared to the dimensions of the spiral. Consequently, its series loss resistance is much smaller and its quality factor is much larger. Fig. 20 depicts a bond wire and its important parasitics. The bonding pads contribute constant parasitic values with respect to the length of the wire. Repeatability of bondwire inductors has been a concern. However, it has been shown that machine-bonded wires exhibit less than …% inductance variations and less than …% … variations in manufacturing [13].

VI. PARASITIC-AWARE OPTIMIZATION RESULTS

For a given circuit topology, the optimization core updates the design variable values and the associated parasitic components at every iteration, and provides an updated netlist. The simulation output is then used to compute the cost. Iterations around the optimization loop continue until a predetermined cost function objective is met. For the examples of this section, the cost function is

COST = … (4)

whose terms comprise a weighting coefficient, a performance objective to be optimized, and the optimization target and the worst case specification. Thus, the cost function translates a multidimensional design space into a single expression that is easily evaluated. The optimizer then updates the design variables according to the cost. This process repeats until the optimum is found or a maximum number of iterations are executed.

In this section, applications of parasitic-aware synthesis using the ASAT and PSO algorithms to the synthesis of a distributed amplifier and a power amplifier are described, and compared to simulated annealing. A 0.35-µm digital CMOS process is used for both amplifiers.

Fig. 22. (a) Three-stage (class-AB, class-E, class-E) power amplifier showing the desired gain and drain efficiency (η) distributions. (b) CMOS circuit implementation details.

A. Four-Stage Distributed Amplifier

The four-stage distributed amplifier (Fig. 21) used two artificial LC delay lines. One, the gate line, applies delayed versions of the input RF signal to the four gate terminals. The other, the drain line, adds the drain signal currents constructively in the load resistor. Cascode transistors reduce the Miller effect by imposing a low impedance at the drains of the amplifying devices. Cascoding also reduces capacitive coupling between the artificial transmission lines and increases gain flatness, reverse isolation, stability, and input and output impedance matching accuracy. Finally, the high output impedance of the cascode configuration reduces gain loss associated with loading on the drain line.

An LC transmission line exhibits an intrinsic mismatch at each termination point due to image impedance variations with frequency. Hence, m-derived half sections are inserted to improve the impedance matches to the delay lines [14].

In order to increase gain flatness and decrease gain peaking near the cutoff frequency, a staggering technique is frequently used in distributed amplifiers [15]. Staggering means adopting slightly different cutoff frequencies for each delay line in order to increase the linearity of the phase response. Designing the gate line to have a higher cutoff frequency improves gain flatness.

Assuming matched impedances at the input and output ports, the cutoff frequency, characteristic impedance, and low frequency gain are

… (5)

… (6)

… (7)

where $g_m$ is the small-signal transconductance of the active devices [16]. One term in (7) indicates the unique property of gain addition in a distributed amplifier.

Key amplifier specifications are a constant gain greater than 8 dB over a bandwidth greater than 8 GHz with linear phase over the full bandwidth. The input and output ports are matched to 50 Ω, and the distributed amplifier operates from a single 3.3-V power supply. Eleven design variables are used in optimizing the design including all on-chip inductor and capacitor values.

Fig. 23. Averages (AVG) and standard deviations (STD) of the number of iterations for PSO, ASAT, and SA for (a) four-stage distributed amplifier, and (b) three-stage power amplifier.

B. Three-Stage Class-E Power Amplifier

A class-E power amplifier uses a transistor as a switch and capacitors and inductors to minimize the overlap between the current and voltage waveforms to achieve high efficiency. It is a nonlinear power amplifier suited to digital modulation schemes such as GMSK, MSK, etc.

Fig. 22 shows a three-stage power amplifier with gain and drain efficiency distributions. Fig. 22(a) achieves a 30-dB gain with 50% drain efficiency; drain efficiency of the third stage is most important because of its large output power. Hence, the last two stages are class-E for high drain efficiency and the first stage is class-AB for high gain. The first two stages use spiral inductors and all choke inductors use bondwires. To obtain a 1-W output power, the resistance presented by the matching network needs to be about 5 Ω. Hence, a bondwire inductor is used in the third stage. Cascodes are used to reduce the voltage stresses on the FETs. The cost function components are output power and drain efficiency. To optimize the power amplifier, 22 design parameters are considered including all device sizes and bias conditions; the goal is 55% drain efficiency with 1.2 W of output power.
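As a rough cross-check on the load resistance quoted above, the textbook result for an ideal, lossless class-E stage with a 50% duty cycle is P_out = 8/(π² + 4) · V_DD²/R. The sketch below applies that relation; it is an idealized hand calculation, not the paper's derivation, and it ignores switch loss, finite inductor Q, and the voltage sharing of the cascode devices. The function name is hypothetical.

```python
import math

# Ideal class-E output-power coefficient, about 0.577.
CLASS_E_COEFF = 8.0 / (math.pi**2 + 4.0)

def class_e_load_resistance(p_out_w, v_dd):
    """Load resistance (ohms) an ideal class-E stage needs for p_out_w watts."""
    if p_out_w <= 0 or v_dd <= 0:
        raise ValueError("output power and supply voltage must be positive")
    return CLASS_E_COEFF * v_dd**2 / p_out_w

r_1w = class_e_load_resistance(1.0, 3.3)    # ideal load for 1 W at 3.3 V
r_1w2 = class_e_load_resistance(1.2, 3.3)   # ideal load for 1.2 W at 3.3 V
```

With a 3.3-V supply the ideal relation gives a load of roughly 5 to 6 Ω for outputs of 1 to 1.2 W, the same order as the value stated in the text; any real design would need a lower resistance to make up for losses.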

Fig. 24. Average costs versus iterations for (a) the distributed amplifier and
(b) the power amplifier. Fig. 25. (a) Forward gain magnitude (S in decibels) results. (b) Forward
gain phase results.

C. Comparative Results
For comparison purposes, the best parameter settings are used for all optimization runs. Since Temp is critical for both the simulated annealing and ASAT algorithms, the adaptive Temp coefficient algorithm is used in both. For particle swarm, the parameter settings explained earlier are chosen for both amplifiers.

Fig. 23 compares three different optimization approaches for 12 synthesis runs. The distributed amplifier is optimized using 11 design variables and the power amplifier has 22. For the distributed amplifier, all three optimization techniques find an acceptable set of design parameters to achieve a flat-gain response within a reasonable number of iterations. Both PSO and ASAT converge more than twice as fast as simulated annealing. The standard deviations of the number of iterations indicate similar robustness for all three approaches. For the power amplifier, simulated annealing is poorer in both average and standard deviation; the search space is larger for this example and it has difficulty converging to the optimum.

Fig. 24 plots costs versus the number of iterations for the various methods. For the power amplifier, PSO and ASAT initially perform worse than simulated annealing. One reason is that PSO in our examples generates the particles in a serial fashion, even though PSO is an inherently parallel approach, to provide a worst-case comparison to the other approaches. It updates the velocity and position of each particle, and determines the cost after the whole group of particles is created. Thus, PSO appears slow at the beginning of the optimization process. In ASAT, the adaptive tunneling process emphasizes the global search at the beginning, resulting in a slow pace. As optimization progresses, both techniques achieve superiority over simulated annealing, which wastes iterations hill climbing as described earlier. As the constraints become tighter and as iterations increase, the discrepancy between PSO or ASAT and simulated annealing widens as detailed in Fig. 24. PSO is powerful in reducing the computation cost due to cooperation among its particles. ASAT is also more effective owing to its novel local search algorithm. Both PSO and ASAT exhibit a very good balance between global and local search capabilities. Overall, PSO and ASAT outperform simulated annealing by more than 2×. PSO is 15% faster than ASAT in both examples.
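The serial PSO behaviour described in this subsection (update the velocity and position of each particle, then determine the cost once the whole group has been generated) can be sketched as follows. This is a minimal sketch of the standard inertia-weight PSO update, not the authors' implementation: the coefficient values `w`, `c1`, and `c2`, the search bounds, and the quadratic `toy_cost` that stands in for an HSPICE or SPECTRE evaluation are all assumptions chosen for illustration.

```python
import random


def toy_cost(x):
    """Hypothetical stand-in for a simulator-evaluated cost function."""
    return sum((xi - 1.0) ** 2 for xi in x)


def pso(cost, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5,
        lo=-5.0, hi=5.0, seed=0):
    """Serial PSO: move every particle first, then evaluate the whole group."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # best position seen by each particle
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]  # best position seen by the swarm

    for _ in range(iters):
        # Step 1: update the velocity and position of each particle.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp to the (assumed) design-variable bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
        # Step 2: evaluate costs only after the whole group has been created.
        for i in range(n_particles):
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


if __name__ == "__main__":
    # 11 variables, matching the size of the distributed-amplifier problem.
    best, best_cost = pso(toy_cost, dim=11)
    print(best_cost)
```

Because the personal and global bests are refreshed only after all particles have moved, every particle in an iteration is steered by the same global best, which mirrors the group-wise evaluation described in the text. In a real flow, `toy_cost` would be replaced by a call that runs the circuit simulator and returns the weighted cost.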

Fig. 26. Simulation results with ideal inductors and parasitic-laden inductors, before and after optimization.

D. RF Circuit Optimization Results
Figs. 25 and 26 illustrate key final results for synthesis of the distributed amplifier and power amplifier, respectively. The optimized distributed amplifier has a much better frequency response. Note that the optimizer has cleverly regained phase linearity by eliminating internal parasitic mismatch effects. Loss compensation and improved phase linearity also improve the gain flatness (Fig. 25). Gain roughness in the passband before optimization was dB; after optimization it is dB.

The desirability of parasitic-aware synthesis is also illustrated in Fig. 26. With ideal parasitic-free inductors on all three stages, the output power is 30 dBm and the drain efficiency is 58% for an input power level of 0 dBm. Using parasitic-laden spiral and bond wire inductors, but before optimization, the output power drops to 27 dBm and the drain efficiency decreases to only 30% with 0 dBm of input power. Both output power and the drain efficiency are restored to nearly ideal specifications after optimization.

VII. CONCLUSION
Methods for parasitic-aware RF circuit synthesis using PSO and ASAT are presented and compared to simulated annealing. PSO and ASAT provide greater computational efficiency and robustness in the presence of on-chip and package parasitics. A parasitic-aware synthesis system has been described that comprises three parts: an optimization core, an embedded RF circuit simulator, and a compact model generator. A four-stage distributed amplifier with 8-dB forward gain and 1-dB gain flatness over an 8-GHz bandwidth, and a three-stage 900-MHz nonlinear power amplifier with 30-dBm output power and 55% drain efficiency have been synthesized in 0.35-μm digital CMOS.

REFERENCES
[1] R. Gupta and D. J. Allstot, “Parasitic-aware design and optimization of CMOS RF integrated circuits,” in Proc. IEEE Radio Frequency Integrated Circuits Symp., June 1998, pp. 325–328.
[2] D. Leenaerts, G. Gielen, and R. A. Rutenbar, “CAD solutions and outstanding challenges for mixed-signal and RF IC design,” in Proc. IEEE/ACM Int. Conf. Computer-Aided Design, Nov. 2001, pp. 270–277.
[3] M. J. Krasnicki, R. Phelps, J. R. Hellums, M. McClung, R. A. Rutenbar, and L. R. Carley, “ASF: A practical simulation-based methodology for the synthesis of custom analog circuits,” in Proc. IEEE/ACM Int. Conf. Computer-Aided Design, Nov. 2001, pp. 350–357.
[4] E. Aarts and J. Korst, Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing. New York: Wiley, 1989.
[5] R. A. Rutenbar, “Simulated annealing algorithms: An overview,” IEEE Circuits Devices Mag., vol. 5, pp. 19–26, Jan. 1989.
[6] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proc. IEEE Int. Conf. Neural Networks, 1995, pp. 1942–1948.
[7] J. Park, K. Choi, and D. J. Allstot, “Parasitic-aware design and optimization of a fully-integrated CMOS wideband amplifier,” in Proc. Asia and South Pacific Design Automation Conf., 2003, pp. 904–907.
[8] B. M. Ballweber, R. Gupta, and D. J. Allstot, “A fully integrated 0.5–5.5 GHz CMOS distributed amplifier,” IEEE J. Solid-State Circuits, vol. 35, pp. 231–239, Feb. 2000.
[9] B. M. Ballweber, “Design and computer-aided optimization of a fully integrated CMOS RF distributed amplifier,” M.S. thesis, Dept. Elect. Eng., Oregon State Univ., Corvallis, Nov. 1998.
[10] H. M. Greenhouse, “Design of planar rectangular microelectronic inductors,” IEEE Trans. Parts, Hybrids, Packaging, vol. 10, pp. 101–109, June 1974.
[11] K. Choi, “Design and optimization of a class-E power amplifier using simulated annealing techniques,” M.S.E.E. dissertation, Dept. Elect. Eng., Arizona State Univ., Tempe, 1999.
[12] J. Craninckx and M. S. J. Steyaert, “A 1.8-GHz CMOS low-phase-noise voltage-controlled oscillator with prescaler,” IEEE J. Solid-State Circuits, vol. 30, pp. 1474–1482, Dec. 1995.
[13] Y.-G. Lee, S.-K. Yun, and H.-Y. Lee, “Novel high-Q bond wire inductor for MMIC,” in Proc. IEEE Int. Electron Devices Meeting, 1998, pp. 19.7.1–19.7.4.
[14] D. M. Pozar, Microwave Engineering. Reading, MA: Addison-Wesley, 1990.
[15] D. G. Sarma, “On distributed amplification,” Proc. Inst. Elect. Eng., vol. 102B, pp. 689–697, 1954.
[16] E. L. Ginzton et al., “Distributed amplification,” Proc. IRE, vol. 36, pp. 956–969, Aug. 1948.
[17] K. Choi, D. Allstot, and S. Kiaei, “Parasitic-aware synthesis of RF CMOS switching power amplifiers,” in Proc. IEEE Int. Symp. Circuits Syst., vol. 1, May 2002, pp. 269–272.
[18] Mathematical optimization: Computer Science Educational Project [Online]. Available: http://csep1.phy.ornl.gov/mo/mo.html
[19] T. Kim, X. Li, and D. J. Allstot, “Compact model generation for on-chip transmission lines,” IEEE Trans. Circuits Syst. I, vol. 51, pp. 459–470, Mar. 2004.

Jinho Park was born in Seoul, Korea, in 1972. He received the B.S. degree from Seoul National University, Seoul, Korea, in 1996, the M.S. degree from Oregon Graduate Institute, Portland, in 1999, and the Ph.D. degree in electrical engineering from the University of Washington, Seattle, in 2003. During his Ph.D. program, he studied CMOS ultra-wideband LNAs and RF synthesis techniques using particle swarm optimization.
In 2003, he joined Marvell Semiconductors, Sunnyvale, CA, where he is currently engaged in design of RF synthesizers and dc–dc converters. He is the coauthor of a book and a book chapter in electronics. From 1999 to 2003, he served as President of Korean Graduate School Association of Electrical Engineering at University of Washington and Director of Korean–American Scientists & Engineers Association, Pacific Northwest Chapter.
Dr. Park received awards for outstanding analog design from the National Science Foundation Center for the Design of Analog/Digital Integrated Circuits (CDADIC) in 2002 and Analog Devices, Inc. in 2003.

Kiyong Choi was born in Seoul, Korea. He received the B.S.E.E. and M.S.E.E. degrees in electrical engineering from Arizona State University, Tempe, AZ, in 1998 and 1999, respectively, and the Ph.D. degree from University of Washington, Seattle, in 2003.
His interests include high-speed and high-power analog integrated circuit design and computer-aided design optimization. He is currently with Marvell Semiconductors, Sunnyvale, CA.

David J. Allstot (S’72–M’72–SM’83–F’92) received the B.S. degree from the University of Portland, Portland, OR, the M.S. degree from Oregon State University, Corvallis, and the Ph.D. degree from the University of California, Berkeley.
He has held several industrial and academic positions and has been the Boeing-Egtvedt Chair Professor of Engineering at the University of Washington since 1999. He is currently the Acting Chair of Electrical Engineering. He has advised approximately 75 M.S. and Ph.D. graduates and published about 225 papers.
Dr. Allstot is the recipient of several outstanding teaching and advising awards. Awards include the 1978 IEEE W.R.G. Baker Prize Paper Award, the 1995 IEEE Circuits and Systems (CAS) Society Darlington Best Paper Award, the 1998 IEEE International Solid-State Circuits (SSC) Conference Beatrice Winner Award, the 1999 IEEE CAS Society Golden Jubilee Medal, and the 2004 Technical Achievement Award of the IEEE CAS Society. He was an Associate Editor of IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: ANALOG AND DIGITAL SIGNAL PROCESSING from 1990 to 1993, and its Editor from 1993 to 1995. He has served on the Technical Program Committee, IEEE Custom Integrated Circuits Conference, from 1990 to 1993, on the Education Award Committee, IEEE CAS Society, from 1990 to 1993, on the Board of Governors, IEEE CAS Society, from 1992 to 1995, on the Technical Program Committee, IEEE International Symposium on Low-Power Electronics and Design, from 1994 to 1997, on the Mac Van Valkenburg Award Committee, IEEE CAS Society, from 1994 to 1996, and since 1994 has served on the Technical Program Committee, IEEE International SSC Conference. He has been the 1995 Special Sessions Chair, IEEE International Symposium on CAS, the Executive Committee Member and Short Course Chair, IEEE International SSC Conference, from 1996 to 2000, the Co-Chair, IEEE SSC and Technology Committee, from 1996 to 1998, Distinguished Lecturer, IEEE CAS Society, from 2000 to 2001, and the Co-General Chair, IEEE International Symposium on CAS in 2002. He is a Member of Eta Kappa Nu and Sigma Xi.
