Harkouss 1999
ANN in Microwave Device and Circuit Modeling
…specifications established at the first stage. Next, the circuit specifications in the hierarchical system model are replaced by the previously evaluated circuit performances, and the overall system behavior is analyzed by means of a system-level simulator in order to verify the target specifications.

Figure 1 shows the two stages of the design process. In practice the designer goes through several iterations until he arrives at a stable system model that adequately covers the requirements.

Accurate information modeling across the hierarchy, and between the heterogeneous tools being used, is one important aspect of this design process. Going up in the hierarchy implies the use of specific simulation tools acting, on one hand, on specific models, from low-level physical models to high-level behavioral models, and, on the other hand, on specific signals, from a single carrier to complex modulations. At each level the product model-complexity × signal-complexity must remain constant in order to allow simulation within realistic delays.

The development of more efficient tools for the hierarchical modeling of systems, circuits, and devices, and of means to embed more information into the models, is the focus of this paper.

II. DEVICE AND CIRCUIT MODELS

A. Nonlinear Device Model

Adequate modeling of nonlinear microwave devices remains a complex but unavoidable step toward a good circuit design. Lumped equivalent electrical circuits (an example is shown in Figure 2), which are today's most widely used models [1], offer the advantage of being computationally efficient and accurate, but at the expense of a very complex extraction of the model parameters, carried out through numerical fitting and optimization [2], together with the need for an accurate circuit structure.

As an alternative, black box models relying on directly measured data free us from the circuit topology. The Volterra series approach has recently re-emerged through the use of time-varying Volterra kernels [3, 4]; in the case of short memory devices, these series converge rapidly and can thus be limited to a first-order kernel. The use of time-varying kernels steps over the limitation of the classic Volterra series to weak nonlinearities, and spares us the complex and tedious task of measuring higher-order kernels. The short memory condition tends to be satisfied by most electronic devices, so the application of the modified Volterra series approach to the modeling of these devices is justified.

When the short memory criterion is verified, the time-varying Volterra series takes the following form [4]:

    i_i(t) = I_i0{ v_1(t), ..., v_n(t) } + Σ_{j=1..n} Σ_{p=-P..P} Y_ij{ v_1(t), ..., v_n(t), pω0 } V_jp e^{j p ω0 t},

where ω0 is the fundamental angular frequency.
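The structure of this model can be sketched numerically. In the sketch below, the static term I0 and the voltage-dependent admittance kernel Y are toy stand-ins (the real kernels are fitted to measurements), and the port-voltage spectrum holds a single one-tone pair; only the structure of the equation is being illustrated.

```python
import numpy as np

# Numerical sketch of the first-order time-varying Volterra model: a static
# term I0 evaluated on the instantaneous control voltages, plus voltage-
# dependent admittance kernels Y acting on the spectral components V_jp.
# I0 and Y below are toy stand-ins, not fitted device kernels.

def i_model(t, v, V, omega0, I0, Y):
    """v: instantaneous port voltages (v_1(t), ..., v_n(t));
    V: dict mapping (j, p) -> complex spectral amplitude V_jp."""
    total = I0(*v)
    for (j, p), Vjp in V.items():
        total += (Y(v, j, p * omega0) * Vjp * np.exp(1j * p * omega0 * t)).real
    return total

# Toy kernels (assumptions, for illustration only).
I0 = lambda vgs, vds: 0.05 * np.tanh(vgs + 1.0) * np.tanh(0.5 * vds)
Y = lambda v, j, w: 1e-3 / (1.0 + 1j * w * 1e-12)   # weakly dispersive admittance

omega0 = 2 * np.pi * 13.6e9                          # fundamental pulsation
V = {(1, 1): 0.1 + 0j, (1, -1): 0.1 + 0j}            # one-tone drive on port 1
i = i_model(0.0, (-1.0, 3.0), V, omega0, I0, Y)      # instantaneous current at t = 0
```

In a real model the callables I0 and Y would interpolate tables fitted to the measured kernels; the structure of the sum over harmonics p is what carries over.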
Figure 5. ANN circuit model.

Figure 5 thus seems well suited to a successful modeling. From circuit simulations or from measurements, the circuit being realized, input-output relations may be sampled over a large range of frequencies and of voltages presented at the input and output, in order to build a learning base. Choice criteria for the neural network structures and learning algorithms are developed in the following section.

III. ARTIFICIAL NEURAL NETWORKS AND LEARNING SCHEME

A nonlinear black box structure for a static or dynamic system is a model structure prepared to describe virtually any nonlinear behavior. There has been considerable recent interest in this area, with structures based on neural networks: the multilayer perceptron, radial basis function neural networks, and wavelet neural networks. From a user's viewpoint, the key problem is to find a suitable model structure within which a good model is to be found. Fitting a model within a given structure is then, in most cases, a lesser problem.

An artificial neural network can be considered a grossly simplified model of the human brain. It is a computing system made up of a number of information processing units, or artificial neurons, highly interconnected through a set of links represented by synaptic weights, which are adjusted during a training process.

Neural networks are widely applied to a variety of practical problems such as signal processing, speech processing, and control systems [5, 6], analysis and optimization of microwave circuits [7], intermodulation and power analysis [2], and more, and many successes have been reported. Recently, artificial neural networks have become a popular tool for learning nonlinear maps from discrete data, due to their ability to learn from data and to generalize.

A. Multilayer Perceptron

An important class of neural networks, the multilayer feedforward networks [5, 6], i.e., the multilayer perceptron (MLP), is widely used to solve a wide range of problems, among them function approximation. This network consists of a set of neurons arranged in layers: an input layer, one or more hidden layers, and an output layer. All neurons in a layer are connected to all neurons in the adjacent layers through unidirectional links. The input signal propagates through the network in a forward direction, on a layer-by-layer basis, to compute the output of the network. The MLP is considered a universal approximator of nonlinear multidimensional mappings: it has been theoretically proven that an MLP with at least one hidden layer is able to approximate any complex nonlinear multidimensional relationship [8, 9]. A block diagram of a three-layer perceptron is illustrated in Figure 6, where X ∈ R^d is the input (x_i) vector, Y ∈ R^p is the output (y_o) vector, and w_rs (rs ∈ {oh, hi}) is the synaptic weight from neuron s (h or i) of the previous layer to neuron r (o or h).

The oth (1 ≤ o ≤ p) nonlinear input-output mapping performed by the three-layer perceptron can be written as

    MLP_o(X) = σ( Σ_h w_oh σ( Σ_i w_hi x_i + θ_h ) + θ_o ).

An MLP is usually a static network, with a forward direction of signal flow and no feedback loops, and is usually constructed with sigmoid neurons (with the sigmoid function as the activation function: σ(x) = tanh(x/2) = (1 − e^−x)/(1 + e^−x)).

Figure 6. Structure of a three-layer perceptron.
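A minimal forward pass matching this mapping can be written directly; the weights and biases below are random placeholders standing in for trained values, and the layer sizes are only an example.

```python
import numpy as np

# Forward pass of a three-layer perceptron with the activation
# sigma(x) = tanh(x/2) = (1 - e^-x)/(1 + e^-x).
# Random weights and biases stand in for trained values.

def sigma(x):
    return np.tanh(x / 2.0)

def mlp_forward(X, W_hi, theta_h, W_oh, theta_o):
    """X: (d,) input; W_hi: (M, d) input-to-hidden weights;
    W_oh: (p, M) hidden-to-output weights; theta_*: bias vectors."""
    hidden = sigma(W_hi @ X + theta_h)      # inner sum over inputs i
    return sigma(W_oh @ hidden + theta_o)   # outer sum over hidden neurons h

rng = np.random.default_rng(0)
d, M, p = 3, 15, 1                          # e.g., (Freq, Vgs, Vds) -> one output
Y_out = mlp_forward(rng.normal(size=d),
                    rng.normal(size=(M, d)), rng.normal(size=M),
                    rng.normal(size=(p, M)), rng.normal(size=p))
```

Since sigma maps into (−1, 1), measured targets would in practice be normalized to that range before training.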
…approach is the "stepwise selection by orthogonalization" [13] (i.e., a technique of regressor selection). The following is an outline of this approach:

1. Construct a library of discretely dilated and translated versions of a given mother wavelet ψ. This library is constructed according to the available training data set Θ_N = {(X^i, y^i) ∈ R^d × R, i = 1, ..., N; N < ∞}, where N is the number of patterns in the training set Θ_N and (X^i, y^i) is the ith sample of the inputs and desired output of the network.

2. Select the best regressors (i.e., wavelets) from this library in an iterative fashion, so as to best fit the training data Θ_N. During the selection phase, at each stage, one wavelet is automatically selected in the direction that minimizes the mean squared error (MSE) between the desired output and the network output,

    MSE = (1/N) Σ_{i=1..N} (1/2) ( y^i − WNN(X^i) )².

When the regressors are selected, the linear coefficients w_i are determined by least squares.

The choice of the number of hidden neurons (i.e., the number of model parameters) is based on the complexity of the nonlinearity and on the requested accuracy of the approximation. For this type of network, we simply apply Akaike's final prediction error criterion to determine the number of neurons in the wavelet network. This criterion is given by J_FPE = F_p · MSE, where F_p = (1 + n_p/N)/(1 − n_p/N) is the penalty factor.

E. Training Algorithms

All the above network structures may be trained using a wide range of algorithms, such as the celebrated backpropagation (BP) error algorithm or the conjugate gradient and quasi-Newton algorithms. The BP algorithm is very general and not limited to the one-hidden-layer sigmoid neural network, but in practice the training process often settles in undesirable local minima of the error surface and converges too slowly. In the case of function approximation, where the size of the problem (i.e., the number of parameters to be determined) is not very large, we may make use of robust quasi-Newton optimization methods such as the BFGS (Broyden-Fletcher-Goldfarb-Shanno) or LBFGS (limited memory BFGS) algorithm [14]. In this application, the BFGS algorithm is clearly superior to the BP approach and is thus preferred for wavelet neural network (WNN) (or MLP) training. Detailed comparisons show that the BFGS method is typically 30-100 times faster than BP, and that the LBFGS method is superior to the BFGS method when the number of parameters in the network is large (i.e., large scale optimization), being very competitive due to its low iteration cost [15].

F. Wavelet Network and Broyden-Fletcher-Goldfarb-Shanno

The choices that have to be made, and the considerations relevant to making them, are the key points for a successful application of these techniques. Modeling tools built on these techniques have to be usable by microwave system designers without any special artificial neural network (ANN) background; they must be robust and able to deliver an accurate model in the shortest time, with as few parameter value choices as possible.

After several experiments using different architectures coupled with different training algorithms, we have chosen for our applications a wavelet network coupled with the BFGS optimization algorithm. This choice is the one that best satisfies the previous requirements. There are several advantages to combining wavelets and neural networks: for instance, results from wavelet theory offer a better alternative to the theorems on the approximation ability of feedforward neural networks, and the wavelet transform provides useful guidelines for the construction of wavelet networks. Therefore, the wavelet neural network (WNN) and one algorithm for determining the parameters of the networks, that is, the procedure for building a WNN from training data by combining techniques from regression analysis with the commonly used quasi-Newton BFGS procedure, are chosen.

The main key points of this choice are the following:

- The good initialization of the wavelet network yields a fast training procedure.
- During the training process the wavelet neural network automatically adjusts its parameters (i.e., w_i, d_i, T_i) so that the error MSE is minimized. In most cases, the BFGS algorithm leads to the lowest MSE in the shortest time. However, for large scale applications, the LBFGS algorithm may favorably replace BFGS.

IV. MODELING EXAMPLES

In this section, nonlinear device and circuit modeling examples using, respectively, the multilayer perceptron (MLP) and the wavelet neural network (WNN) are presented, which confirm the capacity and the efficiency of artificial neural networks in nonlinear microwave device and circuit modeling.

A. Device Modeling Example

A.1. Volterra Kernel Modeling. In this first example, a transistor has been measured and used in a 13.6 GHz amplifier. The transistor is modeled by a black box whose signal parameters (the drain-to-source current I_ds, the gate-to-source current I_gs, and the complex value of each parameter Y_ij) are evaluated through 10 neural networks, based upon the fitting of these parameters to the corresponding Volterra series kernels over the whole operational range (V_ds ∈ [0, 9 V], V_gs ∈ [−4, 0.8 V], and Freq ∈ [1, 86 GHz]) for amplifier simulations.

The input vector of each Y_ij neural network consists of three parameters: the frequency Freq, and the gate-source and drain-source voltages V_gs and V_ds. These networks contain 15 neurons in the hidden layer (with the logistic function as the activation function) and one output (76 parameters per network). Each DC neural network has two inputs, the gate-source and drain-source voltages, six neurons in the hidden layer (with the logistic function as the activation function), and one output (25 parameters per network). The output vector of the block containing the 10 neural networks, as shown in Figure 3, consists of the signal parameters I_ds, I_gs, and Y_ij.

The 10 neural networks are trained separately with the modified backpropagation algorithm [6] (on-line backpropagation with momentum) using our simulator NETLET (neural wavelet simulator), acting on 350 measurement points in the learning base for the DC characteristics and 7000 measurement points in the learning base for the Y-parameters. In this learning algorithm, the learning rate is varied according to whether or not an iteration decreases the total sum squared error (SSE):

- if the total error function keeps decreasing over m consecutive iterations, the learning rate is multiplied by a factor a > 1 for the next iterations;
- if the total error function keeps increasing over n consecutive iterations, the learning rate is multiplied by a factor b < 1 for the next iterations.

Table I shows the approximation results obtained for the Volterra-ANN device model after the neural network learning procedure.

Figure 8. Evolution of sum squared error SSE during the Re(Y11) neural network learning process.

As an example, Figure 8 shows the evolution of SSE = N·MSE (sum squared error) as a function of
epochs, obtained after the Re(Y11) neural network learning process; N is the number of patterns in the training data set. The optimal learning rate parameter is about 0.2 and the optimal momentum constant is about 0.9.

A classical lumped equivalent electrical circuit is deduced from the same measurements, in order to compare simulation results.

Figure 9. Measured and artificial neural network generated DC characteristics. (a) Drain current vs. drain and gate voltages; (b) gate current vs. drain and gate voltages.

A.2. Simulation Results (Test Results). In this section, we present the test results obtained by the neural networks for inputs never used in the training data set. Analyses of a 13-14 GHz amplifier under one- and two-tone excitations were performed; the input voltage source was swept between −10 and 13.5 dBm, and the fundamental frequency for the
one-tone analysis was chosen to be 13.6 GHz, while for the two-tone analysis the fundamental frequencies were 13 and 14 GHz. Figures 9 and 10 show some of the characteristics of a TA5446 MESFET from the THOMSON foundry, whose lumped equivalent circuit is given in Figure 2; the curves obtained by the neural model totally agree with the measurements, assuring us of the efficiency of the data compression scheme.

Figure 10. Measured and neural network computed Y-parameters against frequency. (a) Re(Y11); (b) Im(Y11); (c) Re(Y12); (d) Im(Y12); (e) Re(Y21); (f) Im(Y21); (g) Re(Y22); (h) Im(Y22).

Plots of the output power vs. input power for the lumped and the black box models are given in Figure 11. As can be seen from the figures, the black box model and the lumped equivalent circuit are in total agreement, thereby proving the efficiency of the modeling scheme followed.

The results indicate that the MLP is also able to generalize and to predict the input-output relationship with high accuracy for input-output
patterns never used in creating or training the network.

B. Circuit Modeling Example

B.1. Circuit Modeling. The second example presents the modeling of a 2.9 GHz microwave amplifier directly from measurements. The circuit being realized, a measurement system based on the active load-pull principle allows the determination of the input and output variables over a wide range of feasible physical states. The neural network architecture is shown in Figure 5, and the previous measurements constitute the learning bases of each network building the overall structure. The learning base size is about 2625 measurement points.

When the training data set is prepared, the initialization techniques proposed in Section III.D
are applied to initialize each wavelet neural model and to determine the number of hidden wavelets in each network. We simply apply Akaike's final prediction error criterion to determine the number of hidden neurons in each wavelet neural model. In Figure 12 this criterion J_FPE and the mean squared error MSE are plotted as a function of the number of wavelets used in the Re(I1) network. According to this criterion, 20 wavelets are an appropriate choice. Table II shows the number of wavelets used in each wavelet neural network.

The initialized wavelet neural models are trained separately by the BFGS algorithm, and each wavelet network black box model has been implemented in the C++ programming language on a typical UNIX workstation. The determination of the wavelet neural network parameters
{d_i, T_i, w_i, v_i} during the initialization and learning procedures was done using our simulator NETLET. A few minutes are required to complete the wavelet network training (about 5 min per network).

Table III shows the approximation results obtained for the ANN circuit model after the initialization and learning procedures. In this table we list the mean squared error MSE of all the networks of the ANN circuit model on the training data set.

Figure 11. Calculated output power of the MESFET amplifier. (a) Single-tone harmonic output power of the 13.6 GHz MESFET amplifier; (b) two-tone output power (f1 and 2f1 − f2) of the 13.6 GHz MESFET amplifier; (c) two-tone output power (f2 and 2f2 − f1) of the 13.6 GHz MESFET amplifier.

Figure 13 shows the modeling results, where the abscissa represents the load-pull measurement (the training data) and the ordinate gives the approximation (the wavelet neural network computed output currents). These results demonstrate an excellent agreement with the measurement data.

B.2. Simulation Results (Test Results). The output power of the microwave amplifier under consideration, loaded by a 50 Ω impedance, is measured while sweeping the input power. The same characteristics are computed using the amplifier neural
Figure 12. J_FPE and MSE as a function of the number of wavelets used in the Re(I1) neural network.

TABLE II. Number of Wavelets Used in Each Wavelet Neural Network

Network        Number of Wavelets
NN1 (Re(I1))   20
NN2 (Im(I1))   1
NN3 (Re(I2))   24
NN4 (Im(I2))   24

TABLE III. Approximation Results after the Initialization and the Learning Procedures

Network        MSE (init.)   MSE (fin.)
NN1 (Re(I1))   1.03e-05      3e-07
NN2 (Im(I1))   3.4e-15       3.4e-15
NN3 (Re(I2))   4.68e-05      8.65e-07
NN4 (Im(I2))   4.8e-05       1.46e-06
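The wavelet counts of Table II follow from the final prediction error rule J_FPE = F_p · MSE described earlier. A toy sketch of that selection is given below; the training-MSE curve is synthetic (the real training errors are not reproduced here), and the assumption of roughly four parameters per wavelet is illustrative.

```python
import math

# Akaike final prediction error: J_FPE = F_p * MSE, with penalty factor
# F_p = (1 + n_p/N) / (1 - n_p/N), where n_p is the parameter count.
def fpe(mse, n_p, N):
    return (1.0 + n_p / N) / (1.0 - n_p / N) * mse

N = 2625                                   # learning-base size quoted in the text
# Synthetic training-MSE curve: decreases as wavelets are added, then flattens.
mse_curve = {n: 2e-6 + 1e-4 * math.exp(-n / 4) for n in range(1, 41)}
# Assume ~4 parameters per wavelet (dilation, translation, weights) for n_p.
best = min(mse_curve, key=lambda n: fpe(mse_curve[n], 4 * n, N))
# 'best' sits where the growing penalty outweighs further MSE reduction.
```

The training MSE alone would always favor the largest network; the penalty factor turns the curve into one with an interior minimum, which is the network size retained.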
Figure 13. Resulting approximation: measured and wavelet neural network generated output currents. Abscissa: load-pull measurement; ordinate: WNN model computed. (a) Re(I1); (b) Im(I1); (c) Re(I2); (d) Im(I2).
model ports, due to possible mismatches. Such mismatches might considerably affect circuit performance, as can be seen in Figure 14, where two input-output power curves are shown for two different loads and compared to the ideal 50 Ω curve. Unfortunately, today's system simulators are not able to handle this kind of model, and this limitation leads to a significantly decreased simulation validity. The present results have been carried out using our own harmonic balance simulator, LISA, modified in order to take neural network models into account.

The results indicate that the WNN is also able to generalize and to predict the output of the network with high accuracy for inputs never used in the training set.

Figure 14. Harmonic balance simulation of the amplifier loaded by 50 Ω and by two mismatched loads.
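For readers who want to experiment, a wavelet network with the parameter set {d_i, T_i, w_i, v_i} mentioned above can be sketched as a sum of weighted, dilated, and translated wavelets plus a linear term. The Mexican-hat mother wavelet and the exact role of the v_i term are assumptions, since this excerpt does not specify them.

```python
import numpy as np

# Sketch of a wavelet network evaluation with parameters {d_i, T_i, w_i, v_i}:
# dilations d, translations T, wavelet weights w, and an additive linear
# term v. The Mexican-hat mother wavelet below is an assumed choice.

def psi(u):
    """Mexican-hat mother wavelet (2nd derivative of a Gaussian), row-wise."""
    r2 = np.sum(u * u, axis=-1)
    return (u.shape[-1] - r2) * np.exp(-r2 / 2.0)

def wnn_forward(x, d, T, w, v, bias=0.0):
    """x: (dim,) input; d: (M,); T: (M, dim); w: (M,); v: (dim,)."""
    u = d[:, None] * (x[None, :] - T)        # dilate and translate the input
    return float(w @ psi(u) + v @ x + bias)  # wavelet sum plus linear term

rng = np.random.default_rng(1)
M, dim = 20, 2                               # e.g., 20 wavelets, as for Re(I1)
y = wnn_forward(rng.normal(size=dim),
                np.abs(rng.normal(size=M)) + 0.1,   # positive dilations
                rng.normal(size=(M, dim)), rng.normal(size=M),
                rng.normal(size=dim))
```

In the paper's scheme the dilations and translations come from the regressor-selection initialization, after which all parameters are refined jointly by BFGS.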
…a new system simulator is currently one of our main research topics.

REFERENCES

1. R. Quéré, J. Obregon, and J.P. Teyssier, Non linear characterization and modelling of semiconductor devices: An integrated approach, 23rd European Microwave Conference Workshop Proceedings, Spain, 1993, pp. 18-21.
2. J. Rousset, Y. Harkouss, J.M. Collantes, and M. Campovecchio, An accurate neural network model of FET for intermodulation and power analysis, 26th European Microwave Conference, Czech Republic, 1996.
3. M. Asdente, M.C. Pascussi, and A.M. Ricca, Modified Volterra-Wiener functional method for highly non linear systems, Alta Frequenza XLV (1976), pp. 756/312E-759/315E.
4. F. Filicori, G. Vannini, and V.A. Monaco, A nonlinear integral model of electron devices for HB circuit analysis, IEEE Trans Microwave Theory Tech MTT-40 (1992), 1456-1465.
5. S. Haykin, Neural networks: A comprehensive foundation, Macmillan College, New York, 1994.
6. A. Cichocki and R. Unbehauen, Neural networks for optimization and signal processing, John Wiley & Sons and B.G. Teubner, Stuttgart, Germany, 1993.
7. A.H. Zaabab, Q.J. Zhang, and M. Nakhla, A neural network modeling approach to circuit optimization and statistical design, IEEE Trans Microwave Theory Tech 43 (1995), 1349-1358.
8. K. Hornik, M. Stinchcombe, and H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (1989), 359-366.
9. T. Poggio and F. Girosi, Networks for approximation and learning, Proc IEEE 78 (1990), 1481-1497.
10. S. Mallat, Multiresolution approximation and wavelet orthonormal bases of L²(R), Trans Amer Math Soc 315 (1989), 69-88.
11. C. Chui, Wavelets: A tutorial in theory and applications, Academic Press, Boston, 1992.
12. Q. Zhang and A. Benveniste, Wavelet networks, IEEE Trans Neural Networks 3 (1992), 889-898.
13. Q. Zhang, Using wavelet network in nonparametric estimation, IEEE Trans Neural Networks 8 (1997), 227-236.
14. D.C. Liu and J. Nocedal, On the limited memory BFGS method for large scale optimization, Math Programming 45 (1989), 503-528.
15. S. McLoone and G.W. Irwin, Fast parallel off-line training of multilayer perceptrons, IEEE Trans Neural Networks 8 (1997), 646-653.
BIOGRAPHIES

Youssef Harkouss was born in Beirut, Lebanon. He received the Engineer Diploma in electrical engineering with honors from the Lebanese University of Beirut, Lebanon, in 1993, and the Diploma of D.E.A. in electronics from the University of Limoges, France, in 1995. In 1994, he joined the St-Nazaire Hospital for a training period in medical materials maintenance. Currently he is working toward the Ph.D. degree in microwave electronics at IRCOM, University of Limoges, France. His research interests include advanced neural network software development, neural network modeling of microwave devices and circuits, and CAD of passive devices.

Jean Rousset received the Ph.D. degree in physics from the University of Limoges, France, in 1976. From 1974 to 1981, he was an associate professor at the University of Algeria. He joined IRCOM in 1982 as a research engineer. He is currently managing software developments, and his research interests are in nonlinear analysis methods and simulation languages. (Photo not available.)

Edouard Ngoya received the Dipl.-Eng. Appl. degree from the Institut des Telecom d'Oran, Algeria, in 1982, and the Ph.D. degree in electronics from the University of Limoges, France, in 1988. From 1988 to 1990, he was a research engineer with Caroline and Racal Redac, where he contributed to the development of MIC CAD software. In 1991, he joined the French Centre National de la Recherche Scientifique (CNRS) as a researcher. His current fields of interest include computer methods for the analysis, modeling, and optimization of microwave communication circuits. (Photo not available.)

Denis Barataud was born in St.-Junien, France, in October 1970. He graduated from the École Nationale Supérieure de Télécommunications de Bretagne in 1994. Since 1995, he has been with the Microwave Laboratory of the University of Limoges, studying for the doctoral degree in electronics engineering. His main research interest is the time domain characterization of nonlinear devices.