Automatica: Peter Nauclér Torsten Söderström


Automatica 46 (2010) 1752–1761

Contents lists available at ScienceDirect

Automatica
journal homepage: www.elsevier.com/locate/automatica

Unbalance estimation using linear and nonlinear regression


Peter Nauclér a, Torsten Söderström b,*
a Ericsson ABB, Stockholm, Sweden
b Department of Information Technology, Uppsala University, Uppsala, Sweden

Article info

Article history:
Received 27 March 2008
Received in revised form 18 December 2009
Accepted 21 June 2010
Available online 1 August 2010

Keywords:
Unbalance estimation
Balancing
Nonlinear regression
Linear regression
Variable projection algorithms

Abstract

This paper considers the problem of unbalance estimation of rotating machinery. It is formulated as a parameter estimation problem, where the unknowns enter nonlinearly in a regression model. By use of a certain method, the problem can be reformulated as a linear estimation procedure with a closed form solution. This procedure is sometimes known as the influence coefficient method. In its derivation, no special treatment is devoted to disturbing terms and imperfections in the model. Therefore, a novel method is derived which takes disturbances into account, leading to a nonlinear estimator. The two procedures are compared and analyzed with respect to their statistical accuracy. Using the example of unbalance estimation of a separator, the nonlinear approach is shown to give superior performance.

© 2010 Elsevier Ltd. All rights reserved.
The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Giuseppe De Nicolao under the direction of Editor Ian R. Petersen.
* Corresponding author. Tel.: +46 18 4713075; fax: +46 18 511925.
E-mail addresses: Peter.Naucler@ericsson.com (P. Nauclér), Torsten.Soderstrom@it.uu.se (T. Söderström).
0005-1098/$ – see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.automatica.2010.06.053

1. Introduction

Estimation of mechanical unbalances is an important topic in many applications. Such problems appear in balancing of high speed machinery, where the purpose is to estimate mass unbalances in complex dynamic systems rotating at high speed. Applications that can be mentioned are machining tools, aircraft turbine engines (Zhou & Shi, 2001), steam turbines, electric generators (Darlow, 1989), compressors and separators.

In order to counteract the effects of unbalances, they first need to be determined. There are several ways to do this; see for example the surveys Foiles, Allaire and Gunter (1998) and Zhou and Shi (2001). In this paper we focus on the influence coefficient matrix approach. The method is based on the assumption of linearity of both the machine itself and the measuring system. No detailed modeling of the modal properties of the machine is required. Modal balancing on the other hand, see for example Tiwari and Chakravarthy (2006), is based on a detailed modeling of the dynamic properties of the rotating machine.

In the influence coefficient matrix approach, a number of experiments are carried out, where small weights are added or removed at various axial locations and angular positions. These additional weights contribute to the rotating forces of the system. Due to the assumption of linearity, the rotor's vibration amplitude is proportional to the mass unbalance of the rotor. The proportionality coefficients are called the influence coefficients. They are complex functions of the dynamic characteristics of the rotating machine and depend strongly on frequency. The response at some fixed frequency (generally above the critical speed) is recorded. From the measurements in the experiments, the user then has to determine both the influence coefficients and, particularly, the unknown unbalance in the machine.

This paper is specifically inspired by the problem of separator balancing, but the techniques and the analysis apply to other applications of unbalance estimation as well. It is an important topic in the field of separator technology. The separator bowl rotates at high speed, typically about 5000 revolutions per minute. The resulting centrifugal force is used to separate different substances, e.g. liquids. The use of large centrifugal forces is the core of separation technology. Since the bowl is very heavy and rotates at such a high speed, small mass unbalances create large radial bearing forces that may even be harmful. The magnitudes of these forces can often be tens of kN, i.e. several tonnes. Therefore, the separator must be balanced after manufacturing. When the unbalance estimation is completed, mass corrections are applied to the bowl to counteract the unbalances.

When balancing rotating machines, and determining unbalances, different situations can occur. In many cases, like for turbines and crankshafts, there is often a considerable mass distribution, and the deformation resulting from the unbalances
tends to be distributed, and should be described by a flexible model. This is the situation treated in many references, for example, Alauze, Der Hagopian and Gaudiller (2001), Blanco-Ortega, Beltrán-Carbajal, Favela-Contreras and Silva-Navarro (2008), Kang, Tseng, Wang, Chiang and Wang (2003), Schneider (1991), Sinha, Friswell and Lees (2002), Sinha, Lees and Friswell (2004) and Tiwari and Chakravarthy (2006). For separators, on the other hand, the situation is both simpler and more complicated. The situation is simpler in the sense that the effect of the unbalances can be modeled as that of a rigid body. Hence it is enough to characterize the unbalance effect as force and torque in one point, and it would be sufficient to carry out unbalance experiments by adding trial masses in two planes only. A more accurate result can, though, be expected with more experiments. However, the situation is also more complicated for separators in the following sense. When adding an additional mass, for construction and geometrical reasons the separator has to be dismantled and a lot of different parts removed. After the additional trial mass is mounted, the different parts of the separator are put back in place, and this will inevitably lead to some parts being adjusted slightly differently, so that the dynamics change. For the same reason, the change in dynamics will be different for each new experiment.

The balancing problem may be modeled as follows. Let $n$ denote the number of sensors used to measure vibrations caused by the mass unbalance. Further, let there be $M$ experiments, where for each new experiment, the added mass $x_k$ is modified. Given this, the estimation problem can be modeled as

$$ y_k = A (x_0 + x_k), \quad k = 1, \dots, M, \tag{1} $$

where $y_k \in \mathbb{C}^{n \times 1}$ is a measured variable, $A \in \mathbb{C}^{n \times n}$ is the influence coefficient matrix (unknown), $x_0 \in \mathbb{C}^{n \times 1}$ is the unbalance to be estimated (an unknown variable), and $x_k \in \mathbb{C}^{n \times 1}$ is the added mass in experiment $k$ (a user chosen variable), and where $\mathbb{C}^{n \times m}$ denotes the set of complex valued $n \times m$ matrices. The number of unknowns is thus $n^2 + n$ and the number of equations is $nM$. As the number of equations must be at least as large as the number of unknowns to guarantee that a feasible solution may exist, we find that the number of experiments must fulfill

$$ M \ge n + 1. \tag{2} $$

We are primarily interested in estimating $x_0$ and, therefore, $A$ can be treated as a nuisance variable. For each experiment, the sought variable $x_0$ is invariant.

The procedure is visualized using the separator model shown in Fig. 1. The trial masses (user chosen variables) are applied in two positions of the bowl. The notation $[x]_i$ here means element $i$ of the vector $x$. The separator is then driven up to its speed of operation. The applied masses together with the unknown mass unbalance $x_0$ give rise to a vibrational response, which is measured at two frame positions. The procedure is then repeated for a new set of trial masses, for a total of $M$ experiments.

Fig. 1. A separator model. The stiffnesses are modeled as complex numbers, which is a way to introduce damping in the system. These stiffnesses are subject to change between experiments.

The matrix $A = A(i\omega_0)$ can be viewed as the frequency response function from the current unbalance state $(x_0 + x_k)$ to the measured harmonic vibrational response $y_k$ at the angular frequency $\omega_0$. The user chosen variable $x_k$ is used to excite the system so that the problem becomes solvable. The reason for performing experiments with a system that operates in stationary rotation is that the relation between measured output and applied input becomes simple. Irrespective of the order of the system (which can be extremely large), the entries of $A$ become scalar complex numbers when the frequency response is evaluated at the single frequency $\omega = \omega_0$. The influence coefficient matrix is a function of the structural properties of the underlying system.

In the present paper, $A$ is assumed to be square. Thus, the number of inputs is equal to the number of outputs. This is a reasonable assumption for the separator problem, since the rotating bowl is considered to be a rigid body and a square system of equations captures the entire vibrational state. This is in contrast to balancing of flexible rotors, where the underlying model rather is a partial differential equation and a large number of sensors may be needed in order to minimize the response over the entire structure. Thus, generally, $A$ can be rectangular. In such circumstances, one should employ the pseudo-inverse of the tall matrix $A$ instead of $A^{-1}$, whenever it appears. The results presented here can then be applied also for balancing applications where more sensors than inputs are desired. Notice that increasing the number of sensors does not inevitably imply improved statistical accuracy in the estimation. The reason for this is that the number of unknown parameters in $A$ increases at the same rate as the number of additional sensors.

1.1. Existing methods

Equations of the type (1) frequently appear in the literature on balancing of rotating machinery (Darlow, 1989; Foiles et al., 1998; Goodman, 1964; Lund & Tonnesen, 1972). Even though Eq. (1) is nonlinear in the unknowns $A$ and $x_0$, it can be transformed to a linear estimation problem. This is the basis for the influence coefficient method (Darlow, 1987, 1989; Kang, Chang, Tseng, Tang and Chang, 2000; Zhou & Shi, 2001). It is an experimental method that can be implemented in different ways, but the basis is to use $x_1 = 0$ in the first experiment. If disturbances can be neglected, the first measurement becomes

$$ y_1 = A x_0, \tag{3} $$

which can be employed to subtract the effects of $x_0$ in the remaining experiments where $x_k \neq 0$. Then the matrix $A$ can be estimated. When it is considered to be known, it is straightforward to compute an estimate of $x_0$, for example as

$$ \hat{x}_0 = \hat{A}^{-1} y_1. \tag{4} $$

The equations needed to carry out such a procedure can be arranged in different ways, but the basics are as described above.

The estimation problem is often treated as a deterministic problem in the literature, leading to a least squares approach for determining $A$. This has been advocated by e.g. Goodman (1964) and Lund and Tonnesen (1972). Such an approach can easily be analyzed also for the case when sensor noise on the measurements has to be taken into account. In such cases the appropriate model should be

$$ y_k = A(x_0 + x_k) + e_k, \quad k = 1, \dots, M, \tag{5} $$
which has been treated in Larsson (1976). Here, an optimal weighting is introduced and a statistical analysis is carried out.

However, sometimes there may be another type of uncertainty as well. The main source of uncertainties can be that the dynamical properties of the system change between experiments. This is an observation which has triggered the work in the present paper (Hillström, 2008). There are several reasons for this kind of uncertainty. First of all, in the case of separators, the bowl often needs to be opened in order to apply the trial masses. When doing this, some of the structural properties will change due to play in the bearings etc. Another main source of uncertainty is that different stiffnesses and damping elements seem to change somewhat between experiments. For example, there are rubber damping elements whose properties depend on temperature and the vibrational amplitude. For such cases we will need the extended model of the form

$$ y_k = (A + \tilde{A}_k)(x_0 + x_k), \quad k = 1, \dots, M, \tag{6} $$

where $\tilde{A}_k \in \mathbb{C}^{n \times n}$ is a disturbance. The way that $\tilde{A}_k$ enters the system makes the estimation problem considerably harder to handle than the case when sensor noise is the main random effect.

The model (6) is also considered by Li, Lin, Untaroiu and Allaire (2003), but then the perturbations $\tilde{A}_k$ are assumed to be bounded and deterministic. Also, an estimate of $A$ is assumed to have been obtained beforehand. The unbalance determination problem is formulated as a certain convex optimization problem which includes upper bounds for the perturbations. In contrast, we will in this paper use the model (6) but treat the perturbations as random variables. In addition, no previous estimate of the influence coefficient matrix is needed.

In the current context it is assumed that sensor noise is negligible compared to the system disturbance $\tilde{A}_k$. In fact, the effect of measurement noise has been checked in previous practical studies on separators, Hillström (2008), and found to be negligible compared to the effects of the varying dynamics from one experiment to another. To the best of our knowledge there is no statistical analysis associated with estimation of (1) and no algorithms proposed that are devoted to a sound statistical treatment of the disturbing variable.

In the separator system, equations of the type (1) can be set up for several angular frequencies. The measured quantity $y_k$ and the matrix $A$ then become functions of frequency, while $x_0$ and $x_k$ are frequency independent. Still, the number of experiments must fulfill (2). Thus, the core of the problem is to be able to perform unbalance estimation at a single frequency, which is considered in this paper.

The paper is organized as follows. The next section contains some preliminary mathematical notations and basic results. Section 3 contains an analysis of a deterministic approach for determining the unbalance based on least-squares. The resulting estimate is evaluated for the case when there are random errors in the influence matrix, as in the model (6). Section 4 develops a more advanced method, where the structure of the model (6) is exploited, leading to a nonlinear estimation procedure. The resulting estimate is shown to have much better (statistical) performance. Both statistical and computational aspects of the estimator are analyzed. A detailed numerical example, based on the separator model above, is treated in Section 5, showing again superior behavior of treating the disturbance terms $\tilde{A}_k$ as random variables. Most details of the statistical analysis of the treated methods are placed in the appendices.

2. Preliminaries

The purpose of this section is to introduce some notation and mathematical tools that will be utilized in the sequel of this paper.

The vec operator is the operator that stacks the columns of a matrix. If $A = [a_1 \; \dots \; a_n]$, where $a_k$ is column $k$, we define

$$ \theta \triangleq \operatorname{vec}(A) = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}. \tag{7} $$

Similarly, the vectorized uncertainty matrix is defined as

$$ \tilde{\theta}_k = \operatorname{vec}(\tilde{A}_k) \tag{8} $$

and furthermore the uncertainty vector from all $M$ experiments becomes

$$ \tilde{\theta} = \begin{bmatrix} \tilde{\theta}_1 \\ \vdots \\ \tilde{\theta}_M \end{bmatrix}. \tag{9} $$

Before proceeding, an assumption regarding the uncertainty matrix is needed:

Assumption 1. The uncertainty matrix is zero mean and $\tilde{A}_k$ is uncorrelated with $\tilde{A}_l$ for $k \neq l$. It has an associated covariance matrix

$$ E\, \tilde{\theta}_k \tilde{\theta}_l^T = R_{\tilde{\theta}}\, \delta_{k,l}, \tag{10} $$

where $E$ denotes the expectation operator and $\delta_{k,l}$ is the Kronecker delta function.

Recall that the reason for the error $\tilde{A}_k$ in the influence matrix is that the separator is dismantled and rebuilt between the tests using a new trial mass. Therefore it is reasonable to assume that this type of error is independent from one experiment to another.

The fact that the uncertainty is independent between experiments implies that

$$ R_{\tilde{\theta},M} \triangleq \operatorname{cov}(\tilde{\theta}) = I_M \otimes R_{\tilde{\theta}}, \tag{11} $$

where $I_M$ is the identity matrix of dimension $M$ and $\otimes$ is the Kronecker product.

The vec operator has many useful properties. One that will be extensively employed in this context is

$$ \operatorname{vec}(ABC) = (C^T \otimes A) \operatorname{vec}(B). \tag{12} $$

Application of this result on the system equation (1) yields

$$ \operatorname{vec}(y_k) = y_k = \left( (x_0 + x_k)^T \otimes I_n \right) \theta + \left( (x_0 + x_k)^T \otimes I_n \right) \tilde{\theta}_k. \tag{13} $$

Let $B(x)$ and $C(x)$ be matrices whose entries are functions of a real valued vector $x$. Furthermore, let $[x]_k$ be the $k$-th element of the vector $x$. Then we define

$$ B^{(k)} = \frac{\partial B(x)}{\partial [x]_k}, \qquad B^{(kl)} = \frac{\partial^2 B(x)}{\partial [x]_k\, \partial [x]_l}. \tag{14} $$

For products of matrices the product rule applies,

$$ (BC)^{(k)} = B^{(k)} C + B C^{(k)}, \tag{15} $$

where the $x$-argument is dropped for notational convenience. For differentiation of matrix inverses it holds that

$$ B^{-(k)} \triangleq \left( B^{-1} \right)^{(k)} = -B^{-1} B^{(k)} B^{-1}. \tag{16} $$
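The vec identity (12) and the way it makes the model linear in $\theta = \operatorname{vec}(A)$, as used in (13), can be checked numerically. The following is a minimal sketch using NumPy; the helper name `vec`, the dimension `n = 3` and the random test matrices are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into one vector (column-major order)."""
    return np.asarray(A).reshape(-1, order="F")

rng = np.random.default_rng(0)
n = 3  # illustrative dimension, not taken from the paper

def random_complex(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

A, B, C = random_complex(n, n), random_complex(n, n), random_complex(n, n)
x = random_complex(n)

# Identity (12): vec(ABC) = (C^T kron A) vec(B).
# Note that C.T is the plain transpose, not the conjugate transpose.
identity_holds = np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B))

# Special case behind (13): A x = (x^T kron I_n) vec(A),
# so the response is linear in theta = vec(A) for a fixed x.
regressor = np.kron(x[None, :], np.eye(n))
linear_in_theta = np.allclose(A @ x, regressor @ vec(A))
```

The column-major reshape matters here: NumPy's default row-major flattening would stack rows instead of columns and the identity would no longer hold in this form.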
3. Linear deterministic estimation

One approach to handle the problem of estimating $x_0$ from measured data is to adopt a deterministic viewpoint. Thus, if the effect caused by $\tilde{A}$ is neglected, or considered to be insignificant, the model to apply is the one given by (5). It is the basis for two different approaches to estimate $x_0$. The two identification procedures are labeled A1 and A2, respectively. Both these approaches are employed in the balancing industry (Hillström, 2008).

3.1. Approach A1

From the relation (5), the unknown variable $x_0$ can be found using a simple procedure. The first step is to subtract the effects of $x_0$ from (5). This is performed by choosing $x_1 = 0$, which yields

$$ y_1 = A x_0 \tag{17} $$

and for the remaining $M - 1$ equations, we form

$$ z_k \triangleq y_k - y_1, \quad k = 2, \dots, M, \tag{18} $$

which yields

$$ z_k = A x_k, \quad k = 2, \dots, M \tag{19} $$

if the disturbance is neglected. Both $z_k$ and $x_k$ are known and therefore it is straightforward to compute an estimate of the nuisance variable $A$. This can be performed in different ways. One option is to apply the vec operator to (19), which gives

$$ z_k = \left( x_k^T \otimes I_n \right) \theta \tag{20} $$

and upon stacking the experiments in a tall vector

$$ z = \begin{bmatrix} z_2^T & \dots & z_M^T \end{bmatrix}^T = \Phi_1 \theta, \tag{21} $$

where

$$ \Phi_1 = \begin{bmatrix} x_2^T \otimes I_n \\ \vdots \\ x_M^T \otimes I_n \end{bmatrix}. \tag{22} $$

By use of (21) an estimate of the nuisance variable $A$ can be found. Thereafter it is straightforward to estimate $x_0$ using (17). The two-step procedure becomes:

Step 1: Let $x_1 = 0$ and $x_k \neq 0$ for $k \ge 2$. Form (21) and (22), and compute

$$ \hat{\theta} = \Phi_1^{\dagger} z. \tag{23} $$

Thereafter, form the estimate $\hat{A}$ from $\hat{\theta}$.

Step 2: Use the first experiment (17) and $\hat{A}$ to estimate $x_0$:

$$ \hat{x}_0 = \hat{A}^{-1} y_1. \tag{24} $$

The procedure to estimate unbalances by using (17) in order to linearize the equations is often referred to as the influence coefficient method in the balancing literature (Darlow, 1989; Goodman, 1964; Larsson, 1976). There exist many variants on how to organize the equations (Foiles et al., 1998). Another alternative that is more computationally efficient is to arrange the equations in the first step as

$$ \begin{bmatrix} z_2 & \dots & z_M \end{bmatrix} = A \begin{bmatrix} x_2 & \dots & x_M \end{bmatrix} \;\Rightarrow\; \hat{A} = \begin{bmatrix} z_2 & \dots & z_M \end{bmatrix} \begin{bmatrix} x_2 & \dots & x_M \end{bmatrix}^{\dagger}, $$

where $(\dots)^{\dagger}$ denotes the pseudo-inverse. The two formulations yield the same result, but the one chosen for this paper is more tractable from a statistical analysis point of view.

3.2. Statistical Properties of A1

The statistical analysis is carried out under the following conditions:

Assumption 2. The norm of the stochastic disturbance $\tilde{A}_k$ is small compared to the norm of $A$. This means that the signal to noise ratio, SNR, is large.

Remark 1. The number of experiments $M$ is not assumed to be large.

These conditions will be employed also for the analysis of the methods A2 and A3 that will be introduced in the sequel. Remark 1 is important since for the underlying application, a very large number of experiments would not be feasible. The first and second order statistics of A1 are summarized in the following lemma.

Lemma 1. The expected value of the estimate (24) is

$$ E\, \hat{x}_0 = x_0 + O\!\left( E \| \tilde{A}_k \|^2 \right), $$

and its covariance matrix is for large SNR given by

$$ \operatorname{cov} \hat{x}_0 = A^{-1} C_1 R_{\tilde{\theta},M} C_1^{*} A^{-*}, \tag{25} $$

where

$$ C_1 = \left( x_0^T \otimes I_n \right) \begin{bmatrix} I_{n^2} + \Phi_1^{\dagger} C_{1b} & \; -\Phi_1^{\dagger} C_{1a} \end{bmatrix}, \tag{26} $$

$$ C_{1a} = \begin{bmatrix} (x_0 + x_2)^T \otimes I_n & & 0 \\ & \ddots & \\ 0 & & (x_0 + x_M)^T \otimes I_n \end{bmatrix}, \tag{27} $$

$$ C_{1b} = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} \otimes \left( x_0^T \otimes I_n \right), \tag{28} $$

with $C_{1a} \in \mathbb{C}^{n(M-1) \times n^2 (M-1)}$ and $C_{1b} \in \mathbb{C}^{n(M-1) \times n^2}$.

Proof. The proof is given in Appendix A.

Remark 2. An improved form of A1 (called approach A2 in what follows) can be constructed using the following ideas. Details are explained in full in Nauclér (2008), which is available from www.uu.se. In the second step (24) of A1, the unknown variable $x_0$ is estimated using the first experiment only. This can be problematic if $\tilde{A}_1$ happens to be large, and the approach is indeed not the soundest from a statistical point of view. Since the data from the first experiment is subtracted from all other experiments in approach A1, a large systematic error in the model employed for the first experiment (that is, a considerable $\tilde{A}_1$) will deteriorate all the new data used to determine the unbalances. One way to avoid this problem is to introduce the variable $m = A x_0$. Eq. (5) then becomes

$$ y_k = m + A x_k, \quad k = 1, \dots, M, \tag{29} $$

which is linear in the unknowns $m$ and $A$, and all experiments can be used to identify these unknown parameters. Then, $x_0$ can be computed using their estimates. Still, no nonlinear optimization is needed and the approach is shown to have better statistical properties than A1, see Nauclér (2008).
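The two linear procedures of this section can be sketched in a few lines of NumPy. The A1 routine follows Steps 1 and 2 in the computationally efficient matrix arrangement. The A2 routine is only one natural least-squares reading of Eq. (29), since the details of A2 are given in Nauclér (2008) rather than here, so it should be read as an assumption. The function names, the test matrix `A_true`, the unbalance `x0_true` and the trial masses `X` are all hypothetical values chosen for illustration.

```python
import numpy as np

def estimate_a1(Y, X):
    """Approach A1. Y and X are n x M arrays whose columns are y_k and x_k.
    The first trial mass X[:, 0] is assumed to be zero."""
    y1 = Y[:, 0]
    Z = Y[:, 1:] - y1[:, None]             # z_k = y_k - y_1, Eq. (18)
    A_hat = Z @ np.linalg.pinv(X[:, 1:])   # least-squares fit of Z = A X
    return np.linalg.solve(A_hat, y1)      # x0_hat = A_hat^{-1} y_1, Eq. (24)

def estimate_a2(Y, X):
    """Sketch of approach A2 (assumed form): fit y_k = m + A x_k by
    least squares over all experiments, then solve A x0 = m."""
    M = X.shape[1]
    Phi = np.vstack([np.ones((1, M)), X])  # regressors [1; x_k]
    P = Y @ np.linalg.pinv(Phi)            # P = [m_hat, A_hat]
    m_hat, A_hat = P[:, 0], P[:, 1:]
    return np.linalg.solve(A_hat, m_hat)

# Hypothetical noise-free test case with n = 2 sensors and M = 4 experiments.
A_true = np.array([[1.0 + 0.5j, 0.2 - 0.1j],
                   [0.3j,       0.8 - 0.4j]])
x0_true = np.array([0.5 + 0.2j, -0.3 + 0.1j])
X = np.array([[0, 1, 0, 1j],
              [0, 0, 1, 1]], dtype=complex)
Y = A_true @ (x0_true[:, None] + X)        # Eq. (1) without disturbances

x0_a1 = estimate_a1(Y, X)
x0_a2 = estimate_a2(Y, X)
```

Without disturbances both routines recover `x0_true` up to rounding. The difference the paper points to only shows up when $\tilde{A}_k$ is present: A1 relies on the first experiment alone in its second step, while the A2 sketch uses all $M$ experiments for both $m$ and $A$.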
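4. Approach A3: nonlinear regression

In this section we derive a loss function that handles the stochastic uncertainty $\tilde{A}$ in a more sophisticated fashion. This leads to a problem formulation with a loss function that is nonlinear in $x_0$. Thus, there exists no closed form solution and a numerical search procedure is required. In order to use standard optimization routines, the system equation (1) is reformulated as a real valued problem. This is done by representing the complex valued quantities with their real and imaginary parts separated. This operation is denoted here with $\bar{(\cdot)}$ and we let

$$ \bar{y}_k = \begin{bmatrix} y_{kR} \\ y_{kI} \end{bmatrix} = \begin{bmatrix} \operatorname{Re}(y_k) \\ \operatorname{Im}(y_k) \end{bmatrix}, \quad \bar{x}_0 = \begin{bmatrix} x_{0R} \\ x_{0I} \end{bmatrix}, \quad \bar{x}_k = \begin{bmatrix} x_{kR} \\ x_{kI} \end{bmatrix}, \quad \in \mathbb{R}^{2n \times 1}, \tag{30} $$

where $\mathbb{R}^{n \times m}$ denotes the set of real valued $n \times m$ matrices and where $\operatorname{Re}(y_k)$ and $\operatorname{Im}(y_k)$ are the real and imaginary parts of $y_k$, respectively. The corresponding convention with subscripts $R$ and $I$ will be employed in the sequel. Furthermore, the vectorized matrices with separated real and imaginary parts are defined as

$$ \bar{\theta} = \begin{bmatrix} \theta_R \\ \theta_I \end{bmatrix} = \begin{bmatrix} \operatorname{Re}(\operatorname{vec}(A)) \\ \operatorname{Im}(\operatorname{vec}(A)) \end{bmatrix} \in \mathbb{R}^{2n^2 \times 1}, \tag{31} $$

$$ \bar{\tilde{\theta}}_k = \begin{bmatrix} \tilde{\theta}_{kR} \\ \tilde{\theta}_{kI} \end{bmatrix} = \begin{bmatrix} \operatorname{Re}(\operatorname{vec}(\tilde{A}_k)) \\ \operatorname{Im}(\operatorname{vec}(\tilde{A}_k)) \end{bmatrix} \in \mathbb{R}^{2n^2 \times 1}, \qquad \bar{\tilde{\theta}} = \begin{bmatrix} \bar{\tilde{\theta}}_1 \\ \vdots \\ \bar{\tilde{\theta}}_M \end{bmatrix} \in \mathbb{R}^{2n^2 M \times 1}, \tag{32} $$

and the corresponding covariance matrices are defined as

$$ \operatorname{cov} \bar{\tilde{\theta}}_k = R_{\bar{\tilde{\theta}}}, \qquad \operatorname{cov} \bar{\tilde{\theta}} = R_{\bar{\tilde{\theta}},M} = I_M \otimes R_{\bar{\tilde{\theta}}}. \tag{33} $$

A given complex valued equation

$$ y = Ax, \qquad y_R + i y_I = (A_R + i A_I)(x_R + i x_I) \tag{34} $$

can be reformulated as a real valued relation

$$ \begin{bmatrix} y_R \\ y_I \end{bmatrix} = \begin{bmatrix} A_R & -A_I \\ A_I & A_R \end{bmatrix} \begin{bmatrix} x_R \\ x_I \end{bmatrix} \;\Leftrightarrow\; \begin{bmatrix} y_R \\ y_I \end{bmatrix} = \begin{bmatrix} x_R^T \otimes I_n & -x_I^T \otimes I_n \\ x_I^T \otimes I_n & x_R^T \otimes I_n \end{bmatrix} \begin{bmatrix} \theta_R \\ \theta_I \end{bmatrix}. \tag{35} $$

Using this fact, the system equation (1) can be rewritten as

$$ \bar{y}_k = D_k(\bar{x}_0)\, \bar{\theta} + D_k(\bar{x}_0)\, \bar{\tilde{\theta}}_k, \quad k = 1, \dots, M, \tag{36} $$

where

$$ D_k(\bar{x}_0) = \begin{bmatrix} (x_{0R} + x_{kR})^T \otimes I_n & -(x_{0I} + x_{kI})^T \otimes I_n \\ (x_{0I} + x_{kI})^T \otimes I_n & (x_{0R} + x_{kR})^T \otimes I_n \end{bmatrix} \in \mathbb{R}^{2n \times 2n^2}. \tag{37} $$

If all experiments are stacked in a tall vector

$$ \bar{y} = \begin{bmatrix} \bar{y}_1^T & \bar{y}_2^T & \dots & \bar{y}_M^T \end{bmatrix}^T = B(\bar{x}_0)\, \bar{\theta} + C(\bar{x}_0)\, \bar{\tilde{\theta}}, \tag{38} $$

where

$$ B(\bar{x}_0) = \begin{bmatrix} D_1(\bar{x}_0) \\ \vdots \\ D_M(\bar{x}_0) \end{bmatrix}, \qquad C(\bar{x}_0) = \begin{bmatrix} D_1(\bar{x}_0) & & 0 \\ & \ddots & \\ 0 & & D_M(\bar{x}_0) \end{bmatrix}. \tag{39} $$

The covariance matrix of the residual term $C(\bar{x}_0) \bar{\tilde{\theta}}$ is denoted by

$$ Q(\bar{x}_0) = C(\bar{x}_0)\, R_{\bar{\tilde{\theta}},M}\, C^T(\bar{x}_0) \in \mathbb{R}^{2nM \times 2nM}, \tag{40} $$

which is a function of the unknown variable $\bar{x}_0$. Similarly to the approaches A1 and A2, an estimate of $x_0$ is found by minimizing a quadratic criterion. However, in order to make the covariance matrix of the estimation error minimal, the equations should be weighted with the inverse of $Q$ (Söderström & Stoica, 1989). The criterion then reads

$$ V(\bar{x}, \bar{\theta}) = \left\| \bar{y} - B(\bar{x}) \bar{\theta} \right\|^2_{Q^{-1}(\bar{x})}. \tag{41} $$

Minimization of $V$ with respect to $\bar{\theta}$ is straightforward. For a fixed value of $\bar{x} = \bar{x}^*$, the minimum is (Söderström & Stoica, 1989)

$$ \hat{\bar{\theta}} = \left( B^T(\bar{x}^*) Q^{-1}(\bar{x}^*) B(\bar{x}^*) \right)^{-1} B^T(\bar{x}^*) Q^{-1}(\bar{x}^*)\, \bar{y} \tag{42} $$

and insertion of (42) into (41) yields a concentrated loss function

$$ \begin{aligned} W(\bar{x}) = \min_{\bar{\theta}} V(\bar{x}, \bar{\theta}) &= \left\| \bar{y} - B \left( B^T Q^{-1} B \right)^{-1} B^T Q^{-1} \bar{y} \right\|^2_{Q^{-1}} \\ &= \left[ \bar{y}^T - \bar{y}^T Q^{-1} B \left( B^T Q^{-1} B \right)^{-1} B^T \right] Q^{-1} \left[ \bar{y} - B \left( B^T Q^{-1} B \right)^{-1} B^T Q^{-1} \bar{y} \right] \\ &= \bar{y}^T Q^{-1/2} \left[ I_{2nM} - Q^{-1/2} B \left( B^T Q^{-1} B \right)^{-1} B^T Q^{-1/2} \right] Q^{-1/2} \bar{y}, \end{aligned} \tag{43} $$

where the dependence on $\bar{x}$ is dropped for brevity. The concentrated loss function (43) can be formulated as

$$ W(\bar{x}) = \bar{y}^T Q^{-1/2}(\bar{x})\, \Pi^{\perp}(\bar{x})\, Q^{-1/2}(\bar{x})\, \bar{y}, \tag{44} $$

where $\Pi^{\perp}$ is the orthogonal projector onto the null-space of $B^T Q^{-1/2}$ and it is given by

$$ \Pi^{\perp} = I_{2nM} - Q^{-1/2} B \left( B^T Q^{-1} B \right)^{-1} B^T Q^{-1/2}. \tag{45} $$

The parameter estimation problem becomes a two-step procedure:

$$ \hat{\bar{x}}_0 = \arg\min_{\bar{x}} W(\bar{x}) \tag{46} $$

$$ \hat{\bar{\theta}} = \left( B^T(\hat{\bar{x}}_0) Q^{-1}(\hat{\bar{x}}_0) B(\hat{\bar{x}}_0) \right)^{-1} B^T(\hat{\bar{x}}_0) Q^{-1}(\hat{\bar{x}}_0)\, \bar{y}. \tag{47} $$

By the separation into two estimation steps the complexity of the optimization problem has been significantly reduced. Minimization of the original loss function (41) would require a nonlinear search over $2(n^2 + n)$ unknown parameters. By use of the concentrated loss function (43), the problem is reduced to a nonlinear minimization over $2n$ variables and a simple weighted linear least squares fit to find the remaining $2n^2$ unknown parameters. The second step is only needed if the nuisance variable $A$ is of any importance.

The optimization problem (46) is often referred to as a variable projection problem (Golub & Pereyra, 1973). Such optimization problems frequently appear in sensor array processing (Viberg & Ottersten, 1991) and in many other applications (Golub & Pereyra, 2003). However, the fact that $Q$ in (44) is a function of the unknown variable is quite uncommon. Notice that $Q$ depends on the uncertainty covariance matrix $R_{\bar{\tilde{\theta}}}$ through (40). Therefore, $R_{\bar{\tilde{\theta}}}$ needs to be a priori known or estimated.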
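To make the construction of the concentrated loss concrete, the sketch below builds $D_k$, $B$ and $Q$ from (37), (39) and (40) and evaluates $W$ for a candidate unbalance. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the trial masses `X` are hypothetical, a Cholesky factor $L$ with $Q = LL^T$ is used for whitening in place of the symmetric square root $Q^{-1/2}$ (this gives the same value of $W$), and the optional `eps` term adds a small multiple of the identity to $Q$ in case it is ill conditioned. The scalar values of $A$, $x_0$, $M$ and the covariance are borrowed from the $n = 1$ example in Eq. (56).

```python
import numpy as np

def D(x):
    """Real-valued regressor of Eq. (37) for the complex vector x = x0 + x_k."""
    I_n = np.eye(x.size)
    re = np.kron(x.real[None, :], I_n)
    im = np.kron(x.imag[None, :], I_n)
    return np.block([[re, -im], [im, re]])

def concentrated_loss(x0_bar, Y, X, R_theta, eps=0.0):
    """Evaluate W of Eq. (44) at x0_bar = [Re(x0); Im(x0)].
    Y, X: complex n x M arrays with columns y_k and x_k.
    R_theta: covariance of the real-valued disturbance (2n^2 x 2n^2)."""
    n, M = Y.shape
    x0 = x0_bar[:n] + 1j * x0_bar[n:]
    blocks = [D(x0 + X[:, k]) for k in range(M)]
    B = np.vstack(blocks)                                  # Eq. (39)
    y_bar = np.concatenate([np.r_[Y[:, k].real, Y[:, k].imag] for k in range(M)])
    # Q = C R_M C^T is block diagonal with blocks D_k R D_k^T, Eq. (40).
    Q = eps * np.eye(2 * n * M)
    for k, Dk in enumerate(blocks):
        s = slice(2 * n * k, 2 * n * (k + 1))
        Q[s, s] += Dk @ R_theta @ Dk.T
    L = np.linalg.cholesky(Q)                              # whitening factor
    y_w, B_w = np.linalg.solve(L, y_bar), np.linalg.solve(L, B)
    theta_bar, *_ = np.linalg.lstsq(B_w, y_w, rcond=None)  # Eq. (42)
    r = y_w - B_w @ theta_bar
    return float(r @ r)

# Scalar case (n = 1) with noise-free data; the trial masses are hypothetical.
A_true, x0_true = 1 + 0.78j, 0.55
X = np.array([[0, 1, -1, 1j, -1j, 0.5 + 0.5j, -0.5 + 0.5j]])
Y = A_true * (x0_true + X)
R_theta = 1e-3 * np.eye(2)

W_true = concentrated_loss(np.array([0.55, 0.0]), Y, X, R_theta)
W_off = concentrated_loss(np.array([0.75, 0.0]), Y, X, R_theta)
```

With noise-free data the loss is zero (up to rounding) at the true unbalance and strictly larger elsewhere, which is the property a numerical search over the $2n$ real variables relies on. Only the nonlinear variable $\bar{x}$ is passed in; the nuisance parameter $\bar{\theta}$ is eliminated inside the function by the weighted least-squares fit.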
4.1. Statistical Properties of A3

First notice that the outcome $\hat{\bar{x}}_0$ from the optimization (46) is such that

$$ W^{(k)}(\hat{\bar{x}}_0) = 0 \tag{48} $$

for a successful minimization. Assume that the estimate $\hat{\bar{x}}_0$ lies in a neighborhood close to the true value $\bar{x} = \bar{x}_0$, i.e. $\hat{\bar{x}}_0 = \bar{x}_0 + \tilde{x}$, where $\tilde{x}$ is small. Then (see e.g. Ljung (1999) and Söderström and Stoica (1989)),

$$ 0 = \left. \frac{\partial^T W}{\partial \bar{x}} \right|_{\bar{x} = \hat{\bar{x}}_0} = \left. \frac{\partial^T W}{\partial \bar{x}} \right|_{\bar{x} = \bar{x}_0 + \tilde{x}} \approx \left. \frac{\partial^T W}{\partial \bar{x}} \right|_{\bar{x} = \bar{x}_0} + \left. \frac{\partial^2 W}{\partial \bar{x}^2} \right|_{\bar{x} = \bar{x}_0} \tilde{x}. \tag{49} $$

Remember that $W^{(k)} = \partial W / \partial [\bar{x}]_k$, where $[\bar{x}]_k$ is the $k$-th element of $\bar{x}$, see Section 2. Eq. (49) implies that the estimation error approximately is

$$ \tilde{x} = -\left( \frac{\partial^2 W}{\partial \bar{x}^2} \right)^{-1} \frac{\partial^T W}{\partial \bar{x}}, \tag{50} $$

where the derivatives should be evaluated at $\bar{x} = \bar{x}_0$. The accuracy of the estimate then becomes

$$ \operatorname{cov}(\tilde{x}) = \left( \frac{\partial^2 W}{\partial \bar{x}^2} \right)^{-1} \operatorname{cov}\!\left( \frac{\partial^T W}{\partial \bar{x}} \right) \left( \frac{\partial^2 W}{\partial \bar{x}^2} \right)^{-1}. \tag{51} $$

We are now ready to give the main result of this section:

Lemma 2. The estimation procedure A3 yields

$$ E\, \hat{\bar{x}}_0 = \bar{x}_0 + O\!\left( E \| \tilde{A}_k \|^2 \right) \tag{52} $$

and the accuracy is for large SNR given by

$$ \operatorname{cov} \hat{\bar{x}}_0 = H^{-1} G R_{\bar{\tilde{\theta}},M} G^T H^{-1}, \tag{53} $$

where

$$ [H]_{kl} = 2\, \bar{\theta}^T B^{T(k)} Q^{-1/2}\, \Pi^{\perp} Q^{-1/2} B^{(l)} \bar{\theta}, \quad (k, l) = 1, \dots, 2n, \tag{54} $$

$$ [G]_{k,:} = 2\, \bar{\theta}^T B^{T(k)} Q^{-1/2}\, \Pi^{\perp} Q^{-1/2} C, \quad k = 1, \dots, 2n, \tag{55} $$

where $[G]_{k,:}$ means row $k$ of the matrix $G$.

Proof. See Appendix B.

4.2. Computational aspects

The loss function (44) is a nonlinear function of the unknown variable $\bar{x}_0$. Therefore, numerical optimization is needed in order to compute the estimate $\hat{\bar{x}}_0$. For this purpose, there are some computational issues that need to be addressed.

Any optimization routine needs to be started with an initial guess of the minimizing variable. Instead of just choosing e.g. $\hat{\bar{x}}_0 = 0$, the optimization is initialized with the outcome from the procedure A2.

It is not easily seen from the expression (44) whether local minima exist. So far, no problems with convergence to inaccurate estimates have been experienced. If $n = 1$, it is possible to visually depict the level curves of the concentrated loss function. Such an example is shown in Fig. 2. Here, the number of experiments is $M = 7$ and

$$ A = 1 + 0.78i, \quad x_0 = 0.55, \quad R_{\bar{\tilde{\theta}}} = \operatorname{cov} \bar{\tilde{\theta}}_k = 10^{-3} I_2. \tag{56} $$

Fig. 2. Level curves of the loss function. The true parameter value is $x_0 = 0.55$.

The figure shows that at least in this case the loss function is well behaved.

In most applications, the covariance matrix $Q$ should be positive definite. However, situations where it is ill conditioned or rank deficient may occur. Such situations need to be taken care of. It can be done using regularization,

$$ Q = C R_{\bar{\tilde{\theta}},M} C^T + \varepsilon I_{2nM}, $$

where $\varepsilon$ is a small real number.

In order to use approach A3, the statistics of the uncertainty must be known or estimated beforehand. The good news is that only the structure of $R_{\bar{\tilde{\theta}}}$ and not its absolute value is of importance. A scaling of the covariance matrix will only scale the loss function (44). Thus, the value of $\hat{\bar{x}}_0$ that minimizes the criterion (44) will remain the same.

When the projection matrix $\Pi^{\perp}$ is computed, the effects of rounding errors may become significant. Therefore, it should be computed in a numerically sound way. First, rewrite (45) as (Mahata, 2003)

$$ \Pi^{\perp} = I_{2nM} - M M^{\dagger}, \qquad M = Q^{-1/2} B \tag{57} $$

and perform the QR factorization

$$ M = \mathcal{Q} \mathcal{R} = \begin{bmatrix} \mathcal{Q}_1 & \mathcal{Q}_2 \end{bmatrix} \begin{bmatrix} \mathcal{R}_1 \\ 0 \end{bmatrix} = \mathcal{Q}_1 \mathcal{R}_1, \tag{58} $$

where $\mathcal{Q}$ is an orthogonal matrix and $\mathcal{R}_1$ is upper triangular. This gives

$$ M^{\dagger} = \left( \mathcal{R}_1^T \mathcal{Q}_1^T \mathcal{Q}_1 \mathcal{R}_1 \right)^{-1} \mathcal{R}_1^T \mathcal{Q}_1^T = \mathcal{R}_1^{-1} \mathcal{Q}_1^T, \tag{59} $$

where the last equality follows from the orthogonality of $\mathcal{Q}_1$. Using this result, the projection matrix can be written as

$$ \Pi^{\perp} = I_{2nM} - \mathcal{Q}_1 \mathcal{R}_1 \mathcal{R}_1^{-1} \mathcal{Q}_1^T = I_{2nM} - \mathcal{Q}_1 \mathcal{Q}_1^T = \mathcal{Q}_2 \mathcal{Q}_2^T. \tag{60} $$

Eq. (60) is less sensitive to rounding errors compared to direct computation of (45). In addition, the use of $\mathcal{Q}_2$ forces (60) to be positive semidefinite. Therefore, the QR decomposition approach should be used for the numerical computations.

Many optimization routines converge in fewer iterations if the analytical value of the gradient of the loss function is supplied in each step. Such expressions are given in Nauclér (2008), for any $\bar{x} \neq \bar{x}_0$.
5. Numerical illustration

Below we evaluate the approaches A1, A2 and A3 for the separator model presented earlier in Sections 3 and 4. In Nauclér (2008) some further numerical examples are provided, that point in the same direction: The approach A2 gives much better results (much smaller estimation errors) than A1, and A3 gives considerably better results than A1. All error variances decrease with an increased number of experiments, which is expected. The benefit of adding additional experiments is, however, much smaller for A1 compared to the other two approaches. Monte Carlo simulation studies produce results that for all approaches are very similar to the results predicted by the theory (such as Lemmas 1 and 2).

Consider a model of a separator as described in Section 1 and shown in Fig. 1. It is a 2-dimensional model with 12 degrees of freedom. The beam to which the separator bowl is attached is however modeled with the Euler–Bernoulli partial differential equation. The masses of the bowl and the frames are of the order of hundreds of kilograms. The stiffnesses are modeled using the concept of hysteretic damping. It means that they are modeled as complex valued stiffnesses, which is a way to introduce damping in the system. The damping does not change with frequency, in contrast to viscous damping.

The complex valued stiffnesses are subject to change between experiments, which leads to the uncertainty term. Between each experiment, each stiffness varies uniformly $\pm 1\%$ around its nominal value. The modeling is quite extensive and the details are left out on purpose in order to make the presentation compact. The system model becomes

$$ y_k = (A + \tilde{A}_k)(x_0 + x_k), \quad k = 1, \dots, M, \tag{61} $$

where

$$ A = 10^{-4} \begin{bmatrix} 0.0095 - 0.5335i & 0.0036 + 0.1743i \\ 0.0089 - 0.4344i & 0.0017 + 0.1932i \end{bmatrix} \; [\mathrm{m/(s\,g)}], \qquad x_0 = \begin{bmatrix} 21\, e^{\frac{37\pi}{180} i} \\ 17\, e^{\frac{111\pi}{180} i} \end{bmatrix} \; [\mathrm{g}]. \tag{62} $$

The unit of $A$ depends on the fact that the measured quantity is in [m/s] and the applied masses are in grams [g]. The quantities are complex valued since they are associated with a magnitude and an angular position. The structure of the covariance matrix $R_{\bar{\tilde{\theta}}}$ is depicted in Fig. 3.

Fig. 3. The structure of $R_{\bar{\tilde{\theta}}}$ (axes: column #, row #). Each square shows the magnitude of the corresponding element in $R_{\bar{\tilde{\theta}}}$. The matrix is scaled so that the greatest element has unit magnitude.

In order to use A3, the statistics of the uncertainty must be known or estimated somehow. Two scenarios are evaluated here. The first is that the statistics of the uncertainty is fully known. The other scenario is that it is completely unknown and therefore $R_{\bar{\tilde{\theta}}} = I$ is employed. The latter choice clearly deviates from the

dependent. If $M > 3$, the further experiments are drawn from a statistical distribution

$$ x_k = \begin{bmatrix} [x_k]_1 \\ [x_k]_2 \end{bmatrix} = \begin{bmatrix} m_1 e^{\varphi_1 i} \\ m_2 e^{\varphi_2 i} \end{bmatrix}, \quad k \ge 4, \tag{64} $$

where

$$ m_i \in \{30, 40, 50, 60\} \; [\mathrm{g}], \qquad \varphi_i \in U(0, 2\pi) \; [\mathrm{rad}]. \tag{65} $$

All values of $m_i$ are equally probable and $U(0, 2\pi)$ is a discrete uniform distribution with resolution 1 degree. Not too much effort is put on choosing good candidates for trial masses. Instead, the masses are changed according to (65) for each new Monte Carlo realization. The purpose of this procedure is to diminish the effect of specific choices of $x_k$ and instead put the focus on the performance of the estimators.

Monte Carlo simulations are used to evaluate the performance of the three estimation algorithms. The covariance matrices of the estimates are computed using 300 realizations for each value of $M$. The result is shown in Fig. 4. The figure shows that if the true covariance matrix of the uncertainty is known, the nonlinear estimation method A3 outperforms A1 and A2. Even with the ad hoc choice $R_{\bar{\tilde{\theta}}} = I$, A3 gives better performance compared to A1 and A2. Such a choice is probably natural if the statistics of the uncertainty is completely unknown. In reality, user choices of $R_{\bar{\tilde{\theta}}}$ would probably lead to a performance of A3 that lies somewhere in between the curves marked with squares. Thus, better knowledge about the system at hand is expected to yield better estimates.

Finally, we show in Fig. 5 a histogram plot of the estimation
R 8 error for M = 14. The error of [x 0 ]1 = Re([x0 ]1 ) is shown. It
true covariance matrix as depicted in Fig. 3. Still, the algorithm A3 can be seen that the estimation error is centered around zero and
can be used, but the weighting is no longer optimal. Therefore, it = R is
the distribution is by far most narrow when A3 with R
is not necessarily so that A3 should perform better than the other
employed.
two approaches in this case.
Each trial weight [xk ]i has certain mass mi and angular position
i , relative to a reference position in the bowl. Typically, x1 = 0, 6. Conclusions
since in the first experiment it is decided if balancing is at all
needed. Thus, if balancing is needed, the first experiment is for An estimation problem which is motivated by the application of
unbalance estimation of rotating machinery has been considered.
free. In this example M 3 is required and it is chosen to use
Two different estimation techniques (A1 and A3) are derived and
analyzed with respect to their respective statistical property. In
[ ]
0 30 30
x1 x2 x3 = (63) addition, an approach (A2) based on A1, is discussed and compared
0 30 30
to the other two approaches using a numerical example. The
as the trial masses (in grams) for the first three experiments. This estimation problem is special in the way that the disturbance is
is done to ensure that the trial masses do not become too linearly entering the system equations. Instead of noisy measurements
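(ordinary least squares problems) or noisy inputs (errors-in-variables problems), the main source of uncertainty is here considered to act on the system parameters in a stochastic fashion.

An example of unbalance estimation of a separator is considered for evaluation of the estimators. Here, it is shown that the accuracy can be significantly improved if the nonlinear estimation approach A3 is employed. This is particularly so if the number of experiments is increased. In such circumstances, it matters very much how the estimation is performed. The nonlinear approach A3 may then perform considerably better than the linear estimators A1 and A2. The analytical accuracy expressions could be employed as a basis for experiment design, i.e. the problem of finding a sequence of $x_k$ that minimizes the estimation error.

Fig. 4. Performance of the different estimators for the separator example.

Fig. 5. Histogram plot of the estimation error of the real part of $[x_0]_1$. The number of realizations is 300.

Acknowledgement

We are grateful to Dr. Lars Hillström at Alfa Laval Machine Dynamics for fruitful discussions and for letting us use the separator model.

Appendix A. Proof of Lemma 1

The identification procedure is derived while neglecting the effects of $\Delta A_k$. In the presence of this disturbance, (19) and (20) modify to

$$z_k = A x_k - \Delta A_1 x_0 + \Delta A_k (x_0 + x_k) = (x_k^T \otimes I_n)\,\theta - (x_0^T \otimes I_n)\,\Delta\theta_1 + \big((x_0 + x_k)^T \otimes I_n\big)\,\Delta\theta_k, \quad k = 2, \ldots, M. \tag{A.1}$$

The z vector (21) then becomes

$$z = \Phi_1 \theta - \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} \otimes (x_0^T \otimes I_n)\,\Delta\theta_1 + \begin{bmatrix} (x_0 + x_2)^T \otimes I_n & & 0 \\ & \ddots & \\ 0 & & (x_0 + x_M)^T \otimes I_n \end{bmatrix} \begin{bmatrix} \Delta\theta_2 \\ \vdots \\ \Delta\theta_M \end{bmatrix} \tag{A.2}$$

$$= \Phi_1 \theta + \begin{bmatrix} -C_{1b} & C_{1a} \end{bmatrix} \Delta\theta, \tag{A.3}$$

with $C_{1a}$ and $C_{1b}$ as defined in (27) and (28), respectively. The first step of the estimation procedure is to compute an estimate of $\theta$, as in (23)

$$\hat{\theta} = \Phi_1^{\dagger} z \tag{A.4}$$

$$= \theta + \Phi_1^{\dagger} \begin{bmatrix} -C_{1b} & C_{1a} \end{bmatrix} \Delta\theta \tag{A.5}$$

$$\triangleq \theta + \Delta\hat{\theta}, \tag{A.6}$$

where

$$\Delta\hat{\theta} = \Phi_1^{\dagger} \begin{bmatrix} -C_{1b} & C_{1a} \end{bmatrix} \Delta\theta. \tag{A.7}$$

Thus, the estimate of A can be written as

$$\hat{A} = A + \Delta\hat{A}, \tag{A.8}$$

where $\Delta\hat{A}$ is formed from $\Delta\hat{\theta}$, i.e. $\mathrm{vec}(\Delta\hat{A}) = \Delta\hat{\theta}$. Next, let

$$m = A x_0. \tag{A.9}$$

The use of (17) implies that

$$\hat{m} = y_1 = m + \Delta A_1 x_0, \tag{A.10}$$

which follows from (1). Eq. (A.10) can be rewritten as

$$\hat{m} = m + \Delta m, \tag{A.11}$$

where

$$\Delta m = (x_0^T \otimes I_n)\,\Delta\theta_1. \tag{A.12}$$

We will next use the series expansion

$$(A + \Delta A)^{-1} \approx A^{-1} - A^{-1} \Delta A\, A^{-1} + \cdots \tag{A.13}$$

where the second order terms can be omitted if $\Delta A$ is small compared to A in the sense $\|\Delta A\| \ll \|A\|$.

Using (24), (A.8), (A.11) and (A.13), the estimate of $x_0$ can be written as

$$\begin{aligned} \hat{x}_0 &= (A + \Delta\hat{A})^{-1} (m + \Delta m) \\ &= (A^{-1} - A^{-1} \Delta\hat{A}\, A^{-1} + \cdots)(m + \Delta m) \\ &= x_0 + A^{-1} (\Delta m - \Delta\hat{A}\, x_0) + \cdots \\ &= x_0 + A^{-1} \big( (x_0^T \otimes I_n)\,\Delta\theta_1 - (x_0^T \otimes I_n)\,\Delta\hat{\theta} \big) + \cdots \\ &= x_0 + A^{-1} (x_0^T \otimes I_n) \big( \Delta\theta_1 - \Phi_1^{\dagger} \begin{bmatrix} -C_{1b} & C_{1a} \end{bmatrix} \Delta\theta \big) + \cdots \end{aligned} \tag{A.14}$$

$$\approx x_0 + A^{-1} (x_0^T \otimes I_n) \begin{bmatrix} (I_{n^2} + \Phi_1^{\dagger} C_{1b}) & -\Phi_1^{\dagger} C_{1a} \end{bmatrix} \Delta\theta, \tag{A.15}$$

where the approximation in (A.15) follows from the fact that $\Delta A_k$ is assumed to be much smaller than A. Therefore, $\Delta\hat{A}$ is also much smaller than A. From (A.14) it is concluded that

$$E\,\hat{x}_0 = x_0 + O\big(E\,\|\Delta A_k\|^2\big) \tag{A.16}$$

since the error term in (A.15) is linear in $\Delta\theta$, which has zero mean. For large SNR, (A.15) is a valid approximation. Then, the covariance matrix of $\hat{x}_0$ becomes

$$\mathrm{cov}(\hat{x}_0) = A^{-1} C_1\, \mathrm{cov}(\Delta\theta)\, C_1^{*} A^{-*}, \tag{A.17}$$

with $C_1$ given by (26). Furthermore, the covariance matrix of $\Delta\theta$ is given by (11), which concludes the proof. □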
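The first-order error expression behind (A.14) can be checked numerically. The following is a minimal sketch, not the authors' implementation: it simulates the model $y_k = (A + \Delta A_k)(x_0 + x_k)$, forms an influence-coefficient style estimate by least squares on the differences $z_k = y_k - y_1$, and compares the actual error with the first-order prediction $A^{-1}(\Delta A_1 x_0 - \Delta\hat{A}\, x_0)$. The matrix `A`, the trial masses `X`, the perturbation level `eps`, and the function names are hypothetical choices made only for illustration; only the form of the model and the values 21 g at 37 degrees and 17 g at 111 degrees for `x0` are taken from the text.

```python
import numpy as np

def simulate(A, x0, X, dA):
    """Responses y_k = (A + dA[k]) (x0 + x_k); X is n x M, dA is M x n x n."""
    cols = [(A + dA[k]) @ (x0 + X[:, k]) for k in range(X.shape[1])]
    return np.stack(cols, axis=1)

def a1_estimate(Y, X):
    """Least squares fit of z_k = A x_k (k >= 2), then x0_hat = A_hat^{-1} y_1.

    Assumes the first trial mass X[:, 0] is zero.
    """
    Z = Y[:, 1:] - Y[:, [0]]                 # z_k = y_k - y_1
    A_hat = Z @ np.linalg.pinv(X[:, 1:])     # minimizes ||Z - A X||_F
    x0_hat = np.linalg.solve(A_hat, Y[:, 0])
    return A_hat, x0_hat

rng = np.random.default_rng(0)

# Hypothetical, well-conditioned system matrix (not the separator model).
A = np.array([[1.0 + 0.5j, 0.2 - 0.1j],
              [0.0 + 0.3j, 0.8 - 0.4j]])
x0 = np.array([21 * np.exp(1j * np.deg2rad(37)),
               17 * np.exp(1j * np.deg2rad(111))])
# Hypothetical trial masses; the first experiment uses no trial mass.
X = np.array([[0, 30, 30, 40j],
              [0, 30, -30, 50]], dtype=complex)

eps = 1e-3  # assumed size of the parameter perturbations
dA = eps * (rng.standard_normal((4, 2, 2)) + 1j * rng.standard_normal((4, 2, 2)))

# Without perturbations the procedure recovers A and x0.
Y_clean = simulate(A, x0, X, np.zeros_like(dA))
A_clean, x0_clean = a1_estimate(Y_clean, X)

# With perturbations, compare the actual error with the first-order term.
Y = simulate(A, x0, X, dA)
A_hat, x0_hat = a1_estimate(Y, X)
err = x0_hat - x0
err_first_order = np.linalg.solve(A, dA[0] @ x0 - (A_hat - A) @ x0)

print("actual error:      ", err)
print("first-order error: ", err_first_order)
```

For small perturbations the two printed vectors should agree up to terms of second order in the perturbation size, which is the content of the truncation in (A.13).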
Appendix B. Proof of Lemma 2

We first need a number of preliminary results.

To analyze the approach A3 statistically, we need to evaluate the gradient and the Hessian,

$$\left( \frac{\partial W}{\partial x} \right)^T = \begin{bmatrix} W^{(1)} \\ \vdots \\ W^{(2n)} \end{bmatrix}, \tag{B.1}$$

$$\frac{\partial^2 W}{\partial x^2} = \begin{bmatrix} W^{(11)} & W^{(12)} & \cdots & W^{(1(2n))} \\ W^{(21)} & W^{(22)} & \cdots & W^{(2(2n))} \\ \vdots & \vdots & \ddots & \vdots \\ W^{((2n)1)} & W^{((2n)2)} & \cdots & W^{((2n)(2n))} \end{bmatrix}, \tag{B.2}$$

of the loss function and evaluate them at $x = x_0$. In order to accomplish this it is useful to rewrite the criterion function (44) as

$$W(x) = y^T Q^{-1}(x) \big( I_{2nM} - B(x) P(x) \big) y, \tag{B.3}$$

where

$$P(x) = [B^T(x) Q^{-1}(x) B(x)]^{-1} B^T(x) Q^{-1}(x). \tag{B.4}$$

The matrix P has some useful properties that are summarized in what follows, in a series of propositions. For notational convenience the dependence on x is dropped.

Proposition 1.

$$PB = I_{2nM}. \tag{B.5}$$

Proof. The result directly follows from the definition (B.4) of P. □

Proposition 2.

$$P^{(k)} = [B^T Q^{-1} B]^{-1} \big( B^{T(k)} Q^{-1} + B^T Q^{(-k)} \big) (I_{2nM} - BP) - P B^{(k)} P. \tag{B.6}$$

Proof. Application of the chain rule and the rule for differentiation of matrix inverses yields

$$\begin{aligned} P^{(k)} &= \big\{ [B^T Q^{-1} B]^{-1} B^T Q^{-1} \big\}^{(k)} \\ &= \big\{ [B^T Q^{-1} B]^{-1} \big\}^{(k)} B^T Q^{-1} + [B^T Q^{-1} B]^{-1} \big( B^{T(k)} Q^{-1} + B^T Q^{(-k)} \big) \\ &= -[B^T Q^{-1} B]^{-1} \big( B^{T(k)} Q^{-1} B + B^T Q^{(-k)} B + B^T Q^{-1} B^{(k)} \big) [B^T Q^{-1} B]^{-1} B^T Q^{-1} \\ &\quad + [B^T Q^{-1} B]^{-1} \big( B^{T(k)} Q^{-1} + B^T Q^{(-k)} \big) \\ &= [B^T Q^{-1} B]^{-1} \big( B^{T(k)} Q^{-1} + B^T Q^{(-k)} \big) (I_{2nM} - BP) - P B^{(k)} P. \end{aligned}$$
□

A very useful consequence of the first proposition is

Proposition 3.

$$P^{(k)} B = -P B^{(k)}. \tag{B.7}$$

Proof. Application of the chain rule on (B.5) yields

$$P^{(k)} B + P B^{(k)} = 0 \ \Rightarrow \ P^{(k)} B = -P B^{(k)}, \tag{B.8}$$

which is the desired result. □

The final proposition is related to the second derivatives of P:

Proposition 4.

$$\begin{aligned} [B^T Q^{-1} B]\, P^{(kl)} B &= B^T Q^{-1} \big( B^{(l)} P B^{(k)} + B^{(k)} P B^{(l)} \big) \\ &\quad - \big( B^{T(k)} Q^{-1} + B^T Q^{(-k)} \big) (I_{2nM} - BP) B^{(l)} \\ &\quad - \big( B^{T(l)} Q^{-1} + B^T Q^{(-l)} \big) (I_{2nM} - BP) B^{(k)}. \end{aligned} \tag{B.9}$$

Proof. First notice that $B^{(kl)} = 0$. Next, differentiate (B.6) with respect to $[x]_l$, and make use of the chain rule

$$\begin{aligned} P^{(kl)} &= \big\{ [B^T Q^{-1} B]^{-1} \big( B^{T(k)} Q^{-1} + B^T Q^{(-k)} \big) \big\}^{(l)} (I_{2nM} - BP) \\ &\quad + [B^T Q^{-1} B]^{-1} \big( B^{T(k)} Q^{-1} + B^T Q^{(-k)} \big) \big( -B^{(l)} P - B P^{(l)} \big) - P^{(l)} B^{(k)} P - P B^{(k)} P^{(l)}. \end{aligned} \tag{B.10}$$

Using Propositions 1 and 3 we obtain

$$\begin{aligned} [B^T Q^{-1} B]\, P^{(kl)} B &= -\big( B^{T(k)} Q^{-1} + B^T Q^{(-k)} \big) \big( B^{(l)} P + B P^{(l)} \big) B - B^T Q^{-1} B P^{(l)} B^{(k)} P B - B^T Q^{-1} B P B^{(k)} P^{(l)} B \\ &= -\big( B^{T(k)} Q^{-1} + B^T Q^{(-k)} \big) (I_{2nM} - BP) B^{(l)} - B^T Q^{-1} B P^{(l)} B^{(k)} + B^T Q^{-1} B^{(k)} P B^{(l)} \end{aligned} \tag{B.11}$$

and using Propositions 2 and 3 and some algebraic manipulations, the term that involves $P^{(l)}$ is expanded

$$B^T Q^{-1} B P^{(l)} B^{(k)} = \big( B^{T(l)} Q^{-1} + B^T Q^{(-l)} \big) (I_{2nM} - BP) B^{(k)} - B^T Q^{-1} B^{(l)} P B^{(k)}. \tag{B.12}$$

Finally, combining (B.11) and (B.12) gives the desired result. □

After these technical results we present one lemma needed in order to compute the gradient and Hessian (B.2) of the concentrated loss function W(x), (44).

Lemma 3. Under Assumption 2 ($\|\Delta A_k\| \ll \|A\|$) it holds that

$$W^{(k)}(x_0) \approx -2\, \theta^T B^{T(k)} Q^{-1/2}\, \Pi^{\perp} Q^{-1/2} C\, \Delta\theta, \tag{B.13}$$

$$W^{(kl)}(x_0) \approx 2\, \theta^T B^{T(k)} Q^{-1/2}\, \Pi^{\perp} Q^{-1/2} B^{(l)} \theta. \tag{B.14}$$

Proof. Differentiation of (B.3) yields

$$W^{(k)} = y^T \bar{Q}^{(k)} y, \tag{B.15}$$

where

$$\bar{Q}^{(k)} = Q^{(-k)} (I_{2nM} - BP) - Q^{-1} \big( B^{(k)} P + B P^{(k)} \big). \tag{B.16}$$

Let $y = B(x_0)\theta + C(x_0)\Delta\theta$ as in (38) and evaluate (B.15) at $x = x_0$. This gives

$$W^{(k)} \big|_{x = x_0} = \theta^T B^T \bar{Q}^{(k)} B \theta + 2\, \theta^T B^T \bar{Q}^{(k)} C \Delta\theta + \Delta\theta^T C^T \bar{Q}^{(k)} C \Delta\theta. \tag{B.17}$$

By use of Propositions 1 and 3 it follows that $B^T \bar{Q}^{(k)} B = 0$, so the first term vanishes. Next, it is argued that if $\|\Delta A_k\| \ll \|A\|$, then the term $\Delta\theta^T C^T \bar{Q}^{(k)} C \Delta\theta$ is negligible compared to the middle term of (B.17). It remains to compute

$$2\, \theta^T B^T \bar{Q}^{(k)} C \Delta\theta = 2\, \Delta\theta^T C^T \bar{Q}^{(k)T} B \theta \tag{B.18}$$
and again Propositions 1 and 3 give

$$W^{(k)}(x_0) \approx -2\, \Delta\theta^T C^T Q^{-1} (I_{2nM} - BP) B^{(k)} \theta, \tag{B.19}$$

which can be equivalently written as (B.13).

Next, we want to find an expression for $W^{(kl)}$. Differentiation of (B.15) with respect to $[x]_l$ yields

$$W^{(kl)} = y^T \bar{Q}^{(kl)} y, \tag{B.20}$$

where $\bar{Q}^{(k)}$ is given by (B.16). If the model (38) for y is inserted, one obtains

$$W^{(kl)} = \theta^T B^T \bar{Q}^{(kl)} B \theta + 2\, \theta^T B^T \bar{Q}^{(kl)} C \Delta\theta + \Delta\theta^T C^T \bar{Q}^{(kl)} C \Delta\theta \approx \theta^T B^T \bar{Q}^{(kl)} B \theta. \tag{B.21}$$

The approximation follows from the fact that the term that is quadratic in $\theta$ is nonzero, and from the assumption $\|\Delta A_k\| \ll \|A\|$.

Differentiation of (B.16) with respect to $[x]_l$ yields

$$\begin{aligned} \bar{Q}^{(kl)} &= Q^{(-kl)} (I_{2nM} - BP) - Q^{(-k)} \big( B^{(l)} P + B P^{(l)} \big) - Q^{(-l)} \big( B^{(k)} P + B P^{(k)} \big) \\ &\quad - Q^{-1} \big( B^{(k)} P^{(l)} + B^{(l)} P^{(k)} + B P^{(kl)} \big). \end{aligned} \tag{B.22}$$

Computation of (B.21) with application of Propositions 1–4, $B^{(kl)} = 0$, and evaluation at $x = x_0$ yields

$$\begin{aligned} W^{(kl)} &\approx \theta^T B^T \bar{Q}^{(kl)} B \theta \\ &= \theta^T \Big\{ 0 - B^T Q^{(-k)} \big( B^{(l)} - B P B^{(l)} \big) - B^T Q^{(-l)} \big( B^{(k)} - B P B^{(k)} \big) \\ &\qquad + B^T Q^{-1} \big( B^{(k)} P B^{(l)} + B^{(l)} P B^{(k)} \big) - B^T Q^{-1} B P^{(kl)} B \Big\} \theta \\ &= \theta^T \Big\{ B^{T(k)} Q^{-1} (I_{2nM} - BP) B^{(l)} + B^{T(l)} Q^{-1} (I_{2nM} - BP) B^{(k)} \Big\} \theta \end{aligned} \tag{B.23}$$

$$= 2\, \theta^T B^{T(k)} Q^{-1} (I_{2nM} - BP) B^{(l)} \theta, \tag{B.24}$$

where (B.23) follows from Proposition 4 and some algebra. Eq. (B.24) follows since $Q^{-1}(I_{2nM} - BP)$ is a symmetric matrix. The expression (B.24) can be equivalently written as (B.14). □

Proof of Lemma 2. Using (49) and Lemma 3, the estimation error can be written as

$$\tilde{x} = H^{-1} G\, \Delta\theta + O\big( \|\Delta\theta\|^2 \big), \tag{B.25}$$

which is consistent with (52), since H and G are constant matrices. For large SNR, (50) is a valid approximation and the covariance matrix of $\tilde{x}$ satisfies $\mathrm{cov}(\tilde{x}) = \mathrm{cov}(\hat{x}_0)$. Then, (53) immediately follows from (51) and Lemma 3. □

Remark 3. The results presented apply for any variable projection problem of the type (44). The only assumption made is that B is linear in x, so that $B^{(kl)} = 0$. If B were a nonlinear function of x, terms that involve $B^{(kl)}$ would appear in the results above. The details needed in order to carry out the final computations for the specific problem at hand are given in Nauclér (2008).

References

Alauze, C., Der Hagopian, J., & Gaudiller, L. (2001). Active balancing of turbomachinery: application to large shaft lines. Journal of Vibration and Control, 7, 249–278.
Blanco-Ortega, A., Beltrán-Carbajal, F., Favela-Contreras, A., & Silva-Navarro, G. (2008). Active disc for automatic balancing of rotor-bearing systems. In American Control Conference, Seattle, Washington, USA, June 11-13 (pp. 2023–2038).
Darlow, M. S. (1987). Balancing of high speed machinery: theory, methods and experimental results. Mechanical Systems and Signal Processing, 1(1), 105–134.
Darlow, M. S. (1989). Balancing of high speed machinery. New York, NY: Springer-Verlag.
Foiles, W. C., Allaire, P. E., & Gunter, E. J. (1998). Review: rotor balancing. Shock and Vibration, 5, 325–336.
Golub, G. H., & Pereyra, V. (1973). The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate. SIAM Journal of Numerical Analysis, 10(2), 413–432.
Golub, G. H., & Pereyra, V. (2003). Separable nonlinear least squares: the variable projection method and its applications. Inverse Problems, 19(2), R1–R26.
Goodman, T. P. (1964). A least-squares method for computing balance corrections. Journal of Engineering for Industry, 86(3), 273–279.
Hillström, L. (2008). Personal communication.
Kang, Y., Chang, Y. P., Tseng, M. H., Tang, P. H., & Chang, Y. F. (2000). A modified approach based on influence coefficient method for balancing crank-shafts. Journal of Sound and Vibration, 234(2), 277–296.
Kang, Y., Tseng, M.-H., Wang, S.-M., Chiang, C.-P., & Wang, C.-C. (2003). An accuracy improvement for balancing crankshafts. Mechanism and Machine Theory, 38, 1449–1467.
Larsson, L. O. (1976). On the determination of the influence coefficients in rotor balancing, using linear regression analysis. In Vibrations in rotating machinery, Cambridge, UK, September (pp. 93–97).
Li, G., Lin, Z., Untaroiu, C., & Allaire, P. E. (2003). Balancing of high-speed rotating machinery using convex optimization. In IEEE conference on decision and control, Maui, Hawaii, USA, December (pp. 4351–4356).
Ljung, L. (1999). System identification (2nd edition). Upper Saddle River, NJ, USA: Prentice-Hall.
Lund, J. W., & Tonnesen, J. (1972). Analysis and experiments on multi-plane balancing of a flexible rotor. Journal of Engineering for Industry, 94(1), 233–242.
Mahata, K. (2003). Estimation using low rank signal models. Ph.D. thesis, Department of Information Technology, Uppsala University, Uppsala, Sweden.
Nauclér, P. (2008). Estimation and control of resonant systems with stochastic disturbances. Ph.D. thesis, Department of Information Technology, Faculty of Science and Technology, Uppsala University, Uppsala, Sweden.
Schneider, H. (1991). Balancing technology. Technical report, Carl Schenck AG.
Sinha, J. K., Friswell, M. I., & Lees, A. W. (2002). The identification of the unbalance and the foundation model of a flexible rotating machine from a single run-down. Mechanical Systems and Signal Processing, 16(2–3), 255–271.
Sinha, J. K., Lees, A. W., & Friswell, M. I. (2004). Estimating unbalance and misalignment of a flexible rotating machine from a single run-down. Journal of Sound and Vibration, 272, 967–989.
Söderström, T., & Stoica, P. (1989). System identification. Hemel Hempstead, United Kingdom: Prentice Hall International.
Tiwari, R., & Chakravarthy, V. (2006). Simultaneous identification of residual unbalances and bearing dynamic parameters from impulse responses of rotor-bearing systems. Mechanical Systems and Signal Processing, 20, 1590–1614.
Viberg, M., & Ottersten, B. (1991). Sensor array processing based on subspace fitting. IEEE Transactions on Signal Processing, 39(5), 1110–1121.
Zhou, S., & Shi, J. (2001). Active balancing and vibration control of rotating machinery: a survey. The Shock and Vibration Digest, 33(5), 361–371.

Peter Nauclér received the M.Sc. degree in engineering physics and the Ph.D. degree in electrical engineering with specialization in automatic control from Uppsala University, Uppsala, Sweden, in 2003 and 2008, respectively. His doctoral work mainly concerned modeling and control of mechanical systems with stochastic disturbances. Since 2008 he has been with Ericsson AB, Stockholm, Sweden, where he works with radio access technologies for fourth generation telecommunication systems.

Torsten Söderström received the M.Sc. degree (civilingenjör) in engineering physics in 1969 and the Ph.D. degree in automatic control in 1973, both from Lund Institute of Technology, Lund, Sweden. He is a Fellow of IEEE, and an IFAC Fellow. During 1967–1974 he held various teaching positions at the Lund Institute of Technology. Since 1974, he has been with the Department of Systems and Control, Uppsala University, Uppsala, Sweden, where he is a professor of automatic control. Dr Söderström is the author or coauthor of many technical papers. His main research interests are in the fields of system identification, signal processing, and control. He is the (co)author of four books: Theory and Practice of Recursive Identification, MIT Press, 1983 (with L Ljung), The Instrumental Variable Methods for System Identification, Springer-Verlag, 1983 (with P Stoica), System Identification, Prentice-Hall, 1989 (with P Stoica) and Discrete-Time Stochastic Systems, Prentice-Hall, 1994; second edition, Springer-Verlag, 2002. In 1981 he was, with coauthors, given an Automatica Paper Prize Award. Within IFAC he has served in several capacities including vice-chairman of the TC on Modelling, Identification and Signal Processing (1993–99), IPC chairman of the IFAC SYSID'94 Symposium, Council member (1996–2002), Executive Board member (1999–2002) and Awards Committee Chair (1999–2002). He was an associate editor (1984–91), guest associate editor and editor for four special issues with Automatica and has been the editor for the area of System Parameter Estimation since 1992.
