Least Squares Fitting of Data
David Eberly
Magic Software, Inc.
http://www.magic-software.com
Created: July 15, 1999
Modified: September 17, 2001
This document describes some algorithms for fitting 2D or 3D point sets by linear or quadratic structures
using least squares minimization.
1 Linear Fitting of 2D Points of the Form (x, f(x))

This is the usual introduction to least squares fit by a line when the data represents measurements where the y-component is assumed to be functionally dependent on the x-component. Given a set of samples {(x_i, y_i)}_{i=1}^{m}, determine A and B so that the line y = Ax + B best fits the samples in the sense that the sum of the squared errors between the y_i and the line values Ax_i + B is minimized. Note that the error is measured only in the y-direction.
Define E(A, B) = \sum_{i=1}^{m} [(Ax_i + B) - y_i]^2. This function is nonnegative and its graph is a paraboloid whose vertex occurs where the gradient satisfies \nabla E = (0, 0). This leads to a system of two linear equations in A and B which can be easily solved. Precisely,

$$ (0, 0) = \nabla E = 2 \sum_{i=1}^{m} [(Ax_i + B) - y_i](x_i, 1) $$
and so

$$ \begin{bmatrix} \sum_{i=1}^{m} x_i^2 & \sum_{i=1}^{m} x_i \\ \sum_{i=1}^{m} x_i & \sum_{i=1}^{m} 1 \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{m} x_i y_i \\ \sum_{i=1}^{m} y_i \end{bmatrix} $$
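As a concrete illustration, the 2x2 system above can be assembled and solved directly. This is a minimal sketch in Python/NumPy; the function name and the use of NumPy's solver are choices of this sketch, not part of the original text:

```python
import numpy as np

def fit_line(points):
    """Fit y = A*x + B by minimizing the sum of (A*x_i + B - y_i)^2.

    Solves the 2x2 normal-equation system derived above.
    """
    x = np.array([p[0] for p in points], dtype=float)
    y = np.array([p[1] for p in points], dtype=float)
    M = np.array([[np.sum(x * x), np.sum(x)],
                  [np.sum(x),     float(len(x))]])
    rhs = np.array([np.sum(x * y), np.sum(y)])
    A, B = np.linalg.solve(M, rhs)
    return A, B
```

For samples taken exactly from y = 2x + 1, the solver recovers A = 2 and B = 1.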
2 Linear Fitting of nD Points Using Orthogonal Distances

It is also possible to fit a line using least squares where the errors are measured orthogonally to the proposed line rather than measured vertically. The following argument holds for sample points and lines in n dimensions. Let the line be L(t) = tD + A, where D is unit length. Define X_i to be the sample points; then

$$ X_i = A + d_i D + p_i D_i^{\perp} $$

where d_i = D \cdot (X_i - A) and D_i^{\perp} is some unit length vector perpendicular to D with appropriate coefficient p_i. Define Y_i = X_i - A. The vector from X_i to its projection onto the line is

$$ Y_i - d_i D = p_i D_i^{\perp}. $$

The squared length of this vector is p_i^2 = (Y_i - d_i D)^2. The energy function for the least squares minimization is E(A, D) = \sum_{i=1}^{m} p_i^2. Two alternate forms for this function are

$$ E(A, D) = \sum_{i=1}^{m} Y_i^{\mathsf{T}} \left[ I - D D^{\mathsf{T}} \right] Y_i $$

and

$$ E(A, D) = D^{\mathsf{T}} \left[ \sum_{i=1}^{m} \left( (Y_i \cdot Y_i) I - Y_i Y_i^{\mathsf{T}} \right) \right] D = D^{\mathsf{T}} M(A) D. $$
Using the first form of E in the previous equation, take the derivative with respect to A to get

$$ \frac{\partial E}{\partial A} = -2 \left[ I - D D^{\mathsf{T}} \right] \sum_{i=1}^{m} Y_i. $$

This partial derivative is zero whenever \sum_{i=1}^{m} Y_i = 0, in which case A = (1/m) \sum_{i=1}^{m} X_i (the average of the sample points).
Given A, the matrix M(A) is determined in the second form of the energy function. The quantity D^{\mathsf{T}} M(A) D is a quadratic form whose minimum is the smallest eigenvalue of M(A). This can be found by standard eigensystem solvers. A corresponding unit length eigenvector D completes our construction of the least squares line.
For n = 2, if A = (a, b), then matrix M(A) is given by

$$ M(A) = \left[ \sum_{i=1}^{m} (x_i - a)^2 + \sum_{i=1}^{m} (y_i - b)^2 \right] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} \sum_{i=1}^{m} (x_i - a)^2 & \sum_{i=1}^{m} (x_i - a)(y_i - b) \\ \sum_{i=1}^{m} (x_i - a)(y_i - b) & \sum_{i=1}^{m} (y_i - b)^2 \end{bmatrix} $$
For n = 3, if A = (a, b, c), then

$$ M(A) = \delta \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} \sum_{i=1}^{m} (x_i - a)^2 & \sum_{i=1}^{m} (x_i - a)(y_i - b) & \sum_{i=1}^{m} (x_i - a)(z_i - c) \\ \sum_{i=1}^{m} (x_i - a)(y_i - b) & \sum_{i=1}^{m} (y_i - b)^2 & \sum_{i=1}^{m} (y_i - b)(z_i - c) \\ \sum_{i=1}^{m} (x_i - a)(z_i - c) & \sum_{i=1}^{m} (y_i - b)(z_i - c) & \sum_{i=1}^{m} (z_i - c)^2 \end{bmatrix} $$

where

$$ \delta = \sum_{i=1}^{m} (x_i - a)^2 + \sum_{i=1}^{m} (y_i - b)^2 + \sum_{i=1}^{m} (z_i - c)^2. $$
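The orthogonal-distance line fit above reduces to a centroid computation plus one symmetric eigensolve. A sketch in Python/NumPy, relying on `numpy.linalg.eigh` returning eigenvalues in ascending order; the function name is illustrative:

```python
import numpy as np

def fit_line_orthogonal(points):
    """Fit a line A + t*D to n-dimensional points, minimizing the sum of
    squared orthogonal distances.

    A is the centroid; D is a unit eigenvector of M(A) corresponding to
    its smallest eigenvalue, as derived above.
    """
    X = np.asarray(points, dtype=float)
    A = X.mean(axis=0)                # average of the sample points
    Y = X - A
    n = X.shape[1]
    # M(A) = (sum |Y_i|^2) I - sum Y_i Y_i^T
    M = np.sum(Y * Y) * np.eye(n) - Y.T @ Y
    evals, evecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    D = evecs[:, 0]                   # eigenvector of the smallest eigenvalue
    return A, D
```

Collinear samples along direction (1, 2)/sqrt(5) are recovered exactly (up to the sign of D, which is inherently ambiguous).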
3 Planar Fitting of 3D Points of the Form (x, y, f(x, y))

The assumption is that the z-component of the data is functionally dependent on the x- and y-components. Given a set of samples {(x_i, y_i, z_i)}_{i=1}^{m}, determine A, B, and C so that the plane z = Ax + By + C best fits the samples in the sense that the sum of the squared errors between the z_i and the plane values Ax_i + By_i + C is minimized. Note that the error is measured only in the z-direction.
Define E(A, B, C) = \sum_{i=1}^{m} [(Ax_i + By_i + C) - z_i]^2. This function is nonnegative and its graph is a hyperparaboloid whose vertex occurs where the gradient satisfies \nabla E = (0, 0, 0). This leads to a system of three linear equations in A, B, and C which can be easily solved. Precisely,

$$ (0, 0, 0) = \nabla E = 2 \sum_{i=1}^{m} [(Ax_i + By_i + C) - z_i](x_i, y_i, 1) $$
and so

$$ \begin{bmatrix} \sum_{i=1}^{m} x_i^2 & \sum_{i=1}^{m} x_i y_i & \sum_{i=1}^{m} x_i \\ \sum_{i=1}^{m} x_i y_i & \sum_{i=1}^{m} y_i^2 & \sum_{i=1}^{m} y_i \\ \sum_{i=1}^{m} x_i & \sum_{i=1}^{m} y_i & \sum_{i=1}^{m} 1 \end{bmatrix} \begin{bmatrix} A \\ B \\ C \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{m} x_i z_i \\ \sum_{i=1}^{m} y_i z_i \\ \sum_{i=1}^{m} z_i \end{bmatrix} $$
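The 3x3 system above can be solved in the same way as the line case. A minimal sketch (function name and NumPy solver are choices of this sketch):

```python
import numpy as np

def fit_plane(points):
    """Fit z = A*x + B*y + C by minimizing the sum of
    (A*x_i + B*y_i + C - z_i)^2.

    Solves the 3x3 normal-equation system derived above.
    """
    x, y, z = (np.asarray(v, dtype=float) for v in zip(*points))
    m = float(len(x))
    M = np.array([[np.sum(x * x), np.sum(x * y), np.sum(x)],
                  [np.sum(x * y), np.sum(y * y), np.sum(y)],
                  [np.sum(x),     np.sum(y),     m]])
    rhs = np.array([np.sum(x * z), np.sum(y * z), np.sum(z)])
    return np.linalg.solve(M, rhs)   # (A, B, C)
```

Samples taken exactly from z = x + 2y + 3 recover (A, B, C) = (1, 2, 3).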
4 Hyperplanar Fitting of nD Points Using Orthogonal Distances

It is also possible to fit a plane using least squares where the errors are measured orthogonally to the proposed plane rather than measured vertically. The following argument holds for sample points and hyperplanes in n dimensions. Let the hyperplane be N \cdot (X - A) = 0, where N is a unit length normal to the hyperplane and A is a point on the hyperplane. Define X_i to be the sample points; then

$$ X_i = A + \lambda_i N + p_i N_i^{\perp} $$

where \lambda_i = N \cdot (X_i - A) and N_i^{\perp} is some unit length vector perpendicular to N with appropriate coefficient p_i. Define Y_i = X_i - A. The vector from X_i to its projection onto the hyperplane is \lambda_i N. The squared length of this vector is \lambda_i^2 = (N \cdot Y_i)^2. The energy function for the least squares minimization is E(A, N) = \sum_{i=1}^{m} \lambda_i^2.
Two alternate forms for this function are

$$ E(A, N) = \sum_{i=1}^{m} Y_i^{\mathsf{T}} \left[ N N^{\mathsf{T}} \right] Y_i $$

and

$$ E(A, N) = N^{\mathsf{T}} \left[ \sum_{i=1}^{m} Y_i Y_i^{\mathsf{T}} \right] N = N^{\mathsf{T}} M(A) N. $$
Using the first form of E in the previous equation, take the derivative with respect to A to get

$$ \frac{\partial E}{\partial A} = -2 \left[ N N^{\mathsf{T}} \right] \sum_{i=1}^{m} Y_i. $$

This partial derivative is zero whenever \sum_{i=1}^{m} Y_i = 0, in which case A = (1/m) \sum_{i=1}^{m} X_i (the average of the sample points).

Given A, the matrix M(A) is determined in the second form of the energy function. The quantity N^{\mathsf{T}} M(A) N is a quadratic form whose minimum is the smallest eigenvalue of M(A). This can be found by standard eigensystem solvers. A corresponding unit length eigenvector N completes our construction of the least squares hyperplane.
For n = 3, if A = (a, b, c), then matrix M(A) is given by

$$ M(A) = \begin{bmatrix} \sum_{i=1}^{m} (x_i - a)^2 & \sum_{i=1}^{m} (x_i - a)(y_i - b) & \sum_{i=1}^{m} (x_i - a)(z_i - c) \\ \sum_{i=1}^{m} (x_i - a)(y_i - b) & \sum_{i=1}^{m} (y_i - b)^2 & \sum_{i=1}^{m} (y_i - b)(z_i - c) \\ \sum_{i=1}^{m} (x_i - a)(z_i - c) & \sum_{i=1}^{m} (y_i - b)(z_i - c) & \sum_{i=1}^{m} (z_i - c)^2 \end{bmatrix} $$
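The hyperplane construction above mirrors the orthogonal line fit, except that here M(A) = \sum Y_i Y_i^{\mathsf{T}} and the normal is the eigenvector of the smallest eigenvalue. A sketch (names illustrative, NumPy's `eigh` assumed):

```python
import numpy as np

def fit_hyperplane_orthogonal(points):
    """Fit a hyperplane N . (X - A) = 0 to n-dimensional points, minimizing
    the sum of squared orthogonal distances.

    A is the centroid; N is a unit eigenvector of M(A) = sum Y_i Y_i^T
    corresponding to its smallest eigenvalue, as derived above.
    """
    X = np.asarray(points, dtype=float)
    A = X.mean(axis=0)           # average of the sample points
    Y = X - A
    M = Y.T @ Y                  # M(A) = sum Y_i Y_i^T
    evals, evecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    N = evecs[:, 0]              # eigenvector of the smallest eigenvalue
    return A, N
```

For coplanar 3D samples in the plane z = 0, the recovered normal is (0, 0, +-1).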
5 Fitting a Circle to 2D Points

Given a set of points {(x_i, y_i)}_{i=1}^{m}, m >= 3, fit them with a circle (x - a)^2 + (y - b)^2 = r^2, where (a, b) is the circle center and r is the circle radius. An assumption of this algorithm is that not all the points are collinear. The energy function to be minimized is
$$ E(a, b, r) = \sum_{i=1}^{m} (L_i - r)^2 $$

where L_i = \sqrt{(x_i - a)^2 + (y_i - b)^2}. Take the partial derivative with respect to r to obtain

$$ \frac{\partial E}{\partial r} = -2 \sum_{i=1}^{m} (L_i - r). $$

Setting this equal to zero yields

$$ r = \frac{1}{m} \sum_{i=1}^{m} L_i. $$

Take the partial derivatives with respect to a and b, set them equal to zero, and rearrange to obtain

$$ a = \frac{1}{m} \sum_{i=1}^{m} x_i + r \frac{1}{m} \sum_{i=1}^{m} \frac{\partial L_i}{\partial a} $$

and

$$ b = \frac{1}{m} \sum_{i=1}^{m} y_i + r \frac{1}{m} \sum_{i=1}^{m} \frac{\partial L_i}{\partial b}. $$

Replacing r by its equivalent from \partial E / \partial r = 0 and using \partial L_i / \partial a = (a - x_i)/L_i and \partial L_i / \partial b = (b - y_i)/L_i, we get two nonlinear equations in a and b:
$$ a = \bar{x} + \bar{L}\,\bar{L}_a =: F(a, b) $$

$$ b = \bar{y} + \bar{L}\,\bar{L}_b =: G(a, b) $$

where

$$ \bar{x} = \frac{1}{m} \sum_{i=1}^{m} x_i, \quad \bar{y} = \frac{1}{m} \sum_{i=1}^{m} y_i, \quad \bar{L} = \frac{1}{m} \sum_{i=1}^{m} L_i, \quad \bar{L}_a = \frac{1}{m} \sum_{i=1}^{m} \frac{a - x_i}{L_i}, \quad \bar{L}_b = \frac{1}{m} \sum_{i=1}^{m} \frac{b - y_i}{L_i}. $$

Fixed-point iteration can be applied to solve these equations: start with a_0 = \bar{x} and b_0 = \bar{y}, then iterate a_{j+1} = F(a_j, b_j) and b_{j+1} = G(a_j, b_j) for j >= 0.
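The fixed-point scheme above can be sketched as follows. The starting guess (the centroid), the iteration count, and the function name are choices of this sketch, not prescribed by the text:

```python
import numpy as np

def fit_circle(points, iterations=128):
    """Fit a circle (x-a)^2 + (y-b)^2 = r^2 by the fixed-point iteration
    a <- F(a, b), b <- G(a, b) derived above, starting at the centroid."""
    x, y = (np.asarray(v, dtype=float) for v in zip(*points))
    a, b = x.mean(), y.mean()            # a_0 = xbar, b_0 = ybar
    for _ in range(iterations):
        L = np.sqrt((x - a) ** 2 + (y - b) ** 2)
        Lbar = L.mean()                  # current estimate of r
        La = ((a - x) / L).mean()        # mean of dL_i/da
        Lb = ((b - y) / L).mean()        # mean of dL_i/db
        a = x.mean() + Lbar * La         # F(a, b)
        b = y.mean() + Lbar * Lb         # G(a, b)
    r = np.sqrt((x - a) ** 2 + (y - b) ** 2).mean()
    return a, b, r
```

Note that if an iterate lands exactly on a sample point, some L_i is zero and the division fails; production code should guard against that case.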
6 Fitting a Sphere to 3D Points

Given a set of points {(x_i, y_i, z_i)}_{i=1}^{m}, m >= 4, fit them with a sphere (x - a)^2 + (y - b)^2 + (z - c)^2 = r^2, where (a, b, c) is the sphere center and r is the sphere radius. An assumption of this algorithm is that not all the points are coplanar. The energy function to be minimized is

$$ E(a, b, c, r) = \sum_{i=1}^{m} (L_i - r)^2 $$

where L_i = \sqrt{(x_i - a)^2 + (y_i - b)^2 + (z_i - c)^2}. Take the partial derivative with respect to r to obtain

$$ \frac{\partial E}{\partial r} = -2 \sum_{i=1}^{m} (L_i - r). $$

Setting this equal to zero yields

$$ r = \frac{1}{m} \sum_{i=1}^{m} L_i. $$
Take the partial derivatives with respect to a, b, and c, set them equal to zero, and rearrange to obtain

$$ a = \frac{1}{m} \sum_{i=1}^{m} x_i + r \frac{1}{m} \sum_{i=1}^{m} \frac{\partial L_i}{\partial a}, \quad b = \frac{1}{m} \sum_{i=1}^{m} y_i + r \frac{1}{m} \sum_{i=1}^{m} \frac{\partial L_i}{\partial b}, \quad c = \frac{1}{m} \sum_{i=1}^{m} z_i + r \frac{1}{m} \sum_{i=1}^{m} \frac{\partial L_i}{\partial c}. $$
Replacing r by its equivalent from \partial E / \partial r = 0 and using \partial L_i / \partial a = (a - x_i)/L_i, \partial L_i / \partial b = (b - y_i)/L_i, and \partial L_i / \partial c = (c - z_i)/L_i, we get three nonlinear equations in a, b, and c:

$$ a = \bar{x} + \bar{L}\,\bar{L}_a =: F(a, b, c), \quad b = \bar{y} + \bar{L}\,\bar{L}_b =: G(a, b, c), \quad c = \bar{z} + \bar{L}\,\bar{L}_c =: H(a, b, c) $$

where

$$ \bar{x} = \frac{1}{m} \sum_{i=1}^{m} x_i, \quad \bar{y} = \frac{1}{m} \sum_{i=1}^{m} y_i, \quad \bar{z} = \frac{1}{m} \sum_{i=1}^{m} z_i, \quad \bar{L} = \frac{1}{m} \sum_{i=1}^{m} L_i, $$

$$ \bar{L}_a = \frac{1}{m} \sum_{i=1}^{m} \frac{a - x_i}{L_i}, \quad \bar{L}_b = \frac{1}{m} \sum_{i=1}^{m} \frac{b - y_i}{L_i}, \quad \bar{L}_c = \frac{1}{m} \sum_{i=1}^{m} \frac{c - z_i}{L_i}. $$

Fixed-point iteration can be applied to solve these equations: start with a_0 = \bar{x}, b_0 = \bar{y}, c_0 = \bar{z}, then iterate a_{j+1} = F(a_j, b_j, c_j), b_{j+1} = G(a_j, b_j, c_j), and c_{j+1} = H(a_j, b_j, c_j) for j >= 0.
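The sphere version is the direct 3D analogue of the circle sketch. Again the centroid start, iteration count, and function name are choices of this sketch:

```python
import numpy as np

def fit_sphere(points, iterations=128):
    """Fit a sphere (x-a)^2 + (y-b)^2 + (z-c)^2 = r^2 by the fixed-point
    iteration a <- F, b <- G, c <- H derived above, starting at the centroid."""
    x, y, z = (np.asarray(v, dtype=float) for v in zip(*points))
    a, b, c = x.mean(), y.mean(), z.mean()
    for _ in range(iterations):
        L = np.sqrt((x - a) ** 2 + (y - b) ** 2 + (z - c) ** 2)
        Lbar = L.mean()                        # current estimate of r
        a = x.mean() + Lbar * ((a - x) / L).mean()   # F(a, b, c)
        b = y.mean() + Lbar * ((b - y) / L).mean()   # G(a, b, c)
        c = z.mean() + Lbar * ((c - z) / L).mean()   # H(a, b, c)
    r = np.sqrt((x - a) ** 2 + (y - b) ** 2 + (z - c) ** 2).mean()
    return a, b, c, r
```

For the six axis-extreme points of a sphere, the centroid is already the exact center, so the iteration is stationary and r is recovered immediately.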
7 Fitting an Ellipse to 2D Points
7.1 Distance From a Point to an Ellipse

It is sufficient to solve this problem when the ellipse is axis-aligned. Other ellipses can be rotated and translated to an axis-aligned ellipse centered at the origin, and the distance can be measured in that coordinate system. The basic idea can be found in Graphics Gems IV (an article by John Hart on computing the distance between a point and an ellipsoid).
Let (u, v) be the point in question. Let the ellipse be (x/a)^2 + (y/b)^2 = 1. The closest point (x, y) on the ellipse to (u, v) must occur so that (x - u, y - v) is normal to the ellipse. Since an ellipse normal is \nabla((x/a)^2 + (y/b)^2) = 2(x/a^2, y/b^2), the orthogonality condition implies that u - x = t x/a^2 and v - y = t y/b^2 for some t (the factor of 2 is absorbed into t). Solving yields x = a^2 u/(t + a^2) and y = b^2 v/(t + b^2). Replacing in the ellipse equation yields
$$ \left( \frac{a u}{t + a^2} \right)^2 + \left( \frac{b v}{t + b^2} \right)^2 = 1. $$
Multiplying through by the denominators yields the quartic polynomial

$$ F(t) = (t + a^2)^2 (t + b^2)^2 - a^2 u^2 (t + b^2)^2 - b^2 v^2 (t + a^2)^2 = 0. $$

The largest root \bar{t} of the polynomial corresponds to the closest point on the ellipse.
The largest root can be found by Newton's iteration. If (u, v) is inside the ellipse, then t_0 = 0 is a good initial guess for the iteration. If (u, v) is outside the ellipse, then t_0 = \max\{a, b\} \sqrt{u^2 + v^2} is a good initial guess. The iteration itself is

$$ t_{i+1} = t_i - F(t_i)/F'(t_i), \quad i \ge 0. $$
Some numerical issues need to be addressed. For (u, v) near the coordinate axes, the algorithm is ill-conditioned and those cases should be handled separately. Also, if a and b are large, then F(t_i) can be quite large. In these cases you might consider uniformly scaling the data to O(1) as floating point numbers first, computing the distance, then rescaling to get the distance in the original coordinates.
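The Newton scheme above can be sketched as follows; the iteration cap, stopping tolerance, and function name are choices of this sketch, and the axis-near cases flagged in the text are deliberately not handled:

```python
import math

def distance_point_to_ellipse(a, b, u, v):
    """Distance from (u, v) to the ellipse (x/a)^2 + (y/b)^2 = 1 via Newton's
    iteration on the quartic F(t) derived above. Assumes (u, v) is not too
    close to a coordinate axis (the ill-conditioned case noted in the text)."""
    def F(t):
        return ((t + a * a) ** 2 * (t + b * b) ** 2
                - a * a * u * u * (t + b * b) ** 2
                - b * b * v * v * (t + a * a) ** 2)

    def dF(t):
        return (2.0 * (t + a * a) * (t + b * b) ** 2
                + 2.0 * (t + a * a) ** 2 * (t + b * b)
                - 2.0 * a * a * u * u * (t + b * b)
                - 2.0 * b * b * v * v * (t + a * a))

    inside = (u / a) ** 2 + (v / b) ** 2 <= 1.0
    t = 0.0 if inside else max(a, b) * math.hypot(u, v)
    for _ in range(100):
        step = F(t) / dF(t)
        t -= step
        if abs(step) < 1e-12:
            break
    x = a * a * u / (t + a * a)   # closest point on the ellipse
    y = b * b * v / (t + b * b)
    return math.hypot(x - u, y - v), (x, y)
```

In the circle special case a = b the result matches the elementary formula |(u, v)| - a; for a genuine ellipse the returned point satisfies the on-ellipse and normality conditions.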
7.2

TO BE WRITTEN LATER. (The code at the web site uses a variation on Powell's direction set method.)
8 Fitting an Ellipsoid to 3D Points

8.1 Distance From a Point to an Ellipsoid

It is sufficient to solve this problem when the ellipsoid is axis-aligned. Other ellipsoids can be rotated and translated to an axis-aligned ellipsoid centered at the origin, and the distance can be measured in that coordinate system. The basic idea can be found in Graphics Gems IV (an article by John Hart on computing the distance between a point and an ellipsoid).
Let (u, v, w) be the point in question. Let the ellipsoid be (x/a)^2 + (y/b)^2 + (z/c)^2 = 1. The closest point (x, y, z) on the ellipsoid to (u, v, w) must occur so that (x - u, y - v, z - w) is normal to the ellipsoid. Since an ellipsoid normal is \nabla((x/a)^2 + (y/b)^2 + (z/c)^2) = 2(x/a^2, y/b^2, z/c^2), the orthogonality condition implies that u - x = t x/a^2, v - y = t y/b^2, and w - z = t z/c^2 for some t (the factor of 2 is absorbed into t). Solving yields x = a^2 u/(t + a^2), y = b^2 v/(t + b^2), and z = c^2 w/(t + c^2). Replacing in the ellipsoid equation yields
$$ \left( \frac{a u}{t + a^2} \right)^2 + \left( \frac{b v}{t + b^2} \right)^2 + \left( \frac{c w}{t + c^2} \right)^2 = 1. $$
Multiplying through by the denominators yields the sixth degree polynomial

$$ F(t) = (t + a^2)^2 (t + b^2)^2 (t + c^2)^2 - a^2 u^2 (t + b^2)^2 (t + c^2)^2 - b^2 v^2 (t + a^2)^2 (t + c^2)^2 - c^2 w^2 (t + a^2)^2 (t + b^2)^2 = 0. $$

The largest root \bar{t} of the polynomial corresponds to the closest point on the ellipsoid.
The largest root can be found by Newton's iteration. If (u, v, w) is inside the ellipsoid, then t_0 = 0 is a good initial guess for the iteration. If (u, v, w) is outside the ellipsoid, then t_0 = \max\{a, b, c\} \sqrt{u^2 + v^2 + w^2} is a good initial guess. The iteration itself is

$$ t_{i+1} = t_i - F(t_i)/F'(t_i), \quad i \ge 0. $$
Some numerical issues need to be addressed. For (u, v, w) near the coordinate planes, the algorithm is ill-conditioned and those cases should be handled separately. Also, if a, b, and c are large, then F(t_i) can be quite large. In these cases you might consider uniformly scaling the data to O(1) as floating point numbers first, computing the distance, then rescaling to get the distance in the original coordinates.
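The ellipsoid version of the Newton sketch is the direct analogue of the ellipse one; as before, the iteration cap, tolerance, and function name are choices of this sketch, and points near the coordinate planes are not handled:

```python
import math

def distance_point_to_ellipsoid(a, b, c, u, v, w):
    """Distance from (u, v, w) to the ellipsoid (x/a)^2 + (y/b)^2 + (z/c)^2 = 1
    via Newton's iteration on the sixth-degree F(t) derived above. Assumes the
    point is not too close to a coordinate plane (the ill-conditioned case)."""
    a2, b2, c2 = a * a, b * b, c * c

    def F(t):
        P, Q, R = t + a2, t + b2, t + c2
        return (P * P * Q * Q * R * R
                - a2 * u * u * Q * Q * R * R
                - b2 * v * v * P * P * R * R
                - c2 * w * w * P * P * Q * Q)

    def dF(t):
        P, Q, R = t + a2, t + b2, t + c2
        return (2.0 * (P * Q * Q * R * R + P * P * Q * R * R + P * P * Q * Q * R)
                - 2.0 * a2 * u * u * (Q * R * R + Q * Q * R)
                - 2.0 * b2 * v * v * (P * R * R + P * P * R)
                - 2.0 * c2 * w * w * (P * Q * Q + P * P * Q))

    inside = (u / a) ** 2 + (v / b) ** 2 + (w / c) ** 2 <= 1.0
    t = 0.0 if inside else max(a, b, c) * math.sqrt(u * u + v * v + w * w)
    for _ in range(100):
        step = F(t) / dF(t)
        t -= step
        if abs(step) < 1e-12:
            break
    x = a2 * u / (t + a2)    # closest point on the ellipsoid
    y = b2 * v / (t + b2)
    z = c2 * w / (t + c2)
    return math.sqrt((x - u) ** 2 + (y - v) ** 2 + (z - w) ** 2), (x, y, z)
```

In the sphere special case a = b = c, the point (2, 3, 6) lies at distance 7 from the origin, so the distance to a radius-2 sphere is 5 and the closest point is (2, 3, 6) scaled by 2/7.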
8.2

TO BE WRITTEN LATER. (The code at the web site uses a variation on Powell's direction set method.)
9 Fitting a Paraboloid to 3D Points of the Form (x, y, f(x, y))

Given a set of samples {(x_i, y_i, z_i)}_{i=1}^{m}, fit them with a quadric surface z = p_1 x^2 + p_2 x y + p_3 y^2 + p_4 x + p_5 y + p_6. Define Q(x, y) = (x^2, xy, y^2, x, y, 1) and P = (p_1, p_2, p_3, p_4, p_5, p_6), so that the surface is z = P \cdot Q(x, y). The energy function to be minimized is E(P) = \sum_{i=1}^{m} (P \cdot Q_i - z_i)^2, where Q_i = Q(x_i, y_i). The minimum occurs when the gradient of E is the zero vector,

$$ \nabla E = 2 \sum_{i=1}^{m} (P \cdot Q_i - z_i) Q_i = \vec{0}. $$

Some algebra converts this to a system of six linear equations in the six unknowns:

$$ \begin{bmatrix}
s(x^4) & s(x^3 y) & s(x^2 y^2) & s(x^3) & s(x^2 y) & s(x^2) \\
s(x^3 y) & s(x^2 y^2) & s(x y^3) & s(x^2 y) & s(x y^2) & s(x y) \\
s(x^2 y^2) & s(x y^3) & s(y^4) & s(x y^2) & s(y^3) & s(y^2) \\
s(x^3) & s(x^2 y) & s(x y^2) & s(x^2) & s(x y) & s(x) \\
s(x^2 y) & s(x y^2) & s(y^3) & s(x y) & s(y^2) & s(y) \\
s(x^2) & s(x y) & s(y^2) & s(x) & s(y) & s(1)
\end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \\ p_5 \\ p_6 \end{bmatrix}
=
\begin{bmatrix} s(z x^2) \\ s(z x y) \\ s(z y^2) \\ s(z x) \\ s(z y) \\ s(z) \end{bmatrix} $$

where s(\cdot) denotes the sum over i = 1, ..., m of the indicated monomial evaluated at the sample; for example, s(x^3 y) = \sum_{i=1}^{m} x_i^3 y_i and s(z x) = \sum_{i=1}^{m} z_i x_i.
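The coefficient matrix above is exactly \sum Q_i Q_i^{\mathsf{T}} and the right-hand side is \sum z_i Q_i, so the fit can be sketched compactly; the function name and NumPy usage are choices of this sketch:

```python
import numpy as np

def fit_quadric_surface(points):
    """Fit z = p1*x^2 + p2*x*y + p3*y^2 + p4*x + p5*y + p6 by least squares.

    Builds the 6x6 system (sum Q_i Q_i^T) P = sum z_i Q_i from the text,
    with Q(x, y) = (x^2, x*y, y^2, x, y, 1).
    """
    pts = np.asarray(points, dtype=float)
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    Q = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    M = Q.T @ Q       # entries are the s(.) sums of the text
    rhs = Q.T @ z     # (s(z x^2), s(z x y), ..., s(z))
    return np.linalg.solve(M, rhs)
```

Sampling z = x^2 + 2xy + 3y^2 + 4x + 5y + 6 on a small grid recovers the six coefficients exactly, provided the grid is rich enough that the six monomials are linearly independent on it.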