Jurnal Asing Acara 3 2

An Efficient Algebraic Solution to the Perspective-Three-Point Problem
Tong Ke Stergios I. Roumeliotis

University of Minnesota University of Minnesota
Minneapolis, USA Minneapolis, USA
kexxx069@cs.umn.edu stergios@cs.umn.edu ∗
Abstract tion process followed for arriving at a univariate polyno-

mial. Later on, Quan and Lan [22] and more recently Gao et
In this work, we present an algebraic solution to the al. [6] employed the same formulation but instead used the
classical perspective-3-point (P3P) problem for determin- Sylvester resultant [3] and Wu-Ritz’s zero-decomposition
ing the position and attitude of a camera from observations method [24], respectively, to solve the resulting system of
of three known reference points. In contrast to previous ap- equations, and, in the case of [6], to determine the number
proaches, we first directly determine the camera’s attitude of real solutions. Regardless of the approach followed, once
by employing the corresponding geometric constraints to the feature’s distances have been computed, finding the
formulate a system of trigonometric equations. This is then camera’s orientation, expressed as a unit quaternion [12] or
efficiently solved, following an algebraic approach, to de- a rotation matrix [13], often requires computing the eigen-
termine the unknown rotation matrix and subsequently the vectors of a 4 × 4 matrix (e.g., [22]) or performing singular
camera’s position. As compared to recent alternatives, our value decomposition (SVD) of a 3 × 3 matrix (e.g., [6]),
method avoids computing unnecessary (and potentially nu- respectively, both of which are time-consuming. Further-
merically unstable) intermediate results, and thus achieves more, numerical error propagation from the computed dis-
higher numerical accuracy and robustness at a lower com- tances to the rotation matrix significantly reduces the accu-
putational cost. These benefits are validated through exten- racy of the computed pose estimates.
sive Monte-Carlo simulations for both nominal and close- To the best of our knowledge, the first method1 that does
to-singular geometric configurations. not employ the law of cosines in its P3P problem formula-
tion is that of Kneip et al. [15], and later on that of Mas-
selli and Zell [18]. Specifically, [15] and [18] follow a ge-
1. Introduction ometric approach for avoiding computing the features’ dis-
tances and instead directly solve for the camera’s pose. In
The Perspective-n-Point (PnP) is the problem of deter- both cases, however, several intermediate terms (e.g., tan-
mining the 3D position and orientation (pose) of a cam- gents and cotangents of certain angles) need to be com-
era from observations of known point features. The PnP puted, which negatively affect the speed and numerical pre-
is typically formulated and solved linearly by employing cision of the resulting algorithms.
lifting (e.g., [1]), or as a nonlinear least-squares problem Similar to [15] and [18], our proposed approach does not
minimized iteratively (e.g., [9]) or directly (e.g., [11]). The require first computing the features’ distances. Differently
minimal case of the PnP (for n=3) is often used in practice, though, in our derivation, we first eliminate the camera’s
in conjunction with RANSAC, for removing outliers [5]. position and the features’ distances to result into a system
The first solution to the P3P problem was given by of three equations involving only the camera’s orientation.
Grunert [8] in 1841. Since then, several methods have Then, we follow an algebraic process for successively elim-
been introduced, some of which [8, 4, 19, 5, 17, 7] were inating two of the unknown 3-dof and arriving into a quar-
reviewed and compared, in terms of numerical accuracy, tic polynomial. Our algorithm (summarized in Alg. 1) re-
by Haralick et al. [10]. Common to these algorithms is quires fewer operations and involves simpler and numeri-
that they employ the law of cosines to formulate a sys- cally more stable expressions, as compared to either [15]
tem of three quadratic equations in the features’ distances or [18], and thus performs better in terms of efficiency, ac-
from the camera. They differ, however, in the elimina-
1 Nister and Stewenius [20] earlier also followed a geometric approach
∗ Thiswork was supported by the National Science Foundation (IIS- for solving the generalized P3P resulting into an octic univariate polyno-
1328722). mial whose odd monomials vanish for the case of the central P3P.
7225
curacy, and robustness. Specifically, the main advantages of
our approach are:
• Our algorithm’s implementation takes about 33% of

the time required by the current state of the art [15]. 2
• Our method achieves better accuracy than [15, 18] un-

der nominal conditions. Moreover, we are able to fur-
ther improve the numerical precision by applying root
polishing to the solutions of the quartic polynomial Figure 1. The camera {C}, whose position, G pC , and orientation,
G
while remaining significantly faster than [15, 18]. CC, we seek to determine, observes unit-vector bearing measure-
ment C bi of a feature fi , whose position, G pi , is known.
• Our algorithm is more robust than [15, 18] when
considering close-to-singular configurations (the three G
C(C bi × C bj ) to yield the following system of 3 equa-
C
points are almost collinear or very close to each other). tions in the unknown rotation GC C:
The remaining of this paper is structured as follows. Sec- (G p1 − G p2 )T CG C(C b1 × C b2 ) = 0 (2)

tion 2 presents the definition of the P3P problem, as well G G TG C
( p1 − p3 ) C( b1 × b3 ) = 0 C
(3)
C
as our derivations for estimating first the orientation and
(G p2 − G p3 )T CG C(C b2 × C b3 ) = 0 (4)
then the position of the camera. In Section 3, we assess
the performance of our approach against [15] , [18], and [6] Next, and in order to compute one of the 3 unknown degrees
in simulation for both nominal and singular configurations. of rotational freedom, we introduce the following factoriza-
Finally, we conclude our work in Section 4. tion of G
C C:
G
2. Problem Formulation and Solution C C = C(k1 , θ1 )C(k2 , θ2 )C(k3 , θ3 ) (5)
2.1. Problem Definition where3
Given the positions, G pi , of three known features fi , i =

1, 2, 3, with respect to a reference frame {G}, and the cor-
G
p1 − G p2 C
b1 × C b2 k1 × k3
k1 , , k 3 , , k2 ,
responding unit-vector, bearing measurements, C bi , i = kG p1 − G p2 k kC b1 × C b2 k kk1 × k3 k
1, 2, 3, our objective is to estimate the position, G pC , and (6)
orientation, i.e., the rotation matrix GC C, of the camera {C}.
Substituting (5) in (2), yields a scalar equation θ2 :
2.2. Solving for the orientation kT1 C(k2 , θ2 )k3 = 0 (7)
From the geometry of the problem (see Fig. 1), we have which we solve by employing Rodrigues’ rotation for-
(for i = 1, 2, 3): mula [16]:4
C(k2 , θ2 ) = cos θ2 I − sin θ2 ⌊k2 ⌋ + (1 − cos θ2 )k2 kT2 (8)

G G G C
pi = pC + d C bi iC (1)
to get
where di , k pC − pi k is the distance between the cam-
G G
π
era and the feature fi . θ2 = arccos(kT1 k3 ) ± (9)
2
In order to eliminate the unknown camera position, G pC ,
and feature distance, di , i = 1, 2, 3, we subtract pair- Note that we only need to consider one of these two solu-
wise the three equations corresponding to (1) for (i, j) = tions [in our case, we select θ2 = arccos(kT1 k3 ) − π2 ; see
(1, 2), (1, 3), and (2, 3), and project them on the vector Fig. 2], since the other one will result in the same G
C C (see
3 C(k, θ) denotes the rotation matrix describing the rotation about the
2 Although Masselli and Zell [18] claim that their algorithm runs faster unit vector, k, by an angle θ. Note that in the ensuing derivations, all
than Kneip et al.’s [15], our results (see Section 3) show the opposite to rotation angles are defined using the left-hand rule.
be true (by a small margin). The reason we arrive at a different conclusion 4 ⌊k⌋ denotes the 3×3 skew-symmetric matrix corresponding to k such
is that our simulation randomly generates a new geometric configuration that ⌊k⌋a = k × a, ∀ k, a ∈ R3 . Note also that if k is a unit vector, then
for each run, while Masselli employs only one configuration during their ⌊k⌋2 = kkT − I, while for two vectors a, b, ⌊a⌋⌊b⌋ = baT − (aT b)I.
entire simulation, in which they save time due to caching. Lastly, it is easy to show that ⌊⌊a⌋b⌋ = baT − abT .
7226
in (12) yields (for i = 1, 2):
ᇱᇱ T
ଵ ଶ ଷ cos θ1
−⌊k1 ⌋2 ui

⌊k1 ⌋ui + (kT1 ui )k1
sin θ1
߮

cos θ3 T
−⌊k′3 ⌋2 vi′

· −⌊k′3 ⌋vi′ + (k′3 vi′ )k′3 = 0
ߠଶ ଷ sin θ3
ଵ ᇱ (15)
ߨଵ ଷ
ᇱ ߨଶ Expanding (15) and rearranging terms, yields (for i = 1, 2)
ଵ
T
uTi ⌊k1 ⌋2 ⌊k′3 ⌋vi′ cos θ3
uTi ⌊k1 ⌋2 ⌊k′3 ⌋2 vi′

Figure 2. Geometric relation between unit vectors cos θ1
k1 , k2 , k3 , k′3 , k′′3 , and u1 . Note that k1 , k3 , and k′3 sin θ1 uTi ⌊k1 ⌋⌊k′3 ⌋2 vi′
uTi ⌊k1 ⌋⌊k′3 ⌋vi′ sin θ3
belong to a plane π1 whose normal is k2 . Also, k2 , k′3 , and k′′3
cos θ3
lie on a plane, π2 , normal to π1 . +(kT1 ui ) −kT1 ⌊k′3 ⌋2 vi′ −kT1 ⌊k′3 ⌋vi′

sin θ3

′T ′
T ′ ′
cos θ1
Appendix 5.2 for a formal proof). =(k3 vi ) ui ⌊k1 ⌋⌊k3 ⌋k1 ui ⌊k1 ⌋k3
T
(16)
sin θ1
In what follows, we describe the process for eliminating
θ3 from (3) and (4), and eventually arriving into a quartic Notice that the term uTi ⌊k1 ⌋⌊k′3 ⌋ appears three times in
polynomial involving a trigonometric function of θ1 . To do (16), and
so, we once again substitute in (3) and (4) the factorization
C C defined in (5) to get (for i = 1, 2):
of G uT1 ⌊k1 ⌋⌊k′3 ⌋ = uT1 k′3 kT1
uTi C(k1 , θ1 )C(k2 , θ2 )C(k3 , θ3 )vi = 0 (10) = (G p1 − G p3 )T ⌊k1 ⌋⌊k′3 ⌋
= (G p1 − G p2 + G p2 − G p3 )T ⌊k1 ⌋⌊k′3 ⌋
where
= (G p2 − G p3 )T ⌊k1 ⌋⌊k′3 ⌋
ui , G pi − G p3 , vi , C bi × C b3 , i = 1, 2, (11) = uT2 k′3 kT1 = uT2 ⌊k1 ⌋⌊k′3 ⌋ (17)
and employ the following property of rotation matrices This motivates us to rewrite (12) as (for i = 1, 2):
C(k1 , θ1 )C(k2 , θ2 )CT (k1 , θ1 ) = C(C(k1 , θ1 )k2 , θ2 ) 0 = uTi C(k1 , θ1 )C(k′3 , θ3 )vi′
to rewrite (10) in a simpler form as = uTi C(k1 , θ1 )C(k1 , −φ)C(k1 , φ)C(k′3 , θ3 )vi′
= uTi C(k1 , θ1 − φ)C(C(k1 , φ)k′3 , θ3 )C(k1 , φ)vi′
T
ui C(k1 , θ1 )C(C(k2 , θ2 )k3 , θ3 )C(k2 , θ2 )vi = 0
= uTi C(k1 , θ1′ )C(k′′3 , θ3 )vi′′ (18)
⇒ uTi C(k1 , θ1 )C(k′3 , θ3 )vi′ = 0 (12)
where
where
θ1′ , θ1 − φ, vi′′ , C(k1 , φ)vi′ , k′′3 , C(k1 , φ)k′3 (19)
vi′ , C(k2 , θ2 )vi , i = 1, 2
k′3 , C(k2 , θ2 )k3 = k2 × k1 (13) To simplify the equation analogous to (16) that will re-
sult from (18) [instead of (16)], we seek to find a φ
The last equality in (13) is geometrically depicted in Fig. 2 (not necessarily unique) such that uT1 k′′3 = 0, and hence,
and algebraically derived in Appendix 5.1. Analogously, it uTi ⌊k1 ⌋⌊k′′3 ⌋ = 0 [see (17)], i.e.,
is straightforward to show that
0 = uT1 k′′3
k′1 , CT (k2 , θ2 )k1 = −k2 × k3
= uT1 C(k1 , φ)k′3 (20)
Next, by employing Rodrigues’ rotation formula [see (8)], = uT1 (cos φI − sin φ⌊k1 ⌋ + (1 − cos φ)k1 kT1 )k′3
for expressing the product of a rotation
matrix and
Ta vector = cos φuT1 k′3 − sin φuT1 ⌊k1 ⌋k′3
as a linear function of the unknown cos θ sin θ , i.e.,
= cos φuT1 k′3 − sin φuT1 k2
C(k, θ)v = (− cos θ⌊k⌋2 − sin θ⌊k⌋ + kkT )v

2
cos θ 1 T
= −⌊k⌋ v −⌊k⌋v + (kT v)k (14)

sin θ ⇒ cos φ sin φ = u k2 uT1 k′3 (21)
δ 1
7227
where Replacing θ3 by θ3′ in (26), we have
q
δ, (uT1 k′3 )2 + (uT1 k2 )2 = ku1 × k1 k (22)
f11 cos θ1′ f15

cos θ3′

f
= 13 sin θ1′
f21 cos θ1′ + f24 f22 cos θ1′ + f25 sin θ3′ f23
and thus [from (19) using (8)]
(28)
k′′3 = cos φk′3 − sin φ⌊k1 ⌋k′3 + (1 − cos φ)k1 kT1 k′3 where
T
= (k′3 kT2 u1 − k2 k′3 u1 )/δ = u1 × (k′3 × k2 )/δ
f11 , δkT3 C b3 (29)
u1 × k1
= (23) f21 , δ( b1 b2 )(k3 b3 )
C TC TC
(30)
ku1 × k1 k
f22 , δ(k3 b3 )k b1 × b2 k
TC C C
(31)
Now, we can expand (18) using (14) to get an equation anal-
ogous to (16): f13 , f¯13 = δvT k3 1 (32)

cos θ1′
T T
ui ⌊k1 ⌋2 ⌊k′′3 ⌋2 vi′′ uTi ⌊k1 ⌋2 ⌊k′′3 ⌋vi′′ cos θ3
f23 , f¯23 = δv2T k3 (33)
sin θ1′ uTi ⌊k1 ⌋⌊k′′3 ⌋2 vi′′ uTi ⌊k1 ⌋⌊k′′3 ⌋vi′′ sin θ3 f24 , (u2 k1 )(k3 b3 )k b1 × b2 k
T TC C C
(34)

cos θ3 f15 , −(u1 k1 )(k3 b3 )
T TC
(35)
+ (kT1 ui ) −kT1 ⌊k′′3 ⌋2 vi′′ −kT1 ⌊k′′3 ⌋vi′′

=
sin θ3
f25 , −(u2 k1 )( b1 b2 )(k3 b3 )
T C TC TC
(36)
T
cos θ1′
(k′′3 vi′′ ) uTi ⌊k1 ⌋⌊k′′3 ⌋k1 uTi ⌊k1 ⌋k′′3 (24)
sin θ1′ From (28), we have
Substituting ui ⌊k1 ⌋⌊k′′3 ⌋
T
= 0 [see (17)] in (24) and renam- −1
ing terms, yields (for i = 1, 2): cos θ3′ f11 cos θ1′ f15
= det
sin θ3′ f21 cos θ1′ + f24 f22 cos θ1′ + f25
f¯i1 f¯i2 cos θ3
T
cos θ1′
¯ ¯
cos θ3
f
+ i4 f i5 f22 cos θ1′ + f25 −f15 f13
sin θ1′ 0 0 sin θ3 sin θ3 · sin θ1′
−(f21 cos θ1′ + f24 ) f11 cos θ1′ f23
cos θ1′
= 0 f¯i3 (37)

sin θ1′

cos θ3 Computing the norm of both sides of (37), results in
⇒ f¯i1 cos θ1′ + f¯i4 f¯i2 cos θ1′ + f¯i5 = f¯i3 sin θ1′

sin θ3
(25)
f22 cos θ1′ + f25 −f15
2
f13
(1 − cos2 θ1′ )
where5 −(f21 cos θ1′ + f24 ) f11 cos θ1′ f23
2
f11 cos θ1′ f15
f¯i1 , uTi ⌊k1 ⌋2 ⌊k′′3 ⌋2 vi′′ = δviT k2 = det
f21 cos θ1′ + f24 f22 cos θ1′ + f25
f¯i2 , uTi ⌊k1 ⌋2 ⌊k′′3 ⌋vi′′ = δviT k′1
f¯i3 , (k′′ v′′ )uT ⌊k1 ⌋k′′ = δvT k3 which is a 4th-order polynomial in cos θ1′ that can be com-
T
3 i i 3 i
pactly written as:
f¯i4 , ′′ 2 ′′
−(ui k1 )k1 ⌊k3 ⌋ vi = (uTi k1 )(viT k′1 )
T T
f¯i5 , −(uTi k1 )kT1 ⌊k′′3 ⌋vi′′ = −(uTi k1 )(viT k2 ) 4

X
αj cosj θ1′ = 0 (38)
For i = 1, 2, (25) results into the following system: j=0
f¯11 cos θ1′ + f¯14 f¯12 cos θ1′ + f¯15 cos θ3 f¯

= ¯13 sin θ1′ with
f¯21 cos θ1′ + f¯24 f¯22 cos θ1′ + f¯25 sin θ3 f23
(26)
Note that since f¯11 f¯14 +f¯12 f¯15 = 0, we can further simplify α4 , g52 + g12 + g32 (39)
(26) by introducing θ3′ , where α3 , 2(g5 g6 + g1 g2 + g3 g4 ) (40)
h
cos θ3′ f¯11 cos ¯ ¯ ¯ α2 , g62 g22 g42 g12 g32
iT
, √θ32+f122 sin θ3 − f14 cos
√θ32+f152 sin θ3 + 2g5 g7 + + − − (41)
′ ¯ ¯ ¯ ¯
sin θ3 f11 +f12 f14 +f15
α1 , 2(g6 g7 − g1 g2 − g3 g4 ) (42)
(27)
α0 , g72 − g22 − g42 (43)
5 The simplified expressions for the terms shown after the second equal-
ity, require lengthy derivations which we omit due to space limitations. (44)
7228
where Substituting (55) in (54), we have
g1 , f13 f22 (45) G

C C = C̄C(e1 , θ1′ )C(e2 , θ3 )C̄T C(k1 , φ)C(k2 , θ2 )
g2 , f13 f25 − f15 f23 (46) = C̄C(e , θ′ )C(e , θ )C(e , θ′ − θ )C̄ ¯
1 1 2 3 2 3 3
g3 , f11 f23 − f13 f21 (47) = ¯
C̄C(e1 , θ1′ )C(e2 , θ3′ )C̄ (56)
g4 , −f13 f24 (48)
where
g5 , f11 f22 (49)
¯ , C(e , θ − θ′ )C̄T C(k , φ)C(k , θ )
C̄
g6 , f11 f25 − f15 f21 (50) 2 3 3 1 2 2
′
′ ′
T
g7 , −f15 f24 (51) = C(e2 , θ3 − θ3 ) k1 k3 k1 × k3
(27) C T
= b1 k3 C b1 × k3
We compute the roots of (38) in closed form to find cos θ1′ .
Similarly to [15] and [18], we employ Ferrari’s method [2]
The advantages of (56) are: (i) The matrix product
to attain the resolvent cubic of (38), which is subsequently ¯
C(e1 , θ1′ )C(e2 , θ3′ ) can be computed analytically; (ii) C̄, C̄
solved by Cardano’s formula [2]. Once the (up to) four real
are invariant to the (up to) four possible solutions and thus,
solutions of (38) have been determined, an optional step is
we only need to construct them once.
to apply root polishing following Newton’s method, which
improves accuracy for minimal increase in the processing 2.3. Solving for the position
cost (see Section 3.2). Regardless, for each solution of
cos θ1′ , we will have two possible solutions for sin θ1′ , i.e., Substituting in (1) the expression for d3 from (63) and
p rearranging terms, yields
sin θ1′ = ± 1 − cos2 θ1′ (52)
G
δ sin θ1′ G C
pC = G p3 − C b3 (57)
which, in general, will result in two different solutions for kT3 C b3 C
C
G C. Note though that only one of them is valid if we use
the fact that di > 0 (see Appendix 5.3). Note that we only use (1) for i = 3 to compute G pC from
G
Next, for each pair of (cos θ1′ , sin θ1′ ), we compute cos θ3′ CC. Alternatively, if we care more for accuracy than speed,
and sin θ3′ from (37), which can be written as we can find the position using a least-squares approach
based on (1) (see [14] for details). Lastly, the proposed P3P
cos θ3′ sin θ1′ g1 cos θ1′ + g2 solution is summarized in Alg. 1.
=
sin θ3′ g5 cos2 θ1′ + g6 cos θ1′ + g7 g3 cos θ1′ + g4
(53) Algorithm 1: Solving for the camera’s pose
Lastly, instead of first computing θ1 from (19) and θ3
Input: G pi , i = 1, 2, 3 the features’ positions;
from (27) to find G C C using (5), we hereafter describe a C
bi , i = 1, 2, 3 bearing measurements
faster method for recovering G C C. Specifically, from (5),
Output: G pC , the position of the camera; CG C, the
(12) and (18), we have
orientation of the camera
G 1 Compute k1 , k3 using (6)
C C = C(k1 , θ1 )C(k2 , θ2 )C(k3 , θ3 )
2 Compute ui and vi using (11), i = 1, 2
= C(k1 , θ1 )C(k′3 , θ3 )C(k2 , θ2 ) ′′
3 Compute δ and k3 using (22) and (23)
= C(k1 , θ1′ )C(k′′3 , θ3 )C(k1 , φ)C(k2 , θ2 ) (54) 4 Compute the fij ’s using (29)-(36)
5 Compute αi , i = 0, 1, 2, 3, 4 using (39)-(51)
Since k1 is perpendicular to k′′3 , we can construct a rotation 6 Solve (38) to get n (n = 2 or 4) real solutions for
matrix C̄ such that ′(i)
cos θ1′ , denoted as cos θ1 , i = 1...n

C̄ = k1 k′′3 k1 × k′′3
7 for i = 1 : n do q
′(i) ′(i)
8 sin θ1 ← sign(kT3 C b3 ) 1 − cos2 θ1
and hence ′(i) ′(i)
9 Compute cos θ3 and sin θ3 using (53)
k1 = C̄e1 , k′′3 = C̄e2 (55) 10 Compute CG C(i) using (56)
(i)
11 Compute G pC using (57)
where 12 end
e3 , I 3

e1 e2
7229
position orientation 2.5 ×10
4
Gao’s method 6.36E-05 1.31E-04 Proposed method+Root polishing

Proposed method
Kneip’s method 1.18E-05 1.02E-05 Masselli's method
2 Kneip's method
Masselli’s method 1.84E-08 4.89E-10 Gao's method
Proposed method 1.66E-10 5.30E-12
Proposed method+Root polishing 5.07E-11 1.53E-13 1.5
Table 1. Nominal case: Pose mean errors.
count
1
3. Simulation results
Our algorithm is implemented6 in C++ using the same 0.5
linear algebra library, TooN [23], as [15] We employ simu-
lation data to test our code and compare it to the solutions
0
of [15] and [18]. For each single P3P problem, we ran- -18 -16 -14 -12 -10 -8 -6
domly generate three 3D landmarks, which are uniformly log10(error) (Orientation)
distributed in a 0.4 × 0.3 × 0.4 cuboid centered around the Figure 3. Nominal case: Histogram of orientation errors.
origin. The position of the camera is G pC = e3 , and its
orientation is CG C = C(e1 , π). 18000
Proposed method+Root polishing
16000 Proposed method
3.1. Numerical accuracy Masselli's method
Kneip's method
14000 Gao's method
We generate simulation data without adding any noise
or rounding error to the bearing measurements, and run all 12000
three algorithms on 50,000 randomly-generated configura- 10000

tions to assess their numerical accuracy. Note that the posi-
count
tion error is computed as the norm of the difference between 8000
the estimate and the ground truth of G pC . As for the orien- 6000
tation error, we compute the rotation matrix that transforms
4000
the estimated G C C to the true one, convert it to the equiva-
lent axis-angle representation, and use the absolute value of 2000
the angle as the error. Since there are multiple solutions to a

0
P3P problem, we compute the errors for all of them and pick -18 -16 -14 -12 -10 -8 -6
log10(error) (Position)
the smallest one (i.e., the root closest to the true solution).
The distributions and the means of the position and ori- Figure 4. Nominal case: Histogram of position errors.
entation errors are depicted in Fig.s 3 - 4 and Table 1. As
evident, we get similar results to those presented in [18] for to evaluate the computational cost of the three algorithms
Kneip et al.’s [15] and Masselli and Zell’s methods [18], considered. We run it on a 2.0 GHz×4 Core laptop and the
while our approach outperforms both of them by two or- results show that our code takes 0.43 µs on average (0.41
ders of magnitude in terms of accuracy. This can be at- µs without root polishing) while [15], [18] and [6] take 1.3
tributed to the fact that our algorithm requires fewer oper- µs, 1.5 µs, and 3.1 µs respectively. This corresponds to a
ations and thus exhibits lower numerical-error propagation. 3× speed up (or 33% of the time of [15]). Note also, in
For completeness, in Table 1, we also list the pose errors for contrast to what is reported in [18], Masselli’s method is
OpenCV’s [21] implementation of Gao’s method [6]. actually slower than Kneip’s. As mentioned earlier, Mas-
Lastly, and as shown in the results of Table 1, we can fur- selli’s results in [18] are based on 1,000 runs of the same
ther improve the numerical precision by applying root pol- features’ configuration, and take advantage of data caching
ishing. Typically, two iterations of Newton’s method [25] to outperform Kneip.
lead to significantly better results, especially for the orien-
tation, while taking only 0.01 µs per iteration, or about 5% 3.3. Robustness
of the total processing time. There are two typical singular cases that lead to infinite
3.2. Processing cost solutions in the P3P problem:
We use a test program that solves 100,000 randomly gen- • Singular case 1: The 3 landmarks are collinear.
erated P3P problems and calculates the total execution time
• Singular case 2: Any two of the 3 bearing measure-
6 Our code is available as supplemental material and in OpenCV [21]. ments coincide.
7230
In practice, it is almost impossible for these conditions to position orientation
hold exactly, but we may still have numerical issues when Kneip’s method 1.42E-14 1.34E-14
the geometric configuration is close to these cases. To test Masselli’s method 7.13E-15 6.15E-15
the robustness of the three algorithms considered, we gen- Proposed method 5.16E-15 3.73E-15
erate simulation data corresponding to small perturbations Table 2. Singular case 1: Pose median errors.
(uniformly distributed within [−0.05 0.05]) of the features’
positions when in singular configurations. The errors are position orientation
defined as in Section 3.1, while we compute the medians7 Kneip’s method 8.10E-14 8.85E-14
of them to assess the robustness of the three methods. For Masselli’s method 7.24E-14 6.07E-14
fairness, we do not apply root polishing to our code here. Proposed method 6.73E-14 1.75E-14
According to the results shown in Fig.s 5 - 8 and Tables 2 - Table 3. Singular case 2: Pose median errors.
3, our method achieves the best accuracy in these two close-
12000
to-singular cases. The reason is that we do not compute any
Proposed method
quantities that may suffer from numerical issues, such as Masselli's method
Kneip's method
cotangent and tangent in [15] and [18], respectively. 10000
18000
8000
Proposed method
16000 Masselli's method
Kneip's method
count
14000 6000
12000
4000
10000
count
8000 2000
6000
0
-18 -16 -14 -12 -10 -8 -6
4000 log10(error) (Position)
2000 Figure 7. Singular case 2: Histogram of position errors.
0 12000
-18 -16 -14 -12 -10 -8 -6
log10(error) (Position) Proposed method
Masselli's method
Figure 5. Singular case 1: Histogram of position errors. Kneip's method
10000
18000 8000
Proposed method
16000 Masselli's method
count
Kneip's method 6000

14000
12000 4000
10000
2000
count
8000
6000 0
-18 -16 -14 -12 -10 -8 -6
log10(error) (Orientation)
4000
Figure 8. Singular case 2: Histogram of orientation errors.
2000
0
-18 -16 -14 -12 -10 -8 -6 4. Conclusion and Future Work
log10(error) (Orientation)
Figure 6. Singular case 1: Histogram of orientation errors. In this paper, we have introduced an algebraic approach
for computing the solutions of the P3P problem in closed
7 In close-to-singular cases, a few instances of large errors ( 10−1 ) may form. Similarly to [15] and [18], our algorithm does not
dominate and skew the mean. Instead, the median is a robust measure of solve for the distances first, and hence reduces numerical-
the central tendency, and provides a better performance metric. error propagation. Differently though, it does not involve
7231
numerically-unstable functions (e.g., tangent, or cotangent) Note that GC C in (61) is of the same form as that in (5),
and has simpler expressions than the two recent alternative (2)
C C computed using θ2
so any solutions of G will be found
methods [15, 18], and thus it outperforms them in terms of (1)
by using θ2 . Thus, we do not need to consider any other
speed, accuracy, and robustness to close-to-singular cases. solutions for θ1 and θ3 beyond the ones found for GC C.
As part of our ongoing work, we are currently extend-
ing our approach to also address the case of the generalized 5.3. Determining the sign of sin θ1′
(non-central camera) P3P [20].
From (52), we have two solutions for sin θ1′ , and thus
′(2)
5. Appendix for θ1′ , with θ1 = −θ1′ . This will also result into two
solutions for θ3′ [see (53)] and, hence, two solutions for θ3 :
5.1. Proof of k′3 = k2 × k1 (2)
θ3 and θ3 = θ3 + π. Considering these two options, we
First, note that k2 × k1 is a unit vector since k2 is per- get two distinct solutions for G
C C [see (54)]:
pendicular to k1 . Also, from (13) and (7) we have

C1 , C(k1 , θ1′ )C(k′′3 , θ3 )C(k1 , φ)C(k2 , θ2 )
k1 k′3
T T
= k1 C(k2 , θ2 )k3 = 0 (58)
C2 , C(k1 , −θ1′ )C(k′′3 , θ3 + π)C(k1 , φ)C(k2 , θ2 )
Then, we can prove k′3 = k2 × k1 by showing that their
inner product is equal to 1: Then, notice that
(k2 × k1 )T k′3 = kT1 (k′3 × k2 ) C2 CT1 = C(k1 , −θ1′ )C(k′′3 , π)C(k1 , −θ1′ )
(6)kT1 (k′3 × (k1 × k3 )) = C(k′′3 , π)C(CT (k′′3 , π)k1 , −θ1′ )C(k1 , −θ1′ )
=
kk1 × k3 k = C(k′′3 , π)C(−k1 , −θ1′ )C(k1 , −θ1′ )
T ′ T ′
k
(9) 1
T
(k 1 (k3 k3 ) − k3 (k1 k3 )) = C(k′′3 , π)
=
cos θ2
kT3 C(k2 , θ2 )k3 If C1 = C2 , this would require
= =1
cos θ2
C(k′′3 , π) = C2 CT1 = I
5.2. Equivalence between the two solutions of θ2
which cannot be true, hence C1 and C2 cannot be equal.
When solving for θ2 [see (9)], we have two possible so- Thus, there are always two different solutions of G
(1) (2) (1) C C.
lutions θ2 = arccos(kT1 k3 ) − π2 and θ2 = θ2 + π. If, however, we use the fact that di (i = 1, 2, 3) is pos-
(2)
Next, we will prove that using θ2 to find G C C is equivalent itive, we can determine the sign of sin θ1′ , and choose the
(1) valid one among the two solutions of G
to using θ2 . First, note that (see Fig. 2) C C. Subtracting (1)
pairwise for (i = 3) from (i = 1), we have
(1) π π
C(k2 , θ2 + )k3 = C(k2 , )k′3 = −k2 × k′3
2 2 G
p1 − G p3 =d1 G C G C
C C b 1 − d3 C C b 3
= −k2 × (k2 × k1 ) = k1 (59)
⇒ G p1 − G p3 =C(k1 , θ1′ )C(k′′3 , θ3 )C(k1 , φ)
(2) ·C(k2 , θ2 )(d1 C b1 − d3 C b3 ) (62)
Then, we can write C(k2 , θ2 ) as
(2) T
C(k2 , θ2 ) Multiplying both sides of (62) with k′′3 C(k1 , −θ1′ ) from
(1) π π the left, yields
=C(k2 , θ2 + )C(k2 , )
2 2 T
(1) π π k′′3 C(k1 , −θ1′ )(G p1 − G p3 )
=C(k2 , θ2 + )C(k3 , π)C(k2 , − )C(k3 , −π)
2 2 T
=k′′3 C(k1 , φ)C(k2 , θ2 )(d1 C b1 − d3 C b3 )
(1) π (1) π π
=C(C(k2 , θ2 + )k3 , π)C(k2 , θ2 + )C(k2 , − )C(k3 , π) ⇒k′′ T (cos θ′ I + sin θ′ ⌊k ⌋ + (1 − cos θ′ )k kT )u
2 2 2 3 1 1 1 1 1 1 1
(59) (1) ′T C C
= C(k1 , π)C(k2 , θ2 )C(k3 , π) (60) =k 3 C(k ,
2 2 θ )(d 1 b 1 − d 3 b 3 )
(23) T
(2) ⇒ sin θ1′ k′′3 ⌊k1 ⌋u1 = kT3 (d1 C b1 − d3 C b3 )
If we use θ2 to find G
C C,
⇒ − sin θ1′ uT1 ⌊k1 ⌋k′′3 = −d3 kT3 C b3
G (2) (2) (2)
C C= C(k1 , θ1 )C(k2 , θ2 )C(k3 , θ3 ) ⇒δ sin θ1′ = d3 (kT3 C b3 ) (63)
(60) (2) (1) (2)
= C(k1 , θ1 )C(k1 , π)C(k2 , θ2 )C(k3 , π)C(k3 , θ3 )
Using the fact that d3 > 0 and δ > 0, we select the sign of
(2) (1) (2)
= C(k1 , θ1 + π)C(k2 , θ2 )C(k3 , θ3 + π) (61) sin θ1′ to be the same as that of kT3 C b3 .
7232
References [16] D. Koks. A roundabout route to geometric algebra. Explo-
rations in Mathematical Physics: The Concepts behind an
[1] A. Ansar and K. Daniilidis. Linear pose estimation from Elegant Language, pages 147–184, 2006.
points or lines. IEEE Transactions on Pattern Analysis and
[17] S. Linnainmaa, D. Harwood, and L. S. Davis. Pose deter-
Machine Intelligence, 25(5):578–589, 2003.
mination of a three-dimensional object using triangle pairs.
[2] G. Cardano, T. R. Witmer, and O. Ore. The Rules of Algebra: IEEE Transactions on Pattern Analysis and Machine Intelli-
Ars Magna, volume 685. Courier Corporation, 2007. gence, 10(5):634–647, 1988.
[3] D. A. Cox, J. Little, and D. O’Shea. Using algebraic ge- [18] A. Masselli and A. Zell. A new geometric approach for
ometry, volume 185. Springer Science & Business Media, faster solving the perspective-three-point problem. In Proc.
2006. of the IEEE International Conference on Pattern Recogni-
[4] S. Finsterwalder and W. Scheufele. Das tion, pages 2119–2124, Stockholm, Sweden, Aug. 24–28
rückwärtseinschneiden im raum. Verlag d. Bayer. Akad. d. 2014.
Wiss., 1903. [19] E. Merritt. Explicit three-point resection in space. Pho-
[5] M. A. Fischler and R. C. Bolles. Random sample consen- togrammetric Engineering, 15(4):649–655, 1949.
sus: a paradigm for model fitting with applications to image [20] D. Nistér and H. Stewénius. A minimal solution to the gener-
analysis and automated cartography. Communications of the alised 3-point pose problem. Journal of Mathematical Imag-
ACM, 24(6):381–395, 1981. ing and Vision, 27(1):67–79, 2007.
[6] X.-S. Gao, X.-R. Hou, J. Tang, and H.-F. Cheng. Complete [21] OpenCV. Open Source Computer Vision Library. Available:
solution classification for the perspective-three-point prob- http://opencv.org/. Accessed Apr. 10, 2017.
lem. IEEE Transactions on Pattern Analysis and Machine [22] L. Quan and Z. Lan. Linear n-point camera pose determi-
Intelligence, 25(8):930–943, 2003. nation. IEEE Transactions on Pattern Analysis and Machine
[7] E. W. Grafarend, P. Lohse, and B. Schaffrin. Dreidimen- Intelligence, 21(8):774–780, 1999.
sionaler rückwärtsschnitt teil i: Die projektiven gleichungen. [23] TooN. C++ Linear Algebra Library. Available:
Zeitschrift für Vermessungswesen, pages 1–37, 1989. https://www.edwardrosten.com/cvd/toon/
[8] J. A. Grunert. Das pothenotische problem in erweit- html-user/. Accessed Apr. 10, 2017.
erter gestalt nebst über seine anwendungen in der geodäsie. [24] W. Wen-Tsun. Basic principles of mechanical theorem prov-
Grunerts archiv für mathematik und physik, 1:238–248, ing in elementary geometries. Journal of Automated Reason-
1841. ing, 2(3):221–252, 1986.
[9] R. M. Haralick, H. Joo, C.-N. Lee, X. Zhuang, V. G. Vaidya, [25] T. J. Ypma. Historical development of the Newton-Raphson
and M. B. Kim. Pose estimation from corresponding point method. SIAM review, 37(4):531–551, 1995.
data. IEEE Transactions on Systems, Man, and Cybernetics,
19(6):1426–1446, 1989.
[10] R. M. Haralick, D. Lee, K. Ottenburg, and M. Nolle. Anal-
ysis and solutions of the three point perspective pose esti-
mation problem. In Proc. of the IEEE Conference on Com-
puter Vision and Pattern Recognition, pages 592–598, La-
haina, HI, June 3–6 1991.
[11] J. A. Hesch and S. I. Roumeliotis. A direct least-squares
(DLS) method for PnP. In Proc. of the 13th International
Conference on Computer Vision, pages 383–390, Barcelona,
Spain, Nov. 6–13 2011.
[12] B. K. Horn. Closed-form solution of absolute orientation us-
ing unit quaternions. Journal of the Optical Society of Amer-
ica A, 4(4):629–642, 1987.
[13] B. K. Horn, H. M. Hilden, and S. Negahdaripour. Closed-
form solution of absolute orientation using orthonormal
matrices. Journal of the Optical Society of America A,
5(7):1127–1135, 1988.
[14] T. Ke and S. I. Roumeliotis. An efficient algebraic solution
to the perspective-three-point problem. Available: https:
//arxiv.org/abs/1701.08237.
[15] L. Kneip, D. Scaramuzza, and R. Siegwart. A novel
parametrization of the perspective-three-point problem for a
direct computation of absolute camera position and orien-
tation. In Proc. of the IEEE Conference on Computer Vi-
sion and Pattern Recognition, pages 2969–2976, Colorado
Springs, CO, June 21–25 2011.
7233

Jurnal Asing Acara 3 2

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Jurnal Asing Acara 3 2

Uploaded by

Copyright:

Available Formats

An Efficient Algebraic Solution to the Perspective-Three-Point Problem

Tong Ke Stergios I. Roumeliotis

Abstract tion process followed for arriving at a univariate polyno-

• Our algorithm’s implementation takes about 33% of

• Our method achieves better accuracy than [15, 18] un-

The remaining of this paper is structured as follows. Sec- (G p1 − G p2 )T CG C(C b1 × C b2 ) = 0 (2)

2.1. Problem Definition where3

Given the positions, G pi , of three known features fi , i =

C(k2 , θ2 ) = cos θ2 I − sin θ2 ⌊k2 ⌋ + (1 − cos θ2 )k2 kT2 (8)

f¯i5 , −(uTi k1 )kT1 ⌊k′′3 ⌋vi′′ = −(uTi k1 )(viT k2 ) 4

f¯11 cos θ1′ + f¯14 f¯12 cos θ1′ + f¯15 cos θ3 f¯

g1 , f13 f22 (45) G

Gao’s method 6.36E-05 1.31E-04 Proposed method+Root polishing

three algorithms on 50,000 randomly-generated configura- 10000

tion error is computed as the norm of the difference between 8000

the angle as the error. Since there are multiple solutions to a

2000 Figure 7. Singular case 2: Histogram of position errors.

Kneip's method 6000

pendicular to k1 . Also, from (13) and (7) we have

You might also like