Part 09
Ji Hui
Ji Hui (National University of Singapore) Visual Information Interpretation: 3D from stereosis October 11, 2021 1 / 43
Multi-camera systems for 3D perception
Multi-camera surveillance
Stereo camera for driverless car
Depth from Stereopsis
image plane
optical center
3D perception for stereo imaging
World co-ordinate vs image co-ordinate
Geometric camera calibration
Camera calibration estimates the co-ordinate mapping among image
frame, camera frame and world frame
intrinsics: pixel co-ordinates ⇔ pinhole camera co-ordinates
extrinsics: camera co-ordinates ⇔ world co-ordinates
Perspective transform
Pinhole camera model
Consider a point P
$P_c$: coordinates of the point in the camera frame
$P_w$: coordinates of the point in the world frame
The mapping between the two coordinates:
$$P_c = R(P_w - T).$$
In homogeneous form:
$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix} = \begin{bmatrix} R & -RT \\ 0 \;\; 0 \;\; 0 & 1 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix},$$
where $R = R_x R_y R_z$.
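As a minimal sketch of the extrinsic mapping, the snippet below checks numerically that the homogeneous matrix with blocks $R$ and $-RT$ reproduces $P_c = R(P_w - T)$. The function names `rotation_xyz` and `extrinsic_matrix`, the angle values, and the example point are illustrative assumptions, not part of the slides.

```python
import numpy as np

def rotation_xyz(ax, ay, az):
    """Compose R = Rx @ Ry @ Rz from angles (radians) about the x, y, z axes."""
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rx @ Ry @ Rz

def extrinsic_matrix(R, T):
    """4x4 homogeneous matrix [[R, -R T], [0, 1]] realising Pc = R (Pw - T)."""
    E = np.eye(4)
    E[:3, :3] = R
    E[:3, 3] = -R @ T
    return E

# Hypothetical example values
R = rotation_xyz(0.1, -0.2, 0.3)
T = np.array([1.0, 2.0, 3.0])    # camera centre expressed in the world frame
Pw = np.array([4.0, 5.0, 6.0])   # a world point

Pc_direct = R @ (Pw - T)                                 # non-homogeneous form
Pc_homog = extrinsic_matrix(R, T) @ np.append(Pw, 1.0)   # homogeneous form
```

Both routes should give the same camera-frame coordinates, and the world point $T$ itself should map to the camera origin.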
Intrinsic mapping
Calibration matrix M
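Combining both extrinsic and intrinsic mapping:
$$\begin{bmatrix} u \\ v \\ w \end{bmatrix} = M_{int} M_{ext} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} = M \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix},$$
with
$$M_{int} = \begin{bmatrix} f_x & s & x_o \\ 0 & f_y & y_o \\ 0 & 0 & 1 \end{bmatrix}; \quad M_{ext} = [R, -RT] = \begin{bmatrix} r_{11} & r_{12} & r_{13} & -R_1^\top T \\ r_{21} & r_{22} & r_{23} & -R_2^\top T \\ r_{31} & r_{32} & r_{33} & -R_3^\top T \end{bmatrix}$$
where $R_i^\top$ denotes the $i$-th row of $R$. The pixel co-ordinates are $(u/w, v/w)$.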
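A minimal sketch of the combined mapping: build $M = M_{int} M_{ext}$ and project a world point to pixel co-ordinates by dividing by the third homogeneous component $w$. The helper names `projection_matrix` and `project`, and the numeric intrinsics (focal length 800 px, principal point (320, 240), zero skew), are assumptions chosen only for illustration.

```python
import numpy as np

def projection_matrix(M_int, R, T):
    """M = M_int @ [R, -R T], a 3x4 calibration (projection) matrix."""
    M_ext = np.hstack([R, (-R @ T).reshape(3, 1)])
    return M_int @ M_ext

def project(M, Pw):
    """Map a 3D world point to pixel co-ordinates (u/w, v/w)."""
    u, v, w = M @ np.append(Pw, 1.0)
    return np.array([u / w, v / w])

# Hypothetical intrinsics
M_int = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])
R = np.eye(3)        # camera axes aligned with the world axes
T = np.zeros(3)      # camera centre at the world origin
M = projection_matrix(M_int, R, T)

pixel = project(M, np.array([0.5, -0.25, 2.0]))   # -> (520, 140)
```

With this setup a point on the optical axis, such as (0, 0, 5), lands on the principal point (320, 240).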
Calibration pattern
Show a checkerboard pattern to the cameras in different poses to
estimate the calibration matrix
Calibration with known correspondence
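Point correspondence: matching the corners of the board: $\{\tilde p_k, \tilde P_k\}_{k=1}^{K}$
$$\tilde p \equiv \begin{bmatrix} u \\ v \\ w \end{bmatrix} = M_{int} M_{ext} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} = M_{int} M_{ext} \tilde P$$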
Calibration: how to estimate M
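Use the following equation to estimate the parameters of the calibration matrix:
$$\tilde p = M \tilde P \implies \tilde p \times M \tilde P = \tilde p \times \tilde p = 0,$$
$$\tilde p_k \times (M \tilde P_k) = 0, \quad k = 1, \ldots, K.$$
Let $\vec m \in \mathbb{R}^{12}$ denote the 12 entries of the matrix $M$. The constraints above can
be expressed as a linear system
$$A \vec m = 0,$$
$$\min \|A \vec m\|_2^2, \quad \text{subject to } \|\vec m\| = 1,$$
$$A = QR,$$
The relationship
$$Q = [e_1, e_2, \ldots, e_K]; \quad R_{i,j} = \begin{cases} \langle e_i, a_j \rangle, & i \le j \\ 0, & \text{otherwise,} \end{cases}$$
where $a_j$ is the $j$-th column of $A$ and $e_i$ the $i$-th column of $Q$.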
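The linear estimation step can be sketched as follows. Each correspondence with $\tilde p = (u, v, 1)^\top$ contributes two independent rows of $\tilde p \times (M \tilde P) = 0$, and the unit-norm minimizer of $\|A \vec m\|$ is taken here as the right singular vector of $A$ for its smallest singular value. Using the SVD for this step is an assumption of the sketch (a common choice), and the function name and the synthetic camera are hypothetical.

```python
import numpy as np

def estimate_projection_matrix(pixels, world_points):
    """Estimate the 3x4 matrix M (up to scale) from K >= 6 correspondences.

    pixels: (K, 2) array of (u, v); world_points: (K, 3) array, not all coplanar.
    """
    rows = []
    zero = np.zeros(4)
    for (u, v), Pw in zip(pixels, world_points):
        P = np.append(Pw, 1.0)
        # Two independent rows of p x (M P) = 0, with m = M flattened row by row
        rows.append(np.concatenate([zero, -P, v * P]))
        rows.append(np.concatenate([P, zero, -u * P]))
    A = np.array(rows)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)   # unit-norm minimiser of ||A m||

# Synthetic check with a hypothetical camera
rng = np.random.default_rng(0)
world = rng.uniform(-1.0, 1.0, (8, 3)) + np.array([0.0, 0.0, 5.0])
M_int = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
T = np.array([0.1, -0.2, 0.0])
M_true = M_int @ np.hstack([R, (-R @ T).reshape(3, 1)])

homog = np.column_stack([world, np.ones(len(world))]) @ M_true.T
pixels = homog[:, :2] / homog[:, 2:]

M_est = estimate_projection_matrix(pixels, world)
reproj = np.column_stack([world, np.ones(len(world))]) @ M_est.T
reproj = reproj[:, :2] / reproj[:, 2:]
```

Because $M$ is only defined up to scale, the sketch compares reprojected pixels rather than matrix entries.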
Calibration: Refinement
Estimate calibration matrix M from 8 point correspondences.
Decompose M into internal and external mappings $M = M_{int} M_{ext}$
1 Estimating translation: Let $\tilde T$ denote the estimate of $T$ in homogeneous co-ordinates.
In non-homogeneous co-ordinates, recall that $P_c = R(P_w - T)$. Then, for the
world point $T$, we have
$$P_c = R(T - T) = 0.$$
In homogeneous co-ordinates,
$$M_{ext} \tilde T = 0 \implies M \tilde T = M_{int} R \, [I_3, -T] \, \tilde T = 0.$$
Thus, $\tilde T$ spans the null space of $M$. Since $M$ is $3 \times 4$, $\tilde T$ can be estimated as the
right singular vector of $M$ associated with its smallest singular value.
2 Estimating camera rotation and intrinsic parameters.
Recall that
$$M = M_{int} M_{ext} = M_{int} R \, [I_3, -T] = M' [I_3, -T] = [M', -M'T]$$
where $M' = M_{int} R$, $M_{int}$: upper-triangular, $R$: orthogonal (unitary) matrix.
Run an RQ decomposition (the QR-type factorization with the upper-triangular factor on the left)
on the left $3 \times 3$ block of $M$, M[0 : 2, 0 : 2], to obtain $M_{int}$ and $R$.
Update M using the estimated $M_{ext}$ and $M_{int}$, and use it as the initial guess for
some iterative scheme for solving
$$\min_{M} \sum_{k} |\tilde p_k \times M \tilde P_k|^2,$$
e.g. the steepest descent method for several iterations.
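The two decomposition steps can be sketched as below, under stated assumptions: the camera centre is read off the null vector of $M$ via the SVD, and the left $3 \times 3$ block is split into an upper-triangular factor times an orthogonal factor (an RQ factorization, built here from `np.linalg.qr` on a row-reversed transpose). The helper names `rq3` and `decompose_projection_matrix` and the synthetic camera are hypothetical.

```python
import numpy as np

def rq3(A):
    """Factor a 3x3 matrix as A = K @ Q, K upper-triangular, Q orthogonal."""
    P = np.flipud(np.eye(3))              # row-reversal permutation, P @ P = I
    Q_, R_ = np.linalg.qr((P @ A).T)      # (P A)^T = Q_ R_
    K = P @ R_.T @ P                      # upper-triangular
    Q = P @ Q_.T                          # orthogonal
    D = np.diag(np.sign(np.diag(K)))      # flip signs so diag(K) > 0
    return K @ D, D @ Q

def decompose_projection_matrix(M):
    """Split M ~ M_int @ R @ [I, -T] into (M_int, R, T)."""
    # Step 1: homogeneous camera centre = null vector of M
    _, _, Vt = np.linalg.svd(M)
    T_h = Vt[-1]
    T = T_h[:3] / T_h[3]
    # Step 2: RQ factorisation of the left 3x3 block M' = M_int R
    M_left = M[:, :3]
    if np.linalg.det(M_left) < 0:         # M is defined up to sign; keep det(R) = +1
        M_left = -M_left
    K, R = rq3(M_left)
    return K / K[2, 2], R, T

# Synthetic check with a hypothetical camera and an arbitrary overall scale
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
K_true = np.array([[800.0, 2.0, 320.0], [0.0, 810.0, 240.0], [0.0, 0.0, 1.0]])
T_true = np.array([0.3, -0.1, -2.0])
M = 2.5 * K_true @ R_true @ np.hstack([np.eye(3), -T_true.reshape(3, 1)])

K_est, R_est, T_est = decompose_projection_matrix(M)
```

The result would then serve as the starting point for the iterative refinement of the cross-product residual.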
3D vision: Stereopsis
Two viewing points provide disparity, which translates to depth
Stereo vision
For a point in the world, a single view cannot determine its location in space:
all points on the projective line to P map to p
Stereo vision
For a point in the world, two different views can determine its location in space
Epipolar geometry: Pixel correspondence
Find pairs of points that correspond to the same scene point
Epipolar geometry
epipoles e, e′
= intersection of the baseline with the image planes
= projection of each projection center in the other image
= vanishing point of the camera motion direction
epipolar plane = plane containing the baseline (1-D family)
epipolar line = intersection of an epipolar plane with the image
Example
Converging cameras
Calibrated camera: essential matrix
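where
$$[t_\times] = \begin{bmatrix} 0 & -t_3 & t_2 \\ t_3 & 0 & -t_1 \\ -t_2 & t_1 & 0 \end{bmatrix}; \quad \begin{cases} p = (u, v, 1)^\top \\ p' = (u', v', 1)^\top \end{cases}$$
Thus, we have
$$p^\top E p' = 0 \quad \text{with } E = [t_\times] R.$$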
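A small numerical sketch of the epipolar constraint. It assumes the convention $P = R P' + t$, where $P$ and $P'$ are the coordinates of the same scene point in the first and second camera frames, and that $p$, $p'$ are normalized (calibrated) image coordinates; under that assumption $P$, $t$ and $RP'$ are coplanar, which gives $p^\top [t_\times] R \, p' = 0$. The pose values and the helper names are illustrative.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t_x], so that skew(t) @ a == np.cross(t, a)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_matrix(R, t):
    """E = [t_x] R."""
    return skew(t) @ R

# Hypothetical relative pose (assumed convention: P = R @ P_prime + t)
c, s = np.cos(0.2), np.sin(0.2)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([0.5, 0.1, 0.0])
E = essential_matrix(R, t)

P_prime = np.array([0.3, -0.2, 4.0])   # scene point in the second camera frame
P = R @ P_prime + t                    # same point in the first camera frame
p = P / P[2]                           # (u, v, 1)
p_prime = P_prime / P_prime[2]         # (u', v', 1)

residual = p @ E @ p_prime             # should vanish up to rounding
singular_values = np.linalg.svd(E, compute_uv=False)
```

The singular values of $E$ come out as $(\|t\|, \|t\|, 0)$, which matches the rank-2 property with two equal nonzero singular values.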
Uncalibrated camera: fundamental matrix
Fundamental matrix F:
F is of rank 2
Has 7 degrees of freedom
There are 9 elements, but the overall scale is arbitrary and det F = 0
Essential matrix E:
E is of rank 2
Its two nonzero singular values are equal
Has only 5 degrees of freedom: 3 for rotation, 2 for translation
8-point algorithm
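Recall that the fundamental matrix is determined by the correspondence
pairs $\{(x_i, y_i)^\top, (x'_i, y'_i)^\top\}_{i=1}^{n}$. Writing the points in homogeneous form, $x_i = (x_i, y_i, 1)^\top$ and $x'_i = (x'_i, y'_i, 1)^\top$:
$$x_i'^\top F x_i = 0,$$
$$A f = 0,$$
where $f \in \mathbb{R}^9$ holds the entries of $F$ row by row and
$$A = \begin{bmatrix} x'_1 x_1 & x'_1 y_1 & x'_1 & y'_1 x_1 & y'_1 y_1 & y'_1 & x_1 & y_1 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x'_n x_n & x'_n y_n & x'_n & y'_n x_n & y'_n y_n & y'_n & x_n & y_n & 1 \end{bmatrix}$$
Normalized 8-point method
Normalize the points by shifting their centroid to the origin
Compute F by SVD, minimizing the mean squared error (MSE)
Enforce the rank-2 constraint
Output F after undoing the shift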
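The four steps of the normalized 8-point method can be sketched as follows. Beyond shifting the centroid to the origin, this sketch also rescales the points so that their mean distance from the origin is $\sqrt{2}$; that scaling is an assumption added here (it is the usual companion of the shift), and the function names are hypothetical. The returned $F$ satisfies $x'^\top F x = 0$ with rows of $A$ ordered as $[x'x, x'y, x', y'x, y'y, y', x, y, 1]$.

```python
import numpy as np

def normalise_points(pts):
    """Shift the centroid to the origin and scale to mean distance sqrt(2).

    Returns homogeneous normalised points (n, 3) and the 3x3 transform used.
    """
    centroid = pts.mean(axis=0)
    mean_dist = np.mean(np.linalg.norm(pts - centroid, axis=1))
    s = np.sqrt(2.0) / mean_dist
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return pts_h @ T.T, T

def eight_point(x, x_prime):
    """Estimate F (up to scale) from n >= 8 pairs; x, x_prime are (n, 2) arrays."""
    xn, T = normalise_points(x)
    xpn, Tp = normalise_points(x_prime)
    # One row per pair, matching f = F flattened row by row
    A = np.column_stack([
        xpn[:, 0] * xn[:, 0], xpn[:, 0] * xn[:, 1], xpn[:, 0],
        xpn[:, 1] * xn[:, 0], xpn[:, 1] * xn[:, 1], xpn[:, 1],
        xn[:, 0], xn[:, 1], np.ones(len(xn)),
    ])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)              # least-squares solution of A f = 0
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                            # enforce rank 2
    F = U @ np.diag(S) @ Vt
    F = Tp.T @ F @ T                      # undo the normalisation
    return F / np.linalg.norm(F)
```

A typical call would be `F = eight_point(x, x_prime)` with matched pixel co-ordinates from the two images; the residuals $x_i'^\top F x_i$ can then be inspected to judge the fit.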
Triangulation: calibrated camera
Finding P as the midpoint of the common perpendicular to the two rays in
space.
Linear triangulation:
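$$\begin{cases} x = MX \\ x' = M'X \end{cases} \implies \begin{cases} x \times MX = 0 \\ x' \times M'X = 0 \end{cases} \implies AX = 0,$$
where A is determined by the pairs (x, x′) and the camera matrices M, M′.
The linear system can be solved by
$$\min \|AX\|_2^2, \quad \text{subject to } \|X\|_2 = 1.$$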
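Linear triangulation can be sketched as below. Each view contributes two independent rows of $x \times MX = 0$, and the unit-norm minimizer of $\|AX\|$ is taken as the right singular vector of $A$ for its smallest singular value (an assumed, commonly used way to solve the constrained problem). The function name and the two synthetic cameras are hypothetical.

```python
import numpy as np

def triangulate(x, x_prime, M, M_prime):
    """Recover a 3D point from pixel co-ordinates x, x_prime (each (u, v))
    and the two 3x4 camera matrices M, M_prime."""
    u, v = x
    up, vp = x_prime
    # Two independent rows of x × (M X) = 0 per view
    A = np.array([
        u * M[2] - M[0],
        v * M[2] - M[1],
        up * M_prime[2] - M_prime[0],
        vp * M_prime[2] - M_prime[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                 # unit-norm minimiser of ||A X||
    return X[:3] / X[3]        # back to non-homogeneous co-ordinates

# Synthetic check: two parallel cameras, 0.2 units apart along the x axis
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
M = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
M_prime = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])
X_true = np.array([0.3, -0.1, 4.0])

def to_pixel(cam, X):
    h = cam @ np.append(X, 1.0)
    return h[:2] / h[2]

X_est = triangulate(to_pixel(M, X_true), to_pixel(M_prime, X_true), M, M_prime)
```

With noise-free pixels the two rays intersect exactly, so `X_est` should match `X_true` up to rounding; with noisy pixels the same code returns the algebraic least-squares point.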
Reconstruction via minimizing geometric error
Finding a pair $(\hat x, \hat x')$ whose rays intersect and which is close to $(x, x')$:
$$\min_{\hat x, \hat x'} d(x, \hat x)^2 + d(x', \hat x')^2, \quad \text{subject to } \hat x'^\top F \hat x = 0.$$
Rectification
Camera rectification for simplifying reconstruction
Re-project the image planes onto a common plane such that all epipolar lines are
horizontal, i.e. the two cameras are parallel
The distance T between the two optical centers is called the stereo baseline,
and is assumed to be known.
Depth and disparity from rectified camera
Point correspondence in rectified camera with baseline T
Notice that
$$\frac{T}{Z} = \frac{T - d}{Z - f} \implies Z = f \, \frac{T}{d}.$$
Thus, the disparity d is proportional to the inverse depth $1/|Z|$:
$$d \propto \frac{1}{|Z|}.$$
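A worked instance of $Z = fT/d$, as a minimal sketch. The numbers are made up for illustration: a focal length of 800 pixels, a baseline of 0.25 m, and disparities measured in pixels, so the depth comes out in metres. The function name is hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * T / d for a rectified stereo pair.

    disparity_px and focal_px are in pixels, baseline_m in metres.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

z_near = depth_from_disparity(40.0, 800.0, 0.25)   # 5.0 m
z_far = depth_from_disparity(10.0, 800.0, 0.25)    # 20.0 m
```

Dividing the disparity by four multiplies the depth by four, which is the inverse proportionality stated above.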
Correlation-based dense correspondence
3D scene reconstruction requires dense correspondences
Window matching
Issue: ambiguity exists when comparing only a single pixel intensity
Idea: compare the neighboring windows instead
For each window (e.g. 3 × 3), match it to the closest window on the epipolar line in
the other image.
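A minimal sketch of window matching for a rectified pair, where the epipolar line is the same image row. Two choices here are assumptions of the sketch, not taken from the slides: "closest" is measured by the sum of squared differences (SSD), and a pixel at column `col` in the left image is searched for at columns `col - d` in the right image for `d = 0 .. max_disparity`. The function name is hypothetical, and the caller is expected to pick `row` and `col` at least `half` pixels away from the image border.

```python
import numpy as np

def match_along_scanline(left, right, row, col, half=1, max_disparity=16):
    """Return the disparity d whose window in `right`, centred at (row, col - d),
    has the smallest SSD to the window in `left` centred at (row, col)."""
    patch = left[row - half:row + half + 1, col - half:col + half + 1].astype(float)
    best_d, best_cost = 0, np.inf
    for d in range(max_disparity + 1):
        c = col - d
        if c - half < 0:          # candidate window would leave the image
            break
        candidate = right[row - half:row + half + 1, c - half:c + half + 1].astype(float)
        cost = np.sum((patch - candidate) ** 2)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Synthetic pair: the right image is the left one shifted by 5 pixels
rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(20, 40))
right = np.roll(left, -5, axis=1)

d = match_along_scanline(left, right, row=10, col=20, half=1)
```

Increasing `half` gives the larger-window behaviour: fewer ambiguous matches, at the cost of blurring depth discontinuities.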
Results
Window matching with different patch sizes
Smaller patches: more detail, but noisy. Bigger patches: less detail, but smooth.
Depth map from Stereo
Solutions with other sensors
LiDAR in iPhone