
Automatic Tie-Point and Wire-frame Generation Using Oblique Aerial Imagery

By Seth Weith-Glushko

A senior project proposal submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in the Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology

November 15, 2003

Advisor Approval

I have read the attached proposal and believe Seth Weith-Glushko can accomplish the proposed work. I will work with Seth and provide the necessary resources to complete this research by the course deadlines.

Signature: ______________________________________ Carl Salvaggio, Ph.D.

Date: _______________

Abstract

The desire for better visualization methods leads researchers to explore how technology can be used to display data in different fashions. For example, measurements of an object's real-life dimensions can be made from a photograph through the use of photogrammetry. To enable a researcher to make these photogrammetric measurements, matched points within images, called tie points, must be defined. This is a time-consuming process, and methods are sought to reduce the human interaction needed to define these points. Another facet of visualization is the development of a realistic three-dimensional model of a scene. Using a method called bundle adjustment, a model can be formed from a series of images collected around a common point together with information about the image collection system utilized. The purpose of this research project is to develop two algorithms. The first is an automatic tie-point generator for oblique aerial images: given a series of images as input, it will output a listing of matched points across each unique pair that can be formed from the series. The second is a bundle adjustment for oblique aerial images: using inertial navigation system data, camera information, and a series of images, it will create a three-dimensional model as output.

Table of Contents

1. Background and Significance
2. Specific Aims
3. Experimental Design and Methods
4. Resources and Environment
5. Timetable
6. Budget
7. Budget Justification
8. Biographical Sketch
9. Literature Cited

Background and Significance

Photogrammetry, from its Latin roots, means "measure with light." It had its earnest beginnings with the works of Leonardo da Vinci in perspective and central projection, which continued on to Lambert and his mathematical principles allowing the recovery of an object point from image space. Coinciding with the development of scene geometry, Daguerre invented the daguerreotype, the first crude form of a photograph. Around 1840, the French geodesist François Arago began to advocate the beginnings of modern photogrammetry: the use of photography to make measurements of a scene. [1]

Konecny defined the development of photogrammetry in four stages: plane table photogrammetry, analog photogrammetry, analytical photogrammetry, and digital photogrammetry. Plane table photogrammetry is closely associated with the advanced development of cartography. Using terrestrial photographs, the first modern maps were created. Alongside this use, scientists began experimenting with aerial imagery in the forms of balloon and kite photography. Due to the inherent unreliability of these systems, improvements in the photographic systems were made. These improvements led to the analog photogrammetry stage, in which the development of stereoscopy led to improved measurements. Likewise, the invention of the airplane provided an improved platform for aerial imaging. It was during this stage that most of the theory and instrumentation was developed. [1]

With the advent of the computer came the stage known as analytical photogrammetry. Using matrix algebra, the theories of photogrammetry were transformed to handle multiple images around a common point. As such, it was possible to perform ultra-accurate measurements. However, to make these ultra-accurate

measurements, corresponding locations in multiple images, called tie points, must be defined. For much of the analytical photogrammetry stage, these were generated manually and, consequently, were subject to human error. This deficiency led to the advent of digital photogrammetry. [1]

The idea permeating digital photogrammetry is the complete removal of any analog system from the photogrammetric problem. The invention of the CCD array came close to this goal. Yet one analog system remained: the human generating tie points. Research into this area has been very minimal; untested theories abound in the written literature on the subject. Only recently have proven but crude and inefficient implementations been developed. These implementations have been performed on ortho-rectified images, that is, images that are flat and devoid of any information about the three-dimensional structure of an imaged object. No research has been performed on oblique imagery, or imagery taken at an angle.

The development of photogrammetry is born from the desire to improve our visualization of the world around us. In the past, two-dimensional methods such as photography have been used to image a three-dimensional world. For a time, this proved sufficient. Although two-dimensional methods remain prevalent, three-dimensional visualization has grown by leaps and bounds, as shown by the proliferation of highly realistic video game characters and virtual cinema worlds. For a human to consider a constructed three-dimensional scene realistic, it must be convincing. Properties such as depth, texture, and volume must be carefully manipulated to produce these convincing scenes. This can be done manually, but the process is financially costly and time-consuming. Accordingly, one would want to use our digital prowess to generate these scenes, since computer processing is relatively cheap and time management becomes a non-issue.

Bundle adjustment was created to harness this computing capability. The theory behind bundle adjustment is to estimate a three-dimensional scene by making inferences about its plane geometry, or structure, from a series of two-dimensional images around a common area. Bundle adjustment also estimates information about the texture, or view, of imaged objects. This texture can be applied to the three-dimensional scene to give it a sense of realism. What makes bundle adjustment different from the photogrammetric problem presented before is that bundle adjustment has been thoroughly theorized and developed; most of the current research concerns refinement of the methods. As before, bundle adjustment has not been performed on oblique aerial imagery because of the requirement that precise information about the location of the camera must be known.

Specific Aims

The purpose of this research project is to develop two algorithms: an algorithm that enables automatic tie-point generation across oblique imagery, and a bundle adjustment algorithm that uses oblique imagery. The development will be done in stages, with the development of each algorithm considered a stage. The tie-point stage will be of paramount importance. If time allows, work will progress on the bundle adjustment stage. Proof of the algorithms' feasibility will come in the form of engineering code. These algorithms are meant to be used as components of future computer programs. In the immediate future, the tie-point algorithm will become part of an electronic geologic survey software program that uses data from oblique aerial imagery; however, plans for the bundle adjustment algorithm have not been finalized.

In the future, these algorithms would become part of an advanced visualization program. This program would be used by a variety of professionals interested in generating a realistic three-dimensional scene from aerial data. For example, civil engineers could use the program to model how structures in a flood plain would be affected after a major rainfall. Likewise, insurance companies could use the data to perform targeted cost analysis for insurance users within the flood plain.

Experimental Design and Methods

The development of the two algorithms will come in the form of engineering code developed in the IDL environment. The tie-point algorithm will be formed by applying a series of transforms and algorithms before arriving at the end result. The input to the algorithm will be a pair of oblique images. If more than two images are input, the algorithm will be run on every possible pair that can be made from the series of images. The output will be a listing of matching points across the pair. An overview of the algorithm can be seen in Figure 1.

Figure 1 Overview of the proposed automatic tie-point generation algorithm: input → ortho-rectification transform → image processing → point generation → point matching → inverse ortho-rectification transform → output

Once the images have been read into the programming environment, ortho-rectification takes place. Ortho-rectification is the process by which an oblique image is transformed into a flat image, devoid of any three-dimensional information. The image is likened to a map where a straight-down aerial view is taken. The pair of equations shown below, linear in the unknown constants, is used to perform this process.
$$x = \frac{a_1 X + b_1 Y + c_1}{a_3 X + b_3 Y + 1} \qquad (1)$$

$$y = \frac{a_2 X + b_2 Y + c_2}{a_3 X + b_3 Y + 1} \qquad (2)$$

X and Y represent the ortho-rectified coordinates in the flattened image; x and y represent the original coordinates; and a1, b1, c1, a2, b2, c2, a3, and b3 represent scalar constants to be solved for. To solve for the constants, four image points with known ortho-rectified pairs must be defined; the corners of the oblique image are used as these points. [2] This transform is performed on both images, as sketched below. As a consequence of using this transform, it is necessary to perform some type of interpolation on the flattened image. Within image processing, there are three major types of interpolation: nearest neighbor, bilinear interpolation, and cubic convolution. Each type has its own strengths and weaknesses. Nearest neighbor produces a characteristic pixelation in an interpolated image, yet takes the least computation time and does not modify any of the original radiometry contained in the gray-level counts. Bilinear interpolation produces a smooth transition between grayscale values and takes a moderate amount of computation time. Finally, cubic convolution has an even smoother change between interpolated pixels but has a large computation time. As such, nearest neighbor interpolation will be used: the coordinates in the original image are rounded to find an appropriate grayscale value to place into the flattened image. [3]
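To make the preceding steps concrete, the following sketch shows how the eight constants of Eqns. (1) and (2) can be solved from four corner correspondences, and how rounding the computed source coordinates implements nearest-neighbor lookup. The proposal targets IDL; this is a minimal Python/NumPy stand-in, and the function names are illustrative rather than part of the proposed code.

```python
import numpy as np

def solve_projective(oblique_pts, ortho_pts):
    """Solve Eqns. (1)-(2) for a1, b1, c1, a2, b2, c2, a3, b3 given four
    (x, y) <-> (X, Y) correspondences, e.g. the oblique image corners."""
    rows, rhs = [], []
    for (x, y), (X, Y) in zip(oblique_pts, ortho_pts):
        # x * (a3*X + b3*Y + 1) = a1*X + b1*Y + c1 is linear in the unknowns
        rows.append([X, Y, 1, 0, 0, 0, -x * X, -x * Y]); rhs.append(x)
        rows.append([0, 0, 0, X, Y, 1, -y * X, -y * Y]); rhs.append(y)
    return np.linalg.solve(np.array(rows, float), np.array(rhs, float))

def oblique_xy(coeffs, X, Y):
    """Evaluate Eqns. (1)-(2): map ortho-rectified coordinates back to
    oblique ones; rounding gives the nearest-neighbor grayscale lookup."""
    a1, b1, c1, a2, b2, c2, a3, b3 = coeffs
    d = a3 * X + b3 * Y + 1.0
    return (np.rint((a1 * X + b1 * Y + c1) / d),
            np.rint((a2 * X + b2 * Y + c2) / d))
```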

Once the image has been flattened, image processing takes place. Image processing must be done to make the images similar in grayscale values, normalizing the data so that the point generator works better. The only processing to be performed is histogram processing, which involves transforming the histogram of one image into that of another. Figure 2 highlights this process.
Figure 2 Histogram specification processing (first image histogram → first image CDF → lookup table → second image CDF → second image histogram). Courtesy [4]

The histogram is taken for both images. Treating the normalized histogram as a probability distribution function, a cumulative distribution function (CDF) is calculated for each. For an arbitrary input grayscale value, an output grayscale value is calculated by following the path shown in Figure 2. The result is a lookup table that is applied to one of the images. The end result is a pair of images with similar grayscale values, independent of image structure. [3] It is important to note that although image content is changed, these changes are discarded once the matched points have been found.

The third step in the algorithm is point generation. Point generation is performed by applying a Laplacian of Gaussian (LoG) spatial filter to an image and then thresholding the result at an arbitrary value. Walli found that by applying this thresholded filter to an image, points with high-frequency detail (i.e., edges) could be isolated and defined. The theory behind this practice is that these points will be similar across images of a common point. An example of how this filter is applied can be seen in Figure 3 below. [5]

Figure 3 LoG filter applied to an image with cultural features: original image, LoG filtered image, GCP threshold. Courtesy [5]
Once these points have been defined, point matching algorithms are run. The point sets are run through a series of iterative algorithms, and after passing through each algorithm, more points are removed. The first algorithm is point distance comparison. This algorithm calculates a point's distance from every other point in a localized scene using Eqn. (3) and creates a matrix of distances. The process is outlined in Figure 4.

Figure 4 Outline of the point distance comparison algorithm. Courtesy [5]

$$\text{distance} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \qquad (3)$$

x1 and y1 represent the coordinates of one pixel, while x2 and y2 represent the coordinates of a second pixel within the image. The matrices that are formed are compared row by row to find the total number of matches. The two rows that have the greatest number of matches (within some arbitrary error) are considered matched points. Using these points, the second point matching algorithm is run: point scale comparison. As before, matrices containing distances between pixels are calculated using Eqn. (3). The process can be seen in Figure 5 below.

Figure 5 Outline of the point scale comparison algorithm. Courtesy [5]

The difference between the previous and current algorithms lies in their matching criteria. Whereas distances were simply compared before, ratios of distances between like points across images are compared here, as in Eqn. (4).

$$\frac{\text{dist}_{\text{first image}}(1 \text{ to } 2)}{\text{dist}_{\text{first image}}(1 \text{ to } 3)} = \frac{\text{dist}_{\text{second image}}(2 \text{ to } 1)}{\text{dist}_{\text{second image}}(2 \text{ to } 3)} \qquad (4)$$

If the ratios are equal, then the points are considered matched. The returned point set is then run through a LoG maxima comparison. Theory dictates that if points match, their LoG values will be similar. To counteract the effects of differing image structure, the LoG values for each image are normalized and then compared. If they are within a certain error value, the points are considered matched. [5]
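The distance and scale comparisons can be sketched as below. The exact matching criteria follow Walli [5]; the tolerances and the row-sorting bookkeeping here are simplified assumptions, and all names are illustrative (Python/NumPy).

```python
import numpy as np

def distance_matrix(pts):
    """Eqn. (3) evaluated between every pair of points (rows of pts)."""
    diff = pts[:, None, :] - pts[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def count_matching_distances(row1, row2, tol=1.0):
    """Point distance comparison: count near-equal entries between the
    distance-matrix rows of two candidate points."""
    r1, r2 = np.sort(row1), np.sort(row2)
    n = min(r1.size, r2.size)
    return int(np.sum(np.abs(r1[:n] - r2[:n]) < tol))

def ratios_match(dA_12, dA_13, dB_21, dB_23, tol=0.05):
    """Point scale comparison, Eqn. (4): compare ratios of like distances."""
    return abs(dA_12 / dA_13 - dB_21 / dB_23) < tol
```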

The next algorithm in the series is point angle comparison. With this algorithm, it is necessary to compare vertices. Hence, three points will be used to define an angle of interest. The process is shown in Figures 6 and 7.

Figure 6 Schematic of how each point is isolated into matrices containing angles for each vertex. Courtesy [5]


Figure 7 Example of how point matches are determined through a count of matching angles. Courtesy [5]
$$\angle[a, b, c] = \cos^{-1}\left(\frac{a^2 + b^2 - c^2}{2ab}\right) \qquad (5)$$

where:

a = distance from the left vertex to the middle vertex
b = distance from the right vertex to the middle vertex
c = distance from the left vertex to the right vertex.
By using the four points, numerous angles can be defined using Eqn. (5), which is derived from the Law of Cosines. These angles are input into a matrix set for each image. The matrix sets are then compared, matrix by matrix. Those matrices that have the largest number of matching angles (within some arbitrary error), and the points they contain, are considered matched. [5]

The next step in the tie-point algorithm involves a geometric transformation of one image to register it with the other image. A global polynomial distortion model is used to perform the transformation. Mathematically, this model is defined in the equations below.
$$x_m = a_{00} + a_{10} x_{ref,m} + a_{01} y_{ref,m} + a_{11} x_{ref,m} y_{ref,m} + a_{20} x_{ref,m}^2 + a_{02} y_{ref,m}^2 \qquad (6)$$

$$y_m = b_{00} + b_{10} x_{ref,m} + b_{01} y_{ref,m} + b_{11} x_{ref,m} y_{ref,m} + b_{20} x_{ref,m}^2 + b_{02} y_{ref,m}^2 \qquad (7)$$

x_m and y_m represent matched image coordinates in one image; x_ref and y_ref represent matched image coordinates in the other image; and a_nm and b_nm represent constants to be solved for. It is important to note that there is a unique pair of equations for each matched point. As such, there are multiple linear equations, which can be written in matrix form as seen below.


$$\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} 1 & x_{ref,1} & y_{ref,1} & x_{ref,1} y_{ref,1} & x_{ref,1}^2 & y_{ref,1}^2 \\ 1 & x_{ref,2} & y_{ref,2} & x_{ref,2} y_{ref,2} & x_{ref,2}^2 & y_{ref,2}^2 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_{ref,m} & y_{ref,m} & x_{ref,m} y_{ref,m} & x_{ref,m}^2 & y_{ref,m}^2 \end{bmatrix} \begin{bmatrix} a_{00} \\ a_{10} \\ a_{01} \\ a_{11} \\ a_{20} \\ a_{02} \end{bmatrix} = W \begin{bmatrix} a_{00} \\ a_{10} \\ a_{01} \\ a_{11} \\ a_{20} \\ a_{02} \end{bmatrix} \qquad (8)$$

with an analogous equation relating the y coordinates to the b coefficients,

or,

$$X = WA \qquad (9)$$

$$Y = WB \qquad (10)$$

These constants can be solved for using the matrix inverse when exactly the minimum number of needed matched points (six) is available, as below:

$$A = W^{-1} X \qquad (11)$$

$$B = W^{-1} Y \qquad (12)$$

or using the pseudo-inverse when there are more than six matched points. [4]
$$A' = (W^T W)^{-1} W^T X \qquad (13)$$

$$B' = (W^T W)^{-1} W^T Y \qquad (14)$$
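In code, building W from the matched reference points and solving the least-squares problem of Eqns. (13) and (14) might look as follows. This is a Python/NumPy sketch with hypothetical function names; np.linalg.lstsq computes the same solution as the pseudo-inverse.

```python
import numpy as np

def design_matrix(ref_pts):
    """Rows of W from Eqn. (8): [1, x, y, xy, x^2, y^2] per matched point."""
    x, y = ref_pts[:, 0], ref_pts[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

def fit_polynomial_model(ref_pts, matched_pts):
    """Solve Eqns. (13)-(14) for the a and b coefficient vectors."""
    W = design_matrix(ref_pts)
    A, *_ = np.linalg.lstsq(W, matched_pts[:, 0], rcond=None)
    B, *_ = np.linalg.lstsq(W, matched_pts[:, 1], rcond=None)
    return A, B

def apply_polynomial_model(A, B, ref_pts):
    """Evaluate Eqns. (6)-(7) to warp the reference points."""
    W = design_matrix(ref_pts)
    return np.column_stack([W @ A, W @ B])
```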

Once the geometric transformation has taken place, quality metrics on the registration between the two images are computed. The first quality metric is absolute mean variance (AMV), a calculated value that can be likened to a difference image. To calculate absolute mean variance, a simple algorithm is employed, as shown in Figure 8.


1. Overlap the first and second images.
2. Determine the digital count difference per pixel over the entire image.
3. Average the difference over the number of pixels of overlap.
4. Divide by the digital count range of the images.
5. Multiply the resulting number by 100 to find percent AMV.

Figure 8 Algorithmic approach for computing absolute mean variance
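Assuming the two images are already registered, cover the same overlap region, and share an 8-bit digital-count range, the steps of Figure 8 reduce to a few lines (Python/NumPy sketch; the function name is illustrative).

```python
import numpy as np

def absolute_mean_variance(img1, img2, dc_range=255.0):
    """Percent AMV per Figure 8: mean per-pixel digital-count difference
    over the overlap, normalized by the digital-count range."""
    diff = np.abs(img1.astype(float) - img2.astype(float))
    return 100.0 * diff.mean() / dc_range
```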

If the absolute mean variance is fairly low, it can be said with confidence that the point matching was accurate. This metric assumes, however, that no changes exist between the two images. This assumption was reasonable for the images the algorithm was originally applied to, LANDSAT imagery, an imaging system with large pixel sizes; few features within the image would change from exposure to exposure. Because the aerial imaging system used here has much smaller pixel sizes, this metric might not work as well.

A second quality metric is the root mean square distance error (RMSDE). Here, the geometrically transformed points are compared against the matched points in the first image. Distances are calculated using Eqn. (3) and a mean RMSDE is found. Using an iterative method, distance errors that are more than one standard deviation from the mean are removed and the geometric transformation is recalculated, until no more points can be removed. Those points that are left are considered matched. [5]
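A sketch of that iterative pruning is given below. It reuses the hypothetical fit and apply functions from the polynomial-model sketch above, and assumes at least six matches survive each pass so the model remains solvable (Python/NumPy).

```python
import numpy as np

def rmsde_prune(ref_pts, matched_pts, fit, apply_model):
    """Refit the transform and drop matches whose distance error lies more
    than one standard deviation from the mean, until none are removed."""
    keep = np.ones(len(ref_pts), dtype=bool)
    while True:
        A, B = fit(ref_pts[keep], matched_pts[keep])
        # Eqn. (3) distance between warped and matched points
        err = np.linalg.norm(apply_model(A, B, ref_pts) - matched_pts, axis=1)
        outlier = keep & (np.abs(err - err[keep].mean()) > err[keep].std())
        if not outlier.any():
            return keep  # boolean mask of the final matched point set
        keep &= ~outlier
```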


Finally, using this refined matched point set, an inverse ortho-rectification transform is performed. The oblique points are calculated by inputting the matched ortho-rectified points into Eqns. (1) and (2), using the same derived constants as the first ortho-rectification transform. As a result, a matched point set in an oblique coordinate system is found and output.

If time permits, a second algorithm will be developed: a bundle adjustment algorithm. A bundle adjustment algorithm uses multiple images of a scene to estimate the underlying plane geometry, or structure. To relate these multiple images, tie points are used to define areas of commonality. In addition to the images, the algorithm also requires information about the camera used to capture the scene, specifically its optical properties and location parameters (roll, pitch, yaw, global positioning system coordinates, and altitude above ground). In the case of oblique imagery, no specific redevelopment of the algorithm is required. Using the methods described in Cornou's paper "Bundle adjustment: a fast method with weak initialization" [6] and Pollefeys' tutorial 3D Modeling from Images [7], engineering code will be developed. For input, the algorithm will use all of the oblique images with their respective tie-point matches. It will also use the calibrated optical parameters of the camera and inertial navigation system data to specify its location parameters.
Resources and Environment

Due to the nature of the research, all that is required is a computer loaded with IDL and a DVD-ROM drive. The DVD-ROM drive is used to read test data. The primary investigator's personal computer satisfies this need.


Timetable

September 1, 2003 to November 15, 2003: Search for previous research and background knowledge
November 15, 2003 to April 1, 2004: Development of tie-point algorithm and engineering code; development of bundle adjustment algorithm if time allows
April 1, 2004 to May 15, 2004: Complete paper, poster, and presentation

Budget

Time: 2 credits, Winter Quarter; 2 credits, Spring Quarter
Money: $0.00 total

Budget Justification

Two credits have been budgeted for each of the winter and spring quarters, for a total of four credits. Due to the nature of the research and its external funding, much of the work to be performed will be done on paid time. Hence, only four credits are required, keeping flexibility in the primary investigator's schedule for graduation from the Rochester Institute of Technology.


Biographical Sketch

EDUCATION
Rochester Institute of Technology, Rochester, NY
Major: Imaging Science; Minor: Criminal Justice
Degree: Bachelor of Science, expected May 2004
GPA: 3.91/4.00; PFOS: 4.00/4.00

EMPLOYMENT
Digital Imaging & Remote Sensing Laboratory, RIT, Rochester, NY (April 2002-present)
- Developed custom quality control software for spectroscopic measurements
- Made spectral measurements using a variety of portable spectrometers
- Developing an on-line spectra metadata and tracking system
- Creating an image processing algorithm to find matching points in oblique aerial imagery

HONORS
- Dean's List (Fall, Spring 2000; Fall, Winter, Spring 2001; Fall, Winter, Spring 2002)
- RIT Presidential Scholarship
- Elizabeth Ellen Locke Scholarship
- Member, Honors & Leadership Program (September 2000-present)

Literature Cited

[1] Burtch, R. "History of Photogrammetry." 10 Nov. 2003. <http://www.ferris.edu/htmls/academics/course.offerings/burtchr/sure340/notes/History.pdf>
[2] Wolf, Paul R. Elements of Photogrammetry. 2nd ed. New York: McGraw-Hill, 1983.
[3] Gonzalez, Rafael C., and Richard E. Woods. Digital Image Processing. 2nd ed. Upper Saddle River: Prentice Hall, 2002.
[4] Salvaggio, C. "Digital Image Processing I Notes." 18 Nov. 2003. <http://www.cis.rit.edu/people/faculty/salvaggio/courses/1051-461/1051461.pdf>
[5] Walli, Karl C. "Multisensor Image Registration Utilizing the LoG Filter and FWT." Diss. Rochester Institute of Technology, 2003.
[6] Cornou, Sebastien, Michel Dhome, and Patrick Sayd. "Bundle adjustment: a fast method with weak initialization." British Machine Vision Conference 2002, 13 Sep. 2002. Ed. P. L. Rosin and D. Marshall. England: British Machine Vision Association, 2002. 223-232.

[7] Pollefeys, M. "3D Modeling from Images." 9 Oct. 2003. <http://www.esat.kuleuven.ac.be/~pollefey/tutorial/tutorialECCV.html>

