
Principal Component Analysis
GISC 9216- D2

Taylor J Dilliott, BA
March 12, 2018

1-109 Sanford Ave South, Hamilton, ON L8M 2G7


E: Taylor.J.Dilliott@gmail.com M: (289) 969-1847
Taylor J Dilliott, BA
1-109 Sanford Ave South, Hamilton, ON L8M 2G7
E: Taylor.J.Dilliott@gmail.com M: (289) 969-1847

March 12, 2018


Ms. Janet Finlay
Instructor
Niagara College, Niagara-on-the-Lake Campus
135 Taylor Road, Niagara-on-the-Lake, ON, L0S 1J0

Dear Ms. Finlay,

RE: GISC9216 D2 – Principal Component Analysis

Please accept this as my formal letter of transmittal for GISC9216 D2 - Principal Component
Analysis. Enclosed you will find the formal report and two formal map layouts, as per the
terms of reference.

I thoroughly enjoyed this assignment as a way to learn additional digital image processing
techniques. While the PCA classification did not change my results very much, I can
definitely see situations where it would be more useful in the future. I also enjoyed
comparing the two images and trying to observe the differences without the use of
processing software.

If you have any issues with the enclosed documents, or any questions, please do not hesitate
to contact me.

Sincerely,

Taylor J. Dilliott, BA
GIS-GM Certificate Candidate

TJD\

Enclosed: DilliottTGISC9216D2 – Principal Component Analysis Report


GISC 9216- D2
Principal Component Analysis March 12, 2018

Table of Contents
1.0 Introduction & Purpose
2.0 Background & Study Area
3.0 Methodology
3.1 Band Correlation Analysis
3.2 Principal Component Analysis
3.3 Unsupervised Classification
4.0 Image Comparison
5.0 Conclusion
Bibliography

List of Figures
Figure 1 - PCA Study Area
Figure 2 - Histogram Comparison between Bands 1, 3 and 4
Figure 3 - Comparison of Band 1 to Bands 3, 4 and 6, showing examples of Strong, Weak and No Correlation
Figure 4 - Principal Component Window with User Inputs
Figure 5 - PCA created Subset Image
Figure 6 - Histograms for Each of the newly created Bands
Figure 7 - Scatterplots showing Correlation between Each Band
Figure 8 - Unsupervised Classification Parameters
Figure 9 - PCA Unsupervised Classification
Figure 10 - Original Unsupervised Classification on the Left, PCA Unsupervised Classification on the Right
Figure 11 - Summary Matrix Display Window Showing Values for Class 6 (Trees)
Figure 12 - Matrix Comparing the Classifications of Each Image
Figure 13 - Closeup of the Town of Binbrook

List of Tables
Table 1 - Pre PCA Band Correlations based on Scatterplot Interpretation
Table 2 - Eigenvalues Used to Determine PCA Comparability


1.0 Introduction & Purpose


While remote sensing techniques are constantly evolving and becoming more accurate,
there is still a high degree of interband correlation present in many multispectral images. This
essentially means that digital images from different wavelength bands can often look similar and
therefore convey fundamentally similar information (Lillesand, Kiefer, & Chipman, 2015). This
was apparent in the previous assignment, where certain methods had difficulty distinguishing
between spectrally similar classes, such as barren fields and pavement.

Principal Component Analysis (PCA) is one method that attempts to reduce this redundancy, thereby
creating a clearer, more efficient classification process. PCA works by transforming the
information from all bands into a new set of uncorrelated components, of which only the first
few are kept, so the resulting dataset has fewer bands that can then be used for classification.
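
The transformation can be illustrated with a minimal NumPy sketch. It is not the routine used by the image processing software in this report; the function name `principal_components`, the synthetic 6-band image and the use of a covariance-based eigendecomposition are assumptions made purely for illustration.

```python
import numpy as np

def principal_components(image, n_components):
    """Project a (bands, rows, cols) image onto its leading principal axes.

    Hypothetical helper for illustration only.
    """
    bands, rows, cols = image.shape
    # Flatten so each row is one pixel and each column is one band.
    pixels = image.reshape(bands, -1).T.astype(float)
    centered = pixels - pixels.mean(axis=0)
    # Band-to-band covariance matrix (bands x bands).
    cov = np.cov(centered, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Keep the leading components and project every pixel onto them.
    scores = centered @ eigenvectors[:, :n_components]
    pc_image = scores.T.reshape(n_components, rows, cols)
    return pc_image, eigenvalues

# Synthetic stand-in for a 6-band subset: 3 underlying signals mixed into 6 bands.
rng = np.random.default_rng(0)
signals = rng.normal(size=(3, 40, 40))
mixing = rng.normal(size=(6, 3))
image = np.tensordot(mixing, signals, axes=1) + 0.01 * rng.normal(size=(6, 40, 40))

pc_image, eigenvalues = principal_components(image, n_components=3)
```

Because the new bands are projections onto eigenvectors of the covariance matrix, they are uncorrelated with one another by construction, which is the property examined with histograms and scatterplots later in the report.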

This project aims to test whether or not the Principal Component Analysis enhances the
quality of the classified image compared to the previously created classifications.

2.0 Background & Study Area


The Principal Component Analysis for this project will be carried out using the same subset
study area image that was created for D1. The image covers the Greater Hamilton area,
including Stoney Creek, Ancaster and part of Burlington, as well as much of the farmland to
the south. The study area can be seen in Figure 1.


Figure 1- PCA Study Area


3.0 Methodology
3.1 Band Correlation Analysis
Prior to conducting the Principal Component Analysis, the 6 bands present in the subset
image seen in Figure 1 had to be examined for correlation between wavelengths. This was
done using both histograms and scatterplots. In Figure 2 a comparison of histograms can be seen,
where Bands 1 and 3 show a strong correlation with each other and a very weak correlation
with Band 4.

Figure 2- Histogram Comparison between Bands 1, 3 and 4.

The comparison between these 3 bands is also shown in the Feature Space Images
(scatterplots), each of which is designed to show the correlation between 2 bands. Figure 3 shows
the comparison of Band 1 to Bands 3, 4 and 6 as scatterplots.

[Scatterplot panels: Band 1 vs Band 3 (Strong Correlation); Band 1 vs Band 4 (No Correlation); Band 1 vs Band 6 (Weak Correlation)]
Figure 3 - Comparison of Band 1 to Bands 3, 4 and 6, showing examples of Strong, Weak and No Correlation


Table 1 shows the correlations between all of the bands with one another, and provides a
label of Strong, Weak or None to describe the correlation.

Band Comparison       Correlation
Band 1 Vs Band 2      Strong
Band 1 Vs Band 3      Strong
Band 1 Vs Band 4      None
Band 1 Vs Band 5      Weak
Band 1 Vs Band 6      Weak
Band 2 Vs Band 3      Strong
Band 2 Vs Band 4      None
Band 2 Vs Band 5      Weak
Band 2 Vs Band 6      Weak
Band 3 Vs Band 4      None
Band 3 Vs Band 5      Weak
Band 3 Vs Band 6      Weak
Band 4 Vs Band 5      None
Band 4 Vs Band 6      None
Band 5 Vs Band 6      Strong
Table 1- Pre PCA Band Correlations based on Scatterplot Interpretation

As seen in the table, 10 of the 15 band comparisons returned at least a Weak correlation,
showing that a Principal Component Analysis should prove very useful.
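
The labels in Table 1 were assigned by visually interpreting the scatterplots. As a rough numerical counterpart, the sketch below computes a band-to-band correlation matrix and bins each coefficient into Strong, Weak or None. The function names, the synthetic bands and especially the cut-off values of 0.8 and 0.3 are hypothetical choices for illustration and were not used in this report.

```python
import numpy as np

def band_correlations(image):
    """Pearson correlation between every pair of bands in a (bands, rows, cols) array."""
    bands = image.shape[0]
    # Each row passed to corrcoef is one band flattened to a 1-D list of pixel values.
    return np.corrcoef(image.reshape(bands, -1))

def label_correlation(r, strong=0.8, weak=0.3):
    """Bin a correlation coefficient; the thresholds are assumed, not from the report."""
    magnitude = abs(r)
    if magnitude >= strong:
        return "Strong"
    if magnitude >= weak:
        return "Weak"
    return "None"

# Synthetic example: band_b closely follows band_a, band_c is independent of both.
rng = np.random.default_rng(1)
band_a = rng.normal(size=(50, 50))
band_b = band_a + 0.1 * rng.normal(size=(50, 50))
band_c = rng.normal(size=(50, 50))
image = np.stack([band_a, band_b, band_c])

corr = band_correlations(image)
labels = {
    (i + 1, j + 1): label_correlation(corr[i, j])
    for i in range(3)
    for j in range(i + 1, 3)
}
```

A numeric matrix of this kind would give a less subjective basis for the table than visual interpretation, although the choice of thresholds remains a judgment call.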

3.2 Principal Component Analysis


After observing the correlations between the various bands, the PCA could be
completed. Figure 4 shows the Principal Components window and the user-entered
parameters for completing the analysis. For this study, the number of components desired for
the output was set to 3, and the Eigen Matrix and Eigenvalues were set to write to new files.


Figure 4- Principal Component Window with User Inputs

After running the Principal Component Analysis, a new subset image is created along with
the Eigen Matrix and Eigenvalues. The Eigenvalues are used to determine how comparable
the new subset is to the old one: the sum of the eigenvalues of the 3 components that were
kept is expressed as a percentage of the sum of all 6 eigenvalues. This can be seen in Table 2,
below.

Band Number    Eigenvalue    Percentage of Total (%)
1              3957816       78.9489
2              986022.1      19.6688
3              57450.81      1.146
4              8313.468
5              2239.41
6              1294.927
Total Value    5013137       99.76366 (Bands 1 to 3)
Table 2 - Eigenvalues Used to Determine PCA Comparability

The value 99.76366 shows that the 3 retained components account for over 99.75% of the total
variance in the original 6 bands. This suggests that a classification based on the PCA image
should be very similar to the original classification output.
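
The arithmetic behind Table 2 can be reproduced in a few lines. The eigenvalues are copied from the table; the variable names are only illustrative.

```python
# Eigenvalues for the 6 components, as listed in Table 2.
eigenvalues = [3957816, 986022.1, 57450.81, 8313.468, 2239.41, 1294.927]

total = sum(eigenvalues)          # sum over all 6 components
retained = sum(eigenvalues[:3])   # sum over the 3 components that were kept

# Share of the total carried by each component, and by the retained set.
proportions = [value / total for value in eigenvalues]
percent_retained = 100 * retained / total

print(f"Retained by first 3 components: {percent_retained:.5f}%")
```

The remaining 3 components together hold less than a quarter of one percent of the total, which is why dropping them loses so little information.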


The output image from the PCA can be seen in Figure 5.

Figure 5 - PCA created Subset Image

In order to check for any remaining correlation between the new bands, histograms and
scatterplots were again used. Figure 6 shows the histograms for each of the 3 newly created PCA bands.

Figure 6 - Histograms for Each of the newly created Bands

The histograms show little to no correlation between the layers, indicating that the PCA worked.
Additional evidence of this can be seen in the scatterplots, shown in Figure 7.


[Scatterplot panels: Band 1 vs Band 2 (No Correlation); Band 1 vs Band 3 (No Correlation); Band 2 vs Band 3 (No Correlation)]
Figure 7 - Scatterplots showing Correlation between Each Band

As evidenced by the histograms and scatterplots, the Principal Component Analysis did
remove much of the overlap between bands found in the original 6 band subset image.

3.3 Unsupervised Classification


After creating the new subset image and comparing the correlation between the bands, the
unsupervised classification could be carried out. This was done using the Unsupervised
Classification tool and the same parameters that were used to create the original
Unsupervised Classification in D1. The parameters can be seen in Figure 8.

Figure 8 - Unsupervised Classification Parameters
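
For readers unfamiliar with unsupervised classification, the general idea of grouping pixels by spectral similarity can be sketched with a plain k-means clustering. This is a stand-in chosen for brevity and is not necessarily the algorithm, or the parameters, used by the Unsupervised Classification tool in this report. The function name `kmeans_classify`, the brightness-based initialisation and the synthetic image are all assumptions.

```python
import numpy as np

def kmeans_classify(image, n_classes, n_iterations=20):
    """Cluster the pixels of a (bands, rows, cols) image into n_classes groups."""
    bands, rows, cols = image.shape
    pixels = image.reshape(bands, -1).T.astype(float)
    # Initial centres: pixels spread evenly along total brightness (arbitrary choice).
    order = np.argsort(pixels.sum(axis=1))
    picks = order[np.linspace(0, len(pixels) - 1, n_classes).astype(int)]
    centres = pixels[picks].copy()
    for _ in range(n_iterations):
        # Squared distance from every pixel to every centre.
        distances = ((pixels[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = distances.argmin(axis=1)
        for k in range(n_classes):
            members = pixels[labels == k]
            if len(members) > 0:
                centres[k] = members.mean(axis=0)
    # Final assignment against the updated centres.
    distances = ((pixels[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return distances.argmin(axis=1).reshape(rows, cols)

# Synthetic 3-band image: dark left half, bright right half, plus a little noise.
rng = np.random.default_rng(2)
image = 0.1 * rng.normal(size=(3, 10, 10))
image[:, :, 5:] += 10.0

classes = kmeans_classify(image, n_classes=2)
```

The cluster numbers produced this way carry no meaning on their own, which is why a recoding step is needed before two classifications can be compared.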


The unsupervised classification was then recoded to match the output of the original
unsupervised classification from D1 in order to search for differences between the two
images. The recoded PCA unsupervised classification can be seen below in Figure 9 and a
formal map layout of the classification can be seen in Appendix A.
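
Recoding amounts to a lookup from each cluster number to a class code. A minimal sketch is given below; the lookup values and the small `clusters` array are hypothetical and do not reflect the actual recode table used for this image.

```python
import numpy as np

# Hypothetical lookup table: index = cluster number from the classifier,
# value = class code it is recoded to.
recode_table = np.array([0, 3, 1, 2, 5, 4, 6])

# Hypothetical 2 x 3 block of cluster numbers from an unsupervised classification.
clusters = np.array([[1, 1, 2],
                     [4, 5, 6]])

# Indexing the table with the raster applies the lookup to every pixel at once.
recoded = recode_table[clusters]
```

Using the same class codes for both classifications is what makes a pixel-by-pixel comparison between the two images meaningful.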

Figure 9 - PCA Unsupervised Classification

4.0 Image Comparison


After completing the recoding of the unsupervised classification, the comparison between
the PCA version and non-PCA version can begin. As a reminder, the original recoded
unsupervised classification is shown beside the new PCA unsupervised classification in
Figure 10. A formal layout of the original unsupervised classification can be seen in Appendix A.


Figure 10 - Original Unsupervised Classification on the Left, PCA Unsupervised Classification on the Right

Visually, it is very hard to determine any differences between the two images, with every pixel
appearing to be identical. In fact, no differences were found by visually comparing the
pixels. However, a summary report of the union of the two classifications shows that
there is minor pixel variability. An example of this window can be seen below in Figure 11.

Figure 11 - Summary Matrix Display Window Showing Values for Class 6 (Trees)

The Matrix Summary shows that for every class in the new classification, well over 99.5%
of the pixels match the original classification. The largest number of pixels falling outside
their original class is in the Trees class seen above, where 796 pixels differ, which equates
to 0.31%. Below is the full matrix comparing the two images.


Figure 12 - Matrix Comparing the Classifications of Each Image
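
A comparison matrix of this kind is a cross-tabulation of the two classified images. The sketch below shows one way to build it and to derive a per-class percentage of changed pixels. The function names and the tiny example rasters are hypothetical, and the percentage is calculated against the pixel count of each original class, which is an assumption about how the figures in this report were derived.

```python
import numpy as np

def comparison_matrix(original, pca, n_classes):
    """Cross-tabulate two classified rasters.

    Rows are classes in the original image, columns are classes in the PCA image,
    and each cell holds a pixel count.
    """
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    # add.at accumulates correctly even when the same cell is hit many times.
    np.add.at(matrix, (original.ravel(), pca.ravel()), 1)
    return matrix

def percent_changed(matrix):
    """Percentage of each original class whose pixels moved to a different class."""
    totals = matrix.sum(axis=1)
    unchanged = np.diag(matrix)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(totals > 0, 100.0 * (totals - unchanged) / totals, 0.0)

# Hypothetical 2 x 3 classified rasters with class codes 0 to 2.
original = np.array([[0, 0, 1],
                     [1, 2, 2]])
pca = np.array([[0, 0, 1],
                [2, 2, 2]])

matrix = comparison_matrix(original, pca, n_classes=3)
changed = percent_changed(matrix)
```

Pixels on the diagonal of the matrix kept their class, while off-diagonal cells show which class each changed pixel moved to.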

Given the numbers from the matrix, the PCA does not appear to have had much, if any, real
effect on the unsupervised classification of the urban and agricultural
environments, which are represented here by classes 4 and 5. As can be seen within the
matrix, class 4 differs by 246 pixels in total and class 5 by 291 pixels in total,
which account for a variation of 0.22% and 0.08% respectively, a largely negligible difference.

As an example, below is the area around the town of Binbrook, a primarily rural community
with a small urban center.

Figure 13 - Closeup of the Town of Binbrook


5.0 Conclusion
In conclusion, while the logic behind using a Principal Component Analysis makes sense, in
practice there does not seem to be a large enough difference between the two outputs to
justify the extra steps when creating an unsupervised classification, particularly when
working to a deadline.

That being said, if a different subset image with more subtle differences were used,
or even if the recoding were done differently, the analysis may yield a very different set of
results and therefore be worth the extra effort.

Bibliography
Lillesand, T. M., Kiefer, R. W., & Chipman, J. W. (2015). Remote Sensing and Image
Interpretation. Hoboken: John Wiley & Sons, Inc.


Appendix A: Formal Map Layouts
