Wishart Deep Stacking Network for Fast POLSAR Image Classification
Abstract— Inspired by the popular deep learning architecture, the deep stacking network (DSN), a specific deep model for polarimetric synthetic aperture radar (POLSAR) image classification is proposed in this paper, which is named the Wishart DSN (W-DSN). First of all, a fast implementation of the Wishart distance is achieved by a special linear transformation, which speeds up the classification of POLSAR images and makes it possible to use this polarimetric information in the following neural network (NN). Then, a single-hidden-layer NN based on the fast Wishart distance is defined for POLSAR image classification, which is named the Wishart network (WN) and improves the classification accuracy. Finally, a multi-layer NN is formed by stacking WNs, which is in fact the proposed deep learning architecture W-DSN for POLSAR image classification and improves the classification accuracy further. In addition, the structure of the WN can be expanded in a straightforward way by adding hidden units if necessary, and so can the structure of the W-DSN. As a preliminary exploration of formulating a specific deep learning architecture for POLSAR image classification, the proposed methods may establish a simple but clever connection between POLSAR image interpretation and deep learning. The experiment results tested on a real POLSAR image show that the fast implementation of the Wishart distance is very efficient (a POLSAR image with 768 000 pixels can be classified in 0.53 s), and that both the single-hidden-layer architecture WN and the deep learning architecture W-DSN for POLSAR image classification perform well and work efficiently.

Index Terms— Deep stacking network (DSN), POLSAR image classification, Wishart network (WN), Wishart deep stacking network (W-DSN).

I. INTRODUCTION

WITH the booming development of deep learning in recent years, many deep learning architectures have become well known in various fields, such as image processing [1]–[5], speech recognition [6], [7], natural language processing [8] and so on. These architectures include, but are not limited to, the Convolutional Neural Network (CNN) [2], [4], [9], [10], the Deep Belief Network (DBN) [3], [6], [8] and the Deep Stacking Network (DSN) [1], [7], [11], [12]. CNN is aimed at image processing, including natural image classification [2], [9], object detection [4] and scene labeling [5], [10]. DBN is a deep network with a probabilistic generative structure, which performs well in many tasks, including image recognition [3], speech recognition [6] and natural language processing [8]. With successful applications in many tasks (such as speech classification [7], [11], information retrieval [12], and image classification [1], [7]), the deep learning architecture DSN has received increasing attention as well. Although many fields have been studied, this paper focuses on issues in the field of remote sensing, where deep learning is not fully developed. To be exact, an exploration of polarimetric synthetic aperture radar (POLSAR) image classification by DSN is made in this paper.

Since more than one polarization is used, POLSAR acquires a much richer characterization of the observed land and plays an important role in many areas, such as the military, agriculture and geology [13], [14]. As a result, it is of great significance to interpret POLSAR images effectively. POLSAR image classification is one of the most fundamental issues in the process of interpretation, where each pixel in a POLSAR image is assigned to one class (such as urban, water, and grass). In fact, the task of POLSAR image classification corresponds to the task of scene labeling in natural images. Nevertheless, POLSAR data is always represented by a coherency/covariance matrix, which contains fully polarimetric information, rather than by a real-valued scalar/vector (as in gray/color images).

Methods for POLSAR image classification have been studied for decades. Taking the specificity of POLSAR data into consideration, both the scattering mechanism and the statistical property of POLSAR data are widely used in many classical methods, such as Pauli decomposition [15], Entropy/Alpha (H/α) decomposition [16] and the Wishart distance [17], [18]. In the usage of the scattering mechanism, a professional and thorough analysis of POLSAR data is of great importance to design a proper classifier [19]–[21], which is challenging for a scholar who does not specialize in POLSAR and still intends to settle the task by machine learning. As for the statistical property, the popular Wishart distance was proposed based on the Wishart distribution of the coherency matrix and covariance matrix [17], [22], [23]; it is actually a maximum likelihood classifier [17] and has been used in both unsupervised and supervised POLSAR image classification [18], [24], [25]. However, the calculation of the Wishart distance is time-consuming and the accuracy of the Wishart-distance classifier is low. Methods based on more complicated distributions take too much time in estimating parameters [26], [27]. Besides the

Manuscript received October 23, 2015; revised March 30, 2016; accepted April 28, 2016. Date of publication May 11, 2016; date of current version May 24, 2016. This work was supported in part by the National Basic Research Program (973 Program) of China under Grant 2013CB329402 and in part by the National Natural Science Foundation of China under Grant 61271302, Grant 61272282, Grant 61572383, and Grant 61573267. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. David Clausi.
The authors are with the Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, International Research Center for Intelligent Perception and Computation, School of Electronic Engineering, Xidian University, Xi'an 710071, China (e-mail: lchjiao@mail.xidian.edu.cn; fayliu77@163.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIP.2016.2567069
1057-7149 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
3274 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 25, NO. 7, JULY 2016
JIAO AND LIU: W-DSN FOR FAST POLSAR IMAGE CLASSIFICATION 3275
module-by-module and supervised way, without propagation over all modules.

As stated in the previous section, the classification of a POLSAR image is consistent with the scene labeling of a natural image, i.e., each pixel in the POLSAR image corresponds to one label. Fortunately, benefiting from the polarimetric characteristics of POLSAR data, each pixel can be recognized without considering its neighborhoods in the task of POLSAR image classification. Then a pixel is represented by itself here, rather than by a pixel block, and used as the input data. Furthermore, each POLSAR pixel is denoted by a complex-valued matrix, which is different from the real-valued pixel in a natural image. Specifically, the 3 × 3 complex coherency matrix T is used to represent a POLSAR pixel for its Wishart distribution, where the corresponding Wishart distance is fully used.

Based on DSN, this paper reveals a specific deep model named W-DSN, which is designed for POLSAR data and classifies each POLSAR pixel T effectively and efficiently.

III. FROM POLSAR DATA TO NN

To exploit a specific deep model for POLSAR image classification, the unique characteristics of POLSAR data should be considered. In what follows, it is fully discussed how the Wishart distance serves as a guide to transfer the information of POLSAR data to the NN.

A. Fast Implementation of Wishart Distance

Having a complex Wishart distribution [22], the coherency matrix and covariance matrix have been widely used in the analysis of POLSAR data [17], [18]. J. S. Lee proposed a maximum likelihood classifier based on the Wishart distribution using the covariance matrix [17], which is also suitable for the coherency matrix (the coherency matrix and covariance matrix can be transformed into each other by a linear operator). The complex coherency matrix T is chosen in this paper, which is conjugate symmetric, i.e.,

    T = [ T11   T12   T13
          T12*  T22   T23
          T13*  T23*  T33 ]

where T11, T22 and T33 are real-valued, the remaining elements are complex-valued, and •* denotes the conjugate of an element.

1) Wishart Distance: With the maximum likelihood classifier [17], a multi-look POLSAR pixel T is classified according to the so-called Wishart distance d(T|Tm), as shown in eq. (1),

    d(T|Tm) = Trace(Tm^{-1} T) + ln|Tm|    (1)

where Trace(•) is the trace of a matrix, •^{-1} is the inverse of a matrix, |•| is the determinant of a matrix, and Tm for class m is estimated from the training samples of class m and is regarded as the cluster mean of the m-th class. With the distance calculated for each class, a pixel is assigned to the class with the minimum distance.

Even though the Wishart distance was originally proposed for supervised classification [17], it has been used in unsupervised classification as well [18]. In unsupervised classification [18], J. S. Lee et al. initially clustered the POLSAR image by polarimetric decomposition, and then estimated the cluster mean Tm by averaging all pixels from class m, i.e.,

    Tm = E[T | T ∈ Ωm] = (1/|Ωm|) Σ_{T∈Ωm} T    (2)

where Ωm is the set of pixels from class m, and |Ωm| is the number of pixels in Ωm. In supervised classification, the labeled training data set Ω = Ω1 ∪ Ω2 ∪ ··· ∪ ΩM is given, where M is the number of classes. Then the cluster mean of each class can be calculated directly by eq. (2) with the labeled training data set.

2) A Linear Implementation of Wishart Distance: Although the Wishart distance has been widely used in the literature on POLSAR image classification [18], [24], [35], little attention has been paid to the high computational cost of calculating the Wishart distance. Traditionally, a two-level loop (where one level corresponds to the number of classes and the other corresponds to the number of pixels) is needed to compute the Wishart distance of each pixel from each cluster mean, since the equation of the Wishart distance contains matrix operations (as shown in eq. (1)). It is a very important issue in POLSAR image classification, since there are always millions of pixels in a POLSAR image and the Wishart distance of every pixel from each class should be calculated. In 2006, W. Wang et al. explored a fast implementation of H-Alpha Wishart classification [36]. But it only reduces the number of pixels used in computing new cluster means to speed up, without realizing that what matters most is the computation of the Wishart distance. A fast implementation of the Wishart distance is proposed in this paper, which exactly computes the Wishart distance by a special linear transformation.

As shown in eq. (1), there are two items in the Wishart distance, i.e., Trace(Tm^{-1} T) and ln|Tm|. The first item Trace(Tm^{-1} T) is usually calculated by multiplying Tm^{-1} by T to obtain the matrix Tm^{-1}T first, and then computing the trace of the obtained matrix. It needs 27 multiplication operations and 20 addition operations, given the obtained Tm^{-1}. Let Σ = Tm^{-1}T. Both Tm^{-1} and T are 3 × 3 complex matrices, and so is Σ. It is noticed that the trace of a matrix is just the summation of the diagonal elements of the matrix. Namely, the computation of obtaining the whole of Σ is redundant and only the diagonal elements are necessary. The fast implementation of the Wishart distance is based on leaving out this redundant computation.

Let σ = f(Σ) be a function which arranges all elements of the matrix Σ into a column and outputs the column-vector form of Σ (e.g., f(T) = [T11, T12, T13, T12*, T22, T23, T13*, T23*, T33]^T). Σ = f^{-1}(σ) is the inverse function of σ = f(Σ). In the following part of this paper, the n-th pixel in a POLSAR image is denoted by tn, n = 1, 2, ..., N, where N is the total number of POLSAR pixels. Then let T = [t1, t2, ..., tN], where tn = f(Tn) is the column-vector form of the matrix Tn and Tn is the coherency matrix of the n-th pixel in the POLSAR image. What's more, let W = [w1, w2, ..., wM], where wm = f(Tm^{-1}), m = 1, 2, ..., M.
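As a concrete illustration, the cluster-mean estimation of eq. (2) and the construction of W and b can be sketched in a few lines of NumPy. This is our own minimal sketch under stated assumptions, not the authors' code: the function names (`vectorize_coherency`, `cluster_means`, `build_classifier`) are hypothetical, and a row-major flattening is assumed for f(·).

```python
import numpy as np

def vectorize_coherency(T):
    """f(.): arrange a 3x3 complex matrix into its 9-dim column-vector form
    (row-major flattening; hypothetical name for the paper's f)."""
    return T.reshape(9)

def cluster_means(coherency, labels, M):
    """Eq. (2): the cluster mean T_m is the average of the coherency
    matrices of all labeled pixels from class m."""
    return np.stack([coherency[labels == m].mean(axis=0) for m in range(M)])

def build_classifier(means):
    """Section III-A.2: stack w_m = f(T_m^{-1}) into W (9 x M) and the
    offsets b_m = ln|T_m| into b (M,)."""
    W = np.stack([vectorize_coherency(np.linalg.inv(Tm)) for Tm in means],
                 axis=1)
    b = np.array([np.log(np.linalg.det(Tm)).real for Tm in means])
    return W, b
```

Once W and b are built (the 'preparation' step, done once per set of cluster means), every subsequent distance evaluation reduces to vector arithmetic.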
Fig. 2. Comparison of the traditional implementation and the fast implementation. (a) Traditional implementation. (b) Fast implementation.

Thus, it is easy to notice the fact that Trace(Tm^{-1} Ti) = (wm)^T ti, where both wm and ti are 9-dim complex column vectors and (wm)^T is merely the transposition of wm without conjugation. The computation of (wm)^T ti includes 9 multiplication operations and 8 addition operations, which are almost just one third of what Trace(Tm^{-1} Ti) needs in the traditional way. Furthermore, let

    b = [ln(|T1|), ln(|T2|), ..., ln(|TM|)]^T

be a column vector. As a result, the Wishart distances of a pixel t from the different cluster means are calculated by the linear transformation W^T t + b, where t = f(T). Thus the Wishart distance matrix is calculated by eq. (3),

    D = W^T T + B    (3)

where B = [b, b, ..., b] is formed by repeating b N times, and

    D = [ d(T1|T1)  d(T2|T1)  ···  d(TN|T1)
          d(T1|T2)  d(T2|T2)  ···  d(TN|T2)
          ···
          d(T1|TM)  d(T2|TM)  ···  d(TN|TM) ]

is the matrix of Wishart distances, where D(m, n) denotes the Wishart distance of the n-th pixel Tn from the m-th cluster mean Tm. In this way, the Wishart distance of each pixel from each cluster mean is calculated by eq. (3) without looping, but with the relative efficiency of compiled array-handling code, while it needs a two-level loop by eq. (1). The omission of loops makes it convenient to execute on a computer, so the method shown in eq. (3) is a fast and linear implementation of the Wishart distance matrix.

A clear comparison is demonstrated in Fig. 2. The fast implementation leaves out the redundant computation of Σ, and it is further organized so that the Wishart distance matrix is calculated by a special linear transformation. As a result, the efficiency of calculating the Wishart distance is highly improved. In addition, the more classes and pixels there are, the higher the acceleration, since the two-level loop is omitted. The linear form of the Wishart distance is the key point to establish the following WN.

Experiment results listed in section V verify this improvement: a POLSAR image with 768000 pixels and 15 classes can be classified in 0.53s, while it needs over 330s with the traditional implementation by eq. (1).

B. Wishart Network

Despite the fact that it is highly efficient to compute the Wishart distance by the proposed method as discussed in section III-A, it does not improve the classification accuracy at all. This is because the proposed eq. (3) is just a fast but exact implementation of the Wishart distance. As a matter of fact, the classification accuracy of the Wishart classifier depends on the cluster mean of each class, which is not changed in the fast implementation of the Wishart distance. In supervised POLSAR image classification, the cluster mean of class m is simply estimated by averaging the pixels from class m, as shown in eq. (2). Inspired by the method of learning in machine learning, an effort to explore higher accuracy is made with a learning method named WN, for the task of POLSAR image classification.

1) The Definition of Wishart Network (WN): The Wishart distance matrix D shown in eq. (3) is a linear transformation of T, with W and B as the weight parameter and bias parameter. The n-th pixel is assigned to the class with the minimum distance in the n-th column of D. In supervised POLSAR image classification by the Wishart classifier, labeled training pixels tend to be classified correctly with proper cluster means. Let the labeled training pixel set be Ω = {(t1^l, y1^l), (t2^l, y2^l), ..., (tK^l, yK^l)}, where tk^l, k = 1, 2, ..., K, denotes the k-th labeled pixel, yk^l is the corresponding label vector, yk^l is a unit column vector in which the index of the only non-zero element indicates the label of tk^l, and K is the number of labeled training pixels in Ω.

Let W = [w1, w2, ..., wM], where each cluster mean is associated with a column of W, as discussed
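The two implementations compared in Fig. 2 can be sketched as follows. This is our own NumPy sketch under stated assumptions, not the authors' MATLAB code; we flatten matrices row-major, and since for a Hermitian Tn one has Trace(A Tn) = Σ_ij A_ij · conj(Tn)_ij, the pixel vectors carry the conjugate so that a plain dot product provably reproduces the trace.

```python
import numpy as np

def wishart_loop(pixels, means):
    """Traditional implementation: a two-level loop over classes and
    pixels, evaluating eq. (1) with explicit matrix operations."""
    M, N = len(means), len(pixels)
    D = np.empty((M, N))
    for m, Tm in enumerate(means):
        Tm_inv = np.linalg.inv(Tm)
        log_det = np.log(np.linalg.det(Tm)).real
        for n, Tn in enumerate(pixels):
            D[m, n] = np.trace(Tm_inv @ Tn).real + log_det
    return D

def wishart_linear(pixels, means):
    """Fast implementation, eq. (3): D = W^T T + B as one matrix product.
    For Hermitian Tn, Trace(A @ Tn) = sum_ij A_ij * conj(Tn)_ij, hence
    the conjugate on the flattened pixel vectors."""
    W = np.stack([np.linalg.inv(Tm).reshape(9) for Tm in means], axis=1)
    b = np.array([np.log(np.linalg.det(Tm)).real for Tm in means])
    T = np.stack([np.conj(Tn).reshape(9) for Tn in pixels], axis=1)  # 9 x N
    return (W.T @ T).real + b[:, None]

# Each pixel is assigned to the class with the minimum distance in its
# column of D, e.g. labels = D.argmin(axis=0).
```

With millions of pixels, the loop-free version delegates all the work to one compiled matrix product, which is where the reported speedup of the fast implementation comes from.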
Algorithm 2 Training of W-DSN

combined with the raw input data to form a new input in the adjacent higher module (e.g., [T; Y1]), the output (e.g., Y1) is exactly the linear combination of the hidden units in the lower module, without estimating the predicted label. That is, only the output of the last module (e.g., Y3) is used to estimate the final predicted label in the process of testing.

A detail should be noticed: the module used here can be the basic WN with M hidden units or the expanded WN with more hidden units. The more hidden units there are, the higher the ability of the WN to express a complicated function, and so for the W-DSN. If the number of classes is large, or the training pixels from different classes are difficult to distinguish from each other, so that the classification function must be complicated enough to distinguish these pixels, more hidden units will be needed.

Some notes are laid out here for a clearer understanding of our work. W-DSN is different from DSN in that the modules stacked in W-DSN are WNs, which are defined specially for POLSAR data in section III. Thus the initialization and training of W-DSN are modified accordingly, despite the similar procedure.

B. Analysis of W-DSN

Although the explanation that the estimated label vector of a higher module can be regarded as a rectification of that of the lower one is intuitive, a rigorous mathematical analysis is provided. For simplicity, only the first two lower modules are considered, since higher modules work in a similar way.

When the lowest module is trained, the optimization is

    min E1 = Σ_{k=1}^{K} ||yk^l − U1^T sigm(W1^T tk,1)||^2,    (10)

where tk,1 = [tk; 1], as shown in eq. (6). When the second lowest module (which is adjacent to the lowest one) is trained, the optimization is

    min E2 = Σ_{k=1}^{K} ||yk^l − U2^T sigm(W2^T tk,2)||^2,    (11)

where tk,2 = [tk,1; yk,1]. The initialization of the parameter is W2 = [W1; P], where P is an M × M random real-valued matrix associated with yk,1. Thus E2 = Σ_{k=1}^{K} ||yk^l − U2^T sigm(W1^T tk,1 + P^T yk,1)||^2. If P is a null matrix, the two optimizations shown in eq. (10) and (11) are the same, i.e., min E1 is equal to min E2. Yet P is a variable in the second optimization, hence the latter optimization is a generalization of the former one. Consequently, E2 may have a smaller minimum than E1, and the parameters learned by minimizing E2 provide a better estimated label vector than those learned by minimizing E1. Namely, the second lowest module provides a better estimated label vector than the lowest one. This continues for higher modules, so a W-DSN with more modules can improve the classification accuracy further.

It has been discussed that both the W-DSN (with more than one WN) and the expanded WN have a higher ability to express a complicated classification function than the basic WN. Therefore, a W-DSN built from expanded WNs will exhibit better classification performance than both of them. Experiment results demonstrated in the following section confirm this conclusion completely.

V. EXPERIMENTS

The POLSAR image on which the proposed methods are tested is related to the site of Flevoland, the Netherlands. The size of this POLSAR image is 750*1024 and it has been used in many papers [37], [38]. Fig. 4 illustrates the corresponding Pauli RGB image and the groundtruth respectively. There are 15 classes in this groundtruth, where each class indicates a type of land cover and is identified by one color. 167712 pixels are labeled in the groundtruth and only 5% of them are used as training pixels. The reported testing accuracies are obtained by testing on the 95% residual pixels. What's more, the proposed methods are compared with six other methods, including Support Vector Machine (SVM) [30], Radial Basis Function (RBF) network, Wishart classifier, DBN with RBM [32], DBN with WRBM [33] and DBN with WBRBM [34]. The importance of the proposed W-DSN is emphasized by comparing W-DSN with DSN separately, which is demonstrated in section V-D.

As discussed in previous sections, the input data of WN and W-DSN is the column-vector form of the coherency matrix, i.e., t = f(T). The Wishart classifier classifies a pixel using the original coherency matrix T, while in the last five comparing methods a POLSAR pixel is denoted by a 9-dim real-valued vector, i.e., [T11, T22, T33, re(T12), im(T12), re(T13), im(T13), re(T23), im(T23)], where Ti,j denotes the element located in the i-th row and the j-th column, and re(•) and im(•) denote the real part and the imaginary part of a complex number respectively.
TABLE I
C OMPARISON OF T RADITIONAL AND FAST I MPLEMENTATION
This real-valued vector contains all the information of T, and so does t.

Experiment results below are reported to confirm the three main contributions of this paper, including the fast implementation of the Wishart distance, the single-hidden-layer network WN, and the deep learning architecture W-DSN for POLSAR image classification. Finally, a comprehensive evaluation is provided. All the experiments are conducted in MATLAB on a 3.20 GHz machine with 4.00 GB RAM.

A. The Comparison of Traditional and Fast Implementation of Wishart Distance

As discussed in section III, the proposed fast implementation of the Wishart distance is achieved by a special linear transformation, without the redundant computation and the two-level loop. In supervised classification, Wishart classifiers with the traditional implementation and with the fast implementation are both tested. A simple preparation (i.e., the construction of W and b, which corresponds to the first loop in Fig. 2(b) and the 'preparation' part of Algorithm 1) is needed in the fast implementation, and the corresponding time consumption is called 'preparation time'. The time for classifying the whole image of 768000 pixels is called 'classification time'.

Running times and accuracies are listed in TABLE I. The proposed fast implementation takes 0.0003s to construct W and b, which is unnecessary in the traditional implementation. The traditional implementation takes 336.4s to classify the whole image, while the fast implementation takes just 0.5288s, the latter being less than 1/600 of the former. This is because the redundant computation and the two-level loop embedded in the traditional implementation are left out of the proposed fast implementation. Even though an extra step is needed in the fast implementation, it works very efficiently, noting that the preparation time is negligible (0.0003s) and the speedup is large (over 600 times). The accuracy of the Wishart distance implementation is unchanged, since the fast implementation gives exactly the same results as the traditional method.

B. Effectiveness of WN

The proposed learning method WN, which is initialized by the cluster mean of each class and uses the column-vector form of the original coherency matrix as input data, improves the accuracy by supervised training. The result of WN is compared with those of SVM, RBF, the Wishart classifier, and DBNs with RBM, WRBM and WBRBM respectively. For fairness, there is only one hidden layer in each DBN and a linear classifier is used, the same as in the basic WN. In the following, these three DBN methods are called RBM, WRBM and WBRBM respectively for conciseness.

First, all methods are repeatedly tested 50 times without changing the training samples and the total accuracies are demonstrated in a boxplot in Fig. 5, to explore their robustness to randomness. It is clear that SVM, RBF, the Wishart classifier and the proposed WN are stable, i.e., not affected by random parameters. However, RBM, WRBM and WBRBM
TABLE II
T HE A CCURACIES
Fig. 5. Total accuracy of each method by repeatedly running.
Fig. 6. Total accuracy of each method by repeatedly sampling.

are sensitive to some extent, where RBM is the most sensitive one, WRBM performs more stably than RBM due to its use of some polarimetric information, and WBRBM is the most stable of these three methods because of its stricter mathematical support.

Second, 5% of the labeled pixels are repeatedly sampled 50 times as training samples for each method, and the total accuracies are shown in a boxplot as well, as in Fig. 6. Similarly, SVM, RBF, the Wishart classifier and the proposed WN perform rather stably with different training samples. However, the other three comparing methods are unstable, because they are not task-oriented. Note that WRBM is less stable than RBM here, while WBRBM is still more stable than both of them (the total accuracies of RBM, WRBM and WBRBM are not as high as those listed in [32]–[34], since more hidden layers with more units were employed there and nonlinear classifiers were used).

Experiment results demonstrated in Fig. 5 and Fig. 6 show that the proposed WN is robust to both random parameters and different training samples, with relatively high accuracy.

Third, individual class accuracies are listed in TABLE II for better insight, since they are often more attractive to users. In addition, the expanded WNs with 2M, 3M and 4M hidden units respectively are tested as well, to illustrate that the addition of hidden units indeed gives rise to an improvement in accuracy.

To analyze the individual class accuracies, it should be declared that the percentages of the different classes in the whole training sample set are 7.89%, 10.77%, 6.08%, 4.20%, 5.72%, 4.52%, 3.04%, 5.98%, 6.65%, 13.27%, 3.77%, 8.27%, 9.78%, 0.43% and 9.63% respectively, corresponding to the class order listed in TABLE II. Even though SVM, RBF and the Wishart classifier reach similar total accuracies, SVM and RBF misclassify
TABLE III
A CCURACY OF W-DSN W ITH D IFFERENT N UMBERS OF M ODULES
TABLE IV
A CCURACY AND RUNNING T IME
Fig. 10. (a)∼(i) show the classification results of SVM, RBF, Wishart classifier, RBM, WRBM, WBRBM, WN (with M hidden units), WN (with 4M hidden
units) and W-DSN (of 3 modules with 4M hidden units in each module).
The 'preparation time' of the Wishart classifier is renamed 'training time', to keep it consistent with the other methods. For RBM, WRBM and WBRBM, both the time for unsupervised training and the time for supervised training are contained in the 'training time'. Besides, DSN is not included here because of its rather poor performance, as revealed in section V-D.

According to the results listed in TABLE IV, the fast implementation of the Wishart distance works very efficiently and greatly reduces the time consumption compared with the traditional implementation. WN with M hidden units is the original form of the learning method and achieves a higher accuracy than the Wishart classifier, at the cost of 2.2851s for training. With the same number of hidden units, WN with M hidden units takes much less time than RBM, WRBM and WBRBM (3s is much less than 61.34s, 69.56s and 58.77s), while WN still achieves higher accuracy. With more hidden units in WN, the accuracy becomes higher (i.e., the accuracy of WN with 4M hidden units, 0.9018, is higher than that of WN with M hidden units, 0.8650), while only a little more time is needed (17s). W-DSN with more modules results in higher accuracy as well (i.e., the accuracy of W-DSN with 3 modules, 0.9258, is higher than that of W-DSN with 1 module (i.e., WN), 0.9018). Even though both WN with more hidden units and W-DSN with more modules take much more time for training, the total time is much less than that of the comparing methods. Specifically, W-DSN with 3 modules takes 50s to complete the task of classification, while the Wishart classifier (traditional implementation), SVM and RBF take 336.40s, 295.33s and 988.89s respectively. In short, the proposed WN and W-DSN achieve higher accuracy with less time.

Meanwhile, the corresponding classification results are also illustrated in pictures, as shown in Fig. 10. It is clear that Fig. 10(g) shows better results than Fig. 10(a)∼(f), confirming the conclusion that WN does achieve better results compared with SVM, RBF, the Wishart classifier, RBM, WRBM and WBRBM. By comparing Fig. 10(g), (h) and (i) with each other, it is clearly illustrated that WN with more hidden units or W-DSN with more modules can improve the
classification results, which agrees with the numerical values [17] J. S. Lee, M. R. Grunes, and R. Kwok, “Classification of multi-look
listed in TABLE III. polarimetric SAR imagery based on complex Wishart distribution,” Int.
J. Remote Sens., vol. 15, no. 11, pp. 2299–2311, 1994.
[18] J.-S. Lee, M. R. Grunes, T. L. Ainsworth, L.-J. Du, D. L. Schuler, and
VI. CONCLUSION AND FUTURE WORK

A deep learning architecture named W-DSN, specialized for POLSAR image classification, is constructed in this paper. Two further methods for POLSAR image classification are also proposed in the course of constructing the W-DSN: a fast implementation of the Wishart distance and a single-hidden-layer network named WN. The fast implementation of the Wishart distance makes it possible to directly speed up any method based on the Wishart distance, and WN improves the classification accuracy at high speed. Most importantly, the W-DSN improves the accuracy further, proving to be an effective and efficient deep architecture specialized for POLSAR image classification.
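The fast Wishart distance rests on the observation that d(⟨T⟩, Σ_c) = ln|Σ_c| + Tr(Σ_c⁻¹⟨T⟩) is linear in the entries of the coherency matrix ⟨T⟩, since Tr(Σ⁻¹T) = vec((Σ⁻¹)ᵀ)·vec(T). The distances of every pixel to every class center can therefore be produced by one matrix product plus a bias. The NumPy sketch below illustrates this idea only; the function name and array shapes are ours, not the authors' code:

```python
import numpy as np

def wishart_distances(T_pixels, class_centers):
    """Wishart distance d(T, Sigma_c) = ln|Sigma_c| + Tr(Sigma_c^{-1} T),
    evaluated for all pixels and classes as a single linear transform.

    T_pixels:      (N, 3, 3) complex coherency matrices, one per pixel.
    class_centers: (C, 3, 3) complex class-center coherency matrices.
    Returns:       (N, C) real matrix of distances.
    """
    N = T_pixels.shape[0]
    # Bias: ln|Sigma_c| per class (determinant of a Hermitian PD matrix is real).
    b = np.array([np.log(np.linalg.det(S).real) for S in class_centers])
    # Weights: row c holds vec((Sigma_c^{-1})^T), because with row-major
    # flattening Tr(Sigma^{-1} T) = vec((Sigma^{-1})^T) . vec(T).
    W = np.stack([np.linalg.inv(S).T.reshape(-1) for S in class_centers])
    x = T_pixels.reshape(N, -1)      # vec(T) for every pixel
    return (x @ W.T).real + b        # one GEMM + bias: the "linear" Wishart distance

# Classification then reduces to a nearest-center rule:
# labels = wishart_distances(T, centers).argmin(axis=1)
```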
Spatial information, which is also very important in POLSAR image classification, will be considered in our future work to complete the classification more rapidly and precisely.
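The stacking step itself follows the standard DSN recipe: each module is a single-hidden-layer network, and the input to module k is the raw feature vector concatenated with the predictions of module k−1. The sketch below shows only this wiring; the class name, shapes, and the ReLU hidden layer are illustrative stand-ins for the Wishart-based hidden layer of WN described in the paper:

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

class StackedModules:
    """Minimal DSN-style stack: every module is a one-hidden-layer net whose
    input is the original features concatenated with the previous module's
    output (a sketch of the stacking idea, not the authors' exact W-DSN)."""

    def __init__(self, weights):
        # weights: list of (W_hidden, W_out) pairs, one pair per module.
        self.weights = weights

    def forward(self, x):
        feats = x
        out = None
        for W_h, W_o in self.weights:
            h = np.maximum(feats @ W_h, 0.0)  # hidden layer (ReLU assumed here)
            out = softmax(h @ W_o)            # per-class predictions of this module
            feats = np.hstack([x, out])       # augment the input for the next module
        return out
```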
3286 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 25, NO. 7, JULY 2016
Licheng Jiao (SM'89) received the B.S. degree from Shanghai Jiaotong University, China, in 1982, and the M.S. and Ph.D. degrees from Xi'an Jiaotong University, Xi'an, China, in 1984 and 1990, respectively. Since 1992, he has been a Professor with the School of Electronic Engineering, Xidian University, Xi'an, where he is currently the Director of the Key Laboratory of Intelligent Perception and Image Understanding of the Ministry of Education of China, International Research Center of Intelligent Perception and Computation. His current research interests include intelligent information processing, image processing, machine learning, and pattern recognition. He is a member of the IEEE Xi'an Section Execution Committee, the President of the Computational Intelligence Chapter, the IEEE Xi'an Section and IET Xi'an Network, the Chairman of the Awards and Recognition Committee, the Vice Board Chair-Person of the Chinese Association of Artificial Intelligence, a Councilor of the Chinese Institute of Electronics, a Committee Member of the Chinese Committee of Neural Networks, and an Expert of the Academic Degrees Committee of the State Council.

Fang Liu was born in China, in 1990. She received the B.S. degree in information and computing science from Henan University, Kaifeng, China, in 2012. Since then, she has been taking successive post-graduate and doctoral programs with the Key Laboratory of Intelligent Perception and Image Understanding of the Ministry of Education, Xidian University, Xi'an, China. Her research interests include deep learning, polarimetric SAR image classification, and change detection in polarimetric SAR images.