Deep Wavelet Network for Image Classification

Salwa Said (1,2), Olfa Jemai (1,2), Salima Hassairi (2), Ridha Ejbali (2), Mourad Zaied (2) and Chokri Ben Amar (2)
(1) Higher Institute of Computer Science and Multimedia, University of Gabes, Tunisia
(2) REGIM-Lab: REsearch Groups in Intelligent Machines, University of Sfax
salwa.said@ieee.org, olfa.jemai@ieee.org, salima.hassairi.tn@ieee.org,
ridha.ejbali@ieee.org, mourad.zaied@ieee.org, chokri.benamar@ieee.org

Abstract—The success of deep learning, and specifically of layer-by-layer learning, has led to many impressive results in several contexts involving neural networks. This gave us the idea of applying this learning principle to wavelet networks, which are an active research topic at the moment. This paper presents our approach to image classification based on the combination of two learning techniques: the wavelet network and deep learning. We classify images in a supervised way after an unsupervised learning stage based on the autoencoder principle. Experiments on two databases, COIL-100 and MNIST, show that our approach gives good results for the two classifiers that we used.

Index Terms: wavelet network, deep learning, supervised classification, unsupervised learning, autoencoder.

I. INTRODUCTION

Before 2006, we did not know how to train a deep architecture: the iterative optimization converged to local minima of poor quality [1], [2] because of the vanishing gradient problem [3], [4]. Thus, neural networks with more than two hidden layers initialized randomly gave worse results than shallow networks [5].

To avoid this problem, Hinton et al. [6], in 2006, suggested pre-training each layer so that it learns a good representation of its input. Several possible pre-training algorithms are surveyed in [7]. In this way, networks with two or more hidden layers were obtained that not only work better than shallow networks, but also beat state-of-the-art learning algorithms, for example through new methods for unsupervised pre-training [8], [9], [10]. Those who led to the breakthrough of Deep Learning (DL) were Hinton, Bengio and LeCun. Hinton used Restricted Boltzmann Machines (RBM) [11] as generative models of several different types of data, such as labeled or unlabeled images [12]. AutoEncoders (AE) were developed by Bengio [13], [14] and have been used for learning efficient codings. Yann LeCun introduced sparse representations for image classification and object recognition [15], [16], [12].

On the other hand, Zhang and Benveniste introduced wavelet networks in 1992 [17], [18]. They result from the combination of two signal processing techniques, the wavelet transform and artificial neural networks, where the activation functions are based on a family of wavelets.

Combining the ability of wavelet networks and deep learning techniques for image classification is the main objective of our work. In this paper, we propose a novel approach for image classification that uses wavelet networks and deep learning methods to create a new Deep Wavelet Network. The remainder of this paper is organized as follows: Section II presents the algorithm of our approach; Section III gives an overview of the datasets used in the classification tests and reports the experiments and the test results of our network; finally, Section IV summarizes and concludes this paper.

II. PROPOSED APPROACH

Recent work has shown that deep learning methods produce impressive results in several contexts involving neural networks, which gave us the idea of applying this learning principle to wavelet networks.

In the next subsections, we first introduce the theoretical background of our approach. Then, we explain how to use it for classification.

A. Wavelet network

A wavelet network is the result of the combination of wavelets and a neural network [19], [20]. It consists of three layers (Fig. 1): a first layer with Ni inputs, a hidden layer consisting of Nw wavelets, and an output layer that sums the weighted outputs of the wavelets, as shown in the following figure:

Fig. 1. Graphic representation of wavelet network

It uses a feed-forward propagation algorithm from the input to the output neurons [21]. Furthermore, it has a certain proximity to the architecture of neural networks. The main similarity between the two networks is that both compute a linear combination of nonlinear functions whose form depends on the adjustable parameters (dilations and translations) of this combination. However, the major difference between them is the nature of the transfer functions used by the hidden cells [22].
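As an illustration of this computation, the following minimal NumPy sketch evaluates the output of a one-hidden-layer wavelet network as a weighted sum of dilated and translated wavelets. It is only a sketch under simple assumptions (a scalar input, a Mexican hat mother wavelet, and illustrative parameter names w, a, b); it is not the authors' implementation.

```python
import numpy as np

def mexican_hat(x):
    # one common choice of mother wavelet (used here only for illustration)
    return (1.0 - x**2) * np.exp(-0.5 * x**2)

def wavelet_network_output(x, w, a, b, psi=mexican_hat):
    """Output of a one-hidden-layer wavelet network:
    a weighted sum of dilated/translated wavelets psi((x - b_s) / a_s)."""
    x = np.atleast_1d(x)
    hidden = psi((x[:, None] - b[None, :]) / a[None, :])  # shape (n_samples, n_wavelets)
    return hidden @ w                                      # linear output layer

# toy usage: 5 hidden wavelets, two scalar inputs
rng = np.random.default_rng(0)
w, a, b = rng.normal(size=5), np.ones(5), np.linspace(-2.0, 2.0, 5)
print(wavelet_network_output(np.array([0.3, 1.2]), w, a, b))
```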
B. Autoencoder

An autoencoder is a neural network trained to predict its own inputs (x = x'). It aims to minimize the reconstruction error of the input data, as shown in the figure below:

Fig. 2. Principle of autoencoder

For each sample x from the input dataset {x1, x2, ..., xn}, we have:

$h = f_{enc}(x\,W_{enc} + b)$   (1)

$r = f_{dec}(h\,W_{dec} + c)$   (2)

where:
- $W_{enc}$: the encoding weights;
- $W_{dec}$: the decoding weights;
- b and c: the encoding and decoding bias vectors;
- Equation (1) corresponds to the transformation of the input (also called encoding);
- Equation (2) corresponds to the reconstruction of the input;
- the mapping functions $f_{enc}$ and $f_{dec}$ are usually nonlinear activation functions, typically the sigmoid:

$\sigma(x) = \frac{1}{1 + e^{-x}}$   (3)
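A minimal NumPy sketch of Equations (1)-(3) is given below, assuming the logistic sigmoid for both $f_{enc}$ and $f_{dec}$ and randomly initialized weights; the sizes and variable names are illustrative, not those of our implementation.

```python
import numpy as np

def sigmoid(x):
    # Equation (3)
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_in, n_hidden = 64 * 64, 256                            # e.g. a 64x64 image flattened into a vector
W_enc = rng.normal(scale=0.01, size=(n_in, n_hidden))    # encoding weights
W_dec = rng.normal(scale=0.01, size=(n_hidden, n_in))    # decoding weights
b, c = np.zeros(n_hidden), np.zeros(n_in)                # encoding / decoding biases

x = rng.random(n_in)                                     # one (unlabeled) sample
h = sigmoid(x @ W_enc + b)                               # Equation (1): encoding
r = sigmoid(h @ W_dec + c)                               # Equation (2): reconstruction
reconstruction_error = 0.5 * np.sum((x - r) ** 2)
```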
We would like to build a deep wavelet network that takes as input the pixels of an unlabeled picture and outputs its characteristics. We used the autoencoder principle to create our deep wavelet network, and we began by training this autoencoder without labels, i.e. in an unsupervised way.

C. Deep wavelet network

The steps for creating our network are the following (a sketch of this layer-wise procedure is given after Fig. 3):
• Step 1: Create a wavelet network with a single hidden layer whose transfer function is based on a wavelet family.
• Step 2: Create another wavelet network with the final layer removed, in order to generate the characteristics obtained in the first hidden layer.
• Step 3: Train the second autoencoder using the features that were generated by the first autoencoder (Step 2).
• Step 4: Remove the final layer to generate the characteristics obtained in the second hidden layer.
• Step 5: Stack the encoders from the autoencoders together to form a deep wavelet network.
• Step 6: Repeat Steps 3, 4 and 5 according to the desired number of hidden layers.

After the learning phase, we organize the obtained vectors in a matrix in which each column represents a picture, so that we can apply the classification phase. The figure below illustrates the steps of the creation of our network:

Fig. 3. Steps of deep wavelet network creation

where:
- $W_{ij}$: the connection weights between neuron i and the output of neuron j;
- $a_i$: the dilation coefficients of neuron i;
- $b_i$: the translation coefficients of neuron i.

Our model is based on two hidden layers in which the wavelet basis function works as the activation function ψ.
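The greedy layer-wise logic of Steps 1-6 can be summarized by the following sketch. The helpers train_wavelet_autoencoder and encode are hypothetical placeholders (they stand for training one wavelet autoencoder and for keeping only its encoder part); the sketch only illustrates how the encoders are stacked, it is not our exact procedure.

```python
def build_deep_wavelet_network(X, layer_sizes, train_wavelet_autoencoder, encode):
    """Greedy layer-wise construction of the deep wavelet network (Steps 1-6).

    X                      : training samples, one image per row
    layer_sizes            : number of wavelet units for each hidden layer
    train_wavelet_autoencoder(F, n) : hypothetical helper that trains a one-hidden-layer
                             wavelet autoencoder with n units on the features F
    encode(params, F)      : hypothetical helper that removes the final layer and
                             returns the hidden-layer characteristics of F
    """
    encoders, features = [], X
    for n_units in layer_sizes:
        params = train_wavelet_autoencoder(features, n_units)  # Steps 1 and 3
        features = encode(params, features)                    # Steps 2 and 4
        encoders.append(params)                                # Step 5: stack the encoders
    # Step 6 is the loop itself; 'features' now holds the characteristics of each image,
    # to be organized in a matrix for the classification phase
    return encoders, features
```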
1) Training algorithm:

In the learning phase, we used the feed-forward propagation algorithm. The purpose of this algorithm is to reduce the error produced by our network, and thus to correct the parameters defined above: the weights, dilations and translations. We used the quadratic cost function to measure this error:

$E = \frac{1}{2} \sum_{t=1}^{T} \left(y_d(t) - y(t)\right)^2$   (4)

where:
- $y(t)$: the output given by the network;
- $y_d(t)$: the desired output.
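For completeness, Equation (4) corresponds to the following small computation (NumPy sketch with illustrative values):

```python
import numpy as np

def quadratic_cost(y_desired, y_network):
    # Equation (4): E = 1/2 * sum_t (y_d(t) - y(t))^2
    return 0.5 * np.sum((np.asarray(y_desired) - np.asarray(y_network)) ** 2)

print(quadratic_cost([1.0, 0.0, 0.0], [0.8, 0.1, 0.05]))  # 0.02625
```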
The network output expression is:

$y(t) = \sum_{s=1}^{S} w_s\, \psi_s\!\left(\frac{t - b_s}{a_s}\right)$   (5)

We chose the gradient descent algorithm as the error minimization method. At each iteration of this algorithm, an image is presented to the network (input/output) and the computation is propagated from one layer to the next until the output layer. The algorithm consists of changing the settings in the direction opposite to the gradient of the error function. These settings are changed using the following formula:

$V_{t+1} = V_t - \varepsilon(t)\, \frac{\partial E}{\partial V}$   (6)

where:
- $V_t$: one of the parameters {w, a, b} at iteration t;
- $\varepsilon(t)$: the step of the gradient at iteration t.

Putting $e(t) = y_d(t) - y(t)$, we obtain the following derivatives:

$\frac{\partial E}{\partial \omega_{ij}} = \sum_{t=1}^{T} e(t)\, \psi(\tau)$   (7)

$\frac{\partial E}{\partial a_i} = \sum_{t=1}^{T} e(t)\, \omega_{ij}\, \frac{\partial \psi(\tau)}{\partial a_i}$   (8)

$\frac{\partial E}{\partial b_i} = \sum_{t=1}^{T} e(t)\, \omega_{ij}\, \frac{\partial \psi(\tau)}{\partial b_i}$   (9)

where $\tau = \frac{t - b}{a}$.

The parameters {w, a, b} are initialized randomly. The modification of these settings is achieved by applying the following formulas:

$\omega(t+1) = \omega(t) + \mu_w\, \Delta\omega$   (10)

$a(t+1) = a(t) + \mu_a\, \Delta a$   (11)

$b(t+1) = b(t) + \mu_b\, \Delta b$   (12)

where:
- $\mu_w$, $\mu_a$ and $\mu_b$ are the learning rates of the three network settings;
- $\Delta\omega = -\frac{\partial E}{\partial \omega}$;
- $\Delta a = -\frac{\partial E}{\partial a}$;
- $\Delta b = -\frac{\partial E}{\partial b}$.
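A minimal sketch of one such update is shown below, assuming the gradients ∂E/∂ω, ∂E/∂a and ∂E/∂b have already been computed as in Equations (7)-(9); the parameter values and learning rates are illustrative.

```python
import numpy as np

def gradient_step(params, grads, learning_rates):
    """One update of the network settings (Equations (10)-(12)):
    each parameter moves opposite to its error gradient, v <- v - mu * dE/dv."""
    return {name: params[name] - learning_rates[name] * grads[name] for name in params}

# illustrative parameters {w, a, b} and already-computed gradients dE/dw, dE/da, dE/db
params = {"w": np.array([0.10, -0.20]), "a": np.array([1.0, 1.5]), "b": np.array([0.0, 0.3])}
grads  = {"w": np.array([0.05,  0.01]), "a": np.array([-0.02, 0.03]), "b": np.array([0.01, -0.04])}
mu     = {"w": 0.1, "a": 0.05, "b": 0.05}
params = gradient_step(params, grads, mu)
```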
2) The used wavelets:

There are several families of wavelets. In our application, we selected those which gave us the best performance. These functions are the following:

1) Morlet wavelet. The expression of the Morlet wavelet is defined as below:

$\psi(x) = e^{-\frac{1}{2}x^2}\, e^{5ix}$   (13)

2) Mexican hat wavelet. The expression of this wavelet is defined by:

$\psi(x) = (1 - x^2)\, e^{-\frac{1}{2}x^2}$   (14)

3) RASP (Rational function with Second-order Poles). Finally, we used the RASP wavelet to compare the results between these three wavelets and see which one gives the best classification results. The expression of the RASP wavelet is presented below:

$\psi(x) = \frac{\sin(3.145x)}{x + 1}$   (15)

The graphs of these three functions are shown in the following figure:

Fig. 4. Wavelets used (from left to right: Morlet wavelet, Mexican hat wavelet, RASP wavelet)
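The three mother wavelets of Equations (13)-(15) can be written directly in NumPy as below (the RASP expression is reproduced exactly as printed in Equation (15)); evaluating them on a grid reproduces the curves of Fig. 4.

```python
import numpy as np

def morlet(x):
    # Equation (13): exp(-x^2/2) * exp(5ix)  (complex-valued; typically its real part is plotted)
    return np.exp(-0.5 * x**2) * np.exp(5j * x)

def mexican_hat(x):
    # Equation (14)
    return (1.0 - x**2) * np.exp(-0.5 * x**2)

def rasp(x):
    # Equation (15), as printed: sin(3.145 x) / (x + 1)
    return np.sin(3.145 * x) / (x + 1.0)

x = np.linspace(-4.0, 4.0, 400)
curves = {"Morlet": morlet(x).real, "Mexican hat": mexican_hat(x), "RASP": rasp(x)}
```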
D. Classification phase

Nowadays, classification has a very important place in intelligent data analysis (supervised) and exploratory data analysis (unsupervised). It seeks to identify similar objects with respect to a homogeneity criterion. These techniques play an important role in solving many problems in pattern recognition [24], color image segmentation [25], data mining [26] and in different domains such as medicine [27], biology [28], marketing [29], etc. [30].

Classification methods can be grouped into two categories: supervised and unsupervised. The main task of the first one is to assign objects to groups that are known in advance, using previously labeled examples. The most frequently used classification methods are: softmax, k-Nearest Neighbors (k-NN) [31], support vector machines (SVM) [32], neural networks (NNs) [33], [34], [35], etc. On the other hand, the objective of unsupervised classification methods is to group similar data into the same cluster without any previous knowledge of the classes, which can for example be defined by a probabilistic approach. The most used algorithms [36] of this type are variants of the k-means algorithm [37], hierarchical clustering [38], spectral clustering [39], Kohonen-like algorithms [40], etc.

When we obtain all the features of the database, we apply the classification phase. In the next subsection, we explain the steps of our approach. The figure below illustrates the sequence of our approach. We tried two supervised classifiers: k-NN and the softmax function.
Fig. 5. Application process for supervised classification

The principle of k-NN is to compare the two matrices obtained. Each image of the test database is compared with all the images of the learning base. This comparison consists of calculating the distance (in our application we use the Euclidean distance) between the weight vector of the test image and those of the learning images. Finally, all the distances are collected and sorted, and the shortest distance is retained. The algorithm thus decides to which class the test image belongs.
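A minimal sketch of this nearest-neighbour matching is given below, assuming the deep wavelet network features of the training and test images are stored one vector per row; the function name and arguments are illustrative. For k = 1 it reduces to keeping only the single shortest distance, as described above.

```python
import numpy as np
from collections import Counter

def knn_predict(train_features, train_labels, test_feature, k=1):
    """Assign a test image to the majority class among the k training images
    whose feature vectors have the shortest Euclidean distance to it."""
    distances = np.linalg.norm(train_features - test_feature, axis=1)
    nearest = np.argsort(distances)[:k]                      # indices of the k shortest distances
    votes = Counter(np.asarray(train_labels)[nearest].tolist())
    return votes.most_common(1)[0][0]
```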


Softmax

This is an activation function specialized for classification networks. In a classification problem, the output of the network is usually the softmax function applied to a linear transformation, of the appropriate size, of the last hidden layer. During optimization, the hidden layers then learn how to transform the inputs so that the classes become linearly separable.
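As a reference, the softmax function applied to the class scores z (the linear transformation of the last hidden layer) can be sketched as follows; the scores used here are illustrative.

```python
import numpy as np

def softmax(z):
    # softmax over the class scores z = h W + c of the last hidden layer
    z = z - np.max(z)            # for numerical stability
    e = np.exp(z)
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])               # illustrative scores for one image
probabilities = softmax(scores)                  # class probabilities summing to 1
predicted_class = int(np.argmax(probabilities))  # the class assigned to the image
```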
III. EXPERIMENTATION AND RESULTS

A. Experimental Setups

Used databases:
We test our approach on two databases, COIL-100 (http://www.cs.columbia.edu/CAVE/software/softlib/coil-100.php) and MNIST (http://yann.lecun.com/exdb/mnist/), which are among the most widely used to evaluate classification algorithms.
1) COIL-100 is a database of 7,200 color images of 100 objects (72 images per object). We used only 10 objects for the first database. The size of the images is 128 x 128 pixels; they have been resized to 64 x 64 pixels to reduce the execution time [41].
2) MNIST contains 60,000 handwritten digit images (0..9) with a resolution of 28 x 28 pixels.

Evaluation criteria:
In the experiments conducted in this study, we used one third of the data samples for testing and the remaining two thirds for training, for each database. To verify the learning capabilities of the proposed method, we use two metrics to evaluate our experimental results: the correct classification rate (CR) and the confusion matrix (CM).
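Both metrics can be computed from the predicted and true labels as in the following sketch (illustrative labels; rows of the confusion matrix are the true classes and columns the predicted ones):

```python
import numpy as np

def evaluate(true_labels, predicted_labels, n_classes):
    """Correct classification rate (CR) and confusion matrix (CM)."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        cm[t, p] += 1                     # row: true class, column: predicted class
    cr = np.trace(cm) / cm.sum()          # fraction of correctly classified samples
    return cr, cm

cr, cm = evaluate([0, 1, 1, 2], [0, 1, 2, 2], n_classes=3)
print(f"CR = {cr:.1%}")                   # 75.0%
```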
B. Quantitative Results

In this subsection, we first evaluate the influence of the value of k for the k-NN classifier. Then, we present the classification results obtained with the softmax classifier. The tables given below show the results obtained on the two databases respectively.

1) Evaluation on COIL-10 dataset:

K-NN:

TABLE I
Classification results for COIL-10 using the k-NN classifier

         The wavelet functions used
k-NN     Morlet wavelet   Mexican hat wavelet   RASP wavelet
k=1      98.33 %          82.92 %               98.33 %
k=3      99.58 %          86.67 %               99.58 %
k=4      99.58 %          87.50 %               99.58 %
k=5      99.58 %          88.33 %               99.58 %
k=6      99.58 %          89.17 %               99.58 %

Softmax: To evaluate the performance of our approach with the softmax classifier, we studied how the behavior of the classifier depends on the wavelet used. Three wavelet families were used (Morlet, Mexican hat and RASP), which is justified by the fact that we know the analytic expression of these wavelets, which is needed to compute the outputs of the wavelet network (and obtain exact values).

- Morlet wavelet

TABLE II
Confusion matrix for Morlet wavelet

      1   2   3   4   5   6   7   8   9   10   CR
1     24  0   0   0   0   0   0   0   0   0    100 %
2     0   22  0   0   0   0   0   0   2   0    91.7 %
3     0   0   24  0   0   0   0   0   0   0    100 %
4     0   0   0   24  0   0   0   0   0   0    100 %
5     0   0   0   0   24  0   0   0   0   0    100 %
6     0   0   0   0   0   24  0   0   0   0    100 %
7     0   0   0   0   0   0   24  0   0   0    100 %
8     0   0   0   0   0   0   0   24  0   0    100 %
9     0   0   0   0   0   0   0   0   24  0    100 %
10    0   0   0   0   0   0   0   0   0   24   100 %
Global classification rate: 99.2 %

- Mexican hat wavelet
TABLE III
Confusion matrix for Mexican hat wavelet

      1   2   3   4   5   6   7   8   9   10   CR
1     24  0   0   0   0   0   0   0   0   0    100 %
2     1   22  0   0   0   0   0   0   1   0    91.7 %
3     0   0   24  0   0   0   0   0   0   0    100 %
4     0   0   2   22  0   0   0   0   0   0    91.7 %
5     0   0   0   1   23  0   0   0   0   0    95.8 %
6     0   0   0   0   0   24  0   0   0   0    100 %
7     0   0   0   0   0   1   20  1   2   0    83.3 %
8     0   0   0   0   0   0   0   23  1   0    95.8 %
9     0   0   0   0   0   0   0   2   20  2    83.3 %
10    0   0   0   0   0   0   0   0   0   24   100 %
Global classification rate: 94.2 %

- RASP wavelet

TABLE IV
Confusion matrix for RASP wavelet

      1   2   3   4   5   6   7   8   9   10   CR
1     24  0   0   0   0   0   0   0   0   0    100 %
2     0   22  0   0   0   0   0   0   2   0    91.7 %
3     0   0   24  0   0   0   0   0   0   0    100 %
4     0   0   0   24  0   0   0   0   0   0    100 %
5     0   0   0   0   24  0   0   0   0   0    100 %
6     0   0   0   0   0   24  0   0   0   0    100 %
7     0   0   0   0   0   0   24  0   0   0    100 %
8     0   0   0   0   0   0   0   24  0   0    100 %
9     0   0   0   0   0   0   0   0   24  0    100 %
10    0   0   0   0   0   0   0   0   0   24   100 %
Global classification rate: 99.2 %

It is found that the performance of our classifier depends on the wavelet family used. Indeed, the results are in favor of the Morlet and RASP families. It is also noticed that the larger the value of k, the better the assignment results. From the results illustrated above, we achieve a good performance on COIL-10, with an overall correct classification rate of 99.2 %. It is also noted that both classifiers reach good performance, with a slight improvement in favor of the k-NN classifier for k = 6.

2) Evaluation on MNIST dataset:

K-NN:

TABLE V
Classification results for MNIST using the k-NN classifier

         The wavelet functions used
k-NN     Morlet wavelet   Mexican hat wavelet   RASP wavelet
k=1      96.50 %          96.50 %               96.50 %
k=3      97.75 %          97.75 %               97.75 %
k=4      98.50 %          98.50 %               98.50 %
k=5      98.50 %          98.50 %               98.50 %
k=6      99 %             99 %                  99 %

Softmax:

TABLE VI
Classification results for MNIST using the softmax classifier

          The wavelet functions used
          Morlet wavelet   Mexican hat wavelet   RASP wavelet
Softmax   84.50 %          84.50 %               84.50 %

Note that each of these two classifiers gives the same classification rate for the three wavelets on MNIST. The k-NN classifier exhibits good performance when the number of neighbors k increases. However, it is observed that k-NN gives better results than the softmax classifier for MNIST, as opposed to the other base, COIL-10. The table below contains the confusion matrix obtained by the softmax classifier for the three wavelets:

TABLE VII
Confusion matrix for MNIST

      1   2   3   4   5   6   7   8   9   10   CR
1     36  0   1   1   0   1   0   0   0   1    90.0 %
2     1   36  1   2   0   0   0   0   0   0    90.0 %
3     0   2   30  0   7   0   0   1   0   0    75.0 %
4     1   0   0   33  0   0   0   1   5   0    82.5 %
5     0   0   2   0   35  0   0   0   2   1    87.5 %
6     0   0   0   3   0   34  0   1   2   0    85.0 %
7     0   0   0   0   0   0   39  1   0   0    97.5 %
8     1   1   4   1   0   1   0   29  1   2    72.5 %
9     1   0   0   1   0   1   0   2   33  2    82.5 %
10    0   1   2   0   1   0   0   2   1   33   82.5 %
Global classification rate: 84.5 %

So, the results are in favor of the k-NN classifier, with an overall classification rate of around 99 % for MNIST.

A performance comparison with other approaches reported in Ref. [42] was also carried out. Two different categories of classifiers were employed in Ref. [42]: SVM and neural networks (randomly initialized backpropagation and backpropagation using steepest descent). Details on these machine learning classifiers can be found in Ref. [42]. The results reported there cover most of the important machine learning classification approaches, so they serve as a good source of comparison for the performance of the proposed approach. Furthermore, the maximal classification accuracy reported for the MNIST benchmark data set is compared with the values measured using our proposed DWN, as shown in Table VIII. One motivation for determining comparative performance across a set of different learning algorithms is to assess whether any particular algorithm demonstrates a significant advantage over the others.

TABLE VIII
Performance comparison of our approach on MNIST

Methods                                        Classification rate   Error rate
Our approach                                   99 %                  1 %
Randomly initialized backpropagation [42]      98.4 %                1.6 %
SVM [42]                                       98.6 %                1.4 %
Backpropagation using steepest descent [42]    98.8 %                1.2 %

We can notice that the proposed DWN produces better performance than all the other methods, with a classification rate of 99 % for the MNIST data set. The maximum accuracy reported in the literature for this data set is obtained using steepest descent backpropagation [42] (i.e. a classification rate of 98.8 %).
IV. CONCLUSIONS

In this paper, we have proposed a novel approach for image classification using a deep wavelet network. The performance comparisons show that our approach gives good results and high performance, which indicates the robustness of our new method. We aim to improve our system by exploiting other classifiers and by using training algorithms for deep neural networks other than the autoencoder.

ACKNOWLEDGMENT

The authors would like to acknowledge the financial support of this work by grants from the General Direction of Scientific Research (DGRST), Tunisia, under the ARUB program.

REFERENCES

[1] K. Hornik, M. Stinchcombe, and H. White, Multilayer feedforward networks are universal approximators, Neural Networks, vol. 2, pp. 359-366, 1989.
[2] D. Erhan, P.-A. Manzagol, Y. Bengio, S. Bengio, and P. Vincent, The difficulty of training deep architectures and the effect of unsupervised pre-training, in 12th International Conference on Artificial Intelligence and Statistics (AISTATS), Clearwater Beach, Florida, USA, vol. 5, pp. 153-160, 2009.
[3] Y. Bengio, P. Simard, and P. Frasconi, Learning long-term dependencies with gradient descent is difficult, IEEE Transactions on Neural Networks, vol. 5(2), pp. 157-166, 1994.
[4] X. Glorot and Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, in International Conference on Artificial Intelligence and Statistics (AISTATS'10), pp. 249-256, 2010.
[5] D. Alain, S. Ivaldi, and O. Sigaud, Learning a repertoire of actions with deep neural networks, in 4th International Conference on Development and Learning and on Epigenetic Robotics, 2014.
[6] G. E. Hinton and R. R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science, vol. 313, pp. 504-507, 2006.
[7] Y. Bengio, A. Courville, and P. Vincent, Representation learning: A review and new perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1798-1828, 2014.
[8] G. Hinton, A fast learning algorithm for deep belief nets, Neural Computation, 2006.
[9] Z. Chen, J. Wang, H. He, and X. Huang, A fast deep learning system using GPU, in IEEE International Symposium on Circuits and Systems (ISCAS), 2014.
[10] A. Krizhevsky, I. Sutskever, and G. Hinton, ImageNet classification with deep convolutional neural networks, in Neural Information Processing Systems 25, 2012.
[11] G. Hinton, A practical guide to training restricted Boltzmann machines, Lecture Notes in Computer Science, vol. 7700, 2012.
[12] S. Hassairi, R. Ejbali, and M. Zaied, A deep convolutional neural wavelet network to supervised Arabic letter image classification, in 15th International Conference on Intelligent Systems Design and Applications, Marrakesh, Morocco, December 14-16, 2015.
[13] C.-Y. Liou, W.-C. Cheng, J.-W. Liou, and D.-R. Liou, Autoencoder for words, Neurocomputing, 139: 84-96, 2014.
[14] Y. Bengio, Learning deep architectures for AI, Foundations and Trends in Machine Learning, pp. 1-127, 2009.
[15] K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun, What is the best multi-stage architecture for object recognition?, ICCV, 2009.
[16] Y. LeCun, Learning invariant feature hierarchies, Computer Vision (ECCV), 2012.
[17] Q. Zhang, Wavelet network in nonparametric estimation, IEEE Trans. Neural Networks, 8(2): 227-236, 1997.
[18] Q. Zhang and A. Benveniste, Wavelet networks, IEEE Trans. Neural Networks, vol. 3, pp. 889-898, 1992.
[19] H. Szu, B. Telfer, and S. Kadambe, Neural network adaptive wavelets for signal representation and classification, Optical Engineering, 31: 1907-1961, 1992.
[20] H. Szu, B. Telfer, and J. Garcia, Wavelet transforms and neural networks for compression and recognition, Neural Networks, 9: 695-708, 1996.
[21] S. S. Iyengar, E. C. Cho, and Vir V. Phoha, Foundations of Wavelet Networks and Applications, Chapman & Hall/CRC, 2000.
[22] M. Dammak, M. Mejdoub, M. Zaied, and C. B. Amar, Feature vector approximation based on wavelet network, in ICAART 2012 - Proceedings of the 4th International Conference on Agents and Artificial Intelligence, vol. 1, pp. 394-399, 2012.
[23] The MathWorks, Inc., Training a Deep Neural Network for Digit Classification, Neural Network Toolbox, 2015.
[24] L. Zheng and X. He, Classification techniques in pattern recognition, in Proceedings of the 13th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, pp. 77-78, 2005.
[25] V. Mohan and A. Kannan, Color image classification and retrieval using image mining techniques, International Journal of Engineering Science and Technology, vol. 2(5), pp. 1014-1020, 2010.
[26] T. N. Phyu, Survey of classification techniques in data mining, in Proceedings of the International MultiConference of Engineers and Computer Scientists, vol. I, March 18-20, Hong Kong, 2009.
[27] J. S. Wang, W. C. Chiang, Y. L. Hsu, and Y. T. C. Yang, ECG arrhythmia classification using a probabilistic neural network with a feature reduction method, Neurocomputing, vol. 116, pp. 38-45, 2013.
[28] J. I. Arribas, G. V. Sanchez-Ferrero, G. Ruiz-Ruiz, and J. Gomez-Gil, Leaf classification in sunflower crops by computer vision and neural networks, Comput. Electron. Agric., vol. 78, issue 1, pp. 9-18, 2011.
[29] F. Kaefer, C. M. Heilman, and S. D. Ramenofsky, A neural network application to consumer classification to improve the timing of direct marketing activities, Comput. Oper. Res., vol. 32, issue 10, pp. 2595-2615, 2005.
[30] I. Jawad, A. Farhi, A. E. Motassadeq, H. Chehouani, and S. Erraki, Unsupervised classification of grayscale image using probabilistic neural network (PNN), in 2012 International Conference on Multimedia Computing and Systems, 2012.
[31] M. Mejdoub and C. Ben Amar, Classification improvement of local feature vectors over the KNN algorithm, Multimedia Tools and Applications, 64(1), pp. 197-218, 2013.
[32] M. Brown, H. Lewis, and S. Gunn, Linear spectral mixture models and support vector machines for remote sensing, IEEE Trans. Geosci. Remote Sens., pp. 2346-2360, 2000.
[33] J. Zeng, H. F. Guo, and Y. M. Hu, Artificial neural network model for identifying taxi gross emitter from remote sensing data of vehicle emission, J. Environ. Sci., pp. 427-431, 2007.
[34] D. M. Miller, E. J. Kaminsky, and S. Rana, Neural network classification of remote-sensing data, Comput. Geosci., pp. 377-386, 1995.
[35] C. Huang, L. Davis, and J. Townshend, An assessment of support vector machines for land cover classification, Int. J. Remote Sens., pp. 725-749, 2002.
[36] A. Jain, M. Murty, and P. Flynn, Data clustering: a review, ACM Computing Surveys (CSUR), 31(3): 264-323, 1999.
[37] J. MacQueen, Some methods for classification and analysis of multivariate observations, in Berkeley Symposium on Mathematical Statistics and Probability, vol. 233, pp. 281-297, 1967.
[38] J. Ward, Hierarchical grouping to optimize an objective function, Journal of the American Statistical Association, 58(301): 236-244, 1963.
[39] A. Y. Ng, M. I. Jordan, and Y. Weiss, On spectral clustering: Analysis and an algorithm, in Advances in Neural Information Processing Systems, vol. 14, pp. 849-856, 2001.
[40] T. Kohonen, Self-organized formation of topologically correct feature maps, Biological Cybernetics, 69: 59-69, 1982.
[41] S. A. Nene, S. K. Nayar, and H. Murase, Columbia Object Image Library (COIL-100), Department of Computer Science, Columbia University, New York, N.Y. 10027, March 1996.
[42] G. E. Hinton and R. R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science, vol. 313, pp. 504-507, 2006.
