Hafemann IJCNN 2015
Abstract—Convolutional Neural Networks (CNNs) have set the state-of-the-art in many computer vision tasks in recent years. For this type of model, it is common to have millions of parameters to train, commonly requiring large datasets. We investigate a method to transfer learning across different texture classification problems, using CNNs, in order to take advantage of this type of architecture in problems with smaller datasets. We use a Convolutional Neural Network trained on a source dataset (with lots of data) to project the data of a target dataset (with limited data) onto another feature space, and then train a classifier on top of this new representation. Our experiments show that this technique can achieve good results in tasks with small datasets, by leveraging knowledge learned from tasks with larger datasets. Testing the method on the Brodatz-32 dataset, we achieved an accuracy of 97.04% - superior to models trained with popular texture descriptors, such as Local Binary Patterns and Gabor Filters, and increasing the accuracy by 6 percentage points compared to a CNN trained directly on the Brodatz-32 dataset. We also present a visual analysis of the projected dataset, showing that the data is projected to a space where samples from the same class are clustered together - suggesting that the features learned by the CNN in the source task are relevant for the target task.

[...] dataset that contains 1.2 million images on 1000 categories. For such tasks, it is common to have a large number of trainable parameters (in the order of millions), requiring a significant amount of computing power to train the network.

In the task of texture classification, CNNs are not yet widely used. Tivive et al. [15] used convolutional neural networks on the Brodatz texture dataset, but considered only low-resolution images and a small number of classes. More recently, Hafemann et al. tested this type of model for forest species classification with good results [16]. For forest species recognition, the datasets were large enough to successfully train the network - for example, one of the datasets consists of 3k images of size 3264 x 2448. Considering the results from both object recognition and texture classification, having large datasets seems to be required for successfully training CNNs. In practical applications, however, there are several cases where we must deal with relatively small datasets. For this reason, techniques that try to overcome this situation, such as transfer learning, are important to increase the accuracy of such systems.
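As a rough illustration of the approach described above, the sketch below uses a CNN trained on the source task as a fixed feature extractor and trains a classifier on the projected target data. It assumes a hypothetical source_cnn_features function that returns the last-hidden-layer activations of the already-trained source CNN, and uses scikit-learn's SVC as the classifier; the paper's actual network, layer, and classifier settings may differ.

    # Minimal sketch of the transfer-learning idea described above.
    # ASSUMPTION: source_cnn_features(patch) is a hypothetical function that
    # returns the last-hidden-layer activations of a CNN already trained on
    # the large source dataset; the paper's actual network and layer may differ.
    import numpy as np
    from sklearn.svm import SVC

    def project_dataset(patches, source_cnn_features):
        # Project each target-task patch into the feature space learned on the
        # source task by forward-propagating it through the source CNN.
        return np.array([source_cnn_features(p) for p in patches])

    def train_transfer_classifier(train_patches, train_labels, source_cnn_features):
        # Train a classifier on the projected representation of the target
        # training set (an SVM is used here as one possible choice).
        features = project_dataset(train_patches, source_cnn_features)
        clf = SVC(kernel="rbf")  # hyperparameters would normally be tuned, e.g. by grid search
        clf.fit(features, train_labels)
        return clf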
[Fig. 2 diagram: 1. Extract patches (testing set); 2. Forward propagate each patch through the CNN (Input → Convolution → Max Pooling → Convolution → Max Pooling → 2x Locally-connected → Fully-connected) to obtain a new representation, which is fed to the trained model.]

Fig. 2. The testing procedure for transfer learning. First, we extract patches from the testing set of the target task. For each patch, a new representation is obtained by forward-propagating the sample through the CNN trained on the source task. This new representation is used to obtain a prediction using the model trained for transfer learning. Lastly, the predictions of all patches of the image are combined.

Fig. 3. Sample images from the macroscopic Brazilian forest species dataset.

A. Datasets

For this experiment, we selected one large texture dataset for training the source task, and a smaller texture dataset for the target task. For the source task, we used a dataset of macroscopic images for forest species recognition [22], the same dataset used in [16] to train a CNN. This dataset contains pictures of cross-section surfaces of the trees, obtained using a regular digital camera. It contains 41 classes, with images of resolution 3264 x 2448. This dataset is large (around 65 GB in a bitmap format), which is desired for the source task. Examples from this dataset can be found in Figure 3.

For the target task, we used a smaller dataset: Brodatz-32. The Brodatz album is a classical texture database, dating back to 1966, and it is widely used for texture analysis [23]. This dataset is composed of 112 textures, with a single 640x640 image per texture. In order to standardize the usage of this dataset for texture classification, Valkealahti [24] introduced the Brodatz-32 dataset, a subset of 32 textures from the Brodatz album, with 64 images per class (64x64 pixels in size), containing patches of the original images, together with rotated and scaled versions of them (see Figure 4). This dataset is small (around 8 MB in a bitmap format) and was selected to identify whether we can increase classification performance on small datasets using transfer learning. Samples from this dataset can be found in Figure 4.

[...] the simpler approach (non-overlapping patches) that achieved a performance of 96.35%.

[Figure: Accuracy vs. number of patches per image - Classification Accuracy (y-axis, roughly 92-97%) as a function of the Number of random patches (x-axis, 20-80), comparing Grid Patches and Random Patches.]
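The testing procedure of Fig. 2 can be sketched as follows, reusing the hypothetical source_cnn_features and the trained classifier clf from the previous sketch. Both the grid (non-overlapping) and random patch-extraction settings compared in the accuracy plot above are shown; the majority vote over patch predictions is one common fusion rule and may not match the exact combination rule used in the paper, and the patch size is illustrative.

    # Sketch of the patch-based testing procedure (Fig. 2), reusing the
    # hypothetical source_cnn_features and a trained classifier clf.
    import numpy as np

    def extract_grid_patches(image, patch_size):
        # Non-overlapping patches on a regular grid ("Grid Patches").
        h, w = image.shape[:2]
        return [image[i:i + patch_size, j:j + patch_size]
                for i in range(0, h - patch_size + 1, patch_size)
                for j in range(0, w - patch_size + 1, patch_size)]

    def extract_random_patches(image, patch_size, n_patches, rng=None):
        # Patches at random positions ("Random Patches").
        rng = np.random.default_rng() if rng is None else rng
        h, w = image.shape[:2]
        return [image[i:i + patch_size, j:j + patch_size]
                for i, j in zip(rng.integers(0, h - patch_size + 1, n_patches),
                                rng.integers(0, w - patch_size + 1, n_patches))]

    def predict_image(image, clf, source_cnn_features, patch_size=32):
        # Classify an image by fusing the predictions of all of its patches.
        patches = extract_grid_patches(image, patch_size)
        features = np.array([source_cnn_features(p) for p in patches])
        patch_predictions = clf.predict(features)
        labels, counts = np.unique(patch_predictions, return_counts=True)
        return labels[np.argmax(counts)]  # majority vote over the patches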
Fig. 6. Visualization of patches (32x32) from the original Brodatz-32 dataset, reduced to 2D using t-SNE.

Figure 7 contains the visualization of the Brodatz-32 dataset transformed through a CNN trained on the same dataset. We can see that for several classes the samples get grouped together in the feature space. For example, the dark blue cluster at the bottom of the page refers to the Brodatz image D104.

Figure 8 contains the visualization of the Brodatz-32 dataset transformed through a CNN trained on the Forest Species dataset. It is interesting to note that, although the CNN had no access to samples of the Brodatz-32 dataset during training, it is still able to project the dataset to a feature space where samples from the same class get clustered together. We notice that in this case each class does not necessarily form a single cluster: samples from the same class form multiple groups, but most of these groups are compact and separated from groups of other classes. This suggests that the features learned by the CNN trained on the Forest Species dataset are useful for the Brodatz-32 dataset, in spite of the large differences between the samples from the two datasets.
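A minimal sketch of how such visualizations can be produced, assuming scikit-learn's t-SNE implementation and matplotlib (the paper cites the original t-SNE work [27]; the exact settings used for Figures 6-8 are not specified here):

    # Reduce feature vectors to 2-D with t-SNE and plot them coloured by class,
    # in the spirit of Figures 6-8. Assumes scikit-learn and matplotlib.
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.manifold import TSNE

    def plot_tsne(features, labels, title):
        # `features` can be raw pixel patches flattened to vectors (as in Fig. 6)
        # or the CNN-projected representations (as in Figures 7 and 8).
        embedding = TSNE(n_components=2).fit_transform(np.asarray(features))
        plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="tab20", s=8)
        plt.title(title)
        plt.show()

    # Example usage (names are illustrative):
    # plot_tsne(cnn_features, labels, "Brodatz-32 projected by the CNN trained on Forest Species")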
V. CONCLUSION

We presented a method to transfer learning across different texture classification problems, using CNNs. Our experiments showed that this method can be used to obtain high classification accuracy on tasks with limited amounts of data, by leveraging knowledge learned on tasks with large datasets, even if the tasks are not very similar. In the tests with the Brodatz-32 dataset, we obtained an accuracy of 97.04%, a result that is better than the classification accuracy of models that use some of the most popular texture descriptors, such as LBP (with 91.40% of accuracy) and Gabor Filters (with 95.80% of accuracy). This method also performed better than training a CNN directly on the Brodatz-32 dataset (which obtained an accuracy of 91.27%). We also presented a visual analysis of how the samples are projected to a better feature space by using CNNs, showing that the weights learned by the CNN in one task can be used to project the data of another dataset to a better feature space, where the samples are more easily separated.

REFERENCES

[1] H. Harms, U. Gunzer, and H. M. Aus, "Combined local color and texture analysis of stained cells," Computer Vision, Graphics, and Image Processing, vol. 33, no. 3, pp. 364–376, Mar. 1986.
[2] A. Khademi and S. Krishnan, "Medical image texture analysis: A case study with small bowel, retinal and mammogram images," in Canadian Conference on Electrical and Computer Engineering, 2008. CCECE 2008, May 2008, pp. 001 949–001 954.
[3] R. Sutton and E. L. Hall, "Texture measures for automatic classification of pulmonary disease," IEEE Transactions on Computers, vol. C-21, no. 7, pp. 667–676, Jul. 1972.
[4] J. Y. Tou, Y. H. Tay, and P. Y. Lau, "A comparative study for texture classification techniques on wood species recognition problem," in Natural Computation, 2009. ICNC'09. Fifth International Conference on, vol. 5, 2009, pp. 8–12.
[5] P. Filho, L. Oliveira, A. Britto, and R. Sabourin, "Forest species recognition using color-based features," in 2010 20th International Conference on Pattern Recognition (ICPR), 2010, pp. 4178–4181.
[6] M. Nasirzadeh, A. Khazael, and M. bin Khalid, "Woods recognition system based on local binary pattern," in 2010 Second International Conference on Computational Intelligence, Communication Systems and Networks (CICSyN), 2010, pp. 308–313.
[7] R. Haralick, K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Transactions on Systems, Man and Cybernetics, vol. SMC-3, no. 6, pp. 610–621, Nov. 1973.
[8] D. Bertolini, L. S. Oliveira, E. Justino, and R. Sabourin, "Texture-based descriptors for writer identification and verification," Expert Systems with Applications, 2012.
[9] J. Zhang and T. Tan, "Brief review of invariant texture analysis methods," Pattern Recognition, vol. 35, no. 3, pp. 735–747, 2002.
[10] Y. Guo, G. Zhao, and M. Pietikäinen, "Discriminative features for texture description," Pattern Recognition, vol. 45, no. 10, pp. 3834–3843, Oct. 2012.
[11] Y. Bengio, "Learning deep architectures for AI," Found. Trends Mach. Learn., vol. 2, no. 1, pp. 1–127, Jan. 2009.
[12] Y. Bengio and A. Courville, "Deep learning of representations," in Handbook on Neural Information Processing. Springer, 2013, pp. 1–28.
[13] I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio, "Maxout networks," arXiv e-print 1302.4389, Feb. 2013, JMLR WCP 28 (3): 1319–1327, 2013.
[14] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," CoRR, vol. abs/1409.4842, 2014.
[15] F. H. C. Tivive and A. Bouzerdoum, "Texture classification using convolutional neural networks," in TENCON 2006. 2006 IEEE Region 10 Conference. IEEE, 2006, pp. 1–4.
[16] L. G. Hafemann, L. S. Oliveira, and P. Cavalin, "Forest species recognition using deep convolutional neural networks," in Pattern Recognition (ICPR), 2014 22nd International Conference on. IEEE, 2014, pp. 1103–1107.
[17] M. Oquab, L. Bottou, I. Laptev, and J. Sivic, "Learning and transferring mid-level image representations using convolutional neural networks," in CVPR, 2014.
[18] S. J. Pan and Q. Yang, "A survey on transfer learning," Knowledge and Data Engineering, IEEE Transactions on, vol. 22, no. 10, pp. 1345–1359, 2010.
[19] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, "Scikit-learn: Machine learning in Python," Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
[20] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on Intelligent Systems and Technology, vol. 2, pp. 27:1–27:27, 2011, software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[21] C.-W. Hsu, C.-C. Chang, C.-J. Lin et al., "A practical guide to support vector classification," 2003.
[22] P. L. P. Filho, L. S. Oliveira, S. Nisgoski, and A. S. Britto, "Forest species recognition using macroscopic images," Machine Vision and Applications, Jan. 2014.
[23] P. Brodatz, Textures: a photographic album for artists and designers. Dover New York, 1966, vol. 66.
[24] K. Valkealahti and E. Oja, "Reduced multidimensional co-occurrence histograms in texture classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 90–94, Jan. 1998.
[25] J. Chen, V. Kellokumpu, G. Zhao, and M. Pietikäinen, "RLBP: Robust local binary pattern," in BMVC, 2013.
[26] E. R. Urbach, J. B. Roerdink, and M. H. Wilkinson, "Connected shape-size pattern spectra for rotation and scale-invariant classification of gray-scale images," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 29, no. 2, pp. 272–285, 2007.
[27] L. Van der Maaten and G. Hinton, "Visualizing data using t-SNE," Journal of Machine Learning Research, vol. 9, no. 2579–2605, p. 85, 2008.