
An experimental study on the use of visual texture for wood identification using a novel convolutional neural network layer

Loke Kar Seng, Teddy Guniawan
Faculty of Engineering, Computing and Science
Swinburne University of Technology Sarawak
Kuching, Malaysia
ksloke@swinburne.edu.my

Abstract—We describe a novel approach that transforms the input layer of a convolutional neural network (CNN) so that textural properties and relational pixel properties are captured directly. We test this approach by using wood texture for wood identification, and compare results on our dataset against another CNN-based approach, obtaining better results. The approach can be used for detecting textures in general.

Index Terms—Texture, texture recognition, wood identification, convolutional neural network

I. INTRODUCTION

World-wide demand for timber and wood products creates opportunities for illegal trade. Timber identification technology is required to combat illegal logging, to promote quality, substitution and conservation, and to support the legal trade of timber. Identifying wood is a pattern-recognition process that relies on regular patterns in the wood structure; the anatomical structure of the wood provides its identifying features. There are two fundamental types of wood, softwood and hardwood, distinguished by biologically distinct features [1]. Softwoods are composed mostly of tracheids, while hardwoods are structurally more complex, with vessels, fibres, axial parenchyma, and rays. These features are exposed when the wood is cut transversely or perpendicularly, and some of the structural features are only visible under magnification. It is the differences in these anatomical and structural features that distinguish different wood genera and species.

There is demand for easy and rapid identification of wood in the field, especially as a means to combat illegal logging. This demand is outpacing the availability of experienced personnel capable of performing such tasks. Training new personnel is not easy, and there can be a lack of interest and available resources. The increase in computing power and the corresponding decrease in cost have made wood identification through anatomical structures a viable and attractive alternative in the field, compared with non-anatomical identification such as chemical, DNA or spectroscopic methods.

II. RELATED WORKS

In general, research related to wood identification can be classified according to the use of microscopic or macroscopic images. It is also common to use some kind of texture-description features or statistical features. For textural features it is common to use the grey-level co-occurrence matrix (GLCM) [2], local binary patterns (LBP) or Gabor features.

Khalid et al. [3] used GLCM features and a multilayer perceptron classifier [4]. Their dataset consists of macroscopic wood images acquired at 10x magnification; the images were captured after the wood was sectioned using a sliding microtome. A total of 1753 images was used for training and 196 for testing. The dataset covers 20 Malaysian tropical wood species. The authors reported an accuracy of around 95%.

Tou et al. [5] described a wood species recognition system based on classifying wood texture. The wood cross-section textures are transformed using grey-level co-occurrence matrices (GLCM), covariance matrices and Gabor filters as input to the recognition system. They used six wood species with 512x512-pixel images from the CAIRO wood dataset [6]; the images are macroscopic views of wood cross-section surfaces. They reported the best results, 85% accuracy, using a covariance matrix generated from Gabor filters, with GLCM next best at 78.3% accuracy.

Similarly, Wang et al. [7] described a GLCM-based system using wood stereogram images. They extracted six features (energy, entropy, contrast, dissimilarity, inverse difference moment and variance) from the GLCM and achieved 91.7% accuracy with a support vector machine (SVM) classifier. A total of 24 wood species with 480 image samples was used in the reported research.

Another GLCM-based approach was reported by Yadav et al. [8], who extracted 22 textural features from the GLCM and fed them to a multilayer backpropagation artificial neural network; their best result, with 23 hidden neurons, was 92.6% accuracy.
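Most of the systems surveyed above build on GLCM features. As a minimal illustrative sketch (not a reconstruction of any cited implementation), a co-occurrence matrix and two of the statistics listed for Wang et al. [7] (energy and contrast) can be computed in Python as follows:

```python
import numpy as np

def glcm(img, dy=0, dx=1, levels=8):
    """Grey-level co-occurrence matrix for one pixel offset (dy, dx).

    Counts how often grey level i co-occurs with grey level j at
    distance (dy, dx), then normalizes the counts to probabilities.
    """
    h, w = img.shape
    m = np.zeros((levels, levels))
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m / m.sum()

def glcm_features(p):
    """Energy and contrast, two of the six GLCM statistics above."""
    i, j = np.indices(p.shape)
    return {"energy": float((p ** 2).sum()),
            "contrast": float((p * (i - j) ** 2).sum())}

# A uniform patch has maximal energy and zero contrast.
flat = np.zeros((8, 8), dtype=int)
print(glcm_features(glcm(flat)))  # {'energy': 1.0, 'contrast': 0.0}
```

In practice a full system would compute several offsets and angles and more statistics; this sketch only shows the mechanics of the matrix itself.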
Yadav et al.'s reported accuracy was averaged over the training, validation and test datasets; their dataset consists of 25 hardwood species with 500 samples of microscopic images.

Martins et al. [9] also described a system, using 24 GLCM features and 18 colour features together with a multilayer perceptron classifier, on a database of 11 Brazilian forest species. The dataset consists of 347 images acquired through a low-cost digital camera set to macro mode. The reported result was 82% accuracy.

Yusof et al. [10] used a genetic algorithm (GA) approach to select non-linear features through a kernel function. They used GLCM, the basic grey level aura matrix (BGLAM) and statistical properties of pore distribution (SPPD) as input. Their dataset consists of macroscopic images from 52 wood species belonging to 23 genera; 100 images from each species were used, with 70 images for training and 30 for testing. They reported a 98.7% accuracy.

Another approach, using a different type of textural feature, was presented by Wang et al. [11]. They reported an approach using higher-order local autocorrelation (HLAC), proposed by Otsu [12]. The classifiers used were k-nearest neighbours (k-NN) and support vector machines (SVM). Their best result was 87.7% accuracy. The dataset is a set of stereograms of 24 wood species found in Zhejiang province, China.

While most research approaches the problem of wood identification with such hand-crafted features, the recent successes of deep learning [13] have prompted approaches that do not require pre-processing to extract textural features such as GLCM or local binary patterns (LBP). Oliveira et al. [14] described a system using convolutional neural networks (CNN) directly on image pixels as input to the deep learning model. Their architecture has 64 convolutional filters (5x5) and a pooling layer of size 3x3, fed to another convolution and pooling layer, which is then fed to two locally connected layers; the final layer is fully connected. Their macroscopic dataset consists of 41 classes obtained through a regular digital camera, and they have a second microscopic dataset of 112 species. Image patches of size 48x48 and 64x64 are used as input. They reported an accuracy of 95.7% for the macroscopic images and 97.3% for the microscopic images.

III. OUR APPROACH

In our approach, we developed a novel input layer for a deep convolutional neural network that does not require pre-extraction of textural features such as GLCM. Even given textural features such as GLCM, it is not clear how a CNN would use them, as a CNN operates directly on image pixels rather than pre-extracted features. The implementation was coded in Python using the Tensorflow/Keras library.

Our dataset consists of images of 27 classes taken with a normal single-lens reflex digital camera. The images were captured from small wood blocks obtained from the Sarawak Timber Industry Development Corporation. The dataset consists of 13,216 sample image patches of size 50x50 pixels. A total of 9251 randomized patches was used for training and the remaining 3965 for testing and validation; the testing images were never used in model training. The following are the wood species used, listed by their local names: Jongkong, Bindang, Yellow Meranti, Resak, Pulai, Ramin, Menggris, Selangan Batu, Mersawa, Keruntum, Perupok, Bintangor, Keruing, Geronggang, Terentang, Jelutong, Durian, Dark Red Meranti, Kapur, Light Red Meranti, Alan Batu, Sepetir, White Meranti, Merbau, Rengas, Nyatoh, Alan Bunga.

Fig 1. Sample images: Bintangor (left); Jongkong (right)

A. Architecture of CNN

Convolutional neural networks work on individual pixels over three red-green-blue (RGB) channels (for colour images). However, if we want to capture statistical properties or the relational distribution of pixels, feeding image pixels directly into a CNN would not be useful. If we want pixel relational properties such as those exhibited by textures, we cannot use image pixels directly as input.

Fig 2. Input pixel values transformation: a 3x3 patch of original pixel values (10 10 10 / 10 5 10 / 10 10 10) is transformed to the value 450 (to be normalized).

Fig 3. The R, G and B layers transformed to 6 layers: RR, RG, RB, GG, BB and BG.

In order to capture some of these relational properties, we preprocess the input pixel layers. We transform the image pixels by multiplying the central pixel grey value with its surrounding pixels' values (see Fig 2). This gives a measure of how correlated the central pixel is with its surrounding pixels; a higher value indicates a higher correlation. All the input pixel values are transformed in this manner by a sliding window.
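This transformation can be sketched in NumPy. The sketch below is our minimal reading of it, assuming the neighbourhood is the eight immediately surrounding pixels with zero padding at the borders; the circular neighbourhood N(i, j) used in the text may include a different set of pixels, which would change the exact values.

```python
import numpy as np

def relational_layer(x1, x2):
    """Correlation-style input transform: each output pixel is the
    central pixel of x1 multiplied by the sum of its neighbours in x2.

    Assumption: a 3x3 window (8 surrounding pixels), zero-padded at
    the borders; the paper's circular neighbourhood may differ.
    """
    p = np.pad(x2.astype(float), 1)
    # Sum of the eight neighbours of every pixel position.
    neigh = sum(p[1 + dy : p.shape[0] - 1 + dy, 1 + dx : p.shape[1] - 1 + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0))
    return x1 * neigh

def six_channel_input(rgb):
    """Stack the six unordered channel pairs RR, RG, RB, BG, GG, BB
    into the CNN input layers."""
    r, g, b = (rgb[..., k].astype(float) for k in range(3))
    pairs = [(r, r), (r, g), (r, b), (b, g), (g, g), (b, b)]
    return np.stack([relational_layer(a, c) for a, c in pairs], axis=-1)

# Fig 2-style patch: a central 5 surrounded by 10s.
patch = np.full((3, 3), 10)
patch[1, 1] = 5
print(relational_layer(patch, patch)[1, 1])  # 5 * (8 * 10) = 400.0
```

The six stacked channels would then replace the three raw RGB channels as the network input.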
This sliding window is like the convolution sliding window, except that there are no weights to learn and the connections are different.

We may put this in concise form as:

    L_X1X2(i, j) = Σ_{(a,b) ∈ N(i,j)} X1(i, j) · X2(a, b)

where N(i, j) is the circular neighborhood function for index (i, j) that returns the coordinates of the pixels around (i, j), and L_X1X2 is the 2D output layer value at (i, j), calculated from layers X1 and X2 of the original RGB input. For example, the L_RG output is calculated from the original R and G layers of the input image.

We can further decompose the layers into their respective RGB layers so that we can correlate different colours between the central and surrounding pixels. For example, we may calculate the red central pixel value against the blue surrounding pixels. If we do this for the RGB layers, we obtain six input pairs, namely RR, RG, RB, BG, GG and BB (Fig 3), if we do not care about the order between centre and surround. The order of the central and surround pixels may potentially be important, but for this experiment we limited it to the six pairs. These pairs therefore constitute the input layers to the convolutional neural network. The proposed approach is not the only way to obtain pixel-to-pixel correlations; there are potentially more ways to obtain them, and this will be our focus in future work.

The rest of the architecture consists of:
• 32 Conv2D 3x3; ReLU activation
• MaxPool2D 2x2
• 32 Conv2D 3x3; ReLU activation
• MaxPool2D 2x2
• 64 Conv2D 3x3; ReLU activation
• MaxPool2D 2x2
• 64 fully connected; ReLU activation
• Dropout 0.3
• 27 fully connected; softmax activation

B. Training procedure

The CNN was coded in Python using the Tensorflow/Keras library. Training was run for 100-120 epochs, depending on the model used. The number of epochs was decided by observing the training loss curve, stopping where the loss tapers off or the accuracy no longer increases (Fig. 4). The batch size was 100 and the input images are 50x50 pixels. We tested our model against the architecture developed by Oliveira et al. [14], described earlier, because their approach is the closest to ours. Unfortunately, our dataset is different from the one in their reported results.

IV. RESULTS

The results are presented in Table 1 below, comparing our results with those of Oliveira et al. [14], which we denote "locally connected". Table 1 indicates that our approach works better than the locally connected layer of Oliveira et al. The locally connected architecture appears prone to overfitting, as its training accuracy is much higher than its testing accuracy. For the same dataset we also tested LBP features with an SVM; the results were only 49% accuracy for training and 47% for testing.

TABLE 1: Results of our model and Oliveira et al.

Input Layers | Architecture      | Epochs | Training Accuracy | Testing Accuracy
3 Channel    | Locally Connected | 120    | 96.53%            | 88.75%
6 Channel    | Locally Connected | 120    | 100.00%           | 86.18%
3 Channel    | Our model         | 100    | 82.26%            | 85.12%
6 Channel    | Our model         | 100    | 94.20%            | 93.94%

Fig 4. The model accuracy and loss for the 6-layer input with our CNN model

V. DISCUSSION

The results indicate that our novel input-layer transformation works well to capture the statistical distribution of pixel relational values. We intend to investigate our approach further for capturing statistical properties. This approach could also be used for capturing and recognizing textures in other domains.

REFERENCES

[1] A. C. Wiedenhoeft and D. E. Kretschmann, "Species Identification and Design Value Estimation of Wooden Members in Covered Bridges," U.S. Department of Agriculture, Forest Service, Forest Products Laboratory, Madison, WI, FPL-GTR-228, 2014.
[2] R. Haralick, K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Trans. Syst. Man Cybern., vol. SMC-3, pp. 610–621, Dec. 1973.
[3] M. Khalid, E. L. Y. Lee, and R. Yusof, "Design of an intelligent wood species recognition system," Int. J. Simul. Syst. Sci. Technol., vol. 9, no. 3, p. 11, 2008.
[4] D. E. Rumelhart, J. L. McClelland, and R. J. Williams, "Learning Internal Representations by Error Propagation," in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1, D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, Eds. MIT Press, 1986.
[5] J. Y. Tou, Y. H. Tay, and P. Y. Lau, "A Comparative Study for Texture Classification Techniques on Wood Species Recognition Problem," in 2009 Fifth International Conference on Natural Computation, Tianjin, China, 2009, pp. 8–12.
[6] P. Y. Lau, Y. H. Tay, and J. Y. Tou, "Rotational Invariant Wood Species Recognition through Wood Species," presented at the Proc. 1st Asian Conference on Intelligent Information and Database Systems, 2009, vol. 3, pp. 1592–1597.
[7] B. Wang, H. Wang, and H. Qi, "Wood recognition based on grey-level co-occurrence matrix," in 2010 International Conference on Computer Application and System Modeling (ICCASM 2010), 2010, vol. 1, pp. V1-269–V1-272.
[8] A. R. Yadav, M. L. Dewal, R. S. Anand, and S. Gupta, "Classification of hardwood species using ANN classifier," in 2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), Jodhpur, India, 2013, pp. 1–5.
[9] J. Martins, L. S. Oliveira, S. Nisgoski, and R. Sabourin, "A database for automatic classification of forest species," Mach. Vis. Appl., vol. 24, no. 3, pp. 567–578, Apr. 2013.
[10] R. Yusof, M. Khalid, and A. S. M. Khairuddin, "Application of kernel-genetic algorithm as nonlinear feature selection in tropical wood species recognition system," Comput. Electron. Agric., vol. 93, pp. 68–77, Apr. 2013.
[11] H. Wang, G. Zhang, and H. Qi, "Wood Recognition Using Image Texture Features," PLoS ONE, vol. 8, no. 10, p. e76101, Oct. 2013.
[12] N. Otsu and T. Kurita, "A new scheme for practical flexible and intelligent vision systems," Jan. 1988.
[13] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[14] P. L. de Paula, L. S. Oliveira, A. de Souza Britto Jr., and R. Sabourin, "Forest Species Recognition Using Color-Based Features," in 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 2010, pp. 4178–4181.
