Illegal Logging Listeners Using IoT Networks

2020 IEEE REGION 10 CONFERENCE (TENCON)
Osaka, Japan, November 16-19, 2020

Ananta Srisuphab∗, Nopparat Kaakkurivaara†, Piyanuch Silapachote∗,
Kitipong Tangkit†, Ponthep Meunpong† and Thanwadee Sunetnanta∗
∗Faculty of Information and Communication Technology, Mahidol University, Nakhon Pathom, Thailand
†Faculty of Forestry, Kasetsart University, Bangkok, Thailand
Email: {ananta.sri, piyanuch.sil, thanwadee.sun}@mahidol.edu, {ffornrm, fforkpt, fforptm}@ku.ac.th
Abstract—Protecting and increasing worldwide green space has been an international effort. Individuals and organizations are encouraged to plant urban trees and to get involved in many reforestation and restoration projects. Offsetting these much needed plans to save the forests is illegal logging. Trees that have grown for many years, some of them protected resources inside restricted areas, are felled and the wood is smuggled. Watching for these illegal activities is very difficult and also very dangerous. It is practically impossible for rangers to patrol every entry and exit point of forests that cover thousands of square kilometers. Applying Internet of Things technology to ecological forestry, we propose integrating sound acquisition networks and acoustic signal analyzers to enhance the robustness of an already successful camera-based surveillance solution that is also equipped with a global positioning system tracker. Our listener devices record sounds of the forest and periodically send them to a cloud storage over cellular networks. The device is affordable, the system is small and portable, and the network is flexibly extensible. From the data, acoustic features are extracted and visualized. The Mel-frequency cepstral coefficients of the signals have exhibited promising distinctiveness for detection of illegal chainsaw activities in the wild.

Index Terms—illegal logging, sound acquisition, IoT, MFCC

I. MOTIVATIONS AND RELATED WORKS

Driven by market forces, demand for high quality timber and wood products has never ceased. With the potential of very high monetary returns, too many consider illegal logging worth doing, despite the risk of being caught and running into trouble with the law. Illegal logging activities could be caught at a harvesting site where trees are cut and wood is processed or transformed, typically into economically-sized wood boards. They could also be caught during transportation from harvesting locations to the edge of the forests or from there to further destinations where wood is collected or sold. While logs are oftentimes transported on foot, trolleys, or motorcycles inside the forests, they are carefully packed, hidden in plain sight, in specially modified vehicles when they are on the road.

SeaForest by Seacon Europe Ltd. is a commercial sensor designed to detect unauthorized vehicles entering the forests, targeting timber thefts, vandalism, waste dumping, and off-road biking. Another product by Rainforest Connection of the United States uses old cell phones to record wildlife and alert rangers of possible illegal logging. Advancing technology does not stop loggers. Instead, they regularly adapt and adjust their strategies to render these systems ineffective. Organizations involved in illegal logging can be highly complex networks, involving many parties and a large number of dangerous weapons and firearms. There is thus a constant urgency for researchers and developers to build an even more robust detector, staying ahead of illegal activities and actively protecting the forests.

Artificial intelligence (AI) and machine learning (ML) have blended into endless lists of applications, engaging in countless fields of study and businesses. Liu and colleagues [1] put together ML applications in forest ecology, from species distributions and carbon cycles to forest management. Their focus was on three commonly used methods: decision trees, neural networks, and support vector machines. They conjectured that communication between ecologists and ML researchers posed challenges and that a major bottleneck was a lack of qualitative and quantitative data. Our research presented here addresses both barriers. Ours is a joint project between teams of forestry specialists and ML as well as software engineers. Furthermore, we have partnered with park rangers, gaining true insights into illegal logging problems and learning about their pain points in catching violators. We propose an IoT device network and a methodology for collecting the large amounts of acoustic data needed for developing a deep learning application, with the goal of improving automated detection of illegal logging activities.

II. INTEGRATED SOUND AND VISION SURVEILLANCE SYSTEMS WITH EMBEDDED GPS TRACKERS

This work is a collaboration with Thap Lan National Park, situated at the center of the Dong Phayayen–Khao Yai Forest Complex, a UNESCO World Heritage Site in Thailand. Thap Lan rangers have had many run-ins with illegal logging and have continuously been improving their approaches to overcome it. A breakthrough came in 2014 when they launched the Network Centric Anti–Poaching System or NCAPS (Fig.1).

Fig. 1. The NCAPS for catching illegal logging activities at the Thap Lan National Park, Thailand. Left: one is installed at a gated entry to a restricted area. Right: its motion sensor, an infrared camera, and a GPS tracker.

This work is supported by the National Research Council of Thailand (NRCT) under the Plant Genetic Conservation Project under the Royal Initiative of HRH Princess Maha Chakri Sirindhorn (RSPG).
978-1-7281-8455-5/20/$31.00 ©2020 IEEE
Fig. 2. Overview of the system architecture, components and configurations, of our embedded sound acquisition system. A Raspberry Pi is programmed
to control a microphone and an attached sound card to continuously record sound in the forests. Audio signals captured are then automatically packaged
and periodically sent through cellular networks to a cloud storage, where the data can be downloaded remotely at any time for further processing.
With NCAPS, the number of cases caught before trees were cut had doubled, while those after damages had been done were reduced by more than half. This has been quite significant for their success, considering that illegal logging should be intercepted in its early stages. The earlier it is caught, the less severe the damage is.

NCAPS comprises four core components: efficient strategic planning when teams move in to search and arrest loggers, effective devices and technologies, readiness of park rangers, and strict enforcement of the law. NCAPS devices consist of motion sensors, infrared cameras, and GPS trackers. Sensors are wrapped in a camouflage casing and installed high up in the trees. Installation of these devices not only discourages a number of loggers but also catches them in the act. Once motion is detected, an email alert is sent to rangers in charge. The few true positives that should be acted upon are buried in a very large number of false alarms; rangers could receive up to eight or nine hundred email alerts in a day.

Our research project aims to enhance this current camera-and-GPS-based NCAPS regime by integrating analysis of sound. Audio information can help confirm or reject alerts when suspiciously abnormal motion is detected. It can be used to filter false positive email messages, leaving rangers with fewer that require attention. Additionally, since sound can be heard further away than the typically usable field of view of a camera, the sound of logging activities can be captured before loggers come into the view of a camera. This effectively increases the coverage distance of the system. In a similar work, Ahmad and Singh detected the sound of tree cutting by axes [2]. Unlike their work, we are capturing and visualizing the use of chainsaws, which are one of the most common tools used in both harvesting and processing steps. We built an IoT sound acquisition network and deployed it in a real, but controlled, wilderness environment. From recorded audio files, acoustic features were extracted to detect and classify illegal chainsaw activities.

III. EMBEDDED SYSTEM DESIGN AND DEVELOPMENT

Our embedded sound listener device was designed for high performance while remaining portable and affordable. An overview of its architecture and configuration is shown in Fig.2. Every component is hand-picked so as to record high quality sound with a sufficiently high sampling rate, good audio sensitivity, and a minimal amount of unwanted noise. Each individual unit is designed so that it is readily extensible; multiple units can then be deployed as an extended logging listener network on a budget.

To capture high quality sound, we selected a professional-grade, broadcast-quality lavalier microphone; specifically, our choice was a RODE SmartLav+. Its high-performance omnidirectional condenser capsule enables our system to pick up acoustic sound signals equally from all directions in a three-dimensional sphere pattern around this 4.5-millimeter miniature-sized microphone. Equipped with a wind shield or a pop filter, it is less sensitive to wind and other unwanted background pop-noises. The RODE SmartLav+ spans a frequency range from 20 to 20,000 Hertz. Its output connector is a TRRS (i.e. tip/ring/ring/sleeve) jack, which we connected to a RODE S3 3.5MM TRRS to TRS adapter.

Sounds captured are acoustic analog signals. Sound is a mechanical wave generated by physical vibrations of air particles or molecules of other mediums. It is a longitudinal pressure wave that propagates through a medium by means of high-pressure compression and low-pressure rarefaction. A sound wave propagates in the direction that is the same as or parallel to the displacement of the vibrations of air particles. Sound travels in all directions and it echoes when it hits and bounces off a solid surface. Conversion of the analog sound signal to digital samples is necessary to enable digital sound signal processing. In our design, this conversion is carried out by an external sound card. The TRS jack output from the microphone cable is connected to a UGREEN 30724 external stereo sound adapter, which is then plugged into a Raspberry Pi 3 Model B+ embedded single-board computer through a Universal Serial Bus (USB) 2.0 port.

A Raspberry Pi module is our on-site, pint-sized operational control and computational unit. The Pi stores the audio data it receives locally for a short period of time. Audio data saved initially on its own local storage, a micro SD card, is periodically packed and sent to an off-site cloud storage. Connected to the Pi is a USB air card dongle with active cellular communication channels, and through cellular networks, data are deposited in the cloud. Our approach employs the fourth generation of broadband cellular network technology (4G) as used in [3]. This choice was made
to enable extended uploading of audio data with minimal interruption and maintenance. Though each listener unit incurs the charges of a 4G data plan, the main advantage of using mobile networks over WiFi connections is the extended range. This is a critical factor for fieldwork in remote sites, as was also a concern in [4]. WiFi has a limited range within reach of a wireless router, whereas the coverage of 4G services is much wider and more accessible.

Fig. 3. Scenes of our experimental sites where sound was recorded. Left: Site A, the eucalyptus reforestation. Right: Site B, the thick natural forest.

Fig. 4. Topographic map of our experimental sites with marked locations of our listeners and sound sources, where log cutting with chainsaws was conducted. At Site A, sound is recorded at 100 m and 300 m away from the source, whereas at Site B, the recorders are set at 100 m and 270 m. This map is generated using a GPSVisualizer tool by Adam Schneider.
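The 4G data charges for a listener unit depend on how much audio it uploads. A rough estimate can be sketched as follows. The 22,050 samples-per-second rate is the one reported in Section V, and the five-minute recording followed by a five-minute break is the schedule described in Section IV. The 16-bit, mono, uncompressed PCM encoding is an assumption made only for this illustration, since the stored audio format is not stated.

```python
# Back-of-the-envelope upload volume for one listener unit.
SAMPLE_RATE_HZ = 22_050   # sampling rate reported in Section V
BYTES_PER_SAMPLE = 2      # ASSUMPTION: 16-bit PCM samples
CHANNELS = 1              # ASSUMPTION: mono recording
CLIP_SECONDS = 5 * 60     # five-minute recordings (Section IV)
CLIPS_PER_HOUR = 6        # five minutes on, five minutes off


def clip_bytes(seconds=CLIP_SECONDS):
    """Size in bytes of one uncompressed clip (file header ignored)."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * seconds


def hourly_bytes():
    """Bytes uploaded per hour of duty-cycled recording."""
    return clip_bytes() * CLIPS_PER_HOUR


if __name__ == "__main__":
    print(f"per clip: {clip_bytes() / 1e6:.2f} MB")   # per clip: 13.23 MB
    print(f"per hour: {hourly_bytes() / 1e6:.2f} MB")  # per hour: 79.38 MB
```

Under these assumptions a single clip is about 13 MB and an hour of operation about 79 MB per listener; compressing the audio before upload would lower both figures.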
Fig. 5. Setting of our sound acquisition devices. Each listener is placed on the ground with a loosely covered camouflage tarp for protection.

Once in the cloud, recorded digital audio data can then be checked remotely and downloaded for further processing. Using cloud data storage instead of a local SD card or an on-site wired external hard disk enables building a very large data bank and also reduces the Size-Weight-and-Power (SWaP) requirements. Supplying electrical power for any additional electronics could be challenging when devices are deployed off grid. We note here that although our prototype Raspberry Pi is currently powered by a power bank, our system is designed to be self-sustaining on solar power, an option that could be handily implemented.
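The store-then-upload cycle described in this section can be sketched in a few lines. This is a minimal illustration, not our deployed software: it uses only the Python standard library, it assumes 16-bit mono WAV files, and the names `save_clip`, `flush_to_cloud`, `synthetic_tone`, and the `upload` callback are hypothetical. Capturing audio from the USB sound card and the actual cloud transfer over 4G are deliberately left out; a synthetic tone stands in for a recording.

```python
import math
import struct
import tempfile
import wave
from pathlib import Path

SAMPLE_RATE = 22_050  # Hz, the sampling rate reported in Section V
SAMPLE_WIDTH = 2      # bytes per sample; ASSUMPTION: 16-bit PCM


def save_clip(samples, path, sample_rate=SAMPLE_RATE):
    """Write mono 16-bit samples (ints in [-32768, 32767]) to a WAV file."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return path


def flush_to_cloud(spool_dir, upload):
    """Try to upload every spooled clip; delete a local copy only on success.

    `upload` is a hypothetical callable that takes a Path and returns True
    when the transfer succeeded, so clips survive a dropped connection.
    """
    sent = []
    for clip in sorted(Path(spool_dir).glob("*.wav")):
        if upload(clip):
            clip.unlink()  # free space on the SD card
            sent.append(clip.name)
    return sent


def synthetic_tone(seconds=0.1, freq=440.0):
    """Stand-in for a microphone recording: a short sine tone."""
    n = round(seconds * SAMPLE_RATE)
    return [int(12000 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
            for i in range(n)]


if __name__ == "__main__":
    spool = Path(tempfile.mkdtemp())
    save_clip(synthetic_tone(), spool / "clip_000.wav")
    print(flush_to_cloud(spool, upload=lambda clip: True))
```

Deleting a clip only after a confirmed upload is one way to tolerate the intermittent connectivity expected at remote sites; the local SD card then acts as a short-term buffer, as described above.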
IV. DATA COLLECTION IN THE FORESTS

To properly collect real sounds of chainsaw logging activities, we carried out actual log cutting with chainsaws at the Wang Nam Khiao Forestry Student Training and Research Station, hosted by Kasetsart University. This training center is located in Wang Nam Khiao district, Nakhon Ratchasima province, in the northeast of Thailand. Our data collection was performed over a three-day field camp in March of 2020.
Selected locations included a eucalyptus forest (Site A) and a natural forest (Site B); both are shown in Fig.3. The eucalyptus forest is under a restoration project. It is a relatively open space without much ground cover. On the other hand, the natural forest is a much thicker wood of high biodiversity, exhibiting both species richness and evenness. Differences between these two locations enable a more rigorous evaluation of the usability of our devices, as trees and branches affect reflection and refraction of sound. Additionally, Site B is located closer to a major highway with heavy traffic at times. As a consequence, sounds of vehicles passing by do mix into our recorded audio data. Though this may seem undesirable, it usefully adds a challenging factor for our sound processing unit.

At each site, we set up two sound listener devices at some distance away from the sound source where log cutting is performed. Site A has one listener at 100 meters and the other at 300 meters. Similarly, Site B has one at 100 meters and the other at 270 meters. Terrain maps of the areas with these locations marked are depicted in Fig.4 and photographs of the locations are displayed in Fig.5. A camouflage tarp was used to cover our sound acquisition devices. To keep all electronics from moisture, rain, or small insects, we secured our circuit boards and all connected components in a sealed hard-plastic housing, illustrated in Fig.6.

Fig. 6. A listener device in a sealed plastic box and its detail components.

Logging and cutting with chainsaw activities were planned so they mimicked how illegal logging would be performed in real incidents. Case reports from the Department of Forestry of Thailand indicated that loggers do use various different
types of chainsaws, ranging from compact-sized to heavy-duty, and also including electric saws. Making them even harder to hear and catch, some are apparently modified with sound absorbing materials or soundproofing gear. For our experiment, three commonly used chainsaws were selected. Conditions and sizes of logs were varied, as well as how saw engines were run while wood was cut. These variations are summarized in TABLE I and photographs of our logging activities are shown in Fig.7.

TABLE I
VARIATIONS IN CHAINSAW AND LOG CUTTING ACTIVITIES.
STIHL Chainsaw      MS180: small and light duty
                    MS382: large forestry and agricultural saw
                    MSA120C: electric, cordless, battery-powered
Condition of wood   Fresh, moist, or dry
Size of wood (mm)   60, 120, 145, 160, 175, 190, 275, or 320
Logging activities  Starting and warming up of the saw engine
                    Cutting with or without accelerating the engine

Fig. 7. Our wood cutting activities using chainsaws, conducted at the Wang Nam Khiao Forestry Student Training and Research Station. These served as sound sources of our data collection, simulating possible illegal logging.

Logging was conducted as sound was collected in intervals of five-minute recording with a five-minute break. The gap times allow audio files to finish being written to storage and transferred to the cloud. This experiment yielded 30 minutes of recorded logging activities per hour. Our log cutting activity time totaled 13 hours. With four listeners running in parallel, 26 hours of audio data were recorded. Logging activities during the 5-minute recording were performed as naturally as possible. Chainsaws were run for a minute or two or sometimes longer. Wood was cut, split, stacked, or moved. There were interruptions when chains failed and needed to be re-set, or batteries ran out and needed to be changed. With these settings, the data collected contain both positive and negative samples of the sound of wood cutting with chainsaws, which is our target for detection of illegal logging. In addition to sound, environmental data recorded included temperature, humidity, wind speed, and wind direction. These parameters could potentially have an effect and will be used later in the analysis of acoustic signals.

V. ACOUSTIC SIGNAL PROCESSING AND VISUALIZATION

The simplest and most familiar visualization of sound is a time series plot of the amplitude. Fig.8 shows four examples from our data set. Fig.8(a) and 8(b) are from the eucalyptus forest under restoration (Site A) while Fig.8(c) and 8(d) are from the natural forest (Site B). In the other dimension, Fig.8(a) and 8(c) were recorded at a distance of 100 meters from the sound source while Fig.8(b) and 8(d) were 300 meters and 270 meters away, respectively. Parameters of these four sound samples are listed in TABLE II. Examining the signal in Fig.8(a), the sounds of active chainsaws are evident in the beginning of the time series, around the halfway mark, and at the end. Chainsaw patterns are less apparent as the distance from a sound source triples, shown in Fig.8(b). Under the noisy environment at Site B, Fig.8(c) and 8(d), the waves are noticeably overwhelmed by undesired noise, making it very difficult, if not impossible, to pinpoint during which segments of time chainsaw activities occur.

TABLE II
CHARACTERISTIC PARAMETERS OF THE SELECTED SOUND SAMPLES.
Sample  Site  Distance  Chainsaw  Wood   Size of Log  Chainsaw Engine
(a)     A     100 m     Large     Dry    321 mm       start and accelerate
(b)     A     300 m     Electric  Dry    120 mm       no acceleration
(c)     B     100 m     Large     Moist  275 mm       with acceleration
(d)     B     270 m     Small     Dry    160 mm       with acceleration

In digital sound processing, acoustic signals are typically transformed from the time to the frequency domain. A discrete Fourier transform (DFT) is used to decompose a signal into a linear combination of its sinusoidal frequencies and phase contents. This projection of a sound wave onto the Fourier spectral domain uncovers its constituent frequency components, revealing which frequencies are present and how strongly. Fourier transforms, however, are not generally applied to the entire length of the wave. Instead, audio signals are broken down into small chunks of time, and to each of these tiny sound segments a short-time Fourier transform (STFT) is applied. It thus details frequency components of a nominally constant signal, unwrapping the underlying discriminative features. Following commonly used settings, 25 ms chunks were implemented in this work. To maintain continuity of signals, an overlapping window of 10 ms was applied between every consecutive chunk.

Resulting Fourier spectra are complex-valued vectors, with one column vector for each 25 ms of audio. These are visualized through a spectrogram, also referred to as a periodogram [5]. It is the logarithmic scale of the power spectral density (PSD), denoted by P, of Fourier energies X at every frequency k. The STFT spans over N samples of a time-varying signal x(n). Since a 25 ms chunk was implemented and our signal was recorded at 22,050 samples per second, N = 551. The PSD function is expressed as:

$$P(k) = \frac{1}{N}\,\lvert X(k)\rvert^{2} = \frac{1}{N}\,\left\lvert \sum_{n=0}^{N-1} x(n)\, e^{-i 2\pi k n / N} \right\rvert^{2}$$

A single spectrogram is formed from columns of the power spectrum of each 25-ms segment stitched together along the time axis. A spectrogram is a graphical representation of sound that shows sequential characteristics of the signal. It displays how the intensity of sound is distributed in the frequency domain and how the frequency spectrum changes over time. Fig.9 depicts four periodograms corresponding to the four waves presented in Fig.8. Sounds of chainsaws are clearly exposed in Fig.9(a) and 9(c), both of which are recorded at 100 meters. While chainsaws are visually detectable in both the
time and the frequency domain for sample (a), they are only distinguishable in the frequency domain for sample (c). The frequency spectrum has effectively separated the higher-intensity sound of log-cutting chainsaws from lower-intensity environmental noise. At the farther distances, samples (b) and (d), the sound captured is naturally less intense. Patterns of chainsawing are distinguishable but considerably more faded. Observing (d), in addition to chainsaws, frequency separations also include the sound of bird calls. These are the multiple short vertical bars in the periodogram.

Fig. 8. Examples of sound waves plotted in time domain. Two signals in the top row (a)-(b) are from site A and the other two on the bottom row are from site B. Those on the left, (a) and (c), are recorded at 100 m away from the sound source. Two on the right are set at 300 and 270 meters, respectively.

Fig. 9. Pictorial representation of our sample audio data in the frequency domain. Each plot is a spectrogram of the corresponding time series sound wave shown in Fig.8. The x-axis is time in seconds. The y-axis is frequency in cycles per second or Hertz (Hz). Colors represent the third dimension, which indicates the intensity of each frequency component of the acoustic signal in logarithmic scale in the unit of decibel (dB).

The frequency scale in periodograms is linear in the unit of cycles per second or Hertz (Hz). This, however, does not correspond to human auditory systems. On the Hertz scale we perceive musical tones more discriminatively at lower frequencies, whereas acoustic signals at higher frequencies are perceived closer together. To reflect this non-linearity, f Hz in the Hertz-scale is mapped to m mel in the Mel-scale (Mel is short for melody), following [6]:

$$m = 2595 \log_{10}\left(1 + \frac{f}{700}\right)$$

Mathematically, the powers of the Fourier spectrum are filtered with the Mel-scale triangular overlapping windows. The Mel triangular filters resemble human perception of the frequency contents of sounds by being narrower at lower frequencies and becoming wider as frequencies increase. Taking the logarithm of the power of Mel-filtered signals results in a waveform onto which a discrete cosine transform (DCT) is applied. In addition to extracting periodicity of the harmonics, since DCT is closely related to the Karhunen–Loève transform and principal component analysis (PCA), it also de-correlates the log-energies. The DCT-transformed signal is presented in a quefrency domain, which is nominally a time domain, and its amplitudes are those of Mel-frequency cepstral coefficients or MFCC.

The Mel-frequency cepstral coefficients of our four sample audio signals are demonstrated in Fig.10. Differences between the times at which there are chainsaw activities and those without can be observed more distinctly in samples (a) and (c). They are less pronounced in samples (b) and (d). The signal in (d) appears to be the most difficult of all four. It is not only at the farther distance in a noisier site, but also intermixed with bird calling sounds.

Focusing on individual characteristics of sound at instances of time, we plotted the MFCCs of half-a-second long signals, randomly segmented from each of the four audio samples in Fig.10. This reveals strikingly distinctive features, presenting a noticeably clear separation between chainsaw logging activities and other sounds; these are demonstrated in
Fig.11 and Fig.12, respectively. Regardless of the distances from sound sources or how noisy the environment is, the Mel-frequency cepstrum of log cutting chainsaws and those of the surrounding environment and background noise can hardly be mistaken even by visual inspection.

Fig. 10. The Mel-frequency cepstrum of our sample audio data. Each corresponds to the time-series plot in Fig.8 and the frequency-domain periodogram in Fig.9. Same as both Fig.8 and Fig.9, the horizontal axis of the plots is the time from 0 to 4 minutes. The vertical axis is the columns of the MFCCs.

Fig. 11. Extracted MFCC features of the sound of log cutting activities using chainsaws. Each was computed over half-a-second of audio that was randomly segmented from the corresponding full-length signal in Fig.10.

Fig. 12. Extracted MFCC features of the sound of surrounding environment and background noise. Each was computed over half-a-second of audio that was randomly segmented from the corresponding full-length signal in Fig.10. By visual observation, these plots of Mel-frequency spectrum are discriminatingly different from chainsaw activities presented in Fig.11.

The resulting MFCCs provide us with very promising acoustic features with remarkably high potential to be employed in future analysis of this research work, particularly detection and classification of chainsaw activities. Sound is typically processed and evaluated in microscopic fractions of a second to represent information at instantaneous moments of time. With millisecond overlapping windows, a one-minute audio recording can result in up to a couple of thousand features. A large number of these will be required to effectively apply techniques in machine learning such as deep convolutional neural networks.

VI. CONCLUSIONS AND FUTURE WORKS

Teaming with front-line experienced park rangers and using advanced technologies to tackle illegal logging problems, this research work is an unparalleled collective effort, bringing together forestry researchers and computer scientists in an endeavor to develop a practical cloud-based network of chainsaw listeners. A field test confirmed that our IoT acquisition devices can capture log cutting sound. Visualization and processing of recorded data showed that Mel-frequency cepstral coefficients can distinctly differentiate audio signals of chainsaws from the surroundings. Through this proof of concept, we have tools to collect large amounts of acoustic data and a methodology to extract discriminative features. Both are necessary for training deep learning networks. Our ultimate goal is an integrated alert system that senses both visual information and sounds.

REFERENCES
[1] Z. Liu et al., “Application of machine-learning methods in forest ecology: Recent progress and future challenges,” Environmental Reviews, vol. 26, Jul 2018.
[2] S. F. Ahmad and D. Singh, “Automatic detection of tree cutting in forests using acoustic properties,” Journal of King Saud University - Computer and Information Sciences, Feb 2019.
[3] S. Sethi, R. Ewers, N. Jones, C. D. Orme, and L. Picinali, “Robust, real-time and autonomous monitoring of ecosystems with an open, low-cost, networked device,” Methods in Ecology and Evolution, vol. 9, Sep 2018.
[4] P. Kalhara et al., “Treespirit: Illegal logging detection and alerting system using audio identification over an iot network,” in International Conference on Software, Knowledge, Information Management and Applications, Dec 2017.
[5] S. V. Vaseghi, Advanced Digital Signal Processing and Noise Reduction, 4th ed. United Kingdom: John Wiley and Sons, Ltd., 2008.
[6] D. O’Shaughnessy, Speech Communication: Human and Machine. Reading, Mass: Addison-Wesley, Pub. Co., 1987.