Empirical Mode Decomposition


IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 61, NO. 12, DECEMBER 2012 3175

FPGA Implementation for Real-Time Empirical Mode Decomposition
Ying-Yi Hong, Senior Member, IEEE, and Yu-Qing Bao

Abstract—This paper presents a novel field-programmable gate array (FPGA) based method for empirical mode decomposition (EMD) in real time. Traditionally, EMD can be easily implemented and developed using a high-level computer language in a PC or DSP chip. However, it is difficult to implement EMD in a hardware environment. This paper develops EMD for real-time applications using a hardware-based FPGA. The proposed FPGA-based method calculates the upper and lower envelopes in EMD point by point by using a circular queue to temporarily store values of maxima and minima, from which the upper and lower envelopes in the EMD can be determined continuously. Additionally, an attempt is made to increase the efficiency of the computational process by cascading several identical modules as a serial pipeline structure in order to conduct an iterative loop for calculating the intrinsic mode functions in EMD. The fast process from the serial pipeline structure results in real-time computation with a sampling rate of up to 12.5 MHz and mitigation of the end effect. The proposed method is validated by the simulation results obtained by Quartus II and verified by FPGA (Altera Stratix III EP3SL150F1152C2) realization, revealing its effectiveness in real-time applications.

Index Terms—Field-programmable gate arrays (FPGAs), pipeline processing, real-time systems, signal analysis.

Manuscript received January 1, 2012; revised May 28, 2012; accepted May 29, 2012. Date of publication September 19, 2012; date of current version November 15, 2012. This work was supported in part by the National Science Council, Taiwan, under Grants NSC 99-2632-E-033-001-MY3 and NSC 100-2221-E-033-002 and in part by the Institute of Nuclear Energy Research, Taiwan, under Grants NL1000250 and NSC 101-3113-P-042A-0019. The Associate Editor coordinating the review process for this paper was Dr. Kurt Barbe.
Y.-Y. Hong is with the Department of Electrical Engineering, Chung Yuan Christian University, Chung Li City 320, Taiwan (e-mail: yyhong@dec.ee.cycu.edu.tw).
Y.-Q. Bao is with the School of Electrical Engineering, Southeast University, Nanjing 210096, China (e-mail: 503000747@qq.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIM.2012.2211460
0018-9456/$31.00 © 2012 IEEE

I. INTRODUCTION

TIME-FREQUENCY studies are essential for signal analyses. Traditional time-frequency methods, such as short-time Fourier transform (STFT), Wigner–Ville distribution, and wavelet transform, play an important role in science and technology. However, based on the Fourier transform concept, these methods fail to self-adjust according to the characteristics of the signal itself in order to perform optimally. Moreover, such methods are restricted by the uncertainty principle (which imposes mutually related constraints on time- and frequency-domain resolutions).

The Hilbert–Huang transform (HHT) [1], which was proposed by N. E. Huang in 1998, is a time-frequency analysis method that is applicable to both nonlinear and nonstationary signals. Unlike traditional signal processing methods, the HHT does not depend on any a priori assumptions before the signal processing and is free from the uncertainty principle. Many successful applications have been presented, such as fault detection in machines [2], [3], electrocardiogram [4], [5], analysis of power quality and electromagnetic transients [6], [7], and acoustic analysis [8], [9].

The HHT consists of empirical mode decomposition (EMD) and the Hilbert transform. As the foundation of the HHT, EMD can decompose a signal into scaled signals with different features. Each scaled signal is called an intrinsic mode function (IMF), which consists of the essential features of the signal. However, IMF requires many iterative calculations, which cannot be made in parallel. Hence, realizing a real-time environment using software for developing EMD is difficult. Most HHT-based studies involve the analysis of nonreal-time signals using personal computers. Many applications require real-time EMD results. For example, the detection of faults in an operating machine must be carried out in real time. With a real-time HHT, a faulted machine can be identified and stopped when a fault occurs. However, achieving fast EMD using hardware remains a major challenge in the development of a real-time HHT.

A field-programmable gate array (FPGA) based method was proposed for use as a hardware accelerator to achieve real-time EMD in [10]. The FPGA-based hardware accelerator can significantly enhance the computational performance to obtain the upper and lower envelopes. However, in this method, FPGA serves only as an ancillary accelerator. The core function of the studied system in [10] is still conducted by software. The merits and limitations of software-based and hardware-based EMD implementations have been discussed in relation to a theoretical analysis of EMD in [11]. However, relevant experimental results were not provided in [11].

A technique to implement real-time EMD using a DSP chip and an FPGA chip was presented in [12]. The FPGA chip was adopted as a controller, in which the sampled data were incorporated into the ping-pong buffer whose length was 1000. One thousand sampled data were simultaneously incorporated into the DSP chip. Also, the iteration loops of EMD were calculated using the DSP chip. However, this method can only implement real-time EMD for those signals whose frequencies are below 1 kHz. Moreover, this method suffers from the end effect [12].

On the other hand, microprocessors rely on software to implement functions efficiently, e.g., iterative calculations. However, the frequency of the microprocessors restricts the speed of such implementation. Conversely, hardware-based technology with a unique structure, such as the pipeline structure, can overcome the restriction imposed by the processor frequency.
Hardware-based FPGAs provide an approach to design digital
systems that complement the role of microprocessors. FPGAs
are configured using SRAM, enabling them to be manufactured
by standard VLSI fabrication processes. The functionality of
FPGA is interwoven into the logic structure. There are many
applications using FPGA, e.g., power measurement [13], in-
duction motor failure detection [14], power quality monitoring
[15], and time-to-digital converters [16]. Accordingly, the FPGA
is a powerful component for the development of real-time
applications.
This paper proposes a novel FPGA-based method for imple-
menting EMD in real time. Unlike the methods in [10] and [11],
the proposed method involves only FPGA-based hardware. Unlike
the method in [12], which decomposed the studied data in groups,
the proposed method does not divide the data. In the proposed method, the
studied data can be input into the FPGA chip sequentially and
consecutively. Also, the FPGA chip outputs the results sequen-
tially and consecutively to achieve real-time computation. More
specifically, calculating the upper and lower envelopes involves
designing a group of circular buffers to temporarily store the
maximum or minimum values of three consecutive sampled
points. Every iteration in an IMF computation is packaged
into a module. The sampling rate in the proposed method
can reach 12.5 MHz for real-time EMD applications. The end
effect occurs only during the initial/final stages and can be
neglected.
In this paper, the proposed method is first validated by
simulation using a software platform (Quartus II). Finally, the
Verilog code is synthesized, and the result is mapped into the FPGA (Altera Stratix III) resources in order to conduct the hardware verification.

II. EMD THEORY

EMD decomposes a signal into different scaled data sequences of distinct features. Each sequence is called an IMF, which must satisfy the following two conditions [1].

1) In the whole data set, the numbers of extreme points (local maxima and minima) and zero-crossing points must be equal to each other or differ by at most one.
2) At any point of the signal, the mean values of the envelope defined by the local maxima and the envelope defined by the local minima must be equal to zero.

Traditional algorithmic steps for applying EMD to a signal x(t) (a function of time) are described as follows.

Step 0: let i and j denote the outer and inner iterative indices, respectively. Initially, $i = 1$, $j = 1$, and $x_{(i,j)}(t) = x(t)$.
Step 1: identify all local maxima and minima of the signal $x_{(i,j)}(t)$, and then, interpolate these extreme points by cubic spline curve fitting or sawtooth transform (ST) [17] to generate the upper and lower envelopes.
Step 2: calculate the mean values of the upper and lower envelopes, which are defined as $m_{(i,j)}(t)$.
Step 3: calculate the difference between $x_{(i,j)}(t)$ and $m_{(i,j)}(t)$: $h_{(i,j)}(t) = x_{(i,j)}(t) - m_{(i,j)}(t)$.
Step 4: if a convergence criterion (e.g., S number criterion [18] in this paper: $j < S$) is satisfied, then $h_{(i,j)}(t)$ is an IMF which is defined as $c_i(t)$: $c_i(t) = h_{(i,j)}(t)$; otherwise, $x_{(i,j+1)}(t) = h_{(i,j)}(t)$, $j = j + 1$, and go to step 1.
Step 5: estimate the residual signal $r_i(t)$ by using $r_i(t) = x_{(i,1)}(t) - c_i(t)$.
Step 6: if $r_i(t)$ fulfills the termination criterion, stop; otherwise, let $x_{(i+1,1)}(t) = r_i(t)$, $i = i + 1$, $j = 1$, and go to step 1.

The aforementioned algorithmic steps yield the following equation

$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t) \tag{1}$$

where n denotes the number of extracted IMFs and $r_n(t)$ is the residue component. Fig. 1 shows the flowchart of EMD.

Fig. 1. Flowchart of EMD.

The “convergence criterion,” as depicted in step 4, determines the number of sifting steps to produce a proper IMF when implementing EMD in practice. EMD convergence criteria include the Cauchy-type convergence criterion, S number criterion [18], and frequency-bandwidth criterion [19]. This paper adopts the S number criterion because it is the simplest and most suitable for real-time application, and it requires the fewest memories. The S number criterion implies that the number of iterations to produce an IMF is customized. Additionally,

the value of S is assumed to be 3–5, resulting in a good IMF [18]. Following determination of the number of sifting steps, the number of iterations is fixed and linked together in the form of a cascaded (pipeline) connection.

Less important than the convergence criterion, the “termination criterion” in step 6 can be set according to the required number of IMFs [18].

In the standard EMD process, the so-called cubic spline interpolation is generally used for calculating the upper and lower envelopes of x(t). ST was developed in [17] to extract such envelopes. This method does not require the calculation of a derivative at the extreme points: it requires only that all of the extreme points are connected by line segments to generate the envelopes. Fig. 2 shows an example of obtaining envelopes by both cubic spline interpolation and ST.

Fig. 2. Envelopes computed by cubic spline interpolation and ST. (a) Original data. (b) Envelopes computed by cubic spline interpolation. (c) Envelopes computed by ST.

Fig. 2(b) and (c) differ only slightly from each other, indicating that ST is not inferior to the cubic spline interpolation. Characterized as simple and fast, ST requires less hardware and is highly promising for real-time applications. Because this paper focuses on implementing real-time EMD using hardware, ST is adopted instead of the cubic spline method to calculate the upper and lower envelopes of x(t) in a hardware environment.

The next section elaborates the implementation of the proposed method.

III. DESIGN IMPLEMENTATION

A. Review of Existing Method

This section first introduces the method developed in [12] to demonstrate the effectiveness of the proposed method in implementing EMD. According to the EMD algorithm introduced in Section II, the input signal x(t) is a function of time. Assume that the whole data set is as follows:

$x(t_1), x(t_2), \ldots, x(t_k)$

where $x(t_1)$ is the first sampled point, $x(t_2)$ is the second one, and $x(t_k)$ is the kth one.

The whole data set was stored using many ping-pong buffers in [12]. Each ping-pong buffer has a length of 1000, implying that the whole data set must be divided into sections

section 1: $x(t_1), x(t_2), \ldots, x(t_{1000})$;
section 2: $x(t_{1001}), x(t_{1002}), \ldots, x(t_{2000})$;
....

The EMD that was proposed in [12] was applied to one section at a time. Fig. 3 shows the block diagram of that EMD approach [12]. Steps 1–6 in Fig. 3 are described in Section II; the symbols $x_{(i,j)1000}$, $m_{(i,j)1000}$, $h_{(i,j)1000}$, $c_{(i)1000}$, and $r_{(i)1000}$ denote 1000 sampled points of $x_{(i,j)}(t)$, $m_{(i,j)}(t)$, $h_{(i,j)}(t)$, $c_{(i)}(t)$, and $r_{(i)}(t)$, respectively. Also, $x(t_p)$ is the sampled data at moment $t_p$. $c_i(t_p)$ and $r_n(t_p)$ are the IMFs and the residual component of input signal x(t), respectively.

Fig. 3. Block diagram of the EMD approach proposed in [12].

The method in [12] has at least two disadvantages.

1) According to Fig. 3, many iterative calculations are necessary when EMD is applied to the 1000 sampled points $x_{(i,j)1000}$. The inner iteration loop, which comprises steps 1–4, sifts $h_{(i,j)1000}$ to attain an IMF; the outer iteration loop, which comprises steps 1–6, finds all of the IMFs ($c_{(i)1000}$) and the residual component ($r_{(n)1000}$). If the inner iteration loop is performed S times and the outer

iteration is conducted n times, then steps 1–6 will be carried out as many as S × n times. When EMD is processing $x_{(i,j)1000}$, other data must wait in the ping-pong buffers until S × n iterations for $x_{(i,j)1000}$ are finished. This treatment makes the method in [12] rather inefficient.
2) When the envelopes of each section including 1000 sampled points are calculated, the values cannot be obtained accurately at two ends of the section (e.g., the points $x(t_1)$ and $x(t_{1000})$), resulting in the end effect for every 1000 sampled points.

To solve the aforementioned problems, this paper proposes a novel FPGA-based method to implement EMD in real time. The rest of this section describes the design of the proposed method. The overall system is introduced first; then, the details of the modules for calculating the upper and lower envelopes are described.

B. Overall Realization

Figs. 4 and 5 show the overall block diagrams of EMD and the calculation of $h_{(i,j)}(t)$ (defined in step 3 of Section II) in an iteration, respectively. The terms $x_{(i,j)}(t_p)$, $m_{(i,j)}(t_p)$, $h_{(i,j)}(t_p)$, $c_{(i)}(t_p)$, and $r_{(i)}(t_p)$ are simplified as $x_{(i,j)}$, $m_{(i,j)}$, $h_{(i,j)}$, $c_{(i)}$, and $r_{(i)}$ here. The EMD algorithm requires many iteration loops to calculate each IMF. The iterative loops are herein modified to a serial pipeline process. As shown in Fig. 4, the process for obtaining the final IMF can be realized by the cascading modules defined in Fig. 5.

Fig. 4. Block diagram of achieving overall EMD.

Fig. 5. Block diagram for calculating $h_{(i,j)}$ in an iteration (steps 1–3).

By this serial pipeline structure, when the Nth module calculates $h_{(i,j)}$ for data $x(t_p)$, the other (N − 1)th, (N − 2)th, . . ., modules can calculate $h_{(i,j-1)}$, $h_{(i,j-2)}$, . . ., for data $x(t_p - \Delta t)$, $x(t_p - 2\Delta t)$, . . ., simultaneously (assuming $j > 2$; $\Delta t$ is the time required to calculate $h_{(i,j)}$ once). Hence, this cascaded design can greatly enhance the computational speed while maintaining continuous data flow.

C. Modules for Calculating Upper and Lower Envelopes

The structure for calculating every upper or lower envelope is vital in achieving the overall realization of the pipeline structure. This paper also proposes modules to calculate continuously the upper and lower envelopes. Figs. 6 and 7 show a flowchart and a block diagram, respectively, for the computation of the upper envelope. The lower envelope is determined in the same way. During the experiment, a sample is assumed here to be taken in each clock cycle.

Fig. 6. Flowchart of computing the upper envelope.

According to Fig. 7, first, input $x(t_p)$ is sent to a buffer, which comprises three registers: Register A, Register B, and

Register C. The input data at moments $(t_p - 2\delta t)$, $(t_p - \delta t)$, and $t_p$ are temporarily stored in Register A, Register B, and Register C (where $\delta t$ is the duration of the sampling cycle), respectively. These three registers constitute a register queue. During each clock cycle, a datum is moved to the next register from the current one. The datum in Register B is compared with those in Register A and Register C by two comparators, whose results pass through an AND gate. If the datum in Register B is found to be larger than those in both Register A and Register C, then the datum in Register B is a maximum. The maximum is placed at the end of Circular Buffer A, and the rear pointer of Circular Buffer A points to the next node. Simultaneously, the Counter starts counting (in every clock cycle, the value of the Counter increases by 1). When the next maximum arrives, the Counter is cleared to zero, and a new round of counting starts. Meanwhile, the rear pointer of Circular Buffer B points to the next node, and the last counting result is placed at the rear of Circular Buffer B. The values of the maximum points are stored temporarily in Circular Buffer A, and the time intervals between every two maximum points are stored temporarily in Circular Buffer B. Whenever a new datum enters Circular Buffer A, the timing of the new $b_m$ clock cycles begins. Hence, $b_m$ is a number determined by real situations and should be larger than any data in Circular Buffer B. Whenever $b_m$ clock cycles come to an end, the data in the front nodes of Circular Buffer A & B are discarded, and the corresponding front pointers point to the next nodes. The upper envelope can be obtained by $a_1 + (a_2 - a_1)\, i_t / b_1$ (where $a_1$ and $a_2$ refer to the first and second data in front of Circular Buffer A; and $b_1$ is the first datum in front of Circular Buffer B) using the data in the front nodes of the two circular buffers.

Fig. 7. Block diagram of computing the upper envelope.

Following the delay of $b_m$ clock cycles and computation of several clock cycles in the computation module, the upper envelope can be attained at the output. As long as the data are inputted continuously, the envelope at the output is also continuous. The value of $b_m$ affects the performance of the EMD implementation. Hence, the value of $b_m$ should be assigned according to real situations and should be larger than the maximum time span between the two adjacent maxima. The length of Circular Buffer B should be equal to $b_m$ in order to store a sufficient amount of data. When the value of $b_m$ is chosen, $b_1$ should not exceed $b_m$: $b_1 \le b_m$. Therefore, a larger $b_m$ implies a longer allowed time span, a longer delay, and better adaptability; however, additional hardware resources are consumed. In the experiment of this paper, the value of $b_m$ and the length of the circular buffers are selected to be 30 to fit the studied problem.

The proposed method is implemented by a bottom-up strategy. Specifically, the basic modules (including “Determine Local Maxima,” “Circular Buffer A & B,” “Compute $a_1 + (a_2 - a_1)\, i_t / b_1$,” “$b_m$ clock cycles’ delay,” and “Counter”), as shown in Fig. 7, are implemented by Verilog. These basic modules can be integrated to be a larger module (e.g., upper envelope) using the function of “block diagram file” supported by Quartus II. The larger modules are also integrated using the same approach. The entire system is ultimately established by this bottom-up strategy.

Fig. 8 shows the upper envelope obtained by computing a set of data using the proposed method. On account of the delay in determining the maxima and in the computing process, plus a delay of 30 clock cycles in the circular buffer, a delay of 38 clock cycles occurs between the input data and the obtained upper envelope.

Fig. 8. (a) Original input data. (b) Upper envelope computed by the proposed method.

IV. EXPERIMENTAL SETUP

A. Software Development

The software Quartus II is used herein to develop EMD code, in which two-layered EMD is implemented to decompose a signal into IMF1, IMF2, and the residue function r, as shown in Fig. 4. Also, the S number criterion is adopted as a convergence criterion, where S is set to 3.

As described in Section I, the studied data may be embedded sequentially and consecutively into the FPGA chip. The FPGA chip outputs the results sequentially and consecutively to achieve real-time computation. Therefore, the proposed method can deal with any amount of data. The voltage in the Altera Stratix III board is within the range [−1, 1] V, and the resolution of the AD/DA converters is 14 b. In the experiment of this paper, FPGA only deals with integers. Hence, the input and output data are within the range [−8192, 8191].

After the Verilog code for implementing EMD is synthesized and configured, both simulation-based experiment and hardware-based experiment can be carried out. The simulation experiment was performed using a Quartus–MATLAB cosimulation. The input waveform, saved as a vector file (∗.vec), was generated using MATLAB software, as required by the Quartus II simulation. The waveform simulation was then studied in Quartus II. Next, the simulation result was saved as a table file (∗.tbl) and imported into MATLAB, which can plot the resulting waveform.

B. Hardware Setup

The hardware experiment is performed using an experimental platform (i.e., Terasic DE3 development board of Altera Stratix III EP3SL150F1152C2 FPGA). The testing signal x(t), which is a section of recorded ultrasound signal added by high-frequency interference, is stored in an FPGA chip and generated by a DA converter. x(t) is sampled by an AD converter and is decomposed into IMF1, IMF2, and r in Altera Stratix III EP3SL150F1152C2 FPGA. The reconstructed signal is outputted at the DA converter. The experimental waveforms are observed using an oscilloscope.

C. Characteristics in the Case Studies

Four case studies are discussed in this paper. The first three involve simulations of signal decomposition, high-frequency interference filtering, and white-noise filtering. Their purpose is to validate the proposed method. The fourth case involves the use of Altera Stratix III EP3SL150F1152C2 FPGA to verify the accuracy of the proposed method using a section of a recorded ultrasound signal to which had been added high-frequency interference.

In the first case, signals that comprise 50-, 500-, and 2500-kHz components are decomposed. The remaining three cases involve examples of ultrasonic testing (UT). UT is the fastest growing and most widely used nondestructive examination approach, which plays an important role in flaw detection/evaluation, dimensional measurements, and material characterization. However, the ultrasound probe is very sensitive to electromagnetic interference, which causes a high-frequency alternating signal to be displayed on the instruments. In the second case, high-frequency interference of the form $1000 \sin(2\pi \cdot 2\,500\,000\, t)$ is filtered. In the third case, the proposed method is used to filter out white noise by ignoring IMF1. In the fourth case, Altera FPGA (Stratix III EP3SL150F1152C2) is used to investigate the same conditions studied in the second case.

V. ANALYSIS OF SIMULATION AND EXPERIMENTAL RESULTS

The developed Verilog code for implementing EMD is synthesized and configured in Quartus II. As can be seen in Table I, the main consumed FPGA resources are logic components; in addition, the hardware resources in FPGA are adequate. The software platform Quartus II automatically utilizes the logic components to realize circular buffers by several Verilog variables. Notably, the memory blocks are not used in Quartus II.

TABLE I
RESOURCE UTILIZATION IN FPGA

The Timing Analyzer Tool in Quartus II is used to analyze the synthesized results, by which the minimum clock cycle of the program code can be obtained. The Timing Analyzer Tool in Quartus II indicates that the minimum clock cycle is 62.845 ns. The critical path that yields a minimum clock cycle of 62.845 ns results from the division operation, which is easily implemented using Verilog but requires a complex circuit configuration. The bottleneck caused by the division operation can be alleviated by rewriting the code of the division function and adding more pipeline stages to the division function. Because this paper focuses on the overall structure of EMD, the division function offered by Quartus II library is used herein. The clock cycle (80 ns herein, corresponding to a clock frequency of 12.5 MHz) used in the designed program code must be longer than 62.845 ns (corresponding to 15.91 MHz), which imposes the timing constraint.

A. Application to Signal Decomposition

This section describes the accuracy of the proposed method, which can decompose a mixed signal into its individual

components of different frequencies. Fig. 9 shows the experimental signal, which is expressed as

$$x(t) = 4000 \sin(2\pi \cdot 50\,000\, t) + 1000 \sin(2\pi \cdot 500\,000\, t) + 500 \sin(2\pi \cdot 2\,500\,000\, t).$$

Fig. 9. Original experimental signal.

The sampling frequency is 12.5 MHz in this testing. The original signal comprises three components of frequencies 50, 500, and 2500 kHz. Fig. 10 shows the simulated IMF1, IMF2, and r values obtained by the two-layer EMD, as defined in Fig. 4. According to the EMD theory, IMF1, IMF2, and r represent signals with frequencies of 2500, 500, and 50 kHz, respectively.

Fig. 10. Simulation results of EMD to the experimental signal.

As shown in Fig. 10, each IMF has a time delay, mainly owing to the process of computing envelopes. The time delay (i.e., processing time) for obtaining IMF2 is only about 20 μs. This processing time is slightly longer than that of the FPGA-based STFT (which is less than 1 μs). This difference arises from the fact that the EMD algorithm is more complicated and EMD requires many more operations than STFT does. In the whole process of EMD, the end effects exist only at the starting moment. Afterward, IMF1, IMF2, and r are all continuous and free of the end effects that are likely to occur with a moving window of a fixed number of sampled points.

Fig. 11 compares IMF1, IMF2, and r obtained by the developed Verilog code with the sine waves of 2500, 500, and 50 kHz. The correlation coefficient (COR) and root mean square error (RMSE) between each pair of signals are illustrated in Table II. This table reveals that the CORs between IMF1, IMF2, r, and their corresponding frequency components are all close to unity, implying that the proposed method is effective for signal decomposition. Also, the RMSE is markedly smaller than the amplitude of the original signal in Fig. 9.

Fig. 11. Comparison of IMF1, IMF2, and r with 2500, 500, and 50 kHz sine waves. (a) IMF1. (b) 2500-kHz sine wave. (c) IMF2. (d) 500-kHz sine wave. (e) Residue function r. (f) 50-kHz sine wave.

TABLE II
COR AND RMSE BETWEEN ANY TWO STUDIED SIGNALS

B. Signal Filtering

Fig. 12. Original ultrasonic pulse signal with high-frequency interference.

This section describes the effect of signal filtering using the proposed method. An ultrasonic signal with high-frequency interference is employed in this experiment, as shown in Fig. 12, which can be expressed as

$$x(t) = u(t) + 1000 \sin(2\pi \cdot 2\,500\,000\, t)$$

where u(t) is the original ultrasonic pulse signal, whose wave width and pulsewidth are about 2 and 10 μs, respectively [20], [21]. The function $1000 \sin(2\pi \cdot 2\,500\,000\, t)$ is the added high-frequency interference. In the experiment, the original ultrasonic pulse signal is a section of a recorded ultrasound signal, and the added high-frequency interference is generated using MATLAB. The sampling rate in this test is 12.5 MHz.

As shown in Fig. 13, IMF1 represents the high-frequency interference, while IMF2 and r represent the two components of the original ultrasound signal. Thus, IMF2 and r can be used to reconstruct the ultrasonic signal by ignoring (filtering) IMF1.

Fig. 13. Simulation results of EMD to the ultrasonic pulse signal.

Fig. 14 compares the filtered and original signals. The COR and RMSE between these two signals are 0.9984 and 20.9534, respectively. The RMSE is markedly smaller than the amplitude of the original signal in Fig. 12. Therefore, the proposed FPGA-based EMD can filter the high-frequency interference out of the ultrasonic pulse signal.

Fig. 14. Comparison of original and reconstructed signals. (a) Reconstructed signal IMF2 + r. (b) Original signal.

White noise is considered as well in this paper. Fig. 15 shows the result of reconstructed ultrasonic pulse signal using IMF2 + r only. The COR and RMSE between the original noise-free signal and the reconstructed signal (IMF2 + r) are 0.8491 and 177.8637, respectively.

Fig. 15. Comparison of original and reconstructed signals. (a) Noise-free signal. (b) Unfiltered signal. (c) Reconstructed signal IMF2 + r.

The hardware experiment is carried out in order to verify the simulation results. The original signal x(t), which is a section of a recorded ultrasound signal added by high-frequency interference, is stored in another FPGA chip and generated by a DA converter. The original signal is decomposed into IMF1, IMF2, and r in FPGA. IMF2 and r are then used to reconstruct the original signal, as shown in Fig. 16. CH1 is the signal before filtering, whereas CH2 is the filtered signal. This hardware experiment reveals that the proposed FPGA-based EMD method can be used effectively in a real-time environment.

Fig. 16. FPGA-based hardware verification of the proposed signal-filtering system. CH1: Original input signal with high-frequency interference. CH2: Signal reconstructed using the proposed EMD.

C. Comparison With Previous Works

Table III compares the proposed method and methods that have been proposed in previous works, including the hardware-accelerated method in [10] and the FPGA/DSP-based method in [12].
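As a quick check on the sampling-rate and latency figures that enter this comparison, the arithmetic behind the numbers quoted in the text can be sketched in Python. The helper names `clock_frequency_mhz` and `cycles_to_us` are hypothetical. The last line is only a rough estimate under an assumption that the paper does not state: that each of the S × 2 cascaded sifting modules leading to IMF2 adds roughly the 38-cycle envelope delay reported for $b_m = 30$.

```python
def clock_frequency_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in megahertz."""
    return 1e3 / period_ns


def cycles_to_us(cycles, period_ns):
    """Convert a clock-cycle count to microseconds."""
    return cycles * period_ns / 1e3


CLOCK_PERIOD_NS = 80.0      # clock cycle used in the design (from the text)
MIN_PERIOD_NS = 62.845      # minimum clock cycle from the Timing Analyzer
ENVELOPE_DELAY_CYCLES = 38  # envelope latency reported for bm = 30
S = 3                       # sifting iterations per IMF
N_IMFS = 2                  # two-layer EMD (IMF1 and IMF2)

f_used = clock_frequency_mhz(CLOCK_PERIOD_NS)
f_max = clock_frequency_mhz(MIN_PERIOD_NS)
envelope_delay_us = cycles_to_us(ENVELOPE_DELAY_CYCLES, CLOCK_PERIOD_NS)
# Assumption (not from the paper): every cascaded module adds ~38 cycles.
pipeline_delay_us = cycles_to_us(S * N_IMFS * ENVELOPE_DELAY_CYCLES,
                                 CLOCK_PERIOD_NS)

print(f"{f_used:.2f} MHz")             # 12.50 MHz
print(f"{f_max:.2f} MHz")              # 15.91 MHz
print(f"{envelope_delay_us:.2f} us")   # 3.04 us
print(f"{pipeline_delay_us:.2f} us")   # 18.24 us
```

Under that assumption, the estimate of about 18 μs is of the same order as the delay of about 20 μs reported for IMF2; the true latency also depends on the computation stages in each module.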

TABLE III
COMPARISON OF EXISTING METHODS AND THE PROPOSED METHOD

According to Table III, the proposed method significantly
improves upon previous methods in the following areas.

1) Data processing: although [10] developed a
   hardware-accelerated method, that method did not ultimately
   achieve real-time data processing. The 1.43 s listed in
   Table III is only an estimated time. The method in [12]
   divided the whole data set into sections of 1000 sampled
   data each; the proposed method does not need to divide
   the data.
2) Calculation time: the times required to process a given
   amount of data using the methods in [10] and [12]
   are much longer than that required using the proposed
   method.
3) Highest sampling rate: the highest sampling rate for the
   method in [12] is rather limited (360 Hz); however, the
   proposed method can support sampling rates of up to
   12.5 MHz, implying the feasibility of the proposed
   method for real-time applications.
4) End effect: the end effect in [10] occurs near the two ends
   of the data set (2048 data). The end effect in [12] occurs
   near the initial/final data of every group of 1000 sampled
   data, e.g., x(t1), x(t1000), x(t1001), x(t2000), ..., x(tk).
   Because the proposed method does not need to divide the
   whole data set into sections of 1000 data, it avoids all of
   the end effects except at the two ends of the whole data
   set (the end effect occurs only at x(t1) and x(tk)).

VI. CONCLUSION

This paper has presented a fast and real-time EMD method
using FPGA. The developed FPGA-based method is applicable
to high-frequency signals, and the end effect occurs only in the
initial/final stages. In contrast, the traditional DSP-based EMD
methods are only applicable to low-frequency signals in a
real-time environment, and the end effect always occurs. The
proposed method is characterized by the following: 1) the design
of the modules for continuous calculation of the upper and lower
envelopes and 2) the design of the serial pipeline (space-based
implementation) to replace the complicated iteration loops
(time-based implementation). The calculation of envelopes is
simplified by using ST rather than the cubic spline. Moreover,
the proposed method is validated first by Quartus II simulation
and then by FPGA hardware.

Future work can improve the proposed method by using the
cubic spline to calculate the upper/lower envelopes, thereby
increasing accuracy. To achieve this, more circular buffers
are required, and the modules for calculating the upper/lower
envelopes need to be redesigned.

REFERENCES

[1] N. E. Huang, Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, N. C. Yen, C. C. Tung, and H. H. Liu, “The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis,” Proc. R. Soc. Lond. A, Math. Phys. Sci., vol. 454, no. 1971, pp. 903–995, Mar. 1998.
[2] J. A. Antonino-Daviu, M. Riera-Guasp, M. Pineda-Sanchez, and R. B. Perez, “A critical comparison between DWT and Hilbert–Huang-based methods for the diagnosis of rotor bar failures in induction machines,” IEEE Trans. Ind. Appl., vol. 45, no. 5, pp. 1794–1803, Sep./Oct. 2009.
[3] R. Yan and R. X. Gao, “Hilbert–Huang transform-based vibration signal analysis for machine health monitoring,” IEEE Trans. Instrum. Meas., vol. 55, no. 6, pp. 2320–2329, Dec. 2006.
[4] A. J. Nimunkar and W. J. Tompkins, “EMD-based 60-Hz noise filtering of the ECG,” in Proc. IEEE 29th Annu. Int. Conf. Eng. Med. Biol. Soc., Aug. 2007, pp. 1904–1907.
[5] Z. Zhao and Y. Wang, “Analysis of diastolic murmurs for coronary artery disease-based on Hilbert Huang transform,” in Proc. Int. Conf. Mach. Learn. Cybern., Aug. 2007, vol. 6, pp. 3337–3342.
[6] N. Senroy, S. Suryanarayanan, and P. F. Ribeiro, “An improved Hilbert–Huang method for analysis of time-varying waveforms in power quality,” IEEE Trans. Power Syst., vol. 22, no. 4, pp. 1843–1850, Nov. 2007.
[7] D. Yang, Y. Li, R. Christian, and R. Xiu, “Analysis of low frequency oscillations in power system based on HHT technique,” in Proc. 9th Int. Conf. Environ. Elect. Eng., May 2010, pp. 289–292.
[8] Y. Zhang, Y. Gao, L. Wang, J. Chen, and X. Shi, “The removal of wall components in Doppler ultrasound signals by using the empirical mode decomposition algorithm,” IEEE Trans. Biomed. Eng., vol. 54, no. 9, pp. 1631–1642, Sep. 2007.
[9] A. Liao, C. Shen, and P. Li, “Potential contrast improvement in ultrasound pulse inversion imaging using EMD and EEMD,” IEEE Trans. Ultrason., Ferroelect., Freq. Control, vol. 57, no. 2, pp. 317–326, Feb. 2010.
[10] L. Wang, M. I. Vai, P. U. Mak, and C. I. Ieon, “Hardware-accelerated implementation of EMD,” in Proc. 3rd Int. Conf. Biomed. Eng. Inf., Oct. 2010, vol. 2, pp. 912–915.
[11] J. D. Jonesa, J. S. Peib, P. J. Wright, and M. P. Tull, “Embedded EMD algorithm within an FPGA-based design to classify nonlinear SDFO systems,” in Proc. SPIE, Mar. 2010, vol. 7647, pp. 76470E-1–76470E-6.
[12] M. Lee, K. Shyu, P. Lee, C. Huang, and Y. Chiu, “Hardware implementation of EMD using DSP and FPGA for on-line signal processing,” IEEE Trans. Ind. Electron., vol. 58, no. 6, pp. 2473–2481, Jun. 2011.
[13] R. Jevtic and C. Carreras, “Power measurement methodology for FPGA devices,” IEEE Trans. Instrum. Meas., vol. 60, no. 1, pp. 237–247, Jan. 2011.
[14] L. M. C. Medina, R. de Jesus Romero-Troncoso, E. Cabal-Yepez, J. de Jesus Rangel-Magdaleno, and J. R. Millan-Almaraz, “FPGA-based multiple-channel vibration analyzer for industrial applications in induction motor failure detection,” IEEE Trans. Instrum. Meas., vol. 59, no. 1, pp. 63–72, Jan. 2010.
[15] A. D. Femine, D. Gallo, C. Landi, and M. Luiso, “Power-quality monitoring instrument with FPGA transducer compensation,” IEEE Trans. Instrum. Meas., vol. 58, no. 9, pp. 3149–3158, Sep. 2009.
[16] M. A. Daigneault and J. P. David, “A high-resolution time-to-digital converter on FPGA using dynamic reconfiguration,” IEEE Trans. Instrum. Meas., vol. 60, no. 6, pp. 2070–2079, Jun. 2011.
[17] L. Y. Lu, “Fast intrinsic mode decomposition of time series data with sawtooth transform,” ORACLE, Redwood Shores, CA, pp. 1–13, Nov. 2007, Tech. Rep.
[18] N. E. Huang, M. C. Wu, S. R. Long, S. S. P. Shen, W. Qu, P. Gloersen, and K. L. Fan, “A confidence limit for the empirical mode decomposition and Hilbert spectrum analysis,” Proc. R. Soc. Lond. A, Math. Phys. Sci., vol. 459, no. 2037, pp. 2317–2345, Sep. 2003.
[19] B. Xuan, Q. Xie, and S. Peng, “EMD sifting based on bandwidth,” IEEE Signal Process. Lett., vol. 14, no. 8, pp. 537–540, Aug. 2007.
[20] A. O. Boudraa, J. C. Cexus, and Z. Saidi, “EMD based signal noise reduction,” Int. J. Signal Process., vol. 1, no. 1, pp. 33–37, 2004.
[21] A. O. Boudraa and J. C. Cexus, “EMD-based signal filtering,” IEEE Trans. Instrum. Meas., vol. 55, no. 6, pp. 2196–2202, Dec. 2007.
Ying-Yi Hong (SM’00) received the B.S. degree in electrical
engineering from Chung Yuan Christian University, Chung Li,
Taiwan, in 1984, the M.S. degree in electrical engineering from
National Cheng Kung University, Tainan, Taiwan, in 1986, and the
Ph.D. degree from the Institute of Electrical Engineering,
National Tsing-Hua University, Hsinchu, Taiwan, in 1990.
Sponsored by the Ministry of Education of China, he conducted
research in the Department of Electrical Engineering, University
of Washington, Seattle, from August 1989 to August 1990. From
February 1991 to July 1995, he served as an Associate Professor
with the Department of Electrical Engineering, Chung Yuan
Christian University, where he was promoted to the rank of Full
Professor in August 1996. His areas of interest are power system
analysis, field-programmable gate array design, and AI
applications.
Prof. Hong was the Chair of the IEEE PES Taipei Chapter in 2001.

Yu-Qing Bao was born in Zhenjiang, China, in 1987. He received
the B.S. degree in electrical engineering from Jiangsu University
of Science and Technology, Zhenjiang, in 2009 and the M.S. degree
in electrical engineering from Southeast University, Nanjing,
China, in 2012. He is currently working toward the Ph.D. degree
in the School of Electrical Engineering, Southeast University.
From 2010 to 2011, he was an exchange student with the Department
of Electrical Engineering, Chung Yuan Christian University,
Chung Li, Taiwan. His current research interests include signal
processing, power demand side management, and fault detection in
power systems.