Example of Elsevier Article Template With Dummy Text Copy
Abstract
Digital image processing is used in many applications that are essential in
daily life. Image processing algorithms help to clarify an image and improve
the ability to recognize its distinct characteristics. The median filter is a
widespread image pre-processing technique; it effectively eliminates salt and
pepper noise from the image. The main target of this paper is to investigate
efficient median filter units that connect to a general-purpose processor
(GPP) in FPGA-based embedded systems. The paper presents two novel median
filtering techniques. These techniques focus on filtering a 3x3 image window,
in which the median unit of the filter produces the required result with high
accuracy, in a short time, and with a reduced number of comparisons. They are
inspired by the median of median (MOM) algorithm. In addition, the paper
illustrates how the concept behind the proposed techniques can be used to
perform maximum pooling for convolutional neural networks. Furthermore, the
two proposed filtering units are implemented using parallel processing,
pipelining, and the flexible resources of the FPGA to satisfy real-time
processing constraints. A comparison between the two proposed techniques and
their counterparts in the literature is included. The comparison shows that
the first technique is superior in accuracy while using fewer comparators
than previously published techniques.
Keywords:
1. Introduction
Digital images play a significant role in daily-life applications and in areas
of research and technology such as magnetic resonance imaging, satellite
television, geographical information systems, and astronomy Raju et al. [1].
They also play a vital role in some medical applications, such as diabetic
retinopathy Priyanka [2]. Data sets gathered by image sensors, or by the
circuitry of a scanner or digital camera, are generally contaminated by noise.
Noise can originate in the image sensor, in the film grain, or in the
unavoidable shot noise of an ideal photon detector. In addition to problems
with the data acquisition process and imperfect instruments, transmission
errors, interfering natural phenomena, and compression can also be sources of
image noise [3]. This noise is a fundamental signal distortion that degrades
information extraction and image observation. Eliminating image noise
(denoising) is therefore a necessary basis for image processing and analysis.
Image denoising has been studied extensively in computer science, and many
filtering algorithms exist. Denoising algorithms include transform-domain
thresholding, statistical models, random fields, dictionary learning methods,
anisotropic diffusion methods, spatial domain filtering, and hybrid methods
[4].
Spatial domain techniques exploit the similarities between pixels or patches
of an image. Spatial domain methods comprise local and non-local filters. A
non-local filter makes use of the correlation between pixels across the
complete image. Local filters are limited by spatial distance. The noise
reduction schemes in local filtering are classified into linear and nonlinear
filtering techniques. Linear filters convolve the image matrix with a filter
mask to produce a linear combination of neighborhood values. This kind of
spatial filtering is the most accessible and most primitive form of noise
removal, but it frequently causes an undesirable amount of edge smoothing,
loss of detail, and poor feature localization [5, 6, 7]. Consequently, a
variety of nonlinear filters have been introduced to address these
limitations [8, 9].
The median filter is a nonlinear filter based on order-statistics theory
David and Nagaraja [10] and is used for impulse noise removal. Impulse noise
in grayscale images is also known as salt-and-pepper noise. The filter
replaces each pixel with the median of the surrounding pixels. The median
value is necessarily one of the neighborhood pixel values. As a result,
isolated noise points are removed without generating new, unrealistic pixel
values. This technique is therefore preferred for eliminating random additive
noise and salt-and-pepper noise. It also preserves the high-frequency content
of the image to a great extent, and thus preserves the visual quality of the
image. Consequently, median filters offer good noise reduction with
significantly less blurring than linear smoothing filters of equal size. For
these reasons, median filtering has become a widespread algorithm in image
restoration in recent years [1]. Nevertheless, the method involves many scan,
compare, and move operations, which makes it relatively slow [1]. Standard
median filtering cannot meet real-time requirements. Additionally, software
implementations of sorting procedures are time-consuming. Thus, optimized
sorting techniques should be used for fast median filtering, and it is
critical to implement the filter in a high-speed environment. Therefore, two
filtering techniques are proposed that do not sort all the elements but
instead make a limited number of comparisons to exclude the non-median
elements. This avoids unnecessary sorting and comparisons in order to satisfy
the real-time processing constraints. The work focuses on 3x3 (nine-element
window) median filters, given that larger filters usually remove small edges
[11]. Furthermore, hardware acceleration on an FPGA is used in this work to
improve performance, since FPGAs usually offer a substantial speed
improvement over software-based solutions through adaptation to the
application and parallelism. Hence, the two proposed filtering units benefit
from parallel processing, pipelining, and the flexible resources of the FPGA.
The rest of the paper is organized as follows: Section 2 describes the
background of median filter techniques, including the sorting-based and the
selection-based techniques. The two proposed algorithms are described in
detail in Sections 3, 4, and 5. The hardware implementation and the
simulation results are presented in Sections 6 and 7, respectively. Finally,
the paper is concluded in Section 8.
2. Background of median filter algorithms
The median is the value in the middle position of a sorted sequence. Assume
that the window pixel values are put into an array named x, such that its
elements are $x_1, x_2, x_3, x_4, \ldots, x_n$, and that it becomes
$x_{i_1} \le x_{i_2} \le x_{i_3} \le x_{i_4} \le \ldots \le x_{i_n}$ after
sorting it in ascending order; then, for odd n, its median value is
$x_{i_{(n+1)/2}}$. There are many techniques for finding the median of an
array; some of them sort the array elements first and then select the element
in the middle position, and the others work without sorting. The
sorting-based techniques are more popular than the techniques in the other
category. Therefore, the literature includes many implementations of the
median filter using different sorting algorithms, such as Bubble, Insertion,
Selection, Quick, Heap, Shell, Merge, and Tim sort [12]. In these
implementations, the effectiveness of the median filter depends mostly on the
speed and efficiency of the adopted sorting technique. However, even with
highly efficient sorting techniques, hardware and computation time are wasted
because sorting all the elements is not actually required. Additionally, some
sorting techniques, e.g., merge sort and odd-even merge sort, restrict the
array length to a power of 2. This means padding the window array with zeros
(or some other values) and, after applying the sort algorithm, removing the
padding again [13]. Hence, in the case of 3x3 windows, seven elements must be
appended to satisfy this condition, which adds undesired overhead in memory,
hardware, and processing time. These techniques are therefore not a suitable
basis for a median filter, and there is a need to obtain the median without
sorting the entire array. Other techniques built on different ideas have
arisen that do not fully sort the array. According to their underlying idea,
these techniques are divided into the quick selection algorithm and the
median of median algorithm.
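The sorting-based baseline described above can be sketched in a few lines of
Python. This is a minimal software illustration, not the hardware design of
this paper; the function names, the choice to copy border pixels unchanged,
and the small test image are assumptions made for the example.

```python
def median9_by_sorting(window):
    """Return the median of a 9-element window by fully sorting it."""
    if len(window) != 9:
        raise ValueError("expected a 3x3 window flattened to 9 values")
    return sorted(window)[4]  # 5th smallest value (0-based index 4)


def median_filter_3x3(image):
    """Apply a 3x3 median filter; border pixels are copied unchanged (assumption)."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median9_by_sorting(window)
    return out


def padded_length(n):
    """Smallest power of two >= n, as required by power-of-two sorters."""
    length = 1
    while length < n:
        length *= 2
    return length


noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],   # 255 is an isolated "salt" pixel
    [10, 10, 0, 10],     # 0 is an isolated "pepper" pixel
    [10, 10, 10, 10],
]
clean = median_filter_3x3(noisy)
extra = padded_length(9) - 9   # elements appended for a power-of-two sorter
```

Here both impulse pixels are replaced by the value of their neighbors, and
`extra` evaluates to the seven appended elements mentioned in the text. The
full sort does more work than selecting one order statistic requires, which
is the overhead that selection-based techniques avoid.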
weak worst-case performance. Quickselect and its variants are the most
frequently used selection algorithms in practical real-world implementations.
Quickselect uses the same overall approach as quicksort: one element is
chosen as a pivot, and the data are then separated into two groups relative
to the pivot, those less than the pivot and those greater than it. By
comparing the pivot with all other elements, the rank r of the pivot is
found. The pivot is then placed at the rth index of the array. All other
elements equal to or smaller than the pivot are moved to positions before the
pivot, and every larger element is placed after the pivot [1]. However,
instead of recursing into both sides, as quicksort does, quickselect recurses
only into the one side that contains the element of interest. Like quicksort,
quickselect is generally implemented as an in-place algorithm, and while
selecting the kth element it also partially orders the data. This reduces the
average complexity from $O(n \log n)$ to $O(n)$. Nevertheless, the worst case
occurs when the largest or the smallest element is repeatedly chosen as the
pivot. Each step then eliminates just one element from the list, and the
complexity becomes $O(n^2)$ rather than $O(n)$. This algorithm is therefore a
practical randomized linear-time algorithm for median finding. However, a
randomized algorithm alone is not always sufficient, since a guaranteed
worst-case bound can matter more than average speed. For this reason, the
median of medians algorithm was developed, which runs in linear time in the
worst case.
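The partition-and-recurse-on-one-side idea can be illustrated with a short
Python sketch. It is a generic textbook quickselect with a random pivot and a
Lomuto-style partition, not code from the cited works; the function name, the
0-based rank `k`, and the decision to work on a copy of the input are
assumptions of this example.

```python
import random


def quickselect(values, k, rng=random):
    """Return the k-th smallest element (0-based) of values."""
    if not 0 <= k < len(values):
        raise IndexError("k out of range")
    data = list(values)          # work on a copy; the caller's list is untouched
    lo, hi = 0, len(data) - 1
    while True:
        if lo == hi:
            return data[lo]
        # Random pivot: makes the quadratic worst case unlikely, not impossible.
        p = rng.randint(lo, hi)
        data[p], data[hi] = data[hi], data[p]
        pivot = data[hi]
        store = lo
        for i in range(lo, hi):
            if data[i] < pivot:
                data[i], data[store] = data[store], data[i]
                store += 1
        data[store], data[hi] = data[hi], data[store]   # pivot now at its rank
        if k == store:
            return data[store]
        if k < store:
            hi = store - 1       # continue only in the left part
        else:
            lo = store + 1       # continue only in the right part


window = [12, 200, 7, 45, 45, 0, 255, 31, 90]
median = quickselect(window, len(window) // 2)
```

For a nine-element window the median is the element of rank 4 (0-based).
Only the side that contains rank `k` is processed further, which is where the
average linear cost comes from.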
tions of this algorithm have been developed to overcome this problem, such as
median 3, median 4, and median 5. Afterward, an improvement of MOM known as
REPEATEDSTEP was introduced. However, the number of comparisons for a
nine-element list, which can be calculated using (1) [16], is still large.
Furthermore, an FPGA is used as a hardware accelerator to implement the
proposed techniques and achieve a higher processing speed, since FPGAs
provide inherent parallelism. The proposed algorithms are described in detail
in the following sections.
Figure 1: The comparisons in each clock cycle, with swapping steps, for the ALG1 algorithm
Assume that the pixel array, P, contains the values of all the pixels in the
selected 3x3 window. The pixels are represented by 8-bit numbers and are
located at indices 0, 1, 2, 3, 4, 5, 6, 7, 8. The proposed algorithm, which
aims to extract the median pixel, relies on rearranging the array so that the
numbers in the last four locations are greater than the numbers in the first
five locations. In the first step, the numbers at locations 0, 1, 2, 3 are
compared with their counterparts at locations 5, 6, 7, 8, respectively. Swaps
are performed, if needed, to guarantee that (P[0]<P[5]), (P[1]<P[6]),
(P[2]<P[7]), and (P[3]<P[8]). Secondly, the same concept is applied to the
trailing half of the array by performing the comparisons (P[5]<P[7]) and
(P[6]<P[8]) and swapping if these conditions are not initially satisfied.
Furthermore, if any of these pairs is swapped, another swap must be done to
keep the established relations valid. That is, if P[6] and P[8] are swapped,
then P[0] and P[2] should be swapped regardless of their order. Similarly,
P[1] and P[3] are swapped if P[5] and P[7] are swapped. Thirdly, P[7] and
P[8] are compared, after which P[8] is certainly greater than the numbers at
locations 0, 1, 2, 3, 5, 6, 7. P[8] is therefore excluded from the subsequent
comparisons. However, if the condition between P[7] and P[8] is met in this
step, some swaps should be made (0 with 1, 2 with 3, and 5 with 6). These
swaps are necessary to keep the established relations between the elements
valid whether the condition between P[7] and P[8] is fulfilled or not. In the
fourth step, two comparisons are made. One is between the elements at indices
6 and 7; it removes the element at index 7 from further comparisons once it
is ensured that this element is greater than those at indices 0, 1, 2, 3, 5,
6. The other is needed to bring index 4 into the sequence of comparisons, and
it is between P[3] and P[4]. The following steps make P[4] larger than the
elements on its left side and smaller than the elements on its right side.
The sequence of comparisons for this technique is illustrated in Fig. 1.
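The role of the companion swaps can be made concrete with a small Python
sketch of the first two steps only. It is not the complete ALG1 and does not
return the median. The helper names are hypothetical, and the pairing of the
companion swaps (pair 5, 7 with pair 0, 2; pair 6, 8 with pair 1, 3) is an
assumption chosen so that the step-one relations P[i] <= P[i+5] stay valid;
it should be checked against Fig. 1.

```python
def compare_swap(p, i, j):
    """Ensure p[i] <= p[j]; return True if a swap was needed."""
    if p[i] > p[j]:
        p[i], p[j] = p[j], p[i]
        return True
    return False


def first_two_steps(window):
    """Sketch of steps one and two only; it does not return the median."""
    p = list(window)
    # Step one: four independent comparisons (one clock cycle in hardware).
    for i in range(4):
        compare_swap(p, i, i + 5)
    # Step two: order the tail; a companion swap keeps p[i] <= p[i + 5] valid.
    # ASSUMPTION: pair (5, 7) goes with (0, 2) and pair (6, 8) with (1, 3).
    if compare_swap(p, 5, 7):
        p[0], p[2] = p[2], p[0]
    if compare_swap(p, 6, 8):
        p[1], p[3] = p[3], p[1]
    return p


state = first_two_steps([9, 8, 7, 6, 5, 4, 3, 2, 1])
```

After these two steps, each of the first four elements is no larger than its
counterpart five positions later, and the tail satisfies P[5] <= P[7] and
P[6] <= P[8]. The companion swaps need no comparison of their own, which is
why they do not add to the comparison count.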
This algorithm uses seventeen comparisons, executed in ten clock cycles, and
it finds the median of a nine-element array with 100% accuracy. Some
modifications can be applied to this technique to reduce the required number
of clock cycles. However, all of these modified versions also reduce the
accuracy. The hardware architecture of this technique is shown in Fig. 2.
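An exact median-of-nine unit can be checked against a reference built from
compare-and-swap operations alone. The sketch below is not ALG1: it is a
well-known exchange network for nine inputs that uses nineteen
compare-and-swap operations, shown only as a software baseline for comparison
counts. The names `MEDIAN9_NETWORK` and `median9_network` are illustrative.

```python
# Nineteen compare-and-swap pairs; after applying them the median is at index 4.
MEDIAN9_NETWORK = [
    (1, 2), (4, 5), (7, 8),   # rows of the 3x3 window are sorted
    (0, 1), (3, 4), (6, 7),   # by three exchanges each
    (1, 2), (4, 5), (7, 8),
    (0, 3), (5, 8), (4, 7),   # column-wise: largest of the row minima to 6,
    (3, 6), (1, 4), (2, 5),   # smallest of the row maxima to 2,
    (4, 7), (4, 2), (6, 4),   # median of the row medians to 4,
    (4, 2),                   # then the median of positions 2, 4, 6 ends at 4
]


def median9_network(window):
    """Return the exact median of nine values using a fixed exchange network."""
    p = list(window)
    for i, j in MEDIAN9_NETWORK:
        if p[i] > p[j]:
            p[i], p[j] = p[j], p[i]
    return p[4]


result = median9_network([12, 200, 7, 45, 45, 0, 255, 31, 90])
```

Because the sequence of exchanges is fixed and independent of the data, a
network of this kind maps directly onto comparators and registers, and
exchanges that touch disjoint indices can share a clock cycle.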
The results of applying the ALG1 algorithm to two noisy images are shown in
Fig. 3. The salt-and-pepper noise is largely eliminated.
Figure 2: The hardware implementation of the ALG1 algorithm
Figure 3: a: The original images, b: The images after adding salt and pepper noise, c:
The noisy images after applying the proposed technique
and 6. Here P[5] takes the place of P[4], P[6] the place of P[5], and finally
P[4] the place of P[6]. Afterward, only six elements remain for the remaining
comparisons, and the comparisons continue until it is guaranteed that P[4] is
larger than all elements except P[5]. In step five, the elements at indices 3
and 5 are compared. Next, in step six, the elements at indices 1 and 3 are
compared. If the condition is true, indices 1 and 3 are swapped; otherwise,
indices 2 and 3 are swapped. In the seventh step, two comparisons are made:
one between indices 2 and 4, and the other between indices 3 and 5. The last
step compares indices 3 and 4; if the condition is satisfied, the median is
at index 4; otherwise, another comparison is executed between indices 4 and
5. As a result, the median is at index 4. An error can occur when the first
assumption does not hold.
The result of applying this technique to two different graphs is presented in
Fig. 5. The figure includes the original, noisy, and denoised plots. This
technique finds the median after nine clock cycles using only fifteen
comparisons, with an accuracy of 87%. When the technique is applied to noisy
grayscale images, the accuracy improves to 96% because of the convergent
scaling of the grayscale pixels.
Figure 4: The hardware implementation of the second technique
Figure 5: a: Original, b: noisy, c: denoised, for the second technique
support the pipelining feature. The suggested architecture for the median
filter uses a sequence of pipeline stages to reduce the computational time.
Hence, the hardware architecture of any of the algorithms is used as a
building block, identified as "median 9 block", in the median filter
architecture.
Figure 6: The hardware implementation of Max technique
A detailed comparison between the first proposed design, the second one, and
the design published in [1] is given in Table 1. From the simulation results,
it can be deduced that the proposed filter, using the first algorithm,
requires ten clock pulses at a maximum operating frequency of 398.2 MHz to
find the median value with 100% accuracy. This improves on the processing
time of the filter proposed in [1]: although that design needs only three
clock pulses, its maximum operating frequency is 100 MHz. Moreover, the
accuracy is enhanced by 7%, and the execution time is reduced by more than
16%. In addition, compared with the designs investigated in [20, 19, 21, 22,
2], the first proposed technique reduces the hardware of each median block by
more than 10%, because it needs only seventeen comparators instead of
nineteen.
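The two percentages quoted above can be checked with a short calculation,
assuming that each design runs at its maximum operating frequency and that
the latency of one median result is the number of clock pulses divided by
that frequency. The symbols $t_{\text{first}}$ and $t_{[1]}$ are introduced
here only for this check.

```latex
\begin{align*}
t_{\text{first}} &= \frac{10}{398.2\ \text{MHz}} \approx 25.1\ \text{ns}, &
t_{[1]} &= \frac{3}{100\ \text{MHz}} = 30\ \text{ns}, \\
\frac{t_{[1]} - t_{\text{first}}}{t_{[1]}} &\approx \frac{30 - 25.1}{30} \approx 16.3\%, &
\frac{19 - 17}{19} &\approx 10.5\%.
\end{align*}
```

Both values agree with the stated reductions of more than 16% in execution
time and more than 10% in comparators.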
7. Conclusion
The flexibility, high performance, and low cost of FPGA hardware
implementation are used to develop high-speed, low-area, and high-accuracy
median filters. The hardware results show that the proposed algorithms give
improved output results compared with standard median techniques as well as
the adaptive median algorithm. Hence, they have good application prospects in
real-time image processing and can meet the requirements of different
applications.
References
[1] K. S. Raju, P. Phukan, G. Baurah, An FPGA implementation of a fast
2-dimensional median filter, in: National Conference on Recent Advances in
Communication, Control and Computing Technology, RACCCT 2012, 2012,
pp. 144–147.
[2] C. Priyanka, Median filter algorithm implementation on FPGA for
restoration of retina images, International Journal of Innovative Science,
Engineering and Technology 3 (2016).
[4] H.-Y. Yang, X.-Y. Wang, P.-P. Niu, Y.-C. Liu, Image denoising using
nonsubsampled shearlet transform and twin support vector machines, Neural
networks 57 (2014) 152–165.
[5] M. Lebrun, M. Colom, A. Buades, J.-M. Morel, Secrets of image denoising
cuisine, Acta Numerica 21 (2012) 475.
[6] L. Shao, R. Yan, X. Li, Y. Liu, From heuristic optimization to dictionary
learning: A review and comprehensive comparison of image denoising
algorithms, IEEE transactions on cybernetics 44 (2013) 1001–1013.
[7] R. Gonzalez, R. Woods, Image processing. digit, Image Process 2 (2007).
[11] A. Rauh, G. R. Arce, A fast weighted median algorithm based on
quickselect, in: 2010 IEEE International Conference on Image Processing,
2010, pp. 105–108. doi:10.1109/ICIP.2010.5651855.
[12] Y. Ben Jmaa, R. Ben Atitallah, D. Duvivier, M. Ben Jemaa, A comparative
study of sorting algorithms with FPGA acceleration by high level synthesis,
Computación y Sistemas 23 (2019) 213.
[13] R. Sedgewick, Algorithms in Java, Parts 1-4, Addison-Wesley
Professional, 2002.
[14] H.-K. Hwang, T.-H. Tsai, Quickselect and the Dickman function,
Combinatorics Probability and Computing 11 (2002) 353–371.
[15] M. Blum, R. W. Floyd, V. Pratt, R. L. Rivest, R. E. Tarjan, Time bounds
for selection, J. Comput. Syst. Sci. 7 (1973) 448–461. URL:
https://doi.org/10.1016/S0022-0000(73)80033-9.
doi:10.1016/S0022-0000(73)80033-9.
[16] A. Alexandrescu, Fast deterministic selection, arXiv preprint
arXiv:1606.00484 (2016).
[17] S. Sadangi, P. Priyanka, FPGA implementation of parallel sorting
mechanism for turbo decoding in LTE system, in: 2018 Second International
Conference on Inventive Communication and Computational Technologies
(ICICCT), 2018, pp. 359–362. doi:10.1109/ICICCT.2018.8473025.
[18] B. Graham, Fractional max-pooling, arXiv preprint arXiv:1412.6071
(2014).
[19] A. H. Rasheed, FPGA-based optimized systolic design for median filtering
algorithms, International Journal of Applied Engineering Research 12 (2017)
16100–16113.
[20] M. A. Vega-Rodríguez, J. M. Sánchez-Pérez, J. A. Gómez-Pulido, An
FPGA-based implementation for median filter meeting the real-time
requirements of automated visual inspection systems, Proc. 10th Mediterranean
Conf. Control and Automation (2002).
[21] A. Eric, FPGA implementation of median filter using an improved
algorithm for image processing, International Journal for Innovative Research
in Science & Technology 1 (2015) 25–30.
[22] S. Maurya, I. Gupta, FPGA based hardware implementation of median
filtering and morphological image processing algorithm, International journal
of engineering research and technology 3 (2014).