Review on CNN Based Image Denoising

Dinsha Babu and Sajeev K. Jose


Department of Electronics and Communication Engineering
Government College of Engineering Kannur

Abstract—In recent years, convolutional neural networks (CNNs) have attracted considerable attention in the area of denoising. Many approaches exist to remove noise from a corrupted image and reconstruct a clean image of high visual quality. Some non-CNN approaches are efficient at removing noise, but they exhibit issues such as complex optimization and high computational expense. To overcome these drawbacks, CNN-based methodologies have been developed that improve denoising performance; these strategies are attracting considerable attention for their effectiveness in image restoration, and they employ techniques such as residual learning to further enhance denoising. This paper therefore provides a review of several CNN-based image denoising approaches.

Index Terms—Denoising, CNN, Residual learning

I. INTRODUCTION

Image denoising is a well-known classical technique for image restoration and an active topic in low-level vision. It has wide applications in many fields, such as entertainment and pathological analysis. Over recent years, many algorithms have been devised for image denoising. The fundamental objective of denoising is to recover a clear image from a noise-corrupted observation. The degradation model used in image denoising is a = b + m, where a is the noisy image, b is the noiseless image and m is additive Gaussian noise with standard deviation σ. Noise corruption always degrades the visual quality of the image, and noise removal from a corrupted image is a crucial step in computer vision and image processing tasks. Additive white Gaussian noise (AWGN) is a typical noise that commonly affects images during processing. In early denoising techniques, researchers removed AWGN by applying filters such as the bilateral and Gaussian filters, or by modifying coefficients in transform domains such as the wavelet and Fourier transforms. However, some of these conventional techniques rely on restricted and imperfect image priors, so they cannot remove noise efficiently. As Bayesian theory states, estimating the prior is essential for noise removal. For instance, a wavelet transform with a Markov random field prior can be used to remove noise from images, and combining self-similarity with sparse representation can reduce the storage required for denoising and improve performance.

Over the last few years, various techniques that model specific image priors have been used for removing noise from corrupted images. These include Markov random field (MRF) models, gradient models, sparse models and nonlocal self-similarity (NSS) models; among them, NSS-based strategies are the most popular. NSS exploits the observation that certain patterns recur throughout an image, so patches with similar patterns can be found far apart from one another. Exploiting the NSS prior has improved denoising performance substantially, and many recent denoising algorithms can be classified as NSS-based. NSS models such as nonlocally centralized sparse representation (NCSR) [8], weighted nuclear norm minimization (WNNM) [9] and block-matching and 3D filtering (BM3D) [7] are very successful for image denoising. However, these conventional methods have some limitations: their parameters must be set manually to obtain optimal results, and they rely on complex optimization algorithms to improve denoising quality, which increases the computational expense.

To overcome the drawbacks of prior-based denoising techniques, several discriminative learning methods have recently been developed to learn image prior models. The plain multilayer perceptron (MLP) proposed by Burger et al. [20] was the first discriminative denoising method to provide performance comparable to BM3D. The cascade of shrinkage fields (CSF) model combines a random-field model with quadratic optimization. The trainable nonlinear reaction diffusion (TNRD) model [11] is another well-known learning-based technique for removing noise from images. However, these strategies are restricted to a particular prior model.

Recently, neural network (NN) based denoising methods have attracted considerable attention for their efficient performance in image restoration. The network is first trained; it then accepts noisy patches as input and estimates the corresponding noiseless patches. Each network contains a set of nonlinear activations and convolution operations, and it learns the hidden image prior from the training set for image recovery. Deep learning approaches have strong learning capacity and flexible network designs, which improve denoising efficiency. The CNN is a deep learning approach that has attracted particular attention for denoising noisy images [12]. The rectified linear unit (ReLU) [13], residual learning [15] and batch normalization (BN) [14] are used in CNNs to accelerate network training and improve denoising efficiency.

Dinsha Babu is an M.Tech student in the Department of Electronics and Communication Engineering, Government College of Engineering Kannur, Kerala, India (e-mail: dinshababudinu@gmail.com).

Sajeev K. Jose is a faculty member of the Department of Electronics and Communication Engineering, Government College of Engineering Kannur, Kerala, India (e-mail: sajeev@gcek.ac.in).
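As a concrete illustration of the degradation model a = b + m described in this introduction, the short NumPy sketch below (our own illustration, not code from any reviewed paper; the name `degrade` is an assumption) synthesizes a noisy observation from a clean image:

```python
import numpy as np

rng = np.random.default_rng(0)

def degrade(b, sigma):
    # Degradation model a = b + m: add zero-mean Gaussian noise m
    # with standard deviation sigma to the clean image b.
    m = rng.normal(0.0, sigma, size=b.shape)
    return b + m

clean = np.full((64, 64), 128.0)   # flat "clean" test patch
noisy = degrade(clean, sigma=25.0)
residual = noisy - clean           # recovers the injected noise m
```

The denoisers discussed below are trained on exactly such (noisy, clean) pairs, and the empirical standard deviation of `residual` is close to the σ used to generate it.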

178
Electronic copy available at: https://ssrn.com/abstract=3791105
Government College of Engineering Kannur (GCEK)
When sample data is given to the network, its distribution changes after it passes through a convolution layer; this is termed the internal covariate shift problem. The issue can be reduced by batch normalization, which first normalizes the sample data and then applies scale and shift operations to restore the distribution of the training data. An activation function is placed after the BN of each layer, and ReLU is the activation function commonly used in CNNs. In learning-based denoising, the efficiency of noise removal increases with network depth; however, increasing the depth may cause vanishing gradient problems. Applying residual learning to the CNN is a good way to remove this issue: it adds the outputs of different feature layers to the input image and passes the result as input to the next layer to preserve performance.

CNNs are well suited to parallel computation on modern powerful GPUs, which can be exploited to improve run-time performance. Combining the CNN with the characteristics of the denoising task helps to eliminate unknown noise, and combining the CNN with the properties of natural images is very effective for obtaining a noiseless image. The strong modelling capacity of CNNs is one of their main advantages as denoisers. The first CNN-based denoiser, consisting of five layers, was introduced by Jain and Seung [18]; however, because of its shallow architecture, it could not beat the well-known BM3D denoising approach [7].

II. LITERATURE REVIEW

A. DnCNN based Denoising

Current discriminative learning approaches such as TNRD [11] and CSF [10] are based on an analysis model, so they cannot capture all image features perfectly, and they use handcrafted parameters; moreover, their networks are trained at a particular noise level. The denoising convolutional neural network (DnCNN) [1] uses BN and residual learning [14], [15] to accelerate network training and improve the efficiency of noise removal, and it performs noise reduction at all noise levels.

Rather than directly recovering the clean image, DnCNN estimates the residual image w, i.e., the difference between the noisy and clean images. The use of BN in DnCNN helps to enhance and stabilize the training of the network, and the integration of BN with residual learning improves the overall denoising performance. A DnCNN trained at a particular noise level recovers images corrupted by Gaussian noise better than traditional noise reduction methods such as WNNM, TNRD and BM3D.

Completing a particular task with a CNN model involves two steps: (i) designing the corresponding network architecture and (ii) learning the model from training data. Designing the architecture involves fixing the network depth, which depends on the patch sizes used in traditional denoising methods.

Fig. 1. Block diagram of DnCNN [1].

For network learning, DnCNN combines BN with residual learning. There are three types of layers in this approach. The first layer contains convolution and ReLU, where the rectified linear unit (ReLU) [13] is the activation function that provides nonlinearity. The middle layers contain convolution, ReLU and BN, where BN accelerates network training and enhances performance. The last layer is a convolution layer. By combining ReLU with convolution, DnCNN can separate the clean image from the noisy image through its hidden layers. Zero padding is applied to the input so that the size of every feature map in the inner layers remains unchanged; zero padding also avoids boundary artifacts.

For a given noise level, current discriminative denoising strategies such as MLP [20], TNRD and CSF can train only a single model, and they cannot be applied to non-Gaussian degradations such as JPEG deblocking and single-image super-resolution (SISR). With DnCNN, in contrast, one CNN model can serve several general denoising tasks, such as Gaussian denoising with unknown noise level, JPEG deblocking with different quality factors and SISR with various upscaling factors. DnCNN achieves a higher PSNR than competing strategies such as BM3D and TNRD. It runs at moderately high speed on the CPU and is faster than the two discriminative approaches MLP and CSF, although it cannot exceed BM3D and TNRD in CPU speed.

B. BMCNN based Denoising

Non-local self-similarity (NSS) prior-based techniques and CNN-based methods are the two main families of image denoising algorithms: NSS-based methods address standard, repeated image patterns, whereas CNN-based techniques address irregular structures. NSS-based denoising aggregates similar input image patches into a 3D block, which is then denoised, but this has certain demerits. First, because the block denoising stage is built on a particular prior, it is hard to satisfy the combined features of an image; image priors are based on human observation and therefore may not be optimal. Some parameters of these methods must also be adjusted manually, and the optimization processes of NSS-based techniques such as NCSR [8], WNNM [9] and LSSC are complex and time consuming.

The block-matching convolutional neural network (BMCNN) [2] is a novel method that combines the NSS prior with CNN-based techniques. Patch-based processing is used in many image denoising methods; accordingly, in BMCNN, similar local patches of the noisy input image are combined into a 3D block, as in BM3D.
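The residual formulation used by DnCNN, where the network predicts the residual image w and the clean estimate is obtained by subtraction, can be sketched as follows (a toy illustration, not the authors' code; an oracle stands in for the trained network):

```python
import numpy as np

def residual_denoise(noisy, residual_net):
    # DnCNN-style inference: the network predicts the residual
    # (noise) image w, and the clean estimate is noisy - w.
    w = residual_net(noisy)
    return noisy - w

rng = np.random.default_rng(1)
clean = np.full((16, 16), 100.0)
noisy = clean + rng.normal(0.0, 15.0, clean.shape)

# Oracle stand-in for a trained network: returns the true residual.
oracle = lambda a: a - clean
restored = residual_denoise(noisy, oracle)  # recovers `clean` exactly
```

Learning the residual rather than the clean image is what lets the network devote its capacity to modelling the noise, which is statistically simpler than natural image content.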

To avoid noise interfering with block matching, an existing denoising algorithm is first applied to the noisy input image; this provides a more detailed result, and the denoised image is then used as a pilot image for the block-matching process. After block matching, the resulting pilot patch blocks serve as a reference for aggregating identical patches from the input image. The denoised patches and the noisy input patches are then grouped together, because some information may be lost during denoising and the noisy input patches help to reconstruct the image features. These blocks are given as input to the CNN-based denoising network. DnCNN [1], which has good denoising performance, can be used as the learning-based method, and BM3D [7], which is faster than other NSS-based methods, can be used as the NSS-based preprocessing step.

Fig. 2. Block diagram of BMCNN [2].

BMCNN outperforms the DnCNN algorithm, showing the highest average PSNR at all noise levels, and it is also much faster than other NSS-based approaches. The performance of DnCNN with its formulation is saturated, so it is difficult for a deeper network to perform better, whereas BMCNN can incorporate extra information into the network. However, BMCNN is slower than other CNN-based denoising methods: it is a two-step process that performs both NSS-based and CNN-based denoising, which incurs a large computational cost, and its block-matching stage is difficult to implement on a GPU.

C. ResNet based Denoising

The residual denoising network (ResDNet) is a deep neural network for solving the denoising problem [3]. Its architecture is inspired by powerful optimization strategies and image regularization approaches; it yields high-quality image reconstruction while using fewer trainable parameters. Like DnCNN, the network acts as a noise estimator, subtracting the network output from its input image. Unlike DnCNN, however, ResDNet accepts two inputs, the distorted image and the variance of the noise, and it works over a wide range of noise levels. DnCNN has an internal mechanism to estimate the noise variance, so it can be applied at different noise levels, but if the noise statistics diverge from the training conditions this internal mechanism fails, which affects the denoising performance of DnCNN and of [28]. ResDNet therefore gives better results than DnCNN. In ResDNet, however, each noise level requires training a deep network, which burdens the storage capacity of the devices implementing it.

ResDNet consists of five blocks. The first block is a convolutional layer with 64 filters. The second is a nonlinear block with a parametrized rectified linear unit (PReLU) activation function followed by a convolutional layer with 64 filters. The output of the nonlinear block is given to a transposed convolution layer, which reduces the number of channels from 64 to 3. The transposed convolution layer is followed by a projection layer; the noise variance is supplied as an additional input to this layer and is used to normalize the noise realization estimate to the correct variance before it is subtracted from the network input. The final output is clipped to keep intensities in the range 0 to 255. The inputs of each convolution layer are padded so that every feature map has the same spatial size as the input image; whereas most deep learning networks use zero padding, this network uses reflective padding. It also does not apply batch normalization after the convolutions, using a parametric convolution representation instead.

Fig. 3. Block diagram of ResDNet [3].

D. FFDNet based Denoising

Most discriminative learning strategies for denoising use an explicit model for each noise level, so different models must be used to denoise images with different noise levels, and they are not flexible enough to handle spatially variant noise. NSS-based denoising strategies such as WNNM and BM3D adapt to different noise levels, but their optimization algorithms are time consuming and cannot directly remove spatially variant noise. Moreover, NSS-based techniques generally use handmade image priors such as nonlocal self-similarity and sparsity, which are weak at identifying complex image structures.

The fast and flexible denoising convolutional neural network (FFDNet) [4] can be used to overcome these issues. It uses a tunable noise level map as input to manage a wide range of noise levels; in addition, the noise level map controls the trade-off between detail preservation and noise reduction. FFDNet handles spatially varying noise by specifying a non-uniform noise level map. The mapping function of the DnCNN model is a = F(b; θσ), where b is the noisy input observation, a is the desired output and θσ is the model parameter for a fixed noise level σ; the model parameters therefore change with the noise level. For FFDNet, a = F(b, N; θ), where N is a noise level map: the noise level map is treated as an input and the model parameters do not change with the noise level, so FFDNet handles spatially variant noise better than other existing CNN-based strategies.
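The practical difference between the two mapping functions can be seen by sketching how FFDNet's input is assembled: the noise level map N is stacked with the image channels, so a single parameter set θ serves every noise level (an illustrative sketch; the function name is ours):

```python
import numpy as np

def ffdnet_input(subimages, sigma):
    # Build the input of a = F(b, N; theta): concatenate a noise
    # level map N (here uniform with value sigma) to the channel
    # dimension of the down-sampled sub-images.
    c, h, w = subimages.shape
    level_map = np.full((1, h, w), float(sigma))
    return np.concatenate([subimages, level_map], axis=0)

subs = np.zeros((4, 32, 32))        # four sub-images from down-sampling
x = ffdnet_input(subs, sigma=25.0)  # shape (5, 32, 32)
```

A spatially variant noise level map is obtained by simply replacing the uniform `level_map` with a per-pixel array.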

FFDNet works on down-sampled sub-images, which speeds up both training and testing, broadens the receptive field and increases the efficiency of the network.

A reversible down-sampling operator reshapes the input image into four down-sampled sub-images. Orthogonal initialization is applied to the convolution filters to enable the noise level map to control the trade-off between detail preservation and noise reduction without visual artifacts. The tunable noise level map M and the down-sampled sub-images are concatenated and given as input to the network. Each layer of the network consists of three operations: convolution (Conv), rectified linear units (ReLU) [13] and batch normalization (BN) [14]. Zero padding is applied after every convolution to keep the feature map size unchanged. Finally, a sub-pixel convolution layer is placed in the network to invert the down-sampling process, and the output of the network is passed to an up-scaling operator that delivers the estimated clean image. Because FFDNet operates on down-sampled sub-images, there is no need to use dilated convolution to enlarge the receptive field further: denoising the down-sampled sub-images effectively increases the receptive field and improves efficiency.

Fig. 4. Block diagram of FFDNet [4].

Sub-sampling and sub-pixel convolution in FFDNet are effective at reducing memory usage, so FFDNet is memory friendly. Unlike DnCNN, FFDNet does not predict the noise, because it does not use residual learning in its network design. FFDNet outperforms NSS-based methods such as BM3D and WNNM over a wide range of noise levels. Compared with DnCNN, FFDNet is less efficient at low noise levels but outperforms DnCNN as the noise level increases: FFDNet can remove strong noise because its receptive field is larger than DnCNN's, whereas DnCNN has good modelling capacity and so suits lower noise levels. The run time of FFDNet is better than that of DnCNN on both GPU and CPU. If the input noise level is set lower than the actual noise level, the noise cannot be removed completely, so a high noise level setting is required for better denoising; however, a high setting also removes some image details along with the noise.

E. EDCNN based Denoising

Current CNN-based approaches cannot remove high-level noise completely and perfectly. The enhanced deep convolutional neural network (EDCNN) [5] can be used to overcome this issue. It employs both global and local residual learning together with a residual excitation approach for noise removal. The local residual learning (LRL) approach reduces gradient vanishing. The global residual learning (GRL) approach obtains the mapping between the input and residual images, which removes the clean image content through operations in the inner layers and additionally helps the network to train deep layers easily. Residual excitation (RE) creates shortcuts from the input image to the output block; it reduces gradient vanishing and helps to obtain a final clean image with more detail. EDCNN contains 52 feature layers, so it has a larger network depth than other CNN networks for image denoising.

EDCNN comprises three sections: the low-level data extraction block (LIB), the noise feature extraction blocks (NFB) and the output block (OB). The LIB contains two convolution layers, a BN [14] layer and the ReLU [13] activation function, which removes the linearity of the convolution output; BN operates between the convolution and the ReLU. Each NFB creates a shortcut between the outputs of the current block and the next block using the residual learning approach. The GRL approach helps to remove low-level noise from corrupted images; for high-level noise, however, the difference between the noisy image a and the noiseless clean image b is large. This issue can be resolved with the RE strategy, which decreases the difference between a and b, so the resulting clean image retains more information from the input image. Each NFB comprises eight convolution layers, and the output block contains one convolution layer. EDCNN denoises better than DnCNN and provides more detail and sharper edges than other CNN- and NSS-based noise removal approaches.

Fig. 5. Block diagram of EDCNN [5].

F. ECNDNet based Denoising

Current deep CNN networks can be used as good denoising approaches, but they show some drawbacks, such as difficulty in training and the performance saturation faced by deeper networks. Most CNN-based methods, such as BMCNN [2] and DCCRDN [3], suffer from vanishing gradients when the network is very deep, and their computational cost is very high. The enhanced convolutional neural denoising network (ECNDNet) [6] is a novel denoising method that uses BN and residual learning to overcome these training issues and obtain better denoising. Residual learning [15] helps to avoid vanishing gradients [19], and batch normalization enhances the training efficiency of the model by normalizing the data. Dilated convolution [17] is used to enlarge the receptive field and enhance network performance; it uses dilated filters with a dilation factor to capture more context.
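To see why dilation enlarges the receptive field at fixed depth and width, its growth can be computed directly (a sketch under the common rule for stacked convolutions; the per-layer dilation factors below are illustrative assumptions, not the exact ECNDNet configuration):

```python
def receptive_field(dilations, kernel=3):
    # Each k x k convolution layer with dilation d widens the
    # receptive field of the stack by (k - 1) * d pixels.
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf

# 17 plain 3x3 layers versus 17 layers with dilation 2 at four of them.
plain = receptive_field([1] * 17)
dilated = receptive_field([2 if i in (1, 4, 8, 11) else 1
                           for i in range(17)])
# A wider field is obtained at the same depth and parameter count.
```

Growing the field this way avoids both the extra parameters of a wider network and the gradient problems of a deeper one.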

Extending the CNN's receptive field is helpful for capturing more features for image denoising. Two well-known approaches to growing the receptive field are increasing the network depth and extending the network width. However, extending the width introduces more parameters, which leads to network over-fitting and large computational expense, while increasing the depth leads to the gradient vanishing problem. Using dilated convolution in the network is therefore a better way to expand the receptive field. ECNDNet has 17 layers, which keeps the computational cost down. The first and sixteenth layers contain ReLU and convolution, while the second, fifth, ninth and twelfth layers contain dilated convolution, BN and ReLU; the dilated convolutions further reduce the computational expense. In particular, the dilation factor of the dilated convolution is crucial for expanding the receptive field.

Fig. 6. Block diagram of ECNDNet [6].

ECNDNet and DnCNN [1] have the same network depth, but ECNDNet is better than DnCNN in run time, and it achieves a higher PSNR than other existing CNN-based denoising methods.

III. CONCLUSION

Many approaches exist to remove noise from a corrupted image and reconstruct a clean image of high visual quality. Both CNN and non-CNN denoising approaches can provide a good image with efficient performance, but some CNN-based noise removal approaches show excellent results compared with currently popular non-CNN approaches. Because of the good modelling capacity of the CNN, it can denoise images efficiently. BN, ReLU and residual learning are some of the techniques used in CNNs to enhance denoising performance; they help to accelerate the training phase of the network and speed up the overall process. Several CNN denoising networks were reviewed in this paper: DnCNN, BMCNN, FFDNet, ResDNet, EDCNN and ECNDNet, which show comparatively better PSNR values than non-CNN approaches.

REFERENCES

[1] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, "Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising," IEEE Transactions on Image Processing, 2017.
[2] B. Ahn and N. I. Cho, "Block-matching convolutional neural network for image denoising," arXiv preprint arXiv:1704.00524, 2017.
[3] F. Kokkinos and S. Lefkimmiatis, "Deep image demosaicking using a cascade of convolutional residual denoising networks," arXiv preprint arXiv:1803.05215, 2018.
[4] K. Zhang, W. Zuo, and L. Zhang, "FFDNet: Toward a fast and flexible solution for CNN-based image denoising," IEEE Transactions on Image Processing, 2018, pp. 4608–4622.
[5] H. Zou, R. Lan, Y. Zhong, Z. Liu, and X. Luo, "EDCNN: A novel network for image denoising," in IEEE International Conference on Image Processing, 2019.
[6] C. Tian, Y. Xu, L. Fei, J. Wang, J. Wen, and N. Luo, "Enhanced CNN for image denoising," CAAI Transactions on Intelligence Technology, 2019.
[7] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, "Image denoising by sparse 3-D transform-domain collaborative filtering," IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.
[8] W. Dong, L. Zhang, G. Shi, and X. Li, "Nonlocally centralized sparse representation for image restoration," IEEE Transactions on Image Processing, vol. 22, no. 4, pp. 1620–1630, 2013.
[9] S. Gu, L. Zhang, W. Zuo, and X. Feng, "Weighted nuclear norm minimization with application to image denoising," in IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 2862–2869.
[10] U. Schmidt and S. Roth, "Shrinkage fields for effective image restoration," in IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 2774–2781.
[11] Y. Chen and T. Pock, "Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.
[12] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," in International Conference on Learning Representations, 2015.
[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[14] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in International Conference on Machine Learning, 2015, pp. 448–456.
[15] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[16] S. Lefkimmiatis, "Universal denoising networks: A novel CNN architecture for image denoising," in IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 3204–3213.
[17] K. Zhang, W. Zuo, S. Gu, and L. Zhang, "Learning deep CNN denoiser prior for image restoration," in IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 3929–3938.
[18] V. Jain and S. Seung, "Natural image denoising with convolutional networks," in Advances in Neural Information Processing Systems, 2009, pp. 769–776.
[19] T. Tong, G. Li, X. Liu, and Q. Gao, "Image super-resolution using dense skip connections," in IEEE International Conference on Computer Vision, 2017, pp. 4809–4817.
[20] H. C. Burger, C. J. Schuler, and S. Harmeling, "Image denoising: Can plain neural networks compete with BM3D?" in IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 2392–2399.
