
Underwater Image Enhancement Using a Multiscale Dense Generative Adversarial Network

Yecai Guo, Hanyu Li, and Peixian Zhuang

Abstract—Underwater image enhancement has received much attention in underwater vision research. However, raw underwater images easily suffer from color distortion, underexposure, and fuzz caused by the underwater scene. To address these problems, we propose a new multiscale dense generative adversarial network (GAN) for enhancing underwater images. A residual multiscale dense block is presented in the generator, where the multiscale operation, dense concatenation, and residual learning boost performance, render more details, and reuse previous features, respectively. The discriminator employs computationally light spectral normalization to stabilize its training. Meanwhile, a nonsaturating GAN loss function combining L1 loss and gradient loss is presented to focus on image features of the ground truth. Enhanced results on synthetic and real underwater images demonstrate the superiority of the proposed method, which outperforms nondeep and deep learning methods in both qualitative and quantitative evaluations. Furthermore, we perform an ablation study to show the contribution of each component and carry out application tests to further demonstrate the effectiveness of the proposed method.

Index Terms—Dense concatenation, generative adversarial network (GAN), multiscale, residual learning, underwater image enhancement.

Manuscript received December 7, 2018; revised March 5, 2019; accepted April 9, 2019. Date of publication June 4, 2019; date of current version July 14, 2020. This work was supported in part by the National Natural Science Foundation of China under Grant 61701245, in part by the Startup Foundation for Introducing Talent of Nanjing University of Information Science and Technology under Grant 2243141701030, and in part by a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions. (Corresponding author: Yecai Guo.)

Associate Editor: H. Zheng.

The authors are with the School of Electronic and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China (e-mail: guo-yecai@163.com; lihanyu1204@163.com; zhuangpeixian0624@163.com).

Digital Object Identifier 10.1109/JOE.2019.2911447

I. INTRODUCTION

Recently, underwater imaging has played an important role in deep ocean exploration, underwater robotics, and sea life monitoring. However, raw underwater images seldom fulfill the requirements of image processing. Due to light attenuation and scattering in the water, color distortion, underexposure, and fuzz are three major problems of image degeneration [1]. First, depending on the depth, light conditions, water type, and light wavelength, the color of underwater images is often distorted [2]. Second, the absorption of light energy results in underexposure: objects at a distance of more than 10 m from the camera lens are almost indistinguishable [3]. Third, the fuzz of underwater images can be attributed to the following two factors: abundant suspended particles cause shifts in light scattering and propagation direction, and both suspended particles and the water itself reduce scene contrast by reflecting light toward the camera lens [4].

To improve underwater images, traditional methods include enhancement methods and restoration methods. Image enhancement methods [3]–[6], which require no underwater physical parameters, focus on adjusting image pixel values to produce appealing results. Image restoration techniques [7]–[11] exploit a degradation model to enhance underwater images; however, they require various complex underwater physical and optical factors, which makes them inflexible to implement. Owing to the lack of abundant training data, these methods generalize poorly across different underwater scenes, and the enhanced images of some scenes tend to be overenhanced or underenhanced. Deep convolutional neural networks, powerful supervised learning models, have achieved convincing success on low-level vision tasks, e.g., image superresolution [12], image deraining [13], and image denoising [14], and some researchers have applied deep learning to underwater image processing [15]–[22].

In this paper, we develop a trainable multiscale dense generative adversarial network (GAN). The main contributions of this paper are summarized as follows.

1) We propose a novel multiscale dense block (MSDB) without constructing an underwater degeneration model or image prior. The effective combination of residual learning, dense concatenation, and the multiscale operation corrects color casts and recovers image details, improving both subjective and objective evaluations. In addition, an ablation study demonstrates the effect of each component of the proposed block.

2) A meaningful adversarial loss including L1 and gradient terms is adopted to preserve image features of the ground truth. Meanwhile, spectral normalization stabilizes the training of the discriminator and is computationally light, fast, and easy to incorporate into GAN training.

3) With two no-reference metrics adopted for the underwater environment, numerous experiments demonstrate the superiority of the proposed method on both synthesized and real-world underwater images. Finally, we carry out application tests to further show the effectiveness of the proposed method.

II. RELATED WORK

Given the importance of underwater vision, many methods for underwater image enhancement have been proposed in recent years. Existing approaches to improving underwater image quality can be summarized into the following categories.

A. Enhancement-Based Methods

Image-enhancement-based methods focus on adjusting image pixel values to produce a subjectively and visually appealing image. The literature [3] derives the inputs and weights from a raw underwater image: of the two inputs, one white-balanced version discards unwanted color casts of the subsea image, while the other filtered version renders the details. Additionally, four weight maps determine which pixels are advantaged to appear in the restored output. However, the enhanced image easily becomes overenhanced or underenhanced. The integrated color model with the Rayleigh distribution [4] minimizes overenhanced and underenhanced regions, but it introduces noise in the output results. A retinex-based (RB) approach [5] is proposed to enhance a single underwater image through three major steps: a simple and effective color correction strategy, a variational RB framework, and a postprocessing step for fuzz and underexposure. This approach effectively reduces the underwater blue–green effect and removes amplified noise. These enhancement-based methods improve underwater scene contrast and image quality to some extent, but output images in some scenes become overenhanced or underenhanced; moreover, they disregard the complex underwater physical parameters.

B. Restoration-Based Methods

In image restoration techniques, the goal is to recover underwater images by constructing a degradation model and then estimating its parameters. The dark channel prior method [23] assumes that, in most local patches of haze-free outdoor images, there will be some pixels with very low intensities in at least one color channel. It then uses this assumption to estimate the transmission and restore the image. Intricate underwater images are similar to hazy images to some extent (e.g., backscatter), so some researchers apply this method to underwater images. The underwater dark channel prior (UDCP) [7] proposes a novel prior based on observing the absorption rate of the red channel in abundant underwater images to restore high-quality images. However, the UDCP is sensitive to variations in the underwater scene. Similarly, the red channel method [8] recovers degraded images by restoring the colors associated with short wavelengths. However, many physical parameters and underwater optical properties are required, making these methods inflexible to implement. Owing to a lack of abundant training data, these methods based on the dark channel prior exhibit poor performance in marine scenarios.

C. Deep-Learning-Based Methods

Relying on abundant training data, deep-learning-based methods are capable of improving image quality in different underwater scenes. Combined with the physical model, WaterGAN [16] uses in-air images with corresponding depth information to generate synthetic images for specific underwater scenarios. The literature [21] develops a weakly supervised underwater image color correction model based on the cycle-consistent adversarial network (CycleGAN) [24] and a multiterm loss function. Considering that CycleGAN can translate an image from one domain to another without paired training data or depth pairings, an underwater GAN (UGAN) [17] employs it as a degradation process to generate paired training data, and then uses a model based on pix2pix [25] to improve underwater image quality. Computing the gradient penalty [26] adopted by UGAN is more time consuming than spectral normalization [27].

Differing from these previous applications, we propose an effective block designed for underwater image enhancement with residual learning, dense concatenation, and a multiscale operation, which is demonstrated to be effective in the ablation study. Spectral normalization is utilized to stabilize the training of the discriminator and has been shown to be computationally light, fast, and easy to incorporate into GAN training [27]. In addition, the proposed network performs well in terms of both subjective and objective evaluations on up to 215 real underwater images.

III. METHODOLOGY

GANs [28] have attracted favorable attention in the machine learning community, not only for their ability to learn the target probability distribution but also for their theoretically attractive aspects. Inspired by GANs, we propose an underwater GAN (UWGAN) to learn a nonlinear mapping between the nondistorted image and the distorted image. The proposed network generates enhanced results by leveraging an end-to-end, data-driven training mechanism. As shown in Fig. 1, the proposed model contains two components: a generator network G and a discriminator network D. The residual MSDB (RMSDB) is employed in the fully convolutional network of the generator. The generator is designed to synthesize underwater images, whereas the discriminator is designed to distinguish the synthesized images produced by the generator from the corresponding real underwater images. We use a nonsaturating loss, L1 loss, and gradient loss to produce visually pleasing images.

Fig. 1. Architectures of the generator and discriminator networks. "Conv" denotes a convolution layer, whereas "Deconv" denotes a deconvolution layer. MSDB represents the multiscale dense block, and "BN" represents batch normalization. Spectral normalization is used in the convolution layers of the discriminator.

A. Generator Network

In recent years, a large number of feature extraction modules have been designed. The widely used inception architecture [29] aims to find an optimal local sparse structure in a network. However, its different scale features are concatenated in a simple fashion at the end of the block, partly leading to the underutilization of feature maps [30]. Moreover, the literature [31] proposes a deep residual learning framework that eases the optimization of networks and readily obtains more competitive results. After that, the dense block [32] was designed to strengthen feature propagation and encourage feature reuse.

Inspired by the above-mentioned feature extraction modules, we propose a novel MSDB; Fig. 2 depicts its detailed structure.

Fig. 2. Architecture of the MSDB. "Concat" denotes the dense concatenation operation.

Each concatenation operation takes three or four feature maps to exploit the local features of the image fully, one of which comes directly from the output of the previous layer. The two middle paths have different kernel sizes to detect feature maps at different scales. The last 1 × 1 convolution serves as a bottleneck layer, which promotes feature fusion and improves computational efficiency. The operation can be expressed as follows:

O_1 = L(ω^1_{1×1} ∗ X_{n−1})   (1)
T_1 = L(ω^1_{3×3} ∗ X_{n−1})   (2)
F_1 = L(ω^1_{5×5} ∗ X_{n−1})   (3)
T_2 = L(ω^2_{3×3} ∗ [T_1, F_1, X_{n−1}])   (4)
F_2 = L(ω^2_{5×5} ∗ [T_1, F_1, X_{n−1}])   (5)
X_n = L(ω^3_{1×1} ∗ [T_2, F_2, O_1, X_{n−1}])   (6)

where ω represents the weights (the biases are omitted to simplify notation), the convolution operation is marked by "∗," the superscripts denote the location of the convolutional layer, and the subscripts denote the size of the corresponding convolutional kernel. L(x) denotes the Leaky ReLU (LReLU) activation function [33], and [T_1, F_1, X_{n−1}] and [T_2, F_2, O_1, X_{n−1}] refer to concatenations of feature maps.

To facilitate the concatenation operation, each layer in an MSDB uses convolutional kernels with stride 1. The 1 × 1 convolutional layer at the end of the block reduces the feature maps to the number of input channels of the MSDB; thus, the input and output of our block have exactly the same number of feature maps. This distinctive property allows multiple MSDBs to be connected together. We add a skip connection to the MSDB, which serves as one block and further encourages the flow of information and gradients. The RMSDB combines two such blocks to obtain comparable performance. In Fig. 3, we notice that more than two blocks improve the performance but introduce too many parameters and increase the training time. Therefore, the proposed network takes two blocks as the final version.

Fig. 3. Enhanced results with the proposed network under different numbers of blocks. We obtain the results on the 215 real underwater images and average them.

In Tables I and II, RMSDB stands for the residual MSDB and BN for batch normalization [34]. The format [filter_h, filter_w, stride] gives the kernel size, and h × w × channels gives the output shape. The slope of all LReLU activation functions is set to 0.2.

TABLE I
GENERATOR NETWORK

TABLE II
DISCRIMINATOR NETWORK

In the first two layers of the network, we use two convolutional layers with 7 × 7 kernels and 64 feature maps (3 × 3 kernels and 128 feature maps for the second layer), each followed by BN and LReLU activation. These first two layers reduce the feature map size and extract preliminary features. As a result, the RMSDB can be connected at the output of the first two layers to extract more features. Two deconvolution layers are then employed to reconstruct the image. The last deconvolution layer maps back to the number of input channels and uses a Tanh function to match the input distribution of [−1, 1].
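As an illustration of this architecture, the following PyTorch sketch implements the MSDB data flow of (1)–(6), the RMSDB, and the generator skeleton described above. It is a minimal sketch, not the authors' TensorFlow implementation: the per-path channel width `growth`, the strides, and the deconvolution kernel sizes are assumptions, since Table I is not reproduced here.

```python
import torch
import torch.nn as nn

def lrelu():
    return nn.LeakyReLU(0.2, inplace=True)

class MSDB(nn.Module):
    """Multiscale dense block following (1)-(6); `growth` (channels per
    path) is an assumed hyperparameter, not a value from the paper."""
    def __init__(self, channels, growth=32):
        super().__init__()
        self.o1 = nn.Sequential(nn.Conv2d(channels, growth, 1, 1, 0), lrelu())  # (1)
        self.t1 = nn.Sequential(nn.Conv2d(channels, growth, 3, 1, 1), lrelu())  # (2)
        self.f1 = nn.Sequential(nn.Conv2d(channels, growth, 5, 1, 2), lrelu())  # (3)
        mid = 2 * growth + channels            # channels of [T1, F1, X_{n-1}]
        self.t2 = nn.Sequential(nn.Conv2d(mid, growth, 3, 1, 1), lrelu())       # (4)
        self.f2 = nn.Sequential(nn.Conv2d(mid, growth, 5, 1, 2), lrelu())       # (5)
        out = 3 * growth + channels            # channels of [T2, F2, O1, X_{n-1}]
        # 1x1 bottleneck restores the input channel count, (6).
        self.bottleneck = nn.Sequential(nn.Conv2d(out, channels, 1, 1, 0), lrelu())

    def forward(self, x):
        o1, t1, f1 = self.o1(x), self.t1(x), self.f1(x)
        d = torch.cat([t1, f1, x], dim=1)
        y = self.bottleneck(torch.cat([self.t2(d), self.f2(d), o1, x], dim=1))
        return x + y                           # skip connection around the block

class RMSDB(nn.Module):
    """Residual MSDB: two MSDBs chained, the count chosen in the final model."""
    def __init__(self, channels):
        super().__init__()
        self.blocks = nn.Sequential(MSDB(channels), MSDB(channels))

    def forward(self, x):
        return self.blocks(x)

class Generator(nn.Module):
    """Generator skeleton: two downsampling convolutions, one RMSDB, and
    two deconvolutions ending in Tanh; strides and kernels are assumed."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 7, 2, 3), nn.BatchNorm2d(64), lrelu(),
            nn.Conv2d(64, 128, 3, 2, 1), nn.BatchNorm2d(128), lrelu(),
            RMSDB(128),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), lrelu(),
            nn.ConvTranspose2d(64, in_ch, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, y):
        return self.net(y)
```

With 256 × 256 × 3 inputs normalized to [−1, 1], `Generator()(torch.randn(1, 3, 256, 256))` returns a tensor of the same shape, matching the requirement that the block preserves channel counts so that multiple MSDBs can be chained.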

B. Discriminator Network

The proposed discriminator network consists of five layers with spectral normalization [27], as shown in Fig. 1, and is similar to the 70 × 70 PatchGAN. As depicted in Table II, BN is not applied to the first and last layers. All remaining convolutional layers follow the same basic design: a convolution–BN–LReLU layer. PatchGAN was first used in pix2pix [25] and later applied in CycleGAN [24]. Such a PatchGAN has fewer parameters than a full-image discriminator and can handle arbitrarily sized images in a fully convolutional manner [25]. Spectral normalization restricts the Lipschitz constant of the discriminator to stabilize its training [27]. In addition, this procedure is computationally light and easy to implement. As shown in Fig. 4, compared to the discriminator without spectral normalization, the discriminator with spectral normalization has a stable and falling loss curve.

Fig. 4. The loss curve of the discriminator. "−SN" represents the discriminator without spectral normalization.
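A minimal sketch of such a five-layer PatchGAN-style discriminator with spectral normalization, using PyTorch's built-in `torch.nn.utils.spectral_norm` wrapper; the channel widths and 4 × 4 kernels are assumptions standing in for Table II, which is not reproduced here:

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

def d_block(cin, cout, stride, bn=True):
    """Spectrally normalized convolution -> optional BN -> LReLU."""
    layers = [spectral_norm(nn.Conv2d(cin, cout, 4, stride, 1))]
    if bn:
        layers.append(nn.BatchNorm2d(cout))
    layers.append(nn.LeakyReLU(0.2, inplace=True))
    return layers

class Discriminator(nn.Module):
    """Five-layer PatchGAN-style discriminator; BN is skipped on the
    first and last layers, as stated in the text."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            *d_block(in_ch, 64, 2, bn=False),
            *d_block(64, 128, 2),
            *d_block(128, 256, 2),
            *d_block(256, 512, 1),
            spectral_norm(nn.Conv2d(512, 1, 4, 1, 1)),  # one logit per patch
        )

    def forward(self, x):
        return self.net(x)
```

Because the output is a grid of per-patch logits rather than a single scalar, the same network handles arbitrarily sized inputs, which is precisely the property of PatchGAN exploited here.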
C. GAN Objective Function

The proposed generator produces an image to fool the discriminator, which is designed to distinguish between synthesized and real-world underwater images. We let x be an in-air image and y be the same image with degradation. The proposed loss function includes the nonsaturating GAN loss, L1 loss, and gradient loss as follows:

L_UWGAN = min_G max_D V(D, G) + λ_1 L_L1(G) + λ_g L_g(G).   (7)

The nonsaturating GAN loss can be expressed as follows:

min_G max_D V(D, G) = E_{x∼p_train(x)}[log D(x)] + E_{y∼p_gen(y)}[log(1 − D(G(y)))]   (8)

where D(x) represents the probability that x comes from the real underwater images rather than from the output of the generator G(y). It is well known that the nonsaturating loss outperforms the minimax variant [35]. λ_1 and λ_g are the weights of the L1 distance and gradient loss, respectively.

To give the results some sense of the ground truth and stabilize the training process, we use the L1 distance and gradient loss as follows:

L_L1(G) = E[||x − G(y)||_1]   (9)

L_g(G) = E[||∇(x) − ∇(G(y))||_1].   (10)
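A sketch of the multiterm objective (7)–(10) in PyTorch follows. The nonsaturating form trains G to maximize log D(G(y)) instead of minimizing log(1 − D(G(y))); the gradient operator ∇ in (10) is realized here with simple finite differences, which is one reasonable reading rather than necessarily the authors' exact choice:

```python
import torch
import torch.nn.functional as F

def image_gradients(img):
    """Finite-difference gradients along height and width."""
    return (img[:, :, 1:, :] - img[:, :, :-1, :],
            img[:, :, :, 1:] - img[:, :, :, :-1])

def generator_loss(D, x, g_y, lam_1=60.0, lam_g=10.0):
    """Nonsaturating adversarial loss plus the L1 (9) and gradient (10) terms."""
    logits = D(g_y)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    l1 = F.l1_loss(g_y, x)                                        # (9)
    (dhx, dwx), (dhg, dwg) = image_gradients(x), image_gradients(g_y)
    grad = F.l1_loss(dhg, dhx) + F.l1_loss(dwg, dwx)              # (10)
    return adv + lam_1 * l1 + lam_g * grad

def discriminator_loss(D, x, g_y):
    """Discriminator side of (8): real images toward 1, generated toward 0."""
    real, fake = D(x), D(g_y.detach())
    return (F.binary_cross_entropy_with_logits(real, torch.ones_like(real))
            + F.binary_cross_entropy_with_logits(fake, torch.zeros_like(fake)))
```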
IV. EXPERIMENTS

In this section, we first discuss the detailed setup of the proposed network. We then show the performance of the proposed method by comparing it with other nondeep and deep learning methods on both synthetic and real underwater images. Finally, an ablation study and application tests further demonstrate the superiority of the proposed method.

A. Setup

1) Data Set: The proposed method is trained in a paired manner using the training data from the literature [17]. UGAN divides subsets of Imagenet [36] containing underwater images into two categories based on subjective vision. Let x be the set of underwater images with no distortion, and y be the set of underwater images with distortion. CycleGAN can learn the mapping function f : x → y as well as g : y → x. Finally, 6128 image pairs of training data are generated by degrading the in-air images in x with f. Simultaneously, CycleGAN learns a mapping g : y → x, which is similar to image enhancement and is used as a comparison method. We select 119 real underwater images from related papers and from seafood breeding bases in Zhangzidao, China, and select 96 images from Imagenet [36] and SUN [37]. The test set contains a total of 215 real underwater images.

2) Training Details: In our training process, training and test images have dimensions 256 × 256 × 3 and are normalized to [−1, 1]. We use λ_1 = 60, λ_g = 10, LReLU with a slope of 0.2, and the Adam algorithm [38] with a learning rate of 0.0001. The batch size is set to 32. The discriminator updates five times per generator update. The entire network was trained on a GTX 1070 Ti using the TensorFlow framework for 60 epochs. (A training-loop sketch after this subsection illustrates these settings.)

3) Compared Methods: We compare the proposed model with other enhancement methods on both synthesized and real-world underwater images. These competitive methods include FusionEnhance (FE) [3], RB [5], UDCP [7], CycleGAN [24], weakly supervised color transfer (WSCT) [21], and UGAN [17].
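The following minimal loop shows how the stated schedule (five discriminator updates per generator update, Adam at a 0.0001 learning rate, 60 epochs) fits together with the loss functions sketched earlier. Here `loader` is a hypothetical paired dataloader; the paper does not say whether the five discriminator updates reuse one minibatch, so this sketch simply reuses it:

```python
import torch

def train(G, D, loader, epochs=60, device="cuda"):
    """Training-loop sketch for UWGAN with the settings reported above."""
    G, D = G.to(device), D.to(device)
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
    for _ in range(epochs):
        for x, y in loader:          # x: in-air image, y: degraded counterpart
            x, y = x.to(device), y.to(device)
            for _ in range(5):       # five D updates per G update
                opt_d.zero_grad()
                discriminator_loss(D, x, G(y)).backward()
                opt_d.step()
            opt_g.zero_grad()
            generator_loss(D, x, G(y)).backward()
            opt_g.step()
```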

B. Real-World Underwater Enhancement

We first evaluate the proposed method on the test set against nondeep and deep learning methods. As shown in Fig. 5, FE has an obvious reddish color shift due to its inaccurate color correction algorithm. RB enhances underwater image quality but generates some darkish images. We note that UDCP aggravates the blue–green effect. CycleGAN has a limited positive effect on the images because image-to-image translation is not well suited to underwater image enhancement. WSCT introduces a greenish tone in some patches of the underwater images, partly because of the lack of techniques to stabilize GAN training; for example, in the first image of WSCT, the background shows a green color deviation. In Fig. 6, the color of the starfish is not clear enough. Compared with the other methods, the proposed method not only restores visually appealing results in underwater scenarios but also enhances the underwater images even in situations where the other methods fail.

Fig. 5. Qualitative comparisons for sample results on real underwater images.

Fig. 6. Qualitative comparisons of different enhancement methods on the starfish image.

To make our results more convincing, we employ two no-reference metrics to evaluate the underwater images: the underwater color image quality evaluation (UCIQE) [39] and the underwater image quality measure (UIQM) [40]. The UCIQE utilizes a linear combination of chroma, saturation, and contrast, which aims to quantify nonuniform color cast, blurring, and low contrast, respectively. Similarly, the UIQM comprises three properties of underwater images: the underwater image colorfulness measure (UICM), the underwater image sharpness measure (UISM), and the underwater image contrast measure (UIConM). Higher values of UCIQE and UIQM denote better image quality.

TABLE III
UNDERWATER IMAGE QUALITY EVALUATION OF DIFFERENT ENHANCEMENT METHODS ON REAL-WORLD UNDERWATER IMAGES

In our assessment experiments, we employ UCIQE and UIQM to evaluate the test set. Table III lists the average values obtained by the different methods on the 215 images. The best results of the UCIQE and UIQM metrics are marked in bold, and the number following the mean denotes the variance. FE has an obvious reddish color shift due to its inaccurate color correction algorithm, which decreases the UICM value; therefore, FE ranks fifth under UIQM. CycleGAN was trained for 50 000 and 100 000 iterations with no added benefit, because image-to-image translation with a cycle consistency loss is not well suited to the underwater scenario. UGAN utilizes many convolution layers with up to 512 kernels to enhance underwater images, resulting in too many network parameters. The proposed method, combined with the effective block, achieves higher metrics while using fewer parameters than UGAN. It can be seen that the UIQM of the proposed method is larger than that of the other methods, the UCIQE is also larger than that of most methods, and the variance is smaller than that of most methods.
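For reference, a UCIQE-style score can be sketched as follows. This is a minimal reading of the linear combination described in [39], computed in CIELab with OpenCV; the coefficients are the values commonly quoted for [39], and the saturation and contrast definitions here are plausible simplifications rather than the official implementation, so consult the original paper before relying on it:

```python
import cv2
import numpy as np

def uciqe(bgr, c1=0.4680, c2=0.2745, c3=0.2576):
    """UCIQE-style score: weighted sum of chroma standard deviation,
    luminance contrast, and mean saturation (simplified sketch of [39])."""
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2Lab).astype(np.float64)
    L = lab[..., 0]
    a, b = lab[..., 1] - 128.0, lab[..., 2] - 128.0
    chroma = np.sqrt(a ** 2 + b ** 2)
    sigma_c = chroma.std()                      # nonuniform color cast
    lo, hi = np.percentile(L, [1, 99])
    con_l = hi - lo                             # luminance contrast (blurring)
    mu_s = np.mean(chroma / (np.sqrt(chroma ** 2 + L ** 2) + 1e-8))  # saturation
    return c1 * sigma_c + c2 * con_l + c3 * mu_s
```

A call such as `uciqe(cv2.imread("enhanced.png"))` then yields a scalar score, with higher values indicating better quality, matching the convention stated above.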
C. Synthetic Underwater Enhancement

We also evaluate the proposed method on 60 synthesized images, which come from WaterGAN trained on the MHL data sets [16]. As shown in Fig. 7, FE and RB introduce a reddish color shift, whereas UDCP fails to correct the underwater color. CycleGAN and WSCT show a greenish tone in some patches of the underwater images. Compared with UGAN, the proposed method renders more details and reduces color shifts, for example, on the cabinet and the color of the wall.

Fig. 7. Qualitative comparisons of different enhancement methods on synthetic underwater images.

Considering the underwater environment, we also employ the UCIQE and UIQM metrics to evaluate the synthesized underwater images. Table IV lists the averages of the results; the values in bold denote results that surpass all competing methods, and the number following the mean denotes the variance. It can be seen that the UCIQE score of the proposed method is higher than that of the other methods, the UIQM score is relatively high, and the variance is smaller than that of the other methods.

TABLE IV
UNDERWATER IMAGE QUALITY EVALUATION OF DIFFERENT ENHANCEMENT METHODS ON THE SYNTHETIC UNDERWATER IMAGES

D. Ablation Study and Application Tests

1) Ablation Study: The ablation study aims to reveal the effect of each component. We carry out the test on the 119 real underwater images, considering the following variants of the proposed method:
1) UWGAN with the residual learning operation removed (−RL);
2) UWGAN with the dense concatenation operation removed (−DC);
3) UWGAN with the multiscale operation removed (−MS).
TABLE V
UNDERWATER IMAGE QUALITY EVALUATION OF DIFFERENT VARIANTS OF THE PROPOSED METHOD

The best results of the UCIQE and UIQM metrics are marked in bold. In Table V, we notice that both the residual learning and multiscale operations improve the UCIQE and UIQM of the underwater images. Compared with the proposed network without dense concatenation, applying several dense concatenation operations in the specified convolutional layers improves the UICM and UIConM. As seen in Fig. 8, the UWGAN with the dense concatenation operation removes unpleasant artifacts at the cost of decreased UCIQE performance, favoring better subjective perception.

Fig. 8. Examples of artifacts. "−DC" represents the UWGAN without the dense concatenation operation.

2) Application Tests: Feature-related algorithms, including SIFT [41] and Canny [42], are employed to further demonstrate the effectiveness of the proposed method. As shown in Figs. 9 and 10, the enhanced images yield more keypoint matches and more detected edge features than the original underwater images.

Fig. 9. Keypoint matching test on original underwater images and enhanced images.

Fig. 10. Canny edge detection on original underwater images and enhanced images.
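Such a test is easy to reproduce. The sketch below counts SIFT keypoints and Canny edge pixels before and after enhancement; it assumes an OpenCV build that includes SIFT, and the Canny thresholds (100, 200) are arbitrary illustrative choices, not values from the paper:

```python
import cv2

def feature_test(original_path, enhanced_path):
    """Compare SIFT keypoint counts and Canny edge responses on an
    original underwater image and its enhanced counterpart."""
    sift = cv2.SIFT_create()
    for name, path in (("original", original_path), ("enhanced", enhanced_path)):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        keypoints = sift.detect(gray, None)        # SIFT [41]
        edges = cv2.Canny(gray, 100, 200)          # Canny [42]
        print(f"{name}: {len(keypoints)} keypoints, "
              f"{int((edges > 0).sum())} edge pixels")
```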
V. CONCLUSION

In this paper, we presented a method for underwater image enhancement based on a GAN. The proposed MSDB combined with residual learning improves network performance, and the multiterm loss function produces visually pleasing enhanced results. Numerous experiments demonstrate the superiority of the proposed method on both synthetic and real underwater images. In addition, an ablation study shows the contributions of each component, and the application tests further demonstrate the effectiveness of the proposed method.

We find that our model cannot generate aesthetically pleasing synthesized underwater images. We leave this part to future work.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their constructive and valuable comments. They also would like to thank C. Li for providing help. Finally, the authors would like to thank J. Li, K. A. Skinner, and J. Zhang.

REFERENCES

[1] D. M. Kocak, F. R. Dalgleish, F. M. Caimi, and Y. Y. Schechner, "A focus on recent developments and trends in underwater imaging," Mar. Technol. Soc. J., vol. 42, no. 1, pp. 52–67, 2008.
[2] H. Lu, Y. Li, and S. Serikawa, "Underwater image enhancement using guided trigonometric bilateral filter and fast automatic color correction," in Proc. IEEE Int. Conf. Image Process., 2013, pp. 3412–3416.
[3] C. Ancuti, C. O. Ancuti, T. Haber, and P. Bekaert, "Enhancing underwater images and videos by fusion," in Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2012, pp. 81–88.
[4] A. S. A. Ghani and N. A. M. Isa, "Underwater image quality enhancement through integrated color model with Rayleigh distribution," Appl. Soft Comput., vol. 27, pp. 219–230, 2015.
[5] X. Fu, P. Zhuang, Y. Huang, Y. Liao, X.-P. Zhang, and X. Ding, "A retinex-based enhancing approach for single underwater image," in Proc. IEEE Int. Conf. Image Process., 2014, pp. 4572–4576.
[6] C.-Y. Li, J.-C. Guo, R.-M. Cong, Y.-W. Pang, and B. Wang, "Underwater image enhancement by dehazing with minimum information loss and histogram distribution prior," IEEE Trans. Image Process., vol. 25, no. 12, pp. 5664–5677, Dec. 2016.
[7] P. L. Drews, E. R. Nascimento, S. S. Botelho, and M. F. M. Campos, "Underwater depth estimation and image restoration based on single images," IEEE Comput. Graph. Appl., vol. 36, no. 2, pp. 24–35, Mar./Apr. 2016.
[8] A. Galdran, D. Pardo, A. Picon, and A. Alvarez-Gila, "Automatic red-channel underwater image restoration," J. Vis. Commun. Image Represent., vol. 26, pp. 132–145, 2015.
[9] N. Wang, H. Zheng, and B. Zheng, "Underwater image restoration via maximum attenuation identification," IEEE Access, vol. 5, pp. 18941–18952, 2017.
[10] Y. Wang, H. Liu, and L.-P. Chau, "Single underwater image restoration using adaptive attenuation-curve prior," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 65, no. 3, pp. 992–1002, Mar. 2018.
[11] Y.-T. Peng and P. C. Cosman, "Underwater image restoration based on image blurriness and light absorption," IEEE Trans. Image Process., vol. 26, no. 4, pp. 1579–1594, Apr. 2017.
[12] C. Dong, C. C. Loy, K. He, and X. Tang, "Learning a deep convolutional network for image super-resolution," in Proc. Eur. Conf. Comput. Vision, 2014, pp. 184–199.
[13] X. Fu, J. Huang, D. Zeng, Y. Huang, X. Ding, and J. Paisley, "Removing rain from single images via a deep detail network," in Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2017, pp. 3855–3863.
[14] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, "Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising," IEEE Trans. Image Process., vol. 26, no. 7, pp. 3142–3155, Jul. 2017.
[15] S. Anwar, C. Li, and F. Porikli, "Deep underwater image enhancement," 2018. [Online]. Available: https://arxiv.org/abs/1807.03528
[16] J. Li, K. A. Skinner, R. M. Eustice, and M. Johnson-Roberson, "WaterGAN: Unsupervised generative network to enable real-time color correction of monocular underwater images," IEEE Robot. Autom. Lett., vol. 3, no. 1, pp. 387–394, Jan. 2018.
[17] C. Fabbri, M. J. Islam, and J. Sattar, "Enhancing underwater imagery using generative adversarial networks," 2018. [Online]. Available: https://arxiv.org/abs/1801.04011
[18] Y.-S. Shin, Y. Cho, G. Pandey, and A. Kim, "Estimation of ambient light and transmission map with common convolutional architecture," in Proc. MTS/IEEE OCEANS Conf., Monterey, CA, USA, 2016, pp. 1–7.
[19] X. Yu, Y. Qu, and M. Hong, "Underwater-GAN: Underwater image restoration via conditional generative adversarial network," in Proc. Int. Conf. Pattern Recognit., 2018, pp. 66–75.
[20] Y. Hu, K. Wang, X. Zhao, H. Wang, and Y. Li, "Underwater image restoration based on convolutional neural network," in Proc. Asian Conf. Mach. Learn., 2018, pp. 296–311.
[21] C. Li, J. Guo, and C. Guo, "Emerging from water: Underwater image color correction based on weakly supervised color transfer," IEEE Signal Process. Lett., vol. 25, no. 3, pp. 323–327, Mar. 2018.
[22] X. Chen, J. Yu, S. Kong, Z. Wu, X. Fang, and L. Wen, "Towards quality advancement of underwater machine vision with generative adversarial networks," 2017. [Online]. Available: https://arxiv.org/abs/1712.00736
[23] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 12, pp. 2341–2353, Dec. 2011.
[24] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2017, pp. 2223–2232.
[25] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," 2017. [Online]. Available: https://arxiv.org/abs/1611.07004
[26] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. C. Courville, "Improved training of Wasserstein GANs," in Proc. Adv. Neural Inf. Process. Syst., 2017, pp. 5767–5777.
[27] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, "Spectral normalization for generative adversarial networks," 2018. [Online]. Available: https://arxiv.org/abs/1802.05957
[28] I. Goodfellow et al., "Generative adversarial nets," in Proc. Adv. Neural Inf. Process. Syst., 2014, pp. 2672–2680.
[29] C. Szegedy et al., "Going deeper with convolutions," in Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2015, pp. 1–9.
[30] J. Li, F. Fang, K. Mei, and G. Zhang, "Multi-scale residual network for image super-resolution," in Proc. Eur. Conf. Comput. Vision, 2018, pp. 517–532.
[31] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2016, pp. 770–778.
[32] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks," in Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2017, pp. 4700–4708.
[33] A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in Proc. Int. Conf. Mach. Learn., vol. 30, no. 1, pp. 1–6, 2013.
[34] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in Proc. 32nd Int. Conf. Mach. Learn., 2015, vol. 37, pp. 448–456.
[35] K. Kurach, M. Lucic, X. Zhai, M. Michalski, and S. Gelly, "The GAN landscape: Losses, architectures, regularization, and normalization," 2018. [Online]. Available: https://arxiv.org/abs/1807.04720
[36] O. Russakovsky et al., "ImageNet large scale visual recognition challenge," Int. J. Comput. Vision, vol. 115, no. 3, pp. 211–252, 2015.
[37] J. Xiao, J. Hays, K. A. Ehinger, A. Oliva, and A. Torralba, "SUN database: Large-scale scene recognition from abbey to zoo," in Proc. IEEE Comput. Soc. Conf. Comput. Vision Pattern Recognit., 2010, pp. 3485–3492.
[38] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," 2014. [Online]. Available: https://arxiv.org/abs/1412.6980
[39] M. Yang and A. Sowmya, "An underwater color image quality evaluation metric," IEEE Trans. Image Process., vol. 24, no. 12, pp. 6062–6071, Dec. 2015.
[40] K. Panetta, C. Gao, and S. Agaian, "Human-visual-system-inspired underwater image quality measures," IEEE J. Ocean. Eng., vol. 41, no. 3, pp. 541–551, Jul. 2016.
[41] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vision, vol. 60, no. 2, pp. 91–110, 2004.
[42] J. Canny, "A computational approach to edge detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-8, no. 6, pp. 679–698, Nov. 1986.

Yecai Guo received the Ph.D. degree in underwater acoustic engineering from Northwestern Polytechnical University, Xi'an, China, in 2003. He is currently a Professor with Nanjing University of Information Science and Technology, Nanjing, China. His research interests include meteorological telecommunication technology, underwater communication theory, and their applications. Dr. Guo was a national winner of the National Outstanding Doctoral Dissertations in 2006, a training object of the "six talents peak" program of Jiangsu Province in 2008, and one of the leaders of the advantage discipline "sensor network and modern meteorological equipment" of colleges and universities of Jiangsu Province in 2009.

Hanyu Li received the B.S. degree in automation from Nanhang Jincheng College, Nanjing, China, in 2017. He is currently working toward the M.S. degree in image processing and deep learning with the School of Electronic and Information Engineering, Nanjing University of Information Science and Technology, Nanjing, China. His research interests include generative adversarial networks and underwater image processing.

Peixian Zhuang received the Ph.D. degree in sparse representation, compressed sensing, and Bayesian machine learning from Xiamen University, Xiamen, China, in 2016. He is currently a Lecturer with the School of Electronic and Information Engineering, Nanjing University of Information Science and Technology, Nanjing, China. His research interests include image processing and Bayesian machine learning. Dr. Zhuang was the recipient of the Best Ph.D. Thesis Award of Fujian Province in 2017.