

A Transfer Learning Based Super-Resolution Microscopy for Biopsy Slice Images: The Joint Methods Perspective

Jintai Chen†, Haochao Ying†, Xuechen Liu†, Jingjing Gu, Ruiwei Feng, Tingting Chen, Honghao Gao∗, and Jian Wu∗

Abstract—Biopsy slices are widely used in medical practice. Higher-resolution biopsy slice images reveal many details that are helpful to doctors and scientists; however, taking high-resolution slice images is time-consuming. In this paper, we propose a joint framework containing a novel transfer learning strategy and a deep super-resolution framework that generates higher-resolution slice images from lower-resolution inputs. The super-resolution framework, called SRFBN+, is built by modifying the state-of-the-art SRFBN: specifically, the structure of its feedback block is modified to be more flexible. Besides, typical transfer learning strategies are hard to apply directly to tasks on slice images, as the patterns of different types of biopsy slice images vary considerably. To this end, we propose a novel transfer learning strategy, called Channel Fusion Transfer Learning (CF-Trans). CF-Trans builds a middle domain by mixing the data manifolds of the source domain and the target domain, serving as a springboard for knowledge transfer; SRFBN+ can thus be trained on the source domain, then the middle domain, and finally the target domain. Experiments on slice images validate that SRFBN+ works well in generating super-resolution slice images and that CF-Trans is an effective transfer learning strategy.

Index Terms—Transfer Learning, Super-Resolution Techniques, Mixup, Image Reconstruction, Biopsy Slices.

1 INTRODUCTION

HIGH-RESOLUTION slice images provide many details that are beneficial in medical applications and related research. However, taking high-resolution slice images is time-consuming, and such images occupy much storage as well. What troubles users is the limited storage of equipment and the wearing process of taking slice images; moreover, purchasing advanced microscopes lands research institutions and hospitals a heavy burden. Nowadays, deep learning has achieved significant results in many computer vision tasks, such as pattern recognition [1], [2], image compression [3], and image generation [4]. These breakthroughs have also deeply influenced many other fields, including medical image computing and computer-assisted intervention, which are widely committed to doctors and patients.

Recently, deep super-resolution algorithms have also shown a remarkable ability to generate super-resolution images from lower-resolution ones. It would be valuable if super-resolution techniques could be used to generate higher-resolution slice images. However, off-the-shelf deep super-resolution methods have not obtained great performance on slices taken with microscopes, partially because the patterns in slice images are more complex than those in the natural images used in previous studies. When training deep super-resolution frameworks, the typical procedure is to down-sample high-resolution images into lower-resolution ones and to train a framework to execute the inverse operation, reconstructing the high-resolution images from the lower-resolution inputs. The down-sampling methods (e.g., average/median down-sampling [5], Gaussian noise [6], and bilateral down-sampling [7]) may drop many details. In natural image super-resolution tasks, the lost details can be somewhat supplemented by using higher-level semantics. Nevertheless, slice images exhibit more intricate textures and finer details, which makes training super-resolution microscopy frameworks more difficult than training a common super-resolution framework for natural images. Besides, the number of natural images is larger in magnitude than that of slice images, so training a super-resolution framework for slice images is relatively harder, with fewer and more complex training samples.

The first super-resolution technique based on deep learning [8] was reported in 2014 and was designed to learn an interpolation algorithm via stacked convolution kernels.

• Jintai Chen, Haochao Ying, Xuechen Liu, Ruiwei Feng, Tingting Chen, and Jian Wu are with the Real Doctor AI Research Centre, Zhejiang University, Hangzhou, P. R. China. E-mail: {JTigerChen, haochaoying}@zju.edu.cn, bhfs9999@outlook.com, {ruiwei feng, trista chen0603, wujian2000}@zju.edu.cn
• Jingjing Gu is with the Department of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China. E-mail: gujingjing@nuaa.edu.cn
• Honghao Gao is with the Computing Center, Shanghai University, Shanghai, P. R. China. E-mail: gaohonghao@shu.edu.cn
• †Jintai Chen, Haochao Ying and Xuechen Liu are co-first authors.
• ∗Corresponding authors: Honghao Gao and Jian Wu.

Manuscript received November 10, 2019; revised January 10, 2020.


Most of the following methods were inspired by [8] and utilized convolutional neural networks (CNNs) to learn data-specific interpolation strategies. Smaller convolutional kernels were used in [9] to make the framework lighter, instead of the nine-by-nine and five-by-five convolutional kernels of [8]; this modification improved the speed by a large margin. Another milestone was the framework named VDSR [10], which achieved attractive performance by introducing the residual learning idea. Later methods, such as LapSRN [11], DRCN [12], and MDSR [13], were proposed following the residual learning idea. Other works were introduced using GANs (e.g., SRGAN [14]) and reached good performance but took more time in training. In summary, previous works showed that deep learning methods are suitable for super-resolution tasks, and that stacked convolution kernels and the residual learning manner are quite advantageous. A few frameworks were also proposed to reconstruct high-resolution biomedical images. In [15], simple convolution-based up-sampling methods were utilized to generate high-resolution slice images. Deep-STORM [16] employed a fully convolutional architecture, generating high-resolution slice images directly. However, these methods used a lot of redundant parameters to learn patterns that are not useful for super-resolution, as discussed in [10]. Such methods failed to employ the successful experience (e.g., residual learning) of the super-resolution field, which might decrease model performance.

Besides, various transfer learning strategies are frequently adopted due to their effectiveness. Transfer learning methods are widely used to alleviate the lack of samples. As biopsy slices are costly, transfer learning can help promote model performance in generating super-resolution slice images. The main idea of transfer learning is to learn and then transfer knowledge, as a helpful supplement, from a domain (e.g., a source domain) to the target domain. However, every dataset has its own character, so transferring knowledge between "distant domains" (whose data manifolds are very different) is not an easy task. Many transfer learning strategies were proposed to solve this problem, such as transferring knowledge of the source and target domains into a third space. However, these methods are not easy to implement, as the third space is hard to select in practice [17].

In this paper, we propose a deep learning framework called SRFBN+ to generate super-resolution biopsy slice images from low-resolution ones following the residual learning manner [1], [10]. SRFBN+ is built by modifying the state-of-the-art SRFBN [18]; the detailed design is described in Sec. 3. The motivation is to make the feedback block more flexible and the training process more stable. To this end, the feedback block is split into several smaller ones, and feedback connections are added. To better train the model, a novel transfer learning method called Channel Fusion Transfer Learning (CF-Trans) is proposed, which reduces the difficulty of knowledge transfer. Motivated by Mixup [19], we introduce a middle domain between the source domain and the target domain and split the typical direct transfer learning process into two parts. Simply speaking, the middle domain is built by fusing the data manifolds of the source domain and the target domain, so the middle domain keeps the characteristics of both. In our proposed transfer learning process, the model is trained on the source domain, the middle domain, and the target domain in sequence.

In our work, SRFBN+ generates super-resolution ovarian slice images from low-resolution images. SRFBN+ is trained with the novel transfer learning strategy, transferring knowledge from colonoscopy biopsy slices to ovarian biopsy slices. Experiments show that our proposed SRFBN+ framework is effective for super-resolution biopsy slice image generation. Also, the Channel Fusion Transfer Learning strategy is verified to be helpful: it outperforms both direct training on the target domain and direct transfer learning from the source domain to the target domain. Note that channel fusion transfer learning does not introduce any outside information, which demonstrates that the strategy itself helps to narrow the gap between the source domain and the target domain.

The remainder of this paper is organized as follows. Related works on super-resolution methods and transfer learning strategies are discussed in Sec. 2. Sec. 3 reviews and discusses the state-of-the-art SRFBN and then introduces our proposed SRFBN+, the Channel Fusion Transfer Learning (CF-Trans), and the training pipeline. The experiments in Sec. 4 verify the effectiveness of SRFBN+ and CF-Trans.

2 RELATED WORKS

2.1 Super-Resolution Methods

In the last decades, plenty of super-resolution methods have been proposed, including traditional methods [20], [21], [22] and deep learning based methods [10], [23], [24], [25], [26], [27]. No matter what kind of method is used and how complex it is, the essence is to seek a suitable interpolation algorithm for the given data. Early methods, such as bi-linear and bicubic interpolation, were simple but not very well-performing, as they did not fit the data manifolds. Later, some assumptions were introduced into the models, such as the non-local similarity assumption [21] and the sparsity prior assumption [22]. Although such methods are flexible in application, they are time-consuming and the image assumptions are somewhat biased.

Due to the excellent power of deep learning frameworks in capturing features, especially convolutional neural networks (CNNs), it is intuitive to introduce deep learning frameworks [8], [9], [10], [11], [12], [13], [14], [23], [24], [25], [26], [27] into super-resolution tasks. Deep learning methods are usually data-driven, so they tend to obtain better performance in a specific field compared with traditional methods. In our view, there have been three milestones in the deep super-resolution field. One was the groundbreaking work SRCNN [8], [23], which made the first successful trials in deep super-resolution tasks and proposed several referential ideas. The second introduced the residual learning manner [10], followed by most subsequent deep super-resolution methods. The third was the GAN-based frameworks [14], where the GAN made the model more flexible (however, GAN-based training is time-consuming).


Fig. 1. The left part is the main architecture of SRFBN [18], unrolled over iterations (LR → 3×3 Conv → 1×1 Conv → feedback block → up-sample/deconvolution → 3×3 Conv → SR); the right part illustrates the feedback connection for easy viewing. The puzzle arrow is the feedback connection.

Many other excellent works were also proposed, making the super-resolution community flourish.

However, as discussed in the introduction, slice images have more details and textures, making their reconstruction more difficult than that of natural images. Besides, as slice images are much more precious than natural images, directly training deep learning frameworks for a specific slice image domain is not effective. Thus, super-resolution for biomedicine requires additional research.

2.2 Deep Super-Resolution Methods in Biomedicine

Some deep super-resolution methods have been proposed for biomedicine. A simple deep learning method was proposed in [15], which up-sampled the images with CNNs after cropping the various tissue images into patches. Deep-STORM [16] built a fully convolutional architecture, analogous in shape to U-Net [28]. An ANN-based method was proposed in [29] to deal with samples taken by PALM and STORM microscopy. These methods partly met the requirements of biomedicine; however, they neglected the successful experience of the computer vision field. For example, without residual learning, the frameworks retain lots of redundant parameters.

2.3 Deep Transfer Learning Strategy

Deep transfer learning methods have proved very useful in promoting model performance [30], [31], [32], [33]. However, what and how to transfer is vital and partly determines the performance of transfer learning, since there is a gap between the knowledge of the source domain and the target domain. The simplest way to transfer is to train a deep learning model on one dataset and then use the learned parameters as initialization for training on the target dataset. Frameworks pre-trained on large-scale datasets (e.g., ImageNet [34]) were proved to perform well [35]. Another popular method is to fix the parameters of the shallow layers of a pre-trained network and continue training on the target dataset. These methods are widely used in complex tasks (e.g., object detection and recognition [36], [37], [38], image segmentation [39], [40]) in which training the model directly is difficult. A novel method called Transitive Transfer Learning was proposed in [41], seeking an intermediate domain to help the transfer; in this manner, the model is transferred from the source domain to the intermediate domain and then to the target domain. For example, to transfer knowledge from face identification to airplane identification, car identification could be the middle task. However, the intermediate domain needs careful selection, so the performance is not guaranteed. Although the method in [41] was not perfect, its idea of bridging the source domain and the target domain via a middle domain is promising. Biomedicine needs transfer learning, as annotations of medical samples are very precious, and there have been many applications in biomedical fields [42], [43], [44]. These successful cases verify that using transfer learning in biomedicine is beneficial.

3 METHODOLOGY

Consider a special super-resolution task: some low-resolution biopsy slice images are taken, and we generate the corresponding higher-resolution images via a deep learning model using the low-resolution images as input. In the typical supervised deep learning setting, it is necessary to train the super-resolution framework with image pairs, each consisting of a low-resolution and a high-resolution slice image. In the training process, we generate super-resolution slice images with the high-resolution images as targets and the low-resolution ones as input. We follow the state-of-the-art framework SRFBN [18], modify its design, and train the modified model with a novel transfer learning strategy. In this paper, our design is verified on ovarian biopsy slices, by transferring knowledge from colonoscopy biopsy slices.


Fig. 2. An illustration of the feedback block of SRFBN (see Eqs. 2 and 3): the previous output F_out^{t-1} and the input F_in^t are fused into L_0^t by 1×1 convolutions; paired de-convolutions and convolutions then produce the high-level features H_1^t, ..., H_G^t and the low-level features L_1^t, ..., L_G^t, and the final output is F_out^t.

The architecture of our deep super-resolution framework, SRFBN+, is shown in Fig. 6. Like other state-of-the-art frameworks, SRFBN is built with a residual connection to learn the residual patterns. Formally, in a super-resolution architecture, the residual connection is defined by

y^ = f_residual(x) + I(x)    (1)

where x is the low-resolution input, y^ is the reconstruction, f_residual(·) denotes a deep learning model/module, and I(·) is simply an interpolation function.
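To make Eq. 1 concrete, the following is a minimal PyTorch sketch of a residual super-resolution forward pass (the paper states PyTorch as its framework, but the module and names below are our own illustrative placeholders, not the authors' released code; the wrapped network is assumed to up-scale internally, e.g., via de-convolutions):

```python
import torch.nn as nn
import torch.nn.functional as F

class ResidualSR(nn.Module):
    """Eq. 1: reconstruction = learned residual + interpolated input."""

    def __init__(self, residual_net: nn.Module, scale: int = 4):
        super().__init__()
        self.residual_net = residual_net  # f_residual: outputs at HR size
        self.scale = scale

    def forward(self, lr):
        # I(x): a fixed interpolation of the low-resolution input
        base = F.interpolate(lr, scale_factor=self.scale,
                             mode="bilinear", align_corners=False)
        # the network only needs to supply the residual details
        return self.residual_net(lr) + base
```

Because the interpolated image already carries the coarse structure, the learned module is free to spend its capacity on the missing high-frequency patterns.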
Our framework is built following the idea of the super-resolution feedback network (SRFBN), which introduces an RNN to refine the low-level representations with high-level information. We simplify the feedback block of SRFBN and introduce an effective transfer learning strategy to build the super-resolution microscopy.

3.1 The architecture of SRFBN
In this section, we review the super-resolution feedback network (SRFBN) [18] and discuss its advantages and disadvantages.

3.1.1 Feedback mechanism
The feedback mechanism allows the lower layers of a network to consult the information of higher layers. SRFBN [18] utilizes a recurrent neural network (RNN) regime to obtain the feedback information, as shown in Fig. 1. As described in [18], the feedback block is implemented with several groups of convolutions and de-convolutions, whose outputs are connected by several dense connections [2]. With the dense connections, the convolution outputs from different layers are concatenated as inputs of the de-convolutions, and the de-convolution outputs are concatenated as inputs of the convolutions as well. The processing at iteration t is defined by

L_g^t = D_g([H_i^t | i = 1, 2, ..., g]),   L_0^t = D_0([F_out^{t-1}, F_in^t])    (2)

H_g^t = U_g([L_j^t | j = 0, 1, 2, ..., g - 1])    (3)

where L indicates the low-level features and H indicates the high-level features; g = 1, 2, 3, ..., G, and G is the number of convolutions or de-convolutions (they are built in pairs) in a feedback block. The functions U_g and D_g are the up-sampling and down-sampling operations, implemented by de-convolutions and convolutions, respectively. An illustration is shown in Fig. 2.

3.1.2 The advantages and disadvantages of SRFBN
The major module of SRFBN is the feedback block, which is built primarily on dense connections [2], as shown in Eq. 2 and Eq. 3. The dense connections are utilized for information transmission among the different layers (convolutions and de-convolutions) in the feedback block. Besides, outside the feedback block, the feedback connections allow information to flow from the higher-level layers to the lower-level layers. The dense connection is specified by

x_{i+j} = f_{i+j-1}([f_dense(x_i), x_i])    (4)

where j denotes the span of the dense connection, and the functions f_{i+j-1} and f_dense can be implemented by neural layers. The dense connections in the feedback block help to capture variant spatial information. Also, the detail information from the lower-level feature maps is retained via the dense skip connections, which helps carry the vital semantics to the final reconstructions.

However, the dense-connection-based feedback block brings lots of parameters, as shown in Eq. 2 and Eq. 3. Dense connections were proved useful in DenseNet [2], but the large model size makes the model bloated. For example, in Eq. 3, if one convolution/de-convolution pair is added, the computation grows substantially, as every D_g or U_g except D_0 is involved. In this view, making this feedback block even a bit deeper increases the computation tremendously. Following this design, SRFBN is not flexible, as going deeper with the feedback module brings massive parameters.
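To make Eqs. 2-4 concrete, here is a simplified sketch of such a dense feedback block (our own reading of the equations, not the authors' implementation; the kernel geometry, chosen so that the paired strided convolutions and de-convolutions invert each other, is an assumption):

```python
import torch
import torch.nn as nn

class DenseFeedbackBlock(nn.Module):
    """Sketch of the SRFBN feedback block: G conv/deconv pairs, densely connected."""

    def __init__(self, channels: int, G: int = 6, scale: int = 4):
        super().__init__()
        k, s, p = scale + 4, scale, 2  # deconv/conv geometry (assumed)
        # D_0 in Eq. 2: fuse F_out^{t-1} and F_in^t (a 1x1 compression here)
        self.compress_in = nn.Conv2d(2 * channels, channels, kernel_size=1)
        # U_g (Eq. 3): de-convolutions, input = concat of L_0 .. L_{g-1}
        self.ups = nn.ModuleList(
            nn.ConvTranspose2d(g * channels, channels, k, s, p)
            for g in range(1, G + 1))
        # D_g (Eq. 2): strided convolutions, input = concat of H_1 .. H_g
        self.downs = nn.ModuleList(
            nn.Conv2d(g * channels, channels, k, s, p)
            for g in range(1, G + 1))

    def forward(self, f_in, f_out_prev):
        # L_0^t = D_0([F_out^{t-1}, F_in^t])
        L = [self.compress_in(torch.cat([f_out_prev, f_in], dim=1))]
        H = []
        for up, down in zip(self.ups, self.downs):
            H.append(up(torch.cat(L, dim=1)))    # H_g^t = U_g([L_j^t | j < g])
            L.append(down(torch.cat(H, dim=1)))  # L_g^t = D_g([H_i^t | i <= g])
        return L[-1]  # F_out^t, fed back at the next iteration
```

Note how adding one more conv/deconv pair widens the concatenated input of every later U_g and D_g; this is exactly the parameter blow-up discussed above.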
, Fin ])


3.2 SRFBN+
SRFBN is a current state-of-the-art framework for super-resolution tasks; we discussed its main advantages and disadvantages in Sec. 3.1.2. The reason SRFBN succeeds is that it draws on the experience of previous works (e.g., using residual connections and dense connections, as in Eq. 2 and Eq. 3) and proposes the feedback mechanism. The disadvantage of SRFBN is that it uses too many dense connections [2] (as shown in Fig. 2). In this paper, we follow the idea of SRFBN and modify the feedback block to make the model lighter and expandable.

3.2.1 Analysis
SRFBN is performed in two steps: (1) up-sampling a low-resolution biopsy slice image by bi-linear interpolation [45]; (2) learning the residual patterns with the main part of the framework. The residual learning target is defined by HR − I(LR), where I(·) denotes the bi-linear interpolation function, and HR and LR denote the high-resolution slice image and the corresponding low-resolution one, respectively. This design was proved useful in previous work [10] and in SRFBN, so keeping the residual learning manner in SRFBN+ is a natural choice. With the residual connection, the goal of our framework is clear and straightforward: to learn the residuals. The intricate patterns that are not related to the residuals will not occupy computing resources, so the main part of the deep learning framework can focus on supplying the residual patterns.

Also, the dense connections and the feedback blocks were proved useful in SRFBN [18] (discussed in Sec. 3.1.2). Although the dense connections bring lots of parameters, they also promote the excellent performance of SRFBN. There are two main reasons to keep using dense connections. First, reconstructing high-resolution biopsy slice images needs multi-level semantics: the lower-level features may provide details, while the higher-level features may provide global semantics. In this view, dense connections (e.g., the orange line shown in Fig. 6) help to transmit the semantics of different levels, promoting the performance of our framework. Second, dense connections enable models to fit complex data distributions. In practice, the data-fitting capability of each model/module is limited; it is evident that long-tailed data is hard to fit, as discussed in much of the literature [46], [47], [48], [49]. In our opinion, with these dense connections a complex data distribution can be learned as a mixture of simpler distributions, and the branches of the dense connections can each learn a simple distribution. As medical images (e.g., biopsy slice images) follow more complex patterns than natural images, the dense connections are very beneficial in super-resolution tasks for slice images.

Fig. 3. An illustration of the channel fusion operation on two RGB slice images. After the channels are split, the red channel of the slice image from the source domain and the green and blue channels of the slice image from the target domain are selected (at random), and a slice image of the middle domain is generated.
3.2.2 The Details of SRFBN+
Following the insights in Sec. 3.2.1, SRFBN+ is proposed by modifying SRFBN. As shown in Fig. 6, the framework utilizes the residual connection and the feedback blocks. The low-resolution biopsy slice images are up-sampled to the size of the high-resolution images by bi-linear interpolation, and the framework focuses on learning the residual patterns. To make the feedback block lighter and the framework expandable, a feedback block [18] is split into several smaller ones (e.g., feedback blocks 1, 2, and 3 in Fig. 6); in other words, we replace a large feedback block with several tiny ones and introduce more feedback connections. Also, to retain the captured lower-level features, skip connections (shown as the brown arrows in Fig. 6) are added to connect the blocks. As shown in Fig. 4, some one-by-one convolutions are added to limit the sizes of the feature maps in the feedback block, which helps keep the model light. If a deeper network is needed, we can stack more feedback blocks in sequence, and the skip connections (the brown arrows in Fig. 6) transmit the features learned by every feedback block to the higher layers. In this paper, each feedback block of SRFBN+ is half the size of the feedback block of SRFBN, and the blocks are stacked as shown in Fig. 6. SRFBN+ is trained under the MSE loss function, defined by

L_MSE = (1/n) Σ_i (y_i^HR − y_i^*)^2    (5)

where y^HR denotes a high-resolution biopsy slice image, y^* is the corresponding generated super-resolution biopsy slice image, and n is the batch size. The MSE loss is a typical loss used in many image reconstruction tasks.
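The snippet below sketches our reading of this modification (Fig. 6); it is illustrative rather than the released implementation, and the helper names (`make_block`, the 1×1 alignment convolutions) are assumptions:

```python
import torch
import torch.nn as nn

class StackedFeedback(nn.Module):
    """Sketch: several lighter feedback blocks replace one large block.

    Each block keeps its own recurrent state (its F_out^{t-1}); 1x1
    convolutions realign channels where a skip from the block input is
    concatenated back in, keeping the feature maps small.
    """

    def __init__(self, make_block, channels: int, n_blocks: int = 3):
        super().__init__()
        self.blocks = nn.ModuleList(make_block() for _ in range(n_blocks))
        self.aligns = nn.ModuleList(
            nn.Conv2d(2 * channels, channels, kernel_size=1)
            for _ in range(n_blocks - 1))

    def forward(self, f_in, states):
        # `states[i]` is block i's output from the previous RNN iteration
        x, new_states = f_in, []
        for i, block in enumerate(self.blocks):
            x = block(x, states[i])
            new_states.append(x)
            if i < len(self.aligns):
                # skip connection carrying lower-level features forward
                x = self.aligns[i](torch.cat([x, f_in], dim=1))
        return x, new_states
```

Training then simply minimizes Eq. 5, e.g., `loss = F.mse_loss(sr, hr)` in PyTorch.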


Fig. 4. An illustration of the feedback block of SRFBN+. The green circles denote the one-by-one convolutions, which are optional and are used to align the number of channels.

3.3 Channel Fusion Transfer Learning
As mentioned in Sec. 2, transitive transfer learning [41] is a promising strategy for transferring knowledge from the source domain to the target domain when the data manifolds are not similar enough. However, the middle domain in [41] is a third dataset, which needs careful selection. Motivated by Mixup [19], we propose a novel transitive transfer learning method that generates the middle domain from the source domain and the target domain, as shown in Fig. 5. If a middle domain can be built by fusing the information of the source domain and the target domain and used as a springboard, the process of transferring knowledge between "distant domains" becomes more stable.

3.3.1 Channel Fusion Operation
We generate the dataset of the middle domain by a new operation, called Channel Fusion. Imagine a slice s from the source domain and a slice t from the target domain, from which a slice of the middle domain can be generated. The data manifolds in different domains have different inherent data patterns; in this paper, the source domain and the target domain are both datasets of biopsy slice images (but very different ones). Assume that the slice images are all RGB (red, green, blue) images. Middle-domain data generation proceeds in several steps. First, one group is randomly selected from these eight options: [R_s, G_s, B_t], [R_s, G_t, B_t], [R_s, G_t, B_s], [R_t, G_t, B_s], [R_t, G_s, B_t], [R_t, G_s, B_s], [R_s, G_s, B_s], and [R_t, G_t, B_t]; the chance of picking each option is identical. The subscript indicates where each channel comes from: for example, the option [R_s, G_s, B_t] means selecting the red and green channels from slice s and the blue channel from slice t. Then a new slice is generated, with its channels coming from different original slices. An example is illustrated in Fig. 3. In this paper, the source domain consists of colonoscopy slice images and the target domain of ovarian slice images; the middle-domain example in Fig. 3 is generated with the red channel from a colonoscopy slice image and the green and blue channels from an ovarian slice image.
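A minimal sketch of this operation might look as follows (the function name and array layout are ours; in practice the same option would presumably be applied to both the low- and high-resolution image of a training pair):

```python
import random
import numpy as np

def channel_fusion(source_img: np.ndarray, target_img: np.ndarray,
                   option=None) -> np.ndarray:
    """Build one middle-domain slice from (H, W, 3) source/target RGB images.

    Each of the 8 options assigns every RGB channel to the source ('s') or
    the target ('t') slice; if none is given, one is drawn uniformly.
    """
    assert source_img.shape == target_img.shape
    if option is None:
        options = [(r, g, b) for r in "st" for g in "st" for b in "st"]
        option = random.choice(options)  # e.g. ('s', 's', 't') = [R_s, G_s, B_t]
    pools = {"s": source_img, "t": target_img}
    channels = [pools[origin][:, :, c] for c, origin in enumerate(option)]
    return np.stack(channels, axis=-1)
```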
Fig. 5. An illustration of Channel Fusion Transfer Learning. The distributions marked in orange and green represent the distributions of two different domains, and the distribution marked in blue is generated by fusing the two. Arrow (a) denotes direct transfer learning, while (b) and (c) denote the two transfer learning stages of CF-Trans.

3.3.2 Analysis
One may doubt whether the slices of the middle domain make sense, as the generation procedure is simple but unusual. We now analyze the advantages of using the middle domain. First, as mentioned before, the generated slices combine knowledge from the source domain and the target domain, under the assumption that samples from one dataset are identically distributed; hence, the patterns of the source slices are highly preserved when transferring a framework to the middle domain. As one can see in Fig. 5, it is difficult to transfer knowledge directly from the distribution marked in orange to the distribution marked in green. With the middle domain, represented by the distribution marked in blue, the transfer learning process can be divided into two stages, each of which is easy.

Also, this method carries an extra channel-independence assumption. Consider two slice pairs, {s, S} and {t, T}, from the source domain and the target domain, respectively, where the capitals indicate the high-resolution slice images and the lowercase letters the low-resolution ones. We denote by f(·) a deep learning super-resolution model. The training constraints are

f(s) = S,   f(t) = T    (6)

Consider a toy example in which s, S, t, and T have only two channels. Plugging into Eq. 6, we have

f([s_1, s_2]) = [S_1, S_2],   f([t_1, t_2]) = [T_1, T_2]    (7)


Fig. 6. The left part is the main architecture of SRFBN+, unrolled over iterations (LR → 3×3 Conv → 1×1 Conv → feedback blocks 1-3 → up-sample/deconvolution → 3×3 Conv → SR). The puzzle arrows are feedback connections. Feedback blocks 1, 2, and 3 are lighter feedback blocks than the feedback block in SRFBN.

When training in the middle domain, one gets four equations in total:

f([s_1, s_2]) = [S_1, S_2]
f([t_1, s_2]) = [T_1, S_2]
f([t_1, t_2]) = [T_1, T_2]
f([s_1, t_2]) = [S_1, T_2]    (8)

So an equation is obtained:

f([s_1, s_2]) − f([t_1, s_2]) = [S_1 − T_1, 0]    (9)

As the function f(·) (a deep learning model) is a complex function, we assume it can be reformulated as

f([x_1, x_2]) = g(x_1) + g(x_2) + g(x_1) ⊙ g(x_2)    (10)

where x ∈ {s, t}; the first term on the right side is a function of the first channel x_1 only, the second term is a function of the second channel only, and the third term is an interaction term modelling the dependence of x_1 and x_2. If x_1 and x_2 are independent of each other, the third term equals zero. Plugging Eq. 10 into Eq. 9, we have

g(s_1) + g(s_2) + g(s_1) ⊙ g(s_2) − (g(t_1) + g(s_2) + g(t_1) ⊙ g(s_2)) = [S_1 − T_1, 0]    (11)

and then

g(s_1) − g(t_1) − [S_1 − T_1, 0] = g(t_1) ⊙ g(s_2) − g(s_1) ⊙ g(s_2)    (12)

The right side of Eq. 12 depends on s_2, but the left side does not. As a dataset contains various images while the left side stays fixed, g(s_2) must be independent of g(s_1) and g(t_1), and both sides of Eq. 12 equal a constant. Besides, another four equations can be obtained, similar to Eq. 9. Based on these equations, one can reason that s_1 is independent of s_2 and t_1 is independent of t_2, so that f(x_1) = X_1 and f(x_2) = X_2. Furthermore, the conclusion can be extended when the number of channels of the slice images is larger than two.

Based on the analysis above, the conclusion is that training in the middle domain requires the function f(·) to learn under the specification of channel-independence. In this way, the framework is required to reconstruct every high-resolution channel from the corresponding low-resolution one. In biopsy slices, every channel contains useful information; CF-Trans makes it possible to require the deep learning framework to perform well on every channel, which is beneficial in the biomedicine field.

3.4 Training pipeline
The pipeline of training the whole framework is illustrated in Fig. 7. Assume there are two domains, a source domain and a target domain. First, we train a deep learning framework to convergence on the source domain, so the knowledge of the source domain is learned. Next, we generate the middle domain: a slice in the source domain and a slice in the target domain are randomly selected and fused following one fusing option, as described in the previous section; repeating this procedure creates a middle-domain dataset. Then the deep learning model is trained on the built dataset (the middle domain) to convergence again, starting from the pre-trained parameters learned on the source domain. Finally, the model is trained on the target domain with the pre-trained parameters learned on the middle domain. In this paper, the deep learning model is the SRFBN+ framework, and the source domain and the target domain are the datasets containing the colonoscopy biopsy slice images and the ovarian biopsy slice images, respectively.
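A compact sketch of the three-stage schedule (hypothetical helper names; `train_fn` stands for any train-to-convergence routine, and `fuse_fn` builds the middle domain by channel fusion of randomly paired slices):

```python
def cf_trans(model, source_set, target_set, train_fn, fuse_fn):
    """CF-Trans: source -> middle -> target, reusing parameters at each stage."""
    model = train_fn(model, source_set)           # stage 1: source domain
    middle_set = fuse_fn(source_set, target_set)  # channel-fused middle domain
    model = train_fn(model, middle_set)           # stage 2: middle domain
    model = train_fn(model, target_set)           # stage 3: target domain
    return model
```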
4 EXPERIMENTS

4.1 Dataset
To verify the performance of SRFBN+ and the Channel Fusion Transfer Learning strategy, we conduct experiments on two datasets corresponding to two domains.


Fig. 7. The pipeline of training our proposed super-resolution microscopy with Channel Fusion Transfer Learning: the middle domain is first generated (left), and the transfer learning process is split into two domain-transfer stages, source domain → middle domain → target domain (right). The "X" shape indicates a deep super-resolution model.

One contains colonoscopy slice image pairs, and the other contains ovarian slice image pairs. Both datasets consist of cancer slice images; the high-resolution image in each pair is captured with a high-power lens, and the low-resolution one is generated by down-sampling the high-resolution image. The two datasets are very different, as shown in Fig. 8. The colonoscopy slice images in Fig. 8 (a) are taken by a simple optical microscope after cell staining, while the ovarian slice images in Fig. 8 (b) are captured by a high-power electron microscope following multiphoton infrared imaging; the color in the ovarian slice images is natural FDA fluorescence. As the equipment is more advanced, the ovarian slices are more precious than the colonoscopy slices. The colonoscopy slice dataset has 100 slice pairs, whose higher-resolution slices are 8000 × 8000 pixels and whose lower-resolution ones are 2000 × 2000 pixels after resizing. The ovarian slice dataset has only 25 slice image pairs, whose higher-resolution slices are 8000 × 8000 pixels and whose lower-resolution ones are 2000 × 2000 pixels. The receptive fields are aligned. We randomly divide each dataset 8 : 2, with 80% of the slice images as the training set and 20% as the validation set.

Fig. 8. Examples of a colonoscopy biopsy slice (a) and an ovarian biopsy slice (b). It is obvious that these two biopsy slices are very different.

We use the RGB images as input. Before being fed into the model, each image is normalized channel-wise by

I' = (I − I_m) / σ_I    (13)

where I is the original slice, I' is the normalized slice, and I_m and σ_I are the mean and the standard deviation over the whole dataset, computed per channel. To save device storage, the slices are cropped with a 512 × 512 sliding window on the high-resolution slices with a stride of 256; the patches on the low-resolution slices are cropped by a 128 × 128 sliding window with a stride of 64. The patches therefore overlap. At inference, the overlapping fragments are averaged to produce the final result:

I_p' = (I_p1 + I_p2) / 2    (14)

where I_p1 and I_p2 are the predicted overlapping areas of two neighboring patches, and I_p' is the final result for the overlapping area.
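Eq. 13 and Eq. 14 might be implemented along the following lines (a sketch with our own helper names; the accumulate-and-count form generalizes the two-patch average of Eq. 14 to any number of overlapping patches):

```python
import numpy as np

def normalize(img: np.ndarray, mean: np.ndarray, std: np.ndarray) -> np.ndarray:
    """Eq. 13: channel-wise normalization with per-channel dataset statistics."""
    return (img - mean) / std  # mean/std broadcast over an (H, W, 3) array

def blend_patches(patches, coords, out_shape, size=512):
    """Eq. 14: average predictions wherever overlapping patches meet."""
    acc = np.zeros(out_shape, dtype=np.float64)
    cnt = np.zeros(out_shape, dtype=np.float64)
    for patch, (y, x) in zip(patches, coords):
        acc[y:y + size, x:x + size] += patch
        cnt[y:y + size, x:x + size] += 1
    return acc / np.maximum(cnt, 1)
```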
4.2 Experimental Setups

4.2.1 Model Setups
We implement SRFBN+ and the Channel Fusion Transfer Learning strategy in Python 3.6; SRFBN+ is implemented with PyTorch 1.0. In our experiments, we use the colonoscopy slice image dataset as the source domain and the ovarian slice images as the target domain, and the middle domain is built following the channel fusion operation. SRFBN+ is first trained on colonoscopy slices until convergence, then transferred to the middle domain with the learned parameters, and finally trained on the target domain with the parameters learned on the middle domain. Our model is trained on both datasets with a batch size of 32 on four Titan 2080 GPUs, each GPU holding eight patches. The optimizer of SRFBN+ is Adam [50]. The learning rate is initialized to 10^-4 with a cosine learning rate schedule. The whole training procedure takes about 96 hours, with about 500 epochs in every domain (dataset).
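Under the stated settings, the optimization setup might look like this (a sketch, not the authors' training script; `model` and `loader` are assumed to be defined elsewhere):

```python
import torch
import torch.nn.functional as F

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# cosine learning-rate schedule over the ~500 epochs of one domain
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=500)

for epoch in range(500):
    for lr_patch, hr_patch in loader:    # batch size 32 in the paper
        sr = model(lr_patch)
        loss = F.mse_loss(sr, hr_patch)  # Eq. 5
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```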


4.2.2 Transfer Learning Setups
The transfer learning strategies in comparison are: (A) direct learning (DL) on the ovarian biopsy slices without transfer learning; (B) direct transfer learning (DTL) from the colonoscopy dataset to the ovarian dataset; (C) transfer learning on the third dataset (TLTD); and (D) Channel Fusion Transfer Learning (CF-Trans) from the colonoscopy dataset to the ovarian dataset. To further examine the use of the middle domain, a third dataset containing gastric biopsy slice images is used as the middle domain in (C). We perform TLTD in these steps: (1) train SRFBN+ on the source domain; (2) train SRFBN+ on the middle domain (the gastric biopsy slice image dataset); (3) finally, train SRFBN+ on the target domain. As mentioned in Sec. 3.4, after convergence, SRFBN+ is trained on the next dataset (domain) using the parameters obtained on the current dataset as initialization.

4.3 Metrics
As in other methods, we evaluate performance with the two most common metrics for deep super-resolution tasks: the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM), which measure reconstruction quality.
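For reference, PSNR on images scaled to [0, 1] reduces to the following standard computation (not specific to this paper; SSIM is more involved and is typically taken from an image-processing library):

```python
import numpy as np

def psnr(sr: np.ndarray, hr: np.ndarray, peak: float = 1.0) -> float:
    """Peak Signal-to-Noise Ratio in dB; higher means better reconstruction."""
    mse = np.mean((sr.astype(np.float64) - hr.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```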
4.4 Quantitative evaluations

4.4.1 Comparing Transfer Learning Strategies
The performances of several transfer learning strategies are shown in Table 1. The strategies are compared on the validation set of ovarian biopsy slice images under equal conditions, all using SRFBN+ as the model; the averages of 5 runs are reported. As shown in Table 1, the results of DL and DTL are almost the same, which suggests that the two domains (colonoscopy and ovarian biopsy slices) are so different that direct transfer learning between the two datasets is difficult. Compared with direct transfer learning, CF-Trans obtains a large improvement of around 1% in SSIM and around 1.1% in PSNR. Note that the middle domain is built by fusing information from the source domain and the target domain, without introducing outside information. This result proves that introducing the middle domain via the channel fusion strategy makes the transfer learning process more effective, which largely helps the model in the super-resolution microscopy task. The result of TLTD in Table 1 shows that the data manifolds of the source domain, the third data domain, and the target domain are very different, so using a third dataset as the middle domain is not very helpful in transitive transfer learning.

TABLE 1
Quantitative evaluation of the transfer learning strategies (DL = Direct Learning, DTL = Direct Transfer Learning, TLTD = Transfer Learning on the Third Dataset, CF-Trans = Channel Fusion Transfer Learning).

        DL      DTL     TLTD    CF-Trans
PSNR    23.73   23.89   23.90   25.04
SSIM    0.7004  0.7121  0.7128  0.7213
4.4.2 Comparing Super-Resolution Frameworks
To verify the effect of SRFBN+, it is compared with other current state-of-the-art frameworks (e.g., ResLF [51], SRFBN [18], EDSR [13]) under the direct learning strategy. The results are shown in Table 2. Our model outperforms the other models on the ovarian slice image super-resolution task. Besides, SRFBN+ outperforms SRFBN by around 0.1% in PSNR and SSIM, which verifies that our modification is effective.

TABLE 2
Quantitative evaluation of the super-resolution frameworks.

        SRFBN+  SRFBN   resLF   EDSR
PSNR    23.73   23.61   23.11   22.85
SSIM    0.7004  0.6943  0.6848  0.6821

4.5 Visualization
To show the effects of SRFBN+ and Channel Fusion Transfer Learning, some visualization results on ovarian slice images are illustrated in Fig. 9 and Fig. 10. The blue frames on the high-resolution slice images mark the patches shown in detail. Comparing the super-resolution patch with the high-resolution patch, it can be seen that our framework is advantageous in retaining the patterns of slice images. As shown in Fig. 9 and Fig. 10, the strip textures are well restored, retaining direction and shape information.

5 CONCLUSIONS
In this paper, we propose a deep learning framework, SRFBN+, and a novel transfer learning strategy called Channel Fusion Transfer Learning to generate high-resolution slice images from low-resolution slice images. We modify SRFBN [18] to make the architecture lighter and more flexible, showing strong performance in super-resolution slice generation. The novel transfer learning strategy generates a middle domain for easier transfer. The proposed methods are proved useful by experiments. Besides, the Channel Fusion Transfer Learning strategy and its variants can be applied in other fields.

ACKNOWLEDGMENTS
The research of the Real Doctor AI Research Centre was partially supported by the Zhejiang University Education Foundation under grants No. K18-511120-004, No. K17-511120-017, and No. K17-518051-021, the National Natural Science Foundation of China under grant No. 61672453, the National Key R & D Program sub-project "large scale cross-modality medical knowledge management" under grant No. 2018AAA0102100, the Zhejiang public welfare technology research project under grant No. LGF20F020013, the National Key R & D Program Project "Software Testing Evaluation Method Research and its Database Development on Artificial Intelligence Medical Information System" under the Fifth Electronics Research Institute of the Ministry of Industry and Information Technology (No. 2019YFC0118802), the National Key R & D Program Project "Full Life Cycle Detection Platform and Application Demonstration of Medical Artificial Intelligence Product" under the National Institutes for Food and Drug Control (No. 2019YFB1404802), and the Key Laboratory of Medical Neurobiology of Zhejiang Province.


Fig. 9. A visualization result of SRFBN+ on an ovarian slice (panels: high-resolution image, high-resolution patch, super-resolution patch); the patch is shown enlarged so the details can be inspected.

Fig. 10. Another visualization result of SRFBN+ on an ovarian slice (panels: high-resolution image, high-resolution patch, super-resolution patch); the patch is shown enlarged so the details can be inspected.

(No. 2019YFC0118802), and The National Key R & D [6] B. B. Mandelbrot, “A fast fractional gaussian noise generator,”
Program Project of “Full Life Cycle Detection Platform and Water Resources Research, vol. 7, no. 3, pp. 543–553, 1971.
[7] S. Fleishman, I. Drori, and D. Cohen-Or, “Bilateral mesh denois-
Application Demonstration of Medical Artificial Intelligence ing,” in ACM transactions on graphics (TOG), vol. 22, no. 3. ACM,
Product” under the National Institutes for Food and Drug 2003, pp. 950–953.
Control (No. 2019YFB1404802), and the Key Laboratory of [8] C. Dong, C. C. Loy, K. He, and X. Tang, “Learning a deep
Medical Neurobiology of Zhejiang Province. convolutional network for image super-resolution,” in European
conference on computer vision. Springer, 2014, pp. 184–199.
[9] C. Dong, C. C. Loy, and X. Tang, “Accelerating the super-
resolution convolutional neural network,” in European conference
on computer vision. Springer, 2016, pp. 391–407.
R EFERENCES [10] J. Kim, J. Kwon Lee, and K. Mu Lee, “Accurate image super-
resolution using very deep convolutional networks,” in Proceed-
[1] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for ings of the IEEE conference on computer vision and pattern recognition,
image recognition,” in Proceedings of the IEEE conference on computer 2016, pp. 1646–1654.
vision and pattern recognition, 2016, pp. 770–778. [11] M. Haris, G. Shakhnarovich, and N. Ukita, “Deep back-projection
[2] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, networks for super-resolution,” in Proceedings of the IEEE conference
“Densely connected convolutional networks,” in Proceedings of the on computer vision and pattern recognition, 2018, pp. 1664–1673.
IEEE conference on computer vision and pattern recognition, 2017, pp. [12] J. Kim, J. Kwon Lee, and K. Mu Lee, “Deeply-recursive convolu-
4700–4708. tional network for image super-resolution,” in Proceedings of the
[3] L. Theis, W. Shi, A. Cunningham, and F. Huszár, “Lossy im- IEEE conference on computer vision and pattern recognition, 2016, pp.
age compression with compressive autoencoders,” arXiv preprint 1637–1645.
arXiv:1703.00395, 2017. [13] B. Lim, S. Son, H. Kim, S. Nah, and K. Mu Lee, “Enhanced
[4] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, deep residual networks for single image super-resolution,” in
A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang et al., “Photo- Proceedings of the IEEE conference on computer vision and pattern
realistic single image super-resolution using a generative adver- recognition workshops, 2017, pp. 136–144.
sarial network,” in Proceedings of the IEEE conference on computer [14] Y. Nagano and Y. Kikuta, “Srgan for super-resolving low-
vision and pattern recognition, 2017, pp. 4681–4690. resolution food images,” in Proceedings of the Joint Workshop on
[5] J. Leskovec and C. Faloutsos, “Sampling from large graphs,” in Multimedia for Cooking and Eating Activities and Multimedia Assisted
Proceedings of the 12th ACM SIGKDD international conference on Dietary Management. ACM, 2018, pp. 33–37.
Knowledge discovery and data mining. ACM, 2006, pp. 631–636. [15] Y. Rivenson, Z. Göröcs, H. Günaydin, Y. Zhang, H. Wang, and

1545-5963 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: Carleton University. Downloaded on August 02,2020 at 15:00:03 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCBB.2020.2991173, IEEE/ACM
Transactions on Computational Biology and Bioinformatics
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, APRIL 2020 11

A. Ozcan, “Deep learning microscopy,” Optica, vol. 4, no. 11, pp. [37] Q. Sun, L. Ma, S. Joon Oh, L. Van Gool, B. Schiele, and M. Fritz,
Jintai Chen received the B.S. degree in applied statistics from the Zhongnan University of Economics and Law. He is currently working toward the Ph.D. degree in the College of Computer Science, Zhejiang University. His research interests include deep learning, machine learning and data mining, especially computer vision, graph neural networks and medical intelligence.

Haochao Ying is currently an assistant professor in the School of Public Health, Zhejiang University. He received the Ph.D. degree in the College of Computer Science from Zhejiang University in 2019, and the B.S. degree in computer science and technology from Zhejiang University of Technology in 2014. His research interests include data mining for healthcare and personalized recommender systems. He has authored papers in prestigious international conferences and journals, such as the World Wide Web Journal, IJCAI, CVPR, WSDM and PAKDD.

XueChen Liu received the B.S. degree in software engineering from Zhejiang University in 2018. He is currently working toward a postgraduate degree in the College of Computer Science, Zhejiang University. His research interests include computer vision, deep learning based medical intelligence, and computer-aided diagnosis.

Jingjing Gu received the B.E. degree in computer science and the Ph.D. degree in computer science and technology from the Nanjing University of Aeronautics and Astronautics (NUAA), China, in 2005 and 2011, respectively. She is currently an Associate Professor with the Institute of Artificial Intelligence and Pattern Computing, NUAA. Her current research interests include data mining and mobile computing.

Ruiwei Feng received the B.S. degree from North China Electric Power University in 2018. She is currently working toward the Ph.D. degree in the College of Computer Science, Zhejiang University. Her research interests include machine learning, deep learning, computer vision and medical intelligence.

Tingting Chen received the B.S. degree in computer science from Southwest University in 2017. She is currently working toward the Ph.D. degree in the College of Computer Science, Zhejiang University. Her research interests include machine learning, deep learning, computer vision and medical intelligence.

Honghao Gao received the Ph.D. degree in Computer Science and started his academic career at Shanghai University in 2012. He is an IET Fellow, BCS Fellow, EAI Fellow, IEEE Senior Member, CCF Senior Member, and CAAI Senior Member. Dr. Gao is currently a Professor with the Key Laboratory of Complex Systems Modeling and Simulation, Ministry of Education, China. He is also a Research Fellow with the Software Engineering Information Technology Institute of Central Michigan University, USA, and a Professor at Gachon University, South Korea. His research interests include service computing, model checking-based software verification, wireless networks, and intelligent medical image processing.

Jian Wu received the Ph.D. degree in Computer Science and Technology from Zhejiang University in 1998. He is an IEEE member, CCF member, CCF TCSC member, CCF TCAPP member, and a member of the "151 Talent Project of Zhejiang Province". Prof. Jian Wu is currently the director of the Research Centre of Zhejiang University and Vice-president of the National Research Institute of Big Data of Health and Medical Sciences of Zhejiang University. His research interests include Medical Artificial Intelligence, Service Computing and Data Mining.