Professional Documents
Culture Documents
Combining Noise-To-Image and Image-To-Image GANs Brain MR Image Augmentation For Tumor Detection
Combining Noise-To-Image and Image-To-Image GANs Brain MR Image Augmentation For Tumor Detection
November 7, 2019.
Digital Object Identifier 10.1109/ACCESS.2019.2947606
ABSTRACT Convolutional Neural Networks (CNNs) achieve excellent computer-assisted diagnosis with
sufficient annotated training data. However, most medical imaging datasets are small and fragmented.
In this context, Generative Adversarial Networks (GANs) can synthesize realistic/diverse additional training
images to fill the data lack in the real image distribution; researchers have improved classification by
augmenting data with noise-to-image (e.g., random noise samples to diverse pathological images) or image-
to-image GANs (e.g., a benign image to a malignant one). Yet, no research has reported results combining
noise-to-image and image-to-image GANs for further performance boost. Therefore, to maximize the
DA effect with the GAN combinations, we propose a two-step GAN-based DA that generates and refines
brain Magnetic Resonance (MR) images with/without tumors separately: (i) Progressive Growing of GANs
(PGGANs), multi-stage noise-to-image GAN for high-resolution MR image generation, first generates
realistic/diverse 256×256 images; (ii) Multimodal UNsupervised Image-to-image Translation (MUNIT) that
combines GANs/Variational AutoEncoders or SimGAN that uses a DA-focused GAN loss, further refines
the texture/shape of the PGGAN-generated images similarly to the real ones. We thoroughly investigate
CNN-based tumor classification results, also considering the influence of pre-training on ImageNet and
discarding weird-looking GAN-generated images. The results show that, when combined with classic DA,
our two-step GAN-based DA can significantly outperform the classic DA alone, in tumor detection
(i.e., boosting sensitivity 93.67% to 97.48%) and also in other medical imaging tasks.
INDEX TERMS Data augmentation, synthetic image generation, GANs, brain MRI, tumor detection.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
156966 VOLUME 7, 2019
C. Han et al.: Combining Noise-to-Image and Image-to-Image GANs: Brain MR Image Augmentation for Tumor Detection
TABLE 1. PGGAN architecture details for the generator/discriminator. Pixelwise feature vector normalization [28] is applied in the generator after each
convolutional layer except for the final output layer as in the original paper [10]. LReLU denotes Leaky ReLU with leakiness 0.2.
towards the generated samples and λgp is the gradient penalty generated images’ realism/diversity via a stochastic model
coefficient. representing continuous output distributions.
We train the model (Table 1) for 100 epochs with a batch MUNIT Implementation Details The MUNIT architecture
size of 16 and 1.0×10−3 learning rate for the Adam optimizer adopts the following loss:
(the exponential decay rates β1 = 0, β2 = 0.99) [29]. All
experiments use λgp = 10 with 1 critic iteration per gener- min max L VAE1 + L GAN1 +L CC1 +L VGG1
ator iteration. During training, we apply random cropping in E1 ,E2 ,G1 ,G2 D1 ,D2
0-15 pixels as DA. +L VAE2 +L GAN2 +L CC2 + L VGG2 , (2)
C. MUNIT/SimGAN-BASED IMAGE REFINEMENT where L(·) denotes the loss function. Using the multiple
Refinement Using resized 224 × 224 images for ResNet-50, encoders E1 /E2 , generators G1 /G2 , discriminators D1 /D2 ,
we further refine the texture/shape of PGGAN-generated cycle-consistencies CC1 /CC2 , and domain-invariant percep-
tumor/non-tumor images separately to fit them into the real tions VGG1 /VGG2 [31], this framework jointly solves learn-
image distribution using MUNIT [13] or SimGAN [15]. ing problems of the VAE1 /VAE2 and GAN1 /GAN2 for the
SimGAN remarkably improved eye gaze estimation results image reconstruction streams, image translation streams,
after refining non-GAN-based synthetic images from the cycle-consistency reconstruction streams, and domain-
UnityEyes simulator via image-to-image translation; thus, invariant perception streams. Since we do not need the style
we also expect such performance improvement after refining loss for our experiments, instead of the MUNIT loss, we use
synthetic images from a noise-to-image GAN (i.e., PGGANs) the UNIT loss with the perceptual loss for the MUNIT
via an image-to-image GAN (i.e., MUNIT/SimGAN) with architecture (as in the UNIT authors’ GitHub repository).
considerably different GAN algorithms. We train the model (Table 2) for 100, 000 steps with a
We randomly select 3, 000 real/3, 000 PGGAN-generated batch size of 1 and 1.0 × 10−4 learning rate for the Adam
tumor images for tumor image training, and we perform the optimizer (β1 = 0.5, β2 = 0.999) [29]. The learning rate
same for non-tumor image training. To find suitable refining is reduced by half every 20, 000 steps. We use the follow-
steps for each architecture, we pick the MUNIT/SimGAN ing MUNIT weights: the adversarial loss weight = 1; the
models with the highest accuracy on tumor detection valida- image reconstruction loss weight = 10; the Kullback-Leibler
tion, when pre-trained and combined with classic DA, among (KL) divergence loss weight for reconstruction = 0.01; the
20, 000/50, 000/100, 000 steps, respectively. cycle consistency loss weight = 10; the KL divergence loss
MUNIT [13] is an image-to-image GAN based on both auto- weight for cycle consistency = 0.01; the domain-invariant
encoding/translation; it extends UNIT [30] to increase the perceptual loss weight = 1; the Least Squares GAN objective
TABLE 2. MUNIT architecture details for the generator/discriminator. We input color images (i.e., 3 channels) to use ImageNet initialization. Instance
normalization [32]/adaptive instance normalization [33] are applied in the content encoder/decoder after each convolutional layer respectively
except for the final decoder output layer as in the original paper [13]. LReLU denotes Leaky ReLU with leakiness 0.2.
TABLE 3. SimGAN architecture details for the refiner/discriminator. Batch normalization is applied both in the refiner/discriminator after each
convolutional layer except for the final output layers respectively as in the original paper [15].
function for the discriminators [34]. During training, discriminator, we update the refiner 5 times. During training,
we apply horizontal flipping as DA. we apply horizontal flipping as DA.
SimGAN [15] is an image-to-image GAN designed for DA
that adopts the self-regularization term/local adversarial loss; D. TUMOR DETECTION USING ResNet-50
it updates a discriminator with a history of refined images. Pre-processing. As ResNet-50’s input size is 224 × 224
SimGAN Implementation Details The SimGAN architec- pixels, we resize the whole real images from 240 × 240 and
ture (i.e., a refiner) uses the following loss: whole PGGAN-generated images from 256 × 256.
X ResNet-50 [12] is a 50-layer residual learning-based CNN.
Lreal (θ; xi , Y) + λreg Lreg (θ ; xi ), (3) We adopt it to detect brain tumors in MR images
i (i.e., the binary classification of tumor/non-tumpor images)
where L(·) denotes the loss function, θ is the function param- due to its outstanding performance in image classifi-
eters, xi is the ith PGGAN-generated training image, and Y is cation tasks [36], including binary classification [37].
the set of the real images yj . The first part Lreal adds realism to Chang et al. [38] also used a similar 34-layer residual
the synthetic images using a discriminator, while the second convolutional network for the binary classification of brain
part Lreg preserves the tumor/non-tumor features. tumors (i.e., determining the Isocitrate Dehydrogenase status
We train the model (Table 3) for 20, 000 steps with a batch in low-/high-grade gliomas).
size of 10 and 1.0 × 10−4 learning rate for the Stochastic DA Setups To confirm the effect of PGGAN-based DA
Gradient Descent (SGD) optimizer [35] without momentum. and its refinement using MUNIT/SimGAN, we compare
The learning rate is reduced by half at 15,000 steps. We the following 10 DA setups under sufficient images both
train the refiner first with just the self-regularization loss with with/without ImageNet [16] pre-training (i.e., 20 DA setups):
λreg = 5 × 10−5 for 500 steps; then, for each update of the 1) 8429 real images;
2) + 200k classic DA; weird-looking synthetic images (Fig. 5); DenseNet-169 [43]
3) + 400k classic DA; extracts image features and k-means++ [44] clusters the
4) + 200k PGGAN-based DA; features into 200 groups, and then we manually discard each
5) + 200k PGGAN-based DA w/o clustering/discarding; cluster containing similar weird-looking images. To verify
6) + 200k classic DA & 200k PGGAN-based DA; its effect, we also conduct a PGGAN-based DA experiment
7) + 200k MUNIT-refined DA; without the discarding step.
8) + 200k classic DA & 200k MUNIT-refined DA; ResNet-50 Implementation Details The ResNet-50 archi-
9) + 200k SimGAN-refined DA; tecture adopts the binary cross-entropy loss for binary classi-
10) + 200k classic DA & 200k SimGAN-refined DA. fication both with/without ImageNet pre-training. As shown
Due to the risk of overlooking the tumor diagnosis via in Table 4, for robust training, before the final sigmoid layer,
medical imaging, higher sensitivity matters much more than we introduce a 0.5 dropout [45], linear dense, and batch nor-
higher specificity [39]; thus, we aim to achieve higher sensi- malization [46] layers—training with GAN-based DA tends
tivity, using the additional synthetic training images. We per- to be unstable especially without the batch normalization
form McNemar’s test on paired tumor detection results [40] layer. We use a batch size of 96, 1.0 × 10−2 learning rate
to confirm our two-step GAN-based DA’s statistically- for the SGD optimizer [35] with 0.9 momentum, and early
significant sensitivity improvement; since this statistical anal- stopping of 20 epochs. The learning rate was multiplied
ysis involves multiple comparison tests, we adjust their by 0.1 every 20 epochs for the training from scratch and
p-values using the Holm–Bonferroni method [41]. by 0.5 every 5 epochs for the ImageNet pre-training.
Whereas medical imaging researchers widely use the
ImageNet initialization despite different textures of natural/ E. CLINICAL VALIDATION USING VISUAL TURING TEST
medical images, recent study found that such ImageNet-
To quantify the (i) realism of 224 × 224 synthetic images
trained CNNs are biased towards recognizing texture rather
by PGGANs, MUNIT, and SimGAN against real images
than shape [42]; thus, we aim to investigate how the
respectively (i.e., 3 setups) and (ii) clearness of their tumor/
medical GAN-based DA affects classification performance
non-tumor features, we supply, in random order, to an expert
with/without the pre-training. As the classic DA, we adopt a
physician a random selection of:
random combination of horizontal/vertical flipping, rotation
up to 10 degrees, width/height shift up to 8%, shearing up • 50 real tumor images;
to 8%, zooming up to 8%, and constant filling of points out- • 50 real non-tumor images;
side the input boundaries (Fig. 4). For the PGGAN-based DA • 50 synthetic tumor images;
and its refinement, we only use success cases after discarding • 50 synthetic non-tumor images.
TABLE 5. ResNet-50 tumor detection (i.e., binary classification) results with various DA, with (without) ImageNet pre-training. Sensitivity and specificity
consider the slight tumor/non-tumor class imbalance (about 6:5) in the test set. Boldface indicates the best performance.
TABLE 6. McNemar’s test p-values for the pairwise comparison of the ResNet-50 tumor detection results in terms of accuracy, sensitivity, specificity,
respectively. We compare our two-step GAN-based DA setups and all the other DA setups. All numbers within parentheses refer to DA setups on Table 5
and PT denotes pre-training. Boldface indicates statistical significance (threshold p-value < 0.05).
TABLE 7. Visual Turing Test results by an expert physician for classifying Real (R) vs Synthetic (S) images and Tumor (T) vs Non-tumor (N) images.
Accuracy denotes the physician’s successful classification ratio between the real/synthetic images and between the tumor/non-tumor images,
respectively. It should be noted that proximity to 50% of accuracy indicates superior performance (chance = 50%).
FIGURE 7. T-SNE plots with 300 tumor/non-tumor MR images per each category: Real images vs (a) Geometrically-transformed images;
(b) PGGAN-generated images; (c) MUNIT-refined images; (d) SimGAN-refined images.
DA under sufficient geometrically-transformed train- refinement. Specificity is higher than sensitivity for every
ing images. When pre-trained, each GAN-based DA DA setup with pre-training, probably due to the training
(i.e., PGGANs/MUNIT/SimGAN) alone helps classifica- data imbalance; but interestingly, without pre-training, sen-
tion due to the robustness from GAN-generated images; sitivity is higher than specificity for both image-to-image
but, without pre-training, it harms classification due to the GAN-based DA since our tumor detection-oriented two-step
biased initialization from the GAN-overwhelming data dis- GAN-based DA can fill the real tumor image distribution
tribution. Similarly, without pre-training, PGGAN-based DA uncovered by the original dataset under no ImageNet initial-
without clustering/discarding causes poor classification due ization. Accordingly, when combined with the classic DA,
to the synthetic images with severe artifacts, unlike the the MUNIT-based DA based on both GANs/VAEs achieves
PGGAN-based DA’s comparable results with/without the the highest sensitivity 97.48% against the best performing
discarding step when pre-trained. classic DA’s 93.67%, allowing to significantly alleviate the
When combined with the classic DA, each GAN-based risk of overlooking the tumor diagnosis; in terms of sensitiv-
DA remarkably outperforms the GAN-based DA or classic ity, it outperforms all the other DA setups, including two-step
DA alone in terms of sensitivity since they are mutually- DA setups, with statistical significance.
complementary: the former learns the non-linear manifold
of the real images to generate novel local tumor features D. VISUAL TURING TEST RESULTS
(since we train tumor/non-tumor images separately) strongly Table 7 indicates the confusion matrix for the Visual
associated with sensitivity; the latter learns the geometrically- Turing Test. The expert physician classifies a few
transformed manifold of the real images to cover global PGGAN-generated images as real, thanks to their realism,
features and provide the robustness on training for most despite high resolution (i.e., 224 × 224 pixels); mean-
cases. We confirm that test samples, originally-misclassified while, the expert classifies less GAN-refined images as real
but correctly classified after DA, are obviously different for due to slight artifacts induced during refinement. The syn-
the GAN-based DA and classic DA; here, both image-to- thetic images successfully capture tumor/non-tumor features;
image GAN-based DA, especially MUNIT, produce remark- unlike the non-tumor images, the expert recognizes a consid-
ably higher sensitivity than the PGGAN-based DA after erable number of the mild/modest tumor images as non-tumor
for both real/synthetic cases. It derives from clinical tumor artifacts; however, only without pre-training, discarding them
diagnosis relying on a full 3D volume, instead of a single boosts DA performance.
2D slice. Overall, by minimizing the number of annotated images
required for medical imaging tasks, the two-step GAN-based
DA can shed light not only on classification, but also on
E. t-SNE RESULTS
object detection [49] and segmentation [50]. Moreover, other
As Fig. 7 represents, the real tumor/non-tumor image potential medical applications exist: (i) A data anonymization
distributions largely overlap while the non-tumor images tool to share patients’ data outside their institution for training
distribute wider. The geometrically-transformed tumor/non- without losing detection performance [50]; (ii) A physician
tumor image distributions also often overlap, and both images training tool to show random pathological images for med-
distribute wider than the real ones. All GAN-based synthetic ical students/radiology trainees despite infrastructural/legal
images by PGGANs/MUNIT/SimGAN distribute widely, constraints [51]. As future work, we plan to define a new
while their tumor/non-tumor images overlap much less than end-to-end GAN loss function that explicitly optimizes the
the geometrically-transformed ones (i.e., a high discrimi- classification results, instead of optimizing visual realism
nation ability associated with sensitivity improvement); the while maintaining diversity by combining the state-of-the-
MUNIT-refined images show better tumor/non-tumor dis- art noise-to-image and image-to-image GANs; towards this,
crimination and a more similar distribution to the real ones we might extend a preliminary work on a three-player GAN
than the PGGAN/SimGAN-based images. This trend derives for classification [52] to generate only hard-to-classify sam-
from the MUNIT’s loss function adopting both GANs/VAEs ples to improve classification; we could also (i) explicitly
that further fits the PGGAN-generated images into the real model deformation fields/intensity transformations and
image distribution by refining their texture/shape; contrarily, (ii) leverage unlabelled data during the generative pro-
this refinement could also induce slight human-recognizable cess [53] to effectively fill the real image distribution.
but DA-irrelevant artifacts. Overall, the GAN-based images,
especially the MUNIT-refined images, fill the distribution REFERENCES
uncovered by the real or geometrically-transformed ones with [1] M. Havaei, A. Davy, D. Warde-Farley, A. Biard, A. Courville, Y. Bengio,
less tumor/non-tumor overlap; this demonstrates the superi- C. Pal, P.-M. Jodoin, and H. Larochelle, ‘‘Brain tumor segmentation with
ority of combining classic DA and GAN-based DA. deep neural networks,’’ Med. Image Anal., vol. 35, pp. 18–31, Jan. 2017.
[2] L. Rundo, C. Han, Y. Nagano, J. Zhang, R. Hataya, C. Militello,
A. Tangherloni, M. S. Nobile, C. Ferretti, D. Besozzi, M. C. Gilardi,
S. Vitabile, G. Mauri, H. Nakayama, and P. Cazzaniga, ‘‘USE-net: Incor-
V. CONCLUSION porating squeeze-and-excitation blocks into U-Net for prostate zonal seg-
Visual Turing Test and t-SNE results show that PGGANs, mentation of multi-institutional MRI datasets,’’ Neurocomputing, vol. 365,
pp. 31–43, Nov. 2019.
multi-stage noise-to-image GAN, can generate realistic/ [3] J. Ker, L. Wang, J. Rao, and T. Lim, ‘‘Deep learning applications in medical
diverse 256×256 brain MR images with/without tumors sep- image analysis,’’ IEEE Access, vol. 6, pp. 9375–9389, 2018.
arately. Unlike classic DA that geometrically covers global [4] O. Ronneberger, P. Fischer, and T. Brox, ‘‘U-Net: Convolutional networks
for biomedical image segmentation,’’ in Proc. Int. Conf. Med. Image
features and provides the robustness on training for most Comput. Comput.-Assist. Intervent. (MICCAI), 2015, pp. 234–241.
cases, the GAN-generated images can non-linearly cover [5] F. Milletari, N. Navab, and S. A. Ahmadi, ‘‘V-net: Fully convolutional
local tumor features with much less tumor/non-tumor over- neural networks for volumetric medical image segmentation,’’ in Proc. Int.
Conf. 3D Vis. (3DV), 2016, pp. 565–571.
lap; thus, combining them can significantly boost tumor [6] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley,
detection sensitivity—especially after refining them with S. Ozair, A. Courville, and Y. Bengio, ‘‘Generative adversarial nets,’’ in
MUNIT or SimGAN, image-to-image GANs; thanks to an Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2014, pp. 2672–2680.
[7] X. Yi, E. Walia, and P. Babyn, ‘‘Generative adversarial network in
ensemble generation process from those GANs’ different medical imaging: A review,’’ Med. Image Anal., vol. 58, Dec. 2019,
algorithms, the texture/shape-refined images can replace Art. no. 101552.
[8] C. Han, L. Rundo, R. Araki, Y. Furukawa, G. Mauri, H. Nakayama,
missing data points of the training set with less tumor/ and H. Hayashi, ‘‘Infinite brain MR images: PGGAN-based data aug-
non-tumor overlap, and thus handle the data imbalance mentation for tumor detection,’’ in Neural Approaches to Dynamics of
by regularizing the model (i.e., improved generalization). Signal Exchanges (Smart Innovation, Systems and Technologies), vol. 151.
Singapore: Springer, 2019, pp. 291–303.
Notably, MUNIT remarkably outperforms SimGAN in terms [9] E. Wu, K. Wu, D. Cox, and W. Lotter, ‘‘Conditional infilling GANs for data
of sensitivity, probably due to the effect of combining both augmentation in mammogram classification,’’ in Image Analysis for Mov-
GANs/VAEs. ing Organ, Breast, and Thoracic Images. Cham, Switzerland: Springer,
2018, pp. 98–106.
Regarding better medical GAN-based DA, ImageNet pre- [10] T. Karras, T. Aila, S. Laine, and J. Lehtinen, ‘‘Progressive growing of
training generally improves classification despite different GANs for improved quality, stability, and variation,’’ in Proc. Int. Conf.
textures of natural/medical images; but, without pre-training, Learn. Represent. (ICLR), 2017, pp. 1–26.
[11] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, ‘‘Inception-v4,
the GAN-refined images may help achieve better sensitiv- Inception-Resnet and the impact of residual connections on learning,’’ in
ity, allowing to alleviate the risk of overlooking the tumor Proc. AAAI Conf. Artif. Intell. (AAAI), 2017, pp. 4278–4284.
[12] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image
diagnosis—this attributes to our tumor detection-oriented recognition,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR),
two-step GAN-based DA’s high discrimination ability to Jun. 2016, pp. 770–778.
fill the real tumor image distribution under no ImageNet [13] X. Huang, M.-Y. Liu, S. Belongie, and J. Kautz, ‘‘Multimodal unsupervised
image-to-image translation,’’ in Proc. Eur. Conf. Comput. Vis. (ECCV),
initialization. GAN-generated images typically include odd 2018, pp. 172–189.
VOLUME 7, 2019 156975
C. Han et al.: Combining Noise-to-Image and Image-to-Image GANs: Brain MR Image Augmentation for Tumor Detection
[14] D. P. Kingma and M. Welling, ‘‘Auto-encoding variational Bayes,’’ in Proc. [38] K. Chang et al., ‘‘Residual convolutional neural network for the determi-
Int. Conf. Learn. Represent. (ICLR), 2013, pp. 1–14. nation of IDH status in low- and high-grade gliomas from MR imaging,’’
[15] A. Shrivastava, T. Pfister, O. Tuzel, J. Susskind, W. Wang, and R. Webb, Clin. Cancer Res., vol. 24, no. 5, pp. 1073–1081, 2018.
‘‘Learning from simulated and unsupervised images through adversarial [39] M. A. Mazurowski, M. Buda, A. Saha, and M. R. Bashir, ‘‘Deep learning in
training,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), radiology: An overview of the concepts and a survey of the state of the art
Jul. 2017, pp. 2107–2116. with focus on MRI,’’ J. Magn. Reson. Imag., vol. 49, no. 4, pp. 939–954,
[16] O. Russakovsky, J. Deng, H. Su, J. Su, S. Satheesh, S. Ma, Z. Huang, 2019.
A. Karpathy, A. Khosla, M. Bernstein, and A. C. Berg, ‘‘ImageNet large [40] Q. McNemar, ‘‘Note on the sampling error of the difference between
scale visual recognition challenge,’’ Int. J. Comput. Vis., vol. 115, no. 3, correlated proportions or percentages,’’ Psychometrika, vol. 12, no. 2,
pp. 211–252, Dec. 2015. pp. 153–157, 1947.
[17] T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and [41] S. Holm, ‘‘A simple sequentially rejective multiple test procedure,’’ Scan-
X. Chen, ‘‘Improved techniques for training GANs,’’ in Proc. Adv. Neural din. J. Statist., vol. 6, no. 2, pp. 65–70, 1979.
Inf. Process. Syst. (NIPS), 2016, pp. 2234–2242. [42] R. Geirhos, P. Rubisch, C. Michaelis, M. Bethge, F. A. Wichmann, and
[18] L. van der Maaten and G. Hinton, ‘‘Visualizing data using t-SNE,’’ J. Mach. W. Brendel, ‘‘ImageNet-trained CNNs are biased towards texture; increas-
Learn. Res., vol. 9, pp. 2579–2605, Nov. 2008. ing shape bias improves accuracy and robustness,’’ in Proc. Int. Conf.
[19] J. Y. Zhu, T. Park, P. Isola, and A. A. Efros, ‘‘Unpaired image-to-image Learn. Represent. (ICLR), 2019, pp. 1–22.
translation using cycle-consistent adversarial networks,’’ in Proc. IEEE Int. [43] F. Iandola, M. Moskewicz, S. Karayev, R. Girshick, T. Darrell, and
Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 2223–2232. K. Keutzer, ‘‘DenseNet: Implementing efficient ConvNet descriptor
[20] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. C. Courville, pyramids,’’ 2014, arXiv:1404.1869. [Online]. Available: https://arxiv.
‘‘Improved training of Wasserstein GANs,’’ in Proc. Adv. Neural Inf. org/abs/1404.1869
Process. Syst. (NIPS), 2017, pp. 5769–5779. [44] D. Arthur and S. Vassilvitskii, ‘‘K-means++: The advantages of careful
[21] A. Radford, L. Metz, and S. Chintala, ‘‘Unsupervised representation learn- seeding,’’ in Proc. Annu. ACM-SIAM Symp. Discrete Algorithms (SODA),
ing with deep convolutional generative adversarial networks,’’ in Proc. Int. 2007, pp. 1027–1035.
Conf. Learn. Represent. (ICLR), 2016, pp. 1–16. [45] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and
[22] T. Xu, P. Zhang, Q. Huang, H. Zhang, Z. Gan, X. Huang, and X. He, R. Salakhutdinov, ‘‘Dropout: A simple way to prevent neural networks
‘‘AttnGAN: Fine-grained text to image generation with attentional gen- from overfitting,’’ J. Mach. Learn. Res., vol. 15, no. 1, pp. 1929–1958,
erative adversarial networks,’’ in Proc. IEEE Conf. Comput. Vis. Pattern 2014.
Recognit. (CVPR), Jun. 2018, pp. 1316–1324. [46] S. Ioffe and C. Szegedy, ‘‘Batch normalization: Accelerating deep network
[23] M. Frid-Adar, I. Diamant, E. Klang, M. Amitai, J. Goldberger, and training by reducing internal covariate shift,’’ in Proc. Int. Conf. Mach.
H. Greenspan, ‘‘GAN-based synthetic medical image augmentation for Learn., vol. 37, Jul. 2015, pp. 448–456.
increased CNN performance in liver lesion classification,’’ Neurocomput- [47] M. J. M. Chuquicusma, S. Hussein, J. Burt, and U. Bagci, ‘‘How to fool
ing, vol. 321, pp. 321–331, Dec. 2018. radiologists with generative adversarial networks? A visual turing test for
[24] A. Madani, M. Moradi, A. Karargyris, and T. Syeda-Mahmood, ‘‘Chest lung cancer diagnosis,’’ in Proc. IEEE Int. Symp. Biomed. Imag. (ISBI),
X-ray generation and data augmentation for cardiovascular abnormality Apr. 2018, pp. 240–244.
classification,’’ Proc. SPIE, vol. 10574, Mar. 2018, Art. no. 105741M. [48] C. Han, H. Hayashi, L. Rundo, R. Araki, W. Shimoda, S. Muramatsu,
[25] A. Gupta, S. Venkatesh, S. Chopra, and C. Ledig, ‘‘Generative image Y. Furukawa, G. Mauri, and H. Nakayama, ‘‘GAN-based synthetic brain
translation for data augmentation of bone lesion pathology,’’ in Proc. Int. MR image generation,’’ in Proc. IEEE Int. Symp. Biomed. Imag. (ISBI),
Conf. Med. Imag. Deep Learn. (MIDL), 2019, pp. 1–11. Apr. 2018, pp. 734–738.
[26] B. H. Menze et al., ‘‘The multimodal brain tumor image segmenta- [49] C. Han, K. Murao, T. Noguchi, Y. Kawata, F. Uchiyama, L. Rundo,
tion benchmark (BRATS),’’ IEEE Trans. Med. Imag., vol. 34, no. 10, H. Nakayama, and S. Satoh, ‘‘Learning more with less: Conditional
pp. 1993–2024, Oct. 2015. PGGAN-based data augmentation for brain metastases detection using
[27] S. Koley, A. Sadhu, P. Mitra, B. Chakraborty, and C. Chakraborty, ‘‘Delin- highly-rough annotation on MR images,’’ in Proc. ACM Int. Conf.
eation and diagnosis of brain tumors from post contrast T1-weighted MR Inf. Knowl. Manage. (CIKM), to be published. [Online]. Available:
images using rough granular computing and random forest,’’ Appl. Soft https://arxiv.org/abs/1902.09856
Comput., vol. 41, pp. 453–465, Apr. 2016. [50] H.-C. Shin, N. A. Tenenholtz, J. K. Rogers, C. G. Schwarz, M. L. Senjem,
[28] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ‘‘ImageNet classification J. L. Gunter, K. P. Andriole, and M. Michalski ‘‘Medical image synthesis
with deep convolutional neural networks,’’ in Proc. Adv. Neural Inf. Pro- for data augmentation and anonymization using generative adversarial
cess. Syst. (NIPS), 2012, pp. 1097–1105. networks,’’ in Proc. Int. Workshop Simulation Synthesis Med. Imag., 2018,
[29] D. P. Kingma and J. Ba, ‘‘Adam: A method for stochastic optimization,’’ pp. 1–11.
in Proc. Int. Conf. Learn. Represent. (ICLR), 2015, pp. 1–15. [51] S. G. Finlayson, H. Lee, I. S. Kohane, and L. Oakden-Rayner, ‘‘Towards
[30] M. Y. Liu, T. Breuel, and J. Kautz, ‘‘Unsupervised image-to-image trans- generative adversarial networks as a new paradigm for radiology educa-
lation networks,’’ in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2017, tion,’’ in Proc. Mach. Learn. Health (ML4H) Workshop, 2018, pp. 1–11.
pp. 700–708. [52] S. Vandenhende, B. De Brabandere, D. Neven, and L. Van Gool,
[31] K. Simonyan and A. Zisserman, ‘‘Very deep convolutional networks for ‘‘A three-player GAN: Generating hard samples to improve classification
large-scale image recognition,’’ in Proc. Int. Conf. Learn. Represent. networks,’’ in Proc. Int. Conf. Mach. Vis. Appl. (MVA), 2019, pp. 1–6.
(ICLR), 2015, pp. 1–14. [53] K. Chaitanya, N. Karani, C. Baumgartner, O. Donati, A. Becker, and
[32] D. Ulyanov, A. Vedaldi, and V. Lempitsky, ‘‘Improved texture networks: E. Konukoglu, ‘‘Semi-supervised and task-driven data augmentation,’’ in
Maximizing quality and diversity in feed-forward stylization and texture Proc. Int. Conf. Inf. Process. Med. Imag. (IPMI), 2019, pp. 29–41.
synthesis,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR),
Jun. 2017, pp. 6924–6932.
[33] X. Huang and S. Belongie, ‘‘Arbitrary style transfer in real-time with CHANGHEE HAN received the bachelor’s and
adaptive instance normalization,’’ in Proc. IEEE Int. Conf. Comput. Vis. master’s degrees in computer science from The
(ICCV), Jun. 2017, pp. 1501–1510. University of Tokyo, in 2015 and 2017, respec-
[34] X. Mao, Q. Liu, H. Xie, R. Y. K. Lau, Z. Wang, and S. P. Smolley, ‘‘Least tively, where he is currently pursuing the Ph.D.
squares generative adversarial networks,’’ in Proc. IEEE Int. Conf. Comput. degree with the Graduate School of Information
Vis. (ICCV), Jun. 2017, pp. 2794–2802. Science and Technology. Since 2018, he has been
[35] L. Bottou, ‘‘Large-scale machine learning with stochastic gradient a Visiting Scholar with the National Center for
descent,’’ in Proc. Int. Conf. Comput. Statist. (COMPSTAT), 2010, Global Health and Medicine, and a member of the
pp. 177–186.
Research Center for Medical Big Data, National
[36] S. Bianco, R. Cadene, L. Celona, and P. Napoletano, ‘‘Benchmark analysis
of representative deep neural network architectures,’’ IEEE Access, vol. 6, Institute of Informatics. He was invited as a Visit-
pp. 64270–64277, Oct. 2018. ing Scholar with the Technical University of Munich, in 2016, University
[37] J. Yap, W. Yolland, and P. Tschandl, ‘‘Multimodal skin lesion classification of Milano-Bicocca, in 2017 and 2018, and the University of Cambridge,
using deep learning,’’ Exp. Dermatol., vol. 27, no. 11, pp. 1261–1267, in 2019. His research interests include machine learning, especially deep
2018. learning for medical imaging and bioinformatics.
LEONARDO RUNDO received the bachelor’s and GIANCARLO MAURI is currently a Full Pro-
master’s degrees in computer science engineering fessor of computer science with the University
from the University of Palermo, in 2010 and 2013, of Milano-Bicocca. His research interests include
respectively, and the Ph.D. degree in computer natural computing and unconventional comput-
science with the University of Milano-Bicocca, in ing models, bioinformatics, stochastic modeling,
2019. In 2013, he was a Research Fellow with the simulation of biological systems and processes
Institute of Molecular Bioimaging and Physiology, using high-performance computing approaches,
National Research Council of Italy (IBFM-CNR). and biomedical data analysis. He has published
Since November 2018, he has been a Research about 400 scientific articles in these areas. He
Associate with the Department of Radiology, Uni- is also a member of the Steering Committees of
versity of Cambridge, collaborating with Cancer Research UK Cambridge the International Conference on Membrane Computing, the International
Institute. His main scientific interests include biomedical image analy- Conference on Unconventional Computing and Natural Computing, and the
sis, machine learning, computational intelligence, and high-performance International workshop on Cellular Automata.
computing.