Challenge 1

Motivation:

The segmentation task is interesting yet challenging due to the small amount of training data. I also have experience in training image-to-image models, which is quite similar to training a segmentation model. This motivated me to take on this challenge.

Abstract:

In this challenge, the study was initiated with the implementation of a naive UNet model for the
image segmentation task. All 50 labeled images provided in the training dataset were utilized for
training the model. Pixel-wise cross-entropy was employed for network training. To address data
imbalance among pixel classes, the naive pixel-wise cross-entropy was subsequently modified to
incorporate a weighted pixel-wise cross-entropy loss. The model underwent training for 1000
epochs; however, due to the limited training data, training the UNet model proved to be challenging.
In order to alleviate the scarcity of labeled data, a semi-supervised approach was employed,
incorporating a few unlabeled fetal brain images from the dataset designated for the classification
challenge into the segmentation model training process.

Introduction:

In scenarios where annotating data is costly, and we're left with only a handful of labeled examples,
semi-supervised learning emerges as a valuable strategy. Here, the challenge is met by incorporating
a surplus of unlabeled data through perturbation-based cross-consistency training. Drawing
inspiration from a prior study [1], this approach involves introducing controlled variations to the
unlabeled data to enhance model robustness. The idea is to tap into the latent information within
the unlabeled dataset, thereby allowing the model to learn more effectively. By intertwining labeled
and unlabeled data in this manner, we aim to improve the model's generalization and performance,
especially when confronted with limited labeled examples. This technique aligns with contemporary
methodologies that leverage unlabeled data to supplement the training process in resource-
constrained settings. The cross-consistency-based perturbation training helped make the segmentation model more consistent in its predictions, which in turn improved the overall performance.

Data Pre-processing/Analysis:

Apart from intensity normalization, no data pre-processing was performed; the original images were used directly. The images were normalized to the range (0, 1). In the case of the unlabelled images, we additionally centre-cropped them to a size of 300x300.
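
A minimal sketch of this step in PyTorch; min-max normalization and the [C, H, W] float-tensor layout are assumptions, as the report only states the target range and the crop size.

    import torch
    from torchvision.transforms.functional import center_crop

    def preprocess(img: torch.Tensor, labelled: bool = True) -> torch.Tensor:
        # Min-max normalize the image intensities to the range (0, 1).
        img = (img - img.min()) / (img.max() - img.min() + 1e-8)
        # Unlabelled images are additionally centre-cropped to 300x300.
        if not labelled:
            img = center_crop(img, [300, 300])
        return img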

Model Architecture:

UNet has been used as the backbone architecture for the proposed method. Figure 1 shows the modification to the naïve UNet architecture. We used three additional auxiliary decoders along with the main decoder. For labelled data, the training procedure is the same as standard segmentation model training. For unlabelled data, we first encode the image using the encoder part of the UNet and apply three random perturbations to the encoded feature; each perturbed version of the feature is passed to a different auxiliary decoder, while the unperturbed version is passed to the main decoder. The prediction of the main decoder is then forced to be consistent with the outputs of the three auxiliary decoders by applying a loss between the predictions.
Figure 1: Illustration of perturbation-based cross-consistency training
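
A minimal PyTorch sketch of the unlabelled branch described above. Here encoder, main_decoder, and aux_decoders are placeholders for the UNet split (skip connections are omitted for brevity), perturb is the feature perturbation defined in the next sketch, and detaching the main prediction follows [1]; none of these details are stated explicitly in this report.

    import torch.nn.functional as F

    def cross_consistency_loss(encoder, main_decoder, aux_decoders, x_unlab, perturb):
        # Encode the unlabelled batch once with the shared UNet encoder.
        z = encoder(x_unlab)
        # The unperturbed feature goes to the main decoder; its prediction
        # serves as the consistency target and is detached, following [1].
        main_pred = main_decoder(z).softmax(dim=1).detach()
        # Each auxiliary decoder receives its own perturbed copy of the feature.
        loss = 0.0
        for aux in aux_decoders:
            aux_pred = aux(perturb(z)).softmax(dim=1)
            loss = loss + F.mse_loss(aux_pred, main_pred)
        return loss / len(aux_decoders)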

For the perturbation, we first randomly sample a noise vector from a uniform distribution and multiply it element-wise with the encoded feature; the resulting modified noise is then added back to the main feature.
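
A sketch of this perturbation; the noise range (-0.3, 0.3) follows the F-Noise perturbation of [1] and is an assumption, since the report does not specify it.

    import torch

    def feature_noise(z: torch.Tensor, lo: float = -0.3, hi: float = 0.3) -> torch.Tensor:
        # Sample a noise tensor from U(lo, hi), multiply it element-wise
        # with the encoded feature, and add the product back to the feature.
        noise = torch.empty_like(z).uniform_(lo, hi)
        return z + z * noise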

Experimental Setting:

We used the Adam optimizer with its default hyperparameter settings. The learning rate was initially set to 1e-4 and then decreased by a factor of 2 after every 2000 iterations of supervised training. Pixel-wise cross-entropy was used as the supervised loss, and MSE was used as the cross-consistency loss.
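
This setup can be expressed in PyTorch roughly as follows; the stand-in model and the per-iteration stepping of the scheduler are assumptions.

    import torch
    import torch.nn as nn

    # Stand-in for the UNet with auxiliary decoders described above.
    model = nn.Conv2d(1, 4, kernel_size=3, padding=1)

    # Adam with its default hyperparameters and an initial learning rate of 1e-4.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    # Halve the learning rate after every 2000 supervised iterations; stepping
    # the scheduler once per iteration (rather than per epoch) is how we read
    # the schedule here.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2000, gamma=0.5)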

Hypothesis Tried:

I first tried a weighted version of the pixel-wise cross-entropy loss to handle the data imbalance. Next, I tried focal loss to mitigate the class-imbalance problem, and I also added Dice loss to the training objective (a sketch of these variants follows below). As no validation dataset was provided, quantitatively evaluating the effect of these losses was not possible. Visually there was no significant change in the segmentation output, and it was unclear which of these minor changes actually gave better results. The main improvement came from the use of semi-supervised learning.
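
For reference, a sketch of these loss variants combined as an equal-weighted sum; the combination weights and the focal gamma of 2.0 are assumptions, as the report does not state how the terms were mixed.

    import torch.nn.functional as F

    def combined_loss(logits, target, class_weights=None, gamma=2.0, eps=1e-6):
        # Pixel-wise cross-entropy, optionally weighted per class.
        ce = F.cross_entropy(logits, target, weight=class_weights)

        # Focal loss: down-weight well-classified pixels by (1 - p_t)^gamma.
        logpt = F.log_softmax(logits, dim=1).gather(1, target.unsqueeze(1)).squeeze(1)
        focal = (-(1 - logpt.exp()) ** gamma * logpt).mean()

        # Soft Dice loss, averaged over classes.
        probs = logits.softmax(dim=1)
        onehot = F.one_hot(target, logits.shape[1]).permute(0, 3, 1, 2).float()
        inter = (probs * onehot).sum(dim=(0, 2, 3))
        union = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
        dice = 1 - ((2 * inter + eps) / (union + eps)).mean()

        return ce + focal + dice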

Results:

The addition of unlabelled data visually improved the performance of the segmentation model by a good margin. A few of the results on the test set are shown below.
(Test-set examples, left to right: Input Image, Naive UNet, Semi-Supervised Approach)

Key Findings:

It seems that the UNet trained using only the labelled data was slightly overfitted to the training data and, as a result, produced poor results on the test images. In contrast, the semi-supervised approach trained the UNet better than the naïve approach; the improvement is visible even to the naked eye.

Future Work:

Given sufficient time, the current framework could be improved significantly by experimenting with different perturbations and consistency-training schemes. Other semi-supervised frameworks could also be tried to make the algorithm more suitable for ultrasound image segmentation. Active learning seems to be an option: a trained model could be used to label the unlabelled data, the predictions could be refined via manual supervision, and another model could then be trained with the newly labelled data.

References:

1. Ouali, Yassine, Céline Hudelot, and Myriam Tami. "Semi-supervised semantic segmentation
with cross-consistency training." Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition. 2020.
