
In this homework, you will develop a U-Net network [1] from scratch and train the constructed network on
the Heart dataset. In this dataset, the images are split into three parts: training, validation, and test sets.
Gold standard images are also provided as PNG files, where background and heart pixels are represented
with 0 and 1, respectively. (Since the heart pixels are represented with 1 instead of 255, you may not
notice those pixels when you open a gold standard image in an image editor. However, when you read
the contents of these files in your program, you will see them.)
The main aim of the homework is to practice deep learning design and to understand the effect of network
architecture on a network's performance. Each model's performance will be evaluated on its segmentation
maps, and changes in the segmentation predictions will be analyzed to understand the effect of the
applied network modifications.
To construct the network, you may choose any well-known framework, such as PyTorch or TensorFlow.
Make this selection considering your familiarity with the frameworks and their compatibility with the
resources (particularly GPUs) at your disposal. We recommend using the PyTorch framework in this
homework due to its prevalence in academia.

Part 1: Network Construction

In this step, a network must be constructed from scratch as shown in Fig. 1 of the original U-Net paper [1].
However, the input image size and the final layer must be modified considering the details of the images
given in the assignment. The contribution of the U-Net architecture is explained in the corresponding lecture slide (not reproduced here).

The constructed network must include long skip connections along with the blocks explained in the paper.
To shorten the training time and avoid possible computational resource limitations, unlike the
architecture proposed in Fig. 1, the initial 64 feature channels must be decreased to 16 while
maintaining the same doubling logic in the downsampling and upsampling steps. All other parameters and
configuration choices must be consistent with the network architecture reported in the paper (a minimal
sketch is given after the question list below). After constructing the network, the training configuration
can be set differently from the details proposed in the paper. In this phase, you will design the training
by choosing an optimizer, stopping condition, scheduler, etc., along with their corresponding
hyperparameters. In this design process, you need to answer the following questions:

1. How did you select the optimizer used in training?
2. What was the batch size in training? How did you select this value?
3. Which stopping condition was used in training, and which model was saved to predict the
outputs for the images in the test set?
4. What are the training and validation performance metrics (loss value, precision, recall,
F-score) throughout the training epochs?
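
A minimal sketch of such a network, assuming PyTorch. The class and argument names (`UNet`, `DoubleConv`, `base`, `depth`, `p_drop`) are our own, not from the paper or the assignment; the single-channel input, two-class output, and padded convolutions are assumptions that must be adapted to the given images:

```python
import torch
import torch.nn as nn


class DoubleConv(nn.Module):
    """Two 3x3 convolutions, each followed by ReLU (one U-Net block).

    padding=1 keeps the spatial size, a common simplification of the
    paper's unpadded convolutions that avoids cropping the skip tensors.
    """

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class UNet(nn.Module):
    def __init__(self, in_ch=1, n_classes=2, base=16, depth=4, p_drop=0.0):
        super().__init__()
        chs = [base * 2 ** i for i in range(depth + 1)]  # e.g. 16-32-64-128-256
        self.downs = nn.ModuleList()
        prev = in_ch
        for c in chs[:-1]:                       # encoder blocks
            self.downs.append(DoubleConv(prev, c))
            prev = c
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = DoubleConv(chs[-2], chs[-1])
        self.dropout = nn.Dropout2d(p_drop)      # used in Part 3; inactive when p=0
        self.ups = nn.ModuleList()
        self.up_convs = nn.ModuleList()
        for c in reversed(chs[:-1]):             # decoder blocks
            self.ups.append(nn.ConvTranspose2d(c * 2, c, 2, stride=2))
            self.up_convs.append(DoubleConv(c * 2, c))  # doubled by skip concat
        self.head = nn.Conv2d(chs[0], n_classes, 1)     # final 1x1 layer

    def forward(self, x):
        skips = []
        for down in self.downs:
            x = down(x)
            skips.append(x)                      # long skip connection
            x = self.pool(x)
        x = self.dropout(self.bottleneck(x))
        for up, conv, skip in zip(self.ups, self.up_convs, reversed(skips)):
            x = conv(torch.cat([skip, up(x)], dim=1))
        return self.head(x)


if __name__ == "__main__":
    model = UNet()
    x = torch.randn(1, 1, 256, 256)  # assumed input size; must be divisible by 16
    print(model(x).shape)            # torch.Size([1, 2, 256, 256])
```

One possible training configuration touching questions 1-3 is sketched below; every value in it (learning rate, patience, and so on) is a placeholder you must choose and justify yourself:

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = UNet(in_ch=1, n_classes=2, base=16, depth=4)
criterion = nn.CrossEntropyLoss()                    # gold maps hold 0/1 class labels
optimizer = optim.Adam(model.parameters(), lr=1e-3)  # placeholder learning rate
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=5)

# A simple stopping rule: track the validation loss each epoch, step the
# scheduler with it, and save the best checkpoint for test-set prediction:
#   torch.save(model.state_dict(), "best_model.pt")
```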
After completing the implementation, you must predict the segmentation maps for the test set images and
calculate the pixel-level precision, recall, and F-score metrics. The model's overall performance must be
determined by averaging the metrics calculated separately for each image. In this step, the final
model must exceed 75% in the overall F-score. If it cannot reach this score, please explain the possible
reasons.
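
A minimal sketch of this per-image metric computation, assuming predictions and gold standards are available as 0/1 NumPy arrays; `test_preds` and `test_golds` are hypothetical placeholders for your own data structures:

```python
import numpy as np


def precision_recall_fscore(pred, gold):
    """Pixel-level metrics for one image; pred and gold are 0/1 arrays."""
    tp = np.sum((pred == 1) & (gold == 1))
    fp = np.sum((pred == 1) & (gold == 0))
    fn = np.sum((pred == 0) & (gold == 1))
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    fscore = (2 * precision * recall / (precision + recall)
              if precision + recall > 0 else 0.0)
    return precision, recall, fscore


# Overall performance: average the per-image metrics over the test set.
scores = [precision_recall_fscore(p, g) for p, g in zip(test_preds, test_golds)]
mean_precision, mean_recall, mean_fscore = np.mean(scores, axis=0)
```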
On the other hand, visual results are crucial for assessing the performance of the models. Therefore, you
must include two images each of good, acceptable, and problematic segmentation results (a total of six
different images) from your model in your report. Please comment on the selected images to explain which
aspects of the chosen model are sufficient and which are lacking (3-4 sentences).

Part 2: Network Architecture Modification

After constructing the network, you will examine the effect of architectural changes on the network. In this
sense, two modifications will be applied to the network design. First, the number of downsampling and
upsampling steps can be altered to generate different feature maps or to decrease model complexity and
training time. In this homework, you will decrease the number of downsampling and upsampling steps from
four (as shown in Fig. 1 of the original U-Net paper) to three. Therefore, the number of feature channels
in the corresponding blocks of the designed network must be 16-32-64-128 in sequence.
Second, you will check the effect of the number of feature channels in the network architecture. To do
that, you will change the initial number of feature channels of the network constructed in the first part
(the initial network) to either 32 or 8; the logic of doubling the number of feature channels at each
downsampling step will remain unchanged. In other words, the number of feature channels in the
corresponding network blocks will be either 8-16-32-64-128 or 32-64-128-256-512 in sequence, and the
upsampling path will be updated to mirror the modified encoder part of the U-Net architecture. Please
comment on the number of channels selected in your network design and the reason for this selection.
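
Assuming the parameterized `UNet` sketch from Part 1 (whose `base` and `depth` arguments are our own names, not the assignment's), the variants in this part could be instantiated as follows:

```python
net_default = UNet(base=16, depth=4)  # 16-32-64-128-256 (Part 1 network)
net_shallow = UNet(base=16, depth=3)  # 16-32-64-128 (three down/up steps)
net_narrow  = UNet(base=8,  depth=4)  # 8-16-32-64-128   \ only one of these
net_wide    = UNet(base=32, depth=4)  # 32-64-128-256-512 / two is required
```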
After implementing these two modifications, the networks must be trained in the same way as in the previous
part. Nevertheless, if you want, you may slightly adjust the hyperparameter values determined for training
in the previous task. At the end of this part, you will complete the following table:

                                                    |      Training Set       |     Validation Set      |        Test Set
                                                    | Precision Recall F-score| Precision Recall F-score| Precision Recall F-score
Default network design                              |                         |                         |
Modification in the number of down/up-sampling steps|                         |                         |
Modification in the number of feature channels      |                         |                         |
After filling in the values of this table, you need to answer the following questions:

1. Which method achieves the best segmentation performance, and what are the possible reasons
for this difference?
2. What differences do you observe relative to the default network design in terms of segmentation
performance, training time, and the gap between training and test set results?
3. Which network architecture do you prefer among these three methods, and why?

Part 3: Dropout in Network Architecture

In the last part, a well-known regularization technique, dropout [2], will be integrated into the network
architecture. The details of the dropout regularization technique are explained in the lecture, and a
summary is given in the corresponding lecture slide (not reproduced here).

The network constructed in the first part will be used here to observe the contribution of the dropout
technique. You can consult the example dropout layers provided by PyTorch and TensorFlow. How you
integrate this layer into the network architecture is up to you; however, in your report, you
must justify your decision regarding the position of the dropout layers. In addition, the dropout layers
introduce a rate/probability hyperparameter p into the network. You must determine three different
p-values for the integrated dropout layers and evaluate the models' performance for these three values
separately; one possible integration is sketched below.
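A minimal sketch of one placement option, assuming the parameterized `UNet` class from the Part 1 sketch: its `p_drop` argument applies `nn.Dropout2d` to the bottleneck output, echoing the original paper's use of dropout at the end of the contracting path. The p-values below are illustrative placeholders only, not prescribed values:

```python
# Assumes the UNet sketch from Part 1, whose p_drop argument inserts an
# nn.Dropout2d after the bottleneck block.
for p in (0.1, 0.3, 0.5):  # placeholder p-values; choose and justify your own
    model = UNet(base=16, depth=4, p_drop=p)
    # Train and evaluate as in Part 1; model.eval() disables dropout at test time.
```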

For this part, you must complete the following table:

                                                    |      Training Set       |     Validation Set      |        Test Set
                                                    | Precision Recall F-score| Precision Recall F-score| Precision Recall F-score
Default network design                              |                         |                         |
Network w/ dropout (p=****)                         |                         |                         |
Network w/ dropout (p=****)                         |                         |                         |
Network w/ dropout (p=****)                         |                         |                         |


After filling in the values of this table, you need to answer the following questions:

1. Do you observe a regularization effect in your results after integrating the dropout layers?
Which performance results demonstrate the regularization effect of the dropout layers?
2. Which p-value yields the best segmentation performance among the three?
Comment on the possible reasons for this outcome.
3. Based on your results, can we say that increasing the p-value in the dropout layers
decreases the difference between training and test set performance?
References

[1] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional Networks for
Biomedical Image Segmentation," arXiv preprint arXiv:1505.04597, 2015.
[2] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A
Simple Way to Prevent Neural Networks from Overfitting," The Journal of Machine Learning
Research, vol. 15, no. 1, pp. 1929-1958, 2014.
