CSC580_CTA6_Option_1_Anderson_Cleon
Cleon Anderson
In computer vision, image classification involves assigning a label from a set of predefined categories
to an input image. Convolutional Neural Networks (CNNs) provide state-of-the-art performance on image
classification problems. In this assignment, we implement a custom CNN for image classification on the
CIFAR-10 dataset and evaluate the efficacy of the resulting model.
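As a point of reference, CIFAR-10 can be loaded and normalized in a few lines of TensorFlow. This is a minimal sketch; the exact preprocessing used in the original assignment may differ.

```python
import tensorflow as tf

# Load CIFAR-10: 50,000 training and 10,000 test images,
# each 32x32 RGB, labeled with one of 10 classes.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Scale pixel values from [0, 255] to [0, 1] before training.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

print(x_train.shape)  # (50000, 32, 32, 3)
print(y_train.shape)  # (50000, 1)
```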
Our CNN model consists of two convolutional layers and two fully connected layers, with a Rectified
Linear Unit (ReLU) activation after each convolutional operation. The key hyperparameters are as follows:
1. Filters: The number of filters in each convolutional layer determines the depth of its feature maps.
The first convolutional layer has 32 filters, and the second has 64. A larger number of filters
allows the model to learn more complex and abstract features but comes at a higher
computational cost.
2. Kernel Size: Each convolutional operation has a receptive field determined by its kernel size.
Both convolutional layers use a 5x5 kernel. Larger kernels generate more parameters, increasing
representational power at additional computational cost.
3. Strides: The first convolutional layer uses a stride of (1, 1), and the second uses (2, 2). A stride
of (2, 2) reduces the spatial dimensions of the feature maps, enabling the model to capture larger
spatial patterns efficiently.
4. Padding: Padding controls the spatial dimensions of the output. We use 'SAME' padding, which pads
the input so that the output size matches the input size when the stride is 1.
5. Fully Connected Layers: The first fully connected layer has 1024 units, and the second fully
connected layer has 10 units, one per class in the CIFAR-10 dataset.
6. Dropout: We apply dropout with a probability of 0.8 to the output of the first fully connected
layer to improve generalization and reduce overfitting. During training, dropout sets a fraction of
the layer's output units to zero, effectively training an ensemble of submodels. This prevents
the model from becoming too reliant on specific features and encourages it to learn more robust
representations.
7. Learning Rate Tuning: The learning rate plays a vital role in training the model effectively.
Training at a fixed rate can lead to slow convergence or unstable updates, so we tune the learning
rate during training. A sketch of the full architecture appears below.
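The following Keras sketch reflects the layer configuration described above. It is an approximation, not the original implementation, which may have used TensorFlow's lower-level API. The dropout probability of 0.8 is taken at face value as the drop fraction (if it was instead a TF1-style keep probability, the equivalent Keras rate would be 0.2), and the learning-rate schedule and L2 coefficient are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Sketch of the two-conv, two-dense CNN described above.
# Layer sizes follow the text; remaining details are assumptions.
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),                    # CIFAR-10 images
    layers.Conv2D(32, kernel_size=5, strides=(1, 1),
                  padding="same", activation="relu"),   # 32 filters, 5x5, stride 1
    layers.Conv2D(64, kernel_size=5, strides=(2, 2),
                  padding="same", activation="relu"),   # 64 filters, downsamples 2x
    layers.Flatten(),
    layers.Dense(1024, activation="relu",
                 kernel_regularizer=tf.keras.regularizers.l2(1e-4)),  # L2 coefficient assumed
    layers.Dropout(0.8),                                # drop fraction per the text
    layers.Dense(10),                                   # one logit per CIFAR-10 class
])

# Exponential decay addresses the fixed-learning-rate issue in item 7.
# Initial rate and decay settings here are assumptions.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3, decay_steps=1000, decay_rate=0.96)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=lr_schedule),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```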
Results
[Fig. 1: Training and Validation Loss and Accuracy Plots. Fig. 2: Algorithm Run with Accuracy Results.]
The model's efficacy does not improve significantly throughout training.
1. Validation Accuracy and Loss: The validation accuracy remains constant at approximately
0.22, and the validation loss stays at a high value of 272.35 through the last epoch,
suggesting that the model's performance on the validation set is not improving and that it
cannot make reliable predictions on unseen data.
2. Test Accuracy: The test accuracy of 0.1, no better than chance on ten classes, is significantly
lower than the training accuracy. This discrepancy suggests that the model is overfit to the
training data, as it fails to generalize to unseen examples.
3. Average Training Accuracy and Loss: The average training accuracy across all epochs is
0.7475, while the average training loss is 1.02. The average accuracy is relatively high,
indicating that the model fits the training data well. However, the average loss could be lower.
4. Average Validation Accuracy and Loss: The average validation accuracy is 0.22, while the
average validation loss is 276.92. These values are consistent with the final-epoch figures,
indicating that the model's validation performance does not improve over the course of training.
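For reference, the per-epoch averages and test metrics above can be obtained from the Keras training history and a final evaluation call. This is a sketch assuming the model and data variables from the earlier snippets; the epoch count and batch size are assumptions.

```python
# Train and collect per-epoch metrics.
history = model.fit(x_train, y_train, epochs=10,
                    validation_split=0.1, batch_size=128)

# Average training/validation accuracy and loss across all epochs.
avg_train_acc = sum(history.history["accuracy"]) / len(history.history["accuracy"])
avg_val_acc = sum(history.history["val_accuracy"]) / len(history.history["val_accuracy"])
avg_train_loss = sum(history.history["loss"]) / len(history.history["loss"])
avg_val_loss = sum(history.history["val_loss"]) / len(history.history["val_loss"])

# Final test-set evaluation, corresponding to the reported test accuracy.
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"avg train acc {avg_train_acc:.4f}, avg val acc {avg_val_acc:.4f}")
print(f"test accuracy {test_acc:.4f}")
```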
Conclusion
The model's efficacy remains low, and it does not generalize well to unseen data. The model appears to
be overfitting to the training data, as evidenced by the large discrepancy between training and test
accuracy. Despite using dropout and L2 regularization, the model struggles to improve its performance
on the validation and test sets. We implemented hyperparameter tuning, but even after many bug fixes,
the model's performance did not improve meaningfully.