
Healthy-Defective Metal Object Classification

The goal of this assignment is to build an AI model that can accurately detect crack defects in metal
objects. Based on the presence or absence of a crack defect, we aim to build a binary classification model.

Since this is an image classification problem, we will leverage CNN models, or ConvNets,
to solve it. We will start by building a simple CNN model from scratch, then try to
improve it using techniques such as regularization and image augmentation. Finally, we will
leverage pre-trained models through the powerful technique of transfer learning.

Modelling Overview
We have stored our Defective and Healthy images in two separate folders. We load them using
the getLabel function, storing their paths in ‘images’ and their labels in ‘labels’.
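
The original doesn’t show getLabel itself; a minimal sketch of what it might look like, assuming the two class folders are named Defective and Healthy and sit under a data directory:

import os

# Hypothetical implementation: walk both class folders and record each
# image's path together with its folder name as its label.
def getLabel(base_dir='data'):
    images, labels = [], []
    for label in ['Defective', 'Healthy']:
        folder = os.path.join(base_dir, label)
        for fname in os.listdir(folder):
            images.append(os.path.join(folder, fname))
            labels.append(label)
    return images, labels

images, labels = getLabel()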

Next, we split our dataset into training and validation sets using the split_trainTest function. Our
dataset contains 111 Defective images and 139 OK/Healthy images. For the split, we collect all the
images in a Python list, shuffle it to keep things random, and then hold out 25% of the combined
data for validation.
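
The split_trainTest implementation isn’t shown either; a sketch under the assumptions above, shuffling once and holding out 25%:

import random

# Hypothetical split: pair each image path with its label, shuffle,
# then reserve the last 25% of the shuffled list for validation.
def split_trainTest(images, labels, val_fraction=0.25, seed=42):
    data = list(zip(images, labels))
    random.seed(seed)
    random.shuffle(data)
    split_idx = int(len(data) * (1 - val_fraction))
    train_imgs, train_labels = zip(*data[:split_idx])
    val_imgs, val_labels = zip(*data[split_idx:])
    return list(train_imgs), list(train_labels), list(val_imgs), list(val_labels)

train_imgs, train_labels, val_imgs, val_labels = split_trainTest(images, labels)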
Next, we resize our large images to (224, 224), since the Inception model we use later for
transfer learning expects inputs of this size.

We also scale each image’s pixel values from the (0, 255) range down to (0, 1), because deep learning
models tend to train better with small input values.
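
A sketch of the resize-and-scale step, assuming the path lists from the split above and using Keras image utilities:

import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array

IMG_SIZE = (224, 224)  # input size the Inception model expects

# Load each image at the target size, then scale pixels from [0, 255] to [0, 1].
def load_and_scale(paths):
    arrays = [img_to_array(load_img(p, target_size=IMG_SIZE)) for p in paths]
    return np.array(arrays) / 255.0

train_x = load_and_scale(train_imgs)
val_x = load_and_scale(val_imgs)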

We start by building a basic CNN model with three convolutional layers, each coupled with max
pooling, to automatically extract features from our images while downsampling the output
convolution feature maps.

Let’s leverage Keras and build our CNN model architecture now.
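
One plausible layout for this basic CNN; the filter counts and dense layer size are assumptions, since the original architecture isn’t listed:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Three convolution + max-pooling blocks, then dense layers on top.
model = Sequential([
    Conv2D(16, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1, activation='sigmoid')  # binary output: Defective vs Healthy
])
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])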
We train our model with batch_size = 32 and epochs = 30 on the training data, validating after
each epoch on our validation dataset.
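
The training call might look like this; the 0/1 label encoding is an assumption:

import numpy as np

# Assumed encoding: Defective -> 1, Healthy -> 0.
train_y = np.array([1 if l == 'Defective' else 0 for l in train_labels])
val_y = np.array([1 if l == 'Defective' else 0 for l in val_labels])

# Train for 30 epochs in batches of 32, validating after each epoch.
history = model.fit(train_x, train_y,
                    batch_size=32, epochs=30,
                    validation_data=(val_x, val_y))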

After training, we saw that our model was overfitting and not performing particularly well on
either the training or the validation dataset.

Let’s improve upon our base CNN model with regularization. In a CNN, a common way to do
this is to add dropout layers. We added two dropout layers with a dropout rate of 0.5, one after each
hidden dense layer. Dropout randomly masks the outputs of a fraction of units in a layer by
setting their outputs to zero (in our case, 50% of the units in our dense layers). In the end the model
still overfit, though it took slightly longer to do so, and we got a slightly better
validation accuracy of around ~75%, an improvement but not an amazing one.
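
The regularized network might look like the following; everything outside the two Dropout(0.5) layers mirrors the earlier sketch, and the second hidden dense layer is an assumption:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Same conv stack as before, with Dropout(0.5) after each hidden dense layer.
reg_model = Sequential([
    Conv2D(16, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(512, activation='relu'),
    Dropout(0.5),  # masks 50% of this layer's outputs each step
    Dense(512, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])
reg_model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])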

The model overfits because we have very little training data and it keeps seeing the same
instances across every epoch. A way to combat this is to leverage an image augmentation
strategy that augments our existing training data with images that are slight variations of
the existing ones.

The idea behind image augmentation is to take existing images from our training dataset and
apply image transformation operations to them, such as rotation, shearing, translation, zooming,
and so on, to produce new, altered versions. Because these transformations are random, we don’t
get the same images each time, and we leverage Python generators to feed these new images to
our model during training. The Keras framework has an excellent utility called ImageDataGenerator
that can handle all these operations for us.
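
A generator configured along those lines; the specific transformation ranges are illustrative assumptions:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Random rotations, shifts, shears, zooms, and flips applied on the fly.
train_datagen = ImageDataGenerator(rotation_range=50,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   shear_range=0.2,
                                   zoom_range=0.3,
                                   horizontal_flip=True,
                                   fill_mode='nearest')

# Yields a freshly transformed batch of 32 images on every step,
# so the model never sees exactly the same image twice.
train_generator = train_datagen.flow(train_x, train_y, batch_size=32)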

A pre-trained model like InceptionV3 has already been trained on a huge dataset
(ImageNet) containing many diverse image categories. The model should therefore have
learned a robust hierarchy of features that are spatially, rotationally, and translationally
invariant. Having learned good feature representations for over a million images belonging
to 1,000 different categories, it can act as a good feature extractor for new images in other
computer vision problems. These new images might not exist in the ImageNet dataset or might
belong to totally different categories, yet the model should still be able to extract relevant
features from them.

This lets us use pre-trained models as effective feature extractors for new images and apply
them to diverse, complex computer vision tasks such as ours.

We will use InceptionV3 as a simple feature extractor by freezing all its convolution blocks so
their weights don’t get updated during training.
A way to save time in model training is to run this model once over our training and validation
datasets, extract all the features, and then feed them as inputs to our classifier. Let’s extract the
bottleneck features from our training and validation sets now.
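
A sketch of the extraction step, assuming the scaled (224, 224) arrays from earlier; with pooling='avg', each image collapses to a single 2,048-dimensional feature vector (the text scales inputs to [0, 1]; Keras’s own InceptionV3 preprocess_input maps to [-1, 1], but we follow the text here):

from tensorflow.keras.applications import InceptionV3

# Load InceptionV3 with ImageNet weights, drop the classification head,
# and freeze every layer so its weights never get updated.
base = InceptionV3(weights='imagenet', include_top=False,
                   pooling='avg', input_shape=(224, 224, 3))
base.trainable = False

# Run the frozen network once over each split to get bottleneck features.
train_features = base.predict(train_x)
val_features = base.predict(val_x)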

Let’s build the architecture of our deep neural network classifier now, which will take these features
as input.
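
One possible classifier over these bottleneck vectors; the layer sizes reuse the earlier assumptions:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# A small fully connected network on top of the 2,048-dim feature vectors.
clf = Sequential([
    Dense(512, activation='relu', input_shape=(train_features.shape[1],)),
    Dropout(0.5),
    Dense(512, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])
clf.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
clf.fit(train_features, train_y, batch_size=32, epochs=30,
        validation_data=(val_features, val_y))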
We can see that our model has an overall validation accuracy of ~88%, which is a huge improvement
over our previous model, and the training and validation accuracies are much closer to each other,
indicating that the model is no longer overfitting as badly.

We tested our model on 20 images, and the results are as follows:

Considering the fact that we had such a small dataset to work with, this is a huge improvement.
