Final Thesis


Classification of Garments from Fashion MNIST

Dataset & Augmented Dataset


Using CNN VGG19 Architecture

Presentation of the work performed during the monsoon semester

Presented by: Ajay Kumar
Admission Number: 20MT0028
Guided by: Assoc. Prof. Haider Banka
Department of Computer Science & Engineering, IIT (ISM) Dhanbad
Contents

• Introduction
• Why VGG19
 - Analysis of DNN Models
 - Architecture of VGG19
 - Summary of VGG19
 - Summary of Dataset
• Feature Extraction
• Why Only Reuse the Convolutional Base
• Extract Features from Fashion-MNIST
• Result and Conclusion
• Challenges / Future Work
INTRODUCTION
• Online retail stores have recently grown quickly and are surpassing traditional physical stores. Consumers can browse thousands of items in online stores, but it is often difficult to find the ideal item among the massive choices on offer. For instance, searching for the ideal piece of clothing is time-consuming because there are too many options. Large industrial exporters also face serious problems with product classification and labeling.

• Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image associated with a label from 10 classes. The dataset serves as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms: it shares the same image size and the same structure of training and testing splits.

• Image recognition using CNNs is widely applied in the fashion domain, in tasks such as clothes classification, clothes retrieval, and automatic clothes labeling.

• I apply the VGG19 architecture to the Fashion-MNIST dataset. So far, LeNet-5 has given higher performance than other existing models, so I compare the accuracy of VGG19 with that of LeNet-5.

• I am also experimenting with an augmented dataset (Myntra) on the existing best-performing model, so that the exporters' labeling issue can be addressed.
Why VGG19

• This work continues the ITCE 2020 paper "Classification of Garments from Fashion MNIST Dataset Using CNN LeNet-5 Architecture".

• VGG-19 is a convolutional neural network that is 19 layers deep. We can load a version of the network pretrained on more than a million images from the ImageNet database. The pretrained network can classify images into 1000 object categories, such as shirts, jeans, shoes, and watches.

• It is currently one of the most preferred choices in the community for extracting features from images. The weight configuration of VGGNet is publicly available and has been used as a baseline feature extractor in many applications and challenges. Among the models compared here, it has the highest number of parameters.
Analysis Of DNN Models

Figure 1. The architecture of VGG19


Architecture of VGG19

Figure 2. The architecture of VGG19


Summary of VGG19

Figure 3. The Summary of VGG19


Summary of Dataset

As noted in the introduction, Fashion-MNIST provides a training set of 60,000 examples and a test set of 10,000 examples, each a 28x28 grayscale image labeled with one of 10 classes.

The images of Fashion-MNIST are grayscale, while the required input for VGG19 must be color images. Thus, I convert the images into color images with the 3 channels R, G, B.

VGG19 requires a minimum input image width and height of 48 pixels, but I resize my images from 28x28 to 150x150.

Figure 4. The Summary of Dataset
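The two preprocessing steps described above (replicating the gray channel into R, G, B and resizing to 150x150) can be sketched in plain NumPy. The 4-image batch below is a hypothetical stand-in for the real Fashion-MNIST arrays, and the resize shown is a simple nearest-neighbour index mapping, not necessarily the interpolation used in the actual pipeline.

```python
import numpy as np

# Hypothetical stand-in batch for Fashion-MNIST: 4 grayscale 28x28 images.
gray = np.random.randint(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Replicate the single channel three times to obtain R, G, B channels.
rgb = np.stack([gray] * 3, axis=-1)     # shape: (4, 28, 28, 3)

# Nearest-neighbour resize from 28x28 to 150x150 via index mapping:
# each target pixel looks up its nearest source row/column.
idx = np.arange(150) * 28 // 150
resized = rgb[:, idx][:, :, idx]        # shape: (4, 150, 150, 3)

print(rgb.shape, resized.shape)
```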
Feature Extraction

• Feature extraction consists of using the representations learned by a previously trained network to extract interesting features from new samples. These features are then run through a new classifier, which is trained from scratch.

• CNNs used for image classification comprise two parts: they start with a series of convolution and pooling layers, and they end with a densely connected classifier. The first part is called the "convolutional base" of the model. In the case of convnets, feature extraction simply consists of taking the convolutional base of a previously trained network, running the new data through it, and training a new classifier on top of the output.
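A minimal Keras sketch of isolating the convolutional base, assuming tensorflow.keras is available. Note that `weights=None` is used here only so the snippet runs without downloading the ImageNet weights; the actual pipeline would pass `weights="imagenet"`.

```python
from tensorflow.keras.applications import VGG19

# include_top=False drops the densely connected classifier, keeping only the
# convolution/pooling stack (the "convolutional base").
# weights=None keeps this sketch offline; the real run would load the
# pretrained ImageNet weights with weights="imagenet".
conv_base = VGG19(weights=None, include_top=False, input_shape=(150, 150, 3))

# Five pooling stages shrink 150x150 inputs down to 4x4 maps of 512 channels.
print(conv_base.output_shape)  # (None, 4, 4, 512)
```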
Why Only Reuse the Convolutional Base

• Representations found in densely connected layers no longer contain any information about where objects are located in the input image: these layers get rid of the notion of space, whereas object location is still described by the convolutional feature maps. For problems where object location matters, densely connected features would be largely useless.
Extract Features from Fashion-MNIST

- Run the convolutional base over the dataset.
- Record its output to a NumPy array on disk.
- Use this data as input to a standalone densely connected classifier.

This solution is fast and cheap to run, because it requires running the convolutional base only once per input image, and the convolutional base is by far the most expensive part of the pipeline. However, for the same reason, this technique does not allow me to leverage data augmentation at all.
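The three steps above can be sketched as follows, assuming tensorflow.keras is available. The 16-image batch is a hypothetical stand-in for the resized Fashion-MNIST images, and `weights=None` is used only so the sketch runs offline (the real pipeline would use the pretrained ImageNet weights).

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG19

# Hypothetical mini-batch standing in for the resized 150x150x3 images.
images = np.random.rand(16, 150, 150, 3).astype("float32")

# 1. Run the convolutional base over the dataset (once per image).
conv_base = VGG19(weights=None, include_top=False, input_shape=(150, 150, 3))
features = conv_base.predict(images, verbose=0)   # shape: (16, 4, 4, 512)

# 2. Record its output to a NumPy array on disk.
np.save("fashion_features.npy", features)

# 3. Feed the flattened features to a standalone densely connected classifier.
classifier = models.Sequential([
    layers.Input(shape=(4 * 4 * 512,)),
    layers.Dense(256, activation="relu"),
    layers.Dense(10, activation="softmax"),       # 10 Fashion-MNIST classes
])
flat = features.reshape(len(features), 4 * 4 * 512)
print(flat.shape)  # (16, 8192)
```

In the full pipeline the classifier would then be trained on `flat` with the saved labels, with the convolutional base never run again.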
Categorical Crossentropy

• Categorical crossentropy is used as the loss function for multi-class classification models where there are two or more output labels. Each output label is assigned a one-hot categorical encoding made of 0s and a single 1. If the output labels are in integer form, they are converted into categorical encoding using the keras.utils to_categorical method.
Figure 5.
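The conversion described above can be sketched with the to_categorical method itself, assuming tensorflow.keras is available; the three integer labels below are hypothetical examples.

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# Integer class labels, as they come with Fashion-MNIST.
labels = np.array([0, 3, 9])

# One-hot encoding: each row has a single 1 in the column of its class.
one_hot = to_categorical(labels, num_classes=10)
print(one_hot.shape)  # (3, 10)
```

A model trained on these targets would then be compiled with loss="categorical_crossentropy".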
Training and Validation Accuracy

Figure 6. Plot training and validation accuracy


Result and Conclusion

So far, we achieve approximately 76% accuracy, but it could be increased. I am exploring all possible methods to increase the accuracy of VGG19.

So far, the accuracy of LeNet-5 remains higher than that of VGG19.
Challenges / Future Work

• To achieve high accuracy, we need more RAM: the image dataset is very large if we want to train the model on the complete dataset. We also need to find ways to reduce the computational cost, and to work with transfer learning.

• I will try to increase accuracy and reduce the computational cost using transfer learning. After that, I will use an augmented dataset to train the model.
THANK YOU
