
Department of Computer Science & Engineering

Indian Institute of Technology-Madras

KMPA Assignment-II
Group No. 11
Vindhani Mohsin (ED11B041), Rohit R. Salunke (ME11B119), Pritesh Jain (ME11B116)
April 16, 2016

1 Task 1

1.1 Linearly Separable

1.1.1 C-SVM

Gaussian Kernel

Figure 1: Kernel Gram Matrix

Figure 2: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=0.1
Figure 3: Class 2 Vs. Rest with C=0.1
Figure 4: Class 3 Vs. Rest with C=0.1

Figure 5: Confusion Matrix with C=0.1
Figure 6: Decision boundaries with C=0.1

Figure 7: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=0.8
Figure 8: Class 2 Vs. Rest with C=0.8
Figure 9: Class 3 Vs. Rest with C=0.8

Figure 10: Confusion Matrix with C=0.8
Figure 11: Decision boundaries with C=0.8

Figure 12: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=2
Figure 13: Class 2 Vs. Rest with C=2
Figure 14: Class 3 Vs. Rest with C=2

Figure 15: Confusion Matrix with C=2
Figure 16: Decision boundaries with C=2

Polynomial Kernel

Figure 17: Kernel Gram Matrix

Figure 18: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=0.1
Figure 19: Class 2 Vs. Rest with C=0.1
Figure 20: Class 3 Vs. Rest with C=0.1

Figure 21: Confusion Matrix with C=0.1
Figure 22: Decision boundaries with C=0.1

Figure 23: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=0.9
Figure 24: Class 2 Vs. Rest with C=0.9
Figure 25: Class 3 Vs. Rest with C=0.9

Figure 26: Confusion Matrix with C=0.9
Figure 27: Decision boundaries with C=0.9

Figure 28: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=5
Figure 29: Class 2 Vs. Rest with C=5
Figure 30: Class 3 Vs. Rest with C=5

Figure 31: Confusion Matrix with C=5
Figure 32: Decision boundaries with C=5

1.1.2 Nu-SVM

Gaussian Kernel

Figure 33: Kernel Gram Matrix

Figure 34: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with Nu=0.02
Figure 35: Class 2 Vs. Rest with Nu=0.02
Figure 36: Class 3 Vs. Rest with Nu=0.02

Figure 37: Confusion Matrix with Nu=0.02
Figure 38: Decision boundaries with Nu=0.02

Figure 39: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with Nu=0.00001
Figure 40: Class 2 Vs. Rest with Nu=0.00001
Figure 41: Class 3 Vs. Rest with Nu=0.00001

Figure 42: Confusion Matrix with Nu=0.00001
Figure 43: Decision boundaries with Nu=0.00001

Polynomial Kernel

Figure 44: Kernel Gram Matrix
Figure 45: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with Nu=0.02
Figure 46: Class 2 Vs. Rest with Nu=0.02
Figure 47: Class 3 Vs. Rest with Nu=0.02

Figure 48: Confusion Matrix with Nu=0.02
Figure 49: Decision boundaries with Nu=0.02


Figure 50: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with Nu=0.00001
Figure 51: Class 2 Vs. Rest with Nu=0.00001
Figure 52: Class 3 Vs. Rest with Nu=0.00001

Figure 53: Confusion Matrix with Nu=0.00001
Figure 54: Decision boundaries with Nu=0.00001

1.1.3 Nonlinearly Separable

1.1.4 C-SVM

Gaussian Kernel

Figure 55: Kernel Gram Matrix


Figure 56: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=0.5
Figure 57: Class 2 Vs. Rest with C=0.5
Figure 58: Class 3 Vs. Rest with C=0.5

Figure 59: Confusion Matrix with C=0.5
Figure 60: Decision boundaries with C=0.5


Figure 61: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=5
Figure 62: Class 2 Vs. Rest with C=5
Figure 63: Class 3 Vs. Rest with C=5

Figure 64: Confusion Matrix with C=5
Figure 65: Decision boundaries with C=5


Polynomial Kernel

Figure 66: Kernel Gram Matrix with order 1
Figure 67: Kernel Gram Matrix with order 2

Figure 68: Kernel Gram Matrix with order 4


Figure 69: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=2 and Order=2
Figure 70: Class 2 Vs. Rest with C=2 and Order=2
Figure 71: Class 3 Vs. Rest with C=2 and Order=2

Figure 72: Confusion Matrix with C=2 and Order=2
Figure 73: Decision boundaries with C=2 and Order=2


Figure 74: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=2 and Order=4
Figure 75: Class 2 Vs. Rest with C=2 and Order=4
Figure 76: Class 3 Vs. Rest with C=2 and Order=4

Figure 77: Confusion Matrix with C=2 and Order=4
Figure 78: Decision boundaries with C=2 and Order=4


1.1.5 Nu-SVM

Gaussian Kernel

Figure 79: Kernel Gram Matrix

Figure 80: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with Nu=0.00001
Figure 81: Class 2 Vs. Rest with Nu=0.00001
Figure 82: Class 3 Vs. Rest with Nu=0.00001

Figure 83: Confusion Matrix with Nu=0.00001
Figure 84: Decision boundaries with Nu=0.00001

Figure 85: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with Nu=0.02
Figure 86: Class 2 Vs. Rest with Nu=0.02
Figure 87: Class 3 Vs. Rest with Nu=0.02

Figure 88: Confusion Matrix with Nu=0.02
Figure 89: Decision boundaries with Nu=0.02

Polynomial Kernel

Figure 90: Kernel Gram Matrix for order=2


Figure 91: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with Nu=0.2 and order=2
Figure 92: Class 2 Vs. Rest with Nu=0.2 and order=2
Figure 93: Class 3 Vs. Rest with Nu=0.2 and order=2

Figure 94: Confusion Matrix with Nu=0.2 and order=2
Figure 95: Decision boundaries with Nu=0.2 and order=2
1.1.6 Overlapping data

1.1.7 C-SVM

Gaussian Kernel


Figure 96: Kernel Gram Matrix

Figure 97: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=15
Figure 98: Class 2 Vs. Rest with C=15
Figure 99: Class 3 Vs. Rest with C=15

Figure 100: Confusion Matrix with C=15
Figure 101: Decision boundaries with C=15

Polynomial Kernel


Figure 102: Kernel Gram Matrix with order 3

Figure 103: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with C=15 and Order=3
Figure 104: Class 2 Vs. Rest with C=15 and Order=3
Figure 105: Class 3 Vs. Rest with C=15 and Order=3

Figure 106: Confusion Matrix with C=15 and Order=3
Figure 107: Decision boundaries with C=15 and Order=3


1.1.8 Nu-SVM

Gaussian Kernel

Figure 108: Kernel Gram Matrix


Figure 109: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with Nu=0.3
Figure 110: Class 2 Vs. Rest with Nu=0.3
Figure 111: Class 3 Vs. Rest with Nu=0.3

Figure 112: Confusion Matrix with Nu=0.3
Figure 113: Decision boundaries with Nu=0.3

Polynomial Kernel

Figure 114: Kernel Gram Matrix for order=2


Figure 115: Class 1 Vs. Rest plot with support vectors, bounded support vectors and margin with Nu=0.5 and order=2
Figure 116: Class 2 Vs. Rest with Nu=0.5 and order=2
Figure 117: Class 3 Vs. Rest with Nu=0.5 and order=2

Figure 118: Confusion Matrix with Nu=0.5 and order=2
Figure 119: Decision boundaries with Nu=0.5 and order=2
1.1.9 Inferences

As we decreased the C parameter in C-SVM, the width of the margin increased, which also increased the total number of support vectors and bounded support vectors. The same behaviour was observed when the Nu parameter was increased in Nu-SVM.
Beyond a certain value, the Nu parameter makes the problem infeasible. This threshold depends on the size of the dataset: the larger the dataset, the higher the values Nu can take. This appears to be because, as Nu increases, the number of support vectors increases, and this number is bounded by the sample size.
The kernel Gram matrix plots for the Gaussian kernel resembled a block-diagonal matrix more closely than those for the polynomial kernel, because the kernel values fall off exponentially for the Gaussian kernel. The blocks correspond to data points of the same class.
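
These trends can be reproduced with a short sketch, assuming scikit-learn (the report does not name its SVM library); the synthetic three-class data below is only a placeholder for the assignment datasets.

    import numpy as np
    from sklearn.svm import SVC, NuSVC

    # Placeholder 2-D, 3-class data; replace with the actual dataset.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 2)) for c in (0, 3, 6)])
    y = np.repeat([1, 2, 3], 100)

    # C-SVM: a smaller C widens the margin and yields more support vectors.
    for C in (0.1, 0.8, 2.0):
        clf = SVC(kernel="rbf", C=C, gamma="scale").fit(X, y)
        print(f"C={C}: support vectors = {clf.n_support_.sum()}")

    # Nu-SVM: a larger Nu has the same effect; Nu lower-bounds the fraction of
    # support vectors, so its feasible range is limited by the dataset.
    for nu in (0.00001, 0.02, 0.3):
        clf = NuSVC(kernel="rbf", nu=nu, gamma="scale").fit(X, y)
        print(f"nu={nu}: support vectors = {clf.n_support_.sum()}")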

Task 2

SVM is first trained on the full feature set.


Figure 120: Confusion Matrix with the original number of dimensions. Optimal results achieved for Nu=0.1 and C=0.1

2.1 PCA Dimension Reduction

The top 130 principal components explain over 95 percent of the variance in the data. We have selected the top 150 principal components.
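
A minimal sketch of this selection, assuming scikit-learn (the feature matrix below is a synthetic placeholder for the real data):

    import numpy as np
    from sklearn.decomposition import PCA

    # Synthetic placeholder for the real feature matrix (n_samples x n_features).
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(600, 400))

    pca = PCA().fit(X_train)
    cum_var = np.cumsum(pca.explained_variance_ratio_)

    # Number of components at which cumulative explained variance crosses 95%
    # (about 130 for the assignment data).
    n_95 = int(np.searchsorted(cum_var, 0.95) + 1)
    print("components for 95% variance:", n_95)

    # Project onto the top 150 components for the downstream SVM.
    X_reduced = PCA(n_components=150).fit(X_train).transform(X_train)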

Figure 121: Eigen value analysis for PCA


Figure 122: Confusion Matrix after dimensions were reduced to 150. Optimal results achieved for Nu=0.1 and C=0.1

2.2 Auto Encoder Dimension Reduction

The autoencoder has 350 neurons in the hidden layer and 150 in the bottleneck layer.
The error used for training is MSE, minimized with standard backpropagation.
The activation in the hidden layer is sigmoid.
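
A minimal PyTorch sketch of this autoencoder (the report does not state which framework was used; the input dimension and mini-batch are placeholders):

    import torch
    import torch.nn as nn

    IN_DIM = 400  # placeholder for the original feature dimension

    # Encoder: input -> 350 (sigmoid hidden layer) -> 150 (bottleneck);
    # the decoder mirrors it back to the input size.
    encoder = nn.Sequential(nn.Linear(IN_DIM, 350), nn.Sigmoid(), nn.Linear(350, 150))
    decoder = nn.Sequential(nn.Linear(150, 350), nn.Sigmoid(), nn.Linear(350, IN_DIM))
    autoencoder = nn.Sequential(encoder, decoder)

    criterion = nn.MSELoss()                      # reconstruction error
    optimizer = torch.optim.SGD(autoencoder.parameters(), lr=0.01)

    x = torch.randn(32, IN_DIM)                   # placeholder mini-batch
    for _ in range(100):                          # plain backpropagation loop
        optimizer.zero_grad()
        loss = criterion(autoencoder(x), x)
        loss.backward()
        optimizer.step()

    codes = encoder(x).detach()                   # 150-D features fed to the SVM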

Figure 123: Confusion Matrix for the autoencoder after dimensions were reduced to 150. Optimal results achieved for Nu=0.003 and C=0.5


2.3 Stacked Auto Encoder Dimension Reduction

The stacked autoencoder consists of three autoencoders.
The first autoencoder has a nonlinear layer with 420 neurons followed by a linear layer of size 350.
The second has a nonlinear layer of size 250 followed by a linear layer of size 180.
The third has a nonlinear layer of size 130 followed by a linear layer of size 100.
Training is done using MSE with backpropagation.
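
A rough PyTorch sketch of the resulting encoder stack (layer sizes follow the description above; the input dimension is a placeholder, and the layer-wise training loops, which mirror the previous sketch, are omitted):

    import torch.nn as nn

    IN_DIM = 400  # placeholder input dimension

    # Encoder halves of the three autoencoders, stacked in order.
    encoder1 = nn.Sequential(nn.Linear(IN_DIM, 420), nn.Sigmoid(), nn.Linear(420, 350))
    encoder2 = nn.Sequential(nn.Linear(350, 250), nn.Sigmoid(), nn.Linear(250, 180))
    encoder3 = nn.Sequential(nn.Linear(180, 130), nn.Sigmoid(), nn.Linear(130, 100))

    # Each autoencoder is trained with MSE and backpropagation on the codes of
    # the one before it; the full stack maps the input down to 100 dimensions.
    stacked_encoder = nn.Sequential(encoder1, encoder2, encoder3)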

Figure 124: Confusion Matrix for the stacked autoencoder after dimensions were reduced to 100. Optimal results achieved for Nu=0.003 and C=0.5

Task 3
Convolutional Neural Network
The image is resized to 32×32.
Input: 3 feature maps (RGB).
The image size after the first convolution layer is 28×28, and 16 feature maps are generated.
Sub-sampling using max pooling of (2×2) is done, so after pooling the size is 14×14.
The image size after the second convolution layer is 10×10.
Then max pooling of (2×2) is applied, so it becomes 5×5; 32 maps are generated.
After the third convolution, the output is (1×1), and 300 features are generated from this.
The first MLP layer takes the 300 inputs from the convolution stage and has 150 outputs.
The second MLP layer then has 150 inputs and 5 outputs for the 5 classes.
The softmax function is used for the output layer, followed by the cross-entropy error for classification.
The training is done using mini-batch stochastic gradient descent with learning rate 0.1.
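
A minimal PyTorch sketch of this architecture (an assumption, since the framework is not named in the report; the 5×5 kernel sizes are inferred from the 32→28, 14→10 and 5→1 size reductions, and the ReLU activations in the hidden layers are a choice not specified in the text):

    import torch
    import torch.nn as nn

    class ConvNet(nn.Module):
        def __init__(self, n_classes=5):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=5),    # 3x32x32 -> 16x28x28
                nn.ReLU(),
                nn.MaxPool2d(2),                    # -> 16x14x14
                nn.Conv2d(16, 32, kernel_size=5),   # -> 32x10x10
                nn.ReLU(),
                nn.MaxPool2d(2),                    # -> 32x5x5
                nn.Conv2d(32, 300, kernel_size=5),  # -> 300x1x1
                nn.ReLU(),
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),                       # 300 features
                nn.Linear(300, 150),                # first MLP layer
                nn.ReLU(),
                nn.Linear(150, n_classes),          # softmax is folded into the loss
            )

        def forward(self, x):
            return self.classifier(self.features(x))

    model = ConvNet()
    criterion = nn.CrossEntropyLoss()               # softmax + cross-entropy
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
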
The confusion matrix for CNN is as follows:
154  114   21   34   32
 81  144   11   37   26
 43   25   39    7    7
 31   36    4   41   15
 32   41    3   26   36

Task 4

The activations of the layer just before the classification layer of the CNN are used as input to the SVM. The number of features is thus 150.
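
A sketch of this pipeline, reusing the hypothetical ConvNet from the Task 3 sketch and assuming scikit-learn's NuSVC for the SVM; the images and labels below are placeholders:

    import torch
    from sklearn.svm import NuSVC

    # Activations of the 150-unit layer just before the final classification layer.
    feature_head = torch.nn.Sequential(model.features, model.classifier[:3])

    with torch.no_grad():
        X_feat = feature_head(torch.randn(64, 3, 32, 32)).numpy()  # placeholder images
    y = torch.randint(0, 5, (64,)).numpy()                          # placeholder labels

    svm = NuSVC(nu=0.003, kernel="rbf").fit(X_feat, y)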

Figure 125: Confusion Matrix with 150 dimensions. Results achieved for Nu=0.003 and C=15

Task 5
Deep Boltzmann Machine
The task is performed for 5 classes, numbered 13, 16, 48, 56 and 93.
New labels 1, 2, 3, 4, 5 are assigned correspondingly.
A total of 513 images are available for the five classes combined.
The data is split 0.75/0.25 into training and test sets.
A 1-of-K representation is used for classification.
There are 256 input features and 5 output classes.
The DBM is built using a layer configuration of (256, 100, 50, 30, 5), comprising the input layer, three hidden layers and the output layer.

The visible and hidden layers use sigmoid units and the output uses a softmax unit.
Pretraining is done with 15 epochs.
Fine-tuning uses backpropagation with 50 epochs.
k-step contrastive divergence is used with k=1.
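
There is no standard off-the-shelf DBM in common Python libraries, so the sketch below only approximates this pipeline with DBN-style greedy layer-wise pretraining using scikit-learn's BernoulliRBM (which trains with persistent contrastive divergence rather than the plain CD-1 mentioned above) and a softmax classifier on top; the fine-tuning stage is only indicated in a comment, and the data arrays are placeholders:

    import numpy as np
    from sklearn.neural_network import BernoulliRBM
    from sklearn.linear_model import LogisticRegression

    # Placeholder data: 256-D inputs in [0, 1], 5 classes, 513 samples as in the report.
    rng = np.random.default_rng(0)
    X = rng.random((513, 256))
    y = rng.integers(1, 6, size=513)

    # Greedy layer-wise pretraining of the 256-100-50-30 stack, 15 epochs each.
    layers, codes = [], X
    for n_hidden in (100, 50, 30):
        rbm = BernoulliRBM(n_components=n_hidden, n_iter=15,
                           learning_rate=0.05, random_state=0).fit(codes)
        layers.append(rbm)
        codes = rbm.transform(codes)

    # Softmax output layer on the 30-D codes; in the report the whole stack is
    # then fine-tuned with backpropagation for 50 epochs.
    clf = LogisticRegression(max_iter=1000).fit(codes, y)
    print("train accuracy:", clf.score(codes, y))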

Confusion matrix for the test data:

31   1   2   1   2
 0  15   0   8   0
 2   0  23   1   1
 0   0   1  19   0
 1   0   3   1  26

Confusion matrix for the train data:

87   1   4   1   1
 0  64   1  12   0
 3   0  65   0   1
 2   4   1  71   1
 2   0   0   0  63

Training accuracy: 91%. Test accuracy: 81%.
