
Multivariate Analysis Case Study
Nuwan Senevirathne

Index No : 10534

Question 1

I. The data set contains 7 variables and 23 observations (each observation is a country).
II. Covariance Matrix (MATLAB code: cov(X))


Descriptive Statistics

Variable   Mean        Std. Deviation   Variance
X1         11.4117     .39080           0.159669565
X2         23.1217     .90449           0.855287747
X3         52.0913     2.16821          4.914802767
X4         2.0263      .07988           0.006670356
X5         4.1829      .20452           0.043731225
X6         9.0921      .51907           0.281681423
X7         158.3775    11.86974         147.2948292
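The table can be checked for internal consistency. The sketch below (Python, standard library only) copies the standard deviations and variances from the table and compares each variance with the squared standard deviation. For every variable the ratio comes out close to 23/22. This suggests that the variance column uses the n - 1 divisor of cov(X) while the standard deviation column uses the divisor n, but that is an inference from the numbers, not something the report states. The variances also add up to the trace of the covariance matrix.

```python
# Values copied from the Descriptive Statistics table (X1..X7).
names = ["X1", "X2", "X3", "X4", "X5", "X6", "X7"]
std_dev = [0.39080, 0.90449, 2.16821, 0.07988, 0.20452, 0.51907, 11.86974]
variance = [0.159669565, 0.855287747, 4.914802767, 0.006670356,
            0.043731225, 0.281681423, 147.2948292]
n = 23  # number of observations (countries)

# Ratio of the tabulated variance to the squared standard deviation.
ratios = [v / s ** 2 for v, s in zip(variance, std_dev)]
for name, r in zip(names, ratios):
    print(f"{name}: variance / sd^2 = {r:.4f}")
print(f"n / (n - 1) = {n / (n - 1):.4f}")

# The sum of the variances is the trace of the covariance matrix.
total_variance = sum(variance)
print(f"sum of variances = {total_variance:.3f}")
```

X7 alone accounts for roughly 96% of the summed variance, so any analysis of the unstandardized covariance matrix is dominated by that variable.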

III. Correlation Matrix (MATLAB code: corrcoef(X))

IV. The original data set can be standardized with the MATLAB function zscore(X) (see the
attached Excel data sheet). Applying cov to the standardized data then gives the covariance
matrix of the standardized data. It is the same as the correlation matrix of the original
(non-standardized) data.
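The equality between the covariance of standardized data and the correlation of the raw data can be illustrated with a minimal sketch. It uses only the Python standard library and two short hypothetical columns (not the case-study data). The helper names zscore, covariance and correlation are defined here for illustration; they mimic the default n - 1 divisor of the MATLAB functions but are not those functions.

```python
from statistics import mean, stdev

def zscore(column):
    """Standardize a column: subtract the mean, divide by the sample SD (n - 1)."""
    m, s = mean(column), stdev(column)
    return [(x - m) / s for x in column]

def covariance(a, b):
    """Sample covariance with divisor n - 1."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def correlation(a, b):
    """Pearson correlation of two raw columns."""
    return covariance(a, b) / (stdev(a) * stdev(b))

# Hypothetical data, not taken from the case study.
x = [11.2, 11.5, 10.9, 11.8, 11.4]
y = [22.7, 23.4, 22.1, 24.0, 23.0]

cov_of_z = covariance(zscore(x), zscore(y))   # covariance of standardized columns
corr_of_raw = correlation(x, y)               # correlation of the original columns
print(cov_of_z, corr_of_raw)
```

The two printed values agree up to floating-point rounding, and the variance of each standardized column is 1, which is why the diagonal of the standardized covariance matrix consists of ones.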
V. The covariance matrix was reproduced by spectral decomposition (MATLAB code: svd(X) for the
decomposition), keeping only the 2 largest eigenvalues because together they explain 99.89% of the total variance.



The MATLAB code used to calculate the eigenvalues and eigenvectors was [v,d]=eigs(cov(X),7).
The remaining calculations were done in Excel (MMULT function).

The trace of the original matrix is 153.556 and the trace of the reproduced matrix is 153.398. Hence
the approximate rank is 2, because the 2 largest eigenvalues together account for more
than 99% of the total variance. (See the eigenvector calculations in the Excel data sheet.)
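The idea of reproducing a covariance matrix from its largest eigenpairs can be sketched as follows. This is not the MATLAB/Excel workflow used in the report: it is a standard-library Python illustration that finds the eigenpairs by power iteration with deflation, applied to a small hypothetical 3 x 3 matrix. All function names are defined here for illustration only.

```python
def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dominant_eigenpair(A, iterations=500):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    v = [1.0] * len(A)
    for _ in range(iterations):
        w = matvec(A, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(A, v)))
    return eigenvalue, v

def rank_k_approximation(A, k):
    """Sum of lambda_i * e_i * e_i' over the k largest eigenpairs (found by deflation)."""
    n = len(A)
    residual = [row[:] for row in A]
    approx = [[0.0] * n for _ in range(n)]
    eigenvalues = []
    for _ in range(k):
        lam, e = dominant_eigenpair(residual)
        eigenvalues.append(lam)
        for i in range(n):
            for j in range(n):
                term = lam * e[i] * e[j]
                approx[i][j] += term      # build up the reproduced matrix
                residual[i][j] -= term    # remove the component just found
    return approx, eigenvalues

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Hypothetical 3 x 3 covariance matrix, not the one from the case study.
S = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.0],
     [0.0, 0.0, 0.01]]

S2, top_two = rank_k_approximation(S, 2)
explained = trace(S2) / trace(S)
print(f"explained share of total variance: {explained:.4f}")
```

The trace of the reproduced matrix equals the sum of the retained eigenvalues, so the ratio of the two traces is the share of variance explained. With the traces reported above the same ratio is 153.398 / 153.556, or about 0.999.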

Question 2

The scree plot

Judging by the slopes of the scree plot, 3 principal components (PCs) are clearly
enough for the analysis.

In addition, one may examine the residual matrices to decide how many PCs to keep. They
also suggest that 3 PCs would be ideal (see the 3 residual matrices calculated in the
attached Excel sheet; the separate calculations show what happens when only 1 or 2 factors are
retained).
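One common way to write such a residual matrix is given below. It assumes the residual is taken as the correlation matrix R minus its reproduction from the first k components; the report does not say exactly how its residuals were formed, so the notation R_k, lambda_i and e_i is introduced here for illustration.

```latex
R_k \;=\; R \;-\; \sum_{i=1}^{k} \lambda_i \, e_i e_i^{\top}
    \;=\; \sum_{i=k+1}^{7} \lambda_i \, e_i e_i^{\top},
\qquad
\sum_{a,b} \bigl(R_k\bigr)_{ab}^{2} \;=\; \sum_{i=k+1}^{7} \lambda_i^{2}
```

Here the lambda_i are the eigenvalues of R in decreasing order and the e_i are the corresponding orthonormal eigenvectors. Under this definition the entries of the residual matrix are small exactly when the discarded eigenvalues are small, which is the criterion for choosing k.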

SPSS Output:

Total Variance Explained (Rotation Sums of Squared Loadings)

Component   Total    % of Variance   Cumulative %
1           2.831    40.450          40.450
2           2.216    31.657          72.106
3           1.760    25.148          97.254

Extraction Method: Principal Component Analysis.

Varimax rotation makes the components easier to interpret. With 3 components
selected, 97.25% of the total variance is explained.
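The percentage columns of the SPSS table follow from the component totals. The sketch below assumes the analysis was run on the correlation matrix (the report does not state this explicitly), so that each of the 7 standardized variables contributes a variance of 1 and the total variance is 7.

```python
# Rotated component totals copied from the SPSS table.
rotated_totals = [2.831, 2.216, 1.760]
n_variables = 7  # X1..X7; assumed total variance of the standardized data

# Share of total variance for each component, in percent.
percent = [100 * t / n_variables for t in rotated_totals]

# Running sum of the percentages.
cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

for i, (t, p, c) in enumerate(zip(rotated_totals, percent, cumulative), start=1):
    print(f"component {i}: total={t:.3f}  %={p:.3f}  cumulative %={c:.3f}")
```

Because the totals in the table are rounded to three decimals, the recomputed percentages can differ from the SPSS figures in the second decimal place.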

The rotated loadings indicate that X1, X2 and X3 are highly correlated. X4 shows a fairly strong
relation to that group, but it is correlated even more strongly with X5 and X6.
X7 is not related to any of the other variables and stands completely isolated.
The component plot in rotated space gives a very clear picture of the variables. X1, X2 and X3 lie
on one side and X5 and X6 on the other, with X4 in between but a little closer to X5 and X6 than to
X1, X2 and X3. X7 lies away from the rest of the variables.
If these variables are seen as belonging to 3 categories, say short, medium and long, then X1, X2 and X3
clearly belong to the short category and X5 and X6 clearly to the medium category. X4 is somewhere
in between but much closer to the medium category. X7 then belongs to the long category.
