
Multidimensional Scaling (MDS) Report

1. Introduction

MDS is a technique for visualizing and exploring the similarities and dissimilarities between items
based on a distance or dissimilarity matrix. The results of applying MDS to the supplied dataset
are presented in this report. The goal is to discover hidden patterns and relationships between
locations based on their coordinates.

2. Data Description

The dataset used for MDS consists of coordinate points for various locations on a plot. It includes
the following points:

3. Methodology

MDS was applied to the dissimilarity matrix in the following steps:

• Computing the dissimilarity matrix from the supplied points.
• Creating an initial point configuration in a lower-dimensional space.
• Iteratively adjusting the point placements to reduce the discrepancy between the original
dissimilarities and the distances in the lower-dimensional space.
• Evaluating the final configuration's goodness-of-fit.
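
The first and last steps above can be sketched in a few lines of Python. This is a minimal
illustration, not the procedure used for this report: the three points and the one-dimensional
configurations are hypothetical, the dissimilarity is assumed to be Euclidean distance, and the
fit is measured with Kruskal's Stress-1, taking the dissimilarities themselves as the target
distances (metric MDS).

```python
import math

def dissimilarity_matrix(points):
    """Return the matrix of pairwise Euclidean distances between points."""
    return [[math.dist(p, q) for q in points] for p in points]

def stress1(dissim, config):
    """Kruskal's Stress-1 of a configuration against target dissimilarities.

    Computed over pairs i < j as sqrt(sum (d_ij - delta_ij)^2 / sum d_ij^2),
    where d_ij are distances in the configuration and delta_ij the targets.
    """
    fitted = dissimilarity_matrix(config)
    n = len(config)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    num = sum((fitted[i][j] - dissim[i][j]) ** 2 for i, j in pairs)
    den = sum(fitted[i][j] ** 2 for i, j in pairs)
    return math.sqrt(num / den)

# Hypothetical 2-D locations (not the report's data); they lie on one line.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
dissim = dissimilarity_matrix(points)

# A 1-D configuration that preserves every distance, and one that does not.
exact = [(0.0,), (5.0,), (10.0,)]
squeezed = [(0.0,), (5.0,), (6.0,)]

print(stress1(dissim, exact))     # no distortion
print(stress1(dissim, squeezed))  # distances to the third point are distorted
```

An iterative MDS routine would repeatedly move the points of a configuration such as squeezed
so that this stress value decreases.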
4. Results

a. Eigenvalues: The eigenvalues obtained from the MDS analysis are as follows:

b. Goodness-of-Fit: The goodness-of-fit (GOF) measure for the MDS analysis is given by:
Stress 1: 0.6490358
Stress 2: 0.6490358

Fig 01: Multidimensional Scaling plot

5. Interpretation

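a. Configuration Interpretation: It is difficult to evaluate the MDS results fully without the
arrangement of points in the lower-dimensional space ($x is NULL). The configuration holds the
transformed coordinates of the points, which is what allows them to be plotted in relation to one
another and inspected for patterns. If the analysis was run with R's cmdscale, the configuration
is returned in the points component, while x holds the doubly centered distance matrix and is
NULL unless x.ret = TRUE.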
b. Eigenvalues Interpretation: The eigenvalues show how much of the variance each dimension
explains in the lower-dimensional space. Here the first eigenvalue is substantially greater than
the others, indicating that the first dimension accounts for a considerable portion of the
dissimilarity information. Subsequent eigenvalues are progressively smaller, implying that each
additional dimension captures less of the dissimilarity variation.
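c. Goodness-of-Fit Interpretation: The stress measure indicates how well the MDS configuration
reproduces the original dissimilarity matrix; a lower stress value means a better fit. Both values
reported here (Stress 1 and Stress 2) are 0.649. Read as Kruskal stress, 0.649 is far above the
level of roughly 0.2 that is usually regarded as poor, so it would indicate a poor rather than a
moderate fit. If the two values are instead the goodness-of-fit ratios that R's cmdscale reports
(where higher is better), 0.649 means that the retained dimensions account for about 65% of the
eigenvalue total, which is a moderate fit.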
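
The eigenvalue and goodness-of-fit readings above reduce to simple ratios, sketched here in
Python. The eigenvalues are hypothetical, since the report's own values are not listed. The two
ratios follow the definition documented for the GOF component of R's cmdscale; that this report
used cmdscale is an assumption.

```python
def explained_share(eigenvalues):
    """Share of the positive eigenvalue total carried by each dimension."""
    positive_total = sum(max(v, 0.0) for v in eigenvalues)
    return [max(v, 0.0) / positive_total for v in eigenvalues]

def goodness_of_fit(eigenvalues, k):
    """Two goodness-of-fit ratios for a k-dimensional solution.

    g1 divides the top-k eigenvalue sum by the sum of absolute eigenvalues,
    g2 divides it by the sum of positive eigenvalues (assumed cmdscale-style).
    """
    ordered = sorted(eigenvalues, reverse=True)
    top = sum(ordered[:k])
    g1 = top / sum(abs(v) for v in ordered)
    g2 = top / sum(max(v, 0.0) for v in ordered)
    return g1, g2

# Hypothetical eigenvalues, NOT the ones from this report.
euclidean_like = [6.0, 3.0, 1.0, 0.0]       # no negative eigenvalues
non_euclidean = [6.0, 3.0, 1.0, -2.0]       # one negative eigenvalue

print(explained_share(euclidean_like))
print(goodness_of_fit(euclidean_like, k=2))
print(goodness_of_fit(non_euclidean, k=2))
```

Under this definition the two ratios coincide whenever no eigenvalue is negative, which would be
consistent with the two identical values of 0.649 reported above.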

6. Conclusion

MDS was applied to the given dataset, resulting in a lower-dimensional representation of the
locations based on their coordinates. The analysis found that the first dimension accounts for a
considerable portion of the variance, showing its importance in capturing underlying patterns. The
reported fit values of 0.649 call for caution: read as Kruskal stress they indicate a poor fit, and
read as goodness-of-fit ratios (where higher is better) they indicate only a moderate one.

Clustering Report

1. Introduction

In this study, I use k-means clustering to analyze a collection of leaf features from several plant
species. The goal is to identify distinct clusters based on leaf shape, leaf color, leaf diameter,
height, and sepal length.

2. Data Description

The dataset is made up of observations on plant species that are represented by five characteristics:
leaf shape, leaf color, leaf diameter, height, and sepal length. Each feature is graded on a scale of
0 to 4. There are a total of ten observations in the dataset.

3. Methodology

I used the k-means clustering algorithm to group the plant species according to their leaf
attributes. The number of clusters in this study was set at four.
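
As a rough illustration of how k-means forms its groups, the sketch below implements Lloyd's
algorithm in plain Python: assign each point to its nearest center, recompute each center as the
mean of its members, and repeat until the centers stop moving. The data and starting centers are
hypothetical, and this is not necessarily the variant used for this report (R's kmeans, for
example, defaults to the Hartigan-Wong algorithm).

```python
import math

def kmeans(points, centers, max_iter=100):
    """Lloyd's k-means: returns (labels, centers) for the given start centers."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for every point.
        labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center becomes the mean of its assigned points.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:              # converged
            break
        centers = new_centers
    return labels, centers

# Hypothetical observations with two features each (not the report's data).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(labels, centers)
```

Because the assignment step uses Euclidean distance, features measured on larger scales, such as
height here, weigh more heavily than the others unless the data are standardized first.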

4. Results

a. Cluster Means: The following table presents the cluster means, representing the centroid values
for each feature in each cluster:

Cluster   Leaf Shape   Leaf Color   Leaf Diameter   Height   Sepal Length
1         2.67         0.97         14.67           198.54   2.72
2         2.33         1.68         29.26           103.27   3.10
3         2.67         2.20         14.16           106.89   3.57
4         1.00         2.16         23.67            78.56   2.70

b. Clustering Vector: The clustering vector represents the cluster assignments for each data point.
The following vector shows the cluster assignments for the 10 plant species:
[1] 2 2 2 3 1 4 3 1 1 3

c. Within Cluster Sum of Squares: The within-cluster sum of squares measures the variation within
each cluster:

Cluster   Sum of Squares
1         294.73
2          96.16
3          71.94
4           0.00

d. Total Within-Cluster Sum of Squares and Between-Cluster Sum of Squares:

• Total Within-Cluster Sum of Squares (tot.withinss): 462.83

• Between-Cluster Sum of Squares (betweenss): 114.10 (19.8% of the total sum of squares)

e. Cluster Sizes:

Cluster   Size
1         3
2         3
3         3
4         1

Fig 02: Clustering plot
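
The figures reported in this section can be cross-checked against one another with a few lines of
Python. The sketch below uses only the values listed above; the variable names are chosen for
illustration.

```python
from collections import Counter

# Values copied from the Results section of this report.
withinss = [294.73, 96.16, 71.94, 0.00]
betweenss = 114.10
assignments = [2, 2, 2, 3, 1, 4, 3, 1, 1, 3]

tot_withinss = sum(withinss)
totss = tot_withinss + betweenss       # total = within + between
between_share = betweenss / totss
sizes = dict(sorted(Counter(assignments).items()))

print(f"tot.withinss = {tot_withinss:.2f}")
print(f"betweenss / totss = {between_share:.1%}")
print(f"cluster sizes = {sizes}")
```

The per-cluster sums add up to the reported total of 462.83, the between-cluster share matches the
reported 19.8%, and the sizes derived from the clustering vector match the table. A share of 19.8%
means that the four clusters account for only about a fifth of the total variation.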

5. Interpretation

a. Cluster Profiles:
1. Cluster means:
• Cluster 1: The mean leaf shape of this cluster is 2.67, the leaf color is 0.97, the leaf
diameter is 14.67, the height is 198.54, and the sepal length is 2.72. This group of
plants has medium leaf shapes, the lowest leaf color intensity, small leaf diameters, by far
the tallest heights, and relatively short sepal lengths.
• Cluster 2: The mean leaf shape in this cluster is 2.33, the leaf color is 1.68, the leaf
diameter is 29.26, the height is 103.27, and the sepal length is 3.10. Compared with
Cluster 1, this cluster's plants have slightly lower leaf shape values, moderate leaf color
intensity, the largest leaf diameters, much shorter heights, and longer sepal lengths.
• Cluster 3: The mean leaf shape in this cluster is 2.67, the leaf color is 2.20, the leaf
diameter is 14.16, the height is 106.89, and the sepal length is 3.57. Plants in this
cluster have the same medium leaf shapes as Cluster 1 and slightly smaller leaf diameters,
but they are much shorter (106.89 versus 198.54) and have the strongest leaf color
intensity and the longest sepal lengths.
• Cluster 4: This cluster contains a single data point, so its centroid is the point itself: a
leaf shape of 1, a leaf color of 2.16, a leaf diameter of 23.67, a height of 78.56, and a sepal
length of 2.70. This plant stands apart with the lowest leaf shape value, leaf color intensity
close to that of Cluster 3, a medium leaf diameter, the lowest height, and the shortest sepal
length.
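
Labels such as "tall" or "short" are relative to the other clusters, so it can help to rank the
centroids feature by feature. The sketch below does this in Python with the cluster means from the
table in Section 4; the dictionary layout and function name are illustrative choices.

```python
# Cluster means copied from the Cluster Means table in this report.
centroids = {
    1: {"leaf_shape": 2.67, "leaf_color": 0.97, "leaf_diameter": 14.67,
        "height": 198.54, "sepal_length": 2.72},
    2: {"leaf_shape": 2.33, "leaf_color": 1.68, "leaf_diameter": 29.26,
        "height": 103.27, "sepal_length": 3.10},
    3: {"leaf_shape": 2.67, "leaf_color": 2.20, "leaf_diameter": 14.16,
        "height": 106.89, "sepal_length": 3.57},
    4: {"leaf_shape": 1.00, "leaf_color": 2.16, "leaf_diameter": 23.67,
        "height": 78.56, "sepal_length": 2.70},
}

def rank_clusters(centroids, feature):
    """Cluster ids ordered from the highest to the lowest mean of a feature."""
    return sorted(centroids, key=lambda c: centroids[c][feature], reverse=True)

for feature in ["leaf_color", "leaf_diameter", "height", "sepal_length"]:
    print(feature, rank_clusters(centroids, feature))
```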
