Digital Assignment – 2

1. Find the mean, standard deviation, and variance for each of these data sets.
[12 23 34 44 59 70 98]
[12 15 25 27 32 88 99]
[15 35 78 82 90 95 97]
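
One way to check hand calculations for these three data sets is a short Python sketch using the standard `statistics` module. The question does not say whether the population or the sample formulas are intended, so the sketch reports both; the names `data_sets`, `summarize`, and `summaries` are illustrative only.

```python
import statistics

# The three data sets exactly as given in the question.
data_sets = {
    "set 1": [12, 23, 34, 44, 59, 70, 98],
    "set 2": [12, 15, 25, 27, 32, 88, 99],
    "set 3": [15, 35, 78, 82, 90, 95, 97],
}


def summarize(values):
    """Return the mean plus population and sample variance / std. deviation."""
    return {
        "mean": statistics.mean(values),
        "pop_var": statistics.pvariance(values),    # divides by n
        "pop_std": statistics.pstdev(values),
        "sample_var": statistics.variance(values),  # divides by n - 1
        "sample_std": statistics.stdev(values),
    }


summaries = {name: summarize(values) for name, values in data_sets.items()}

for name, stats in summaries.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The standard deviation is the square root of the variance in both conventions, so only one of the two needs to be computed by hand.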

2. PCA: Compute the three principal components of the covariance matrix given below:
1 0 0
0 4 -3
0 -3 9

What percentage of the variance do the first two principal components capture?
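
The principal components of a covariance matrix are its eigenvectors, and each eigenvalue is the variance captured along the corresponding component. A minimal sketch with NumPy follows, assuming NumPy is available; the variable names are illustrative. The sign of each eigenvector is arbitrary, so a hand-derived vector may differ by a factor of -1.

```python
import numpy as np

# Covariance matrix from the question.
cov = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 4.0, -3.0],
    [0.0, -3.0, 9.0],
])

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Reorder so the component with the largest variance comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]  # column i is principal component i

# Fraction of the total variance captured by each component.
explained = eigenvalues / eigenvalues.sum()
first_two_share = explained[:2].sum()

print("eigenvalues:", eigenvalues)
print("principal components (as columns):")
print(eigenvectors)
print("share captured by the first two components:", first_two_share)
```

Because the matrix is block diagonal, one eigenvalue is 1 and the other two are the roots of λ² − 13λ + 27 = 0, which gives a quick manual check on the numerical result.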

3. a) Use the k-means algorithm and Euclidean distance to cluster the following 8 examples into 3 clusters:

A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4), A7=(1,2), A8=(4,9).
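
The k-means iterations can be sketched in plain Python as below. The question does not specify the initial cluster centres, so the sketch assumes A1, A4, and A7 as seeds; that choice is an assumption, and a different one can lead to a different clustering. The function and variable names are illustrative.

```python
import math

points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, initial_centers, max_iter=100):
    """Alternate assignment and centre updates until assignments stop changing."""
    centers = list(initial_centers)
    assignment = {}
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centre.
        new_assignment = {
            name: min(range(len(centers)), key=lambda i: euclidean(p, centers[i]))
            for name, p in points.items()
        }
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Update step: move each centre to the mean of its members.
        for i in range(len(centers)):
            members = [points[n] for n, c in assignment.items() if c == i]
            if members:
                centers[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignment, centers


# Assumed seeds (not given in the question): A1, A4, A7.
seeds = [points["A1"], points["A4"], points["A7"]]
assignment, centers = kmeans(points, seeds)

clusters = [sorted(n for n, c in assignment.items() if c == i) for i in range(3)]
for members, center in zip(clusters, centers):
    print(members, "centre:", center)
```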

b) Show the result of applying agglomerative clustering with the single-link technique to the following 8 examples:

A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4), A7=(1,2), A8=(4,9).
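
Single-link agglomerative clustering starts with every example in its own cluster and repeatedly merges the two clusters whose closest members are nearest to each other. A plain-Python sketch under these assumptions: Euclidean distance is used (the question does not name a metric for this part), and ties between equally close pairs are broken by scan order, so the order of tied merges may differ from a hand solution. The names `single_link` and `merges` are illustrative.

```python
import math

points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}


def euclidean(p, q):
    """Euclidean distance between two 2-D points."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


def single_link(points):
    """Return the merge history as (distance, cluster_a, cluster_b) tuples."""
    clusters = [[name] for name in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single link: distance between the closest pair of members.
                d = min(
                    euclidean(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((d, sorted(clusters[i]), sorted(clusters[j])))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


merges = single_link(points)
for distance, left, right in merges:
    print(f"{distance:.3f}: merge {left} with {right}")
```

Reading the merge history from top to bottom gives the dendrogram levels; stopping when three clusters remain yields a flat clustering that can be compared with the k-means result.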

4. Consider the data for height classification (as Short/Medium/Tall) in the table given below. Use only the height
attribute for distance calculation.
Name       Gender  Height (m)  Output
Kristina   F       1.6         Short
Jim        M       2.0         Tall
Maggie     F       1.9         Medium
Martha     F       1.88        Medium
Stephanie  F       1.7         Short
Bob        M       1.85        Medium
Kathy      F       1.6         Short
Dave       M       1.7         Short
Worth      M       2.2         Tall
Steven     M       2.1         Tall
Debbie     F       1.8         Medium
Todd       M       1.95        Medium
Kim        F       1.9         Medium
Amy        F       1.8         Medium
Wynette    F       1.75        Medium
(i) What output would a 5-NN model using Euclidean distance return for the query <Jack, F, 1.6>?
(ii) What output would a 5-NN model using Manhattan distance return for the same query <Jack, F, 1.6>?
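
A k-nearest-neighbour vote on the height attribute alone can be sketched as follows. With a single numeric attribute, Euclidean and Manhattan distance both reduce to the absolute difference, so the two models rank the neighbours identically. The function names are illustrative, and a plain majority vote is assumed because the question does not specify a tie-breaking rule.

```python
import math
from collections import Counter

# (name, gender, height in metres, class) rows from the table.
training = [
    ("Kristina", "F", 1.6, "Short"), ("Jim", "M", 2.0, "Tall"),
    ("Maggie", "F", 1.9, "Medium"), ("Martha", "F", 1.88, "Medium"),
    ("Stephanie", "F", 1.7, "Short"), ("Bob", "M", 1.85, "Medium"),
    ("Kathy", "F", 1.6, "Short"), ("Dave", "M", 1.7, "Short"),
    ("Worth", "M", 2.2, "Tall"), ("Steven", "M", 2.1, "Tall"),
    ("Debbie", "F", 1.8, "Medium"), ("Todd", "M", 1.95, "Medium"),
    ("Kim", "F", 1.9, "Medium"), ("Amy", "F", 1.8, "Medium"),
    ("Wynette", "F", 1.75, "Medium"),
]


def euclidean_1d(a, b):
    """Euclidean distance on one attribute."""
    return math.sqrt((a - b) ** 2)


def manhattan_1d(a, b):
    """Manhattan distance on one attribute."""
    return abs(a - b)


def knn_predict(query_height, k, distance):
    """Return the majority class among the k nearest rows, plus those rows."""
    ranked = sorted(training, key=lambda row: distance(row[2], query_height))
    neighbours = ranked[:k]
    votes = Counter(row[3] for row in neighbours)
    return votes.most_common(1)[0][0], neighbours


# Query <Jack, F, 1.6>: only the height is used for the distance.
euclid_label, euclid_neighbours = knn_predict(1.6, 5, euclidean_1d)
manhattan_label, manhattan_neighbours = knn_predict(1.6, 5, manhattan_1d)

print("Euclidean:", euclid_label, [row[0] for row in euclid_neighbours])
print("Manhattan:", manhattan_label, [row[0] for row in manhattan_neighbours])
```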

5. For the following training data, build a decision tree and predict the class
of the following new example: age<=30, income=medium, student=yes, credit-rating=fair.
