SML-SET 1-Batch 1-Answer key


SRM Institute of Science and Technology
College of Engineering and Technology
School of Computing
DEPARTMENT OF COMPUTATIONAL INTELLIGENCE
SRM Nagar, Kattankulathur – 603203, Chengalpattu District, Tamilnadu
Academic Year: 2023-2024
Mode of Exam: OFFLINE
SET-A

Test: CLAT-1
Date: 09.08.2023
Course Code & Title: 18CSE479T - Statistical Machine Learning
Duration: 1 Period
Year & Sem: III/IV
Max. Marks: 25

Course Learning Outcomes

At the end of this course the learner will be able to:
1 Acquire knowledge of statistical machine learning techniques
2 Acquire the ability to build models based on logistic regression and random forest techniques
3 Understand the basic ideas of probability and work with probabilistic approaches such as Naïve Bayes and Bayes' Theorem
4 Apply the knowledge of kernel functions in practical applications
5 Apply the knowledge of K-means clustering, PCA and SVD with Scikit-learn

Course Learning Outcomes

At the end of this course the learner will be able to:
CO1   H   -   L   M   -   L   L   M   H   H   M   L
CO2   H   H   H   L   -   L   -   L   H   H   H   L
CO3   H   -   L   L   M   -   -   M   M   M   M   -
CO4   H   H   L   M   H   M   -   M   M   L   L   M
CO5   H   -   H   H   H   -   -   M   L   M   M   -

Question No   Reference to Outcome   Marks Allocated (Total 25)   Marks Score   Outcomes Met (Yes/No)
1             1                      1
2             1                      1
3             1                      1
4             1                      1
5             1                      1
6             1                      4
7             1                      4
8             1                      4
9             1                      12
10            1                      12
Faculty Name:
Signature:

Course Articulation Matrix: (to be placed)

Part – A
(5 x 1 = 5 Marks)
Instructions: Answer all Questions
Q. No   Question   Marks   BL   CO   PO   PI Code
1. Consider machine learning models, which are built through an iterative process. The parameters that control how this process runs are called _________
[Marks: 1, BL: 1, CO: 1, PO: 1, PI Code: 1.1.1]
A. mini-batches
B. optimized parameters
C. hyperparameters
D. super parameters
Ans: C

2. Consider two small samples for which the population standard deviation is not given. How will you test the significance of the difference between their mean values?
[Marks: 1, BL: 1, CO: 1, PO: 1, PI Code: 1.1.2]
A. F-test
B. Chi-square test
C. t-test
D. Z-test
Ans: C

3. A hypothesis that is tested for possible rejection under the assumption that it is true is called the _________
[Marks: 1, BL: 1, CO: 1, PO: 1, PI Code: 2.1.1]
A. Statistical Hypothesis
B. Null Hypothesis
C. Simple Hypothesis
D. Composite Hypothesis
Ans: B

4. False Positive Rate is ______________
[Marks: 1, BL: 2, CO: 1, PO: 1, PI Code: 2.1.3]
A. Specificity
B. Sensitivity
C. 1-Sensitivity
D. 1-Specificity
Ans: D

5. Overfitting occurs _________
[Marks: 1, BL: 1, CO: 1, PO: 1, PI Code: 2.1.3]
A. during the process of learning
B. when a statistical model describes noise instead of the underlying relationship
C. when a data set finds a predictive relationship
D. when robots are programmed so that they can perform a task based on data they gather from sensors
Ans: B

Part – B
(2 x 4 = 8 Marks)
Instructions: Answer any Two Questions

6 Write the steps to find variance. Consider the given data set and find variance.
[Marks: 4, BL: 3, CO: 1, PO: 1, PI Code: 1.1.2]

Step 1: Find the mean.

Step 2: Find each score's deviation from the mean.

Step 3: Square each deviation from the mean.

Step 4: Find the sum of squares.

Step 5: Divide the sum of squares by n – 1 (for a sample) or N (for a population).
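
The five steps can be sketched in Python as follows. The data set for this question did not survive in this copy, so the sketch borrows the ten scores from Question 8 purely as illustrative input; the function name `variance` and its `sample` flag are assumptions made for this example, not part of the original key.

```python
def variance(scores, sample=True):
    """Variance computed with the five steps listed in the answer."""
    n = len(scores)
    mean = sum(scores) / n                      # Step 1: find the mean
    deviations = [x - mean for x in scores]     # Step 2: deviation of each score
    squared = [d ** 2 for d in deviations]      # Step 3: square each deviation
    sum_of_squares = sum(squared)               # Step 4: sum of squares
    divisor = n - 1 if sample else n            # Step 5: n - 1 (sample) or N (population)
    return sum_of_squares / divisor


# Illustrative input only: the scores from Question 8, not this question's data.
scores = [85, 90, 78, 92, 85, 88, 95, 82, 85, 90]

print("Sample variance:    ", round(variance(scores), 3))
print("Population variance:", round(variance(scores, sample=False), 3))
```

For these scores the sum of squares is 226, so the two results differ only in the divisor (9 versus 10).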

7 Illustrate Confusion matrix with known example. Calculate Accuracy, Precision, Recall and F1 score.
[Marks: 4, BL: 2, CO: 1, PO: 1, PI Code: 2.3.2]
Answer:
• TP: True Positive: Positive values correctly predicted as positive
• FP: False Positive: Negative values incorrectly predicted as positive
• FN: False Negative: Positive values incorrectly predicted as negative
• TN: True Negative: Negative values correctly predicted as negative
• Precision: The precision metric shows the accuracy of the positive class. It measures how likely a prediction of the positive class is to be correct.

The maximum score is 1, reached when every value predicted as positive is actually positive. Precision alone is not very helpful because it ignores the positive values that the classifier missed (false negatives). The metric is usually paired with the Recall metric. Recall is also called sensitivity or true positive rate.

Sensitivity: Sensitivity computes the ratio of positive classes correctly detected. This metric shows how good the model is at recognizing the positive class.
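
A minimal sketch of the four requested metrics, computed from confusion-matrix counts. The counts (TP = 40, FP = 10, FN = 5, TN = 45) are hypothetical values chosen for illustration, and the function name `classification_metrics` is likewise an assumption; neither comes from the original answer.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics derived from the four confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total            # share of all predictions that are correct
    precision = tp / (tp + fp)              # share of predicted positives that are correct
    recall = tp / (tp + fn)                 # sensitivity / true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "false_positive_rate": 1 - specificity,
        "f1": f1,
    }


# Hypothetical counts for a binary classifier evaluated on 100 cases.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The `false_positive_rate` entry is computed as 1 - specificity, which matches the answer to Question 4.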

8 The scores of a group of students in a math test are as follows: 85, 90, 78, 92, 85, 88, 95, 82, 85, 90. Calculate the mean, median, and mode of the scores.
[Marks: 4, BL: 2, CO: 1, PO: 1, PI Code: 1.1.2]

Solution:

Mean: Mean = (85 + 90 + 78 + 92 + 85 + 88 + 95 + 82 + 85 + 90) / 10 = 870 / 10 = 87

Median: To find the median, first arrange the scores in ascending order: 78, 82, 85, 85, 85, 88, 90, 90, 92, 95. Since there are 10 scores, the median is the average of the middle two values, which are 85 and 88. Median = (85 + 88) / 2 = 173 / 2 = 86.5

Mode: The mode is the most frequent score in the data set. In this case, the score 85 appears three times, which is more frequent than any other score. Mode = 85
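
The three results can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are chosen for this example.

```python
import statistics

# Scores exactly as given in the question.
scores = [85, 90, 78, 92, 85, 88, 95, 82, 85, 90]

mean_score = statistics.mean(scores)      # sum of the scores divided by their count
median_score = statistics.median(scores)  # average of the two middle values for an even count
mode_score = statistics.mode(scores)      # most frequent value

print("Mean:", mean_score)
print("Median:", median_score)
print("Mode:", mode_score)
```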

Part – C
(1 x 12 = 12 Marks)
Instructions: Answer one question
9 Four brands of flashlight batteries are to be compared by testing each brand in five flashlights. Twenty flashlights are randomly selected and divided randomly into four groups of five flashlights each. Then each group of flashlights uses a different brand of battery. The lifetimes of the batteries, to the nearest hour, are as follows.

Preliminary data analyses indicate that the independent samples come from normal populations with equal standard deviations. At the 5% significance level, does there appear to be a difference in mean lifetime among the four brands of batteries?
[Marks: 12, BL: 4, CO: 1, PO: 1, PI Code: 2.3.2]
Answer:

OR
10 A company wants to investigate the effects of two factors, "Temperature" and "Pressure," on the tensile strength of a material. They conduct an experiment by varying the temperature at three levels (Low, Medium, and High) and the pressure at four levels (100, 200, 300, and 400 psi). They measure the tensile strength of the material under each combination of temperature and pressure. The data is shown below:

Temperature   100 psi   200 psi   300 psi   400 psi
Low           65        73        79        80
Medium        72        75        78        82
High          78        80        85        88

Perform an ANOVA to determine if there are significant main effects of Temperature and Pressure as well as any interaction effects. Use α = 0.05.
[Marks: 12, BL: 4, CO: 1, PO: 1, PI Code: 2.3.2]

Step 1: Calculate the Sum of Squares (SS):

The grand total of the 12 observations is 935, so the correction factor is 935² / 12 = 72852.08. The row totals are 297 (Low), 307 (Medium) and 331 (High); the column totals are 215, 228, 242 and 250.

• Total Sum of Squares (SST): 73265 - 72852.08 = 412.92
• Sum of Squares for Temperature (SSTemperature): (297² + 307² + 331²) / 4 - 72852.08 = 152.67
• Sum of Squares for Pressure (SSPressure): (215² + 228² + 242² + 250²) / 3 - 72852.08 = 238.92
• Sum of Squares for Interaction (SSInteraction): 412.92 - 152.67 - 238.92 = 21.33

Step 2: Calculate the Degrees of Freedom (df):

• dfTotal = (3 * 4) - 1 = 11
• dfTemperature = 3 - 1 = 2
• dfPressure = 4 - 1 = 3
• dfInteraction = (3 - 1) * (4 - 1) = 6
• dfError = dfTotal - (dfTemperature + dfPressure + dfInteraction) = 0

Because there is only one observation per cell, no degrees of freedom remain for a separate error term. The interaction mean square therefore serves as the error term for the main-effect tests.

Step 3: Calculate the Mean Square (MS):

• MSTemperature = SSTemperature / dfTemperature = 152.67 / 2 = 76.333
• MSPressure = SSPressure / dfPressure = 238.92 / 3 = 79.639
• MSInteraction = SSInteraction / dfInteraction = 21.33 / 6 = 3.556

Step 4: Calculate the F-Statistic:

• FTemperature = MSTemperature / MSInteraction = 76.333 / 3.556 = 21.47
• FPressure = MSPressure / MSInteraction = 79.639 / 3.556 = 22.40

Step 5: Determine the Critical value, or P-value:

Using an F-distribution table or statistical software, find the critical value for α = 0.05 and the degrees of freedom for each factor. For Temperature, F(2, 6) = 5.14; for Pressure, F(3, 6) = 4.76.

Step 6: Interpret the Result:

Since the calculated F-statistics (21.47 and 22.40) for both Temperature and Pressure are larger than the critical values (5.14 and 4.76), we reject the null hypotheses for both factors. This indicates that there are significant effects of Temperature and Pressure on the tensile strength of the material.

The interaction effect cannot be tested separately with this design. With a single observation per cell, dfError = 0, so there is no independent error estimate against which to compare MSInteraction. The analysis above assumes that the interaction between Temperature and Pressure is negligible; testing it would require replicated measurements in each cell.
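
The following sketch recomputes the sums of squares, mean squares and F-statistics directly from the data table in the question, so the hand calculation can be checked. It assumes a two-way layout with one observation per cell, in which the interaction (residual) mean square is used as the error term; the function name `two_way_anova` and the dictionary keys are choices made for this example.

```python
# Tensile strength by temperature (rows) and pressure (columns: 100, 200, 300, 400 psi).
data = {
    "Low":    [65, 73, 79, 80],
    "Medium": [72, 75, 78, 82],
    "High":   [78, 80, 85, 88],
}


def two_way_anova(table):
    """Two-way ANOVA without replication; the residual is the error term."""
    rows = list(table.values())
    r, c = len(rows), len(rows[0])
    n = r * c

    grand_total = sum(sum(row) for row in rows)
    correction = grand_total ** 2 / n                      # correction factor T^2 / N

    ss_total = sum(x * x for row in rows for x in row) - correction
    ss_rows = sum(sum(row) ** 2 for row in rows) / c - correction
    col_totals = [sum(row[j] for row in rows) for j in range(c)]
    ss_cols = sum(t ** 2 for t in col_totals) / r - correction
    ss_resid = ss_total - ss_rows - ss_cols                # interaction / residual

    df_rows, df_cols = r - 1, c - 1
    df_resid = df_rows * df_cols

    ms_rows = ss_rows / df_rows
    ms_cols = ss_cols / df_cols
    ms_resid = ss_resid / df_resid

    return {
        "ss_total": ss_total, "ss_rows": ss_rows,
        "ss_cols": ss_cols, "ss_resid": ss_resid,
        "ms_rows": ms_rows, "ms_cols": ms_cols, "ms_resid": ms_resid,
        "f_rows": ms_rows / ms_resid,                      # F for Temperature
        "f_cols": ms_cols / ms_resid,                      # F for Pressure
    }


result = two_way_anova(data)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

If SciPy is available, the critical values can be looked up with `scipy.stats.f.ppf(0.95, dfn, dfd)` instead of an F-table.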

Approved by the Audit Professor/Course Coordinator
