Sheet With Answers
Question 3 Which of the following activities is not a data mining task? Answer: A
A Extracting the frequencies of a sound wave
B Monitoring the heart rate of a patient for abnormalities
C Predicting the future stock price of a company using historical records
D Monitoring and predicting failures in a hydro power plant
Question 13 Method suitable for data reduction. Answer: A
A All
B Regression
C Clustering
D Histogram
Question 8 Outlier treatment can be performed using
A Process of Retaining, Rectifying and Removing
B None
C Process of creating dummy variables
D Process of creating new variables
Question 18 Where will the use of metadata be useful?
A To avoid errors in schema integration
B Missing values
C For inconsistency
D None of the above
Question 1 The salary of an employee is -20000. This is the problem of: Answer: A (error)
A error
B outlier
Question 20 Chi-square test is suitable for: Answer: Nominal
A Nominal data
B Numerical data
C Multimedia data
D Transaction data
Question 2 Which one of the following statements reflects a data mining task?
A Identify and group similar documents according to context
B Find average salary of employee in grade B
Question 22 Given the following vectors, find the pair with maximum cosine similarity.
Vector 1 = [2,7,1,4]
Vector 2 = [3,8,1,4]
Vector 3 = [4,14,2,8]
A 1 and 3
B 1 and 2
C 2 and 3
D Indeterminate
Working: |V1|=8.367, |V2|=9.487, |V3|=16.733
sim(V1,V2)=0.9953, sim(V1,V3)=1.0000 (V3 = 2 × V1), sim(V2,V3)=0.9953
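The cosine similarity comparison for the three vectors can be checked with a short sketch. The function name `cosine_similarity` and the dictionary layout are illustrative choices, not part of the quiz; only the vector values come from the question.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Return dot(u, v) / (|u| * |v|) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Vectors as given in the question, keyed by their number.
vectors = {
    1: [2, 7, 1, 4],
    2: [3, 8, 1, 4],
    3: [4, 14, 2, 8],
}

# Similarity for every unordered pair of vectors.
similarities = {
    (i, j): cosine_similarity(vectors[i], vectors[j])
    for i, j in combinations(vectors, 2)
}

# The pair with the largest similarity.
best_pair = max(similarities, key=similarities.get)

for pair, value in similarities.items():
    print(pair, round(value, 4))
print("best pair:", best_pair)
```

Vector 3 is exactly twice Vector 1, so the two point in the same direction and their cosine similarity is 1, which is why the pair "1 and 3" wins.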
Question 26 A customer walks into a modern bank to obtain a loan. The bank wants to assess whether a loan can be given to the customer and, if so, what the right loan amount is. From a data scientist's perspective, the bank is performing: Answer: A
A Classification followed by prediction
B Clustering followed by prediction
C Classification followed by association
D Sequential pattern discovery
Question 25 Which of the following is unlikely to be a classification task?
A Identification of areas of similar land use in an earth observation database
B Motorists who are at high risk of a car accident in the next 12 months
C Houses that are likely to rise in value in 12 months' time
D Customers who are likely to buy a particular product
Question 37 In point-of-sale transaction sequences, the pattern "(Shoes), (Racket, Racketball), followed by (Sports_jacket)" is discovered. Which data mining task could detect this pattern?
A Sequential Pattern discovery
B Classification
C Clustering
D Regression
Question 38 In binning, we first sort the data and partition it into (equal-frequency) bins. Which of the following is then not a valid step?
A smooth by bin values
B smooth by bin boundaries
C smooth by bin median
D smooth by bin means
Question 39 Nominal attributes are just labels, with equals and not-equals as the only valid operations, whereas the values of ordinal attributes provide enough information for ordering (<, >)?
A TRUE
B FALSE
Question 40 In a positively skewed data distribution, the mean will be less than the median?
A TRUE
B FALSE
Question 41 Which one of the following is not an alternative to data mining?
A Computational intelligence
B Knowledge Extraction
C Data Dredging
D Knowledge Discovery in Database (KDD)
Question 42 The objectives of data pre-processing are
A Improve Data quality
B Modify data to better fit specific data mining technique
C Fill the missing value
D All of the above
Question 51 In a dataset, Hair_color is one of the attributes and it can take the following values {Red, Green, Yellow, Black}. What kind of attribute is it?
A Nominal
B Ordinal
C Continuous
D None
Question 52 Data Quality Problems are
A Noise and outliers
B Missing Values
C Duplicate Data
D All of the above
Question 57 Dimensionality reduction reduces the data set by removing
A irrelevant attributes
Question 57 Jersey number of cricket players is?
A Nominal
B Ordinal
C Interval
D Ratio
12/20/2020 SABHARINATH B's Quiz History: Quiz 1
Clustering activities
Identifying outliers
Nominal
Ordinal
https://bits-pilani.instructure.com/courses/693/quizzes/1424/history?version=1 1/10
Interval
Ratio
Which one of the following is not a challenge or issue in the data mining
process?
Computational Intelligence
Knowledge Extraction
Data Dredging
segmentation
disambiguation
deduplication
domain consistency
error
outlier
The sum of observed data points divided by the number of data records is
called the
mean
mode
frequency
Nominal data
Numerical data
Multimedia data
Transaction data
Regression
Clustering
Histogram
Classification
Clustering
Regression
Employee IDs
Employee ratings
Classification
Clustering
Regression
Which data mining task can be used for predicting wind velocities as a
function of temperature, humidity, air pressure, etc.?
Regression
Classification
Clustering
Given the following vectors, find the pair with maximum cosine similarity.
Vector 1 = [2, 7, 1, 4]
Vector 2 = [3, 8, 1, 4]
1 and 3
1 and 2
2 and 3
Indeterminate
For the given records in the table, is similarity matrix correct for the
Gender attribute?
Person Id 1 2 3 4
Gender M M F M
1
1 1
0 0 1
0 1 0 1
Incorrect
Correct
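One way to check the matrix is to rebuild it from the Gender row, assuming the usual nominal-attribute convention that two records have similarity 1 when their values match and 0 otherwise. The helper name `nominal_similarity_matrix` is hypothetical; the gender values and the candidate matrix are copied from the question.

```python
def nominal_similarity_matrix(values):
    """Lower-triangular similarity matrix: 1 if two values match, else 0."""
    return [
        [1 if values[i] == values[j] else 0 for j in range(i + 1)]
        for i in range(len(values))
    ]


# Gender values for Person Id 1..4, as listed in the table.
genders = ["M", "M", "F", "M"]

# Candidate matrix from the question (lower triangle, row by row).
given_matrix = [
    [1],
    [1, 1],
    [0, 0, 1],
    [0, 1, 0, 1],
]

expected_matrix = nominal_similarity_matrix(genders)
is_correct = expected_matrix == given_matrix

for row in expected_matrix:
    print(*row)
print("given matrix correct:", is_correct)
```

Under that convention persons 1 and 4 are both M, so their entry should be 1; the given matrix has 0 there, which makes it incorrect.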
Given two objects represented by the tuples (21, 12, 3, 17, 48, 11, 82, 41,
35) and (34, 5, 13, 3, 57, 26, 69, 55, 27), calculate the Supremum
distance between the two objects
15
12
13
14
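The supremum (Chebyshev) distance is the largest absolute difference across any single coordinate. A minimal sketch, with `supremum_distance` as an illustrative function name and the two tuples taken from the question:

```python
def supremum_distance(x, y):
    """Largest absolute coordinate-wise difference between x and y."""
    if len(x) != len(y):
        raise ValueError("both objects must have the same number of attributes")
    return max(abs(a - b) for a, b in zip(x, y))


first = (21, 12, 3, 17, 48, 11, 82, 41, 35)
second = (34, 5, 13, 3, 57, 26, 69, 55, 27)

# Per-attribute absolute differences, then their maximum.
differences = [abs(a - b) for a, b in zip(first, second)]
distance = supremum_distance(first, second)

print(differences)
print(distance)
```

The largest gap comes from the sixth attribute (11 versus 26), giving a supremum distance of 15.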
1.1791
0.1791
1.194
1.21
A customer walks into a modern bank to obtain a loan. The bank wants
to assess whether a loan can be given to the customer and, if so, what the right
loan amount is. From a data scientist's perspective, the bank is performing
12/20/2020 Quiz 1: Data Mining (S1-20_DSECFZC415)
Quiz 1
Due Dec 21 at 19:00 Points 5 Questions 20
Available Dec 20 at 19:00 - Dec 21 at 19:00 1 day Time Limit 60 Minutes
Instructions
Purpose of the quiz is to validate continuous learning and observe grasp of the concepts.
Attempt History
Attempt Time Score
LATEST Attempt 1 26 minutes 5 out of 5
Classification
Clustering
https://bits-pilani.instructure.com/courses/693/quizzes/1424 1/11
Sorted data (attribute values) for price are 4, 8, 9, 15, 21, 21, 24, 25, 26,
28, 29, 34. Identify which is NOT a bin smoothed by boundaries.
4, 4, 15, 15
4, 4, 4, 15
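Smoothing by bin boundaries can be sketched as follows. The question does not state the number of bins, so this assumes three equal-frequency bins of four values each (matching the four-value options); ties between the two boundaries go to the lower one, which is also an assumption. The function name is illustrative.

```python
def smooth_by_boundaries(sorted_values, bin_size):
    """Replace each value with the nearer of its bin's min or max boundary."""
    smoothed_bins = []
    for start in range(0, len(sorted_values), bin_size):
        bin_values = sorted_values[start:start + bin_size]
        low, high = bin_values[0], bin_values[-1]
        # Ties (equal distance to both boundaries) go to the lower boundary.
        smoothed_bins.append(
            [low if value - low <= high - value else high for value in bin_values]
        )
    return smoothed_bins


prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]
bins = smooth_by_boundaries(prices, bin_size=4)

for smoothed in bins:
    print(smoothed)
```

Under these assumptions the first bin becomes 4, 4, 4, 15, because both 8 and 9 are closer to 4 than to 15, so "4, 4, 15, 15" is not a bin smoothed by boundaries.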
In a positively skewed data distribution, the mean will be less than the median?
True
False
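In a positively (right) skewed distribution the long right tail typically pulls the mean above the median, not below it. A small sketch illustrates this; the sample values are made up for illustration and do not come from the quiz.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, one is very large.
sample = [1, 2, 2, 3, 4, 5, 30]

mean_value = statistics.mean(sample)
median_value = statistics.median(sample)

print("mean:", round(mean_value, 2))
print("median:", median_value)
print("mean > median:", mean_value > median_value)
```

The single large value raises the mean while leaving the median at the middle observation.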
outliers
rare values
dimensionality of data
supremum values
There are two sets X={10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24} and Y = {-30, -31, -32, -33, -34, -35, -36, -37, -38, -39, -40, -41,
-42, -43, -44}. What is TRUE about the standard deviations of X and Y i.e.
σX and σY respectively?
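Both sets are runs of 15 consecutive integers, and Y is X shifted by 20 and negated; neither operation changes the spread. A quick check, assuming the population standard deviation is meant (the sample version gives the same comparison):

```python
import math
import statistics

# X = {10, ..., 24} and Y = {-30, ..., -44}, as given in the question.
x_values = list(range(10, 25))
y_values = list(range(-30, -45, -1))

sigma_x = statistics.pstdev(x_values)
sigma_y = statistics.pstdev(y_values)

print(round(sigma_x, 4), round(sigma_y, 4))
print("equal:", math.isclose(sigma_x, sigma_y))
```

Since every element of Y equals -(x + 20) for the matching element of X, the deviations from the mean have the same magnitudes and the two standard deviations are equal.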
Classification
Clustering
Regression
Clustering activities
Identifying outliers
Chi-square test
Covariance
Option A and B
What is an Imputation?
1- Very Unsatisfied
2- Somewhat Unsatisfied
3- Neutral
4- Somewhat Satisfied
5- Very Satisfied
Ordinal
Nominal
Continuous
None
How do you understand the Problem Statement before you start your data
mining activity?
Business Constraints
Business Objectives
error
outlier
The sum of observed data points divided by the number of data records is
called the
mean
mode
frequency
0.33
0.25
0.50
0.75
1.1791
0.1791
1.194
1.21
Given the following vectors, find the pair with maximum cosine similarity.
Vector 1 = [2, 7, 1, 4]
Vector 2 = [3, 8, 1, 4]
1 and 3
1 and 2
2 and 3
Indeterminate
Frequency 2 1 2 3 1 1 1
27 and 5
6 and 27
11 and 6
5 and 27
A customer walks into a modern bank to obtain a loan. The bank wants
to assess whether a loan can be given to the customer and, if so, what the right
loan amount is. From a data scientist's perspective, the bank is performing
Classification
Clustering
Regression
Question 2 0.25 / 0.25 pts
True
False
How do you understand the Problem Statement before you start your
data mining activity?
Business Objectives
Consider the sorted list of data values given by: 10, 20, 30, 40, 50, 60, 70
The interquartile range is given by:
40
60
20
34.5
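The interquartile range depends on the quartile convention. The sketch below assumes the textbook "median of halves" method, where the overall median is excluded from both halves when the number of values is odd; other conventions (for example, linear interpolation) can give a different result. The function name is illustrative.

```python
import statistics


def iqr_median_of_halves(values):
    """IQR as median(upper half) - median(lower half), excluding the middle
    value from both halves when the number of values is odd."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    q1 = statistics.median(lower)
    q3 = statistics.median(upper)
    return q1, q3, q3 - q1


q1, q3, iqr = iqr_median_of_halves([10, 20, 30, 40, 50, 60, 70])
print(q1, q3, iqr)
```

With this convention Q1 is 20 and Q3 is 60, so the interquartile range is 40.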
Nominal
Ordinal
Interval
Ratio
Missing values
Duplicate data
There are two sets X={10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24} and Y = {-30, -31, -32, -33, -34, -35, -36, -37, -38, -39, -40, -41,
-42, -43, -44}. What is TRUE about the standard deviations of X and Y
i.e. σX and σY respectively?
Nominal
Ordinal
Continuous
None
Chi-square test
Covariance
Option A and B
None
For the given records in the table, is similarity matrix correct for the
Gender attribute?
Person Id 1 2 3 4
Gender M M F M
1
1 1
0 0 1
0 1 0 1
Incorrect
Correct
1.1791
0.1791
1.194
1.21
0.25
0.50
0.75
Given the following vectors, find the pair with maximum cosine
similarity.
Vector 1 = [2, 7, 1, 4]
Vector 2 = [3, 8, 1, 4]
1 and 3
1 and 2
2 and 3
Indeterminate
Motorists who are at high risk of a car accident in the next 12 months
Quiz 1
Due May 27 at 20:30 Points 5 Questions 20
Available May 27 at 19:25 - May 27 at 20:30 about 1 hour Time Limit 60 Minutes
Attempt History
Attempt Time Score
LATEST Attempt 1 38 minutes 4.25 out of 5
None
Regression
Clustering
Classification
https://bits-pilani.instructure.com/courses/370/quizzes/800 1/10
12/7/2020 Quiz 1: Data Mining (S2-19_DSECLZC415)
Nominal
Ordinal
Continuous
None
Which one of the following is not a challenge or issue in the data mining
process?
True
False
Duplicate data
Missing values
How do you understand the Problem Statement before you start your data
mining activity?
Business Constraints
Business Objectives
Incorrect
Question 8 0 / 0.25 pts
irrelevant attributes
composite attributes
relevant attributes
derived attributes
Hair colour is differentiated as black, brown, or white. Which attribute type
does this come under?
Numeric
Ordinal
Binary
Nominal
What is the Interquartile range for the below set of data points:
1,1,1,3,4,5,5,6,9,11,13,14,17,18,21
11
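The value 11 can be reproduced with the standard library. This sketch uses `statistics.quantiles` with its default `method="exclusive"` (available from Python 3.8); other quartile conventions may return slightly different cut points, so treat the method choice as an assumption.

```python
import statistics

# Data points as listed in the question (already sorted).
points = [1, 1, 1, 3, 4, 5, 5, 6, 9, 11, 13, 14, 17, 18, 21]

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(points, n=4)
interquartile_range = q3 - q1

print(q1, q2, q3)
print(interquartile_range)
```

Here Q1 is 3 and Q3 is 14, so the interquartile range is 14 - 3 = 11.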
None
Ordinal
Binary
Nominal
None
Clustering
Classification
Data Dredging
Knowledge Extraction
Computational Intelligence
Incorrect
Question 17 0 / 0.25 pts
Smoothing
Aggregation
Normalization
What is an Imputation?
supremum values
dimensionality of data
rare values
outliers
Incorrect
Question 20 0 / 0.25 pts
For inconsistency
Missing values