Bahria University, Islamabad Campus: Department of Computer Science


Mid Term Examination (Spring 2021 Semester)

Class/Section: BS CS 6 (A, B)
Paper Type: Descriptive
Course: Data Mining
Course Code: CSC 452
Date: 21-05-2021
Time: Session-I
Faculty’s Name: Dr. M Muzammal
Max Marks: 40
Time Allowed: 90 minutes
Total Pages: 3

INSTRUCTIONS:

A. All questions are compulsory.
B. There are a total of 7 questions.
C. Submit your work well before the deadline.
D. Your solution must not match any other submission in your class.
E. The submission must be a .pdf file.
F. All questions must be answered in order. Questions submitted out of order in the
answer sheet will not be marked.

Student’s Name: _____________________________ Enroll No: ______________________


(USE CAPITAL LETTERS)

QUESTIONS:

1. Why is it that Google Search is not Data Mining? [5 Points]

2. Consider the business question: "Find certain names which are more prevalent
in certain US locations." Why can't we write an SQL query for this business
question? How does Data Mining help in this case? (Answer in your own words)
[5 Points]

3. Give an overview of the Knowledge Discovery process (in your own words).
What is the significance of the different stages? [5 Points]


4. What is market basket data? What kind of analysis is done with the market
basket data? [5 Points]

5. A new coach has been working with the Long Jump team this month, and the
athletes' performance has changed. Augustus can now jump 0.15m further, June
and Carol can jump 0.06m further.

Here are all the results:

Augustus: +0.15m

Tom: +0.11m

June: +0.06m

Carol: +0.06m

Tom: +0.14m

Bob: +0.12m

Sam: +1.56m

How would you handle the above situation? As a Data Scientist, do you notice
anything unusual? If yes, what would you do about it? [5 Points]

Hint: Computing the five-number summary could be the first step. You should
check for outliers, if any.
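
The first step named in the hint can be sketched as follows. This is only an illustration, not a model answer: the quartile convention (Python's "inclusive" method) and the 1.5 × IQR fence are assumptions, and other textbook conventions give slightly different quartiles. The names `five_number_summary` and `iqr_outliers` are hypothetical helpers.

```python
import statistics

# Improvements in metres, copied from the results list in Question 5.
improvements = [0.15, 0.11, 0.06, 0.06, 0.14, 0.12, 1.56]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: quartiles use the 'inclusive' interpolation method.
    """
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumption)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

summary = five_number_summary(improvements)
outliers = iqr_outliers(improvements)
print("five-number summary:", summary)
print("flagged as outliers:", outliers)
```

Whether a flagged value is a recording error or a genuine result still has to be judged from the context of the data.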

6. Consider the following grouped data. Compute the median. [5 Points]

Hint: You could compute the median by interpolation.

Group     Time to Travel to Work   Frequency

Group A   1-10                     3
Group B   11-20                    15
Group C   21-30                    75
Group D   31-40                    25
Group E   41-50                    8
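
The interpolation mentioned in the hint is commonly written as median ≈ L + ((N/2 − CF) / f) × w, where L is the lower boundary of the median class, N the total frequency, CF the cumulative frequency before that class, f its frequency and w its width. The sketch below applies that formula to the table. It assumes the class limits are widened by half a unit into continuous boundaries (for example 20.5 to 30.5), which is one common convention. `grouped_median` is a hypothetical helper name.

```python
# (lower limit, upper limit, frequency) for Groups A-E, copied from the table.
groups = [(1, 10, 3), (11, 20, 15), (21, 30, 75), (31, 40, 25), (41, 50, 8)]

def grouped_median(classes, gap=1.0):
    """Estimate the median of grouped data by linear interpolation.

    `gap` is the distance between one class's upper limit and the next
    class's lower limit (1 here). Half of it widens the limits into
    continuous class boundaries; this convention is an assumption.
    """
    total = sum(freq for _, _, freq in classes)
    half = total / 2
    cumulative = 0  # frequency accumulated before the current class
    for lower, upper, freq in classes:
        if cumulative + freq >= half:
            boundary = lower - gap / 2
            width = (upper + gap / 2) - boundary
            return boundary + (half - cumulative) / freq * width
        cumulative += freq
    raise ValueError("no classes supplied")

median_estimate = grouped_median(groups)
print(median_estimate)
```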


7. Consider the following dataset: 33, 25, 26, 36, 19, 30, 40, 51, 42, 32, 35, 35, 35, 45, 20, 23,
13, 15, 25, 25, 25, 26, 40, 25, 26, 22, 10. Answer the following:
a. What could be the pre-processing step for this data and why? [5 Points]

b. Show a boxplot of the data. [5 Points]
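
As an illustration for part (a), the sketch below shows one possible pre-processing step: sorting the values and smoothing them by equal-frequency bin means. The choice of technique and the bin size of 9 are assumptions made for the example; the question does not prescribe either. `smooth_by_bin_means` is a hypothetical helper name.

```python
import statistics

# Dataset copied from Question 7.
data = [33, 25, 26, 36, 19, 30, 40, 51, 42, 32, 35, 35, 35, 45, 20, 23,
        13, 15, 25, 25, 25, 26, 40, 25, 26, 22, 10]

def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into equal-frequency bins, and
    replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        smoothed.extend([statistics.mean(bin_values)] * len(bin_values))
    return ordered, smoothed

# bin_size=9 is an arbitrary choice that splits the 27 values into 3 bins.
ordered, smoothed = smooth_by_bin_means(data, bin_size=9)
print(ordered)
print([round(v, 2) for v in smoothed])
```

The sorted list is also the starting point for the quartiles that a boxplot in part (b) displays.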

Good Luck 😊
