
Reg. No.

Question Paper Code : 16036


M.C.A. DEGREE EXAMINATION, NOVEMBER/DECEMBER 2016.
Fourth Semester
MC 7403 - DATA WAREHOUSING AND DATA MINING
(Regulations 2013)
Time : Three hours
Maximum : 100 marks
Answer ALL questions.
PART A - (10 x 2 = 20 marks)
1. Write down the applications of Data Warehousing.
2. How does a Data Warehouse differ from a database, and what are their similarities?

3. What is the purpose of the data preprocessing stage?

4. What is Data Reduction? How is it performed?


5. What are the bottlenecks in frequent pattern mining?

6. List the types of association rules used in data mining.

7. What is the difference between Classification and Prediction?

8. What is an Ensemble method? Give some examples.

9. What is the difference between K-means and K-medoid?

10. What is an Outlier?


PART B - (5 x 13 = 65 marks)
11. (a) Discuss various schematic representations in Multidimensional Databases. (13)
Or
(b) Explain the architecture of a Data Warehouse with a neat diagram. (13)
12. (a) Explain the following stages :
(i) Data Cleaning (4)
(ii) Data Integration (4)
(iii) Data Transformation. (5)
Or

(b) How can a data mining system be integrated with a data warehouse?
Discuss with an example. (13)

13. (a) Describe multilevel and multi-dimensional association rule mining with
an example data set. (13)
Or

(b) Explain Constraint-Based association mining. (13)


14. (a) (i) Explain the Decision Tree induction concept with an example. (7)
(ii) Explain in detail about the k-Nearest-Neighbor classifiers. (6)
Or
(b) (i) Explain in detail about Naive Bayesian Classification with an
example. (7)
(ii) Explain in detail about classification using the Rough Set
Approach. (6)
15. (a) (i) Explain in detail about a density-based clustering method based on
connected regions with sufficiently high density. (7)
(ii) Explain in detail about Density-Based Local Outlier Detection. (6)
Or
(b) (i) Explain in detail about the Statistical Information Grid. (7)
(ii) Explain in detail about Conceptual Clustering. (6)
PART C - (1 x 15 = 15 marks)
16. (a) A database has five transactions. Let min support = 60% and min
confidence = 80%.
TID     Items bought
T100    {M, O, N, K, E, Y}
T200    {D, O, N, K, E, Y}
T300    {M, A, K, E}
T400    {M, U, C, K, Y}
T500    {C, O, O, K, I, E}
(i) Find all frequent itemsets using Apriori and FP-growth,
respectively. Compare the efficiency of the two mining processes. (8)
(ii) List all of the strong association rules (with support s and
confidence c) matching the following metarule, where X is a
variable representing customers, and item_i denotes variables
representing items (e.g., "A", "B", etc.):
∀X ∈ transaction, buys(X, item1) ∧ buys(X, item2)
⇒ buys(X, item3) [s, c]. (7)
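
For readers who want to check a hand-worked answer to the frequent-itemset and strong-rule parts above, the following is a minimal Python sketch of level-wise Apriori on the five transactions. It is not part of the question paper. It assumes the scanned "0" characters are the letter "O", treats each transaction as a set (so the repeated item in T500 counts once), and uses illustrative names (`apriori`, `metarule_rules`) that are not taken from any library.

```python
from itertools import combinations

# Transactions as read from the question; "0" in the scan is assumed to be "O".
# Using sets means the repeated "O" in T500 is counted once.
TRANSACTIONS = {
    "T100": {"M", "O", "N", "K", "E", "Y"},
    "T200": {"D", "O", "N", "K", "E", "Y"},
    "T300": {"M", "A", "K", "E"},
    "T400": {"M", "U", "C", "K", "Y"},
    "T500": {"C", "O", "K", "I", "E"},
}
MIN_SUPPORT_PCT = 60      # min support = 60%
MIN_CONFIDENCE_PCT = 80   # min confidence = 80%


def support_count(itemset, baskets):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for basket in baskets if itemset <= basket)


def apriori(baskets, min_support_pct):
    """Return {frequent itemset: support count}, built level by level."""
    n = len(baskets)

    def is_frequent(itemset):
        # Integer comparison avoids floating-point issues at the threshold.
        return support_count(itemset, baskets) * 100 >= min_support_pct * n

    items = sorted(set().union(*baskets))
    level = [frozenset([i]) for i in items if is_frequent(frozenset([i]))]
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support_count(itemset, baskets)
        k += 1
        previous = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        pruned = [c for c in joined
                  if all(frozenset(s) in previous for s in combinations(c, k - 1))]
        level = [c for c in pruned if is_frequent(c)]
    return frequent


def metarule_rules(frequent, min_confidence_pct):
    """Rules of the form item1 AND item2 => item3 drawn from frequent 3-itemsets."""
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) != 3:
            continue
        for consequent in sorted(itemset):
            antecedent = itemset - {consequent}
            antecedent_count = frequent[antecedent]
            if count * 100 >= min_confidence_pct * antecedent_count:
                rules.append((tuple(sorted(antecedent)), consequent,
                              count, antecedent_count))
    return rules


baskets = list(TRANSACTIONS.values())
frequent = apriori(baskets, MIN_SUPPORT_PCT)
rules = metarule_rules(frequent, MIN_CONFIDENCE_PCT)

for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"support = {count}/{len(baskets)}")
for antecedent, consequent, count, antecedent_count in rules:
    print(f"{antecedent} => {consequent}: s = {count}/{len(baskets)}, "
          f"c = {count}/{antecedent_count}")
```

FP-growth should yield the same frequent itemsets, since both algorithms solve the same problem; the efficiency comparison asked for in the question concerns candidate generation and the number of database scans, which this sketch does not measure.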

Or

(b) Suppose that a data warehouse consists of the three dimensions time,
doctor, and patient, and the two measures count and charge, where
charge is the fee that a doctor charges a patient for a visit.
(i) Enumerate three classes of schemas that are popularly used for
modeling data warehouses. (3)

(ii) Draw a schema diagram for the above data warehouse using one of
the schema classes listed in (i). (4)
(iii) Starting with the base cuboid [day, doctor, patient], what specific
OLAP operations should be performed in order to list the total fee
collected by each doctor in 2015? (4)
(iv) To obtain the same list, write an SQL query assuming the data are
stored in a relational database with the schema fee (day, month,
year, doctor, hospital, patient, count, charge). (4)
