Model - 2 QP Format - V Year BDA

BE / B.TECH DEGREE EXAMINATION April – 2020


Time: 3 hours SIXTH SEMESTER – IT Max: 100 Marks

CS8091 - BIG DATA ANALYTICS

Reg.No: 5 1 2 2
Answer ALL questions
PART – A (10 X 2 = 20 Marks)
1. What are the challenges of conventional systems?
2. Discuss the types of data analytics.
3. How do you pick K in the K-Means algorithm?
4. Define Bayes' theorem.
5. Define the Apriori algorithm.
6. What is pruning?
7. How are moments estimated?
8. Analyze the term "filtering a data stream".
9. Summarize the features of Hive.
10. What is a NoSQL database?
PART – B (5 X 13 = 65 Marks)
11(a) i. What are the best practices in Big Data analytics? (13)
ii. Explain the techniques used in Big Data Analytics.

(Or)
11(b) i. Generalize the list of tools related to Hadoop. (13)
ii. How does Hadoop work?

12(a) i. Given a one-dimensional dataset {1, 5, 8, 10, 2}, use the agglomerative clustering algorithm with complete link and Euclidean distance to establish a hierarchical grouping relationship. Using the maximal lifetime as the cutting threshold, how many clusters are there? What is the membership of each cluster? (13)
ii. State and explain clustering in non-Euclidean space with an example.
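For checking an answer to 12(a) i, the following is a minimal Python sketch of complete-link agglomerative clustering on the given dataset. The function names are hypothetical. The notion of "lifetime" used here is an assumption: the gap between the merge height at which a partition appears and the height at which it is next merged. A course text may define it slightly differently.

```python
def complete_link_history(points):
    """Return [(merge_height, partition), ...] for complete-link clustering in 1-D."""
    clusters = [[p] for p in sorted(points)]
    history = [(0, [list(c) for c in clusters])]  # start: every point is its own cluster
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete link: distance between the two farthest members.
                d = max(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
        history.append((d, sorted(clusters)))
    return history


def cut_by_max_lifetime(history):
    """Pick the partition that survives over the widest range of merge heights."""
    best_life, best_partition = -1, None
    for (height, partition), (next_height, _) in zip(history, history[1:]):
        life = next_height - height
        if life > best_life:
            best_life, best_partition = life, partition
    return best_life, best_partition


history = complete_link_history([1, 5, 8, 10, 2])
lifetime, partition = cut_by_max_lifetime(history)
print([h for h, _ in history])  # merge heights: [0, 1, 2, 4, 9]
print(lifetime, partition)      # 5 [[1, 2, 5], [8, 10]]
```

Under this definition the two-cluster partition {1, 2, 5} and {8, 10} has the longest lifetime (from height 4 to height 9).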

(Or)
12(b) i. Describe the Market-Basket model. (13)
ii. Explain in detail the Naïve Bayes theorem, classifier, smoothing and diagnostics.

13(a) i. Explain the Apriori algorithm for mining frequent item sets with an example. (13)
ii. Discuss in detail any one ranking algorithm used by search engines.
(Or)
13(b) i. Illustrate with an example the application of the Apriori algorithm to a relatively simple case that generalizes to those used in practice. Show how to use the Apriori algorithm to generate frequent item sets and rules, and how to evaluate and visualize the rules. (13)
ii. Explain in detail hybrid and knowledge-based recommendation.

14(a) (13)
i. List some common online tools used to perform sentiment analysis. (6)
ii. What do you understand by sentiment analysis? (7)

(Or)
14(b) i. Describe stream clustering and parallel clustering. (13)
ii. Discuss in detail the characteristics of a social network as a graph.

15(a) Write short notes on: (13)
i. NoSQL databases and their types
ii. Hive data manipulation, queries, data definition and data types (illustrate in detail)

(Or)
15(b) (13)
i. What is the purpose of sharding?
ii. Explain the process of sharding in MongoDB.
PART – C (1 X 15 = 15 Marks)

16(a) Taking stock market predictions as a case study, elaborate on the Real-time Analytics Platform (RTAP). Present the assumptions made. (15)
(Or)
16(b) A database has five transactions. Let min_sup = 60% and min_conf = 80%. (15)

TID   ITEMS
T100  Milk, Onion, Nuts, Kiwi, Egg, Yoghurt
T200  Dhal, Onion, Nuts, Kiwi, Egg, Yoghurt
T300  Milk, Apple, Kiwi, Egg
T400  Milk, Curd, Kiwi, Yoghurt
T500  Curd, Onion, Kiwi, Ice cream, Egg

Find all frequent item sets using the Apriori method.
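
For checking an answer to 16(b), the following is a minimal Python sketch of level-wise Apriori on the five transactions above. The function and variable names are hypothetical. It covers only the frequent item sets the question asks for, so min_conf is not used and no rules are generated. Support is compared with integer arithmetic (60% of 5 transactions means a count of at least 3).

```python
from itertools import combinations

TRANSACTIONS = {
    "T100": {"Milk", "Onion", "Nuts", "Kiwi", "Egg", "Yoghurt"},
    "T200": {"Dhal", "Onion", "Nuts", "Kiwi", "Egg", "Yoghurt"},
    "T300": {"Milk", "Apple", "Kiwi", "Egg"},
    "T400": {"Milk", "Curd", "Kiwi", "Yoghurt"},
    "T500": {"Curd", "Onion", "Kiwi", "Ice cream", "Egg"},
}


def apriori(transactions, min_sup_percent):
    """Return {frozenset(itemset): support_count} for all frequent item sets."""
    baskets = list(transactions.values())
    n = len(baskets)
    items = sorted(set().union(*baskets))
    candidates = {frozenset([item]) for item in items}  # candidate 1-item sets
    frequent = {}
    k = 1
    while candidates:
        # Count support of each candidate and keep those meeting min_sup.
        level = {}
        for cand in candidates:
            count = sum(1 for basket in baskets if cand <= basket)
            if count * 100 >= min_sup_percent * n:
                level[cand] = count
        frequent.update(level)
        # Join step: union pairs of frequent k-item sets into (k+1)-item sets.
        # Prune step: drop any candidate with an infrequent k-item subset.
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in level for sub in combinations(union, k)
            ):
                candidates.add(union)
        k += 1
    return frequent


frequent = apriori(TRANSACTIONS, 60)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With these inputs the sketch reports 11 frequent item sets: five single items (Egg, Kiwi, Milk, Onion, Yoghurt), five pairs, and one triple, {Egg, Kiwi, Onion}, with support count 3.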
