Question Bank Big Data Analytics
QUESTION BANK
BIG DATA ANALYTICS
[As per Choice Based Credit System (CBCS) scheme]
(Effective from the academic year 2016-2017)
SEMESTER – VIII
Module 1
1. List and briefly describe the HDFS user commands that facilitate navigation within HDFS. 8M (July’19)
2. Write a Java program to read, write, and delete files on HDFS. 8M (July’19)
3. Describe the HDFS components with a diagram. 8M (July’19)
4. What are the steps to run the TeraSort benchmark in Hadoop? 4M NS
5. What are the steps to run the TestDFSIO benchmark in Hadoop? 4M (Dec’19)
6. How are Hadoop MapReduce jobs managed using the mapred job command? 4M (July’19)
7. Explain the Hadoop MapReduce model with a simple mapper and reducer script. 8M (July’19)
8. Illustrate the Hadoop parallel MapReduce data flow with a diagram. Write down the steps of MapReduce parallel execution. 8M (July’19)
9. Write a Java program that counts the number of occurrences of each word in a given input. 8M (Dec’19)
10. Explain the Hadoop Streaming interface with Python mapper and reducer scripts. 8M (Dec’19)
11. Explain the Hadoop Pipes interface and write a word count program using the Pipes interface. 8M (Dec’19)
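As a starting point for question 10, the sketch below shows the word count logic that a pair of Hadoop Streaming scripts typically implements. Streaming exchanges tab-separated key/value lines over standard input and output, and the reducer receives its input sorted by key. The function names `mapper` and `reducer` are illustrative choices, and the sort step is simulated locally here, so treat this as a sketch rather than a ready-to-submit job.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts for each word; the input must already be sorted by key."""
    pairs = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Local simulation of map -> shuffle/sort -> reduce on a tiny, made-up input.
sample = ["big data analytics", "big data"]
for record in reducer(sorted(mapper(sample))):
    print(record)
```

In an actual Streaming job the two functions would live in separate scripts, each reading `sys.stdin` and printing its records, with Hadoop performing the sort between them.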
Module 2
1. Describe the Apache Pig scripting tool for examining data both locally and on a Hadoop cluster. 8M (Dec’19)
2. Write the steps and procedure to summarize, run ad hoc queries on, and analyze a data set using Hadoop HiveQL, with an example. 8M NS
3. How is relational data acquired using Hadoop Sqoop? Explain the import and export methods. 8M (Dec’19)
4. Detail the steps for importing data from MySQL to HDFS and exporting data. 8M (Dec’19)
Module 3
1. What is Business Intelligence? Write a note on its role in decision making. 4M (July’19)
2. List and explain any two application areas of Business Intelligence and data mining. 8M (July’19)
3. List the features of a good data warehouse. 4M (Dec’19)
4. Describe the data warehouse architecture with a neat diagram. 8M (July’19)
5. How is raw data prepared for mining? Explain the processes involved in preparing the data. 8M (Dec’19)
6. Which data mining techniques are involved in supervised and unsupervised learning? Explain briefly. 8M (July’19)
7. Write notes on tools and platforms for data mining. 4M (Dec’19)
8. What is a confusion matrix? What is it used for? 4M NS
9. Compare popular data mining platforms across different features. 4M (July’19)
10. How can data mining techniques be used effectively and successfully? Briefly describe the CRISP-DM steps for effective data mining. 6M (July’19)
11. Write down the myths about data mining in business. 4M (Dec’19)
12. What are the major mistakes to avoid when doing data mining? 8M (Dec’19)
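For question 8 above, a small sketch may help show what a confusion matrix records for a binary classifier: the counts of true positives, false positives, false negatives, and true negatives. The label lists below are made up purely for illustration, and the function names `confusion_matrix` and `accuracy` are hypothetical helpers, not part of any particular library.

```python
from collections import Counter


def confusion_matrix(actual, predicted, positive):
    """Count TP, FP, FN, and TN for one class treated as 'positive'."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}


def accuracy(cm):
    """Fraction of all predictions that were correct."""
    return (cm["TP"] + cm["TN"]) / sum(cm.values())


# Made-up labels, used only to exercise the functions.
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]

cm = confusion_matrix(actual, predicted, positive="yes")
print(cm, f"accuracy={accuracy(cm):.2f}")
```

Other measures such as precision, TP / (TP + FP), and recall, TP / (TP + FN), can be read off the same four counts.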
Module 4
1. Draw the decision tree for the given data set. 8M (July’19)
2. What is a decision tree? Why are decision trees the most popular classification technique? 8M NS
3. What is a splitting variable? Describe three criteria for choosing a splitting variable. 4M NS
4. What is pruning? What are pre-pruning and post-pruning techniques? Why choose one over the other? 8M (July’19)
5. What is logistic regression? Describe the advantages and disadvantages of regression models. 8M (Dec’19)
6. How is a neural network represented? Explain the design principles of an artificial neural network. 8M (Dec’19)
7. List the steps to build an ANN. Briefly describe the advantages and disadvantages of using an ANN. 8M (Dec’19)
8. How are association rules represented? Describe the Apriori algorithm for association rules. 8M (July’19)
9. Create a regression model to predict the Test2 score from the Test1 score. Then predict the Test2 score for a student who scored 46 in Test1. 8M NS
Test1 Test2
59 56
52 63
44 55
51 50
42 66
42 48
41 58
45 36
27 13
63 50
54 81
44 56
50 64
47 50
10. X and Y are the two dimensions of interest. Determine the number of clusters and the center points of those clusters. 8M (July’19)
X Y
2 4
2 6
5 6
4 7
8 3
6 6
5 2
5 7
6 3
4 4
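The two data-driven questions above (9 and 10) can be explored with a short sketch. It fits a simple least-squares line to the Test1/Test2 table and runs a basic k-means (Lloyd's algorithm) on the X/Y table. The question bank does not prescribe a method, so both choices are assumptions. The value k = 3 and the three starting centroids are also assumptions made only for illustration; a full answer would need to justify the number of clusters, for example by plotting the points.

```python
import math

# Data copied from the tables in questions 9 and 10.
TEST1 = [59, 52, 44, 51, 42, 42, 41, 45, 27, 63, 54, 44, 50, 47]
TEST2 = [56, 63, 55, 50, 66, 48, 58, 36, 13, 50, 81, 56, 64, 50]
POINTS = [(2, 4), (2, 6), (5, 6), (4, 7), (8, 3),
          (6, 6), (5, 2), (5, 7), (6, 3), (4, 4)]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: assign points to the nearest centroid, then re-average."""
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
        if updated == centroids:  # assignments have stopped changing
            break
        centroids = updated
    return centroids, clusters


intercept, slope = fit_line(TEST1, TEST2)
predicted = intercept + slope * 46
print(f"Test2 = {intercept:.2f} + {slope:.3f} * Test1; at Test1 = 46: {predicted:.1f}")

# Assumption: k = 3, started from three of the given points.
start = [POINTS[0], POINTS[7], POINTS[8]]
centers, groups = kmeans(POINTS, start)
for center, members in zip(centers, groups):
    print(center, members)
```

With this data the fitted slope is close to 1 and the prediction for Test1 = 46 comes out at roughly 52. The cluster centers depend on the assumed k and starting centroids, so a different initialization may give a different grouping.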
Module 5
1. Briefly describe the applications of text mining in different domains. 8M NS
2. Compare text mining with data mining along different dimensions. 6M NS
3. What is the Naïve Bayes model? Explain with a classification example. 8M NS
4. What is SVM? Explain the SVM model with the kernel method. 8M NS
5. Write notes on the three types of web mining. 6M NS
6. What is social network analysis? How is it different from other data mining techniques such as clustering and decision trees? 8M NS