Important Questions-Bigdata

Bigdata
Important Questions: Unit 1, Unit 2, Unit 3, Unit 4, and Unit 5

1. List any five Big Data platforms. [2 Marks]
2. Write any two industry examples of Big Data. [2 Marks]
3. What is the role of Sort & Shuffle in Map-Reduce? [2 Marks]
4. Give the full form of HDFS. [2 Marks]
5. What is the block size of HDFS? [2 Marks]
6. Name the two types of nodes in Hadoop. [2 Marks]
7. What is big data? Why do we need to analyse big data? [2 Marks]
8. List the tools related to Hadoop. [2 Marks]
9. State the purpose of Hadoop Pipes. [2 Marks]
10. What is MapReduce? [2 Marks]
11. Explain the Hadoop Distributed File System. [2 Marks]
12. Write down any 4 industry examples of big data. [2 Marks]
13. What is Hadoop Architecture? [2 Marks]
14. List the entities of YARN. [2 Marks]
15. List the characteristics of big data. [2 Marks]
16. Compare classic MapReduce with YARN. [2 Marks]
17. Explain in detail the three dimensions of Big Data. [5 Marks/10 Marks]
18. Illustrate the architecture of Map-Reduce. [5 Marks/10 Marks]
19. Examine how a client reads and writes data in HDFS. [5 Marks/10 Marks]
20. Discuss in detail the different forms of Big Data. [5 Marks/10 Marks]
21. Elaborate various components of Big Data architecture. [5 Marks/ 10 Marks]
22. Explain the detailed architecture of Map-Reduce. [10 Marks]
23. Differentiate between "scale up" and "scale out". Explain with an example how Hadoop uses the scale-out feature to improve performance. [10 Marks]
24. Demonstrate the design of HDFS and its concepts in detail. [10 Marks]
25. Write the benefits and challenges of HDFS. [10 Marks]
26. Explain the anatomy of a MapReduce job. [10 Marks]
27. Discuss the different types and formats of MapReduce with examples. [10 Marks]
28. Explain master-slave and peer-to-peer replication in detail. [10 Marks]
29. Relate crowd sourcing and big data. Justify the relationship with an example. [10 Marks]
30. Explain in detail MapReduce workflows. [10 Marks]
31. Discuss in detail the basic building blocks of Hadoop with a neat sketch. [10 Marks]
32. Explain in detail the components of MapReduce. Consider the following MovieLens dataset and find out how many movies each user rated using MapReduce. [10 Marks]
UserId  MovieId  Rating  Timestamp
196     241      3       10:30
186     302      3       18:56
196     377      1       20:34
244     51       2       09:34
166     346      1       15:40
186     474      4       11:45
186     265      2       10:20
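A minimal sketch of one way question 32 could be approached, written in plain Python rather than on Hadoop so that the map, shuffle/sort, and reduce stages are visible. The function names (map_phase, shuffle_and_sort, reduce_phase, count_ratings_per_user) are illustrative assumptions and do not belong to any Hadoop API.

```python
from collections import defaultdict

# Sample rows from the question: (UserId, MovieId, Rating, Timestamp)
RATINGS = [
    (196, 241, 3, "10:30"),
    (186, 302, 3, "18:56"),
    (196, 377, 1, "20:34"),
    (244, 51, 2, "09:34"),
    (166, 346, 1, "15:40"),
    (186, 474, 4, "11:45"),
    (186, 265, 2, "10:20"),
]


def map_phase(record):
    """Emit (UserId, 1) for every rating record."""
    user_id, _movie_id, _rating, _timestamp = record
    yield user_id, 1


def shuffle_and_sort(pairs):
    """Group all values by key and return the groups sorted by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Sum the 1s emitted for a user to get that user's rating count."""
    return key, sum(values)


def count_ratings_per_user(records):
    """Run the three stages in sequence on an in-memory list of records."""
    mapped = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle_and_sort(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped)


result = count_ratings_per_user(RATINGS)
print(result)  # {166: 1, 186: 3, 196: 2, 244: 1}
```

On a real cluster the mapper and reducer would run as separate distributed tasks and the framework would perform the shuffle; the sketch only mirrors the data flow.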
33. Explain in detail the components of MapReduce. Draw the complete workflow of MapReduce for Word Count. [10 Marks]
34. Explain in detail the components of MapReduce. Draw the complete workflow of MapReduce for the sum of even and odd numbers. [10 Marks]
35. Suppose we have a file of size 1024 MB. Compute how many blocks HDFS will create for it, and compute the size of the last block. (The default block size is 128 MB.) [2 Marks]
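The arithmetic behind question 35 can be sketched as follows: the file is split into full blocks of the configured size, and any remainder forms a smaller final block. The helper name hdfs_blocks is a hypothetical name chosen for illustration.

```python
def hdfs_blocks(file_size_mb, block_size_mb=128):
    """Return (number_of_blocks, size_of_last_block_mb) for a file.

    Sizes are whole megabytes; the 128 MB default is the block size
    stated in the question.
    """
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    if remainder == 0:
        # The file divides evenly, so the last block is a full block
        # (or there are no blocks at all for an empty file).
        return full_blocks, block_size_mb if full_blocks else 0
    # Otherwise one extra, partially filled block holds the remainder.
    return full_blocks + 1, remainder


blocks, last_block_mb = hdfs_blocks(1024)
print(blocks, last_block_mb)  # 8 128
```

Since 1024 / 128 = 8 with no remainder, the file occupies 8 blocks and the last block is a full 128 MB. For comparison, a 1000 MB file would give 8 blocks with a last block of 104 MB.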
36. Consider the file File.txt below and write the steps to compute the word count using a Pig Latin script. [Unit 5 - 10M]
File.txt
This is a hadoop class
hadoop is a bigdata technology
37. Consider the student data file (st.txt) with data in the following format: Name, District, Age, Gender. [Unit 5 - 10M]
i. Write a Pig script to display the names of all male students.
ii. Write a Pig script to find the number of students from Ghaziabad district.
iii. Write a Pig script to display the district-wise count of all female students.
38. Using the information given below, analyse the Twitter data and explain the steps involved in finding how many tweets each user has created using Pig Latin. [Unit 5 - 10M]
User table
UId  Name
1    Ram
2    Rahul
3    Rita
4    Neha
Tweet table
UId  Tweet
1    Google…
2    Politics…
3    Olympiad
1    Politics
2    Tennis…
4    Google…
39. Define HBase. Explain the HBase architectural components with a suitable diagram. [Unit 5 - 10M]
40. Define Hive. Design and explain the detailed architecture of Hive. [Unit 5 - 10M]
41. Explore the various execution modes of Pig. [Unit 5 - 5M]
42. Write in detail about the HBase data model and the Pig data model. [Unit 5 - 5M]
43. Differentiate between HBase and RDBMS. [Unit 5 - 5M]
44. Differentiate between Hive and RDBMS. [Unit 5 - 5M]
45. Differentiate between Pig and RDBMS. [Unit 5 - 5M]
46. Differentiate between HBase, Hive, Pig, and RDBMS. [Unit 5 - 5M]
47. Write down the Hive queries for natural join and outer join. Give examples. [Unit 5 - 5M]
48. Provide an overview of the HBase data model. [Unit 5 - 5M]
49. Why is Hive preferred over Pig Latin? [Unit 5 - 2M]
50. How are Date and Time data types used in Hive? [Unit 5 - 2M]
51. Mention the usage of Grunt. [Unit 5 - 2M]
52. Discuss the queries involved in Hive definition. [Unit 5 - 10M]
53. Explore the various execution models of Pig. [Unit 5 - 10M]
54. Design and explain the detailed architecture of Hive. [Unit 5 - 10M]
55. Differentiate between MapReduce, Pig, and Hive. [Unit 5 - 10M]
56. Compare and contrast NoSQL and relational databases. [Unit 4 - 2M]
57. Does MongoDB support ACID properties? Justify your answer. [Unit 4 - 2M]
58. Describe schema. [Unit 4 - 2M]
59. With the help of a suitable example, explain how CRUD operations are performed in MongoDB. [Unit 4 - 10M]
60. Classify and detail the different types of NoSQL. [Unit 4 - 10M]
61. Summarize the role of indexing in MongoDB using an example. [Unit 4 - 10M]
62. Discuss the limitations of MapReduce. Explain how Apache Spark works to overcome the limitations of MapReduce. [Unit 4 - 10M]
63. Explain the types of inheritance supported by Scala. [Unit 4 - 10M]
64. Is multiple inheritance implemented in Scala? Justify your answer. If not, explain with an example how multiple inheritance can be achieved in Scala. [Unit 4 - 10M]
65. Explain HDFS Federation with a neat diagram. [Unit 4 - 10M]
66. Explain in brief the working architecture of Spark. Assume there are 10 nodes in the Spark cluster, each node having 16 cores and 64 GB RAM, with 5 cores per executor. Find: [Unit 4 - 10M]
i. Total number of cores in the cluster
ii. Number of executors
iii. Number of executors per node
iv. Memory required for each executor
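One common rule-of-thumb approach to the sizing in question 66 is sketched below. The reservations are assumptions that the question does not state: 1 core and 1 GB of RAM per node are set aside for the operating system and Hadoop daemons, 1 executor is given up for the YARN ApplicationMaster, and 7% of each executor's memory is held back as off-heap overhead. The names plan_executors and ExecutorPlan are hypothetical, and every reservation is a parameter so that a different convention can be substituted.

```python
from dataclasses import dataclass


@dataclass
class ExecutorPlan:
    total_cores: int
    executors_per_node: int
    total_executors: int
    memory_per_executor_gb: float


def plan_executors(nodes, cores_per_node, ram_per_node_gb, cores_per_executor,
                   reserved_cores=1, reserved_ram_gb=1,
                   reserved_executors=1, overhead_fraction=0.07):
    """Rule-of-thumb executor sizing; all reservations are assumptions."""
    # i. Raw core count across the cluster.
    total_cores = nodes * cores_per_node
    # iii. Executors that fit on one node after reserving cores for the OS.
    usable_cores = cores_per_node - reserved_cores
    executors_per_node = usable_cores // cores_per_executor
    # ii. Cluster-wide executors, minus the one assumed for the ApplicationMaster.
    total_executors = nodes * executors_per_node - reserved_executors
    # iv. Memory per executor after reserving RAM and subtracting overhead.
    raw_memory_gb = (ram_per_node_gb - reserved_ram_gb) / executors_per_node
    memory_per_executor_gb = raw_memory_gb * (1 - overhead_fraction)
    return ExecutorPlan(total_cores, executors_per_node,
                        total_executors, memory_per_executor_gb)


plan = plan_executors(nodes=10, cores_per_node=16,
                      ram_per_node_gb=64, cores_per_executor=5)
print(plan)
```

Under these assumptions the cluster has 160 cores in total, 3 executors per node, 29 executors overall, and about 19.5 GB per executor (21 GB before the overhead is subtracted). An answer that makes no reservations would give different figures, so the assumptions should be stated alongside the result.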

67. Discuss the need for data ingestion. [Unit 3 - 2M]
68. Explain in brief Hadoop security. [Unit 3 - 5M]
69. Explain serialization and deserialization. Explain the working of Avro with a suitable example. [Unit 3 - 10M]
70. Explain the working of the Flume architecture with a suitable diagram. [Unit 3 - 10M]
71. Explain the extraction and insertion operations of Sqoop with a suitable diagram. [Unit 3 - 10M]
72. List the various operational modes of Hadoop cluster configuration and explain in detail how to configure/install Hadoop in local/standalone mode. [Unit 3 - 10M]
73. Demonstrate the design of HDFS and its concepts in detail. [Unit 3 - 10M]
