Sample Question Bank 1 on Big Data Analytics

Q.1. Explain, with a diagram, how large files are managed by the Hadoop Distributed
File System (HDFS).
Q.2. Explain what is meant by “Big Data Analytics”. Mention the different
characteristics and sources of Big Data.
Q.3. Compare the NameNode and the DataNode with respect to HDFS.
Q.4. Mention the features of cloud computing that can be exploited to handle Big
Data.
Q.5. Explain why data blocks are replicated in the Hadoop file system. Hadoop systems
are considered to be fault tolerant and highly available. Justify this with suitable
reasons.
Q.6. Define Big Data and explain why traditional data processing methods are
inadequate for handling it.
Q.7. Describe the role of the Hadoop Distributed File System (HDFS). Explain how it
stores and manages large files across multiple nodes.
Q.8. Predict the potential challenges that Hadoop might face in the coming years.
Suggest how the Hadoop ecosystem may adapt to overcome these challenges.

Q.9. Define the MapReduce programming model. Explain the concepts of Map and
Reduce functions with examples.
Q.10. Explain what Hive is. Relate Hive to Hadoop. Discuss the purpose it serves in
the big data ecosystem.
Q.11. Explain the structure of Hive tables. Describe external tables. Differentiate
them from managed tables.
Q.12. Explain what is meant by “Big Data Analytics”. Mention the different
characteristics and sources of Big Data.
Q.13. Relate MapReduce to HDFS with proper justification.
Q.14. Explain why data blocks are replicated in the Hadoop file system. “Hadoop
systems are considered to be fault tolerant and highly available.” Justify this with
proper reasons.
Q.15. Define Big Data and explain its 4 V's (Variety, Volume, Velocity, Veracity)
along with their importance.
