distcp Command in Hadoop: Big Data Analytics QB-3 Module-4
MODULE-4
1. Demonstrate the use of the PathFilter interface in the Hadoop FileSystem and also explain the use of
the distcp command in Hadoop.
5. Discuss in detail how the High Availability support solves the single point of failure.
MODULE-5
13. Explain the model of the Hadoop FileSystem that describes the visibility of data to readers in
read/write operations.
15. Write unit tests for the mapper and reducer that find the maximum temperature in the weather dataset.
16. Sketch a neat diagram and explain the logical data flow in MapReduce with
a. Single reduce task
b. Multiple reduce tasks
17. Demonstrate the use of GenericOptionsParser in running a job from the command line. List its
options.
19. Discuss in detail how the High Availability support solves the single point of failure.
20. Develop a MapReduce program to find the maximum recorded temperature using the weather
dataset.
22. Write a MapReduce program to analyse the weather dataset using the old MapReduce API.
23. Develop the Map and Reduce functions in the Ruby and Python programming languages.
24. Demonstrate, using a neat diagram, how the MapReduce model works with a single reduce task.
25. Demonstrate the Map and Reduce functions in the C++ programming language.
26. Identify concerns and derive appropriate solutions when analyzing data with Unix tools.