Professional Documents
Culture Documents
Running Head: Title of Paper in Caps 1: Hadoop, Mapreduce and HDFS: A Developers Perspective
Running Head: Title of Paper in Caps 1: Hadoop, Mapreduce and HDFS: A Developers Perspective
Running Head: Title of Paper in Caps 1: Hadoop, Mapreduce and HDFS: A Developers Perspective
Article Summary
Name of Student
Institution Affiliation
ARTICLE SUMMARY
ARTICLE SUMMARY
In this summary, the authors of the article [1] describe the necessity of big-data
friendly architectures that allow for adequate processing of the large sized data. They discuss
Discussion
The authors first present the need of deploying storage clusters, stating that this
scalability. Apache Hadoop is the most popular open source implementation of the Google-
based MapReduce programming model. It allows big data processing on affordable computer
hardware. This is a game changer for big companies like Facebook, Google, etc..
Migrating towards cloud computing is not only cost effective, but also provides high
ease the transition into the cloud domain. The Hadoop architecture is as follows. It consists of
five different daemons, each running its own JVM. Fig.1 from [1] shows the hierarchy of
these daemons. The Hadoop cluster is made up of master nodes, which run the master
daemons on the layers, and slave nodes which are run by the remainder of the machines.
continues to provide service even in the event of a node failure, thus reducing the probability
Hadoop. It is named so because it consists of two processes for data analysis, Map phase and
Reduce phase.
What I found very interesting is the fact that master nodes can also play the role of
slave nodes, because it shows the need for more slave nodes sometimes, while it is not the
case vice versa because slave nodes cannot become master nodes. This has also raised a
ARTICLE SUMMARY
question for me, if the master nodes can become slave nodes, what are the implications of this
Conclusion
This article has helped familiarize me with the world of big data, and how big tech
companies like Google and Facebook require very efficient query processing while still
keeping costs at minimum, and that this is provided by the Hadoop architecture. I also
References
[1]. Ghazi, M. and Gangodkar, D. (2018). Hadoop, MapReduce and HDFS: A Developers
Perspective.