
Shivajirao Kadam Institute of Technology and Management, Indore (M.P.)
Department of Computer Science and Engineering

Lecture on “Hadoop 1.x Vs Hadoop 2.x”

Hadoop V.1.x Components
• Apache Hadoop V.1.x has the following two major components:
  • HDFS (HDFS V1)
  • MapReduce (MR V1)
• In Hadoop V.1.x, these two are also known as the Two Pillars of Hadoop.

Hadoop V.2.x Components
• Apache Hadoop V.2.x has the following three major components:
  • HDFS V.2
  • YARN (resource management; also referred to as MR V2)
  • MapReduce (data processing; now runs on top of YARN)
• In Hadoop V.2.x, these three are also known as the Three Pillars of Hadoop.

Hadoop V.1.x Limitations
• Hadoop 1.x has the following limitations/drawbacks, starting with its fixed Map and Reduce slots:
  • For example, suppose 10 Map and 10 Reduce tasks are given 10 + 10 slots to perform a computation. All Map tasks are busy, but all Reduce slots are idle. Those idle slots cannot be used for any other purpose.
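
To make the static-slot drawback concrete, here is a minimal Python sketch (not Hadoop code). It compares a fixed split of 10 Map + 10 Reduce slots with a single shared pool of 20 units. The function names `run_static_slots` and `run_shared_pool`, and the workload of 20 Map tasks with no Reduce tasks yet, are assumptions made only for illustration.

```python
def run_static_slots(map_slots, reduce_slots, map_tasks, reduce_tasks):
    """Fixed-slot model: a Map slot runs only Map tasks, a Reduce slot only Reduce tasks."""
    running_maps = min(map_slots, map_tasks)
    running_reduces = min(reduce_slots, reduce_tasks)
    return {
        "running": running_maps + running_reduces,
        "idle": (map_slots - running_maps) + (reduce_slots - running_reduces),
        "waiting": (map_tasks - running_maps) + (reduce_tasks - running_reduces),
    }


def run_shared_pool(total_slots, map_tasks, reduce_tasks):
    """Shared-pool model: any free unit of capacity can run any pending task."""
    pending = map_tasks + reduce_tasks
    running = min(total_slots, pending)
    return {
        "running": running,
        "idle": total_slots - running,
        "waiting": pending - running,
    }


# Hypothetical workload: 20 Map tasks are ready, no Reduce task can start yet.
static = run_static_slots(map_slots=10, reduce_slots=10, map_tasks=20, reduce_tasks=0)
shared = run_shared_pool(total_slots=20, map_tasks=20, reduce_tasks=0)

print(static)  # {'running': 10, 'idle': 10, 'waiting': 10}
print(shared)  # {'running': 20, 'idle': 0, 'waiting': 0}
```

With the fixed split, 10 Map tasks wait while 10 Reduce slots stay idle. With a shared pool, the same total capacity runs all 20 tasks.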

Hadoop V.1.x Limitations (continued)
• It supports up to 4,000 nodes per cluster.
• It has a single component, the JobTracker, that performs many activities: resource management, job scheduling, job monitoring, re-scheduling jobs, etc.
• It supports only one NameNode and one namespace per cluster.
• It runs only Map/Reduce jobs.

Differences between Hadoop 1.x and Hadoop 2.x
• Comparing the components of Hadoop 1.x and 2.x, the Hadoop 2.x architecture has one new component: YARN (Yet Another Resource Negotiator).
• YARN is the game-changing component of the Hadoop Big Data system: it takes over cluster resource management from MapReduce.

Differences between Hadoop 1.x and Hadoop 2.x (continued)
• Hadoop 1.x supports only one namespace for managing the HDFS filesystem, whereas Hadoop 2.x supports multiple namespaces.
• Hadoop 1.x has many scalability limitations. Hadoop 2.x overcomes them with its new architecture.
• Hadoop 1.x supports a maximum of 4,000 nodes per cluster, whereas Hadoop 2.x supports more than 10,000 nodes per cluster.
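
As a rough illustration of the single-namespace versus multiple-namespace point above, the sketch below models a namespace as a path prefix served by one NameNode. It is a toy model in Python, not the HDFS client API. The mount points, the NameNode names, and the `resolve` helper are all hypothetical.

```python
# Hadoop 1.x style: one NameNode owns the whole directory tree.
SINGLE_NAMESPACE = {"/": "namenode-1"}

# Hadoop 2.x style: several namespaces, each owned by its own NameNode.
MULTI_NAMESPACE = {
    "/user": "namenode-1",
    "/logs": "namenode-2",
    "/tmp": "namenode-3",
}


def resolve(mount_table, path):
    """Return the NameNode whose mount point is the longest prefix of path."""
    matches = [
        prefix
        for prefix in mount_table
        if path == prefix or path.startswith(prefix.rstrip("/") + "/")
    ]
    if not matches:
        raise KeyError(f"no namespace covers {path}")
    return mount_table[max(matches, key=len)]


print(resolve(SINGLE_NAMESPACE, "/logs/app/today.log"))  # namenode-1
print(resolve(MULTI_NAMESPACE, "/logs/app/today.log"))   # namenode-2
print(resolve(MULTI_NAMESPACE, "/user/alice/data.csv"))  # namenode-1
```

In the single-namespace table every path lands on the same NameNode. In the multi-namespace table the metadata load is spread across several.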

Hadoop 1.x Vs Hadoop 2.x (Working)
• In Hadoop 1, HDFS is used for storage and, on top of it, MapReduce handles both resource management and data processing. This double workload on MapReduce affects performance.
• In Hadoop 2, HDFS is again used for storage and, on top of HDFS, YARN handles resource management. It allocates resources and keeps jobs running, leaving data processing to the frameworks that run on it, such as MapReduce.
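
The split described above can be sketched as a toy model in Python. It is not Hadoop or YARN code. The class names (`ToyHadoop1`, `ToyYarn`, `ToyFramework`), the slot and container counts, and the second engine name are assumptions for illustration only.

```python
class ToyHadoop1:
    """Hadoop 1 style: one layer both manages resources and processes data."""

    def __init__(self, slots):
        self.free_slots = slots

    def submit(self, job_type, tasks):
        # Resource management and processing are tied together,
        # so only MapReduce jobs are accepted.
        if job_type != "mapreduce":
            raise ValueError("this layer runs only MapReduce jobs")
        granted = min(tasks, self.free_slots)
        self.free_slots -= granted
        return granted


class ToyYarn:
    """Hadoop 2 style: this layer only hands out resources."""

    def __init__(self, containers):
        self.free_containers = containers

    def allocate(self, requested):
        granted = min(requested, self.free_containers)
        self.free_containers -= granted
        return granted


class ToyFramework:
    """A processing framework that asks the resource layer for capacity."""

    def __init__(self, name, resource_layer):
        self.name = name
        self.resource_layer = resource_layer

    def run(self, tasks):
        started = self.resource_layer.allocate(tasks)
        return f"{self.name}: {started} of {tasks} tasks started"


yarn = ToyYarn(containers=8)
print(ToyFramework("mapreduce", yarn).run(5))     # mapreduce: 5 of 5 tasks started
print(ToyFramework("other-engine", yarn).run(5))  # other-engine: 3 of 5 tasks started
```

Because `ToyYarn` knows nothing about the kind of work it allocates for, the processing logic can change without touching resource management.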

Thank you
