
YARN

Architecture

Before we start, there are 6 components that we need to get familiar with: Resource Manager, Scheduler, ApplicationsManager, ApplicationMaster, Node Manager, and Container.

YARN provides its core services via two types of long‐running daemon: a resource manager (one per cluster) to manage
the use of resources across the cluster, and node managers running on all the nodes in the cluster to launch and monitor
containers.

Resource Manager:
 There is only one Resource Manager per cluster, and it consists of two components:
1. ApplicationsManager
2. Scheduler

ApplicationsManager:
The ApplicationsManager is responsible for:

1. Accepting job submissions from clients.
2. Negotiating the first container for executing the application-specific ApplicationMaster.
3. Providing the service for restarting the ApplicationMaster container on failure.

So it is the first component I contact in order to submit a job. When it receives the job from me, it looks for a NodeManager that has enough resources to launch a container on it for the ApplicationMaster (we will explain what these are shortly).

If the ApplicationMaster runs into a problem, the ApplicationsManager can restart it on any other NodeManager.
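To make the submission step concrete, here is a minimal sketch of a client handing an application to the ResourceManager (ApplicationsManager) through YARN's YarnClient API in Java. It is illustrative only: the application name, the my-app-master.jar / MyAppMaster launch command, and the resource sizes are placeholders, not something taken from these notes.

import java.util.Collections;

import org.apache.hadoop.yarn.api.records.*;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SubmitToYarn {
    public static void main(String[] args) throws Exception {
        // The client talks to the ResourceManager (ApplicationsManager) through YarnClient.
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        YarnClientApplication app = yarnClient.createApplication();
        ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
        appContext.setApplicationName("demo-app");

        // Resources for the first container, which will run the ApplicationMaster.
        appContext.setResource(Resource.newInstance(1024, 1)); // 1 GB, 1 vcore

        // Command that starts the ApplicationMaster inside that first container
        // (my-app-master.jar / MyAppMaster are placeholders for your own AM).
        ContainerLaunchContext amContainer = ContainerLaunchContext.newInstance(
                Collections.emptyMap(),   // local resources (jars, files)
                Collections.emptyMap(),   // environment variables
                Collections.singletonList(
                        "java -cp my-app-master.jar MyAppMaster 1>stdout 2>stderr"),
                null, null, null);
        appContext.setAMContainerSpec(amContainer);

        // Allow the ApplicationsManager to restart the AM if its container fails.
        appContext.setMaxAppAttempts(2);

        ApplicationId appId = yarnClient.submitApplication(appContext);
        System.out.println("Submitted application " + appId);
    }
}

This is roughly what the MapReduce client does on your behalf when you submit a MapReduce job on Hadoop 2.
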
ApplicationMaster:
 There is one ApplicationMaster per application.
 The per‐application ApplicationMaster has the responsibility of:
1. Negotiating appropriate resource containers from the Scheduler.
2. Tracking the containers’ status and monitoring for application progress.

As we said, the first thing I do is submit the application/job to the ApplicationsManager, which then launches the ApplicationMaster.

The ApplicationMaster then requests resources from the Resource Manager, specifically from the Scheduler.
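For illustration, here is a hedged sketch of that side of the conversation: an ApplicationMaster registering itself with the ResourceManager and then asking the Scheduler for worker containers through the AMRMClient API. The container size, count, and priority are made-up values.

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AppMasterSketch {
    public static void main(String[] args) throws Exception {
        // The AM talks to the ResourceManager (Scheduler) through AMRMClient.
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(new YarnConfiguration());
        rmClient.start();

        // Register with the ResourceManager so the Scheduler knows this AM is alive.
        rmClient.registerApplicationMaster("", 0, "");

        // Negotiate worker containers from the Scheduler: 512 MB and 1 vcore each.
        Resource capability = Resource.newInstance(512, 1);
        Priority priority = Priority.newInstance(0);
        for (int i = 0; i < 4; i++) {
            rmClient.addContainerRequest(new ContainerRequest(capability, null, null, priority));
        }
        // The Scheduler grants containers asynchronously; the AM picks them up
        // through the allocate() heartbeat shown in the Scheduler section below.
    }
}
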

Scheduler:
 The Scheduler is responsible for allocating resources to the various running applications.
 The Scheduler performs no monitoring or tracking of status for the application. Also, it offers no guarantees about
restarting failed tasks either due to application failure or hardware failures.
 There are 3 scheduler types available: FIFO scheduler, Fair scheduler & Capacity scheduler.

After the ApplicationMaster requests resources from the Scheduler, the Scheduler contacts a NodeManager that has available resources and asks it to launch a container dedicated to that application.
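Continuing the ApplicationMaster sketch above, the AM finds out which containers the Scheduler has granted, and on which NodeManagers, through the allocate() heartbeat. The helper method and the one-second polling interval below are illustrative, not a required pattern.

import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class AllocationLoopSketch {
    // Heartbeat the Scheduler until it has granted the containers we asked for.
    static void waitForContainers(AMRMClient<ContainerRequest> rmClient, int wanted)
            throws Exception {
        int granted = 0;
        while (granted < wanted) {
            // allocate() is the AM <-> Scheduler heartbeat; the argument is a progress hint.
            AllocateResponse response = rmClient.allocate(0.1f);
            for (Container c : response.getAllocatedContainers()) {
                // Each Container names the NodeManager chosen by the Scheduler and the
                // resources reserved there for this application.
                System.out.println("Got container " + c.getId()
                        + " on node " + c.getNodeId()
                        + " with " + c.getResource());
                granted++;
            }
            Thread.sleep(1000); // don't flood the ResourceManager
        }
    }
}
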

NodeManager:
The NodeManager is the per‐machine framework agent who is responsible for containers, monitoring their resource usage
(cpu, memory, disk, network) and reporting the same to the ResourceManager/Scheduler.

In short, it is responsible for launching the containers, monitoring them, and sending reports about them to the Scheduler.
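As a minimal sketch of that launch step, an AM typically uses NMClient to ask the NodeManager that owns an allocated container to actually start it. The worker.jar / Worker command is a placeholder.

import java.util.Collections;

import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class LaunchOnNodeManagerSketch {
    // Ask the NodeManager that owns the allocated container to actually start it.
    static void launch(Container container) throws Exception {
        NMClient nmClient = NMClient.createNMClient();
        nmClient.init(new YarnConfiguration());
        nmClient.start();

        // The command below is a placeholder for whatever the worker should run.
        ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
                Collections.emptyMap(),   // local resources
                Collections.emptyMap(),   // environment
                Collections.singletonList("java -cp worker.jar Worker 1>stdout 2>stderr"),
                null, null, null);

        // The NodeManager launches the process and will monitor its resource usage.
        nmClient.startContainer(container, ctx);
    }
}
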

Container:
A container executes an application‐specific process with a constrained set of resources (memory, CPU, and so on).

So a container is simply a set of resources that I reserve and dedicate to a specific application.


How YARN runs an application:

1. A client contacts the resource manager (specifically ApplicationsManager) and asks it to run an application master
process/container. (Step 1)
2. The resource manager then finds a node manager that can launch the application master in a container (steps 2a
and 2b).
3. The ApplicationMaster could simply run a computation in the container it is running in and return the result to the
client. Or it could request more containers from the resource manager (specifically the Scheduler) (step 3).
4. The ApplicationMaster then uses these containers to run a distributed computation (steps 4a and 4b).

Note: When the ApplicationMaster requests resources from the Scheduler, it requests 2 things:

I. Compute resources per container (CPU, memory)
II. Container locality

Container locality means that the ApplicationMaster asks for the containers to be launched on nodes where the data exists. For example, if a container is going to process an HDFS block, the ApplicationMaster will request that the container be launched on one of the nodes that holds a replica of that block.
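As an illustration of those two parts of a request, the sketch below builds an AMRMClient.ContainerRequest that carries both a per-container resource specification and a list of preferred hosts. The replica hostnames are assumed to come from the HDFS block's locations; the sizes are arbitrary.

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class LocalityRequestSketch {
    static ContainerRequest requestNearBlock(String[] replicaHosts) {
        // I. Compute resources per container: 2 GB of memory and 1 vcore.
        Resource capability = Resource.newInstance(2048, 1);

        // II. Container locality: prefer the nodes that hold a replica of the HDFS
        // block this container will process; racks are left for YARN to derive here.
        return new ContainerRequest(capability, replicaHosts, null, Priority.newInstance(0));
    }
}
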
MapReduce 1 vs YARN:
 The distributed implementation of MapReduce in the original version of Hadoop (version 1 and earlier) is
sometimes referred to as “MapReduce 1” to distinguish it from MapReduce 2, the implementation that uses YARN
(in Hadoop 2 and later).

 In MapReduce 1, there are two types of daemon that control the job execution process: a jobtracker and one or
more tasktrackers.

 In Hadoop 1, JobTracker was responsible for scheduling jobs to TaskTrackers, and also monitoring the progress for
these tasks.
In Hadoop 2, the scheduling is now the ResourceManager’s responsibility & the task progress monitoring is the
ApplicationMaster’s responsibility.

 In Hadoop 1, TaskTrackers were responsible for running the tasks and sending progress reports back to the
JobTracker.
In Hadoop 2, NodeManager is responsible for running the tasks and reporting back to the ResourceManager.

 A comparison of MapReduce 1 and YARN components


MAPREDUCE 1        YARN
JobTracker         ResourceManager, ApplicationMaster
TaskTracker        NodeManager
Slot               Container

Scheduler types:
As we said before, there are 3 types of schedulers: the FIFO Scheduler, the Capacity Scheduler, and the Fair Scheduler.

The FIFO Scheduler has the merit of being simple to understand and not needing any configuration, but it’s not suitable for
shared clusters (clusters where there are multiple users submitting multiple jobs). Large applications will use all the
resources in a cluster, so each application has to wait its turn. On a shared cluster it is better to use the Capacity Scheduler
or the Fair Scheduler. Both of these allow long‐running jobs to complete in a timely manner, while still allowing users who
are running concurrent smaller ad hoc queries to get results back in a reasonable time.
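Which scheduler a cluster runs is normally chosen by the administrator in yarn-site.xml via the yarn.resourcemanager.scheduler.class property. The snippet below only illustrates that property and the three scheduler class names through Hadoop's Configuration API; it is not something an application would usually set itself.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SchedulerChoiceSketch {
    public static void main(String[] args) {
        Configuration conf = new YarnConfiguration();

        // yarn.resourcemanager.scheduler.class selects the scheduler implementation:
        //   org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
        //   org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
        //   org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler
        conf.set(YarnConfiguration.RM_SCHEDULER,
                "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler");

        System.out.println("Scheduler: " + conf.get(YarnConfiguration.RM_SCHEDULER));
    }
}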

In the following examples we have 2 jobs running. A large job that takes a lot of time (job 1) and a small job (job 2).

1.FIFO scheduler

In the FIFO Scheduler, once job 1 is submitted it will take all the resources and won't free them until it finishes executing. So, as we see, even though job 2 was submitted early, it wasn't executed until job 1 was finished.

The FIFO Scheduler is simple: it's First In, First Out. The first job that gets submitted will take all the resources until it finishes executing, and then the second job in the queue will take its turn in getting all the resources, and so on.
2.Capacity scheduler

With the Capacity Scheduler, a separate dedicated queue allows the small job to start as soon as it is submitted, although this is at the cost of overall cluster utilization since the queue capacity is reserved for jobs in that queue. This means that the large job finishes later than when using the FIFO Scheduler.

The Capacity Scheduler divides the cluster resources into a set of queues. These queues are something I configure myself; for example, I can set up a dedicated queue for each group of users or groups.

A job cannot take more resources than what is available in its own queue. That is why here job 1 took a longer time, but job 2 was executed as soon as it was submitted.

3.Fair scheduler
With the Fair Scheduler, there is no need to reserve a set amount of capacity, since it will dynamically balance
resources between all running jobs. Just after the first (large) job starts, it is the only job running, so it gets all the
resources in the cluster. When the second (small) job starts, it is allocated half of the cluster resources so that each job is
using its fair share of resources.

Note that there is a lag between the time the second job starts and when it receives its fair share, since it has to wait for
resources to free up as containers used by the first job complete. After the small job completes and no longer requires
resources, the large job goes back to using the full cluster capacity again. The overall effect is both high cluster utilization
and timely small job completion.

Please keep me in your prayers.
