Friday, August 7, 2015

Introduction to YARN

YARN/Hadoop 2.x has a completely different architecture with compared to Hadoop 1.x.
In Hadoop 1.x JobTracker serves two major functions;
  1. Resource management
  2. Job scheduling/Job monitoring
Recall that in Hadoop 1.x, there is a single JobTracker per Hadoop cluster, serving these function while scaling can overwhelm the JobTracker. Also having a single JobTracker makes it a single point of failure, if the JobTracker goes down the entire cluster goes down with all the current jobs.
YARN tries to separate above mentioned two functionalities into two daemons.
  1. Global Resource Manager
  2. Per-application Application Master
Before YARN, Hadoop designed to support MapReduce type of jobs only. As time goes by people came up with Big Data computation problems which cannot be addressed by MapReduce, hence came up with different frameworks, which works on top of HDFS, to address their problems a better way. Some of which are Apache Spark, Apache HAMA and Apache Giraph. YARN provides a way for these new frameworks, to be integrated into Hadoop framework, sharing the same underlying HDFS. This way YARN enables Hadoop to handle jobs beyond MapReduce.

Yarn Architecture Overview


Yarn Architecture. [Image Source 'http://hadoop.apache.org/']

Yarn Components


As mentioned earlier, with the architectural refurbishment, YARN introduces a whole new set of terminologies which we have to be familiar with.

YARN uses the term 'application' instead of the term 'job', which is used in Hadoop 1.x. In YARN an application can be a single job or a Direct Acyclic Graph(DAG) of jobs, and an application does not essentially have to be a MapReduce type. An application is an instance of a certain application type, which associated with an Application Master. For each application an ApplicationMaster instance will be initiated.

YARN components includes;
  • Container
  • Global Resource Manager
  • Per-node Node Manager
  • Per-application Application Master
Let's go through each component one by one;

Container

Container is a place where a unit of work occurs. For instance each MapReduce task(not the entire job) runs in one container. An application/job will run on one or more containers. Set of physical resources are allocated for each container, currently CPU core and RAM are supported. Each node in a Hadoop cluster can run several containers.

Global Resource Manager

Resource Manager consists of two main elements;
  1. Scheduler
  2. Application Manager
The pluggable Scheduler is accountable for allocating resources to running applications. Scheduling resources are done based on the resource requirements of the applications and ensures optimal resource utilization. Example pluggable Schedulers includes Capacity Scheduler and Fair Scheduler.

The Application Master does several jobs;
  • Accepts job submissions by client programs.
  • Negotiate the first container to execute per-application Application Master.
  • Provide the service to restart a failed Application Master container.
As you can recall in Hadoop 1.x JobTracker handled restarting failed tasks and monitoring each task status. As you can observe in YARN ResourceManager does not handle any of these tasks, instead they have delegated to a different component called per-application Application Master, which we will encounter later. This separation has made the ResourceManager the ultimate authority for allocating resources, and it also decreases the load on ResourceManager and enables it to scale more than the JobTracker.

You may have noticed, Global Resource Manager could be a single point of failure. After its 2.4 release, Hadoop introduced the high availability Resource Manager concept, having Active/Standby ResourceManager pair to remove this single point of failure. you can read more about it from this link.

Per-node Node Manager

Node Manager is the component which actually provisions resources to applications. NodeManager daemon is a slave service which runs on each computation node in a Hadoop cluster.

NodeManager takes resource requests from ResourceManager and provisions Containers to applications. NodeManager keeps track of the health of each node and reports to ResourceManager periodically and that way ResourceManager can keep track of global health.

During each node startup, they registers with the ResourceManager and let ResourceManager know the amount of resources available. These informations are updated periodically.

NodeManager manages resources and periodically reports to ResourceManager about node status, it does not know anything about application status. Applications are handled by a different component called ApplicationMaster, which we are going to discuss next.

Per-application Application Manager

ApplicationMaster negotiates resource containers which are required to execute the application from ResourceManager and obtain resources from NodeManager and executes application. For each application which is actually an instance of a certain application type, an ApplicationManager instance is initiated.

In Hadoop 1.x when a task fails the JobTracker is responsible to re-execute that task, this increases the load of JobTracker and reduce its scalability.

In Hadoop 2.x ResourceManager is only accountable for scheduling resources. ApplicationMaster is responsible for negotiating resource containers from ResourceManager, if a task fails ApplicationMaster negotiates resources from ResourceManager and tries to re-execute the failed task.

Hadoop 1.x only supports MapReduce type of jobs as its design is tightly coupled to solve MapReduce type of computations. In contrast Hadoop 2.x has a more pluggable architecture and supports new frameworks which uses HDFS underneath. A new framework can be plugged in and play with Hadoop framework by developing its ApplicationMaster.

How YARN handles a client request?


When a client program submits an application to YARN framework, the ApplicationMaster gets decided based on application type. ResourceManager negotiates with NodeManager to obtain a Container to execute an instance of the ApplicationMaster. After ApplicationMaster instance is initiated, it gets registered with the ResourceManager. Client communicates with ApplicationMaster though the ResourceManager. ApplicationMaster negotiates with the ResourceManager for resources on a specific node, and obtains actual resources from NodeManager most probably of that specific node. Application codes which run on Containers reports their status to ApplicationMaster periodically. After job completion, AppMaster deregisters with ResourceManager and the containers used are released.

HDFS High Availability


In Hadoop 1.x there is a single NameNode which makes it a single point of failure. If the NameNode fails the entire cluster becomes inaccessible. To avoid this hassle, Hadoop 2.x introduces a High Availability NameNode concept, by having Active/Standby NameNode pair. In high level, while Active NameNode serves client requests, the Standby NameNode constantly synchronizes with the Active NameNode. If the Active NameNode fails, Standby NameNode, becomes the Active NameNode and keeps serving client requests. You can read in depth details from here.