
How To Make The Best Use Of Live Sessions

• Please log in 10 mins before the class starts and check your internet connection to avoid any network issues during the LIVE
session

• All participants will be on mute by default to avoid any background noise. However, the instructor will unmute you if
required. Please use the "Questions" tab on your webinar tool to interact with the instructor at any point during the class

• Feel free to ask and answer questions to make your learning interactive. The instructor will address your queries at the end of
the ongoing topic

• If you want to connect to your Personal Learning Manager (PLM), dial +917618772501

• We have a dedicated support team to assist with all your queries. You can reach us anytime on the numbers below:
US: 1855 818 0063 (Toll-Free) | India: +91 9019117772

• Your feedback is very much appreciated. Please share feedback after each class; it will help us enhance your learning
experience



Big Data & Hadoop Certification Training



Course Outline
▪ Understanding Big Data and Hadoop
▪ Hadoop Architecture and HDFS
▪ Hadoop MapReduce Framework
▪ Advance MapReduce
▪ Pig
▪ Hive
▪ Advance Hive and HBase
▪ Advance HBase
▪ Processing Distributed Data with Apache Spark
▪ Apache Oozie and Hadoop Project
▪ Kafka Producer
▪ Kafka Consumer
▪ Kafka Operation and Performance Tuning
▪ Kafka Cluster Architectures & Administering Kafka
▪ Kafka Monitoring & Stream Processing
▪ Integration of Kafka with Hadoop & Storm
▪ Integration of Kafka with Spark & Flume
▪ Kafka Project



Module 2: Hadoop Architecture and HDFS



Topics
Following are the topics covered in this module:
▪ Hadoop 2.x cluster architecture
▪ Hadoop 2.x – High Availability
▪ Hadoop 2.x – Resource Management
▪ Hadoop Cluster Modes
▪ Hadoop Terminal Commands
▪ Hadoop 2.x Configuration Files
▪ Hadoop Daemons
▪ Hadoop Web UI Parts
▪ Data Loading Techniques



Objectives
At the end of this module, you will be able to:

▪ Analyze Hadoop 2.x Cluster Architecture – Federation

▪ Analyze Hadoop 2.x Cluster Architecture – High Availability

▪ Run Hadoop in Different Cluster Modes

▪ Run Basic Hadoop Commands on Terminal

▪ Prepare Hadoop 2.x Configuration Files and Analyze the Parameters in Them

▪ Analyze Dump of a MapReduce Program

▪ Implement Different Data Loading Techniques



Let’s Revise
▪ Hadoop Core Components

▪ HDFS Architecture

▪ What is HDFS?

▪ Hadoop vs. Traditional Systems

▪ NameNode and Secondary NameNode

[Diagram: a Hadoop cluster with a YARN layer (ResourceManager plus NodeManagers) and an HDFS layer (NameNode plus DataNodes)]


Pre-Class Questions



Annie’s Question
The default replication factor is:
a. 2
b. 4
c. 5
d. 3



Annie’s Answer
Ans. Option d.
If you move a file to HDFS, then by default 3 copies of the file
will be stored on different DataNodes.



Annie’s Question
In a multi-node cluster, every slave node has two daemons running on
it: DataNode and NodeManager.
a. TRUE
b. FALSE



Annie’s Answer
Ans. TRUE
DataNode service for HDFS and NodeManager for processing.



Annie’s Question
A block is replicated on 4 nodes K, L, M and N. If M, K and N fail, a
client can still read the data.
a. TRUE
b. FALSE



Annie’s Answer
Ans. TRUE.
As the remaining node ‘L’ will contain the block in question.



Typical Hadoop Cluster Configuration
Secondary NameNode (optional)
RAM: 64 GB, Hard disk: 1 TB, Processor: Xeon with 4 cores, Ethernet: 3 x 10 GB/s, OS: 64-bit CentOS, Power: Redundant Power Supply

Active NameNode
RAM: 32 GB, Hard disk: 1 TB, Processor: Xeon with 8 cores, Ethernet: 3 x 10 GB/s, OS: 64-bit CentOS, Power: Redundant Power Supply

StandBy NameNode
RAM: 128 GB, Hard disk: 1 TB, Processor: Xeon with 8 cores, Ethernet: 3 x 10 GB/s, OS: 64-bit CentOS, Power: Redundant Power Supply

DataNodes (six shown, identical configuration)
RAM: 16 GB, Hard disk: 6 x 2 TB, Processor: Xeon with 2 cores, Ethernet: 3 x 10 GB/s, OS: 64-bit CentOS



Hadoop 2.x Cluster Architecture



Hadoop 2.x Cluster Architecture

Master
▪ NameNode – http://master:50070/
▪ ResourceManager – http://master:8088

Slaves (Slave01 – Slave05), each running:
▪ DataNode
▪ NodeManager


Hadoop 2.x Cluster Architecture (Contd.)
[Diagram: a client talking to both layers of the cluster – the HDFS layer (one NameNode with many DataNodes) and the YARN layer (one ResourceManager with a NodeManager co-located on every DataNode)]


Hadoop 2.x Cluster Architecture - Federation
Hadoop 1.0
▪ A single NameNode handles both the namespace and block management.
▪ DataNodes provide the underlying storage.

Hadoop 2.0
▪ Multiple independent NameNodes, each managing a part of the namespace via a ViewFS map:
  /Financial => NN1
  /HR => NN2
  /Health Care => NN3
▪ Block storage: all NameNodes share the same pool of DataNodes.


Annie’s Question
How does HDFS Federation help HDFS scale horizontally?
a. Reduces the load on any single NameNode by using multiple, independent
NameNodes to manage individual parts of the file system namespace.
b. Provides cross-data centre (non-local) support for HDFS, allowing a cluster
administrator to split the Block Storage outside the local cluster.



Annie’s Answer
Ans. Option (a)
In order to scale the name service horizontally, HDFS Federation uses
multiple independent NameNodes. The NameNodes are federated, that
is, they are independent and do not require coordination with each
other.



Annie’s Question
You have configured two NameNodes to manage /marketing and
/finance respectively. What will happen if you try to put a file into
the /accounting directory?



Annie’s Answer

Ans. The put will fail. Neither namespace manages that file, so you
will get an IOException with a "No such file or directory" error.



Hadoop 2.x – High Availability



Hadoop 2.x – High Availability
HDFS HIGH AVAILABILITY

▪ All namespace edits are logged to shared NFS storage; only a single writer is allowed at any point in time (fencing).
▪ The Standby NameNode reads the shared edit logs and applies them to its own namespace, staying in sync with the Active NameNode.
▪ It is not necessary to configure a Secondary NameNode when High Availability is enabled.

[Diagram: a client, the Active and Standby NameNodes sharing edit logs over NFS, and a common set of DataNodes]


Hadoop 2.x – Resource Management
[Diagram: the HDFS High Availability layout from the previous slide (shared edit logs, Active and Standby NameNodes, DataNodes) shown side by side with YARN – a ResourceManager running Next Generation MapReduce, and NodeManagers hosting Containers and Application Masters on the DataNode machines]


Hadoop 2.x – Resource Management (Contd.)
YARN – Yet Another Resource Negotiator

Masters
▪ Resource Manager, made up of a Scheduler and an Applications Manager (AsM)

Slaves
▪ Node Managers, each hosting Containers and an Application Master, running alongside the DataNodes


Annie’s Question
HDFS HA was developed to overcome which of the following
disadvantages of Hadoop 1.0?
a. Single Point of Failure of NameNode
b. Only one version can be run in classic MapReduce
c. Too much burden on Job Tracker



Annie’s Answer

Ans. Single Point of Failure of NameNode



Hadoop Cluster: Facebook
Facebook

▪ We use Hadoop to store copies of internal log and dimension data sources and use
it as a source for reporting/analytics and machine learning.

▪ Currently we have 2 major clusters:


❑ An 1100-machine cluster with 8800 cores and about 12 PB raw storage.
❑ A 300-machine cluster with 2400 cores and about 3 PB raw storage.
❑ Each (commodity) node has 8 cores and 12 TB of storage.
❑ We are heavy users of both streaming as well as the Java APIs. We have built a
higher-level data warehousing framework using these features called Hive (see
http://Hadoop.apache.org/hive/). We have also developed a FUSE implementation
over HDFS.



Hadoop Cluster Modes



Hadoop Cluster Modes
Hadoop can run in any of the three modes:

Standalone (or Local) Mode

• No daemons; everything runs in a single JVM.

• Suitable for running MapReduce programs during development.

• Has no DFS.

Pseudo-Distributed Mode

• Hadoop daemons run on the local machine.

Fully-Distributed Mode

• Hadoop daemons run on a cluster of machines.



Terminal Commands

command: hdfs <args>



Terminal Commands

command: hadoop <args>



Hadoop FS Shell Commands

▪ HDFS organizes data in files and directories

▪ Hadoop provides a command line interface called the FS
shell, using which a user can interact directly with
HDFS

▪ The syntax of the Hadoop commands is similar to bash

▪ Command: hdfs dfs <args>
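
A minimal FS shell session looks like this (a sketch; the directory and file names are hypothetical):

hdfs dfs -mkdir -p /user/edureka/demo        # create a directory in HDFS
hdfs dfs -put weather.txt /user/edureka/demo # copy a local file into it
hdfs dfs -cat /user/edureka/demo/weather.txt # print the file back from HDFS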



Terminal Commands
Listing of files present on HDFS

Listing of files present in Hadoop Directory
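
For example (the paths are hypothetical and will differ on your setup):

hdfs dfs -ls /user/edureka                 # files present on HDFS
ls /opt/cloudera/parcels/CDH/lib/hadoop    # files present in the local Hadoop directory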



Hadoop 2.x Configuration Files



Hadoop 2.x Configuration Files

Configuration Filename: Description

hadoop-env.sh: Environment variables that are used in the scripts to run Hadoop.

core-site.xml: Configuration settings for Hadoop Core, such as I/O settings that are common to HDFS and MapReduce.

hdfs-site.xml: Configuration settings for the HDFS daemons: the NameNode, the Secondary NameNode and the DataNodes.

mapred-site.xml: Configuration settings for MapReduce applications.

yarn-site.xml: Configuration settings for the ResourceManager and NodeManager.

masters: A list of machines (one per line) that each run a Secondary NameNode.

slaves: A list of machines (one per line) that each run a DataNode and a NodeManager.


Hadoop 2.x Configuration Files



Hadoop 2.x Configuration Files
Core: core-site.xml

HDFS: hdfs-site.xml

YARN: yarn-site.xml

MapReduce: mapred-site.xml



core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- core-site.xml -->
<configuration>
  <!-- fs.defaultFS: the name of the default file system. The URL's
       authority is used to determine the host, port, etc. for a filesystem. -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://nameservice1</value>
  </property>
</configuration>



hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- hdfs-site.xml -->
<configuration>
  <!-- dfs.replication: the number of replicas kept for each block
       in HDFS (here, the specified value is 3). -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- dfs.blocksize: the size of data blocks in HDFS, in bytes
       (here, 134217728 bytes = 128 MB). -->
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
</configuration>
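
You can check the values a running cluster actually resolves from the terminal; hdfs getconf reads the effective configuration:

hdfs getconf -confKey dfs.replication
hdfs getconf -confKey dfs.blocksize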



mapred-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- mapred-site.xml -->
<configuration>
  <!-- mapreduce.framework.name: the runtime framework for executing
       MapReduce jobs. Can be set to local, classic or yarn. -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>



yarn-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- yarn-site.xml -->
<configuration>
  <!-- yarn.resourcemanager.ha.enabled: enables the HA feature in YARN. -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
</configuration>



All Properties
1. https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-common/core-default.xml

2. https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

3. https://hadoop.apache.org/docs/r2.8.5/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

4. https://hadoop.apache.org/docs/r2.8.5/hadoop-yarn/hadoop-yarn-common/yarn-default.xml



Slaves and Masters
Two files are used by the startup and shutdown commands:

Slaves

▪ Contains a list of hosts, one per line, that are to host DataNode
and NodeManager services.

Masters

▪ Contains a list of hosts, one per line, that are to host Secondary
NameNode servers.
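
For example, on a small cluster the two files might look like this (the hostnames are hypothetical):

# slaves
slave01
slave02
slave03

# masters
master02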



Per-Process Run Time Environment
▪ hadoop-env.sh sets the per-process run time environment, e.g. the parameter JAVA_HOME for the JVM.
▪ This file also offers a way to provide custom parameters for each of the servers.
▪ hadoop-env.sh is sourced by all of the Hadoop Core scripts provided inside the hadoop directory:
/opt/cloudera/parcels/CDH/lib/hadoop/etc/hadoop
▪ Examples of environment variables that you can specify:
▪ export HADOOP_HEAPSIZE="512"
▪ export HADOOP_DATANODE_HEAPSIZE="128"


Hadoop Daemons



Hadoop Daemons - Status
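
Once the daemons are up, a quick way to check their status is jps, which lists the running Java processes (the PIDs below are illustrative):

jps
# 2850 NameNode
# 2983 DataNode
# 3310 ResourceManager
# 3421 NodeManager
# 3598 JobHistoryServer
# 3721 Jps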



Hadoop Daemons
NameNode

▪ Runs on the master node of the Hadoop Distributed File System (HDFS)

▪ Directs DataNodes to perform their low-level I/O tasks

DataNode

▪ Runs on each slave machine in the HDFS

▪ Does the low-level I/O work

ResourceManager

▪ Runs on the master node of the data processing system (MapReduce)

▪ Global resource scheduler


Hadoop Daemons
NodeManager

▪ Runs on each slave node of the data processing system

▪ Platform for the data processing tasks

JobHistoryServer

▪ Typically runs on a master node

▪ Maintains information about completed MapReduce jobs (their logs and metrics)



Hadoop Web UI Parts
▪ NameNode Web UI: runs on the master nodes (NameNode and any back-up NameNodes); default port 50070, http. Web UI to look at the current status of HDFS and explore the file system.

▪ DataNode: runs on all slave nodes; default port 50075, http. DataNode Web UI to access the status, logs, etc.

▪ ResourceManager Web UI: runs on the cluster-level resource manager; default port 8088, http. Web UI for the ResourceManager and for application submissions.

▪ NodeManager: monitors resources on its node; default port 8042, TCP. DataNode information, list of applications and list of containers.

▪ MapReduce JobHistory Server: gets status on finished applications; default port 19888, TCP. Provides logs of important events in MapReduce job execution and associated profiling metrics.


Web UI URLs
▪ NameNode Status: http://bdlabs.edureka.co:50011

▪ ResourceManager Status: http://bdlabs.edureka.co:50012

▪ MapReduce JobHistoryServer Status: http://bdlabs.edureka.co:50013



Annie’s Question
Which of the following files is used to specify the NameNode's heap
size?
a. bashrc
b. hadoop-env.sh
c. hdfs-site.sh
d. core-site.xml



Annie’s Answer
Ans. hadoop-env.sh.
This file specifies environment variables that affect the JDK
used by the Hadoop daemons (bin/hadoop)



Annie’s Question
It is necessary to define all the properties in core-site.xml,
hdfs-site.xml, yarn-site.xml & mapred-site.xml.
a. TRUE
b. FALSE



Annie’s Answer

Ans. False.
Detailed answer will be given after the next question.



Annie’s Question
Standalone Mode uses the default configuration.
a. TRUE
b. FALSE



Annie’s Answer
Ans. True.
In Standalone mode, Hadoop runs with the default configuration
(empty configuration files, i.e. no configuration settings in core-
site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml). If
properties are not defined in the configuration files, Hadoop runs
with default values for the corresponding properties.



Sample Example List
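
The examples ship with Hadoop in a single jar; running the jar without arguments prints the list of available example programs (the jar path below is a typical CDH location and may differ on your install):

hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar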



Running the WordCount Example
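
A minimal run, assuming the examples jar from the previous slide and hypothetical HDFS paths (the output directory must not already exist):

hdfs dfs -mkdir -p /user/edureka/wordcount/input
hdfs dfs -put weather.txt /user/edureka/wordcount/input
hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    wordcount /user/edureka/wordcount/input /user/edureka/wordcount/output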



Checking the Output
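
To inspect the result (same hypothetical paths as on the previous slide):

hdfs dfs -ls /user/edureka/wordcount/output    # _SUCCESS marker plus part files
hdfs dfs -cat /user/edureka/wordcount/output/part-r-00000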



Annie’s Question

The output of a MR job will be stored on HDFS:


a. TRUE
b. FALSE



Annie’s Answer

Ans. True
It is stored in different part files, e.g. part-m-00000, part-m-00001
and so on (part-r-00000, part-r-00001, etc. for jobs with a reduce phase).



Annie’s Question
To run MR job, the data should be present on HDFS:
a. TRUE
b. FALSE



Annie’s Answer
Ans. True
In order to process data in parallel, it must be present on HDFS so
that MR can work on separate chunks of it at the same time.



Data Loading Techniques



Data Loading Techniques and Data Analysis
Data Analysis (on top of HDFS)

▪ Using Pig
▪ Using HIVE

Data Loading (into HDFS)

▪ Using Flume
▪ Using Sqoop
▪ Using Hadoop Copy Commands


Hadoop Copy Commands
put: Copies file(s) from the local file system to the destination file system. It can also read from stdin and write to the
destination file system.

hadoop dfs -put weather.txt hdfs://<target Namenode>

copyFromLocal: Similar to the "put" command, except that the source is restricted to a local file reference.

hadoop dfs -copyFromLocal weather.txt hdfs://<target Namenode>

distcp: Distributed copy to move data between clusters; used for backup and recovery.

hadoop distcp hdfs://<source NN> hdfs://<target NN>


Demo on Copy Commands
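
A sketch of the demo flow (the file name and NameNode URIs are hypothetical):

hadoop dfs -put weather.txt hdfs://nameservice1/user/edureka/   # copy a local file to HDFS
hadoop dfs -ls hdfs://nameservice1/user/edureka/                # verify it arrived
hadoop distcp hdfs://nameservice1/user/edureka/weather.txt hdfs://backup-cluster/user/edureka/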



Data Loading Using Flume
Flume is a distributed, reliable, and available service for efficiently collecting,
aggregating, and moving large amounts of streaming event data.
Demo will be covered in
Module 10

[Diagram: Twitter Streaming API → Flume agent (Twitter Source → Memory Channel → HDFS Sink) → HDFS]
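
An agent of that shape is launched with the flume-ng command (a sketch; the agent name and config file are hypothetical, and the demo itself is covered in Module 10):

flume-ng agent --conf ./conf --conf-file twitter.conf --name TwitterAgent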



Data Loading Using Sqoop
Apache Sqoop (TM) is a tool designed for efficiently transferring bulk data
between Apache Hadoop and structured data stores such as relational
databases. Demo will be covered in
Module 10

▪ Imports individual tables or entire databases to HDFS.


▪ Generates Java classes to allow you to interact with your imported data.
▪ Provides the ability to import from SQL databases straight into your Hive data
warehouse.
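
A typical single-table import might look like this (a sketch; the JDBC URL, credentials, table and target directory are hypothetical):

sqoop import --connect jdbc:mysql://dbserver/sales --username edureka -P \
    --table orders --target-dir /user/edureka/orders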



Annie’s Question
Your website is hosting a group of more than 300 sub-websites. You want
analytics on the shopping patterns of different visitors. What is the
best way to collect that information from the weblogs?
a. SQOOP
b. FLUME



Annie’s Answer

Ans. FLUME.



Annie’s Question
You want to join data collected from two sources. One source, a big
database of call records, is already available in HDFS. The other
source of data is available in a database table. The best way to
move that data into HDFS is:
a. SQOOP import
b. PIG script
c. Hive Query



Annie’s Answer

Ans. SQOOP import.



Assignment
# Go through the Edureka Cloud Lab and explore it

# Check the working condition of the Hadoop ecosystem in Edureka's Cloud Lab



Further Reading
▪ Hadoop Cluster Setup

http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/ClusterSetup.html

▪ Hadoop on Amazon AWS ec2

http://www.edureka.in/blog/install-apache-hadoop-cluster/

▪ Hadoop Hardware Selection

http://blog.cloudera.com/blog/2013/08/how-to-select-the-right-hardware-for-your-new-hadoop-cluster/

http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-Win-1.3.0/bk_cluster-planning-guide/content/ch_hardware-recommendations.html

▪ Hadoop Cluster Configuration

http://www.edureka.in/blog/hadoop-cluster-configuration-files/



Further Reading
▪ MapReduce Job execution

http://www.edureka.in/blog/anatomy-of-a-mapreduce-job-in-apache-hadoop/

▪ Add/Remove Nodes in a Cluster

http://www.edureka.in/blog/commissioning-and-decommissioning-nodes-in-a-hadoop-cluster/

▪ Secondary Namenode

https://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Secondary_NameNode



Pre-work for next class
Refresh your Java skills using the Java Essential for Hadoop Tutorial.

Review the interview questions for setting up a Hadoop cluster:

http://www.edureka.in/blog/hadoop-interview-questions-hadoop-cluster/



Agenda for Next Class
▪ Use Cases of MapReduce
▪ Traditional vs MapReduce Way
▪ Hadoop 2.x MapReduce Components and Architecture
▪ YARN Execution Flow
▪ MapReduce Concepts



Copyright © edureka and/or its affiliates. All rights reserved.
