Fully Distributed Node Hadoop Cluster

Fully-distributed node cluster:


1. Edit the /etc/hosts file:
192.168.0.51    nn.cluster.com     nn
192.168.0.52    jt.cluster.com     jt
192.168.0.53    snn.cluster.com    snn
192.168.0.54    dn1.cluster.com    dn1
192.168.0.55    dn2.cluster.com    dn2
192.168.0.56    dn3.cluster.com    dn3
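
Before moving on, it can help to confirm that every expected name is actually present in the hosts file. The following is a minimal sketch and not part of the original notes: `missing_hosts` is a hypothetical helper that prints each name that does not appear as a hostname or alias field in a hosts-format file. It ignores comment lines but does not handle trailing comments on an entry.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every name given after the first argument
# that does not occur as a hostname/alias field in the hosts file.
missing_hosts() {
    local hosts_file=$1; shift
    local name
    for name in "$@"; do
        # Field 1 is the IP address, so only fields 2..NF are compared.
        if ! awk -v n="$name" '
                /^[[:space:]]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit (found ? 0 : 1) }' "$hosts_file"; then
            echo "$name"
        fi
    done
}

# Example (prints nothing when all six short names are present):
# missing_hosts /etc/hosts nn jt snn dn1 dn2 dn3
```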

[hadoop@nn1 ~]$ vim hadoop/conf/core-site.xml


<configuration>

<property>
<name>fs.default.name</name>
<value>hdfs://192.168.0.51:8020</value>
</property>

</configuration>
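
To double-check what a site file sets without opening it in an editor, a small lookup function can be handy. This is a sketch under stated assumptions: `get_prop` is a hypothetical helper, and it relies on `<name>` and `<value>` each sitting on their own line, as in the listings in this document. It is not a general XML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the <value> of the named property from a
# Hadoop *-site.xml file laid out one element per line.
get_prop() {
    local file=$1 name=$2
    awk -v n="$name" '
        index($0, "<name>" n "</name>") { hit = 1; next }   # found the property
        hit && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, "")    # strip the tags
            print; exit
        }
        /<\/property>/ { hit = 0 }                          # property had no value
    ' "$file"
}

# Example:
# get_prop hadoop/conf/core-site.xml fs.default.name
```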

[hadoop@nn1 ~]$ vim hadoop/conf/hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/data/nn</value>
</property>

<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/data/dn</value>
</property>

<property>
<name>fs.checkpoint.dir</name>
<value>/home/hadoop/data/snn</value>
</property>

<property>
<name>dfs.replication</name>
<value>3</value>
</property>

<property>
<name>dfs.block.size</name>
<value>134217728</value>
</property>
</configuration>
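
The block size above, 134217728 bytes, is 128 MiB, and with a replication factor of 3 every byte written is stored three times across the DataNodes. The arithmetic can be sketched as follows; the function names and the 1 GiB example file are illustrative assumptions, not part of the original setup.

```shell
#!/usr/bin/env bash
BLOCK_SIZE=134217728   # dfs.block.size from hdfs-site.xml (128 MiB)
REPLICATION=3          # dfs.replication from hdfs-site.xml

# Number of HDFS blocks needed for a file of the given size in bytes
# (ceiling division; the last block may be only partly filled).
blocks_for() {
    local bytes=$1
    echo $(( (bytes + BLOCK_SIZE - 1) / BLOCK_SIZE ))
}

# Raw bytes stored across the DataNodes once every block is replicated.
# HDFS does not pad the final block, so this is size * replication.
raw_bytes_for() {
    local bytes=$1
    echo $(( bytes * REPLICATION ))
}

file_size=$(( 1024 * 1024 * 1024 ))   # hypothetical 1 GiB file
echo "blocks: $(blocks_for "$file_size")"
echo "raw bytes: $(raw_bytes_for "$file_size")"
```

For the 1 GiB example this reports 8 blocks and 3221225472 raw bytes, which is the kind of estimate needed when sizing the data directories.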

[hadoop@nn1 ~]$ vim hadoop/conf/mapred-site.xml


<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>192.168.0.52:8021</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>3</value>
</property>

<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>3</value>
</property>

<property>
<name>mapred.local.dir</name>
<value>/home/hadoop/data/mapred/local</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/home/hadoop/data/mapred/system</value>
</property>
</configuration>
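
The two `tasks.maximum` settings are per TaskTracker, so the cluster-wide slot count is that value multiplied by the number of hosts in the slaves file. A minimal sketch of the calculation, with `count_slaves` and `total_slots` as hypothetical helper names:

```shell
#!/usr/bin/env bash
# Count usable entries in a slaves file (blank lines and # comments ignored).
count_slaves() {
    grep -cEv '^[[:space:]]*(#|$)' "$1"
}

# Cluster-wide slots = number of TaskTracker hosts * slots per host.
total_slots() {
    local slaves_file=$1 per_node=$2
    echo $(( $(count_slaves "$slaves_file") * per_node ))
}

# Example, using the value 3 configured above:
# total_slots hadoop/conf/slaves 3
```

With four hosts in the slaves file and 3 slots each, this gives 12 map slots and 12 reduce slots.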

[hadoop@nn1 ~]$ vim hadoop/conf/masters
192.168.0.53

[hadoop@nn1 ~]$ vim hadoop/conf/slaves


192.168.0.54
192.168.0.55
192.168.0.56
192.168.0.57
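
In Hadoop 1.x, start-dfs.sh starts the NameNode on the machine where it is run, a DataNode on every host in the slaves file, and a SecondaryNameNode on every host in the masters file. The sketch below prints that plan from a conf directory without starting anything; `list_hosts` and `dfs_plan` are hypothetical helpers written for illustration.

```shell
#!/usr/bin/env bash
# Print the hosts in a masters/slaves-style file, skipping blanks and comments.
list_hosts() {
    local host
    while read -r host || [ -n "$host" ]; do
        case $host in ''|'#'*) continue ;; esac
        echo "$host"
    done < "$1"
}

# Hypothetical helper: show which HDFS daemon would be launched where.
dfs_plan() {
    local conf_dir=$1 host
    echo "localhost: namenode"
    for host in $(list_hosts "$conf_dir/slaves"); do
        echo "$host: datanode"
    done
    for host in $(list_hosts "$conf_dir/masters"); do
        echo "$host: secondarynamenode"
    done
}

# Example:
# dfs_plan hadoop/conf
```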

ON NAMENODE:
------------
[hadoop@nn ~]$ start-dfs.sh
starting namenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-namenode-nn.cluster.com.out
192.168.0.57: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-dn4.cluster.com.out
192.168.0.55: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-dn2.cluster.com.out
192.168.0.56: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-dn3.cluster.com.out
192.168.0.54: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-dn1.cluster.com.out
192.168.0.53: starting secondarynamenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-snn.cluster.com.out
[hadoop@nn ~]$

ON JOBTRACKER:
--------------
[hadoop@nn ~]$ ssh jt
Last login: Sun Jul 12 02:16:12 2015 from 192.168.0.51
[hadoop@jt ~]$ start-mapred.sh
starting jobtracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-jobtracker-jt.cluster.com.out
192.168.0.55: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-dn2.cluster.com.out
192.168.0.54: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-dn1.cluster.com.out
192.168.0.57: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-dn4.cluster.com.out
192.168.0.56: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-dn3.cluster.com.out

ON NAMENODE:
------------
[hadoop@nn ~]$ for i in {1..7};do ssh 192.168.0.5$i "hostname;jdk/bin/jps;echo -e '\n'";done
nn.cluster.com
2207 Jps
1632 NameNode

jt.cluster.com
1547 JobTracker
1918 Jps

snn.cluster.com
1658 Jps
1546 SecondaryNameNode

dn1.cluster.com
1406 DataNode
1493 TaskTracker
2545 Jps

dn2.cluster.com
2117 Jps
1614 TaskTracker
1532 DataNode

dn3.cluster.com
1509 DataNode
2141 Jps
1584 TaskTracker

dn4.cluster.com
1593 TaskTracker
2085 Jps
1512 DataNode

[hadoop@nn ~]$
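
Reading the jps listing by eye gets tedious as the cluster grows. As a rough sketch, the output of the loop above can be saved to a file and checked per host and daemon. `has_daemon` and the file name `report.txt` are assumptions made for this example; the parser expects the format shown above, a hostname line followed by one "PID Name" line per JVM.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the saved report shows DAEMON running on HOST.
has_daemon() {
    local report=$1 host=$2 daemon=$3
    awk -v h="$host" -v d="$daemon" '
        NF == 1 { current = $1; next }                 # hostname line
        NF == 2 && current == h && $2 == d { found = 1 }   # "PID Name" line
        END { exit (found ? 0 : 1) }' "$report"
}

# Example:
# for i in {1..7}; do ssh 192.168.0.5$i "hostname;jdk/bin/jps;echo -e '\n'"; done > report.txt
# has_daemon report.txt nn.cluster.com NameNode || echo "NameNode is not running"
# has_daemon report.txt dn1.cluster.com DataNode || echo "dn1 has no DataNode"
```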

==============================================================
HADOOP ADMINISTRATOR - ROLES AND RESPONSIBILITIES

1. Plan the Hadoop cluster
      small cluster
      medium cluster
      large cluster
      jobs are IO bound
      jobs are CPU bound
      build the cluster based upon the storage capacity and how your data is growing
   a. Role assignments
      which node will be datanode/namenode/hbase master/hive metastore server/hue server/client nodes/standby namenode/resourcemanager
   b. Default tuning

2. Install the cluster
   a. How the cluster is designed
   b. Bandwidth management
   c. Balancing of your datanodes

3. Management of the cluster
   a. Shell scripts to balance the datanodes
   b. Housekeeping
   c. Make the cluster HA and fault tolerant
   d. Commissioning and decommissioning of datanodes

4. Monitoring
   a. CPUs
   b. Memory
   c. Processes
   d. Hadoop and its ecosystem components

5. Upgrades of Hadoop and its ecosystem components
6. Helping the developers to run their jobs
7. Performance tuning
8. Security and user management