Oozie Basic Exercise


1. Create a working folder in /home/hadoop


mkdir /home/hadoop/basic-oozie-exercise

2. Extract the MapReduce examples jar


/opt/hadoop/share/hadoop/mapreduce/....
jar -xvf <filename.jar>

3. Find the files associated with the wordcount sample
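
One way to locate the WordCount classes, assuming the examples jar sits under /opt/hadoop/share/hadoop/mapreduce/, is to list the jar contents and filter for the sample name:

jar tf /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-<version>.jar | grep -i wordcount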

4. Create a job.properties file


nameNode=hdfs://master:9000
jobTracker=master:8050
queueName=default
examplesRoot=examplesoozie
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/map-reduce
outputDir=map-reduce
oozie.libpath=${nameNode}/user/${user.name}/share/lib
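
With these values, and assuming the job is submitted as the hadoop user, the application path resolves to the HDFS directory created in step 6:

hdfs://master:9000/user/hadoop/examplesoozie/map-reduce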

5. Create a workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.1" name="map-reduce-wf">
    <start to="mr-node"/>
    <action name="mr-node">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.mapper.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.reducer.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>mapreduce.map.class</name>
                    <value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>
                </property>
                <property>
                    <name>mapreduce.reduce.class</name>
                    <value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
                </property>
                <property>
                    <name>mapreduce.combine.class</name>
                    <value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
                </property>
                <property>
                    <name>mapred.output.key.class</name>
                    <value>org.apache.hadoop.io.Text</value>
                </property>
                <property>
                    <name>mapred.output.value.class</name>
                    <value>org.apache.hadoop.io.IntWritable</value>
                </property>
                <property>
                    <name>mapred.input.dir</name>
                    <value>/user/${wf:user()}/${examplesRoot}/input-data/text</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
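
Before uploading, the workflow definition can optionally be checked against the Oozie workflow schema with the client's validate command (assuming the oozie client is already on the PATH, as set up in item g further down):

oozie validate workflow.xml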

6. Create a directory on HDFS


hdfs dfs -mkdir -p /user/hadoop/examplesoozie/map-reduce
hdfs dfs -copyFromLocal workflow.xml /user/hadoop/examplesoozie/map-reduce/workflow.xml
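
The upload can be verified with a directory listing, for example:

hdfs dfs -ls /user/hadoop/examplesoozie/map-reduce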

7. Create a folder named lib in which the required library / jar files are kept.
hdfs dfs -mkdir -p /user/hadoop/examplesoozie/map-reduce/lib

8. Copy the Hadoop MapReduce examples jar under this directory.


hdfs dfs -copyFromLocal hadoop-mapreduce-examples-<version>.jar /user/hadoop/examplesoozie/map-reduce/lib/hadoop-mapreduce-examples-<version>.jar

a. HDFS folder created for the program


b. lib folder inside the program folder containing the jar
c. paths adjusted correctly in workflow.xml
d. workflow file uploaded
e. data folder created on HDFS (see the example before step 9)
f. data file uploaded to HDFS (see the example before step 9)
g. export OOZIE_HOME and PATH in ~/.profile and source the .profile (see the sketch just below)
. ~/.profile
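
The entries added in item g might look like the following sketch, assuming Oozie is installed under /opt/oozie-4.2.0 (the path used in the next block):

export OOZIE_HOME=/opt/oozie-4.2.0
export PATH=$PATH:$OOZIE_HOME/bin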

sudo mkdir -p /user/hadoop


cd /user/hadoop
sudo tar -xvzf /opt/oozie-4.2.0/oozie-sharelib-4.2.0.tar.gz
sudo chown -R hadoop:hadoop /user/hadoop
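
The workflow reads its input from /user/${wf:user()}/${examplesRoot}/input-data/text, so the data folder and data file from items e and f can be created and uploaded roughly as follows (data.txt stands in for any local text file):

hdfs dfs -mkdir -p /user/hadoop/examplesoozie/input-data/text
hdfs dfs -copyFromLocal data.txt /user/hadoop/examplesoozie/input-data/text/data.txt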

9. Run the Hadoop MapReduce WordCount program


oozie job -oozie http://localhost:11000/oozie -config job.properties -run
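
Optionally, the submission can first be checked with a dry run, which validates the job against the server without executing it:

oozie job -oozie http://localhost:11000/oozie -config job.properties -dryrun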

10. View the status of the job


oozie job -oozie http://localhost:11000/oozie -info <jobid>
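
The job can also be followed in the Oozie web console at http://localhost:11000/oozie, and its logs can be pulled from the command line:

oozie job -oozie http://localhost:11000/oozie -log <jobid>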

11. Review the output in the directory as specified by workflow.xml


hdfs dfs -cat /user/hadoop/examplesoozie/output-data/map-reduce/part-r-00000

If ports 8030 (YARN ResourceManager scheduler) and 10020 (MapReduce JobHistory server) on the master are not directly reachable, they can be forwarded over SSH:

ssh -L 0.0.0.0:8030:master:8030 master
ssh -L 0.0.0.0:10020:master:10020 master
