VNR VJIET
Name of the Experiment: HDFS shell commands
Roll no: 17071A0541

Experiment 1
HDFS commands are used to access the Hadoop Distributed File System (HDFS). HDFS is a sub-project of the Apache Hadoop project; this Apache Software Foundation project provides a fault-tolerant file system designed to run on commodity hardware. HDFS is accessed through a set of shell commands.
Hadoop version: Hadoop 2.6.0-cdh5.10.0
The ls command lists all the available files and subdirectories under the default directory. For instance, in our example the default directory for the Cloudera VM is /user/cloudera
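For example, the default directory can be listed with:

```shell
hadoop fs -ls
```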
Q4: Return all the directories under root directory
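A minimal sketch of the command for Q4, listing everything directly under the root directory:

```shell
hadoop fs -ls /
```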
copyFromLocal
HDFS Command to copy the file from a Local file system to HDFS.
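For instance (the file name sample.txt is illustrative):

```shell
hadoop fs -copyFromLocal sample.txt /user/cloudera/
```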
Q7: Check the contents of the file that you copied to HDFS.
cat
HDFS Command that reads a file on HDFS and prints the content of that file to the
standard output.
put
HDFS Command to copy single or multiple sources from the local file system to the destination file system.
copyToLocal
HDFS Command to copy the file from HDFS to Local File System.
[cloudera@quickstart Desktop]$ ls
Q10) Check the health of the Hadoop file system.
fsck
hdfs fsck /
Q12) Display the contents of a file inside a directory present in HDFS
touchz
HDFS Command to create a file of zero length.
du
HDFS Command to display the sizes of files and directories contained in the given directory, or the size of a file.
cat
Usage: hadoop fs -cat /path/to/file_in_hdfs
Q16) Count the number of directories and files inside a directory in HDFS?
count
HDFS Command to count the number of directories, files, and bytes under the paths
that match the specified file pattern.
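For example, to count directories, files, and bytes under the home directory:

```shell
hadoop fs -count /user/cloudera
```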
rm
HDFS Command to remove a file from HDFS.
Deleted empty.txt
rm -r
HDFS Command to remove the entire directory and all of its content from HDFS.
cp
HDFS Command to copy files from source to destination. This command allows
multiple sources as well, in which case the destination must be a directory.
mv
HDFS Command to move files from source to destination. This command allows
multiple sources as well, in which case the destination needs to be a directory.
usage
HDFS Command that gives all the options that can be used with a particular HDFS command.
Q22) Find the help for a given or all commands
help
HDFS Command that displays help for a given command, or for all commands if none is specified.
Cluster Balancing
The balancer redistributes data blocks evenly across the DataNodes of the cluster. Type the command:
hadoop balancer
Q26) Empty the trash in HDFS
expunge: Empties the trash. When you delete a file, it isn’t removed immediately
from HDFS, but is renamed to a file in the /trash directory. As long as the file
remains there, you can undelete it if you change your mind, though only the latest
copy of the deleted file can be restored.
tail
This hadoop command will show the last kilobyte of the file to stdout.
28) Append the contents of a file present in local to a file present in HDFS
appendToFile: Append a single source, or multiple sources, from the local file system to the destination file system.
29. getmerge
Takes a source directory and a destination file as input and concatenates files in src
into the destination local file. Optionally -nl can be set to enable adding a newline
character at the end of each file.
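A sketch of getmerge (the source directory and local file names are assumptions):

```shell
hadoop fs -getmerge -nl /user/cloudera/somedir merged.txt
```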
Experiment 2: Apache Pig commands
Pig is an open-source, high-level data flow system. It provides a simple language called Pig Latin for queries and data manipulation, which are then compiled into MapReduce jobs that run on Hadoop.
Q2: Create two data sets using the gedit command on the local file system.
1,2,3
4,5,6
7,8,9
1,2,3
4,5,6
7,8,9
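Before loading them in Pig, the two files can be copied to HDFS; a sketch assuming the files are named pigfile.txt and pigfile1.txt (the second name matches the LOAD shown later):

```shell
hadoop fs -put pigfile.txt /user/cloudera/
hadoop fs -put pigfile1.txt /user/cloudera/
```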
grunt> dump a;
OUTPUT
(1,2,3)
(4,5,6)
(7,8,9)
grunt> b = LOAD '/user/cloudera/pigfile1.txt' using PigStorage(',');
grunt> dump b;
NOTE: Specifying a schema is optional; Pig is flexible about it. Columns can be referred to positionally as $0, $1, and so on, but a schema can still be specified explicitly if desired.
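A sketch of loading with an explicit schema (the column names c1, c2, c3 are illustrative):

```pig
a = LOAD '/user/cloudera/pigfile.txt' USING PigStorage(',') AS (c1:int, c2:int, c3:int);
```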
grunt> dump a;
grunt> dump b;
Q5: Check the schema of the two tables?
grunt> describe a;
OUTPUT
grunt> describe b;
OUTPUT
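The relation c dumped in the next step is not defined in this transcript; presumably it combines the two relations, for example:

```pig
-- combine relations a and b into c (an assumption; the defining statement is missing)
c = UNION a, b;
```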
grunt> dump c;
Q7: Split the data set c into two different relations, e.g. d and e. For example, one data set where $0 has the value 1 and another data set where $0 has the value 4.
>dump d;
> dump e;
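A sketch of the split for Q7 (assuming $0 is an int, e.g. loaded with a schema):

```pig
SPLIT c INTO d IF $0 == 1, e IF $0 == 4;
```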
Q8: Do filtering on data set c where $1 is greater than 3?
grunt> dump f;
grunt> dump g;
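A sketch of the filter for Q8 (assuming $1 is numeric):

```pig
f = FILTER c BY $1 > 3;
```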
Q12: Check the file written in HDFS.
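Assuming the previous step stored a relation into an HDFS output directory (the path is an assumption), the result can be checked with:

```shell
hadoop fs -cat /user/cloudera/pigoutput/part-*
```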
Experiment 3: Apache Sqoop commands
Sqoop is a tool for transferring bulk data between Hadoop and relational databases such as MySQL.
Q2: Create a database
mysql> create database sqoop_db;
Query OK, 1 row affected (0.27 sec)
Sqoop – Import
Q6: Import this table in HDFS
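A sketch of the import (the table name emp_data and the password are assumptions; the target directory and mapper count match the part-m-0000* files shown below):

```shell
sqoop import \
  --connect jdbc:mysql://localhost/sqoop_db \
  --username root --password cloudera \
  --table emp_data \
  --target-dir /user/cloudera/scoop_data \
  -m 3
```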
Q8: Open the partitions and check for the records
[cloudera@quickstart Desktop]$ hadoop fs -cat
/user/cloudera/scoop_data/part-m-00000
[cloudera@quickstart Desktop]$ hadoop fs -cat
/user/cloudera/scoop_data/part-m-00001
[cloudera@quickstart Desktop]$ hadoop fs -cat
/user/cloudera/scoop_data/part-m-00002
Sqoop – Export
Q9: Export data (e.g. a CSV file) from HDFS into a MySQL table.
gedit abc.txt
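A sketch of the export, assuming abc.txt has been copied to HDFS and that the MySQL table abc_table (queried in Q10) already exists with matching columns:

```shell
hadoop fs -put abc.txt /user/cloudera/abc.txt
sqoop export \
  --connect jdbc:mysql://localhost/sqoop_db \
  --username root --password cloudera \
  --table abc_table \
  --export-dir /user/cloudera/abc.txt \
  --input-fields-terminated-by ','
```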
Q10: Check in MySQL whether the data was exported from HDFS into abc_table.
mysql> select * from abc_table;
Experiment 4: Apache Hive commands
Hive provides a SQL-like query language, HiveQL, for managing and querying data stored in Hadoop.
Q5: Also check the contents inside emp:
hadoop fs -cat /user/hive/warehouse/emp_details.db/emp/emp_details.txt
show tables;
Q10: Group the sum of salaries by deptno.
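A sketch of the query, assuming the emp table has columns sal and deptno:

```sql
SELECT deptno, SUM(sal) FROM emp GROUP BY deptno;
```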
Q15: Add a column to the table
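A sketch of adding a column (the column name and type are illustrative):

```sql
ALTER TABLE emp ADD COLUMNS (address STRING);
```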
>create table movie_details(no int,
name string,
year int,
rating decimal,
views int,
Genres string,
Director string)
row format delimited fields terminated by ',';
Q6. Print all movies between the years 2005 and 2017
>select * from movie_details where year between 2005 and 2017;
Q11. Select the rating of the movie 'Scott Cooper'
>SELECT name, rating FROM movie_details WHERE name = 'Scott Cooper';
Q13. Print all movies under the direction of Daniel Barnz and James Franco.
> select name, rating from movie_details where Director='Daniel Barnz' or Director='James Franco';
Q16. Count the Adventure|Animation|Comedy|Family|Fantasy
Movies from the given dataset.
> SELECT count(*) FROM movie_details where
Genres='Adventure|Animation|Comedy|Family|Fantasy' ;
Word count problem in Hive:
Aim: To perform word count on a text file using Hive Query Language.
Objective: To perform word count on a text file using functions like split and explode.
2) Create a table
6) explode expands an array in a single row across multiple rows, one for each value in the array.
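Putting the steps together, a minimal word-count sketch in HiveQL (the table and file names are assumptions):

```sql
-- table holding one line of the text file per row
CREATE TABLE docs (line STRING);
LOAD DATA LOCAL INPATH '/home/cloudera/input.txt' OVERWRITE INTO TABLE docs;
-- split each line into words, explode into one word per row, then count
SELECT word, COUNT(1) AS count
FROM (SELECT explode(split(line, ' ')) AS word FROM docs) w
GROUP BY word;
```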
Experiment 5: MapReduce word count problem
Give the path as the src folder in the project.
6. Copy the given MapReduce program into the created class and save it.
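The program referred to is not reproduced in this transcript; the classic Hadoop WordCount, sketched below following the standard Apache example, is the usual choice (class names and structure are assumptions):

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one); // emit (word, 1) for every token
      }
    }
  }

  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) sum += val.get(); // sum the counts per word
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input path
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output path
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```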
7. Add JAR files by following the steps below to remove the errors from the above code.
8. Create a JAR file for the given program.
Click on Finish.
9. The JAR file will be created on the desktop.
Input text used for the word count: hi we will meet
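The job can then be run from the terminal; a sketch assuming the JAR is named wordcount.jar and the input file has been copied to HDFS (all paths and names are illustrative):

```shell
hadoop fs -put input.txt /user/cloudera/input.txt
hadoop jar wordcount.jar WordCount /user/cloudera/input.txt /user/cloudera/wc_output
hadoop fs -cat /user/cloudera/wc_output/part-r-00000
```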
Experiment 6: Data visualization using Tableau
You can upload data into Tableau from MS Excel, a text file, or by connecting to a database server.
After loading your dataset you can view its contents as shown in the figure below. There can be one or more tables in your dataset, and you can apply a union operation to combine the tables for effective data visualization.
Union of two tables.
3) Applying a filter in Tableau: right-click the particular column or row, select Apply Filter, and set the respective filter.