Hive - Hands On Exercises: Intellipaat Software Solutions Pvt. LTD

Intellipaat Software Solutions Pvt. Ltd.
Hive – Hands on Exercises
Prerequisites
Create Database
Drop Database
Change to Database
Create Table
Load Data
Drop Table
Alter Table
Queries with Where Clause
Group By Queries
Join Tables
Partitioning
External Tables
Sequence Table
Map Join
Storing Data into file
Storing Data into another Table
https://intellipaat.com/ IN: +91-7847955955 US: 1-800-216-8930(Toll Free) Page 1

Prerequisites
1. Hive should be installed on the Intellipaat VM.
2. Run hive to open the Hive prompt, then execute the commands that follow.
3. Copy the zip file of data provided with this document to /tmp on the VM and extract it.
Create Database
Syntax: CREATE DATABASE [IF NOT EXISTS] <DATABASE_NAME>;
Example: CREATE DATABASE intellipaat;
Validate: SHOW DATABASES;

default
intellipaat
Drop Database
Syntax: DROP DATABASE IF EXISTS <DATABASE_NAME>;
Example: DROP DATABASE IF EXISTS intellipaat;
Validate: SHOW DATABASES;

default
Change to Database
Syntax: USE <DATABASE_NAME>;
Example: USE intellipaat;
Create Table
Syntax:
CREATE TABLE <table-name>
( <column name> <data-type>,
<column name> <data-type> );
CREATE TABLE <table-name>
( <column name> <data-type>,
<column name> <data-type> )
ROW FORMAT DELIMITED FIELDS TERMINATED BY '<DELIMITER>';
Example:
CREATE TABLE students(age int, name string) ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',';
Validate: SHOW TABLES;

students
CREATE TABLE employees(depid int, empname string) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
CREATE TABLE departments(depid int, depname string) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
Validate: SHOW TABLES;

departments
employees
students
Load Data
Syntax: LOAD DATA LOCAL INPATH '<input-path>' INTO TABLE <TABLE_NAME>;
Example: LOAD DATA LOCAL INPATH '/tmp/students.txt' INTO TABLE students;
Verify: SELECT * FROM students;

11,Raju
21,Suresh
16,Prabhu
Example: LOAD DATA LOCAL INPATH '/tmp/employees.txt' INTO TABLE employees;
LOAD DATA LOCAL INPATH '/tmp/departments.txt' INTO TABLE departments;
Verify: SELECT * FROM employees;

1,Hari
2,Giri
2,Sasi
SELECT * FROM departments;

1,Electrician
2,Plumber

Drop Table
Syntax: DROP TABLE <TABLE_NAME>;
Example: DROP TABLE students;
Verify: SHOW TABLES;
Note: Once a table is dropped it can no longer be used. Recreate the table and reload its
data if you want to use it in the exercises that follow.
Alter Table
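Syntax: ALTER TABLE <TABLE_NAME> ADD COLUMNS (NEW_COL COL_TYPE);
ALTER TABLE <TABLE_NAME> REPLACE COLUMNS (COL_NAME COL_TYPE, ...);
Note: Hive has no DROP COLUMN or REMOVE COLUMN statement. To remove a column, use
REPLACE COLUMNS and list only the columns you want to keep.
Example: ALTER TABLE students ADD COLUMNS (fatherName string);
Verify: DESCRIBE students;

age int
name string
fathername string
Example: ALTER TABLE students REPLACE COLUMNS (age int, name string);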
Verify: DESCRIBE students;

age int
name string
Queries with Where Clause
Syntax: SELECT * FROM <TABLE_NAME> WHERE <COLUMN_NAME> <OPERATOR> <VALUE>;
Example: SELECT * FROM students WHERE age<18;

11,Raju
16,Prabhu
Group By Queries
Syntax: SELECT <COLUMN_NAME>, count(*) FROM <TABLE_NAME> GROUP BY <COLUMN_NAME>;
Example: SELECT depid, count(*) FROM employees GROUP BY depid;

1,1
2,2
Join Tables
Syntax: SELECT <TABLE1_NAME.COLUMN, TABLE2_NAME.COLUMN, …>
FROM <TABLE1> JOIN <TABLE2> ON (TABLE1.KEY = TABLE2.KEY);
Example: SELECT departments.depid, departments.depname, employees.empname FROM
departments JOIN employees ON (departments.depid = employees.depid);

1,Electrician,Hari
2,Plumber,Giri
2,Plumber,Sasi
Partitioning
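Syntax: CREATE TABLE <TABLE_NAME>(COL_NAME COL_TYPE, ...) PARTITIONED BY
(COL_NAME COL_TYPE, ...);
Example: CREATE TABLE logs(serverName string, message string) PARTITIONED BY
(date int);
Note: In newer Hive releases date is a reserved keyword; if the statement is rejected, quote
the column name with backticks or choose a different name.
LOAD DATA INPATH '/tmp/logs' INTO TABLE logs PARTITION (date='21');
LOAD DATA INPATH '/tmp/logs' INTO TABLE logs PARTITION (date='22');
Note: LOAD DATA INPATH (without LOCAL) moves the file from its HDFS path into the table,
so copy the file to /tmp/logs again before running the second load.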
Verify: SHOW PARTITIONS logs;

date=21
date=22
SELECT date, serverName, message FROM logs WHERE date='21';

21,server1,imp1
21,server2,imp2
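Hive stores each partition of a table in its own subdirectory named after the partition value (for example date=21), which is why a filter on the partition column only has to read the matching directory. The Python sketch below models that layout with a dictionary. It is an illustration only, not Hive code: the names partitions, load_partition and select_where_date are hypothetical, and the contents of /tmp/logs are assumed from the query output shown above.

```python
# Conceptual model of Hive partitioning; not Hive code.
# Each partition is kept separately, keyed by its directory name.
partitions = {}


def load_partition(date, rows):
    """Model LOAD DATA ... PARTITION (date=...): rows land in their own directory."""
    partitions.setdefault(f"date={date}", []).extend(rows)


def select_where_date(date):
    """Model SELECT ... WHERE date=...: only the matching directory is read."""
    rows = partitions.get(f"date={date}", [])
    # The partition column is not stored in the data files;
    # its value comes from the directory name.
    return [(date, server, message) for server, message in rows]


# Assumed contents of /tmp/logs, taken from the query output in the exercise.
log_rows = [("server1", "imp1"), ("server2", "imp2")]
load_partition(21, log_rows)
load_partition(22, log_rows)

print(sorted(partitions))  # models SHOW PARTITIONS logs
for row in select_where_date(21):
    print(",".join(str(field) for field in row))
```

Because the date value is derived from the directory name, the same file can be loaded into several partitions and each copy reports a different date.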
External Tables
If you want to use existing data without copying it into /user/hive/warehouse, create an
EXTERNAL table with a LOCATION clause as shown below.
Note: Do not use LOAD after creating the table; the table reads the files already in its LOCATION.
1. Copy file to HDFS
hadoop fs -mkdir /tmp/table_name
hadoop fs -put /tmp/employees.txt /tmp/table_name
2. Create the external table and query its rows with a SELECT statement
CREATE EXTERNAL TABLE ext(depid int, empname string) ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',' LOCATION '/tmp/table_name';
SELECT * FROM ext;
3. Check that the actual file still resides in the given folder
hadoop fs -cat /tmp/table_name/employees.txt
4. Dropping this table does not delete the actual file, unlike managed tables
(tables created without the EXTERNAL keyword), whose data is deleted by DROP TABLE.
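The difference between the two table types on DROP TABLE can be modelled with a small Python sketch. This is not Hive code: the catalog dictionary and the create_table and drop_table helpers are hypothetical stand-ins for the metastore, and local temporary directories stand in for HDFS paths.

```python
# Conceptual model of DROP TABLE for managed vs. external tables; not Hive code.
import os
import shutil
import tempfile

catalog = {}  # hypothetical metastore: table name -> (data directory, is_external)


def create_table(name, location, external):
    catalog[name] = (location, external)


def drop_table(name):
    """Always remove the metadata; delete the data only for managed tables."""
    location, external = catalog.pop(name)
    if not external:
        shutil.rmtree(location)


# Local temporary directories stand in for HDFS locations.
root = tempfile.mkdtemp()
managed_dir = os.path.join(root, "managed_dir")
external_dir = os.path.join(root, "external_dir")
for directory in (managed_dir, external_dir):
    os.mkdir(directory)
    with open(os.path.join(directory, "employees.txt"), "w") as f:
        f.write("1,Hari\n2,Giri\n2,Sasi\n")

create_table("emp_managed", managed_dir, external=False)
create_table("ext", external_dir, external=True)
drop_table("emp_managed")
drop_table("ext")

managed_exists = os.path.exists(managed_dir)    # data removed with the table
external_exists = os.path.exists(external_dir)  # data left in place
print(managed_exists, external_exists)

shutil.rmtree(root)  # clean up the temporary directories
```

In both cases the table disappears from the catalog; only the handling of the underlying files differs.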
Sequence Table
Example:
CREATE TABLE students_seq (age int, name string) ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',' STORED AS SEQUENCEFILE;
INSERT OVERWRITE TABLE students_seq SELECT * FROM students;
SELECT * FROM students_seq;
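Note: A plain text file cannot be loaded directly into a SEQUENCEFILE table with LOAD DATA;
populate the table from another table with INSERT ... SELECT as shown above.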
Map Join
A map join loads the smaller table into memory so that the join can be performed entirely
within the mappers, with no reduce phase. This improves join performance in Hive when one
of the tables is small enough to fit in memory.
set hive.auto.convert.join.noconditionaltask = true;
set hive.auto.convert.join.noconditionaltask.size = 10000;
SELECT * FROM departments JOIN employees ON (departments.depid = employees.depid);
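To make the idea concrete, here is a minimal Python sketch of what a map-side join does conceptually. It is not Hive code and not how Hive is implemented internally: the function name map_side_join is hypothetical, and the in-memory lists mirror the departments and employees data loaded earlier.

```python
# Conceptual model of a map join (in-memory hash join); not Hive code.
# Rows mirror /tmp/departments.txt and /tmp/employees.txt from the exercises.
departments = [(1, "Electrician"), (2, "Plumber")]   # small table
employees = [(1, "Hari"), (2, "Giri"), (2, "Sasi")]  # larger table


def map_side_join(small, large):
    """Join on the first field by holding the small table in memory."""
    # Build phase: the small table becomes an in-memory hash table.
    lookup = {}
    for depid, depname in small:
        lookup.setdefault(depid, []).append(depname)

    # Probe phase: each "mapper" streams rows of the large table and
    # finds matches locally, so no shuffle or reduce step is needed.
    joined = []
    for depid, empname in large:
        for depname in lookup.get(depid, []):
            joined.append((depid, depname, empname))
    return joined


for row in map_side_join(departments, employees):
    print(",".join(str(field) for field in row))
```

The sketch only works because the small table fits in memory, which is the condition the size setting above is meant to check.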
Storing Data into file
Syntax: INSERT OVERWRITE DIRECTORY '</PATH_TO_OUTPUT_DIRECTORY>' <QUERY>;
Example: INSERT OVERWRITE DIRECTORY '/tmp/output/' SELECT * FROM students;
Verify: hadoop fs -cat /tmp/output/*
(The output file name depends on the Hive version, so a wildcard is used here.)

11,Raju
21,Suresh
16,Prabhu

Storing Data into another Table
Syntax: CREATE TABLE <NEW_TABLE> AS <QUERY>;
Example: CREATE TABLE majors AS SELECT * FROM students WHERE age>18;
Verify: SELECT * FROM majors;

21,Suresh
Indexing
Syntax: CREATE INDEX <INDEX_NAME> ON TABLE <TABLE_NAME>(<COLUMN_NAME>) AS
'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED
REBUILD;
Example: CREATE INDEX age_index ON TABLE students(age) AS
'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED
REBUILD;
Note: Index support was removed in Hive 3.0, so these statements work only on earlier releases.
Hive UDF
1. mkdir -p com/example/hive/udf
2. vim com/example/hive/udf/Lower.java
3. Paste this content
package com.example.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Simple UDF that converts a string value to lower case.
public final class Lower extends UDF {
    // Hive calls evaluate() once per row; a NULL input yields NULL.
    public Text evaluate(final Text s) {
        if (s == null) { return null; }
        return new Text(s.toString().toLowerCase());
    }
}
4. javac -cp `hadoop classpath`:"/usr/lib/hive/lib/*" com/example/hive/udf/Lower.java
5. jar -cvf my_udf.jar com/
6. hive> add jar my_udf.jar;
7. hive> create temporary function my_lower as 'com.example.hive.udf.Lower';
8. hive> select depid, my_lower(empname) from employees;
