Hive - Hands On Exercises: Intellipaat Software Solutions Pvt. LTD

Intellipaat Software Solutions Pvt. Ltd.

Hive – Hands on Exercises

Prerequisites
Create Database
Drop Database
Change to Database
Create Table
Load Data
Drop Table
Alter Table
Queries with Where Clause
Group By Queries
Join Tables
Partitioning
External Tables
Sequence Table
Map Join
Storing Data into file
Storing Data into another Table
Indexing
Hive UDF

https://intellipaat.com/ IN: +91-7847955955 US: 1-800-216-8930(Toll Free) Page 1


Prerequisites

1. Hive should be installed on the Intellipaat VM.
2. Run the hive command to open the Hive prompt, and execute the commands below from there.
3. Copy the zip file of data provided with this document into /tmp on the VM and extract it.

Create Database

Syntax: CREATE DATABASE [IF NOT EXISTS] <DATABASE_NAME>;

Example: CREATE DATABASE intellipaat;

Validate: SHOW DATABASES;

default
intellipaat

Drop Database

Syntax: DROP DATABASE IF EXISTS <DATABASE_NAME>;

Example: DROP DATABASE IF EXISTS intellipaat;

Validate: SHOW DATABASES;

default

Change to Database

Syntax: USE <DATABASE_NAME>;

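Example: USE intellipaat;

Note: If you dropped the database in the previous exercise, recreate it with CREATE DATABASE intellipaat; before running USE.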

Create Table

Syntax:
CREATE TABLE <table-name>
( <column name> <data type>,
<column name> <data type> );

CREATE TABLE <table-name>
( <column name> <data type>,
<column name> <data type> )
ROW FORMAT DELIMITED FIELDS TERMINATED BY '<DELIMITER>';

Example:
CREATE TABLE students(age int, name string) ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',';

Validate: SHOW TABLES;


students

CREATE TABLE employees(depid int, empname string) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
CREATE TABLE departments(depid int, depname string) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

Validate: SHOW TABLES;


departments
employees
students

Load Data

Syntax: LOAD DATA LOCAL INPATH '<input-path>' INTO TABLE <TABLE_NAME>;

Example: LOAD DATA LOCAL INPATH '/tmp/students.txt' INTO TABLE students;

Verify: SELECT * FROM students;


11,Raju
21,Suresh
16,Prabhu

Example: LOAD DATA LOCAL INPATH '/tmp/employees.txt' INTO TABLE employees;
LOAD DATA LOCAL INPATH '/tmp/departments.txt' INTO TABLE departments;

Verify: SELECT * FROM employees;


1,Hari
2,Giri
2,Sasi

SELECT * FROM departments;


1,Electrician
2,Plumber


Drop Table
Syntax: DROP TABLE <TABLE_NAME>;

Example: DROP TABLE students;

Verify: SHOW TABLES;

Note: Once a table is dropped it can no longer be queried. Recreate the table and reload its
data if you want to use it in the exercises that follow.

Alter Table
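Syntax: ALTER TABLE <TABLE_NAME> ADD COLUMNS (<NEW_COL> <COL_TYPE>);
ALTER TABLE <TABLE_NAME> REPLACE COLUMNS (<COL_NAME> <COL_TYPE>, ...);

Example: ALTER TABLE students ADD COLUMNS (fatherName string);

Verify: DESCRIBE students;

age int
name string
fathername string

Note: Hive has no DROP COLUMN or REMOVE COLUMN statement. To remove a column, use
REPLACE COLUMNS and list only the columns you want to keep.

Example: ALTER TABLE students REPLACE COLUMNS (age int, name string);

Verify: DESCRIBE students;

age int
name string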

Queries with Where Clause

Syntax: SELECT * FROM <TABLE_NAME> WHERE <COLUMN_NAME> <OPERATOR> <VALUE>;

Example: SELECT * FROM students WHERE age<18;

11,Raju
16,Prabhu

Group By Queries

Syntax: SELECT <COLUMN_NAME>, count(*) FROM <TABLE_NAME> GROUP BY <COLUMN_NAME>;

Example: SELECT depid, count(*) FROM employees GROUP BY depid;

1,1
2,2
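To make the result of the GROUP BY example easier to check by hand, here is a minimal Python sketch of the same computation. It is only an illustration of what the query counts, not how Hive executes it; the function name group_by_count is an assumption made for this sketch, and the rows are the employees data from the Load Data exercise.

```python
from collections import Counter

# Rows of the employees table as (depid, empname), copied from the Load Data exercise.
employees = [(1, "Hari"), (2, "Giri"), (2, "Sasi")]


def group_by_count(rows, key_index=0):
    """Count rows per distinct key, like SELECT key, count(*) ... GROUP BY key."""
    counts = Counter(row[key_index] for row in rows)
    # Sort by key so the output order is predictable.
    return sorted(counts.items())


for depid, n in group_by_count(employees):
    print(f"{depid},{n}")
```

Department 1 has one employee and department 2 has two, which matches the two result rows of the Hive query.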

Join Tables
Syntax: SELECT <TABLE1>.<COLUMN>, <TABLE2>.<COLUMN>, ...
FROM <TABLE1> JOIN <TABLE2> ON (<TABLE1>.<KEY> = <TABLE2>.<KEY>);

Example: SELECT departments.depid, departments.depname, employees.empname FROM
departments JOIN employees ON (departments.depid = employees.depid);

1,Electrician,Hari
2,Plumber,Giri
2,Plumber,Sasi

Partitioning

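Syntax: CREATE TABLE <TABLE_NAME>(COL_NAME COL_TYPE, ...) PARTITIONED BY
(COL_NAME COL_TYPE, ...);

Example: CREATE TABLE logs(serverName string, message string) PARTITIONED BY
(date int);
LOAD DATA INPATH '/tmp/logs' INTO TABLE logs PARTITION (date='21');
LOAD DATA INPATH '/tmp/logs' INTO TABLE logs PARTITION (date='22');

Note: Without the LOCAL keyword, INPATH refers to an HDFS path and LOAD moves the files
into the table, so /tmp/logs must be present in HDFS before each LOAD. On Hive 1.2.0 and
later, date is a reserved keyword; quote it with backticks or use a different column name.

Verify: SHOW PARTITIONS logs;

date=21
date=22

SELECT date, serverName, message FROM logs WHERE date='21';

21,server1,imp1
21,server2,imp2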
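The benefit of partitioning is that a filter on the partition column only has to read the matching partition. The Python sketch below illustrates the idea under stated assumptions: Hive keeps each partition in a subdirectory named <column>=<value>, while the table directory used here is hypothetical (the real path depends on the warehouse settings), and the function names partition_dir and prune are invented for this illustration.

```python
# Partition values loaded in the example above.
partitions = ["21", "22"]


def partition_dir(table_dir, column, value):
    """Build the subdirectory name Hive uses for one partition: <column>=<value>."""
    return f"{table_dir}/{column}={value}"


def prune(table_dir, column, values, wanted):
    """Return only the partition directories that a filter on the partition column must read."""
    return [partition_dir(table_dir, column, v) for v in values if v == wanted]


# Hypothetical table directory; the real one depends on the warehouse configuration.
table_dir = "/user/hive/warehouse/intellipaat.db/logs"

print(prune(table_dir, "date", partitions, "21"))
```

With WHERE date='21', only the date=21 directory is selected; the data loaded into date=22 is never scanned.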

External Tables

To query existing data without copying it into /user/hive/warehouse, create an
EXTERNAL table with a LOCATION clause, as shown below.
Note: Do not run LOAD after creating the table.

1. Copy the file to HDFS

hadoop fs -mkdir /tmp/table_name
hadoop fs -put /tmp/employees.txt /tmp/table_name

2. Create the external table and query its rows with a SELECT statement

CREATE EXTERNAL TABLE ext(age int, name string) ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',' LOCATION '/tmp/table_name';
SELECT * FROM ext;

3. Check that the actual file still resides in the given folder

hadoop fs -cat /tmp/table_name/employees.txt

4. Dropping this table does not delete the actual file, unlike with managed tables
(tables created without the EXTERNAL keyword)

Sequence Table

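Example:
CREATE TABLE students_seq (age int, name string) ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',' STORED AS SEQUENCEFILE;
INSERT OVERWRITE TABLE students_seq SELECT * FROM students;
SELECT * FROM students_seq;

Note: Text data cannot be loaded directly into a SEQUENCEFILE table, because LOAD DATA
does not convert file formats. Populate the table with INSERT ... SELECT from a text table,
as above.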

Map Join

MapJoin loads the smaller table into memory so that the join can be performed entirely
within the mappers, without a reduce phase. This improves the performance of joins in Hive
when one table is small enough to fit in memory.

set hive.auto.convert.join.noconditionaltask = true;
set hive.auto.convert.join.noconditionaltask.size = 10000;

SELECT * FROM departments JOIN employees ON (departments.depid =
employees.depid);
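The idea behind a map join can be sketched in a few lines of Python: the small table is held in an in-memory hash table, and the large table is streamed past it once. This is a simplified illustration, not Hive's implementation; the function name map_side_join is an assumption, and the rows are the departments and employees data from the Load Data exercise.

```python
# Rows copied from the Load Data exercise: (depid, depname) and (depid, empname).
departments = [(1, "Electrician"), (2, "Plumber")]
employees = [(1, "Hari"), (2, "Giri"), (2, "Sasi")]


def map_side_join(small, large):
    """Join two lists of (key, value) rows by holding the small one in a hash table."""
    # Build phase: load the small table into memory, keyed by the join column.
    lookup = {}
    for key, value in small:
        lookup.setdefault(key, []).append(value)

    # Probe phase: stream the large table once; each match is a dictionary lookup.
    joined = []
    for key, value in large:
        for small_value in lookup.get(key, []):
            joined.append((key, small_value, value))
    return joined


for row in map_side_join(departments, employees):
    print(",".join(str(field) for field in row))
```

Because each row of the large table is matched with a lookup rather than a sort and shuffle, no reduce step is needed, which is where the speed-up comes from.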

Storing Data into file

Syntax: INSERT OVERWRITE DIRECTORY '</PATH_TO_OUTPUT_DIRECTORY>' <QUERY>;

Example: INSERT OVERWRITE DIRECTORY '/tmp/output/' SELECT * FROM students;

Verify: hadoop fs -cat /tmp/output/*
(the wildcard matches whatever file names Hive writes to the directory)

11,Raju
21,Suresh
16,Prabhu


Storing Data into another Table

Syntax: CREATE TABLE <NEW_TABLE> AS <QUERY>;

Example: CREATE TABLE majors AS SELECT * FROM students WHERE age>18;

Verify: SELECT * FROM majors;


21,Suresh

Indexing

Syntax: CREATE INDEX <INDEX_NAME> ON TABLE <TABLE_NAME>(<COLUMN_NAME>) AS
'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED
REBUILD;

Example: CREATE INDEX age_index ON TABLE students(age) AS
'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED
REBUILD;

Hive UDF
1. mkdir -p com/example/hive/udf
2. vim com/example/hive/udf/Lower.java
3. Paste this content
package com.example.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Simple UDF that returns its string argument in lower case.
public final class Lower extends UDF {
    // Hive calls evaluate() once per row; a NULL input yields NULL.
    public Text evaluate(final Text s) {
        if (s == null) { return null; }
        return new Text(s.toString().toLowerCase());
    }
}

4. javac -cp `hadoop classpath`:"/usr/lib/hive/lib/*" com/example/hive/udf/Lower.java
5. jar -cvf my_udf.jar com/
6. hive> add jar my_udf.jar;
7. hive> create temporary function my_lower as 'com.example.hive.udf.Lower';
8. hive> select depid, my_lower(empname) from employees;
