
Hive concepts

Hive> insert overwrite table tablename2 select * from tablename1;

Joins
A join collects information (rows) from more than one table (here, 2 tables).
There are 2 types of joins:
a. Inner join
b. Outer join
The default join is the inner join.

INNER JOIN: only the records that match the join condition.
OUTER JOIN: matching records + non-matching records, based on the join condition.
LEFT OUTER JOIN: matching records + non-matching records from the left-side table.
RIGHT OUTER JOIN: matching records + non-matching records from the right-side table.
FULL OUTER JOIN: matching records + non-matching records from both the left table and the right table.

Examples
INNER JOIN:
Hive> select e.ecode, e.ename, e.esal, d.dname, d.dloc from emp e join dept d on (e.dno = d.dno);

LEFT OUTER JOIN:
Hive> select e.ecode, e.ename, e.esal, d.dname, d.dloc from emp e left outer join dept d on (e.dno = d.dno);

RIGHT OUTER JOIN:
Hive> select e.ecode, e.ename, e.esal, d.dname, d.dloc from emp e right outer join dept d on (e.dno = d.dno);

FULL OUTER JOIN:
Hive> select e.ecode, e.ename, e.esal, d.dname, d.dloc from emp e full outer join dept d on (e.dno = d.dno);

Two types of tables in Hive
1. Internal table (managed table)
2. External table

If you create an internal table, a directory for the table is created in HDFS under Hive's default warehouse directory:
User/training/hive/warehouse

The files in this directory are managed by Hive (dropping the table also deletes its data), so they are not intended to be shared with other tools in the Hadoop ecosystem.
So create an EXTERNAL TABLE when the data must also be accessed by other ecosystem tools.
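
The external table idea can be sketched as follows. This is a minimal illustration, not taken from the slides: the table name emp_ext, the column types, the comma delimiter and the HDFS path /data/shared/emp are all assumptions.

```sql
-- Hypothetical table name, column types, delimiter and path (illustration only).
CREATE EXTERNAL TABLE emp_ext (
  ecode INT,
  ename STRING,
  esal  DOUBLE,
  dno   INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/shared/emp';  -- data stays here, outside the Hive warehouse

-- Dropping an external table removes only the table metadata;
-- the files under LOCATION are left in place for other tools.
```

Because the files live at a location you choose, other tools can read the same data without going through Hive.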

Table to table
Hive> insert overwrite table tablename2 select * from tablename1;

Hive to HDFS
Hive> insert overwrite directory 'directoryname' select * from tablename;
Note: the target is a directory; Hive writes the result as one or more files inside it.

Hive to local file system
Hive> insert overwrite local directory 'directoryname' select * from tablename;
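
By default the exported files use Hive's default field delimiter (Ctrl-A), which is hard to read in a text editor. A minimal sketch of writing comma-separated output instead, assuming Hive 0.11 or later; the output path /tmp/emp_export is hypothetical:

```sql
-- Hypothetical local output path; existing contents of the directory are overwritten.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/emp_export'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
SELECT * FROM emp;
```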

Group by, order by
Hive> select deptno, max(esal) from emp1 group by deptno order by deptno;

Joins:

first: 11, 12, 13
second: 11, 13, 14

1) Inner join:
first   second
11      11
13      13

2) Outer join:
a. Left outer join:
first   second
11      11
12      null
13      13

b. Right outer join:
first   second
11      11
13      13
null    14

3) Full outer join:
first   second
11      11
12      null
13      13
null    14

Full outer join:
Hive> select e.ecode, e.ename, e.esal, d.dname, d.dloc, d.dmid from emp1 e full outer join dept1 d on (e.deptno = d.deptno);
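
The first/second illustration above can be reproduced with a small script. This is a sketch under assumptions: the table names first_t and second_t and the column name id are made up for the example, and INSERT ... VALUES needs Hive 0.14 or later.

```sql
-- Hypothetical single-column tables holding the values from the illustration.
CREATE TABLE first_t  (id INT);
CREATE TABLE second_t (id INT);

INSERT INTO TABLE first_t  VALUES (11), (12), (13);
INSERT INTO TABLE second_t VALUES (11), (13), (14);

-- Inner join: only ids present in both tables (11 and 13).
SELECT f.id, s.id FROM first_t f JOIN second_t s ON (f.id = s.id);

-- Left outer join: every row of first_t; 12 has no match, so s.id is NULL.
SELECT f.id, s.id FROM first_t f LEFT OUTER JOIN second_t s ON (f.id = s.id);

-- Right outer join: every row of second_t; 14 has no match, so f.id is NULL.
SELECT f.id, s.id FROM first_t f RIGHT OUTER JOIN second_t s ON (f.id = s.id);

-- Full outer join: all rows from both sides (11, 12, 13 and 14).
SELECT f.id, s.id FROM first_t f FULL OUTER JOIN second_t s ON (f.id = s.id);
```

Row order in the output is not guaranteed unless an order by clause is added.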

Struct, map, array

CREATE TABLE employees (
  name STRING,
  salary FLOAT,
  subordinates ARRAY<STRING>,
  deductions MAP<STRING, FLOAT>,
  address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
);
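
Each complex type has its own access syntax. A short sketch of querying the employees table defined above; the map key 'Federal Taxes' is a hypothetical example value, not something given in the slides:

```sql
SELECT
  name,
  subordinates[0]             AS first_subordinate,  -- ARRAY: zero-based index
  size(subordinates)          AS subordinate_count,  -- number of array elements
  deductions['Federal Taxes'] AS federal_tax,        -- MAP: lookup by key (hypothetical key)
  address.city                AS city                -- STRUCT: dot notation
FROM employees;
```

An index past the end of the array, or a key that is not in the map, returns NULL rather than raising an error.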

Hive vs RDBMS
Differences from a relational database:
HiveQL queries have high latency.
Hive does not rely on the index concept, because it streams through massive amounts of data.
It does not support transaction management; it supports only batch processing (later Hive versions add limited ACID support for transactional tables).
There are no updates, because of the write-once concept.
