Avro Hands On Exercises

This document describes how to work with Avro data formats in Hive and Pig. In Hive, it explains how to create a table using an Avro schema file, load data from a text file into that table, and check the table description. In Pig, it shows how to register necessary jars, load Avro data using a schema file, and access the data.

Uploaded by

Mytheesh Waran

Hadoop – Avro Data Formats

Hive

1. Create a Table
CREATE TABLE departments (departmentID int, departmentName string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

2. Load data from the dataset

LOAD DATA LOCAL INPATH '/tmp/departments.txt' INTO TABLE departments;
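
The LOAD step expects a comma-delimited text file whose columns match the departments table (an integer ID, then a name). As a minimal sketch, the Python snippet below writes such a file. The sample rows and the helper names (format_row, write_departments) are hypothetical and are not part of the original exercise; substitute your own dataset.

```python
# Hypothetical sample rows (not from the original exercise):
# each tuple is (departmentID, departmentName).
SAMPLE_ROWS = [(1, "Engineering"), (2, "Finance"), (3, "Marketing")]


def format_row(dept_id, name):
    """Return one comma-delimited line (without the trailing newline)."""
    # A plain delimited Hive table has no quoting, so a comma or newline
    # inside the name would shift or split the columns.
    if "," in name or "\n" in name:
        raise ValueError("department name must not contain ',' or a newline")
    return f"{int(dept_id)},{name}"


def write_departments(path, rows=SAMPLE_ROWS):
    """Write one department per line, with no header row."""
    with open(path, "w", encoding="utf-8") as f:
        for dept_id, name in rows:
            f.write(format_row(dept_id, name) + "\n")


if __name__ == "__main__":
    # Path taken from the LOAD DATA statement above.
    write_departments("/tmp/departments.txt")
```

No header row is written because Hive would otherwise try to read it as a data row.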

3. Create a table using an Avro schema file (.avsc)

CREATE TABLE departments_avro
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/tmp/departments/'
TBLPROPERTIES ('avro.schema.url'='/tmp/departments.avsc');
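
The exercise refers to /tmp/departments.avsc but does not show its contents. As a minimal sketch, the Python snippet below builds a schema that matches the two columns of the departments table and writes it as JSON. The record name, the namespace, the nullable unions, and the helper names are assumptions made for illustration, not the schema file used by the original author.

```python
import json


def build_departments_schema():
    """Return an Avro record schema (as a dict) for the departments table."""
    return {
        "type": "record",
        "name": "departments",          # assumed record name
        "namespace": "example.avro",    # hypothetical namespace
        "fields": [
            # Unions with "null" let rows with missing values be written;
            # this is an assumption, plain "int"/"string" would also be valid.
            {"name": "departmentID", "type": ["null", "int"], "default": None},
            {"name": "departmentName", "type": ["null", "string"], "default": None},
        ],
    }


def write_schema(path):
    """Serialize the schema to an .avsc file (plain JSON)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_departments_schema(), f, indent=2)


if __name__ == "__main__":
    # Writes to the local filesystem; the file must still be placed
    # wherever 'avro.schema.url' points so that Hive can read it.
    write_schema("/tmp/departments.avsc")
```

In a union with a default, Avro requires the default to match the first branch, which is why "null" is listed first.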

4. Populate the Avro table from the existing table

INSERT OVERWRITE TABLE departments_avro SELECT * FROM departments;

5. Check the table description

DESCRIBE FORMATTED departments_avro;

Intellipaat Software Services Pvt. Ltd. Page 1

Pig

1. Copy an Avro data file into HDFS at /tmp/departments.avro

2. Register all jars

REGISTER /usr/lib/pig/piggybank.jar;
REGISTER /usr/lib/pig/lib/avro-1.7.4.jar;
REGISTER /usr/lib/pig/lib/jackson-core-asl-1.8.8.jar;
REGISTER /usr/lib/pig/lib/jackson-mapper-asl-1.8.8.jar;
REGISTER /usr/lib/pig/lib/json-simple-1.1.jar;
REGISTER /usr/lib/pig/lib/jython-standalone-2.5.2.jar;
REGISTER snappy-java-1.0.4.1.jar;

3. Load data using Avro Schema File

deps = LOAD '/tmp/departments.avro' USING
org.apache.pig.piggybank.storage.avro.AvroStorage('no_schema_check', 'schema_file', '/tmp/departments.avsc');
