
Big Data & Hadoop Certification

Course Catalogue

© EduPristine – www.edupristine.com
© EduPristine For [Big Data & Hadoop]
Some facts…

 Every day, people send 150 billion new email messages. The number of
mobile devices already exceeds the world's population and is growing.
With every keystroke and click, we are creating new data at a blistering
pace.
 90% of the data in the world today has been created in the last two
years alone.
 80% of data captured today is unstructured.
 This brave new world is a potential treasure trove for data scientists and
analysts who can comb through massive amounts of data for new
insights, research breakthroughs and marketing strategies.
 But it also presents a problem for traditional relational databases and
analytics tools, which were not built to handle data at such scale.
 Another challenge is the mixed sources and formats, which include
XML, log files, objects, text, binary and more.



Big Data is growing



For example

 According to IBM: "80% of data captured today is unstructured, from sensors used to gather climate information,
posts to social media sites, digital pictures and videos, purchase transaction records, and cell
phone GPS signals, to name a few. All of this unstructured data is Big Data."



Why Hadoop?

 With Hadoop, no data is too big. Hadoop has gained momentum mainly due to its ability to
analyze unstructured Big Data to draw important predictions for businesses.
 Hadoop is the core platform for structuring Big Data, and solves the problem of making it useful
for analytics purposes.
 Hadoop is more than just a faster, cheaper database and analytics tool. In some cases, the
Hadoop framework lets users query datasets in previously unimaginable ways.
 Hadoop appeals to IT leaders because of the improved performance, scalability, flexibility,
efficiency, extensibility and fault tolerance it offers.
 Big Data means Big Opportunities
• 16,000+ openings today (source: naukri.com)
• Big demand but few qualify: "For every 100 openings, there are only 2 qualified candidates" (fastcompany.com)
• Hadoop pioneers in the online world, including eBay, Facebook, LinkedIn, Netflix and Twitter, paved the
way for companies in other data-intensive industries, and now there are huge opportunities in industries
such as finance, technology, telecom and government. Increasingly, IT companies are finding a place for
Hadoop in their data architecture plans.
The power of Hadoop

 Hadoop brings the ability to cheaply process large amounts of data, regardless of its structure. By
large, we mean from 10-100 gigabytes and above.
 USP:
1. Open source technology
2. No high-end server machines required

 Hadoop features: Reliable, Flexible, Economical, Scalable



Top companies using Hadoop



Hottest Salaries in IT sector



Who can join this course?

 Software Engineers, who are into ETL/programming and exploring job opportunities in
Hadoop.
 Managers, who are looking for the latest technologies to implement in their organizations, to
meet the current and upcoming challenges of data management.
 Any graduate or post-graduate who aspires to a career in cutting-edge
technologies.
Pre-requisites for the course:

 Prerequisites for learning Hadoop include hands-on experience in Core Java and Unix, and good
analytical skills to grasp and apply the concepts in Hadoop.
 EduPristine provides a complimentary recorded course, "Java Essentials for
Hadoop", to all participants who enroll for the Hadoop training. This course helps you brush
up the Java skills needed to write MapReduce programs.



Training Snapshot – Hadoop Pro Package

Big Data & Hadoop

 3 days: Online Java and Unix coverage
 12 days: 60 hrs classroom training
 10 hrs live projects (included in classroom training)
 PowerPoint presentations covering all classes
 Recorded videos covering all classes
 Hadoop Definitive Guide + EduPristine study notes
 Quiz & assignments
 24x7 access to online material through LMS
 Live support / discussion forum
 Certificate of Completion and Excellence from EduPristine
 Placement assistance

Study material provided: the Hadoop Definitive Guide reference book.
Day wise break up of Hadoop Program

Day 1: Introduction to Unix and Basics of Hadoop
 Content: Unix
 Objective: How to operate a Unix system
 Mode: Online session

Day 2 & 3: Basic Java & Introduction to Hadoop Technology
 Content: Core Java concepts (Object Oriented, Portable, Multi-Threaded, Secure, Platform Independent)
 Objective: Understand and brush up core Java concepts; introduction to Hadoop technology
 Mode: Online session


Day wise break up of Hadoop Program – Cont'd…

Day 4 & 5: Understanding the Pseudo Cluster Environment & Introduction to the Hadoop Distributed File System (HDFS)
 Content:
 • Cluster Specification
 • Hadoop Configuration (Configuration Management, Environment Settings, Important Hadoop Daemon Properties, Hadoop Daemon Addresses and Ports, Other Hadoop Properties)
 • Basic Linux and HDFS commands
 • Design of HDFS
 • HDFS Concepts
 • Command Line Interface
 • Hadoop File Systems
 • Java Interface
 • Data Flow (Anatomy of a File Read, Anatomy of a File Write, Coherency Model)
 • Parallel Copying with distcp
 • Hadoop Archives
 Objective: To understand the different components of a Hadoop pseudo cluster and the different configuration files used in the cluster; to understand what HDFS is, why it is required for running MapReduce, and how it differs from other distributed file systems.


Day wise break up of Hadoop Program – Cont'd…

Day 6 & 7: Understanding MapReduce Basics and MapReduce Types and Formats
 Content: Hadoop Data Types; Functional Concept of Mappers; Functional Concept of Reducers; The Execution Framework; Concept of Partitioners; Functional Concept of Combiners; Distributed File System; Hadoop Cluster Architecture; MapReduce Types; Input Formats (Input Splits and Records, Text Input, Binary Input, Multiple Inputs); Output Formats (Text Output, Binary Output, Multiple Outputs)
 Objective: To get an idea of how the MapReduce framework works and why MapReduce is tightly coupled with HDFS.



Day wise break up of Hadoop Program – Cont'd…

Day 8: HIVE
 Content: Hive Architecture; Running Hive; Comparison with Traditional Databases (Schema on Read Versus Schema on Write; Updates, Transactions and Indexes); HiveQL (Data Types, Operators and Functions); Tables (Managed Tables and External Tables, Partitions and Buckets, Storage Formats, Importing Data, Altering Tables, Dropping Tables); Querying Data (Sorting and Aggregating, MapReduce Scripts, Joins, Subqueries and Views, Map-side and Reduce-side Joins to optimize queries); User Defined Functions; Appending Data to an Existing Hive Table; Custom Map/Reduce in Hive
 Objective: To understand Hive, how data can be loaded into Hive, how data is queried from Hive, and so on.
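The "SQL-like access" idea behind the Day 8 Hive syllabus can be previewed without a Hadoop installation using Python's built-in sqlite3 module. The table and column names below are invented for the example, and HiveQL differs from SQLite in places (partitions, buckets, SerDes), but the sort-and-aggregate query shown is valid in both dialects.

```python
import sqlite3

# In-memory stand-in for a Hive managed table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (user_id INTEGER, url TEXT)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [(1, "/home"), (1, "/docs"), (2, "/home"), (3, "/home")],
)

# A HiveQL-style sorting-and-aggregating query; in Hive this would be
# compiled down into one or more MapReduce jobs
rows = conn.execute(
    "SELECT url, COUNT(*) AS hits FROM page_views "
    "GROUP BY url ORDER BY hits DESC"
).fetchall()
print(rows)  # [('/home', 3), ('/docs', 1)]
```

The key difference covered in class is schema on read: Hive applies the table schema when the query runs, rather than validating data at load time as a traditional RDBMS does.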



Day wise break up of Hadoop Program – Cont'd…

Day 9: PIG
 Content: Installing and Running Pig; Grunt; Pig's Data Model; Pig Latin; Developing and Testing Pig Latin Scripts; Writing Evaluation, Filter, Load and Store Functions
 Objective: To learn what Pig is, where Pig can be used, and how Pig is tightly coupled with MapReduce.

Day 10: SQOOP
 Content: Database Imports; Working with Imported Data; Importing Large Objects; Performing Exports; Exports: A Deeper Look
 Objective: To understand Sqoop, how imports to and exports from HDFS are done, and what the internal architecture of Sqoop is.



Day wise break up of Hadoop Program – Cont'd…

Day 11: Live Project 1
 Content: We will provide data sets on which participants will work as part of the project; app development; running search queries
 Objective: To work on a real-life project.

Day 12: HBASE
 Content: Introduction; Client API: Basics; Client API: Advanced Features; Client API: Administrative Features; Available Clients; Architecture; MapReduce Integration; Advanced Usage; Advanced Indexing
 Objective: To understand HBase, how data can be loaded into HBase, how data is queried from HBase using a client, and so on.

Day 13: Live Project 2
 Content: We will provide data sets on which participants will work as part of the project; app development; running search queries
 Objective: To work on a real-life project.



Day wise break up of Hadoop Program – Cont'd…

Day 14 & 15: SPARK
 Content: Modes of Spark; Spark Installation Demo; Overview of Spark on a Cluster; Spark Standalone Cluster; Invoking the Spark Shell; Creating the Spark Context; Loading a File in the Shell; Performing Basic Operations on Files in the Spark Shell; Building a Spark Project with sbt; Running a Spark Project with sbt; Caching Overview; Distributed Persistence; Spark Streaming Overview; RDDs; Transformations on RDDs; Actions on RDDs; Loading Data into RDDs; Saving Data through RDDs; Key-Value Pair RDDs; MapReduce and Pair RDD Operations; Java/Scala/Python and Hadoop Integration Hands-on; Loading of Data; Hive Queries through Spark; Performance Tuning Tips in Spark
 Objective: How to run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk, using Spark.
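The RDD transformations and actions listed for Days 14 & 15 follow a functional style that can be mimicked with plain Python iterators. This is a sketch of the programming model only, with made-up sample data; real code would use the Spark API (for example, building a chain like map, then filter, then reduce on an RDD loaded from HDFS).

```python
from functools import reduce

# Stand-in for an RDD: in Spark this dataset would be partitioned
# across the cluster's worker nodes
records = [3, 1, 4, 1, 5, 9, 2, 6]

# Transformations: lazy in Spark (nothing runs yet); Python's map and
# filter are similarly lazy iterators
squared = map(lambda x: x * x, records)        # like an RDD map
evens = filter(lambda x: x % 2 == 0, squared)  # like an RDD filter

# Action: forces evaluation and collapses the dataset to one result,
# like an RDD reduce
total = reduce(lambda a, b: a + b, evens)
print(total)  # 16 + 4 + 36 = 56
```

The distinction between lazy transformations (which only build up a lineage of operations) and actions (which trigger actual computation) is what lets Spark cache intermediate results in memory and gives it its speed advantage over disk-based MapReduce.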
Participants will be able to:

 Master the concepts of Hadoop Distributed File System


 Setup a Hadoop Cluster
 Write MapReduce Code in Java
 Perform Data Analytics using Pig and Hive
 Understand Data Loading Techniques using Sqoop and Flume
 Implement HBase, MapReduce Integration, Advanced Usage and Advanced Indexing
 Have a good understanding of the ZooKeeper service
 Use Apache Oozie to Schedule and Manage Hadoop Jobs
 Implement Best Practices for Hadoop Development and Debugging
 Develop a working Hadoop Architecture
 Work on a Real Life Project on Big Data Analytics and gain Hands on Project Experience



The Hadoop Bestiary

Flume: Collection and import of log and event data
HBase: Column-oriented database scaling to billions of rows
HDFS: Distributed redundant file system for Hadoop
Hive: Data warehouse with SQL-like access
MapReduce: Parallel computation on server clusters
Pig: High-level programming language for Hadoop computations
Sqoop: Imports data from relational databases
ZooKeeper: Configuration management and coordination

System Requirement: 4 GB RAM, 64-bit processor



Available Packages

Hadoop-Pro (Rs. 25,000): Hadoop Training
Hadoop-Plus (Rs. 33,000): Hadoop Training + Big Data Trends

Kindly see the next slide for details on the Hadoop Plus offerings.



Hadoop Plus Offering

Big Data TRENDS


 3Hrs online interactive session on every 2nd & 4th Sunday of
month.
 -New topics in Big data & Industrial Case studies will be
covered.
 - Access to Hadoop trends for a year.
 - 4 classroom workshops in a year in 5 cities to cover
important topics
 - Cities for workshop- Mumbai, Pune, Bangalore, Kolkata and
Delhi only.
 Apart from Hadoop Trends, the Hadoop Plus candidates will
also get an access to Live online classes which is worth INR
20000.



Course Highlights

COURSE HIGHLIGHTS                                           HADOOP PRO    HADOOP PLUS
                                                            Rs. 25,000    Rs. 33,000
12 Days Classroom Training (60 Hours)                           ✓             ✓
2 Real-Time Live Projects                                       ✓             ✓
Hadoop Definitive Guide + EduPristine study notes               ✓             ✓
PowerPoint presentations covering all classes                   ✓             ✓
Recorded videos covering Java and Unix sessions                 ✓             ✓
Recorded videos of live instructor-based training               ✓             ✓
Quiz & assignments                                              ✓             ✓
24x7 lifetime access to material and support                    ✓             ✓
Discussion forum                                                ✓             ✓
Certification of Completion & Excellence from EduPristine       ✓             ✓
1 year access to Big Data Trends                                              ✓
Live online classes for all topics                                            ✓
Fee Structure

Big Data & Hadoop

 3 days: Online Java and Unix coverage
 12 days: 60 hrs classroom training
 10 hrs live projects (included in classroom training)
 PowerPoint presentations covering all classes
 Recorded videos covering all classes
 Hadoop Definitive Guide + EduPristine study notes
 Quiz & assignments
 24x7 access to online material through LMS
 Live support / discussion forum
 Placement assistance

Note: You can also earn yourself a hefty referral bonus (Rs. 1000 per student) if your friends or colleagues join this course in your or any other city.



Thank you!
Contact:

EduPristine
702, Raaj Chambers, Old Nagardas Road, Andheri (E),
Mumbai-400 069. INDIA
www.edupristine.com
Ph. +91 22 4211 7474
