
Elective Course on Big Data

Jnaneshwar Bohara
Big Data - Hadoop | BoharaG 1
Know Your Instructor

➢ Jnaneshwar Bohara
➢ M.Sc. Computer System and Knowledge Engineering, IOE, TU (Gold Medal)
➢ Certified Scrum Master
➢ Senior Java Programmer
➢ Big Data Analyst


➢ Researcher on Big Data and Bioinformatics

https://www.amazon.com/MapReduce-Approach-Longest-Subsequence-BioSequences/dp/3659680508

How Huge Big Data is!



What is Big Data?


A collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.

Topics

➢ Introduction
➢ MapReduce
➢ Hadoop
➢ Hands on Hadoop
➢ NoSQL
➢ MongoDB
➢ HBase
➢ Spark



Introduction

➢ What is Big Data?
➢ Characteristics of Big Data
➢ Current Trends in Big Data
➢ Real-Life Applications of Big Data
➢ Scope and Challenges of Big Data
➢ Orientation to Practical Work (Tools and Techniques)



MapReduce

➢ Functional Programming
➢ What is MapReduce?
➢ How Does MapReduce Work?
➢ Distributed Execution Overview
➢ Data Distribution
➢ Use Cases of MapReduce
➢ Anatomy of a MapReduce Program
➢ MapReduce Programs in Java
➢ Basic MapReduce API Concepts
➢ Writing MapReduce Drivers, Mappers, and Reducers in Java

Hadoop

➢ What is Hadoop?
➢ History of Hadoop
➢ Motivations for Hadoop
➢ The Hadoop Ecosystem
➢ Hadoop Master/Slave Architecture
➢ Hadoop Daemons
➢ Hadoop Configuration Modes
➢ Uses for Hadoop
➢ Hadoop Cluster Setup
➢ Troubleshooting Installation and Running Programs on a Hadoop Cluster

Hands on Hadoop

➢ Basic Concepts of Java Programming for Hadoop Developers
➢ Basic Linux Concepts for Working with Hadoop
➢ Basic HDFS Commands
➢ Compile and Run Hadoop Programs Using the Command Line
➢ Use the Eclipse IDE for Hadoop Programming
➢ Use Python in Hadoop
➢ Write Your Own MapReduce Programs to Solve Real-Life Problems
➢ Use Different Data Types and Formats in Hadoop
➢ Analyze Big Data (CSV and JSON) in Your MapReduce Program



NoSQL

➢ Types of Data
➢ What is NoSQL?
➢ Why NoSQL?
➢ Types of NoSQL Databases



MongoDB

➢ Document vs. Relational Databases
➢ Installing MongoDB
➢ MongoDB – Collections
➢ MongoDB – Documents
➢ Object IDs
➢ Queries on MongoDB
➢ Aggregation Pipeline
➢ Nested Documents
➢ Twitter Data Analysis Using MongoDB



HBase

➢ HBase: Overview
➢ HBase vs. RDBMS
➢ HBase vs. HDFS
➢ HBase Architecture
➢ HBase Data Model
➢ HBase: Keys and Column Families
➢ HBase Regions
➢ Creating a Table
➢ Writing Queries to Insert Data into and Retrieve Data from HBase



Spark

➢ What is Spark?
➢ Spark Core
➢ Spark SQL
➢ Spark SQL – Handling JSON
➢ Spark SQL – Handling CSV
➢ Spark Streaming



Thank You!
