Presented by: Nisha Choudhary, Priya Kamti, Chandra Kanta Singha

This document provides an overview of Hadoop, including its history and core components. It describes how Hadoop manages big data applications through HDFS for storage and MapReduce for processing. Additional Hadoop components like YARN and the Hadoop ecosystem including Hive and Pig are also outlined. The document lists several use cases for big data and key features of Hadoop like fault tolerance, scalability and economic benefits. It briefly discusses the distributed cache and limitations of Hadoop with potential solutions.

Original Title

Hadoop_seminar

PRESENTED BY-

-NISHA CHOUDHARY
-PRIYA KAMTI
-CHANDRA KANTA SINGHA
CONTENTS
- Big Data
- Hadoop history
- Hadoop
- HDFS
- MapReduce
- YARN
- Why Hadoop
- Hadoop ecosystem
- Hive
- Pig
- Features of Hadoop
- Distributed cache in Hadoop
- Limitations and solutions

BIG DATA
Use cases
- Facebook
- Twitter
- YouTube
- Digital media
- Healthcare / life sciences
- Financial services
- Law enforcement
- Retail (marketing)
HADOOP HISTORY
- Hadoop was created by Doug Cutting and Mike Cafarella. It grew out of the Apache Nutch project and became a project of its own in 2006.
- The name comes from a yellow toy elephant that belonged to Doug Cutting's son.
HADOOP
- It is an open-source distributed processing framework.
- It manages data processing and storage for big data applications.
- It runs on clusters of machines.
- The core components of Hadoop are:
1) HDFS
2) MapReduce
3) YARN
HDFS (HADOOP DISTRIBUTED FILE SYSTEM)
- The primary data storage unit in Hadoop.
- Used in distributed data processing environments.
- Works in a master-slave topology.
- Runs two daemons: the NameNode (master) and the DataNode (slave).
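To make the master-slave split concrete, here is a minimal sketch in Python. It is a toy model, not the HDFS API: the class name, the 128 MB block size, the replication factor of 3, and the round-robin placement are all assumptions made for illustration (real HDFS placement is rack-aware). The point is only that the master keeps metadata while the blocks themselves live on the slaves.

```python
import math
from dataclasses import dataclass, field

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size (128 MB)
REPLICATION = 3                 # assumed replication factor


@dataclass
class NameNode:
    """Toy master: stores only metadata (which DataNodes hold each block)."""
    datanodes: list
    block_map: dict = field(default_factory=dict)

    def add_file(self, path, size_bytes):
        """Split a file into blocks and record where each replica would go."""
        n_blocks = max(1, math.ceil(size_bytes / BLOCK_SIZE))
        copies = min(REPLICATION, len(self.datanodes))
        placements = []
        for i in range(n_blocks):
            # Simplified round-robin placement; real HDFS is rack-aware.
            replicas = [self.datanodes[(i + r) % len(self.datanodes)]
                        for r in range(copies)]
            placements.append(replicas)
        self.block_map[path] = placements
        return placements

    def locate(self, path):
        """Answer a client's question: which DataNodes hold this file's blocks?"""
        return self.block_map[path]


namenode = NameNode(["dn1", "dn2", "dn3", "dn4"])
blocks = namenode.add_file("/logs/a.log", 300 * 1024 * 1024)
```

In this sketch a 300 MB file becomes three blocks, each listed on three different DataNodes, so losing one DataNode still leaves two copies of every block.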
MAPREDUCE
- The data processing layer of Hadoop.
- Processes data in two phases:
1) Map phase: applies the business logic to the input data.
2) Reduce phase: takes the output of the map phase as its input and aggregates it.
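The two phases can be sketched with the classic word-count example. This is plain Python running in one process, not the Hadoop MapReduce API; the function names and the explicit shuffle step (grouping map output by key before reducing) are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run map over every line, then reduce each group of values."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = run_job(["big data big cluster", "data node"])
```

On a real cluster the map calls would run in parallel on the nodes that hold the input blocks, and only the grouped intermediate pairs would travel over the network to the reducers.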
YARN (YET ANOTHER RESOURCE NEGOTIATOR)
Components are:
- Resource manager
  Runs on the master node.
  Knows the location and resources of each slave.
- Node manager
  Runs on the slave machines.
  Monitors the resource utilization of each container.
- Job submitter
  The client submits the job to the resource manager.
  The resource manager contacts the relevant nodes.
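The interaction between these components can be sketched as a toy model. The class and method names below are hypothetical, and the first-fit rule is an assumption: real YARN delegates placement to a pluggable scheduler and tracks more than memory.

```python
from dataclasses import dataclass


@dataclass
class NodeManager:
    """Toy slave-side agent: tracks the free memory on one machine."""
    name: str
    free_memory_mb: int

    def launch_container(self, memory_mb):
        """Reserve memory for a container and return a label for it."""
        self.free_memory_mb -= memory_mb
        return f"container@{self.name}"


class ResourceManager:
    """Toy master: knows every node and its free resources."""

    def __init__(self, node_managers):
        self.node_managers = node_managers

    def submit_job(self, memory_mb):
        """First-fit allocation (an assumption, not YARN's scheduler)."""
        for node in self.node_managers:
            if node.free_memory_mb >= memory_mb:
                return node.launch_container(memory_mb)
        return None  # no node can currently host the container


rm = ResourceManager([NodeManager("slave1", 1024), NodeManager("slave2", 4096)])
first = rm.submit_job(2048)
second = rm.submit_job(1024)
third = rm.submit_job(4096)
```

Here the client plays the job submitter: it only talks to the resource manager, which picks a node and asks that node's manager to start the container.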
WHY HADOOP: 11 REASONS
HADOOP ECOSYSTEM
HIVE
PIG
FEATURES OF HADOOP
- Open source
- Distributed processing
- Fault tolerance
- Reliability
- High availability
- Scalability
- Economical
- Easy to use
- Data locality
DISTRIBUTED CACHE IN HADOOP
- It is a facility provided by the Hadoop MapReduce framework.
- It can cache read-only text files, archives, JAR files, etc.
- Benefits:
1) Stores complex data
2) Data consistency
3) No single point of failure
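A common use of the distributed cache is a map-side join: a small read-only lookup file is made available to every task, loaded into memory once, and used to enrich each record. The sketch below simulates that pattern in plain Python with a temporary file standing in for the cached file. The file name, record layout, and function names are hypothetical. In the Java MapReduce API the file would typically be registered with `Job.addCacheFile`, which is not shown here.

```python
import os
import tempfile


def load_cached_lookup(path):
    """Read a small, read-only 'cached' file into memory once per task."""
    lookup = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            code, name = line.rstrip("\n").split(",", 1)
            lookup[code] = name
    return lookup


def map_records(records, lookup):
    """Map-side join: replace each country code with its name."""
    return [(lookup.get(code, "UNKNOWN"), amount) for code, amount in records]


def demo():
    """Write a stand-in cache file, load it, and join a few records."""
    with tempfile.TemporaryDirectory() as tmp:
        cache_file = os.path.join(tmp, "countries.txt")  # hypothetical name
        with open(cache_file, "w", encoding="utf-8") as handle:
            handle.write("IN,India\nUS,United States\n")
        lookup = load_cached_lookup(cache_file)
        return map_records([("IN", 10), ("US", 5), ("FR", 7)], lookup)


joined = demo()
```

Because the lookup data is read-only and every task gets the same copy, the join needs no reduce step and no network traffic per record.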

LIMITATIONS AND SOLUTIONS
- Issues with small files
- Slow processing speed
- Support for batch processing only
- No real-time data processing
- No delta iteration
- Latency
- Not easy to use
- Security
- No abstraction
- Vulnerable by nature
- No caching
- Lengthy code
- Uncertainty
THANK YOU…
