
A Brief History of Hadoop

Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used
text search library. Hadoop has its origins in Apache Nutch, an open source web search
engine, itself a part of the Lucene project.
The Origin of the Name Hadoop
The name Hadoop is not an acronym; it's a made-up name. The project's creator, Doug
Cutting, explains how the name came about:
The name my kid gave a stuffed yellow elephant. Short, relatively easy to spell and
pronounce, meaningless, and not used elsewhere: those are my naming criteria.
Kids are good at generating such. Googol is a kid's term.
Subprojects and contrib modules in Hadoop also tend to have names that are unrelated
to their function, often with an elephant or other animal theme (Pig, for example).
Smaller components are given more descriptive (and therefore more mundane)
names. This is a good principle, as it means you can generally work out what something
does from its name. For example, the jobtracker keeps track of MapReduce jobs.
Building a web search engine from scratch was an ambitious goal, for not only is the
software required to crawl and index websites complex to write, but it is also a challenge
to run without a dedicated operations team, since there are so many moving parts. It's
expensive too: Mike Cafarella and Doug Cutting estimated that a system supporting a
one-billion-page index would cost around half a million dollars in hardware, with a monthly
running cost of $30,000. Nevertheless, they believed it was a worthy goal, as it would
open up and ultimately democratize search engine algorithms.
Nutch was started in 2002, and a working crawler and search system quickly emerged.
However, they realized that their architecture wouldn't scale to the billions of pages on
the Web. Help was at hand with the publication of a paper in 2003 that described the
architecture of Google's distributed filesystem, called GFS, which was being used in
production at Google. GFS, or something like it, would solve their storage needs for
the very large files generated as a part of the web crawl and indexing process. In particular,
GFS would free up time being spent on administrative tasks such as managing
storage nodes. In 2004, they set about writing an open source implementation, the
Nutch Distributed Filesystem (NDFS).
