Mapreduce in Cloud Computing: Mapreduce, Mapreduce Paradigm, Mapreduce Examples, Hadoop, Hdfs

MapReduce is a programming model for distributed computing of large datasets across clusters of machines. It involves distributing the data processing across nodes in a parallel and fault-tolerant manner. Hadoop is an open-source implementation of MapReduce that uses HDFS for distributed storage and MapReduce for distributed processing of large datasets in parallel. Users write map and reduce functions that operate on key-value pairs to perform tasks like search, sorting, analytics in a distributed manner.

Original Title: MapReduce, HDFS

Uploaded by

Muhammad umar

MapReduce in Cloud Computing

MapReduce, MapReduce Paradigm, MapReduce Examples, Hadoop, HDFS
Distributed system
• A distributed system is a system whose components are located on
different networked computers, which communicate and coordinate
their actions by passing messages to one another.

• A distributed system allows resources, including software, to be
shared among the systems connected to the network. Examples of
distributed systems and applications of distributed computing:
intranets, the Internet, the WWW, and email.
MapReduce
MapReduce is a general-purpose programming model for
data-intensive computing
• Pioneered by Google
• Processed about 20 PB of data per day at Google
• Popularized by the open-source Hadoop project
• Used by Yahoo!, Facebook, Amazon, …
• It uses a parallel computing model that distributes
computational tasks across a large number of nodes (roughly
1,000 to 10,000).
• It is fault-tolerant: it can keep working even when 1,600 of
1,800 nodes fail.
• In the MapReduce model, the user has to write only two functions:
map and reduce.
A few examples that can be easily expressed as
MapReduce computations:
• Distributed grep
• Count of URL access frequency
• Inverted index
• Mining
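
Two of the examples listed above, URL access frequency and the inverted index, can be sketched in a few lines of Python. This is only a single-process illustration of the map, group-by-key, and reduce steps, not how Hadoop runs a job. The driver `run_mapreduce`, the log format, and the sample documents are all assumptions made up for this sketch.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Toy single-process driver: map, group values by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Sorting the keys mimics the ordered output of the shuffle step.
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


# Hypothetical web-server log lines in the form "<client> <url>".
log_lines = [
    "10.0.0.1 /index.html",
    "10.0.0.2 /about.html",
    "10.0.0.1 /index.html",
    "10.0.0.3 /index.html",
]


def url_mapper(line):
    # Emit (url, 1) for every request in the log.
    _client, url = line.split()
    yield url, 1


def count_reducer(_url, counts):
    # Add up all the 1s emitted for one URL.
    return sum(counts)


# Hypothetical documents as (document id, text) pairs.
documents = [("doc1", "map reduce"), ("doc2", "reduce hadoop")]


def index_mapper(document):
    # Emit (word, document id) for every word in the document.
    doc_id, text = document
    for word in text.split():
        yield word, doc_id


def index_reducer(_word, doc_ids):
    # Collect the distinct documents that contain the word.
    return sorted(set(doc_ids))


url_counts = run_mapreduce(log_lines, url_mapper, count_reducer)
inverted_index = run_mapreduce(documents, index_mapper, index_reducer)
print(url_counts)
print(inverted_index)
```

In both cases the user-supplied code is only the mapper and the reducer; the grouping in between is done by the driver, which is the part a framework such as Hadoop takes over.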
• MapReduce is a programming model for large-scale
computing.
• It uses the distributed environment of the cloud to process
large amounts of data in a reasonable amount of time.
• It was inspired by the map and reduce functions of functional
programming languages (such as Lisp, Scheme, and Racket) [3].
Map and Reduce in Racket (Functional Programming
Language)
Map:
(map f list1) → list2
e.g. (map square '(1 2 3 4 5)) → '(1 4 9 16 25)
(square is assumed to be user-defined; Racket's built-in equivalent is sqr)
Reduce:
(foldl f init list1) → any
e.g. (foldl + 0 '(1 2 3 4 5)) → 15
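
For readers who do not use Racket, the same two calls can be written with Python's built-in `map` and `functools.reduce`. This is a rough equivalent rather than an exact translation: `square` is defined here for the example, and `reduce` passes the accumulator as the first argument where `foldl` passes it last, which makes no difference for addition.

```python
from functools import reduce
from operator import add


def square(x):
    # Helper defined for this example.
    return x * x


numbers = [1, 2, 3, 4, 5]

# map applies the function to every element and yields a new sequence.
squares = list(map(square, numbers))

# reduce folds the list into a single value, starting from the initial value 0.
total = reduce(add, numbers, 0)

print(squares)  # [1, 4, 9, 16, 25]
print(total)    # 15
```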
• This presentation examines Hadoop, an implementation of the
MapReduce model.
• It processes data in parallel in a distributed manner.
• It divides the data into logical blocks, processes these
blocks in parallel on different machines, and finally
combines the results to produce the final result.
• It is fault-tolerant.
• One attractive feature of Hadoop is that the user can write the
map and reduce functions in almost any programming language
(through Hadoop Streaming, which exchanges records over standard
input and output).
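
The divide, process in parallel, and combine flow described above can be illustrated with a small Python sketch. It is only an analogy: threads in one process stand in for the machines of a cluster, and the function names, the block size, and the choice of summing numbers are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(items, block_size):
    # Divide the input into consecutive logical blocks.
    return [items[i:i + block_size] for i in range(0, len(items), block_size)]


def process_block(block):
    # Work done independently on one block (here: a partial sum).
    return sum(block)


def parallel_sum(items, block_size=3, workers=4):
    blocks = split_into_blocks(items, block_size)
    # Each worker thread stands in for one machine of the cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(process_block, blocks))
    # Combine the partial results into the final result.
    return sum(partial_results)


data = list(range(1, 11))
result = parallel_sum(data)
print(result)  # 55
```

Because each block is processed independently, a failed block could simply be processed again without repeating the others, which is the basic idea behind the fault tolerance mentioned above.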
Approach Used

• Hadoop is an open-source Java framework for processing
large amounts of data on clusters of machines [1].
• Hadoop is an implementation of Google's MapReduce
programming model.
• Yahoo! is the biggest contributor to Hadoop [5].
• Hadoop has two main components:
• Hadoop Distributed File System (HDFS)
• MapReduce
HDFS
• HDFS provides support for distributed storage [1].
• As in a traditional file system, files can be deleted,
renamed, and so on.
• HDFS has two types of nodes:
• Name Node
• Data Node
Name Node
• The Name Node provides the main data services.
• It is a process that runs on a separate machine.
• It stores only the metadata of the files and directories.
• Programs access files through it.
• For reliability of the file system, multiple copies of each
file block are kept (on the Data Nodes; the Name Node records
where they are).
Data Node
• The Data Node is a process that runs on each individual machine of
the cluster.
• File blocks are stored in the local file system of these
nodes.
• It periodically sends the metadata of its stored blocks to the
Name Node.
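
The division of labour between the Name Node and the Data Nodes can be made concrete with a toy in-memory model. This is a minimal sketch and not HDFS code: the classes, the `write_file` client helper, the round-robin placement, and the block naming scheme are all hypothetical, and it assumes the replication factor is no larger than the number of Data Nodes.

```python
class DataNode:
    """Stores block contents; knows nothing about files."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> bytes

    def store(self, block_id, data):
        self.blocks[block_id] = data

    def block_report(self):
        # Metadata about stored blocks, sent periodically to the Name Node.
        return sorted(self.blocks)


class NameNode:
    """Stores only metadata: which blocks make up a file and where they are."""

    def __init__(self, replication=2):
        self.replication = replication
        self.file_blocks = {}      # file name -> list of block ids
        self.block_locations = {}  # block id -> set of Data Node names

    def register_file(self, filename, block_ids):
        self.file_blocks[filename] = block_ids

    def receive_report(self, node):
        for block_id in node.block_report():
            self.block_locations.setdefault(block_id, set()).add(node.name)

    def locate(self, filename):
        return {
            block_id: sorted(self.block_locations.get(block_id, ()))
            for block_id in self.file_blocks[filename]
        }


def write_file(name_node, data_nodes, filename, data, block_size):
    """Hypothetical client: split data into blocks, place replicas round-robin."""
    block_ids = []
    for index, start in enumerate(range(0, len(data), block_size)):
        block_id = f"{filename}#{index}"
        block_ids.append(block_id)
        for offset in range(name_node.replication):
            node = data_nodes[(index + offset) % len(data_nodes)]
            node.store(block_id, data[start:start + block_size])
    name_node.register_file(filename, block_ids)


nodes = [DataNode("dn1"), DataNode("dn2"), DataNode("dn3")]
name_node = NameNode(replication=2)
write_file(name_node, nodes, "log.txt", b"abcdefghij", block_size=4)
for node in nodes:
    name_node.receive_report(node)
locations = name_node.locate("log.txt")
print(locations)
```

Note that the Name Node object never holds file contents. It only learns block locations from the reports, which mirrors the metadata-only role described above.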
