Chapter 7


UNIVERSITY INSTITUTE OF ENGINEERING

COMPUTER SCIENCE ENGINEERING


Bachelor of Engineering (Computer Science & Engineering)
Subject Name: Cloud Computing & Distributed Systems
Subject Code: 21CST-378/21ITT-378

INTRODUCTION TO IOT DISCOVER . LEARN . EMPOWER


Cloud Computing & Distributed Systems
Course Outcomes
CO1: Understand the various paradigms of cloud computing
and distributed systems.
CO2: Articulate the basic concepts, key technologies,
strengths and limitations of cloud computing and its possible
applications.
CO3: Appraise the architecture and infrastructure of cloud
computing, including SaaS, PaaS, IaaS, UCaaS/FaaS, public
cloud, private cloud and hybrid cloud.
CO4: Interpret various data, scalability, security and cloud
services to acquire an efficient database for cloud storage.
CO5: Develop appropriate cloud computing solutions
and recommendations according to the application used.

Contents
Map Reduce and GFS
Big data and Hadoop
Different modules of Data stores
Micro services: Kubernetes
Server less Computation with Open Lambda
Geo distribution
Scaling Distributed Machine Learning with the Parameter Server
Library from Apache
Open Source Cloud Software Systems
Eucalyptus
Nimbus
Open Nebula
AWS
EC2

UNIT 3
CLOUD SECURITY SERVICES

CHAPTER 7
Advanced Topics in Cloud Computing (Beyond the Syllabus)
To be presented by the Assigned Students
(Included in the syllabus)

Map Reduce and GFS (CO-5)

Using MapReduce to Compute PageRank
MapReduce is a data processing model introduced by Google in the early 2000s. It is well suited to
parallel, distributed processing of large data sets. It has three phases: Mapping, Shuffling and
Reducing.
The Mapping phase takes high-volume input (usually a GFS/HDFS file) and breaks it into key-value
pairs. The Shuffling phase takes the outputs of the Mapping phase and sorts them by key, so that all
pairs with the same key are sent to the same place.
The Reducing phase takes the outputs of the Shuffling phase, performs a computation on them
(programmable according to one's needs), and finally stores the results.
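The three phases can be illustrated with a small single-machine sketch. It is only a toy model of the data flow, not Google's or Hadoop's implementation: the word-count task and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are assumptions chosen for illustration, and a real framework would run each phase across many machines.

```python
from collections import defaultdict


def map_phase(documents):
    """Mapping: break the input into (key, value) pairs, here (word, 1)."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffling: group all values that share a key, then sort by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(grouped):
    """Reducing: run the user-defined computation per key (here, a sum)."""
    return {key: sum(values) for key, values in grouped}


def word_count(documents):
    """Chain the three phases on an in-memory list of documents."""
    return reduce_phase(shuffle_phase(map_phase(documents)))


if __name__ == "__main__":
    print(word_count(["the cat sat", "the dog sat"]))
```

Only the mapping and reducing functions change from one job to another; grouping by key in the shuffle step stays the same, which is what lets a framework distribute the work.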
MapReduce allows distributed computing, meaning a program can run on multiple computers to improve
efficiency. This is useful because it means, for example, that we can process huge volumes of webpage
data for a search engine: webpage data collected by Google's crawlers can be fed into the MapReduce
model to calculate page ranks under the PageRank algorithm, and the results stored in a Bigtable.
However, MapReduce also has weaknesses. First, the original model cannot handle stream (real-time)
data: it can only process a batch of data after the previous batch has finished its calculation. In the
search engine scenario, this means that we cannot crawl webpages and run the MapReduce calculation
that stores the results at the same time. Second, MapReduce is relatively slow, because mapping and
reducing each take a relatively long time. This results in very high latency in real-time data
processing or search scenarios when no previously stored results are available.
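As a rough sketch of how PageRank fits the map and reduce steps, the example below runs the iteration in memory on a tiny link graph. It makes several assumptions that are not in the slides: the damping factor of 0.85, the fixed number of iterations, the function names, and the simplification that every page has at least one outgoing link. In a real deployment each iteration would be a separate batch job over files in GFS/HDFS.

```python
from collections import defaultdict

DAMPING = 0.85  # conventional value; an assumption, not taken from the slides


def pagerank_map(rank, out_links):
    """Map: a page splits its current rank evenly among the pages it links to."""
    share = rank / len(out_links)
    for target in out_links:
        yield target, share


def pagerank_reduce(contributions, num_pages):
    """Reduce: combine the shares received by one page into its new rank."""
    return (1 - DAMPING) / num_pages + DAMPING * sum(contributions)


def pagerank_iteration(graph, ranks):
    """One batch pass: map every page, group shares by target, then reduce."""
    grouped = defaultdict(list)  # plays the role of the shuffle step
    for page, out_links in graph.items():
        for target, share in pagerank_map(ranks[page], out_links):
            grouped[target].append(share)
    return {page: pagerank_reduce(grouped[page], len(graph)) for page in graph}


def pagerank(graph, iterations=20):
    """Start from uniform ranks and repeat the batch pass a fixed number of times."""
    ranks = {page: 1 / len(graph) for page in graph}
    for _ in range(iterations):
        ranks = pagerank_iteration(graph, ranks)
    return ranks


if __name__ == "__main__":
    links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}  # hypothetical link graph
    for page, rank in sorted(pagerank(links).items()):
        print(page, round(rank, 3))
```

Each iteration needs the complete output of the previous one, which reflects the batch-only limitation described above: new crawl data cannot be folded in until the current pass has finished.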
Big data and Hadoop (CO-5)

Different modules of Data stores (CO-5)

Micro services: Kubernetes (CO-5)

Server less Computation with Open Lambda (CO-5)

Geo Distribution (CO-5)

YoMo: An open-source streaming serverless framework for building geo-distributed applications that aim for
<50 ms global latency for their users.

Macrometa: Macrometa solves complex, geo-distributed data challenges.

Scaling Distributed Machine Learning with
the Parameter Server (CO-5)

Library from Apache (CO-5)

Open Source Cloud Software Systems (CO-5)

Eucalyptus (CO-5)

Nimbus (CO-5)

Open Nebula (CO-5)

AWS (CO-5)

EC2 (CO-5)

Summary
In recent years, the field of cloud computing has experienced remarkable advancements, revolutionizing
the way individuals and businesses manage and access data. The adoption of cloud technologies has
become pervasive, offering unparalleled flexibility, scalability, and efficiency. Infrastructure as a Service
(IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) have become integral components of
the cloud landscape, empowering users to leverage computing resources without the need for extensive
on-premises infrastructure. The development of edge computing has further extended the capabilities of
cloud systems by enabling data processing closer to the source, reducing latency and enhancing real-time
applications. Additionally, the integration of artificial intelligence and machine learning within cloud
platforms has ushered in a new era of intelligent computing, enabling advanced analytics, automation,
and predictive capabilities. As cybersecurity remains a priority, cloud providers have
implemented security measures to help protect sensitive data. The ongoing evolution
of cloud technologies promises continued innovation, driving digital transformation and shaping the
future of computing.

QUIZ
1. What does EC2 stand for?
a. Elastic Container Cloud
b. Elastic Compute Cloud
c. Elastic Cloud Computing
d. Elastic Container Computing
2. Which of the following is not a valid EC2 instance type?
a. t2.micro
b. m5.large
c. c4.xlarge
d. r3.nano
3. What is an Amazon Machine Image (AMI) in EC2?
a. A virtual server instance
b. A snapshot of an EC2 instance
c. A type of EC2 instance
d. An elastic IP address
4. Which AWS service is used for load balancing across multiple EC2 instances?
a. Amazon RDS
b. Amazon S3
c. Amazon ELB (Elastic Load Balancer)
d. Amazon SQS
5. What is the purpose of an Elastic IP (EIP) in EC2?
a. To provide additional storage for EC2 instances
b. To allow dynamic scaling of EC2 instances
c. To assign a static IP address to an EC2 instance
d. To connect EC2 instances to an on-premises network
THANK YOU
