UNIT-4 CLOUD PROGRAMMING AND SOFTWARE ENVIRONMENT
Cloud Capabilities
Thanks to its speed, scale, and capacity, the cloud offers more functionality with more
automation than nearly every on-premises solution. This is for a few reasons:
• Cloud is built around services. The more you have on-premises, the more you
and your technology need to be jacks-of-all-trades: you have to handle
everything yourselves. In the cloud's service-based model, by contrast, you have
access to individualized services optimized for specific functions instead of one
centralized group of servers and staff. That means Cloud Service Providers
(CSPs) focus on providing discrete elements of the cloud experience, and both
hardware and software are designed for performance.
• Cloud is automated. The computing power inherent to the cloud means that
everything mentioned above should be available to you on demand and in a self-
service manner; the CSP's experts who provide the computing power should be
behind the scenes, not in between you and the technology. (Remember,
on-demand self-service is one of NIST's five essential characteristics of cloud
computing.)
• Cloud (usually) oversees your networking. For the most part, networking, the
joining up of resources within a cloud environment, will be managed by your
Cloud Service Provider (CSP). The CSP has staff dedicated to monitoring and
optimizing cloud resources and, as the keeper of the infrastructure, balances
those resources (e.g., compute, storage, network). The CSP manages resources
within your defined configurations and in line with agreed-upon Service Level
Agreements (SLAs), freeing you from having to monitor, manage, and optimize
the resources yourself.
• Containers are close cousins of virtual machines, one step up the stack: rather
than each bundling its own operating system, they all share the host's operating
system. They're faster to spin up and down (and faster to destroy after they're
done) than virtual machines, which makes them much more efficient while
requiring fewer resources at the same time.
• Clusters in the cloud are, of course, virtual clusters: they follow the same rules
as physical clusters, and they're housed within a portion of a data center that's
been assigned to you. In a physical cluster, you're limited to the machines you
have on-premises, and if one machine fails, there can be ramifications for the
rest of the system. Virtual clusters can be made of physical or virtual machines
situated anywhere in the world, they can be spun up and down on demand, and
they aren't likely to be brought down by the failure of any one individual
component.
• Load balancing provides automatic oversight of every instance in your
production environment and makes sure no server is receiving an undue
amount of strain or reaching its capacity. This is rarely practical on-premises;
the increased latency and lack of scale wouldn't be worth the investment.
• Servers in a cloud context typically mean "space rented on a server." Renting
space on a server reserves that resource for you until you say otherwise; you can
scale up and down on demand, but there's always capacity ready for you at a
moment's notice.
• Serverless is a cloud computing service model where functions of code are the
unit of deployment; there are no machines, VMs, or containers to manage.
Computing power exists for you only when you need it: no resources are
"reserved" for you, and code is dynamically run as needed and then destroyed
when done (see the sketch after this list). Since there are no VMs or even
containers to manage, serverless computing has minimal operational overhead,
making it an ideal platform for developers' dev and test environments.
Serverless tends to be more affordable for infrequently or sporadically used
applications, since you pay only for code as it runs, but it can be slower than
traditional cloud services because no reserved resources are dedicated to your
applications. For these reasons, applications in continuous use, or that are
time-sensitive, are better served by more traditional cloud service models.
• Storage in a traditional data center, whether for databases or files, means
having a physical computer with an operating system and a configuration to run
your files. Storage in the cloud is fundamentally the same, but the resources
aren't usually located in the same place physically: the code might be
somewhere optimized to run code, the hardware somewhere else optimized for
hardware, and the operating system in a third place optimized to run operating
systems.
• Virtualization is effectively the same thing as having virtual machines (i.e.,
multiple instances of an operating system on one physical computer). Instead of
running these virtual machines yourself, however, your CSP will typically provide
the virtualization for you. (Note: software licensing can become sticky under
virtualization, because providers may charge you based on how many virtual
instances you have instead of how many physical computers, since you don't
own those computers yourself anymore.)
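To make the serverless bullet above concrete, here is a minimal sketch of a function-as-a-service handler in the style of the AWS Lambda Python runtime. The event payload shown is a hypothetical example; the platform provisions compute only for the duration of each invocation.

```python
# Minimal sketch of a serverless function in the style of AWS Lambda's
# Python runtime. The platform invokes handler() on demand and tears the
# compute down afterwards; there are no servers or containers to manage.
# The event fields below are a hypothetical example payload.

def handler(event, context):
    # 'event' carries the trigger's payload; 'context' carries runtime metadata.
    name = event.get("name", "world")
    # Business logic runs only while this invocation is alive.
    return {"statusCode": 200, "body": f"Hello, {name}!"}
```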
For example, meteorologists use grid computing for weather modeling. Weather
modeling is a computation-intensive problem that requires complex data management
and analysis. Processing massive amounts of weather data on a single computer is
slow and time consuming. That’s why meteorologists run the analysis over
geographically dispersed grid computing infrastructure and combine the results.
Efficiency
With grid computing, you can break down an enormous, complex task into multiple
subtasks. Multiple computers can work on the subtasks concurrently, making grid
computing an efficient computational solution.
Cost
Grid computing works with existing hardware, which means you can reuse existing
computers. You can save costs while accessing your excess computational resources.
You can also cost-effectively access resources from the cloud.
Flexibility
Grid computing is not constrained to a specific building or location. You can set up a
grid computing network that spans several regions. This allows researchers in different
countries to work collaboratively with the same supercomputing power.
Financial institutions use grid computing primarily to solve problems involving risk
management. By harnessing the combined computing powers in the grid, they can
shorten the duration of forecasting portfolio changes in volatile markets.
Gaming
Entertainment
Some movies have complex special effects that require a powerful computer to create.
The special effects designers use grid computing to speed up the production timeline.
They have grid-supported software that shares computational resources to render the
special-effect graphics.
Engineering
Engineers use grid computing to perform simulations, create models, and analyze
designs. They run specialized applications concurrently on multiple machines to
process massive amounts of data. For example, engineers use grid computing to
reduce the duration of a Monte Carlo simulation, a process that uses repeated
random sampling over past data to make future predictions.
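As a toy illustration of the Monte Carlo idea (a sketch, not an engineering-grade model), the Python snippet below estimates π by random sampling. Because every sample is independent, a grid can split the samples across nodes and simply combine the per-node results.

```python
import random

def monte_carlo_pi(samples: int) -> float:
    """Estimate pi by sampling random points in the unit square."""
    inside = 0
    for _ in range(samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:  # point falls inside the quarter circle
            inside += 1
    return 4.0 * inside / samples

# On a grid, each node would run monte_carlo_pi on its own batch of samples,
# and a coordinator would average the per-node estimates.
print(monte_carlo_pi(1_000_000))
```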
In grid computing, a network of computers works together to perform the same task.
The following are the components of a grid computing network.
Nodes
The computers or servers on a grid computing network are called nodes. Each node
offers unused computing resources such as CPU, memory, and storage to the grid
network. At the same time, you can also use the nodes to perform other unrelated
tasks. There is no limit to the number of nodes in grid computing. There are three main
types of nodes: control, provider, and user nodes.
Grid middleware
Grid middleware is the specialized software that sits between high-level
applications and the grid's computing resources, managing the sharing of those
resources across the network. Grid architecture, the internal structure of a grid
node, is broadly layered: applications at the top, middleware beneath them, and
the underlying computing resources at the bottom.
Grid nodes and middleware work together to perform the grid computing task. In grid
operations, the three main types of grid nodes perform three different roles.
User node
A user node is a computer that requests resources shared by other computers in grid
computing. When the user node requires additional resources, the request goes
through the middleware and is delivered to other nodes on the grid computing system.
Provider node
A provider node is a computer that shares its resources for grid computing. When
provider machines receive resource requests, they perform subtasks for the user
nodes, such as forecasting stock prices for different markets. At the end of the
process, the middleware collects and compiles all the results to obtain a global
forecast. In grid computing, nodes can often switch between the roles of user
and provider.
Control node
A control node administers the network and manages the allocation of the grid
computing resources. The middleware runs on the control node. When the user node
requests a resource, the middleware checks for available resources and assigns the
task to a specific provider node.
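To tie the three roles together, here is a toy, single-process Python sketch of how a control node's middleware might assign subtasks to provider nodes and compile the results for a user node. The class and method names are illustrative assumptions, not a real grid middleware API.

```python
# Toy simulation of grid scheduling: a control node's middleware takes a
# user node's request, splits it into subtasks, hands them to provider
# nodes, and compiles the results. All names here are illustrative.
from itertools import cycle

class ProviderNode:
    def __init__(self, name: str):
        self.name = name

    def run(self, subtask: str) -> str:
        # A provider performs one subtask, e.g. one market's forecast.
        return f"{self.name} finished {subtask}"

class ControlNode:
    """Plays the middleware role: allocates subtasks to providers."""
    def __init__(self, providers):
        self.providers = cycle(providers)  # simple round-robin allocation

    def handle_request(self, subtasks):
        # Assign each subtask to the next provider and collect results,
        # as the middleware would before replying to the user node.
        return [next(self.providers).run(task) for task in subtasks]

grid = ControlNode([ProviderNode("provider-1"), ProviderNode("provider-2")])
for result in grid.handle_request(["forecast-US", "forecast-EU", "forecast-APAC"]):
    print(result)
```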
Computational grid
A computational grid pools high-performance machines, letting users draw on
their combined computing power for large tasks.
Scavenging grid
While similar to computational grids, CPU scavenging grids have many regular
computers. The term scavenging describes the process of searching for available
computing resources in a network of regular computers. While other network users
access the computers for non-grid–related tasks, the grid software uses these nodes
when they are free. The scavenging grid is also known as CPU scavenging or cycle
scavenging.
Data grid
A data grid is a grid computing network that connects multiple computers to provide
large data storage capacity. You can access the stored data as if it were on your local
machine, without having to worry about the physical location of your data on the grid.
The amount of data generated and collected today is growing exponentially. It’s not
only more varied, but also wildly disparate. Data can now reside across on-premises
databases and distributed cloud applications and services, making it difficult to
integrate using traditional approaches. In addition, real-time data processing is
becoming essential to business success—delays and lags in data delivery to mission-
critical applications could have catastrophic consequences.
As cloud adoption accelerates and the way we use data continues to evolve, legacy
databases face significant challenges.
While the benefits of cloud databases can help organizations address many modern
obstacles that impede growth and digital transformation, there are some common
considerations of cloud databases to keep in mind as you plan your migration to the
cloud.
• Vendor lock-in
• Difficulty integrating data with other systems
• Complex and lengthy migrations
• Underestimating cloud costs
• Possibility of connection downtime
• Cloud security concerns
Below are the most popular, industry-leading programming languages that support
cloud infrastructure development.
Java
Java is an all-in-one developer toolset for building websites, desktop applications,
Android and iOS apps, and games. The language offers a resource-rich library to
support all programming tasks.
Java is the standard choice among cloud infrastructure developers for large-scale,
enterprise-grade applications.
Java offers robust security features, a large developer community, and excellent
compatibility with cloud platforms such as AWS, Azure, and Google Cloud, making it a
preferred choice for those looking to develop websites and deploy scalable
applications seamlessly. Beyond its established role in cloud computing, Java's
versatility extends to other domains, such as web scraping.
Python
Python has emerged as one of the leading languages for cloud computing due to its
ease of use, performance, open-source development, third-party integrations, and
popularity among developers.
Upskilling yourself with Python and its libraries can significantly increase your chances
of landing well-paid jobs and joining the community of cloud computing professionals.
Supported by AWS Lambda, Python is used for serverless computing in AWS Cloud. It
offers dedicated libraries to automate cloud-based workflows, perform data analysis,
and build cloud-native apps.
It includes:
• Boto3 SDK – This AWS SDK (Software Development Kit) for Python allows
developers to access various AWS services via a simple API.
• Apache Libcloud – An all-rounder cloud computing library in Python that offers a
unified API to interact with different cloud vendors, including AWS, Microsoft
Azure, and Google Cloud.
• OpenStack SDK – A complete user-oriented SDK package, including all
OpenStack Python libraries, to automate cloud-based workflows such as
creating virtual machines and managing network configurations.
• Pycloud – A pipeline for cloud computing to implement complex data analytics
on the cloud with the pCloud API.
• Google Cloud Client Library – A Python library to access Google Cloud services,
including Google Cloud Storage, Google Cloud Datastore, and Google Cloud
Pub/Sub.
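As a small, hedged example of the Boto3 SDK from the list above, the snippet below lists the account's S3 buckets and uploads a file. It assumes AWS credentials are already configured (environment variables, shared config files, or an IAM role); the bucket and file names are placeholders.

```python
import boto3

# Create an S3 client; credentials are resolved from the environment,
# shared config files, or an attached IAM role.
s3 = boto3.client("s3")

# List the buckets owned by the account.
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])

# Upload a local file to a bucket (names here are placeholders).
s3.upload_file("report.csv", "my-example-bucket", "reports/report.csv")
```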
.NET
ASP.NET, built on Microsoft's .NET platform, is widely used for web development and
cloud-native applications. The framework is known for its wide-scale adoption,
thanks to its straightforward development of dynamic web pages.
A large community of .NET developers and plentiful resource material make the
onboarding and development journey easier for newcomers and experts alike.
Go
Go is a fast, simple programming language with an easy-to-adopt syntax and
cross-platform compatibility. Moreover, Go offers a unique combination of robust
C/C++-like performance, Python's simplicity, and Java's efficient concurrency
handling.
JavaScript
JavaScript, along with HTML and CSS, was instrumental in the development of the
internet. It has matured into a high-level, multi-paradigm language, driving front-end
development for web and Node.js development for cloud-native applications. Its
evolution reflects a broader trend towards using versatile, scalable languages in cloud
computing, underscoring the importance of JavaScript and Node.js in modern
development stacks.
It provides dynamic interactivity for web pages, including alerts, events, notifications,
and pop-ups. It is also well suited to serverless computing, as it allows developers to
easily trigger and respond to events, such as changes in data or user interactions.
All major cloud platforms, including AWS Lambda and Google Cloud Functions,
support JavaScript.
Ruby on Rails
Ruby on Rails is a web development framework known for producing a clean and
streamlined codebase, making it easier to implement new features.
The framework is ideal for developing complex SaaS and marketplace platforms;
Shopify, GitHub, and Zendesk all use Ruby to build SaaS products.
• Easy for newcomers to learn and implement.
• Open source, with extensive, easily accessible libraries from the Ruby on Rails
developer community.
• Supports multi-threading to facilitate fast processing.
Runtime Support
Google provides support for a runtime during its general availability (GA) period.
During this support window:
• Runtime components are regularly updated with security and bug fixes. Updates
are applied in accordance with your function's security update policy.
• To maintain stability, Cloud Functions avoids introducing breaking features or
changes into the runtime. Any breaking changes are announced in advance in
the Cloud Functions release notes.
Parallel computing also helps with faster application processing and task resolution
by increasing the computation power available to a system. Most supercomputers
operate on parallel computing principles, and parallel processing is commonly used
in operational scenarios that demand massive processing power or computation.
There are many reasons to use parallel computing, such as saving time and money,
providing concurrency, and solving larger problems. Furthermore, parallel computing
reduces complexity. As a real-life example, picture two queues for tickets: if two
cashiers are serving two people simultaneously, time is saved and the wait is simpler
for everyone. A code sketch of the same idea follows below.
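The two-cashier analogy maps directly onto code. Below is a minimal Python sketch using the standard multiprocessing module: one large task is split into subtasks that four worker processes execute concurrently. The square() function is just a stand-in for real work.

```python
from multiprocessing import Pool

def square(n: int) -> int:
    # A stand-in for one subtask of a larger computation.
    return n * n

if __name__ == "__main__":
    numbers = range(1_000_000)
    # Four worker processes act like four cashiers serving the queue at once.
    with Pool(processes=4) as pool:
        results = pool.map(square, numbers)
    print(sum(results))
```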
From the open-source and proprietary parallel computing vendors, there are generally
three types of parallel computing available: bit-level parallelism, instruction-level
parallelism, and task parallelism.
Applications of parallel computing
o One of the primary applications of parallel computing is databases and data
mining.
o Real-time simulation of systems is another use of parallel computing.
o Technologies such as networked video and multimedia.
o Science and engineering.
o Collaborative work environments.
o Augmented reality, advanced graphics, and virtual reality all use the concept of
parallel computing.
Advantages of parallel computing
o In parallel computing, more resources are used to complete a task, which
shortens completion time and can cut costs. Cheap components can also be
used to construct parallel clusters.
o Compared with serial computing, parallel computing can solve larger problems
in a shorter time.
o For simulating, modeling, and understanding complex, real-world phenomena,
parallel computing is much more appropriate than serial computing.
o When local resources are finite, parallel computing can offer the benefit of
drawing on non-local resources.
o Many problems are so large that it is impractical or impossible to solve them on
a single computer; parallel computing removes these limits.
o One of the best advantages of parallel computing is that it allows you to do
several things at once using multiple computing resources.
o Furthermore, parallel computing makes better use of the hardware, whereas
serial computing wastes potential computing power.
What is MapReduce?
With MapReduce, rather than sending data to where the application or logic
resides, the logic is executed on the server where the data already resides, to
expedite processing. Data access and storage are disk-based: the input is usually
stored as files containing structured, semi-structured, or unstructured data, and the
output is also stored in files.
MapReduce was once the only method through which the data stored in the HDFS
could be retrieved, but that is no longer the case. Today, there are other query-based
systems such as Hive and Pig that are used to retrieve data from the HDFS using SQL-
like statements. However, these usually run along with jobs that are written using the
MapReduce model. That's because MapReduce has unique advantages.
At the crux of MapReduce are two functions: Map and Reduce. They are sequenced one
after the other.
• The Map function takes input from the disk as <key,value> pairs, processes
them, and produces another set of intermediate <key,value> pairs as
output.
• The Reduce function also takes inputs as <key,value> pairs, and produces
<key,value> pairs as output.
The types of keys and values differ based on the use case. All inputs and outputs are
stored in the HDFS. While the map is a mandatory step to filter and sort the initial data,
the reduce function is optional.
Mappers and Reducers are the Hadoop servers that run the Map and Reduce functions
respectively. It doesn’t matter if these are the same or different servers.
Map
The input data is first split into smaller blocks. Each block is then assigned to a mapper
for processing.
For example, if a file has 100 records to be processed, 100 mappers can run together to
process one record each. Or maybe 50 mappers can run together to process two
records each. The Hadoop framework decides how many mappers to use, based on the
size of the data to be processed and the memory block available on each mapper
server.
Reduce
After all the mappers complete processing, the framework shuffles and sorts the
results before passing them on to the reducers. A reducer cannot start while a mapper
is still in progress. All the map output values that have the same key are assigned to a
single reducer, which then aggregates the values for that key.
Combine
A combiner is an optional, local reducer that runs on each mapper server and
condenses that mapper's output before sending it downstream. This makes shuffling
and sorting easier, as there is less data to work with. Often, the combiner class is set
to the reducer class itself, which works when the reduce function is cumulative and
associative. However, if needed, the combiner can be a separate class as well.
Partition
Partitioning is the process that translates the <key, value> pairs coming from the
mappers (or combiners) to another set of <key, value> pairs to feed into the reducer. It
decides how the data has to be presented to the reducer and also assigns it to a
particular reducer.
The default partitioner determines the hash value for the key produced by the
mapper and assigns a partition based on this hash value. There are as many partitions
as there are reducers, so once the partitioning is complete, the data from each
partition is sent to a specific reducer. A minimal sketch of this rule appears below.
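In code, the default partitioning rule reduces to a hash modulo the number of reducers. The Python sketch below mirrors Hadoop's default hash-based partitioner for illustration only.

```python
def default_partition(key: str, num_reducers: int) -> int:
    """Mirror of the default hash partitioner: hash the key and take it
    modulo the number of reducers, so a given key always reaches the
    same reducer within a job run."""
    # Note: Python salts string hashes per process; Hadoop's HashPartitioner
    # uses the key's hashCode() instead, which is stable across runs.
    return hash(key) % num_reducers

# With three reducers, all <"Exception A", 1> pairs land on one reducer.
print(default_partition("Exception A", 3))
```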
A MapReduce Example
Consider an ecommerce system that receives a million requests every day to process
payments. There may be several exceptions thrown during these requests such as
"payment declined by a payment gateway," "out of inventory," and "invalid address." A
developer wants to analyze the last four days' logs to understand how many times
each exception was thrown.
The objective is to isolate use cases that are most prone to errors, and to take
appropriate action. For example, if the same payment gateway is frequently throwing
an exception, is it because of an unreliable service or a badly written interface? If the
"out of inventory" exception is thrown often, does it mean the inventory calculation
service has to be improved, or does the inventory stocks need to be increased for
certain products?
The developer can ask relevant questions and determine the right course of action. To
perform this analysis on logs that are bulky, with millions of records, MapReduce is an
apt programming model. Multiple mappers can process these logs simultaneously:
one mapper could process a day's log or a subset of it based on the log size and the
memory block available for processing in the mapper server.
Map
For simplification, let's assume that the Hadoop framework runs just four mappers:
Mapper 1, Mapper 2, Mapper 3, and Mapper 4.
The value input to a mapper is one record of the log file. The key could be a text string
such as "file name + line number." The mapper then processes each record of the log
file to produce key-value pairs; here, we simply use '1' as a filler value. The
output from the mappers looks like this:
Mapper 1 -> <Exception A, 1>, <Exception B, 1>, <Exception A, 1>, <Exception C, 1>,
<Exception A, 1>
Mapper 2 -> <Exception B, 1>, <Exception B, 1>, <Exception A, 1>, <Exception A, 1>
Mapper 3 -> <Exception A, 1>, <Exception C, 1>, <Exception A, 1>, <Exception B, 1>,
<Exception A, 1>
Mapper 4 -> <Exception B, 1>, <Exception C, 1>, <Exception C, 1>, <Exception A, 1>
Combine
A combiner runs on each mapper's output, totaling the local counts before the data
leaves the mapper:
Mapper 1 -> <Exception A, 3>, <Exception B, 1>, <Exception C, 1>
Mapper 2 -> <Exception A, 2>, <Exception B, 2>
Mapper 3 -> <Exception A, 3>, <Exception B, 1>, <Exception C, 1>
Mapper 4 -> <Exception A, 1>, <Exception B, 1>, <Exception C, 2>
Partition
After this, the partitioner allocates the data from the combiners to the reducers, and
the data is also sorted for the reducers.
If there were no combiners involved, the input to the reducers would be as below:
Reducer 1: <Exception A> {1,1,1,1,1,1,1,1,1}
Reducer 2: <Exception B> {1,1,1,1,1}
Reducer 3: <Exception C> {1,1,1,1}
This example is a simple one, but when there are terabytes of data involved, the
combiner's improvement to the bandwidth is significant.
Reduce
Now, each reducer just calculates the total count of its exception:
Reducer 1: <Exception A, 9>
Reducer 2: <Exception B, 5>
Reducer 3: <Exception C, 4>
The data shows that Exception A is thrown more often than the others and requires
more attention. When there are weeks' or even months' worth of data to be processed
together, the potential of the MapReduce program can be truly exploited. A compact
simulation of this whole pipeline follows below.
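The exception-count pipeline above can be imitated in a few lines of Python. This is a single-process sketch of the programming model, not Hadoop itself; in a real job, the map, shuffle, and reduce phases run distributed across mapper and reducer servers, and the log records here are placeholders.

```python
from collections import defaultdict

# Map: emit <exception, 1> for each log record (records are placeholders).
def map_phase(records):
    return [(record, 1) for record in records]

# Shuffle/sort: group all values by key, as the framework does between phases.
def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce: total the counts for each exception.
def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

logs = ["Exception A", "Exception B", "Exception A", "Exception C",
        "Exception A", "Exception B", "Exception B", "Exception A",
        "Exception A"]
print(reduce_phase(shuffle(map_phase(logs))))
# -> {'Exception A': 5, 'Exception B': 3, 'Exception C': 1}
```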
Advantages of MapReduce
1. Scalability
2. Flexibility
3. Security and authentication
4. Faster processing of data
5. Very simple programming model
6. Availability and resilient nature
Hadoop
Hadoop is an open source framework based on Java that manages the storage and
processing of large amounts of data for applications. Hadoop uses distributed storage
and parallel processing to handle big data and analytics jobs, breaking workloads
down into smaller workloads that can be run at the same time.
Four modules comprise the primary Hadoop framework and work collectively to form
the Hadoop ecosystem:
Hadoop Distributed File System (HDFS): As the primary component of the Hadoop
ecosystem, HDFS is a distributed file system in which individual Hadoop nodes
operate on data that resides in their local storage. This removes network latency,
providing high-throughput access to application data. In addition, administrators don’t
need to define schemas up front.
Yet Another Resource Negotiator (YARN): YARN is a resource-management platform
responsible for managing compute resources in clusters and using them to schedule
users' applications. It performs scheduling and resource allocation across the Hadoop
system.
MapReduce: MapReduce is the programming model, described above, that lets
applications process data in parallel: map tasks convert input data into intermediate
key/value pairs, and reduce tasks aggregate that output.
Hadoop Common: Hadoop Common includes the libraries and utilities used and
shared by the other Hadoop modules.
Beyond HDFS, YARN, and MapReduce, the entire Hadoop open source ecosystem
continues to grow and includes many tools and applications to help collect, store,
process, analyze, and manage big data. These include Apache Pig, Apache Hive,
Apache HBase, Apache Spark, Presto, and Apache Zeppelin.
How does Hadoop work?
Software clients input data into Hadoop. HDFS handles metadata and the distributed
file system. MapReduce then processes and converts the data. Finally, YARN divides
the jobs across the computing cluster.
All Hadoop modules are designed with a fundamental assumption that hardware
failures of individual machines or racks of machines are common and should be
automatically handled in software by the framework.
What are the benefits of Hadoop?
Scalability
Hadoop is important as one of the primary tools to store and process huge amounts
of data quickly. It does this by using a distributed computing model which enables the
fast processing of data that can be rapidly scaled by adding computing nodes.
Low cost
As an open source framework that can run on commodity hardware and has a large
ecosystem of tools, Hadoop is a low-cost option for the storage and management of
big data.
Flexibility
Hadoop allows for flexibility in data storage, as data does not require preprocessing
before being stored. An organization can store as much data as it likes and decide
how to utilize it later.
Resilience
As a distributed computing model, Hadoop allows for fault tolerance and system
resilience: if one of the hardware nodes fails, jobs are redirected to other nodes. Data
stored on one Hadoop cluster is replicated across other nodes within the system to
guard against hardware or software failure.
What are the challenges of Hadoop?
Complexity
As a file-intensive system, MapReduce can be a difficult tool to utilize for complex jobs,
such as interactive analytical tasks. MapReduce functions also need to be written in
Java and can require a steep learning curve. The MapReduce ecosystem is quite large,
with many components for different functions, which can make it difficult to determine
which tools to use.
Security
Data sensitivity and protection can be issues as Hadoop handles such large datasets.
An ecosystem of tools for authentication, encryption, auditing, and provisioning has
emerged to help developers secure data in Hadoop.
Governance and management
Hadoop does not have many robust tools for data management and governance, nor
for data quality and standardization.
Talent gap
Like many areas of programming, Hadoop has an acknowledged talent gap. Finding
developers with the combined requisite skills in Java to program MapReduce, operating
systems, and hardware can be difficult. In addition, MapReduce has a steep learning
curve, making it hard to get new programmers up to speed on its best practices and
ecosystem.
GFS is a scalable distributed file system developed by Google for its large data-
intensive applications.
GFS was built for handling batch processing on large data sets and is designed for
system-to-system interaction, not user-to-system interaction.
• Scalable: GFS should run reliably on a very large system built from commodity
hardware.
• Fault-tolerant: The design must be sufficiently tolerant of hardware and
software failures to enable application-level services to continue their
operation in the face of any likely combination of failure conditions.
• Large files: Files stored in GFS will be huge. Multi-GB files are common.
• Large sequential and small random reads: The workloads primarily consist of
two kinds of reads: large, streaming reads and small, random reads.
• Sequential writes: The workloads also have many large, sequential writes that
append data to files. Typical operation sizes are similar to those for reads. Once
written, files are seldom modified again.
• Not optimized for small data: Small, random reads and writes do occur and are
supported, but the system is not optimized for such cases.
• Concurrent access: The level of concurrent access will also be high, with large
numbers of concurrent appends being particularly prevalent, often
accompanied by concurrent reads.
• High throughput: GFS should be optimized for high and sustained throughput in
reading the data, and this is prioritized over latency. This is not to say that
latency is unimportant; rather, GFS needs to be optimized for high-performance
reading and appending large volumes of data for the correct operation of the
system.
APIs
GFS does not provide standard POSIX-like APIs; instead, user-level APIs are provided.
In GFS, files are organized hierarchically in directories and identified by their
pathnames. GFS supports the usual file system operations: create, delete, open, close,
read, and write, plus two specialized operations, snapshot (a low-cost copy of a file or
directory tree) and record append (which allows many clients to append to the same
file concurrently).
AWS meaning: The Amazon Web Services (AWS) platform provides more than 200 fully
featured services from data centers located all over the world, and is the world's most
comprehensive cloud platform.
Amazon Web Services is an online platform that provides scalable and cost-effective
cloud computing solutions.
AWS is a broadly adopted cloud platform that offers several on-demand operations,
such as compute power, database storage, and content delivery, to help corporations
scale and grow.
Disadvantages of AWS
1. AWS charges extra for premium support packages covering intensive or
immediate response, so users might need to pay additional money for support.
2. There might be some general cloud computing problems in AWS, especially
when you move to a cloud server, such as backup protection, downtime, and
limited control.
3. From region to region, AWS sets some default limits on resources such as
volumes, images, or snapshots.
4. If there is a sudden change in your hardware system, the application on the
cloud might not offer great performance.
Migration
Migration services use three different sub-services, DMS, SMS, and Snowball, to
transfer data physically from a data center to AWS.
1. DMS also known as Database Migration Service is used to migrate one
database to another.
2. SMS is a Server Migration Service that helps to migrate on-site servers to AWS
within a short period of time.
3. Snowball is a physical appliance used to migrate terabyte-scale data into and
out of the AWS environment.
Applications of AWS
The most common applications of AWS are storage and backup, websites, gaming,
mobile, web, and social media applications. Some of the most crucial applications in
detail are as follows:
1. Storage and Backup
One of the reasons why many businesses use AWS is that it offers multiple types
of storage to choose from and is easily accessible as well. It can be used for storage
and file indexing as well as to run critical business applications.
2. Websites
Businesses can host their websites on the AWS cloud, similar to other web
applications.
3. Gaming
There is a lot of computing power needed to run gaming applications. AWS makes it
easier to provide the best online gaming experience to gamers across the world.
4. Mobile, Web, and Social Applications
A feature that separates AWS from other cloud services is its capability to launch and
scale mobile, e-commerce, and SaaS applications. API-driven code on AWS can
enable companies to build uncompromisingly scalable applications without requiring
any OS and other systems.
5. Big Data Management and Analytics (Application)
• Amazon Elastic MapReduce (EMR) to process large amounts of data via the
Hadoop framework.
• Amazon Kinesis to analyze and process streaming data.
• AWS Glue to handle extract, transform, and load (ETL) jobs.
• Amazon Elasticsearch Service to enable a team to perform log analysis and
tool monitoring with the help of the open source tool Elasticsearch.
• Amazon Athena to query data.
• Amazon QuickSight to visualize data.
6. Internet of Things (IoT)
• AWS IoT service offers a back-end platform to manage IoT devices as well as
data ingestion to database services and AWS storage.
• AWS IoT Button offers limited IoT functionality to hardware.
• AWS Greengrass offers AWS computing for IoT device installation.
What is Amazon S3?
✔ Amazon S3 is an object storage service that offers industry-leading scalability, data
availability, security, and performance.
✔ Store and protect any amount of data for a range of use cases, such as data lakes,
websites, cloud-native applications, backups, archive, machine learning, and
analytics.
✔ Amazon S3 is designed for 99.999999999% (11 9's) of durability, and stores data for
millions of customers all around the world.
Features of Amazon S3
Storage management
Amazon S3 has storage management features that you can use to manage costs, meet
regulatory requirements, reduce latency, and save multiple distinct copies of your data
for compliance requirements.
• S3 Object Lambda – Add your own code to S3 GET, HEAD, and LIST requests to
modify and process data as it is returned to an application. Filter rows,
dynamically resize images, redact confidential data, and much more.
• Event notifications – Trigger workflows that use Amazon Simple Notification
Service (Amazon SNS), Amazon Simple Queue Service (Amazon SQS), and AWS
Lambda when a change is made to your S3 resources, as in the sketch below.
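As a hedged sketch of the event-notification flow referenced in the list above: an AWS Lambda handler, in Python, that reads the bucket and object key from the S3 event payload. The Records structure follows S3's documented event format; the print action stands in for a real workflow step.

```python
# Sketch of an AWS Lambda handler triggered by an S3 event notification.
# The 'Records' structure follows S3's documented event format; what you
# do with each object is up to your workflow (logging here is illustrative).

def handler(event, context):
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(f"Object created: s3://{bucket}/{key}")
    return {"processed": len(event.get("Records", []))}
```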
Amazon S3 also provides logging and monitoring tools that you can use to monitor and
control how your Amazon S3 resources are being used.
Amazon Elastic Block Store (EBS) provides persistent block storage volumes for EC2
instances. If the EC2 instance stops or is terminated, all the data on the attached EBS
volume remains.
What are AWS EBS Snapshots?
EBS snapshots are incremental, point-in-time backups of an EBS volume: each
successive snapshot copies only the blocks of data that have changed since the last
snapshot. When you delete a snapshot, only the data unique to that snapshot is
removed.
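A hedged Boto3 sketch of creating an EBS snapshot follows; the volume ID is a placeholder, and AWS credentials are assumed to be configured.

```python
import boto3

ec2 = boto3.client("ec2")

# Create an incremental, point-in-time snapshot of an EBS volume.
# 'vol-0123456789abcdef0' is a placeholder volume ID.
snapshot = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",
    Description="Nightly backup",
)
print(snapshot["SnapshotId"])
```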
Azure is a cloud computing platform and an online portal that allows you to access and
manage cloud services and resources provided by Microsoft. These services and
resources include storing your data and transforming it, depending on your
requirements. To get access to these resources and services, all you need to have is an
active internet connection and the ability to connect to the Azure portal.
What are the Various Azure Services and How does Azure Work?
Azure provides more than 200 services, divided into 18 categories. These
categories include computing, networking, storage, IoT, migration, mobile, analytics,
containers, artificial intelligence and other machine learning, integration,
management tools, developer tools, security, databases, DevOps, media, identity, and
web services. Let's take a look at some of the major Azure services by category:
Compute Services
• Virtual Machine
This service enables you to create a virtual machine in Windows, Linux or any
other configuration in seconds.
• Cloud Service
This service lets you create scalable applications within the cloud. Once the
application is deployed, everything, including provisioning, load balancing,
and health monitoring, is taken care of by Azure.
• Functions
With Functions, you can create applications in any programming language. The
best part about this service is that you need not worry about hardware
requirements while developing applications, because Azure takes care of that.
All you need to do is provide the code.