
CloudCentrix: An Overview of

Frameworks and Technologies to Improve


Your Cloud Environment
Table of Contents
Introduction
Chapter 1 Public, Private, Hybrid and Multiclouds
Chapter 2 VMware
Chapter 3 Veeam
Chapter 4 Nutanix
Chapter 5 Cloud-Native
Chapter 6 Microservices
Chapter 7 Virtual Machines and Containers
Chapter 8 Docker
Chapter 9 Kubernetes
Chapter 10 Programming Languages and Frameworks
Chapter 11 Automation: IaC
Chapter 12 Terraform vs. Ansible
Chapter 13 Cloud Security
Chapter 14 DevOps
Chapter 15 Git and GitHub
Chapter 16 Scrum vs. Kanban
Chapter 17 Analytics
Chapter 18 Databricks
Chapter 19 Power BI vs. Tableau

Conclusion
Introduction
You’ve either already made your move to the cloud, or you’re thinking about it. But now what?
The benefits of moving to the cloud don’t just appear on their own. You have to put the right
technologies to work, tools that let you accomplish tasks in a fraction of the time they took
before the move.
The benefits that come with cloud computing have spawned a new age and new growth.
Gartner forecasts that worldwide end-user spending on public cloud services will grow 20.7% to
a total of $591.8 billion in 2023, up from the 18.8% growth forecast for 2022.
This ebook is designed to help you understand the technologies that will accelerate your digital
transformation. Welcome to CloudCentrix, your guide to the cloud technologies that can
transform your IT department.

About CloudCentrix
CloudCentrix is a concept we developed to help organizations move applications, data, and
systems to the cloud and use it in the ways that suit them best, and to help our customers
understand how various cloud technologies can accelerate digital transformation.
After realizing that many of our clients had moved to the cloud without fully leveraging
technologies that could save their IT teams hours every workday, we created CloudCentrix to
help them understand the various technologies and corresponding courses that could benefit
them.
Adopting new technologies and processes is the first step to working well in the cloud. The
second step is training. Repeatedly, we’ve seen clients adopt various cloud technologies but
fail to use them to their full capabilities, wasting significant time.
This ebook doesn’t cover everything the cloud offers; rather, it gives you a good working
knowledge of the various types of technologies and processes you can begin to adopt to elevate
your IT teams. There’s no order in which you should read this book. You can read
chronologically or skip around as you see fit. The chapter divisions help you focus on the tools
and technologies best suited for your needs.
Read this ebook to unlock the full potential of the cloud.

Chapter 1 Private, Public, Hybrid, and Multicloud Environments

Public, Private, Hybrid and Multiclouds: What Are the Differences?
A cloud computing system enables businesses to store and analyze data over the internet using
cutting-edge technology. The way that technology is deployed is referred to as a cloud
deployment model. The four main models are public, private, hybrid, and multicloud (the
use of cloud computing services from at least two cloud providers).
Cloud deployment models define a specific type of cloud environment based on ownership,
scale, and access, and govern how data is kept, how customers interact with it, and how cloud-
deployed applications function.

What is a Public Cloud?


The most common model of cloud computing services is the public cloud, a computing model
managed by a third party like Amazon Web Services (AWS), Google Cloud, or Microsoft Azure,
which delivers IT services through the internet. These companies provide a wide range of
solutions and computing resources, such as cloud analytics, security, and serverless computing.
Computing functionality can range from simple services, such as email, applications, and
storage, to enterprise-grade operating systems and infrastructure environments used for
software development and testing. The cloud provider oversees creating, managing, and
maintaining a pool of computing resources that various tenants from across the network share.

Pros and Cons of the Public Cloud


Listed below are several advantages of a public cloud deployment:
● The public cloud is highly scalable, as there is no need to invest in hardware to improve
the infrastructure. It can scale up and down according to demand.
● Subscription and pay-as-you-go pricing models mean you only pay for what you use.
● Cloud providers offer hardware updates and maintenance so their clients don’t have to.
● Minimal technical knowledge is required in-house for setting up a public cloud.
● Cloud service providers offer a variety of services.

Listed below are the disadvantages of public cloud solutions:


● Specific organizational requirements around security and compliance could hinder some
companies from using public cloud solutions.
● Public clouds may not meet your legal, compliance, or industry requirements.
● The organization does not own the infrastructure, which could restrict services and
usage.
● Software-as-a-Service (SaaS) providers do not always meet bespoke business
requirements.

What is a Private Cloud?


Any cloud system dedicated to a single enterprise is called a private cloud. Cloud computing
resources are not shared with any other organization in a private cloud.

The data center resources may be on-site, which you manage, or off-site and managed by a
third-party vendor. The computing resources are isolated and distributed through a secure
private network not shared with other clients.

Pros and Cons of a Private Cloud


A private cloud has the following advantages:
● Ensures the configuration can support all application and legacy system scenarios
● Allows you to control the security of your cloud deployment
● Meets any compliance, legal, and security requirements for the organization

Below are the typical cons of a private cloud system:


● Requires you to configure and maintain the necessary hardware
● Requires in-house skills to manage and leverage a private cloud infrastructure
● Requires you to purchase and install new hardware
● Limits your ability to scale applications if you lack the necessary infrastructure

What is a Hybrid Cloud?


Any cloud infrastructure architecture comprising public and private cloud solutions is called a
hybrid cloud.

Typically, the resources are orchestrated as an integrated infrastructure environment. Based on
corporate business and technical policies, apps and data workloads can share resources
between public and private cloud deployments.

Pros and Cons of a Hybrid Cloud


Here are a few benefits of a hybrid cloud setup:
● A hybrid system gives more flexibility and scalability than on-premises infrastructure;
once your private cloud gets more traffic than it can handle, the public cloud
automatically handles the overflow, giving users a seamless experience.
● You can take advantage of the economies of scale that come with public cloud
deployments.
● Organizations can still use their own systems to ensure security and compliance
requirements are met.

Some challenges of a hybrid setup are listed here:


● A hybrid setup can be complicated to deploy and manage when integrating public and
private environments.
● The cost is typically higher than sticking to a single (private or public) method.
Multicloud vs. Hybrid Cloud
In a multicloud environment, a company uses two or more public cloud service providers. A
company may host its online front-end application on AWS while hosting its Exchange Servers
on Microsoft Azure.

Because not all cloud providers are created equal, organizations adopt a multicloud strategy to
deliver best-of-breed IT services to avoid being locked into one provider or to choose providers
based on which one offers the lowest prices.

Factors to Consider When Choosing a Cloud Strategy


While the public cloud has revolutionized IT operations, many companies still have a large
investment in their own data centers and want to keep them. However, they also want to take
advantage of public clouds, necessitating hybrid solutions.

Some companies already have on-demand IT resource delivery within their infrastructure and
do not require a public cloud. Others build private clouds because their workloads may carry
private data that companies don’t want in the public cloud due to security or compliance
concerns.

Public cloud providers and many third-party software developers help merge cloud and on-
premises resources to make management, backups, and security easier. For example, VMware
Cloud on AWS (and VMware for Azure and Google equivalents) can help overcome some of the
public cloud's challenges as it is the fastest and easiest way to have infrastructure that meets
regulatory compliance. VMware Cloud migrates and extends your on-premises environments to
the public cloud. Numerous other vendors, such as those listed below, can also assist with
managing technology requirements across multiple environments:

● Red Hat OpenShift builds on the portable nature of containers and Kubernetes,
providing a Platform-as-a-Service.
● OpenStack creates and manages cloud infrastructures.
● VMware extends your existing on-premises infrastructure and operations to any public
cloud, running enterprise programs with a consistent operating model.
● Veeam provides modern-data protection software for virtual and physical infrastructures
within a multicloud environment.
● Nutanix combines the ease and agility of public clouds with the performance and
security of private clouds. Whether on-premises or in a hybrid environment, centralized
management, one-click operations, and AI-driven automation will assure business
continuity.
● NetApp provides storage and backup solutions for hybrid clouds.
Summary
The most critical components in choosing a cloud strategy for most enterprises will be
affordability, accessibility, reliability, and scalability. Your sector, security, compliance
legislation, budget, and plans for the future will determine whether a private, public, hybrid, or
multicloud environment is the right solution for your needs.

Discover which cloud provider is right for you in this whitepaper, How to Choose the Right Cloud
Provider.

Chapter 2 VMware

How to Create a Private Cloud With VMware


If you’re currently operating or soon plan to operate a private or hybrid cloud, you’ve probably
heard of virtualization and VMware. But what is VMware, and how can it help you build a private
cloud?

VMware is software designed to create virtual machines (VMs), which are virtual copies of
computers, operating systems, and installed programs, created to maximize computing
resources. You’ll find an easy explanation of virtual machines here in “The Pros and Cons of
Virtual Machines and Containers.” VMs run independently of one another and make it possible
to accommodate multiple operating systems and workloads on a single server with high
performance and very low latency.

Traditionally, each physical server housed only one application. With VMs, you can instead
create dozens of virtual machines on a single server, each housing its own application, saving
companies thousands of dollars on the cost of buying numerous servers.

Creating Dozens of VMs with ESXi


VMware is a virtualization and cloud computing software vendor that provides many
components for you to run your own private cloud in your own data center using VM
technologies. VMware’s core technology is its hypervisor, ESXi, which allows you to create
dozens of virtual machines on a single physical machine. The number of VMs you can run is
limited only by the resources of that machine: the more CPU, RAM, and storage a server has,
the more VMs it can host. ESXi is free to download and install, and you can connect to it over a
web browser to start building VMs.

VMware offers a stack of products in addition to ESXi, which alone will allow you to build VMs
on a laptop, desktop, or server. You’ll need another computer to connect over a web browser to
your ESXi host. As long as your ESXi management network is visible, you input the IP address
of the ESXi host, log in to your account, and you can start building things from there.
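Beyond the browser UI, ESXi and vCenter can also be driven programmatically. Below is a minimal, hedged sketch using VMware’s open source pyVmomi Python SDK to connect to a host and list its VMs; the host address and credentials are placeholders, and relaxing certificate checks is appropriate only in a lab setup.

# Minimal sketch using VMware's pyVmomi SDK (pip install pyvmomi).
# The host address and credentials below are placeholders for your own environment.
import ssl

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

context = ssl.create_default_context()
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE  # lab only: skip certificate validation

si = SmartConnect(host="192.0.2.10", user="root", pwd="changeme", sslContext=context)
try:
    content = si.RetrieveContent()
    # Walk the inventory and collect every VirtualMachine object.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True
    )
    for vm in view.view:
        print(vm.name, vm.runtime.powerState)
    view.Destroy()
finally:
    Disconnect(si)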

The Management Component


The VMware stack includes vCenter, which allows you to manage multiple ESXi hosts in a
cluster or multiple clusters. You could have hundreds of ESXi hosts inside a vCenter
environment, and that allows you to centrally see and manage resources among all your hosts,
clone your VMs, and move them from one host to another. You can also set policies so that if a
VM goes down, it is automatically restarted on another host.

While virtualization brings a lot of efficiencies to the data center, one of the challenges is
resource contention or over-working a physical host. There will be times when you want or need
to move a VM from one host to another. Without vMotion, you’d have to manually shut down the
VM, unregister it on the current host and re-register it on the new host, which all takes time and
requires a maintenance window for the application outage. vMotion lets you move a VM from
one host to another while the VM is running; there’s no need to shut the VM down.

VMware DRS automates that entire process as it detects when a host has too much strain on it
and whether a VM would be better off being on another host. DRS will automatically move VMs
among hosts based on the load. So, if a VM is using so much power that the host is struggling
to give it the power it needs, DRS automatically detects that and moves VMs around the cluster
to resolve the resource constraint.

Storage
Built into the VMware hypervisor is vSAN, which lets you aggregate storage devices within your
ESXi environment and create a single shared data store across your entire cluster of virtual
machines. vSAN delivers enterprise-class storage. You can create storage policies for each VM,
and if there are any policy deviations, you’ll get an alert that will show you what VMware is doing
to mitigate the difference in the policy. If you need to grow storage, you can add more hosts to
expand your vSAN datastore, scaling performance as you scale storage space.

Virtual Network
VMware NSX is the network virtualization solution. As organizations move to a software-defined
data center (SDDC) model, NSX delivers a software-centric approach to networking, including
switching, routing, firewalling, IDS/IPS, and load balancing in a distributed architecture. NSX
provides data center-wide visibility, simplified policy compliance analysis, and streamlined
security operations, connecting and protecting your workloads wherever they’re deployed. Like
VMs, these networks can be created, saved, deleted, and restored easily.

All of these functions and activities are coordinated through VMware’s vCenter Server, the
centralized management platform for your vSphere environments. It lets you manage virtual
machines, multiple ESXi hosts, and all dependent components from a single pane of glass, so
you can connect and protect applications across your data center and your private and public
clouds, no matter where they run: in a VM, in a container, or on bare metal.

Part of the VMware stack, Aria, provides customers with a graph — a total view — of all their
assets to effectively manage their cloud-native applications and multicloud assets across cloud
environments. This feature helps companies determine which apps should be deployed on
which cloud and how to optimize cost versus performance. Aria also helps companies detect
whether policies are being applied consistently across environments and provides federated
access to manage users and govern their access to multiple applications.
To learn about using VMware for your private cloud, start with VMware vSphere: Install,
Configure, Manage [V8].

Chapter 3 Veeam

Veeam: A Backup for Your Data

Veeam is a software platform developed to back up, restore, and replicate data. The first
solution to focus on protecting virtual machines (VMs) and recognizing the difference between
VMs and physical endpoints, Veeam backs up data across cloud, virtual, physical, and network-
attached storage (NAS) devices.
As virtualization gains importance for companies in almost every industry, the need for
specialized backup and recovery systems also increases.

The Rise of Virtualization


Virtualization is the creation of a virtual duplicate of an operating system, storage device, server,
or network resource used to run an application. Virtualization enables many essential practices
today, including supporting multiple operating systems and applications on a single server, thus
reducing IT and infrastructure costs and complexity. Virtualization is the technology that enabled
the rise of both private and public cloud architectures.
Whether your data is on-premises or in a public or private cloud, Veeam can restore individual
files and applications within a very short timeframe, often in as little as 15 minutes. Veeam
reports it can replicate data off-site up to 50 times faster than would otherwise be possible with
a raw data transfer. This quick restoration is essential for mission-critical applications and
enterprise disaster recovery plans.
Veeam has been listed as a Leader six times in the Gartner Magic Quadrant for Enterprise
Backup and Recovery Software Solutions, most recently in July 2022. Veeam supports the
backup and replication of virtual, physical, and cloud machines, including container-based
workloads. In addition to allowing you to see what and where configuration changes are being
made, the software also provides robust monitoring and reporting views, as well as a chart that
shows the proportion of backup-compliant VMs compared to the rest of the VMs across a
selected scope.
How Veeam Protects Company Data
As digital transformation evolves across the global business landscape, companies rely on
certain technologies to ensure they meet today’s business standards. Veeam is a simple,
reliable solution designed to protect these technological systems and their data, ensuring
organizations can continue meeting customer and employee needs. To support these needs,
cloud-based solutions and software-as-a-service (SaaS) are used to drive efficiencies, but they
multiply risk by increasing the number of potential endpoints. The more we depend on these
solutions, the more complex the data environment becomes, increasing security risks. Backing
up your data mitigates risk, providing quick disaster recovery.
Veeam allows organizations to protect, back up, recover, and replicate virtual systems quickly,
simply, and reliably, according to their changing needs. To learn more about Veeam and how it
can make life easier for your company and IT team, browse our Veeam training and certification
courses today.
Chapter 4 Nutanix

Nutanix: Simplifying and Unifying the Cloud


Nutanix is one of a set of cloud service providers that have emerged in recent years to address
the challenges enterprises face in shifting complex, heterogeneous systems, applications, and
processes to the cloud. Nutanix offers a comprehensive, user-friendly set of products and
services that make it possible to migrate an enterprise’s data center to the cloud — public,
private, or hybrid — simply, rapidly, and with minimal disruption to day-to-day operations. To
ensure that the enterprise’s infrastructure can handle sudden surges in demand without
straining its resources, Nutanix can build an entirely new data center from scratch and can do it
all at a massive scale.

Since companies no longer need to maintain data centers and other complex, large-scale
hardware and software installations, they don’t need the personnel to support them on-premises
at their physical locations. However, the cloud’s radically different IT delivery model introduces
new complexities and challenges. Here are just a few of them: Some enterprises, especially
businesses operating in highly regulated industries, need private clouds to ensure the security
and privacy of their sensitive data. Other enterprises may opt for the cost savings of public
clouds, or choose hybrid clouds, which involve some combination of public and private clouds
and on-premises installations. But performance, interoperability, and flexibility issues often arise
when an enterprise moves to the cloud or interacts — as is increasingly the case — with
partners, customers, and even competitors across a highly distributed industry ecosystem.
Perhaps most importantly, the shift to the cloud requires specialized IT and business process
skills — in areas like network security, virtualization, and disaster recovery.

Nutanix’s comprehensive range of products and services are designed to simplify the cloud’s
complexities, offering centralized management, one-click operations using a unified dashboard
that looks and works the same way for all users, and automation powered by artificial
intelligence (AI). Nutanix services, applications, and infrastructure offerings provide the ability to
manage virtually all of an enterprise’s end-to-end operations, including storage, computing,
virtualization, and networking, simply, efficiently, and at scale. This means that Nutanix can
move the enterprise’s entire existing tech stack — including its storage systems and network
servers, its services, virtualization resources, and more — to a hyperconverged cloud
infrastructure. Nutanix can handle the migration path, the day-to-day management of systems,
applications, and services, and the provisioning and deployment of new ones.

Automation
Automation is central to many of Nutanix’s products and services. Nutanix can automate IT
support using advanced techniques like predictive analytics to anticipate surges in calls to the
help desk and can ensure that adequate personnel are always available. Additionally, Nutanix’s
unified cloud offering means that a single IT team can manage all applications and data across
even the most complex multicloud environment. Nutanix’s products and services are
interoperable with essentially all the hardware an enterprise uses.

Nutanix has been placed as a Visionary in Gartner’s Magic Quadrant for Distributed File
Systems and Storage for two years in a row, with the research and advisory firm highlighting
Nutanix’s “ease of use and high-quality customer support experience” as one of its key
strengths. Those qualities lie at the heart of Nutanix’s complete value proposition, across its
extensive range of cloud services and applications. One important example is multicloud
security: Nutanix can create software-based firewalls that protect critical applications and data
against emerging threats across the most complex cloud environment and can do it without the
enterprise having to hire or upskill specialists in cloud security. Nutanix security products use
advanced AI techniques and ML algorithms to automate the analysis, identification, mitigation,
and reporting of security threats and ensure comprehensive regulatory compliance.

The bottom line: Nutanix offers a complete end-to-end set of products and services for
enterprises that are looking to move all or part of their systems and applications to the cloud, to
do it simply and efficiently, and to use the cloud as a platform for future innovation.

Introduce yourself to the products, capabilities, and technologies that serve as the foundation of
the Nutanix Hybrid Cloud by taking a course like Nutanix Hybrid Cloud Fundamentals (NHCF).

Chapter 5 Cloud-Native

Understanding Cloud-Native Architecture and Applications


Cloud-native architecture, the most modern way of developing applications, increases
collaboration, scalability, availability, and speed to market for new services. With cloud-native
applications, developers can quickly and easily make high-impact changes frequently and with
minimal effort.
Based on microservices, independent functions that are responsible for doing just one thing,
cloud-native applications are the wave of the future. By 2025, Gartner estimates that more than
95% of new digital workloads will be deployed on cloud-native platforms, up from 30% in 2021.

Benefits of Cloud-Native Applications


● Simplifies building, deploying, and managing parts of an overall application
● Enables you to independently scale parts of a cloud-native application
● Keeps iterations of applications online even when one of them fails
● Provides the ability to move applications across environments
● Enables continuous integration/continuous delivery (CI/CD)
● Enables frequent updates with no downtime
● Brings applications to market faster than ever before

Why Become Cloud-Native?


For years, applications have been created using a monolithic architecture. The term “monolith”
refers to a single block of stone of considerable size. Monolithic apps are built so that everything
regarding that application is all in that one block of code. For example, a monolithic application
for a retail app might include functions for searching for products, ordering, online chat, the cart,
the packaging information, the shipping information, the customer address and contact
information, reviewing products, and dozens of other functions. In a monolithic architecture, all
the functions work together inside the application. If one of those functions breaks, the entire
application could break and remain offline until someone can fix it.
A monolithic application takes a long time to develop and deploy as the different services in the
app are interdependent. That means when one part of the code breaks, it could have a
deleterious effect on another part of the application that previously was working fine. To fix a
monolithic application, you need to take it offline.
As businesses and the usage of applications grow, there can be scalability problems. Scalability
refers to the ability to add or remove portions of the application in response to business needs.
The single deployment unit means that a monolithic application must run entirely within a given
server, which doesn’t allow for parts of the application to be scaled independently. If you need to
scale the database on a monolithic app, that could affect other services, so you may have to
scale other services even if they don’t need the extra capacity. But when you
have services working independently of one another, as is the case in a cloud-native
application, you can scale one service, or even remove it, and it won’t affect any other part of
your business application.

Why are Companies Moving to Cloud-Native?


When you move a monolithic application into the cloud, it doesn’t have the elasticity, scalability,
and resiliency cloud-native apps have. Elasticity refers to the ability to quickly expand to meet
changing demands for peak usage like on Valentine’s Day when thousands of people order
chocolates or flowers from a store that normally receives few orders. Resiliency refers to the
ability to recover from failures and continue to function. Cloud-native apps are built to give you
those benefits and more.

What is Cloud-Native?
Cloud-native is a technical and business approach to using the cloud to create business
applications quickly and more frequently than ever before. The cloud-native approach is about
building applications for the cloud using microservices to increase speed to market, respond
quickly to business needs by scaling as needed, and integrate the continuous integration and
continuous delivery (CI/CD) pipeline. Cloud-native apps can be developed in days or weeks as
compared to months, the time it often takes to create monolithic apps. Applications built with a
cloud-native architecture allow developers to make simple changes to them in minutes without
ever taking them offline.

Why Cloud-Native Matters


Taking a cloud-native approach to developing applications (adopting a microservices
architecture and embracing the cloud and DevOps concepts) is the key to unlocking all the
advantages of the cloud. But most companies lack the skills it takes to do that. In this rapidly
changing digital transformation landscape, Gartner finds that 70% of employees realize they
don’t have mastery of the skills they need to do their jobs.
ExitCertified can help you select and execute IT training courses for all your needs, including
emerging open source technologies, to help IT teams rapidly acquire new skills and optimize
business results.
Take the courses you need to build cloud-native applications and microservices to help
transform your business.
 

Chapter 6 Microservices

What are Microservices?

Microservices have become the architecture of choice for application development. Whereas
monolithic architecture was yesterday’s dominant style of developing applications, today’s
modern apps are built with microservices.
A component of cloud-native computing, microservices are mini-applications that compose a
business application. Each microservice is responsible for doing one discrete piece of
functionality and doing it well.
Take an eCommerce site as an example. In a microservice architecture, the company’s
shopping website may look like a typical business application, but it is a combination of dozens
or even hundreds of mini-applications, such as a product catalog, user profiles, and order
processing. Application Programming Interfaces (APIs) act as gateways between these small,
independent applications. Each mini-application provides its own service and communicates
with the other microservices, delivering a seamless shopping experience. This is
in contrast to traditional applications, which are known as monolithic applications.
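To make the API idea concrete, here is a minimal sketch of a hypothetical product-catalog microservice exposing an HTTP endpoint; the service name, route, port, and data are invented for illustration, and Flask is just one of many possible toolkits.

# Hypothetical product-catalog microservice exposing a small HTTP API (pip install flask).
# Names, routes, ports, and data are illustrative only.
from flask import Flask, jsonify

app = Flask(__name__)

CATALOG = {"sku-1": {"name": "Coffee mug", "price": 12.50}}

@app.route("/products/<sku>")
def get_product(sku):
    """Return one product, or a 404 if the SKU is unknown."""
    product = CATALOG.get(sku)
    if product is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(product)

if __name__ == "__main__":
    app.run(port=5001)

An order-processing microservice running elsewhere would then call this API rather than sharing code with it, for example requests.get("http://catalog:5001/products/sku-1").json(); if the catalog service fails, the order service itself keeps running.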

Monolithic Applications
A monolithic application is built as a single unified unit. When one part of it fails, there are often
ripple effects that cause other parts of it to fail. To make a change to this type of application, you
must take the entire application offline. Because these applications are built as one unit, they
typically aren’t deployed until the entire application has been built.

The Benefits of Microservices


Faster Time to Market
Microservices help organizations get their applications to market faster. By breaking a business
application down into smaller pieces, organizations can deploy the most important components
to create one business application and add other components later.
Reliability
The traditional way of creating applications is with a monolithic architecture, which is becoming
outdated. The term “monolith” refers to a single block of stone of considerable size. Although
they are built with different sections for certain categories — like a book with different chapters
— monolithic apps are built with all the lines of code contained in one block. If one part of the
monolithic app goes down, it can have ripple effects and can potentially shut down your entire
application.
A power outage knocked a Netflix data center offline for three days back in its DVD shipping
days, which was a big reason why the company transitioned to microservices. If one
microservice goes down, the other microservices remain online. So, if the recommendation
microservice were to go down, users might not see suggestions that Netflix recommends they
watch based on their previous viewing practices, but users would still be able to watch shows.
Scale
A microservices application can scale with fewer resources than a monolithic application. If you
need to scale the catalog section of an application, with microservices, you just scale that one
single application. In a monolithic application, you have to scale the entire application.
Innovation
Compared to a monolithic application, a microservices application is broken down into its
smallest components. Enterprises often have separate IT teams that are tasked with creating and
servicing a specified handful of microservices. This allows companies to assign focused teams
that can take ownership of their microservices, making it easier to update and improve a certain
function without worrying about affecting the rest of the application.
Flexibility
Different languages can be used to develop different microservices. For example, one
microservice could be developed in Java, and another could be developed using NodeJS. A
third might be developed in C#. This allows individual developers to choose the language that
they are most comfortable coding in or to use a certain language they believe is best suited for
the service they’re developing.

Disadvantages of Microservices
While there are a lot of benefits to microservices, they’re not perfect for every use case, and they
have some drawbacks.
Complexity
There are many moving pieces, as each application is its own entity and could reside anywhere,
making it difficult to see all the mini-applications at one time. You have to handle requests
traveling between different modules, and the remote calls to a service could experience latency.
Carefully planning out this architecture is critical.
You’ll need to spend time connecting the microservices and enabling authorized access
between them to get the application running. For small business applications with few services,
it may be best to use a monolithic architecture.
Limited Reuse of Code
Because each smaller application may be written in a programming language that’s different
from the other applications, there’s a limited ability to reuse code.
Learn how to build and connect microservices to help transform your business.

Chapter 7 Virtual Machines and Containers


The Pros and Cons of Virtual Machines and Containers
Virtual machines (VMs) and containers are both virtual environments that use software instead
of a physical machine to isolate applications into a self-contained unit. But VMs and containers
are built and run differently from each other, and that affects their capabilities and the costs of
using them.

Servers
To understand VMs, let’s start with a basic understanding of servers. They’re called servers
because they serve up applications, information, or other services to other computers.

VMs are deployed on a host, either a physical computer or a physical server. You can run a VM
on a computer or laptop, but companies, which run dozens or hundreds of VMs, typically house
their VMs on a physical server because it has a lot more resources than a computer. The more
resources a physical machine has, the more VMs it can host.

Before hosting business applications in the cloud, companies housed them on physical servers,
which were stored in racks in a company’s data center. Typically, there would be only one
application running on each server. Enterprises might need 1,000 physical servers to host 1,000
applications. That’s a lot of servers to buy and manage.

When an application on a traditional physical server lacks the resources that are needed to
handle increased traffic, the application slows down or, worse, crashes. But that issue rarely
occurs with VMs and containers because they take up far fewer resources. So instead of having
only one VM or container on a server, a server can house dozens of VMs or containers.

Virtual Machines
A virtual machine emulates the functionality of a physical computer. To better understand what
a VM is, let’s start with something we know: a Word document. Although a VM is more complex
than a Word document, a VM is kind of like a Word document in that they are both virtual—
meaning you can’t touch them—and they both contain information. A Word document is a
vessel that hosts content, typically made of words and graphics. A VM is a
vessel that typically hosts business applications, which consist of code. To create a Word
document, you need special software like Microsoft Word or Google Docs. So, too, to create a
VM you need software like VMware, VirtualBox, or QEMU. Virtual machines are great for
hosting applications that were created using a monolithic architecture.

Monolithic Architecture
Traditional applications were written in a monolithic architecture, meaning all code instructions
were built in one large block of code. What does that look like? Imagine one large Word
document that contains instructions for operating an electric car. There’s a section in the
document that operates the steering wheel, one that operates the blinkers, one that operates
the engine, one that operates the cruise control. If one of those instructions—say the blinkers—
fails or gets corrupted, that failure could trickle down to other sections of that operating
document. The corruption might start with the blinkers, but because the blinker instructions are
closely tied to the engine instructions, those could also become corrupted. Then before you
know it, the driver of the car not only has blinkers that don’t work but also a non-working engine.
Companies today typically build monolithic apps only for small applications. For example, if a
company wanted to create an application for employees to update their contact information and
add two people to contact in case of an emergency, that would be a small application, so it
would make sense to create it using a monolithic architecture.

When companies move a monolithic application to the cloud, they move an application off a
physical server and deploy it in a VM. A company will typically create a few different instances
of that VM, so if it ever has any type of failure, another duplicate VM takes over, giving the user
a seamless experience with no downtime. When an application resides directly on a physical
server rather than in a VM or container and the application fails, it must be taken offline until it
can be fixed.

Microservices
Microservices are services that run independently of one another. Using our
previous example above, if a blinker were to go out, it wouldn’t affect the brakes, cruise control,
radio, engine, or anything else. Rather than housing all these services in one block of code as is
done in a monolithic application, each of these services is created independently as a mini-
application. Each mini-application is typically housed in its own container and communicates via
APIs to form a business application.

Containers
Each container hosts a mini-application that is part of a business application. For example, an
online retailer might offer a loyalty rewards program. The underlying coding for that program
would be its own service held in its own container. If that service were to fail, it would not affect
any of the other mini-applications associated with the overall business application. The
shopping cart and database would remain running, so even though shoppers would not see the
typical loyalty rewards offer, they could still buy products.

Systems are put into place so that all these containers communicate when needed. Users would
not be able to detect whether an application was a monolithic application or a microservices
application.

The Differences Between VMs and Containers


VMs use a lot more resources on a server than containers.
Reasons to Use VMs
● You can quickly create new iterations of VMs, so if one VM goes down, another can be
created manually or automatically.
● Recovery. You can quickly restore a VM from a snapshot. With frequent snapshots of
your VMs, you can reduce downtime.
● Economical. You can buy one server and host numerous VMs, and you can use VMs in
a public or private cloud.
● Storage and backup. A VM backup consists of backing up an entire machine rather than
just individual files.
● Application development. You can deploy new VMs quickly to develop new apps.
● Secure environment. VMs are separated from one another, helping to prevent malware
from spreading.
● Safe testing environment. VMs can be used as a sandbox to test code or potential
malware.
● You can run multiple operating systems. One VM could be running macOS while another
VM on the same server could be running Windows, and another could be running Linux.
● Portability. If they’re running on the same type of hypervisor, you can move VMs across
virtual environments and from one physical server to another, including from your data
center to another data center. If the application works in your environment, it should
work anywhere you move it.
● Compatibility. You can run a VM image on any host OS, such as Linux, Mac, or
Windows.

Challenges With Virtual Machines


● VMs are slow to start because it takes time to start up all the resources they use.
● Hardware issues can interrupt virtualization performance, which is why it’s good to keep
iterations of your VMs on separate servers.
● An OS license may be required for each VM you create.
● VMs require large amounts of resources: memory, CPU, and storage.

Reasons to Use Containers


● Lightweight. Containers start up in milliseconds.
● Minimal size. A container can be as small as 10 MB, whereas a VM occupies at least
a few GBs of storage space.
● Rapid deployment. You can create, test, and deploy an application quickly.
● Quick release cycle. Using Infrastructure as Code (IaC) along with containers for the
setup and maintenance of servers results in a quicker release cycle.
● Portable. Containers can run anywhere, across environments.
● Multiple languages. One container could hold an app built in C# and another an app
built with Python. Developers can use the language they’re most comfortable with.
● Low overhead. Containers use fewer resources than VMs.
● Quick scaling. You can scale just one container rather than an entire monolithic
application.

Challenges With Containers


● Orchestration is needed. You’ll need an orchestrator like Kubernetes or
Amazon Elastic Container Service to manage all those containers.
● Complexity. If a service fails, it can be difficult to detect where the problem lies, as there
could be hundreds of containers for one overarching business application. Distributed
tracing tools will be needed to identify problems.
● Persistent storage. Container data typically disappears once the instance shuts
down, so you’ll need an external storage component that’s compatible with your
orchestrator.

Summary
VMs are a great way to move legacy and traditional applications to the cloud. Containers work
great for creating large applications and adopting a cloud-native architecture.

Take a course to learn the core technical skills needed for VMs and containers.

Chapter 8 Docker
How Docker Simplifies and Speeds Up Application Development
Docker containers provide functionality like that of virtual machines (VMs) but are far more
lightweight, as they occupy far less space on the host machine. Since its inception in 2013, the
open source Docker engine has been one of the most impactful innovations in IT. Worldwide,
over 13 million developers currently rely on Docker’s Platform-as-a-Service (PaaS) technology
to facilitate speedy, scalable, container-based development.

What are Containers and How do They Work?


A container is a standard unit of software packaging: it bundles application code with all of its
dependencies so the application runs quickly and reliably from one computing environment to
another.
To run a container, you need a container runtime, or container engine, such as Docker.
Essentially, containers act as a conduit that connects your hardware and its operating system
(OS) to individual applications. The application is stored in a container along with needed
libraries and dependencies, making it easy to move the application between different virtual
machines and computer systems across environments without impacting core functionalities or
other software.

Key Advantages of Containerized App Development:

Speedy Application Installations


In a traditional development ecosystem, installing new applications is time-consuming.
Developers need to check multiple dependencies and troubleshoot conflicts that arise between
the new application and existing software. With containers, the application comes packaged
with its required libraries and dependencies. Conflict-free installation can be achieved quickly by
running a single command.
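As a rough illustration, the sketch below uses Docker’s official Python SDK to pull an image and run a command inside a container in a single call; the image name is arbitrary, and it assumes a local Docker daemon is available. It is the programmatic equivalent of a one-line docker run at the command line.

# Minimal sketch using Docker's Python SDK (pip install docker); assumes a local Docker daemon.
import docker

client = docker.from_env()  # connect to the daemon using environment defaults

# Pull the image (which packages the runtime and its dependencies) and run one command in it.
output = client.containers.run(
    "python:3.11-slim",
    ["python", "-c", "print('hello from a container')"],
)
print(output.decode())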

Prepackaged Dependencies
Traditional development often leads to a complex, convoluted matrix where different
applications require different versions of the same libraries and dependencies. Docker alleviates
that by packaging all the dependencies into a container alongside an application.

Easy Transferal Between Systems


Modern development often requires you to move applications between different environments,
like a developer’s laptop and a production enterprise system. Without containers, packaging and
moving software is an arduous and error-prone task. Containers simplify the process.

Isolation and Identification of Key Processes


In a traditional development environment, there are several interrelated moving parts. When
things go wrong, it can be challenging to identify the cause and respond accordingly. However,
Docker containers isolate applications and their related libraries and dependencies, making it
easier to diagnose issues and implement the appropriate solution rapidly and securely.

Rapid, Flexible Scalability


One of the major drawbacks to the traditional deployment model is the difficulty that ensues
when trying to scale. The process is time-consuming and complex. With containers, simply by
leveraging a load balancer to split traffic, developers can quickly and confidently scale
applications.

Ready to Take the Next Steps in Your Education?


To find out how your company can quickly get up to speed with Docker, visit Docker
Containerization Boot Camp.

Chapter 9 Kubernetes

Using Kubernetes to Control Containers


Over the past few years, the increase in containerized application development and
microservices architecture has changed the way the tech sector operates. Across the board,
organizations are migrating from outdated monolithic architectures to flexible, scalable
alternatives built around microservices. Many factors have contributed to this shift, but one of
the biggest has been the widespread usage of Kubernetes.
Kubernetes is an orchestration tool for managing and automating workloads housed in
containers in various environments. Each container is responsible for running a single service,
or workload, so instead of having one monolithic application that provides hundreds or
thousands of services, each container has only a single mini-application running an independent
service. In a monolithic application, services are tightly woven together and dependent upon
one another. If one service goes down, it could affect other services downstream. But with an
independent service in its own container, if that service goes offline, it doesn’t affect the other
services.
With a monolithic application, you only need to manage that one application. But with
containerized applications, which are composed of numerous mini-applications that work
together to form one business application, each container needs to be managed. Can you
imagine having to manage 100 tiny applications to ensure the overall business application
works seamlessly? It would be nearly impossible. But with an orchestrator like Kubernetes, you
can easily manage groups of individual applications.

What is Kubernetes?
Also known as K8s (for the eight letters in between the first and last letter of its name),
Kubernetes is an open source container orchestration platform that automates the deployment,
scaling, and management of containerized applications across multiple hosts. A Kubernetes
cluster consists of a set of worker machines, called nodes, that run containerized applications.
Able to run on-premises, in the cloud, and in hybrid environments, Kubernetes treats the entire
cluster as a single resource, making it much easier to deploy, scale, and manage containerized
applications.

Kubernetes helps with the following areas:


● Load balancing: If traffic to a container is high, Kubernetes can load balance and
distribute the network traffic so that it is spread evenly across the Pods, which hold and
run your containers.
● Scaling: K8s automatically scales the workload to match demand (see the sketch after this list).
● Self-healing: If a container or machine fails, Kubernetes automatically replaces it or
redeploys the afflicted container to its desired state to restore operations, ensuring that
the application continues to run without interruption.
● Rolling updates: This is a rolling deployment that incrementally updates Pod instances
with new ones to achieve zero downtime.
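As a small sketch of what this looks like in practice, the snippet below uses the official Kubernetes Python client to scale a hypothetical deployment; the deployment name and namespace are placeholders, and it assumes your local kubeconfig already points at a cluster.

# Minimal sketch using the official Kubernetes Python client (pip install kubernetes).
# The deployment name ("catalog") and namespace ("shop") are placeholders.
from kubernetes import client, config

config.load_kube_config()  # read cluster credentials from your local kubeconfig
apps = client.AppsV1Api()

# Ask Kubernetes for three replicas; it spreads traffic across the Pods and
# recreates any Pod that fails, keeping the workload at the desired state.
apps.patch_namespaced_deployment_scale(
    name="catalog",
    namespace="shop",
    body={"spec": {"replicas": 3}},
)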

Decrease Release and Deployment Times


Compared to a traditional development ecosystem, containerization enables the rapid testing,
deployment, and release of applications. Deploying Kubernetes early in the development
lifecycle enables teams to test code faster than they could otherwise.
Kubernetes supports multicloud development and deployment environments by integrating with
public cloud platforms. Kubernetes schedules and automates many formerly manual processes
and can automatically scale resources based on need. Kubernetes also includes rollback
features that make it easy to get development back on track when things go wrong. When
deployed in conjunction with a containerization tool like Docker, Kubernetes enables teams to
develop, release, and scale applications faster and more efficiently than ever before.

Ready to kick your Kubernetes knowledge into high gear?


Get started with Kubernetes Application Essentials, and then move ahead with Kubernetes
Operations.

Chapter 10 Programming Languages and Frameworks

Top 11 Programming Languages and Frameworks for Cloud Computing
The decision to focus on cloud-native development might be something of a no-brainer, but
deciding which programming languages, frameworks, and developer tools to get certified in isn’t
as simple. Do you want to focus on something widely used and transferable like Python or Java,
or would you prefer to become proficient in a more narrowly defined language like the Google-
created Go language?
To make things easier, we’ve compiled a list of the most widely used and useful programming
languages to get certified in for aspiring and current cloud developers in 2023. In no particular
order, here they are:

11 Key Programming Languages and Frameworks for Impactful Cloud Development

· Python
Python boasts a massive collection of third-party modules and support libraries.
Compared to many other programming languages, it’s also relatively easy to learn, with
a highly engaged online community for troubleshooting and resolving issues.
Python might be beginner-friendly, but it's not just for beginners. Large organizations
ranging from Wikipedia to NASA all use Python, as do many social media companies
and some of the most prominent cloud platforms in the world, including AWS and
Microsoft Azure. Leveraging its immense development capabilities, Python is one of the
best coding languages for rapidly growing fields like artificial intelligence (AI), machine
learning (ML), and data analytics.

· Java
First released in 1995, this high-level, general-purpose language continually ranks
among the most widely used programming languages. It’s not hard to see why. Java is
versatile, modular, and platform-independent. This means that Java applications can run
on both Windows and Linux as well as other popular operating systems. Every major
cloud platform provides a software development kit (SDK) for Java.

Java’s syntax is similar to the C and C++ languages that influenced its development but
has fewer low-level facilities than either of them. Java is an object-oriented language,
with all code written inside classes and no operator overloading support. TIOBE, which
publishes a monthly index of programming language popularity, ranked Java in October 2022 as
the third most widely used programming language in the world. [1]

Like Python listed above, Java is considered relatively easy to use when compared to
other programming languages. Developers frequently take advantage of Java’s security
and portability to code scalable enterprise cloud applications.
When it comes to cloud computing, Java’s multi-platform ability to run the same program
across multiple systems makes the end-to-end development process much smoother.
Java is one of the best cloud development languages for beginners.

· Microsoft .NET Framework


.NET refers to both Microsoft’s proprietary .NET Framework and its free, open source
cross-platform successor. The language interoperable framework includes a large class
library known as the Framework Class Library (FCL). It is compatible with many different
programming languages, including C++, C#, F#, and Visual Basic. Python programmers
can also use the .NET framework via IronPython.

Though developed by Microsoft, .NET is compatible with all major cloud platforms. With
that said, many Azure products, features, and capabilities were designed to run .NET
natively.

Released alongside .NET back in 2002, ASP.NET is an important tool that builds
upon .NET’s existing web application development capabilities with additional editing
features, libraries, and templates. It’s an open source language interoperable web app
framework that can be used for speedy, scalable cloud development. ASP.NET Core offers a
modular cross-platform version.

[1] TIOBE, “TIOBE Index,” October 2022. Source.

· Angular
Developed by Google, Angular is an open source web platform that is increasingly
popular with cloud developers. Angular is a complete rewrite of the discontinued
AngularJS framework and has introduced core features that function independently to
reduce the risk of minor errors derailing the code.

Dynamic loading allows Angular to start up independently, discover new libraries, and
access new features, significantly speeding up start times and simplifying cross-platform
cloud development.

Angular is written in Microsoft’s TypeScript. The developers recommend using this free,
open source programming language when developing web applications in the platform.
A superset of JavaScript, TypeScript was originally designed to simplify some of the
complexities in JavaScript’s code. As such, TypeScript has all the core capabilities of
JavaScript and can be used to develop JavaScript applications. Angular can also be
used with CSS and HTML.

· React
When it comes to building user interfaces, one of the most beneficial tools in a cloud app
developer’s arsenal is the React JavaScript framework. Also known as ReactJS, it is a
front-end library for creating eye-catching user interfaces.

According to a Stack Overflow survey from 2021, React was the preferred web
framework among developers. [2] It offers rapid development speed, improved efficiency,
and extensions that enable easy custom component creation. React’s speedy rendering
can drastically reduce app and website load times, helping improve SEO to get noticed
on Google and other search engines.

One of the biggest selling points, however, is the emphasis on User Interface (UI). The
ReactJS framework’s declarative components, reusable components, and virtual
Document Object Models (DOMs) enable the streamlined creation of engaging user
interfaces.

Entry-level cloud developers will also be interested in the framework’s large online
community and relative ease of use. React only deals with the view layer of a website or
app, giving JavaScript developers a much easier entry point than other options like
Angular.

· Spring
Spring is a lightweight, open source framework created for Java development.
Leveraging core features that are compatible with all Java applications and web application
extensions for the Java Enterprise Edition platform, Spring provides developers with the ability
to build scalable, secure apps quickly.

[2] Stack Overflow, “2021 Developer Survey,” 2021. Source.

With Spring, developers can define remote procedures without remote APIs, run
database transactions without transaction APIs, and solve complex technical problems
in real time. The Spring Framework contains a collection of sub-frameworks, including
Spring Web Flow, Spring ORM, Spring Cloud, and Spring MVC. The Spring Framework
also serves as a base for Spring Boot and Spring GraphQL.

Spring offers predefined templates, loose coupling, consistent transaction management,
dependency injection, and aspect-oriented programming support.

· Go
Some developers call it Go, others call it Golang: Either way, this is one of the best
programming languages for cloud development. This robust, modern language was
ranked among the top 10 most widely used programming languages as recently as
March 20203 and was ranked No. 11 as of October 2022.4

At the time of its initial development, the primary goal of Go was to improve upon the
perceived deficiencies of languages like Python, JavaScript, and C while incorporating
the positive aspects of each.5 As a result, Go combines the runtime performance of C with
the ease of use and accessibility of Python and JavaScript.

Go enables the reliable, speedy development of secure, scalable apps via microservices
and boasts impressive degrees of package management and concurrency support. It
can be used across most cloud platforms but is most effective when developing cloud-
native apps for Google Cloud.

· Rust
Rust is a general-purpose, multi-paradigm programming language that emphasizes
memory safety and performance. Its built-in ability to provide safe access to hardware
and memory without a runtime or garbage collector enables developers to catch and
address unsafe code before it reaches the user.6 More importantly, Rust provides this
additional level of security without sacrificing speed or increasing memory consumption.
This combination makes it a great fit for cloud developers looking to reduce the
frequency and prevalence of bugs without slowing down or otherwise impacting
performance.

The primary drawback of Rust is its inherent difficulty. It has a steep learning curve that
can be challenging for entry-level developers. In the short term, organizations might
suffer a decrease in immediate productivity, which could scare them away from this
otherwise beneficial programming language. This difficult learning curve has led to Rust
being less widely used than it should be based on its capabilities and features. With that
said, many of these once-hesitant organizations are now realizing that Rust
development leads to secure, stable cloud-native apps.

3 TIOBE, “The Go Programming Language,” October 2022. Source.
4 TIOBE, “TIOBE Index,” October 2022. Source.
5 Stanford University, “Stanford EE Computer Systems Colloquium,” April 2010. Source.
6 Tech Target, “Rust Rises in Popularity for Cloud Native Apps,” August 2021. Source.

· Kafka
Written in Java and Scala, Kafka is an open source event streaming platform that
processes data feeds in real time across multiple systems. Kafka enables speedy,
scalable development by capturing and recording streaming data in an immutable
commit log that can then be accessed and added to.

According to Apache, over 80% of all Fortune 500 companies use Kafka in some
capacity, and the platform has been downloaded over 5 million times.7 It’s a trusted,
secure, and highly available source of permanent storage, with a robust library of open
source tools, built-in stream processing and the ability to connect with almost everything,
including the stream processing services of the major cloud platforms. Kafka also gives
developers the capability to access and process event streams in many different
programming languages.

Kafka is capable of handling millions of messages per second, expanding or contracting
storage based on need, and delivering low-latency messages within two milliseconds.
One of the few drawbacks to Kafka is the difficulty of setting up, deploying, and
managing clusters on-site, but this can easily be offset by accessing Kafka as a
managed service via the cloud.8
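
As a rough idea of what producing to and consuming from Kafka looks like in code, here is a minimal Python sketch using the kafka-python client. It assumes a broker is reachable at localhost:9092, and the orders topic and message payload are illustrative, not taken from this article.

from kafka import KafkaProducer, KafkaConsumer

# Publish one event to a hypothetical "orders" topic.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("orders", b'{"order_id": 123, "amount": 49.99}')
producer.flush()

# Read the stream back from the beginning of the commit log.
consumer = KafkaConsumer(
    "orders",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop polling after five idle seconds
)
for message in consumer:
    print(message.topic, message.partition, message.offset, message.value)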

· Confluent
Built by the creators of Kafka, the Confluent platform provides many of the same data
streaming capabilities without the demanding monitoring and management
requirements. Confluent simplifies the connection between Kafka infrastructure and data
sources, serving as a central source of truth for all historical and real-time data. In
addition to enabling databases and file systems to access Kafka via the Kafka Connect
API, Confluent also provides a Kubernetes operator.9

Confluent is the best option for cloud-native development on Kafka. The platform
provides direct access to a serverless, cost-effective, and highly available cloud
development ecosystem that’s currently used by most Fortune 500 companies.10
Essentially, Confluent turns Kafka from a demanding tool with significant overhead and
management requirements into an open source, enterprise-ready cloud infrastructure
solution for scalable growth.

7 Apache Kafka, “Company Website,” November 2022. Source.
8 Google Cloud, “What is Apache Kafka,” November 2022. Source.
9 Confluent, “Company Website,” November 2022. Source.
10 Confluent, “What is Apache Kafka,” November 2022. Source.

· Selenium
Selenium is an open source tool used for automated browser testing. It can perform
browser compatibility tests using scripts written in a wide assortment of popular
programming languages, including the ones mentioned in this article.

Local browser testing infrastructure can be inflexible, unscalable, and expensive, and
typically lacks the capabilities required to run adequate tests. Cloud-based Selenium
testing is different. In addition to following a pay-for-what-you-use model with no
infrastructure overhead, testing with Selenium in the cloud leverages the power of parallel
testing to unlock more complete coverage.

Selenium WebDriver allows you to automate graphical user interface (GUI) tests on
Chrome, Firefox, and other leading browsers and has a sizable online community for
troubleshooting and support.
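
For a sense of how little code a basic WebDriver test requires, here is a minimal Python sketch. It assumes Selenium and a local Chrome driver are installed; the URL and the element being checked are illustrative.

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumes a local Chrome driver is available
try:
    # Load a page and verify that its main heading rendered.
    driver.get("https://example.com")
    heading = driver.find_element(By.TAG_NAME, "h1")
    print("Page heading:", heading.text)
finally:
    driver.quit()  # always release the browser session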

Ready to take your cloud development game to the next level?


Check out the range of cloud computing and cloud programming courses to elevate your coding
skills.

Chapter 11 Automation: IaC

The Power of Automation: Making Manual Infrastructure Management Obsolete

Provisioning, configuring, managing, and reconfiguring infrastructure has long been a time-
consuming, difficult process for system administrators. Repeatedly executing any process
manually runs the risk of introducing inconsistency. Even when processes are executed correctly,
administrators still must reconfigure individual servers when they go offline due to an error
or accident. But with automation, no one has to configure or reconfigure servers individually;
it’s done automatically once the server is connected to a configuration tool.

Automated hardware and service configuration is also referred to as Infrastructure as Code
(IaC), a configuration management process that uses code and code development practices to
automate the configuration of infrastructure. Servers, load balancers, databases, firewalls,
storage, services, and permissions can all be defined as code and applied to development,
testing, and production environments.

Founded on DevOps practices, IaC automates processes for both system administrators and
DevOps teams. IaC allows you to build, change, and manage your infrastructure in a safe, consistent,
and repeatable way by defining resource configurations you can version, reuse, and share.
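
To make the declarative idea concrete, here is a toy Python sketch of desired-state reconciliation, the pattern IaC tools implement at a much larger scale. Everything in it (the server names, settings, and the reconcile helper) is hypothetical and for illustration only.

# Toy illustration of declarative, idempotent configuration.
# The server names, settings, and reconcile() helper are hypothetical.

desired_state = {
    "web-server-01": {"size": "t3.medium", "open_ports": [80, 443]},
    "web-server-02": {"size": "t3.medium", "open_ports": [80, 443]},
}

def reconcile(current_state: dict, desired: dict) -> None:
    """Bring every server to its declared configuration, touching only what drifted."""
    for name, config in desired.items():
        if current_state.get(name) != config:
            print(f"Reconfiguring {name} -> {config}")
            current_state[name] = config
        else:
            print(f"{name} already matches the desired state; nothing to do")

current = {"web-server-01": {"size": "t3.small", "open_ports": [80]}}
reconcile(current, desired_state)
reconcile(current, desired_state)  # a second run changes nothing: idempotency

Real IaC tools add the hard parts (planning, dependency ordering, and state storage), but the version-controlled, repeatable description of the end state is the same idea.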

Without IaC, teams must maintain deployed environment settings individually, which is
time-consuming. Over time, each of those environments becomes a unique configuration
(a problem referred to as drift) that cannot be reproduced consistently, causing
inconsistencies among environments and problems with deployment and security.
The impact of infrastructure automation via IaC
Below is a list of the strategic business benefits that infrastructure automation provides:
 Helps to stop inaccuracies and inconsistencies in the creation and documentation of
your architecture
 Provides an accurate picture of your infrastructure at any given moment
 Enables and enhances version control
 Reduces reliance on undocumented or poorly shared tribal knowledge that may be held
by just a few people in your organization
 Minimizes the risk of human error
 Reduces the time needed to manage technology infrastructure, freeing up staff to focus
on higher-value tasks

Who benefits from IaC?


 Software Developers
 Technical Managers and Leads
 System and Cloud Administrators
 Network Engineers

While it seems like all companies should put IaC in place, many don’t because it
requires an initial time investment. But once IaC becomes part of your system, it generates
repeatable, identical environments for any one device (or group of devices), providing you
with the same environment every time it deploys and preventing configuration errors.

Benefits of a Configuration Tool


 When a configuration tool is used in combination with a monitoring tool, the latter
automatically detects when a server can’t handle any more traffic and sends a request to
the load balancer to spin up a new server instance with the application, or
applications, that reside on it.
 A configuration tool, whether it’s one provided by your cloud provider or one like Ansible
or Terraform, offers configuration models that allow you to configure the desired end
state of your infrastructure. And when you discover a problem with a change you’ve
made, the tools give you a way to compare two different deployments, roll back to the
earlier configuration version, and then deploy a different change.
 When you need to make infrastructure changes to improve security or compute
resources, you just update the code in the tool and publish those changes to a server or
group of servers.
 The tool also helps your DevOps team use IaC with its Continuous
Integration/Continuous Deployment (CI/CD) pipelines.
 Some configuration tools include a discovery process to capture and document your
environment.

IaC and CI/CD
As well as benefiting system administrators, IaC helps developers to use the correct
infrastructure as they migrate applications from development to staging to production. When this
process is automated, once the team integrates their work for the day, the code is automatically
tested, and if it works, it can automatically be deployed. If it doesn’t work, the team gets notified
of the error so they can fix it before deploying the code. These environments in which
developers build, test, and deploy code can all be created and removed in one fell swoop with
IaC.
Your IaC tools allow you to automatically provision new infrastructure for each developer. When
the developer team integrates their code into the pipeline, the IaC tool detects the code and
automatically creates a new virtual test environment.

Discover how to automate your infrastructure with Ansible or Terraform and how to create an
automated CI/CD pipeline.

Chapter 12 Terraform vs. Ansible

Understanding the Key Differences Between Ansible and Terraform

Ansible and Terraform are both used to automate repetitive tasks that typically take system
administrators hundreds of hours a month to manually perform. Yet many companies are still
not taking advantage of these Infrastructure as Code (IaC) tools, which can release system
administrators from laboring over manual procedures.
Without IaC tools, system administrators must manually configure servers, release new versions
of applications on dozens of servers, install security updates to servers and applications,
conduct backups and system reboots, create users, assign permissions for individuals and
groups, and document the latest server configurations and steps for installing applications.
Many companies still spend countless hours each week doing these tasks, but they all can be
automated with software like Terraform and Ansible.
In this article, we’ll help you understand the benefits of Terraform and Ansible, as well as the
main ways they differ from one another. But first, let’s define what each of these software
solutions was designed to do.

What is Ansible?
Ansible is a collection of open source software tools that automate configuration management,
software provisioning, intra-service orchestration, server updating, and many other routine IT
tasks. Written in Python, Ansible is easy to deploy, making it a popular option for organizations
looking to streamline version control. It does not require extensive programming knowledge to
understand, which is advantageous to end-users as well as DevOps teams. Ansible can
configure systems, deploy software, and orchestrate advanced workflows to support application
deployment, system updates, and more. It also supports hybrid cloud automation, network
automation, and security automation. Automation streamlines essential routine activities in
addition to testing and deploying network changes, helping you run your network more
efficiently.
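
As a rough illustration of how simple Ansible tasks can be to drive, the short Python sketch below shells out to the ansible command-line tool to run the built-in ping module against a host group. It assumes Ansible is installed and reachable hosts are listed in an inventory file; the inventory path and the webservers group name are placeholders.

import subprocess

# Run an ad-hoc Ansible command: ping every host in the "webservers" group.
# The inventory path and group name are placeholders for this sketch.
result = subprocess.run(
    ["ansible", "webservers", "-i", "inventory.ini", "-m", "ping"],
    capture_output=True,
    text=True,
)
print(result.stdout)  # each reachable host reports a "pong" response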

What is Terraform?
Terraform is an open source solution for securely building and maintaining IaC processes. It can
manage proprietary infrastructure solutions as well as other solutions provided by third-party
vendors. Terraform-managed infrastructure can be hosted on leading public clouds like Amazon
Web Services (AWS), Google Cloud, and Microsoft Azure. Alternatively, it can be hosted on-
premises using private clouds. Terraform is commonly leveraged by IT departments and
DevOps teams to ensure a single, secure workflow across multiple cloud environments.
Terraform is also commonly used for managing Kubernetes clusters and multicloud
deployments, as well as for automating the infrastructure deployment of existing workflows.

Which Tool Should You Use?


The answer depends on what you’re trying to accomplish.
Terraform’s platform-agnostic software is great for organizations that want to streamline the way
they securely collaborate across different environments and/or transition to multicloud
infrastructure management. The tool is also useful for developing three-tier architectures and
enforcing policies before users can create infrastructure.
Ansible is a great option with a low barrier to entry. IT departments across the globe rely on the
tool to automate day-to-day tasks related to cloud provisioning, infrastructure provisioning,
application deployment, and enterprise-grade security.
Learn more about automating manual tasks that take hours by earning a certification in Ansible
Essentials or studying the basics of Terraform.

Chapter 13 Cloud Security

Creating a Robust Cloud Security Architecture

Whether or not you intentionally store data in the cloud, some of your data is already there,
whether through applications you intentionally use, like Salesforce, or through applications your
employees use without your permission. That’s why cloud security needs to be at the forefront of
every IT decision you make.

In the early days of public cloud service providers, many companies were leery of housing data
in the cloud because of the strict requirements they had to comply with. But nowadays,
according to the Sysdig 2022 Cloud-Native Security and Usage Report, most security incidents
in the cloud occur due to misconfigurations. Granting excessive privileges, unintentionally
exposing assets to the public, and neglecting to change weak default configurations are
examples of common misconfiguration errors that are the fault of cloud customers rather than the
cloud service providers (CSPs) themselves. Generally, cloud customers are not taking the
necessary precautions to secure their data.

The main cloud service providers employ dozens of cybersecurity professionals and must
maintain the highest security standards for a variety of compliance requirements, like HIPAA,
PCI DSS, and DPA. Cloud service providers regularly undergo assessments by Qualified Security
Assessors (QSAs), so cloud customers should be more concerned with their own cloud cybersecurity
practices than with the security of their cloud providers.

CSPs aren’t perfect, but breaches that do occur there are mainly due to issues arising from their
customers’ subpar cloud architecture, regulatory compliance violations, poorly configured
services, vulnerable APIs, and inside attacks.
In other words, the strategies and architecture deployments of your internal team are often more
impactful to your overall cloud security than insecure data centers. Optimizing your cloud
architecture is the most powerful way to improve organizational cloud security.

How To Enhance Cloud Security With Your CSP


The first step to enhancing your cloud security protocols is understanding your specific
requirements. Cloud security is different from on-premises security and varies from one CSP to
another. Optimal Azure cloud security looks different from optimal AWS security. Understanding
the different security practices associated with each cloud provider makes it easier to develop a
solution that keeps your data secure.

Cloud security certifications are a great way to build this knowledge, but there are lots of
different options to choose from. Broadly speaking, these can be split into two categories:
vendor-specific security certifications and security certifications from professional IT
organizations.

Vendor-Specific Security Certifications


As the name suggests, vendor-specific cloud security certifications focus on building skills within
a singular cloud platform like Amazon Web Services (AWS), Google Cloud Platform (GCP), or
Microsoft Azure. If your data is housed with a single cloud provider, it’s helpful to understand
what that means for your security in specific, narrowly defined terms.

In the security courses for each cloud provider, you’ll learn how to protect your environment in
their cloud. Courses like Security Engineering on AWS, Microsoft Azure Security Technologies,
and Security in Google Cloud provide a wide overview of security controls for their clouds. Each
cloud service provider offers different controls, so if your data is in a multicloud environment,
you’ll need security training for each CSP.

Security Certifications From Professional IT Associations


Alternatively, vendor-agnostic industry certifications focus on the more generalized, transferable
aspects of cloud security. These are issued by nonprofit organizations such as (ISC)2, ISACA,
and CompTIA and focus more on the big-picture aspects of cloud security. Some of the more
popular certifications include Certified Information Systems Security Professional (CISSP),
Certified Cloud Security Professional (CCSP), and Certified Information Security Manager
(CISM).

If you want to secure your public cloud, take the security courses that your cloud provider offers,
such as AWS Security Essentials or Microsoft Azure Security Technologies.

Chapter 14 DevOps

Using DevOps to Speed Time to Market

To stay competitive and better market to customers, companies consistently create new
applications and revise existing ones. But innovation is hampered when problems arise between
the development side and the operations side of application production. Development teams
write code to create new products, features, and updates; operations teams deploy and manage
software while diagnosing any errors caused by integrating new code that developers deliver for
release. Because of the barriers between these traditionally siloed teams, miscommunication
and product release delays are not uncommon. To help both sides work better together,
companies use a collaborative approach, DevOps, which unites two teams — developers (Dev)
and operations (Ops) — to improve productivity by automating more of the product release
process and monitoring software performance in time-efficient chunks.

The Growth of DevOps


DevOps has been growing since it was introduced in 2007 when an IT project manager raised
concerns about the inefficiencies of software development practices due to the separation
between software development and IT operations teams. Developers would spend months
creating software, Quality Assurance (QA) would scramble to test everything before the
proposed release date and send back to the developers the defects that needed to be changed,
and developers would scramble again to fix the errors before sending the application back to
QA. Often, even though the software wouldn’t yet be production-ready, business stakeholders
would insist the software be released as planned, causing the need for hotfixes, and interrupting
the development team’s production cycle. This ineffective process prompted developers and IT
operations teams to work together in a new way; the term DevOps was coined to reflect their
united relationship and work processes, maximizing efficiencies for developing and deploying
applications.

Development and Operations


Development and operations combine to increase the efficiency, speed, and delivery of software
development. The job of development is to create application software, add new features to
existing software, and add updates for users. The job of operations is to generate constructive
feedback to improve the code, resulting in a better product. The operations team views
production from a completely different angle, focusing on ensuring users can access a fast,
stable, and bug-free system.
However, DevOps is not simply the combination of development and operations teams. It
merges a set of practices, processes, and cultural philosophies that must be adopted by all
components of the software pipeline.

Benefits of DevOps
Adopting the DevOps approach improves the quality of software development, the rate of
software releases, and the speed of innovation. Errors can be caught and fixed before code can
be pushed out into the production environment. The deployment process increases efficiencies
and shortens the time to market from months or weeks to days or hours.

Getting Started with DevOps


Take a course in Agile and DevOps, the CI/CD pipeline, or Git and GitHub to start managing
and improving DevOps.

Chapter 15 Git and GitHub


Git and GitHub: Their Uses and Differences

Git and GitHub are two different technologies used by developers. You don’t need GitHub to
use Git, but you need Git to use GitHub.

Git
Git is an open source (freely distributed) version control system (VCS), which automatically
tracks coding activity over time and allows developers to save each modified version of the code
in anticipation of situations that require reverting to an earlier version.

Git is a command-line application that developers can install and host on their personal
computers or their organization’s server, so multiple developers working on the same code base
don’t accidentally overwrite another developer’s changes. According to the 2022 Stack Overflow
Developer Survey, Git is the overwhelming choice for version control, used by almost 94% of
today’s developers.

GitHub
GitHub is a web-based hosting service that provides developers with a globally accessible
platform for collaborating on Git projects. Git and GitHub are two separate entities. A standard
way of doing version control, Git was developed by Linus Torvalds as a free open source
program. On the other hand, GitHub was created as a commercial (for-profit) product by a
couple of software developers, who had no connection to him. While GitHub offers a free
hosting service, a user-friendly VCS, and standard automation tools that fulfill the requirements
for many development projects, it also offers premium plans — GitHub Pro, GitHub Team, and
GitHub Enterprise — for more specialized development and deployment needs.

GitHub’s freely distributed VCS offers version control and activity tracking based on Git. GUI-
based and very intuitive, GitHub is easy to learn, even for non-programmers. GitHub’s standard
features include project management tools, such as user authentication and access controls,
permission settings, task management, and internal project team messaging.

Working With Git and GitHub


When Git users write and modify code using their personal computer, Git’s VCS automatically
date- and time-stamps the activity and stores the newly modified code without changing or
deleting previous versions. Everything is saved; nothing is overwritten or lost.

Git users can upload their work to an online forum or virtual community to solicit feedback or
find collaborators with similar interests and expertise. Alternatively, users already working with
other programmers can upload their work to an off-site server via a Local Area Network (LAN)
or an online location where the work is “merged” with the main project code.

As the world’s most popular online destination for Git projects, GitHub stores uploaded data in a
filing system called a “repository.” Using GitHub’s GUI-based VCS and project management
tools, a development manager shepherds the team’s collective body of work to completion with
minimum hands-on oversight. Think of GitHub as a virtual design studio where technology
projects based on work done by a team of Git collaborators are managed and stored.
Speaking of which, Git users don’t need to use GitHub, but to collaborate on the GitHub
platform, all users (with certain exceptions) must be using Git.

Git and GitHub Features and Functions


As previously noted, once installed on a PC (Mac, Linux, or Windows), Git’s VCS enables a
user to write, edit and comment while automatically stamping every action with the time, date,
and author. To enhance workflow, Git’s branching model design lets developers test ideas in
isolation before integrating the modification into the main project code.

For example, let’s say a Git user has an idea for a new feature or solution that will fix a bug. The
user creates a branch file where the new code is written. After uploading the file to the GitHub
repository, the modified code is subject to testing and approval before being merged with the
main code. The steps involved in this process are controlled by user-generated notifications,
known as “pulls” and “merges,” which automatically create a communication thread for audit
review.
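
Teams that automate this workflow can even open the pull request itself programmatically. The sketch below uses GitHub’s REST API through the Python requests library; the repository, branch names, and access token are placeholders, not values from this article.

import requests

# Open a pull request asking to merge a feature branch into main.
# The repository, branch names, and token below are placeholders.
token = "ghp_your_personal_access_token"
url = "https://api.github.com/repos/example-org/example-repo/pulls"

response = requests.post(
    url,
    headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    },
    json={
        "title": "Fix login timeout bug",
        "head": "feature/login-timeout-fix",  # branch containing the new code
        "base": "main",                       # branch to merge into
        "body": "Resolves the login timeout issue found in testing.",
    },
)
print(response.status_code, response.json().get("html_url"))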

Sounds simple enough, but when dozens or hundreds of Git programmers are working on a
project over an extended period, GitHub’s GUI-based VCS and management tools help smooth
out the workflow and make the project manager’s task much easier.

Project Management Using GitHub


In addition to serving as a web-based workshop and repository for Git project collaboration,
GitHub offers basic project planning and management tools that are listed below:
● Workflow displays to keep projects and tasks organized according to the current status
● Individual task assignment posting with progress to completion tracking
● “Notes,” “reviews” and “mentions” for communicating between project team members
● Project milestones for tracking activity and reviewing open issues
● Access rights for delegating different responsibilities to individual team members
● Integration with third-party applications for advanced functionality, such as report
generation and deployment options

Teams wanting to learn GitHub and developers seeking to improve their Git skills can learn
basic and advanced Git commands and best practices in just two days. Sign up now for a Git &
GitHub Boot Camp.

Chapter 16 Scrum vs. Kanban

Understanding the Differences Between Scrum and Kanban


Scrum and Kanban are two agile frameworks that help decrease the time it takes to deploy
applications. Agile is an incremental delivery of value utilizing continual planning, continual
learning, and team collaboration. Working together in an agile method helps IT teams discover
and fix problems that happen along the way rather than waiting until a large project has been
completed before evaluating how well it works. To work well in agile, IT teams typically adopt a
framework to keep the projects moving forward in a systemized manner to prioritize, manage,
and execute work. The most popular of these tools are Scrum and Kanban.
Scrum
Scrum is derived from the rugby term of the same name, in which players come together to solve
a problem (gaining possession of the ball) in the shortest amount of time.
IT teams that use the Scrum framework work in sprints to complete a large project by breaking it
up into smaller deliverable increments of value, and at the end of the sprint, they package the
items and release them for deployment.
At the beginning of each sprint, the product owner selects backlog items from the product
backlog, a prioritized to-do list. The scrum team creates a plan to deliver the desired features by
the end of the sprint. Each mini-project is written on a card. The Scrum Master leads the
discussion for the team to decide approximately how long it should take to complete the tasks
that they are considering for the upcoming sprint. The team will then only pull enough cards for
a sprint, which typically lasts two weeks. So, one sprint may consist of five cards, but the
following sprint may consist of eight. In each sprint, the team aims to complete all tasks on all
the cards they have pulled.
To stay organized, teams commonly use Kanban boards similar to the one shown below in
Figure 1. The board is generally organized into categories for each of the following tasks: To
Do, Develop, Test, and Release.

Board columns: To Do/Sprint Backlog, Dev, Dev Done, Test, Test Done, Release
To Do/Sprint Backlog: Ticket 5, Ticket 6, Ticket 7, Ticket 8
Dev: Ticket 3, Ticket 4
Test: Ticket 2
Release: Ticket 1

Fig. 1. A board that could be used for either Scrum or Kanban


The team members pull the cards from the Product Backlog and place them on the board in an
order which has been prioritized by the Product Owner. Each sprint starts with a clean board,
except in the unusual case where there is a card left over from the previous sprint.

Daily Scrum
Every morning at the same time, everyone attends a meeting, commonly referred to as the
stand-up, where each contributor to the project reports on their progress and issues that have
arisen since the last stand-up. These daily meetings, which last about 15 minutes, hold
everyone accountable for their work and allow people to discuss improvement ideas.

The Process
After the stand-up, people get to work, and over time, the cards gradually make their way across
the board. For a sprint that lasts two weeks, by the middle of the second week, it’s expected that
the cards have moved at least one step. It’s always a race to get everything done by the end of
the sprint so the services they built can all be released. Anything that was not completed during
a sprint stays on the board for the next sprint.
Retrospective
A retrospective meeting is held after each sprint to discuss what went well and what can be
improved upon in the next sprints. The team often discusses the best ways to improve the
processes, tools, and communication between teams and team members, as well as the action
items the team will commit to for the next sprint.

Scrum Team
A Scrum Master has no management authority over the team but is responsible for ensuring the
scrum values are followed and removing obstacles that obstruct the team’s progress. A certified
Scrum Master helps the team perform better and ensures the output meets the product objectives.
Often instrumental in assisting with face-to-face communication with stakeholders, the Product
Owner is responsible for maximizing the value of the product, as well as for deciding when the
team needs to continue developing the product and when to release it. A product owner who
holds the Certified Scrum Product Owner title helps ensure that the team works well together
and the product meets the needs of the business.
Both Scrum and Kanban limit the work in progress, identify bottlenecks, and improve
productivity. They both break down large tasks so they can be completed efficiently and place a
high value on continual improvement.

Kanban
Kanban, a Japanese word that means signboard or billboard, encourages small, incremental,
evolutionary changes to a current process and is focused on getting things done. Kanban seeks
to improve already established processes in a non-disruptive way by continuously improving
them through constant collaboration and feedback.
Originating in the manufacturing industry, Kanban was later adopted by agile software
development teams and has since been adopted by other business lines throughout a variety of
industries.
Kanban uses a board to visualize and manage work. The columns could be the same as the
ones in Fig. 2 below or could include other columns to suit the project and team. The work
items that need to be done are put on individual cards, each presenting important information
about a task. This information includes the names or job roles of the people who will handle the
work item, a brief description of the job being done, screenshots, technical details that help
explain the task, and the amount of time the piece of work is estimated to take.

Kanban Cards
The cards for projects start on the far-left column of the Kanban board and travel from left to
right across the board, allowing team members to see the status of any work item at any point in
time. If a card is held up too long in any column, everyone will be able to see that on the
visualization board, allowing team members to identify any tasks that are taking longer than
expected.

The Kanban Board


For any one large project, all the cards start on the far-left column, the To Do, or Backlog,
column. The cards are never “pushed” onto any team or column, meaning when one team
finishes a task, they don’t push the card onto the next team, as it may be busy with other cards
and have no time to take on a new one. Instead, the team that completes a task might put it in
its Done column. Then, when the following team is ready, it can pick up the card and begin the
tasks listed.

Limits
A central Kanban principle is to limit work in progress. Collectively, the team decides how many
cards can be in any one column at any one time. For example, the team members may decide
that there can be only five cards in the Dev column and four cards in the Test column.

Board columns: To Do/Kanban Backlog, Dev, Dev Done, Test, Test Done, Release
To Do/Kanban Backlog: Card 14, Card 15, Card 16, Card 17, Card 18
Dev: Card 9, Card 10, Card 11, Card 12, Card 13
Dev Done: Card 7, Card 8
Test: Card 3, Card 4, Card 5, Card 6
Test Done/Release: Card 1, Card 2

Fig. 2. A Kanban board


You’ll notice in the board above (Fig. 2) that even though Card 7 and Card 8 are in the Dev
Done column, the testers are not going to touch those cards because they already have four cards
in their column. And as stated above, this project team set a work-in-progress (WIP) limit of four
cards in the Test column. Not until the testers have only three cards in their column will they pull
a card from the Dev Done column.
Now, what if the testers have completed all their cards, so that there are no cards in the Dev
Done column, but there are still five cards in the Dev column? The Kanban board is simply
illuminating an inefficiency in the flow — which is one of its main purposes. If something is not
working well, the team composition may need to change, or WIP limits may need to be adjusted.
This is the “continually improving” aspect of Kanban.
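
The pull-based, WIP-limited flow described above can be modeled in a few lines of code. The Python sketch below is a toy model: the column names, limits, and cards mirror the example board, and the Board class is purely illustrative.

# Toy model of Kanban's pull-based flow with work-in-progress (WIP) limits.
# Column names, limits, and cards mirror the example board above.

class Board:
    def __init__(self, wip_limits):
        self.wip_limits = wip_limits
        self.columns = {name: [] for name in wip_limits}

    def add(self, column, card):
        self.columns[column].append(card)

    def pull(self, from_column, to_column, card):
        """A team pulls a card only when its own column still has WIP capacity."""
        if len(self.columns[to_column]) >= self.wip_limits[to_column]:
            print(f"Cannot pull {card}: {to_column} is at its WIP limit")
            return False
        self.columns[from_column].remove(card)
        self.columns[to_column].append(card)
        return True

board = Board({"Dev": 5, "Dev Done": 10, "Test": 4})
for card in ["Card 3", "Card 4", "Card 5", "Card 6"]:
    board.add("Test", card)
board.add("Dev Done", "Card 7")
board.pull("Dev Done", "Test", "Card 7")  # refused: Test already holds four cards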

A Collective Leadership
Kanban is all about visualizing and optimizing a process. All team members should have a good
understanding of Kanban skills and optimizing the system. They can become proficient in
adopting the practices by taking a Kanban workshop.

Can Scrum and Kanban be used together?


Yes! Scrum does not prescribe how teams organize their work; rather, it allows teams to
establish whatever processes work best for them. Many teams find that Kanban-style boards
are an excellent way to organize the production of the sprint's workflow items.
While Kanban is easy to understand, it’s difficult to put into practice. Developers, testers, and
project analysts may want to consider learning more about implementing Kanban and Scrum to
increase productivity and improve efficiency.
Chapter 17 Analytics

Why You Should Be Considering Cloud Analytics

Across industries and business departments, teams rely on data to drive decision-making
and optimize day-to-day operations. But when data is siloed and inaccessible to the people in
your organization who need it, they can’t analyze all the data the company has collected to
identify valuable business opportunities. This inaccessibility is one of the areas where on-
premises analytical technologies fall far behind cloud analytics.

What is Cloud Analytics?


Analytics refers to the process of storing, compiling, and analyzing data. This process can be
done with machines on-premises or in the cloud. Cloud analytics is the process of storing and
analyzing data in the cloud to extract actionable business insights. Companies that store data in
various systems on-premises often have duplicated data sets stored in two separate places.
Business units may remove some of the data in the data set to show only the information
pertinent to them. That works fine for that department. But when the business needs to analyze
the original data, it no longer has it because departments have altered it to suit themselves.
When business analysts review the company’s data, there are two different datasets, both of
which have been changed, and they don’t know which, if any, is the accurate one to analyze.
When data is stored in the cloud, the original data stays safe in a repository, while business
units can have copies of it and change it to suit their needs. When it comes time to analyze the
original data, it’s all there, providing the business with all it needs to know to make a wise
decision.
Machines that analyze data are expensive. Companies that perform analytics on-premises
may only have a couple of machines. With only one or two machines, it could take 12
hours or more to analyze a large data set. In the cloud, you can spread the workload among
numerous machines, so you can process data in a fraction of the time.
A 2020 Gartner report estimated that by 2022 public cloud services would be essential for 90%
of data and analytics innovation. The report also stated that data and analytics leaders need to
prioritize workloads that can exploit cloud capabilities.

What Are the Challenges of Traditional Analytics?


For data to be useful, it must be collected, organized, and analyzed. When data is siloed, it’s
often difficult for the people who need it to obtain it. The data is also often stored in different file
types, making it difficult or impossible for some departments to analyze it themselves.
Every day, businesses collect and grow data, including IP addresses, website content, products
and services reviewed or purchased, search queries, and a website visitor’s location. Internally,
the IT team may be logging every move that’s made within the network, such as firmware and
patch levels, software versions, and log events from dozens or hundreds of systems. Keeping
up with all this data is daunting, and while not all of it is currently being analyzed, analysts will
one day want to analyze some of the data being collected now. That means the analytics system
you’re using today will need to keep growing so it can store and analyze all that data.

Lack of Scalability
The first problem companies are faced with when performing analytics on-premises is the lack
of scalability with their traditional analytics software. Even if your data is already housed in the
cloud — and it’s likely that at least some or all of it is stored there — many companies are still
performing their analytics on-premises. This wasn’t a huge problem twenty years ago, when
most data was manually entered by humans at a keyboard. But these days, there is so much
machine-generated data that almost every organization is a “Big Data” organization.
Traditional on-premises data warehouse devices and Hadoop clusters have tightly coupled
compute and storage, so when either the data or the amount of processing required increases,
more expensive hardware needs to be purchased. Hadoop clusters are a collection of
computers, known as nodes, that are networked together to perform parallel computations on
big data sets. When you use cloud analytics, you don’t have to buy a collection of computers to
perform parallel computations. You pay the cloud providers only for the time that you use their
machines to perform computations.

Lack of Centralized Data Lake


The second problem with on-premises analytics is that you may often find yourself waiting
hours, days, or even weeks for reports due to the lack of a centralized place where you can pull
data from, such as a data lake. A data lake refers to a central storage area — usually housed in
the cloud — where different types of data can be stored. Because there are no filetype
limitations in a data lake, companies can bring in structured data, semi-structured data (CSV,
logs, XML, JSON), and unstructured data (media files, documents, PDFs, emails, IoT data) with
few limitations as to what kind of data, or how much of it, can be stored. When doing analytics outside
of the cloud, departments may pull from various silos to collect and use their data. But not
everyone can access those silos.
For example, a company’s IT, sales, marketing, and operations teams may each store their own
data in their silos. Not only is this inefficient, as the chances for data duplication are high, but
manually locating and then moving data in and out of silos across a network is time-consuming.
The more data you need to move, the more time it takes, delaying the analysis. Since data from
different silos is not easily shared, departments can’t easily access needed data that is housed
elsewhere.

High Costs
The third problem that arises when performing traditional analytics on-premises is the high cost
of both the analytics software and the servers it runs on. These software programs can take
days or weeks to get fully up and running and must continually be maintained and paid for,
regardless of whether analytics are run for just an hour a day or 12 hours a day. Running
analytics on huge data sets requires huge computing power, and that could take one machine
on-premises dozens of hours. Over time, as your data grows, that means more servers are
needed for storage and computing.

How Can Cloud Analytics Help?


The cost of cloud analytics is typically lower than the cost of performing analytics on-premises,
given the “pay for what you use” model in the cloud. Cloud analytics allows you to
create a single, organized place for all your data and allows different users and departments to
consume that data as they wish. It also always gives you the latest version of the data, highly
available object storage, and the ability to scale resources up or down as needed.

Improved Scalability and Built-in Tools


Cloud analytics providers can eliminate constraints associated with installing, updating, and
maintaining software on-premises by providing businesses with various tools for data
integration, analytics, data exploration, and data cataloging. You can choose exactly which tools
and services you need and have them up and running as soon as you need them.

Improved Security
The cloud also provides increased security and safety for both your business and your users.
Data stored in the cloud is generally backed up in multiple locations in the local region, and, if
you choose to, you can also store your data around the globe, for a fee, eliminating the danger
of a single point of failure. Sensitive information doesn’t have to be transferred through emails or
on flash drives. The cloud analytics service providers’ tools can help you to determine whether
there’s anything that violates governance or compliance requirements like GDPR or PCI DSS.

Ready to Make the Switch?


It is only a matter of time before cloud analytics is the new normal. Peruse the various courses
that will help you keep up to date with the latest cloud analytics technologies.

Chapter 18 Databricks

What is Databricks?

Companies have been collecting data about their business operations since the concept of a
business originated. With the advent of computers as tools for amassing large quantities of
data, they have increasingly faced problems associated with organizing, applying, and
understanding that data. To be of value to the organization, that data must be accessible, well-
organized, and accurate.
This is where Databricks comes into play.

Who Should Read This Article?


 C-level executives and IT leaders who want to understand what it takes to process and
transform huge volumes of data
 Data engineers, data analysts, and data scientists
This article presents a quick explanation of Databricks for technical professionals who work in
analytics. For everyone else, this article clearly explains the best processes to take to get the
most accurate insights into your company data so you can make the wisest business decisions.

A Technical Explanation for IT


Databricks is a managed cloud platform that provides a unified set of tools for data analysts,
data scientists, and engineers to collaborate on data engineering, analytics, and machine
learning workloads. A platform known as the Databricks Lakehouse combines the best features
of a data lake and data warehouse technologies to house everything under one roof with
complementary tools so organizations can extract the most value from that data.
Among the projects that form the core of the Databricks platform are Apache Spark, MLflow,
and Delta Lake.
Apache Spark is a general-purpose, high-performance, open source, distributed computing
framework for big data processing.
MLflow is an open source framework that helps manage your machine learning lifecycle. With
MLflow, you can track and manage machine learning experiments in ways that make it easy to
reproduce and deploy your machine learning models. MLflow is used to log the data you input
into your machine learning model, compare iterations of your experiments, package the model
into an easy-to-use interface, and track model metrics when deploying to production.
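
As a rough sketch of what MLflow tracking looks like in practice, the Python snippet below logs one run’s parameters and a metric. The parameter names and values are purely illustrative, and by default the run is recorded to a local mlruns directory rather than a managed tracking server.

import mlflow

# Record one training run: its parameters and a resulting metric.
# Parameter names and values here are purely illustrative.
with mlflow.start_run(run_name="demo-run"):
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("max_depth", 6)
    mlflow.log_metric("rmse", 0.42)
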
Delta Lake is a storage repository that combines the flexibility, scalability, and low cost of a
data lake with the structure, data integrity guarantees, and analytic capabilities of a data
warehouse.
Below is a list of some of the features of Delta Lake:

● ACID transactions to protect your data


● Scalable Metadata to handle petabyte-scale tables consisting of billions of partitions and
files with ease
● Time Travel (Versioning) to provide snapshots of versions, enabling developers to
access and revert to earlier versions of your data for audits, rollbacks, or any use case
requiring reproducibility
● Schema Evolution and Enforcement to prevent bad data from corrupting your data
● Audit History to provide a complete audit trail of the history of your data
● DML operations to capture changes to your data over time
● Unified Batch/Streaming Architecture to run both batch and streaming operations on a
single unified architecture
Delta Lake runs on top of your existing data lake and storage systems — Amazon S3, Azure
Data Lake Storage (ADLS), Google Cloud Storage (GCS), or Hadoop Distributed File System
(HDFS) — delivering low cost, reliability, security, and performance to your data lake. Capable
of storing structured, semi-structured, and unstructured data, Delta Lake allows the organization
to achieve the vision of having the data lake serve as the single source of truth for all enterprise
data needs. Additionally, Delta Lake allows the enterprise to automate tasks such as sourcing,
storing, cleaning, and analyzing any type of data at scale.
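
To ground these features, here is a minimal PySpark sketch that writes a Delta table, appends to it, and uses Time Travel to read an earlier version. It assumes a Spark session with the Delta Lake library configured (as on Databricks); the storage path and column names are illustrative.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-demo").getOrCreate()

# Write an initial batch of records as a Delta table (an ACID transaction).
orders = spark.createDataFrame(
    [(1, "widget", 2), (2, "gadget", 5)],
    ["order_id", "product", "quantity"],
)
orders.write.format("delta").mode("overwrite").save("/tmp/delta/orders")

# Append a new batch; Delta records it as a new version of the table.
new_batch = spark.createDataFrame([(3, "widget", 1)], ["order_id", "product", "quantity"])
new_batch.write.format("delta").mode("append").save("/tmp/delta/orders")

# Time Travel: read the table as it looked before the append.
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/delta/orders")
print(v0.count())  # 2 rows in version 0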

A Simple Explanation for Non-IT Business Professionals


Databricks is a company that provides a web-based platform for data engineering, machine
learning, and analytics. Built on top of Apache Spark, an open source framework for large-scale
data processing, Databricks provides a collaborative environment for data scientists, engineers,
and analysts to work with big data at any scale.
The main idea behind modern big data processing platforms is a simple one: run workloads in
parallel. Rather than using a single super-computer to process the vast quantities of information
that are being gathered these days, organizations can spread the data over a cluster of
standard computers and rely on a framework to take responsibility for breaking data down into
chunks that can be processed simultaneously on different machines.
In conclusion, your data scientists, data engineers, and data analysts can use Databricks to
process, store, clean, share, analyze, model, and monetize their datasets. They can use it with
virtually any public cloud provider or with your private cloud to access, store, and analyze
massive amounts of data.
Check out courses to get you started: Programming with Databricks and Data Analysis with
Databricks.

Chapter 19 Power BI vs. Tableau

Choosing Between Tableau and Power BI Business Intelligence Tools

Microsoft Power BI and Tableau (whose parent company is Salesforce) are business
intelligence (BI) tools that gather, integrate, analyze, and present business information to help
you interpret corporate data and make wise business decisions. The 2022 Gartner Magic
Quadrant report positions both tools as leaders in the BI space.
The functionality and features of the two platforms appear incredibly similar at first inspection.
But Power BI is easier to use, allowing anyone in your organization to start creating reports and
visualizing data. Tableau will be easy for data analysts, engineers, and data scientists to use but
will be more difficult for people who work outside of IT.

A brief look at Tableau and Power BI


Designed for the average stakeholder rather than a data analyst, Power BI features more drag-
and-drop and intuitive capabilities to assist teams in creating visualizations. The tool is an
excellent choice for any team that needs data analysis software but lacks extensive training in
data analytics.

Tableau is similarly strong, but its interface isn't as user-friendly, making it more difficult to use
and master. Those with prior data analytics and statistics experience will have an easier time
cleaning and translating data into visuals. People just starting out using analytics software will
undoubtedly feel overwhelmed by the uphill fight of learning basic data science before creating
any dashboards or reports.

Power BI is faster and performs better when data volume is limited, whereas Tableau can
handle large volumes of data quickly. Power BI also has limited access to some databases and
servers, while Tableau can access the types of databases and big data platforms, such as
Hadoop, that data analysts commonly use.

Below are listed some of the core capabilities of both platforms:


 Integrate multiple data sources into a single source of truth
 Measure business key performance indicators (KPIs)
 Provide data through dashboards
 Provide data transparency and sharing capabilities

Power BI
Power BI is a powerful application for data analytics, data visualization, and ad-hoc report
creation, providing a multi-perspective view of the information. After you clean and aggregate
data to create a single data model for analysis processes, you can view, visualize and analyze
the information to generate key business insights.

Power BI imports data from .PBIX files, reducing storage requirements, data transmission time,
and the need for more bandwidth. It offers several software services, over 100 connectors, and
a drag-and-drop capability to make it easy to use.

Tableau
Tableau is a well-known data visualization and business intelligence program used to analyze
and create reports on massive amounts of data. The software enables users to build various
charts, graphs, maps, dashboards, and stories to view and analyze data to make business
decisions. It incorporates natural language capabilities, used in artificial intelligence, into its
software, helping to find solutions to complex problems by understanding the data better.

The learning curve for Tableau dashboard development is slightly steeper than for Power BI, but
given that Tableau is geared more towards data analysts than casual users, it still doesn’t require
you to be highly technical.

Ease of Use
Power BI provides end users with real-time dashboards and data analysis. To improve the user
experience, the tool also has superb drag-and-drop functionality. Users do not require
substantial technical knowledge to use its robust data analytics and discovery features. 

Tableau also has powerful dashboard and reporting options, albeit some of them are less
straightforward. Tableau's live query features are advantageous to analysts. Tableau also
includes query-based visualization and drag-and-drop functionality.

To learn how to operate Tableau, take Tableau Desktop 1 Fundamentals. Business Analysts,
Data Analysts, and Data Scientists should start their Power BI learning path with Quickstart to
Power BI For Analysts And Users.

Conclusion

Once you decide on the technologies you’re most interested in working with, consider the job
roles and skills that are required to use them to their full capabilities. Even if your organization
has been using specific technologies for a while, it could be beneficial for various roles to
receive vendor-authorized training for them. Instructors often report back to us that students
who have been using a tool for more than a year say the course they just took gave them new
knowledge that would have saved them hours on projects they have previously worked on.
It's time to understand the skills and training needed to best use your various cloud technologies
to advance your digital transformation. ExitCertified has been providing IT training to individuals
and organizations for over 20 years. As well as providing standard courses, we also customize
training to suit your IT projects and goals, so your IT staff learns all they need to know to
perform their job duties. To speak with a subject matter expert who can help you discover which
courses would best suit you or your organization, contact us.
