Data Center
TABLE OF CONTENTS
Analytics servers
Adding an Analytics server
Fields
Editing an Analytics server
Adding an Analytics server to a resource pool
VIRTUALIZATION
ADVANTAGES AND DISADVANTAGES OF VIRTUALIZATION
REFERENCES
DATA CENTER
At its simplest, a data center is a physical facility that organizations use to house their
critical applications and data. A data center's design is based on a network of computing
and storage resources that enable the delivery of shared applications and data.
Modern data centers are very different than they were just a short time ago.
Infrastructure has shifted from traditional on-premises physical servers to virtualized
infrastructure that supports applications and workloads across pools of physical
infrastructure and into a multicloud environment.
In this era, the modern data center is wherever its data and applications are. It stretches
across multiple public and private clouds to the edge of the network via mobile devices
and embedded computing. In this constantly shifting environment, the data center must
reflect the intentions of users and applications.
Why are data centers important to business?
In the world of enterprise IT, data centers are designed to support business applications
and activities that include:
● Productivity applications
Data center design includes routers, switches, firewalls, storage systems, servers, and
application delivery controllers. Because these components store and manage
business-critical data and applications, data center security is critical in data center
design. Together, they provide:
Network infrastructure. This connects servers (physical and virtualized), data center
services, storage, and external connectivity to end-user locations.
Storage infrastructure. Data is the fuel of the modern data center. Storage systems are
used to hold this valuable commodity.
Computing resources. Applications are the engines of a data center. These servers
provide the processing, memory, local storage, and network connectivity that drive
applications.
Data center services are typically deployed to protect the performance and integrity of
the core data center components.
What is in a data center facility?
The most widely adopted standard for data center design and data center infrastructure
is ANSI/TIA-942. It includes standards for ANSI/TIA-942-ready certification, which
ensures compliance with one of four categories of data center tiers rated for levels of
redundancy and fault tolerance.
Tier 1: Basic site infrastructure. A Tier 1 data center offers limited protection against
physical events. It has single-capacity components and a single, nonredundant
distribution path.
Tier 2: Redundant-capacity component site infrastructure. This data center offers
improved protection against physical events. It has redundant-capacity components
and a single, nonredundant distribution path.
Tier 3: Concurrently maintainable site infrastructure. This data center protects against
virtually all physical events, providing redundant-capacity components and multiple
independent distribution paths. Each component can be removed or replaced without
disrupting services to end users.
Tier 4: Fault-tolerant site infrastructure. This data center provides the highest levels of
fault tolerance and redundancy. Redundant-capacity components and multiple
independent distribution paths enable concurrent maintainability and one fault
anywhere in the installation without causing downtime.
Many types of data centers and service models are available. Their classification
depends on whether they are owned by one or many organizations, how they fit (if they
fit) into the topology of other data centers, what technologies they use for computing
and storage, and even their energy efficiency. There are four main types: enterprise
data centers, which are built, owned, and operated by companies, are optimized for
their end users, and are most often housed on the corporate campus; managed services
data centers; colocation data centers; and cloud data centers.
Data Center Architecture Overview
The data center is home to the computational power, storage, and applications necessary
to support an enterprise business. The data center infrastructure is central to the IT
architecture, from which all content is sourced or passes through. Proper planning of
the data center infrastructure design is critical, and performance, resiliency, and
scalability need to be carefully considered.
Another important aspect of the data center design is flexibility in quickly deploying
and supporting new services. Designing a flexible architecture that has the ability to
support new applications in a short time frame can result in a significant competitive
advantage. Such a design requires solid initial planning and thoughtful consideration in
the areas of port density, access layer uplink bandwidth, true server capacity, and
oversubscription, to name just a few.
The data center network design is based on a proven layered approach, which has been
tested and improved over the past several years in some of the largest data center
implementations in the world. The layered approach is the basic foundation of the data
center design that seeks to improve scalability, performance, flexibility, resiliency, and
maintenance. Figure 1-1 shows the basic layered design.
Figure 1-1 Basic Layered Design
The layers of the data center design are the core, aggregation, and access layers. These
layers are referred to extensively throughout this guide and are briefly described as
follows:
Core layer—Provides the high-speed packet switching backplane for all flows
going in and out of the data center. The core layer provides connectivity to
multiple aggregation modules and provides a resilient Layer 3 routed fabric with
no single point of failure. The core layer runs an interior routing protocol, such
as OSPF or EIGRP, and load balances traffic between the campus core and
aggregation layers using Cisco Express Forwarding-based hashing algorithms.
Aggregation layer modules—Provide important functions, such as service
module integration, Layer 2 domain definitions, spanning tree processing, and
default gateway redundancy. Server-to-server multi-tier traffic flows through
the aggregation layer and can use services, such as firewall and server load
balancing, to optimize and secure applications. The smaller icons within the
aggregation layer switch in Figure 1-1 represent the integrated service modules.
These modules provide services, such as content switching, firewall, SSL
offload, intrusion detection, network analysis, and more.
Access layer—Where the servers physically attach to the network. The server
components consist of 1RU servers, blade servers with integral switches, blade
servers with pass-through cabling, clustered servers, and mainframes with OSA
adapters. The access layer network infrastructure consists of modular switches,
fixed configuration 1 or 2RU switches, and integral blade server switches.
Switches provide both Layer 2 and Layer 3 topologies, fulfilling the various
server broadcast domain or administrative requirements.
This chapter defines the framework on which the recommended data center architecture
is based and introduces the primary data center design models: the multi-tier and server
cluster models.
The server cluster model has grown out of the university and scientific community to
emerge across enterprise business verticals including financial, manufacturing, and
entertainment. The server cluster model is most commonly associated with high-
performance computing (HPC), parallel computing, and high-throughput computing
(HTC) environments, but can also be associated with grid/utility computing. These
designs are typically based on customized, and sometimes proprietary, application
architectures that are built to serve particular business objectives.
Chapter 2 "Data Center Multi-Tier Model Design," provides an overview of the multi-
tier model, and Chapter 3 "Server Cluster Designs with Ethernet," provides an
overview of the server cluster model. Later chapters of this guide address the design
aspects of these models in greater detail.
Multi-Tier Model
The multi-tier model is based on the three tiers common to enterprise applications:
web server, application, and database.
Multi-tier server farms built with processes running on separate machines can provide
improved resiliency and security. Resiliency is improved because a server can be taken
out of service while the same function is still provided by another server belonging to
the same application tier. Security is improved because an attacker can compromise a
web server without gaining access to the application or database servers. Web and
application servers can coexist on a common physical server; the database typically
remains separate.
Resiliency is achieved by load balancing the network traffic between the tiers, and
security is achieved by placing firewalls between the tiers. You can achieve segregation
between the tiers by deploying a separate infrastructure composed of aggregation and
access switches, or by using VLANs (see Figure 1-2).
Figure 1-2 Physical Segregation in a Server Farm with Appliances (A) and Service
Modules (B)
The design shown in Figure 1-3 uses VLANs to segregate the server farms. The left
side of the illustration (A) shows the physical topology, and the right side (B) shows
the VLAN allocation across the service modules, firewall, load balancer, and switch.
The firewall and load balancer, which are VLAN-aware, enforce the VLAN segregation
between the server farms. Note that not all of the VLANs require load balancing. For
example, the database in the example sends traffic directly to the firewall.
Figure 1-3 Logical Segregation in a Server Farm with VLANs
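To make the segregation concrete, here is a minimal sketch in Python (not from this
guide; the VLAN IDs are hypothetical) of the rule a VLAN-aware firewall enforces
between the tiers:

# A sketch of the logical segregation in Figure 1-3: each server-farm tier
# lives in its own VLAN, and the firewall only permits flows between
# adjacent tiers. VLAN IDs are hypothetical.
TIER_VLANS = {"web": 10, "app": 20, "db": 30}

# Adjacency encodes the multi-tier flow: clients -> web -> app -> db.
ALLOWED_FLOWS = {("web", "app"), ("app", "db")}

def firewall_permits(src_tier: str, dst_tier: str) -> bool:
    """Return True if the inter-VLAN firewall should pass this flow."""
    return (src_tier, dst_tier) in ALLOWED_FLOWS

print(firewall_permits("web", "app"))  # True
print(firewall_permits("web", "db"))   # False: web never reaches the database directly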
Business security and performance requirements can influence the security design and
mechanisms used. For example, the use of wire-speed ACLs might be preferred over
the use of physical firewalls. Non-intrusive security devices that provide detection and
correlation, such as the Cisco Monitoring, Analysis, and Response System (MARS)
combined with Route Triggered Black Holes (RTBH) and Cisco Intrusion Protection
System (IPS) might meet security requirements. Cisco Guard can also be deployed as
a primary defense against distributed denial of service (DDoS) attacks.
In the modern data center environment, clusters of servers are used for many purposes,
including high availability, load balancing, and increased computational power. This
guide focuses on the high-performance form of clusters, which comes in many variations.
All clusters have the common goal of combining multiple CPUs to appear as a unified
high performance system using special software and high-speed network interconnects.
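As a concrete illustration of that goal, the following minimal sketch uses MPI via the
mpi4py library (an assumption for illustration; this guide does not prescribe any
particular middleware) to spread one computation across all processes in a cluster job
and combine the results:

# Run with, e.g.: mpiexec -n 4 python cluster_sum.py  (filename hypothetical)
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's id within the cluster job
size = comm.Get_size()   # total number of cooperating processes

# Each compute node works on its own slice of the problem...
local_result = sum(range(rank * 1000, (rank + 1) * 1000))

# ...and the results are combined, so the cluster behaves like one big machine.
total = comm.reduce(local_result, op=MPI.SUM, root=0)
if rank == 0:
    print(f"{size} processes computed total={total}")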
Server clusters have historically been associated with university research, scientific
laboratories, and military research for unique applications such as weather simulation,
seismic analysis, and weapons research. Server clusters are now in the enterprise
because the benefits of clustering technology are being applied to a broader range of
applications, including financial trending analysis, film animation rendering,
manufacturing design modeling, and search engines.
In the enterprise, developers are increasingly requesting higher bandwidth and lower
latency for a growing number of applications. The time-to-market implications related
to these applications can result in a tremendous competitive advantage. For example,
the cluster performance can directly affect getting a film to market for the holiday
season or providing financial management customers with historical trending
information during a market shift.
In the high performance computing landscape, various HPC cluster types exist and
various interconnect technologies are used. The top 500 supercomputer list at
www.top500.org provides a fairly comprehensive view of this landscape. The majority
of interconnect technologies used today are based on Fast Ethernet and Gigabit
Ethernet, but a growing number of specialty interconnects exist, such as Infiniband
and Myrinet. Specialty interconnects such as Infiniband have very low
latency and high bandwidth switching characteristics when compared to traditional
Ethernet, and leverage built-in support for Remote Direct Memory Access (RDMA).
10GE NICs have also recently emerged that introduce TCP/IP offload engines that
provide similar performance to Infiniband.
The Cisco SFS line of Infiniband switches and Host Channel Adapters (HCAs) provide
high performance computing solutions that meet the highest demands. For more
information on Infiniband and High Performance Computing, refer to the following
URL: http://www.cisco.com/en/US/products/ps6418/index.html.
The remainder of this chapter and the information in Chapter 3 "Server Cluster Designs
with Ethernet" focus on large cluster designs that use Ethernet as the interconnect
technology.
Although high-performance clusters (HPCs) come in various types and sizes, three
main types exist in the enterprise environment:
HPC type 1—Parallel message passing (also known as tightly coupled). Applications
run on all compute nodes simultaneously in parallel, and a master node determines
input processing for each compute node.
HPC type 2—Distributed I/O processing (for example, search engines). Client requests
are balanced across master nodes and then sprayed to many compute nodes for parallel
processing.
HPC type 3—Parallel file processing (also known as loosely coupled). The source data
file is divided up and distributed across the compute pool for manipulation in parallel,
with results written back and aggregated.
The traditional high performance computing cluster that emerged out of the university
and military environments was based on the type 1 cluster. The new enterprise HPC
applications are more aligned with HPC types 2 and 3, supporting the entertainment,
financial, and a growing number of other vertical industries.
The following section provides a general overview of the server cluster components
and their purpose, which helps in understanding the design objectives described
in Chapter 3 "Server Cluster Designs with Ethernet."
Figure 1-5 Logical View of a Server Cluster
Logical Overview
Front end—These interfaces are used for external access to the cluster, which
can be accessed by application servers or users that are submitting jobs or
retrieving job results from the cluster. An example is an artist who is submitting
a file for rendering or retrieving an already rendered result. This is typically an
Ethernet IP interface connected into the access layer of the existing server farm
infrastructure.
Master nodes (also known as head nodes)—The master nodes are responsible for
managing the compute nodes in the cluster and optimizing the overall compute
capacity. Usually, the master node is the only node that communicates with the
outside world. Clustering middleware running on the master nodes provides the
tools for resource management, job scheduling, and node state monitoring of
the compute nodes in the cluster. Master nodes are typically deployed in a
redundant fashion and are usually a higher performing server than the compute
nodes.
Back-end high-speed fabric—This high-speed fabric is the primary medium for
master node to compute node and inter-compute node communications. Typical
requirements include low latency and high bandwidth and can also include
jumbo frame and 10 GigE support. Gigabit Ethernet is the most popular fabric
technology in use today for server cluster implementations, but other
technologies show promise, particularly Infiniband.
Compute nodes—The compute node runs an optimized or full OS kernel and is
primarily responsible for CPU-intensive operations such as number crunching,
rendering, compiling, or other file manipulation.
Storage path—The storage path can use Ethernet or Fibre Channel interfaces.
Fibre Channel interfaces consist of 1/2/4G interfaces and usually connect into a
SAN switch such as a Cisco MDS platform. The back-end high-speed fabric
and storage path can also be a common transport medium when IP over Ethernet
is used to access storage. Typically, this is for NFS or iSCSI protocols to a NAS
or SAN gateway, such as the IPS module on a Cisco MDS platform.
Common file system—The server cluster uses a common parallel file system
that allows high performance access to all compute nodes. The file system types
vary by operating system (for example, PVFS or Lustre).
Physical Overview
Server cluster designs can vary significantly from one to another, but certain items are
common, such as the following:
Offload technologies—TCP/IP offload and RDMA technologies are also used to
increase performance while reducing CPU utilization.
Low latency hardware—Usually a primary concern of developers is related to
the message-passing interface delay affecting the overall cluster/application
performance. This is not always the case because some clusters are more
focused on high throughput, and latency does not significantly impact the
applications. The Cisco Catalyst 6500 with distributed forwarding and the
Catalyst 4948-10G provide consistent latency values necessary for server
cluster environments.
Non-blocking or low-over-subscribed switch fabric—Many HPC applications
are bandwidth-intensive with large quantities of data transfer and interprocess
communications between compute nodes. GE attached server oversubscription
ratios of 2.5:1 (400 Mbps) up to 8:1 (125 Mbps) are common in large server
cluster designs.
Mesh/partial mesh connectivity—Server cluster designs usually require a mesh
or partial mesh fabric to permit communication between all nodes in the cluster.
This mesh fabric is used to share state, data, and other information between
master-to-compute and compute-to-compute servers in the cluster.
Jumbo frame support—Many HPC applications use large frame sizes that
exceed the 1500 byte Ethernet standard. The ability to send large frames (called
jumbos) that are up to 9K in size, provides advantages in the areas of server
CPU overhead, transmission overhead, and file transfer time.
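A quick back-of-the-envelope calculation, using the 1500-byte and roughly 9000-byte
figures above, shows why jumbo frames reduce per-frame overhead:

def frames_needed(transfer_bytes: int, mtu: int) -> int:
    """Number of frames (and thus per-frame CPU/transmission overheads)."""
    return -(-transfer_bytes // mtu)  # ceiling division

one_gigabyte = 10**9
standard = frames_needed(one_gigabyte, 1500)
jumbo = frames_needed(one_gigabyte, 9000)
print(standard, jumbo, f"{standard / jumbo:.1f}x fewer frames with jumbos")
# -> 666667 111112 6.0x fewer frames with jumbos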
Figure 1-6 Physical View of a Server Cluster Model Using ECMP
The recommended server cluster design leverages the following technical aspects
or features:
Scalable fabric bandwidth—ECMP permits additional links to be added
between the core and access layer as required, providing a flexible method of
adjusting oversubscription and bandwidth per server.
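The following is a minimal sketch of the flow-hashing idea behind ECMP (illustrative
only, not Cisco's CEF implementation): hash a flow's 5-tuple and use the result to pick
one of the equal-cost paths, so packets of one flow stay in order while distinct flows
spread across all links:

import hashlib

def pick_path(src_ip, dst_ip, src_port, dst_port, proto, num_paths):
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.md5(key).digest()
    # The same flow always hashes to the same path (no packet reordering);
    # different flows spread across all equal-cost paths.
    return int.from_bytes(digest[:4], "big") % num_paths

print(pick_path("10.0.0.5", "10.1.1.9", 33012, 80, "tcp", num_paths=4))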
Everyone knows that data centers are vital to global connectivity. Our content – from
cat videos to financial transactions – is stored and distributed from these data centers
24/7 and we expect on-demand, high quality, and real-time access to it whenever and
wherever we need it. But just how big has the data center monster become? Here are
12 fascinating facts about data centers that just may blow your mind.
I. There are over 7,500 data centers worldwide, with over 2,600 in the top 20
global cities alone, and data center construction will grow 21% per
year through 2018.
II. By 2020, at least 1/3 of all data will pass through the cloud.
III. The Natural Resources Defense Council (NRDC) estimates that data centers
consume up to 3% of all global electricity production.
IV. With just over 300 locations (337 to be exact), London, England has the largest
concentration of data centers in any given city across the globe.
V. California has the largest concentration of data centers in the U.S. with just over
300 locations.
VI. The average data center consumes over 100x the power of a large commercial
office building, while a large data center uses the electricity equivalent of a
small U.S. town.
VII. The largest concentration of data centers in a U.S. city is within the New York-
New Jersey metropolitan area (approximately 306 centers).
VIII. Data centers are increasingly using in-flight wire speed encryption, which
keeps your data fully protected from the moment it leaves one data center to the
moment it arrives at another.
IX. The largest data center in the world (Langfang, China) is 6.3 million square
feet—nearly the size of the Pentagon.
X. As much as 40% of the total operational costs for a data center come from the
energy needed to power and cool the massive amounts of equipment data
centers require.
XI. Google recently announced plans to build 12 new cloud-focused data centers
over a 1.5-year period.
XII. By 2020, nearly 8% of all new data centers will be powered by green energy.
Advantages of Outsourcing
● A valuable option for companies to access cutting-edge tools and technology, such
as blade servers, flywheel UPS systems, etc.
● Single-tenant facilities ensure enhanced security management along with other
DCIM solutions.
SERVER
What is a server?
A server (hardware) is a computer optimized for high loads (CPU usage, disk I/O
operations, swapping between disk and RAM, etc.) and high throughput (high-bandwidth
network cards). A data center is a physical location where server hardware is hosted.
CLOUD SERVER
Key features:
Computing infrastructure that can be physical (bare metal), virtual or a mix of
the two depending on use case.
Has all the capabilities of an on-premises server.
Enables users to process intensive workloads and store large volumes of
information.
Automated services are accessed on demand through an API.
Gives users the choice of monthly or as-you-go payment.
Users can opt for a shared hosting plan that scales depending on needs.
Analytics servers
An Analytics server facilitates the work of building and populating an Analytics index,
which allows you to perform clustering, concept searching, and categorization. This
server is also integral to Relativity Assisted Review, in that an Analytics index is
required to create an Assisted Review project. This server also facilitates Structured
Analytics operations, including email threading and the identification of textual near
duplicates, language, and repeated content.
Adding an Analytics server
I. Click your name in the upper right corner of Relativity, and click Home.
II. Select the Servers tab.
III. Click New Resource Server.
IV. Complete the fields on the form. See Fields.
V. Click Save. Relativity now attempts to retrieve information from the
server. If this call fails, you receive an error. To save your changes,
ensure that the web server can reach the server.
Fields
REST API password - the password Relativity uses to authenticate the Analytics
server's REST API. This value must match the password specified during installation
of the Analytics server. If you need to change the REST API password, you must run
the Analytics server installer and enter the new password. You must then enter the new
password here.
Status - select Active or Inactive. Relativity automatically updates this value
to Inactive when the respective agent for an Analytics server exceeds the
maximum connection attempts set in the Relativity Configuration table or when
Relativity finds an Analytics server unresponsive. After this update is complete,
you no longer receive email notifications indicating that the connection has
failed. This change doesn't affect the server functionality, and you can reset the
value as necessary.
Version - a read-only field that Relativity automatically updates when you
click Save. It also updates on a nightly basis to reflect any version changes. You
should run the same version of Analytics on all Analytics servers that you add.
Maximum connectors - the maximum number of connections allowed between
the Analytics server and SQL for each index that uses this server. The default
value is 4.
Maximum total connectors - the maximum number of connections allowed
between the Analytics server and SQL across all indexes using this server. The
default value is 50.
Editing an Analytics server
To edit certain settings for an existing Analytics server, follow these steps:
I. Click your name in the upper right corner of Relativity, and click Home.
II. Select the Servers tab.
III. Click the Edit link next to the server's name to display the server form.
IV. Update the following fields as necessary:
Name - enter a name for the Analytics or worker manager server.
Analytics operations - select which Analytics operations can run on the
Analytics server. This field defaults to permitting Analytics
Indexing and Structured Data Analytics operations. This field only appears
when the server type is Analytics.
REST API port - the port used by the Analytics server's REST API. This value
must match the port specified during installation of the Analytics server.
REST API username - the username Relativity uses to authenticate the
Analytics server's REST API. This value must match the username specified
during installation of the Analytics server.
REST API password - the password Relativity uses to authenticate the
Analytics server’s REST API. This value must match the password specified
during installation of the Analytics server. This field is required for Analytics
servers on which a password has not yet been provided in Relativity. This field
is optional if you’re editing an Analytics server that already has a password set.
If you need to change the REST API password, you need to run the Analytics
server installer and enter the new password. You must then enter the new
password here.
Status - select Active or Inactive. Relativity automatically updates this value
to Inactive when the respective agent for an Analytics or worker manager
server exceeds the maximum connection attempts set in the Relativity
Configuration table. After this update is made, you'll no longer receive email
notifications indicating that the connection has failed. This change doesn’t
affect the server functionality, and you can reset the value as necessary.
Maximum connectors - the maximum number of connections allowed between
the Analytics server and SQL for each index that uses this server. The default
value is 4.
Maximum total connectors - the maximum number of connections allowed
between the Analytics server and SQL across all indexes using this server. The
default value is 50.
V. Click Save. When you click save, Relativity attempts to retrieve information
from the server. If this call fails, you'll receive an error message. To save your
changes, ensure that the web server can reach the server.
Adding an Analytics server to a resource pool
When you add an Analytics server to your environment, you also need to add that server
to the resource pool referenced by the workspace so that you can select it when creating
Analytics indexes. For more information, see Adding resources to a pool.
Connectors functionality
Connectors increase performance when populating the index by allowing the Analytics
server and the SQL server to communicate directly with each other, rather than going
through the agent to send and receive calls. This direct line between servers reduces
the number of entities involved in index population and leads to faster population
times. The population process works as follows:
I. The Content Analyst Index Manager Agent queries SQL to see if it should
populate any indexes.
II. If there is an index to populate, the agent creates connectors, not exceeding
the number of connectors specified on the Analytics server.
III. The connectors allow the Analytics server to query the SQL database
directly.
IV. The agent monitors the progress and reports back the status of the
population.
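The pattern is essentially a polling agent feeding a capped pool of connections. The
sketch below is hypothetical (names, intervals, and structure are illustrative, not
Relativity's implementation) and only mirrors the steps above:

import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTORS = 4  # mirrors the "Maximum connectors" field (default 4)

def indexes_awaiting_population():
    """Stand-in for the agent's SQL query; returns index ids to populate."""
    return []  # no work in this toy example

def populate(index_id):
    """Stand-in for one connector streaming documents from SQL."""
    print(f"populating index {index_id}")

def agent_loop():
    # The pool caps concurrent connectors, like the per-index setting above.
    with ThreadPoolExecutor(max_workers=MAX_CONNECTORS) as pool:
        while True:
            for index_id in indexes_awaiting_population():
                pool.submit(populate, index_id)
            time.sleep(30)  # poll interval is illustrative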
POWER CONSUMPTIONS BY DATACENTER
With so many new data centers on the horizon, it’s worth thinking about the harsh
realities of data center power consumption. Even with innovative developments in
sustainable energy solutions, the truth of the matter is that both small and large data
centers consume a LOT of power.
Data Center Power Consumption: By the Numbers
In 2017, US based data centers alone used up more than 90 billion kilowatt-hours of
electricity. To give some perspective on how much energy that amounts to, it would
take 34 massive coal-powered plants generating 500 megawatts each to equal the power
demands of those data centers. On a global scale, data center power consumption
amounted to about 416 terawatt-hours, or roughly three percent of all electricity
generated on the planet. For context, data center energy consumption around the world
amounted to 40 percent more than all the energy consumed by the United Kingdom, an
industrialized country with over 65 million people.
That’s a lot of power. And it’s only going to increase in the future as more facilities are
built each year. With 80 percent of the world’s energy still being generated by fossil
fuels, those ever-increasing power demands could become a problem. Fortunately, data
center providers are working tirelessly to meet the needs of consumers while keeping
their energy usage at reasonable levels.
On the plus side, these massive data center energy consumption figures are much better
than past projections. Between 2005 and 2010, US data center energy usage grew by
24 percent. The previous five years were even worse, with energy usage increasing by
nearly 90 percent from 2000 to 2005. But from 2010 to 2014, total data center energy
consumption grew by a comparatively tiny four percent. Researchers expect that growth
rate to hold steady at least through 2020.
Much of these gains are the result of efficiency improvements. The economies of scale
offered by hyperscale data centers have pushed their Power Usage Effectiveness (PUE)
scores lower than their smaller cousins, but smaller enterprise data centers also operate
much more efficiently today than they did a decade ago. A 2005 Uptime Institute
report found that many data centers were so badly organized that only 40 percent of
cold air intended for server racks actually reached them despite the fact that the facilities
had installed 2.6 times as much cooling capacity as they needed. Since that time, data
center energy efficiency has improved by as much as 80 percent through the use of low-
power chips and solid state drives rather than spinning hard drives.
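Power Usage Effectiveness, mentioned above, is simply total facility energy divided by
the energy delivered to IT equipment. A worked example with illustrative figures:

def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """PUE = total facility energy / energy reaching IT gear (1.0 is ideal)."""
    return total_facility_kwh / it_equipment_kwh

# Hypothetical numbers: a facility drawing 1.8 MWh for every 1.0 MWh of IT
# load has a PUE of 1.8; hyperscale operators push this toward 1.1.
print(pue(1800, 1000))  # 1.8
print(pue(1100, 1000))  # 1.1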
Consolidation also played an important role in keeping power demands under relative
control. With the rapid growth of cloud computing, organizations have
increasingly abandoned private data centers and server closets in favor of colocation or
on-demand services. Since most of these solutions ran on inefficient and energy-hungry
legacy hardware, exporting their IT infrastructure to data centers actually proved to be
a net positive in terms of efficiency.
It’s not yet clear what impact developments like Internet of Things (IoT)
devices and edge computing will have on power usage. Newly designed edge data
centers will incorporate efficiency best practices, but since most IoT devices aren’t
physically located in data centers, they often aren’t taken into consideration when
measuring data center consumption.
Many data centers have made a commitment to sustainable energy solutions by turning
to sources of renewable power. Although the current nature of renewable power in the
US makes it difficult for data center providers to rely on it as a primary source of
energy, there are a number of ways, such as the purchase of Renewable Energy Credits
(RECs), it can be used to supplement energy needs to improve the overall carbon
footprint of facilities.
There’s also good reason to be hopeful that unexpected technological solutions wait
just over the horizon. Despite all the developments of the 21st century, many core
principles of computing architecture have gone largely unchanged since their
invention many decades ago. Processors, for example, have become smaller and more
powerful, but they still operate according to the same principles as their bulkier and
slower ancestors. Where their transistors were once much slower than the wires
connecting them, today the opposite is true. Many experts believe we’ve only scratched
the surface of what’s possible.
Although data center power consumption will continue to be an issue in the future, the
twin trends of consolidation and efficiency practices have greatly reduced the overall
impact of these facilities. Where data centers were once expected to push energy
demands to unsustainable levels, developments in data center energy efficiency over
the last decade have created an opportunity to research and implement more long term
solutions that will continue to allow data centers to serve the needs of the companies
and consumers who depend upon their services.
Here’s How Much Energy All US Data Centers Consume
It’s no secret that data centers, the massive but bland, unremarkable-looking buildings
housing the powerful engines that pump blood through the arteries of global economy,
consume a huge amount of energy. But while our reliance on this infrastructure and its
ability to scale capacity grows at a maddening pace, it turns out that on the whole, the
data center industry’s ability to improve energy efficiency as it scales is extraordinary.
The demand for data center capacity in the US grew tremendously over the last five
years, while total data center energy consumption grew only slightly, according to
results of a new study of data center energy use by the US government, released today.
This is the first comprehensive analysis of data center energy use in the US in about a
decade.
US data centers consumed about 70 billion kWh of electricity in 2014, representing
about 2 percent of the country's total energy consumption, according to the study.
That's equivalent to the amount consumed by
about 6.4 million average American homes that year. This is a 4 percent increase in
total data center energy consumption from 2010 to 2014, and a huge change from the
preceding five years, during which total US data center energy consumption grew by
24 percent, and an even bigger change from the first half of last decade, when their
energy consumption grew nearly 90 percent.
Efficiency improvements have played an enormous role in taming the growth rate of
the data center industry’s energy consumption. Without these improvements, staying at
the efficiency levels of 2010, data centers would have consumed close to 40 billion
kWh more than they did in 2014 to do the same amount of work, according to the study,
conducted by the US Department of Energy in collaboration with researchers from
Stanford University, Northwestern University, and Carnegie Mellon University.
Energy efficiency improvements will have saved 620 billion kWh between 2010 and
2020, the study forecasts. The researchers expect total US data center energy
consumption to grow by 4 percent between now and 2020 – they predict the same
growth rate over the next five years as it was over the last five years – reaching about
73 billion kWh.
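Those figures are easy to sanity-check: 4 percent total growth on roughly 70 billion
kWh lands at about 73 billion kWh.

# Quick check of the study's arithmetic as quoted above.
consumption_2014 = 70e9                    # kWh, from the study
projected_2020 = consumption_2014 * 1.04   # ~4 percent total growth
print(f"{projected_2020 / 1e9:.0f} billion kWh")  # -> 73 billion kWh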
This chart shows past and projected growth rate of total US data center energy use from
2000 until 2020. It also illustrates how much faster data center energy use would grow
if the industry, hypothetically, did not make any further efficiency improvements after
2010. (Source: US Department of Energy, Lawrence Berkeley National Laboratory)
Counting Electrons
Somewhere around the turn of the century, data center energy consumption started
attracting a lot of public attention. The internet was developing fast, and many started
asking questions about the role it was playing in the overall picture of the country’s
energy use.
Many, including public officials, started ringing alarm bells, worried that continuing to
power growth of the internet would soon become a big problem. These worries were
stoked further by the coal lobby, which funded pseudo-scientific research by “experts”
with questionable motives, who said the internet’s power consumption was out of
control, and if the society wanted it to continue growing, it wouldn’t be wise to continue
shutting down coal-burning power plants.
The DOE’s first attempt to quantify just how much energy data centers were
consuming, whose results were published in a 2008 report to Congress, was a response
to those rising concerns. It showed that yes, this infrastructure was consuming a lot of
energy, and that its energy use was growing quickly, but the problem wasn’t nearly as
big as those studies of murky origins had suggested.
“The last [DOE] study … was really the first time data center energy use for the entire
country was quantified in some way,” Arman Shehabi, research scientist at the DOE’s
Lawrence Berkeley National Laboratory and one of the new study’s lead authors, said
in an interview with Data Center Knowledge.
What authors of both the 2008 report and this year’s report did not anticipate was how
much the growth curve of the industry’s total energy use would flatten between then
and now. This was the biggest surprise for Shehabi and his colleagues when analyzing
the most recent data.
“It’s slowed down, and right now the rate of increase is fairly steady,” he said. “There’s
more activity occurring, but that activity is happening in more efficient data centers.”
Fewer Servers
There’s a whole list of factors that contributed to the flattening of the curve, but the
most obvious one is that the number of servers being deployed in data centers is simply
not growing as quickly as it used to. Servers have gotten a lot more powerful and efficient,
and the industry has figured out ways to utilize more of each server’s total capacity,
thanks primarily to server virtualization, which enables a single physical server to host
many virtual ones.
Each year between 2000 and 2005, companies bought 15 percent more servers on
average than the previous year, the study says, citing server shipment estimates by the
market research firm IDC. The total number of servers deployed in data centers just
about doubled in those five years.
Growth rate in annual server shipments dropped to 5 percent over the second half of
the decade, due in part to the 2008 market crash but also to server virtualization, which
emerged during that period. Annual shipment growth dropped to 3 percent since 2010,
and the researchers expect it to remain there until at least 2020.
The end of the last decade and beginning of the current one also saw the rise of
hyperscale data centers, the enormous facilities designed for maximum efficiency from
the ground up. These are built by cloud and internet giants, such as Google, Facebook,
Microsoft, and Amazon, as well as data center providers, companies that specialize in
designing and building data centers and leasing them to others.
According to the DOE study, most of the servers that have been responsible for that 3
percent annual increase in shipments have been going into hyperscale data centers. The
cloud giants have created a science out of maximizing server utilization and data center
efficiency, contributing in a big way to the slow-down of the industry’s overall energy
use, while data center providers have made improvements in efficiency of their facilities
infrastructure, the power and cooling equipment that supports their clients’ IT gear.
Both of these groups of data center operators are well-incentivized to improve
efficiency, since it has direct impact on their bottom lines.
The amount of applications companies deployed in the cloud or in data center provider
facilities started growing as well. A recent survey by the Uptime Institute found that
while enterprise-owned data centers host 71 percent of enterprise IT assets today, 20
percent is hosted by data center providers, and the remaining 9 percent is hosted in the
cloud.
This chart shows the portion of energy use attributed to data centers of various types
over time. SP data centers are data centers operated by service providers, including both
colocation and cloud service providers, while internal data centers are typical single-
user enterprise data centers. (Source: US Department of Energy, Lawrence Berkeley
National Laboratory)
Additionally, while companies are deploying fewer servers, the amount of power each
server needs has not been growing as quickly as it used to. Server power requirements
were increasing from 2000 to 2005 but have been relatively static since then, according
to the DOE. Servers have gotten better at reducing power consumption when running
idle or at low utilization, while the underlying data center power and cooling
infrastructure has gotten more efficient. Storage devices and networking hardware have
also seen significant efficiency improvements.
From IT Closet to Hyperscale Facilities
To put this new data in perspective, it’s important to understand the trajectory of the
data center industry’s development. It was still a young field in 2007, when the first
DOE study was published, Shehabi said. There was no need for data centers not too
long ago, when instead of a data center there was a single server sitting next to
somebody’s desk. They would soon add another server, and another, until they needed
a separate room or a closet. Eventually, that footprint increased to a point where servers
needed dedicated facilities.
All this happened very quickly, and the main concern of the first data center operators
was keeping up with demand, not keeping the energy bill low. “Now that [data centers]
are so large, they’re being designed from a point of view of looking at the whole system
to find a way to make them as efficient and as productive as possible, and that process
has led to a lot of the efficiencies that we’re seeing in this new report,” Shehabi said.
While the industry as a whole has managed to flatten the growth curve of its energy
use, it’s important to keep in mind that a huge portion of all existing software still runs
in highly inefficient data centers, the small enterprise IT facilities built a decade ago or
earlier that support applications for hospitals, banks, insurance companies, and so on.
“The lowest-hanging fruit will be trying to address efficiency of the really small data
centers,” Shehabi said. “Even though they haven’t been growing very much … it’s still
millions of servers that are out there, and those are just very inefficient.” Going forward,
it will be important to find ways to either make those smaller data centers more efficient
or to replace them with capacity in efficient hyperscale facilities.
As with the first data center study by the DOE, the new results are encouraging for the
industry, but they don’t indicate that it has effectively addressed energy problems it is
likely to face in the future. There are only a “couple of knobs you can turn” to improve
efficiency – you can design more efficient facilities and improve server utilization –
and operators of the world’s largest data centers have been turning them both, but
demand for data center services is increasing, and there are no signs that it will be
slowing down any time soon. “We can only get to 100 percent efficiency,” Shehabi
said.
Writing in the report on the study, he and his colleagues warn that as information and
communication technologies continue to evolve rapidly, it is likely that deployment of
new systems and services is happening “without much consideration of energy
impacts.” Unlike 15 years ago, however, the industry now has a lot more knowledge
about deploying these systems efficiently. Waiting to identify specific efficient
deployment plans can lead to setbacks in the future.
“The potential for data center services, especially from a global perspective, is still in a
fairly nascent stage, and future demand could continue to increase after our current
strategies to improve energy efficiency have been maximized. Understanding if and
when this transition may occur and the ways in which data centers can minimize their
costs and environmental impacts under such a scenario is an important direction for
future research.”
VIRTUALIZATION
Virtualization is technology that lets you create useful IT services using resources that
are traditionally bound to hardware. It allows you to use a physical machine’s full
capacity by distributing its capabilities among many users or environments.
In more practical terms, imagine you have 3 physical servers with individual dedicated
purposes. One is a mail server, another is a web server, and the last one runs internal
legacy applications. Each server is being used at about 30% capacity—just a fraction
of their running potential. But since the legacy apps remain important to your internal
operations, you have to keep them and the third server that hosts them, right?
Traditionally, yes. It was often easier and more reliable to run individual tasks on
individual servers: 1 server, 1 operating system, 1 task. It wasn’t easy to give 1 server
multiple brains. But with virtualization, you can split the mail server into 2 unique ones
that can handle independent tasks so the legacy apps can be migrated. It’s the same
hardware, you’re just using more of it more efficiently.
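The arithmetic behind that example is simple: three hosts at roughly 30 percent
utilization represent less than one fully used machine.

import math

servers = 3
utilization = 0.30
total_work = servers * utilization    # 0.9 of one server's capacity
hosts_needed = math.ceil(total_work)  # one physical host can carry the load
print(total_work, hosts_needed)       # 0.9 1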
A brief history of virtualization
While virtualization technology can be sourced back to the 1960s, it wasn’t widely
adopted until the early 2000s. The technologies that enabled virtualization—
like hypervisors—were developed decades ago to give multiple users simultaneous
access to computers that performed batch processing. Batch processing was a popular
computing style in the business sector that ran routine tasks thousands of times very
quickly (like payroll).
But, over the next few decades, other solutions to the many users/single machine
problem grew in popularity while virtualization didn’t. One of those other solutions was
time-sharing, which isolated users within operating systems—inadvertently leading
to other operating systems like UNIX, which eventually gave way to Linux®. All the
while, virtualization remained a largely unadopted, niche technology.
Fast forward to the 1990s. Most enterprises had physical servers and single-vendor IT
stacks, which didn’t allow legacy apps to run on a different vendor’s hardware. As
companies updated their IT environments with less-expensive commodity servers,
operating systems, and applications from a variety of vendors, they were bound to
underused physical hardware—each server could only run 1 vendor-specific task.
This is where virtualization really took off. It was the natural solution to 2 problems:
companies could partition their servers and run legacy apps on multiple operating
system types and versions. Servers started being used more efficiently (or not at all),
thereby reducing the costs associated with purchase, set up, cooling, and maintenance.
Virtualization’s widespread applicability helped reduce vendor lock-in and made it the
foundation of cloud computing. It’s so prevalent across enterprises today that
specialized virtualization management software is often needed to help keep track of it
all.
Software called hypervisors separate the physical resources from the virtual
environments—the things that need those resources. Hypervisors can sit on top of an
operating system (like on a laptop) or be installed directly onto hardware (like a server),
which is how most enterprises virtualize. Hypervisors take your physical resources and
divide them up so that virtual environments can use them.
Resources are partitioned as needed from the physical environment to the many virtual
environments. Users interact with and run computations within the virtual environment
(typically called a guest machine or virtual machine). The virtual machine functions as
a single data file. And like any digital file, it can be moved from one computer to
another, opened in either one, and be expected to work the same.
When the virtual environment is running and a user or program issues an instruction
that requires additional resources from the physical environment, the hypervisor relays
the request to the physical system and caches the changes—which all happens at close
to native speed (particularly if the request is sent through an open source hypervisor
based on KVM, the Kernel-based Virtual Machine).
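As a concrete, minimal sketch of KVM-based virtualization in practice, the following
launches a guest with QEMU's KVM acceleration; the disk image path and resource
sizes are hypothetical, and qemu-system-x86_64 must be installed on the host:

import subprocess

cmd = [
    "qemu-system-x86_64",
    "-enable-kvm",                 # hardware-assisted virtualization via KVM
    "-m", "2048",                  # 2 GB of RAM carved out for the guest
    "-smp", "2",                   # 2 virtual CPUs
    "-drive", "file=guest.qcow2,format=qcow2",  # hypothetical disk image
    "-nographic",                  # serial console instead of a GUI window
]
subprocess.run(cmd, check=True)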
TYPES OF VIRTUALIZATION
Data virtualization
Data that’s spread all over can be consolidated into a single source. Data virtualization
allows companies to treat data as a dynamic supply—providing processing capabilities
that can bring together data from multiple sources, easily accommodate new data
sources, and transform data according to user needs. Data virtualization tools sit in front
of multiple data sources and allows them to be treated as single source, delivering the
needed data—in the required form—at the right time to any application or user.
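A toy sketch of that idea (the source classes are hypothetical stand-ins for real
connectors): one facade object presents several physical sources as a single logical one.

class CsvSource:
    def query(self, key):
        return {"source": "csv", "key": key}

class ApiSource:
    def query(self, key):
        return {"source": "api", "key": key}

class VirtualDataLayer:
    """Routes requests to whichever backend owns the data."""
    def __init__(self):
        self.sources = {"orders": CsvSource(), "customers": ApiSource()}

    def get(self, dataset, key):
        # Callers see one interface; the physical location stays hidden.
        return self.sources[dataset].query(key)

layer = VirtualDataLayer()
print(layer.get("orders", 42))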
Desktop virtualization
Desktop virtualization allows a central administrator (or automated administration tool)
to deploy simulated desktop environments to hundreds of physical machines at once.
Server virtualization
Servers are computers designed to process a high volume of specific tasks really well
so other computers—like laptops and desktops—can do a variety of other tasks.
Virtualizing a server lets it do more of those specific functions and involves
partitioning it so that the components can be used to serve multiple functions.
Operating system virtualization
● Reduces bulk hardware costs, since the computers don’t require such high out-of-
the-box capabilities.
● Increases security, since all virtual instances can be monitored and isolated.
Network functions virtualization
Network functions virtualization (NFV) separates a network's key functions (like directory
services, file sharing, and IP configuration) so they can be distributed among environments.
Once software functions are independent of the physical machines they once lived on, specific
functions can be packaged together into a new network and assigned to an environment.
Virtualizing networks reduces the number of physical components—like switches, routers,
servers, cables, and hubs—that are needed to create multiple, independent networks, and it’s
particularly popular in the telecommunications industry.
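Conceptually, NFV turns each network function into software that can be chained in
any order. The following toy sketch (illustrative placeholders, not real virtual network
functions) composes a firewall and a NAT function into one service chain:

def firewall(packet):
    return packet if packet.get("port") != 23 else None  # drop telnet

def nat(packet):
    packet["src"] = "203.0.113.1"  # rewrite to the public address
    return packet

def run_chain(packet, chain):
    for fn in chain:
        packet = fn(packet)
        if packet is None:
            return None  # dropped somewhere in the chain
    return packet

print(run_chain({"src": "10.0.0.7", "port": 80}, [firewall, nat]))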
Red Hat Virtualization
Red Hat Virtualization is an open, software-defined platform that virtualizes Linux and
Microsoft Windows workloads. Built on Red Hat Enterprise Linux® and the Kernel-
based Virtual Machine (KVM), it features management tools that virtualize resources,
processes, and applications—giving you a stable foundation for a cloud-native and
containerized future.
ADVANTAGES AND DISADVANTAGES OF VIRTUALIZATION
Since its early days, virtualization has expanded into almost every form of digital life. From
virtual machines that act like a real computer to console emulation, many people take
advantage of what virtualization can provide.
Like most technologies, there are advantages and disadvantages of virtualization that
must be considered before fully implementing a system or plan.
1. It is cheaper.
Individuals and corporations can have predictable costs for their information
technology needs. For example: the cost of a Dell PowerEdge T330 Tower Server, at
the time of writing, is $1,279 direct from the manufacturer. In comparison, services
provided by Bluehost Web Hosting can be as low as $2.95 per month.
Most virtualization providers automatically update their hardware and software that
will be utilized. Instead of sending people to do these updates locally, they are
installed by the third-party provider. This allows local IT professionals to focus on
other tasks and saves even more money for individuals or corporations.
Resource provisioning is fast and simple when virtualization is being used. There
is no longer a need to set up physical machines, create local networks, or install
other information technology components. As long as there is at least one point of
access to the virtual environment, it can be spread to the rest of the organization.
7. It provides energy savings.
Virtualization also has its disadvantages. The cost for the average individual or
business when virtualization is being considered will be quite low. For the providers of
a virtualization environment,
considered will be quite low. For the providers of a virtualization environment,
however, the implementation costs can be quite high. Hardware and software are
required at some point and that means devices must either be developed,
manufactured, or purchased for implementation.
Information is our modern currency. If you have it, you can make money. If you
don’t have it, you’ll be ignored. Because data is crucial to the success of a business,
it is targeted frequently. The average cost of a data security breach in 2017,
according to a report published by the Ponemon Institute, was $3.62 million. For
perspective: the chances of being struck by lightning are about 1 in a million. The
chances of experiencing a data breach while using virtualization? 1 in 4.
The primary concern that many have with virtualization is what will happen to their
work should their assets not be available. If an organization cannot connect to their
data for an extended period of time, they will struggle to compete in their industry.
And, since availability is controlled by third-party providers, the ability to stay
connected is not in one’s control with virtualization.
If you have local equipment, then you are in full control of what you can do. With
virtualization, you lose that control because several links must work together to
perform the same task. Let’s use the example of saving a document file. With a
local storage device, like a flash drive or HDD, you can save the file immediately
and even create a backup. Using virtualization, your ISP connection would need to
be valid. Your LAN or Wi-Fi would need to be working. Your online storage option
would need to be available. If any of those are not working, then you’re not saving
that file.
7. It takes time.
Although you save time during the implementation phases of virtualization, it costs
users time over the long-run when compared to local systems. That is because there are
extra steps that must be followed to generate the desired result.
The advantages and disadvantages of virtualization show us that it can be a useful tool
for individuals, SMBs, entrepreneurs, and corporations when it is used properly.
Because it is so easy to use, however, some administrators begin adding new servers or
storage for everything and that creates sprawl. By staying disciplined and aware of
communication issues, many of the disadvantages can be tempered, which is why this
is such an effective modern system.
Virtualization often sounds like the holy grail of IT infrastructures. But is this truly the
case for small businesses?
Virtualization has several benefits. For businesses with limited funds, virtualization
helps them stay on budget by eliminating the need to invest in tons of hardware.
Creating virtual environments to work in also helps businesses with limited IT staff
automate routine tasks and centralize resource management. Further, employees can
access their data anytime, anywhere, using any device. However, virtualized
environments have drawbacks. Here are the major pros and cons of virtualization.
Pro: Reduced IT costs
Virtualization helps businesses reduce costs in several ways, according to Mike Adams,
senior director of cloud platform product marketing at VMware.
Con: High upfront costs
If you're transitioning a legacy system to a virtualized one, upfront costs are likely to
be high. Be prepared to spend upwards of $10,000 for the servers and software
licenses. However, as virtualization technology improves and becomes more
commonplace, costs will go down.
Pro: Efficient resource utilization
Virtualization enables businesses to get the most out of their investment in hardware
and resources. "As customer data center environments grow in size and complexity,
managing it becomes a burden," Adams said. "Virtualization can greatly help reduce
this complexity by offering resource management capabilities to help increase
efficiencies in these virtual environments."
In contrast, traditional infrastructures that use multiple servers don't make the most out
of their setups. "Many of those servers would typically not utilize more than 2 to 10
percent of the server hardware resources," said John Livesay, vice president of Infranet
Technologies, a network infrastructure services provider. "With virtualization, we can
now run multiple virtual servers on a single virtual host [and make] better use of the
resources available."
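A quick back-of-the-envelope calculation shows why those utilization figures matter.
The server count and target load below are illustrative assumptions, not measurements
from any particular environment.

    # Consolidation math for the 2-10 percent utilization figure quoted above.
    physical_servers = 10
    avg_utilization = 0.05          # 5% of each server's capacity actually used
    target_host_utilization = 0.60  # a conservative ceiling for a virtual host

    total_work = physical_servers * avg_utilization           # 0.5 servers' worth of real work
    hosts_needed = -(-total_work // target_host_utilization)  # ceiling division -> 1 host

    print(f"{physical_servers} servers are doing {total_work:.1f} servers' worth of work,")
    print(f"which fits on {int(hosts_needed)} virtual host at {target_host_utilization:.0%} target load")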
The drawback, however, is that not all servers and applications are virtualization-
friendly, Livesay said. "Typically, the main reason you may not virtualize a server or
application is only because the application vendor may not support it yet, or recommend
it," he said.
But virtualization is highly scalable. It lets businesses create additional resources,
such as extra servers, as their applications require them; it is all done on demand,
on an as-needed basis, without any significant investment of time or money.
IT admins can create new servers quickly, because they do not need to purchase new
hardware each time they need a new server, Livesay said. "If the resources are available,
we can create a new server in a few clicks of a mouse," he added.
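In practice, those "few clicks" usually translate into a single call to the
virtualization platform's management API. The sketch below is a hypothetical example
using Python's requests library; the endpoint, token, and field names are invented
for illustration, since each platform (vSphere, Proxmox, the public clouds) exposes
its own API.

    import requests

    def create_virtual_server(name: str, cpus: int, memory_gb: int) -> dict:
        """Ask a (hypothetical) hypervisor management API for a new VM."""
        response = requests.post(
            "https://hypervisor.example.com/api/v1/vms",   # hypothetical endpoint
            headers={"Authorization": "Bearer <api-token>"},
            json={"name": name, "cpus": cpus, "memory_gb": memory_gb},
            timeout=30,
        )
        response.raise_for_status()
        return response.json()  # e.g. {"id": "...", "state": "provisioning"}

    # new_vm = create_virtual_server("test-web-01", cpus=2, memory_gb=4)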
The ease of creating additional resources also helps businesses scale as they grow. "This
scenario might be good for small businesses that are growing quickly, or businesses
using their data center for testing and development," Livesay said.
Businesses should keep in mind, though, that one of the main goals and advantages of
virtualization is the efficient use of resources. They should therefore be careful not
to let the ease of creating servers lead to careless allocation of resources; a simple
guard is sketched below.
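One simple form that discipline can take is a capacity check before any new server is
granted. The sketch below uses invented capacity figures and is only meant to show the
idea of refusing requests that would oversubscribe a host.

    # Host capacity and current allocations are illustrative assumptions.
    HOST_CAPACITY = {"cpus": 64, "memory_gb": 512}
    allocated = {"cpus": 48, "memory_gb": 400}   # what existing VMs already hold

    def can_allocate(request: dict) -> bool:
        """Refuse a request that would oversubscribe the host."""
        return all(allocated[k] + request[k] <= HOST_CAPACITY[k] for k in request)

    print(can_allocate({"cpus": 8, "memory_gb": 64}))    # True: fits
    print(can_allocate({"cpus": 24, "memory_gb": 64}))   # False: would exceed CPU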
The limitations virtualization faces include a lack of awareness that certain applications
or workloads can be virtualized, according to Adams.
"Workloads such as Hadoop, NoSQL databases, Spark and containers often start off on
bare-metal hardware but present new opportunities to be virtualized later on," Adams
said. "Virtualization can now support many new applications and workloads within the
first 60 to 90 days on the market."
Although more software applications are adapting to virtualized environments, there
can be licensing complications due to multiple hosts and migrations. Regarding
performance and licensing issues, it is prudent to check whether essential applications
work well in a virtualized environment before committing to one.
COOLING METHODS IN A DATA CENTER
Free Cooling
Free cooling is a cost-effective way to keep a data center's temperature under control.
When this technique is used, the mechanical cooling required is minimal, which reduces
overall cooling expenditure. The method comes in two forms, known as air-side
economization and water-side economization. Air-side economization uses air from the
outdoors to regulate the equipment's temperature. This technique has its flaws, since
it can allow pollutants and moisture from the outdoors to enter the data center.
Chilled Water
Liquid cooling can be more efficient and direct in its cooling technique, because
chilled water can be targeted directly at the desired area without having to supply
cool air to every part of the facility. With the chilled water technique, the CRAH
(computer room air handler) is connected to a chiller. As the chilled water travels
through coils, it absorbs the heat and carries it back to the chiller, where the heat
is rejected to condenser water flowing through a cooling tower.
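The amount of heat such a loop can carry follows the standard sensible-heat relation
Q = m * c_p * dT. The flow rate and temperatures below are illustrative assumptions,
not figures from any specific installation.

    # Rough heat-removal capacity of a chilled-water loop.
    flow_rate_kg_s = 5.0   # chilled water mass flow (kg/s), assumed
    c_p = 4186             # specific heat of water (J/kg.K)
    supply_temp_c = 7.0    # water leaving the chiller, assumed
    return_temp_c = 14.0   # water returning after absorbing server heat, assumed

    heat_removed_w = flow_rate_kg_s * c_p * (return_temp_c - supply_temp_c)
    print(f"Heat removed: {heat_removed_w / 1000:.0f} kW")  # about 147 kW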
Pumped Refrigerant
This method pumps chilled water through a heat exchanger and uses a cold pumped
refrigerant to draw out the heat. The pumped refrigerant technique provides savings
because it can carry heat away from the servers efficiently and allows humidification
requirements to be greatly reduced.
Evaporative Cooling
With this technique, an air duct connected to an indirect air evaporative cooler is
used. The method is energy efficient: it draws on outdoor air to cool the facility at
times when the outside temperature is lower than the temperature inside, adding
cooler air to the airflow within the data center.
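The decision of when to use outdoor air comes down to a simple comparison, sketched
below. The switching margin is an assumed value; real economizer controls also weigh
humidity and air quality.

    def select_cooling_mode(outdoor_temp_c: float, indoor_temp_c: float,
                            margin_c: float = 2.0) -> str:
        """Pick a cooling mode; the margin avoids rapid switching near the setpoint."""
        if outdoor_temp_c <= indoor_temp_c - margin_c:
            return "free cooling (outdoor air)"
        return "mechanical cooling (chiller)"

    print(select_cooling_mode(outdoor_temp_c=12.0, indoor_temp_c=24.0))  # free cooling
    print(select_cooling_mode(outdoor_temp_c=28.0, indoor_temp_c=24.0))  # mechanical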
Optimizing the organization and placement of your data center equipment is an easy
and cost-efficient way to ensure that your data center maintains the temperatures it
needs for productivity. Efficient organization for optimal data center temperatures
includes hot/cold aisle arrangement, containment, rack placement, cable organization,
and the use of blanking panels.
Hot/Cold Aisle Arrangement
Managing hot and cold aisles is a typical way to sustain temperatures in data centers.
Without separating hot and cold aisles, the air within the data center will experience
"mixing", which is an inefficient use of energy. With air mixing, the equipment is
never surrounded by air at the optimal temperature it needs to function. The hot/cold
aisle method is implemented by positioning racks so that the lanes alternate between
hot aisles and cold aisles.
Containment
Building on the hot and cold aisles, containment can be implemented to isolate the hot
air and the cold air from the racks. When containment is in use, the HVAC units can
perform more efficiently. With this system, the hot air should be monitored to ensure
that the hot aisle does not overheat.
Rack Placement
The placement of racks can minimize heat circulation from rack hot spots. Within a
rack, the hottest location is at the top. To ensure optimal cooling, arrange each
rack's components so that the heavier equipment sits in the lower positions. Since
larger equipment circulates the most air, a lower placement ensures that less hot air
is dispersed at the top of the rack.
Cable Organization
Maintaining cable organization not only keeps your cables neater and easier to manage,
but also ensures that they are not obstructing the data center's airflow. It is a
small step towards optimizing your data center's airflow.
Blanking Panels
If your racks have unused space and your data center is not using blanking panels,
hot exhaust air can recirculate through the unoccupied rack space back into your data
center's airflow. By installing blanking panels, this hot air is blocked, providing
greater cooling efficiency.
Environmental Monitoring
Environmental monitoring systems track conditions such as temperature, humidity, and
airflow throughout the facility, alerting staff when readings drift outside safe
ranges so cooling problems can be corrected before equipment is damaged.
Reference List
https://www.cisco.com/c/en/us/solutions/data-center-virtualization/what-is-a-data-center.html
https://www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Data_Center/DC_Infra2_5/DCInfra_1.html
https://www.ciena.com/insights/articles/Twelve-Mind-blowing-Data-Center-Facts-You-Need-to-Know.html
https://whatismyipaddress.com/proxy-server
https://www.samlogic.net/articles/mail-server.htm
https://www.vxchnge.com/blog/power-hungry-the-growing-energy-demands-of-data-centers
https://www.redhat.com/en/topics/virtualization/what-is-virtualization
https://www.businessnewsdaily.com/6014-pros-cons-virtualization.html
https://www.raritan.com/ap/blog/detail/types-of-data-center-cooling-techniques