
Data Center Cooling Management And Analysis – A Model Based Approach

Rongliang Zhou, Zhikui Wang, Cullen E. Bash, Alan McReynolds

Sustainable Ecosystems Research Group, HP Labs, Hewlett-Packard Company, 1501 Page Mill Road, Palo Alto, CA 94304-1126. {firstname.lastname}@hp.com

Abstract— As the hub of information aggregation, processing, and dissemination, today’s data centers consume a significant amount of energy. Data center electricity consumption comes mainly from the IT equipment and the supporting cooling facility that manages the thermal status of the IT equipment. The traditional data center cooling facility usually consists of chilled water cooled computer room air conditioning (CRAC) units and chillers that provide chilled water to the CRAC units. Electricity used to power the cooling facility can account for up to half of the total data center electricity consumption, and is a major contributor to the data center total cost of ownership. While the data center industry has established best practices to improve cooling efficiency, most of them are rules of thumb that provide only qualitative guidance. In order to provide on-demand cooling and achieve improved cooling efficiency, a model based description of the data center thermal environment is indispensable. In this paper, a computationally efficient multivariable model capturing the effects of CRAC unit blower speed and supply air temperature (SAT) on rack inlet temperatures is introduced, and model identification and reduction procedures are discussed. Using the model developed, data center cooling system design and analysis tasks such as thermal zone mapping, CRAC unit load balancing, and hot spot detection are investigated.

I. INTRODUCTION

Due to the ever-increasing computing density, and hence power density, of IT equipment, today’s data centers require a tremendous amount of cooling power to maintain the desired thermal status. According to [1], [2], about a third to a half of data center total power consumption goes to the cooling system. Highly efficient cooling systems are thus indispensable to reduce the total cost of ownership and environmental footprint of data centers.

Figure 1 shows a typical raised-floor air-cooled data center with hot aisles and cold aisles separated by rows of IT equipment racks. The thermal requirements of IT equipment are usually specified in terms of the inlet air temperatures of the equipment [3]. The equipment temperature thresholds are not necessarily uniform across the entire data center but depend on the different functions, such as computing, storage, and networking, that the IT equipment serves. Service contracts of the IT workload hosted on the IT equipment can also affect the temperature threshold.

The blowers of the Computer Room Air Conditioner (CRAC) units pressurize the under-floor plenum with cool air, which in turn is drawn through the vent tiles located in front of the racks in the cold aisles. Hot air carrying the waste heat from the IT equipment is rejected into the hot aisles. Depending on its design, the CRAC unit internal control can regulate the chilled water valve opening to track the given reference of Supply Air Temperature (SAT) or Return Air Temperature (RAT). The flow rate of the cool air supply can also be tuned continuously if a Variable Frequency Drive (VFD) is installed for each CRAC unit to vary the speed of its blowers.

For the particular configuration shown in Fig. 1, neither the cold aisles nor the hot aisles are contained, and hence air streams are free to mix. Most of the hot air in the hot aisles returns to the CRAC units, but a small portion of it might escape into the cold aisles from the top, the sides, or even the bottom of the racks and cause recirculation. Recirculation can also be due to reverse flows through certain IT equipment (some network switches, for example) whose internal fans blow hot exhaust air from the hot aisle into the cold aisle. The inlet air flow of the IT equipment is thus a mixture of cool air from the vent tiles in its vicinity and the recirculated hot air [4]. The recirculation of hot air into the cold aisle generates entropy and lowers the data center cooling efficiency.

Fig. 1. Typical Raised Floor Data Center

The challenge of data center cooling management is to coordinate the tuning of CRAC unit blower speeds and supply air temperatures (SAT) so as to minimize the power consumption of both CRAC units and chiller plants, while maintaining hundreds or even thousands of rack inlet temperatures below their respective thresholds. The major hurdle in meeting this challenge is the lack of simplified and computationally efficient models that are capable of capturing the complex energy and mass flows within data centers and of supporting transient cooling analysis. Computational Fluid Dynamics (CFD) has been used extensively for data center cooling system design, but most of the applications are built upon steady-state system analysis. The few CFD based transient cooling system performance analyses reported so far have focused on predicting system responses to cooling failures. Beitelmal and Patel [5], for example, use transient CFD simulation to investigate the data center temperature distribution
change caused by a malfunctioning CRAC unit, and show that acceptable rack inlet temperatures can still be maintained if the IT load and available cooling resources can be appropriately re-organized. In another application, CFD simulation is used to analyze the various failure scenarios of IBM’s cooling infrastructure design for a water cooled cluster of 11 racks [6]. While transient CFD analysis provides valuable insights into data center cooling system design as well as measures to handle different failure modes, it is impractical to use it for real-time data center cooling management. This is partly because data center IT load can change from minute to minute, whereas transient CFD analysis is normally time consuming and can take hours or even longer to finish. In addition, as pointed out in [7], [8], it is usually difficult to capture the true IT and cooling configurations in sufficient detail to predict the resulting environment with the desired accuracy. Targeting higher computational efficiency, some alternative approaches have been utilized or developed by researchers for transient data center cooling performance analysis. Khankari [9] uses a simple energy balance model to investigate the availability of data center thermal mass in various configurations during power shutdown, Kummert [10] studies the effects of chiller failure and cooling system thermal inertia on room temperature variations, and Zhang together with VanGilder develops data center transient thermal models. These alternative approaches, however, achieve the improved computational efficiency by sacrificing the spatial non-uniformity witnessed in most data centers, and hence are not suitable for real-time data center thermal management either.

In an attempt to bridge this gap, the authors’ recent work [11], [12] develops physics based state-space models that describe the air flow transport and distribution within data centers. The parameters of the models are obtained from measurement data of system identification experiments, and hence are ensured to reflect the data center reality emphasized in [8]. The physics based data center cooling model is utilized in [11] to coordinate zonal (CRAC unit blower speeds and SAT) and local cooling actuation (adaptive vent tiles), and validation on a small portion of a research data center shows significant cooling power savings. Using the same but simplified physics based model, the authors present in [12] a decentralized model predictive control (MPC) design approach for CRAC unit SAT and blower speed regulation targeting large scale data centers. Compared with the commonly used CFD modeling, the model employed in this paper is computationally light without losing the data center spatial non-uniformity, and is suitable for both off-line analysis and online dynamic control. In this paper, the computationally efficient model developed is used to perform critical cooling system design and analysis tasks such as thermal zone mapping, CRAC unit load balancing, and hot spot detection.

The rest of this paper is organized as follows. Section II first briefly introduces the dynamic rack inlet temperature model we developed previously using the energy and mass balance principles, followed by a discussion of the model identification and reduction procedures. In Section III, we demonstrate how model based grouping of rack inlet temperatures and CRAC units can be used for improved thermal zone mapping and coordination of decentralized cooling controllers. Section IV defines the Hot Spot Index (HSI) using the model parameters and shows that it is effective in data center hot spot detection. Finally, Section V concludes the paper with a summary of the work presented.

II. DYNAMIC COOLING SYSTEM MODELING AND MODEL PARAMETER IDENTIFICATION

A. Dynamic Cooling System Modeling

In this subsection, we briefly introduce simplified models, derived from the basic mass and energy balance principles, that characterize the complex mass and energy flows within raised-floor air-cooled data centers. Since only the modeling results are presented, interested readers can refer to [11], [12] for a more detailed description of how these models are derived.

In the open environment, the air flow coming into the IT equipment inlet is a mixture of the cool air from the CRAC units (through the vent tiles) and the recirculated hot (exhaust) air that escapes into the cold aisle. In a hot aisle contained environment, recirculation is significantly reduced but could still exist because of imperfect containment, or because of reverse flows from some network switches that draw hot air from the hot aisle for cooling and reject even hotter air (up to 40 °C) into the cold aisle.

Fig. 2. Air Mixing at the Rack Inlet

Consider a small control volume in the proximity of the rack inlet with mass $m$ and temperature $T$, as shown in Fig. 2. Cool and recirculated hot air flows with mass and temperature $(m_c, T_c)$ and $(m_h, T_h)$ enter the control volume, mix well with the air $(m, T)$ already in the volume, then leave the control volume together and enter the rack inlet with total mass $m^*$ and temperature $T^*$. It can be found that the temperature change $\Delta T$ of the air within the control volume before and after the mixing is:

$$\Delta T \triangleq T^* - T = \frac{m_c (T_c - T)}{m + m_c + m_h} + \frac{m_h (T_h - T)}{m + m_c + m_h}, \tag{1}$$

which reveals that the influence of cool and recirculated hot air on rack inlet temperature can be mainly captured by $m_c (T_c - T)$ and $m_h (T_h - T)$, respectively.

In raised-floor data centers, all the CRAC units pressurize the under-floor plenum by blowing cool air into it. The
cool air $\dot{m}_c$ flowing into a rack inlet could come from all the CRAC units, and hence:

$$\dot{m}_c = \sum_{j=1}^{N_{CRAC}} b_j \cdot VFD_j, \tag{2}$$

in which $N_{CRAC}$ is the number of CRAC units, $b_j$ quantifies the cooling air contribution from the $j$th CRAC unit to a specific rack inlet, and $VFD_j$ stands for the blower speed of the $j$th CRAC unit as a percentage of its maximum.

It can be seen from Eqn. (1) that both cool and recirculated hot air contribute to the rack inlet temperature change $\Delta T$. Since the recirculated hot air flow is beyond direct control, we can lump its effect into a time-varying term $C$ and simplify Eqn. (1) as:

$$T^* - T = \frac{\dot{m}_c \Delta t\, (T_c - T)}{m + \dot{m}_c \Delta t + m_h} + C, \tag{3}$$

in which $\Delta t$ is the length of the sampling interval.

The discrete form of Eqn. (3) is:

$$T(k+1) = T(k) + \Big\{ \sum_{j=1}^{N_{CRAC}} g_j \cdot [SAT_j(k) - T(k)] \cdot VFD_j(k) \Big\} + C(k), \tag{4}$$

in which $T(k+1)$ and $T(k)$ are rack inlet temperatures at time steps $k+1$ and $k$, respectively. In Eqn. (4), $g_j$ quantifies the combined influences of VFD and SAT tuning of the $j$th CRAC unit, and also lumps the effects of parameters $b_j$ and $\Delta t$ together with the nonlinearity associated with $\dot{m}_c$.

The vector form of Eqn. (4) for multiple rack inlet temperatures is:

$$T(k+1) = T(k) + F + C, \tag{5}$$

in which

$$T = [T_1, T_2, \cdots, T_{N_T}]^T,$$
$$F = [F_1, F_2, \cdots, F_{N_T}]^T,$$
$$F_i = \sum_{j=1}^{N_{CRAC}} g_{i,j}\, [SAT_j(k) - T_i(k)]\, VFD_j(k), \quad 1 \le i \le N_T,$$
$$C = [C_1, C_2, \cdots, C_{N_T}]^T,$$

and $N_T$ is the number of rack inlet temperatures of interest.

B. Model Parameter Identification and Model Reduction

Parameters of the dynamic rack inlet temperature model described in Eqn. (5), including $g_{i,j}$ ($1 \le i \le N_T$, $1 \le j \le N_{CRAC}$) and $C$, can be obtained through model identification experiments. During the experiments, the available cooling actuation, such as CRAC unit blower speed and SAT, is perturbed through sequential or simultaneous step changes or other specifically designed identification sequences. Both the cooling actuation signals and the corresponding temperature responses at the rack inlets are collected.

In order to find the model parameters $g_{i,j}$ ($1 \le i \le N_T$, $1 \le j \le N_{CRAC}$) and $C$, a nonlinear optimization can be performed on part of the experimental data collected such that the parameterized model minimizes the error between model prediction and rack inlet temperature measurements. The remaining part of the experimental data can then be used to validate the identified model. Note that after the model identification data is collected, the model parameter identification can be performed per rack inlet temperature, since different rack inlet temperatures are coupled only through the inputs. The associated parallelization can be exploited to speed up the model identification process and is extremely useful for large scale data centers with thousands of rack inlet temperatures.

From the physical perspective, every rack inlet temperature of interest is influenced by all the CRAC units, and correspondingly each CRAC unit affects all the rack inlet temperatures within the data center. As a result, the matrix $G = [g_{i,j}]$ ($1 \le i \le N_T$, $1 \le j \le N_{CRAC}$) obtained from the model identification process is fully populated. However, it is observed that most rack inlet temperatures are mainly affected by a small subset of the CRAC units, and that this subset varies with the location of the specific rack inlet temperature relative to the layout of the cooling facilities. In the identified dynamic rack inlet temperature model, this observation manifests through the fact that for any rack inlet temperature $T_i$ ($1 \le i \le N_T$), some of the $g_{i,j}$ ($1 \le j \le N_{CRAC}$) items are dominant with noticeably larger values than the rest, suggesting that the non-dominant items may be ignored without significant effects on modeling accuracy.

In order to obtain a sparse $G$ matrix and hence gain the associated advantages such as system decoupling and parallelization of sub-problem solving, we can perform model reduction for each rack inlet temperature. For rack inlet temperature $T_i$, define

$$g_{i,\max} = \max_{1 \le j \le N_{CRAC}} g_{i,j}.$$

In the reduced matrix $G^r$, item $g^r_{i,j}$ is set to the corresponding $g_{i,j}$ if and only if

$$g_{i,j} \ge \lambda \cdot g_{i,\max},$$

and otherwise $g^r_{i,j} = 0$. In the inequality above, $\lambda$ is an adjustable threshold between 0 and 1. The physical interpretation of this model reduction process is that in order to regulate a specific rack inlet temperature, we need only consider the CRAC units that have significant influence over it and can simply ignore the CRAC units that just marginally affect this rack inlet temperature of interest.

III. MODEL BASED GROUPING OF CRAC UNITS AND RACK INLET TEMPERATURES

The matrix $G^r$ obtained after the model reduction process trims the weak relationships between CRAC units and rack inlet temperatures, maintaining only the strong ones. The sparse structure of $G^r$ implies a natural grouping of CRAC units and rack inlet temperatures, which can be used for both static system analysis and dynamic system control.
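
The model reduction rule of Section II-B and a single update of Eqn. (5) can be sketched as follows. This is only a minimal illustration, assuming non-negative identified gains; the function names `reduce_gain_matrix` and `step_inlet_temperatures`, the 2-by-3 gain matrix, and all SAT, VFD, and C values are hypothetical and are not taken from the experiments.

```python
import numpy as np


def reduce_gain_matrix(G, lam):
    """Keep g[i, j] only where g[i, j] >= lam * max_j g[i, j]; zero the rest."""
    G = np.asarray(G, dtype=float)
    row_max = G.max(axis=1, keepdims=True)  # g_{i,max} for every rack inlet i
    return np.where(G >= lam * row_max, G, 0.0)


def step_inlet_temperatures(T, SAT, VFD, Gr, C):
    """One step of Eqn. (5): T(k+1) = T(k) + F + C."""
    T = np.asarray(T, dtype=float)
    SAT = np.asarray(SAT, dtype=float)
    VFD = np.asarray(VFD, dtype=float)
    # F_i = sum_j g[i, j] * (SAT_j - T_i) * VFD_j
    F = (Gr * (SAT[None, :] - T[:, None]) * VFD[None, :]).sum(axis=1)
    return T + F + np.asarray(C, dtype=float)


# Hypothetical identified gains: 2 rack inlets (rows), 3 CRAC units (columns).
G = np.array([[1.0e-3, 1.0e-4, 2.0e-5],
              [5.0e-5, 8.0e-4, 6.0e-4]])
Gr = reduce_gain_matrix(G, lam=0.2)

# Hypothetical operating point: temperatures in deg C, VFD in percent.
T_next = step_inlet_temperatures(
    T=[25.0, 27.0],
    SAT=[18.0, 18.0, 20.0],
    VFD=[80.0, 60.0, 50.0],
    Gr=Gr,
    C=[0.5, 0.6],
)
print(Gr)
print(T_next)
```

With λ = 0.2, the first rack inlet in this toy example keeps only its gain to the first CRAC unit, while the second keeps its gains to the second and third units. A larger λ gives a sparser $G^r$ at the cost of discarding more of the weak couplings.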
A. Thermal Zone Mapping

Previously, CRAC unit thermal zone mapping has been based on the absolute values of the thermal correlation index (TCI) defined in [13] as:

$$TCI_{i,j} = \frac{\Delta T_i}{\Delta SAT_{CRAC,j}}, \tag{6}$$

which quantifies the steady-state response of the $i$th rack inlet temperature to a step change in the SAT of the $j$th CRAC unit. This steady-state method, however, has a drawback: the TCI values obtained through the system commissioning process depend on the blower speed settings of the CRAC units. The thermal zone of a particular CRAC unit established using TCI could expand or shrink when its blower speed increases or decreases, leading to a family of data center thermal zone mappings under different CRAC unit blower speed settings.

The drawback of the TCI based thermal zone mapping method outlined above comes from the fact that only steady-state system information is utilized and the rich dynamic system information embedded in system parameters such as the $G^r$ matrix is left out. From the CRAC’s perspective, CRAC #$j$ effectively affects rack inlet temperature $T_i$ ($1 \le i \le N_T$) only if the corresponding item $g^r_{i,j}$ of matrix $G^r$ is nonzero. The nonzero items of the $j$th column of $G^r$ define the zone of influence of the $j$th CRAC unit, with the value of each item denoting the intensity of the influence of CRAC #$j$ on a particular rack inlet temperature. Using this model based approach, Fig. 3 shows the thermal zones of a research data center with 8 CRAC units and 10 rows of racks, where the thermal zone of each CRAC unit is approximated by a balloon. The significant overlaps between the thermal zones of CRAC #3 and #4, and of CRAC #5 and #6, are clearly indicated in the figure. Since CRAC units with overlapping thermal zones all affect the rack inlet temperatures in the overlapping area, some coordination might be necessary between them so that their loads are balanced when managing the shared rack inlet temperatures.

Fig. 3. Model Based Thermal Zone Mapping of a Research Data Center

Compared with the thermal correlation index (TCI) based method, the thermal zone identified using the model based method for each CRAC unit does not vary with its blower speed, since $G^r$ captures the combined effects of CRAC unit SAT and blower speed on rack inlet temperatures. Because of the consistency of this model based thermal zone mapping method, it is valuable for distributed controller design of large scale data centers. In the authors’ previous work [12], a decentralized controller is designed for each CRAC unit to regulate the rack inlet temperatures within its established zone of influence.

B. Load Balancing Based on CRAC Units Grouping

In raised-floor air-cooled data centers, there might be strong interactions between neighboring CRAC units, causing load balancing problems such as load piggybacking and load swapping, which are not uncommon in decentralized or decoupled CRAC unit controller designs.

In load piggybacking, one of the CRAC units may keep increasing its cooling provisioning in order to drive a temporary rack inlet temperature violation below the specified threshold, and part of the cool air from this CRAC unit may also be routed to the racks intended to be cooled by the neighboring CRAC units because of the shared under-floor plenum. Observing the rack temperature decrease in its thermal zone, a neighboring CRAC unit may piggyback on the CRAC unit with high load and decrease its own load provisioning, in turn causing the high load CRAC unit to reach an even higher load. Load piggybacking often manifests itself as a high load CRAC unit with low SAT and high blower speed, with its neighbor(s) working at the opposite extreme with high SAT and low blower speed. The significant load imbalance between CRAC units from load piggybacking may shorten the CRAC units’ life span, and the mixing of supply air streams with a large temperature difference also generates entropy and lowers the overall data center cooling efficiency.

Apart from load piggybacking, load swapping can also be observed for CRAC units that are individually controlled by different controllers. When stuck in load swapping, the high and low load status could switch back and forth between neighboring CRAC units, resulting in oscillation in the SAT and blower speeds. Load swapping could be triggered by temperature disturbances such as a sudden and temporary load increase from a server rack, opening of a cold aisle vent tile due to maintenance, or introduction of free cooling from the outside air, and may not stop without intervention of the operator.

The root cause of both load piggybacking and load swapping is a lack of coordination in decentralized or decoupled CRAC unit controllers, in which the physical input coupling between neighboring thermal zones is neglected. The local controller for each thermal zone tries to maintain the thermal status within the zone using the least effort (usually through minimization of the cooling power required), and uncoordinated local optimization could easily lead to a globally suboptimal solution (load piggybacking) or instability (load swapping). In order to address these problems, a simple yet effective load balancing mechanism needs to be established between neighboring thermal zones to coordinate the outputs of the CRAC units.
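
As a hedged illustration of the zone-of-influence definition in Section III-A, the sketch below reads thermal zones off the columns of a reduced gain matrix and lists the CRAC pairs whose zones overlap, which are the candidates for coordination. The function names `thermal_zones` and `overlapping_pairs` and the 4-by-3 matrix are hypothetical, and the pairwise overlap test is an assumption made for illustration rather than the grouping procedure used in the experiments.

```python
from itertools import combinations

import numpy as np


def thermal_zones(Gr):
    """Map CRAC index j to the set of rack inlet indices i with Gr[i, j] != 0."""
    Gr = np.asarray(Gr, dtype=float)
    return {j: set(np.flatnonzero(Gr[:, j]).tolist())
            for j in range(Gr.shape[1])}


def overlapping_pairs(zones):
    """Return CRAC pairs whose zones share at least one rack inlet."""
    return [(a, b) for a, b in combinations(sorted(zones), 2)
            if zones[a] & zones[b]]


# Hypothetical reduced gain matrix: 4 rack inlets (rows), 3 CRAC units (columns).
Gr = np.array([[0.9, 0.0, 0.0],
               [0.5, 0.4, 0.0],
               [0.0, 0.7, 0.0],
               [0.0, 0.0, 0.8]])

zones = thermal_zones(Gr)
pairs = overlapping_pairs(zones)
print(zones)
print(pairs)
```

In this toy example, rack inlet 1 lies in the zones of both CRAC 0 and CRAC 1, so only that pair is reported as overlapping. Because the zones depend only on which entries of the reduced matrix are nonzero, they do not change with the blower speed settings.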
The first step toward CRAC load balancing is to identify, for each thermal zone, the neighboring thermal zones that it needs to coordinate with. For each rack inlet temperature $T_i$, the corresponding temperature violation $T_{v,i}$ over its reference temperature $T_{ref,i}$ is defined as:

$$T_{v,i} = T_i - T_{ref,i}.$$

In the $j$th thermal zone, the rack inlet temperature with the highest $T_{v,i}$ is called its master sensor, and is denoted as $T_{m,j}$. At each control interval, the local controller of each thermal zone broadcasts the index and reference temperature of its master sensor, together with its CRAC unit’s current SAT and blower speed setting. Each local controller also receives broadcast messages from all other thermal zones, and compares its master sensor with those of its neighbors. After this information exchange, each local controller has an up-to-date snapshot of the settings of all the cooling controllers, and CRAC units sharing the same master sensor can be grouped for load balancing purposes.

In the case that several thermal zones share the same master sensor, it is desirable that all the CRAC units in this group coordinate to the same SAT, since mixing of supply air streams at different temperatures generates entropy and lowers cooling efficiency. In order to coordinate to the same SAT, the thermal zone with the highest SAT of the group adds to the SAT setting of its local controller an additional load balancing term:

$$SAT_c = k \cdot (SAT_{min} - SAT_{max}),$$

in which $SAT_{min}$ and $SAT_{max}$ are the lowest and highest CRAC unit SAT settings of the group, and $k$ is an appropriate feedback gain for SAT coordination. Note that in this coordination mechanism, load balancing is only performed on the low load CRAC unit to increase its provisioning; the reason is to minimize the chance of rack inlet temperature violation during the load balancing process.

Figure 4 shows the experimental results in the research data center shown in Fig. 3 with 8 CRAC units. Before load balancing is enabled at time t = 1 hr, the data center has already entered a relatively steady state. CRAC #3 and #4 share the same master sensor, and CRAC #5 and #6 share another master sensor. Due to the lack of coordination between the decentralized controllers, each of which controls one CRAC unit, the supply air temperature difference is as large as 2.2 °C between CRAC #3 and #4, and 1.8 °C between CRAC #5 and #6. After load balancing is enforced at time t = 1 hr, CRAC #3 and #4 quickly converge to the same SAT, and the same is true for CRAC #5 and #6. The blower speeds of these four CRAC units, expressed as a percentage of the maximum variable frequency drive (VFD) output, also reach their new steady-state settings after load balancing is enforced, as shown in Fig. 4(b). The trajectories of the other CRAC units are not shown here since their settings do not change for the duration of the experiment, either because they do not share master sensors (CRAC #1 and #2) or because load balancing has already been achieved (CRAC #7 and #8).

Fig. 4. Load Balancing: (a) SAT, (b) VFD

IV. MODEL BASED HOT SPOT DETECTION

In data center cooling management, the ability to identify hot spots is essential. While examining the snapshot or temporal trends of the data center rack inlet temperature distribution is helpful, a systematic approach is needed to automate the process.

The detection of hot spots cannot be accomplished without the definition of an appropriate metric or measure. While most people tend to believe that the highest rack inlet temperature within the entire data center or a thermal zone indicates a hot spot, and thus that temperature is the right measure, this is not always the case. First, the location where the highest rack inlet temperature is observed within the data center or a thermal zone could easily change as the settings of the CRAC units vary. A hot spot previously identified could disappear when the thermal zone it belongs to is over provisioned while the neighboring thermal zones are insufficiently provisioned. Second, the hot spot detection results using the temperature measure might be subject to various disturbances and depend on whether the detection is performed when the data center has reached a relatively steady state. These drawbacks indicate that using temperature as the measure for data center hot spot detection lacks consistency. A new and improved metric for data center hot spot detection needs to be developed.

In order to address this problem, we define the model based Hot Spot Index (HSI) for data center hot spot detection. For a rack inlet temperature $T_i$ with reduced order model:

$$T_i(k+1) = T_i(k) + \Big\{ \sum_{j=1}^{N_{CRAC}} g^r_{i,j} \cdot [SAT_j(k) - T_i(k)] \cdot VFD_j(k) \Big\} + C_i,$$

HSI is defined as:

$$HSI \triangleq \frac{C_i}{\| g^r_i \|_1}, \tag{7}$$

in which vector $g^r_i = [g^r_{i,1}\ g^r_{i,2}\ \cdots\ g^r_{i,N_{CRAC}}]$, and $\| \cdot \|_1$ stands for the vector 1-norm. From a physical perspective, HSI measures the ratio between the effects of hot air recirculation and of CRAC unit tuning on a specific rack inlet temperature. A high HSI value means that hot air recirculation is severe while the cooling effects from all the CRAC units are weak at the location of interest, and hence indicates a potential hot spot.
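
The HSI of Eqn. (7) can be evaluated for every rack inlet at once from the reduced gain matrix and the recirculation terms. The following is a minimal sketch under stated assumptions: the function name `hot_spot_index` and all numerical values are hypothetical, and every row of the reduced matrix is assumed to contain at least one nonzero gain so that the 1-norm is positive.

```python
import numpy as np


def hot_spot_index(Gr, C):
    """HSI_i = C_i / ||g_i^r||_1, computed for every rack inlet i (Eqn. (7))."""
    Gr = np.asarray(Gr, dtype=float)
    C = np.asarray(C, dtype=float)
    norms = np.abs(Gr).sum(axis=1)  # vector 1-norm of each row of Gr
    return C / norms


# Hypothetical reduced gains (3 rack inlets, 3 CRAC units) and recirculation terms.
Gr = np.array([[1.0e-3, 0.0, 0.0],
               [0.0, 8.0e-4, 6.0e-4],
               [2.0e-3, 2.0e-3, 0.0]])
C = np.array([0.2, 0.14, 0.6])

hsi = hot_spot_index(Gr, C)
ranking = np.argsort(-hsi)  # rack inlet indices, highest HSI first
print(hsi)
print(ranking)
```

In this toy example the third rack inlet has the largest recirculation term but also strong cooling gains, so it ranks below the first rack inlet, whose recirculation term is smaller but whose cooling influence is weak. This is the sense in which HSI weighs recirculation against the available cooling effect rather than looking at either alone.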
Fig. 5. HSI Values for Rack Inlet Locations of a Research Data Center in Descending Order

Figure 5 shows the HSI values in descending order for the 192 rack inlet temperature locations in the research data center shown in Fig. 3. A closer look at the locations with the leading HSI values points to rack inlet temperatures at racks B6, D5, D6, and G9. Among these hot spots, racks B6, D5, and D6 are affected by severe reverse flow from the network switches mentioned earlier, while G9 is at the end of row G and is most affected by the hot air escaping from the hot aisle. The locations of these hot spots agree with the observations of the research data center. Furthermore, the severity of these hot spots is also indicated by their corresponding HSI values, which cannot be easily detected by simply observing the data center temperature distribution. Rack B6, for example, has a higher HSI value, and its reverse flow is found to be much more severe than that of racks D5 and D6, which follow B6 in the HSI ranking.

Fig. 6. Rack Inlet Temperatures (in Descending Order of HSI): (a) Uniform CRAC Settings, (b) Nonuniform CRAC Settings

Compared with the HSI based method, relying solely on the data center temperature distribution for hot spot detection could lead to misleading results. Figure 6, for example, shows the temperature distribution of the aforementioned research data center in descending order of HSI. In Fig. 6(a), all 8 CRAC units are configured with the same SAT and blower speed. The trend of rack inlet temperatures roughly follows that of the HSI values shown in Fig. 5, meaning that in this particular data center cooling configuration the rack inlet temperatures can be used as a reference for hot spot detection. The rack inlet temperatures shown in Fig. 6(b), however, vary significantly in both amplitude and relative ranking under a nonuniform setting of the CRAC units. Although the five rack inlet locations with the highest HSI values still have the highest temperatures, other hot spots identified by the HSI based method, such as those in racks D5 and G9, now have relatively low temperatures among all the rack inlet locations and hence could be left out in hot spot detection.

V. CONCLUSIONS

In this paper, a data center management and analysis scheme based on a dynamic rack inlet temperature model is introduced. The model parameter identification and model reduction procedures are both discussed. An improved thermal zone mapping approach is introduced through model based rack inlet temperature grouping, and a load balancing mechanism is investigated through grouping of CRAC units. In order to detect hot spots within data centers, the Hot Spot Index (HSI) is defined using the model parameters and proves to be effective in data center hot spot detection.

REFERENCES

[1] Steve Greenberg, Evan Mills, Bill Tschudi, Peter Rumsey, and Bruce Myatt. Best practices for data centers: Results from benchmarking 22 data centers. In 2006 ACEEE Summer Study on Energy Efficiency in Buildings.
[2] Chandrakant D. Patel, Cullen E. Bash, Ratnesh K. Sharma, Monem H. Beitelmal, and Rich J. Friedrich. Smart cooling of data centers. In IPACK03, The Pacific Rim/ASME International Electronic Packaging Technical Conference and Exhibitions.
[3] ASHRAE. Datacom equipment power trends and cooling applications. Atlanta, GA, 2005.
[4] Cullen E. Bash, Chandrakant D. Patel, and Ratnesh K. Sharma. Efficient thermal management of data centers – Immediate and long-term research needs, volume 9, no. 2. HVAC&R Research, Apr 2003.
[5] Monem H. Beitelmal and Chandrakant D. Patel. Thermo-fluids provisioning of a high performance high density data center. Distributed and Parallel Databases, 21(2):227–238, 2007.
[6] Roger Schmidt, Mike Ellsworth, Madhu Iyengar, and Gary New. IBM’s power6 high performance water cooled cluster at NCAR: Infrastructure design. In ASME 2009 InterPACK Conference collocated with the ASME 2009 Summer Heat Transfer Conference and the ASME 2009 3rd International Conference on Energy Sustainability (InterPACK2009), San Francisco, California, USA, July 19–23 2009.
[7] Jim VanGilder. Real-Time data center cooling analysis. Electronics Cooling, September, 2011.
[8] Mark Seymour, Christopher Aldham, Matthew Warner, and Hassan Moezzi. The increasing challenge of data center design and management: Is CFD a must? Electronics Cooling, December, 2011.
[9] Kishor Khankari. Thermal mass availability for cooling data centers during power shutdown. ASHRAE Transactions, 116(2):205–217, 2010.
[10] Michael Kummert, William Dempster, and Ken McLean. Thermal analysis of a data centre cooling system under fault conditions. In 11th International Building Performance Simulation Association Conference and Exhibition, Building Simulation 2009, Glasgow, Scotland, July 27–30 2009.
[11] Rongliang Zhou, Zhikui Wang, Cullen E. Bash, Christopher Hoover, Rocky Shih, Alan McReynolds, Niru Kumari, and Ratnesh K. Sharma. A holistic and optimal approach for data center cooling management. In American Control Conference (ACC2011), pages 1346–1351. IEEE, 2011.
[12] Rongliang Zhou, Zhikui Wang, Cullen E. Bash, and Alan McReynolds. Modeling and control for cooling management of data centers with hot aisle containment. In ASME 2011 International Mechanical Engineering Congress & Exposition, Denver, USA, November 11-17 2011.
[13] Cullen E. Bash, Chandrakant D. Patel, and Ratnesh K. Sharma. Dynamic thermal management of air cooled data centers. In Thermal and Thermomechanical Phenomena in Electronics Systems, 2006. ITHERM’06. The Tenth Intersociety Conference on, pages 445–452. IEEE, 2006.
