Wiboonrat 2017

Zone Protection Approach of Data Center System Reliability and Power Quality


Montri Wiboonrat
College of Graduate Study in Management
Khon Kaen University, Bangkok Campus
Bangkok, Thailand
mwiboonrat@gmail.com, montwi@kku.ac.th

Abstract: A critical analysis of power quality and power reliability in data centers is an important stepping stone toward improving data center operating efficiency. All data center standards, such as Uptime Tier Classification, TIA-942, BICSI 002, BITKOM, and EN 50600, define system reliability design by topology but do not address power quality. Most topologies are defined by a single line diagram of the power distribution system (PDS). The PDS design approach uses a coherent zone protection mechanism, which attempts to eliminate single points of failure by applying redundant topologies such as N (with N ≥ 1), N+1, 2N, 2(N+1), and 2N+2N throughout the PDS. This paper describes the zone protection approach to system reliability and power quality design for Tier 4 data centers. It analyzes the PDS designs of 8 Tier 3 data centers and 2 Tier 4 data centers and compares 5 data center standards with respect to how to achieve the highest reliability and harmonic control through zone protection as a recheck point. The results of the zone protection analysis propose a new PDS topology that can be practically deployed to improve the overall power quality and reliability of data centers.

Keywords: data center; zone protection; power quality; power distribution; harmonic control

I. INTRODUCTION
Electronic systems are designed to operate for a specified period of time, which is determined by cost and performance as well as customer requirements. The ability of electronic systems to operate within this time frame is referred to generally as reliability. Since the power distribution systems (PDS) of a data center consist of electronic components (devices), the system reliability of the PDS depends strongly on the reliability of the individual components in the sub-system environment. At present, it is not feasible to predict the lifetime or degradation rate of any individual electronic component of the PDS inside a data center. It is possible, however, to treat large populations of components and systems probabilistically and thereby predict the average mean time to failure (MTTF). The results obtained from measuring and then predicting component survivability are used as input to a probabilistic analysis of the reliability of an entire data center system.

Power producers cannot guarantee an uninterrupted electrical power supply, and in standard contracts the power supply companies (PSCs) disclaim any liability. Short interruptions or longer power outages therefore have to be resolved by emergency power systems, such as power generators and uninterruptible power supplies (UPS), to ensure the continued operation of data centers with all their related technical systems such as climate control, power load, and security.

Statistical measures of data center system reliability, based on such probabilistic analyses, form the basis of engineering decisions. The available reliability tools, in general IEEE Std 493, Design of Reliable Industrial and Commercial Power Systems [1], and BITKOM [8], permit the systems designer to secure system reliability. Consequently, an understanding of reliability at the component and sub-system level is essential to develop accurate estimates of data center system reliability.

Power quality is associated with alternating current (AC) power line harmonics. Natural sources of harmonic current in electric power systems include power electronic converters such as uninterruptible power supplies (UPS), heating, ventilation, and air conditioning (HVAC), inverters for distributed generation, variable speed drives (VSD), and industrial applications. These sources inject harmonic current into the power system network. Harmonic currents are wasted energy that appears as heat, and heat degrades the system performance and life expectancy of various equipment.

This paper presents an in-depth investigation of existing work on power distribution system (PDS) design in data centers and organizes the PDS designs using a coherent zone protection approach. A single line diagram of the power distribution system in a data center is used to illustrate the zone protection analysis. While there are many PDS designs for different components and systems of data centers, the designs are largely unorganized and lack an overall framework that data center designers can use as a guideline to model more sophisticated and complex systems. Furthermore, this paper gives a more detailed taxonomy of zone protection in system reliability design for data centers, to help in understanding the complexities of the power distribution system within the data center system architecture. The power quality objective of this study is to identify the equipment that could be potential sources of harmonic currents or nonlinear loads before redesigning a new data center topology.

II. BACKGROUND
A. Data Center Standards
The Thai Government notes that there are multiple building standards governing data centre construction, including the Building Code of Thailand and various local government Acts and regulations.

Fig. 1. International standards of data centers.

It is expected that all proposed data centre facilities have received, or will receive, all relevant and necessary approvals. As a basis for shared understanding, this paper nominates international standards such as Uptime [6], BICSI [7], TIA-942 [11], BITKOM [10], and EN 50600 [12] as a means of describing data center facilities, as presented in Fig. 1.

B. Series System of Security Protection Theory
If n units are put in service, how many failures might be expected in a year? Assume no replacement of failed units, and that the infant mortality period is over.

Solution. The expected number of failures between times t1 and t2 is given by multiplying the cumulative distribution function by n [2], the number of units. Taking t1 = 0 and t2 = t with a constant failure rate λ (the exponential case, consistent with the end of the infant mortality period):

$$n\int_{0}^{t} f(\tau)\,d\tau = \text{expected number of failures} = n\left[1 - e^{-\lambda t}\right] \qquad (1)$$

C. Data Center Topology
The quantification of the reliability of parallel units is based on the assumption that, when a redundant unit fails, the failure rate or the reliability of the surviving units does not change during the operating mission [14]. For example, in Fig. 2: in an (N) topology with N = 1, a single unit takes 100% of the load; in an (N+1) topology with N = 1, 2 parallel units are each able to take 100% of the load; in an (N+1) topology with N = 2, 3 parallel units are each able to take 50% of the load.

Fig. 2. Reliability of parallel equipment N, N+1, 2N, and 2(N+1).

There will be situations, however, when this assumption does not hold and the failure rate, or the reliability, of the surviving equipment changes. Usually the failure rate will increase and the reliability will decrease, because the surviving equipment shares the load during the operating transfer mission; consequently, its share of the load increases [13], as indicated in Fig. 3.

[Fig. 3, panel (a): pdf f(T) versus operating time t for two parallel power systems when primary power fails at t1; curves labeled $\lambda_1 e^{-\lambda_1 T}$, $\lambda_2 e^{-\lambda_2 T}$, and $\lambda_2' e^{-\lambda_2'(t - t_1)}$. Panel (b): the same when secondary power fails; curves labeled $\lambda_1 e^{-\lambda_1 T}$, $\lambda_1' e^{-\lambda_1'(t - t_2)}$, and $\lambda_2 e^{-\lambda_2 T}$. Both panels also show the failure rate of the parallel system, $\lambda_{PS}(T) = \dfrac{2\lambda e^{-\lambda T} - 2\lambda e^{-2\lambda T}}{2e^{-\lambda T} - e^{-2\lambda T}}$.]

Fig. 3. Comparison between pdf and operating time of a load-sharing system.

The reliability of two parallel load-sharing systems (N+1) [3], where the two systems are identical (N = 1) and λ1 = λ2 = λ, is given by (2), in which λ' is the failure rate of the surviving system after the other has failed:

$$R_{LS2}(t) = e^{-2\lambda t} + \frac{2\lambda}{\lambda' - 2\lambda}\left(e^{-2\lambda t} - e^{-\lambda' t}\right) \qquad (2)$$

The corresponding expression for 3 parallel systems (N+1) [3], where the 3 systems are identical (N = 2) and λ1 = λ2 = λ3 = λ, is given by (3):

[Equation (3): three-unit load-sharing expression; not legible in the source text.]

The topology of Tier 4 [6], Class F4 [7], or DC Category D [10] is a combination of (N+1) and 2(N+1), as illustrated in Fig. 4.
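As a numerical illustration of the two-unit load-sharing expression (2), the following minimal Python sketch evaluates it and compares it with an ordinary two-unit parallel system. The function names and the numeric rates are illustrative assumptions, not values from the paper. When the survivor's failure rate λ' equals λ, the expression should reduce to the familiar parallel result 2e^(−λt) − e^(−2λt), which serves as a sanity check.

```python
import math


def load_sharing_reliability(t, lam, lam_prime):
    """Reliability of two identical load-sharing units, following (2).

    lam       -- failure rate of each unit while both share the load
    lam_prime -- failure rate of the survivor once it carries the full load
    Assumes lam_prime != 2 * lam (the closed form divides by their difference).
    """
    both_survive = math.exp(-2.0 * lam * t)
    one_survives = (2.0 * lam / (lam_prime - 2.0 * lam)) * (
        math.exp(-2.0 * lam * t) - math.exp(-lam_prime * t)
    )
    return both_survive + one_survives


def parallel_reliability(t, lam):
    """Two identical parallel units whose failure rate never changes."""
    return 2.0 * math.exp(-lam * t) - math.exp(-2.0 * lam * t)


if __name__ == "__main__":
    lam, t = 0.1, 5.0  # hypothetical rate (per year) and mission time (years)
    print(parallel_reliability(t, lam))
    print(load_sharing_reliability(t, lam, lam_prime=3.0 * lam))
```

With a survivor failure rate higher than λ, the load-sharing reliability falls below the plain parallel value, which is the effect described for Fig. 3.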
[Figure: single line diagram showing (N+1) and 2(N+1) arrangements of generators and UPS units for Tier 4/Class F4/DC Category D.]

Fig. 4. Tier 4/Class F4/DC Category D 2(N+1) topology [15].

III. RESEARCH METHODOLOGY
System protection is the art and science of detecting problems with power system components and isolating these components. Problems on power systems include short circuits, abnormal conditions, and equipment failures.

The research question is: what maximum IT downtime can the business tolerate? In this paper, business downtime refers to power outages inside data centers, which normally covers all causes arising from failures of parts, components, equipment, sub-systems, and systems. The power distribution analysis of the data center was carried out by applying a reliability research framework consisting of dependability and security. This paper analyzes the design of power distribution systems (PDS) from 8 Tier 3 data centers (6 sites are already in operation and the other 2 sites are under construction) and 2 Tier 4 data centers (both sites are under construction) in Thailand.

IV. ZONE ANALYSIS OF DEPENDABILITY PROTECTION APPROACH
The primary purpose of zone protection of the PDS for data centers is to protect the mission critical equipment from out-zone faults (such as utility outage, transformer failure, generator malfunction, or battery failure: upstream loads) and in-zone faults (such as cables, breakers, distribution panels, and power supply units: downstream loads). The PDS of the data center was segregated into 5 zones: Zone 0, Zone I, Zone II, Zone III, and Zone IV, as presented in Fig. 5. Each zone needs to support power through to the next zone, from Zone 0 to Zone IV. The dependability and security protection approaches were analyzed through the single line diagram of the PDS in Fig. 5 and Fig. 6 respectively.

The dependability protection approach considers redundancy mechanisms such as the N+1, 2N, and 2(N+1) topologies. These scenarios are designed to eliminate single points of failure (SPOF) from the PDS. As in Fig. 5, critical components are duplicated to increase system reliability, for example 2N utility sources, 2N transformers, 2(N+1) power generators, 2N ATSs, and 2N PDSs through to the critical IT loads. This system protection approach is well known as fault tolerance and is applied by all international data center standards, such as Tier 4 [6], Class F4 [7], or DC Category D [10].

[Figure: single line diagram of the PDS with A and B sides, divided into Zone 0 (Utility A, Utility B) through Zone IV (racks grouped L, M, H). Components shown include fuses, transformers, relays, generators (N+1), ATS, tie circuit breakers with interlock, bus ducts, bus bars, CRAC units (N+1), UPS units with flywheels or batteries and bypass transformers, STS, isolation transformers, and feeds to the fire system, control system, and critical fans or pumps.]

Fig. 5. Dependability protection approach.

The security protection approach is derived from the series system of security protection theory in (1). Security protection is a reverse analysis starting from the critical IT loads, applying mission critical load shedding (MCLS), called "zonal survivability" or Zone IV. An in-zone PDS may or may not need an energy storage module (ESM) to meet business downtime tolerance or quality of service requirements. An ESM can use a host of technologies depending on the power load and energy requirements, such as batteries, capacitors, flywheels, or superconducting magnetic energy storage. The time for which the ESM supplies the critical IT loads depends on business downtime tolerance and on the technology, and varies from 15 seconds to 30 minutes. Security zone protection can be classified into 4 stages, as depicted in Fig. 5. Zonal survivability, or Zone IV, is protected by Zone III (UPS + batteries or flywheels: 15 seconds to 30 minutes) as stage 1, Zone II (power generators: 12-96 hours) as stage 2, and Zone I (dual feed from different utility substations) as stage 3.
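The staged protection of Zone IV can be sketched as a simple lookup: given the duration of a utility outage, which time-limited stage is the last one needed to bridge it? The Python sketch below is illustrative only. The `Stage` type, the `covering_stage` helper, and the specific ride-through values are assumptions chosen from within the ranges quoted above (30 minutes for stage 1, 12 hours for stage 2), and the stages are assumed to run back to back. Stage 3, the dual utility feed, is not time-limited and is therefore left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    zone: str
    ride_through_s: float  # how long this stage alone can carry the critical load


# Hypothetical ride-through times picked from the ranges quoted in the text.
STAGES = [
    Stage("stage 1: UPS + batteries or flywheels", "Zone III", 30 * 60),
    Stage("stage 2: power generators", "Zone II", 12 * 3600),
]


def covering_stage(outage_s, stages=STAGES):
    """Return the first stage whose cumulative ride-through covers the outage.

    Returns None when the outage outlasts every time-limited stage, in which
    case only an alternate utility feed (stage 3) can keep Zone IV alive.
    """
    elapsed = 0.0
    for stage in stages:
        elapsed += stage.ride_through_s
        if outage_s <= elapsed:
            return stage
    return None


if __name__ == "__main__":
    for outage in (60, 2 * 3600, 10 * 24 * 3600):
        stage = covering_stage(outage)
        print(outage, stage.zone if stage else "beyond stages 1 and 2")
```

A real design would also account for generator start-up time and fuel resupply, which this sketch ignores.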
A. Zone 0: Utilities (2N) Protection Approach
Possible causes of power outages: technical faults in the power transmission, sub-station, power distribution, utility breaker, and drop fuses (out of the data center's control).

When multiple utility sources are employed to increase reliability, the question is whether the sources should be operated in parallel or should be isolated, with an automatic transfer control scheme to switch from a failed source to the alternate source. The degree of independence of the utility sources is also important in determining the available improvement in data center system reliability. Multiple utility distribution feeders should preferably come from different substations.

Dual utility sources are not required to meet the criteria for any Tier. Loss of utilities is not considered a failure but a normal operational condition for which the data center must be prepared [6]. However, BICSI 002 [7] Class F4 requires dual feed from different utility substations.

B. Zone I: Generators 2(N)+2(N) Protection Approach
Possible causes of power outages: technical faults in the power distribution system such as cables, breakers, and distribution panels; faults in the backup power systems such as continuous power generators.

Continuous generators [4] must be fitted with switching equipment or an automatic transfer switch (ATS) on their output to transfer loads from the normal power supply to the generator, and back to the normal power supply when it is available and acceptable. The ATS typically contains all the controls essential to sense loss of the normal power supply, to start the continuous generators, and to transfer the power to the load from one source or the other, including all critical time delays. Reliability protection in continuous generator power can be enhanced by applying more than one generator (N+1) [1], as shown in Fig. 6, especially where data center downtime is not acceptable and when the size and number of loads can be separated (2N) according to their criticality. Uptime [6] Tier 4 requires 12 hours of onsite fuel storage for N capacity, while BICSI 002 [7] Class F4 requires a minimum fuel run time of 96 hours for N+1 capacity and BITKOM [10] requires a minimum fuel reserve of 72 hours. Therefore, the engine-generator plant may not be the only source of power reliability.

Fig. 6. Continuous generators power (N+1).

C. Zone II: UPSs 2(N+1) Protection Approach
Possible causes of power outages: technical faults in the power distribution system such as cables, breakers, and distribution panels; faults in the uninterruptible power sources (UPS systems) or battery packs.

UPS systems provide the facility with the highest level of protection by isolating the electronic equipment from raw utility power. The system works by converting power from AC to DC and then back to AC. This type of unit is the only UPS that provides power with zero transfer time to the battery, making it ideal for sensitive and mission-critical equipment such as servers, storage, and networks. In addition, an online double conversion UPS has an internal static bypass, so that if the installed UPS experiences a catastrophic failure or requires maintenance, the system may be able to keep critical loads online during repair or replacement [5], as illustrated in Fig. 7.

Fig. 7. Double conversion UPS.

D. Zone III: Dual Power Paths (2N) Protection Approach
Possible causes of power outages: technical faults in the power distribution system such as cables, breakers, and distribution panels.

Because Tier 4 provides dual power distribution paths to the IT load, all IT equipment shall be dual powered as defined by the Institute's Fault Tolerant Power Compliance Specification, Version 2.0, and installed properly to be compatible with the topology of the site's classification.

E. Zone IV: Load Shedding Protection Approach
Possible causes of power outages: technical faults in the power distribution system such as cables, breakers, and distribution panels; technical faults in power supply units or in equipment such as servers, storage, or networks.

Load shedding schemes can be deployed to control peak demand levels and ensure service continuity to critical loads. A further consideration is that the load shedding control system, which initiates the shutting down of business applica…tion, must be planned according to the priority ranking of applications, servers, storage, and networks. The control system makes recommendations on which lowest-priority processes to shut down and calculates the additional residual uptime available to the high-priority processes or critical applications. The most critical applications and supporting equipment must be listed and grouped for operation on power from the UPS batteries, as depicted in Figure 9. This is the last scenario of zone protection: the utility is out, the backup generators have failed, and only the UPS batteries are supplying power.
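The load shedding logic described for Zone IV (shut down the lowest-priority processes first, then compute the residual uptime left for the critical ones) can be sketched as follows. This is a minimal illustration under stated assumptions, not the control system used in the paper: the `LoadGroup` type, the `shed_until` function, and all power and energy figures are hypothetical, and battery behaviour is idealized as usable energy divided by load, ignoring inverter losses and discharge-rate effects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadGroup:
    name: str
    priority: int    # 1 = most critical; larger numbers are shed first
    power_kw: float


def shed_until(groups, battery_kwh, required_hours):
    """Shed lowest-priority groups until the rest can run for required_hours.

    Returns (kept, shed, uptime_hours). Uptime is the idealized ratio of
    usable battery energy to the remaining load.
    """
    kept = sorted(groups, key=lambda g: g.priority)
    shed = []
    while kept:
        load_kw = sum(g.power_kw for g in kept)
        uptime = float("inf") if load_kw == 0 else battery_kwh / load_kw
        if uptime >= required_hours:
            return kept, shed, uptime
        shed.append(kept.pop())  # last element has the lowest priority
    return kept, shed, float("inf")


if __name__ == "__main__":
    groups = [
        LoadGroup("high", 1, 40.0),
        LoadGroup("medium", 2, 30.0),
        LoadGroup("low", 3, 30.0),
    ]
    kept, shed, uptime = shed_until(groups, battery_kwh=50.0, required_hours=0.75)
    print([g.name for g in kept], [g.name for g in shed], uptime)
```

In this hypothetical case the full 100 kW load would last 0.5 hours, so the low and medium groups are shed in turn and the remaining 40 kW load runs for 1.25 hours.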
V. ZONE ANALYSIS OF HARMONIC PROTECTION APPROACH
The primary purpose of zone analysis of harmonic protection of the PDS for data centers is to determine the level of harmonic currents that can be tolerated within the PDS and other systems, subject to power quality and efficiency loss, before deciding to add or remove equipment to mitigate harmonic distortion. The most critical equipment in a data center is IT equipment, which frequently requires AC sources that have no more than a 5% harmonic voltage distortion factor, with the largest single harmonic being no more than 3% of the fundamental voltage, as shown in Fig. 8. Higher levels of harmonics result in erratic, sometimes subtle, malfunctions of the IT equipment that can, in some cases, have serious consequences. Instruments can be affected similarly, giving erroneous data or otherwise performing unpredictably. Perhaps the most serious of these are malfunctions in medical instruments [17].

Fig. 8. Voltage distortion limits in IEEE Std 519 [16].

A. Zone 0: Ring Main Unit (2N+2N) Protection Approach
A Ring Main Unit (RMU) is an 11 kV-33 kV HT panel with 3 nodes of circuit breakers or isolators: two MV nodes for incoming feeds and one for the outgoing feed to the data center. It enables the data center to use 2 sources of HT power at the same metering point, as shown in Fig. 9.

B. Zone I: Isolation Transformers and Generators (2N+2N) Protection Approach
The new PDS topology design for data centers isolates all VAR systems, such as pumps, CRACs, chillers, and inverters for distributed generation, from the IT equipment loads, since these converters conventionally generate current harmonics and pollute the power network with unwanted circulating high frequency currents. Using an isolation transformer ahead of the three-phase UPS system can provide harmonic segregation [18], as demonstrated in Fig. 9. This is a crucial protection stage that shall be considered, because the UPS systems in use present a low capacitive impedance to harmonic currents. Such a low-impedance path can attract additional harmonic current components from adjacent nonlinear loads.

[Figure: single line diagram of the proposed PDS with A and B sides, divided into Zone 0 (Utility A, Utility B, each with a Ring Main Unit and tie circuit breaker interlock) through Zone IV (racks grouped L, M, H). Components shown include fuses, duplicated transformers and relays, generators (N), ATS, bus ducts, bus bars, CRAC units (N+1), UPS units with flywheels or batteries and bypass transformers, STS, isolation transformers, and feeds to the fire system, control system, and critical fans or pumps.]

Fig. 9. Harmonic protection approach.

C. Zone II: Isolation Bus Bar (2N) Protection Approach
Each piece of equipment or system described as a nonlinear load draws the fundamental current component and injects the higher frequency harmonic current components back to the point of common coupling (PCC). The major power quality indices are the line current total harmonic distortion (THDi) for all connected loads and the voltage harmonic distortion (THDv) for the connecting points and bus bars.

VI. CONCLUSION
The system reliability of a data center depends on the initial design of its power distribution system, which interprets data and information from business requirements. The crucial requirement is compliance with international standards such as Uptime, TIA-942, ANSI/BICSI-002, BITKOM, ASHRAE, and EN 50600. These standards provide design guidelines and best practices for compliance; this paper refers to Tier 4/Class F4/DC Category D, which denotes the highest system reliability in data center classification. This paper investigated the power distribution systems (PDS) and power quality of Tier 4 data centers and presented a zone protection approach for each stage. The most important zone of protection, called "zonal survivability", contains the critical IT load applications or equipment.
The zone protection approach can be categorized into 5 zones: Zone 0, Utilities 2N+2N protection; Zone I, Generators 2N+2N protection; Zone II, UPSs 2(N+1) protection; Zone III, Dual Power Paths 2N; Zone IV, Load shedding protection. Power quality is improved by isolation transformers and bus bars, which reduce current total harmonic distortion (THDi) and voltage harmonic distortion (THDv) in the power distribution system (PDS) of the data center.

REFERENCES
[1] IEEE Std 493, Design of Reliable Industrial and Commercial Power Systems, Institute of Electrical and Electronics Engineering, Inc., 2007.
[2] K. David, N. Yoshinao, and M. Maria, AT&T Reliability Manual, Van Nostrand Reinhold, New York, 1990.
[3] K. Dimitri, Reliability Engineering Handbook, Volume 2, Prentice Hall, Englewood, New Jersey, 1991.
[4] ISO Std. 8528-1, Reciprocating Internal Combustion Engine Driven Alternating Current Generating Sets, 2005.
[5] IEEE Std 446, Recommended Practice for Emergency and Standby Power Systems for Industrial and Commercial Applications, IEEE, 2000.
[6] W. Pitt Turner, J. Seader, and V. Renaud, Data Center Site Infrastructure Tier Standard: Topology, Uptime Institute, LLC., 2014.
[7] ANSI/BICSI-002, Data Center Design and Implementation Best Practices, BICSI, 2014.
[8] BITKOM, Reliable Data Centers Guideline, German Association for Information Technology Telecommunications and New Media e.V., November 2006.
[9] ASHRAE, Design Considerations for Datacom Equipment Centers, Second Edition, ASHRAE, 2009.
[10] BITKOM, Reliable Data Centers Guide, German Association for Information Technology Telecommunications and New Media e.V., December 2013.
[11] TIA-942, Telecommunications Infrastructure Standard for Data Centers, The Telecommunications Industry Association (TIA), March 2014.
[12] EN 50600-2-4, Information Technology. Data center facilities and infrastructures. Telecommunications cabling infrastructure, bsi., April 2015.
[13] M. Wiboonrat, "Data center design of optimal reliable systems," IEEE International Conference on Quality and Reliability (ICQR), 2011, pp. 350-354.
[14] M. Wiboonrat, "An empirical study on data center system failure diagnosis," The Third International Conference on Internet Monitoring and Protection (ICIMP), 2008a, pp. 103-108.
[15] M. Wiboonrat, "Risk anatomy of data center power distribution systems," IEEE International Conference on Sustainable Energy Technologies (ICSET), 2008b, pp. 674-679.
[16] IEEE Std 519, Recommended Practice and Requirements for Harmonic Control in Electric Power Systems, 2014.
[17] T. Hovenaars, K. LeDoux, and M. Colosino, "Interpreting IEEE Std 519 and meeting its harmonic limits in VFD applications," IEEE Industrial Application Society 50th Annual Petroleum and Chemical Industrial Conference, 2003, pp. 145-150.
[18] H. Zubi and S. Khalifa, "Power Quality Investigation of a Typical Telecom Data Center Electrical Network," International Conference on Control Engineering & Information Technology, 2016.