RSS: A Relay Based Schedule Scheme For Optical Data Center Network
https://doi.org/10.1007/s11107-019-00869-5
ORIGINAL PAPER
Abstract
The ever-increasing communication requirements have led to the introduction of optical circuit switching (OCS) in data centers, which can provide more flexible bandwidth allocation than electrical packet switching (EPS). However, commercially available MEMS-based optical circuit switches still incur a non-negligible reconfiguration delay, even though they provide high bandwidth at low per-bit cost and power. Additionally, existing scheduling schemes amortize the long switching delay by reconfiguring the OCS only every few hundred milliseconds, which not only degrades latency but also leads to low utilization of optical links. In this paper, we propose an OCS-based scheduling scheme called the relay-based schedule scheme (RSS), which leverages idle optical paths to forward traffic from remote nodes and requires only straightforward software modifications on the controller. We evaluate the proposed scheme with the OMNeT++ simulator, and the results demonstrate that it delivers significant benefits, reducing the average flow completion time (FCT) of mice flows by up to 80% and the end-to-end packet delay by 50% compared to non-RSS schemes.
Photonic Network Communications
expands. Reconfigurable topology technology, which uses OCS to dynamically change link capacities and allows the network topology to be changed to match the interconnection demands of each application, has attracted widespread attention [13, 14].

Even though OCS is capable of delivering significant benefits, it inevitably incurs a non-negligible reconfiguration penalty [15, 16], which can be up to tens of milliseconds in a MEMS-based OCS network. In contrast, the processing time of EPS is only several nanoseconds. Therefore, hybrid optical/electrical architectures [2, 3, 7, 17] are emerging as an alternative to EPS-only networks, where EPS handles dynamically changing traffic and OCS handles static and long-lived traffic. However, network downtime due to reconfiguration still occurs whenever the interconnect state changes [18], during which optical links are incapable of transmitting data. Night and Day denote the time interval in which the network is being reconfigured and the time interval in which it can transmit data, respectively. To amortize the inevitably long reconfiguration delay, existing approaches generally set Day to a few hundred milliseconds or even several seconds (far longer than Night), thereby lengthening the reconfiguration cycle and diminishing the impact of reconfiguration delays [19]. Nevertheless, most packets can arrive at their destination ToRs through optical links within short slots. In that case, a longer Day means that more optical resources are wasted, since many optical links are occupied in vain for the rest of the Day.

Aiming to improve the utilization of optical links in MEMS-based OCS networks, a relay-based schedule scheme (RSS) is proposed in this paper. RSS leverages idle optical links to forward traffic through a relay ToR, alleviating the negative effects of long reconfiguration cycles in OCS networks. In order to obtain the link state, we propose an accompanying advertisement scheme. Furthermore, we also propose a PFC-like scheme as an enhancement of RSS to avoid bufferbloat.

The rest of this paper is organized as follows: In Sect. 2, we discuss existing reconfigurable networks and explore the problems of existing scheduling schemes. In Sect. 3, we present the details of RSS and provide a modified scheme to address the bufferbloat issue. Section 4 evaluates the performance of RSS through extensive simulations. Finally, we conclude this paper in Sect. 5.

2 Related works

Taking Helios as an example, the centralized controller supervises and periodically collects traffic demands between different racks and then allocates optical links to the ToR pairs with high communication demands. However, the traffic demand information periodically collected by the controller is based on the instantaneous buffer queue lengths of the ToRs, leading to a mismatch between the real-time traffic demands and the current configuration. Besides, existing methods set the reconfiguration cycle to dozens of times the reconfiguration delay, so that the interval between reconfigurations can amortize the time spent on demand collection, configuration calculation and OCS switching [19].

Assuming the reconfiguration delay is 0.1 ms and the reconfiguration cycle is ten times that, we simulate the above scheduling scheme in the OMNeT++ simulator and record the throughput during each cycle. As observed in Fig. 1, once high-speed optical links are set up, most packets finish transmitting within a short time slot at the beginning of Day, while only a few packets keep transmitting through optical links in the remainder of the reconfiguration cycle. Under such circumstances, optical resources are wasted, since many optical links are occupied in vain while many packets in ToR switches still cannot be allocated optical links until the next cycle, resulting in unacceptable performance degradation in both throughput and latency.

Another kind of strategy is to replace MEMS-based switches with faster optical devices, as in RotorNet [6] and Mordia [4]. Those proposals are no longer subject to slow reconfiguration speed. However, they either rely on costly custom-designed OCSes, which are hard to deploy in current data center networks, or are restricted to a small network scale. Therefore, we aim to reduce the impact of reconfiguration through straightforward software modifications only, guaranteeing latency performance as well as reducing the average FCTs.
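As a worked illustration of the amortization argument, the arithmetic for the simulated setting (0.1 ms reconfiguration delay, cycle ten times as long) can be written out as below. This is only a sketch: it assumes that the reconfiguration cycle includes the Night interval, and the symbols T_Night, T_Day, T_cycle, t_busy and U are introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
T_{\text{Night}} &= 0.1\ \text{ms}, \qquad
T_{\text{cycle}} = 10\,T_{\text{Night}} = 1\ \text{ms},\\
T_{\text{Day}} &= T_{\text{cycle}} - T_{\text{Night}} = 0.9\ \text{ms},\\
\frac{T_{\text{Day}}}{T_{\text{cycle}}} &= \frac{0.9\ \text{ms}}{1\ \text{ms}} = 90\%,\\
U &= \frac{t_{\text{busy}}}{T_{\text{cycle}}} \le \frac{T_{\text{Day}}}{T_{\text{cycle}}}.
\end{align*}
```

Here t_busy is the (hypothetical) time within one cycle during which an optical link actually carries traffic. Under this assumption a link is available for 90% of the cycle, but if its traffic drains early in the Day, then t_busy is much smaller than T_Day and the link utilization U falls well below that bound.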
3 Relay-based schedule scheme

In this section, we present a relay-based schedule scheme (RSS) for OCS-based DCNs, which can be easily implemented in hybrid electrical/optical architectures. RSS provides better connectivity of the optical network without increasing the number of MEMS switches, as it offers each ToR the opportunity to communicate with multiple indirectly connected ToRs. In our design, RSS is also combined with an idle link notification scheme as well as a bufferbloat avoidance scheme.

3.1 Overview

We address the issues discussed in Sect. 2 from two aspects: (1) the measurement and collection of traffic demands; (2) the OCS configuration.

With respect to the measurement and collection of traffic demands, there have been considerable efforts in developing efficient traffic scheduling schemes, which leverage centralized controllers to perform optical link allocation according to the periodically collected traffic information. However, the traffic changes dynamically over time, so there is a delay between traffic collection and the real-time traffic, which significantly degrades the efficiency of scheduling schemes.

Based on the collected traffic demands, the controller determines an OCS configuration that provides optical links between input ToRs and output ToRs in every reconfiguration cycle. All of the OCS switches reconfigure synchronously, during which no packet can be transmitted through the optical links, resulting in long queuing delays and degrading network latency. Besides, a long and constant Day interval in each reconfiguration cycle leads to low efficiency of optical links since traffic cannot saturate the whole Day, while a variable Day interval would inevitably require a sophisticated configuration scheme with high computational complexity.

Therefore, RSS still adopts a constant Day interval in each reconfiguration cycle, but adds a relay mechanism to leverage the optical links that are idle during the Day, which exploits the “wasted” bandwidth and improves network latency. When informed that optical links are idle, a source ToR can communicate with a destination ToR to which it is not directly connected by first transmitting to a relay ToR that is directly connected to the destination ToR.

Figure 2 shows an OCS-based network adopting RSS and how relaying works. After reconfiguration finishes, ToR 1 is connected to ToR 3 and ToR 3 is connected to ToR 2 through optical links during this reconfiguration cycle. Under existing schemes, once the data from ToR 1 to ToR 3 or the data from ToR 3 to ToR 2 drains out, the optical links between them remain idle for the rest of this Day. And even if ToR 1 needs to communicate with ToR 2, it
Fig. 2 ToR 1 transmits packets to ToR 2 through relay ToR 3 when optical links are idle
can only wait for optical link allocation in the next cycle. However, with RSS, ToR 1 can send data to ToR 2 via relay ToR 3 once the traffic between ToR 1 and ToR 3 finishes transmitting, rather than waiting for the allocation of optical links in the next Day.

Although two-hop forwarding may seem to waste bandwidth, the trade-off is acceptable considering that, without RSS, the second-hop bandwidth would go unexploited anyway for the remaining time of the Day. Therefore, relaying actually reuses the optical resources and provides connections for mice flows and bursty communications.

3.2 Relay advertisement

Controller Relay Advertisement Process
Input: The relay ToR R
       The destination ToR D
       The source ToR S
begin
  for (ToR S in all ToRs)
    if (ToR S has optical link to ToR R)
      if (ToR R is ToR D)
        /* no need for relaying */
      else
        Notify ToR S to send traffic to ToR D via ToR R
      endif
    endif
  endfor
end
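To make the Controller Relay Advertisement Process concrete, a minimal Python sketch is given here. It is an illustration under stated assumptions rather than the authors' implementation: the function name relay_advertisements, the representation of the current configuration as a set of directed (upstream, downstream) ToR pairs, and the returned list of (source, destination, relay) notifications are all hypothetical choices made for this example.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: a directed optical link (upstream ToR, downstream ToR).
Link = Tuple[int, int]


def relay_advertisements(
    all_tors: Iterable[int],
    optical_links: Set[Link],
    relay: int,
    destination: int,
) -> List[Tuple[int, int, int]]:
    """Return (source, destination, relay) notifications for one relay ToR R
    and one destination ToR D, following the pseudocode's control flow."""
    notifications = []
    for source in all_tors:
        if (source, relay) not in optical_links:
            continue  # ToR S has no optical link to ToR R in this Day
        if relay == destination:
            continue  # no need for relaying
        # Notify ToR S to send traffic to ToR D via ToR R.
        notifications.append((source, destination, relay))
    return notifications


# Configuration from the Fig. 2 example: ToR 1 -> ToR 3 and ToR 3 -> ToR 2.
links = {(1, 3), (3, 2)}
notes = relay_advertisements([1, 2, 3], links, relay=3, destination=2)
print(notes)  # [(1, 2, 3)]
```

In the Fig. 2 configuration, only ToR 1 has an optical link to the relay ToR 3, so it is the only ToR told that it can reach ToR 2 via ToR 3. How the controller learns that a link is idle, and how the notification is delivered to the ToR, are outside the scope of this sketch.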
3.4 Bufferbloat
4 Evaluation
4.1 Connectivity
5 Conclusion
5. Chen, K., Singla, A., Singh, A., Ramachandran, K., Xu, L., Zhang, Y., Wen, X., Chen, Y.: OSA: an optical switching architecture for data center networks with unprecedented flexibility. IEEE/ACM Trans. Netw. 22(2), 498–511 (2014)
6. Mellette, W.M., McGuinness, R., Roy, A., Forencich, A., Papen, G., Snoeren, A.C., Porter, G.: RotorNet: a scalable, low-complexity, optical datacenter network. In: Proceedings of ACM SIGCOMM (2017)
7. Christodoulopoulos, K., Lugones, D., Katrinis, K., Ruffini, M., O’Mahony, D.: Performance evaluation of a hybrid optical/electrical interconnect. IEEE/OSA J. Opt. Commun. Network. 7(3), 193–204 (2015)
8. Ghobadi, M., Mahajan, R., Phanishayee, A., Devanur, N., Kulkarni, J., Ranade, G., Blanche, P.A., Rastegarfar, H., Glick, M., Kilper, D.: ProjecToR: agile reconfigurable data center interconnect. In: Proceedings of ACM SIGCOMM (2016)
9. Muhammad, I., Martin, C., Pascal, L., Kostas, K.: Performance evaluation of hybrid optical switch architecture for data center networks. Opt. Switch. Network. 21(C), 1–15 (2016)
10. Proietti, R., Yin, Y., Yu, R., et al.: Scalable optical interconnect architecture using AWGR-based TONAK LION switch with limited number of wavelengths. J. Lightwave Technol. 31(24), 4087–4097 (2013)
11. Mukherjee, B.: WDM optical communication networks: progress and challenges. IEEE J. Sel. Areas Commun. 18(10), 1810–1824 (2000)
12. Kachris, C., Tomkos, I.: A survey on optical interconnects for data centers. IEEE Commun. Surv. Tutor. 14(4), 1021–1036 (2012)
13. Liu, H., Lu, F., Forencich, A., et al.: Circuit switching under the radar with REACToR. In: Usenix Conference on Networked Systems Design & Implementation. USENIX Association (2014)
14. Zhao, Y., et al.: Dynamic topology management in optical data center networks. J. Lightwave Technol. 33(19), 4050–4062 (2015)
15. Liu, H., Mukerjee, M.K., Li, C., et al.: Scheduling techniques for hybrid circuit/packet networks. In: ACM Conference. ACM (2015)
16. Wang, C.H., Javidi, T., Porter, G.: End-to-end scheduling for all-optical data centers. In: Computer Communications. IEEE (2015)
17. Raffaelli, C., et al.: Evaluation of packet scheduling in hybrid optical/electrical switch. Photon Netw. Commun. 23(1), 92–108 (2012)
18. Cao, Z., Kodialam, M., Lakshman, T.V.: Joint static and dynamic traffic scheduling in data center networks. IEEE/ACM Trans. Network. 24(3), 1908–1918 (2016)
19. Porter, G., Strong, R., Farrington, N., et al.: Integrating microsecond circuit switching into the data center. In: Proceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM. ACM (2013)
20. Kandula, S., Sengupta, S., Greenberg, A.G., et al.: The nature of data center traffic: measurements and analysis. In: ACM SIGCOMM Conference on Internet Measurement Conference. ACM (2009)
21. Roy, A., Zeng, H., Bagga, J., et al.: Inside the social network’s (datacenter) network. ACM SIGCOMM Comput. Commun. Rev. 45(5), 123–137 (2015)

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Shangqi Ma received the B.E. degree in Computer and Communications from Lanzhou University of Technology, in 2017. She is currently working toward the M.E. degree in telecommunication and information systems in the State key lab of ISN, Xidian University. Her research interests include optical interconnection and data center.

Huaxi Gu received his Ph.D. degree in telecommunication and information system from Xidian University in 2005. He is currently a professor in the state key lab of ISN, Xidian University. His current research interests include interconnection networks, networks-on-chip, optical interconnect and data center networks. He has more than 100 publications in many international journals and conferences.

Hao Lan received the B.E. degree in Communication Engineering from Xidian University, in 2015. He received his M.E. degree in telecommunication and information system from Xidian University in 2018. He is currently working toward the Ph.D. degree in University of Toronto.

Xiaoshan Yu received the M.E. degree in Electronics and Communications Engineering from Xidian University in 2013 and the Ph.D. degree in Telecommunication and Information System from Xidian University in 2016. Now he is doing postdoctoral programme in the State key lab of ISN, Xidian University. His main research interests include optical interconnected networks and data center networks.

Kun Wang received the B.E. degree and M.E. degree in Computer Science and Technology from Xidian University, in 2003 and 2006, respectively. Now she is a lecturer in the Department of Computer Science, Xidian University. Her current interests include high-performance computing and cloud computing and the network virtualization technology.