A survey study on Data Center Traffic Engineering

Cloud Computing
• Applications consist of tasks
• Elastic resources
– Expand and contract resources
– Pay-per-use
– Infrastructure on demand
• Multi-tenancy
– Multiple independent users
– Security and resource isolation
– Amortize the cost of the (shared) infrastructure
• Flexible service management
– Resiliency: isolate failure of servers and storage
– Workload movement: move work to other locations
Data center networks
(A few slides adapted from V. Arun, College of Computer Science, University of Massachusetts Amherst)
• Tens to hundreds of thousands of hosts, often closely coupled, in close proximity:
– e-business (e.g., Amazon)
– content servers (e.g., YouTube, Akamai, Apple, Microsoft)
– search engines, data mining (e.g., Google)
• Challenges:
– multiple applications, each serving massive numbers of clients
– managing/balancing load, avoiding processing bottlenecks
(Photo: inside a 40-ft Microsoft container)
(Figure: Tier-1 switches at the top, Tier-2 switches below, TOR switches above the server racks, and server racks numbered 1–8; hosts A, B, and C are marked)
Broad questions
• How are massive numbers of commodity
machines networked inside a data center?
• Virtualization: How to effectively manage
physical machine resources across client
virtual machines?
• Operational costs:
– Server equipment
– Power and cooling
Source: NRDC research paper
Top-of-Rack Architecture
• Rack of servers

Data Center Network Topology
(Figure: the Internet reached through core routers (CR))
Fat-Tree topology for DCs
• Fat-Tree: a special type of Clos network (after C. Clos)
• K-ary fat tree: three-layer topology (edge, aggregation, and core)
– each pod consists of (k/2)² servers & 2 layers of k/2 k-port switches
– each edge switch connects to k/2 servers & k/2 aggregation switches
– each aggregation switch connects to k/2 edge & k/2 core switches
– (k/2)² core switches: each connects to k pods
(Figure: fat-tree with K=4)

PortLand topology
• Introduces "pseudo MAC addresses" for servers to balance the pros and cons of flat vs. topology-dependent addressing; the pseudo MAC encodes fields such as pod and position
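The switch and server counts above can be checked with a short sketch (function and parameter names are mine, not from the slides):

```python
def fat_tree_sizes(k):
    """Switch and server counts for a k-ary fat-tree built from k-port switches."""
    assert k % 2 == 0, "k must be even"
    pods = k
    edge_per_pod = k // 2           # each edge switch: k/2 servers, k/2 aggr links
    aggr_per_pod = k // 2           # each aggr switch: k/2 edge, k/2 core links
    servers_per_pod = (k // 2) ** 2
    core = (k // 2) ** 2            # each core switch connects to all k pods
    return {
        "pods": pods,
        "edge": pods * edge_per_pod,
        "aggr": pods * aggr_per_pod,
        "core": core,
        "servers": pods * servers_per_pod,  # = k**3 / 4
    }

print(fat_tree_sizes(4))
# K=4, as in the figure: 4 pods, 8 edge, 8 aggr, 4 core, 16 servers
```

With 48-port switches this yields 27,648 servers, which is why fat-trees built from commodity switches scale so well.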
(Figure: three-tier topology — an Int layer at the top, an Aggr layer of K aggr switches with D ports each, TOR switches below, and 20 servers per TOR; total servers = 20·(DK/4))

D (# of 10G ports)   Max DC size (# of servers)
48                   11,520
96                   46,080
144                  103,680
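The table rows follow from the 20·(DK/4) formula and are consistent with K = D aggr switches; a quick check (names mine):

```python
def max_servers(d, k, servers_per_tor=20):
    """Total servers when K aggr switches with D ports feed D*K/4 TORs,
    each TOR hosting 20 servers."""
    return servers_per_tor * (d * k) // 4

# Reproduce the table, assuming K = D:
for d in (48, 96, 144):
    print(d, max_servers(d, d))
# 48 -> 11520, 96 -> 46080, 144 -> 103680
```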
Data-Center Routing
(Figure: the Internet reached via core routers (CR) at DC-Layer 3, access routers (AR) below them, and Ethernet switches (S) at DC-Layer 2 feeding the racks)
• ~1,000 servers/pod == IP subnet
Key:
• CR = Core Router (L3)
• AR = Access Router (L3)
• S = Ethernet Switch (L2)
• A = Rack of app. servers

Load Balancers
• Spread load over server replicas
– Present a single public address (VIP) for a service
– Direct each request to a server replica
(Figure: Virtual IP (VIP) 192.121.10.1 fronting replicas 10.10.10.1, 10.10.10.2, and 10.10.10.3)
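The "direct each request to a server replica" step can be sketched as a minimal round-robin VIP mapper (the addresses are those in the figure; the class name and round-robin policy are illustrative — real load balancers also use hashing, least-connections, etc.):

```python
from itertools import cycle

class VipLoadBalancer:
    """Present one public VIP; rotate incoming requests across private replicas."""

    def __init__(self, vip, replicas):
        self.vip = vip
        self._next = cycle(replicas)  # endless round-robin iterator

    def direct(self, request_id):
        """Map a request arriving at the VIP to the next replica in turn."""
        replica = next(self._next)
        return (request_id, self.vip, replica)

lb = VipLoadBalancer("192.121.10.1",
                     ["10.10.10.1", "10.10.10.2", "10.10.10.3"])
for i in range(4):
    print(lb.direct(i))  # requests 0..3 go to .1, .2, .3, then wrap to .1
```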
DNS-based site selection
(Figure: a DNS server, the Internet, two proxies, and two client groups; DNS answers steer each client to a site)
Data Center Traffic Engineering: Challenges and Opportunities

Traffic Engineering Challenges
• Scale
– Many switches, hosts, and virtual machines
• Churn
– Large number of component failures
– Virtual Machine (VM) migration
• Traffic characteristics
– High traffic volume and dense traffic matrix
– Volatile, unpredictable traffic patterns
• Performance requirements
– Delay-sensitive applications
– Resource isolation between tenants
Improving Data Center Network Utilization Using Near-Optimal Traffic Engineering [1]
• [1] constructs a routing algorithm that provides path diversity over non-uniform link weights (i.e., unequal-cost links); achieving both simplicity in path discovery and optimality in minimizing the maximum link utilization (MLU) is nontrivial.
• [1] implemented and evaluated the Penalizing Exponential Flow-spliTting (PEFT) algorithm in a cloud DC environment based on two dominant topologies, canonical tree and fat tree.

ECMP [1 contd.]
• Most current DC networks employ equal-cost multipath (ECMP) forwarding to leverage the path diversity provided by topological redundancy, splitting traffic across multiple paths by hashing packets' headers.
• Per-flow ECMP is link-load and flow agnostic, so the resulting hash collisions can hinder load balancing among links.
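The per-flow hashing that ECMP performs can be sketched as follows (the 5-tuple fields and path count are illustrative; real switches use hardware hash functions, not SHA-256):

```python
import hashlib

def ecmp_path(src_ip, dst_ip, src_port, dst_port, proto, n_paths):
    """Pick a next-hop index by hashing the flow's 5-tuple.

    Every packet of one flow hashes to the same index, so two large flows
    that collide on an index share a link regardless of that link's load —
    the load-agnostic behavior criticized above.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

# The same flow always maps to the same one of 4 equal-cost paths:
p1 = ecmp_path("10.0.0.1", "10.0.1.1", 40000, 80, 6, 4)
p2 = ecmp_path("10.0.0.1", "10.0.1.1", 40000, 80, 6, 4)
assert p1 == p2 and 0 <= p1 < 4
```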
OpenFlow based Load Balancing for Fat-Tree Networks with Multipath Support [3]
• [3] presents a load balancer for OpenFlow-based fat-tree data center networks and implements a dynamic routing algorithm in the load balancer.
• [3] proposes a dynamic load balancer (DLB) and compares it with the no load balancing (NLB) algorithm and the static load balancing (SLB) algorithm.
• One characteristic of routing in a fat-tree network is that once a flow reaches the highest layer it accesses, the path from that switch to the destination host is deterministic. Only upward traffic is handled by the DLB algorithm; the downward traffic is automatically determined by the highest-layer switch and the destination.
• Evaluation uses the Beacon OpenFlow controller with the Mininet emulator.

Hedera: Dynamic Flow Scheduling for Data Center Networks [4]
• Hedera is a dynamic flow scheduling system that adaptively schedules a multi-stage switching fabric to efficiently utilize aggregate network resources.
• Dynamic flow demand estimation: the demand estimator performs repeated iterations of increasing the flow capacities from the sources and decreasing exceeded capacity at the receivers until the flow capacities converge.
• Scheduler algorithms:
– Global First Fit: when a new large flow is detected (e.g., one using 10% of the host's link capacity), the scheduler linearly searches all possible paths to find one whose link components can all accommodate that flow; forwarding entries are then installed.
– Simulated Annealing: performs a probabilistic search to efficiently compute paths for flows. The key insight of this approach is to assign a single core switch for each destination host rather than a core switch for each flow, which significantly reduces the search space.
• The implementation uses NetFPGA OpenFlow switches, which have two hardware tables: a 32-entry TCAM (that accepts variable-length prefixes) and a 32K-entry SRAM that accepts only flow entries with fully qualified 10-tuples.
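The increase-at-sources / cap-at-receivers loop of Hedera's demand estimator can be sketched roughly as below. This is a simplification under stated assumptions: all host links normalized to capacity 1.0, and an oversubscribed receiver splits its capacity equally, whereas the real estimator treats already-limited senders more carefully.

```python
def estimate_demands(flows, hosts, iterations=100):
    """Simplified sketch of Hedera-style flow demand estimation.

    flows: list of (src, dst) pairs; hosts: list of host ids.
    Returns estimated demand per flow index, as a fraction of link capacity.
    """
    demand = {i: 0.0 for i in range(len(flows))}
    converged = {i: False for i in range(len(flows))}
    for _ in range(iterations):
        # Source phase: each sender splits its spare capacity equally
        # among its not-yet-converged flows (demand increase).
        for h in hosts:
            out = [i for i, (s, _) in enumerate(flows) if s == h]
            fixed = sum(demand[i] for i in out if converged[i])
            free = [i for i in out if not converged[i]]
            if free:
                share = (1.0 - fixed) / len(free)
                for i in free:
                    demand[i] = share
        # Receiver phase: an oversubscribed receiver caps its incoming
        # flows at an equal share and marks them converged.
        for h in hosts:
            inn = [i for i, (_, d) in enumerate(flows) if d == h]
            if inn and sum(demand[i] for i in inn) > 1.0:
                share = 1.0 / len(inn)
                for i in inn:
                    demand[i] = share
                    converged[i] = True
    return demand

# Two senders into one receiver: each converges to half the receiver link.
print(estimate_demands([(0, 2), (1, 2)], [0, 1, 2]))  # ~0.5 each
```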
• Valiant Load Balancing (VLB) essentially guarantees equal-spread load balancing in a mesh network by bouncing individual packets from a source switch in the mesh off of randomly chosen intermediate "core" switches, which then forward those packets to their destination switch.
• ECMP
• DevoFlow [5] is a modification of the OpenFlow model that gently breaks the coupling between control and global visibility, in a way that maintains a useful amount of visibility without imposing unnecessary costs. It focuses on handling most micro-flows in the data plane.
• [6] proposes LABERIO, a novel path-switching algorithm that addresses a shortcoming of OpenFlow deployments that only find a static routing path during the initialization step; the static path often suffers from poor performance since the network configuration may change during data transmission.
• [7] discusses the Aster*x controller, built on top of NOX, which is a distributed load balancer for large clouds.
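The VLB bounce described in the first bullet amounts to a two-hop routing decision; a minimal sketch (switch names and counts are illustrative):

```python
import random

def vlb_path(src, dst, core_switches, rng):
    """Valiant Load Balancing: route via a uniformly random core switch."""
    via = rng.choice(core_switches)
    return [src, via, dst]

rng = random.Random(0)  # seeded for repeatability
cores = ["c0", "c1", "c2", "c3"]

# Over many packets the random bounce spreads load across all cores,
# independent of the traffic matrix:
loads = {c: 0 for c in cores}
for _ in range(1000):
    loads[vlb_path("s1", "s2", cores, rng)[1]] += 1
print(loads)
```

The per-packet randomization is what gives VLB its traffic-matrix-oblivious guarantee, at the cost of longer (two-hop) paths and possible packet reordering.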
Ongoing Research

Research Questions
• What topology to use in data centers?
– Reducing wiring complexity
– Achieving high bisection bandwidth
– Exploiting capabilities of optics and wireless
• Routing architecture?
– Flat layer-2 network vs. hybrid switch/router
– Flat vs. hierarchical addressing
• How to perform traffic engineering?
– Over-engineering vs. adapting to load
– Server selection, VM placement, or optimizing routing
• Virtualization of NICs, servers, switches, …
Recent DC architectures (topologies)
(Figure: fat-tree with K=4)
(Figure: Facebook 4-post)
(Figure: Cisco virtualized fabric DC design)
(Figure: Amazon EC2 high availability design)
References
1. "Improving Data Center Network Utilization Using Near-Optimal Traffic Engineering", Fung Po Tso, Dimitrios P. Pezaros, IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 6, June 2013.
2. "Intra-Data-Center Traffic Engineering with Ensemble Routing", Ziyu Shao, Xin Jin, Wenjie Jiang, Minghua Chen, Mung Chiang, Proceedings of IEEE INFOCOM 2013.
3. "OpenFlow based Load Balancing for Fat-Tree Networks with Multipath Support", Yu Li, Deng Pan, IEEE ICC 2013.
4. "Hedera: Dynamic Flow Scheduling for Data Center Networks", Mohammad Al-Fares, Sivasankar Radhakrishnan, Barath Raghavan, Nelson Huang, Amin Vahdat, NSDI '10: Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation.
5. "DevoFlow: Scaling Flow Management for High-Performance Networks", Andrew R. Curtis, Jeffrey C. Mogul, Jean Tourrilhes, Praveen Yalagandula, Puneet Sharma, Sujata Banerjee, Proceedings of ACM SIGCOMM 2011, pp. 254-265.
6. "LABERIO: Dynamic load-balanced routing in OpenFlow-enabled networks", Hui Long, Yao Shen, Minyi Guo, Feilong Tang, 2013 IEEE 27th International Conference on Advanced Information Networking and Applications.
7. "Aster*x: Load-Balancing as a Network Primitive", Nikhil Handigol, Mario Flajslik, Srini Seetharaman, Ramesh Johari, Nick McKeown, 9th GENI Engineering Conference, 2010.
8. "HyperX: Topology, Routing, and Packaging of Efficient Large-Scale Networks", Jung Ho Ahn, Nathan Binkert, Al Davis, Moray McLaren, Robert S. Schreiber, HP Laboratories, HPL-2009-184.