ACI Multi-Site Architecture and Deployment - BRKACI-2125
BRKACI-2125
BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
Session Objectives
Agenda
• ACI Network and Policy Domain Evolution
ACI Network and
Policy Domain
Evolution
Cisco ACI: Industry Leader
Ecosystem Partners
Introducing Application Centric Infrastructure (ACI)
Web App DB
APIC
Application Policy
Infrastructure Controller
ACI Fabric
Integrated GBP VXLAN Overlay
ACI Anywhere
Fabric and Policy Domain Evolution
MP-BGP - EVPN
• ACI 1.0 - Leaf/Spine Single Pod Fabric
• ACI Multi-Pod Fabric
• ACI 3.0 – Multiple Availability Zones (Fabrics) in a Single Region ‘and’ Multi-Region Policy Management
• ACI Remote Leaf
• ACI 4.1 & 4.2 – ACI Extensions to Multi-Cloud
IPN
Pod ‘A’ Pod ‘n’
MP-BGP - EVPN
…
APIC Cluster
BRKACI-2387: Design for a distributed Data Center environment with Cisco ACI Remote Leaf
BRKACI-2690: How to extend your ACI fabric to Public Clouds (AWS and Azure)
Multi-Pod or Multi-Site?
And the answer is…
BOTH!
Regions and Availability Zones
OpenStack and AWS Definitions
OpenStack
• Regions - Each Region has its own full OpenStack
deployment, including its own API endpoints, networks
and compute resources
• Availability Zones - Inside a Region, compute nodes can
be logically grouped into Availability Zones. When launching
a new VM instance, we can specify an AZ or even a specific
node in an AZ to run the VM instance
• Pod – A Leaf/Spine network sharing a common control plane (ISIS, BGP, COOP,
…)
▪ Pod == Availability Zone
• Multi-Fabric – Multiple APIC Clusters + associated Pods (you can have Multi-Pod
with Multi-Fabric)*
▪ Multi-Fabric == Multi-Site == a DC infrastructure with multiple regions
Fabric Change/Fault Domain Fabric Change/Fault Domain Fabric Change/Fault Domain Fabric Change/Fault Domain
Typical Requirement
Creation of Two Independent Fabrics/AZs
Application
workloads deployed
across regions
Typical Requirement
Creation of Two Independent Fabrics/AZs
MP-BGP - EVPN
…
Up to 50 msec RTT
APIC Cluster
IS-IS, COOP, MP-BGP IS-IS, COOP, MP-BGP
• Multiple ACI Pods connected by an IP Inter-Pod L3 network, each Pod consists of leaf and spine nodes
• Up to 50 msec RTT supported between Pods
• Managed by a single APIC Cluster
• Single Management and Policy Domain
• Forwarding control plane (IS-IS, COOP) fault isolation
• Data Plane VXLAN encapsulation between Pods
• End-to-end policy enforcement
Single AZ with Maintenance and Configuration Zones
Scoping ‘Network Device’ Changes
ACI Multi-Pod
Fabric
APIC Cluster
Single AZ with Tenant Isolation
Isolation for ‘Virtual Network Zone and Application’ Changes
Inter-Pod
Network
ACI Multi-Pod
Fabric
APIC Cluster
• The ACI ‘Tenant’ construct provides a domain of application and associated virtual network policy change
• Domain of operational change for an application (e.g. production vs. test)
ACI Multi-Pod
Most Common Use Cases
ACI Multi-Pod
Supported Topologies
Intra-DC Site Two DC sites directly connected
1G/10G/40G/100G
10G/40G/100G 10G*/40G/100G
Pod 1 Pod n 10G*/40G/100G 10G*/40G/100G
Pod 1 Dark fiber/DWDM Pod 2
(up to 50** msec RTT)
…
Pod 3
** 50 msec support added in SW release 2.3(1)
ACI Multi-Site
Deep Dive
Overview and Use Cases
ACI Multi-Site VXLAN Data Plane
Overview Inter-Site
Network
MP-BGP - EVPN
Multi-Site
Orchestrator
Site 1 Site 2
REST
GUI
API
• Separate ACI Fabrics with independent APIC clusters
• No latency limitation between Fabrics
• ACI Multi-Site Orchestrator pushes cross-fabric configuration to multiple APIC clusters providing scoping of all configuration changes
• MP-BGP EVPN control plane between sites
• Data Plane VXLAN encapsulation across sites
• End-to-end policy definition and enforcement
ACI Multi-Site
Most Common Use Cases
ACI Multi-Site
Software and Hardware Requirements
• Support all ACI leaf switches (1st Generation, -EX and -FX)
• Only -EX spine (or newer) to connect to the Inter-Site Network (ISN)
• Can have only a subset of spines connecting to the IP network
ACI Multi-Site
Network and Identity Extended between Fabrics
MP-BGP - EVPN
Multi-Site
Orchestrator
ACI Multi-Site
Namespace Normalisation
Translation of Class-ID, VNID (scoping of name spaces)
[Diagram: EP1 (EPG at Site 1) communicates with EP2 (EPG at Site 2) across the Inter-Site Network. Packet format: VNID | Class-ID | Tenant Packet. The Spine Translation Table holds Remote Site and Local Site columns; values shown on the slide are VNID 16678781 and Class-IDs 49153 and 32770.]
Leaf to Leaf VTEP, Class-ID is local to the Fabric
• Maintain separate name spaces with ID translation performed on the spine nodes
• Requires specific HW on the spine to support this functionality
• Multi-Site Orchestrator instructs local APIC to program translation tables on spines
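As a rough illustration of the translation step described above, the sketch below models a spine translation table as a lookup from remote-site identifiers to local-site identifiers. It is a minimal sketch, not the actual hardware or APIC data model: the `Ids` type, the table layout, the site name and the pairing of values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ids:
    vnid: int      # VXLAN network identifier
    class_id: int  # EPG class identifier carried in the VXLAN header


# Hypothetical table: (remote site, remote IDs) -> IDs valid in the local site.
# The numbers are placeholders for illustration only.
TRANSLATION_TABLE = {
    ("site1", Ids(vnid=16678781, class_id=49153)): Ids(vnid=16547722, class_id=32770),
}


def translate(remote_site, remote_ids):
    """Return the local IDs for an inbound inter-site packet, or None if no entry exists."""
    return TRANSLATION_TABLE.get((remote_site, remote_ids))


local = translate("site1", Ids(16678781, 49153))
unknown = translate("site1", Ids(1, 2))
```

In this model a missing entry returns `None`; how the real spine treats a packet without a translation entry is not stated in the source.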
ACI Multi-Site
Inter-Site Policies and Spines’ Translation Tables
entries are populated only in two cases: VNID: 16678781 VNID: 16457896 VNID: 16547722 VNID: 15434256
Class-ID: 49153 Class-ID: 31564 Class-ID: 32770 Class-ID: 36784
ACI Multi-Site (MSO 2.0(2) Release)
Removing Policy Enforcement: Preferred Groups
▪ Free communication between the EPGs in the Multi-Site Preferred Group (Web, App, DB)
▪ Contract required to communicate with EPG(s) external to the Preferred Group (Non-PG EPG, contracts C1 and C2)
Removing Policy Enforcement
Preferred Groups for E-W and N-S Flows
▪ Adding internal EPGs and External EPGs (associated to L3Outs) to the Preferred Group enables free east-west and north-south connectivity
▪ When adding the Ext-EPG to the Preferred Group:
• Can’t use 0.0.0.0/0 for classification, needs more specific prefixes
• As a workaround it is possible to use 0.0.0.0/1 and 128.0.0.0/1 to achieve the same result
• Must ensure Ext-EPG is a stretched object on MSO
[Diagram: Site 1 and Site 2 connected through the Inter-Site Network, each with a local L3Out; EP1 (EPG1), EP2 (EPG2) and the Ext-EPG belong to the Multi-Site Preferred Group]
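The 0.0.0.0/1 plus 128.0.0.0/1 workaround can be checked with Python's standard `ipaddress` module. The sketch below only verifies the address arithmetic (the two /1 prefixes together cover exactly the same space as 0.0.0.0/0); it says nothing about how ACI classifies traffic. The helper name `covered` is hypothetical.

```python
import ipaddress

default = ipaddress.ip_network("0.0.0.0/0")
halves = [ipaddress.ip_network("0.0.0.0/1"), ipaddress.ip_network("128.0.0.0/1")]

# The two /1 prefixes are exactly the two subnets obtained by splitting 0.0.0.0/0.
split_matches = list(default.subnets(prefixlen_diff=1)) == halves

# Together they hold as many addresses as 0.0.0.0/0, and they do not overlap.
same_size = sum(n.num_addresses for n in halves) == default.num_addresses
disjoint = not halves[0].overlaps(halves[1])


def covered(addr):
    """True if the address falls inside one of the two /1 prefixes."""
    ip = ipaddress.ip_address(addr)
    return any(ip in n for n in halves)
```

Any IPv4 address therefore matches one of the two halves, for example `covered("10.1.1.1")` and `covered("203.0.113.7")` both hold.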
ACI Multi-Site (MSO 2.2(4) Release)
vzAny Support (MSO 2.2(4) Release)
▪ Multiple EPGs part of a specific VRF1 consume the services provided by a shared EPG (part of VRF1 or of a VRF-shared)
▪ VRF-shared can be part of the same tenant or of a different tenant
▪ vzAny provides and consumes a contract with an associated “Permit-any” filter
▪ Use ACI fabric only for network connectivity without policy enforcement
▪ Equivalent to “VRF unenforced”
ACI Multi-Site and vzAny (MSO 2.2(4) Release)
Many-to-One Communication (Shared Services)
Inter Site
Network
Site1 Site2
Ext-EPG Ext-EPG
L3Out-Site1 L3Out-Site2
Shared-Resource
• Proper translation entries are created on the spines of both fabrics to enable
east-west communication
• Supported also for Shared Services behind an L3Out
ACI Multi-Site and vzAny (MSO 2.2(4) Release)
Enable Inter-Site Free Communication Inside a VRF
Inter Site
Network
Site1 Site2
Ext-EPG Ext-EPG
L3Out-Site1 L3Out-Site2
EPG1 EPG2
• Proper translation entries are created on the spines of both fabrics to enable
east-west communication
• Supported also for connecting to the external Layer 3 domain
ACI Multi-Site (ACI 3.2(1) Release)
Spines in Separate Sites Connected Back-to-Back
Multi-Site
Orchestrator
• Back-to-back connections only supported between 2 sites from ACI 3.2 release
• Support for full mesh and ‘square’ topologies
• Support for more than 2 sites scoped for a future ACI release
▪ Current restriction is that a site cannot be ‘transit’ for communication between other sites
ACI Multi-Site
CloudSec Encryption for VXLAN Traffic
Encrypted Fabric to Fabric Traffic [GCM-AES-256-XPN (64-bit PN)]
CloudSec = “TEP-to-TEP MACSec”
VTEP Information
in Clear Text
Inter-Site Network
MP-BGP - EVPN
Multi-Site
Orchestrator
Supported from ACI 4.0(1) release for FX line cards and 9332C/9364C platforms
ACI Multi-Site
Per Bridge Domain Behavior
MSO GUI
(BD)
ACI Multi-Site
L3 Only across Sites
1. Layer 3 only across sites
▪ Bridge Domains and subnets not extended across Sites
▪ Layer 3 Intra-VRF or Inter-VRF communication (shared services across VRFs/Tenants)
2. IP Mobility without BUM flooding
▪ Same IP subnet defined in separate Sites
▪ Support for IP Mobility (‘cold’ and ‘live’* VM migration) and intra-subnet communication across sites
▪ No Layer 2 BUM flooding across sites
3. Layer 2 adjacency across Sites
▪ Interconnecting separate sites for fault containment and scalability reasons
▪ Layer 2 domains stretched across Sites, support for application clustering
▪ Layer 2 BUM flooding across sites
* ‘Live’ migration officially supported from ACI release 3.2
ACI Multi-Site
Scalability Values Supported in ACI 4.2(x)/MSO 2.2(x) Releases
Scale Parameter Stretched Objects
Sites 12
Tenants 400
ACI Multi-Site
Continuous Scale Improvements
ACI Release 3.0 ACI Release 3.1 ACI Release 3.2 ACI Release 4.2
Introducing the ACI
Multi-Site Orchestrator
ACI Multi-Site
Multi-Site Orchestrator (MSO)
• Three MSO nodes are clustered and run concurrently (active/active)
▪ Typical database redundancy considerations (minority/majority rules)
▪ Up to 150 msec RTT latency supported between MSO nodes
• OOB Mgmt connectivity to the APIC clusters deployed in separate sites
[Diagram: ACI Multi-Site Orchestrator Cluster with MSO Node 1, MSO Node 2 and MSO Node 3 (150 msec RTT max between nodes), accessed via GUI and REST API]
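The "minority/majority rules" and the RTT limit for the three-node cluster can be sketched as two small checks. This is only an illustration under the assumption that the cluster needs a strict majority of nodes to stay writable, which is a common rule for clustered databases; the source does not spell out the exact rule, and both function names are hypothetical.

```python
def has_quorum(reachable_nodes, cluster_size=3):
    """True when the reachable nodes form a strict majority of the cluster (assumed rule)."""
    return reachable_nodes > cluster_size // 2


def within_rtt_budget(rtt_msec, max_rtt_msec=150):
    """True when the measured RTT between two MSO nodes is within the supported limit."""
    return rtt_msec <= max_rtt_msec
```

Under this assumption a three-node cluster tolerates the loss of one node: two reachable nodes still form a majority, a single isolated node does not.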
ACI Multi-Site Orchestrator
VM Based MSO Cluster
Site 1 Site 2
….. Site n
• VMware ESXi 6.0 or later
• Minimum of eight virtual CPUs (vCPUs), 24 GB of
memory, and 100 GB of disk space
ACI Multi-Site Orchestrator (MSO 2.2(3) Release)
Cisco Application Service Engine (CASE) Based MSO Cluster
• CASE cluster available in different form factors (physical, on-prem VM, AWS instance)
• MSO is installed as an App on the CASE cluster
• Recommended MSO cluster deployment option going forward
1. MSO App on CASE physical form factor
2. MSO App on CASE VM form factor (on premises)
3. MSO App on CASE VM form factor (on AWS)
[Diagram: in each option MSO Node1, MSO Node2 and MSO Node3 run on CASE (physical hosts, or CASE VMs on ESXi/KVM) and reach Site 1, Site 2 … Site n over an IP network]
ACI Multi-Site
MSO Dashboard
ACI Multi-Site
MP-BGP/EVPN Infra Configuration
ACI Multi-Site
UCSD and Ansible Integration
For more information on setting up ACI Multi-Site via Ansible: BRKACI-2291
APIC vs. Multi-Site Orchestrator Functions
APIC
• Central point of management and configuration for the Fabric
• Responsible for all Fabric local functions
▪ Fabric discovery and bring up
▪ Fabric access policies
▪ Domains creation (VMM, Physical, etc.)
▪ …
• Maintains runtime data (VTEP address, VNID, Class_ID, GIPo, etc.)
• No participation in the fabric control and data planes
Multi-Site Orchestrator
• Complementary to APIC
• Provisioning and managing of “Inter-Site Tenant and Networking Policies”
• Scope of changes
▪ Granularly propagate policies to multiple APIC clusters
• Can import tenant configuration from APIC cluster domains
• End-to-end visibility and troubleshooting
• No run time data, configuration repository
• No participation in the fabric control and data planes
Multi-Site Orchestrator
Deployment Considerations
ACI Multi-Site
MSO Deployment Considerations
Intra-DC Deployment
• MSO nodes can be connected directly to the DC OOB network
• Each MSO node has a unique routable IP (can be part of separate IP subnets)
• Async calls from MSO to APIC
Interconnecting DCs over WAN
• Up to 150 msec RTT latency supported between MSO nodes
• Higher latency (500 msec to 1 sec RTT) between MSO nodes and managed APIC clusters
• If possible deploy MSO nodes in separate sites for availability purposes (network partition scenarios)
[Diagram: Site1 (Milan), Site2 (Rome) and Site3 (New York) interconnected through an IP Network / WAN]
ACI Multi-Site
MSO and APIC Release Dependency (Pre-MSO 2.2(1) Release)
ACI Multi-Site Interversion (MSO 2.2(1) Release)
Decoupling MSO and APIC Releases
MSO 2.2(1)
ACI Multi-Site Interversion
Decoupling MSO and APIC Releases
SW dependency for different ACI functionalities
1 2
How to Define Schemas,
Templates and their Mappings
to ACI Sites?
ACI Multi-Site
MSO Schema and Templates
▪ Template = ACI policy definition (ANP, EPGs, BDs, VRFs, etc.)
▪ Schema = container of Templates sharing a common use-case
• As an example, a schema can be dedicated to a Tenant
▪ The template is currently the atomic unit of change for policies
• Such policies are concurrently pushed to one or more sites
▪ Scope of change: policies in different templates can be pushed to separate sites at different times
[Diagram: a Schema containing the stretched Tenant1 Template, pushed to Site 1 and Site 2 where it becomes the effective policy]
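The schema/template relationship can be pictured with a small data model. This is a minimal sketch of the concepts on this slide, not the MSO object model or its REST API: the class names, field names and the `push` method are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    name: str
    policies: list                           # e.g. names of ANPs, EPGs, BDs, VRFs
    sites: set = field(default_factory=set)  # sites the template is associated to


@dataclass
class Schema:
    name: str
    templates: dict = field(default_factory=dict)

    def add(self, template):
        self.templates[template.name] = template

    def push(self, template_name):
        """Return {site: policies} for one template; the template is pushed as a unit."""
        t = self.templates[template_name]
        return {site: list(t.policies) for site in sorted(t.sites)}


schema = Schema("Tenant1-schema")
schema.add(Template("stretched", ["VRF1", "BD1", "EPG-Web"], {"site1", "site2"}))
schema.add(Template("site1-only", ["BD2", "EPG-App"], {"site1"}))
pushed = schema.push("stretched")
```

Pushing `"stretched"` delivers the same policy list to both sites at once, while `"site1-only"` can be pushed separately and at a different time, which is the scope-of-change point made above.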
ACI Multi-Site
Schema and Templates Definition for the DR Use Case
1. Single Template for both sites
▪ Single Template associated to Prod and DR Sites
▪ Any change applied to the template is pushed to both sites simultaneously
▪ Easiest way to keep consistent policies deployed across sites
2. Separate Templates
▪ Separate Template associated to Prod and DR Sites (can use cloning)
▪ Changes made to a template can be applied only to the mapped site
▪ Requires sync between the two templates (manual or performed by a higher-level Orchestration tool)
3. Future
▪ Single Template associated to Prod and DR Sites
▪ Capability to independently apply changes to each site
▪ Brings together the advantages of the previous two options
How to Define the Policies
inside a Template for a Given
Tenant?
Schema Design
One Template per Site, plus a ‘Stretched’ Template
Schema Site 1
ANP1 VRF
BD7 C1 C2
EPG7
Contracts
▪ All objects defined inside the schema are visible and can be referenced via the
drop-down list
• This is not the case for objects referenced across schemas → for those it is required to type at least 3 letters of
their names for them to be displayed, and then create the references
▪ Current support limited to 5 templates per schema
• With four sites you could have a template per site and one stretched template (would not scale to support
other combinations)
▪ Be aware of the maximum object limit in the same schema
• Every object that can be defined in a template counts (EPGs, BDs, VRFs, Contracts, etc.)
• 500 objects per schema supported up to MSO 2.2(2), 1000 objects per schema supported from
MSO 2.2(3)
▪ Note: increasing both the number of templates and number of objects in a schema is
planned for a future ACI release
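The limits listed above lend themselves to a simple pre-check before a schema grows too large. The sketch below encodes only the numbers from this slide (5 templates per schema; 500 objects up to MSO 2.2(2), 1000 from MSO 2.2(3)). The function names and the dictionary layout are assumptions for illustration, not an MSO feature.

```python
MAX_TEMPLATES_PER_SCHEMA = 5


def max_objects_per_schema(mso_version):
    """mso_version is a (major, minor, patch) tuple, e.g. (2, 2, 3) for MSO 2.2(3)."""
    return 1000 if mso_version >= (2, 2, 3) else 500


def check_schema(templates, mso_version):
    """templates maps template name -> list of object names. Returns a list of problems."""
    problems = []
    if len(templates) > MAX_TEMPLATES_PER_SCHEMA:
        problems.append("too many templates")
    total = sum(len(objs) for objs in templates.values())
    if total > max_objects_per_schema(mso_version):
        problems.append("too many objects")
    return problems


# Five sites with one template each plus a stretched template exceeds the limit.
five_sites = {f"site{i}": ["EPG"] for i in range(1, 6)}
five_sites["stretched"] = ["VRF"]
result = check_schema(five_sites, (2, 2, 3))
```

With four sites the same layout yields exactly five templates and passes the template check, matching the example in the bullet above.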
ACI Multi-Site Orchestrator
Defining Policies in a Template
[Diagram: existing fabrics Site 1 and Site 2, with the numbered steps below]
1. Import existing tenant policies from site 1 and site 2 to new common and site-specific templates on ACI MSO
2a. Associate the common template to both sites (for stretched objects)
2b. Associate site-specific templates to each site
3. Push the policies back to the ACI sites
▪ Diff/merge operations on policies from different APIC domains are not allowed
▪ It is still possible to import policies for the same tenant from different APIC domains, under the assumption those are not conflicting
• Tenant defined with the same Name
• Name and policies for stretched objects are also common
Inter-Site Connectivity
Deployment Considerations
ACI Multi-Site
Inter-Site Network (ISN) Requirements
Inter-Site Network
MP-BGP - EVPN
Multi-Site
Orchestrator
ACI Multi-Site and MTU
Tuning MTU for EVPN Traffic across Sites
Configurable MTU
ISN
MP-BGP - EVPN
Multi-Site
Orchestrator
ACI Multi-Site and QoS
Intra-Site QoS Behavior
• ACI Fabric supports six classes of service
• Traffic is classified only in the ingress leaf
▪ The CoS value in the iVXLAN packet is set based on this table
• Three user configurable classes of service for user data traffic
▪ Level3 is the default class (CoS value 0)
• Three reserved classes of service for control traffic and SPAN
▪ APIC controller traffic
▪ Control traffic for traffic destined to the supervisor
▪ SPAN traffic
▪ Traceroute traffic
• Each class is configured at the fabric level and mapped to a hardware queue

Class of Service/QoS-group   Traffic Type              Dot1p Marking in VXLAN Header
0                            Level3 user data          0
1                            Level2 user data          1
2                            Level1 user data          2
3                            APIC controller traffic   3
4                            SPAN traffic              4
5                            Control Traffic           5
5                            Traceroute                6

Note: 3 additional user classes have been added in ACI 4.0(1) release
ACI Multi-Site and QoS
Inter-Site QoS Behavior
• Traffic across sites should be consistently prioritized (as it happens intra-site)
• To achieve this end-to-end consistent behavior it is required to perform DSCP-to-
CoS mapping on the spines (Spines-to-ISN and ISN-to-Spines)
• Important: must ensure that no traffic is received by the spines from the IPN with the DSCP
marking associated to Traceroute (spines do not forward this traffic toward the leaf nodes)
• The traffic can then be properly treated inside the ISN (classification/queuing)
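The mapping performed on the spines can be sketched as a pair of lookups, one applied toward the ISN and the inverse applied on traffic received from it. The DSCP values below are placeholders picked for illustration (the source does not prescribe them, apart from CS5 appearing on the slide's diagram), and all names are hypothetical.

```python
# Hypothetical mapping: ACI traffic class -> DSCP value used inside the ISN.
# These values are operator choices, not values mandated by the source.
CLASS_TO_DSCP = {
    "level3": 0,       # default user class
    "level2": 10,      # AF11
    "level1": 18,      # AF21
    "apic": 24,        # CS3
    "span": 8,         # CS1
    "control": 40,     # CS5
    "traceroute": 48,  # CS6
}
DSCP_TO_CLASS = {dscp: cls for cls, dscp in CLASS_TO_DSCP.items()}


def egress_dscp(traffic_class):
    """Spine to ISN: outer DSCP set from the traffic class."""
    return CLASS_TO_DSCP[traffic_class]


def ingress_class(dscp):
    """ISN to spine: traffic class recovered from the outer DSCP, or None if unmapped."""
    return DSCP_TO_CLASS.get(dscp)


def isn_marking_is_safe(dscp_values_from_isn):
    """False if the ISN delivers traffic marked with the DSCP reserved for Traceroute."""
    return CLASS_TO_DSCP["traceroute"] not in dscp_values_from_isn
```

The last helper reflects the warning above: traffic arriving with the Traceroute marking is not forwarded toward the leaf nodes, so that value should not be reused for other traffic in the transit network.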
[Diagram: the spines of the sending site set the outer DSCP field based on the configured mapping; the ISN performs traffic classification and queuing (CS5 shown for the MP-BGP EVPN traffic); the spines of the receiving site set the iVXLAN CoS field based on the configured mapping]
Control Plane
Considerations
ACI Multi-Site
BGP Inter-Site Peers
• Spines connected to the Inter-Site Network perform two main functions:
1. Establishment of MP-BGP EVPN peerings with spines in remote sites
▪ One dedicated Control Plane address (EVPN-RID) is assigned to each spine running MP-BGP EVPN
2. Forwarding of inter-sites data-plane traffic
▪ Anycast Overlay Unicast TEP (O-UTEP): assigned to all the spines connected to the ISN and used to source and receive L2/L3 unicast traffic
▪ Anycast Overlay Multicast TEP (O-MTEP): assigned to all the spines connected to the ISN and used to receive L2 BUM traffic
[Diagram: four spines with EVPN-RID 1 to EVPN-RID 4 connected to the Inter-Site Network, sharing the Anycast VTEP Addresses O-UTEP & O-MTEP]
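The three address types can be summarized in a small model: one EVPN-RID per spine, plus one anycast O-UTEP and one anycast O-MTEP per site. The sketch below is illustrative only; the class, the helper names and the addresses (taken from documentation ranges) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    o_utep: str      # anycast, shared by all ISN-connected spines
    o_mtep: str      # anycast, shared by all ISN-connected spines
    evpn_rids: dict  # spine name -> dedicated control plane address

    def isn_prefixes(self):
        """Addresses of this site that remote sites need to reach."""
        return sorted({self.o_utep, self.o_mtep, *self.evpn_rids.values()})


def outer_header(local, remote, kind):
    """Return (source, destination) of the inter-site VXLAN outer header."""
    if kind == "unicast":
        return (local.o_utep, remote.o_utep)
    if kind == "bum":
        return (local.o_utep, remote.o_mtep)
    raise ValueError(f"unknown traffic kind: {kind}")


site_a = Site("A", "192.0.2.1", "192.0.2.2", {"S1": "192.0.2.11", "S2": "192.0.2.12"})
site_b = Site("B", "198.51.100.1", "198.51.100.2", {"S5": "198.51.100.11", "S6": "198.51.100.12"})
```

Unicast traffic from site A to site B is sourced from A's O-UTEP and sent to B's O-UTEP, while L2 BUM traffic keeps the same source and targets B's O-MTEP.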
ACI Multi-Site
Exchanging TEP Information across Sites
• OSPF peering between spines and Inter-Site network
• Mandates the use of L3 sub-interfaces (with VLAN 4 tag) between the spines and the ISN
• Filter out the advertisement of internal TEP pools into the ISN
IP Network Routing Table:
▪ O-UTEP A, O-MTEP A, EVPN-RID S1-S4
▪ O-UTEP B, O-MTEP B, EVPN-RID S5-S8
[Diagram: Inter-Sites unicast data plane, first part. EP1 (10.10.10.10, Site 1, spines S1-S4 with O-UTEP A) sends traffic to EP2 (20.20.20.20, Site 2, spines S5-S8 with O-UTEP B)]
1. EP1 sends traffic to EP2
2. EP2 unknown, traffic is encapsulated to the local Proxy A Spine VTEP (adding S_Class information)
3-4. VXLAN Inter-Site unicast traffic sourced from O-UTEP A and destined to O-UTEP B
5. Leaf learns remote Site location info for EP1
6. If policy allows it, EP2 receives the packet
Leaf tables: Site 1 leaf: EP1 → e1/3, * → Proxy A; Site 2 leaf: EP2 → e1/1, EP1 → O-UTEP A, * → Proxy B
ACI Multi-Site
Inter-Sites Unicast Data Plane (2)
[Diagram: return traffic from EP2 (Site 2, O-UTEP B) to EP1 (Site 1, O-UTEP A)]
7. EP2 sends traffic back to remote EP1
8. Leaf applies the policy and, if allowed, encapsulates traffic to remote O-UTEP address
9-10. VXLAN Inter-Site unicast traffic sourced from O-UTEP B and destined to O-UTEP A
11. Leaf learns remote Site location info for EP2
12. EP1 receives the packet
Policy information (EP1’s Class-ID) carried across Pods
Leaf tables: Site 1 leaf: EP1 → e1/3, EP2 → O-UTEP B, * → Proxy A; Site 2 leaf: EP1 → O-UTEP A, * → Proxy B
From this point EP1 to EP2 communication is encapsulated Leaf to Remote Spine O-UTEPs in both directions
[Diagram: Site 1 and Site 2 connected through the Inter-Site Network; flows between O-UTEP A and O-UTEP B (and vice versa)]
Leaf tables: Site 1 leaf: EP1 → e1/3, EP2 → O-UTEP B, * → Proxy A; Site 2 leaf: EP2 → e2/5, EP1 → O-UTEP A, * → Proxy B
ACI Multi-Site
Layer 2 BUM Traffic Data Plane
1. EP1 generates a BUM frame
2. BUM frame is associated to GIPo1 and flooded intra-site via the corresponding FTAG tree
3-4. Inter-Site BUM traffic sourced from O-UTEP A and destined to O-MTEP B
5. BUM frame is flooded along the tree associated to GIPo. VTEP learns VM1 remote location
6. EP2 receives the BUM frame
GIPo1 = Multicast Group associated to EP1’s BD
Tenant Multicast
Routing
Tenant Multicast Routing with Multi-Site (ACI 4.0(2) Release)
Deployment Considerations
• Supported from ACI 4.0(2) release only on 2nd Gen leaf switches (EX/FX and beyond)
• Intra-VRF only support, inter-VRF planned for a future ACI release
• Each BL node runs PIM in active mode and forms neighborship with other BLs in
the same site and with the external router(s)
• Supports sources attached to the fabric and external sources (reachable via local L3Out)
• BDs with L3 Multicast sources or receivers may or may not be stretched across the sites
• External sources must be reachable independently from each site via local L3Outs
• Supports receivers attached to the fabric and external receivers (reachable via local or
remote L3Out)
• BDs with L3 Multicast sources or receivers may or may not be stretched across the sites
Multicast
Forwarding
Tenant Multicast Routing with Multi-Site (ACI 4.0(2) Release)
Forwarding Behavior
• Each defined VRF gets assigned a dedicated underlay multicast group (VRF GIPo)
• Tenant Multicast traffic is forwarded within a site using the VRF GIPo tree and
delivered to all leaf nodes where the VRF is deployed (whether there are connected
receivers or not)
• Multicast is dropped at the egress leaf in the case where there are no interested receivers
• Inter-site tenant Multicast traffic for a given VRF is forwarded using ingress
replication
• The traffic is replicated to all remote sites where the VRF is stretched, whether there are connected receivers
or not at the receiving site
• Multicast is dropped at the receiving spine if there are no interested receivers in that site
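The inter-site behavior described above (replicate to every remote site where the VRF is stretched, then drop at the receiving spine when nobody is interested) can be expressed as a short function. This is a sketch of the decision logic only, with hypothetical names; it does not model the actual HREP tunnels or spine hardware.

```python
def replicate_inter_site(vrf_sites, source_site, receivers_by_site):
    """Return (sites the stream is sent to, sites where it is actually delivered).

    vrf_sites: sites where the VRF is stretched.
    receivers_by_site: site -> number of interested receivers (missing means zero).
    """
    # Ingress replication: one copy per remote site where the VRF is stretched,
    # regardless of whether that site has receivers.
    sent_to = sorted(s for s in vrf_sites if s != source_site)
    # The receiving spine drops the stream if the site has no interested receivers.
    delivered_in = [s for s in sent_to if receivers_by_site.get(s, 0) > 0]
    return sent_to, delivered_in


sent, delivered = replicate_inter_site(
    {"site1", "site2", "site3"}, "site1", {"site2": 2}
)
```

In this example the stream crosses the ISN toward both `site2` and `site3`, but only `site2` forwards it further, which shows the bandwidth cost of replicating without regard to receivers.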
Tenant Multicast Routing with Multi-Site
L2 Multicast over Multi-Site (Supported since ACI 3.0)
• Stretched BDs with BUM Traffic Enabled (no PIM configuration required)
• Within a site the L2 multicast traffic is VXLAN encapsulated and sent to the BD GIPo multicast address → reaches all
the spines and the leaf nodes where the BD is defined (configuration driven)
• The spine elected as Designated Forwarder (DF) replicates the stream to each remote site where the BD is stretched
• At the receiving spine the multicast will be sent down the FTAG tree to the receiving site BD GIPo multicast address
[Diagram: a source in BD1 at Site 1 and a mix of receivers and non-receivers in BD1 at Site 1 and Site 2, connected through the Inter-Site Network]
▪ HREP tunnel destination: Site 2 O-MTEP
▪ Site 1: BD1 VNID → 16514962, BD1 GIPo → 225.0.195.240
▪ Site 2: BD1 VNID → 16711545, BD1 GIPo → 225.1.128.160
Tenant Multicast Routing with Multi-Site
L3 Multicast over Multi-Site (Source Inside the Fabric)
• Built as Routing-First Approach (decrement TTL at source and destination ACI leaf nodes)
• L3 Multicast is always sent to the VRF GIPo within a site (existing behavior)
• Between sites it is sent over the HREP tunnel to the O-MTEP address of the remote sites
where the VRF is stretched (the VXLAN header will include the source site VRF VNID)
• L3 Multicast at the receiving site will be sent in the VRF GIPo of the receiving site
[Diagram: Site 1 and Site 2 connected through the Inter-Site Network]
▪ Site1 VRF VNID → 2293762
▪ HREP tunnel dest: Site 2 O-MTEP
Tenant Multicast Routing with Multi-Site
L3 Multicast over Multi-Site (Source Outside the Fabric)
PIM Sparse Mode
Control and Data Planes
Tenant Multicast Routing with Multi-Site
External RP requirement
• RP must be external to the fabric
• All sites can point to the same RP address
Inter-Site
Network
Sites 1 and 2
using RP 1.1.1.1
L3Out-1 L3Out-2
Site 1 Site n
RP: 1.1.1.1
Tenant Multicast Routing with Multi-Site
External RP requirement
Site 2 using
RP 2.2.2.2
Site 1 using
RP 1.1.1.1
L3Out-1 L3Out-2
Site 1 Site n
MSDP
RP: 1.1.1.1 RP: 2.2.2.2
Source Inside, Receivers Inside
and Outside
Control and Data Planes
Tenant Multicast Routing with Multi-Site
Source Inside, Receivers Inside and Outside (Control Plane)
▪ A receiver is connected to a leaf node in a site and sends an IGMP Join for group G
▪ COOP is used between leaf and spine nodes to build a PIM shared tree (*,G) toward the external RP
▪ BL node sends PIM (*,G) Join toward the RP
Inter-Site Network
COOP
(*,G) state (*,G) state
BL11 BL21
IGMP Join
for G
L3Out-1 L3Out-2
Receiver
Site 1 PIM (*,G) Site 2
Join
(*,G) state
(*,G) state
PIM (*,G)
RP PIM (*,G)
Join
IGMP Join Join
for G
(*,G) state
Receiver
Tenant Multicast Routing with Multi-Site
Source Inside, Receivers Inside and Outside (Control Plane, cont.)
PIM
Register*
(S,G) state
BL11 BL21
MC traffic
to G L3Out-1 Advertise L3Out-2
Source S PIM PIM (S,G) source’s IP Receiver
Site 1 Register Join Subnet Site 2
Stop PIM (S,G)
Join (S,G) state
* PIM register packets are unicast packets (sent from first-hop router to the RP external to the fabric), with PIM protocol number (103) set in the IP header
(S,G) state
RP
Receiver
Tenant Multicast Routing with Multi-Site
Source Inside, Receivers Inside and Outside (Data Plane)
• When the RP receives the register from the source, it will forward multicast down the shared tree
• When the BL (BL21) installs (S,G), it sees that the source is part of a pervasive BD and sends a PIM prune towards the RP
Inter-Site Network
BL11 BL21
MC traffic
to G
L3Out-1 L3Out-2 Receiver
Source S
Site 1 Site 2
PIM Prune
Multicast data
RP PIM Prune
Receiver
Source Outside, Receivers
Inside
Control and Data Planes
Tenant Multicast Routing with Multi-Site
Source Outside, Receivers Inside (Control Plane)
▪ Receivers are connected to leaf nodes across sites and send IGMP Joins for group G
▪ PIM Shared tree with (*,G) state is built from the leaf nodes toward the external RP
Inter-Site Network
COOP COOP
(*,G) state (*,G) state
BL11 BL21
IGMP Join IGMP Join
for G for G
L3Out-1 L3Out-2
Receiver Receiver
Site 1 PIM (*,G) Site 2
PIM (*,G) Join
Join
(*,G) state
(*,G) state
(*,G) state
RP PIM (*,G)
Source S Join
Tenant Multicast Routing with Multi-Site
Source Outside, Receivers Inside (Data Plane)
▪ Each site must receive multicast sent from external sources via a local L3Out
L3Out-1 L3Out-2
Receiver Receiver
Site 1 Site 2
Multicast data
RP
Source S
MC traffic
to G
Tenant Multicast Routing with Multi-Site
Source Outside, Receivers Inside with Transit Case
▪ Transit multicast use case is supported. One site can be transit for an external source and that multicast flow can arrive at another site via the local L3Out. Multicast is not sent over the ISN in this case
Multicast traffic from external sources is dropped on the spines and not sent over HREP tunnels
Inter-Site Network
BL11 BL21
Multicast data
RP
Source S
PIM SSM
Control and Data Planes
Source Inside, Receivers Inside
and Outside
Control and Data Planes
PIM SSM Tenant Multicast Routing with Multi-Site
Source Inside, Receivers Inside and Outside (Control Plane)
▪ A receiver is connected to a leaf node in a site and sends an IGMPv3 Join for group (S,G)
▪ An (S,G) state is created on the leaf nodes where the endpoint is connected and on the BL node
Inter-Site
Network
COOP
(S,G) state (S,G) state (S,G) state
BL11 BL21
IGMPv3 Join
for (S,G)
L3Out-1 Advertise L3Out-2
Source S PIM (S,G) source’s IP Receiver
Site 1 Join Subnet Site 2
(S,G) state
IGMPv3 Join
for (S,G)
Receiver
PIM SSM Tenant Multicast Routing with Multi-Site
Source Inside, Receivers Inside and Outside (Control Plane)
▪ As soon as the multicast source starts sending traffic, it is encapsulated and sent to all the local leaf nodes and all the remote sites where the VRF is defined
▪ The traffic will hence be received by the remote receiver
Inter-Site
Network
(S,G) state
Receiver
PIM SSM Tenant Multicast Routing with Multi-Site
Source Inside on a Stretched BD, Receiver Outside (Control Plane)
(S,G) state
BL11 BL21
(S,G) state
• In scenarios where the source is part of a stretched BD, the RP may send the PIM (S,G) Join to either site
• Multicast flows may be sent to the external receivers through the L3Out of a remote site
Inter-Site
Network
(S,G) state
BL11 BL21
MC traffic
to G
L3Out-1 L3Out-2
Source S
Site 1 Site 2
Multicast data
(S,G) state
Receiver
Source Outside, Receiver Inside
Control and Data Planes
PIM SSM Tenant Multicast Routing with Multi-Site
Sources Outside, Receivers Inside (Control Plane)
▪ Receivers are connected to leaf nodes across
sites and send IGMPv3 Joins for group (S,G)
▪ PIM (S,G) Joins are sent from the BL nodes toward the external network to build (S,G) state up to the last router where the source is connected
Inter-Site Network
COOP COOP
(S,G) state (S,G) state (S,G) state (S,G) state
BL11 BL21
IGMPv3 Join IGMPv3 Join
for (S,G) for (S,G)
L3Out-1 L3Out-2
Receiver Receiver
Site 1 PIM (S,G) Site 2
PIM (S,G) Join
Join PIM (S,G)
Join
(S,G) state
(S,G) state
Source S
PIM SSM Tenant Multicast Routing with Multi-Site
Sources Outside, Receivers Inside (Data Plane)
▪ Each site must receive multicast sent from external sources via a local L3Out
BL11 BL21
L3Out-1 L3Out-2
Receiver Receiver
Site 1 Site 2
Multicast data
MC traffic to G
Source S
Connecting to the
External Layer 3
Domain
Connecting to the External Layer 3 Domain
‘Traditional’ L3Outs on the BL Nodes (Recommended Option)
Client
ISN ISN
WAN WAN
▪ BLs on each ACI site connect to a separate pair of WAN edge routers for communication with the WAN
▪ Most common deployment model for ACI fabrics geographically dispersed
▪ BLs of different sites connect to a common pair of WAN edge routers for communication with the WAN
▪ Typical deployment model when Multi-Site is used for scaling up the fabric in a single DC location
Connecting to the External Layer 3 Domain
‘GOLF’ L3Outs (VRF High Scale Use Cases)
= VXLAN Encap/Decap
Different WAN Hand-Off options: VRF-Lite, MPLS-VPN, LISP*
Client
WAN
OTV/VPLS
GOLF Routers (ASR 9000, ASR 1000, Nexus 7000)
• Connecting to WAN Edge devices at Spine nodes (directly or indirectly)
▪ VXLAN data plane with MP-BGP EVPN control plane
• High scale tenant L3Out support
• Automated configuration with OpFlex
• Support for host routes advertisement out of the ACI Fabric
▪ Enabled at the VRF level
• No support for L3 Multicast or Shared L3Out (every tenant VRF requires its own L3Out)
ACI Multi-Site and ‘GOLF’ L3Outs
Deployment Options
Distributed GOLF Routers
WAN, GOLF Routers, MP-BGP EVPN, ISN
▪ Each ACI site utilizes a separate pair of GOLF routers for communication with the WAN
▪ Local EVPN peering between spines and GOLF routers
▪ GOLF routers connect to the ISN or directly to the spines
Shared GOLF Routers (from ACI 3.1)
WAN, GOLF Routers, MP-BGP EVPN, ISN
▪ Common pair of GOLF routers shared by all sites for communication with the WAN
▪ GOLF routers can be connected to the ISN or directly to the spines
ACI Multi-Site and ‘GOLF’ L3Outs
Must Use Host-Route Advertisement for Stretched BDs with GOLF L3Outs
10.1.0.10/32 → G1, G2 Traffic destined
10.1.0.20/32 → G3, G4 10.1.0.0/24 → G1-G4
✓
to 10.1.0.20
G1 G2
WAN G3 G4 G1 G2
WAN
G3 G4
❌
ISN ISN
10.1.0.0/24 10.1.0.0/24
.10 .20 .10 .20
▪ Host-route advertisement into the WAN for stretched BDs
▪ Ensures that ingress traffic is always delivered to the ‘right site’
▪ Without host-route advertisement traffic destined to a stretched IP subnet may enter the ‘wrong site’
▪ A site can’t be used as ‘transit’ for traffic destined to an endpoint part of a stretched BD and remotely located (traffic is dropped on the receiving spines)
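The effect of host-route advertisement can be illustrated with a small longest-prefix-match sketch. The prefixes and GOLF router names (G1 to G4) are taken from the slide; the `best_route` helper and the dictionary-based routing table are illustrative assumptions, not a model of any real router implementation.

```python
import ipaddress

def best_route(table, destination):
    """Return the (prefix, next_hops) entry that longest-prefix-matches destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hops)
        for prefix, next_hops in table.items()
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    network, next_hops = max(matches, key=lambda m: m[0].prefixlen)
    return str(network), next_hops

# WAN view when host routes are advertised for the stretched BD
with_host_routes = {
    "10.1.0.10/32": ["G1", "G2"],  # endpoint .10 lives behind the first site
    "10.1.0.20/32": ["G3", "G4"],  # endpoint .20 lives behind the second site
}

# WAN view when only the stretched subnet is advertised by every GOLF router
subnet_only = {"10.1.0.0/24": ["G1", "G2", "G3", "G4"]}

print(best_route(with_host_routes, "10.1.0.20"))
print(best_route(subnet_only, "10.1.0.20"))
```

With host routes the WAN can only pick G3 or G4 for 10.1.0.20. With the subnet alone, any of the four routers is a valid next hop, so traffic may land on the site that does not host the endpoint and be dropped there.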
Solving Asymmetric
Routing Issues with the
External Network
Multi-Site and L3Out
Endpoints Normally Use Local L3Outs for Outbound Traffic
Inter-Site Network
Site 1 Site 2
Web-EPG C1 Ext-EPG
L3Out L3Out
Site 1 Site 2
10.10.10.10 IP Subnet 10.10.10.11
IP Subnet Active/Standby
10.10.10.0/24 Active/Standby
10.10.10.0/24
Traffic dropped
because of lack of
state in the FW
Solving Asymmetric Routing Issues
ACI 4.0(1) Release
Use of Host-Routes Advertisement
Inter-Site Network
Site 1 Site 2
Web-EPG C1 Ext-EPG
L3Out L3Out
Site 1 Site 2
10.10.10.10 10.10.10.11
Host routes 10.10.10.10/32 Host routes 10.10.10.11/32
Active/Standby Active/Standby
Host-routes injected into the WAN*
Enabled at the BD level
*Alternative could be running an overlay solution (LISP, GRE, etc.)
• Ingress optimization requires host-routes advertisement on the L3Out
▪ Native support on ACI Border Leaf nodes available from ACI release 4.0(1)
▪ Supported also on GOLF L3Outs (enabled at the VRF level)
Solving Asymmetric Routing Issues
ACI 4.2(1) Release
Use of Active/Standby FW Pair Deployed across Sites
Inter-Site Network
Site 1 Site 2
Web-EPG C1 Ext-EPG
L3Out L3Out
Site 1 Site 2
10.10.10.10 10.10.10.11
Active Standby
IP Subnet
10.10.10.0/24
• Inbound and outbound flows are forced through the site with the active perimeter FW node
• Common scenario in a Multi-Pod deployment, less desirable with Multi-Site
• Requires Intersite L3Out support (ACI release 4.2(1))
Intersite L3Out
ACI 4.2(1)/MSO 2.2(1)
Problem Statement
Problem Statement
Behavior before ACI Release 4.2(1)
Supported Design
✓ Not Supported Design
❌
Inter-Site Network Inter-Site Network
X
Note: the same consideration applies to both Border Leaf L3Outs and GOLF L3Outs
Solution Overview and
Supported Use Cases
ACI Multi-Site and L3Out
ACI 4.2(1) Release
Support of Intersite L3Out
WAN, Mainframes,
WAN WAN
WAN, Mainframes, WAN, Mainframes,
FW/SLB, etc… FW/SLB, etc… FW/SLB, etc…
Control Plane and Data Plane
Considerations
ACI Multi-Site and Intersite L3Out
Introduction of a Routable TEP Pool
Inter-Site
Network
Site 1 Site 2
Address taken from the new routable TEP pool
Next-Hop to reach P1 is BL TEP
BL RTEP
L3Out
Site 1
External prefix P1 EP
WAN, Mainframes,
FW/SLB, etc…
• The BL TEP is normally taken from the original TEP pool assigned during the fabric bring-up
procedure
• Since we don’t want to assume that the original TEP pool can be reached across the ISN, a
separate routable TEP pool is introduced to support intersite L3Out
• The routable TEP pool can be directly configured on the Multi-Site Orchestrator
ACI Multi-Site and Intersite L3Out
Control Plane
Inter-Site Network
Red EPG C Ext EPG
Site 1 Site 2
RR RR MP-BGP VPNv4/VPNv6 RR RR
• Next-Hop to reach P1 is BL routable TEP (RTEP)
• Info to rewrite remote L3Out’s VRF L3VNI
BL RTEP
L3Out
Site 1
External prefix P1 EP
WAN, Mainframes,
VPNv4 FW/SLB, etc…
• External prefix advertisements received via the L3Out are redistributed to the leaf nodes in the local site
via MP-BGP VPNv4/VPNv6 through the RRs in the spines (normal ACI intra-fabric behavior)
• MP-BGP VPNv4 advertisements are also used to distribute this information to the remote sites (in
addition to the EVPN Type-2 advertisements for endpoint information)
• The prefixes are then redistributed inside the remote sites via VPNv4/VPNv6 by the RR spines
• The next-hop VTEP for the prefixes is the BL routable TEP (RTEP) that received the routes from the external network
• Associated with the prefix information is the information needed to rewrite the VRF L3VNI value to match the one in the remote site
ACI Multi-Site and Intersite L3Out
No VNID/S-Class Translations on Receiving Spines
No VNID/S-Class translations applied
Inter-Site Network
Site 1 Site 2
Rewrite of remote L3Out’s VRF VNID information
BL RTEP
L3Out
Site 1
External prefix P1 EP
WAN, Mainframes,
FW/SLB, etc…
• Unlike regular inter-site (east-west) communication between endpoints, the spines on the receiving site are not involved in VNID/Class-ID translations for communication to external prefixes
• BGP programs the rewrite of the VNIDs of remote-site routes directly on the ToRs
ACI Multi-Site and Intersite L3Out
Data Plane (EndPoint to L3Out)
Site 1 Site 2
Decapsulate traffic and perform L3 lookup in the right VRF
Insert remote VRF L3VNI value in the VXLAN header
• VXLAN tunnel is established directly between the leaf in site 2 and the BL in site 1
• The spines in the source site still translate the SIP to be the site’s O-UTEP
• The spines in the destination site simply route the VXLAN packet toward the destination BL node
• The BL node uses the L3VNI info in the VXLAN header to perform the lookup in the right VRF
• Remote endpoint learning always disabled on the BL nodes to avoid learning the wrong sClass info (as
there is no translation happening on the receiving spines)
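The endpoint-to-L3Out steps listed above can be sketched as a short pipeline. This is a minimal illustration under stated assumptions: the class names, function names, IP addresses and the L3VNI value are all hypothetical and only mirror the roles described in the bullets (leaf in site 2, source-site spine, BL in site 1).

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class VxlanPacket:
    outer_src: str   # source TEP in the outer IP header
    outer_dst: str   # destination TEP in the outer IP header
    vnid: int        # VNID carried in the VXLAN header
    inner_dst: str   # destination IP of the original packet

@dataclass(frozen=True)
class RemoteRoute:
    prefix: str
    next_hop_rtep: str   # routable TEP (RTEP) of the BL in the remote site
    remote_l3vni: int    # L3VNI of the VRF as known in the remote site

def leaf_encapsulate(route, leaf_tep, inner_dst):
    """Leaf in site 2: tunnel directly to the remote BL using the remote VRF L3VNI."""
    return VxlanPacket(leaf_tep, route.next_hop_rtep, route.remote_l3vni, inner_dst)

def source_spine_forward(pkt, site_o_utep):
    """Spine in the source site: rewrite the outer source IP to the site's O-UTEP."""
    return replace(pkt, outer_src=site_o_utep)

def border_leaf_vrf(pkt, vrf_by_l3vni):
    """BL in site 1: the L3VNI in the VXLAN header selects the VRF for the lookup."""
    return vrf_by_l3vni[pkt.vnid]

# Hypothetical values for external prefix P1 learned through the L3Out in site 1
route = RemoteRoute("192.0.2.0/24", "172.16.100.1", 2654208)
pkt = leaf_encapsulate(route, "10.2.0.64", "192.0.2.10")
pkt = source_spine_forward(pkt, "172.16.200.200")
vrf = border_leaf_vrf(pkt, {2654208: "VRF1"})
print(pkt, vrf)
```

The spines of the destination site do not appear in the sketch because, per the bullets above, they only route the packet toward the BL without translating anything.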
ACI Multi-Site and Intersite L3Out
Data Plane (L3Out to EndPoint)
Spine performs the lookup in the COOP database and encapsulates traffic toward site 2 O-UTEP
Spine performs the lookup in the COOP EP database and encapsulates traffic toward the local leaf
Inter-Site Network
Site 1 Site 2
O-UTEP
Incoming traffic is always sent to the proxy spine (since remote EP learning is disabled)
Decapsulate traffic and perform L3 lookup in the right VRF
• Traffic received from the WAN hits the BL node and is always sent to the spine proxy
• The local spine performs the lookup for the destination endpoint and encapsulates to the O-UTEP of the destination site (after changing the SIP in the VXLAN header to match the local O-UTEP)
• The receiving spine performs the lookup in the COOP DB and S-Class/VNID translations (as in regular Multi-Site data-plane) → the traffic is encapsulated to the local leaf
• The local leaf decapsulates the packet, performs the L3 lookup, applies the policy and sends traffic to the endpoint (if allowed)
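The translation step on the receiving spine can be sketched as a pair of table lookups. This is a rough illustration only: the table layout, the use of the source site's O-UTEP as the lookup key, and every numeric value are assumptions made for the example, not documented ACI data structures.

```python
# Hypothetical translation tables on the receiving (site 2) spine, keyed by the
# source site's O-UTEP as seen in the outer source IP of the VXLAN packet.
VNID_MAP = {("O-UTEP-S1", 2097154): 2392068}
SCLASS_MAP = {("O-UTEP-S1", 49153): 32771}

def receiving_spine(pkt, coop_db):
    """Translate VNID and S-Class into local values, then forward to the local leaf."""
    src_site = pkt["outer_src"]
    translated = dict(pkt)
    translated["vnid"] = VNID_MAP[(src_site, pkt["vnid"])]
    translated["sclass"] = SCLASS_MAP[(src_site, pkt["sclass"])]
    # The COOP lookup returns the local leaf where the endpoint is connected
    translated["outer_dst"] = coop_db[pkt["inner_dst"]]
    return translated

# Packet as sent by the site 1 spine toward the O-UTEP of site 2
incoming = {
    "outer_src": "O-UTEP-S1",
    "outer_dst": "O-UTEP-S2",
    "vnid": 2097154,
    "sclass": 49153,
    "inner_dst": "10.10.10.11",
}
result = receiving_spine(incoming, {"10.10.10.11": "Leaf 1"})
print(result)
```

Keying the tables on the source O-UTEP reflects why the sending spine rewrites the outer source IP: the receiving spine needs to know which site's namespace the VNID and S-Class values belong to.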
Deployment Considerations
ACI Multi-Site and Intersite L3Out
Deployment Considerations
• Before ACI release 4.2(1), the outbound and inbound traffic flows always take a deterministic path
• For BDs that are only locally defined in a site (i.e. not stretched), outbound communication is only possible via local L3Outs
• Inbound communication is also only possible via the local L3Out, as it is not possible to advertise the BD subnet(s) out of a remote L3Out connection
• For stretched BDs, the option of enabling host-based routing advertisement has been made available from ACI release 4.0(1) to ensure inbound flows always take an optimal path
• The enablement of the Inter-site L3Out functionality may change this behavior, so it is important to keep in mind some specific deployment considerations discussed in the following slide
ACI Multi-Site and Intersite L3Out
Stretched Ext-EPG - Outbound Flows
Inter-Site Network
Ext-EPG Ext-EPG X
L3Out Site 2
Ext-EPG
L3Out Site 1 L3Out-MF
BD-Red
172.16.1.0/24 received with longest AS-Path
172.16.1.0/24 received with shortest AS-Path
Mainframe
WAN
172.16.1.1
• When Intersite L3Out is enabled to allow, for example, communication with a mainframe in the remote site, the external prefixes received in that site from all the deployed L3Outs will be propagated to the other sites
• This may cause outbound traffic flows destined to external network to be sent via the remote L3Out connection
• The traffic will then be dropped because of the absence of a contract with the associated Ext-EPG
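The failure scenario above can be sketched in a few lines. The example deliberately reduces BGP best-path selection to AS-path length only, which is a simplification; the AS numbers, the `Ext-EPG-MF` name and the contract set are hypothetical values chosen for illustration.

```python
def select_path(paths):
    """Simplified best-path: prefer the shortest AS-path (real BGP compares more attributes)."""
    return min(paths, key=lambda p: len(p["as_path"]))

def is_permitted(src_epg, path, contracts):
    """Traffic is allowed only if a contract exists between the EPG and the path's Ext-EPG."""
    return (src_epg, path["ext_epg"]) in contracts

# 172.16.1.0/24 is learned through both L3Outs once Intersite L3Out is enabled
paths = [
    {"l3out": "L3Out Site 1", "ext_epg": "Ext-EPG", "as_path": [65001, 65002, 65003]},
    {"l3out": "L3Out-MF", "ext_epg": "Ext-EPG-MF", "as_path": [65010]},
]

# The EPG in BD-Red only has a contract with the Ext-EPG of the local L3Out
contracts = {("EPG-Red", "Ext-EPG")}

best = select_path(paths)
print(best["l3out"], "permitted" if is_permitted("EPG-Red", best, contracts) else "dropped")
```

Running the sketch selects `L3Out-MF` because of its shorter AS-path, and the flow is reported as dropped since no contract ties the source EPG to the Ext-EPG of that L3Out.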
ACI Multi-Site and Intersite L3Out
Stretched Ext-EPG - Inbound Flows
Guidelines and Restrictions
ACI Multi-Site and Intersite L3Out
Current Restrictions
ACI Multi-Site and Intersite L3Out
Integration with Remote Leaf Nodes
• The traffic flows working with ACI release 4.1(2) will continue to be supported in 4.2(1)
1. Endpoint connected to the RL pair associated to a site communicating with L3Out(s) deployed in the same site
2. Transit routing between L3Outs deployed in the main site and the RL pair associated to the same site
3. Endpoint connected to the main site communicating with the L3Out deployed on the RL pair associated to the same site
4. Transit routing between L3Outs deployed on RL pairs associated to the same site
• The following traffic flows are not supported in ACI release 4.2(1)
1. Transit routing between L3Outs deployed on RL pairs associated to separate sites
2. Endpoint connected to a RL pair associated to a site communicating with the L3Out deployed on the RL pair associated to a remote site
3. Endpoint connected to the local site communicating with the L3Out deployed on the RL pair associated to a remote site
4. Endpoint connected to a RL pair associated to a site communicating with the L3Out deployed on a remote site
ACI Multi-Site and Intersite L3Out
Integration with L3 Multicast
• Even with ACI release 4.2(1), support for Layer 3 Multicast and Multi-Site mandates the deployment of at least one local L3Out per site (for each multicast enabled VRF)
ACI Multi-Site and Intersite L3Out
Integration with L3 Multicast – Not Supported Scenarios
PIM ASM, PIM SSM
PIM ASM
Multi-Site and
Network Services
Integration
Multi-Site and Network Services
Integration Models
Deployment options fully supported with ACI Multi-Pod
ISN
• SW and HW dependencies:
▪ Supported from ACI release 3.2(1)
▪ Mandates the use of EX/FX leaf nodes (both for compute and service leaf switches)
• The PBR policy applied on a leaf switch can only redirect traffic to a service
node deployed in the local site
▪ Requires the deployment of independent service node function in each site
▪ Various design options to increase resiliency for the service node function: per site
Active/Standby pair, per site Active/Active cluster, per site multiple independent Active
nodes
▪ Only a single service node function (FW) supported in the PBR policy with 3.2(1) release
▪ Two service node functions (FW + SLB) supported in the PBR policy from 4.0(1) release
Use of Service Graph and Policy Based Redirection
Resilient Service Node Deployment in Each Site
• The Active/Standby pair represents a single MAC/IP entry in the PBR policy
• The Active/Active cluster represents a single MAC/IP entry in the PBR policy
• Spanned Ether-Channel Mode supported with Cisco ASA/FTD platforms
• Each Active node represents a unique MAC/IP entry in the PBR policy
• Use of Symmetric PBR to ensure each flow is handled by the same Active node in both directions
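The idea behind Symmetric PBR, that both directions of a flow must resolve to the same Active node, can be shown with a direction-independent hash. This is a generic sketch of the technique, not the hashing algorithm ACI uses; the node names and IP addresses are hypothetical.

```python
import hashlib

def select_service_node(src_ip, dst_ip, nodes):
    """Pick a node from a hash that is identical for both directions of a flow."""
    # Sorting the address pair makes (A, B) and (B, A) produce the same key
    key = "|".join(sorted([src_ip, dst_ip]))
    digest = hashlib.sha256(key.encode()).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]

# Three independent Active nodes, each a separate MAC/IP entry in the PBR policy
active_nodes = ["FW-1", "FW-2", "FW-3"]

forward = select_service_node("192.0.2.10", "10.10.10.11", active_nodes)
reverse = select_service_node("10.10.10.11", "192.0.2.10", active_nodes)
print(forward, reverse)
```

Because the key ignores direction, the forward and return legs always reach the firewall that holds the connection state, which is the property the slide attributes to Symmetric PBR.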
Use of Service Graph and Policy Based Redirection
North-South and East-West Use Cases
Ext-BD Web-BD Service-BD
▪ Ext-EPG must also be a stretched object, mapped to the individual L3Outs defined in each site
▪ Web-BD and App-BD can be stretched across sites or locally defined in each site
• North-South use case
▪ Intra-VRF only support as of ACI 4.2(3) release
• East-West use case
▪ Supported intra-VRF or inter-VRFs/Tenants
▪ Requires configuring the IP range for the endpoints under the Provider EPG
East-West
VRF1
L3 L3 L3
Web-EPG App-EPG
Web-BD Service-BD App-BD
Use of Service Graph and Policy Based Redirection
North-South Communication – Inbound Traffic
Inter Site
Network
Site1 Site2
Compute leaf
always applies
the PBR policy Compute leaf
always applies
EPG EPG the PBR policy
Ext C Web
Consumer Provider
(Provider) (Consumer)
L3Out-Site1 L3Out-Site2
10.10.10.10 10.10.10.11
L3 Mode L3 Mode
Active/Standby Active/Standby
• Inbound traffic can enter any site when destined to a stretched subnet (if ingress optimisation is not deployed or
possible)
• PBR policy must always be applied on the compute leaf node where the destination endpoint is connected
▪ Requires the VRF to have the default policies for enforcement preference and direction
▪ Supported only intra-VRF in ACI release 3.2
▪ Ext-EPG and Web EPG can indifferently be provider or consumer of the contract
Use of Service Graph and Policy Based Redirection
North-South Communication – Outbound Traffic
Inter Site
Network
Site1 Site2
Compute leaf
always applies
the PBR policy Compute leaf
always applies
EPG EPG the PBR policy
Ext C Web
Consumer Provider
(Provider) (Consumer)
L3Out-Site1 L3Out-Site2
10.10.10.10 10.10.10.11
L3 Mode L3 Mode
Active/Standby Active/Standby
• PBR policy always applied on the same leaf where it was applied for inbound traffic
• Ensures the same service node is selected for both legs of the flow
• Different L3Outs can be used for inbound and outbound directions of the same flow
Use of Service Graph and Policy Based Redirection
East-West Communication (1)
Inter Site
Network
Site1 Site2
Provider leaf
always applies
the PBR policy
EPG EPG
Web C App
Provider Consumer
*From ACI 4.0(1) release; in earlier releases it was applied on the Consumer side
Use of Service Graph and Policy Based Redirection
East-West Communication (2)
Inter Site
Network
Site1 Site2
Provider leaf
always applies Consumer leaf
the PBR policy does not apply
EPG EPG
Web C App the PBR policy
Provider Consumer
▪ The Consumer leaf must not apply PBR policy to ensure proper traffic stitching to the FW
node that has built connection state
▪ Ensures both legs of the flow are handled by the same service node
Multi-Site and Virtual Machine
Manager (VMM) Integration
ACI Multi-Site and VMM Integration
Option 1 – Separate VMM per Site
ISN
VMM 1 VMM 2
HV vSwitch1
HV HV Managed HV vSwitch2
HV HV
by VMM 1
HV Cluster 1 Managed HV Cluster 2
by VMM 2
VMM 1
HV vSwitch1
HV HV HV vSwitch2
HV HV
Managed
HV Cluster 1 by VMM 1 HV Cluster 2
ACI Multi-Site and VMM Integration
Workload Migration across Sites
ISN
vCenter vCenter
Server 1 Server 2
SRM SRM
HV HVVDS1 HV
EPG1 HV HVVDS2 HV
EPG1
HV Cluster 1 HV Cluster 2
ACI Multi-Pod and Multi-Site
Connectivity between Pods and Sites
Single external network used
for IPN and ISN
IPN/ISN
APIC Cluster
Pod ‘A’ Pod ‘B’
Site 1 Site 2
ACI Multi-Pod and Multi-Site
Connectivity between Pods and Sites
IP WAN
Separate networks
used for IPN and ISN
IPN
Site 2
1st Gen 1st Gen
APIC Cluster
Pod ‘A’ Pod ‘B’
Site 1 Site 2
Connectivity between Pods and Sites
Not Supported Topology
Separate uplinks between spines and external networks
IP WAN
IPN
Site 2
1st Gen 1st Gen
APIC Cluster
Pod ‘A’ Pod ‘B’
Site 1 Site 2
ACI Multi-Pod and Multi-Site
BGP Spine Roles
Pod ‘B’
• All the spines that are not speakers implicitly
become forwarders
Site 1
ACI Multi-Pod and Multi-Site
Inter-Site and Intra-Site EVPN Sessions
= Inter-Site MP-BGP EVPN Peering (Speaker-to-Speaker)
= Intra-Site MP-BGP EVPN Peering (Speaker-to-Forwarders)
Site 2
IP
BGP BGP
Speaker Speaker
IP WAN
IPN
Site 1
ACI Multi-Pod and Multi-Site
Inter-Site L2/L3 Unicast Traffic
Site 2
IP
O-UTEP-S2
IP WAN
EP1
Site 2 Spine Table
EP1 Leaf 1
EP2 O-UTEP-S1P1
EP3 O-UTEP-S1P3
Site 1-Pod3 Spine Table
Site 1-Pod1 Spine Table O-UTEP-S1P1 O-UTEP-S1P2 O-UTEP-S1P3
EP3 Leaf 4
EP2 Leaf 1
EP1 O-UTEP-S2
EP1 O-UTEP-S2
EP2 O-UTEP-S1P1
EP3 O-UTEP-S1P3
EP4 O-UTEP-S1P2
EP4 O-UTEP-S1P2
EP2 EP4 EP3
Site 1
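The spine tables on this slide can be read as a chain of lookups: a spine either knows the local leaf for an endpoint or points to the O-UTEP of the pod or site that does. The sketch below follows that chain. The table contents are one reading of the flattened diagram and should be treated as an assumption; EP4 is left out because the Pod2 spine table is not shown on the slide. The `resolve` helper and the domain names are hypothetical.

```python
# Spine tables as read from the slide (EP4 omitted: the Pod2 table is not shown)
SPINE_TABLES = {
    "Site2": {"EP1": "Leaf 1", "EP2": "O-UTEP-S1P1", "EP3": "O-UTEP-S1P3"},
    "Site1-Pod1": {"EP2": "Leaf 1", "EP1": "O-UTEP-S2", "EP3": "O-UTEP-S1P3"},
    "Site1-Pod3": {"EP3": "Leaf 4", "EP1": "O-UTEP-S2", "EP2": "O-UTEP-S1P1"},
}

# Which spine set answers for each O-UTEP address
O_UTEP_OWNER = {
    "O-UTEP-S2": "Site2",
    "O-UTEP-S1P1": "Site1-Pod1",
    "O-UTEP-S1P3": "Site1-Pod3",
}

def resolve(start_domain, endpoint, max_hops=4):
    """Follow spine lookups until an entry points to a local leaf."""
    hops = []
    domain = start_domain
    for _ in range(max_hops):
        next_hop = SPINE_TABLES[domain][endpoint]
        hops.append((domain, next_hop))
        if next_hop not in O_UTEP_OWNER:
            return hops  # reached a leaf in the local pod or site
        domain = O_UTEP_OWNER[next_hop]
    raise RuntimeError("lookup did not converge")

print(resolve("Site2", "EP3"))
print(resolve("Site1-Pod1", "EP1"))
```

In this reading, traffic from Site 2 toward EP3 is sent to the O-UTEP of Pod 3 in Site 1, whose spines then deliver it to Leaf 4.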
ACI Multi-Pod and Multi-Site
Inter-Site L2 BUM Traffic
= BUM sent via Ingress Replication
= BUM sent via PIM-Bidir
Site 2
IP
O-MTEP-S2
IP WAN
EP1
DF DF DF
BUM
Frame
EP2 EP4 EP3
Site 1
ACI Multi-Pod and Multi-Site
Inter-Site L2 BUM Traffic
= BUM sent via Ingress Replication
= BUM sent via PIM-Bidir
Ingress Replicated to O-MTEP address of Site 1
Site 2
IP
DF
BUM
Frame IP WAN
EP1
O-MTEP-S1
Site 1
ACI Multi-Pod and Multi-Site
TEP Pools Deployment and Advertisement
TEP Pool Site 2: 10.1.0.0/16
Site 2
IP
Filter out the advertisement of the TEP Pool into the backbone
TEP Pool advertisement not needed for inter-Sites communication
IP WAN
All Pods TEP Pools advertised into the IPN
Filter out the advertisement of the TEP Pools into the backbone
IPN
TEP Pool Pod1: 10.1.0.0/16
TEP Pool Pod2: 10.2.0.0/16
TEP Pool Pod3: 10.3.0.0/16
Site 1
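A quick overlap check shows why the TEP pools should be filtered out of the backbone. The pool values below follow one reading of the diagram, in which Site 2 and Pod1 of Site 1 both use 10.1.0.0/16; that reading, the dictionary keys and the `overlapping_pools` helper are assumptions made for this sketch.

```python
import ipaddress
from itertools import combinations

# TEP pools as read from the slide
tep_pools = {
    "Site 1 Pod1": "10.1.0.0/16",
    "Site 1 Pod2": "10.2.0.0/16",
    "Site 1 Pod3": "10.3.0.0/16",
    "Site 2": "10.1.0.0/16",
}

def overlapping_pools(pools):
    """Return every pair of fabrics whose TEP pools overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in pools.items()}
    return [
        (a, b)
        for a, b in combinations(networks, 2)
        if networks[a].overlaps(networks[b])
    ]

print(overlapping_pools(tep_pools))
```

Overlapping pools are harmless as long as they stay inside each site's IPN, since inter-site communication does not rely on them. If they leaked into the shared backbone, the same prefix would be advertised from two different places.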
Conclusions
Multi-Pod and Multi-Site
Complementary Architectures
‘Classic’ Active/Active
ACI Multi-Site
‘Classic’ Active/Active
Pod ‘1.B’
Pod ‘2.B’
Application workloads deployed across availability zones
Where to Go for More Information
Thank you