VXLAN With Static Ingress Replication and Multicast Control Plane

Posted by Paris Arau on December 30, 2017 in Cisco, How To's, Technical

This is the first part of a series covering VXLAN on NEXUS devices. Various control-plane
approaches will be covered.

In this first part, the unicast and multicast control planes are discussed; in our next post, we’ll
discuss VXLAN using MP-BGP. Each of these has advantages and disadvantages.

The purpose of this series is to show how you can configure each method and how the traffic is
forwarded.

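VXLAN Tunnel Endpoint (VTEP): the end of a VXLAN segment, which performs encapsulation and
de-encapsulation.

VXLAN Network Identifier (VNI): a 24-bit value that identifies a VXLAN segment.

Network Virtualization Edge (NVE): the overlay interface on which VTEPs are defined.

The fields of an Ethernet frame carrying VXLAN are, in order: outer Ethernet header, outer IP
header, outer UDP header, VXLAN header (which carries the VNI), and the original Ethernet frame.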
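To make the VXLAN header itself concrete, here is a minimal Python sketch that builds and parses the 8-byte header as defined in RFC 7348 (8 flag bits, 24 reserved bits, the 24-bit VNI, 8 reserved bits). The function names are illustrative assumptions and not part of any Cisco tool; the VNI 10100 is the one used later in this article.

```python
import struct

VXLAN_UDP_PORT = 4789          # IANA-assigned destination UDP port (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the VNI field is valid


def build_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VXLAN header for a 24-bit VNI (hypothetical helper)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from an 8-byte VXLAN header (hypothetical helper)."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


header = build_vxlan_header(10100)
print(header.hex())       # 0800000000277400
print(parse_vni(header))  # 10100
```

The header is carried as the payload of the outer UDP datagram, directly in front of the original Ethernet frame.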

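The first part of this article covers simple VXLAN with static ingress replication. In the topology,
NX_OS_1, NX_OS_2 and NX_OS_3 are the edge devices, connecting R1, R2 and R3 respectively, and
NX_OS_4 sits in the middle.

The NEXUS devices are all running an IGP (OSPF) for loopback interface reachability, and all the
traffic between the edge NEXUS devices must go through NX_OS_4.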

These are the OSPF routes on NX_OS_4 and similar output is found on all the other devices.

NX_OS_4# show ip route ospf-1
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%' in via output denotes VRF

1.1.1.1/32, ubest/mbest: 1/0
    *via 10.10.14.1, Eth1/1, [110/41], 00:07:04, ospf-1, intra
1.1.1.2/32, ubest/mbest: 1/0
    *via 10.10.24.2, Eth1/2, [110/41], 00:10:01, ospf-1, intra
1.1.1.3/32, ubest/mbest: 1/0
    *via 10.10.34.3, Eth1/3, [110/41], 00:09:55, ospf-1, intra

NX_OS_4#

R1, R2 and R3 are all in the same VLAN, VLAN 100.

NX_OS_1# show vlan id 100

VLAN Name                             Status    Ports
---- -------------------------------- --------- -------------------------------
100  VLAN0100                         active    Eth1/2

VLAN Type  Vlan-mode
---- ----- ----------
100  enet  CE

Remote SPAN VLAN
----------------
Disabled

Primary  Secondary  Type             Ports
-------  ---------  ---------------  -------------------------------------------

NX_OS_1#

So far, everything is as expected. To enable VXLAN, several things are required.
The first is to enable the VXLAN and overlay features:

NX_OS_1# show running-config | i feature
feature ospf
feature vn-segment-vlan-based
feature nv overlay
NX_OS_1#

Next, map the VLAN to a VNI by configuring the vn-segment ID under the VLAN:

NX_OS_1# show running-config vlan

!Command: show running-config vlan
!Time: Tue Dec 12 14:35:17 2017

version 7.0(3)I6(1)
vlan 1,100
vlan 100
  vn-segment 10100

NX_OS_1#

Finally, create the overlay interface and specify the ingress replication type along with the
peers.
This is the configuration for NX_OS_1:

NX_OS_1# show running-config nv overlay

!Command: show running-config nv overlay
!Time: Tue Dec 12 14:33:00 2017

version 7.0(3)I6(1)
feature nv overlay

interface nve1
  no shutdown
  source-interface loopback0
  member vni 10100
    ingress-replication protocol static
      peer-ip 1.1.1.2
      peer-ip 1.1.1.3

NX_OS_1#

An almost identical configuration is found on NX_OS_2 and NX_OS_3; only the peer IP addresses
differ.
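Since the three configurations differ only in their peer lists, they can be derived from one list of VTEP loopbacks. The following Python sketch is only an illustration of that idea: the helper name `static_nve_config` is hypothetical, and the emitted commands are simply the ones shown above for NX_OS_1.

```python
# Loopback addresses of the three edge devices, as used in this article.
VTEPS = {
    "NX_OS_1": "1.1.1.1",
    "NX_OS_2": "1.1.1.2",
    "NX_OS_3": "1.1.1.3",
}


def static_nve_config(local_ip, all_vteps, vni=10100):
    """Render the nve1 section for one VTEP (hypothetical helper).

    Every VTEP except the local one becomes a static peer.
    """
    peers = [ip for ip in all_vteps if ip != local_ip]
    lines = [
        "interface nve1",
        "  no shutdown",
        "  source-interface loopback0",
        f"  member vni {vni}",
        "    ingress-replication protocol static",
    ]
    lines += [f"      peer-ip {ip}" for ip in peers]
    return "\n".join(lines)


for name, ip in VTEPS.items():
    print(f"! {name}")
    print(static_nve_config(ip, VTEPS.values()))
    print()
```

Generating the peer lists from a single source of truth is one way to reduce the risk of a forgotten or mistyped peer when a VTEP is added.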

Once this configuration is applied, each edge NEXUS device creates two tunnels, one to each of the
other two edge devices.

This is the overlay interface:

NX_OS_1# show nve interface
Interface: nve1, State: Up, encapsulation: VXLAN
 VPC Capability: VPC-VIP-Only [not-notified]
 Local Router MAC: 5e00.0000.0007
 Host Learning Mode: Data-Plane
 Source-Interface: loopback0 (primary: 1.1.1.1, secondary: 0.0.0.0)

NX_OS_1# show interface nve1
nve1 is up
admin state is up, Hardware: NVE
  MTU 9216 bytes
  Encapsulation VXLAN
  Auto-mdix is turned off
  RX
    ucast: 0 pkts, 0 bytes - mcast: 0 pkts, 0 bytes
  TX
    ucast: 0 pkts, 0 bytes - mcast: 0 pkts, 0 bytes

NX_OS_1#

You can also check the VXLAN network identifier along with the peer status:

NX_OS_1# show nve vni
Codes: CP - Control Plane        DP - Data Plane
       UC - Unconfigured         SA - Suppress ARP

Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      10100    UnicastStatic     Up    DP   L2 [100]

NX_OS_1# show nve peers detail | no-more
Details of nve Peers:
----------------------------------------
Peer-Ip: 1.1.1.2
    NVE Interface       : nve1
    Peer State          : Up
    Peer Uptime         : 00:04:48
    Router-Mac          : n/a
    Peer First VNI      : 10100
    Time since Create   : 00:04:48
    Configured VNIs     : 10100
    Provision State     : add-complete
    Route-Update        : Yes
    Peer Flags          : None
    Learnt CP VNIs      : 10100
    Peer-ifindex-resp   : Yes
----------------------------------------
Peer-Ip: 1.1.1.3
    NVE Interface       : nve1
    Peer State          : Up
    Peer Uptime         : 00:04:48
    Router-Mac          : n/a
    Peer First VNI      : 10100
    Time since Create   : 00:04:48
    Configured VNIs     : 10100
    Provision State     : add-complete
    Route-Update        : Yes
    Peer Flags          : None
    Learnt CP VNIs      : 10100
    Peer-ifindex-resp   : Yes
----------------------------------------
NX_OS_1#

Everything looks fine, so a ping from R1 to R2 and R3 should be successful:

R1#ping 100.100.100.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 100.100.100.2, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 17/18/19 ms
R1#ping 100.100.100.3
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 100.100.100.3, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 18/18/19 ms
R1#show ip arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  100.100.100.1           -   fa16.3ebd.45fa  ARPA   GigabitEthernet0/1
Internet  100.100.100.2           0   fa16.3eae.df08  ARPA   GigabitEthernet0/1
Internet  100.100.100.3           0   fa16.3efb.a5a3  ARPA   GigabitEthernet0/1
R1#

As you can see, R1 learns the ARP entries as if all three routers were in the same ordinary VLAN.

The MAC address table on NX_OS_1 is shown below. It helps to understand which MAC address was
learned via a direct connection (R1), which ones were learned over the overlay interface, and
from which peer:

NX_OS_1# show system internal l2fwder mac
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link,
        (T) - True, (F) - False, C - ControlPlane MAC
   VLAN     MAC Address      Type      age     Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
*   100    fa16.3efb.a5a3   dynamic   00:03:48   F     F   (0x47000002) nve-peer2 1.1.1.3
*   100    fa16.3eae.df08   dynamic   00:03:52   F     F   (0x47000001) nve-peer1 1.1.1.2
*   100    fa16.3ebd.45fa   dynamic   00:05:24   F     F   Eth1/2
NX_OS_1#

Observe that the MAC type is dynamic.
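The dynamic entries come from data-plane learning: a locally attached MAC is bound to the access port, while a MAC seen inside a VXLAN packet is bound to the VTEP that sent it. The Python sketch below is a toy model of that behaviour under simplifying assumptions (no aging, no moves); the class `MacTable` is hypothetical and is not how NX-OS implements its table. The MAC addresses and peers are the ones from the output above.

```python
class MacTable:
    """Toy flood-and-learn MAC table for one VTEP (illustrative only)."""

    def __init__(self):
        self.entries = {}  # (vlan, mac) -> where the MAC was learned

    def learn_local(self, vlan, src_mac, port):
        # Frame arrived on an access port: remember the port.
        self.entries[(vlan, src_mac)] = port

    def learn_remote(self, vlan, inner_src_mac, outer_src_ip):
        # Frame arrived VXLAN-encapsulated: remember the sending VTEP,
        # taken from the outer source IP address.
        self.entries[(vlan, inner_src_mac)] = f"nve-peer {outer_src_ip}"

    def lookup(self, vlan, dst_mac):
        # None means the MAC is unknown, so the frame has to be flooded.
        return self.entries.get((vlan, dst_mac))


table = MacTable()
table.learn_local(100, "fa16.3ebd.45fa", "Eth1/2")     # R1
table.learn_remote(100, "fa16.3eae.df08", "1.1.1.2")   # R2 behind NX_OS_2
table.learn_remote(100, "fa16.3efb.a5a3", "1.1.1.3")   # R3 behind NX_OS_3

print(table.lookup(100, "fa16.3efb.a5a3"))  # nve-peer 1.1.1.3
print(table.lookup(100, "fa16.3e00.0001"))  # None
```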


Here is a packet capture taken on NX_OS_1 on eth1/2 (the interface towards R1), showing an ARP
Request from R1 trying to resolve the MAC address of R3.

Next is a packet capture on NX_OS_1 on interface eth1/1 (towards NX_OS_4), showing that the
same ARP packet is encapsulated with VXLAN. In that capture, the VXLAN header encapsulates the
original frame received from R1 on eth1/2.

And that is everything about VXLAN using unicast.
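With static ingress replication, a broadcast frame such as the ARP Request is replicated at the ingress VTEP, which sends one unicast copy to each configured peer. A minimal Python sketch of that head-end replication follows; the function name and the tuple layout are assumptions made for illustration, and the payload is a placeholder rather than a real ARP frame.

```python
def ingress_replicate(local_vtep, peers, vni, frame):
    """Return one (outer_src, outer_dst, vni, frame) tuple per static peer.

    Hypothetical helper: models head-end replication of a BUM frame.
    """
    return [(local_vtep, peer, vni, frame) for peer in peers]


# NX_OS_1 with the two peers configured under "member vni 10100".
copies = ingress_replicate(
    local_vtep="1.1.1.1",
    peers=["1.1.1.2", "1.1.1.3"],
    vni=10100,
    frame=b"placeholder for the ARP Request from R1",
)

for outer_src, outer_dst, vni, _frame in copies:
    print(f"{outer_src} -> {outer_dst} (VNI {vni})")
# 1.1.1.1 -> 1.1.1.2 (VNI 10100)
# 1.1.1.1 -> 1.1.1.3 (VNI 10100)
```

The number of copies grows with the number of peers, which is the cost the multicast approach avoids.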
Next, we will cover the VXLAN implementation with a multicast control plane. From the underlay
point of view, nothing changes except that PIM is added, with NX_OS_4 as the RP for the group
used for VXLAN.

This is the configuration on NX_OS_1; all the other devices have an equivalent configuration:

NX_OS_1# show running-config | section pim
feature pim
ip pim rp-address 1.1.1.4 group-list 226.0.0.0/24
ip pim ssm range 232.0.0.0/8
  ip pim sparse-mode
  ip pim sparse-mode
NX_OS_1# show running-config interface lo0

!Command: show running-config interface loopback0
!Time: Tue Dec 12 15:07:04 2017

version 7.0(3)I6(1)

interface loopback0
  ip address 1.1.1.1/32
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode

NX_OS_1# show running-config interface e1/1

!Command: show running-config interface Ethernet1/1
!Time: Tue Dec 12 15:07:11 2017

version 7.0(3)I6(1)

interface Ethernet1/1
  no switchport
  mtu 9216
  ip address 10.10.14.1/24
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
  no shutdown
NX_OS_1#

The VXLAN configuration using multicast is almost identical to the one using unicast.
The difference is that ingress-replication was removed and a multicast group was added:

NX_OS_1# show running-config nv overlay

!Command: show running-config nv overlay
!Time: Tue Dec 12 15:04:42 2017

version 7.0(3)I6(1)
feature nv overlay

interface nve1
  no shutdown
  source-interface loopback0
  member vni 10100
    mcast-group 226.0.0.100

NX_OS_1#

Independently of the overlay interface configuration, the underlying PIM infrastructure must
work. These are the PIM neighbors of NX_OS_4 (the RP):

NX_OS_4# show ip pim neighbor
PIM Neighbor Status for VRF "default"
Neighbor        Interface            Uptime    Expires   DR       Bidir-  BFD
                                                         Priority Capable State
10.10.14.1      Ethernet1/1          00:20:19  00:01:38  1        yes     n/a
10.10.24.2      Ethernet1/2          00:20:15  00:01:31  1        yes     n/a
10.10.34.3      Ethernet1/3          00:20:12  00:01:31  1        yes     n/a
NX_OS_4#

This is the multicast routing table on NX_OS_1:

NX_OS_1# show ip mroute | no-more
IP Multicast Routing Table for VRF "default"

(*, 226.0.0.100/32), uptime: 00:32:34, ip pim nve
  Incoming interface: Ethernet1/1, RPF nbr: 10.10.14.4, uptime: 00:30:52
  Outgoing interface list: (count: 1)
    nve1, uptime: 00:06:29, nve

(1.1.1.1/32, 226.0.0.100/32), uptime: 00:16:48, ip mrib pim nve
  Incoming interface: loopback0, RPF nbr: 1.1.1.1, uptime: 00:16:48
  Outgoing interface list: (count: 1)
    Ethernet1/1, uptime: 00:07:37, pim

(1.1.1.2/32, 226.0.0.100/32), uptime: 00:16:34, ip mrib pim nve
  Incoming interface: Ethernet1/1, RPF nbr: 10.10.14.4, uptime: 00:16:34
  Outgoing interface list: (count: 1)
    nve1, uptime: 00:06:29, nve

(1.1.1.3/32, 226.0.0.100/32), uptime: 00:16:32, ip mrib pim nve
  Incoming interface: Ethernet1/1, RPF nbr: 10.10.14.4, uptime: 00:16:32
  Outgoing interface list: (count: 1)
    nve1, uptime: 00:06:29, nve

(*, 232.0.0.0/8), uptime: 00:31:11, pim ip
  Incoming interface: Null, RPF nbr: 0.0.0.0, uptime: 00:31:11
  Outgoing interface list: (count: 0)

NX_OS_1#

And this is the view from the RP. Observe, for instance, that a packet coming from 1.1.1.1 and
destined to 226.0.0.100 is forwarded on eth1/2 (towards NX_OS_2) and eth1/3 (towards NX_OS_3).
Likewise, packets from any source towards 226.0.0.100 are forwarded to all the other
NEXUS devices:

NX_OS_4# show ip mroute
IP Multicast Routing Table for VRF "default"

(*, 226.0.0.100/32), uptime: 00:08:15, pim ip
  Incoming interface: loopback0, RPF nbr: 1.1.1.4, uptime: 00:08:15
  Outgoing interface list: (count: 3)
    Ethernet1/2, uptime: 00:06:06, pim
    Ethernet1/1, uptime: 00:06:07, pim
    Ethernet1/3, uptime: 00:08:15, pim

(1.1.1.1/32, 226.0.0.100/32), uptime: 00:08:15, pim mrib ip
  Incoming interface: Ethernet1/1, RPF nbr: 10.10.14.1, uptime: 00:08:15, internal
  Outgoing interface list: (count: 2)
    Ethernet1/2, uptime: 00:06:06, pim
    Ethernet1/3, uptime: 00:08:15, pim

(1.1.1.2/32, 226.0.0.100/32), uptime: 00:08:15, pim mrib ip
  Incoming interface: Ethernet1/2, RPF nbr: 10.10.24.2, uptime: 00:08:15, internal
  Outgoing interface list: (count: 2)
    Ethernet1/1, uptime: 00:06:07, pim
    Ethernet1/3, uptime: 00:08:15, pim

(1.1.1.3/32, 226.0.0.100/32), uptime: 00:08:15, pim ip
  Incoming interface: Ethernet1/3, RPF nbr: 10.10.34.3, uptime: 00:08:15, internal
  Outgoing interface list: (count: 2)
    Ethernet1/2, uptime: 00:06:06, pim
    Ethernet1/1, uptime: 00:06:07, pim

(*, 232.0.0.0/8), uptime: 00:29:07, pim ip
  Incoming interface: Null, RPF nbr: 0.0.0.0, uptime: 00:29:07
  Outgoing interface list: (count: 0)

NX_OS_4#
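The pattern in the RP's (S,G) entries can be summarised as: forward on every interface that joined the group, except the one the packet arrived on. The Python sketch below reproduces only that pattern; it is a deliberate simplification that ignores RPF checks, PIM register handling and timers, and the names `JOINED` and `outgoing_interfaces` are assumptions made for illustration.

```python
# Interfaces on NX_OS_4 on which a PIM Join for 226.0.0.100 was received,
# as listed under the (*, 226.0.0.100/32) entry.
JOINED = {"Ethernet1/1", "Ethernet1/2", "Ethernet1/3"}

# Interface on which NX_OS_4 receives traffic from each source VTEP.
INCOMING = {
    "1.1.1.1": "Ethernet1/1",
    "1.1.1.2": "Ethernet1/2",
    "1.1.1.3": "Ethernet1/3",
}


def outgoing_interfaces(incoming, joined=JOINED):
    """Joined interfaces minus the incoming one (simplified model)."""
    return sorted(joined - {incoming})


for source, iif in INCOMING.items():
    print(source, "->", outgoing_interfaces(iif))
# 1.1.1.1 -> ['Ethernet1/2', 'Ethernet1/3']
# 1.1.1.2 -> ['Ethernet1/1', 'Ethernet1/3']
# 1.1.1.3 -> ['Ethernet1/1', 'Ethernet1/2']
```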

This is the VXLAN network identifier and now it shows the multicast group:

NX_OS_1# show nve vni
Codes: CP - Control Plane        DP - Data Plane
       UC - Unconfigured         SA - Suppress ARP

Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      10100    226.0.0.100       Up    DP   L2 [100]

NX_OS_1#

A ping from R1 to R2 and R3 is successful:

R1(config-if)#do ping 100.100.100.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 100.100.100.2, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 18/19/21 ms
R1(config-if)#do ping 100.100.100.3
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 100.100.100.3, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 18/19/21 ms
R1(config-if)#do show ip arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  100.100.100.1           -   fa16.3ebd.45fa  ARPA   GigabitEthernet0/1
Internet  100.100.100.2           3   fa16.3eae.df08  ARPA   GigabitEthernet0/1
Internet  100.100.100.3           0   fa16.3efb.a5a3  ARPA   GigabitEthernet0/1
R1(config-if)#

Also, the MAC address table looks the same as before:

NX_OS_1# show system internal l2fwder mac
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link,
        (T) - True, (F) - False, C - ControlPlane MAC
   VLAN     MAC Address      Type      age     Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
*   100    fa16.3efb.a5a3   dynamic   00:02:57   F     F   (0x47000002) nve-peer2 1.1.1.3
*   100    fa16.3eae.df08   dynamic   00:03:21   F     F   (0x47000001) nve-peer1 1.1.1.2
*   100    fa16.3ebd.45fa   dynamic   00:03:31   F     F   Eth1/2
NX_OS_1#

Again, the MAC type is dynamic, as with the unicast control plane.

The following is the traffic flow and VTEP discovery for an ARP Request/ARP Reply exchange.

The ARP Request is sent by the end host and reaches NX_OS_1. NX_OS_1 encapsulates the ARP
Request, using its loopback IP address as the source and the multicast group as the destination.

A packet capture on eth1/1 of NX_OS_1 shows the ARP Request leaving; notice the source and
destination IP addresses of the packet (1.1.1.1 and 226.0.0.100).

Next, after the packet reaches the RP, the RP forwards it on all interfaces on which a PIM Join
for the 226.0.0.100 group was received.

The packet then reaches NX_OS_3 (at this moment NX_OS_3 learns about NX_OS_1), where it is
de-encapsulated and sent to R3. R3 sends an ARP Reply to NX_OS_3, which encapsulates the ARP
Reply in a unicast packet and sends it directly to NX_OS_1. A packet capture on NX_OS_1 shows
the ARP Reply coming from NX_OS_3.

And this is pretty much how VXLAN using multicast is implemented and how the data
forwarding happens.
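The ARP exchange described above comes down to one forwarding decision at the ingress VTEP: broadcast or unknown destinations are sent to the multicast group, while known MAC addresses are sent as unicast to the VTEP they were learned from. The Python sketch below walks through the request and the reply under that simplified model; `outer_destination` and the dictionary of learned MACs are hypothetical names, not NX-OS internals.

```python
BROADCAST = "ffff.ffff.ffff"
MCAST_GROUP = "226.0.0.100"   # group configured under "member vni 10100"


def outer_destination(dst_mac, remote_macs, group=MCAST_GROUP):
    """Pick the outer destination IP for a frame entering the overlay."""
    if dst_mac == BROADCAST or dst_mac not in remote_macs:
        return group              # BUM traffic follows the multicast tree
    return remote_macs[dst_mac]   # known MAC: unicast to the learned VTEP


R1_MAC = "fa16.3ebd.45fa"

# 1. NX_OS_1 receives the broadcast ARP Request from R1 and knows no peers yet.
request_dst = outer_destination(BROADCAST, remote_macs={})

# 2. NX_OS_3 de-encapsulates the request and learns R1's MAC from the
#    outer source IP address (the loopback of NX_OS_1).
nx_os_3_remote_macs = {R1_MAC: "1.1.1.1"}

# 3. The ARP Reply from R3 is addressed to R1's MAC, which is now known.
reply_dst = outer_destination(R1_MAC, nx_os_3_remote_macs)

print(request_dst)  # 226.0.0.100
print(reply_dst)    # 1.1.1.1
```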
To sum up, here are some of the advantages and disadvantages of each approach.

Advantages:
- Unicast control plane:
  - Controlled deployment of VTEPs
  - Easier troubleshooting
- Multicast control plane:
  - Reduced operational overhead
  - Scalability
  - Simplicity

Disadvantages:
- Unicast control plane:
  - Increased operational burden
  - Prone to configuration errors
  - Each peer must be configured on every VTEP
- Multicast control plane:
  - Each VNI uses one multicast group
  - Possibly increased complexity due to PIM usage

Thank you to Paris Arau for his contributions to this article.
