
Understand BGP EVPN+VXLAN control-

plane and data-plane

Marcin Zimnica

Network Architect CCDE#20170060/CCIE#39720 x 2


January 1, 2019

I used this simple spine-leaf topology to describe and show how BGP EVPN+VXLAN
works for the control plane and data plane on Cisco Nexus boxes.
Before we dive into the details, there are some prerequisites and considerations that
need to be met for the underlay network:

 MTU must support at least 1550 bytes to accommodate the VXLAN encapsulation overhead

 multicast needs to be enabled and configured, which is required for VXLAN to
forward broadcast, unknown unicast, or multicast (BUM) traffic across the tunnels

 the appropriate NX-OS features need to be enabled

The following features must be enabled to get EVPN support on NX-OS boxes:

feature ospf
feature bgp
feature pim
feature interface-vlan
feature vn-segment-vlan-based
feature nv overlay

In this example the underlay network is based on OSPF with unnumbered links, and each
box is configured in a very similar way; the only differences are the OSPF router-id and
the loopback0 and loopback1 IPs, used for the unnumbered interfaces and the NVE
interface respectively.

Below is the configuration used for OSPF:

interface loopback0
description UNDERLAY OSPF
ip address 2.2.2.2/32
ip router ospf 1 area 0.0.0.0
ip pim sparse-mode

interface loopback1
description NVE SOURCE
ip address 22.22.22.22/32
ip router ospf 1 area 0.0.0.0
ip pim sparse-mode

router ospf 1
router-id 2.2.2.2

interface Ethernet1/1
no switchport
medium p2p
ip unnumbered loopback0
ip router ospf 1 area 0.0.0.0
ip pim sparse-mode
no shutdown

For multicast, IP PIM sparse mode is enabled with BSR, where the spine switch performs
both the BSR candidate and BSR rendezvous point (RP) candidate roles.
The spine multicast configuration is shown below.

feature pim

ip pim bsr bsr-candidate loopback0


ip pim bsr rp-candidate loopback0 group-list 231.0.0.0/8
ip pim bsr listen

interface Ethernet1/1
ip pim sparse-mode

interface Ethernet1/2
ip pim sparse-mode

The leaf configuration for multicast is very similar; the only difference is the lack of the
bsr-candidate and bsr rp-candidate commands. The ip pim bsr listen command is
required on the switches to allow them to listen to and forward bootstrap router (BSR)
and Candidate-RP messages.
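For completeness, a minimal leaf multicast configuration along these lines (assuming the same interface numbering as the spine) would be:

feature pim

ip pim bsr listen

interface Ethernet1/1
  ip pim sparse-mode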

When basic IP reachability is achieved between the leaf switches for unicast and multicast
traffic, the overlay network can be configured. The overlay network is based on BGP,
where the spine switch is configured as a route reflector. The basic spine BGP
configuration is shown below.

router bgp 100


  router-id 1.1.1.1
  address-family l2vpn evpn
  neighbor 2.2.2.2
    remote-as 100
    update-source loopback0
    address-family l2vpn evpn
      send-community extended
      route-reflector-client
  neighbor 3.3.3.3
    remote-as 100
    update-source loopback0
    address-family l2vpn evpn
      send-community extended
      route-reflector-client

This is the basic BGP configuration for the leaf switches:

router bgp 100


  router-id 2.2.2.2
  address-family l2vpn evpn
  neighbor 1.1.1.1
    remote-as 100
    update-source loopback0
    address-family l2vpn evpn
      send-community extended
When the basic BGP configuration for the L2VPN EVPN address family is applied, you can
verify the BGP peering status. BGP should be up on the RR with both leaf switches. At
this point no routes are learned, as the configuration is not yet complete on the leaf
switches, but the BGP control plane should be up.
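On NX-OS the peering state can be checked with the following commands (output omitted here; both leaf neighbors should show an established session on the RR):

show bgp l2vpn evpn summary
show bgp l2vpn evpn neighbors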

To provide communication between VPC1 and VPC2, these hosts must be put into
specific VLANs, and those VLANs assigned to a dedicated tenant VRF. Each host can be
placed in a different VLAN on each leaf; there is no requirement that they be in the same
VLAN. In this example, however, both hosts are placed in VLAN 1100 for simplicity.

The configuration below needs to be deployed on the leaf switches to provide L2
communication between VPC1 and VPC2. VLAN 100 provides the L3 VNI and is used
when a layer 3 lookup needs to be performed between different subnets. VLAN 1100 is
the access VLAN where hosts VPC1 and VPC2 are placed. Both VLANs 100 and 1100
must have a respective SVI, where SVI 100 is responsible for the layer 3 lookup and SVI
1100 provides the default gateway for VPC1 and VPC2 as well as the layer 2 lookup.

vlan 100
name L3VNI-TENANT-A
vn-segment 100

vlan 1100
name TENANT-A
vn-segment 1100

interface Vlan100
description L3VNI-TENANT-A
no shutdown
vrf member A
ip forward

interface Vlan1100
description VRF-TENANT-A
no shutdown
vrf member A
ip address 10.100.1.1/24
fabric forwarding mode anycast-gateway
Then the appropriate VRF must be configured for the L2VPN EVPN address family as
shown below. In addition, you can redistribute all directly connected interfaces for that
VRF. The anycast gateway MAC address also needs to be configured to make sure it
stays the same across the fabric for any given VLAN.

vrf context A
  vni 100
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn

fabric forwarding anycast-gateway-mac 0000.0000.0100

router bgp 100


  vrf A
    address-family ipv4 unicast
      redistribute direct route-map ALL
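The redistribute direct statement above references a route-map named ALL that is not shown elsewhere in the article; a minimal permit-all definition would be:

route-map ALL permit 10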

The next step is to configure the EVPN instance as below for tenant VRF A.

evpn
  vni 1100 l2
    rd auto
    route-target import auto
    route-target export auto

To verify the VLAN-to-VNI mapping you can use the show nve vni command. In this
example VLAN 100 is mapped to VNI 100 and VLAN 1100 is mapped to VNI 1100.

There is no VPC1 to VPC2 reachability at this point, because we still need to configure
one very important part of this deployment: the NVE interface. Let's configure the NVE
interface, which ties the BGP control-plane updates exchanged between the peers to the
data-plane VXLAN encapsulation between the leaf switches. The basic NVE interface
configuration is exactly the same on both leaf switches.
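The NVE configuration itself is not visible in the article, but based on the loopback1 source and the 231.1.1.1 multicast group referenced below, it would look roughly like this:

interface nve1
  no shutdown
  source-interface loopback1
  host-reachability protocol bgp
  member vni 100 associate-vrf
  member vni 1100
    mcast-group 231.1.1.1

Here host-reachability protocol bgp selects BGP EVPN as the control plane, member vni 100 associate-vrf binds the L3 VNI to the tenant VRF, and the mcast-group handles BUM replication for the L2 VNI.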
As soon as the NVE interface is configured and enabled, you can see that a PIM register
message is sent to the RP, with a source IP of Lo1 (22.22.22.22) and a destination of the
RP (1.1.1.1). This packet also includes PIM information about the source 22.22.22.22 and
the multicast group 231.1.1.1 that this specific leaf switch wants to join.
On the spine, if you run show ip mroute, you should be able to see that both NVE
sources are in the multicast routing table with the correct incoming and outgoing
interface state.

You can verify which NVE peers are up and running with the show nve peers command.
Leaf-1 has one peer, 33.33.33.33 (leaf-2). The Router-Mac is the MAC address of the NVE
interface on the leaf-2 switch, and the Peer-IP refers to Lo1 on the leaf-2 switch.

To verify the VNIs you can run the show nve vni command, which indicates the purpose
of each VNI: for example, VNI 1100 is L2 only and VNI 100 is responsible for the layer 3
lookup. Technically the L3 VNI 100 is not required if you don't need L3 communication
between different segments, but that is rarely the case in production deployments;
usually you will need some sort of L3 VNI to provide routing between different subnets
within the same fabric or to an external domain.
To verify MAC learning for any given VLAN, you can use commands such as show mac
address-table vlan 1100 and show l2route evpn mac-ip all. In this example VLAN 1100 is
checked; along with the MAC address information, you can also get the IP information
for the end host. The MAC address of VPC2, 0050.7966.6805, is learned via BGP with a
next hop of 33.33.33.33 (leaf-2 NVE1).

From the control-plane perspective, looking at the packet capture output you can see a
standard BGP update packet which includes the end-host MAC address information. This
is purely control-plane information exchanged by BGP EVPN peers about end-host
reachability.

Nowadays most end hosts are not silent, so they send some kind of traffic or a gratuitous
ARP message, which allows the BGP control plane to quickly learn a new host and then
send the appropriate BGP update across the fabric to all BGP peers. The BGP l2vpn evpn
routing table now includes VPC1 and VPC2 information. For the L2 VNI, the BGP table
includes the type-2 MAC route for VPC1 [0050.7966.6804], indicated by the prefix
length /216, and the type-2 MAC+IP route, indicated by the prefix length /272. In
addition, you can see the L3 VNI information about the specific subnets BGP learned, in
this case 10.100.1.0/24 as a type-5 route with prefix length /224.

From the control-plane perspective we now have all the information we need for
end-host communication. Let's ping between VPC1 and VPC2 and capture some traffic
to see what it looks like from the data-plane perspective.
In the packet capture, the encapsulation looks like [IP | UDP | VXLAN | ETH | IP | ICMP].
The outer IP packet is sourced from 22.22.22.22 (leaf-1) to 33.33.33.33 (leaf-2) and
carries the UDP data in which VXLAN is encapsulated. The VXLAN header includes the
VNI 1100, which indicates which VNI is used for the communication between these two
end hosts. Then we have the Ethernet frame with the src and dst MACs of VPC1 and
VPC2, the IP information with VPC1 and VPC2 as src and dst respectively, and finally the
ICMP payload.
As mentioned above, the L3 VNI 100 provides the layer 3 lookup, so I created an
additional Lo100 (100.20.1.0/24) interface on the leaf-2 switch and added it to the tenant
A VRF. The leaf-1 switch learns this subnet as a type-5 route with a next hop of
33.33.33.33, which is the source IP address of the NVE1 interface on the leaf-2 switch.
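The extra loopback on leaf-2 can be configured along these lines (the /24 mask matching the 100.20.1.0/24 subnet advertised as type-5):

interface loopback100
  vrf member A
  ip address 100.20.1.1/24

Because this interface is in VRF A, the redistribute direct statement configured earlier picks it up and advertises it into BGP EVPN automatically.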

A ping from VPC1 to this Lo100 works as expected because, as shown above, I
configured L3 VNI 100 on both leaf switches. Looking at the packet capture, the outer IP
packet is sourced from 22.22.22.22 and destined to 33.33.33.33; both of these IPs are
assigned to the NVE interfaces as the source of the VXLAN traffic. The VXLAN
information in the packet capture shows that VNI 100 was used for the encapsulation of
that ICMP. Then you can see the Ethernet frame with source (5000.0002.0007) and
destination (5000.0003.0007) MACs, which are basically the NVE interface MAC
addresses. Then we can see the inner IP packet with the source of VPC1 (10.100.1.20)
and the destination IP of Lo100 (100.20.1.1) configured on the leaf-2 switch.

The VNI 100 is used because Cisco boxes by default use symmetric IRB (Integrated
Routing and Bridging), where both the ingress and egress VTEPs perform Layer 2 and
Layer 3 lookups. In this case a layer 3 lookup was performed, as the source and
destination are on different subnets.
Let's shut down SVI 100 (VNI 100) on the leaf-1 switch to see what happens and whether
the ICMP ping from VPC1 to Lo100 fails too.
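Shutting the SVI down on leaf-1 is a one-liner under the interface:

interface Vlan100
  shutdown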

After SVI 100 was shut down, I was no longer able to ping Lo100 (100.20.1.1) from VPC1;
however, I was still able to ping VPC2, as shown below.

I was able to ping the VPC2 host because that communication uses VNI 1100 for the
VXLAN encapsulation and only a layer 2 lookup is done. As you can see, different VNIs
are required for different purposes.

Essentially, BGP itself is only used to exchange control-plane information about host and
subnet reachability; to provide data-plane forwarding between end hosts on the same or
different segments, VXLAN is required to provide the encapsulation.
