VXLAN L3VPN

Published: 2022-12-26
Updated: 2023-01-27

L3VPN may also be called IPVPN or Routed VPN. This post shows how VXLAN can be used to create routed VPN topologies, similar to the MPLS L3VPN technology which has been the industry standard for VPN services offered by service providers.

This post will focus on mapping VNIs to VRFs, allowing us to build L3VPN topologies with VXLAN. I will cover three ways of building L3VPN on top of VXLAN going from least to most scalable. But first we need to cover some basic topics, so let's get on with it!

VXLAN L3VPN vs MPLS L3VPN

The purpose of building a VPN topology is to create network isolation. In a service-provider (SP) or datacenter (DC) network you want to separate customers or tenants from each other to avoid traffic from one customer leaking into another. The best way to do this is to separate customers into their own virtual routing tables (VRF).

The most common and scalable L3VPN example is MPLS L3VPN provided by most ISPs, a technology that has been used for decades. Instead of purchasing expensive dark fibre links to connect its sites, a company purchases L3VPN services from an ISP, utilizing the existing ISP backbone for inter-site connectivity.

The drawback of MPLS L3VPN for the ISP is that every node in the core must support MPLS forwarding. These devices tend to be more expensive to buy and maintain. With VXLAN L3VPN, only the edge nodes that perform encapsulation and decapsulation need to support VXLAN. The intermediary nodes can be simple devices as they only perform native IP routing.

What is a VRF and why do we need it

L3VPN technology utilizes VRFs for network isolation, so here's a quick primer:

  • VRF is short for Virtual Routing and Forwarding.
  • When you create a VRF inside a router you create a virtual routing table.
  • You can then assign interfaces to a VRF. Traffic entering that interface will be routed according to the routes in that particular VRF.

By using VRFs you are also guaranteed to avoid leaking traffic from one customer VPN into the private network of another. Additionally, since each customer is located in its own VRF, customers are free to use whatever IP-addressing they want without risk of colliding with other customers.
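
To make the VRF idea more concrete, here is a minimal Python sketch of two VRFs holding the same prefix. It is a toy model and not how any router implements VRFs: the interface names match the lab later in this post, but the 10.1.0.0/16 prefix and the next-hop labels are made up for illustration.

```python
import ipaddress

# Hypothetical model: one routing table (prefix -> next hop) per VRF.
# Both customers use the same prefix on purpose.
vrf_tables = {
    "BLUE": {ipaddress.ip_network("10.1.0.0/16"): "blue-next-hop"},
    "RED": {ipaddress.ip_network("10.1.0.0/16"): "red-next-hop"},
}

# Hypothetical interface-to-VRF assignment.
interface_vrf = {"Ethernet2": "BLUE", "Ethernet3": "RED"}


def lookup(in_interface, dst_ip):
    """Longest-prefix match, but only inside the VRF of the ingress interface."""
    table = vrf_tables[interface_vrf[in_interface]]
    dst = ipaddress.ip_address(dst_ip)
    matches = [net for net in table if dst in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]


print(lookup("Ethernet2", "10.1.3.31"))  # resolved in BLUE only
print(lookup("Ethernet3", "10.1.3.31"))  # same address, resolved in RED only
```

The same destination address gives a different answer depending on the ingress interface, which is why overlapping customer address space is harmless.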

Why do we need to combine VXLAN and VRFs?

You don't! You can choose to go with MPLS L3VPN or a "VRF-lite" setup where the VRF is stretched across all nodes in the network. Beware though, VRF-lite is not a scalable solution as displayed in the image below:

In the VRF-lite setup at the top, for the two Customer Red sites to access each other, all R1-R5 routers along the way must have the Red VRF configured. This configuration includes the virtual routing-table, interfaces associated to that routing table (subinterfaces or physical) and some protocol for advertising routes, statically or dynamically. This works fine in a small environment like in the picture, but if you have 100 routers and 100 customers then you can appreciate the problems this solution may create.

If we instead look at the VXLAN L3VPN setup in the same diagram, we can see that only R1 and R5 require customer VRF configuration. The R2-R4 nodes are only forwarding VXLAN-encapsulated packets between R1 and R5. This type of design simplifies the configuration as well as improves scalability, as R2, R3 and R4 are simple routers that require no VXLAN features. This also means far fewer routes in the R2-R4 routing tables, since they only need to know how to reach R1 and R5.
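
A rough way to quantify the difference is to count how many (router, VRF) pairs need configuration. The sketch below is back-of-the-envelope arithmetic only. It assumes the worst case where every customer VRF is needed on every router that carries VRFs, and the figure of 10 edge routers is a hypothetical number picked for the example.

```python
def vrf_instances(routers_with_vrfs, customers):
    """Upper bound on the VRF instances to configure and maintain."""
    return routers_with_vrfs * customers


# The diagram, counting only Customer Red: five routers, two of them at the edge.
diagram_vrf_lite = vrf_instances(5, 1)  # every router on the path needs the VRF
diagram_vxlan = vrf_instances(2, 1)     # only R1 and R5 need the VRF

# 100 routers and 100 customers, assuming (hypothetically) 10 edge routers.
large_vrf_lite = vrf_instances(100, 100)
large_vxlan = vrf_instances(10, 100)

print(diagram_vrf_lite, diagram_vxlan, large_vrf_lite, large_vxlan)
# prints: 5 2 10000 1000
```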


VXLAN L3VPN with Static Routes

We will start the L3VPN demonstration with the simplest possible setup. The diagram below describes the lab topology we will be using for the rest of this article. The R2-R4 devices are Arista vEOS running 4.28.3M with VXLAN capabilities. R1 is a "core" router running Cisco IOS with OSPF enabled so that R2-R4 can advertise their respective 10.0.0.X loopback addresses. R1 has no VXLAN capabilities.

There are two customers in this topology, Blue and Red, each in their respective VRF. The customers use overlapping IP-address space, but this is not a problem thanks to the VRF isolation. Let us examine the configuration:

R2 configuration:

service routing protocols model multi-agent
!
vrf instance BLUE
vrf instance RED
!
interface Ethernet1
   no switchport
   ip address 10.0.12.2/29
   ip ospf network point-to-point
   ip ospf area 0.0.0.0
!
interface Ethernet2
   # mac-address 50:01:00:00:00:02
   no switchport
   vrf BLUE
   ip address 10.1.2.1/24
!
interface Ethernet3
   # mac-address 50:01:00:00:00:02
   no switchport
   vrf RED
   ip address 10.2.2.1/24
!
interface Loopback0
   ip address 10.0.0.2/32
!
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan udp-port 4789
   vxlan vrf BLUE vni 1
   vxlan vrf RED vni 2
!
ip routing
ip routing vrf BLUE
ip routing vrf RED
!
ip route vrf BLUE 10.1.3.0/24 vtep 10.0.0.3 vni 1 router-mac-address 50:01:00:00:00:03
ip route vrf BLUE 10.1.4.0/24 vtep 10.0.0.4 vni 1 router-mac-address 50:01:00:00:00:04
ip route vrf RED 10.2.3.0/24 vtep 10.0.0.3 vni 2 router-mac-address 50:01:00:00:00:03
ip route vrf RED 10.2.4.0/24 vtep 10.0.0.4 vni 2 router-mac-address 50:01:00:00:00:04
!
router ospf 1
   redistribute connected

R2#show ip route vrf all

VRF: default
 C        10.0.0.2/32 is directly connected, Loopback0
 O E2     10.0.0.3/32 [110/1] via 10.0.12.1, Ethernet1
 O E2     10.0.0.4/32 [110/1] via 10.0.12.1, Ethernet1
 C        10.0.12.0/29 is directly connected, Ethernet1
 O        10.0.13.0/29 [110/20] via 10.0.12.1, Ethernet1
 O        10.0.14.0/29 [110/20] via 10.0.12.1, Ethernet1

VRF: BLUE
 C        10.1.2.0/24 is directly connected, Ethernet2
 S        10.1.3.0/24 [1/0] via VTEP 10.0.0.3 VNI 1 router-mac 50:01:00:00:00:03
 S        10.1.4.0/24 [1/0] via VTEP 10.0.0.4 VNI 1 router-mac 50:01:00:00:00:04

VRF: RED
 C        10.2.2.0/24 is directly connected, Ethernet3
 S        10.2.3.0/24 [1/0] via VTEP 10.0.0.3 VNI 2 router-mac 50:01:00:00:00:03
 S        10.2.4.0/24 [1/0] via VTEP 10.0.0.4 VNI 2 router-mac 50:01:00:00:00:04

R3 configuration:

service routing protocols model multi-agent
!
vrf instance BLUE
vrf instance RED
!
interface Ethernet1
   no switchport
   ip address 10.0.13.3/29
   ip ospf network point-to-point
   ip ospf area 0.0.0.0
!
interface Ethernet2
   # mac-address 50:01:00:00:00:03
   no switchport
   vrf BLUE
   ip address 10.1.3.1/24
!
interface Ethernet3
   # mac-address 50:01:00:00:00:03
   no switchport
   vrf RED
   ip address 10.2.3.1/24
!
interface Loopback0
   ip address 10.0.0.3/32
!
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan udp-port 4789
   vxlan vrf BLUE vni 1
   vxlan vrf RED vni 2
!
ip routing
ip routing vrf BLUE
ip routing vrf RED
!
ip route vrf BLUE 10.1.2.0/24 vtep 10.0.0.2 vni 1 router-mac-address 50:01:00:00:00:02
ip route vrf BLUE 10.1.4.0/24 vtep 10.0.0.4 vni 1 router-mac-address 50:01:00:00:00:04
ip route vrf RED 10.2.2.0/24 vtep 10.0.0.2 vni 2 router-mac-address 50:01:00:00:00:02
ip route vrf RED 10.2.4.0/24 vtep 10.0.0.4 vni 2 router-mac-address 50:01:00:00:00:04
!
router ospf 1
   redistribute connected

R4 configuration:

service routing protocols model multi-agent
!
vrf instance BLUE
vrf instance RED
!
interface Ethernet1
   no switchport
   ip address 10.0.14.4/29
   ip ospf network point-to-point
   ip ospf area 0.0.0.0
!
interface Ethernet2
   # mac-address 50:01:00:00:00:04
   no switchport
   vrf BLUE
   ip address 10.1.4.1/24
!
interface Ethernet3
   # mac-address 50:01:00:00:00:04
   no switchport
   vrf RED
   ip address 10.2.4.1/24
!
interface Loopback0
   ip address 10.0.0.4/32
!
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan udp-port 4789
   vxlan vrf BLUE vni 1
   vxlan vrf RED vni 2
!
ip routing
ip routing vrf BLUE
ip routing vrf RED
!
ip route vrf BLUE 10.1.2.0/24 vtep 10.0.0.2 vni 1 router-mac-address 50:01:00:00:00:02
ip route vrf BLUE 10.1.3.0/24 vtep 10.0.0.3 vni 1 router-mac-address 50:01:00:00:00:03
ip route vrf RED 10.2.2.0/24 vtep 10.0.0.2 vni 2 router-mac-address 50:01:00:00:00:02
ip route vrf RED 10.2.3.0/24 vtep 10.0.0.3 vni 2 router-mac-address 50:01:00:00:00:03
!
router ospf 1
   redistribute connected

R1 configuration (Cisco IOS):

!
interface Ethernet2
 ip address 10.0.12.1 255.255.255.248
 ip ospf network point-to-point
 ip ospf 1 area 0
!
interface Ethernet3
 ip address 10.0.13.1 255.255.255.248
 ip ospf network point-to-point
 ip ospf 1 area 0
!
interface Ethernet4
 ip address 10.0.14.1 255.255.255.248
 ip ospf network point-to-point
 ip ospf 1 area 0
!
router ospf 1
 router-id 10.0.0.1

Focusing on the R2 configuration, the magic lies in the VXLAN parameters added to each static route:

ip route vrf BLUE 10.1.3.0/24 vtep 10.0.0.3 vni 1 router-mac-address 50:01:00:00:00:03

The nexthop for the static route (10.0.0.3) is entered as type vtep, meaning we should VXLAN-encapsulate the packet before forwarding. While encapsulating we need to know which VNI tag to add to the packet, so we must specify that in the static route as well.

The final parameter of the static route is the router-mac-address. This address is set as the destination MAC-address in the original Ethernet frame when it is being VXLAN-encapsulated. I believe this is to tell the receiving router to route the packet that was unpacked.
The way an L3 switch normally decides to perform routing is by checking if the destination MAC-address in the Ethernet frame is its own MAC-address. If yes, the destination address in the IP header is examined. If the destination IP-address does not match a locally configured IP-address, the packet should be routed and is then matched against the routing table to find a suitable outbound interface towards the packet destination.
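
The decision described above can be sketched in a few lines of Python. This is a simplified illustration of the logic, not how a forwarding ASIC is programmed: the function name and the returned strings are invented for the example, and real devices handle many more cases (broadcast, multicast, TTL and so on).

```python
import ipaddress


def l3_decision(frame_dst_mac, packet_dst_ip, my_mac, my_ips, routes):
    """Return what a simplified L3 switch does with a received frame.

    routes maps a prefix string to an outbound interface name.
    """
    # Step 1: only frames addressed to our own MAC-address are candidates for routing.
    if frame_dst_mac != my_mac:
        return "not for me: bridge or drop"
    # Step 2: a packet to one of our own IP-addresses is processed locally.
    if packet_dst_ip in my_ips:
        return "deliver locally"
    # Step 3: anything else is routed using the longest matching prefix.
    dst = ipaddress.ip_address(packet_dst_ip)
    matches = [net for net in routes if dst in ipaddress.ip_network(net)]
    if not matches:
        return "drop: no route"
    best = max(matches, key=lambda net: ipaddress.ip_network(net).prefixlen)
    return "route via " + routes[best]


# R3 inside vrf BLUE, using the addresses from the lab configuration.
r3_mac = "50:01:00:00:00:03"
r3_blue_ips = {"10.1.3.1"}
r3_blue_routes = {"10.1.3.0/24": "Ethernet2"}

print(l3_decision(r3_mac, "10.1.3.31", r3_mac, r3_blue_ips, r3_blue_routes))
```

This is why the router-mac-address matters: if the inner frame did not carry the MAC-address of R3, the first check would fail and the decapsulated packet would never reach the routing lookup.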

PC21-PC31 Packet walk

As this is a complex operation, let's go through it in detail using the diagram below:

The diagram shows an ICMP Echo packet traveling from PC21 to PC31. All the MAC-addresses have been simplified to improve readability, so that each MAC-address matches its node ID. Let's examine how the packet is processed along the way to the destination:

  1. R2 receives the Ethernet frame from PC21 on an interface that belongs to vrf BLUE. The destination MAC-address (0002) is the MAC-address of R2, so R2 is the intended recipient. The destination IP-address (10.1.3.31), however, does not match any local IP-address of R2 inside vrf BLUE, so R2 searches the vrf BLUE routing-table for a matching route, finding the one highlighted in the image.
    The static route specifies a VTEP as nexthop together with a VNI, so R2 knows that the packet should be VXLAN-encapsulated before forwarding. During the encapsulation process, R2 changes the destination MAC-address inside the original Ethernet frame from R2 (0002) to R3 (0003). This is important and further discussed below.

  2. R1 receives the Ethernet frame from R2. The destination MAC-address matches R1, so it further processes the packet, discarding the Ethernet header as it has served its purpose. The destination IP (10.0.0.3) is not a local IP-address, so R1 uses its routing table to find the best path to R3. The path via the R1-R3 link is selected, so R1 adds a new Ethernet header with R1 (0001) as source and R3 (0003) as destination. Finally the packet is forwarded to R3.

  3. R3 receives the Ethernet frame from R1. The destination MAC-address (0003) matches R3, so the Ethernet Header is stripped and the IP header is processed next. The destination IP-address (10.0.0.3) is a local IP-address on R3, so the packet is not routed and instead processed further. The destination port in the UDP header matches the configured VXLAN port (4789), so the packet is sent to the VXLAN process for packet decapsulation and VNI identification (vrf BLUE). The original Ethernet Frame is then recirculated back into the R3 forwarding ASIC as having entered on a virtual VXLAN interface inside vrf BLUE, so the forwarding process starts over:
    R3 receives an Ethernet frame from internal interface "VXLAN" in vrf BLUE. The destination MAC-address (0003) matches R3, so the Ethernet header is stripped and the IP header is processed next. This time the source MAC-address is 0002, but this is not important. The destination IP-address (10.1.3.31) is not a local IP-address inside vrf BLUE, but matches subnet 10.1.3.0/24 configured on the R3-PC31 interface. R3 adds a new Ethernet header, setting itself as the source (0003), PC31 as the destination (0031) and finally forwards the packet to PC31.

The packet walk above goes into great depth on how the packet is processed. You may need to read it a couple of times to get the full picture, and that's fine because this is a complicated process. While researching this I found that older ASICs that did not fully support VXLAN would use a forwarding technique called "recirculation", where the VXLAN-packet would first be processed by the ASIC, identified as VXLAN and decapsulated. The remaining Ethernet frame would then be sent through the ASIC one more time to figure out where to send it. Newer ASICs with full VXLAN support will do everything in a single processing cycle.
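
The VNI that R2 writes and R3 reads lives in the 8-byte VXLAN header that sits between the outer UDP header and the original Ethernet frame. As a small sketch of that header layout as defined in RFC 7348 (one flags byte, 24 reserved bits, a 24-bit VNI, 8 reserved bits), and not of any vendor implementation, the helper names below are my own:

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag from RFC 7348


def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a given VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, followed by 24 reserved bits.
    # Word 2: VNI in the top 24 bits, followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header):
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    _flags_word, vni_word = struct.unpack("!II", header[:8])
    return vni_word >> 8


blue_header = vxlan_header(1)
print(blue_header.hex())       # prints: 0800000000000100
print(parse_vni(blue_header))  # prints: 1
```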

Sending traffic between VRFs

This is very simple when using static routes. You can enter any VNI you desire in your static route, so allowing communication between PC21 and PC32 requires only the following configuration, the first route on R2 and the second on R3:

ip route vrf BLUE 10.2.3.0/24 vtep 10.0.0.3 vni 2 router-mac-address 50:01:00:00:00:03

ip route vrf RED 10.1.2.0/24 vtep 10.0.0.2 vni 1 router-mac-address 50:01:00:00:00:02

The simplicity comes from the fact that the VNI is just a number. By telling R2 to add VNI 2 when forwarding the packet, R3 will map it to vrf RED when processing.
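
Put differently, the sender picks the VNI and the receiver looks up the VRF from it. A tiny sketch of that idea follows; the dictionary mirrors the "vxlan vrf ... vni ..." lines in the lab configuration, while the variable and function names are only for illustration.

```python
# VNI-to-VRF mapping, identical on R2, R3 and R4 in this lab.
VNI_TO_VRF = {1: "BLUE", 2: "RED"}

# The leaking static route configured in vrf BLUE on R2.
r2_blue_leak = {
    "vrf": "BLUE",
    "prefix": "10.2.3.0/24",
    "vtep": "10.0.0.3",
    "vni": 2,
}


def receiving_vrf(vni):
    """VRF in which the receiving VTEP routes the decapsulated packet."""
    return VNI_TO_VRF[vni]


# The packet leaves vrf BLUE on R2 but is routed in vrf RED on R3.
print(r2_blue_leak["vrf"], "->", receiving_vrf(r2_blue_leak["vni"]))
# prints: BLUE -> RED
```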

Conclusion L3VPN using static routes

This ends our examination of configuring L3VPN using static routes. While the configuration is simple, scalability is poor. Every time a route is added or removed, it must be added to or removed from all routers in the VPN. If you have 100 routers in the VPN then this will become quite a tedious task. It would be better if the routes were dynamically exchanged using a routing protocol. So next up in the article is looking at how to improve the scalability by using BGP inside each VRF.
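
To get a feel for how quickly the static approach grows, here is a small sketch that generates the full mesh of static routes for one VRF. It is only an illustration of the bookkeeping and assumes one customer prefix per router; the function name and data layout are my own, while the command syntax is the one used in the lab configuration.

```python
def static_routes(sites, vrf, vni):
    """Build the full mesh of 'ip route ... vtep' lines for one VRF.

    sites maps a router name to (prefix, vtep_ip, router_mac).
    Returns a dict of router name -> list of configuration lines.
    """
    config = {}
    for local in sites:
        lines = []
        for remote, (prefix, vtep, mac) in sites.items():
            if remote == local:
                continue  # a router needs no VXLAN route to its own prefix
            lines.append(
                f"ip route vrf {vrf} {prefix} vtep {vtep} "
                f"vni {vni} router-mac-address {mac}"
            )
        config[local] = lines
    return config


blue_sites = {
    "R2": ("10.1.2.0/24", "10.0.0.2", "50:01:00:00:00:02"),
    "R3": ("10.1.3.0/24", "10.0.0.3", "50:01:00:00:00:03"),
    "R4": ("10.1.4.0/24", "10.0.0.4", "50:01:00:00:00:04"),
}
blue = static_routes(blue_sites, "BLUE", 1)
total_lines = sum(len(lines) for lines in blue.values())
print(total_lines)  # prints: 6
```

Three routers need 6 lines per VRF. With 100 routers the same calculation gives 100 x 99 = 9900 lines per VRF, and every new prefix touches 99 routers.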


L3VPN with Per-VRF BGP

Instead of configuring static routes on all routers in a customer VPN, it would be much cleaner for the router to advertise customer routes automatically. This is what we will accomplish in this example by establishing iBGP adjacencies between loopbacks in each VRF.

R2 configuration:

service routing protocols model multi-agent
!
vrf instance BLUE
vrf instance RED
!
interface Ethernet1
   no switchport
   ip address 10.0.12.2/29
   ip ospf network point-to-point
   ip ospf area 0.0.0.0
!
interface Ethernet2
   switchport access vlan 2
   no switchport
   vrf BLUE
   ip address 10.1.2.1/24
!
interface Ethernet3
   switchport access vlan 3
   no switchport
   vrf RED
   ip address 10.2.2.1/24
!
interface Loopback0
   ip address 10.0.0.2/32
!
interface Loopback1
   vrf BLUE
   ip address 10.0.0.2/32
!
interface Loopback2
   vrf RED
   ip address 10.0.0.2/32
!
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan udp-port 4789
   vxlan vrf BLUE vni 1
   vxlan vrf RED vni 2
!
ip routing
ip routing vrf BLUE
ip routing vrf RED
!
ip route vrf BLUE 10.0.0.3/32 vtep 10.0.0.3 vni 1 router-mac-address 50:01:00:00:00:03
ip route vrf BLUE 10.0.0.4/32 vtep 10.0.0.4 vni 1 router-mac-address 50:01:00:00:00:04
ip route vrf RED 10.0.0.3/32 vtep 10.0.0.3 vni 2 router-mac-address 50:01:00:00:00:03
ip route vrf RED 10.0.0.4/32 vtep 10.0.0.4 vni 2 router-mac-address 50:01:00:00:00:04
!
router bgp 65000
   vrf BLUE
      neighbor 10.0.0.3 remote-as 65000
      neighbor 10.0.0.3 update-source Loopback1
      neighbor 10.0.0.4 remote-as 65000
      neighbor 10.0.0.4 update-source Loopback1
      redistribute connected
      !
      address-family ipv4
         neighbor 10.0.0.3 activate
         neighbor 10.0.0.4 activate
   !
   vrf RED
      neighbor 10.0.0.3 remote-as 65000
      neighbor 10.0.0.3 update-source Loopback2
      neighbor 10.0.0.4 remote-as 65000
      neighbor 10.0.0.4 update-source Loopback2
      redistribute connected
      !
      address-family ipv4
         neighbor 10.0.0.3 activate
         neighbor 10.0.0.4 activate
!
router ospf 1
   redistribute connected

R3 configuration:

service routing protocols model multi-agent
!
vrf instance BLUE
vrf instance RED
!
interface Ethernet1
   no switchport
   ip address 10.0.13.3/29
   ip ospf network point-to-point
   ip ospf area 0.0.0.0
!
interface Ethernet2
   no switchport
   vrf BLUE
   ip address 10.1.3.1/24
!
interface Ethernet3
   no switchport
   vrf RED
   ip address 10.2.3.1/24
!
interface Loopback0
   ip address 10.0.0.3/32
!
interface Loopback1
   vrf BLUE
   ip address 10.0.0.3/32
!
interface Loopback2
   vrf RED
   ip address 10.0.0.3/32
!
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan udp-port 4789
   vxlan vrf BLUE vni 1
   vxlan vrf RED vni 2
!
ip routing
ip routing vrf BLUE
ip routing vrf RED
!
ip route vrf BLUE 10.0.0.2/32 vtep 10.0.0.2 vni 1 router-mac-address 50:01:00:e5:e3:6a
ip route vrf BLUE 10.0.0.4/32 vtep 10.0.0.4 vni 1 router-mac-address 50:01:00:00:00:04
ip route vrf RED 10.0.0.2/32 vtep 10.0.0.2 vni 2 router-mac-address 50:01:00:e5:e3:6a
ip route vrf RED 10.0.0.4/32 vtep 10.0.0.4 vni 2 router-mac-address 50:01:00:00:00:04
!
router bgp 65000
   vrf BLUE
      neighbor 10.0.0.2 remote-as 65000
      neighbor 10.0.0.2 update-source Loopback1
      neighbor 10.0.0.4 remote-as 65000
      neighbor 10.0.0.4 update-source Loopback1
      redistribute connected
      !
      address-family ipv4
         neighbor 10.0.0.2 activate
         neighbor 10.0.0.4 activate
   !
   vrf RED
      neighbor 10.0.0.2 remote-as 65000
      neighbor 10.0.0.2 update-source Loopback2
      neighbor 10.0.0.4 remote-as 65000
      neighbor 10.0.0.4 update-source Loopback2
      redistribute connected
      !
      address-family ipv4
         neighbor 10.0.0.2 activate
         neighbor 10.0.0.4 activate
!
router ospf 1
   redistribute connected

R4 configuration:

service routing protocols model multi-agent
!
vrf instance BLUE
vrf instance RED
!
interface Ethernet1
   no switchport
   ip address 10.0.14.4/29
   ip ospf network point-to-point
   ip ospf area 0.0.0.0
!
interface Ethernet2
   no switchport
   vrf BLUE
   ip address 10.1.4.1/24
!
interface Ethernet3
   no switchport
   vrf RED
   ip address 10.2.4.1/24
!
interface Loopback0
   ip address 10.0.0.4/32
!
interface Loopback1
   vrf BLUE
   ip address 10.0.0.4/32
!
interface Loopback2
   vrf RED
   ip address 10.0.0.4/32
!
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan udp-port 4789
   vxlan vrf BLUE vni 1
   vxlan vrf RED vni 2
!
ip routing
ip routing vrf BLUE
ip routing vrf RED
!
ip route vrf BLUE 10.0.0.2/32 vtep 10.0.0.2 vni 1 router-mac-address 50:01:00:e5:e3:6a
ip route vrf BLUE 10.0.0.3/32 vtep 10.0.0.3 vni 1 router-mac-address 50:01:00:00:00:03
ip route vrf RED 10.0.0.2/32 vtep 10.0.0.2 vni 2 router-mac-address 50:01:00:e5:e3:6a
ip route vrf RED 10.0.0.3/32 vtep 10.0.0.3 vni 2 router-mac-address 50:01:00:00:00:03
!
router bgp 65000
   vrf BLUE
      neighbor 10.0.0.2 remote-as 65000
      neighbor 10.0.0.2 update-source Loopback1
      neighbor 10.0.0.3 remote-as 65000
      neighbor 10.0.0.3 update-source Loopback1
      redistribute connected
      !
      address-family ipv4
         neighbor 10.0.0.2 activate
         neighbor 10.0.0.3 activate
   !
   vrf RED
      neighbor 10.0.0.2 remote-as 65000
      neighbor 10.0.0.2 update-source Loopback2
      neighbor 10.0.0.3 remote-as 65000
      neighbor 10.0.0.3 update-source Loopback2
      redistribute connected
      !
      address-family ipv4
         neighbor 10.0.0.2 activate
         neighbor 10.0.0.3 activate
!
router ospf 1
   redistribute connected

Focusing on the R2 configuration, we can see that Loopback1 and Loopback2 were added, one for each VRF. Static routes for neighbor loopbacks were added in each VRF, allowing R2 to reach the loopbacks of R3 and R4 in each VRF via VXLAN-tunneling. Finally, iBGP configuration was added inside each VRF, allowing the connected 10.1.X.X and 10.2.X.X customer routes to be advertised via BGP.
Any connected route is now automatically advertised or withdrawn from BGP thanks to the redistribute connected command. You can also establish BGP adjacencies with your customers, allowing them to dynamically advertise routes into the VRF, giving them more power while you get more time to perform other tasks.

R2#show ip bgp sum vrf all
BGP summary information for VRF BLUE
  Neighbor V AS      Up/Down State   PfxRcd PfxAcc
  10.0.0.3 4 65000  00:05:28 Estab   2      2
  10.0.0.4 4 65000  00:05:13 Estab   2      2

BGP summary information for VRF RED
  Neighbor V AS      Up/Down State   PfxRcd PfxAcc
  10.0.0.3 4 65000  00:04:44 Estab   2      2
  10.0.0.4 4 65000  00:04:42 Estab   2      2

R2#show ip bgp vrf all
VRF BLUE:
      Network     Next Hop LocPref Weight  Path
 * >  10.0.0.2/32 -        -       0       i
 * >  10.0.0.3/32 10.0.0.3 100     0       i
 * >  10.0.0.4/32 10.0.0.4 100     0       i
 * >  10.1.2.0/24 -        -       0       i
 * >  10.1.3.0/24 10.0.0.3 100     0       i
 * >  10.1.4.0/24 10.0.0.4 100     0       i

VRF RED:
      Network     Next Hop LocPref Weight  Path
 * >  10.0.0.2/32 -        -       0       i
 * >  10.0.0.3/32 10.0.0.3 100     0       i
 * >  10.0.0.4/32 10.0.0.4 100     0       i
 * >  10.2.2.0/24 -        -       0       i
 * >  10.2.3.0/24 10.0.0.3 100     0       i
 * >  10.2.4.0/24 10.0.0.4 100     0       i

R2#show ip route vrf all
VRF: default
 C        10.0.0.2/32 is directly connected, Loopback0
 O E2     10.0.0.3/32 [110/1] via 10.0.12.1, Ethernet1
 O E2     10.0.0.4/32 [110/1] via 10.0.12.1, Ethernet1
 C        10.0.12.0/29 is directly connected, Ethernet1
 O        10.0.13.0/29 [110/20] via 10.0.12.1, Ethernet1
 O        10.0.14.0/29 [110/20] via 10.0.12.1, Ethernet1

VRF: BLUE
 C        10.0.0.2/32 is directly connected, Loopback1
 S        10.0.0.3/32 [1/0] via VTEP 10.0.0.3 VNI 1 router-mac 50:01:00:00:00:03
 S        10.0.0.4/32 [1/0] via VTEP 10.0.0.4 VNI 1 router-mac 50:01:00:00:00:04
 C        10.1.2.0/24 is directly connected, Ethernet2
 B I      10.1.3.0/24 [200/0] via 10.0.0.3 VTEP 10.0.0.3 VNI 1 router-mac 50:01:00:00:00:03
 B I      10.1.4.0/24 [200/0] via 10.0.0.4 VTEP 10.0.0.4 VNI 1 router-mac 50:01:00:00:00:04

VRF: RED
 C        10.0.0.2/32 is directly connected, Loopback2
 S        10.0.0.3/32 [1/0] via VTEP 10.0.0.3 VNI 2 router-mac 50:01:00:00:00:03
 S        10.0.0.4/32 [1/0] via VTEP 10.0.0.4 VNI 2 router-mac 50:01:00:00:00:04
 C        10.2.2.0/24 is directly connected, Ethernet3
 B I      10.2.3.0/24 [200/0] via 10.0.0.3 VTEP 10.0.0.3 VNI 2 router-mac 50:01:00:00:00:03
 B I      10.2.4.0/24 [200/0] via 10.0.0.4 VTEP 10.0.0.4 VNI 2 router-mac 50:01:00:00:00:04

This is what the R2 routing output looks like. We can see that BGP adjacencies are active in both VRFs and that BGP is used to share customer routes (10.1.X.Y and 10.2.X.Y) in each VRF.

While this setup scales better than using only static routes, we still rely on static routes for VXLAN connectivity. If a router is added to a VPN, all other routers in that VPN must add a static route for that new router. Additionally, the BGP configuration is per-VRF so it has the same issues; any new router must be added as a BGP neighbor inside the VRF. The more VRFs and neighbors you have, the more BGP sessions there are to configure and maintain. Using a Route-reflector could alleviate some BGP scalability issues here, but RR is not in the scope of this article and it doesn't fully make sense in this per-VRF BGP configuration anyway.

So, while this solution is better and has higher scalability, there is still a need for static routes and BGP-adjacencies are required in each vrf. Can we do better? Let's check out the EVPN example below to find out.


L3VPN with EVPN

This is the most scalable solution for VXLAN L3VPN and probably the model that you want to deploy in your network. Let's look at the config:

R2 configuration:

service routing protocols model multi-agent
!
vrf instance BLUE
vrf instance RED
!
interface Ethernet1
   no switchport
   ip address 10.0.12.2/29
   ip ospf network point-to-point
   ip ospf area 0.0.0.0
!
interface Ethernet2
   switchport access vlan 2
   no switchport
   vrf BLUE
   ip address 10.1.2.1/24
!
interface Ethernet3
   switchport access vlan 3
   no switchport
   vrf RED
   ip address 10.2.2.1/24
!
interface Loopback0
   ip address 10.0.0.2/32
!
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan udp-port 4789
   vxlan vrf BLUE vni 1
   vxlan vrf RED vni 2
!
ip routing
ip routing vrf BLUE
ip routing vrf RED
!
router bgp 65000
   neighbor EVPN peer group
   neighbor EVPN remote-as 65000
   neighbor EVPN update-source Loopback0
   neighbor EVPN send-community
   neighbor 10.0.0.3 peer group EVPN
   neighbor 10.0.0.4 peer group EVPN
   !
   address-family evpn
      neighbor EVPN activate
   !
   vrf BLUE
      rd 65000:1
      route-target import evpn 65000:1
      route-target export evpn 65000:1
      redistribute connected
   !
   vrf RED
      rd 65000:2
      route-target import evpn 65000:2
      route-target export evpn 65000:2
      redistribute connected
!
router ospf 1
   redistribute connected

R3 configuration:

service routing protocols model multi-agent
!
vrf instance BLUE
vrf instance RED
!
interface Ethernet1
   no switchport
   ip address 10.0.13.3/29
   ip ospf network point-to-point
   ip ospf area 0.0.0.0
!
interface Ethernet2
   no switchport
   vrf BLUE
   ip address 10.1.3.1/24
!
interface Ethernet3
   no switchport
   vrf RED
   ip address 10.2.3.1/24
!
interface Loopback0
   ip address 10.0.0.3/32
!
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan udp-port 4789
   vxlan vrf BLUE vni 1
   vxlan vrf RED vni 2
!
ip routing
ip routing vrf BLUE
ip routing vrf RED
!
router bgp 65000
   neighbor EVPN peer group
   neighbor EVPN remote-as 65000
   neighbor EVPN update-source Loopback0
   neighbor EVPN send-community
   neighbor 10.0.0.2 peer group EVPN
   neighbor 10.0.0.4 peer group EVPN
   !
   address-family evpn
      neighbor EVPN activate
   !
   vrf BLUE
      rd 65000:1
      route-target import evpn 65000:1
      route-target export evpn 65000:1
      redistribute connected
   !
   vrf RED
      rd 65000:2
      route-target import evpn 65000:2
      route-target export evpn 65000:2
      redistribute connected
!
router ospf 1
   redistribute connected

R4 configuration:

service routing protocols model multi-agent
!
vrf instance BLUE
vrf instance RED
!
interface Ethernet1
   no switchport
   ip address 10.0.14.4/29
   ip ospf network point-to-point
   ip ospf area 0.0.0.0
!
interface Ethernet2
   no switchport
   vrf BLUE
   ip address 10.1.4.1/24
!
interface Ethernet3
   no switchport
   vrf RED
   ip address 10.2.4.1/24
!
interface Loopback0
   ip address 10.0.0.4/32
!
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan udp-port 4789
   vxlan vrf BLUE vni 1
   vxlan vrf RED vni 2
!
ip routing
ip routing vrf BLUE
ip routing vrf RED
!
router bgp 65000
   neighbor EVPN peer group
   neighbor EVPN remote-as 65000
   neighbor EVPN update-source Loopback0
   neighbor EVPN send-community
   neighbor 10.0.0.2 peer group EVPN
   neighbor 10.0.0.3 peer group EVPN
   !
   address-family evpn
      neighbor EVPN activate
   !
   vrf BLUE
      rd 65000:1
      route-target import evpn 65000:1
      route-target export evpn 65000:1
      redistribute connected
   !
   vrf RED
      rd 65000:2
      route-target import evpn 65000:2
      route-target export evpn 65000:2
      redistribute connected
!
router ospf 1
   redistribute connected

Looking at the BGP configuration for R2, we can see that the number of adjacencies has been drastically reduced. The R2-R4 BGP adjacencies are now defined globally instead of per-VRF, so we have achieved much higher scalability thanks to a much reduced number of BGP adjacencies in the network. Thanks to EVPN, we no longer need our vtep static routes and I will cover why in the routing output section below.

The Route-Distinguisher (rd) and Route-Target configuration has already been covered in my MPLS L3VPN article, so feel free to check that one out if you're curious.

A way to scale this network further is deploying a Route-Reflector. Upgrading R1 to a router with BGP EVPN address-family capabilities would make it a natural RR thanks to its central network location. R2-R4 would then peer with R1 instead of with each other, which would massively improve scalability when you have 100+ BGP-speaking routers in your network.
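
The session counts behind these scalability claims are easy to work out. The sketch below compares a per-VRF iBGP full mesh, a single global EVPN full mesh and a design with one Route-Reflector. It is plain arithmetic under simplifying assumptions: every VRF is present on every edge router, and there is exactly one RR with no redundancy.

```python
def full_mesh_sessions(routers):
    """iBGP full mesh: one session per pair of routers."""
    return routers * (routers - 1) // 2


def per_vrf_full_mesh_sessions(routers, vrfs):
    """Per-VRF BGP: a separate full mesh inside every VRF."""
    return full_mesh_sessions(routers) * vrfs


def route_reflector_sessions(clients):
    """One RR (assumed): each client peers only with the RR."""
    return clients


# This lab: three edge routers (R2-R4) and two VRFs.
print(per_vrf_full_mesh_sessions(3, 2))  # prints: 6
print(full_mesh_sessions(3))             # prints: 3
print(route_reflector_sessions(3))       # prints: 3

# A larger network: 100 edge routers and 100 VRFs.
print(per_vrf_full_mesh_sessions(100, 100))  # prints: 495000
print(full_mesh_sessions(100))               # prints: 4950
print(route_reflector_sessions(100))         # prints: 100
```

In a three-router lab the RR makes no difference, but at 100 routers it takes the session count from 4950 down to 100.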

Let's consider the EVPN routing output:

R2#show ip route vrf all
VRF: default
 C        10.0.0.2/32 is directly connected, Loopback0
 O E2     10.0.0.3/32 [110/1] via 10.0.12.1, Ethernet1
 O E2     10.0.0.4/32 [110/1] via 10.0.12.1, Ethernet1
 C        10.0.12.0/29 is directly connected, Ethernet1
 O        10.0.13.0/29 [110/20] via 10.0.12.1, Ethernet1
 O        10.0.14.0/29 [110/20] via 10.0.12.1, Ethernet1

VRF: BLUE
 C        10.1.2.0/24 is directly connected, Ethernet2
 B I      10.1.3.0/24 [200/0] via VTEP 10.0.0.3 VNI 1 router-mac 50:01:00:00:00:03
 B I      10.1.4.0/24 [200/0] via VTEP 10.0.0.4 VNI 1 router-mac 50:01:00:00:00:04

VRF: RED
 C        10.2.2.0/24 is directly connected, Ethernet3
 B I      10.2.3.0/24 [200/0] via VTEP 10.0.0.3 VNI 2 router-mac 50:01:00:00:00:03
 B I      10.2.4.0/24 [200/0] via VTEP 10.0.0.4 VNI 2 router-mac 50:01:00:00:00:04

R2#show bgp evpn sum
BGP summary information for VRF default
Router identifier 10.0.0.2, local AS number 65000
Neighbor Status Codes: m - Under maintenance
  Neighbor V AS           MsgRcvd   MsgSent  InQ OutQ  Up/Down State   PfxRcd PfxAcc
  10.0.0.3 4 65000             10         9    0    0 00:02:36 Estab   2      2
  10.0.0.4 4 65000             10         9    0    0 00:02:28 Estab   2      2

R2#show bgp evpn rd 65000:1 detail 
BGP routing table information for VRF default
Router identifier 10.0.0.2, local AS number 65000
BGP routing table entry for ip-prefix 10.1.2.0/24, Route Distinguisher: 65000:1
 Paths: 1 available
  Local
    - from - (0.0.0.0)
      Origin IGP, metric -, localpref -, weight 0, valid, local, best, redistributed (Connected)
      Extended Community: 
         Route-Target-AS:65000:1 
         TunnelEncap:tunnelTypeVxlan 
         EvpnRouterMac:50:01:00:00:00:02
      VNI: 1
BGP routing table entry for ip-prefix 10.1.3.0/24, Route Distinguisher: 65000:1
 Paths: 1 available
  Local
    10.0.0.3 from 10.0.0.3 (10.0.0.3)
      Origin IGP, metric -, localpref 100, weight 0, valid, internal, best
      Extended Community: 
         Route-Target-AS:65000:1 
         TunnelEncap:tunnelTypeVxlan 
         EvpnRouterMac:50:01:00:00:00:03
      VNI: 1
BGP routing table entry for ip-prefix 10.1.4.0/24, Route Distinguisher: 65000:1
 Paths: 1 available
  Local
    10.0.0.4 from 10.0.0.4 (10.0.0.4)
      Origin IGP, metric -, localpref 100, weight 0, valid, internal, best
      Extended Community: 
         Route-Target-AS:65000:1 
         TunnelEncap:tunnelTypeVxlan 
         EvpnRouterMac:50:01:00:00:00:04
      VNI: 1

Examining the detailed BGP EVPN output of R2, we can see that the ip-prefix (Type 5) routes advertised by the EVPN address family also carry the router-mac-address as one of the route parameters (EvpnRouterMac). This allows the receiving router to dynamically learn which destination MAC-address to set when forwarding. Another important parameter of the advertised route is the VNI, telling the receiving router what VNI value to set when VXLAN encapsulating the packet. Because the EVPN address family does all of the heavy lifting for us, our static routes can finally be removed!
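
One way to convince yourself that nothing is lost is to note that a single ip-prefix route carries every parameter we previously typed into a static route. The sketch below makes that mapping explicit. The class and function are my own illustration and not an EOS API; the values are copied from the 10.1.3.0/24 entry in the R2 output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvpnIpPrefixRoute:
    """Fields of interest from an EVPN ip-prefix (Type 5) route."""
    prefix: str
    next_hop: str      # VTEP address of the advertising router
    vni: int           # VNI to set when VXLAN-encapsulating
    router_mac: str    # EvpnRouterMac: inner destination MAC-address
    route_target: str


def to_static_route(vrf, route):
    """Render the static route that this EVPN route makes unnecessary."""
    return (
        f"ip route vrf {vrf} {route.prefix} vtep {route.next_hop} "
        f"vni {route.vni} router-mac-address {route.router_mac}"
    )


learned = EvpnIpPrefixRoute(
    prefix="10.1.3.0/24",
    next_hop="10.0.0.3",
    vni=1,
    router_mac="50:01:00:00:00:03",
    route_target="65000:1",
)
print(to_static_route("BLUE", learned))
```

The printed line is identical to the first static route in the original R2 configuration.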

Sending traffic between VRFs

I shared a brief example of route-leaking between VRFs in the chapter on static routes. We can achieve the same result with EVPN, but this time we use multiple import or export statements to install routes into multiple VRFs. This is a simple example of R2 importing BLUE routes into RED and RED routes into BLUE:

SW2#sh run sec router bgp
router bgp 65000
   vrf RED
      rd 65000:2
      route-target import evpn 65000:1
      route-target import evpn 65000:2
      route-target export evpn 65000:2
      redistribute connected

SW2#sh ip route vrf RED
VRF: RED
 B L      10.1.2.0/24 is directly connected (source VRF BLUE), Ethernet2 (egress VRF BLUE)
 B I      10.1.3.0/24 [200/0] via VTEP 10.0.0.3 VNI 1 router-mac 50:01:00:ca:00:03
 B I      10.1.4.0/24 [200/0] via VTEP 10.0.0.4 VNI 1 router-mac 50:01:00:be:00:04
 C        10.2.2.0/24 is directly connected, Ethernet3
 B I      10.2.3.0/24 [200/0] via VTEP 10.0.0.3 VNI 2 router-mac 50:01:00:ca:00:03
 B I      10.2.4.0/24 [200/0] via VTEP 10.0.0.4 VNI 2 router-mac 50:01:00:be:00:04

SW2#sh run sec router bgp
router bgp 65000
   vrf BLUE
      rd 65000:1
      route-target import evpn 65000:1
      route-target import evpn 65000:2
      route-target export evpn 65000:1
      redistribute connected

SW2#show ip route vrf BLUE
VRF: BLUE
 C        10.1.2.0/24 is directly connected, Ethernet2
 B I      10.1.3.0/24 [200/0] via VTEP 10.0.0.3 VNI 1 router-mac 50:01:00:ca:00:03
 B I      10.1.4.0/24 [200/0] via VTEP 10.0.0.4 VNI 1 router-mac 50:01:00:be:00:04
 B L      10.2.2.0/24 is directly connected (source VRF RED), Ethernet3 (egress VRF RED)
 B I      10.2.3.0/24 [200/0] via VTEP 10.0.0.3 VNI 2 router-mac 50:01:00:ca:00:03
 B I      10.2.4.0/24 [200/0] via VTEP 10.0.0.4 VNI 2 router-mac 50:01:00:be:00:04

PC21#traceroute 10.2.2.22
  1 10.1.2.1 4 msec 2 msec 2 msec
  2 10.2.2.1 3 msec 3 msec 2 msec
  3 10.2.2.22 5 msec *  24 msec

In the R2 BLUE->RED Route leaking configuration box we can see that the command route-target import evpn 65000:1 was added, telling R2 to import routes with that route-target into the vrf RED routing table.

We then did a similar thing in the RED->BLUE box where the command route-target import evpn 65000:2 was added. The end result is that PC21 and PC22 are now able to communicate thanks to inter-VRF Route leaking on R2, shown in the traceroute output.

Because we used the import statement, any route leaking stays local to R2. If we had used the export statement, any RED or BLUE route advertised by R2 would be installed into both RED and BLUE VRFs on R2, R3 and R4. This would defeat the purpose of having separate VRFs.
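
The difference between leaking with import and leaking with export can be sketched as a simple set intersection: a route is installed into every VRF whose import list contains at least one of the route-targets attached to the route. This is a simplified model of route-target matching and not the actual BGP implementation; the names below are invented for the example.

```python
def importing_vrfs(route_targets, vrf_imports):
    """Return the VRFs that import a route carrying the given route-targets."""
    return sorted(
        vrf
        for vrf, imports in vrf_imports.items()
        if set(route_targets) & set(imports)
    )


# R2 after the leaking configuration: both VRFs import both route-targets.
r2_imports = {
    "BLUE": {"65000:1", "65000:2"},
    "RED": {"65000:1", "65000:2"},
}
# R3 (and R4) are unchanged.
r3_imports = {"BLUE": {"65000:1"}, "RED": {"65000:2"}}

# A BLUE route exported with only 65000:1 (the import-based leak).
print(importing_vrfs(["65000:1"], r2_imports))  # prints: ['BLUE', 'RED']
print(importing_vrfs(["65000:1"], r3_imports))  # prints: ['BLUE']

# The same route if R2 had instead exported it with both route-targets.
print(importing_vrfs(["65000:1", "65000:2"], r3_imports))  # prints: ['BLUE', 'RED']
```

With the import-based leak only R2 sees the route in both VRFs, while the export-based variant pushes it into both VRFs on every router.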

Conclusion

This ends the second chapter in my VXLAN series. While this setup adds more packet overhead than MPLS L3VPN, I believe VXLAN L3VPN with EVPN to be a highly scalable technology for building Layer-3 VPN topologies. If you're looking at deploying a new network, this would be a valid alternative to running MPLS L3VPN for the simple reason that you don't have to pay extra for any premium MPLS features. That being said, VXLAN features will add a premium of their own, but only for the edge nodes of your network.

Thanks for reading this article. I hope this was informative and at least somewhat entertaining for you. I had a great time composing this article and the lab topologies.

Copyright 2021-2023, Emil Eliasson.
All Rights Reserved.