Multiprotocol Label Switching (MPLS)

Published: 2022-01-02

The MPLS protocol was developed in the 1990s to get the same functionality that ATM offered but over Ethernet. MPLS works by labeling IP packets. This label is then used to forward the packet, instead of the receiving router looking up the destination IP address in the IP header.

One early use case of MPLS was a faster forwarding plane than IP routing, the reason being that routing lookups were performed by the CPU at the time. A router must always find the most specific route, meaning the routing table has to be searched for the longest matching prefix for every packet. As routing tables grew, this process became more and more CPU intensive and did not scale very well.

The MPLS lookup process was made much simpler because the incoming packet label could be directly mapped to an action in the MPLS forwarding table. The router knew exactly which entry in the forwarding table to look at, as opposed to having to search the entire routing table. Nowadays, IP routing is just as fast as MPLS label switching thanks to hardware processing and ASICs.
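To make the difference between the two lookups concrete, here is a minimal Python sketch. The prefixes, labels and function names are made up for illustration, and the naive scan over all prefixes only mimics the idea of a longest-prefix match; it is not how a real router stores its tables.

```python
import ipaddress

# Hypothetical routing table: prefix -> next hop.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): "10.0.0.1",
    ipaddress.ip_network("10.0.0.0/8"): "10.1.1.1",
    ipaddress.ip_network("10.2.0.0/16"): "10.2.2.2",
}

# Hypothetical label table: incoming label -> (outgoing label, next hop).
LABELS = {17: (18, "10.2.2.2")}


def route_lookup(destination):
    """Naive longest-prefix match: test every prefix, keep the most specific."""
    address = ipaddress.ip_address(destination)
    matches = [net for net in ROUTES if address in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


def label_lookup(label):
    """Exact match: the label itself is the key, so one lookup is enough."""
    return LABELS[label]


print(route_lookup("10.2.3.4"))  # three prefixes match, the /16 wins
print(label_lookup(17))
```

The routing lookup has to compare the destination against several candidate prefixes, while the label lookup is a single exact match.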

IP Packet:            MPLS Packet:       
+-----------------+   +-----------------+
| Ethernet Header |   | Ethernet Header |
+-----------------+   +-----------------+
| IPv4 Header     |   | MPLS Header     |
+-----------------+   +-----------------+
| Payload         |   | IPv4 Header     |
+-----------------+   +-----------------+
                      | Payload         |
                      +-----------------+

MPLS Header

This is what the inserted header looks like. It contains four fields, each explained in some detail below. The header is 4 bytes in total, making it a lightweight and efficient header that minimizes the additional MTU required to forward the packet. Multiple MPLS label headers can be stacked on top of each other. The receiving router usually processes only the topmost label header, leaving the inner labels untouched.

+---------+--------+-------+--------+
| Label   | Exp    | S     | TTL    |
| 20 bits | 3 bits | 1 bit | 8 bits |
+---------+--------+-------+--------+

Label field:

This field contains the label value, from 0 up to 1,048,575 (2^20 - 1). Labels 0-15 are reserved:
- Label 0: IPv4 Explicit Null label. Overrides the default PHP behavior.
- Label 2: IPv6 Explicit Null label. Overrides the default PHP behavior.
- Label 3: Implicit Null label. Instruction to pop the label before forwarding.
- Label 7: Entropy Label Indicator (ELI). Used for load balancing, not covered here.

Exp field:

These are called experimental bits but are most often used for QoS. Values range from 0 to 7.

S field:

The Bottom-of-Stack field. Set to 1 if this is the last header in the label stack.

TTL field:

Time-to-Live, a value from 0 to 255 that is decremented on every hop. The IP packet TTL value is usually copied to the MPLS header when the packet is encapsulated.
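The four fields fit into a single 32-bit word, which can be sketched in Python with the standard `struct` module. The function names below are hypothetical and the code only illustrates the bit layout shown above; it is not taken from any real MPLS implementation.

```python
import struct


def pack_mpls_header(label, exp, bottom_of_stack, ttl):
    """Pack the four fields into one 32-bit word in network byte order."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= exp < 2**3:
        raise ValueError("exp must fit in 3 bits")
    if not 0 <= ttl < 2**8:
        raise ValueError("ttl must fit in 8 bits")
    # Layout: label (20 bits) | exp (3 bits) | S (1 bit) | TTL (8 bits)
    word = (label << 12) | (exp << 9) | (int(bottom_of_stack) << 8) | ttl
    return struct.pack("!I", word)


def unpack_mpls_header(data):
    """Split the first 4 bytes back into the four header fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "s": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }


header = pack_mpls_header(label=17, exp=0, bottom_of_stack=True, ttl=64)
print(header.hex())                # 00011140
print(unpack_mpls_header(header))  # {'label': 17, 'exp': 0, 's': 1, 'ttl': 64}
```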

Label Push, Swap and Pop

An MPLS router usually performs one of these three tasks. Labels are pushed onto the packet once it enters the MPLS network. Inside the MPLS network a core router will swap the label when forwarding. When the packet reaches the MPLS network edge, the labels are popped.

Push:

This pushes a label on top of the label stack. If there is no label already, the label is inserted between the Ethernet and IP header. A router may push several labels at once, which is commonly done in MPLS VPN topologies.

Swap:

An incoming label is often swapped when forwarded to the next hop in the path. Only the topmost label in the stack is swapped. Any inner labels are left untouched. The swap happens because the label needs to make sense to the receiving router. The label was most likely advertised from the receiving router using some label signaling protocol. More on that later.

Pop:

Popping a label means to remove it, exposing the next header.
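The three operations can be sketched as simple functions on a label stack. This is only an illustration: the stack is modelled as a Python list with the topmost label at index 0, and the label values are made up.

```python
# The label stack is modelled as a list with the topmost label at index 0.


def push(stack, label):
    """Add a new label on top of the stack."""
    return [label] + stack


def swap(stack, label):
    """Replace only the topmost label; inner labels stay untouched."""
    if not stack:
        raise ValueError("cannot swap on an unlabelled packet")
    return [label] + stack[1:]


def pop(stack):
    """Remove the topmost label, exposing the next header."""
    if not stack:
        raise ValueError("cannot pop on an unlabelled packet")
    return stack[1:]


# Hypothetical values: 30 is an inner (VPN) label, 17 and 18 are outer labels.
stack = push(push([], 30), 17)  # edge router pushes two labels: [17, 30]
stack = swap(stack, 18)         # core router swaps the top label: [18, 30]
stack = pop(stack)              # the top label is popped: [30]
print(stack)
```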

Penultimate Hop Popping, PHP

An important concept in label switching is Penultimate Hop Popping, PHP. When an MPLS packet reaches the MPLS network edge, the edge router must perform routing to forward the packet. Without PHP the edge router would receive the MPLS packet, remove the MPLS labels and then perform routing on the IPv4 header.

With PHP the previous router pops the MPLS header before sending the packet to the edge router. This saves some resources on the edge router as the IPv4/IPv6 header is exposed and the packet can be routed immediately. PHP also has a use case with MPLS VPN where removing the outer transport label exposes the VPN label underneath, allowing the edge router to quickly map the packet to the correct VPN. PHP is the default behavior.

MPLS Forwarding Table

The output below shows an example forwarding table where the forwarding action is quickly found based on the incoming packet label. For example, if a packet is received with label 17, the router swaps the label to 18 and forwards the packet out on Gi2 to 10.2.2.2.

Incoming   Outgoing   Outgoing   Next Hop
Label      Label      Interface
---        ---        ---        ---
16         Pop Label  Gi1        10.1.1.1
17         18         Gi2        10.2.2.2
18         16         Gi3        10.3.3.3
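As a rough illustration, the table above could be modelled as a dictionary keyed on the incoming label. The names `LFIB`, `Entry` and `forward` are made up for this sketch, and an outgoing label of `None` stands for "Pop Label".

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    out_label: Optional[int]  # None means "Pop Label"
    interface: str
    next_hop: str


# The example forwarding table, keyed on the incoming label.
LFIB = {
    16: Entry(None, "Gi1", "10.1.1.1"),
    17: Entry(18, "Gi2", "10.2.2.2"),
    18: Entry(16, "Gi3", "10.3.3.3"),
}


def forward(stack):
    """Look up the topmost label and apply the pop or swap action."""
    entry = LFIB[stack[0]]
    if entry.out_label is None:
        new_stack = stack[1:]                      # pop the top label
    else:
        new_stack = [entry.out_label] + stack[1:]  # swap the top label
    return new_stack, entry.interface, entry.next_hop


print(forward([17]))      # ([18], 'Gi2', '10.2.2.2')
print(forward([16, 30]))  # ([30], 'Gi1', '10.1.1.1')
```

The second call uses a hypothetical inner label 30 to show that only the topmost label is touched.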

Running a BGP-free core

One benefit of the MPLS protocol is that routing state can be reduced. The IPv4 global routing table today contains 930k+ entries and the IPv6 table is around 160k+. If we didn't have MPLS, all routers in our network would be forced to learn and exchange these routes with one another to forward traffic properly. By using MPLS labels, the core routers (P) in the network only need to know how to reach the edge routers (PE). This saves hardware resources, and money, as the core routers only need to maintain a table of a hundred or so internal prefixes, allowing the big bucks to be spent on the edge routers that participate in the global BGP-based Internet. With MPLS we therefore introduce different router types:

P-router:

P is short for Provider. A core router that forwards MPLS labeled traffic between PE-routers. It does not participate in BGP.

PE-router:

PE is short for Provider Edge. This router sits at the edge of the service provider network, participating in BGP. This node connects the customer equipment to the SP network. Whenever the PE sends traffic onto the SP core network (towards a P-router) a label is pushed onto the packet.

CE-router:

CE is short for Customer Edge. May also be called CPE, customer premises equipment. This is the SP router placed inside the customer facility, granting them access to the SP network. This device is often small and cheap and does not participate in MPLS or global BGP. It is usually managed by the service provider, but can sometimes be owned and managed by the customer.

Border/ASBR/Peer-router:

Dedicated border routers are optional, used to connect to other service providers. Smaller SPs use their PEs for this.

Route-Reflector:

While not necessarily part of the forwarding topology, an RR is used to reflect BGP routes between PEs. The alternative would be having all PEs build an iBGP full mesh, which just isn't scalable at a service provider level. The RRs do not forward customer traffic.

The image above shows an example topology. Starting in the middle, we have our P-routers connected in a ring. Our PEs and RRs then connect to one or multiple P-routers. Finally the CE-routers connect to the PE-routers to complete the topology.

Packet walk Example

The topology above shows the sites for customers red and blue. In this example a device behind CE51 wants to communicate with a device behind CE71. Let's follow the packet:

CE51 routes the packet to PE5.
PE5 pushes an MPLS label onto the IP packet and forwards it to P1. The label says the packet destination is PE7.
P1 reads the label and follows the instruction: Swap the label and forward it to P3 towards PE7.
P3 reads the label and follows the instruction: Pop the label and forward only the IP packet to PE7.
PE7 receives the unlabelled IP packet and routes it to CE71.
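The walk above can be replayed with a small simulation. The label values 100 and 200 and the `TABLES` structure are assumptions made for this sketch; the real labels depend on what each router has advertised.

```python
# Per-router label tables; all label values are made up for illustration.
# Each entry maps incoming top label -> (action, outgoing label, next router).
TABLES = {
    "P1": {100: ("swap", 200, "P3")},
    "P3": {200: ("pop", None, "PE7")},
}


def walk(ingress, first_label, first_hop):
    """Follow a packet from the ingress PE until its label stack is empty."""
    hops = [f"{ingress}: push {first_label} -> {first_hop}"]
    stack = [first_label]
    router = first_hop
    while stack:
        action, out_label, next_router = TABLES[router][stack[0]]
        if action == "swap":
            stack[0] = out_label
            hops.append(f"{router}: swap to {out_label} -> {next_router}")
        else:
            stack.pop(0)
            hops.append(f"{router}: pop -> {next_router}")
        router = next_router
    hops.append(f"{router}: route unlabelled IP packet")
    return hops


for hop in walk("PE5", 100, "P1"):
    print(hop)
```

P3 is the penultimate hop here, so it pops the label and PE7 only has to route the plain IP packet.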

LSP and LSR

LSP is short for Label Switched Path. Another word that could be used is tunnel, since the IP packet is tunneled inside the LSP. While a tunnel is often point-to-point (GRE, IPsec), an LSP is more flexible as it can be P2P, MP2P, P2MP or even MP2MP. The type of LSP depends on the label signaling protocol used.

A router along the LSP is an LSR, Label Switching Router. It performs the Push, Swap and Pop forwarding operations instead of routing the IP packet. The router that encapsulates the packet and places it in the LSP is called the Ingress LSR, Headend LSR or Ingress PE. The router at the end of the LSP that decapsulates the packet is called the Egress LSR, Tailend LSR or Egress PE. The routers in between are called Transit LSRs.

Label Signaling Protocols

These are the protocols that dynamically build the LSPs in an SP core network:

LDP

Label Distribution Protocol. A commonly used protocol that cooperates with OSPF or IS-IS to advertise labels for the prefixes that the IGP advertises. LDP and the IGP must be running on the same interfaces for this to work properly. LDP builds MP2P LSPs by default, using the IGP to find the path to the destination PE.

RSVP-TE

Resource Reservation Protocol with Traffic Engineering. This protocol allows you to explicitly set each hop along the LSP to the destination PE, giving the admin granular control of the exact path for this traffic. You can also assign a certain amount of bandwidth to the LSP to help avoid congestion in parts of your network. All in all, a very powerful but complex protocol. RSVP builds P2P LSPs by default.

SR-TE/SPRING-TE

Segment Routing with Traffic Engineering. An extension to OSPF and IS-IS allowing them to advertise labels themselves without having to rely on LDP or RSVP to do that for them. This is a relatively new technology. Most SPs run LDP or RSVP, or both!

BGP Labeled Unicast

The last label signaling protocol in our list, most often used between SPs to label-switch specific VPNs or services between them. This is popular because it keeps each SP's IGP and LDP domains separate.
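To tie the LDP description back to the forwarding table, here is a simplified sketch of how a label signaling protocol and the IGP could combine into forwarding entries for one destination, PE7. The topology, label values and names are assumptions made for illustration, and real LDP involves sessions, label retention and much more than this.

```python
IMPLICIT_NULL = 3  # reserved label: "pop before forwarding to me"

# IGP next hop towards PE7, per router (hypothetical topology).
NEXT_HOP = {"PE5": "P1", "P1": "P3", "P3": "PE7"}

# Local label each router binds to PE7's prefix and advertises to neighbors.
LOCAL_BINDING = {"PE5": 24, "P1": 100, "P3": 200, "PE7": IMPLICIT_NULL}


def build_entry(router):
    """Combine the IGP next hop with the label that next hop advertised."""
    downstream = NEXT_HOP[router]
    out_label = LOCAL_BINDING[downstream]
    action = "pop" if out_label == IMPLICIT_NULL else f"swap to {out_label}"
    return LOCAL_BINDING[router], action, downstream


for name in ("P1", "P3"):
    print(name, build_entry(name))
```

Each router uses the label advertised by its IGP next hop as the outgoing label. Because PE7 advertises Implicit Null, P3 ends up with a pop action, which is the PHP behavior described earlier.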
