
Research on the Application of Equal Cost Multi-Path (ECMP) Technology in Data Center Networks

Published: 2024-09-09 11:56:02

 

ECMP (Equal-Cost Multi-Path routing) is used extensively in the Fabric architecture that is common in data center networks. Its main advantages are better network redundancy and reliability and higher utilization of network resources. However, running a large number of ECMP links can cause problems in specific scenarios. For example, when one ECMP link goes down, the traffic on all links in the ECMP group is re-hashed, which can cause an avalanche in stateful server areas (such as LVS). Multi-level ECMP can also suffer from hash polarization, which leads to link congestion.

This article will analyze the above issues based on the ECMP operation principle and explore how to optimize the use of ECMP.   

Equal-Cost Multipath Routing

 

Equal-cost multi-path routing means that there are multiple paths with equal cost to the same destination address. When the device supports equal-cost routing, Layer 3 traffic sent to that destination IP or destination network segment can be shared across the different paths, which provides load balancing over the network links and fast switchover when a link fails.

 

ECMP Implementation Process

 

Step 1: Selection of HASH Factor

First, when forwarding a data packet, the device queries the routing table to confirm that multiple equal-cost routes exist. It then extracts the key fields involved in the HASH calculation, namely the HASH factors, according to the traffic balancing algorithm currently configured by the user. The HASH factors that can be selected for ECMP traffic balancing are as follows:

 

Traffic load balancing mode | HASH factor
SRC-MAC, DST-MAC, SRC-DST-MAC, SRC-IP | Source IP address (SIP)
DST-IP, SRC-DST-IP | Source and destination IP addresses (SIP + DIP)
SRC-DST-IP-L4PORT | Source and destination IP addresses plus L4 source and destination ports (SIP + DIP + SP + DP)
Enhanced mode | Message fields are extracted based on the load balancing profile; existing hash factors can be defined and configured, or hash perturbation factors can be customized

 

▲ Table 1: HASH factors corresponding to each traffic balancing mode


 

Note: Because ECMP performs Layer 3 forwarding, the system uses the source IP as the HASH factor by default even when the source MAC, destination MAC, or source-destination MAC is configured as the HASH factor. In addition, when the destination IP is configured as the HASH factor, the system uses the source-destination IP by default.
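The mapping from configured mode to the fields that are actually hashed can be sketched in a few lines of Python. This is only an illustration of Table 1 and the note above: the `Packet` class, the `EFFECTIVE_FIELDS` dictionary, and the `hash_factors` function are hypothetical names, not part of any device software, and enhanced mode is left out because its fields depend on the configured profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical container for the fields ECMP may hash on."""
    sip: str
    dip: str
    sp: int
    dp: int


# Fields that are effectively used for each configured mode, following
# Table 1 and the note: MAC-based modes fall back to SIP, and DST-IP
# falls back to SIP + DIP.
EFFECTIVE_FIELDS = {
    "SRC-MAC": ("sip",),
    "DST-MAC": ("sip",),
    "SRC-DST-MAC": ("sip",),
    "SRC-IP": ("sip",),
    "DST-IP": ("sip", "dip"),
    "SRC-DST-IP": ("sip", "dip"),
    "SRC-DST-IP-L4PORT": ("sip", "dip", "sp", "dp"),
}


def hash_factors(mode: str, pkt: Packet) -> dict:
    """Return the packet fields that serve as HASH factors for a mode."""
    return {name: getattr(pkt, name) for name in EFFECTIVE_FIELDS[mode]}


pkt = Packet(sip="10.0.0.1", dip="10.0.1.9", sp=40000, dp=443)
print(hash_factors("DST-MAC", pkt))
print(hash_factors("DST-IP", pkt))
```

With this sketch, configuring DST-MAC yields only the source IP, while configuring DST-IP yields both the source and destination IP.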

 

Step 2: HASH calculation

The HASH factor extracted in Step 1 is passed through the HASH algorithm to calculate the corresponding HASH lb-key (load-balance key). The HASH algorithms supported by ECMP traffic balancing include XOR, CRC, CRC + scrambling, etc.

 

There are many types of HASH algorithms; here we use the XOR algorithm as an example. Under the XOR rule, two input bits that are the same produce 0, and two input bits that are different produce 1. The calculation steps and the result depend on which HASH factor is used.
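As a quick illustration of the XOR rule, the following Python snippet applies the bitwise `^` operator to two 4-bit values. The values `a` and `b` are arbitrary examples chosen for this sketch.

```python
a = 0b1100
b = 0b1010

# Bit by bit: equal bits give 0, different bits give 1.
result = a ^ b
print(format(result, "04b"))

# Two properties that follow from the rule:
print((a ^ 0) == a)  # XOR with 0 leaves a value unchanged
print((a ^ a) == 0)  # XOR of a value with itself is 0
```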

 

1. HASH factor is the source IP address (SIP)

● SIP XOR 0 to get a 32-bit value a;

● XOR the high 16 bits of value a with its low 16 bits to get the 16-bit value b;

● XOR bits 15 to 12 of value b with bits 11 to 8 to get the 4-bit value c;

● Replace bits 11 to 8 of value b with value c to get the value d;

● The lower 10 bits of value d are the lb-key.
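The SIP steps above can be sketched in Python as follows. This is a minimal model of the described XOR folding, not vendor code: the function names `fold_to_lb_key` and `lb_key_sip` are hypothetical, and IPv4 addresses are assumed.

```python
import ipaddress


def fold_to_lb_key(a: int) -> int:
    """Fold a 32-bit value into a 10-bit lb-key using the XOR steps."""
    b = ((a >> 16) ^ a) & 0xFFFF          # high 16 bits XOR low 16 bits
    c = ((b >> 12) ^ (b >> 8)) & 0xF      # bits 15..12 XOR bits 11..8
    d = (b & 0xF0FF) | (c << 8)           # put c into bits 11..8 of b
    return d & 0x3FF                      # keep the lower 10 bits


def lb_key_sip(sip: str) -> int:
    """lb-key when the HASH factor is the source IP only."""
    a = int(ipaddress.IPv4Address(sip)) ^ 0  # SIP XOR 0 is SIP itself
    return fold_to_lb_key(a)


for address in ("10.0.0.1", "192.168.1.10"):
    print(address, lb_key_sip(address))
```

Because only the lower 10 bits are kept, the lb-key always falls in the range 0 to 1023.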

 

2. HASH factor is SIP+DIP or DIP

● DIP XOR SIP to get a 32-bit value a;

● The remaining steps are the same as in the SIP case.
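A sketch of the SIP+DIP case, under the same assumptions as before (IPv4 addresses, hypothetical function names). The only difference from the SIP case is the first step; the folding steps are repeated here so that the example runs on its own.

```python
import ipaddress


def fold_to_lb_key(a: int) -> int:
    """Fold a 32-bit value into a 10-bit lb-key using the XOR steps."""
    b = ((a >> 16) ^ a) & 0xFFFF          # high 16 bits XOR low 16 bits
    c = ((b >> 12) ^ (b >> 8)) & 0xF      # bits 15..12 XOR bits 11..8
    d = (b & 0xF0FF) | (c << 8)           # put c into bits 11..8 of b
    return d & 0x3FF                      # keep the lower 10 bits


def lb_key_sip_dip(sip: str, dip: str) -> int:
    """lb-key when the HASH factor is source and destination IP."""
    a = int(ipaddress.IPv4Address(dip)) ^ int(ipaddress.IPv4Address(sip))
    return fold_to_lb_key(a)


print(lb_key_sip_dip("10.0.0.1", "10.0.1.9"))
print(lb_key_sip_dip("10.0.1.9", "10.0.0.1"))
```

Since XOR is commutative, swapping SIP and DIP gives the same lb-key in this model, so both directions of a conversation produce the same key.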

 

3. HASH factor is SIP+DIP+SP+DP

● SIP XOR DIP to get the 32-bit value a;

● XOR the lower 16 bits of value a with SP to get the 32-bit value b;

● XOR the lower 16 bits of value b with DP to get the 32-bit value c;

● XOR the high 16 bits of value c with its low 16 bits to get the 16-bit value d;

● XOR bits 15 to 12 of value d with bits 11 to 8 to get the 4-bit value e;

● Replace bits 11 to 8 of value d with value e to get the value f;

● The lower 10 bits of value f are the lb-key.
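The four-field case can be sketched the same way. The function name `lb_key_4tuple` is hypothetical, IPv4 addresses are assumed, and the sketch reads "the lower 16 bits are XORed with the port" as leaving the high 16 bits of the 32-bit value unchanged.

```python
import ipaddress


def lb_key_4tuple(sip: str, dip: str, sp: int, dp: int) -> int:
    """lb-key when the HASH factor is SIP + DIP + SP + DP."""
    a = int(ipaddress.IPv4Address(sip)) ^ int(ipaddress.IPv4Address(dip))
    b = a ^ (sp & 0xFFFF)                 # SP folded into the low 16 bits
    c = b ^ (dp & 0xFFFF)                 # DP folded into the low 16 bits
    d = ((c >> 16) ^ c) & 0xFFFF          # high 16 bits XOR low 16 bits
    e = ((d >> 12) ^ (d >> 8)) & 0xF      # bits 15..12 XOR bits 11..8
    f = (d & 0xF0FF) | (e << 8)           # put e into bits 11..8 of d
    return f & 0x3FF                      # keep the lower 10 bits


print(lb_key_4tuple("10.0.0.1", "10.0.1.9", 40000, 443))
print(lb_key_4tuple("10.0.0.1", "10.0.1.9", 40001, 443))
```

Including the ports lets two flows between the same pair of hosts produce different lb-keys, which is not possible with the IP-only factors.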

 

Step 3: Confirm the next forwarding hop

The routing table lookup for the data message yields the corresponding ECMP base value (base-ptr). Once the HASH lb-key has been calculated from the HASH factor, it is taken modulo the number of ECMP next-hop links (Member-count), and the remainder is added to the ECMP base value to obtain the forwarding next-hop index, which determines the next hop used for forwarding.

Calculation formula: Next-hop = (lb-key % Member-count) + base-ptr
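The formula translates directly into code. In the sketch below, `next_hop_index` is a hypothetical function name, and the sample values for `lb_key`, `member_count`, and `base_ptr` are made up for illustration.

```python
def next_hop_index(lb_key: int, member_count: int, base_ptr: int) -> int:
    """Next-hop = (lb-key % Member-count) + base-ptr."""
    if member_count <= 0:
        raise ValueError("member_count must be positive")
    return (lb_key % member_count) + base_ptr


# Example: an ECMP group of 4 members whose entries start at index 100.
print(next_hop_index(lb_key=513, member_count=4, base_ptr=100))
```

The modulo keeps the offset within the group, so the resulting index always lies between base-ptr and base-ptr + Member-count - 1.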

 

The above is the normal ECMP forwarding process, but problems may occur when it runs in specific network environments. Next, we will continue to analyze two common problems encountered by ECMP in data center networks.
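The re-hash effect mentioned in the introduction follows from the modulo in the next-hop formula, and a small simulation makes it visible. This is a simplified model under stated assumptions: the key space is the 1024 possible 10-bit lb-keys, the failed link is the last member of a 4-member group, and the remaining members are simply renumbered 0 to 2. The base-ptr is omitted because it is a constant offset. The function names are hypothetical.

```python
def member_index(lb_key: int, member_count: int) -> int:
    """Offset of the selected member within the ECMP group."""
    return lb_key % member_count


def count_moved(before: int, after: int, key_space: int = 1024) -> int:
    """Count lb-keys whose member offset changes when the group resizes."""
    return sum(
        1
        for key in range(key_space)
        if member_index(key, before) != member_index(key, after)
    )


moved = count_moved(before=4, after=3)
on_failed_link = sum(1 for key in range(1024) if member_index(key, 4) == 3)
print(moved, on_failed_link)
```

In this model, 766 of the 1024 keys change member even though only 256 of them were on the removed link, which is why a single link failure can disturb flows to stateful servers across the whole group.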

 

See more details at:

https://www.ruijienetworks.com/support/tech-gallery/ecmp-technology-in-data-center-networks

 

 
