
native routing mode performance issue

Is there an existing issue for this?

  • I have searched the existing issues

Version

equal or higher than v1.18.3 and lower than v1.19.0

What happened?

I am running the following Cilium version:

# cilium version
cilium-cli: v0.18.7 compiled with go1.25.0 on linux/amd64
cilium image (default): v1.18.1
cilium image (stable): v1.18.3
cilium image (running): 1.18.2

I am deploying Kubernetes on bare-metal hardware in my datacenter, and all my k8s nodes are in a single L2 domain. I have set the following options for native routing mode:

# cilium config view
ipv4-native-routing-cidr    10.233.0.0/16
routing-mode                native
auto-direct-node-routes     true

The routing table on one node looks like this:

# ip route
default via 10.0.72.1 dev bond0.31 proto static
10.0.16.0/21 dev bond0.10 proto kernel scope link src 10.0.22.11
10.0.72.0/22 dev bond0.31 proto kernel scope link src 10.0.72.11
10.233.0.0/24 via 10.233.0.43 dev cilium_host proto kernel src 10.233.0.43
10.233.0.43 dev cilium_host proto kernel scope link
10.233.0.118 dev lxc_health proto kernel scope link
10.233.1.0/24 via 10.0.72.12 dev bond0.31 proto kernel
10.233.2.0/24 via 10.0.72.13 dev bond0.31 proto kernel
10.233.3.0/24 via 10.0.72.21 dev bond0.31 proto kernel
10.233.4.0/24 via 10.0.72.22 dev bond0.31 proto kernel
10.233.5.0/24 via 10.0.72.23 dev bond0.31 proto kernel
10.233.6.0/24 via 10.0.72.24 dev bond0.31 proto kernel
10.233.7.0/24 via 10.0.72.25 dev bond0.31 proto kernel
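
With auto-direct-node-routes enabled, each remote node's pod CIDR should appear as a route via that node's address. A minimal sketch for listing those routes from `ip route` output follows; the helper name `direct_node_routes` is hypothetical, and the `10.233.` prefix and the `cilium_host` exclusion are assumptions taken from the configuration and routing table above.

```shell
# List pod-CIDR routes that point at another node, reading `ip route`
# output on stdin. Assumes the pod CIDR is 10.233.0.0/16 and that the
# local pod CIDR is routed through cilium_host (as in the table above).
direct_node_routes() {
    awk '$1 ~ /^10\.233\./ && $2 == "via" && $5 != "cilium_host" {
        print $1, "->", $3, "(" $5 ")"
    }'
}

# Usage on a node:
#   ip route | direct_node_routes
```

On the node shown above this should print one line per remote node, for example `10.233.1.0/24 -> 10.0.72.12 (bond0.31)`.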

My datacenter MTU is 9212, and I have configured an MTU of 9000 on all node interfaces and tested them with the ping -M do -s 9872 command. I am 100% sure it's not an MTU issue. Every node uses 10G+10G LACP bonding.
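
For reference, a don't-fragment ping only fits in a single packet if the payload is no larger than the MTU minus the IPv4 and ICMP headers. The sketch below computes that value, assuming IPv4 without options (20-byte header) and an 8-byte ICMP header; the helper names are hypothetical and the peer address is just one of the node addresses from the routing table above.

```shell
# Largest ICMP echo payload that fits in one IPv4 packet of the given MTU:
# MTU minus 20 bytes of IPv4 header (no options) minus 8 bytes of ICMP header.
max_icmp_payload() {
    echo $(( $1 - 20 - 8 ))
}

# Print the matching don't-fragment ping command for a peer address.
df_ping_cmd() {
    echo "ping -M do -c 3 -s $(max_icmp_payload "$1") $2"
}

# Example: MTU 9000, peer node 10.0.72.24 (address used for illustration only)
df_ping_cmd 9000 10.0.72.24
```

For an MTU of 9000 this gives a payload of 8972 bytes; a larger -s value combined with -M do would be expected to fail rather than pass.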

I created two pods on two different nodes and ran an iperf3 test between them, which showed a large number of TCP retransmissions.

Traceroute between two pods on these nodes:


POD1# traceroute -n 10.233.6.227
traceroute to 10.233.6.227 (10.233.6.227), 30 hops max, 60 byte packets
 1  10.0.72.23  0.121 ms  0.620 ms *
 2  10.0.72.24  0.861 ms  0.836 ms *
 3  10.233.6.227  0.818 ms * *

Here is the iperf3 result. Note the high retransmit counts in the Retr column:

root@ubuntu-04:/# iperf3 -c 10.233.6.31 -t 120
Connecting to host 10.233.6.31, port 5201
[  5] local 10.233.5.157 port 46350 connected to 10.233.6.31 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.02 GBytes  8.73 Gbits/sec  367   2.36 MBytes
[  5]   1.00-2.00   sec  1.09 GBytes  9.32 Gbits/sec  275   2.93 MBytes
[  5]   2.00-3.00   sec  1.07 GBytes  9.21 Gbits/sec   12   3.02 MBytes
[  5]   3.00-4.00   sec  1.07 GBytes  9.23 Gbits/sec  165   3.02 MBytes
[  5]   4.00-5.00   sec  1.07 GBytes  9.22 Gbits/sec   88   3.02 MBytes
[  5]   5.00-6.00   sec  1.10 GBytes  9.43 Gbits/sec   86   3.02 MBytes
[  5]   6.00-7.00   sec  1.07 GBytes  9.15 Gbits/sec   24   3.02 MBytes
[  5]   7.00-8.00   sec  1.06 GBytes  9.14 Gbits/sec   17   3.02 MBytes
[  5]   8.00-9.00   sec  1.08 GBytes  9.31 Gbits/sec  114   3.02 MBytes
[  5]   9.00-10.00  sec  1.04 GBytes  8.92 Gbits/sec   45   3.02 MBytes
[  5]  10.00-11.00  sec  1.10 GBytes  9.48 Gbits/sec  120   3.02 MBytes
[  5]  11.00-12.00  sec  1.11 GBytes  9.54 Gbits/sec    0   3.02 MBytes
[  5]  12.00-13.00  sec  1.06 GBytes  9.11 Gbits/sec   39   2.53 MBytes
[  5]  13.00-14.00  sec  1.01 GBytes  8.69 Gbits/sec   32   3.02 MBytes
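
To compare runs (native vs. tunnel, pod-to-pod vs. node-to-node) it can help to total the Retr column instead of reading it by eye. A minimal sketch, assuming the default human-readable iperf3 client output shown above; the helper name `sum_retr` is hypothetical.

```shell
# Sum the Retr column of per-interval iperf3 client lines read from stdin.
sum_retr() {
    awk '
        # Interval lines end with the Cwnd unit (e.g. "MBytes"); the final
        # sender/receiver summary lines do not, so they are skipped.
        $NF ~ /Bytes$/ {
            for (i = 1; i < NF; i++) {
                # Retr is the integer field right after the bitrate unit.
                if ($i ~ /bits\/sec$/ && $(i + 1) ~ /^[0-9]+$/) {
                    total += $(i + 1); n++
                }
            }
        }
        END { printf "%d retransmits over %d intervals\n", total, n }
    '
}

# Usage:
#   iperf3 -c 10.233.6.31 -t 120 | sum_retr
```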

If I change the routing mode from native to tunnel, I see better performance with very few TCP retransmissions. I don't understand what is going on here.

If I run iperf directly between the physical worker nodes, I don't see any TCP retransmissions. They only occur between pods running on two different nodes.

How can we reproduce the issue?

I installed Cilium with Helm, using the Kubespray deployment tool.

Cilium Version

# cilium version
cilium-cli: v0.18.7 compiled with go1.25.0 on linux/amd64
cilium image (default): v1.18.1
cilium image (stable): v1.18.3
cilium image (running): 1.18.2


Kernel Version

Ubuntu 22.04 LTS

# uname -a
Linux ubuntu-04 5.15.0-161-generic #171-Ubuntu SMP Sat Oct 11 08:17:01 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux


Kubernetes Version

v1.33.5

Regression

No response

Sysdump

No response

Relevant log output

Anything else?

No response

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct