Clean up stale entries from socket-level reverse NAT map

For the socket-level load balancer, we currently store the reverse NAT information in an LRU hashmap. We add new entries on cgroup/connect but never delete them; instead, we rely on LRU eviction to make room. The map is then looked up on both cgroup/recvmsg and cgroup/getpeername.

In practice, that causes the LRU hashmap to grow until it is nearly full before eviction kicks in. With low-to-moderate node churn and a high number of ClusterIP connections, a nearly full map can become the steady state, leading to more hash collisions and lower lookup performance.

To avoid that unnecessary overhead, we could attach a program to cgroup/sock_release and delete a socket's entries when the socket is released. This attach point should give us a struct bpf_sock context, and we can use bpf_get_socket_cookie to retrieve the socket cookie (needed for the map's key).
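
A minimal sketch of what such a hook could look like, to make the proposal concrete. The map name `revnat_sk_map`, the structs `revnat_key` and `revnat_entry`, their field layout, and `max_entries` are all hypothetical placeholders, not the existing definitions; the sketch assumes the key is made of the socket cookie plus the destination address and port the socket was connected to, and it only covers IPv4.

```c
// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Hypothetical key layout (assumption): socket cookie plus the
 * destination address and port recorded at connect time.
 */
struct revnat_key {
	__u64  cookie;
	__be32 address;
	__be16 port;
	__u16  pad;
};

/* Hypothetical value layout (assumption): original service address/port. */
struct revnat_entry {
	__be32 address;
	__be16 port;
	__u16  rev_nat_index;
};

/* Placeholder LRU map standing in for the socket-level reverse NAT map. */
struct {
	__uint(type, BPF_MAP_TYPE_LRU_HASH);
	__uint(max_entries, 65536); /* arbitrary size for the sketch */
	__type(key, struct revnat_key);
	__type(value, struct revnat_entry);
} revnat_sk_map SEC(".maps");

SEC("cgroup/sock_release")
int sock_release_cleanup(struct bpf_sock *ctx)
{
	struct revnat_key key = {}; /* zero-init so padding matches the stored key */

	key.cookie  = bpf_get_socket_cookie(ctx);
	key.address = ctx->dst_ip4;          /* network byte order */
	key.port    = (__be16)ctx->dst_port; /* network byte order */

	/* A missing entry is not an error: the socket may never have been
	 * translated, or the LRU may already have evicted the entry.
	 */
	bpf_map_delete_elem(&revnat_sk_map, &key);

	return 1; /* always let the release proceed */
}

char LICENSE[] SEC("license") = "GPL";
```

One caveat with this approach: sockets that were never connected (for example UDP sockets that only use sendmsg) have no destination recorded on the socket, so a lookup keyed this way would not find their entries, and those would still be left to LRU eviction.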