Multiple nodes register the same VIP addresses
Describe the bug
I am running a three-node k3s Kubernetes cluster on my homelab. I noticed problems with some applications that use WebSockets: they were getting connection-lost errors. While investigating, I saw that ARP replies for the same VIP were being sent with two different MAC addresses, and looking at the nodes I found that some of the load-balancer addresses are present on two nodes at the same time.
Aragorn node:
# ip addr show dev enp2s0.40 | egrep "192.168.40.102|inet "
inet 192.168.40.52/24 brd 192.168.40.255 scope global enp2s0.40
inet 192.168.40.104/32 scope global deprecated enp2s0.40
inet 192.168.40.111/32 scope global deprecated enp2s0.40
inet 192.168.40.120/32 scope global deprecated enp2s0.40
inet 192.168.40.107/32 scope global deprecated enp2s0.40
inet 192.168.40.115/32 scope global deprecated enp2s0.40
Legolas node:
# ip addr show dev enp2s0.40 | egrep "192.168.40.102|inet "
inet 192.168.40.51/24 brd 192.168.40.255 scope global enp2s0.40
inet 192.168.40.70/32 scope global deprecated enp2s0.40
inet 192.168.40.104/32 scope global deprecated enp2s0.40
inet 192.168.40.102/32 scope global deprecated enp2s0.40
inet 192.168.40.111/32 scope global deprecated enp2s0.40
inet 192.168.40.120/32 scope global deprecated enp2s0.40
inet 192.168.40.106/32 scope global deprecated enp2s0.40
inet 192.168.40.107/32 scope global deprecated enp2s0.40
inet 192.168.40.115/32 scope global deprecated enp2s0.40
Right now the Gimli node doesn't have any VIP, just its own node address. It always happens the same way: one node is fine with no VIPs, and the other two have some of them duplicated.
After removing the IP addresses from the non-leader node the problem went away, but only for a while: after a few hours the addresses were back, and they also come back after a reboot.
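To spot the duplicates without eyeballing each node, the /32 addresses reported by every node can be compared. A minimal sketch (the node names are mine, and collecting the output over ssh is an assumption, not part of kube-vip):

```shell
#!/bin/sh
# find_dups: read "node address" pairs on stdin; print every address
# that appears on more than one node, together with the node list.
find_dups() {
  sort -u | awk '{count[$2]++; nodes[$2] = nodes[$2] " " $1}
                 END {for (a in count) if (count[a] > 1) print a ":" nodes[a]}'
}

# Usage sketch (ssh access to the nodes is assumed); in `ip -o addr`
# output the address is field 4, and kube-vip's VIPs are the /32 ones:
#   for n in aragorn legolas gimli; do
#     ssh "$n" ip -o addr show dev enp2s0.40 |
#       awk -v n="$n" '$4 ~ /\/32$/ {print n, $4}'
#   done | find_dups
```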
To Reproduce
Steps to reproduce the behavior: there are no special steps. I just installed kube-vip, disabled the k3s load-balancer feature, and added a Service of type LoadBalancer.
Expected behavior
The addresses should be kept on just one node: the leader.
Environment (please complete the following information):
- OS/Distro: Debian 13
- Kubernetes Version: v1.32.6+k3s1
- Kube-vip Version: 1.0.1
Kube-vip.yaml:
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  creationTimestamp: null
  labels:
    app.kubernetes.io/name: kube-vip-ds
    app.kubernetes.io/version: v1.0.1
  name: kube-vip-ds
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-vip-ds
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/name: kube-vip-ds
        app.kubernetes.io/version: v1.0.1
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
            - matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: Exists
      containers:
      - args:
        - manager
        env:
        - name: vip_arp
          value: "true"
        - name: port
          value: "6443"
        - name: vip_nodename
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: vip_interface
          value: enp2s0.40
        - name: vip_subnet
          value: "32"
        - name: dns_mode
          value: first
        - name: cp_enable
          value: "true"
        - name: cp_namespace
          value: kube-system
        - name: svc_enable
          value: "true"
        - name: svc_leasename
          value: plndr-svcs-lock
        - name: vip_leaderelection
          value: "true"
        - name: vip_leasename
          value: plndr-cp-lock
        - name: vip_leaseduration
          value: "5"
        - name: vip_renewdeadline
          value: "3"
        - name: vip_retryperiod
          value: "1"
        - name: address
          value: 192.168.40.70
        - name: prometheus_server
          value: :2112
        image: ghcr.io/kube-vip/kube-vip:v1.0.1
        imagePullPolicy: IfNotPresent
        name: kube-vip
        resources: {}
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
            drop:
            - ALL
      hostNetwork: true
      serviceAccountName: kube-vip
      tolerations:
      - effect: NoSchedule
        operator: Exists
      - effect: NoExecute
        operator: Exists
  updateStrategy: {}
Additional context
I am using kube-vip in ARP mode to manage the API server address (which works perfectly and stays on just one node) and the load-balancer IP addresses (the ones giving me problems). I have disabled the servicelb and traefik features in k3s to avoid conflicts.
kube-vip is running as a DaemonSet and all the nodes are aware of the leader node:
Legolas node:
2025/11/06 18:02:05 INFO kube-vip.io version=v1.0.1 build=8409073e7ac0087b475a137aed3434066d5629e4
2025/11/06 18:02:05 INFO starting namespace=kube-system Mode=ARP "Control Plane"=true Services=true
2025/11/06 18:02:05 INFO using node name name=legolas
2025/11/06 18:02:05 INFO prometheus HTTP server started
2025/11/06 18:02:05 INFO Starting Kube-vip Manager with the ARP engine
2025/11/06 18:02:05 INFO Starting UPNP Port Refresher
2025/11/06 18:02:05 INFO Start ARP/NDP advertisement
2025/11/06 18:02:05 INFO beginning services leadership namespace=kube-system "lock name"=plndr-svcs-lock id=legolas
2025/11/06 18:02:05 INFO [ARP manager] starting ARP/NDP advertisement
2025/11/06 18:02:05 INFO cluster membership namespace=kube-system lock=plndr-cp-lock id=legolas
I1106 18:02:05.760967 1 leaderelection.go:257] attempting to acquire leader lease kube-system/plndr-svcs-lock...
I1106 18:02:05.760986 1 leaderelection.go:257] attempting to acquire leader lease kube-system/plndr-cp-lock...
2025/11/06 18:02:11 INFO New leader leader=aragorn
2025/11/06 18:02:11 INFO new leader elected "new leader"=aragorn
I1106 18:02:21.865661 1 leaderelection.go:271] successfully acquired lease kube-system/plndr-svcs-lock
2025/11/06 18:02:21 INFO (svcs) starting services watcher for all namespaces
I1106 18:02:21.893316 1 leaderelection.go:271] successfully acquired lease kube-system/plndr-cp-lock
2025/11/06 18:02:21 INFO New leader leader=legolas
2025/11/06 18:02:21 INFO layer 2 broadcaster starting
2025/11/06 18:02:21 INFO [ARP manager] inserting ARP/NDP instance name=192.168.40.70/32-enp2s0.40
2025/11/06 18:02:21 INFO (svcs) adding VIP ip=192.168.40.104 interface=enp2s0.40 namespace=homelab name=homeassistant-service
2025/11/06 18:02:21 INFO successful add IP
... it continues adding the rest of the VIP addresses on this node.
Gimli node:
2025/11/06 17:59:40 INFO kube-vip.io version=v1.0.1 build=8409073e7ac0087b475a137aed3434066d5629e4
2025/11/06 17:59:40 INFO starting namespace=kube-system Mode=ARP "Control Plane"=true Services=true
2025/11/06 17:59:40 INFO prometheus HTTP server started
2025/11/06 17:59:40 INFO using node name name=gimli
2025/11/06 17:59:40 INFO Starting Kube-vip Manager with the ARP engine
2025/11/06 17:59:40 INFO Start ARP/NDP advertisement
2025/11/06 17:59:40 INFO [ARP manager] starting ARP/NDP advertisement
2025/11/06 17:59:40 INFO Starting UPNP Port Refresher
2025/11/06 17:59:40 INFO beginning services leadership namespace=kube-system "lock name"=plndr-svcs-lock id=gimli
2025/11/06 17:59:40 INFO cluster membership namespace=kube-system lock=plndr-cp-lock id=gimli
I1106 17:59:40.375404 1 leaderelection.go:257] attempting to acquire leader lease kube-system/plndr-svcs-lock...
I1106 17:59:40.375362 1 leaderelection.go:257] attempting to acquire leader lease kube-system/plndr-cp-lock...
2025/11/06 17:59:40 INFO New leader leader=aragorn
2025/11/06 17:59:40 INFO new leader elected "new leader"=aragorn
2025/11/06 18:02:23 INFO New leader leader=legolas
2025/11/06 18:02:23 INFO new leader elected "new leader"=legolas
2025/11/06 18:04:40 INFO [UPNP] Refreshing Instances "number of instances"=0
E1106 18:09:30.654764 1 leaderelection.go:448] error retrieving resource lock kube-system/plndr-svcs-lock: Get "https://10.43.0.1:443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/plndr-svcs-lock?timeout=10s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1106 18:09:37.350741 1 leaderelection.go:448] error retrieving resource lock kube-system/plndr-cp-lock: Get "https://10.43.0.1:443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/plndr-cp-lock?timeout=10s": context deadline exceeded
2025/11/06 18:09:40 INFO [UPNP] Refreshing Instances "number of instances"=0
E1106 18:09:42.646636 1 leaderelection.go:448] error retrieving resource lock kube-system/plndr-svcs-lock: Get "https://10.43.0.1:443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/plndr-svcs-lock?timeout=10s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1106 18:09:48.962571 1 leaderelection.go:448] error retrieving resource lock kube-system/plndr-cp-lock: the server was unable to return a response in the time allotted, but may still be processing the request (get leases.coordination.k8s.io plndr-cp-lock)
E1106 18:10:10.442655 1 leaderelection.go:448] error retrieving resource lock kube-system/plndr-svcs-lock: Get "https://10.43.0.1:443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/plndr-svcs-lock?timeout=10s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1106 18:10:11.253127 1 leaderelection.go:448] error retrieving resource lock kube-system/plndr-cp-lock: Get "https://10.43.0.1:443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/plndr-cp-lock?timeout=10s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2025/11/06 18:14:40 INFO [UPNP] Refreshing Instances "number of instances"=0
2025/11/06 18:19:40 INFO [UPNP] Refreshing Instances "number of instances"=0
2025/11/06 18:24:40 INFO [UPNP] Refreshing Instances "number of instances"=0
Aragorn node:
2025/11/06 18:02:24 INFO kube-vip.io version=v1.0.1 build=8409073e7ac0087b475a137aed3434066d5629e4
2025/11/06 18:02:24 INFO starting namespace=kube-system Mode=ARP "Control Plane"=true Services=true
2025/11/06 18:02:24 INFO using node name name=aragorn
2025/11/06 18:02:24 INFO prometheus HTTP server started
2025/11/06 18:02:24 INFO Starting Kube-vip Manager with the ARP engine
2025/11/06 18:02:24 INFO Start ARP/NDP advertisement
2025/11/06 18:02:24 INFO beginning services leadership namespace=kube-system "lock name"=plndr-svcs-lock id=aragorn
I1106 18:02:24.344981 1 leaderelection.go:257] attempting to acquire leader lease kube-system/plndr-svcs-lock...
2025/11/06 18:02:24 INFO Starting UPNP Port Refresher
2025/11/06 18:02:24 INFO cluster membership namespace=kube-system lock=plndr-cp-lock id=aragorn
I1106 18:02:24.346604 1 leaderelection.go:257] attempting to acquire leader lease kube-system/plndr-cp-lock...
2025/11/06 18:02:24 INFO [ARP manager] starting ARP/NDP advertisement
2025/11/06 18:02:24 INFO New leader leader=legolas
2025/11/06 18:02:24 INFO new leader elected "new leader"=legolas
2025/11/06 18:07:24 INFO [UPNP] Refreshing Instances "number of instances"=0
2025/11/06 18:12:24 INFO [UPNP] Refreshing Instances "number of instances"=0
In the logs I only see the Legolas node adding the addresses but, as I wrote above, the Aragorn node also holds some of them.
I suspect that after a leader election the old leader is not fully removing its addresses (only some of them are duplicated, not all). I could write a script that detects this and removes the addresses from the non-leader node, but that is just an ugly patch for a problem that may well be caused by a misconfiguration on my side.
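For reference, the workaround script I mention would look something like the sketch below. Everything in it is an assumption from my own setup: root access on each node, my interface name enp2s0.40, and that the services lease plndr-svcs-lock holds the leader's node name (which matches the ids in my logs).

```shell
#!/bin/sh
# Ugly workaround sketch: on a node that is NOT the current services
# leader, delete the /32 service VIPs that kube-vip left behind.

# Print the /32 addresses from `ip -o addr show` output on stdin; the
# node's own address is /24, so only the kube-vip VIPs match.
list_vips() {
  awk '$3 == "inet" && $4 ~ /\/32$/ {print $4}'
}

IFACE=enp2s0.40
# Assumption: the lease holderIdentity is the leader's node name.
LEADER=$(kubectl -n kube-system get lease plndr-svcs-lock \
  -o jsonpath='{.spec.holderIdentity}' 2>/dev/null || true)

if [ -n "$LEADER" ] && [ "$(hostname)" != "$LEADER" ]; then
  ip -o addr show dev "$IFACE" | list_vips | while read -r vip; do
    echo "removing stale VIP $vip from $(hostname)"
    ip addr del "$vip" dev "$IFACE"
  done
fi
```

Run from cron on each node, it would keep the non-leaders clean, but it obviously treats the symptom rather than the cause.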
Best regards!