DNS latency when a CoreDNS pod is deleted #129617

Open
@mcorbin

Description


What happened?

Hello,

We noticed that when one of our CoreDNS pods is deleted, some client pods experience latency on their DNS queries.

This happens when the pod is completely deleted from Kubernetes, after the terminating phase. When it happens, all DNS requests from some client pods (not all of them; it seems random) are "stuck" for a few seconds. The stall matches the timeout value in the pod's resolv.conf: 5 seconds by default, and if I set a timeout of 3 seconds in dnsConfig.options in the pod spec, it is at most 3 seconds.
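For reference, the 3-second timeout mentioned above was set on the client pods roughly like this (a sketch against the dns-test deployment used later in this report; any pod spec works the same way):

# Sketch: shortening the client-side resolver timeout to 3 seconds via dnsConfig
# (deployment name and value are only illustrative):
kubectl patch deployment dns-test --type merge -p '
spec:
  template:
    spec:
      dnsConfig:
        options:
        - name: timeout
          value: "3"
'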

You can see on the screenshot below what it looks like on the application side (traces are generated using OpenTelemetry + httptrace). When the CoreDNS pod is removed (not in the terminating phase, but completely removed, so after the lameduck period; we even tried 17 seconds for lameduck), all requests wait for 5 seconds. We can see span durations decreasing because the new requests all wait until the system can send requests again:

[Screenshot: application traces showing requests stalled for 5 seconds after the CoreDNS pod is removed]

We ran tcpdump (tcpdump -w capture.pcap udp port 53) in the pod's network namespace (using nsenter), and we can indeed see that for 5 seconds no DNS requests are visible (the trace timestamps and the Wireshark timestamps match):

[Screenshot: Wireshark capture showing no DNS traffic for 5 seconds, matching the trace timestamps]
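The capture was taken roughly along these lines (the container ID is a placeholder and a containerd runtime is assumed; any way of entering the pod's network namespace works):

# Find the PID of a container in the affected pod, then run tcpdump inside
# that pod's network namespace (<container-id> is a placeholder):
PID=$(crictl inspect --output go-template --template '{{.info.pid}}' <container-id>)
nsenter --target "$PID" --net -- tcpdump -w capture.pcap udp port 53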

We're using Karpenter on our Kubernetes clusters, so CoreDNS pods are destroyed regularly. To mitigate the issue we moved the CoreDNS pods to stable nodes, but the problem occurs again at every node upgrade, so it's not a good long-term solution (it is also more expensive for us to have dedicated nodes for CoreDNS).

What did you expect to happen?

We didn't expect any latency during CoreDNS rollouts.

How can we reproduce it (as minimally and precisely as possible)?

On AWS EKS

A simple kubectl rollout restart -n kube-system deployment coredns is enough to impact our applications.

On Exoscale SKS

I created a 1.31.4 cluster (and also reproduced with kube-proxy 1.32.0 on it) with 5 CoreDNS replicas, then deployed an application that generates DNS traffic (it's the only app running on the cluster):

package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"os"
	"strconv"
	"time"
)

// resolve looks up the IP addresses for domain using the Go default resolver.
func resolve(ctx context.Context, domain string) ([]net.IP, error) {
	addrs, err := net.DefaultResolver.LookupIPAddr(ctx, domain)
	if err != nil {
		return nil, err
	}
	result := make([]net.IP, len(addrs))
	for i, ia := range addrs {
		result[i] = ia.IP
	}
	return result, nil
}

func main() {
	domain := os.Getenv("DOMAIN")
	if domain == "" {
		panic(errors.New("DOMAIN env var is empty"))
	}
	parallelism, err := strconv.Atoi(os.Getenv("PARALLELISM"))
	if err != nil {
		panic(err)
	}
	interval, err := strconv.Atoi(os.Getenv("INTERVAL"))
	if err != nil {
		panic(err)
	}

	// Start PARALLELISM workers; each one resolves DOMAIN every INTERVAL milliseconds and logs the duration.
	for i := 0; i < parallelism; i++ {
		ticker := time.NewTicker(time.Duration(interval) * time.Millisecond)
		go func() {
			for {
				select {
				case <-ticker.C:
					ctx, cancel := context.WithTimeout(context.Background(), 7*time.Second)
					start := time.Now().UnixMilli()
					_, err := resolve(ctx, domain)
					cancel()
					end := time.Now().UnixMilli()
					duration := end - start
					if err != nil {
						fmt.Printf("%d: resolved in %d milliseconds with error: %s\n", start, duration, err.Error())
					} else {
						fmt.Printf("%d: resolved in %d milliseconds\n", start, duration)
					}

				}
			}
		}()
	}
	time.Sleep(24000 * time.Second)
}

I then deploy this program using the following Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dns-test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: dns-test
  template:
    metadata:
      labels:
        app: dns-test
    spec:
      containers:
      - name: dns
        image: mcorbin/dnstest:0.0.3
        resources:
          limits:
            memory: "300Mi"
          requests:
            cpu: "0.5"
            memory: "300Mi"
        env:
          - name: DOMAIN
            value: "metrics-server.kube-system.svc.cluster.local."
          - name: PARALLELISM
            value: "4"
          - name: INTERVAL
            value: "50"

From time to time I can see slow DNS queries after a rollout, similar to what I see on EKS:

1736868482815: resolved in 5003 milliseconds
1736868482815: resolved in 5003 milliseconds
1736868482815: resolved in 5003 milliseconds
1736868482815: resolved in 5003 milliseconds

Anything else we need to know?

We already investigated a lot of things:

  • Increased the lameduck option on CoreDNS to 17 seconds: no change.
  • It's not a CoreDNS performance issue: metrics are good and there is no latency at all, which we verified by enabling debug logs.
  • It's not a kube-proxy reconciliation latency issue: kube-proxy logs/metrics are good and endpoints are correctly updated (one way to watch this is shown after this list).
  • We're mostly AWS EKS users, but it seems we're also able to reproduce the issue on the Exoscale SKS offering.
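One way to watch the endpoint updates during a rollout, as mentioned in the list above (assuming the usual kube-dns service name for CoreDNS):

# EndpointSlices carry the service-name label; addresses of deleted CoreDNS pods
# should disappear from here almost immediately during a rollout:
kubectl -n kube-system get endpointslices -l kubernetes.io/service-name=kube-dns -w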

I suspect a conntrack issue when conntrack entries are removed by kube-proxy. I indeed noticed that manually cleaning the conntrack entries for CoreDNS IPs caused the same symptoms.
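The manual cleanup was something along these lines (a sketch; run as root on a node, with a placeholder CoreDNS pod IP):

# List the UDP conntrack entries whose original destination is a CoreDNS pod IP:
conntrack -L -p udp --orig-dst 10.0.1.23

# Delete them; clients whose DNS flows were pinned to that pod then stall for the
# resolv.conf timeout, just like after a real pod deletion:
conntrack -D -p udp --orig-dst 10.0.1.23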

Kubernetes version

We reproduced the issue on several Kubernetes versions/cloud providers:

On AWS EKS:

Client Version: v1.30.7
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.8-eks-2d5f260

On Exoscale SKS:

Client Version: v1.30.7
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.31.3

I also reproduced on Exoscale SKS with server v1.31.3 and kube-proxy v1.32.0 to get this fix.

The AWS EKS Service Team also told us that they can reproduce the issue on their side on v1.32.0 (not yet released to users).

Cloud provider

AWS EKS, Exoscale SKS

OS version

NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
SUPPORT_END="2025-06-30"

Install tools

Both cases use kube-proxy with iptables mode.
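For completeness, the active proxy mode can be checked directly on a node through kube-proxy's metrics port (10249 by default):

# Run on a node (or from a host-network debug pod); this should print "iptables":
curl -s http://localhost:10249/proxyMode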

Container runtime (CRI) and version (if applicable)

No response

Related plugins (CNI, CSI, ...) and versions (if applicable)

No response

Metadata

Assignees: No one assigned

Labels: kind/bug, needs-sig, needs-triage
