Static CPU Manager can fail with UnexpectedAdmissionError with init-containers requesting integer CPUs #112228

Open
@bpineau

Description

What happened?

The static CPU manager reserves full CPU cores for the init containers of Guaranteed pods that request integer CPUs.

It supports reusing those reserved CPU cores for the main containers, but does not enforce (or even favour) their reuse. As a result, CPU core(s) can be left reserved for an init container while the main container uses a different CPU set.

The example deployment provided below can lead to a topology like this:

$ jq . /var/lib/kubelet/cpu_manager_state
{
  "policyName": "static",
  "defaultCpuSet": "0",
  "entries": {
    "3ba83abb-7ceb-45dc-96ec-556fe1640954": {
      "init": "4",              # reserved yet not reused (not in "1,5"), i.e. a leaked CPU
      "main": "1,5"
    },
    "f2ee3f83-3839-4d19-aa2d-f1b0775b18ca": {
      "init": "2",
      "main": "2,6"
    },
    "f8436276-f7e3-4b90-8824-ccef1313df16": {
      "init": "3",
      "main": "3,7"
    }
  },
  "checksum": 3949393128
}
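The leak visible in the state file above can be checked mechanically. The following is a minimal sketch, not part of the kubelet: `parse_cpuset` and `find_leaked_init_cpus` are hypothetical helpers, and the set of init container names (`"init"` here) is an assumption that has to come from the pod spec, because the state file does not record which containers are init containers.

```python
import json


def parse_cpuset(text):
    """Parse a Linux cpuset list such as "1,5" or "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def find_leaked_init_cpus(state, init_names=("init",)):
    """Return {pod_uid: sorted CPU ids held by init containers but not reused
    by any other container of the same pod}.

    init_names is an assumption supplied by the caller (taken from the pod spec).
    """
    leaked = {}
    for pod_uid, containers in state.get("entries", {}).items():
        init_cpus, app_cpus = set(), set()
        for name, cpuset in containers.items():
            target = init_cpus if name in init_names else app_cpus
            target.update(parse_cpuset(cpuset))
        extra = init_cpus - app_cpus
        if extra:
            leaked[pod_uid] = sorted(extra)
    return leaked


# Same data as the state file shown above, without the inline annotation.
STATE = json.loads("""
{
  "policyName": "static",
  "defaultCpuSet": "0",
  "entries": {
    "3ba83abb-7ceb-45dc-96ec-556fe1640954": {"init": "4", "main": "1,5"},
    "f2ee3f83-3839-4d19-aa2d-f1b0775b18ca": {"init": "2", "main": "2,6"},
    "f8436276-f7e3-4b90-8824-ccef1313df16": {"init": "3", "main": "3,7"}
  },
  "checksum": 3949393128
}
""")

if __name__ == "__main__":
    # On a real node, load /var/lib/kubelet/cpu_manager_state instead of STATE.
    print(find_leaked_init_cpus(STATE))
```

Run against the data above, this reports only the first pod, with CPU 4 as the leaked core; the other two pods reuse their init container's CPU in the main container.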

Kubelet and the Kubernetes scheduler then disagree on the resources remaining on that node.

Pods might get scheduled to that node and then be rejected with an UnexpectedAdmissionError (Pod Allocate failed due to not enough cpus available to satisfy request, which is unexpected).
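To make the disagreement concrete, here is a small sketch of the two accounting views. It is an illustration only: the function names are hypothetical, and the node shape (8 CPUs, CPU 0 reserved for the system, so 7 allocatable) is an assumption chosen to be consistent with the state file above, not something stated in this report.

```python
def scheduler_free_cpus(allocatable_cpus, pod_cpu_requests):
    """Scheduler view: allocatable CPUs minus the sum of per-pod CPU requests."""
    return allocatable_cpus - sum(pod_cpu_requests)


def kubelet_free_exclusive_cpus(default_cpuset, reserved_cpus):
    """Kubelet static-policy view: CPUs still in the shared pool that are not
    reserved for the system are the only ones left for exclusive assignment."""
    return set(default_cpuset) - set(reserved_cpus)


# Assumed node: 8 CPUs, CPU 0 reserved, hence 7 allocatable (hypothetical).
ALLOCATABLE = 7
RESERVED = {0}

# Three replicas of the example pod, each counted as 2 CPUs by the scheduler.
POD_REQUESTS = [2, 2, 2]

# "defaultCpuSet": "0" in the state file: only CPU 0 is left in the shared pool.
DEFAULT_CPUSET = {0}

new_pod_request = 1
fits_for_scheduler = scheduler_free_cpus(ALLOCATABLE, POD_REQUESTS) >= new_pod_request
fits_for_kubelet = len(kubelet_free_exclusive_cpus(DEFAULT_CPUSET, RESERVED)) >= new_pod_request

if __name__ == "__main__":
    print("scheduler admits:", fits_for_scheduler)
    print("kubelet can allocate:", fits_for_kubelet)
```

Under these assumptions the scheduler still sees one free CPU and places a 1-CPU Guaranteed pod on the node, while the kubelet has no CPU left to assign exclusively, which is the situation that surfaces as UnexpectedAdmissionError.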

What did you expect to happen?

The static CPU manager should reserve at most max(sum_containers_requests, max(among init containers requests)) CPUs per pod (as expected by the scheduler), so that pods scheduled later can start.
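The expected bound can be written out as a short function. This is a sketch of the formula quoted above, with a hypothetical function name; it deliberately ignores pod overhead and restartable (sidecar) init containers, which are outside the scope of this report.

```python
def effective_cpu_request(init_requests, container_requests):
    """max(sum of main container requests, largest init container request).

    Init containers run one at a time before the main containers start, so
    only the largest of them has to fit, not their sum.
    """
    return max(sum(container_requests), max(init_requests, default=0))


# The example pod from this report: one init container requesting 1 CPU and
# one main container requesting 2 CPUs.
expected = effective_cpu_request(init_requests=[1], container_requests=[2])

# In the leaked case, the first pod in the state file holds CPUs 4, 1 and 5.
held_by_leaking_pod = len({4} | {1, 5})

if __name__ == "__main__":
    print("expected CPUs per pod:", expected)
    print("CPUs held by the leaking pod:", held_by_leaking_pod)
```

For the example pod the bound is 2 CPUs, yet the leaking pod holds 3, so the node loses one CPU that the scheduler still counts as free.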

How can we reproduce it (as minimally and precisely as possible)?

Update the nodeSelector of the deployment below to target a node with cpuManagerPolicy: static; then inspect /var/lib/kubelet/cpu_manager_state for leaked "init" CPU cores (or schedule more pods on that node, and see them rejected with UnexpectedAdmissionError).

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: tdeploy
  name: tdeploy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: tdeploy

  template:
    metadata:
      labels:
        app: tdeploy

    spec:
      nodeSelector:
        kubernetes.io/hostname: TEST-NODE-NAME-GOES-HERE

      initContainers:
      - name: init
        image: kubernetes/pause:go

        resources:
          limits:
            cpu: 1
            memory: 1Gi
          requests:
            cpu: 1
            memory: 1Gi

      containers:
      - name: main
        image: kubernetes/pause:go

        resources:
          limits:
            cpu: 2
            memory: 1Gi
          requests:
            cpu: 2
            memory: 1Gi

Anything else we need to know?

No response

Kubernetes version

$ kubectl version  # also tested with a 1.25 kubelet
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.9-dd.1", GitCommit:"d9705166e190927de148edae148bf46471c7f8d5", GitTreeState:"clean", BuildDate:"2022-03-07T11:53:47Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

Any (irrelevant)

OS version

Any (Linux)

Install tools

Container runtime (CRI) and version (if applicable)

Tested and reproduced with Containerd 1.5.4 and 1.6.8.

Related plugins (CNI, CSI, ...) and versions (if applicable)

Metadata

Labels

kind/bug: Categorizes issue or PR as related to a bug.
priority/backlog: Higher priority than priority/awaiting-more-evidence.
sig/node: Categorizes an issue or PR as relevant to SIG Node.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Projects

Status

Triaged