[WIP][FG:InPlacePodVerticalScaling] Fix Static CPU management policy alongside InPlacePodVerticalScaling #129719


Open — wants to merge 8 commits into master from policy_static
Conversation


@esotsal (Contributor) commented Jan 20, 2025

What type of PR is this?

/kind bug

What this PR does / why we need it:

This change is needed to resize the allocated CPUs of a Guaranteed QoS class Pod with integer CPU requests, without a restart, when the static CPU management policy is used alongside InPlacePodVerticalScaling.

The current PR retains the CPUs "promised" to the container when it started.

Which issue(s) this PR fixes:

The issue is mentioned in the last bullet of https://kubernetes.io/blog/2023/05/12/in-place-pod-resize-alpha/#known-issues.
I wasn't able to find a Kubernetes issue, so I am posting this PR following the "I found it, I fix it" ethos after a discussion in Slack.

Update: Attempt to fix #127262

Special notes for your reviewer:

[Update] Latest demo screencast (14th April 2025):

Screencast.from.2025-04-14.10-02-45.webm

This PR replaces #123319; this was needed due to a company transfer.

This PR also includes a merge from esotsal#9, to test the proposals and review in one common PR. Thanks @Chunxia202410 for the contributions.

  • A proposal for CPU allocated strategy when Pod scale up and down
  • ~~Add mustKeepCPUs interface~~ (Update 14th April 2025: replaced with a local checkpoint solution, as per sig-node meeting decisions)

Does this PR introduce a user-facing change?

Fixed an issue where static CPU management policy would not work alongside in-place
vertical scaling of a Pod.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 20, 2025
@k8s-ci-robot k8s-ci-robot added area/e2e-test-framework Issues or PRs related to refactoring the kubernetes e2e test framework area/kubelet area/test sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 20, 2025

esotsal commented Jan 20, 2025

@Chunxia202410, @hshiina, @ffromani, @AnishShah, @tallclair, @SergeyKanzhelev, @vinaykul: I moved the old PR here due to a company transfer; now I can continue working on this. I will prioritize updating the commit with the recent changes in InPlacePodVerticalScaling and the proposals shared by Chunxia202410.


linux-foundation-easycla bot commented Feb 5, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. do-not-merge/contains-merge-commits Indicates a PR which contains merge commits. sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 5, 2025

esotsal commented Feb 5, 2025

Thanks @Chunxia202410 for the contributions; I merged your two PRs to check the tests. I have some questions about the second PR regarding the strategy; I thought it best to discuss them here. I will ask later this week, as I need to do some more tests.

Thanks for the API PR; it seems to be one of the options discussed.

I will try to update the tests tomorrow or by the end of this week, to make sure the test coverage is in place and passing before the Beta in v1.33, so we have a point of reference.

We will need to raise the final solution (or solutions) in SIG Node after v1.33 is finished and this PR has been refactored against the ongoing InPlacePodVerticalScaling KEP changes.

Thanks again for your PRs.

@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Feb 5, 2025

esotsal commented May 2, 2025

/retest

@esotsal esotsal force-pushed the policy_static branch 3 times, most recently from e6f5d5a to dc25a5a Compare May 5, 2025 13:09

esotsal commented May 5, 2025

> how much of this work is intrinsically tied to VPA vs how much can be extracted into a separate PR (even a KEP?), for example:
>
> 1. changing the cpumanager logic to extend allocation, taking the existing cpuset into account as a (pseudo?) affinity hint
> 2. changing the checkpoint layout to take into account a base set + extension set(s) and compute an "effective cpu set"
> 3. redesigning the cpu accumulator in general

Hi @ffromani and @kad, please take another look at this PR. I have followed the promised local-solution approach, as requested during the sig-node meeting. Francesco, I don't think it adds value to split this into the three issues listed above, and I am not sure whether a KEP update is needed for this fix, but I leave that to you and kad to decide.

PS: I have kept the non-API parts of @Chunxia202410's 14df31c commit for the moment, but depending on the #129719 (comment) decision I might revert it fully; in that case I will revert and then squash the commits together to make rebase and review easier.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 6, 2025
@@ -438,7 +562,7 @@ func (p *staticPolicy) allocateCPUs(s state.State, numCPUs int, numaAffinity bit
 		numAlignedToAlloc = numCPUs
 	}

-	allocatedCPUs, err := p.takeByTopology(alignedCPUs, numAlignedToAlloc)
+	allocatedCPUs, err := p.takeByTopology(alignedCPUs, numAlignedToAlloc, reusableCPUsForResize, mustKeepCPUsForResize)

Is there a possibility that mustKeepCPUsForResize (and reusableCPUsForResize?) is not a subset of alignedCPUs? It may be worth adding a check and throwing an error if it is not; otherwise we might end up leaking CPUs.
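A sketch of the guard suggested here, with hypothetical helper names and plain int slices standing in for cpuset.CPUSet (the real code could simply use cpuset.CPUSet.IsSubsetOf):

```go
package main

import "fmt"

// subset reports whether every CPU in a is also in b; a tiny stand-in
// for cpuset.CPUSet.IsSubsetOf.
func subset(a, b []int) bool {
	in := map[int]bool{}
	for _, c := range b {
		in[c] = true
	}
	for _, c := range a {
		if !in[c] {
			return false
		}
	}
	return true
}

// validateResizeSets is a hypothetical guard for the concern above: fail
// fast if mustKeepCPUsForResize or reusableCPUsForResize strays outside
// alignedCPUs, instead of silently leaking CPUs later.
func validateResizeSets(aligned, mustKeep, reusable []int) error {
	if !subset(mustKeep, aligned) {
		return fmt.Errorf("mustKeepCPUsForResize %v is not a subset of aligned CPUs %v", mustKeep, aligned)
	}
	if !subset(reusable, aligned) {
		return fmt.Errorf("reusableCPUsForResize %v is not a subset of aligned CPUs %v", reusable, aligned)
	}
	return nil
}

func main() {
	fmt.Println(validateResizeSets([]int{0, 1, 2, 3}, []int{1, 2}, []int{3})) // → <nil>
	fmt.Println(validateResizeSets([]int{0, 1}, []int{5}, nil))               // → error
}
```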

 	klog.InfoS("AllocateCPUs", "numCPUs", numCPUs, "socket", numaAffinity)

-	allocatableCPUs := p.GetAvailableCPUs(s).Union(reusableCPUs)
+	allocatableCPUs := cpuset.New()

Maybe worth exploring an alternate approach here.

Instead of combining everything (reusableCPUsForResize, available CPUs) into allocatableCPUs and then attempting allocation, we could take a more tiered approach:

  1. Attempt allocation from mustKeepCPUsForResize.
  2. Any remainder is allocated from reusableCPUsForResize.
  3. Then from NUMA-aligned CPUs.
  4. Then from other NUMA nodes.

With this approach, would the takeByTopology() function not have to change?

klog.InfoS("Regenerating TopologyHints for CPUs already allocated", "pod", klog.KObj(pod), "containerName", container.Name)
return map[string][]topologymanager.TopologyHint{
string(v1.ResourceCPU): p.generateCPUTopologyHints(allocated, cpuset.CPUSet{}, requested),
}
}

I am still not sure this is handled.
Do we throw an error if mustKeepCPUsForResize is not used during Allocate()? mustKeepCPUsForResize might not align with the new NUMA hint we generate here.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 13, 2025
esotsal and others added 8 commits May 16, 2025 15:05
Use new topology.Allocation struct (a CPU set plus
alignment metadata) instead of CPU set, due to rebase.

Remove a duplicate, unnecessary SetDefaultCPUSet call, as per
review comment.
- Revert introduction of API env mustKeepCPUs
- Replace mustKeepCPUs with a local checkpoint "promised"
- Introduce "promised" in the CPUManagerCheckpointV3 format
- Add logic, refactor with Beta candidate
- Fix lint issues
- TODO improve/align resize tests; go through testing, corner cases
- TODO improve CPUManagerCheckpointV3 tests
- TODO address code review/feedback
- TODO check init containers
- TODO check migration from v2 to v3 CPU Manager checkpoint
- TODO check kubectl failure when prohibited; can this be done earlier?
- TODO update CPU Manager tests to use the refactored cpu_manager_test
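A hedged sketch of what the v2-to-v3 checkpoint migration mentioned in the TODOs could look like. The struct fields are simplified stand-ins for the real CPUManagerCheckpoint types in pkg/kubelet/cm/cpumanager/state (which also carry PolicyName and a Checksum); the "promised" field follows the commit message above, and the seeding strategy is an assumption.

```go
package main

import "fmt"

// checkpointV2 is a simplified stand-in for the v2 CPU Manager checkpoint.
type checkpointV2 struct {
	DefaultCPUSet string
	Entries       map[string]map[string]string // pod UID -> container -> cpuset string
}

// checkpointV3 adds a "promised" map so the kubelet can restore the CPUs
// promised at container start after a restart (field name assumed).
type checkpointV3 struct {
	DefaultCPUSet string
	Entries       map[string]map[string]string
	Promised      map[string]map[string]string
}

// migrateV2toV3 seeds "promised" from the current assignments, since a v2
// checkpoint has no record of what was originally promised.
func migrateV2toV3(v2 checkpointV2) checkpointV3 {
	promised := make(map[string]map[string]string, len(v2.Entries))
	for pod, ctrs := range v2.Entries {
		promised[pod] = make(map[string]string, len(ctrs))
		for ctr, cpus := range ctrs {
			promised[pod][ctr] = cpus
		}
	}
	return checkpointV3{DefaultCPUSet: v2.DefaultCPUSet, Entries: v2.Entries, Promised: promised}
}

func main() {
	v2 := checkpointV2{
		DefaultCPUSet: "0-1,4-15",
		Entries:       map[string]map[string]string{"pod-a": {"ctr": "2-3"}},
	}
	v3 := migrateV2toV3(v2)
	fmt.Println(v3.Promised["pod-a"]["ctr"]) // → 2-3
}
```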

k8s-ci-robot commented May 16, 2025

@esotsal: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-e2e-capz-windows-master | 9c411d0 | link | false | /test pull-kubernetes-e2e-capz-windows-master |
| pull-kubernetes-node-kubelet-serial-containerd-sidecar-containers | 9c411d0 | link | false | /test pull-kubernetes-node-kubelet-serial-containerd-sidecar-containers |
| pull-kubernetes-node-kubelet-serial-crio-cgroupv2 | 9c411d0 | link | false | /test pull-kubernetes-node-kubelet-serial-crio-cgroupv2 |
| pull-kubernetes-node-kubelet-serial-podresources | 9c411d0 | link | false | /test pull-kubernetes-node-kubelet-serial-podresources |
| pull-kubernetes-node-kubelet-serial-containerd-alpha-features | 9c411d0 | link | false | /test pull-kubernetes-node-kubelet-serial-containerd-alpha-features |
| pull-kubernetes-node-kubelet-serial-containerd | 9c411d0 | link | false | /test pull-kubernetes-node-kubelet-serial-containerd |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
area/e2e-test-framework Issues or PRs related to refactoring the kubernetes e2e test framework area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: Needs Triage
Status: Issues - In progress
Status: In Progress
Development

Successfully merging this pull request may close these issues.

[FG:InPlacePodVerticalScaling] Static CPU management policy alongside InPlacePodVerticalScaling
7 participants