Remove LRU cache from ResourceQuota admission plugin #129998

AwesomePatrol · Feb 6, 2025

/kind bug
/kind regression

What this PR does / why we need it:

This PR mitigates the problem of high latency caused by consistent reads made in Admission/Validation of ResourceQuota. LISTs made with Informer provide sufficient consistency guarantees. More details are available in the linked bug report.
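To illustrate the idea, here is a minimal, standard-library-only sketch of reading quotas from an in-memory, namespace-indexed store instead of issuing a LIST per request. The `quota` and `quotaCache` types and their methods are hypothetical stand-ins for the informer's indexer, not the plugin's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// quota is a hypothetical, trimmed-down stand-in for a ResourceQuota object.
type quota struct {
	Namespace, Name string
}

// quotaCache is a hypothetical namespace-indexed store standing in for the
// informer's indexer. Events keep it current; reads never leave memory.
type quotaCache struct {
	mu          sync.RWMutex
	byNamespace map[string]map[string]quota
}

func newQuotaCache() *quotaCache {
	return &quotaCache{byNamespace: map[string]map[string]quota{}}
}

// upsert applies an add or update event to the store.
func (c *quotaCache) upsert(q quota) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.byNamespace[q.Namespace] == nil {
		c.byNamespace[q.Namespace] = map[string]quota{}
	}
	c.byNamespace[q.Namespace][q.Name] = q
}

// list returns the quotas in a namespace, served from memory only.
func (c *quotaCache) list(namespace string) []quota {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make([]quota, 0, len(c.byNamespace[namespace]))
	for _, q := range c.byNamespace[namespace] {
		out = append(out, q)
	}
	return out
}

func main() {
	c := newQuotaCache()
	c.upsert(quota{Namespace: "team-a", Name: "compute"})
	fmt.Println(len(c.list("team-a")), len(c.list("team-b")))
}
```

Because the store is already indexed by namespace and kept up to date by events, a second LRU layer on top of it would only duplicate state.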

Which issue(s) this PR fixes:

Issue #129931

Special notes for your reviewer:

Tests in resource_access_test focused on caching and avoiding duplicated LIST requests. As I deleted the code, the tests became obsolete. GetQuotas is still covered by other tests in the module.

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

/cc @jpbetz @serathius @liggitt @deads2k

k8s-ci-robot · Feb 6, 2025

Hi @AwesomePatrol. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

liggitt · Feb 6, 2025

/ok-to-test

Update the release note to indicate this resolves a performance regression of write requests related to the ConsistentListFromCache feature gate, enabled in 1.31+ by default.

liggitt · Feb 6, 2025

/lgtm
/approve

/hold for ack from second api-machinery reviewer (@deads2k?)

k8s-ci-robot · Feb 6, 2025

LGTM label has been added.

Git tree hash: 9d0008aa89cc2d2ff081a4fe7307992a4b66601a

deads2k · Feb 6, 2025

> /hold for ack from second api-machinery reviewer (@deads2k?)

lgtm

/hold cancel

plugin/pkg/admission/limitranger/admission.go

liggitt · Feb 14, 2025

staging/src/k8s.io/apiserver/pkg/admission/plugin/resourcequota/admission.go

@@ -164,6 +154,10 @@ func (a *QuotaAdmission) Validate(ctx context.Context, attr admission.Attributes
 	if attr.GetNamespace() == "" || isNamespaceCreation(attr) {
 		return nil
 	}
+	// we need to wait for our caches to warm
+	if !a.WaitForReady() {


doing this here, before calling Evaluate(), means that in unsynced cases, we'll reject resources we would otherwise have ignored:

kubernetes/staging/src/k8s.io/apiserver/pkg/admission/plugin/resourcequota/controller.go

Lines 601 to 606 in d36737c

	// is this resource ignored?
	gvr := a.GetResource()
	gr := gvr.GroupResource()
	if _, ok := e.ignoredResources[gr]; ok {
		return nil
	}

or skipped:

kubernetes/staging/src/k8s.io/apiserver/pkg/admission/plugin/resourcequota/controller.go

Lines 617 to 621 in d36737c

	// for this kind, check if the operation could mutate any quota resources
	// if no resources tracked by quota are impacted, then just return
	if !evaluator.Handles(a) {
		return nil
	}

I think the WaitForReady call should happen after both of those, so we only block on sync for things quota is actually interested in. That will mean plumbing the wait function down into the evaluator to invoke.
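A minimal sketch of the ordering suggested above, using hypothetical stand-in types (`request`, `evaluator`, `errNotReady`) rather than the real evaluator: the ignored and unhandled checks run first, so only requests quota actually cares about are affected by an unsynced cache.

```go
package main

import (
	"errors"
	"fmt"
)

// request is a hypothetical stand-in for admission.Attributes.
type request struct {
	resource string
}

// evaluator is a hypothetical, simplified stand-in for the quota evaluator.
type evaluator struct {
	ignored   map[string]bool // resources quota never looks at
	handled   map[string]bool // resources tracked by some quota evaluator
	hasSynced func() bool     // reports whether the informer cache is warm
}

var errNotReady = errors.New("quota cache not ready")

func (e *evaluator) evaluate(r request) error {
	// Ignored resources pass regardless of cache state.
	if e.ignored[r.resource] {
		return nil
	}
	// Resources no quota tracks also pass regardless of cache state.
	if !e.handled[r.resource] {
		return nil
	}
	// Only now does the request depend on the cache being synced.
	if !e.hasSynced() {
		return errNotReady
	}
	// Quota checks against the cached objects would go here.
	return nil
}

func main() {
	e := &evaluator{
		ignored:   map[string]bool{"events": true},
		handled:   map[string]bool{"pods": true},
		hasSynced: func() bool { return false },
	}
	fmt.Println(e.evaluate(request{"events"}))
	fmt.Println(e.evaluate(request{"configmaps"}))
	fmt.Println(e.evaluate(request{"pods"}))
}
```

With the cache unsynced, only the `pods` request in this sketch returns an error; the other two are admitted without waiting.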

I moved it to the evaluator, but I am not sure this is the best place. init can't be used because it doesn't allow retrying on errors. Moving it to quotaAccessor may work, but it would likely need a custom implementation of waitForReady:

	func (a *quotaAccessor) waitForReady() error {
		if a.hasSynced() {
			return nil
		}
		// Poll every 100ms for up to 1s until the informer reports synced.
		err := wait.PollUntilContextTimeout(context.Background(), 100*time.Millisecond, time.Second, false,
			func(_ context.Context) (bool, error) { return a.hasSynced(), nil })
		if err != nil {
			return fmt.Errorf("not ready")
		}
		return nil
	}

I also hope the compiler manages to optimize the handler's WaitForReady; otherwise it creates a timer that is garbage-collected right away whenever readyFunc already returns true, which is always the case after the initial sync.

k8s-ci-robot requested review from deads2k and jpbetz February 6, 2025 08:23

k8s-ci-robot added the release-note-none Denotes a PR that doesn't merit a release note. label Feb 6, 2025

k8s-ci-robot requested review from liggitt and serathius February 6, 2025 08:23

AwesomePatrol mentioned this pull request Feb 6, 2025

Add ResourceVersion="0" to ResourceQuota and LimitRanger LIST calls #129980

Closed

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 6, 2025

k8s-ci-robot assigned liggitt Feb 6, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 6, 2025

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 6, 2025

k8s-ci-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 7, 2025

AwesomePatrol force-pushed the 129931-fix-2 branch from 40e5316 to 86f6da5 Compare February 11, 2025 12:36

AwesomePatrol marked this pull request as draft February 12, 2025 12:33

k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 12, 2025

AwesomePatrol mentioned this pull request Feb 12, 2025

Make ResourceQuota LIST requests only when Informer is not synced #130113

Merged

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 12, 2025

AwesomePatrol force-pushed the 129931-fix-2 branch from b906c46 to e8ef70f Compare February 13, 2025 13:07

k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Feb 13, 2025

AwesomePatrol changed the title ~~Remove LRU cache from ResourceQuota admission plugin~~ Remove LRU caches from ResourceQuota/LimitRanger admission plugins Feb 13, 2025

k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Feb 13, 2025

AwesomePatrol marked this pull request as ready for review February 14, 2025 07:52

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 14, 2025

AwesomePatrol requested a review from liggitt February 14, 2025 07:52

AwesomePatrol force-pushed the 129931-fix-2 branch from e8ef70f to d042c84 Compare February 14, 2025 09:21

liggitt reviewed Feb 14, 2025

AwesomePatrol force-pushed the 129931-fix-2 branch from d042c84 to 2ccf61d Compare February 17, 2025 09:03

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 17, 2025

AwesomePatrol changed the title ~~Remove LRU caches from ResourceQuota/LimitRanger admission plugins~~ Remove LRU cache from ResourceQuota admission plugin Feb 17, 2025

AwesomePatrol requested a review from liggitt February 19, 2025 14:24

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 15, 2025

dims added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label May 4, 2025

serathius mentioned this pull request May 22, 2025

Consistent Reads cause increased Create/Update latency #129931

Closed

AwesomePatrol added 2 commits May 27, 2025 11:50

Remove additional LIST and LRU cache from ResourceQuota

42ee099

The informer keeps track of the data already so there is no need for an additional layer of caching and LISTing. We add wait to ResourceQuota admission hook to make sure it is synchronized before processing requests.

Remove references to Indexer and Store in admission test

5173023

ResourceQuota's Admission tests now start an informer that fetches resources from the fake kubeClient.

AwesomePatrol force-pushed the 129931-fix-2 branch from 2ccf61d to 5173023 Compare May 27, 2025 09:50

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 27, 2025
