Remove LRU cache from ResourceQuota admission plugin #129998
base: master
Conversation
Hi @AwesomePatrol. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/ok-to-test

Update the release note to indicate this resolves a performance regression of write requests related to the ConsistentListFromCache feature gate, enabled in 1.31+ by default.
/lgtm
/hold for ack from second api-machinery reviewer (@deads2k?)
LGTM label has been added. Git tree hash: 9d0008aa89cc2d2ff081a4fe7307992a4b66601a
lgtm

/hold cancel
force-pushed from 40e5316 to 86f6da5
force-pushed from b906c46 to e8ef70f
force-pushed from e8ef70f to d042c84
@@ -164,6 +154,10 @@ func (a *QuotaAdmission) Validate(ctx context.Context, attr admission.Attributes
	if attr.GetNamespace() == "" || isNamespaceCreation(attr) {
		return nil
	}
	// we need to wait for our caches to warm
	if !a.WaitForReady() {
doing this here, before calling Evaluate(), means that in unsynced cases, we'll reject resources we would otherwise have ignored:
kubernetes/staging/src/k8s.io/apiserver/pkg/admission/plugin/resourcequota/controller.go
Lines 601 to 606 in d36737c
	// is this resource ignored?
	gvr := a.GetResource()
	gr := gvr.GroupResource()
	if _, ok := e.ignoredResources[gr]; ok {
		return nil
	}
or skipped:
kubernetes/staging/src/k8s.io/apiserver/pkg/admission/plugin/resourcequota/controller.go
Lines 617 to 621 in d36737c
	// for this kind, check if the operation could mutate any quota resources
	// if no resources tracked by quota are impacted, then just return
	if !evaluator.Handles(a) {
		return nil
	}
I think the WaitForReady call should happen after both of those, so we only block on sync for things quota is actually interested in. That will mean plumbing the wait function down into the evaluator to invoke.
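For illustration, that plumbing could look roughly like the sketch below. This is not the actual patch: the waitForReady field on quotaEvaluator and the way it is injected from the admission handler are assumptions, and the evaluator body is abbreviated.

// Sketch: give the evaluator a readiness callback and invoke it only after
// the ignored-resource and Handles() checks, so unrelated requests never block.
func (e *quotaEvaluator) Evaluate(a admission.Attributes) error {
	// is this resource ignored?
	gr := a.GetResource().GroupResource()
	if _, ok := e.ignoredResources[gr]; ok {
		return nil
	}
	// for this kind, check if the operation could mutate any quota resources;
	// if no resources tracked by quota are impacted, then just return
	evaluator := e.registry.Get(gr)
	if evaluator == nil || !evaluator.Handles(a) {
		return nil
	}
	// only now block on informer sync, since quota actually cares about this request
	// (e.waitForReady is assumed to be plumbed in from the admission handler)
	if e.waitForReady != nil && !e.waitForReady() {
		return admission.NewForbidden(a, fmt.Errorf("resource quota informer not yet synced"))
	}
	// ... existing quota evaluation continues here ...
	return nil
}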
I moved it to the evaluator, but I am not sure if this is the best place. init can't be used because it doesn't allow retrying on errors. Moving it to quotaAccessor may work, but it would likely need a custom implementation of waitForReady:
func (a *quotaAccessor) waitForReady() error {
	if a.hasSynced() {
		return nil
	}
	// poll every 100ms, for at most one second, until the informer cache reports synced
	if err := wait.PollUntilContextTimeout(context.Background(), 100*time.Millisecond, time.Second, true,
		func(_ context.Context) (bool, error) { return a.hasSynced(), nil }); err != nil {
		return fmt.Errorf("not ready")
	}
	return nil
}
I also hope that the compiler manages to optimize the handler's WaitForReady, as otherwise it creates a timer only to be GCed right after (in cases when readyFunc always returns true, i.e. post sync).
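To illustrate the concern, a fast-path variant of the handler's WaitForReady could check readiness before allocating the timeout timer. This is a sketch only: the early-return guard is the suggestion, not existing code, while Handler, readyFunc, and timeToWaitForReady refer to the apiserver admission handler's existing names.

// Sketch of a fast-path guard: skip timer allocation entirely when the plugin
// is already synced, which is the common case after startup.
func (h *Handler) WaitForReady() bool {
	// no ready func configured, or already ready: return without creating a timer
	if h.readyFunc == nil || h.readyFunc() {
		return true
	}
	timeout := time.After(timeToWaitForReady)
	for !h.readyFunc() {
		select {
		case <-time.After(100 * time.Millisecond):
		case <-timeout:
			return h.readyFunc()
		}
	}
	return true
}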
force-pushed from d042c84 to 2ccf61d
The informer already keeps track of the data, so there is no need for an additional layer of caching and LISTing. We add a wait to the ResourceQuota admission hook to make sure it is synced before processing requests.

ResourceQuota's admission tests now start an informer that fetches resources from the fake kubeClient.
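A minimal sketch of that test setup, assuming the standard client-go fake clientset and shared informer factory; the plugin constructor name and existingQuotas variable are illustrative, not the actual test code:

// Hypothetical test wiring: back the admission plugin's quota lister with an
// informer fed by the fake clientset instead of a hand-rolled cache.
kubeClient := fake.NewSimpleClientset(existingQuotas...)
informerFactory := informers.NewSharedInformerFactory(kubeClient, 0)
quotaInformer := informerFactory.Core().V1().ResourceQuotas()

// plugin wiring is illustrative; the real constructor may differ
handler := newQuotaAdmissionForTest(kubeClient, quotaInformer.Lister())

// start the informers and wait for the local cache to sync before admitting
stopCh := make(chan struct{})
defer close(stopCh)
informerFactory.Start(stopCh)
informerFactory.WaitForCacheSync(stopCh)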
force-pushed from 2ccf61d to 5173023
/kind bug
/kind regression
What this PR does / why we need it:
This PR mitigates the high latency caused by the consistent reads made during ResourceQuota admission/validation. LISTs served from the informer provide sufficient consistency guarantees. More details are available in the linked bug report.
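Roughly, the change amounts to reading quotas from the shared informer's lister rather than issuing LISTs through the client. The sketch below uses assumed variable and function names and is not the exact diff:

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/labels"
	corev1listers "k8s.io/client-go/listers/core/v1"
)

// Sketch: resolve the quotas for a namespace from the informer-backed lister;
// no request is sent to the API server on this path.
func quotasForNamespace(lister corev1listers.ResourceQuotaLister, namespace string) ([]*corev1.ResourceQuota, error) {
	return lister.ResourceQuotas(namespace).List(labels.Everything())
}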
Which issue(s) this PR fixes:
Issue #129931
Special notes for your reviewer:
Tests in resource_access_test focused on caching and avoiding duplicated LIST requests. As I deleted that code, the tests became obsolete. GetQuotas is still covered by other tests in the module.

Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
/cc @jpbetz @serathius @liggitt @deads2k