feat: Allow leases to have custom labels set when a new holder takes the lease #131632


Merged — 1 commit merged into kubernetes:master on Jun 5, 2025

Conversation

DerekFrank
Contributor

@DerekFrank DerekFrank commented May 6, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds unit tests for the LeaseLock type and slightly modifies its functionality to allow users to set custom labels that are updated when a lease gets a new leader.

The intended use of this feature is to allow graceful failover away from leaders. Right now, understanding which replica holds a lease is complicated because the holder identity is merely the name of the pod. This change makes it simple to see who holds the lease without having to trace that information back through the pod's hostname. It also prevents race conditions introduced by other workarounds, such as custom controllers that reconcile on leases and update labels.
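For illustration, here is a minimal sketch of how a caller might configure this, assuming the new field is a `Labels` map on `resourcelock.LeaseLock` as described in the release note below; the `example.com/zone` key and the `POD_ZONE` environment variable are hypothetical.

```go
package main

import (
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

// newLeaseLock builds a LeaseLock whose custom labels are written onto the
// Lease object whenever this candidate becomes the holder. The label key and
// the POD_ZONE environment variable are hypothetical examples; any labels
// could be used.
func newLeaseLock(client kubernetes.Interface, identity string) *resourcelock.LeaseLock {
	return &resourcelock.LeaseLock{
		LeaseMeta: metav1.ObjectMeta{
			Name:      "my-controller",
			Namespace: "kube-system",
		},
		Client: client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{
			Identity: identity,
		},
		// Assumed field from this PR: labels the current holder stamps
		// onto the Lease when it takes over leadership.
		Labels: map[string]string{
			"example.com/zone": os.Getenv("POD_ZONE"),
		},
	}
}
```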

Which issue(s) this PR fixes:

Fixes kubernetes/client-go#1413

Special notes for your reviewer:

Does this PR introduce a user-facing change?

This change is technically user facing, as users can now leverage the new functionality, but no action is necessary. If a release note is required, I would suggest a note such as:

LeaseLocks can now have custom Labels that different holders will overwrite when they become the holder of the underlying lease.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 6, 2025
@k8s-ci-robot
Copy link
Contributor

Welcome @DerekFrank!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link
Contributor

Hi @DerekFrank. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 6, 2025
@dims
Copy link
Member

dims commented May 6, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 6, 2025
@DerekFrank DerekFrank force-pushed the custom-lease-metadata branch 2 times, most recently from 9e6551b to b63166c on May 6, 2025 18:26
@dims
Copy link
Member

dims commented May 6, 2025

/test pull-kubernetes-e2e-kind

@aaron-prindle
Copy link
Contributor

/cc @Jefftree

@k8s-ci-robot k8s-ci-robot requested a review from Jefftree May 6, 2025 20:14
@aaron-prindle
Copy link
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 6, 2025
@DerekFrank DerekFrank force-pushed the custom-lease-metadata branch from b63166c to 501d207 on May 6, 2025 22:09
@DerekFrank DerekFrank force-pushed the custom-lease-metadata branch from 501d207 to 8597aba on May 7, 2025 23:50
@DerekFrank
Copy link
Contributor Author

DerekFrank commented May 8, 2025

e2e is failing due to a 502 response from google.com. I'm going to guess that's a flake?

https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/131632/pull-kubernetes-e2e-gce/1920264922424414208

@DerekFrank
Copy link
Contributor Author

/retest

Member

@Jefftree Jefftree left a comment


Overall approach lgtm.

cc @jpbetz @sttts: The change is completely backwards compatible, but it adds new methods to client-go that we'll likely have to support forever, so I would like your thoughts.

@DerekFrank DerekFrank force-pushed the custom-lease-metadata branch from 9ca18a2 to a89b7b7 on May 14, 2025 17:01
@alvaroaleman
Member

Right now understanding which replica holds a lease is complicated as the holder identity is merely the name of the pod

@DerekFrank why is this complicated? The holder usually is a pod. What other information would you want to put in there?

@DerekFrank
Contributor Author

DerekFrank commented May 28, 2025

Right now understanding which replica holds a lease is complicated as the holder identity is merely the name of the pod

@DerekFrank why is this complicated? The holder usually is a pod. What other information would you want to put in there?

Our main use case is to understand which cloud provider zone a given holder is in, for zonal-outage failover mechanisms. Currently, to find out which zone the holder is in, one would need to:

  1. Parse the pod's hostname out of the holder identity (and hope nobody changes that format in the future)
  2. Get the name of the node the pod is on from an API server call to get the pod object
  3. Get the node labels from an API server call to get the node object
  4. Parse the node's labels to determine which zone it is in

Given the nature of zonal outages, this makes any failover mechanism susceptible to all sorts of dependency failures.
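For comparison, with the holder's zone recorded as a label on the Lease itself, the lookup above collapses to a single read. A minimal sketch follows; the `example.com/zone` key is hypothetical and would be whatever the holders set via the new LeaseLock labels.

```go
package main

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// holderZone reads the holder's zone directly from the Lease labels,
// replacing the holder-identity -> pod -> node -> node-label chain above.
// The "example.com/zone" label key is a hypothetical example.
func holderZone(ctx context.Context, client kubernetes.Interface, namespace, name string) (string, error) {
	lease, err := client.CoordinationV1().Leases(namespace).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	return lease.Labels["example.com/zone"], nil
}
```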

@alvaroaleman
Member

alvaroaleman commented May 29, 2025

Our main usecase is to understand which cloud provider zone a given holder is in for zonal outage fail over mechanisms.

Could you elaborate a bit on why you need failure-domain info to do failover? Another replica will automatically take over, won't it? Furthermore, the failure domain really is a correlation: how would you be able to tell that the failure was in any way caused by the replica being in a given failure domain?

@DerekFrank
Contributor Author

DerekFrank commented May 30, 2025

Could you elaborate a bit as to why you need failure domain info to do failover? Another replica will automatically take over or not? Furthermore the failure domain really is a correlation, how would you be able to tell the failure was in any way caused by the replica being in a given failure domain?

Apologies, I should have been clearer. I am attempting to set up a mechanism that, when manually activated, will force failover for all controllers within a specific zone. This is useful for maintaining availability when there are known partial zonal outages of underlying cloud provider services.

For example, if a cloud provider is having a partial networking outage within a specific zone such that some calls from a controller to the control plane fail but others do not, the controllers in that zone will most likely still be able to maintain leadership. This is not ideal, as they will be performing suboptimally due to the dropped calls. We would be able to see that the controllers are impacted through routine monitoring and could then leverage the failover mechanism. The goal of the mechanism would be to force replicas in another zone to take leadership of the lease, mitigating impact until the partial networking outage is resolved. Ideally, the mechanism does this without relying on being able to communicate with the controllers in the affected zone, since we know that zone will be impaired.

As an example of how the label could be used: if the lease holders added their zonal information to lease renewal requests, the API server would be able to reject renewal requests that originate from the impaired zone. I'm not suggesting adding that functionality specifically.

@DerekFrank DerekFrank force-pushed the custom-lease-metadata branch from a89b7b7 to 792f229 on June 3, 2025 20:22
Contributor

@jpbetz jpbetz left a comment


/approve

With one super small nit.

Happy to lgtm once ready to merge

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DerekFrank, jpbetz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 3, 2025
@DerekFrank DerekFrank force-pushed the custom-lease-metadata branch from 792f229 to 109ae1b on June 4, 2025 17:08
@jpbetz
Contributor

jpbetz commented Jun 5, 2025

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 5, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 54ca0d347b91701cd999e12e2b39a041f43307d8

@k8s-ci-robot k8s-ci-robot merged commit 3eebe67 into kubernetes:master Jun 5, 2025
14 of 15 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.34 milestone Jun 5, 2025
inkel pushed a commit to inkel/kubernetes that referenced this pull request Jun 6, 2025
…adata

feat: Allow leases to have custom labels set when a new holder takes the lease

Kubernetes-commit: 3eebe67
Development

Successfully merging this pull request may close these issues.

Feature: Allow users of LeaseLock to add custom object metadata that updates on leader election
7 participants