Conversation

@pallasathena92
Collaborator

What type of PR is this?

/kind feature
/kind design

What this PR does / why we need it:

Now that the AcceleratorClass CRD has been added, we need a controller to reconcile this resource.
This is the first step: the controller only checks nodes and updates the AcceleratorClass status according to those nodes.
This PR mainly contains these parts:

  1. AcceleratorClass controller. It handles AcceleratorClass create/read/update (CRU) actions and updates the AcceleratorClass status.
  2. MinMemory type change. For consistency with other API fields and to avoid platform-dependent integer sizes, MinMemory now uses *int64 instead of *int. This modifies the parameters I previously added to the inference service and runtime.
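
To illustrate the motivation for the second item, here is a minimal sketch, not the actual API type from this PR. The trimmed-down AcceleratorConstraints struct, the minMemory JSON name, and the assumption that the value is in bytes are all hypothetical. It shows that a fixed-width *int64 holds a value that would overflow a 32-bit integer, and that a nil pointer still distinguishes "unset" from zero.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// AcceleratorConstraints is a hypothetical, trimmed-down stand-in for the
// API type touched by this PR. The JSON field name is an assumption.
type AcceleratorConstraints struct {
	// MinMemory uses *int64 so its width does not depend on the platform.
	// A nil pointer means the constraint is not set.
	MinMemory *int64 `json:"minMemory,omitempty"`
}

// parseConstraints decodes a JSON document into AcceleratorConstraints.
func parseConstraints(raw string) (AcceleratorConstraints, error) {
	var c AcceleratorConstraints
	err := json.Unmarshal([]byte(raw), &c)
	return c, err
}

// fitsInInt32 reports whether v could be stored in a 32-bit integer,
// which is the size of Go's int on 32-bit platforms.
func fitsInInt32(v int64) bool {
	return v >= math.MinInt32 && v <= math.MaxInt32
}

func main() {
	// 80 GiB expressed in bytes (assumed unit, for illustration only).
	c, err := parseConstraints(`{"minMemory": 85899345920}`)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(*c.MinMemory, fitsInInt32(*c.MinMemory))

	// An omitted field stays nil rather than becoming zero.
	empty, _ := parseConstraints(`{}`)
	fmt.Println(empty.MinMemory == nil)
}
```

With a plain int field, the same value would fail to decode or be unrepresentable on a 32-bit build, which is the platform dependence the change avoids.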

Special notes for your reviewer:

The controller has been tested.

Does this PR introduce a user-facing change?

https://github.com/sgl-project/ome/issues/210

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @pallasathena92, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the core logic for managing AcceleratorClass resources within the system. It establishes a new controller responsible for monitoring AcceleratorClass definitions and dynamically updating their status by identifying and matching available Kubernetes nodes that meet the specified discovery and capability criteria. Additionally, it refines the API by standardizing integer types for accelerator constraints to ensure robust and consistent behavior across different platforms.

Highlights

  • New AcceleratorClass Controller: Implemented a new controller to manage AcceleratorClass custom resources, reconciling their status based on available Kubernetes nodes that meet specified criteria.
  • Node Discovery and Capability Matching: The controller includes sophisticated logic to discover and filter nodes based on AcceleratorClass discovery specifications (node selectors) and capability requirements (memory, compute capability, and GPU resources).
  • Type Consistency Update: Changed MinMemory, MaxMemory, and MinComputeCapability fields within AcceleratorConstraints from *int to *int64 for improved consistency and platform independence across the API.
  • RBAC and Finalizer Management: Updated RBAC roles to grant necessary permissions for the new AcceleratorClass resource and implemented finalizer logic for proper lifecycle management during deletion.
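
The node discovery and capability matching described above can be sketched roughly as follows. This is an illustrative sketch only, not the controller code from this PR: the Node and AcceleratorClass types, their field names, and the gigabyte unit are hypothetical simplifications, and the real controller works against Kubernetes node objects rather than these stand-ins.

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a hypothetical, simplified stand-in for a Kubernetes node.
type Node struct {
	Name        string
	Labels      map[string]string
	GPUMemoryGB int64 // assumed unit, for illustration only
	GPUCount    int64
}

// AcceleratorClass is a hypothetical, simplified stand-in for the CRD.
type AcceleratorClass struct {
	NodeSelector map[string]string
	MinMemoryGB  *int64 // nil means no minimum is required
}

// matchesSelector reports whether every key/value pair in selector is
// present in labels. An empty selector matches every node.
func matchesSelector(labels, selector map[string]string) bool {
	for key, want := range selector {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

// meetsCapabilities checks that the node exposes at least one GPU and
// satisfies the optional minimum-memory constraint.
func meetsCapabilities(n Node, ac AcceleratorClass) bool {
	if n.GPUCount < 1 {
		return false
	}
	if ac.MinMemoryGB != nil && n.GPUMemoryGB < *ac.MinMemoryGB {
		return false
	}
	return true
}

// matchingNodes returns the sorted names of nodes that pass both checks.
// A status update could then record this list on the resource.
func matchingNodes(nodes []Node, ac AcceleratorClass) []string {
	var names []string
	for _, n := range nodes {
		if matchesSelector(n.Labels, ac.NodeSelector) && meetsCapabilities(n, ac) {
			names = append(names, n.Name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	minMem := int64(40)
	ac := AcceleratorClass{
		NodeSelector: map[string]string{"accelerator": "gpu"},
		MinMemoryGB:  &minMem,
	}
	nodes := []Node{
		{Name: "node-a", Labels: map[string]string{"accelerator": "gpu"}, GPUMemoryGB: 80, GPUCount: 8},
		{Name: "node-b", Labels: map[string]string{"accelerator": "gpu"}, GPUMemoryGB: 24, GPUCount: 4},
		{Name: "node-c", Labels: map[string]string{"role": "cpu"}},
	}
	fmt.Println(matchingNodes(nodes, ac)) // prints [node-a]
}
```

Sorting the result keeps the output stable between reconciliations, so an unchanged set of nodes does not look like a status change.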

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new controller for the AcceleratorClass CRD and refactors some API fields from *int to *int64 for platform consistency. The overall direction is good, but the new controller contains several correctness and performance issues that need to be addressed. I've identified problems in node selector matching logic, capability matching, and the efficiency of the reconciliation loop. Please review the detailed comments for specific suggestions on how to resolve these issues.

@pallasathena92 pallasathena92 force-pushed the yifeliu/ac_controller branch 3 times, most recently from aa4abdc to 423d9ed September 15, 2025 00:37
@slin1237 slin1237 merged commit fdec439 into main Sep 15, 2025
23 checks passed
@slin1237 slin1237 deleted the yifeliu/ac_controller branch September 15, 2025 16:24