[None][cleanup] Add supplemental ruff lint for legacy files via ruff-legacy hook#11469
[None][cleanup] Add supplemental ruff lint for legacy files via ruff-legacy hook#11469venkywonka merged 1 commit intoNVIDIA:mainNVIDIA/TensorRT-LLM:mainfrom venkywonka:venky/stricter-lintervenkywonka/TensorRT-LLM:venky/stricter-linterCopy head branch name to clipboard
Conversation
|
/bot run |
|
PR_Github #36306 [ run ] triggered by Bot. Commit: |
|
PR_Github #36306 [ run ] completed with state |
📝 WalkthroughWalkthroughThis PR implements a two-group Python linting strategy by introducing legacy file management and automated configuration generation. It adds ~1,350 legacy files to a dedicated list, creates configuration templates and a generator script to maintain lint/format rules automatically, documents the strategy in development guides, and updates configuration files accordingly. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tensorrt_llm/runtime/kv_cache_manager_v2/rawref/setup.py (1)
16-27:⚠️ Potential issue | 🟡 MinorAlign setuptools imports with the namespace-preserving rule.
The direct import of
Extensionviolates the coding guideline. However, the proposed fix is incorrect—usingimport setuptoolsexplicitly violates the guideline's "Avoidimport package" requirement.Import the submodule instead and qualify usages:
🔧 Proposed fix
-from setuptools import Extension, setup +from setuptools import extension, setup -rawref_module = Extension( +rawref_module = extension.Extension( '_rawref', sources=['rawrefmodule.c'], )As per coding guidelines, "Always maintain the namespace when importing. Use
from package.subpackage import fooinstead offrom package.subpackage.foo import SomeClassorimport package."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/runtime/kv_cache_manager_v2/rawref/setup.py` around lines 16 - 27, Replace the direct import of Extension with a submodule import: keep setup imported from setuptools but import Extension from the setuptools.extension submodule (i.e., change "from setuptools import Extension, setup" to "from setuptools import setup" and "from setuptools.extension import Extension"), leaving usages of Extension and setup unchanged in the file.pyproject.toml (1)
107-122:⚠️ Potential issue | 🟡 MinorPotential crash if END marker is on the last line without a trailing newline.
replace_managed_blockin the generator (line 120 of the script) usescontent.index("\n", end_idx)which raisesValueErrorif the END marker is on the very last line with no trailing newline. While the auto-generated templates likely always produce a trailing newline, this is fragile if someone manually edits the file. The same logic applies to the managed blocks in this file.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pyproject.toml` around lines 107 - 122, The replace_managed_block function currently calls content.index("\n", end_idx) which will raise ValueError if the END marker is the last thing in the file with no trailing newline; update replace_managed_block (and other managed-block handlers in this file) to safely handle a missing newline by using content.find("\n", end_idx) and if it returns -1 fall back to len(content) (or wrap the index call in try/except and set the end position to len(content) on ValueError), so the function will correctly replace/remove the block even when the END marker is at EOF.
🧹 Nitpick comments (3)
scripts/generate_legacy_lint_config.py (3)
221-230:set(stale)is rebuilt for every element in the list comprehension.Minor inefficiency — pre-compute the set.
Proposed fix
- valid = [p for p in paths if p not in set(stale)] + stale_set = set(stale) + valid = [p for p in paths if p not in stale_set]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/generate_legacy_lint_config.py` around lines 221 - 230, In do_prune(), the list comprehension builds set(stale) on every iteration causing inefficiency; fix it by precomputing the set once (e.g., stale_set = set(stale)) before creating valid and then use that stale_set in the comprehension (valid = [p for p in paths if p not in stale_set]) so the set conversion isn’t repeated for each element.
164-183:do_generatewrites three files non-atomically.If the script fails midway (e.g., a missing marker in
.pre-commit-config.yaml),ruff-legacy.tomlandpyproject.tomlwill already be overwritten, leaving the config in a partially-updated state. For a developer-facing CLI tool this is low-risk (re-running fixes it), but worth noting.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/generate_legacy_lint_config.py` around lines 164 - 183, do_generate currently writes RUFF_LEGACY_TOML, PYPROJECT_TOML, and PRECOMMIT_YAML in-place which can leave files partially-updated if an error occurs; change it to perform atomic replacements by writing each generated content to a temporary file (e.g., next to the target path with a unique suffix) and only calling Path.replace/os.replace to move the temp into place after all generation/replace_managed_block operations succeed. Use the existing helpers generate_ruff_legacy_toml, generate_pyproject_block, generate_precommit_block and the targets RUFF_LEGACY_TOML, PYPROJECT_TOML, PRECOMMIT_YAML to locate where to write temps; ensure temps are cleaned up on error and that writes use atomic replace so no original file is lost if a later step fails.
238-279:--generatewithdefault=Truemakesargs.generatealwaysTrue, even when--checkor--pruneis passed.The code works because
args.pruneandargs.checkare checked first in the if-chain, but this is misleading —args.generateis never actually read. A cleaner approach would usestore_constor simply dropdefault=Trueand fall through todo_generateas the default.Cleaner alternative
group = parser.add_mutually_exclusive_group() group.add_argument( "--generate", action="store_true", - default=True, help="Write generated files in-place (default)", )The else-branch already handles the default (generate) case, so
default=Trueis unnecessary.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/generate_legacy_lint_config.py` around lines 238 - 279, The --generate argument is declared with default=True causing args.generate to always be True; remove the confusing default (or change the add_argument for "--generate" to use action="store_const"/const=True without a default) so that args.generate is False unless explicitly passed, and rely on the existing if/else flow in main() that calls do_prune(), do_check(paths) or, by falling through, do_generate(paths); update the group.add_argument("--generate", ...) definition accordingly (referencing main, the ArgumentParser/group.add_argument for "--generate", and the downstream checks do_generate/do_check/do_prune).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/generate_legacy_lint_config.py`:
- Around line 107-122: The code assumes content.index("\n", end_idx) will always
find a newline and will raise ValueError if the END marker is the file's final
line; update the logic in generate_legacy_lint_config.py to use
content.find("\n", end_idx) and treat -1 as "no trailing newline" by setting
end_line_end = len(content) in that case so the return statement (using
begin_idx, new_block and end_line_end) works when the END_MARKER is on the last
line.
---
Outside diff comments:
In `@pyproject.toml`:
- Around line 107-122: The replace_managed_block function currently calls
content.index("\n", end_idx) which will raise ValueError if the END marker is
the last thing in the file with no trailing newline; update
replace_managed_block (and other managed-block handlers in this file) to safely
handle a missing newline by using content.find("\n", end_idx) and if it returns
-1 fall back to len(content) (or wrap the index call in try/except and set the
end position to len(content) on ValueError), so the function will correctly
replace/remove the block even when the END marker is at EOF.
In `@tensorrt_llm/runtime/kv_cache_manager_v2/rawref/setup.py`:
- Around line 16-27: Replace the direct import of Extension with a submodule
import: keep setup imported from setuptools but import Extension from the
setuptools.extension submodule (i.e., change "from setuptools import Extension,
setup" to "from setuptools import setup" and "from setuptools.extension import
Extension"), leaving usages of Extension and setup unchanged in the file.
---
Nitpick comments:
In `@scripts/generate_legacy_lint_config.py`:
- Around line 221-230: In do_prune(), the list comprehension builds set(stale)
on every iteration causing inefficiency; fix it by precomputing the set once
(e.g., stale_set = set(stale)) before creating valid and then use that stale_set
in the comprehension (valid = [p for p in paths if p not in stale_set]) so the
set conversion isn’t repeated for each element.
- Around line 164-183: do_generate currently writes RUFF_LEGACY_TOML,
PYPROJECT_TOML, and PRECOMMIT_YAML in-place which can leave files
partially-updated if an error occurs; change it to perform atomic replacements
by writing each generated content to a temporary file (e.g., next to the target
path with a unique suffix) and only calling Path.replace/os.replace to move the
temp into place after all generation/replace_managed_block operations succeed.
Use the existing helpers generate_ruff_legacy_toml, generate_pyproject_block,
generate_precommit_block and the targets RUFF_LEGACY_TOML, PYPROJECT_TOML,
PRECOMMIT_YAML to locate where to write temps; ensure temps are cleaned up on
error and that writes use atomic replace so no original file is lost if a later
step fails.
- Around line 238-279: The --generate argument is declared with default=True
causing args.generate to always be True; remove the confusing default (or change
the add_argument for "--generate" to use action="store_const"/const=True without
a default) so that args.generate is False unless explicitly passed, and rely on
the existing if/else flow in main() that calls do_prune(), do_check(paths) or,
by falling through, do_generate(paths); update the
group.add_argument("--generate", ...) definition accordingly (referencing main,
the ArgumentParser/group.add_argument for "--generate", and the downstream
checks do_generate/do_check/do_prune).
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
.pre-commit-config.yamlCODING_GUIDELINES.mdCONTRIBUTING.mdlegacy-files.txtpyproject.tomlruff-legacy.tomlscripts/generate_legacy_lint_config.pyscripts/release_check.pyscripts/templates/precommit-legacy-files.j2scripts/templates/pyproject-format-exclude.j2scripts/templates/ruff-legacy.toml.j2tensorrt_llm/runtime/kv_cache_manager_v2/rawref/setup.py
hnover-nv
left a comment
There was a problem hiding this comment.
One question, but to the limited extent I understand it, looks good!
We'll want to message something out to the team before we turn this on. Did you want to hold this until everything is finalized, or should we message this change now-ish and turn it on?
2f6ca3c to
6c49ba4
Compare
|
PR_Github #41588 [ run ] completed with state
|
|
PR_Github #41599 [ run ] triggered by Bot. Commit: |
|
/bot run --disable-fail-fast |
|
PR_Github #41640 [ run ] triggered by Bot. Commit: |
|
/bot run --disable-fail-fast |
|
PR_Github #41666 [ run ] triggered by Bot. Commit: |
3d0fec2 to
b3d52b1
Compare
|
/bot run |
|
PR_Github #41711 [ run ] triggered by Bot. Commit: |
|
PR_Github #41711 [ run ] completed with state
|
|
/bot run |
|
PR_Github #41728 [ run ] triggered by Bot. Commit: |
|
/bot run |
|
PR_Github #41728 [ run ] completed with state
|
|
PR_Github #41760 [ run ] triggered by Bot. Commit: |
|
/bot run --disable-multi-gpu-test |
|
PR_Github #41782 [ run ] triggered by Bot. Commit: |
|
/bot run --disable-multi-gpu-test |
|
PR_Github #41782 [ run ] completed with state
|
|
PR_Github #41784 [ run ] triggered by Bot. Commit: |
|
PR_Github #41784 [ run ] completed with state |
…legacy hook Two-group Python lint strategy: - Group A (~550 modern files): full ruff lint+format via existing hooks - Group B (~1,350 legacy files): baseline-gated supplemental ruff lint that blocks new violations while tolerating pre-existing ones Key changes: - Add ruff-legacy pre-commit hook with baseline JSON snapshot - Add scripts/legacy_utils.py for baseline management (prune, gen-configs, lint-update-violations, lint-precommit, check-configs) - Auto-generate ruff-legacy.toml and file lists in pyproject.toml / .pre-commit-config.yaml from a single source of truth - Bulk-fix auto-fixable legacy violations, update CONTRIBUTING.md and CODING_GUIDELINES.md with lint documentation - Add unit tests for lint scripts in tests/unittest/scripts/ Signed-off-by: venkywonka <23023424+venkywonka@users.noreply.github.com>
b3d52b1 to
e74e905
Compare
|
/bot skip --comment "pipeline 32678 passed. latest force-push was a mere rebase" |
|
PR_Github #41810 [ skip ] triggered by Bot. Commit: |
|
PR_Github #41810 [ skip ] completed with state |
…legacy hook (NVIDIA#11469) Signed-off-by: venkywonka <23023424+venkywonka@users.noreply.github.com>
…legacy hook (NVIDIA#11469) Signed-off-by: venkywonka <23023424+venkywonka@users.noreply.github.com>
Summary
ruff-legacypre-commit hook that applies supplemental lint rules to ~1,350 legacy Python files (Group B) — rules that yapf, isort, and autoflake don't cover.pre-commit-config.yamlandpyproject.toml) into a single source of truth:legacy-files.txtscripts/generate_legacy_lint_config.py) that renders all derived configs fromlegacy-files.txtvia Jinja2 templatesruff-legacy-baseline.json). New violations block; pre-existing ones are tolerated.Reviewer Guide
This PR is large (~7,900 lines) but most of it is auto-generated or mechanical. Here's how to navigate it efficiently:
Where to start (human-written logic — ~750 lines)
Review these first. This is where all the design decisions live:
scripts/ruff_legacy_lint_baseline.pyprecommit_mode()(regression detection, improvement hints) andupdate_baseline_mode().scripts/generate_legacy_lint_config.pylegacy-files.txt, renders templates, has--checkand--prunemodes.CODING_GUIDELINES.md(lines 527-640)scripts/templates/ruff-legacy.toml.j2Skim or spot-check (~200 lines)
.pre-commit-config.yaml(bottom ~30 lines, afterEND AUTO-GENERATEDmarker)ruff-legacy,verify-legacy-config). The ~1,500-line diff above the marker is auto-generated file lists — skip it.pyproject.tomlCONTRIBUTING.mdscripts/templates/precommit-legacy-files.j2scripts/templates/pyproject-format-exclude.j2Skip entirely (auto-generated or mechanical — ~6,800 lines)
ruff-legacy-baseline.json--update-baseline. JSON blob of per-file violation counts.ruff-legacy.tomllegacy-files.txt. Just the rendered config.legacy-files.txt.pre-commit-config.yaml(lines 1-2716)common_files,legacy_files).pyproject.toml(managed block)ruff --fix: removed unused imports (F401), removed unused variables (F841), fixed comparisons (E711/E712), replacedl→L(E741), fixed escape sequences (W605). Every change is a single-line mechanical fix.One file that looks weird but is expected
tensorrt_llm/_torch/modules/fused_moe/fused_moe_cutlass.py— the ruff auto-fix removed a duplicate import (UnquantizedFusedMoEMethod, rule F811), which changed line lengths in the import block, triggering yapf to reformat it. The diff looks like a formatting change but it's rooted in a real lint fix.Problem
The codebase has two groups of Python files with different formatting toolchains:
Group B files get zero ruff lint coverage today. Bugs like bare
except:, undefined names, invalid escape sequences, andraise NotImplementedgo undetected.Why not just enable full ruff on Group B?
line-length = 100vs yapf'scolumn_limit = 80would cause formatting conflictspre-commit run -a(all files), and Group B has thousands of pre-existing violations — enabling full ruff would break CI immediatelySolution: Baseline-Gated Supplemental Lint
A dedicated
ruff-legacyhook applies only the rules that the legacy tools don't cover, gated by a violation snapshot (ruff-legacy-baseline.json):ruff --fixon staged legacy files, auto-fixing what it canThis means the hook runs in both local pre-commit AND CI
--all-filesmode.F402,F403,F405,F601,F811,F821,F822,F901E4,E711-E714,E721,E722,E731,E741D(minus D100-D107, D417)PLEW605Suppressed (overlap with legacy tools):
E1xx/E2xx/E3xx(yapf),E501(yapf),I001/I002(isort),F401/F841(autoflake).Architecture
Developer Experience
When a developer resolves existing violations:
Files Created (8)
legacy-files.txtruff-legacy.tomlruff-legacy-baseline.jsonscripts/generate_legacy_lint_config.py--generate,--check,--prunemodesscripts/ruff_legacy_lint_baseline.py--update-baselinemodesscripts/templates/ruff-legacy.toml.j2scripts/templates/pyproject-format-exclude.j2scripts/templates/precommit-legacy-files.j2Files Modified (6)
.pre-commit-config.yamllegacy_filesanchor; addexclude: *legacy_filesto ruff + ruff-format hooks; addruff-legacyandverify-legacy-confighooks; movestatic-analysis-filesanchor outside auto-generated zone; fixpass_filenameson model-registry-checkpyproject.toml[tool.ruff] exclude(global) to[tool.ruff.format] exclude(format-only); wrap in managed block markers; addforce-exclude = trueCODING_GUIDELINES.mdCONTRIBUTING.mdtensorrt_llm/runtime/kv_cache_manager_v2/rawref/setup.pyNote: basename-match side effect on
mainOn
main, the rootsetup.pyis in[tool.ruff] exclude. Because ruff matches exclude patterns against basenames, this accidentally excluded every file namedsetup.pyin the entire repo from linting — includingtensorrt_llm/runtime/kv_cache_manager_v2/rawref/setup.py. The ruff pre-commit hook uses--force-exclude(built intoastral-sh/ruff-pre-commit), which enforces excludes even when files are passed by path.This PR moves the legacy list from the global
[tool.ruff] excludeto[tool.ruff.format] exclude, so"setup.py"no longer suppresses linting for all basename matches. This exposed a pre-existing I001 (unsorted import) inrawref/setup.py, which is fixed in this PR.Graduation Path
To move a file from Group B → Group A:
legacy-files.txtpython scripts/generate_legacy_lint_config.pypython scripts/ruff_legacy_lint_baseline.py --update-baselineruff check --fix <file> && ruff format <file>When
legacy-files.txtreaches 0 entries, delete all supplemental infrastructure and switch to a single ruff toolchain.Maintenance
legacy-files.txtis the single source of truth. Three derived configs are auto-generated — never edit by hand.verify-legacy-configpre-commit hook catches stale configs.ruff-legacy-baseline.jsonis a separate artifact — update with--update-baseline.legacy-files.txt: rungenerate_legacy_lint_config.pythen--update-baseline.generate_legacy_lint_config.py --pruneremoves stale entries for deleted/renamed files.Measured Impact
Developer disruption estimate
Subsequent PRs that touch legacy files will encounter scoped lint enforcement only on the files they touch. The baseline gate means pre-existing violations don't block — only new regressions.
Test plan
python scripts/generate_legacy_lint_config.py --checkexits 0ruff check --config ruff-legacy.toml <legacy-file>reports only supplemental rules--all-files(excluding clang and mypy which need compiled bindings)ruff-legacyruns on Group B / skips Group A; mainruffskips Group B / runs on Group Averify-legacy-configpasses after regenerationforce-exclude = trueconfirmed in bothpyproject.tomlandruff-legacy.toml