[TRTLLM-12137][chore] Drop non-key-model (starcoder2/mllama/nemotron) cases from L0 by QiJune · Pull Request #13315 · NVIDIA/TensorRT-LLM

QiJune · Apr 22, 2026

Summary

Remove 6 redundant test cases from the L0 test lists to reduce CI queue time. These tests cover older or less critical models (Starcoder2, MLlama, Nemotron) that are either already covered by downstream accuracy tests or no longer key model targets for the corresponding platform.

Deleted tests

| # | Test | Removed from |
|---|------|--------------|
| 1 | `unittest/_torch/modeling -k "modeling_starcoder2"` | `l0_a30.yml` |
| 2 | `unittest/_torch/modeling -k "modeling_mllama"` | `l0_gb202.yml`, `l0_l40s.yml`, `l0_rtx_pro_6000.yml` |
| 3 | `unittest/_torch/modeling -k "modeling_nemotron"` | `l0_h100.yml` |
| 4 | `accuracy/test_llm_api_pytorch.py::TestStarcoder2_3B::test_auto_dtype` | `l0_h100.yml` |
| 5 | `accuracy/test_llm_api_pytorch.py::TestStarcoder2_7B::test_auto_dtype` | `l0_h100.yml` |
| 6 | `accuracy/test_llm_api_pytorch.py::TestStarcoder2_15B::test_auto_dtype` | `l0_h100.yml` |
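As a rough illustration of the removal listed above, the sketch below filters the dropped entries out of a test-list file's text. The PR does not show the layout of the test-db YAML files, so the `sample` snippet and the `drop_entries` helper are hypothetical; only the test strings and file names come from the table.

```python
# Entries removed per test list, copied from the "Deleted tests" table.
DROPPED = {
    "l0_a30.yml": ['unittest/_torch/modeling -k "modeling_starcoder2"'],
    "l0_gb202.yml": ['unittest/_torch/modeling -k "modeling_mllama"'],
    "l0_l40s.yml": ['unittest/_torch/modeling -k "modeling_mllama"'],
    "l0_rtx_pro_6000.yml": ['unittest/_torch/modeling -k "modeling_mllama"'],
    "l0_h100.yml": [
        'unittest/_torch/modeling -k "modeling_nemotron"',
        "accuracy/test_llm_api_pytorch.py::TestStarcoder2_3B::test_auto_dtype",
        "accuracy/test_llm_api_pytorch.py::TestStarcoder2_7B::test_auto_dtype",
        "accuracy/test_llm_api_pytorch.py::TestStarcoder2_15B::test_auto_dtype",
    ],
}


def drop_entries(yaml_text: str, dropped: list[str]) -> str:
    """Return yaml_text without the list items that start with a dropped entry.

    Hypothetical helper: it assumes each test is a "- <test string>" list item
    on its own line, which is an assumption about the file layout.
    """
    kept = []
    for line in yaml_text.splitlines():
        item = line.strip()
        if item.startswith("- "):
            test = item[2:].strip()
            if any(test.startswith(entry) for entry in dropped):
                continue  # skip the removed test entry
        kept.append(line)
    return "\n".join(kept)


# Hypothetical two-entry excerpt; not the real contents of l0_a30.yml.
sample = """tests:
  - unittest/_torch/modeling -k "modeling_llama"
  - unittest/_torch/modeling -k "modeling_starcoder2"
"""

print(drop_entries(sample, DROPPED["l0_a30.yml"]))
```

In the PR itself the entries were simply deleted by hand; a filter like this is only useful for checking that no listed entry remains in a given file.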

CI time savings

Average per-run duration measured from OpenSearch CI data over the past 7 days (2026-04-15 → 2026-04-22).

Unit tests (run on every PR via `L0_MergeRequest_PR`)

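| Test | Platform(s) | Avg duration | Runs (7d) | Pass rate |
|------|-------------|--------------|-----------|-----------|
| `modeling_starcoder2` | A30 | 4 min 3 s | 408 | 96.3% |
| `modeling_mllama` | GB202 / L40S / RTX Pro 6000 | 8 min 1 s | 1314 | 100% |
| `modeling_nemotron` | H100 | 15 min 35 s | 342 | 95.0% |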
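To put the unit-test figures in perspective, the sketch below multiplies each average duration by its 7-day run count. It assumes that "Runs (7d)" is the total number of runs across the listed platforms and that the average applies to every run; the PR does not state either point explicitly, and the names used here are illustrative.

```python
# (test, average duration in seconds, runs over 7 days), from the table above.
UNIT_TEST_STATS = [
    ("modeling_starcoder2", 4 * 60 + 3, 408),
    ("modeling_mllama", 8 * 60 + 1, 1314),
    ("modeling_nemotron", 15 * 60 + 35, 342),
]


def weekly_seconds(stats):
    """Aggregate stage time spent on these tests: sum of avg duration x runs."""
    return sum(avg_s * runs for _, avg_s, runs in stats)


total_s = weekly_seconds(UNIT_TEST_STATS)
print(f"{total_s} s ~= {total_s / 3600:.1f} h")  # 1050948 s ~= 291.9 h
```

Under those assumptions, the three unit tests accounted for roughly 292 hours of aggregate stage time over the measured week.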

Accuracy tests (run on `L0_PostMerge` only)

| Test | Platform | Avg duration | Runs (7d) |
|------|----------|--------------|-----------|
| `TestStarcoder2_3B::test_auto_dtype` | DGX_H100 | 97.8 s | 16 |
| `TestStarcoder2_7B::test_auto_dtype` | H100_PCIe | 224.2 s | 2 |
| `TestStarcoder2_15B::test_auto_dtype` | DGX_H100 | 183.0 s | 16 |

Estimated savings

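Per PR CI run (L0_MergeRequest_PR): 243 + 481×3 + 935 = 2621 s ≈ 44 min of aggregate stage time saved across the A30 / GB202 / L40S / RTX Pro 6000 / H100 stages. Because these stages run in parallel, the wall-clock saving per run is smaller than this sum.
Per Post-Merge run (H100): 97.8 + 224.2 + 183.0 = 505.0 s ≈ 8.4 min saved.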
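The two estimates can be re-derived from the tables above with a few lines of Python. This is a minimal sketch with illustrative names; it follows the PR's own formula and counts the `modeling_mllama` average once for each of its three platforms. Note that 2621 s works out to roughly 43.7 minutes.

```python
# Test -> (average seconds per run, number of platforms it was removed from).
UNIT_TESTS = {
    "modeling_starcoder2": (243, 1),  # 4 min 3 s on A30
    "modeling_mllama": (481, 3),      # 8 min 1 s on GB202, L40S, RTX Pro 6000
    "modeling_nemotron": (935, 1),    # 15 min 35 s on H100
}

# Accuracy test -> average seconds per run (post-merge only).
ACCURACY_TESTS = {
    "TestStarcoder2_3B::test_auto_dtype": 97.8,
    "TestStarcoder2_7B::test_auto_dtype": 224.2,
    "TestStarcoder2_15B::test_auto_dtype": 183.0,
}


def per_pr_savings_s() -> int:
    """Aggregate seconds removed from one L0_MergeRequest_PR run."""
    return sum(avg_s * platforms for avg_s, platforms in UNIT_TESTS.values())


def post_merge_savings_s() -> float:
    """Seconds removed from one L0_PostMerge run."""
    return sum(ACCURACY_TESTS.values())


pr_s = per_pr_savings_s()
pm_s = post_merge_savings_s()
print(f"Per PR run:         {pr_s} s ~= {pr_s / 60:.1f} min")      # 2621 s ~= 43.7 min
print(f"Per post-merge run: {pm_s:.1f} s ~= {pm_s / 60:.1f} min")  # 505.0 s ~= 8.4 min
```

The per-PR figure is a sum over stages that run in parallel, so it measures freed machine time, not the reduction in end-to-end pipeline latency.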

Summary by CodeRabbit

Tests
- Refined test configurations across multiple GPU models (A30, GB202, H100, L40S, RTX Pro 6000) to optimize test coverage and resource utilization. Adjusted model-specific test selections to better align with hardware capabilities and testing priorities, ensuring more focused validation across different accelerators.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

coderabbitai · Apr 22, 2026

Walkthrough

Multiple test configuration YAML files are modified to remove specific PyTorch modeling test selections across different GPU configurations, with one file also updating model selection comments and removing additional accuracy tests.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| Test List Removals: `tests/integration/test_lists/test-db/l0_a30.yml`, `l0_gb202.yml`, `l0_l40s.yml`, `l0_rtx_pro_6000.yml` | Removed individual `unittest/_torch/modeling` test entries: `modeling_starcoder2` (A30), `modeling_mllama` (GB202, L40S, RTX Pro 6000). |
| Test List Update with Model Changes: `tests/integration/test_lists/test-db/l0_h100.yml` | Updated comment from "llama/mixtral/nemotron/deepseek" to "llama/mixtral/gemma3/gpt-oss", removed `modeling_nemotron` test entry, and removed three Starcoder2 accuracy test variants (`TestStarcoder2_*::test_auto_dtype` for 3B, 7B, 15B). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|------------|--------|-------------|
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description comprehensively covers the purpose (reducing CI time), lists all deleted tests in a clear table, and provides detailed CI time savings analysis with supporting data from OpenSearch. |
| Title check | ✅ Passed | The title accurately summarizes the main change: removing non-key-model test cases (starcoder2/mllama/nemotron) from L0 test lists, which is directly reflected in all modified files. |

QiJune · Apr 22, 2026

/bot run

tensorrt-cicd · Apr 22, 2026

PR_Github #44869 [ run ] triggered by Bot. Commit: 72b965c Link to invocation

QiJune · Apr 22, 2026

/bot skip --comment "trivial changes"

tensorrt-cicd · Apr 22, 2026

PR_Github #44967 [ skip ] triggered by Bot. Commit: 6b9036f Link to invocation

sunnyqgg

LGTM

tensorrt-cicd · Apr 22, 2026

PR_Github #44967 [ skip ] completed with state SUCCESS. Commit: 6b9036f
Skipping testing for commit 6b9036f

Link to invocation

github-actions Bot assigned QiJune Apr 22, 2026

QiJune changed the title ~~[]test: drop non-key-model (starcoder2/mllama/nemotron) cases from L0~~ [TRTLLM-12137][ci] drop non-key-model (starcoder2/mllama/nemotron) cases from L0 Apr 22, 2026

QiJune requested review from YihuiLu512 and litaotju April 22, 2026 03:21

QiJune changed the title ~~[TRTLLM-12137][ci] drop non-key-model (starcoder2/mllama/nemotron) cases from L0~~ [TRTLLM-12137][chore] Drop non-key-model (starcoder2/mllama/nemotron) cases from L0 Apr 22, 2026

QiJune changed the title ~~[TRTLLM-12137][chore] Drop non-key-model (starcoder2/mllama/nemotron) cases from L0~~ [TRTLLM-12137][chore] Drop non-key-model (starcoder2/mllama/nemotron) cases from L0 Apr 22, 2026

QiJune requested a review from sunnyqgg April 22, 2026 12:04

QiJune added 2 commits April 22, 2026 20:13

test: drop non-key-model (starcoder2/mllama/nemotron) cases from L0 l…

a4a83bd

…ists Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

test: drop stale mistral cases from h100 L0 list

6b9036f

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

QiJune force-pushed the deprecate_cases branch from 72b965c to 6b9036f Compare April 22, 2026 12:37

sunnyqgg approved these changes Apr 22, 2026

QiJune merged commit a1bcae6 into NVIDIA:main Apr 22, 2026
5 checks passed

ziyixiong-nv pushed a commit to ziyixiong-nv/TensorRT-LLM that referenced this pull request Apr 24, 2026

[TRTLLM-12137][chore] Drop non-key-model (starcoder2/mllama/nemotron)…

3c918b3

… cases from L0 (NVIDIA#13315) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

QiJune merged 2 commits into NVIDIA:main from QiJune:deprecate_cases
