[None][test] add deepseek RCCA perf test case #11736
ruodil merged 1 commit into NVIDIA/TensorRT-LLM:main
Conversation
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
📝 Walkthrough
The changes add a new performance test configuration for a DeepSeek R1 NVFP4 model with chunked prefill and FP8 KV cache, including model configuration parameters and a corresponding benchmark test entry.
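As context for the walkthrough, a model-config fragment enabling chunked prefill with an FP8 KV cache might look like the sketch below. The exact key names used in tests/integration/defs/perf/pytorch_model_config.py are not visible in this conversation, so the fragment is an assumption modeled on TensorRT-LLM's LLM API options, not a copy of the diff:

```yaml
# Hypothetical extra-config fragment (key names assumed, not taken from the PR diff)
enable_chunked_prefill: true
kv_cache_config:
  dtype: fp8                      # FP8 KV cache
  free_gpu_memory_fraction: 0.80  # matches kv_frac:0.80 in the test identifier
```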
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (2 warnings)
🧹 Nitpick comments (1)
tests/integration/test_lists/qa/llm_perf_core.yml (1)
259-260: Consider adding a TIMEOUT annotation. The test has `reqs:3000`, which is substantial. Similar DeepSeek tests with comparable request counts (e.g., lines 257-258 with `reqs:3000`) include `TIMEOUT(120)`. While the `input_output_len:8192,512` profile (long input, short output) may differ from `1000,2000`, you may want to verify whether a timeout is needed to prevent test hangs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_lists/qa/llm_perf_core.yml` around lines 259 - 260, Add a TIMEOUT annotation to the long-running DeepSeek test entry to prevent hangs: for the test identifier string perf/test_perf.py::test_perf[deepseek_r1_nvfp4-bench-pytorch-float4-maxbs:32-maxnt:4096-kv_frac:0.80-input_output_len:8192,512-reqs:3000-ep:2-tp:4-gpus:4] insert the same TIMEOUT(120) annotation used by nearby DeepSeek tests (e.g., the entries around lines 257–258) so the test will fail fast if it exceeds the expected runtime.
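The fix the reviewer suggests can be sketched as follows. The annotation syntax is assumed from the review's reference to neighboring entries with `TIMEOUT(120)`, not copied from the actual file:

```yaml
# tests/integration/test_lists/qa/llm_perf_core.yml (sketch; annotation
# placement assumed to match the neighboring DeepSeek entries)
- perf/test_perf.py::test_perf[deepseek_r1_nvfp4-bench-pytorch-float4-maxbs:32-maxnt:4096-kv_frac:0.80-input_output_len:8192,512-reqs:3000-ep:2-tp:4-gpus:4] TIMEOUT (120)
```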
ℹ️ Review info
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- tests/integration/defs/perf/pytorch_model_config.py
- tests/integration/test_lists/qa/llm_perf_core.yml
/bot skip --comment "skip test as just adding test cases"
PR_Github #36884 [ skip ] triggered by Bot. Commit:
PR_Github #36884 [ skip ] completed with state
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Summary by CodeRabbit
Release Notes
Description
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment /bot help.