[None][chore] Minor fix in w4a8 mxfp4 mxfp8 test. #11745

Merged
Tracin merged 1 commit into NVIDIA/TensorRT-LLM:main from Tracin:fix_test
Feb 27, 2026

Conversation

@Tracin (Collaborator) commented Feb 26, 2026

Summary by CodeRabbit

Tests

  • Updated test parameterization with new configurations for broader scenario coverage
  • Replaced exact numerical comparison with a cosine-similarity check for more rigorous, consistent validation
  • Added deterministic seeding and simplified data preparation in test initialization

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR follows the TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Tava architecture diagram updated if the PR includes a significant design change.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
@Tracin requested a review from nv-guomingz, February 26, 2026 09:30
@Tracin (Collaborator, Author) commented Feb 26, 2026

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #36905 [ run ] triggered by Bot. Commit: 519a722 Link to invocation

@coderabbitai (Contributor) Bot commented Feb 26, 2026

📝 Walkthrough

A test file for GEMM operations is modified to use quantized FP8/FP4 quantization pathways via TRTLLM operations, introduce seed control, replace exact numerical comparison with cosine-similarity validation (>0.98), add alpha scaling, and adjust dimension configurations.

Changes

GEMM Test Refactoring — tests/unittest/_torch/thop/parallel/test_w4a8_mxfp4_mxfp8_gemm.py:
Test parameterization updated with larger dimensions; input generation refactored to use quantized FP8/FP4 pathways via TRTLLM ops instead of direct conversion; seed control added with deterministic initialization; computation switched to fp8_fp4_gemm with alpha scaling; validation changed from exact numerical comparison to a cosine-similarity check (threshold >0.98); reference computation adjusted accordingly.
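The cosine-similarity validation described in the walkthrough can be sketched in plain Python (the real test operates on torch tensors; the vectors and the `cosine_similarity` helper below are illustrative, not the test's actual code):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two flat vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Validation style described above: accept when similarity exceeds 0.98,
# which tolerates the rounding introduced by FP8/FP4 quantization.
c = [1.0, 2.0, 3.0]
c_ref = [1.02, 1.97, 3.01]
assert cosine_similarity(c, c_ref) > 0.98
```

This is why the reviewer's later comment matters: the metric only compares direction, not magnitude.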

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

  • Docstring Coverage — ⚠️ Warning. Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
  • Description check — ⚠️ Warning. The PR description is entirely the template with no actual content filled in; only the checklist is marked complete. Resolution: add a meaningful Description section explaining what was changed and why, and a Test Coverage section listing the relevant tests that safeguard the changes.
  • Title check — ❓ Inconclusive. The title indicates a "minor fix," but the AI summary shows substantial changes: quantization pathway replacement, test parameter modifications, and numerical comparison changes. Resolution: clarify whether this is truly a minor fix, or retitle to reflect the scope (e.g., "Refactor w4a8 mxfp4 mxfp8 test with quantized GEMM pathway").



Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai (Contributor) Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unittest/_torch/thop/parallel/test_w4a8_mxfp4_mxfp8_gemm.py`:
- Line 49: The division by mat_b.abs().max().float() in the global_scale_b
calculation can produce inf for zero tensors; guard the denominator by computing
a safe max (e.g., max_val = mat_b.abs().max().float() and then replace zero with
a small epsilon or use torch.clamp/max to enforce a minimum like 1e-6) and use
that safe_max in the division so global_scale_b = (448 * 6) / safe_max; update
references to mat_b.abs().max().float() in this expression to the safe_max
variable.
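A minimal sketch of the guard suggested in the first comment, in dependency-free Python rather than torch (`safe_global_scale` and `eps` are hypothetical names; the actual code would compute `safe_max` from `mat_b.abs().max().float()`):

```python
import math

def safe_global_scale(abs_max: float, eps: float = 1e-6) -> float:
    # Clamp the denominator so an all-zero mat_b yields a finite scale
    # instead of inf. 448 and 6 are the E4M3 (FP8) and E2M1 (FP4) maxima.
    safe_max = max(abs_max, eps)
    return (448 * 6) / safe_max

# An all-zero tensor now produces a large but finite scale:
assert math.isfinite(safe_global_scale(0.0))
```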
- Around line 58-60: The test currently only checks cosine similarity
(scale-invariant) and can miss cases where alpha is ignored; update the
assertion after computing c and c_ref (variables: c, c_ref, alpha, mat_a, mat_b)
to include a magnitude-sensitive check — e.g., assert that the tensor norms or
element-wise values match the expected scaled result using torch.allclose or a
max-abs-difference threshold (for example compare c.norm() to c_ref.norm() with
a small rtol/atol or assert torch.allclose(c, c_ref, rtol=1e-3, atol=1e-6)) so
the test fails if alpha is not applied correctly.
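A sketch of the magnitude-sensitive check proposed in the second comment, again in plain Python (the helper names are hypothetical; the real test would compare torch tensors with `torch.allclose` or a norm comparison as the comment suggests):

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def norms_match(c, c_ref, rtol=1e-2):
    # Cosine similarity alone is scale-invariant, so additionally require
    # the magnitudes to agree; this fails if alpha scaling is dropped.
    ref = norm(c_ref)
    return abs(norm(c) - ref) <= rtol * max(ref, 1e-12)

assert norms_match([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
# Same direction but doubled magnitude (e.g., alpha ignored) is rejected:
assert not norms_match([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```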

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e3a4f43 and 519a722.

📒 Files selected for processing (1)
  • tests/unittest/_torch/thop/parallel/test_w4a8_mxfp4_mxfp8_gemm.py

Comment threads (2): tests/unittest/_torch/thop/parallel/test_w4a8_mxfp4_mxfp8_gemm.py
@tensorrt-cicd (Collaborator)

PR_Github #36905 [ run ] completed with state SUCCESS. Commit: 519a722
/LLM/main/L0_MergeRequest_PR pipeline #28573 completed with status: 'SUCCESS'

Link to invocation

@Tracin requested a review from kaiyux, February 27, 2026 02:14
@Tracin merged commit 6f7138a into NVIDIA:main, Feb 27, 2026
7 of 9 checks passed
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Mar 9, 2026
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
@Tracin deleted the fix_test branch, March 13, 2026 06:31
tianyuz-nv pushed a commit to wanqian-nv/TensorRT-LLM that referenced this pull request Mar 19, 2026
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>


3 participants
