[None][chore] Minor fix in w4a8 mxfp4 mxfp8 test. #11745
Tracin merged 1 commit into NVIDIA:main from NVIDIA/TensorRT-LLM:main
Conversation
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
/bot run
PR_Github #36905 [ run ] triggered by Bot. Commit:
📝 Walkthrough
A test file for GEMM operations is modified to use FP8/FP4 quantization pathways via TRTLLM operations, introduce seed control, replace exact numerical comparison with cosine-similarity validation (>0.98), add alpha scaling, and adjust dimension configurations.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/unittest/_torch/thop/parallel/test_w4a8_mxfp4_mxfp8_gemm.py`:
- Line 49: The division by `mat_b.abs().max().float()` in the `global_scale_b`
calculation can produce `inf` for an all-zero tensor. Guard the denominator by
computing a safe max (e.g., clamp `mat_b.abs().max().float()` to a minimum such
as 1e-6 via `torch.clamp`) and use that `safe_max` in the division, so
`global_scale_b = (448 * 6) / safe_max`.
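The guard described in that comment can be sketched as follows. This is a minimal plain-Python illustration of the idea, with `safe_scale` as a hypothetical helper name; the actual test would clamp the torch tensor max instead:

```python
def safe_scale(max_abs: float, eps: float = 1e-6) -> float:
    # Clamp the denominator to eps so an all-zero input cannot
    # produce an inf scale (mirrors torch.clamp(min=eps) on a tensor).
    safe_max = max(max_abs, eps)
    return (448 * 6) / safe_max

print(safe_scale(2.0))  # → 1344.0
print(safe_scale(0.0))  # large but finite: 2688 / 1e-6
```

With the guard in place, a degenerate all-zero `mat_b` yields a large but finite scale instead of `inf`.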
- Around lines 58-60: The test currently only checks cosine similarity, which is
scale-invariant and can miss cases where `alpha` is ignored. After computing `c`
and `c_ref`, add a magnitude-sensitive check, e.g., compare `c.norm()` to
`c_ref.norm()` with a small rtol/atol, or assert
`torch.allclose(c, c_ref, rtol=1e-3, atol=1e-6)`, so the test fails if `alpha`
is not applied correctly.
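A combined direction-plus-magnitude check along those lines might look like this. It is a plain-Python sketch with hypothetical helper names (`cosine_sim`, `check_close`); the real test would operate on torch tensors:

```python
import math

def cosine_sim(a, b):
    # Scale-invariant direction check.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def check_close(c, c_ref, sim_thresh=0.98, norm_rtol=1e-3):
    # 1) Direction: cosine similarity, as the test already does.
    assert cosine_sim(c, c_ref) > sim_thresh
    # 2) Magnitude: norms must also match, so an ignored alpha is caught.
    nc = math.sqrt(sum(x * x for x in c))
    nr = math.sqrt(sum(x * x for x in c_ref))
    assert abs(nc - nr) <= norm_rtol * nr

c_ref = [1.0, 2.0, 3.0]
check_close([1.001, 2.0, 3.0], c_ref)  # passes: same direction and scale
# check_close([2.0, 4.0, 6.0], c_ref)  # would fail: cosine sim is 1.0,
#                                      # but the norm check flags the 2x scale
```

The commented-out call shows the failure mode the review comment is after: a result off by a constant factor has perfect cosine similarity, and only the magnitude check catches it.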
PR_Github #36905 [ run ] completed with state
Description
Test Coverage
PR Checklist
Please review the following before submitting your PR:
- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions).
- Any new dependencies have been scanned for license and vulnerabilities.
- CODEOWNERS updated if ownership changes.
- Documentation updated as needed.
- Update tava architecture diagram if there is a significant design change in the PR.
- The reviewers assigned automatically/manually are appropriate for the PR.

Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment `/bot help`.