Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

[https://nvbugs/5987470][fix] BREAKING: Do not normalize log probs by default#12366

Merged
achartier merged 3 commits intoNVIDIA:mainNVIDIA/TensorRT-LLM:mainfrom
achartier:normachartier/TensorRT-LLM:normCopy head branch name to clipboard
Mar 24, 2026
Merged

[https://nvbugs/5987470][fix] BREAKING: Do not normalize log probs by default#12366
achartier merged 3 commits intoNVIDIA:mainNVIDIA/TensorRT-LLM:mainfrom
achartier:normachartier/TensorRT-LLM:normCopy head branch name to clipboard

Conversation

@achartier
Copy link
Copy Markdown
Collaborator

@achartier achartier commented Mar 19, 2026

Expose the option in ModelRunnerCpp

Summary by CodeRabbit

  • New Features

    • Added parameter to enable customization of log probability normalization behavior, allowing users to control how probability values are formatted during inference.
  • Changes

    • Updated default configuration to disable log probability normalization, changing baseline behavior for probability output formatting in the executor without requiring explicit parameter specification.

Description

Log probs are only useful when not normalized for greedy sampling which is the most common use case. Also the pytorch backend does not normalize log probs, so this is aligning behaviors between the two backends.

Test Coverage

Existing log probs tests.

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Expose the option in ModelRunnerCpp

Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
@achartier
Copy link
Copy Markdown
Collaborator Author

@CodeRabbit review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 19, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 19, 2026

📝 Walkthrough

Walkthrough

The default value for the normalizeLogProbs parameter is changed from true to false across the C++ header, Python binding, and Python API. This modifies the default log probability normalization behavior when callers omit the argument.

Changes

Cohort / File(s) Summary
Default Parameter Value Updates
cpp/include/tensorrt_llm/executor/executor.h, cpp/tensorrt_llm/nanobind/executor/executorConfig.cpp
Changed default value of normalizeLogProbs/normalize_log_probs parameter from true to false in ExecutorConfig constructor across C++ header and Python binding.
Python API Enhancement
tensorrt_llm/runtime/model_runner_cpp.py
Added new normalize_log_probs: bool = False parameter to ModelRunnerCpp.from_dir() and forwards it to ExecutorConfig.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive The PR description provides the rationale for changes and mentions test coverage, but lacks a clear summary of what was actually changed across the three modified files. Add a more detailed description summarizing the changes made: updating default values in ExecutorConfig (C++ header and Python binding) and adding a new parameter to ModelRunnerCpp.from_dir(). Clarify which changes address which aspect of the PR objective.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the main change: setting the default value of normalizeLogProbs to false across C++, nanobind, and Python layers. It clearly indicates this is a breaking change.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can use your project's `pylint` configuration to improve the quality of Python code reviews.

Add a pylint configuration file to your project to customize how CodeRabbit runs pylint.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/tensorrt_llm/nanobind/executor/executorConfig.cpp (1)

1-3: ⚠️ Potential issue | 🟠 Major

Update the file copyright year to include 2026.

This file is modified in this PR, but the SPDX copyright header still ends at 2025.

Suggested fix
- * SPDX-FileCopyrightText: Copyright (c) 2022-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2022-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

As per coding guidelines, "Add NVIDIA copyright header on ALL new files, and update year on modified files".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tensorrt_llm/nanobind/executor/executorConfig.cpp` around lines 1 - 3,
Update the SPDX copyright header at the top of the modified file by extending
the year range to include 2026: locate the existing header lines starting with
"SPDX-FileCopyrightText:" and "SPDX-License-Identifier:" in executorConfig.cpp
and change the year range "2022-2025" to "2022-2026" so the file reflects the
updated modification year.
🧹 Nitpick comments (1)
tensorrt_llm/runtime/model_runner_cpp.py (1)

129-205: Document the new normalize_log_probs argument in from_dir docstring.

The new public parameter is added to the signature but not described in Args, which makes runtime behavior changes easier to miss.

Suggested doc update
         fail_fast_on_attention_window_too_large (bool):
             Whether to fail fast if the attention window(s) are too large to fit even a single sequence in the KVCache.
+        normalize_log_probs (bool):
+            Whether returned log probabilities should be normalized. Defaults to False.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/runtime/model_runner_cpp.py` around lines 129 - 205, The
docstring for ModelRunnerCpp.from_dir is missing documentation for the new
normalize_log_probs parameter; add an Args entry describing normalize_log_probs
(type bool), its default value (False), and what turning it on/off does to
output (e.g., whether logits are converted to normalized log probabilities
before downstream processing/generation), placing it with the other parameter
descriptions in the from_dir docstring and referencing the normalize_log_probs
parameter name exactly so users see its runtime effect.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@cpp/tensorrt_llm/nanobind/executor/executorConfig.cpp`:
- Around line 1-3: Update the SPDX copyright header at the top of the modified
file by extending the year range to include 2026: locate the existing header
lines starting with "SPDX-FileCopyrightText:" and "SPDX-License-Identifier:" in
executorConfig.cpp and change the year range "2022-2025" to "2022-2026" so the
file reflects the updated modification year.

---

Nitpick comments:
In `@tensorrt_llm/runtime/model_runner_cpp.py`:
- Around line 129-205: The docstring for ModelRunnerCpp.from_dir is missing
documentation for the new normalize_log_probs parameter; add an Args entry
describing normalize_log_probs (type bool), its default value (False), and what
turning it on/off does to output (e.g., whether logits are converted to
normalized log probabilities before downstream processing/generation), placing
it with the other parameter descriptions in the from_dir docstring and
referencing the normalize_log_probs parameter name exactly so users see its
runtime effect.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d3850d7d-6212-4cf9-a5d4-a72ab26951b9

📥 Commits

Reviewing files that changed from the base of the PR and between e37493a and 5a32ec9.

📒 Files selected for processing (3)
  • cpp/include/tensorrt_llm/executor/executor.h
  • cpp/tensorrt_llm/nanobind/executor/executorConfig.cpp
  • tensorrt_llm/runtime/model_runner_cpp.py

Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
@achartier achartier requested a review from a team as a code owner March 19, 2026 22:06
@achartier achartier requested a review from SimengLiu-nv March 19, 2026 22:06
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
@achartier
Copy link
Copy Markdown
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #39646 [ run ] triggered by Bot. Commit: ac66157 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #39646 [ run ] completed with state SUCCESS. Commit: ac66157
/LLM/main/L0_MergeRequest_PR pipeline #30851 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@pcastonguay
Copy link
Copy Markdown
Collaborator

Can you rename PR to [https://nvbugs/5987470][fix] BREAKING: Do not normalize log probs by default since this is a breaking change.

@pcastonguay
Copy link
Copy Markdown
Collaborator

@achartier so just to be clear, this shouldn't impact LLM API behavior correct?

@pcastonguay
Copy link
Copy Markdown
Collaborator

/bot run --disable-fail-fast

@achartier achartier changed the title [https://nvbugs/5987470][fix] Do not normalize log probs by default [https://nvbugs/5987470][fix] BREAKING: Do not normalize log probs by default Mar 20, 2026
@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #39741 [ run ] triggered by Bot. Commit: ac66157 Link to invocation

@achartier
Copy link
Copy Markdown
Collaborator Author

achartier commented Mar 20, 2026

@achartier so just to be clear, this shouldn't impact LLM API behavior correct?

Yes, normalize_log_probs is already false by default when using LLM API:

normalize_log_probs: bool = Field(
default=False, description="Normalize log probabilities.")

@pcastonguay
Copy link
Copy Markdown
Collaborator

@Funatiq @schetlur-nv @MartinMarciniszyn are you aware of customers using the C++ runtime that could be impacted by this change?

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #39741 [ run ] completed with state SUCCESS. Commit: ac66157
/LLM/main/L0_MergeRequest_PR pipeline #30936 completed with status: 'SUCCESS'

CI Report

Link to invocation

Copy link
Copy Markdown
Collaborator

@ixlmar ixlmar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few thoughts:

@achartier
Copy link
Copy Markdown
Collaborator Author

Yes, that's correct. Only the top-K code path has normalizeLogProbs

The TRT backend logprobs do account for temperature indeed. Would the decodingParams.h header description of outputLogProbs be a good spot to document it (in a separate PR)?

@achartier achartier merged commit 73a02ee into NVIDIA:main Mar 24, 2026
7 checks passed
@achartier achartier deleted the norm branch March 24, 2026 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Morty Proxy This is a proxified and sanitized view of the page, visit original site.