[Security] Fix remote DoS from grammar-rejected spec tokens padded with -1 #44743

Open
jperezdealgaba wants to merge 1 commit into vllm-project:main from jperezdealgaba:fix/dos-grammar-rejected-spec-tokens

@jperezdealgaba
Contributor

Summary

Fixes https://github.com/vllm-project/vllm/security/advisories/GHSA-55m4-88pw-2875

When structured output invalidates speculative draft slots, the scheduler pads them with -1 to preserve the bitmask layout. These placeholders are not valid token IDs, but they can still reach the GPU embeddings, which lets a remote client cause a denial of service.
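To make the failure mode concrete, here is a minimal sketch of how -1 padding can end up in a draft token list. It is not vLLM code: `pad_rejected_drafts`, `has_out_of_range_ids`, and the stateless `grammar_accepts` predicate are hypothetical names chosen for illustration, and a real grammar matcher is stateful.

```python
# Illustrative sketch only; all names here are hypothetical, not vLLM APIs.
INVALID_TOKEN_ID = -1  # placeholder for grammar-rejected draft slots


def pad_rejected_drafts(draft_token_ids, grammar_accepts):
    """Keep the grammar-valid prefix and pad every later slot with -1.

    Padding keeps the list length fixed, so per-slot structures such as a
    bitmask stay aligned with the draft positions.
    """
    padded = []
    rejected = False
    for token_id in draft_token_ids:
        if not rejected and grammar_accepts(token_id):
            padded.append(token_id)
        else:
            # Once one draft is rejected, all following slots are invalid too.
            rejected = True
            padded.append(INVALID_TOKEN_ID)
    return padded


def has_out_of_range_ids(token_ids, vocab_size):
    """Return True if any ID cannot be used as an embedding row index."""
    return any(t < 0 or t >= vocab_size for t in token_ids)


# Simplified assumption: the grammar rejects token 13 and accepts the rest.
drafts = pad_rejected_drafts([11, 12, 13, 14], lambda t: t != 13)
```

In this sketch `drafts` keeps its original length of four, and `has_out_of_range_ids(drafts, vocab_size)` returns True for any vocabulary size, which is the condition the fix has to rule out before the IDs are used for an embedding lookup.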

Fix:

  • Cap accepted tokens to the grammar-valid prefix in update_from_output
  • Sanitize -1 before writing spec tokens to token_ids_cpu so invalid IDs cannot reach GPU embeddings
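The two steps above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the helper names are hypothetical, the "+1" assumes the usual speculative-decoding convention that one token sampled by the target model follows the accepted drafts, and replacing -1 with placeholder ID 0 is only one possible way to sanitize.

```python
# Illustrative sketch only; helper names and the placeholder ID are assumptions.
INVALID_TOKEN_ID = -1


def grammar_valid_prefix_len(spec_token_ids):
    """Number of leading draft tokens that were not padded with -1."""
    for i, token_id in enumerate(spec_token_ids):
        if token_id == INVALID_TOKEN_ID:
            return i
    return len(spec_token_ids)


def cap_accepted_tokens(sampled_token_ids, scheduled_spec_token_ids):
    """Truncate sampled output to the grammar-valid prefix.

    Assumption: at most one extra token (sampled by the target model)
    follows the accepted draft tokens.
    """
    max_tokens = grammar_valid_prefix_len(scheduled_spec_token_ids) + 1
    return sampled_token_ids[:max_tokens]


def sanitize_spec_tokens(spec_token_ids, placeholder_id=0):
    """Replace -1 with an in-vocabulary placeholder before buffering.

    Assumption: placeholder_id is a valid token ID for the model.
    """
    return [
        placeholder_id if token_id == INVALID_TOKEN_ID else token_id
        for token_id in spec_token_ids
    ]


scheduled = [11, 12, -1, -1]
capped = cap_accepted_tokens([11, 12, 99, 100, 101], scheduled)
safe = sanitize_spec_tokens(scheduled)
```

The two steps are complementary: capping keeps padded slots from being counted as accepted output, while sanitizing is a defense-in-depth measure so that any buffer later read for an embedding lookup contains only in-range IDs.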

Test plan

  • Added regression tests in tests/v1/core/test_ghsa_invalid_spec_tokens.py
  • Added unit test in tests/v1/worker/test_gpu_input_batch.py
  • Verified the fix prevents invalid token IDs from reaching the GPU embedding layer

Notes

  • AI assistance was used in developing this fix (per AGENTS.md)
  • This fix was developed in the GHSA private fork and is now submitted for public review

Made with Cursor


@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify mergify Bot added the v1 label Jun 6, 2026
When structured output invalidates speculative draft slots, the scheduler
pads them with -1 for bitmask layout. Cap accepted tokens to the
grammar-valid prefix in update_from_output, and sanitize -1 before writing
spec tokens to token_ids_cpu so invalid IDs cannot reach GPU embeddings.
Add regression tests.

Signed-off-by: Juan Pérez de Algaba <jperezde@redhat.com>
Signed-off-by: jperezde <jperezde@redhat.com>
@jperezdealgaba jperezdealgaba force-pushed the fix/dos-grammar-rejected-spec-tokens branch from 38b136a to 75507c5 Compare June 6, 2026 17:38