Pull requests: vllm-project/vllm
[Core] Async scheduling + structured outputs compatibility
kv-connector
structured-output
suppress-bc-linter
tpu
Related to Google TPUs
v1
[Feature] Support Prefill Context Parallel (PCP) for GQA flashinfer
#26864
opened Oct 15, 2025 by
LookAround0301
3 tasks
[Docs] Move build.inc into arm.inc
documentation
Improvements or additions to documentation
#26862
opened Oct 15, 2025 by
windsonsea
[DO NOT MERGE] Experiments related to MoE kernels
#26860
opened Oct 15, 2025 by
zhuohan123
Draft
5 tasks
Disable FlashInfer sampler by default
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#26859
opened Oct 15, 2025 by
mgoin
5 tasks
[small][batch invariance] Rename the env and internal flags to simplify usage
gpt-oss
Related to GPT-OSS models
v1
#26855
opened Oct 14, 2025 by
bwasti
3 of 5 tasks
Update TritonLanguagePlaceholder to have attributes that are used by Flash Linear Attention ops.
#26853
opened Oct 14, 2025 by
madongfly
Adjusting AMD test composition 2025-10-14
ci/build
rocm
Related to AMD ROCm
#26852
opened Oct 14, 2025 by
Alexei-V-Ivanov-AMD
[BUGFIX][NIXL] quick fix for 'assert self.connector_worker is not None' in get_kv_connector_stats
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
#26851
opened Oct 14, 2025 by
xuechendi
5 tasks
[Compressed Tensors] Always clone output for compile robustness
#26849
opened Oct 14, 2025 by
kylesayrs
[BugFix] NIXL connector WAR "prefill TP < decode TP" limitation for MLA case
kv-connector
#26848
opened Oct 14, 2025 by
GuanLuo
5 tasks
[Frontend][torch.compile] CompilationConfig Overhaul (#20283): Set up -O infrastructure
documentation
Improvements or additions to documentation
frontend
llama
Related to Llama models
speculative-decoding
tpu
Related to Google TPUs
v1
#26847
opened Oct 14, 2025 by
morrison-turnansky
Draft
5 tasks
[Attention] Tune CUTLASS MLA num_splits
#26846
opened Oct 14, 2025 by
MatthewBonanni
5 tasks
[CI] Fix mypy for vllm/executor
ready
ONLY add when PR is ready to merge/full CI is needed
#26845
opened Oct 14, 2025 by
yewentao256
Guard SM100 CUTLASS MoE macro to SM100 builds
ci/build
#26844
opened Oct 14, 2025 by
Jonahcb
2 of 5 tasks
[Easy] Get rid of unnecessary paraenthesis in kv_cache_manager
v1
#26842
opened Oct 14, 2025 by
Jialin
3 of 5 tasks
[CI/Build] Fix AMD import failures in CI
ci/build
rocm
Related to AMD ROCm
#26841
opened Oct 14, 2025 by
zhewenl
Add attention benchmarking tools
performance
Performance-related issues
#26835
opened Oct 14, 2025 by
MatthewBonanni
1 of 5 tasks
[Bug] Add Assertion for random-input-len / random-output-len
performance
Performance-related issues
ready
ONLY add when PR is ready to merge/full CI is needed
#26834
opened Oct 14, 2025 by
yewentao256
[Graph Partition] pass tests for decorator
ready
ONLY add when PR is ready to merge/full CI is needed
#26831
opened Oct 14, 2025 by
BoyuanFeng
[Bugfix] : prevent automatic URL path decoding for async client #26636
#26829
opened Oct 14, 2025 by
1994
[Bugfix] Fixes prefix-repetition benchmark script
performance
Performance-related issues
ready
ONLY add when PR is ready to merge/full CI is needed
#26828
opened Oct 14, 2025 by
kouroshHakha