fix(vllm): structured outputs silently ignored on vLLM >= 0.23 (GuidedDecodingParams removed) #10339
pos-ei-don wants to merge 1 commit into
Thanks! Do we need backwards compat? LLMs always want to add backwards compat, but to my knowledge we always pull the latest version of vLLM, so we can probably drop it, and if CI passes we are good. Otherwise I'd like a test that proves the backwards compat is still needed and covers that branch; without one, the fallback is just dead code that lives there forever (or until code-coverage checks become more aggressive).
vLLM >= 0.23 removed GuidedDecodingParams (now StructuredOutputsParams) and renamed the SamplingParams field guided_decoding -> structured_outputs. The import failed, HAS_GUIDED_DECODING became False, and the whole guided-decoding block was skipped, so response_format / grammar constraints were silently ignored.

Adapt the existing request.Grammar path to the new class/field.

Signed-off-by: pos-ei-don <1822533+pos-ei-don@users.noreply.github.com>
3cd27c7 to 2809e16
Agreed: no, we don't need the backwards-compat branch. The one remaining Re-verified on vLLM 0.23.0 (GB10 / arm64 / CUDA 13): with
Description
On vLLM >= 0.23, `GuidedDecodingParams` was removed from `vllm.sampling_params` (replaced by `StructuredOutputsParams`, and the `SamplingParams` field renamed `guided_decoding` -> `structured_outputs`). The vLLM backend's import therefore fails: `HAS_GUIDED_DECODING` becomes `False`, so the entire guided-decoding block is skipped and `response_format` / grammar constraints are silently ignored. The model returns unconstrained text.

Reproduce (vLLM 0.23.0, request with `"response_format": {"type": "json_schema", ...}`):

- [...] `Innsbruck`), schema not enforced.
- [...] `{"country":"Österreich","name":"Innsbruck","population":132493}`).

Verified on NVIDIA GB10 / arm64 / CUDA 13, vLLM 0.23.0.
Fix
Detect the available class/field at import time and apply via `setattr`, keeping the existing `request.Grammar` flow (the Go core already converts `response_format` into a grammar). Backwards compatible (falls back to `GuidedDecodingParams` / `guided_decoding`).

Related: #8806 targets the same rename for native json_schema via metadata, but currently adds the class detection without consuming it; this PR fixes the existing path so structured output works again on >= 0.23.
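For readers who want to see the shape of the approach described under Fix, here is a minimal sketch of import-time detection followed by `setattr`, including the fallback that paragraph mentions. It is not the PR's actual diff: the helper name `apply_grammar` and the private names `_PARAMS_CLS` / `_FIELD` are hypothetical, and it assumes both vLLM classes accept a `grammar=` keyword argument.

```python
# Probe for whichever class this vLLM build exposes. Class and field names
# come from the PR description; everything else here is illustrative.
try:
    from vllm.sampling_params import StructuredOutputsParams as _PARAMS_CLS
    _FIELD = "structured_outputs"
except ImportError:
    try:
        from vllm.sampling_params import GuidedDecodingParams as _PARAMS_CLS
        _FIELD = "guided_decoding"
    except ImportError:
        # Neither class is importable (or vLLM is not installed at all).
        _PARAMS_CLS, _FIELD = None, None

HAS_GUIDED_DECODING = _PARAMS_CLS is not None


def apply_grammar(sampling_params, grammar, params_cls=_PARAMS_CLS, field=_FIELD):
    """Hypothetical helper: attach a grammar constraint to sampling_params.

    Returns True if the constraint was applied, False if there was no
    grammar to apply or no supported params class was found.
    Assumption: params_cls accepts a `grammar=` keyword argument.
    """
    if not grammar or params_cls is None:
        return False
    # setattr lets one code path serve both field names.
    setattr(sampling_params, field, params_cls(grammar=grammar))
    return True
```

Returning a boolean lets the caller log or reject a request whose grammar could not be applied, rather than silently dropping the constraint. If the fallback branch is dropped, as discussed in the conversation, the inner `try` goes away and only the `StructuredOutputsParams` import remains.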