fix(vllm): structured outputs silently ignored on vLLM >= 0.23 (GuidedDecodingParams removed)#10343
Open
pos-ei-don wants to merge 1 commit into
vLLM >= 0.23 removed GuidedDecodingParams (now StructuredOutputsParams) and renamed the SamplingParams field guided_decoding -> structured_outputs. The import failed, HAS_GUIDED_DECODING became False, and the whole guided-decoding block was skipped, so response_format / grammar constraints were silently ignored. Adapt the existing request.Grammar path to the new class/field.

Signed-off-by: pos-ei-don <1822533+pos-ei-don@users.noreply.github.com>
Heads-up: the All functional specs pass (168/168 unit + 92/101 E2E, 9 skipped). This diff is 6 lines in |
Supersedes #10339, which was auto-closed when I force-pushed from a shallow clone that had severed the commit's parent (the diff briefly blew up to the whole tree and GitHub closed it; sorry for the noise). This branch is now cleanly based on current `master` with a single 6-line change.

Description
On vLLM >= 0.23, `GuidedDecodingParams` was removed from `vllm.sampling_params` (replaced by `StructuredOutputsParams`, and the `SamplingParams` field renamed `guided_decoding` -> `structured_outputs`). The vLLM backend's import therefore fails: `HAS_GUIDED_DECODING` becomes `False`, so the entire guided-decoding block is skipped and `response_format` / grammar constraints are silently ignored. The model returns unconstrained text.

Reproduce (vLLM 0.23.0, request with `"response_format": {"type": "json_schema", ...}`):

- Before: unconstrained text (`Innsbruck`), schema not enforced.
- After: schema-conforming JSON (`{"country":"Österreich","name":"Innsbruck","population":132493}`).

Verified on NVIDIA GB10 / arm64 / CUDA 13, vLLM 0.23.0.
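For reviewers unfamiliar with the failure mode, here is a minimal sketch of how a guarded import can silently disable a feature. It is not the backend's actual code: `has_symbol` and `build_sampling_kwargs` are hypothetical helpers, and the standard-library `json` module stands in for `vllm.sampling_params` so the snippet runs without vLLM installed.

```python
import importlib


def has_symbol(module_name: str, symbol: str) -> bool:
    """Return True only if the module imports and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, symbol)


def build_sampling_kwargs(grammar: str, has_guided_decoding: bool) -> dict:
    """Hypothetical stand-in for the backend's request handling."""
    kwargs = {"temperature": 0.0}
    # When the flag is False the constraint is dropped without any error,
    # which is the silent failure described above.
    if has_guided_decoding and grammar:
        kwargs["guided_decoding"] = {"grammar": grammar}
    return kwargs


# `json` is a stand-in module: it imports fine but lacks the removed class,
# mirroring a vLLM release where GuidedDecodingParams no longer exists.
HAS_GUIDED_DECODING = has_symbol("json", "GuidedDecodingParams")
kwargs = build_sampling_kwargs('root ::= "yes" | "no"', HAS_GUIDED_DECODING)
```

With the flag `False`, `kwargs` carries no constraint at all, and nothing in the request path reports that the grammar was discarded.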
Fix
Adapt the existing `request.Grammar` path to the new class/field (`StructuredOutputsParams` / `structured_outputs`). Per @richiejp's review on #10339, there is no backwards-compat branch: the backend tracks the latest vLLM, so there's no supported version below 0.23 to fall back to.
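A rough sketch of the shape of the adapted path, not the actual 6-line diff: `apply_grammar` is a hypothetical helper, and the fallback dataclass is only a stand-in so the snippet runs where vLLM is not installed. It assumes `StructuredOutputsParams` accepts a `grammar` keyword, which should be checked against the installed vLLM version.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

try:
    # Class name and module as described in this PR for vLLM >= 0.23.
    from vllm.sampling_params import StructuredOutputsParams
except ImportError:
    @dataclass
    class StructuredOutputsParams:  # stand-in for illustration only
        grammar: Optional[str] = None


def apply_grammar(sampling_kwargs: Dict[str, Any], grammar: str) -> Dict[str, Any]:
    """Return a copy of the kwargs with the grammar attached under the new field."""
    result = dict(sampling_kwargs)
    if grammar:
        # The field was renamed: guided_decoding -> structured_outputs.
        result["structured_outputs"] = StructuredOutputsParams(grammar=grammar)
    return result


constrained = apply_grammar({"temperature": 0.0}, 'root ::= "yes" | "no"')
unconstrained = apply_grammar({"temperature": 0.0}, "")
```

The resulting dict would then be passed on as `SamplingParams(**constrained)`; an empty grammar leaves the kwargs untouched, so unconstrained requests behave as before.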