fix(vllm): structured outputs silently ignored on vLLM >= 0.23 (GuidedDecodingParams removed)#10343

Open
pos-ei-don wants to merge 1 commit into mudler:master from pos-ei-don:fix/vllm-structured-outputs-023

@pos-ei-don

Contributor

Supersedes #10339, which got auto-closed when I force-pushed from a shallow clone that had severed the commit's parent (the diff briefly blew up to the whole tree and GitHub closed it — sorry for the noise). This branch is now cleanly based on current master with a single 6-line change.

Description

On vLLM >= 0.23, GuidedDecodingParams was removed from vllm.sampling_params (replaced by StructuredOutputsParams, and the SamplingParams field renamed guided_decoding -> structured_outputs). The vLLM backend's import therefore fails:

try:
    from vllm.sampling_params import GuidedDecodingParams
    HAS_GUIDED_DECODING = True
except ImportError:
    HAS_GUIDED_DECODING = False

HAS_GUIDED_DECODING becomes False, so the entire guided-decoding block is skipped and response_format / grammar constraints are silently ignored — the model returns unconstrained text.
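The failure mode can be reproduced without vLLM installed. The sketch below is only an illustration: the standard-library `types` module stands in for `vllm.sampling_params`, and `sampling_kwargs` is a hypothetical helper, not the backend's actual code. Importing a name that a module no longer exports raises `ImportError`, the guard swallows it, and the constraint never reaches the sampling arguments.

```python
# Stand-in for vllm.sampling_params: 'types' exists, but it does not export
# GuidedDecodingParams, so the import raises ImportError just like on vLLM >= 0.23.
try:
    from types import GuidedDecodingParams  # noqa: F401
    HAS_GUIDED_DECODING = True
except ImportError:
    HAS_GUIDED_DECODING = False


def sampling_kwargs(grammar: str) -> dict:
    """Hypothetical helper mirroring the guarded block in the backend."""
    kwargs = {"temperature": 0.0}
    # With the flag False this branch is skipped and nothing is logged,
    # so the caller cannot tell that the grammar was dropped.
    if HAS_GUIDED_DECODING and grammar:
        kwargs["guided_decoding"] = {"grammar": grammar}
    return kwargs


print(HAS_GUIDED_DECODING)                       # False
print(sampling_kwargs('root ::= "yes" | "no"'))  # {'temperature': 0.0}
```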

Reproduce (vLLM 0.23.0, request with "response_format": {"type": "json_schema", ...}):

  • Before: plain text (e.g. Innsbruck), schema not enforced.
  • After: schema-conforming JSON (e.g. {"country":"Österreich","name":"Innsbruck","population":132493}).
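A quick way to tell the two outcomes apart is to check whether the response body parses as a JSON object. This is a minimal sketch using the two example outputs quoted above; `is_json_object` is a hypothetical helper, and it deliberately does not validate against the schema, which is elided in the reproduction request.

```python
import json


def is_json_object(text: str) -> bool:
    """Return True if text parses as a JSON object (dict), False otherwise."""
    try:
        return isinstance(json.loads(text), dict)
    except json.JSONDecodeError:
        return False


before = "Innsbruck"
after = '{"country":"Österreich","name":"Innsbruck","population":132493}'

print(is_json_object(before))  # False
print(is_json_object(after))   # True
```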

Verified on NVIDIA GB10 / arm64 / CUDA 13, vLLM 0.23.0.

Fix

Adapt the existing request.Grammar path to the new class/field (StructuredOutputsParams / structured_outputs). Per @richiejp's review on #10339, no backwards-compat branch — the backend tracks the latest vLLM, so there's no supported version below 0.23 to fall back to.
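For readers without the diff open, the shape of the change is roughly as follows. This is a sketch under assumptions, not the actual 6-line patch: `structured_output_kwargs` is a hypothetical helper, and passing the grammar through the `grammar` field of `StructuredOutputsParams` is assumed from the description above.

```python
def structured_output_kwargs(grammar: str) -> dict:
    """Hypothetical helper: build SamplingParams kwargs for a grammar constraint."""
    if not grammar:
        # No constraint requested: leave SamplingParams untouched.
        return {}
    # Imported lazily so this sketch can be loaded without vLLM installed.
    from vllm.sampling_params import StructuredOutputsParams
    # New field name on SamplingParams: structured_outputs (was guided_decoding).
    return {"structured_outputs": StructuredOutputsParams(grammar=grammar)}


# Usage with vLLM >= 0.23 installed (assumed call shape):
#   from vllm import SamplingParams
#   params = SamplingParams(temperature=0.0, **structured_output_kwargs(request.Grammar))
```

Because there is no fallback branch, a future rename would surface as an `ImportError` at request time instead of silently disabling the constraint.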

vLLM >= 0.23 removed GuidedDecodingParams (now StructuredOutputsParams) and
renamed the SamplingParams field guided_decoding -> structured_outputs. The
import failed, HAS_GUIDED_DECODING became False, and the whole guided-decoding
block was skipped, so response_format / grammar constraints were silently
ignored. Adapt the existing request.Grammar path to the new class/field.

Signed-off-by: pos-ei-don <1822533+pos-ei-don@users.noreply.github.com>
@pos-ei-don

Contributor Author

Heads-up: the failing tests check here is unrelated to this change. TestAllFieldsHaveRegistryEntries (core/config/meta/registry_coverage_test.go) fails on the config field pipeline.max_history_items (added in #10331), which has no entry in core/config/meta/registry.go and isn't grandfathered, so the coverage gate trips on current master and on any PR branched off it.

All functional specs pass (168/168 unit + 92/101 E2E, 9 skipped). This diff is 6 lines in backend/python/vllm/backend.py and can't affect a Go config-registry gate. Happy to send a separate one-line PR adding the registry entry if that helps.
