Skip to content

fix(vllm): structured outputs silently ignored on vLLM >= 0.23 (GuidedDecodingParams removed)#10339

Closed
pos-ei-don wants to merge 1 commit into
mudler:masterfrom
pos-ei-don:fix/vllm-structured-outputs-023
Closed

fix(vllm): structured outputs silently ignored on vLLM >= 0.23 (GuidedDecodingParams removed)#10339
pos-ei-don wants to merge 1 commit into
mudler:masterfrom
pos-ei-don:fix/vllm-structured-outputs-023

Conversation

@pos-ei-don

Copy link
Copy Markdown
Contributor

Description

On vLLM >= 0.23, GuidedDecodingParams was removed from vllm.sampling_params (replaced by StructuredOutputsParams, and the SamplingParams field renamed guided_decoding -> structured_outputs). The vLLM backend's import therefore fails:

try:
    from vllm.sampling_params import GuidedDecodingParams
    HAS_GUIDED_DECODING = True
except ImportError:
    HAS_GUIDED_DECODING = False

HAS_GUIDED_DECODING becomes False, so the entire guided-decoding block is skipped and response_format / grammar constraints are silently ignored — the model returns unconstrained text.

Reproduce (vLLM 0.23.0, request with "response_format": {"type": "json_schema", ...}):

  • Before: plain text (e.g. Innsbruck), schema not enforced.
  • After: schema-conforming JSON (e.g. {"country":"Österreich","name":"Innsbruck","population":132493}).

Verified on NVIDIA GB10 / arm64 / CUDA 13, vLLM 0.23.0.

Fix

Detect the available class/field at import time and apply via setattr, keeping the existing request.Grammar flow (the Go core already converts response_format into a grammar). Backwards compatible (falls back to GuidedDecodingParams / guided_decoding).

try:
    from vllm.sampling_params import StructuredOutputsParams as _SO_CLS
    HAS_GUIDED_DECODING = True
    _SO_FIELD = "structured_outputs"
except ImportError:
    try:
        from vllm.sampling_params import GuidedDecodingParams as _SO_CLS
        HAS_GUIDED_DECODING = True
        _SO_FIELD = "guided_decoding"
    except ImportError:
        HAS_GUIDED_DECODING = False
        _SO_CLS = None
        _SO_FIELD = None
if HAS_GUIDED_DECODING and request.Grammar:
    try:
        json.loads(request.Grammar)
        setattr(sampling_params, _SO_FIELD, _SO_CLS(json=request.Grammar))
    except json.JSONDecodeError:
        setattr(sampling_params, _SO_FIELD, _SO_CLS(grammar=request.Grammar))

Related: #8806 targets the same rename for native json_schema via metadata, but currently adds the class detection without consuming it; this PR fixes the existing path so structured output works again on >= 0.23.

@richiejp

Copy link
Copy Markdown
Collaborator

Thanks!

Do we need backwards compat? LLMs always want to add backwards compat, but we always pull the latest version of vLLM to my knowledge. So we can probably drop that and if CI passes we are good.

Otherwise I'd like to have a test that proves the backwards compat is still needed and covers that branch because otherwise we can just have dead code living there forever (or until testing code coverage becomes more aggressive).

vLLM >= 0.23 removed GuidedDecodingParams (now StructuredOutputsParams) and
renamed the SamplingParams field guided_decoding -> structured_outputs. The
import failed, HAS_GUIDED_DECODING became False, and the whole guided-decoding
block was skipped, so response_format / grammar constraints were silently
ignored. Adapt the existing request.Grammar path to the new class/field.

Signed-off-by: pos-ei-don <1822533+pos-ei-don@users.noreply.github.com>
@pos-ei-don pos-ei-don closed this Jun 15, 2026
@pos-ei-don pos-ei-don force-pushed the fix/vllm-structured-outputs-023 branch from 3cd27c7 to 2809e16 Compare June 15, 2026 11:32
@pos-ei-don

Copy link
Copy Markdown
Contributor Author

Agreed — no, we don't need the backwards-compat branch. StructuredOutputsParams has been in vllm.sampling_params since 0.23, and since the backend tracks the latest vLLM there's no supported version below that to fall back to. I dropped the fallback and force-pushed: the PR now just adapts the existing request.Grammar path to the renamed class/field (structured_outputs), so there's no dead branch left to test.

The one remaining try/except ImportError around the import is the original guard (vLLM not installed at all → HAS_GUIDED_DECODING = False), not a version fallback — happy to make the import unconditional if you'd rather.

Re-verified on vLLM 0.23.0 (GB10 / arm64 / CUDA 13): with response_format json_schema the model now returns schema-conforming JSON instead of unconstrained text.

@pos-ei-don

Copy link
Copy Markdown
Contributor Author

Reopened as #10343 — this one got auto-closed when I force-pushed from a shallow clone and accidentally severed the commit's parent. #10343 is the same fix (backwards-compat dropped per your review), cleanly rebased on current master. Continuing there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants