fix(vllm): structured outputs silently ignored on vLLM >= 0.23 (GuidedDecodingParams removed) by pos-ei-don · Pull Request #10343 · mudler/LocalAI

pos-ei-don · 2026-06-15T11:41:33Z

Supersedes #10339, which got auto-closed when I force-pushed from a shallow clone that had severed the commit's parent (the diff briefly blew up to the whole tree and GitHub closed it — sorry for the noise). This branch is now cleanly based on current master with a single 6-line change.

Description

On vLLM >= 0.23, GuidedDecodingParams was removed from vllm.sampling_params (replaced by StructuredOutputsParams, and the SamplingParams field renamed guided_decoding -> structured_outputs). The vLLM backend's import therefore fails:

try:
    from vllm.sampling_params import GuidedDecodingParams
    HAS_GUIDED_DECODING = True
except ImportError:
    HAS_GUIDED_DECODING = False

HAS_GUIDED_DECODING becomes False, so the entire guided-decoding block is skipped and response_format / grammar constraints are silently ignored — the model returns unconstrained text.
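The failure mode described above can be reproduced without vLLM installed. The following is a minimal sketch only: `fake_sampling_params` is a hypothetical stand-in module that exposes just the new class name, and `build_sampling_kwargs` is an illustrative helper, not the backend's actual code.

```python
import sys
import types

# Hypothetical stand-in for vllm.sampling_params after the rename:
# only the new class name exists. vLLM itself is not imported here.
fake = types.ModuleType("fake_sampling_params")
fake.StructuredOutputsParams = type("StructuredOutputsParams", (), {})
sys.modules["fake_sampling_params"] = fake

try:
    # The old name is gone, so this raises ImportError.
    from fake_sampling_params import GuidedDecodingParams
    HAS_GUIDED_DECODING = True
except ImportError:
    HAS_GUIDED_DECODING = False


def build_sampling_kwargs(grammar: str) -> dict:
    """Attach the constraint only when the feature flag is set."""
    kwargs = {"temperature": 0.0}
    if HAS_GUIDED_DECODING and grammar:
        kwargs["guided_decoding"] = GuidedDecodingParams(grammar=grammar)
    return kwargs


print(HAS_GUIDED_DECODING)                      # False
print(build_sampling_kwargs('root ::= "yes"'))  # {'temperature': 0.0}
```

No exception reaches the caller and nothing is logged: the grammar is dropped and generation proceeds unconstrained.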

Reproduce (vLLM 0.23.0, request with "response_format": {"type": "json_schema", ...}):

Before: plain text (e.g. Innsbruck), schema not enforced.
After: schema-conforming JSON (e.g. {"country":"Österreich","name":"Innsbruck","population":132493}).

Verified on NVIDIA GB10 / arm64 / CUDA 13, vLLM 0.23.0.
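A quick way to tell the two outcomes apart when reproducing is to check whether the reply parses as JSON with the expected fields. This is a sketch under assumptions: the key names and types are read off the example output above, since the actual schema is elided in this description, and `conforms` is a hypothetical helper.

```python
import json

# Assumed from the "After" example; the real schema is not shown in the PR.
EXPECTED_FIELDS = {"country": str, "name": str, "population": int}


def conforms(reply_text: str) -> bool:
    """Return True if the reply is a JSON object with the expected typed fields."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return all(isinstance(data.get(key), typ) for key, typ in EXPECTED_FIELDS.items())


print(conforms("Innsbruck"))
print(conforms('{"country":"Österreich","name":"Innsbruck","population":132493}'))
```

The first call corresponds to the "Before" reply and the second to the "After" reply.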

Fix

Adapt the existing request.Grammar path to the new class/field (StructuredOutputsParams / structured_outputs). Per @richiejp's review on #10339, no backwards-compat branch — the backend tracks the latest vLLM, so there's no supported version below 0.23 to fall back to.
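The shape of the change can be sketched roughly as follows. This is not the actual diff: `apply_grammar` is a hypothetical helper, `StandInStructuredOutputsParams` is a stand-in so the example runs without vLLM, and the `grammar` keyword on the real class is an assumption here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandInStructuredOutputsParams:
    """Hypothetical stand-in for vLLM's StructuredOutputsParams."""
    grammar: Optional[str] = None


def apply_grammar(sampling_kwargs: dict, grammar: str,
                  params_cls=StandInStructuredOutputsParams) -> dict:
    """Return a copy of the kwargs with the grammar attached under the new field name."""
    out = dict(sampling_kwargs)
    if grammar:
        # New field name: structured_outputs (previously guided_decoding).
        out["structured_outputs"] = params_cls(grammar=grammar)
    return out


kwargs = apply_grammar({"temperature": 0.0}, 'root ::= "yes" | "no"')
print(sorted(kwargs))  # ['structured_outputs', 'temperature']
```

With vLLM installed, the idea would be to pass the real class from vllm.sampling_params as `params_cls` and feed the resulting kwargs to SamplingParams, assuming its constructor accepts them as described in this PR.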

vLLM >= 0.23 removed GuidedDecodingParams (now StructuredOutputsParams) and renamed the SamplingParams field guided_decoding -> structured_outputs. The import failed, HAS_GUIDED_DECODING became False, and the whole guided-decoding block was skipped, so response_format / grammar constraints were silently ignored. Adapt the existing request.Grammar path to the new class/field.

Signed-off-by: pos-ei-don <1822533+pos-ei-don@users.noreply.github.com>

pos-ei-don · 2026-06-15T15:06:43Z

Heads-up: the failing tests check here is unrelated to this change. TestAllFieldsHaveRegistryEntries (core/config/meta/registry_coverage_test.go) fails on the config field pipeline.max_history_items (added in #10331), which has no entry in core/config/meta/registry.go and isn't grandfathered, so the coverage gate trips on current master and on any PR branched off it.

All functional specs pass (168/168 unit + 92/101 E2E, 9 skipped). This diff is 6 lines in backend/python/vllm/backend.py and can't affect a Go config-registry gate. Happy to send a separate one-line PR adding the registry entry if that helps.

pos-ei-don mentioned this pull request Jun 15, 2026

fix(vllm): structured outputs silently ignored on vLLM >= 0.23 (GuidedDecodingParams removed) #10339

Closed

pos-ei-don wants to merge 1 commit into mudler:master from pos-ei-don:fix/vllm-structured-outputs-023
