Align FastAPI with official Orpheus prompt setup by SebastianBodza · Pull Request #3 · SebastianBodza/Orpheus_Distributed_FastAPI

SebastianBodza · 2026-03-30T20:01:35Z

This pull request refactors how special tokens are handled in streaming_api_server.py for prompt formatting and audio stream generation, aligning the code more closely with the official Orpheus-TTS inference protocol. The code no longer searches the output for a code start token; instead it relies on explicitly defined token ID lists for prompt wrapping and stop conditions. This simplifies the logic and improves maintainability.

Special token handling improvements:

- Replaced the single CODE_START_TOKEN_ID with PROMPT_START_TOKEN_ID and a list of PROMPT_END_TOKEN_IDS to wrap prompts, and introduced STOP_TOKEN_IDS for stopping generation, matching official Orpheus-TTS inference conventions.
- Updated the format_prompt_for_vllm_sync function to use the new prompt start and end token IDs, improving clarity and flexibility.
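
To make the wrapping scheme concrete, here is a minimal sketch of how a tokenized prompt could be framed by one start ID and a list of end IDs. The helper name `wrap_prompt_ids` and all numeric token IDs are assumptions for illustration; the real values live in streaming_api_server.py and are not given in this description, and this is not the body of format_prompt_for_vllm_sync.

```python
from typing import List

# Placeholder IDs for illustration only. The actual constants are defined in
# streaming_api_server.py and are not listed in this PR description.
PROMPT_START_TOKEN_ID: int = 1
PROMPT_END_TOKEN_IDS: List[int] = [2, 3]


def wrap_prompt_ids(text_token_ids: List[int]) -> List[int]:
    """Frame tokenized prompt text with the start ID and the end IDs.

    Hypothetical helper: it only shows the shape start + text + end.
    """
    return [PROMPT_START_TOKEN_ID, *text_token_ids, *PROMPT_END_TOKEN_IDS]


# Three made-up text token IDs wrapped with the placeholder markers.
wrapped = wrap_prompt_ids([10, 11, 12])
print(wrapped)
```

Keeping the end markers as a list rather than a single ID means the number of trailing markers can change without touching the wrapping logic.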

Audio stream generation logic simplification:

- Removed the logic for searching for a code start token in generated token IDs; now, all relevant tokens are processed directly from the output using the new token ID lists.
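
As a rough illustration of processing tokens directly, the sketch below maps generated token IDs to audio codes without first scanning for a start marker. The constant names come from the bot summary of this PR; their values here are placeholders, and the exact filter rule (skip the removal ID, skip anything below the offset, subtract the offset) is an assumption about how such a filter might look, not the PR's code.

```python
from typing import Iterable, List

# Placeholder values for illustration only; the real constants are defined
# in streaming_api_server.py.
CODE_REMOVE_TOKEN_ID: int = 999
CODE_TOKEN_OFFSET: int = 1000


def extract_codes(token_ids: Iterable[int]) -> List[int]:
    """Map generated token IDs to audio codes, with no start-token search.

    Assumed rule: drop the removal token and any ID below the code offset,
    then shift the remaining IDs down by the offset.
    """
    codes: List[int] = []
    for token_id in token_ids:
        if token_id == CODE_REMOVE_TOKEN_ID:
            continue  # explicitly filtered marker token
        if token_id < CODE_TOKEN_OFFSET:
            continue  # not in the audio code range
        codes.append(token_id - CODE_TOKEN_OFFSET)
    return codes


codes = extract_codes([5, 999, 1000, 1007, 1042])
print(codes)
```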

Other cleanups:

- Removed the unused STOP_SEQUENCE constant.

Summary by cubic

Aligns the FastAPI server with the official Orpheus‑TTS inference protocol by standardizing prompt wrapping and stop conditions, and simplifying audio streaming token handling. This removes brittle start‑token scanning and makes the code easier to maintain.

Refactors
- Replaced CODE_START_TOKEN_ID with PROMPT_START_TOKEN_ID and PROMPT_END_TOKEN_IDS; introduced STOP_TOKEN_IDS and passed them to vLLM via extra_body. Removed unused STOP_SEQUENCE.
- Updated format_prompt_for_vllm_sync to wrap prompts with the new start/end token IDs.
- Simplified audio stream generation: removed start-token search, filtered tokens using CODE_REMOVE_TOKEN_ID and CODE_TOKEN_OFFSET, kept chunking behavior for initial/stream sizes, and ensured remaining codes are processed at stream end.
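
The chunking behavior mentioned in the last item can be sketched as a small generator: a smaller first chunk, larger chunks afterwards, and a final flush so leftover codes are not dropped at stream end. The function name `chunk_codes`, its parameters, and the sizes used are hypothetical; the server's actual chunk sizes are not stated here.

```python
from typing import Iterable, Iterator, List


def chunk_codes(
    codes: Iterable[int], initial_size: int, stream_size: int
) -> Iterator[List[int]]:
    """Yield one chunk of initial_size, then chunks of stream_size.

    Any codes still buffered when the input ends are yielded as a final,
    possibly shorter, chunk.
    """
    buffer: List[int] = []
    target = initial_size
    for code in codes:
        buffer.append(code)
        if len(buffer) >= target:
            yield buffer
            buffer = []
            target = stream_size  # later chunks use the streaming size
    if buffer:
        yield buffer  # flush remaining codes at stream end


# Ten made-up codes with illustrative sizes 3 (initial) and 4 (streaming).
chunks = list(chunk_codes(range(10), initial_size=3, stream_size=4))
print(chunks)
```

A small initial chunk is a common way to reduce time to first audio, while larger follow-up chunks reduce per-chunk decoding overhead.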

Written for commit ab8a5be. Summary will update on new commits.

cubic-dev-ai

No issues found across 1 file

Align FastAPI with official Orpheus prompt setup

ab8a5be

cubic-dev-ai Bot reviewed Mar 30, 2026

SebastianBodza wants to merge 1 commit into main from official-setup