Align FastAPI with official Orpheus prompt setup #3
Open
SebastianBodza wants to merge 1 commit into
This pull request refactors how special tokens are handled in `streaming_api_server.py` for prompt formatting and audio stream generation, aligning the code more closely with the official Orpheus-TTS inference protocol. The changes remove the need to search the output for a code start token, relying instead on new, clearly defined token ID lists for prompts and stop conditions. This simplifies the logic and improves maintainability.

Special token handling improvements:
- Replaced `CODE_START_TOKEN_ID` with `PROMPT_START_TOKEN_ID` and a list of `PROMPT_END_TOKEN_IDS` to wrap prompts, and introduced `STOP_TOKEN_IDS` for stopping generation, matching official Orpheus-TTS inference conventions.
- Updated the `format_prompt_for_vllm_sync` function to use the new prompt start and end token IDs, improving clarity and flexibility.

Audio stream generation logic simplification:
Other cleanups:
- Removed the unused `STOP_SEQUENCE` constant.

Summary by cubic
Aligns the FastAPI server with the official Orpheus‑TTS inference protocol by standardizing prompt wrapping and stop conditions, and simplifying audio streaming token handling. This removes brittle start‑token scanning and makes the code easier to maintain.
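The prompt wrapping and stop conditions described above might look roughly like the following. This is a minimal sketch, not the code from this PR: the token ID values are assumed placeholders that should be checked against the official Orpheus-TTS repository, and `wrap_prompt_token_ids` and `build_completion_kwargs` are hypothetical helper names. It assumes the server calls vLLM through an OpenAI-style completions client, where vLLM-specific sampling options such as `stop_token_ids` are passed through `extra_body`.

```python
from typing import Any

# Assumed placeholder values for illustration only; verify against Orpheus-TTS.
PROMPT_START_TOKEN_ID = 128259
PROMPT_END_TOKEN_IDS = [128009, 128260, 128261, 128257]
STOP_TOKEN_IDS = [128258]


def wrap_prompt_token_ids(text_token_ids: list[int]) -> list[int]:
    """Wrap already-tokenized prompt text with the start and end token IDs."""
    return [PROMPT_START_TOKEN_ID, *text_token_ids, *PROMPT_END_TOKEN_IDS]


def build_completion_kwargs(prompt_token_ids: list[int], model: str) -> dict[str, Any]:
    """Build keyword arguments for an OpenAI-style completions call to vLLM.

    `extra_body` carries fields the OpenAI client does not model itself;
    vLLM reads `stop_token_ids` from it to end generation on those tokens.
    """
    return {
        "model": model,
        "prompt": prompt_token_ids,
        "stream": True,
        "extra_body": {"stop_token_ids": STOP_TOKEN_IDS},
    }
```

With an OpenAI client pointed at the vLLM server, the result could be passed as `client.completions.create(**kwargs)`. Stopping on token IDs rather than on a decoded stop string avoids depending on how special tokens are rendered as text.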
- Replaced `CODE_START_TOKEN_ID` with `PROMPT_START_TOKEN_ID` and `PROMPT_END_TOKEN_IDS`; introduced `STOP_TOKEN_IDS` and passed them to vLLM via `extra_body`. Removed unused `STOP_SEQUENCE`.
- Updated `format_prompt_for_vllm_sync` to wrap prompts with the new start/end token IDs.
- Simplified audio streaming token handling around `CODE_REMOVE_TOKEN_ID` and `CODE_TOKEN_OFFSET`, kept chunking behavior for initial/stream sizes, and ensured remaining codes are processed at stream end.

Written for commit ab8a5be. Summary will update on new commits.
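The chunking behavior mentioned in the summary (a smaller initial chunk, larger later chunks, and a final flush of leftover codes) can be illustrated with a short generator. This is a sketch under stated assumptions rather than the PR's implementation: the constant values, the chunk sizes, and the `chunk_codes` name are hypothetical, and it assumes `CODE_REMOVE_TOKEN_ID` marks a token to drop while `CODE_TOKEN_OFFSET` is subtracted from each token ID to obtain an audio code.

```python
from collections.abc import Iterable, Iterator

# Assumed values for illustration; the PR text does not state them.
CODE_TOKEN_OFFSET = 128266
CODE_REMOVE_TOKEN_ID = 128258
INITIAL_CHUNK_SIZE = 7   # hypothetical: codes emitted in the first chunk
STREAM_CHUNK_SIZE = 14   # hypothetical: codes emitted in each later chunk


def chunk_codes(token_ids: Iterable[int]) -> Iterator[list[int]]:
    """Convert streamed token IDs to audio codes and yield them in chunks."""
    buffer: list[int] = []
    chunk_size = INITIAL_CHUNK_SIZE
    for token_id in token_ids:
        # Drop the removal token and anything below the code range.
        if token_id == CODE_REMOVE_TOKEN_ID or token_id < CODE_TOKEN_OFFSET:
            continue
        buffer.append(token_id - CODE_TOKEN_OFFSET)
        if len(buffer) >= chunk_size:
            yield buffer
            buffer = []
            # After the first chunk, switch to the steady-state size.
            chunk_size = STREAM_CHUNK_SIZE
    # Flush whatever is left so trailing codes are not lost at stream end.
    if buffer:
        yield buffer
```

A small first chunk reduces time to first audio, while larger later chunks reduce per-chunk decoding overhead. The final flush corresponds to processing the remaining codes when the stream ends.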