
fix triton token2wav model cache thread unsafety #1622

Merged: aluminumbox merged 3 commits into FunAudioLLM:main from GoyoUijin:Fix/token2wav-cache-thread-unsafe on Dec 30, 2025

@GoyoUijin
Contributor

If the client does not explicitly set a request_id, the Triton server sets request_id=''.
In that case, threads serving different requests can write to the same key in CosyVoice2Model.hift_cache_dict at the same time, so cached data from different requests gets mixed up.
I observed this in practice when serving CosyVoice2 via Triton Server: the output audio of different requests was mixed together.

Example

On the client side:

# request_id is not explicitly specified
sync_triton_client.async_stream_infer(
    "cosyvoice2",
    inputs,
    outputs=outputs,
    enable_empty_final_response=True,
)

On the server side:

    def execute(self, requests):
        for request in requests:
            request_id = request.request_id()  # This value is ''
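
To illustrate why an empty request_id is a problem, here is a minimal sketch of the collision. It is not the actual CosyVoice2Model code: `hift_cache` and `handle_chunk` are hypothetical stand-ins for a cache keyed by request_id, and the interleaving that concurrent threads could produce is simulated with sequential calls.

```python
# Hypothetical stand-in for a per-request cache keyed by request_id.
hift_cache = {}


def handle_chunk(request_id, chunk):
    """Append a chunk to the cache entry for request_id and return that entry."""
    hift_cache.setdefault(request_id, []).append(chunk)
    return hift_cache[request_id]


# Two independent requests (A and B) both arrive with the default empty
# request_id. Their chunks are interleaved here the way concurrent threads
# might interleave them.
handle_chunk("", "A1")
handle_chunk("", "B1")
mixed = handle_chunk("", "A2")

# Request A now sees request B's data in "its" cache entry,
# because both requests map to the single key ''.
print(mixed)
```

With distinct keys per request, each entry would hold only that request's chunks.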

Therefore, I propose generating a unique UUID on the server side instead of relying on the user to provide a request_id.

- Because all token2wav requests within a single cosyvoice2 request must share the same request_id, a new request_id is generated only if one does not already exist, and that same request_id is then sent with every token2wav request.
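
The idea can be sketched roughly as follows. This is an illustrative outline under assumptions, not the code merged in this PR: `resolve_request_id` and `run_cosyvoice2_request` are hypothetical names, and the list of ids stands in for the ids attached to each token2wav call.

```python
import uuid


def resolve_request_id(request_id):
    """Keep a caller-supplied request_id; otherwise generate a fresh UUID."""
    return request_id if request_id else str(uuid.uuid4())


def run_cosyvoice2_request(incoming_request_id, num_chunks):
    """Return the request_id that would accompany each token2wav call."""
    # Resolve once per cosyvoice2 request, then reuse the same id for
    # every token2wav call so they all hit the same cache entry.
    rid = resolve_request_id(incoming_request_id)
    return [rid for _ in range(num_chunks)]


first = run_cosyvoice2_request("", 3)   # client did not set a request_id
second = run_cosyvoice2_request("", 3)  # another such request
explicit = run_cosyvoice2_request("user-supplied-id", 2)
```

Resolving the id once per request, rather than once per chunk, is what keeps the cache key stable across the chunks of one request while still separating different requests.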
@yuekaizhang
Contributor

@GoyoUijin Many thanks. @aluminumbox Would you mind helping merge the PR?

@aluminumbox aluminumbox merged commit ba5db60 into FunAudioLLM:main Dec 30, 2025
3 participants