fix: preserve OpenAI TTS response format on disk #6090
Open
stablegenius49 wants to merge 1 commit into AstrBotDevs:master from
Hey - I've left some high-level feedback:
- The new implementation of `get_audio` buffers the entire streamed response into memory before writing to disk, which may be problematic for large TTS outputs; consider only buffering a small prefix for format detection and streaming the rest directly to the file.
- When `_resolve_audio_extension` encounters an `audio/*` content-type that is not in `extension_map` and whose header sniffing does not match a known format, it silently falls back to `.wav`; it might be safer to raise, or at least include the original `content_type` in an error, to avoid mislabeling unknown audio formats.
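The first suggestion could look roughly like the sketch below. It is not the code from this PR: `write_stream_with_prefix_sniff`, `pick_extension`, and `prefix_size` are hypothetical names, and a plain iterable of byte chunks stands in for the real streamed HTTP response. Only enough leading chunks to cover the signature are held in memory; everything after that goes straight to the file.

```python
import os
import tempfile
from typing import Callable, Iterable


def write_stream_with_prefix_sniff(
    chunks: Iterable[bytes],
    out_dir: str,
    pick_extension: Callable[[bytes], str],  # hypothetical hook: prefix bytes -> ".mp3", ".wav", ...
    prefix_size: int = 16,  # assumed to be enough for common audio signatures
) -> str:
    """Buffer only a small prefix for format detection, then stream the rest to disk."""
    iterator = iter(chunks)
    prefix = b""
    # Pull whole chunks until at least prefix_size bytes are available
    # (or the stream ends early). Nothing else is kept in memory.
    for chunk in iterator:
        prefix += chunk
        if len(prefix) >= prefix_size:
            break

    extension = pick_extension(prefix)
    fd, path = tempfile.mkstemp(suffix=extension, dir=out_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(prefix)
        # The same iterator resumes where the prefix loop stopped.
        for chunk in iterator:
            f.write(chunk)
    return path
```

With an async HTTP client the same shape applies, with `async for` over the response chunks in place of the two `for` loops.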
Fixes #6015
Modifications
- stop hardcoding the streamed OpenAI TTS response to a `.wav` temp file
- detect the real audio format from `content-type` or the first bytes and save with the matching extension
- raise a clear runtime error when the TTS endpoint returns JSON / HTML / other non-audio payloads instead of letting the DingTalk ffmpeg conversion fail later with an opaque invalid-input error
- add focused regression tests for mp3 content-type preservation and non-audio payload handling
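For readers who want a concrete picture of the detection step, here is a minimal sketch. It is an illustration and not the diff from this PR: `guess_audio_extension` and `EXTENSION_BY_CONTENT_TYPE` are hypothetical names, the content-type table is an assumed subset, and only a few well-known file signatures are checked. Unlike a silent `.wav` fallback, this version raises when neither the header nor the first bytes identify an audio format.

```python
# Illustrative subset of audio media types; the real mapping may differ.
EXTENSION_BY_CONTENT_TYPE = {
    "audio/mpeg": ".mp3",
    "audio/wav": ".wav",
    "audio/x-wav": ".wav",
    "audio/flac": ".flac",
    "audio/ogg": ".ogg",
    "audio/aac": ".aac",
}


def guess_audio_extension(content_type: str, head: bytes) -> str:
    """Pick a file extension from the Content-Type header or the first bytes."""
    # Drop parameters such as "; charset=..." and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in EXTENSION_BY_CONTENT_TYPE:
        return EXTENSION_BY_CONTENT_TYPE[media_type]

    # Fall back to well-known container signatures in the first bytes.
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return ".wav"
    if head[:3] == b"ID3":  # MP3 file starting with an ID3v2 tag
        return ".mp3"
    if head[:4] == b"fLaC":
        return ".flac"
    if head[:4] == b"OggS":
        return ".ogg"

    # JSON, HTML, or anything else unrecognized: fail early with context.
    raise RuntimeError(
        f"TTS endpoint returned a non-audio or unknown payload "
        f"(content-type: {content_type!r})"
    )
```

For example, `guess_audio_extension("application/json", b'{"error":')` raises a `RuntimeError` whose message includes the offending content-type, so the failure surfaces before any ffmpeg conversion is attempted.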
This is NOT a breaking change.
Screenshots or Test Results
Verification Steps:
Result:
Result:
Checklist
- I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in `requirements.txt` and `pyproject.toml`.
Summary by Sourcery
Handle OpenAI TTS streamed audio responses more robustly by preserving their actual format on disk and failing fast on non-audio payloads.
Bug Fixes:
- … `.wav` file.
Tests:
- … preservation of `content-type`, and clear error handling for non-audio TTS responses.