TTS/STT: Gemini-TTS Model Integration with unified API #574
Open
Prajna1999 wants to merge 64 commits into main from feature/unified-api-tts
+1,219
−303
5aad042 fix: merge conflict (Prajna1999)
f47c20c chore: update dependencies (Prajna1999)
1e03961 feat: add google ai provider for Gemini models (Prajna1999)
4ac4de8 feat: working stt with gemini and hotfixing circular import (Prajna1999)
3c0bae7 Merge branch 'main' into feature/unified-api-stt-new (Prajna1999)
7db94f1 Merge branch 'main' into feature/unified-api-stt-new (Prajna1999)
196eb5c feat: llm_call table, type enforce gAI stt response (Prajna1999)
dca3139 Merge remote-tracking branch 'refs/remotes/origin/feature/unified-api… (Prajna1999)
271d677 feat: discriminated union type enforcing for stt, tts and text comple… (Prajna1999)
5ae59e5 Merge branch 'main' into feature/unified-api-stt-new (Prajna1999)
250ce9f fix: type annotation (Prajna1999)
1742a8b chore: fix alembic revision for shure (Prajna1999)
ebb2394 feat: add google stt task to async job (Prajna1999)
0bcb697 feat: yolo commit and linting issues (Prajna1999)
a6850a3 feat: query input takes audio_url and base64 as audio file input (Prajna1999)
f4693f6 chore: test cases for google ai and async job fixes, supress mappers … (Prajna1999)
909e249 fix: test cases for config (Prajna1999)
a7b0062 chore: clean PLAN.md (Prajna1999)
fa25199 chore: extract stt code into its own (Prajna1999)
24007a2 Merge branch 'main' into feature/unified-api-stt-new (Prajna1999)
bbd2c7f Refactor evaluation endpoint to use stored configuration and remove a… (avirajsingh7)
b907440 fix: default original provider bug (Prajna1999)
9f38f45 fix: coderrabbit comments (Prajna1999)
f6348b5 Merge branch 'main' into feature/unified-api-stt-new (Prajna1999)
b3ea8ec fix: migration number (Prajna1999)
26e0a6a chore: formatting issue solved (Prajna1999)
a623efa fix: eval core crud test cases (Prajna1999)
5c86cf2 fix: test cases for evaluation and test_llm (Prajna1999)
c8f165a chore: test formatting reset to main (Prajna1999)
19a6ef7 chore: fix formatting issues (Prajna1999)
9bf057b chore: squash llm_call table migration to sno.43 (Prajna1999)
665102e chore: change SQL model signature from ConfigVersionCreatePartial to… (Prajna1999)
325ff4d fix: remove extra imports and add util functions (Prajna1999)
237dd97 fix: change llm_call input type and other changes (Prajna1999)
df920d2 Merge branch 'main' into feature/unified-api-stt-new (Prajna1999)
354a0fc fix: alembic version for llm_call table (Prajna1999)
37ac37f fix: test cases llm_call and jobs (Prajna1999)
9b6a829 Merge branch 'main' into feature/unified-api-stt-new (Prajna1999)
321c41c Merge branch 'main' into feature/unified-api-stt-new (Prajna1999)
e9d60e6 chore: variable reference name change and enforced type safety for re… (Prajna1999)
1943d3d chore: remove ad hoc testing code (Prajna1999)
7a5d8a1 feat: basic tts implementation with gemini-2.5-pro-preview-tts (Prajna1999)
cd0da46 refactor: use pydub for wav to ogg, mp3 conversion (Prajna1999)
379a132 feat: add tts config fields to mappers function (Prajna1999)
c9be67a chore: fix test cases (Prajna1999)
3fb6ce6 refactor: fix version crud naming (Prajna1999)
ee3dd60 chore: fix test cases (Prajna1999)
75cccf6 feat: basic tts implementation with gemini-2.5-pro-preview-tts (Prajna1999)
243e6f1 refactor: use pydub for wav to ogg, mp3 conversion (Prajna1999)
cc3fa95 feat: add tts config fields to mappers function (Prajna1999)
f54c1a9 chore: fix test cases (Prajna1999)
c044e66 Merge remote-tracking branch 'refs/remotes/origin/feature/unified-api… (Prajna1999)
065ac74 Merge branch 'main' into feature/unified-api-tts (Prajna1999)
1d953e6 Merge remote-tracking branch 'refs/remotes/origin/feature/unified-api… (Prajna1999)
1f93b0e fix: comments (Prajna1999)
afe349f Merge branch 'main' into feature/unified-api-tts (Prajna1999)
2f1e32d chore: remove unsued imports (Prajna1999)
6553e78 reafactor: refactor execute_job function and resolved other comments (Prajna1999)
9fffc81 fix: test_job test case (Prajna1999)
3a2d625 Merge branch 'main' into feature/unified-api-tts (Prajna1999)
1999cbf chore: test cases coverage and cleanups (Prajna1999)
3e0f069 Merge remote-tracking branch 'refs/remotes/origin/feature/unified-api… (Prajna1999)
8d9ae7d fix_name error (Prajna1999)
6245e8e fix: test cases (Prajna1999)
@@ -0,0 +1,45 @@
```python
"""
Audio processing utilities for format conversion.

This module provides utilities for converting audio between different formats,
particularly for TTS output post-processing.
"""
import io
import logging
from pydub import AudioSegment


logger = logging.getLogger(__name__)


def convert_pcm_to_mp3(
    pcm_bytes: bytes, sample_rate: int = 24000
) -> tuple[bytes | None, str | None]:
    try:
        audio = AudioSegment(
            data=pcm_bytes, sample_width=2, frame_rate=sample_rate, channels=1
        )

        output_buffer = io.BytesIO()
        audio.export(output_buffer, format="mp3", bitrate="192k")
        return output_buffer.getvalue(), None
    except Exception as e:
        return None, str(e)


def convert_pcm_to_ogg(
    pcm_bytes: bytes, sample_rate: int = 24000
) -> tuple[bytes | None, str | None]:
    """Convert raw PCM to OGG with Opus codec."""
    try:
        audio = AudioSegment(
            data=pcm_bytes, sample_width=2, frame_rate=sample_rate, channels=1
        )

        output_buffer = io.BytesIO()
        audio.export(
            output_buffer, format="ogg", codec="libopus", parameters=["-b:a", "64k"]
        )
        return output_buffer.getvalue(), None
    except Exception as e:
        return None, str(e)
```
Prajna1999 marked this conversation as resolved.
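Both converters in the diff assume the input is raw 16-bit mono PCM at 24 kHz (`sample_width=2`, `channels=1`, `sample_rate=24000`). The sketch below is one possible way to produce and sanity-check bytes in that layout using only the standard library, without pydub or ffmpeg. The helper names `make_sine_pcm`, `pcm_to_wav`, and `pcm_duration_seconds` are hypothetical and are not part of this PR.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 24000  # matches the default sample_rate in the diff
SAMPLE_WIDTH = 2     # bytes per sample (16-bit), as in sample_width=2
CHANNELS = 1         # mono, as in channels=1


def make_sine_pcm(freq_hz: float = 440.0, seconds: float = 0.5) -> bytes:
    """Hypothetical helper: synthesize signed 16-bit little-endian mono PCM."""
    n_samples = int(SAMPLE_RATE * seconds)
    samples = (
        int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n_samples)
    )
    return struct.pack(f"<{n_samples}h", *samples)


def pcm_to_wav(pcm_bytes: bytes, sample_rate: int = SAMPLE_RATE) -> bytes:
    """Hypothetical helper: wrap raw PCM in a WAV container (stdlib only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()


def pcm_duration_seconds(pcm_bytes: bytes, sample_rate: int = SAMPLE_RATE) -> float:
    """Duration implied by the byte count under the assumed PCM layout."""
    return len(pcm_bytes) / (sample_rate * SAMPLE_WIDTH * CHANNELS)
```

Bytes produced by `make_sine_pcm` could then be passed to `convert_pcm_to_mp3` or `convert_pcm_to_ogg` in an environment where pydub and ffmpeg are installed. Since both functions return a `(data, error)` tuple instead of raising, a caller would check the second element before using the first.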