Add JS Speech to text with sherpa-onnx as a more reliable alternative without sidecar #282

Open
jlocala1 wants to merge 5 commits into main from feature/whisper-js-experiment
Conversation

@jlocala1
Collaborator

Replaces the sidecar speech-to-text with sherpa-onnx, which runs natively in Node.js. Uses a Whisper model of the same quality (an upgrade from what I showed at the meeting with HuggingFace) but without the Python dependency. Requires ffmpeg and a one-time model download.
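
The PR description does not say how ffmpeg is used. A plausible reading is that it decodes uploaded media into the 16 kHz mono PCM audio that Whisper models expect. The following is a minimal sketch under that assumption; `buildFfmpegArgs` is a hypothetical helper and is not taken from the PR.

```typescript
// Hypothetical helper (not from the PR): builds ffmpeg arguments that decode
// any input file to 16 kHz, mono, 16-bit PCM WAV.
export function buildFfmpegArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",                 // overwrite the output file if it already exists
    "-i", inputPath,      // source audio or video file
    "-vn",                // drop any video stream
    "-ac", "1",           // downmix to a single (mono) channel
    "-ar", "16000",       // resample to 16 kHz
    "-c:a", "pcm_s16le",  // encode as 16-bit little-endian PCM
    "-f", "wav",          // force the WAV container
    outputPath,
  ];
}
```

The resulting array could be passed to `child_process.spawn("ffmpeg", args)`. Passing arguments as an array, rather than building a shell string, avoids shell interpolation of user-supplied file names.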

ezhu15 and others added 5 commits April 5, 2026 17:06
… for viewing audio, and also updated the transcription file. Transcription now shows the audio playing with timestamps and lets the user click a timestamp or a line of audio to jump the audio clip to that spot.
@vercel

vercel bot commented Apr 15, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| launch-stack | Ready | Preview, Comment | Apr 15, 2026 5:28am |
| pdr-ai-v2 | Ready | Preview, Comment | Apr 15, 2026 5:28am |

@jlocala1 jlocala1 requested a review from Deodat-Lawson April 15, 2026 05:29
@Deodat-Lawson
Owner

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 635f12a09a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +27 to +29
const { userId, videoUrl, category, title, preferredProvider } = validation.data;

const [user] = await db

P1: Bind video upload to authenticated user

This handler trusts a caller-supplied userId from the request body and immediately uses it to load tenant context, but it never verifies that userId matches the authenticated session (or that a session exists). In environments where this route is reachable, an attacker can submit another user's ID and enqueue transcriptions/documents into the wrong company, which is a cross-tenant authorization issue.
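
One way to address this is to derive the user ID from the session and treat the body value as, at most, a cross-check. The sketch below assumes a hypothetical `Session` shape and a hypothetical helper `resolveAuthorizedUserId`; the app's actual auth provider is not shown in this PR.

```typescript
// Hypothetical session shape; the real session object may differ.
interface Session {
  userId: string;
}

type AuthResult =
  | { ok: true; userId: string }
  | { ok: false; status: 401 | 403; error: string };

// Returns the session's user ID, rejecting requests that have no session or
// whose body-supplied userId disagrees with the authenticated user.
export function resolveAuthorizedUserId(
  session: Session | null,
  bodyUserId?: string,
): AuthResult {
  if (!session) {
    return { ok: false, status: 401, error: "Not authenticated" };
  }
  if (bodyUserId !== undefined && bodyUserId !== session.userId) {
    return { ok: false, status: 403, error: "userId does not match session" };
  }
  // Always use the session's ID downstream, never the caller-supplied one.
  return { ok: true, userId: session.userId };
}
```

In the handler, the tenant lookup would then use the returned `userId` instead of the value destructured from `validation.data`.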


Comment on lines +64 to +65
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    info = ydl.extract_info(url, download=True)

P1: Restrict sidecar download URL before yt-dlp fetch

/download-and-transcribe forwards unvalidated user input directly into yt_dlp.extract_info(). Because this endpoint accepts arbitrary URL strings, callers can force the sidecar to fetch non-approved/internal targets or very large resources, bypassing the app-layer hostname allowlist and turning this endpoint into an SSRF/resource-abuse vector.
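
The hostname check described above might look like the following sketch. It is written in TypeScript to match the app layer; the sidecar itself is Python, so the same logic would need to be repeated there. The host list and the name `isAllowedDownloadUrl` are assumptions for illustration, not the app's actual allowlist.

```typescript
// Hypothetical allowlist; the app's real list of approved hosts is not shown in this PR.
const ALLOWED_HOSTS: ReadonlySet<string> = new Set([
  "www.youtube.com",
  "youtube.com",
  "youtu.be",
]);

// Returns true only for https URLs, without embedded credentials, whose
// hostname exactly matches an entry in the allowlist.
export function isAllowedDownloadUrl(
  raw: string,
  allowed: ReadonlySet<string> = ALLOWED_HOSTS,
): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch (_err) {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;
  if (url.username !== "" || url.password !== "") return false;
  // Exact match, so "youtube.com.evil.example" is rejected.
  return allowed.has(url.hostname.toLowerCase());
}
```

This covers only the hostname part of the finding. Limits on download size or duration would have to be enforced separately, and redirects followed by the downloader are not covered by a check on the initial URL.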


Comment thread docker-compose.yml
volumes:
  postgres_data:
  seaweedfs_data:
  sidecar_models:

P2: Keep SeaweedFS named volume declared

The top-level volumes section now declares sidecar_models but no longer declares seaweedfs_data, while the seaweedfs service still mounts seaweedfs_data:/data. This leaves the compose file internally inconsistent for the local-storage setup and can break docker compose validation/startup for that profile.
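
This kind of inconsistency can be caught mechanically. The sketch below assumes the compose file has already been parsed into a plain object by a YAML parser and handles only short-syntax mounts (`source:target[:mode]`); `ComposeFile` and `findUndeclaredVolumes` are hypothetical names.

```typescript
// Minimal, hypothetical view of a parsed compose file (short-syntax mounts only).
interface ComposeFile {
  services?: Record<string, { volumes?: string[] }>;
  volumes?: Record<string, unknown>;
}

// Lists named volumes that a service mounts but the top-level `volumes`
// section does not declare, sorted alphabetically.
export function findUndeclaredVolumes(compose: ComposeFile): string[] {
  const declared = new Set(Object.keys(compose.volumes ?? {}));
  const missing = new Set<string>();
  const services = compose.services ?? {};
  for (const name of Object.keys(services)) {
    for (const mount of services[name].volumes ?? []) {
      const parts = mount.split(":");
      if (parts.length < 2) continue; // anonymous volume such as "/data"
      const source = parts[0];
      // Sources starting with ".", "/" or "~" are host paths (bind mounts).
      if (/^[./~]/.test(source)) continue;
      if (!declared.has(source)) missing.add(source);
    }
  }
  return Array.from(missing).sort();
}
```

For the state described in this comment (a `seaweedfs` service mounting `seaweedfs_data:/data` with only `postgres_data` and `sidecar_models` declared), the function returns `["seaweedfs_data"]`. Running `docker compose config` against the file is another way to surface the same problem.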


3 participants