Dev#201

Open
akshayballal95 wants to merge 6 commits into main from dev
@akshayballal95 akshayballal95 commented Feb 15, 2026

Note

Medium Risk
Introduces new multimedia processing paths (external ffmpeg execution, base64 decoding, temp file IO) and expands the public server API surface, which could impact reliability and resource usage if inputs are malformed or large.

Overview
Adds opt-in video embedding support behind a new video Cargo feature: videos are processed via an ffmpeg-based frame sampler (VideoProcessor), embedded in batches with a vision model, and annotated with video_path/frame_index metadata; this is wired through Rust (embed_video_file, embed_video_directory), Python bindings (VideoEmbedConfig, embed_video_*), and new docs/examples.

Extends the Actix server to support base64 image inputs: /v1/embeddings now auto-detects text vs base64 images (rejects mixed), and a new /v1/image_embeddings endpoint decodes/validates images, writes temp files, and runs embed_image_batch; server deps add base64 + image. Also adds a CUDA server Dockerfile and minor Docker build tweaks, plus documentation updates and navigation for the new video guide.
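The text-versus-image routing described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `InputKind`, `classify_batch`, and the `is_image` predicate are hypothetical names, and rejecting an empty batch is an assumption the description does not state.

```rust
/// Hypothetical tag for how a whole request batch should be routed.
#[derive(Debug, PartialEq)]
enum InputKind {
    Text,
    Image,
}

/// Classifies a batch as all-text or all-image and rejects mixed batches.
/// `is_image` is a caller-supplied predicate (an assumption for this sketch).
fn classify_batch<F>(inputs: &[&str], is_image: F) -> Result<InputKind, String>
where
    F: Fn(&str) -> bool,
{
    // Assumption: an empty batch is treated as an error.
    if inputs.is_empty() {
        return Err("input must not be empty".to_string());
    }
    let image_count = inputs.iter().filter(|s| is_image(**s)).count();
    if image_count == 0 {
        Ok(InputKind::Text)
    } else if image_count == inputs.len() {
        Ok(InputKind::Image)
    } else {
        Err("mixed text and image inputs are not supported".to_string())
    }
}
```

Deciding the route once per batch keeps the handler from embedding half of a request before discovering that the rest belongs to a different model path.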

Written by Cursor Bugbot for commit a29c2a8. This will update automatically on new commits.

akshayballal95 and others added 6 commits December 28, 2025 22:49
…lity

- Introduced a new Dockerfile for building a server with CUDA development tools.
- Updated the main server Dockerfile to improve the build process.
- Added support for image embeddings, including new request and response structures for handling base64 images.
- Enhanced the embedding logic to differentiate between text and image inputs, ensuring proper error handling for mixed input types.
- Updated dependencies in Cargo.toml and Cargo.lock to include base64 and image libraries.
…nagement

- Updated the Dockerfile to use a base image with CUDA 12.2.0 and streamlined the build stages.
- Introduced sccache for caching Rust builds and improved the installation of Rust and cargo-chef.
- Enhanced the build process by separating the planner and builder stages, ensuring better organization and efficiency.
- Removed unnecessary mock scripts and optimized runtime dependencies for a cleaner image.
feature: Added video embedding support and guide
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.


Returns:
A list of EmbedData objects, or None if an adapter is used.
"""
Video function stubs placed inside audio docstring

High Severity

The embed_video_file and embed_video_directory function definitions are inserted between the opening """ of embed_audio_file's docstring (line 270) and its actual docstring content (line 307). The """ on line 270 opens the docstring, and the """ on line 277 (intended as embed_video_file's docstring opener) closes it, leaving the rest as invalid syntax inside embed_audio_file's body. This means the video functions don't exist as actual type stubs, embed_audio_file's docstring is garbled, and IDEs/type checkers will not recognize the new video APIs.

.is_ok()
{
return true;
}
Base64 image detection always returns true for valid base64

High Severity

is_base64_image checks with_guessed_format().is_ok() on a Cursor, but with_guessed_format() returns Ok even when no image format is detected (it only fails on I/O errors, which don't occur with Cursor). This means any valid base64 string ≥100 characters is classified as an image. The check needs to verify .format().is_some() instead. This causes the /v1/embeddings endpoint to misroute long base64-valid text strings to the image embedding path, leading to failures or wrong results.
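The difference between "the guess did not error" and "a format was actually detected" can be shown with a small std-only model. This is a sketch under stated assumptions: `Format`, `guess_format`, `is_image_buggy`, and `is_image_fixed` are hypothetical stand-ins written for illustration, not the `image` crate's API or the PR's code. In the real handler, the fix suggested in this comment would be to check `.format().is_some()` on the reader.

```rust
use std::io;

/// Hypothetical image formats recognised by this sketch.
#[derive(Debug, PartialEq)]
enum Format {
    Png,
    Jpeg,
}

/// Stand-in for a format guess over in-memory bytes. Like the behaviour
/// described in the review, it returns Ok even when nothing is recognised,
/// because reading from memory cannot fail with an I/O error.
fn guess_format(bytes: &[u8]) -> io::Result<Option<Format>> {
    if bytes.starts_with(&[0x89, b'P', b'N', b'G']) {
        Ok(Some(Format::Png))
    } else if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Ok(Some(Format::Jpeg))
    } else {
        Ok(None)
    }
}

/// Buggy check: only asks whether the guess completed without an error.
fn is_image_buggy(bytes: &[u8]) -> bool {
    guess_format(bytes).is_ok()
}

/// Fixed check: requires that a concrete format was detected.
fn is_image_fixed(bytes: &[u8]) -> bool {
    matches!(guess_format(bytes), Ok(Some(_)))
}
```

With this model, arbitrary decoded bytes pass `is_image_buggy` but fail `is_image_fixed`, which is the misrouting the comment describes.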

EmbeddingResult::MultiVector(_) => {
// For multi-vector embeddings, return empty (or handle differently)
vec![]
}
Image endpoint silently returns empty multi-vector embeddings

Medium Severity

The new create_image_embeddings endpoint returns vec![] for MultiVector embeddings, silently producing zero-dimensional embedding vectors with no error. Users of multi-vector vision models (e.g., ColPali) would receive response objects where embedding is an empty array, which is indistinguishable from a successful result but contains no usable data. This is a silent data loss scenario.
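One way to avoid the silent empty result is to surface an explicit error for the unsupported variant. The following is a minimal sketch, not the PR's code: the `EmbeddingResult` enum here is a hypothetical mirror (only the `MultiVector` variant appears in the snippet above; `DenseVector` and the payload types are assumptions), and `dense_or_error` is an illustrative helper name.

```rust
/// Hypothetical mirror of the result type named in the review; the real
/// enum's variants and payload types may differ.
enum EmbeddingResult {
    DenseVector(Vec<f32>),
    MultiVector(Vec<Vec<f32>>),
}

/// Returns the dense vector, or an explicit error for multi-vector results
/// instead of silently substituting an empty vector.
fn dense_or_error(result: EmbeddingResult) -> Result<Vec<f32>, String> {
    match result {
        EmbeddingResult::DenseVector(v) => Ok(v),
        EmbeddingResult::MultiVector(vs) => Err(format!(
            "multi-vector embeddings ({} vectors) are not supported by this endpoint",
            vs.len()
        )),
    }
}
```

A handler could map the `Err` case to a 4xx or 5xx response so that callers using multi-vector models see a clear failure rather than an empty `embedding` array. Whether to reject or to extend the response schema is a design decision left to the authors.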
