Not fully tested, but hopefully fixes at least some of the issues in #1182. Requires testing before deployment to production.
PROBLEM: LLM results were not being logged.
SOLUTION: I think this was not set up properly in utils/client.py, so I've updated that, and now with PII logging enabled, you can see the raw results coming back from the LLM.
PROBLEM: qwen3.2-vl worked when talking directly to vLLM on pegasus, but failed when going through open-webui (in front of ollama) on mallory.
SOLUTION: remove stop_tokens in client.py, which were preventing the LLM from generating a message field in the response - it was only outputting a reasoning field with all of its internal discussion about object detection.
GUESS: the actual problem with some photos not returning may have been the LLM having a tiny context window. I saw log output on mallory indicating it was hitting context window limits. In open-webui, I've changed `num_ctx` to 16000, whereas it was originally around 2000, which is a very small window. Tested on qwen3.5:4b against several photos on the IMAGE examples page, and against the problematic photos in #1182, and they all worked. Note it is pretty slow. Not sure if that is just mallory with its 3090, or the larger context window, or something else.
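For reference, the same context-window change can be applied per request rather than via the open-webui UI: Ollama's `/api/chat` endpoint accepts an `options` object with `num_ctx`. This is a sketch under that assumption; the model name and messages are just examples.

```python
def chat_payload_with_ctx(model: str, messages: list[dict],
                          num_ctx: int = 16000) -> dict:
    """Build an Ollama /api/chat request body with an enlarged context
    window. num_ctx=16000 mirrors the open-webui setting above; the
    small default (~2048) appeared to truncate image-heavy prompts."""
    return {
        "model": model,
        "messages": messages,
        "options": {"num_ctx": num_ctx},
    }


# Example request body for the model tested in this PR.
chat_payload_with_ctx("qwen3.5:4b",
                      [{"role": "user", "content": "What objects are in this photo?"}])
```

Setting it per request keeps the larger window scoped to this workload instead of changing the server-wide default.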
Assistance: Used GPT-5 mini via github copilot to assist with finding changes.
Please ensure you've followed the checklist and provide all the required information before requesting a review.
If you do not have everything applicable to your PR, it will not be reviewed!
If you don't know what something is or if it applies to you, ask!
Please note that PRs from external contributors who have not agreed to our Contributor License Agreement will not be considered.
To accept it, include `I agree to the [current Contributor License Agreement](/CLA.md)` in this pull request.

Don't delete below this line.
Required Information
Coding/Commit Requirements
New Component Checklist (mandatory for new microservices)
- Added the component to `docker-compose.yml` and `build.yml`.
- Added a workflow under `.github/workflows`.
- Added a `README.md` file that describes what the component does and what it depends on (other microservices, ML models, etc.).
- OR