fix: prevent double-wrapping of image data URIs in OpenAI message conversion#1091
Conversation
…version ImageBlock values may contain a data URI prefix (data:image/png;base64,). The message_to_openai_message() function was unconditionally wrapping all images with this prefix, causing double-wrapping when the prefix was already present. Now strips any existing data URI prefix before wrapping to ensure the output always has exactly one prefix, regardless of input format. Fixes image rendering in OpenAI-compatible backends when ImageBlock is created with a full data URI. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com> Assisted-by: IBM Bob
jakelorocco
left a comment
There was a problem hiding this comment.
thanks for catching this; one small question since I'm not familiar with how this typically works.
| img_list = [] | ||
| for img in msg.images: | ||
| # Strip data URI prefix if present to avoid double-wrapping | ||
| if "data:" in img and "base64," in img: | ||
| img = img.split("base64,")[1] | ||
| img_list.append( | ||
| { | ||
| "type": "image_url", | ||
| "image_url": {"url": f"data:image/png;base64,{img}"}, | ||
| } | ||
| ) |
There was a problem hiding this comment.
I am not knowledgeable in this area. Are these images always base64 pngs? And, instead of splitting, if the URI already has that information, should we simply append that img without prefixing it? Or is there a reason we should split and re-prefix?
There was a problem hiding this comment.
ImageBlock uses is_valid_base64_png() to force it to always be base64 png, but it accepts it with or without the prefix. The is_valid... code use a split that is identical to what was used here, but...
I do think that split and then prefix is redundant and so, this would read more like a human wrote it if it was kind of like: image_url = img if startswith(prefix) else f"prefix{img}". (i.e. what you said :) )
I'll push a PR soon.
FYI -- I did not consider changing the way ImageBlock stores this value to solve the problem. Just because I happen to be more focused on fixing the wrapping bug vs changing the core implementation.
It was weird doing a split just to add a prefix (which was probably split off). Instead use startswith and prefix if needed. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
ea10dc4
Pull Request
Issue
Fixes #1090
Description
ImageBlock values may contain a data URI prefix (data:image/png;base64,). The message_to_openai_message() function was unconditionally wrapping all images with this prefix, causing double-wrapping when the prefix was already present.
Now strips any existing data URI prefix before wrapping to ensure the output always has exactly one prefix, regardless of input format.
Fixes image rendering in OpenAI-compatible backends when ImageBlock is created with a full data URI.
Testing
Attribution
Adding a new component, requirement, sampling strategy, or tool?
If your PR adds or modifies one of the types below, check the matching box. A checklist of type-specific review items will be posted as a comment.
NOTE: Please ensure you have an issue that has been acknowledged by a core contributor and routed you to open a pull request against this repository. Otherwise, please open an issue before continuing with this pull request.