fix: prevent double-wrapping of image data URIs in OpenAI message conversion #1091

Merged
markstur merged 2 commits into generative-computing:main from markstur:fix/image_double_wrap on May 20, 2026

Conversation

@markstur
Contributor

Pull Request

Issue

Fixes #1090

Description

ImageBlock values may contain a data URI prefix (data:image/png;base64,). The message_to_openai_message() function was unconditionally wrapping all images with this prefix, causing double-wrapping when the prefix was already present.

The conversion now strips any existing data URI prefix before wrapping, so the output always has exactly one prefix regardless of input format.

Fixes image rendering in OpenAI-compatible backends when ImageBlock is created with a full data URI.
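
The failure mode described above can be reproduced in a few lines. This is a minimal sketch and not the project's code: `wrap_unconditionally` and `strip_then_wrap` are hypothetical helper names, and the payload is only the base64 of the 8-byte PNG signature, not a complete image.

```python
PREFIX = "data:image/png;base64,"  # the prefix named in the PR description


def wrap_unconditionally(img: str) -> str:
    """Hypothetical stand-in for the pre-fix behavior: always prepend the prefix."""
    return f"{PREFIX}{img}"


def strip_then_wrap(img: str) -> str:
    """Hypothetical stand-in for the described fix: drop an existing prefix, then wrap once."""
    if img.startswith("data:") and "base64," in img:
        # Keep only the text after the first "base64," marker.
        img = img.split("base64,", 1)[1]
    return f"{PREFIX}{img}"


payload = "iVBORw0KGgo="  # base64 of the PNG signature bytes, used as a placeholder
already_prefixed = f"{PREFIX}{payload}"

double_wrapped = wrap_unconditionally(already_prefixed)  # prefix appears twice
fixed_bare = strip_then_wrap(payload)                    # prefix appears once
fixed_prefixed = strip_then_wrap(already_prefixed)       # prefix appears once
```

With these definitions, `double_wrapped` contains the prefix twice, while `fixed_bare` and `fixed_prefixed` are identical strings with a single prefix.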

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

Attribution

  • AI coding assistants used

Adding a new component, requirement, sampling strategy, or tool?

If your PR adds or modifies one of the types below, check the matching box. A checklist of type-specific review items will be posted as a comment.

  • Component
  • Requirement
  • Sampling Strategy
  • Tool

NOTE: Please ensure you have an issue that has been acknowledged by a core contributor and routed you to open a pull request against this repository. Otherwise, please open an issue before continuing with this pull request.

…version

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
Assisted-by: IBM Bob
@markstur markstur requested a review from a team as a code owner May 19, 2026 00:20
@markstur markstur requested review from jakelorocco and planetf1 May 19, 2026 00:20
@github-actions github-actions Bot added the bug Something isn't working label May 19, 2026
Contributor

@jakelorocco jakelorocco left a comment

Thanks for catching this. One small question, since I'm not familiar with how this typically works.

Comment on lines +191 to +201
img_list = []
for img in msg.images:
    # Strip data URI prefix if present to avoid double-wrapping
    if "data:" in img and "base64," in img:
        img = img.split("base64,")[1]
    img_list.append(
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{img}"},
        }
    )
Contributor

I am not knowledgeable in this area. Are these images always base64-encoded PNGs? And if the URI already carries that information, should we simply append the image as-is instead of splitting and re-prefixing it? Or is there a reason we should split and re-prefix?

Contributor Author

ImageBlock uses is_valid_base64_png() to force the value to always be a base64 PNG, but it accepts it with or without the prefix. The is_valid... code uses a split that is identical to what was used here, but...

I do think that splitting and then re-prefixing is redundant, so this would read more like a human wrote it if it were something like: image_url = img if img.startswith(prefix) else f"{prefix}{img}" (i.e., what you said :) ).

I'll push a PR soon.

FYI: I did not consider changing the way ImageBlock stores this value to solve the problem, only because I was more focused on fixing the wrapping bug than on changing the core implementation.
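
The `startswith` approach discussed in this thread could look roughly like the sketch below. `PNG_DATA_URI_PREFIX` and `to_image_url` are hypothetical names chosen for illustration, and the sample inputs are placeholders; the merged code may differ in naming and structure.

```python
PNG_DATA_URI_PREFIX = "data:image/png;base64,"  # hypothetical constant name


def to_image_url(img: str) -> dict:
    """Build an OpenAI-style image_url content part, adding the prefix only if it is missing."""
    url = img if img.startswith(PNG_DATA_URI_PREFIX) else f"{PNG_DATA_URI_PREFIX}{img}"
    return {"type": "image_url", "image_url": {"url": url}}


# One bare base64 string and one full data URI carrying the same payload.
images = ["iVBORw0KGgo=", "data:image/png;base64,iVBORw0KGgo="]
img_list = [to_image_url(img) for img in images]
```

Both entries in `img_list` end up with the same URL. Because nothing is split off, a value that already carries the prefix passes through untouched, which is the simplification the review comment asked about.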

It was weird doing a split just to add a prefix (which was probably split off).
Instead use startswith and prefix if needed.

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
@markstur markstur enabled auto-merge May 19, 2026 21:38
@markstur markstur requested a review from jakelorocco May 20, 2026 15:28
@markstur markstur added this pull request to the merge queue May 20, 2026
Merged via the queue into generative-computing:main with commit ea10dc4 May 20, 2026
8 checks passed
@markstur markstur deleted the fix/image_double_wrap branch May 20, 2026 17:24
Development

Successfully merging this pull request may close these issues.

ImageBlock with data URI prefix causes double-wrapping in OpenAI message conversion
