MDEV-39410: ucs2 data not stored correctly in the context by bsrikanth-mariadb · Pull Request #5101 · MariaDB/server

bsrikanth-mariadb · 2026-05-20T09:46:36Z

For constant tables, the rows are stored in the context using the charset my_charset_utf8mb4_bin. However, that is not correct, as every field in the row could use its own charset.

This leads to a replay failure, as the data that gets read from the sql_script file is different from the original row data.

This PR addresses uses field's own charset instead of my_charset_utf8mb4_bin.

For constant tables, the rows are stored in the context using the charset my_charset_utf8mb4_bin. However, that is not correct, as every field in the row could use its own charset. This leads to a replay failure, as the data that gets read from the sql_script file is different from the original row data. This PR addresses uses field's own charset instead of my_charset_utf8mb4_bin.

gemini-code-assist

Code Review

This pull request addresses MDEV-39410 by ensuring ucs2 data is correctly handled in the optimizer context, specifically by using the field's actual charset during row formatting. It also adds a regression test for this scenario. Feedback suggests reverting the change to the initial charset of the temporary string buffer in format_and_store_row, as the charset is overwritten later and the modification may be misleading for future maintenance.

gemini-code-assist · 2026-05-20T09:49:17Z

  Field **pfield;
  char row_buff_tmp[512];
-  String tmp(row_buff_tmp, sizeof(row_buff_tmp), &my_charset_bin);
+  String tmp(row_buff_tmp, sizeof(row_buff_tmp), &my_charset_utf8mb4_bin);


This change appears to be unnecessary and could be confusing. The charset of the tmp string, initialized here, is immediately overwritten by the call to field->val_str(&tmp) on line 3128. For most field types, val_str invokes String::set, which updates both the content and the charset of the string object. Consequently, the initial charset specified in the String constructor is never actually used.

Setting it to &my_charset_utf8mb4_bin might mislead future developers into thinking this specific charset is important for the logic that follows. The original &my_charset_bin is a more generic and suitable choice for a temporary buffer. To enhance code clarity, I recommend reverting this change.

String tmp(row_buff_tmp, sizeof(row_buff_tmp), &my_charset_bin);

bsrikanth-mariadb requested a review from spetrunia May 20, 2026 09:46

bsrikanth-mariadb added the MariaDB Corporation label May 20, 2026

gemini-code-assist Bot reviewed May 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

MDEV-39410: ucs2 data not stored correctly in the context#5101

MDEV-39410: ucs2 data not stored correctly in the context#5101
bsrikanth-mariadb wants to merge 1 commit into
bb-12.3-MDEV-39368-test-replayfrom
13.1-MDEV-39410-ucs2-data-not-stored-correctly-in-context

bsrikanth-mariadb commented May 20, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

bsrikanth-mariadb commented May 20, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

1 participant