Make translate emit Utf8View for Utf8View input #20624

Open
shivaaang wants to merge 1 commit into apache:main from shivaaang:translate-utf8view

Which issue does this PR close?

Part of #20585

Rationale for this change

String UDFs should preserve string representation where feasible. translate previously accepted Utf8View input but emitted Utf8, causing an unnecessary type downgrade. This aligns translate with the expected behavior of returning the same string type as its primary input.

What changes are included in this PR?

  1. Updated translate return type inference to emit Utf8View when input is Utf8View, while preserving existing behavior for Utf8 and LargeUtf8.
  2. Refactored translate and translate_with_map to use explicit string builders (via a local TranslateOutput helper trait) instead of .collect::<GenericStringArray<T>>(), so the correct output array type is produced for each input type.
  3. Added unit tests for Utf8View input (basic, null, non-ASCII) and sqllogictests verifying arrow_typeof output for all three string types.
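
Items 1 and 2 can be illustrated with a small sketch that uses only the Rust standard library, so it compiles without the arrow crates. Everything named here is a hypothetical stand-in: `StrType` replaces Arrow's `DataType`, `OutputBuilder` is only an analogue of the PR's local `TranslateOutput` trait (whose real definition is not shown in this description), and the two builder structs store plain vectors rather than Arrow buffers. The sketch also maps per `char` and keeps the first mapping for a repeated `from` character; both are simplifying assumptions, not a statement of what the actual kernel does.

```rust
use std::collections::HashMap;

/// Stand-in for the three Arrow string types (hypothetical, not arrow's `DataType`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum StrType {
    Utf8,
    LargeUtf8,
    Utf8View,
}

/// Return type inference: the output type mirrors the primary input type.
fn return_type(input: StrType) -> StrType {
    match input {
        // Before this change, Utf8View input was reported as Utf8.
        StrType::Utf8View => StrType::Utf8View,
        other => other,
    }
}

/// Hypothetical analogue of the helper trait: the kernel only appends values
/// or nulls, and the chosen builder determines the concrete output type.
trait OutputBuilder {
    fn append_value(&mut self, v: &str);
    fn append_null(&mut self);
    fn output_type(&self) -> StrType;
    fn finish(self) -> Vec<Option<String>>;
}

#[derive(Default)]
struct Utf8Builder(Vec<Option<String>>);

#[derive(Default)]
struct Utf8ViewBuilder(Vec<Option<String>>);

impl OutputBuilder for Utf8Builder {
    fn append_value(&mut self, v: &str) { self.0.push(Some(v.to_string())); }
    fn append_null(&mut self) { self.0.push(None); }
    fn output_type(&self) -> StrType { StrType::Utf8 }
    fn finish(self) -> Vec<Option<String>> { self.0 }
}

impl OutputBuilder for Utf8ViewBuilder {
    fn append_value(&mut self, v: &str) { self.0.push(Some(v.to_string())); }
    fn append_null(&mut self) { self.0.push(None); }
    fn output_type(&self) -> StrType { StrType::Utf8View }
    fn finish(self) -> Vec<Option<String>> { self.0 }
}

/// Translate one value: each character found in `from` is replaced by the
/// character at the same position in `to`, or deleted if `to` is shorter.
fn translate_one(s: &str, from: &str, to: &str) -> String {
    let to_chars: Vec<char> = to.chars().collect();
    let mut map: HashMap<char, Option<char>> = HashMap::new();
    for (i, c) in from.chars().enumerate() {
        // First mapping wins for repeated characters (an assumption of this sketch).
        map.entry(c).or_insert(to_chars.get(i).copied());
    }
    s.chars()
        .filter_map(|c| match map.get(&c) {
            Some(replacement) => *replacement, // `None` deletes the character
            None => Some(c),                   // unmapped characters pass through
        })
        .collect()
}

/// Kernel that is generic over the builder, so the same loop can emit any
/// output type instead of collecting into one fixed array type.
fn translate<B: OutputBuilder>(
    input: &[Option<&str>],
    from: &str,
    to: &str,
    mut out: B,
) -> (StrType, Vec<Option<String>>) {
    for value in input {
        match value {
            Some(s) => out.append_value(&translate_one(s, from, to)),
            None => out.append_null(),
        }
    }
    (out.output_type(), out.finish())
}
```

In this sketch, calling `translate` with `Utf8ViewBuilder::default()` yields `StrType::Utf8View` alongside the translated values, while passing `Utf8Builder::default()` yields `StrType::Utf8` from the same loop. That is the point of routing output through a builder rather than a single hard-coded collection type.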

Are these changes tested?

Yes. Unit tests and sqllogictests are included.

Are there any user-facing changes?

No.

@github-actions (bot) added the labels sqllogictest (SQL Logic Tests (.slt)) and functions (Changes to functions implementation) on Feb 28, 2026