Update string UDF's to have return type == input type where appropriate #20585

@Omega359

Is your feature request related to a problem or challenge?

Quite a few UDFs currently return only Utf8 or LargeUtf8 even when the input type is Utf8View. Ideally, every function that consumes strings and returns a string would return values in the same data type as its primary argument (or the largest argument type when it consumes many strings, as concat does).

For example, if a UDF accepts Utf8, LargeUtf8 and Utf8View types for input it should return data in the same type as the input (where feasible).

Currently I'm aware of the following UDFs that consume Utf8View but do not return it:

  • substrindex
  • splitpart
  • translate
  • replace
  • reverse
  • rpad
  • lpad
  • repeat
  • overlay
  • lower
  • upper

There may be others I'm unaware of.

Describe the solution you'd like

All UDFs that return string data should return it in the same data type as the primary (or largest) argument type.
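As a rough illustration of the rule above, here is a minimal sketch. `StrType` is a hypothetical stand-in for the string variants of Arrow's `DataType`, and both function names are made up for this example. The issue does not define "largest", so the ordering Utf8 < LargeUtf8 < Utf8View used here is an assumption, as is the `Utf8` fallback for an empty argument list.

```rust
/// Hypothetical stand-in for the three string variants of Arrow's `DataType`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StrType {
    Utf8,
    LargeUtf8,
    Utf8View,
}

/// Single-string functions (e.g. `lower`, `reverse`): mirror the primary argument.
pub fn return_type_unary(primary: StrType) -> StrType {
    primary
}

/// Assumed precedence for "largest": Utf8 < LargeUtf8 < Utf8View.
fn rank(t: StrType) -> u8 {
    match t {
        StrType::Utf8 => 0,
        StrType::LargeUtf8 => 1,
        StrType::Utf8View => 2,
    }
}

/// Multi-string functions (e.g. `concat`): pick the highest-ranked argument type.
/// Falls back to `Utf8` when no arguments are given (an assumption).
pub fn return_type_variadic(args: &[StrType]) -> StrType {
    args.iter()
        .copied()
        .max_by_key(|t| rank(*t))
        .unwrap_or(StrType::Utf8)
}
```

With this sketch, a Utf8View input to a unary function yields Utf8View, and a mix of Utf8 and LargeUtf8 arguments yields LargeUtf8.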

Describe alternatives you've considered

No response

Additional context

Searching for usages of the utf8_to_str_type function is a decent way to find UDFs that have this issue. Be careful about fixing that function itself (for example, by adding a Utf8View case): doing so would require every caller to properly handle returning Utf8View.
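To make the caution above concrete, here is a sketch of a per-function migration that leaves the shared helper alone. Everything in it is hypothetical: `StrType` stands in for Arrow's `DataType`, `legacy_str_type` is only a simplified model of a helper that never yields Utf8View (assumed behaviour, consistent with the problem described in this issue, not the real `utf8_to_str_type` signature), and `view_preserving_str_type` and `declared_return_type` are invented names.

```rust
/// Hypothetical stand-in for the three string variants of Arrow's `DataType`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StrType {
    Utf8,
    LargeUtf8,
    Utf8View,
}

/// Simplified model of a shared helper that never returns `Utf8View`.
/// Assumption: view inputs collapse to `Utf8`.
pub fn legacy_str_type(input: StrType) -> StrType {
    match input {
        StrType::LargeUtf8 => StrType::LargeUtf8,
        StrType::Utf8 | StrType::Utf8View => StrType::Utf8,
    }
}

/// Hypothetical opt-in helper for functions whose kernels have already been
/// updated to build `Utf8View` output.
pub fn view_preserving_str_type(input: StrType) -> StrType {
    input
}

/// Each function picks its helper explicitly, so the declared return type
/// keeps matching what its kernel actually produces while others migrate.
pub fn declared_return_type(input: StrType, kernel_emits_view: bool) -> StrType {
    if kernel_emits_view {
        view_preserving_str_type(input)
    } else {
        legacy_str_type(input)
    }
}
```

The point of the separate helper is that a function that has not been migrated keeps declaring Utf8 for Utf8View input, so its declared type and its produced array cannot drift apart.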

Labels: enhancement (New feature or request)
