@tchivs tchivs commented Dec 25, 2025

Summary

Allow table-options in transform rules to use a semicolon (;) as the key/value pair delimiter, so that option values can safely contain commas (e.g. sequence.field=gxsj,jjsj). The comma delimiter is kept for backward compatibility.

Motivation

Some downstream table options are multi-valued and use commas inside the value. With the existing key1=value1,key2=value2 syntax, such a value is split at its internal commas, so these options cannot be expressed reliably.

Changes

  • SchemaMetadataTransform parses table-options pairs using:
    • ; as the pair delimiter when it is present in the string, otherwise ,
    • split("=", 2) on each pair, so only the first = separates key from value and values may contain =
  • Docs updated to mention the semicolon delimiter for comma-in-value cases.
  • New unit tests for the parsing behavior.
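
The parsing rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in SchemaMetadataTransform: the class name TableOptionsSketch and the method parse are hypothetical, and the trimming and error handling are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the rule; not the actual PR code.
public class TableOptionsSketch {

    public static Map<String, String> parse(String tableOptions) {
        Map<String, String> options = new LinkedHashMap<>();
        if (tableOptions == null || tableOptions.trim().isEmpty()) {
            return options;
        }
        // Use ';' as the pair delimiter if it appears anywhere, otherwise ','.
        String delimiter = tableOptions.contains(";") ? ";" : ",";
        for (String pair : tableOptions.split(delimiter)) {
            if (pair.trim().isEmpty()) {
                continue; // tolerate empty segments such as "a=1;;b=2"
            }
            // Limit of 2: only the first '=' separates key from value.
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Invalid table option: " + pair);
            }
            options.put(kv[0].trim(), kv[1].trim());
        }
        return options;
    }
}
```

With this sketch, sequence.field=gxsj,jjsj;file-index.bloom-filter.columns=jjdbh yields two entries, and the first value keeps its comma.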

Behavior

  • Still supports legacy format:
    • table-options: key1=value1,key2=value2
  • Supports semicolon format (recommended when values contain commas):
    • table-options: sequence.field=gxsj,jjsj;file-index.range-bitmap.columns=jjsj;file-index.bloom-filter.columns=jjdbh
  • Note: delimiter is chosen by presence of ; (do not mix , and ; in the same string).

@github-actions bot added the docs and runtime labels Dec 25, 2025

tchivs commented Dec 29, 2025

Thanks for the suggestion @leonardBang! I've updated the PR to support custom delimiters as you recommended.

Changes made:

  1. Added a new optional configuration, table-options.delimiter, that lets users specify a custom delimiter (e.g. ;, |, or $)
  2. The default value is , for backward compatibility
  3. Updated the documentation with examples showing how to use custom delimiters

Example usage:

transform:
  - source-table: mydb.mytable
    table-options: sequence.field=gxsj,jjsj;file-index.bloom-filter.columns=jjdbh
    table-options.delimiter: ";"

This approach is more flexible: users can choose any delimiter that does not occur in their option values.
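
A rough sketch of how an explicit delimiter could be applied when parsing follows. The class DelimitedTableOptionsSketch and its parse method are hypothetical and are not taken from the PR. One detail worth noting is that String.split treats its argument as a regular expression, so delimiters such as | or $ need to be quoted; whether the PR does this is not stated here, and Pattern.quote is used only as one way to handle it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper; names and fallback behavior are assumptions.
public class DelimitedTableOptionsSketch {

    // Mirrors the documented default of ',' for backward compatibility.
    public static final String DEFAULT_DELIMITER = ",";

    public static Map<String, String> parse(String tableOptions, String delimiter) {
        String effective =
                (delimiter == null || delimiter.isEmpty()) ? DEFAULT_DELIMITER : delimiter;
        Map<String, String> options = new LinkedHashMap<>();
        if (tableOptions == null || tableOptions.trim().isEmpty()) {
            return options;
        }
        // Quote the delimiter so regex metacharacters ('|', '$') match literally.
        for (String pair : tableOptions.split(Pattern.quote(effective))) {
            if (pair.trim().isEmpty()) {
                continue;
            }
            // Only the first '=' separates key from value.
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Invalid table option: " + pair);
            }
            options.put(kv[0].trim(), kv[1].trim());
        }
        return options;
    }
}
```

Passing null as the delimiter falls back to the comma, so the legacy key1=value1,key2=value2 format parses as before.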

This commit extends the table-options feature to support custom delimiters,
making it more flexible and powerful for users.

Changes:
- Add optional 'table-options.delimiter' configuration parameter
- Default delimiter is ',' for backward compatibility
- Support any custom delimiter (e.g., ';', '|', '$', etc.)
- Update TransformDef, SchemaMetadataTransform, and TransformRule classes
- Update YAML parser to handle the new configuration
- Add test cases for custom delimiter functionality
- Update documentation with usage examples
