Unicode Specials block bug #2510

Open
manoe wants to merge 3 commits into CESNET:devel from nokia:unicode_block_range_bug

@manoe

manoe commented Apr 13, 2026

libyang/yanglint cannot match characters in the Unicode Specials block.

The function ly_pat_compile_xmlschema_chblocks_xmlschema2perl(), which translates YANG patterns containing Unicode blocks into PCRE2-compatible Perl regular expressions with character ranges, assumes that each range literal is 19 characters long. However, the Specials Unicode block is "special": it contains the disjoint character U+FEFF and the range U+FFF0-U+FFFD. The original implementation copies only the first 19 characters of the literal, that is \x{FEFF}|\x{FFF0}. Unfortunately, the resulting expression is still valid, so no error is reported, but it matches only U+FEFF and U+FFF0.
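The truncation can be reproduced in isolation. The sketch below is not libyang's code: `FIXED_URANGE_LEN`, `copy_interior_fixed()`, and the idea that literals are stored with enclosing brackets whose interior is then copied are all assumptions, chosen so that the copied text matches the one quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the hard-coded length:
 * "[\x{0000}-\x{007F}]" is 19 characters long. */
#define FIXED_URANGE_LEN 19u

/* Assumed layout: each range literal is stored with its enclosing brackets. */
static const char BASIC_LATIN[] = "[\\x{0000}-\\x{007F}]";
static const char SPECIALS[] = "[\\x{FEFF}|\\x{FFF0}-\\x{FFFD}]";

/*
 * Copy the bracket-less interior of a literal into dst, trusting the fixed
 * length instead of the literal's real length.
 * dst must hold at least FIXED_URANGE_LEN - 1 bytes.
 */
static void
copy_interior_fixed(char *dst, const char *literal)
{
    size_t n = FIXED_URANGE_LEN - 2; /* drop '[' and ']' */

    assert(strlen(literal) >= FIXED_URANGE_LEN);
    memcpy(dst, literal + 1, n);
    dst[n] = '\0';
}
```

For `BASIC_LATIN` the copy yields the full interior, whereas for `SPECIALS` it stops right after `\x{FFF0}`, so the `-\x{FFFD}` part never reaches the compiled pattern.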

The fix changes ublock2urange from a two-dimensional char array into an array of structs. Each entry also stores the length of its literal, filled in at compile time to preserve performance (assuming that the original intention behind hard-coding URANGE_LEN was to avoid strlen() calls).
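A minimal sketch of such a table, assuming the lengths are derived with `sizeof` on the string literals. The names `struct ublock_range`, `UBLOCK`, `ublock_ranges`, and `find_ublock()` are hypothetical, the two entries are only illustrative, and the actual patch may be structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical replacement for a two-dimensional char array. */
struct ublock_range {
    const char *name;   /* Unicode block name */
    const char *range;  /* PCRE2 character class for the block */
    size_t range_len;   /* length of range, excluding the terminating NUL */
};

/* sizeof on a string literal is a compile-time constant,
 * so no strlen() call is needed at run time. */
#define UBLOCK(name, range) { name, range, sizeof(range) - 1 }

static const struct ublock_range ublock_ranges[] = {
    UBLOCK("BasicLatin", "[\\x{0000}-\\x{007F}]"),
    UBLOCK("Specials", "[\\x{FEFF}|\\x{FFF0}-\\x{FFFD}]"),
};

/* Linear lookup by block name; returns NULL if the block is unknown. */
static const struct ublock_range *
find_ublock(const char *name)
{
    size_t i;

    assert(name);
    for (i = 0; i < sizeof ublock_ranges / sizeof ublock_ranges[0]; ++i) {
        if (!strcmp(ublock_ranges[i].name, name)) {
            return &ublock_ranges[i];
        }
    }
    return NULL;
}
```

With this layout each entry carries its own length, so a longer literal such as the one for Specials no longer has to fit a shared constant.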

Corresponding unit and component tests are also submitted.

@michalvasko
Member

Okay, thanks. It seems this should be fixed, but there is no need to add the length to the struct; a few strlen() calls do not make any difference, so please adjust it.
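For illustration, a strlen()-based substitution could look roughly like the sketch below. `replace_span()` is a hypothetical helper, not a libyang function, and its buffer handling (a caller-provided fixed-size buffer) is an assumption made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Replace the bytes regex[start, end) with the string range, measuring the
 * replacement with strlen() instead of relying on a fixed length.
 * Returns 0 on success, -1 if the result would not fit into regex_size bytes.
 */
static int
replace_span(char *regex, size_t regex_size, size_t start, size_t end,
        const char *range)
{
    size_t range_len = strlen(range);
    size_t tail_len;

    assert(start <= end && end <= strlen(regex));
    tail_len = strlen(regex + end) + 1; /* include the terminating NUL */

    if (start + range_len + tail_len > regex_size) {
        return -1;
    }

    /* shift the tail first, then drop the replacement into the gap */
    memmove(regex + start + range_len, regex + end, tail_len);
    memcpy(regex + start, range, range_len);
    return 0;
}
```

Because the length is measured per literal, the 28-character Specials class is copied in full, at the cost of one strlen() call per substitution.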

@manoe
Author

manoe commented Apr 13, 2026

CIFuzz keeps failing, but I cannot tell from the logs what went wrong. Could I get a little help with the fuzzing?

