feat: Add Google RE2/J linear time regular expression as alternative to Java regex #19514
FrankChen021
left a comment
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 0 |
| P2 | 1 |
| P3 | 1 |
| Total | 2 |
Reviewed 21 of 21 changed files.
This is an automated review by Codex GPT-5.5
FrankChen021
left a comment
I reviewed the follow-up. The FlattenSpec heading has been restored, so no inline reply is needed.
Reviewed 24 of 24 changed files.
This is an automated review by Codex GPT-5.5
FrankChen021
left a comment
I reviewed the follow-up. The regex engine wiring concern is handled: the module is now present in startup injection and the ingestion-facing service paths touched by the follow-up, including indexer, middle manager, overlord/sampler, peon, and coordinator standalone wiring.
Reviewed 24 of 24 changed files.
This is an automated review by Codex GPT-5.5
I'm slightly concerned about the multiple interfaces and classes introduced in this PR. I wonder whether we could just add We could have a system-wide default engine, but this could be overridden in
Fixes #19513.
Description
Add Google RE2/J linear time regular expression as alternative to Java regex
`druid.regex.engine=JAVA`
Supported values:
- `JAVA`: `java.util.regex.Pattern` engine.
- `RE2J`: Google's `RE2/J` regex engine with linear-time matching guarantees.
Default value:
`druid.regex.engine=JAVA`
RE2/J engine
Setting:
`druid.regex.engine=RE2J`
enables the RE2/J regex engine for ingestion task `regex` input formats.
RE2/J helps protect against catastrophic backtracking and Regular Expression Denial of Service (ReDoS) attacks by guaranteeing linear-time regex evaluation.
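As a rough illustration of the setting described above, the following sketch resolves `druid.regex.engine` from a `java.util.Properties` object and falls back to `JAVA` when the property is absent. The class name `RegexEngineConfigSketch`, the `RegexEngine` enum, and the `fromProperties` helper are hypothetical names chosen for this example; they are not taken from the PR, whose actual wiring goes through Druid's module injection.

```java
import java.util.Properties;

public class RegexEngineConfigSketch {
    /** Hypothetical enum mirroring the two documented values of druid.regex.engine. */
    enum RegexEngine { JAVA, RE2J }

    /**
     * Reads druid.regex.engine and defaults to JAVA when it is not set.
     * Assumption: values are matched exactly (after trimming); an unknown
     * value makes valueOf throw IllegalArgumentException.
     */
    static RegexEngine fromProperties(Properties props) {
        String raw = props.getProperty("druid.regex.engine", "JAVA");
        return RegexEngine.valueOf(raw.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(fromProperties(props)); // property absent, default applies

        props.setProperty("druid.regex.engine", "RE2J");
        System.out.println(fromProperties(props)); // explicit opt-in to RE2/J
    }
}
```

Rejecting unknown values at startup, rather than silently falling back to the default, is one possible design choice here; the PR may handle this differently.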
Compatibility differences
RE2/J does not support all Java regex features.
Unsupported or partially supported features include:
Patterns using unsupported constructs will fail during regex compilation.
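Because unsupported patterns fail at compilation rather than at match time, a pattern can be checked eagerly when a spec is loaded. The sketch below shows that fail-fast idea using `java.util.regex.Pattern` as a stand-in, since RE2/J is a third-party dependency; with RE2/J on the classpath the same shape would presumably apply to `com.google.re2j.Pattern.compile`. The class name `PatternValidationSketch` and the helper `compileError` are hypothetical and not part of the PR.

```java
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class PatternValidationSketch {
    /**
     * Tries to compile the pattern up front.
     * Returns the engine's error description if compilation fails,
     * or an empty Optional if the pattern is accepted.
     */
    static Optional<String> compileError(String regex) {
        try {
            Pattern.compile(regex);
            return Optional.empty();
        } catch (PatternSyntaxException e) {
            return Optional.of(e.getDescription());
        }
    }

    public static void main(String[] args) {
        // A well-formed pattern compiles, so no error is reported.
        System.out.println(compileError("(\\d+),(\\w+)").isPresent());

        // A malformed pattern is rejected at compile time, before any input is read.
        System.out.println(compileError("(unclosed").isPresent());
    }
}
```

Validating at load time means a pattern that the selected engine rejects surfaces as a configuration error instead of a failure partway through an ingestion task.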
Example of catastrophic backtracking
The following Java regex may cause catastrophic backtracking:
against input such as:
Using `RE2J` avoids this issue.
Performance considerations
- `JAVA` may support more advanced regex syntax and behavior.
- `RE2J` provides safer and more predictable runtime characteristics.
- `JAVA` may be preferred for compatibility.
- `RE2J` is recommended.
This PR has: