[TEXT-103] Add support for weighted Levenshtein distance#737
[TEXT-103] Add support for weighted Levenshtein distance#737ron-ladin6 wants to merge 3 commits intoapache:masterfrom
Conversation
garydgregory
left a comment
There was a problem hiding this comment.
Hello @ron-ladin6
Thank you for the PR.
Please address my comment.
You did not run mvn by itself per the instructions in the PR template or you'd have found the current build check failures.
TY!
| */ | ||
| private final int replaceCost; | ||
|
|
||
| // ------------------------------------------------------------------------- |
There was a problem hiding this comment.
Remove these // comments please, they don't help maintain the code.
| * @throws IllegalArgumentException if threshold is negative, or any cost is negative. | ||
| * @since 1.13.0 | ||
| */ | ||
| public LevenshteinDistance(final Integer threshold, final int insertCost, final int deleteCost, |
There was a problem hiding this comment.
Use a builder instead, this will avoid constructor inflation and deprecations in the future. You should end upt with a private constructor that takes the Builder as its input. For an example in this component, see RandomStringGenerator but here the builder wouldn't need to extend the deprecated Builder interface.
| this.replaceCost = replaceCost; | ||
| } | ||
|
|
||
| // ------------------------------------------------------------------------- |
There was a problem hiding this comment.
Remove these // comments please, they don't help maintain the code.
| return unlimitedCompare(left, right, insertCost, deleteCost, replaceCost); | ||
| } | ||
|
|
||
| // ------------------------------------------------------------------------- |
There was a problem hiding this comment.
Remove these // comments please, they don't help maintain the code.
| * (the "License"); you may not use this file except in compliance with | ||
| * the License. You may obtain a copy of the License at | ||
| * | ||
| * https://www.apache.org/licenses/LICENSE-2.0 |
There was a problem hiding this comment.
Do NOT edit license headers.
| } | ||
|
|
||
| } | ||
| // ------------------------------------------------------------------------- |
There was a problem hiding this comment.
Remove these // comments please, they don't help maintain the code.
Do use Javadoc if you want to call something out.
| assertEquals(4, dist.getReplaceCost()); | ||
| } | ||
|
|
||
| } |
There was a problem hiding this comment.
Don't delete the empty line at the end of files.
|
hello @garydgregory , |
… and formatting fixes per PR review
|
.Hi @garydgregory , Thanks for the feedback. I've addressed your comments |
garydgregory
left a comment
There was a problem hiding this comment.
Thank you for your update.
- Fix the new Javadoc
sincetag since1.13.0was released in 2024; the next feature version will be1.16.0 - Use the
setprefix for builder setters.
TY!
…for builder methods
|
@garydgregory |
|
We're making changes on an algorithm that we've cited as having been pulled from a book...this is questionable. I wonder if we need a new file all toghether. |
|
I used AI tools to help structure the code rather than copying it from a textbook. However, if the implementation ends up looking like a direct copy from a specific book, I agree that it's better not to include it as-is. |
|
That flat doesn't fly at all. We can't say the code is from a book and have it be from AI |
|
Note @ron-ladin6 I'm happy to help work this in as it's own class |
This PR adds support for weighted Levenshtein distance by allowing custom costs for insertion, deletion, and substitution operations.
Key Changes:
LevenshteinDistancefor custom costs.limitedCompareCustomCostto handle weights while maintaining memory efficiency ($O(\min(n, m))$).