Normalize user CRDT writes to NFD#2146
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR introduces Unicode NFD text normalization wrappers for the MiniLcm write API. It adds a new Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
The latest updates on your projects. Learn more about Argos notifications ↗︎
|
UI unit Tests 1 files 59 suites 31s ⏱️ Results for commit f3497d6. ♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@backend/FwLite/FwLiteProjectSync/CrdtFwdataProjectSyncService.cs`:
- Around line 66-70: The CRDT API wrapper order is inconsistent with
MiniLcmApiHubBase: replace the current nested creation that inserts
readNormalization between writeNormalization and validation and instead wrap
crdtApi so writeNormalization directly wraps validation (i.e., use
writeNormalizationWrapperFactory.Create(validationWrapperFactory.Create(crdtApi)))
to match MiniLcmApiHubBase; keep fwdataApi as-is (readNormalization +
validation). Update the line that assigns crdtApi which currently calls
writeNormalizationWrapperFactory.Create(readNormalizationWrapperFactory.Create(validationWrapperFactory.Create(crdtApi))).
In `@backend/FwLite/FwLiteShared/Services/MiniLcmJsInvokable.cs`:
- Line 26: The wrapper factories are applied in reverse by WrapWith, causing
validation to run before write normalization; update the factory list passed to
WrapWith when constructing _wrappedApi so the runtime order becomes Client →
Write Normalization → Read Normalization → Validation → Core API by using the
factory sequence [notificationWrapperFactory, validationWrapperFactory,
readNormalizationWrapperFactory, writeNormalizationWrapperFactory] (so after
WrapWith reversal it yields the documented write→read→validation order); ensure
this change in IMiniLcmApi _wrappedApi construction mirrors the ordering used in
MiniLcmApiHubBase.
In `@backend/FwLite/MiniLcm.Tests/WriteNormalizationTests.cs`:
- Around line 536-557: The test exclusion for JsonPatch overloads leaves
NormalizePatch untested; remove or adjust the IsJsonPatchOverload filter in
GetMethodsThatNeedTests so UpdateObjectInput<> overloads are included, and add a
test (e.g., UpdateWritingSystem) that sends a JsonPatchDocument containing
non-normalized strings through the same path the
MiniLcmWriteApiNormalizationWrapper uses (or invoke the wrapper directly) and
asserts the underlying API receives normalized values; reference
UpdateObjectInput<>, IsJsonPatchOverload, GetMethodsThatNeedTests,
MiniLcmWriteApiNormalizationWrapper, NormalizePatch, and the UpdateWritingSystem
test name when implementing the change.
In `@backend/FwLite/MiniLcm/Normalization/README.md`:
- Line 189: Replace the placeholder issue link "Issue:
https://github.com/sillsdev/languageforge-lexbox/issues/[issue-number]" in the
README by updating the bracketed token to the real GitHub issue number (or the
correct URL) so the reference resolves; locate the exact line containing that
string in Normalization/README.md and substitute [issue-number] with the actual
numeric ID (or remove the line if no issue exists).
🧹 Nitpick comments (4)
backend/FwLite/MiniLcm/Normalization/README.md (2)
24-26: Replace ASCII diagram with Mermaid.As per coding guidelines, use Mermaid diagrams instead of ASCII art for flowcharts and architecture documentation.
📊 Proposed Mermaid diagram
-``` -Client → Write Normalization → Read Normalization → Validation → Core API → Database -``` +```mermaid +flowchart LR + Client --> WN[Write Normalization] + WN --> RN[Read Normalization] + RN --> V[Validation] + V --> CA[Core API] + CA --> DB[Database] +```
186-189: Convert bare URLs to markdown links.Static analysis flagged bare URLs. Consider using proper markdown link syntax for consistency.
📝 Proposed fix
## References -- Unicode Normalization: https://unicode.org/reports/tr15/ -- Issue: https://github.com/sillsdev/languageforge-lexbox/issues/[issue-number] +- [Unicode Normalization (TR15)](https://unicode.org/reports/tr15/) +- [Related Issue](https://github.com/sillsdev/languageforge-lexbox/issues/[issue-number]) - FwLite Architecture: `backend/FwLite/AGENTS.md`backend/FwLite/MiniLcm/Normalization/StringNormalizer.cs (1)
53-65: Inconsistent handling of null RichString values in RichMultiString.When
Normalize(value)returns null (which happens when the input RichString is null per lines 41), the entry is silently dropped from the result. This differs fromNormalize(MultiString)which preserves all keys including those with empty/null values.Consider whether this asymmetry is intentional. If null RichStrings should be preserved, the logic should be updated.
🔄 Proposed fix to preserve null entries
public static RichMultiString Normalize(RichMultiString richMultiString) { var normalized = new RichMultiString(richMultiString.Count); foreach (var (key, value) in richMultiString) { var normalizedRichString = Normalize(value); - if (normalizedRichString is not null) - { - normalized[key] = normalizedRichString; - } + // Preserve all keys, even if the value is null + normalized[key] = normalizedRichString; } return normalized; }backend/FwLite/MiniLcm.Tests/WriteNormalizationTests.cs (1)
928-1171: Consider handlingIEnumerable<string>collections.Currently only
string[]is validated;List<string>/IReadOnlyList<string>would be skipped. If such fields exist now or later, normalization checks could miss them.♻️ Possible enhancement
- // Check collections of model objects - else if (value is IEnumerable enumerable && value is not string) + // Check collections of strings + else if (value is IEnumerable<string> stringEnumerable) + { + int index = 0; + foreach (var item in stringEnumerable) + { + CheckString(item, $"{propPath}[{index++}]", expectNfc, issues); + } + } + // Check collections of model objects + else if (value is IEnumerable enumerable && value is not string) { int index = 0; foreach (var item in enumerable)
62b303e to
5616334
Compare
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/FwLite/FwLiteWeb/Hubs/MiniLcmApiHubBase.cs`:
- Around line 16-22: The constructor parameter projectIdentifier must not be
force-unwrapped; change the MiniLcmApiHubBase constructor signature to accept a
non-null IProjectIdentifier (remove the nullable ?) and remove the
null-forgiving operator from the WrapWith call so WrapWith(projectIdentifier)
gets a true non-null value; update any derived classes (e.g., FwDataMiniLcmHub)
to validate/obtain a non-null IProjectIdentifier and pass it into the base
constructor (do the null checks before calling base) so that WrapWith can be
invoked safely in the base ctor without runtime exceptions from
projectIdentifier!.
In `@backend/FwLite/MiniLcm/Normalization/MiniLcmWriteApiNormalizationWrapper.cs`:
- Around line 324-340: NormalizeEntry and its helper calls (NormalizeSense,
NormalizeComplexFormComponent and StringNormalizer.Normalize) currently
dereference nested properties and collections directly which throws if fields
like LexemeForm, Senses, Components or ComplexForms are null; update these
methods to be null-safe by checking for null before normalizing (use
conditional/null-conditional operators or null-coalescing to empty collections
and only call StringNormalizer.Normalize when the source string is non-null),
return null or empty collections as appropriate instead of throwing, and apply
the same pattern to the other affected blocks around lines referenced (the other
Normalize* wrapper uses at 397-410 and 460-470) so validation can catch missing
required fields instead of producing a 500.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: b8cdb204-005f-462d-9c44-808b6c3f7328
📒 Files selected for processing (11)
backend/FwLite/FwLiteShared/Services/MiniLcmJsInvokable.csbackend/FwLite/FwLiteWeb/Hubs/CrdtMiniLcmApiHub.csbackend/FwLite/FwLiteWeb/Hubs/FwDataMiniLcmHub.csbackend/FwLite/FwLiteWeb/Hubs/MiniLcmApiHubBase.csbackend/FwLite/MiniLcm.Tests/MiniLcmTestBase.csbackend/FwLite/MiniLcm.Tests/QueryEntryTestsBase.csbackend/FwLite/MiniLcm.Tests/WriteNormalizationTests.csbackend/FwLite/MiniLcm/Normalization/MiniLcmWriteApiNormalizationWrapper.csbackend/FwLite/MiniLcm/Normalization/StringNormalizer.csbackend/FwLite/MiniLcm/Validators/MiniLcmValidators.csbackend/FwLite/MiniLcm/Wrappers/MiniLcmWrappers.cs
c09b772 to
10647af
Compare
10647af to
1615b05
Compare
* Add CustomView forwarding to normalization wrapper; apply write normalization to all user-facing entry points - MiniLcmWriteApiNormalizationWrapper: add #region CustomView with pass-through CreateCustomView/UpdateCustomView/DeleteCustomView (no normalization needed; Name is view configuration metadata, not linguistic user text) - MiniLcmApiHubBase: make MiniLcmWriteApiNormalizationWrapperFactory non-nullable - FwDataMiniLcmHub: inject and pass write normalization wrapper (FwData now uses it too, for consistent forwarding-new-objects behaviour) - MiniLcmRoutes: apply read + write normalization wrappers via WrapWith, matching MiniLcmJsInvokable / MiniLcmApiHubBase pattern - CrdtFwdataProjectSyncService: add comment documenting that write normalization is intentionally not applied during sync https://claude.ai/code/session_01LN7JMAeHPcoXp7r7xVR8FH * Introduce MiniLcmApiUserFacingWrappers to centralise wrapper ordering and rationale - New MiniLcm/Wrappers/MiniLcmApiUserFacingWrappers.cs groups the three standard factories (read normalisation → write normalisation → validation) with a doc comment explaining why write normalisation is applied to both CRDT and FwData backends, and an Apply(api, project, params innerWrappers) method for callers that need extras (e.g. MiniLcmJsInvokable adds the notification wrapper) - Registered via AddMiniLcmValidators() - All user-facing entry points simplified to inject/call MiniLcmApiUserFacingWrappers: MiniLcmJsInvokable, MiniLcmApiHubBase, CrdtMiniLcmApiHub, FwDataMiniLcmHub, MiniLcmRoutes - CrdtFwdataProjectSyncService: remove read normalisation wrapper (data is already normalised on both sides); simplify comment - CustomView wrapper comment broadened from Name-specific to the whole object https://claude.ai/code/session_01LN7JMAeHPcoXp7r7xVR8FH * Fix MiniLcmApiUserFacingWrappers comment: explain transitive normalisation, drop api??this jargon https://claude.ai/code/session_01LN7JMAeHPcoXp7r7xVR8FH * Correct MiniLcmApiUserFacingWrappers comment: upfront normalisation covers all sub-objects The "transitive coverage" claim was wrong: NormalizeEntry normalises the full object tree before forwarding, so sub-objects from the sync logic are already normalised. The real reason for uniform FwData normalisation is a simpler invariant, not coverage. https://claude.ai/code/session_01LN7JMAeHPcoXp7r7xVR8FH * Don't pass 'this' in normalization wrapper; note new-instance behaviour in comment Upfront normalisation covers the full object tree, so sub-operations triggered by Update* sync logic use data that is already normalised. Passing 'this' back into the recursion caused a redundant second normalisation pass with no benefit. Now the wrapper forwards 'api' as-is; the inner validation wrapper still inserts itself for sub-calls. Also adds a note to MiniLcmApiUserFacingWrappers that the wrapper produces new instances, so the caller's original is never mutated. https://claude.ai/code/session_01LN7JMAeHPcoXp7r7xVR8FH * Remove ! on projectIdentifier; null-guard Normalize* helpers against bad JSON input Item 1: MiniLcmApiHubBase now takes non-nullable IProjectIdentifier; the null-forgiving operator is gone. FwDataMiniLcmHub throws a clear InvalidOperationException at connection time if context.Project is null rather than silently propagating null through !. Item 3: NormalizeEntry/NormalizeSense/NormalizeExampleSentence now use ?? new() for MultiString/RichMultiString fields and ?? [] for collection fields. JSON deserialization from external input can produce null values at runtime despite non-nullable annotations and = new() initializers; the guards prevent a 500 NRE and let the validation layer downstream report the problem properly. https://claude.ai/code/session_01LN7JMAeHPcoXp7r7xVR8FH * Move write normalisation after validation; revert null guards Validation now runs before write normalisation (read norm → validation → write norm). Bad input is rejected with a 400 before the normaliser sees it; the normaliser can therefore assume structurally valid data and needs no null guards of its own. Reverts the ?? new() / ?? [] guards added in the previous commit. https://claude.ai/code/session_01LN7JMAeHPcoXp7r7xVR8FH * Add inline comment explaining validation-before-write-normalisation ordering https://claude.ai/code/session_01LN7JMAeHPcoXp7r7xVR8FH * Rename normalization wrappers for consistent naming MiniLcmApiStringNormalizationWrapper → MiniLcmApiQueryNormalizationWrapper MiniLcmWriteApiNormalizationWrapper → MiniLcmApiWriteNormalizationWrapper Both now follow the MiniLcmApi* prefix convention used across all wrapper factories. "Query" better describes the role of the read-side wrapper (normalises search/exemplar query parameters, not all strings). https://claude.ai/code/session_01LN7JMAeHPcoXp7r7xVR8FH --------- Co-authored-by: Claude <noreply@anthropic.com>
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
Introduces normalizing new CRDT data to NFD, like LibLcm.
Note: the new API wrapper only auto-implements the READ API, so that future necessary normalization is not overlooked.