Simplify interlinear model#63
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
imnasnainaec
left a comment
There was a problem hiding this comment.
@imnasnainaec reviewed 10 files and all commit messages, and made 2 comments.
Reviewable status: 10 of 17 files reviewed, 1 unresolved discussion (waiting on alex-rawlings-yyc and jasonleenaylor).
src/components/CreateProjectModal.tsx line 126 at r1 (raw file):
</label> <input id="analysis-language"
🌱 We may want to allow multiple langs (comma-separated?) on creation.
src/components/ProjectMetadataModal.tsx line 95 at r1 (raw file):
const newLanguage = editLanguage.trim(); try { const updatedProjectJson = await papi.commands.sendCommand(
Doesn't this fall into the known trap of inadvertently clearing the target?
alex-rawlings-yyc
left a comment
There was a problem hiding this comment.
@alex-rawlings-yyc made 2 comments.
Reviewable status: 10 of 17 files reviewed, 1 unresolved discussion (waiting on imnasnainaec and jasonleenaylor).
src/components/CreateProjectModal.tsx line 126 at r1 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
🌱 We may want to allow multiple langs (comma-separated?) on creation.
I haven't gotten around to doing a lot of adjustment of how the model is handled yet. My main goal with this PR was to present some changes to it and make changes to the code accordingly if we agree to change it
src/components/ProjectMetadataModal.tsx line 95 at r1 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
Doesn't this fall into the known trap of inadvertently clearing the target?
Oversight. I haven't gone through the code very closely yet as I just want to focus on the model itself for now.
6884c8b to
50d0e8b
Compare
253c32e to
f5083ae
Compare
50d0e8b to
12707a8
Compare
alex-rawlings-yyc
left a comment
There was a problem hiding this comment.
@alex-rawlings-yyc reviewed 18 files and all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on imnasnainaec and jasonleenaylor).
a9576db to
9d7c847
Compare
alex-rawlings-yyc
left a comment
There was a problem hiding this comment.
@alex-rawlings-yyc reviewed 12 files and all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on imnasnainaec and jasonleenaylor).
alex-rawlings-yyc
left a comment
There was a problem hiding this comment.
@alex-rawlings-yyc reviewed 6 files and all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on imnasnainaec and jasonleenaylor).
alex-rawlings-yyc
left a comment
There was a problem hiding this comment.
@alex-rawlings-yyc reviewed 2 files and all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on imnasnainaec and jasonleenaylor).
alex-rawlings-yyc
left a comment
There was a problem hiding this comment.
@alex-rawlings-yyc reviewed 2 files and all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on imnasnainaec and jasonleenaylor).
cda9f96 to
9c494fc
Compare
alex-rawlings-yyc
left a comment
There was a problem hiding this comment.
@alex-rawlings-yyc reviewed 11 files and all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on imnasnainaec and jasonleenaylor).
271581e to
69c889d
Compare
alex-rawlings-yyc
left a comment
There was a problem hiding this comment.
@alex-rawlings-yyc reviewed 2 files and all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on imnasnainaec and jasonleenaylor).
a6816b4 to
a17e5fb
Compare
d40f053 to
b3db9a2
Compare
a17e5fb to
9a38c5d
Compare
…ect and add associated test, fix schema inconsistency
…update mocks to match real API signatures - If the user picks the same project for both source and target roles, a warning notification is shown and the target picker re-opens until distinct projects are chosen or cancelled. - Includes new tests for this flow, updated platform-bible-react/utils mocks to match current API shapes, a new localized error string, and minor test/parser cleanups.
Introduces three modal components (CreateProjectModal, ProjectMetadataModal, SelectInterlinearProjectModal) with supporting menu contributions, localized strings, command registrations in main.ts, storage helpers in projectStorage.ts, and updated type definitions. Adds full test coverage for all new components and expands existing test suites to cover the new flows.
…oss/senseRef constraint - Replace `<div>` containers with `<dialog open aria-labelledby="…">` in `CreateProjectModal`, `ProjectMetadataModal`, and `SelectInterlinearProjectModal` for proper accessibility semantics. - In `ProjectMetadataModal.handleSave`, return early when `updateProjectMetadata` resolves with a falsy value (mirrors the existing guard in `CreateProjectModal.handleCreate`). - In `interlinearizer.d.ts`, replace the discriminated union on `Token` and `Phrase` that forbade setting both `gloss`/`glossSenseRef` (or `gloss`/`senseRef`) with optional fields, reflecting the updated rule that `gloss` serves as a local override when both are present. - Add a test for the falsy-return early-exit path in `ProjectMetadataModal`, and add the missing `waitFor` before negative assertions in the `CreateProjectModal` falsy-return test.
…zero-check in ContinuousView explicit
Introduces three modal components (CreateProjectModal, ProjectMetadataModal, SelectInterlinearProjectModal) with supporting menu contributions, localized strings, command registrations in main.ts, storage helpers in projectStorage.ts, and updated type definitions. Adds full test coverage for all new components and expands existing test suites to cover the new flows.
…zero-check in ContinuousView explicit
…xt, add ActiveProject - Removes the unused bilingual container types (`InterlinearAlignment`, `InterlinearText`) whose `target` side had no role in the current single-source annotation workflow. - Introduces `ActiveProject` as the runtime pairing of a persisted `InterlinearProject` envelope with its reconstructed `Book[]` source text. - Adds `saveProjectAnalysis` to `projectStorage` as the designated save path for analysis/link mutations.
…0% coverage - InterlinearProject: replace analysisWritingSystem: string with analysisLanguages: string[] (LCM/PT9 carry multiple analysis languages per project; a single string was lossy) and add optional targetProjectId so BT Extension's two-sided Translation can be represented without dropping the target text layer - ActiveProject: add target?: Book[] alongside source so AlignmentLink.targetEndpoints resolve to real tokens at runtime - TokenAnalysis / Phrase: remove gloss ↔ senseRef discriminated union; both fields are now independently optional and may coexist — local gloss takes precedence over the lexicon-derived value when both are present - TokenAnalysis: add excluded?: boolean to carry PT9's Cluster.Excluded flag projectStorage: update createProject / updateProjectMetadata signatures and logic to match; updateProjectMetadata clears targetProjectId when called without it (mirrors name/description clear semantics) - papi-backend mock: add papi.storage (readUserData / writeUserData / deleteUserData) which was missing entirely - Remove two dead branches: parseStrictNumber's n >= 0 guard (the ^\d+$ regex makes negative results impossible) and stableStringify's undefined guard (never reachable via the typed call site) - Add tests for duplicate verse SID throw and new storage behaviour; all 268 tests pass at 100% branch coverage
Cluster.Excluded is a deprecated PT9 tag; carrying it into the model adds noise with no consumer value.
…ag from PT9 XML docs
…s, improve error handling
66b0394 to
fcef9f9
Compare
Interlinear Model — Design Justification
Overview
The interlinear model is built around two types:
InterlinearProject— the persisted envelope: analysis data, alignment links, and references to Platform.Bible source/target projects.ActiveProject— the runtime object: the envelope paired with reconstructedBook[]text hierarchies (never serialized).This document explains the key design decisions and their rationale.
Decision 1: Single
analysisfield; optionaltargetInterlinearProjectholds oneanalysis: TextAnalysisfield and an optionaltargetProjectId?. At runtime,ActiveProjectexposessource: Book[]andtarget?: Book[].Why not split analysis into
sourceAnalysis/targetAnalysis? A single-source project (LCM, PT9 glossing) has no target text. Two sibling fields would leavetargetAnalysispermanently empty and force callers to always write tosourceAnalysisby convention rather than by structure. A singleanalysisfield is unambiguous regardless of whether the project is single-source or bilateral.Why not require
target? LCM and single-sided PT9 projects have no target text. Makingtargetrequired would force callers to construct a placeholder — semantically dishonest and a source of drift. An optional field lets single-source projects simply omit it while bilateral projects (BT Extension) settargetProjectIdand gettargetpopulated at runtime.Decision 2:
InterlinearProject+ActiveProjectsplit (persisted vs. runtime)The text hierarchy (
Book/Segment/Token) lives only onActiveProject, not onInterlinearProject. It is rebuilt from Platform.Bible's USJ on every load and never serialized.Why not serialize the text layer? Platform.Bible owns the source text — duplicating it in storage would create a second copy that can drift from the canonical version. Storing only the analysis layer keeps
InterlinearProjectsmall and makes the baseline authoritative. Drift is detected viatokenSnapshotfields onTokenAnalysisandAlignmentEndpoint, which record the token's surface text at analysis time; a mismatch flipsstatusto'stale'.Why a separate type rather than computed fields on
InterlinearProject?ActiveProjectmakes the serialization boundary explicit in the type system. Callers that hold anActiveProjectknow they have live text; callers that hold anInterlinearProjectknow they have only the persisted envelope. This prevents accidental serialization ofBook[].Things to Keep in Mind During Implementation
1.
gloss/glossSenseRefco-presence: callers must enforce rendering priorityThe type system allows both fields to be set simultaneously (intentional, for PT9/LCM round-trips). When rendering or exporting,
glosstakes precedence overglossSenseRef. This rule is not enforced by the model — UI and export code must apply it consistently.2.
targetmay be absent even whentargetProjectIdis setIf a target Platform.Bible project is deleted after the
InterlinearProjectwas created,ActiveProject.targetwill beundefinedat runtime whileproject.targetProjectIdremains set. Code that readsAlignmentLink.targetEndpointsmust guard against this case.3.
updateProjectMetadata: omittingtargetProjectIdclears ittargetProjectIdis an optional trailing parameter. Omitting it (or passingundefined) silently deletes the field — the same behavior asnameanddescription. Callers that only intend to update other fields must explicitly forward the currenttargetProjectIdvalue to avoid inadvertently unbinding the target project.4.
saveProjectAnalysisis a blind read-modify-write with no conflict detectionsaveProjectAnalysisreads the current project, merges in the newanalysisandlinks, and writes back. If two WebView sessions for the same project both load state and then save, the second write silently overwrites the first. There is no last-modified timestamp, version field, or conflict signal. This is unlikely to matter in practice on a single-user Electron desktop, but would become a real hazard if the platform ever supports concurrent sessions or cloud-synced storage.Changes Made Since Original Description
The following changes were applied to
src/types/interlinearizer.d.tsas part of the model refactor. Call sites (bookTokenizer.ts,SegmentView.tsx,PhraseBox.tsx,ContinuousView.tsx, and all tests) still reference the old names and will failtsc --noEmituntil updated in a follow-on pass.Token.id→Token.ref; cross-references updated throughouttokenIdsounded like it might be content-derived (a hash or similar). In this codebaserefalready means a positional identifier (bookRef,startRef,endRef);Token.reffollows that pattern and makes "position-based, not content-derived" explicit. All downstream cross-reference fields were updated:tokenId→tokenRef,tokenIds→tokenRefs.Token.refembeds the verse SID and the token's zero-based character offset (e.g."GEN 1:1:0"), making refs unique within anInterlinearProject(not just within aBook).Segment.idis similarly project-wide unique — it is set to the verse SID (e.g."GEN 1:1").AlignmentEndpointrestructured;TokenSnapshotandMorphemeLinkextractedThe old
AlignmentEndpointwas a flat struct with an& ({ morphemeId?: never } | { tokenAnalysisId: string; morphemeId: string })intersection that mixed the token reference, a surface-text snapshot, and morpheme pointers into one type. The new shape separates concerns into three interfaces:AlignmentEndpoint— holdstoken: TokenSnapshotandmorphemeLink?: MorphemeLinkTokenSnapshot— carriestokenRefandsurfaceText(the snapshot; previouslytokenSnapshot?: stringon the endpoint itself, now required)MorphemeLink— carriestokenAnalysisId+morphemeId; its presence/absence is the discriminantThis removes the flat-intersection invariant that
tokenAnalysisIdandmorphemeIdmust appear together, and makessurfaceTextnon-optional (a snapshot is always required when creating an endpoint).Morpheme→MorphemeAnalysisRenamed to be consistent with
TokenAnalysisandSegmentAnalysis.MorphemeAnalysisdoes not extendAnalysis— the unit of review is theTokenAnalysisrecord, not individual morphemes. An MSA tool produces aTokenAnalysiswithstatus: 'suggested'; the morphemes array is data within that record.Phrase→PhraseAnalysis;tokenSnapshotsremovedPhrasewas renamedPhraseAnalysisto match the naming convention ofTokenAnalysisandSegmentAnalysis. ThetokenSnapshots?: [string, ...string[]]parallel-array field — which had to be kept in sync withtokenIdsby convention with no typed enforcement — was dropped. Phrase drift detection remains an open question (see Open Questions below).Analysisbase interface addedSegmentAnalysis,TokenAnalysis, andPhraseAnalysisall carried duplicatedid,status,confidence?,producer?,sourceUser?fields. These are extracted into a sharedAnalysisbase that all three extend.type→interfaceforTokenAnalysis,PhraseAnalysis,AlignmentEndpointThese were declared as
typealiases but described object shapes with no union or intersection logic. Converted tointerfacefor consistency with the rest of the model.InterlinearProject.linksis now optionallinkswas previouslyAlignmentLink[](always present). It is nowlinks?: AlignmentLink[]. For analysis-only projects (LCM, PT9) that have notargetProjectId, the field is absent rather than an empty array. This makes analysis-only projects structurally distinct from bilateral alignment projects, though the compiler does not yet enforce thetargetProjectId ↔ linksco-presence invariant (see Open Questions below).Open Questions for Team Discussion
1. Phrase staleness
PhraseAnalysiscurrently has no drift-detection mechanism. The removedtokenSnapshots?: [string, ...string[]]parallel-array was fragile — it had to be kept in sync withtokenIdsby convention and had no typed enforcement. When a member token's surface text changes the phrase has no way to detect it independently.Proposed fix: add
tokens: [TokenSnapshot, ...TokenSnapshot[]]toPhraseAnalysis, reusing the existingTokenSnapshottype. This replaces the parallel-array with a paired structure that bundles the token reference and surface text together — exactly the same patternAlignmentEndpoint.tokenalready uses. Staleness detection then works uniformly: compare eachTokenSnapshot.surfaceTextagainst the currentToken.surfaceTextand flipstatusto'stale'on any mismatch.Alternative: derive phrase staleness from member
TokenAnalysis.status. This is fragile because a phrase token may have noTokenAnalysisat all, and is not recommended.2. Project mutability and
linksplacementTwo tightly coupled questions that need a decision before the type can be fully locked down.
Can a project transition from analysis-only to bilateral?
updateProjectMetadatacurrently accepts an optionaltargetProjectId, so the answer today is yes. The question is whether that should be permitted — gaining a target is a significant structural change (it creates a new field category,links, that did not previously exist on the project) and may warrant creating a new project instead. IftargetProjectIdis removed fromupdateProjectMetadata, projects become immutable in type at creation.Should the
targetProjectId ↔ linksinvariant be enforced by the type system?InterlinearProject.linksis only meaningful whentargetProjectIdis set, but the current type does not enforce this co-presence. A discriminated union —AnalysisProject | AlignmentProject— would let the compiler rejectlinkson a project with no target and reject a missinglinkson a project that has one. This removes an entire category of runtime errors at the cost of a slightly more complex type.These two questions are tightly coupled:
createProjectpicks the right variant at call time, and a project is either always anAnalysisProjector always anAlignmentProject. No migration path is needed.targetProjectIdcan be added post-creation, the union needs an explicit conversion step. The storage layer must handle reading anAnalysisProjectand writing back anAlignmentProjectatomically, and any in-memory holders of the old reference must be invalidated. This is feasible on a single-user desktop but adds non-trivial ceremony.The recommended path is immutability at creation, but this is a product decision: if the UX calls for letting users attach a target project after the fact, the type must accommodate it.
This change is