Refine Entity Integrity: formal definition + two required components#187
Open
dimitri-yatsenko wants to merge 1 commit into
…nents

Open the page with a precise formal definition: entity integrity is the guarantee of a bidirectional one-to-one correspondence between real-world entities and their database representations.

Make explicit that this requires TWO components, not one:

- A real-world identification process (external): establishes the reliable association between physical entities and their identifiers.
- A database uniqueness constraint (internal): enforces, via the primary key, that no two records share the same identifier.

Neither component alone is sufficient. The database can enforce uniqueness but cannot create it; the real-world process is what links a physical entity to its identifier in the first place.

Additional structural improvements:

- Elevate the "no auto-increment / explicit keys" rule from a buried paragraph to its own subsection under primary-key requirements, with the four reasons (identification before insertion, duplicate detection, composite keys, reproducibility).
- Add a "Partial entity integrity" section for applications that need only one direction of the 1:1 mapping (record-to-entity only, or entity-to-record only).
- Add a "When no natural key exists" section covering the multi-step token-issuance pattern.
- Add an explicit "What the database can and cannot do" table at the end of the conceptual material.
- Tighten and reorganize the existing Schema Dimensions content (preserved with light edits).
- Refresh See-also links to point to the related concept pages.

Source for the formal definition: datajoint/datajoint-book, where the 1:1 bidirectional framing and the external-process requirement are worked out at textbook length (book/20-concepts/04-integrity.md and book/30-design/018-primary-key.md). This PR brings that precision into the official docs in platform-doc voice.
Context
The Entity Integrity concept page describes the 1:1 correspondence between
real-world entities and database records, but doesn't make the formal
precision explicit. In particular, it understates that entity integrity
requires two components working together — a real-world identification
process plus a database uniqueness constraint — and treats the bidirectional
nature of the correspondence as implicit.
This PR opens the page with a precise formal definition, names the two
required components explicitly, and reorganizes around that framing.
The source of the formal definition is
datajoint/datajoint-book,
where the 1:1 bidirectional framing and the external-process requirement are
worked out at textbook length (book/20-concepts/04-integrity.md and
book/30-design/018-primary-key.md). This PR brings that precision into the
official docs in platform-doc voice.
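To make the two-component claim concrete, here is a minimal sketch using Python's standard-library `sqlite3` module rather than DataJoint itself. The `mouse` table, its columns, and the sample values are hypothetical and chosen only for illustration. It shows what the database side can do (reject a reused identifier) and what it cannot do (notice that the same animal was registered again under a new identifier).

```python
import sqlite3

# Hypothetical table: the identifier is supplied by the caller, not generated.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mouse ("
    " mouse_id TEXT NOT NULL PRIMARY KEY,"
    " date_of_birth TEXT NOT NULL)"
)
conn.execute("INSERT INTO mouse VALUES ('M001', '2024-01-15')")

# Internal component: the primary key rejects a second record
# that reuses the same identifier.
try:
    conn.execute("INSERT INTO mouse VALUES ('M001', '2024-02-01')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Gap left for the external component: if the same animal is mistakenly
# given a second identifier, the database has no way to detect it.
conn.execute("INSERT INTO mouse VALUES ('M002', '2024-01-15')")
row_count = conn.execute("SELECT COUNT(*) FROM mouse").fetchone()[0]
```

In this sketch the second insert fails with an integrity error, while the third succeeds even though it may describe the same physical animal. Preventing that case is the job of the real-world identification process, not of the constraint.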
What changes
- New opening: "Entity integrity is the guarantee of a one-to-one
  correspondence between real-world entities and their representations in
  the database." Followed by the two bidirectional bullets and an explicit
  statement that a primary-key constraint alone is not sufficient.
- New "Two required components" section (table format): external
  identification process + database uniqueness constraint. Names what each
  does and where each lives.
- Elevated "Explicit keys, no auto-increment" subsection under
  primary-key requirements, with the four reasons (identification before
  insertion, duplicate detection, composite keys, reproducibility). Was
  buried before; now stated as a direct consequence of the formal
  definition.
- New "Partial entity integrity" section for applications that need only
  one direction of the 1:1 mapping (record→entity uniqueness only, or
  entity→record completeness only).
- New "When no natural key exists" section covering the multi-step
  token-issuance pattern (generate, deliver, require, trust the external
  process).
- New "What the database can and cannot do" table as the closing
  conceptual statement.
- Tightened and reorganized the existing Schema Dimensions content
  (preserved with light edits: same diagrams, same examples).
- Refreshed See-also links.
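The token-issuance pattern named in the list above (generate, deliver, require, trust the external process) could look roughly like the following. This is one possible reading of those four steps, not the page's actual example: it uses standard-library `sqlite3` and `uuid` instead of DataJoint, and the `sample` table, `issue_token`, and `register_sample` are hypothetical names introduced here.

```python
import sqlite3
import uuid

# Hypothetical table keyed by an explicitly supplied token (no auto-increment).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample ("
    " sample_token TEXT NOT NULL PRIMARY KEY,"
    " description TEXT NOT NULL)"
)


def issue_token() -> str:
    """Generate: create an identifier before any record exists."""
    return uuid.uuid4().hex


def register_sample(token: str, description: str) -> None:
    """Require: the caller must supply a token; none is created at insert time."""
    if not token:
        raise ValueError("a previously issued token is required")
    conn.execute("INSERT INTO sample VALUES (?, ?)", (token, description))


# Deliver: the token is handed over to be attached to the physical entity,
# for example printed on a label (assumed here; not shown in code).
token = issue_token()
register_sample(token, "tissue block A")

# Re-entering the same entity with its token is caught as a duplicate.
try:
    register_sample(token, "tissue block A (re-entered)")
    second_insert_rejected = False
except sqlite3.IntegrityError:
    second_insert_rejected = True

# Trust: the database cannot verify that the token was attached to the
# right physical entity; that remains the external process's job.
stored = conn.execute("SELECT COUNT(*) FROM sample").fetchone()[0]
```

Because the token exists before insertion, a repeated entry surfaces as a primary-key violation. With an auto-increment key, the same repeated entry would be stored as a second record with a fresh number.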
Net change: +240 / -161 lines on one file.
Voice and audience
The original page was already in the docs section; this revision keeps that
voice. Where the book uses textbook devices (admonitions, learning
objectives, dropdowns, tab-set SQL alongside DataJoint), this revision
prefers tight prose suitable for the dj-core reference docs.