Refine Entity Integrity: formal definition + two required components by dimitri-yatsenko · Pull Request #187 · datajoint/datajoint-docs

dimitri-yatsenko · 2026-06-13T18:16:08Z

Context

The Entity Integrity concept page describes the 1:1 correspondence between
real-world entities and database records, but does not state it with formal
precision. In particular, it understates that entity integrity requires two
components working together (a real-world identification process plus a
database uniqueness constraint) and leaves the bidirectional nature of the
correspondence implicit.

This PR opens the page with a precise formal definition, names the two
required components explicitly, and reorganizes around that framing.
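
For reviewers who want the two-component point in concrete form, here is a
minimal sketch. It uses Python's standard-library `sqlite3` rather than
DataJoint so that it runs without a database server. The `mouse` table, the
ear-tag numbers, and the `try_insert` helper are hypothetical and are not taken
from the docs page.

```python
import sqlite3

# In-memory database; the primary key is the database-side uniqueness constraint.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mouse (mouse_id INTEGER PRIMARY KEY, sex TEXT NOT NULL)")


def try_insert(conn, mouse_id, sex):
    """Return True if the row was accepted, False if the primary key rejected it."""
    try:
        conn.execute("INSERT INTO mouse (mouse_id, sex) VALUES (?, ?)", (mouse_id, sex))
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical scenario: one animal, ear-tagged 101 by the lab.
first = try_insert(conn, 101, "F")

# Same animal, same identifier: the constraint catches the duplicate.
same_id_again = try_insert(conn, 101, "F")

# Same animal, but misread as tag 102: the database has no way to know,
# so the row is accepted and the 1:1 correspondence is silently broken.
misidentified = try_insert(conn, 102, "F")

row_count = conn.execute("SELECT COUNT(*) FROM mouse").fetchone()[0]
```

The second insert fails and the third succeeds, which is the asymmetry the page
now spells out: the database enforces uniqueness of identifiers, while only the
external identification process can guarantee that each identifier refers to
exactly one entity.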

The source of the formal definition is
datajoint/datajoint-book,
where the 1:1 bidirectional framing and the external-process requirement are
worked out at textbook length (book/20-concepts/04-integrity.md and
book/30-design/018-primary-key.md). This PR brings that precision into the
official docs in platform-doc voice.

What changes

- New opening: "Entity integrity is the guarantee of a one-to-one
  correspondence between real-world entities and their representations in
  the database." Followed by the two bidirectional bullets and an explicit
  statement that a primary-key constraint alone is not sufficient.
- New "Two required components" section (table format): external
  identification process + database uniqueness constraint. Names what each
  does and where each lives.
- Elevated "Explicit keys, no auto-increment" subsection under
  primary-key requirements, with the four reasons (identification before
  insertion, duplicate detection, composite keys, reproducibility). Was
  buried before; now stated as a direct consequence of the formal
  definition.
- New "Partial entity integrity" section for applications that need only
  one direction of the 1:1 mapping (record→entity uniqueness only, or
  entity→record completeness only).
- New "When no natural key exists" section covering the multi-step
  token-issuance pattern (generate, deliver, require, trust the external
  process).
- New "What the database can and cannot do" table as the closing
  conceptual statement.
- Tightened and reorganized the existing Schema Dimensions content
  (preserved with light edits: same diagrams, same examples).
- Refreshed See-also links.
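
The "duplicate detection" reason behind the explicit-keys rule can be
illustrated with a small sketch. It again uses standard-library `sqlite3` as a
stand-in for a real backend, and the table names `subject_auto` and
`subject_explicit` and the identifier `M042` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Auto-increment key: the database invents the identifier at insert time.
conn.execute(
    "CREATE TABLE subject_auto (id INTEGER PRIMARY KEY AUTOINCREMENT, label TEXT)"
)
# Explicit key: the identifier must be known before insertion.
conn.execute(
    "CREATE TABLE subject_explicit (subject_id TEXT PRIMARY KEY, label TEXT)"
)

# Entering the same subject twice with an auto-increment key creates two rows.
for _ in range(2):
    conn.execute("INSERT INTO subject_auto (label) VALUES (?)", ("mouse M042",))
auto_rows = conn.execute("SELECT COUNT(*) FROM subject_auto").fetchone()[0]

# Entering the same subject twice with an explicit key is rejected.
conn.execute("INSERT INTO subject_explicit VALUES (?, ?)", ("M042", "mouse M042"))
try:
    conn.execute("INSERT INTO subject_explicit VALUES (?, ?)", ("M042", "mouse M042"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
explicit_rows = conn.execute("SELECT COUNT(*) FROM subject_explicit").fetchone()[0]
```

With the auto-increment table both inserts succeed and the duplicate goes
unnoticed; with the explicit key the second insert raises an integrity error.
This only shows the database half of the argument and assumes the identifier
was assigned correctly outside the database.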

Net change: +240 / -161 lines on one file.

Voice and audience

The original page was already in the docs section; this revision keeps that
voice. Where the book uses textbook devices (admonitions, learning
objectives, dropdowns, tab-set SQL alongside DataJoint), this revision
prefers tight prose suitable for the dj-core reference docs.

…nents

Open the page with a precise formal definition: entity integrity is the
guarantee of a bidirectional one-to-one correspondence between real-world
entities and their database representations.

Make explicit that this requires TWO components, not one:

- A real-world identification process (external): establishes the reliable
  association between physical entities and their identifiers.
- A database uniqueness constraint (internal): enforces, via the primary key,
  that no two records share the same identifier.

Neither component alone is sufficient. The database can enforce uniqueness but
cannot create it; the real-world process is what links a physical entity to its
identifier in the first place.

Additional structural improvements:

- Elevate the "no auto-increment / explicit keys" rule from a buried paragraph
  to its own subsection under primary-key requirements, with the four reasons
  (identification before insertion, duplicate detection, composite keys,
  reproducibility).
- Add a "Partial entity integrity" section for applications that need only one
  direction of the 1:1 mapping (record-to-entity only, or entity-to-record
  only).
- Add a "When no natural key exists" section covering the multi-step
  token-issuance pattern.
- Add an explicit "What the database can and cannot do" table at the end of the
  conceptual material.
- Tighten and reorganize the existing Schema Dimensions content (preserved with
  light edits).
- Refresh See-also links to point to the related concept pages.

Source for the formal definition: datajoint/datajoint-book, where the 1:1
bidirectional framing and the external-process requirement are worked out at
textbook length (book/20-concepts/04-integrity.md and
book/30-design/018-primary-key.md). This PR brings that precision into the
official docs in platform-doc voice.
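
As a rough illustration of the token-issuance pattern for entities with no
natural key, the sketch below gives one possible reading of the steps
"generate, deliver, require, trust the external process". The function names
`generate_token` and `submit`, the in-memory `issued` and `records` containers,
and the one-record-per-token rule are all assumptions made for illustration,
not the page's implementation.

```python
import secrets

# Hypothetical stand-ins for database tables.
issued = set()   # tokens that have been generated and handed out
records = {}     # token -> payload; the token plays the role of the primary key


def generate_token():
    """Generate: create the identifier before any data about the entity exist."""
    token = secrets.token_hex(8)
    issued.add(token)
    # Deliver: the caller hands this token to the entity (e.g. a participant).
    return token


def submit(token, payload):
    """Require: accept data only when accompanied by a previously issued token."""
    if token not in issued:
        raise KeyError(f"unknown token: {token!r}")
    if token in records:
        raise ValueError(f"token already has a record: {token!r}")
    records[token] = payload


token = generate_token()
submit(token, {"age": 34})
```

The final step, trust, has no code: nothing in the database can verify that the
token stayed attached to the right real-world entity, which is the external
half of entity integrity.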

dimitri-yatsenko requested review from MilagrosMarin, esutlie and ttngu207 June 13, 2026 18:16

dimitri-yatsenko wants to merge 1 commit into main from refine/entity-integrity-formal-definition
