Conversation
Simple spec for blocking global state access in multi-tenant environments. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove dead code: - filepath_checksum_size_limit (never used) - enable_python_native_blobs (never used) - cache (only query_cache is used) - init_function/init_command (database init command) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- All settings can be passed to Connection.from_config() - Only thread_safe is read-only after initialization Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Parameters and defaults - Connection-scoped settings via conn.config - Never accesses global dj.config Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Connection.from_config() creates a Config instance for conn.config - Database connection settings (host, port, user, password, use_tls, backend) become read-only after connection is established - Other settings remain mutable per-connection Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- thread_safe=True: global dj.config becomes read-only - conn.config copies from global config, always mutable - Simpler: global config still readable for defaults Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Simpler API: - Use existing Connection() constructor - conn.config copies from global dj.config - conn.config is always mutable for per-connection settings Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
All runtime operations must use self.connection.config instead of global config: - table.py: safemode for delete/drop - schemas.py: safemode, create_tables - preview.py: display settings - diagram.py: diagram_direction - jobs.py: all jobs settings - autopopulate.py: jobs settings - declare.py: add_job_metadata - connection.py: reconnect, query_cache - hash_registry.py, codecs: stores, download_path Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Explains how connections propagate:
- Connection → Schema → Table classes → Table instances
- Schema falls back to conn() if no connection provided
- Tables inherit connection from schema via _connection class attribute
- In thread_safe mode, Schema("name") fails, Schema("name", connection=conn) works
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- dj.conn().config IS dj.config (same object) - dj.Connection(...).config is COPY of dj.config (independent) - All internal code uses self.connection.config uniformly Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Connection.__init__ always creates self.config = copy(dj.config) - dj.conn() overrides after creation: conn.config = dj.config Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Mixed scenarios: dj.config affects global connection schemas only - Explicit connection schemas have independent config - dj.config.override() affects only schemas using dj.conn() - conn.config.override() affects only that connection's schemas - In thread_safe=True: dj.config.override() raises ThreadSafetyError Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New approach using dj.new() for isolated contexts: - Each context has one config and one connection - ctx.Schema() auto-uses context's connection - ctx.Manual, ctx.Lookup, etc. for table base classes - dj module acts as singleton context (legacy API) - thread_safe=True blocks singleton, allows dj.new() Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- ctx exposes only: config, connection, Schema() - Connection created at context construction via dj.new() - Tables still use dj.Manual, dj.Lookup as base classes - thread_safe=True: dj.config only allows thread_safe access Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- dj.config, dj.conn(), dj.Schema() delegate to singleton instance - Singleton lazily loaded on first access - thread_safe checked at module import, blocks singleton access - inst.config created fresh (not copied from dj.config) - dj.instance() always works, creates isolated instance Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- dj.Manual, dj.Lookup etc. used with @Schema decorator (schema links connection) - inst.Schema(), inst.FreeTable() need connection directly - FreeTable added to Instance class Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- dj.Instance() (uppercase) for consistency with dj.Schema() - Single _singleton_instance created lazily - dj.config -> _singleton.config (via proxy) - dj.conn() -> _singleton.connection - dj.Schema() -> _singleton.Schema() - dj.FreeTable() -> _singleton.FreeTable() - All trigger same singleton creation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Instance class that encapsulates config + connection - Add ThreadSafetyError exception for global state access - Add _ConfigProxy to delegate dj.config to global config - Add _get_singleton_connection for lazy connection creation - Update dj.conn(), dj.Schema(), dj.FreeTable() to use singleton - Connection now stores _config reference for instance isolation - Add DJ_THREAD_SAFE environment variable support - Add comprehensive tests for thread-safe mode When DJ_THREAD_SAFE=true: - dj.config raises ThreadSafetyError - dj.conn() raises ThreadSafetyError - dj.Schema() raises ThreadSafetyError (without explicit connection) - dj.FreeTable() raises ThreadSafetyError (without explicit connection) - dj.Instance() always works for isolated contexts Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- conn(host, user, password) now updates the singleton connection instead of creating a separate connection - Remove irrelevant safemode=False from spec examples - thread_safe is set via DJ_THREAD_SAFE env var or config file Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove unused `from typing import Callable` in connection.py (lint failure) - Update mock_cache fixture: `cache` → `download_path` (KeyError in test_attach) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove unused `_singleton_connection` import in __init__.py (F401) - Remove unused `os` import in test_thread_safe.py (F401) - Remove unused `Callable` import in connection.py (F401) - Fix mock_cache fixture: `cache` → `download_path` for 2.0 settings Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The global codec registry is effectively immutable after import: registration runs under Python's import lock, and the only runtime mutation (_load_entry_points) is idempotent under the GIL. Per-instance isolation is unnecessary since codecs are part of the type system, not connection-scoped state. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Catalog all 8 module-level mutable globals with thread-safety classification: guarded (config, connection), safe by design (codec registry), or low risk (logging, blob flags, import caches). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All internal code now reads configuration from self.connection._config instead of the global config singleton. This ensures thread-safe mode works correctly: each Instance's connection carries its own config, and tables/schemas/jobs access it through the connection. Changes across 9 files: - schemas.py: safemode, create_tables default - table.py: safemode in delete/drop, config passed to declare() - expression.py: loglevel in __repr__ - preview.py: display.* settings via query_expression.connection._config - autopopulate.py: jobs.allow_new_pk_fields, jobs.auto_refresh - jobs.py: jobs.default_priority, stale_timeout, keep_completed - declare.py: jobs.add_job_metadata (config param threaded through) - diagram.py: display.diagram_direction (connection stored on instance) - staged_insert.py: get_store_spec() Removed unused `from .settings import config` imports from 7 modules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- instance._global_config now reuses settings.config instead of creating a duplicate Config object. This ensures dj.config["safemode"] = False actually affects self.connection._config["safemode"] reads. - schemas.py now uses _get_singleton_connection() from instance.py instead of the old conn() from connection.py, eliminating the duplicate singleton connection holder. - dj.conn() now only creates a new connection when the singleton doesn't exist or reset=True (not on every call with credentials). - test_uppercase_schema: use prompt=False for drop() calls. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ode spec Adds Architecture section covering the object graph, config flow for both singleton and Instance paths, and a table of all connection-scoped config reads across 9 modules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eliminate the last global config reads in the runtime path by threading connection-scoped config through the codec encode/decode and hash_registry layers. Encode path: table.py adds _config to the context dict, codecs extract it from key and pass to hash_registry/storage helpers. Decode path: expression.py passes connection to decode_attribute(), which builds a decode key with _config for codec.decode() calls. GC path: scan()/collect() extract config from schemas[0].connection and pass to list_stored_hashes/delete_path/delete_schema_path. All functions accept config=None with lazy fallback to settings.config for backward compatibility. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename schemas.Schema → schemas._Schema (internal) and define dj.Schema as a class that inherits from _Schema with a thread-safety check in __init__. This eliminates the confusing function-vs-class duality where dj.Schema was a function wrapper and schemas.Schema was the class. Now dj.Schema is a real class: isinstance, hasattr, and subclass checks all work naturally. The test for rebuild_lineage can use dj.Schema directly instead of importing from datajoint.schemas. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Keep master's atomic SQL UPDATE for job reservation (prevents race conditions) while threading connection-scoped config through _get_job_version(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Connection.__init__ now accepts `backend` and `config_override` kwargs instead of reading from the module-level global config. This ensures Instance creates connections using its own config, not the global one. Also removes set_thread_safe() — thread-safe mode is an infrastructure decision set via DJ_THREAD_SAFE env var, not a runtime toggle. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mweitzel
left a comment
There was a problem hiding this comment.
Looks really good. A few issues:
- hardcoded mysql always used in
schema.pyin list_schemas - warnings for missing global config even when DJ_THREAD_SAFE is true
Regarding this code section: https://github.com/datajoint/datajoint-python/blob/feature/thread-safe-mode/src/datajoint/dependencies.py#L171-L252
Since you're forking on "adapter" ..instead of this being a globally aware "only 2 adapters, mysql and postgres", could this call a method on a local dependency? Like a backend aware function. Similar to the adapter.list_tables_sql function, but instead a adapter.load_table_keys/schema/etc or the like.
- list_schemas() uses adapter.list_schemas_sql() - make_classes() uses adapter.list_tables_sql() instead of SHOW TABLES - Schema.exists uses new adapter.schema_exists_sql() method - Added schema_exists_sql() to base, MySQL, and PostgreSQL adapters Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instance now accepts backend="mysql"|"postgresql" to explicitly set the database backend, with automatic port default derivation (3306 vs 5432). Join, restriction, and union operators now validate that both operands use the same connection, raising DataJointError with a clear message when expressions from different Instances are combined. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract the backend-specific primary key and foreign key SQL from dependencies.py into load_primary_keys_sql() and load_foreign_keys_sql() adapter methods, eliminating the if/else backend fork. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@mweitzel Thanks for the thorough review. Here's where we stand: Already fixed (commit
Fixed in
All unit tests pass (243/243). Let us know if there's anything else. |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Some string manipulation ins datajoint-python/src/datajoint/schemas.py Line 611 in 34f744e |
Replace ad-hoc backtick/quote stripping with adapter.split_full_table_name() so that full table name parsing uses the correct quoting convention for each backend (backticks for MySQL, double quotes for PostgreSQL). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@mweitzel Good catch. Fixed in Updated call sites:
All 243 unit tests pass. |
These methods were marked as "not officially supported", had no tests, no callers, and contained MySQL-specific string manipulation incompatible with PostgreSQL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
dj.Instance(host, user, password)for isolated config + connection pairsDJ_THREAD_SAFE=true) that disables global singletons and forces explicitdj.Instance()usagedj.Schemafrom a function wrapper into a proper class subclassing_Schema, soisinstance,hasattr, etc. work naturallyadapter.split_full_table_name()— inverse ofquote_identifier()— for backend-portable table name parsingload_primary_keys_sql,load_foreign_keys_sql) into adapter methodsschema.code()/save()methods (untested, MySQL-specific, no callers)Usage
Use cases
Design spec
See
docs/design/thread-safe-mode.mdfor the full specification including global state audit.Test plan
Instance,_ConfigProxy, thread-safe guards (tests/unit/test_thread_safe.py)🤖 Generated with Claude Code