Only load what is actually going to be used by build v3 by shangyian · Pull Request #1844 · DataJunction/dj

shangyian · 2026-03-08T20:03:53Z

Summary

Test Plan

PR has an associated issue: #
make check passes
make test shows 100% unit test coverage

Deployment Plan

* If a role path is provided, should stick to the role path * Enable role path dimensions in filters Manage filters and dimensions resolution correctly Fix tests * Fix tests * Add test for storing cubes that have dimensions with role paths * Fix

* Fix some inconsistencies with metric display name for cubes when deployed make cube deployment part of singular commit * Remove extraneous CreateCube input map * Fix * Fix cube deployment validation * Refactor batch load of metrics and dimensions as separate functions * Make sure that we can set tags for cubes * Fix tests * Fix and cleanup tests * Fix tests

* Add support for editing reference links and dimension links * Make edit icon show up by default * Limit the max size of the Dimension Links column * Fix adding, editing or removing complex dim link * Fix tests * lint * Fix tests * Add ability to refresh page after removing dim link * Add tests * lint

* Add support for dimensional hierarchies * Add history tracking for hierarchies and audit info like created by and owner * Move database functions to classmethods * Add tests for hierarchies * Add hierarchies database migration * Remove the hard-coded level order inputs when creating hierarchies since they can be derived from the list order * Remove validate endpoint for hierarchies

…ataJunction#1504) * Allow configurable embedded query client * more coverage * Catch generic exception incase snowflake connector not installed * Add test for snowflake types to DJ types mappings * Use global client with examples * Update datajunction-server/datajunction_server/config.py Co-authored-by: Olek Górajek <agorajek@users.noreply.github.com> * Update datajunction-server/datajunction_server/config.py Co-authored-by: Olek Górajek <agorajek@users.noreply.github.com> * client_with_roads --------- Co-authored-by: Olek Górajek <agorajek@users.noreply.github.com>

* Add the ability to filter nodes by mode (draft, published) via GraphQL * Add filtering option to UI * Add column for displaying mode * Add tests for GraphQL mode filtering * Fix tests for UI namespace loading * Add NodeModeSelect

…ataJunction#1831) * When getting downstream nodes, be targeted in how much eagerloading we do. We only need cube elements, node revision, node, not the full set of node output options * Don't do N+1 for cubes on dimension link removal * Switch to layered BFS with raw SQL for get downstreams * Remove old downstream nodes BFS implementation * Fix tests, ordering does matter for status propagation - needs to be in topological order * Remove dead code

* Add BigQuery query client and dialect support
  Implements a direct BigQuery integration following the same pattern as the existing Snowflake client. Adds `BigQueryClient` for table introspection via INFORMATION_SCHEMA, registers `bigquery` as a supported dialect with sqlglot transpilation, and exposes it as an optional install extra (`datajunction-server[bigquery]`).
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Fix pre-commit: ruff format, trailing commas, GraphQL schema
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Fix BigQuery test failures: QueryJobConfig mock and BIGNUMERIC precision
  - Import QueryJobConfig and ScalarQueryParameter at module level so tests can patch them (accessing via bigquery=None failed)
  - Fix BIGNUMERIC/BIGDECIMAL to use DecimalType(38, 38) since DJ's DecimalType caps max_precision at 38
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add tests for _get_client and test_connection to reach 100% coverage
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add test for BigQuery ImportError path in utils to reach 100% coverage
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add engine URI project resolution to BigQueryClient (#1)
  Mirrors SnowflakeClient's _get_database_from_engine approach: parses the GCP project from the engine URI netloc (bigquery://my-gcp-project) so different DJ catalogs can point to different GCP projects. Also adds BigQuery env config example to .env and updates tests.
  Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Address review comments on BigQueryClient (#2)
  * Add engine URI project resolution to BigQueryClient
    Mirrors SnowflakeClient's _get_database_from_engine approach: parses the GCP project from the engine URI netloc (bigquery://my-gcp-project) so different DJ catalogs can point to different GCP projects. Also adds BigQuery env config example to .env and updates tests.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  * Address review comments on BigQueryClient
    - Add sqlglot to bigquery extra for dialect transpilation support
    - Add BIGQUERY_AVAILABLE import coverage tests (True/False paths)
    - Add BigQuery config documentation with examples to QueryClientConfig
    - Remove redundant 0-based index comment in get_columns_for_table
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  ---------
  Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add BigQuery project resolution order from engine URI
* Add BigQuery query execution support to datajunction-query
  Add BigQuery as a supported engine type in the query service, following the existing Snowflake pattern. Supports project config via extra_params, credentials via config or GOOGLE_APPLICATION_CREDENTIALS env var, and Application Default Credentials as fallback.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add comprehensive BigQuery integration tests
  Cover credentials path, location, env var fallback, error handling, multi-row and empty results in datajunction-query. Add client project override, location, factory with all options, unsupported type, engine URI project override, and credentials precedence tests in datajunction-server.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix mypy type error and ruff formatting in BigQuery query execution
  Wrap BigQuery rows in iter() to match Stream (Iterator) type. Apply ruff format to test file.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix test_get_columns_for_table_with_engine_project_override mock setup
  Mock QueryJobConfig and ScalarQueryParameter which are None in CI (google-cloud-bigquery not installed), matching the pattern used by other get_columns_for_table tests.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add tests for uncovered branches in _get_project_from_engine
  Cover empty path segment fallthrough (131->134) and query params without project key (136->145) to reach 100% branch coverage.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Strip catalog prefix from BigQuery SQL and fix ADC credential handling
  DJ generates SQL with catalog-prefixed table names (e.g. my_catalog.dataset.table) but BigQuery interprets three-part names as project.dataset.table. Since the BQ client already has the project configured, strip the catalog prefix so BigQuery receives dataset.table references. Also remove the GOOGLE_APPLICATION_CREDENTIALS env var fallback from credentials_path; let bigquery.Client() handle ADC natively.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Remove unused variable flagged by ruff (F841)
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
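
For readers unfamiliar with the engine URI resolution mentioned in the BigQuery commits, here is a minimal sketch of pulling a GCP project out of a `bigquery://` URI. It is not the code from this PR: the function name `project_from_engine_uri` is hypothetical, and the fallback order (netloc, then first path segment, then a `project` query parameter) is an assumption inferred from the test descriptions above.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def project_from_engine_uri(uri: str) -> Optional[str]:
    """Return the GCP project encoded in an engine URI, or None.

    Assumed resolution order (illustrative only):
    1. netloc, e.g. bigquery://my-gcp-project
    2. first non-empty path segment, e.g. bigquery:///my-gcp-project/dataset
    3. a `project` query parameter, e.g. bigquery://?project=my-gcp-project
    """
    parts = urlsplit(uri)
    if parts.netloc:
        return parts.netloc
    # Empty path segments fall through to the query-parameter check.
    segments = [segment for segment in parts.path.split("/") if segment]
    if segments:
        return segments[0]
    values = parse_qs(parts.query).get("project")
    return values[0] if values else None
```

With a helper like this, each catalog's engine URI can name its own project, and a URI with no project information returns `None` so the caller can fall back to a client-level default.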

* Add support for saving cube filters and applying them during SQL generation * Add support for generating SQL on cubes with filters * Add instrumentation for /sql/measures/v3 and extra test * Add test coverage * Add support for querying cube filters via GraphQL * Fix issue with saving cube filters via YAML deployment * Enable displaying in the UI of cube filters * Fix tests * Fix

netlify · 2026-03-08T20:03:57Z

✅ Deploy Preview for thriving-cassata-78ae72 canceled.

Name	Link
🔨 Latest commit	`78aeb7f`
🔍 Latest deploy log	https://app.netlify.com/projects/thriving-cassata-78ae72/deploys/69add62cd061d500080249ea

agorajek and others added 30 commits October 29, 2025 09:34

Bumping DJ to version 0.0.12 (DataJunction#1546)

c63f681

Co-authored-by: GitHub Actions Bot <>

Switch to using CodeMirror editor for custom metadata editing (DataJu…

368fa04

…nction#1550) * Switch to using the same codemirror editor for custom metadata editing * Fix

MAX_BY and MIN_BY functions cannot be decomposed (DataJunction#1549)

bc06389

* MAX_BY and MIN_BY functions cannot be decomposed, so metric decomposition should reflect that * Fix tests * Remove unused part of query ast update after max_by decomposition

Bumping DJ to version 0.0.13 (DataJunction#1551)

15abedb

Co-authored-by: GitHub Actions Bot <>

python client: rename output columns to semantic names (DataJunction#…

da0481e

…1552) * When using the DJ client to retrieve metrics data, rename all output columns from their physical names to their semantic names * Fix * Fix tests

Bumping DJ to version 0.0.14 (DataJunction#1557)

1266fdb

Co-authored-by: GitHub Actions Bot <>

YAML Deployment Docs (DataJunction#1560)

a349a5d

Update YAML docs given new deployment setup

Bumping DJ to version 0.0.15 (DataJunction#1565)

a7be14d

Co-authored-by: GitHub Actions Bot <>

Improve namespace creation UX with inline folder-like interface (Data…

c50a8a1

…Junction#1568) * Improve namespace creation UX with inline folder-like interface * Fix tests * Fix lint

Bug saving invalid metric (DataJunction#1571)

93b4779

* Fix bug that allows saving invalid metrics * Add test that confirms metrics bug fix * make lint * Remove unused variable * only validate columns in metrics without qualifying namespace (those are from dimension links) * mypy

Bumping DJ to version 0.0.16 (DataJunction#1566)

3f00434

Co-authored-by: GitHub Actions Bot <>

Support for Groups as Principal (DataJunction#1574)

69f3346

* Add principal kind for GROUP * Add group members table * Add pluggable group membership service * Add API endpoints to support groups * Add groups-related models * Fix database migration

Bumping DJ to version 0.0.17 (DataJunction#1578)

d68ed19

Co-authored-by: GitHub Actions Bot <>

RBAC Infra: Roles, Scopes & Assignments (DataJunction#1576)

7e524fc

* Add database models for roles, role scopes, and role assignments * Add roles APIs * Add API endpoints for roles, role scopes, and role assignments

Prevent linking to non-dimension nodes (DataJunction#1580)

00beece

Update browserslist with npx update-browserslist-db@latest (DataJunct…

a59c16d

…ion#1583)

Bumping DJ to version 0.0.18 (DataJunction#1585)

620bc89

Co-authored-by: GitHub Actions Bot <>

Make find nodes with common dimensions more efficient by switching to…

7a7e1ae

… a recursive CTE to traverse the dimensions graph (DataJunction#1584)

Settings + Notifications in Navigation Bar (DataJunction#1587)

12dac71

* Update header to include user, notifications, and settings Add notifications page + grouping * Fix * Add test coverage

Groups Population (DataJunction#1590)

b8a7bc0

* When we create a user, we should also create its groups, if any * Add test for syncing groups

Add support for creating and managing service accounts in UI (DataJun…

8fc6c84

…ction#1591) * Add support for creating and managing service accounts in UI * Refactor service account section into separate components * Fix test * Fix test

Bumping DJ to version 0.0.19 (DataJunction#1592)

5e96e5d

Co-authored-by: GitHub Actions Bot <>

Remove unnecessary user upsert and group sync on every write request.…

167b6c2

… Users already exist in DB by the time protected endpoints are called (created via signup or OAuth). Internal deployments should handle user provisioning in auth middleware instead. (DataJunction#1595)

Bumping DJ to version 0.0.20 (DataJunction#1596)

e1ab44f

Co-authored-by: GitHub Actions Bot <>

shangyian and others added 28 commits March 1, 2026 13:38

Bump version to v0.0.77 (DataJunction#1813)

1e734c2

Co-authored-by: GitHub Actions Bot <actions@github.com>

Fix UI validation errors and form crashes (DataJunction#1816)

9934e6d

* Fix UI validation errors and form crashes * Fix lint * Fix * Fix tests

Bump version to v0.0.78 (DataJunction#1814)

cfd4f03

Co-authored-by: GitHub Actions Bot <actions@github.com>

Fix issue with editing transforms (DataJunction#1818)

fa99afe

Fix transform edit (DataJunction#1819)

1546069

* Fix issue with editing transforms * Fix

Bump version to v0.0.79 (DataJunction#1820)

5921732

Co-authored-by: GitHub Actions Bot <actions@github.com>

Server validation and deps (DataJunction#1817)

55a122f

* Fix server validation and dependency tracking * Fix some tests * Fix issues with restore validation * Fix tests * Remove logging * Fix tests * Fix

Bump version to v0.0.80 (DataJunction#1821)

8339b89

Co-authored-by: GitHub Actions Bot <actions@github.com>

Query performance optimizations: reduce eagerloading queries in /sql (

f186d41

DataJunction#1824) * Various query optimizations by reducing eager loading * Add additional test coverage

Bump version to v0.0.81 (DataJunction#1825)

cd21fd7

Co-authored-by: GitHub Actions Bot <actions@github.com>

Bump version to v0.0.82 (DataJunction#1826)

2e9d04f

Co-authored-by: GitHub Actions Bot <actions@github.com>

Fixes a bug when deploying a cube with partitioned columns due to aut…

0e7f94b

…oflush (DataJunction#1828) * Fixes a bug when deploying a cube with partitioned columns due to autoflush on orphaned columns * Add test

Bump version to v0.0.83 (DataJunction#1829)

31f20af

Co-authored-by: GitHub Actions Bot <actions@github.com>

Perf: Replace recursive CTE in get_dimensions_dag with layered BFS (D…

7764a49

…ataJunction#1830) * improve dimensions graph discovery speed via bfs * Fix * Add dag tests

Change get upstreams to use layered BFS with raw SQL (DataJunction#1832)

3c28047

* Change get_upstream_nodes to use layered BFS with raw SQL approach * Fix dag join * Fix tests * Fix
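
As an illustration of the "layered BFS with raw SQL" idea, the sketch below issues one SQL query per graph layer rather than one per node or a single recursive CTE. It is a generic example, not the DJ implementation: the `edges(parent, child)` table, the `upstream_layers` function, and the use of an in-memory SQLite database are all assumptions made for the demo.

```python
import sqlite3
from typing import List


def build_demo_graph() -> sqlite3.Connection:
    """Create a tiny hypothetical DAG: source and dim feed transform, which feeds metric."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (parent TEXT, child TEXT)")
    conn.executemany(
        "INSERT INTO edges VALUES (?, ?)",
        [("source", "transform"), ("dim", "transform"), ("transform", "metric")],
    )
    return conn


def upstream_layers(conn: sqlite3.Connection, start: str) -> List[List[str]]:
    """Return upstream nodes of `start`, grouped by BFS layer (nearest first)."""
    seen = {start}
    frontier = {start}
    layers: List[List[str]] = []
    while frontier:
        # One round trip fetches the parents of every node in the current layer.
        placeholders = ",".join("?" for _ in frontier)
        rows = conn.execute(
            f"SELECT DISTINCT parent FROM edges WHERE child IN ({placeholders})",
            sorted(frontier),
        ).fetchall()
        next_frontier = {row[0] for row in rows} - seen
        if not next_frontier:
            break
        layers.append(sorted(next_frontier))
        seen |= next_frontier
        frontier = next_frontier
    return layers
```

The number of queries is bounded by the depth of the graph, and the `seen` set keeps shared ancestors from being expanded twice.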

Bump version to v0.0.84 (DataJunction#1833)

8fefedf

Co-authored-by: GitHub Actions Bot <actions@github.com>

Cap query refresh with semaphore (DataJunction#1834)

249a045

* Don't refresh ahead for query caching, since we are already only retrieving versioned queries * Switch to a semaphore cap to control concurrent sql building refreshes from background tasks
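
A semaphore cap of the kind described here can be sketched with `asyncio.Semaphore`. This is a generic illustration under stated assumptions, not the code from DataJunction#1834: `MAX_CONCURRENT_REFRESHES`, `refresh_with_cap`, and `fake_build_sql` are hypothetical names, and the `stats` dictionary exists only so the demo can observe peak concurrency.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

MAX_CONCURRENT_REFRESHES = 2  # hypothetical cap for the demo


async def refresh_with_cap(
    semaphore: asyncio.Semaphore,
    build_sql: Callable[[int], Awaitable[str]],
    key: int,
    stats: Dict[str, int],
) -> str:
    """Run one refresh, waiting first if the cap is already reached."""
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        try:
            return await build_sql(key)
        finally:
            stats["active"] -= 1


async def fake_build_sql(key: int) -> str:
    """Stand-in for an expensive SQL build."""
    await asyncio.sleep(0.01)
    return f"query-{key}"


async def main() -> Tuple[List[str], int]:
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_REFRESHES)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(refresh_with_cap(semaphore, fake_build_sql, key, stats) for key in range(6))
    )
    return list(results), stats["peak"]
```

Running `asyncio.run(main())` schedules six refreshes but lets at most two build SQL at any moment; the rest wait at the `async with` line.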

Add better metrics instrumentation (DataJunction#1836)

0268811

Add instrumentation for various basic metrics (DataJunction#1837)

f9a8d4f

Build v3 perf (DataJunction#1835)

a074a30

* Fix duplicate find_matching_cube calls

Bump version to v0.0.85 (DataJunction#1838)

599fa4f

Co-authored-by: GitHub Actions Bot <actions@github.com>

Instrument additional metrics, including dj.sql.parsing_ms, dj.node_v…

056e09a

…alidation.ms, dj.cube_matching.ms, dj.graphql.query_ms, dj.graphql.errors, dj.db.query_count. Also extended @timed to support sync functions. (DataJunction#1840)
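
One common way to make a timing decorator work for both sync and async functions is to branch on `inspect.iscoroutinefunction`. The sketch below shows that pattern only; it is not DJ's `@timed`. The in-memory `RECORDED` list stands in for a real metrics client, and the metric names `demo.sync_ms` and `demo.async_ms` are made up.

```python
import asyncio
import functools
import inspect
import time
from typing import Any, Callable, List, Tuple

RECORDED: List[Tuple[str, float]] = []  # stand-in for a metrics backend


def timed(metric_name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Record wall-clock duration in milliseconds for sync or async callables."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
                start = time.perf_counter()
                try:
                    return await func(*args, **kwargs)
                finally:
                    RECORDED.append((metric_name, (time.perf_counter() - start) * 1000))

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                RECORDED.append((metric_name, (time.perf_counter() - start) * 1000))

        return sync_wrapper

    return decorator


@timed("demo.sync_ms")
def add(a: int, b: int) -> int:
    return a + b


@timed("demo.async_ms")
async def add_async(a: int, b: int) -> int:
    await asyncio.sleep(0)
    return a + b
```

The `finally` blocks mean a duration is recorded even when the wrapped function raises, which is usually what a latency metric should capture.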

Fix issue where a detached relationship on RoleAssignment (DataJuncti…

80a1c22

…on#1841) * Fix issue where a detached relationship on RoleAssignment (due to separate sessions being used to load the auth context and bearer session) was causing errors * Fix inconsistent bug with copy nodes

Bump version to v0.0.86 (DataJunction#1843)

7ea91dd

Co-authored-by: GitHub Actions Bot <actions@github.com>

Only load what is actually going to be used by build v3

78aeb7f

shangyian force-pushed the main branch from 52c88f0 to 5e6a05f · April 1, 2026 09:07

shangyian wants to merge 3470 commits into DataJunction:main from shangyian:build-v3-perf-eagerload

5 participants
