Only load what is actually going to be used by build v3 #1844

Draft
shangyian wants to merge 3470 commits into DataJunction:main from shangyian:build-v3-perf-eagerload

Conversation

@shangyian
Collaborator

Summary

Test Plan

  • PR has an associated issue: #
  • make check passes
  • make test shows 100% unit test coverage

Deployment Plan

agorajek and others added 30 commits October 29, 2025 09:34
Co-authored-by: GitHub Actions Bot <>
…nction#1550)

* Switch to using the same codemirror editor for custom metadata editing

* Fix
* MAX_BY and MIN_BY functions cannot be decomposed, so metric decomposition should reflect that

* Fix tests

* Remove unused part of query ast update after max_by decomposition
Co-authored-by: GitHub Actions Bot <>
* If a role path is provided, should stick to the role path

* Enable role path dimensions in filters
Manage filters and dimensions resolution correctly
Fix tests

* Fix tests

* Add test for storing cubes that have dimensions with role paths

* Fix
…1552)

* When using the DJ client to retrieve metrics data, rename all output columns from their physical names to their semantic names

* Fix

* Fix tests
* Fix some inconsistencies with metric display name for cubes when deployed
Make cube deployment part of a single commit

* Remove extraneous CreateCube input map

* Fix

* Fix cube deployment validation

* Refactor batch load of metrics and dimensions as separate functions

* Make sure that we can set tags for cubes

* Fix tests

* Fix and cleanup tests

* Fix tests
Co-authored-by: GitHub Actions Bot <>
Update YAML docs given new deployment setup
Co-authored-by: GitHub Actions Bot <>
…Junction#1568)

* Improve namespace creation UX with inline folder-like interface

* Fix tests

* Fix lint
* Add support for editing reference links and dimension links

* Make edit icon show up by default

* Limit the max size of the Dimension Links column

* Fix adding, editing or removing complex dim link

* Fix tests

* lint

* Fix tests

* Add ability to refresh page after removing dim link

* Add tests

* lint
* Fix bug that allows saving invalid metrics

* Add test that confirms metrics bug fix

* make lint

* Remove unused variable

* only validate columns in metrics without qualifying namespace (those are from dimension links)

* mypy
Co-authored-by: GitHub Actions Bot <>
* Add support for dimensional hierarchies

* Add history tracking for hierarchies and audit info like created by and owner

* Move database functions to classmethods

* Add tests for hierarchies

* Add hierarchies database migration

* Remove the hard-coded level order inputs when creating hierarchies since they can be derived from the list order

* Remove validate endpoint for hierarchies
* Add principal kind for GROUP

* Add group members table

* Add pluggable group membership service

* Add API endpoints to support groups

* Add groups-related models

* Fix database migration
…ataJunction#1504)

* Allow configurable embedded query client

* more coverage

* Catch generic exception in case the Snowflake connector is not installed

* Add test for snowflake types to DJ types mappings

* Use global client with examples

* Update datajunction-server/datajunction_server/config.py

Co-authored-by: Olek Górajek <agorajek@users.noreply.github.com>

* Update datajunction-server/datajunction_server/config.py

Co-authored-by: Olek Górajek <agorajek@users.noreply.github.com>

* client_with_roads

---------

Co-authored-by: Olek Górajek <agorajek@users.noreply.github.com>
Co-authored-by: GitHub Actions Bot <>
* Add database models for roles, role scopes, and role assignments

* Add roles APIs

* Add API endpoints for roles, role scopes, and role assignments
* Add the ability to filter nodes by mode (draft, published) via GraphQL

* Add filtering option to UI

* Add column for displaying mode

* Add tests for GraphQL mode filtering

* Fix tests for UI namespace loading

* Add NodeModeSelect
Co-authored-by: GitHub Actions Bot <>
* Update header to include user, notifications, and settings
Add notifications page + grouping

* Fix

* Add test coverage
* When we create a user, we should also create its groups, if any

* Add test for syncing groups
…ction#1591)

* Add support for creating and managing service accounts in UI

* Refactor service account section into separate components

* Fix test

* Fix test
Co-authored-by: GitHub Actions Bot <>
… Users already exist in DB by the time protected endpoints are called (created via signup or OAuth). Internal deployments should handle user provisioning in auth middleware instead. (DataJunction#1595)
Co-authored-by: GitHub Actions Bot <>
shangyian and others added 28 commits March 1, 2026 13:38
Co-authored-by: GitHub Actions Bot <actions@github.com>
* Fix UI validation errors and form crashes

* Fix lint

* Fix

* Fix tests
Co-authored-by: GitHub Actions Bot <actions@github.com>
* Fix issue with editing transforms

* Fix
Co-authored-by: GitHub Actions Bot <actions@github.com>
* Fix server validation and dependency tracking

* Fix some tests

* Fix issues with restore validation

* Fix tests

* Remove logging

* Fix tests

* Fix
Co-authored-by: GitHub Actions Bot <actions@github.com>
DataJunction#1824)

* Various query optimizations by reducing eager loading

* Add additional test coverage
Co-authored-by: GitHub Actions Bot <actions@github.com>
Co-authored-by: GitHub Actions Bot <actions@github.com>
…oflush (DataJunction#1828)

* Fixes a bug when deploying a cube with partitioned columns due to autoflush on orphaned columns

* Add test
Co-authored-by: GitHub Actions Bot <actions@github.com>
…ataJunction#1830)

* improve dimensions graph discovery speed via bfs

* Fix

* Add dag tests
…ataJunction#1831)

* When getting downstream nodes, be targeted in how much eager loading we do. We only need cube elements, node revision, and node, not the full set of node output options

* Don't do N+1 for cubes on dimension link removal

* Switch to layered BFS with raw SQL for get downstreams

* Remove old downstream nodes BFS implementation

* Fix tests: ordering does matter for status propagation, so downstream nodes need to be in topological order

* Remove dead code
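
The layered BFS and the topological-ordering requirement mentioned in the bullets above can be illustrated with a minimal sketch. It works on an in-memory adjacency dict rather than raw SQL, and the names `downstream_layers`, `topological_order`, and `children` are hypothetical, not taken from the DataJunction codebase. Each loop iteration expands a whole frontier at once, which is the step a per-layer query would replace. Since BFS layer order alone is not guaranteed to be topological, the sketch sorts the discovered nodes separately with Kahn's algorithm.

```python
from collections import deque


def downstream_layers(children, start):
    """Layered BFS: expand one whole frontier per step (hypothetical helper)."""
    seen = {start}
    frontier = [start]
    layers = []
    while frontier:
        next_frontier = []
        for node in frontier:
            for child in children.get(node, ()):
                if child not in seen:
                    seen.add(child)
                    next_frontier.append(child)
        if next_frontier:
            layers.append(next_frontier)
        frontier = next_frontier
    return layers


def topological_order(children, nodes):
    """Kahn's algorithm restricted to the given subset of nodes."""
    nodes = set(nodes)
    indegree = {n: 0 for n in nodes}
    for n in nodes:
        for child in children.get(n, ()):
            if child in nodes:
                indegree[child] += 1
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for child in sorted(children.get(n, ())):
            if child in nodes:
                indegree[child] -= 1
                if indegree[child] == 0:
                    queue.append(child)
    return order
```

Flattening the layers and passing them to `topological_order` yields an order in which every node appears after all of its discovered parents, which is the property status propagation relies on.
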
* Change get_upstream_nodes to use layered BFS with raw SQL approach

* Fix dag join

* Fix tests

* Fix
Co-authored-by: GitHub Actions Bot <actions@github.com>
* Don't refresh ahead for query caching, since we are already only retrieving versioned queries

* Switch to a semaphore cap to control concurrent sql building refreshes from background tasks
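
As a rough illustration of a semaphore cap on concurrent background work, the sketch below limits how many coroutines run at once with `asyncio.Semaphore`. The function name `run_capped` and its parameters are assumptions made for this example. The actual refresh tasks in this change are not shown in the commit message.

```python
import asyncio


async def run_capped(jobs, limit):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Each job waits here until one of the `limit` slots is free.
        async with semaphore:
            return await job()

    # gather preserves the order of `jobs` in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

A caller would pass a list of coroutine functions, for example `asyncio.run(run_capped([refresh_a, refresh_b], limit=2))`, where `refresh_a` and `refresh_b` are placeholders.
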
* Fix duplicate find_matching_cube calls
Co-authored-by: GitHub Actions Bot <actions@github.com>
…alidation.ms, dj.cube_matching.ms, dj.graphql.query_ms, dj.graphql.errors, dj.db.query_count. Also extended @timed to support sync functions. (DataJunction#1840)
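
One common way to make a timing decorator work for both sync and async functions is to branch on `inspect.iscoroutinefunction`. The sketch below is a generic illustration under that assumption. It is not the project's `@timed` implementation, and the `record` callback is a hypothetical stand-in for whatever metrics client is used.

```python
import functools
import inspect
import time


def timed(record):
    """Decorator factory; `record(name, elapsed_ms)` is a hypothetical sink."""

    def decorator(func):
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return await func(*args, **kwargs)
                finally:
                    # Runs on success and on error, so failures are timed too.
                    record(func.__name__, (time.perf_counter() - start) * 1000.0)

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                record(func.__name__, (time.perf_counter() - start) * 1000.0)

        return sync_wrapper

    return decorator
```
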
* Add BigQuery query client and dialect support

Implements a direct BigQuery integration following the same pattern as
the existing Snowflake client. Adds `BigQueryClient` for table
introspection via INFORMATION_SCHEMA, registers `bigquery` as a
supported dialect with sqlglot transpilation, and exposes it as an
optional install extra (`datajunction-server[bigquery]`).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix pre-commit: ruff format, trailing commas, GraphQL schema

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix BigQuery test failures: QueryJobConfig mock and BIGNUMERIC precision

- Import QueryJobConfig and ScalarQueryParameter at module level so
  tests can patch them (accessing via bigquery=None failed)
- Fix BIGNUMERIC/BIGDECIMAL to use DecimalType(38, 38) since DJ's
  DecimalType caps max_precision at 38

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add tests for _get_client and test_connection to reach 100% coverage

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add test for BigQuery ImportError path in utils to reach 100% coverage

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add engine URI project resolution to BigQueryClient (#1)

Mirrors SnowflakeClient's _get_database_from_engine approach: parses
the GCP project from the engine URI netloc (bigquery://my-gcp-project)
so different DJ catalogs can point to different GCP projects.

Also adds BigQuery env config example to .env and updates tests.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
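
The netloc-based project lookup described in this commit can be sketched with the standard library's `urlparse`. The function name `project_from_engine_uri` and the `default` parameter are hypothetical. The real `BigQueryClient` may resolve the project differently, and later commits mention further sources that are not modeled here.

```python
from typing import Optional
from urllib.parse import urlparse


def project_from_engine_uri(
    uri: Optional[str],
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the netloc of a bigquery:// URI, else `default` (hypothetical)."""
    if not uri:
        return default
    parsed = urlparse(uri)
    if parsed.scheme != "bigquery":
        return default
    # For bigquery://my-gcp-project the netloc is "my-gcp-project".
    return parsed.netloc or default
```
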

* Address review comments on BigQueryClient (#2)

* Add engine URI project resolution to BigQueryClient

Mirrors SnowflakeClient's _get_database_from_engine approach: parses
the GCP project from the engine URI netloc (bigquery://my-gcp-project)
so different DJ catalogs can point to different GCP projects.

Also adds BigQuery env config example to .env and updates tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Address review comments on BigQueryClient

- Add sqlglot to bigquery extra for dialect transpilation support
- Add BIGQUERY_AVAILABLE import coverage tests (True/False paths)
- Add BigQuery config documentation with examples to QueryClientConfig
- Remove redundant 0-based index comment in get_columns_for_table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add BigQuery project resolution order from engine URI

* Add BigQuery query execution support to datajunction-query

Add BigQuery as a supported engine type in the query service, following
the existing Snowflake pattern. Supports project config via extra_params,
credentials via config or GOOGLE_APPLICATION_CREDENTIALS env var, and
Application Default Credentials as fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add comprehensive BigQuery integration tests

Cover credentials path, location, env var fallback, error handling,
multi-row and empty results in datajunction-query. Add client project
override, location, factory with all options, unsupported type, engine
URI project override, and credentials precedence tests in
datajunction-server.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix mypy type error and ruff formatting in BigQuery query execution

Wrap BigQuery rows in iter() to match Stream (Iterator) type.
Apply ruff format to test file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix test_get_columns_for_table_with_engine_project_override mock setup

Mock QueryJobConfig and ScalarQueryParameter which are None in CI
(google-cloud-bigquery not installed), matching the pattern used by
other get_columns_for_table tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add tests for uncovered branches in _get_project_from_engine

Cover empty path segment fallthrough (131->134) and query params
without project key (136->145) to reach 100% branch coverage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Strip catalog prefix from BigQuery SQL and fix ADC credential handling

DJ generates SQL with catalog-prefixed table names (e.g. my_catalog.dataset.table)
but BigQuery interprets three-part names as project.dataset.table. Since the BQ
client already has the project configured, strip the catalog prefix so BigQuery
receives dataset.table references. Also remove the GOOGLE_APPLICATION_CREDENTIALS
env var fallback from credentials_path — let bigquery.Client() handle ADC natively.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
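
To make the catalog-prefix stripping concrete, here is a deliberately simple regex-based sketch. It is an assumption-laden illustration, not the approach used in this commit: it does not parse SQL, so it would also rewrite matching text inside string literals or comments, and it ignores backtick-quoted identifiers. The function name `strip_catalog_prefix` is hypothetical.

```python
import re


def strip_catalog_prefix(sql: str, catalog: str) -> str:
    """Turn catalog.dataset.table into dataset.table (naive, regex-only)."""
    pattern = re.compile(
        # Match "catalog." only when two more dotted identifiers follow,
        # so two-part references such as catalog.column are left alone.
        rf"\b{re.escape(catalog)}\.(?=[A-Za-z_]\w*\.[A-Za-z_]\w*)"
    )
    return pattern.sub("", sql)
```

A production version would more likely rewrite table references on a parsed syntax tree, which avoids the false positives noted above.
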

* Remove unused variable flagged by ruff (F841)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add support for saving cube filters and applying them during SQL generation

* Add support for generating SQL on cubes with filters

* Add instrumentation for /sql/measures/v3 and extra test

* Add test coverage

* Add support for querying cube filters via GraphQL

* Fix issue with saving cube filters via YAML deployment

* Enable displaying cube filters in the UI

* Fix tests

* Fix
…on#1841)

* Fix issue where a detached relationship on RoleAssignment (due to separate sessions being used to load the auth context and bearer session) was causing errors

* Fix inconsistent bug with copy nodes
Co-authored-by: GitHub Actions Bot <actions@github.com>
@netlify

netlify bot commented Mar 8, 2026

Deploy Preview for thriving-cassata-78ae72 canceled.

| Name | Link |
|---|---|
| 🔨 Latest commit | 78aeb7f |
| 🔍 Latest deploy log | https://app.netlify.com/projects/thriving-cassata-78ae72/deploys/69add62cd061d500080249ea |


Labels

None yet

Projects

None yet

5 participants