Add multi-tenant Databricks token support via cross-namespace K8s secrets #3394

Merged
machichima merged 2 commits into flyteorg:master from rohitrsh:feat/databricks-token-support
Mar 10, 2026

@rohitrsh rohitrsh commented Feb 17, 2026

Tracking issue

Related to flyteorg/flyte#6911

Why are the changes needed?

The Databricks Spark connector currently uses a single global FLYTE_DATABRICKS_ACCESS_TOKEN environment variable for authenticating all Databricks API calls. This creates significant limitations for multi-tenant Flyte deployments:

  1. No project isolation: All projects share the same Databricks workspace/token, preventing fine-grained access control (e.g., Unity Catalog scoping per project).
  2. Token rotation requires redeployment: Changing the token requires restarting the connector pod.
  3. Single workspace: Organisations cannot route different projects to different Databricks workspaces.

This PR adds per-project Databricks token support by enabling the connector to read tokens from Kubernetes secrets in the workflow's project namespace, with backwards-compatible fallback to the existing environment variable.

What changes were proposed in this pull request?

Token Resolution Strategy

The connector now resolves Databricks tokens using a tiered strategy:

  1. Namespace-specific K8s secret: Reads a secret (default name: databricks-token, key: token) from the workflow's Kubernetes namespace using cross-namespace lookup.
  2. Fallback to environment variable: Falls back to FLYTE_DATABRICKS_ACCESS_TOKEN if no namespace secret is found.
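
The two tiers above can be sketched as follows. This is a minimal illustration, not the connector's actual code: `resolve_token`, the `SecretReader` type, and the injected `read_secret` callable are hypothetical names chosen so the example runs without a cluster. How the real `get_databricks_token` treats an empty secret value may differ; here an empty value simply falls through to the environment variable.

```python
import os
from typing import Callable, Mapping, Optional

DEFAULT_SECRET_NAME = "databricks-token"  # default secret name from the PR description
DEFAULT_SECRET_KEY = "token"  # default key from the PR description
ENV_VAR = "FLYTE_DATABRICKS_ACCESS_TOKEN"

# Hypothetical reader type: (secret_name, secret_key, namespace) -> token or None.
SecretReader = Callable[[str, str, str], Optional[str]]


def resolve_token(
    namespace: Optional[str],
    read_secret: SecretReader,
    secret_name: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return a token from the namespace secret, else from the environment."""
    env = os.environ if env is None else env

    # Tier 1: namespace-specific secret (skipped when no namespace is known).
    if namespace:
        name = secret_name or DEFAULT_SECRET_NAME
        token = read_secret(name, DEFAULT_SECRET_KEY, namespace)
        if token:
            return token

    # Tier 2: the global environment variable.
    token = env.get(ENV_VAR)
    if token:
        return token

    raise ValueError(
        f"No Databricks token found in namespace {namespace!r} or in {ENV_VAR}"
    )


# Usage with an in-memory stand-in for the cluster.
fake_cluster = {("project-a", "databricks-token"): "dapi-project-a"}


def fake_reader(name: str, key: str, ns: str) -> Optional[str]:
    return fake_cluster.get((ns, name))


print(resolve_token("project-a", fake_reader, env={}))
print(resolve_token("project-b", fake_reader, env={ENV_VAR: "dapi-global"}))
```

Injecting the reader keeps the resolution order testable on its own, which mirrors how the unit tests listed later mock the Kubernetes client.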

Changes

connector.py:

  • Added get_secret_from_k8s(secret_name, secret_key, namespace): Cross-namespace K8s secret reader using the Kubernetes Python client. Tries in-cluster config first, falls back to kubeconfig for local development.
  • Added get_databricks_token(namespace, task_template, secret_name): Implements the token resolution strategy with structured logging and error handling.
  • Updated get_header(): Now accepts an optional auth_token parameter.
  • Updated DatabricksJobMetadata: Added auth_token field to persist the resolved token across create/get/delete operations.
  • Updated DatabricksConnector.create(): Now accepts task_execution_metadata parameter, extracts the namespace, and resolves the project-specific token.
  • Updated DatabricksConnector.get() / delete(): Use stored auth_token from metadata.
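
A cross-namespace secret reader along the lines described for get_secret_from_k8s might look like the sketch below. It assumes the official `kubernetes` Python client; `read_k8s_secret` and `decode_secret_value` are hypothetical names, and returning None on every failure path is an assumption about behaviour, not a copy of the merged code.

```python
import base64
from typing import Mapping, Optional


def decode_secret_value(data: Optional[Mapping[str, str]], key: str) -> Optional[str]:
    """Decode one key of a Secret's base64-encoded `data` mapping."""
    if not data or key not in data:
        return None
    # Secret values are stored base64-encoded; strip a possible trailing newline.
    return base64.b64decode(data[key]).decode("utf-8").strip()


def read_k8s_secret(secret_name: str, secret_key: str, namespace: str) -> Optional[str]:
    """Read `secret_key` from `secret_name` in `namespace`, or return None."""
    try:
        # Imported lazily so a missing package degrades to "no secret found".
        from kubernetes import client, config
        from kubernetes.client.rest import ApiException
    except ImportError:
        return None

    try:
        config.load_incluster_config()  # running inside a pod
    except config.ConfigException:
        try:
            config.load_kube_config()  # local development
        except config.ConfigException:
            return None

    try:
        secret = client.CoreV1Api().read_namespaced_secret(
            name=secret_name, namespace=namespace
        )
    except ApiException:
        # Covers 404 (secret missing) as well as 403 (RBAC denies access).
        return None

    return decode_secret_value(secret.data, secret_key)
```

Calling `read_k8s_secret("databricks-token", "token", "my-project-namespace")` from the connector pod would return the decoded token only if the service account has `get` permission on that secret.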

task.py:

  • Added databricks_token_secret field to DatabricksV2: Allows users to specify a custom K8s secret name per task.
  • Updated get_custom(): Serialises databricksTokenSecret into the task template custom dict.
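
The serialisation change in task.py can be illustrated with a stand-in dataclass. `DatabricksV2Sketch` is hypothetical and omits everything the real DatabricksV2 inherits; the `databricksConf` and `databricksInstance` keys are assumed for illustration, while `databricksTokenSecret` is the key named in this PR.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class DatabricksV2Sketch:
    """Stand-in for the task config; not the real flytekit class."""

    databricks_conf: Optional[Dict[str, Any]] = None
    databricks_instance: Optional[str] = None
    databricks_token_secret: Optional[str] = None

    def get_custom(self) -> Dict[str, Any]:
        custom: Dict[str, Any] = {
            "databricksConf": self.databricks_conf,  # assumed key name
            "databricksInstance": self.databricks_instance,  # assumed key name
        }
        # Only emit the secret name when the user set one, so templates of
        # existing tasks are unchanged.
        if self.databricks_token_secret is not None:
            custom["databricksTokenSecret"] = self.databricks_token_secret
        return custom


default_cfg = DatabricksV2Sketch(databricks_instance="your-instance.cloud.databricks.com")
custom_cfg = DatabricksV2Sketch(databricks_token_secret="my-custom-secret")
print(default_cfg.get_custom())
print(custom_cfg.get_custom())
```

Leaving the key out when the field is None matches the "get_custom() excludes field when None" test case listed in the test plan.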

User Experience

No changes needed for existing users: Existing workflows continue to work with the global FLYTE_DATABRICKS_ACCESS_TOKEN.

New capability for multi-tenant setups:

# Create project-specific Databricks token
kubectl create secret generic databricks-token \
  --from-literal=token='dapi_project_specific_token' \
  --namespace=my-project-namespace

# Optional: Custom secret name per task
@task(
    task_config=DatabricksV2(
        databricks_conf={...},
        databricks_instance="your-instance.cloud.databricks.com",
        databricks_token_secret="my-custom-secret",  # Optional
    )
)
def my_task():
    ...

How was this patch tested?

Unit Tests (test_databricks_token.py)

30+ test cases covering:

get_secret_from_k8s:

  • Secret found successfully (mocked K8S client)
  • Secret not found (404)
  • Secret key missing
  • Secret data is None
  • Kubernetes package not installed (ImportError)
  • Non-404 API exceptions (403 forbidden)
  • Kubeconfig fallback when not in cluster
  • Both config methods fail gracefully

get_databricks_token:

  • Token from namespace secret
  • Token with custom secret name
  • Fallback to env var when namespace secret is missing
  • No namespace falls back to env var
  • No token from any source raises ValueError
  • Empty token raises ValueError
  • Default secret name verification
  • Backward compatibility (no namespace, no secret)

get_header:

  • With pre-resolved auth token
  • Without auth token (auto-resolution)

DatabricksJobMetadata:

  • Auth token persistence
  • Default None value

DatabricksConnector:

  • create() with namespace-specific token
  • create() with custom secret name
  • create() without metadata (backwards compatible)
  • Token stored in returned metadata
  • get() uses stored token
  • delete() uses stored token
  • get() with None token falls back to default

DatabricksV2 task config:

  • Token secret field exists
  • Defaults to None
  • get_custom() serializes databricksTokenSecret
  • get_custom() excludes field when None
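
As an illustration of how a few of the cases above could be written, here is a small self-contained sketch. `get_header_sketch` and `JobMetadataSketch` are hypothetical stand-ins for get_header and DatabricksJobMetadata; the header layout and the metadata fields other than auth_token are assumptions, and the real tests in test_databricks_token.py may be structured differently.

```python
import os
from dataclasses import dataclass
from typing import Dict, Optional
from unittest import mock

ENV_VAR = "FLYTE_DATABRICKS_ACCESS_TOKEN"


def get_header_sketch(auth_token: Optional[str] = None) -> Dict[str, str]:
    """Build request headers, preferring a pre-resolved token (assumed layout)."""
    token = auth_token or os.environ.get(ENV_VAR)
    if not token:
        raise ValueError("No Databricks token available")
    return {"Authorization": f"Bearer {token}", "content-type": "application/json"}


@dataclass
class JobMetadataSketch:
    """Stand-in for the job metadata; only auth_token comes from the PR text."""

    databricks_instance: str
    run_id: str
    auth_token: Optional[str] = None


def test_get_uses_stored_token() -> None:
    meta = JobMetadataSketch("example.cloud.databricks.com", "123", auth_token="dapi-project")
    with mock.patch.dict(os.environ, {ENV_VAR: "dapi-global"}, clear=True):
        header = get_header_sketch(meta.auth_token)
    assert header["Authorization"] == "Bearer dapi-project"


def test_none_token_falls_back_to_env() -> None:
    meta = JobMetadataSketch("example.cloud.databricks.com", "123")
    with mock.patch.dict(os.environ, {ENV_VAR: "dapi-global"}, clear=True):
        header = get_header_sketch(meta.auth_token)
    assert header["Authorization"] == "Bearer dapi-global"


def test_no_token_raises() -> None:
    with mock.patch.dict(os.environ, {}, clear=True):
        try:
            get_header_sketch(None)
        except ValueError:
            return
    raise AssertionError("expected ValueError")
```

`mock.patch.dict(..., clear=True)` isolates each test from whatever token happens to be set in the developer's shell and restores the environment afterwards.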

Setup process

  1. Connector RBAC: Grant the connector's service account get permission on secrets across namespaces:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flyte-connector-secret-reader
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
    resourceNames: ["databricks-token"]
  2. Create namespace secrets:
kubectl create secret generic databricks-token \
  --from-literal=token='your-token' \
  --namespace=your-project-namespace
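
A ClusterRole takes effect only once it is bound to a subject. One possible way to do that, sketched below, is a RoleBinding in each project namespace that references the ClusterRole from step 1. The service account name `flyte-connector` and its namespace `flyte` are placeholders, not values taken from this PR. Note also that the `resourceNames` list in step 1 only covers `databricks-token`, so any custom name passed via databricks_token_secret would need to be added there.

```python
import json
from typing import Any, Dict

CLUSTER_ROLE_NAME = "flyte-connector-secret-reader"  # name used in step 1


def role_binding(project_namespace: str, sa_name: str, sa_namespace: str) -> Dict[str, Any]:
    """Build a RoleBinding that grants the ClusterRole inside one namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {
            "name": f"{CLUSTER_ROLE_NAME}-binding",
            "namespace": project_namespace,
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": sa_name, "namespace": sa_namespace}
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": CLUSTER_ROLE_NAME,
        },
    }


# Placeholder service account and namespace; substitute your deployment's values.
manifest = role_binding("your-project-namespace", "flyte-connector", "flyte")
print(json.dumps(manifest, indent=2))
```

The manifest is printed as JSON, which kubectl accepts alongside YAML. A ClusterRoleBinding would grant the same access in every namespace at once; per-namespace RoleBindings are the narrower option.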

Screenshots

N/A (backend-only change)

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

N/A

welcome bot commented Feb 17, 2026

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

pingsutw commented Mar 4, 2026

lgtm, thank you! Could you fix the lint errors?

@rohitrsh rohitrsh force-pushed the feat/databricks-token-support branch from 7ceb06f to 9eef1e0 Compare March 4, 2026 19:48
@rohitrsh rohitrsh force-pushed the feat/databricks-token-support branch from 9eef1e0 to 048ed2e Compare March 5, 2026 09:42
…space K8s secrets

Enable per-project Databricks authentication by reading tokens from
Kubernetes secrets in workflow namespaces, with backward-compatible
fallback to the FLYTE_DATABRICKS_ACCESS_TOKEN environment variable.

Changes:
- Add get_secret_from_k8s() for cross-namespace K8s secret reading
- Add get_databricks_token() with tiered resolution (K8s -> env var)
- Update DatabricksJobMetadata to persist auth_token across lifecycle
- Update DatabricksConnector.create/get/delete to use per-project tokens
- Add DatabricksV2.databricks_token_secret for custom secret names
- Add 31 comprehensive tests covering all token resolution paths

Tracking: flyteorg/flyte#6911
Signed-off-by: Rohit Sharma <rohitrsh@gmail.com>
@rohitrsh rohitrsh force-pushed the feat/databricks-token-support branch from 048ed2e to 0cfa9c8 Compare March 5, 2026 10:57

rohitrsh commented Mar 5, 2026

> lgtm, thank you! Could you fix the lint errors?

Hey @pingsutw, so here's what I have done to fix the lint issue.

Lint fix: Restored pydoclint-errors-baseline.txt

The lint CI was failing because pydoclint-errors-baseline.txt was empty on this branch. This file contains all pre-existing pydoclint violations across the flytekit repo (introduced in #3077), so they don't block new PRs.

Without the baseline, Pydoclint treated every pre-existing violation in the entire codebase as a new error, causing the failures in ruff, ruff-format, trailing-whitespace, and Pydoclint checks.

What was done:

  • Restored the baseline from master (591 lines)
  • pydoclint auto-regenerated it to 587 lines because our PR actually fixed 4 pre-existing violations: the DatabricksV2 class docstring in task.py now properly documents all attributes (databricks_conf, databricks_instance, databricks_token_secret)
  • All new functions (get_secret_from_k8s, get_databricks_token, get_header) have proper Google-style docstrings with type hints, so they introduce zero new violations
  • Verified locally: make lint passes all checks (mypy + pre-commit)
$ make lint
mypy flytekit/core
Success: no issues found in 52 source files
mypy flytekit/types
Success: no issues found in 23 source files
mypy --allow-empty-bodies --disable-error-code="annotation-unchecked" tests/flytekit/unit/core
Success: no issues found in 102 source files
pre-commit run --all-files
ruff.....................................................................Passed
ruff-format..............................................................Passed
check yaml...............................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
shellcheck...............................................................Passed
Check for exposed PDB statements.........................................Passed
codespell................................................................Passed
pydoclint................................................................Passed

@machichima machichima left a comment

LGTM!

@rohitrsh rohitrsh force-pushed the feat/databricks-token-support branch 2 times, most recently from e13f8f9 to 1df2ae5 Compare March 10, 2026 19:15
Resolve merge conflicts in task.py after flyteorg#3392 (serverless) was merged:
- Combined DatabricksV2 attributes: kept all serverless fields, added
  databricks_token_secret
- Combined get_custom() serialization for both feature sets
- Added auth_token to serverless test metadata assertion
- Removed emoji from error message

Signed-off-by: Rohit Sharma <rohitrsh@gmail.com>
Made-with: Cursor
@rohitrsh rohitrsh force-pushed the feat/databricks-token-support branch from 1df2ae5 to 148e088 Compare March 10, 2026 19:56
@machichima machichima merged commit 6297a98 into flyteorg:master Mar 10, 2026
56 checks passed