Add multi-tenant Databricks token support via cross-namespace K8s secrets by rohitrsh · Pull Request #3394 · flyteorg/flytekit

rohitrsh · 2026-02-17T22:28:40Z

Tracking issue

flyteorg/flyte#6911

Why are the changes needed?

The Databricks Spark connector currently uses a single global FLYTE_DATABRICKS_ACCESS_TOKEN environment variable for authenticating all Databricks API calls. This creates significant limitations for multi-tenant Flyte deployments:

No project isolation: All projects share the same Databricks workspace/token, preventing fine-grained access control (e.g., Unity Catalog scoping per project).
Token rotation requires redeployment: Changing the token requires restarting the connector pod.
Single workspace: Organisations cannot route different projects to different Databricks workspaces.

This PR adds per-project Databricks token support by enabling the connector to read tokens from Kubernetes secrets in the workflow's project namespace, with backwards-compatible fallback to the existing environment variable.

What changes were proposed in this pull request?

Token Resolution Strategy

The connector now resolves Databricks tokens using a tiered strategy:

Namespace-specific K8s secret: Reads a secret (default name: databricks-token, key: token) from the workflow's Kubernetes namespace using cross-namespace lookup.
Fallback to environment variable: Falls back to FLYTE_DATABRICKS_ACCESS_TOKEN if no namespace secret is found.
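
The two tiers above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `resolve_token`, the `SecretReader` type, and the injected `read_secret` callable are hypothetical names chosen so the sketch runs without a cluster. The default secret name, key, and environment variable come from the description above.

```python
import os
from typing import Callable, Optional

DEFAULT_SECRET_NAME = "databricks-token"  # default secret name from the PR description
ENV_VAR = "FLYTE_DATABRICKS_ACCESS_TOKEN"  # pre-existing global fallback

# Hypothetical type: (secret_name, secret_key, namespace) -> token, or None if not found.
SecretReader = Callable[[str, str, str], Optional[str]]


def resolve_token(
    namespace: Optional[str],
    read_secret: SecretReader,
    secret_name: str = DEFAULT_SECRET_NAME,
    secret_key: str = "token",
) -> str:
    """Return the namespace secret's token if present, else the env var value."""
    token: Optional[str] = None
    if namespace:
        # Tier 1: secret stored in the workflow's own namespace.
        token = read_secret(secret_name, secret_key, namespace)
    if not token:
        # Tier 2: the global environment variable used before this PR.
        token = os.environ.get(ENV_VAR)
    if not token:
        # Missing and empty tokens are both treated as an error.
        raise ValueError(
            f"No Databricks token found in secret {secret_name!r} "
            f"(namespace {namespace!r}) or in {ENV_VAR}"
        )
    return token
```

Passing the reader in as a callable keeps the resolution order testable on its own; the real connector may wire this differently.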

Changes

connector.py:

Added get_secret_from_k8s(secret_name, secret_key, namespace): Cross-namespace K8s secret reader using the Kubernetes Python client. Tries in-cluster config first, then falls back to kubeconfig for local development.
Added get_databricks_token(namespace, task_template, secret_name): Implements the token resolution strategy with structured logging and error handling.
Updated get_header(): Now accepts an optional auth_token parameter.
Updated DatabricksJobMetadata: Added auth_token field to persist the resolved token across create/get/delete operations.
Updated DatabricksConnector.create(): Now accepts a task_execution_metadata parameter, extracts the namespace, and resolves the project-specific token.
Updated DatabricksConnector.get() / delete(): Use the stored auth_token from metadata.
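
For readers unfamiliar with the Kubernetes Python client, a cross-namespace secret read along the lines described above might look like the sketch below. The function names `read_secret` and `decode_secret_value` are hypothetical, and returning None for every API error (404 and 403 alike) is an assumption; the PR's get_secret_from_k8s may handle those cases differently.

```python
import base64
from typing import Mapping, Optional


def decode_secret_value(data: Optional[Mapping[str, str]], key: str) -> Optional[str]:
    """Decode one base64-encoded entry of a Secret's data map; None if absent."""
    if not data or key not in data:
        return None
    return base64.b64decode(data[key]).decode("utf-8")


def read_secret(secret_name: str, secret_key: str, namespace: str) -> Optional[str]:
    """Read one key of a secret in the given namespace, or None on any miss."""
    try:
        # Optional dependency: imported lazily so the module loads without it.
        from kubernetes import client, config
        from kubernetes.client.rest import ApiException
    except ImportError:
        return None
    try:
        config.load_incluster_config()  # running inside a pod
    except config.ConfigException:
        try:
            config.load_kube_config()  # local development
        except config.ConfigException:
            return None
    try:
        secret = client.CoreV1Api().read_namespaced_secret(
            name=secret_name, namespace=namespace
        )
    except ApiException:
        # Assumption: a missing secret (404) and an RBAC denial (403)
        # are both treated as "no namespace token".
        return None
    return decode_secret_value(secret.data, secret_key)
```

The decoding step is split out because the API returns secret values base64-encoded, and a pure helper is easy to unit test without a mocked client.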

task.py:

Added databricks_token_secret field to DatabricksV2: Allows users to specify a custom K8s secret name per task.
Updated get_custom(): Serialises databricksTokenSecret into the task template custom dict.
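
The serialisation behaviour can be illustrated with a reduced stand-in class. `DatabricksConfigSketch` is hypothetical and omits most of DatabricksV2; the `databricksConf` and `databricksInstance` key names are assumed for illustration, while `databricksTokenSecret` and the "omit when None" rule come from this PR's description and test list.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DatabricksConfigSketch:
    """Hypothetical stand-in for DatabricksV2, reduced to three fields."""

    databricks_conf: Dict[str, Any] = field(default_factory=dict)
    databricks_instance: Optional[str] = None
    databricks_token_secret: Optional[str] = None

    def get_custom(self) -> Dict[str, Any]:
        # Key names other than databricksTokenSecret are assumptions.
        custom: Dict[str, Any] = {
            "databricksConf": self.databricks_conf,
            "databricksInstance": self.databricks_instance,
        }
        # Only emit the secret name when the user set one, so existing
        # task templates serialise exactly as before.
        if self.databricks_token_secret is not None:
            custom["databricksTokenSecret"] = self.databricks_token_secret
        return custom
```

Leaving the key out entirely when it is unset means the connector can treat its absence as "use the default secret name".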

User Experience

No changes needed for existing users: Existing workflows continue to work with the global FLYTE_DATABRICKS_ACCESS_TOKEN.

New capability for multi-tenant setups:

# Create project-specific Databricks token
kubectl create secret generic databricks-token \
  --from-literal=token='dapi_project_specific_token' \
  --namespace=my-project-namespace

# Optional: Custom secret name per task
from flytekit import task
from flytekitplugins.spark import DatabricksV2

@task(
    task_config=DatabricksV2(
        databricks_conf={...},
        databricks_instance="your-instance.cloud.databricks.com",
        databricks_token_secret="my-custom-secret",  # Optional; defaults to "databricks-token"
    )
)
def my_task():
    ...

How was this patch tested?

Unit Tests (`test_databricks_token.py`)

30+ test cases covering:

get_secret_from_k8s:

Secret found successfully (mocked K8s client)
Secret not found (404)
Secret key missing
Secret data is None
Kubernetes package not installed (ImportError)
Non-404 API exceptions (403 forbidden)
Kubeconfig fallback when not in cluster
Both config methods fail gracefully

get_databricks_token:

Token from namespace secret
Token with custom secret name
Fallback to env var when namespace secret is missing
No namespace falls back to env var
No token from any source raises ValueError
Empty token raises ValueError
Default secret name verification
Backward compatibility (no namespace, no secret)

get_header:

With pre-resolved auth token
Without auth token (auto-resolution)

DatabricksJobMetadata:

Auth token persistence
Default None value

DatabricksConnector:

create() with namespace-specific token
create() with custom secret name
create() without metadata (backwards compatible)
Token stored in returned metadata
get() uses stored token
delete() uses stored token
get() with None token falls back to default

DatabricksV2 task config:

Token secret field exists
Defaults to None
get_custom() serializes databricksTokenSecret
get_custom() excludes field when None
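
As a rough idea of what the "stored token" cases above check, here is a self-contained sketch in the same style. `JobMetadataSketch` and `build_header` are hypothetical stand-ins for DatabricksJobMetadata and get_header(); their exact fields and header contents are assumptions, not the PR's code.

```python
import os
from dataclasses import dataclass
from typing import Dict, Optional
from unittest import mock

ENV_VAR = "FLYTE_DATABRICKS_ACCESS_TOKEN"


@dataclass
class JobMetadataSketch:
    """Hypothetical stand-in for DatabricksJobMetadata."""

    databricks_instance: str
    run_id: str
    auth_token: Optional[str] = None  # resolved once at create() time


def build_header(auth_token: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical stand-in for get_header(): prefer the passed-in token."""
    token = auth_token or os.environ.get(ENV_VAR)
    if not token:
        raise ValueError("no Databricks token available")
    return {"Authorization": f"Bearer {token}", "content-type": "application/json"}


def test_stored_token_wins_over_env() -> None:
    meta = JobMetadataSketch("your-instance.cloud.databricks.com", "42", "ns-token")
    with mock.patch.dict(os.environ, {ENV_VAR: "env-token"}):
        assert build_header(meta.auth_token)["Authorization"] == "Bearer ns-token"


def test_none_token_falls_back_to_env() -> None:
    meta = JobMetadataSketch("your-instance.cloud.databricks.com", "42")
    with mock.patch.dict(os.environ, {ENV_VAR: "env-token"}):
        assert build_header(meta.auth_token)["Authorization"] == "Bearer env-token"
```

Using `mock.patch.dict` restores the environment after each test, so the cases do not leak state into one another.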

Setup process

Connector RBAC: Grant the connector's service account get permission on secrets across namespaces:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flyte-connector-secret-reader
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
    resourceNames: ["databricks-token"]

Create namespace secrets:

kubectl create secret generic databricks-token \
  --from-literal=token='your-token' \
  --namespace=your-project-namespace

Screenshots

N/A (backend-only change)

Check all the applicable boxes

I updated the documentation accordingly.
All new and existing tests passed.
All commits are signed-off.

Related PRs

Related: Add RBAC support for cross-namespace secret reading

Docs link

N/A

pingsutw · 2026-03-04T19:18:14Z

lgtm, thank you! Could you fix the lint errors?

…space K8s secrets

Enable per-project Databricks authentication by reading tokens from Kubernetes secrets in workflow namespaces, with backward-compatible fallback to the FLYTE_DATABRICKS_ACCESS_TOKEN environment variable.

Changes:
- Add get_secret_from_k8s() for cross-namespace K8s secret reading
- Add get_databricks_token() with tiered resolution (K8s -> env var)
- Update DatabricksJobMetadata to persist auth_token across lifecycle
- Update DatabricksConnector.create/get/delete to use per-project tokens
- Add DatabricksV2.databricks_token_secret for custom secret names
- Add 31 comprehensive tests covering all token resolution paths

Tracking: flyteorg/flyte#6911
Signed-off-by: Rohit Sharma <rohitrsh@gmail.com>

rohitrsh · 2026-03-05T11:04:49Z

> lgtm, thank you! Could you fix the lint errors?

Hey @pingsutw, so here's what I have done to fix the lint issue.

Lint fix: Restored pydoclint-errors-baseline.txt

The lint CI was failing because pydoclint-errors-baseline.txt was empty on this branch. This file contains all pre-existing pydoclint violations across the flytekit repo (introduced in #3077), so they don't block new PRs.

Without the baseline, Pydoclint treated every pre-existing violation in the entire codebase as a new error, causing the failures in ruff, ruff-format, trailing-whitespace, and Pydoclint checks.

What was done:

Restored the baseline from master (591 lines)
pydoclint auto-regenerated it to 587 lines because our PR actually fixed 4 pre-existing violations: the DatabricksV2 class docstring in task.py now properly documents all attributes (databricks_conf, databricks_instance, databricks_token_secret).
All new functions (get_secret_from_k8s, get_databricks_token, get_header) have proper Google-style docstrings with type hints, so there are zero new violations.
Verified locally: make lint passes all checks (mypy + pre-commit)

$ make lint
mypy flytekit/core
Success: no issues found in 52 source files
mypy flytekit/types
Success: no issues found in 23 source files
mypy --allow-empty-bodies --disable-error-code="annotation-unchecked" tests/flytekit/unit/core
Success: no issues found in 102 source files
pre-commit run --all-files
ruff.....................................................................Passed
ruff-format..............................................................Passed
check yaml...............................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
shellcheck...............................................................Passed
Check for exposed PDB statements.........................................Passed
codespell................................................................Passed
pydoclint................................................................Passed

machichima

LGTM!

Resolve merge conflicts in task.py after flyteorg#3392 (serverless) was merged:
- Combined DatabricksV2 attributes: kept all serverless fields, added databricks_token_secret
- Combined get_custom() serialization for both feature sets
- Added auth_token to serverless test metadata assertion
- Removed emoji from error message

Signed-off-by: Rohit Sharma <rohitrsh@gmail.com>
Made-with: Cursor

rohitrsh requested review from cosmicBboy, davidmirror-ops, kumare3, machichima, pingsutw, samhita-alla and wild-endeavor as code owners February 17, 2026 22:28

rohitrsh mentioned this pull request Feb 17, 2026

Add RBAC support for cross-namespace secret reading flyteorg/flyte#6919

Merged

3 tasks

pingsutw approved these changes Mar 4, 2026

View reviewed changes

rohitrsh force-pushed the feat/databricks-token-support branch from 7ceb06f to 9eef1e0 Compare March 4, 2026 19:48

pingsutw approved these changes Mar 4, 2026

View reviewed changes

rohitrsh force-pushed the feat/databricks-token-support branch from 9eef1e0 to 048ed2e Compare March 5, 2026 09:42

rohitrsh force-pushed the feat/databricks-token-support branch from 048ed2e to 0cfa9c8 Compare March 5, 2026 10:57

machichima approved these changes Mar 10, 2026

View reviewed changes

rohitrsh force-pushed the feat/databricks-token-support branch 2 times, most recently from e13f8f9 to 1df2ae5 Compare March 10, 2026 19:15

rohitrsh force-pushed the feat/databricks-token-support branch from 1df2ae5 to 148e088 Compare March 10, 2026 19:56

machichima merged commit 6297a98 into flyteorg:master Mar 10, 2026
56 checks passed

machichima merged 2 commits into flyteorg:master from rohitrsh:feat/databricks-token-support
