Skip to content

Commit 2572362

Browse files
refactor(backend/kernel): PAT-only auth, drop External trampoline
The earlier auth_bridge routed OAuth/MSAL/federation through the kernel's External token-provider trampoline (a Python callable the kernel invoked per HTTP request). Removing that for now. Why: routing OAuth into the kernel inherently requires per-request token resolution to keep refresh working during a long-running session. Two viable mechanisms (kernel-native OAuth, or the External callback); both have costs (duplicate OAuth flows vs GIL-per-request). Punting the decision until there's actual demand on use_sea=True. Today: the bridge accepts PAT (including TokenFederationProvider- wrapped PAT, which is how `get_python_sql_connector_auth_provider` always shapes it). Any non-PAT auth_provider raises a clear NotSupportedError pointing the user at use_sea=False (Thrift). This shrinks the auth_bridge to ~50 lines and means the kernel- side External enablement PR is no longer on the connector's critical path — there's no kernel-side prerequisite for shipping use_sea=True for PAT users. Unit tests updated: - TokenFederationProvider-wrapped PAT still routes to PAT (kept). - Generic OAuth provider raises NotSupportedError (new). - ExternalAuthProvider raises NotSupportedError (new). - Silent non-PAT provider raises NotSupportedError (new) — reject the type itself rather than trying to extract a token we already know we can't use. Live e2e against dogfood with use_sea=True (PAT): all checks still pass (SELECT 1, range(10000), fetchmany pacing, four metadata calls, session_configuration round-trip, structured DatabaseError on bad SQL). Co-authored-by: Isaac Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
1 parent 9259644 commit 2572362

2 files changed

Lines changed: 76 additions & 99 deletions

File tree

Lines changed: 22 additions & 56 deletions
Original file line numberDiff line numberDiff line change
@@ -1,25 +1,19 @@
11
"""Translate the connector's ``AuthProvider`` into ``databricks_sql_kernel``
22
``Session`` auth kwargs.
33
4-
The connector already implements every auth flow it supports (PAT,
5-
OAuth M2M, OAuth U2M, external token providers, federation). The
6-
kernel must not re-implement them. Decision D9 in the integration
7-
design: PAT goes through the kernel's PAT path; everything else
8-
delegates back to the connector via the kernel's ``External``
9-
trampoline, with a Python callback that returns a fresh bearer
10-
token.
4+
This phase ships PAT only. The kernel-side PyO3 binding accepts
5+
``auth_type='pat'``; OAuth / federation / custom credentials
6+
providers are reserved but not yet wired in either layer. Non-PAT
7+
auth raises ``NotSupportedError`` from this bridge so the failure
8+
surfaces at session-open time with a clear message rather than
9+
deep inside the kernel.
1110
1211
Token extraction goes through ``AuthProvider.add_headers({})``
1312
rather than touching auth-provider-specific attributes, so the
14-
bridge works for every subclass — including custom providers a
15-
caller may have wired in.
16-
17-
End-to-end limitation: the kernel's
18-
``build_auth_provider`` currently rejects ``AuthConfig::External``
19-
("reserved; v0 wires PAT + OAuthM2M + OAuthU2M only"). Until the
20-
kernel-side follow-up PR lands, non-PAT auth surfaces a clear
21-
``KernelError(code='InvalidArgument', message='AuthConfig::External
22-
is reserved...')`` from ``Session.open_session``. PAT works today.
13+
bridge works uniformly for every PAT shape — including
14+
``AccessTokenAuthProvider`` wrapped in ``TokenFederationProvider``
15+
(which ``get_python_sql_connector_auth_provider`` does for every
16+
provider it builds).
2317
"""
2418

2519
from __future__ import annotations
@@ -29,6 +23,7 @@
2923

3024
from databricks.sql.auth.authenticators import AccessTokenAuthProvider, AuthProvider
3125
from databricks.sql.auth.token_federation import TokenFederationProvider
26+
from databricks.sql.exc import NotSupportedError
3227

3328
logger = logging.getLogger(__name__)
3429

@@ -64,8 +59,8 @@ def _extract_bearer_token(auth_provider: AuthProvider) -> Optional[str]:
6459
provider-specific internals.
6560
6661
Returns ``None`` if the provider did not write an Authorization
67-
header or wrote a non-Bearer scheme — neither shape is
68-
representable in the kernel's auth surface today.
62+
header or wrote a non-Bearer scheme — neither is representable
63+
in the kernel's PAT auth surface.
6964
"""
7065
headers: Dict[str, str] = {}
7166
auth_provider.add_headers(headers)
@@ -80,29 +75,13 @@ def _extract_bearer_token(auth_provider: AuthProvider) -> Optional[str]:
8075
def kernel_auth_kwargs(auth_provider: AuthProvider) -> Dict[str, Any]:
8176
"""Build the kwargs passed to ``databricks_sql_kernel.Session(...)``.
8277
83-
Two routing decisions:
84-
85-
1. ``AccessTokenAuthProvider`` → ``auth_type='pat'`` with the
86-
static token. Kernel uses it verbatim for every request.
87-
2. Anything else → ``auth_type='external'`` with a callback that
88-
calls ``auth_provider.add_headers({})`` and returns the
89-
fresh bearer token. The connector keeps owning the OAuth /
90-
MSAL / federation flow; the kernel asks for a token whenever
91-
it needs one.
92-
93-
The PAT special-case exists because it's the only path the
94-
kernel actually serves end-to-end today. Once the kernel-side
95-
External enablement lands, PAT could collapse into the
96-
External path too (one callback that returns the static token);
97-
but keeping the explicit ``pat`` route means the kernel does
98-
not pay the GIL-reacquire cost on every HTTP request for PAT
99-
users.
78+
PAT (including ``TokenFederationProvider``-wrapped PAT) routes
79+
through the kernel's PAT path. Anything else raises
80+
``NotSupportedError`` — the kernel binding doesn't accept OAuth
81+
today, and routing OAuth through PAT would silently break
82+
token refresh during long-running sessions.
10083
"""
10184
if _is_pat(auth_provider):
102-
# PAT case: pull the static token out and feed the kernel's
103-
# PAT path. We go through ``add_headers`` regardless of
104-
# whether the provider was wrapped in TokenFederation or
105-
# not — both shapes write the same Authorization header.
10685
token = _extract_bearer_token(auth_provider)
10786
if not token:
10887
raise ValueError(
@@ -111,21 +90,8 @@ def kernel_auth_kwargs(auth_provider: AuthProvider) -> Dict[str, Any]:
11190
)
11291
return {"auth_type": "pat", "access_token": token}
11392

114-
# Every other provider: trampoline a callback. The callback is
115-
# invoked once per HTTP request that needs auth (the kernel does
116-
# not cache the returned token), so the auth_provider's own
117-
# caching is what keeps this fast.
118-
def token_callback() -> str:
119-
token = _extract_bearer_token(auth_provider)
120-
if not token:
121-
raise RuntimeError(
122-
f"{type(auth_provider).__name__}.add_headers did not produce "
123-
"a Bearer Authorization header; cannot supply a token to the kernel"
124-
)
125-
return token
126-
127-
logger.debug(
128-
"Routing %s through kernel External trampoline",
129-
type(auth_provider).__name__,
93+
raise NotSupportedError(
94+
f"The kernel backend (use_sea=True) currently only supports PAT auth, "
95+
f"but got {type(auth_provider).__name__}. Use use_sea=False (Thrift) "
96+
"for OAuth / federation / custom credential providers."
13097
)
131-
return {"auth_type": "external", "token_callback": token_callback}
Lines changed: 54 additions & 43 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,12 @@
11
"""Unit tests for the kernel backend's auth bridge.
22
3-
The bridge translates the connector's ``AuthProvider`` hierarchy
4-
into ``databricks_sql_kernel.Session`` auth kwargs. PAT goes through
5-
the kernel's PAT path; everything else trampolines through the
6-
``External`` path with a Python callback.
3+
Phase 1 ships PAT only. Tests verify:
4+
- PAT routes through ``auth_type='pat'``.
5+
- ``TokenFederationProvider``-wrapped PAT also routes through
6+
PAT (every provider built by ``get_python_sql_connector_auth_provider``
7+
is federation-wrapped, so the naive isinstance check has to
8+
look through the wrapper).
9+
- Anything else raises ``NotSupportedError`` with a clear message.
710
"""
811

912
from __future__ import annotations
@@ -12,38 +15,36 @@
1215

1316
import pytest
1417

15-
from databricks.sql.auth.authenticators import AccessTokenAuthProvider, AuthProvider
18+
from databricks.sql.auth.authenticators import (
19+
AccessTokenAuthProvider,
20+
AuthProvider,
21+
DatabricksOAuthProvider,
22+
ExternalAuthProvider,
23+
)
1624
from databricks.sql.backend.kernel.auth_bridge import (
1725
_extract_bearer_token,
1826
kernel_auth_kwargs,
1927
)
28+
from databricks.sql.exc import NotSupportedError
2029

2130

2231
class _FakeOAuthProvider(AuthProvider):
23-
"""Stand-in for OAuth/MSAL/federation providers — anything that
24-
isn't ``AccessTokenAuthProvider``. Returns a counter-stamped
25-
token so tests can prove the callback is invoked each call."""
26-
27-
def __init__(self):
28-
self.calls = 0
32+
"""Stand-in for any non-PAT provider. The bridge should reject
33+
these with NotSupportedError."""
2934

3035
def add_headers(self, request_headers):
31-
self.calls += 1
32-
request_headers["Authorization"] = f"Bearer token-{self.calls}"
36+
request_headers["Authorization"] = "Bearer oauth-token-xyz"
3337

3438

3539
class _MalformedProvider(AuthProvider):
36-
"""Provider that returns a non-Bearer Authorization header
37-
(e.g. Basic auth). The bridge should reject this rather than
38-
silently sending the wrong shape to the kernel."""
40+
"""Provider that returns a non-Bearer Authorization header."""
3941

4042
def add_headers(self, request_headers):
4143
request_headers["Authorization"] = "Basic dXNlcjpwYXNz"
4244

4345

4446
class _SilentProvider(AuthProvider):
45-
"""Provider that writes nothing — represents misconfigured
46-
auth or a placeholder. The bridge must surface this clearly."""
47+
"""Provider that writes nothing — misconfigured auth."""
4748

4849
def add_headers(self, request_headers):
4950
pass
@@ -74,43 +75,53 @@ def test_federation_wrapped_pat_routes_to_kernel_pat(self):
7475
underlying ``AccessTokenAuthProvider``."""
7576
from databricks.sql.auth.token_federation import TokenFederationProvider
7677

77-
# TokenFederationProvider needs an http_client; a MagicMock
78-
# is sufficient because we don't trigger any token exchange
79-
# in the test (the cached-token path is never hit).
8078
base = AccessTokenAuthProvider("dapi-abc")
79+
# TokenFederationProvider's __init__ requires an http_client
80+
# to construct cleanly; for this unit test we only exercise
81+
# the add_headers passthrough + the external_provider
82+
# attribute. Bypass __init__ with __new__ and stash just
83+
# the fields the bridge touches.
8184
federated = TokenFederationProvider.__new__(TokenFederationProvider)
8285
federated.external_provider = base
83-
# The bridge only touches `add_headers` (delegated to the
84-
# base) and `external_provider`. Other attrs would be set
85-
# by __init__ but aren't exercised here.
8686
federated.add_headers = base.add_headers
8787
kwargs = kernel_auth_kwargs(federated)
8888
assert kwargs == {"auth_type": "pat", "access_token": "dapi-abc"}
8989

90-
def test_pat_with_silent_provider_raises(self):
90+
def test_pat_with_silent_provider_raises_value_error(self):
9191
"""An AccessTokenAuthProvider that produces no Authorization
9292
header is misconfigured; surface that at bridge-build time,
9393
not on the first kernel HTTP request."""
9494
broken = AccessTokenAuthProvider("dapi-x")
95-
# Force the broken state by monkey-patching add_headers.
9695
broken.add_headers = lambda h: None # type: ignore[method-assign]
9796
with pytest.raises(ValueError, match="Bearer"):
9897
kernel_auth_kwargs(broken)
9998

100-
def test_oauth_routes_to_external_trampoline(self):
101-
provider = _FakeOAuthProvider()
102-
kwargs = kernel_auth_kwargs(provider)
103-
assert kwargs["auth_type"] == "external"
104-
callback = kwargs["token_callback"]
105-
assert callable(callback)
106-
# First call -> token-1, second call -> token-2. Proves the
107-
# callback delegates to the live auth_provider each time
108-
# rather than caching.
109-
assert callback() == "token-1"
110-
assert callback() == "token-2"
111-
assert provider.calls == 2
112-
113-
def test_external_callback_raises_on_missing_header(self):
114-
kwargs = kernel_auth_kwargs(_SilentProvider())
115-
with pytest.raises(RuntimeError, match="Bearer"):
116-
kwargs["token_callback"]()
99+
def test_generic_oauth_provider_raises_not_supported(self):
100+
with pytest.raises(NotSupportedError, match="only supports PAT"):
101+
kernel_auth_kwargs(_FakeOAuthProvider())
102+
103+
def test_external_credentials_provider_raises_not_supported(self):
104+
"""``ExternalAuthProvider`` wraps user-supplied
105+
credentials_provider — kernel doesn't accept these today,
106+
and the bridge surfaces that explicitly."""
107+
# ExternalAuthProvider's __init__ calls the credentials
108+
# provider; supply a noop one.
109+
from databricks.sql.auth.authenticators import CredentialsProvider
110+
111+
class _NoopCreds(CredentialsProvider):
112+
def auth_type(self):
113+
return "noop"
114+
115+
def __call__(self, *args, **kwargs):
116+
return lambda: {"Authorization": "Bearer noop"}
117+
118+
ext = ExternalAuthProvider(_NoopCreds())
119+
with pytest.raises(NotSupportedError, match="only supports PAT"):
120+
kernel_auth_kwargs(ext)
121+
122+
def test_silent_non_pat_provider_also_raises_not_supported(self):
123+
"""Even if a non-PAT provider produces no header, the bridge
124+
rejects the type itself — we don't try to extract a token
125+
from something we already know is unsupported."""
126+
with pytest.raises(NotSupportedError):
127+
kernel_auth_kwargs(_SilentProvider())

0 commit comments

Comments
 (0)