
Commit 24e9a5c
feat(backend/kernel): introduce dedicated use_kernel flag + substantive review fixes
Major change: route the kernel backend through a new ``use_kernel=True`` connection kwarg instead of repurposing ``use_sea=True``. ``use_sea=True`` once again routes to the native pure-Python SEA backend (no behaviour change); ``use_kernel=True`` routes to the Rust kernel via PyO3. The two flags are mutually exclusive.

This addresses the largest reviewer concern from the multi-agent review: silently hijacking a documented public flag broke OAuth / federation / parameter-binding callers on ``use_sea=True`` who had no opt-out. With the new flag, the kernel backend is fully opt-in and existing ``use_sea=True`` users continue to get the native SEA backend they signed up for.

Other substantive fixes:

- session.py: restore ``SeaDatabricksClient`` import + routing. Reject ``use_kernel=True`` + ``use_sea=True`` together with a clear ``ValueError``.
- client.py (kernel ``Cursor.columns``): update docstring to flag the ``catalog_name=None`` divergence — kernel requires a catalog, Thrift / native SEA do not (F13).
- conftest.py: drop the collection-time ``pytest_collection_modifyitems`` hook that was skipping ``extra_params={"use_sea": True}`` cases. With ``use_sea=True`` back on the native SEA backend, those cases run as they did before this PR (F8).
- kernel/client.py: ``get_tables`` now applies the ``table_types`` filter client-side using ``ResultSetFilter._filter_arrow_table`` (the same helper the native SEA backend uses), wrapped in a tiny ``_StaticArrowHandle`` that flows the filtered table back through the normal ``KernelResultSet`` path. Replaces the previous "log a warning and return unfiltered" behaviour (F4).
- kernel/client.py: guard ``_async_handles`` with ``threading.RLock`` so concurrent cursors on the same connection don't race on submit / close / close-session (F15).
- kernel/result_set.py: ``KernelResultSet.close()`` now drops the entry from ``backend._async_handles`` so async-submitted statements don't leave stale references behind (F5).
- kernel/{__init__,client,auth_bridge}.py, tests/e2e/test_kernel_backend.py: update docstrings, error messages, and the e2e fixture to refer to ``use_kernel=True`` instead of ``use_sea=True``.
- client.py (``Connection`` docstring): document the new ``use_kernel`` kwarg + its Phase-1 limitations.

New tests:

- tests/unit/test_kernel_client.py (38 cases): cover the 14-entry ``_CODE_TO_EXCEPTION`` table, ``_reraise_kernel_error`` attribute forwarding, the 6-entry ``_STATE_TO_COMMAND_STATE`` table, the no-open-session guards on every method, ``open_session`` double-open, ``parameters`` / ``query_tags`` rejection, ``get_columns``' catalog-required check, ``cancel_command`` / ``close_command`` no-handle tolerance, ``get_query_state`` sync-path SUCCEEDED, the Failed-state re-raise, the synthetic-command-id UUID shape, and ``close_session`` cleanup even when per-handle close errors fire. Uses a fake ``databricks_sql_kernel`` module installed into ``sys.modules`` so the test runs with no Rust extension dependency (F9).

77/77 kernel unit tests pass.

Co-authored-by: Isaac
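For a quick sense of the new surface, here is a minimal usage sketch of the routing described above. The hostname, HTTP path, and token are placeholders; the flag behaviour and the ``ValueError`` are from this commit.

```python
from databricks import sql

params = dict(
    server_hostname="example.cloud.databricks.com",  # placeholder
    http_path="/sql/1.0/warehouses/abc123",          # placeholder
    access_token="dapi-example",                     # placeholder PAT
)

# Default: Thrift backend, unchanged by this commit.
conn = sql.connect(**params)

# Native pure-Python SEA backend: same behaviour as before this PR.
conn = sql.connect(**params, use_sea=True)

# New opt-in: the Rust kernel via PyO3. Raises ImportError unless
# `pip install databricks-sql-kernel` has been run; PAT auth only today.
conn = sql.connect(**params, use_kernel=True)

# The two flags are mutually exclusive; session.py rejects the combination.
try:
    sql.connect(**params, use_sea=True, use_kernel=True)
except ValueError as exc:
    print(exc)
```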
1 parent 37fa544 · commit 24e9a5c

9 files changed: 550 additions & 80 deletions


conftest.py

Lines changed: 0 additions & 34 deletions
@@ -1,41 +1,7 @@
-import importlib.util
 import os
 import pytest
 
 
-def _kernel_wheel_available() -> bool:
-    """The ``use_sea=True`` code path now routes through the Rust
-    kernel via PyO3. The ``databricks_sql_kernel`` wheel is not
-    yet on PyPI (built from a separate repo); CI environments
-    without it should skip ``use_sea=True`` parametrized cases
-    rather than fail with a hard ImportError."""
-    return importlib.util.find_spec("databricks_sql_kernel") is not None
-
-
-def pytest_collection_modifyitems(config, items):
-    """Skip parametrized test cases that pass ``use_sea=True`` when
-    the kernel wheel isn't installed.
-
-    The existing e2e suite uses ``@pytest.mark.parametrize(
-    "extra_params", [{}, {"use_sea": True}])`` to exercise both
-    backends. When the kernel wheel is missing those cases die at
-    ``connect()`` time with our pointed ImportError; mark them
-    skipped at collection time so CI signal stays accurate.
-    """
-    if _kernel_wheel_available():
-        return
-    skip_marker = pytest.mark.skip(
-        reason="use_sea=True requires databricks-sql-kernel (not installed)"
-    )
-    for item in items:
-        params = getattr(item, "callspec", None)
-        if params is None:
-            continue
-        extra_params = params.params.get("extra_params")
-        if isinstance(extra_params, dict) and extra_params.get("use_sea") is True:
-            item.add_marker(skip_marker)
-
-
 @pytest.fixture(scope="session")
 def host():
     return os.getenv("DATABRICKS_SERVER_HOSTNAME")
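For context, the parametrized pattern the deleted hook used to intercept (quoted in its docstring) looks roughly like this. The test body and the extra connection kwargs are illustrative, not taken from the suite:

```python
import os

import pytest

from databricks import sql


@pytest.mark.parametrize("extra_params", [{}, {"use_sea": True}])
def test_select_one(host, extra_params):
    # With use_sea=True routed back to the native SEA backend, this
    # case no longer needs the kernel wheel and runs as before the PR.
    with sql.connect(
        server_hostname=host,
        http_path=os.getenv("DATABRICKS_HTTP_PATH"),  # assumed env var
        access_token=os.getenv("DATABRICKS_TOKEN"),   # assumed env var
        **extra_params,
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            assert cursor.fetchone()[0] == 1
```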

src/databricks/sql/backend/kernel/__init__.py

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 """Backend that delegates to the Databricks SQL Kernel (Rust) via PyO3.
 
-Routed when ``use_sea=True`` is passed to ``databricks.sql.connect``.
+Routed when ``use_kernel=True`` is passed to ``databricks.sql.connect``.
 The module's identity is "delegates to the kernel" — not the wire
 protocol the kernel happens to use today (SEA REST). The kernel may
 switch its default transport (SEA REST → SEA gRPC → …) without
@@ -18,7 +18,7 @@
     from databricks.sql.backend.kernel.client import KernelDatabricksClient
 
 ``session.py::_create_backend`` already does this lazy import under
-the ``use_sea=True`` branch.
+the ``use_kernel=True`` branch.
 
 See ``docs/designs/pysql-kernel-integration.md`` in
 ``databricks-sql-kernel`` for the full integration design.
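The lazy-import branch this docstring points at, sketched from the commit message: the routing shape, ``KernelDatabricksClient``, ``SeaDatabricksClient``, and the ``ValueError`` are described there, while the SEA and Thrift import paths below are assumptions.

```python
# Sketch of session.py::_create_backend routing, not the literal source.
def _create_backend(use_sea: bool = False, use_kernel: bool = False, **kwargs):
    if use_sea and use_kernel:
        raise ValueError(
            "use_sea=True and use_kernel=True are mutually exclusive"
        )
    if use_kernel:
        # Lazy import keeps the Rust wheel optional for everyone else.
        from databricks.sql.backend.kernel.client import KernelDatabricksClient

        return KernelDatabricksClient(**kwargs)
    if use_sea:
        from databricks.sql.backend.sea.backend import SeaDatabricksClient  # assumed path

        return SeaDatabricksClient(**kwargs)
    from databricks.sql.backend.thrift_backend import ThriftDatabricksClient  # assumed path

    return ThriftDatabricksClient(**kwargs)
```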

src/databricks/sql/backend/kernel/auth_bridge.py

Lines changed: 3 additions & 3 deletions
@@ -105,7 +105,7 @@ def kernel_auth_kwargs(auth_provider: AuthProvider) -> Dict[str, Any]:
         return {"auth_type": "pat", "access_token": token}
 
     raise NotSupportedError(
-        f"The kernel backend (use_sea=True) currently only supports PAT auth, "
-        f"but got {type(auth_provider).__name__}. Use use_sea=False (Thrift) "
-        "for OAuth / federation / custom credential providers."
+        f"The kernel backend (use_kernel=True) currently only supports PAT auth, "
+        f"but got {type(auth_provider).__name__}. Use the Thrift backend "
+        "(default) for OAuth / federation / custom credential providers."
     )
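In caller terms, the bridge's contract after this change. Only ``kernel_auth_kwargs``' signature and its return shape come from the diff; ``AccessTokenAuthProvider`` and its import path are an assumed stand-in for the connector's PAT provider.

```python
from databricks.sql.auth.authenticators import AccessTokenAuthProvider  # assumed import
from databricks.sql.backend.kernel.auth_bridge import kernel_auth_kwargs

# PAT providers map to the kernel's auth kwargs.
pat = AccessTokenAuthProvider("dapi-example")  # placeholder token
assert kernel_auth_kwargs(pat) == {"auth_type": "pat", "access_token": "dapi-example"}

# Anything else (OAuth, federation, custom providers) raises
# NotSupportedError with the use_kernel=True wording from this hunk.
```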

src/databricks/sql/backend/kernel/client.py

Lines changed: 91 additions & 21 deletions
@@ -1,6 +1,6 @@
 """``DatabricksClient`` backed by the Rust kernel via PyO3.
 
-Routed when ``use_sea=True``. Constructor takes the connector's
+Routed when ``use_kernel=True``. Constructor takes the connector's
 already-built ``auth_provider`` and forwards everything else to the
 kernel's ``Session``. Every kernel call goes through this thin
 wrapper; this module is the single seam between the connector's
@@ -34,6 +34,7 @@
 from __future__ import annotations
 
 import logging
+import threading
 import uuid
 from typing import Any, Dict, List, Optional, TYPE_CHECKING, Union
 
@@ -71,7 +72,7 @@
     # (doing so breaks `poetry lock`). Once published the install
     # hint will move to `pip install 'databricks-sql-connector[kernel]'`.
     raise ImportError(
-        "use_sea=True requires the databricks-sql-kernel package. Install it with:\n"
+        "use_kernel=True requires the databricks-sql-kernel package. Install it with:\n"
         "  pip install databricks-sql-kernel\n"
         "or for local development from the kernel repo:\n"
         "  cd databricks-sql-kernel/pyo3 && maturin develop --release"
@@ -176,7 +177,10 @@ def __init__(
         self._session_id: Optional[SessionId] = None
         # Async-exec handles keyed by CommandId.guid. Populated by
        # ``execute_command(async_op=True)``; drained by ``close_command``.
+        # Guarded by ``_async_handles_lock`` so concurrent cursors on the
+        # same connection don't race on submit / close / close-session.
         self._async_handles: Dict[str, Any] = {}
+        self._async_handles_lock = threading.RLock()
 
     # ── Session lifecycle ──────────────────────────────────────────
 
@@ -226,14 +230,16 @@ def close_session(self, session_id: SessionId) -> None:
             return
         # Close any tracked async handles first so they fire their
         # server-side CloseStatement before the session goes away.
-        for handle in list(self._async_handles.values()):
+        with self._async_handles_lock:
+            handles_to_close = list(self._async_handles.values())
+            self._async_handles.clear()
+        for handle in handles_to_close:
             try:
                 handle.close()
             except _kernel.KernelError as exc:
                 logger.warning(
                     "Error closing async handle during session close: %s", exc
                 )
-        self._async_handles.clear()
         try:
             self._kernel_session.close()
         except _kernel.KernelError as exc:
@@ -280,7 +286,8 @@ def execute_command(
                 async_exec = stmt.submit()
                 command_id = CommandId.from_sea_statement_id(async_exec.statement_id)
                 cursor.active_command_id = command_id
-                self._async_handles[command_id.guid] = async_exec
+                with self._async_handles_lock:
+                    self._async_handles[command_id.guid] = async_exec
                 return None
             executed = stmt.execute()
         except _kernel.KernelError as exc:
@@ -300,7 +307,8 @@ def execute_command(
         return self._make_result_set(executed, cursor, command_id)
 
     def cancel_command(self, command_id: CommandId) -> None:
-        handle = self._async_handles.get(command_id.guid)
+        with self._async_handles_lock:
+            handle = self._async_handles.get(command_id.guid)
         if handle is None:
             # Sync-execute paths fully materialise the result before
             # ``execute_command`` returns, so by the time
@@ -314,7 +322,8 @@ def cancel_command(self, command_id: CommandId) -> None:
             raise _reraise_kernel_error(exc)
 
     def close_command(self, command_id: CommandId) -> None:
-        handle = self._async_handles.pop(command_id.guid, None)
+        with self._async_handles_lock:
+            handle = self._async_handles.pop(command_id.guid, None)
         if handle is None:
             logger.debug("close_command: no tracked handle for %s", command_id)
             return
@@ -324,7 +333,8 @@ def close_command(self, command_id: CommandId) -> None:
             raise _reraise_kernel_error(exc)
 
     def get_query_state(self, command_id: CommandId) -> CommandState:
-        handle = self._async_handles.get(command_id.guid)
+        with self._async_handles_lock:
+            handle = self._async_handles.get(command_id.guid)
         if handle is None:
             # No tracked async handle means execute_command ran
             # sync and the result was materialised before returning;
@@ -347,7 +357,8 @@ def get_execution_result(
         command_id: CommandId,
         cursor: "Cursor",
     ) -> "ResultSet":
-        handle = self._async_handles.get(command_id.guid)
+        with self._async_handles_lock:
+            handle = self._async_handles.get(command_id.guid)
         if handle is None:
             raise ProgrammingError(
                 "get_execution_result called for an unknown command_id; "
@@ -438,16 +449,6 @@ def get_tables(
     ) -> "ResultSet":
         if self._kernel_session is None:
             raise InterfaceError("get_tables requires an open session.")
-        if table_types:
-            # Documented gap: native SEA backend filters here, but
-            # its filter is keyed on SeaResultSet. Day-1 we surface
-            # the unfiltered result; a small follow-up ports the
-            # filter to operate on KernelResultSet.
-            logger.warning(
-                "get_tables: client-side table_types filter not yet implemented "
-                "on the kernel backend; returning unfiltered rows for %r",
-                table_types,
-            )
         try:
             stream = self._kernel_session.metadata().list_tables(
                 catalog=catalog_name,
@@ -457,7 +458,27 @@ def get_tables(
             )
         except _kernel.KernelError as exc:
             raise _reraise_kernel_error(exc)
-        return self._make_result_set(stream, cursor, self._synthetic_command_id())
+        if not table_types:
+            return self._make_result_set(stream, cursor, self._synthetic_command_id())
+        # The kernel today returns the unfiltered ``SHOW TABLES`` shape
+        # regardless of ``table_types``. Drain to a single Arrow table
+        # and apply the same client-side filter the native SEA backend
+        # uses (column index 5 is TABLE_TYPE, case-sensitive). Cheap
+        # because metadata result sets are small.
+        from databricks.sql.backend.sea.utils.filters import ResultSetFilter
+
+        full_table = _drain_kernel_handle(stream)
+        filtered_table = ResultSetFilter._filter_arrow_table(
+            full_table,
+            column_name=full_table.schema.field(5).name,
+            allowed_values=table_types,
+            case_sensitive=True,
+        )
+        return self._make_result_set(
+            _StaticArrowHandle(filtered_table),
+            cursor,
+            self._synthetic_command_id(),
+        )
 
     def get_columns(
         self,
@@ -496,7 +517,7 @@ def get_columns(
     def max_download_threads(self) -> int:
         # CloudFetch parallelism lives kernel-side. This property is
         # consulted by Thrift code paths that don't run for
-        # use_sea=True; return a non-zero default so anything that
+        # use_kernel=True; return a non-zero default so anything that
         # peeks at it does not divide by zero.
         return 10
 
@@ -509,3 +530,52 @@
     "Cancelled": CommandState.CANCELLED,
     "Closed": CommandState.CLOSED,
 }
+
+
+def _drain_kernel_handle(handle: Any) -> Any:
+    """Drain a kernel ResultStream / ExecutedStatement into a single
+    ``pyarrow.Table``. Used by ``get_tables`` to apply a client-side
+    ``table_types`` filter on a metadata result; cheap because
+    metadata streams are small."""
+    import pyarrow
+
+    schema = handle.arrow_schema()
+    batches = []
+    while True:
+        batch = handle.fetch_next_batch()
+        if batch is None:
+            break
+        if batch.num_rows > 0:
+            batches.append(batch)
+    try:
+        handle.close()
+    except _kernel.KernelError:
+        pass
+    return pyarrow.Table.from_batches(batches, schema=schema)
+
+
+class _StaticArrowHandle:
+    """Duck-typed kernel handle that replays a pre-built
+    ``pyarrow.Table`` through ``arrow_schema()`` /
+    ``fetch_next_batch()`` / ``close()``. Used to wrap a
+    post-processed table (e.g., the ``table_types``-filtered output
+    of ``get_tables``) so it flows back through the normal
+    ``KernelResultSet`` path."""
+
+    def __init__(self, table: Any) -> None:
+        self._schema = table.schema
+        self._batches = list(table.to_batches())
+        self._idx = 0
+
+    def arrow_schema(self) -> Any:
+        return self._schema
+
+    def fetch_next_batch(self) -> Optional[Any]:
+        if self._idx >= len(self._batches):
+            return None
+        batch = self._batches[self._idx]
+        self._idx += 1
+        return batch
+
+    def close(self) -> None:
+        self._batches = []
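Since ``_StaticArrowHandle`` exposes the same duck-typed surface ``_drain_kernel_handle`` consumes, the two round-trip cleanly. A minimal check, assuming ``databricks_sql_kernel`` is importable or stubbed into ``sys.modules`` the way the new unit tests do; the toy table contents are placeholders:

```python
import pyarrow as pa

from databricks.sql.backend.kernel.client import (
    _drain_kernel_handle,
    _StaticArrowHandle,
)

table = pa.table({"TABLE_NAME": ["t1", "t2", "t3"],
                  "TABLE_TYPE": ["TABLE", "VIEW", "TABLE"]})

handle = _StaticArrowHandle(table)
assert handle.arrow_schema() == table.schema

# Drains every batch, closes the handle, and rebuilds one Table.
drained = _drain_kernel_handle(handle)
assert drained.num_rows == 3
assert handle.fetch_next_batch() is None  # closed: batches emptied
```

This is the same shape ``get_tables`` feeds back into ``_make_result_set`` after filtering.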

src/databricks/sql/backend/kernel/result_set.py

Lines changed: 14 additions & 0 deletions
@@ -226,7 +226,21 @@ def close(self) -> None:
             # level; log and swallow so the cursor's __del__ /
             # connection close path stays clean.
             logger.warning("Error closing kernel handle: %s", exc)
+        # Drop the entry from the backend's async-handle map (if
+        # present) — for async-submitted statements the handle is
+        # tracked there and the base ``ResultSet.close`` path would
+        # otherwise leave a stale entry pointing at a closed handle.
+        # No-op for the sync-execute and metadata paths, which never
+        # register in ``_async_handles``.
+        guid = getattr(self.command_id, "guid", None)
+        if guid is not None:
+            self.backend._async_handles_lock.acquire()
+            try:
+                self.backend._async_handles.pop(guid, None)
+            finally:
+                self.backend._async_handles_lock.release()
         self._buffer.clear()
+        self._buffered_count = 0
         self._kernel_handle = None
         self._exhausted = True
         self.has_been_closed_server_side = True
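The cleanup contract the new block enforces, reduced to a stub-based sketch in the spirit of the fake-module unit tests; the ``SimpleNamespace`` stand-ins model only the attributes ``close()`` touches:

```python
import threading
from types import SimpleNamespace

backend = SimpleNamespace(
    _async_handles={"abc-123": object()},  # one tracked async handle
    _async_handles_lock=threading.RLock(),
)
command_id = SimpleNamespace(guid="abc-123")

# What the added close() block does: pop the guid under the lock.
with backend._async_handles_lock:
    backend._async_handles.pop(command_id.guid, None)
assert command_id.guid not in backend._async_handles

# A second close is a no-op thanks to pop(..., None); sync-execute and
# metadata paths (never registered) take the same harmless path.
with backend._async_handles_lock:
    backend._async_handles.pop(command_id.guid, None)
```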

src/databricks/sql/client.py

Lines changed: 17 additions & 1 deletion
@@ -115,7 +115,17 @@ def __init__(
 
         Parameters:
         :param use_sea: `bool`, optional (default is False)
-            Use the SEA backend instead of the Thrift backend.
+            Use the native pure-Python SEA backend instead of
+            the Thrift backend.
+        :param use_kernel: `bool`, optional (default is False)
+            Route the connection through the Rust kernel
+            (``databricks-sql-kernel`` via PyO3). Requires the
+            kernel wheel to be installed separately
+            (``pip install databricks-sql-kernel``); raises
+            ImportError otherwise. In active development —
+            PAT auth only today; OAuth / federation / external
+            credentials and native parameter binding land in
+            follow-ups. Mutually exclusive with ``use_sea``.
         :param use_hybrid_disposition: `bool`, optional (default is False)
             Use the hybrid disposition instead of the inline disposition.
         :param server_hostname: Databricks instance host name.
@@ -1575,6 +1585,12 @@ def columns(
         Get columns corresponding to the catalog_name, schema_name, table_name and column_name.
 
         Names can contain % wildcards.
+
+        Note: on ``use_kernel=True``, ``catalog_name`` is required —
+        the kernel's underlying ``SHOW COLUMNS`` cannot span catalogs.
+        Passing ``catalog_name=None`` raises ``ProgrammingError``. The
+        Thrift and native SEA backends accept ``catalog_name=None``.
+
         :returns self
         """
         self._check_not_closed()
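The documented divergence in caller terms; connection kwargs are placeholders and only the ``catalog_name`` behaviour comes from the diff:

```python
from databricks import sql

conn = sql.connect(
    server_hostname="example.cloud.databricks.com",  # placeholder
    http_path="/sql/1.0/warehouses/abc123",          # placeholder
    access_token="dapi-example",                     # placeholder
    use_kernel=True,
)
cursor = conn.cursor()

# Works on every backend: catalog pinned explicitly.
cursor.columns(catalog_name="main", schema_name="default", table_name="t%")

# Thrift and native SEA: spans catalogs. Kernel backend: raises
# ProgrammingError, since its SHOW COLUMNS needs a concrete catalog.
cursor.columns(schema_name="default", table_name="t%")
```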
