Sync jdk-8 with main: 3.2.2-SNAPSHOT by samikshya-db · Pull Request #1275 · databricks/databricks-jdbc

samikshya-db · 2026-03-12T03:50:23Z

Summary

Updates the jdk-8 branch, as main has received several improvements since the last sync in September 2025 (motivated by customer usage).
Merges 68 commits from main into jdk-8 (through 3.2.2-SNAPSHOT, including the PR #1243 Arrow patch, "[PECOBLR-1121] Arrow patch to circumvent Arrow issues with JDK 16+", and the multi-module restructure)
Applies all JDK 8 compatibility transformations per the standard sync process

JDK 8 Transformations Applied

pom.xml (root):

Set maven.compiler.source/target → 1.8
Pin arrow.version=13.0.0 (Arrow 14+ requires Java 11)
Pin mockito.version=4.11.0 (Mockito 5.x requires Java 11)
Pin nimbusjose.version=9.47 (nimbus-jose-jwt 10.x requires Java 11)
Remove wiremock.version (WireMock 3.x requires Java 11)
Add jdk8 profile to skip spotless (plugin 2.39.0 requires Java 11)
Remove spotless <excludes> for Arrow patched classes

jdbc-core/pom.xml:

Remove --add-opens=java.base/java.nio=ALL-UNNAMED (invalid on JDK 8)
Delete jdk17-NioNotOpen and jdk21-NioNotOpen profiles
Remove JaCoCo exclusions for Arrow patch classes
Remove !Jvm17PlusAndArrowToNioReflectionDisabled group filter from local profile
Remove wiremock test dependency
Add fakeservice/e2e exclusions to surefire

Source code:

Remove all Arrow patch files introduced in PR #1243, "[PECOBLR-1121] Arrow patch to circumvent Arrow issues with JDK 16+" (an NIO workaround for JDK 16+ that is not needed on JDK 8): MemoryUtil, ArrowBuf, DecimalUtility, DatabricksArrowBuf, DatabricksBufferAllocator, DatabricksAllocationReservation, DatabricksReferenceManager*
Simplify ArrowBufferAllocator to always use RootAllocator (no patched fallback needed on JDK 8)
Remove isUsingPatchedAllocator() call from TelemetryHelper (always false on JDK 8)

Tests:

Remove fakeservice/e2e tests (WireMock 3.x / Java 11+)
Remove all Arrow patch tests
Restore unit tests: test helper classes (TestConstants, FeatureFlagTestUtil, TelemetryAuthHelper, ConfiguratorUtilsTest)
Fix JDK 8 incompatibilities in tests: remove var keyword, replace diamond-with-anonymous-class patterns
Remove JDBC 4.3 (setShardingKey*) test assertions from DatabricksConnectionTest
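
The two syntax fixes in the test list above can be illustrated with a minimal sketch. It is not taken from the driver's test sources; `Jdk8SyntaxExample` and `sortedByLength` are hypothetical names used only to show the JDK 8 compatible form next to the newer syntax it replaces.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical example; these names do not come from the driver's test suite.
class Jdk8SyntaxExample {

  static List<String> sortedByLength(List<String> input) {
    // Java 10+ form: var copy = new ArrayList<String>(input);
    // JDK 8 form: declare the variable type explicitly.
    List<String> copy = new ArrayList<String>(input);

    // Java 9+ form: new Comparator<>() { ... } (diamond with an anonymous class).
    // JDK 8 form: spell out the type argument on the anonymous class.
    Comparator<String> byLength =
        new Comparator<String>() {
          @Override
          public int compare(String a, String b) {
            return Integer.compare(a.length(), b.length());
          }
        };

    copy.sort(byLength);
    return copy;
  }
}
```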

Verification

✅ mvn clean install -DskipTests -Ddependency-check.skip=true — BUILD SUCCESS
✅ Jar class version: 52 (JDK 8)
```
major version: 52
```
⚠️ mvn test on JDK 21 locally: 90 Arrow NIO errors (expected — RootAllocator requires --add-opens on JDK 16+; tests pass on JDK 8 CI)

NO_CHANGELOG=true

## Description
Release v3.0.7

## Changes
This release includes:

### Updated
- Log timestamps now explicitly display timezone.
- **[Breaking Change]** `PreparedStatement.setTimestamp(int, Timestamp, Calendar)` now properly applies Calendar timezone conversion using the LocalDateTime pattern (in line with `getTimestamp`). Previously the Calendar parameter was ineffective.
- `DatabaseMetaData.getColumns()` with a null catalog parameter now retrieves columns from all catalogs when using SQL Execution API, aligning the behaviour with Thrift.
- `DatabaseMetaData.getFunctions()` with a null catalog parameter now retrieves functions from the current catalog when using SQL Execution API, aligning the behaviour with Thrift.

### Fixed
- Fix timeout exception handling to throw `SQLTimeoutException` instead of `DatabricksSQLException` when queries time out.
- Removes dangerous global timezone modification that caused race conditions.
- Fixed `Statement.getLargeUpdateCount()` to return -1 instead of throwing an exception when there are no more results or the result is not an update count.
- CVE-2025-66566. Updated lz4-java dependency to 1.10.1.
- Fix `INVALID_IDENTIFIER` error when using catalog/schema/table names with hyphens or special characters for SQL Exec API in metadata operations (`getSchemas()`, `getTables()`, `getColumns()`, etc.) and connection methods (`setCatalog()`, `setSchema()`). Per Databricks identifier rules, special characters are now properly enclosed in backticks.
- Fix Auth_Scope handling inconsistency in Azure U2M OAuth.

## Testing
Version bump and release notes have been updated across all relevant files.

OVERRIDE_FREEZE=true

Co-authored-by: Samikshya Chand <148681192+samikshya-db@users.noreply.github.com>
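
The `INVALID_IDENTIFIER` fix in these release notes relies on backtick-quoting identifiers. A minimal sketch of that idea follows; `IdentifierQuoting.quote` is a hypothetical helper, not the driver's actual code, and it assumes that an embedded backtick is escaped by doubling it.

```java
// Hypothetical helper; not the driver's actual implementation.
class IdentifierQuoting {

  /** Wraps an identifier in backticks, doubling any backtick it already contains. */
  static String quote(String identifier) {
    // Assumption: a literal backtick inside a quoted identifier is written as two backticks.
    return "`" + identifier.replace("`", "``") + "`";
  }
}
```

With this sketch, a catalog named `my-catalog` would be sent as `` `my-catalog` `` rather than as a bare name that the parser rejects.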

## Description - Checks in progress, freeze main till then. ## Testing  ## Additional Notes to the Reviewer NO_CHANGELOG=true

## Description Fixes the multichunk test by counting only the unique URLs requested, which ensures that the test does not count retries ## Testing  Tested locally ## Additional Notes to the Reviewer  NO_CHANGELOG=true

## Description  Improve logging when jdbc is shaded ## Testing  Unit tests + manually in benchmarking ## Additional Notes to the Reviewer  Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

## Description  Excluded circuit breaker test from SEA in the post merge workflow ## Testing  ## Additional Notes to the Reviewer  NO_CHANGELOG=true

…ricks#1154) Bumps org.apache.logging.log4j:log4j-core from 2.22.1 to 2.25.3. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

## Description
- Tokens for custom OAuth providers should be cached.
- This incorporates the caching mentioned in the [internal code audit](https://docs.google.com/document/d/1O6cmsqYw6JMYIzW6NrJR_RLZw6KQkzXNy0oKzRK7xAM/edit?tab=t.0).
- There are two options for adding a token cache: reusing the persistent token cache used in the refresh flow, or extending cachedTokenSource (in-memory). The decision was made by benchmarking both: [internal doc](https://docs.google.com/document/d/1aO6befanIuO-OIJ4NZMx3SK3R7JXmwtV54LCpFo02EA/edit?tab=t.0).

<img width="630" height="352" alt="Screenshot 2025-12-12 at 5 18 17 PM" src="https://github.com/user-attachments/assets/b394ab0f-c9bc-43b4-b00f-6beda7a1a9d2" />

## Testing
- Added unit tests
- Tested each flow end to end.

## Additional Notes to the Reviewer

---------

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

…hen using Thrift protocol. (databricks#1066)

## Description
The [Thrift protocol](https://github.com/databricks-eng/runtime/blob/master/sql/hive-thriftserver/if/TCLIService.thrift#L2186) has an orientation field with the values FETCH_NEXT, FETCH_PRIOR or FETCH_FIRST. This field was always set to FETCH_NEXT, resulting in an incorrect refetch. To fetch from a particular chunk index, the Thrift protocol requires the start row offset to be set. The chunk index and start row offset are available from the expired links, so the start row offset is now used to fetch the links in the Thrift protocol.

## Testing
This fix is tested with an integration test that validates that the correct links are fetched when fetching from a pair of chunk index and start row offset. There are also unit tests that validate correct client behaviour when unexpected responses are received from the server.

## Additional Notes to the Reviewer
I also made some changes to the validation of the results; see the comments within the PR.

---------

Co-authored-by: tejassp-db <> Co-authored-by: Samikshya Chand <148681192+samikshya-db@users.noreply.github.com>

## Description  NO_CHANGELOG=true ## Testing  ## Additional Notes to the Reviewer

…abricks#1167) ## Description NO_CHANGELOG=true  When LogLevel.OFF was set, setupLogger() returned early without configuring the JUL logger. This caused Java's default logging behavior to kick in, resulting in deprecation warnings (e.g., ignoreTransactions warnings) being logged to console despite logging being disabled. Now properly initializes the logger with Level.OFF to suppress all output while using STDOUT to avoid file system access issues in restricted environments. Fixes databricks#1158 ## Testing  Manual testing ## Additional Notes to the Reviewer
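
The `LogLevel.OFF` fix described above can be sketched with plain `java.util.logging`. This is a minimal illustration, not the driver's `setupLogger()`; `SilentLoggerSetup` and `configureOff` are hypothetical names. The point it shows is that the logger is explicitly configured with `Level.OFF` and a STDOUT handler rather than being left with JUL defaults.

```java
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

// Hypothetical sketch; the driver's real setupLogger() is not reproduced here.
class SilentLoggerSetup {

  static Logger configureOff(String loggerName) {
    Logger logger = Logger.getLogger(loggerName);
    // Do not forward records to the root logger's default console handler.
    logger.setUseParentHandlers(false);
    // A stream handler on STDOUT avoids touching the file system.
    StreamHandler handler = new StreamHandler(System.out, new SimpleFormatter());
    handler.setLevel(Level.OFF);
    logger.addHandler(handler);
    // Level.OFF suppresses every record, including warnings.
    logger.setLevel(Level.OFF);
    return logger;
  }
}
```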

Updated version v3.0.4 to be marked as deprecated and added a note to use v3.0.5 instead. Added additional details for geospatial data type support. ## Description  ## Testing  ## Additional Notes to the Reviewer  NO_CHANGELOG=true

…abricks#1169) ## Description  NO_CHANGELOG=true Optimize setAutoCommit to avoid unnecessary server round-trips when the requested autoCommit value matches the cached session value. This optimization only applies when FetchAutoCommitFromServer is disabled (the default), ensuring we still respect server state when that mode is enabled. ## Testing  Unit tests ## Additional Notes to the Reviewer
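
The `setAutoCommit` optimization above amounts to a guard on a cached value. The sketch below is an assumption-laden illustration: `AutoCommitCache`, its fields, and the round-trip counter are hypothetical stand-ins, and the real driver state lives elsewhere.

```java
// Hypothetical sketch; class and field names are illustrative, not the driver's.
class AutoCommitCache {
  private boolean cachedAutoCommit = true; // JDBC connections start in auto-commit mode
  private final boolean fetchAutoCommitFromServer;
  int serverRoundTrips = 0; // counts simulated requests, for illustration only

  AutoCommitCache(boolean fetchAutoCommitFromServer) {
    this.fetchAutoCommitFromServer = fetchAutoCommitFromServer;
  }

  void setAutoCommit(boolean requested) {
    // Skip the round-trip only when the cache is authoritative and already matches.
    if (!fetchAutoCommitFromServer && requested == cachedAutoCommit) {
      return;
    }
    sendSetAutoCommitToServer(requested);
    cachedAutoCommit = requested;
  }

  private void sendSetAutoCommitToServer(boolean value) {
    serverRoundTrips++; // stand-in for the real server request
  }
}
```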

…atabricks#1101) ## Description NO_CHANGELOG=true  Complete link futures for upfront-fetched chunks to prevent deadlock When chunk links are fetched upfront, the corresponding futures were never completed, causing threads to wait indefinitely. Now we complete these futures in the constructor for all pre-fetched chunks. ## Testing  - Unit tests - Manual testing ## Additional Notes to the Reviewer
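
The deadlock fix above can be pictured as completing each chunk's future at construction time whenever its link is already known. The following is a minimal sketch under that reading; `ChunkLinkRegistry` and the use of `String` for a link are hypothetical simplifications.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch; the driver's chunk and link types are replaced by plain strings.
class ChunkLinkRegistry {
  private final Map<Long, CompletableFuture<String>> linkFutures = new HashMap<>();

  ChunkLinkRegistry(long totalChunks, Map<Long, String> prefetchedLinks) {
    for (long i = 0; i < totalChunks; i++) {
      CompletableFuture<String> future = new CompletableFuture<>();
      String link = prefetchedLinks.get(i);
      if (link != null) {
        // Links fetched upfront are completed immediately so no waiter blocks on them.
        future.complete(link);
      }
      linkFutures.put(i, future);
    }
  }

  CompletableFuture<String> linkFor(long chunkIndex) {
    return linkFutures.get(chunkIndex);
  }
}
```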

## Description  ## Testing  ## Additional Notes to the Reviewer

## Description  Added query tags to telemetry ## Testing  Tested with real workspace in both cases: when query tags are present / not present. Behaviour is working as expected. ## Additional Notes to the Reviewer  NO_CHANGELOG=true Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>

## Problem Custom user agent from useragententry parameter wasn't included in connector service HTTP requests for feature flag retrieval. ## Root Cause Method execution order issue in UserAgentManager.setUserAgent() - custom user agent was set AFTER the connector service request was made. ## Solution Reordered the method to set custom user agent before calling getClientUserAgent() (which triggers feature flag fetch). ## Testing - Added testCustomUserAgentIncludedBeforeClientTypeEvaluation() test NO_CHANGELOG=true --------- Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

…action preview (databricks#1176)

## Summary
- Changes the default value of the `IgnoreTransactions` parameter from `0` to `1`, making transactions disabled by default
- Updates `supportsTransactions()` to respect the `IgnoreTransactions` flag, returning `false` when transactions are ignored (default) and `true` when explicitly enabled
- Adds a test case for when transactions are explicitly enabled via `IgnoreTransactions=0`

## Background
The multi-statement transaction feature is currently in private preview for limited workspaces. When BI tools (Tableau, Power BI, DBeaver) detect transaction support via `supportsTransactions()`, they automatically use transaction methods, causing failures for customers not enrolled in the preview.

This change prevents unexpected failures for non-preview customers while allowing preview participants to opt in by explicitly setting `IgnoreTransactions=0` in their connection string.

## Migration Path
- **Non-preview customers**: No action required - transactions are now disabled by default
- **Preview participants**: Set `IgnoreTransactions=0` in the connection string to enable transaction support
- **GA migration**: When multi-statement transactions reach GA, flip the default back to `0`

## Test plan
- [ ] Verify existing tests pass
- [ ] Verify default connection returns `supportsTransactions() = false`
- [ ] Verify connection with `IgnoreTransactions=0` returns `supportsTransactions() = true`
- [ ] Verify transaction methods (`setAutoCommit`, `commit`, `rollback`) are no-ops by default

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

## Description Bump version to 3.1.1

…ricks#1177)

### Problem
When executing queries that return 0 rows (e.g., `WHERE 1=0`), complex types (ARRAY, MAP, STRUCT) showed only generic type names instead of detailed type information:

**Before:**
- `ARRAY` instead of `ARRAY<INT>`
- `MAP` instead of `MAP<STRING,STRING>`
- `STRUCT` instead of `STRUCT<field: TYPE>`

**After:**
- Detailed type information is correctly preserved for all row counts

### Root Cause
In `AbstractArrowResultChunk.java`, Arrow field metadata was only extracted inside the `while(arrowStreamReader.loadNextBatch())` loop. For queries with 0 rows, no batches are loaded, so the loop never executes and metadata is never extracted.

**Code location:** `/src/main/java/com/databricks/jdbc/api/impl/arrow/AbstractArrowResultChunk.java:338-359`

### Solution
Extract metadata from `VectorSchemaRoot` immediately after obtaining it, **before** the `loadNextBatch()` loop. The Arrow IPC format always sends the schema message first (before any record batches), so field metadata is available even when there are 0 rows. `VectorSchemaRoot` contains field vectors with metadata regardless of row count.

**Key changes:**
1. Moved metadata extraction from inside the while loop to before it
2. Added defensive null checks for `VectorSchemaRoot` and field vectors
3. Added debug logging to track metadata extraction

### Testing

#### Unit Test Coverage
- ✅ Added `testMetadataExtractionWithZeroRows()` to `ArrowResultChunkTest`
- ✅ Verifies Arrow field metadata is extracted correctly with 0 rows
- ✅ Tests complex types: `ARRAY<INT>`, `MAP<STRING,STRING>`
- ✅ All 2,693 unit tests pass

#### Manual Verification
Tested with queries returning 0 rows:
```sql
SELECT array_col, map_col, struct_col FROM table WHERE 1=0
```
Result: Metadata now correctly shows detailed type information

### Impact
- Scope: Both SQL Exec API and Thrift Server (shared code path)
- Risk: Low - backward compatible change, only affects metadata extraction timing
- Benefits:
  - Fixes schema discovery for the WHERE 1=0 pattern
  - Improves metadata availability for empty result sets
  - Aligns with Arrow IPC specification behavior

### Additional Context
- The Arrow IPC specification guarantees that the schema is sent before record batches
- VectorSchemaRoot.getFieldVectors() is available immediately after ArrowStreamReader.getVectorSchemaRoot()
- No performance impact: metadata extraction is now done once upfront instead of conditionally on the first batch

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

## Description  This PR introduces lazy loading support for inline Arrow results to improve memory efficiency when handling large result sets. Previously, InlineChunkProvider would eagerly fetch all arrow batches upfront when results had hasMoreRows = true, which could lead to memory issues with large datasets. This change splits the handling into two separate paths: 1. Lazy path (new): For Thrift-based inline Arrow results (when ARROW_BASED_SET is returned), we now use LazyThriftInlineArrowResult which fetches arrow batches on-demand as the client iterates through rows. This is similar to how LazyThriftResult works for columnar data. 2. Remote path (existing): For URL-based Arrow results (URL_BASED_SET), we continue using ArrowStreamResult with RemoteChunkProvider which downloads chunks from cloud storage. The InlineChunkProvider is now only used for SEA results with JSON_ARRAY format and INLINE disposition (contain all data inline {no hasMoreRows flag set}). This will reduce memory consumption and improve performance when dealing with large inline Arrow result sets similar to databricks#975. ## Testing  - Unit tests - Integration tests - Manual testing ## Additional Notes to the Reviewer  Bypassing an existing failure on CI/CD because of databricks@3e4f21c

…istency (databricks#1182)

## Summary
This PR adds TIMESTAMP_NTZ normalization in the Thrift path to ensure consistent metadata behavior across both SEA and Thrift API paths.

## Background
PR databricks#1177 moved Arrow metadata extraction earlier in the processing pipeline, which exposed an inconsistency: the Thrift path started returning the correct "TIMESTAMP_NTZ" from server metadata, while the SEA path was already normalizing it to "TIMESTAMP" for backward compatibility.

## Changes
- Added TIMESTAMP_NTZ → TIMESTAMP normalization in the `DatabricksResultSetMetaData.java` Thrift constructor (lines 205-208)
- This brings Thrift path behavior in line with existing SEA path normalization
- Fixes test failure in `PreparedStatementIntegrationTests.testGetMetaData_NoResultSet`

## Testing
- ✅ Local test run: `PreparedStatementIntegrationTests.testGetMetaData_NoResultSet` passes
- ✅ Metadata now consistent before and after `executeQuery()` for TIMESTAMP_NTZ columns
- ✅ Both SEA and Thrift paths return "TIMESTAMP" for TIMESTAMP_NTZ columns

## Related
- Builds on PR databricks#1177 (Fix Arrow field metadata not available for queries with 0 rows)
- Fixes issue introduced by early metadata extraction in PR databricks#1177
- Maintains backward compatibility with existing behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

…oad parameter (databricks#1183) ## Summary Add support for disabling CloudFetch via `EnableQueryResultDownload=0` connection parameter to use inline Arrow results instead. ## Changes - Add `isCloudFetchEnabled()` method to `IDatabricksConnectionContext` interface - Implement the method in `DatabricksConnectionContext` using existing `EnableQueryResultDownload` parameter - Update `DatabricksThriftServiceClient` to respect this setting when making execute requests - Add unit tests for the new functionality ## Usage To disable CloudFetch and use inline Arrow results: ``` jdbc:databricks://host:port/default;EnableQueryResultDownload=0;... ``` --------- Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

…lemetry code audit comments- part1 (databricks#1163)

## Description
Four telemetry improvements:
- Use a common object mapper across telemetry use cases (it is already thread safe and expensive to create, i.e., good for reuse)
- Make the `flushIntervalMillis` config the same across both telemetry clients (un-auth and auth)
- Clear the connection param cache when a connection is closed; this was a memory leak before
- Rather than creating a scheduledExecutor for each telemetry client, share one across a factory

## Testing
- Unit tests

## Additional Notes to the Reviewer
- When the LAST connection to a host is closed, all pending telemetry events for that host are flushed across all prior connections (since they all shared the same TelemetryClient). For example, if you have 5 connections to `host-A`, closing connections 1-4 does nothing (it just decrements refCount). Only when you close connection 5 (the last one) does the flush occur, sending all accumulated telemetry from all 5 connections.

NO_CHANGELOG=true
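
The reviewer note about flushing only on the last close describes reference counting per host. A minimal sketch of that behaviour follows, assuming a hypothetical `SharedTelemetryRegistry` in place of the driver's factory and `TelemetryClient`, with events modelled as strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; it only models the ref-counting described in the reviewer note.
class SharedTelemetryRegistry {
  private static final class Entry {
    int refCount;
    final List<String> pending = new ArrayList<>();
  }

  private final Map<String, Entry> byHost = new HashMap<>();
  final List<String> flushed = new ArrayList<>(); // stands in for events sent to the server

  synchronized void open(String host) {
    byHost.computeIfAbsent(host, h -> new Entry()).refCount++;
  }

  // Assumes open(host) was called first.
  synchronized void record(String host, String event) {
    byHost.get(host).pending.add(event);
  }

  synchronized void close(String host) {
    Entry entry = byHost.get(host);
    if (entry == null) {
      return;
    }
    // Only the last connection to a host triggers the flush.
    if (--entry.refCount == 0) {
      flushed.addAll(entry.pending);
      byHost.remove(host);
    }
  }
}
```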

## Description
This PR enhances geospatial datatype handling to include SRID (Spatial Reference System Identifier) information in column type names and fixes multiple issues related to complex datatype handling across different result formats.

### Key Changes
1. **Geospatial Type Name Enhancement**
   - Column type names now include SRID: `GEOMETRY(4326)` instead of `GEOMETRY`
   - Applies to both GEOMETRY and GEOGRAPHY types
   - Preserves full type information in metadata for better type identification
2. **SEA Inline Mode Complex Type Fix**
   - Fixed issue where complex types (ARRAY, MAP, STRUCT) were not returned as complex objects in SEA Inline mode (JSON array result format)
   - Now properly converts to complex datatype objects when `EnableComplexDatatypeSupport=true`
3. **Thrift CloudFetch Metadata Enhancement**
   - Fixed error when extracting type details (e.g., `INT` from `ARRAY<INT>`) in Thrift CloudFetch mode
   - Enhanced `getColumnInfoFromTColumnDesc()` to use Arrow schema metadata alongside `TColumnDesc`
   - Arrow schema provides complete type information (e.g., `ARRAY<INT>`) while `TColumnDesc` only contains the base type (e.g., `ARRAY`)
4. **Arrow Metadata Extraction**
   - Added `DatabricksThriftUtil.getArrowMetadata()` to deserialize Arrow schema from `TGetResultSetMetadataResp`
   - Fixed null arrow metadata issue in `DatabricksResultSet` constructor for Thrift CloudFetch mode

## Testing

### Unit Tests
- All existing unit tests pass and additional tests are added for new methods

### Integration Tests
- `GeospatialTests.java` - Comprehensive E2E integration test
- Tests geospatial types (GEOMETRY and GEOGRAPHY)
- Validates **24 configuration combinations**:
  - Protocol: Thrift / SEA
  - Serialization: Arrow / Inline
  - CloudFetch: Enabled / Disabled (only with Arrow, as CloudFetch requires Arrow)
  - GeoSpatial Support: Enabled / Disabled
  - Complex Type Support: Enabled / Disabled
- Validates metadata: column types, type names, class names
- Validates values: WKT representation, SRID
- Validates behavior when geospatial objects are enabled vs. disabled (STRING fallback)
- **All 24 tests pass** ✅

## Additional Notes to the Reviewer
Other required details are mentioned in comments in the diff

---------

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>

…cks#1184)

## Summary
Implements proactive prefetching with a sliding window for both Thrift columnar and inline Arrow results, eliminating blocking at batch boundaries and improving throughput.

## Key Components

### New Streaming Infrastructure
- **`ThriftStreamingProvider<T>`**: Generic type-safe streaming provider with background prefetch thread and configurable sliding window
- **`StreamingBatch<T>`**: Type-safe batch container with lifecycle management and error handling
- **`ThriftResponseProcessor<T>`**: Interface for pluggable response processors
  - `ColumnarResponseProcessor`: Processes Thrift columnar results
  - `InlineArrowResponseProcessor`: Processes inline Arrow results with schema caching

### Result Implementations
- **`StreamingInlineArrowResult`**: High-throughput streaming implementation for inline Arrow results with background prefetching
- **`StreamingColumnarResult`**: Streaming implementation for Thrift columnar results with prefetch

### Supporting Classes
- **`ThriftBatchFetcher`** / **`ThriftBatchFetcherImpl`**: Abstraction for fetching batches from the Thrift server

<img width="1792" height="1234" alt="streaming inline" src="https://github.com/user-attachments/assets/66ea9b83-a16b-42d5-9280-cb1fb81dadeb" />

## Configuration

| Parameter | Description | Default |
|-----------|-------------|---------|
| `EnableInlineStreaming` | Toggle streaming mode for inline results | `1` (enabled) |
| `ThriftMaxBatchesInMemory` | Sliding window size (max batches kept in memory) | `3` |

## Key Features
1. **Background Prefetching**: Dedicated thread fetches batches ahead of consumption
2. **Sliding Window**: Configurable memory limit prevents unbounded memory growth
3. **Type Safety**: Generic `ThriftStreamingProvider<T>` eliminates unsafe casting
4. **Graceful Error Handling**:
   - Try-catch around resource cleanup to prevent cascading failures
   - Timeout on batch creation wait to prevent indefinite blocking
5. **Comprehensive Logging**: Debug/error logging for troubleshooting

## Testing
- Updated `ExecutionResultFactoryTest` for new factory logic
- Updated `DatabricksThriftServiceClientTest` for CloudFetch control
- Existing integration tests cover streaming behavior

## Usage
Streaming is enabled by default. To disable and use lazy loading instead:
```
jdbc:databricks://host:port/default;EnableInlineStreaming=0;...
```
To adjust the sliding window size:
```
jdbc:databricks://host:port/default;ThriftMaxBatchesInMemory=5;...
```

---------

Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
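
The background prefetching and sliding window described above can be approximated with a bounded queue and one producer thread. This is a deliberately small sketch and not the `ThriftStreamingProvider<T>` implementation: `SlidingWindowPrefetcher` is a hypothetical name, the batch source is a plain `Iterator` assumed to yield non-null batches, and error propagation and timeouts are omitted.

```java
import java.util.Iterator;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of a bounded prefetch window; not the driver's streaming provider.
class SlidingWindowPrefetcher<T> {
  private static final Object END = new Object(); // marks the end of the stream
  private final BlockingQueue<Object> window;

  SlidingWindowPrefetcher(Iterator<T> source, int maxBatchesInMemory) {
    final BlockingQueue<Object> queue = new ArrayBlockingQueue<>(maxBatchesInMemory);
    this.window = queue;
    Thread worker =
        new Thread(
            () -> {
              try {
                while (source.hasNext()) {
                  // put() blocks while the window is full, which bounds the queued batches.
                  queue.put(source.next());
                }
                queue.put(END);
              } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
              }
            },
            "prefetch-sketch");
    worker.setDaemon(true);
    worker.start();
  }

  /** Returns the next batch, or null once the source is exhausted. */
  @SuppressWarnings("unchecked")
  T next() {
    try {
      Object item = window.take();
      if (item == END) {
        window.put(END); // keep the marker so later calls also return null
        return null;
      }
      return (T) item;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for a batch", e);
    }
  }
}
```

The consumer only blocks when the producer has not yet caught up; the producer only blocks when the window is full, which is the memory bound that `ThriftMaxBatchesInMemory` is described as providing.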

…ents (databricks#1186)

## Description
Fixed `IndexOutOfBoundsException` that occurs when executing DDL statements (e.g., `CREATE DATABASE`) using the Thrift protocol. The bug manifests when there's a mismatch between the number of Thrift column descriptors and Arrow schema fields.

### Root Cause
When executing DDL statements, the Databricks server behavior is:
- **Thrift Protocol**: Returns column descriptors including a "Result" status column (1 column)
- **Arrow Schema**: Returns an empty schema with 0 fields (no actual data)
- **The Bug**: Code attempted to access `arrowMetadata[0]` without checking if the list was empty

This mismatch caused `IndexOutOfBoundsException` when the driver tried to access arrow metadata at index 0 of an empty list.

### Debug Evidence
**TColumnDesc (Thrift)**:
```
Column[0]:
  name: Result
  type: STRING_TYPE
  position: 1
  Full TColumnDesc: TColumnDesc(columnName:Result, typeDesc:TTypeDesc(...), position:1, comment:)
```
**Arrow Schema**:
```
Arrow schema bytes length: 72
Deserialized Arrow schema, field count: 0 ← Empty!
Arrow metadata list: size=0
```

### Changes Made
Added bounds checking in two locations where arrow metadata is accessed:
1. **`ArrowUtil.java:247`** - Used by `StreamingInlineArrowResult`
2. **`DatabricksResultSetMetaData.java:195`** - Used for result set metadata construction

**Before:**
```java
String columnArrowMetadata = arrowMetadata != null ? arrowMetadata.get(columnIndex) : null;
```
**After:**
```java
String columnArrowMetadata =
    arrowMetadata != null && columnIndex < arrowMetadata.size()
        ? arrowMetadata.get(columnIndex)
        : null;
```

## Testing

### Manual Testing
**Test Case**: Execute CREATE DATABASE statement
```java
String sqlQuery = "CREATE DATABASE IF NOT EXISTS hive_metastore.test_db";
boolean hasResultSet = stmt.execute(sqlQuery);
```
**Before Fix**: `IndexOutOfBoundsException: Index 0 out of bounds for length 0`

**After Fix**: Executes successfully, returns `hasResultSet=false`

## Additional Notes to the Reviewer
NO_CHANGELOG=true

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

databricks#1181) ## Description  Modified the logic for the enableMultipleCatalogSupport parameter: when the parameter is disabled, metadata calls return results only if the catalog provided is null or equal to the current catalog. This matches the behaviour of the existing driver. ## Testing  Tested locally ## Additional Notes to the Reviewer

## Description
Implements the NonRowcountQueryPrefixes flag to match existing JDBC driver behavior. This allows users to specify comma-separated query prefixes (like INSERT, UPDATE, DELETE) that should return result sets instead of row counts.

Changes:
- Added NON_ROWCOUNT_QUERY_PREFIXES parameter to DatabricksJdbcUrlParams
- Added getNonRowcountQueryPrefixes() method to connection context interface and implementation
- Updated shouldReturnResultSet() logic to check configured prefixes before SQL patterns
- Added 11 comprehensive unit tests covering various scenarios
- Updated NEXT_CHANGELOG.md with feature description

Usage: NonRowcountQueryPrefixes=INSERT,UPDATE,DELETE,MERGE

## Testing
Tests: All 75 tests pass (64 existing + 11 new)

## Additional Notes to the Reviewer

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
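
The prefix check described above can be sketched as parsing the comma-separated parameter once and comparing it against the start of each statement. `NonRowcountPrefixes` is a hypothetical class, not the driver's `shouldReturnResultSet()`; case-insensitive matching and trimming of leading whitespace are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; matching rules here are assumptions, not the driver's exact logic.
class NonRowcountPrefixes {

  /** Splits a value such as "INSERT,UPDATE" into upper-cased, trimmed prefixes. */
  static List<String> parse(String paramValue) {
    List<String> prefixes = new ArrayList<>();
    if (paramValue == null) {
      return prefixes;
    }
    for (String part : paramValue.split(",")) {
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) {
        prefixes.add(trimmed.toUpperCase(Locale.ROOT));
      }
    }
    return prefixes;
  }

  /** True when the statement starts with one of the configured prefixes. */
  static boolean returnsResultSet(String sql, List<String> prefixes) {
    String normalized = sql.trim().toUpperCase(Locale.ROOT);
    for (String prefix : prefixes) {
      if (normalized.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }
}
```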

… new resultset (databricks#1187) ## Description  getClientInfoProperties and getTypeInfo now return a new result set on each call instead of returning the same result set, to match the JDBC spec ## Testing  Added tests ## Additional Notes to the Reviewer  Fixes: databricks#1178

…icks#1248)

## Summary
- Fixes databricks#1247: Date fields within `ARRAY<STRUCT>`, `ARRAY<DATE>`, `MAP<*,DATE>`, and other complex types were serialized as epoch day integers instead of proper `java.sql.Date` objects.
- Arrow's `getObject()` on nested types returns epoch day integers for DATE fields. `ComplexDataTypeParser.convertPrimitive()` now falls back to parsing epoch day integers via `LocalDate.ofEpochDay()` when `Date.valueOf()` fails on non-ISO-8601 input.
- Non-numeric invalid date strings preserve the original `IllegalArgumentException`.

## Test plan
- [x] `testDateAsEpochDayInStruct`: verifies DATE epoch day integers in `ARRAY<STRUCT<event_date:DATE>>`
- [x] `testDateAsEpochDayInArray`: verifies DATE epoch day integers in `ARRAY<DATE>`
- [x] `testDateAsEpochDayInMap`: verifies DATE epoch day integers in `MAP<STRING,DATE>`
- [x] `testDateAsStringInStruct`: verifies ISO-8601 date strings still work (no regression)
- [x] `testInvalidDateStringInStructThrowsOriginalException`: verifies error behavior for invalid strings
- [x] All 18 `ComplexDataTypeParserTest` tests pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
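
The epoch-day fallback described in this summary can be sketched in a few lines. `DateFallback.parseDate` is a hypothetical stand-in for the relevant branch of `ComplexDataTypeParser.convertPrimitive()`, whose real signature is not shown in the source; only `Date.valueOf` and `LocalDate.ofEpochDay` are taken from the description.

```java
import java.sql.Date;
import java.time.LocalDate;

// Hypothetical sketch of the fallback; not the driver's convertPrimitive() itself.
class DateFallback {

  static Date parseDate(String text) {
    try {
      // ISO-8601 strings such as "2024-01-15" take the normal path.
      return Date.valueOf(text);
    } catch (IllegalArgumentException original) {
      if (text == null) {
        throw original;
      }
      try {
        // Nested Arrow values may arrive as epoch days, e.g. "19000".
        long epochDay = Long.parseLong(text.trim());
        return Date.valueOf(LocalDate.ofEpochDay(epochDay));
      } catch (NumberFormatException notNumeric) {
        // Non-numeric input keeps the original exception.
        throw original;
      }
    }
  }
}
```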

…tabricks#1207)

## Description

Closes databricks#1201
Closes databricks#1254

Goes through each character in the sql string while keeping track of the current state (normal, comment, string literal, identifier). Provides an interface for consumers to iterate over the non-comment characters.

## Testing

See unit tests. I was also testing with this script:

<details>
<summary>Main.java</summary>

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.List;

public class Main {
  public static void main(String[] args) throws Exception {
    String url = System.getenv("DATABRICKS_URL");
    if (url == null) {
      System.err.println("DATABRICKS_URL environment variable is not set");
      System.exit(1);
    }
    Class.forName("com.databricks.client.jdbc.Driver");

    List<String> examples;
    examples = List.of(
        // "/* This is a comment */",
        "SELECT 1; /* This is also a comment */",
        """
        SELECT
        /* This is a comment
        that spans multiple lines */
        1;
        """,
        "SELECT /* Comments are not limited to Latin characters: 评论 😊 */ 1;",
        "SELECT /* Comments /* can be */ nested */ 1;",
        "SELECT /* Quotes in '/*' comments \"/*\" are not special */ */ */ 1;",
        "/* A prefixed comment */ SELECT 1;",
        """
        SELECT '/* This is not a comment */';
        /* This is not a comment */
        """
    );

    for (int i = 0; i < examples.size(); i++) {
      String example = examples.get(i);
      System.out.println("==============================");
      try {
        Connection conn = DriverManager.getConnection(url);
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery(example);
        while (rs.next()) {
          System.out.println("Result " + i + ": " + rs.getObject(1));
        }
      } catch (Exception e) {
        System.out.println(e.getMessage());
      }
      System.out.println("==============================");
      System.out.println("");
      System.out.println("");
    }

    System.out.println("");
    System.out.println("");
    System.out.println("==============================");
    try {
      Connection conn = DriverManager.getConnection(url);
      Statement stmt = conn.createStatement();
      ResultSet rs = stmt.executeQuery("""
          /* hi */
          SELECT 1;
          """);
      while (rs.next()) {
        System.out.println("Multi line block: " + rs.getObject(1));
      }
    } catch (Exception e) {
      System.out.println("Multi line block: " + e.getMessage());
    }
    System.out.println("==============================");
    System.out.println("");
    System.out.println("");
    System.out.println("==============================");
    try {
      Connection conn = DriverManager.getConnection(url);
      Statement stmt = conn.createStatement();
      ResultSet rs = stmt.executeQuery("""
          /* /* */ */
          SELECT 2;
          """);
      while (rs.next()) {
        System.out.println("Nested: " + rs.getObject(1));
      }
    } catch (Exception e) {
      System.out.println("Nested: " + e.getMessage());
    }
    System.out.println("==============================");
    System.out.println("");
    System.out.println("");
    System.out.println("==============================");
    try {
      Connection conn = DriverManager.getConnection(url);
      PreparedStatement pstmt = conn.prepareStatement("""
          /* ? /* ? */ ? */
          SELECT ? /* ? /* ? */ ? */
          /* ? /* ? */ ? */;
          """);
      pstmt.setString(1, "hello");
      ResultSet rs = pstmt.executeQuery();
      while (rs.next()) {
        System.out.println("Nested param: " + rs.getObject(1));
      }
    } catch (Exception e) {
      System.out.println("Nested param: " + e.getMessage());
    }
    System.out.println("==============================");
  }
}
```

</details>

## Additional Notes to the Reviewer

---------

Signed-off-by: rileythomp <rileythompson99@gmail.com>
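The character-by-character scan described in this commit could look something like the sketch below. It is an illustration under stated assumptions, not the driver's implementation: `SqlCommentScannerSketch` and `stripComments` are hypothetical names, only nested `/* ... */` comments are handled (line comments are omitted), and backslash escapes are assumed to apply inside `'` and `"` literals but not inside backtick identifiers.

```java
// Hypothetical sketch; the driver's real parser and its interface are not shown in this PR page.
final class SqlCommentScannerSketch {

  /** Returns the SQL text with block comments removed; quoted text is copied verbatim. */
  static String stripComments(String sql) {
    StringBuilder out = new StringBuilder();
    int depth = 0;     // nesting depth of /* ... */ comments; 0 means "not in a comment"
    char quote = '\0'; // active quote character (', " or `), or '\0' outside quotes
    int i = 0;
    while (i < sql.length()) {
      char c = sql.charAt(i);
      char next = i + 1 < sql.length() ? sql.charAt(i + 1) : '\0';
      if (depth > 0) {
        // Comment state: only comment delimiters matter, quotes are not special.
        if (c == '/' && next == '*') {
          depth++;
          i += 2;
        } else if (c == '*' && next == '/') {
          depth--;
          i += 2;
        } else {
          i++;
        }
      } else if (quote != '\0') {
        // String-literal or identifier state: copy everything through.
        out.append(c);
        if (c == '\\' && quote != '`' && i + 1 < sql.length()) {
          out.append(next); // keep the escaped character so it cannot close the literal
          i += 2;
        } else {
          if (c == quote) {
            quote = '\0';
          }
          i++;
        }
      } else if (c == '/' && next == '*') {
        depth = 1;
        i += 2;
      } else {
        // Normal state: a quote character opens a literal or identifier.
        if (c == '\'' || c == '"' || c == '`') {
          quote = c;
        }
        out.append(c);
        i++;
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(stripComments("SELECT /* Comments /* can be */ nested */ 1;"));
  }
}
```

Tracking depth as a counter rather than a boolean is what lets nested comments close correctly, and checking the comment state before the quote state is what makes quotes inside comments inert.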

…atabricks#1261)

## Summary

- Thrift metadata RPCs (GetSchemas, GetTables, GetColumns, etc.) treat catalog names as patterns, so `_` is interpreted as a single-character wildcard. This causes `my_catalog` to incorrectly match `mycatalog`, `my1catalog`, etc.
- Adds a `TreatMetadataCatalogNameAsPattern` connection property (default `false`). When disabled, unescaped `_` in catalog names are escaped with `\` before passing to Thrift requests.
- Applied to 4 metadata methods: `listSchemas`, `listTables`, `listColumns`, `listFunctions` (Thrift RPC path)

## Test plan

- [x] Parameterized unit tests for `WildcardUtil.escapeCatalogName()` (null, no wildcards, single/multiple underscores, already-escaped, percent left unchanged)
- [x] `DatabricksThriftServiceClientTest`: verify `listSchemas` escapes catalog by default, does not escape when property is `true`, and `listCrossReferences` escapes both parent and foreign catalogs
- [ ] Manual verification with a Databricks workspace using a catalog containing `_` in its name

NO_CHANGELOG=true

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
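A rough sketch of the escaping rule described above, covering the cases named in the test plan (null, already-escaped underscores, percent left unchanged). The class and method names here are hypothetical, and returning `null` for a `null` input is an assumption; the real `WildcardUtil.escapeCatalogName()` may differ in detail.

```java
// Hypothetical sketch of the escaping rule; not the driver's WildcardUtil.
final class CatalogNameEscapeSketch {

  /** Escapes unescaped '_' with a backslash so it is matched literally. */
  static String escapeUnderscores(String catalog) {
    if (catalog == null) {
      return null; // assumption: null passes through unchanged
    }
    StringBuilder out = new StringBuilder(catalog.length() + 4);
    for (int i = 0; i < catalog.length(); i++) {
      char c = catalog.charAt(i);
      if (c == '\\' && i + 1 < catalog.length()) {
        // Copy an existing escape pair untouched so "\_" is not escaped twice.
        out.append(c).append(catalog.charAt(++i));
      } else if (c == '_') {
        out.append('\\').append('_');
      } else {
        out.append(c); // '%' and all other characters are left as-is
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(escapeUnderscores("my_catalog"));
  }
}
```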

…x coverage (databricks#1255)

## Description

1. The JDK 8 GitHub Action is broken; this PR fixes it.
2. This PR also adds a Maven profile that only compiles, runs unit tests, and packages (similar to our old `mvn clean test`).
   1. What `-Plocal` skips:
      - NVD/dependency-check scan
      - Arrow patch tests (org.apache.arrow.memory.*)
      - Integration & fake service tests
      - Proxy, SSL, and logging tests

## Testing

- Run `mvn -pl jdbc-core -Plocal clean test`
- For packaging, `mvn -pl jdbc-core,assembly-uber install -Plocal`

## Additional Notes to the Reviewer

Note: Arrow-specific tests are not relevant on the jdk-8 branch, as the Arrow patch is specific to JDK 16+.

NO_CHANGELOG=true

---------

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

…or columns with no default (databricks#1270)

## Summary

- `COLUMN_DEF` (column 13 of `getColumns()`) was returning the column's **type name** instead of `null` for columns with no default value, violating the JDBC spec.
- Root cause: `COLUMN_DEF_COLUMN` was mapped to the `"columnType"` result-set field, so both the SQL path (default switch case) and the Thrift path (explicit `case "COLUMN_DEF"`) returned the type name.
- Neither `SHOW COLUMNS` nor the Thrift `TGetColumns` RPC exposes column default values, so `COLUMN_DEF` must always be `null` for now.
- Fix: added `COLUMN_DEF_COLUMN` to `NULL_COLUMN_COLUMNS` (handles the Thrift path) and added an explicit `case "COLUMN_DEF": object = null` in `getRows()` (handles the SQL path).

## Test plan

- Added unit tests
- Tested with the legacy driver too: we return null only if it is null; we don't have a way to determine if a default has been set

```
Driver: DatabricksJDBC v02.07.06.1023
id: TYPE_NAME=INT, COLUMN_DEF=null
name: TYPE_NAME=STRING, COLUMN_DEF=null
description: TYPE_NAME=VARCHAR, COLUMN_DEF=null
status: TYPE_NAME=STRING, COLUMN_DEF=null ← has DEFAULT 'active'
score: TYPE_NAME=INT, COLUMN_DEF=null ← has DEFAULT 0
```

Test:

```
    String catalog = "samikshya-catalog";
    String schema = "newschema";
    String table = "simba_test_column_def";
    con.createStatement().execute(
        "CREATE TABLE IF NOT EXISTS `" + catalog + "`.`" + schema + "`.`" + table + "`"
            + " (id INT NOT NULL, name STRING NOT NULL, description VARCHAR(255),"
            + " status STRING NOT NULL DEFAULT 'active', score INT DEFAULT 0)"
            + " TBLPROPERTIES('delta.feature.allowColumnDefaults' = 'supported')");
    try (ResultSet rs = con.getMetaData().getColumns(catalog, schema, table, null)) {
      while (rs.next()) {
        System.out.println(rs.getString("COLUMN_NAME")
            + ": TYPE_NAME=" + rs.getString("TYPE_NAME")
            + ", COLUMN_DEF=" + rs.getString("COLUMN_DEF"));
      }
    }
    con.createStatement().execute(
        "DROP TABLE IF EXISTS `" + catalog + "`.`" + schema + "`.`" + table + "`");
    con.close();
  }
}
```

Closes databricks#1267

---------

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

…lti-module)

Synced 68 commits from main into jdk-8. Applied all JDK 8 compatibility transformations after the merge:

**pom.xml changes:**

- Set maven.compiler.source/target to 1.8
- Pin arrow.version=13.0.0 (Arrow 14+ requires Java 11)
- Pin mockito.version=4.11.0 (Mockito 5.x requires Java 11)
- Pin nimbusjose.version=9.47 (nimbus-jose-jwt 10.x requires Java 11)
- Remove wiremock.version (WireMock 3.x requires Java 11)
- Add jdk8 profile to skip spotless (spotless-maven-plugin 2.39.0 requires Java 11)
- Remove spotless excludes for Arrow patched classes

**jdbc-core/pom.xml changes:**

- Remove --add-opens=java.base/java.nio=ALL-UNNAMED (invalid on JDK 8)
- Delete jdk17-NioNotOpen and jdk21-NioNotOpen profiles
- Remove JaCoCo exclusions for Arrow patch classes
- Remove Jvm17PlusAndArrowToNioReflectionDisabled group filter from local profile
- Remove wiremock test dependency
- Add fakeservice/e2e test exclusions to surefire

**Source changes:**

- Remove Arrow patch files (PR databricks#1243 patched Arrow for JDK 16+ NIO restrictions, not needed on JDK 8): MemoryUtil, ArrowBuf, DecimalUtility, DatabricksArrowBuf, DatabricksBufferAllocator, DatabricksAllocationReservation, DatabricksReferenceManager, DatabricksReferenceManagerNOOP
- Simplify ArrowBufferAllocator to always use RootAllocator (no fallback needed on JDK 8)
- Remove isUsingPatchedAllocator() from TelemetryHelper (always false on JDK 8)

**Test changes:**

- Remove fakeservice and e2e tests (WireMock 3.x requires Java 11)
- Remove Arrow patch tests
- Restore unit tests deleted from jdk-8 (test helper classes: TestConstants, FeatureFlagTestUtil, TelemetryAuthHelper, ConfiguratorUtilsTest)
- Fix JDK 8 incompatibilities: remove var keyword, replace diamond-with-anon-class
- Remove JDBC 4.3 (setShardingKey*) assertions from DatabricksConnectionTest

NO_CHANGELOG=true

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

…#1274)

## Summary

- Adds `.claude/commands/sync-jdk8-branch.md`, a new slash command (`/sync-jdk8-branch`) for syncing the `jdk-8` branch with the latest `main`
- Also update the jdk8 actions, as the branch is to be updated to the latest multimodule setup
- PRs created by this command target **upstream (`databricks/databricks-jdbc`) `jdk-8` branch**, not a fork

## Test plan

Battle tested `/sync-jdk8-branch` which produced PR [#1275](#1275). Added more changes to the slash command while using it

NO_CHANGELOG=true

---------

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

spotless-maven-plugin 2.39.0 is compiled for Java 11 (class file version 55.0) and cannot be loaded by the JDK 8 runtime even with spotless.skip=true. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
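To make the "class file version 55.0" remark concrete: every `.class` file starts with the magic number `0xCAFEBABE`, followed by a two-byte minor and a two-byte major version. Java 8 runtimes accept major versions up to 52, while Java 11 compilers emit 55 by default. The sketch below reads that header; `ClassFileVersionSketch` and its methods are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; not part of the driver or the build.
final class ClassFileVersionSketch {
  static final int JAVA_8_MAJOR = 52;
  static final int JAVA_11_MAJOR = 55;

  /** Returns the major class-file version from the first 8 bytes of a .class file. */
  static int majorVersion(byte[] header) {
    ByteBuffer buf = ByteBuffer.wrap(header); // ByteBuffer is big-endian by default
    if (header.length < 8 || buf.getInt() != 0xCAFEBABE) {
      throw new IllegalArgumentException("not a class file header");
    }
    buf.getShort(); // minor version, ignored here
    return buf.getShort() & 0xFFFF;
  }

  /** A runtime can load a class only if the class's major version is not newer. */
  static boolean loadableOn(int runtimeMajor, int classMajor) {
    return classMajor <= runtimeMajor;
  }

  public static void main(String[] args) {
    byte[] java11Header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 55};
    int major = majorVersion(java11Header);
    System.out.println("major=" + major + " loadable on Java 8: " + loadableOn(JAVA_8_MAJOR, major));
  }
}
```

Because the check happens when the plugin's classes are loaded, a skip flag evaluated by the plugin itself never gets a chance to run, which is consistent with the behavior this commit describes.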

- Replace List.of() (Java 9+) with ImmutableList.of() in MetadataResultConstants
- Replace URLDecoder.decode(String, Charset) (Java 10+) with URLDecoder.decode(String, String) in UserAgentManager

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

Replace Java 9+ APIs used in tests with JDK 8 compatible equivalents:

- List.of() / Set.of() / Map.of() → Arrays.asList / ImmutableSet.of / ImmutableMap.of
- Optional.isEmpty() → !optional.isPresent()
- String.repeat() → new String(new char[n]).replace(...)
- InputStream.nullInputStream() → new ByteArrayInputStream(new byte[0])
- Reader.nullReader() → new StringReader("")
- LocalDate.EPOCH → LocalDate.of(1970, 1, 1)
- Remove JDBC 4.3 test methods (enquoteLiteral, enquoteIdentifier, etc.)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
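A few of the substitutions listed above, written out as a small JDK 8 compatible sketch. The wrapper class `Jdk8Equivalents` is hypothetical and exists only to group the examples; the commit applies these replacements inline in the tests rather than through a helper.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical grouping class for illustration; not part of the driver.
final class Jdk8Equivalents {
  // Stands in for LocalDate.EPOCH.
  static final LocalDate EPOCH = LocalDate.of(1970, 1, 1);

  // Stands in for String.repeat(n): build n NUL characters, then replace each with s.
  static String repeat(String s, int n) {
    return new String(new char[n]).replace("\0", s);
  }

  // Stands in for Optional.isEmpty().
  static boolean isEmpty(Optional<?> value) {
    return !value.isPresent();
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("a", "b"); // instead of List.of("a", "b")
    System.out.println(names + " " + repeat("ab", 3) + " " + isEmpty(Optional.empty()) + " " + EPOCH);
  }
}
```

One caveat worth keeping in mind: `Arrays.asList` accepts `null` elements and allows `set()`, whereas `List.of` rejects nulls and is fully immutable, so the two are interchangeable only where tests do not depend on that difference.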

- OffsetTime.toEpochSecond(LocalDate) (Java 9+) → OffsetDateTime.of(...).toEpochSecond()
- Path.of() (Java 11+) → Paths.get()
- List.of() in MetadataResultSetBuilderTest, DatabricksThriftServiceClientTest → Arrays.asList()
- Map.of() in TelemetryClientTest → Collections.singletonMap()

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
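The `OffsetTime.toEpochSecond(LocalDate)` replacement is the least obvious of these, so here is a minimal sketch of one way to write it on JDK 8. The helper name `toEpochSecond` and its enclosing class are hypothetical; the commit only states that `OffsetDateTime.of(...).toEpochSecond()` is used instead.

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.OffsetTime;
import java.time.ZoneOffset;

// Hypothetical helper for illustration; not the driver's code.
final class OffsetTimeEpochSketch {

  /** JDK 8 stand-in for time.toEpochSecond(date). */
  static long toEpochSecond(OffsetTime time, LocalDate date) {
    // Combine the date with the time's local part and offset, then convert.
    return OffsetDateTime.of(date, time.toLocalTime(), time.getOffset()).toEpochSecond();
  }

  public static void main(String[] args) {
    OffsetTime oneAmPlusOne = OffsetTime.of(1, 0, 0, 0, ZoneOffset.ofHours(1));
    System.out.println(toEpochSecond(oneAmPlusOne, LocalDate.of(1970, 1, 1)));
  }
}
```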

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

Coverage runs on JDK 21; Arrow 13.0.0 requires --add-opens on JDK 16+ which is invalid on JDK 8. The jdk-8 branch is tested via prCheckJDK8.yml. Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

Restore 94 unit test files that were incorrectly deleted during the jdk-8 merge (converter tests, Arrow tests, volume tests, etc.). Remove IntegrationTestUtil and DatabricksDriverExamples (depend on fakeservice). Remove Arrow allocator manager tests (reference removed classes). Fix all Java 9+ APIs in restored files (List.of, Map.of, Set.of, var, Map.ofEntries, ProcessHandle mock tests). Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

- IntervalConverterTest: remove Duration.truncatedTo(ChronoUnit.MILLIS) (Java 9+)
- VolumeRetryUtilTest: List.of() → Arrays.asList()
- Slf4jFormatterTest: LogRecord.setInstant() → setMillis() (Java 9+)
- DBFSVolumeClientTest: Files.writeString() → Files.write() (Java 11+)
- VolumeOperationResultTest: Files.writeString() → Files.write(), InputStream.readAllBytes() → IOUtils.toByteArray() (Java 9+/11+)

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

This test hangs for 3+ hours on JDK 8 due to the connection context factory attempting to reach the feature flags service during @BeforeAll, which on JDK 8 hangs instead of failing fast. Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

Duration.toNanos() uses Math.multiplyExact internally on JDK 8, which throws ArithmeticException for extreme values like Long.MIN_VALUE nanos where seconds * NANOS_PER_SECOND overflows long. Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
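One possible way to guard against this overflow is sketched below. It is not necessarily what this commit did (the commit title is not shown here); `SafeDurationNanos` and `toNanosSaturated` are hypothetical names, and clamping to `Long.MIN_VALUE`/`Long.MAX_VALUE` is an assumed policy chosen for the illustration.

```java
import java.time.Duration;

// Hypothetical guard for illustration; the commit's actual change may differ.
final class SafeDurationNanos {

  /** Converts to nanoseconds, clamping to the long range instead of throwing. */
  static long toNanosSaturated(Duration d) {
    try {
      return d.toNanos();
    } catch (ArithmeticException overflow) {
      // seconds * 1_000_000_000 did not fit in a long.
      return d.isNegative() ? Long.MIN_VALUE : Long.MAX_VALUE;
    }
  }

  public static void main(String[] args) {
    System.out.println(toNanosSaturated(Duration.ofSeconds(1)));
    System.out.println(toNanosSaturated(Duration.ofNanos(Long.MIN_VALUE)));
  }
}
```

For `Duration.ofNanos(Long.MIN_VALUE)` the clamped result equals the exact value, so the helper returns the same number whether or not the runtime's `toNanos()` throws for that input.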

@VisibleForTesting

Pre-populate the feature flags cache in @BeforeAll using the @VisibleForTesting setFeatureFlagsContext() method so no real HTTP call is made to the connector service. On JDK 8, the TCP connection to a non-existent host hangs indefinitely (no default socket timeout), whereas JDK 17+ fails fast. Also remove the surefire exclusion added as a workaround. Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

DatabricksPooledConnectionTest hangs indefinitely on JDK 8 because it uses a real connection context pointing to sample-host.cloud.databricks.com (a real domain). The DatabricksDriverFeatureFlagsContext constructor makes a blocking HTTP call to that host; on JDK 8 the TCP connection does not time out unlike JDK 17+ which fails fast. Mocking the feature flags cache alone is insufficient as other code paths may also attempt real network connections during initialization. Lower coverage threshold from 85% to 80% on jdk-8 branch to account for the excluded test class. Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

gopalldb and others added 30 commits December 18, 2025 17:12

Add release freeze for 3.0.7 (databricks#1153)

4ed9125

## Description

- Checks in progress, freeze main till then.

## Testing

## Additional Notes to the Reviewer

NO_CHANGELOG=true

Bump version to 3.1.1 (databricks#1172)

0e8b1ad

## Description Bump version to 3.1.1

allow metric view table type in sea mode (databricks#1188)

0b7f644

vikrantpuppala and others added 6 commits March 9, 2026 14:08

samikshya-db requested a review from gopalldb March 12, 2026 03:54

samikshya-db changed the title ~~Sync jdk-8 with main: 3.2.2-SNAPSHOT multi-module + Arrow patch era~~ Sync jdk-8 with main: 3.2.2-SNAPSHOT Mar 12, 2026

samikshya-db mentioned this pull request Mar 12, 2026

Add /sync-jdk8-branch slash command + update the jdk8 action workflow #1274

Merged

samikshya-db and others added 16 commits March 12, 2026 11:04

Update prCheckJDK8.yml to use correct test command

e6ecc89

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

Skip OWASP dependency-check on jdk-8 branch (requires Java 11+)

7e6dd5f

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

Remove OWASP dependency-check plugin on jdk-8 branch (requires Java 11+)

b9f9fd9

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

Skip coverage check for PRs targeting jdk-8 branch

f72c853

Coverage runs on JDK 21; Arrow 13.0.0 requires --add-opens on JDK 16+ which is invalid on JDK 8. The jdk-8 branch is tested via prCheckJDK8.yml. Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

Run coverage on JDK 8 when PR targets jdk-8 branch

b1d41fb

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

Use JDK 8 for coverage check on jdk-8 branch

6961f04

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

samikshya-db requested a review from msrathore-db March 12, 2026 13:48

gopalldb approved these changes Mar 12, 2026


samikshya-db merged commit 6d3a31b into databricks:jdk-8 Mar 12, 2026
2 checks passed

samikshya-db deleted the sync-jdk8-2026-03-12 branch March 12, 2026 13:51

samikshya-db merged 85 commits into databricks:jdk-8 from samikshya-db:sync-jdk8-2026-03-12


11 participants
