Correctness fixes for analytics-engine search route (datetime schema + IP/BINARY byte[]) by ahkcs · Pull Request #5447 · opensearch-project/sql

ahkcs · 2026-05-15T18:20:52Z

What this PR does

Four narrowly-scoped correctness fixes the analytics-engine search path needs. No visitor lowering, no work-arounds for typed-predicate query_string — those belong upstream (see below).

#	Commit	What it fixes	Without this commit
1	`Inject parquet settings in testSearchCommandWithSpecialIndexName`	Makes the special-index test create a parquet-backed shard like the rest of the suite.	Test runs on a Lucene-backed shard — not representative of the analytics-engine route.
2	`Pass minimal {"properties":{}} mapping in testSearchCommandWithSpecialIndexName`	Empty index needs at least an empty `properties` map for the analytics catalog to surface it.	Index goes missing; test fails at "index not found".
3	`Preserve datetime schema-column types in AnalyticsExecutionEngine`	`DatetimeOutputCastRule` casts every datetime column to `VARCHAR` for PPL's wire format. The response schema now recovers the pre-cast `RelDataType` so a `@timestamp` column is reported as `timestamp`, not `string`.	Type-checked clients break: schema says `string` for what is actually a timestamp.
4	`Convert byte[] cells to ExprValue for IP / BINARY fields`	`ExprValueUtils.fromObjectValue` gets a `byte[]` branch — 4 / 16 bytes → `ExprIpValue` via `InetAddress.getByAddress`; other lengths → base64 `ExprStringValue`.	Every IP / binary cell throws `ExpressionEvaluationException: unsupported object class [B`. Hard crash.

Two test-plumbing commits + two correctness commits in two files (AnalyticsExecutionEngine, ExprValueUtils). That's the whole PR.

Out of scope — typed-predicate `query_string` is a Lucene-secondary concern

CalciteSearchCommandIT on the analytics-engine route (tests.analytics.force_routing=true -Dtests.analytics.parquet_indices=true, against current upstream/main + opensearch-project/OpenSearch#21681) — 26 / 52 pass. The 25 failures all collapse to one upstream root cause: Lucene-secondary query_string doesn't re-type literals inside the query body using the field mapping the way normal Lucene _search does.

`query_string` input	Lucene `_search`	Analytics-engine Lucene secondary
`[severityText], "ERROR"` (text equality)	✅ matches	✅ matches
`[severityNumber], "severityNumber:>15"` (numeric range)	✅ re-parses `"15"` as `long`, BKD range hit	❌ literal stays as text → 0 rows
`[@timestamp], "@timestamp:>2024-01-01"` (date range)	✅ re-parses date	❌ 0 rows
`query_string(text) AND age>30` (mixed-type AND)	✅	❌ `Panic: primitive array`

Why this PR doesn't paper over it on the SQL side. The SQL plugin could lower numeric/boolean comparisons to native typed RexCall and dodge the gap (an earlier version of this PR did exactly that, lifting 26 → 38). The fix doesn't belong here, though — typed query_string should evaluate correctly on the Lucene secondary regardless of which engine produced the predicate. The right place is the analytics-engine sandbox's QuerySerializerRegistry / Lucene-secondary mapping path that opensearch-project/OpenSearch#21562 introduced. Once typed query_string works end-to-end there, the 25 failures unblock with no change in this repo.

The compound-AND panic that #21562's author flagged in their own test cases 10c / 10d as "handled separately" is the same family of bug.

Results

CalciteSearchCommandIT on the analytics-engine route, against current upstream/main + opensearch-project/OpenSearch#21681:

Step	PASS / 52
Pre-#21631 baseline	3
+ Upstream lands #21631 / #21698 / #21628 / #21622 / #21562	~26
+ opensearch-project/OpenSearch#21681 (sandbox fixes) + this PR (4 correctness lifts)	26 ⇨ correctness on the 26 that run

The +0 net pass-rate vs. upstream baseline is intentional: the SQL-side fixes here prevent wrong answers (timestamp-as-string schema, byte[] crash on IP / BINARY scans) on tests that already pass at row-count level. They don't unblock new tests — that takes the Lucene-secondary fix described above.

Out of scope — 25 failures bucketed by root cause

All collapse to "Lucene-secondary query_string doesn't type-coerce literals using the field mapping":

Bucket	Tests (25)	Where to fix
Numeric / decimal range (5)	`testSearchWithNumericComparison`, `testSearchWithInclusiveRanges`, `testSearchWithDoubleFieldComparisons`, `testSearchWithDoubleINOperator`, `testSearchWithDoubleRangeOperators`	analytics-engine Lucene-secondary mapping / `QueryStringQueryBuilder` evaluator
Date-math / date range (10)	`testSearchWithAbsoluteEarliestAndNow`, `testSearchWithDateFormats`, `testSearchWithDateRangeComparisons`, `testSearchWithDateINOperator`, `testSearchWithRelativeTimeModifiers`, `testSearchWithQuarterlyModifiers`, `testSearchWithTimeUnitSnapping`, `testSearchTimeModifierWithSnappedWeek`, `testSearchWithChainedRelativeTimeRange`, `testSearchWithNumericTimeRange`	Same
Compound predicate (4)	`testSearchWithNestedParentheses`, `testSearchWithComplexBooleanExpression`, `testSearchMixedWithPipeCommands`, `testSearchWithFieldNotEquals`	Composite engine — composing `query_string`-delegated and DataFusion-native predicates
Attribute / multi-field (2)	`testSearchWithAttributeFields`, `testSearchWithMultipleFieldTypes`	Same family
IP address (1)	`testSearchWithIPAddress`	Same family
Lucene-secondary NOT / wildcard semantics (2)	`testWildcardPatternMatching`, `testDifferenceBetweenNOTAndNotEquals`	Lucene-secondary side, separate from #21562
Special-index empty-mapping streaming (1)	`testSearchCommandWithSpecialIndexName`	Composite-engine streaming — can't open empty composite shard

Tracked in docs/dev/search-command-analytics-route-status.md.

Check list

New functionality includes testing
Commits are signed per the DCO using --signoff

github-actions · 2026-05-15T21:17:06Z

Persistent review updated to latest commit 0cef8c8

github-actions · 2026-05-15T21:18:25Z

Persistent review updated to latest commit 0cef8c8

github-actions · 2026-05-15T21:18:40Z

Persistent review updated to latest commit 0cef8c8

github-actions · 2026-05-15T21:50:10Z

Persistent review updated to latest commit 828b80d

github-actions · 2026-05-15T22:13:15Z

Persistent review updated to latest commit 52ffeb8

github-actions · 2026-05-15T22:28:07Z

Persistent review updated to latest commit 96b33cf

github-actions · 2026-05-15T22:35:29Z

Persistent review updated to latest commit c6e07a0

github-actions · 2026-05-18T19:05:22Z

Persistent review updated to latest commit 51f4213

github-actions · 2026-05-18T19:18:17Z

Persistent review updated to latest commit 8416a3b

github-actions · 2026-05-18T19:56:07Z

Persistent review updated to latest commit 02807c7

The test creates `logs-2021.01.11` and `logs-7.10.0-2021.01.11` directly via raw PUTs to exercise the parser's handling of dot/version-prefixed index names. The raw path bypasses `TestUtils.createIndexByRestClient`, so the analytics-engine `tests.analytics.parquet_indices=true` toggle never injects the composite/parquet settings. On the analytics-engine route this surfaces as `Table 'logs-2021.01.11' not found` (the catalog-builder only exposes indices that match an analytics-engine- recognized data format), failing the test. Routing the creates through `TestUtils.createIndexByRestClient` makes both code paths honor the toggle: legacy / Calcite continues to create plain Lucene-backed indices, analytics-engine gets parquet-backed composite indices. The index names are still tested verbatim. Surfaced by `CalciteSearchCommandIT` on the analytics-engine route. Signed-off-by: Kai Huang <ahkcs@amazon.com>

…ialIndexName `OpenSearchSchemaBuilder.buildSchema` skips any index whose mapping has no `properties` block — the analytics catalog then doesn't surface the index, and the PPL parser fails with `Table 'logs-2021.01.11' not found` before reaching the row type. Routing the special-index create through `TestUtils.createIndexByRestClient` with `null` mapping produces an empty `{}` body, so the index is created with no mapping at all and the catalog still skips it. Pass a minimal `{"mappings":{"properties":{}}}` body. OpenSearch strips the empty `properties` to `{}` server-side but the schema builder sees a non-null mapping and registers the (zero-column) table. The query then parses cleanly and the test's `verifyDataRows()` (no expected rows) succeeds against the empty index. Signed-off-by: Kai Huang <ahkcs@amazon.com>

github-actions · 2026-05-19T17:02:49Z

Persistent review updated to latest commit 57178c1

@timestamp

`DatetimeOutputCastRule` (api/spec/datetime) wraps every datetime-typed root column in `CAST(... AS VARCHAR)` so the analytics-engine backend can emit PPL's documented space-separator format via `to_char` (`DatetimeOutputCastRewriter` on the OpenSearch side). The downside is that `plan.getRowType()` then advertises those columns as VARCHAR — the response schema reports `"string"` for what callers expect as `"timestamp"`. The legacy / Calcite-reference path doesn't have this divergence: the wire value is the formatted string while the schema column type stays as the original datetime type. Walk the outermost `Project` (descending through a single `Sort`, the `LogicalSystemLimit` system query-size wrapper) and, for any slot of shape `CAST(<datetime> AS VARCHAR)`, swap in the inner `RelDataType` for the schema column. Values still come back as the strings DataFusion emitted — only the schema column type is restored. Falls back to the slot's reported type for non-cast slots, non-datetime sources, or when the root isn't a Project (raw scan / aggregate fragment). Surfaced by `verifySchema(... schema("@timestamp", "timestamp") ...)` assertions in `CalciteSearchCommandIT` time-modifier tests; this commit removes the schema-type divergence for those queries (the remaining data-row mismatches are downstream Lucene-secondary issues filed separately). Signed-off-by: Kai Huang <ahkcs@amazon.com>

github-actions · 2026-05-19T18:25:12Z

Persistent review updated to latest commit adf0887

Swiddis · 2026-05-19T19:05:42Z

+      //   `OpenSearchSchemaBuilder` + `schema_coerce.rs`) and route by type here.
+      if (bytes.length == 4 || bytes.length == 16) {
+        try {
+          return new ExprIpValue(java.net.InetAddress.getByAddress(bytes).getHostAddress());


issue: This is going to trip someone up at some point. In particular, any 4-letter words in ASCII will hit this. I don't want to Norway our customers.

Is there not a way to somehow tag the bytes with whether they're actually IP data or not?

Good catch — you're right. Dropped the length-based heuristic in the latest force-push: byte[] cells now always render as base64, no IP-guessing.

Trade-off: genuine IPv4 / IPv6 cells render as base64 strings instead of dotted-decimal until the schema actually tells us "this column is IP." That's the upstream TODO — OpenSearchSchemaBuilder.mapFieldType currently collapses both ip and binary into VARBINARY. Splitting them into separate UDTs is a cross-repo refactor (analytics-api + isthmus + Substrait + SQL plugin), so leaving it as a follow-up.
The conservative fix here still resolves the original crash (every PPL query scanning an ip column was throwing unsupported object class [B) — just without the misinterpretation risk you flagged.

`AnalyticsExecutionEngine.convertRows` reaches `ExprValueUtils.fromObjectValue` for every cell in every row. OpenSearch `ip` and `binary` fields are stored as raw bytes, so once the analytics-engine reader (paired with OpenSearch PR #21681 commit 4) materializes a parquet column, each cell arrives as a Java `byte[]`. The fall-through catch-all previously threw: ExpressionEvaluationException: unsupported object class [B at ExprValueUtils.fromObjectValue(ExprValueUtils.java:168) at AnalyticsExecutionEngine.convertRows(AnalyticsExecutionEngine.java:128) Every PPL query that even projects an ip column fails on the analytics route with this error, regardless of operator or predicate. `OpenSearchSchemaBuilder.mapFieldType` collapses both `ip` and `binary` into Calcite `VARBINARY`, so the cell-level type system can't distinguish the two. An earlier draft of this commit gated IP rendering on byte length (4 or 16 bytes → `ExprIpValue` via `InetAddress.getByAddress`), but that misinterprets any 4-byte ASCII payload (e.g. `"INFO"`, `"WARN"`, `"POST"`) as a public IPv4 address — the classic "Norway problem" applied to bytes. With no schema context here we can't safely guess, so default to a base64-encoded `ExprStringValue`. Typed call sites that DO know the column is `ip` should route through `ipValue(String)` explicitly. Once `OpenSearchSchemaBuilder` splits `ip` and `binary` into separate UDTs (TODO already filed in that file), this branch can render IP cells in dotted-decimal form again without the length-guessing. Signed-off-by: Kai Huang <ahkcs@amazon.com>

github-actions · 2026-05-19T19:21:31Z

Persistent review updated to latest commit cdac62e

ahkcs requested review from LantaoJin, RyanL1997, Swiddis, acarbonetto, anirudha, dai-chen, joshuali925, mengweieric, noCharger, penghuo, ps48, qianheng-aws, songkant-aws, vamsimanohar, ykmr1224 and yuancu as code owners May 15, 2026 18:20

ahkcs mentioned this pull request May 15, 2026

[analytics-engine] Coverage fixes for search command on the analytics-engine route opensearch-project/OpenSearch#21681

Merged

ahkcs changed the title ~~Inject parquet settings in testSearchCommandWithSpecialIndexName~~ Lower structured search predicates to native RexCall; analytics-route coverage fixes May 15, 2026

ahkcs force-pushed the pr/search-special-index-parquet branch from 602e1f0 to 0cef8c8 Compare May 15, 2026 21:17

ahkcs added calcite calcite migration releated PPL Piped processing language enhancement New feature or request labels May 15, 2026

ahkcs force-pushed the pr/search-special-index-parquet branch from 96b33cf to c6e07a0 Compare May 15, 2026 22:34

ahkcs force-pushed the pr/search-special-index-parquet branch from c6e07a0 to 51f4213 Compare May 18, 2026 19:03

ahkcs force-pushed the pr/search-special-index-parquet branch from 51f4213 to 8416a3b Compare May 18, 2026 19:16

ahkcs force-pushed the pr/search-special-index-parquet branch from 8416a3b to 02807c7 Compare May 18, 2026 19:54

ahkcs force-pushed the pr/search-special-index-parquet branch 2 times, most recently from dc23a73 to 9583614 Compare May 18, 2026 22:09

ahkcs added 2 commits May 19, 2026 10:00

ahkcs force-pushed the pr/search-special-index-parquet branch from 20b7bc3 to 57178c1 Compare May 19, 2026 17:01

ahkcs changed the title ~~Lower structured search predicates to native RexCall; analytics-route coverage fixes~~ Correctness fixes for analytics-engine search route (datetime schema + IP/BINARY byte[]) May 19, 2026

ahkcs force-pushed the pr/search-special-index-parquet branch from 57178c1 to adf0887 Compare May 19, 2026 18:23

Swiddis reviewed May 19, 2026

View reviewed changes

ahkcs force-pushed the pr/search-special-index-parquet branch from adf0887 to cdac62e Compare May 19, 2026 19:20

opensearch-project deleted a comment from github-actions Bot May 19, 2026

ahkcs requested a review from Swiddis May 19, 2026 22:13

Conversation

ahkcs commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this PR does

Out of scope — typed-predicate query_string is a Lucene-secondary concern

Results

Out of scope — 25 failures bucketed by root cause

Check list

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

github-actions Bot commented May 18, 2026

Uh oh!

github-actions Bot commented May 18, 2026

Uh oh!

github-actions Bot commented May 18, 2026

Uh oh!

github-actions Bot commented May 19, 2026

Uh oh!

github-actions Bot commented May 19, 2026

Uh oh!

Swiddis May 19, 2026

Choose a reason for hiding this comment

Uh oh!

ahkcs May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ahkcs commented May 15, 2026 •

edited

Loading

Out of scope — typed-predicate `query_string` is a Lucene-secondary concern

ahkcs May 19, 2026 •

edited

Loading