Conversation
WalkthroughAdds OpenTelemetry tracing: new dependencies and patch overrides, a public Changes
Sequence Diagram(s)sequenceDiagram
participant App as Application\nStartup
participant Main as main()
participant Telemetry as telemetry::init_otel_tracer()
participant OTEL as OpenTelemetry\nExporter/Provider
participant Tracing as tracing-subscriber\nRegistry
Main->>Main: call init_logger()
Main->>Telemetry: init_otel_tracer()
Telemetry->>Telemetry: check OTEL_EXPORTER_OTLP_ENDPOINT
Telemetry->>OTEL: create SpanExporter (grpc/http-proto/http-json)
OTEL-->>Telemetry: exporter
Telemetry->>OTEL: build SdkTracerProvider + BatchSpanProcessor
OTEL-->>Telemetry: SdkTracerProvider
Telemetry->>OTEL: set_tracer_provider(global)
Telemetry->>OTEL: set_text_map_propagator(W3C)
Telemetry-->>Main: Some(provider) or None
Main->>Tracing: register tracing_opentelemetry layer (if provider)
Tracing-->>Main: registry ready
sequenceDiagram
participant Client as Client
participant Middleware as TracingLogger
participant Handler as Query Handler\n(instrumented)
participant Executor as Query Executor / QUERY_RUNTIME
participant OTEL as OTEL Exporter
Client->>Middleware: HTTP request
Middleware->>Middleware: create request span
Middleware->>Handler: invoke handler (with span)
Handler->>Executor: call execute(query)
Executor->>Executor: serialize span context into map
Executor->>Executor: spawn thread in QUERY_RUNTIME with context
Executor->>Executor: extract context in spawned task
Executor->>Executor: set parent span and instrument future
Executor->>OTEL: spans emitted to exporter
Executor-->>Handler: query result
Handler-->>Client: HTTP response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/query/mod.rs (2)
81-98: Consider a more descriptive name for this type alias.
RBis cryptic. A name likeQueryResultBatchesorRecordBatchResultwould be more self-documenting and convey that this represents either collected batches or a streaming result.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/query/mod.rs` around lines 81 - 98, The type alias RB is cryptic; rename it to a descriptive identifier (e.g., QueryResultBatches or RecordBatchResult) and update all references accordingly. Locate the type alias declaration "pub type RB = Either<...>" and change the alias name, then update any usages (functions, structs, trait impls) that reference RB to the new name (ensure imports and re-exports are adjusted if RB was pub). Run cargo check/tests to confirm no remaining references to RB.
520-520: Consider consistent span naming with a common prefix.Other spans in this file use a
query.prefix (e.g.,query.execute,query.datafusion_execute). For consistency and easier filtering in trace backends, consider naming thesequery.get_bin_densityandquery.get_manifest_list.🔧 Suggested naming for consistency
-#[tracing::instrument(name = "get_bin_density", skip_all, fields(stream = %self.stream))] +#[tracing::instrument(name = "query.get_bin_density", skip_all, fields(stream = %self.stream))] pub async fn get_bin_density(-#[tracing::instrument(name = "get_manifest_list", skip(time_range, tenant_id), fields(stream = %stream_name))] +#[tracing::instrument(name = "query.get_manifest_list", skip(time_range, tenant_id), fields(stream = %stream_name))] pub async fn get_manifest_list(Also applies to: 726-726
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/query/mod.rs` at line 520, The tracing span names lack the common "query." prefix; update the tracing::instrument attributes for the functions like get_bin_density and get_manifest_list to use names such as "query.get_bin_density" and "query.get_manifest_list" (i.e., change #[tracing::instrument(name = "get_bin_density", ...)] to #[tracing::instrument(name = "query.get_bin_density", ...)] and similarly for get_manifest_list) so span names match the existing query.* convention for consistent filtering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/handlers/http/query.rs`:
- Line 120: The tracing span currently emits the full SQL via the attribute on
the "query" handler (the tracing::instrument on the function using
query_request), which can leak sensitive data; remove the raw query text from
the span fields and instead record a non-sensitive identifier such as a
hash/fingerprint or metadata (e.g., query_hash, param_count, or a boolean for
streaming) derived from query_request.query; update the tracing attribute and
any span field population to use that safe field (compute the hash/fingerprint
in the handler using query_request.query and expose only that value in the span,
or omit the SQL entirely and keep query.streaming if needed).
In `@src/telemetry.rs`:
- Around line 34-35: The OTEL initialization currently only checks for
OTEL_EXPORTER_OTLP_ENDPOINT so a signal-specific env var like
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is ignored; update the gate in the OTEL init
path (where OTEL is enabled/disabled, e.g., in the function that reads env vars
to initialize OTEL) to proceed if either OTEL_EXPORTER_OTLP_ENDPOINT or
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is set, and ensure later logic still prefers
the signal-specific OTLP_TRACES endpoint when constructing the traces exporter
URL; change the boolean check that references only OTEL_EXPORTER_OTLP_ENDPOINT
to an existence check for either env var and preserve existing resolution logic
for trace endpoint selection.
---
Nitpick comments:
In `@src/query/mod.rs`:
- Around line 81-98: The type alias RB is cryptic; rename it to a descriptive
identifier (e.g., QueryResultBatches or RecordBatchResult) and update all
references accordingly. Locate the type alias declaration "pub type RB =
Either<...>" and change the alias name, then update any usages (functions,
structs, trait impls) that reference RB to the new name (ensure imports and
re-exports are adjusted if RB was pub). Run cargo check/tests to confirm no
remaining references to RB.
- Line 520: The tracing span names lack the common "query." prefix; update the
tracing::instrument attributes for the functions like get_bin_density and
get_manifest_list to use names such as "query.get_bin_density" and
"query.get_manifest_list" (i.e., change #[tracing::instrument(name =
"get_bin_density", ...)] to #[tracing::instrument(name =
"query.get_bin_density", ...)] and similarly for get_manifest_list) so span
names match the existing query.* convention for consistent filtering.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 7d2eb9b3-0884-4b11-9ba6-9582033fe14f
📒 Files selected for processing (9)
Cargo.tomlsrc/handlers/http/cluster/mod.rssrc/handlers/http/modal/mod.rssrc/handlers/http/query.rssrc/lib.rssrc/main.rssrc/query/mod.rssrc/storage/field_stats.rssrc/telemetry.rs
ef37f5f to
7701143
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/query/mod.rs (1)
81-98: Consider adding documentation for theRBtype alias.The
RBtype alias encapsulates a complex nested type for query results. While functional, adding a doc comment explaining when each variant (batch vs. streaming) is used would improve maintainability.📝 Suggested documentation
+/// Query result type: batch (`Left`) or streaming (`Right`). +/// +/// - `Left(Vec<RecordBatch>)`: Used when `streaming = false` — all results collected. +/// - `Right(...)`: Used when `streaming = true` — results streamed as they arrive. pub type RB = Either< Vec<RecordBatch>, Pin<🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/query/mod.rs` around lines 81 - 98, The RB type alias lacks documentation: add a Rust doc comment above the pub type RB declaration explaining that RB is an Either containing either a Vec<RecordBatch> (used for fully-materialized in-memory results) or a streaming adapter (Pin<Box<RecordBatchStreamAdapter<select_all::SelectAll<...>>>> used for incremental/streamed results), and describe when each variant is produced and consumed (e.g., small queries vs. large/streamed queries) and any relevant Send/Pin constraints; reference the RB alias and the nested types (RecordBatch, RecordBatchStreamAdapter, select_all::SelectAll, RecordBatchStream) in the comment for clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/query/mod.rs`:
- Around line 81-98: The RB type alias lacks documentation: add a Rust doc
comment above the pub type RB declaration explaining that RB is an Either
containing either a Vec<RecordBatch> (used for fully-materialized in-memory
results) or a streaming adapter
(Pin<Box<RecordBatchStreamAdapter<select_all::SelectAll<...>>>> used for
incremental/streamed results), and describe when each variant is produced and
consumed (e.g., small queries vs. large/streamed queries) and any relevant
Send/Pin constraints; reference the RB alias and the nested types (RecordBatch,
RecordBatchStreamAdapter, select_all::SelectAll, RecordBatchStream) in the
comment for clarity.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 4e0961e6-d501-4552-8134-32054487afbd
📒 Files selected for processing (9)
Cargo.tomlsrc/handlers/http/cluster/mod.rssrc/handlers/http/modal/mod.rssrc/handlers/http/query.rssrc/lib.rssrc/main.rssrc/query/mod.rssrc/storage/field_stats.rssrc/telemetry.rs
✅ Files skipped from review due to trivial changes (4)
- src/lib.rs
- src/handlers/http/modal/mod.rs
- src/storage/field_stats.rs
- src/handlers/http/cluster/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- Cargo.toml
- src/main.rs
7701143 to
8e15a0d
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/query/mod.rs (1)
81-98: Consider a more descriptive name for the type alias.
RBis quite terse and doesn't convey the semantic meaning. A name likeQueryResultBatchesorRecordBatchOutputwould improve readability, especially for developers unfamiliar with this module.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/query/mod.rs` around lines 81 - 98, The type alias currently named RB is too terse; rename the alias RB to a more descriptive identifier such as QueryResultBatches or RecordBatchOutput in src/query/mod.rs and update all usages accordingly (e.g., replace RB with the new name wherever referenced in functions, structs, or trait signatures) so the alias clearly conveys it represents either Vec<RecordBatch> or the boxed RecordBatchStreamAdapter; ensure imports/exports and any pub visibility remain unchanged after renaming.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/query/mod.rs`:
- Around line 81-98: The type alias currently named RB is too terse; rename the
alias RB to a more descriptive identifier such as QueryResultBatches or
RecordBatchOutput in src/query/mod.rs and update all usages accordingly (e.g.,
replace RB with the new name wherever referenced in functions, structs, or trait
signatures) so the alias clearly conveys it represents either Vec<RecordBatch>
or the boxed RecordBatchStreamAdapter; ensure imports/exports and any pub
visibility remain unchanged after renaming.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 3a7f94d3-d818-4f1e-9d4e-61a3be5ea962
📒 Files selected for processing (9)
Cargo.tomlsrc/handlers/http/cluster/mod.rssrc/handlers/http/modal/mod.rssrc/handlers/http/query.rssrc/lib.rssrc/main.rssrc/query/mod.rssrc/storage/field_stats.rssrc/telemetry.rs
✅ Files skipped from review due to trivial changes (4)
- src/lib.rs
- src/handlers/http/cluster/mod.rs
- src/handlers/http/modal/mod.rs
- src/telemetry.rs
🚧 Files skipped from review as they are similar to previous changes (3)
- Cargo.toml
- src/main.rs
- src/handlers/http/query.rs
8e15a0d to
1933219
Compare
Clean compilation with no warnings. Here's a summary of all changes made: ## Files Modified ### 1. `Cargo.toml` - Added dependencies: `tracing-opentelemetry = "0.32"`, `tracing-actix-web = "0.7"`, `opentelemetry`, `opentelemetry_sdk` (with `rt-tokio`), `opentelemetry-otlp` (with `grpc-tonic`, `http-proto`, `http-json`) — all from the same git rev as existing `opentelemetry-proto` - Added `tracing-subscriber` feature `"registry"` - Added `[patch.crates-io]` section to unify `opentelemetry` and `opentelemetry_sdk` types across all crates ### 2. `src/telemetry.rs` (NEW) - `init_otel_tracer() -> Option<SdkTracerProvider>` — reads `OTEL_EXPORTER_OTLP_ENDPOINT` env var; if unset returns `None` (OTel disabled). Supports gRPC, HTTP/protobuf, and HTTP/JSON (default) protocols via `OTEL_EXPORTER_OTLP_PROTOCOL`. Registers W3C `TraceContextPropagator` globally. ### 3. `src/lib.rs` - Added `pub mod telemetry;` ### 4. `src/main.rs` - `init_logger()` now returns `Option<SdkTracerProvider>` and wires the OTel tracing layer into the subscriber - `main()` captures the provider and calls `provider.shutdown()` before exit ### 5. `src/handlers/http/modal/mod.rs` - Replaced `actix_web::middleware::Logger::default()` with `tracing_actix_web::TracingLogger::default()` for automatic HTTP request tracing with W3C traceparent propagation ### 6. `src/handlers/http/query.rs` — 7 functions instrumented - **`query()`** — root span with `query.sql` and `query.streaming` fields - **`get_counts()`** — root span - **`handle_count_query()`** — child span with `table` field - **`handle_non_streaming_query()`** — child span - **`handle_streaming_query()`** — child span - **`into_query()`** — child span - **`get_records_and_fields()`** — child span - **`create_streams_for_distributed()`** — child span with `stream_count` field + Pattern 1 span propagation into `JoinSet::spawn` tasks ### 7. `src/query/mod.rs` — 4 functions instrumented - **`execute()`** — child span + **Pattern 2 W3C TraceContext propagation** across `QUERY_RUNTIME` (separate `Runtime::new()` — cross-OS-thread boundary). Injects context before spawn, extracts and sets parent inside the spawned closure. - **`Query::execute()`** — child span (`query.datafusion_execute`) - **`CountsRequest::get_bin_density()`** — child span with `stream` field - **`get_manifest_list()`** — child span with `stream` field ### 8. `src/storage/field_stats.rs` — 1 function instrumented - **`get_dataset_stats()`** — root span ### 9. `src/handlers/http/cluster/mod.rs` — 1 function instrumented - **`send_query_request()`** — child span Co-authored-by: otex-dev <dev@otex.dev>
1933219 to
5d3b658
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/handlers/http/query.rs (1)
483-503:⚠️ Potential issue | 🟠 MajorPropagate stream-load failures out of the
JoinSet.The spawned task returns
(stream_name, result), but thejoin_next()loop only checksJoinError, so this helper still returnsOk(())when a requested stream failed to load. Both callers use?, so they are expecting a precise failure here rather than a later planning error. MatchOk((stream_name, Err(e)))in the join loop and return that failure to the caller.
Based on learnings: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)ensures proper error handling in both modes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/handlers/http/query.rs` around lines 483 - 503, The join loop currently only checks JoinError and ignores worker task results, so any stream-load failure returned as Ok((stream_name, Err(e))) is swallowed; update the join_set.join_next() handling to match Ok((stream_name, Err(e))) and propagate that Err to the caller (return Err(e) or wrap with context) instead of continuing, while still logging JoinError when result is Err(join_error). Also ensure the spawn closure still returns (stream_name, result) from create_stream_and_schema_from_storage and consider applying the Parseable existence check pattern (!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)) in the relevant logstream handlers so stream existence is validated in both Query and standalone modes.
♻️ Duplicate comments (1)
src/handlers/http/query.rs (1)
120-120:⚠️ Potential issue | 🟠 MajorDo not emit raw SQL into request spans.
query.sql = %query_request.queryexports full statements, including literals and identifiers, into tracing backends. Record only a fingerprint, length, or statement kind here.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/handlers/http/query.rs` at line 120, The tracing span currently emits raw SQL via the attribute fields(…, query.sql = %query_request.query, …); instead replace that with a safe derived value (e.g. a fingerprint, hash, statement kind, or length) and expose that instead of the full text. Compute a deterministic fingerprint or short hash from query_request.query (or extract the statement kind/length) prior to the span and change the tracing field to something like query.fingerprint, query.hash, query.kind or query.length (referencing the computed value) while removing query.sql = %query_request.query from the tracing attribute.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/handlers/http/query.rs`:
- Around line 483-503: The join loop currently only checks JoinError and ignores
worker task results, so any stream-load failure returned as Ok((stream_name,
Err(e))) is swallowed; update the join_set.join_next() handling to match
Ok((stream_name, Err(e))) and propagate that Err to the caller (return Err(e) or
wrap with context) instead of continuing, while still logging JoinError when
result is Err(join_error). Also ensure the spawn closure still returns
(stream_name, result) from create_stream_and_schema_from_storage and consider
applying the Parseable existence check pattern
(!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode !=
Mode::Query ||
!PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)) in the
relevant logstream handlers so stream existence is validated in both Query and
standalone modes.
---
Duplicate comments:
In `@src/handlers/http/query.rs`:
- Line 120: The tracing span currently emits raw SQL via the attribute fields(…,
query.sql = %query_request.query, …); instead replace that with a safe derived
value (e.g. a fingerprint, hash, statement kind, or length) and expose that
instead of the full text. Compute a deterministic fingerprint or short hash from
query_request.query (or extract the statement kind/length) prior to the span and
change the tracing field to something like query.fingerprint, query.hash,
query.kind or query.length (referencing the computed value) while removing
query.sql = %query_request.query from the tracing attribute.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 1ec32b0a-0a06-4d26-b606-002329b8e92c
📒 Files selected for processing (9)
Cargo.tomlsrc/handlers/http/cluster/mod.rssrc/handlers/http/modal/mod.rssrc/handlers/http/query.rssrc/lib.rssrc/main.rssrc/query/mod.rssrc/storage/field_stats.rssrc/telemetry.rs
✅ Files skipped from review due to trivial changes (3)
- src/handlers/http/modal/mod.rs
- Cargo.toml
- src/handlers/http/cluster/mod.rs
🚧 Files skipped from review as they are similar to previous changes (3)
- src/storage/field_stats.rs
- src/main.rs
- src/telemetry.rs
Clean compilation with no warnings. Here's a summary of all changes made:
Files Modified
1.
Cargo.tomltracing-opentelemetry = "0.32",tracing-actix-web = "0.7",opentelemetry,opentelemetry_sdk(withrt-tokio),opentelemetry-otlp(withgrpc-tonic,http-proto,http-json) — all from the same git rev as existingopentelemetry-prototracing-subscriberfeature"registry"[patch.crates-io]section to unifyopentelemetryandopentelemetry_sdktypes across all crates2.
src/telemetry.rs(NEW)init_otel_tracer() -> Option<SdkTracerProvider>— readsOTEL_EXPORTER_OTLP_ENDPOINTenv var; if unset returnsNone(OTel disabled). Supports gRPC, HTTP/protobuf, and HTTP/JSON (default) protocols viaOTEL_EXPORTER_OTLP_PROTOCOL. Registers W3CTraceContextPropagatorglobally.3.
src/lib.rspub mod telemetry;4.
src/main.rsinit_logger()now returnsOption<SdkTracerProvider>and wires the OTel tracing layer into the subscribermain()captures the provider and callsprovider.shutdown()before exit5.
src/handlers/http/modal/mod.rsactix_web::middleware::Logger::default()withtracing_actix_web::TracingLogger::default()for automatic HTTP request tracing with W3C traceparent propagation6.
src/handlers/http/query.rs— 7 functions instrumentedquery()— root span withquery.sqlandquery.streamingfieldsget_counts()— root spanhandle_count_query()— child span withtablefieldhandle_non_streaming_query()— child spanhandle_streaming_query()— child spaninto_query()— child spanget_records_and_fields()— child spancreate_streams_for_distributed()— child span withstream_countfield + Pattern 1 span propagation intoJoinSet::spawntasks7.
src/query/mod.rs— 4 functions instrumentedexecute()— child span + Pattern 2 W3C TraceContext propagation acrossQUERY_RUNTIME(separateRuntime::new()— cross-OS-thread boundary). Injects context before spawn, extracts and sets parent inside the spawned closure.Query::execute()— child span (query.datafusion_execute)CountsRequest::get_bin_density()— child span withstreamfieldget_manifest_list()— child span withstreamfield8.
src/storage/field_stats.rs— 1 function instrumentedget_dataset_stats()— root span9.
src/handlers/http/cluster/mod.rs— 1 function instrumentedsend_query_request()— child spanCo-authored-by: otex-dev dev@otex.dev
Summary by CodeRabbit