Conversation
Auto-generated by otex
Modified files: 2
Coverage: 100% (112/112 endpoints)
Co-authored-by: otex-dev <dev@otex.dev>
Walkthrough
The changes introduce OpenTelemetry-based distributed tracing to the application, adding trace feature dependencies, a new telemetry module for OTEL initialization, and instrumentation of handlers and query paths via `#[tracing::instrument]` attributes.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Actix as Actix Web
    participant TracingLogger
    participant Handler as HTTP Handler
    participant Tracer as Tracing Layer
    participant OTEL as OpenTelemetry
    participant Exporter as OTLP Exporter

    Client->>Actix: HTTP Request
    Actix->>TracingLogger: Request arrives
    TracingLogger->>Tracer: Create root span
    Tracer->>OTEL: Register span context
    TracingLogger->>Handler: Forward request
    Handler->>Tracer: Create handler span (annotated)
    Tracer->>OTEL: Record span fields (tenant_id, stream_name, etc.)
    Handler-->>Handler: Execute logic
    Handler->>Tracer: Record nested spans (query operations)
    Tracer->>OTEL: Update context
    OTEL->>Exporter: Batch spans
    Exporter-->>OTEL: Acknowledge
    Handler-->>TracingLogger: Response ready
    TracingLogger->>Tracer: End root span
    Tracer->>OTEL: Flush span
    Actix-->>Client: HTTP Response
```
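The "Batch spans" step in the diagram (the SDK buffering finished spans and exporting them in groups) can be sketched with plain std types. This is a simplified illustration of the batching idea only, not the actual `opentelemetry_sdk` batch span processor; all names below are hypothetical:

```rust
use std::mem;

/// Minimal stand-in for a finished span.
#[derive(Debug, Clone)]
struct FinishedSpan {
    name: &'static str,
}

/// Toy batch processor: buffers spans and "exports" them when the batch is full.
struct BatchProcessor {
    buffer: Vec<FinishedSpan>,
    max_batch: usize,
    exported_batches: Vec<Vec<FinishedSpan>>, // stands in for the OTLP exporter
}

impl BatchProcessor {
    fn new(max_batch: usize) -> Self {
        Self { buffer: Vec::new(), max_batch, exported_batches: Vec::new() }
    }

    /// Called when a span ends; flushes once the buffer reaches the batch size.
    fn on_end(&mut self, span: FinishedSpan) {
        self.buffer.push(span);
        if self.buffer.len() >= self.max_batch {
            self.flush();
        }
    }

    /// Export whatever is buffered (e.g. when the root span ends at shutdown).
    fn flush(&mut self) {
        if !self.buffer.is_empty() {
            self.exported_batches.push(mem::take(&mut self.buffer));
        }
    }
}

fn main() {
    let mut processor = BatchProcessor::new(2);
    for name in ["http.request", "datafusion.dispatch", "datafusion.execute"] {
        processor.on_end(FinishedSpan { name });
    }
    processor.flush(); // end of request: flush the remainder
    assert_eq!(processor.exported_batches.len(), 2); // one full batch + one remainder
    println!("exported batches: {:?}", processor.exported_batches);
}
```

The point of the sketch is that spans do not leave the process one at a time: the exporter sees batches, which is why the diagram shows an explicit flush when the root span ends.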
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 3
🧹 Nitpick comments (1)
Cargo.toml (1)
111-115: Add structured tracking for OpenTelemetry fork migration.
The codebase uses a fork (`parmesant/opentelemetry-rust`) for `opentelemetry`, `opentelemetry_sdk`, `opentelemetry-otlp`, and `opentelemetry-proto` (revision pinned to `45fb828769e6ade96d56ca1f5fa14cf0986a5341`). While line 99 notes this is a temporary measure pending upstream merge, the tracking could be more explicit. Consider adding a TODO comment referencing the upstream PR/issue number or creating a tracking issue to ensure the migration is not overlooked as the codebase evolves.
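As one concrete shape for that suggestion, a hedged sketch of an annotated dependency entry in Cargo.toml. The `git`/`rev` syntax is standard Cargo; the GitHub URL is inferred from the fork name given above, and the tracking link is deliberately left as a placeholder rather than an invented issue number:

```toml
# TODO(otel-fork): temporary fork pending upstream merge; revert to the
# upstream opentelemetry crates once it lands.
# Tracking: <upstream PR/issue URL or internal tracking issue ID>
opentelemetry = { git = "https://github.com/parmesant/opentelemetry-rust", rev = "45fb828769e6ade96d56ca1f5fa14cf0986a5341" }
```

The same comment block would sit above the sibling `opentelemetry_sdk`, `opentelemetry-otlp`, and `opentelemetry-proto` entries, so any of the four lines found in a grep leads a maintainer to the tracking reference.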
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Cargo.toml` around lines 111 - 115, The Cargo.toml dependency entries for opentelemetry, opentelemetry_sdk, opentelemetry-otlp (the forked parmsant/opentelemetry-rust rev 45fb8287) lack an explicit tracking note; add a clear TODO comment near these entries referencing the upstream PR/issue number (or a freshly created internal tracking issue) and the fork rev so future maintainers know to revert to the upstream crates once the merge lands, e.g., annotate the opentelemetry, opentelemetry_sdk, and opentelemetry-otlp dependency lines with the TODO and the PR/issue URL or internal tracking ID.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/query/mod.rs`:
- Around line 139-143: The tracing span declared by #[tracing::instrument(name =
"datafusion.dispatch", skip(query, tenant_id), fields(...))] leaves the span
fields is_streaming and stream_name unassigned; update the attribute to assign
those fields from the corresponding local variables (e.g., fields(is_streaming =
%is_streaming, stream_name = %stream_name)) so the span captures their values
for the function/method where the tracing::instrument attribute is applied.
- Around line 180-183: The current injection uses the global OTel context which
can omit the active tracing span; replace the use of
opentelemetry::Context::current() with the active tracing span's context
obtained via tracing::Span::current().context() (OpenTelemetrySpanExt) when
calling opentelemetry::global::get_text_map_propagator to inject into trace_cx
so the datafusion.dispatch parent is preserved; update the closure accordingly
where trace_cx, opentelemetry::global::get_text_map_propagator, and
tracing::Span::current() are referenced.
In `@src/telemetry.rs`:
- Around line 43-58: The match over OTEL_EXPORTER_OTLP_PROTOCOL currently treats
any non-"grpc" value as HttpJson and silences builder errors via .ok()?, so
update the code to (1) read OTEL_EXPORTER_OTLP_PROTOCOL with an explicit default
string like "http/json" instead of unwrap_or_default(), (2) expand the match on
protocol.as_str() to include "http/protobuf" -> use Protocol::HttpBinary and an
explicit "http/json" -> Protocol::HttpJson arm, (3) add a catch-all arm that
logs or returns an error for unknown protocol values rather than coercing them,
and (4) stop swallowing builder errors by returning or logging the builder
Result (remove .ok()? and propagate or map_err the builder error) so failures
are distinguishable from telemetry being disabled; locate changes around the
protocol variable, the match block, SpanExporter::builder(), and uses of
Protocol::HttpJson/Protocol::HttpBinary.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: c1637193-0604-4110-b657-be578c63f4a8
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (11)
Cargo.toml, src/handlers/http/logstream.rs, src/handlers/http/modal/mod.rs, src/handlers/http/query.rs, src/handlers/http/rbac.rs, src/handlers/http/role.rs, src/lib.rs, src/main.rs, src/parseable/mod.rs, src/query/mod.rs, src/telemetry.rs
```rust
#[tracing::instrument(
    name = "datafusion.dispatch",
    skip(query, tenant_id),
    fields(is_streaming)
)]
```
🧩 Analysis chain

🏁 Script executed:
```shell
wc -l src/query/mod.rs
```
Repository: parseablehq/parseable
Length of output: 86

🏁 Script executed:
```shell
# Check lines around 139-143
sed -n '130,155p' src/query/mod.rs
```
Repository: parseablehq/parseable
Length of output: 842

🏁 Script executed:
```shell
# Check lines around 764-768
sed -n '755,780p' src/query/mod.rs
```
Repository: parseablehq/parseable
Length of output: 830

🏁 Script executed:
```shell
# Search for span.record or Span::record in the execute function and surrounding context
sed -n '139,300p' src/query/mod.rs | grep -n "span\|record"
```
Repository: parseablehq/parseable
Length of output: 339

🏁 Script executed:
```shell
# Search for span.record in the get_manifest_list function
sed -n '764,850p' src/query/mod.rs | grep -n "span\|record"
```
Repository: parseablehq/parseable
Length of output: 47

🏁 Script executed:
```shell
# Let me also check the tracing documentation behavior by searching for any span.record patterns in the entire file
rg "span\.record|Span::record" src/query/mod.rs
```
Repository: parseablehq/parseable
Length of output: 89

🏁 Script executed:
```shell
# Get broader context of execute function starting from line 139
sed -n '139,250p' src/query/mod.rs
```
Repository: parseablehq/parseable
Length of output: 4302

🏁 Script executed:
```shell
# Also check what span.record is being called on at the location found earlier
sed -n '190,210p' src/query/mod.rs
```
Repository: parseablehq/parseable
Length of output: 937

🏁 Script executed:
```shell
# Get the complete get_manifest_list function body
sed -n '764,850p' src/query/mod.rs
```
Repository: parseablehq/parseable
Length of output: 3071
Assign values to the span fields is_streaming and stream_name in the #[tracing::instrument] attributes.
Both `fields(is_streaming)` and `fields(stream_name)` list fields without assigning values, which only declares empty fields on the span; the values are never recorded. Explicitly assign the values:
Suggested fixes

```diff
 #[tracing::instrument(
     name = "datafusion.dispatch",
     skip(query, tenant_id),
-    fields(is_streaming)
+    fields(is_streaming = is_streaming)
 )]
```

```diff
 #[tracing::instrument(
     name = "catalog.manifest_list",
     skip(time_range, tenant_id),
-    fields(stream_name)
+    fields(stream_name = %stream_name)
 )]
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-#[tracing::instrument(
-    name = "datafusion.dispatch",
-    skip(query, tenant_id),
-    fields(is_streaming)
-)]
+#[tracing::instrument(
+    name = "datafusion.dispatch",
+    skip(query, tenant_id),
+    fields(is_streaming = is_streaming)
+)]
```
```rust
let mut trace_cx: HashMap<String, String> = HashMap::new();
opentelemetry::global::get_text_map_propagator(|prop| {
    prop.inject_context(&opentelemetry::Context::current(), &mut trace_cx);
});
```
🧩 Analysis chain

🏁 Script executed:
```shell
# First, let's look at the specific file and lines mentioned
cd src && head -n 200 query/mod.rs | tail -n 40
```
Repository: parseablehq/parseable
Length of output: 1819

🏁 Script executed:
```shell
# Get more context around lines 180-183
sed -n '170,190p' src/query/mod.rs
```
Repository: parseablehq/parseable
Length of output: 995

🏁 Script executed:
```shell
# Search for QUERY_RUNTIME to understand the execution model
rg "QUERY_RUNTIME" --max-count 20
```
Repository: parseablehq/parseable
Length of output: 454

🏁 Script executed:
```shell
# Check for tracing-opentelemetry imports and usage patterns
rg "tracing.*opentelemetry|OpenTelemetrySpanExt" -A 2 -B 2
```
Repository: parseablehq/parseable
Length of output: 1339

🏁 Script executed:
```shell
# Look at the Cargo.toml to see what versions are being used
cat Cargo.toml | grep -A 5 "\[dependencies\]" | head -n 30
```
Repository: parseablehq/parseable
Length of output: 220

🏁 Script executed:
```shell
# Check if tracing::Span::current() is used elsewhere in the codebase
rg "Span::current()"
```
Repository: parseablehq/parseable
Length of output: 1024

🏁 Script executed:
```shell
# Look for other uses of context extraction in the codebase
rg "\.context\(\)" -B 2 -A 2
```
Repository: parseablehq/parseable
Length of output: 714

🏁 Script executed:
```shell
# Check what OpenTelemetrySpanExt provides and if it's properly imported
rg "OpenTelemetrySpanExt" -A 5 -B 5
```
Repository: parseablehq/parseable
Length of output: 628
🌐 Web query:
tracing-opentelemetry Span::current().context() OpenTelemetrySpanExt 0.32

💡 Result:
In tracing-opentelemetry = 0.32.x, Span::current().context() is not an inherent tracing::Span method—it comes from the extension trait OpenTelemetrySpanExt. Bring the trait into scope:

```rust
use tracing::Span;
use tracing_opentelemetry::OpenTelemetrySpanExt as _;

let cx = Span::current().context(); // opentelemetry::Context
```

context() returns an opentelemetry::Context you can inject into outbound requests, etc. (docs.rs)

If you specifically need IDs, you can get them from the returned context's span context:

```rust
use tracing::Span;
use tracing_opentelemetry::OpenTelemetrySpanExt as _;
use opentelemetry::trace::TraceContextExt as _;

let cx = Span::current().context();
let sc = cx.span().span_context();
let trace_id = sc.trace_id();
let span_id = sc.span_id();
```

(The TraceContextExt usage is shown in the crate docs/examples.) (docs.rs)

Citations:
- 1: https://docs.rs/tracing-opentelemetry/latest/tracing_opentelemetry/trait.OpenTelemetrySpanExt.html
- 2: https://docs.rs/tracing-opentelemetry/latest/tracing_opentelemetry/trait.OpenTelemetrySpanExt.html
Use the active tracing span's OTel context instead of the global context here.
In tracing-opentelemetry, the active tracing span's context (obtained via tracing::Span::current().context() through the OpenTelemetrySpanExt trait) may differ from the global opentelemetry::Context::current(). When injecting context before crossing the QUERY_RUNTIME boundary, using the global context instead of the span context can lose the datafusion.dispatch parent, causing datafusion.execute to appear as a new root span.
Since OpenTelemetrySpanExt is already imported, update the injection to use the active span's context:
Suggested fix

```diff
+let current_cx = tracing::Span::current().context();
 let mut trace_cx: HashMap<String, String> = HashMap::new();
 opentelemetry::global::get_text_map_propagator(|prop| {
-    prop.inject_context(&opentelemetry::Context::current(), &mut trace_cx);
+    prop.inject_context(&current_cx, &mut trace_cx);
 });
```
});📝 Committable suggestion
```diff
-let mut trace_cx: HashMap<String, String> = HashMap::new();
-opentelemetry::global::get_text_map_propagator(|prop| {
-    prop.inject_context(&opentelemetry::Context::current(), &mut trace_cx);
-});
+let current_cx = tracing::Span::current().context();
+let mut trace_cx: HashMap<String, String> = HashMap::new();
+opentelemetry::global::get_text_map_propagator(|prop| {
+    prop.inject_context(&current_cx, &mut trace_cx);
+});
```
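As a standalone illustration of why the injected context matters: text-map propagation serializes the active span's identity into a carrier entry such as the W3C `traceparent` header, and the receiving side parses it back to re-parent its spans. If the wrong context is injected, the receiver reconstructs the wrong (or no) parent and starts a new root trace. The sketch below mimics that inject/extract round trip across a task boundary using only std; the types and function names are hypothetical stand-ins, not OpenTelemetry API:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a span context: just a trace id and span id.
#[derive(Debug, Clone, PartialEq)]
struct SpanCtx {
    trace_id: u128,
    span_id: u64,
}

/// Inject: serialize the context into a W3C-style `traceparent` carrier entry.
fn inject(ctx: &SpanCtx, carrier: &mut HashMap<String, String>) {
    carrier.insert(
        "traceparent".to_owned(),
        format!("00-{:032x}-{:016x}-01", ctx.trace_id, ctx.span_id),
    );
}

/// Extract: parse the carrier back into a context on the other side of the boundary.
fn extract(carrier: &HashMap<String, String>) -> Option<SpanCtx> {
    let value = carrier.get("traceparent")?;
    let mut parts = value.split('-');
    let _version = parts.next()?;
    let trace_id = u128::from_str_radix(parts.next()?, 16).ok()?;
    let span_id = u64::from_str_radix(parts.next()?, 16).ok()?;
    Some(SpanCtx { trace_id, span_id })
}

fn main() {
    // "datafusion.dispatch" side: inject the *active* span's context...
    let dispatch_ctx = SpanCtx { trace_id: 0xabc, span_id: 0x42 };
    let mut trace_cx: HashMap<String, String> = HashMap::new();
    inject(&dispatch_ctx, &mut trace_cx);

    // ...so that across the runtime boundary the parent is recovered, not lost.
    let recovered = extract(&trace_cx).expect("carrier should round-trip");
    assert_eq!(recovered, dispatch_ctx);
    println!("recovered parent: {recovered:?}");
}
```

In the real code the review targets, `inject` corresponds to `prop.inject_context(...)` before crossing into QUERY_RUNTIME, and the extract step happens on the spawned task; the bug is only about which context gets injected, not the mechanics of the carrier.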
```rust
let protocol = std::env::var("OTEL_EXPORTER_OTLP_PROTOCOL").unwrap_or_default();

let exporter = match protocol.as_str() {
    "grpc" => SpanExporter::builder()
        .with_tonic()
        .with_endpoint(&endpoint)
        .build()
        .ok()?,
    // HTTP/JSON is the default — required for Parseable OSS which only
    // accepts application/json at /v1/traces.
    _ => SpanExporter::builder()
        .with_http()
        .with_protocol(Protocol::HttpJson)
        .with_endpoint(&endpoint)
        .build()
        .ok()?,
};
```
🧩 Analysis chain

🏁 Script executed:
```shell
# First, check if src/telemetry.rs exists and read the relevant section
fd telemetry.rs
```
Repository: parseablehq/parseable
Length of output: 81

🏁 Script executed:
```shell
# Let's examine the file to see the actual code
if [ -f "src/telemetry.rs" ]; then
  wc -l src/telemetry.rs
fi
```
Repository: parseablehq/parseable
Length of output: 84

🏁 Script executed:
```shell
# Check the exact code around lines 43-58
if [ -f "src/telemetry.rs" ]; then
  sed -n '40,65p' src/telemetry.rs
fi
```
Repository: parseablehq/parseable
Length of output: 979

🏁 Script executed:
```shell
# Search for Protocol enum usage in the codebase to understand available options
rg "Protocol::" -A 2 -B 2
```
Repository: parseablehq/parseable
Length of output: 4053
🌐 Web query:
OpenTelemetry OTLP exporter protocol options http/protobuf vs http/json grpc

💡 Result:
OTLP exporter protocol options

1) grpc (OTLP/gRPC)
- Transport/encoding: Protobuf over gRPC (HTTP/2). [2]
- Default port: 4317. [4]
- When to use: Best performance/efficiency; good for service-to-service inside controlled networks.
- Gotchas: gRPC can be harder through some proxies/LBs, and is not supported from browsers in typical OTel JS setups. [5]

2) http/protobuf (OTLP/HTTP + Protobuf)
- Transport/encoding: Protobuf over plain HTTP (HTTP/1.1 or HTTP/2). [2]
- Default port: 4318. [4]
- When to use: Most compatible "works everywhere" option (proxies, gateways, serverless); recommended default by spec unless an SDK has reasons to default to gRPC. [2]
- Endpoint shape: Typically POST to /v1/traces, /v1/metrics, /v1/logs on the HTTP endpoint. [2]

3) http/json (OTLP/HTTP + JSON)
- Transport/encoding: Same OTLP Protobuf schema, but payload is encoded as JSON over HTTP. [3]
- When to use: Easier to debug/inspect; commonly used for browser exporting (when allowed by CORS/CSP). [5]
- Gotchas: Usually larger payloads and more CPU than Protobuf.

How you select it (common SDK env vars)
OTEL_EXPORTER_OTLP_PROTOCOL=grpc|http/protobuf|http/json [1]
(or per-signal: OTEL_EXPORTER_OTLP_TRACES_PROTOCOL, ..._METRICS_PROTOCOL, ..._LOGS_PROTOCOL). [1][2]

Sources
[1] https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/
[2] https://opentelemetry.io/docs/specs/otel/protocol/exporter/
[3] https://opentelemetry.io/docs/specs/otlp/
[4] https://opentelemetry.io/docs/specs/otlp/ (default ports section)
[5] https://opentelemetry.io/docs/languages/js/exporters/
🏁 Script executed:
```shell
# Check Cargo.toml for opentelemetry_otlp version
fd Cargo.toml | head -5
```
Repository: parseablehq/parseable
Length of output: 75

🏁 Script executed:
```shell
cat Cargo.toml 2>/dev/null | grep -A 5 "opentelemetry"
```
Repository: parseablehq/parseable
Length of output: 1391

🏁 Script executed:
```shell
# Search for Protocol import and definition
rg "use.*Protocol|enum Protocol" -B 2 -A 5
```
Repository: parseablehq/parseable
Length of output: 442
🌐 Web query:
opentelemetry_otlp Protocol enum HttpBinary variant Rust

💡 Result:
In the Rust crate opentelemetry-otlp, the protocol enum is:

```rust
opentelemetry_otlp::Protocol::{Grpc, HttpBinary, HttpJson}
```

HttpBinary means OTLP over HTTP with binary Protobuf payloads (i.e., OTLP/HTTP + protobuf). [1]

Use it like:

```rust
use opentelemetry_otlp::Protocol;
// ...
.with_protocol(Protocol::HttpBinary)
```

Environment-variable equivalent: set OTEL_EXPORTER_OTLP_PROTOCOL to http/protobuf (the spec default is typically http/protobuf unless an SDK chooses otherwise). [2][3]

Sources:
[1] opentelemetry_otlp::Protocol docs (variants + meaning) (docs.rs)
[2] OTEL_EXPORTER_OTLP_PROTOCOL constant docs (accepted values) (docs.rs)
[3] OpenTelemetry OTLP exporter spec (protocol guidance/default) (opentelemetry.io)

Citations:
- 1: https://docs.rs/opentelemetry-otlp/latest/opentelemetry_otlp/enum.Protocol.html?utm_source=openai
- 2: https://docs.rs/opentelemetry-otlp/latest/opentelemetry_otlp/constant.OTEL_EXPORTER_OTLP_PROTOCOL.html?utm_source=openai
- 3: https://opentelemetry.io/docs/specs/otel/protocol/exporter/?utm_source=openai
Handle non-grpc OTLP protocols explicitly.
The current code silently coerces any non-grpc protocol to HttpJson. Since http/protobuf is a standard OTEL_EXPORTER_OTLP_PROTOCOL value supported by the opentelemetry-otlp crate (via Protocol::HttpBinary), configurations using this protocol are misinterpreted. Additionally, .ok()? hides builder errors, making failures indistinguishable from "telemetry disabled."
Explicitly handle http/protobuf and log or reject unknown values. If http/json is intentional for Parseable OSS compatibility, make this clear by using an explicit default rather than unwrap_or_default().
Suggested fix

```diff
-    let protocol = std::env::var("OTEL_EXPORTER_OTLP_PROTOCOL").unwrap_or_default();
+    let protocol = std::env::var("OTEL_EXPORTER_OTLP_PROTOCOL")
+        .unwrap_or_else(|_| "http/json".to_owned());
     let exporter = match protocol.as_str() {
         "grpc" => SpanExporter::builder()
             .with_tonic()
             .with_endpoint(&endpoint)
-            .build()
-            .ok()?,
-        // HTTP/JSON is the default — required for Parseable OSS which only
-        // accepts application/json at /v1/traces.
-        _ => SpanExporter::builder()
+            .build(),
+        "http/protobuf" => SpanExporter::builder()
             .with_http()
-            .with_protocol(Protocol::HttpJson)
+            .with_protocol(Protocol::HttpBinary)
             .with_endpoint(&endpoint)
-            .build()
-            .ok()?,
-    };
+            .build(),
+        "http/json" | "" => SpanExporter::builder()
+            .with_http()
+            .with_protocol(Protocol::HttpJson)
+            .with_endpoint(&endpoint)
+            .build(),
+        other => {
+            tracing::error!("Unsupported OTLP protocol: {other}");
+            return None;
+        }
+    }
+    .map_err(|err| {
+        tracing::error!("Failed to initialize OTLP exporter: {err}");
+        err
+    })
+    .ok()?;
```
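The protocol-selection logic proposed in the suggested fix can be isolated into a pure function, which makes the explicit-handling behavior easy to unit test on its own. This is a hedged sketch: `WireProtocol` and `parse_otlp_protocol` are illustrative names, not the actual opentelemetry-otlp API (which exposes `Protocol::{Grpc, HttpBinary, HttpJson}`):

```rust
/// Illustrative stand-in for opentelemetry_otlp::Protocol.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WireProtocol {
    Grpc,
    HttpBinary, // OTLP/HTTP + protobuf
    HttpJson,   // OTLP/HTTP + JSON
}

/// Map OTEL_EXPORTER_OTLP_PROTOCOL to a protocol, rejecting unknown values
/// instead of silently coercing them to HTTP/JSON.
fn parse_otlp_protocol(raw: &str) -> Result<WireProtocol, String> {
    match raw {
        "grpc" => Ok(WireProtocol::Grpc),
        "http/protobuf" => Ok(WireProtocol::HttpBinary),
        // Empty means "unset": default explicitly to HTTP/JSON (the Parseable
        // OSS ingest endpoint only accepts application/json at /v1/traces).
        "http/json" | "" => Ok(WireProtocol::HttpJson),
        other => Err(format!("unsupported OTLP protocol: {other}")),
    }
}

fn main() {
    assert_eq!(parse_otlp_protocol("grpc"), Ok(WireProtocol::Grpc));
    assert_eq!(parse_otlp_protocol("http/protobuf"), Ok(WireProtocol::HttpBinary));
    assert_eq!(parse_otlp_protocol(""), Ok(WireProtocol::HttpJson));
    assert!(parse_otlp_protocol("thrift").is_err());
    println!("all protocol mappings behave as expected");
}
```

Structuring it this way keeps the environment lookup and exporter construction thin wrappers around a function whose error path (the `Err` arm) is distinguishable from "telemetry disabled", which is exactly the distinction the review asks for.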
Summary by CodeRabbit

Release Notes
New environment variables (`OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_EXPORTER_OTLP_PROTOCOL`) to support both gRPC and HTTP OTLP export protocols for external trace collection.