feat(csharp/src/Drivers/Databricks): Add Activity-based logging to DatabricksStatement#3617
feat(csharp/src/Drivers/Databricks): Add Activity-based logging to DatabricksStatement#3617msrathore-db wants to merge 4 commits intoapache:mainfrom
Conversation
| protected override Schema GetSchemaFromMetadata(TGetResultSetMetadataResp metadata) | ||
| { | ||
| // Log schema parsing decision | ||
| Activity.Current?.SetTag("statement.schema.has_arrow_schema", metadata.__isset.arrowSchema); |
There was a problem hiding this comment.
Does this output logs correctly? Why not using this.TraceActivity?
@birschick-bq What should be the recommended way for trace?
There was a problem hiding this comment.
I was concerned that Activity.Current might not be set to the activity in your current block - i.e., it might have been set for a child call. But in thinking about it, I don't think that would be the case, as long as the child call disposed the Activity in their scope.
However, I think the best usage pattern for Activity.Current is where you don't start a new activity and you don't pass an existing activity as a parameter. That is, a situation in which you might be called from a method that may or may not have a current activity started.
So I think Activity.Current is used appropriately in the case shown in line 93, above.
Use this.TraceActivity when you want a new activity (line) with its own events and tags. Typically something "major" or structural in your code. If the new activity (this.TraceActivity) is a child call, the ParentId will be set to indicate it is a child activity of the parent activity.
Summary
Adds comprehensive Activity-based logging to
DatabricksStatement.csfor improved observability and debugging of Databricks ADBC operations.Changes
Methods with Logging Added
SetStatementProperties: Logs configuration (Arrow native types, CloudFetch settings, async mode)GetSchemaFromMetadata: Logs schema parsing decisions (Arrow vs Thrift, column count)GetCatalogsAsync: Logs catalog queries with feature flags and row countsGetSchemasAsync: Logs schema queries with catalog filteringGetTablesAsync: Logs table queries with SPARK catalog handlingGetColumnsAsync: Logs column queries with search criteriaGetPrimaryKeysAsync: Logs primary key queriesGetCrossReferenceAsync: Logs foreign key queriesGetColumnsExtendedAsync: Logs DESC TABLE EXTENDED queriesSetOption: Logs configuration changesWhat Gets Logged
Tags:
enable_multiple_catalog_support,pk_fk_enabled)db.response.returned_rows)Events:
statement.<operation>.start/statement.<operation>.completecalling_base_implementation,returning_empty_result,fallback_to_base)using_arrow_schema,fallback_to_thrift)Example Log Output
{ "OperationName": "GetCatalogs", "Duration": "00:00:00.582", "TagObjects": { "statement.feature.enable_multiple_catalog_support": true, "db.response.returned_rows": 28 }, "Events": [ { "Name": "statement.get_catalogs.start" }, { "Name": "statement.get_catalogs.calling_base_implementation" }, { "Name": "statement.get_catalogs.complete" } ] }Testing
Tested locally by enabling logging with:
Verified that all tags, events, and distributed tracing context are captured correctly in trace files for all implemented methods (GetCatalogs, GetSchemas, GetTables, GetColumns, and statement configuration).
PR generated by Cursor.