diff --git a/README.md b/README.md index 5b79d26..f884850 100644 --- a/README.md +++ b/README.md @@ -14,22 +14,17 @@ Botanu adds **runs** on top of distributed tracing. A run represents a single bu ## Quick Start ```python -from botanu import enable, botanu_use_case, emit_outcome +from botanu import enable, botanu_use_case enable(service_name="my-app") -@botanu_use_case(name="Customer Support") -async def handle_ticket(ticket_id: str): - # All LLM calls, DB queries, and HTTP requests inside - # are automatically instrumented and linked to this run - context = await fetch_context(ticket_id) - response = await generate_response(context) - emit_outcome("success", value_type="tickets_resolved", value_amount=1) - return response +@botanu_use_case(name="process_order") +def process_order(order_id: str): + order = db.get_order(order_id) + result = llm.analyze(order) + return result ``` -That's it. All operations within the use case are automatically tracked. - ## Installation ```bash @@ -57,14 +52,22 @@ No manual instrumentation required. ## Kubernetes Deployment -For large-scale deployments, use zero-code instrumentation via OTel Operator: +For large-scale deployments (2000+ services): + +| Service Type | Code Change | Kubernetes Config | +|--------------|-------------|-------------------| +| Entry point | `@botanu_use_case` decorator | Annotation | +| Intermediate | None | Annotation only | ```yaml +# Intermediate services - annotation only, no code changes metadata: annotations: instrumentation.opentelemetry.io/inject-python: "true" ``` +Auto-instrumentation captures all HTTP calls including retries (requests, httpx, aiohttp, urllib3). + See [Kubernetes Deployment Guide](./docs/integration/kubernetes.md) for details. 
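For the Kubernetes annotation to take effect, it must sit on the Pod template metadata (where the OTel Operator's admission webhook looks when pods are created), not on the top-level Deployment metadata. A minimal sketch of a full manifest; resource and image names are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service            # illustrative
spec:
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
      annotations:
        # Triggers the OTel Operator's Python auto-instrumentation injection
        instrumentation.opentelemetry.io/inject-python: "true"
    spec:
      containers:
        - name: app
          image: my-service:latest   # illustrative
```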
## Documentation diff --git a/docs/api/decorators.md b/docs/api/decorators.md index 71e1b9a..98e5c93 100644 --- a/docs/api/decorators.md +++ b/docs/api/decorators.md @@ -10,11 +10,8 @@ from botanu import botanu_use_case @botanu_use_case( name: str, workflow: Optional[str] = None, - *, environment: Optional[str] = None, tenant_id: Optional[str] = None, - auto_outcome_on_success: bool = True, - span_kind: SpanKind = SpanKind.SERVER, ) ``` @@ -22,70 +19,32 @@ from botanu import botanu_use_case | Parameter | Type | Default | Description | |-----------|------|---------|-------------| -| `name` | `str` | Required | Use case name (e.g., "Customer Support"). Low cardinality for grouping. | -| `workflow` | `str` | Function name | Workflow identifier. Defaults to the decorated function's qualified name. | -| `environment` | `str` | From env | Deployment environment (production, staging, etc.). | -| `tenant_id` | `str` | `None` | Tenant identifier for multi-tenant systems. | -| `auto_outcome_on_success` | `bool` | `True` | Automatically emit "success" outcome if function completes without exception. | -| `span_kind` | `SpanKind` | `SERVER` | OpenTelemetry span kind. | +| `name` | `str` | Required | Use case name for grouping | +| `workflow` | `str` | Function name | Workflow identifier | +| `environment` | `str` | From env | Deployment environment | +| `tenant_id` | `str` | `None` | Tenant identifier for multi-tenant systems | -### Behavior - -1. **Generates UUIDv7 `run_id`** - Sortable, globally unique identifier -2. **Creates root span** - Named `botanu.run/{name}` -3. **Emits events** - `botanu.run.started` and `botanu.run.completed` -4. **Sets baggage** - Propagates context via W3C Baggage -5. 
**Records outcome** - On completion or exception - -### Examples - -#### Basic Usage - -```python -@botanu_use_case("Customer Support") -async def handle_ticket(ticket_id: str): - result = await process_ticket(ticket_id) - emit_outcome("success", value_type="tickets_resolved", value_amount=1) - return result -``` - -#### With All Parameters +### Example ```python -@botanu_use_case( - name="Document Processing", - workflow="pdf_extraction", - environment="production", - tenant_id="acme-corp", - auto_outcome_on_success=False, - span_kind=SpanKind.CONSUMER, -) -async def process_document(doc_id: str): - ... -``` - -#### Sync Functions +from botanu import botanu_use_case -```python -@botanu_use_case("Batch Processing") -def process_batch(batch_id: str): - # Works with sync functions too - return process_items(batch_id) +@botanu_use_case(name="process_order") +def process_order(order_id: str): + order = db.get_order(order_id) + result = llm.analyze(order) + return result ``` ### Span Attributes -The decorator sets these span attributes: - -| Attribute | Source | -|-----------|--------| +| Attribute | Description | +|-----------|-------------| | `botanu.run_id` | Generated UUIDv7 | | `botanu.use_case` | `name` parameter | | `botanu.workflow` | `workflow` parameter or function name | -| `botanu.workflow_version` | SHA256 hash of function source | -| `botanu.environment` | `environment` parameter or env var | -| `botanu.tenant_id` | `tenant_id` parameter (if provided) | -| `botanu.parent_run_id` | Parent run ID (if nested) | +| `botanu.environment` | Deployment environment | +| `botanu.tenant_id` | Tenant identifier (if provided) | ### Alias @@ -94,115 +53,48 @@ The decorator sets these span attributes: ```python from botanu import use_case -@use_case("My Use Case") -async def my_function(): - ... 
+@use_case(name="process_order") +def process_order(order_id: str): + return db.get_order(order_id) ``` ---- - ## @botanu_outcome -Convenience decorator for sub-functions to emit outcomes based on success/failure. +Decorator for sub-functions to emit outcomes based on success/failure. ```python from botanu import botanu_outcome -@botanu_outcome( - success: Optional[str] = None, - partial: Optional[str] = None, - failed: Optional[str] = None, -) +@botanu_outcome() +def extract_data(): + return fetch_from_source() ``` -### Parameters - -| Parameter | Type | Default | Description | -|-----------|------|---------|-------------| -| `success` | `str` | `None` | Custom label for success outcome (reserved for future use). | -| `partial` | `str` | `None` | Custom label for partial outcome (reserved for future use). | -| `failed` | `str` | `None` | Custom label for failed outcome (reserved for future use). | - -### Behavior - -- **Does NOT create a new run** - Works within an existing run -- **Emits "success"** if function completes without exception -- **Emits "failed"** with exception class name if exception raised -- **Skips emission** if outcome already set on current span +- Emits "success" on completion +- Emits "failed" with exception class name if exception raised +- Does NOT create a new run ### Example ```python from botanu import botanu_use_case, botanu_outcome -@botanu_use_case("Data Pipeline") -async def run_pipeline(): - await extract_data() - await transform_data() - await load_data() +@botanu_use_case(name="data_pipeline") +def run_pipeline(): + extract_data() + transform_data() + load_data() @botanu_outcome() -async def extract_data(): - # Emits "success" on completion - return await fetch_from_source() +def extract_data(): + return fetch_from_source() @botanu_outcome() -async def transform_data(): - # Emits "failed" with reason if exception - return await apply_transformations() -``` - ---- - -## Function Signatures - -### Async Support - -Both decorators 
support async and sync functions: - -```python -# Async -@botanu_use_case("Async Use Case") -async def async_handler(): - await do_work() - -# Sync -@botanu_use_case("Sync Use Case") -def sync_handler(): - do_work() -``` - -### Return Values - -Decorated functions preserve their return values: - -```python -@botanu_use_case("Processing") -async def process(data) -> ProcessResult: - return ProcessResult(status="complete", items=100) - -result = await process(data) -assert isinstance(result, ProcessResult) -``` - -### Exception Handling - -Exceptions are recorded and re-raised: - -```python -@botanu_use_case("Risky Operation") -async def risky(): - raise ValueError("Something went wrong") - -try: - await risky() -except ValueError: - # Exception is re-raised after recording - pass +def transform_data(): + return apply_transformations() ``` ## See Also -- [Quickstart](../getting-started/quickstart.md) - Getting started -- [Run Context](../concepts/run-context.md) - Understanding runs -- [Outcomes](../tracking/outcomes.md) - Recording outcomes +- [Quickstart](../getting-started/quickstart.md) +- [Run Context](../concepts/run-context.md) diff --git a/docs/getting-started/quickstart.md b/docs/getting-started/quickstart.md index df9a510..b3190ed 100644 --- a/docs/getting-started/quickstart.md +++ b/docs/getting-started/quickstart.md @@ -1,166 +1,72 @@ # Quickstart -Get run-level cost attribution working in 5 minutes. +Get run-level cost attribution working in minutes. 
## Prerequisites - Python 3.9+ -- Botanu SDK installed (`pip install "botanu[sdk]"`) -- OpenTelemetry Collector running (see [Collector Configuration](../integration/collector.md)) +- OpenTelemetry Collector (see [Collector Configuration](../integration/collector.md)) -## Step 1: Enable the SDK +## Step 1: Install -At application startup, enable Botanu: - -```python -from botanu import enable - -enable(service_name="my-ai-service") +```bash +pip install "botanu[all]" ``` -This: -- Configures OpenTelemetry with OTLP export -- Adds the `RunContextEnricher` span processor -- Enables W3C Baggage propagation - -## Step 2: Define a Use Case - -Wrap your entry point with `@botanu_use_case`: +## Step 2: Enable ```python -from botanu import botanu_use_case, emit_outcome - -@botanu_use_case("Customer Support") -async def handle_support_ticket(ticket_id: str): - # Your business logic here - context = await fetch_ticket_context(ticket_id) - response = await generate_response(context) - await send_response(ticket_id, response) - - # Record the business outcome - emit_outcome("success", value_type="tickets_resolved", value_amount=1) - return response -``` - -Every operation inside this function (LLM calls, database queries, HTTP requests) will be automatically linked to the same `run_id`. 
- -## Step 3: Track LLM Calls - -For manual LLM tracking (when auto-instrumentation isn't available): +from botanu import enable -```python -from botanu.tracking.llm import track_llm_call - -@botanu_use_case("Document Analysis") -async def analyze_document(doc_id: str): - document = await fetch_document(doc_id) - - with track_llm_call(provider="openai", model="gpt-4") as tracker: - response = await openai.chat.completions.create( - model="gpt-4", - messages=[{"role": "user", "content": document}] - ) - tracker.set_tokens( - input_tokens=response.usage.prompt_tokens, - output_tokens=response.usage.completion_tokens, - ) - tracker.set_request_id(response.id) - - emit_outcome("success", value_type="documents_analyzed", value_amount=1) - return response.choices[0].message.content +enable(service_name="my-service") ``` -## Step 4: Track Data Operations - -Track database and storage operations for complete cost visibility: +## Step 3: Define Entry Point ```python -from botanu.tracking.data import track_db_operation, track_storage_operation +from botanu import botanu_use_case -@botanu_use_case("Data Pipeline") -async def process_data(job_id: str): - # Track database reads - with track_db_operation(system="postgresql", operation="SELECT") as db: - rows = await fetch_records(job_id) - db.set_result(rows_returned=len(rows)) - - # Track storage writes - with track_storage_operation(system="s3", operation="PUT") as storage: - await upload_results(job_id, rows) - storage.set_result(bytes_written=len(rows) * 1024) - - emit_outcome("success", value_type="jobs_processed", value_amount=1) +@botanu_use_case(name="process_order") +def process_order(order_id: str): + order = db.get_order(order_id) + result = llm.analyze(order) + return result ``` +All LLM calls, database queries, and HTTP requests inside the function are automatically tracked with the same `run_id`. 
+ ## Complete Example ```python -import asyncio -from botanu import enable, botanu_use_case, emit_outcome -from botanu.tracking.llm import track_llm_call -from botanu.tracking.data import track_db_operation - -# Initialize at startup -enable(service_name="support-bot") - -@botanu_use_case("Customer Support") -async def handle_ticket(ticket_id: str): - """Process a customer support ticket.""" - - # Fetch ticket from database (auto-tracked if using instrumented client) - with track_db_operation(system="postgresql", operation="SELECT") as db: - ticket = await db_client.fetch_ticket(ticket_id) - db.set_result(rows_returned=1) - - # Generate response with LLM - with track_llm_call(provider="openai", model="gpt-4") as llm: - response = await openai_client.chat.completions.create( - model="gpt-4", - messages=[ - {"role": "system", "content": "You are a helpful support agent."}, - {"role": "user", "content": ticket.description} - ] - ) - llm.set_tokens( - input_tokens=response.usage.prompt_tokens, - output_tokens=response.usage.completion_tokens, - ) - - # Save response (auto-tracked) - with track_db_operation(system="postgresql", operation="INSERT") as db: - await db_client.save_response(ticket_id, response.choices[0].message.content) - db.set_result(rows_affected=1) - - # Record business outcome - emit_outcome("success", value_type="tickets_resolved", value_amount=1) - - return response.choices[0].message.content - -# Run -asyncio.run(handle_ticket("TICKET-123")) +from botanu import enable, botanu_use_case + +enable(service_name="order-service") + +@botanu_use_case(name="process_order") +def process_order(order_id: str): + order = db.get_order(order_id) + result = openai.chat.completions.create( + model="gpt-4", + messages=[{"role": "user", "content": order.description}] + ) + db.save_result(order_id, result) + return result ``` ## What Gets Tracked -After running, you'll have spans with: - -| Attribute | Value | Description | -|-----------|-------|-------------| -| 
`botanu.run_id` | `019abc12-...` | Unique run identifier (UUIDv7) | -| `botanu.use_case` | `Customer Support` | Business use case | -| `botanu.outcome` | `success` | Outcome status | +| Attribute | Example | Description | +|-----------|---------|-------------| +| `botanu.run_id` | `019abc12-...` | Unique run identifier | +| `botanu.use_case` | `process_order` | Business use case | | `gen_ai.usage.input_tokens` | `150` | LLM input tokens | | `gen_ai.usage.output_tokens` | `200` | LLM output tokens | -| `gen_ai.provider.name` | `openai` | LLM provider | | `db.system` | `postgresql` | Database system | -All spans share the same `run_id`, enabling: -- Total cost per business transaction -- Cost breakdown by component -- Cost-per-outcome analytics +All spans share the same `run_id`, enabling cost-per-transaction analytics. ## Next Steps - [Configuration](configuration.md) - Environment variables and YAML config -- [LLM Tracking](../tracking/llm-tracking.md) - Detailed LLM instrumentation +- [Kubernetes Deployment](../integration/kubernetes.md) - Zero-code instrumentation at scale - [Context Propagation](../concepts/context-propagation.md) - Cross-service tracing diff --git a/docs/index.md b/docs/index.md index ec9f8f8..c08dfd0 100644 --- a/docs/index.md +++ b/docs/index.md @@ -49,20 +49,15 @@ Botanu introduces **run-level attribution**: a unique `run_id` that follows your ## Quick Example ```python -from botanu import enable, botanu_use_case, emit_outcome - -enable(service_name="support-agent") - -@botanu_use_case("Customer Support") -async def handle_ticket(ticket_id: str): - # All LLM calls, DB queries, and HTTP requests are auto-instrumented - context = await fetch_context(ticket_id) - response = await openai.chat.completions.create( - model="gpt-4", - messages=[{"role": "user", "content": context}] - ) - emit_outcome("success", value_type="tickets_resolved", value_amount=1) - return response +from botanu import enable, botanu_use_case + 
+enable(service_name="my-app") + +@botanu_use_case(name="process_order") +def process_order(order_id: str): + order = db.get_order(order_id) + result = llm.analyze(order) + return result ``` ## License diff --git a/docs/integration/auto-instrumentation.md b/docs/integration/auto-instrumentation.md index 3d4f0b3..4df42d0 100644 --- a/docs/integration/auto-instrumentation.md +++ b/docs/integration/auto-instrumentation.md @@ -1,190 +1,92 @@ # Auto-Instrumentation -Automatically instrument common libraries for seamless tracing. +Automatically instrument common libraries without code changes. -## Overview +## Installation -Botanu leverages OpenTelemetry's auto-instrumentation ecosystem. When enabled, your HTTP clients, web frameworks, databases, and LLM providers are automatically traced without code changes. +```bash +pip install "botanu[all]" +``` -## Enabling Auto-Instrumentation +## Usage ```python -from botanu import enable - -enable( - service_name="my-service", - auto_instrument=True, # Default -) -``` +from botanu import enable, botanu_use_case -Or with specific packages: +enable(service_name="my-service") -```python -enable( - service_name="my-service", - auto_instrument_packages=["requests", "fastapi", "openai_v2"], -) +@botanu_use_case(name="process_order") +def process_order(order_id: str): + order = db.get_order(order_id) + result = openai.chat.completions.create( + model="gpt-4", + messages=[{"role": "user", "content": order.description}] + ) + return result ``` +All operations inside are automatically traced. 
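Under the hood this is the standard OpenTelemetry instrumentor pattern: `instrument()` replaces a library entry point with a wrapper that records a span around the original call, so application code is untouched. A stdlib-only sketch of the idea; the client class and span list are hypothetical stand-ins, not real OTel APIs:

```python
import functools

recorded_spans = []  # stand-in for an OTel span exporter

class HttpClient:
    """Hypothetical client library to be instrumented."""
    def get(self, url):
        return f"response from {url}"

def instrument(cls):
    """Patch cls.get so every call is recorded; callers change nothing."""
    original = cls.get

    @functools.wraps(original)
    def traced_get(self, url):
        recorded_spans.append({"name": "HTTP GET", "url": url})
        return original(self, url)

    cls.get = traced_get

instrument(HttpClient)
body = HttpClient().get("https://api.example.com/orders")
```

Real instrumentors do the same substitution at `enable()` time, which is why the ordering of `enable()` and library imports matters (see Troubleshooting below).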
+ ## Supported Libraries ### HTTP Clients -| Library | Package | Notes | -|---------|---------|-------| -| requests | `opentelemetry-instrumentation-requests` | Sync HTTP | -| httpx | `opentelemetry-instrumentation-httpx` | Sync/async HTTP | -| urllib3 | `opentelemetry-instrumentation-urllib3` | Low-level HTTP | -| aiohttp | `opentelemetry-instrumentation-aiohttp-client` | Async HTTP | +| Library | Package | +|---------|---------| +| requests | `opentelemetry-instrumentation-requests` | +| httpx | `opentelemetry-instrumentation-httpx` | +| urllib3 | `opentelemetry-instrumentation-urllib3` | +| aiohttp | `opentelemetry-instrumentation-aiohttp-client` | ### Web Frameworks -| Framework | Package | Notes | -|-----------|---------|-------| -| FastAPI | `opentelemetry-instrumentation-fastapi` | ASGI framework | -| Flask | `opentelemetry-instrumentation-flask` | WSGI framework | -| Django | `opentelemetry-instrumentation-django` | Full-stack framework | -| Starlette | `opentelemetry-instrumentation-starlette` | ASGI toolkit | +| Framework | Package | +|-----------|---------| +| FastAPI | `opentelemetry-instrumentation-fastapi` | +| Flask | `opentelemetry-instrumentation-flask` | +| Django | `opentelemetry-instrumentation-django` | +| Starlette | `opentelemetry-instrumentation-starlette` | ### Databases -| Database | Package | Notes | -|----------|---------|-------| -| SQLAlchemy | `opentelemetry-instrumentation-sqlalchemy` | ORM/Core | -| psycopg2 | `opentelemetry-instrumentation-psycopg2` | PostgreSQL | -| asyncpg | `opentelemetry-instrumentation-asyncpg` | Async PostgreSQL | -| pymongo | `opentelemetry-instrumentation-pymongo` | MongoDB | -| redis | `opentelemetry-instrumentation-redis` | Redis | +| Database | Package | +|----------|---------| +| SQLAlchemy | `opentelemetry-instrumentation-sqlalchemy` | +| psycopg2 | `opentelemetry-instrumentation-psycopg2` | +| asyncpg | `opentelemetry-instrumentation-asyncpg` | +| pymongo | `opentelemetry-instrumentation-pymongo` | +| 
redis | `opentelemetry-instrumentation-redis` | ### Messaging -| System | Package | Notes | -|--------|---------|-------| -| Celery | `opentelemetry-instrumentation-celery` | Task queue | -| kafka-python | `opentelemetry-instrumentation-kafka-python` | Kafka client | - -### GenAI / LLM Providers - -| Provider | Package | Notes | -|----------|---------|-------| -| OpenAI | `opentelemetry-instrumentation-openai-v2` | ChatGPT, GPT-4 | -| Anthropic | `opentelemetry-instrumentation-anthropic` | Claude | -| Vertex AI | `opentelemetry-instrumentation-vertexai` | Google Vertex | -| Google GenAI | `opentelemetry-instrumentation-google-genai` | Gemini | -| LangChain | `opentelemetry-instrumentation-langchain` | LangChain | +| System | Package | +|--------|---------| +| Celery | `opentelemetry-instrumentation-celery` | +| Kafka | `opentelemetry-instrumentation-kafka-python` | -### Other - -| Library | Package | Notes | -|---------|---------|-------| -| gRPC | `opentelemetry-instrumentation-grpc` | RPC framework | -| logging | `opentelemetry-instrumentation-logging` | Python logging | - -## Installation - -Install the instrumentation packages you need: - -```bash -# Full suite -pip install "botanu[instruments,genai]" - -# Or individual packages -pip install opentelemetry-instrumentation-fastapi -pip install opentelemetry-instrumentation-openai-v2 -``` +### LLM Providers -## How It Works - -1. **At startup**, Botanu calls each instrumentor's `instrument()` method -2. **Instrumented libraries** automatically create spans for operations -3. **RunContextEnricher** adds `run_id` to every span via baggage -4. 
**All spans** are linked to the current run, enabling cost attribution - -```python -from botanu import enable, botanu_use_case - -enable(service_name="my-service") - -@botanu_use_case("Customer Support") -async def handle_ticket(ticket_id: str): - # requests.get() automatically creates a span with run_id - context = requests.get(f"https://api.example.com/tickets/{ticket_id}") - - # OpenAI call automatically creates a span with tokens, model, etc. - response = await openai.chat.completions.create( - model="gpt-4", - messages=[{"role": "user", "content": context.text}] - ) - - return response -``` +| Provider | Package | +|----------|---------| +| OpenAI | `opentelemetry-instrumentation-openai-v2` | +| Anthropic | `opentelemetry-instrumentation-anthropic` | +| Vertex AI | `opentelemetry-instrumentation-vertexai` | +| Google GenAI | `opentelemetry-instrumentation-google-genai` | +| LangChain | `opentelemetry-instrumentation-langchain` | ## Context Propagation -Auto-instrumented HTTP clients automatically propagate context: +HTTP clients automatically propagate `run_id` via W3C Baggage headers: -```python -@botanu_use_case("Distributed Workflow") -async def orchestrate(): - # Baggage (run_id, use_case) is injected into request headers - response = requests.get("https://service-b.example.com/process") - # Service B extracts baggage and continues the trace -``` - -Headers injected: ``` traceparent: 00-{trace_id}-{span_id}-01 -baggage: botanu.run_id=019abc12...,botanu.use_case=Distributed%20Workflow +baggage: botanu.run_id=019abc12... 
``` -## Customizing Instrumentation - -### Exclude Specific Endpoints +## Span Attributes -```python -from opentelemetry.instrumentation.requests import RequestsInstrumentor +OpenAI calls produce: -# Exclude health checks from tracing -RequestsInstrumentor().instrument( - excluded_urls=["health", "metrics"] -) -``` - -### Add Request/Response Hooks - -```python -def request_hook(span, request): - span.set_attribute("http.request.custom_header", request.headers.get("X-Custom")) - -def response_hook(span, request, response): - span.set_attribute("http.response.custom_header", response.headers.get("X-Custom")) - -RequestsInstrumentor().instrument( - request_hook=request_hook, - response_hook=response_hook, -) -``` - -## GenAI Instrumentation Details - -### OpenAI - -Automatically captures: -- Model name and parameters -- Token usage (input, output, cached) -- Request/response IDs -- Streaming status -- Tool/function calls - -```python -# Automatically traced -response = await openai.chat.completions.create( - model="gpt-4", - messages=[{"role": "user", "content": "Hello"}] -) -``` - -Span attributes: ``` gen_ai.operation.name: chat gen_ai.provider.name: openai @@ -193,111 +95,36 @@ gen_ai.usage.input_tokens: 10 gen_ai.usage.output_tokens: 25 ``` -### Anthropic - -Automatically captures: -- Model and version -- Token usage with cache breakdown -- Stop reason - -```python -# Automatically traced -response = await anthropic.messages.create( - model="claude-3-opus-20240229", - messages=[{"role": "user", "content": "Hello"}] -) -``` - -### LangChain - -Traces the full chain execution: - -```python -# Each step is traced -chain = prompt | llm | parser -result = await chain.ainvoke({"input": "Hello"}) -``` - -## Combining with Manual Tracking +Database calls produce: -Auto-instrumentation works alongside manual tracking: - -```python -from botanu import botanu_use_case, emit_outcome -from botanu.tracking.llm import track_llm_call - -@botanu_use_case("Hybrid Workflow") -async 
def hybrid_example(): - # Auto-instrumented HTTP call - data = requests.get("https://api.example.com/data") - - # Manual tracking for custom provider - with track_llm_call(provider="custom-llm", model="my-model") as tracker: - response = await custom_llm_call(data.json()) - tracker.set_tokens(input_tokens=100, output_tokens=200) - - # Auto-instrumented database call - await database.execute("INSERT INTO results VALUES (?)", response) - - emit_outcome("success") ``` - -## Disabling Auto-Instrumentation - -### Completely Disable - -```python -enable( - service_name="my-service", - auto_instrument_packages=[], # Empty list -) -``` - -### Disable Specific Libraries - -```python -enable( - service_name="my-service", - auto_instrument_packages=["fastapi", "openai_v2"], # Only these -) +db.system: postgresql +db.operation: SELECT +db.statement: SELECT * FROM orders WHERE id = ? ``` ## Troubleshooting ### Spans Not Appearing -1. Check the library is installed: - ```bash - pip list | grep opentelemetry-instrumentation - ``` - -2. Verify instrumentation is enabled: - ```python - from opentelemetry.instrumentation.requests import RequestsInstrumentor - print(RequestsInstrumentor().is_instrumented()) - ``` - -3. 
Ensure `enable()` is called before library imports: - ```python - from botanu import enable - enable(service_name="my-service") +Ensure `enable()` is called before library imports: - # Import after enable() - import requests - ``` +```python +from botanu import enable +enable(service_name="my-service") -### Context Not Propagating +import requests +import openai +``` -Check that baggage propagator is configured: +### Check Instrumentation Status ```python -from opentelemetry import propagate -print(propagate.get_global_textmap()) -# Should include W3CBaggagePropagator +from opentelemetry.instrumentation.requests import RequestsInstrumentor +print(RequestsInstrumentor().is_instrumented()) ``` ## See Also -- [Existing OTel Setup](existing-otel.md) - Integration with existing OTel +- [Kubernetes Deployment](kubernetes.md) - Zero-code instrumentation at scale - [Collector Configuration](collector.md) - Collector setup -- [Context Propagation](../concepts/context-propagation.md) - How context flows diff --git a/docs/integration/kubernetes.md b/docs/integration/kubernetes.md index 5efec52..e35b949 100644 --- a/docs/integration/kubernetes.md +++ b/docs/integration/kubernetes.md @@ -6,6 +6,29 @@ Zero-code instrumentation for large-scale deployments. For organizations with thousands of applications, modifying code in every repo is impractical. This guide covers zero-code instrumentation using Kubernetes-native approaches. +## What Requires Code Changes + +| Service Type | Code Change | Config Change | +|--------------|-------------|---------------| +| **Entry point** | `@botanu_use_case` decorator (generates `run_id`) | K8s annotation | +| **Intermediate services** | None | K8s annotation only | + +**Entry point** = The service where the business transaction starts (API gateway, webhook handler, queue consumer). + +**Intermediate services** = All downstream services called by the entry point. 
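Concretely, the handoff between the two service types happens through the W3C `baggage` header: the entry point injects `botanu.run_id` into outgoing requests, and auto-instrumented intermediate services parse it back out. A simplified sketch of the format; real deployments rely on OTel propagators rather than hand-rolled parsing, and the run identifier shown is illustrative:

```python
from urllib.parse import quote, unquote

def inject_baggage(headers: dict, run_id: str) -> dict:
    # Entry point side: attach the run identifier to an outgoing request
    headers["baggage"] = f"botanu.run_id={quote(run_id)}"
    return headers

def extract_baggage(headers: dict) -> dict:
    # Intermediate side: recover run context with no application code changes
    entries = {}
    for member in headers.get("baggage", "").split(","):
        key, sep, value = member.strip().partition("=")
        if sep:
            entries[key] = unquote(value)
    return entries

outgoing = inject_baggage({}, "019abc12-0000-7000-8000-000000000000")
run_id = extract_baggage(outgoing)["botanu.run_id"]
```

This is what lets intermediate services stay annotation-only: the header arrives on every request, and the injected instrumentation reads it without any application code.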
+
+## What Gets Auto-Instrumented
+
+With zero-code instrumentation, the following are automatically traced:
+
+- **HTTP clients** — requests, httpx, urllib3, aiohttp (including retries)
+- **Frameworks** — FastAPI, Flask, Django, Starlette
+- **Databases** — PostgreSQL, MySQL, MongoDB, Redis, SQLAlchemy
+- **Messaging** — Celery, Kafka
+- **LLM Providers** — OpenAI, Anthropic, Vertex AI
+
+**Retries are automatically captured.** Each HTTP call, including retries issued by libraries such as `tenacity` or `urllib3.util.retry` or by retrying `httpx` transports, creates a separate span. The `run_id` propagates via W3C Baggage headers on every request.
+
 ## Architecture
 
 ```
@@ -221,25 +244,26 @@ spec:
         exporters: [otlp]
 ```
 
-## Entry Point Service
+## Entry Point Service (Code Change Required)
 
-Only the entry point service needs the Botanu decorator:
+The entry point service is the **only** service that needs a code change. It must use `@botanu_use_case` to generate the `run_id`:
 
 ```python
-# entry-service/app.py
-from botanu import enable, botanu_use_case, emit_outcome
+from botanu import enable, botanu_use_case
 
 enable(service_name="entry-service")
 
-@botanu_use_case(name="Customer Support")
-async def handle_request(request_id: str):
-    # Calls to downstream services propagate run_id automatically
-    result = await call_service_b(request_id)
-    emit_outcome("success")
+@botanu_use_case(name="process_order")
+def process_order(order_id: str):
+    order = db.get_order(order_id)
+    result = llm.analyze(order)
+    notify_service.send(result)
     return result
 ```
 
-Downstream services (B, C, D, etc.) need zero code changes.
+The `@botanu_use_case` decorator generates a `run_id` and propagates it via W3C Baggage to all downstream calls.
+
+**Downstream services (B, C, D, etc.) need zero code changes** — they just need the K8s annotation.
 
 ## Helm Chart