From 138c47bb187d5aa59f41e4b15afa77716d046102 Mon Sep 17 00:00:00 2001 From: Chris Gillum Date: Mon, 19 Jan 2026 10:38:32 -0800 Subject: [PATCH] Add comprehensive DTFx documentation - Add getting-started guides (installation, quickstart, choosing-a-backend) - Add core concepts documentation (orchestrations, activities, replay, determinism) - Add feature documentation (timers, events, retries, sub-orchestrations, etc.) - Add backend provider documentation (Durable Task Scheduler, Azure Storage, MSSQL, etc.) - Add advanced topics (middleware, entities, serialization, testing) - Add telemetry documentation (distributed tracing, Application Insights, logging) - Add samples catalog and README files for sample projects - Update root README.md to link to documentation --- README.md | 2 + docs/README.md | 31 + docs/advanced/README.md | 15 + docs/advanced/entities.md | 30 + docs/advanced/middleware.md | 544 +++++++++++++++ docs/advanced/serialization.md | 477 +++++++++++++ docs/advanced/testing.md | 632 ++++++++++++++++++ docs/concepts/README.md | 16 + docs/concepts/activities.md | 354 ++++++++++ docs/concepts/core-concepts.md | 230 +++++++ docs/concepts/deterministic-constraints.md | 290 ++++++++ docs/concepts/orchestrations.md | 331 +++++++++ docs/concepts/replay-and-durability.md | 249 +++++++ docs/features/README.md | 15 + docs/features/error-handling.md | 515 ++++++++++++++ docs/features/eternal-orchestrations.md | 364 ++++++++++ docs/features/external-events.md | 518 ++++++++++++++ docs/features/retries.md | 303 +++++++++ docs/features/sub-orchestrations.md | 315 +++++++++ docs/features/timers.md | 304 +++++++++ docs/features/versioning.md | 498 ++++++++++++++ docs/getting-started/choosing-a-backend.md | 166 +++++ docs/getting-started/installation.md | 91 +++ docs/getting-started/quickstart.md | 173 +++++ docs/providers/README.md | 19 + docs/providers/azure-storage.md | 390 +++++++++++ docs/providers/custom-provider.md | 269 ++++++++ 
docs/providers/durable-task-scheduler.md | 220 ++++++ docs/providers/emulator.md | 56 ++ docs/providers/mssql.md | 32 + docs/providers/service-bus.md | 164 +++++ docs/providers/service-fabric.md | 197 ++++++ docs/samples/catalog.md | 21 + docs/support.md | 49 ++ docs/telemetry/application-insights.md | 358 ++++++++++ docs/telemetry/distributed-tracing.md | 285 ++++++++ docs/telemetry/logging.md | 356 ++++++++++ samples/Correlation.Samples/Readme.md | 44 +- .../ApplicationInsights/README.md | 57 ++ samples/DistributedTraceSample/README.md | 56 ++ samples/DurableTask.Samples/README.md | 203 ++++++ samples/ManagedIdentitySample/README.md | 65 ++ 42 files changed, 9292 insertions(+), 12 deletions(-) create mode 100644 docs/README.md create mode 100644 docs/advanced/README.md create mode 100644 docs/advanced/entities.md create mode 100644 docs/advanced/middleware.md create mode 100644 docs/advanced/serialization.md create mode 100644 docs/advanced/testing.md create mode 100644 docs/concepts/README.md create mode 100644 docs/concepts/activities.md create mode 100644 docs/concepts/core-concepts.md create mode 100644 docs/concepts/deterministic-constraints.md create mode 100644 docs/concepts/orchestrations.md create mode 100644 docs/concepts/replay-and-durability.md create mode 100644 docs/features/README.md create mode 100644 docs/features/error-handling.md create mode 100644 docs/features/eternal-orchestrations.md create mode 100644 docs/features/external-events.md create mode 100644 docs/features/retries.md create mode 100644 docs/features/sub-orchestrations.md create mode 100644 docs/features/timers.md create mode 100644 docs/features/versioning.md create mode 100644 docs/getting-started/choosing-a-backend.md create mode 100644 docs/getting-started/installation.md create mode 100644 docs/getting-started/quickstart.md create mode 100644 docs/providers/README.md create mode 100644 docs/providers/azure-storage.md create mode 100644 docs/providers/custom-provider.md create 
mode 100644 docs/providers/durable-task-scheduler.md create mode 100644 docs/providers/emulator.md create mode 100644 docs/providers/mssql.md create mode 100644 docs/providers/service-bus.md create mode 100644 docs/providers/service-fabric.md create mode 100644 docs/samples/catalog.md create mode 100644 docs/support.md create mode 100644 docs/telemetry/application-insights.md create mode 100644 docs/telemetry/distributed-tracing.md create mode 100644 docs/telemetry/logging.md create mode 100644 samples/DistributedTraceSample/ApplicationInsights/README.md create mode 100644 samples/DistributedTraceSample/README.md create mode 100644 samples/DurableTask.Samples/README.md create mode 100644 samples/ManagedIdentitySample/README.md diff --git a/README.md b/README.md index 04177195b..b961c34f4 100644 --- a/README.md +++ b/README.md @@ -2,6 +2,8 @@ The Durable Task Framework (DTFx) is a library that allows users to write long running persistent workflows (referred to as _orchestrations_) in C# using simple async/await coding constructs. It is used heavily within various teams at Microsoft to reliably orchestrate long running provisioning, monitoring, and management operations. The orchestrations scale out linearly by simply adding more worker machines. This framework is also used to power the serverless [Durable Functions](https://docs.microsoft.com/azure/azure-functions/durable/durable-functions-overview) extension of [Azure Functions](https://azure.microsoft.com/services/functions/). +> **📖 Documentation:** Comprehensive documentation is available in the [docs](./docs/README.md) folder. The [GitHub Wiki](https://github.com/Azure/durabletask/wiki) is no longer actively maintained; please refer to the docs folder for up-to-date content. + By open sourcing this project we hope to give the community a very cost-effective alternative to heavy duty workflow systems. 
We also hope to build an ecosystem of providers and activities around this simple yet incredibly powerful framework. This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 000000000..a50ec54fc --- /dev/null +++ b/docs/README.md @@ -0,0 +1,31 @@ +# Durable Task Framework Documentation + +The Durable Task Framework (DTFx) is an open-source framework for writing long-running, fault-tolerant workflow orchestrations in .NET. It provides the foundation for [Azure Durable Functions](https://learn.microsoft.com/azure/azure-functions/durable/durable-functions-overview) and can be used standalone with various backend storage providers. + +## Quick Links + +| Section | Description | +| ------- | ----------- | +| [Getting Started](getting-started/installation.md) | Installation, quickstart, and choosing a backend | +| [Core Concepts](concepts/core-concepts.md) | Task Hubs, Workers, Clients, and architecture overview | +| [Features](features/retries.md) | Retries, timers, external events, sub-orchestrations, and more | +| [Providers](providers/durable-task-scheduler.md) | Backend storage providers (Durable Task Scheduler, Azure Storage, etc.) 
| +| [Telemetry](telemetry/distributed-tracing.md) | Distributed tracing, logging, and Application Insights | +| [Advanced Topics](advanced/middleware.md) | Middleware, entities, serialization, and testing | +| [Samples](samples/catalog.md) | Sample projects and code patterns | + +## Recommended: Durable Task Scheduler with the modern .NET SDK + +For new projects, we recommend using the **[Durable Task Scheduler](providers/durable-task-scheduler.md)**, a fully managed Azure service that provides: + +- ✅ A more modern [Durable Task .NET SDK](https://github.com/microsoft/durabletask-dotnet) with improved developer experience +- ✅ Zero infrastructure management +- ✅ Built-in monitoring dashboard +- ✅ Highest throughput of all backends +- ✅ 24/7 Microsoft Azure support with SLA + +See [Choosing a Backend](getting-started/choosing-a-backend.md) for a full comparison of all available providers. + +## Support + +See [Support](support.md) for information about getting help with the Durable Task Framework. diff --git a/docs/advanced/README.md b/docs/advanced/README.md new file mode 100644 index 000000000..0101a1443 --- /dev/null +++ b/docs/advanced/README.md @@ -0,0 +1,15 @@ +# Advanced Topics + +This section covers advanced features and techniques for the Durable Task Framework. 
+ +## Topics + +| Topic | Description | +| ----- | ----------- | +| [Middleware](middleware.md) | Intercept and extend orchestration/activity execution with cross-cutting concerns | +| [Serialization](serialization.md) | Custom data converters and serialization patterns | +| [Testing](testing.md) | Unit testing activities, integration testing with the emulator | +| [Entities](entities.md) | Durable Entities guidance (not supported for direct use in DTFx) | + +> [!NOTE] +> For Durable Entities support, see [Azure Durable Functions](https://docs.microsoft.com/azure/azure-functions/durable/durable-functions-entities) or the [Durable Task SDK](https://github.com/microsoft/durabletask-dotnet) with [Durable Task Scheduler](../providers/durable-task-scheduler.md). diff --git a/docs/advanced/entities.md b/docs/advanced/entities.md new file mode 100644 index 000000000..cfbc46beb --- /dev/null +++ b/docs/advanced/entities.md @@ -0,0 +1,30 @@ +# Durable Entities + +Durable Entities provide a way to manage small pieces of state with well-defined operations. Entities are addressable by a unique identifier and can be called from orchestrations or signaled from anywhere. + +## Entity Support in the Durable Task Framework + +> [!IMPORTANT] +> Durable Entities are **not directly supported** for end-user development in the Durable Task Framework. The entity-related APIs that exist in this library (such as `TaskEntity`, `EntityId`, `OrchestrationEntityContext`, etc.) are low-level infrastructure components intended to support [Azure Durable Functions](https://docs.microsoft.com/azure/azure-functions/durable/durable-functions-entities) scenarios. 
+ +## Recommended Alternatives + +If you want to build applications that leverage the capabilities of Durable Entities, consider one of the following options: + +### Azure Durable Functions + +[Azure Durable Functions](https://docs.microsoft.com/azure/azure-functions/durable/durable-functions-entities) provides a complete, high-level programming model for Durable Entities with full support for: + +- Entity classes and function-based entities +- Calling and signaling entities from orchestrations +- Entity state persistence and management +- Distributed locking and critical sections + +### Durable Task SDK with Durable Task Scheduler + +The [Durable Task SDK](https://github.com/microsoft/durabletask-dotnet), used together with the [Durable Task Scheduler](../providers/durable-task-scheduler.md), provides a modern programming model with entity support. This is the recommended approach for new .NET applications that need durable entity capabilities outside of Azure Functions. + +## Next Steps + +- [Durable Task Scheduler](../providers/durable-task-scheduler.md): Learn about the Durable Task Scheduler backend +- [Choosing a Backend](../getting-started/choosing-a-backend.md): Compare available backend providers diff --git a/docs/advanced/middleware.md b/docs/advanced/middleware.md new file mode 100644 index 000000000..18b1e73c5 --- /dev/null +++ b/docs/advanced/middleware.md @@ -0,0 +1,544 @@ +# Middleware + +Middleware in the Durable Task Framework allows you to intercept and extend orchestration and activity execution. This is useful for cross-cutting concerns like logging, metrics, authentication, or context propagation. + +## Middleware Delegate Signature + +Middleware is registered as a delegate with the following signature: + +```csharp +using DurableTask.Core.Middleware; + +// Middleware delegate signature +Func<DispatchMiddlewareContext, Func<Task>, Task> +``` + +The `DispatchMiddlewareContext` provides access to execution context via `GetProperty<T>()` and `SetProperty<T>()` methods. 
+ +## Orchestration Middleware + +### Available Context Properties + +Orchestration middleware can access these properties via `context.GetProperty<T>()`: + +| Type | Description | +| ---- | ----------- | +| `OrchestrationInstance` | The orchestration instance (InstanceId, ExecutionId) | +| `TaskOrchestration` | The orchestration implementation (may be null for out-of-process scenarios) | +| `OrchestrationRuntimeState` | History, status, name, version, input, tags, and more | +| `OrchestrationExecutionContext` | Contains orchestration tags | +| `TaskOrchestrationWorkItem` | The work item being processed | + +### Creating Orchestration Middleware + +```csharp +public static class OrchestrationLoggingMiddleware +{ + public static Func<DispatchMiddlewareContext, Func<Task>, Task> Create(ILogger logger) + { + return async (context, next) => + { + var instance = context.GetProperty<OrchestrationInstance>(); + var runtimeState = context.GetProperty<OrchestrationRuntimeState>(); + var instanceId = instance?.InstanceId ?? "unknown"; + var orchestrationName = runtimeState?.Name ?? "unknown"; + + logger.LogInformation("Orchestration {Name} ({InstanceId}) starting execution", + orchestrationName, instanceId); + var stopwatch = Stopwatch.StartNew(); + + try + { + await next(); + logger.LogInformation("Orchestration {Name} ({InstanceId}) completed in {ElapsedMs}ms", + orchestrationName, instanceId, stopwatch.ElapsedMilliseconds); + } + catch (Exception ex) + { + logger.LogError(ex, "Orchestration {Name} ({InstanceId}) failed after {ElapsedMs}ms", + orchestrationName, instanceId, stopwatch.ElapsedMilliseconds); + throw; + } + }; + } +} +``` + +### Registering Orchestration Middleware + +```csharp +var worker = new TaskHubWorker(orchestrationService, loggerFactory); + +// Add middleware using lambda - order matters (first registered = outermost) +worker.AddOrchestrationDispatcherMiddleware(async (context, next) => +{ + var instance = context.GetProperty<OrchestrationInstance>(); + Console.WriteLine($"Processing orchestration: {instance?.InstanceId}"); + await next(); +}); + +// Or use a 
factory method +worker.AddOrchestrationDispatcherMiddleware( + OrchestrationLoggingMiddleware.Create(logger)); + +await worker.StartAsync(); +``` + +## Activity Middleware + +### Context Properties for Activities + +Activity middleware can access these properties via `context.GetProperty<T>()`: + +| Type | Description | +| ---- | ----------- | +| `OrchestrationInstance` | The parent orchestration instance | +| `TaskActivity` | The activity implementation (may be null for out-of-process scenarios) | +| `TaskScheduledEvent` | Contains activity name, version, input, and event ID | +| `OrchestrationExecutionContext` | Contains orchestration tags (if available) | + +### Creating Activity Middleware + +```csharp +public static class ActivityLoggingMiddleware +{ + public static Func<DispatchMiddlewareContext, Func<Task>, Task> Create(ILogger logger) + { + return async (context, next) => + { + var scheduledEvent = context.GetProperty<TaskScheduledEvent>(); + var instance = context.GetProperty<OrchestrationInstance>(); + var activityName = scheduledEvent?.Name ?? "unknown"; + var instanceId = instance?.InstanceId ?? 
"unknown"; + + logger.LogInformation("Activity {ActivityName} starting for orchestration {InstanceId}", + activityName, instanceId); + var stopwatch = Stopwatch.StartNew(); + + try + { + await next(); + logger.LogInformation("Activity {ActivityName} completed in {ElapsedMs}ms", + activityName, stopwatch.ElapsedMilliseconds); + } + catch (Exception ex) + { + logger.LogError(ex, "Activity {ActivityName} failed after {ElapsedMs}ms", + activityName, stopwatch.ElapsedMilliseconds); + throw; + } + }; + } +} +``` + +### Registering Activity Middleware + +```csharp +var worker = new TaskHubWorker(orchestrationService, loggerFactory); + +// Add middleware using lambda +worker.AddActivityDispatcherMiddleware(async (context, next) => +{ + var scheduledEvent = context.GetProperty<TaskScheduledEvent>(); + Console.WriteLine($"Executing activity: {scheduledEvent?.Name}"); + await next(); +}); + +// Or use a factory method +worker.AddActivityDispatcherMiddleware( + ActivityLoggingMiddleware.Create(logger)); + +await worker.StartAsync(); +``` + +## Common Middleware Patterns + +### Metrics Collection + +```csharp +worker.AddOrchestrationDispatcherMiddleware(async (context, next) => +{ + var runtimeState = context.GetProperty<OrchestrationRuntimeState>(); + var orchestrationName = runtimeState?.Name ?? "unknown"; + var stopwatch = Stopwatch.StartNew(); + var success = true; + + try + { + await next(); + } + catch + { + success = false; + throw; + } + finally + { + metrics.RecordDuration($"orchestration.{orchestrationName}.duration", stopwatch.Elapsed); + metrics.RecordCounter(success ? 
"orchestration.success" : "orchestration.failure"); + } +}); +``` + +### Context Propagation (Using Tags) + +```csharp +worker.AddOrchestrationDispatcherMiddleware(async (context, next) => +{ + var executionContext = context.GetProperty<OrchestrationExecutionContext>(); + + // Extract tenant ID from orchestration tags + string tenantId = "default"; + if (executionContext?.OrchestrationTags?.TryGetValue("TenantId", out var tenant) == true) + { + tenantId = tenant; + } + + // Set ambient context + using (TenantContext.SetCurrent(tenantId)) + { + await next(); + } +}); +``` + +### Exception Handling Considerations + +> [!IMPORTANT] +> Exceptions thrown in middleware cause the work item to be **retried**, not failed. If you want to explicitly fail an orchestration or activity, you must set the result directly. + +```csharp +// CAUTION: This causes infinite retries, NOT a failure! +worker.AddActivityDispatcherMiddleware(async (context, next) => +{ + try + { + await next(); + } + catch (Exception ex) + { + // Logging is fine, but re-throwing will cause retries + logger.LogError(ex, "Activity failed"); + throw; // ⚠️ This causes the activity to be retried, not failed! 
+ } +}); +``` + +To properly fail an activity from middleware, use `TaskFailureException` or set the result: + +```csharp +// Option 1: Throw TaskFailureException (gets converted to TaskFailedEvent) +worker.AddActivityDispatcherMiddleware(async (context, next) => +{ + try + { + await next(); + } + catch (Exception ex) + { + // This properly fails the activity and reports failure to the orchestration + throw new TaskFailureException(ex.Message, ex, ex.ToString()); + } +}); + +// Option 2: Set the failure result directly +worker.AddActivityDispatcherMiddleware(async (context, next) => +{ + var scheduledEvent = context.GetProperty<TaskScheduledEvent>(); + + try + { + await next(); + } + catch (Exception ex) + { + // Explicitly set a failure result + context.SetProperty(new ActivityExecutionResult + { + ResponseEvent = new TaskFailedEvent( + eventId: -1, + taskScheduledEventId: scheduledEvent.EventId, + reason: ex.Message, + details: ex.ToString(), + failureDetails: new FailureDetails(ex)) + }); + // Don't re-throw - we've handled the failure + } +}); +``` + +### Authentication/Authorization + +```csharp +worker.AddOrchestrationDispatcherMiddleware(async (context, next) => +{ + var executionContext = context.GetProperty<OrchestrationExecutionContext>(); + + string? userId = null; + executionContext?.OrchestrationTags?.TryGetValue("UserId", out userId); + + if (string.IsNullOrEmpty(userId) || + !await authService.IsAuthorizedAsync(userId, "ExecuteOrchestration")) + { + // Don't throw - that would cause retries. Instead, fail the orchestration explicitly. + context.SetProperty(new OrchestratorExecutionResult + { + Actions = new[] + { + new OrchestrationCompleteOrchestratorAction + { + OrchestrationStatus = OrchestrationStatus.Failed, + Result = $"User {userId ?? "unknown"} is not authorized to execute orchestrations", + FailureDetails = new FailureDetails( + errorType: "UnauthorizedAccessException", + errorMessage: $"User {userId ?? 
"unknown"} is not authorized", + stackTrace: null, + innerFailure: null, + isNonRetriable: true) + } + } + }); + return; // Don't call next() + } + + await next(); +}); +``` + +## Middleware Context + +### Accessing Built-in Properties + +```csharp +// For orchestration middleware +worker.AddOrchestrationDispatcherMiddleware(async (context, next) => +{ + // Core identification + var instance = context.GetProperty<OrchestrationInstance>(); + var instanceId = instance?.InstanceId; + var executionId = instance?.ExecutionId; + + // Orchestration metadata + var runtimeState = context.GetProperty<OrchestrationRuntimeState>(); + var orchestrationName = runtimeState?.Name; + var orchestrationVersion = runtimeState?.Version; + var input = runtimeState?.Input; + var status = runtimeState?.OrchestrationStatus; + + // Tags + var executionContext = context.GetProperty<OrchestrationExecutionContext>(); + var tags = executionContext?.OrchestrationTags; + + // The orchestration implementation (may be null for out-of-process execution) + var orchestration = context.GetProperty<TaskOrchestration>(); + + await next(); +}); + +// For activity middleware +worker.AddActivityDispatcherMiddleware(async (context, next) => +{ + // Parent orchestration instance + var instance = context.GetProperty<OrchestrationInstance>(); + + // Activity details from the scheduled event + var scheduledEvent = context.GetProperty<TaskScheduledEvent>(); + var activityName = scheduledEvent?.Name; + var activityVersion = scheduledEvent?.Version; + var activityInput = scheduledEvent?.Input; + var eventId = scheduledEvent?.EventId; + + // The activity implementation (may be null for out-of-process execution) + var activity = context.GetProperty<TaskActivity>(); + + await next(); +}); +``` + +### Setting Custom Properties + +```csharp +// First middleware sets a property +worker.AddOrchestrationDispatcherMiddleware(async (context, next) => +{ + // Set a named property for downstream middleware + context.SetProperty("CorrelationId", Guid.NewGuid().ToString()); + await next(); +}); + +// Downstream middleware reads the property 
+worker.AddOrchestrationDispatcherMiddleware(async (context, next) => +{ + var correlationId = context.GetProperty<string>("CorrelationId"); + Console.WriteLine($"Correlation ID: {correlationId}"); + await next(); +}); +``` + +## Middleware Ordering + +Middleware executes in a pipeline. The order of registration determines execution order: + +```csharp +// Registration order +worker.AddOrchestrationDispatcherMiddleware(AuthMiddleware); // 1st registered +worker.AddOrchestrationDispatcherMiddleware(LoggingMiddleware); // 2nd registered +worker.AddOrchestrationDispatcherMiddleware(MetricsMiddleware); // 3rd registered + +// Execution order (onion model): +// AuthMiddleware → +// LoggingMiddleware → +// MetricsMiddleware → +// [Orchestration executes] +// ← MetricsMiddleware returns +// ← LoggingMiddleware returns +// ← AuthMiddleware returns +``` + +## Best Practices + +### 1. Keep Middleware Focused + +Each middleware should have a single responsibility: + +```csharp +// Good - single responsibility with factory methods +public static class LoggingMiddleware +{ + public static Func<DispatchMiddlewareContext, Func<Task>, Task> Create(ILogger logger) => /* logging only */; +} + +public static class MetricsMiddleware +{ + public static Func<DispatchMiddlewareContext, Func<Task>, Task> Create(IMetrics metrics) => /* metrics only */; +} + +// Avoid combining multiple concerns in one middleware +``` + +### 2. 
Understand Exception Behavior + +Exceptions thrown in middleware cause **retries**, not failures: + +```csharp +// For activities: Use TaskFailureException to signal failure to orchestration +worker.AddActivityDispatcherMiddleware(async (context, next) => +{ + try + { + await next(); + } + catch (MyValidationException ex) + { + // Convert to TaskFailureException to properly fail the activity + throw new TaskFailureException(ex.Message, ex, ex.ToString()); + } + // Other exceptions will cause retries +}); + +// For orchestrations: Set result with failed status +worker.AddOrchestrationDispatcherMiddleware(async (context, next) => +{ + try + { + await next(); + } + catch (Exception ex) when (ShouldFailOrchestration(ex)) + { + context.SetProperty(new OrchestratorExecutionResult + { + Actions = new[] + { + new OrchestrationCompleteOrchestratorAction + { + OrchestrationStatus = OrchestrationStatus.Failed, + Result = ex.Message, + FailureDetails = new FailureDetails(ex) + } + } + }); + // Don't re-throw - we've handled the failure + } +}); +``` + +### 3. Use Dependency Injection Patterns + +Capture dependencies via closures or factory methods: + +```csharp +// Using closures +public static Func<DispatchMiddlewareContext, Func<Task>, Task> CreateTelemetryMiddleware( + TelemetryClient telemetry, + ILogger logger) +{ + return async (context, next) => + { + var instance = context.GetProperty<OrchestrationInstance>(); + telemetry.TrackEvent("OrchestrationStarted", + new Dictionary<string, string> { ["InstanceId"] = instance?.InstanceId }); + + await next(); + }; +} + +// Registration +worker.AddOrchestrationDispatcherMiddleware( + CreateTelemetryMiddleware(telemetryClient, logger)); +``` + +### 4. 
Intercepting Execution Results + +Middleware can intercept and modify execution results: + +```csharp +// For orchestrations - intercept or provide custom results +worker.AddOrchestrationDispatcherMiddleware(async (context, next) => +{ + await next(); + + // After execution, you can read the result + var result = context.GetProperty<OrchestratorExecutionResult>(); + // Inspect result.Actions, result.CustomStatus, etc. +}); + +// For activities - intercept or provide custom results +worker.AddActivityDispatcherMiddleware(async (context, next) => +{ + await next(); + + // After execution, you can read the result + var result = context.GetProperty<ActivityExecutionResult>(); + // Inspect result.ResponseEvent +}); +``` + +### 5. Out-of-Process Execution + +Middleware can completely replace execution for out-of-process scenarios: + +```csharp +worker.AddOrchestrationDispatcherMiddleware(async (context, next) => +{ + var runtimeState = context.GetProperty<OrchestrationRuntimeState>(); + + // Execute orchestration out-of-process and get result + var actions = await ExecuteOutOfProcessAsync(runtimeState); + + // Set the result directly - the default handler will be skipped + context.SetProperty(new OrchestratorExecutionResult + { + Actions = actions, + CustomStatus = "Executed out-of-process" + }); + + // Don't call next() if you're providing the result yourself +}); +``` + +## Next Steps + +- [Entities](entities.md): Durable Entities pattern +- [Serialization](serialization.md): Custom data converters +- [Testing](testing.md): Testing orchestrations diff --git a/docs/advanced/serialization.md b/docs/advanced/serialization.md new file mode 100644 index 000000000..3faabfcd1 --- /dev/null +++ b/docs/advanced/serialization.md @@ -0,0 +1,477 @@ +# Serialization + +The Durable Task Framework uses serialization to persist orchestration state, activity inputs/outputs, and messages between components. Understanding serialization is essential for correct orchestration behavior. 
+ +## Default Serialization + +By default, DTFx uses JSON serialization via Newtonsoft.Json (Json.NET). + +The default `JsonDataConverter` uses these settings: + +```csharp +new JsonSerializerSettings +{ + TypeNameHandling = TypeNameHandling.Objects, + DateParseHandling = DateParseHandling.None, + SerializationBinder = new PackageUpgradeSerializationBinder() +} +``` + +**Key behaviors:** + +- `TypeNameHandling.Objects`: Includes type information for polymorphic deserialization +- `DateParseHandling.None`: Date strings are not automatically parsed into date values (they are preserved as strings) +- `PackageUpgradeSerializationBinder`: Handles type name migration across package versions + +## Custom DataConverter + +### Creating a Custom Converter + +Extend the abstract `DataConverter` class: + +```csharp +using DurableTask.Core.Serializing; +using System.Text.Json; + +public class SystemTextJsonDataConverter : DataConverter +{ + private readonly JsonSerializerOptions _options; + + public SystemTextJsonDataConverter() + { + _options = new JsonSerializerOptions + { + PropertyNamingPolicy = JsonNamingPolicy.CamelCase, + WriteIndented = false + }; + } + + public override string Serialize(object value) + { + return Serialize(value, formatted: false); + } + + public override string Serialize(object value, bool formatted) + { + if (value == null) + { + return null; + } + + var options = formatted + ? 
new JsonSerializerOptions(_options) { WriteIndented = true } + : _options; + + return JsonSerializer.Serialize(value, options); + } + + public override object Deserialize(string data, Type objectType) + { + if (string.IsNullOrEmpty(data)) + { + return null; + } + + return JsonSerializer.Deserialize(data, objectType, _options); + } +} +``` + +### Custom JsonSerializerSettings + +For custom Newtonsoft.Json settings, pass settings to the constructor: + +```csharp +var settings = new JsonSerializerSettings +{ + TypeNameHandling = TypeNameHandling.Auto, + NullValueHandling = NullValueHandling.Ignore, + DateFormatHandling = DateFormatHandling.IsoDateFormat, + ContractResolver = new CamelCasePropertyNamesContractResolver() +}; + +var converter = new JsonDataConverter(settings); +``` + +### Using Custom Converters + +Set custom converters on the `OrchestrationContext`: + +```csharp +public class MyOrchestration : TaskOrchestration<string, Input> +{ + public override async Task<string> RunTask( + OrchestrationContext context, + Input input) + { + // Use custom converter for messages (must be JsonDataConverter or subclass) + context.MessageDataConverter = new JsonDataConverter(customSettings); + + // Use custom converter for errors (must be JsonDataConverter or subclass) + context.ErrorDataConverter = new JsonDataConverter(customSettings); + + // Now all serialization uses custom converter + var result = await context.ScheduleTask<Output>( + typeof(MyActivity), + input); + + return result.ToString(); + } +} +``` + +> **Note:** The `MessageDataConverter` and `ErrorDataConverter` properties are typed as `JsonDataConverter`, not the base `DataConverter` class. To use completely custom serialization logic, you must subclass `JsonDataConverter` or use the `DataConverter` property on `TaskOrchestration` and `TaskActivity` classes instead. 
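As a rough sketch of the subclassing route mentioned in the note, a converter can derive from `JsonDataConverter` and layer extra behavior on top of the base implementation. The class name `SizeCheckingJsonDataConverter` and its character limit are hypothetical. The sketch also assumes that `JsonDataConverter` exposes the `JsonSerializerSettings` constructor used earlier on this page and that its `Serialize` overrides are not sealed, so verify both against the version of DurableTask.Core you target.

```csharp
using System;
using DurableTask.Core.Serializing;
using Newtonsoft.Json;

// Hypothetical subclass (not part of DTFx): rejects serialized payloads
// that exceed a configured number of characters.
public class SizeCheckingJsonDataConverter : JsonDataConverter
{
    private readonly int maxChars;

    public SizeCheckingJsonDataConverter(JsonSerializerSettings settings, int maxChars)
        : base(settings)
    {
        this.maxChars = maxChars;
    }

    // Route the single-argument overload through the checked overload below.
    public override string Serialize(object value)
    {
        return this.Serialize(value, formatted: false);
    }

    public override string Serialize(object value, bool formatted)
    {
        // Delegate the actual JSON work to the base converter.
        string json = base.Serialize(value, formatted);

        if (json != null && json.Length > this.maxChars)
        {
            throw new InvalidOperationException(
                $"Serialized payload is {json.Length} characters; the limit is {this.maxChars}.");
        }

        return json;
    }
}
```

Because the subclass is still a `JsonDataConverter`, it could be assigned to `context.MessageDataConverter` or `context.ErrorDataConverter` in the same way as the plain `JsonDataConverter` instances in the example above.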
+ +## Activity Serialization + +Activities also use `DataConverter`: + +```csharp +public class MyActivity : TaskActivity<Input, Output> +{ + public MyActivity() + : base(new CustomJsonDataConverter()) // Pass converter to base constructor + { + } + + protected override Output Execute(TaskContext context, Input input) + { + // input was deserialized with DataConverter + // return value will be serialized with DataConverter + return new Output { Value = input.Value * 2 }; + } +} +``` + +## Serialization Considerations + +### Immutable Types + +Use immutable types for orchestration inputs and outputs: + +```csharp +// Good - immutable record +public record OrderInput(string OrderId, List Items); + +// Good - immutable class +public class OrderInput +{ + public OrderInput(string orderId, List items) + { + OrderId = orderId; + Items = items.ToList(); // Defensive copy + } + + public string OrderId { get; } + public IReadOnlyList Items { get; } +} +``` + +### Polymorphic Types + +When using inheritance, ensure proper type handling: + +```csharp +// Base class +public abstract class PaymentMethod +{ + public string Id { get; set; } +} + +// Derived classes +public class CreditCard : PaymentMethod +{ + public string CardNumber { get; set; } +} + +public class BankTransfer : PaymentMethod +{ + public string AccountNumber { get; set; } +} + +// TypeNameHandling.Objects (default) handles this correctly +var payment = new CreditCard { Id = "1", CardNumber = "4111..." 
}; +var json = converter.Serialize(payment); +// json includes "$type" property for deserialization +``` + +### Circular References + +Avoid circular references in serialized objects: + +```csharp +// Bad - circular reference +public class Node +{ + public string Value { get; set; } + public Node Parent { get; set; } // Can create circular reference + public List<Node> Children { get; set; } +} + +// Better - use IDs for references +public class Node +{ + public string Id { get; set; } + public string Value { get; set; } + public string ParentId { get; set; } + public List<string> ChildIds { get; set; } +} +``` + +If you must handle circular references: + +```csharp +var settings = new JsonSerializerSettings +{ + TypeNameHandling = TypeNameHandling.Objects, + ReferenceLoopHandling = ReferenceLoopHandling.Serialize, + PreserveReferencesHandling = PreserveReferencesHandling.Objects +}; +``` + +### Large Payloads + +Avoid large payloads in orchestration state: + +```csharp +// Bad - large payload stored in state +public class BadInput +{ + public byte[] FileContent { get; set; } // Could be megabytes +} + +// Better - store reference, not content +public class BetterInput +{ + public string BlobUri { get; set; } // Reference to blob storage +} + +public override async Task<string> RunTask( + OrchestrationContext context, + BetterInput input) +{ + // Activity downloads content when needed + var content = await context.ScheduleTask<byte[]>( + typeof(DownloadBlobActivity), + input.BlobUri); + + // Process and store result + var resultUri = await context.ScheduleTask<string>( + typeof(UploadResultActivity), + processedContent); + + return resultUri; +} +``` + +### Non-Serializable Types + +Never include `CancellationToken` or other non-serializable runtime types in your input/output classes: + +```csharp +// DANGEROUS - CancellationToken cannot be serialized safely +public class BadActivityInput +{ + public string Data { get; set; } + public CancellationToken CancellationToken { get; set; } // DO NOT DO THIS 
+} + +// Good - pass cancellation token through method parameters, not serialized state +public class GoodActivityInput +{ + public string Data { get; set; } +} +``` + +> [!WARNING] +> Attempting to serialize `CancellationToken` can cause memory corruption, application crashes, and unpredictable behavior. The `CancellationToken` struct contains internal handles and references that are not designed for serialization. + +Other types to avoid in serialized data: + +- `CancellationToken` and `CancellationTokenSource` +- `Task` and `Task` +- `Thread`, `Timer`, and other threading primitives +- `Stream` and its derivatives +- `HttpClient` and other network clients +- Any type holding unmanaged resources or handles + +## Compression + +For large payloads, consider compression: + +```csharp +public class CompressedDataConverter : DataConverter +{ + private readonly JsonDataConverter _inner = JsonDataConverter.Default; + + public override string Serialize(object value) + { + string json = _inner.Serialize(value); + return CompressString(json); + } + + public override object Deserialize(string data, Type objectType) + { + string json = DecompressString(data); + return _inner.Deserialize(json, objectType); + } + + private string CompressString(string text) + { + var bytes = Encoding.UTF8.GetBytes(text); + using var output = new MemoryStream(); + using (var gzip = new GZipStream(output, CompressionLevel.Optimal)) + { + gzip.Write(bytes, 0, bytes.Length); + } + return Convert.ToBase64String(output.ToArray()); + } + + private string DecompressString(string compressed) + { + var bytes = Convert.FromBase64String(compressed); + using var input = new MemoryStream(bytes); + using var gzip = new GZipStream(input, CompressionMode.Decompress); + using var reader = new StreamReader(gzip); + return reader.ReadToEnd(); + } +} +``` + +## Version Compatibility + +### Schema Evolution + +Design for forward and backward compatibility: + +```csharp +// Version 1 +public class OrderV1 +{ + 
public string OrderId { get; set; } + public decimal Amount { get; set; } +} + +// Version 2 - added property +public class OrderV2 +{ + public string OrderId { get; set; } + public decimal Amount { get; set; } + public string Currency { get; set; } = "USD"; // Default for old data +} + +// Version 3 - renamed property +public class OrderV3 +{ + public string OrderId { get; set; } + + [JsonProperty("Amount")] // Map old name + public decimal TotalAmount { get; set; } + + public string Currency { get; set; } = "USD"; +} +``` + +### Type Name Changes + +When moving types between assemblies or namespaces: + +```csharp +// Custom binder to handle type migrations +public class MySerializationBinder : ISerializationBinder +{ + public Type BindToType(string assemblyName, string typeName) + { + // Handle old type names + if (typeName == "OldNamespace.MyType") + { + return typeof(NewNamespace.MyType); + } + + // Fall back to default + return Type.GetType($"{typeName}, {assemblyName}"); + } + + public void BindToName(Type serializedType, out string assemblyName, out string typeName) + { + assemblyName = serializedType.Assembly.FullName; + typeName = serializedType.FullName; + } +} +``` + +## Best Practices + +### 1. Use Simple, Serializable Types + +```csharp +// Good - simple POCO +public class OrderInput +{ + public string OrderId { get; set; } + public List ItemIds { get; set; } + public decimal Total { get; set; } +} + +// Avoid - complex types with behavior +public class OrderInput +{ + private readonly IOrderValidator _validator; // Not serializable + + public void Validate() { /* ... */ } // Behavior belongs elsewhere +} +``` + +### 2. Keep Payloads Small + +```csharp +// Good - minimal data +public class ProcessingInput +{ + public string DocumentId { get; set; } +} + +// Avoid - embedding large data +public class ProcessingInput +{ + public byte[] DocumentContent { get; set; } // Could be huge +} +``` + +### 3. 
Be Explicit About Nullability + +```csharp +public class OrderInput +{ + public string OrderId { get; set; } // Required + public string? CustomerNote { get; set; } // Optional + public List<string> Items { get; set; } = new(); // Never null +} +``` + +### 4. Test Serialization Round-Trips + +```csharp +[Fact] +public void OrderInput_SerializesCorrectly() +{ + var input = new OrderInput + { + OrderId = "order-123", + Items = new List<string> { "item-1", "item-2" } + }; + + var converter = JsonDataConverter.Default; + string json = converter.Serialize(input); + var deserialized = converter.Deserialize<OrderInput>(json); + + Assert.Equal(input.OrderId, deserialized.OrderId); + Assert.Equal(input.Items, deserialized.Items); +} +``` + +## Next Steps + +- [Testing](testing.md) - Testing orchestrations +- [Middleware](middleware.md) - Custom middleware +- [Entities](entities.md) - Durable Entities diff --git a/docs/advanced/testing.md b/docs/advanced/testing.md new file mode 100644 index 000000000..83e779611 --- /dev/null +++ b/docs/advanced/testing.md @@ -0,0 +1,632 @@ +# Testing Orchestrations + +Testing durable orchestrations requires special consideration due to their replay-based execution model. This guide covers strategies and patterns for effectively testing your orchestrations and activities. + +## Testing Approaches + +There are three main approaches to testing DTFx code: + +1. **Unit testing** - Test components in isolation with mocks +2. **Integration testing** - Test with the in-memory emulator +3. 
**End-to-end testing** - Test with real backend providers + +## Unit Testing Activities + +Activities are standard async methods, making them straightforward to test: + +```csharp +using Microsoft.VisualStudio.TestTools.UnitTesting; + +[TestClass] +public class ActivityTests +{ + [TestMethod] + public async Task ProcessOrderActivity_ValidOrder_ReturnsConfirmation() + { + // Arrange + var activity = new ProcessOrderActivity( + mockInventoryService.Object, + mockPaymentService.Object); + + var orchestrationInstance = new OrchestrationInstance + { + InstanceId = "test-123", + ExecutionId = Guid.NewGuid().ToString() + }; + var context = new TaskContext(orchestrationInstance); + var input = new OrderInput { OrderId = "order-1", Amount = 99.99m }; + + // Act + var result = await activity.RunAsync(context, input); + + // Assert + Assert.IsNotNull(result); + Assert.AreEqual("Confirmed", result.Status); + } + + [TestMethod] + public async Task ProcessOrderActivity_InvalidOrder_ThrowsException() + { + // Arrange + var activity = new ProcessOrderActivity( + mockInventoryService.Object, + mockPaymentService.Object); + + var orchestrationInstance = new OrchestrationInstance + { + InstanceId = "test-123", + ExecutionId = Guid.NewGuid().ToString() + }; + var context = new TaskContext(orchestrationInstance); + var input = new OrderInput { OrderId = null }; + + // Act & Assert + await Assert.ThrowsExceptionAsync( + () => activity.RunAsync(context, input)); + } +} +``` + +### Testing with Dependencies + +Use dependency injection for testable activities: + +```csharp +public class SendEmailActivity : AsyncTaskActivity<EmailRequest, EmailResult> +{ + private readonly IEmailService _emailService; + + public SendEmailActivity(IEmailService emailService) + { + _emailService = emailService; + } + + protected override async Task<EmailResult> ExecuteAsync( + TaskContext context, + EmailRequest input) + { + return await _emailService.SendAsync(input); + } +} + +[TestClass] +public class SendEmailActivityTests +{ + [TestMethod] + 
public async Task SendEmail_ValidRequest_Succeeds() + { + // Arrange + var mockEmailService = new Mock<IEmailService>(); + mockEmailService + .Setup(x => x.SendAsync(It.IsAny<EmailRequest>())) + .ReturnsAsync(new EmailResult { Success = true, MessageId = "msg-1" }); + + var activity = new SendEmailActivity(mockEmailService.Object); + + // Act + var result = await activity.ExecuteAsync( + context: null!, + input: new EmailRequest { To = "test@example.com", Subject = "Test" }); + + // Assert + Assert.IsTrue(result.Success); + Assert.AreEqual("msg-1", result.MessageId); + mockEmailService.Verify(x => x.SendAsync(It.IsAny<EmailRequest>()), Times.Once); + } +} +``` + +## Unit Testing Orchestrations + +Orchestrations are harder to unit test due to their stateful nature and use of the `OrchestrationContext`. The recommended approach is to use **integration testing with the emulator** (see below), but you can also extract testable logic into separate classes. + +### Extract Business Logic for Unit Testing + +Extract complex business logic into separate, testable classes. Keep orchestration code thin and focused only on coordination: + +```csharp +// Testable logic class - no orchestration dependencies +public class OrderLogic : IOrderLogic +{ + public void ValidateOrder(OrderInput input) + { + if (string.IsNullOrEmpty(input.OrderId)) + throw new ArgumentException("OrderId is required"); + } + + public NextStep DetermineNextStep(InventoryResult inventory) + { + return inventory.AllAvailable + ? 
NextStep.ProcessPayment + : NextStep.BackOrder; + } +} + +// Unit tests for the extracted logic +[TestClass] +public class OrderLogicTests +{ + [TestMethod] + public void ValidateOrder_MissingOrderId_ThrowsArgumentException() + { + var logic = new OrderLogic(); + var input = new OrderInput { OrderId = null }; + + Assert.ThrowsException( + () => logic.ValidateOrder(input)); + } + + [TestMethod] + public void DetermineNextStep_AllAvailable_ReturnsProcessPayment() + { + var logic = new OrderLogic(); + var inventory = new InventoryResult { AllAvailable = true }; + + var result = logic.DetermineNextStep(inventory); + + Assert.AreEqual(NextStep.ProcessPayment, result); + } +} +``` + +Then use the logic in your orchestration: + +```csharp +public class OrderOrchestration : TaskOrchestration +{ + // Use a static/singleton instance or instantiate directly + // Note: Constructor dependency injection is NOT supported by default + // because the framework uses Activator.CreateInstance() which requires + // a parameterless constructor. + private readonly IOrderLogic _logic = new OrderLogic(); + + public override async Task RunTask( + OrchestrationContext context, + OrderInput input) + { + // Validate using testable logic + _logic.ValidateOrder(input); + + var inventory = await context.ScheduleTask( + typeof(CheckInventoryActivity), + input.Items); + + // Process result using testable logic + var decision = _logic.DetermineNextStep(inventory); + + // ... rest of orchestration + } +} +``` + +> [!IMPORTANT] +> Orchestrations are instantiated by the framework using `Activator.CreateInstance()`, which requires a parameterless constructor. Constructor-based dependency injection is not supported out of the box. If you need DI, you must implement a custom `ObjectCreator` and register it with `AddTaskOrchestrations()`. + +### Why Not Mock OrchestrationContext? + +`OrchestrationContext` is an abstract class with complex internal state management for replay semantics. 
Creating a proper mock requires implementing many methods and simulating the replay behavior correctly. **Integration testing with the emulator is strongly recommended** instead: it's fast, reliable, and tests the actual orchestration behavior. + +## Integration Testing with Emulator + +The emulator provides fast, isolated testing without external dependencies: + +```csharp +using DurableTask.Core; +using DurableTask.Emulator; +using Microsoft.Extensions.Logging; +using Microsoft.VisualStudio.TestTools.UnitTesting; + +[TestClass] +public class OrderOrchestrationIntegrationTests +{ + private ILoggerFactory _loggerFactory; + private LocalOrchestrationService _service; + private TaskHubWorker _worker; + private TaskHubClient _client; + + [TestInitialize] + public async Task Setup() + { + _loggerFactory = LoggerFactory.Create(builder => builder.AddConsole()); + _service = new LocalOrchestrationService(); + _worker = new TaskHubWorker(_service, _loggerFactory); + _client = new TaskHubClient(_service, loggerFactory: _loggerFactory); + + // Register orchestrations and activities + _worker.AddTaskOrchestrations(typeof(OrderOrchestration)); + _worker.AddTaskActivities( + typeof(ValidateOrderActivity), + typeof(ProcessPaymentActivity), + typeof(SendConfirmationActivity)); + + await _worker.StartAsync(); + } + + [TestCleanup] + public async Task Cleanup() + { + await _worker.StopAsync(isForced: true); + } + + [TestMethod] + public async Task OrderOrchestration_ValidOrder_CompletesSuccessfully() + { + // Arrange + var input = new OrderInput + { + OrderId = "order-123", + CustomerId = "customer-456", + Items = new[] { "item-1", "item-2" } + }; + + // Act + var instance = await _client.CreateOrchestrationInstanceAsync( + typeof(OrderOrchestration), + input); + + var result = await _client.WaitForOrchestrationAsync( + instance, + TimeSpan.FromSeconds(30)); + + // Assert + Assert.AreEqual(OrchestrationStatus.Completed, result.OrchestrationStatus); + var output = 
result.GetOutput(); + Assert.AreEqual("Confirmed", output.Status); + } + + [TestMethod] + public async Task OrderOrchestration_InvalidOrder_Fails() + { + // Arrange + var input = new OrderInput { OrderId = null }; + + // Act + var instance = await _client.CreateOrchestrationInstanceAsync( + typeof(OrderOrchestration), + input); + + var result = await _client.WaitForOrchestrationAsync( + instance, + TimeSpan.FromSeconds(30)); + + // Assert + Assert.AreEqual(OrchestrationStatus.Failed, result.OrchestrationStatus); + } +} +``` + +### Testing Timeouts and Timers + +```csharp +[TestMethod] +public async Task ReminderOrchestration_SendsReminderAfterDelay() +{ + // Arrange + var input = new ReminderInput { DelayMinutes = 30 }; + var remindersSent = new List(); + + // Track activity calls + _worker.AddTaskActivities( + new MockSendReminderActivity(reminder => remindersSent.Add(reminder))); + + // Act + var instance = await _client.CreateOrchestrationInstanceAsync( + typeof(ReminderOrchestration), + input); + + // Note: Emulator runs timers immediately in test mode + var result = await _client.WaitForOrchestrationAsync( + instance, + TimeSpan.FromSeconds(30)); + + // Assert + Assert.AreEqual(OrchestrationStatus.Completed, result.OrchestrationStatus); + Assert.AreEqual(1, remindersSent.Count); +} +``` + +### Testing Sub-Orchestrations + +```csharp +[TestMethod] +public async Task ParentOrchestration_CallsChildOrchestration() +{ + // Arrange + _worker.AddTaskOrchestrations( + typeof(ParentOrchestration), + typeof(ChildOrchestration)); + _worker.AddTaskActivities(typeof(ChildActivity)); + + // Act + var instance = await _client.CreateOrchestrationInstanceAsync( + typeof(ParentOrchestration), + new ParentInput { Value = 5 }); + + var result = await _client.WaitForOrchestrationAsync( + instance, + TimeSpan.FromSeconds(30)); + + // Assert + Assert.AreEqual(OrchestrationStatus.Completed, result.OrchestrationStatus); + var output = result.GetOutput(); + Assert.AreEqual(10, 
output.ProcessedValue); // Child doubled the value +} +``` + +### Testing External Events + +```csharp +[TestMethod] +public async Task ApprovalOrchestration_WaitsForApproval() +{ + // Arrange + var input = new ApprovalRequest { RequestId = "req-1", Amount = 500 }; + + var instance = await _client.CreateOrchestrationInstanceAsync( + typeof(ApprovalOrchestration), + input); + + // Wait a bit for orchestration to reach the wait point + await Task.Delay(100); + + // Act - send approval event + await _client.RaiseEventAsync( + instance, + "ApprovalResult", + new ApprovalResult { Approved = true, ApprovedBy = "manager@example.com" }); + + var result = await _client.WaitForOrchestrationAsync( + instance, + TimeSpan.FromSeconds(30)); + + // Assert + Assert.AreEqual(OrchestrationStatus.Completed, result.OrchestrationStatus); + var output = result.GetOutput(); + Assert.IsTrue(output.WasApproved); +} +``` + +## Testing Retry Behavior + +```csharp +[TestMethod] +public async Task Orchestration_RetriesFailedActivity() +{ + // Arrange + var failCount = 0; + var failingActivity = new Func>( + async (context, input) => + { + failCount++; + if (failCount < 3) + { + throw new TransientException("Temporary failure"); + } + return "Success"; + }); + + _worker.AddTaskActivities( + TestOrchestrationHost.MakeActivity( + "FailingActivity", + failingActivity)); + + _worker.AddTaskOrchestrations(typeof(RetryingOrchestration)); + + // Act + var instance = await _client.CreateOrchestrationInstanceAsync( + typeof(RetryingOrchestration), + "input"); + + var result = await _client.WaitForOrchestrationAsync( + instance, + TimeSpan.FromSeconds(30)); + + // Assert + Assert.AreEqual(OrchestrationStatus.Completed, result.OrchestrationStatus); + Assert.AreEqual(3, failCount); // Failed twice, succeeded on third attempt +} +``` + +## Testing Replay Behavior + +Ensure your orchestrations handle replay correctly: + +```csharp +[TestMethod] +public async Task Orchestration_DoesNotDuplicateSideEffects() 
+{ + // Arrange + var sideEffectCount = 0; + + _worker.AddTaskActivities( + new CountingSideEffectActivity(() => Interlocked.Increment(ref sideEffectCount))); + + _worker.AddTaskOrchestrations(typeof(SideEffectOrchestration)); + + // Act - run orchestration that will replay + var instance = await _client.CreateOrchestrationInstanceAsync( + typeof(SideEffectOrchestration), + "input"); + + var result = await _client.WaitForOrchestrationAsync( + instance, + TimeSpan.FromSeconds(30)); + + // Assert - side effect should only occur once despite replays + Assert.AreEqual(1, sideEffectCount); +} +``` + +## Test Helpers + +### Creating Mock Activities + +```csharp +public static class TestHelpers +{ + public static TaskActivity MakeActivity( + string name, + Func> implementation) + { + return new FuncTaskActivity(implementation) + { + Name = name + }; + } +} + +// Usage +var mockActivity = TestHelpers.MakeActivity( + "ProcessOrder", + async (context, input) => new OrderResult { Status = "Confirmed" }); +``` + +### Test Base Class + +```csharp +public abstract class OrchestrationTestBase +{ + protected ILoggerFactory LoggerFactory; + protected LocalOrchestrationService Service; + protected TaskHubWorker Worker; + protected TaskHubClient Client; + + [TestInitialize] + public virtual async Task TestInitialize() + { + LoggerFactory = Microsoft.Extensions.Logging.LoggerFactory.Create(builder => builder.AddConsole()); + Service = new LocalOrchestrationService(); + Worker = new TaskHubWorker(Service, LoggerFactory); + Client = new TaskHubClient(Service, loggerFactory: LoggerFactory); + + RegisterOrchestrations(Worker); + RegisterActivities(Worker); + + await Worker.StartAsync(); + } + + [TestCleanup] + public virtual async Task TestCleanup() + { + await Worker.StopAsync(isForced: true); + } + + protected abstract void RegisterOrchestrations(TaskHubWorker worker); + protected abstract void RegisterActivities(TaskHubWorker worker); + + protected async Task RunOrchestrationAsync( + 
Type orchestrationType, + object input, + TimeSpan? timeout = null) + { + var instance = await Client.CreateOrchestrationInstanceAsync( + orchestrationType, + input); + + var result = await Client.WaitForOrchestrationAsync( + instance, + timeout ?? TimeSpan.FromSeconds(30)); + + if (result.OrchestrationStatus == OrchestrationStatus.Failed) + { + throw new Exception( + $"Orchestration failed: {result.FailureDetails?.ErrorMessage}"); + } + + return result.GetOutput(); + } +} +``` + +## Best Practices + +### 1. Use the Emulator for Speed + +```csharp +// Fast - use emulator for most tests +var service = new LocalOrchestrationService(); + +// Slow - only for end-to-end tests +var service = new AzureStorageOrchestrationService(settings); +``` + +### 2. Test Determinism + +Verify orchestrations are deterministic: + +```csharp +[TestMethod] +public async Task Orchestration_IsDeterministic() +{ + // Run the same orchestration multiple times + for (int i = 0; i < 5; i++) + { + var instance = await _client.CreateOrchestrationInstanceAsync( + typeof(MyOrchestration), + new Input { Value = 42 }); + + var result = await _client.WaitForOrchestrationAsync( + instance, + TimeSpan.FromSeconds(30)); + + Assert.AreEqual(OrchestrationStatus.Completed, result.OrchestrationStatus); + Assert.AreEqual(84, result.GetOutput()); + } +} +``` + +### 3. Test Edge Cases + +```csharp +[TestMethod] +public async Task Orchestration_HandlesNullInput() +{ + // Test with null + var instance = await _client.CreateOrchestrationInstanceAsync( + typeof(MyOrchestration), + input: null); + + var result = await _client.WaitForOrchestrationAsync( + instance, + TimeSpan.FromSeconds(30)); + + // Verify appropriate handling +} + +[TestMethod] +public async Task Orchestration_HandlesEmptyList() +{ + var input = new Input { Items = new List() }; + + var instance = await _client.CreateOrchestrationInstanceAsync( + typeof(ProcessItemsOrchestration), + input); + + // ... +} +``` + +### 4. 
Isolate Tests + +```csharp +[TestInitialize] +public async Task Setup() +{ + // Create fresh service for each test + _service = new LocalOrchestrationService(); + // ... +} +``` + +## Sample Test Project + +See the complete test examples: +- [DurableTask.Samples.Tests](../../Test/DurableTask.Samples.Tests) +- [DurableTask.Core.Tests](../../Test/DurableTask.Core.Tests) +- [DurableTask.AzureStorage.Tests](../../Test/DurableTask.AzureStorage.Tests) + +## Next Steps + +- [Middleware](middleware.md) - Custom middleware +- [Serialization](serialization.md) - Custom serialization +- [Error Handling](../features/error-handling.md) - Exception handling diff --git a/docs/concepts/README.md b/docs/concepts/README.md new file mode 100644 index 000000000..1ad80e857 --- /dev/null +++ b/docs/concepts/README.md @@ -0,0 +1,16 @@ +# Core Concepts + +This section explains the fundamental concepts you need to understand when working with the Durable Task Framework. + +## Suggested Reading Order + +| Topic | Description | +| ----- | ----------- | +| [Core Concepts](core-concepts.md) | Architecture overview: Task Hubs, Workers, Clients | +| [Orchestrations](orchestrations.md) | Creating and managing durable workflows | +| [Activities](activities.md) | Implementing the basic units of work | +| [Replay and Durability](replay-and-durability.md) | How event sourcing enables fault tolerance | +| [Deterministic Constraints](deterministic-constraints.md) | Rules for writing correct orchestration code | + +> [!TIP] +> Start with [Core Concepts](core-concepts.md) for the architecture overview, then read [Replay and Durability](replay-and-durability.md) to understand *why* orchestrations have special constraints. diff --git a/docs/concepts/activities.md b/docs/concepts/activities.md new file mode 100644 index 000000000..803e6a8f9 --- /dev/null +++ b/docs/concepts/activities.md @@ -0,0 +1,354 @@ +# Activities + +Activities are the basic units of work in the Durable Task Framework. 
They perform actual operations like calling APIs, accessing databases, or performing computations. Unlike orchestrations, activities do not need to be deterministic. + +## Creating Activities + +### Type Parameters + +- `TInput` - The input type passed from the orchestration +- `TResult` - The return type sent back to the orchestration + +Note that input and output types must be JSON-serializable. See [serialization](../advanced/serialization.md) for details. + +### Synchronous Activities + +For simple, synchronous work: + +```csharp +using DurableTask.Core; + +public class GreetActivity : TaskActivity<string, string> +{ + protected override string Execute(TaskContext context, string name) + { + return $"Hello, {name}!"; + } +} +``` + +### Asynchronous Activities + +For async operations (recommended for I/O): + +```csharp +public class CallApiActivity : AsyncTaskActivity +{ + private static readonly HttpClient s_httpClient = new HttpClient(); + + protected override async Task ExecuteAsync( + TaskContext context, + ApiRequest input) + { + using var response = await s_httpClient.PostAsJsonAsync(input.Url, input.Body); + response.EnsureSuccessStatusCode(); + return await response.Content.ReadFromJsonAsync(); + } +} +``` + +## Registration + +### Basic Registration + +```csharp +var worker = new TaskHubWorker(service, loggerFactory); +worker.AddTaskActivities(typeof(GreetActivity), typeof(CallApiActivity)); +await worker.StartAsync(); +``` + +### With Dependency Injection + +Create activity instances with dependencies: + +```csharp +// Using activity factory +worker.AddTaskActivities(new ActivityObjectCreator( + () => new CallApiActivity(httpClient))); + +// Or implement INameVersionObjectManager for full control +``` + +### With Generic Creator + +```csharp +public class MyActivityCreator : ObjectCreator<TaskActivity> +{ + private readonly IServiceProvider _services; + + public MyActivityCreator(IServiceProvider services) + { + _services = services; + } + + public override TaskActivity 
Create() + { + // Resolve from DI container + return (TaskActivity)_services.GetRequiredService(Type); + } +} + +// Register +worker.AddTaskActivities(new MyActivityCreator(serviceProvider)); +``` + +## Calling Activities from Orchestrations + +### Basic Call + +```csharp +var result = await context.ScheduleTask(typeof(GreetActivity), "World"); +``` + +### With Retry Options + +```csharp +var retryOptions = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(5), + maxNumberOfAttempts: 3) +{ + BackoffCoefficient = 2.0, + MaxRetryInterval = TimeSpan.FromMinutes(1), + RetryTimeout = TimeSpan.FromMinutes(10) +}; + +var result = await context.ScheduleWithRetry( + typeof(CallApiActivity), + retryOptions, + apiRequest); +``` + +### Using Typed Proxies + +Generate strongly-typed activity clients: + +```csharp +// Define interface +public interface IOrderActivities +{ + Task ValidateOrder(Order order); + Task ProcessPayment(PaymentRequest request); + Task ShipOrder(ShippingRequest request); +} + +// In orchestration +public override async Task RunTask( + OrchestrationContext context, + Order order) +{ + var activities = context.CreateClient(); + + var isValid = await activities.ValidateOrder(order); + if (!isValid) return new OrderResult { Success = false }; + + var payment = await activities.ProcessPayment(order.Payment); + var tracking = await activities.ShipOrder(order.Shipping); + + return new OrderResult { Success = true, TrackingNumber = tracking }; +} +``` + +> [!IMPORTANT] +> Do not include `TaskContext` or `CancellationToken` parameters in activity interface methods. Only JSON-serializable input and output types are allowed. + +## Activity Best Practices + +### 1. 
Keep Activities Focused + +Each activity should do one thing: + +```csharp +// ✅ Good - single responsibility +public class SendEmailActivity : AsyncTaskActivity { } +public class SaveToDbActivity : AsyncTaskActivity { } + +// ❌ Bad - too many responsibilities +public class DoEverythingActivity : AsyncTaskActivity +{ + // Sends email, saves to DB, calls API, etc. +} +``` + +The exception to this is when performance considerations require batching multiple related operations together to reduce overhead. However, this must be done carefully with attention to error handling and idempotency. + +### 2. Make Activities Idempotent + +Activities may be retried, so design them to be idempotent: + +```csharp +public class ProcessPaymentActivity : AsyncTaskActivity +{ + protected override async Task ExecuteAsync( + TaskContext context, + PaymentRequest input) + { + // Use idempotency key to prevent duplicate charges + return await _paymentService.ProcessAsync( + input, + idempotencyKey: input.OrderId); + } +} +``` + +### 3. Handle Timeouts + +Implement cancellation support: + +```csharp +public class LongRunningActivity : AsyncTaskActivity +{ + protected override async Task ExecuteAsync( + TaskContext context, + Input input) + { + using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(5)); + + try + { + return await DoWorkAsync(input, cts.Token); + } + catch (OperationCanceledException) + { + throw new TimeoutException("Activity timed out"); + } + } +} +``` + +### 4. Log with Context + +Include orchestration context in logs: + +```csharp +public class MyActivity : AsyncTaskActivity +{ + private readonly ILogger _logger; + + protected override async Task ExecuteAsync( + TaskContext context, + Input input) + { + _logger.LogInformation( + "Processing {Input} for orchestration {InstanceId}", + input, + context.OrchestrationInstance.InstanceId); + + // ... do work ... + } +} +``` + +### 5. 
Return Serializable Results + +Ensure return types can be serialized: + +```csharp +// ✅ Good - serializable POCO +public class ActivityResult +{ + public string Status { get; set; } + public int Count { get; set; } + public DateTime ProcessedAt { get; set; } +} + +// ❌ Bad - not serializable +public class BadResult +{ + public HttpClient Client { get; set; } // Can't serialize + public Stream DataStream { get; set; } // Can't serialize +} +``` + +## Activity Execution Model + +### How Activities Run + +1. The orchestration calls `ScheduleTask()`, which creates a `TaskScheduled` event in the orchestration history +2. The activity message is placed on the provider-specific work item queue +3. A worker picks up the message and executes the activity (workers typically act as competing consumers) +4. The result is sent back to the orchestration's provider-specific control queue +5. The orchestration replays and sees the `TaskCompleted` event in its updated history + +### Activity vs Orchestration Context + +| Feature | Activity (`TaskContext`) | Orchestration (`OrchestrationContext`) | +| ------- | ------------------------ | -------------------------------------- | +| Instance info | ✅ Available | ✅ Available | +| Schedule tasks | ❌ No | ✅ Yes | +| Create timers | ❌ No | ✅ Yes | +| Wait for events | ❌ No | ✅ Yes | +| Determinism required | ❌ No | ✅ Yes | +| Can call external APIs | ✅ Yes | ❌ Should not | + +## Error Handling in Activities + +### Throwing Exceptions + +Unhandled exceptions fail the activity and surface as a `TaskFailedException` in the orchestration: + +```csharp +public class ValidateActivity : TaskActivity<Order, bool> +{ + protected override bool Execute(TaskContext context, Order order) + { + if (string.IsNullOrEmpty(order.CustomerId)) + { + throw new ArgumentException("Customer ID is required"); + } + return true; + } +} + +// In orchestration +try +{ + await context.ScheduleTask<bool>(typeof(ValidateActivity), order); +} +catch (TaskFailedException ex) +{ + // ex.InnerException contains the 
original ArgumentException +} +``` + +### Returning Errors vs Throwing + +Consider returning error results for expected failures: + +```csharp +public class ProcessOrderActivity : AsyncTaskActivity<Order, OrderResult> +{ + protected override async Task<OrderResult> ExecuteAsync( + TaskContext context, + Order order) + { + var inventory = await CheckInventoryAsync(order); + + if (!inventory.IsAvailable) + { + // Expected case - return result + return new OrderResult + { + Success = false, + Error = "Insufficient inventory" + }; + } + + // Unexpected case - throw + if (order.TotalAmount < 0) + { + throw new InvalidOperationException("Invalid order amount"); + } + + return new OrderResult { Success = true }; + } +} +``` + +This approach avoids potentially expensive retries for known failure conditions, and also avoids problems with serializing exceptions. + +## Next Steps + +- [Orchestrations](orchestrations.md) - Coordinating activities +- [Retries](../features/retries.md) - Configuring automatic retries +- [Error Handling](../features/error-handling.md) - Comprehensive error handling +- [Replay and Durability](replay-and-durability.md) - Understanding the replay model diff --git a/docs/concepts/core-concepts.md b/docs/concepts/core-concepts.md new file mode 100644 index 000000000..86d0c03b9 --- /dev/null +++ b/docs/concepts/core-concepts.md @@ -0,0 +1,230 @@ +# Core Concepts + +This document explains the fundamental concepts of the Durable Task Framework (DTFx). 
+ +## Architecture Overview + +```text +┌─────────────────────────────────────────────────────────────────┐ +│                            Task Hub                             │ +│  ┌─────────────────┐                    ┌─────────────────┐     │ +│  │  TaskHubWorker  │                    │  TaskHubClient  │     │ +│  │                 │                    │                 │     │ +│  │ ┌─────────────┐ │                    │ • Start         │     │ +│  │ │Orchestration│ │                    │ • Query         │     │ +│  │ │  Handlers   │ │                    │ • Send Events   │     │ +│  │ └─────────────┘ │                    │ • Terminate     │     │ +│  │ ┌─────────────┐ │                    │                 │     │ +│  │ │  Activity   │ │                    └────────┬────────┘     │ +│  │ │  Handlers   │ │                             │              │ +│  │ └─────────────┘ │                             │              │ +│  └────────┬────────┘                             │              │ +│           │                                      │              │ +│           └──────────────────┬───────────────────┘              │ +│                              │                                  │ +│                              ▼                                  │ +│  ┌───────────────────────────────────────────────────────────┐  │ +│  │                   IOrchestrationService                   │  │ +│  │                    (Backend Provider)                     │  │ +│  │                                                           │  │ +│  │  • Message Queues (control, work items)                   │  │ +│  │  • State Storage (history, instances)                     │  │ +│  │  • Scale Management (partitions, etc.)                    │  │ +│  └───────────────────────────────────────────────────────────┘  │ +└─────────────────────────────────────────────────────────────────┘ +``` + +## Task Hub + +A **Task Hub** is a logical container for orchestration and activity state. It represents a single deployment unit and includes: + +- **Message Queues** - Queues for orchestration and activity work items +- **History Store** - Persistent storage for orchestration history +- **Instance Store** - Metadata about orchestration instances for querying + +### Key Characteristics + +- Each task hub is **isolated**: orchestrations in different task hubs cannot interact directly +- Multiple workers can connect to the same task hub for **scale-out** +- All connected workers must share the same backend provider configuration and orchestration/activity code +- The task hub name is used as a **namespace** for all stored data + +## TaskHubWorker + +The **TaskHubWorker** hosts and executes orchestrations and activities. It: + +- Polls the backend for work items +- Dispatches orchestration and activity code +- Reports completion back to the backend + +### Lifecycle + +```csharp +// Create worker +var orchestrationService = GetSelectedOrchestrationService(); +var worker = new TaskHubWorker(orchestrationService, loggerFactory); + +// Register handlers +worker.AddTaskOrchestrations(typeof(MyOrchestration)); +worker.AddTaskActivities(typeof(MyActivity)); + +// Start processing +await worker.StartAsync(); + +// ... application runs ... 
+
+// Graceful shutdown
+await worker.StopAsync();
+```
+
+### Scaling
+
+Multiple workers can connect to the same task hub:
+
+```text
+┌─────────────┐  ┌─────────────┐  ┌─────────────┐
+│  Worker 1   │  │  Worker 2   │  │  Worker 3   │
+└──────┬──────┘  └──────┬──────┘  └──────┬──────┘
+       │                │                │
+       └────────────────┼────────────────┘
+                        │
+                        ▼
+               ┌─────────────────┐
+               │    Task Hub     │
+               └─────────────────┘
+```
+
+Work is distributed across workers automatically by the selected backend provider.
+
+## TaskHubClient
+
+The **TaskHubClient** is used to manage orchestration instances from external code:
+
+```csharp
+var orchestrationService = GetSelectedOrchestrationService();
+var client = new TaskHubClient(orchestrationService, loggerFactory: loggerFactory);
+
+// Start a new orchestration
+var instance = await client.CreateOrchestrationInstanceAsync(
+    typeof(MyOrchestration),
+    instanceId: "order-123",
+    input: new OrderData { ... });
+
+// Query status
+var state = await client.GetOrchestrationStateAsync(instance);
+
+// Send an event
+await client.RaiseEventAsync(instance, "ApprovalReceived", approvalData);
+
+// Wait for completion
+var result = await client.WaitForOrchestrationAsync(instance, timeout);
+
+// Terminate
+await client.TerminateInstanceAsync(instance, "Cancelled by user");
+```
+
+## Orchestrations
+
+**Orchestrations** are the core workflow definitions. They:
+
+- Define the sequence and logic of work
+- Coordinate activities and sub-orchestrations
+- Are **durable**: they survive process restarts
+- Must be **deterministic**: the same input produces the same sequence of actions
+
+```csharp
+public class OrderOrchestration : TaskOrchestration<OrderResult, OrderInput>
+{
+    public override async Task<OrderResult> RunTask(
+        OrchestrationContext context,
+        OrderInput input)
+    {
+        // Orchestration logic here
+        var validated = await context.ScheduleTask<bool>(typeof(ValidateOrder), input);
+
+        if (!validated)
+            return new OrderResult { Success = false };
+
+        await context.ScheduleTask(typeof(ProcessPayment), input);
+        await context.ScheduleTask(typeof(ShipOrder), input);
+
+        return new OrderResult { Success = true };
+    }
+}
+```
+
+See [Orchestrations](orchestrations.md) for detailed documentation.
+
+## Activities
+
+**Activities** are the units of work that orchestrations schedule. They:
+
+- Perform the actual work (API calls, database operations, etc.)
+- Can be **retried** automatically on failure
+- Are **not** required to be deterministic
+- Run once per scheduled invocation (with at-least-once guarantees)
+
+```csharp
+public class ProcessPaymentActivity : AsyncTaskActivity
+{
+    protected override async Task ExecuteAsync(
+        TaskContext context,
+        PaymentInput input)
+    {
+        // Actual work here - call payment API, etc.
+        var result = await PaymentService.ProcessAsync(input);
+        return result;
+    }
+}
+```
+
+See [Activities](activities.md) for detailed documentation.
+
+## Instance IDs
+
+Every orchestration instance has a unique **Instance ID**:
+
+```csharp
+// Auto-generated ID
+var instance = await client.CreateOrchestrationInstanceAsync(typeof(MyOrchestration), input);
+// instance.InstanceId = "abc123..."
+
+// Custom ID (recommended for idempotency)
+var instance = await client.CreateOrchestrationInstanceAsync(
+    typeof(MyOrchestration),
+    instanceId: "order-456", // Your custom ID
+    input: orderData);
+```
+
+### Best Practices
+
+- Use **meaningful IDs** like `order-{orderId}` or `user-{userId}-workflow`
+- Use random GUIDs if no meaningful ID is available and make sure to store them
+- Avoid reusing IDs for different logical workflows to prevent conflicts
+
+## Orchestration Status
+
+Orchestrations can be in one of these states:
+
+| Status | Description |
+| ------ | ----------- |
+| `Pending` | Scheduled but not yet started |
+| `Running` | Currently executing or waiting |
+| `Suspended` | Paused due to external request |
+| `Completed` | Finished successfully |
+| `Failed` | Terminated due to unhandled exception |
+| `Terminated` | Explicitly terminated via API |
+| `Canceled` | Not currently implemented |
+| `ContinuedAsNew` | Restarted via `ContinueAsNew` (not used in recent versions) |
+
+```csharp
+var state = await client.GetOrchestrationStateAsync(instance);
+Console.WriteLine($"Status: {state.OrchestrationStatus}");
+```
+
+## Next Steps
+
+- [Orchestrations](orchestrations.md): Writing orchestration logic
+- [Activities](activities.md): Writing activity code
+- [Replay and Durability](replay-and-durability.md): How durability works
+- [Deterministic Constraints](deterministic-constraints.md): Rules for orchestration code
diff --git a/docs/concepts/deterministic-constraints.md b/docs/concepts/deterministic-constraints.md
new file mode 100644
index 000000000..8ef4ff3b3
--- /dev/null
+++ b/docs/concepts/deterministic-constraints.md
@@ -0,0 +1,290 @@
+# Deterministic Constraints
+
+Orchestration code must be **deterministic**: it must produce the same sequence of operations every time it runs with the same history. This is required because orchestrations are [replayed](replay-and-durability.md) to rebuild state after interruptions.
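+
+To make the replay requirement concrete, here is a minimal, self-contained sketch. It is a toy model and not how DTFx is implemented: the "history" is just a list of operation names, and `ReplayDemo`, `Replay`, and `beforeNoon` are hypothetical names invented for this illustration.
+
+```csharp
+using System;
+using System.Collections.Generic;
+
+// Toy model of replay (an assumption for illustration, NOT the DTFx engine):
+// the history is the ordered list of operation names seen on earlier runs.
+public static class ReplayDemo
+{
+    // Runs the workflow. Operations already in the history must match it;
+    // operations beyond the end of the history are appended as new progress.
+    public static bool Replay(List<string> history, Func<IEnumerable<string>> workflow)
+    {
+        int index = 0;
+        foreach (string operation in workflow())
+        {
+            if (index < history.Count)
+            {
+                // Replaying: a mismatch means the code was non-deterministic.
+                if (history[index] != operation) return false;
+            }
+            else
+            {
+                history.Add(operation);
+            }
+            index++;
+        }
+        return true;
+    }
+
+    public static void Main()
+    {
+        var history = new List<string>();
+        bool beforeNoon = true; // stands in for a wall-clock check
+
+        IEnumerable<string> Workflow()
+        {
+            yield return beforeNoon ? "MorningActivity" : "EveningActivity";
+        }
+
+        Console.WriteLine(Replay(history, Workflow)); // first execution
+        beforeNoon = false;                           // "time passes"
+        Console.WriteLine(Replay(history, Workflow)); // replay diverges
+    }
+}
+```
+
+Run as a console program, this should print `True` and then `False`: the first pass records `MorningActivity`, and the second pass yields `EveningActivity`, which no longer matches the recorded history.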
+
+## The Golden Rule
+
+> **The same input must always produce the same sequence of durable operations.**
+
+Durable operations include:
+
+- `ScheduleTask` / `ScheduleWithRetry`
+- `CreateTimer`
+- `WaitForExternalEvent`
+- `CreateSubOrchestrationInstance`
+- `ContinueAsNew`
+
+## What NOT to Do
+
+### ❌ Don't Use Current Time Directly
+
+```csharp
+// ❌ WRONG - Non-deterministic
+if (DateTime.UtcNow > deadline)
+{
+    await context.ScheduleTask(typeof(ExpiredActivity), input);
+}
+
+// ✅ CORRECT - Use orchestration time
+if (context.CurrentUtcDateTime > deadline)
+{
+    await context.ScheduleTask(typeof(ExpiredActivity), input);
+}
+```
+
+### ❌ Don't Use Random Numbers
+
+```csharp
+// ❌ WRONG - Different on replay
+var random = new Random();
+if (random.Next(100) > 50)
+{
+    await context.ScheduleTask(typeof(ActivityA), input);
+}
+
+// ✅ CORRECT - Get random value from activity
+var randomValue = await context.ScheduleTask<int>(typeof(GetRandomNumberActivity), 100);
+if (randomValue > 50)
+{
+    await context.ScheduleTask(typeof(ActivityA), input);
+}
+
+// ✅ OR use a fixed seed
+var random = new Random(42); // Fixed seed
+if (random.Next(100) > 50)
+{
+    await context.ScheduleTask(typeof(ActivityA), input);
+}
+```
+
+### ❌ Don't Use GUIDs Directly
+
+```csharp
+// ❌ WRONG - Different GUID on replay
+var id = Guid.NewGuid().ToString();
+await context.ScheduleTask(typeof(ProcessActivity), id);
+
+// ✅ CORRECT - Use orchestration's NewGuid
+var id = context.NewGuid().ToString();
+await context.ScheduleTask(typeof(ProcessActivity), id);
+
+// ✅ Also correct - Get from activity
+var id = await context.ScheduleTask<string>(typeof(GenerateIdActivity), null);
+```
+
+### ❌ Don't Read Environment Variables
+
+```csharp
+// ❌ WRONG - May change between replays
+var endpoint = Environment.GetEnvironmentVariable("API_ENDPOINT");
+await context.ScheduleTask(typeof(CallApiActivity), endpoint);
+
+// ✅ CORRECT - Pass as input or read in activity
+// Option 1: Pass as orchestration input
+await context.ScheduleTask(typeof(CallApiActivity), input.ApiEndpoint);
+
+// Option 2: Read in activity
+await context.ScheduleTask(typeof(CallApiWithConfigActivity), input);
+```
+
+### ❌ Don't Make Network Calls
+
+```csharp
+// ❌ WRONG - Side effect, non-deterministic
+var response = await httpClient.GetAsync("https://api.example.com/data");
+var data = await response.Content.ReadAsStringAsync();
+
+// ✅ CORRECT - Use activity for network calls
+var data = await context.ScheduleTask<string>(typeof(FetchDataActivity), "https://api.example.com/data");
+```
+
+> [!NOTE]
+> Awaiting a non-durable task like `httpClient.GetAsync` may cause the orchestration to hang indefinitely.
+
+### ❌ Don't Access Databases
+
+```csharp
+// ❌ WRONG - Data may change between replays
+var user = await dbContext.Users.FindAsync(userId);
+
+// ✅ CORRECT - Use activity
+var user = await context.ScheduleTask(typeof(GetUserActivity), userId);
+```
+
+> [!NOTE]
+> Awaiting a non-durable task like `dbContext.Users.FindAsync` may cause the orchestration to hang indefinitely.
+
+### ❌ Don't Use Thread.Sleep
+
+```csharp
+// ❌ WRONG - Blocks thread, doesn't persist
+Thread.Sleep(TimeSpan.FromMinutes(5));
+await Task.Delay(TimeSpan.FromMinutes(5));
+
+// ✅ CORRECT - Use durable timer
+await context.CreateTimer(context.CurrentUtcDateTime.AddMinutes(5), true);
+```
+
+> [!NOTE]
+> Awaiting a non-durable task like `Task.Delay` may cause the orchestration to hang indefinitely.
+
+### ❌ Don't Use Mutable Static Variables
+
+```csharp
+// ❌ WRONG - State not preserved across replays
+static int counter = 0;
+counter++;
+if (counter > 5) { ... }
+
+// ✅ CORRECT - Use orchestration input/output for state
+public override async Task RunTask(OrchestrationContext context, int currentCount)
+{
+    currentCount++;
+    if (currentCount > 5) { ... }
+}
+```
+
+### ❌ Don't Use Non-Deterministic Collections
+
+```csharp
+// ❌ WRONG - HashSet and Dictionary iteration order is not guaranteed
+var items = new HashSet<string> { "a", "b", "c" };
+foreach (var item in items)
+{
+    await context.ScheduleTask(typeof(ProcessActivity), item);
+}
+
+// ✅ CORRECT - Use ordered collection
+var items = new List<string> { "a", "b", "c" };
+foreach (var item in items)
+{
+    await context.ScheduleTask(typeof(ProcessActivity), item);
+}
+```
+
+### ❌ Don't Use Task.Run or Threading APIs
+
+```csharp
+// ❌ WRONG - Background tasks are non-deterministic and may not complete before replay
+await Task.Run(() => ProcessData(input));
+
+// ❌ WRONG - Manual thread creation is non-deterministic
+var thread = new Thread(() => DoWork());
+thread.Start();
+
+// ❌ WRONG - ThreadPool work is non-deterministic
+ThreadPool.QueueUserWorkItem(_ => ProcessItem(input));
+
+// ✅ CORRECT - Use activities for background work
+var result = await context.ScheduleTask(typeof(ProcessDataActivity), input);
+
+// ✅ CORRECT - Use fan-out pattern for parallel work
+var tasks = input.Items.Select(item =>
+    context.ScheduleTask(typeof(ProcessItemActivity), item));
+var results = await Task.WhenAll(tasks);
+```
+
+> [!NOTE]
+> `Task.Run`, `ThreadPool.QueueUserWorkItem`, and manual thread creation introduce non-determinism because:
+>
+> - The work may complete at different times during replay
+> - Background threads don't participate in orchestration checkpointing
+> - Results are not captured in the orchestration history
+
+## What IS Safe
+
+### ✅ Local Computation
+
+```csharp
+// ✅ Safe - deterministic computation
+var sum = input.Values.Sum();
+var filtered = input.Items.Where(x => x.IsActive).ToList();
+var formatted = $"Order {input.OrderId}: {input.Description}";
+```
+
+### ✅ Using Context Properties and Methods
+
+```csharp
+// ✅ Safe - consistent across replays
+var instanceId = context.OrchestrationInstance.InstanceId;
+var currentTime = context.CurrentUtcDateTime;
+var newId = context.NewGuid();
+```
+
+### ✅ Conditional Logic Based on Durable Results
+
+```csharp
+// ✅ Safe - result comes from history during replay
+var status = await context.ScheduleTask<OrderStatus>(typeof(GetStatusActivity), orderId);
+if (status == OrderStatus.Approved)
+{
+    await context.ScheduleTask(typeof(ProcessOrderActivity), orderId);
+}
+```
+
+### ✅ Loops with Deterministic Bounds
+
+```csharp
+// ✅ Safe - loop bounds are deterministic
+for (int i = 0; i < input.Items.Count; i++)
+{
+    await context.ScheduleTask(typeof(ProcessItemActivity), input.Items[i]);
+}
+```
+
+### ✅ Parallel Execution
+
+```csharp
+// ✅ Safe - Task.WhenAll is deterministic
+var tasks = input.Items.Select(item =>
+    context.ScheduleTask(typeof(ProcessItemActivity), item));
+var results = await Task.WhenAll(tasks);
+```
+
+## Summary Table
+
+| Operation | Allowed in Orchestration? | Alternative |
+| --------- | ------------------------- | ----------- |
+| `DateTime.UtcNow` | ❌ No | `context.CurrentUtcDateTime` |
+| `Guid.NewGuid()` | ❌ No | `context.NewGuid()` |
+| `Random.Next()` | ❌ No | Get from activity |
+| `Thread.Sleep()` / `Task.Delay()` | ❌ No | `context.CreateTimer()` |
+| `Task.Run()` | ❌ No | Use activity or fan-out |
+| `ThreadPool.QueueUserWorkItem()` | ❌ No | Use activity |
+| Manual thread creation | ❌ No | Use activity |
+| HTTP calls | ❌ No | Use activity |
+| Database queries | ❌ No | Use activity |
+| File I/O | ❌ No | Use activity |
+| Environment variables | ⚠️ Avoid | Pass as input or read in activity |
+| Static mutable state | ❌ No | Use orchestration state |
+| `HashSet<T>` or `Dictionary<TKey, TValue>` iteration | ⚠️ Avoid | Use `List<T>` or sorted collection |
+| Local computation | ✅ Yes | - |
+| String manipulation | ✅ Yes | - |
+| LINQ queries (on local data) | ✅ Yes | - |
+
+## Detecting Non-Determinism
+
+### Runtime Detection
+
+Some non-deterministic issues cause runtime errors:
+
+```text
+NonDeterministicOrchestrationException: The orchestration 'MyOrchestration'
+has a non-deterministic replay detected. The history expected 'TaskScheduled'
+for 'ActivityA' but got 'TaskScheduled' for 'ActivityB'.
+```
+
+### Static Analysis
+
+Consider using analyzers or code reviews to catch issues:
+
+- Review all `DateTime`, `Guid`, `Random` usage
+- Search for HTTP client usage
+- Check for `Thread.Sleep` or `Task.Delay`
+- Check for `Task.Run`, `ThreadPool`, or `new Thread`
+
+## Next Steps
+
+- [Replay and Durability](replay-and-durability.md): Why determinism matters
+- [Versioning](../features/versioning.md): Safely updating orchestration code
+- [Error Handling](../features/error-handling.md): Handling failures deterministically
diff --git a/docs/concepts/orchestrations.md b/docs/concepts/orchestrations.md
new file mode 100644
index 000000000..148386e0a
--- /dev/null
+++ b/docs/concepts/orchestrations.md
@@ -0,0 +1,331 @@
+# Orchestrations
+
+Orchestrations are the core building blocks of the Durable Task Framework. They define durable, long-running workflows that coordinate activities, sub-orchestrations, timers, and external events.
+
+## Creating an Orchestration
+
+### Basic Structure
+
+Inherit from `TaskOrchestration<TResult, TInput>`:
+
+```csharp
+using DurableTask.Core;
+
+public class OrderProcessingOrchestration : TaskOrchestration<OrderResult, OrderInput>
+{
+    public override async Task<OrderResult> RunTask(
+        OrchestrationContext context,
+        OrderInput input)
+    {
+        // Orchestration logic here
+        return new OrderResult { Success = true };
+    }
+}
+```
+
+### Type Parameters
+
+- `TResult`: The return type of the orchestration
+- `TInput`: The input type passed when starting the orchestration
+
+### Registration
+
+Register orchestrations with the worker:
+
+```csharp
+var worker = new TaskHubWorker(service, loggerFactory);
+worker.AddTaskOrchestrations(typeof(OrderProcessingOrchestration));
+await worker.StartAsync();
+```
+
+## OrchestrationContext
+
+The `OrchestrationContext` provides APIs for scheduling durable operations:
+
+### Scheduling Activities
+
+```csharp
+// Schedule an activity and wait for result
+var result = await context.ScheduleTask(typeof(MyActivity), input);
+
+// Schedule with retry options
+var options = new RetryOptions(
+    firstRetryInterval: TimeSpan.FromSeconds(5),
+    maxNumberOfAttempts: 3);
+
+var result = await context.ScheduleWithRetry(
+    typeof(MyActivity),
+    options,
+    input);
+```
+
+### Creating Timers
+
+Timers allow orchestrations to wait for a specific time or duration. They are durable and survive process restarts.
+
+```csharp
+// Wait for a specific time
+await context.CreateTimer(context.CurrentUtcDateTime.AddHours(1), true);
+
+// Use for delays (not Thread.Sleep!)
+await context.CreateTimer(context.CurrentUtcDateTime.AddMinutes(5), true);
+```
+
+> [!IMPORTANT]
+>
+> - Never use `Thread.Sleep` for delays in orchestrations.
+> - Always use `context.CurrentUtcDateTime` for time calculations to ensure determinism.
+> - Timers are cancellable using `CancellationToken` and must be cancelled if no longer needed.
+
+### Waiting for External Events
+
+Orchestrations can pause and wait for external events sent from client code or other orchestrations.
+
+```csharp
+// Wait indefinitely for an event
+var approvalData = await context.WaitForExternalEvent("ApprovalReceived");
+
+// Wait with timeout
+using var cts = new CancellationTokenSource();
+var timerTask = context.CreateTimer(context.CurrentUtcDateTime.AddDays(1), true, cts.Token);
+var eventTask = context.WaitForExternalEvent("ApprovalReceived");
+
+var winner = await Task.WhenAny(timerTask, eventTask);
+if (winner == eventTask)
+{
+    // Timer cancelled since event was received (this is important)
+    cts.Cancel();
+    var approval = await eventTask;
+    // Process approval
+}
+else
+{
+    // Timeout - escalate or reject
+}
+```
+
+### Sub-Orchestrations
+
+```csharp
+// Start a sub-orchestration
+var subResult = await context.CreateSubOrchestrationInstance(
+    typeof(SubOrchestration),
+    subInput);
+
+// With custom instance ID
+var subResult = await context.CreateSubOrchestrationInstance(
+    typeof(SubOrchestration),
+    "sub-instance-123",
+    subInput);
+```
+
+### Continue As New
+
+```csharp
+// Restart orchestration with new input (eternal orchestrations)
+context.ContinueAsNew(newInput);
+return default; // Return value is ignored
+```
+
+## Orchestration Patterns
+
+### Sequential Execution
+
+```csharp
+public override async Task RunTask(OrchestrationContext context, string input)
+{
+    var step1 = await context.ScheduleTask(typeof(Step1Activity), input);
+    var step2 = await context.ScheduleTask(typeof(Step2Activity), step1);
+    var step3 = await context.ScheduleTask(typeof(Step3Activity), step2);
+    return step3;
+}
+```
+
+### Fan-Out/Fan-In (Parallel Execution)
+
+The fan-out/fan-in pattern allows multiple tasks to be executed in parallel, with the orchestration waiting for all to complete before proceeding.
+
+```csharp
+public override async Task RunTask(OrchestrationContext context, int[] inputs)
+{
+    // Fan-out: Start all tasks in parallel
+    var tasks = inputs.Select(i =>
+        context.ScheduleTask(typeof(ProcessItemActivity), i)).ToList();
+
+    // Fan-in: Wait for all to complete
+    var results = await Task.WhenAll(tasks);
+
+    return results;
+}
+```
+
+### Human Interaction
+
+```csharp
+public override async Task<ApprovalResult> RunTask(
+    OrchestrationContext context,
+    ApprovalRequest request)
+{
+    // Send notification to approver
+    await context.ScheduleTask(typeof(SendApprovalRequestActivity), request);
+
+    // Wait for approval with timeout
+    using var cts = new CancellationTokenSource();
+    var approvalTask = context.WaitForExternalEvent<bool>("Approved");
+    var timeoutTask = context.CreateTimer(
+        context.CurrentUtcDateTime.AddDays(7),
+        true,
+        cts.Token);
+
+    var winner = await Task.WhenAny(approvalTask, timeoutTask);
+
+    if (winner == approvalTask)
+    {
+        cts.Cancel();
+        return new ApprovalResult { Approved = await approvalTask };
+    }
+
+    return new ApprovalResult { Approved = false, TimedOut = true };
+}
+```
+
+### Monitor Pattern
+
+```csharp
+public override async Task<MonitorResult> RunTask(
+    OrchestrationContext context,
+    MonitorInput input)
+{
+    int pollingInterval = 30; // seconds
+    DateTime expiryTime = context.CurrentUtcDateTime.AddHours(2);
+
+    while (context.CurrentUtcDateTime < expiryTime)
+    {
+        var status = await context.ScheduleTask(
+            typeof(CheckJobStatusActivity),
+            input.JobId);
+
+        if (status.IsComplete)
+        {
+            return new MonitorResult { Completed = true, Status = status };
+        }
+
+        // Wait before polling again
+        await context.CreateTimer(
+            context.CurrentUtcDateTime.AddSeconds(pollingInterval),
+            true);
+
+        // Optional: exponential backoff
+        pollingInterval = Math.Min(pollingInterval * 2, 300);
+    }
+
+    return new MonitorResult { Completed = false, TimedOut = true };
+}
+```
+
+> [!IMPORTANT]
+>
+> - Long loops can lead to resource exhaustion. Use `ContinueAsNew` for very long-running monitors.
+> - Avoid tight polling loops; always include delays via `context.CreateTimer`.
+
+## Getting Orchestration Information
+
+### Current Instance ID
+
+```csharp
+string instanceId = context.OrchestrationInstance.InstanceId;
+```
+
+### Current Time
+
+Always use `context.CurrentUtcDateTime` instead of `DateTime.UtcNow`:
+
+```csharp
+// ✅ Correct - deterministic
+var now = context.CurrentUtcDateTime;
+
+// ❌ Wrong - non-deterministic
+var now = DateTime.UtcNow;
+```
+
+See [Deterministic Constraints](deterministic-constraints.md) for more details.
+
+### Replay Detection
+
+```csharp
+if (!context.IsReplaying)
+{
+    // Only runs during first execution, not during replay
+    _logger.LogInformation("Processing order {OrderId}", input.OrderId);
+}
+```
+
+See [Replay and Durability](replay-and-durability.md) for more details.
+
+## Starting Orchestrations
+
+### From Client Code
+
+```csharp
+var client = new TaskHubClient(service, loggerFactory: loggerFactory);
+
+// Start with auto-generated instance ID
+var instance = await client.CreateOrchestrationInstanceAsync(
+    typeof(OrderProcessingOrchestration),
+    new OrderInput { OrderId = "12345" });
+
+// Start with custom instance ID
+var instance = await client.CreateOrchestrationInstanceAsync(
+    typeof(OrderProcessingOrchestration),
+    instanceId: "order-12345",
+    input: new OrderInput { OrderId = "12345" });
+
+// Start at a scheduled time
+var instance = await client.CreateScheduledOrchestrationInstanceAsync(
+    typeof(OrderProcessingOrchestration),
+    instanceId: "scheduled-order",
+    input: new OrderInput { OrderId = "12345" },
+    startAt: DateTime.UtcNow.AddHours(1));
+```
+
+> [!NOTE]
+> Not all backends support scheduled orchestrations.
+
+### Waiting for Completion
+
+```csharp
+var result = await client.WaitForOrchestrationAsync(
+    instance,
+    timeout: TimeSpan.FromMinutes(5));
+
+if (result.OrchestrationStatus == OrchestrationStatus.Completed)
+{
+    var output = result.Output; // Serialized result
+}
+```
+
+## Error Handling
+
+See [Error Handling](../features/error-handling.md) for comprehensive error handling patterns.
+
+```csharp
+public override async Task RunTask(OrchestrationContext context, string input)
+{
+    try
+    {
+        return await context.ScheduleTask(typeof(RiskyActivity), input);
+    }
+    catch (TaskFailedException ex)
+    {
+        // Activity threw an exception
+        return await context.ScheduleTask(typeof(CompensationActivity), input);
+    }
+}
+```
+
+## Next Steps
+
+- [Activities](activities.md): Writing activity code
+- [Deterministic Constraints](deterministic-constraints.md): Important rules for orchestration code
+- [Replay and Durability](replay-and-durability.md): Understanding how orchestrations are replayed
+- [Features](../features/retries.md): Retries, timers, events, and more
diff --git a/docs/concepts/replay-and-durability.md b/docs/concepts/replay-and-durability.md
new file mode 100644
index 000000000..be95ba9d1
--- /dev/null
+++ b/docs/concepts/replay-and-durability.md
@@ -0,0 +1,249 @@
+# Replay and Durability
+
+The Durable Task Framework achieves durability through an **event-sourcing** pattern. Understanding how replay works is essential for writing correct orchestrations.
+
+## How Durability Works
+
+### The Problem
+
+Traditional workflows have a problem: if the process crashes, in-progress state is lost.
+
+```text
+Process starts → Workflow runs → CRASH → State lost ❌
+```
+
+### The Solution: Event Sourcing
+
+DTFx persists every decision as an event in the history:
+
+```text
+Orchestration executes → Event recorded → (Crash) → Replay from history → Continue ✅
+```
+
+## The Replay Model
+
+### First Execution
+
+When an orchestration runs for the first time:
+
+```csharp
+public override async Task<string> RunTask(OrchestrationContext context, string input)
+{
+    var a = await context.ScheduleTask<string>(typeof(ActivityA), input); // Executes, records TaskScheduled
+    var b = await context.ScheduleTask<string>(typeof(ActivityB), a);     // Executes, records TaskScheduled
+    return b;
+}
+```
+
+**History after first execution:**
+
+```text
+1. ExecutionStarted { Input: "hello" }
+2. TaskScheduled { Name: "ActivityA" }
+3. TaskCompleted { Result: "A-result" }
+4. TaskScheduled { Name: "ActivityB" }
+5. TaskCompleted { Result: "B-result" }
+6. ExecutionCompleted { Result: "B-result" }
+```
+
+### Replay After Crash
+
+If the process crashes and restarts, the orchestration **replays**:
+
+1. Framework loads the history from storage
+2. The orchestration's `RunTask` method executes again from the beginning
+3. Each `await` checks if there's already a result in history
+4. If result exists, return it immediately (no actual execution)
+5. If no result, schedule the work and wait
+
+```csharp
+// During replay:
+var a = await context.ScheduleTask<string>(typeof(ActivityA), input);
+// ↑ Sees TaskCompleted in history, returns "A-result" immediately
+
+var b = await context.ScheduleTask<string>(typeof(ActivityB), a);
+// ↑ Sees TaskCompleted in history, returns "B-result" immediately
+```
+
+### Partial Replay
+
+If an orchestration is waiting for an activity:
+
+```text
+1. ExecutionStarted { Input: "hello" }
+2. TaskScheduled { Name: "ActivityA" }
+3. TaskCompleted { Result: "A-result" }
+4. TaskScheduled { Name: "ActivityB" }
+← Activity B is still running
+```
+
+When Activity B completes, the orchestration replays:
+
+```csharp
+var a = await context.ScheduleTask<string>(typeof(ActivityA), input);
+// ↑ Returns "A-result" from history
+
+var b = await context.ScheduleTask<string>(typeof(ActivityB), a);
+// ↑ Finds new TaskCompleted event, returns result
+
+return b; // Orchestration completes
+```
+
+## Checkpointing
+
+### When Checkpoints Occur
+
+The orchestration state is checkpointed (saved) when:
+
+1. An `await` yields control back to the framework
+2. The orchestration completes or fails
+3. `ContinueAsNew` is called and the current execution ends
+
+### What Gets Saved
+
+- Complete event history
+- Custom status (if set)
+- Input and output (if any)
+
+### What Doesn't Get Saved
+
+- Local variables (they're rebuilt during replay)
+- In-memory state outside the orchestration
+
+## Understanding Context.IsReplaying
+
+The `IsReplaying` property tells you if the orchestration is replaying:
+
+```csharp
+public override async Task RunTask(OrchestrationContext context, string input)
+{
+    // This code runs during EVERY replay
+    var greeting = $"Hello, {input}";
+
+    if (!context.IsReplaying)
+    {
+        // This only runs during the FIRST execution of this code path
+        _logger.LogInformation("Processing input: {Input}", input);
+    }
+
+    var result = await context.ScheduleTask(typeof(MyActivity), greeting);
+
+    return result;
+}
+```
+
+### When to Use IsReplaying
+
+| Use Case | Use IsReplaying? |
+| -------- | ---------------- |
+| Logging | ✅ Yes - avoid duplicate logs |
+| Metrics | ✅ Yes - avoid double-counting |
+| Business logic | ❌ No - should work identically during replay |
+| Side effects | ❌ No - use activities instead |
+
+## Why Determinism Matters
+
+Because orchestrations replay, they **must** produce the same sequence of events every time:
+
+### Example: Non-Deterministic Code (BAD)
+
+```csharp
+// ❌ WRONG - Different result on each replay
+public override async Task RunTask(OrchestrationContext context, string input)
+{
+    if (DateTime.UtcNow.Hour < 12) // Different on replay!
+    {
+        return await context.ScheduleTask(typeof(MorningActivity), input);
+    }
+    return await context.ScheduleTask(typeof(EveningActivity), input);
+}
+```
+
+If the orchestration starts at 11:55 AM and replays at 12:05 PM, it will try to match `EveningActivity` against a history containing `MorningActivity` → **crash**.
+
+### Example: Deterministic Code (GOOD)
+
+```csharp
+// ✅ CORRECT - Same result on every replay
+public override async Task RunTask(OrchestrationContext context, string input)
+{
+    if (context.CurrentUtcDateTime.Hour < 12) // Same value during replay!
+    {
+        return await context.ScheduleTask(typeof(MorningActivity), input);
+    }
+    return await context.ScheduleTask(typeof(EveningActivity), input);
+}
+```
+
+## History Events
+
+Common events in the orchestration history:
+
+| Event | Description |
+| ----- | ----------- |
+| `ExecutionStarted` | Orchestration started |
+| `TaskScheduled` | Activity was scheduled |
+| `TaskCompleted` | Activity completed successfully |
+| `TaskFailed` | Activity failed |
+| `SubOrchestrationInstanceCreated` | Sub-orchestration started |
+| `SubOrchestrationInstanceCompleted` | Sub-orchestration completed |
+| `TimerCreated` | Timer was created |
+| `TimerFired` | Timer elapsed |
+| `EventRaised` | External event received |
+| `ExecutionCompleted` | Orchestration completed |
+| `ExecutionFailed` | Orchestration failed |
+| `ExecutionTerminated` | Orchestration was terminated |
+| `ContinueAsNew` | Orchestration restarted |
+
+## Viewing History
+
+### Via Client
+
+```csharp
+var history = await client.GetOrchestrationHistoryAsync(instance);
+foreach (var evt in history)
+{
+    Console.WriteLine($"{evt.EventType}: {evt.Timestamp}");
+}
+```
+
+### What History Tells You
+
+- Exact sequence of operations
+- Timing of each step
+- Input/output of each activity
+- Where failures occurred
+
+## Performance Implications
+
+### History Growth
+
+Every operation adds to the history. Large histories can impact:
+
+- **Memory**: Full history is loaded into memory during replay
+- **Latency**: More events = longer replay time
+- **Storage**: More data to persist and transfer (the exact impact depends on the storage provider)
+
+### Mitigation Strategies
+
+1. **Use `ContinueAsNew`** for long-running orchestrations:
+
+   ```csharp
+   if (context.CurrentUtcDateTime > startTime.AddHours(24))
+   {
+       context.ContinueAsNew(newState); // Reset history
+       return default;
+   }
+   ```
+
+2. **Batch operations** in activities instead of many small activities
+
+3. **Sub-orchestrations** for logical groupings (separate history)
+
+4. **Purge completed instances** periodically
+
+## Next Steps
+
+- [Deterministic Constraints](deterministic-constraints.md): Rules for writing deterministic code
+- [Eternal Orchestrations](../features/eternal-orchestrations.md): Managing long-running workflows
+- [Versioning](../features/versioning.md): Updating orchestration code safely
diff --git a/docs/features/README.md b/docs/features/README.md
new file mode 100644
index 000000000..d92f2125d
--- /dev/null
+++ b/docs/features/README.md
@@ -0,0 +1,15 @@
+# Features
+
+This section covers the built-in features and patterns available in the Durable Task Framework.
+
+## Topics
+
+| Feature | Description |
+| ------- | ----------- |
+| [Retries](retries.md) | Automatic retry policies for activities and sub-orchestrations |
+| [Timers](timers.md) | Durable delays and scheduling with `CreateTimer` |
+| [External Events](external-events.md) | Receiving data from outside sources (webhooks, human interaction) |
+| [Sub-Orchestrations](sub-orchestrations.md) | Breaking workflows into smaller, reusable pieces |
+| [Error Handling](error-handling.md) | Exception handling, compensation, and recovery patterns |
+| [Eternal Orchestrations](eternal-orchestrations.md) | Long-running workflows with `ContinueAsNew` |
+| [Versioning](versioning.md) | Strategies for updating orchestrations safely |
diff --git a/docs/features/error-handling.md b/docs/features/error-handling.md
new file mode 100644
index 000000000..fe7cf54d6
--- /dev/null
+++ b/docs/features/error-handling.md
@@ -0,0 +1,515 @@
+# Error Handling
+
+The Durable Task Framework provides robust error handling capabilities for orchestrations and activities. This guide covers exception handling, compensation, and recovery patterns.
+ +## Activity Exceptions + +### Basic Exception Handling + +When an activity throws an exception, it becomes a `TaskFailedException` in the orchestration: + +```csharp +public override async Task RunTask(OrchestrationContext context, Input input) +{ + try + { + var result = await context.ScheduleTask(typeof(RiskyActivity), input); + return new Result { Success = true, Data = result }; + } + catch (TaskFailedException ex) + { + // ex.InnerException contains the original exception + return new Result { Success = false, Error = ex.InnerException?.Message }; + } +} +``` + +### Exception Details + +```csharp +catch (TaskFailedException ex) +{ + var originalException = ex.InnerException; + var activityName = ex.Name; // "RiskyActivity" + var scheduledEventId = ex.ScheduleId; // Event ID in history + + _logger.LogError(originalException, + "Activity {Activity} failed", activityName); +} +``` + +## Error Propagation Modes + +The `TaskHubWorker.ErrorPropagationMode` property controls how exception information is propagated from failed activities and sub-orchestrations. 
+ +### SerializeExceptions (Default) + +The default mode serializes the original exception and makes it available via `InnerException`: + +```csharp +worker.ErrorPropagationMode = ErrorPropagationMode.SerializeExceptions; + +// In orchestration: +catch (TaskFailedException ex) +{ + // Original exception is deserialized and available + var originalException = ex.InnerException; + + if (originalException is InvalidOperationException invalidOp) + { + // Can catch specific exception types + } +} +``` + +**Limitations:** + +- Not all exception types can be serialized/deserialized correctly +- Custom exceptions may lose data if not properly serializable +- Doesn't work across language boundaries (e.g., polyglot scenarios) + +### UseFailureDetails (Recommended) + +The `UseFailureDetails` mode provides consistent, structured error information via `FailureDetails`: + +```csharp +worker.ErrorPropagationMode = ErrorPropagationMode.UseFailureDetails; +``` + +With this mode: + +- `InnerException` is **always null** +- Error details are available via the `FailureDetails` property +- Works consistently across all exception types and language runtimes + +```csharp +catch (TaskFailedException ex) +{ + // InnerException is null in UseFailureDetails mode + // Use FailureDetails instead + FailureDetails details = ex.FailureDetails; + + string errorType = details.ErrorType; // e.g., "System.InvalidOperationException" + string errorMessage = details.ErrorMessage; // The exception message + string stackTrace = details.StackTrace; // Full stack trace + bool isNonRetriable = details.IsNonRetriable; + + // Check for inner failures (nested exceptions) + FailureDetails innerFailure = details.InnerFailure; +} +``` + +#### Checking Exception Types + +Use `IsCausedBy<T>()` to check exception types without deserializing: + +```csharp +catch (TaskFailedException ex) when (ex.FailureDetails?.IsCausedBy<InvalidOperationException>() == true) +{ + // Handle InvalidOperationException +} +catch (TaskFailedException ex) when 
(ex.FailureDetails?.IsCausedBy<TimeoutException>() == true) +{ + // Handle TimeoutException +} +``` + +#### Sub-Orchestration Failures + +The same pattern applies to `SubOrchestrationFailedException`: + +```csharp +try +{ + await context.CreateSubOrchestrationInstance( + typeof(ChildOrchestration), + input); +} +catch (SubOrchestrationFailedException ex) +{ + FailureDetails details = ex.FailureDetails; + + _logger.LogError( + "Child orchestration failed: {ErrorType}: {Message}", + details.ErrorType, + details.ErrorMessage); +} +``` + +#### When to Use UseFailureDetails + +Use `UseFailureDetails` when: + +- You need consistent error handling across all exception types +- Running orchestrations/activities out-of-process or in other language runtimes +- Custom exceptions may not serialize correctly +- You want to avoid deserialization issues with `InnerException` + +> [!WARNING] +> Changing `ErrorPropagationMode` on an existing deployment can break in-flight orchestrations if they contain exception handling logic that depends on `InnerException`. Plan changes carefully and consider using [versioning strategies](versioning.md). + +#### Custom Exception Properties + +When using `UseFailureDetails`, you can include custom properties from your exceptions in the `FailureDetails.Properties` dictionary by implementing `IExceptionPropertiesProvider`: + +```csharp +public class CustomExceptionPropertiesProvider : IExceptionPropertiesProvider +{ + public IDictionary? 
GetExceptionProperties(Exception exception) + { + // Extract custom properties from known exception types + if (exception is OrderProcessingException orderEx) + { + return new Dictionary + { + ["OrderId"] = orderEx.OrderId, + ["FailureStage"] = orderEx.Stage, + ["RetryCount"] = orderEx.RetryCount + }; + } + + if (exception is ValidationException validationEx) + { + return new Dictionary + { + ["FieldName"] = validationEx.FieldName, + ["ValidationRule"] = validationEx.Rule + }; + } + + // Return null for exceptions without custom properties + return null; + } +} +``` + +Register the provider with the `TaskHubWorker`: + +```csharp +var worker = new TaskHubWorker(orchestrationService, loggerFactory); +worker.ErrorPropagationMode = ErrorPropagationMode.UseFailureDetails; +worker.ExceptionPropertiesProvider = new CustomExceptionPropertiesProvider(); +``` + +Access the custom properties in your orchestration's error handling: + +```csharp +catch (TaskFailedException ex) +{ + FailureDetails details = ex.FailureDetails; + + if (details.Properties != null) + { + if (details.Properties.TryGetValue("OrderId", out var orderId)) + { + _logger.LogError("Order {OrderId} failed: {Message}", orderId, details.ErrorMessage); + } + + if (details.Properties.TryGetValue("RetryCount", out var retryCount) && + retryCount is int count && count >= 3) + { + // Too many retries, escalate + await context.ScheduleTask(typeof(EscalateFailureActivity), details); + } + } +} +``` + +> [!NOTE] +> Property values should be simple, serializable types (strings, numbers, booleans). Complex objects may not serialize correctly across process boundaries. 
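The note above can be made concrete with a small helper that flattens property values into simple types before they go into the dictionary. This is only a sketch: `PropertySanitizer`, `ToSimpleValue`, and `Sanitize` are hypothetical names that are not part of DTFx, and the choice of which types count as "simple" (with `ToString()` as the fallback for everything else) is an assumption made for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of DTFx: keeps simple values unchanged and
// converts anything else to text so the dictionary serializes predictably.
public static class PropertySanitizer
{
    public static object? ToSimpleValue(object? value)
    {
        switch (value)
        {
            case null:
            case string _:
            case bool _:
            case int _:
            case long _:
            case double _:
            case decimal _:
                return value; // already a simple type
            case DateTime timestamp:
                return timestamp.ToString("o"); // ISO 8601 round-trip format
            default:
                return value.ToString(); // assumption: text is an acceptable fallback
        }
    }

    public static Dictionary<string, object?> Sanitize(IDictionary<string, object?> properties)
    {
        var result = new Dictionary<string, object?>();
        foreach (KeyValuePair<string, object?> pair in properties)
        {
            result[pair.Key] = ToSimpleValue(pair.Value);
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var raw = new Dictionary<string, object?>
        {
            ["OrderId"] = "order-12345",
            ["RetryCount"] = 3,
            ["CorrelationId"] = Guid.NewGuid(), // not a simple type; becomes a string
        };

        foreach (var pair in PropertySanitizer.Sanitize(raw))
        {
            Console.WriteLine($"{pair.Key}: {pair.Value} ({pair.Value?.GetType().Name})");
        }
    }
}
```

A provider's `GetExceptionProperties` implementation could pass its dictionary through a helper like `Sanitize` before returning it, so that values such as GUIDs or timestamps arrive as plain strings on the orchestration side.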
+ +### Handling Specific Exception Types + +```csharp +try +{ + await context.ScheduleTask(typeof(PaymentActivity), payment); +} +catch (TaskFailedException ex) when (ex.InnerException is InsufficientFundsException) +{ + // Handle specific business error + await context.ScheduleTask(typeof(NotifyCustomerActivity), + "Payment failed: Insufficient funds"); +} +catch (TaskFailedException ex) when (ex.InnerException is PaymentGatewayException) +{ + // Retry with different gateway + await context.ScheduleTask(typeof(BackupPaymentActivity), payment); +} +catch (TaskFailedException) +{ + // Handle all other failures + throw; +} +``` + +## Automatic Retries + +Use `ScheduleWithRetry` for transient failures: + +```csharp +var retryOptions = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(5), + maxNumberOfAttempts: 3) +{ + BackoffCoefficient = 2.0, + Handle = ex => ex is TimeoutException || ex is HttpRequestException +}; + +try +{ + await context.ScheduleWithRetry( + typeof(UnreliableActivity), + retryOptions, + input); +} +catch (TaskFailedException ex) +{ + // All retries exhausted +} +``` + +See [Retries](retries.md) for detailed retry configuration. 
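To get a feel for how `firstRetryInterval`, `maxNumberOfAttempts`, and `BackoffCoefficient` interact, the sketch below computes the waits between attempts for the options shown above. It assumes that the delay before retry *n* is `firstRetryInterval × BackoffCoefficient^(n-1)`, that `maxNumberOfAttempts` counts the initial attempt, and that no maximum-interval cap applies. These are assumptions for illustration rather than a statement of the framework's exact rules, and `RetrySchedule` is a hypothetical helper, not a DTFx type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of DTFx.
public static class RetrySchedule
{
    // Returns the assumed wait before each retry (attempts 2..maxNumberOfAttempts).
    public static List<TimeSpan> Delays(
        TimeSpan firstRetryInterval,
        int maxNumberOfAttempts,
        double backoffCoefficient)
    {
        var delays = new List<TimeSpan>();
        for (int retry = 1; retry < maxNumberOfAttempts; retry++)
        {
            // Assumed formula: first interval scaled by coefficient^(retry - 1)
            double milliseconds =
                firstRetryInterval.TotalMilliseconds * Math.Pow(backoffCoefficient, retry - 1);
            delays.Add(TimeSpan.FromMilliseconds(milliseconds));
        }
        return delays;
    }
}

public static class Program
{
    public static void Main()
    {
        // Same values as the RetryOptions example: 5 s first interval, 3 attempts, coefficient 2.0
        foreach (TimeSpan delay in RetrySchedule.Delays(TimeSpan.FromSeconds(5), 3, 2.0))
        {
            Console.WriteLine(delay);
        }
    }
}
```

Under these assumptions, three attempts with a coefficient of 2.0 produce two waits, of 5 and 10 seconds, before the final `TaskFailedException` reaches the orchestration.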
+ +## Sub-Orchestration Exceptions + +```csharp +try +{ + await context.CreateSubOrchestrationInstance( + typeof(ChildOrchestration), + input); +} +catch (SubOrchestrationFailedException ex) +{ + // Child orchestration threw an unhandled exception + var failureReason = ex.InnerException?.Message; +} +``` + +## Compensation Patterns + +### Saga Pattern + +Compensate previous steps when a later step fails: + +```csharp +public override async Task RunTask( + OrchestrationContext context, + OrderInput input) +{ + // Track completed steps for compensation + var completedSteps = new List(); + + try + { + // Step 1: Reserve inventory + await context.ScheduleTask(typeof(ReserveInventoryActivity), input); + completedSteps.Add("inventory"); + + // Step 2: Charge payment + await context.ScheduleTask(typeof(ChargePaymentActivity), input); + completedSteps.Add("payment"); + + // Step 3: Ship order (might fail) + await context.ScheduleTask(typeof(ShipOrderActivity), input); + + return new OrderResult { Success = true }; + } + catch (TaskFailedException ex) + { + // Compensate in reverse order + if (completedSteps.Contains("payment")) + { + await context.ScheduleTask(typeof(RefundPaymentActivity), input); + } + + if (completedSteps.Contains("inventory")) + { + await context.ScheduleTask(typeof(ReleaseInventoryActivity), input); + } + + return new OrderResult + { + Success = false, + Error = ex.InnerException?.Message + }; + } +} +``` + +### Compensation with Sub-Orchestrations + +```csharp +public override async Task RunTask(OrchestrationContext context, Input input) +{ + try + { + await context.CreateSubOrchestrationInstance( + typeof(ProcessOrderOrchestration), + input); + } + catch (SubOrchestrationFailedException) + { + // Run compensation orchestration + await context.CreateSubOrchestrationInstance( + typeof(CompensateOrderOrchestration), + new CompensationInput { OriginalInput = input }); + } +} +``` + +## Error Result Pattern + +Return errors as results instead of 
throwing: + +```csharp +// Activity returns result with error info +public class ProcessOrderActivity : AsyncTaskActivity +{ + protected override async Task ExecuteAsync( + TaskContext context, + Order order) + { + if (!await ValidateOrderAsync(order)) + { + // Return error instead of throwing + return new OrderResult + { + Success = false, + ErrorCode = "VALIDATION_FAILED", + ErrorMessage = "Order validation failed" + }; + } + + // Process order... + return new OrderResult { Success = true, OrderId = newOrderId }; + } +} + +// Orchestration checks result +public override async Task RunTask(OrchestrationContext context, Order input) +{ + var result = await context.ScheduleTask(typeof(ProcessOrderActivity), input); + + if (!result.Success) + { + // Handle error without exception + await context.ScheduleTask(typeof(NotifyErrorActivity), result); + return new Result { Success = false, Error = result.ErrorMessage }; + } + + return new Result { Success = true, OrderId = result.OrderId }; +} +``` + +## Timeout Handling + +### Activity Timeout + +Activities don't have built-in timeout. 
Handle in orchestration: + +```csharp +public override async Task RunTask(OrchestrationContext context, Input input) +{ + using var cts = new CancellationTokenSource(); + + var activityTask = context.ScheduleTask(typeof(LongRunningActivity), input); + var timeoutTask = context.CreateTimer( + context.CurrentUtcDateTime.AddMinutes(30), + true, + cts.Token); + + var winner = await Task.WhenAny(activityTask, timeoutTask); + + if (winner == activityTask) + { + cts.Cancel(); + return new Result { Success = true, Data = await activityTask }; + } + else + { + // Activity is still running but we've timed out + // Note: Activity will complete eventually, but result is ignored + return new Result { Success = false, TimedOut = true }; + } +} +``` + +### Orchestration-Level Timeout + +```csharp +public override async Task RunTask(OrchestrationContext context, Input input) +{ + var deadline = context.CurrentUtcDateTime.AddHours(4); + + while (context.CurrentUtcDateTime < deadline) + { + var status = await context.ScheduleTask(typeof(CheckStatusActivity), input); + + if (status.IsComplete) + return new Result { Success = true }; + + await context.CreateTimer(context.CurrentUtcDateTime.AddMinutes(5), true); + } + + return new Result { Success = false, TimedOut = true }; +} +``` + +## Circuit Breaker Pattern + +Prevent repeated failures: + +```csharp +public override async Task RunTask(OrchestrationContext context, State state) +{ + state ??= new State(); + + // Circuit breaker check + if (state.ConsecutiveFailures >= 5) + { + var cooldownEnd = state.LastFailure.AddMinutes(15); + if (context.CurrentUtcDateTime < cooldownEnd) + { + // Circuit is open - wait before retry + await context.CreateTimer(cooldownEnd, true); + } + state.ConsecutiveFailures = 0; // Reset after cooldown + } + + try + { + var result = await context.ScheduleTask(typeof(ExternalServiceActivity), state.Input); + state.ConsecutiveFailures = 0; + return new Result { Success = true, Data = result }; + } + catch 
(TaskFailedException) + { + state.ConsecutiveFailures++; + state.LastFailure = context.CurrentUtcDateTime; + + // Continue to retry with backoff + context.ContinueAsNew(state); + return null; + } +} +``` + +## Best Practices Summary + +| Practice | Description | +| -------- | ----------- | +| **Use `UseFailureDetails` mode** | Prefer `ErrorPropagationMode.UseFailureDetails` for consistent error handling | +| **Use retries for transient failures** | Configure `RetryOptions` for HTTP, timeout errors | +| **Return errors for expected failures** | Use result types instead of exceptions for business errors | +| **Implement compensation** | Use Saga pattern for multi-step transactions | +| **Set timeouts** | Don't let orchestrations wait indefinitely | +| **Log with context** | Include instance ID, activity name, error details | +| **Test failure scenarios** | Verify compensation and recovery logic | + +## Next Steps + +- [Retries](retries.md) β€” Configuring automatic retries +- [Replay and Durability](../concepts/replay-and-durability.md) β€” Understanding exception persistence +- [Testing](../advanced/testing.md) β€” Testing error handling diff --git a/docs/features/eternal-orchestrations.md b/docs/features/eternal-orchestrations.md new file mode 100644 index 000000000..5e2747ade --- /dev/null +++ b/docs/features/eternal-orchestrations.md @@ -0,0 +1,364 @@ +# Eternal Orchestrations + +Eternal orchestrations are long-running workflows that run indefinitely by periodically restarting themselves. This pattern is useful for monitoring, scheduling, and other recurring tasks. 
+ +## The ContinueAsNew Pattern + +### Basic Eternal Orchestration + +```csharp +public class MonitorOrchestration : TaskOrchestration +{ + public override async Task RunTask( + OrchestrationContext context, + MonitorInput input) + { + // Do the monitoring work + await context.ScheduleTask(typeof(CheckHealthActivity), input.Target); + + // Wait for next interval + await context.CreateTimer( + context.CurrentUtcDateTime.AddMinutes(input.IntervalMinutes), + true); + + // Restart with fresh history + context.ContinueAsNew(input); + + return null; // Has no effect since ContinueAsNew was called + } +} +``` + +### Why ContinueAsNew? + +Without `ContinueAsNew`, orchestration history grows without bound: + +```text +// After 1000 iterations without ContinueAsNew: +History size: 10,000+ events +Memory usage: High +Replay time: Slow +``` + +An orchestration with an unbounded history can lead to severe performance degradation and process crashes caused by `OutOfMemoryException`. + +With `ContinueAsNew`: + +```text +// After 1000 iterations with ContinueAsNew: +History size: ~10 events (reset each iteration) +Memory usage: Low +Replay time: Fast +``` + +## ContinueAsNew Behavior + +### What Happens + +1. `ContinueAsNew(newInput)` is called +2. Current execution completes when `RunTask` returns +3. New execution starts with: + - Same instance ID + - Fresh (empty) history + - New input provided to `ContinueAsNew` + +### Status Transitions + +```text +Running → ContinuedAsNew → Running (new execution) +``` + +### History Reset + +Old history is usually **replaced**, not appended. The previous execution's history can optionally be retained for auditing (provider-dependent). 
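A rough back-of-the-envelope model shows why the reset matters. Assuming each iteration appends a fixed number of events (10 here, in line with the illustrative figures above; real counts depend on the orchestration code and the provider), history without `ContinueAsNew` grows linearly with the iteration count, while with `ContinueAsNew` it stays at roughly one iteration's worth. `HistoryModel` is a hypothetical name used only for this sketch.

```csharp
using System;

// Toy model with hypothetical names; real event counts depend on the
// orchestration code and the storage provider.
public static class HistoryModel
{
    // Without ContinueAsNew, every iteration's events stay in the history.
    public static long WithoutContinueAsNew(int iterations, int eventsPerIteration)
        => (long)iterations * eventsPerIteration;

    // With ContinueAsNew, the history restarts each iteration, so only the
    // current iteration's events are present.
    public static long WithContinueAsNew(int iterations, int eventsPerIteration)
        => iterations > 0 ? eventsPerIteration : 0;
}

public static class Program
{
    public static void Main()
    {
        const int iterations = 1000;
        const int eventsPerIteration = 10; // assumed, for illustration only

        Console.WriteLine(HistoryModel.WithoutContinueAsNew(iterations, eventsPerIteration));
        Console.WriteLine(HistoryModel.WithContinueAsNew(iterations, eventsPerIteration));
    }
}
```

Since the whole history is loaded and replayed on each execution, the first number is the one that drives memory use and replay time.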
+ +## Common Patterns + +### Periodic Monitoring + +```csharp +public override async Task RunTask( + OrchestrationContext context, + MonitorConfig config) +{ + // Check system health + var health = await context.ScheduleTask( + typeof(CheckHealthActivity), + config.Endpoint); + + // Alert if unhealthy + if (!health.IsHealthy) + { + await context.ScheduleTask( + typeof(SendAlertActivity), + new Alert { Endpoint = config.Endpoint, Status = health }); + } + + // Wait before next check + await context.CreateTimer( + context.CurrentUtcDateTime.AddMinutes(config.CheckIntervalMinutes), + true); + + // Continue forever + context.ContinueAsNew(config); + return null; +} +``` + +### Job Queue Processor + +```csharp +public override async Task RunTask( + OrchestrationContext context, + QueueConfig config) +{ + // Get next batch of jobs + var jobs = await context.ScheduleTask>( + typeof(GetPendingJobsActivity), + new GetJobsInput { MaxCount = config.BatchSize }); + + if (jobs.Any()) + { + // Process jobs in parallel + var tasks = jobs.Select(job => + context.ScheduleTask(typeof(ProcessJobActivity), job)); + await Task.WhenAll(tasks); + } + + // Short delay if no jobs, to avoid busy-waiting + var delay = jobs.Any() + ? 
TimeSpan.FromSeconds(1) + : TimeSpan.FromSeconds(30); + + await context.CreateTimer(context.CurrentUtcDateTime.Add(delay), true); + + context.ContinueAsNew(config); + return null; +} +``` + +### Cron Scheduler + +```csharp +public override async Task RunTask( + OrchestrationContext context, + CronSchedule schedule) +{ + // Calculate next run time + var nextRun = GetNextCronTime(schedule.CronExpression, context.CurrentUtcDateTime); + + // Wait until scheduled time + await context.CreateTimer(nextRun, true); + + // Execute the scheduled task + await context.ScheduleTask(typeof(ScheduledTaskActivity), schedule.TaskInput); + + // Continue to next scheduled run + context.ContinueAsNew(schedule); + return null; +} +``` + +### Stateful Aggregator + +```csharp +public class AggregatorOrchestration : TaskOrchestration +{ + public override async Task RunTask( + OrchestrationContext context, + AggregatorState state) + { + // Initialize state on first run + state ??= new AggregatorState { Count = 0, Total = 0 }; + + // Wait for data event or periodic save + // Note: WaitForExternalEvent is not built into DTFx; treat it as a helper + // built on the OnEvent pattern described in external-events.md + using var cts = new CancellationTokenSource(); + var eventTask = context.WaitForExternalEvent("NewData"); + var saveTask = context.CreateTimer( + context.CurrentUtcDateTime.AddMinutes(5), + true, + cts.Token); + + var winner = await Task.WhenAny(eventTask, saveTask); + cts.Cancel(); + + if (winner == eventTask) + { + // Update aggregations + var data = await eventTask; + state.Count++; + state.Total += data.Value; + state.LastUpdated = context.CurrentUtcDateTime; + } + else + { + // Periodic save + if (state.Count > 0) + { + await context.ScheduleTask( + typeof(SaveAggregationActivity), + state); + } + } + + // Check for termination signal + if (state.ShouldTerminate) + { + return state; // Actually return and complete + } + + // Continue with updated state + context.ContinueAsNew(state); + return null; + } +} +``` + +### With Maximum Iterations + +```csharp +public override async Task RunTask( + OrchestrationContext 
context, + ProcessingState state) +{ + state.Iteration++; + + // Do work + var result = await context.ScheduleTask( + typeof(ProcessBatchActivity), + state.CurrentBatch); + + // Check completion conditions + if (state.Iteration >= state.MaxIterations) + { + return new ProcessingResult + { + Completed = true, + Iterations = state.Iteration + }; + } + + if (!state.HasMoreWork) + { + return new ProcessingResult + { + Completed = true, + Iterations = state.Iteration + }; + } + + // Wait and continue + await context.CreateTimer( + context.CurrentUtcDateTime.AddSeconds(state.DelaySeconds), + true); + + context.ContinueAsNew(state); + return null; // No effect due to ContinueAsNew +} +``` + +## Graceful Termination + +### Using External Events + +```csharp +public override async Task RunTask( + OrchestrationContext context, + Config config) +{ + using var cts = new CancellationTokenSource(); + + // Check for stop signal + // Note: WaitForExternalEvent is not built into DTFx; treat it as a helper + // built on the OnEvent pattern described in external-events.md + Task stopTask = context.WaitForExternalEvent("Stop"); + Task workTask = DoWorkAsync(context, config); + Task timerTask = context.CreateTimer( + context.CurrentUtcDateTime.AddMinutes(1), + true, + cts.Token); + + Task winner = await Task.WhenAny(stopTask, workTask, timerTask); + cts.Cancel(); + + if (winner == stopTask) + { + // Graceful shutdown + return new Result { StoppedGracefully = true }; + } + + context.ContinueAsNew(config); + return null; +} +``` + +## Best Practices + +### 1. Be Careful with Tight Loops + +Immediate restarts via `ContinueAsNew` can be useful when processing batches of external events to minimize latency. 
However, be careful to avoid tight loops that do no meaningful work: + +```csharp +// ✅ OK - immediate restart when processing a batch of work +if (pendingItems.Any()) +{ + await ProcessBatchAsync(context, pendingItems); + context.ContinueAsNew(state); // Restart immediately to check for more + return null; +} + +// ✅ Good - add delay when no work to do +await context.CreateTimer(context.CurrentUtcDateTime.AddSeconds(30), true); +context.ContinueAsNew(state); + +// ⚠️ Risky - tight loop with no work and no delay +var items = await context.ScheduleTask>(typeof(GetItemsActivity), null); +if (!items.Any()) +{ + context.ContinueAsNew(state); // Immediately restarts even with no work! + return null; +} +``` + +### 2. Carry Forward Essential State + +```csharp +// ✅ Good - preserves necessary context +context.ContinueAsNew(new State +{ + TotalProcessed = state.TotalProcessed + batchSize, + LastCheckpoint = context.CurrentUtcDateTime, + Config = state.Config +}); + +// ⚠️ Careful - losing important state +context.ContinueAsNew(state.Config); // Lost TotalProcessed +``` + +### 3. Provide Termination Mechanism + +```csharp +// ✅ Good - can be stopped gracefully +if (config.StopRequested || iterationCount > maxIterations) +{ + return finalResult; +} +context.ContinueAsNew(config); +``` + +### 4. Monitor History Size + +If `ContinueAsNew` isn't called frequently enough, history can still grow. 
Consider continuing after a fixed number of operations: + +```csharp +if (state.OperationsSinceRestart > 100) +{ + state.OperationsSinceRestart = 0; + context.ContinueAsNew(state); + return null; +} +``` + +## Next Steps + +- [Timers](timers.md) - Creating durable delays +- [External Events](external-events.md) - Signaling eternal orchestrations +- [Replay and Durability](../concepts/replay-and-durability.md) - Understanding history growth diff --git a/docs/features/external-events.md b/docs/features/external-events.md new file mode 100644 index 000000000..7703f4492 --- /dev/null +++ b/docs/features/external-events.md @@ -0,0 +1,518 @@ +# External Events + +External events allow orchestrations to receive data from outside sources. This enables human interaction patterns, webhooks, and inter-orchestration communication. + +## The Event Pattern + +DTFx uses the `OnEvent` method override combined with `TaskCompletionSource` to handle external events. This pattern provides full control over event handling and is the standard approach in the framework. + +> [!NOTE] +> If you're familiar with [Azure Durable Functions](https://learn.microsoft.com/azure/azure-functions/durable/), note that DTFx does not have a built-in `WaitForExternalEvent()` helper method. Instead, DTFx provides the lower-level `OnEvent` pattern shown below, which Durable Functions builds upon. + +This pattern: + +1. Creates a `TaskCompletionSource` to represent the pending event +2. Awaits the `TaskCompletionSource.Task` in `RunTask` +3. 
Overrides `OnEvent` to receive events and complete the task + +### Basic Event Wait + +```csharp +public class SignalOrchestration : TaskOrchestration +{ + TaskCompletionSource resumeHandle; + + public override async Task RunTask(OrchestrationContext context, string input) + { + // Wait for external signal + string user = await WaitForSignal(); + + // Continue with the workflow + string greeting = await context.ScheduleTask(typeof(SendGreetingTask), user); + return greeting; + } + + async Task WaitForSignal() + { + this.resumeHandle = new TaskCompletionSource(); + string data = await this.resumeHandle.Task; + this.resumeHandle = null; + return data; + } + + public override void OnEvent(OrchestrationContext context, string name, string input) + { + // Complete the pending task when event arrives + this.resumeHandle?.SetResult(input); + } +} +``` + +### Typed Event Data + +For strongly-typed event data, deserialize in `OnEvent`: + +```csharp +public class ApprovalOrchestration : TaskOrchestration +{ + TaskCompletionSource approvalHandle; + + public override async Task RunTask( + OrchestrationContext context, + ApprovalRequest request) + { + // Send approval request + await context.ScheduleTask(typeof(SendApprovalEmailActivity), request); + + // Wait for approval response + this.approvalHandle = new TaskCompletionSource(); + var response = await this.approvalHandle.Task; + this.approvalHandle = null; + + return new ApprovalResult + { + IsApproved = response.IsApproved, + ApprovedBy = response.ApprovedBy + }; + } + + public override void OnEvent(OrchestrationContext context, string name, string input) + { + if (name == "Approval" && this.approvalHandle != null) + { + var response = context.MessageDataConverter.Deserialize(input); + this.approvalHandle.SetResult(response); + } + } +} +``` + +### Wait with Timeout + +Combine with timers to implement timeouts: + +```csharp +public class TimedApprovalOrchestration : TaskOrchestration +{ + TaskCompletionSource eventHandle; 
+ + public override async Task RunTask(OrchestrationContext context, Request input) + { + // Set up the event wait + this.eventHandle = new TaskCompletionSource(); + var eventTask = this.eventHandle.Task; + + // Set up timeout + using var cts = new CancellationTokenSource(); + var timeoutTask = context.CreateTimer( + context.CurrentUtcDateTime.AddHours(24), + "timeout", + cts.Token); + + // Wait for either event or timeout + var winner = await Task.WhenAny(eventTask, timeoutTask); + + if (winner == eventTask) + { + cts.Cancel(); + var response = await eventTask; + this.eventHandle = null; + return new Result { Response = response, TimedOut = false }; + } + else + { + this.eventHandle = null; + return new Result { TimedOut = true }; + } + } + + public override void OnEvent(OrchestrationContext context, string name, string input) + { + if (name == "UserResponse") + { + this.eventHandle?.SetResult(input); + } + } +} +``` + +### Multiple Event Types + +Handle different event types with named checks: + +```csharp +public class MultiEventOrchestration : TaskOrchestration +{ + TaskCompletionSource<(string EventType, string Data)> eventHandle; + + public override async Task RunTask(OrchestrationContext context, Request input) + { + this.eventHandle = new TaskCompletionSource<(string, string)>(); + + using var cts = new CancellationTokenSource(); + var eventTask = this.eventHandle.Task; + var timeoutTask = context.CreateTimer( + context.CurrentUtcDateTime.AddDays(7), + "timeout", + cts.Token); + + var winner = await Task.WhenAny(eventTask, timeoutTask); + cts.Cancel(); + this.eventHandle = null; + + if (winner == timeoutTask) + { + return new Result { Status = "TimedOut" }; + } + + var (eventType, data) = await eventTask; + + return eventType switch + { + "Approve" => new Result { Status = "Approved" }, + "Reject" => new Result { Status = "Rejected", Reason = data }, + "Cancel" => new Result { Status = "Cancelled" }, + _ => new Result { Status = "Unknown" } + }; + } + + 
public override void OnEvent(OrchestrationContext context, string name, string input) + { + if (this.eventHandle != null && + (name == "Approve" || name == "Reject" || name == "Cancel")) + { + this.eventHandle.SetResult((name, input)); + } + } +} +``` + +## Sending Events + +### From TaskHubClient + +```csharp +var service = GetOrchestrationService(); +var client = new TaskHubClient(service, loggerFactory: loggerFactory); + +// Send event to a specific orchestration instance +await client.RaiseEventAsync( + instance, // OrchestrationInstance + eventName: "Approval", // Event name (passed to OnEvent) + eventData: new ApprovalData // Event payload (serialized to string) + { + IsApproved = true, + ApprovedBy = "manager@company.com" + }); + +// Using instance ID directly +await client.RaiseEventAsync( + new OrchestrationInstance { InstanceId = "order-12345" }, + "Approval", + new ApprovalData { IsApproved = true }); +``` + +### From Another Orchestration + +Orchestrations cannot directly raise events. Use an activity: + +```csharp +public override async Task RunTask(OrchestrationContext context, SignalInput input) +{ + // Do some work... 
+ + // Use an activity to send the event + await context.ScheduleTask(typeof(SendEventActivity), new SendEventInput + { + TargetInstanceId = input.TargetOrchestrationId, + EventName = "DataReady", + EventData = input.Data + }); +} + +// Activity to send the event +public class SendEventActivity : AsyncTaskActivity +{ + private readonly TaskHubClient _client; + + public SendEventActivity(TaskHubClient client) + { + _client = client; + } + + protected override async Task ExecuteAsync( + TaskContext context, + SendEventInput input) + { + await _client.RaiseEventAsync( + new OrchestrationInstance { InstanceId = input.TargetInstanceId }, + input.EventName, + input.EventData); + return true; + } +} +``` + +### From External Systems (Webhooks) + +```csharp +// In an ASP.NET Core controller +[ApiController] +[Route("api/[controller]")] +public class WebhookController : ControllerBase +{ + private readonly TaskHubClient _client; + + public WebhookController(TaskHubClient client) + { + _client = client; + } + + [HttpPost("approve/{instanceId}")] + public async Task Approve( + string instanceId, + [FromBody] ApprovalRequest request) + { + await _client.RaiseEventAsync( + new OrchestrationInstance { InstanceId = instanceId }, + "Approval", + new ApprovalData + { + IsApproved = request.Approved, + ApprovedBy = User.Identity?.Name + }); + + return Ok(); + } +} +``` + +## Event Patterns + +### Human Approval Workflow + +```csharp +public class ApprovalWorkflow : TaskOrchestration +{ + TaskCompletionSource approvalHandle; + + public override async Task RunTask( + OrchestrationContext context, + ApprovalRequest request) + { + // Step 1: Send approval request email + await context.ScheduleTask(typeof(SendApprovalEmailActivity), new EmailData + { + To = request.ApproverEmail, + Subject = $"Approval needed: {request.Title}", + ApprovalUrl = $"https://myapp.com/approve/{context.OrchestrationInstance.InstanceId}" + }); + + // Step 2: Wait for response with 7-day timeout + 
this.approvalHandle = new TaskCompletionSource(); + + using var cts = new CancellationTokenSource(); + var approvalTask = this.approvalHandle.Task; + var timeoutTask = context.CreateTimer( + context.CurrentUtcDateTime.AddDays(7), + "timeout", + cts.Token); + + var winner = await Task.WhenAny(approvalTask, timeoutTask); + cts.Cancel(); + + if (winner == timeoutTask) + { + this.approvalHandle = null; + await context.ScheduleTask(typeof(SendTimeoutNotificationActivity), request); + return new ApprovalResult { Status = ApprovalStatus.TimedOut }; + } + + var response = await approvalTask; + this.approvalHandle = null; + + // Step 3: Process the decision + if (response.IsApproved) + { + await context.ScheduleTask(typeof(ProcessApprovalActivity), request); + return new ApprovalResult { Status = ApprovalStatus.Approved }; + } + else + { + await context.ScheduleTask(typeof(ProcessRejectionActivity), new RejectionData + { + Request = request, + Reason = response.RejectionReason + }); + return new ApprovalResult + { + Status = ApprovalStatus.Rejected, + Reason = response.RejectionReason + }; + } + } + + public override void OnEvent(OrchestrationContext context, string name, string input) + { + if (name == "ApprovalResponse" && this.approvalHandle != null) + { + var response = context.MessageDataConverter.Deserialize(input); + this.approvalHandle.SetResult(response); + } + } +} +``` + +### Sequential Multi-Step Events + +```csharp +public class MultiStepOrchestration : TaskOrchestration +{ + TaskCompletionSource currentEventHandle; + string currentEventName; + + public override async Task RunTask(OrchestrationContext context, Request input) + { + // Wait for step 1 + var step1 = await WaitForEvent("Step1Complete"); + await context.ScheduleTask(typeof(ProcessStep1Activity), step1); + + // Wait for step 2 + var step2 = await WaitForEvent("Step2Complete"); + await context.ScheduleTask(typeof(ProcessStep2Activity), step2); + + // Wait for step 3 + var step3 = await 
WaitForEvent("Step3Complete"); + await context.ScheduleTask(typeof(ProcessStep3Activity), step3); + + return new Result { Success = true }; + } + + async Task WaitForEvent(string eventName) + { + this.currentEventName = eventName; + this.currentEventHandle = new TaskCompletionSource(); + var result = await this.currentEventHandle.Task; + this.currentEventHandle = null; + this.currentEventName = null; + return result; + } + + public override void OnEvent(OrchestrationContext context, string name, string input) + { + if (name == this.currentEventName && this.currentEventHandle != null) + { + this.currentEventHandle.SetResult(input); + } + } +} +``` + +## Event Behavior + +### Event Buffering + +Events sent before the orchestration reaches its wait point are **buffered** and delivered when `OnEvent` is called during replay. The framework replays the event from history. + +### Event History + +Events are recorded in the orchestration history: + +```text +EventRaised { Name: "Approval", Input: "{...}" } +``` + +During replay, the `OnEvent` method is called with the same event data from history, one at a time using a single thread, ensuring deterministic behavior. The thread used to call `OnEvent` is the same thread that runs the orchestration code. + +## Best Practices + +### 1. Use Meaningful Event Names + +```csharp +public override void OnEvent(OrchestrationContext context, string name, string input) +{ + // ✅ Good - clear, descriptive names + if (name == "OrderApproved") { ... } + if (name == "PaymentReceived") { ... } + + // ❌ Bad - unclear names + if (name == "Event1") { ... } + if (name == "Data") { ... } +} +``` + +### 2. Include Timeout + +```csharp +// ✅ Good - has timeout +var eventTask = this.eventHandle.Task; +var timeoutTask = context.CreateTimer(deadline, "timeout", cts.Token); +await Task.WhenAny(eventTask, timeoutTask); + +// ⚠️ Risky - waits forever +await this.eventHandle.Task; +``` + +### 3.
Clean Up Handles + +```csharp +public override async Task RunTask(OrchestrationContext context, Request input) +{ + this.eventHandle = new TaskCompletionSource(); + + try + { + var result = await this.eventHandle.Task; + return new Result { Data = result }; + } + finally + { + // ✅ Always clean up + this.eventHandle = null; + } +} +``` + +### 4. Validate Event Data + +```csharp +public override void OnEvent(OrchestrationContext context, string name, string input) +{ + if (name == "Approval" && this.approvalHandle != null) + { + var response = context.MessageDataConverter.Deserialize(input); + + // Validate before completing + if (string.IsNullOrEmpty(response.ApprovedBy)) + { + // Could log warning or ignore invalid event + return; + } + + this.approvalHandle.SetResult(response); + } +} +``` + +### 5. Document Expected Events + +```csharp +/// +/// Order processing orchestration. +/// +/// Expected external events: +/// - "PaymentConfirmed" (PaymentData): Payment has been processed +/// - "ShippingReady" (ShippingData): Order is ready to ship +/// - "Cancel" (CancellationData): Cancel the order +/// +public class OrderOrchestration : TaskOrchestration +{ + // ... +} +``` + +## Next Steps + +- [Timers](timers.md) - Combining events with timeouts +- [Sub-Orchestrations](sub-orchestrations.md) - Coordinating child workflows +- [Error Handling](error-handling.md) - Handling event failures diff --git a/docs/features/retries.md b/docs/features/retries.md new file mode 100644 index 000000000..2baad2623 --- /dev/null +++ b/docs/features/retries.md @@ -0,0 +1,303 @@ +# Automatic Retries + +The Durable Task Framework supports automatic retries for activities and sub-orchestrations. Retries are handled durably - the retry count and timing survive process restarts.
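As a rough model of the retry timing, the delay before retry $k$ appears to grow geometrically and is capped by `MaxRetryInterval`. The formula below is inferred from the delay sequences listed in the retry patterns in this document, not taken from the framework source, so treat it as a sketch. The symbols $d_k$, $f$, $c$, $m$, and $N$ are introduced here for illustration only: $f$ stands for `FirstRetryInterval`, $c$ for `BackoffCoefficient`, $m$ for `MaxRetryInterval`, and $N$ for `MaxNumberOfAttempts`.

$$
d_k = \min\left(f \cdot c^{\,k-1},\; m\right), \qquad k = 1, \dots, N - 1
$$

For example, with $f = 1\,\text{s}$, $c = 2$, $m = 60\,\text{s}$, and $N = 10$, the nine delays are 1, 2, 4, 8, 16, 32, 60, 60, 60 seconds, because $2^{6} = 64$ exceeds the cap from the seventh retry onward. `RetryTimeout` is not modeled here; it separately bounds the total time spent retrying.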
+ +## Basic Retry Configuration + +### RetryOptions + +Configure retries using `RetryOptions`: + +```csharp +var retryOptions = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(5), + maxNumberOfAttempts: 3); +``` + +### Calling with Retry + +```csharp +var result = await context.ScheduleWithRetry( + typeof(UnreliableActivity), + retryOptions, + input); +``` + +## RetryOptions Properties + +| Property | Description | Default | +| -------- | ----------- | ------- | +| `FirstRetryInterval` | Delay before the first retry | Required | +| `MaxNumberOfAttempts` | Maximum total attempts (including first) | Required | +| `BackoffCoefficient` | Multiplier for exponential backoff | 1.0 | +| `MaxRetryInterval` | Maximum delay between retries | `TimeSpan.MaxValue` | +| `RetryTimeout` | Total time allowed for all retries | `TimeSpan.MaxValue` | +| `Handle` | Custom exception filter function | Retry all | + +## Retry Patterns + +### Fixed Interval + +Same delay between each retry: + +```csharp +var options = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(10), + maxNumberOfAttempts: 5); +// BackoffCoefficient defaults to 1.0 +// Delays: 10s, 10s, 10s, 10s +``` + +### Exponential Backoff + +Increasing delays between retries: + +```csharp +var options = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(1), + maxNumberOfAttempts: 5) +{ + BackoffCoefficient = 2.0 +}; +// Delays: 1s, 2s, 4s, 8s +``` + +### Exponential with Max Interval + +Cap the maximum delay: + +```csharp +var options = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(1), + maxNumberOfAttempts: 10) +{ + BackoffCoefficient = 2.0, + MaxRetryInterval = TimeSpan.FromMinutes(1) +}; +// Delays: 1s, 2s, 4s, 8s, 16s, 32s, 60s, 60s, 60s +``` + +### With Timeout + +Limit total retry time: + +```csharp +var options = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(5), + maxNumberOfAttempts: 100) // High limit +{ + BackoffCoefficient = 2.0, + MaxRetryInterval = 
TimeSpan.FromMinutes(5), + RetryTimeout = TimeSpan.FromHours(1) // Stop after 1 hour +}; +``` + +## Custom Exception Handling + +### Handle Specific Exceptions + +```csharp +var options = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(5), + maxNumberOfAttempts: 3) +{ + Handle = exception => + { + // Only retry on transient failures + return exception is HttpRequestException || + exception is TimeoutException; + } +}; +``` + +### Retry Based on Exception Details + +```csharp +var options = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(5), + maxNumberOfAttempts: 5) +{ + Handle = exception => + { + if (exception is ApiException apiEx) + { + // Retry on 429 (rate limit) or 5xx (server errors) + return apiEx.StatusCode == 429 || + (int)apiEx.StatusCode >= 500; + } + return false; + } +}; +``` + +### Never Retry Specific Errors + +```csharp +var options = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(5), + maxNumberOfAttempts: 3) +{ + Handle = exception => + { + // Don't retry validation errors + if (exception is ValidationException) + return false; + + // Don't retry authentication errors + if (exception is AuthenticationException) + return false; + + return true; // Retry everything else + } +}; +``` + +## Sub-Orchestration Retries + +Retry sub-orchestrations with the same pattern: + +```csharp +var options = new RetryOptions( + firstRetryInterval: TimeSpan.FromMinutes(1), + maxNumberOfAttempts: 3); + +var result = await context.CreateSubOrchestrationInstanceWithRetry( + typeof(ChildOrchestration), + options, + input); +``` + +## Retry Behavior + +### What Happens During Retry + +1. Activity throws an exception +2. Framework records `TaskFailed` event +3. Retry timer is created (durable) +4. Timer fires, activity is scheduled again +5. If successful, `TaskCompleted` is recorded +6. 
If failed and attempts remain, go to step 2 + +### Durability + +Retries are durable: + +- Retry count survives process restarts +- Timer state is persisted +- Attempts that already completed are not re-run during replay + +### Final Failure + +After all retries are exhausted: + +- `TaskFailedException` is thrown in the orchestration +- It contains the last exception as `InnerException` +- The orchestration can catch and handle it + +```csharp +try +{ + var result = await context.ScheduleWithRetry( + typeof(UnreliableActivity), + retryOptions, + input); +} +catch (TaskFailedException ex) +{ + // All retries failed + _logger.LogError(ex.InnerException, "Activity failed after all retries"); + await context.ScheduleTask(typeof(CompensationActivity), input); +} +``` + +## Best Practices + +### 1. Use Idempotent Activities + +Activities may execute multiple times: + +```csharp +public class PaymentActivity : AsyncTaskActivity +{ + protected override async Task ExecuteAsync( + TaskContext context, + Payment input) + { + // Use idempotency key to prevent duplicate charges + return await _paymentService.ChargeAsync( + input.Amount, + idempotencyKey: input.OrderId); + } +} +``` + +### 2. Don't Retry Non-Transient Errors + +```csharp +var options = new RetryOptions(...) +{ + Handle = ex => !(ex is ValidationException) && + !(ex is NotFoundException) +}; +``` + +### 3. Set Reasonable Timeouts + +```csharp +var options = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(5), + maxNumberOfAttempts: 10) +{ + RetryTimeout = TimeSpan.FromMinutes(30) // Don't retry forever +}; +``` + +### 4.
Consider Circuit Breaker Pattern + +For repeated failures, consider manual circuit breaking: + +```csharp +public override async Task RunTask(OrchestrationContext context, Input input) +{ + int consecutiveFailures = 0; + + while (consecutiveFailures < 3) + { + try + { + return await context.ScheduleWithRetry( + typeof(MyActivity), + retryOptions, + input); + } + catch (TaskFailedException) + { + consecutiveFailures++; + await context.CreateTimer( + context.CurrentUtcDateTime.AddMinutes(5 * consecutiveFailures), + true); + } + } + + throw new Exception("Circuit breaker opened"); +} +``` + +## Comparison with Activity-Level Retry + +Activity-level retries (inside the activity code) are not durable and do not survive orchestration restarts. They also do not appear in the orchestration history. + +| Feature | Orchestration Retry (ScheduleWithRetry) | Activity-Internal Retry | +| ------- | -------------------------------------- | ---------------------- | +| Durable | ✅ Yes | ❌ No | +| Survives crashes | ✅ Yes | ❌ No | +| Visible in history | ✅ Yes | ❌ No | +| Configurable per-call | ✅ Yes | ⚠️ Limited | + +Prefer orchestration-level retries for durability. + +## Next Steps + +- [Error Handling](error-handling.md) - Comprehensive error handling patterns +- [Timers](timers.md) - Durable timers and delays +- [Activities](../concepts/activities.md) - Writing retry-safe activities diff --git a/docs/features/sub-orchestrations.md b/docs/features/sub-orchestrations.md new file mode 100644 index 000000000..5305c580d --- /dev/null +++ b/docs/features/sub-orchestrations.md @@ -0,0 +1,315 @@ +# Sub-Orchestrations + +Sub-orchestrations allow you to break complex workflows into smaller, reusable pieces. A parent orchestration can start child orchestrations and wait for their results.
+ +## Creating Sub-Orchestrations + +### Basic Usage + +```csharp +public override async Task RunTask( + OrchestrationContext context, + OrderInput input) +{ + // Start a sub-orchestration and wait for result + var paymentResult = await context.CreateSubOrchestrationInstance( + typeof(PaymentOrchestration), + input.PaymentData); + + return new OrderResult { PaymentId = paymentResult.TransactionId }; +} +``` + +### With Custom Instance ID + +```csharp +var result = await context.CreateSubOrchestrationInstance( + typeof(ShippingOrchestration), + instanceId: $"shipping-{input.OrderId}", // Custom ID + input: input.ShippingData); +``` + +### With Retry Options + +```csharp +var retryOptions = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(30), + maxNumberOfAttempts: 3) +{ + BackoffCoefficient = 2.0 +}; + +var result = await context.CreateSubOrchestrationInstanceWithRetry( + typeof(ChildOrchestration), + retryOptions, + input); +``` + +## Sub-Orchestration Patterns + +### Sequential Sub-Orchestrations + +```csharp +public override async Task RunTask( + OrchestrationContext context, + OrderInput input) +{ + // Step 1: Validate + var validationResult = await context.CreateSubOrchestrationInstance( + typeof(ValidationOrchestration), + input); + + if (!validationResult.IsValid) + return new OrderResult { Error = validationResult.Error }; + + // Step 2: Process payment + var paymentResult = await context.CreateSubOrchestrationInstance( + typeof(PaymentOrchestration), + input.PaymentData); + + // Step 3: Fulfill order + var fulfillmentResult = await context.CreateSubOrchestrationInstance( + typeof(FulfillmentOrchestration), + new FulfillmentInput { OrderId = input.OrderId, PaymentId = paymentResult.Id }); + + return new OrderResult { Success = true, TrackingNumber = fulfillmentResult.TrackingNumber }; +} +``` + +### Parallel Sub-Orchestrations (Fan-Out) + +```csharp +public override async Task RunTask( + OrchestrationContext context, + BatchInput input) +{ + // 
Start all sub-orchestrations in parallel + var tasks = input.Items.Select(item => + context.CreateSubOrchestrationInstance( + typeof(ProcessItemOrchestration), + instanceId: $"item-{item.Id}", + input: item)); + + // Wait for all to complete + var results = await Task.WhenAll(tasks); + + return new BatchResult + { + ProcessedCount = results.Length, + SuccessCount = results.Count(r => r.Success), + FailedItems = results.Where(r => !r.Success).Select(r => r.ItemId).ToList() + }; +} +``` + +### Conditional Sub-Orchestrations + +```csharp +public override async Task RunTask( + OrchestrationContext context, + Input input) +{ + // Choose sub-orchestration based on input + if (input.ProcessType == ProcessType.Express) + { + return await context.CreateSubOrchestrationInstance( + typeof(ExpressProcessingOrchestration), + input); + } + else + { + return await context.CreateSubOrchestrationInstance( + typeof(StandardProcessingOrchestration), + input); + } +} +``` + +### Hierarchical Workflows + +```csharp +// Top-level orchestration +public class ProjectOrchestration : TaskOrchestration +{ + public override async Task RunTask( + OrchestrationContext context, + ProjectInput input) + { + var results = new List(); + + foreach (var phase in input.Phases) + { + var phaseResult = await context.CreateSubOrchestrationInstance( + typeof(PhaseOrchestration), + phase); + results.Add(phaseResult); + + if (!phaseResult.Success) + break; // Stop on failure + } + + return new ProjectResult { Phases = results }; + } +} + +// Second-level orchestration +public class PhaseOrchestration : TaskOrchestration +{ + public override async Task RunTask( + OrchestrationContext context, + PhaseInput input) + { + // Each phase has multiple tasks as sub-orchestrations + var taskResults = await Task.WhenAll( + input.Tasks.Select(t => + context.CreateSubOrchestrationInstance( + typeof(TaskOrchestration), + t))); + + return new PhaseResult + { + Success = taskResults.All(r => r.Success), + TaskResults = 
taskResults.ToList() + }; + } +} +``` + +## Sub-Orchestration vs Activity + +| Feature | Sub-Orchestration | Activity | +| ------- | ----------------- | -------- | +| **Can call other orchestrations** | ✅ Yes | ❌ Not directly | +| **Can use timers** | ✅ Yes | ❌ No | +| **Can wait for events** | ✅ Yes | ❌ No | +| **Has own history** | ✅ Yes | ❌ No | +| **Overhead** | Higher | Lower | +| **Use for** | Complex workflows | Single operations | + +### When to Use Sub-Orchestrations + +✅ **Use sub-orchestrations when:** + +- The child workflow needs timers, events, or fan-out +- You want isolated history (for debugging, monitoring, or to reduce parent history size) +- You want to distribute orchestration work across multiple worker instances +- The logic is reusable across multiple parent orchestrations +- The child workflow is complex enough to warrant separate management + +❌ **Use activities instead when:** + +- Performing a single operation (API call, DB query) +- No need for durable timers or events +- Simple, stateless work + +## Instance ID Management + +### Auto-Generated IDs + +```csharp +// ID is an automatically generated GUID +var result = await context.CreateSubOrchestrationInstance( + typeof(ChildOrchestration), + input); +``` + +### Custom IDs for Idempotency + +```csharp +// Using custom ID ensures idempotency +var result = await context.CreateSubOrchestrationInstance( + typeof(ChildOrchestration), + instanceId: $"{context.OrchestrationInstance.InstanceId}:child:{input.ItemId}", + input: input); +``` + +### Naming Conventions + +```csharp +// Good patterns for sub-orchestration IDs: +$"{parentId}:payment" // Single child of type +$"{parentId}:item:{itemId}" // Multiple children by item +$"order-{orderId}:fulfillment" // Business-meaningful +``` + +## Error Handling + +### Catching Sub-Orchestration Failures + +```csharp +public override async Task RunTask(OrchestrationContext context, Input input) +{ + try + { + var result = await
context.CreateSubOrchestrationInstance( + typeof(RiskyOrchestration), + input); + return result; + } + catch (SubOrchestrationFailedException ex) + { + // Sub-orchestration threw an unhandled exception + await context.ScheduleTask(typeof(CompensationActivity), input); + return new Result { Error = ex.Message }; + } +} +``` + +### Timeout Handling + +```csharp +public override async Task RunTask(OrchestrationContext context, Input input) +{ + using var cts = new CancellationTokenSource(); + + var subOrchTask = context.CreateSubOrchestrationInstance( + typeof(LongRunningOrchestration), + input); + + var timeoutTask = context.CreateTimer( + context.CurrentUtcDateTime.AddHours(1), + true, + cts.Token); + + var winner = await Task.WhenAny(subOrchTask, timeoutTask); + + if (winner == subOrchTask) + { + cts.Cancel(); + return await subOrchTask; + } + else + { + // Note: The sub-orchestration continues running! + // Consider terminating it via an activity if needed + return new Result { TimedOut = true }; + } +} +``` + +## Monitoring Sub-Orchestrations + +### Getting Sub-Orchestration Status + +```csharp +// From an activity or external code +var service = GetOrchestrationService(); +var client = new TaskHubClient(service, loggerFactory: loggerFactory); +var state = await client.GetOrchestrationStateAsync( + new OrchestrationInstance { InstanceId = subOrchestrationId }); +``` + +### Viewing in History + +Sub-orchestration events in parent history: + +```text +SubOrchestrationInstanceCreated { InstanceId: "child-123", Name: "PaymentOrchestration" } +SubOrchestrationInstanceCompleted { InstanceId: "child-123", Result: "{...}" } +``` + +## Next Steps + +- [Activities](../concepts/activities.md) - When to use activities instead +- [Error Handling](error-handling.md) - Handling sub-orchestration failures +- [Versioning](versioning.md) - Updating sub-orchestration code diff --git a/docs/features/timers.md b/docs/features/timers.md new file mode 100644 index
000000000..d413947fd --- /dev/null +++ b/docs/features/timers.md @@ -0,0 +1,304 @@ +# Durable Timers + +Durable timers allow orchestrations to wait for specified durations or until specific times. Unlike `Thread.Sleep` or `Task.Delay`, durable timers are persisted and survive process restarts. + +## Creating Timers + +### Wait for Duration + +```csharp +// Wait for 5 minutes +await context.CreateTimer( + context.CurrentUtcDateTime.AddMinutes(5), + true); +``` + +### Wait Until Specific Time + +```csharp +// Wait until midnight +var midnight = context.CurrentUtcDateTime.Date.AddDays(1); +await context.CreateTimer(midnight, true); +``` + +### Timer with CancellationToken + +```csharp +using var cts = new CancellationTokenSource(); +var timerTask = context.CreateTimer( + context.CurrentUtcDateTime.AddHours(1), + true, + cts.Token); + +// Cancel the timer if needed +cts.Cancel(); +``` + +## Common Patterns + +### Timeout with Fallback + +```csharp +public override async Task RunTask(OrchestrationContext context, Input input) +{ + using var cts = new CancellationTokenSource(); + + var workTask = context.ScheduleTask(typeof(LongRunningActivity), input); + var timeoutTask = context.CreateTimer( + context.CurrentUtcDateTime.AddMinutes(30), + true, + cts.Token); + + var winner = await Task.WhenAny(workTask, timeoutTask); + + if (winner == workTask) + { + cts.Cancel(); // Cancel the timer + return await workTask; + } + else + { + // Timeout occurred + return new Result { TimedOut = true }; + } +} +``` + +### Approval with Deadline + +```csharp +public override async Task RunTask( + OrchestrationContext context, + ApprovalRequest request) +{ + // Send approval request + await context.ScheduleTask(typeof(SendApprovalEmail), request); + + using var cts = new CancellationTokenSource(); + + // Wait for approval event or 7-day timeout + var approvalTask = context.WaitForExternalEvent("Approved"); + var deadlineTask = context.CreateTimer( + context.CurrentUtcDateTime.AddDays(7), + 
true, + cts.Token); + + var winner = await Task.WhenAny(approvalTask, deadlineTask); + + if (winner == approvalTask) + { + cts.Cancel(); + var approved = await approvalTask; + return new ApprovalResult { Approved = approved }; + } + else + { + return new ApprovalResult { Approved = false, Expired = true }; + } +} +``` + +### Periodic Polling (Monitor Pattern) + +```csharp +public override async Task RunTask( + OrchestrationContext context, + JobInput input) +{ + var expirationTime = context.CurrentUtcDateTime.AddHours(4); + var pollingInterval = TimeSpan.FromSeconds(30); + + while (context.CurrentUtcDateTime < expirationTime) + { + var status = await context.ScheduleTask( + typeof(CheckJobStatusActivity), + input.JobId); + + if (status.IsComplete) + { + return new JobResult { Success = true, Data = status.Data }; + } + + // Wait before next poll + var nextCheck = context.CurrentUtcDateTime.Add(pollingInterval); + await context.CreateTimer(nextCheck, true); + + // Optional: exponential backoff + pollingInterval = TimeSpan.FromSeconds( + Math.Min(pollingInterval.TotalSeconds * 1.5, 300)); + } + + return new JobResult { Success = false, TimedOut = true }; +} +``` + +### Scheduled Execution + +```csharp +public override async Task RunTask( + OrchestrationContext context, + ScheduledTask input) +{ + // Wait until scheduled time + if (context.CurrentUtcDateTime < input.ScheduledTime) + { + await context.CreateTimer(input.ScheduledTime, true); + } + + // Execute the task + return await context.ScheduleTask( + typeof(ScheduledWorkActivity), + input); +} +``` + +### Cron-like Scheduling + +```csharp +public override async Task RunTask(OrchestrationContext context, CronInput input) +{ + var nextRun = GetNextCronTime(input.CronExpression, context.CurrentUtcDateTime); + + // Wait until next scheduled time + await context.CreateTimer(nextRun, true); + + // Execute scheduled work + await context.ScheduleTask(typeof(CronJobActivity), input); + + // Continue as new for next 
iteration + context.ContinueAsNew(input); +} + +private DateTime GetNextCronTime(string cronExpression, DateTime fromTime) +{ + // Use a cron parsing library like Cronos + var expression = CronExpression.Parse(cronExpression); + return expression.GetNextOccurrence(fromTime, TimeZoneInfo.Utc) + ?? throw new InvalidOperationException("No next occurrence"); +} +``` + +### Reminder/Notification Pattern + +```csharp +public override async Task RunTask(OrchestrationContext context, ReminderInput input) +{ + // Send initial notification + await context.ScheduleTask(typeof(SendReminderActivity), new ReminderData + { + UserId = input.UserId, + Message = input.InitialMessage + }); + + // Send follow-up reminders + foreach (var reminder in input.FollowUpSchedule) + { + await context.CreateTimer( + context.CurrentUtcDateTime.Add(reminder.Delay), + true); + + await context.ScheduleTask(typeof(SendReminderActivity), new ReminderData + { + UserId = input.UserId, + Message = reminder.Message + }); + } +} +``` + +## Timer Behavior + +### Durability + +Timers are persisted as `TimerCreated` events: + +```text +1. ExecutionStarted +2. TimerCreated { FireAt: "2024-01-15T10:00:00Z" } +``` + +When the timer fires, a `TimerFired` event is added: + +```text +3. TimerFired { TimerId: 1 } +``` + +### Replay Behavior + +During replay: + +- Past timers complete immediately (fire time already passed) +- Future timers wait for the scheduled time + +### Minimum Duration + +Very short timers (< 1 second) may not provide precise timing due to: + +- Message processing overhead +- Partition lease renewal intervals +- Clock synchronization + +For precise short delays, use activities. + +## Best Practices + +### 1. Always Use Context Time + +```csharp +// ✅ Correct +await context.CreateTimer(context.CurrentUtcDateTime.AddMinutes(5), true); + +// ❌ Wrong - non-deterministic +await context.CreateTimer(DateTime.UtcNow.AddMinutes(5), true); +``` + +### 2.
Cancel Unused Timers + +```csharp +using var cts = new CancellationTokenSource(); +var timer = context.CreateTimer(deadline, true, cts.Token); +var work = context.WaitForExternalEvent("Event"); + +var winner = await Task.WhenAny(timer, work); +if (winner == work) +{ + cts.Cancel(); // Important: cancel the timer +} +``` + +> [!NOTE] +> If an orchestration completes while timers are pending, the orchestration will remain in the "Running" state until all timers either fire or are cancelled. + +### 3. Avoid Very Long Timers Without ContinueAsNew + +Very long timers make it harder to version orchestration code. Where possible, break up long waits periodically with `ContinueAsNew`. + +```csharp +// For very long waits, consider breaking up with ContinueAsNew +public override async Task RunTask(OrchestrationContext context, WaitInput input) +{ + var remainingWait = input.TotalWait - (context.CurrentUtcDateTime - input.StartTime); + + if (remainingWait > TimeSpan.FromDays(7)) + { + // Wait for a week, then continue as new + await context.CreateTimer( + context.CurrentUtcDateTime.AddDays(7), + true); + context.ContinueAsNew(input); + return; + } + + await context.CreateTimer( + context.CurrentUtcDateTime.Add(remainingWait), + true); + + await context.ScheduleTask(typeof(FinalActivity), input); +} +``` + +## Next Steps + +- [External Events](external-events.md) - Combining timers with events +- [Eternal Orchestrations](eternal-orchestrations.md) - Long-running workflows +- [Replay and Durability](../concepts/replay-and-durability.md) - How timers are persisted diff --git a/docs/features/versioning.md b/docs/features/versioning.md new file mode 100644 index 000000000..0eb6d21c4 --- /dev/null +++ b/docs/features/versioning.md @@ -0,0 +1,498 @@ +# Orchestration Versioning + +When you need to update orchestration code while instances are running, careful versioning strategies are required to avoid breaking in-flight orchestrations.
+ + ## The Versioning Problem + +Orchestrations use [replay](../concepts/replay-and-durability.md) to rebuild state. If you change the code while an orchestration is in-flight, replay can fail. + +### Example: Adding an Activity at the Beginning + +```csharp +// Version 1 +public override async Task RunTask(OrchestrationContext context, string input) +{ + var a = await context.ScheduleTask(typeof(ActivityA), input); + var b = await context.ScheduleTask(typeof(ActivityB), a); + return b; +} + +// Version 2 - Added a new activity at the BEGINNING +public override async Task RunTask(OrchestrationContext context, string input) +{ + var validated = await context.ScheduleTask(typeof(ValidateActivity), input); // NEW + var a = await context.ScheduleTask(typeof(ActivityA), validated); + var b = await context.ScheduleTask(typeof(ActivityB), a); + return b; +} +``` + +Suppose an instance started with V1 and completed `ActivityA`. Its history contains: + +```text +TaskScheduled { Name: "ActivityA" } +TaskCompleted { Result: "..." } +``` + +When V2 code replays this history: + +1. V2 expects first task to be `ValidateActivity` +2. History shows first task was `ActivityA` +3. **NonDeterministicOrchestrationException** is thrown + +### Why Adding to the End Is Different + +Adding activities at the **end** of an orchestration is generally safe because: + +- Completed orchestrations are never replayed +- In-flight orchestrations haven't reached that point yet + +```csharp +// Version 2 - Adding at the END is safe +public override async Task RunTask(OrchestrationContext context, string input) +{ + var a = await context.ScheduleTask(typeof(ActivityA), input); + var b = await context.ScheduleTask(typeof(ActivityB), a); + var c = await context.ScheduleTask(typeof(ActivityC), b); // Usually safe to add here + return c; +} +``` + +However, be cautious if in-flight orchestrations are waiting on timers or external events near the end - they may still replay and encounter the new code.
+ +## Versioning Strategies + +### Strategy 1: Side-by-Side Versioning + +Deploy multiple versions of the orchestration simultaneously using `NameValueObjectCreator`: + +```csharp +// Define both versions as separate classes +public class OrderOrchestrationV1 : TaskOrchestration +{ + public override async Task RunTask(OrchestrationContext context, Input input) + { + // V1 logic + } +} + +public class OrderOrchestrationV2 : TaskOrchestration +{ + public override async Task RunTask(OrchestrationContext context, Input input) + { + // V2 logic with new features + } +} + +// Register both with explicit name and version +worker.AddTaskOrchestrations( + new NameValueObjectCreator( + "OrderOrchestration", "V1", typeof(OrderOrchestrationV1)), + new NameValueObjectCreator( + "OrderOrchestration", "V2", typeof(OrderOrchestrationV2))); +``` + +Start new instances with the new version: + +```csharp +// Start with specific version +var instance = await client.CreateOrchestrationInstanceAsync( + "OrderOrchestration", + "V2", // Version string must match registration + input); +``` + +### Strategy 2: Feature Flags with Version Check + +Check version in orchestration code: + +```csharp +public override async Task RunTask(OrchestrationContext context, Input input) +{ + var a = await context.ScheduleTask(typeof(ActivityA), input); + + // Only run new code for instances started after cutoff + if (context.OrchestrationInstance.ExecutionId != null && + input.Version >= 2) + { + var b = await context.ScheduleTask(typeof(ActivityB), a); + return new Result { Data = b }; + } + + return new Result { Data = a }; +} +``` + +### Strategy 3: Wait for Completion + +The safest approach for breaking changes: + +1. **Stop starting new instances** of the old version +2. **Wait for all running instances** to complete +3. **Deploy the new version** +4. 
**Resume starting instances** + +```csharp +// Query running instances +var runningInstances = await client.GetOrchestrationStateAsync( + new OrchestrationStateQuery + { + RuntimeStatus = new[] { OrchestrationStatus.Running } + }); + +// Wait for completion +while (runningInstances.Any()) +{ + await Task.Delay(TimeSpan.FromMinutes(1)); + runningInstances = await client.GetOrchestrationStateAsync(...); +} + +// Safe to deploy new version +``` + +### Strategy 4: Graceful Migration + +For long-running orchestrations, add a migration point: + +```csharp +// V1: Add migration check +public override async Task RunTask(OrchestrationContext context, Input input) +{ + // Check if migration is needed + if (input.ShouldMigrate) + { + // Start V2 orchestration with current state + var result = await context.CreateSubOrchestrationInstance( + "OrderOrchestration", + "2.0", + input); + return result; + } + + // Continue with V1 logic + var a = await context.ScheduleTask(typeof(ActivityA), input); + return new Result { Data = a }; +} +``` + +### Strategy 5: Worker-Level Version Filtering + +Configure workers to only process orchestrations matching specific version criteria using `VersioningSettings`. This enables zero-downtime deployments by running multiple worker versions simultaneously. + +#### Setting Up VersioningSettings + +```csharp +using DurableTask.Core.Settings; + +var versioningSettings = new VersioningSettings +{ + Version = "2.0", + MatchStrategy = VersioningSettings.VersionMatchStrategy.CurrentOrOlder, + FailureStrategy = VersioningSettings.VersionFailureStrategy.Reject +}; + +var worker = new TaskHubWorker(orchestrationService, versioningSettings, loggerFactory); +``` + +> [!IMPORTANT] +> The `Version` property serves two purposes: +> +> 1. It defines which orchestrations this worker will process (based on `MatchStrategy`) +> 2. 
It becomes the **default version** for all new orchestrations created without an explicit version + +This means when you start a new orchestration without specifying a version, it will automatically be stamped with the worker's configured version: + +```csharp +// This orchestration will be created with version "2.0" (from VersioningSettings) +var instance = await client.CreateOrchestrationInstanceAsync( + typeof(OrderOrchestration), + input); +``` + +#### Version Match Strategies + +| Strategy | Description | +| -------- | ----------- | +| `None` | Default. Ignore version, process all orchestrations. | +| `Strict` | Only process orchestrations with an **exact** version match. | +| `CurrentOrOlder` | Process orchestrations with version **less than or equal** to the worker version. | + +#### Version Failure Strategies + +| Strategy | Description | +| -------- | ----------- | +| `Reject` | Default. Abandon the work item so another worker can pick it up (or retry later). | +| `Fail` | Fail the orchestration with a `VersionMismatch` error. 
| + +#### Blue-Green Deployment Example + +Run old and new workers simultaneously during deployments: + +```csharp +// OLD worker (handles existing orchestrations) +var oldSettings = new VersioningSettings +{ + Version = "1.0", + MatchStrategy = VersioningSettings.VersionMatchStrategy.Strict, + FailureStrategy = VersioningSettings.VersionFailureStrategy.Reject +}; +var oldWorker = new TaskHubWorker(orchestrationService, oldSettings, loggerFactory); +oldWorker.AddTaskOrchestrations(typeof(OrderOrchestrationV1)); + +// NEW worker (handles new orchestrations) +var newSettings = new VersioningSettings +{ + Version = "2.0", + MatchStrategy = VersioningSettings.VersionMatchStrategy.Strict, + FailureStrategy = VersioningSettings.VersionFailureStrategy.Reject +}; +var newWorker = new TaskHubWorker(orchestrationService, newSettings, loggerFactory); +newWorker.AddTaskOrchestrations(typeof(OrderOrchestrationV2)); + +// Both workers run simultaneously +// - V1 orchestrations are processed by oldWorker +// - V2 orchestrations are processed by newWorker +// Once all V1 orchestrations complete, retire oldWorker +``` + +#### Version Comparison + +Versions are compared using the following rules: + +1. Empty versions are treated as "unversioned" and compare as less than any defined version +2. If both versions can be parsed as `System.Version` (e.g., "1.0.0", "2.1"), numeric comparison is used +3. 
Otherwise, case-insensitive string comparison is used + +```csharp +// Version comparison examples +VersioningSettings.CompareVersions("1.0.0", "1.0.0"); // Returns 0 (equal) +VersioningSettings.CompareVersions("2.0.0", "1.0.0"); // Returns 1 (greater) +VersioningSettings.CompareVersions("1.0.0", "2.0.0"); // Returns -1 (less) +VersioningSettings.CompareVersions("", "1.0.0"); // Returns -1 (empty < defined) +``` + +#### Accessing Version in Orchestrations + +The orchestration version is available via `OrchestrationContext.Version`: + +```csharp +public override async Task RunTask(OrchestrationContext context, Input input) +{ + // Access the version this orchestration was started with + string version = context.Version; + + if (!context.IsReplaying) + { + _logger.LogInformation("Processing orchestration version: {Version}", version); + } + + // Use version for conditional logic (CompareVersions handles "2.0", "2.1", "3.0", etc.) + if (VersioningSettings.CompareVersions(version, "2.0") >= 0) + { + // V2+ specific logic + } + + // ... 
+}
+```
+
+## Safe Code Changes
+
+### Changes That Are Safe
+
+✅ **Adding activities at the end** (after all existing durable operations):
+
+```csharp
+// Safe - existing orchestrations completed or haven't reached this point
+var a = await context.ScheduleTask<string>(typeof(ActivityA), input);
+var b = await context.ScheduleTask<string>(typeof(ActivityB), a);
+var c = await context.ScheduleTask<string>(typeof(ActivityC), b); // Added at end
+```
+
+✅ **Changing activity implementation** (not the orchestration code):
+
+```csharp
+// Safe - activity logic doesn't affect replay
+public class ActivityA : TaskActivity<string, string>
+{
+    protected override string Execute(TaskContext context, string input)
+    {
+        return input.ToUpper(); // Changed from ToLower()
+    }
+}
+```
+
+✅ **Adding logging or metrics** (using IsReplaying):
+
+```csharp
+if (!context.IsReplaying)
+{
+    _logger.LogInformation("Processing..."); // Safe to add
+}
+```
+
+✅ **Changing non-durable code**:
+
+```csharp
+var formatted = input.Trim().ToLower(); // Safe to change
+var result = await context.ScheduleTask<string>(typeof(MyActivity), formatted);
+```
+
+### Changes That Are NOT Safe
+
+❌ **Removing or reordering activities**:
+
+```csharp
+// V1
+var a = await context.ScheduleTask<string>(typeof(ActivityA), input);
+var b = await context.ScheduleTask<string>(typeof(ActivityB), a);
+
+// V2 - BREAKS replay
+var b = await context.ScheduleTask<string>(typeof(ActivityB), input);
+var a = await context.ScheduleTask<string>(typeof(ActivityA), b);
+```
+
+❌ **Changing activity types**:
+
+```csharp
+// V1
+await context.ScheduleTask<string>(typeof(ActivityA), input);
+
+// V2 - BREAKS replay (different activity name)
+await context.ScheduleTask<string>(typeof(ActivityANew), input);
+```
+
+❌ **Changing conditional logic that affects scheduling**:
+
+```csharp
+// V1
+if (input.Amount > 100)
+    await context.ScheduleTask<string>(typeof(LargeOrderActivity), input);
+
+// V2 - BREAKS replay (different threshold)
+if (input.Amount > 50) // Changed condition!
+    await context.ScheduleTask<string>(typeof(LargeOrderActivity), input);
+```
+
+❌ **Adding activities in the middle**:
+
+```csharp
+// V1
+var a = await context.ScheduleTask<string>(typeof(ActivityA), input);
+var c = await context.ScheduleTask<string>(typeof(ActivityC), a);
+
+// V2 - BREAKS replay
+var a = await context.ScheduleTask<string>(typeof(ActivityA), input);
+var b = await context.ScheduleTask<string>(typeof(ActivityB), a); // Added in middle!
+var c = await context.ScheduleTask<string>(typeof(ActivityC), b);
+```
+
+❌ **Changing retry policies**:
+
+```csharp
+// V1
+var options = new RetryOptions(TimeSpan.FromSeconds(5), maxNumberOfAttempts: 3);
+await context.ScheduleWithRetry<string>(typeof(ActivityA), options, input);
+
+// V2 - BREAKS replay (different retry behavior recorded in history)
+var options = new RetryOptions(TimeSpan.FromSeconds(10), maxNumberOfAttempts: 5);
+await context.ScheduleWithRetry<string>(typeof(ActivityA), options, input);
+```
+
+## Orchestration Name Registration
+
+### Custom Naming
+
+By default, orchestrations are registered using their class name. Use `NameValueObjectCreator` to specify a custom name:
+
+```csharp
+public class OrderOrchestration : TaskOrchestration { }
+
+// Register with custom name "OrderProcessing" instead of class name
+worker.AddTaskOrchestrations(
+    new NameValueObjectCreator<TaskOrchestration>(
+        "OrderProcessing", "", typeof(OrderOrchestration)));
+```
+
+### Side-by-Side Registration
+
+Use `NameValueObjectCreator` to register multiple versions of the same orchestration:
+
+```csharp
+public class OrderOrchestrationV1 : TaskOrchestration { /* V1 impl */ }
+
+public class OrderOrchestrationV2 : TaskOrchestration { /* V2 impl */ }
+```
+
+### Registration
+
+```csharp
+worker.AddTaskOrchestrations(
+    new NameValueObjectCreator<TaskOrchestration>(
+        "OrderProcessing",
+        "V1",
+        typeof(OrderOrchestrationV1)),
+    new NameValueObjectCreator<TaskOrchestration>(
+        "OrderProcessing",
+        "V2",
+        typeof(OrderOrchestrationV2)));
+```
+
+## Best Practices
+
+### 1.
Plan for Versioning from the Start
+
+```csharp
+public class OrderInput
+{
+    public int Version { get; set; } = 1; // Include version in input
+    public string OrderId { get; set; }
+    // ...
+}
+```
+
+### 2. Use Feature Flags for Gradual Rollout
+
+```csharp
+public override async Task<Result> RunTask(OrchestrationContext context, Input input)
+{
+    if (input.Features.UseNewPaymentFlow)
+    {
+        return await NewPaymentFlowAsync(context, input);
+    }
+    return await LegacyPaymentFlowAsync(context, input);
+}
+```
+
+### 3. Keep Orchestrations Short-Lived When Possible
+
+Long-running orchestrations are harder to version. Consider:
+
+- Breaking into sub-orchestrations
+- Using `ContinueAsNew` more frequently
+- Designing for completion within hours/days, not months
+
+### 4. Document Breaking Changes
+
+```csharp
+/// <summary>
+/// Order processing orchestration.
+///
+/// Version History:
+/// - V1: Initial version
+/// - V2: Added fraud check activity (BREAKING - wait for V1 completion)
+/// - V2.1: Updated logging (compatible with V2)
+/// </summary>
+public class OrderOrchestrationV2_1 : TaskOrchestration { }
+
+// Register with name and version
+worker.AddTaskOrchestrations(
+    new NameValueObjectCreator<TaskOrchestration>(
+        "OrderProcessing", "V2.1", typeof(OrderOrchestrationV2_1)));
+```
+
+## Next Steps
+
+- [Replay and Durability](../concepts/replay-and-durability.md) - Understanding why versioning matters
+- [Deterministic Constraints](../concepts/deterministic-constraints.md) - Writing safe orchestration code
+- [Error Handling](error-handling.md) - Handling version mismatch errors
diff --git a/docs/getting-started/choosing-a-backend.md b/docs/getting-started/choosing-a-backend.md
new file mode 100644
index 000000000..cff2cfa3a
--- /dev/null
+++ b/docs/getting-started/choosing-a-backend.md
@@ -0,0 +1,166 @@
+# Choosing a Backend
+
+The Durable Task Framework (DTFx) supports multiple backend storage providers. This guide helps you choose the right one for your needs.
+
+## Recommendation: Durable Task Scheduler
+
+For most new projects, we recommend the **[Durable Task Scheduler](../providers/durable-task-scheduler.md)**, a fully managed Azure service that eliminates infrastructure management and provides the best developer experience.
+
+## Provider Comparison
+
+| Feature | [Durable Task Scheduler](../providers/durable-task-scheduler.md) | [Azure Storage](../providers/azure-storage.md) | [MSSQL](../providers/mssql.md) | [Service Bus](../providers/service-bus.md) | [Service Fabric](../providers/service-fabric.md) | [Emulator](../providers/emulator.md) |
+| ------- | ---------------------- | ------------- | ----- | ----------- | -------------- | -------- |
+| **Type** | ⭐ Managed service | Self-managed | Self-managed | Self-managed | Self-managed | In-memory |
+| **Production ready** | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |
+| **Azure support SLA** | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ No | ❌ No |
+| **Infrastructure** | None required | Storage account | SQL Server database | Service Bus namespace | Service Fabric cluster | None |
+| **Throughput** | Very high | Moderate+ | Moderate+ | Moderate | Unknown | N/A |
+| **Latency** | Low | Moderate | Low | Moderate+ | Unknown | Very low |
+| **Built-in dashboard** | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ No | ❌ No |
+| **Managed identity** | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | N/A | N/A |
+| **Local emulator** | ✅ Docker | N/A | ✅ SQL Server | N/A | N/A | ✅ Built-in |
+| **Cost model** | Fixed monthly or per-operation | Storage transactions | Database DTUs/vCores | Messaging units | Cluster nodes | Free |
+
+## When to Use Each Provider
+
+### Durable Task Scheduler ⭐ Recommended
+
+**Best for:**
+
+- ✅ New projects and greenfield development
+- ✅ Production workloads requiring enterprise support
+- ✅ Teams that want to minimize operational overhead
+- ✅ High-throughput scenarios
+- ✅ Applications needing built-in monitoring
+
+**Considerations:**
+
+-
Requires Azure subscription
+- Cost based on operations (see [pricing](https://learn.microsoft.com/azure/azure-functions/durable/durable-task-scheduler/durable-task-scheduler-dedicated-sku))
+
+👉 **[Get started with Durable Task Scheduler](../providers/durable-task-scheduler.md)**
+
+---
+
+### Azure Storage
+
+**Best for:**
+
+- ✅ Existing Azure Storage deployments
+- ✅ Cost-sensitive workloads with moderate throughput
+- ✅ Scenarios requiring data residency control
+- ✅ Teams already managing Azure Storage infrastructure
+
+**Considerations:**
+
+- Throughput limited by Azure Storage transaction limits
+- Requires management of storage account, queues, tables, and blobs
+- No built-in monitoring dashboard
+
+👉 **[Get started with Azure Storage](../providers/azure-storage.md)**
+
+---
+
+### MSSQL (Microsoft SQL Server)
+
+**Best for:**
+
+- ✅ Non-Azure or hybrid deployments
+- ✅ Teams with existing SQL Server expertise
+- ✅ Scenarios requiring direct database queries against orchestration state
+- ✅ Environments with strict BCDR requirements
+
+**Considerations:**
+
+- Requires management of SQL Server database
+- State is stored in indexed tables with stored procedures for direct querying
+- Available for Azure SQL Database, SQL Server, or any compatible MSSQL database
+
+👉 **[Get started with MSSQL](https://github.com/microsoft/durabletask-mssql)**
+
+---
+
+### Service Bus
+
+**Best for:**
+
+- ✅ Existing Service Bus deployments
+- ✅ Lower-latency message delivery requirements
+
+**Considerations:**
+
+- Requires management of Service Bus namespace
+- Tracking store requires separate Azure Storage account
+
+👉 **[Get started with Service Bus](../providers/service-bus.md)**
+
+---
+
+### Service Fabric
+
+**Best for:**
+
+- ✅ Existing Service Fabric clusters
+- ✅ Integration with Service Fabric stateful services
+
+**Considerations:**
+
+- Requires Service Fabric cluster management
+- Tightly coupled to Service Fabric
ecosystem
+
+👉 **[Get started with Service Fabric](../providers/service-fabric.md)**
+
+---
+
+### Emulator
+
+**Best for:**
+
+- ✅ Local development and testing
+- ✅ Unit tests and integration tests
+- ✅ Learning and experimentation
+
+**Considerations:**
+
+- In-memory only; data is lost on restart
+- Not suitable for production use
+- Single-process only
+
+👉 **[Get started with Emulator](../providers/emulator.md)**
+
+---
+
+### Netherite ⚠️ Deprecated
+
+> **Warning:** Netherite is being deprecated and is not recommended for new projects.
+
+Netherite is an ultra-high performance backend developed by Microsoft Research that uses Azure Event Hubs and Azure Page Blobs with [FASTER](https://www.microsoft.com/research/project/faster/) database technology.
+
+**Considerations:**
+
+- ⚠️ Being deprecated; not recommended for new projects
+- More complex infrastructure requirements (Event Hubs + Azure Storage)
+- Consider migrating to Durable Task Scheduler for similar performance characteristics
+
+👉 **[Netherite GitHub Repository](https://github.com/microsoft/durabletask-netherite)**
+
+---
+
+## Migration Between Providers
+
+Each provider stores orchestration state differently, so migrating between providers requires:
+
+1. **Completing or terminating** all running orchestrations
+2. **Reconfiguring** the application with the new provider
+3. **Restarting** orchestrations from scratch
+
+There is no built-in state migration tool between providers.
+
+## Need Help Deciding?
+
+- For **enterprise support**, choose [Durable Task Scheduler](../providers/durable-task-scheduler.md)
+- For **non-Azure deployments**, choose [MSSQL](https://github.com/microsoft/durabletask-mssql)
+- For **lowest cost**, choose [Azure Storage](../providers/azure-storage.md)
+- For **local testing**, choose [Emulator](../providers/emulator.md)
+
+See [Support](../support.md) for more information about getting help.
diff --git a/docs/getting-started/installation.md b/docs/getting-started/installation.md new file mode 100644 index 000000000..d1e511090 --- /dev/null +++ b/docs/getting-started/installation.md @@ -0,0 +1,91 @@ +# Installation + +This guide covers installing the Durable Task Framework (DTFx) packages for your project. + +## Prerequisites + +- .NET 6.0 or later (.NET 10.0 is currently recommended) +- .NET Framework 4.7.2 or later (for .NET Framework projects) + +## NuGet Packages + +### Core Package + +All DTFx applications require the core package: + +```bash +dotnet add package Microsoft.Azure.DurableTask.Core +``` + +### Backend Providers + +Backend providers implement the storage and messaging layers for DTFx. You can choose one of several backend providers based on your needs. See [Choosing a Backend](choosing-a-backend.md) for guidance. + +#### Durable Task Scheduler (Recommended) + +For new projects, we recommend the fully managed [Durable Task Scheduler](../providers/durable-task-scheduler.md): + +```bash +dotnet add package Microsoft.DurableTask.AzureManagedBackend +``` + +#### Azure Storage + +For self-managed deployments using Azure Storage (queues, tables, blobs): + +```bash +dotnet add package Microsoft.Azure.DurableTask.AzureStorage +``` + +#### Azure Service Bus + +For deployments using Azure Service Bus: + +```bash +dotnet add package Microsoft.Azure.DurableTask.ServiceBus +``` + +#### Azure Service Fabric + +For Service Fabric applications: + +```bash +dotnet add package Microsoft.Azure.DurableTask.AzureServiceFabric +``` + +#### Emulator (Local Development) + +For local development and testing without external dependencies: + +```bash +dotnet add package Microsoft.Azure.DurableTask.Emulator +``` + +### Optional Packages + +#### Application Insights Integration + +For Application Insights telemetry: + +```bash +dotnet add package Microsoft.Azure.DurableTask.ApplicationInsights +``` + +## Package Versions + +All DTFx packages follow semantic 
versioning. We recommend using the latest stable versions:
+
+| Package | NuGet |
+| ------- | ----- |
+| DurableTask.Core | [![NuGet](https://img.shields.io/nuget/v/Microsoft.Azure.DurableTask.Core.svg)](https://www.nuget.org/packages/Microsoft.Azure.DurableTask.Core/) |
+| DurableTask.AzureManagedBackend | [![NuGet](https://img.shields.io/nuget/v/Microsoft.DurableTask.AzureManagedBackend.svg)](https://www.nuget.org/packages/Microsoft.DurableTask.AzureManagedBackend/) |
+| DurableTask.AzureStorage | [![NuGet](https://img.shields.io/nuget/v/Microsoft.Azure.DurableTask.AzureStorage.svg)](https://www.nuget.org/packages/Microsoft.Azure.DurableTask.AzureStorage/) |
+| DurableTask.ServiceBus | [![NuGet](https://img.shields.io/nuget/v/Microsoft.Azure.DurableTask.ServiceBus.svg)](https://www.nuget.org/packages/Microsoft.Azure.DurableTask.ServiceBus/) |
+| DurableTask.AzureServiceFabric | [![NuGet](https://img.shields.io/nuget/v/Microsoft.Azure.DurableTask.AzureServiceFabric.svg)](https://www.nuget.org/packages/Microsoft.Azure.DurableTask.AzureServiceFabric/) |
+| DurableTask.Emulator | [![NuGet](https://img.shields.io/nuget/v/Microsoft.Azure.DurableTask.Emulator.svg)](https://www.nuget.org/packages/Microsoft.Azure.DurableTask.Emulator/) |
+
+## Next Steps
+
+- [Quickstart](quickstart.md) - Create your first orchestration
+- [Choosing a Backend](choosing-a-backend.md) - Compare backend providers
+- [Core Concepts](../concepts/core-concepts.md) - Understand the architecture
diff --git a/docs/getting-started/quickstart.md b/docs/getting-started/quickstart.md
new file mode 100644
index 000000000..e4f74c959
--- /dev/null
+++ b/docs/getting-started/quickstart.md
@@ -0,0 +1,173 @@
+# Quickstart
+
+This guide walks you through creating your first Durable Task Framework (DTFx) orchestration.
+
+## Overview
+
+In this quickstart, you'll create:
+1. An **activity** that performs a simple greeting
+2. An **orchestration** that calls the activity
+3.
A **host** that runs the orchestration
+
+## Step 1: Create a New Project
+
+```bash
+dotnet new console -n HelloDurableTask
+cd HelloDurableTask
+```
+
+## Step 2: Install Packages
+
+For this quickstart, we'll use the in-memory emulator:
+
+```bash
+dotnet add package Microsoft.Azure.DurableTask.Core
+dotnet add package Microsoft.Azure.DurableTask.Emulator
+```
+
+> 💡 For production, see [Choosing a Backend](choosing-a-backend.md) to select an appropriate provider.
+
+## Step 3: Create an Activity
+
+Activities are the basic unit of work in DTFx. Create a file named `GreetActivity.cs`:
+
+```csharp
+using DurableTask.Core;
+
+// TaskActivity<TInput, TResult>: takes a name and returns a greeting
+public class GreetActivity : TaskActivity<string, string>
+{
+    protected override string Execute(TaskContext context, string name)
+    {
+        return $"Hello, {name}!";
+    }
+}
+```
+
+## Step 4: Create an Orchestration
+
+Orchestrations coordinate activities. Create a file named `GreetingOrchestration.cs`:
+
+```csharp
+using DurableTask.Core;
+
+// TaskOrchestration<TResult, TInput>: returns a string, takes a string
+public class GreetingOrchestration : TaskOrchestration<string, string>
+{
+    public override async Task<string> RunTask(OrchestrationContext context, string input)
+    {
+        // Call the GreetActivity
+        string greeting = await context.ScheduleTask<string>(typeof(GreetActivity), input);
+        return greeting;
+    }
+}
+```
+
+## Step 5: Create the Host
+
+Update `Program.cs` to create and run the orchestration:
+
+```csharp
+using DurableTask.Core;
+using DurableTask.Emulator;
+using Microsoft.Extensions.Logging;
+
+// Create logger factory for diagnostics
+using ILoggerFactory loggerFactory = LoggerFactory.Create(builder =>
+{
+    builder.AddConsole();
+    builder.SetMinimumLevel(LogLevel.Information);
+});
+
+// Create the in-memory orchestration service
+var service = new LocalOrchestrationService();
+
+// Create and configure the worker
+var worker = new TaskHubWorker(service, loggerFactory);
+worker.AddTaskOrchestrations(typeof(GreetingOrchestration));
+worker.AddTaskActivities(typeof(GreetActivity));
+
+// Start the worker
+await worker.StartAsync();
+Console.WriteLine("Worker started.");
+
+// Create a client to start orchestrations
+var client = new TaskHubClient(service, loggerFactory: loggerFactory);
+
+// Start a new orchestration instance
+var instance = await client.CreateOrchestrationInstanceAsync(
+    typeof(GreetingOrchestration),
+    "World");
+
+Console.WriteLine($"Started orchestration: {instance.InstanceId}");
+
+// Wait for completion
+var result = await client.WaitForOrchestrationAsync(
+    instance,
+    TimeSpan.FromSeconds(30));
+
+Console.WriteLine($"Result: {result.Output}");
+Console.WriteLine($"Status: {result.OrchestrationStatus}");
+
+// Stop the worker
+await worker.StopAsync();
+```
+
+## Step 6: Run the Application
+
+```bash
+dotnet run
+```
+
+Expected output:
+```
+Worker started.
+Started orchestration: <instance-id>
+Result: "Hello, World!"
+Status: Completed
+```
+
+## Understanding the Code
+
+### TaskActivity
+
+```csharp
+public class GreetActivity : TaskActivity<string, string>
+```
+
+- `TaskActivity<TInput, TResult>`: Base class for activities
+- Activities contain the actual work logic
+- They can be retried on failure when scheduled with a retry policy (for example, `ScheduleWithRetry`)
+
+### TaskOrchestration
+
+```csharp
+public class GreetingOrchestration : TaskOrchestration<string, string>
+```
+
+- `TaskOrchestration<TResult, TInput>`: Base class for orchestrations
+- Orchestrations coordinate activities and sub-orchestrations
+- They must be [deterministic](../concepts/deterministic-constraints.md)
+
+### OrchestrationContext
+
+```csharp
+await context.ScheduleTask<string>(typeof(GreetActivity), input);
+```
+
+- `OrchestrationContext` provides APIs for scheduling work
+- `ScheduleTask`: Schedule an activity
+- `CreateSubOrchestrationInstance`: Start a sub-orchestration
+- `CreateTimer`: Create a durable timer
+- `WaitForExternalEvent`: Wait for an external event
+
+### TaskHubWorker and TaskHubClient
+
+- `TaskHubWorker`: Hosts orchestrations and activities
+- `TaskHubClient`: Starts and manages orchestration instances
+
+## Next Steps
+
+- [Choosing a
Backend](choosing-a-backend.md) - Select a production-ready provider
+- [Core Concepts](../concepts/core-concepts.md) - Understand Task Hubs, Workers, and Clients
+- [Writing Orchestrations](../concepts/orchestrations.md) - Learn orchestration patterns
+- [Writing Activities](../concepts/activities.md) - Learn activity patterns
+- [Samples Catalog](../samples/catalog.md) - Explore more examples
diff --git a/docs/providers/README.md b/docs/providers/README.md
new file mode 100644
index 000000000..e9f200629
--- /dev/null
+++ b/docs/providers/README.md
@@ -0,0 +1,19 @@
+# Backend Providers
+
+The Durable Task Framework supports multiple backend storage providers. Choose based on your requirements for management overhead, throughput, and existing infrastructure.
+
+## Provider Comparison
+
+| Provider | Type | Best For |
+| -------- | ---- | -------- |
+| [Durable Task Scheduler](durable-task-scheduler.md) ⭐ | Managed | New projects, production workloads |
+| [Azure Storage](azure-storage.md) | Self-managed | Existing Azure Storage infrastructure |
+| [MSSQL](mssql.md) | Self-managed | SQL Server / Azure SQL infrastructure |
+| [Emulator](emulator.md) | In-memory | Local development and testing |
+| [Service Fabric](service-fabric.md) | Self-managed | Applications on Service Fabric clusters |
+| [Service Bus](service-bus.md) | Self-managed | Legacy (maintenance mode) |
+| [Custom Provider](custom-provider.md) | DIY | Specialized storage requirements |
+
+> ⭐ **Recommended:** For new projects, use the [Durable Task Scheduler](durable-task-scheduler.md), a fully managed Azure service with zero infrastructure management and built-in monitoring.
+
+See [Choosing a Backend](../getting-started/choosing-a-backend.md) for detailed guidance.
diff --git a/docs/providers/azure-storage.md b/docs/providers/azure-storage.md new file mode 100644 index 000000000..cc9960838 --- /dev/null +++ b/docs/providers/azure-storage.md @@ -0,0 +1,390 @@ +# Azure Storage Provider + +The Azure Storage provider uses Azure Storage queues, tables, and blobs to persist orchestration state. It's a self-managed option suitable for existing Azure Storage deployments. + +## When to Use Azure Storage + +βœ… **Good for:** + +- Existing Azure Storage infrastructure +- Cost-sensitive workloads with low-to-moderate throughput +- Internal Azure services in Ring-1 or lower + +⚠️ **Consider [Durable Task Scheduler](durable-task-scheduler.md) instead for:** + +- New projects +- Enterprise support requirements +- High-throughput scenarios +- Zero infrastructure management +- Internal Azure services in Ring-2 or higher + +## Installation + +```bash +dotnet add package Microsoft.Azure.DurableTask.AzureStorage +``` + +## Configuration + +### Basic Setup + +```csharp +using DurableTask.AzureStorage; +using DurableTask.Core; +using Microsoft.Extensions.Logging; + +using ILoggerFactory loggerFactory = LoggerFactory.Create(builder => +{ + builder.AddConsole(); + builder.SetMinimumLevel(LogLevel.Information); +}); + +var settings = new AzureStorageOrchestrationServiceSettings +{ + StorageConnectionString = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;", + TaskHubName = "MyTaskHub", + LoggerFactory = loggerFactory +}; + +var service = new AzureStorageOrchestrationService(settings); +await service.CreateIfNotExistsAsync(); + +var worker = new TaskHubWorker(service, loggerFactory); +var client = new TaskHubClient(service, loggerFactory: loggerFactory); +``` + +### Using Managed Identity + +```csharp +using Azure.Identity; +using DurableTask.AzureStorage; +using DurableTask.Core; +using Microsoft.Extensions.Logging; + +using ILoggerFactory loggerFactory = LoggerFactory.Create(builder => +{ + builder.AddConsole(); + 
builder.SetMinimumLevel(LogLevel.Information); +}); + +// Uses DefaultAzureCredential +var credential = new DefaultAzureCredential(); + +var settings = new AzureStorageOrchestrationServiceSettings +{ + TaskHubName = "MyTaskHub", + StorageAccountClientProvider = new StorageAccountClientProvider("mystorageaccount", credential), + LoggerFactory = loggerFactory +}; + +var service = new AzureStorageOrchestrationService(settings); +``` + +> [!TIP] +> For complete runnable examples using managed identity, see the [Managed Identity Samples](../../samples/ManagedIdentitySample/). + +## Configuration Options + +### Core Settings + +| Setting | Description | Default | +| ------- | ----------- | ------- | +| `TaskHubName` | Name of the task hub (alphanumeric, 3-45 chars) | Required | +| `StorageConnectionString` | Azure Storage connection string | Required* | +| `StorageAccountName` | Storage account name (for managed identity) | Required* | + +*Either `StorageConnectionString` or `StorageAccountName` with credentials is required. 
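+
+The naming rule in the table above is easy to check before creating a task hub. The following is a minimal sketch, assuming only the constraint stated in the table (alphanumeric, 3-45 characters); `IsValidTaskHubName` is a hypothetical helper, not part of `DurableTask.AzureStorage`, and the provider may apply additional rules.
+
+```csharp
+using System;
+using System.Linq;
+
+// Hypothetical helper (not a DTFx API): applies the rule from the Core Settings
+// table, i.e. ASCII letters and digits only, between 3 and 45 characters long.
+static bool IsValidTaskHubName(string name) =>
+    !string.IsNullOrEmpty(name)
+    && name.Length >= 3
+    && name.Length <= 45
+    && name.All(c => (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9'));
+
+Console.WriteLine(IsValidTaskHubName("MyTaskHub"));   // True
+Console.WriteLine(IsValidTaskHubName("my-task-hub")); // False: hyphens are not alphanumeric
+Console.WriteLine(IsValidTaskHubName("ab"));          // False: shorter than 3 characters
+```
+
+Running a check like this at startup surfaces a bad `TaskHubName` as a configuration error rather than as a storage failure later on.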
+ +### Performance Settings + +| Setting | Description | Default | +| ------- | ----------- | ------- | +| `PartitionCount` | Number of control queue partitions (1-16) | 4 | +| `ControlQueueBufferThreshold` | Max messages prefetched and buffered per partition | 64 | +| `MaxConcurrentTaskOrchestrationWorkItems` | Max concurrent orchestrations | 100 | +| `MaxConcurrentTaskActivityWorkItems` | Max concurrent activities | 10 | + +### Partition Management + +| Setting | Description | Default | +| ------- | ----------- | ------- | +| `UseTablePartitionManagement` | Use table-based partition management (recommended) | `true` | +| `UseLegacyPartitionManagement` | Use legacy blob-based partition management | `false` | +| `LeaseRenewInterval` | Interval for renewing partition leases | 10 seconds | +| `LeaseInterval` | Lease duration before expiration | 30 seconds | +| `LeaseAcquireInterval` | Interval for checking partition balance | 10 seconds | + +> [!NOTE] +> Table-based partition management (`UseTablePartitionManagement = true`) is the default and recommended option. It provides better reliability for partition distribution and uses a `{taskhub}Partitions` table instead of blob leases. It's also significantly less expensive in terms of Azure Storage operations. 
+ +### Example Configuration + +```csharp +var settings = new AzureStorageOrchestrationServiceSettings +{ + StorageConnectionString = connectionString, + TaskHubName = "MyTaskHub", + + // Performance tuning + PartitionCount = 8, + ControlQueueBufferThreshold = 128, + MaxConcurrentTaskOrchestrationWorkItems = 200, + MaxConcurrentTaskActivityWorkItems = 200, + + // Lease settings + LeaseInterval = TimeSpan.FromSeconds(30), + LeaseRenewInterval = TimeSpan.FromSeconds(10) +}; +``` + +## Architecture + +### Storage Resources + +The Azure Storage provider creates these resources: + +| Resource Type | Name Pattern | Purpose | +| ------------- | ------------ | ------- | +| **Control Queues** | `{taskhub}-control-{0..N}` | Orchestration messages | +| **Work Item Queue** | `{taskhub}-workitems` | Activity messages | +| **History Table** | `{taskhub}History` | Orchestration history | +| **Instances Table** | `{taskhub}Instances` | Instance metadata | +| **Partitions Table** | `{taskhub}Partitions` | Partition leases (table manager) | +| **Lease Blobs** | `{taskhub}-leases/` | Partition leases (blob manager) | + +### Partitioning + +The Azure Storage provider uses **partitions** to distribute orchestration workloads across workers. Each partition corresponds to exactly one **control queue**. 
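+
+To make the one-partition-per-control-queue relationship concrete, the sketch below lists the control queue names implied by the `{taskhub}-control-{0..N}` pattern in the table above. It is illustrative only: `ControlQueueNames` is a hypothetical helper, the two-digit index follows the `control-00` labels used elsewhere in this document, and the lowercasing reflects the Azure Storage requirement that queue names be lowercase. The names the provider actually creates may differ.
+
+```csharp
+using System;
+using System.Linq;
+
+// Hypothetical helper (not a DTFx API): one control queue name per partition,
+// following the "{taskhub}-control-{index}" pattern with a two-digit index.
+static string[] ControlQueueNames(string taskHub, int partitionCount) =>
+    Enumerable.Range(0, partitionCount)
+        .Select(i => $"{taskHub.ToLowerInvariant()}-control-{i:D2}")
+        .ToArray();
+
+// With the default PartitionCount of 4 this prints four names,
+// from mytaskhub-control-00 through mytaskhub-control-03.
+foreach (string queue in ControlQueueNames("MyTaskHub", 4))
+{
+    Console.WriteLine(queue);
+}
+```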
+
+#### How Partitioning Works
+
+- **Orchestrations and entities** are assigned to partitions by hashing the instance ID
+- Instance IDs are random GUIDs by default, ensuring even distribution across partitions
+- A single orchestration instance is always processed by one partition (and therefore one worker) at a time
+- The `PartitionCount` setting (1–16, default 4) determines how many control queues are created
+
+#### Queue Architecture
+
+The task hub uses two types of queues:
+
+| Queue Type | Count | Purpose | Processing |
+| ---------- | ----- | ------- | ---------- |
+| **Control queues** | `PartitionCount` | Orchestration lifecycle messages | Partitioned: each queue owned by one worker |
+| **Work item queue** | 1 | Activity function messages | Shared: all workers compete for messages |
+
+```text
+┌─────────────────────────────────────────────────────────────────────┐
+│                              Task Hub                               │
+│                                                                     │
+│  CONTROL QUEUES (partitioned)          WORK ITEM QUEUE (shared)     │
+│  ┌─────────────────────────────────┐   ┌─────────────────────────┐  │
+│  │ control-00 │ control-01 │  ...  │   │ workitems               │  │
+│  │ (Worker A) │ (Worker B) │       │   │                         │  │
+│  │            │            │       │   │ All workers compete     │  │
+│  │ • Start    │ • Start    │       │   │ for activity messages   │  │
+│  │ • Timer    │ • Timer    │       │   │                         │  │
+│  │ • Activity │ • Activity │       │   └─────────────────────────┘  │
+│  │   complete │   complete │       │                ▲               │
+│  │ • External │ • External │       │                │               │
+│  │   event    │   event    │       │   Activities scheduled by      │
+│  └────────────┴────────────┴───────┘   orchestrators go here        │
+│        │            │                                               │
+│        ▼            ▼                                               │
+│  ┌─────────────────────────────────┐                                │
+│  │     ORCHESTRATION INSTANCES     │                                │
+│  │  Hash(InstanceID) → Partition   │                                │
+│  └─────────────────────────────────┘                                │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
+#### Control Queue Messages
+
+Control queues contain orchestration lifecycle messages:
+
+- **ExecutionStarted**: New orchestration started
+- **TaskCompleted**: Activity function completed
+- **TimerFired**: Durable timer expired
+- **EventRaised**: External event received
+- **SubOrchestrationCompleted**: Child orchestration completed
+
+When messages are dequeued, up to 32 messages are fetched in a single poll. Messages for the same instance are batched together for efficient processing.
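+
+The partition assignment and per-instance batching described above can be sketched as follows. This is a simplified illustration, not the provider's implementation: the FNV-1a-style hash is an assumption chosen because it is stable across processes (the hash the provider actually uses may differ), `PartitionFor` is a hypothetical helper, and the messages are plain tuples rather than real queue messages.
+
+```csharp
+using System;
+using System.Collections.Generic;
+using System.Linq;
+
+// Assumed hash: FNV-1a style over UTF-16 code units, used here only because it
+// gives the same result in every process. The provider's hash may differ.
+static uint StableHash(string text)
+{
+    uint hash = 2166136261;
+    foreach (char c in text)
+    {
+        hash ^= c;
+        hash *= 16777619;
+    }
+    return hash;
+}
+
+// Hypothetical helper: maps an instance ID to a partition (control queue) index.
+static int PartitionFor(string instanceId, int partitionCount) =>
+    (int)(StableHash(instanceId) % (uint)partitionCount);
+
+// Messages returned by a single poll of one control queue (at most 32 in practice).
+var polled = new List<(string InstanceId, string EventType)>
+{
+    ("order-1", "ExecutionStarted"),
+    ("order-2", "ExecutionStarted"),
+    ("order-1", "TaskCompleted"),
+    ("order-1", "TimerFired"),
+};
+
+// Group by instance ID so each orchestration is replayed once per batch.
+foreach (var batch in polled.GroupBy(m => m.InstanceId))
+{
+    string events = string.Join(", ", batch.Select(m => m.EventType));
+    Console.WriteLine($"{batch.Key} (partition {PartitionFor(batch.Key, 4)}): {events}");
+}
+```
+
+Because the hash depends only on the instance ID, every message for a given instance lands on the same control queue, which is what lets a single worker process that instance without coordinating with others.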
+
+#### Work Item Queue
+
+The work item queue is a simple, non-partitioned queue for activity function messages:
+
+- All workers compete to dequeue activity messages
+- Activities are **stateless**: any worker can execute any activity
+- Activity throughput scales with the number of workers and is not limited by the partition count
+
+#### Partition Count Guidance
+
+| Workload | Recommended `PartitionCount` |
+| -------- | ---------------------------- |
+| Development/testing | 1–2 |
+| Low-to-moderate throughput | 4 (default) |
+| High throughput | 8–16 |
+
+> [!IMPORTANT]
+> Partition count **cannot be changed** after task hub creation. Set it high enough to accommodate future scale-out needs. The maximum number of workers that can process orchestrations concurrently equals the partition count. Note that higher partition counts increase Azure Storage costs due to more queue and table operations.
+
+### Lease Management
+
+Workers compete for partition ownership using one of two partition managers:
+
+#### Table Partition Manager (Default)
+
+When `UseTablePartitionManagement = true` (default):
+
+- Partition leases are stored in the `{taskhub}Partitions` table
+- Uses Azure Table ETags for concurrency control
+- Provides better reliability due to transactional updates
+
+#### Blob Partition Manager (Legacy)
+
+When `UseTablePartitionManagement = false`:
+
+- Partition leases are stored as blobs in `{taskhub}-leases/`
+- Uses Azure Blob leases for concurrency control
+- Available in "safe" (`UseLegacyPartitionManagement = false`) and "legacy" (`UseLegacyPartitionManagement = true`) variants
+
+#### Partition Lifecycle
+
+1. Workers acquire leases to claim partition ownership
+2. Leases are renewed at `LeaseRenewInterval` (default 10s)
+3. Leases expire after `LeaseInterval` (default 30s) if not renewed
+4. Partitions are automatically balanced across workers
+
+### Message Processing
+
+1. **Prefetching**: Messages are prefetched from control queues in batches
+2. **Batching**: Messages for the same instance are grouped together
+3. **History fetch**: Orchestration history is loaded from Table Storage
+4. **Processing**: Orchestration code runs with the loaded history
+5. **Checkpoint**: New history and messages are appended
+
+### Checkpoint Order
+
+Checkpoints are written in this order to ensure that no data is lost if there is a failure:
+
+1. New messages → Storage queues
+2. New history → Table storage
+3. Delete processed messages
+
+Because the checkpoint steps are not atomic, a failure between steps can produce duplicates. The replay model handles duplicate history gracefully. Duplicate messages, however, may cause an activity to execute more than once after an unexpected failure.
+
+## Scaling
+
+### Horizontal Scaling
+
+Multiple workers can connect to the same task hub:
+
+```csharp
+// Worker 1, 2, 3... all connect to same task hub
+var service = new AzureStorageOrchestrationService(settings);
+var worker = new TaskHubWorker(service, loggerFactory);
+await worker.StartAsync();
+```
+
+Partitions are automatically distributed across workers.
+
+### Partition Count
+
+For high-throughput scenarios, increase partition count:
+
+```csharp
+var settings = new AzureStorageOrchestrationServiceSettings
+{
+    PartitionCount = 16 // More partitions = more parallelism
+};
+```
+
+> [!WARNING]
+> Partition count cannot be changed after task hub creation.
+
+## Operations
+
+### Create Task Hub
+
+```csharp
+await service.CreateIfNotExistsAsync();
+```
+
+### Delete Task Hub
+
+```csharp
+await service.DeleteAsync();
+```
+
+### Purge History
+
+```csharp
+await service.PurgeOrchestrationHistoryAsync(
+    DateTime.UtcNow.AddDays(-30), // Older than 30 days
+    OrchestrationStateTimeRangeFilterType.OrchestrationCompletedTimeFilter);
+```
+
+> [!WARNING]
+> Purging history is very expensive and may take hours for large task hubs. It's recommended to purge history frequently and in smaller time ranges.
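+
+One way to follow the "smaller time ranges" advice is to advance the purge threshold in small steps instead of issuing one large purge. The sketch below assumes a 90-day starting point and a one-day step, both of which are illustrative values; it uses only the `PurgeOrchestrationHistoryAsync` call shown above.
+
+```csharp
+using System;
+using System.Threading.Tasks;
+using DurableTask.Core;
+
+public static class PurgeSketch
+{
+    // Moves the purge threshold forward one day at a time so that each call
+    // (after the first) only has roughly one day of completed instances to delete.
+    // Assumption: the 90-day start, 30-day retention, and 1-day step are illustrative.
+    public static async Task PurgeIncrementallyAsync(IOrchestrationServiceClient service)
+    {
+        DateTime now = DateTime.UtcNow;
+        DateTime firstThreshold = now.AddDays(-90);
+        DateTime lastThreshold = now.AddDays(-30);
+
+        for (DateTime threshold = firstThreshold;
+             threshold <= lastThreshold;
+             threshold = threshold.AddDays(1))
+        {
+            await service.PurgeOrchestrationHistoryAsync(
+                threshold,
+                OrchestrationStateTimeRangeFilterType.OrchestrationCompletedTimeFilter);
+        }
+    }
+}
+```
+
+The first call still removes everything older than the starting threshold, so the benefit appears only when this runs on a regular schedule and the backlog stays small.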
+
+## Monitoring
+
+### Azure Storage Metrics
+
+Monitor these metrics in Azure portal:
+
+- Queue message count
+- Table transactions
+- Blob lease operations
+
+## Logging
+
+The Azure Storage provider supports structured logging via `Microsoft.Extensions.Logging`.
+
+### Enabling Logging
+
+```csharp
+using Microsoft.Extensions.Logging;
+
+// Create a logger factory
+ILoggerFactory loggerFactory = LoggerFactory.Create(builder =>
+{
+    builder.AddConsole();
+    builder.AddFilter("DurableTask.AzureStorage", LogLevel.Information);
+    builder.AddFilter("DurableTask.Core", LogLevel.Information);
+});
+
+// Pass logger factory to settings
+var settings = new AzureStorageOrchestrationServiceSettings
+{
+    StorageConnectionString = connectionString,
+    TaskHubName = "MyTaskHub",
+    LoggerFactory = loggerFactory
+};
+
+var service = new AzureStorageOrchestrationService(settings);
+```
+
+### Log Categories
+
+| Category | Description |
+| -------- | ----------- |
+| `DurableTask.AzureStorage` | Azure Storage-specific operations (messages, queues, tables) |
+| `DurableTask.Core` | Core framework operations (orchestrations, activities, dispatchers) |
+
+### Example Log Events
+
+- `SendingMessage` / `ReceivedMessage`: Queue message operations
+- `FetchedInstanceHistory`: History table reads
+- `PoisonMessageDetected`: Unprocessable messages
+- `PartitionManagerInfo` / `PartitionManagerWarning`: Partition management
+
+### ETW Event Source
+
+Events are also published to Event Tracing for Windows (ETW) via the `DurableTask-AzureStorage` event source (GUID: `{4C4AD4A2-F396-5E18-01B6-618C12A10433}`).
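+
+If you want to observe these events in-process rather than through an ETW session, a .NET `EventListener` is one option. This is a minimal sketch that assumes the provider publishes its events through a `System.Diagnostics.Tracing.EventSource` with the name given above; `DurableTaskEventListener` is a hypothetical class name.
+
+```csharp
+using System;
+using System.Diagnostics.Tracing;
+
+// Writes events from the DurableTask-AzureStorage event source to the console.
+public sealed class DurableTaskEventListener : EventListener
+{
+    protected override void OnEventSourceCreated(EventSource eventSource)
+    {
+        // Subscribe only to the Azure Storage provider's event source.
+        if (eventSource.Name == "DurableTask-AzureStorage")
+        {
+            EnableEvents(eventSource, EventLevel.Informational);
+        }
+    }
+
+    protected override void OnEventWritten(EventWrittenEventArgs eventData)
+    {
+        Console.WriteLine($"[{eventData.Level}] {eventData.EventName}");
+    }
+}
+```
+
+Create the listener once at startup (for example, `using var listener = new DurableTaskEventListener();`) and keep it alive for as long as events should be captured.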
+
+## Next Steps
+
+- [Choosing a Backend](../getting-started/choosing-a-backend.md) - Compare all providers
+- [Durable Task Scheduler](durable-task-scheduler.md) - Recommended managed alternative
+- [Core Concepts](../concepts/core-concepts.md) - Learn the fundamentals
diff --git a/docs/providers/custom-provider.md b/docs/providers/custom-provider.md
new file mode 100644
index 000000000..7b49636b5
--- /dev/null
+++ b/docs/providers/custom-provider.md
@@ -0,0 +1,269 @@
+# Custom Provider Implementation
+
+You can build a custom storage provider by implementing the `IOrchestrationService` interface. This allows you to use DTFx with any backend storage system.
+
+## When to Implement a Custom Provider
+
+✅ **Good for:**
+
+- Integrating with proprietary storage systems
+- Specialized requirements not met by existing providers
+- Research and experimentation
+
+⚠️ **Consider existing providers first:**
+
+- [Durable Task Scheduler](durable-task-scheduler.md) - Managed service
+- [Azure Storage](azure-storage.md) - Self-managed with Azure Storage
+- [Emulator](emulator.md) - Local development
+
+## Core Interfaces
+
+### IOrchestrationService
+
+The primary interface for storage providers:
+
+```csharp
+public interface IOrchestrationService
+{
+    // Lifecycle
+    Task StartAsync();
+    Task StopAsync();
+    Task StopAsync(bool isForced);
+    Task CreateAsync();
+    Task CreateAsync(bool recreateInstanceStore);
+    Task CreateIfNotExistsAsync();
+    Task DeleteAsync();
+    Task DeleteAsync(bool deleteInstanceStore);
+
+    // Orchestration dispatcher
+    int TaskOrchestrationDispatcherCount { get; }
+    int MaxConcurrentTaskOrchestrationWorkItems { get; }
+    BehaviorOnContinueAsNew EventBehaviourForContinueAsNew { get; }
+
+    bool IsMaxMessageCountExceeded(int currentMessageCount, OrchestrationRuntimeState runtimeState);
+    int GetDelayInSecondsAfterOnProcessException(Exception exception);
+    int GetDelayInSecondsAfterOnFetchException(Exception exception);
+
+    Task<TaskOrchestrationWorkItem> LockNextTaskOrchestrationWorkItemAsync(
+        TimeSpan receiveTimeout, CancellationToken cancellationToken);
+    Task RenewTaskOrchestrationWorkItemLockAsync(TaskOrchestrationWorkItem workItem);
+    Task CompleteTaskOrchestrationWorkItemAsync(
+        TaskOrchestrationWorkItem workItem,
+        OrchestrationRuntimeState newOrchestrationRuntimeState,
+        IList<TaskMessage> outboundMessages,
+        IList<TaskMessage> orchestratorMessages,
+        IList<TaskMessage> timerMessages,
+        TaskMessage continuedAsNewMessage,
+        OrchestrationState orchestrationState);
+    Task AbandonTaskOrchestrationWorkItemAsync(TaskOrchestrationWorkItem workItem);
+    Task ReleaseTaskOrchestrationWorkItemAsync(TaskOrchestrationWorkItem workItem);
+
+    // Activity dispatcher
+    int TaskActivityDispatcherCount { get; }
+    int MaxConcurrentTaskActivityWorkItems { get; }
+
+    Task<TaskActivityWorkItem> LockNextTaskActivityWorkItem(
+        TimeSpan receiveTimeout, CancellationToken cancellationToken);
+    Task<TaskActivityWorkItem> RenewTaskActivityWorkItemLockAsync(TaskActivityWorkItem workItem);
+    Task CompleteTaskActivityWorkItemAsync(TaskActivityWorkItem workItem, TaskMessage responseMessage);
+    Task AbandonTaskActivityWorkItemAsync(TaskActivityWorkItem workItem);
+}
+```
+
+### IOrchestrationServiceClient
+
+For client operations (starting, querying, managing instances):
+
+```csharp
+public interface IOrchestrationServiceClient
+{
+    Task CreateTaskOrchestrationAsync(TaskMessage creationMessage);
+    Task CreateTaskOrchestrationAsync(TaskMessage creationMessage, OrchestrationStatus[] dedupeStatuses);
+
+    Task SendTaskOrchestrationMessageAsync(TaskMessage message);
+    Task SendTaskOrchestrationMessageBatchAsync(params TaskMessage[] messages);
+
+    Task<OrchestrationState> WaitForOrchestrationAsync(
+        string instanceId,
+        string executionId,
+        TimeSpan timeout,
+        CancellationToken cancellationToken);
+
+    Task ForceTerminateTaskOrchestrationAsync(string instanceId, string reason);
+
+    Task<OrchestrationState> GetOrchestrationStateAsync(string instanceId, string executionId);
+    Task<IList<OrchestrationState>> GetOrchestrationStateAsync(string instanceId, bool allExecutions);
+
+    Task<string> GetOrchestrationHistoryAsync(string instanceId, string executionId);
+    Task PurgeOrchestrationHistoryAsync(
+        DateTime thresholdDateTimeUtc,
+        OrchestrationStateTimeRangeFilterType timeRangeFilterType);
+}
+```
+
+> [!NOTE]
+> Most providers implement both interfaces in a single class.
+
+## Minimal Implementation
+
+Here's a skeleton for a custom provider:
+
+```csharp
+public class MyCustomOrchestrationService : IOrchestrationService, IOrchestrationServiceClient
+{
+    private readonly MyStorageBackend _storage;
+
+    public MyCustomOrchestrationService(string connectionString)
+    {
+        _storage = new MyStorageBackend(connectionString);
+    }
+
+    // Lifecycle
+    public Task CreateAsync() => CreateIfNotExistsAsync();
+    public Task CreateAsync(bool recreateInstanceStore) => CreateIfNotExistsAsync();
+
+    public async Task CreateIfNotExistsAsync()
+    {
+        await _storage.InitializeAsync();
+    }
+
+    public async Task DeleteAsync()
+    {
+        await _storage.DeleteAllDataAsync();
+    }
+
+    public Task DeleteAsync(bool deleteInstanceStore) => DeleteAsync();
+
+    // Worker lifecycle
+    public Task StartAsync()
+    {
+        // Start background processes if needed
+        return Task.CompletedTask;
+    }
+
+    public Task StopAsync() => StopAsync(false);
+    public Task StopAsync(bool isForced)
+    {
+        // Stop background processes
+        return Task.CompletedTask;
+    }
+
+    // Work item polling
+    public async Task<TaskOrchestrationWorkItem> LockNextTaskOrchestrationWorkItemAsync(
+        TimeSpan receiveTimeout,
+        CancellationToken cancellationToken)
+    {
+        // Poll for orchestration messages
+        var message = await _storage.DequeueOrchestrationMessageAsync(receiveTimeout, cancellationToken);
+        if (message == null) return null;
+
+        // Load history (the instance ID lives on the message's OrchestrationInstance)
+        string instanceId = message.OrchestrationInstance.InstanceId;
+        var history = await _storage.LoadHistoryAsync(instanceId);
+
+        return new TaskOrchestrationWorkItem
+        {
+            InstanceId = instanceId,
+            NewMessages = new[] { message },
+            OrchestrationRuntimeState = new OrchestrationRuntimeState(history)
+        };
+    }
+
+    public async Task<TaskActivityWorkItem> LockNextTaskActivityWorkItem(
+        TimeSpan receiveTimeout,
+        CancellationToken cancellationToken)
+    {
+        var message = await _storage.DequeueActivityMessageAsync(receiveTimeout, cancellationToken);
+        if (message == null) return null;
+
+        return new TaskActivityWorkItem
+        {
+            Id = Guid.NewGuid().ToString(),
+            TaskMessage = message
+        };
+    }
+
+    // Orchestration completion
+    public async Task CompleteTaskOrchestrationWorkItemAsync(
+        TaskOrchestrationWorkItem workItem,
+        OrchestrationRuntimeState newState,
+        IList<TaskMessage> outboundMessages,
+        IList<TaskMessage> orchestratorMessages,
+        IList<TaskMessage> timerMessages,
+        TaskMessage continuedAsNewMessage,
+        OrchestrationState state)
+    {
+        // Save new history
+        await _storage.SaveHistoryAsync(workItem.InstanceId, newState.Events);
+
+        // Enqueue outbound messages (activities)
+        foreach (var msg in outboundMessages)
+        {
+            await _storage.EnqueueActivityMessageAsync(msg);
+        }
+
+        // Enqueue orchestrator messages (sub-orchestrations, events)
+        foreach (var msg in orchestratorMessages)
+        {
+            await _storage.EnqueueOrchestrationMessageAsync(msg);
+        }
+
+        // Handle timers
+        foreach (var msg in timerMessages)
+        {
+            await _storage.ScheduleTimerAsync(msg);
+        }
+
+        // Handle continue-as-new
+        if (continuedAsNewMessage != null)
+        {
+            await _storage.EnqueueOrchestrationMessageAsync(continuedAsNewMessage);
+        }
+
+        // Update instance status
+        await _storage.UpdateInstanceStateAsync(workItem.InstanceId, state);
+    }
+
+    // ... implement remaining interface methods
+
+    // Capabilities (the concurrency limits below are illustrative values; tune them for your backend)
+    public int TaskOrchestrationDispatcherCount => 1;
+    public int MaxConcurrentTaskOrchestrationWorkItems => 100;
+    public int TaskActivityDispatcherCount => 1;
+    public int MaxConcurrentTaskActivityWorkItems => 100;
+    public BehaviorOnContinueAsNew EventBehaviourForContinueAsNew =>
+        BehaviorOnContinueAsNew.Carryover;
+}
+```
+
+## Key Concepts
+
+### Work Items
+
+**TaskOrchestrationWorkItem**: Represents orchestration work to process
+
+- Contains one or more messages that triggered the orchestrator invocation (`ExecutionStartedEvent`, `TaskCompletedEvent`, etc.)
+- Contains the current orchestration state (the full history)
+- Must be completed or abandoned, and should always be released
+
+**TaskActivityWorkItem**: Represents activity work to execute
+
+- Contains a single activity task message (`TaskScheduledEvent`)
+- Must be completed or abandoned
+- Supports lock renewal for long-running activities
+
+### State Management
+
+Your provider must manage:
+
+1. **Message queues**: For orchestration and activity messages
+2. **History storage**: For orchestration event history
+3. **Instance metadata**: For querying orchestration status
+4. **Timer scheduling**: For durable timers
+
+How you implement these components is entirely up to you. Ideally, your storage backend provides atomic operations (as the Durable Task Scheduler and MSSQL providers do) to ensure consistency. If that's not possible (as with Azure Storage), your provider must tolerate the inconsistencies that a process crash can leave behind so that no data is lost.
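+
+To make the four responsibilities concrete, here is a rough in-memory sketch of the `MyStorageBackend` type that the skeleton calls into. Everything about it is an assumption for illustration: the class is hypothetical, it keeps all state in process memory (so nothing is durable or atomic), and it covers only the queue, history, metadata, and timer calls.
+
+```csharp
+using System;
+using System.Collections.Concurrent;
+using System.Collections.Generic;
+using System.Threading;
+using System.Threading.Channels;
+using System.Threading.Tasks;
+using DurableTask.Core;
+using DurableTask.Core.History;
+
+// Hypothetical, non-durable stand-in for a real storage backend.
+public class MyStorageBackend
+{
+    // 1. Message queues
+    private readonly Channel<TaskMessage> _orchestrationQueue = Channel.CreateUnbounded<TaskMessage>();
+    private readonly Channel<TaskMessage> _activityQueue = Channel.CreateUnbounded<TaskMessage>();
+
+    // 2. History storage, keyed by instance ID
+    private readonly ConcurrentDictionary<string, IList<HistoryEvent>> _history = new();
+
+    // 3. Instance metadata, keyed by instance ID
+    private readonly ConcurrentDictionary<string, OrchestrationState> _instances = new();
+
+    public Task EnqueueOrchestrationMessageAsync(TaskMessage message) =>
+        _orchestrationQueue.Writer.WriteAsync(message).AsTask();
+
+    public Task EnqueueActivityMessageAsync(TaskMessage message) =>
+        _activityQueue.Writer.WriteAsync(message).AsTask();
+
+    public Task<TaskMessage> DequeueOrchestrationMessageAsync(TimeSpan timeout, CancellationToken ct) =>
+        DequeueAsync(_orchestrationQueue, timeout, ct);
+
+    public Task<TaskMessage> DequeueActivityMessageAsync(TimeSpan timeout, CancellationToken ct) =>
+        DequeueAsync(_activityQueue, timeout, ct);
+
+    public Task<IList<HistoryEvent>> LoadHistoryAsync(string instanceId) =>
+        Task.FromResult(_history.TryGetValue(instanceId, out var events)
+            ? events
+            : new List<HistoryEvent>());
+
+    public Task SaveHistoryAsync(string instanceId, IList<HistoryEvent> events)
+    {
+        _history[instanceId] = new List<HistoryEvent>(events);
+        return Task.CompletedTask;
+    }
+
+    public Task UpdateInstanceStateAsync(string instanceId, OrchestrationState state)
+    {
+        _instances[instanceId] = state;
+        return Task.CompletedTask;
+    }
+
+    // 4. Timer scheduling: deliver the message once its fire time has passed.
+    // Limitation of this sketch: Task.Delay cannot wait longer than about 24 days.
+    public Task ScheduleTimerAsync(TaskMessage message)
+    {
+        TimeSpan delay = message.Event is TimerFiredEvent timer
+            ? timer.FireAt - DateTime.UtcNow
+            : TimeSpan.Zero;
+        if (delay < TimeSpan.Zero) delay = TimeSpan.Zero;
+
+        _ = Task.Delay(delay).ContinueWith(t => _orchestrationQueue.Writer.TryWrite(message));
+        return Task.CompletedTask;
+    }
+
+    // Returns null when no message arrives before the timeout or cancellation.
+    private static async Task<TaskMessage> DequeueAsync(
+        Channel<TaskMessage> queue, TimeSpan timeout, CancellationToken ct)
+    {
+        using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
+        cts.CancelAfter(timeout);
+        try { return await queue.Reader.ReadAsync(cts.Token); }
+        catch (OperationCanceledException) { return null; }
+    }
+}
+```
+
+A production backend would replace each in-memory structure with durable storage and decide how to keep the history write, message enqueue, and status update consistent with each other.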
+
+## Next Steps
+
+- [Azure Storage Provider](azure-storage.md) - Reference implementation
+- [Core Concepts](../concepts/core-concepts.md) - Understand the architecture
+- [Replay and Durability](../concepts/replay-and-durability.md) - Key concepts for providers
diff --git a/docs/providers/durable-task-scheduler.md b/docs/providers/durable-task-scheduler.md
new file mode 100644
index 000000000..731a17d80
--- /dev/null
+++ b/docs/providers/durable-task-scheduler.md
@@ -0,0 +1,220 @@
+# Durable Task Scheduler
+
+The **Durable Task Scheduler** is a fully managed Azure service purpose-built for running durable orchestrations. It provides the best experience for production workloads with zero infrastructure management and built-in enterprise support.
+
+> ⭐ **Recommended**: For new projects, we recommend the Durable Task Scheduler as your backend provider.
+
+## Overview
+
+| Feature | Benefit |
+| ------- | ------- |
+| **Fully Managed** | No storage accounts, databases, or infrastructure to manage |
+| **Built-in Dashboard** | Monitor orchestrations without additional tooling |
+| **Highest Throughput** | Purpose-built for durable workflow performance |
+| **Azure Support** | 24/7 enterprise support with SLA (with Azure support plan) |
+| **Managed Identity** | Secure authentication using Microsoft Entra ID (formerly Azure AD) |
+| **Local Emulator** | Docker-based emulator for development |
+
+For complete documentation on creating and configuring Azure resources, authentication, SKUs, RBAC, and pricing, see the [official Azure documentation](https://learn.microsoft.com/azure/azure-functions/durable/durable-task-scheduler/durable-task-scheduler).
+
+## Installation
+
+```bash
+dotnet add package Microsoft.DurableTask.AzureManagedBackend
+```
+
+## DTFx Code Sample
+
+```csharp
+using DurableTask.Core;
+using Microsoft.DurableTask.AzureManagedBackend;
+using Microsoft.Extensions.Logging;
+
+// Get connection string from environment
+// Expected format: "Endpoint=https://<scheduler-endpoint>;Authentication=<credential-type>;TaskHub=<task-hub-name>"
+string? connectionString = Environment.GetEnvironmentVariable("DTS_CONNECTION_STRING");
+if (string.IsNullOrWhiteSpace(connectionString))
+{
+    Console.Error.WriteLine("An environment variable named DTS_CONNECTION_STRING is required.");
+    return;
+}
+
+// Configure logging
+ILoggerFactory loggerFactory = LoggerFactory.Create(builder =>
+    builder.AddSimpleConsole(options =>
+    {
+        options.SingleLine = true;
+        options.UseUtcTimestamp = true;
+        options.TimestampFormat = "yyyy-MM-ddTHH:mm:ss.fffZ ";
+    }));
+
+// Create the orchestration service for Durable Task Scheduler
+AzureManagedOrchestrationService service = new(
+    AzureManagedOrchestrationServiceOptions.FromConnectionString(connectionString),
+    loggerFactory);
+
+// Create and configure the worker
+TaskHubWorker worker = new(service, loggerFactory);
+worker.AddTaskOrchestrations(typeof(HelloWorldOrchestration));
+worker.AddTaskActivities(typeof(HelloActivity));
+
+// Start the worker
+await worker.StartAsync();
+
+// Create a client and start an orchestration
+TaskHubClient client = new(service, null, loggerFactory);
+OrchestrationInstance instance = await client.CreateOrchestrationInstanceAsync(
+    orchestrationType: typeof(HelloWorldOrchestration),
+    input: null);
+
+Console.WriteLine($"Started orchestration with ID = '{instance.InstanceId}'");
+
+// Wait for completion
+OrchestrationState state = await client.WaitForOrchestrationAsync(
+    instance,
+    TimeSpan.FromMinutes(1));
+
+Console.WriteLine($"Orchestration completed with status: {state.OrchestrationStatus}");
+Console.WriteLine($"Output: {state.Output}");
+
+// Clean up
+await worker.StopAsync();
+service.Dispose();
+```
+
+## Connection String Format
+
+```text
+Endpoint=<scheduler-endpoint>;TaskHub=<task-hub-name>;Authentication=<credential-type>
+```
+
+See [Authentication Types](#authentication-types) for supported credential types.
+
+## Configuration Options
+
+The `AzureManagedOrchestrationServiceOptions` class provides configuration for the Durable Task Scheduler backend.
+
+### Creating Options
+
+**From connection string (recommended):**
+
+```csharp
+var options = AzureManagedOrchestrationServiceOptions.FromConnectionString(connectionString);
+```
+
+**Manual construction:**
+
+```csharp
+var options = new AzureManagedOrchestrationServiceOptions(
+    address: "https://myscheduler.westus3.durabletask.io",
+    credential: new DefaultAzureCredential());
+options.TaskHubName = "my-task-hub";
+```
+
+### Authentication Types
+
+The connection string `Authentication` property supports these credential types:
+
+| Value | Credential Type | Use Case |
+| ----- | --------------- | -------- |
+| `DefaultAzure` | `DefaultAzureCredential` | General purpose; tries multiple auth methods |
+| `ManagedIdentity` | `ManagedIdentityCredential` | Azure-hosted apps; add `ClientId` for user-assigned |
+| `WorkloadIdentity` | `WorkloadIdentityCredential` | Kubernetes, CI/CD pipelines, SPIFFE |
+| `Environment` | `EnvironmentCredential` | Container apps with env var credentials |
+| `AzureCLI` | `AzureCliCredential` | Local dev with `az login` |
+| `AzurePowerShell` | `AzurePowerShellCredential` | Local dev with `Connect-AzAccount` |
+| `VisualStudio` | `VisualStudioCredential` | Local dev from Visual Studio |
+| `InteractiveBrowser` | `InteractiveBrowserCredential` | Interactive scenarios (not for production) |
+| `None` | No authentication | Local emulator only |
+
+**User-assigned managed identity example:**
+
+```text
+Endpoint=https://myscheduler.westus3.durabletask.io;TaskHub=default;Authentication=ManagedIdentity;ClientId=00000000-0000-0000-0000-000000000000
+```
+
+### Concurrency Settings
+
+Control how many work items are processed in parallel:
+
+| Property | Default | Description |
+| -------- | ------- | ----------- |
+| `MaxConcurrentOrchestrationWorkItems` | `ProcessorCount * 10` | Max parallel orchestration executions |
+| `MaxConcurrentActivityWorkItems` | `ProcessorCount * 10` | Max parallel activity executions |
+
+```csharp
+var options = AzureManagedOrchestrationServiceOptions.FromConnectionString(connectionString);
+options.MaxConcurrentOrchestrationWorkItems = 50;
+options.MaxConcurrentActivityWorkItems = 100;
+```
+
+> [!TIP]
+> Increase activity concurrency for I/O-bound workloads. Reduce orchestration concurrency if orchestrations consume significant memory.
+
+### Large Payload Storage
+
+For payloads exceeding gRPC message limits, configure Azure Blob Storage to externalize large data:
+
+```csharp
+var options = AzureManagedOrchestrationServiceOptions.FromConnectionString(connectionString);
+options.LargePayloadStorageOptions = new LargePayloadStorageOptions("UseDevelopmentStorage=true")
+{
+    ExternalizeThresholdBytes = 1024,      // Externalize payloads larger than 1KB
+    MaxExternalizedPayloadBytes = 4194304, // Max 4MB payload size
+    CompressPayloads = true                // Compress before storing (default: true)
+};
+```
+
+| Property | Description |
+| -------- | ----------- |
+| Constructor (`connectionString`) | Azure Storage connection string or `"UseDevelopmentStorage=true"` for local dev |
+| `ExternalizeThresholdBytes` | Payloads larger than this are stored in blob storage |
+| `MaxExternalizedPayloadBytes` | Maximum allowed payload size (fails fast if exceeded) |
+| `CompressPayloads` | Whether to compress payloads before storing (improves storage efficiency) |
+
+When enabled, large orchestration inputs, outputs, activity results, and event payloads are automatically stored in blob storage and retrieved transparently.
+
+### Additional Options
+
+| Property | Default | Description |
+| -------- | ------- | ----------- |
+| `TaskHubName` | `"default"` | Name of the task hub (usually set via connection string) |
+| `OrchestrationHistoryCacheExpirationPeriod` | 10 minutes | How long orchestration history is cached in memory |
+| `ResourceId` | `https://durabletask.io` | OAuth resource ID (change only for sovereign clouds) |
+
+## Local Development with Emulator
+
+For local development, use the Docker-based emulator:
+
+```bash
+docker pull mcr.microsoft.com/dts/dts-emulator:latest
+docker run -d -p 8080:8080 -p 8082:8082 mcr.microsoft.com/dts/dts-emulator:latest
+```
+
+Connect to the emulator:
+
+```csharp
+var connectionString = "Endpoint=http://localhost:8080;TaskHub=default;Authentication=None";
+var service = new AzureManagedOrchestrationService(
+    AzureManagedOrchestrationServiceOptions.FromConnectionString(connectionString),
+    loggerFactory);
+```
+
+Access the local dashboard at: `http://localhost:8082`
+
+## Samples
+
+For complete working examples, see the [Durable Task Scheduler samples repository](https://github.com/Azure-Samples/Durable-Task-Scheduler/tree/main/samples/dtfx).
+
+## Additional Resources
+
+- [Azure Documentation](https://learn.microsoft.com/azure/azure-functions/durable/durable-task-scheduler/durable-task-scheduler) - Creating resources, configuration, SKUs, RBAC, pricing
+- [Quickstart Guide](https://learn.microsoft.com/azure/azure-functions/durable/durable-task-scheduler/quickstart-durable-task-scheduler)
+- [Azure Samples Repository](https://github.com/Azure-Samples/Durable-Task-Scheduler/)
+- [Support](../support.md) - Enterprise support options
+
+## Next Steps
+
+- [Choosing a Backend](../getting-started/choosing-a-backend.md) - Compare all providers
+- [Quickstart](../getting-started/quickstart.md) - Create your first orchestration
+- [Core Concepts](../concepts/core-concepts.md) - Learn the fundamentals
diff --git a/docs/providers/emulator.md b/docs/providers/emulator.md
new file mode 100644
index 000000000..195ea2082
--- /dev/null
+++ b/docs/providers/emulator.md
@@ -0,0 +1,56 @@
+# Emulator Provider
+
+The Emulator provider (`LocalOrchestrationService`) is an in-memory implementation for local development and testing. It requires no external dependencies and is ideal for quick iteration.
+
+## Installation
+
+```bash
+dotnet add package Microsoft.Azure.DurableTask.Emulator
+```
+
+## Usage
+
+```csharp
+using DurableTask.Core;
+using DurableTask.Emulator;
+using Microsoft.Extensions.Logging;
+
+using ILoggerFactory loggerFactory = LoggerFactory.Create(builder =>
+{
+    builder.AddConsole();
+    builder.SetMinimumLevel(LogLevel.Information);
+});
+
+// Create in-memory service
+var service = new LocalOrchestrationService();
+
+// Create worker and client
+var worker = new TaskHubWorker(service, loggerFactory);
+var client = new TaskHubClient(service, loggerFactory: loggerFactory);
+
+// ...
+```
+
+## Limitations
+
+| Limitation | Description |
+| ---------- | ----------- |
+| **In-memory only** | All state is lost when process exits |
+| **Single process** | Cannot share state across processes |
+
+## Transitioning to Production
+
+When moving from emulator to production, replace `LocalOrchestrationService` with your chosen provider. The rest of your code remains the same:
+
+```csharp
+// Development
+IOrchestrationService service = new LocalOrchestrationService();
+
+// Production (example: Azure Storage)
+IOrchestrationService service = new AzureStorageOrchestrationService(settings);
+```
+
+## Next Steps
+
+- [Quickstart](../getting-started/quickstart.md) - Get started with the emulator
+- [Choosing a Backend](../getting-started/choosing-a-backend.md) - Select a production provider
diff --git a/docs/providers/mssql.md b/docs/providers/mssql.md
new file mode 100644
index 000000000..1bc2e25eb
--- /dev/null
+++ b/docs/providers/mssql.md
@@ -0,0 +1,32 @@
+# MSSQL Provider
+
+The MSSQL provider uses Microsoft SQL Server or Azure SQL Database as the backend storage for orchestration state. This provider is maintained in a separate repository.
+
+## Repository
+
+The MSSQL provider is available at: **[https://github.com/microsoft/durabletask-mssql](https://github.com/microsoft/durabletask-mssql)**
+
+## Features
+
+- Uses SQL Server or Azure SQL Database for durable storage
+- Supports both on-premises SQL Server and Azure SQL
+- Includes database migrations for schema management
+- Compatible with DTFx and Azure Durable Functions
+
+## Getting Started
+
+For installation, configuration, and usage documentation, see the [durabletask-mssql repository](https://github.com/microsoft/durabletask-mssql).
+
+## When to Use
+
+Consider the MSSQL provider when:
+
+- You have existing SQL Server infrastructure
+- You need the transactional guarantees of a relational database
+- You want to query orchestration state using familiar SQL tools
+- Your organization has SQL Server expertise and operational practices
+
+## Next Steps
+
+- [durabletask-mssql Documentation](https://github.com/microsoft/durabletask-mssql#readme)
+- [Choosing a Backend](../getting-started/choosing-a-backend.md)
diff --git a/docs/providers/service-bus.md b/docs/providers/service-bus.md
new file mode 100644
index 000000000..5e17bb934
--- /dev/null
+++ b/docs/providers/service-bus.md
@@ -0,0 +1,164 @@
+# Service Bus Provider
+
+The Service Bus provider uses Azure Service Bus for orchestration messaging. It's suitable for scenarios requiring Service Bus integration or existing Service Bus infrastructure.
+
+> [!WARNING]
+> The Service Bus provider is in maintenance mode and is not recommended for new projects. Consider using [Durable Task Scheduler](durable-task-scheduler.md) for a managed alternative or the [Azure Storage Provider](azure-storage.md) for self-managed deployments.
+ +## Installation + +```bash +dotnet add package Microsoft.Azure.DurableTask.ServiceBus +``` + +## Configuration + +### Basic Setup + +The Service Bus provider requires separate stores for messaging (Service Bus) and history/state (Azure Storage): + +```csharp +using DurableTask.ServiceBus; +using DurableTask.ServiceBus.Settings; +using DurableTask.ServiceBus.Tracking; +using DurableTask.Core; +using Microsoft.Extensions.Logging; + +string serviceBusConnectionString = "Endpoint=sb://mynamespace.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=..."; +string storageConnectionString = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=..."; +string taskHubName = "MyTaskHub"; + +using ILoggerFactory loggerFactory = LoggerFactory.Create(builder => +{ + builder.AddConsole(); + builder.SetMinimumLevel(LogLevel.Information); +}); + +// Create the instance store (Azure Table Storage for history) +var instanceStore = new AzureTableInstanceStore(taskHubName, storageConnectionString); + +// Create the blob store (Azure Blob Storage for large messages/sessions) +var blobStore = new AzureStorageBlobStore(taskHubName, storageConnectionString); + +// Configure settings +var settings = new ServiceBusOrchestrationServiceSettings(); + +// Create the orchestration service +var service = new ServiceBusOrchestrationService( + serviceBusConnectionString, + taskHubName, + instanceStore, + blobStore, + settings); + +await service.CreateIfNotExistsAsync(); + +var worker = new TaskHubWorker(service, loggerFactory); +var client = new TaskHubClient(service, loggerFactory: loggerFactory); +``` + +### Using Managed Identity + +The Service Bus provider supports managed identity authentication (.NET Standard 2.0+): + +```csharp +using Azure.Identity; +using DurableTask.ServiceBus; +using DurableTask.ServiceBus.Settings; +using DurableTask.ServiceBus.Tracking; + +string serviceBusNamespace = "mynamespace.servicebus.windows.net"; +Uri storageEndpoint = new 
Uri("https://mystorageaccount.table.core.windows.net"); +Uri blobEndpoint = new Uri("https://mystorageaccount.blob.core.windows.net"); +string taskHubName = "MyTaskHub"; + +var credential = new DefaultAzureCredential(); + +// Create stores with managed identity +var instanceStore = new AzureTableInstanceStore(taskHubName, storageEndpoint, credential); +var blobStore = new AzureStorageBlobStore(taskHubName, blobEndpoint, credential); + +var settings = new ServiceBusOrchestrationServiceSettings(); + +// Create Service Bus connection with managed identity +var service = new ServiceBusOrchestrationService( + serviceBusNamespace, // Just the hostname, not a connection string + credential, + taskHubName, + instanceStore, + blobStore, + settings); +``` + +## Configuration Options + +### ServiceBusOrchestrationServiceSettings + +| Setting | Description | Default | +| ------- | ----------- | ------- | +| `MaxTaskOrchestrationDeliveryCount` | Max delivery attempts for orchestration messages | 10 | +| `MaxTaskActivityDeliveryCount` | Max delivery attempts for activity messages | 10 | +| `MaxTrackingDeliveryCount` | Max delivery attempts for tracking messages | 10 | +| `MaxQueueSizeInMegabytes` | Maximum queue size for Service Bus queues | 1024 | +| `PrefetchCount` | Message prefetch count | 50 | +| `TaskOrchestrationDispatcherSettings` | Orchestration dispatcher configuration | See below | +| `TaskActivityDispatcherSettings` | Activity dispatcher configuration | See below | +| `MessageCompressionSettings` | Message compression configuration | Disabled | +| `JumpStartSettings` | Jump start (stale instance recovery) settings | Enabled | + +### Dispatcher Settings + +Dispatcher settings control concurrency: + +```csharp +var settings = new ServiceBusOrchestrationServiceSettings +{ + TaskOrchestrationDispatcherSettings = + { + MaxConcurrentOrchestrations = 100, + CompressOrchestrationState = true + }, + TaskActivityDispatcherSettings = + { + MaxConcurrentActivities = 100 + } +}; 
+``` + +## Architecture + +### Service Bus Resources + +The provider creates these Service Bus entities: + +| Entity Type | Name Pattern | Purpose | +| ----------- | ------------ | ------- | +| **Orchestrator Queue** | `{taskhub}/orchestrator` | Orchestration messages | +| **Worker Queue** | `{taskhub}/worker` | Activity messages | +| **Tracking Queue** | `{taskhub}/tracking` | Tracking events | + +### Storage Resources + +In addition to Service Bus, the provider uses Azure Storage: + +| Resource | Name Pattern | Purpose | +| -------- | ------------ | ------- | +| **Instance History Table** | `InstanceHistory00{taskhub}` | Orchestration state and execution history | +| **Jump Start Table** | `JumpStart00{taskhub}` | Pending orchestrations for stale instance recovery | +| **Blob Container** | `{taskhub}-dtfx` | Large messages and session state | + +> [!NOTE] +> Unlike the Azure Storage provider (which has separate History and Instances tables), the Service Bus provider stores both instance metadata and history events in a single `InstanceHistory` table. + +## Limitations + +- Requires both Service Bus and Azure Storage +- Limited query capabilities compared to the Azure Storage provider +- Less commonly used, with a smaller community and fewer examples +- No built-in monitoring dashboard + +## Next Steps + +- [Choosing a Backend](../getting-started/choosing-a-backend.md): Compare all providers +- [Durable Task Scheduler](durable-task-scheduler.md): Recommended managed alternative +- [Azure Storage Provider](azure-storage.md): Alternative self-managed option diff --git a/docs/providers/service-fabric.md b/docs/providers/service-fabric.md new file mode 100644 index 000000000..5109d288f --- /dev/null +++ b/docs/providers/service-fabric.md @@ -0,0 +1,197 @@ +# Service Fabric Provider + +The Service Fabric provider uses Azure Service Fabric reliable collections for orchestration state. It's designed for applications already running on Service Fabric clusters.
+ +## Installation + +```bash +dotnet add package Microsoft.Azure.DurableTask.AzureServiceFabric +``` + +## Configuration + +### Basic Setup + +The Service Fabric provider includes built-in infrastructure via `TaskHubStatefulService` and `TaskHubProxyListener`: + +```csharp +using DurableTask.AzureServiceFabric; +using DurableTask.AzureServiceFabric.Service; +using DurableTask.Core; +using Microsoft.ServiceFabric.Services.Runtime; + +// In Program.cs +ServiceRuntime.RegisterServiceAsync("StatefulServiceType", context => +{ + var settings = new FabricOrchestrationProviderSettings(); + + var listener = new TaskHubProxyListener( + settings, + RegisterOrchestrations); + + return new TaskHubStatefulService(context, new[] { listener }); +}).GetAwaiter().GetResult(); + +void RegisterOrchestrations(TaskHubWorker worker) +{ + worker.AddTaskOrchestrations(typeof(MyOrchestration)); + worker.AddTaskActivities(typeof(MyActivity)); +} +``` + +### Manual Setup with Provider Factory + +For more control, use the `FabricOrchestrationProviderFactory`: + +```csharp +using DurableTask.AzureServiceFabric; +using DurableTask.Core; +using Microsoft.ServiceFabric.Services.Runtime; + +public class DurableTaskService : StatefulService +{ + private FabricOrchestrationProvider provider; + private TaskHubWorker worker; + + public DurableTaskService(StatefulServiceContext context) : base(context) { } + + protected override async Task RunAsync(CancellationToken cancellationToken) + { + var settings = new FabricOrchestrationProviderSettings(); + + var factory = new FabricOrchestrationProviderFactory( + this.StateManager, + settings); + + provider = factory.CreateProvider(); + + worker = new TaskHubWorker(provider.OrchestrationService, settings.LoggerFactory); + worker.AddTaskOrchestrations(typeof(MyOrchestration)); + worker.AddTaskActivities(typeof(MyActivity)); + + await worker.StartAsync(); + + try + { + await Task.Delay(Timeout.Infinite, cancellationToken); + } + finally + { + await 
worker.StopAsync(); + provider.Dispose(); + } + } +} +``` + +### Service Registration + +Register the service in `Program.cs`: + +```csharp +ServiceRuntime.RegisterServiceAsync( + "DurableTaskServiceType", + context => new DurableTaskService(context)) + .GetAwaiter().GetResult(); +``` + +## Architecture + +### Reliable Collections + +State is stored in Service Fabric reliable collections: + +| Collection Name | Purpose | +| --------------- | ------- | +| `DtfxSfp_Orchestrations` | Orchestration sessions and state | +| `DtfxSfp_Activities` | Pending activity messages | +| `DtfxSfp_InstanceStore` | Instance metadata for queries | +| `DtfxSfp_ExecutionIdStore` | Execution ID mappings | +| `DtfxSfp_ScheduledMessages` | Scheduled timer messages | +| `DtfxSfp_SessionMessages_{id}` | Per-session message queues | + +### Partitioning + +Service Fabric handles partitioning automatically based on your service configuration: + +```xml + + + + + +``` + +## Configuration Options + +| Setting | Description | Default | +| ------- | ----------- | ------- | +| `TaskOrchestrationDispatcherSettings.MaxConcurrentOrchestrations` | Max concurrent orchestrations | 1000 | +| `TaskOrchestrationDispatcherSettings.DispatcherCount` | Number of orchestration dispatchers | 10 | +| `TaskActivityDispatcherSettings.MaxConcurrentActivities` | Max concurrent activities | 1000 | +| `TaskActivityDispatcherSettings.DispatcherCount` | Number of activity dispatchers | 10 | +| `LoggerFactory` | Optional logger factory for diagnostics | null | + +### Example Configuration + +```csharp +var settings = new FabricOrchestrationProviderSettings +{ + TaskOrchestrationDispatcherSettings = + { + MaxConcurrentOrchestrations = 500, + DispatcherCount = 5 + }, + TaskActivityDispatcherSettings = + { + MaxConcurrentActivities = 500, + DispatcherCount = 5 + } +}; +``` + +## Client Access + +### From Within Service Fabric + +Use the `FabricOrchestrationProvider` to get both worker and client: + +```csharp +var factory = 
new FabricOrchestrationProviderFactory(this.StateManager, settings); +var provider = factory.CreateProvider(); + +var worker = new TaskHubWorker(provider.OrchestrationService, settings.LoggerFactory); +var client = new TaskHubClient(provider.OrchestrationServiceClient, loggerFactory: settings.LoggerFactory); + +var instance = await client.CreateOrchestrationInstanceAsync( + typeof(MyOrchestration), + input); +``` + +### From External Applications + +External clients connect via the built-in HTTP API or Service Fabric remoting: + +```csharp +// Create a service proxy +var serviceUri = new Uri("fabric:/MyApp/DurableTaskService"); +var proxy = ServiceProxy.Create(serviceUri); + +// Call methods on the proxy +var instanceId = await proxy.StartOrchestrationAsync(input); +``` + +The `TaskHubProxyListener` exposes an HTTP API via `FabricOrchestrationServiceController` for external access. + +## Limitations + +- Requires a Service Fabric cluster +- Tightly coupled to the Service Fabric ecosystem +- More complex deployment and management +- No external persistence: state is lost if all replicas are lost + +## Next Steps + +- [Choosing a Backend](../getting-started/choosing-a-backend.md): Compare all providers +- [Durable Task Scheduler](durable-task-scheduler.md): Recommended managed alternative +- [Azure Storage Provider](azure-storage.md): Alternative self-managed option diff --git a/docs/samples/catalog.md b/docs/samples/catalog.md new file mode 100644 index 000000000..4464c8f2d --- /dev/null +++ b/docs/samples/catalog.md @@ -0,0 +1,21 @@ +# Sample Applications + +This page provides an overview of the sample applications included in the repository. These samples demonstrate various patterns and features of the Durable Task Framework. + +For detailed instructions on running each sample, see the README in each project directory.
+ +## Sample Projects + +| Sample | Description | Key Features | +| ------ | ----------- | ------------ | +| [DurableTask.Samples](../../samples/DurableTask.Samples/) | Core orchestration patterns | Greetings, Cron, Fan-out/Fan-in, Error handling, External events | +| [Correlation.Samples](../../samples/Correlation.Samples/) | Legacy distributed tracing | W3C TraceContext, Application Insights | +| [DistributedTraceSample](../../samples/DistributedTraceSample/) | Modern telemetry integration | OpenTelemetry, Application Insights | +| [ManagedIdentitySample](../../samples/ManagedIdentitySample/) | Azure authentication | Managed Identity with Azure Storage | + +## Additional Resources + +- [Getting Started](../getting-started/quickstart.md) +- [Core Concepts](../concepts/core-concepts.md) +- [Choosing a Backend](../getting-started/choosing-a-backend.md) +- [Testing](../advanced/testing.md) diff --git a/docs/support.md b/docs/support.md new file mode 100644 index 000000000..3afb8efa9 --- /dev/null +++ b/docs/support.md @@ -0,0 +1,49 @@ +# Support + +This document describes support options for the Durable Task Framework (DTFx). + +## Community Support (Open Source) + +The Durable Task Framework is an open-source project maintained by Microsoft. Community support is available through [GitHub Issues](https://github.com/Azure/durabletask/issues) for bug reports, feature requests, and technical questions. 
+ +### Support Policy + +Community support is provided on a **best-effort basis**: + +- ⚠️ **No SLA**: Response times are not guaranteed +- ⚠️ **No 24/7 coverage**: Issues are triaged during business hours +- ⚠️ **Community-driven**: Many answers come from community members +- ⚠️ **Not all providers maintained**: Some backend providers are no longer actively maintained +- ✅ **Open collaboration**: All issues and discussions are public + +See [Choosing a Backend](getting-started/choosing-a-backend.md) for information on the development status of each backend provider. + +This model works well for: + +- Learning and experimentation +- Non-critical workloads +- Development and testing environments +- Projects with in-house expertise + +## Enterprise Support: Durable Task Scheduler + +For production workloads requiring guaranteed support, we recommend using DTFx with the **[Durable Task Scheduler](providers/durable-task-scheduler.md)** as the backend provider. This fully managed Azure service offers enterprise-grade support with the following benefits: + +| Feature | Open Source (BYO Providers) | Durable Task Scheduler | +| ------- | -------------------------- | ---------------------- | +| **Support SLA** | ❌ Best-effort | ✅ Azure support with SLA | +| **24/7 Coverage** | ❌ No | ✅ Yes (with Azure support plan) | +| **Infrastructure** | Self-managed | Fully managed by Azure | +| **Monitoring** | Bring your own tools | Built-in dashboard | +| **Throughput** | Varies by provider | Highest available | +| **Response Time** | Not guaranteed | Based on Azure support tier | + +## Reporting Security Issues + +⚠️ **Do not report security vulnerabilities through public GitHub issues.** + +Please see [SECURITY.md](../SECURITY.md) for instructions on reporting security issues responsibly. + +## Contributing + +We welcome contributions! See the [GitHub repository](https://github.com/Azure/durabletask) for contribution guidelines.
diff --git a/docs/telemetry/application-insights.md b/docs/telemetry/application-insights.md new file mode 100644 index 000000000..fe2d0ddf8 --- /dev/null +++ b/docs/telemetry/application-insights.md @@ -0,0 +1,358 @@ +# Application Insights Integration + +The Durable Task Framework provides deep integration with Azure Application Insights for monitoring, diagnostics, and performance analysis. + +## Installation + +```bash +dotnet add package Microsoft.Azure.DurableTask.ApplicationInsights +dotnet add package Microsoft.ApplicationInsights +``` + +## Setup + +### Basic Configuration + +```csharp +using Microsoft.ApplicationInsights; +using Microsoft.ApplicationInsights.Extensibility; +using Microsoft.Azure.DurableTask.ApplicationInsights; +using Microsoft.Extensions.DependencyInjection; +using Microsoft.Extensions.DependencyInjection.Extensions; // TryAddEnumerable + +var services = new ServiceCollection(); + +// Add Application Insights +services.AddApplicationInsightsTelemetryWorkerService(options => +{ + options.ConnectionString = "InstrumentationKey=your-key;..."; +}); + +// Add DurableTask telemetry module for distributed tracing +services.TryAddEnumerable( + ServiceDescriptor.Singleton<ITelemetryModule, DurableTelemetryModule>()); + +var serviceProvider = services.BuildServiceProvider(); +``` + +### ASP.NET Core Integration + +```csharp +// In Program.cs +var builder = WebApplication.CreateBuilder(args); + +// Add Application Insights +builder.Services.AddApplicationInsightsTelemetry(); + +// Add DurableTask telemetry module +builder.Services.TryAddEnumerable( + ServiceDescriptor.Singleton<ITelemetryModule, DurableTelemetryModule>()); + +var app = builder.Build(); +``` + +### Console Application + +```csharp +using Microsoft.ApplicationInsights; +using Microsoft.ApplicationInsights.Extensibility; +using Microsoft.Azure.DurableTask.ApplicationInsights; +using Microsoft.Extensions.Logging; +using DurableTask.Core; +using DurableTask.Emulator; + +// Configure Application Insights +var configuration = TelemetryConfiguration.CreateDefault(); +configuration.ConnectionString = "InstrumentationKey=..."; + +// Add the DurableTask telemetry module +var module = new
DurableTelemetryModule(); +module.Initialize(configuration); + +var telemetryClient = new TelemetryClient(configuration); + +// Create logger factory for diagnostics +using ILoggerFactory loggerFactory = LoggerFactory.Create(builder => +{ + builder.AddConsole(); + builder.SetMinimumLevel(LogLevel.Information); +}); + +// Create DTFx components +var service = new LocalOrchestrationService(); +var worker = new TaskHubWorker(service, loggerFactory); +worker.AddTaskOrchestrations(typeof(MyOrchestration)); +worker.AddTaskActivities(typeof(MyActivity)); + +await worker.StartAsync(); + +// ... run orchestrations ... + +await worker.StopAsync(); + +// Ensure telemetry is flushed +telemetryClient.Flush(); +await Task.Delay(TimeSpan.FromSeconds(5)); +``` + +## What Gets Tracked + +### Automatic Telemetry + +The `DurableTelemetryModule` automatically tracks: + +| Telemetry Type | Description | +|----------------|-------------| +| **Requests** | Orchestration and activity executions | +| **Dependencies** | Activity calls, sub-orchestrations | +| **Traces** | Log messages from DTFx | +| **Exceptions** | Failures in orchestrations and activities | +| **Custom Events** | Orchestration lifecycle events | + +### Distributed Tracing + +Operations are automatically correlated: + +```text +Request: OrderOrchestration (parent) +├── Dependency: ValidateOrderActivity +├── Dependency: ProcessPaymentActivity +├── Dependency: ShippingOrchestration (sub-orchestration) +│   └── Dependency: CreateShipmentActivity +└── Dependency: SendConfirmationActivity +``` + +## Custom Telemetry + +### Adding Custom Properties + +```csharp +public class OrderOrchestration : TaskOrchestration +{ + private readonly TelemetryClient _telemetryClient; + + public OrderOrchestration(TelemetryClient telemetryClient) + { + _telemetryClient = telemetryClient; + } + + public override async Task RunTask( + OrchestrationContext context, + OrderInput input) + { + // Track custom event + if
(!context.IsReplaying) + { + _telemetryClient.TrackEvent("OrderProcessingStarted", new Dictionary<string, string> + { + ["InstanceId"] = context.OrchestrationInstance.InstanceId, + ["OrderId"] = input.OrderId, + ["CustomerId"] = input.CustomerId + }); + } + + // ... orchestration logic ... + + if (!context.IsReplaying) + { + _telemetryClient.TrackEvent("OrderProcessingCompleted", new Dictionary<string, string> + { + ["InstanceId"] = context.OrchestrationInstance.InstanceId, + ["OrderId"] = input.OrderId, + ["Status"] = "Success" + }); + } + + return result; + } +} +``` + +### Tracking Metrics + +```csharp +if (!context.IsReplaying) +{ + _telemetryClient.TrackMetric("OrderProcessingDuration", + (context.CurrentUtcDateTime - startTime).TotalMilliseconds); + + _telemetryClient.TrackMetric("OrderItemCount", input.Items.Count); +} +``` + +### Tracking Exceptions + +```csharp +try +{ + await context.ScheduleTask(typeof(RiskyActivity), input); +} +catch (TaskFailedException ex) +{ + if (!context.IsReplaying) + { + _telemetryClient.TrackException(ex.InnerException, new Dictionary<string, string> + { + ["InstanceId"] = context.OrchestrationInstance.InstanceId, + ["ActivityName"] = ex.Name + }); + } + throw; +} +``` + +## Querying Data + +### Kusto Queries (Log Analytics) + +**Orchestration execution times:** +```kusto +requests +| where name contains "orchestration" +| summarize avg(duration), percentile(duration, 95) by name +| order by avg_duration desc +``` + +**Failed orchestrations:** +```kusto +requests +| where name contains "orchestration" and success == false +| project timestamp, name, duration, customDimensions +| order by timestamp desc +``` + +**Activity performance:** +```kusto +dependencies +| where type == "DurableTask" +| summarize count(), avg(duration) by name +| order by count_ desc +``` + +**End-to-end traces:** +```kusto +union requests, dependencies, traces +| where operation_Id == "your-operation-id" +| order by timestamp asc +| project timestamp, itemType, name, message, duration +``` + +## Live
Metrics + +Application Insights Live Metrics shows the following in real time: + +- Incoming request rate +- Failure rate +- Dependency call duration +- Server response time + +Enable Live Metrics in your configuration: + +```csharp +services.AddApplicationInsightsTelemetryWorkerService(options => +{ + options.ConnectionString = "..."; + // Live Metrics is the "QuickPulse" stream in ApplicationInsightsServiceOptions + options.EnableQuickPulseMetricStream = true; +}); +``` + +## Alerts + +Configure alerts for common scenarios: + +### High Failure Rate + +```kusto +requests +| where name contains "orchestration" +| summarize failureCount = countif(success == false), totalCount = count() by bin(timestamp, 5m) +| extend failureRate = failureCount * 100.0 / totalCount +| where failureRate > 5 +``` + +### Long-Running Orchestrations + +```kusto +requests +| where name contains "orchestration" +| where duration > 300000 // 5 minutes (duration is in milliseconds) +| project timestamp, name, duration, operation_Id +``` + +### Stuck Orchestrations + +Monitor for orchestrations that started more than an hour ago and have no matching completion event: + +```kusto +customEvents +| where name == "OrchestrationStarted" +| where timestamp < ago(1h) +| extend instanceId = tostring(customDimensions["InstanceId"]) +| join kind=leftanti ( + customEvents + | where name == "OrchestrationCompleted" + | extend instanceId = tostring(customDimensions["InstanceId"]) +) on instanceId +``` + +## Sampling + +For high-volume scenarios, configure sampling: + +```csharp +services.AddApplicationInsightsTelemetryWorkerService(options => +{ + options.ConnectionString = "..."; +}); + +services.Configure<TelemetryConfiguration>(config => +{ + config.DefaultTelemetrySink.TelemetryProcessorChainBuilder + .UseAdaptiveSampling(maxTelemetryItemsPerSecond: 5) + .Build(); +}); +``` + +## Best Practices + +### 1. Use IsReplaying for Custom Telemetry + +```csharp +if (!context.IsReplaying) +{ + _telemetryClient.TrackEvent("CustomEvent"); +} +``` + +### 2.
Include Correlation IDs + +```csharp +_telemetryClient.TrackEvent("OrderProcessed", new Dictionary<string, string> +{ + ["InstanceId"] = context.OrchestrationInstance.InstanceId, + ["OrderId"] = input.OrderId +}); +``` + +### 3. Flush Before Shutdown + +```csharp +await worker.StopAsync(); +telemetryClient.Flush(); +await Task.Delay(TimeSpan.FromSeconds(5)); // Allow time for flush +``` + +### 4. Monitor Key Metrics + +- Orchestration success/failure rate +- Activity duration +- Queue depth (if applicable) +- Concurrent orchestrations + +## Samples + +See the complete working sample: +- [Application Insights Sample](../../samples/DistributedTraceSample/ApplicationInsights) + +## Next Steps + +- [Distributed Tracing](distributed-tracing.md): OpenTelemetry integration +- [Logging](logging.md): Structured logging +- [Support](../support.md): Getting help diff --git a/docs/telemetry/distributed-tracing.md b/docs/telemetry/distributed-tracing.md new file mode 100644 index 000000000..c03684ea5 --- /dev/null +++ b/docs/telemetry/distributed-tracing.md @@ -0,0 +1,285 @@ +# Distributed Tracing + +The Durable Task Framework supports distributed tracing using the standard .NET `ActivitySource` API, compatible with both OpenTelemetry and Application Insights. + +## Overview + +Distributed tracing provides visibility into orchestration execution across services and activities.
DTFx emits spans for: + +- Starting orchestrations +- Running orchestrations +- Starting and running activities +- Sub-orchestrations +- Timers +- External events + +## Supported Protocols + +DTFx supports trace context propagation using standard protocols: + +| Protocol | Description | +| -------- | ----------- | +| **W3C TraceContext** | W3C standard for distributed tracing (default) | +| **HTTP Correlation Protocol** | Legacy Application Insights protocol | + +## OpenTelemetry Setup + +### Installation + +```bash +dotnet add package OpenTelemetry +dotnet add package OpenTelemetry.Exporter.Console # Or your preferred exporter +``` + +### Configuration + +Add the `DurableTask.Core` source to the OpenTelemetry trace builder: + +```csharp +using OpenTelemetry; +using OpenTelemetry.Trace; + +var tracerProvider = Sdk.CreateTracerProviderBuilder() + .AddSource("DurableTask.Core") + .AddConsoleExporter() // Or your preferred exporter + .Build(); +``` + +### Full Example + +```csharp +using OpenTelemetry; +using OpenTelemetry.Trace; +using DurableTask.Core; +using DurableTask.AzureStorage; + +// Configure OpenTelemetry +using var tracerProvider = Sdk.CreateTracerProviderBuilder() + .AddSource("DurableTask.Core") + .AddConsoleExporter() + .Build(); + +// Create logger factory +using ILoggerFactory loggerFactory = LoggerFactory.Create(builder => +{ + builder.AddConsole(); + builder.SetMinimumLevel(LogLevel.Information); +}); + +// Set up DTFx +var settings = new AzureStorageOrchestrationServiceSettings +{ + TaskHubName = "MyTaskHub", + StorageAccountClientProvider = new StorageAccountClientProvider(connectionString), + LoggerFactory = loggerFactory +}; + +var service = new AzureStorageOrchestrationService(settings); +await service.CreateIfNotExistsAsync(); + +var worker = new TaskHubWorker(service, loggerFactory); +worker.AddTaskOrchestrations(typeof(MyOrchestration)); +worker.AddTaskActivities(typeof(MyActivity)); + +await worker.StartAsync(); + +var client = new 
TaskHubClient(service, loggerFactory: loggerFactory); +var instance = await client.CreateOrchestrationInstanceAsync( + typeof(MyOrchestration), + "input"); + +await client.WaitForOrchestrationAsync(instance, TimeSpan.FromMinutes(1)); +await worker.StopAsync(); +``` + +### Exporting to Azure Monitor + +```csharp +using Azure.Monitor.OpenTelemetry.Exporter; +using OpenTelemetry; +using OpenTelemetry.Trace; + +var tracerProvider = Sdk.CreateTracerProviderBuilder() + .AddSource("DurableTask.Core") + .AddAzureMonitorTraceExporter(o => + { + o.ConnectionString = "InstrumentationKey=..."; + }) + .Build(); +``` + +### Exporting to Jaeger + +```csharp +using OpenTelemetry; +using OpenTelemetry.Exporter; +using OpenTelemetry.Trace; + +var tracerProvider = Sdk.CreateTracerProviderBuilder() + .AddSource("DurableTask.Core") + .AddJaegerExporter(o => + { + o.AgentHost = "localhost"; + o.AgentPort = 6831; + }) + .Build(); +``` + +## Application Insights Setup + +For Application Insights integration, use the dedicated telemetry module. + +### Application Insights Installation + +```bash +dotnet add package Microsoft.Azure.DurableTask.ApplicationInsights +dotnet add package Microsoft.ApplicationInsights +``` + +### Application Insights Configuration + +```csharp +using Microsoft.ApplicationInsights.Extensibility; +using Microsoft.Azure.DurableTask.ApplicationInsights; +using Microsoft.Extensions.DependencyInjection; +using Microsoft.Extensions.DependencyInjection.Extensions; // TryAddEnumerable + +var services = new ServiceCollection(); + +// Add Application Insights +services.AddApplicationInsightsTelemetryWorkerService(options => +{ + options.ConnectionString = "InstrumentationKey=..."; +}); + +// Add DurableTask telemetry module +services.TryAddEnumerable( + ServiceDescriptor.Singleton<ITelemetryModule, DurableTelemetryModule>()); + +var serviceProvider = services.BuildServiceProvider(); +``` + +### ASP.NET Core Integration + +```csharp +// In Program.cs or Startup.cs +builder.Services.AddApplicationInsightsTelemetry(); +builder.Services.TryAddEnumerable( + ServiceDescriptor.Singleton<ITelemetryModule, DurableTelemetryModule>()); +``` + +## Span Reference + +### Orchestration Spans + +| Span Name | Kind | Description | +| --------- |
---- | ----------- | +| `create_orchestration:{name}` | Producer | Starting an orchestration from client | +| `orchestration:{name}` | Server | Running an orchestration in worker | +| `orchestration:{name}` | Client | Starting a sub-orchestration | + +### Activity Spans + +| Span Name | Kind | Description | +| --------- | ---- | ----------- | +| `activity:{name}` | Client | Starting an activity from orchestration | +| `activity:{name}` | Server | Running an activity in worker | + +### Other Spans + +| Span Name | Kind | Description | +| --------- | ---- | ----------- | +| `timer` | Internal | Durable timer | +| `event:{name}` | Producer | Sending an external event | + +## Attributes + +DTFx spans include these attributes: + +| Attribute | Type | Description | +| --------- | ---- | ----------- | +| `durabletask.type` | string | Type: "orchestration", "activity", "timer", "event" | +| `durabletask.task.name` | string | Name of the task | +| `durabletask.task.version` | string | Version of the task (if specified) | +| `durabletask.task.instance_id` | string | Orchestration instance ID | +| `durabletask.task.execution_id` | string | Execution ID | +| `durabletask.task.task_id` | int | Task index within orchestration | +| `durabletask.task.result` | string | Result: "Succeeded", "Failed", "Terminated" | + +## Trace Correlation + +Traces are automatically correlated across: + +- Parent orchestration → Sub-orchestration +- Orchestration → Activity +- Client → Orchestration + +### Example Trace Hierarchy + +```text +create_orchestration:OrderOrchestration (Producer) +└── orchestration:OrderOrchestration (Server) +    ├── activity:ValidateOrder (Client) +    │   └── activity:ValidateOrder (Server) +    ├── activity:ProcessPayment (Client) +    │   └── activity:ProcessPayment (Server) +    └── orchestration:ShippingOrchestration (Client) +        └── orchestration:ShippingOrchestration (Server) +            └── activity:CreateShipment (Client) +                └── activity:CreateShipment (Server) +``` +
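To inspect the span names, kinds, and attributes listed above without configuring an exporter, one option is a plain `ActivityListener` from `System.Diagnostics`. The following is a minimal sketch rather than part of DTFx itself: it assumes the spans are published on the `DurableTask.Core` source with the `durabletask.task.instance_id` tag as documented above, and the `Describe` helper is a hypothetical name introduced only for this illustration.

```csharp
using System;
using System.Diagnostics;

// Subscribe to spans from the "DurableTask.Core" ActivitySource
// (source name taken from the OpenTelemetry setup in this document).
using var listener = new ActivityListener
{
    ShouldListenTo = source => source.Name == "DurableTask.Core",
    // Request full data so that tags are populated on every span.
    Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
        ActivitySamplingResult.AllDataAndRecorded,
    // Print one line per finished span.
    ActivityStopped = activity => Console.WriteLine(Describe(activity)),
};
ActivitySource.AddActivityListener(listener);

// Hypothetical helper: formats a span as
// "<name> (<kind>) trace=<id> parent=<id> instance=<id>".
static string Describe(Activity activity)
{
    object? instanceId = activity.GetTagItem("durabletask.task.instance_id");
    return $"{activity.DisplayName} ({activity.Kind}) " +
           $"trace={activity.TraceId} parent={activity.ParentSpanId} " +
           $"instance={instanceId ?? "n/a"}";
}
```

Register the listener before starting the worker so that no spans are missed. Spans that share a `trace` value belong to the same hierarchy, and `parent` identifies the span one level up, which is how a tree like the one above can be rebuilt from the output.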
+## Samples + +See the sample projects for complete working examples: + +- [OpenTelemetry Sample](../../samples/DistributedTraceSample/OpenTelemetry): Modern ActivitySource-based tracing with OpenTelemetry +- [Application Insights Sample](../../samples/DistributedTraceSample/ApplicationInsights): Modern ActivitySource-based tracing with Application Insights +- [Correlation Sample](../../samples/Correlation.Samples): Legacy CorrelationSettings-based tracing (Azure Storage only) + +## Legacy Correlation (Azure Storage Only) + +The Azure Storage provider includes a legacy correlation system using `CorrelationSettings`. This approach predates the modern `ActivitySource` API and is maintained for backward compatibility. + +### Enabling Legacy Correlation + +```csharp +using DurableTask.Core.Settings; + +// Enable legacy distributed tracing +CorrelationSettings.Current.EnableDistributedTracing = true; +CorrelationSettings.Current.Protocol = Protocol.W3CTraceContext; // or Protocol.HttpCorrelationProtocol +``` + +### Setting Up Telemetry + +The legacy system requires manual setup of `CorrelationTraceClient`: + +```csharp +using DurableTask.Core; +using Microsoft.ApplicationInsights; + +// Set up telemetry callbacks +CorrelationTraceClient.SetUp( + (TraceContextBase requestTraceContext) => + { + requestTraceContext.Stop(); + var requestTelemetry = requestTraceContext.CreateRequestTelemetry(); + telemetryClient.TrackRequest(requestTelemetry); + }, + (TraceContextBase dependencyTraceContext) => + { + dependencyTraceContext.Stop(); + var dependencyTelemetry = dependencyTraceContext.CreateDependencyTelemetry(); + telemetryClient.TrackDependency(dependencyTelemetry); + }, + (Exception e) => + { + telemetryClient.TrackException(e); + } +); +``` + +> [!NOTE] +> The modern `ActivitySource` approach (OpenTelemetry/DurableTelemetryModule) is recommended for new projects. The legacy `CorrelationSettings` system only works with the Azure Storage provider.
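
For reference, the W3C TraceContext protocol named above carries correlation data in a `traceparent` value of the form `version-traceid-parentid-flags`. The sketch below is illustrative only and does not show how DTFx propagates context internally: it uses the standard `ActivityContext.TryParse` API from `System.Diagnostics` (.NET 5 and later) with the sample value from the W3C specification to show what each field means.

```csharp
using System;
using System.Diagnostics;

// Sample header value from the W3C Trace Context specification:
// version "00", 16-byte trace ID, 8-byte parent span ID, flags "01" (sampled).
const string traceParent = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01";

// The second argument is the optional "tracestate" value; none is supplied here.
if (ActivityContext.TryParse(traceParent, null, out ActivityContext parent))
{
    Console.WriteLine($"trace id:  {parent.TraceId}");
    Console.WriteLine($"parent id: {parent.SpanId}");
    Console.WriteLine($"sampled:   {parent.TraceFlags.HasFlag(ActivityTraceFlags.Recorded)}");
}
else
{
    Console.WriteLine("Not a valid traceparent value");
}
```

Every span in one end-to-end operation shares the same trace ID, which is the value to search for (for example as `operation_Id` in Application Insights) when following an orchestration across services.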
+ +## Next Steps + +- [Logging](logging.md): Structured logging in DTFx +- [Application Insights](application-insights.md): Full Application Insights integration +- [Semantic Conventions](traces/semantic-conventions.md): Detailed span specification diff --git a/docs/telemetry/logging.md b/docs/telemetry/logging.md new file mode 100644 index 000000000..467729675 --- /dev/null +++ b/docs/telemetry/logging.md @@ -0,0 +1,356 @@ +# Logging + +The Durable Task Framework provides structured logging for observability and debugging. This guide covers logging configuration and best practices. + +## Log Sources + +DTFx emits logs from these sources: + +| Source | Description | +| ------ | ----------- | +| `DurableTask.Core` | Core framework operations | +| `DurableTask.AzureStorage` | Azure Storage provider | +| `DurableTask.ServiceBus` | Service Bus provider | +| `DurableTask.AzureManagedBackend` | Durable Task Scheduler | + +## Configuring Logging + +### With Microsoft.Extensions.Logging + +```csharp +using Azure.Identity; +using DurableTask.AzureStorage; +using DurableTask.Core; +using Microsoft.Extensions.Logging; + +var loggerFactory = LoggerFactory.Create(builder => +{ + builder + .SetMinimumLevel(LogLevel.Information) + .AddConsole() + .AddFilter("DurableTask.Core", LogLevel.Debug); +}); + +// Configure provider-specific logging (e.g., Azure Storage) +var settings = new AzureStorageOrchestrationServiceSettings +{ + TaskHubName = "MyTaskHub", + StorageAccountClientProvider = new StorageAccountClientProvider( + "mystorageaccount", + new DefaultAzureCredential()), + LoggerFactory = loggerFactory, // Provider logs +}; +var service = new AzureStorageOrchestrationService(settings); + +// Pass to worker and client +var worker = new TaskHubWorker(service, loggerFactory); +var client = new TaskHubClient(service, loggerFactory: loggerFactory); +``` + +> [!NOTE] +> Pass the `ILoggerFactory` to all three locations (provider settings, worker, and client) for complete log coverage.
Provider-specific logs include message delivery times, partition operations, and other backend details useful for debugging. + +### With Serilog + +```csharp +using Serilog; + +Log.Logger = new LoggerConfiguration() + .MinimumLevel.Information() + .MinimumLevel.Override("DurableTask.Core", Serilog.Events.LogEventLevel.Debug) + .WriteTo.Console() + .CreateLogger(); + +var loggerFactory = new LoggerFactory().AddSerilog(); +var worker = new TaskHubWorker(service, loggerFactory); +``` + +### ASP.NET Core Integration + +```csharp +// In Program.cs +builder.Logging.AddFilter("DurableTask.Core", LogLevel.Debug); +``` + +## Log Events + +### Orchestration Events + +| Event ID | Level | Description | +| -------- | ----- | ----------- | +| 40 | Information | Scheduling orchestration | +| 43 | Information | Waiting for orchestration | +| 49 | Information | Orchestration completed | +| 51 | Information | Executing orchestration logic | +| 52 | Information | Orchestration executed (scheduled operations) | + +### Activity Events + +| Event ID | Level | Description | +| -------- | ----- | ----------- | +| 46 | Information | Scheduling activity | +| 60 | Information | Starting activity | +| 61 | Information | Activity completed | + +### Worker Events + +| Event ID | Level | Description | +| -------- | ----- | ----------- | +| 10 | Information | Worker starting | +| 11 | Information | Worker started | +| 12 | Information | Worker stopping | +| 13 | Information | Worker stopped | + +### Example Log Output + +```text +info: DurableTask.Core[10] Durable task hub worker is starting +info: DurableTask.Core[40] Scheduling orchestration 'MyOrchestration' with instance ID = 'abc123' +info: DurableTask.Core[51] abc123: Executing 'MyOrchestration' orchestration logic +info: DurableTask.Core[46] abc123: Scheduling activity [MyActivity#0] +info: DurableTask.Core[60] abc123: Starting task activity [MyActivity#0] +info: DurableTask.Core[61] abc123: Task activity [MyActivity#0] completed 
successfully +info: DurableTask.Core[49] abc123: Orchestration completed with status 'Completed' +``` + +## Logging in Orchestrations + +### Using IsReplaying + +Avoid duplicate logs during replay. Note that DTFx orchestrations do not support constructor-based dependency injection. Use a static logger or pass a logger factory through your object creator: + +```csharp +public class MyOrchestration : TaskOrchestration<string, string> +{ + // Use a static logger or configure via ObjectCreator + private static readonly ILogger Logger = LoggerFactory + .Create(builder => builder.AddConsole()) + .CreateLogger<MyOrchestration>(); + + public override async Task<string> RunTask( + OrchestrationContext context, + string input) + { + // Only log during actual execution, not replay + if (!context.IsReplaying) + { + Logger.LogInformation( + "Processing orchestration {InstanceId} with input {Input}", + context.OrchestrationInstance.InstanceId, + input); + } + + var result = await context.ScheduleTask<string>(typeof(MyActivity), input); + + if (!context.IsReplaying) + { + Logger.LogInformation( + "Orchestration {InstanceId} completed with result {Result}", + context.OrchestrationInstance.InstanceId, + result); + } + + return result; + } +} +``` + +### Structured Logging Best Practices + +Include relevant context in log messages: + +```csharp +// ✅ Good - structured with context +_logger.LogInformation( + "Processing order {OrderId} for customer {CustomerId} in orchestration {InstanceId}", + input.OrderId, + input.CustomerId, + context.OrchestrationInstance.InstanceId); + +// ❌ Bad - string concatenation, no structure +_logger.LogInformation( + "Processing order " + input.OrderId + " for customer " + input.CustomerId); +``` + +## Logging in Activities + +Activities don't have replay concerns, so log freely. Like orchestrations, DTFx activities do not support constructor-based dependency injection.
Use a static logger or configure via a custom `ObjectCreator`: + +```csharp +public class MyActivity : AsyncTaskActivity<string, string> +{ + // Use a static logger or configure via ObjectCreator + private static readonly ILogger Logger = LoggerFactory + .Create(builder => builder.AddConsole()) + .CreateLogger<MyActivity>(); + + protected override async Task<string> ExecuteAsync( + TaskContext context, + string input) + { + Logger.LogInformation( + "Starting activity for orchestration {InstanceId}", + context.OrchestrationInstance.InstanceId); + + try + { + var result = await DoWorkAsync(input); + + Logger.LogInformation( + "Activity completed for orchestration {InstanceId}", + context.OrchestrationInstance.InstanceId); + + return result; + } + catch (Exception ex) + { + Logger.LogError(ex, + "Activity failed for orchestration {InstanceId}", + context.OrchestrationInstance.InstanceId); + throw; + } + } + + private Task<string> DoWorkAsync(string input) => Task.FromResult(input); +} +``` + +## Log Correlation + +### Correlation IDs + +Include correlation IDs for end-to-end tracing: + +```csharp +public override async Task RunTask( + OrchestrationContext context, + OrderInput input) +{ + using (Logger.BeginScope(new Dictionary<string, object> + { + ["InstanceId"] = context.OrchestrationInstance.InstanceId, + ["OrderId"] = input.OrderId, + ["CorrelationId"] = input.CorrelationId + })) + { + if (!context.IsReplaying) + { + Logger.LogInformation("Starting order processing"); + } + + // ... orchestration logic + } +} +``` + +### Distributed Tracing Integration + +For trace correlation, see [Distributed Tracing](distributed-tracing.md).
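
One way to tie log entries to a trace is to copy the current trace and span IDs into the logging scope. The following is a minimal sketch, not part of DTFx: `TraceScope` and its `Build` method are hypothetical names introduced for illustration, and it assumes the ambient `System.Diagnostics.Activity` carries the trace you want to correlate with.

```csharp
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper (not a DTFx API): collects values for a logging scope.
public static class TraceScope
{
    // Returns scope values for the given instance ID, adding trace
    // identifiers only when an Activity is available.
    public static Dictionary<string, object> Build(Activity activity, string instanceId)
    {
        var scope = new Dictionary<string, object>
        {
            ["InstanceId"] = instanceId
        };

        if (activity != null)
        {
            // TraceId and SpanId are structs; store their string forms.
            scope["TraceId"] = activity.TraceId.ToString();
            scope["SpanId"] = activity.SpanId.ToString();
        }

        return scope;
    }
}
```

The result can be passed to `Logger.BeginScope(...)` in the same way as the dictionary in the correlation example above, for instance `Logger.BeginScope(TraceScope.Build(Activity.Current, instanceId))`. Whether `Activity.Current` is populated inside an orchestration or activity depends on how tracing is configured, so treat the trace fields as optional.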
+ +## Log Levels + +Recommended log level configuration: + +| Environment | DurableTask.Core | Provider | +| ----------- | ---------------- | -------- | +| Development | Debug | Debug | +| Testing | Debug | Information | +| Production | Information | Warning | + +### Configuration Example + +```json +{ + "Logging": { + "LogLevel": { + "Default": "Information", + "DurableTask.Core": "Information", + "DurableTask.AzureStorage": "Warning" + } + } +} +``` + +## Filtering Noisy Logs + +Some operations generate many logs. Filter as needed: + +```csharp +builder.Logging + .AddFilter("DurableTask.Core", LogLevel.Information) + // Reduce noise from Azure Storage provider + .AddFilter("DurableTask.AzureStorage", LogLevel.Warning); +``` + +## Diagnostic Logging + +For troubleshooting, enable debug logging: + +```csharp +builder.Logging + .SetMinimumLevel(LogLevel.Debug) + .AddFilter("DurableTask", LogLevel.Debug); +``` + +This reveals: + +- Message processing details +- Partition lease operations +- History loading/saving +- Timer scheduling + +## Logging with Middleware + +For cross-cutting logging concerns without modifying orchestration or activity code, use [middleware](../advanced/middleware.md). 
This approach lets you intercept all executions in one place: + +```csharp +var worker = new TaskHubWorker(orchestrationService, loggerFactory); + +// Add orchestration logging middleware +worker.AddOrchestrationDispatcherMiddleware(async (context, next) => +{ + var instance = context.GetProperty<OrchestrationInstance>(); + var runtimeState = context.GetProperty<OrchestrationRuntimeState>(); + + logger.LogInformation("Orchestration {Name} ({InstanceId}) starting", + runtimeState?.Name, instance?.InstanceId); + + var stopwatch = Stopwatch.StartNew(); + try + { + await next(); + logger.LogInformation("Orchestration {Name} ({InstanceId}) completed in {ElapsedMs}ms", + runtimeState?.Name, instance?.InstanceId, stopwatch.ElapsedMilliseconds); + } + catch (Exception ex) + { + logger.LogError(ex, "Orchestration {Name} ({InstanceId}) failed", + runtimeState?.Name, instance?.InstanceId); + throw; + } +}); + +// Add activity logging middleware +worker.AddActivityDispatcherMiddleware(async (context, next) => +{ + var scheduledEvent = context.GetProperty<TaskScheduledEvent>(); + var instance = context.GetProperty<OrchestrationInstance>(); + + logger.LogInformation("Activity {ActivityName} starting for {InstanceId}", + scheduledEvent?.Name, instance?.InstanceId); + + await next(); +}); +``` + +See [Middleware](../advanced/middleware.md) for complete examples. + +## Event Source Logging + +In addition to `ILogger`, DTFx also emits logs via [Event Source](https://docs.microsoft.com/dotnet/api/system.diagnostics.tracing.eventsource), which is used by platforms like Azure Functions and Azure App Service for automatic telemetry collection. Event Source logging is always enabled and captures additional correlation details. + +For advanced Event Source configuration, including provider GUIDs and structured logging details, see the [source documentation](https://github.com/Azure/durabletask/blob/main/src/DurableTask.Core/Logging/README.md).
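
To inspect Event Source output in-process, an `EventListener` can subscribe to the source by name. This is a minimal sketch under one assumption: the event source name `DurableTask-Core` is not stated in this guide, so confirm it against the source documentation linked above. `DurableTaskEventListener` is a hypothetical class name.

```csharp
using System;
using System.Diagnostics.Tracing;

// Hypothetical listener (not shipped with DTFx) that prints DTFx events.
public sealed class DurableTaskEventListener : EventListener
{
    // Assumed event source name; verify against the DTFx logging README.
    public const string CoreSourceName = "DurableTask-Core";

    // Called once for each EventSource in the process, including
    // sources that were created before this listener.
    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        if (eventSource.Name == CoreSourceName)
        {
            EnableEvents(eventSource, EventLevel.Informational);
        }
    }

    // Called for every event from an enabled source.
    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        Console.WriteLine(
            $"[{eventData.EventSource.Name}] {eventData.EventId} {eventData.EventName}");
    }
}
```

Creating one instance at startup (`using var listener = new DurableTaskEventListener();`) is enough to start receiving events; disposing it stops the subscription. The name is a `const` rather than an instance field because `OnEventSourceCreated` can run before the derived constructor finishes.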
+ +## Next Steps + +- [Distributed Tracing](distributed-tracing.md): OpenTelemetry integration +- [Application Insights](application-insights.md): Full AI integration +- [Middleware](../advanced/middleware.md): Cross-cutting concerns including logging +- [Error Handling](../features/error-handling.md): Logging errors diff --git a/samples/Correlation.Samples/Readme.md b/samples/Correlation.Samples/Readme.md index f2920c9ae..e01372ac6 100644 --- a/samples/Correlation.Samples/Readme.md +++ b/samples/Correlation.Samples/Readme.md @@ -1,22 +1,42 @@ -# Distributed Tracing for Durable Task +# Correlation Samples -Distributed Tracing for Durable Task is a feature for enabling correlation propagation among orchestrations and activities. -The key features of Distributed Tracing for Durable Task are: +This sample demonstrates legacy distributed tracing using the `CorrelationSettings` API with the Azure Storage provider. -- **End to End Tracing with Application Insights**: Support Complex orchestration scenario. Multi-Layered Sub Orchestration, Fan-out Fan-in, retry, Timer, and more. -- **Support Protocol**: [W3C TraceContext](https://w3c.github.io/trace-context/) and [Http Correlation Protocol](https://github.com/dotnet/corefx/blob/master/src/System.Diagnostics.DiagnosticSource/src/HttpCorrelationProtocol.md) -- **Suppress Distributed Tracing**: No breaking change for the current implementation +> [!NOTE] +> For comprehensive distributed tracing documentation, including the modern `ActivitySource`-based approach, see the [Distributed Tracing Guide](../../docs/telemetry/distributed-tracing.md). -Currently, we support [DurableTask.AzureStorage](https://w3c.github.io/trace-context/).
+## Overview + +This sample shows the legacy correlation approach using: + +- `CorrelationSettings.Current.EnableDistributedTracing`: Enable tracing +- `CorrelationTraceClient.SetUp()`: Manual telemetry callbacks +- Application Insights for trace visualization ![Overview](docs/images/overview.png) -# Getting Started +## Supported Scenarios + +The samples demonstrate tracing across various orchestration patterns: + +- Simple orchestrations (`HelloOrchestrator`) +- Fan-out/fan-in (`FanOutFanInOrchestrator`) +- Sub-orchestrations (`SubOrchestratorOrchestration`) +- Retry scenarios (`RetryOrchestration`, `MultiLayeredOrchestrationWithRetry`) +- Continue-as-new (`ContinueAsNewOrchestration`) +- Terminated orchestrations (`TerminatedOrchestration`) + +## Getting Started + +See [docs/getting-started.md](docs/getting-started.md) for setup instructions. + +## Provider Implementation -If you want to try Distributed Tracing with DurableTask.AzureStorage, you can find a document with a Handful of examples. +If you're implementing distributed tracing for a custom provider, see [docs/overview.md](docs/overview.md) for the architecture and extension points. - - [Intro](docs/getting-started.md) +## Modern Alternative -# Developing Provider +For new projects, consider using the modern `ActivitySource`-based approach with OpenTelemetry: -If you want to implement Distributed Tracing for other DurableTask providers, Read [Develop Distributed Tracing](docs/overview.md).
\ No newline at end of file +- [OpenTelemetry Sample](../DistributedTraceSample/OpenTelemetry) +- [Application Insights Sample](../DistributedTraceSample/ApplicationInsights) diff --git a/samples/DistributedTraceSample/ApplicationInsights/README.md b/samples/DistributedTraceSample/ApplicationInsights/README.md new file mode 100644 index 000000000..cf358239a --- /dev/null +++ b/samples/DistributedTraceSample/ApplicationInsights/README.md @@ -0,0 +1,57 @@ +# Application Insights Sample + +This sample demonstrates direct integration with Azure Application Insights for distributed tracing in Durable Task applications. + +## Prerequisites + +- .NET 6.0 SDK or later +- Azure Storage Emulator (Azurite) or Azure Storage account +- Azure Application Insights resource + +## Configuration + +1. Create an Application Insights resource in the Azure Portal + +2. Configure the connection string in `appsettings.json`: + + ```json + { + "ApplicationInsights": { + "ConnectionString": "InstrumentationKey=..." + } + } + ``` + + Or set the environment variable: + + ```text + APPLICATIONINSIGHTS_CONNECTION_STRING=InstrumentationKey=... + ``` + +## Code Setup + +```csharp +services.AddApplicationInsightsTelemetryWorkerService(); +services.TryAddEnumerable( + ServiceDescriptor.Singleton()); +``` + +The `FilterOutStorageTelemetryProcessor` is included to reduce noise from Azure Storage operations in your telemetry. + +## Running the Sample + +```bash +dotnet run +``` + +## Viewing Traces + +1. Navigate to your Application Insights resource in the Azure Portal +2. Go to **Transaction Search** +3. Click on an entry to view the end-to-end transaction +4. 
A Gantt chart will show the visual representation of the trace and spans + +## Additional Resources + +- [Application Insights Documentation](../../../docs/telemetry/application-insights.md) +- [Distributed Tracing Guide](../../../docs/telemetry/distributed-tracing.md) diff --git a/samples/DistributedTraceSample/README.md b/samples/DistributedTraceSample/README.md new file mode 100644 index 000000000..f74316198 --- /dev/null +++ b/samples/DistributedTraceSample/README.md @@ -0,0 +1,56 @@ +# Distributed Trace Samples + +This directory contains samples demonstrating telemetry integration with different distributed tracing providers for Durable Task applications. + +## Overview + +Distributed tracing allows you to monitor and debug orchestrations across your entire application stack. These samples show how to configure various telemetry exporters. + +## Samples + +| Sample | Description | +| ------ | ----------- | +| [OpenTelemetry](OpenTelemetry/) | Integration with OpenTelemetry for vendor-neutral distributed tracing | +| [ApplicationInsights](ApplicationInsights/) | Integration with Azure Application Insights | + +## OpenTelemetry Sample + +The [OpenTelemetry sample](OpenTelemetry/) demonstrates how to configure distributed tracing with multiple exporters including Console, Application Insights, and Zipkin. + +```csharp +using var tracerProvider = Sdk.CreateTracerProviderBuilder() + .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("MySample")) + .AddSource("DurableTask.Core") + .AddConsoleExporter() + .AddZipkinExporter() + .AddAzureMonitorTraceExporter(options => + { + options.ConnectionString = Environment.GetEnvironmentVariable("AZURE_MONITOR_CONNECTION_STRING"); + }) + .Build(); +``` + +See the [OpenTelemetry README](OpenTelemetry/README.md) for detailed setup instructions. 
+ +## Application Insights Sample + +The [Application Insights sample](ApplicationInsights/) demonstrates direct integration with Azure Application Insights without OpenTelemetry. + +```csharp +services.AddApplicationInsightsTelemetryWorkerService(); +services.TryAddEnumerable( + ServiceDescriptor.Singleton()); +``` + +## Prerequisites + +- .NET 6.0 SDK or later +- Azure Storage Emulator (Azurite) or Azure Storage account +- (Optional) Application Insights resource +- (Optional) Zipkin instance for OpenTelemetry sample + +## Additional Resources + +- [Distributed Tracing Guide](../../docs/telemetry/distributed-tracing.md) +- [Application Insights Documentation](../../docs/telemetry/application-insights.md) +- [OpenTelemetry Documentation](https://opentelemetry.io/) diff --git a/samples/DurableTask.Samples/README.md b/samples/DurableTask.Samples/README.md new file mode 100644 index 000000000..9da5433ce --- /dev/null +++ b/samples/DurableTask.Samples/README.md @@ -0,0 +1,203 @@ +# DurableTask.Samples + +This project contains core sample orchestrations demonstrating fundamental patterns of the Durable Task Framework using the Azure Storage backend. + +## Prerequisites + +- .NET Framework 4.8 or later +- Azure Storage Emulator (Azurite) or Azure Storage account + +## Configuration + +Configure the connection string in `App.config`: + +```xml + + + + +``` + +For Azure Storage, replace `UseDevelopmentStorage=true` with your connection string (if not using the emulator): + +```text +DefaultEndpointsProtocol=https;AccountName=...;AccountKey=... +``` + +## Running the Samples + +### 1. Create the Task Hub (first time only) + +```bash +DurableTask.Samples.exe -c +``` + +### 2. Start an Orchestration + +```bash +DurableTask.Samples.exe -s [-p ] +``` + +The worker automatically starts and waits for the orchestration to complete. + +## Available Samples + +### Greetings + +A simple "Hello World" orchestration that calls greeting activities. 
+ +```csharp +public class GreetingsOrchestration : TaskOrchestration<string, string> +{ + public override async Task<string> RunTask(OrchestrationContext context, string input) + { + string greeting = await context.ScheduleTask<string>(typeof(GetUserTask)); + string result = await context.ScheduleTask<string>(typeof(SendGreetingTask), greeting); + return result; + } +} +``` + +**Run:** `DurableTask.Samples.exe -s Greetings` + +### Greetings2 + +Demonstrates parameterized orchestrations with a configurable number of greetings. + +**Run:** `DurableTask.Samples.exe -s Greetings2 -p 5` + +### Cron + +An eternal orchestration that runs on a schedule using `CreateTimer` and `ContinueAsNew`. + +```csharp +public class CronOrchestration : TaskOrchestration<string, string> +{ + public override async Task<string> RunTask(OrchestrationContext context, string schedule) + { + // Execute the scheduled task + await context.ScheduleTask<string>(typeof(CronTask)); + + // Wait until next scheduled time + DateTime nextRun = CalculateNextRun(context.CurrentUtcDateTime, schedule); + await context.CreateTimer(nextRun, true); + + // Continue as new instance + context.ContinueAsNew(schedule); + return "Completed cycle"; + } +} +``` + +**Run:** `DurableTask.Samples.exe -s Cron -p "0 12 * * *"` + +### AverageCalculator + +Fan-out/fan-in pattern that distributes computation across multiple activities. + +```csharp +public class AverageCalculatorOrchestration : TaskOrchestration<double, int[]> +{ + public override async Task<double> RunTask(OrchestrationContext context, int[] numbers) + { + // Fan-out: process chunks in parallel + var tasks = new List<Task<int>>(); + foreach (var chunk in numbers.Chunk(10)) + { + tasks.Add(context.ScheduleTask<int>(typeof(ComputeSumTask), chunk)); + } + + // Fan-in: aggregate results + int[] sums = await Task.WhenAll(tasks); + return sums.Sum() / (double)numbers.Length; + } +} +``` + +**Run:** `DurableTask.Samples.exe -s Average -p "1 50 10"` + +Parameters: ` ` + +### ErrorHandling + +Demonstrates retry policies and exception handling patterns.
+ +```csharp +public override async Task<string> RunTask(OrchestrationContext context, string input) +{ + var retryOptions = new RetryOptions( + firstRetryInterval: TimeSpan.FromSeconds(5), + maxNumberOfAttempts: 3); + + try + { + return await context.ScheduleWithRetry<string>( + typeof(UnreliableActivity), + retryOptions, + input); + } + catch (TaskFailedException ex) + { + // Handle permanent failure + return $"Failed after retries: {ex.Message}"; + } +} +``` + +**Run:** `DurableTask.Samples.exe -s ErrorHandling` + +### Signal + +Demonstrates external events and human interaction patterns. + +```csharp +public override async Task<string> RunTask(OrchestrationContext context, ApprovalRequest input) +{ + // Send notification + await context.ScheduleTask(typeof(SendApprovalRequest), input); + + // Wait for external event + var approval = await context.WaitForExternalEvent("ApprovalResult"); + + if (approval.IsApproved) + { + await context.ScheduleTask(typeof(ProcessApproval), input); + return "Approved and processed"; + } + + return "Rejected"; +} +``` + +**Run:** `DurableTask.Samples.exe -s Signal` + +To raise an event to a running instance: + +```bash +DurableTask.Samples.exe -n -i -p +``` + +### SumOfSquares + +Another fan-out/fan-in example computing sum of squares from a JSON input file.
+ +**Run:** `DurableTask.Samples.exe -s SumOfSquares` + +## Command Line Options + +| Option | Description | +| ------ | ----------- | +| `-c` | Create the task hub (required on first run) | +| `-s ` | Start the specified orchestration | +| `-p ` | Parameters to pass to the orchestration | +| `-i ` | Instance ID (auto-generated if not specified) | +| `-n ` | Event name for raising events | +| `-w` | Skip the worker (useful when worker runs separately) | + +## Additional Resources + +- [Getting Started Guide](../../docs/getting-started/quickstart.md) +- [Orchestrations](../../docs/concepts/orchestrations.md) +- [Activities](../../docs/concepts/activities.md) +- [Error Handling](../../docs/features/error-handling.md) +- [Timers](../../docs/features/timers.md) diff --git a/samples/ManagedIdentitySample/README.md b/samples/ManagedIdentitySample/README.md new file mode 100644 index 000000000..2d07dc5d0 --- /dev/null +++ b/samples/ManagedIdentitySample/README.md @@ -0,0 +1,65 @@ +# Managed Identity Sample + +This directory contains samples demonstrating how to use Azure Managed Identity for authentication with Azure Storage in Durable Task applications. + +## Overview + +Managed Identity provides a more secure alternative to connection strings by eliminating the need to store credentials. These samples show how to configure identity-based connections for both v1.x and v2.x versions of the Azure Storage provider. + +## Samples + +| Sample | Description | +| ------ | ----------- | +| [DTFx.AzureStorage v1.x](DTFx.AzureStorage%20v1.x/) | Legacy WindowsAzure.Storage SDK with Managed Identity | +| [DTFx.AzureStorage v2.x](DTFx.AzureStorage%20v2.x/) | Modern Azure.Storage.* SDK with TokenCredential | + +## Prerequisites + +Before running these samples, you must: + +1. **Create an Azure Storage account** or reuse an existing one + +2. **Create your identity** in the Azure Portal. 
Detailed instructions can be found in the [Microsoft Entra documentation](https://learn.microsoft.com/entra/identity-platform/quickstart-register-app?tabs=certificate) + +3. **Assign Role-based Access Controls (RBAC)** to the identity with [these instructions](https://learn.microsoft.com/azure/role-based-access-control/role-assignments-portal-managed-identity#Overview): + - Storage Queue Data Contributor + - Storage Blob Data Contributor + - Storage Table Data Contributor + +4. **Configure the identity** in your app's configuration + +5. **Set the storage account name** in your configuration. The account name can be replaced with individual service URIs (BlobServiceUri, TableServiceUri, QueueServiceUri) + +## Code Examples + +### DTFx.AzureStorage v1.x + +```csharp +var credential = new DefaultAzureCredential(); +var settings = new AzureStorageOrchestrationServiceSettings +{ + StorageAccountClientProvider = new ManagedIdentityStorageAccountClientProvider( + storageAccountName, + credential) +}; +``` + +> [!NOTE] +> Identity-based connection is **not supported** with .NET Framework 4.x when using DurableTask.AzureStorage v1.x + +### DTFx.AzureStorage v2.x + +```csharp +var credential = new DefaultAzureCredential(); +var settings = new AzureStorageOrchestrationServiceSettings +{ + StorageAccountClientProvider = new StorageAccountClientProvider( + new Uri($"https://{storageAccountName}.blob.core.windows.net"), + credential) +}; +``` + +## Additional Resources + +- [Azure Storage Provider Documentation](../../docs/providers/azure-storage.md) +- [Azure Managed Identity Overview](https://learn.microsoft.com/azure/active-directory/managed-identities-azure-resources/overview)