Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions .autover/changes/91693d62-b0c7-49b0-a74f-531aa1509864.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
{
"Projects": [
{
"Name": "Amazon.Lambda.DurableExecution",
"Type": "Patch",
"ChangelogMessages": [
"Initial preview release of the Durable Execution SDK for .NET. Build long-running Lambda workflows with automatic checkpointing via `StepAsync`, `WaitAsync`, `RunInChildContextAsync`, `CreateCallbackAsync`, and `WaitForCallbackAsync` on `IDurableContext`."
]
}
]
}
106 changes: 106 additions & 0 deletions Libraries/src/Amazon.Lambda.DurableExecution/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,106 @@
# AWS Lambda Durable Execution SDK for .NET

> **Preview.** `Amazon.Lambda.DurableExecution` is in active development (0.x). Public APIs may change before 1.0.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't have the deployment of new Amazon.Lambda.RuntimeSupport with support for accessing the serializer you should call out only executable programming model is supported for now.

I would suggest making all of the examples be executable mode as well and later we can revise the README once the managed runtime is updated.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


`Amazon.Lambda.DurableExecution` is the .NET SDK for building resilient, long-running AWS Lambda functions that automatically checkpoint progress and resume after failures. Workflows can run for up to one year, with charges only for active compute time.

## Key Features

- **Automatic checkpointing** — progress is saved after each step; failures resume from the last checkpoint.
- **Cost-effective waits** — suspend execution for minutes, hours, or days without compute charges.
- **Configurable retries** — built-in retry strategies with exponential backoff and jitter.
- **Replay safety** — functions deterministically resume from checkpoints after interruptions.
- **Type safety** — full generic type support for step results.
- **AOT-friendly** — pluggable `ILambdaSerializer` so you can register `SourceGeneratorLambdaJsonSerializer<TContext>` for trimmed / Native AOT functions.

## How It Works

Your handler delegates to `DurableFunction.WrapAsync`, which gives your workflow function an `IDurableContext`. The context is your interface to durable operations:

- `ctx.StepAsync` — run code and checkpoint the result. ([docs](docs/core/steps.md))
- `ctx.WaitAsync` — suspend execution without compute charges. ([docs](docs/core/wait.md))
- `ctx.CreateCallbackAsync` / `ctx.WaitForCallbackAsync` — wait for external events (approvals, webhooks). ([docs](docs/core/callbacks.md))
- `ctx.RunInChildContextAsync` — run an isolated child context with its own checkpoint log. ([docs](docs/core/child-contexts.md))

## Quick Start

### Installation

```bash
dotnet add package Amazon.Lambda.DurableExecution
```

### Your first durable function

> **Programming model:** the preview only supports the **executable programming model** — your function is an executable assembly that hosts its own bootstrap loop and passes the serializer to the runtime in code. Class-library handlers on the managed runtime will be supported once `Amazon.Lambda.RuntimeSupport` ships the changes that let `DurableFunction.WrapAsync` resolve the serializer from `ILambdaContext.Serializer`. This README will be updated then.

A complete order-processing workflow with two steps and a wait, deployed as an executable assembly on the `dotnet10` runtime. `Main` builds a `LambdaBootstrap` with your handler and an `ILambdaSerializer`, and `DurableFunction.WrapAsync` uses that serializer to checkpoint step inputs and outputs.

```csharp
using Amazon.Lambda.Core;
using Amazon.Lambda.DurableExecution;
using Amazon.Lambda.RuntimeSupport;
using Amazon.Lambda.Serialization.SystemTextJson;

namespace OrderProcessor;

public class OrderProcessor
{
public static async Task Main()
{
var handler = new OrderProcessor();
var serializer = new DefaultLambdaJsonSerializer();
using var wrapper = HandlerWrapper.GetHandlerWrapper<DurableExecutionInvocationInput, DurableExecutionInvocationOutput>(
handler.Handler, serializer);
using var bootstrap = new LambdaBootstrap(wrapper);
await bootstrap.RunAsync();
}

public Task<DurableExecutionInvocationOutput> Handler(
DurableExecutionInvocationInput input, ILambdaContext context)
=> DurableFunction.WrapAsync<Order, OrderResult>(Workflow, input, context);

private async Task<OrderResult> Workflow(Order order, IDurableContext ctx)
{
var reservation = await ctx.StepAsync(
async _ => await InventoryService.ReserveAsync(order.Items),
name: "reserve-inventory");

var payment = await ctx.StepAsync(
async _ => await PaymentService.ChargeAsync(order.PaymentMethod, order.Total),
name: "process-payment");

await ctx.WaitAsync(TimeSpan.FromHours(2), name: "warehouse-processing");

var shipment = await ctx.StepAsync(
async _ => await ShippingService.ShipAsync(reservation, order.Address),
name: "confirm-shipment");

return new OrderResult(order.Id, shipment.TrackingNumber);
}
}

public record Order(string Id, IReadOnlyList<OrderItem> Items, PaymentMethod PaymentMethod, decimal Total, Address Address);
public record OrderResult(string OrderId, string TrackingNumber);
```

For AOT or trim-friendly serialization, swap `DefaultLambdaJsonSerializer` for `SourceGeneratorLambdaJsonSerializer<TContext>` and register your `JsonSerializerContext`.

## Documentation

**Core operations**

- [Steps](docs/core/steps.md) — execute code with automatic checkpointing, retry strategies, and at-least/at-most-once semantics.
- [Wait](docs/core/wait.md) — pause execution without compute charges.
- [Callbacks](docs/core/callbacks.md) — wait for external systems to respond.
- [Child Contexts](docs/core/child-contexts.md) — group related operations into isolated, checkpointed units.

**Examples**

End-to-end test functions (each paired with an integration test) live under `Libraries/test/Amazon.Lambda.DurableExecution.IntegrationTests/TestFunctions/`.

## Related SDKs

- [aws-durable-execution-sdk-java](https://github.com/aws/aws-durable-execution-sdk-java) — Java SDK
- [aws-durable-execution-sdk-js](https://github.com/aws/aws-durable-execution-sdk-js) — JavaScript / TypeScript SDK
- [aws-durable-execution-sdk-python](https://github.com/aws/aws-durable-execution-sdk-python) — Python SDK
185 changes: 185 additions & 0 deletions Libraries/src/Amazon.Lambda.DurableExecution/docs/core/callbacks.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,185 @@
# Callbacks

Callbacks let a workflow suspend until an external system (a human approver, a webhook, another service) delivers a result. The external system completes the callback by calling `SendDurableExecutionCallbackSuccess`, `SendDurableExecutionCallbackFailure`, or `SendDurableExecutionCallbackHeartbeat` with the `callbackId` you handed it.

Two APIs are available:

- `WaitForCallbackAsync<T>` — composite operation; create the callback, hand it to the external system inside a submitter delegate, and suspend until the result arrives.
- `CreateCallbackAsync<T>` — lower-level; allocate the callback yourself, hand the ID out in your own steps, and `await` the result separately.

## `WaitForCallbackAsync<T>`

```csharp
Task<T> WaitForCallbackAsync<T>(
Func<string, IWaitForCallbackContext, Task> submitter,
string? name = null,
WaitForCallbackConfig? config = null,
CancellationToken cancellationToken = default);
```

The submitter receives the freshly allocated `callbackId` and an `IWaitForCallbackContext` (logger-only). Submitter failures (after retries are exhausted) surface as `CallbackSubmitterException`; callback failures and timeouts surface as `CallbackFailedException` / `CallbackTimeoutException`.

## `CreateCallbackAsync<T>`

```csharp
Task<ICallback<T>> CreateCallbackAsync<T>(
string? name = null,
CallbackConfig? config = null,
CancellationToken cancellationToken = default);
```

The returned `ICallback<T>` exposes:

- `string CallbackId` — give this to the external system.
- `Task<T> GetResultAsync(CancellationToken)` — `await` to suspend until the external system completes the callback.

The result is deserialized using the registered `ILambdaSerializer`. Throws `CallbackFailedException` or `CallbackTimeoutException` on failure.

## End-to-end example

Two Lambdas: a workflow that suspends on a callback, and a separate approver Lambda that resolves it. The workflow hands its `callbackId` to the approver via `Event` invocation (fire-and-forget), then suspends. The approver runs in its own Lambda and signals completion by calling `SendDurableExecutionCallbackSuccessAsync`.

### 1. Workflow Lambda — `WaitForCallbackAsync<T>`

```csharp
using Amazon.Lambda;
using Amazon.Lambda.Core;
using Amazon.Lambda.DurableExecution;
using Amazon.Lambda.Model;
using Amazon.Lambda.RuntimeSupport;
using Amazon.Lambda.Serialization.SystemTextJson;

namespace OrderApprovalWorkflow;

public class Function
{
private static readonly IAmazonLambda LambdaClient = new AmazonLambdaClient();

public static async Task Main()
{
var handler = new Function();
var serializer = new DefaultLambdaJsonSerializer();
using var wrapper = HandlerWrapper.GetHandlerWrapper<DurableExecutionInvocationInput, DurableExecutionInvocationOutput>(
handler.Handler, serializer);
using var bootstrap = new LambdaBootstrap(wrapper);
await bootstrap.RunAsync();
}

public Task<DurableExecutionInvocationOutput> Handler(
DurableExecutionInvocationInput input, ILambdaContext context)
=> DurableFunction.WrapAsync<OrderInput, ApprovalResult>(Workflow, input, context);

private async Task<ApprovalResult> Workflow(OrderInput input, IDurableContext ctx)
{
var approverFunctionName = Environment.GetEnvironmentVariable("APPROVER_FUNCTION_NAME")
?? throw new InvalidOperationException("APPROVER_FUNCTION_NAME env var not set");

// Suspend until the approver Lambda calls SendDurableExecutionCallbackSuccessAsync
// with this callback ID. The submitter is invoked once with a freshly-allocated
// ID; it hands the ID to the approver and returns immediately.
var result = await ctx.WaitForCallbackAsync<ApprovalResult>(
submitter: async (callbackId, cbCtx) =>
{
var payload = $$"""{"callbackId":"{{callbackId}}","orderId":"{{input.OrderId}}"}""";
await LambdaClient.InvokeAsync(new InvokeRequest
{
FunctionName = approverFunctionName,
InvocationType = InvocationType.Event, // fire-and-forget
Payload = payload
});
},
name: "approve");

return result;
}
}

public record OrderInput(string OrderId);
public record ApprovalResult(string Status, string ApprovedBy);
```

### 2. Approver Lambda — completes the callback

A plain Lambda — no durable execution wrapper. It receives the callback ID, performs whatever logic the external system needs, and calls `SendDurableExecutionCallbackSuccessAsync` to resume the workflow.

```csharp
using System.Text;
using Amazon.Lambda;
using Amazon.Lambda.Core;
using Amazon.Lambda.Model;
using Amazon.Lambda.RuntimeSupport;
using Amazon.Lambda.Serialization.SystemTextJson;

namespace OrderApprovalWorkflow;

public class ApproverFunction
{
private static readonly IAmazonLambda LambdaClient = new AmazonLambdaClient();

public static async Task Main()
{
var handler = new ApproverFunction();
var serializer = new DefaultLambdaJsonSerializer();
using var wrapper = HandlerWrapper.GetHandlerWrapper<ApproverInput, object?>(
handler.Handler, serializer);
using var bootstrap = new LambdaBootstrap(wrapper);
await bootstrap.RunAsync();
}

public async Task<object?> Handler(ApproverInput input, ILambdaContext context)
{
// The result JSON must match the T in WaitForCallbackAsync<T> — here, ApprovalResult.
var resultJson = $$"""{"Status":"approved","ApprovedBy":"{{input.OrderId}}"}""";
await LambdaClient.SendDurableExecutionCallbackSuccessAsync(
new SendDurableExecutionCallbackSuccessRequest
{
CallbackId = input.CallbackId,
Result = new MemoryStream(Encoding.UTF8.GetBytes(resultJson))
});
return null;
}
}

public record ApproverInput(string CallbackId, string OrderId);
```

To signal failure instead, call `SendDurableExecutionCallbackFailureAsync` — the workflow throws `CallbackFailedException`. To extend the heartbeat deadline (when `HeartbeatTimeout` is configured), call `SendDurableExecutionCallbackHeartbeatAsync`.

### `CreateCallbackAsync<T>` variant

When you need to allocate the ID before deciding how to hand it out — e.g. several steps run between callback creation and submission — use `CreateCallbackAsync` and a separate `StepAsync` for the submission. Wrapping the hand-off in a step prevents replays from re-invoking the approver.

```csharp
private async Task<ApprovalResult> Workflow(OrderInput input, IDurableContext ctx)
{
var cb = await ctx.CreateCallbackAsync<ApprovalResult>(name: "approve");

await ctx.StepAsync(async _ =>
{
var payload = $$"""{"callbackId":"{{cb.CallbackId}}","orderId":"{{input.OrderId}}"}""";
await LambdaClient.InvokeAsync(new InvokeRequest
{
FunctionName = approverFunctionName,
InvocationType = InvocationType.Event,
Payload = payload
});
}, name: "submit");

return await cb.GetResultAsync();
}
```

## Configuration

```csharp
public class CallbackConfig
{
public TimeSpan Timeout { get; set; } // overall callback timeout, ≥ 1s or Zero (default = no timeout)
public TimeSpan HeartbeatTimeout { get; set; } // heartbeat-gap timeout, ≥ 1s or Zero (default = no timeout)
}

public class WaitForCallbackConfig : CallbackConfig
{
public IRetryStrategy? RetryStrategy { get; set; } // applied to the submitter step only
}
```
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
# Child Contexts

`RunInChildContextAsync` runs a sub-workflow inside its own deterministic operation-ID space. The child's return value is checkpointed as a single `CONTEXT` operation, so subsequent invocations replay the cached value without re-executing the contained operations. Use to group related steps under a shared error/observability boundary.

## Signatures

```csharp
Task<T> RunInChildContextAsync<T>(
Func<IDurableContext, Task<T>> func,
string? name = null,
ChildContextConfig? config = null,
CancellationToken cancellationToken = default);

Task RunInChildContextAsync(
Func<IDurableContext, Task> func,
string? name = null,
ChildContextConfig? config = null,
CancellationToken cancellationToken = default);
```

## Example

```csharp
var phaseResult = await ctx.RunInChildContextAsync<string>(
async childCtx =>
{
var validated = await childCtx.StepAsync(async _ => Validate(input), name: "validate");
await childCtx.WaitAsync(TimeSpan.FromSeconds(2), name: "short_wait");
var processed = await childCtx.StepAsync(async _ => Process(validated), name: "process");
return processed;
},
name: "phase",
config: new ChildContextConfig { SubType = "OrderProcessing" });
```

## Configuration

```csharp
public sealed class ChildContextConfig
{
public string? SubType { get; set; } // observability label
public Func<Exception, Exception>? ErrorMapping { get; set; } // remap thrown exceptions
}
```

`ErrorMapping` lets you translate exceptions thrown inside the child context into a domain-specific exception type before they propagate to the parent.
Loading