Architecture

AzureOpsCrew

Table of Contents

  1. Overview & Design Principles
  2. System Architecture
  3. Core Components
  4. Key Patterns
  5. Technology Stack
  6. Development Notes

Overview & Design Principles

Clean Architecture with DDD

AzureOpsCrew is built following Clean Architecture principles combined with Domain-Driven Design (DDD). The solution is organized into distinct layers with clear separation of concerns:

src/
├── Domain/              # Core business logic, entities, and domain services
├── Infrastructure.Ai/   # AI provider integrations, LLM clients, MCP servers
├── Infrastructure.Db/   # Entity Framework DbContext, migrations, repositories
├── Api/                 # HTTP endpoints, SignalR hubs, background services
└── Front/               # Blazor WebAssembly frontend

Multi-Agent Collaboration Model

The system enables autonomous AI agents to collaborate in channels or direct messages. Key characteristics:

  • Agent Autonomy: Agents independently react to triggers, execute their runs, and wait for conditions to be satisfied
  • Channel-Based Collaboration: Multiple agents participate in shared conversations
  • Direct Messages: Agent-to-agent or agent-to-human private conversations
  • Tool Execution: Agents can use built-in tools or MCP server tools with approval workflows

Production-Grade AI Principles

  • Provider Abstraction: Support for multiple AI providers (OpenAI, Anthropic, Azure Foundry, DeepSeek, Ollama, OpenRouter)
  • Streaming Responses: Real-time delivery of agent thoughts and reasoning
  • Observability: Complete logging of agent thoughts, HTTP calls, and tool executions
  • Fault Tolerance: Health checks, retry logic, and graceful error handling
  • Scalability: Background service architecture with SignalR for real-time updates

System Architecture

Layer Diagram

┌─────────────────────────────────────────────────────────────────┐
│                         Frontend Layer                          │
│                     (Blazor WebAssembly)                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │  Channels    │  │  Direct Msg  │  │  Settings    │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
└──────────────────────────┬──────────────────────────────────────┘
                           │ SignalR / HTTP
┌──────────────────────────▼──────────────────────────────────────┐
│                          API Layer                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │   Endpoints  │  │  SignalR Hub │  │ Background   │           │
│  │  (Minimal)   │  │   (Events)   │  │  Services    │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
└──────────────────────────┬──────────────────────────────────────┘
                           │
┌──────────────────────────▼──────────────────────────────────────┐
│                       Domain Layer                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │  Entities    │  │   Services   │  │  Interfaces  │           │
│  │ (Agent, Msg) │  │ (IAiAgent,   │  │ (IProvider   │           │
│  │              │  │  IProvider)  │  │  Facade)     │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
└─────────────┬───────────────────────┬───────────────────────────┘
              │                       │
┌─────────────▼──────────┐   ┌────────▼──────────────────────────┐
│ Infrastructure.Ai      │   │ Infrastructure.Db                 │
│ ┌────────────────────┐ │   │ ┌────────────────────────────┐    │
│ │ Provider Facades   │ │   │ │ Entity Framework           │    │
│ │ (OpenAI, Anthropic,│ │   │ │ DbContext + Migrations     │    │
│ │  DeepSeek, etc.)   │ │   │ └────────────────────────────┘    │
│ │ MCP Server Facade  │ │   └───────────────────────────────────┘
│ │ Prompt Service     │ │
│ │ Agent Factory      │ │
│ └────────────────────┘ │
└────────────────────────┘

Component Responsibilities

| Layer    | Component       | Responsibility                                                  |
|----------|-----------------|-----------------------------------------------------------------|
| Front    | Blazor WASM     | SPA UI, state management, SignalR client                        |
| API      | Endpoints       | Minimal API endpoints for CRUD operations                       |
| API      | SignalR Hubs    | Real-time events for messages, tool calls, agent status         |
| API      | AgentScheduler  | Background service managing agent execution loops               |
| API      | AgentRunService | Orchestrates single agent execution (LLM calls, tool execution) |
| Domain   | Entities        | Core business objects (Agent, Channel, Message, Trigger, etc.)  |
| Domain   | Interfaces      | Abstractions for AI agents and providers                        |
| Infra.Ai | ProviderFacades | Concrete implementations of AI provider integrations            |
| Infra.Ai | PromptService   | Builds system prompts from composable chunks                    |
| Infra.Db | DbContext       | EF Core database access and migrations                          |

Data Flow

1. Message Flow (User/Agent → Channel)

┌─────────┐     HTTP POST      ┌─────────┐     Create      ┌──────────┐
│ Front/  │ ──────────────────►│ API     │ ───────────────►│  Trigger │
│ User    │   /api/channels/   │Endpoint │  MessageTrigger │          │
└─────────┘     {id}/messages  └─────────┘                 └──────────┘
                                                              │
                                                              ▼
                                                       ┌──────────────┐
                                                       │AgentScheduler│
                                                       │ (Queue)      │
                                                       └──────────────┘
                                                              │
                                                              ▼
                                                      ┌───────────────┐
                                                      │AgentRunService│
                                                      │ - Load Data   │
                                                      │ - Call LLM    │
                                                      │ - Exec Tools  │
                                                      └───────────────┘
                                                             │
                                      ┌──────────────────────┼──────────────────────┐
                                      ▼                      ▼                      ▼
                                 ┌─────────┐            ┌─────────┐            ┌─────────┐
                                 │Provider │            │Backend  │            │MCP      │
                                 │Facade   │            │Tools    │            │Server   │
                                 └─────────┘            └─────────┘            └─────────┘
                                      │                      │                      │
                                      └──────────────────────┼──────────────────────┘
                                                             ▼
                                                      ┌──────────────┐
                                                      │   SignalR    │
                                                      │  Broadcast   │
                                                      └──────────────┘
                                                             │
                                                             ▼
                                                        ┌─────────┐
                                                        │ Front/  │
                                                        │ Clients │
                                                        └─────────┘

2. Agent Execution Flow

┌─────────────────────────────────────────────────────────────────────┐
│                         Agent Execution Loop                        │
│                                                                     │
│  1. Wait for Signal ───────────────────────────────────────┐        │
│     (AgentScheduler.WaitForSignal)                         │        │
│                                                            │        │
│  2. Load Wait Conditions & Triggers ───────────────────────┤        │
│     - Check pending wait conditions                        │        │
│     - Check queued triggers                                │        │
│                                                            │        │
│  3. Try Satisfy Wait Conditions ───────────────────────────┤        │
│     - Match triggers to wait conditions                    │        │
│     - Mark satisfied conditions                            │        │
│                                                            │        │
│  4. Execute Agent (if triggers exist) ─────────────────────┤        │
│     - Mark trigger as started                              │        │
│     - Call AgentRunService.Run()                           │        │
│                                                            │        │
│     ┌─────────────────────────────────────────────────┐    │        │
│     │  AgentRunService.Run()                          │    │        │
│     │                                                 │    │        │
│     │  a. Check approval resolutions                  │    │        │
│     │  b. Load agent run data (messages, tools, etc.) │    │        │
│     │  c. Loop (max 300 iterations):                  │    │        │
│     │     i. Call LLM (streaming thoughts)            │    │        │
│     │     ii. Save agent thoughts                     │    │        │
│     │     iii. Broadcast reasoning via SignalR        │    │        │
│     │     iv. Execute tool calls                      │    │        │
│     │     v. Check for approval requirement           │    │        │
│     │     vi. Save tool results                       │    │        │
│     │     vii. If final text, save message            │    │        │
│     │     viii. If no tools, break                    │    │        │
│     │                                                 │    │        │
│     └─────────────────────────────────────────────────┘    │        │
│                                                            │        │
│  5. Mark trigger as completed ─────────────────────────────┤        │
│                                                            │        │
│  6. Loop back to step 1 ───────────────────────────────────┘        │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
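
The inner loop of AgentRunService.Run() shown above can be pictured roughly as follows. This is a minimal sketch, not the project's implementation: `LlmTurn`, `RunResult`, `AgentRunSketch`, and the delegate-based LLM and tool calls are hypothetical stand-ins, and persistence, streaming, SignalR broadcasting, and approval checks are left out. Only the 300-iteration cap and the break conditions come from the diagram.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Fake LLM: requests one tool call on the first turn, then returns final text.
var turn = 0;
Func<IReadOnlyList<string>, Task<LlmTurn>> fakeLlm = history =>
{
    turn++;
    return Task.FromResult(turn == 1
        ? new LlmTurn(null, new[] { "get_time" })
        : new LlmTurn("It is noon.", Array.Empty<string>()));
};
Func<string, Task<string>> fakeTool = name => Task.FromResult($"{name}: 12:00");

var result = await AgentRunSketch.RunAsync(fakeLlm, fakeTool);
Console.WriteLine($"{result.FinalText} ({result.Iterations} iterations)");

// Hypothetical shape of one LLM response: optional final text plus requested tool names.
public record LlmTurn(string? FinalText, IReadOnlyList<string> ToolCalls);
public record RunResult(string? FinalText, int Iterations);

public static class AgentRunSketch
{
    public const int MaxIterations = 300; // cap taken from the diagram above

    public static async Task<RunResult> RunAsync(
        Func<IReadOnlyList<string>, Task<LlmTurn>> callLlm,
        Func<string, Task<string>> executeTool)
    {
        var history = new List<string>();
        string? finalText = null;
        var iterations = 0;

        while (iterations < MaxIterations)
        {
            iterations++;
            var response = await callLlm(history);      // i. call the LLM

            foreach (var tool in response.ToolCalls)    // iv. execute tool calls
                history.Add(await executeTool(tool));   // vi. keep tool results

            if (response.FinalText is not null)         // vii. final text becomes the message
                finalText = response.FinalText;

            if (response.ToolCalls.Count == 0)          // viii. no tools requested: stop
                break;
        }

        return new RunResult(finalText, iterations);
    }
}
```

With the fake delegates above, the loop runs twice: the first turn executes a tool, the second returns text with no tool calls and breaks out.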

3. Sequence Diagram

[Figure: Sequence Diagram]

Core Components

Domain Layer

Entities

Agent (src/Domain/Agents/Agent.cs)

  • Represents an AI agent with configuration
  • Properties: Id, Info (AgentInfo), ProviderId, Color
  • Methods: Update(), SetAvailableMcpServer(), RemoveAvailableMcpServer()

Channel (src/Domain/Channels/Channel.cs)

  • Multi-agent conversation space
  • Properties: Id, Name, Description, AgentIds[]

DirectMessageChannel (src/Domain/Chats/DirectMessageChannel.cs)

  • Private conversation between two agents, or between an agent and a user
  • Properties: Id, Agent1Id, Agent2Id

Message (src/Domain/Chats/Message.cs)

  • Chat message with author info
  • Properties: Id, Text, PostedAt, AgentId, UserId, ChannelId, DmId, AgentThoughtId

AgentThought (src/Domain/Chats/AgentThought.cs)

  • LLM response content (text, reasoning, tool calls)
  • Stores raw content from AI providers

Provider (src/Domain/Providers/Provider.cs)

  • AI provider configuration
  • Types: OpenAI, Anthropic, AzureFoundry, DeepSeek, Ollama, OpenRouter
  • Properties: Id, Name, ProviderType, ApiKey, ApiEndpoint, DefaultModel

Domain Services

IAiAgentFactory (src/Domain/AgentServices/IAiAgentFactory.cs)

  • Creates configured AI agents from Microsoft.Agents.AI

IProviderFacade (src/Domain/ProviderServices/IProviderFacade.cs)

  • Abstraction for AI provider operations
  • Methods: TestConnectionAsync(), ListModelsAsync(), CreateChatClient()

IProviderFacadeResolver (src/Domain/ProviderServices/IProviderFacadeResolver.cs)

  • Resolves appropriate facade by provider type
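
One plausible way to resolve a facade by provider type is a dictionary lookup, sketched below. The `ProviderType` values are the ones listed in this document; everything else (`IProviderFacadeSketch` with a single `Name` property, `StubFacade`, `DictionaryFacadeResolver`, and the `Resolve` method name) is hypothetical and only keeps the example self-contained. The real resolver may be wired differently, for example through dependency injection.

```csharp
using System;
using System.Collections.Generic;

var resolver = new DictionaryFacadeResolver(new Dictionary<ProviderType, IProviderFacadeSketch>
{
    [ProviderType.OpenAI] = new StubFacade("OpenAI"),
    [ProviderType.Ollama] = new StubFacade("Ollama"),
});

Console.WriteLine(resolver.Resolve(ProviderType.Ollama).Name); // prints "Ollama"

public enum ProviderType { OpenAI, Anthropic, AzureFoundry, DeepSeek, Ollama, OpenRouter }

// Hypothetical stand-in for IProviderFacade, reduced to one property.
public interface IProviderFacadeSketch { string Name { get; } }
public sealed record StubFacade(string Name) : IProviderFacadeSketch;

public sealed class DictionaryFacadeResolver
{
    private readonly IReadOnlyDictionary<ProviderType, IProviderFacadeSketch> _facades;

    public DictionaryFacadeResolver(IReadOnlyDictionary<ProviderType, IProviderFacadeSketch> facades) =>
        _facades = facades;

    // Fails loudly for provider types that have no registered facade.
    public IProviderFacadeSketch Resolve(ProviderType type) =>
        _facades.TryGetValue(type, out var facade)
            ? facade
            : throw new NotSupportedException($"No facade registered for provider type {type}.");
}
```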

Value Objects & Enums

  • AgentInfo - Agent configuration record
  • ToolType - BackEnd, FrontEnd, McpServer
  • ProviderType - Supported AI providers
  • LlmMessageContentType - Types of LLM content

API Layer

Endpoints (src/Api/Endpoints/)

| Endpoint                        | Purpose                                                |
|---------------------------------|--------------------------------------------------------|
| AuthEndpoints                   | Login, registration, token refresh                     |
| ChannelEndpoints                | CRUD for channels, add/remove agents                   |
| DmEndpoints                     | Direct message operations                              |
| ProviderEndpoints               | Provider management, model listing, connection testing |
| McpServerConfigurationEndpoints | MCP server management                                  |
| UsersEndpoints                  | User profile operations                                |

SignalR Hubs (src/Api/Channels/)

ChannelEventsHub (ChannelEventsHub.cs)

  • JoinChannel(channelId) - Subscribe to channel events
  • LeaveChannel(channelId) - Unsubscribe from channel events
  • Events: MessageAdded, ToolCallStart, ToolCallCompleted, AgentStatus, ApprovalRequest, ReasoningContent

DmEventsHub (DmEventsHub.cs)

  • Similar to ChannelEventsHub but for direct messages
  • JoinDm(dmId) / LeaveDm(dmId)

Background Services (src/Api/Background/)

AgentScheduler (AgentScheduler.cs)

  • Manages agent execution loops
  • Queues triggers via QueueTrigger()
  • Coordinates AgentSignalManager for signaling
  • Executes AgentRunService per trigger

AgentRunService (AgentRunService.cs)

  • Orchestrates single agent execution
  • Manages approval workflow
  • Handles tool execution via ToolCallRouter
  • Broadcasts events via IChannelEventBroadcaster

AgentSignalManager (AgentSignalManager.cs)

  • Coordinates signaling between scheduler and agent loops
  • Uses AsyncAutoResetEvent per (agentId, chatId) pair
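
The per-(agentId, chatId) signaling can be approximated with types from the base class library. The sketch below swaps in a `SemaphoreSlim` with a maximum count of 1 for the AsyncAutoResetEvent named above (which is not part of the BCL), so it is an analogy and not the project's code; `SignalManagerSketch`, `Signal`, and `WaitForSignalAsync` are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

var signals = new SignalManagerSketch();
var agentId = Guid.NewGuid();
var chatId = Guid.NewGuid();

signals.Signal(agentId, chatId); // e.g. a trigger was queued
signals.Signal(agentId, chatId); // a second signal coalesces with the first
await signals.WaitForSignalAsync(agentId, chatId, CancellationToken.None);
Console.WriteLine("agent loop woke up");

public sealed class SignalManagerSketch
{
    // One gate per (agent, chat) pair, created lazily on first use.
    private readonly ConcurrentDictionary<(Guid AgentId, Guid ChatId), SemaphoreSlim> _gates = new();

    private SemaphoreSlim Gate(Guid agentId, Guid chatId) =>
        _gates.GetOrAdd((agentId, chatId), _ => new SemaphoreSlim(0, 1));

    public void Signal(Guid agentId, Guid chatId)
    {
        try { Gate(agentId, chatId).Release(); }
        catch (SemaphoreFullException) { /* already signaled; auto-reset semantics keep one pending signal */ }
    }

    public Task WaitForSignalAsync(Guid agentId, Guid chatId, CancellationToken ct) =>
        Gate(agentId, chatId).WaitAsync(ct);
}
```

Keying the gates by pair means a signal for one conversation never wakes the same agent's loop in another conversation.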

ToolCallRouter (ToolExecutors/ToolCallRouter.cs)

  • Routes tool calls to appropriate executor:
    • BackendToolExecutor - Built-in backend tools
    • McpServerToolExecutor - MCP server tools
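
The routing described above amounts to a dispatch on the tool's type. The following is a minimal sketch under stated assumptions: `ToolType` and its values come from this document, while `ToolCall`, `IToolExecutorSketch`, `EchoExecutor`, `ToolCallRouterSketch`, and `RouteAsync` are hypothetical, and the real executors do far more than echo a string.

```csharp
using System;
using System.Threading.Tasks;

var router = new ToolCallRouterSketch(
    backend: new EchoExecutor("backend"),
    mcp: new EchoExecutor("mcp"));

Console.WriteLine(await router.RouteAsync(new ToolCall(ToolType.McpServer, "list_resources")));
// prints "mcp executed list_resources"

public enum ToolType { BackEnd, FrontEnd, McpServer }
public record ToolCall(ToolType Type, string Name);

// Hypothetical executor abstraction standing in for the two executors listed above.
public interface IToolExecutorSketch { Task<string> ExecuteAsync(ToolCall call); }

public sealed class EchoExecutor : IToolExecutorSketch
{
    private readonly string _label;
    public EchoExecutor(string label) => _label = label;
    public Task<string> ExecuteAsync(ToolCall call) =>
        Task.FromResult($"{_label} executed {call.Name}");
}

public sealed class ToolCallRouterSketch
{
    private readonly IToolExecutorSketch _backend;
    private readonly IToolExecutorSketch _mcp;

    public ToolCallRouterSketch(IToolExecutorSketch backend, IToolExecutorSketch mcp) =>
        (_backend, _mcp) = (backend, mcp);

    // Tool types without a server-side executor get an error result instead of an exception.
    public Task<string> RouteAsync(ToolCall call) => call.Type switch
    {
        ToolType.BackEnd => _backend.ExecuteAsync(call),
        ToolType.McpServer => _mcp.ExecuteAsync(call),
        _ => Task.FromResult($"error: no executor for tool type {call.Type}"),
    };
}
```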

Infrastructure Layer

Infrastructure.Ai (src/Infrastructure.Ai/)

Provider Facades (ProviderFacades/)

  • OpenAIProviderFacade - OpenAI API integration
  • AnthropicProviderFacade - Anthropic Claude API
  • AzureFoundryProviderFacade - Azure OpenAI / Foundry
  • DeepSeekProviderFacade - DeepSeek API
  • OllamaProviderFacade - Local Ollama models
  • OpenRouterProviderFacade - OpenRouter aggregator
  • ProviderServiceFactory - Factory for creating facades

MCP Server (Mcp/McpServerFacade.cs)

  • Model Context Protocol server integration
  • Tool discovery and execution

Prompt Service (AgentServices/Prompt/)

  • PromptService - Composes system prompts from chunks
  • Prompt chunks:
    • GeneralPromptChunk - Base system prompt
    • DoingTasksPromptChunk - Task-oriented instructions
    • ToolUsagePolicyPromptChunk - Tool usage guidelines
    • ChatRulesPromptChunk - Conversation behavior
    • AgentInfoPromptChunk - Agent identity
    • ChatParticipantsPromptChunk - Context about participants
    • ChannelManagerPromptChunk / ChannelWorkerPromptChunk - Role-specific instructions
    • FirstMessagePromptChunk - Initial conversation guidance

Context Service (AgentServices/ContextService.cs)

  • Prepares conversation context from messages and thoughts
  • Handles long-term memory integration

Long-Term Memory (AgentServices/LongTermMemories/)

  • None - No persistent memory
  • InMemory - In-process fact storage
  • Neo4j - Graph-based memory with relationship tracking

Infrastructure.Db (src/Infrastructure.Db/)

AzureOpsCrewContext (AzureOpsCrewContext.cs)

  • Entity Framework Core DbContext
  • DbSets: Agents, Channels, Dms, Messages, AgentThoughts, Providers, McpServerConfigurations, Triggers, WaitConditions, Users

Entity Type Configurations (EntityTypes/)

  • Fluent API configurations for all entities
  • Handles relationships, indexes, and constraints

Migrations (Migrations/)

  • Database schema versioning

Frontend Layer (src/Front/)

Blazor WebAssembly SPA

  • Component-based UI with Razor pages
  • SignalR client for real-time updates
  • Services for API communication:
    • AuthenticationService - JWT auth
    • AgentService - Agent management
    • ChannelService - Channel operations
    • DmService - Direct messages
    • ChatHubClient - SignalR connection
    • MarkdownService - Markdown rendering

Key Patterns

Agent Execution Flow

The agent execution follows a trigger-based pattern:

  1. Trigger Creation - A trigger is queued (e.g., new message)
  2. Wait Conditions - Agent may wait for specific conditions
  3. Agent Loop - Background service processes triggers when wait conditions are satisfied
  4. LLM Interaction - Streaming response with thoughts, reasoning, and tool calls
  5. Tool Execution - Sequential tool execution with optional approval
  6. Result Broadcasting - Real-time updates via SignalR

Trigger/WaitCondition System

Trigger Types (src/Domain/Triggers/):

  • MessageTrigger - Fired when a new message is posted
  • ToolApprovalTrigger - Fired when user approves/rejects tool execution

WaitCondition Types (src/Domain/WaitConditions/):

  • MessageWaitCondition - Agent waits for new messages after a point in time
  • ToolApprovalWaitCondition - Agent waits for user approval on MCP tool execution

Flow:

Trigger created → Queued in DB → AgentScheduler signaled
                ↓
        AgentLoop checks WaitConditions
                ↓
        If all satisfied → Execute AgentRunService
                ↓
        Mark Trigger as Completed
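
The "checks WaitConditions" step in the flow above can be sketched as matching queued triggers against pending conditions. The type names below mirror the trigger and wait-condition types listed in this section, but their members (`PostedAt`, `After`, `ToolCallId`, `Approved`, `IsSatisfiedBy`) and the `WaitMatcher` helper are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var now = new DateTimeOffset(2025, 1, 1, 12, 0, 0, TimeSpan.Zero);
var conditions = new List<WaitCondition>
{
    new MessageWaitCondition(After: now),
    new ToolApprovalWaitCondition(ToolCallId: "call-1"),
};
var triggers = new List<Trigger>
{
    new MessageTrigger(PostedAt: now.AddMinutes(1)),
};

var pending = WaitMatcher.Unsatisfied(conditions, triggers);
Console.WriteLine(pending.Count == 0
    ? "run agent"
    : $"still waiting on {pending.Count} condition(s)");

public abstract record Trigger;
public record MessageTrigger(DateTimeOffset PostedAt) : Trigger;
public record ToolApprovalTrigger(string ToolCallId, bool Approved) : Trigger;

public abstract record WaitCondition
{
    public abstract bool IsSatisfiedBy(Trigger trigger);
}

// Satisfied by any message posted after the recorded point in time.
public record MessageWaitCondition(DateTimeOffset After) : WaitCondition
{
    public override bool IsSatisfiedBy(Trigger trigger) =>
        trigger is MessageTrigger m && m.PostedAt > After;
}

// Satisfied by an approval decision (approve or reject) for the same tool call.
public record ToolApprovalWaitCondition(string ToolCallId) : WaitCondition
{
    public override bool IsSatisfiedBy(Trigger trigger) =>
        trigger is ToolApprovalTrigger t && t.ToolCallId == ToolCallId;
}

public static class WaitMatcher
{
    // Returns the conditions that no queued trigger satisfies yet.
    public static List<WaitCondition> Unsatisfied(
        IEnumerable<WaitCondition> conditions, IReadOnlyCollection<Trigger> triggers) =>
        conditions.Where(c => !triggers.Any(c.IsSatisfiedBy)).ToList();
}
```

In this example the message condition is met but the approval condition is not, so one condition remains pending and the agent would not run yet.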

Provider Abstraction

The IProviderFacade interface provides a unified API for different AI providers:

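// IChatClient is the chat abstraction from Microsoft.Extensions.AI.
public interface IProviderFacade
{
    // Checks that the configured endpoint and credentials are usable.
    Task<TestConnectionResult> TestConnectionAsync(Provider config, CancellationToken ct);
    // Lists the models the provider exposes for this configuration.
    Task<ProviderModelInfo[]> ListModelsAsync(Provider config, CancellationToken ct);
    // Creates a chat client bound to the given model.
    IChatClient CreateChatClient(Provider config, string model, CancellationToken ct);
}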

Each facade implements:

  • Provider-specific HTTP client configuration
  • API key authentication
  • Model listing and discovery
  • Chat client creation with streaming support

Tool Approval Workflow

For MCP server tools requiring approval:

┌─────────────────────────────────────────────────────────────────┐
│                      Tool Approval Flow                         │
│                                                                 │
│  1. Agent calls MCP tool requiring approval                     │
│     └─► AgentRunService detects approval requirement            │
│                                                                 │
│  2. Create ApprovalRequest thought + WaitCondition              │
│     └─► Save to DB, Signal scheduler                            │
│                                                                 │
│  3. Broadcast ApprovalRequest via SignalR                       │
│     └─► Frontend shows approval UI                              │
│                                                                 │
│  4. Agent enters "WaitingForApproval" state                     │
│     └─► Run loop breaks, returns early                          │
│                                                                 │
│  5. User approves/rejects via API                               │
│     └─► Creates ToolApprovalTrigger + ApprovalResponse thought  │
│                                                                 │
│  6. Next agent run checks for pending approvals                 │
│     └─► If approved: Execute tool, save result                  │
│     └─► If rejected: Save synthetic error result                │
│                                                                 │
│  7. Continue normal execution                                   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
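
Step 6 of the flow above, where the next run resolves a pending approval, might look like the sketch below. All names (`ApprovalDecision`, `PendingApproval`, `ToolResult`, `ApprovalSketch.ResolveAsync`) and the wording of the synthetic error are hypothetical; the sketch only illustrates the three outcomes: still pending, approved, and rejected.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;

Func<string, Task<string>> executeTool = name => Task.FromResult($"{name} finished");

var approved = await ApprovalSketch.ResolveAsync(
    new PendingApproval("restart_service", ApprovalDecision.Approved), executeTool);
var rejected = await ApprovalSketch.ResolveAsync(
    new PendingApproval("restart_service", ApprovalDecision.Rejected), executeTool);
var undecided = await ApprovalSketch.ResolveAsync(
    new PendingApproval("restart_service", ApprovalDecision.Pending), executeTool);

Console.WriteLine(approved);
Console.WriteLine(rejected);
Console.WriteLine(undecided is null ? "still waiting for approval" : "resolved");

public enum ApprovalDecision { Pending, Approved, Rejected }
public record PendingApproval(string ToolName, ApprovalDecision Decision);
public record ToolResult(string ToolName, bool IsError, string Content);

public static class ApprovalSketch
{
    // Returns null while the user has not decided, so the caller can return early.
    public static async Task<ToolResult?> ResolveAsync(
        PendingApproval approval, Func<string, Task<string>> executeTool) =>
        approval.Decision switch
        {
            // Approved: run the tool now and keep its real result.
            ApprovalDecision.Approved =>
                new ToolResult(approval.ToolName, false, await executeTool(approval.ToolName)),
            // Rejected: never run the tool; record a synthetic error the LLM can react to.
            ApprovalDecision.Rejected =>
                new ToolResult(approval.ToolName, true, "Tool execution was rejected by the user."),
            _ => null,
        };
}
```

Recording a synthetic error for a rejection, instead of dropping the call, keeps every tool call paired with a result in the conversation history.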

Real-time Communication

SignalR Events:

| Event             | Payload                                        | Triggered By                 |
|-------------------|------------------------------------------------|------------------------------|
| MessageAdded      | Message                                        | Agent posts message          |
| ToolCallStart     | ToolCallEvent                                  | Agent starts tool execution  |
| ToolCallCompleted | ToolCallEvent + Result                         | Tool execution completes     |
| AgentStatus       | Status (Idle/Running/Error/WaitingForApproval) | Agent state change           |
| ApprovalRequest   | ApprovalRequest                                | Agent requests tool approval |
| ReasoningContent  | Text + Agent                                   | Agent thinking/reasoning     |

Client Flow:

Frontend Connects → Join Channel/Dm Groups
                         ↓
              Receive Events & Update UI
                         ↓
                   User Action → API Call
                         ↓
                   Backend Processes → Broadcast Events

Technology Stack

Backend

| Component      | Technology                                  | Purpose                               |
|----------------|---------------------------------------------|---------------------------------------|
| Framework      | .NET 10 / ASP.NET Core                      | Web API, SignalR, Background Services |
| Language       | C# 13                                       | Primary language                      |
| Database       | SQL Server / PostgreSQL                     | Entity Framework Core                 |
| ORM            | Entity Framework Core 10.0                  | Database access                       |
| Authentication | JWT                                         | Token-based auth                      |
| Real-time      | SignalR                                     | WebSocket-based events                |
| Logging        | Serilog                                     | Structured logging                    |
| AI SDKs        | Microsoft.Extensions.AI, Microsoft.Agents.AI | LLM integration                      |
| OpenAPI        | Scalar                                      | API documentation                     |

Frontend

| Component        | Technology                     | Purpose              |
|------------------|--------------------------------|----------------------|
| Framework        | Blazor WebAssembly             | SPA framework        |
| UI               | Razor Components               | Component-based UI   |
| State Management | Reactive pattern (Reactive<T>) | Reactive state       |
| Real-time        | SignalR Client                 | WebSocket connection |
| Markdown         | Markdig                        | Markdown rendering   |

AI Providers

  • OpenAI - GPT-4, GPT-4.1, o1 models
  • Anthropic - Claude Sonnet and Opus models (e.g., Claude 3.5 Sonnet)
  • Azure Foundry - Azure OpenAI Service
  • DeepSeek - DeepSeek models
  • Ollama - Local model hosting
  • OpenRouter - Model aggregator

Deployment

| Platform         | Technology                        |
|------------------|-----------------------------------|
| Containerization | Docker                            |
| Orchestration    | Docker Compose                    |
| Reverse Proxy    | Nginx (frontend)                  |
| Health Checks    | ASP.NET Core Health Checks        |
| Configuration    | Environment variables, .env files |

Database Support

  • SQL Server - Default (Microsoft SQL Server 2022)
  • PostgreSQL - Alternative (via docker-compose-postgres.yml)

External Integrations

  • MCP (Model Context Protocol) - Extensible tool system
  • Brevo - Email verification (optional)
  • Neo4j - Graph-based long-term memory (optional)

Development Notes

Adding a New AI Provider

  1. Create XxxProviderFacade.cs in src/Infrastructure.Ai/ProviderFacades/
  2. Implement IProviderFacade
  3. Add provider type to ProviderType enum
  4. Register in ProviderServiceFactory.cs
  5. Update frontend provider selection

Adding a New Backend Tool

  1. Create tool class in src/Domain/Tools/BackEnd/ or src/Api/Background/Tools/
  2. Implement tool interface with GetDeclaration() method
  3. Register in BackendToolExecutor.cs
  4. Add frontend handler if UI interaction needed

Extending Prompt Chunks

  1. Create a new chunk class implementing IPromptChunk
  2. Implement its GetContent() and ShouldBeAdded() methods
  3. Add the chunk to the PromptService.PromptChunks list
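
The steps above can be sketched as follows. The interface and method names (IPromptChunk, GetContent(), ShouldBeAdded()) come from this document, but their parameters are assumptions, as are `PromptContext`, `StaticChunk`, `OnCallChunk`, `PromptServiceSketch`, and the way chunks are joined. Treat it as an illustration of conditional chunk composition, not the actual PromptService.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var service = new PromptServiceSketch(new IPromptChunk[]
{
    new StaticChunk("You are an operations agent."),
    new OnCallChunk(), // the newly added chunk
});

Console.WriteLine(service.Build(new PromptContext(OnCallEngineer: "dana")));
Console.WriteLine("---");
Console.WriteLine(service.Build(new PromptContext(OnCallEngineer: null)));

// Hypothetical per-run context handed to every chunk.
public record PromptContext(string? OnCallEngineer);

public interface IPromptChunk
{
    bool ShouldBeAdded(PromptContext context);
    string GetContent(PromptContext context);
}

// Always included, like a base system prompt.
public sealed class StaticChunk : IPromptChunk
{
    private readonly string _text;
    public StaticChunk(string text) => _text = text;
    public bool ShouldBeAdded(PromptContext context) => true;
    public string GetContent(PromptContext context) => _text;
}

// Included only when the context actually carries the data it needs.
public sealed class OnCallChunk : IPromptChunk
{
    public bool ShouldBeAdded(PromptContext context) => context.OnCallEngineer is not null;
    public string GetContent(PromptContext context) =>
        $"The on-call engineer is {context.OnCallEngineer}.";
}

public sealed class PromptServiceSketch
{
    private readonly IReadOnlyList<IPromptChunk> _chunks;
    public PromptServiceSketch(IReadOnlyList<IPromptChunk> chunks) => _chunks = chunks;

    // Keeps list order and skips chunks that opt out for this context.
    public string Build(PromptContext context) =>
        string.Join("\n\n", _chunks
            .Where(c => c.ShouldBeAdded(context))
            .Select(c => c.GetContent(context)));
}
```

Letting each chunk decide whether it applies keeps the composition code free of chunk-specific conditions.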