Skip to content

Commit 86cb65e

Browse files
authored
Merge pull request #304 from awf-project/feature/F078-fix-cli-provider-invocations-breaking-ch
feat(agents): fix CLI provider invocations and add streaming output support (F078)
2 parents a985070 + 8be89e7 commit 86cb65e

49 files changed

Lines changed: 2354 additions & 843 deletions

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,10 +9,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
99

1010
### Breaking Changes
1111

12+
- **F078**: CLI provider invocation flags updated to match current binary APIs — Claude and Gemini `output_format: json` now maps to `--output-format stream-json` (was `--output-format json`); Codex invocation changed from `codex --prompt "<prompt>" --quiet` to `codex exec --json "<prompt>"`; `quiet` option removed from Codex (silently ignored); Codex conversation resume changed from `codex resume <id> --prompt "<prompt>"` to `codex resume <id> --json "<prompt>"`; workflows using `output_format: json` require no YAML changes (mapping is automatic); workflows using `quiet: true` for Codex should remove the option (no-op)
1213
- **F077**: Option keys normalized to snake_case — `allowedTools` renamed to `allowed_tools`, `dangerouslySkipPermissions` renamed to `dangerously_skip_permissions` in workflow YAML; old camelCase keys are silently ignored (Go map miss); `dangerously_skip_permissions` fails closed (permissions not skipped), `allowed_tools` fails open (no tool restriction applied); update existing workflow files to use the new snake_case keys
1314

1415
### Added
1516

17+
- **F078**: OpenCode `--model` flag support — `model` option in workflow YAML now passed as `--model <value>` to OpenCode CLI in both `Execute` and `ExecuteConversation`; OpenCode always passes `--format json` for structured output
1618
- **F077**: `dangerously_skip_permissions` support for Gemini (`--approval-mode=yolo`) and Codex (`--yolo`) providers — unified permission bypass key works across all three agent providers (Claude, Gemini, Codex)
1719
- **F076**: `awf upgrade` self-update command — checks latest release on GitHub, downloads platform-specific binary, verifies SHA256 checksum, and atomically replaces the current executable; `--check` reports available updates without installing; `--version v0.5.0` installs a specific version; `--force` upgrades even if already on latest or running a dev build; heuristic warning when binary appears managed by a package manager (homebrew, apt, snap, nix); cross-filesystem fallback (copy + chmod) when `os.Rename` fails; `GITHUB_TOKEN` env var supported for rate-limited environments
1820

@@ -22,6 +24,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
2224

2325
### Removed
2426

27+
- **F078**: Dead validation helpers `validatePrompt()`, `validateContext()`, `validateState()` removed from agent helpers — all were no-ops or unreachable after provider refactoring
2528
- **F077**: Dead helper functions `getWorkflowID()` and `getStepName()` removed from agent helpers — keys `workflowID`/`stepName` were never injected by any caller; `workflow` and `step` fields removed from Claude provider audit log (redundant with execution service context)
2629

2730
## [0.6.0] - 2026-04-05

CLAUDE.md

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -217,7 +217,6 @@ func TestWorkflowValidation(t *testing.T) {
217217

218218
## Architecture Rules
219219

220-
- Apply bug fixes uniformly across all components implementing the same pattern; verify path resolution consistency across all executors when fixing one
221220
- Never modify production code in test-only fixes; bugs discovered during testing must be documented in Bug Escalation Protocol (.specify/implementation/ISSUE/bug/) before implementing fixes
222221
- Document discovered runtime bugs in .specify/implementation/ISSUE/bug/ directory before implementation; prevents scope creep and enables separate tracking from test fixes
223222
- Own timeout responsibility in application layer via context.WithTimeout; infrastructure adapters must respect context cancellation without enforcing additional timeouts
@@ -239,6 +238,7 @@ func TestWorkflowValidation(t *testing.T) {
239238
- Initialize ApproximationTokenizer immediately before NewConversationManager in interfaces layer; token counting must be ready before conversation context is established
240239
- Resolve user-provided interpolated variables before registry or dependency lookups to enable dynamic selection at runtime; apply identical resolution logic across all related code paths
241240
- Implement per-provider flag mapping without shared abstraction when CLI syntax diverges fundamentally; document divergence (Claude: --flag-name, Gemini: --flag-name=value, Codex: --flag-name) inline
241+
- Synchronize provider CLI flag changes across both implementation files and central options configuration (options.go); verify declarations and validation rules align
242242

243243
## Common Pitfalls
244244

@@ -306,6 +306,3 @@ func TestWorkflowValidation(t *testing.T) {
306306
- Extract repeated test assertion patterns (>5 duplicates) into table-driven or closure-based helpers to eliminate code duplication
307307
- Extract HTTP server setup patterns from integration tests into helper functions; eliminate duplication across multiple test functions
308308
- When flipping integration test assertions for newly-enabled features, transition from 'not configured' errors to provider-level implementation errors; verify assertions change state, not disappear
309-
310-
## Review Standards
311-

internal/application/agent_step_test.go

Lines changed: 16 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@ import (
44
"context"
55
"errors"
66
"fmt"
7+
"io"
78
"strings"
89
"testing"
910
"time"
@@ -186,7 +187,7 @@ func TestExecutionService_AgentStep_BasicExecution(t *testing.T) {
186187

187188
registry := mocks.NewMockAgentRegistry()
188189
claude := mocks.NewMockAgentProvider("claude")
189-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
190+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
190191
if prompt == "Summarize this text" {
191192
return &workflow.AgentResult{
192193
Provider: "claude",
@@ -265,7 +266,7 @@ func TestExecutionService_AgentStep_WithOnFailure(t *testing.T) {
265266

266267
registry := mocks.NewMockAgentRegistry()
267268
claude := mocks.NewMockAgentProvider("claude")
268-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
269+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
269270
if prompt == "Summarize this text" {
270271
return &workflow.AgentResult{
271272
Provider: "claude",
@@ -348,7 +349,7 @@ func TestExecutionService_AgentStep_InMixedWorkflow(t *testing.T) {
348349

349350
registry := mocks.NewMockAgentRegistry()
350351
claude := mocks.NewMockAgentProvider("claude")
351-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
352+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
352353
if prompt == "Analyze the prepared data" {
353354
return &workflow.AgentResult{
354355
Provider: "claude",
@@ -431,7 +432,7 @@ func TestExecutionService_AgentStep_StepTimeout(t *testing.T) {
431432

432433
registry := mocks.NewMockAgentRegistry()
433434
claude := mocks.NewMockAgentProvider("claude")
434-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
435+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
435436
if prompt == "Summarize this text" {
436437
return &workflow.AgentResult{
437438
Provider: "claude",
@@ -501,7 +502,7 @@ func TestExecutionService_AgentStep_AgentTimeout(t *testing.T) {
501502

502503
registry := mocks.NewMockAgentRegistry()
503504
claude := mocks.NewMockAgentProvider("claude")
504-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
505+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
505506
if prompt == "Summarize this text" {
506507
return &workflow.AgentResult{
507508
Provider: "claude",
@@ -574,7 +575,7 @@ func TestExecutionService_AgentStep_ContextCancellation(t *testing.T) {
574575

575576
registry := mocks.NewMockAgentRegistry()
576577
claude := mocks.NewMockAgentProvider("claude")
577-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
578+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
578579
return nil, context.Canceled
579580
})
580581
_ = registry.Register(claude)
@@ -644,7 +645,7 @@ func TestExecutionService_AgentStep_InParallelBranches(t *testing.T) {
644645
registry := mocks.NewMockAgentRegistry()
645646

646647
claude := mocks.NewMockAgentProvider("claude")
647-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
648+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
648649
if prompt == "Analyze sentiment" {
649650
return &workflow.AgentResult{
650651
Provider: "claude",
@@ -667,7 +668,7 @@ func TestExecutionService_AgentStep_InParallelBranches(t *testing.T) {
667668
_ = registry.Register(claude)
668669

669670
gemini := mocks.NewMockAgentProvider("gemini")
670-
gemini.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
671+
gemini.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
671672
if prompt == "Extract keywords" {
672673
return &workflow.AgentResult{
673674
Provider: "gemini",
@@ -750,7 +751,7 @@ func TestExecutionService_AgentStep_PromptInterpolation(t *testing.T) {
750751
registry := mocks.NewMockAgentRegistry()
751752
claude := mocks.NewMockAgentProvider("claude")
752753
// Note: mock resolver doesn't interpolate, so prompt stays as-is
753-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
754+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
754755
if prompt == "Explain {{inputs.topic}} in simple terms" {
755756
return &workflow.AgentResult{
756757
Provider: "claude",
@@ -829,7 +830,7 @@ func TestExecutionService_AgentStep_MultipleProviders(t *testing.T) {
829830
registry := mocks.NewMockAgentRegistry()
830831

831832
claude := mocks.NewMockAgentProvider("claude")
832-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
833+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
833834
if prompt == "Analyze this code" {
834835
return &workflow.AgentResult{
835836
Provider: "claude",
@@ -852,7 +853,7 @@ func TestExecutionService_AgentStep_MultipleProviders(t *testing.T) {
852853
_ = registry.Register(claude)
853854

854855
gemini := mocks.NewMockAgentProvider("gemini")
855-
gemini.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
856+
gemini.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
856857
if prompt == "Review the analysis" {
857858
return &workflow.AgentResult{
858859
Provider: "gemini",
@@ -1004,7 +1005,7 @@ func TestExecutionService_Resume_WithAgentStep(t *testing.T) {
10041005

10051006
registry := mocks.NewMockAgentRegistry()
10061007
claude := mocks.NewMockAgentProvider("claude")
1007-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
1008+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
10081009
if prompt == "Summarize this text" {
10091010
return &workflow.AgentResult{
10101011
Provider: "claude",
@@ -1073,7 +1074,7 @@ func TestExecutionService_AgentStep_ContinueOnError(t *testing.T) {
10731074

10741075
registry := mocks.NewMockAgentRegistry()
10751076
claude := mocks.NewMockAgentProvider("claude")
1076-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
1077+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
10771078
if prompt == "Summarize this text" {
10781079
return &workflow.AgentResult{
10791080
Provider: "claude",
@@ -1252,7 +1253,7 @@ func TestExecutionService_AgentStep_ExecutionError(t *testing.T) {
12521253

12531254
registry := mocks.NewMockAgentRegistry()
12541255
claude := mocks.NewMockAgentProvider("claude")
1255-
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
1256+
claude.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
12561257
return nil, errors.New("network connection failed")
12571258
})
12581259
_ = registry.Register(claude)
@@ -1393,7 +1394,7 @@ func TestExecutionService_AgentStep_ProviderInterpolation(t *testing.T) {
13931394
registry := mocks.NewMockAgentRegistry()
13941395
for _, name := range tt.registeredNames {
13951396
provider := mocks.NewMockAgentProvider(name)
1396-
provider.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
1397+
provider.SetExecuteFunc(func(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
13971398
return &workflow.AgentResult{
13981399
Provider: name,
13991400
Output: "Result from " + name,

internal/application/conversation_manager.go

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@ import (
44
"context"
55
"errors"
66
"fmt"
7+
"io"
78
"strings"
89

910
"github.com/awf-project/cli/internal/domain/ports"
@@ -118,14 +119,15 @@ func (m *ConversationManager) executeTurn(
118119
state *workflow.ConversationState,
119120
prompt string,
120121
options map[string]any,
122+
stdoutW, stderrW io.Writer,
121123
) (*workflow.ConversationResult, error) {
122124
select {
123125
case <-ctx.Done():
124126
return nil, ctx.Err()
125127
default:
126128
}
127129

128-
result, err := provider.ExecuteConversation(ctx, state, prompt, options)
130+
result, err := provider.ExecuteConversation(ctx, state, prompt, options, stdoutW, stderrW)
129131
if err != nil {
130132
return nil, err
131133
}
@@ -197,6 +199,7 @@ func (m *ConversationManager) ExecuteConversation(
197199
config *workflow.ConversationConfig,
198200
execCtx *workflow.ExecutionContext,
199201
buildContext ContextBuilderFunc,
202+
stdoutW, stderrW io.Writer,
200203
) (*workflow.ConversationResult, error) {
201204
if err := m.validateConversationInputs(step, config); err != nil {
202205
return nil, err
@@ -233,7 +236,7 @@ func (m *ConversationManager) ExecuteConversation(
233236

234237
var lastResult *workflow.ConversationResult
235238
for turnCount := 0; turnCount < maxTurns; turnCount++ {
236-
result, err := m.executeTurn(ctx, provider, state, resolvedPrompt, options)
239+
result, err := m.executeTurn(ctx, provider, state, resolvedPrompt, options, stdoutW, stderrW)
237240
if err != nil {
238241
return nil, err
239242
}

internal/application/conversation_manager_helpers_test.go

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,7 @@ package application
33
import (
44
"context"
55
"errors"
6+
"io"
67
"testing"
78

89
"github.com/awf-project/cli/internal/domain/ports"
@@ -29,7 +30,7 @@ func (m *mockAgentProvider) Name() string {
2930
return m.name
3031
}
3132

32-
func (m *mockAgentProvider) Execute(ctx context.Context, prompt string, options map[string]any) (*workflow.AgentResult, error) {
33+
func (m *mockAgentProvider) Execute(ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer) (*workflow.AgentResult, error) {
3334
return nil, nil // Not used in conversation manager tests
3435
}
3536

@@ -38,6 +39,7 @@ func (m *mockAgentProvider) ExecuteConversation(
3839
state *workflow.ConversationState,
3940
prompt string,
4041
options map[string]any,
42+
stdout, stderr io.Writer,
4143
) (*workflow.ConversationResult, error) {
4244
if m.err != nil {
4345
return nil, m.err
@@ -433,7 +435,7 @@ func TestConversationManager_executeTurn(t *testing.T) {
433435
cancel() // Cancel immediately
434436
}
435437

436-
result, err := mgr.executeTurn(ctx, provider, tt.state, tt.prompt, tt.options)
438+
result, err := mgr.executeTurn(ctx, provider, tt.state, tt.prompt, tt.options, nil, nil)
437439

438440
if tt.expectError {
439441
assert.Error(t, err)

0 commit comments

Comments
 (0)