feat(phoenix): run real AgentV targets inside Phoenix experiments #1284

## Objective

Replace synthetic Phoenix adapter task outputs with actual AgentV target execution so Phoenix experiments represent real AgentV behavior.

## Acceptance Signals

- Phoenix experiment tasks invoke AgentV execution for the corresponding normalized test case.
- Phoenix task output matches the actual AgentV target output, not synthesized expected-output content.
- AgentV scores, assertions, verdicts, duration, cost, token usage, target, and stable `agentv_test_id` metadata are preserved in Phoenix run/evaluation metadata.
- Missing target/configuration produces a clear run error and is not reported as an evaluator failure.
- Dry-run/reference mode remains network-free and clearly separated from live execution.
- At least one live Phoenix smoke test against a deterministic example records real task runs and evaluator runs.
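
To make the signals above concrete, here is a minimal sketch of a task wrapper that delegates to AgentV and copies the result into Phoenix metadata. Every name in it (`NormalizedTestCase`, `AgentVResult`, `ExecuteTarget`, `AgentVRunError`, `runPhoenixTask`) is hypothetical and is not taken from the adapter, AgentV, or the Phoenix client. The executor is injected, so the sketch makes no claim about how AgentV execution is actually invoked.

```typescript
// Hypothetical shapes for illustration only; the real adapter types may differ.
interface NormalizedTestCase {
  agentvTestId: string;
  target?: string;
  input: string;
}

interface AgentVResult {
  output: string;
  scores: Record<string, number>;
  assertions: string[];
  verdict: "pass" | "fail";
  durationMs: number;
  costUsd: number;
  tokenUsage: { input: number; output: number };
}

// Injected so that AgentV stays the authoritative runtime.
type ExecuteTarget = (
  testCase: NormalizedTestCase & { target: string },
) => Promise<AgentVResult>;

// A run error is a distinct class, so callers can tell it apart
// from an evaluator failure (a completed run with verdict "fail").
class AgentVRunError extends Error {
  agentvTestId: string;
  constructor(message: string, agentvTestId: string) {
    super(message);
    this.name = "AgentVRunError";
    this.agentvTestId = agentvTestId;
  }
}

interface PhoenixTaskResult {
  output: string;
  metadata: Record<string, unknown>;
}

async function runPhoenixTask(
  testCase: NormalizedTestCase,
  execute: ExecuteTarget,
): Promise<PhoenixTaskResult> {
  if (!testCase.target) {
    throw new AgentVRunError(
      `No target configured for test case "${testCase.agentvTestId}"`,
      testCase.agentvTestId,
    );
  }
  const result = await execute({ ...testCase, target: testCase.target });
  return {
    // The task output is the real target output, never the expected output.
    output: result.output,
    metadata: {
      agentv_test_id: testCase.agentvTestId,
      target: testCase.target,
      scores: result.scores,
      assertions: result.assertions,
      verdict: result.verdict,
      duration_ms: result.durationMs,
      cost_usd: result.costUsd,
      token_usage: result.tokenUsage,
    },
  };
}
```

Throwing a dedicated error type, rather than returning a failing verdict, is one way to keep a missing target out of the evaluator results. The metadata key names other than `agentv_test_id` are assumptions.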

## Implementation Notes

Keep AgentV eval YAML, target execution, workspace lifecycle, and scoring authoritative. Phoenix should host experiment artifacts, not become a parallel AgentV runtime.
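
One way to keep that boundary explicit is to have the adapter build its Phoenix task from a mode value and an injected executor. This is only a sketch under assumed names: `RunMode`, `TaskOutput`, and `createTask` are hypothetical and do not come from the adapter. In dry-run mode no executor exists at all, so that path cannot reach the network or AgentV. In live mode the adapter only forwards to the executor it was given.

```typescript
// Hypothetical names; a sketch of the mode split, not the adapter's actual API.
type RunMode =
  | { kind: "dry-run" }
  | { kind: "live"; execute: (testId: string) => Promise<string> };

interface TaskOutput {
  mode: "dry-run" | "live";
  output: string;
}

function createTask(
  mode: RunMode,
  referenceOutputs: ReadonlyMap<string, string>,
): (testId: string) => Promise<TaskOutput> {
  if (mode.kind === "dry-run") {
    // Reference mode: answer from local data only.
    return async (testId) => {
      const output = referenceOutputs.get(testId);
      if (output === undefined) {
        throw new Error(`No reference output for "${testId}"`);
      }
      return { mode: "dry-run", output };
    };
  }
  // Live mode: delegate to the injected AgentV executor and add nothing else.
  const { execute } = mode;
  return async (testId) => ({ mode: "live", output: await execute(testId) });
}
```

Tagging each output with its mode keeps reference results from being mistaken for live runs when they are inspected later.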

Relevant files:

- `packages/phoenix-adapter/src/phoenix/run-experiment.ts`
- `packages/phoenix-adapter/src/run/options.ts`
- `packages/phoenix-adapter/src/run/run-suite.ts`
- `packages/phoenix-adapter/src/agentv/load-spec.ts`
- `packages/phoenix-adapter/src/phoenix/types.ts`
- `packages/phoenix-adapter/test/phoenix-run-experiment.test.ts`
- `packages/phoenix-adapter/test/agentv-execution.test.ts`

## Non-goals

- Do not reimplement workspace pooling, Docker lifecycle, matrices, or trials inside the adapter.
- Do not make Phoenix required for normal `agentv eval` execution.

