
feat: autoresearch marketplace foundations — worker seller flow, provenance, coordinator fixes#265

Open
bussyjd wants to merge 11 commits into main from feat/autoresearch-integration

Conversation


@bussyjd bussyjd commented Mar 12, 2026

Summary

This PR adds the first solid marketplace foundations for autoresearch on obol-stack:

  • canonical provenance metadata for optimized models/services
  • a sell-side GPU worker flow for paid autoresearch experiments
  • richer worker registration metadata for discovery/ranking
  • coordinator fixes so discovery/payment work against the current 8004scan and remote-signer APIs
  • documentation refresh to match the implemented behavior

The goal is to support three paths over time:

  1. GPU contributors selling experiment execution capacity
  2. researchers coordinating autoresearch runs across remote workers
  3. service builders publishing optimized models and selling inference/apps on top

This PR mainly strengthens the GPU seller and coordinator foundations.

What’s included

1. Canonical provenance support

Adds a consistent provenance shape across:

  • publish.py
  • --provenance-file
  • ServiceOffer.spec.provenance
  • stored inference deployment config
  • generated agent-registration.json

Canonical provenance shape:

{
  "framework": "autoresearch",
  "metricName": "val_bpb",
  "metricValue": "0.9973",
  "experimentId": "abc123def456",
  "trainHash": "sha256:deadbeefcafebabe",
  "paramCount": "50000000"
}

Key fixes:

  • removed nested metric object
  • switched to one camelCase schema
  • made trainHash explicitly sha256:...
  • made paramCount stringified consistently
  • made provenance loading strict with unknown-field rejection
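The strict-loading rule can be sketched as follows (a minimal illustration of the behavior described above; the helper name and exact validation order are assumptions, not the PR's actual loader):

```python
import json

# Canonical camelCase provenance fields from this PR's schema.
ALLOWED_FIELDS = {
    "framework", "metricName", "metricValue",
    "experimentId", "trainHash", "paramCount",
}

def load_provenance(raw: str) -> dict:
    """Hypothetical strict loader: reject unknown fields, require the
    sha256: prefix on trainHash, and keep values stringified."""
    data = json.loads(raw)
    unknown = set(data) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown provenance fields: {sorted(unknown)}")
    if not str(data.get("trainHash", "")).startswith("sha256:"):
        raise ValueError("trainHash must be prefixed with 'sha256:'")
    for key in ("metricValue", "paramCount"):
        if key in data and not isinstance(data[key], str):
            data[key] = str(data[key])  # one stringified schema throughout
    return data
```

Rejecting unknown fields is what catches leftovers of the old nested `metric` object at load time instead of letting them propagate into registrations.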

2. GPU seller flow

Adds a concrete seller-side worker implementation:

  • Dockerfile.worker
  • new embedded skill: autoresearch-worker
  • worker_api.py exposing:
    • GET /health
    • GET /healthz
    • GET /status
    • GET /best
    • GET /experiments/<id>
    • POST /experiment

Worker behavior:

  • runs one experiment at a time
  • stores results/logs on disk
  • extracts val_bpb
  • computes train_hash
  • tracks best result
  • returns 409 while busy
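The one-experiment-at-a-time gate can be sketched with the stdlib http.server approach this PR adopts (handler shape and names here are illustrative, not the PR's exact worker_api.py):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

_busy = threading.Lock()  # one experiment at a time

class WorkerHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in ("/health", "/healthz"):
            self._reply(200, {"ok": True})
        else:
            self._reply(404, {"error": "not found"})

    def do_POST(self):
        if self.path != "/experiment":
            self._reply(404, {"error": "not found"})
            return
        # Non-blocking acquire: a second submission while busy gets 409.
        if not _busy.acquire(blocking=False):
            self._reply(409, {"error": "worker busy"})
            return
        # The real worker would train in this thread, extract val_bpb,
        # compute train_hash, and store results before releasing.
        threading.Thread(target=self._run_experiment, daemon=True).start()
        self._reply(202, {"status": "accepted"})

    def _run_experiment(self):
        try:
            pass  # experiment body elided in this sketch
        finally:
            _busy.release()

    def _reply(self, code, body):
        payload = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep request logging quiet
```

Using a non-blocking lock acquire keeps the 409-while-busy semantics simple: no queue, no scheduler, just an immediate signal for the coordinator to try another worker.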

This gives a concrete path for:

  • running a GPU-backed worker
  • exposing it behind a Kubernetes Service
  • monetizing it via obol sell http

3. Richer registration metadata for GPU workers

Adds support for passing structured worker metadata at sell time:

obol sell http autoresearch-worker \
  ... \
  --register-metadata gpu=A100-80GB \
  --register-metadata framework=pytorch \
  --register-metadata best_val_bpb=1.234 \
  --register-metadata total_experiments=42

Changes:

  • cmd/obol/sell.go adds --register-metadata key=value
  • CRD adds spec.registration.metadata
  • monetize.py now:
    • includes registration.metadata in generated agent-registration.json
    • mirrors it into indexed on-chain metadata as metadata.<key>
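The flag handling and on-chain mirroring can be sketched like this (the real flag parsing lives in Go in cmd/obol/sell.go and the mirroring in monetize.py; these helper names are illustrative):

```python
def parse_register_metadata(flags: list[str]) -> dict[str, str]:
    """Fold repeated --register-metadata key=value flags into a dict."""
    out: dict[str, str] = {}
    for flag in flags:
        key, sep, value = flag.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {flag!r}")
        out[key] = value
    return out

def mirror_indexed(meta: dict[str, str]) -> dict[str, str]:
    """Prefix each key so indexers can filter on metadata.<key>."""
    return {f"metadata.{k}": v for k, v in meta.items()}
```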

This makes GPU workers easier to discover, compare, and rank.

4. Coordinator fixes for current APIs

Fixes the autoresearch coordinator to match the current external APIs:

  • 8004scan:

    • uses /api/v1/public/agents
    • handles the current data response shape
    • prefers raw_metadata.offchain_content
    • falls back to the off-chain URI only when needed
  • OASF extraction:

    • reads the actual services[] entry with name: "OASF"
  • remote-signer:

    • uses GET /api/v1/keys
    • uses POST /api/v1/sign/<address>/typed-data

This resolves the previous mismatch where the coordinator used stale 8004scan behavior and old eth2 signer endpoints.
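The preference order for off-chain content can be sketched as follows (field names are assumptions based on this PR's summary of the current 8004scan response shape):

```python
def resolve_offchain(agent: dict, fetch_uri=None):
    """Prefer inlined raw_metadata.offchain_content; fall back to
    fetching the off-chain URI only when the indexer has not inlined it."""
    raw = agent.get("raw_metadata") or {}
    content = raw.get("offchain_content")
    if content:
        return content  # already inlined by the indexer: no fetch needed
    uri = agent.get("offchain_uri")
    if uri and fetch_uri is not None:
        return fetch_uri(uri)  # hypothetical fetcher for the fallback path
    return None
```

Preferring the inlined copy keeps discovery fast and avoids a network round trip per agent during `coordinate.py discover`.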

5. Documentation refresh

Refreshes the autoresearch coordinator docs/references so they match the code:

  • current 8004scan endpoint shape
  • current signer API
  • real OASF registration structure
  • worker examples using /services/autoresearch-worker

Commits included

  • 4b992f5 feat: add autoresearch GPU worker sell-side flow
  • 3506ad4 feat: add richer GPU worker registration metadata
  • dc81c85 fix: unify canonical provenance schema
  • 2e9a436 fix: align autoresearch coordinator with current APIs
  • 71a42d2 docs: refresh autoresearch coordinator references

Validation performed locally

Validated in this environment:

  • Python unit tests for worker helper behavior
  • Python unit tests for registration metadata generation
  • Python unit tests for provenance generation
  • Python unit tests for coordinator helper logic
  • py_compile on touched Python files
  • live coordinate.py discover --limit 1 against the current 8004scan API

Notes / scope

This PR improves the marketplace foundations, but it does not yet claim:

  • full trustless GPU scheduling
  • a complete replacement for Ensue shared-state semantics
  • production-complete GPU deployment packaging
  • full end-to-end integration coverage for every marketplace journey

It is best understood as:

  • a real sell-side worker foundation
  • canonical provenance plumbing
  • coordinator compatibility fixes
  • better discovery metadata

Follow-up work

Likely next steps after this PR:

  • k3s GPU worker chart / deployment packaging
  • fuller end-to-end integration tests
  • additional worker capability metadata conventions
  • deeper coordinator/state model work beyond discovery + payment

Closes #264

bussyjd added 7 commits March 12, 2026 13:27
…app deployment pattern

Add provenance metadata to ServiceOffer CRD and CLI, new embedded skills
for autoresearch model optimization and distributed GPU coordination,
and a validated pattern for agent-deployed x402-gated web apps.

Closes #264
--per-hour was passing the raw hourly price as the per-request charge
(e.g., $0.50/hour charged $0.50 per HTTP request). Now approximates
using a 5-minute experiment budget: perRequest = perHour * 5/60.

Also rewrites worker_api.py to use stdlib http.server (no Flask dep).
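The approximation in that fix is simple enough to state as a helper (hypothetical name; only the arithmetic comes from the commit message):

```python
def per_request_from_hourly(per_hour: float, experiment_minutes: float = 5.0) -> float:
    """Approximate a per-request charge from an hourly rate, assuming
    each request runs one experiment of roughly `experiment_minutes`."""
    return per_hour * experiment_minutes / 60.0
```

For the $0.50/hour example above this yields about $0.042 per request rather than charging the full $0.50 on every HTTP request.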
@bussyjd changed the title from "feat: autoresearch integration — provenance, GPU marketplace, app deployment" to "feat: autoresearch marketplace foundations — worker seller flow, provenance, coordinator fixes" on Mar 12, 2026
@bussyjd bussyjd requested a review from OisinKyne March 13, 2026 08:55
bussyjd added 2 commits March 13, 2026 10:12
…eWarning

- Move Dockerfile.worker from repo root to
  internal/embed/skills/autoresearch-worker/docker/Dockerfile
  (build context is the skill directory, not the obol-stack root)

- Add Gateway API request timeout on HTTPRoutes when
  payment.maxTimeoutSeconds > 30s. Derived as maxTimeout + 120s
  overhead. Prevents Traefik from killing long-running GPU
  experiment requests (300s+) before the worker responds.

- Fix ResourceWarning: close HTTPError responses in coordinator
  _http_post and persona test post_json helpers.
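The ResourceWarning fix boils down to closing the response body that urllib's HTTPError carries. A sketch, assuming a stdlib-only POST helper shaped roughly like the coordinator's `_http_post` (the name and return shape here are illustrative):

```python
import json
import urllib.request
import urllib.error

def http_post_json(url: str, payload: dict) -> tuple[int, bytes]:
    """POST JSON and return (status, body); close error responses
    explicitly so their sockets do not trigger ResourceWarning."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        body = err.read()
        err.close()  # the fix: HTTPError holds an open response object
        return err.code, body
```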
Comment on lines +165 to +172
def ollama_create(model_name: str, modelfile_path: Path) -> None:
    """Register the model with Ollama."""
    print(f"Creating Ollama model: {model_name}")
    run(
        ["ollama", "create", model_name, "-f", str(modelfile_path)],
        capture=False,
    )
    print(f"Model '{model_name}' registered with Ollama")
Contributor
Is the autoresearcher only for publishing models?

Collaborator Author

No — publish.py is just one piece. The autoresearch integration covers three flows:

  1. Model optimization + publishing (this script) — autonomous train loop, selects the best checkpoint, generates provenance, publishes to Ollama, optionally sells inference via obol sell inference --provenance-file
  2. GPU worker monetization (autoresearch-worker skill + Dockerfile.worker + worker_api.py) — exposes a worker HTTP API (POST /experiment, GET /best, etc.), runs one experiment at a time, sold via obol sell http with x402 payment gating and ERC-8004 registration
  3. Distributed coordination (autoresearch-coordinator skill) — discovers GPU workers via 8004scan, submits experiments with x402 micropayments, collects results across workers, tracks a leaderboard

The upstream autoresearch project is an agent-driven LLM research loop (agent edits train.py, runs 5-min GPU experiments, measures val_bpb, iterates). autoresearch-at-home extends it with swarm coordination via Ensue shared memory (claim experiments, publish results, exchange hypotheses/insights across agents on different GPUs).

This PR integrates that ecosystem into obol-stack's marketplace: workers sell GPU compute, coordinators discover and pay them, and published models carry provenance metadata (experiment ID, train hash, metric) all the way into the ERC-8004 on-chain registration.



Development

Successfully merging this pull request may close these issues.

feat: autoresearch-at-home integration — GPU marketplace + optimized inference selling
