feat: migrate to Praxis filter-based proxy architecture #27

Open
franciscojavierarceo wants to merge 1 commit into main from feat/praxis-integration


franciscojavierarceo commented May 15, 2026

Summary

Replaces the hand-rolled Axum proxy with Praxis, a composable filter-based reverse proxy framework built on Pingora. Each gateway concern — proxying, auth, state hydration, tool dispatch, agentic looping — is an independent filter wired together via YAML configuration.

Why Praxis

  • Each filter is self-contained: it implements HttpFilter, with hooks for the request and response phases, and has no knowledge of the other filters.
  • YAML-configured pipeline — adding, removing, or reordering filters requires no code changes.
  • Native SSE streaming — Praxis/Pingora proxies the upstream response stream directly to the client. No buffering, no reqwest intermediary.
  • Hot reload — filter pipelines can be reloaded without restarting the server.
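
The isolation property described above can be illustrated with a small, std-only sketch. Everything here is hypothetical: `Filter`, `Action`, `Request`, `run_chain`, and the two example filters are illustrative names and do not reflect Praxis's actual HttpFilter trait or its signatures. The sketch only shows how filters that never reference each other can still be composed in an ordered chain.

```rust
// Hypothetical sketch: these types are NOT the Praxis API.

/// Minimal stand-in for a request moving through the chain.
#[derive(Debug, Default)]
pub struct Request {
    pub headers: Vec<(String, String)>,
    pub upstream: Option<String>,
}

/// What a filter asks the chain to do next.
pub enum Action {
    Continue,
    Reject(u16),
}

/// A filter only ever sees the request, never its neighbours.
pub trait Filter {
    fn name(&self) -> &str;
    fn on_request(&self, req: &mut Request) -> Action;
}

/// Example filter: selects the upstream address.
pub struct SetUpstream(pub String);

impl Filter for SetUpstream {
    fn name(&self) -> &str {
        "set_upstream"
    }
    fn on_request(&self, req: &mut Request) -> Action {
        req.upstream = Some(self.0.clone());
        Action::Continue
    }
}

/// Example filter: rejects requests that lack a given header.
pub struct RequireHeader(pub String);

impl Filter for RequireHeader {
    fn name(&self) -> &str {
        "require_header"
    }
    fn on_request(&self, req: &mut Request) -> Action {
        let present = req
            .headers
            .iter()
            .any(|(name, _)| name.eq_ignore_ascii_case(&self.0));
        if present {
            Action::Continue
        } else {
            Action::Reject(400)
        }
    }
}

/// Runs the filters in order and stops at the first rejection,
/// returning the status code and the name of the rejecting filter.
pub fn run_chain(filters: &[Box<dyn Filter>], req: &mut Request) -> Result<(), (u16, String)> {
    for filter in filters {
        if let Action::Reject(status) = filter.on_request(req) {
            return Err((status, filter.name().to_string()));
        }
    }
    Ok(())
}
```

In this sketch, reordering or dropping a filter means changing only the vector passed to `run_chain`, which is roughly the role the YAML pipeline plays in this PR.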

What changed

Filters introduced:

  • responses_proxy — sets ctx.upstream to vLLM's /v1/responses endpoint and injects auth credentials. Praxis/Pingora handles the actual proxying and streaming natively.
  • state_hydration — stub filter for conversation-state hydration. Inspects request body for previous_response_id and will call the state store to hydrate conversation history.
  • agentic_loop — stub filter for agentic re-inference. Inspects response body for function_call output items and will re-enter the inference loop.
  • tool_dispatch — stub filter for tool execution. Inspects response body for tool calls and will dispatch them.
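
As a rough illustration of the credential injection that responses_proxy performs, the sketch below adds a configured key only when the client has not supplied its own Authorization header, matching the precedence exercised by test_client_auth_precedence. The function name `inject_auth`, the header representation, and the `Bearer` scheme are assumptions made for this example; the actual filter code may differ.

```rust
/// Hypothetical helper, not taken from the PR: adds an Authorization header
/// from configuration unless the client already supplied one.
/// The "Bearer" scheme is an assumption for illustration.
pub fn inject_auth(headers: &mut Vec<(String, String)>, configured_key: Option<&str>) {
    // Header names are case-insensitive, so compare accordingly.
    let client_sent_auth = headers
        .iter()
        .any(|(name, _)| name.eq_ignore_ascii_case("authorization"));
    if client_sent_auth {
        // Client-supplied credentials take precedence and are left untouched.
        return;
    }
    if let Some(key) = configured_key {
        headers.push(("authorization".to_string(), format!("Bearer {key}")));
    }
}
```

With no configured key and no client header, the request is forwarded without credentials in this sketch.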

Removed:

  • src/app.rs, src/proxy.rs, src/server.rs — replaced by filters + Praxis server runtime
  • benches/proxy_bench.rs — benchmark harness for the old Axum proxy (will be re-added)

Dependencies:

  • Praxis crates (praxis, praxis-proxy-core, praxis-proxy-filter, praxis-test-utils) via git at rev 2f7ea31
  • Base URLs ending with /v1 are normalized so the upstream path does not become /v1/v1/responses
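
The base URL normalization mentioned above might look something like the following. This is a minimal sketch under stated assumptions: the helper name `responses_url` is hypothetical, and the real filter may handle additional cases (query strings, other path prefixes) that are ignored here.

```rust
/// Hypothetical helper: builds the upstream responses URL from a configured
/// base URL, tolerating a trailing slash and an existing "/v1" suffix.
pub fn responses_url(base_url: &str) -> String {
    // Drop any trailing slashes first so "/v1/" is treated like "/v1".
    let trimmed = base_url.trim_end_matches('/');
    // Remove one "/v1" suffix if present, so it is not repeated below.
    let root = trimmed.strip_suffix("/v1").unwrap_or(trimmed);
    format!("{root}/v1/responses")
}
```

Under this sketch, `http://localhost:8000`, `http://localhost:8000/`, and `http://localhost:8000/v1` all resolve to the same upstream URL.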

Docs:

  • Updated README.md with architecture diagram, filter table, and run instructions
  • Updated docs/index.md with architecture overview and Praxis context
  • Added docs/architecture/index.md with Mermaid diagram, filter pipeline reference, streaming details, and component descriptions
  • Added Architecture page to mkdocs nav

Health endpoint:

  • Provided by Praxis's built-in admin endpoint (admin: { address: "127.0.0.1:9901" } in config)

Filter pipeline

filter_chains:
  - name: agentic
    filters:
      - filter: state_hydration
        store_base_url: "http://localhost:8080"
      - filter: agentic_loop
        max_iterations: 10
      - filter: tool_dispatch
      - filter: responses_proxy
        vllm_base_url: "http://localhost:8000"
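
Since agentic_loop and tool_dispatch are still stubs in this PR, the following is only a sketch of how a `max_iterations` bound could cap re-inference once they are implemented. All names (`ModelOutput`, `run_agentic_loop`, and the `infer` and `dispatch` callbacks) are hypothetical and stand in for the model call and tool execution.

```rust
/// Hypothetical model output: either a final message or pending tool calls.
pub enum ModelOutput {
    Message(String),
    FunctionCalls(Vec<String>),
}

/// Hypothetical bounded loop: re-runs inference with the latest tool results
/// until the model returns a final message or `max_iterations` is reached.
pub fn run_agentic_loop(
    max_iterations: usize,
    mut infer: impl FnMut(&[String]) -> ModelOutput,
    mut dispatch: impl FnMut(&str) -> String,
) -> Result<String, String> {
    let mut tool_results: Vec<String> = Vec::new();
    for _ in 0..max_iterations {
        match infer(&tool_results) {
            // A plain message ends the loop.
            ModelOutput::Message(text) => return Ok(text),
            // Tool calls are executed and fed into the next inference pass.
            ModelOutput::FunctionCalls(calls) => {
                tool_results = calls.iter().map(|call| dispatch(call.as_str())).collect();
            }
        }
    }
    Err(format!("no final message after {max_iterations} iterations"))
}
```

The bound matters because a model that keeps requesting tools would otherwise loop indefinitely; with `max_iterations: 10` as in the configuration above, this sketch would give up after ten inference passes.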

Test plan

  • cargo build succeeds
  • cargo clippy --all-targets -- -D warnings clean
  • cargo fmt -- --check clean
  • pre-commit run --all-files clean
  • All 9 tests pass (2 unit + 7 integration):
    • test_non_stream_passthrough — JSON request/response round-trip
    • test_stream_passthrough — SSE streaming passthrough
    • test_auth_injection — API key injected from config
    • test_client_auth_precedence — client-supplied auth preserved
    • test_vllm_http_error_passthrough — upstream 429 forwarded
    • test_mid_stream_failure_closes_cleanly — partial stream handled
    • test_connect_error_maps_to_502 — unreachable vLLM returns 502
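
The error-handling behaviour covered by the last few tests can be summarized as a small mapping: an upstream HTTP status is forwarded unchanged, while a failure to connect becomes a 502. The sketch below is an illustration only; `UpstreamOutcome` and `client_status` are hypothetical names, and in the PR this behaviour comes from the Praxis/Pingora runtime rather than from code like this.

```rust
/// Hypothetical summary of what happened when contacting the upstream.
pub enum UpstreamOutcome {
    /// The upstream answered with this HTTP status (success or error).
    Response(u16),
    /// The upstream could not be reached at all.
    ConnectError,
}

/// Status code the client would see under the behaviour the tests describe.
pub fn client_status(outcome: &UpstreamOutcome) -> u16 {
    match outcome {
        // Upstream statuses, including errors such as 429, pass through as-is.
        UpstreamOutcome::Response(status) => *status,
        // An unreachable upstream is reported as 502 Bad Gateway.
        UpstreamOutcome::ConnectError => 502,
    }
}
```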

Replace the Axum HTTP server with Praxis as the core proxy runtime.
All request handling logic is now implemented as composable Praxis
filters (responses_proxy, ogx_state, agentic_loop, tool_dispatch),
wired together via YAML configuration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>