Maintainer: John O'Hare · Upstream IP: Melvin Carvalho (JSS, DID:Nostr) · MAINTAINERS.md
Why VisionClaw? · Quick Start · Capabilities · Architecture · Performance · Documentation
92 CUDA kernels · GPU clustering, anomaly detection and PageRank · Multi-user immersive XR · 83 agent skills · OWL 2 ontology governance · Nostr DID identity · Solid Pod sovereignty
VisionClaw is an open-source knowledge engineering platform that transforms organisations into governed agentic meshes. It ingests knowledge from Logseq notebooks via GitHub, reasons over it with an OWL 2 EL inference engine (Whelk-rs), renders the result as an interactive 3D graph where nodes attract or repel based on semantic relationships, and exposes everything to AI agents through 7 Model Context Protocol tools. Users collaborate in the same space through multi-user XR presence, spatial voice, and immersive graph exploration.
Every agent decision is semantically grounded, every mutation passes consistency checking, and every reasoning chain is auditable from edge case back to first principles. Governance isn't an inhibitor; it's an accelerant.
73% of frontline AI adoption happens without management sign-off. Your workforce is already building shadow workflows, stitching together AI agents, automating procurement shortcuts, inventing cross-functional pipelines that don't appear on any org chart. The question isn't whether your organisation is becoming an agentic mesh. It's whether you'll shape how it forms.
The personal agent revolution has a governance problem. Autonomous AI agents are powerful, popular, and ready to act. They've also shown what happens without shared semantics, formal reasoning, or organisational guardrails: unauthorised actions, prompt injection attacks, and enterprises deploying security scanners just to detect rogue agent instances on their own networks.
When agents know their authority boundary and surface exceptions cleanly, the 90% of decisions that don't need human judgment flow without friction. The 10% that do get clean, contextualised escalation with full provenance.
VisionClaw is the knowledge engineering substrate of the VisionFlow coordination platform — the federated mesh where autonomous agents, human judgment, and institutional knowledge collaborate through shared protocols and self-sovereign data.
GPU-accelerated force-directed graph — 934 nodes responding to spring, repulsion, and ontology-driven semantic forces in real time
Chloe Nevitt interacting with Prof Rob Aspin's precursor to VisionClaw in the Octave Multimodal Lab, University of Salford, 2017
git clone https://github.com/DreamLab-AI/VisionClaw.git
cd VisionClaw && cp .env.example .env
docker-compose --profile dev up -d
| Service | URL | Description |
|---|---|---|
| Frontend | http://localhost:3001 | 3D knowledge graph interface (via Nginx) |
| API | http://localhost:4000/api | REST + WebSocket endpoints (Rust/Actix-web) |
| Solid Pod | http://localhost:8484 | Embedded Solid pod server (solid-pod-rs) |
Enable voice routing (LiveKit + whisper + TTS)
docker-compose -f docker-compose.yml -f docker-compose.voice.yml --profile dev up -d
Adds LiveKit SFU (port 7880), turbo-whisper STT (CUDA), and Kokoro TTS. Requires GPU for real-time transcription.
Enable multi-user XR (Vircadia World Server)
docker-compose -f docker-compose.yml -f docker-compose.vircadia.yml --profile dev up -d
Adds Vircadia World Server with avatar sync, HRTF spatial audio, and collaborative graph editing.
Native Rust + CUDA build
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
git clone https://github.com/DreamLab-AI/VisionClaw.git
cd VisionClaw && cp .env.example .env
cargo build --release --features gpu
cd client && npm install && npm run build && cd ..
./target/release/webxr
Requires CUDA 13.1 toolkit. See Deployment Guide for full GPU setup.
flowchart TB
subgraph Layer3["LAYER 3 — DECLARATIVE GOVERNANCE"]
JB["Judgment Broker\n(Human-in-the-Loop)"]
Policy["AI-Enforced Policies\nBias · Security · Alignment"]
Trust["Cascading Trust\nNostr DID Identity"]
end
subgraph Layer2["LAYER 2 — ORCHESTRATION"]
Skills["83 Agent Skills\nClaude-Flow DAG Pipelines"]
Ontology["OWL 2 EL Reasoning\nWhelk-rs Inference Engine"]
MCP["7 MCP Tools\nKnowledge Graph Read/Write"]
GPU["GPU Compute\n92 CUDA Kernels"]
end
subgraph Layer1["LAYER 1 — DISCOVERY ENGINE"]
Ingest["Knowledge Ingestion\nLogseq · GitHub · RSS"]
Graph["Oxigraph + SQLite\n+ RuVector pgvector Memory"]
Viz["3D Visualisation\nR3F · Babylon.js · WebXR"]
Voice["Voice Routing\n4-Plane Architecture"]
end
Layer1 -->|"Insights bubble up"| Layer2
Layer2 -->|"Exceptions surface"| Layer3
Layer3 -->|"Governance flows down"| Layer2
Layer2 -->|"Validated workflows deploy"| Layer1
style Layer3 fill:#1A0A2A,stroke:#8B5CF6
style Layer2 fill:#0A1A2A,stroke:#00D4FF
style Layer1 fill:#0A2A1A,stroke:#10B981
Semantic Governance · GPU-Accelerated Physics · Agent Skills + MCP · Multi-User Immersive XR · Self-Sovereign Identity · Voice Routing (4-Plane Architecture)
How shadow workflows become sanctioned organisational intelligence:
flowchart LR
D["DISCOVERY\nPassive agent monitoring\ndetects the pattern"]
C["CODIFICATION\nMaps the new path\nas a proposed DAG —\nOWL 2 formalised\nwith provenance"]
V["VALIDATION\nThe Judgment Broker\nreviews for strategic\nfit & bias"]
I["INTEGRATION\nPromoted to live mesh\nwith SLAs, ownership,\nquality"]
A["AMPLIFICATION\nMesh propagates\npattern to other\nteams where it applies"]
D --> C --> V --> I --> A
style D fill:#0A2A1A,stroke:#10B981
style C fill:#0A1A2A,stroke:#00D4FF
style V fill:#1A0A2A,stroke:#8B5CF6
style I fill:#0A1A2A,stroke:#00D4FF
style A fill:#0A2A1A,stroke:#10B981
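The five stages in the diagram can be read as a linear state machine with one human gate. The following is a minimal Rust sketch of that reading, not VisionClaw's implementation: the `Stage` enum and `next` method are hypothetical names, and holding a rejected workflow at Validation is an assumption made for illustration.

```rust
/// Lifecycle stages from the diagram. Names are illustrative and are not
/// taken from the VisionClaw source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Discovery,
    Codification,
    Validation,
    Integration,
    Amplification,
}

impl Stage {
    /// Advance one step. `approved` only matters at Validation, where the
    /// Judgment Broker decides. Assumption: a rejection keeps the workflow
    /// at Validation rather than discarding it.
    pub fn next(self, approved: bool) -> Stage {
        match self {
            Stage::Discovery => Stage::Codification,
            Stage::Codification => Stage::Validation,
            Stage::Validation if approved => Stage::Integration,
            Stage::Validation => Stage::Validation,
            Stage::Integration => Stage::Amplification,
            // Amplification is terminal in this sketch.
            Stage::Amplification => Stage::Amplification,
        }
    }
}
```

Modelling the human gate as an explicit argument keeps the point of escalation visible in the type of the transition function.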
Agents publish structured Nostr events; the relay routes them; the forum renders decision surfaces; humans respond with cryptographically signed events. The governance audit trail is immutable by construction.
| Kind | Name | Flow |
|---|---|---|
| 31400 | PanelDefinition | Agent → declares a control panel |
| 31401 | PanelState | Agent → current data snapshot |
| 31402 | ActionRequest | Agent → requests a human decision |
| 31403 | ActionResponse | Human → approve/reject (NIP-98 signed) |
| 31404 | PanelUpdate | Agent → incremental state diff |
| 31405 | PanelRetired | Agent → retires a control panel |
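As an illustration of the table, the kinds can be modelled as a typed enum so that unknown kinds are rejected at the boundary. This is a sketch under assumptions: `PanelKind`, `Author`, `from_kind`, and `author` are hypothetical names, and only the kind numbers and flow directions come from the table.

```rust
/// Governance event kinds from the table (31400-31405).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PanelKind {
    PanelDefinition = 31400,
    PanelState = 31401,
    ActionRequest = 31402,
    ActionResponse = 31403,
    PanelUpdate = 31404,
    PanelRetired = 31405,
}

/// Who originates an event of a given kind, per the "Flow" column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Author {
    Agent,
    Human,
}

impl PanelKind {
    /// Map a raw Nostr kind number to a known governance kind.
    pub fn from_kind(kind: u32) -> Option<PanelKind> {
        match kind {
            31400 => Some(PanelKind::PanelDefinition),
            31401 => Some(PanelKind::PanelState),
            31402 => Some(PanelKind::ActionRequest),
            31403 => Some(PanelKind::ActionResponse),
            31404 => Some(PanelKind::PanelUpdate),
            31405 => Some(PanelKind::PanelRetired),
            _ => None,
        }
    }

    /// Only ActionResponse is authored by a human; every other kind is
    /// published by an agent.
    pub fn author(self) -> Author {
        match self {
            PanelKind::ActionResponse => Author::Human,
            _ => Author::Agent,
        }
    }
}
```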
7 MCP Ontology Tools
| Tool | Purpose |
|---|---|
| ontology_discover | Semantic keyword search with Whelk inference expansion |
| ontology_read | Enriched note with axioms, relationships, schema context |
| ontology_query | Validated Cypher execution with schema-aware label checking |
| ontology_traverse | BFS graph traversal from starting IRI |
| ontology_propose | Create/amend notes → consistency check → GitHub PR |
| ontology_validate | Axiom consistency check against Whelk reasoner |
| ontology_status | Service health and statistics |
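To make the `ontology_traverse` row concrete, here is a minimal breadth-first traversal over an in-memory adjacency map keyed by IRI. It is a sketch only: the function name `traverse_bfs`, the `max_depth` parameter, and the adjacency-map representation are assumptions, and the real tool's parameters and backing store are not shown in this document.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first traversal from `start`, following outgoing edges for at
/// most `max_depth` hops. Returns IRIs in the order they were visited.
/// Hypothetical helper; not the MCP tool's actual signature.
pub fn traverse_bfs(
    edges: &HashMap<String, Vec<String>>,
    start: &str,
    max_depth: usize,
) -> Vec<String> {
    let mut visited: HashSet<String> = HashSet::new();
    let mut order: Vec<String> = Vec::new();
    let mut queue: VecDeque<(String, usize)> = VecDeque::new();

    visited.insert(start.to_string());
    queue.push_back((start.to_string(), 0));

    while let Some((iri, depth)) = queue.pop_front() {
        order.push(iri.clone());
        if depth == max_depth {
            continue; // do not expand beyond the depth limit
        }
        if let Some(neighbours) = edges.get(&iri) {
            for n in neighbours {
                // `insert` returns false for already-seen IRIs, which
                // prevents cycles from re-queueing a node.
                if visited.insert(n.clone()) {
                    queue.push_back((n.clone(), depth + 1));
                }
            }
        }
    }
    order
}
```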
Binary WebSocket Protocol (V2/V3)
High-frequency position updates use a compact binary protocol instead of JSON, achieving 80% bandwidth reduction.
V2 Standard (36 bytes/node) — production default:
| Bytes | Field | Type | Description |
|---|---|---|---|
| 0–3 | Node ID | u32 | Flag bits 26-31 encode node type |
| 4–15 | Position (X/Y/Z) | f32×3 | World-space position |
| 16–27 | Velocity (X/Y/Z) | f32×3 | Physics velocity |
| 28–31 | SSSP distance | f32 | Shortest-path from source |
| 32–35 | Timestamp | u32 | ms since session start |
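The V2 layout can be sketched as a fixed 36-byte encode/decode pair. The field offsets follow the table; little-endian byte order is an assumption (the document does not state endianness), and the `NodeV2` type and its method names are hypothetical.

```rust
/// Size of one V2 node record, per the table.
pub const V2_RECORD_LEN: usize = 36;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NodeV2 {
    pub id: u32,            // bytes 0-3, flag bits 26-31 carry node type
    pub position: [f32; 3], // bytes 4-15
    pub velocity: [f32; 3], // bytes 16-27
    pub sssp_distance: f32, // bytes 28-31
    pub timestamp_ms: u32,  // bytes 32-35, ms since session start
}

/// Copy four bytes starting at `offset` into a fixed array.
fn word(buf: &[u8], offset: usize) -> [u8; 4] {
    [buf[offset], buf[offset + 1], buf[offset + 2], buf[offset + 3]]
}

impl NodeV2 {
    /// Serialise to 36 bytes. Little-endian is assumed, not documented.
    pub fn encode(&self) -> [u8; V2_RECORD_LEN] {
        let mut out = [0u8; V2_RECORD_LEN];
        out[0..4].copy_from_slice(&self.id.to_le_bytes());
        for axis in 0..3 {
            let p = 4 + axis * 4;
            let v = 16 + axis * 4;
            out[p..p + 4].copy_from_slice(&self.position[axis].to_le_bytes());
            out[v..v + 4].copy_from_slice(&self.velocity[axis].to_le_bytes());
        }
        out[28..32].copy_from_slice(&self.sssp_distance.to_le_bytes());
        out[32..36].copy_from_slice(&self.timestamp_ms.to_le_bytes());
        out
    }

    /// Parse one record; returns None if the buffer is too short.
    pub fn decode(buf: &[u8]) -> Option<NodeV2> {
        if buf.len() < V2_RECORD_LEN {
            return None;
        }
        let f = |offset: usize| f32::from_le_bytes(word(buf, offset));
        Some(NodeV2 {
            id: u32::from_le_bytes(word(buf, 0)),
            position: [f(4), f(8), f(12)],
            velocity: [f(16), f(20), f(24)],
            sssp_distance: f(28),
            timestamp_ms: u32::from_le_bytes(word(buf, 32)),
        })
    }
}
```

A frame carrying several nodes would then be a concatenation of such records, which could be walked with `chunks_exact(V2_RECORD_LEN)`; any framing header is outside what the table specifies.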
V3 Analytics (48 bytes/node) — includes GPU analytics:
Adds cluster_id (u16), anomaly_score (f32), community_id (u16), page_rank (f32) at bytes 36–47.
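The four V3 fields total 2 + 4 + 2 + 4 = 12 bytes, which matches the 36–47 range. A sketch of reading them follows, assuming the fields are packed in the order listed with no padding and in little-endian order; neither detail is confirmed by this document, and `AnalyticsV3` and `decode_analytics_v3` are hypothetical names.

```rust
/// Size of one V3 node record: the 36-byte V2 record plus 12 analytics bytes.
pub const V3_RECORD_LEN: usize = 48;

/// Analytics fields appended by V3. Offsets assume listed order, packed.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AnalyticsV3 {
    pub cluster_id: u16,    // assumed bytes 36-37
    pub anomaly_score: f32, // assumed bytes 38-41
    pub community_id: u16,  // assumed bytes 42-43
    pub page_rank: f32,     // assumed bytes 44-47
}

/// Read the analytics tail of a single 48-byte V3 record.
/// Returns None if the slice is shorter than one record.
pub fn decode_analytics_v3(record: &[u8]) -> Option<AnalyticsV3> {
    if record.len() < V3_RECORD_LEN {
        return None;
    }
    let u16_at = |o: usize| u16::from_le_bytes([record[o], record[o + 1]]);
    let f32_at = |o: usize| {
        f32::from_le_bytes([record[o], record[o + 1], record[o + 2], record[o + 3]])
    };
    Some(AnalyticsV3 {
        cluster_id: u16_at(36),
        anomaly_score: f32_at(38),
        community_id: u16_at(42),
        page_rank: f32_at(44),
    })
}
```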
Agent skill domains (83 skills)
Creative Production — Script, storyboard, shot-list, grade & publish workflows. ComfyUI orchestration for image, video, and 3D asset generation.
Research & Synthesis — Multi-source ingestion, GraphRAG, semantic clustering, Perplexity integration.
Knowledge Codification — Tacit-to-explicit extraction; OWL concept mapping; Logseq-formatted output.
Governance & Audit — Bias detection, provenance chains (content-addressed beads), declarative policy enforcement.
Workflow Discovery — Shadow workflow detection; DAG proposal & validation against ontology.
Spatial & Immersive — XR scene graph, light field, WebXR rendering agent, Blender MCP, ComfyUI SAM3D.
Identity & Trust — DID management, key rotation, Nostr agent communications, NIP-26 delegation.
Development & Quality — Rust development, pair programming, agentic QE fleet (111+ sub-agents), GitHub code review.
Infrastructure & DevOps — Docker management, Kubernetes ops, Linux admin, network analysis, monitoring.
Node geometry and material system
| Node Type | Geometry | Material | ID Encoding |
|---|---|---|---|
| Knowledge (public pages) | Icosahedron r=0.5 | GemNodeMaterial — analytics-driven colour | Bit 30 set (0x40000000) |
| Ontology | Sphere r=0.5 | CrystalOrbMaterial — depth-pulsing cosmic spectrum | Bits 26-28 set (0x1C000000) |
| Agent | Capsule r=0.3 h=0.6 | AgentCapsuleMaterial — bioluminescent heartbeat | Bit 31 set (0x80000000) |
| Linked pages | Icosahedron r=0.35 | GemNodeMaterial | No flag bits |
Agent visual states: #10b981 (idle) · #fbbf24 (spawning/active) · #ef4444 (error) · #f97316 (busy).
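The ID encoding column can be illustrated with a small classifier over the flag bits. This is a sketch under stated assumptions: the order in which flags are checked, the treatment of any set bit in 26-28 as "ontology", and the helper names `node_type` and `base_id` are all hypothetical. Bit 29 is not described in the table and is simply masked off here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeType {
    Agent,
    Knowledge,
    Ontology,
    LinkedPage,
}

const AGENT_FLAG: u32 = 0x8000_0000; // bit 31
const KNOWLEDGE_FLAG: u32 = 0x4000_0000; // bit 30
const ONTOLOGY_MASK: u32 = 0x1C00_0000; // bits 26-28
const FLAG_MASK: u32 = 0xFC00_0000; // bits 26-31

/// Classify a raw node ID by its flag bits.
/// Assumption: agent takes precedence over knowledge, then ontology.
pub fn node_type(raw_id: u32) -> NodeType {
    if (raw_id & AGENT_FLAG) != 0 {
        NodeType::Agent
    } else if (raw_id & KNOWLEDGE_FLAG) != 0 {
        NodeType::Knowledge
    } else if (raw_id & ONTOLOGY_MASK) != 0 {
        NodeType::Ontology
    } else {
        NodeType::LinkedPage
    }
}

/// Strip the flag bits, leaving the 26-bit identifier.
pub fn base_id(raw_id: u32) -> u32 {
    raw_id & !FLAG_MASK
}
```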
Voice routing (4-plane architecture)
| Plane | Direction | Scope | Trigger |
|---|---|---|---|
| 1 | User mic → turbo-whisper STT → Agent | Private | PTT held |
| 2 | Agent → Kokoro TTS → User ear | Private | Agent responds |
| 3 | User mic → LiveKit SFU → All users | Public (spatial) | PTT released |
| 4 | Agent TTS → LiveKit → All users | Public (spatial) | Agent configured public |
Opus 48kHz mono end-to-end. HRTF spatial panning from Vircadia entity positions.
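The plane selection in the table reduces to two boolean decisions, one per audio source. A minimal sketch follows; the `Plane` enum and the routing function names are hypothetical, and only the plane numbers, triggers, and scopes are taken from the table.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Plane {
    UserToAgent = 1, // mic -> STT -> agent, private
    AgentToUser = 2, // agent -> TTS -> user, private
    UserToAll = 3,   // mic -> SFU -> all users, spatial
    AgentToAll = 4,  // agent TTS -> SFU -> all users, spatial
}

/// Microphone audio: push-to-talk held routes privately to the agent,
/// released routes to all users.
pub fn route_user_audio(ptt_held: bool) -> Plane {
    if ptt_held {
        Plane::UserToAgent
    } else {
        Plane::UserToAll
    }
}

/// Agent speech: public only when the agent is configured public.
pub fn route_agent_audio(agent_is_public: bool) -> Plane {
    if agent_is_public {
        Plane::AgentToAll
    } else {
        Plane::AgentToUser
    }
}

/// Planes 3 and 4 are the public, spatially panned ones.
pub fn is_spatial(plane: Plane) -> bool {
    matches!(plane, Plane::UserToAll | Plane::AgentToAll)
}
```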
Logseq ontology input (source data)
| Ontology metadata | Graph structure |
|---|---|
| OWL entity page with category, hierarchy, and source metadata | Graph view showing semantic clusters |
Dense knowledge graph in Logseq — the raw ontology VisionClaw ingests, reasons over, and renders in 3D
Mesh KPIs — measuring what matters
| KPI | Formula | Target | What It Measures |
|---|---|---|---|
| Mesh Velocity | Δt(insight → codified workflow) | < 48h | How fast a discovered shortcut becomes a sanctioned, reusable DAG |
| Augmentation Ratio | Cognitive load offloaded ÷ Total cognitive load | > 65% | Percentage of decision-making handled by agents without human escalation |
| Trust Variance | σ(Agent Decision Quality) over 30-day window | < 0.12σ | Drift or bias monitoring in the automated task layer |
| HITL Precision | Correct escalations ÷ Total escalations | > 90% | Whether the edge cases the mesh flags actually require human intervention |
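The formulas in the table can be computed directly once the inputs are measured. The sketch below is illustrative only: how "cognitive load" or "decision quality" is quantified is not defined here, the function names are hypothetical, and the population standard deviation is an assumed reading of σ.

```rust
use std::time::Duration;

/// Mesh Velocity target: insight-to-workflow time under 48 hours.
pub fn mesh_velocity_on_target(insight_to_workflow: Duration) -> bool {
    insight_to_workflow < Duration::from_secs(48 * 60 * 60)
}

/// Shared shape of Augmentation Ratio (offloaded / total load) and
/// HITL Precision (correct / total escalations).
/// Returns None when the denominator is not positive.
pub fn ratio(numerator: f64, denominator: f64) -> Option<f64> {
    if denominator > 0.0 {
        Some(numerator / denominator)
    } else {
        None
    }
}

/// Trust Variance: standard deviation of decision-quality scores in the
/// window. Population form (divide by n) is an assumption.
pub fn trust_variance(scores: &[f64]) -> Option<f64> {
    if scores.is_empty() {
        return None;
    }
    let n = scores.len() as f64;
    let mean = scores.iter().sum::<f64>() / n;
    let variance = scores.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n;
    Some(variance.sqrt())
}
```

For example, 65 units of load offloaded out of 100 gives a ratio of 0.65, which sits exactly on the Augmentation Ratio threshold rather than above it.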
flowchart TB
subgraph Client["Browser Client (React 19 + Three.js / Babylon.js)"]
R3F["React Three Fiber\n(desktop graph)"]
BabylonXR["Babylon.js\n(immersive XR)"]
BinProto["Binary Protocol V2/V3"]
Voice["Voice Orchestrator"]
end
subgraph Server["Rust Backend (Actix-web · Hexagonal · CQRS)"]
Handlers["HTTP/WS Handlers\n(9 ports · 12 adapters)"]
Actors["23 Actix Actors\n(supervised concurrency)"]
Services["OWL Ontology Pipeline\n(Whelk-rs EL++)"]
MCP["MCP Tool Server\n(:9500 TCP)"]
end
subgraph Data["Data Layer"]
Oxigraph[("Oxigraph + SQLite\n(SPARQL triple store)")]
RuVector[("RuVector PostgreSQL\n(pgvector + HNSW)")]
Solid["Solid Pod\n(embedded solid-pod-rs)"]
end
subgraph GPU["GPU Compute (CUDA 13.1)"]
Physics["Force Physics\n+ Semantic Forces"]
Analytics["K-Means · Louvain\nPageRank · LOF Anomaly"]
end
subgraph Mesh["VisionFlow Mesh"]
Relay["Nostr Relay\n(NIP-42 AUTH)"]
AB["Agentbox\n(agent runtime)"]
Forum["Forum\n(governance UI)"]
end
Client <-->|"Binary V2/V3 + REST"| Server
Server <--> Oxigraph
Server <--> RuVector
Server <--> Solid
Server <--> GPU
MCP <--> AB
Server <-->|"31400-31405"| Relay
Relay <--> Forum
style Client fill:#e1f5ff,stroke:#0288d1
style Server fill:#fff3e0,stroke:#ff9800
style Data fill:#f3e5f5,stroke:#9c27b0
style GPU fill:#e8f5e9,stroke:#4caf50
style Mesh fill:#1a1a2e15,stroke:#e94560
Hexagonal architecture (9 ports · 12 adapters · 114 CQRS handlers)
VisionClaw follows strict hexagonal architecture. Business logic in src/services/ depends only on port traits in src/ports/. Concrete implementations live in src/adapters/, swapped at startup via dependency injection.
flowchart LR
subgraph Ports["src/ports/ (Traits)"]
GP[GraphRepository]
OR[OntologyRepository]
IE[InferenceEngine]
GPA[GpuPhysicsAdapter]
GSA[GpuSemanticAnalyzer]
SR[SettingsRepository]
SP[SolidPodRepository]
NR[NostrRelay]
VR[VectorRepository]
end
subgraph Adapters["src/adapters/ (Implementations)"]
OxiGraph[OxigraphGraphRepository]
OxiOntology[OxigraphOntologyRepository]
Whelk[WhelkInferenceEngine]
CudaPhysics[PhysicsOrchestratorAdapter]
SolidPod[EmbeddedSolidPodAdapter]
RuVectorAdapter[RuVectorAdapter]
end
subgraph Services["src/services/ (Business Logic)"]
OQS[OntologyQueryService]
OMS[OntologyMutationService]
GPS[GitHubPRService]
OPS[OntologyPipelineService]
end
Services --> Ports
Adapters -.->|implements| Ports
style Ports fill:#e8f5e9,stroke:#4caf50
style Adapters fill:#fff3e0,stroke:#ff9800
style Services fill:#e1f5ff,stroke:#0288d1
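The port/adapter/service split described above can be sketched in plain Rust. The trait and service names `InferenceEngine` and `OntologyMutationService` appear in the diagram, but every method signature below is a hypothetical simplification, and `ToyInferenceEngine` is a stand-in written for this example rather than the Whelk adapter.

```rust
/// Port: the only thing business logic is allowed to depend on.
/// The method signature is an assumption for illustration.
pub trait InferenceEngine {
    fn is_consistent(&self, axioms: &[String]) -> bool;
}

/// Adapter: a toy implementation. It only rejects a set that contains
/// both `X` and `not X`; a real adapter would delegate to a reasoner.
pub struct ToyInferenceEngine;

impl InferenceEngine for ToyInferenceEngine {
    fn is_consistent(&self, axioms: &[String]) -> bool {
        !axioms
            .iter()
            .any(|a| axioms.contains(&format!("not {a}")))
    }
}

/// Service: receives the port as a trait object at construction time,
/// so the concrete adapter can be swapped at startup.
pub struct OntologyMutationService {
    engine: Box<dyn InferenceEngine>,
    axioms: Vec<String>,
}

impl OntologyMutationService {
    pub fn new(engine: Box<dyn InferenceEngine>) -> Self {
        Self { engine, axioms: Vec::new() }
    }

    /// Accept a proposed axiom only if the resulting set stays consistent.
    pub fn propose(&mut self, axiom: &str) -> Result<(), String> {
        let mut candidate = self.axioms.clone();
        candidate.push(axiom.to_string());
        if self.engine.is_consistent(&candidate) {
            self.axioms = candidate;
            Ok(())
        } else {
            Err(format!("rejected: '{axiom}' makes the ontology inconsistent"))
        }
    }

    pub fn axiom_count(&self) -> usize {
        self.axioms.len()
    }
}
```

Because the service holds a `Box<dyn InferenceEngine>`, a test can inject the toy engine while production wiring injects a real reasoner, without the service code changing.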
23-Actor supervision tree
The backend uses Actix actors for supervised concurrency. GPU actors form a hierarchy: GraphServiceSupervisor → PhysicsOrchestratorActor → ForceComputeActor. All actors restart automatically on failure.
GPU Physics Actors:
| Actor | Purpose |
|---|---|
| ForceComputeActor | Core force-directed layout (CUDA) — 60Hz |
| StressMajorizationActor | Stress majorisation algorithm |
| ClusteringActor | K-Means + Louvain community detection (GPU) |
| PageRankActor | GPU PageRank centrality computation |
| ShortestPathActor | Delta-stepping SSSP (GPU) |
| ConnectedComponentsActor | Label propagation component detection (GPU) |
| AnomalyDetectionActor | LOF / Z-score anomaly detection (GPU) |
| SemanticForcesActor | OWL-driven attraction/repulsion constraints |
| ConstraintActor | Layout constraint solving |
| AnalyticsSupervisor | GPU analytics orchestration |
| BroadcastOptimizerActor | Delta-filter + periodic full-broadcast (300 iters) |
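The "delta-filter + periodic full-broadcast" idea can be sketched as follows: send only nodes that moved more than a threshold since they were last sent, and send everything on every Nth iteration. This is one plausible reading, not the actor's actual code. The 300-iteration period comes from the table; the movement threshold, the `BroadcastFilter` type, and its method names are assumptions.

```rust
use std::collections::HashMap;

pub struct BroadcastFilter {
    last_sent: HashMap<u32, [f32; 3]>,
    iteration: u64,
    full_every: u64, // 300 in the table
    threshold: f32,  // hypothetical minimum movement worth sending
}

impl BroadcastFilter {
    pub fn new(full_every: u64, threshold: f32) -> Self {
        Self {
            last_sent: HashMap::new(),
            iteration: 0,
            full_every,
            threshold,
        }
    }

    /// Return the IDs to broadcast this iteration: all nodes on a
    /// full-broadcast iteration, otherwise only new or moved nodes.
    pub fn select(&mut self, positions: &[(u32, [f32; 3])]) -> Vec<u32> {
        self.iteration += 1;
        let full = self.iteration % self.full_every == 0;
        let mut out = Vec::new();
        for &(id, pos) in positions {
            let moved = match self.last_sent.get(&id) {
                Some(prev) => {
                    let d2: f32 = (0..3).map(|i| (pos[i] - prev[i]).powi(2)).sum();
                    d2.sqrt() > self.threshold
                }
                None => true, // never sent before
            };
            if full || moved {
                self.last_sent.insert(id, pos);
                out.push(id);
            }
        }
        out
    }
}
```

The periodic full broadcast bounds how long a client can stay out of sync after a dropped delta, at the cost of one larger frame per period.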
Service Actors:
| Actor | Purpose |
|---|---|
| GraphStateActor | Canonical graph state — single source of truth |
| OntologyActor | OWL class management and Whelk bridge |
| ClientCoordinatorActor | Per-client session management + WebSocket |
| PhysicsOrchestratorActor | Delegates to GPU actors, manages convergence |
| SemanticProcessorActor | NLP query processing |
| VoiceCommandsActor | Voice-to-action routing |
| TaskOrchestratorActor | Background task scheduling |
| GitHubSyncActor | Incremental GitHub sync (SHA1 delta) |
| OntologyPipelineActor | Assembler → converter → Whelk pipeline |
| GraphServiceSupervisor | Top-level GPU supervision and restart |
| ServerNostrActor | Signs and publishes governance events (31400/31402) |
| AgentMonitorActor | Agent lifecycle monitoring |
DDD bounded contexts (10 contexts)
Core Domain: Knowledge Graph · Ontology Governance · Physics Simulation
Supporting Domain: Authentication (Nostr NIP-98) · Identity (DID/Solid) · Agent Orchestration · Semantic Analysis
Generic Domain: User Management · Audit/Provenance · Configuration
Each context has its own aggregate roots, domain events, and anti-corruption layers. Cross-context communication uses domain events, never direct model sharing. See DDD Bounded Contexts.
| Deployment | Context | Scale |
|---|---|---|
| DreamLab Creative Hub | 50-person creative technology team — live production | ~998 knowledge graph nodes, daily ontology mutations |
| University of Salford | Research partnership validating semantic force-directed layout | Multi-institution ontology |
| THG World Record | Large-scale multi-user immersive data visualisation | 250+ concurrent XR users |
| Metric | Result | Conditions |
|---|---|---|
| GPU physics speedup | 55× | vs single-threaded CPU |
| HNSW semantic search | 61µs p50 | RuVector pgvector, 1.17M entries |
| WebSocket latency | 10ms | Local network, V2 binary |
| Bandwidth reduction | 80% | Binary V2 vs JSON |
| Concurrent XR users | 250+ | Vircadia World Server |
| CUDA kernels | 92 | 6,585 LOC across 11 files |
Full technology breakdown
| Layer | Technology | Detail |
|---|---|---|
| Backend | Rust 2021 · Actix-web | 427 files, 175K LOC · hexagonal CQRS · 9 ports · 12 adapters · 114 handlers |
| Frontend (desktop) | React 19 · Three.js 0.182 · R3F | 370 files, 96K LOC · TypeScript 5.9 · InstancedMesh · SAB zero-copy |
| Frontend (XR) | Babylon.js | Immersive/VR mode — Quest 3 foveated rendering, hand tracking |
| WASM | Rust → wasm-pack | scene-effects crate: zero-copy Float32Array view over WebAssembly.Memory |
| Graph Store | Oxigraph + SQLite | ADR-11 canonical persistence (SPARQL triple store) |
| Vector Memory | RuVector PostgreSQL · pgvector | 1.17M+ entries · HNSW 384-dim · MiniLM-L6-v2 · 61µs search |
| GPU | CUDA 13.1 · cudarc | 92 kernel functions · 6,585 LOC · PTX ISA auto-downgrade |
| Ontology | OWL 2 EL · Whelk-rs | EL++ subsumption · consistency checking |
| Multi-User | Vircadia World Server | Avatar sync · spatial HRTF audio · collaborative editing |
| Voice | LiveKit SFU · turbo-whisper · Kokoro | CUDA STT · TTS · Opus 48kHz · 4-plane routing |
| Identity | Nostr NIP-07/NIP-98 · DID:Nostr | Browser extension signing · NIP-26 delegation · W3C key rotation |
| User Data | Solid Pods · solid-pod-rs (embedded) | Per-user data sovereignty · WAC access control · JSON-LD |
| Agents | Claude-Flow · MCP · RAFT | 83 skills · 7 ontology tools · hive-mind consensus |
| Build | Vite 6 · Vitest · Playwright | Frontend build · unit tests · E2E tests |
| Infra | Docker Compose | 15+ services · multi-profile (dev/prod/voice/xr) |
VisionClaw uses the Diataxis framework — 106 markdown files across four categories, 46 with embedded Mermaid diagrams:
| Category | Path | Content |
|---|---|---|
| Tutorials | docs/tutorials/ | First graph, platform overview |
| How-To Guides | docs/how-to/ | Deployment, agents, XR setup, performance profiling, operations |
| Explanation | docs/explanation/ | Architecture, DDD, ontology, GPU physics, VisionFlow platform, security |
| Reference | docs/reference/ | REST API, WebSocket protocol, agents catalog, error codes |
Key entry points: Documentation Hub · VisionFlow Platform · Wardley Map · Architecture Overview · Deployment Topology · Known Issues
Prerequisites, build commands, system requirements
| Tool | Version | Purpose |
|---|---|---|
| Rust | 2021 edition | Backend |
| Node.js | 20+ | Frontend |
| Docker + Docker Compose | — | Services |
| CUDA Toolkit | 13.1 | GPU acceleration (optional) |
cargo build --release && cargo test
cd client && npm install && npm run build && npm test
| Tier | CPU | RAM | GPU | Use Case |
|---|---|---|---|---|
| Minimum | 4-core 2.5GHz | 8 GB | Integrated | Development · < 10K nodes |
| Recommended | 8-core 3.0GHz | 16 GB | GTX 1060 / RX 580 | Production · < 50K nodes |
| Enterprise | 16+ cores | 32 GB+ | RTX 4080+ (16GB VRAM) | Large graphs · multi-user XR |
Platform support: Linux (full GPU) · macOS (CPU-only) · Windows (WSL2) · Meta Quest 3 (Beta)
Project structure
VisionClaw/
├── src/ # Rust backend (427 files, 175K LOC)
│ ├── actors/ # 23 Actix actors (GPU compute + services)
│ ├── adapters/ # Oxigraph, Whelk, CUDA, Solid, RuVector adapters
│ ├── handlers/ # HTTP/WebSocket request handlers (CQRS)
│ ├── services/ # Business logic (ontology, voice, agents)
│ ├── ports/ # Trait definitions (9 hexagonal boundaries)
│ ├── gpu/ # CUDA kernel bridge, memory, streaming
│ └── utils/*.cu # 92 CUDA kernel functions (11 files, 6,585 LOC)
├── client/ # React frontend (370 files, 96K LOC)
│ ├── src/features/ # 13 feature modules (graph, settings, etc.)
│ ├── src/services/ # Voice, WebSocket, Nostr auth, Solid
│ └── crates/scene-effects/ # Rust WASM crate — zero-copy scene FX
├── agentbox/ # Submodule: agent runtime container
├── docs/ # Diataxis documentation (106 files, 46 with Mermaid)
│ ├── explanation/ # Architecture (incl. VisionFlow platform doc)
│ ├── adr/ # 91 Architecture Decision Records
│ └── KNOWN_ISSUES.md # Active P1/P2 bugs
├── tests/ # Integration tests
└── scripts/ # Build, migration, embedding ingestion
See the Contributing Guide. Check Known Issues before starting — the Ontology Edge Gap (ONT-001) and V4 delta instability (WS-001) are active P1/P2 bugs.
Mozilla Public License 2.0 — Use commercially, modify freely, share changes to MPL files.
VisionClaw is the knowledge engineering substrate of VisionFlow, built by DreamLab AI.
VisionFlow Platform · Documentation · Known Issues · Discussions







