GOAT-TS (Thinking System) is a knowledge-graph–driven cognition scaffold: you ingest text into a graph, run spreading activation and memory dynamics over it, then reason over tension and hypotheses. It is designed to run locally first (single machine) with clear paths to scale out (Spark, Redis, Kubernetes).
Author: Ben Michalek (BoggersTheFish)
Python 3.11+ · NebulaGraph · PyTorch · LangChain · Streamlit
- Stores knowledge in a graph. Text is turned into concepts (nodes) and relationships (edges). Each ingestion chunk or reasoning pass is recorded as a wave (cognitive episode), so you can see which concepts came from which source.
- Runs a cognition loop. You seed the graph (by concept labels or node IDs), then the system spreads activation (ACT-R style), applies memory decay and state transitions (ACTIVE → DORMANT → DEEP), and optionally runs a gravity-style simulation to update positions and masses. That loop is the core “thinking” step.
- Reasons over the graph. You ask a query; the system retrieves a relevant subgraph, computes “tension” (mismatch between where nodes are and where they “should” be), and produces hypotheses (e.g. “What explains the conflict between X and Y?”). Results can be cached in Redis.
- Supports ingestion and simulation. You can acquire text (sample, Wikipedia API, or dumps), run Spark ETL to Parquet, extract triples (regex or LLM), merge similar concepts (FAISS), and load everything into NebulaGraph. Simulation reads from the graph, runs one physics step (forces, domains), and writes updated masses and cluster IDs back.
So: ingest → graph → cognition loop (spread + memory + optional gravity) → reasoning (query → subgraph → tension → hypotheses). All of this can run in dry-run (in-memory, no Docker) or live (NebulaGraph, Redis, Spark via Docker).
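To make the cognition loop concrete, here is a toy, in-memory version of one tick: seed a few concepts, spread activation along edges, decay, and demote nodes through the memory states. All numbers (spread fraction, decay rate, thresholds) are invented for illustration; the repo's real loop lives in `src/agi_loop/demo_loop.py`.

```python
# Toy sketch of one cognition tick: seed -> spread -> decay -> memory state.
# Spread fraction, decay rate, and thresholds are made-up illustration values,
# not the repo's parameters.
graph = {"cat": ["mammal"], "mammal": ["animal"], "animal": []}
activation = {node: 0.0 for node in graph}
state = {node: "ACTIVE" for node in graph}

def tick(seeds=(), spread=0.5, decay=0.9):
    for seed in seeds:                     # inject activation at the seeds
        activation[seed] = 1.0
    for node, neighbors in graph.items():  # spread a fraction along edges
        for nb in neighbors:
            activation[nb] += spread * activation[node]
    for node in graph:                     # decay, then ACTIVE -> DORMANT -> DEEP
        activation[node] *= decay
        if activation[node] < 0.05:
            state[node] = "DEEP"
        elif activation[node] < 0.2:
            state[node] = "DORMANT"
        else:
            state[node] = "ACTIVE"

for _ in range(10):
    tick(seeds=["cat"])
print(activation)
print(state)
```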
- Single place to try the pipeline. One repo gives you acquisition, ETL, extraction, graph schema, activation, memory, reasoning, and simulation. You can validate the design and extend it without switching projects.
- Local-first. You can develop and test with `--dry-run` and no Docker; when ready, start Docker and use `--live` for real storage and caching.
- Structured cognition model. Waves and in_wave edges give you provenance (“this concept appeared in this chunk”); tension and hypotheses make reasoning interpretable; memory states (ACTIVE/DORMANT/DEEP) make decay and consolidation explicit.
- Ready for extension. The roadmap (Stages 1–10: core loop → benchmarks/API → usability → integrations → scaling → community → advanced evolution) is in ROADMAP.md; see also README_ARCHITECTURE.md and docs/extensions.md.
Run all commands from the repository root. Use `python -m streamlit run ...`, `python -m pytest`, and `python scripts/...` so the same invocations work on Windows, macOS, and Linux (see PLATFORM.md).
Create a virtual environment and install dependencies:
Windows (PowerShell):
```
python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install -r requirements.txt
```

macOS / Linux:

```
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

If pytest or modules like nebula3 are not found, it almost always means the virtual environment is not active in the current shell or requirements.txt was not installed in that environment. Activate the environment and rerun installation before running tests or scripts (see also TROUBLESHOOTING.md).
Start the Streamlit GUI (setup, config, ingestion, simulation, reasoning, monitoring, export, API):
```
python -m streamlit run scripts/goat_ts_gui.py
```

Open the URL shown (e.g. http://localhost:8501). In the sidebar go to Setup Wizard.
- Click “Check system (verify all steps)” to see what’s already done (dependencies, Docker, connection, schema). The sidebar shows Deps, Docker, Connect, Schema (✓ or —).
- Complete any missing steps in order:
  - Step 1 — Dependencies: Install Python requirements (button runs `pip install -r requirements.txt`). Once verified or installed, the step is locked.
  - Step 2 — Docker: Start NebulaGraph, Redis, and Spark with “Start Docker,” then “Check Docker status.” When Docker is up, Start is locked.
  - Step 3 — Connect: Test connection to NebulaGraph (credentials in `configs/graph.yaml` and optional `.env`).
  - Step 4 — Schema: Apply schema (dry-run first, then “Apply schema (live)” once Docker is up). When the space exists and is applied, the live button is locked.
- Time estimates per step use the last run duration when available; completed steps are locked so you don’t re-run them by mistake.
You can also run these steps from the terminal (see PLATFORM.md and sections below).
- Config Editor — Load, edit, and save `configs/graph.yaml`, `configs/reasoning.yaml`, `configs/simulation.yaml`.
- Data Ingestion — Acquire dumps (sample / wikipedia-api / wikipedia-sample), run Spark ETL, run the extraction pipeline (dry-run or `--live`).
- Simulation & Physics — Run one simulation step from the graph (optionally with live write-back), or run the gravity demo.
- Reasoning Loop — Run the reasoning demo with a query (dry-run or live).
- Monitoring & Debug — Dump graph stats, list waves, export subgraph (JSON/PNG). Use Export & API to start the HTTP API server.
- Debug log — Open http://localhost:8501/?page=debug to see subprocess output (e.g. pip, Docker, schema).
Run the AGI-style cognition loop (seeds → spreading activation → memory tick → optional gravity):
```
python -m src.agi_loop.demo_loop --dry-run --seed-labels concept --ticks 10 --export-dot demo_out.dot
```

With forces and optional capabilities (self-reflection, curiosity, goal generator):

```
python -m src.agi_loop.demo_loop --dry-run --seed-labels concept --ticks 10 --enable-forces --enable-goal-generator --enable-curiosity
```

Use `--dry-run` for in-memory (no Nebula); omit it and ensure Docker + schema are up for live runs.
One-click demo and presets (Stage 6): Run a short cognition loop without touching config:
```
python scripts/one_click_demo.py --preset quick-demo
```

Presets (quick-demo, full-demo, lightweight) live in `configs/presets.yaml`; use `--preset <name>` with demo_loop or one_click_demo. In the GUI, Lightweight mode (Setup Wizard) uses an in-memory fallback for all features.
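If you want to see which presets your checkout defines before running one, a quick hedged snippet (it assumes preset names are top-level keys in the file, which may not match the actual layout):

```python
# Illustrative only: the README names quick-demo, full-demo, and lightweight.
# The top-level-keys assumption about configs/presets.yaml is a guess -- open
# the file if the output looks wrong.
import yaml  # PyYAML

with open("configs/presets.yaml") as f:
    presets = yaml.safe_load(f)

print("presets:", list(presets))
```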
Start the API server from repo root:
```
uvicorn scripts.serve_api:app --reload --host 0.0.0.0 --port 8000
```

| Endpoint | Description |
|---|---|
| `POST /run_demo` | Run cognition demo. Body: `{"ticks": 5, "dry_run": true, "seed_labels": "concept"}`. |
| `POST /reasoning` | Run reasoning loop. Body: `{"query": "your query", "live": false}`. Use `output_format: "app"` for full JSON (activated_nodes, graph_context, hypotheses). |
| `GET /health` | Health check. |
Example:
```
curl -X POST http://localhost:8000/run_demo -H "Content-Type: application/json" -d "{\"ticks\": 3, \"dry_run\": true}"
```
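The same call from Python, using requests (assumes the server started above is running on localhost:8000):

```python
# Python equivalent of the curl example above.
import requests

resp = requests.post(
    "http://localhost:8000/run_demo",
    json={"ticks": 3, "dry_run": True},
)
resp.raise_for_status()
print(resp.json())
```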
A simpler Streamlit app for running a short demo or viewing graph stats:

```
python -m streamlit run scripts/streamlit_viz.py
```

- Q&A bot: `python scripts/app_qa_bot.py --query "What is a knowledge graph?"` (single run, or interactive with `--live`).
- Knowledge explorer: `python scripts/app_knowledge_explorer.py "your query" --output out.json` — subgraph + tension + hypotheses as JSON.
- Ingestion sources: Configure RSS/URLs in `configs/ingestion_sources.yaml`; use `src.ingestion.connectors` for `fetch_urls` and `rss_feed_to_chunks` (see the hedged sketch below).
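A hedged usage sketch for the connectors. The function names come from this README, but their signatures and return types are assumptions; check `src/ingestion/connectors` before relying on this:

```python
# Assumed signatures -- verify against src/ingestion/connectors.
from src.ingestion.connectors import fetch_urls, rss_feed_to_chunks

url_chunks = fetch_urls(["https://example.com/article"])          # assumed: URLs -> text chunks
feed_chunks = rss_feed_to_chunks("https://example.com/feed.xml")  # assumed: feed URL -> chunks

for chunk in list(url_chunks) + list(feed_chunks):
    print(str(chunk)[:80])  # assumed: chunks are strings or string-like
```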
Enable plugins via `configs/plugins.yaml` (`plugins.enabled`). See docs/extensions.md for the extensions gallery (connectors, apps, presets, API output formats).
- Meta-reasoning: `src.reasoning.meta_reasoning` — `repo_curiosity_scan`, `roadmap_to_hypotheses`, `run_meta_reasoning` (repo + ROADMAP → hypotheses); a hedged sketch follows this list.
- Self-assessment: `python scripts/self_assessment_demo.py` — runs benchmarks and writes `examples/self_assessment_report.md`.
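A hedged sketch of calling the meta-reasoning entry point. The function name is from this README, but the signature and return type are guesses; treat this as pseudocode until you check `src/reasoning/meta_reasoning`:

```python
# Assumed API -- run_meta_reasoning exists per the README, but whether it takes
# arguments, and what it returns, are guesses.
from src.reasoning.meta_reasoning import run_meta_reasoning

hypotheses = run_meta_reasoning()  # assumed: scans repo + ROADMAP.md, returns hypotheses
for h in hypotheses:
    print(h)
```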
| Document | What it’s for |
|---|---|
| README.md | This file — what the system does, why it’s useful, how to use it. |
| README_ARCHITECTURE.md | Technical architecture: ingestion, graph, simulation, reasoning, compliance (Steps 1–7), demo loop, benchmarks. |
| ROADMAP.md | Development roadmap (Stages 1–10) and how to run each stage. |
| CODEBASE.md | Codebase reference: modules, data models, scripts, configs. |
| CONTRIBUTING.md | How to contribute, run tests, and open PRs. |
| PLATFORM.md | Portability (Windows, macOS, Linux) and how to verify. |
| examples/README.md | Sample input, export shape, API request examples. |
| docs/extensions.md | Extensions gallery and plugin system (Stage 9). |
| CHANGELOG.md | Summary of documentation and feature changes. |
```
GOAT/
├── configs/ # graph.yaml, reasoning.yaml, simulation.yaml (and optional llm.yaml)
├── docker/ # Docker Compose (NebulaGraph, Redis, Spark)
├── examples/ # Sample input, export shape, API examples
├── infra/ # Terraform, Ansible, K8s (deployment scaffolding)
├── scripts/ # Entry points: schema, ingestion, demos, export, API, Streamlit GUI
├── src/ # Core packages
│ ├── agi_loop/ # Cognition demo loop (demo_loop.py)
│ ├── graph/ # Client, models, schema, vector index, compression
│ ├── ingestion/ # Extraction pipeline, LLM extract, online ingest
│ ├── reasoning/ # Loop, tension, reflection, goal generator, curiosity, cache
│ ├── simulation/ # Gravity, loop
│ ├── physics/ # Forces, layout
│ └── monitoring/ # Metrics (Prometheus)
├── tests/ # Pytest suite (milestones, benchmarks, API, graph, reasoning)
├── requirements.txt
├── pytest.ini
└── README.md
```
- NebulaGraph: Default `root`/`nebula`, host `127.0.0.1`, port `9669`. Set in `configs/graph.yaml` or override with a `.env` in the repo root: `NEBULA_HOST`, `NEBULA_PORT`, `NEBULA_USERNAME`, `NEBULA_PASSWORD` (see `.env.example`); a sketch of the override order follows this list.
- Scripts: Use `--live` to talk to the real graph and Redis; without it, scripts use dry-run (in-memory). Prefer `--dry-run` first when available.
- GPU: Optional CUDA/FAISS-GPU via `configs/simulation.yaml` (`use_gpu`) or env `GOAT_USE_GPU=1`; see ROADMAP.md.
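A minimal sketch of the override order described in the first bullet: environment variables (from `.env` or the shell) win over the documented defaults. The merge logic here is illustrative, not the repo's actual client code:

```python
# Defaults are the ones documented above (root/nebula, 127.0.0.1, 9669);
# the resolution order shown here is illustrative.
import os

nebula = {
    "host": os.getenv("NEBULA_HOST", "127.0.0.1"),
    "port": int(os.getenv("NEBULA_PORT", "9669")),
    "username": os.getenv("NEBULA_USERNAME", "root"),
    "password": os.getenv("NEBULA_PASSWORD", "nebula"),
}
print(nebula)
```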
- Acquire text: `python scripts/acquire_dumps.py --source sample` (or `wikipedia-api`, `wikipedia-sample`); output under `data/raw/`.
- Spark ETL: `python scripts/run_spark_etl.py` — reads text, writes Parquet (`value` column). Requires Java (JAVA_HOME).
- Extraction: `python -m src.ingestion.extraction_pipeline --input-path <path> [--live]` or combined: `scripts/run_batch_ingestion.py --acquire [--spark-etl] [--live]`.
- Reasoning demo: `python scripts/run_reasoning_demo.py --query "your query" --live` (omit `--live` for dry-run).
- Graph stats / export: `python scripts/dump_graph_stats.py --live`; `python scripts/export_subgraph.py --concept "X" --live --output out.json --plot out.png`.
The graph stores a cognition layer alongside concepts:
- Nodes = concepts (and topic/cluster nodes). Properties include `label`, `mass`, `activation`, `state` (ACTIVE/DORMANT/DEEP), `cluster_id`, `metadata`.
- Waves = one cognitive episode per ingestion chunk or reasoning pass. Properties: `label`, `source` (e.g. ingestion/reasoning), `intensity`, `coherence`, `tension`, `source_chunk_id`.
- relates = concept-to-concept (and concept–cluster) edges.
- in_wave = edges from concept nodes to waves (provenance: “this concept appeared in this episode”).
Schema is in `src/graph/schema/`; apply with `python scripts/apply_schema.py --live` after the space is created.
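To make the model concrete, here is a minimal in-memory mirror of the cognition layer. Field names follow the list above; the types and defaults are assumptions, and the authoritative definition is the schema in `src/graph/schema/`:

```python
# Sketch only -- field names come from this README; types/defaults are assumed.
from dataclasses import dataclass, field

@dataclass
class Concept:
    label: str
    mass: float = 1.0
    activation: float = 0.0
    state: str = "ACTIVE"            # ACTIVE -> DORMANT -> DEEP
    cluster_id: str | None = None
    metadata: dict = field(default_factory=dict)

@dataclass
class Wave:
    label: str
    source: str                      # e.g. "ingestion" or "reasoning"
    intensity: float = 0.0
    coherence: float = 0.0
    tension: float = 0.0
    source_chunk_id: str | None = None

# relates: concept -> concept (or cluster); in_wave: concept -> wave (provenance)
edges: list[tuple[str, str, str]] = [("in_wave", "cat", "wave_001")]
```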
Paths use `pathlib.Path`; subprocesses use list arguments and `cwd=ROOT`. On Windows use `;` instead of `&&` to chain commands. See PLATFORM.md for details and verification steps.
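The pattern that sentence describes, in one snippet (`ROOT` and the script path are illustrative, not the repo's exact code):

```python
# List arguments + cwd=ROOT: the same call works on Windows, macOS, and Linux,
# with no shell quoting or path-separator issues.
import subprocess
from pathlib import Path

ROOT = Path(__file__).resolve().parent
subprocess.run(["python", "scripts/dump_graph_stats.py"], cwd=ROOT, check=True)
```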
For common issues (Python/virtualenv, Docker and ports, NebulaGraph connection, JAVA_HOME/Spark, LLM configuration, GPU/CUDA, or GUI/API server problems), see TROUBLESHOOTING.md. It focuses on local-first workflows on Windows, macOS, and Linux and complements PLATFORM.md.
We use semantic versioning (e.g. v0.1.0). Tagged releases are listed under Releases. See CHANGELOG.md for notable changes.
This project is open source. See LICENSE in the repository root for terms.