From 8b4e1b7b9ce14b984a9002b65ac52ade013572f1 Mon Sep 17 00:00:00 2001
From: Brett Kinny
Date: Sat, 23 May 2026 17:10:02 +1000
Subject: [PATCH 1/2] docs: acknowledge AI-assisted authorship at top of README
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add a 🤖 blockquote between the tagline and the existing ⚠️ heads-up.
First substantive thing a visitor reads: most of the code and nearly
all of the docs in this repo were written by AI agents under direction,
framed honestly so the project's nature is clear up-front.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 89fd4a8..ffa8a59 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,8 @@
 
 **Your self-hosted [StackChan](https://github.com/m5stack/StackChan) robot assistant — kid-safe by default, hackable by design, private by architecture.**
 
+> 🤖 **AI-assisted project.** Most of the code and nearly all of the docs in this repo were written by AI agents (primarily Claude Code) under my direction. I've been coding professionally for 15+ years; Dotty is one of a few side projects I'm using to learn the current generation of AI/LLM tooling first-hand — what the tools do well, where they break, and how to drive them. Feedback on the output very welcome.
+
 > ⚠️ **Heads up: this is not a stable project yet.** Dotty is buggy, frequently broken, and actively changing day-to-day. End-to-end behaviour works on the maintainer's hardware but regressions land all the time, the API and config surface shifts without notice, and a fresh deploy on someone else's gear has not been verified. Treat this as a hobby-grade work-in-progress, not a polished product. Bugs, PRs, and "this didn't work for me" issues all very welcome. 🍺☕ If you do try a fresh end-to-end deploy, please get in touch — I'll buy you a beer or a coffee.
The best place to ask questions, get help, or show off a build is the [Dotty community Discord](https://discord.gg/7sKE5c6A).
 >
 > **Known rough edges:** face emoji rendering is missing visual differentiation for 4 of 9 emotions (sad / surprise / love / laughing); sound-direction localizer has a hardware-AEC-related left-bias on M5Stack CoreS3 (energy detection works, direction is unreliable); kid-voice ASR accuracy on SenseVoice has a kid-speech gap that whisper.cpp will close in a follow-up.

From b554ddc8e2ed5c6df82c31294b8b7679453f1345 Mon Sep 17 00:00:00 2001
From: Brett Kinny
Date: Sat, 23 May 2026 18:28:52 +1000
Subject: [PATCH 2/2] #15: pass VLM/VISION/OPENROUTER api keys into dotty-behaviour container
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Post-#36 the active vision path is dotty-behaviour/dispatch/vlm.py,
which inherited the loud-error contract from bridge.py's hardened
_call_vision_api. But dotty-behaviour/docker-compose.yml didn't pass
any of VLM_API_KEY / VISION_API_KEY / OPENROUTER_API_KEY into the
container — so a fresh deploy resolves all three to empty inside the
container even with a populated host .env, and every photo intent
falls through to the VLM_OFFLINE_SENTINEL.

- Add ${VAR:-} interpolation for all four (incl. AUDIO_CAPTION_API_KEY)
  so the compose picks them up from the shell that runs
  `docker compose up` without erroring when none are set.
- Document the env-var requirement in dotty-behaviour/README.md under
  "Build + run on Unraid".

Closes #15. The bridge-side hardening (aa2d8ba) and the original
RPi-side env-var TODO were already moot post-#36 RPi decommission;
this commit closes the post-#36 manifestation of the same bug.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 dotty-behaviour/README.md          | 17 +++++++++++++++++
 dotty-behaviour/docker-compose.yml | 11 +++++++++++
 2 files changed, 28 insertions(+)

diff --git a/dotty-behaviour/README.md b/dotty-behaviour/README.md
index 399d735..8fc6931 100644
--- a/dotty-behaviour/README.md
+++ b/dotty-behaviour/README.md
@@ -35,6 +35,23 @@ ssh root@ '
 '
 ```
 
+### Vision-key env var (issue #15)
+
+Photo intents need an OpenAI-compatible API key for the VLM call. The
+compose file picks up any of these from the shell that runs
+`docker compose up`:
+
+- `VLM_API_KEY` (preferred)
+- `VISION_API_KEY` (fallback)
+- `OPENROUTER_API_KEY` (fallback of fallback)
+
+If none are set the container still starts, but `dispatch/vlm.py`
+returns the `VLM_OFFLINE_SENTINEL` string for every photo intent so
+the downstream LLM is told the camera is unavailable rather than
+confabulating a description. Set the key in the host shell before
+`docker compose up`, or pass `--env-file ` at a `.env` that
+contains it.
+
 ## Why a separate container
 
 The bridge was a separate process on the RPi for the whole life of
diff --git a/dotty-behaviour/docker-compose.yml b/dotty-behaviour/docker-compose.yml
index 7b0d419..38cf3af 100644
--- a/dotty-behaviour/docker-compose.yml
+++ b/dotty-behaviour/docker-compose.yml
@@ -39,6 +39,17 @@ services:
       - DOTTY_STATE_DIR=/var/lib/dotty-behaviour/state
       - HOUSEHOLD_YAML_PATH=/var/lib/dotty-behaviour/state/household.yaml
       - GREETER_STATE_PATH=/var/lib/dotty-behaviour/state/greeter_state.json
+      # Vision-language model credentials (issue #15). Required for
+      # photo intents — without these the container falls through to
+      # the VLM_OFFLINE_SENTINEL contract in dispatch/vlm.py and the
+      # downstream LLM is told the camera is unavailable rather than
+      # confabulating a description. Resolved in fallback order:
+      # VLM_API_KEY → VISION_API_KEY → OPENROUTER_API_KEY. Set any one
+      # in your shell or in .env next to this compose file.
+      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY:-}
+      - VISION_API_KEY=${VISION_API_KEY:-}
+      - VLM_API_KEY=${VLM_API_KEY:-}
+      - AUDIO_CAPTION_API_KEY=${AUDIO_CAPTION_API_KEY:-}
     volumes:
       - /mnt/user/appdata/dotty-behaviour/state:/var/lib/dotty-behaviour/state
       - /mnt/user/appdata/dotty-behaviour/logs:/var/lib/dotty-behaviour/logs
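
Review note on patch 2/2: the fallback order documented above (VLM_API_KEY, then VISION_API_KEY, then OPENROUTER_API_KEY) could be resolved roughly as in the sketch below. This is an illustration only, not the code in `dispatch/vlm.py`, which this patch does not show; `KEY_ORDER` and `resolve_vision_key` are hypothetical names. One detail worth keeping in mind: with `${VAR:-}` interpolation, a variable that is unset on the host arrives in the container as an empty string rather than being absent, so the lookup has to treat empty values as missing.

```python
import os
from typing import Mapping, Optional

# Fallback order as documented in the patch (hypothetical constant name).
KEY_ORDER = ("VLM_API_KEY", "VISION_API_KEY", "OPENROUTER_API_KEY")


def resolve_vision_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty key in fallback order, or None.

    Empty and whitespace-only values count as unset, because compose's
    ${VAR:-} interpolation turns an unset host variable into "".
    """
    for name in KEY_ORDER:
        value = env.get(name, "").strip()
        if value:
            return value
    return None
```

A caller would presumably return the offline sentinel when this yields `None`, which matches the behaviour the README hunk describes for a container started without any of the three keys.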