Point at your UI in the browser, type what to change — your AI agent reads it and edits the code.
No Chrome extension. No MCP server. No account. Just CDP + one small JS overlay.
You ask for a change, the agent writes code it can't see, you eyeball the browser and then type a paragraph describing what's off — "the CTA is too big, move it left, wrong green". Slow, lossy, repeat.
browser-annotations closes the loop. Point at the real UI, the agent fixes the real code.
That one sentence runs the whole loop:
- The agent ships it — and proves it. Builds the change, opens the page, screenshots it, and checks it actually renders + matches the ask. Says "done" only when it's real — no "should work" hand-waving.
- You point, you don't type. Annotations are already on: hover → click the thing → say what's wrong → Save. Clicking "smaller" on the actual button beats a paragraph describing it. (Copy grabs every note as markdown.)
- Back in the CLI, applied. Write "done" — the agent fixes each pin, re-verifies in the browser, and reports what changed. Loop until it's perfect.
The payoff: your coding agent finally sees what you see — "looks wrong in the browser" becomes "fixed in the code" in one click. No extension, no MCP, no leaving your terminal.
Overlay keys: Save = ⌘/Ctrl+Enter · Copy = notes→clipboard · Alt+A pause · Clear wipes · ✕ deletes one.
As a skill — any agent (via skills.sh). Self-contained — the CDP client is
bundled in the skill, so this alone is enough; the agent calls python3 <skill-dir>/cdp.py (needs only Python 3 + a debug Chrome):
npx skills add kuzmany/browser-annotations # Claude Code, Cursor, 50+ agents (--agent '*' for all)Want the browser-annotate shell command (for typing it yourself)? Run the installer:
git clone https://github.com/kuzmany/browser-annotations && cd browser-annotations && ./install.shNo install at all? Paste overlay/bh-annotate.js into the DevTools console and annotate any page.
browser-annotate --open https://localhost:3000That's it. If no debug Chrome is running, it launches one for you (its own profile in
~/.browser-annotations/chrome), opens the page, and turns annotations on. Already have a Chrome on
--remote-debugging-port? It attaches to that instead. Set $CHROME to pick a specific binary.
Cross-machine (CLI and Chrome on different hosts — e.g. CLI in a VM, Chrome on your laptop)? Point it
at the reachable endpoint: --cdp ws://<host>:<port>/… or $CDP_URL / $BU_CDP_WS
(browser-harness users: $BH_CDP_ENV → its .env, live endpoint auto-read).
| Command (short alias) | Does |
|---|---|
browser-annotate (bh-annotate) [--open URL] [--url SUB] |
inject the overlay (annotations on). --open URL opens a fresh tab first (or launches Chrome if none is running) — no browser-harness needed. |
browser-apply (bh-apply) [--url SUB] [--json] |
export notes → ./.annotations/notes.md |
Endpoint: --cdp → $CDP_URL → $BU_CDP_WS → http://localhost:9222.
--url pins a tab by URL; --window ID pins a Chrome window — auto-resolved from $BH_SESSION_WINDOW_ID or browser-harness's ~/.bh-session-windows.json (keyed by $BU_NAME).
Each note in notes.md carries a unique CSS selector + tag/text/box — so the agent edits the exact element, no guessing:
## [#2] `header > div > a` — a "Order"
note: make this button green- Injects the overlay over CDP (browser WebSocket + flat session) — CSP-safe. Your pins + notes persist in
localStorage; after a hard page reload just re-runbrowser-annotateto bring the overlay back (it reloads your pins). lib/cdp.pyis ~190 lines, stdlib only (no pip deps); it also doesshot(screenshots) for the agent's self-check.- Self-contained —
--open <url>creates+navigates its own tab over CDP, so it needs nothing but a--remote-debugging-portChrome. When browser-harness is present it's used for nicer opening (right session window + domain-skills) and screenshots — but never required.
MIT © kuzmany