Telegram operator bot: alerts + on-demand commands, over Tor (#121, #45) #341
Open
VijitSingh97 wants to merge 11 commits into
Ship a thin, notifications-only Telegram pusher for v1.0: node down/recovered, worker offline/back online, and sync finished. Off by default; no interactive bot (that stays in #45).

Consumes signals the data loop already computes rather than re-collecting:
- node down/recovered from NodeHealthMonitor's debounced `down` edge (#31)
- sync finished from the sync-gate `miner_released` latch (#35)
- worker offline/back via a new flap-protected per-worker presence tracker

New dashboard modules:
- telegram_notifier.py: thin sendMessage transport; enabled only with token + chat_id; fail-silent on offline/Tor-only hosts; never logs the bot token.
- worker_presence.py: WorkerPresenceMonitor, the per-worker analogue of NodeHealthMonitor (debounce + recovery hysteresis + silent baseline + reset when the proxy is intentionally stopped).
- alert_service.py: folds per-cycle signals into debounced alerts; pure evaluate() + off-thread process(); wired into data_service.run().

Plumbing: config.json telegram.* -> pithead render_env -> per-event env vars -> config.py. bot_token rendered to the owner-only .env and masked in the apply preview. Injected into the dashboard container in docker-compose; added to the advanced example config.

Docs: new docs/telegram.md setup guide (BotFather, chat id, per-event toggles, "one chat, two bots" with #79, Tor-only caveat, troubleshooting); cross-refs in the docs index, configuration reference, and CHANGELOG.

Tests: 49 new pytest cases (notifier/monitor/alert service), plus stack tests for env propagation and bot-token secrecy. Full suite green; coverage 93%.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
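The notifier contract described in this commit (enabled only with token + chat_id, fail-silent, token never logged) could look roughly like the sketch below. It is an illustration, not the code in telegram_notifier.py: `NotifierSketch`, the `Transport` callable, and the default urllib transport are hypothetical names, and the default transport shown here does not route through Tor.

```python
import logging
import urllib.parse
import urllib.request
from typing import Callable, Optional

log = logging.getLogger("telegram_notifier_sketch")

# Hypothetical transport signature: (url, form_body) -> None, raising on failure.
Transport = Callable[[str, bytes], None]


def _urllib_post(url: str, body: bytes) -> None:
    """Plain stdlib POST (no Tor routing in this sketch)."""
    request = urllib.request.Request(url, data=body)
    with urllib.request.urlopen(request, timeout=10):
        pass


class NotifierSketch:
    def __init__(
        self,
        token: Optional[str],
        chat_id: Optional[str],
        transport: Transport = _urllib_post,
    ) -> None:
        self._token = token or ""
        self._chat_id = chat_id or ""
        self._transport = transport

    @property
    def enabled(self) -> bool:
        # Both values must be present; otherwise every send is a no-op.
        return bool(self._token and self._chat_id)

    def send(self, text: str) -> bool:
        if not self.enabled:
            return False
        # Bot API method URL: the token is part of the path.
        url = f"https://api.telegram.org/bot{self._token}/sendMessage"
        body = urllib.parse.urlencode(
            {"chat_id": self._chat_id, "text": text}
        ).encode()
        try:
            self._transport(url, body)
            return True
        except Exception as exc:
            # Log only the exception type: its message may embed the URL,
            # and the URL embeds the token.
            log.warning("telegram send failed: %s", type(exc).__name__)
            return False
```

Injecting the transport keeps the class testable offline; a caller would pass a fake in tests and a real HTTP function in production.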
# Conflicts: # CHANGELOG.md
Re-apply the #121 draft (PR #143) onto current develop, resolving 103 commits of context drift. New modules land unchanged; conflicts were all additive (healthchecks #79, clearnet #234, update-badge #224 rows sitting beside the new telegram rows). Config renamed config.advanced.example.json -> config.reference.json; kept the subnet-aware P2POOL_URL over the draft's hardcoded one. Adds the telegram.commands scaffolding for the #45 half.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ged alerter

Add TelegramCommandBot: an in-process long-poll (getUpdates) loop that answers read-only /status, /hashrate, /workers, /sync, /help from the configured chat. Reuses build_metrics so replies match the dashboard exactly; access-gated to the one chat_id; long-poll needs no inbound port and rides the same egress as the alerts; fail-silent + never logs the token, same discipline as the notifier. Wired as a third background task in main.py (no-op unless telegram.commands.enabled).

Config: telegram.commands.enabled -> TELEGRAM_COMMANDS_ENABLED (default false).

Also ruff-format + lint-fix the #121 modules the draft predated (import sort, drop an unused import). 37 new unit tests; patch coverage 95%.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
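The access gate and long-poll bookkeeping described above can be sketched as a pure function over one batch of getUpdates results. This is an assumption about shape, not the TelegramCommandBot code: `handle_updates` and the `handlers` mapping are hypothetical. The update layout (`update_id`, `message.chat.id`, `message.text`) and the "next offset is last update_id + 1" rule follow the Telegram Bot API.

```python
from typing import Callable, Dict, List, Tuple


def handle_updates(
    updates: List[dict],
    allowed_chat_id: int,
    handlers: Dict[str, Callable[[], str]],
    offset: int = 0,
) -> Tuple[List[str], int]:
    """Return (replies, next_offset) for one getUpdates batch."""
    replies: List[str] = []
    for update in updates:
        # Always advance the offset, even for ignored updates, so the same
        # update is not delivered again on the next poll.
        offset = max(offset, update["update_id"] + 1)

        message = update.get("message") or {}
        chat_id = str((message.get("chat") or {}).get("id", ""))
        text = (message.get("text") or "").strip()

        # Access gate: only the one configured chat is ever answered.
        if chat_id != str(allowed_chat_id) or not text.startswith("/"):
            continue

        # "/Status@somebot arg" -> "/status"
        command = text.split()[0].split("@")[0].lower()
        handler = handlers.get(command)
        if handler is not None:
            replies.append(handler())
    return replies, offset
```

Keeping the gate in a pure function means the chat filter can be unit-tested without any network or polling loop.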
Per feedback: the dashboard already computes each rig's DOWN state (status != online, shown in the uptime column) and keeps the row visible ~1h before it falls off the proxy table (#182). Drive the worker-offline debounce off that same status instead of inferring offline from absence in the online-name set, so a Telegram 'offline' alert lines up with the rig showing DOWN on screen, and a rig that vanishes from the table entirely is forgotten (the dashboard no longer shows it) rather than aged into a false offline.

WorkerPresenceMonitor.update now takes the worker rows (name+status); offline fires while DOWN for offline_after, recovered after recovery_after online, forgotten when the proxy stops listing it. Drops the redundant retention timer (WORKER_RETENTION_SEC now removed; it mirrored the lifecycle's own falloff).

worker_presence.py 100% covered; make test green; patch coverage 94%.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
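A minimal sketch of the status-driven debounce described in this commit follows. It is not the WorkerPresenceMonitor implementation: `PresenceSketch`, the row keys `name`/`status`, and the default durations are assumptions, and the silent-baseline and proxy-stopped reset behaviours are left out.

```python
from typing import Dict, List, Optional, Tuple


class PresenceSketch:
    def __init__(self, offline_after: float = 120.0, recovery_after: float = 60.0):
        self.offline_after = offline_after
        self.recovery_after = recovery_after
        # name -> {"alerted": offline alert active?, "since": pending-timer start}
        self._state: Dict[str, Dict[str, Optional[float]]] = {}

    def tracked(self) -> List[str]:
        return sorted(self._state)

    def update(self, rows: List[dict], now: float) -> List[Tuple[str, str]]:
        events: List[Tuple[str, str]] = []
        seen = set()
        for row in rows:
            name = row["name"]
            seen.add(name)
            down = row["status"] != "online"
            st = self._state.setdefault(name, {"alerted": False, "since": None})

            # The timer only runs while the raw status disagrees with the
            # alerted state; a flap back resets it.
            if down == st["alerted"]:
                st["since"] = None
                continue
            if st["since"] is None:
                st["since"] = now

            wait = self.recovery_after if st["alerted"] else self.offline_after
            if now - st["since"] >= wait:
                st["alerted"] = down
                st["since"] = None
                events.append(("offline" if down else "recovered", name))

        # A rig the proxy no longer lists is forgotten, not aged into "offline".
        for name in list(self._state):
            if name not in seen:
                del self._state[name]
        return events
```

Passing `now` in explicitly, rather than reading a clock inside `update`, keeps the debounce deterministic under test.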
… /xvb commands

Alerts (all reuse what the dashboard already computes, per feedback):
- worker joined/left: woven into WorkerPresenceMonitor via a prime flag so a restart/failover-readmit doesn't replay the roster as joins; 'left' fires when a rig drops off the proxy table entirely (vs 'offline' = DOWN-but-still-listed).
- disk filling/critical: crosses the same DISK_WARN/CRITICAL_PERCENT thresholds as the dashboard's low-disk badge (#138); a full disk corrupts monerod's DB.
- DB write failing: StateManager.is_db_healthy flipping false (#131).

Disk usage is read once in the loop and reused in the snapshot.

Commands (read-only, reuse build_metrics + the system snapshot):
- /system (disk/RAM/CPU/load/HugePages), /pool (sidechain + Monero network), /xvb (mode/tier/routed/raffle-eligibility).

Config: telegram.events gains worker_joined/worker_left/disk_space/db_unhealthy (all default true); plumbed through pithead render, compose, config.reference.json.

make test green; patch coverage 97%.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
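The disk alert above is an edge alert: it fires when usage crosses a threshold, not on every cycle spent above it. A rough sketch, assuming illustrative threshold values of 85% and 95% (the real DISK_WARN/CRITICAL_PERCENT values are not stated in this PR) and a hypothetical `DiskAlertSketch` class:

```python
from typing import Optional

LEVELS = ("ok", "warn", "critical")


def disk_level(used_pct: float, warn_pct: float, critical_pct: float) -> str:
    if used_pct >= critical_pct:
        return "critical"
    if used_pct >= warn_pct:
        return "warn"
    return "ok"


class DiskAlertSketch:
    def __init__(self, warn_pct: float = 85.0, critical_pct: float = 95.0):
        # Threshold defaults are assumptions for illustration only.
        self._warn = warn_pct
        self._critical = critical_pct
        self._last = "ok"

    def evaluate(self, used_pct: float) -> Optional[str]:
        """Return a message only when the level rises; None otherwise."""
        level = disk_level(used_pct, self._warn, self._critical)
        previous, self._last = self._last, level
        if LEVELS.index(level) > LEVELS.index(previous):
            return f"💾 disk {level}: {used_pct:.0f}% used"
        # Falling back to a lower level re-arms the alert silently.
        return None
```

The percentage could come from `shutil.disk_usage(path)` (used divided by total), read once per cycle as the commit describes.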
…mand

Reviewed the dashboard's own noteworthy-state catalog (build_badges) and added the high-value, cheap-to-reuse signals:
- xvb_no_share: donating to XvB with no PPLNS share in the window means raffle wins are skipped (#158); revenue make-or-break. Gated on XvB enabled.
- clearnet_exposed: a node syncing over clearnet exposes the host IP (#183); privacy signal on a Tor-first stack. Reverts to Tor automatically (#234).

Both computed in the data loop from existing figures (shares_in_pplns_window, clearnet_sync_state) and passed as scalars, keeping evaluate() pure. Events default true.

/earnings command: estimated P2Pool XMR/day, reusing service/earnings.xmr_per_hs_day applied to the displayed P2Pool 1h hashrate (no web-layer import).

Larger follow-ups raised as separate issues (block/payout-found, container crash-loop, two-way control, XvB-registration alert, etc).

make test green; patch coverage 97%.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…i, egress diagram

- Tor routing (#340): notifier + command bot now dial api.telegram.org through the bridge Tor SOCKS proxy (socks5h), same as Healthchecks/XvB; never leaks the host IP. Command bot swapped from aiohttp to requests-in-a-thread to reuse the SOCKS path (no new dep); getUpdates long-poll runs via asyncio.to_thread.
- New alerts (folding in the achievable parts of #339): 'Pithead online' one-shot heartbeat on start, XvB registration rejected/failing/recovered, new-release-available.
- Emoji enrichment across all alert messages (⛏️ workers, ⛓️ nodes, 💾 disk, 🗄️ db, 🎰 XvB, 🧅/🌐 clearnet, 🚀 online).
- Network diagram (docs/architecture.md): now shows each dashboard egress path (Telegram, Healthchecks, XvB stats, GitHub) tagged with its route; all 🟢 Tor.
- config.json telegram.events gains xvb_registration/new_release/stack_online (default true); plumbed pithead->compose->config.py; docs updated.

Kept as issues (genuinely need new infra): #336 block/payout (no per-node/payout signal), #337 crash-loop (read proxy has no inspect/health), #338 two-way control (auth model).

make test green; patch coverage 96%.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
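The "requests-in-a-thread" pattern from the Tor routing bullet can be sketched as below. The proxy host and port (`tor`, `9050`) and the helper names are assumptions, and the blocking fetch is a stand-in so the example runs without network access. The `socks5h` scheme is the one that asks the proxy to resolve DNS, which is what keeps hostname lookups off the host.

```python
import asyncio
import time
from typing import Callable, Dict, List


def tor_proxies(host: str = "tor", port: int = 9050) -> Dict[str, str]:
    """Proxy mapping in the shape requests expects (host/port are assumed)."""
    url = f"socks5h://{host}:{port}"
    return {"http": url, "https": url}


async def poll_once(
    blocking_fetch: Callable[[int], List[dict]], offset: int
) -> List[dict]:
    # Run the blocking long-poll in a worker thread so the event loop
    # (and the other background tasks) keep running while it waits.
    return await asyncio.to_thread(blocking_fetch, offset)


def fake_fetch(offset: int) -> List[dict]:
    """Stand-in for a blocking getUpdates call made with requests."""
    time.sleep(0.05)
    return [{"update_id": offset}]


async def demo() -> List[dict]:
    return await poll_once(fake_fetch, 7)
```

With requests and its SOCKS support installed, the real fetch would pass `proxies=tor_proxies()` and a timeout longer than the long-poll window to `requests.get`.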
… gaps

Testing to standard (docs/testing-strategy.md: test each behaviour once, at the lowest honest tier):
- tier-1 pytest: assert AlertService.EVT_* == config.py TELEGRAM_EVENTS (adding an alert but forgetting its toggle, or vice versa, now fails a test).
- tier-1 shell (run.sh): assert every telegram.events key in config.reference.json renders into .env AND is declared in docker-compose.yml; guards the config surface the pytest can't see (14 events × 2, +28 assertions).
- Fill real branches: process() swallowing an evaluate() error (the never-break-the-loop guard), reply_for /hashrate + /sync, _safe_reply_for's error path, the benign XvB-registration transition. alert_service + telegram_commands now 99%.

Docs reviewed against docs/STYLE.md (voice + code-accuracy); no changes needed.

make test green; patch coverage 98%; test-inventory regenerated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
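The tier-1 consistency check in the first bullet might be written along these lines. The class and dict here are stand-ins: the real shapes of AlertService.EVT_* and TELEGRAM_EVENTS are not shown in this PR, so treating the constants as strings and the toggles as a dict keyed by event name is an assumption. The event names themselves are ones this PR mentions.

```python
class AlertServiceSketch:
    """Stand-in for the alert service: one EVT_* constant per alert."""

    EVT_WORKER_JOINED = "worker_joined"
    EVT_WORKER_LEFT = "worker_left"
    EVT_DISK_SPACE = "disk_space"
    EVT_DB_UNHEALTHY = "db_unhealthy"


# Stand-in for the config-side toggle table (event name -> default).
TELEGRAM_EVENTS = {
    "worker_joined": True,
    "worker_left": True,
    "disk_space": True,
    "db_unhealthy": True,
}


def declared_events(cls: type) -> set:
    """Collect the values of every EVT_* class attribute."""
    return {
        value for name, value in vars(cls).items() if name.startswith("EVT_")
    }


def test_events_match_toggles() -> None:
    # Fails if an alert is added without a toggle, or a toggle without an alert.
    assert declared_events(AlertServiceSketch) == set(TELEGRAM_EVENTS)
```

Comparing sets in both directions is what catches a stale toggle as well as a missing one.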
- Daily summary: a once-a-day status roll-up (nodes/mining/workers/hashrate/shares/disk) pushed at a configurable local time. telegram.daily_summary_time (default 08:00) + a daily_summary event toggle (default on). Fires once/day at the target; a post-time restart waits for the next day rather than replaying; malformed time disables it. Uses the dashboard container's timezone. Built lazily (only when due) from the same build_metrics the dashboard renders.
- Config plumbed config.json -> pithead render -> compose -> config.py; describe_change + the run.sh event-consistency loop + a time-propagation test cover the surfaces.
- Wiring test (tier 1): asserts DataService.run() hands AlertService the full signal contract each cycle + calls maybe_daily_summary; closes the one automatable e2e gap (the alert LOGIC was fully unit-tested; the loop->alerter wiring wasn't asserted).

make test green; patch coverage 97%; docs (telegram.md/configuration.md/CHANGELOG) updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
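The three scheduling rules in the daily-summary bullet (fire once per day at the target, skip today after a post-time restart, disable on a malformed time) can be sketched as follows. `DailyScheduleSketch` and `parse_hhmm` are hypothetical names; the real logic lives behind maybe_daily_summary and may differ.

```python
from datetime import date, datetime, time
from typing import Optional


def parse_hhmm(value: object) -> Optional[time]:
    """Parse 'HH:MM'; return None for anything malformed."""
    try:
        return datetime.strptime(value, "%H:%M").time()  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return None


class DailyScheduleSketch:
    def __init__(self, hhmm: object, started_at: datetime) -> None:
        self._target = parse_hhmm(hhmm)
        self._last_fired: Optional[date] = None
        # Started after today's target: count today as done so a restart
        # does not replay the summary.
        if self._target is not None and started_at.time() >= self._target:
            self._last_fired = started_at.date()

    def due(self, now: datetime) -> bool:
        if self._target is None:
            return False  # malformed time disables the summary
        if self._last_fired == now.date():
            return False  # already fired (or skipped) today
        if now.time() < self._target:
            return False  # not yet time
        self._last_fired = now.date()
        return True
```

Because `due` only returns True once per date, the caller can build the summary lazily inside the `if`, matching the "only when due" note above.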
This was referenced Jul 3, 2026
#339)

- /hashrate consistency: total and per-worker now use one shared effective_hashrate() (10m avg, 1m fallback for a just-connected rig without 10m history), so per-worker lines sum to the total and a fresh worker shows its real rate, not 0.00. Fixed in /hashrate, /workers, and the daily digest label; _aggregate_hashrate reuses the helper.
- Tor network panel (egress #170): the Telegram bot now appears as a dashboard egress path (Tor when enabled, else inactive) in both the egress list and the topology graph.
- hashrate_low alert (#339 remainder): edge alert when a fixed XvB tier can't be sustained / recovers, from metrics.low_hr_warning (built once per cycle, only when the bot is on).

#340 (Tor routing) was already complete.

make test green; patch coverage 98%; docs + CHANGELOG + roadmap #333 updated; 6 issues (#99/#104/#59/#84/#118/#116) got Telegram acceptance-criteria bullets.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
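The shared-helper idea from the first bullet is small enough to sketch directly. The function name matches the commit message, but the worker dict keys (`hashrate_10m`, `hashrate_1m`) and the `total_hashrate` wrapper are assumptions for illustration.

```python
from typing import Iterable, Mapping


def effective_hashrate(worker: Mapping[str, float]) -> float:
    """10-minute average, falling back to the 1-minute figure when the
    10-minute one is missing or zero (a just-connected rig)."""
    ten_minute = worker.get("hashrate_10m") or 0.0
    if ten_minute > 0:
        return ten_minute
    return worker.get("hashrate_1m") or 0.0


def total_hashrate(workers: Iterable[Mapping[str, float]]) -> float:
    # Using the same helper for the total guarantees that the per-worker
    # lines sum to it.
    return sum(effective_hashrate(w) for w in workers)
```

For example, a worker with a 10m average of 1000.0 and a fresh worker with only a 1m reading of 250.0 would report 1000.0 and 250.0 individually and 1250.0 in total.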
Closes #121, closes #45. Supersedes the stale draft #143 (its three modules were salvaged and rebased here).
A Telegram operator bot for the stack: push alerts for anything worth knowing, plus read-only status commands on demand. Off by default; everything routes over Tor.
Alerts (debounced, one message per real transition)
🚀 Pithead online · ⛓️ node down/recovered · ⛏️ worker offline/back · 🎉 worker joined · 👋 worker left · ✅ sync finished · 💾 disk filling/critical · 🗄️ DB write failing · 🎰 no PPLNS share (XvB wins skipped) · 🎰 XvB registration rejected/failing · 🌐 node exposed on clearnet · 🆕 new release available
Each reuses a signal the dashboard already computes (the `build_badges` catalog / `Metrics` / `NodeHealthMonitor`), with no re-collection. Per-event `telegram.events.*` toggles, all default on.
Commands (read-only; only the configured `chat_id` is answered)
`/status` · `/hashrate` · `/workers` · `/sync` · `/system` · `/pool` · `/xvb` · `/earnings` · `/help`
Long-poll (`getUpdates`), so no inbound port; replies come from the same `build_metrics` the web UI renders, so they always match the dashboard.
Design
- Notifier and command bot dial `api.telegram.org` through the bridge Tor SOCKS proxy (socks5h), like Healthchecks/XvB; never leaks the host IP.
- `bot_token` lives in the owner-only `.env`, masked in the `apply` preview, never logged.
- `docs/architecture.md` diagram now shows every dashboard egress path (Telegram/Healthchecks/XvB/GitHub), all 🟢 Tor.
Testing
`make test` green: dashboard 769, stack 452, selftest 101, fakes 12. Patch coverage 96%, full lint clean, test-inventory regenerated. Validated live on gouda: rebuilt via `upgrade`, bot polls over Tor, `getMe`/`sendMessage` succeed over Tor, stack healthy.
Follow-ups (kept as issues; need new infra)
#336 block/payout alerts · #337 container crash-loop alerts · #338 two-way control commands · #339 hashrate-low-for-tier alert.
🤖 Generated with Claude Code