Commit 8ab9477

docs: expand AGENTS.md with comprehensive project documentation
- Add Overview section describing bot purpose and tech stack
- Expand Commands section with linter and staging bot run options
- Add detailed Structure section with directory tree and component descriptions
- Add 'Where to Look' quick reference table for common tasks
- Add Code Map table with file complexity indicators
- Restructure Architecture Patterns with subsections for handler groups, singletons, state machine, and database conventions
- Enhance Code Style guidelines with improved clarity
- Clarify Testing practices and async mode requirements
- Add Anti-Patterns section documenting project-specific conventions
- Add Unique Conventions section for Indonesian localization, admin auth, URL whitelisting, and restart recovery
- Add CI/CD and Notes sections for deployment and operational details
1 parent 24b4047 commit 8ab9477

1 file changed

Lines changed: 154 additions & 25 deletions

AGENTS.md

@@ -1,52 +1,181 @@
11
# AGENTS.md - PythonID Telegram Bot
22

3+
## Overview
4+
5+
Indonesian Telegram bot for group profile enforcement (photo + username), captcha verification, and anti-spam protection. Built with python-telegram-bot v20+, SQLModel, Pydantic, and Logfire.
6+
37
## Commands
48

59
```bash
610
# Install dependencies
711
uv sync
812

9-
# Run tests
13+
# Run tests (100% coverage maintained)
1014
uv run pytest
1115

12-
# Run a single test file
16+
# Run single test file
1317
uv run pytest tests/test_check.py
1418

15-
# Run a single test function
19+
# Run single test function
1620
uv run pytest tests/test_check.py::TestHandleCheckCommand::test_check_command_non_admin
1721

18-
# Run tests with coverage
22+
# Run with coverage
1923
uv run pytest --cov=bot --cov-report=term-missing
2024

25+
# Run linter
26+
uv run ruff check .
27+
2128
# Run the bot
2229
uv run pythonid-bot
30+
31+
# Run staging
32+
BOT_ENV=staging uv run pythonid-bot
33+
```
34+
35+
## Structure
36+
37+
```
38+
PythonID/
39+
├── src/bot/
40+
│ ├── main.py # Entry point + handler registration (priority groups!)
41+
│ ├── config.py # Pydantic settings (get_settings() cached)
42+
│ ├── constants.py # Indonesian templates + URL whitelists (528 lines)
43+
│ ├── handlers/ # Telegram update handlers
44+
│ │ ├── captcha.py # New member verification flow
45+
│ │ ├── verify.py # Admin /verify, /unverify commands
46+
│ │ ├── check.py # Admin /check command + forwarded message handling
47+
│ │ ├── anti_spam.py # Probation enforcement (links/forwards)
48+
│ │ ├── message.py # Profile compliance monitoring
49+
│ │ ├── dm.py # DM unrestriction flow
50+
│ │ └── topic_guard.py # Warning topic protection (group=-1)
51+
│ ├── services/
52+
│ │ ├── user_checker.py # Profile validation (photo + username)
53+
│ │ ├── scheduler.py # JobQueue auto-restriction (every 5 min)
54+
│ │ ├── telegram_utils.py # Shared API helpers
55+
│ │ ├── bot_info.py # Bot metadata cache (singleton)
56+
│ │ └── captcha_recovery.py # Restart recovery for pending captchas
57+
│ └── database/
58+
│ ├── models.py # SQLModel schemas (4 tables)
59+
│ └── service.py # DatabaseService singleton (645 lines)
60+
├── tests/ # pytest-asyncio (18 files, 100% coverage)
61+
└── data/bot.db # SQLite (auto-created, WAL mode)
62+
```
63+
64+
## Where to Look
65+
66+
| Task | Location | Notes |
67+
|------|----------|-------|
68+
| Add new handler | `main.py` | Register with appropriate group (-1, 0, 1) |
69+
| Modify messages | `constants.py` | All Indonesian templates centralized |
70+
| Add DB table | `database/models.py` → `database/service.py` | Add model, then service methods |
71+
| Change config | `config.py` | Pydantic BaseSettings with env vars |
72+
| Add URL whitelist | `constants.py` → `WHITELISTED_URL_DOMAINS` | Suffix-based matching |
73+
| Add Telegram whitelist | `constants.py` → `WHITELISTED_TELEGRAM_PATHS` | Lowercase, exact path match |
74+
75+
## Code Map (Key Files)
76+
77+
| File | Lines | Role |
78+
|------|-------|------|
79+
| `database/service.py` | 645 | **Complexity hotspot** - handles warnings, captcha, probation state |
80+
| `constants.py` | 528 | Templates + massive whitelists (Indonesian tech community) |
81+
| `handlers/captcha.py` | 365 | New member join → restrict → verify → unrestrict lifecycle |
82+
| `handlers/verify.py` | 344 | Admin verification commands + inline button callbacks |
83+
| `handlers/anti_spam.py` | 327 | Probation enforcement with URL whitelisting |
84+
| `main.py` | 293 | Entry point, logging, handler registration, JobQueue setup |
85+
86+
## Architecture Patterns
87+
88+
### Handler Priority Groups
89+
```python
90+
# main.py - Order matters!
91+
group=-1 # topic_guard: Runs FIRST, deletes unauthorized warning topic msgs
92+
group=0 # Commands, DM, anti_spam: Default priority
93+
group=1 # message_handler: Runs LAST, profile compliance check
2394
```
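To make the ordering above concrete, here is a minimal stdlib-only sketch of group dispatch. `GroupDispatcher` is a hypothetical stand-in, not the python-telegram-bot `Application`. It only mirrors the documented behaviour that groups are processed in ascending order and that at most one handler per group handles an update.

```python
from collections import defaultdict
from typing import Callable


class GroupDispatcher:
    """Toy model of priority groups (hypothetical, not the PTB API)."""

    def __init__(self) -> None:
        self._groups: dict[int, list[tuple[Callable, Callable]]] = defaultdict(list)

    def add_handler(self, predicate: Callable, callback: Callable, group: int = 0) -> None:
        self._groups[group].append((predicate, callback))

    def process(self, update: str) -> list[str]:
        trace = []
        # Lower group numbers run first, regardless of registration order.
        for group in sorted(self._groups):
            for predicate, callback in self._groups[group]:
                if predicate(update):
                    trace.append(callback(update))
                    break  # only the first matching handler in a group runs
        return trace


dispatcher = GroupDispatcher()
dispatcher.add_handler(lambda u: True, lambda u: "message_handler", group=1)
dispatcher.add_handler(lambda u: u.startswith("/"), lambda u: "command", group=0)
dispatcher.add_handler(lambda u: True, lambda u: "topic_guard", group=-1)
```

With these registrations, `dispatcher.process("/check")` visits `topic_guard`, then the command handler, then `message_handler`, even though they were registered in the opposite order.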
2495

25-
## Architecture
96+
### Singletons
97+
- `get_settings()` — Pydantic settings, `@lru_cache`
98+
- `get_database()` — DatabaseService, lazy init
99+
- `BotInfoCache` — Class-level cache for bot username/ID
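The cached-settings pattern can be sketched with the standard library alone. This is an illustration under assumptions: the real `get_settings()` returns a Pydantic settings object, while the `Settings` dataclass, its `bot_env` field, and the `"production"` default here are hypothetical.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-in for the project's Pydantic settings class."""

    bot_env: str


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    # Built once on first call; later calls return the same cached object.
    return Settings(bot_env=os.environ.get("BOT_ENV", "production"))
```

Because the result is cached, a test that changes environment variables would need to call `get_settings.cache_clear()` before the new values are picked up.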
26100

27-
- **src/bot/**: Main application package
28-
- **main.py**: Entry point with JobQueue integration — register new handlers here
29-
- **config.py**: Pydantic settings (`get_settings()` cached via `lru_cache`)
30-
- **constants.py**: Centralized message templates and utilities
31-
- **handlers/**: Telegram update handlers (message.py, dm.py, captcha.py, verify.py, anti_spam.py, topic_guard.py, check.py)
32-
- **services/**: Business logic (user_checker.py, scheduler.py, bot_info.py, telegram_utils.py, captcha_recovery.py)
33-
- **database/**: SQLModel schemas (models.py) and SQLite operations (service.py) — use `get_database()` singleton
34-
- **tests/**: pytest-asyncio tests with mocked telegram API
35-
- **data/bot.db**: SQLite database (auto-created via `SQLModel.metadata.create_all`)
101+
### State Machine (Progressive Restriction)
102+
```
103+
1st violation → Warning with threshold info
104+
2nd to (N-1) → Silent increment (no spam)
105+
Nth violation → Restrict + notification
106+
Time threshold → Auto-restrict via scheduler (parallel path)
107+
```
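The count-based path of this state machine can be written as a small pure function. This is a sketch: `Action`, `next_action`, and `max_violations` are illustrative names rather than identifiers from the project, and the time-threshold path handled by the scheduler is not modelled.

```python
from enum import Enum


class Action(Enum):
    WARN = "warn"
    SILENT = "silent"
    RESTRICT = "restrict"


def next_action(violation_count: int, max_violations: int) -> Action:
    """Decide what to do after the violation_count-th violation (1-based)."""
    if violation_count < 1:
        raise ValueError("violation_count must be at least 1")
    if violation_count >= max_violations:
        return Action.RESTRICT  # Nth violation: restrict and notify
    if violation_count == 1:
        return Action.WARN  # first violation: warn with threshold info
    return Action.SILENT  # 2nd to (N-1)th: increment without messaging
```

For `max_violations=3`, the three successive violations map to `WARN`, `SILENT`, and `RESTRICT`.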
108+
109+
### Database Conventions
110+
- SQLite with **WAL mode** for concurrency
111+
- `session.exec(select(Model).where(...)).first()` syntax
112+
- Atomic updates for violation counts (prevents race conditions)
113+
- No Alembic — use `SQLModel.metadata.create_all`
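As an illustration of the WAL and atomic-update points, the sketch below uses the stdlib `sqlite3` module instead of SQLModel so that it runs without dependencies. The `warnings` table and its columns are hypothetical. In the project, a statement like this would presumably live inside `DatabaseService`, not in a handler.

```python
import sqlite3
from pathlib import Path


def open_wal_connection(db_path: Path) -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    # WAL lets readers proceed while a writer is active.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn


def create_schema(conn: sqlite3.Connection) -> None:
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS warnings ("
            "user_id INTEGER PRIMARY KEY, "
            "violation_count INTEGER NOT NULL DEFAULT 0)"
        )


def increment_violations(conn: sqlite3.Connection, user_id: int) -> int:
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "INSERT OR IGNORE INTO warnings (user_id, violation_count) VALUES (?, 0)",
            (user_id,),
        )
        # The increment happens inside SQLite, so there is no
        # read-modify-write window in Python code.
        conn.execute(
            "UPDATE warnings SET violation_count = violation_count + 1 WHERE user_id = ?",
            (user_id,),
        )
    row = conn.execute(
        "SELECT violation_count FROM warnings WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]
```

The key design choice is `violation_count = violation_count + 1` in SQL. Loading the value into Python, adding one, and writing it back would allow two concurrent updates to overwrite each other.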
36114

37115
## Code Style
38116

39-
- **Python 3.11+** with type hints; imports grouped: stdlib → third-party → local
40-
- **Async/await**: All handlers are async functions
41-
- **PTB v20+**: Use `ContextTypes.DEFAULT_TYPE` for context type hints, not legacy `Dispatcher`/`Updater`
42-
- **SQLModel**: Use `session.exec(select(Model).where(...)).first()` syntax; no Alembic migrations
43-
- **Logging**: Use `logfire` for structured logging, not `print()` or stdlib `logging`
44-
- **Error handling**: Catch specific exceptions (e.g., `TimedOut`), log errors, return gracefully
45-
- **No comments**: Avoid inline comments unless code is complex
46-
- **Docstrings**: Module-level docstrings required, function docstrings for public APIs
117+
- **Python 3.11+** with type hints
118+
- **Imports**: stdlib → third-party → local
119+
- **Async/await**: All handlers are async
120+
- **PTB v20+**: Use `ContextTypes.DEFAULT_TYPE`, not legacy Dispatcher
121+
- **Logging**: Use `logfire` via stdlib `logging.getLogger(__name__)`
122+
- **Error handling**: Catch specific exceptions (`TimedOut`), log, return gracefully
123+
- **No inline comments** unless code is complex
124+
- **Docstrings**: Module-level required; function docstrings for public APIs
47125

48126
## Testing
49127

50-
- **Async mode**: `asyncio_mode = auto` in pyproject.toml — do NOT use `@pytest.mark.asyncio` decorators
51-
- **Fixtures**: Check existing fixtures in test files (`mock_update`, `mock_context`, `mock_settings`)
52-
- **Mocking**: Use `AsyncMock` and `MagicMock` for telegram API; no real network calls
128+
- **Async mode**: `asyncio_mode = auto` — do NOT use `@pytest.mark.asyncio` decorators
129+
- **No conftest.py**: Fixtures defined locally in each test file (intentional isolation)
130+
- **Fixtures**: `mock_update`, `mock_context`, `mock_settings` — copy from existing tests
131+
- **Database tests**: Use `temp_db` fixture with `tempfile.TemporaryDirectory`
132+
- **Mocking**: `AsyncMock` for Telegram API; no real network calls
133+
- **Coverage**: 100% maintained — check before committing
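A minimal sketch of the mocking style described above, using only `unittest.mock`. The handler `handle_ping` and the helper `make_mock_update` are hypothetical. In the project the mock would come from a local pytest fixture and the coroutine would be driven by `asyncio_mode = auto` instead of `asyncio.run`.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def handle_ping(update, context) -> None:
    """Hypothetical handler used only for this illustration."""
    await update.message.reply_text("pong")


def make_mock_update() -> MagicMock:
    update = MagicMock()
    # reply_text is awaited by the handler, so it must be an AsyncMock.
    update.message.reply_text = AsyncMock()
    return update


async def test_ping_replies() -> None:
    update = make_mock_update()
    await handle_ping(update, MagicMock())
    update.message.reply_text.assert_awaited_once_with("pong")


if __name__ == "__main__":
    asyncio.run(test_ping_replies())
```

No network call is made: the Telegram API surface is replaced entirely by mocks, and the assertion checks the awaited call and its arguments.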
134+
135+
## Anti-Patterns (THIS PROJECT)
136+
137+
| Forbidden | Why |
138+
|-----------|-----|
139+
| `@pytest.mark.asyncio` decorator | `asyncio_mode = auto` handles this |
140+
| Manual `conftest.py` fixtures | Project uses local fixture pattern |
141+
| Raw SQL in handlers | Use `DatabaseService` methods |
142+
| Hardcoded Indonesian text | Use `constants.py` templates |
143+
| `print()` statements | Use `logging.getLogger(__name__)` |
144+
| Empty `except:` blocks | Catch specific exceptions, log with `exc_info=True` |
145+
146+
## Unique Conventions
147+
148+
### Indonesian Localization
149+
- All user-facing messages in `constants.py`
150+
- Time formatting: `format_threshold_display(minutes)` → "3 jam" or "30 menit"
151+
- Duration formatting: `format_hours_display(hours)` → "7 hari" or "12 jam"
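The two formatters could look roughly like the sketch below. Only the function names and the four example outputs come from this document. The rule used here (switch to the larger unit only when the value divides evenly) is an assumption, and the real implementations in `constants.py` may round or combine units differently.

```python
def format_threshold_display(minutes: int) -> str:
    # Assumed rule: whole hours are shown as "jam", anything else as "menit".
    if minutes >= 60 and minutes % 60 == 0:
        return f"{minutes // 60} jam"
    return f"{minutes} menit"


def format_hours_display(hours: int) -> str:
    # Assumed rule: whole days are shown as "hari", anything else as "jam".
    if hours >= 24 and hours % 24 == 0:
        return f"{hours // 24} hari"
    return f"{hours} jam"
```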
152+
153+
### Admin Authorization
154+
```python
155+
admin_ids = context.bot_data.get("admin_ids", [])
156+
if user.id not in admin_ids:
157+
return # or send "Admin only" message
158+
```
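The same check can be factored into a small helper that is easy to unit test without a Telegram context. `is_admin` is a hypothetical name. Only the `admin_ids` key in `bot_data` comes from the snippet above.

```python
from collections.abc import Mapping
from typing import Any


def is_admin(bot_data: Mapping[str, Any], user_id: int) -> bool:
    # A missing key means nobody is an admin, matching the .get(..., []) default.
    return user_id in bot_data.get("admin_ids", [])
```

A handler would then start with `if not is_admin(context.bot_data, user.id): return`.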
159+
160+
### URL Whitelisting (Anti-spam)
161+
- Suffix-based hostname matching in `is_url_whitelisted()`
162+
- `WHITELISTED_URL_DOMAINS` — tech/docs domains (github.com, docs.python.org, etc.)
163+
- `WHITELISTED_TELEGRAM_PATHS` — Indonesian tech communities (lowercase)
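Suffix-based hostname matching can be sketched as follows. This is not the project's `is_url_whitelisted()`, whose signature is not shown here: `is_host_whitelisted` and `SAMPLE_WHITELISTED_DOMAINS` are illustrative names, and the sample set holds only the two domains mentioned above.

```python
from urllib.parse import urlparse

SAMPLE_WHITELISTED_DOMAINS = frozenset({"github.com", "docs.python.org"})


def is_host_whitelisted(url: str, domains: frozenset[str] = SAMPLE_WHITELISTED_DOMAINS) -> bool:
    # urlparse only fills in hostname when the URL carries a scheme.
    hostname = (urlparse(url).hostname or "").lower()
    # Match the domain itself or any subdomain. The "." prefix keeps
    # "evilgithub.com" from matching "github.com".
    return any(hostname == d or hostname.endswith("." + d) for d in domains)
```

Comparing against `"." + domain` instead of a bare `endswith(domain)` is the important detail, since a plain suffix test would also accept look-alike hosts.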
164+
165+
### Restart Recovery
166+
- Pending captchas persisted to DB, recovered in `post_init()`
167+
- JobQueue timeouts re-scheduled on bot startup
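One part of recovery that is easy to get wrong is how long a re-scheduled timeout should run. The sketch below computes the remaining time for a pending captcha. The function name and parameters are hypothetical and do not reflect the project's schema.

```python
from datetime import datetime, timedelta


def remaining_timeout(created_at: datetime, timeout: timedelta, now: datetime) -> timedelta:
    """Time left before a pending captcha expires, never negative."""
    remaining = created_at + timeout - now
    # An already expired captcha gets a zero delay so it is handled at once.
    return max(remaining, timedelta(0))
```

The resulting `timedelta` could then be passed as the delay when the timeout job is re-scheduled on startup, so users do not get a fresh full timeout after every restart.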
168+
169+
## CI/CD
170+
171+
- **GitHub Actions**: `.github/workflows/python-checks.yml`
172+
- **Matrix**: Python 3.11, 3.12, 3.13, 3.14
173+
- **Steps**: Ruff lint → pytest
174+
- **Docker**: Multi-stage build with `uv`, non-root user, 512MB limit
175+
176+
## Notes
177+
178+
- Topic guard runs at `group=-1` to intercept unauthorized messages BEFORE other handlers
179+
- JobQueue auto-restriction job runs every 5 minutes (first run after 5 min delay)
180+
- Bot uses `allowed_updates=["message", "callback_query", "chat_member"]`
181+
- Captcha uses both `ChatMemberHandler` (for "Hide Join" groups) and `MessageHandler` fallback
