|
1 | 1 | # AGENTS.md - PythonID Telegram Bot |
2 | 2 |
|
| 3 | +## Overview |
| 4 | + |
| 5 | +Indonesian Telegram bot for group profile enforcement (photo + username), captcha verification, and anti-spam protection. Built with python-telegram-bot v20+, SQLModel, Pydantic, and Logfire. |
| 6 | + |
3 | 7 | ## Commands |
4 | 8 |
|
5 | 9 | ```bash |
6 | 10 | # Install dependencies |
7 | 11 | uv sync |
8 | 12 |
|
9 | | -# Run tests |
| 13 | +# Run tests (100% coverage maintained) |
10 | 14 | uv run pytest |
11 | 15 |
|
12 | | -# Run a single test file |
| 16 | +# Run single test file |
13 | 17 | uv run pytest tests/test_check.py |
14 | 18 |
|
15 | | -# Run a single test function |
| 19 | +# Run single test function |
16 | 20 | uv run pytest tests/test_check.py::TestHandleCheckCommand::test_check_command_non_admin |
17 | 21 |
|
18 | | -# Run tests with coverage |
| 22 | +# Run with coverage |
19 | 23 | uv run pytest --cov=bot --cov-report=term-missing |
20 | 24 |
|
| 25 | +# Run linter |
| 26 | +uv run ruff check . |
| 27 | + |
21 | 28 | # Run the bot |
22 | 29 | uv run pythonid-bot |
| 30 | + |
| 31 | +# Run staging |
| 32 | +BOT_ENV=staging uv run pythonid-bot |
| 33 | +``` |
| 34 | + |
| 35 | +## Structure |
| 36 | + |
| 37 | +``` |
| 38 | +PythonID/ |
| 39 | +├── src/bot/ |
| 40 | +│ ├── main.py # Entry point + handler registration (priority groups!) |
| 41 | +│ ├── config.py # Pydantic settings (get_settings() cached) |
| 42 | +│ ├── constants.py # Indonesian templates + URL whitelists (528 lines) |
| 43 | +│ ├── handlers/ # Telegram update handlers |
| 44 | +│ │ ├── captcha.py # New member verification flow |
| 45 | +│ │ ├── verify.py # Admin /verify, /unverify commands |
| 46 | +│ │ ├── check.py # Admin /check command + forwarded message handling |
| 47 | +│ │ ├── anti_spam.py # Probation enforcement (links/forwards) |
| 48 | +│ │ ├── message.py # Profile compliance monitoring |
| 49 | +│ │ ├── dm.py # DM unrestriction flow |
| 50 | +│ │ └── topic_guard.py # Warning topic protection (group=-1) |
| 51 | +│ ├── services/ |
| 52 | +│ │ ├── user_checker.py # Profile validation (photo + username) |
| 53 | +│ │ ├── scheduler.py # JobQueue auto-restriction (every 5 min) |
| 54 | +│ │ ├── telegram_utils.py # Shared API helpers |
| 55 | +│ │ ├── bot_info.py # Bot metadata cache (singleton) |
| 56 | +│ │ └── captcha_recovery.py # Restart recovery for pending captchas |
| 57 | +│ └── database/ |
| 58 | +│ ├── models.py # SQLModel schemas (4 tables) |
| 59 | +│ └── service.py # DatabaseService singleton (645 lines) |
| 60 | +├── tests/ # pytest-asyncio (18 files, 100% coverage) |
| 61 | +└── data/bot.db # SQLite (auto-created, WAL mode) |
| 62 | +``` |
| 63 | + |
| 64 | +## Where to Look |
| 65 | + |
| 66 | +| Task | Location | Notes | |
| 67 | +|------|----------|-------| |
| 68 | +| Add new handler | `main.py` | Register with appropriate group (-1, 0, 1) | |
| 69 | +| Modify messages | `constants.py` | All Indonesian templates centralized | |
| 70 | +| Add DB table | `database/models.py` → `database/service.py` | Add model, then service methods | |
| 71 | +| Change config | `config.py` | Pydantic BaseSettings with env vars | |
| 72 | +| Add URL whitelist | `constants.py` → `WHITELISTED_URL_DOMAINS` | Suffix-based matching | |
| 73 | +| Add Telegram whitelist | `constants.py` → `WHITELISTED_TELEGRAM_PATHS` | Lowercase, exact path match | |
| 74 | + |
| 75 | +## Code Map (Key Files) |
| 76 | + |
| 77 | +| File | Lines | Role | |
| 78 | +|------|-------|------| |
| 79 | +| `database/service.py` | 645 | **Complexity hotspot** - handles warnings, captcha, probation state | |
| 80 | +| `constants.py` | 528 | Templates + massive whitelists (Indonesian tech community) | |
| 81 | +| `handlers/captcha.py` | 365 | New member join → restrict → verify → unrestrict lifecycle | |
| 82 | +| `handlers/verify.py` | 344 | Admin verification commands + inline button callbacks | |
| 83 | +| `handlers/anti_spam.py` | 327 | Probation enforcement with URL whitelisting | |
| 84 | +| `main.py` | 293 | Entry point, logging, handler registration, JobQueue setup | |
| 85 | + |
| 86 | +## Architecture Patterns |
| 87 | + |
| 88 | +### Handler Priority Groups |
| 89 | +```python |
| 90 | +# main.py - Order matters! |
| 91 | +group=-1 # topic_guard: Runs FIRST, deletes unauthorized warning topic msgs |
| 92 | +group=0 # Commands, DM, anti_spam: Default priority |
| 93 | +group=1 # message_handler: Runs LAST, profile compliance check |
23 | 94 | ``` |
24 | 95 |
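The group ordering above can be illustrated with a toy dispatcher. This is a simplified, stdlib-only model of how python-telegram-bot walks handler groups (ascending group number, at most one handler per group), not the library's implementation. The `ToyDispatcher` class and the three stand-in handlers are hypothetical.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[str], bool]  # returns True if it handled the update


class ToyDispatcher:
    """Toy model: groups run in ascending order, one handler per group."""

    def __init__(self) -> None:
        self._groups: dict[int, list[Handler]] = defaultdict(list)

    def add_handler(self, handler: Handler, group: int = 0) -> None:
        self._groups[group].append(handler)

    def process(self, update: str) -> None:
        for group in sorted(self._groups):
            for handler in self._groups[group]:
                if handler(update):
                    break  # first match wins within a group


calls: list[str] = []


def topic_guard(update: str) -> bool:
    calls.append("topic_guard")
    return True


def anti_spam(update: str) -> bool:
    calls.append("anti_spam")
    return True


def message_handler(update: str) -> bool:
    calls.append("message_handler")
    return True


dispatcher = ToyDispatcher()
# Registration order is deliberately scrambled; only the group number decides.
dispatcher.add_handler(message_handler, group=1)
dispatcher.add_handler(anti_spam, group=0)
dispatcher.add_handler(topic_guard, group=-1)
dispatcher.process("some message")
print(calls)  # ['topic_guard', 'anti_spam', 'message_handler']
```

In the real library a handler can additionally raise `ApplicationHandlerStop` to keep later groups from seeing the update; whether `topic_guard` does so is not stated here.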
|
25 | | -## Architecture |
| 96 | +### Singletons |
| 97 | +- `get_settings()` — Pydantic settings, `@lru_cache` |
| 98 | +- `get_database()` — DatabaseService, lazy init |
| 99 | +- `BotInfoCache` — Class-level cache for bot username/ID |
26 | 100 |
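A minimal sketch of the cached-settings pattern listed under Singletons. To stay dependency-free it uses a frozen dataclass in place of the project's Pydantic settings class; the field names, the `WARNING_THRESHOLD` variable, and the defaults are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    """Stand-in for the Pydantic settings class (hypothetical fields)."""

    bot_env: str
    warning_threshold: int


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    # The environment is read once; later calls return the cached instance.
    return Settings(
        bot_env=os.environ.get("BOT_ENV", "production"),
        warning_threshold=int(os.environ.get("WARNING_THRESHOLD", "3")),
    )
```

Because the result is cached, a test that changes environment variables would need to call `get_settings.cache_clear()` before reading the settings again.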
|
27 | | -- **src/bot/**: Main application package |
28 | | - - **main.py**: Entry point with JobQueue integration — register new handlers here |
29 | | - - **config.py**: Pydantic settings (`get_settings()` cached via `lru_cache`) |
30 | | - - **constants.py**: Centralized message templates and utilities |
31 | | - - **handlers/**: Telegram update handlers (message.py, dm.py, captcha.py, verify.py, anti_spam.py, topic_guard.py, check.py) |
32 | | - - **services/**: Business logic (user_checker.py, scheduler.py, bot_info.py, telegram_utils.py, captcha_recovery.py) |
33 | | - - **database/**: SQLModel schemas (models.py) and SQLite operations (service.py) — use `get_database()` singleton |
34 | | -- **tests/**: pytest-asyncio tests with mocked telegram API |
35 | | -- **data/bot.db**: SQLite database (auto-created via `SQLModel.metadata.create_all`) |
| 101 | +### State Machine (Progressive Restriction) |
| 102 | +``` |
| 103 | +1st violation → Warning with threshold info |
| 104 | +2nd to (N-1) → Silent increment (no spam) |
| 105 | +Nth violation → Restrict + notification |
| 106 | +Time threshold → Auto-restrict via scheduler (parallel path) |
| 107 | +``` |
| 108 | + |
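The progressive-restriction flow above can be sketched as a pure function over the running violation count, plus a separate check for the time-based path. The names `Action`, `next_action`, and `time_threshold_reached` are hypothetical; the project's own logic lives in its handlers and `DatabaseService` and may differ in detail.

```python
from datetime import datetime, timedelta
from enum import Enum


class Action(Enum):
    WARN = "warn"
    SILENT = "silent"
    RESTRICT = "restrict"


def next_action(violation_count: int, threshold: int) -> Action:
    """Map a 1-based violation count to the action for that violation."""
    if violation_count >= threshold:
        return Action.RESTRICT  # Nth violation: restrict + notification
    if violation_count == 1:
        return Action.WARN  # 1st violation: warning with threshold info
    return Action.SILENT  # 2nd to (N-1): count silently


def time_threshold_reached(
    first_warned_at: datetime, now: datetime, threshold_minutes: int
) -> bool:
    """Parallel path: the scheduler restricts once enough time has passed."""
    return now - first_warned_at >= timedelta(minutes=threshold_minutes)
```

With a threshold of 3, violations 1, 2, and 3 map to `WARN`, `SILENT`, and `RESTRICT` respectively.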
| 109 | +### Database Conventions |
| 110 | +- SQLite with **WAL mode** for concurrency |
| 111 | +- `session.exec(select(Model).where(...)).first()` syntax |
| 112 | +- Atomic updates for violation counts (prevents race conditions) |
| 113 | +- No Alembic — use `SQLModel.metadata.create_all` |
36 | 114 |
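To illustrate the WAL and atomic-update conventions above, here is a sketch using only the stdlib `sqlite3` module rather than SQLModel. The table and column names (`user_warning`, `message_count`) are hypothetical and are not taken from `models.py`.

```python
import sqlite3
from pathlib import Path


def open_db(path: Path) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a writer is active.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_warning ("
        "user_id INTEGER PRIMARY KEY, "
        "message_count INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def increment_violations(conn: sqlite3.Connection, user_id: int) -> int:
    # The increment is evaluated inside SQLite, so two callers cannot both
    # read the same old value in Python and write back the same new one.
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO user_warning (user_id, message_count) "
            "VALUES (?, 0)",
            (user_id,),
        )
        conn.execute(
            "UPDATE user_warning SET message_count = message_count + 1 "
            "WHERE user_id = ?",
            (user_id,),
        )
        row = conn.execute(
            "SELECT message_count FROM user_warning WHERE user_id = ?",
            (user_id,),
        ).fetchone()
    return row[0]
```

The alternative, reading the count into Python, adding one, and writing it back, is the read-modify-write pattern that the atomic-update convention is meant to avoid.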
|
37 | 115 | ## Code Style |
38 | 116 |
|
39 | | -- **Python 3.11+** with type hints; imports grouped: stdlib → third-party → local |
40 | | -- **Async/await**: All handlers are async functions |
41 | | -- **PTB v20+**: Use `ContextTypes.DEFAULT_TYPE` for context type hints, not legacy `Dispatcher`/`Updater` |
42 | | -- **SQLModel**: Use `session.exec(select(Model).where(...)).first()` syntax; no Alembic migrations |
43 | | -- **Logging**: Use `logfire` for structured logging, not `print()` or stdlib `logging` |
44 | | -- **Error handling**: Catch specific exceptions (e.g., `TimedOut`), log errors, return gracefully |
45 | | -- **No comments**: Avoid inline comments unless code is complex |
46 | | -- **Docstrings**: Module-level docstrings required, function docstrings for public APIs |
| 117 | +- **Python 3.11+** with type hints |
| 118 | +- **Imports**: stdlib → third-party → local |
| 119 | +- **Async/await**: All handlers are async |
| 120 | +- **PTB v20+**: Use `ContextTypes.DEFAULT_TYPE`, not legacy Dispatcher |
| 121 | +- **Logging**: Use `logfire` via stdlib `logging.getLogger(__name__)` |
| 122 | +- **Error handling**: Catch specific exceptions (`TimedOut`), log, return gracefully |
| 123 | +- **No inline comments** unless code is complex |
| 124 | +- **Docstrings**: Module-level required; function docstrings for public APIs |
47 | 125 |
|
48 | 126 | ## Testing |
49 | 127 |
|
50 | | -- **Async mode**: `asyncio_mode = auto` in pyproject.toml — do NOT use `@pytest.mark.asyncio` decorators |
51 | | -- **Fixtures**: Check existing fixtures in test files (`mock_update`, `mock_context`, `mock_settings`) |
52 | | -- **Mocking**: Use `AsyncMock` and `MagicMock` for telegram API; no real network calls |
| 128 | +- **Async mode**: `asyncio_mode = auto` — do NOT use `@pytest.mark.asyncio` decorators |
| 129 | +- **No conftest.py**: Fixtures defined locally in each test file (intentional isolation) |
| 130 | +- **Fixtures**: `mock_update`, `mock_context`, `mock_settings` — copy from existing tests |
| 131 | +- **Database tests**: Use `temp_db` fixture with `tempfile.TemporaryDirectory` |
| 132 | +- **Mocking**: `AsyncMock` for Telegram API; no real network calls |
| 133 | +- **Coverage**: 100% maintained — check before committing |
| 134 | + |
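A sketch of the mocking style described above, using only `unittest.mock`. The handler `handle_ping` and the helper functions are hypothetical; in the project the helpers would be local pytest fixtures (`mock_update`, `mock_context`), and with `asyncio_mode = auto` the async test functions would be collected without any decorator.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def handle_ping(update, context) -> None:
    """Hypothetical handler used only for this example."""
    admin_ids = context.bot_data.get("admin_ids", [])
    if update.effective_user.id not in admin_ids:
        return
    await update.message.reply_text("pong")


def make_update(user_id: int) -> MagicMock:
    update = MagicMock()
    update.effective_user.id = user_id
    # Telegram API calls are coroutines, so they are mocked with AsyncMock.
    update.message.reply_text = AsyncMock()
    return update


def make_context(admin_ids: list[int]) -> MagicMock:
    context = MagicMock()
    context.bot_data = {"admin_ids": admin_ids}
    return context


async def test_ping_replies_to_admin() -> None:
    update = make_update(user_id=1)
    await handle_ping(update, make_context([1]))
    update.message.reply_text.assert_awaited_once_with("pong")


async def test_ping_ignores_non_admin() -> None:
    update = make_update(user_id=2)
    await handle_ping(update, make_context([1]))
    update.message.reply_text.assert_not_awaited()


if __name__ == "__main__":
    # Outside pytest, the coroutines can be driven directly.
    asyncio.run(test_ping_replies_to_admin())
    asyncio.run(test_ping_ignores_non_admin())
```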
| 135 | +## Anti-Patterns (THIS PROJECT) |
| 136 | + |
| 137 | +| Forbidden | Why | |
| 138 | +|-----------|-----| |
| 139 | +| `@pytest.mark.asyncio` decorator | `asyncio_mode = auto` handles this | |
| 140 | +| Manual `conftest.py` fixtures | Project uses local fixture pattern | |
| 141 | +| Raw SQL in handlers | Use `DatabaseService` methods | |
| 142 | +| Hardcoded Indonesian text | Use `constants.py` templates | |
| 143 | +| `print()` statements | Use `logging.getLogger(__name__)` | |
| 144 | +| Empty `except:` blocks | Catch specific exceptions, log with `exc_info=True` | |
| 145 | + |
| 146 | +## Unique Conventions |
| 147 | + |
| 148 | +### Indonesian Localization |
| 149 | +- All user-facing messages in `constants.py` |
| 150 | +- Time formatting: `format_threshold_display(minutes)` → "3 jam" or "30 menit" |
| 151 | +- Duration formatting: `format_hours_display(hours)` → "7 hari" or "12 jam" |
| 152 | + |
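One plausible reading of the two formatting helpers, reconstructed only from the example outputs listed above. The real functions in `constants.py` may round or combine units differently, so treat the branching rules here as assumptions.

```python
def format_threshold_display(minutes: int) -> str:
    """Show whole hours as "jam", anything else as "menit" (assumed rule)."""
    if minutes >= 60 and minutes % 60 == 0:
        return f"{minutes // 60} jam"
    return f"{minutes} menit"


def format_hours_display(hours: int) -> str:
    """Show whole days as "hari", anything else as "jam" (assumed rule)."""
    if hours >= 24 and hours % 24 == 0:
        return f"{hours // 24} hari"
    return f"{hours} jam"


print(format_threshold_display(180))  # 3 jam
print(format_threshold_display(30))   # 30 menit
print(format_hours_display(168))      # 7 hari
print(format_hours_display(12))       # 12 jam
```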
| 153 | +### Admin Authorization |
| 154 | +```python |
| 155 | +admin_ids = context.bot_data.get("admin_ids", []) |
| 156 | +if user.id not in admin_ids: |
| 157 | + return # or send "Admin only" message |
| 158 | +``` |
| 159 | + |
| 160 | +### URL Whitelisting (Anti-spam) |
| 161 | +- Suffix-based hostname matching in `is_url_whitelisted()` |
| 162 | +- `WHITELISTED_URL_DOMAINS` — tech/docs domains (github.com, docs.python.org, etc.) |
| 163 | +- `WHITELISTED_TELEGRAM_PATHS` — Indonesian tech communities (lowercase) |
| 164 | + |
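Suffix-based hostname matching, as described above, can be sketched as follows. This is not the project's `is_url_whitelisted()` implementation, only an illustration of the technique; the two-entry whitelist is a tiny subset chosen for the example.

```python
from urllib.parse import urlparse

# Illustrative subset; the full list lives in constants.py.
WHITELISTED_URL_DOMAINS = ("github.com", "docs.python.org")


def is_url_whitelisted(url: str) -> bool:
    # urlparse lowercases the hostname; it is None when the URL has no scheme.
    hostname = urlparse(url).hostname or ""
    return any(
        # Exact match, or a subdomain: the leading dot stops "evilgithub.com"
        # from matching "github.com".
        hostname == domain or hostname.endswith("." + domain)
        for domain in WHITELISTED_URL_DOMAINS
    )
```

Matching on the parsed hostname rather than on the raw string means a URL such as `https://github.com.evil.example/` is rejected, since its hostname does not end in `.github.com`.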
| 165 | +### Restart Recovery |
| 166 | +- Pending captchas persisted to DB, recovered in `post_init()` |
| 167 | +- JobQueue timeouts re-scheduled on bot startup |
| 168 | + |
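When timeouts are re-scheduled after a restart, the delay for each pending captcha is whatever remains of its original window. A minimal sketch of that calculation, assuming each persisted record stores a start time; the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def remaining_timeout(
    started_at: datetime, timeout: timedelta, now: datetime
) -> timedelta:
    """Time left before a pending captcha expires; zero if already expired."""
    return max(timeout - (now - started_at), timedelta(0))


started = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
window = timedelta(minutes=5)

# Bot restarted two minutes into the window: three minutes remain.
print(remaining_timeout(started, window, started + timedelta(minutes=2)))

# Bot restarted after the window closed: expire immediately.
print(remaining_timeout(started, window, started + timedelta(minutes=10)))
```

A `timedelta` like this could be passed as the `when` argument of `JobQueue.run_once`, though how `captcha_recovery.py` actually schedules the job is not shown here.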
| 169 | +## CI/CD |
| 170 | + |
| 171 | +- **GitHub Actions**: `.github/workflows/python-checks.yml` |
| 172 | +- **Matrix**: Python 3.11, 3.12, 3.13, 3.14 |
| 173 | +- **Steps**: Ruff lint → pytest |
| 174 | +- **Docker**: Multi-stage build with `uv`, non-root user, 512MB limit |
| 175 | + |
| 176 | +## Notes |
| 177 | + |
| 178 | +- Topic guard runs at `group=-1` to intercept unauthorized messages BEFORE other handlers |
| 179 | +- JobQueue auto-restriction job runs every 5 minutes (first run after 5 min delay) |
| 180 | +- Bot uses `allowed_updates=["message", "callback_query", "chat_member"]` |
| 181 | +- Captcha uses both `ChatMemberHandler` (for "Hide Join" groups) and `MessageHandler` fallback |