Portable workflow for exporting GitHub stars, classifying them, syncing GitHub Lists, and auditing the final result.
This project is designed to run in a fresh directory on another machine. It supports three official classification paths:
- local rules
- AI classification through an OpenAI-compatible API
- manual classification by editing JSON or CSV files

Capabilities:
- Export stars through GitHub GraphQL with a token
- Export stars through a logged-in browser as a fallback
- Generate classification outputs in json, csv, and md
- Import a manually edited classification file
- Create and sync GitHub Lists through browser automation
- Audit whether GitHub List membership matches the classification file

On a fresh Windows machine:
- Install Python 3.11+
- Install Node.js 20+
- Clone this repository
- Copy .env.example to .env
- Set GITHUB_TOKEN if you want the recommended API export path
- Run:

    .\scripts\bootstrap.ps1
    .\scripts\menu.ps1

If you only want the simplest first run, choose:
- Export stars (API)
- Classify stars (Rules)
- Sync GitHub Lists
- Audit GitHub Lists

flowchart TD
A[Export stars] --> B{How to classify?}
B --> C[Rules]
B --> D[AI]
B --> E[Manual JSON or CSV]
C --> F[Standard classification files]
D --> F
E --> G[Import classification]
G --> F
F --> H[Sync GitHub Lists]
H --> I[Audit GitHub Lists]

- Install Python 3.11+ and Node.js 20+
- Copy .env.example to .env
- Review config/config.example.yaml
- Bootstrap:

    .\scripts\bootstrap.ps1

- Run the interactive menu:

    .\scripts\menu.ps1

If you prefer direct commands, the canonical entrypoint is:

    python -m github_stars_tool.cli --config .\config\config.example.yaml <command>

Rules path:

    python -m github_stars_tool.cli --config .\config\config.example.yaml export --mode api
    python -m github_stars_tool.cli --config .\config\config.example.yaml classify --mode rules
    python -m github_stars_tool.cli --config .\config\config.example.yaml sync-lists
    python -m github_stars_tool.cli --config .\config\config.example.yaml audit-lists

AI path:

    python -m github_stars_tool.cli --config .\config\config.example.yaml export --mode api
    python -m github_stars_tool.cli --config .\config\config.example.yaml classify --mode ai

Then review the generated classification files before syncing lists.

Manual path:
- Export stars
- Copy one of these templates and edit it:
  - examples/manual/classification.manual.example.json
  - examples/manual/classification.manual.example.csv
- Import the edited file:

    python -m github_stars_tool.cli --config .\config\config.example.yaml import-classification --input .\your_manual_file.json

The import command validates the file structure, fills in missing GitHub URLs from full_name, regenerates the standard json, csv, and md outputs, and stops on duplicate repositories.

Required fields per repository:
- category
- full_name in owner/repo format

Optional fields:
- url
- language
- stars
- topics
- description
- reason
- list_description

JSON accepts either:
- a top-level object with a repositories key
- or a plain array of repository entries

CSV requires at least the category and full_name columns.
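
The checks described above can be sketched roughly as follows. This is an illustration and not the tool's actual code: the function name `normalize_entries`, the `FULL_NAME_RE` pattern, and the case-insensitive duplicate comparison are all assumptions made for the example.

```python
import re

# Assumed pattern for "owner/repo"; the real tool may be stricter or looser.
FULL_NAME_RE = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def normalize_entries(payload):
    """Validate manual classification entries and fill in missing URLs."""
    # Accept either {"repositories": [...]} or a plain list of entries.
    entries = payload["repositories"] if isinstance(payload, dict) else payload
    seen = set()
    result = []
    for index, entry in enumerate(entries):
        # Both required fields must be present and non-blank.
        for field in ("category", "full_name"):
            if not str(entry.get(field) or "").strip():
                raise ValueError(f"entry {index}: missing required field {field!r}")
        full_name = str(entry["full_name"]).strip()
        if not FULL_NAME_RE.match(full_name):
            raise ValueError(f"entry {index}: {full_name!r} is not in owner/repo format")
        # Assumption: duplicates are compared case-insensitively.
        key = full_name.lower()
        if key in seen:
            raise ValueError(f"duplicate repository: {full_name}")
        seen.add(key)
        fixed = dict(entry, full_name=full_name)
        # Derive the GitHub URL only when the entry does not provide one.
        if not fixed.get("url"):
            fixed["url"] = f"https://github.com/{full_name}"
        result.append(fixed)
    return result
```

Running a function like this over a hand-edited file before importing it gives an early signal about missing fields or duplicates.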

For API export, set GITHUB_TOKEN in your environment.
Recommended token types:
- classic personal access token
- or a fine-grained token that can read the signed-in user's starred repositories
The tool does not hardcode a token type. It only checks whether the token can successfully query GitHub.
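
To check a token by hand, a small probe against the GitHub GraphQL endpoint is usually enough. The sketch below is an assumption about how such a check could look, not the tool's own implementation; the helper names `build_probe_request`, `summarize_probe`, and `probe` are hypothetical.

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.github.com/graphql"
# Asks who the token belongs to and how many repositories that user has starred.
PROBE_QUERY = "query { viewer { login starredRepositories(first: 1) { totalCount } } }"


def build_probe_request(token):
    """Build a POST request carrying the probe query and the token."""
    body = json.dumps({"query": PROBE_QUERY}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "User-Agent": "stars-token-probe",
        },
        method="POST",
    )


def summarize_probe(response_json):
    """Return (ok, message) for a decoded GraphQL response."""
    if response_json.get("errors"):
        return False, response_json["errors"][0].get("message", "unknown error")
    viewer = (response_json.get("data") or {}).get("viewer")
    if not viewer:
        return False, "no viewer in response"
    count = viewer["starredRepositories"]["totalCount"]
    return True, f"{viewer['login']}: {count} starred repositories"


def probe(token):
    """Send the probe. An invalid token raises urllib.error.HTTPError (401)."""
    with urllib.request.urlopen(build_probe_request(token), timeout=30) as resp:
        return summarize_probe(json.load(resp))
```

Calling `probe(os.environ["GITHUB_TOKEN"])` after importing `os` prints nothing by itself; inspect the returned message to confirm both the account login and that starred repositories are readable.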

The AI classifier is config-driven and assumes an OpenAI-compatible API surface.
Config fields:
- llm.provider
- llm.base_url
- llm.endpoint
  - chat_completions
  - responses
- llm.model
- llm.structured_output
  - json_object
  - off
- llm.temperature

Recommended default:
- endpoint: chat_completions
- structured_output: json_object

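
As a rough illustration of how the enumerated values and the recommended defaults fit together, the sketch below validates an `llm` mapping after it has been loaded from YAML. The function name `resolve_llm_settings` is hypothetical, and the handling of `False` is an assumption: YAML 1.1 parsers such as PyYAML read an unquoted `off` as a boolean, which may or may not matter for the loader this project uses.

```python
ALLOWED = {
    "endpoint": {"chat_completions", "responses"},
    "structured_output": {"json_object", "off"},
}
RECOMMENDED = {"endpoint": "chat_completions", "structured_output": "json_object"}


def resolve_llm_settings(llm):
    """Apply the recommended defaults, then reject unknown enum values."""
    settings = {**RECOMMENDED, **llm}
    # An unquoted `off` may arrive as boolean False from a YAML 1.1 parser.
    if settings["structured_output"] is False:
        settings["structured_output"] = "off"
    for key, allowed in ALLOWED.items():
        if settings[key] not in allowed:
            choices = ", ".join(sorted(allowed))
            raise ValueError(f"llm.{key} must be one of: {choices}")
    return settings
```

Quoting the value in the config file (`structured_output: "off"`) sidesteps the boolean question entirely.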
If the provider returns an HTML challenge page such as Cloudflare protection, the tool reports that explicitly instead of failing with a vague parse error.
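
One way to produce that kind of explicit error is to look at the response before handing it to the JSON parser. The following is a minimal sketch under that assumption; `parse_llm_body` and the specific markers it looks for are illustrative and not taken from the tool.

```python
import json


def parse_llm_body(body, content_type=""):
    """Decode a provider response, failing clearly if it is an HTML page."""
    text = body.lstrip()
    head = text[:512].lower()
    is_html = "text/html" in content_type.lower() or head.startswith(
        ("<!doctype html", "<html")
    )
    if is_html:
        # Assumed heuristic: challenge pages often mention Cloudflare by name.
        hint = " (possibly a Cloudflare challenge)" if "cloudflare" in text.lower() else ""
        raise RuntimeError(f"provider returned an HTML page instead of JSON{hint}")
    return json.loads(text)
```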

GitHub has a public API for starred repositories, but not a stable public API for managing GitHub Lists. Because of that:
- export prefers API mode
- sync-lists and audit-lists use browser automation

If API export fails, check:
- GITHUB_TOKEN is set
- the token can read your starred repositories
- the token belongs to the GitHub account you expect

If browser automation fails, check:
- Chrome is already logged into GitHub
- Chrome was started with remote debugging on port 9222
- the browser session is not blocked by a GitHub login prompt

Example:

    & 'C:\Program Files\Google\Chrome\Application\chrome.exe' `
      --remote-debugging-port=9222 `
      --user-data-dir="$env:LOCALAPPDATA\Google\Chrome\User Data" `
      --profile-directory=Default `
      https://github.com

If AI classification receives HTML, that usually means the upstream provider returned an anti-bot or Cloudflare challenge page instead of JSON. In that case:
- try a direct provider API origin
- try a different endpoint such as chat_completions
- confirm your provider supports OpenAI-compatible requests

If import-classification fails, check:
- every repository has category
- every repository has full_name
- full_name uses owner/repo
- the same repository does not appear twice

Repository layout:
- config/: configuration templates
- examples/manual/: editable manual classification templates
- src/github_stars_tool/: Python CLI
- data/raw/: exported star data
- data/normalized/: normalized star data
- data/classification/: classification outputs
- data/audit/: sync state and audit reports
- scripts/: bootstrap, menu, and helper scripts

- Do not run sync-lists and audit-lists in parallel
- Run audit-lists only after sync-lists
- If github.user is blank, sync derives the currently logged-in GitHub account from the browser session
- Runtime data under data/ is ignored by .gitignore, so the repository stays publishable
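
Conceptually, the audit step compares the membership each list should have, according to the classification file, with what is actually on GitHub. The sketch below shows that comparison as a plain set difference. It assumes one GitHub List per category and uses hypothetical helper names (`expected_membership`, `audit_lists`); the real command reads list membership through browser automation and may report differences in another shape.

```python
from collections import defaultdict


def expected_membership(entries):
    """Group classified repositories by category (assumed: one list per category)."""
    groups = defaultdict(set)
    for entry in entries:
        groups[entry["category"]].add(entry["full_name"])
    return dict(groups)


def audit_lists(expected, actual):
    """Return, per list, repositories that are missing from or extra in GitHub."""
    report = {}
    for name in sorted(set(expected) | set(actual)):
        want = set(expected.get(name, ()))
        have = set(actual.get(name, ()))
        missing = sorted(want - have)  # classified, but not in the GitHub List
        extra = sorted(have - want)    # in the GitHub List, but not classified there
        if missing or extra:
            report[name] = {"missing": missing, "extra": extra}
    return report
```

An empty report means the lists and the classification file agree.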