feat: add ecosystem domain allowlists from gh-aw#213
Add ecosystem identifier support for the network.allow front matter field. Users can now reference ecosystem names (e.g., python, rust, node) that expand to curated domain lists, matching gh-aw's approach.
Changes:
- Add src/data/ecosystem_domains.json sourced from gh-aw with 30+ ecosystem categories
- Add src/ecosystem_domains.rs module with lookup, validation, and compound ecosystem support
- Update generate_allowed_domains() to resolve ecosystem identifiers in both network.allow and network.blocked
- Extend dependency updater workflow to sync ecosystem_domains.json from gh-aw upstream
- Update AGENTS.md with ecosystem identifier documentation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
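The expansion described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in src/ecosystem_domains.rs: the function names `sample_ecosystems` and `expand_allow_list` are hypothetical, the table stands in for the embedded JSON, and the domains listed are illustrative only. The sketch also omits the warning for unknown identifiers.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for the data in ecosystem_domains.json.
/// The domains here are illustrative, not the curated lists.
fn sample_ecosystems() -> BTreeMap<String, Vec<String>> {
    let mut map = BTreeMap::new();
    map.insert(
        "python".to_string(),
        vec!["pypi.org".to_string(), "files.pythonhosted.org".to_string()],
    );
    map.insert("rust".to_string(), vec!["crates.io".to_string()]);
    map
}

/// Expand each entry of an allow list: a known ecosystem name becomes its
/// domain list, anything else is passed through unchanged as a raw domain.
fn expand_allow_list(
    entries: &[&str],
    ecosystems: &BTreeMap<String, Vec<String>>,
) -> BTreeSet<String> {
    let mut out = BTreeSet::new();
    for entry in entries {
        match ecosystems.get(*entry) {
            Some(domains) => out.extend(domains.iter().cloned()),
            None => {
                out.insert((*entry).to_string());
            }
        }
    }
    out
}
```

Collecting into a set keeps the result deduplicated and ordered when an ecosystem and a raw entry name the same domain.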
Move Lean runtime domains from hardcoded LEAN_REQUIRED_HOSTS constant into ecosystem_domains.json. The Lean extension now returns the ecosystem identifier 'lean' from required_hosts(), and generate_allowed_domains() resolves it via the JSON like any other ecosystem. Extension hosts now support ecosystem identifiers too. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add reservoir.lean-lang.org (Lake package registry) and static.lean-lang.org (toolchain binary downloads) to the lean ecosystem entry. Update LEAN_REQUIRED_HOSTS constant to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The canonical lean domain list is now in ecosystem_domains.json. The LeanExtension returns the "lean" ecosystem identifier, making this constant dead code. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The sync workflow previously replaced the local file with upstream verbatim, which would delete ado-aw-specific entries like 'lean'. Updated instructions to merge upstream changes while preserving local-only keys. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
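The merge semantics this commit asks for (take upstream changes, keep local-only keys such as 'lean') might look like the sketch below. The actual sync is carried out by workflow instructions rather than by this code; `merge_preserving_local` and the `Ecosystems` alias are hypothetical names, and letting upstream win for shared keys is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory shape of ecosystem_domains.json.
type Ecosystems = BTreeMap<String, Vec<String>>;

/// Start from upstream, then add back any key that exists only locally.
/// Keys present on both sides take the upstream value (an assumption).
fn merge_preserving_local(local: &Ecosystems, upstream: &Ecosystems) -> Ecosystems {
    let mut merged = upstream.clone();
    for (key, domains) in local {
        // or_insert_with only fires when upstream has no entry for this key.
        merged.entry(key.clone()).or_insert_with(|| domains.clone());
    }
    merged
}
```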
🔍 Rust PR Review
Summary: Needs changes: one data bug that would silently allow Rust crate registry access in Python-only pipelines, plus a doc typo that would break copy-paste examples.
Findings
🐛 Bugs / Logic Issues
Align field names with gh-aw conventions by renaming:
- network.allowed → network.allow
- network.blocked → network.block
Updates front matter parsing, compiler, tests, documentation, and all prompt/agent files to use the new field names.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Recompile lock files to reflect updated network field names in update-awf-version workflow. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace continue-after-if with an if/else block for clearer mutually-exclusive branching between ecosystem identifiers and raw domain validation. Also fix remaining stale "network.allowed" references in error messages, comments, and test descriptions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
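The mutually exclusive branching mentioned above can be pictured with a sketch like this one. The names `HostEntry`, `looks_like_ecosystem_identifier`, and `classify` are hypothetical, and the exact character rule is an assumption based on the PR description (identifiers such as python or linux-distros have no dots, while domains such as pypi.org do).

```rust
/// Hypothetical classification of one network host entry.
#[derive(Debug, PartialEq)]
enum HostEntry {
    Ecosystem(String),
    RawDomain(String),
}

/// Assumed heuristic: lowercase letters, digits, and hyphens only.
/// Dots, spaces, and other characters disqualify the string.
fn looks_like_ecosystem_identifier(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-')
}

/// Each entry takes exactly one branch: ecosystem lookup or raw domain
/// validation. An if/else makes that exclusivity explicit.
fn classify(entry: &str) -> HostEntry {
    if looks_like_ecosystem_identifier(entry) {
        HostEntry::Ecosystem(entry.to_string())
    } else {
        HostEntry::RawDomain(entry.to_string())
    }
}
```

In this sketch a string such as "bad host!" lands in the raw-domain branch, where DNS-style validation would then reject it.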
🔍 Rust PR Review
Summary: Good feature overall, but there's a data bug causing Rust domains to leak into the Python ecosystem, and a breaking rename that will silently drop custom allowlists from recompiled pipelines.
Findings
🐛 Bugs / Logic Issues
Add two tests to ecosystem_domains::tests:
- test_embedded_json_parses_as_expected_schema: validates the compile-time-embedded JSON deserializes correctly as HashMap<String, Vec<String>> and every ecosystem has a non-empty domain list.
- test_malformed_json_rejected: confirms serde_json rejects schema mismatches (string instead of array, non-string array elements, invalid JSON syntax), validating the safety of the .expect() guard on the LazyLock initializer.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… rename
Add #[serde(deny_unknown_fields)] to NetworkConfig so that the old field name (network.allow) produces a compile-time error instead of being silently ignored. This prevents users from losing their AWF domain allowlists after the rename to network.allowed. Also fix stale doc comments referencing the old field name.
Tests added:
- test_network_config_rejects_old_allow_field
- test_network_config_accepts_allowed_field
- test_network_config_rejects_arbitrary_unknown_field
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rename remaining occurrences of network.allow to network.allowed in:
- src/compile/standalone.rs: doc comments and warning/error messages
- tests/compiler_tests.rs: doc comments and test description strings
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR Review
Summary: Good feature addition, but contains one unannounced breaking change and a data quality issue in the embedded JSON that needs attention before merging.
Findings
🐛 Bugs / Logic Issues
🔒 Security Concerns
- Add warning in generate_allowed_domains() when an extension requires an unknown ecosystem identifier, matching the existing guard on the user-host path.
- Add depth guard (max 8) to get_ecosystem_domains() to prevent stack overflow from circular compound ecosystem references.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
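A depth guard of this kind could be sketched as below. This is not the body of get_ecosystem_domains(): the names `resolve_ecosystem`, `resolve_into`, and `Table` are hypothetical, and treating any entry without a dot as a reference to another ecosystem is an assumption made only for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Maximum nesting depth, matching the limit named in the commit message.
const MAX_DEPTH: usize = 8;

/// Hypothetical table shape: each entry is either a domain (contains a dot)
/// or the name of another ecosystem to expand.
type Table = BTreeMap<String, Vec<String>>;

fn resolve_into(name: &str, table: &Table, depth: usize, out: &mut BTreeSet<String>) {
    // Stop descending once the guard is exceeded, so a cycle cannot
    // recurse until the stack overflows.
    if depth > MAX_DEPTH {
        return;
    }
    if let Some(entries) = table.get(name) {
        for entry in entries {
            if entry.contains('.') {
                out.insert(entry.clone());
            } else {
                resolve_into(entry, table, depth + 1, out);
            }
        }
    }
}

/// Resolve an ecosystem name to its flattened, deduplicated domain set.
fn resolve_ecosystem(name: &str, table: &Table) -> BTreeSet<String> {
    let mut out = BTreeSet::new();
    resolve_into(name, table, 0, &mut out);
    out
}
```

With a circular pair of ecosystems, the recursion bottoms out at the depth limit and still returns every domain reached along the way.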
🔍 Rust PR Review
Summary: Well-implemented feature with one breaking change that needs awareness, and a subtle silent-failure path in the validation logic.
Findings
🐛 Bugs / Logic Issues
🔒 Security Concerns
Summary
Adds ecosystem identifier support for the network.allow front matter field, aligning with gh-aw's network allowlisting approach. Users can now reference ecosystem names (e.g., python, rust, node) that expand to curated domain lists instead of manually listing individual domains.
Before
After
Changes
- src/data/ecosystem_domains.json: New file. 30+ ecosystem categories sourced from gh-aw's ecosystem_domains.json, embedded at compile time
- src/ecosystem_domains.rs: New module. Loads/parses the JSON, provides get_ecosystem_domains(), is_ecosystem_identifier(), and is_known_ecosystem() functions, supports compound ecosystems (e.g., default-safe-outputs = defaults + dev-tools + github + local)
- src/compile/standalone.rs: Updated generate_allowed_domains() to resolve ecosystem identifiers in both network.allow and network.blocked. Unknown identifiers emit a compile-time warning. Added 5 integration tests.
- src/main.rs: Register new ecosystem_domains module
- .github/workflows/update-awf-version.md: Extended the dependency updater workflow to sync ecosystem_domains.json from gh-aw upstream (max PRs bumped 3→4)
- AGENTS.md: Documented ecosystem identifiers with available ecosystem table, updated architecture diagram, and updated {{ allowed_domains }} marker docs
Design Decisions
- … python, linux-distros); domain names contain dots (pypi.org). Invalid strings (spaces, special chars) fall through to existing DNS validation.
- … network.allow continue to work exactly as before. Ecosystem identifiers are purely additive.
- CORE_ALLOWED_HOSTS in allowed_hosts.rs is always included regardless of network: config; ecosystems are an additional layer.
Testing
- … ecosystem_domains.rs + 5 integration tests in standalone.rs)
- … "bad host!") continues to work correctly: the tightened identifier heuristic rejects it