Skip to content

Python: [BREAKING] Refactor FileSkillsSource for depth-based discovery and predicate filters#6488

Open
giles17 wants to merge 2 commits into
microsoft:mainfrom
giles17:giles17/refactor-file-skills-depth-based-discovery
Open

Python: [BREAKING] Refactor FileSkillsSource for depth-based discovery and predicate filters#6488
giles17 wants to merge 2 commits into
microsoft:mainfrom
giles17:giles17/refactor-file-skills-depth-based-discovery

Conversation

@giles17

@giles17 giles17 commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Motivation and Context

Ports the refactoring from .NET PR #6109 to the Python side. The old directory-whitelist approach (resource_directories/script_directories) was rigid — users had to know exact subdirectory names upfront. The new depth-based scanning with optional filter predicates is strictly more flexible and aligns both language implementations.

Description

Refactors FileSkillsSource to replace preconfigured directory whitelists with depth-controlled recursive scanning and user-supplied filter predicates.

Removed:

  • resource_directories / script_directories constructor options

Added:

  • search_depth option (>= 1, default 2): controls how deep the recursive scan goes within each skill directory. Depth 1 = root only; depth 2 = root + one level of subdirectories.
  • script_filter / resource_filter predicate options that receive a FileSkillFilterContext (skill_name + relative_file_path), allowing whitelist/blacklist filtering by file path.
  • FileSkillFilterContext class exported from agent_framework.

Preserved:

  • All security checks (path containment, symlink detection) using the skill root as the trusted boundary.
  • Extension-based filtering (resource_extensions / script_extensions).

Updated:

  • SkillsProvider.from_paths() accepts the new parameters.
  • Discovery methods refactored from @staticmethod to instance methods.
  • Tests rewritten to validate depth-based behavior and filter predicates.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add [BREAKING] prefix to the title of the PR.

…y and predicate filters

Refactors FileSkillsSource to make script and resource discovery more flexible.

## Changes

- **Drops** resource_directories / script_directories options (preconfigured
  directory whitelists).
- **Adds** search_depth option (>= 1, default 2): controls how deep the
  recursive scan goes within each skill directory.
- **Adds** script_filter / resource_filter predicate options that receive a
  FileSkillFilterContext (skill_name + relative_file_path), allowing
  whitelist/blacklist filtering by file path.
- **Adds** FileSkillFilterContext class exported from agent_framework.

## Notes

- The Skills API is marked @experimental -- the option removals are intentional
  breaking changes within the experimental surface.
- Security checks (path containment, symlink detection) are preserved and
  continue to use the skill root directory as the trusted boundary.
- Ports the same refactoring from .NET PR microsoft#6109 while following Python
  conventions (instance methods, Callable type hints, __slots__).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 11, 2026 22:04
@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/core/agent_framework
   _skills.py10254096%294, 541, 1007, 1022, 1024–1025, 1381–1382, 1394–1395, 1676, 1705, 2174, 2624–2625, 2722, 2730, 2735, 2738, 2743, 2763, 2775, 2780, 2878, 2886, 2891, 2894, 2899, 2919, 2928, 2933, 3196–3197, 3546, 3773–3774, 3801–3802, 3809–3810
   _tools.py11389092%222–223, 400, 402, 415, 440–442, 450, 468, 482, 489, 496, 519, 521, 528, 536, 665, 699–701, 704–706, 708, 714, 765–767, 792, 818, 822, 860–862, 866, 888, 1031–1032, 1036, 1072, 1084, 1091–1094, 1115, 1119, 1123, 1137–1139, 1489, 1581, 1609, 1631, 1639, 1730, 1737–1738, 1797, 1801, 1847, 1908–1909, 1956, 1970, 1973, 1986, 1989, 2012, 2019, 2029, 2033, 2112, 2165, 2185, 2187, 2243, 2322, 2519, 2585–2586, 2738, 2743, 2745–2746, 2815, 2820, 2827
TOTAL39144447688% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
7815 34 💤 0 ❌ 0 🔥 2m 5s ⏱️

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors Python file-based skill discovery to match the newer .NET behavior by replacing fixed subdirectory whitelists with depth-based recursive scanning and optional predicate filters, while keeping existing containment/symlink safety checks.

Changes:

  • Replace resource_directories / script_directories with search_depth (default 2) for recursive discovery within each skill directory.
  • Add script_filter / resource_filter predicates using the new exported FileSkillFilterContext to allow per-file inclusion/exclusion decisions.
  • Rewrite/extend unit tests to validate depth behavior, filtering, and existing safety guarantees (path traversal + symlink skipping).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
python/packages/core/tests/core/test_skills.py Updates tests to validate depth-based discovery and filter predicate behavior.
python/packages/core/agent_framework/_tools.py Minor refactor in approval-request state handling (no functional change intended).
python/packages/core/agent_framework/_skills.py Implements search_depth + predicate filters for FileSkillsSource, adds FileSkillFilterContext, updates SkillsProvider.from_paths() surface.
python/packages/core/agent_framework/init.py Exports FileSkillFilterContext from the public package surface.

Comment thread python/packages/core/agent_framework/_skills.py

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 5 | Confidence: 89%

✓ Correctness

The refactoring from directory-whitelist to depth-based scanning with filter predicates is well-implemented and correct. The depth logic correctly maps search_depth=1 to root-only, search_depth=2 to root+one-level, etc. Security checks (path containment, symlink detection) are preserved at all recursion levels. Filter predicates are applied after security checks, and the FileSkillFilterContext validation is sound. The from_paths factory method correctly forwards new parameters. No correctness bugs found.

✓ Security Reliability

The refactoring correctly preserves all security invariants (path containment, symlink detection at both directory and file levels). The depth-based recursion applies directory-level security checks before descending and file-level checks before including any resource. One minor reliability concern: unhandled exceptions from user-provided filter predicates would crash the entire skill discovery process rather than gracefully skipping the affected file.

✓ Test Coverage

Test coverage for the refactored FileSkillsSource is generally strong. Depth-based discovery is well-tested for resources at both unit and integration levels. Filter predicates are tested for both resources and scripts. However, there's an asymmetry: resource_filter has a dedicated context-verification test (test_resource_filter_receives_correct_context) confirming that skill_name and relative_file_path are passed correctly, but script_filter lacks an equivalent test. Similarly, from_paths has a passthrough test for resource_filter and search_depth but not for script_filter. These are low-severity gaps since the code paths are structurally identical.

✓ Failure Modes

The refactoring from directory-whitelists to depth-based recursive scanning is well-implemented from a failure-mode perspective. Error handling for OS-level failures (iterdir OSError), security checks (symlinks, path traversal), and input validation (search_depth >= 1) are all preserved or improved. User-provided filter predicates follow standard Python fail-fast callback semantics. No silent failures, lost errors, or partial-write scenarios were identified.

✗ Design Approach

The refactor mostly hangs together, but the new recursive file discovery no longer preserves skill boundaries when one skill directory contains another nested skill directory. Because nested SKILL.md folders are still discovered as separate skills elsewhere, the current approach can silently attach a child skill’s resources/scripts to its parent as well, which is a design regression worth fixing before merge.

Flagged Issues

  • Nested skill directories are treated as separate skills by _discover_skill_directories (lines 3161-3195), but the new recursive resource/script scanners descend into every subdirectory (lines 2796-2807 and 2945-2956) without checking for a nested SKILL.md. With parent/SKILL.md and parent/child/SKILL.md, the child's files will be surfaced under both skills. The scaners should stop recursing into subdirectories that contain their own SKILL.md.

Automated review by giles17's agents

Comment thread python/packages/core/agent_framework/_skills.py
@github-actions

Copy link
Copy Markdown
Contributor

Flagged issue

Nested skill directories are treated as separate skills by _discover_skill_directories (lines 3161-3195), but the new recursive resource/script scanners descend into every subdirectory (lines 2796-2807 and 2945-2956) without checking for a nested SKILL.md. With parent/SKILL.md and parent/child/SKILL.md, the child's files will be surfaced under both skills. The scaners should stop recursing into subdirectories that contain their own SKILL.md.


Source: automated DevFlow PR review

@giles17 giles17 changed the title Python: [Breaking] Refactor FileSkillsSource for depth-based discovery and predicate filters Python: [BREAKING] Refactor FileSkillsSource for depth-based discovery and predicate filters Jun 12, 2026
…rectories

- Add clarifying comments distinguishing MAX_SEARCH_DEPTH (SKILL.md
  discovery) from DEFAULT_SEARCH_DEPTH (per-skill resource/script scanning).
- Stop recursing into subdirectories that contain their own SKILL.md,
  preventing child skill files from being attached to the parent skill.
- Add test verifying nested skill boundary is respected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants