Add debug-github-ci and debug-jenkins-ci skills by neubig · Pull Request #76 · OpenHands/extensions

neubig · 2026-02-26T20:14:41Z

Summary

Adds two CI-debugging extensions:

- debug-github-ci for GitHub Actions failures
- debug-jenkins-ci for Jenkins build failures

The branch now also adds unit tests for the pure log-processing and formatting helpers that earlier reviews requested.

Details

What these extensions provide

GitHub Actions

- skill guidance for interactive CI debugging in OpenHands
- a composite GitHub Action that can analyze a failed workflow run automatically
- context-aware log truncation that prioritizes error regions before falling back to head/tail truncation

Jenkins

- skill guidance for interactive Jenkins debugging
- an agent script that fetches build metadata, stages, and console output
- the same context-aware truncation approach for large logs
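
For readers unfamiliar with the idea, here is a minimal sketch of context-aware truncation. It is illustrative only and is not the code in this PR: the function names, the default error pattern, and the `max_lines` and `context` parameters are all assumptions made for the example. It keeps lines near error matches when any exist, and otherwise keeps the head and tail of the log.

```python
import re

# Hypothetical default pattern; the real scripts may use different ones.
ERROR_PATTERN = re.compile(r"error|failed|exception|traceback", re.IGNORECASE)


def find_error_context(lines, pattern=ERROR_PATTERN, context=2):
    """Return sorted indices of lines within `context` lines of a match."""
    keep = set()
    for i, line in enumerate(lines):
        if pattern.search(line):
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            keep.update(range(lo, hi))
    return sorted(keep)


def truncate_logs(text, max_lines=200, context=2):
    """Keep error regions when present; otherwise keep the head and tail."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text

    # Prefer lines around errors, capped at the line budget.
    keep = find_error_context(lines, context=context)[:max_lines]
    if not keep:
        # No error regions found: fall back to head/tail truncation.
        head = max_lines // 2
        tail = max_lines - head
        keep = list(range(head)) + list(range(len(lines) - tail, len(lines)))

    out = []
    prev = -1
    for idx in keep:
        if idx != prev + 1:
            # Mark every gap so the reader knows lines were dropped.
            out.append(f"... [{idx - prev - 1} lines omitted] ...")
        out.append(lines[idx])
        prev = idx
    if prev != len(lines) - 1:
        out.append(f"... [{len(lines) - 1 - prev} lines omitted] ...")
    return "\n".join(out)
```

The omission markers are added on top of the `max_lines` budget, so the output can be slightly longer than the budget. That trade-off keeps the sketch simple; a stricter implementation would count the markers as well.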

Review-driven follow-up changes on this branch

- add pytest coverage for the pure helper functions called out in review:
  - GitHub: _find_error_context, _truncate_logs, format_failed_jobs
  - Jenkins: _find_error_context, _truncate_logs, format_duration, format_timestamp, and prompt.format_prompt
- compile regex patterns once per scan instead of recompiling in the hot loop
- skip invalid custom regex patterns with a warning instead of crashing the scan
- make the agent-script modules importable for unit tests even when the full OpenHands runtime is not installed locally
- document the pluginRoot marketplace resolution and why the plugin directories contain skill symlinks
- document the Ubuntu runner assumption next to GitHub CLI installation in the action
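
The two regex changes above can be sketched as follows. This is a hedged illustration rather than the code on this branch: `compile_patterns` and `find_matching_lines` are hypothetical names, and case-insensitive matching is an assumption. Patterns are compiled once before the scan, and any pattern that fails to compile is logged and skipped.

```python
import logging
import re

logger = logging.getLogger(__name__)


def compile_patterns(raw_patterns):
    """Compile each pattern once; skip invalid ones with a warning."""
    compiled = []
    for raw in raw_patterns:
        try:
            compiled.append(re.compile(raw, re.IGNORECASE))
        except re.error as exc:
            # A bad custom pattern should not abort the whole scan.
            logger.warning("Skipping invalid pattern %r: %s", raw, exc)
    return compiled


def find_matching_lines(lines, raw_patterns):
    """Return indices of lines matched by any valid pattern."""
    patterns = compile_patterns(raw_patterns)  # compiled once, outside the loop
    return [
        i for i, line in enumerate(lines)
        if any(p.search(line) for p in patterns)
    ]
```

With this shape, a log of N lines and K patterns costs K compilations per scan instead of up to N × K compile calls or cache lookups inside the loop.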

Testing

$ python3 -m pytest \
    tests/test_debug_github_ci_agent_script.py \
    tests/test_debug_jenkins_ci_helpers.py -q
............
12 passed in 0.04s

Evidence

Verification link: View conversation

Cannot be fully verified end-to-end in current environment:

What I tried: reviewed the plugin/skill wiring, added the requested pure-function tests, and ran those tests locally.
Resource needed: a real failing GitHub Actions run plus a reachable Jenkins instance with credentials so the full automation paths can be exercised end-to-end.
Reason: the primary user-facing behavior here is autonomous debugging against live CI systems. I do not have honest live-run evidence yet for the GitHub Action + Jenkins integrations themselves in this environment.
Manual verification steps:
1. Trigger the GitHub workflow against a known failed Actions run and confirm the agent posts a useful diagnosis.
2. Run the Jenkins script against a failed Jenkins build with valid credentials.
3. Confirm both flows fetch logs, summarize failed jobs/stages, and preserve relevant error context in truncated logs.

Checklist

Pure helper tests added for GitHub and Jenkins scripts
Regex hot-loop recompilation fixed in both scripts
Review-requested documentation clarifications added
Live GitHub Actions / Jenkins end-to-end evidence gathered
All review threads resolved

all-hands-bot

Taste Rating: 🔴 Needs Improvement

Core Issue: These are plugins with executable code being registered as skills in the marketplace, with broken source paths. The marketplace registration fundamentally conflicts with the actual directory structure.

Verdict: ❌ Needs rework - Path resolution is broken, architecture needs clarification.

Key Insight: You cannot register plugins as skills. Pick one: either register the SKILL.md files as skills (correct paths), or register these as plugins with their automation capabilities. The current approach breaks both.

marketplaces/default.json

plugins/debug-github-ci/scripts/agent_script.py

plugins/debug-jenkins-ci/scripts/agent_script.py

plugins/debug-github-ci/action.yml

plugins/debug-github-ci/skills/debug-github-ci

skills/debug-github-ci/SKILL.md

plugins/debug-github-ci/scripts/agent_script.py

all-hands-bot

Taste Rating: 🟡 Acceptable - Works but has unresolved design issues

Core Pattern: Previous review feedback was "resolved" with explanations rather than fixes. The code works, but fundamental concerns remain unaddressed.

Verdict: ✅ Worth merging with caveats - The automation is functional and useful, but carries technical debt from day one.

Key Insight: You built plugins with executable automation but registered them as skills in the marketplace. This identity crisis creates confusion about what these artifacts actually are and how they should be used.

plugins/debug-github-ci/scripts/agent_script.py

marketplaces/default.json

plugins/debug-github-ci/scripts/prompt.py

plugins/debug-jenkins-ci/scripts/agent_script.py

plugins/debug-github-ci/README.md

plugins/debug-github-ci/action.yml

all-hands-bot

Taste Rating: 🟡 Acceptable

Core Pattern: Solves a real problem (CI debugging) with straightforward implementation. Previous feedback addressed most technical issues. Remaining concerns are operational - cost controls, rate limiting, and failure modes.

Verdict: ✅ Worth merging with operational caveats - Code quality is solid after refactoring, but production usage needs monitoring.

Key Insight: Every CI failure will trigger an LLM agent that costs money and makes API calls. No safeguards for rate limiting, cost caps, or recursive failures.

plugins/debug-github-ci/workflows/debug-ci-failure.yml

plugins/debug-github-ci/scripts/agent_script.py

marketplaces/default.json

plugins/debug-github-ci/scripts/agent_script.py

all-hands-bot

Taste Rating: 🟡 Acceptable - Works but has unresolved issues

Core Issue: The code is reasonably well-structured after previous refactorings, but marketplace paths are still broken, there are no tests, and some failure modes remain unhandled.

Verdict: ❌ Needs fixes - Fix the marketplace paths and add basic test coverage before merging.

marketplaces/default.json

plugins/debug-github-ci/scripts/agent_script.py

plugins/debug-jenkins-ci/scripts/agent_script.py

plugins/debug-github-ci/action.yml

plugins/debug-github-ci/scripts/agent_script.py

plugins/debug-jenkins-ci/scripts/agent_script.py

all-hands-bot

Taste Rating: 🟡 Acceptable - Works but misses fundamentals

Core Pattern: Solves a real problem (CI debugging) with mostly clean code, but ships 2000+ lines with zero tests for functions that have edge cases and can break silently.

Verdict: ❌ Needs fixes - Add tests for pure functions and fix regex compilation issue before merging.

Key Insight: Previous reviews spent energy on architecture bikeshedding (paths, dual-mode, recursion) but missed the fundamental issue: you cannot verify correctness of log truncation, pattern matching, and formatting without tests. "We'll rely on integration testing" is a lazy excuse when you have pure functions with clear edge cases.

plugins/debug-github-ci/scripts/agent_script.py

plugins/debug-jenkins-ci/scripts/agent_script.py

plugins/debug-github-ci/skills/debug-github-ci

all-hands-bot

Taste Rating: 🔴 Needs Improvement

Core Issue: Three unresolved critical issues remain from previous reviews: regex compilation in hot loops (performance), zero tests for 2000+ lines (correctness), and symlink architecture confusion (maintainability).

Verdict: ❌ Needs rework before merge - Fix regex compilation and add tests for pure functions.

Key Insight: You solved the hard problems (recursion guards, configurable patterns, input validation) but left trivial performance bugs and skipped testing entirely. The regex issue has a 3-line fix; the testing objection ("ROI is low") doesn't hold for pure functions with no API dependencies.

plugins/debug-github-ci/scripts/agent_script.py

plugins/debug-jenkins-ci/scripts/agent_script.py

plugins/debug-github-ci/scripts/agent_script.py

plugins/debug-jenkins-ci/scripts/agent_script.py

plugins/debug-github-ci/skills/debug-github-ci

marketplaces/default.json

plugins/debug-github-ci/action.yml

plugins/debug-jenkins-ci/scripts/agent_script.py

neubig marked this pull request as ready for review March 1, 2026 13:17

all-hands-bot reviewed Mar 1, 2026

neubig marked this pull request as draft March 1, 2026 13:28

neubig marked this pull request as ready for review March 2, 2026 03:40

all-hands-bot reviewed Mar 2, 2026

neubig marked this pull request as draft March 2, 2026 12:21

neubig marked this pull request as ready for review March 2, 2026 12:36

all-hands-bot reviewed Mar 2, 2026

neubig marked this pull request as draft March 2, 2026 12:46

neubig marked this pull request as ready for review March 2, 2026 13:00

all-hands-bot reviewed Mar 2, 2026

neubig marked this pull request as draft March 2, 2026 13:09

neubig marked this pull request as ready for review March 2, 2026 13:16

all-hands-bot reviewed Mar 2, 2026

neubig marked this pull request as draft March 3, 2026 13:50

Add debug-github-ci and debug-jenkins-ci skills

b617dbd

Squashed commit from add-ci-debug-skills branch with conflict resolution. Includes all changes from the original PR with main branch merged. Co-authored-by: openhands <openhands@all-hands.dev>

neubig force-pushed the add-ci-debug-skills branch from 256241f to b617dbd Compare March 3, 2026 13:52

neubig marked this pull request as ready for review March 3, 2026 13:53

all-hands-bot reviewed Mar 3, 2026

neubig marked this pull request as draft March 7, 2026 19:24

openhands-agent added 2 commits March 7, 2026 21:39

test(debug-github-ci): cover log parsing helpers

bb1e2f6

Co-authored-by: openhands <openhands@all-hands.dev>

test(debug-jenkins-ci): cover pure helpers

4ce5ef2

Co-authored-by: openhands <openhands@all-hands.dev>
