diff --git a/DOCS_UPDATES.md b/DOCS_UPDATES.md new file mode 100644 index 00000000..450161c6 --- /dev/null +++ b/DOCS_UPDATES.md @@ -0,0 +1,169 @@ +# Documentation Updates - Factual Language Alignment & PR #339 Review + +## Overview +Updated documentation files to: +1. Align with PR #339 terminology and remove superlatives for a factual, professional tone +2. Add missing content from published blog (monitoring and improving skills) +3. Address feedback from PR #339 review comments + +## Files Modified + +### 1. overview/skills.mdx + +**Line 138 - Added monitoring link:** +- Added: `- **To monitor skill performance**: See [Monitoring and Improving Skills](/overview/skills/monitoring)` +- Reason: Provide clear navigation to new monitoring documentation + +### 2. overview/plugins.mdx + +**Line 18 - Removed conversational phrase:** +- Before: `Think of plugins as "extension packs" that add a complete feature set to OpenHands in one step.` +- After: `Plugins package multiple components into a single unit, adding a complete feature set to OpenHands in one step.` +- Reason: Removed "Think of" conversational framing; made statement more direct and factual + +**Line 30-35 - Updated terminology to match blog:** +- Before: `**Single-purpose knowledge packages**` and `Easy to create and share` +- After: `**Specialized prompts for specific tasks**` and `Quick to create and share` +- Reason: Changed "knowledge packages" to "specialized prompts" to align with PR #339 terminology; removed superlative "Easy" → "Quick" + +### 3. 
overview/skills/creating.mdx + +**Line 39-41 - Removed superlatives:** +- Before: `### Easiest Way: Let OpenHands Help` and `The simplest way to create a skill is to ask OpenHands to help you:` +- After: `### Automated Approach: Let OpenHands Help` and `To create a skill with guided assistance, ask OpenHands to help you:` +- Reason: Removed superlatives "Easiest" and "simplest"; replaced with factual descriptions "Automated Approach" and "guided assistance" + +**Line 493-495 - Added monitoring reference:** +- Added Info callout linking to new monitoring page +- Reason: Point users to production monitoring guidance without making creating.mdx too long + +**Line 526 - Enhanced Next Steps section:** +- Added: `- **[Monitor skill performance](/overview/skills/monitoring)** in production` +- Reason: Make monitoring discoverable in natural workflow progression + +**Line 534-540 - Enhanced Further Reading section:** +- Added: `[Monitoring Skills](/overview/skills/monitoring)` link (first position for emphasis) +- Added: `[Observability & Tracing](/sdk/guides/observability)` link +- Added: `[GitHub Workflows](/sdk/guides/github-workflows/pr-review)` link +- Reason: Provide clear path to monitoring and automation resources + +## Files Created + +### overview/skills/monitoring.mdx (NEW) + +Created dedicated page for production skill monitoring, covering: + +**Main sections:** +1. **The Monitoring Workflow** - Four-part process (logging, evaluating, dashboarding, aggregating) +2. **Logging Agent Behavior** - OpenTelemetry/Laminar setup for SDK and GitHub Actions +3. **Evaluating Performance** - Defining meaningful metrics with PR review example +4. **Dashboarding Metrics** - Visualizing trends over time +5. **Aggregating Feedback** - Using LLMs to analyze patterns and suggest improvements +6. **Deployment in Automated Workflows** - CI/CD integration patterns +7. 
**Best Practices** - Accordion sections with actionable guidance + +**Key features:** +- Technical, down-to-earth tone matching the blog +- Practical examples with actual metrics (suggestion_accuracy formula) +- Links to SDK observability guide and GitHub workflow examples +- References to extensions repository for complete implementations +- Source: Published blog at https://openhands.dev/blog/20260227-creating-effective-agent-skills + +**Justification:** +- PR #339 review identified missing monitoring content from blog +- Separate page keeps creating.mdx focused and prevents 116-line section bloat +- Better organization for users who want production deployment guidance +- Enables deeper coverage with Best Practices accordions + +## Changes Not Made + +No language changes were made to the following files: +- overview/skills.mdx - Reviewed; already uses factual language throughout (only the monitoring link listed above was added) +- overview/skills/adding.mdx - Not reviewed in detail (lower priority) +- overview/skills/org.mdx - Not reviewed in detail (lower priority) +- overview/skills/keyword.mdx - Not reviewed in detail (lower priority) +- overview/skills/public.mdx - Not reviewed in detail (lower priority) +- overview/skills/repo.mdx - Not reviewed in detail (lower priority) + +## Terminology Alignment + +Updated to match blog.md terminology from PR #339: +- ✅ "specialized prompts" instead of "knowledge packages" +- ✅ Removed conversational frames ("Think of", "Here's what") +- ✅ Removed superlatives ("easiest", "simplest", "perfect") +- ✅ Direct, factual statements instead of marketing language + +## Verification + +Ran the following grep searches to verify: +```bash +# Check for common superlatives +grep -rn -E "(powerful|beautifully|dramatically|perfect|amazing)" overview/ + +# Check for conversational phrases +grep -rn -E "(Think of|imagine|Let's|you'll)" overview/ + +# Check for terminology consistency +grep -rn "knowledge package" overview/ +``` + +Results: All primary 
documentation files (skills.mdx, plugins.mdx, creating.mdx) now use factual language consistent with PR #339. + +## PR #339 Review Feedback Addressed + +Based on OpenHands agent review at https://github.com/OpenHands/docs/pull/339: + +### ✅ Issue 1: Missing Monitoring Section +**Status:** RESOLVED +- Created dedicated `overview/skills/monitoring.mdx` page covering all monitoring content from blog +- Covers logging, evaluation, dashboarding, and aggregation workflows +- Includes practical PR review skill examples with actual metrics +- Added navigation links from skills.mdx and from creating.mdx (Info callout, Next Steps, and Further Reading) +- Chose separate page instead of 116-line section to keep creating.mdx focused + +### ✅ Issue 2: No Link to Observability Guide +**Status:** RESOLVED +- Added link in monitoring.mdx: "See the [SDK Observability Guide](/sdk/guides/observability)" +- Added to creating.mdx Further Reading section +- Provides clear path to technical implementation details + +### ✅ Issue 3: Missing GitHub CI/Actions Example +**Status:** RESOLVED +- Added "Deployment in Automated Workflows" section in monitoring.mdx +- References GitHub Actions examples in extensions repository +- Links to PR review action and evaluation workflow examples +- Added GitHub Workflows link to creating.mdx Further Reading + +### ℹ️ Issue 4: Path Inconsistency +**Status:** VERIFIED AS CORRECT +- Reviewer mentioned `~/.agents/skills/` vs `~/.openhands/skills/` +- Verification shows documentation is consistent: + - User-level: `~/.openhands/skills/` (CLI) + - Repo-level: `.agents/skills/` (all platforms) +- No changes needed + +### ℹ️ Issue 5: Duplicate Nav Entry +**Status:** OUT OF SCOPE +- Reviewer mentioned duplicate `prompting-best-practices` in docs.json +- Not related to skills/plugins documentation changes +- Should be addressed separately if still present + +## Writing Style + +All additions maintain the technical, down-to-earth tone from the blog: +- Direct, factual statements +- 
Practical examples with real code/metrics +- No superlatives or marketing language +- Clear headings and structure +- Actionable guidance + +## Next Steps (if needed) + +To complete comprehensive documentation alignment: +1. Review secondary skills documentation files (adding.mdx, org.mdx, etc.) +2. Check SDK documentation for similar language patterns +3. Review quickstart and tutorial content +4. Verify consistency across all platform-specific guides + +## Date +2025-03-06 (based on PR review date) diff --git a/docs.json b/docs.json index 5cb72d29..a5628b1f 100644 --- a/docs.json +++ b/docs.json @@ -33,6 +33,19 @@ { "group": "Essential Guidelines", "pages": [ + "overview/model-context-protocol", + { + "group": "Skills", + "pages": [ + "overview/skills", + "overview/skills/repo", + "overview/skills/keyword", + "overview/skills/org", + "overview/skills/public", + "overview/skills/adding" + ] + }, + "overview/plugins", "openhands/usage/essential-guidelines/when-to-use-openhands", "openhands/usage/tips/prompting-best-practices", "openhands/usage/essential-guidelines/good-vs-bad-instructions", @@ -42,6 +55,8 @@ { "group": "Onboarding OpenHands", "pages": [ + "openhands/usage/tips/prompting-best-practices", + "overview/skills/creating", "openhands/usage/customization/repository", "overview/skills/repo" ] diff --git a/overview/plugins.mdx b/overview/plugins.mdx new file mode 100644 index 00000000..05787f6d --- /dev/null +++ b/overview/plugins.mdx @@ -0,0 +1,562 @@ +--- +title: Plugins +description: Plugins bundle multiple agent components together—skills, hooks, MCP servers, agents, and commands—into reusable packages that extend OpenHands capabilities. +--- + +Plugins provide a way to package and distribute multiple agent components as a single unit. Instead of managing individual skills, hooks, and configurations separately, plugins bundle everything together for easier installation and distribution. + +## What Are Plugins? 
+ +A plugin is a directory structure that can contain: + +- **Skills**: Specialized knowledge and workflows +- **Hooks**: Event handlers for tool lifecycle +- **MCP Config**: External tool server configurations +- **Agents**: Specialized agent definitions +- **Commands**: Slash commands + +Plugins package multiple components into a single unit, adding a complete feature set to OpenHands in one step. + + +The plugin format is compatible with the [Claude Code plugin structure](https://github.com/anthropics/claude-code/tree/main/plugins), enabling ecosystem interoperability. + + +## Plugins vs Skills + +Understanding the difference helps you choose the right approach: + + + + **Specialized prompts for specific tasks** + + - One skill = one specific capability + - Just a SKILL.md file (+ optional resources) + - Lightweight and focused + - Quick to create and share + + **When to use:** + - Adding single capabilities + - Simple workflows + - Domain-specific knowledge + - Quick solutions + + + + **Multi-component bundles** + + - Multiple skills + hooks + config + - Complete feature ecosystems + - Coordinated components + - Professional distribution + + **When to use:** + - Complete feature sets + - Tool integrations + - Team standards + - Commercial distributions + + + +### Comparison Table + +| Aspect | Skills | Plugins | +|--------|--------|---------| +| **Complexity** | Simple | Comprehensive | +| **Components** | Knowledge only | Skills + hooks + MCP + commands | +| **Use Case** | Single capability | Complete feature set | +| **Creation** | Few minutes | Planned development | +| **Distribution** | Copy directory | Structured package | +| **Maintenance** | Individual files | Coordinated bundle | + +### When to Use Each + +**Use a Skill when you need:** +- A single reusable prompt or workflow +- Domain-specific knowledge +- Simple automation +- Quick solutions + +**Use a Plugin when you need:** +- Multiple related skills working together +- Event handlers (hooks) for 
tool actions +- External tool integrations (MCP) +- Complete platform integrations +- Team or organizational standards + +**Example: Code Quality** + +*As separate skills:* +``` +.agents/skills/ +├── python-linting/ +├── code-review/ +└── pre-commit-setup/ +``` + +*As a plugin:* +``` +code-quality-plugin/ +├── .plugin/plugin.json # Plugin metadata +├── skills/ +│ ├── linting/ +│ ├── review/ +│ └── setup/ +├── hooks/hooks.json # Post-edit linting +└── .mcp.json # Code analysis tools +``` + +The plugin version bundles all quality-related capabilities and automatically runs checks after file edits. + +## Plugin Structure + +A complete plugin follows this directory structure: + +``` +plugin-name/ +├── .plugin/ +│ └── plugin.json # Required: Plugin metadata +├── skills/ +│ └── skill-name/ +│ └── SKILL.md # Individual skills +├── hooks/ +│ └── hooks.json # Tool lifecycle hooks +├── agents/ +│ └── agent-name.md # Specialized agents +├── commands/ +│ └── command-name.md # Slash commands +├── .mcp.json # MCP server config +└── README.md # Documentation +``` + +### Required Components + +Only one file is required: + +- **`plugin-name/.plugin/plugin.json`**: Plugin metadata + +All other components are optional—include only what your plugin needs. + +### Plugin Metadata + +The `plugin.json` file defines your plugin: + +```json +{ + "name": "code-quality", + "version": "1.0.0", + "description": "Code quality tools and workflows", + "author": "your-name", + "license": "MIT", + "repository": "https://github.com/example/code-quality-plugin" +} +``` + +## Plugin Components Explained + + + + Skills in plugins work identically to standalone skills. Each skill has its own directory with a SKILL.md file: + + ``` + skills/ + ├── linting/ + │ ├── SKILL.md + │ └── scripts/ + └── testing/ + └── SKILL.md + ``` + + See [Skills Documentation](/overview/skills) for skill creation details. 
+ + + + Hooks are event handlers that run during tool lifecycle events: + + ```json + { + "hooks": { + "PostToolUse": [ + { + "matcher": "file_editor", + "hooks": [ + { + "type": "command", + "command": "ruff check {file_path}", + "timeout": 10 + } + ] + } + ] + } + } + ``` + + **Common use cases:** + - Run linters after file edits + - Validate tool inputs + - Log tool usage + - Trigger dependent actions + + **Available hook events:** + - `PreToolUse`: Before tool execution + - `PostToolUse`: After tool execution + - `OnError`: When tool fails + + + + MCP (Model Context Protocol) servers provide external tools and resources: + + ```json + { + "mcpServers": { + "fetch": { + "command": "uvx", + "args": ["mcp-server-fetch"] + }, + "github": { + "command": "uvx", + "args": ["mcp-server-github"], + "env": { + "GITHUB_TOKEN": "${GITHUB_TOKEN}" + } + } + } + } + ``` + + **Use cases:** + - Connect to external APIs + - Add specialized tools + - Integrate third-party services + + Learn more: [Model Context Protocol](/overview/model-context-protocol) + + + + Specialized agent definitions for specific tasks: + + ```markdown + --- + name: code-reviewer + description: Specialized agent for code review tasks + --- + + # Code Review Agent + + This agent specializes in reviewing code according to team standards... + ``` + + Agents in plugins can use the plugin's skills and hooks automatically. + + + + Custom slash commands for plugin functionality: + + ```markdown + --- + name: /lint + description: Run linters on current file + --- + + # Lint Command + + Run configured linters on the current file... + ``` + + Commands provide quick access to plugin features. 
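The component layout described above can be checked mechanically before a plugin is loaded. The following is a minimal sketch, not part of OpenHands: `inspect_plugin` and `OPTIONAL_COMPONENTS` are hypothetical names, and the paths are taken from the directory structure shown on this page. It confirms that the required `.plugin/plugin.json` exists and parses as JSON, then reports which optional components are present.

```python
import json
import tempfile
from pathlib import Path

# Optional components and their locations, copied from the plugin layout
# documented above. The mapping itself is an illustrative assumption.
OPTIONAL_COMPONENTS = {
    "skills": "skills",
    "hooks": "hooks/hooks.json",
    "agents": "agents",
    "commands": "commands",
    "mcp": ".mcp.json",
}


def inspect_plugin(plugin_dir):
    """Return (metadata, present_components) for a plugin directory.

    Raises FileNotFoundError if .plugin/plugin.json is missing and
    json.JSONDecodeError if it is not valid JSON.
    """
    root = Path(plugin_dir)
    manifest = root / ".plugin" / "plugin.json"
    if not manifest.is_file():
        raise FileNotFoundError(f"missing required file: {manifest}")
    metadata = json.loads(manifest.read_text(encoding="utf-8"))
    present = sorted(
        name for name, rel in OPTIONAL_COMPONENTS.items() if (root / rel).exists()
    )
    return metadata, present


if __name__ == "__main__":
    # Build a throwaway plugin so the sketch runs without a real plugin on disk.
    with tempfile.TemporaryDirectory() as tmp:
        plugin = Path(tmp) / "code-quality-plugin"
        (plugin / ".plugin").mkdir(parents=True)
        (plugin / ".plugin" / "plugin.json").write_text(
            json.dumps({"name": "code-quality", "version": "1.0.0"}),
            encoding="utf-8",
        )
        (plugin / "skills" / "linting").mkdir(parents=True)
        (plugin / "hooks").mkdir()
        (plugin / "hooks" / "hooks.json").write_text('{"hooks": {}}', encoding="utf-8")
        metadata, present = inspect_plugin(plugin)
        print(metadata["name"], present)
```

The check only looks at file and directory presence; it does not validate the contents of `hooks.json` or `.mcp.json`, which would need the schemas from the SDK guides.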
+ + + +## Using Plugins + +How you use plugins depends on your platform: + + + + **Via configuration file:** + + Create `~/.openhands/config.toml`: + ```toml + [plugins] + sources = [ + "/path/to/local/plugin", + "github:org/plugin-repo", + ] + ``` + + **Via command line:** + ```bash + openhands --plugin /path/to/plugin + openhands --plugin github:org/plugin-repo + ``` + + Plugins are loaded when OpenHands starts. + + + + Load plugins programmatically: + + ```python + from openhands.sdk import Conversation + from openhands.sdk.plugin import PluginSource + + plugins = [ + PluginSource(source="/path/to/plugin"), + PluginSource(source="github:org/repo", ref="v1.0.0"), + ] + + conversation = Conversation( + agent=agent, + plugins=plugins, + ) + ``` + + See [SDK Plugins Guide](/sdk/guides/plugins) for details. + + + + **Via UI:** + 1. Open Settings + 2. Navigate to Plugins section + 3. Add plugin path or GitHub URL + 4. Restart to load + + **Via file system:** + Place plugins in `.openhands/plugins/` in your workspace. + + + + **Via Cloud UI:** + 1. Navigate to Workspace Settings + 2. Select Plugins tab + 3. Browse plugin library or add custom plugin + 4. Click "Enable" to activate + + Organization admins can publish plugins for team-wide access. + + + +## Installing Plugins + +### From a Local Directory + +1. **Verify plugin structure**: + ```bash + ls plugin-dir/.plugin/plugin.json + ``` + +2. 
**Use the plugin path** in your configuration or command line + +### From GitHub + +Plugins can be loaded directly from GitHub repositories: + +``` +github:OpenHands/example-plugin +github:org/repo/path/to/plugin # For monorepos +github:org/repo#branch-name # Specific branch +github:org/repo#v1.0.0 # Specific tag +``` + +### Plugin Sources + + + + [github.com/OpenHands/extensions](https://github.com/OpenHands/extensions) + + Community-maintained plugins + + + + Your own GitHub repositories + + Organization or private plugins + + + +## Creating Plugins + +To create your own plugin: + +### 1. Plan Your Components + +Determine what your plugin needs: +- Which skills? +- What hooks for automation? +- Any MCP integrations? +- Custom commands? + +### 2. Create Directory Structure + +```bash +mkdir -p my-plugin/.plugin +mkdir -p my-plugin/skills +mkdir -p my-plugin/hooks +``` + +### 3. Create Plugin Metadata + +Create `my-plugin/.plugin/plugin.json`: +```json +{ + "name": "my-plugin", + "version": "0.1.0", + "description": "My custom plugin", + "author": "Your Name" +} +``` + +### 4. Add Components + +Add skills, hooks, and other components as needed: + +``` +my-plugin/ +├── .plugin/plugin.json +├── skills/ +│ └── my-skill/ +│ └── SKILL.md +└── hooks/ + └── hooks.json +``` + +### 5. Test Locally + +Load your plugin and verify all components work: + +```bash +openhands --plugin /path/to/my-plugin +``` + +### 6. 
Distribute + +Options for distribution: +- **GitHub repository**: Push to GitHub and share URL +- **File sharing**: Zip and share directory +- **Package registry**: Submit to official registry + +## Plugin Examples + + + + **Contains:** + - Python linting skill + - JavaScript linting skill + - Post-edit hooks for auto-linting + - Pre-commit setup + + **Use case:** Enforce code standards + + + + **Contains:** + - Kubernetes deployment skill + - Docker build skill + - CI/CD workflow skill + - kubectl MCP server + + **Use case:** Infrastructure management + + + + **Contains:** + - REST API client skill + - Authentication skill + - Rate limiting hooks + - API MCP server + + **Use case:** External service integration + + + + **Contains:** + - Unit testing skill + - Integration testing skill + - Post-code hooks for test runs + - Coverage commands + + **Use case:** Automated testing + + + +## Plugin Development Best Practices + + + + Begin by creating the core skills your plugin needs. Test them individually before bundling. + + + + Identify repetitive tasks and automate them with hooks. Example: run linters after file edits. + + + + Add MCP servers for external tool integration. This provides your skills with additional capabilities. + + + + Include a comprehensive README explaining: + - What the plugin does + - How to install it + - Configuration options + - Example usage + + + + Use semantic versioning (major.minor.patch) and document breaking changes. 
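The versioning guidance above can be enforced with a small check on the `version` field of `plugin.json`. This is a sketch under stated assumptions: `parse_version` and `is_breaking_change` are hypothetical helpers, only the plain `major.minor.patch` form is handled (no pre-release or build suffixes), and a breaking change is assumed to be signalled by a major version increase.

```python
import re

# major.minor.patch with non-negative integers and no leading zeros.
# Pre-release and build metadata are deliberately out of scope for this sketch.
SEMVER_CORE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_version(version):
    """Parse 'major.minor.patch' into a tuple of ints, or raise ValueError."""
    match = SEMVER_CORE.fullmatch(version)
    if match is None:
        raise ValueError(f"not a major.minor.patch version: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_breaking_change(old, new):
    """Return True when the major component increases from old to new."""
    return parse_version(new)[0] > parse_version(old)[0]


if __name__ == "__main__":
    for old, new in [("1.4.2", "1.5.0"), ("1.5.0", "2.0.0")]:
        label = "breaking" if is_breaking_change(old, new) else "compatible"
        print(f"{old} -> {new}: {label}")
```

A release script could call `is_breaking_change` with the previous and new versions and refuse to publish a major bump unless the changelog mentions the breaking change.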
+ + + +## Troubleshooting + + + + **Check:** + - `.plugin/plugin.json` exists and is valid JSON + - Plugin path is correct + - All referenced files exist + + **Debug:** + ```bash + # Verify structure + ls -la plugin-name/.plugin/plugin.json + + # Check JSON syntax + cat plugin-name/.plugin/plugin.json | python -m json.tool + ``` + + + + **Check:** + - Skills have valid SKILL.md files + - Frontmatter includes `triggers` + - Trigger keywords match your prompts + + **Test:** + Use explicit trigger keywords from the skill's frontmatter. + + + + **Check:** + - `hooks/hooks.json` syntax is valid + - Hook matchers target the right tools + - Commands are executable + + **Debug:** + Check logs for hook execution errors. + + + +## Next Steps + +- **[Learn about Skills](/overview/skills)** - Understand the core component of plugins +- **[Explore MCP](/overview/model-context-protocol)** - Add external tool integrations +- **[SDK Plugins Guide](/sdk/guides/plugins)** - Programmatic plugin usage +- **[Browse Examples](https://github.com/OpenHands/software-agent-sdk/tree/main/examples/05_skills_and_plugins/02_loading_plugins/example_plugins)** - See complete plugin structures + +## Further Reading + +For SDK developers: +- **[SDK Plugins Documentation](/sdk/guides/plugins)** - Detailed SDK integration +- **[Hooks Guide](/sdk/guides/hooks)** - Event handler details +- **[MCP Integration](/sdk/guides/mcp)** - External tool servers diff --git a/overview/skills.mdx b/overview/skills.mdx index b7f266cc..15484b5d 100644 --- a/overview/skills.mdx +++ b/overview/skills.mdx @@ -136,6 +136,10 @@ Each skill file may include frontmatter that provides additional information. 
In ## Learn More +- **To add existing skills**: See [Adding New Skills](/overview/skills/adding) +- **To create your own skills**: See [Creating New Skills](/overview/skills/creating) +- **To monitor skill performance**: See [Monitoring and Improving Skills](/overview/skills/monitoring) +- **For bundling multiple components**: See [Plugins](/overview/plugins) - **For SDK integration**: See [SDK Skills Guide](/sdk/guides/skill) - **For architecture details**: See [Skills Architecture](/sdk/arch/skill) - **For specific skill types**: See [Repository Skills](/overview/skills/repo), [Keyword Skills](/overview/skills/keyword), [Organization Skills](/overview/skills/org), and [Global Skills](/overview/skills/public) diff --git a/overview/skills/adding.mdx b/overview/skills/adding.mdx new file mode 100644 index 00000000..3874ccea --- /dev/null +++ b/overview/skills/adding.mdx @@ -0,0 +1,210 @@ +--- +title: Adding New Skills +description: Learn how to add existing skills to your OpenHands workspace from the official registry or custom repositories. +--- + +OpenHands makes it easy to extend your agent's capabilities by adding pre-built skills from the community or custom repositories. Skills can be added globally (available in all conversations) or to specific projects. + +## Using the Add-Skill Action + +The quickest way to add a skill is using the `/add-skill` command in your conversation with OpenHands. This command fetches skills from GitHub repositories and installs them in your workspace. + +### Basic Usage + +Simply provide a GitHub URL pointing to a skill: + +``` +/add-skill https://github.com/OpenHands/extensions/tree/main/skills/codereview +``` + +OpenHands will: +1. Parse the URL to identify the repository and skill path +2. Fetch the skill files from GitHub +3. Install the skill in `.agents/skills/` directory +4. Verify the installation +5. 
Make the skill immediately available + +### Supported URL Formats + +The `/add-skill` command accepts various GitHub URL formats: + +- Full GitHub tree URL: `https://github.com/OpenHands/extensions/tree/main/skills/codereview` +- Repository path: `https://github.com/OpenHands/extensions/skills/codereview` +- Short form: `github.com/OpenHands/extensions/skills/codereview` +- Shorthand: `OpenHands/extensions/skills/codereview` + +### Examples + +Add the code review skill: +``` +/add-skill https://github.com/OpenHands/extensions/tree/main/skills/codereview-roasted +``` + +Add the Kubernetes skill: +``` +/add-skill OpenHands/extensions/skills/kubernetes +``` + +Add a skill from a custom repository: +``` +/add-skill https://github.com/your-org/your-repo/tree/main/custom-skills/analytics +``` + +## Skill Storage Locations + +Skills are stored in different locations depending on the platform and scope: + + + + The CLI supports two skill locations: + + **User-level skills** (global, available in all conversations): + ``` + ~/.openhands/skills/ + ``` + + **Project-level skills** (specific to current directory): + ``` + .agents/skills/ + ``` + + Skills added via `/add-skill` are installed in `.agents/skills/` of your current workspace, making them available for that project. + + To add skills globally, manually place skill directories in `~/.openhands/skills/`. + + + + SDK users programmatically load skills: + + ```python + from openhands.sdk import Skill + + # Load from a directory + skill = Skill.load("/path/to/skill") + + # Load all skills from a directory + skills = Skill.load_all("/path/to/skills") + ``` + + See the [SDK Skills Guide](/sdk/guides/skill) for more details. + + + + Skills are stored in: + ``` + .agents/skills/ + ``` + + The GUI provides a visual interface for managing skills, but skills can also be added manually by placing them in this directory. + + + + OpenHands Cloud provides a centralized skill library accessible through the web interface. 
Skills can be: + - Added from the official registry with one click + - Imported from your connected repositories + - Shared across your team or organization + + See the [Cloud UI documentation](/openhands/usage/cloud/cloud-ui) for details. + + + +## Manual Installation + +You can also manually install skills by copying skill directories into the appropriate location. + +### For Project-Level Skills + +1. Create the skills directory if it doesn't exist: + ```bash + mkdir -p .agents/skills + ``` + +2. Copy or clone the skill directory: + ```bash + # Using git + git clone https://github.com/OpenHands/extensions temp-clone + cp -r temp-clone/skills/codereview .agents/skills/ + rm -rf temp-clone + + # Or download and extract manually + ``` + +3. Verify the skill structure: + ```bash + ls .agents/skills/codereview/SKILL.md + ``` + +### For User-Level Skills (CLI Only) + +1. Create the global skills directory: + ```bash + mkdir -p ~/.openhands/skills + ``` + +2. Add skills to this directory: + ```bash + cp -r /path/to/skill ~/.openhands/skills/ + ``` + +Skills in `~/.openhands/skills/` are available in all your conversations when using the CLI. + +## Verifying Installation + +After adding a skill, verify it's available: + +1. **Check the file exists**: The skill directory should contain at least a `SKILL.md` file + ```bash + ls .agents/skills/your-skill/SKILL.md + ``` + +2. **Test the trigger**: For keyword-triggered skills, use one of the trigger words in your prompt: + ``` + Help me set up kubernetes + ``` + +3. **Check skill loading**: OpenHands will indicate when a skill is loaded in response to your prompt + +## Skill Updates + +To update a skill to the latest version: + +1. **Remove the old version**: + ```bash + rm -rf .agents/skills/skill-name + ``` + +2. **Add the updated version**: + ``` + /add-skill https://github.com/OpenHands/extensions/tree/main/skills/skill-name + ``` + +Or manually pull updates if you cloned the skill repository. 
+ +## Authentication for Private Skills + +If you need to add skills from private repositories: + +1. **Set GITHUB_TOKEN**: Ensure the `GITHUB_TOKEN` environment variable is set with a token that has access to the private repository + +2. **Use the same `/add-skill` command**: The command will automatically use the token for authentication + +```bash +export GITHUB_TOKEN=your_github_token +``` + +Then use `/add-skill` as normal with private repository URLs. + +## Skill Conflicts + +If a skill with the same name already exists, OpenHands will warn you before overwriting. To resolve conflicts: + +1. **Rename the existing skill**: Move or rename the existing skill directory +2. **Choose a different installation location**: Install at user-level vs project-level +3. **Overwrite**: Confirm the overwrite when prompted + +## Next Steps + +- **[Browse available skills](https://github.com/OpenHands/extensions)** in the official registry +- **[Create your own skills](/overview/skills/creating)** for custom workflows +- **[Learn about keyword triggers](/overview/skills/keyword)** to make skills activate automatically +- **[Understand skill structure](/sdk/guides/skill)** for the AgentSkills format diff --git a/overview/skills/creating.mdx b/overview/skills/creating.mdx new file mode 100644 index 00000000..412ddf2f --- /dev/null +++ b/overview/skills/creating.mdx @@ -0,0 +1,540 @@ +--- +title: Creating New Skills +description: Learn how to create reusable skills instead of repeating prompts, with best practices for structure, triggers, and content organization. +--- + +Instead of repeating the same prompts or instructions in every conversation, create a skill that OpenHands can load automatically when needed. Skills transform one-time prompts into reusable, maintainable knowledge that improves over time. + +## Why Create Skills? 
+ +**Before (repeating yourself):** +``` +Please analyze this code using our company's Python style guide: +- Use black for formatting +- Max line length 88 +- Use type hints for all functions +- Follow PEP 8 naming conventions +... +``` + +**After (using a skill):** +``` +Review this Python code +``` + +The skill triggers automatically and applies all your style guidelines consistently. + +## When to Create a Skill + +Create a skill when you find yourself: + +- Repeating the same instructions across multiple conversations +- Working with domain-specific knowledge (company policies, API schemas, workflows) +- Using the same multi-step procedures repeatedly +- Needing consistent behavior for specific tools or frameworks +- Sharing best practices across a team + +## Quick Start + +### Automated Approach: Let OpenHands Help + +To create a skill with guided assistance, ask OpenHands to help you: + +``` +Create a skill for [your use case] +``` + +or simply: + +``` +Write a new skill +``` + +The `skill-creator` skill will guide you through an interactive process: +- Asks questions about your use cases and requirements +- Suggests appropriate skill structure (references, scripts, assets) +- Helps you write effective trigger keywords and descriptions +- Ensures you follow best practices automatically +- Creates the complete skill structure for you + +This is the recommended approach, especially when you're starting out. + +### Manual Approach + +If you prefer to create the skill structure manually: + +1. **Create the skill directory**: + ```bash + mkdir -p .agents/skills/my-skill + ``` + +2. **Create the SKILL.md file**: + ```bash + touch .agents/skills/my-skill/SKILL.md + ``` + +3. **Add content** (see structure and guidelines below) + +4. **Test it** by using a trigger keyword in your prompt + +## Determining Scope + +Before writing your skill, define its scope clearly: + +### Ask These Questions + +1. 
**What specific task does this skill handle?** + - ❌ Too broad: "Help with coding" + - ✅ Focused: "Lint Python code using ruff with our company rules" + +2. **What knowledge is required?** + - Code style guidelines + - API documentation + - Domain-specific schemas + - Multi-step procedures + +3. **What resources are needed?** + - Scripts for deterministic tasks + - Reference documents for detailed information + - Asset files for templates or boilerplate + +4. **Who will use this skill?** + - Just you (keep it simple) + - Your team (add more documentation) + - Public sharing (comprehensive examples) + +### Scope Examples + +**Good scope (focused):** +- "Configure pre-commit hooks for Python projects" +- "Generate financial reports using our SQL schema" +- "Deploy to our Kubernetes staging environment" + +**Poor scope (too broad):** +- "Help with Python" +- "Work with databases" +- "Deploy applications" + +## Choosing Name and Triggers + +The skill name and trigger keywords determine when OpenHands loads your skill. + +### Naming Your Skill + +Choose a clear, descriptive name: + +- **Use lowercase with hyphens**: `python-linting`, `k8s-deploy`, `api-docs` +- **Be specific**: `ruff-linter` not just `linter` +- **Match common terms**: Use vocabulary your users know + +### Defining Triggers + +Triggers are keywords that automatically activate your skill. Choose words users naturally say when they need this skill. + + + + List specific words or phrases that should activate the skill: + + ```yaml + --- + name: python-linting + description: This skill should be used when the user asks to "lint Python code", "check Python style", "run ruff", or mentions Python code quality. 
+ triggers: + - lint + - linting + - ruff + - code quality + --- + ``` + + **Best practices:** + - Include 2-5 trigger keywords + - Use terms users actually say + - Include tool names (e.g., "ruff", "pytest") + - Include action words (e.g., "lint", "test", "deploy") + + + + The skill description is crucial for trigger matching. Write it in third person and include specific phrases: + + ```yaml + description: This skill should be used when the user asks to "deploy to Kubernetes", "apply K8s manifests", "check pod status", or mentions kubectl commands. Provides comprehensive Kubernetes deployment workflows. + ``` + + **Key elements:** + - Start with "This skill should be used when..." + - Quote specific user phrases: "deploy to Kubernetes" + - List concrete scenarios + - Mention related tools or frameworks + + + +### Examples of Good Triggers + +```yaml +# API integration skill +triggers: +- stripe +- payment +- checkout +``` + +```yaml +# Database skill +triggers: +- bigquery +- sql query +- data warehouse +``` + +```yaml +# Deployment skill +triggers: +- deploy +- kubernetes +- k8s +- kubectl +``` + +## Defining the Skill Body + +The skill body contains the instructions OpenHands will follow. Write in imperative form (command form) rather than second person. + +### Basic Structure + +```markdown +--- +name: skill-name +description: This skill should be used when... +triggers: +- keyword1 +- keyword2 +--- + +# Skill Title + +Brief overview of what this skill does. + +## Core Instructions + +Main procedures and guidelines. + +## Common Patterns + +Typical use cases and solutions. + +## Additional Resources + +(Optional) References to bundled files. 
+``` + +### Writing Style + +**Use imperative/infinitive form:** +✅ "Check the configuration file" +✅ "Validate input before processing" +✅ "Run tests after deployment" + +**Avoid second person:** +❌ "You should check the configuration" +❌ "You need to validate input" +❌ "You must run tests" + +### Keep It Focused + +**SKILL.md content:** +- Core concepts and workflows (1,500-2,000 words ideal) +- Essential procedures +- Quick reference information +- Pointers to additional resources + +**What NOT to include:** +- Exhaustive API documentation (use `references/` instead) +- Detailed edge cases (use `references/` instead) +- Long examples (use `references/` instead) + +## Best Practices and Tips + +### Use Numbered Step Workflows + +For multi-step procedures, use numbered lists: + +```markdown +## Deployment Workflow + +1. **Validate the configuration**: + \`\`\`bash + kubectl apply --dry-run=client -f deployment.yaml + \`\`\` + +2. **Apply to staging**: + \`\`\`bash + kubectl apply -f deployment.yaml -n staging + \`\`\` + +3. **Verify pod status**: + \`\`\`bash + kubectl get pods -n staging --watch + \`\`\` + +4. 
**Check logs**: + \`\`\`bash + kubectl logs -f deployment/app-name -n staging + \`\`\` +``` + +**Benefits:** +- Clear sequence for complex workflows +- Easy to follow and verify +- Reduces errors from skipped steps + +### Add Large Files as References + +Keep SKILL.md lean by moving detailed content to `references/`: + +``` +my-skill/ +├── SKILL.md # Core instructions (< 3,000 words) +└── references/ + ├── api-docs.md # Detailed API reference + ├── examples.md # Comprehensive examples + └── troubleshooting.md # Edge cases and fixes +``` + +**In SKILL.md, reference these files:** + +```markdown +## Additional Resources + +For detailed information, see: +- **`references/api-docs.md`** - Complete API documentation +- **`references/examples.md`** - Working code examples +- **`references/troubleshooting.md`** - Common issues and solutions +``` + +**Benefits:** +- Keeps context window smaller when skill loads +- OpenHands reads references only when needed +- Easier to maintain and update specific sections + +### Create Scripts for Predictable Steps + +For tasks that are repeatedly rewritten or need deterministic behavior, create executable scripts: + +``` +my-skill/ +├── SKILL.md +└── scripts/ + ├── validate_config.py + ├── deploy.sh + └── rollback.sh +``` + +**When to use scripts:** +- Same code being rewritten repeatedly +- Deterministic reliability required +- Complex parsing or validation +- Multi-step automation + +**Reference scripts in SKILL.md:** + +```markdown +## Validation + +Run the validation script: + +\`\`\`bash +python3 scripts/validate_config.py config.yaml +\`\`\` + +This checks: +- YAML syntax +- Required fields +- Value constraints +``` + +**Benefits:** +- Token efficient (scripts can run without being read) +- Deterministic behavior +- Reusable across projects +- Can be versioned and tested + +### Include Quick Reference Tables + +Use tables for configuration options, command flags, or status codes: + +```markdown +## Configuration Options + +| Option | 
Default | Description | +|--------|---------|-------------| +| `timeout` | 30s | Maximum wait time | +| `retries` | 3 | Number of retry attempts | +| `env` | production | Target environment | +``` + +### Provide Concrete Examples + +Show real examples, not abstract descriptions: + +```markdown +## Example Usage + +Deploy the web application: + +\`\`\`bash +# Build the image +docker build -t myapp:v1.0 . + +# Push to registry +docker push registry.example.com/myapp:v1.0 + +# Update Kubernetes deployment +kubectl set image deployment/web web=registry.example.com/myapp:v1.0 +\`\`\` +``` + +### Use Progressive Disclosure + +Structure information from simple to complex: + +1. **SKILL.md**: Essential workflows and core concepts +2. **references/**: Detailed patterns, advanced techniques, edge cases +3. **scripts/**: Automation for predictable tasks +4. **assets/**: Templates and boilerplate files + +## Complete Example + +Here's a complete skill for Python code review: + +``` +python-review/ +├── SKILL.md +├── references/ +│ ├── style-guide.md +│ └── common-issues.md +└── scripts/ + └── run-checks.sh +``` + +**SKILL.md:** +```markdown +--- +name: python-review +description: This skill should be used when the user asks to "review Python code", "check Python style", "lint Python", or requests code quality analysis. Provides comprehensive Python code review workflows. +triggers: +- python review +- code review +- lint python +- black +- ruff +--- + +# Python Code Review + +Review Python code using company standards and best practices. + +## Review Workflow + +1. **Run automated checks**: + \`\`\`bash + scripts/run-checks.sh + \`\`\` + +2. **Review linter output** for: + - Style violations (Black, Ruff) + - Type errors (mypy) + - Security issues (bandit) + +3. **Check code structure**: + - Function length (< 50 lines) + - Complexity (< 10 cyclomatic) + - Naming conventions + +4. 
**Verify tests**: + \`\`\`bash + pytest tests/ --cov=src --cov-report=term + \`\`\` + +## Style Guidelines + +- **Formatting**: Black with 88-character line limit +- **Linting**: Ruff with company config +- **Types**: Full type hints for public APIs +- **Docstrings**: Google style for all public functions + +## Additional Resources + +- **`references/style-guide.md`** - Complete style guide +- **`references/common-issues.md`** - Common mistakes and fixes +- **`scripts/run-checks.sh`** - Automated quality checks +``` + +## Testing Your Skill + +After creating your skill: + +1. **Verify structure**: + ```bash + ls .agents/skills/your-skill/SKILL.md + ``` + +2. **Check frontmatter**: Ensure YAML is valid with `name`, `description`, and `triggers` + +3. **Test trigger keywords**: Use a trigger word in a prompt: + ``` + Help me lint this Python code + ``` + +4. **Verify loading**: OpenHands should indicate the skill was loaded + +5. **Iterate**: Improve based on actual usage + + +For production deployments, see [Monitoring and Improving Skills](/overview/skills/monitoring) to track performance using logging, evaluation metrics, dashboarding, and automated feedback aggregation. 
+ + +## Common Mistakes to Avoid + + +**Mistake 1: Vague triggers** +❌ `description: Helps with Python` +✅ `description: This skill should be used when the user asks to "lint Python code", "run black", or mentions Python code quality` + + + +**Mistake 2: Everything in SKILL.md** +❌ Single 10,000-word SKILL.md +✅ Focused SKILL.md (2,000 words) + references/ for details + + + +**Mistake 3: Using "you" in instructions** +❌ "You should validate the config" +✅ "Validate the config" + + + +**Mistake 4: Missing examples** +❌ Abstract descriptions only +✅ Concrete examples with actual commands + + +## Next Steps + +- **[Add your skill](/overview/skills/adding)** to your workspace +- **[Monitor skill performance](/overview/skills/monitoring)** in production +- **[Share skills](https://github.com/OpenHands/extensions)** with the community +- **[Learn the AgentSkills format](/sdk/guides/skill)** for advanced features +- **[Explore example skills](https://github.com/OpenHands/extensions)** for inspiration + +## Further Reading + +For advanced skill creation techniques and SDK integration: +- **[Monitoring Skills](/overview/skills/monitoring)** - Track performance and improve skills in production +- **[Plugins](/overview/plugins)** - Bundle multiple skills with hooks and MCP config +- **[SDK Skills Guide](/sdk/guides/skill)** - Programmatic skill creation +- **[Observability & Tracing](/sdk/guides/observability)** - OpenTelemetry configuration details +- **[GitHub Workflows](/sdk/guides/github-workflows/pr-review)** - Automate skills in CI/CD pipelines +- **[Skills Architecture](/sdk/arch/skill)** - Technical details +- **[Official Skill Registry](https://github.com/OpenHands/extensions)** - Community examples diff --git a/overview/skills/keyword.mdx b/overview/skills/keyword.mdx index 6526c467..dd829cd3 100644 --- a/overview/skills/keyword.mdx +++ b/overview/skills/keyword.mdx @@ -21,15 +21,31 @@ Enclose the frontmatter in triple dashes (---) and include the following fields: 
## Example -Keyword-triggered skill file example located at `.agents/skills/yummy.md`: -``` +Here's a simplified example of the `github` skill located at `.agents/skills/github/SKILL.md`: + +```markdown --- +name: github +description: Interact with GitHub repositories, pull requests, issues, and workflows using the GITHUB_TOKEN environment variable and GitHub CLI. Use when working with code hosted on GitHub or managing GitHub resources. triggers: -- yummyhappy -- happyyummy +- github +- git --- -The user has said the magic word. Respond with "That was delicious!" +You have access to an environment variable, `GITHUB_TOKEN`, which allows you to interact with +the GitHub API. + + +You can use `curl` with the `GITHUB_TOKEN` to interact with GitHub's API. +ALWAYS use the GitHub API for operations instead of a web browser. +ALWAYS use the `create_pr` tool to open a pull request. + + +... (additional GitHub-specific instructions) ``` -[See examples of keyword-triggered skills in the official OpenHands Skills Registry](https://github.com/OpenHands/extensions) + +**Context Management with Platform Skills**: OpenHands includes specialized skills for platforms like GitHub and GitLab that are only triggered when needed (e.g., when you mention "github" or "gitlab" in your prompt). This keeps your context clean and focused, loading platform-specific guidance only when working with those services. + + +[See more examples of keyword-triggered skills in the official OpenHands Skills Registry](https://github.com/OpenHands/extensions) diff --git a/overview/skills/monitoring.mdx b/overview/skills/monitoring.mdx new file mode 100644 index 00000000..cf48497d --- /dev/null +++ b/overview/skills/monitoring.mdx @@ -0,0 +1,169 @@ +--- +title: Monitoring and Improving Skills +description: Monitor skill performance in production using logging, evaluation metrics, dashboarding, and automated feedback aggregation. 
+--- + +After creating and deploying a skill, monitor its performance to ensure it works correctly in production. This is particularly important for skills used in automated workflows like CI/CD pipelines. + +## The Monitoring Workflow + +Production skill monitoring follows a four-part process: + +1. **Logging** - Record agent behavior during skill execution +2. **Evaluating** - Measure performance using relevant metrics +3. **Dashboarding** - Visualize metrics over time +4. **Aggregating** - Use feedback to improve the skill + +## Logging Agent Behavior + +OpenHands includes OpenTelemetry-compatible instrumentation via the [Laminar](https://github.com/lmnr-ai/lmnr) library. Set up logging to capture agent traces during skill execution. + +### For SDK Users + +Set the `LMNR_PROJECT_API_KEY` environment variable to send traces to Laminar, or configure any OpenTelemetry-compatible backend: + +```bash +export LMNR_PROJECT_API_KEY="your-api-key" +``` + +See the [SDK Observability Guide](/sdk/guides/observability) for detailed configuration options including Honeycomb, Jaeger, Datadog, and other OTLP-compatible backends. + +### For GitHub Actions + +When using skills in GitHub workflows, add the API key to your action configuration. See the [PR review action example](https://github.com/OpenHands/extensions/blob/main/plugins/pr-review/action.yml) for reference. + +## Evaluating Performance + +Define metrics that reflect whether your skill is working correctly. Effective metrics measure actual outcomes rather than intermediate steps. + +### Example: PR Review Skill + +For a code review skill, measure suggestion acceptance rate: + +``` +suggestion_accuracy = ai_suggestions_reflected / ai_suggestions +``` + +Track: +- Number of suggestions made by the agent +- Number of suggestions incorporated by developers + +### Implementation Approach + +1. **Create an evaluation workflow** - Run after the main task completes (e.g., after PR merge) +2. 
**Collect relevant data** - Agent output, human responses, final results +3. **Use LLM as judge** - Feed data into a prompt that calculates metrics + +Example evaluation prompt excerpt: + +``` +### ai_suggestions +Count items where the body contains an actionable code suggestion +(look for code blocks, "suggestion:", specific changes to make). +Do NOT count general praise or approval-only comments. + +### ai_suggestions_reflected +Count suggestions that were incorporated. A suggestion is "reflected" if: +1. A human response indicates the suggestion was implemented, OR +2. The suggestion appears in the final diff +``` + +See the [evaluation action example](https://github.com/OpenHands/extensions/blob/main/.github/workflows/pr-review-evaluation.yml) for a complete implementation. + +## Dashboarding Metrics + +Visualize metrics over time to identify trends. With Laminar or similar platforms, create SQL queries that aggregate evaluation results. + +Track: +- Metric trends (improving or degrading) +- Performance across different contexts (repos, file types, etc.) +- Comparison between prompt variations or models + +## Aggregating Feedback for Improvement + +Use language models to analyze patterns in evaluation results and suggest skill improvements. + +### Process + +1. **Collect evaluation data** - Aggregate analyses from recent runs +2. **Provide current skill content** - Include the existing SKILL.md +3. **Use a reasoning model** - Feed both into a long-context model (Gemini-2-Pro, Claude 3.5 Sonnet, etc.) +4. **Extract actionable suggestions** - Review model output for concrete improvements + +### Example Output + +Example output from aggregation: + +``` +### Issue: Context-Unaware Suggestions +The agent suggests technically correct changes that conflict with +repository conventions (e.g., suggesting integration tests when the +repo uses mocks). 
+ +Frequency: ~15% of suggestions +Recommendation: Add repo-specific testing philosophy to references/ +``` + +## Deployment in Automated Workflows + +Skills can run automatically in CI/CD pipelines. The [OpenHands Extensions repository](https://github.com/OpenHands/extensions/tree/main/plugins) includes example GitHub Actions for common automation patterns. + +### Common Automation Use Cases + +- **PR review** - Run code review skills when PRs are marked "ready for review" +- **Issue triage** - Classify and label new issues +- **Code generation** - Generate boilerplate or documentation +- **Security scanning** - Check for vulnerabilities and suggest fixes + +See the [GitHub Workflows guide](/sdk/guides/github-workflows/pr-review) for SDK-based automation examples. + +## Best Practices + + + Select metrics that reflect real-world outcomes, not just intermediate steps. + + **Good metrics:** + - Suggestion acceptance rate (for code review) + - Issue classification accuracy (for triage) + - Time to resolution (for bug fixing) + + **Poor metrics:** + - Number of suggestions made + - Lines of code generated + - Tokens consumed + + + + Begin with basic logging before implementing complex evaluation pipelines. + + 1. Set up OpenTelemetry logging + 2. Review traces manually to understand agent behavior + 3. Identify patterns in successes and failures + 4. Design metrics based on observed patterns + 5. 
Automate evaluation + + + + Use evaluation results to make targeted improvements: + + - Low accuracy → Review skill instructions for clarity + - Inconsistent behavior → Add more specific examples + - Context errors → Expand references/ with domain knowledge + - Repetitive failures → Create scripts for deterministic tasks + + + + Track performance across different contexts: + + - **By repository** - Different repos may need different approaches + - **By file type** - Skills may work better on certain languages + - **By time** - Identify degradation or improvement trends + - **By model** - Compare different LLM backends + + +## Further Reading + +- **[SDK Observability Guide](/sdk/guides/observability)** - Detailed OpenTelemetry configuration +- **[GitHub Workflows](/sdk/guides/github-workflows/pr-review)** - Automate skills in CI/CD +- **[Hooks Guide](/sdk/guides/hooks)** - Event-driven skill execution +- **[Creating Skills](/overview/skills/creating)** - Skill creation fundamentals
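
The suggestion acceptance rate from the "Evaluating Performance" section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ReviewCounts` and `suggestion_accuracy` are hypothetical names, not part of OpenHands or Laminar, and the counts are assumed to have already been produced by an evaluation step such as the LLM-as-judge prompt.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReviewCounts:
    """Counts for one evaluated PR (hypothetical structure, not an OpenHands API)."""
    ai_suggestions: int            # actionable suggestions made by the agent
    ai_suggestions_reflected: int  # suggestions incorporated by developers


def suggestion_accuracy(runs: List[ReviewCounts]) -> Optional[float]:
    """Return reflected / total suggestions across runs, or None if no suggestions were made."""
    total = sum(r.ai_suggestions for r in runs)
    reflected = sum(r.ai_suggestions_reflected for r in runs)
    if total == 0:
        # Avoid dividing by zero when the agent made no actionable suggestions.
        return None
    return reflected / total


# Illustrative counts only; these are not measured results.
runs = [ReviewCounts(4, 3), ReviewCounts(6, 2), ReviewCounts(0, 0)]
print(suggestion_accuracy(runs))
```

Pooling counts across runs before dividing, rather than averaging per-PR ratios, keeps PRs with zero suggestions from skewing the metric; either choice is reasonable as long as it is applied consistently when tracking trends.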