feat: add github-manager Agentkit#100
# Lamatic.ai GitHub Documentation & Issue Manager
## What This Kit Does
Managing open-source repositories often leads to a "context gap" where the AI lacks the latest documentation, and maintainers are overwhelmed by manual issue triaging.
This kit automates the lifecycle of repository management by:
1. **Syncing Knowledge:** Automatically scraping external documentation sites, chunking the text, and converting it into vector embeddings for RAG (Retrieval-Augmented Generation).
2. **Intelligent Triaging:** Using a Webhook-driven classifier that analyzes incoming GitHub issues against the vectorized documentation to automatically apply accurate labels (e.g., `bug`, `documentation`, `feature-request`).
## Providers & Prerequisites
- **Lamatic.ai:** For workflow orchestration and managed vector storage.
- **OpenAI / Anthropic:** For embedding generation (`text-embedding-3-small`) and LLM classification (`gpt-4o` or `claude-3`).
- **GitHub PAT:** A Personal Access Token with `repo` permissions to allow the bot to apply labels.
- **Vector Database:** (Optional) Pinecone or Milvus if not using Lamatic's internal storage.
### Flow Logic Description:
This project utilizes two interconnected flows within Lamatic.ai:
* **The Ingestion Flow:** Uses a **Scraper Node** to crawl a specified URL. The output is passed to a **Processor Node** for recursive character chunking. These chunks are sent to an **Embed Node** and stored in the **Vector DB index**. This ensures the "Brain" of your assistant is always updated with the latest docs.
* **The Action Flow:** Triggered by a **Webhook Node** (GitHub `issues` event). It extracts the issue body, performs a **Vector Search** to find relevant doc snippets, and passes both to an **LLM Classifier Node**. The final **API Node** sends a PATCH request to GitHub's `/issues/{number}/labels` endpoint to tag the issue.
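The final labeling step of the Action Flow can be sketched outside Lamatic as a plain HTTP request. This is a minimal illustration, not the kit's API Node configuration: the helper name `build_label_request` and its parameters are hypothetical. GitHub's REST API documents `POST` for adding labels to an issue and `PUT` for replacing the full label set on this endpoint, so the sketch uses `POST`.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_label_request(repo_full_name, issue_number, labels, token):
    """Build (but do not send) a request that adds labels to an issue.

    Hypothetical helper for illustration; the kit itself issues this call
    from a Lamatic API Node.
    """
    url = f"{GITHUB_API}/repos/{repo_full_name}/issues/{issue_number}/labels"
    body = json.dumps({"labels": list(labels)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",  # POST adds labels; PUT would replace the whole set
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the token needs permission to write issues on the target repository.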
## Lamatic Flow
Flow IDs: `e5cbca46-cc72-43e6-a24b-17814b75e3b3` and `43920aae-9cc1-4e65-b28e-82597bf00661`
📝 Walkthrough
This PR introduces a new Lamatic.ai GitHub automation bundle with two workflows: a documentation scraper that vectorizes and indexes content, and a GitHub issue classifier that analyzes issues with LLM assistance and applies labels, including configuration files, input schemas, metadata, and documentation.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 10
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d1a0f00a-ea16-4564-96ee-28cf93676ae4
📒 Files selected for processing (10)
- bundles/github-manager/README.md
- bundles/github-manager/config.json
- bundles/github-manager/flows/Docs ingestion/README.md
- bundles/github-manager/flows/Docs ingestion/config.json
- bundles/github-manager/flows/Docs ingestion/inputs.json
- bundles/github-manager/flows/Docs ingestion/meta.json
- bundles/github-manager/flows/classifier/README.md
- bundles/github-manager/flows/classifier/config.json
- bundles/github-manager/flows/classifier/inputs.json
- bundles/github-manager/flows/classifier/meta.json
  "id": "docs scraping",
  "type": "mandatory"
Step ID likely does not match flow reference convention.
"docs scraping" is inconsistent with common bundle step/flow ID conventions (typically exact, stable IDs, usually kebab-case). Align this step ID with the actual flow reference used by the bundle (e.g., docs-ingestion) and update prerequisite references accordingly.
Suggested fix
{
- "id": "docs scraping",
+ "id": "docs-ingestion",
"type": "mandatory"
},
As per coding guidelines, Bundle config.json must use the same format as kit config with correct flow references and step definitions.
  "prerequisiteSteps": [
    "data_source"
  ]
Invalid prerequisite reference breaks step dependency resolution.
"classifier" depends on "data_source", but no step with id: "data_source" is defined in this bundle config.
Suggested fix
{
"id": "classifier",
"type": "mandatory",
"prerequisiteSteps": [
- "data_source"
+ "docs-ingestion"
]
}
As per coding guidelines, Bundle config.json must use the same format as kit config with correct flow references and step definitions.
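A dependency problem like this one can be caught before review with a small check. The sketch below assumes the bundle config exposes its steps as a list of objects with an `id` and an optional `prerequisiteSteps` array, as in the snippets above; the function name and the sample data are illustrative only.

```python
def find_missing_prerequisites(steps):
    """Map each step id to the prerequisite ids that no step defines."""
    defined = {step["id"] for step in steps}
    missing = {}
    for step in steps:
        unknown = [
            prereq
            for prereq in step.get("prerequisiteSteps", [])
            if prereq not in defined
        ]
        if unknown:
            missing[step["id"]] = unknown
    return missing


# Sample data mirroring the situation described in this comment.
steps = [
    {"id": "docs-ingestion", "type": "mandatory"},
    {"id": "classifier", "type": "mandatory", "prerequisiteSteps": ["data_source"]},
]
print(find_missing_prerequisites(steps))  # {'classifier': ['data_source']}
```

An empty result means every prerequisite resolves to a defined step.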
{
  "label": "Condition 1",
  "value": "conditionNode_731-addNode_120",
  "condition": "{\n \"operator\": null,\n \"operands\": [\n {\n \"name\": \"{{RAGNode_463.output.modelResponse}}\",\n \"operator\": \"ilike\",\n \"value\": \"DOCS\"\n }\n ]\n}"
},
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for LLMNode output field patterns in other flow configs
rg -n "LLMNode.*output\." --glob "*.json" | head -30
Repository: Lamatic/AgentKit
Length of output: 5482
🏁 Script executed:
# First, let's examine the classifier config to see all nodes
cat bundles/github-manager/flows/classifier/config.json | head -200
Repository: Lamatic/AgentKit
Length of output: 6504
🏁 Script executed:
# Check if RAGNode_463 exists anywhere in this file
grep -n "RAGNode_463" bundles/github-manager/flows/classifier/config.json
Repository: Lamatic/AgentKit
Length of output: 275
🏁 Script executed:
# List all node IDs in the classifier config
grep -oP '"id":\s*"[^"]*"' bundles/github-manager/flows/classifier/config.json | sort | uniq
Repository: Lamatic/AgentKit
Length of output: 713
🏁 Script executed:
# Search for any RAG node references in the entire bundle
rg -n "RAGNode" bundles/github-manager/
Repository: Lamatic/AgentKit
Length of output: 327
🏁 Script executed:
# Check what the condition actually looks like and what nodes exist around it
sed -n '70,85p' bundles/github-manager/flows/classifier/config.json
Repository: Lamatic/AgentKit
Length of output: 662
Critical: Condition references non-existent node RAGNode_463 and uses incorrect output field.
The condition checks {{RAGNode_463.output.modelResponse}} but:
- No node with ID `RAGNode_463` exists in this flow
- The output field should be `generatedResponse`, not `modelResponse`
The LLM classifier node is LLMNode_400, and all LLM nodes in the codebase use .output.generatedResponse. This will cause the condition to fail at runtime.
🐛 Proposed fix
{
"label": "Condition 1",
"value": "conditionNode_731-addNode_120",
- "condition": "{\n \"operator\": null,\n \"operands\": [\n {\n \"name\": \"{{RAGNode_463.output.modelResponse}}\",\n \"operator\": \"ilike\",\n \"value\": \"DOCS\"\n }\n ]\n}"
+ "condition": "{\n \"operator\": null,\n \"operands\": [\n {\n \"name\": \"{{LLMNode_400.output.generatedResponse}}\",\n \"operator\": \"ilike\",\n \"value\": \"DOCS\"\n }\n ]\n}"
},
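Dangling references of this kind can be detected mechanically. The sketch below scans raw config text for `{{nodeId.output.…}}` placeholders and reports any node ID that is not in a supplied set of defined IDs. The regular expression and the function name are assumptions based on the placeholder syntax visible in this config, not a documented Lamatic grammar.

```python
import re

# Matches the node id in placeholders such as {{LLMNode_400.output.generatedResponse}}.
REFERENCE = re.compile(r"\{\{\s*([A-Za-z0-9_]+)\.output\.")


def find_unknown_node_references(config_text, node_ids):
    """Return sorted node ids referenced in placeholders but not defined."""
    referenced = set(REFERENCE.findall(config_text))
    return sorted(referenced - set(node_ids))


sample = (
    '"name": "{{RAGNode_463.output.modelResponse}}", '
    '"url": "https://api.github.com/repos/{{triggerNode_1.output.repo_full_name}}"'
)
print(find_unknown_node_references(sample, {"triggerNode_1", "LLMNode_400"}))
# ['RAGNode_463']
```

The check only validates node IDs; whether a field such as `generatedResponse` exists on a given node type would still need to be confirmed against the platform.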
  "nodeId": "apiNode",
  "values": {
    "id": "apiNode_451",
    "url": "https://api.github.com/repos/{{triggerNode_1.output.repo_full_name}}/issues/{{triggerNode_1.output.issue_number}}/labels ",
Trailing whitespace in GitHub API URLs.
Lines 182, 212, and 242 have trailing spaces in the URL values. This could cause HTTP request failures or unexpected behavior when constructing the API endpoint.
🐛 Proposed fix for all three occurrences
- "url": "https://api.github.com/repos/{{triggerNode_1.output.repo_full_name}}/issues/{{triggerNode_1.output.issue_number}}/labels ",
+ "url": "https://api.github.com/repos/{{triggerNode_1.output.repo_full_name}}/issues/{{triggerNode_1.output.issue_number}}/labels",
- "url": "https://api.github.com/repos/{{triggerNode_1.output.repo_full_name}}/issues/{{triggerNode_1.output.issue_number}}/comments ",
+ "url": "https://api.github.com/repos/{{triggerNode_1.output.repo_full_name}}/issues/{{triggerNode_1.output.issue_number}}/comments",
Also applies to: 212-212, 242-242
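Stray whitespace is easy to miss by eye, so a recursive scan of the parsed config can help. This is a minimal sketch with a hypothetical function name: it walks any JSON-like structure and yields the path and value of every string that has leading or trailing whitespace.

```python
def find_padded_strings(value, path="$"):
    """Yield (path, string) for each string with leading/trailing whitespace."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from find_padded_strings(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from find_padded_strings(child, f"{path}[{index}]")
    elif isinstance(value, str) and value != value.strip():
        yield path, value


# Illustrative fragment; the real config would be loaded with json.load().
config = {
    "nodes": [
        {"values": {"url": "https://api.github.com/x/labels ", "nodeName": "API"}}
    ]
}
print(list(find_padded_strings(config)))
```

The scan flags every padded string, including multi-line values that legitimately end in a newline, so results are best reviewed rather than stripped automatically.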
  "retries": "0",
  "nodeName": "API",
  "retry_deplay": "0",
  "convertXmlResponseToJson": false
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check for retry_delay vs retry_deplay usage in codebase
rg -n "retry_del" --glob "*.json"
Repository: Lamatic/AgentKit
Length of output: 1693
🏁 Script executed:
# First, check if the file exists and read the relevant sections
if [ -f "bundles/github-manager/flows/classifier/config.json" ]; then
echo "=== Lines 186-189 ==="
sed -n '186,189p' bundles/github-manager/flows/classifier/config.json
echo ""
echo "=== Lines 216-219 ==="
sed -n '216,219p' bundles/github-manager/flows/classifier/config.json
echo ""
echo "=== Lines 246-249 ==="
sed -n '246,249p' bundles/github-manager/flows/classifier/config.json
echo ""
echo "=== Search for retry_deplay in this file ==="
grep -n "retry_deplay" bundles/github-manager/flows/classifier/config.json || echo "No matches found"
echo ""
echo "=== Search for retry_delay in this file ==="
grep -n "retry_delay" bundles/github-manager/flows/classifier/config.json || echo "No matches found"
else
echo "File not found"
fi
Repository: Lamatic/AgentKit
Length of output: 730
Fix typo: retry_deplay should be retry_delay
The field name retry_deplay appears in all three API nodes (lines 188, 218, 248). All other config files in the codebase use the correct spelling retry_delay. This typo causes configuration inconsistency with kit standards.
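Misspelled keys such as this one can be flagged by comparing each key against a list of expected names. The sketch below uses `difflib.get_close_matches` from the Python standard library. The `KNOWN_KEYS` set is an assumption assembled from key names mentioned in this review, not an authoritative schema for Lamatic nodes.

```python
import difflib

# Assumed key names, taken from this review; not an official schema.
KNOWN_KEYS = {
    "id",
    "url",
    "retries",
    "retry_delay",
    "nodeName",
    "responseType",
    "convertXmlResponseToJson",
}


def suggest_key_fixes(values, known_keys=KNOWN_KEYS):
    """Map each unknown key to its closest known key, if one is similar."""
    candidates = sorted(known_keys)
    suggestions = {}
    for key in values:
        if key in known_keys:
            continue
        close = difflib.get_close_matches(key, candidates, n=1, cutoff=0.8)
        if close:
            suggestions[key] = close[0]
    return suggestions


node_values = {"retries": "0", "nodeName": "API", "retry_deplay": "0"}
print(suggest_key_fixes(node_values))  # {'retry_deplay': 'retry_delay'}
```

Keys with no close match are ignored, so genuinely new fields do not produce false suggestions.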
2. **Prepare Your Submission**
   - Create a new folder with a descriptive name for your flow
   - Add all files from this package (`config.json`, `inputs.json`, `meta.json`)
“All files” list should include README.md.
The package file list is missing README.md while claiming completeness.
Suggested fix
- - Add all files from this package (`config.json`, `inputs.json`, `meta.json`)
+ - Add all files from this package (`config.json`, `inputs.json`, `meta.json`, `README.md`)
Based on learnings, Flows exported in flows/ directory must contain all four required files: config.json, inputs.json, meta.json, and README.md.
"values": {
  "id": "triggerNode_1",
  "nodeName": "API Request",
  "responeType": "realtime",
  "advance_schema": "{\n \"url\": \"string\",\n \"filename\": \"string\"\n}"
},
Typo: responeType should be responseType.
This typo may cause the API trigger to use default response behavior instead of the intended "realtime" mode.
🐛 Proposed fix
"values": {
"id": "triggerNode_1",
"nodeName": "API Request",
- "responeType": "realtime",
+ "responseType": "realtime",
"advance_schema": "{\n \"url\": \"string\",\n \"filename\": \"string\"\n}"
},
"values": {
  "id": "codeNode_158",
  "code": "let docs = {{scraperNode_823.output.markdown}};\n\n// Convert to array if it isn't one already\nlet docsArray = Array.isArray(docs) ? docs : [docs];\n\n// Map safely, providing a fallback for pageContent if the doc is just a string\nreturn docsArray.map((doc) => (typeof doc === 'object' ? doc.pageContent : doc));",
  "nodeName": "Extract Chunks"
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for chunkNode output patterns in existing flows
rg -n "chunkNode.*output" --glob "*.json" | head -20Repository: Lamatic/AgentKit
Length of output: 4469
🏁 Script executed:
# Find the exact file path and check its contents around the referenced lines
find . -type f -name "config.json" -path "*/github-manager/flows/Docs*" 2>/dev/nullRepository: Lamatic/AgentKit
Length of output: 117
🏁 Script executed:
# Once we confirm the file, read it with context around lines 91-94
cat -n "bundles/github-manager/flows/Docs ingestion/config.json" | sed -n '70,110p'Repository: Lamatic/AgentKit
Length of output: 1457
🏁 Script executed:
# Search for both chunkNode_770 and codeNode_158 in the file
rg -n "chunkNode_770|codeNode_158|scraperNode_823" "bundles/github-manager/flows/Docs ingestion/config.json"Repository: Lamatic/AgentKit
Length of output: 1536
Change codeNode_158 to reference chunked output from chunkNode_770.
The code references {{scraperNode_823.output.markdown}} directly instead of processing the chunked output from chunkNode_770. This bypasses the chunking step entirely. Update to use {{chunkNode_770.output.chunks}} to match the pattern used consistently across all other flows in the codebase and properly process the chunked documents.
  "primaryKeys": [
    "[URL]+[chunkidx]"
  ],
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for IndexNode primaryKeys patterns in codebase
rg -n "primaryKeys" --glob "*.json" -A2 -B2
Repository: Lamatic/AgentKit
Length of output: 29556
Replace placeholder with actual field names used by the upstream data.
The primaryKeys value "[URL]+[chunkidx]" uses placeholder notation inconsistent with all other Index nodes in the codebase, which reference actual field names (e.g., "title", "file_name", "uuid"). This will likely cause indexing failures or key collisions if the data fields are not literally named [URL] and [chunkidx]. Use the actual field names from the upstream transformation or code node that feeds this Index node.
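One way to give the Index node real fields to key on is to have the upstream code step emit an explicit identifier per chunk. The sketch below is illustrative only: the field names `url`, `chunk_idx`, `id`, and `content` are hypothetical, and whether the Index node expects one composite field or several separate fields in `primaryKeys` is an assumption to verify against Lamatic's documentation.

```python
def build_chunk_records(url, chunks):
    """Attach a stable, unique id to each chunk of a scraped page.

    Field names here are hypothetical; align them with whatever the
    Index node's primaryKeys setting actually references.
    """
    return [
        {"url": url, "chunk_idx": idx, "id": f"{url}#{idx}", "content": text}
        for idx, text in enumerate(chunks)
    ]


records = build_chunk_records("https://example.com/docs", ["intro", "setup"])
print([record["id"] for record in records])
# ['https://example.com/docs#0', 'https://example.com/docs#1']
```

Because the identifier is derived from the source URL and the chunk position, re-ingesting the same page would overwrite existing entries instead of duplicating them, provided the chunking is deterministic.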
2. **Prepare Your Submission**
   - Create a new folder with a descriptive name for your flow
   - Add all files from this package (`config.json`, `inputs.json`, `meta.json`)
“All files” list is incomplete.
This line says “all files” but lists only config.json, inputs.json, and meta.json; it should include README.md too.
Suggested fix
- - Add all files from this package (`config.json`, `inputs.json`, `meta.json`)
+ - Add all files from this package (`config.json`, `inputs.json`, `meta.json`, `README.md`)
Based on learnings, Flows exported in flows/ directory must contain all four required files: config.json, inputs.json, meta.json, and README.md.
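The four-file requirement is simple to verify automatically. The sketch below assumes the layout described in this review, where each subdirectory of a bundle's `flows/` directory is one flow; the function name is hypothetical.

```python
from pathlib import Path

REQUIRED_FILES = ("config.json", "inputs.json", "meta.json", "README.md")


def missing_flow_files(flows_dir):
    """Map each flow directory name to the required files it lacks."""
    report = {}
    for flow_dir in sorted(Path(flows_dir).iterdir()):
        if not flow_dir.is_dir():
            continue
        missing = [
            name for name in REQUIRED_FILES if not (flow_dir / name).is_file()
        ]
        if missing:
            report[flow_dir.name] = missing
    return report
```

For this PR it could be called as `missing_flow_files("bundles/github-manager/flows")`; an empty dictionary means every flow directory is complete.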
Summary
- `bundles/github-manager/` with two flow directories, each containing `config.json`, `inputs.json`, `meta.json`, and README documentation