
feat: add github-manager Agentkit#100

Open
Vasisthayadav2123 wants to merge 1 commit into Lamatic:main from Vasisthayadav2123:agentkit-challenge

Conversation

@Vasisthayadav2123 Vasisthayadav2123 commented Mar 24, 2026

Lamatic.ai GitHub Documentation & Issue Manager

What This Kit Does

Managing open-source repositories often creates a "context gap": the AI assistant lacks access to the latest documentation, and maintainers are overwhelmed by manual issue triage.

This kit automates the lifecycle of repository management by:

  1. Syncing Knowledge: Automatically scraping external documentation sites, chunking the text, and converting it into vector embeddings for RAG (Retrieval-Augmented Generation).
  2. Intelligent Triaging: Using a Webhook-driven classifier that analyzes incoming GitHub issues against the vectorized documentation to automatically apply accurate labels (e.g., bug, documentation, feature-request).
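
The chunking step in (1) can be sketched as a simple recursive character splitter. This is a minimal illustration, not the actual Processor Node implementation; the chunk size and separator hierarchy here are assumptions:

```python
def recursive_chunk(text, chunk_size=800, separators=("\n\n", "\n", ". ", " ")):
    """Split text into chunks of at most chunk_size characters,
    preferring to break on the largest separator available."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    for sep in separators:
        parts = text.split(sep)
        if len(parts) > 1:
            chunks, current = [], ""
            for part in parts:
                candidate = current + sep + part if current else part
                if len(candidate) <= chunk_size:
                    current = candidate
                else:
                    if current:
                        chunks.extend(recursive_chunk(current, chunk_size, separators))
                    current = part
            if current:
                chunks.extend(recursive_chunk(current, chunk_size, separators))
            return chunks
    # No separator fits: fall back to a hard split
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```

Each chunk is then embedded and indexed, so retrieval at classification time returns passages small enough to fit the LLM's context window.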

Providers & Prerequisites

  • Lamatic.ai: For workflow orchestration and managed vector storage.
  • OpenAI / Anthropic: For embedding generation (text-embedding-3-small) and LLM classification (gpt-4o or claude-3).
  • GitHub PAT: A Personal Access Token with repo permissions to allow the bot to apply labels.
  • Vector Database: (Optional) Pinecone or Milvus if not using Lamatic's internal storage.

Flow Logic Description:

This project utilizes two interconnected flows within Lamatic.ai:

  • The Ingestion Flow: Uses a Scraper Node to crawl a specified URL. The output is passed to a Processor Node for recursive character chunking. These chunks are sent to an Embed Node and stored in the Vector DB index. This ensures the "Brain" of your assistant is always updated with the latest docs.
  • The Action Flow: Triggered by a Webhook Node (GitHub issues event). It extracts the issue body, performs a Vector Search to find relevant doc snippets, and passes both to an LLM Classifier Node. The final API Node sends a PATCH request to GitHub's /issues/{number}/labels endpoint to tag the issue.
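
The final labeling call can be sketched in plain Python for clarity. This is an illustrative stand-in for the API Node, assuming a PAT in `token`; note that GitHub's REST API documents POST (add labels) or PUT (replace labels) for the `/labels` endpoint:

```python
import json
import urllib.request

def build_label_request(repo_full_name, issue_number, labels, token):
    """Build the HTTP request that applies labels to a GitHub issue."""
    url = f"https://api.github.com/repos/{repo_full_name}/issues/{issue_number}/labels"
    data = json.dumps({"labels": labels}).encode()
    return urllib.request.Request(
        url,
        data=data,
        method="POST",  # documented method for adding labels
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )

# req = build_label_request("owner/repo", 42, ["documentation"], token)
# urllib.request.urlopen(req)  # performs the call
```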

Lamatic Flow

Flow IDs: e5cbca46-cc72-43e6-a24b-17814b75e3b3 & 43920aae-9cc1-4e65-b28e-82597bf00661

Summary

  • Introduces GitHub Manager Agentkit – A new automation bundle for repository management via GitHub integration
  • Docs Ingestion Flow – Scrapes external documentation URLs, chunks text, generates embeddings, and stores them in a vector database for RAG
  • Issue Classifier Flow – Webhook-triggered workflow that vector-searches relevant documentation and uses an LLM classifier to apply issue labels (bug, documentation, feature-request, etc.) via GitHub API
  • Bundle Structure – Added bundles/github-manager/ with two flow directories, each containing config.json, inputs.json, meta.json, and README documentation
  • Configuration – Supports configurable embedding models (text-embedding-3-small), generative models (gpt-4o/claude-3), GitHub PAT credentials, and vector database integrations (Pinecone, Milvus, or managed providers)
  • Total Changes – 162 lines of documentation and 742 lines of workflow/configuration files (9 new files total)


coderabbitai bot commented Mar 24, 2026

📝 Walkthrough

This PR introduces a new Lamatic.ai GitHub automation bundle with two workflows: a documentation scraper that vectorizes and indexes content, and a GitHub issue classifier that analyzes issues with LLM assistance and applies labels, including configuration files, input schemas, metadata, and documentation.

Changes

  • Bundle Root (bundles/github-manager/README.md, bundles/github-manager/config.json): Bundle documentation and configuration defining the GitHub Manager automation bundle with two workflows: docs scraping/vectorization and issue classification.
  • Docs Ingestion Flow (bundles/github-manager/flows/Docs ingestion/README.md, bundles/github-manager/flows/Docs ingestion/config.json, bundles/github-manager/flows/Docs ingestion/inputs.json, bundles/github-manager/flows/Docs ingestion/meta.json): Documentation scraper workflow with 8 nodes: API trigger, URL scraper, text chunker, code processor, vectorizer, indexer, and API response; requires configuration for scraper credentials, embedding model, and vector database.
  • Classifier Flow (bundles/github-manager/flows/classifier/README.md, bundles/github-manager/flows/classifier/config.json, bundles/github-manager/flows/classifier/inputs.json, bundles/github-manager/flows/classifier/meta.json): GitHub issue triage workflow with 9 nodes: webhook trigger, LLM classification (BUG vs. DOCS), conditional logic, optional vector search with documentation context, LLM response generation, and GitHub API calls for labeling or commenting; requires configuration for generative/embedding models and vector database.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • amanintech
🚥 Pre-merge checks: ✅ 3 passed
  • Description Check (✅ Passed): Check skipped - CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title 'feat: add github-manager Agentkit' directly and clearly describes the main change: introducing a new GitHub manager Agentkit bundle with documentation, workflows, and configurations.
  • Docstring Coverage (✅ Passed): No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d1a0f00a-ea16-4564-96ee-28cf93676ae4

📥 Commits

Reviewing files that changed from the base of the PR and between cd0d519 and 56f601f.

📒 Files selected for processing (10)
  • bundles/github-manager/README.md
  • bundles/github-manager/config.json
  • bundles/github-manager/flows/Docs ingestion/README.md
  • bundles/github-manager/flows/Docs ingestion/config.json
  • bundles/github-manager/flows/Docs ingestion/inputs.json
  • bundles/github-manager/flows/Docs ingestion/meta.json
  • bundles/github-manager/flows/classifier/README.md
  • bundles/github-manager/flows/classifier/config.json
  • bundles/github-manager/flows/classifier/inputs.json
  • bundles/github-manager/flows/classifier/meta.json

Comment on lines +11 to +12
"id": "docs scraping",
"type": "mandatory"

⚠️ Potential issue | 🟠 Major

Step ID likely does not match flow reference convention.
"docs scraping" is inconsistent with common bundle step/flow ID conventions (typically exact, stable IDs, usually kebab-case). Align this step ID with the actual flow reference used by the bundle (e.g., docs-ingestion) and update prerequisite references accordingly.

Suggested fix
         {
-            "id": "docs scraping",
+            "id": "docs-ingestion",
             "type": "mandatory"
         },

As per coding guidelines, Bundle config.json must use the same format as kit config with correct flow references and step definitions.


Comment on lines +17 to +19
"prerequisiteSteps": [
"data_source"
]

⚠️ Potential issue | 🔴 Critical

Invalid prerequisite reference breaks step dependency resolution.
"classifier" depends on "data_source", but no step with id: "data_source" is defined in this bundle config.

Suggested fix
         {
             "id": "classifier",
             "type": "mandatory",
             "prerequisiteSteps": [
-                "data_source"
+                "docs-ingestion"
             ]
         }

As per coding guidelines, Bundle config.json must use the same format as kit config with correct flow references and step definitions.

Comment on lines +79 to +83
{
"label": "Condition 1",
"value": "conditionNode_731-addNode_120",
"condition": "{\n \"operator\": null,\n \"operands\": [\n {\n \"name\": \"{{RAGNode_463.output.modelResponse}}\",\n \"operator\": \"ilike\",\n \"value\": \"DOCS\"\n }\n ]\n}"
},

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for LLMNode output field patterns in other flow configs
rg -n "LLMNode.*output\." --glob "*.json" | head -30

Repository: Lamatic/AgentKit

Length of output: 5482


🏁 Script executed:

# First, let's examine the classifier config to see all nodes
cat bundles/github-manager/flows/classifier/config.json | head -200

Repository: Lamatic/AgentKit

Length of output: 6504


🏁 Script executed:

# Check if RAGNode_463 exists anywhere in this file
grep -n "RAGNode_463" bundles/github-manager/flows/classifier/config.json

Repository: Lamatic/AgentKit

Length of output: 275


🏁 Script executed:

# List all node IDs in the classifier config
grep -oP '"id":\s*"[^"]*"' bundles/github-manager/flows/classifier/config.json | sort | uniq

Repository: Lamatic/AgentKit

Length of output: 713


🏁 Script executed:

# Search for any RAG node references in the entire bundle
rg -n "RAGNode" bundles/github-manager/

Repository: Lamatic/AgentKit

Length of output: 327


🏁 Script executed:

# Check what the condition actually looks like and what nodes exist around it
sed -n '70,85p' bundles/github-manager/flows/classifier/config.json

Repository: Lamatic/AgentKit

Length of output: 662


Critical: Condition references non-existent node RAGNode_463 and uses incorrect output field.

The condition checks {{RAGNode_463.output.modelResponse}} but:

  1. No node with ID RAGNode_463 exists in this flow
  2. The output field should be generatedResponse, not modelResponse

The LLM classifier node is LLMNode_400, and all LLM nodes in the codebase use .output.generatedResponse. This will cause the condition to fail at runtime.

🐛 Proposed fix
             {
               "label": "Condition 1",
               "value": "conditionNode_731-addNode_120",
-              "condition": "{\n  \"operator\": null,\n  \"operands\": [\n    {\n      \"name\": \"{{RAGNode_463.output.modelResponse}}\",\n      \"operator\": \"ilike\",\n      \"value\": \"DOCS\"\n    }\n  ]\n}"
+              "condition": "{\n  \"operator\": null,\n  \"operands\": [\n    {\n      \"name\": \"{{LLMNode_400.output.generatedResponse}}\",\n      \"operator\": \"ilike\",\n      \"value\": \"DOCS\"\n    }\n  ]\n}"
             },
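
Dangling references like this can be caught before deploy with a small lint pass over the flow config. A sketch, assuming node IDs appear as `"id"` fields and templates use the `{{nodeId.output...}}` form:

```python
import json
import re

def find_dangling_refs(config_text):
    """Return template node references that no defined node ID backs."""
    raw = json.dumps(json.loads(config_text))  # normalize escaping
    defined = set(re.findall(r'"id":\s*"([^"]+)"', raw))
    referenced = set(re.findall(r"\{\{(\w+)\.output", raw))
    return sorted(referenced - defined)

# Hypothetical minimal config reproducing the bug above
sample = json.dumps({
    "nodes": [{"values": {"id": "LLMNode_400"}}],
    "condition": "{{RAGNode_463.output.modelResponse}}",
})
# find_dangling_refs(sample) reports RAGNode_463
```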

"nodeId": "apiNode",
"values": {
"id": "apiNode_451",
"url": "https://api.github.com/repos/{{triggerNode_1.output.repo_full_name}}/issues/{{triggerNode_1.output.issue_number}}/labels ",

⚠️ Potential issue | 🟡 Minor

Trailing whitespace in GitHub API URLs.

Lines 182, 212, and 242 have trailing spaces in the URL values. This could cause HTTP request failures or unexpected behavior when constructing the API endpoint.

🐛 Proposed fix for all three occurrences
-          "url": "https://api.github.com/repos/{{triggerNode_1.output.repo_full_name}}/issues/{{triggerNode_1.output.issue_number}}/labels ",
+          "url": "https://api.github.com/repos/{{triggerNode_1.output.repo_full_name}}/issues/{{triggerNode_1.output.issue_number}}/labels",
-          "url": "https://api.github.com/repos/{{triggerNode_1.output.repo_full_name}}/issues/{{triggerNode_1.output.issue_number}}/comments ",
+          "url": "https://api.github.com/repos/{{triggerNode_1.output.repo_full_name}}/issues/{{triggerNode_1.output.issue_number}}/comments",

Also applies to: 212-212, 242-242
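
A quick guard against this class of bug is to lint string values in the config for untrimmed URLs. A sketch; the traversal and URL heuristic here are assumptions:

```python
def find_untrimmed_urls(config):
    """Recursively collect paths of URL-like values with stray whitespace."""
    hits = []
    def walk(obj, path=""):
        if isinstance(obj, dict):
            for k, v in obj.items():
                walk(v, f"{path}.{k}" if path else k)
        elif isinstance(obj, list):
            for i, v in enumerate(obj):
                walk(v, f"{path}[{i}]")
        elif isinstance(obj, str) and obj.strip().startswith("http") and obj != obj.strip():
            hits.append(path)
    walk(config)
    return hits

# Hypothetical fragment mirroring the flagged config
cfg = {"values": {"url": "https://api.github.com/repos/x/y/issues/1/labels "}}
# find_untrimmed_urls(cfg) flags "values.url"
```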

Comment on lines +186 to +189
"retries": "0",
"nodeName": "API",
"retry_deplay": "0",
"convertXmlResponseToJson": false

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for retry_delay vs retry_deplay usage in codebase
rg -n "retry_del" --glob "*.json"

Repository: Lamatic/AgentKit

Length of output: 1693


🏁 Script executed:

# First, check if the file exists and read the relevant sections
if [ -f "bundles/github-manager/flows/classifier/config.json" ]; then
  echo "=== Lines 186-189 ==="
  sed -n '186,189p' bundles/github-manager/flows/classifier/config.json
  echo ""
  echo "=== Lines 216-219 ==="
  sed -n '216,219p' bundles/github-manager/flows/classifier/config.json
  echo ""
  echo "=== Lines 246-249 ==="
  sed -n '246,249p' bundles/github-manager/flows/classifier/config.json
  echo ""
  echo "=== Search for retry_deplay in this file ==="
  grep -n "retry_deplay" bundles/github-manager/flows/classifier/config.json || echo "No matches found"
  echo ""
  echo "=== Search for retry_delay in this file ==="
  grep -n "retry_delay" bundles/github-manager/flows/classifier/config.json || echo "No matches found"
else
  echo "File not found"
fi

Repository: Lamatic/AgentKit

Length of output: 730


Fix typo: retry_deplay should be retry_delay

The field name retry_deplay appears in all three API nodes (lines 188, 218, 248). All other config files in the codebase use the correct spelling retry_delay. This typo causes configuration inconsistency with kit standards.
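
Misspelled keys like this can be caught the same way, by checking node config keys against an allowlist. A sketch; the allowlist contents here are an assumption, not the kit's actual schema:

```python
KNOWN_KEYS = {"retries", "retry_delay", "nodeName", "url", "convertXmlResponseToJson"}

def find_unknown_keys(obj, path=""):
    """Recursively report dict keys not present in the allowlist."""
    unknown = []
    if isinstance(obj, dict):
        for k, v in obj.items():
            p = f"{path}.{k}" if path else k
            if k not in KNOWN_KEYS:
                unknown.append(p)
            unknown.extend(find_unknown_keys(v, p))
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            unknown.extend(find_unknown_keys(v, f"{path}[{i}]"))
    return unknown

node = {"retries": "0", "nodeName": "API", "retry_deplay": "0"}
# find_unknown_keys(node) flags the misspelled key
```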


2. **Prepare Your Submission**
- Create a new folder with a descriptive name for your flow
- Add all files from this package (`config.json`, `inputs.json`, `meta.json`)

⚠️ Potential issue | 🟡 Minor

“All files” list should include README.md.
The package file list is missing README.md while claiming completeness.

Suggested fix
-   - Add all files from this package (`config.json`, `inputs.json`, `meta.json`)
+   - Add all files from this package (`config.json`, `inputs.json`, `meta.json`, `README.md`)

Based on learnings, Flows exported in flows/ directory must contain all four required files: config.json, inputs.json, meta.json, and README.md.


Comment on lines +7 to +12
"values": {
"id": "triggerNode_1",
"nodeName": "API Request",
"responeType": "realtime",
"advance_schema": "{\n \"url\": \"string\",\n \"filename\": \"string\"\n}"
},

⚠️ Potential issue | 🟡 Minor

Typo: responeType should be responseType.

This typo may cause the API trigger to use default response behavior instead of the intended "realtime" mode.

🐛 Proposed fix
         "values": {
           "id": "triggerNode_1",
           "nodeName": "API Request",
-          "responeType": "realtime",
+          "responseType": "realtime",
           "advance_schema": "{\n  \"url\": \"string\",\n  \"filename\": \"string\"\n}"
         },

Comment on lines +91 to +94
"values": {
"id": "codeNode_158",
"code": "let docs = {{scraperNode_823.output.markdown}};\n\n// Convert to array if it isn't one already\nlet docsArray = Array.isArray(docs) ? docs : [docs];\n\n// Map safely, providing a fallback for pageContent if the doc is just a string\nreturn docsArray.map((doc) => (typeof doc === 'object' ? doc.pageContent : doc));",
"nodeName": "Extract Chunks"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for chunkNode output patterns in existing flows
rg -n "chunkNode.*output" --glob "*.json" | head -20

Repository: Lamatic/AgentKit

Length of output: 4469


🏁 Script executed:

# Find the exact file path and check its contents around the referenced lines
find . -type f -name "config.json" -path "*/github-manager/flows/Docs*" 2>/dev/null

Repository: Lamatic/AgentKit

Length of output: 117


🏁 Script executed:

# Once we confirm the file, read it with context around lines 91-94
cat -n "bundles/github-manager/flows/Docs ingestion/config.json" | sed -n '70,110p'

Repository: Lamatic/AgentKit

Length of output: 1457


🏁 Script executed:

# Search for both chunkNode_770 and codeNode_158 in the file
rg -n "chunkNode_770|codeNode_158|scraperNode_823" "bundles/github-manager/flows/Docs ingestion/config.json"

Repository: Lamatic/AgentKit

Length of output: 1536


Change codeNode_158 to reference chunked output from chunkNode_770.

The code references {{scraperNode_823.output.markdown}} directly instead of processing the chunked output from chunkNode_770. This bypasses the chunking step entirely. Update to use {{chunkNode_770.output.chunks}} to match the pattern used consistently across all other flows in the codebase and properly process the chunked documents.

Comment on lines +161 to +163
"primaryKeys": [
"[URL]+[chunkidx]"
],

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for IndexNode primaryKeys patterns in codebase
rg -n "primaryKeys" --glob "*.json" -A2 -B2

Repository: Lamatic/AgentKit

Length of output: 29556


Replace placeholder with actual field names used by the upstream data.

The primaryKeys value "[URL]+[chunkidx]" uses placeholder notation inconsistent with all other Index nodes in the codebase, which reference actual field names (e.g., "title", "file_name", "uuid"). This will likely cause indexing failures or key collisions if the data fields are not literally named [URL] and [chunkidx]. Use the actual field names from the upstream transformation or code node that feeds this Index node.
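
One way to get a stable per-chunk key is to derive it deterministically from real upstream fields, e.g. the source URL plus chunk index. The field names and hashing scheme here are illustrative, not the bundle's actual behavior:

```python
import hashlib

def chunk_primary_key(url, chunk_idx):
    """Deterministic per-chunk ID from actual upstream fields,
    hashed to keep keys short and collision-resistant."""
    raw = f"{url}#{chunk_idx}"
    return hashlib.sha256(raw.encode()).hexdigest()[:32]

# Identical inputs always yield the same key, so re-ingesting a page
# upserts its chunks instead of duplicating them.
k1 = chunk_primary_key("https://docs.example.com/page", 0)
k2 = chunk_primary_key("https://docs.example.com/page", 0)
```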


2. **Prepare Your Submission**
- Create a new folder with a descriptive name for your flow
- Add all files from this package (`config.json`, `inputs.json`, `meta.json`)

⚠️ Potential issue | 🟡 Minor

“All files” list is incomplete.
This line says “all files” but lists only config.json, inputs.json, and meta.json; it should include README.md too.

Suggested fix
-   - Add all files from this package (`config.json`, `inputs.json`, `meta.json`)
+   - Add all files from this package (`config.json`, `inputs.json`, `meta.json`, `README.md`)

Based on learnings, Flows exported in flows/ directory must contain all four required files: config.json, inputs.json, meta.json, and README.md.

