feat(skills): add Solr skill suite (query, extending, semantic-search, schema) #104

Open
RusticRoman wants to merge 2 commits into griddynamics:main from RusticRoman:feat/solr-skills

@RusticRoman

What & why

Adds a suite of four on-demand Solr skills plus an evals harness under skills/solr/, giving AI coding agents (Claude Code, OpenCode, OpenAI-compatible local LLMs) production-grade Solr expertise loaded on demand instead of generic, often-wrong advice.

This is purely additive — no runtime/MCP changes and no edits to existing Rosetta instructions.

| Skill | Covers |
| --- | --- |
| solr-query | query construction, parsers, local params, eDisMax, block-join, JSON facets, tag/exclude, kNN, explain, doc transformers, BM25/relevancy |
| solr-extending | custom plugins: search components, doc transformers, query parsers, update processors, value sources, jar wiring |
| solr-semantic-search | tagging + graph-path + query-model design for a tagging-based semantic search system |
| solr-schema | schema/solrconfig audit and design: field types, analyzer asymmetry, docValues/stored/indexed, synonyms, anti-patterns |

Prompt brief

  • Goal: answers reflect real-world Solr gotchas (the kind a senior Solr engineer has memorized), loaded progressively via SKILL.md + topic references.
  • Non-goals: no changes outside skills/solr/ and the two design docs under docs/superpowers/.
  • Constraints: targets Apache Solr 9.x (Solr 10 deltas flagged; 8.x not directly supported).

Before / after behavior

| Prompt | Without skill | With skill loaded |
| --- | --- | --- |
| "Package my Solr plugin jar" | generic Maven advice; may bundle solr-core | provided scope, no Solr core bundled, correct lib/wiring (solr-extending) |
| "Why does my multi-word synonym match at index time but not query time?" | vague analyzer hand-waving | identifies query-time graph expansion vs index-time + FlattenGraph fix (solr-schema) |
| "Tune BM25 k1 for short product titles" | generic k1/b explanation | concrete short-field guidance + similarity config (solr-query) |

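As a rough illustration of why `k1` matters for the short-title prompt in the comparison above, here is a minimal sketch of the textbook BM25 term-frequency component. It is the classic formula rather than Solr's actual scoring code, the function name and the sample lengths are hypothetical, and the defaults shown (`k1=1.2`, `b=0.75`) are the commonly cited ones.

```python
def bm25_tf_weight(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Textbook BM25 term-frequency component (IDF omitted).

    Illustrative only: this is the classic formula, not Solr's code path.
    """
    # Length normalization: b=0 ignores field length, b=1 applies it fully.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # k1 controls how quickly repeated occurrences stop adding score.
    return tf * (k1 + 1.0) / (tf + k1 * length_norm)


# Hypothetical short-title scenario: 4-term title, 5-term average length.
for k1 in (0.4, 1.2, 2.0):
    weights = [round(bm25_tf_weight(tf, 4, 5, k1=k1), 3) for tf in (1, 2, 3)]
    print(f"k1={k1}: {weights}")
```

With a lower `k1`, the second and third occurrences of a term add less to the weight. That is often the desired behavior when a title repeating a term is not more relevant, though it should be validated against real queries.
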
Validation evidence

  • Evals harness at skills/solr/evals/ with ~284 graded cases across the four skills and a Python runner:
    python run_evals.py --skill query|extending|semantic-search|schema|all
  • Design spec + implementation plan included under docs/superpowers/.

PR checklist

  • Scope is narrow and explicit (additive skills/solr/ only)
  • No duplicate rules or ambiguous wording introduced
  • Safety, privacy, and approval checkpoints preserved (no changes to those areas)
  • Prompt changes include a brief, examples, and validation evidence
  • No architecture changes (no docs/ARCHITECTURE.md update needed)
  • DCO Signed-off-by on the commit

Note: this bundles four cohesive skills in one PR. Happy to split per-skill if maintainers prefer smaller reviews.

🤖 Generated with Claude Code

…, schema)

Adds four on-demand Solr skills plus an evals harness for AI coding agents
(Claude Code, OpenCode, and OpenAI-compatible local LLMs). Purely additive
under skills/solr/ — no runtime/MCP changes and no edits to existing
Rosetta instructions.

Prompt brief
- Goal: give agents production-grade Solr expertise loaded on demand, so
  answers reflect real-world gotchas instead of generic Solr advice.
- Skills:
  - solr-query        — query construction, parsers, JSON facets, kNN, BM25/relevancy
  - solr-extending    — custom plugins: search components, doc transformers,
                        query parsers, update processors, value sources, wiring
  - solr-semantic-search — tagging + graph-path + query-model design
  - solr-schema       — schema/solrconfig audit and design
- Non-goals: no changes outside skills/solr/ and the design docs.
- Constraints: targets Apache Solr 9.x (Solr 10 deltas flagged); progressive
  disclosure via SKILL.md + topic references.

Validation
- Evals harness under skills/solr/evals/ with ~284 graded cases across the four
  skills and a Python runner (--skill query|extending|semantic-search|schema|all).

Includes the solr-schema design spec and implementation plan under
docs/superpowers/.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: RusticRoman <roman.kagan@rust-lang.co>

isolomatov-gd (Contributor) left a comment

@RusticRoman Please review the following issues:

  • Remove superpowers.
  • Skills are in the wrong folder. They must be in instructions/r3.
  • Skill is incorrectly formatted. SKILL.md is missing. Proper frontmatter is missing.
  • Why did you use Python? There is a very low chance Python will be installed. Node.js will be available for sure.
  • Ask your AI to follow instructions/r3/core/skills/coding-agents-prompt-authoring/SKILL.md with its references/pa-rosetta-intro-for-AI.md and references/{pa-rosetta.md, pa-patterns, pa-hardening.md, pa-schemas.md}.
  • Please check https://agentskills.io/home, as the structure does not match that either.
  • Remove the shell scripts too. I also see evals; those are probably not needed either.
  • Reduce all skill descriptions to fewer than 30 tokens each. The current ones will overflow the entire budget for all descriptions.

isolomatov-gd reopened this Jun 4, 2026
github-actions Bot added the enhancement (New feature or request) label Jun 4, 2026

github-actions Bot (Contributor) commented Jun 4, 2026

Rosetta Triage Review

Summary: This PR adds a suite of four on-demand Solr skills (solr-query, solr-extending, solr-semantic-search, solr-schema) plus an evals harness under a new top-level skills/solr/ directory, designed for manual installation by users into ~/.claude/skills/. It also adds planning/spec documents under a new docs/superpowers/ directory.

Findings:

  • Content quality is high — SKILL.md routers are well-structured with correct frontmatter, references are technically accurate and comprehensive, eval cases follow the expected JSON schema, and the Python eval runner is clean with no security issues
  • Not integrated with the Rosetta MCP instruction system — skills are placed under a new top-level skills/ directory designed for manual copy-paste installation, rather than instructions/r2/core/skills/ where Rosetta's publishable skills live and where the Rosetta CLI can pick them up for RAGFlow publishing; there is no publishing path shown for these to become part of Rosetta's MCP-served skill catalog
  • New top-level skills/ directory has no precedent in this repo — the current repo structure does not have a skills/ at root; clarification from maintainers on whether a standalone community-skill area is an intentional design decision would help
  • docs/superpowers/ adds a new doc subdirectory with internal planning artifacts (implementation plan + design spec); the term superpowers refers to a competing product in the FAQ and is not a recognized Rosetta concept, making this placement confusing
  • INTERN_ file naming (INTERN_SETUP.md, INTERN_QUICK_REFERENCE.md) is an unusual convention for a general-purpose open-source repository and may warrant renaming to SETUP.md / QUICK_REFERENCE.md
  • Evals are not CI-integrated — the runner requires a live OpenAI-compatible endpoint (LM Studio default); there is no lint-only CI step wired in .github/workflows/, so correctness of eval JSON is only verified manually
  • PR size is substantial (18,280 lines, 80 files) — the PR author offers to split per-skill; given the size and the open question about integration strategy, splitting would improve reviewability
  • No Rosetta MCP publishing step: CONTRIBUTING.md describes a publish-to-RAGFlow path via the Rosetta CLI; this PR does not include or reference that step for the new skills

Suggestions:

  • Clarify with maintainers whether these skills should live in instructions/r2/core/skills/solr-*/ (MCP-published path) or as a standalone community bundle in skills/ (manual install)
  • If standalone is intentional, add a top-level skills/README.md explaining the distinction between instructions/ (MCP-published) and skills/ (manual community bundles)
  • Move or archive docs/superpowers/plans/ and docs/superpowers/specs/ — these are internal planning artifacts; if they must stay, a more neutral name like docs/design/ would avoid confusion with the competing product
  • Rename INTERN_SETUP.md → SETUP.md and INTERN_QUICK_REFERENCE.md → QUICK_REFERENCE.md
  • Add a minimal CI check (e.g., JSON lint via python3 -c "import json; ...") on the eval case files to catch malformed cases before merge
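
The JSON lint suggested above could look roughly like the following. This is a minimal sketch, assuming the eval cases are stored as `*.json` files somewhere under a single directory (for example `skills/solr/evals/`); the function name `lint_json_files` is hypothetical and the actual file layout in the PR may differ.

```python
import json
from pathlib import Path


def lint_json_files(root):
    """Return (path, error) pairs for every *.json file under root that fails to parse."""
    failures = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            # Parsing is the whole check: no schema validation is attempted here.
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            failures.append((str(path), str(exc)))
    return failures
```

A CI step could call `lint_json_files("skills/solr/evals")` and fail the job when the returned list is non-empty. This only catches malformed JSON; checking cases against the expected schema would need a separate step.
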

Automated triage by Rosetta agent

…view

Address PR griddynamics#104 review feedback:
- Move solr-query/-schema/-extending/-semantic-search from top-level
  skills/solr/ into instructions/r3/core/skills/<name>/ (MCP-publishable path)
- Rewrite each SKILL.md into the r3 tagged-section format with proper
  frontmatter (name, description, license, tags, baseSchema)
- Trim every skill description to under 30 tokens
- Remove the Python eval harness, shell scripts, INTERN_* docs, README,
  and the docs/superpowers/ planning artifacts
- Fix a skill-isolation violation (deep-link into a sibling skill's references)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

RusticRoman (Author) commented Jun 5, 2026

Pushed 4781ffd addressing the review feedback:

  • Moved all 4 skills from top-level skills/solr/ into instructions/r3/core/skills/<name>/ (the MCP-publishable path).
  • Reformatted each SKILL.md into the r3 tagged-section style with proper frontmatter (name, description, license, tags, baseSchema).
  • Trimmed every skill description to under 30 tokens.
  • Removed the Python eval harness, shell scripts, INTERN_* docs, README, and the docs/superpowers/ planning artifacts.
  • Fixed a skill-isolation violation (a deep-link into a sibling skill's references/).
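
The frontmatter and description-length points above lend themselves to an automated check. The sketch below is an assumption-laden illustration: it treats frontmatter as flat `key: value` lines between `---` delimiters, uses the five keys named in this comment, and approximates "tokens" by whitespace-separated words, which undercounts relative to a real tokenizer. The names `parse_frontmatter` and `check_skill` are hypothetical and not part of Rosetta.

```python
REQUIRED_KEYS = ("name", "description", "license", "tags", "baseSchema")
MAX_DESCRIPTION_TOKENS = 30  # budget from the review; words are a crude proxy


def parse_frontmatter(text):
    """Parse top-level `key: value` pairs from a leading '---' delimited block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        if line[:1].isspace() or ":" not in line:
            continue  # skip nested, continuation, and blank lines
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip("\"'")
    return {}  # no closing delimiter found


def check_skill(text):
    """Return a list of human-readable problems for one SKILL.md body."""
    fields = parse_frontmatter(text)
    problems = [f"missing frontmatter key: {key}"
                for key in REQUIRED_KEYS if key not in fields]
    word_count = len(fields.get("description", "").split())
    if word_count >= MAX_DESCRIPTION_TOKENS:
        problems.append(f"description too long: {word_count} words")
    return problems
```

Calling `check_skill(Path("SKILL.md").read_text())` on each skill would return an empty list when all five keys are present and the description stays under the word budget. A real check would swap in a YAML parser and the tokenizer the description budget is actually measured with.
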

github-actions Bot (Contributor) commented Jun 5, 2026

📋 Prompt Quality Validation Report

✅ Validation Passed

Summary by File

| File | 🔴 Critical | 🟠 Very High | 🟡 High | 🔵 Medium | ⚪ Low | Status |
| --- | --- | --- | --- | --- | --- | --- |
