feat(search): Semantic Tool Search by shashi-stackone · Pull Request #149 · StackOneHQ/stackone-ai-python

shashi-stackone · 2026-02-19T13:36:55Z

Problem

Following up from #142

StackOne has over 10,000 actions across all connectors, and the number is growing; some connectors alone have 2,000+ actions.
Keyword matching breaks down when someone searches "onboard new hire" but the action is called hris_create_employee. The SDK
already supports keyword-based search, and we need to add semantic search using the action search service.

Implementation Details

- SemanticSearchClient that calls StackOne's /actions/search API for natural language tool discovery
- Three ways to use it:
  1. search_tools(): search by intent and get a Tools collection ready for OpenAI, LangChain, or any framework
  2. search_action_names(): lightweight lookup returning action names and scores without full tool definitions
  3. Utility tools: pass a SemanticSearchClient to utility_tools() and the tool_search tool becomes semantic-aware inside agent loops
- Per-connector parallel search so results are scoped to only the connectors the user has linked
- Automatic fallback to local BM25+TF-IDF hybrid search when the semantic API is unavailable
- Action name normalization that strips version prefixes (e.g. bamboohr_1.0.0_bamboohr_create_employee_global → bamboohr_create_employee)
- Connector helpers (StackOneTool.connector, Tools.get_connectors()) for connector-aware filtering
- Benchmark suite with 94 evaluation tasks across 8 categories: semantic search achieves 76.6% Hit@5 vs 66.0% for local search (+10.6 percentage points)
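
To make the normalization item concrete, here is a minimal sketch that reproduces the single example given above. The regular expression and the function name `normalize_action_name` are assumptions made for illustration; the PR's actual helper in stackone_ai/utils/normalize.py may use different rules.

```python
import re

# Assumed shape of a versioned API name: <connector>_<x.y.z>_<action name>_global.
# This pattern is inferred from the one example in the PR description only.
_VERSIONED = re.compile(r"^[a-z0-9]+_\d+(?:\.\d+)+_(?P<name>.+?)_global$")


def normalize_action_name(action_name: str) -> str:
    """Strip the versioned prefix and trailing suffix; return other names unchanged."""
    match = _VERSIONED.match(action_name)
    return match.group("name") if match else action_name


print(normalize_action_name("bamboohr_1.0.0_bamboohr_create_employee_global"))
print(normalize_action_name("bamboohr_create_employee"))
```

Names that do not match the assumed pattern pass through untouched, so the function is safe to apply to tool names that are already normalized.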

Summary by cubic

Adds semantic tool search so users can find actions with natural language (e.g., “onboard new hire” → hris_create_employee). Searches are scoped to linked connectors and optionally project_ids, fall back to local search, and backend ranking is respected unless top_k is set.

New Features
- SemanticSearchClient with search_tools() (returns Tools) and search_action_names() (names + scores).
- Tools.utility_tools(semantic_client=...) makes tool_search semantic-aware and limits results to connectors available in the fetched tools.
- Optional project_ids scoping to restrict semantic results to specific projects.
- Per-connector parallel search and action name normalization (strips version prefixes).
- Benchmark: 76.6% Hit@5 vs 66.0% local (+10.6 pts).
Migration
- No breaking changes. To enable, create SemanticSearchClient and either call search_tools()/search_action_names() or pass it to Tools.utility_tools(semantic_client=...).
- Set top_k to limit results; default respects backend ranking.
- Optionally pass project_ids to scope searches.
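
For readers unfamiliar with the Hit@5 metric quoted in the benchmark bullet, the sketch below assumes the usual definition: the fraction of tasks whose expected action appears among the top k results. The function name `hit_at_k` and the toy data are illustrative and are not taken from the PR's benchmark suite.

```python
from __future__ import annotations


def hit_at_k(ranked_results: list[list[str]], expected: list[str], k: int = 5) -> float:
    """Fraction of tasks whose expected action is in the top-k ranked results."""
    if len(ranked_results) != len(expected):
        raise ValueError("ranked_results and expected must have the same length")
    if not expected:
        return 0.0
    hits = sum(1 for ranked, want in zip(ranked_results, expected) if want in ranked[:k])
    return hits / len(expected)


# Toy data (hypothetical tool names): one ranked list per evaluation task.
ranked = [
    ["bamboohr_list_employees", "bamboohr_create_employee"],
    ["bamboohr_list_employees"],
    ["bamboohr_get_employee"],
]
expected = ["bamboohr_create_employee", "bamboohr_create_employee", "bamboohr_get_employee"]

print(hit_at_k(ranked, expected, k=5))
print(hit_at_k(ranked, expected, k=1))
```

Under this definition, the reported 76.6% and 66.0% over 94 tasks would be consistent with 72 and 62 hits respectively, which is where the gap of 10.6 percentage points comes from.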

Written for commit 71457af. Summary will update on new commits.

When utility_tools(semantic_client=...) is used, tool_search now searches only the connectors available in the fetched tools collection instead of the full StackOne catalog. This prevents agents from discovering tools they cannot execute.

- Add available_connectors param to create_semantic_tool_search
- Pass connectors from Tools.utility_tools() to scope searches
- Update docs, examples, and README to reflect scoping
- Add 4 new tests for scoping behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Adds semantic (natural-language) tool discovery to the StackOne AI SDK by integrating a new /actions/search client into the existing toolset + utility-tools flows, including connector scoping, normalization, examples, and tests.

Changes:

- Introduces SemanticSearchClient + pydantic response models and action-name normalization for versioned API action names.
- Adds StackOneToolSet.search_tools() / search_action_names() and wires semantic search into Tools.utility_tools(..., semantic_client=...).
- Adds a full semantic-search example script, README docs, and a comprehensive test suite.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

Show a summary per file

| File | Description |
| --- | --- |
| tests/test_semantic_search.py | New unit/integration tests covering semantic client, toolset integration, scoping, fallback, and normalization/dedup behaviors. |
| stackone_ai/utils/normalize.py | Adds action-name normalization helper for versioned semantic API action names. |
| stackone_ai/utility_tools.py | Adds `create_semantic_tool_search()` and updates local tool_search schema handling for nullable limit/minScore. |
| stackone_ai/toolset.py | Adds lazy `semantic_client` and new semantic search entrypoints with per-connector parallelization + local fallback. |
| stackone_ai/semantic_search.py | New semantic search client module for `/actions/search`. |
| stackone_ai/models.py | Adds `StackOneTool.connector`, `Tools.get_connectors()`, and semantic-aware `utility_tools()` option. |
| stackone_ai/__init__.py | Exposes semantic search classes at package top-level. |
| examples/test_examples.py | Registers the new semantic search example for the examples test runner. |
| examples/semantic_search_example.py | Adds end-to-end runnable example demonstrating semantic search + integrations. |
| README.md | Documents semantic search feature and links to the new example. |

Copilot · 2026-02-19T13:42:53Z

stackone_ai/models.py

+            from stackone_ai.utility_tools import create_semantic_tool_search
+
+            search_tool = create_semantic_tool_search(
+                semantic_client, available_connectors=self.get_connectors() or None


available_connectors=self.get_connectors() or None makes an empty connector set behave like None, which triggers the “full catalog” path in create_semantic_tool_search. If a user has no fetched tools/connectors, this will return semantic results for tools they can’t execute, contradicting the scoping behavior described in the docstring/README. Pass the (possibly empty) set through unchanged so empty connectors yield empty results.

Suggested change

-                semantic_client, available_connectors=self.get_connectors() or None
+                semantic_client, available_connectors=self.get_connectors()
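
The behavior Copilot describes follows from Python's truthiness rules: an empty set is falsy, so `x or None` turns it into `None`. The short sketch below shows the difference; the two helper functions are hypothetical names used only for this demonstration.

```python
from __future__ import annotations


def with_or_none(connectors: set[str]) -> set[str] | None:
    """Mirrors the `... or None` expression from the diff."""
    return connectors or None


def passed_through(connectors: set[str]) -> set[str] | None:
    """Mirrors the suggested change: keep the (possibly empty) set as it is."""
    return connectors


print(with_or_none(set()))         # empty set collapses to None
print(passed_through(set()))       # empty set is preserved
print(with_or_none({"bamboohr"}))  # non-empty sets are unaffected
```

Because `None` is used downstream to mean "no scoping", collapsing the empty set erases the distinction between "no connectors linked" and "search everything".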

Copilot · 2026-02-19T13:42:54Z

stackone_ai/utility_tools.py

+                    for future in concurrent.futures.as_completed(futures):
+                        try:
+                            resp = future.result()
+                            all_results.extend(resp.results)
+                        except Exception:
+                            pass  # Partial failures: skip failed connectors


The per-connector futures block catches Exception and silently ignores it. This can hide programming errors (e.g., response parsing issues) and makes failures impossible to debug. Catch SemanticSearchError (or the expected exception type) explicitly and consider logging which connector failed; if all connectors fail, consider surfacing an error instead of returning an empty list.
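
One possible shape for the narrower handling suggested here is sketched below. It is not the PR's code: `SemanticSearchError` is redefined locally as a stand-in for the SDK's exception, and `search_connectors` and its `search_one` callback are hypothetical names.

```python
from __future__ import annotations

import concurrent.futures
import logging
from typing import Callable

logger = logging.getLogger(__name__)


class SemanticSearchError(Exception):
    """Local stand-in for the SDK's error type so this sketch runs on its own."""


def search_connectors(search_one: Callable[[str], list[str]], connectors: list[str]) -> list[str]:
    """Run search_one per connector in parallel, tolerating only expected failures."""
    results: list[str] = []
    failed: list[str] = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max(1, len(connectors))) as pool:
        futures = {pool.submit(search_one, c): c for c in connectors}
        for future in concurrent.futures.as_completed(futures):
            connector = futures[future]
            try:
                results.extend(future.result())
            except SemanticSearchError as exc:
                # Expected failure: record which connector failed, keep going.
                logger.warning("Semantic search failed for connector %s: %s", connector, exc)
                failed.append(connector)
            # Any other exception type propagates, so programming errors stay visible.
    if connectors and len(failed) == len(connectors):
        raise SemanticSearchError(f"all connector searches failed: {sorted(failed)}")
    return results
```

Partial failures still yield the results from the healthy connectors, while a total failure raises instead of returning an empty list that looks like "no matches".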

Copilot · 2026-02-19T13:42:55Z

stackone_ai/semantic_search.py

+        url = f"{self.base_url}/actions/search"
+        headers = {
+            "Authorization": self._build_auth_header(),
+            "Content-Type": "application/json",
+        }
+        payload: dict[str, Any] = {"query": query}
+        if top_k is not None:
+            payload["top_k"] = top_k
+        if connector:
+            payload["connector"] = connector
+
+        try:
+            response = httpx.post(url, json=payload, headers=headers, timeout=self.timeout)
+            response.raise_for_status()
+            data = response.json()


SemanticSearchClient.search() uses the top-level httpx.post(...) helper, which creates a new client per call and prevents connection pooling/keep-alive. With the new parallel per-connector search, this can add noticeable overhead and extra socket churn. Consider using a persistent httpx.Client (or AsyncClient if you later make this async) owned by SemanticSearchClient, and include a User-Agent consistent with the rest of the SDK for observability.

Copilot · 2026-02-19T13:42:55Z

README.md


+## Semantic Search
+
+Discover tools using natural language instead of exact names. Queries like "onboard new hire" resolve to the right actions even when the tool is called `hris_create_employee`.


The new Semantic Search section says the action is called hris_create_employee, but the rest of the repo (examples + semantic normalization/tests) uses connector-prefixed tool names like bamboohr_create_employee. This is likely to confuse users trying the feature; update the example text to match the actual tool naming used by the SDK, or explicitly explain the difference if both naming schemes exist.

Suggested change

-Discover tools using natural language instead of exact names. Queries like "onboard new hire" resolve to the right actions even when the tool is called `hris_create_employee`.
+Discover tools using natural language instead of exact names. Queries like "onboard new hire" resolve to the right actions even when the actual tool is named `bamboohr_create_employee` (rather than a generic `hris_create_employee`).

Copilot · 2026-02-19T13:42:56Z

stackone_ai/toolset.py

+            matched_tools = [t for t in all_tools if t.name in action_names]
+
+            # Sort matched tools by semantic search score order
+            action_order = {_normalize_action_name(r.action_name): i for i, r in enumerate(all_results)}


action_order is built from a dict comprehension keyed by normalized action name, so if multiple semantic results normalize to the same tool name, later (often lower-scoring) duplicates overwrite earlier entries. This can reorder matched_tools incorrectly (e.g., A(v1 high), B(mid), A(v2 low) ends up ranking B ahead of A). Build action_order without overwriting existing keys (or deduplicate results by normalized name keeping the best score) before sorting.

Suggested change

-            action_order = {_normalize_action_name(r.action_name): i for i, r in enumerate(all_results)}
+            action_order: dict[str, int] = {}
+            for i, r in enumerate(all_results):
+                key = _normalize_action_name(r.action_name)
+                if key not in action_order:
+                    action_order[key] = i
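
The reordering in the A(v1 high), B(mid), A(v2 low) case can be reproduced in a few lines. This sketch works on names that are assumed to be already normalized, so it does not depend on the SDK's `_normalize_action_name`; the function names are illustrative.

```python
from __future__ import annotations


def order_last_wins(names: list[str]) -> dict[str, int]:
    """Dict comprehension: a later duplicate overwrites the earlier index."""
    return {name: i for i, name in enumerate(names)}


def order_first_wins(names: list[str]) -> dict[str, int]:
    """Loop variant: keep the first (best-ranked) index for each name."""
    order: dict[str, int] = {}
    for i, name in enumerate(names):
        if name not in order:
            order[name] = i
    return order


# Results in score order; "a" appears twice (two API versions of the same action).
ranked_names = ["a", "b", "a"]
tools = ["a", "b"]

bad = order_last_wins(ranked_names)
good = order_first_wins(ranked_names)
print(sorted(tools, key=lambda t: bad.get(t, float("inf"))))
print(sorted(tools, key=lambda t: good.get(t, float("inf"))))
```

With the comprehension, "a" is keyed to index 2 and sorts behind "b"; with the loop it keeps index 0 and stays first.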

cubic-dev-ai

2 issues found across 10 files

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="stackone_ai/utility_tools.py">

<violation number="1" location="stackone_ai/utility_tools.py:360">
P1: Bare `except Exception: pass` silently swallows all errors, including programming bugs and meaningful `SemanticSearchError` diagnostics. If every connector search fails, the user gets empty results with no error indication. At minimum, log the exception (e.g., `logging.exception`) and consider catching only `SemanticSearchError`. Also note the inconsistency: the unscoped search path (line ~364) lets exceptions propagate, but this parallel path silently drops them.</violation>
</file>

<file name="stackone_ai/toolset.py">

<violation number="1" location="stackone_ai/toolset.py:390">
P2: Dict comprehension overwrites the best (first) index for duplicate normalized action names, causing tools with duplicate API versions to be sorted by their *worst* score instead of their best. Use a loop that keeps only the first occurrence.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

stackone_ai/utility_tools.py

cubic-dev-ai · 2026-02-19T13:43:13Z

stackone_ai/toolset.py

+            matched_tools = [t for t in all_tools if t.name in action_names]
+
+            # Sort matched tools by semantic search score order
+            action_order = {_normalize_action_name(r.action_name): i for i, r in enumerate(all_results)}


P2: Dict comprehension overwrites the best (first) index for duplicate normalized action names, causing tools with duplicate API versions to be sorted by their worst score instead of their best. Use a loop that keeps only the first occurrence.

Prompt for AI agents

Check if this issue is valid; if so, understand the root cause and fix it. At stackone_ai/toolset.py, line 390:
<comment>Dict comprehension overwrites the best (first) index for duplicate normalized action names, causing tools with duplicate API versions to be sorted by their *worst* score instead of their best. Use a loop that keeps only the first occurrence.</comment>
<file context>
@@ -264,6 +276,253 @@ def set_accounts(self, account_ids: list[str]) -> StackOneToolSet:
+            matched_tools = [t for t in all_tools if t.name in action_names]
+
+            # Sort matched tools by semantic search score order
+            action_order = {_normalize_action_name(r.action_name): i for i, r in enumerate(all_results)}
+            matched_tools.sort(key=lambda t: action_order.get(t.name, float("inf")))
+
</file context>

Suggested change

-            action_order = {_normalize_action_name(r.action_name): i for i, r in enumerate(all_results)}
+            action_order: dict[str, int] = {}
+            for i, r in enumerate(all_results):
+                name = _normalize_action_name(r.action_name)
+                if name not in action_order:
+                    action_order[name] = i

cubic-dev-ai

2 issues found across 3 files (changes from recent commits).

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="stackone_ai/utility_tools.py">

<violation number="1" location="stackone_ai/utility_tools.py:342">
P2: An empty `available_connectors` set now falls back to a full-catalog search, which can surface tools the user doesn’t have access to. This contradicts the scoping behavior (“only the user’s own connectors are searched”) and likely returns incorrect results for accounts with no connectors.</violation>
</file>

<file name="stackone_ai/toolset.py">

<violation number="1" location="stackone_ai/toolset.py:407">
P2: Passing top_k directly to the local tool_search limits results before connector filtering, so fallback can return fewer than requested even when matching tools exist for the allowed connectors. Consider keeping an expanded fallback limit (e.g., top_k * N or a safe default) before filtering.</violation>
</file>

cubic-dev-ai · 2026-02-19T14:40:15Z

stackone_ai/utility_tools.py

+
+        all_results: list[SemanticSearchResult] = []
+
+        if available_connectors is not None and available_connectors:


P2: An empty available_connectors set now falls back to a full-catalog search, which can surface tools the user doesn’t have access to. This contradicts the scoping behavior (“only the user’s own connectors are searched”) and likely returns incorrect results for accounts with no connectors.

Prompt for AI agents

Check if this issue is valid; if so, understand the root cause and fix it. At stackone_ai/utility_tools.py, line 342:
<comment>An empty `available_connectors` set now falls back to a full-catalog search, which can surface tools the user doesn’t have access to. This contradicts the scoping behavior (“only the user’s own connectors are searched”) and likely returns incorrect results for accounts with no connectors.</comment>
<file context>
@@ -339,7 +339,7 @@ def execute_search(arguments: str | JsonDict | None = None) -> JsonDict:
         all_results: list[SemanticSearchResult] = []

-        if available_connectors is not None:
+        if available_connectors is not None and available_connectors:
             # Scoped search: query each connector in parallel
             if connector:
</file context>

Suggested change

-        if available_connectors is not None and available_connectors:
+        if available_connectors is not None:
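
The distinction the reviewer is asking for is a three-way one: `None` means unscoped, an empty set means "nothing to search", and a non-empty set means one query per connector. A minimal sketch of that decision, using a hypothetical helper `plan_queries` that is not part of the PR:

```python
from __future__ import annotations


def plan_queries(available_connectors: set[str] | None) -> list[str | None]:
    """Return the connector filter for each query to issue.

    [None] stands for a single unscoped (full-catalog) query; [] means no query at all.
    """
    if available_connectors is None:
        return [None]  # caller did not ask for scoping
    # Scoped path: an empty set falls through to an empty plan, so no results.
    return sorted(available_connectors)


print(plan_queries(None))
print(plan_queries(set()))
print(plan_queries({"workday", "bamboohr"}))
```

Checking only `is not None` keeps the empty set on the scoped path, which is what makes an account with no connectors return nothing rather than the whole catalog.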

cubic-dev-ai · 2026-02-19T14:40:15Z

stackone_ai/toolset.py

+                result = search_tool.execute(
+                    {
+                        "query": query,
+                        "limit": top_k,


P2: Passing top_k directly to the local tool_search limits results before connector filtering, so fallback can return fewer than requested even when matching tools exist for the allowed connectors. Consider keeping an expanded fallback limit (e.g., top_k * N or a safe default) before filtering.

Prompt for AI agents

Check if this issue is valid; if so, understand the root cause and fix it. At stackone_ai/toolset.py, line 407:
<comment>Passing top_k directly to the local tool_search limits results before connector filtering, so fallback can return fewer than requested even when matching tools exist for the allowed connectors. Consider keeping an expanded fallback limit (e.g., top_k * N or a safe default) before filtering.</comment>
<file context>
@@ -401,11 +401,10 @@ def _search_one(c: str) -> list[SemanticSearchResult]:
                     {
                         "query": query,
-                        "limit": fallback_limit,
+                        "limit": top_k,
                         "minScore": min_score,
                     }
</file context>
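
The effect of limiting before filtering can be shown with a small sketch. Everything here is illustrative: `fallback_search`, the `overfetch` multiplier, and the rule that a tool's connector is the text before its first underscore are assumptions, not the SDK's implementation.

```python
from __future__ import annotations


def fallback_search(
    ranked_names: list[str],
    allowed_connectors: set[str],
    top_k: int | None,
    overfetch: int = 3,
) -> list[str]:
    """Take an expanded slice of local results, filter by connector, then cut to top_k."""
    limit = None if top_k is None else top_k * overfetch
    candidates = ranked_names if limit is None else ranked_names[:limit]
    # Assumption: the connector is the prefix before the first underscore.
    filtered = [n for n in candidates if n.split("_", 1)[0] in allowed_connectors]
    return filtered if top_k is None else filtered[:top_k]


# Local ranking where the allowed connector's tools are not at the very top.
ranked = ["workday_create_worker", "workday_list_workers", "bamboohr_create_employee", "bamboohr_list_employees"]

print(fallback_search(ranked, {"bamboohr"}, top_k=2, overfetch=1))  # limit applied first
print(fallback_search(ranked, {"bamboohr"}, top_k=2, overfetch=3))  # expanded limit
```

With `overfetch=1` the slice is cut to two items before filtering and both are dropped; with an expanded limit the two matching tools survive the filter.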

shashi-stackone and others added 30 commits February 18, 2026 09:51

Semantic Search on actions in Python AI SDK

c7ad71f

Filter tools based on the SDK auth config and connector

0210c1f

Use the local benchmark from the ai-generations

b1105fa

Add Semantic search benchmark with local benchmarks

d49f52b

Fix CI lint errors

680fa8e

Fix the lint in the benchmark file

1ee842b

Formalise the docs and code

d6fba69

Keep semantic search minimal in the README

3eb0641

Remove the old benchmark data

fd37d93

implement PR feedback suggestions from cubic

f5ef955

fix nullable in the semantic tool schema

b7b522f

limit override

e9c6b86

handle per connector calls to avoid the guesswork

34e1ca6

simplify utility_tools API by inferring semantic search from client presence

82082cb

Benchmark update and PR suggestions

8a74517

update the README gst

85b0395

Note on the fetch tools for actions that user expect to discover

79c762a

Update examples and improve the semantic search

6ee1adf

Fix ruff issues

7a65367

Document the semantic search feature in the python files and example

64a0a60

Respect the backend results unless top_k specified explicitly, add python only crewAI example

4083642

move the crewAI tools conversation back in the example

b926db1

CI Trigger

d2dd2f5

Fix unit tests with updated top_k behavior

719b391

Update PR with correct approach mentioned in the PR comments

b360b00

Update example and remove unwanted crewai examples

7b77f33

Remove the crewai reference from the README

bab931b

Fix the Ruff CI issue

5eaa3c5

Add back crewAI integration and test integration

173121d

shashi-stackone added 3 commits February 19, 2026 13:16

Remove the semantic search example from the tools

1e4cc9a

Merge branch 'main' into semantic_search

f1db9f2

Semantic Search

a87fa00

Copilot AI review requested due to automatic review settings February 19, 2026 13:36

Copilot started reviewing on behalf of shashi-stackone February 19, 2026 13:37

shashi-stackone mentioned this pull request Feb 19, 2026

feat(search): add semantic search for AI-powered tool discovery #142

Closed

6 tasks

Copilot AI reviewed Feb 19, 2026

cubic-dev-ai bot reviewed Feb 19, 2026

Cubic suggestions

c9c0358

cubic-dev-ai bot reviewed Feb 19, 2026

Optionally support project_ids in the SDK search

71457af

shashi-stackone wants to merge 35 commits into main from semantic_search_12111

1 participant
