Commit ebb460b

committed
Update PR with correct approach mentioned in the PR comments
1 parent c4f8f34 commit ebb460b

6 files changed

+211
-215
lines changed

examples/semantic_search_example.py

Lines changed: 39 additions & 28 deletions
Original file line number | Diff line number | Diff line change
@@ -22,32 +22,33 @@
2222
This is the method you should use when integrating with OpenAI, LangChain,
2323
CrewAI, or any other agent framework. It works in these steps:
2424
25-
a) Fetch ALL tools from the user's linked accounts via MCP
26-
b) Extract the set of available connectors (e.g. {bamboohr, calendly})
27-
c) Query the semantic search API with the natural language query
28-
d) Filter results to only connectors the user has access to
29-
e) Deduplicate across API versions (keep highest score per action)
25+
a) Fetch tools from the user's linked accounts via MCP
26+
b) Extract available connectors (e.g. {bamboohr, calendly})
27+
c) Search EACH connector in parallel via the semantic search API
28+
d) Collect results, sort by relevance score
29+
e) If top_k was specified, keep only the top K results
3030
f) Match results back to the fetched tool definitions
3131
g) Return a Tools collection sorted by relevance score
3232
33-
Key point: tools are fetched first, semantic search runs second, and only
34-
the intersection (tools the user has AND that match the query) is returned.
35-
If the semantic API is unavailable, the SDK falls back to local BM25+TF-IDF
36-
search automatically.
33+
Key point: only the user's own connectors are searched — no wasted results
34+
from connectors the user doesn't have. When top_k is not specified, the
35+
backend decides how many results to return per connector. If the semantic
36+
API is unavailable, the SDK falls back to local BM25+TF-IDF search
37+
automatically.
3738
3839
2. search_action_names(query) — Lightweight preview
3940
4041
Queries the semantic API directly and returns metadata (name, connector,
4142
score, description) without fetching full tool definitions. Useful for
4243
inspecting results before committing to a full fetch. When account_ids are
43-
provided, results are filtered to the user's available connectors.
44+
provided, each connector is searched in parallel (same as search_tools).
4445
45-
3. utility_tools(semantic_client=...) — Agent-loop pattern
46+
3. utility_tools() — Agent-loop pattern
4647
4748
Creates tool_search and tool_execute utility tools that agents can call
48-
inside an agentic loop. The agent searches, inspects, and executes tools
49-
dynamically. Note: utility tool search queries the full backend catalog
50-
(all connectors), not just the user's linked accounts.
49+
inside an agentic loop. Pass semantic_client=toolset.semantic_client to
50+
enable cloud-based semantic search; without it, local BM25+TF-IDF is
51+
used. The agent searches, inspects, and executes tools dynamically.
5152
5253
5354
This example is runnable with the following command:
@@ -100,24 +101,35 @@ def example_search_action_names():
100101
toolset = StackOneToolSet()
101102

102103
query = "get user schedule"
103-
print(f'Searching for: "{query}"')
104+
105+
# --- top_k behavior ---
106+
# When top_k is NOT specified, the backend decides how many results to return.
107+
# When top_k IS specified, results are explicitly limited to that number.
108+
print(f'Searching for: "{query}" (no top_k — backend decides count)')
109+
results_default = toolset.search_action_names(query)
110+
print(f" Backend returned {len(results_default)} results (its default)")
104111
print()
105112

106-
results = toolset.search_action_names(query, top_k=5)
113+
print(f'Searching for: "{query}" (top_k=3 — explicitly limited)')
114+
results_limited = toolset.search_action_names(query, top_k=3)
115+
print(f" Got exactly {len(results_limited)} results")
116+
print()
107117

108-
print(f"Top {len(results)} matches from the full catalog:")
109-
for r in results:
118+
# Show the limited results
119+
print(f"Top {len(results_limited)} matches from the full catalog:")
120+
for r in results_limited:
110121
print(f" [{r.similarity_score:.2f}] {r.action_name} ({r.connector_key})")
111122
print(f" {r.description}")
112123
print()
113124

114125
# Show filtering effect when account_ids are available
115126
if _account_ids:
116127
print(f"Now filtering to your linked accounts ({', '.join(_account_ids)})...")
128+
print(" (Each connector is searched in parallel — only your connectors are queried)")
117129
filtered = toolset.search_action_names(query, account_ids=_account_ids, top_k=5)
118-
print(f"Filtered to {len(filtered)} matches (only your connectors):")
130+
print(f" Filtered to {len(filtered)} matches (only your connectors):")
119131
for r in filtered:
120-
print(f" [{r.similarity_score:.2f}] {r.action_name} ({r.connector_key})")
132+
print(f" [{r.similarity_score:.2f}] {r.action_name} ({r.connector_key})")
121133
else:
122134
print("Tip: Set STACKONE_ACCOUNT_ID to see results filtered to your linked connectors.")
123135

@@ -128,9 +140,9 @@ def example_search_tools():
128140
"""High-level semantic search returning a Tools collection.
129141
130142
search_tools() is the recommended way to use semantic search. It:
131-
1. Queries the semantic search API with your natural language query
132-
2. Fetches tool definitions from your linked accounts via MCP
133-
3. Matches semantic results to available tools (filtering out connectors you don't have)
143+
1. Fetches tool definitions from your linked accounts via MCP
144+
2. Searches each of your connectors in parallel via the semantic search API
145+
3. Sorts results by relevance and matches back to tool definitions
134146
4. Returns a Tools collection ready for any framework (.to_openai(), .to_langchain(), etc.)
135147
"""
136148
print("=" * 60)
@@ -199,9 +211,9 @@ def example_search_tools_with_connector():
199211
def example_utility_tools_semantic():
200212
"""Using utility tools with semantic search for agent loops.
201213
202-
When building agent loops (search -> select -> execute), pass
203-
semantic_client to utility_tools() to upgrade tool_search from
204-
local BM25+TF-IDF to cloud-based semantic search.
214+
Pass semantic_client=toolset.semantic_client to utility_tools() to enable
215+
cloud-based semantic search. Without it, utility_tools() uses local
216+
BM25+TF-IDF search instead.
205217
206218
Note: tool_search queries the full backend catalog (all connectors),
207219
not just the ones in your linked accounts.
@@ -219,8 +231,7 @@ def example_utility_tools_semantic():
219231
print()
220232

221233
print("Step 2: Creating utility tools with semantic search enabled...")
222-
print(" Passing semantic_client upgrades tool_search from local keyword")
223-
print(" matching (BM25+TF-IDF) to cloud-based semantic vector search.")
234+
print(" Pass semantic_client=toolset.semantic_client to enable semantic search.")
224235
utility = tools.utility_tools(semantic_client=toolset.semantic_client)
225236

226237
search_tool = utility.get_tool("tool_search")

stackone_ai/models.py

Lines changed: 10 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -6,16 +6,15 @@
66
from collections.abc import Sequence
77
from datetime import datetime, timezone
88
from enum import Enum
9-
from typing import TYPE_CHECKING, Annotated, Any, ClassVar, TypeAlias, cast
10-
11-
if TYPE_CHECKING:
12-
from stackone_ai.semantic_search import SemanticSearchClient
9+
from typing import Annotated, Any, ClassVar, TypeAlias, cast
1310
from urllib.parse import quote
1411

1512
import httpx
1613
from langchain_core.tools import BaseTool
1714
from pydantic import BaseModel, BeforeValidator, Field, PrivateAttr
1815

16+
from stackone_ai.semantic_search import SemanticSearchClient
17+
1918
# Type aliases for common types
2019
JsonDict: TypeAlias = dict[str, Any]
2120
Headers: TypeAlias = dict[str, str]
@@ -573,9 +572,8 @@ def utility_tools(
573572
hybrid_alpha: Weight for BM25 in hybrid search (0-1). Only used when
574573
semantic_client is not provided. If not provided, uses DEFAULT_HYBRID_ALPHA (0.2),
575574
which gives more weight to BM25 scoring.
576-
semantic_client: SemanticSearchClient instance for cloud-based semantic search.
577-
When provided, semantic search is used instead of local BM25+TF-IDF.
578-
Can be obtained from StackOneToolSet.semantic_client.
575+
semantic_client: Optional SemanticSearchClient instance. Pass
576+
toolset.semantic_client to enable cloud-based semantic search.
579577
580578
Returns:
581579
Tools collection containing tool_search and tool_execute
@@ -584,16 +582,13 @@ def utility_tools(
584582
This feature is in beta and may change in future versions
585583
586584
Example:
587-
# Local search (default)
588-
utility = tools.utility_tools()
589-
590-
# Semantic search (requires toolset)
591-
from stackone_ai import StackOneToolSet
585+
# Semantic search (pass semantic_client explicitly)
592586
toolset = StackOneToolSet()
593587
tools = toolset.fetch_tools()
594-
utility = tools.utility_tools(
595-
semantic_client=toolset.semantic_client,
596-
)
588+
utility = tools.utility_tools(semantic_client=toolset.semantic_client)
589+
590+
# Local BM25+TF-IDF search (default, no semantic_client)
591+
utility = tools.utility_tools()
597592
"""
598593
from stackone_ai.utility_tools import create_tool_execute
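
The optional semantic_client behavior documented above (semantic search when a client is passed, local search otherwise) follows a common optional-dependency pattern. A minimal sketch, assuming a hypothetical `make_tool_search` factory and a toy `local_keyword_search` in place of real BM25+TF-IDF; the runtime fallback on `ConnectionError` is also an assumption and is not confirmed by this diff:

```python
from typing import Callable, Optional

SearchFn = Callable[[str], list[str]]


def local_keyword_search(query: str) -> list[str]:
    """Hypothetical stand-in for local BM25+TF-IDF: naive token-overlap match."""
    catalog = ["list_events", "get_employee", "list_time_off"]
    tokens = set(query.lower().split())
    return [name for name in catalog if tokens & set(name.split("_"))]


def make_tool_search(semantic_search: Optional[SearchFn] = None) -> SearchFn:
    """Return a search function, preferring semantic search when one is supplied."""
    if semantic_search is None:
        # No client passed: use local search only.
        return local_keyword_search

    def search(query: str) -> list[str]:
        try:
            return semantic_search(query)
        except ConnectionError:
            # Assumed behavior: degrade to local search if the backend is unreachable.
            return local_keyword_search(query)

    return search
```

Keeping the semantic backend an explicit argument makes the default path free of network calls, which matches the "pass semantic_client explicitly" wording in the updated docstring.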