This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
```bash
# Using uv (recommended)
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -e .

# Using pip
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e .
```

- Keep Python dependency specifiers exact where this repository already pins them, including test extras and build-system.
- If `pyproject.toml` changes affect resolved packages, verify the result through the lock/install path used by CI.
- Do not introduce floating versions in CI or release automation when exact pins are practical.
```bash
# Basic run with stdio transport
python src/codealive_mcp_server.py

# With debug mode enabled
python src/codealive_mcp_server.py --debug

# With SSE transport
python src/codealive_mcp_server.py --transport sse --host 0.0.0.0 --port 8000

# With custom API key and base URL
python src/codealive_mcp_server.py --api-key YOUR_KEY --base-url https://custom.url

# Build Docker image
docker build -t codealive-mcp .

# Run with Docker
docker run --rm -i -e CODEALIVE_API_KEY=your_key_here codealive-mcp
```

After making local changes, quickly verify everything works:
```bash
# Using make (recommended)
make smoke-test

# Or directly
python smoke_test.py

# With valid API key for full testing
CODEALIVE_API_KEY=your_key python smoke_test.py
```

The smoke test:
- ✓ Verifies server starts and connects via stdio
- ✓ Checks all tools are registered correctly
- ✓ Tests each tool responds appropriately
- ✓ Validates parameter handling
- ✓ Runs in ~5 seconds
Run comprehensive unit tests with pytest:
```bash
# Using make
make unit-test

# Or directly
pytest src/tests/ -v

# With coverage
pytest src/tests/ -v --cov=src
```

Run both smoke tests and unit tests:

```bash
make test
```

This is a Model Context Protocol (MCP) server that provides AI clients with access to CodeAlive's semantic code search and analysis capabilities.
- `codealive_mcp_server.py`: Main entry point — bootstraps logging and tracing, registers tools and middleware
- Eight tools: `get_data_sources`, `semantic_search`, `grep_search`, `fetch_artifacts`, `get_artifact_relationships`, `chat`, `codebase_search`, `codebase_consultant`
- `core/client.py`: `CodeAliveContext` dataclass + `codealive_lifespan` (`httpx.AsyncClient` lifecycle, `_server_ready` flag)
- `core/logging.py`: loguru structured JSON logging + PII masking + OTel context injection
- `core/observability.py`: OpenTelemetry TracerProvider setup with OTLP export
- `middleware/`: `N8NRemoveParametersMiddleware` (strips n8n extra params) + `ObservabilityMiddleware` (OTel spans per tool call)
- FastMCP Framework: Uses FastMCP 3.x with lifespan context, middleware hooks, and built-in `Client` for testing
- HTTP Auth via `get_http_headers`: FastMCP 3.x strips the `authorization` header by default (to prevent accidental credential forwarding to downstream services). Our `get_api_key_from_context()` in `core/client.py` must use `get_http_headers(include={"authorization"})` to read Bearer tokens from HTTP/streamable-http clients. Do not remove the `include=` parameter — without it, all HTTP-transport clients (LibreChat, n8n, etc.) will fail with a misleading STDIO-mode error.
- HTTP Client Management: Single persistent `httpx.AsyncClient` with connection pooling, created in lifespan
- Streaming Support: `chat` and the deprecated `codebase_consultant` alias use SSE streaming (`response.aiter_lines()`) for chat completions
- Environment Configuration: Supports both .env files and command-line arguments with precedence
- Error Handling: Centralized in `utils/errors.py` — all tools use `handle_api_error()` with `method=` prefix
- N8N Middleware: Strips extra parameters (sessionId, action, chatInput, toolCallId) from n8n tool calls before validation
- Observability Middleware: Wraps every `tools/call` in an OTel span with GenAI semantic conventions
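The SSE streaming used by the chat tools can be illustrated with a minimal line-level sketch. This is not the repo's implementation (which iterates `response.aiter_lines()` on a live httpx stream); the helper name and chunk shape here are assumptions used only to show the `data:` protocol handling:

```python
import json

def collect_sse_chunks(lines: list[str]) -> list[dict]:
    """Hypothetical helper: gather JSON payloads from SSE `data:` lines."""
    chunks = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank keep-alives, comments, event names
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # OpenAI-style end-of-stream sentinel
        chunks.append(json.loads(payload))
    return chunks

events = collect_sse_chunks(['data: {"delta": "Hello"}', "data: [DONE]"])
# events == [{"delta": "Hello"}]
```

In the real tool the same loop runs over `response.aiter_lines()`, which is why nothing may consume the response body before iteration starts.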
- AI client connects to MCP server via stdio/HTTP transport
- Client calls tools (`get_data_sources` → `semantic_search`/`grep_search` → `fetch_artifacts`/`get_artifact_relationships` → `chat` only if synthesis is still needed)
- Middleware chain runs: N8N cleanup → ObservabilityMiddleware (OTel span + log correlation)
- Tool translates MCP call to CodeAlive API request (with `X-CodeAlive-*` headers)
- Response parsed, formatted as XML or text, returned to AI client
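The N8N cleanup step in the flow above amounts to dropping a fixed set of keys before argument validation. A minimal sketch of that idea (function and constant names here are illustrative, not the repo's middleware API):

```python
# Keys that n8n's AI Agent node injects into every tool call
_N8N_EXTRA_KEYS = {"sessionId", "action", "chatInput", "toolCallId"}

def strip_n8n_params(arguments: dict) -> dict:
    """Return a copy of the tool-call arguments without n8n's extras."""
    return {k: v for k, v in arguments.items() if k not in _N8N_EXTRA_KEYS}

cleaned = strip_n8n_params({"query": "auth flow", "sessionId": "abc", "action": "run"})
# cleaned == {"query": "auth flow"}
```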
CODEALIVE_API_KEY: Required API key for CodeAlive serviceCODEALIVE_BASE_URL: API base URL (defaults to https://app.codealive.ai)CODEALIVE_IGNORE_SSL: Set to disable SSL verification (debug mode)DEBUG_MODE: Set totrueto enable DEBUG-level loggingOTEL_EXPORTER_OTLP_ENDPOINT: If set, traces are exported via OTLP/HTTP to this endpoint
- Repository: Individual code repositories with URL and repository ID
- Workspace: Collections of repositories accessible via workspace ID
- Tool calls can target specific repositories or entire workspaces for broader context
The server is designed to integrate with:
- Claude Desktop/Code (via settings.json configuration)
- Cursor (via MCP settings panel)
- VS Code with GitHub Copilot (via settings.json)
- Continue (via config.yaml)
- n8n (via AI Agent node with MCP tools)
- Any MCP-compatible AI client
Key integration considerations:
- AI clients should use `get_data_sources` first to discover available repositories/workspaces, then default to `semantic_search` and `grep_search` for evidence gathering; use `chat` only as a slower synthesis fallback
- n8n Integration: The server includes middleware to automatically strip n8n's extra parameters (sessionId, action, chatInput, toolCallId) from tool calls, so n8n works out of the box without any special configuration
This project uses loguru for structured JSON logging. All logs go to stderr (safe for stdio MCP transport). Follow these rules strictly when writing or modifying code.
- Always use loguru, never print() or stdlib logging. Import: `from loguru import logger`. No `import logging` in application code — stdlib is intercepted via `_InterceptHandler` and routed through loguru automatically.
- All logs go to stderr. The stdio MCP transport uses stdout for protocol messages. Any stray `print()` or stdout write will corrupt the MCP protocol and break the client. If you add a new log sink, it must target `sys.stderr`.
- Never call `response.text` without a debug guard. `log_api_response()` is protected by `_is_debug_enabled()` because reading `response.text` consumes the response body. The `chat` tool and deprecated `codebase_consultant` alias stream SSE via `response.aiter_lines()` — calling `.text` first would silently consume the stream and produce empty results. If you add new response logging, always check `_is_debug_enabled()` first:

  ```python
  if not _is_debug_enabled():
      return  # Do NOT touch response body at INFO level
  ```

- Mask PII in logs. User queries, questions, messages, and response bodies must never appear in full in logs. Use `_sanitize_body()` for request bodies and truncate response bodies to `_RESPONSE_BODY_MAX_LEN`. The PII fields list is in `_PII_FIELDS` — extend it if you add tools that accept user content.
- Mask Authorization headers. Always replace `Authorization` header values with `"Bearer ***"` in logs. See the pattern in `log_api_request()`.
- Use structured logging fields, not string interpolation. Prefer `logger.bind(request_id=rid).debug("message")` or `logger.info("msg {key}", key=value)` over f-strings in the message. This makes logs machine-parseable.
- Use `logger.configure(patcher=...)` for global context injection (like OTel trace_id). Do NOT pass `patcher` to `logger.add()` — loguru 0.7.x does not support it there.
Every log record automatically gets trace_id and span_id injected by _otel_patcher (registered via logger.configure). The ObservabilityMiddleware also uses logger.contextualize(trace_id=..., tool=...) so all logs within a tool call carry the correlation ID. Do not duplicate this — it's automatic.
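The patcher contract is a function that mutates each log record in place. A stand-in sketch (the real `_otel_patcher` pulls IDs from the active OTel span; the placeholder values here are illustrative only):

```python
def demo_patcher(record: dict) -> None:
    """Runs for every log record; mutate record["extra"] in place."""
    record["extra"].setdefault("trace_id", "0" * 32)
    record["extra"].setdefault("span_id", "0" * 16)

# Registered once at startup:
#   from loguru import logger
#   logger.configure(patcher=demo_patcher)   # NOT logger.add(..., patcher=...)

record = {"extra": {}}
demo_patcher(record)
assert record["extra"]["trace_id"] == "0" * 32
```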
- TracerProvider is initialized once in `core/observability.py` via `init_tracing()`, called at startup in `main()`.
- If `OTEL_EXPORTER_OTLP_ENDPOINT` is set, traces export via OTLP/HTTP; otherwise a no-op provider is used (trace IDs still appear in logs).
- `atexit.register(provider.shutdown)` ensures pending spans are flushed on process exit. Do not skip this if modifying the init logic.
- HTTPX auto-instrumentation (`HTTPXClientInstrumentor`) injects `traceparent` headers into all outbound HTTP calls. Do not add manual propagation.
The ObservabilityMiddleware creates a span per tool call with these attributes:
- `gen_ai.operation.name` = `"execute_tool"`
- `gen_ai.tool.name` / `mcp.tool.name` = tool name
- `mcp.method` = `"tools/call"`
On errors, the span gets `StatusCode.ERROR` + `record_exception()`. Do not add redundant span creation inside tool functions — the middleware handles it.
When adding a new tool, ensure:
- The tool receives `ctx: Context` as its first argument (required for lifespan context and logging)
- API requests include all four headers: `X-CodeAlive-Integration`, `X-CodeAlive-Tool`, `X-CodeAlive-Client`, plus `Authorization`
- Call `log_api_request()` before and `log_api_response()` after the HTTP call
- Errors go through `handle_api_error(ctx, e, "description", method=_TOOL_NAME)` — this ensures the `[tool_name]` prefix in error messages
- The middleware automatically wraps the tool in an OTel span — no manual span creation needed
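The four-header shape from the checklist can be sketched as a small builder. The helper name and the header values are assumptions for illustration; only the header names come from this document:

```python
def build_headers(api_key: str, tool_name: str) -> dict:
    """Hypothetical helper showing the required header set for a new tool."""
    return {
        "Authorization": f"Bearer {api_key}",
        "X-CodeAlive-Integration": "mcp",        # illustrative value
        "X-CodeAlive-Tool": tool_name,
        "X-CodeAlive-Client": "claude-code",     # illustrative value
    }

headers = build_headers("secret", "semantic_search")
assert headers["X-CodeAlive-Tool"] == "semantic_search"
```

Remember that when logging these headers, the `Authorization` value must be masked to `"Bearer ***"` per the logging rules above.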
Tools that return search metadata (identifiers, match counts, line numbers) return a dict. FastMCP serializes it automatically via `pydantic_core.to_json`, which preserves Unicode — no manual `json.dumps()` needed. Examples: `semantic_search`, `grep_search`, `codebase_search`.
Tools that return source code content return an XML string. XML tags give the LLM clear structural boundaries between artifacts, content blocks, and relationships — this is critical for accurate reasoning over multi-artifact responses. Do not convert `fetch_artifacts` or `get_artifact_relationships` to dict/JSON — the XML structure is intentional.
If a tool's response is meant to be used as input to another MCP tool, the
response itself MUST embed a hint (or equivalent) directing the agent to that
follow-up tool. The hint should explain what to call next, with what value,
and why. Do NOT rely on the agent to remember workflow rules from the tool
description alone — descriptions are not always re-read mid-conversation, but
the response is always in front of the model when it decides what to do next.
Examples in this repo:

- `codebase_search` returns a `hint` field telling the agent that `description` is a triage pointer only and that real understanding must come from `fetch_artifacts(identifier)` or a local `Read(path)`. Implementation: `_SEARCH_HINT` in `src/utils/response_transformer.py`.
- `fetch_artifacts` emits a `<hint>…get_artifact_relationships…</hint>` element whenever an artifact has call relationships, telling the agent it can drill down further. Implementation: `_build_artifacts_xml` in `src/tools/fetch_artifacts.py`.
When you add or change a tool whose output is structurally a "pointer" to data held by another tool (identifiers, IDs, references), add or update the hint in the same change. If you remove a follow-up workflow, remove the stale hint too.
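The hint-embedding rule can be sketched in miniature. The function name and hint wording below are illustrative; the real text lives in `_SEARCH_HINT` in `src/utils/response_transformer.py`:

```python
def with_hint(results: list[dict]) -> dict:
    """Hypothetical wrapper: attach a follow-up hint to a search response."""
    return {
        "results": results,
        "hint": (
            "description is a triage pointer only — call "
            "fetch_artifacts(identifier) to read the actual source."
        ),
    }

payload = with_hint([{"identifier": "repo:src/auth.py#login", "description": "login flow"}])
assert "fetch_artifacts" in payload["hint"]
```

Because the hint travels inside the response body, it is in front of the model at exactly the moment it decides which tool to call next.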
The project has tests across four tiers: unit tests, e2e tool tests, smoke tests, and integration tests.
| Tier | Files | What it tests | How to run |
|---|---|---|---|
| Unit | `test_*.py` (except `test_e2e_*`) | Individual functions, XML builders, error handling, PII masking, middleware | `pytest src/tests/ -v` |
| E2E | `test_e2e_tools.py` | Full MCP protocol: Client → FastMCP → tool → mock HTTP → response | `pytest src/tests/test_e2e_tools.py -v` |
| Smoke | `smoke_test.py` | Real server startup via stdio, tool registration, basic invocations | `python smoke_test.py` |
| Integration | `integration_test.py` | All tools against live backend — verifies filtering params (max_results, paths, extensions, regex), relationship profiles, validation edges, full agent workflow | `CODEALIVE_API_KEY=... python integration_test.py` |
Integration tests require a valid `CODEALIVE_API_KEY` and network access. They auto-select the CodeAlive backend repo as target, or accept `--target <name>`. Use `make integration-test` as a shortcut.
E2E tests use FastMCP's built-in `Client` class with `httpx.MockTransport` for the backend. This is the canonical pattern — use it for all new tool tests:
```python
from contextlib import asynccontextmanager

import httpx
from fastmcp import Client, FastMCP

from core.client import CodeAliveContext


def _server(routes: dict) -> FastMCP:
    @asynccontextmanager
    async def lifespan(server):
        # Dispatch each mock request to the handler registered for its path
        def handler_dispatching_by_path(request: httpx.Request) -> httpx.Response:
            return routes[request.url.path](request)

        transport = httpx.MockTransport(handler_dispatching_by_path)
        async with httpx.AsyncClient(transport=transport, base_url="https://test.local") as client:
            yield CodeAliveContext(client=client, api_key="", base_url="https://test.local")

    mcp = FastMCP("Test", lifespan=lifespan)
    mcp.tool()(your_tool)
    return mcp


async def test_tool():
    mcp = _server({"/api/endpoint": lambda r: httpx.Response(200, json={...})})
    async with Client(mcp) as client:
        result = await client.call_tool("tool_name", {"arg": "value"})
        assert "expected" in result.content[0].text
```

Key points:
- `httpx.MockTransport` — no network, no external dependencies, tests run in < 1 second
- Custom lifespan yields a real `CodeAliveContext` with a mock-backed httpx client
- `monkeypatch.setenv("CODEALIVE_API_KEY", ...)` for `get_api_key_from_context` fallback
- Use `raise_on_error=False` when testing error paths, then assert on `result.content[0].text`
- For SSE streaming (`chat`/`codebase_consultant`), return `httpx.Response(200, text=sse_body)` — `aiter_lines()` works on buffered responses
- OTel tests: Do NOT use `trace.set_tracer_provider()` in tests — it's global and can only be called once per process. Instead, patch the module-level `_tracer` variable: `with patch("middleware.observability_middleware._tracer", test_tracer): ...`
- Logging level tests: Set `logging_module._current_level = "DEBUG"` in `setup_method`, restore to `"INFO"` in `teardown_method`.
- Avoid `InMemorySpanExporter` — it doesn't exist in current OTel SDK. Use a custom collector:

  ```python
  class _CollectingExporter(SpanExporter):
      def __init__(self):
          self.spans = []

      def export(self, spans):
          self.spans.extend(spans)
          return SpanExportResult.SUCCESS
  ```

- Never mock what you can test through the protocol. Prefer e2e tests with `Client(mcp)` over mocking `ctx`, `ctx.request_context`, etc. The e2e approach catches middleware issues, lifespan bugs, and serialization problems that mocks hide.
- Never consume `response.text` in tests and then test streaming. `aiter_lines()` works on MockTransport responses because httpx buffers the content, but calling `.text` first would consume it.
- Always test both success and error paths. Every tool should have at least: happy path, empty/invalid input, backend HTTP error.
- Use `monkeypatch` for env vars, not `os.environ` directly. This ensures cleanup even if a test fails.
- Mark async tests with `@pytest.mark.asyncio`. The project uses `asyncio_mode = "strict"` — unmarked async tests will be silently skipped.
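The last two gotchas combine into the minimal shape every new async test should take. The test name and body below are illustrative; only the marker, the `monkeypatch` fixture, and the env var name come from this document:

```python
import inspect

import pytest


@pytest.mark.asyncio
async def test_api_key_fallback(monkeypatch):
    # monkeypatch guarantees the env var is restored even if the test fails
    monkeypatch.setenv("CODEALIVE_API_KEY", "test-key")
    # ...exercise get_api_key_from_context() here (illustrative)...

# Without @pytest.mark.asyncio, strict mode would silently skip this test.
assert inspect.iscoroutinefunction(test_api_key_fallback)
```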
The version is declared in three places that MUST stay consistent:
| File | Field | Role |
|---|---|---|
| `pyproject.toml` | `[tool.setuptools_scm] fallback_version` | Source of truth. Used by setuptools-scm when no git tag is present. |
| `manifest.json` | `"version"` | MCP Registry manifest (Claude Desktop discovery). |
| `server.json` | `"version"` | MCP Registry server schema (registry listing). |
When bumping the version, update all three files in the same commit.
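A consistency check across the three locations can be sketched as below. This script is not part of the repo; the field paths follow the table above, and the parsing is deliberately simplified (a regex for `fallback_version`, `json` for the two manifests):

```python
import json
import re
from pathlib import Path

def read_versions(root: Path) -> dict[str, str]:
    """Collect the declared version from all three files (simplified parsing)."""
    pyproject = (root / "pyproject.toml").read_text()
    m = re.search(r'fallback_version\s*=\s*"([^"]+)"', pyproject)
    return {
        "pyproject.toml": m.group(1) if m else "?",
        "manifest.json": json.loads((root / "manifest.json").read_text())["version"],
        "server.json": json.loads((root / "server.json").read_text())["version"],
    }

def versions_consistent(versions: dict[str, str]) -> bool:
    """True when every file declares the same version string."""
    return len(set(versions.values())) == 1
```

Running such a check in CI (or a pre-commit hook) would catch a version bump that missed one of the three files.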
The project uses automated publishing:
- Trigger: Push version change to `main` branch
- Process: Tests → Build → Docker → MCP Registry → GitHub Release
- Result: Available at `io.github.codealive-ai/codealive-mcp` in MCP Registry
- Patch (0.2.0 → 0.2.1): Bug fixes, minor improvements
- Minor (0.2.0 → 0.3.0): New features, enhancements
- Major (0.2.0 → 1.0.0): Breaking changes, major releases
When implementing features or fixes, evaluate if they warrant a version bump for users to benefit from the changes through the MCP Registry.