DocMemory

Local search memory for Markdown document folders.

This is a personal project I built for searching converted Markdown documents from an agent workflow. I am sharing it because the local vector, DirectML, NPU, and CodeGraph workflow may save someone else some setup time.

DocMemory builds a SQLite index inside a target documentation folder. It supports:

  • SQLite FTS keyword search
  • optional local vector search with FastEmbed
  • hybrid keyword + vector search
  • experimental DirectML and OpenVINO/NPU vector builds
  • a read-only MCP server for agents

Use With CodeGraph

DocMemory pairs well with CodeGraph, a separate open-source tool for source code structure analysis. CodeGraph is not part of DocMemory; the two tools are complementary:

  • Use CodeGraph for source code structure, callers, callees, and impact analysis.
  • Use DocMemory for design docs, converted PDFs, specs, operation notes, and historical Markdown.
  • Ask the agent to compare both before making code changes.

Recommended agent workflow:

1. Use CodeGraph to locate the relevant code path.
2. Use DocMemory to find the matching design/spec documentation.
3. Compare code behavior against the document.
4. Report mismatches with file paths, doc paths, and line references.
5. Only then edit code or docs.

Example prompt:

Use CodeGraph to trace the call path for OrderService.
Then use DocMemory to search the docs for "order retry timeout".
Compare the implementation with the design docs and list any mismatches.
Do not edit files yet.

Another useful prompt:

Use CodeGraph to find what calls formatOrderPayload.
Use DocMemory to find the document that describes order payload fields.
Tell me whether the current code matches the latest documented field rules.

Token Efficiency

DocMemory is designed for agent workflows where loading an entire documentation folder into context is too expensive.

Instead of reading hundreds of Markdown files, an agent can call docmemory_search and receive only the most relevant snippets:

large doc folder -> ranked snippets -> smaller prompt

This usually saves tokens in three ways:

  • the agent avoids scanning unrelated files
  • search results include only focused chunks and line ranges
  • repeated questions reuse the local SQLite/vector index instead of re-reading raw docs
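
To make "focused chunks and line ranges" concrete, here is a minimal sketch of heading-based chunking. This README does not describe how DocMemory splits documents, so the function name `chunk_by_heading` and the splitting rule are assumptions for illustration only.

```python
def chunk_by_heading(text):
    """Split Markdown text into (start_line, end_line, chunk_text) tuples.

    Hypothetical rule: a new chunk starts at every line beginning with '#'.
    This is a simplification; it would also split on '#' lines inside
    fenced code blocks.
    """
    chunks = []
    current = None
    for lineno, line in enumerate(text.splitlines(), start=1):
        if current is None or line.startswith("#"):
            # Open a new chunk at the first line and at every heading.
            current = {"start": lineno, "end": lineno, "lines": [line]}
            chunks.append(current)
        else:
            # Extend the current chunk and move its end line forward.
            current["end"] = lineno
            current["lines"].append(line)
    return [(c["start"], c["end"], "\n".join(c["lines"])) for c in chunks]


if __name__ == "__main__":
    sample = "# Retry\nJobs retry three times.\n\n## Timeout\nDefault is 30s.\n"
    for start, end, body in chunk_by_heading(sample):
        print(f"lines {start}-{end}: {body.splitlines()[0]}")
```

Returning line ranges with each chunk lets an agent open only the relevant part of the original file when a snippet looks promising.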

Hypothetical context reduction:

context reduction ~= 1 - (tokens returned by search / tokens in full docs)

These numbers are examples, not benchmarks. They assume a large Markdown folder and snippets around 300 tokens each; actual results depend on your chunk size and document structure.

Document-only comparison:

| Approach | Context sent to agent | Approx. tokens | Approx. reduction vs full-load baseline |
| --- | --- | --- | --- |
| Load full docs | Entire Markdown folder | 500,000 | 0% |
| Search top 10 | 10 snippets x 300 tokens | 3,000 | 99.4% |
| Search top 5 | 5 snippets x 300 tokens | 1,500 | 99.7% |
| Search top 3 | 3 snippets x 300 tokens | 900 | 99.8% |
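
The arithmetic behind the document-only comparison can be checked with a few lines of Python. This is a sketch of the formula given earlier in this section; the function name `context_reduction` is made up for the example, and the 500,000-token folder and 300-token snippets are the same hypothetical figures used in the comparison.

```python
def context_reduction(tokens_returned, tokens_full):
    """Fraction of the full-load context avoided by sending only search results."""
    if tokens_full <= 0:
        raise ValueError("tokens_full must be positive")
    return 1 - tokens_returned / tokens_full


FULL_DOCS = 500_000     # hypothetical size of the whole Markdown folder
SNIPPET_TOKENS = 300    # hypothetical size of one returned snippet

if __name__ == "__main__":
    for top_n in (10, 5, 3):
        returned = top_n * SNIPPET_TOKENS
        reduction = context_reduction(returned, FULL_DOCS)
        print(f"top {top_n}: {returned:,} tokens, {reduction:.1%} reduction")
```

Swap in your own folder size and snippet length to estimate what a given `-n` limit would send to the agent.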

Code + docs workflow estimate:

| Workflow | Context sent to agent | Approx. tokens | Approx. reduction vs full-load baseline |
| --- | --- | --- | --- |
| Load code + docs directly | Source tree + documentation folder | 800,000 | 0% |
| CodeGraph only | Relevant code symbols and call paths | 8,000 | 99.0% |
| DocMemory only | Top 5 doc snippets | 1,500 | 99.8% |
| CodeGraph + DocMemory | Code path + top 5 doc snippets | 9,500 | 98.8% |

The combined workflow uses more tokens than either tool alone (in this estimate, 9,500 versus 8,000 for CodeGraph or 1,500 for DocMemory), but it answers a harder question: whether code and docs agree.

Compared with loading a whole documentation folder, search-first workflows can send far less context to the agent. The exact reduction depends on how many snippets and source files you open.

For best results, keep search limits small:

uv run --extra vector docmemory search <DOC_DIR> "background worker retry behavior" --hybrid -n 5

For agents, prefer CodeGraph for code context, then MCP search for docs, then open original files only when the snippet is relevant.

Workflow

  1. Convert or drop documents into a Markdown folder.
  2. Initialize the folder once to write DocMemory config.
  3. Sync when Markdown files change to rebuild the index.
  4. Search from the CLI, or let an agent search through MCP.

Markdown docs -> .docmemory/docmemory.sqlite -> CLI / MCP search

Install

Clone and run locally:

git clone https://github.com/kuchris/DocMemory.git
cd DocMemory
uv run docmemory --help

The examples use PowerShell because this project was built and tested on Windows.

Quick Start

Initialize an index inside a document folder:

uv run docmemory init -i <DOC_DIR>

Here, -i means "initialize this target folder". The command writes .docmemory/config.ini inside the document folder.

Build keyword + vector indexes:

uv run --extra vector docmemory sync <DOC_DIR> --vector

Search:

uv run --extra vector docmemory search <DOC_DIR> "payment retry design" --hybrid -n 5

Check status:

uv run docmemory status <DOC_DIR>

Search Modes

Keyword search uses SQLite FTS:

uv run docmemory search <DOC_DIR> "API-REFERENCE"
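
For readers unfamiliar with SQLite full-text search, the sketch below shows the general idea using Python's built-in `sqlite3` module. It is not DocMemory's code: it assumes the FTS5 flavor of SQLite FTS (this README only says "SQLite FTS"), assumes your Python's SQLite build includes FTS5, and uses a made-up `chunks` table with `path` and `body` columns.

```python
import sqlite3


def build_index(chunks):
    """Create an in-memory FTS5 table from (path, body) pairs (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # 'path' is stored but not tokenized; 'body' is the searchable text.
    conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(path UNINDEXED, body)")
    conn.executemany("INSERT INTO chunks (path, body) VALUES (?, ?)", chunks)
    return conn


def keyword_search(conn, query, limit=5):
    """Return matching paths, best BM25 score first."""
    # Quote the query as an FTS5 phrase so characters such as '-' in
    # "API-REFERENCE" are not parsed as query operators.
    phrase = '"' + query.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT path FROM chunks WHERE chunks MATCH ? "
        "ORDER BY bm25(chunks) LIMIT ?",
        (phrase, limit),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_index([
        ("api.md", "See the API-REFERENCE section for endpoints."),
        ("worker.md", "The background worker retries failed jobs."),
    ])
    print(keyword_search(conn, "API-REFERENCE"))
```

Keyword search like this is strongest for exact identifiers and document codes, which is why it stays the default mode.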

Vector-only search is useful when the query has few exact words:

uv run --extra vector docmemory search <DOC_DIR> "which document explains the background worker architecture" --vector -n 5
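
Conceptually, vector search ranks chunks by how close their embeddings are to the query embedding. The sketch below uses cosine similarity on tiny hand-written vectors; the similarity measure, the function names, and the three-dimensional vectors are all assumptions for illustration, since real embeddings come from the model and have hundreds of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def vector_search(query_vec, chunk_vecs, limit=5):
    """Return chunk ids ordered by descending similarity to the query."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunk_vecs.items()]
    # Sort by score (highest first), breaking ties by id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [chunk_id for _, chunk_id in scored[:limit]]


if __name__ == "__main__":
    toy_vectors = {  # hypothetical embeddings
        "worker.md": [0.9, 0.1, 0.0],
        "billing.md": [0.0, 0.2, 0.9],
    }
    print(vector_search([1.0, 0.0, 0.0], toy_vectors, limit=1))
```

Because the comparison happens in embedding space, a question phrased in natural language can match a chunk that shares none of its exact words.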

Hybrid search combines keyword and vector results:

uv run --extra vector docmemory search <DOC_DIR> "payment retry timeout design" --hybrid -n 5
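
This README does not say how the keyword and vector result lists are merged. One common technique is reciprocal rank fusion (RRF), sketched below as an illustration of the idea rather than a description of DocMemory's implementation; the function name and the constant `k=60` are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60, limit=5):
    """Merge several ranked lists of ids into one list.

    Each id earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for stable output.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:limit]


if __name__ == "__main__":
    keyword_hits = ["a.md", "b.md", "c.md"]   # hypothetical FTS results
    vector_hits = ["b.md", "d.md", "a.md"]    # hypothetical vector results
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

RRF only needs rank positions, so it avoids having to put BM25 scores and cosine similarities on a common scale.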

Vector Models

Default model:

sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

FastEmbed stores model files under:

.models/

Override the model cache if needed:

$env:DOCMEMORY_MODEL_DIR = "D:\models\docmemory"
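
The override behaves like a typical "environment variable, else default" lookup. The sketch below shows that pattern; the function `resolve_model_dir` is hypothetical and DocMemory's actual resolution logic may differ.

```python
import os
from pathlib import Path


def resolve_model_dir(project_dir, env=None):
    """Pick the model cache folder: env override first, else <project>/.models.

    Hypothetical helper; 'env' defaults to the real process environment.
    """
    env = os.environ if env is None else env
    override = env.get("DOCMEMORY_MODEL_DIR")
    if override:
        return Path(override)
    return Path(project_dir) / ".models"


if __name__ == "__main__":
    print(resolve_model_dir("."))
```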

Use the model name in commands, not the local cache folder name.

DirectML / GPU

The DirectML command is separate from the stable CPU command:

uv run --extra directml docmemory-dml sync <DOC_DIR> --vector --model sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

Defaults:

model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
batch: 32

The first embedding batch may pause while ONNX Runtime compiles the graph. Later batches print timing.

Probe DirectML before a long rebuild:

uv run --extra directml python scripts\probe_directml_fastembed.py

OpenVINO / NPU

The NPU path uses a separate Python environment at .venv-npu because onnxruntime-openvino may conflict with other ONNX Runtime builds.

Defaults:

model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
batch: 32
device: NPU
precision: FP16
parallel: 2
max chars: 600

Useful NPU knobs:

$env:DOCMEMORY_NPU_BATCH_SIZE = "32"
$env:DOCMEMORY_NPU_PARALLEL = "2"
$env:DOCMEMORY_NPU_MAX_CHARS = "600"
$env:DOCMEMORY_NPU_PRECISION = "FP16"

DOCMEMORY_NPU_MAX_CHARS trims long chunks before embedding to reduce wasted tokenization and inference work. The full text still stays in the SQLite text index.
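
As a rough illustration of trimming and batching, the sketch below cuts each chunk to a character limit and groups the results into fixed-size batches before they would be handed to the embedding model. The helper names are made up, and reading the limit from the environment this way is an assumption about how the knob is consumed.

```python
import os


def trim_for_embedding(chunks, max_chars=None):
    """Return copies of the chunks cut to max_chars; the originals are untouched."""
    if max_chars is None:
        # Assumed lookup: fall back to the documented default of 600.
        max_chars = int(os.environ.get("DOCMEMORY_NPU_MAX_CHARS", "600"))
    return [text[:max_chars] for text in chunks]


def batched(items, batch_size=32):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    chunks = ["x" * 1000, "short chunk"]
    trimmed = trim_for_embedding(chunks, max_chars=600)
    print([len(text) for text in trimmed])
    print([len(batch) for batch in batched(list(range(70)), batch_size=32)])
```

Only the text passed to the embedder is shortened, so keyword search can still match words that appear after the cutoff.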

NPU cache:

.models/openvino-cache/

Probe NPU:

<DOCMEMORY_DIR>\.venv-npu\Scripts\python.exe scripts\probe_npu_fastembed.py

Windows Launchers

Optional .bat launchers can be placed beside a document folder for one-click rebuilds.

Recommended launcher behavior:

  • rebuild vectors for the folder containing the .bat
  • keep the database inside that folder at .docmemory/docmemory.sqlite
  • use MiniLM for daily sync
  • keep NPU/GPU launchers separate from the stable CPU launcher

MCP Server

DocMemory includes a read-only MCP server for agents.

Tools:

docmemory_status
docmemory_search

Example Codex config:

[mcp_servers.docmemory]
command = "uv"
args = ["run", "--directory", "<DOCMEMORY_DIR>", "--extra", "mcp", "docmemory-mcp"]
enabled = true

[mcp_servers.docmemory.env]
DOCMEMORY_TARGET = "<DOC_DIR>"

Use CLI or .bat files to rebuild vectors. Use MCP for agent search.

Ignore Folders

Ignored by default:

.docmemory
_history
.git
.svn
__pycache__
node_modules

Add more ignored folders during init:

uv run docmemory init -i <DOC_DIR> --ignore old --ignore backup
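
To show what ignoring folders means in practice, here is a minimal sketch of a folder walk that skips the default names listed above plus any extras. The function `iter_markdown_files` is hypothetical, and matching on the folder name alone (at any depth) is an assumption about the behavior.

```python
import os
from pathlib import Path

# Default ignore list copied from this README.
DEFAULT_IGNORES = {".docmemory", "_history", ".git", ".svn", "__pycache__", "node_modules"}


def iter_markdown_files(doc_dir, extra_ignores=()):
    """Yield .md files under doc_dir, skipping ignored folder names."""
    ignored = DEFAULT_IGNORES | set(extra_ignores)
    for root, dirs, files in os.walk(doc_dir):
        # Editing dirs in place stops os.walk from descending into ignored folders.
        dirs[:] = sorted(d for d in dirs if d not in ignored)
        for name in sorted(files):
            if name.lower().endswith(".md"):
                yield Path(root) / name


if __name__ == "__main__":
    for path in iter_markdown_files(".", extra_ignores=["old", "backup"]):
        print(path)
```

Pruning at the directory level keeps large folders such as node_modules from being scanned at all.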

Storage Layout

The database is stored inside the target folder:

<DOC_DIR>/.docmemory/docmemory.sqlite

Model files are stored inside the DocMemory project by default:

<DOCMEMORY_DIR>/.models/

License

Apache-2.0. See LICENSE.

Support

If this project saves you time, please consider giving it a GitHub star. It helps other people find the repo.
