GraphMemory - GraphRAG Database

An embedded graph database for RAG and knowledge graph applications, powered by DuckDB. Vector similarity search, full-text search, hybrid search, merge/upsert, graph traversal, and a full GraphRAG retrieval pipeline — all in a single Python package.

Features

Vector Search — HNSW-indexed nearest neighbors (L2, cosine, inner product)
Full-Text Search — BM25-scored search across node properties
Hybrid Search — Combined vector + text with configurable weights
GraphRAG — Retrieval pipeline: hybrid search → graph expansion → context assembly → LLM Q&A
Merge / Upsert — Deduplicate nodes by property keys and edges by (source, target, relation)
Query Builder — Fluent, parameterized API with multi-hop traversal
DSPy Extraction — Entity/relationship extraction from text via DSPy (optional)
Graph Algorithms — PageRank, centrality, components via NetworkX (optional)
Import / Export — JSON, CSV, GraphML
Visualizer — Interactive D3.js force-directed graph in the browser
Thread-Safe — Connection pooling, transactions, automatic retry with exponential backoff
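
The retry behavior named in the last feature can be pictured with a small standalone sketch. This is not GraphMemory's internal code: the helper name `retry_with_backoff`, its parameters, and the jitter range are all assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.05,
    max_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying failures with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:  # a real implementation would catch only transient errors
            attempt += 1
            if attempt >= max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles per failure (base, 2*base, 4*base, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from threads that failed at the same time.
            sleep(delay * random.uniform(0.5, 1.0))


# Usage: an operation that fails twice, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("database is locked")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = retry_with_backoff(flaky, sleep=delays.append)
```

Passing `sleep=delays.append` keeps the example instant and makes the delay schedule easy to inspect.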

Installation

pip install graphmemory

# Optional
pip install graphmemory[extraction]   # DSPy extraction
pip install graphmemory[algorithms]   # NetworkX algorithms

Quick Start

from graphmemory import GraphMemory, Node, Edge

graph = GraphMemory(database="graph.db", vector_length=3, distance_metric="cosine")

# Insert nodes
alice = Node(type="Person", properties={"name": "Alice", "role": "engineer"}, vector=[0.1, 0.8, 0.3])
bob = Node(type="Person", properties={"name": "Bob", "role": "manager"}, vector=[0.2, 0.7, 0.4])
graph.insert_node(alice)
graph.insert_node(bob)

# Insert edge
graph.insert_edge(Edge(source_id=alice.id, target_id=bob.id, relation="reports_to", weight=1.0))

# Vector search
nearest = graph.nearest_nodes(vector=[0.1, 0.8, 0.3], limit=5)

# Full-text search
results = graph.search_nodes("engineer", limit=10)

# Hybrid search
results = graph.hybrid_search("engineer", query_vector=[0.1, 0.8, 0.3], text_weight=0.5, vector_weight=0.5)

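# Context manager (closes the connection on exit)
with GraphMemory(database="graph.db", vector_length=3) as graph:
    carol = Node(type="Person", properties={"name": "Carol", "role": "director"}, vector=[0.3, 0.6, 0.5])
    graph.insert_node(carol)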
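
For readers unfamiliar with the three `distance_metric` options, the plain-Python sketch below shows what each one measures on the Quick Start vectors. The function names are illustrative and not part of GraphMemory; the exact formulas the underlying index uses (for example, how it signs inner product) are not confirmed by this README.

```python
import math
from typing import Sequence


def _check(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: 0 for identical vectors, grows with separation."""
    _check(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product: larger means more similar (sensitive to magnitude)."""
    _check(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity: compares direction only, ignoring magnitude."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


query = [0.1, 0.8, 0.3]
alice_vec = [0.1, 0.8, 0.3]
bob_vec = [0.2, 0.7, 0.4]

# The query equals Alice's vector, so Alice is nearer than Bob under L2 and cosine.
l2_to_alice = l2_distance(query, alice_vec)
l2_to_bob = l2_distance(query, bob_vec)
cos_to_alice = cosine_distance(query, alice_vec)
cos_to_bob = cosine_distance(query, bob_vec)
```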

Usage

Query Builder

# Filter by type and properties
results = graph.query().match(type="Person").where(role="engineer").execute()

# Multi-hop traversal
results = graph.query().traverse(source_id=alice.id, depth=2).execute()

# Paginate and order
results = graph.query().match(type="Person").order_by("name").limit(10).offset(0).execute()

# Query edges
edges = graph.query().match(type="Person").edges().execute()
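
As a rough mental model of multi-hop traversal, the sketch below runs a breadth-first expansion over an in-memory adjacency map and records each node's depth and path, mirroring the `TraversalResult` fields. It is a standalone illustration under assumptions: the query builder's actual SQL, edge direction handling, and whether the source node appears in results are not described in this README.

```python
from collections import deque
from typing import Dict, List, Tuple


def traverse(
    adjacency: Dict[str, List[str]], source_id: str, depth: int
) -> List[Tuple[str, int, List[str]]]:
    """Breadth-first expansion up to `depth` hops.

    Returns (node, depth, path) tuples; the source node itself is excluded.
    """
    results = []
    visited = {source_id}
    queue = deque([(source_id, 0, [source_id])])
    while queue:
        node, hops, path = queue.popleft()
        if hops > 0:
            results.append((node, hops, path))
        if hops == depth:
            continue  # do not expand beyond the requested depth
        for neighbor in adjacency.get(node, []):
            if neighbor not in visited:  # each node is reported once, at its shortest depth
                visited.add(neighbor)
                queue.append((neighbor, hops + 1, path + [neighbor]))
    return results


# Hypothetical chain: alice -> bob -> carol -> dave
adjacency = {"alice": ["bob"], "bob": ["carol"], "carol": ["dave"]}
hops = traverse(adjacency, "alice", depth=2)
# dave is three hops away, so depth=2 leaves it out.
```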

Merge / Upsert

Insert-or-update nodes matched by property keys. Edges deduplicate on (source_id, target_id, relation).

from graphmemory import MergeStrategy

# Insert if no match, update if "name" matches an existing Person node
result = graph.merge_node(alice, match_keys=["name"])
print(result.created)  # True = inserted, False = updated

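# Bulk merge with strategy
nodes = [alice, bob]  # Node objects from Quick Start
results = graph.bulk_merge_nodes(nodes, match_keys=["name"], strategy=MergeStrategy.UPDATE)

# Edge merge
edge = Edge(source_id=alice.id, target_id=bob.id, relation="reports_to", weight=1.0)
edges = [edge]
result = graph.merge_edge(edge)
results = graph.bulk_merge_edges(edges)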

Strategy	Behavior
`UPDATE`	Shallow merge: existing keys absent from the incoming node are preserved; incoming keys are added or overwrite existing values (default)
`REPLACE`	Incoming properties fully replace existing
`KEEP`	Existing properties unchanged; only new nodes inserted
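
The three strategies in the table above can be illustrated on plain dictionaries. This is a sketch of the described behavior, not the library's code: `MergeStrategySketch` and `merge_properties` are hypothetical stand-ins, and the real `MergeStrategy` enum values may differ.

```python
from enum import Enum
from typing import Any, Dict


class MergeStrategySketch(Enum):
    """Hypothetical stand-in for graphmemory.MergeStrategy."""

    UPDATE = "update"
    REPLACE = "replace"
    KEEP = "keep"


def merge_properties(
    existing: Dict[str, Any],
    incoming: Dict[str, Any],
    strategy: MergeStrategySketch = MergeStrategySketch.UPDATE,
) -> Dict[str, Any]:
    """Return the properties a matched node would end up with."""
    if strategy is MergeStrategySketch.UPDATE:
        # Shallow merge: incoming values win on key conflicts, other keys survive.
        return {**existing, **incoming}
    if strategy is MergeStrategySketch.REPLACE:
        return dict(incoming)
    if strategy is MergeStrategySketch.KEEP:
        return dict(existing)
    raise ValueError(f"unknown strategy: {strategy!r}")


existing = {"name": "Alice", "role": "engineer", "location": "Berlin"}
incoming = {"name": "Alice", "role": "manager", "team": "ML"}

updated = merge_properties(existing, incoming, MergeStrategySketch.UPDATE)
replaced = merge_properties(existing, incoming, MergeStrategySketch.REPLACE)
kept = merge_properties(existing, incoming, MergeStrategySketch.KEEP)
```

With `UPDATE`, `role` is overwritten, `team` is added, and `location` survives; `REPLACE` drops `location`; `KEEP` ignores the incoming values entirely.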

GraphRAG Retrieval

Full pipeline: hybrid search → multi-hop graph expansion → token-aware context assembly → LLM generation.

# `embedding` is a query vector you supply; its length must match vector_length.
# `my_llm` is a callable you supply that invokes your LLM.

# Retrieve context
result = graph.retrieve(query="Who leads ML?", query_vector=embedding, max_hops=2, max_tokens=4000)
print(result.context_text)      # Prompt-ready string
print(result.token_estimate)    # Token count estimate

# End-to-end Q&A
answer = graph.ask(query="Who leads ML?", query_vector=embedding, llm_callable=my_llm)
print(answer["answer"])
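
The "token-aware context assembly" step can be sketched as a greedy packer that adds the highest-scoring snippets until a token budget is reached. Everything here is an assumption for illustration: the four-characters-per-token heuristic, the greedy skip rule, and the function names are not taken from GraphMemory's implementation.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    # Rough heuristic (about 4 characters per token); not the library's estimator.
    return max(1, len(text) // 4)


def assemble_context(
    snippets: List[Tuple[float, str]], max_tokens: int
) -> Tuple[str, int]:
    """Greedily pack snippets by descending score without exceeding max_tokens.

    Returns the joined context text and the estimated tokens used.
    """
    chosen: List[str] = []
    used = 0
    for _score, text in sorted(snippets, key=lambda item: item[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost > max_tokens:
            continue  # skip snippets that do not fit; smaller ones may still fit
        chosen.append(text)
        used += cost
    return "\n\n".join(chosen), used


snippets = [
    (0.9, "Alice leads the ML team."),
    (0.4, "Bob manages platform engineering and reports to Carol."),
    (0.7, "Alice reports to Carol."),
]
context_text, token_estimate = assemble_context(snippets, max_tokens=12)
```

With a budget of 12 estimated tokens, the two highest-scoring snippets fit and the longer, lower-scoring one is left out.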

DSPy Extraction

Requires pip install graphmemory[extraction]. Uses DSPy typed predictors to extract entities and relationships from text.

from graphmemory.extraction import extract_and_store, extract_and_merge
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

text = """George Washington was the first President. Thomas Jefferson
served as Secretary of State under Washington."""

# Extract and insert (may create duplicates on repeated calls)
nodes, edges = extract_and_store(graph, text)

# Extract and merge (deduplicates against existing graph)
node_results, edge_results = extract_and_merge(graph, text, match_keys=["name"])

Function	Description
`extract_nodes(text)`	Extract entity nodes from text
`extract_edges(text, nodes)`	Extract relationships between known nodes
`extract(text)`	Extract both nodes and edges
`extract_and_store(graph, text)`	Extract and insert into graph
`extract_and_merge(graph, text, match_keys)`	Extract and merge (deduplicated)

Graph Algorithms

Requires pip install graphmemory[algorithms]. Powered by NetworkX.

from graphmemory.algorithms import pagerank, betweenness_centrality, connected_components, to_networkx

scores = pagerank(graph)
centrality = betweenness_centrality(graph)
components = connected_components(graph)
G = to_networkx(graph)  # Export to NetworkX DiGraph

Function	Description
`pagerank(graph, alpha=0.85)`	PageRank scores for all nodes
`betweenness_centrality(graph)`	Betweenness centrality scores
`degree_distribution(graph)`	In/out/total degree per node
`connected_components(graph)`	Weakly connected components (largest first)
`to_networkx(graph)`	Export to `networkx.DiGraph`
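
To show what the `alpha` parameter of `pagerank` controls, here is a dependency-free power-iteration sketch. GraphMemory delegates to NetworkX, so this is not the code path the library runs; `pagerank_scores` and the sample edges are illustrative only.

```python
from typing import Dict, List, Tuple


def pagerank_scores(
    edges: List[Tuple[str, str]],
    alpha: float = 0.85,
    iterations: int = 100,
    tol: float = 1e-10,
) -> Dict[str, float]:
    """PageRank by power iteration over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    if not nodes:
        return {}
    out_links: Dict[str, List[str]] = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        # (1 - alpha) is the chance of jumping to a random node instead of following a link.
        new_rank = {node: (1.0 - alpha) / n + alpha * dangling / n for node in nodes}
        for node in nodes:
            for target in out_links[node]:
                new_rank[target] += alpha * rank[node] / len(out_links[node])
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# carol is linked to by both alice and bob; carol links back to alice.
sample_edges = [("alice", "carol"), ("bob", "carol"), ("carol", "alice")]
sample_scores = pagerank_scores(sample_edges)
top_node = max(sample_scores, key=sample_scores.get)
```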

Import / Export

# Export
data = graph.export_graph(format="json")       # also: "csv", "graphml", "json_string"

# Import
graph.import_graph(data, format="json")

Visualizer

Interactive D3.js force-directed graph visualization — opens in your browser with zero dependencies.

# Open in browser
graph.visualize()

# Save to file
graph.visualize(output="my_graph.html", open_browser=False)

Features: drag nodes, zoom/pan, hover to highlight connections, click for detail panel, search bar, filter by node type.

Data Models

Model	Fields
`Node`	`id: UUID`, `type: str`, `properties: dict`, `vector: list[float]`
`Edge`	`id: UUID`, `source_id: UUID`, `target_id: UUID`, `relation: str`, `weight: float`
`NearestNode`	`node: Node`, `distance: float`
`SearchResult`	`node: Node`, `score: float`
`TraversalResult`	`node: Node`, `depth: int`, `path: list[UUID]`
`MergeResult`	`node: Node`, `created: bool`
`EdgeMergeResult`	`edge: Edge`, `created: bool`
`RetrievalResult`	`query: str`, `contexts: list`, `context_text: str`, `token_estimate: int`

All IDs are auto-generated UUIDs. All models are Pydantic BaseModel instances.
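
The auto-generated ID behavior can be pictured with standard-library dataclasses. The real models are Pydantic classes; `NodeSketch` and `EdgeSketch` below are simplified stand-ins, and the default values shown (empty properties, no vector, weight 1.0) are assumptions rather than documented defaults.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from uuid import UUID, uuid4


@dataclass
class NodeSketch:
    """Simplified stand-in for graphmemory.Node."""

    type: str
    properties: Dict[str, Any] = field(default_factory=dict)
    vector: Optional[List[float]] = None
    id: UUID = field(default_factory=uuid4)  # a fresh UUID for every instance


@dataclass
class EdgeSketch:
    """Simplified stand-in for graphmemory.Edge."""

    source_id: UUID
    target_id: UUID
    relation: str
    weight: float = 1.0  # assumed default
    id: UUID = field(default_factory=uuid4)


alice_node = NodeSketch(type="Person", properties={"name": "Alice"}, vector=[0.1, 0.8, 0.3])
bob_node = NodeSketch(type="Person", properties={"name": "Bob"}, vector=[0.2, 0.7, 0.4])
reports_to = EdgeSketch(source_id=alice_node.id, target_id=bob_node.id, relation="reports_to")
```

Because the ID exists as soon as the object is constructed, an edge can reference `alice_node.id` before either node has been inserted.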

API Reference

Connection

Method	Description
`GraphMemory(database=None, vector_length=3, distance_metric='l2', hnsw_ef_construction=128, hnsw_ef_search=64, hnsw_m=16, auto_index=True)`	Initialize. `None` = in-memory. HNSW index auto-created.
`close()`	Close connection (thread-safe, idempotent).
`transaction()`	Context manager for atomic operations.

Nodes

Method	Description
`insert_node(node) -> UUID`	Insert a node.
`bulk_insert_nodes(nodes) -> list[Node]`	Bulk insert.
`merge_node(node, match_keys, strategy=UPDATE) -> MergeResult`	Insert or update by property match.
`bulk_merge_nodes(nodes, match_keys, ...) -> list[MergeResult]`	Bulk merge.
`get_node(node_id) -> Node`	Get by ID.
`update_node(node_id, **kwargs) -> bool`	Update fields.
`delete_node(node_id)`	Delete node and its edges.
`bulk_delete_nodes(node_ids)`	Bulk delete.
`nodes_by_attribute(attr, value) -> list[Node]`	Query by property.

Edges

Method	Description
`insert_edge(edge)`	Insert an edge.
`bulk_insert_edges(edges)`	Bulk insert.
`merge_edge(edge) -> EdgeMergeResult`	Insert or update by (source, target, relation).
`bulk_merge_edges(edges) -> list[EdgeMergeResult]`	Bulk merge.
`get_edge(edge_id) -> Edge`	Get by ID.
`update_edge(edge_id, **kwargs) -> bool`	Update fields.
`delete_edge(source_id, target_id)`	Delete by endpoints.
`bulk_delete_edges(edge_ids)`	Bulk delete.
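
Edge merging keys on the triple (source, target, relation). A minimal sketch of that idea, using a dictionary as the edge store, is shown below; `merge_edge_sketch` is a hypothetical helper, and updating the weight on a match is an assumption about what "update" means for edges.

```python
from typing import Dict, Tuple

EdgeKey = Tuple[str, str, str]  # (source_id, target_id, relation)


def merge_edge_sketch(
    store: Dict[EdgeKey, float],
    source_id: str,
    target_id: str,
    relation: str,
    weight: float = 1.0,
) -> bool:
    """Insert the edge if its key is new, otherwise update its weight.

    Returns True when a new edge was created, False when one was updated.
    """
    key = (source_id, target_id, relation)
    created = key not in store
    store[key] = weight
    return created


edge_store: Dict[EdgeKey, float] = {}
first = merge_edge_sketch(edge_store, "alice", "bob", "reports_to", weight=1.0)
second = merge_edge_sketch(edge_store, "alice", "bob", "reports_to", weight=0.5)
third = merge_edge_sketch(edge_store, "alice", "bob", "mentors")
```

The second call matches the existing key and only changes the weight, while a different relation between the same pair of nodes counts as a separate edge.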

Search

Method	Description
`nearest_nodes(vector, limit) -> list[NearestNode]`	Vector similarity search.
`search_nodes(query_text, limit=10) -> list[SearchResult]`	Full-text BM25 search.
`hybrid_search(query_text, query_vector, ...) -> list[SearchResult]`	Combined text + vector search.
`create_index(ef_construction=None, ef_search=None, m=None)`	Create/recreate HNSW index with tunable params. Auto-called on init.
`compact_index()`	Compact HNSW index to reclaim space after deletions.
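
One common way to combine text and vector results with `text_weight` and `vector_weight` is a weighted sum of normalized scores. The sketch below shows that approach; it is an assumption about the general technique, since this README does not state how `hybrid_search` normalizes or fuses scores.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, every entry becomes 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def hybrid_scores(
    text_scores: Dict[str, float],
    vector_distances: Dict[str, float],
    text_weight: float = 0.5,
    vector_weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Weighted sum of normalized text scores and normalized vector similarity."""
    text_norm = min_max(text_scores)
    # Distances are "smaller is better", so negate them before normalizing.
    vector_norm = min_max({key: -dist for key, dist in vector_distances.items()})
    combined = {
        node: text_weight * text_norm.get(node, 0.0)
        + vector_weight * vector_norm.get(node, 0.0)
        for node in set(text_norm) | set(vector_norm)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-node BM25 scores and vector distances.
text_hits = {"alice": 2.0, "carol": 1.0}
vector_hits = {"alice": 0.0, "bob": 0.1, "carol": 0.4}
ranked = hybrid_scores(text_hits, vector_hits)
```

A node missing from one result list simply contributes 0 for that component, so `bob` still ranks on vector similarity alone.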

Retrieval

Method	Description
`retrieve(query, query_vector, ...) -> RetrievalResult`	Full GraphRAG retrieval pipeline.
`ask(query, query_vector, llm_callable, ...) -> dict`	Retrieval + LLM generation.

Traversal

Method	Description
`connected_nodes(node_id) -> list[Node]`	All nodes connected to a node.
`query() -> QueryBuilder`	Fluent query builder.

Import / Export

Method	Description
`export_graph(format='json')`	Export as JSON, CSV, GraphML, or JSON string.
`import_graph(data, format='json')`	Import from any supported format.
`visualize(output=None, open_browser=True) -> str`	Interactive D3.js graph visualization in the browser.

Examples

See examples/ for complete usage:

openai_example.py — OpenAI embeddings, similarity search, attribute queries
lexical_graph.py — Wikipedia text with SentenceTransformer embeddings
dspy_example_typed_pred.py — Knowledge graph extraction with DSPy

Testing

296 tests covering all functionality.

python3 -m pytest tests/tests.py -v

License

MIT License. See LICENSE.

Contributing

Contributions welcome — open an issue or submit a PR.
