High-Performance Multi-Model Database with Native AI/LLM Integration
Each badge links to a short explanation of what it shows and where to find the source of truth. See docs/en/badges for the full overview.
ThemisDB is a multi-model database (scientific research) that combines relational, graph, vector, and document models in a single system with full ACID transaction support. It is built on RocksDB for high performance and reliability.
"ThemisDB keeps its own llamas." Optional native LLM integration with llama.cpp for AI workloads directly in your database.
- ACID Transactions - Full snapshot isolation with MVCC
- Multi-Model - Relational, Graph, Vector, Document in one database
- High Performance - 45K writes/s, 120K reads/s, CPU-optimized vector search (GPU planned for v2.x)
- Enterprise Security - TLS 1.3, RBAC, field-level encryption, audit logging
- AI-Ready - Optional LLM engine, vector search, image analysis, voice assistant, autonomous prompt optimization
- Modern Protocols - HTTP/2, WebSocket, gRPC, MQTT, PostgreSQL Wire, GraphQL
- Modular Architecture (v1.4.0+) - Optional modular build for faster compilation and selective features
- Production Resilience (v1.4.1+) - Circuit breakers, auto-retry, 99.99% corruption detection, network timeouts
- Observability & Automation (v1.4.1+) - Health checks, alerting interface, automated backup scheduling (K8s-ready)
Full Documentation · Quick Start · FAQ · Release Notes · Project Structure
ThemisDB has comprehensive documentation for all 44 modules (139 files total) with production-ready standards:
Foundation Layer (7 modules)
- Core - ConcernsContext DI framework, ILogger/ITracer/IMetrics/ICache interfaces, adapter implementations (Future Enhancements)
- Storage - RocksDB MVCC wrapper, 7 blob backends (S3/Azure/GCS/MinIO/local/memory/hybrid), backup/PITR (Future Enhancements)
- Transaction - MVCC concurrency control, SAGA orchestration, deadlock detection (Future Enhancements)
- Themis - Core framework, module loading with X.509/GPG signatures, edition management (Future Enhancements)
- Base - Base utilities and common infrastructure (Future Enhancements)
- Utils - General utility functions and helpers (Future Enhancements)
- Config - Backward-compatible config path resolution, LRU caching, JSON Schema validation, Prometheus metrics (Future Enhancements)
Query & Index Layer (6 modules)
- Query - AQL parser/optimizer/executor, 100+ functions, CTE support (Future Enhancements)
- AQL - Multi-paradigm query language (based on ArangoDB AQL), LLM integration (INFER/RAG/EMBED), hybrid queries (Future Enhancements)
- Index - HNSW GPU vector search, B-tree, graph, spatial, adaptive indexes (Future Enhancements)
- Search - Full-text search with BM25 ranking (Future Enhancements)
- Temporal - Time-travel queries, AS OF, bitemporal support (Future Enhancements)
- TimeSeries - Time-series optimized storage and queries (Future Enhancements)
Security & Auth (3 modules)
- Security - AES-256-GCM encryption, Vault/HSM/PKI integration, RBAC, compliance (SOC 2/NIST/GDPR) (Future Enhancements)
- Auth - JWT, Kerberos/GSSAPI, MFA (TOTP), rate limiting (Future Enhancements)
- Governance - Data governance policies and compliance frameworks (Future Enhancements)
Server & Network (4 modules)
- Server - 7 protocols (HTTP/1.1/2/3, WebSocket, MQTT, PostgreSQL, gRPC), 40+ API handlers (Future Enhancements)
- Network - Wire protocol, connection pooling, TLS/mTLS, zero-copy I/O (Future Enhancements)
- API - REST API layer implementation (Future Enhancements)
- Sharding - Horizontal partitioning and distribution (Future Enhancements)
Intelligence Layer (6 modules)
- RAG - 23 components: RAG Judge (faithfulness/relevance/completeness), Knowledge Gap Detector, LLM Bridge, Bias Detector (Future Enhancements)
- LLM - LLM integration framework with llama.cpp (Future Enhancements)
- Analytics - OLAP (CUBE/ROLLUP), process mining, CEP, SIMD vectorization (4.5x-6.9x speedup) (Future Enhancements)
- Voice - NLU pipeline (STT→LLM→TTS), speaker diarization, meeting protocols (Future Enhancements)
- Prompt Engineering - Prompt template lifecycle (CRUD, versioning, A/B testing), self-improvement orchestrator, injection detection (Future Enhancements)
- Training - Domain-specific LLM fine-tuning: auto-labeling, incremental LoRA adapter training, knowledge graph enrichment (Future Enhancements)
Operations (4 modules)
- Performance - Cycle metrics, RCU lock-free reads, LIRS cache, mimalloc, feature flags (Future Enhancements)
- Observability - Prometheus integration, profiling, flame graphs, automated issue detection (Future Enhancements)
- Updates - Hot-reload (zero-downtime), schema migration, atomic rollback (Future Enhancements)
- Scheduler - Cron scheduling, 3-stage hybrid retention (Gorilla→Adaptive→Time-based, 99.9% compression) (Future Enhancements)
Data Integration (5 modules)
- Importers - Data import from various sources (Future Enhancements)
- Exporters - Data export to multiple formats (Future Enhancements)
- CDC - Change Data Capture for real-time data replication (Future Enhancements)
- Plugins - Plugin system for extensibility (Future Enhancements)
- Ingestion - Multi-source data intake (filesystem, HuggingFace, REST API), rate limiting, checkpointing, quarantine queue (Future Enhancements)
Distributed Systems (2 modules)
- Replication - Raft consensus, multi-master with vector clocks, WAL shipping, 50K-100K writes/sec (Future Enhancements)
- Sharding - Horizontal scaling and data distribution (Future Enhancements)
Specialized (4 modules)
- Graph - 5 traversal algorithms (BFS/DFS/Dijkstra/A*/Bidirectional), 12 constraint types (Future Enhancements)
- Chimera - Vendor-neutral CHIMERA benchmark adapter (Future Enhancements)
- Geo - Advanced geospatial features and queries (Future Enhancements)
- Acceleration - Hardware acceleration (GPU, SIMD, etc.) (Future Enhancements)
Utility (4 modules)
- Metadata - Schema introspection and system catalog (Future Enhancements)
- GPU - GPU utilities and memory management (Future Enhancements)
- Cache - Multi-level caching layer (Future Enhancements)
- Content - Content management utilities (Future Enhancements)
Each module includes enterprise-grade documentation:
- Module Purpose & Scope - Clear description with boundaries
- Key Components - Main classes, functions, and structures
- Architecture - Design patterns with ASCII diagrams
- Integration Points - Dependencies and module interactions
- API/Usage Examples - 50+ working code examples per major module
- Performance Characteristics - Benchmarks and tuning guides
- Known Limitations - Current constraints and workarounds
- Production Status - Readiness indicators
- Future Roadmap - Planned features with target versions
- Research Foundation - 100+ peer-reviewed paper citations
Total Documentation: 139 files · 500+ code examples · 80+ architecture diagrams · ~1MB technical content
flowchart LR
A[Client Request] --> B{Protocol}
B -->|REST/HTTP| C[HTTP Server]
B -->|gRPC| D[gRPC Server]
B -->|WebSocket| E[WebSocket Server]
C & D & E --> F[Authentication]
F --> G[Rate Limiting]
G --> H[Query Parser]
H --> I[Query Optimizer]
I --> J[Execution Engine]
J --> K{Operation Type}
K -->|Read| L[MVCC Read]
K -->|Write| M[Transaction]
K -->|Query| N[Index Lookup]
L & M & N --> O[Storage Layer]
O --> P[Response]
P --> Q[Client]
style A fill:#e1f5ff
style O fill:#ffe1e1
style Q fill:#e1ffe1
# Pull and run the latest version
docker pull themisdb/themisdb:latest
# Run with Docker
docker run -d \
--name themis \
-p 8080:8080 \
-p 18765:18765 \
-p 4318:4318 \
-v themis_data:/data \
themisdb/themisdb:latest
# Verify installation
curl http://localhost:8080/health
Default Ports:
- 8080 - HTTP/REST API, GraphQL
- 18765 - Binary Wire Protocol, gRPC
- 4318 - OpenTelemetry/Prometheus metrics
Complete Port Reference: See docs/de/deployment/PORT_REFERENCE.md
# Clone repository
git clone https://github.com/makr-code/ThemisDB.git
cd ThemisDB
# Initialize submodules (vcpkg, llama.cpp)
git submodule update --init --recursive
# Configure with a preset
cmake --preset community-release
# Build
cmake --build --preset community-release
# Start server
./build-community-release/bin/themis_server --config config.yaml
# Clone repository
git clone https://github.com/makr-code/ThemisDB.git
cd ThemisDB
# Setup and build (Linux/macOS)
./scripts/setup.sh
./scripts/build.sh
# Setup and build (Windows)
.\scripts\setup.ps1
.\scripts\build.ps1
# Start server
./build/themis_server --config config.yaml
Build Documentation:
- CMake Presets Guide - Use presets for simplified builds
- Cross-Compilation Guide - Build for ARM64, ARMv7, Windows
- Build Strategy Guide - Detailed build instructions
- Edition Comparison - Choose the right edition
Modular Build (v1.4.0+): Enable the modular architecture to work around Windows COFF symbol limits and improve build times:
cmake -B build -DTHEMIS_BUILD_MODULAR=ON
cmake --build build
See docs/architecture/MODULARIZATION_GUIDE.md for details.
graph TB
subgraph "Production Deployment"
subgraph "Edge Layer"
CDN[CDN/Edge Cache]
WAF[Web Application Firewall]
end
subgraph "Application Layer"
APP1[Client Application 1]
APP2[Client Application 2]
APP3[Client Application 3]
end
subgraph "Database Layer"
subgraph "ThemisDB Cluster"
DB1[ThemisDB Node 1<br/>Leader]
DB2[ThemisDB Node 2<br/>Follower]
DB3[ThemisDB Node 3<br/>Follower]
end
end
subgraph "Monitoring & Observability"
PROM[Prometheus]
GRAF[Grafana]
JAEGER[Jaeger Tracing]
end
subgraph "Backup & Recovery"
BACKUP[Backup Storage<br/>S3/Object Store]
end
end
CDN --> WAF
WAF --> APP1 & APP2 & APP3
APP1 & APP2 & APP3 --> DB1
DB1 -.Replication.-> DB2 & DB3
DB1 --> PROM
PROM --> GRAF
DB1 --> JAEGER
DB1 -.Backup.-> BACKUP
style DB1 fill:#e1ffe1
style DB2 fill:#e1ffe1
style DB3 fill:#e1ffe1
style PROM fill:#e1f5ff
style GRAF fill:#e1f5ff
Linux (Debian/Ubuntu):
# Download the latest release from GitHub
wget https://github.com/makr-code/ThemisDB/releases/latest/download/themisdb_amd64.deb
sudo apt install ./themisdb_amd64.deb
sudo systemctl start themisdb
macOS (Homebrew):
brew install themisdb
brew services start themisdb
Windows (Chocolatey):
choco install themisdb
graph TB
subgraph "Application Use Cases"
UC1[User Profiles<br/>Document Model]
UC2[Social Graph<br/>Graph Model]
UC3[Recommendations<br/>Vector Search]
UC4[Metrics<br/>Time-Series]
end
subgraph "ThemisDB Unified API"
API[Single API Endpoint]
end
subgraph "Query Processing"
PARSER[AQL Parser]
OPT[Query Optimizer]
end
subgraph "Execution Layer"
DOC[Document Engine]
GRAPH[Graph Engine]
VECTOR[Vector Engine]
TS[Time-Series Engine]
end
subgraph "Storage"
STORAGE[RocksDB<br/>Unified Key-Value Store]
end
UC1 --> API
UC2 --> API
UC3 --> API
UC4 --> API
API --> PARSER
PARSER --> OPT
OPT --> DOC
OPT --> GRAPH
OPT --> VECTOR
OPT --> TS
DOC --> STORAGE
GRAPH --> STORAGE
VECTOR --> STORAGE
TS --> STORAGE
style API fill:#e1f5ff
style STORAGE fill:#ffe1e1
# 1. Check server health
curl http://localhost:8080/health
# 2. Create an entity
curl -X PUT http://localhost:8080/entities/users:alice \
-H "Content-Type: application/json" \
-d '{"blob":"{\"name\":\"Alice\",\"age\":30,\"city\":\"Berlin\"}"}'
# 3. Create an index
curl -X POST http://localhost:8080/index/create \
-H "Content-Type: application/json" \
-d '{"table":"users","column":"city"}'
# 4. Query by index
curl -X POST http://localhost:8080/query \
-H "Content-Type: application/json" \
-d '{"table":"users","predicates":[{"column":"city","value":"Berlin"}],"return":"entities"}'
# 5. View metrics
curl http://localhost:8080/metrics
Learn More:
- 10-Minute Quickstart - Hello World and CRUD operations
- Examples Index - Browse 37+ examples by feature
- Learning Paths - Guided paths for different roles
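In the entity example above, the `blob` field is a JSON document serialized into a string, so the payload is JSON-encoded twice. The following is a minimal sketch of building that request body programmatically. The helper name `build_entity_request` is hypothetical and not part of any ThemisDB SDK, and the URL layout is taken only from the curl call shown above.

```python
import json

def build_entity_request(table, key, fields, base_url="http://localhost:8080"):
    """Return (url, body) for a PUT to /entities/<table>:<key>.

    Hypothetical helper: it mirrors the curl example, where "blob" holds
    a JSON document that has itself been serialized to a string.
    """
    url = f"{base_url}/entities/{table}:{key}"
    blob = json.dumps(fields, separators=(",", ":"))            # inner document
    body = json.dumps({"blob": blob}, separators=(",", ":"))    # outer envelope
    return url, body

url, body = build_entity_request(
    "users", "alice", {"name": "Alice", "age": 30, "city": "Berlin"}
)
print(url)
print(body)
```

The printed body can be passed to any HTTP client as the request payload with `Content-Type: application/json`.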
ThemisDB provides a comprehensive Schema Manager for database introspection and schema customization:
# Get all table schemas
curl http://localhost:8080/api/v1/schema
# Get specific table schema
curl http://localhost:8080/api/v1/schema/tables/users
# Create/update custom schema
curl -X PUT http://localhost:8080/api/v1/schema/products \
-H "Content-Type: application/json" \
-d '{
"name": "products",
"type": "relational",
"properties": [
{"name": "id", "type": "integer", "indexed": true, "nullable": false},
{"name": "name", "type": "string", "nullable": true},
{"name": "price", "type": "double", "nullable": false}
],
"indexes": [
{"name": "id", "type": "regular", "unique": true, "columns": ["id"]}
]
}'
# Partial update (PATCH)
curl -X PATCH http://localhost:8080/api/v1/schema/products \
-H "Content-Type: application/json" \
-d '{
"properties": [
{"name": "description", "type": "string", "nullable": true}
]
}'
# Get database capabilities
curl http://localhost:8080/api/v1/capabilities
Supported Schema Types:
- relational - Traditional table with structured columns
- document - Flexible document/JSON storage
- graph_node - Graph database nodes
- graph_edge - Graph database edges/relationships
- vector - Vector embeddings for AI/ML
Supported Property Types:
string, integer, double, boolean, vector, binary, null
Supported Index Types:
regular, range, sparse, geo, ttl, fulltext, composite
Features:
- Automatic schema discovery from data
- Custom schema definitions with validation
- Partial updates (PATCH)
- Persistent storage in RocksDB
- Thread-safe caching with 60s TTL
- Comprehensive validation (names, types, references)
More Info: Operations Handbook - Schema Management
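A schema definition can be sanity-checked on the client before it is sent. The sketch below checks a definition against the schema, property, and index types listed above. It is an illustration only: `validate_schema` is a hypothetical function, and the server's own validation rules may be stricter or different.

```python
SCHEMA_TYPES = {"relational", "document", "graph_node", "graph_edge", "vector"}
PROPERTY_TYPES = {"string", "integer", "double", "boolean", "vector", "binary", "null"}
INDEX_TYPES = {"regular", "range", "sparse", "geo", "ttl", "fulltext", "composite"}

def validate_schema(schema):
    """Return a list of problems; an empty list means the schema passed."""
    errors = []
    if not schema.get("name"):
        errors.append("schema name is missing")
    if schema.get("type") not in SCHEMA_TYPES:
        errors.append(f"unknown schema type: {schema.get('type')!r}")

    property_names = set()
    for prop in schema.get("properties", []):
        if prop.get("type") not in PROPERTY_TYPES:
            errors.append(f"property {prop.get('name')!r} has unknown type {prop.get('type')!r}")
        property_names.add(prop.get("name"))

    for index in schema.get("indexes", []):
        if index.get("type") not in INDEX_TYPES:
            errors.append(f"index {index.get('name')!r} has unknown type {index.get('type')!r}")
        # Reference check: every indexed column must be a declared property.
        for column in index.get("columns", []):
            if column not in property_names:
                errors.append(f"index {index.get('name')!r} references unknown column {column!r}")
    return errors

products = {
    "name": "products",
    "type": "relational",
    "properties": [
        {"name": "id", "type": "integer", "indexed": True, "nullable": False},
        {"name": "name", "type": "string", "nullable": True},
        {"name": "price", "type": "double", "nullable": False},
    ],
    "indexes": [
        {"name": "id", "type": "regular", "unique": True, "columns": ["id"]},
    ],
}
print(validate_schema(products))
```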
graph TB
subgraph "Client Layer"
C1[REST API]
C2[GraphQL]
C3[gRPC]
C4[Wire Protocol]
C5[Native SDKs]
end
subgraph "API & Server Layer"
S1[HTTP Server]
S2[Authentication]
S3[Rate Limiting]
S4[Load Shedding]
end
subgraph "Query Layer"
Q1[AQL Parser]
Q2[Query Optimizer]
Q3[Execution Engine]
Q4[Function Libraries]
Q5[CTE Cache]
Q6[Semantic Cache]
end
subgraph "Transaction & Concurrency Layer"
T1[MVCC]
T2[Transaction Manager]
T3[SAGA Coordinator]
T4[Deadlock Detection]
T5[WAL Management]
end
subgraph "Index Layer"
I1[Vector HNSW]
I2[Graph]
I3[Secondary]
I4[Spatial]
I5[Fulltext]
I6[GPU Acceleration]
I7[SIMD Optimization]
end
subgraph "Storage Layer"
ST1[RocksDB LSM-tree]
ST2[Key Schema]
ST3[Compression]
ST4[WAL]
ST5[Snapshot Management]
ST6[Compaction]
end
subgraph "Cross-Cutting Concerns"
X1[Security]
X2[Replication]
X3[Sharding]
X4[Monitoring]
X5[CDC]
end
C1 & C2 & C3 & C4 & C5 --> S1
S1 --> S2 --> S3 --> S4
S4 --> Q1 --> Q2 --> Q3
Q3 --> Q4 & Q5 & Q6
Q3 --> T1
T1 --> T2 --> T3
T2 --> T4 & T5
T3 --> I1 & I2 & I3 & I4 & I5
I1 & I2 --> I6 & I7
I1 & I2 & I3 & I4 & I5 --> ST1
ST1 --> ST2 & ST3 & ST4 & ST5 & ST6
ST1 -.-> X1 & X2 & X3 & X4 & X5
style I1 fill:#e1f5ff
style I2 fill:#e1f5ff
style I3 fill:#e1f5ff
style I4 fill:#e1f5ff
style I5 fill:#e1f5ff
style ST1 fill:#ffe1e1
style X1 fill:#fff3cd
style X2 fill:#fff3cd
style X3 fill:#fff3cd
style X4 fill:#fff3cd
style X5 fill:#fff3cd
- Relational: SQL-like queries with secondary indexes
- Graph: BFS, Dijkstra, A* traversals with path constraints
- Vector: HNSW and FAISS for similarity search (CPU-optimized, GPU via FAISS)
- Document: JSON storage with flexible schema
- Time-Series: Gorilla compression, continuous aggregates
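One common way to host several data models on a single ordered key-value store is to give each model its own key prefix, so that related records sort next to each other and can be read with a prefix scan. The sketch below illustrates that idea only. The key layout (`doc:`, `edge:`, `idx:`) and the `SortedKV` class are assumptions made for this example and do not describe ThemisDB's actual key schema.

```python
from bisect import bisect_left

class SortedKV:
    """Tiny in-memory ordered key-value store standing in for an LSM-tree."""

    def __init__(self):
        self._keys = []      # kept sorted so prefix scans are contiguous
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            self._keys.insert(bisect_left(self._keys, key), key)
        self._values[key] = value

    def scan_prefix(self, prefix):
        """Return all (key, value) pairs whose key starts with prefix, in key order."""
        start = bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            result.append((key, self._values[key]))
        return result

kv = SortedKV()
kv.put(b"doc:users:alice", b'{"city":"Berlin"}')     # document / row
kv.put(b"edge:follows:alice:bob", b"")               # graph edge alice -> bob
kv.put(b"edge:follows:alice:carol", b"")             # graph edge alice -> carol
kv.put(b"edge:follows:bob:alice", b"")               # graph edge bob -> alice
kv.put(b"idx:users:city:Berlin:alice", b"")          # secondary index entry

# Outgoing edges of alice are one contiguous key range.
neighbours = [key.rsplit(b":", 1)[1] for key, _ in kv.scan_prefix(b"edge:follows:alice:")]
print(neighbours)
```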
graph LR
subgraph "Unified Storage"
LSM[RocksDB LSM-Tree]
end
subgraph "Data Models"
REL[Relational Model<br/>Tables & Rows]
GRAPH[Graph Model<br/>Nodes & Edges]
VECTOR[Vector Model<br/>Embeddings]
DOC[Document Model<br/>JSON Documents]
TS[Time-Series<br/>Metrics & Events]
end
REL --> LSM
GRAPH --> LSM
VECTOR --> LSM
DOC --> LSM
TS --> LSM
style LSM fill:#ffe1e1
style REL fill:#e1ffe1
style GRAPH fill:#e1ffe1
style VECTOR fill:#e1ffe1
style DOC fill:#e1ffe1
style TS fill:#e1ffe1
sequenceDiagram
participant Client
participant TxManager as Transaction Manager
participant MVCC as MVCC Engine
participant Storage as RocksDB Storage
Client->>TxManager: BEGIN TRANSACTION
TxManager->>MVCC: Get Snapshot (timestamp)
MVCC-->>TxManager: Snapshot ID
TxManager-->>Client: Transaction Handle
Client->>TxManager: READ (key)
TxManager->>MVCC: Read at Snapshot
MVCC->>Storage: Get versioned data
Storage-->>MVCC: Data with version
MVCC-->>TxManager: Consistent read
TxManager-->>Client: Data
Client->>TxManager: WRITE (key, value)
TxManager->>MVCC: Check conflicts
MVCC-->>TxManager: No conflicts
TxManager->>Storage: Write with version
Storage-->>TxManager: Written
TxManager-->>Client: OK
Client->>TxManager: COMMIT
TxManager->>MVCC: Validate & commit
MVCC->>Storage: Apply changes atomically
Storage-->>MVCC: Success
MVCC-->>TxManager: Committed
TxManager-->>Client: Transaction Complete
- Full ACID guarantees with snapshot isolation
- Write-write conflict detection
- Atomic updates across all index types
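The guarantees above can be illustrated with a toy in-memory model of snapshot isolation. This is a simplified sketch, not ThemisDB's implementation: the classes `MVCCStore` and `Transaction` are hypothetical, and conflicts are checked only at commit time (first committer wins), whereas the sequence diagram also shows a check when the write is issued.

```python
class WriteConflict(Exception):
    """Raised when another transaction committed the same key after our snapshot."""

class MVCCStore:
    def __init__(self):
        self._versions = {}   # key -> list of (commit_ts, value), oldest first
        self._clock = 0       # timestamp of the latest commit

    def begin(self):
        return Transaction(self, self._clock)

    def _read(self, key, snapshot_ts):
        # Newest version that was committed at or before the snapshot.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None

    def _commit(self, txn):
        # Write-write conflict detection before anything is applied.
        for key in txn.writes:
            versions = self._versions.get(key, [])
            if versions and versions[-1][0] > txn.snapshot_ts:
                raise WriteConflict(key)
        self._clock += 1
        for key, value in txn.writes.items():
            self._versions.setdefault(key, []).append((self._clock, value))

class Transaction:
    def __init__(self, store, snapshot_ts):
        self.store = store
        self.snapshot_ts = snapshot_ts
        self.writes = {}      # buffered until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store._read(key, self.snapshot_ts)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        self.store._commit(self)

store = MVCCStore()
setup = store.begin()
setup.write("balance", 100)
setup.commit()

t1, t2 = store.begin(), store.begin()   # both see the same snapshot
t1.write("balance", 90)
t1.commit()
print(t2.read("balance"))               # t2 still reads from its own snapshot
```

In this model a later `t2.write("balance", ...)` followed by `t2.commit()` raises `WriteConflict`, because `t1` committed the same key after `t2` took its snapshot.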
graph TB
subgraph "Client Layer"
CLIENT[Client Application]
end
subgraph "Transport Security"
TLS[TLS 1.3<br/>Certificate Validation]
MTLS[Mutual TLS<br/>Client Certificates]
end
subgraph "Authentication & Authorization"
AUTH[Authentication<br/>JWT/OAuth2]
RBAC[Role-Based Access Control<br/>Permissions Matrix]
POLICY[Policy Engine<br/>Apache Ranger]
end
subgraph "Application Security"
RATELIMIT[Rate Limiting<br/>DDoS Protection]
AUDIT[Audit Logging<br/>SIEM Integration]
INPUT[Input Validation<br/>SQL Injection Prevention]
end
subgraph "Data Security"
ENCRYPT[Field-Level Encryption<br/>AES-256-GCM]
HSM[Hardware Security Module<br/>Key Management]
MASKING[Data Masking<br/>PII Protection]
end
subgraph "Storage Security"
STORAGE[Encrypted Storage<br/>At-Rest Encryption]
BACKUP[Encrypted Backups<br/>Secure Recovery]
end
CLIENT --> TLS
TLS --> MTLS
MTLS --> AUTH
AUTH --> RBAC
RBAC --> POLICY
POLICY --> RATELIMIT
RATELIMIT --> INPUT
INPUT --> AUDIT
AUDIT --> ENCRYPT
ENCRYPT --> HSM
HSM --> MASKING
MASKING --> STORAGE
STORAGE --> BACKUP
style TLS fill:#ffe1e1
style AUTH fill:#ffe1e1
style ENCRYPT fill:#ffe1e1
style STORAGE fill:#ffe1e1
- TLS 1.3 with mTLS support
- Role-Based Access Control (RBAC)
- Field-level encryption
- Audit logging with SIEM integration
Compliance & Audit Framework (v1.4.1+):
ThemisDB maintains comprehensive compliance with international security standards through a structured audit framework:
- Standards Coverage: ISO 27001, NIST CSF, OWASP ASVS Level 2, BSI C5, SOC 2, SLSA Level 3
- Automated Audits: Continuous SAST/DAST scanning, dependency checks, coverage analysis
- Audit Documentation: docs/audit-framework/
  - Audit Charter & Planning - Framework governance and methodology
  - Audit Gate Template - 113-point checklist for release audits
  - Audit Runbook - Step-by-step execution guide
  - Compliance Mapping - 400+ controls mapped to ThemisDB features
- CI/CD Integration: Automated audit checks on every PR (audit-check.yml)
See also: Security Policy | Compliance Documentation
graph TB
subgraph "Client Applications"
APP[Applications]
end
subgraph "Routing Layer"
SR[Shard Router<br/>VCC-URN Partitioning]
SM[Shard Manager<br/>Metadata & Health]
REBAL[Auto Rebalancer<br/>Load Distribution]
end
subgraph "ThemisDB Cluster - RAID Modes"
subgraph "MIRROR Mode RF=2"
subgraph "Shard 1"
S1P[Primary Node]
S1R[Replica Node]
end
subgraph "Shard 2"
S2P[Primary Node]
S2R[Replica Node]
end
end
subgraph "PARITY Mode 4+2"
S3[Data Shard 1]
S4[Data Shard 2]
S5[Data Shard 3]
S6[Data Shard 4]
P1[Parity Shard 1]
P2[Parity Shard 2]
end
end
subgraph "Observability"
MON[Monitoring<br/>Metrics & Health]
end
APP --> SR
SR --> SM
SM --> REBAL
SR --> S1P & S2P
S1P -.Replication.-> S1R
S2P -.Replication.-> S2R
SR --> S3 & S4 & S5 & S6
S3 & S4 & S5 & S6 -.Parity.-> P1 & P2
SM --> MON
REBAL -.Auto-Balance.-> S1P & S2P & S3 & S4
style SR fill:#e1f5ff
style S1P fill:#e1ffe1
style S2P fill:#e1ffe1
style S3 fill:#e1ffe1
style S4 fill:#e1ffe1
style S5 fill:#e1ffe1
style S6 fill:#e1ffe1
style P1 fill:#fff3cd
style P2 fill:#fff3cd
- VCC-URN based sharding with consistent hashing (Enterprise)
- RAID-like redundancy modes: MIRROR, STRIPE, PARITY, GEO_MIRROR (Enterprise)
- Auto-rebalancing with zero-downtime migration (Enterprise)
- Multi-region deployment support (Enterprise)
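Consistent hashing, mentioned above, keeps most keys on their current shard when a shard is added. The sketch below shows the general technique with virtual nodes. It is not ThemisDB's shard router: the class `HashRing`, the shard names, and the URN-like keys are all made up for illustration.

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes=(), vnodes=64):
        self._vnodes = vnodes
        self._ring = []           # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        # Each node owns several points on the ring to even out the load.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def node_for(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]

ring = HashRing(["shard-1", "shard-2", "shard-3"])
keys = [f"urn:example:item:{i}" for i in range(1000)]
before = {key: ring.node_for(key) for key in keys}

ring.add_node("shard-4")
after = {key: ring.node_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
print(f"{len(moved)} of {len(keys)} keys moved")
```

Every key that moves lands on the new shard; keys never shuffle between the existing shards, which is what makes rebalancing incremental.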
ThemisDB includes comprehensive safe-fail mechanisms for production reliability:
GPU/LLM Safe-Fail Manager - Automatic CPU fallback when GPU fails
- State machine: HEALTHY → DEGRADED → CIRCUIT_OPEN
- Memory pressure monitoring (OOM prevention)
- Operation timeouts detect hung kernels
- < 1 µs overhead per operation
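The state machine above can be sketched as a small guard object that counts consecutive GPU failures and falls back to a CPU path. The thresholds (`degrade_after`, `open_after`), the class name `SafeFailGuard`, and the reset-on-success rule are assumptions for illustration. A production breaker would normally also add a cool-down before probing the GPU again, which is omitted here.

```python
from enum import Enum

class State(Enum):
    HEALTHY = "HEALTHY"
    DEGRADED = "DEGRADED"
    CIRCUIT_OPEN = "CIRCUIT_OPEN"

class SafeFailGuard:
    """Tracks consecutive failures and routes work to a CPU fallback."""

    def __init__(self, degrade_after=1, open_after=3):
        self.degrade_after = degrade_after   # hypothetical threshold
        self.open_after = open_after         # hypothetical threshold
        self.failures = 0
        self.state = State.HEALTHY

    def _record(self, ok):
        self.failures = 0 if ok else self.failures + 1
        if self.failures >= self.open_after:
            self.state = State.CIRCUIT_OPEN
        elif self.failures >= self.degrade_after:
            self.state = State.DEGRADED
        else:
            self.state = State.HEALTHY

    def run(self, gpu_op, cpu_op):
        if self.state is State.CIRCUIT_OPEN:
            return cpu_op()              # GPU is bypassed entirely
        try:
            result = gpu_op()
        except Exception:
            self._record(False)
            return cpu_op()              # automatic CPU fallback
        self._record(True)
        return result

def broken_gpu():
    raise RuntimeError("simulated GPU fault")

guard = SafeFailGuard()
seen = []
for _ in range(4):
    guard.run(broken_gpu, lambda: "cpu result")
    seen.append(guard.state.name)
print(seen)
```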
Database Connection Manager - Connection pooling with health monitoring
- 2-10 connections (configurable), 40% overhead reduction
- Exponential backoff retry (100ms → 30s)
- Automatic stale connection removal
- ~10 µs overhead per acquire/release
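To make the 100ms → 30s backoff range concrete, the sketch below generates a capped exponential delay schedule. The doubling factor is an assumption: the text above gives only the first delay and the ceiling, not the growth rate.

```python
from itertools import islice

def backoff_delays(base=0.1, cap=30.0, factor=2.0):
    """Yield retry delays in seconds: base, base*factor, ... never above cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)

schedule = [round(d, 1) for d in islice(backoff_delays(), 10)]
print(schedule)  # [0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6, 30.0]
```

With a factor of 2 the ceiling is reached on the tenth attempt, after roughly 51 seconds of cumulative waiting.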
Network Timeout Handler - Prevents hanging connections
- Accept/read/write timeouts (5s/30s/30s defaults)
- TCP keepalive & TCP_NODELAY
- Protection against Slowloris DoS attacks
- ~5-10 µs overhead per operation
Transaction Auto-Retry - Automatic retry with exponential backoff
- Intelligent error classification (retryable vs non-retryable)
- Jitter support prevents thundering herd
- Circuit breaker integration
- ~3 µs overhead on success path
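Error classification and jitter can be combined in a small retry wrapper. The sketch below is a generic illustration, not ThemisDB's retry API: the exception classes, the `run_with_retry` function, and its default parameters are hypothetical. It uses "full jitter", where each wait is drawn uniformly between zero and the current backoff ceiling.

```python
import random
import time

class RetryableError(Exception):
    """Transient failure (for example a write-write conflict); safe to retry."""

class FatalError(Exception):
    """Permanent failure (for example a constraint violation); never retried."""

def run_with_retry(operation, max_attempts=5, base=0.05, cap=1.0,
                   sleep=time.sleep, rng=random.random):
    """Call operation() until it succeeds or max_attempts is exhausted.

    Only RetryableError triggers a retry; any other exception propagates at once.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise                         # retries exhausted
            ceiling = min(cap, base * 2 ** attempt)
            sleep(rng() * ceiling)            # full jitter spreads out competing clients

# Usage: a hypothetical commit that conflicts twice, then succeeds.
outcomes = iter([RetryableError("conflict"), RetryableError("conflict"), "committed"])

def flaky_commit():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome

waits = []
result = run_with_retry(flaky_commit, sleep=waits.append)
print(result, len(waits))
```

Passing `sleep=waits.append` records the delays instead of sleeping, which keeps the example fast and makes the jittered waits inspectable.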
Research-Backed Protection (Based on Bairavasundaram et al. 2008, Bonwick et al. 2010)
- Paranoid checks: 99.99% corruption detection (~5% read overhead)
- XXH3 checksums: 3x faster than CRC32 (~2% read overhead)
- Background verification: During compaction (0% read overhead)
- mmap disabled: Prevents hidden I/O errors (< 1% overall impact)
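Checksum-based corruption detection works by storing a checksum next to each block and recomputing it on read. The sketch below shows the principle using CRC32 from Python's standard `zlib` module as a stand-in, because XXH3 needs a third-party package. The record layout (payload followed by a 4-byte checksum) and the function names are assumptions for illustration.

```python
import zlib

def seal(payload: bytes) -> bytes:
    """Append a 4-byte CRC32 checksum to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def verify(record: bytes) -> bytes:
    """Return the payload if the stored checksum matches, else raise ValueError."""
    payload, stored = record[:-4], int.from_bytes(record[-4:], "big")
    if zlib.crc32(payload) != stored:
        raise ValueError("checksum mismatch: block is corrupt")
    return payload

record = seal(b"row:alice")
print(verify(record))

# Flip a single bit to simulate silent on-disk corruption.
corrupted = bytes([record[0] ^ 0x01]) + record[1:]
try:
    verify(corrupted)
except ValueError as exc:
    print(exc)
```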
| Metric | Before v1.4.1 | After v1.4.1 | Improvement |
|---|---|---|---|
| Availability | 99.5% | 99.95%+ | +0.45 pp |
| Automatic Recovery | Manual | 99.9% | +99.9 pp |
| Corruption Detection | None | 99.99% | +99.99 pp |
| Manual Intervention | High | -90% | -90% |
| Transaction Success | ~95% | 99.9% | +4.9 pp |
Total System Overhead: < 1% (safe-fail) + ~7% read (integrity checks, configurable)
Documentation:
- Safe-Fail Mechanisms - Technical guide
- Database File Robustness - Academic research
- Network Timeout Handling - Complete guide
- Transaction Auto-Retry - Retry strategies
- mmap Performance Impact - Detailed analysis
| Edition | License | Features | Use Case |
|---|---|---|---|
| Minimal | Open Source (MIT) | Core database only | Embedded systems, IoT, edge devices |
| Community | Open Source (MIT) | Full-featured single-node | Development, startups, single-server |
| Enterprise | Commercial | + Horizontal scaling, HA, replication | Large-scale production deployments |
→ Minimal Edition Details | → Enterprise Edition Details
ThemisDB supports native LLM integration with llama.cpp on System-on-Chip (SoC) devices for edge AI deployments.
- Raspberry Pi 4/5 - ARM64, NEON-optimized
- Orange Pi 5 / Rock 5B - ARM Mali GPU, NPU acceleration
- NVIDIA Jetson - CUDA GPU acceleration
- AI Accelerators - Coral TPU, Hailo, Intel NCS2
# config/config-rpi5-llm.yaml
llm:
enabled: true
model_path: "/data/models/phi-3-mini-4k-instruct.Q4_K_M.gguf"
context_size: 4096
threads: 4
enable_caching: true
Performance: ~2-3 tokens/second (Phi-3-Mini 3.8B)
- Complete SoC Guide - Comprehensive guide (German)
- Quick Reference - Fast configuration reference
- Raspberry Pi Tuning - System optimization
Key Features:
- Local AI inference without cloud dependency
- Data sovereignty and privacy
- 10-50x more energy efficient than desktop GPUs
- Models: TinyLlama (1B), Phi-3 (3.8B), Mistral (7B)
- RAG, embeddings, chat, and text generation
- Autonomous prompt optimization with A/B testing and rollback (learn more)
Complete Documentation Hub: https://makr-code.github.io/ThemisDB/
| Category | Description | Link |
|---|---|---|
| Category Index | Browse all docs by category | View Index → |
| Quick Start | 5-minute setup guide | Get Started → |
| Use Cases | E-Commerce, IoT, RAG/LLM, SaaS | Browse → |
| Tutorials | Hands-on learning paths | Learn → |
| Certification | Professional certifications | Get Certified → |
| Knowledge Base | Troubleshooting & tips | Search → |
graph TB
HUB[Documentation Hub]
HUB --> START[Getting Started]
HUB --> USECASE[Use Cases]
HUB --> TUTORIAL[Tutorials]
HUB --> CERT[Certification]
HUB --> KB[Knowledge Base]
HUB --> CORE[Core Docs]
START --> QS[Quick Start]
START --> INSTALL[Installation]
START --> FIRST[First Steps]
USECASE --> ECOM[E-Commerce]
USECASE --> IOT[IoT & Sensors]
USECASE --> RAG[RAG & LLM]
USECASE --> SAAS[SaaS Multi-Tenancy]
TUTORIAL --> CRUD[CRUD Operations]
TUTORIAL --> SCHEMA[Schema Design]
TUTORIAL --> BP[Best Practices]
TUTORIAL --> VIDEO[Video Tutorials]
CERT --> FUND[Fundamentals]
CERT --> QUERY[Query Expert]
CERT --> OPS[Operations]
CERT --> SEC[Security]
KB --> TROUBLE[Troubleshooting]
KB --> PERF[Performance Tips]
KB --> MIG[Migration Guides]
KB --> BACKUP[Backup & Recovery]
CORE --> ARCH[Architecture]
CORE --> AQL[AQL Language]
CORE --> API[API Reference]
CORE --> SECURITY[Security]
style HUB fill:#e1f5ff
style USECASE fill:#ffe1e1
style CERT fill:#e1ffe1
style KB fill:#fff3cd
Getting Started:
- Quick Start - Get up and running in 5 minutes
- Docker Deployment - Container-based deployment
- Building from Source - Compile from source code
Core Concepts:
- Architecture Overview - System design and components
- Multi-Model Design - Unified storage architecture
- Transaction Management - ACID and MVCC details
- AQL Query Language - Advanced Query Language syntax
- Git/GitOps Research - Version control concepts comparison
Features:
- Vector Search - Similarity search and embeddings
- Graph Operations - Graph traversals and algorithms
- Time-Series Engine - Time-series data handling
- Security & Compliance - Security features
Operations:
- Configuration Guide - Server configuration
- Monitoring & Metrics - Prometheus and Grafana
- Backup & Recovery - Comprehensive data protection guide
- Performance Tuning - Optimization tips
Development:
- Contributing - How to contribute
- Branching Strategy - Git Flow workflow
- API Reference - REST and GraphQL APIs
- Client SDKs - Available client libraries
LLM/LoRA System:
- LLM Core Status (Master) - Single source of truth for implementation status
- Comprehensive Audit Report - Detailed code audit findings
- Decision Matrix - Resolution of conflicting documentation
- Progress Checklist - Detailed task tracking
- Archived Docs - Historical documentation (superseded)
- Status: Core 100% production-ready, Integration 95% complete
- NEW: Legal LoRA Training Pipeline - Multi-source ingestion + auto-labeling + knowledge graph enrichment for domain-specific legal AI training
- Multi-source data ingestion (HuggingFace, filesystem, OCR support)
- Auto-labeling with Legal Modality Analyzer (PR #1 integration)
- Knowledge graph enrichment for contextual training
- Incremental training with version management
- Tutorial: Custom Document Ingestion
Audit Reports:
- v1.4.1 Audit Reports - Complete audit package for v1.4.1
- Executive Summary - Overall audit opinion: APPROVED WITH CONDITIONS (89.3/100)
- Code Quality Audit - SAST analysis, TODO inventory, metrics (89/100)
- Security Controls Audit - 58 controls assessed (90/100)
- Test Coverage Audit - Unit 87%, Integration 95%, E2E 72% (88/100)
- Compliance Audit - ISO 27001, NIST, OWASP, BSI C5, SOC 2, GDPR (95/100)
- Findings & Risks - 62 findings: 3 critical, 7 high, 22 medium, 30 low
- Performance Audit - 45K writes/s, 123K reads/s (92/100)
- Audit Framework - Comprehensive audit methodology and tools
- Compliance: 95.3% across 428 controls (ISO 27001, NIST, OWASP, BSI C5, SOC 2, GDPR)
- Status: Production-ready with v1.4.2 remediation required (3 critical findings)
Test Environment: Release build, Windows x64, 20 cores @ 3696 MHz
| Operation | Throughput | Latency (avg) |
|---|---|---|
| Entity PUT | 45,000 ops/s | 0.02 ms |
| Entity GET | 120,000 ops/s | 0.008 ms |
| Indexed Query | 3.4M queries/s | 0.29 μs |
| Graph Traverse | 9.56M ops/s | 0.105 μs |
| Vector Search | 59.7M queries/s | 0.017 μs |
| Vector Insert (384D) | 411k vectors/s | 2.44 μs |
Note: Benchmarks represent optimal conditions. Actual performance varies based on hardware, data size, and workload.
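When reading the table, it helps to see how the two columns relate. The short check below compares each reported average latency with the reciprocal of the reported throughput. The figures are copied from the table. The conclusion that the latencies are per-operation times derived from throughput is an inference from the arithmetic, not something the benchmark documentation states.

```python
# (throughput in operations per second, reported average latency in seconds)
rows = {
    "Entity PUT": (45_000, 0.02e-3),
    "Entity GET": (120_000, 0.008e-3),
    "Indexed Query": (3.4e6, 0.29e-6),
    "Graph Traverse": (9.56e6, 0.105e-6),
    "Vector Search": (59.7e6, 0.017e-6),
    "Vector Insert (384D)": (411e3, 2.44e-6),
}

deviations = {}
for name, (throughput, latency) in rows.items():
    implied = 1.0 / throughput                      # seconds per operation
    deviations[name] = abs(implied - latency) / latency
    print(f"{name}: reported {latency:.3e} s, 1/throughput {implied:.3e} s")
```

Each reported latency is within rounding distance of `1 / throughput`, so the column appears to describe time per operation inside the benchmark loop rather than end-to-end request latency over the network.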
ThemisDB performance is evaluated using the CHIMERA Suite (Comprehensive Hybrid Inferencing & Multi-model Evaluation Resource Assessment) - an industry-leading, vendor-neutral benchmark framework for multi-model databases with AI integration.
Key Features:
- IEEE/ACM compliant scientific methodology
- Multi-model workload testing (Graph, Vector, Relational, Document)
- Native AI/LLM benchmark support (inference, LoRA, RAG)
- Vendor-neutral, color-blind friendly reporting
- Statistical rigor with confidence intervals
CHIMERA Suite Documentation | Complete Benchmark Results
ThemisDB performance can be independently evaluated using the CHIMERA Suite - a vendor-neutral, IEEE-compliant benchmarking framework that supports fair comparison across multiple database systems.
CHIMERA Suite features:
- Vendor-neutral reporting and visualization
- Statistical rigor (IEEE Std 2807-2022 compliant)
- Color-blind friendly design
- Support for multiple database systems (PostgreSQL, MongoDB, Neo4j, ThemisDB, and more)
Learn more: CHIMERA Suite Documentation
ThemisDB includes a comprehensive Performance Dashboard for visualizing benchmark trends, detecting regressions, and monitoring performance across releases and branches.
Features:
- Real-time Grafana Dashboard - Throughput, latency, error rates
- Automatic Regression Detection - CI/CD integration with configurable thresholds
- Historical Tracking - Performance trends over time
- Branch Comparisons - Compare main, develop, and feature branches
- Release Tracking - Performance evolution across versions
- Hardware Comparison - Test on different configurations
- Alerts & Notifications - Slack/Email alerts for regressions
Quick Start:
# Start dashboard
cd grafana && docker-compose up -d
# Access at http://localhost:3000 (admin/admin)
Performance Dashboard Documentation | Quick Start Guide | Example Charts
| Resource | Description | Link |
|---|---|---|
| Documentation | Complete guides and API reference | Docs Site |
| Production Ops | Deployment, monitoring, troubleshooting | Operations Guide |
| Issues | Report bugs or request features | GitHub Issues |
| Discussions | Community Q&A and discussions | GitHub Discussions |
| Contributing | How to contribute to ThemisDB | Contributing Guide |
| Security | Responsible disclosure policy | Security Policy |
Community Edition: Released under the MIT License - Free to use, modify, and distribute.
Enterprise Edition: Available under commercial license with additional features (horizontal sharding, advanced analytics, HA/replication).
Enterprise Inquiries: sales@themisdb.com
ThemisDB builds upon excellent open-source projects:
- RocksDB - High-performance LSM-Tree storage engine
- FAISS - Efficient similarity search library
- llama.cpp - LLM inference engine (optional)
- ArangoDB - Multi-model architecture inspiration
- CozoDB - Hybrid relational-graph-vector design inspiration
→ Complete Attribution & Dependencies
→ Implementation Origins & Code Attribution (Historical)
We welcome contributions! Please see our:
- Contributing Guide - Development workflow and guidelines
- Code of Conduct - Community standards
- Support - How to get help
- Security Policy - Reporting security issues
ThemisDB uses a modern, consolidated CI/CD architecture (February 2026):
- 20 workflows (down from 53, 62% reduction)
- 12 entry workflows for PR validation, releases, security, testing
- 7 reusable workflows for shared functionality
- 8 composite actions for common steps
Key Workflows:
- ci-pull-request.yml - Fast PR validation (~15-30 min)
- ci-release.yml - Complete release pipeline
- security.yml - Comprehensive security scanning
- nightly.yml - Extended test suite
Documentation:
- CI/CD Architecture - Complete architecture guide
- Workflow README - All workflows documented
- Archived Workflows - Historical workflows (51 archived)
All changes are automatically validated through CI/CD pipelines ensuring code quality, security, and performance standards.
Built with ❤️ for the database community
Star us on GitHub · Read the Docs · Contribute