Version: 1.5.0-dev
Last Updated: 2026-02-15

Complete REST API reference for the LoRA (Low-Rank Adaptation) framework in ThemisDB.

Default base URL:

```
http://localhost:8080
```

📖 Port Reference: ThemisDB uses different ports depending on the deployment platform. See docs/de/deployment/PORT_REFERENCE.md for the complete mapping.

Default Ports:

- 8080 - HTTP/REST API (this documentation)
- 18765 - Binary Wire Protocol/gRPC
- 4318 - OpenTelemetry/Prometheus metrics
- Authentication
- Model Management
- Adapter Management
- Adapter Lifecycle
- Inference
- Monitoring
- Error Handling
- Rate Limiting
All API endpoints require JWT Bearer Token authentication:

```
Authorization: Bearer <your-jwt-token>
Content-Type: application/json
```

Contact your ThemisDB administrator to obtain a JWT token. Tokens include user information and permissions.
Register a new LLM model in the system.
Endpoint: POST /api/v1/llm/models

Request:

```json
{
  "model_id": "llama-2-7b",
  "architecture": "llama",
  "parameter_count": 7000000000,
  "quantization": "Q4_K_M",
  "gguf_path": "/models/llama-2-7b-Q4.gguf",
  "description": "Llama 2 7B model with Q4 quantization",
  "metadata": {
    "context_length": 4096,
    "vocab_size": 32000
  }
}
```

Response: 201 Created

```json
{
  "model_id": "llama-2-7b",
  "status": "registered",
  "timestamp": "2026-01-11T14:00:00Z"
}
```

cURL Example:

```bash
curl -X POST http://localhost:8080/api/v1/llm/models \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "llama-2-7b",
    "architecture": "llama",
    "parameter_count": 7000000000,
    "quantization": "Q4_K_M",
    "gguf_path": "/models/llama-2-7b-Q4.gguf"
  }'
```

Retrieve details about a specific model.
Endpoint: GET /api/v1/llm/models/{model_id}

Response: 200 OK

```json
{
  "model_id": "llama-2-7b",
  "architecture": "llama",
  "parameter_count": 7000000000,
  "created_at": "2026-01-11T14:00:00Z",
  "metadata": {}
}
```

cURL Example:

```bash
curl -X GET http://localhost:8080/api/v1/llm/models/llama-2-7b \
  -H "Authorization: Bearer $TOKEN"
```

List all registered models with optional filters.
Endpoint: GET /api/v1/llm/models

Query Parameters:

- architecture (optional): Filter by architecture
- limit (optional, default: 10): Maximum results
- offset (optional, default: 0): Pagination offset

Response: 200 OK

```json
{
  "models": [
    {
      "model_id": "llama-2-7b",
      "architecture": "llama",
      "parameter_count": 7000000000
    }
  ],
  "total": 42,
  "limit": 10,
  "offset": 0
}
```

cURL Example:

```bash
curl -X GET "http://localhost:8080/api/v1/llm/models?architecture=llama&limit=10" \
  -H "Authorization: Bearer $TOKEN"
```

Delete a model from the registry.
Endpoint: DELETE /api/v1/llm/models/{model_id}

Response: 204 No Content

cURL Example:

```bash
curl -X DELETE http://localhost:8080/api/v1/llm/models/llama-2-7b \
  -H "Authorization: Bearer $TOKEN"
```

Create a new LoRA adapter through training.
Endpoint: POST /api/v1/llm/lora/adapters

Request:

```json
{
  "adapter_id": "themis_help_lora",
  "base_model": "llama-2-7b",
  "task": "documentation_qa",
  "rank": 8,
  "alpha": 16,
  "training_data": {
    "dataset_id": "docs_v1",
    "samples": 10000
  },
  "description": "Documentation Q&A adapter"
}
```

Response: 201 Created

```json
{
  "adapter_id": "themis_help_lora",
  "version": "v1.0",
  "status": "training",
  "job_id": "job_123"
}
```

cURL Example:

```bash
curl -X POST http://localhost:8080/api/v1/llm/lora/adapters \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "adapter_id": "themis_help_lora",
    "base_model": "llama-2-7b",
    "task": "documentation_qa",
    "rank": 8,
    "alpha": 16,
    "training_data": {
      "dataset_id": "docs_v1",
      "samples": 10000
    }
  }'
```

Retrieve details about a specific adapter.
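Adapter creation returns immediately while training runs asynchronously, so clients typically poll until the adapter leaves the "training" state. A minimal polling sketch; the `fetch_status` callable is a stand-in for an HTTP GET against the adapter endpoint, and terminal states other than "ready" are not documented here, so the loop simply returns whatever non-training state it observes:

```python
# Poll an asynchronous training job until it finishes or times out.
# fetch_status is injected so the loop stays transport-agnostic.
import time
from typing import Callable

def wait_until_ready(fetch_status: Callable[[], str],
                     poll_interval_s: float = 5.0,
                     timeout_s: float = 3600.0) -> str:
    """Return the first status observed that is not 'training'."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status != "training":
            return status
        time.sleep(poll_interval_s)
    raise TimeoutError("adapter training did not finish in time")
```

In practice `fetch_status` would GET the adapter resource and extract its `status` field.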
Endpoint: GET /api/v1/llm/lora/adapters/{adapter_id}

Response: 200 OK

```json
{
  "adapter_id": "themis_help_lora",
  "base_model": "llama-2-7b",
  "version": "v1.0",
  "status": "ready",
  "metrics": {
    "validation_accuracy": 0.92,
    "training_loss": 0.15
  },
  "created_at": "2026-01-11T14:30:00Z"
}
```

cURL Example:

```bash
curl -X GET http://localhost:8080/api/v1/llm/lora/adapters/themis_help_lora \
  -H "Authorization: Bearer $TOKEN"
```

Update an adapter with additional training data.
Endpoint: PUT /api/v1/llm/lora/adapters/{adapter_id}

Request:

```json
{
  "additional_training_data": {
    "dataset_id": "feedback_v1",
    "samples": 500
  }
}
```

Response: 200 OK

```json
{
  "adapter_id": "themis_help_lora",
  "version": "v1.1",
  "status": "training",
  "job_id": "job_124"
}
```

cURL Example:

```bash
curl -X PUT http://localhost:8080/api/v1/llm/lora/adapters/themis_help_lora \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "additional_training_data": {
      "dataset_id": "feedback_v1",
      "samples": 500
    }
  }'
```

Delete an adapter and optionally all its versions.
Endpoint: DELETE /api/v1/llm/lora/adapters/{adapter_id}

Query Parameters:

- version (optional): Specific version to delete (omit to delete all versions)

Response: 204 No Content

cURL Example:

```bash
# Delete a specific version
curl -X DELETE "http://localhost:8080/api/v1/llm/lora/adapters/themis_help_lora?version=v1.0" \
  -H "Authorization: Bearer $TOKEN"

# Delete all versions
curl -X DELETE http://localhost:8080/api/v1/llm/lora/adapters/themis_help_lora \
  -H "Authorization: Bearer $TOKEN"
```

List all adapters with optional filters.
Endpoint: GET /api/v1/llm/lora/adapters

Query Parameters:

- base_model (optional): Filter by base model
- status (optional): Filter by status (ready, stored, training)
- limit (optional, default: 10): Maximum results
- offset (optional, default: 0): Pagination offset

Response: 200 OK

```json
{
  "adapters": [
    {
      "adapter_id": "themis_help_lora",
      "base_model": "llama-2-7b",
      "status": "ready",
      "is_loaded": true
    }
  ],
  "total": 15,
  "limit": 10,
  "offset": 0
}
```

cURL Example:

```bash
curl -X GET "http://localhost:8080/api/v1/llm/lora/adapters?base_model=llama-2-7b&status=ready" \
  -H "Authorization: Bearer $TOKEN"
```

Load an adapter into memory for use.
Endpoint: POST /api/v1/llm/lora/adapters/{adapter_id}/load

Response: 200 OK

```json
{
  "adapter_id": "themis_help_lora",
  "status": "loaded",
  "load_time_ms": 45
}
```

cURL Example:

```bash
curl -X POST http://localhost:8080/api/v1/llm/lora/adapters/themis_help_lora/load \
  -H "Authorization: Bearer $TOKEN"
```

Unload an adapter from memory.
Endpoint: POST /api/v1/llm/lora/adapters/{adapter_id}/unload

Response: 200 OK

```json
{
  "adapter_id": "themis_help_lora",
  "status": "unloaded"
}
```

cURL Example:

```bash
curl -X POST http://localhost:8080/api/v1/llm/lora/adapters/themis_help_lora/unload \
  -H "Authorization: Bearer $TOKEN"
```

Get the current status of an adapter.
Endpoint: GET /api/v1/llm/lora/adapters/{adapter_id}/status

Response: 200 OK

```json
{
  "adapter_id": "themis_help_lora",
  "is_loaded": true,
  "memory_usage_mb": 32,
  "last_used": "2026-01-11T15:00:00Z"
}
```

cURL Example:

```bash
curl -X GET http://localhost:8080/api/v1/llm/lora/adapters/themis_help_lora/status \
  -H "Authorization: Bearer $TOKEN"
```

Execute inference using a LoRA adapter.
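The status and load endpoints combine naturally into an idempotent "ensure loaded" step before issuing queries. A sketch with injected callables standing in for the two HTTP calls; the function names are illustrative, not part of any ThemisDB SDK:

```python
# Ensure an adapter is loaded before querying it.
# get_status stands in for GET .../adapters/{id}/status and
# load for POST .../adapters/{id}/load; both are injected.
from typing import Callable

def ensure_loaded(get_status: Callable[[], dict],
                  load: Callable[[], dict]) -> bool:
    """Return True if a load call was issued, False if already loaded."""
    if get_status().get("is_loaded"):
        return False
    result = load()
    return result.get("status") == "loaded"
```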
Endpoint: POST /api/v1/llm/lora/query

Request:

```json
{
  "model_id": "llama-2-7b",
  "adapter_id": "themis_help_lora",
  "prompt": "How do I enable sharding in ThemisDB?",
  "max_tokens": 500,
  "temperature": 0.7,
  "user_id": "user_42"
}
```

Response: 200 OK

```json
{
  "response": "To enable sharding in ThemisDB...",
  "model_id": "llama-2-7b",
  "adapter_id": "themis_help_lora",
  "tokens_used": 145,
  "inference_time_ms": 850,
  "audit_id": "audit_789"
}
```

cURL Example:

```bash
curl -X POST http://localhost:8080/api/v1/llm/lora/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "llama-2-7b",
    "adapter_id": "themis_help_lora",
    "prompt": "How do I enable sharding in ThemisDB?",
    "max_tokens": 500,
    "temperature": 0.7
  }'
```

Get statistics about the LoRA framework.
Endpoint: GET /api/v1/llm/lora/stats

Response: 200 OK

```json
{
  "total_adapters": 15,
  "loaded_adapters": 3,
  "cache_hit_rate": 0.842,
  "total_inferences": 1234567,
  "avg_load_time_ms": 450,
  "uptime_seconds": 864000
}
```

cURL Example:

```bash
curl -X GET http://localhost:8080/api/v1/llm/lora/stats \
  -H "Authorization: Bearer $TOKEN"
```

Check the health of the LoRA framework.
Endpoint: GET /api/v1/llm/lora/health

Response: 200 OK

```json
{
  "status": "healthy",
  "storage": "ok",
  "manager": "ok",
  "training": "ok",
  "checks_passed": 3,
  "checks_failed": 0
}
```

cURL Example:

```bash
curl -X GET http://localhost:8080/api/v1/llm/lora/health \
  -H "Authorization: Bearer $TOKEN"
```

All errors use a consistent JSON structure, similar in spirit to RFC 7807 Problem Details for HTTP APIs but with error/details/status fields.
```json
{
  "error": "Error message",
  "details": "Detailed error information",
  "status": 400
}
```

| Code | Description |
|---|---|
| 200 | Success |
| 201 | Created |
| 204 | No Content |
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Invalid or missing token |
| 404 | Not Found - Resource doesn't exist |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error |
| 503 | Service Unavailable - Health check failed |

401 Unauthorized:

```json
{
  "error": "Unauthorized",
  "details": "Valid Bearer Token required. Include 'Authorization: Bearer <token>' header.",
  "status": 401
}
```

404 Not Found:

```json
{
  "error": "Adapter not found",
  "details": "Unknown adapter_id: invalid_adapter",
  "status": 404
}
```

400 Bad Request:

```json
{
  "error": "Invalid JSON body",
  "status": 400
}
```

Rate limiting is applied per API key/JWT token to ensure fair usage.
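The uniform error/details/status shape above lends itself to a small client-side mapping onto typed exceptions. A sketch; the exception class names are invented for illustration and are not part of any ThemisDB SDK:

```python
# Map the API's error JSON onto typed Python exceptions.
# Class names here are illustrative, not a ThemisDB SDK.
class ThemisAPIError(Exception):
    def __init__(self, status: int, error: str, details: str = ""):
        msg = f"{status}: {error}" + (f" ({details})" if details else "")
        super().__init__(msg)
        self.status = status

class UnauthorizedError(ThemisAPIError): ...
class NotFoundError(ThemisAPIError): ...

def raise_for_error(status: int, body: dict) -> None:
    """Raise a typed exception for non-2xx responses; no-op otherwise."""
    if 200 <= status < 300:
        return
    cls = {401: UnauthorizedError, 404: NotFoundError}.get(status, ThemisAPIError)
    raise cls(status, body.get("error", "unknown"), body.get("details", ""))
```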
Rate limit information is included in response headers:

```
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1641945600
```

When the rate limit is exceeded, the API responds with 429 Too Many Requests:

```json
{
  "error": "Rate limit exceeded",
  "details": "Maximum 1000 requests per hour exceeded",
  "status": 429
}
```

Retry-After Header:

```
Retry-After: 3600
```

List endpoints support pagination via query parameters.
- limit: Maximum number of results (default: 10, max: 100)
- offset: Number of results to skip (default: 0)

```json
{
  "items": [...],
  "total": 100,
  "limit": 10,
  "offset": 0
}
```

```bash
# Get first page (items 0-9)
curl "http://localhost:8080/api/v1/llm/lora/adapters?limit=10&offset=0"

# Get second page (items 10-19)
curl "http://localhost:8080/api/v1/llm/lora/adapters?limit=10&offset=10"
```

API versioning uses the URL path: /api/v1/...
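The limit/offset scheme shown for the list endpoints can be wrapped in a generator that walks every page. A sketch with an injected `fetch_page` callable (e.g. one issuing the GET requests shown above); since the response list key varies by endpoint ("models" or "adapters"), it is passed as a parameter:

```python
# Walk all pages of an offset-paginated list endpoint.
# fetch_page(limit, offset) must return the documented
# {<key>: [...], "total": N, ...} response shape.
from typing import Callable, Iterator

def iter_pages(fetch_page: Callable[[int, int], dict], key: str,
               limit: int = 10) -> Iterator[dict]:
    """Yield every item across all pages."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page.get(key, [])
        yield from items
        offset += limit
        if offset >= page.get("total", 0) or not items:
            break
```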
Future versions will maintain backward compatibility. Deprecated endpoints will include warnings in response headers:

```
Deprecated: true
Sunset: Sat, 31 Dec 2027 23:59:59 GMT
```

- Always authenticate: Include a Bearer token in every request
- Handle errors: Check status codes and handle errors appropriately
- Use pagination: Don't fetch all results at once
- Cache responses: Use adapter status endpoints to avoid unnecessary loads
- Monitor rate limits: Check rate limit headers and implement backoff
- Async operations: Use job IDs for long-running operations like training
- Audit logging: Include user_id in inference requests for audit trails
For issues or questions:

- GitHub Issues: https://github.com/makr-code/ThemisDB/issues
- Documentation: https://github.com/makr-code/ThemisDB/blob/main/README.md
- OpenAPI Spec: /openapi/lora_api.yaml