Architecture & Design Discussion

Summary
This document proposes a comprehensive architecture and design for lance-context, a high-performance, evolvable context management solution for AI Agents. The design establishes a clean separation between the logical information layer and the physical storage layer (LanceDB), defines a clear set of Agent-facing interfaces, and introduces a layered data model (L0-L2) to balance retrieval effectiveness and cost. It also includes specifications for data governance, multi-tenancy, and a phased implementation roadmap, aiming to provide a robust foundation for building and scaling complex Agent systems.
Motivation / Problem Statement
Current Agent development faces several challenges:
Fragmented Context: Information is often scattered across various systems, leading to inconsistent and incomplete context for the Agent.
Suboptimal Retrieval: Existing retrieval mechanisms lack the sophistication to balance precision, recall, and cost effectively.
Disorderly Information Growth: Without a proper strategy, an Agent's memory and knowledge base can expand indefinitely, leading to performance degradation and increased costs.
Lack of Extensibility: Tightly coupled business logic and storage implementations make it difficult to evolve the system, such as introducing new storage engines or adapting to new Agent capabilities.
The initial version of lance-context provides a solid starting point, but a more systematic architecture is required to address these issues and support the long-term growth of sophisticated AI Agents.
Design Overview
The proposed architecture is centered around three key concepts:
Information Layer on LanceDB: A logical Information Layer is introduced on top of the physical LanceDB storage. This layer provides a stable, semantic view of the data, abstracting away the underlying implementation details. It organizes data into three distinct families: ctx_agent (core Agent business data), ctx_kb (external knowledge), and ctx_meta (internal system metadata).
Layered Data Model (L0-L2): Inspired by industry best practices, we adopt a three-layer data model to optimize retrieval and processing:
L2 (Raw Content): The immutable source of truth.
L1 (Structured & Vector): The primary retrieval target, containing cleaned, chunked, and vectorized data.
L0 (Abstract/Summary): A high-level summary layer for efficient pre-filtering and low-cost relevance assessment.
Task-Oriented Agent Interfaces: A set of high-level interfaces (Add, Search, Explain, Trace, Prune, Archive) are defined to provide Agents with intuitive, task-oriented capabilities for managing their context, memory, and knowledge.
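As a rough sketch, this interface surface could be expressed as plain DTOs. The field names below follow the interface prototypes given later in this document (content, content_type, session_id, metadata, job_id, results); the dataclass shapes themselves are illustrative, not a committed API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Union

# Illustrative DTOs for the task-oriented interfaces. Field names follow the
# interface prototypes in this document; everything else is an assumption.

@dataclass
class AddRequest:
    content: Union[str, bytes]
    content_type: str
    session_id: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)

@dataclass
class AddResponse:
    job_id: str  # async handle; L1/L0 derivation happens in the background

@dataclass
class Chunk:
    chunk_id: str
    content: str
    score: float
    metadata: Dict[str, Any] = field(default_factory=dict)

@dataclass
class SearchRequest:
    query: str
    filter: Dict[str, Any] = field(default_factory=dict)
    top_k: int = 10
    search_type: str = "hybrid"  # "vector" | "fts" | "hybrid"

@dataclass
class SearchResponse:
    results: List[Chunk] = field(default_factory=list)
```

Keeping the DTOs free of storage details is what allows the physical layer to be swapped out later without touching Agent-facing code.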
Detailed Design
Overall Architecture and Data Layers
We propose an overall architecture that includes an Information Layer, which provides a stable logical view for Agent applications on top of the physical storage (LanceDB).
flowchart TD
subgraph "Agent Application"
Agent["AI Agent"]
end
subgraph "Information Layer (lance-context)"
direction LR
subgraph "Agent Interfaces"
direction TB
Add["Add"]
Search["Search"]
Prune["Prune"]
Archive["Archive"]
end
subgraph "Table Families"
direction TB
ctx_agent["ctx_agent (Core)"]
ctx_kb["ctx_kb (Knowledge)"]
ctx_meta["ctx_meta (Internal)"]
end
subgraph "Data Layers"
direction TB
L0["L0 (Summary)"]
L1["L1 (Structured/Vector)"]
L2["L2 (Raw Content)"]
end
end
subgraph "Physical Storage"
LanceDB["LanceDB"]
end
Agent --> Add
Agent --> Search
Agent --> Prune
Agent --> Archive
Add --> L2
L2 --> L1
L1 --> L0
Search -- "queries" --> L0
Search -- "queries" --> L1
L1 -- "links to" --> L2
ctx_agent --> LanceDB
ctx_kb --> LanceDB
ctx_meta --> LanceDB
The core of this architecture is a governance strategy based on data layering and separation.
Data Layers: L0, L1, and L2
We process and store data in three logical layers to optimize retrieval efficiency and reduce LLM token consumption.
L2 (Raw Content Layer)
Semantics: Unprocessed raw data, serving as the "Source of Truth" for all information. Examples include complete conversation logs, user-uploaded original documents, and full tool-call logs.
Generation Pipeline: Data enters the system via the Add interface and is directly stored in the corresponding L2 table.
Storage Strategy: Stored in tables like ctx_agent.agent_l2_raw or ctx_kb.kb_l2_raw_documents in binary or text format.
Retrieval Strategy: Not directly involved in retrieval by default. It is accessed only for "evidence traceability" or deep analysis, via links from L1/L0.
L1 (Structured & Vector Layer)
Semantics: The result of cleaning, chunking, extracting metadata from, and generating vector embeddings for L2 data. This is the primary retrieval target of the system, balancing information density and contextual granularity.
Generation Pipeline: Triggered by a background task or write pipeline, it processes new L2 data to generate L1 records.
Storage Strategy: Chunked text and metadata are stored in ctx_agent.agent_l1_chunks, and vectors are stored in ctx_agent.agent_l1_embeddings.
Retrieval Strategy: This is the main layer for hybrid retrieval (Scalar + FTS + Vector). An Agent's Search request first retrieves candidates from this layer using vector similarity, keywords, and metadata filtering.
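The L2 → L1 derivation step can be sketched as follows. This is a minimal illustration using a fixed-size character chunker and a stub hash-based embedder; a real pipeline would use a tokenizer-aware splitter and an actual embedding model, and the chunk/overlap sizes here are arbitrary.

```python
import hashlib
from typing import Dict, List, Tuple

def chunk_text(raw_id: str, text: str, size: int = 200, overlap: int = 40) -> List[Dict]:
    """Split one L2 raw record into overlapping L1 chunks (character-based sketch)."""
    chunks, start, idx = [], 0, 0
    step = size - overlap
    while start < len(text):
        chunks.append({
            "chunk_id": f"{raw_id}:{idx}",
            "raw_id": raw_id,  # link back to L2 for evidence traceability
            "content": text[start:start + size],
        })
        idx += 1
        start += step
    return chunks

def embed(text: str, dim: int = 8) -> List[float]:
    """Stub embedding: deterministic hash-derived vector, placeholder for a model."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest[:dim]]

def to_l1(raw_id: str, text: str) -> Tuple[List[Dict], List[Dict]]:
    """Produce agent_l1_chunks rows and agent_l1_embeddings rows from one L2 record."""
    chunks = chunk_text(raw_id, text)
    embeddings = [{"chunk_id": c["chunk_id"], "vector": embed(c["content"])}
                  for c in chunks]
    return chunks, embeddings
```

Note that each L1 row carries its raw_id, which is what makes the later L2 evidence-traceability step possible.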
L0 (Abstract/Summary Layer)
Semantics: A brief summary generated for a group of L1 Chunks or an L2 object (like a session or a document). Its core function is pre-filtering, helping the Agent or retrieval strategy quickly determine if a larger entity (like an entire session) is worth exploring in-depth.
Generation Pipeline: Triggered by a background task or after L1 processing is complete, it calls an LLM to summarize L1 Chunks or L2 content.
Storage Strategy: Stored in the ctx_agent.agent_l0_summaries table and linked to the corresponding L1/L2 entities.
Retrieval Strategy: Acts as the first line of defense in retrieval. For instance, in cross-session retrieval, it can quickly filter L0 summaries of all sessions to locate the most relevant ones before performing a precise search within their L1 Chunks.
Retrieval Chain
The standard retrieval chain follows the L0 → L1 → L2 sequence:
L0 Pre-filtering: Based on the query intent, a quick, low-cost match is first performed on the L0 summary layer to identify highly relevant entities (e.g., sessions, documents).
L1 Main Retrieval: Within the scope of entities filtered by L0, or directly in the global L1 Chunks, a hybrid search is executed to recall the most relevant atomic information blocks.
L2 Evidence Traceability: The L1 retrieval results are presented to the LLM. If more complete context or fact-verification is needed, the LLM or user can trace back to the L2 raw data using the links saved in the L1 records.
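The three-step chain above can be sketched as an orchestration over three store callbacks. The callback signatures and the 0.5 relevance threshold are hypothetical stand-ins for the LanceDB-backed tables, not a fixed contract.

```python
from typing import Callable, Dict, List

def retrieve(
    query: str,
    search_l0: Callable[[str], List[Dict]],            # [{"target_id", "summary", "score"}]
    search_l1: Callable[[str, List[str]], List[Dict]], # hybrid search scoped to entities
    fetch_l2: Callable[[str], str],                    # raw content by raw_id
    l0_threshold: float = 0.5,
    trace_evidence: bool = False,
) -> List[Dict]:
    # 1. L0 pre-filtering: cheaply locate relevant entities (sessions, documents).
    entities = [r["target_id"] for r in search_l0(query) if r["score"] >= l0_threshold]
    # 2. L1 main retrieval: hybrid search within the filtered scope
    #    (an empty scope would fall back to a global L1 search).
    hits = search_l1(query, entities)
    # 3. L2 evidence traceability: optionally attach raw content via raw_id links.
    if trace_evidence:
        for h in hits:
            h["raw"] = fetch_l2(h["raw_id"])
    return hits
```

The key property is that the expensive steps (L1 hybrid search, L2 fetch) only run over the narrow scope the cheap L0 pass selected.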
Table Families and Naming Conventions
To decouple business logic from internal management, we have designed three independent table families (which can be mapped to different databases or directories in LanceDB), each with clear naming conventions and responsibilities.
ctx_agent Family: Agent Business Core
Stores runtime data directly interacting with the Agent.
Table schemas (columns and indexes) for this family:

agent_sessions: session_id (PK), agent_id, user_id, status, start_time, end_time, metadata (JSON). Indexes: agent_id, user_id, start_time.
agent_l2_raw: raw_id (PK), source_id (e.g., session_id), content (bytes/text), content_type, created_at. Indexes: source_id, created_at.
agent_l1_chunks: chunk_id (PK), raw_id, content (text), metadata (JSON), created_at, agent_id. Indexes: FTS on content; B-Tree on agent_id, created_at.
agent_l1_embeddings: chunk_id (FK), vector (fixed_size_list), model_name, created_at. Index: vector.
agent_l0_summaries: target_id (PK), target_type (session/doc), summary (text), updated_at, agent_id. Indexes: FTS on summary; B-Tree on agent_id, target_type.
agent_skills: skill_id (PK), name, schema (JSON), description, agent_id, version. Indexes: agent_id, name.
agent_tool_calls: call_id (PK), session_id, tool_name, params (JSON), result (text), status, timestamp. Indexes: session_id, tool_name, timestamp.
agent_relations: source_id, target_id, relation_type (e.g., 'cites', 'triggers'), agent_id, created_at. Indexes: source_id, target_id, agent_id.
ctx_kb Family: External Knowledge Base
Used for storing relatively static, shareable background knowledge. Its structure is similar to ctx_agent but with an independent lifecycle and management strategy.

kb_l2_raw_documents: doc_id (PK), source_uri, content, metadata (JSON), imported_at. Index: source_uri.
kb_l1_chunks: chunk_id (PK), doc_id, content, metadata (JSON). Index: FTS on content.
kb_l1_embeddings: chunk_id (FK), vector (fixed_size_list), model_name. Index: vector.
ctx_meta Family: Internal Metadata
The system's "Information Schema," used for self-management and internal task scheduling, transparent to the Agent.

meta_tables: table_name, db_name, table_type, version.
meta_columns: table_name, column_name, data_type, is_time, is_vector.
meta_jobs: job_id (PK), job_type (prune/index), payload, status, scheduled_at.
meta_mem_candidates: session_id, chunk_id, score, candidate_level (promote/prune).
Temporal Attributes and Indexing Conventions
All time fields (created_at, timestamp) should use a uniform UTC Timestamp type.
The vector field is the primary target for vector indexing; IVF_PQ is recommended.
content and summary fields should have a Full-Text Search (FTS) index.
Multi-Tenancy, Concurrency, and Data Governance
Multi-Tenancy and Versioning: every table in ctx_agent must include an agent_id, and all reads and writes are scoped by agent_id.
Concurrency and Isolation: writes go through a write queue that serializes Add operations.
Index Maintenance and Archiving: index maintenance jobs are scheduled in meta_jobs during off-peak hours. Archiving, triggered by the Prune interface, performs a soft delete and migrates data to cold storage via a background task.
Agent Interface Mapping
The interfaces provided by lance-context to the Agent should be task-oriented and highly abstract.
flowchart TD
subgraph Agent
direction LR
A[Add]
S[Search]
E[Explain]
T[Trace]
P[Prune]
AR[Archive]
end
subgraph Backend
direction TB
subgraph ctx_agent
agent_l2["agent_l2_raw"]
agent_l1["agent_l1_chunks/embeddings"]
agent_l0["agent_l0_summaries"]
agent_relations["agent_relations"]
agent_tool_calls["agent_tool_calls"]
end
subgraph ctx_meta
meta_jobs["meta_jobs"]
meta_mem_candidates["meta_mem_candidates"]
end
end
A -- "Writes to" --> agent_l2
A -- "Triggers async write to" --> agent_l1
A -- "Triggers async write to" --> agent_l0
S -- "Queries" --> agent_l1
E -- "Traverses" --> agent_relations
T -- "Fetches from" --> agent_l1
T -- "Fetches from" --> agent_tool_calls
P -- "Filters in" --> meta_mem_candidates
P -- "Creates job in" --> meta_jobs
AR -- "Creates job in" --> meta_jobs
Core Interface Prototypes (DTOs)
Add
Inputs: content: Union[str, bytes], content_type: str, session_id: Optional[str], metadata: Dict
Outputs: job_id: str
Logic: 1. Write to agent_l2_raw (L2). 2. Trigger background task: write to agent_l1_chunks/embeddings (L1). 3. (Optional) Trigger LLM to write to agent_l0_summaries (L0).
Search
Inputs: query: str, filter: Dict, top_k: int, search_type: Literal[...]
Outputs: results: List[Chunk]
Logic: 1. Concurrently query agent_l1_chunks (FTS) and agent_l1_embeddings (Vector). 2. Use RRF to fuse results. 3. Perform scalar filtering.
Errors: 400, 404
Explain
Inputs: entity_id: str, entity_type: str
Outputs: graph: Dict
Logic: Recursively trace relationships from the agent_relations table.
Errors: 404
Trace
Inputs: session_id: str
Outputs: events: List[...]
Logic: Fetch all records for session_id from agent_l1_chunks and agent_tool_calls and sort by timestamp.
Errors: 404
Prune
Inputs: policy: Dict
Outputs: job_id: str
Logic: 1. Filter candidates in meta_mem_candidates. 2. Create a prune task in meta_jobs. 3. Background task performs soft delete/archiving.
Errors: 400
Archive
Inputs: session_id: str
Outputs: job_id: str
Logic: Create an archive task in meta_jobs to migrate all session data.
Errors: 404
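The Search logic above fuses the FTS and vector result lists with Reciprocal Rank Fusion (RRF). A minimal stdlib sketch, with the conventional k = 60 smoothing constant:

```python
from collections import defaultdict
from typing import Dict, List

def rrf_fuse(result_lists: List[List[str]], k: int = 60) -> List[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank(d)),
    where rank is the 1-based position of d in each list it appears in."""
    scores: Dict[str, float] = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: chunk "c2" ranks near the top of both lists, so it wins overall.
fts_hits = ["c1", "c2", "c3"]      # from agent_l1_chunks (FTS)
vector_hits = ["c2", "c4", "c1"]   # from agent_l1_embeddings (Vector)
fused = rrf_fuse([fts_hits, vector_hits])
```

RRF needs only ranks, not comparable scores, which is why it works across heterogeneous retrievers like FTS and vector search.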
Daily Compaction Mechanism
This mechanism distills incremental conversation data into valuable long-term memory (Episodes and Profiles) and cleans up low-value information.
Scoring and Candidacy
A daily background task scans recent agent_sessions and agent_l1_chunks.
Scoring Dimensions: Activity, Importance, Reusability, Time Decay.
Based on a weighted score, each item is marked in meta_mem_candidates as PROMOTION, RETENTION, or PRUNING.
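The scoring step might look like the following sketch. The weights, thresholds, and 14-day half-life are illustrative placeholders; the document leaves them as open tuning parameters.

```python
# Illustrative weights and thresholds; real values are tuning parameters.
WEIGHTS = {"activity": 0.3, "importance": 0.4, "reusability": 0.3}
PROMOTE_AT, PRUNE_AT = 0.7, 0.3
HALF_LIFE_DAYS = 14.0

def time_decay(age_days: float) -> float:
    """Exponential decay with a configurable half-life."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS)

def score(activity: float, importance: float, reusability: float,
          age_days: float) -> float:
    """Weighted sum of the scoring dimensions, discounted by time decay."""
    base = (WEIGHTS["activity"] * activity
            + WEIGHTS["importance"] * importance
            + WEIGHTS["reusability"] * reusability)
    return base * time_decay(age_days)

def candidate_level(s: float) -> str:
    """Map a score to the meta_mem_candidates marking."""
    if s >= PROMOTE_AT:
        return "PROMOTION"
    if s <= PRUNE_AT:
        return "PRUNING"
    return "RETENTION"
```

A fresh, high-value chunk lands in PROMOTION, while the same chunk several half-lives later decays into PRUNING unless it is touched again.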
Promotion to Episode/Profile
A background task processes records marked for PROMOTION.
It aggregates content, calls an LLM to generate a narrative Episode or update a structured User Profile, and stores the result in a long-term memory table.
The operation is transactional and can be rolled back.
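The promote-then-mark step could be sketched as below, with in-memory stand-ins for the tables and a stubbed LLM summarizer; a real implementation would rely on LanceDB table versions or a job-level undo log for rollback rather than a Python list checkpoint.

```python
from typing import Callable, Dict, List

def promote(
    candidates: List[Dict],            # meta_mem_candidates rows marked PROMOTION
    summarize: Callable[[str], str],   # LLM call (stubbed in tests)
    episodes: List[Dict],              # long-term memory table stand-in
    mark_done: Callable[[Dict], None], # flips the candidate's status
) -> int:
    """Aggregate candidate content, write one Episode, then mark candidates done.
    On any failure, roll back the Episode write so the batch can be retried."""
    if not candidates:
        return 0
    checkpoint = len(episodes)  # undo point for rollback
    try:
        text = "\n".join(c["content"] for c in candidates)
        episodes.append({"kind": "episode", "summary": summarize(text)})
        for c in candidates:
            mark_done(c)
        return len(candidates)
    except Exception:
        del episodes[checkpoint:]  # roll back partial writes
        raise
```

The checkpoint-and-delete pattern keeps the long-term memory table consistent even if the LLM call or the candidate update fails mid-batch.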
Limitations / Open Questions
Performance at Scale: While LanceDB is highly performant, the FTS and complex scalar query capabilities might become a bottleneck under heavy load. The hybrid storage model in Phase 2 is designed to mitigate this.
Write Concurrency Management: The proposed write queue adds a layer of complexity. Its implementation and tuning will be critical for high-throughput scenarios.
Cost of L0 Generation: Generating L0 summaries via LLM calls for every L1/L2 update can be costly. A selective or batch-based strategy for L0 generation might be needed.
Complexity of Query Router: The Query Router in Phase 3 is a significant engineering effort and will require careful design to handle query parsing, distribution, and result fusion correctly.
Rollout Plan / Roadmap
We recommend a phased approach to implement this design.
Phase 1: Unified Information Layer View based on LanceDB
Goal: Implement the complete L0/L1/L2 data model and core APIs (Add, Search) using only LanceDB.
Outcome: A functionally complete but limited-performance context database.
Phase 2: Hybrid Storage and Query Offloading
Goal: Introduce external specialized engines (e.g., Elasticsearch for FTS) where LanceDB's native capabilities are insufficient.
Outcome: A hybrid system with better performance and stronger query capabilities.
Phase 3: Engine Adapter Layer and Query Router
Goal: Evolve lance-context into a universal "context virtualization layer" that supports any combination of backend storage engines.
Outcome: A highly scalable context database platform completely decoupled from the underlying storage.
Checklist
Finalize table schemas for all three families.
Implement Phase 1 Add interface and async processing pipeline.
Implement Phase 1 Search interface with hybrid search capabilities.
Set up meta_jobs and a basic daily compaction framework.
Develop adapters for LangChain and LlamaIndex.
Benchmark performance of the Phase 1 implementation.
Document all public APIs and data models.
Impact Assessment
Performance: The layered data model and hybrid retrieval strategy are expected to significantly improve query performance and reduce LLM context size. Write latency will be managed via asynchronous processing and a write queue.
Cost: While LLM calls for L0/L1 generation introduce costs, the overall architecture aims to reduce token consumption during retrieval, potentially leading to net savings. The phased rollout allows for cost-effective scaling.
Compatibility: The design is framework-agnostic. By providing clean DTOs and adapters, it ensures easy integration with existing and future Agent frameworks. The reliance on LanceDB in Phase 1 simplifies initial deployment and dependencies.