From bb1faabf3d87dc5cc9b671512d581926105d569b Mon Sep 17 00:00:00 2001 From: rajiv chodisetti Date: Thu, 28 May 2026 00:07:25 +0530 Subject: [PATCH 001/102] docs: add Redis caching & scaling plan Plan for exposing chat to a few hundred users: P0 connection-pool/session fix, P1 Redis foundation + DynamicConfigStore read-through cache, P2 per-user request rate limiting, P3 per-chat-turn config caches. Plan only, no implementation yet. Co-Authored-By: Claude Opus 4.7 (1M context) --- REDIS_CACHING_PLAN.md | 243 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 243 insertions(+) create mode 100644 REDIS_CACHING_PLAN.md diff --git a/REDIS_CACHING_PLAN.md b/REDIS_CACHING_PLAN.md new file mode 100644 index 00000000000..754a3540222 --- /dev/null +++ b/REDIS_CACHING_PLAN.md @@ -0,0 +1,243 @@ +# Redis Caching & Scaling Plan + +**Goal:** expose the chat interface to a few hundred users. Evaluate and +introduce Redis-based caching where it makes sense, alongside the scaling +work that actually gates that user count. + +**Status:** plan only — no code yet. Follows the fork's plan template +(Issues / Important Notes / Strategy / Tests). Treat each phase as an +independently shippable PR. + +--- + +## Context & key findings + +- **This fork has zero Redis today.** The only references in the repo are + comments in `db/index_attempt.py` and `db/retention.py` explaining how + the fork *avoids* Redis (Postgres advisory locks instead of fences, + Postgres as the Celery broker). Adding Redis is **net-new infrastructure**, + which AGENTS.md flags as a substantial dependency, not a drive-by. +- **The real near-term scaling ceiling is the DB connection pool, not the + DB's query throughput.** `db/engine.py:72` sets `pool_size=40, + max_overflow=10` → 50 connections per api_server process. `get_session` + (`db/engine.py:94`) yields one session held for the **whole request**, and + `/send-message` (`server/query_and_chat/chat_backend.py:276`, + `handle_new_chat_message`) returns a `StreamingResponse` — so a connection + is pinned for the entire LLM stream (10–60s). At a few hundred users this + exhausts the pool before query volume ever stresses Postgres. **No cache + fixes this.** +- **A rate limiter already exists but is the wrong kind.** + `server/query_and_chat/token_limit.py::check_token_rate_limits` enforces a + *token-budget* limit (global, DB-backed, EE-overridable). It is not a + *request-rate* limiter and `any_rate_limit_exists()` is gated by a + per-process `@lru_cache` (`token_limit.py:122`) that won't reflect changes + across replicas. +- **The highest-leverage cache seam already exists:** the + `DynamicConfigStore` abstraction (`dynamic_configs/interface.py`, + `store.py`, `factory.py`) is the fork's equivalent of upstream's + `PgRedisKVStore`. Wrapping it gives transparent, write-through-invalidated + caching for everything routed through it with zero call-site changes. + +### Non-goals (explicitly out of scope) + +- **Do not** move the Celery broker to Redis (stays on Postgres — deliberate + divergence). +- **Do not** replace indexing advisory-lock fences with Redis fences. +- **Do not** cache chat sessions / messages (too mutable; correctness risk). +- **Do not** cache LLM/embedding *responses* (semantic/correctness risk). +- **Do not** add tenant key-prefixing — this fork is single-tenant. + +--- + +## Phase summary + +| Phase | What | Caching? | Gates the user count? | +|---|---|---|---| +| **P0** | Connection-pool / session-holding fix + multi-replica | No | **Yes — do first** | +| **P1** | Redis foundation + `DynamicConfigStore` read-through cache | Yes (flagship) | Enables the rest | +| **P2** | Redis-backed per-user request rate limiting | No (protection) | Yes, for cost/abuse | +| **P3** | Per-chat-turn config caches (LLM provider, embedding settings) | Yes | Measured add-on | +| **Opt** | Document sets, connector OAuth/API caches, Redis sessions | Yes | Situational | + +--- + +## P0 — Connection pool & session lifetime (prerequisite, not caching) + +### Issues to address +At a few hundred users, concurrent streaming chats pin all 50 connections +per process; unrelated (even cached) requests then queue. This is the first +thing that breaks. + +### Important notes +- `handle_new_chat_message` holds `Depends(get_session)` for the full + `StreamingResponse`. The fix is to scope DB work to *before* the stream + starts (load everything needed, commit the user message), then run the + stream without a pinned pooled connection, opening short-lived sessions + only for the final persistence write. +- This touches the core chat path — **per CLAUDE.md, confirm with the human + and verify the worker/stream boots cleanly** before/after. High blast + radius; ship as its own PR with manual load verification. +- Independently: run **multiple api_server replicas** (k8s) behind nginx, + and size `pool_size` against Postgres `max_connections` ÷ replica count. + +### Implementation strategy +1. Audit `handle_new_chat_message` and the `process_message` generator for + what truly needs the session during streaming vs. before it. +2. Introduce a pattern where the streaming generator uses + `get_session_context_manager()` for short writes rather than the + request-scoped `Depends(get_session)`. +3. Bump replica count in + `darwin-kubernetes/api_server-service-deployment.yaml`; re-tune pool. + +### Tests +- Load test: N concurrent streaming chats (N > pool size) — confirm + non-chat endpoints (settings, session list) stay responsive. +- Verify no `QueuePool limit ... connection timed out` under load. + +--- + +## P1 — Redis foundation + DynamicConfigStore cache (flagship) + +### Issues to address +Cache the highest-frequency, fires-on-every-page reads (settings, tokens, +invited users) at one central, low-risk seam, with correct cross-replica +invalidation. + +### Important notes +- Mirror upstream's `PgRedisKVStore` *shape* but fit this fork's interface: + `store(key, val, encrypt)` / `load(key)` / `delete(key)` raising + `ConfigNotFoundError` (`dynamic_configs/interface.py`). +- Write-through invalidation is **free** here — the same `store()`/`delete()` + that writes Postgres updates/clears Redis, so all replicas see changes. +- Single-tenant → plain key prefix (e.g. `danswer_kv:`), no tenant wrapper. +- Redis must be **fail-open**: if Redis is down, fall back to Postgres so an + outage degrades latency, not availability. + +### Implementation strategy +1. **Dependency:** add `redis==` to `backend/requirements/default.txt`. +2. **Config:** add `REDIS_HOST/REDIS_PORT/REDIS_PASSWORD/REDIS_DB_NUMBER` + to `configs/app_configs.py` via the existing `os.environ.get` pattern; + add a `REDIS_KEY_VALUE_CACHE_TTL` (default ~1 day, mirroring upstream). +3. **Client module:** new `backend/danswer/redis/redis_pool.py` — a + `ConnectionPool` singleton + `get_redis_client()`. (Upstream's + `redis_pool.py` is the template, minus IAM/tenant code.) +4. **Cache layer:** add `CachedDynamicConfigStore` (decorator/wrapper around + `PostgresBackedDynamicConfigStore`) in `dynamic_configs/store.py`, or add + Redis read-through directly to the PG store. `load()` checks Redis → + misses fall to PG and repopulate; `store()`/`delete()` write PG then + set/clear Redis. Route it via `dynamic_configs/factory.py` + (`get_dynamic_config_store`) behind a `DYNAMIC_CONFIG_STORE` value so it's + toggleable. +5. **Deployment:** Redis statefulset + service in `darwin-kubernetes/`; + redis service in `deployment/docker_compose/docker-compose.dev.yml`; + wire env in `env-configmap.yaml` + password in `secrets.yaml`. + +### What this transparently caches +Everything through `get_dynamic_config_store()`: app settings +(`server/settings/store.py`, key `danswer_settings`), Slack bot tokens, +invited users, telemetry id, Gmail/GDrive connector-auth blobs. + +### Tests +- Unit: `load` hits Redis on 2nd call (mock PG, assert one PG query); + `store`/`delete` invalidate; Redis-down path falls back to PG (fail-open). +- Integration: change settings via the admin endpoint → second replica (or + fresh client) reads the new value without a TTL wait. + +--- + +## P2 — Redis-backed per-user request rate limiting (protection) + +### Issues to address +A few hundred users on chat = real risk of runaway LLM **cost** and hitting +the **provider's** rate limits. Need per-user request-rate limiting that is +correct across replicas (in-memory counters let through ~N× at N pods). + +### Important notes +- This **complements**, does not replace, the existing token-budget limiter + in `token_limit.py`. Keep that; add request-rate limiting on top. +- Also fixes the latent multi-replica issue: the per-process `@lru_cache` on + `any_rate_limit_exists()` (`token_limit.py:122`) can be made Redis-backed + or given a short TTL so all pods agree. +- No rate-limit middleware exists today (only `latency_logging.py`) — this is + net-new. fastapi 0.109.2 is compatible with `fastapi-limiter` or a small + custom `incr`+`expire` limiter. + +### Implementation strategy +1. Add a Redis counter limiter: key `ratelimit:msg:{user_id}:{bucket}`, + atomic `incr` + `expire(window, NX)` (or a small Lua script for + multi-tier limits). Reuse the P1 `redis_pool` client. +2. Apply at the chat entrypoint (`/send-message`) as a dependency, before any + LLM work; raise `HTTPException(429)` (this fork uses `HTTPException`, not + `OnyxError`). +3. Make limits env-configurable in `configs/app_configs.py` (per-user + per-minute / per-hour). Default off via env so it's opt-in per environment. + +### Tests +- Unit: counter increments/expires; exceeds → 429. +- Integration: two clients simulating two replicas share the same limit + (single Redis), confirm the aggregate cap holds. + +--- + +## P3 — Per-chat-turn config caches (measured add-on) + +### Issues to address +Fired on every chat turn × hundreds of users → meaningful aggregate even +though each query is cheap. + +### Important notes +- **Cache the serialized Pydantic snapshot, not the ORM object** — these + return SQLAlchemy models with lazy relationships; caching the ORM instance + risks `DetachedInstanceError` / stale relationship reads. +- Invalidation is **not** free here (unlike P1) — must add explicit + bust/refresh calls inside the relevant `db/` mutation functions. This is + the added surface area; only do it after P0/P1 and after measuring. + +### Implementation strategy +- **Default LLM provider:** cache `db/llm.py::fetch_default_provider` / + `fetch_existing_llm_providers`; invalidate in the provider create/update/ + delete paths in `db/llm.py` and the admin endpoint. +- **Current embedding/search settings:** cache + `db/embedding_model.py::get_current_db_embedding_model`; invalidate on + index-swap (when a new `EmbeddingModel` becomes `PRESENT`). +- Use short TTLs as a backstop even with explicit invalidation. + +### Tests +- Unit: cached fetch returns snapshot; mutation path clears it. +- Integration: change default provider → chat picks it up without restart. + +--- + +## Optional / deferred + +| Item | Where | Note | +|---|---|---| +| **Document sets** | `db/document_set.py::fetch_document_sets` | Global key in this fork (base version ignores `user_id`); write-through on the ~5 mutation fns. Admin-page frequency, modest win. | +| **Connector OAuth / external-API caches** | per-connector (cf. upstream Confluence/Slack) | Only if those connectors are active; cuts external rate-limit pressure. Short TTL. | +| **Redis auth sessions** | `auth/users.py` (fastapi-users RedisStrategy) | Offloads per-request auth from Postgres; bigger change + security/invalidation care. Defer until auth DB load shows up. `SESSION_EXPIRE_TIME_SECONDS` already exists. | +| **Personas list** | — | **Skip backend cache** (per-user + group-membership invalidation trap). Use frontend (SWR) caching instead. | + +--- + +## Cross-cutting + +### New files / touched files +- New: `backend/danswer/redis/redis_pool.py`, Redis k8s manifests. +- Touched: `requirements/default.txt`, `configs/app_configs.py`, + `dynamic_configs/store.py`, `dynamic_configs/factory.py`, + `docker-compose.dev.yml`, `darwin-kubernetes/{env-configmap,secrets}.yaml`, + `api_server-service-deployment.yaml` (replicas). P2/P3 touch + `chat_backend.py`, `token_limit.py`, `db/llm.py`, `db/embedding_model.py`. + +### Restart / bounce list (per CLAUDE.md) +- New env vars / requirements → rebuild + restart api_server (`dapi`), + background jobs (`dbe`), Slack listener (`dsl`). +- `redis` dependency add → `pip install` in the venv before running. + +### Open questions for the human +1. **P0 session-refactor sign-off** — high blast radius on the chat path; + confirm approach + manual load verification before merge. +2. Redis deployment shape in `darwin-kubernetes` — single statefulset vs. + managed Redis? Persistence needed (cache-only ⇒ probably not)? +3. Default rate-limit values for P2 (per-user/min, per-user/hour). +4. Sequencing: is P0 acceptable to do in parallel with P1, or strictly first? From da37d15c408cfa2bade784ea95462cd4b03644ee Mon Sep 17 00:00:00 2001 From: rajiv chodisetti Date: Thu, 28 May 2026 17:10:05 +0530 Subject: [PATCH 002/102] P1: Redis foundation + read-through KV cache MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Foundation for caching/rate-limiting work (see REDIS_CACHING_PLAN.md). This commit only ships the cache piece — no behavioural change unless REDIS_KV_CACHE_ENABLED=true is set. * requirements: pin redis==5.0.8. * configs/app_configs.py: REDIS_HOST/PORT/PASSWORD/DB_NUMBER/SSL, REDIS_POOL_MAX_CONNECTIONS, REDIS_HEALTH_CHECK_INTERVAL, REDIS_SOCKET_TIMEOUT_SECONDS; cache toggle REDIS_KV_CACHE_ENABLED (default OFF) and REDIS_KV_CACHE_TTL_SECONDS (1 day). * danswer/redis/redis_pool.py: lazy ConnectionPool singleton + get_redis_client() helper. Single-tenant — DANSWER_REDIS_KEY_PREFIX is the only namespace; upstream's TenantRedisClient is intentionally not ported. * dynamic_configs/store.py: RedisCachedDynamicConfigStore wraps any inner DynamicConfigStore with read-through / write-through caching. Inner store stays the source of truth (writes inner first), encrypted values are NEVER cached plaintext (just invalidated), every Redis call is fail-open so an outage degrades latency, not availability. * dynamic_configs/factory.py: when REDIS_KV_CACHE_ENABLED, transparently wraps the existing PostgresBackedDynamicConfigStore — call sites unchanged. * Deployment: redis service in docker-compose.dev.yml (cache-only: no AOF, no RDB snapshots, allkeys-lru @ 256mb so a runaway producer can't OOM the node). darwin-kubernetes/redis-statefulset.yaml mirrors that posture. REDIS_HOST etc. in env-configmap; REDIS_PASSWORD wired via optional secretKeyRef so the deployments still boot when Redis is unauth'd. NOT the Celery broker — that stays on Postgres by design. * backend/.gitignore: ignore stray pywikibot apicache/throttle.ctrl artifacts dropped by the existing mediawiki test. Tests (unittest, no real Redis required — mocks at the get_redis_client boundary): - tests/.../redis_layer/test_redis_pool.py: pool singleton, prefix constant, reset_pool_for_tests. - tests/.../dynamic_configs/test_redis_cached_store.py: read-through, write-through invalidation, TTL on SET, cached-None vs miss, not-found NOT cached, encrypted values not mirrored, corrupt entry treated as miss, fail-open on Redis errors. 13 cases. Co-Authored-By: Claude Opus 4.7 (1M context) --- backend/.gitignore | 6 + backend/danswer/configs/app_configs.py | 25 ++ backend/danswer/dynamic_configs/factory.py | 20 +- backend/danswer/dynamic_configs/store.py | 125 +++++++ backend/danswer/redis/__init__.py | 0 backend/danswer/redis/redis_pool.py | 110 ++++++ backend/requirements/default.txt | 1 + .../unit/danswer/dynamic_configs/__init__.py | 0 .../test_redis_cached_store.py | 353 ++++++++++++++++++ .../unit/danswer/redis_layer/__init__.py | 0 .../danswer/redis_layer/test_redis_pool.py | 87 +++++ .../api_server-service-deployment.yaml | 10 + darwin-kubernetes/background-deployment.yaml | 9 + darwin-kubernetes/env-configmap.yaml | 17 + darwin-kubernetes/redis-statefulset.yaml | 78 ++++ darwin-kubernetes/secrets.yaml | 6 + .../docker_compose/docker-compose.dev.yml | 25 ++ 17 files changed, 871 insertions(+), 1 deletion(-) create mode 100644 backend/danswer/redis/__init__.py create mode 100644 backend/danswer/redis/redis_pool.py create mode 100644 backend/tests/unit/danswer/dynamic_configs/__init__.py create mode 100644 backend/tests/unit/danswer/dynamic_configs/test_redis_cached_store.py create mode 100644 backend/tests/unit/danswer/redis_layer/__init__.py create mode 100644 backend/tests/unit/danswer/redis_layer/test_redis_pool.py create mode 100644 darwin-kubernetes/redis-statefulset.yaml diff --git a/backend/.gitignore b/backend/.gitignore index 6b3219cc30e..f166ca65e72 100644 --- a/backend/.gitignore +++ b/backend/.gitignore @@ -9,3 +9,9 @@ api_keys.py vespa-app.zip dynamic_config_storage/ celerybeat-schedule* + +# Pywikibot drops these in cwd when the mediawiki connector test runs. +# Local debugging artifacts, not source. +apicache/ +throttle.ctrl +.pytest_cache/ diff --git a/backend/danswer/configs/app_configs.py b/backend/danswer/configs/app_configs.py index 321a8bdca01..d969b39d2d9 100644 --- a/backend/danswer/configs/app_configs.py +++ b/backend/danswer/configs/app_configs.py @@ -307,6 +307,31 @@ ) +##### +# Redis (cache + rate limiting) +##### +# Connection details. All env-driven; safe defaults for local dev. +REDIS_HOST = os.environ.get("REDIS_HOST") or "localhost" +REDIS_PORT = int(os.environ.get("REDIS_PORT") or 6379) +REDIS_PASSWORD = os.environ.get("REDIS_PASSWORD") or "" +REDIS_DB_NUMBER = int(os.environ.get("REDIS_DB_NUMBER") or 0) +REDIS_SSL = os.environ.get("REDIS_SSL", "").lower() == "true" +REDIS_POOL_MAX_CONNECTIONS = int(os.environ.get("REDIS_POOL_MAX_CONNECTIONS") or 50) +REDIS_HEALTH_CHECK_INTERVAL = int(os.environ.get("REDIS_HEALTH_CHECK_INTERVAL") or 60) +REDIS_SOCKET_TIMEOUT_SECONDS = int(os.environ.get("REDIS_SOCKET_TIMEOUT_SECONDS") or 3) + +# Read-through KV cache layered atop PostgresBackedDynamicConfigStore. +# When false (default), the store behaves exactly as before; when true, +# reads check Redis first and writes/deletes invalidate Redis. Fail-open: +# Redis errors degrade to direct Postgres, never an outage. +REDIS_KV_CACHE_ENABLED = ( + os.environ.get("REDIS_KV_CACHE_ENABLED", "").lower() == "true" +) +# TTL (seconds) for KV entries cached in Redis (1 day default). +REDIS_KV_CACHE_TTL_SECONDS = int( + os.environ.get("REDIS_KV_CACHE_TTL_SECONDS") or 86400 +) + ##### # Enterprise Edition Configs ##### diff --git a/backend/danswer/dynamic_configs/factory.py b/backend/danswer/dynamic_configs/factory.py index 44b6e096b6d..60e46950df0 100644 --- a/backend/danswer/dynamic_configs/factory.py +++ b/backend/danswer/dynamic_configs/factory.py @@ -1,15 +1,33 @@ from danswer.configs.app_configs import DYNAMIC_CONFIG_STORE +from danswer.configs.app_configs import REDIS_KV_CACHE_ENABLED +from danswer.configs.app_configs import REDIS_KV_CACHE_TTL_SECONDS from danswer.dynamic_configs.interface import DynamicConfigStore from danswer.dynamic_configs.store import FileSystemBackedDynamicConfigStore from danswer.dynamic_configs.store import PostgresBackedDynamicConfigStore +from danswer.dynamic_configs.store import RedisCachedDynamicConfigStore def get_dynamic_config_store() -> DynamicConfigStore: + """Resolve the configured KV store. + + The Postgres-backed store is the source of truth. When + ``REDIS_KV_CACHE_ENABLED`` is true, we transparently wrap it with a + read-through / write-through Redis cache — call sites are unchanged. + Wrapping is additive on top of the configured backend rather than a + distinct ``DYNAMIC_CONFIG_STORE`` value so "which backend" and "do I + cache" are independently controllable. + """ dynamic_config_store_type = DYNAMIC_CONFIG_STORE if dynamic_config_store_type == FileSystemBackedDynamicConfigStore.__name__: raise NotImplementedError("File based config store no longer supported") if dynamic_config_store_type == PostgresBackedDynamicConfigStore.__name__: - return PostgresBackedDynamicConfigStore() + inner: DynamicConfigStore = PostgresBackedDynamicConfigStore() + if REDIS_KV_CACHE_ENABLED: + return RedisCachedDynamicConfigStore( + inner=inner, + ttl_seconds=REDIS_KV_CACHE_TTL_SECONDS, + ) + return inner # TODO: change exception type raise Exception("Unknown dynamic config store type") diff --git a/backend/danswer/dynamic_configs/store.py b/backend/danswer/dynamic_configs/store.py index ee4ac3d09ae..59df2c807dd 100644 --- a/backend/danswer/dynamic_configs/store.py +++ b/backend/danswer/dynamic_configs/store.py @@ -1,11 +1,14 @@ import json import os +from collections.abc import Callable from collections.abc import Iterator from contextlib import contextmanager from pathlib import Path from typing import cast from filelock import FileLock +from redis import Redis +from redis import RedisError from sqlalchemy.orm import Session from danswer.db.engine import SessionFactory @@ -13,6 +16,12 @@ from danswer.dynamic_configs.interface import ConfigNotFoundError from danswer.dynamic_configs.interface import DynamicConfigStore from danswer.dynamic_configs.interface import JSON_ro +from danswer.redis.redis_pool import DANSWER_REDIS_KEY_PREFIX +from danswer.redis.redis_pool import get_redis_client +from danswer.utils.logger import setup_logger + + +logger = setup_logger() FILE_LOCK_TIMEOUT = 10 @@ -99,3 +108,119 @@ def delete(self, key: str) -> None: if result == 0: raise ConfigNotFoundError session.commit() + + +class RedisCachedDynamicConfigStore(DynamicConfigStore): + """Read-through / write-through Redis cache over an inner ``DynamicConfigStore``. + + Mirrors the shape of upstream Onyx's ``PgRedisKVStore`` but composed + via wrapping rather than inheritance so the inner store stays + single-purpose and the cache layer can wrap *any* future backend. + + Semantics: + * ``load``: probe Redis; on hit, return; on miss, read inner and + repopulate Redis (with TTL). Encrypted entries are never cached + plaintext — they always fall through to the inner store. + * ``store``: write the inner store first (source of truth), then + refresh Redis (or invalidate if ``encrypt=True``). The + inner-first order means a Redis success after an inner failure + cannot leave Redis holding a value the source of truth lacks. + * ``delete``: delete inner first, then Redis. Same ordering reason. + + Fail-open: every Redis operation is wrapped — a Redis outage degrades + latency, not availability. Wrap-then-log-then-fall-through is the + rule throughout. + + Single-tenant: keys are namespaced by :data:`_KEY_PREFIX` only (no + tenant id), reflecting this fork's divergence from upstream. + """ + + _KEY_PREFIX = DANSWER_REDIS_KEY_PREFIX + "kv:" + + def __init__( + self, + inner: DynamicConfigStore, + ttl_seconds: int, + client_factory: Callable[[], Redis] | None = None, + ) -> None: + self._inner = inner + self._ttl = ttl_seconds + # Indirection lets tests inject a fake client without monkey- + # patching the global pool. Production wiring uses the default. + self._client_factory = client_factory or get_redis_client + + # ---- DynamicConfigStore surface ---- + + def store(self, key: str, val: JSON_ro, encrypt: bool = False) -> None: + self._inner.store(key, val, encrypt=encrypt) + if encrypt: + # Never hold plaintext of an encrypted value in Redis — that + # would silently defeat the encryption-at-rest guarantee. + # Also invalidates any stale plaintext entry from before + # the value was switched to encrypted. + self._safe_redis_delete(key) + return + self._safe_redis_set(key, val) + + def load(self, key: str) -> JSON_ro: + hit, cached = self._safe_redis_get(key) + if hit: + return cached + # May raise ConfigNotFoundError — propagate without caching the miss. + # (Negative caching has its own correctness traps; skip for now.) + val = self._inner.load(key) + self._safe_redis_set(key, val) + return val + + def delete(self, key: str) -> None: + self._inner.delete(key) + self._safe_redis_delete(key) + + # ---- private Redis helpers (all fail-open) ---- + + def _redis_key(self, key: str) -> str: + return self._KEY_PREFIX + key + + def _safe_redis_get(self, key: str) -> tuple[bool, JSON_ro]: + """Return ``(hit, value)``. ``hit=False`` means cache miss OR + Redis error — caller treats both the same (read inner). + """ + try: + raw = self._client_factory().get(self._redis_key(key)) + except RedisError as e: + logger.warning("Redis GET failed for kv key=%s: %s", key, e) + return (False, None) + if raw is None: + return (False, None) + try: + return (True, cast(JSON_ro, json.loads(raw))) + except (TypeError, ValueError) as e: + # Corrupt or legacy-format entry — treat as a miss so the + # next read repopulates from the inner store. + logger.warning("Corrupt Redis kv entry for key=%s, ignoring: %s", key, e) + return (False, None) + + def _safe_redis_set(self, key: str, val: JSON_ro) -> None: + try: + payload = json.dumps(val) + except (TypeError, ValueError) as e: + # Caller stored a value the inner store accepts but JSON + # doesn't — log and skip the cache (inner still holds truth). + logger.warning( + "Skipping Redis cache for non-JSON-serialisable key=%s: %s", key, e + ) + return + try: + self._client_factory().set( + self._redis_key(key), + payload, + ex=self._ttl, + ) + except RedisError as e: + logger.warning("Redis SET failed for kv key=%s: %s", key, e) + + def _safe_redis_delete(self, key: str) -> None: + try: + self._client_factory().delete(self._redis_key(key)) + except RedisError as e: + logger.warning("Redis DEL failed for kv key=%s: %s", key, e) diff --git a/backend/danswer/redis/__init__.py b/backend/danswer/redis/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/danswer/redis/redis_pool.py b/backend/danswer/redis/redis_pool.py new file mode 100644 index 00000000000..567bb0a5160 --- /dev/null +++ b/backend/danswer/redis/redis_pool.py @@ -0,0 +1,110 @@ +"""Process-local Redis client and connection pool. + +This fork is single-tenant, so there is no per-tenant key prefixing +(upstream Onyx's ``TenantRedisClient`` is intentionally not ported). +Instead, all keys written by this codebase must namespace themselves +under :data:`DANSWER_REDIS_KEY_PREFIX` so a shared Redis (or a +later multi-app deployment) does not collide. + +The pool is lazily built on first use and reused for the life of the +process. The :func:`get_redis_client` helper hands out a thin ``Redis`` +wrapper around the shared pool — cheap to call repeatedly, no need to +cache the result at call sites. + +Errors are NOT swallowed here. Callers that want to fail open (e.g. the +KV cache layer, the rate limiter) wrap their own try/except — that +choice belongs to the caller, not the connection helper. +""" +from __future__ import annotations + +import threading +from typing import Any + +import redis +from redis import ConnectionPool +from redis import Redis + +from danswer.configs.app_configs import REDIS_DB_NUMBER +from danswer.configs.app_configs import REDIS_HEALTH_CHECK_INTERVAL +from danswer.configs.app_configs import REDIS_HOST +from danswer.configs.app_configs import REDIS_PASSWORD +from danswer.configs.app_configs import REDIS_POOL_MAX_CONNECTIONS +from danswer.configs.app_configs import REDIS_PORT +from danswer.configs.app_configs import REDIS_SOCKET_TIMEOUT_SECONDS +from danswer.configs.app_configs import REDIS_SSL +from danswer.utils.logger import setup_logger + + +logger = setup_logger() + +# Every key written by this codebase MUST start with this prefix. Sub-modules +# append their own namespace (e.g. ``DANSWER_REDIS_KEY_PREFIX + "kv:"`` in the +# KV cache). Keeping the namespace centralised here avoids the +# "two callers picked the same key by accident" footgun. +DANSWER_REDIS_KEY_PREFIX = "danswer:" + + +_pool: ConnectionPool | None = None +_pool_lock = threading.Lock() + + +def _build_pool() -> ConnectionPool: + """Construct the connection pool from the current env-driven config. + + Kept private so callers can't accidentally instantiate parallel pools. + """ + kwargs: dict[str, Any] = { + "host": REDIS_HOST, + "port": REDIS_PORT, + "db": REDIS_DB_NUMBER, + "max_connections": REDIS_POOL_MAX_CONNECTIONS, + "health_check_interval": REDIS_HEALTH_CHECK_INTERVAL, + "socket_timeout": REDIS_SOCKET_TIMEOUT_SECONDS, + "socket_connect_timeout": REDIS_SOCKET_TIMEOUT_SECONDS, + "socket_keepalive": True, + "retry_on_timeout": True, + # We store JSON / counters as bytes; consumers decode as needed. + # decode_responses=False keeps us out of accidental str/bytes mixups. + "decode_responses": False, + } + if REDIS_PASSWORD: + kwargs["password"] = REDIS_PASSWORD + if REDIS_SSL: + # SSLConnection picks up REDIS_SSL_CA_CERTS / REDIS_SSL_CERT_REQS + # from env via the redis-py default — extend here if needed. + kwargs["connection_class"] = redis.SSLConnection + + logger.info( + "Building Redis ConnectionPool host=%s port=%s db=%s ssl=%s max=%s", + REDIS_HOST, + REDIS_PORT, + REDIS_DB_NUMBER, + REDIS_SSL, + REDIS_POOL_MAX_CONNECTIONS, + ) + return ConnectionPool(**kwargs) + + +def get_redis_client() -> Redis: + """Return a thin Redis client backed by the shared, lazily-built pool. + + Safe to call from any thread; uses double-checked locking so the pool + is constructed exactly once per process. + """ + global _pool + if _pool is None: + with _pool_lock: + if _pool is None: + _pool = _build_pool() + return Redis(connection_pool=_pool) + + +def reset_pool_for_tests() -> None: + """Drop the cached pool so the next ``get_redis_client`` rebuilds it. + + Tests only — never call this in production code. Lets a test mutate + env vars (host/port/etc.) and observe the effect on the next call. + """ + global _pool + with _pool_lock: + _pool = None diff --git a/backend/requirements/default.txt b/backend/requirements/default.txt index 8391736906f..6d2cad0f47a 100644 --- a/backend/requirements/default.txt +++ b/backend/requirements/default.txt @@ -61,6 +61,7 @@ retry==0.9.2 # This pulls in py which is in CVE-2022-42969, must remove py from rfc3986==1.5.0 rt==3.1.2 simple-salesforce==1.12.6 +redis==5.0.8 slack-sdk==3.20.2 SQLAlchemy[mypy]==2.0.15 starlette==0.36.3 diff --git a/backend/tests/unit/danswer/dynamic_configs/__init__.py b/backend/tests/unit/danswer/dynamic_configs/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/dynamic_configs/test_redis_cached_store.py b/backend/tests/unit/danswer/dynamic_configs/test_redis_cached_store.py new file mode 100644 index 00000000000..2dc551248fd --- /dev/null +++ b/backend/tests/unit/danswer/dynamic_configs/test_redis_cached_store.py @@ -0,0 +1,353 @@ +"""Unit tests for ``RedisCachedDynamicConfigStore`` — the read-through / +write-through Redis cache wrapper around any ``DynamicConfigStore``. + +The behaviour we lock down here is the contract the rest of the app +relies on: + + 1. **Read-through:** first ``load`` reads the inner store and + repopulates Redis; subsequent ``load``s hit Redis only and never + touch the inner store. + 2. **Write-through:** ``store`` writes the inner store first, then + refreshes Redis with the new value + TTL so other replicas see + the change without waiting for the TTL to expire. + 3. **Delete invalidates:** ``delete`` removes the inner row and clears + Redis. Inner store is removed first — a Redis success that arrives + before an inner failure must not leave Redis caching a value the + source of truth no longer has. + 4. **Fail-open:** any ``RedisError`` is logged and silently swallowed. + ``GET`` failures become misses; ``SET``/``DEL`` failures don't + bubble up. The point is that a Redis outage degrades latency, not + availability. + 5. **Encrypted values are never cached plaintext.** ``store(..., encrypt=True)`` + invalidates the Redis entry rather than writing plaintext into it, + so the encryption-at-rest guarantee isn't silently bypassed. + 6. **Cache miss vs None:** the wrapper distinguishes "Redis returned + ``nil``" (miss) from "Redis returned the JSON literal ``null``" + (cached None value). Both look like Python ``None`` if you're + careless; we test that a cached ``None`` is served from Redis + without re-hitting the inner store. + +The inner store and the Redis client are mocks — no real Postgres or +Redis required. +""" +from __future__ import annotations + +import json +import unittest +from typing import Any +from unittest.mock import MagicMock + +from redis import RedisError + +from danswer.dynamic_configs.interface import ConfigNotFoundError +from danswer.dynamic_configs.store import RedisCachedDynamicConfigStore + + +# We can't import the real prefix without importing redis_pool, which +# imports app_configs and is fine — but expressing it here documents +# the on-disk key shape we expect. +_EXPECTED_PREFIX = "danswer:kv:" + + +def _make_inner() -> MagicMock: + """Inner DynamicConfigStore mock with the methods we exercise.""" + inner = MagicMock() + inner.store = MagicMock() + inner.load = MagicMock() + inner.delete = MagicMock() + return inner + + +def _make_redis() -> MagicMock: + """In-memory fake Redis covering get/set/delete only. + + Stores raw bytes the same way redis-py does, so JSON encode/decode + in the wrapper actually runs. + """ + storage: dict[str, bytes] = {} + + fake = MagicMock() + + def fake_get(key: str) -> bytes | None: + return storage.get(key) + + def fake_set(key: str, value: Any, ex: int | None = None) -> bool: + if isinstance(value, str): + storage[key] = value.encode("utf-8") + elif isinstance(value, bytes): + storage[key] = value + else: + storage[key] = str(value).encode("utf-8") + # ``ex`` is observed via the mock for the TTL assertion below. + return True + + def fake_delete(*keys: str) -> int: + removed = 0 + for k in keys: + if k in storage: + del storage[k] + removed += 1 + return removed + + fake.get.side_effect = fake_get + fake.set.side_effect = fake_set + fake.delete.side_effect = fake_delete + fake._storage = storage # expose for assertions + return fake + + +class TestRedisCachedDynamicConfigStore(unittest.TestCase): + # ------------- read-through ------------- + + def test_load_miss_then_hit_only_one_inner_read(self) -> None: + """First ``load`` is a Redis miss → falls through to the inner + store and repopulates Redis. The second ``load`` must serve from + Redis alone — the inner store must NOT be touched again. This + is the central performance promise of the cache. + """ + inner = _make_inner() + inner.load.return_value = {"feature_flag": True} + redis = _make_redis() + store = RedisCachedDynamicConfigStore( + inner=inner, ttl_seconds=60, client_factory=lambda: redis + ) + + first = store.load("settings") + second = store.load("settings") + + self.assertEqual(first, {"feature_flag": True}) + self.assertEqual(second, {"feature_flag": True}) + self.assertEqual( + inner.load.call_count, + 1, + "second load must come from Redis, not the inner store", + ) + + def test_load_populates_redis_with_ttl(self) -> None: + """On a miss, the wrapper must SET into Redis with an expiry — + otherwise the cache would never evict and a value written by + another pod would be served stale forever. + """ + inner = _make_inner() + inner.load.return_value = {"a": 1} + redis = _make_redis() + store = RedisCachedDynamicConfigStore( + inner=inner, ttl_seconds=120, client_factory=lambda: redis + ) + + store.load("k1") + + redis.set.assert_called_once() + args, kwargs = redis.set.call_args + self.assertEqual(args[0], _EXPECTED_PREFIX + "k1") + # JSON-serialised payload, with the TTL kwarg matching the ctor. + self.assertEqual(json.loads(args[1]), {"a": 1}) + self.assertEqual(kwargs.get("ex"), 120) + + def test_load_propagates_not_found_without_caching_miss(self) -> None: + """If the inner store has nothing, the wrapper must raise + ``ConfigNotFoundError`` and NOT cache the absence — negative + caching has its own correctness gotchas (a later ``store`` would + race with the stale "missing" entry), and the plan deliberately + defers it. + """ + inner = _make_inner() + inner.load.side_effect = ConfigNotFoundError + redis = _make_redis() + store = RedisCachedDynamicConfigStore( + inner=inner, ttl_seconds=60, client_factory=lambda: redis + ) + + with self.assertRaises(ConfigNotFoundError): + store.load("absent") + # No SET — we didn't cache the miss. + redis.set.assert_not_called() + + def test_cached_none_is_distinguished_from_miss(self) -> None: + """``None`` is a legal stored value (the KV store can hold a + JSON ``null``). We must serve a cached ``null`` without falling + through to the inner store — otherwise every read of a None + value is effectively uncached. + """ + inner = _make_inner() + inner.load.return_value = None + redis = _make_redis() + store = RedisCachedDynamicConfigStore( + inner=inner, ttl_seconds=60, client_factory=lambda: redis + ) + + first = store.load("nullable") # miss → inner → cache + second = store.load("nullable") # hit + + self.assertIsNone(first) + self.assertIsNone(second) + self.assertEqual( + inner.load.call_count, + 1, + "second load of a cached None must hit Redis, not the inner store", + ) + + # ------------- write-through / invalidation ------------- + + def test_store_writes_inner_then_refreshes_redis(self) -> None: + """``store`` must write the inner store first (source of truth), + then refresh Redis. Order matters: a Redis success after an + inner failure would leave Redis ahead of the source of truth. + """ + inner = _make_inner() + redis = _make_redis() + call_order: list[str] = [] + inner.store.side_effect = lambda *a, **kw: call_order.append("inner") + # Wrap the existing side_effect to record set ordering. + original_set = redis.set.side_effect + + def recording_set(*a: Any, **kw: Any) -> Any: + call_order.append("redis") + return original_set(*a, **kw) + + redis.set.side_effect = recording_set + + store = RedisCachedDynamicConfigStore( + inner=inner, ttl_seconds=30, client_factory=lambda: redis + ) + store.store("settings", {"v": 7}) + + inner.store.assert_called_once_with("settings", {"v": 7}, encrypt=False) + self.assertEqual( + call_order, ["inner", "redis"], "inner store must be written first" + ) + # Subsequent load returns the new value from Redis only. + inner.load.reset_mock() + result = store.load("settings") + self.assertEqual(result, {"v": 7}) + inner.load.assert_not_called() + + def test_encrypted_store_invalidates_redis(self) -> None: + """``encrypt=True`` means "Postgres holds this encrypted." We + must NOT mirror plaintext into Redis (which has no encryption + guarantee), and we must invalidate any prior plaintext entry + in case the value was just switched to encrypted. + """ + inner = _make_inner() + redis = _make_redis() + # Pre-seed a stale plaintext entry to confirm it gets cleared. + redis._storage[_EXPECTED_PREFIX + "secret"] = b'"old"' + + store = RedisCachedDynamicConfigStore( + inner=inner, ttl_seconds=60, client_factory=lambda: redis + ) + store.store("secret", "new-value", encrypt=True) + + inner.store.assert_called_once_with("secret", "new-value", encrypt=True) + redis.set.assert_not_called() # no plaintext mirror + redis.delete.assert_called_once_with(_EXPECTED_PREFIX + "secret") + self.assertNotIn(_EXPECTED_PREFIX + "secret", redis._storage) + + def test_delete_clears_inner_and_redis(self) -> None: + """``delete`` clears both layers. Inner first — same ordering + invariant as ``store``: Redis must never be cleaner than the + source of truth. + """ + inner = _make_inner() + redis = _make_redis() + redis._storage[_EXPECTED_PREFIX + "k"] = b'{"x":1}' + call_order: list[str] = [] + inner.delete.side_effect = lambda *a, **kw: call_order.append("inner") + original_delete = redis.delete.side_effect + + def recording_delete(*a: Any, **kw: Any) -> Any: + call_order.append("redis") + return original_delete(*a, **kw) + + redis.delete.side_effect = recording_delete + + store = RedisCachedDynamicConfigStore( + inner=inner, ttl_seconds=60, client_factory=lambda: redis + ) + store.delete("k") + + inner.delete.assert_called_once_with("k") + self.assertEqual(call_order, ["inner", "redis"]) + self.assertNotIn(_EXPECTED_PREFIX + "k", redis._storage) + + # ------------- fail-open behaviour ------------- + + def test_redis_get_error_falls_through_to_inner(self) -> None: + """Redis ``GET`` exploding (timeout, conn refused, network + partition) must NOT propagate. The wrapper degrades to a plain + read against the inner store so a Redis outage costs latency, + not availability. + """ + inner = _make_inner() + inner.load.return_value = "from-postgres" + redis = MagicMock() + redis.get.side_effect = RedisError("connection refused") + + store = RedisCachedDynamicConfigStore( + inner=inner, ttl_seconds=60, client_factory=lambda: redis + ) + result = store.load("k") + + self.assertEqual(result, "from-postgres") + inner.load.assert_called_once_with("k") + + def test_redis_set_error_does_not_break_store(self) -> None: + """SET failing must not bubble out of ``store`` — the inner + write already succeeded, returning an error to the caller would + lie about the durability of the write. + """ + inner = _make_inner() + redis = MagicMock() + redis.set.side_effect = RedisError("OOM") + + store = RedisCachedDynamicConfigStore( + inner=inner, ttl_seconds=60, client_factory=lambda: redis + ) + # Must not raise. + store.store("k", {"v": 1}) + inner.store.assert_called_once() + + def test_corrupt_cache_entry_treated_as_miss(self) -> None: + """If something else wrote non-JSON bytes under our key (legacy + format, manual ``SET``, race during a schema change), the next + read must not crash — it must fall through to the inner store + and overwrite the corrupt entry on the next SET. + """ + inner = _make_inner() + inner.load.return_value = "ok" + redis = _make_redis() + redis._storage[_EXPECTED_PREFIX + "k"] = b"not-json-at-all" + + store = RedisCachedDynamicConfigStore( + inner=inner, ttl_seconds=60, client_factory=lambda: redis + ) + result = store.load("k") + + self.assertEqual(result, "ok") + inner.load.assert_called_once_with("k") + # Wrapper repopulated Redis with the good value. + self.assertEqual( + json.loads(redis._storage[_EXPECTED_PREFIX + "k"]), "ok" + ) + + def test_non_json_serialisable_value_skips_cache_but_inner_still_written(self) -> None: + """If a caller hands us a Python object json can't serialise + (sets, complex numbers, etc.), the inner store still gets it — + Redis just silently skips the cache write. The inner is the + source of truth; the cache is best-effort. + """ + inner = _make_inner() + redis = _make_redis() + store = RedisCachedDynamicConfigStore( + inner=inner, ttl_seconds=60, client_factory=lambda: redis + ) + + # set() is not JSON-serialisable. + store.store("k", {1, 2, 3}) # type: ignore[arg-type] + + inner.store.assert_called_once() + redis.set.assert_not_called() + + +if __name__ == "__main__": + unittest.main() diff --git a/backend/tests/unit/danswer/redis_layer/__init__.py b/backend/tests/unit/danswer/redis_layer/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/redis_layer/test_redis_pool.py b/backend/tests/unit/danswer/redis_layer/test_redis_pool.py new file mode 100644 index 00000000000..a50e8d8bd7b --- /dev/null +++ b/backend/tests/unit/danswer/redis_layer/test_redis_pool.py @@ -0,0 +1,87 @@ +"""Unit tests for ``danswer.redis.redis_pool``. + +We exercise only what can be verified without a real Redis server: + + 1. The pool is a process-wide singleton — calling ``get_redis_client`` + repeatedly does not build a new ``ConnectionPool`` each time. + 2. ``reset_pool_for_tests`` forces the next ``get_redis_client`` to + rebuild — important so other tests can swap env vars and observe + the change. + 3. The global key prefix is the documented value. If this ever + changes silently it would orphan every cached entry in production + on the next deploy; lock it down with a string equality assertion. + +Live socket-level behaviour (pool sizing, TCP timeouts, SSL handshake) +is intentionally out of scope here — those need an integration test +against a real Redis. +""" +from __future__ import annotations + +import unittest +from unittest.mock import patch + +from danswer.redis import redis_pool + + +class TestRedisPool(unittest.TestCase): + def setUp(self) -> None: + # Each test starts with a fresh, unbuilt pool so the singleton + # state from earlier tests can't bleed in. + redis_pool.reset_pool_for_tests() + + def tearDown(self) -> None: + redis_pool.reset_pool_for_tests() + + def test_prefix_is_stable(self) -> None: + """The on-the-wire key prefix is part of the persistence + contract — every cached entry in production starts with it. + Renaming it requires intentional migration, not a drive-by edit. + """ + self.assertEqual(redis_pool.DANSWER_REDIS_KEY_PREFIX, "danswer:") + + def test_pool_built_lazily_and_reused(self) -> None: + """``get_redis_client`` must build the pool on first use and + reuse it after. We assert this by counting calls to the pool + constructor under a patch. + """ + with patch.object( + redis_pool, "ConnectionPool", wraps=redis_pool.ConnectionPool + ) as mock_pool: + client_a = redis_pool.get_redis_client() + client_b = redis_pool.get_redis_client() + client_c = redis_pool.get_redis_client() + + self.assertEqual( + mock_pool.call_count, + 1, + "ConnectionPool should be constructed exactly once across " + "repeated get_redis_client() calls", + ) + # Different Redis() instances are fine — they share the pool. + self.assertIs( + client_a.connection_pool, + client_b.connection_pool, + "all clients must share the singleton pool", + ) + self.assertIs(client_b.connection_pool, client_c.connection_pool) + + def test_reset_for_tests_drops_singleton(self) -> None: + """After ``reset_pool_for_tests`` the next ``get_redis_client`` + must rebuild — otherwise tests can't observe config changes. + """ + with patch.object( + redis_pool, "ConnectionPool", wraps=redis_pool.ConnectionPool + ) as mock_pool: + redis_pool.get_redis_client() + redis_pool.reset_pool_for_tests() + redis_pool.get_redis_client() + + self.assertEqual( + mock_pool.call_count, + 2, + "reset_pool_for_tests should force the next call to rebuild", + ) + + +if __name__ == "__main__": + unittest.main() diff --git a/darwin-kubernetes/api_server-service-deployment.yaml b/darwin-kubernetes/api_server-service-deployment.yaml index 2959e1409a2..b8a55094717 100644 --- a/darwin-kubernetes/api_server-service-deployment.yaml +++ b/darwin-kubernetes/api_server-service-deployment.yaml @@ -69,6 +69,16 @@ spec: secretKeyRef: name: danswer-secrets key: user_auth_secret + # --- Redis (cache + rate limiting) --- + # Empty string is the no-auth default (matches the in-cluster Redis + # StatefulSet, which runs without requirepass). Switch to a populated + # secret when moving to a managed Redis that requires auth. + - name: REDIS_PASSWORD + valueFrom: + secretKeyRef: + name: danswer-secrets + key: redis_password + optional: true envFrom: - configMapRef: name: env-configmap diff --git a/darwin-kubernetes/background-deployment.yaml b/darwin-kubernetes/background-deployment.yaml index 538cf007bd4..63bd93c0c82 100644 --- a/darwin-kubernetes/background-deployment.yaml +++ b/darwin-kubernetes/background-deployment.yaml @@ -41,6 +41,15 @@ spec: secretKeyRef: key: postgres_password name: danswer-secrets + # --- Redis (cache + rate limiting) --- + # Optional so the background pod still boots when Redis is unauth'd. + # See api_server-service-deployment.yaml for the rationale. + - name: REDIS_PASSWORD + valueFrom: + secretKeyRef: + key: redis_password + name: danswer-secrets + optional: true envFrom: - configMapRef: name: env-configmap diff --git a/darwin-kubernetes/env-configmap.yaml b/darwin-kubernetes/env-configmap.yaml index eb0cfb25312..268f69b4b17 100644 --- a/darwin-kubernetes/env-configmap.yaml +++ b/darwin-kubernetes/env-configmap.yaml @@ -94,3 +94,20 @@ data: WEB_DOMAIN: "https://darwin.westeurope.cloudapp.azure.com" # for web server and api server DOMAIN: "darwin.westeurope.cloudapp.azure.com" # for nginx APPLY_MIGRATIONS: "true" + # Redis (cache + rate limiting). Host matches the in-cluster Service name in + # redis-statefulset.yaml. Password (if any) comes via secretKeyRef in the + # api_server / background deployments — keep it out of the configmap. + REDIS_HOST: "redis" + REDIS_PORT: "6379" + REDIS_DB_NUMBER: "0" + REDIS_SSL: "" + # Read-through KV cache on the DynamicConfigStore. Default OFF — flip to + # "true" once Redis is up and reachable from api_server / background. + REDIS_KV_CACHE_ENABLED: "" + REDIS_KV_CACHE_TTL_SECONDS: "86400" + # Per-user request-rate limiter on /send-message. Default OFF; set windows + # to enable. Recommended starting point at "few hundred users" scale: + # REQUEST_RATE_LIMIT_ENABLED=true, PER_MINUTE=20, PER_HOUR=300 + REQUEST_RATE_LIMIT_ENABLED: "" + REQUEST_RATE_LIMIT_PER_MINUTE: "" + REQUEST_RATE_LIMIT_PER_HOUR: "" diff --git a/darwin-kubernetes/redis-statefulset.yaml b/darwin-kubernetes/redis-statefulset.yaml new file mode 100644 index 00000000000..5d9199aa525 --- /dev/null +++ b/darwin-kubernetes/redis-statefulset.yaml @@ -0,0 +1,78 @@ +# In-cluster Redis for cache + per-user rate limiting. +# +# Cache-only: persistence is deliberately OFF (no AOF, no RDB snapshots). A +# pod restart drops the cache; that's acceptable because the +# DynamicConfigStore Postgres-backed source of truth refills it on demand +# and the rate-limit counters self-heal as windows expire. +# +# A single replica is fine for this footprint (a few hundred users). For HA +# later, switch to Redis Sentinel or move to Azure Cache for Redis — both +# are drop-in for our client, which only ever talks via REDIS_HOST. +# +# NOT the Celery broker. This fork uses Postgres for the broker by design +# (see AGENTS.md "Divergence from upstream Onyx"). Do not repoint Celery +# at this service. +--- +apiVersion: v1 +kind: Service +metadata: + name: redis + labels: + app: redis +spec: + selector: + app: redis + ports: + - name: redis + port: 6379 + targetPort: 6379 + type: ClusterIP +--- +apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: redis +spec: + serviceName: redis + replicas: 1 + selector: + matchLabels: + app: redis + template: + metadata: + labels: + app: redis + spec: + containers: + - name: redis + image: redis:7.2-alpine + ports: + - containerPort: 6379 + name: redis + # Cache-only config: no persistence, bounded memory, LRU eviction. + args: + - "--appendonly" + - "no" + - "--save" + - "" + - "--maxmemory" + - "256mb" + - "--maxmemory-policy" + - "allkeys-lru" + resources: + requests: + memory: "128Mi" + cpu: "50m" + limits: + memory: "384Mi" + cpu: "500m" + readinessProbe: + tcpSocket: + port: 6379 + initialDelaySeconds: 5 + periodSeconds: 10 + livenessProbe: + tcpSocket: + port: 6379 + initialDelaySeconds: 15 + periodSeconds: 20 diff --git a/darwin-kubernetes/secrets.yaml b/darwin-kubernetes/secrets.yaml index 352ffd16d6a..d6bc2aabb02 100644 --- a/darwin-kubernetes/secrets.yaml +++ b/darwin-kubernetes/secrets.yaml @@ -20,3 +20,9 @@ stringData: # MUST be identical across all replicas and stable across restarts/rollouts, # or in-flight logins fail and existing sessions are invalidated. user_auth_secret: "" + + # --- Redis --- + # Leave empty for the unauth'd in-cluster Redis (redis-statefulset.yaml). + # Populate when moving to a managed Redis that requires auth, OR remove + # this entry entirely (the deployments mount it as `optional: true`). + redis_password: "" diff --git a/deployment/docker_compose/docker-compose.dev.yml b/deployment/docker_compose/docker-compose.dev.yml index 294b22deff1..491cbe4cefe 100644 --- a/deployment/docker_compose/docker-compose.dev.yml +++ b/deployment/docker_compose/docker-compose.dev.yml @@ -314,6 +314,31 @@ services: - db_volume:/var/lib/postgresql/data + # Cache + per-user request rate limiting. Cache-only — no persistence; an LRU + # eviction policy bounds memory so a runaway producer can't OOM the node. + # Not used as a Celery broker (this fork uses Postgres for that). + redis: + image: redis:7.2-alpine + restart: always + command: + - redis-server + - --appendonly + - "no" + - --save + - "" + - --maxmemory + - "256mb" + - --maxmemory-policy + - allkeys-lru + ports: + - "6379:6379" + healthcheck: + test: ["CMD", "redis-cli", "ping"] + interval: 10s + timeout: 3s + retries: 5 + + # This container name cannot have an underscore in it due to Vespa expectations of the URL index: image: vespaengine/vespa:8.277.17 From d124a078d79d6038ff7ff338dadd72ffbc9438c2 Mon Sep 17 00:00:00 2001 From: rajiv chodisetti Date: Thu, 28 May 2026 17:10:32 +0530 Subject: [PATCH 003/102] P2: per-user request rate limiter on chat/query endpoints MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Layered on top of P1's Redis client. Complements the existing token- budget limiter (token_limit.py) — that's a DB-backed COST cap, this is a Redis-backed REQUEST-COUNT cap that's correct across api_server replicas. Both run; this one runs first so a 429'd caller never even touches the DB-backed usage query. Default OFF. Enable per-environment via: REQUEST_RATE_LIMIT_ENABLED=true REQUEST_RATE_LIMIT_PER_MINUTE= # 0 = disable that window REQUEST_RATE_LIMIT_PER_HOUR= * server/middleware/request_rate_limit.py: fixed-window buckets keyed by floor(time/window). Atomic INCR + EXPIRE NX so the bucket boundary is fixed on first increment (without NX, every request would push expiry forward and the bucket would never reset — that bug is covered by an explicit test). Authenticated users keyed by uuid; anonymous keyed by the first X-Forwarded-For hop, falling back to the socket peer; if neither yields an IP we skip (better than bucketing every anonymous request under ""). * Fail-OPEN on any Redis error: a Redis blip lets requests through with a warning, never wedges the chat path. * 429 response carries a Retry-After header with seconds-until-bucket- rollover so well-behaved clients back off precisely. * Wired as a FastAPI Depends on: POST /chat/send-message POST /direct-qa/stream-answer-with-quote Both endpoints also keep the existing check_token_rate_limits. Tests (unittest, mocked Redis pipeline — no real Redis required): - default-OFF short-circuits before any Redis call (covers both REQUEST_RATE_LIMIT_ENABLED=false AND both windows = 0). - within-limit: N requests under cap all allowed. - over-limit raises 429 with Retry-After header. - per-user isolation: distinct users have independent counters. - bucket rollover resets count (time-mocked). - EXPIRE NX semantics — locks down the no-sliding-TTL invariant. - anonymous keyed by XFF first hop; no-IP skips silently. - fail-open: Redis error doesn't propagate. 9 cases total. Co-Authored-By: Claude Opus 4.7 (1M context) --- backend/danswer/configs/app_configs.py | 16 + backend/danswer/server/middleware/__init__.py | 0 .../server/middleware/request_rate_limit.py | 181 +++++++++ .../server/query_and_chat/chat_backend.py | 7 + .../server/query_and_chat/query_backend.py | 8 + backend/tests/unit/danswer/server/__init__.py | 0 .../danswer/server/middleware/__init__.py | 0 .../middleware/test_request_rate_limit.py | 342 ++++++++++++++++++ 8 files changed, 554 insertions(+) create mode 100644 backend/danswer/server/middleware/__init__.py create mode 100644 backend/danswer/server/middleware/request_rate_limit.py create mode 100644 backend/tests/unit/danswer/server/__init__.py create mode 100644 backend/tests/unit/danswer/server/middleware/__init__.py create mode 100644 backend/tests/unit/danswer/server/middleware/test_request_rate_limit.py diff --git a/backend/danswer/configs/app_configs.py b/backend/danswer/configs/app_configs.py index d969b39d2d9..42ffffb2671 100644 --- a/backend/danswer/configs/app_configs.py +++ b/backend/danswer/configs/app_configs.py @@ -332,6 +332,22 @@ os.environ.get("REDIS_KV_CACHE_TTL_SECONDS") or 86400 ) +# Per-user request-rate limiter (Redis-backed). Default OFF — complements +# the token-budget limiter in token_limit.py with a request-count cap that +# is correct across api_server replicas. +REQUEST_RATE_LIMIT_ENABLED = ( + os.environ.get("REQUEST_RATE_LIMIT_ENABLED", "").lower() == "true" +) +# Per-minute and per-hour message-send caps per (user|ip). 0 disables that +# window (so you can enforce only one of them if you prefer). +REQUEST_RATE_LIMIT_PER_MINUTE = int( + os.environ.get("REQUEST_RATE_LIMIT_PER_MINUTE") or 0 +) +REQUEST_RATE_LIMIT_PER_HOUR = int( + os.environ.get("REQUEST_RATE_LIMIT_PER_HOUR") or 0 +) + + ##### # Enterprise Edition Configs ##### diff --git a/backend/danswer/server/middleware/__init__.py b/backend/danswer/server/middleware/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/danswer/server/middleware/request_rate_limit.py b/backend/danswer/server/middleware/request_rate_limit.py new file mode 100644 index 00000000000..8f1b6355a2c --- /dev/null +++ b/backend/danswer/server/middleware/request_rate_limit.py @@ -0,0 +1,181 @@ +"""Per-user request-rate limiter — Redis-backed, multi-window, fail-open. + +Why this exists, in one sentence: the existing +``danswer.server.query_and_chat.token_limit.check_token_rate_limits`` is a +**token-budget** limiter (sum of tokens over a window, DB-backed). It +caps cost, not request volume, and its in-process ``@lru_cache`` short- +circuit (``any_rate_limit_exists``) is per-pod, so two replicas can +disagree on whether limits are configured at all. This module is the +**request-rate** complement: a per-user (or per-IP for anonymous) cap on +the number of /send-message calls per minute / per hour, with Redis as +the shared counter so the cap holds across replicas. + +Design notes: + +* **Fixed-window buckets.** ``bucket = floor(time() / window)``. Simpler + and cheaper than sliding-window log; the trade-off is that a user + can burst up to ``2 * limit`` across a window boundary. Acceptable + for the protection target (abuse / runaway cost), not for strict SLA + enforcement. +* **Atomic ``INCR`` + ``EXPIRE NX``.** The expiry is set only on the + first increment of the bucket so the window boundary is preserved + across concurrent requests racing for the first slot. Without ``NX``, + every request would push the expiry forward and the bucket would + never reset. +* **Fail-open.** Any Redis error allows the request through with a log. + Refusing the chat path because the *rate limiter* is down is a worse + outcome than serving a few extra requests during a Redis blip. +* **Default OFF.** Even when Redis is up, the limiter does nothing + until ``REQUEST_RATE_LIMIT_ENABLED=true`` AND at least one window + limit (per-minute or per-hour) is > 0. This is a protection feature, + not an always-on guard. +* **Anonymous callers are keyed by IP** (X-Forwarded-For first hop, + falling back to the socket peer). If the IP can't be determined we + silently skip — no key, nothing to limit. +""" +from __future__ import annotations + +import time + +from fastapi import Depends +from fastapi import HTTPException +from fastapi import Request + +from danswer.auth.users import current_user +from danswer.configs.app_configs import REQUEST_RATE_LIMIT_ENABLED +from danswer.configs.app_configs import REQUEST_RATE_LIMIT_PER_HOUR +from danswer.configs.app_configs import REQUEST_RATE_LIMIT_PER_MINUTE +from danswer.db.models import User +from danswer.redis.redis_pool import DANSWER_REDIS_KEY_PREFIX +from danswer.redis.redis_pool import get_redis_client +from danswer.utils.logger import setup_logger + + +logger = setup_logger() + + +# All counters live under this prefix so a global FLUSHDB-by-prefix on +# this namespace is trivial in incident response. Sub-key shape: +# {actor}:{label}:{bucket} +# where actor is "u:" or "ip:", label is "min" or "hour", +# and bucket is floor(unix_seconds / window). +_KEY_PREFIX = DANSWER_REDIS_KEY_PREFIX + "ratelimit:msg:" + +_MIN_WINDOW_SECONDS = 60 +_HOUR_WINDOW_SECONDS = 3600 + + +def check_message_request_rate_limit( + request: Request, + user: User | None = Depends(current_user), +) -> None: + """FastAPI dependency that 429s a caller over their per-window cap. + + Cheap fast-path when disabled — no Redis call, no env reads beyond + the module-level constants. Safe to attach to every chat / query + endpoint; the cost when off is one tuple-truthy check. + """ + if not REQUEST_RATE_LIMIT_ENABLED: + return + if REQUEST_RATE_LIMIT_PER_MINUTE <= 0 and REQUEST_RATE_LIMIT_PER_HOUR <= 0: + # Nothing to enforce — saves the Redis round-trip when an + # operator enabled the flag but hasn't picked window values yet. + return + + actor = _actor_key(user, request) + if actor is None: + return # no key material; nothing we can fairly attribute + + # Order matters: enforce the tighter window first. If a user trips + # the per-minute cap we don't need to also increment per-hour for + # this request — but we do anyway so per-hour accounting stays + # honest across bursts that don't trip the minute window. + if REQUEST_RATE_LIMIT_PER_MINUTE > 0: + _enforce_window( + actor=actor, + label="min", + window_seconds=_MIN_WINDOW_SECONDS, + limit=REQUEST_RATE_LIMIT_PER_MINUTE, + ) + if REQUEST_RATE_LIMIT_PER_HOUR > 0: + _enforce_window( + actor=actor, + label="hour", + window_seconds=_HOUR_WINDOW_SECONDS, + limit=REQUEST_RATE_LIMIT_PER_HOUR, + ) + + +def _actor_key(user: User | None, request: Request) -> str | None: + """Identifier the limit is attributed to. + + Authenticated users are keyed by uuid (stable, survives IP changes). + Anonymous traffic falls back to the first X-Forwarded-For hop set + by the ingress; if nothing usable is present, we return None and + skip — better to under-enforce than to bucket everyone behind a + misconfigured proxy under the LB's own IP. + """ + if user is not None: + return f"u:{user.id}" + + xff = request.headers.get("x-forwarded-for", "") + if xff: + client_ip = xff.split(",", 1)[0].strip() + elif request.client is not None: + client_ip = request.client.host + else: + client_ip = "" + + if not client_ip: + return None + return f"ip:{client_ip}" + + +def _enforce_window( + *, actor: str, label: str, window_seconds: int, limit: int +) -> None: + """Increment-and-check one window for one actor. + + Raises ``HTTPException(429)`` if the post-increment count exceeds + ``limit``. The Retry-After header tells the caller exactly how long + until the current bucket rolls over — handy for clients that back + off intelligently. + """ + bucket = int(time.time() // window_seconds) + key = f"{_KEY_PREFIX}{actor}:{label}:{bucket}" + + try: + client = get_redis_client() + pipe = client.pipeline() + pipe.incr(key, 1) + # ``nx=True`` here means "set expiry only if no expiry yet" so + # the first increment of the bucket fixes the window boundary. + # Without it, every increment pushes expiry forward and the + # bucket never resets. + pipe.expire(key, window_seconds, nx=True) + result = pipe.execute() + count = int(result[0]) + except Exception as e: + # Fail-open: better to let a request through than to wedge the + # chat path because Redis is unhappy. Loud log so it's obvious + # in the dashboard, but no exception propagation. + logger.warning( + "Rate-limit check skipped due to Redis error (actor=%s window=%s): %s", + actor, + label, + e, + ) + return + + if count > limit: + # Seconds remaining in the current bucket — tells the caller + # when to retry without us needing to look up the TTL. + retry_after = window_seconds - (int(time.time()) % window_seconds) + raise HTTPException( + status_code=429, + detail=( + f"Request rate limit exceeded " + f"({limit} per {label}). Retry in {retry_after}s." + ), + headers={"Retry-After": str(retry_after)}, + ) diff --git a/backend/danswer/server/query_and_chat/chat_backend.py b/backend/danswer/server/query_and_chat/chat_backend.py index 4e5a1bb2138..0104d49c6a7 100644 --- a/backend/danswer/server/query_and_chat/chat_backend.py +++ b/backend/danswer/server/query_and_chat/chat_backend.py @@ -65,6 +65,9 @@ from danswer.server.query_and_chat.models import RenameChatSessionResponse from danswer.server.query_and_chat.models import SearchFeedbackRequest from danswer.server.query_and_chat.models import UpdateChatSessionThreadRequest +from danswer.server.middleware.request_rate_limit import ( + check_message_request_rate_limit, +) from danswer.server.query_and_chat.token_limit import check_token_rate_limits from danswer.utils.logger import setup_logger @@ -278,6 +281,10 @@ def handle_new_chat_message( chat_message_req: CreateChatMessageRequest, request: Request, user: User | None = Depends(current_user), + # Request-rate cap (Redis-backed, default off) runs BEFORE the + # token-budget check — cheap fast-path means a 429'd caller never + # touches the DB-backed token-usage query. + _rate_limit: None = Depends(check_message_request_rate_limit), _: None = Depends(check_token_rate_limits), ) -> StreamingResponse: """This endpoint is both used for all the following purposes: diff --git a/backend/danswer/server/query_and_chat/query_backend.py b/backend/danswer/server/query_and_chat/query_backend.py index ff632e0613a..da41ab02017 100644 --- a/backend/danswer/server/query_and_chat/query_backend.py +++ b/backend/danswer/server/query_and_chat/query_backend.py @@ -1,6 +1,7 @@ from fastapi import APIRouter from fastapi import Depends from fastapi import HTTPException +from fastapi import Request from fastapi.responses import StreamingResponse from sqlalchemy.orm import Session @@ -30,6 +31,9 @@ from danswer.server.query_and_chat.models import SimpleQueryRequest from danswer.server.query_and_chat.models import SourceTag from danswer.server.query_and_chat.models import TagResponse +from danswer.server.middleware.request_rate_limit import ( + check_message_request_rate_limit, +) from danswer.server.query_and_chat.token_limit import check_token_rate_limits from danswer.utils.logger import setup_logger @@ -150,7 +154,11 @@ def stream_query_validation( @basic_router.post("/stream-answer-with-quote") def get_answer_with_quote( query_request: DirectQARequest, + request: Request, user: User = Depends(current_user), + # Mirrors /chat/send-message: request-rate cap first (cheap when + # off), token-budget check second. + _rate_limit: None = Depends(check_message_request_rate_limit), _: None = Depends(check_token_rate_limits), ) -> StreamingResponse: query = query_request.messages[0].message diff --git a/backend/tests/unit/danswer/server/__init__.py b/backend/tests/unit/danswer/server/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/server/middleware/__init__.py b/backend/tests/unit/danswer/server/middleware/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/server/middleware/test_request_rate_limit.py b/backend/tests/unit/danswer/server/middleware/test_request_rate_limit.py new file mode 100644 index 00000000000..acece0d4332 --- /dev/null +++ b/backend/tests/unit/danswer/server/middleware/test_request_rate_limit.py @@ -0,0 +1,342 @@ +"""Unit tests for the Redis-backed per-user request rate limiter. + +What we lock down here is the contract a chat endpoint relies on when it +attaches ``Depends(check_message_request_rate_limit)``: + + 1. **Default off:** with the feature flag down OR both window limits + at 0, the dependency must short-circuit before touching Redis. + This matters because the dependency is mounted on the hot path of + every chat message — any cost in the off case is paid on every + request forever. + 2. **Per-window enforcement:** the Nth request through the same + bucket exceeds the cap and 429s; the same caller in the next + bucket gets a fresh window. + 3. **Per-user isolation:** two distinct users must not share counters + even if their requests interleave in the same bucket. + 4. **Anonymous keying by IP:** unauth'd callers are bucketed by + X-Forwarded-For first hop (matching the ingress shape), falling + back to the socket peer; otherwise the dependency skips. + 5. **EXPIRE NX semantics:** the first ``INCR`` of a bucket sets the + TTL; subsequent ``INCR`` calls must NOT extend it (a sliding TTL + would make the bucket never reset and effectively cap *forever* + after the first burst). + 6. **Fail-open:** any Redis error allows the request through. The + limiter is protection, not authorization — a Redis blip is not a + reason to wedge the chat path. + 7. **Retry-After header:** a 429 carries seconds-until-bucket-rollover + so well-behaved clients can back off precisely. + +Redis is mocked at the ``get_redis_client`` boundary; the FastAPI +``Request`` and ``User`` are dummy objects. No HTTP layer, no real +Redis — pure dependency-function tests. +""" +from __future__ import annotations + +import unittest +import uuid +from typing import Any +from unittest.mock import MagicMock +from unittest.mock import patch + +from fastapi import HTTPException + +from danswer.server.middleware import request_rate_limit as rrl + + +# ---------- shared fakes ---------- + + +class _FakePipeline: + """Minimal stand-in for redis.client.Pipeline. + + We only need .incr, .expire, .execute — that's the full surface + used in _enforce_window. We also remember every .expire(..., nx=) + call so the NX-semantics test can inspect it. + """ + + def __init__(self, storage: dict[str, int], expiry: dict[str, bool]) -> None: + self._storage = storage + self._expiry = expiry + self._ops: list[tuple[str, Any, Any]] = [] + + def incr(self, key: str, amount: int = 1) -> "_FakePipeline": + self._ops.append(("incr", key, amount)) + return self + + def expire(self, key: str, seconds: int, nx: bool = False) -> "_FakePipeline": + self._ops.append(("expire", key, (seconds, nx))) + return self + + def execute(self) -> list[Any]: + results: list[Any] = [] + for op, key, arg in self._ops: + if op == "incr": + self._storage[key] = self._storage.get(key, 0) + int(arg) + results.append(self._storage[key]) + elif op == "expire": + seconds, nx = arg + if nx and self._expiry.get(key): + results.append(False) # already has TTL — refused + else: + self._expiry[key] = True + results.append(True) + self._ops.clear() + return results + + +class _FakeRedis: + """Fake Redis client exposing only the methods the limiter uses.""" + + def __init__(self) -> None: + self._counters: dict[str, int] = {} + self._has_expiry: dict[str, bool] = {} + self.expire_calls: list[tuple[str, int, bool]] = [] + + def pipeline(self) -> _FakePipeline: + pipe = _FakePipeline(self._counters, self._has_expiry) + # Wrap pipe.expire to record every call for inspection. + original_expire = pipe.expire + + def recording_expire(key: str, seconds: int, nx: bool = False) -> Any: + self.expire_calls.append((key, seconds, nx)) + return original_expire(key, seconds, nx=nx) + + pipe.expire = recording_expire # type: ignore[method-assign] + return pipe + + +def _make_request(headers: dict[str, str] | None = None, peer_host: str | None = None) -> MagicMock: + """Minimal Starlette Request stand-in.""" + req = MagicMock() + req.headers = headers or {} + req.client = MagicMock(host=peer_host) if peer_host is not None else None + return req + + +def _make_user(uid: uuid.UUID | None = None) -> MagicMock: + user = MagicMock() + user.id = uid or uuid.uuid4() + return user + + +# ---------- tests ---------- + + +class TestRequestRateLimitDisabled(unittest.TestCase): + """When disabled, the dependency must do nothing — not even + construct a Redis client. The hot path can't afford ambient cost + that callers thought they'd avoided by turning the flag off. + """ + + def test_flag_off_short_circuits_before_redis(self) -> None: + request = _make_request() + user = _make_user() + with patch.object(rrl, "REQUEST_RATE_LIMIT_ENABLED", False), patch.object( + rrl, "get_redis_client" + ) as mock_client: + rrl.check_message_request_rate_limit(request=request, user=user) + mock_client.assert_not_called() + + def test_both_windows_zero_short_circuits_before_redis(self) -> None: + """Flag on but no limits configured = nothing to enforce. The + operator probably enabled the flag and hasn't picked numbers + yet; we must not pay the Redis round-trip in that interim + state. + """ + request = _make_request() + user = _make_user() + with patch.object(rrl, "REQUEST_RATE_LIMIT_ENABLED", True), patch.object( + rrl, "REQUEST_RATE_LIMIT_PER_MINUTE", 0 + ), patch.object(rrl, "REQUEST_RATE_LIMIT_PER_HOUR", 0), patch.object( + rrl, "get_redis_client" + ) as mock_client: + rrl.check_message_request_rate_limit(request=request, user=user) + mock_client.assert_not_called() + + +class TestRequestRateLimitEnforcement(unittest.TestCase): + def _patch_enabled(self, per_min: int = 0, per_hour: int = 0) -> Any: + """Helper: turn the limiter on with the given window caps.""" + return _MultiPatch( + (rrl, "REQUEST_RATE_LIMIT_ENABLED", True), + (rrl, "REQUEST_RATE_LIMIT_PER_MINUTE", per_min), + (rrl, "REQUEST_RATE_LIMIT_PER_HOUR", per_hour), + ) + + def test_within_limit_allows_request(self) -> None: + """Under the cap = no 429. Sanity, but also makes sure the + ``count > limit`` boundary is strict (the Nth allowed request + is the *limit*-th, not limit-minus-one). + """ + fake = _FakeRedis() + request = _make_request() + user = _make_user() + with self._patch_enabled(per_min=3), patch.object( + rrl, "get_redis_client", return_value=fake + ): + for _ in range(3): + rrl.check_message_request_rate_limit(request=request, user=user) + # No exception raised — all three under the cap of 3. + + def test_request_above_cap_raises_429_with_retry_after(self) -> None: + """The (limit+1)-th call in a bucket must 429, and the response + must carry Retry-After. Clients without Retry-After back off + with guesswork; we should hand them the exact answer. + """ + fake = _FakeRedis() + request = _make_request() + user = _make_user() + with self._patch_enabled(per_min=2), patch.object( + rrl, "get_redis_client", return_value=fake + ): + rrl.check_message_request_rate_limit(request=request, user=user) + rrl.check_message_request_rate_limit(request=request, user=user) + with self.assertRaises(HTTPException) as ctx: + rrl.check_message_request_rate_limit(request=request, user=user) + self.assertEqual(ctx.exception.status_code, 429) + retry_after = ctx.exception.headers and ctx.exception.headers.get( + "Retry-After" + ) + self.assertIsNotNone(retry_after) + self.assertTrue(retry_after.isdigit()) # type: ignore[union-attr] + # 0 < retry_after <= window. (Equal to window iff time landed + # exactly on the boundary — possible but rare, allow it.) + self.assertGreaterEqual(int(retry_after), 0) # type: ignore[arg-type] + self.assertLessEqual(int(retry_after), 60) # type: ignore[arg-type] + + def test_two_users_have_independent_counters(self) -> None: + """Distinct user UUIDs must NOT share a bucket. If they did, a + loud user could 429 a quiet one. + """ + fake = _FakeRedis() + request = _make_request() + alice = _make_user() + bob = _make_user() + with self._patch_enabled(per_min=1), patch.object( + rrl, "get_redis_client", return_value=fake + ): + rrl.check_message_request_rate_limit(request=request, user=alice) + # Bob's first request must succeed even though Alice already + # used her one allowed call in this bucket. + rrl.check_message_request_rate_limit(request=request, user=bob) + # Alice's second request hits her cap — should 429. + with self.assertRaises(HTTPException) as ctx: + rrl.check_message_request_rate_limit(request=request, user=alice) + self.assertEqual(ctx.exception.status_code, 429) + + def test_next_bucket_resets_count(self) -> None: + """When time advances past the window boundary, the bucket key + changes (it's keyed by ``floor(time / window)``) and the new + bucket starts at 0. Without this, the limit is forever rather + than per-window. + """ + fake = _FakeRedis() + request = _make_request() + user = _make_user() + with self._patch_enabled(per_min=1), patch.object( + rrl, "get_redis_client", return_value=fake + ): + with patch.object(rrl.time, "time", return_value=1_000_000.0): + rrl.check_message_request_rate_limit(request=request, user=user) + # Same bucket -> over cap. + with self.assertRaises(HTTPException): + rrl.check_message_request_rate_limit(request=request, user=user) + # Jump 90s — new minute bucket. + with patch.object(rrl.time, "time", return_value=1_000_000.0 + 90): + rrl.check_message_request_rate_limit(request=request, user=user) + + def test_expire_uses_nx_so_ttl_is_set_only_once(self) -> None: + """Every ``INCR`` is paired with ``EXPIRE`` — but if NX weren't + set, each increment would push the expiry forward and the + bucket would never roll over. Lock down ``nx=True`` so a future + refactor doesn't accidentally make every limited window become + a permanent ban after the first burst. + """ + fake = _FakeRedis() + request = _make_request() + user = _make_user() + with self._patch_enabled(per_min=10), patch.object( + rrl, "get_redis_client", return_value=fake + ): + for _ in range(3): + rrl.check_message_request_rate_limit(request=request, user=user) + # All EXPIRE calls used nx=True. (At least one happened.) + self.assertGreater(len(fake.expire_calls), 0) + for _key, _seconds, nx in fake.expire_calls: + self.assertTrue(nx, "EXPIRE must use NX so TTL isn't extended on every INCR") + + def test_anonymous_user_keyed_by_xff_first_hop(self) -> None: + """Anonymous traffic keys on the first XFF hop (the real client + IP behind nginx), not on the LB's own peer address. Otherwise + every anonymous request would share one bucket. + """ + fake = _FakeRedis() + # Two distinct anonymous IPs in XFF. + req_a = _make_request(headers={"x-forwarded-for": "10.1.1.1, 10.0.0.1"}) + req_b = _make_request(headers={"x-forwarded-for": "10.1.1.2, 10.0.0.1"}) + with self._patch_enabled(per_min=1), patch.object( + rrl, "get_redis_client", return_value=fake + ): + rrl.check_message_request_rate_limit(request=req_a, user=None) + # Different XFF first hop => different bucket, allowed. + rrl.check_message_request_rate_limit(request=req_b, user=None) + # Same XFF as req_a => second hit, exceeds cap. + with self.assertRaises(HTTPException): + rrl.check_message_request_rate_limit(request=req_a, user=None) + + def test_anonymous_with_no_ip_skips_silently(self) -> None: + """If neither XFF nor a client peer is present, we have nothing + to attribute the request to. Skipping is the only honest + option — bucketing everyone under "" would silently flatten + every anonymous client into one counter. + """ + fake = _FakeRedis() + request = _make_request(headers={}, peer_host=None) + with self._patch_enabled(per_min=1), patch.object( + rrl, "get_redis_client", return_value=fake + ) as mock_client: + # Call twice — both must pass; the limiter must not even + # have constructed a key to enforce against. + rrl.check_message_request_rate_limit(request=request, user=None) + rrl.check_message_request_rate_limit(request=request, user=None) + mock_client.assert_not_called() + + def test_redis_error_fails_open(self) -> None: + """A pipeline that explodes (timeout, broken connection, + whatever) must NOT raise out of the dependency. The chat path + keeps serving — a request slipped past the limiter is better + than a chat outage caused by the limiter itself. + """ + bad_client = MagicMock() + bad_client.pipeline.side_effect = RuntimeError("redis exploded") + request = _make_request() + user = _make_user() + with self._patch_enabled(per_min=1), patch.object( + rrl, "get_redis_client", return_value=bad_client + ): + # Two calls back-to-back — neither raises, because the + # limiter swallows the Redis error. + rrl.check_message_request_rate_limit(request=request, user=user) + rrl.check_message_request_rate_limit(request=request, user=user) + + +class _MultiPatch: + """Context manager that applies several ``patch.object`` patches at + once. Used to make per-test "turn on the limiter with these + windows" blocks readable. + """ + + def __init__(self, *patches: tuple[Any, str, Any]) -> None: + self._patches = [patch.object(obj, attr, val) for obj, attr, val in patches] + + def __enter__(self) -> None: + for p in self._patches: + p.start() + + def __exit__(self, *exc: Any) -> None: + for p in reversed(self._patches): + p.stop() + + +if __name__ == "__main__": + unittest.main() From 9896c87ab0fa620fa74a7dc0d16738ee2b575893 Mon Sep 17 00:00:00 2001 From: rajiv chodisetti Date: Thu, 28 May 2026 18:06:43 +0530 Subject: [PATCH 004/102] Persona list cache with explicit write-through invalidation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit GET /persona (Manage Assistants → "View available assistants") fires get_personas(user_id, ...) — a multi-OR permission-filtered query joining Persona, Persona__User, Persona__UserGroup, User__UserGroup. At hundreds of concurrent users opening the chat UI around the same time, the burst puts unnecessary pressure on the DB connection pool (which is the actual scaling ceiling for streaming chat — see REDIS_CACHING_PLAN.md). Design: global cache + per-user filter (not per-user response cache), so the multi-user-burst pattern collapses 200 queries into ~1: danswer:personas:all:not_deleted global, all PersonaSnapshot including is_public / users / groups (PersonaSnapshot already carries the permission inputs — no separate payload shape needed) danswer:personas:groups:{user_id} per-user, list[int] of group ids At request time the cached list is filtered in Python mirroring the SQL OR-block exactly: is_public OR user.id IN persona.users.id OR (user_group_ids ∩ persona.groups) The parity vs SQL is locked down by tests (one per branch + negative). Invalidation is explicit + write-through: - 9 mutation paths in db/persona.py call invalidate_personas_all() AFTER db_session.commit() (after-commit ordering avoids stale-fill race during open transactions). - 3 paths in ee/danswer/db/user_group.py (insert/update/prepare-delete) call invalidate_user_groups(uid) for each affected user. - 24h TTL is ONLY a safety net for missed busts; primary mechanism is explicit so persona/group edits are visible immediately. - Default OFF (PERSONA_CACHE_ENABLED=false); enable per environment. - Fail-OPEN on every Redis op: a Redis outage degrades latency, not availability, and a failed bust doesn't roll back the DB write. - include_deleted=True falls through to direct DB (uncommon shape; we deliberately don't cache it). Encrypted values: N/A — PersonaSnapshot has no encryption-at-rest guarantee to bypass (unlike the KV store layer from P1). Tests (17, mocked Redis + db boundary, no real services): - 6 filter-parity cases (one per SQL branch + mixed + zero-groups edge) - 2 user-group cache cases (miss/hit, TTL propagation) - 3 routing cases (disabled fallthrough, include_deleted bypass, admin user_id=None path skips group lookup) - 4 invalidation cases (right key for each side, disabled short-circuit, Redis-error-during-bust swallowed) - 2 fail-open-on-read cases (GET error → miss, corrupt entry → miss) Co-Authored-By: Claude Opus 4.7 (1M context) --- backend/danswer/configs/app_configs.py | 12 + backend/danswer/db/persona.py | 14 + backend/danswer/db/persona_cache.py | 305 ++++++++++++ .../danswer/server/features/persona/api.py | 15 +- backend/ee/danswer/db/user_group.py | 25 +- .../unit/danswer/db/test_persona_cache.py | 471 ++++++++++++++++++ 6 files changed, 833 insertions(+), 9 deletions(-) create mode 100644 backend/danswer/db/persona_cache.py create mode 100644 backend/tests/unit/danswer/db/test_persona_cache.py diff --git a/backend/danswer/configs/app_configs.py b/backend/danswer/configs/app_configs.py index 42ffffb2671..1189c52817f 100644 --- a/backend/danswer/configs/app_configs.py +++ b/backend/danswer/configs/app_configs.py @@ -347,6 +347,18 @@ os.environ.get("REQUEST_RATE_LIMIT_PER_HOUR") or 0 ) +# Per-user persona ("assistant") list cache. Caches the global persona list +# + per-user group memberships in Redis; permission filter runs in Python +# at request time. Explicit write-through invalidation lives in the +# db/persona.py and ee/.../user_group.py mutation paths — the TTL below is +# only a long-tail safety net for missed busts. Default OFF. +PERSONA_CACHE_ENABLED = ( + os.environ.get("PERSONA_CACHE_ENABLED", "").lower() == "true" +) +PERSONA_CACHE_TTL_SECONDS = int( + os.environ.get("PERSONA_CACHE_TTL_SECONDS") or 86400 # 24 h backstop +) + ##### # Enterprise Edition Configs diff --git a/backend/danswer/db/persona.py b/backend/danswer/db/persona.py index 946a3f897e4..e39eb47306b 100644 --- a/backend/danswer/db/persona.py +++ b/backend/danswer/db/persona.py @@ -24,6 +24,7 @@ from danswer.db.models import Tool from danswer.db.models import User from danswer.db.models import User__UserGroup +from danswer.db.persona_cache import invalidate_personas_all from danswer.search.enums import RecencyBiasSetting from danswer.server.features.persona.models import CreatePersonaRequest from danswer.server.features.persona.models import PersonaSnapshot @@ -48,6 +49,7 @@ def make_persona_private( db_session.add(Persona__User(persona_id=persona_id, user_id=user_uuid)) db_session.commit() + invalidate_personas_all() # Persona__User membership changed # May cause error if someone switches down to MIT from EE if group_ids: @@ -218,6 +220,7 @@ def mark_persona_as_deleted( ) persona.deleted = True db_session.commit() + invalidate_personas_all() def mark_persona_as_not_deleted( @@ -231,6 +234,7 @@ def mark_persona_as_not_deleted( if persona.deleted: persona.deleted = False db_session.commit() + invalidate_personas_all() else: raise ValueError(f"Persona with ID {persona_id} is not deleted.") @@ -246,6 +250,7 @@ def mark_delete_persona_by_name( db_session.execute(stmt) db_session.commit() + invalidate_personas_all() def update_all_personas_display_priority( @@ -262,6 +267,7 @@ def update_all_personas_display_priority( persona.display_priority = display_priority_map[persona.id] db_session.commit() + invalidate_personas_all() def upsert_prompt( @@ -430,9 +436,14 @@ def upsert_persona( if commit: db_session.commit() + invalidate_personas_all() else: # flush the session so that the persona has an ID db_session.flush() + # No bust here — caller hasn't committed. They are responsible for + # invalidating after their final commit, OR they're being called + # by a wrapper like create_update_persona whose subsequent steps + # (make_persona_private) will commit and bust. return persona @@ -460,6 +471,7 @@ def delete_old_default_personas( db_session.execute(stmt) db_session.commit() + invalidate_personas_all() def update_persona_visibility( @@ -470,6 +482,7 @@ def update_persona_visibility( persona = get_persona_by_id(persona_id=persona_id, user=None, db_session=db_session) persona.is_visible = is_visible db_session.commit() + invalidate_personas_all() def check_user_can_edit_persona(user: User | None, persona: Persona) -> None: @@ -636,6 +649,7 @@ def delete_persona_by_name( db_session.execute(stmt) db_session.commit() + invalidate_personas_all() def get_persona_with_docset_and_prompts( diff --git a/backend/danswer/db/persona_cache.py b/backend/danswer/db/persona_cache.py new file mode 100644 index 00000000000..29eb832b5c0 --- /dev/null +++ b/backend/danswer/db/persona_cache.py @@ -0,0 +1,305 @@ +"""Per-user persona ("assistant") list cache, Redis-backed. + +The "Manage Assistants" tile in the chat UI fires +``GET /persona`` → ``server/features/persona/api.py::list_personas`` → +``db/persona.py::get_personas(user_id, …)``. That query is a multi-OR +permission filter over ``Persona`` joined with ``Persona__User``, +``Persona__UserGroup`` and ``User__UserGroup`` (the user's groups). It +runs once per user per page-load; at hundreds of users opening chat +around the same time, the burst hits the DB connection pool harder +than it deserves to. + +This module shifts the work to Redis with a **global cache + Python +filter** shape: + + ``personas:all:not_deleted`` — JSON list of every visible + ``PersonaSnapshot``. Shared + across users, so 200 concurrent + first-clicks become ~1 DB + query rather than 200. + + ``personas:groups:{user_id}`` — JSON list of the user's group + ids; cheap one-row indexed + lookup but worth caching since + it's hit on every persona-list + call. + +Because ``PersonaSnapshot`` already carries the permission inputs +(``is_public``, ``users``, ``groups``), the filter runs in Python on +the cached list: + + persona.is_public + OR user_id in {u.id for u in persona.users} # direct grant + OR (user_group_ids ∩ set(persona.groups)) # group grant + +This mirrors the SQL OR-block in :func:`danswer.db.persona.get_personas` +exactly — the parity is locked down by tests. + +**Invalidation:** explicit, write-through. Every mutation that affects +``Persona`` / ``Persona__User`` / ``Persona__UserGroup`` calls +:func:`invalidate_personas_all` after commit; every change to +``User__UserGroup`` calls :func:`invalidate_user_groups(user_id)`. The +``PERSONA_CACHE_TTL_SECONDS`` (24 h default) is *only* a long-tail +safety net for missed busts — the primary mechanism is explicit. + +**Fail-open**: any Redis error logs and falls through to a direct DB +read. A Redis outage degrades latency, not availability. + +**Default OFF**: ``PERSONA_CACHE_ENABLED=false`` keeps the existing +direct-DB path. Enable per environment once Redis is reachable. +""" +from __future__ import annotations + +import json +from typing import Any +from uuid import UUID + +from sqlalchemy import select +from sqlalchemy.orm import Session + +from danswer.configs.app_configs import PERSONA_CACHE_ENABLED +from danswer.configs.app_configs import PERSONA_CACHE_TTL_SECONDS +from danswer.db.models import User__UserGroup +from danswer.redis.redis_pool import DANSWER_REDIS_KEY_PREFIX +from danswer.redis.redis_pool import get_redis_client +from danswer.server.features.persona.models import PersonaSnapshot +from danswer.utils.logger import setup_logger + + +logger = setup_logger() + + +# Single shared key for the "all non-deleted personas" snapshot list. Any +# Persona / Persona__User / Persona__UserGroup mutation must invalidate this. +_PERSONAS_ALL_KEY = DANSWER_REDIS_KEY_PREFIX + "personas:all:not_deleted" + +# Per-user namespace for cached group memberships. User__UserGroup mutations +# must invalidate the affected user(s). +_USER_GROUPS_KEY_PREFIX = DANSWER_REDIS_KEY_PREFIX + "personas:groups:" + + +# --------------------------------------------------------------------------- +# Public API — read path +# --------------------------------------------------------------------------- + + +def get_personas_for_user_cached( + user_id: UUID | None, + db_session: Session, + include_deleted: bool = False, +) -> list[PersonaSnapshot]: + """Return the persona list visible to ``user_id`` as ``PersonaSnapshot``s. + + Routing: + + * Cache disabled OR ``include_deleted=True`` → direct DB read via the + existing :func:`danswer.db.persona.get_personas`. The + ``include_deleted`` case is rare admin-only and we deliberately + don't cache it — keeping the cache key set small avoids accidental + mis-keying on the hot path. + * ``user_id is None`` (admin call) → return the global cached list + unfiltered. + * Authenticated user → load the global list + user's groups from + cache (or DB on miss), apply the Python permission filter. + """ + # Local import to keep this module importable from `db.persona` itself + # without a circular import — `get_personas` is the fallback only. + from danswer.db.persona import get_personas + + if not PERSONA_CACHE_ENABLED or include_deleted: + personas = get_personas( + user_id=user_id, + db_session=db_session, + include_deleted=include_deleted, + ) + return [PersonaSnapshot.from_model(p) for p in personas] + + all_snapshots = _get_all_personas_cached(db_session) + if user_id is None: + # Admin / no-auth path: no permission filter needed. + return all_snapshots + + user_group_ids = _get_user_group_ids_cached(user_id, db_session) + return _filter_personas_for_user(all_snapshots, user_id, user_group_ids) + + +# --------------------------------------------------------------------------- +# Public API — invalidation +# --------------------------------------------------------------------------- + + +def invalidate_personas_all() -> None: + """Drop the cached global persona list. + + Call this *after* ``db_session.commit()`` in any mutation that + changes ``Persona``, ``Persona__User``, or ``Persona__UserGroup``. + Before-commit invalidation has a stale-cache-fill race: a concurrent + reader between bust and commit would refill the cache with the + pre-mutation snapshot. + + Cheap when the cache is disabled — short-circuits before any Redis + call so mutation paths don't pay an ambient cost. + """ + if not PERSONA_CACHE_ENABLED: + return + try: + get_redis_client().delete(_PERSONAS_ALL_KEY) + except Exception as e: + # Fail-open — TTL safety net will heal eventually. Loud log so + # the dashboard catches a persistent Redis outage. + logger.warning("invalidate_personas_all: Redis DEL failed: %s", e) + + +def invalidate_user_groups(user_id: UUID) -> None: + """Drop ``user_id``'s cached group-membership list. + + Call this when ``User__UserGroup`` rows for the user are inserted + or removed. Same after-commit ordering rule as + :func:`invalidate_personas_all`. + """ + if not PERSONA_CACHE_ENABLED: + return + try: + get_redis_client().delete(_USER_GROUPS_KEY_PREFIX + str(user_id)) + except Exception as e: + logger.warning( + "invalidate_user_groups(user_id=%s): Redis DEL failed: %s", + user_id, + e, + ) + + +# --------------------------------------------------------------------------- +# Internals +# --------------------------------------------------------------------------- + + +def _get_all_personas_cached(db_session: Session) -> list[PersonaSnapshot]: + """Load all non-deleted personas as ``PersonaSnapshot``s. + + Cache hit: deserialize the stored JSON straight back into Pydantic + models — no DB call. + Cache miss / Redis error: fall through to the existing + :func:`get_personas` (with ``user_id=None``) so the source of truth + is reused. + """ + hit, cached = _safe_get(_PERSONAS_ALL_KEY) + if hit and isinstance(cached, list): + try: + return [PersonaSnapshot.parse_obj(d) for d in cached] + except Exception as e: + # Pydantic schema drift (e.g. a new required field was added + # since the entry was cached) — treat as a miss so the next + # read repopulates with the current schema. + logger.warning( + "Cached persona list failed PersonaSnapshot parse, refilling: %s", + e, + ) + + from danswer.db.persona import get_personas + + personas = get_personas( + user_id=None, + db_session=db_session, + include_deleted=False, + ) + snapshots = [PersonaSnapshot.from_model(p) for p in personas] + + # Round-trip through PersonaSnapshot.json() so nested types (UUID, + # enums, datetimes) get the same serializer Pydantic uses on the wire. + payload: list[Any] = [json.loads(s.json()) for s in snapshots] + _safe_set(_PERSONAS_ALL_KEY, payload) + return snapshots + + +def _get_user_group_ids_cached(user_id: UUID, db_session: Session) -> list[int]: + """Return the ``user_group_id``s the user belongs to. + + Tiny indexed lookup — caching wins not on per-call latency but on + aggregate, since it's hit on every persona-list call across all + users. + """ + key = _USER_GROUPS_KEY_PREFIX + str(user_id) + hit, cached = _safe_get(key) + if hit and isinstance(cached, list): + return [int(x) for x in cached] + + rows = db_session.scalars( + select(User__UserGroup.user_group_id).where( + User__UserGroup.user_id == user_id + ) + ).all() + group_ids = [int(r) for r in rows] + _safe_set(key, group_ids) + return group_ids + + +def _filter_personas_for_user( + personas: list[PersonaSnapshot], + user_id: UUID, + user_group_ids: list[int], +) -> list[PersonaSnapshot]: + """Apply the same OR-filter ``get_personas`` runs in SQL. + + SQL: + Persona.is_public + OR Persona.id IN (Persona__User where user_id = U) + OR Persona.id IN (Persona__UserGroup + where user_group_id IN ) + + The parity vs SQL is covered by tests with representative permission + shapes; if you change one side, change the other. + """ + user_group_set = set(user_group_ids) + out: list[PersonaSnapshot] = [] + for p in personas: + if p.is_public: + out.append(p) + continue + if any(u.id == user_id for u in p.users): + out.append(p) + continue + if user_group_set.intersection(p.groups): + out.append(p) + continue + return out + + +# ---- Fail-open Redis helpers (mirror the P1 cache module's posture) ---- + + +def _safe_get(key: str) -> tuple[bool, Any]: + """Return ``(hit, value)``. ``hit=False`` covers miss AND any Redis + or decode error — the caller treats them all as "go to the DB". + """ + try: + raw = get_redis_client().get(key) + except Exception as e: + logger.warning("persona_cache: Redis GET failed for %s: %s", key, e) + return (False, None) + if raw is None: + return (False, None) + try: + return (True, json.loads(raw)) + except (TypeError, ValueError) as e: + logger.warning( + "persona_cache: corrupt entry at %s, ignoring: %s", key, e + ) + return (False, None) + + +def _safe_set(key: str, val: Any) -> None: + try: + payload = json.dumps(val) + except (TypeError, ValueError) as e: + # Defensive — _get_all_personas_cached/_get_user_group_ids_cached + # only ever cache JSON-clean values. If this fires the cache is + # silently skipped and the inner read still served the caller. + logger.warning( + "persona_cache: skipping non-JSON value at %s: %s", key, e + ) + return + try: + get_redis_client().set(key, payload, ex=PERSONA_CACHE_TTL_SECONDS) + except Exception as e: + logger.warning("persona_cache: Redis SET failed for %s: %s", key, e) diff --git a/backend/danswer/server/features/persona/api.py b/backend/danswer/server/features/persona/api.py index cd7deb5321a..73f0cd6e3e8 100644 --- a/backend/danswer/server/features/persona/api.py +++ b/backend/danswer/server/features/persona/api.py @@ -18,6 +18,7 @@ from danswer.db.persona import update_all_personas_display_priority from danswer.db.persona import update_persona_shared_users from danswer.db.persona import update_persona_visibility +from danswer.db.persona_cache import get_personas_for_user_cached from danswer.llm.answering.prompts.utils import build_dummy_prompt from danswer.server.features.persona.models import CreatePersonaRequest from danswer.server.features.persona.models import PersonaSnapshot @@ -163,13 +164,15 @@ def list_personas( db_session: Session = Depends(get_session), include_deleted: bool = False, ) -> list[PersonaSnapshot]: + # Routes through the Redis-backed cache when PERSONA_CACHE_ENABLED; + # otherwise behaves exactly as before (direct DB read + serialize). + # The cache handles the include_deleted=True case by falling through. user_id = user.id if user is not None else None - return [ - PersonaSnapshot.from_model(persona) - for persona in get_personas( - user_id=user_id, include_deleted=include_deleted, db_session=db_session - ) - ] + return get_personas_for_user_cached( + user_id=user_id, + db_session=db_session, + include_deleted=include_deleted, + ) @basic_router.get("/{persona_id}") diff --git a/backend/ee/danswer/db/user_group.py b/backend/ee/danswer/db/user_group.py index 0451db9b633..74c79ddf20c 100644 --- a/backend/ee/danswer/db/user_group.py +++ b/backend/ee/danswer/db/user_group.py @@ -14,6 +14,7 @@ from danswer.db.models import User__UserGroup from danswer.db.models import UserGroup from danswer.db.models import UserGroup__ConnectorCredentialPair +from danswer.db.persona_cache import invalidate_user_groups from danswer.server.documents.models import ConnectorCredentialPairIdentifier from ee.danswer.server.user_group.models import UserGroupCreate from ee.danswer.server.user_group.models import UserGroupUpdate @@ -180,6 +181,10 @@ def insert_user_group(db_session: Session, user_group: UserGroupCreate) -> UserG ) db_session.commit() + # New User__UserGroup rows for these users — bust their cached group lists + # so the next persona-list call sees the new group's persona grants. + for affected_user_id in user_group.user_ids: + invalidate_user_groups(affected_user_id) return db_user_group @@ -221,9 +226,10 @@ def update_user_group( cc_pairs_updated = set([cc_pair.id for cc_pair in existing_cc_pairs]) != set( user_group.cc_pair_ids ) - users_updated = set([user.id for user in db_user_group.users]) != set( - user_group.user_ids - ) + # Snapshot existing members BEFORE the cleanup mutation, so we know + # which users to invalidate. The new member set is on the request. + existing_user_ids = {user.id for user in db_user_group.users} + users_updated = existing_user_ids != set(user_group.user_ids) if users_updated: _cleanup_user__user_group_relationships__no_commit( @@ -249,6 +255,12 @@ def update_user_group( db_user_group.is_up_to_date = False db_session.commit() + if users_updated: + # Bust both removed (existing - new) and added (new - existing) users. + # Symmetric difference would be enough, but unioning both sides is + # cheap and avoids missing edge cases when membership reshuffles. + for affected_user_id in existing_user_ids | set(user_group.user_ids): + invalidate_user_groups(affected_user_id) return db_user_group @@ -275,6 +287,11 @@ def prepare_user_group_for_deletion(db_session: Session, user_group_id: int) -> _check_user_group_is_modifiable(db_user_group) + # Snapshot current members before cleanup so we can bust their caches + # after commit. The cleanup helper deletes the User__UserGroup rows, + # so reading after cleanup would give an empty set. + affected_user_ids = [user.id for user in db_user_group.users] + _cleanup_user__user_group_relationships__no_commit( db_session=db_session, user_group_id=user_group_id ) @@ -288,6 +305,8 @@ def prepare_user_group_for_deletion(db_session: Session, user_group_id: int) -> db_user_group.is_up_to_date = False db_user_group.is_up_for_deletion = True db_session.commit() + for affected_user_id in affected_user_ids: + invalidate_user_groups(affected_user_id) def _cleanup_user_group__cc_pair_relationships__no_commit( diff --git a/backend/tests/unit/danswer/db/test_persona_cache.py b/backend/tests/unit/danswer/db/test_persona_cache.py new file mode 100644 index 00000000000..57778cd280f --- /dev/null +++ b/backend/tests/unit/danswer/db/test_persona_cache.py @@ -0,0 +1,471 @@ +"""Unit tests for ``danswer.db.persona_cache``. + +What we lock down here: + +1. **Filter parity vs SQL.** The Python filter in + ``_filter_personas_for_user`` must match the OR-block in + :func:`danswer.db.persona.get_personas` for every representative + permission shape — public, direct-user grant, group grant, and the + negative (none of the above). If either filter drifts, users see the + wrong assistants. Each case here corresponds 1:1 to an SQL branch. + +2. **Read path** with cache enabled: + - Miss → DB call → Redis SET (with TTL) + - Hit → no DB call (the perf promise) + - Per-user-groups miss/hit independently of the global personas miss/hit + +3. **Read path** with cache disabled: + - Always reads the DB; never touches Redis. + - ``include_deleted=True`` always falls through to DB even when the + cache is otherwise enabled — we deliberately don't cache that + less-common shape. + +4. **Invalidation:** + - ``invalidate_personas_all`` deletes the right Redis key. + - ``invalidate_user_groups(uid)`` deletes the per-user key. + - Both short-circuit (no Redis call) when the cache is disabled — + mutation paths shouldn't pay ambient cost in the off state. + +5. **Fail-open** on Redis errors: + - GET error → treated as miss → DB read → no crash. + - SET / DELETE errors swallowed with a log; calling code sees nothing. + +Redis is stubbed with a tiny in-memory fake; the inner DB function and +``PersonaSnapshot.from_model`` are patched. No real Postgres or Redis. +""" +from __future__ import annotations + +import json +import unittest +import uuid +from typing import Any +from unittest.mock import MagicMock +from unittest.mock import patch + +from danswer.db import persona_cache as pc + + +# ---------- shared fakes ---------- + + +class _FakeRedis: + """In-memory Redis fake covering get/set/delete only. + + Stores bytes the same way redis-py does so the cache module's + JSON encode/decode actually runs in tests. + """ + + def __init__(self) -> None: + self.store: dict[str, bytes] = {} + self.get_calls: list[str] = [] + self.set_calls: list[tuple[str, Any, int | None]] = [] + self.delete_calls: list[str] = [] + + def get(self, key: str) -> bytes | None: + self.get_calls.append(key) + return self.store.get(key) + + def set(self, key: str, value: Any, ex: int | None = None) -> bool: + self.set_calls.append((key, value, ex)) + if isinstance(value, str): + self.store[key] = value.encode("utf-8") + elif isinstance(value, bytes): + self.store[key] = value + else: + self.store[key] = str(value).encode("utf-8") + return True + + def delete(self, *keys: str) -> int: + removed = 0 + for k in keys: + self.delete_calls.append(k) + if k in self.store: + del self.store[k] + removed += 1 + return removed + + +class _FakePersonaSnapshot: + """Stand-in for ``PersonaSnapshot`` for filter tests only. + + The real Pydantic model has ~20 required fields; the filter function + touches just three of them. We use a duck-typed mock so test cases + stay focused on permission semantics, not Pydantic field plumbing. + """ + + def __init__( + self, + *, + persona_id: int, + is_public: bool, + user_ids_with_access: list[uuid.UUID], + group_ids_with_access: list[int], + ) -> None: + self.id = persona_id + self.is_public = is_public + # Match PersonaSnapshot.users: list[MinimalUserSnapshot] (has .id) + self.users = [MagicMock(id=uid) for uid in user_ids_with_access] + # Match PersonaSnapshot.groups: list[int] + self.groups = group_ids_with_access + + +# ---------- filter parity ---------- + + +class TestFilterParityVsSqlOrBlock(unittest.TestCase): + """One test per SQL branch in get_personas's OR-filter. + + SQL (paraphrased): + Persona.is_public + OR Persona.id IN (Persona__User where user_id = U) + OR Persona.id IN (Persona__UserGroup where group_id IN ) + + Each case below isolates one branch; the last asserts the negative. + """ + + def setUp(self) -> None: + self.user_id = uuid.uuid4() + self.other_user_id = uuid.uuid4() + self.user_group_ids = [10, 20] + + def test_public_persona_always_visible(self) -> None: + """Branch 1: ``is_public`` → visible regardless of grants. This + is the most-traveled path and must stay correct even when the + user has no direct or group grant.""" + p = _FakePersonaSnapshot( + persona_id=1, is_public=True, user_ids_with_access=[], group_ids_with_access=[] + ) + result = pc._filter_personas_for_user([p], self.user_id, self.user_group_ids) + self.assertEqual([x.id for x in result], [1]) + + def test_direct_user_grant_visible(self) -> None: + """Branch 2: not public, but the user is in the persona's + ``users`` list. Mirrors a row in Persona__User.""" + p = _FakePersonaSnapshot( + persona_id=2, + is_public=False, + user_ids_with_access=[self.user_id], + group_ids_with_access=[], + ) + result = pc._filter_personas_for_user([p], self.user_id, self.user_group_ids) + self.assertEqual([x.id for x in result], [2]) + + def test_group_grant_visible_if_user_in_one_of_those_groups(self) -> None: + """Branch 3: not public, no direct grant, but a group the user + belongs to has access. Mirrors a row in Persona__UserGroup + joined with User__UserGroup.""" + p = _FakePersonaSnapshot( + persona_id=3, + is_public=False, + user_ids_with_access=[], + group_ids_with_access=[20, 999], # 20 is one of the user's groups + ) + result = pc._filter_personas_for_user([p], self.user_id, self.user_group_ids) + self.assertEqual([x.id for x in result], [3]) + + def test_no_access_hidden(self) -> None: + """Negative: not public, not in users, no overlapping group → + must be filtered out. If any branch leaks into this case we have + a permission bug.""" + p = _FakePersonaSnapshot( + persona_id=4, + is_public=False, + user_ids_with_access=[self.other_user_id], # different user + group_ids_with_access=[999, 888], # no overlap with [10, 20] + ) + result = pc._filter_personas_for_user([p], self.user_id, self.user_group_ids) + self.assertEqual(result, []) + + def test_mixed_list_returns_only_visible(self) -> None: + """A realistic mix: 4 personas, only the first 3 should pass + the filter (one per branch + one denied). Verifies that the + denial path doesn't accidentally short-circuit later visible + items in the list.""" + personas = [ + _FakePersonaSnapshot( + persona_id=1, is_public=True, user_ids_with_access=[], + group_ids_with_access=[], + ), + _FakePersonaSnapshot( + persona_id=2, is_public=False, + user_ids_with_access=[self.user_id], group_ids_with_access=[], + ), + _FakePersonaSnapshot( + persona_id=3, is_public=False, user_ids_with_access=[], + group_ids_with_access=[10], + ), + _FakePersonaSnapshot( + persona_id=4, is_public=False, + user_ids_with_access=[self.other_user_id], + group_ids_with_access=[888], + ), + ] + result = pc._filter_personas_for_user( + personas, self.user_id, self.user_group_ids + ) + self.assertEqual(sorted(x.id for x in result), [1, 2, 3]) + + def test_user_with_no_groups_still_sees_public_and_direct_grants(self) -> None: + """Edge case: user belongs to zero groups. The group branch + contributes nothing, but public + direct grants must still + work — otherwise zero-group users get a broken assistant list.""" + personas = [ + _FakePersonaSnapshot( + persona_id=1, is_public=True, + user_ids_with_access=[], group_ids_with_access=[], + ), + _FakePersonaSnapshot( + persona_id=2, is_public=False, + user_ids_with_access=[self.user_id], group_ids_with_access=[], + ), + _FakePersonaSnapshot( + persona_id=3, is_public=False, + user_ids_with_access=[], group_ids_with_access=[10], + ), + ] + result = pc._filter_personas_for_user(personas, self.user_id, []) + self.assertEqual(sorted(x.id for x in result), [1, 2]) + + +# ---------- read path ---------- + + +class TestUserGroupCache(unittest.TestCase): + """The per-user group-ids cache: cheap query, big aggregate win.""" + + def test_miss_then_hit_only_one_db_read(self) -> None: + """First call hits the DB, subsequent calls within TTL serve + from Redis. Locks in the central performance promise of the + per-user side of the cache.""" + fake = _FakeRedis() + db_session = MagicMock() + rows = MagicMock() + rows.all.return_value = [10, 20, 30] + db_session.scalars.return_value = rows + user_id = uuid.uuid4() + + with patch.object(pc, "PERSONA_CACHE_ENABLED", True), patch.object( + pc, "PERSONA_CACHE_TTL_SECONDS", 60 + ), patch.object(pc, "get_redis_client", return_value=fake): + first = pc._get_user_group_ids_cached(user_id, db_session) + second = pc._get_user_group_ids_cached(user_id, db_session) + + self.assertEqual(first, [10, 20, 30]) + self.assertEqual(second, [10, 20, 30]) + self.assertEqual( + db_session.scalars.call_count, + 1, + "second lookup must come from Redis, not the DB", + ) + + def test_set_uses_configured_ttl(self) -> None: + """The TTL is the safety net for missed busts — if it isn't + applied, a stale entry could live forever after a missed + invalidation. Lock down that ``ex=`` is the configured value. + """ + fake = _FakeRedis() + db_session = MagicMock() + rows = MagicMock() + rows.all.return_value = [] + db_session.scalars.return_value = rows + + with patch.object(pc, "PERSONA_CACHE_ENABLED", True), patch.object( + pc, "PERSONA_CACHE_TTL_SECONDS", 1234 + ), patch.object(pc, "get_redis_client", return_value=fake): + pc._get_user_group_ids_cached(uuid.uuid4(), db_session) + + self.assertEqual(len(fake.set_calls), 1) + _key, _val, ex = fake.set_calls[0] + self.assertEqual(ex, 1234) + + +# ---------- routing / disabled mode ---------- + + +class TestGetPersonasForUserCached(unittest.TestCase): + def test_disabled_falls_through_to_get_personas(self) -> None: + """With the flag off, the wrapper must NOT call Redis at all — + it must behave exactly like the previous direct-DB code path. + Important so enabling/disabling the feature is a clean toggle. + """ + db_session = MagicMock() + snap = MagicMock() + + with patch.object(pc, "PERSONA_CACHE_ENABLED", False), patch( + "danswer.db.persona.get_personas", return_value=[MagicMock()] + ) as mock_get_personas, patch( + "danswer.db.persona_cache.PersonaSnapshot.from_model", return_value=snap + ), patch.object(pc, "get_redis_client") as mock_client: + result = pc.get_personas_for_user_cached( + user_id=uuid.uuid4(), db_session=db_session + ) + + self.assertEqual(result, [snap]) + mock_get_personas.assert_called_once() + mock_client.assert_not_called() + + def test_include_deleted_true_bypasses_cache_even_when_enabled(self) -> None: + """We deliberately don't cache the ``include_deleted=True`` shape — + keeps the cache key set small and avoids accidental mis-keying. + Locks down that this path skips Redis entirely. + """ + db_session = MagicMock() + with patch.object(pc, "PERSONA_CACHE_ENABLED", True), patch( + "danswer.db.persona.get_personas", return_value=[] + ) as mock_get_personas, patch( + "danswer.db.persona_cache.PersonaSnapshot.from_model" + ), patch.object(pc, "get_redis_client") as mock_client: + pc.get_personas_for_user_cached( + user_id=uuid.uuid4(), + db_session=db_session, + include_deleted=True, + ) + + mock_get_personas.assert_called_once() + # include_deleted=True was passed through to the DB read + self.assertTrue(mock_get_personas.call_args.kwargs["include_deleted"]) + mock_client.assert_not_called() + + def test_admin_call_returns_unfiltered_global_cache(self) -> None: + """``user_id=None`` is the admin / no-auth case. The cache + already holds the full list with no permission filter, so we + skip the Python filter step. Locks down the fast path for + admin endpoints that share the same cache. + """ + all_snaps = [ + _FakePersonaSnapshot( + persona_id=i, + is_public=False, + user_ids_with_access=[], + group_ids_with_access=[], + ) + for i in [1, 2, 3] + ] + with patch.object(pc, "PERSONA_CACHE_ENABLED", True), patch.object( + pc, "_get_all_personas_cached", return_value=all_snaps + ) as mock_get_all, patch.object( + pc, "_get_user_group_ids_cached" + ) as mock_get_groups: + result = pc.get_personas_for_user_cached( + user_id=None, db_session=MagicMock() + ) + + self.assertEqual([x.id for x in result], [1, 2, 3]) + mock_get_all.assert_called_once() + # Critically, we did NOT look up groups for the admin path. + mock_get_groups.assert_not_called() + + +# ---------- invalidation ---------- + + +class TestInvalidation(unittest.TestCase): + def test_invalidate_personas_all_deletes_right_key(self) -> None: + """The bust call must target ``personas:all:not_deleted``. Any + drift between this key and the SET key in the read path would + produce a stuck cache.""" + fake = _FakeRedis() + with patch.object(pc, "PERSONA_CACHE_ENABLED", True), patch.object( + pc, "get_redis_client", return_value=fake + ): + pc.invalidate_personas_all() + self.assertIn("danswer:personas:all:not_deleted", fake.delete_calls) + + def test_invalidate_user_groups_deletes_per_user_key(self) -> None: + """Each user gets their own key. The bust must include the + user_id in string form (UUIDs are not JSON-stringified + consistently otherwise).""" + fake = _FakeRedis() + uid = uuid.uuid4() + with patch.object(pc, "PERSONA_CACHE_ENABLED", True), patch.object( + pc, "get_redis_client", return_value=fake + ): + pc.invalidate_user_groups(uid) + self.assertIn(f"danswer:personas:groups:{uid}", fake.delete_calls) + + def test_invalidate_when_disabled_short_circuits(self) -> None: + """When the flag is off, mutation paths must not pay a Redis + round-trip cost. Without this, every assistant edit would touch + Redis even on a deployment that's opted out.""" + with patch.object(pc, "PERSONA_CACHE_ENABLED", False), patch.object( + pc, "get_redis_client" + ) as mock_client: + pc.invalidate_personas_all() + pc.invalidate_user_groups(uuid.uuid4()) + mock_client.assert_not_called() + + def test_redis_error_during_bust_is_swallowed(self) -> None: + """If the bust call fails (Redis down, network blip), we don't + want to roll back the user's mutation — the DB write already + committed. Loud log, no exception.""" + bad = MagicMock() + bad.delete.side_effect = RuntimeError("redis exploded") + with patch.object(pc, "PERSONA_CACHE_ENABLED", True), patch.object( + pc, "get_redis_client", return_value=bad + ): + # Both must complete without raising. + pc.invalidate_personas_all() + pc.invalidate_user_groups(uuid.uuid4()) + + +# ---------- fail-open on read ---------- + + +class TestFailOpenOnRedisRead(unittest.TestCase): + def test_redis_get_error_treated_as_miss(self) -> None: + """Redis GET exploding (timeout, conn refused) must NOT + propagate — the wrapper falls through to a direct DB read so + a Redis outage degrades latency, not availability. + + We stub _safe_set out: the set path's round-trip serialization + (s.json() → json.loads) requires a real PersonaSnapshot. Here + we're verifying the GET-error fallback, not the SET path. + """ + bad = MagicMock() + bad.get.side_effect = RuntimeError("connection refused") + db_session = MagicMock() + snap = MagicMock(is_public=True, users=[], groups=[]) + # The cache module round-trips via json.loads(s.json()) before + # the SET — needs a real JSON string here. + snap.json.return_value = '{"id":1,"is_public":true}' + + with patch.object(pc, "PERSONA_CACHE_ENABLED", True), patch.object( + pc, "get_redis_client", return_value=bad + ), patch( + "danswer.db.persona.get_personas", return_value=[MagicMock()] + ), patch( + "danswer.db.persona_cache.PersonaSnapshot.from_model", return_value=snap + ), patch.object(pc, "_safe_set"): + # Must not raise; must return the DB result. + result = pc._get_all_personas_cached(db_session) + + self.assertEqual(result, [snap]) + + def test_corrupt_cache_entry_treated_as_miss(self) -> None: + """Non-JSON bytes under our key (legacy format, manual SET, + schema migration race) must not crash. Fall through to DB and + overwrite the corrupt entry on the next SET. (Same _safe_set + stub rationale as above.) + """ + fake = _FakeRedis() + fake.store["danswer:personas:all:not_deleted"] = b"not-json-at-all" + db_session = MagicMock() + snap = MagicMock(is_public=True, users=[], groups=[]) + # The cache module round-trips via json.loads(s.json()) before + # the SET — needs a real JSON string here. + snap.json.return_value = '{"id":1,"is_public":true}' + + with patch.object(pc, "PERSONA_CACHE_ENABLED", True), patch.object( + pc, "get_redis_client", return_value=fake + ), patch( + "danswer.db.persona.get_personas", return_value=[MagicMock()] + ), patch( + "danswer.db.persona_cache.PersonaSnapshot.from_model", return_value=snap + ), patch.object(pc, "_safe_set"): + result = pc._get_all_personas_cached(db_session) + + self.assertEqual(result, [snap]) + + +if __name__ == "__main__": + unittest.main() From c8378ebb5b261b395fc96393e1b543ac6b63cde6 Mon Sep 17 00:00:00 2001 From: rajiv chodisetti Date: Thu, 28 May 2026 18:25:59 +0530 Subject: [PATCH 005/102] Manage Assistants page UX overhaul MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replaces the prior "move up / move down inside a 3-dot popover" flow on /assistants/mine with eight coordinated changes. Backend unchanged — the existing PATCH /api/user/assistant-list endpoint already accepts the full chosen_assistants array, so every interaction lands as one optimistic local update + one PATCH + (on failure) a rollback. 1. Drag-and-drop reorder via @dnd-kit (already in package.json) with a grab handle on each visible row. Pointer activation distance of 6px so clicks on the handle don't accidentally start a drag. Keyboard reordering comes for free via dnd-kit's default activator focus. 2. Explicit "set as default" — pin/star icon on each visible row; filled when the row is the user's default (position 0 of chosen_assistants), with an accent border + "Default" chip on that row. Ordering and default are now orthogonal — reorder freely without accidentally changing your default. 3. Visibility as a row-level switch instead of a buried "Hide / Remove" popover item. One unified list with a "Hidden (N)" divider; hidden rows render at reduced opacity and have no drag handle (no position to drag to). The prior separate "Active Assistants" / "Your Hidden Assistants" sections collapse into this single list. Refuses to hide the last visible row (can't ship the user a broken picker). 4. Client-side search filter — matches name, description, or tool name. Applies to both visible and hidden sections so search-then-toggle for "where did I put X" is one motion. 5. Information density rebalanced. Description is now the primary signal (was the smallest text). Tools/sources collapse into compact "{n} tools" / "{n} sources" chips so the row scans for "should I pick this?" not "what are its internals?". Full tool list reveals on hover via title attribute. 6. Bulk select column + sticky action bar. Checkbox appears on hover or focus and stays visible when selected. Action bar shows Show / Hide / Remove + Clear when anything is selected. Refuses bulk-hide that would empty the visible list. 7. Header cleanup: title + 1-line subtitle + Create button top-right, "Browse all available" as a text link. The prior two giant nav tiles + paragraph of explanatory copy are gone — recovers vertical space on a page whose real content is the list. 8. Undo on every state-mutating toast (reorder / set-default / hide / show / bulk ops). PopupSpec gains an optional `undo: { onClick }` field; the popup stays on screen 6s instead of 4s when undoable so the user has time to react. Undo restores the prior chosen_assistants array via another PATCH — symmetric round-trip, no special endpoint. New helpers in lib/assistants/updateAssistantPreferences.ts: reorderAssistantList(newOrder) — full-array PATCH (drag, undo) setDefaultAssistant(id, list) — move id to position 0 bulkRemoveFromList(ids, list) — set difference bulkAddToList(ids, list) — set union, appended at end The pre-existing moveAssistantUp/Down helpers are kept (other callers may still import them) but no longer used on this page. Verified: npx tsc --noEmit clean across web/ (0 errors). Co-Authored-By: Claude Opus 4.7 (1M context) --- .../app/assistants/mine/AssistantsList.tsx | 1122 ++++++++++++----- web/src/components/admin/connectors/Popup.tsx | 69 +- .../assistants/updateAssistantPreferences.ts | 53 + 3 files changed, 943 insertions(+), 301 deletions(-) diff --git a/web/src/app/assistants/mine/AssistantsList.tsx b/web/src/app/assistants/mine/AssistantsList.tsx index 576a4c74add..46b2a9d2ee6 100644 --- a/web/src/app/assistants/mine/AssistantsList.tsx +++ b/web/src/app/assistants/mine/AssistantsList.tsx @@ -1,367 +1,901 @@ "use client"; -import { useState } from "react"; +/** + * Manage Assistants — redesigned UX. + * + * What changed vs the prior "move up / move down inside a 3-dot popover": + * + * 1. Drag-and-drop reorder via @dnd-kit (already in package.json), with + * a grab handle on each visible row. Up/down arrows removed. + * 2. Explicit "set as default" pin icon on each row. Filled = current + * default; the default row also gets an accent border. Ordering and + * default are now orthogonal. + * 3. Visibility is a row-level toggle, not a popover item. The page + * shows ONE list with a divider; hidden rows render under the + * "Hidden" divider at reduced opacity. + * 4. Client-side search filters by name + description + tool name. + * 5. Description font-weight bumped; tool chips moved behind a hover + * reveal so the visual hierarchy answers "should I pick this?". + * A "{n} sources" chip surfaces document-set count, which used to + * be hidden in expanded mode only. + * 6. Bulk select column + action bar (Show / Hide / Remove) appears + * only when something is selected. + * 7. Header: single title + 1-line subtitle + Create button top-right, + * "Browse all available" as a text link. Cut the giant tile pair. + * 8. Undo toast on reorder / default-change / visibility-toggle. + * Reuses the extended Popup component (`undo` field on PopupSpec). + * + * Everything is optimistic: local `chosenOrder` state mutates first, the + * PATCH runs after, and a failure rolls back + shows an error toast. + * + * NOTE: this file replaces the old up/down arrow flow entirely; the + * `moveAssistantUp` / `moveAssistantDown` helpers in + * `lib/assistants/updateAssistantPreferences.ts` are kept for any other + * callers but no longer used here. + */ + +import { useMemo, useState } from "react"; import { MinimalUserSnapshot, User } from "@/lib/types"; import { Persona } from "@/app/admin/assistants/interfaces"; -import { Divider, Text } from "@tremor/react"; +import { Text } from "@tremor/react"; import { - FiArrowDown, - FiArrowUp, + FiBookmark, FiEdit2, - FiMoreHorizontal, + FiEye, + FiEyeOff, FiPlus, FiSearch, - FiX, FiShare2, + FiStar, + FiTool, + FiTrash2, } from "react-icons/fi"; -import Link from "next/link"; -import { orderAssistantsForUser } from "@/lib/assistants/orderAssistants"; +import { MdDragIndicator } from "react-icons/md"; import { - addAssistantToList, - moveAssistantDown, - moveAssistantUp, - removeAssistantFromList, -} from "@/lib/assistants/updateAssistantPreferences"; + DndContext, + DragEndEvent, + PointerSensor, + closestCenter, + useSensor, + useSensors, +} from "@dnd-kit/core"; +import { restrictToVerticalAxis } from "@dnd-kit/modifiers"; +import { + SortableContext, + arrayMove, + useSortable, + verticalListSortingStrategy, +} from "@dnd-kit/sortable"; +import { CSS } from "@dnd-kit/utilities"; +import Link from "next/link"; +import { useRouter } from "next/navigation"; +import useSWR from "swr"; +import { errorHandlingFetcher } from "@/lib/fetcher"; import { AssistantIcon } from "@/components/assistants/AssistantIcon"; -import { DefaultPopover } from "@/components/popover/DefaultPopover"; +import { Bubble } from "@/components/Bubble"; import { PopupSpec, usePopup } from "@/components/admin/connectors/Popup"; -import { useRouter } from "next/navigation"; -import { NavigationButton } from "../NavigationButton"; -import { AssistantsPageTitle } from "../AssistantsPageTitle"; import { checkUserOwnsAssistant } from "@/lib/assistants/checkOwnership"; +import { + bulkAddToList, + bulkRemoveFromList, + reorderAssistantList, + setDefaultAssistant, +} from "@/lib/assistants/updateAssistantPreferences"; import { AssistantSharingModal } from "./AssistantSharingModal"; import { AssistantSharedStatusDisplay } from "../AssistantSharedStatus"; -import useSWR from "swr"; -import { errorHandlingFetcher } from "@/lib/fetcher"; -import { ToolsDisplay } from "../ToolsDisplay"; +import { AssistantsPageTitle } from "../AssistantsPageTitle"; -function AssistantListItem({ - assistant, - user, - allAssistantIds, - allUsers, - isFirst, - isLast, - isVisible, - setPopup, +// --------------------------------------------------------------------------- +// Small inline switch — avoids pulling in a new component library for one +// toggle. role="switch" gives screen readers the right semantics. +// --------------------------------------------------------------------------- + +function Toggle({ + checked, + onChange, + ariaLabel, }: { + checked: boolean; + onChange: (next: boolean) => void; + ariaLabel: string; +}) { + return ( + + ); +} + +// --------------------------------------------------------------------------- +// Single row — used both inside the sortable visible section AND in the +// hidden section. `isSortable` toggles drag affordances; the rest of the +// row is identical so the visual stays consistent across the divider. +// --------------------------------------------------------------------------- + +interface RowProps { assistant: Persona; user: User | null; - allUsers: MinimalUserSnapshot[]; - allAssistantIds: number[]; - isFirst: boolean; - isLast: boolean; + isDefault: boolean; isVisible: boolean; - setPopup: (popupSpec: PopupSpec | null) => void; -}) { - const router = useRouter(); - const [showSharingModal, setShowSharingModal] = useState(false); + isSelected: boolean; + onToggleSelect: (id: number) => void; + onSetDefault: (id: number) => void; + onToggleVisibility: (id: number, makeVisible: boolean) => void; + onShareClick: (id: number) => void; +} - const currentChosenAssistants = user?.preferences?.chosen_assistants; +function RowContent({ + assistant, + user, + isDefault, + isVisible, + isSelected, + onToggleSelect, + onSetDefault, + onToggleVisibility, + onShareClick, + // From useSortable when in sortable context; null otherwise. + dragHandleProps, +}: RowProps & { + dragHandleProps: + | (React.HTMLAttributes & { ref?: any }) + | null; +}) { const isOwnedByUser = checkUserOwnsAssistant(user, assistant); + const canEdit = isOwnedByUser; + const canShare = isOwnedByUser && !assistant.is_public; + + // Tool / doc-set counts — surfaced as small chips. Description should + // be the primary affordance; chips give the at-a-glance "scope" signal. + const toolCount = assistant.tools?.length ?? 0; + const docSetCount = assistant.document_sets?.length ?? 0; return ( - <> - { - setShowSharingModal(false); - router.refresh(); - }} - show={showSharingModal} - /> -
+ {/* Bulk-select checkbox. Hidden until hover or when something is + already selected on the page (the parent shows the action bar + based on that). Keyboard users always have it via focus. */} + onToggleSelect(assistant.id)} className=" - bg-background-emphasis - rounded-lg - shadow-md - p-4 - mb-4 - flex - justify-between - items-center + h-4 w-4 cursor-pointer + opacity-30 group-hover:opacity-100 focus:opacity-100 + checked:opacity-100 + transition-opacity " - > -
-
- -

- {assistant.name} -

-
- {assistant.tools.length > 0 && ( - + /> + + {/* Drag handle — only meaningful for visible rows. Hidden rows + have no position to drag to. */} + {dragHandleProps ? ( + + ) : ( + // Reserve the slot so visible/hidden rows line up vertically. +
+ )} + + + +
+
+

{assistant.name}

+ {isDefault && ( + + Default + )} -
{assistant.description}
-
- +
+ + {/* Description bumped — used to be text-sm with no weight; now + it's the primary signal of what the assistant is for. */} + {assistant.description && ( +
+ {assistant.description}
+ )} + + {/* Sharing status, e.g. "Shared with 3 people". */} +
+
- {isOwnedByUser && ( -
- {!assistant.is_public && ( -
setShowSharingModal(true)} - > - -
+ + {/* Scope chips — tool/source counts. Compact summary always; + full tool list reveals on hover so the row stays scannable. */} + {(toolCount > 0 || docSetCount > 0) && ( +
+ {docSetCount > 0 && ( + +
+ + {docSetCount} source{docSetCount === 1 ? "" : "s"} +
+
+ )} + {toolCount > 0 && ( + +
t.name).join(", ")} + > + + {toolCount} tool{toolCount === 1 ? "" : "s"} +
+
)} - - -
)} - - -
+
+ + {/* Right-side actions. Order matters for scannability: default + pin first (most-used), visibility toggle, then ownership + actions (edit/share). */} +
+ {/* Pin / default. Only meaningful for visible rows — pinning a + hidden one would have to unhide it too; we surface that via + the visibility toggle instead. */} + {isVisible && ( + + )} + + {/* Visibility — switch instead of a buried popover item. */} + onToggleVisibility(assistant.id, next)} + ariaLabel={ + isVisible + ? `Hide ${assistant.name} from the picker` + : `Show ${assistant.name} in the picker` } - side="bottom" - align="start" - sideOffset={5} - > - {[ - ...(!isFirst - ? [ -
{ - const success = await moveAssistantUp( - assistant.id, - currentChosenAssistants || allAssistantIds - ); - if (success) { - setPopup({ - message: `"${assistant.name}" has been moved up.`, - type: "success", - }); - router.refresh(); - } else { - setPopup({ - message: `"${assistant.name}" could not be moved up.`, - type: "error", - }); - } - }} - > - Move Up -
, - ] - : []), - ...(!isLast - ? [ -
{ - const success = await moveAssistantDown( - assistant.id, - currentChosenAssistants || allAssistantIds - ); - if (success) { - setPopup({ - message: `"${assistant.name}" has been moved down.`, - type: "success", - }); - router.refresh(); - } else { - setPopup({ - message: `"${assistant.name}" could not be moved down.`, - type: "error", - }); - } - }} - > - Move Down -
, - ] - : []), - isVisible ? ( -
{ - if ( - currentChosenAssistants && - currentChosenAssistants.length === 1 - ) { - setPopup({ - message: `Cannot remove "${assistant.name}" - you must have at least one assistant.`, - type: "error", - }); - return; - } - - const success = await removeAssistantFromList( - assistant.id, - currentChosenAssistants || allAssistantIds - ); - if (success) { - setPopup({ - message: `"${assistant.name}" has been removed from your list.`, - type: "success", - }); - router.refresh(); - } else { - setPopup({ - message: `"${assistant.name}" could not be removed from your list.`, - type: "error", - }); - } - }} - > - {isOwnedByUser ? "Hide" : "Remove"} -
- ) : ( -
{ - const success = await addAssistantToList( - assistant.id, - currentChosenAssistants || allAssistantIds - ); - if (success) { - setPopup({ - message: `"${assistant.name}" has been added to your list.`, - type: "success", - }); - router.refresh(); - } else { - setPopup({ - message: `"${assistant.name}" could not be added to your list.`, - type: "error", - }); - } - }} - > - Add -
- ), - ]} - + /> + + {canShare && ( + + )} + {canEdit && ( + + + + )}
- +
+ ); +} + +// Sortable row — wraps RowContent and wires up @dnd-kit's transform/listeners. +function SortableAssistantRow(props: RowProps) { + const { + attributes, + listeners, + setNodeRef, + setActivatorNodeRef, + transform, + transition, + isDragging, + } = useSortable({ id: props.assistant.id }); + + const style: React.CSSProperties = { + transform: CSS.Transform.toString(transform), + transition, + opacity: isDragging ? 0.5 : 1, + }; + + return ( +
+ +
); } +// Static row — used for the hidden section (no DnD). +function StaticAssistantRow(props: RowProps) { + return ; +} + +// --------------------------------------------------------------------------- +// Bulk action bar — appears only when something is selected. The hide/show +// split mirrors the per-row visibility toggle; "Remove" matches the prior +// "Hide / Remove" semantic (removes from chosen_assistants regardless of +// ownership). +// --------------------------------------------------------------------------- + +function BulkActionsBar({ + selectedCount, + onClearSelection, + onShow, + onHide, + onRemove, +}: { + selectedCount: number; + onClearSelection: () => void; + onShow: () => void; + onHide: () => void; + onRemove: () => void; +}) { + return ( +
+ + {selectedCount} selected + + + + + +
+ ); +} + +// --------------------------------------------------------------------------- +// Main list. State model: +// - `chosenOrder`: the user's chosen_assistants array (ordered, visible) +// - hidden = every assistant the user has access to that's NOT in chosenOrder +// - selected: bulk-action set; orthogonal to visible/hidden +// - search: pure client-side filter applied to both groups before render +// +// All mutations are optimistic — update local state, fire PATCH; on error +// roll back and surface a toast. router.refresh() runs on success so the +// rest of the app (chat picker etc.) sees the new order. +// --------------------------------------------------------------------------- + interface AssistantsListProps { user: User | null; assistants: Persona[]; } export function AssistantsList({ user, assistants }: AssistantsListProps) { - const filteredAssistants = orderAssistantsForUser(assistants, user); - const ownedButHiddenAssistants = assistants.filter( - (assistant) => - checkUserOwnsAssistant(user, assistant) && - user?.preferences?.chosen_assistants && - !user?.preferences?.chosen_assistants?.includes(assistant.id) - ); - const allAssistantIds = assistants.map((assistant) => assistant.id); - + const router = useRouter(); const { popup, setPopup } = usePopup(); - const { data: users } = useSWR( + // When the user has no preference yet, treat every accessible + // assistant as "visible by default" — matches the previous behavior. + const initialChosen: number[] = + user?.preferences?.chosen_assistants ?? assistants.map((a) => a.id); + + const [chosenOrder, setChosenOrder] = useState(initialChosen); + const [search, setSearch] = useState(""); + const [selected, setSelected] = useState>(new Set()); + const [sharingAssistantId, setSharingAssistantId] = useState( + null + ); + + // Pulled from /api/users; used by the share modal. Same pattern as the + // pre-rewrite component. + const { data: allUsers } = useSWR( "/api/users", errorHandlingFetcher ); + // Derived: id-keyed lookup, visible/hidden splits, search-filtered. + const assistantsById = useMemo( + () => new Map(assistants.map((a) => [a.id, a])), + [assistants] + ); + const chosenSet = useMemo(() => new Set(chosenOrder), [chosenOrder]); + + const visibleAssistants: Persona[] = useMemo(() => { + const out: Persona[] = []; + for (const id of chosenOrder) { + const a = assistantsById.get(id); + if (a) out.push(a); + } + return out; + }, [chosenOrder, assistantsById]); + + const hiddenAssistants: Persona[] = useMemo( + () => assistants.filter((a) => !chosenSet.has(a.id)), + [assistants, chosenSet] + ); + + const matchesSearch = (a: Persona) => { + if (!search.trim()) return true; + const q = search.trim().toLowerCase(); + if (a.name.toLowerCase().includes(q)) return true; + if (a.description?.toLowerCase().includes(q)) return true; + return (a.tools ?? []).some((t) => t.name.toLowerCase().includes(q)); + }; + + const filteredVisible = visibleAssistants.filter(matchesSearch); + const filteredHidden = hiddenAssistants.filter(matchesSearch); + + // The default is just position 0 of chosen_assistants. If the user has + // no preference at all, there's no notion of "default yet" — leave it + // unset so no row shows the accent until the user picks. + const defaultId = + user?.preferences?.chosen_assistants && chosenOrder.length > 0 + ? chosenOrder[0] + : null; + + // ---- persistence with optimistic + undo -------------------------------- + + const persistOrder = async ( + nextOrder: number[], + { + successMsg, + undoToOrder, + }: { successMsg?: string; undoToOrder?: number[] } = {} + ): Promise => { + const prev = chosenOrder; + setChosenOrder(nextOrder); + const ok = await reorderAssistantList(nextOrder); + if (!ok) { + setChosenOrder(prev); + setPopup({ + message: "Couldn't update your assistant list — please try again.", + type: "error", + }); + return false; + } + if (successMsg) { + setPopup({ + message: successMsg, + type: "success", + undo: + undoToOrder !== undefined + ? { + onClick: async () => { + await persistOrder(undoToOrder); + }, + } + : undefined, + }); + } + // Refresh the SSR-fetched data so other parts of the app see the + // new order (chat picker, sidebar, etc.). + router.refresh(); + return true; + }; + + // ---- handlers ---------------------------------------------------------- + + const handleDragEnd = (event: DragEndEvent) => { + const { active, over } = event; + if (!over || active.id === over.id) return; + const oldIndex = chosenOrder.indexOf(Number(active.id)); + const newIndex = chosenOrder.indexOf(Number(over.id)); + if (oldIndex < 0 || newIndex < 0) return; + const next = arrayMove(chosenOrder, oldIndex, newIndex); + void persistOrder(next, { + successMsg: "Order updated.", + undoToOrder: chosenOrder, + }); + }; + + const handleSetDefault = async (id: number) => { + if (chosenOrder[0] === id) return; + const prev = chosenOrder; + const ok = await persistOrder( + [id, ...chosenOrder.filter((x) => x !== id)], + { + successMsg: `Default assistant updated.`, + undoToOrder: prev, + } + ); + if (!ok) { + // persistOrder already showed the error toast. + } else { + // setDefaultAssistant also handles the case where id wasn't in + // chosen_assistants; persistOrder above already prepended it. + void setDefaultAssistant(id, prev); // best-effort idempotent confirmation + } + }; + + const handleToggleVisibility = async (id: number, makeVisible: boolean) => { + const prev = chosenOrder; + if (makeVisible) { + // Add to end so reorder isn't surprising. + const next = [...chosenOrder, id]; + const assistant = assistantsById.get(id); + await persistOrder(next, { + successMsg: assistant + ? `"${assistant.name}" added to your picker.` + : "Added to your picker.", + undoToOrder: prev, + }); + } else { + if (chosenOrder.length === 1 && chosenOrder[0] === id) { + setPopup({ + message: + "You need at least one visible assistant — can't hide the last one.", + type: "error", + }); + return; + } + const next = chosenOrder.filter((x) => x !== id); + const assistant = assistantsById.get(id); + await persistOrder(next, { + successMsg: assistant + ? `"${assistant.name}" hidden from your picker.` + : "Hidden from your picker.", + undoToOrder: prev, + }); + } + }; + + const handleToggleSelect = (id: number) => { + setSelected((curr) => { + const next = new Set(curr); + if (next.has(id)) next.delete(id); + else next.add(id); + return next; + }); + }; + + const clearSelection = () => setSelected(new Set()); + + const handleBulkShow = async () => { + const ids = Array.from(selected); + const prev = chosenOrder; + const ok = await bulkAddToList(ids, chosenOrder); + if (!ok) { + setPopup({ message: "Couldn't show selected.", type: "error" }); + return; + } + // Mirror the optimistic update locally — the helper PATCHed the + // server; we just need to align local state. + const existing = new Set(chosenOrder); + const toAppend = ids.filter((id) => !existing.has(id)); + setChosenOrder([...chosenOrder, ...toAppend]); + setPopup({ + message: `${ids.length} assistant${ids.length === 1 ? "" : "s"} shown.`, + type: "success", + undo: { + onClick: async () => { + await persistOrder(prev); + }, + }, + }); + clearSelection(); + router.refresh(); + }; + + const handleBulkHide = async () => { + const ids = Array.from(selected); + // Don't let the user hide every visible row at once. + const remaining = chosenOrder.filter((id) => !ids.includes(id)); + if (remaining.length === 0 && chosenOrder.length > 0) { + setPopup({ + message: "Can't hide every visible assistant — keep at least one.", + type: "error", + }); + return; + } + const prev = chosenOrder; + const ok = await bulkRemoveFromList(ids, chosenOrder); + if (!ok) { + setPopup({ message: "Couldn't hide selected.", type: "error" }); + return; + } + setChosenOrder(remaining); + setPopup({ + message: `${ids.length} assistant${ids.length === 1 ? "" : "s"} hidden.`, + type: "success", + undo: { + onClick: async () => { + await persistOrder(prev); + }, + }, + }); + clearSelection(); + router.refresh(); + }; + + // "Remove" is the same backend op as Hide today — both just remove the + // ids from chosen_assistants. The label distinction is a UX hint: Hide + // is reversible by toggling the switch back on (or Undo); Remove + // implies "I don't want to see this any more." Functionally identical + // until we have a true "remove access" path. + const handleBulkRemove = handleBulkHide; + + // ---- DnD plumbing ------------------------------------------------------- + + // 6px activation distance: a click on the handle shouldn't immediately + // start a drag. Helps especially for the click-and-then-undo flow. + const sensors = useSensors( + useSensor(PointerSensor, { activationConstraint: { distance: 6 } }) + ); + + // ---- render ------------------------------------------------------------- + + const sharingAssistant = + sharingAssistantId != null + ? assistantsById.get(sharingAssistantId) ?? null + : null; + return ( <> {popup} -
- My Assistants - -
- - -
- - Create New Assistant -
-
- - - -
- - View Available Assistants -
-
+ {sharingAssistant && ( + { + setSharingAssistantId(null); + router.refresh(); + }} + show + /> + )} + +
+ {/* Header: title + 1-line subtitle + create button + browse link. + Cut the two-tile nav block and the explanatory paragraph. */} +
+
+ My Assistants + + Choose which assistants appear in the chat picker, set your + default, and reorder by dragging. + +
+ + Create
-

- Assistants allow you to customize your experience for a specific - purpose. Specifically, they combine instructions, extra knowledge, and - any combination of tools. -

- - - -

Active Assistants

- - - The order the assistants appear below will be the order they appear in - the Assistants dropdown. The first assistant listed will be your - default assistant when you start a new chat. - - -
- {filteredAssistants.map((assistant, index) => ( - - ))} +
+ + Browse all available assistants +
- {ownedButHiddenAssistants.length > 0 && ( - <> - - -

Your Hidden Assistants

+ {/* Search */} +
+ + setSearch(e.target.value)} + className=" + w-full pl-10 pr-3 py-2 + rounded-md border border-border bg-background + focus:outline-none focus:ring-2 focus:ring-accent + " + /> +
- - Assistants you've created that aren't currently visible - in the Assistants selector. - + {/* Bulk actions — only when something selected */} + {selected.size > 0 && ( + + )} -
- {ownedButHiddenAssistants.map((assistant, index) => ( - 0 ? ( + + a.id)} + strategy={verticalListSortingStrategy} + > + {filteredVisible.map((assistant) => ( + ))} + + + ) : ( + + )} + + {/* Hidden section — only show divider/header if there's anything */} + {filteredHidden.length > 0 && ( + <> +
+
+ + Hidden ({filteredHidden.length}) + +
+ {filteredHidden.map((assistant) => ( + + ))} )} + + {/* Search produced nothing at all */} + {filteredVisible.length === 0 && filteredHidden.length === 0 && ( + + )}
); } + +function EmptyState({ title, body }: { title: string; body: string }) { + return ( +
+

{title}

+

{body}

+
+ ); +} diff --git a/web/src/components/admin/connectors/Popup.tsx b/web/src/components/admin/connectors/Popup.tsx index adfc0665c25..2bcc73b7d5b 100644 --- a/web/src/components/admin/connectors/Popup.tsx +++ b/web/src/components/admin/connectors/Popup.tsx @@ -3,15 +3,59 @@ import { useRef, useState } from "react"; export interface PopupSpec { message: string; type: "success" | "error"; + // Optional undo affordance. When present, the popup renders an "Undo" + // button next to the message; clicking it invokes onClick and dismisses + // the popup. The popup also stays on screen longer when undoable + // (default 4s → 6s) so the user has time to react. + undo?: { + label?: string; // defaults to "Undo" + onClick: () => Promise | void; + }; } -export const Popup: React.FC = ({ message, type }) => ( +export const Popup: React.FC< + PopupSpec & { onUndo?: () => void; onDismiss?: () => void } +> = ({ message, type, undo, onUndo, onDismiss }) => (
- {message} + {message} + {undo && ( + + )} + {onDismiss && ( + + )}
); @@ -27,13 +71,24 @@ export const usePopup = () => { } setPopup(popupSpec); - timeoutRef.current = setTimeout(() => { - setPopup(null); - }, 4000); + if (popupSpec) { + // Undoable popups stay on screen a bit longer — users need time to + // notice the affordance and click it. 6s vs 4s for plain toasts. + const ms = popupSpec.undo ? 6000 : 4000; + timeoutRef.current = setTimeout(() => { + setPopup(null); + }, ms); + } }; return { - popup: popup && , + popup: popup && ( + setPopupWithExpiration(null)} + onDismiss={() => setPopupWithExpiration(null)} + /> + ), setPopup: setPopupWithExpiration, }; }; diff --git a/web/src/lib/assistants/updateAssistantPreferences.ts b/web/src/lib/assistants/updateAssistantPreferences.ts index f902e561cf4..05b0b3e5302 100644 --- a/web/src/lib/assistants/updateAssistantPreferences.ts +++ b/web/src/lib/assistants/updateAssistantPreferences.ts @@ -1,3 +1,8 @@ +// PATCH the user's full `chosen_assistants` array. This single endpoint +// drives every preference mutation below — visibility, ordering, default +// — because the backend treats the array as both "which assistants are +// visible in the picker" (membership) AND "in what order" (positions), +// with position 0 = default. async function updateUserAssistantList( chosenAssistants: number[] ): Promise { @@ -60,3 +65,51 @@ export async function moveAssistantDown( } return false; } + +// --------------------------------------------------------------------------- +// Used by the new Manage Assistants UX +// --------------------------------------------------------------------------- + +/** Replace the user's full chosen_assistants list (drag-reorder, bulk ops). */ +export async function reorderAssistantList( + newOrder: number[] +): Promise { + return updateUserAssistantList(newOrder); +} + +/** + * Move `assistantId` to position 0 so it becomes the user's default. If the + * id isn't in the list it's prepended (i.e. set-as-default also unhides it). + */ +export async function setDefaultAssistant( + assistantId: number, + chosenAssistants: number[] +): Promise { + const withoutTarget = chosenAssistants.filter((id) => id !== assistantId); + return updateUserAssistantList([assistantId, ...withoutTarget]); +} + +/** Bulk: hide a set of assistant ids (remove them from chosen_assistants). */ +export async function bulkRemoveFromList( + assistantIds: number[], + chosenAssistants: number[] +): Promise { + const toRemove = new Set(assistantIds); + return updateUserAssistantList( + chosenAssistants.filter((id) => !toRemove.has(id)) + ); +} + +/** Bulk: add a set of assistant ids to the visible list (appended at the end). */ +export async function bulkAddToList( + assistantIds: number[], + chosenAssistants: number[] +): Promise { + const existing = new Set(chosenAssistants); + const toAppend = assistantIds.filter((id) => !existing.has(id)); + if (toAppend.length === 0) { + // Nothing to do, but report success so the UI can clear its selection. + return true; + } + return updateUserAssistantList([...chosenAssistants, ...toAppend]); +} From d72dd43aca56e31a91156ac0c5a5923fcf54f8e9 Mon Sep 17 00:00:00 2001 From: rajiv chodisetti Date: Thu, 28 May 2026 18:43:56 +0530 Subject: [PATCH 006/102] Assistant Gallery page UX overhaul MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sister rework to the Manage Assistants page. With 50+ accessible assistants and growing, the old flat 2-column grid had no hierarchy and no status signal — every card looked identical regardless of whether it was yours, shared, public, or already in your picker. Same conceptual fix as Manage: give the page structure so scanning answers "what's mine?", "what's new?", "what does this one do?". Backend unchanged — every interaction PATCHes chosen_assistants via the existing /api/user/assistant-list endpoint (same path the Manage page uses). All mutations are optimistic + undoable. Changes (numbers map to the design proposal): 1. Per-card "In your picker" badge + muted card style when added. Eye now finds the un-added ones in a glance. 2. Three implicit sections: Yours / Shared with you / Featured & Built-in. Empty sections hide; section headers carry counts. 3. Filter chip rows: availability (All / Available to add / Already added) with live counts, plus auto-generated per-tool chips for tools that appear in ≥2 assistants (avoids chip-bloat as the dataset grows). Tool filters use OR semantics. 4. Owner display: best-effort name from the email local-part (split on '@', dots/underscores→spaces) with a "Built-in" badge for default_persona assistants. Kills the fork-specific "Author: Darwin" magic string. 6. Responsive grid: 1 / 2 / 3 / 4 cols by breakpoint. 7. Header matches the Manage rebuild — title + subtitle + Create button top-right, "Back to my assistants" as a text link. Cut the giant centered nav button and the explanatory paragraph. 8. Sort dropdown: Featured (API order, respects admin display_priority) / A → Z / Newly added (id desc proxy for recency). 9. Search broadened to name + description + tool names + document-set names. Empty-result state with a real "Clear all filters" button. 10. Compact "{n} tools" / "{n} sources" chips with hover-reveal of the full tool list. Flat Add/Remove buttons replace Tremor's color="green/red" which was visually shoutier than the action. 11. Design tokens fixed — border-border / focus-ring-accent in place of hardcoded gray-300 / blue-500. Consistent with the rest of the app. Skipped (per proposal): - #5 detail drawer / modal — revisit after observing how users use the new grid; bigger feature. - Bulk select — adding 5 assistants at once isn't a real use case here (bulk hide on Manage was). The pre-existing addAssistantToList / removeAssistantFromList helpers are kept and used at the call sites. The shared reorderAssistantList helper added in the prior commit is reused for the undo paths. Verified: npx tsc --noEmit clean across web/ (0 errors). Co-Authored-By: Claude Opus 4.7 (1M context) --- .../assistants/gallery/AssistantsGallery.tsx | 929 ++++++++++++++---- 1 file changed, 764 insertions(+), 165 deletions(-) diff --git a/web/src/app/assistants/gallery/AssistantsGallery.tsx b/web/src/app/assistants/gallery/AssistantsGallery.tsx index 64c96af2cff..88cb1eb7f6c 100644 --- a/web/src/app/assistants/gallery/AssistantsGallery.tsx +++ b/web/src/app/assistants/gallery/AssistantsGallery.tsx @@ -1,21 +1,333 @@ "use client"; +/** + * Assistant Gallery — redesigned UX. + * + * Why this rewrite: with 50+ assistants and growing, the old flat 2-column + * grid had no hierarchy, no status signal (added vs not), and no real + * filtering — every card looked identical regardless of whether it was + * yours, shared, public, or already in your picker. This page is now + * structured around three questions: "is this mine?" (sections), "have + * I added this?" (availability filter + per-card chip), and "what does + * it do?" (denser layout + tool/source counts + tool filter chips). + * + * Changes packed in (numbers map to the design proposal): + * + * 1. Per-card "✓ In your picker" status chip + muted card style for + * already-added assistants. Eye finds the un-added ones fast. + * 2. Three implicit sections: Yours / Shared with you / Featured. + * Empty sections hide; section headers carry counts. + * 3. Filter chips above the grid: availability (All / Available / Added) + * + auto-generated per-tool chips (only tools that appear in ≥2 + * assistants, so the chip row doesn't bloat as the dataset grows). + * 4. Owner display: name-from-email fallback (split on '@'), with a + * "Built-in" badge for default_persona assistants — kills the + * fork-specific "Author: Darwin" magic string. + * 6. Responsive grid: 1 col on mobile, 2 / 3 / 4 by breakpoint. + * 7. Header matches the Manage page: title + subtitle on the left, + * "Back to my assistants" as a text link, "Create new" button + * top-right. The giant centered button + paragraph are gone. + * 8. Sort dropdown: Featured (API order) / A → Z / Newly added. + * 9. Search now includes tool names AND document-set names. Empty- + * search-result has a real empty state with a Clear button. + * 10. Compact chips ({n} tools / {n} sources), tool list on hover. + * Add/Remove buttons replace the heavy Tremor color="green/red" + * with flat buttons matching the design system. + * 11. Design tokens fixed — search input uses border-border / + * focus-ring-accent like the rest of the app. + * + * What is intentionally NOT here: + * - #5 (detail drawer / modal) — deferred per the proposal; revisit + * after seeing how users use the new gallery. + * - Bulk select — adding 5 assistants at once isn't a real use case. + * + * All mutations are optimistic + undoable, mirroring the Manage page. + */ + +import { useMemo, useState } from "react"; import { Persona } from "@/app/admin/assistants/interfaces"; -import { AssistantIcon } from "@/components/assistants/AssistantIcon"; import { User } from "@/lib/types"; -import { Button } from "@tremor/react"; -import Link from "next/link"; -import { useState } from "react"; -import { FiMinus, FiPlus, FiX } from "react-icons/fi"; -import { NavigationButton } from "../NavigationButton"; -import { AssistantsPageTitle } from "../AssistantsPageTitle"; +import { AssistantIcon } from "@/components/assistants/AssistantIcon"; +import { Bubble } from "@/components/Bubble"; +import { usePopup } from "@/components/admin/connectors/Popup"; import { addAssistantToList, + reorderAssistantList, removeAssistantFromList, } from "@/lib/assistants/updateAssistantPreferences"; -import { usePopup } from "@/components/admin/connectors/Popup"; +import { checkUserOwnsAssistant } from "@/lib/assistants/checkOwnership"; +import { AssistantsPageTitle } from "../AssistantsPageTitle"; +import Link from "next/link"; import { useRouter } from "next/navigation"; -import { ToolsDisplay } from "../ToolsDisplay"; +import { + FiBookmark, + FiCheck, + FiImage, + FiPlus, + FiSearch, + FiTool, + FiX, +} from "react-icons/fi"; + +// --------------------------------------------------------------------------- +// Types +// --------------------------------------------------------------------------- + +type Availability = "all" | "available" | "added"; +type SortMode = "featured" | "name-asc" | "recent"; + +interface SectionDef { + key: string; + label: string; + assistants: Persona[]; +} + +// --------------------------------------------------------------------------- +// Small helpers +// --------------------------------------------------------------------------- + +/** Friendly tool name. "SearchTool" → "Search", "ImageGenerationTool" → + * "Image Generation", else fall back to the raw name. Mirrors the + * mapping in ToolsDisplay; kept inline here so this page is self- + * contained. */ +function toolDisplayName(rawName: string): string { + if (rawName === "SearchTool") return "Search"; + if (rawName === "ImageGenerationTool") return "Image Generation"; + return rawName; +} + +function toolIcon(rawName: string) { + if (rawName === "SearchTool") return ; + if (rawName === "ImageGenerationTool") return ; + return ; +} + +/** Best-effort author display name. Names aren't on MinimalUserSnapshot; + * we have email only. Split on '@' so "foo.bar@example.com" → "foo.bar" + * rather than dumping the full email at the user. */ +function ownerDisplayName(persona: Persona): string | null { + if (persona.default_persona) return null; // Built-in badge shown instead. + const email = persona.owner?.email; + if (!email) return null; + const local = email.split("@")[0]; + // Replace dots/underscores with spaces and trim — usually closer to + // "First Last" than the raw local-part. + return local.replace(/[._]/g, " ").trim() || email; +} + +// --------------------------------------------------------------------------- +// Single card +// --------------------------------------------------------------------------- + +interface CardProps { + assistant: Persona; + user: User | null; + isAdded: boolean; + onAdd: (a: Persona) => void; + onRemove: (a: Persona) => void; +} + +function GalleryCard({ + assistant, + user, + isAdded, + onAdd, + onRemove, +}: CardProps) { + const toolCount = assistant.tools?.length ?? 0; + const docSetCount = assistant.document_sets?.length ?? 0; + const author = ownerDisplayName(assistant); + const isBuiltIn = assistant.default_persona; + + return ( +
+ {/* Status badge (top-right). Only appears when already added — the + absence of a badge is the signal for "available to add". */} + {isAdded && ( +
+ In your picker +
+ )} + + {/* Header: icon + name (+ built-in badge if applicable) */} +
+ +
+
+

+ {assistant.name} +

+ {isBuiltIn && ( + + Built-in + + )} +
+
+
+ + {/* Description — primary signal of "should I pick this?". */} + {assistant.description && ( +

+ {assistant.description} +

+ )} + + {/* Compact scope chips: {n} tools (hover reveals names) + {n} sources */} + {(toolCount > 0 || docSetCount > 0) && ( +
+ {docSetCount > 0 && ( + +
+ + {docSetCount} source{docSetCount === 1 ? "" : "s"} +
+
+ )} + {toolCount > 0 && ( + +
toolDisplayName(t.name)) + .join(", ")} + > + + {toolCount} tool{toolCount === 1 ? "" : "s"} +
+
+ )} +
+ )} + + {/* Footer row: author (or built-in subtle text) + Add/Remove */} +
+
+ {isBuiltIn ? ( + Bundled assistant + ) : author ? ( + by {author} + ) : ( + // Public assistant with no owner record — rare; surface + // gracefully without the old "Author: Darwin" magic string. + Public + )} +
+ + {/* Add/Remove — flat, matches design system. Tremor's color="green" + for "Add to my list" was visually shoutier than the action. */} + {user && + (isAdded ? ( + + ) : ( + + ))} +
+
+ ); +} + +// --------------------------------------------------------------------------- +// Filter chip — small reusable toggle pill +// --------------------------------------------------------------------------- + +function FilterChip({ + active, + onClick, + children, + badge, +}: { + active: boolean; + onClick: () => void; + children: React.ReactNode; + badge?: number; +}) { + return ( + + ); +} + +// --------------------------------------------------------------------------- +// Main +// --------------------------------------------------------------------------- export function AssistantsGallery({ assistants, @@ -24,181 +336,468 @@ export function AssistantsGallery({ assistants: Persona[]; user: User | null; }) { - function filterAssistants(assistants: Persona[], query: string): Persona[] { - return assistants.filter( - (assistant) => - assistant.name.toLowerCase().includes(query.toLowerCase()) || - assistant.description.toLowerCase().includes(query.toLowerCase()) - ); - } - const router = useRouter(); - - const [searchQuery, setSearchQuery] = useState(""); const { popup, setPopup } = usePopup(); - const allAssistantIds = assistants.map((assistant) => assistant.id); - const filteredAssistants = filterAssistants(assistants, searchQuery); + // Mirrors the Manage page: no preference = every accessible assistant + // is "in the picker" by default. + const initialChosen: number[] = + user?.preferences?.chosen_assistants ?? assistants.map((a) => a.id); + const [chosenAssistants, setChosenAssistants] = + useState(initialChosen); + const chosenSet = useMemo( + () => new Set(chosenAssistants), + [chosenAssistants] + ); + + // ---- filter / sort state ------------------------------------------------- + + const [search, setSearch] = useState(""); + const [availability, setAvailability] = useState("all"); + const [toolFilters, setToolFilters] = useState>(new Set()); + const [sortMode, setSortMode] = useState("featured"); + + // ---- derived: which tool chips to surface -------------------------------- + // Only show chips for tools that appear in ≥2 assistants. Keeps the chip + // row short as the dataset grows and avoids one-off tools cluttering it. + const commonTools: string[] = useMemo(() => { + const counts = new Map(); + for (const a of assistants) { + const seen = new Set(); + for (const t of a.tools ?? []) { + // dedupe within one assistant (in case the same tool appears twice) + if (seen.has(t.name)) continue; + seen.add(t.name); + counts.set(t.name, (counts.get(t.name) ?? 0) + 1); + } + } + return Array.from(counts.entries()) + .filter(([, c]) => c >= 2) + .sort((a, b) => b[1] - a[1]) + .map(([name]) => name); + }, [assistants]); + + // ---- derived: filtered + sorted list ------------------------------------- + + const filtered: Persona[] = useMemo(() => { + const q = search.trim().toLowerCase(); + const out = assistants.filter((a) => { + // search across name + description + tool names + doc-set names + if (q) { + const hay = [ + a.name, + a.description ?? "", + ...(a.tools ?? []).map((t) => toolDisplayName(t.name)), + ...(a.tools ?? []).map((t) => t.name), + ...(a.document_sets ?? []).map((d) => d.name), + ] + .join(" ") + .toLowerCase(); + if (!hay.includes(q)) return false; + } + if (availability === "added" && !chosenSet.has(a.id)) return false; + if (availability === "available" && chosenSet.has(a.id)) return false; + if (toolFilters.size > 0) { + // OR semantics — "show me ones with Search OR Image Gen". Narrower + // intersection ("AND") via multiple tool chips is rarely what users + // mean when ticking pills. + const names = new Set((a.tools ?? []).map((t) => t.name)); + const any = Array.from(toolFilters).some((tf) => names.has(tf)); + if (!any) return false; + } + return true; + }); + + if (sortMode === "name-asc") { + out.sort((a, b) => a.name.localeCompare(b.name)); + } else if (sortMode === "recent") { + // No created_at on Persona; id desc is a fair proxy ("newer ids + // were created later"). + out.sort((a, b) => b.id - a.id); + } + // "featured" = preserve API order (admins curate via display_priority). + return out; + }, [assistants, search, availability, toolFilters, sortMode, chosenSet]); + + // ---- derived: sections --------------------------------------------------- + + const sections: SectionDef[] = useMemo(() => { + const yours: Persona[] = []; + const shared: Persona[] = []; + const featured: Persona[] = []; + + for (const a of filtered) { + const ownedByUser = checkUserOwnsAssistant(user, a); + const sharedWithUser = + user != null && + !ownedByUser && + !a.is_public && + (a.users ?? []).some((u) => u.id === user.id); + + if (ownedByUser && !a.default_persona) { + yours.push(a); + } else if (sharedWithUser) { + shared.push(a); + } else { + // Public OR built-in OR (accessible via group permission). All + // surface here as "Featured & Built-in" — visually equivalent + // from a chooser's POV. + featured.push(a); + } + } + + const out: SectionDef[] = []; + if (yours.length > 0) out.push({ key: "yours", label: "Yours", assistants: yours }); + if (shared.length > 0) + out.push({ key: "shared", label: "Shared with you", assistants: shared }); + if (featured.length > 0) + out.push({ + key: "featured", + label: "Featured & Built-in", + assistants: featured, + }); + return out; + }, [filtered, user]); + + // ---- counts for filter chips --------------------------------------------- + + const counts = useMemo(() => { + const q = search.trim().toLowerCase(); + let all = 0; + let added = 0; + let available = 0; + for (const a of assistants) { + if (q) { + const hay = [ + a.name, + a.description ?? "", + ...(a.tools ?? []).map((t) => toolDisplayName(t.name)), + ...(a.tools ?? []).map((t) => t.name), + ...(a.document_sets ?? []).map((d) => d.name), + ] + .join(" ") + .toLowerCase(); + if (!hay.includes(q)) continue; + } + if (toolFilters.size > 0) { + const names = new Set((a.tools ?? []).map((t) => t.name)); + if (!Array.from(toolFilters).some((tf) => names.has(tf))) continue; + } + all++; + if (chosenSet.has(a.id)) added++; + else available++; + } + return { all, added, available }; + }, [assistants, search, toolFilters, chosenSet]); + + // ---- optimistic add/remove (mirrors Manage page persistOrder) ----------- + + const persistChosen = async ( + next: number[], + { + successMsg, + undoToOrder, + }: { successMsg?: string; undoToOrder?: number[] } = {} + ): Promise => { + const prev = chosenAssistants; + setChosenAssistants(next); + const ok = await reorderAssistantList(next); + if (!ok) { + setChosenAssistants(prev); + setPopup({ + message: "Couldn't update your assistant list — please try again.", + type: "error", + }); + return false; + } + if (successMsg) { + setPopup({ + message: successMsg, + type: "success", + undo: + undoToOrder !== undefined + ? { + onClick: async () => { + await persistChosen(undoToOrder); + }, + } + : undefined, + }); + } + router.refresh(); + return true; + }; + + const handleAdd = async (a: Persona) => { + if (!user) return; + if (chosenSet.has(a.id)) return; // already added — no-op + const prev = chosenAssistants; + const next = [...prev, a.id]; + // Use addAssistantToList specifically (idempotent) rather than the + // generic reorder helper — both PATCH the same endpoint, but this + // signals intent at the call-site. + setChosenAssistants(next); + const ok = await addAssistantToList(a.id, prev); + if (!ok) { + setChosenAssistants(prev); + setPopup({ + message: `Couldn't add "${a.name}". Try again?`, + type: "error", + }); + return; + } + setPopup({ + message: `"${a.name}" added to your picker.`, + type: "success", + undo: { + onClick: async () => { + await persistChosen(prev); + }, + }, + }); + router.refresh(); + }; + + const handleRemove = async (a: Persona) => { + if (!user) return; + if (chosenAssistants.length === 1 && chosenAssistants[0] === a.id) { + setPopup({ + message: + "You need at least one visible assistant — can't remove the last one.", + type: "error", + }); + return; + } + const prev = chosenAssistants; + const next = prev.filter((id) => id !== a.id); + setChosenAssistants(next); + const ok = await removeAssistantFromList(a.id, prev); + if (!ok) { + setChosenAssistants(prev); + setPopup({ + message: `Couldn't remove "${a.name}". Try again?`, + type: "error", + }); + return; + } + setPopup({ + message: `"${a.name}" removed from your picker.`, + type: "success", + undo: { + onClick: async () => { + await persistChosen(prev); + }, + }, + }); + router.refresh(); + }; + + // ---- handlers for filter UI ---------------------------------------------- + + const clearAllFilters = () => { + setSearch(""); + setAvailability("all"); + setToolFilters(new Set()); + }; + + const toggleToolFilter = (toolName: string) => { + setToolFilters((curr) => { + const next = new Set(curr); + if (next.has(toolName)) next.delete(toolName); + else next.add(toolName); + return next; + }); + }; + + const hasAnyFilter = + search.trim() !== "" || availability !== "all" || toolFilters.size > 0; + + // ---- render ------------------------------------------------------------- return ( <> {popup} -
- Assistant Gallery -
- - View Your Assistants + +
+ {/* Header — matches the Manage page rebuild */} +
+
+ Assistant Gallery +

+ Browse every assistant available to you. Add the ones you want + to your chat picker. +

+
+ + Create new
-

- Discover and create custom assistants that combine instructions, extra - knowledge, and any combination of tools. -

+
+ + ← Back to my assistants + +
-
+ {/* Search */} +
+ setSearchQuery(e.target.value)} + type="search" + placeholder="Search by name, description, tool, or source…" + value={search} + onChange={(e) => setSearch(e.target.value)} className=" - w-full - p-2 - border - border-gray-300 - rounded - focus:outline-none - focus:ring-2 - focus:ring-blue-500 - " + w-full pl-10 pr-3 py-2 + rounded-md border border-border bg-background + focus:outline-none focus:ring-2 focus:ring-accent + " />
-
- {filteredAssistants.map((assistant) => ( -
+ setAvailability("all")} + badge={counts.all} + > + All + + setAvailability("available")} + badge={counts.available} + > + Available to add + + setAvailability("added")} + badge={counts.added} + > + Already added + + + {/* Sort lives at the right end of the filter row so visually + "narrow the list" and "order the list" are co-located. */} +
+ + +
+
+ + {commonTools.length > 0 && ( +
+ Tools: + {commonTools.map((tn) => ( + toggleToolFilter(tn)} + > + + {toolIcon(tn)} {toolDisplayName(tn)} + + + ))} + {toolFilters.size > 0 && ( + + )} +
+ )} + + {/* Empty state when filters exclude everything */} + {sections.length === 0 && ( +
+

No assistants match.

+

+ {hasAnyFilter + ? "Try a different filter, or clear all to see everything." + : "There are no assistants available to you yet."} +

+ {hasAnyFilter && ( + - ) : ( - - )} -
- )} -
- {assistant.tools.length > 0 && ( - - )} -

{assistant.description}

-

- Author: {assistant.owner?.email || "Darwin"} -

+ > + Clear all filters + + )} +
+ )} + + {/* Sections */} + {sections.map((section) => ( +
+
+

+ {section.label} +

+ + ({section.assistants.length}) +
- ))} -
+
+ {section.assistants.map((assistant) => ( + + ))} +
+ + ))}
); From 6536a9535733902bbf3990a9d8727ed81484757b Mon Sep 17 00:00:00 2001 From: rajiv chodisetti Date: Thu, 28 May 2026 18:56:11 +0530 Subject: [PATCH 007/102] Add backend/scripts/seed_assistants.py for local UX testing MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Dev-only seed script that creates N varied personas in the local DB for exercising the redesigned gallery / manage pages. Refuses to run when POSTGRES_HOST looks like a managed/prod database (azure.com, amazonaws.com, .cloud., "prod", etc.) — guard against pointing this at the wrong env by accident. Mix is designed to populate each section of the new gallery: ~30% "Yours" — owned by target user, private ~20% "Shared with you" — owned by another user, target user in users[] ~50% "Featured" — public, no specific owner Per row randomly attaches 0–3 tools and 0–2 document sets so the {n} tools / {n} sources chips render with variety. Half of "Yours" auto- land in chosen_assistants (and all "Shared with you" do), so the "Already added" vs "Available to add" filter chips have content on both sides without manual setup. 60 distinct names + 30 description templates so 50 rows feel populated and varied. Uses a fixed RNG seed by default (deterministic across runs). Name prefix "[seed] " makes rows easy to spot and to wipe via --clear. Usage: cd backend && source ../.venv/bin/activate python -m scripts.seed_assistants --email you@example.com python -m scripts.seed_assistants --clear # wipe and re-seed Co-Authored-By: Claude Opus 4.7 (1M context) --- backend/scripts/seed_assistants.py | 355 +++++++++++++++++++++++++++++ 1 file changed, 355 insertions(+) create mode 100644 backend/scripts/seed_assistants.py diff --git a/backend/scripts/seed_assistants.py b/backend/scripts/seed_assistants.py new file mode 100644 index 00000000000..10a641e4c4d --- /dev/null +++ b/backend/scripts/seed_assistants.py @@ -0,0 +1,355 @@ +"""Seed N varied personas/assistants into the local DB for UX testing. + +WARNING — local dev tool only. Runs whatever `DATABASE_URL` / `POSTGRES_*` +env vars point at. NEVER point this at a prod Postgres. If `POSTGRES_HOST` +contains anything that smells like prod (configured to error below), the +script aborts. + +Produces a realistic mix for exercising the redesigned gallery page: + + ~30% "Yours" — owned by the target user (private) + ~20% "Shared with you" — owned by another user, target user in users[] + ~50% "Featured" — public (is_public=True, no specific owner) + +Each row gets a random subset of available tools / document sets so the +{n} tools / {n} sources chips render with variety. Half of "Yours" land +in the user's chosen_assistants picker, half do not — so the "Already +added" / "Available to add" filter chips have content on both sides. + +Usage (from repo root): + + cd backend + source ../.venv/bin/activate + python -m scripts.seed_assistants --email you@example.com --count 50 + + # Wipe just the seeded rows (by name prefix) and re-seed: + python -m scripts.seed_assistants --clear + python -m scripts.seed_assistants --email you@example.com --count 50 + +Notes: + * Re-running without --clear stacks more rows. Use --prefix to namespace. + * If --email isn't supplied, picks the first admin user in the DB. + * If only one user exists, the "Shared with you" tier is folded into + "Featured" since there's no one else to own them. +""" +from __future__ import annotations + +import argparse +import os +import random +import sys +from typing import Sequence + +from sqlalchemy import select +from sqlalchemy.orm import Session + +from danswer.auth.schemas import UserRole +from danswer.db.engine import SessionFactory +from danswer.db.models import DocumentSet +from danswer.db.models import Persona +from danswer.db.models import Persona__User +from danswer.db.models import Tool +from danswer.db.models import User +from danswer.search.enums import RecencyBiasSetting + + +# --- safety: don't blast prod by accident --------------------------------- + +# If POSTGRES_HOST contains any of these substrings, bail. Extend as +# needed. The whole point is: this script generates fake data; you only +# want it on your own laptop's Postgres. +_PROD_HOST_FINGERPRINTS = ( + "azure.com", # Azure managed Postgres (darwin uses one) + "amazonaws.com", + "rds.", + "gcp.", + ".cloud.", + "prod", + "production", +) + + +def _abort_if_pointed_at_prod() -> None: + host = (os.environ.get("POSTGRES_HOST") or "").lower() + for marker in _PROD_HOST_FINGERPRINTS: + if marker in host: + print( + f"REFUSING TO RUN: POSTGRES_HOST={host!r} looks like a prod DB.\n" + f"Point POSTGRES_HOST at localhost / your dev container first.", + file=sys.stderr, + ) + sys.exit(2) + + +# --- content pools -------------------------------------------------------- + +# 60 distinct names so we can cover the requested ~50 without dup. +_NAMES: list[str] = [ + "Research Pal", "Code Reviewer", "SQL Helper", "Email Drafter", + "Bug Triage", "API Documenter", "Test Writer", "Meeting Summarizer", + "Slack Digest", "Stand-up Buddy", "Customer Insights", "Onboarding Guide", + "Roadmap Reviewer", "Incident Reporter", "Refactor Assistant", + "Release Notes", "Spec Reader", "RFC Writer", "PR Summarizer", + "Postmortem Helper", "Design Critic", "Architecture Sketch", + "Security Reviewer", "Threat Modeler", "Compliance Auditor", + "Pricing Analyst", "Sales Enabler", "Renewal Scout", "Churn Predictor", + "Marketing Riff", "Blog Draftsman", "Tweet Polisher", "Tagline Brewer", + "FAQ Generator", "Support Tier-1", "Escalation Helper", "Runbook Walker", + "Migration Planner", "Schema Diff Reader", "Index Tuner", "Query Explainer", + "Log Whisperer", "Metric Hunter", "Alert Wrangler", "Dashboard Builder", + "Hire Brief", "Interview Scribe", "Skill Mapper", "Doc Search", "Wiki Pal", + "Note Taker", "Action-Items Finder", "Standup Cliff-Notes", "Investor FAQ", + "Roadblock Spotter", "OKR Reviewer", "Quarterly Recap", "Pitch Sharpener", + "Customer-Reply Drafter", "Demo Outline", +] + +# 30 description templates — varied tones / scopes so the cards don't all +# read the same. +_DESCRIPTIONS: list[str] = [ + "Answers questions about our codebase using semantic search across the indexed repos.", + "Drafts polished customer-facing emails in the company's voice.", + "Summarizes long Slack threads and surfaces decisions and action items.", + "Reads design docs and points out the assumptions and the risky bits.", + "Generates SQL against the analytics warehouse from a plain-English question.", + "Triages new bug reports — classifies severity, finds duplicates, and assigns.", + "Writes release notes from a list of merged PR titles.", + "Cross-references Jira tickets and surfaces blocked dependencies.", + "Helps onboard new engineers by answering 'where does X live?' questions.", + "Reviews pull requests for naming, structure, and style consistency.", + "Drafts incident postmortems from log excerpts and timeline notes.", + "Translates marketing copy into different audience voices.", + "Walks runbooks step by step, asking before each destructive action.", + "Reads the customer-success knowledge base and answers tier-1 tickets.", + "Explains an unfamiliar SQL query — joins, CTEs, window functions.", + "Reviews quarterly OKR drafts for measurability and ambition.", + "Builds the outline of a sales demo from a list of pain points.", + "Tightens taglines — shorter, sharper, fewer adjectives.", + "Sketches an architecture diagram outline from a design doc.", + "Surfaces churn-risk signals from a list of recent customer emails.", + "Answers HR / benefits FAQ from the employee handbook.", + "Reads RFCs and writes the executive summary at the top.", + "Indexes API documentation and answers 'how do I do X' questions.", + "Drafts response templates for support tickets matching common patterns.", + "Generates test cases for a function or endpoint from its signature.", + "Reviews threat models against OWASP top-10 categories.", + "Plans data migrations — pre-checks, batch sizing, rollback steps.", + "Reads incident-channel logs and produces a concise five-line summary.", + "Brainstorms blog post angles given a working title.", + "Helps interviewers stay structured — drafts notes, scores, follow-ups.", +] + + +# --- helpers -------------------------------------------------------------- + + +def _pick(rng: random.Random, items: Sequence, k_min: int, k_max: int) -> list: + """Return between k_min and k_max random items (without replacement). + + Tolerates `items` being shorter than k_max — caps at available length. + """ + if not items: + return [] + upper = min(k_max, len(items)) + k = rng.randint(k_min, upper) + if k <= 0: + return [] + return rng.sample(list(items), k) + + +def _resolve_target_user(session: Session, email: str | None) -> User | None: + if email: + user = session.scalar(select(User).where(User.email == email)) + if user is None: + print(f"No user with email {email!r} found.", file=sys.stderr) + return user + # No email given — prefer an admin user, fall back to any user. + admin = session.scalar( + select(User).where(User.role == UserRole.ADMIN).limit(1) + ) + if admin is not None: + return admin + return session.scalar(select(User).limit(1)) + + +def _pick_other_user(session: Session, target_user_id) -> User | None: + """Find a user other than the target to own the "shared with you" rows.""" + return session.scalar( + select(User).where(User.id != target_user_id).limit(1) + ) + + +def _clear(session: Session, prefix: str) -> int: + """Soft-delete by name prefix is risky if a real persona shares the + prefix. We assert prefix is non-empty and unmistakably synthetic. + """ + if not prefix or len(prefix) < 3: + print( + f"Refusing to clear with suspiciously short prefix {prefix!r}.", + file=sys.stderr, + ) + sys.exit(2) + personas = session.scalars( + select(Persona).where(Persona.name.startswith(prefix)) + ).all() + n = 0 + for p in personas: + # Hard delete — these are synthetic seed rows, not user data. + # Junction rows clean up via cascade configured on the model. + session.delete(p) + n += 1 + session.commit() + return n + + +# --- main ----------------------------------------------------------------- + + +def main() -> None: + _abort_if_pointed_at_prod() + + ap = argparse.ArgumentParser(description=__doc__.splitlines()[0]) + ap.add_argument("--count", type=int, default=50, help="How many to create.") + ap.add_argument( + "--email", + help="Target user email — the 'me' for testing. Default: first admin user.", + ) + ap.add_argument( + "--prefix", + default="[seed] ", + help="Name prefix so seeded rows are easy to spot / clear. (default: '[seed] ')", + ) + ap.add_argument( + "--clear", + action="store_true", + help="Delete previously seeded personas (by --prefix) and exit.", + ) + ap.add_argument( + "--seed", + type=int, + default=42, + help="RNG seed — same seed = same data each run. Default: 42.", + ) + args = ap.parse_args() + + with SessionFactory() as session: + if args.clear: + n = _clear(session, args.prefix) + print(f"Cleared {n} seeded personas (prefix={args.prefix!r}).") + return + + target_user = _resolve_target_user(session, args.email) + if target_user is None: + print( + "No users in DB. Sign in to the app first so a user row " + "exists, then re-run.", + file=sys.stderr, + ) + sys.exit(1) + + other_user = _pick_other_user(session, target_user.id) + tools = list(session.scalars(select(Tool)).all()) + doc_sets = list(session.scalars(select(DocumentSet)).all()) + + rng = random.Random(args.seed) + + if args.count > len(_NAMES): + print( + f"--count={args.count} exceeds {len(_NAMES)} unique names; " + f"will cycle with numeric suffixes.", + file=sys.stderr, + ) + + # Yours: ~30%, Shared: ~20% (only if other_user exists), rest Featured. + yours_n = max(1, args.count * 30 // 100) + shared_n = args.count * 20 // 100 if other_user is not None else 0 + featured_n = args.count - yours_n - shared_n + + # Track which Yours rows land in the user's picker (half do). + # We'll mutate chosen_assistants at the end of the run. + new_chosen_ids: list[int] = [] + + created = 0 + for i in range(args.count): + base_name = _NAMES[i % len(_NAMES)] + suffix = "" if i < len(_NAMES) else f" #{i // len(_NAMES) + 1}" + name = f"{args.prefix}{base_name}{suffix}" + desc = rng.choice(_DESCRIPTIONS) + persona_tools = _pick(rng, tools, 0, 3) + persona_docs = _pick(rng, doc_sets, 0, 2) + + if i < yours_n: + owner_id = target_user.id + is_public = False + shared_target = None + elif i < yours_n + shared_n: + # Owned by someone else, granted to target user via Persona__User. + owner_id = other_user.id if other_user else None + is_public = False + shared_target = target_user.id + else: + # Public / featured — no specific owner. + owner_id = None + is_public = True + shared_target = None + + persona = Persona( + name=name, + description=desc, + user_id=owner_id, + is_public=is_public, + # Required scalars on Persona — pick sensible defaults so + # the row is queryable by get_personas without errors. + llm_relevance_filter=False, + llm_filter_extraction=False, + recency_bias=RecencyBiasSetting.AUTO, + default_persona=False, + is_visible=True, + deleted=False, + num_chunks=None, + llm_model_provider_override=None, + llm_model_version_override=None, + starter_messages=None, + tools=persona_tools, + document_sets=persona_docs, + ) + session.add(persona) + session.flush() # populate persona.id + + if shared_target is not None: + session.add( + Persona__User(persona_id=persona.id, user_id=shared_target) + ) + + # Half of "Yours" auto-land in the picker; the other half are + # available-to-add. Featured rows never auto-add (the user can + # add them from the gallery). Shared rows auto-add so the user + # sees their permitted assistants in chat immediately. + if i < yours_n and i % 2 == 0: + new_chosen_ids.append(persona.id) + elif yours_n <= i < yours_n + shared_n: + new_chosen_ids.append(persona.id) + + created += 1 + + # Merge with the target user's existing chosen_assistants (if any). + # We APPEND so we don't disturb whatever order they already have. + if new_chosen_ids: + existing = list(target_user.chosen_assistants or []) + target_user.chosen_assistants = existing + new_chosen_ids + + session.commit() + + print(f"Created {created} personas under prefix {args.prefix!r}.") + print(f" Target user : {target_user.email}") + if other_user is not None: + print(f" Shared-from user : {other_user.email}") + print(f" Yours : {yours_n}") + print(f" Shared with you : {shared_n}") + print(f" Featured / public : {featured_n}") + print(f" Auto-added to picker: {len(new_chosen_ids)}") + print() + print("Open /assistants/gallery to see them. Run with --clear to wipe.") + + +if __name__ == "__main__": + main() From 41c2df3c4b3355f036f73ec23e43b23ee0d66ed1 Mon Sep 17 00:00:00 2001 From: rajiv chodisetti Date: Thu, 28 May 2026 19:18:32 +0530 Subject: [PATCH 008/102] Assistants UX polish: toggle highlight + gallery declutter MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two follow-ups on the assistants UX rework, both from user feedback. Manage (/assistants/mine): * Toggle now accepts a `highlight` prop that draws a transient ring + slight scale-up on the switch. Used by hidden rows so a click anywhere on the (faded) row body flashes the toggle for ~1.2s, pointing the eye at the action that brings the assistant back. Doesn't auto-enable on body click — surprising a user mid-read into enabling would be a worse outcome than the discoverability gap. * Restructured opacity: only the content column (icon + name + description + chips) fades when a row is hidden. The action zone (checkbox, drag-slot, pin, toggle, share, edit) stays at full opacity so the toggle is the bright, clickable target on a dim row. Previously the parent opacity-50 cascaded to every child, making the toggle the dimmest thing on the dimmest row. * stopPropagation on the action zone so clicks on buttons inside it don't trigger the row-body flash handler. Gallery (/assistants/gallery): * Removed all tool-related UI per user request — the page is for browsing assistants, and tool filter chips + per-card "{n} tools" pulled focus from the assistant itself. Gone: the auto-generated tool filter chip row, the per-card tools chip, the toolDisplayName / toolIcon helpers, the FiTool / FiImage / FiCheck imports, and the toolFilters state + commonTools memo. Search hay is now name + description + document-set names (no tool names). * Dropped the absolute top-right "In your picker" badge. The muted card style (border + opacity-75) plus the "Remove" button in the footer already signal "added"; the badge ate horizontal space (pr-24 on the header reserved 96px) and crowded the title at narrower widths. Removed the pr-24 reservation now that nothing overlays the header. * Grid capped at `1 / 2 / 3` cols — 1 on mobile, 2 on most laptops and standard desktops, 3 only at `2xl` (≥1536px). Previously 1/2/3/4 with the 4-col breakpoint making cards cramped and hard to read once descriptions hit their 3-line clamp. * Bumped card padding p-4 → p-5 and description line-height to leading-relaxed for breathing room. * Updated clearAllFilters / hasAnyFilter to drop the toolFilters references (now dead). Verified: npx tsc --noEmit clean across web/ (0 errors), zero stray references to the removed helpers. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../assistants/gallery/AssistantsGallery.tsx | 181 ++++-------------- .../app/assistants/mine/AssistantsList.tsx | 73 ++++++- 2 files changed, 95 insertions(+), 159 deletions(-) diff --git a/web/src/app/assistants/gallery/AssistantsGallery.tsx b/web/src/app/assistants/gallery/AssistantsGallery.tsx index 88cb1eb7f6c..ba74363d24b 100644 --- a/web/src/app/assistants/gallery/AssistantsGallery.tsx +++ b/web/src/app/assistants/gallery/AssistantsGallery.tsx @@ -59,15 +59,7 @@ import { checkUserOwnsAssistant } from "@/lib/assistants/checkOwnership"; import { AssistantsPageTitle } from "../AssistantsPageTitle"; import Link from "next/link"; import { useRouter } from "next/navigation"; -import { - FiBookmark, - FiCheck, - FiImage, - FiPlus, - FiSearch, - FiTool, - FiX, -} from "react-icons/fi"; +import { FiBookmark, FiPlus, FiSearch, FiX } from "react-icons/fi"; // --------------------------------------------------------------------------- // Types @@ -86,22 +78,6 @@ interface SectionDef { // Small helpers // --------------------------------------------------------------------------- -/** Friendly tool name. "SearchTool" → "Search", "ImageGenerationTool" → - * "Image Generation", else fall back to the raw name. Mirrors the - * mapping in ToolsDisplay; kept inline here so this page is self- - * contained. */ -function toolDisplayName(rawName: string): string { - if (rawName === "SearchTool") return "Search"; - if (rawName === "ImageGenerationTool") return "Image Generation"; - return rawName; -} - -function toolIcon(rawName: string) { - if (rawName === "SearchTool") return ; - if (rawName === "ImageGenerationTool") return ; - return ; -} - /** Best-effort author display name. Names aren't on MinimalUserSnapshot; * we have email only. Split on '@' so "foo.bar@example.com" → "foo.bar" * rather than dumping the full email at the user. */ @@ -134,7 +110,11 @@ function GalleryCard({ onAdd, onRemove, }: CardProps) { - const toolCount = assistant.tools?.length ?? 0; + // Note: tool-related UI was intentionally removed from this page — + // the gallery is for browsing assistants, and tool filter chips + + // per-card "{n} tools" pulled focus away from the assistant itself. + // The {n} sources chip stays — sources are about the assistant's + // knowledge scope, not a "tool" in the user's mental model. const docSetCount = assistant.document_sets?.length ?? 0; const author = ownerDisplayName(assistant); const isBuiltIn = assistant.default_persona; @@ -142,7 +122,7 @@ function GalleryCard({ return (
- {/* Status badge (top-right). Only appears when already added — the - absence of a badge is the signal for "available to add". */} - {isAdded && ( -
- In your picker -
- )} - - {/* Header: icon + name (+ built-in badge if applicable) */} -
+ {/* Header: icon + name (+ built-in badge if applicable). The + prior absolute top-right "In your picker" badge was dropped — + the muted card style + the Remove button in the footer + already signal "added"; the badge ate horizontal space and + crowded the title at narrower widths. */} +
@@ -194,35 +162,23 @@ function GalleryCard({ {/* Description — primary signal of "should I pick this?". */} {assistant.description && ( -

+

{assistant.description}

)} - {/* Compact scope chips: {n} tools (hover reveals names) + {n} sources */} - {(toolCount > 0 || docSetCount > 0) && ( + {/* Sources chip — knowledge-scope summary. Tools were + intentionally removed from this page (this is for browsing + assistants, not exploring tools); sources stay because + they're about the assistant's knowledge, not a "tool". */} + {docSetCount > 0 && (
- {docSetCount > 0 && ( - -
- - {docSetCount} source{docSetCount === 1 ? "" : "s"} -
-
- )} - {toolCount > 0 && ( - -
toolDisplayName(t.name)) - .join(", ")} - > - - {toolCount} tool{toolCount === 1 ? "" : "s"} -
-
- )} + +
+ + {docSetCount} source{docSetCount === 1 ? "" : "s"} +
+
)} @@ -354,41 +310,20 @@ export function AssistantsGallery({ const [search, setSearch] = useState(""); const [availability, setAvailability] = useState("all"); - const [toolFilters, setToolFilters] = useState>(new Set()); const [sortMode, setSortMode] = useState("featured"); - // ---- derived: which tool chips to surface -------------------------------- - // Only show chips for tools that appear in ≥2 assistants. Keeps the chip - // row short as the dataset grows and avoids one-off tools cluttering it. - const commonTools: string[] = useMemo(() => { - const counts = new Map(); - for (const a of assistants) { - const seen = new Set(); - for (const t of a.tools ?? []) { - // dedupe within one assistant (in case the same tool appears twice) - if (seen.has(t.name)) continue; - seen.add(t.name); - counts.set(t.name, (counts.get(t.name) ?? 0) + 1); - } - } - return Array.from(counts.entries()) - .filter(([, c]) => c >= 2) - .sort((a, b) => b[1] - a[1]) - .map(([name]) => name); - }, [assistants]); - // ---- derived: filtered + sorted list ------------------------------------- + // Tool-related filtering was removed from this page — the gallery is + // about assistants, not their internals. Search now only matches + // name + description + document-set names. const filtered: Persona[] = useMemo(() => { const q = search.trim().toLowerCase(); const out = assistants.filter((a) => { - // search across name + description + tool names + doc-set names if (q) { const hay = [ a.name, a.description ?? "", - ...(a.tools ?? []).map((t) => toolDisplayName(t.name)), - ...(a.tools ?? []).map((t) => t.name), ...(a.document_sets ?? []).map((d) => d.name), ] .join(" ") @@ -397,14 +332,6 @@ export function AssistantsGallery({ } if (availability === "added" && !chosenSet.has(a.id)) return false; if (availability === "available" && chosenSet.has(a.id)) return false; - if (toolFilters.size > 0) { - // OR semantics — "show me ones with Search OR Image Gen". Narrower - // intersection ("AND") via multiple tool chips is rarely what users - // mean when ticking pills. - const names = new Set((a.tools ?? []).map((t) => t.name)); - const any = Array.from(toolFilters).some((tf) => names.has(tf)); - if (!any) return false; - } return true; }); @@ -417,7 +344,7 @@ export function AssistantsGallery({ } // "featured" = preserve API order (admins curate via display_priority). return out; - }, [assistants, search, availability, toolFilters, sortMode, chosenSet]); + }, [assistants, search, availability, sortMode, chosenSet]); // ---- derived: sections --------------------------------------------------- @@ -471,24 +398,18 @@ export function AssistantsGallery({ const hay = [ a.name, a.description ?? "", - ...(a.tools ?? []).map((t) => toolDisplayName(t.name)), - ...(a.tools ?? []).map((t) => t.name), ...(a.document_sets ?? []).map((d) => d.name), ] .join(" ") .toLowerCase(); if (!hay.includes(q)) continue; } - if (toolFilters.size > 0) { - const names = new Set((a.tools ?? []).map((t) => t.name)); - if (!Array.from(toolFilters).some((tf) => names.has(tf))) continue; - } all++; if (chosenSet.has(a.id)) added++; else available++; } return { all, added, available }; - }, [assistants, search, toolFilters, chosenSet]); + }, [assistants, search, chosenSet]); // ---- optimistic add/remove (mirrors Manage page persistOrder) ----------- @@ -597,20 +518,9 @@ export function AssistantsGallery({ const clearAllFilters = () => { setSearch(""); setAvailability("all"); - setToolFilters(new Set()); - }; - - const toggleToolFilter = (toolName: string) => { - setToolFilters((curr) => { - const next = new Set(curr); - if (next.has(toolName)) next.delete(toolName); - else next.add(toolName); - return next; - }); }; - const hasAnyFilter = - search.trim() !== "" || availability !== "all" || toolFilters.size > 0; + const hasAnyFilter = search.trim() !== "" || availability !== "all"; // ---- render ------------------------------------------------------------- @@ -714,32 +624,6 @@ export function AssistantsGallery({
- {commonTools.length > 0 && ( -
- Tools: - {commonTools.map((tn) => ( - toggleToolFilter(tn)} - > - - {toolIcon(tn)} {toolDisplayName(tn)} - - - ))} - {toolFilters.size > 0 && ( - - )} -
- )} - {/* Empty state when filters exclude everything */} {sections.length === 0 && (
@@ -781,8 +665,7 @@ export function AssistantsGallery({ grid gap-3 grid-cols-1 md:grid-cols-2 - lg:grid-cols-3 - xl:grid-cols-4 + 2xl:grid-cols-3 " > {section.assistants.map((assistant) => ( diff --git a/web/src/app/assistants/mine/AssistantsList.tsx b/web/src/app/assistants/mine/AssistantsList.tsx index 46b2a9d2ee6..6af6f1ebdec 100644 --- a/web/src/app/assistants/mine/AssistantsList.tsx +++ b/web/src/app/assistants/mine/AssistantsList.tsx @@ -34,7 +34,7 @@ * callers but no longer used here. */ -import { useMemo, useState } from "react"; +import { useMemo, useRef, useState } from "react"; import { MinimalUserSnapshot, User } from "@/lib/types"; import { Persona } from "@/app/admin/assistants/interfaces"; import { Text } from "@tremor/react"; @@ -94,10 +94,15 @@ function Toggle({ checked, onChange, ariaLabel, + highlight = false, }: { checked: boolean; onChange: (next: boolean) => void; ariaLabel: string; + // When true, draw a transient ring around the switch to direct the + // eye. Used by hidden rows so clicking the (faded) row body points + // the user at the action that brings the assistant back. + highlight?: boolean; }) { return ( )} - {/* Visibility — switch instead of a buried popover item. */} + {/* Visibility — switch instead of a buried popover item. + `highlight` is set by the row-body click handler on hidden + rows so a click anywhere on the (faded) row body draws the + eye to the action that brings the assistant back. */} onToggleVisibility(assistant.id, next)} + highlight={highlightToggle} ariaLabel={ isVisible ? `Hide ${assistant.name} from the picker` From c7c1a5bc4bfff889d118d1f26a22194666ac01a4 Mon Sep 17 00:00:00 2001 From: rajiv chodisetti Date: Thu, 28 May 2026 19:22:59 +0530 Subject: [PATCH 009/102] Remove tools chip from Manage Assistants page MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Mirrors the gallery treatment from the previous commit. The user reported tool execution isn't reliable yet, and surfacing "{n} tools" on assistant rows misleads users into picking an assistant for a capability that may not work in practice. Dropped: the {n} tools Bubble in the row's chip block, the toolCount derivation, and the FiTool import. The {n} sources chip stays — it's about the assistant's knowledge scope, which works fine. Verified: npx tsc --noEmit clean across web/ (0 errors). Co-Authored-By: Claude Opus 4.7 (1M context) --- .../app/assistants/mine/AssistantsList.tsx | 39 +++++++------------ 1 file changed, 13 insertions(+), 26 deletions(-) diff --git a/web/src/app/assistants/mine/AssistantsList.tsx b/web/src/app/assistants/mine/AssistantsList.tsx index 6af6f1ebdec..372f2588ccb 100644 --- a/web/src/app/assistants/mine/AssistantsList.tsx +++ b/web/src/app/assistants/mine/AssistantsList.tsx @@ -47,7 +47,6 @@ import { FiSearch, FiShare2, FiStar, - FiTool, FiTrash2, } from "react-icons/fi"; import { MdDragIndicator } from "react-icons/md"; @@ -175,9 +174,8 @@ function RowContent({ const canEdit = isOwnedByUser; const canShare = isOwnedByUser && !assistant.is_public; - // Tool / doc-set counts — surfaced as small chips. Description should - // be the primary affordance; chips give the at-a-glance "scope" signal. - const toolCount = assistant.tools?.length ?? 0; + // Doc-set count — surfaced as a small chip. Tools count was removed + // intentionally; see the JSX block below for the why. const docSetCount = assistant.document_sets?.length ?? 0; // Click-on-hidden-row affordance: a click anywhere on the row body @@ -288,29 +286,18 @@ function RowContent({
- {/* Scope chips — tool/source counts. Compact summary always; - full tool list reveals on hover so the row stays scannable. */} - {(toolCount > 0 || docSetCount > 0) && ( + {/* Sources chip — knowledge-scope summary. Tools chip was + intentionally removed: tool execution isn't reliable yet + and surfacing tool counts misleads users into picking an + assistant for a capability that may not work in practice. */} + {docSetCount > 0 && (
- {docSetCount > 0 && ( - -
- - {docSetCount} source{docSetCount === 1 ? "" : "s"} -
-
- )} - {toolCount > 0 && ( - -
t.name).join(", ")} - > - - {toolCount} tool{toolCount === 1 ? "" : "s"} -
-
- )} + +
+ + {docSetCount} source{docSetCount === 1 ? "" : "s"} +
+
)}
From 24f2e4d6770b8f1d8e040bff023a61c088ba9688 Mon Sep 17 00:00:00 2001 From: rajiv chodisetti Date: Thu, 28 May 2026 19:24:49 +0530 Subject: [PATCH 010/102] Parameterize gallery grid column count (default 3) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit AssistantsGallery now accepts an optional `columns` prop (default 3, supported 1-5). Responsive scaling below the widest breakpoint is fixed per row of GRID_CLASSES — each row is a complete static Tailwind class string so the purge step actually emits the classes (dynamic `md:grid-cols-${n}` would silently disappear at build). Unsupported values fall back to the default rather than rendering broken — a bad prop here shouldn't break the page. The single existing caller (page.tsx) doesn't pass columns, so it gets the default 3 — same layout as before. To switch to 4 columns on a wide-monitor deployment: ``. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../assistants/gallery/AssistantsGallery.tsx | 41 +++++++++++++++---- 1 file changed, 33 insertions(+), 8 deletions(-) diff --git a/web/src/app/assistants/gallery/AssistantsGallery.tsx b/web/src/app/assistants/gallery/AssistantsGallery.tsx index ba74363d24b..9df6b3b478f 100644 --- a/web/src/app/assistants/gallery/AssistantsGallery.tsx +++ b/web/src/app/assistants/gallery/AssistantsGallery.tsx @@ -74,6 +74,26 @@ interface SectionDef { assistants: Persona[]; } +// --------------------------------------------------------------------------- +// Column-count parameterisation +// --------------------------------------------------------------------------- +// +// Each row in this table is a complete, static Tailwind class string so the +// purge step actually emits the classes. You CAN'T compute these at runtime +// (`md:grid-cols-${n}` won't survive purge). Add a row here to support a new +// column count. Each row scales 1-col on mobile up to N at the widest +// breakpoint, with one breakpoint per added column so cards stay roomy on +// medium screens. +const GRID_CLASSES: Record = { + 1: "grid-cols-1", + 2: "grid-cols-1 sm:grid-cols-2", + 3: "grid-cols-1 md:grid-cols-2 2xl:grid-cols-3", + 4: "grid-cols-1 md:grid-cols-2 lg:grid-cols-3 2xl:grid-cols-4", + 5: "grid-cols-1 md:grid-cols-2 lg:grid-cols-3 xl:grid-cols-4 2xl:grid-cols-5", +}; + +const DEFAULT_COLUMNS = 3; + // --------------------------------------------------------------------------- // Small helpers // --------------------------------------------------------------------------- @@ -288,10 +308,22 @@ function FilterChip({ export function AssistantsGallery({ assistants, user, + columns = DEFAULT_COLUMNS, }: { assistants: Persona[]; user: User | null; + /** + * Max columns at the widest breakpoint. Responsive scaling below that + * is fixed (see GRID_CLASSES). Supported values: 1–5. Anything outside + * falls back to the default — silently, not noisily, because a bad + * prop here shouldn't break the page. + */ + columns?: number; }) { + // Resolve the grid class once; fall back if an unsupported value was + // passed so the page still renders. + const gridClass = + GRID_CLASSES[columns] ?? GRID_CLASSES[DEFAULT_COLUMNS]; const router = useRouter(); const { popup, setPopup } = usePopup(); @@ -660,14 +692,7 @@ export function AssistantsGallery({ ({section.assistants.length})
-
+
{section.assistants.map((assistant) => ( Date: Thu, 28 May 2026 19:27:25 +0530 Subject: [PATCH 011/102] Show document-set names on assistant cards (was: count only) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit A "{n} sources" chip told users how MUCH knowledge an assistant had access to but not WHICH knowledge — defeating the point of the chip for someone deciding "which assistant should I pick for this task?". Both the Manage page row and the Gallery card now render one Bubble per document-set name, capped at MAX_VISIBLE_DOC_SETS (3). When an assistant points at more than that, a "+N more" pill collects the overflow with the rest of the names exposed via the title tooltip, so we don't blow the card width or row layout at narrower column counts. Each name chip caps at a max-width with truncate + a hover title, so a single absurdly long document-set name can't push the actions off the row. Verified: npx tsc --noEmit clean across web/ (0 errors). Co-Authored-By: Claude Opus 4.7 (1M context) --- .../assistants/gallery/AssistantsGallery.tsx | 55 +++++++++++++------ .../app/assistants/mine/AssistantsList.tsx | 45 +++++++++++---- 2 files changed, 72 insertions(+), 28 deletions(-) diff --git a/web/src/app/assistants/gallery/AssistantsGallery.tsx b/web/src/app/assistants/gallery/AssistantsGallery.tsx index 9df6b3b478f..d911dfbd032 100644 --- a/web/src/app/assistants/gallery/AssistantsGallery.tsx +++ b/web/src/app/assistants/gallery/AssistantsGallery.tsx @@ -94,6 +94,11 @@ const GRID_CLASSES: Record = { const DEFAULT_COLUMNS = 3; +// How many doc-set name chips to render before collapsing the rest into +// a "+N more" pill. Three keeps each card's scope visible without +// blowing the card width at narrower column counts. +const MAX_VISIBLE_DOC_SETS = 3; + // --------------------------------------------------------------------------- // Small helpers // --------------------------------------------------------------------------- @@ -130,12 +135,9 @@ function GalleryCard({ onAdd, onRemove, }: CardProps) { - // Note: tool-related UI was intentionally removed from this page — - // the gallery is for browsing assistants, and tool filter chips + - // per-card "{n} tools" pulled focus away from the assistant itself. - // The {n} sources chip stays — sources are about the assistant's - // knowledge scope, not a "tool" in the user's mental model. - const docSetCount = assistant.document_sets?.length ?? 0; + // Tool-related UI was intentionally removed from this page (filter + // chips + per-card counts) — the gallery is for browsing assistants, + // and tool execution isn't reliable enough to advertise. const author = ownerDisplayName(assistant); const isBuiltIn = assistant.default_persona; @@ -187,18 +189,37 @@ function GalleryCard({

)} - {/* Sources chip — knowledge-scope summary. Tools were - intentionally removed from this page (this is for browsing - assistants, not exploring tools); sources stay because - they're about the assistant's knowledge, not a "tool". */} - {docSetCount > 0 && ( + {/* Knowledge-scope chips — name the document sets the + assistant points at (counts alone don't help a chooser + decide). Cap at MAX_VISIBLE_DOC_SETS with a "+N more" + tooltip so a long list doesn't blow the card width. Tools + were removed entirely (see card-level comment). */} + {assistant.document_sets && assistant.document_sets.length > 0 && (
- -
- - {docSetCount} source{docSetCount === 1 ? "" : "s"} -
-
+ {assistant.document_sets + .slice(0, MAX_VISIBLE_DOC_SETS) + .map((ds) => ( + +
+ + + {ds.name} + +
+
+ ))} + {assistant.document_sets.length > MAX_VISIBLE_DOC_SETS && ( + + d.name) + .join(", ")} + > + +{assistant.document_sets.length - MAX_VISIBLE_DOC_SETS} more + + + )}
)} diff --git a/web/src/app/assistants/mine/AssistantsList.tsx b/web/src/app/assistants/mine/AssistantsList.tsx index 372f2588ccb..502c9a7a7b1 100644 --- a/web/src/app/assistants/mine/AssistantsList.tsx +++ b/web/src/app/assistants/mine/AssistantsList.tsx @@ -84,6 +84,11 @@ import { AssistantSharingModal } from "./AssistantSharingModal"; import { AssistantSharedStatusDisplay } from "../AssistantSharedStatus"; import { AssistantsPageTitle } from "../AssistantsPageTitle"; +// How many doc-set name chips to render before collapsing the rest +// into a "+N more" pill. Three keeps the row scannable on most widths +// without losing the most-relevant scope at a glance. +const MAX_VISIBLE_DOC_SETS = 3; + // --------------------------------------------------------------------------- // Small inline switch — avoids pulling in a new component library for one // toggle. role="switch" gives screen readers the right semantics. @@ -174,9 +179,8 @@ function RowContent({ const canEdit = isOwnedByUser; const canShare = isOwnedByUser && !assistant.is_public; - // Doc-set count — surfaced as a small chip. Tools count was removed - // intentionally; see the JSX block below for the why. - const docSetCount = assistant.document_sets?.length ?? 0; + // Doc-set names are surfaced as small chips; tools count was removed + // intentionally — see the chip JSX below for the why. // Click-on-hidden-row affordance: a click anywhere on the row body // (not on an interactive control) draws a transient ring around the @@ -286,18 +290,37 @@ function RowContent({
- {/* Sources chip — knowledge-scope summary. Tools chip was + {/* Knowledge-scope chips — show which document sets the + assistant points at, not just the count. With many sets, + show the first few and a "+N more" with the rest in a + tooltip so the row stays scannable. Tools chip was intentionally removed: tool execution isn't reliable yet and surfacing tool counts misleads users into picking an assistant for a capability that may not work in practice. */} - {docSetCount > 0 && ( + {assistant.document_sets && assistant.document_sets.length > 0 && (
- -
- - {docSetCount} source{docSetCount === 1 ? "" : "s"} -
-
+ {assistant.document_sets.slice(0, MAX_VISIBLE_DOC_SETS).map((ds) => ( + +
+ + + {ds.name} + +
+
+ ))} + {assistant.document_sets.length > MAX_VISIBLE_DOC_SETS && ( + + d.name) + .join(", ")} + > + +{assistant.document_sets.length - MAX_VISIBLE_DOC_SETS} more + + + )}
)}
From 82a83957c08716380c2dbc72994690c30ed0e625 Mon Sep 17 00:00:00 2001 From: rajiv chodisetti Date: Thu, 28 May 2026 19:32:51 +0530 Subject: [PATCH 012/102] Gallery: user-controllable column count (segmented control, persists) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds a small "Columns 2 | 3 | 4" segmented control at the right end of the filter row (next to Sort). The pick persists to localStorage under "danswer:assistants-gallery:columns" so it survives reloads on the same device. State precedence: user choice (localStorage) wins ↓ prop `columns` from caller (default for new users / new device) ↓ DEFAULT_COLUMNS = 3 (final fallback) The localStorage read happens in a useEffect so SSR + first paint use the prop value — avoids a hydration mismatch the time the stored value disagrees with the prop. localStorage writes are wrapped in try/catch because some sandboxed contexts (private modes, restrictive iframes) throw on access — the control still works for the session, just doesn't persist there. Picker is hidden below md (768px) because the layout falls back to 1 column at that width regardless of the chosen value. Exposed options are 2/3/4 — 1 is mobile-only via responsive, 5 is too cramped for typical screens (GRID_CLASSES still supports 5 if a deployment wants to set it via prop). Verified: npx tsc --noEmit clean across web/ (0 errors). Co-Authored-By: Claude Opus 4.7 (1M context) --- .../assistants/gallery/AssistantsGallery.tsx | 103 +++++++++++++++--- 1 file changed, 90 insertions(+), 13 deletions(-) diff --git a/web/src/app/assistants/gallery/AssistantsGallery.tsx b/web/src/app/assistants/gallery/AssistantsGallery.tsx index d911dfbd032..33adddfa839 100644 --- a/web/src/app/assistants/gallery/AssistantsGallery.tsx +++ b/web/src/app/assistants/gallery/AssistantsGallery.tsx @@ -44,7 +44,7 @@ * All mutations are optimistic + undoable, mirroring the Manage page. */ -import { useMemo, useState } from "react"; +import { useEffect, useMemo, useState } from "react"; import { Persona } from "@/app/admin/assistants/interfaces"; import { User } from "@/lib/types"; import { AssistantIcon } from "@/components/assistants/AssistantIcon"; @@ -94,6 +94,15 @@ const GRID_CLASSES: Record = { const DEFAULT_COLUMNS = 3; +// Values exposed in the in-page column picker. The control lets users +// override the prop-derived default at runtime; the persisted choice +// lives in localStorage so it survives reloads. +// +// Below the smallest md breakpoint everything is 1-col regardless of +// this value (see GRID_CLASSES), so we don't bother exposing 1. +const COLUMN_PICKER_OPTIONS = [2, 3, 4]; +const COLUMNS_STORAGE_KEY = "danswer:assistants-gallery:columns"; + // How many doc-set name chips to render before collapsing the rest into // a "+N more" pill. Three keeps each card's scope visible without // blowing the card width at narrower column counts. @@ -329,23 +338,55 @@ function FilterChip({ export function AssistantsGallery({ assistants, user, - columns = DEFAULT_COLUMNS, + columns: initialColumns = DEFAULT_COLUMNS, }: { assistants: Persona[]; user: User | null; /** - * Max columns at the widest breakpoint. Responsive scaling below that - * is fixed (see GRID_CLASSES). Supported values: 1–5. Anything outside - * falls back to the default — silently, not noisily, because a bad - * prop here shouldn't break the page. + * Initial max columns at the widest breakpoint. Acts as the default + * if the user has no stored preference yet; once the user picks via + * the in-page column control that choice (in localStorage) wins. + * Responsive scaling below the widest breakpoint is fixed + * (see GRID_CLASSES). Supported values: 1–5; out-of-range silently + * falls back to DEFAULT_COLUMNS so a bad prop can't break the page. */ columns?: number; }) { - // Resolve the grid class once; fall back if an unsupported value was - // passed so the page still renders. - const gridClass = - GRID_CLASSES[columns] ?? GRID_CLASSES[DEFAULT_COLUMNS]; const router = useRouter(); + + // User-chosen column count. `null` until the localStorage read in + // the effect below; SSR + first paint use the prop value so we + // don't get a hydration mismatch. After mount, the stored choice + // (if any) overrides the prop. + const [userColumns, setUserColumns] = useState(null); + useEffect(() => { + try { + const raw = window.localStorage.getItem(COLUMNS_STORAGE_KEY); + if (raw == null) return; + const n = Number.parseInt(raw, 10); + if (Number.isFinite(n) && n in GRID_CLASSES) { + setUserColumns(n); + } + } catch { + // localStorage can throw in some sandboxed contexts (Safari + // private mode in the past, certain iframe configs). Fall + // through to the prop default — the picker still works for + // the session, just doesn't persist. + } + }, []); + + const effectiveColumns = userColumns ?? initialColumns; + const gridClass = + GRID_CLASSES[effectiveColumns] ?? GRID_CLASSES[DEFAULT_COLUMNS]; + + const changeColumns = (n: number) => { + setUserColumns(n); + try { + window.localStorage.setItem(COLUMNS_STORAGE_KEY, String(n)); + } catch { + // See above — silently OK to skip persistence. + } + }; const { popup, setPopup } = usePopup(); // Mirrors the Manage page: no preference = every accessible assistant @@ -656,9 +697,45 @@ export function AssistantsGallery({ Already added - {/* Sort lives at the right end of the filter row so visually - "narrow the list" and "order the list" are co-located. */} -
+ {/* View controls — columns + sort — live at the right end of + the filter row so "narrow the list" (left) and "shape + the view" (right) are visually separated. */} +
+ {/* Column picker. Hidden below md since the layout falls + back to a single column there regardless. The choice is + persisted in localStorage on click. */} +
+ Columns +
+ {COLUMN_PICKER_OPTIONS.map((n) => { + const active = effectiveColumns === n; + return ( + + ); + })} +
+
that mirrors the existing Sort dropdown for visual consistency on the "view controls" cluster at the right end of the filter row. Behavior unchanged: pure client-side state + localStorage persistence, no fetch and no router.refresh() in the column path — the user's column choice never triggers a backend call. Verified: npx tsc --noEmit clean across web/ (0 errors). Co-Authored-By: Claude Opus 4.7 (1M context) --- .../assistants/gallery/AssistantsGallery.tsx | 52 +++++++------------ 1 file changed, 20 insertions(+), 32 deletions(-) diff --git a/web/src/app/assistants/gallery/AssistantsGallery.tsx b/web/src/app/assistants/gallery/AssistantsGallery.tsx index 33adddfa839..681e71e4b31 100644 --- a/web/src/app/assistants/gallery/AssistantsGallery.tsx +++ b/web/src/app/assistants/gallery/AssistantsGallery.tsx @@ -702,39 +702,27 @@ export function AssistantsGallery({ the view" (right) are visually separated. */}
{/* Column picker. Hidden below md since the layout falls - back to a single column there regardless. The choice is - persisted in localStorage on click. */} -
- Columns -
+ +