diff --git a/docs/features/administration/banners.md b/docs/features/administration/banners.md
index 058d39fe7..caad5e381 100644
--- a/docs/features/administration/banners.md
+++ b/docs/features/administration/banners.md
@@ -154,7 +154,84 @@ Inline styles are supported on allowed tags:
 <span style="background: linear-gradient(90deg,#e0f2fe,#fef9c3);">Gradient background</span>
 ```
 
-> Keep styling minimal. Overly large padding, font sizes, or complex layouts can cause banners to become tall or visually inconsistent across themes.
+You can also style a full message area by wrapping the content in a block element:
+
+```html
+<div style="background:#f8fafc;border:1px solid #cbd5e1;border-radius:12px;padding:10px 14px;line-height:1.3;display:block;width:100%;box-sizing:border-box;">
+  <b>Notice title</b><br>
+  Short supporting message.
+</div>
+```
+
+> Keep styling purposeful. Large padding, large font sizes, or deeply nested layouts can make banners too tall and visually inconsistent across themes.
+
+---
+
+## Designing effective banners
+
+Banners work best when they are easy to scan, visually distinct, and short enough not to interrupt normal work.
+
+### Structure the message
+
+Use a predictable structure:
+
+- Start with the event type or status: `Maintenance`, `Incident`, `Policy update`, `New feature`.
+- Put the most important detail first: date, time, impact, or required action.
+- Keep the body to one or two short sentences.
+- Add one link only if users need more details.
+
+For longer notices, use short sections instead of one long paragraph. For multilingual notices, separate languages with a subtle `<hr>` or use a collapsible `<details>` section.
+
+### Make severity visible
+
+Use the banner `type` consistently:
+
+- `info`: neutral announcements and product updates.
+- `success`: resolved incidents or completed changes.
+- `warning`: planned maintenance, degraded service, or upcoming action needed.
+- `error`: active incidents or urgent action required.
+
+Avoid using `error` for non-urgent announcements. Users learn to ignore alerts when every message looks critical.
+
+### Use color carefully
+
+Color should support the banner type, not compete with it:
+
+- Use soft backgrounds for the full message area.
+- Use stronger colors for small accents, labels, or left borders.
+- Keep text contrast high enough to read in bright rooms and on dim screens.
+- Avoid mixing many unrelated colors in one banner.
+
+A useful pattern is a pale background plus a stronger left border:
+
+```html
+<div style="background:#f8fafc;color:#334155;border:1px solid #cbd5e1;border-left:6px solid #64748b;border-radius:12px;padding:10px 14px;line-height:1.3;display:block;width:100%;box-sizing:border-box;">
+  <b>Notice title</b><br>
+  Short supporting message.
+</div>
+```
+
+### Keep layouts responsive
+
+Banners are shown inside the application layout and must still work on narrow screens.
+
+- Prefer `display:flex;flex-wrap:wrap` for rows containing labels, dates, or badges.
+- Avoid fixed widths.
+- Use `width:100%;box-sizing:border-box` for full-width styled blocks.
+- Keep icons and badges small so they do not increase banner height.
+- Test the banner with a narrow browser window before using it broadly.
+
+### Avoid accidental extra height
+
+Banner content treats literal newlines as line breaks. If you use explicit `<br>` tags, keep the raw HTML compact and avoid adding extra blank lines or indentation in the banner content field.
+
+This compact style:
+
+```html
+<b>Notice</b><br>One short sentence.<br>Another short sentence.
+```
+
+renders more predictably than heavily formatted HTML with many line breaks.
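+
+As a minimal sketch of that compaction step, a formatted snippet can be collapsed to one line before it is pasted into the banner content field. The helper name `compact_banner_html` is hypothetical and not part of Open WebUI:
+
+```python
+def compact_banner_html(html: str) -> str:
+    """Join the lines of an HTML snippet, dropping indentation and blank lines."""
+    # Strip leading/trailing whitespace from each line, then skip empty lines.
+    parts = [line.strip() for line in html.splitlines()]
+    return "".join(part for part in parts if part)
+
+
+formatted = """
+<div style="padding:10px 14px;">
+  <b>Notice title</b><br>
+  Short supporting message.
+</div>
+"""
+
+print(compact_banner_html(formatted))
+```
+
+Joining lines with no separator is only safe when every line break sits next to a tag, as it does here. If a sentence wraps across lines, join with a space instead.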
 
 ---
 
@@ -255,6 +332,22 @@ Service updates: <a href="https://example.com/status" target="_blank"><u>Status
 </span>
 ```
 
+### Pattern: Styled notice block
+
+Use a full-width styled block when the whole message should read as one announcement area. Keep this HTML compact when pasting it into the banner content field, especially if it also contains `<br>` tags.
+
+```html
+<div style="background:#f8fafc;color:#334155;border:1px solid #cbd5e1;border-left:6px solid #64748b;border-radius:12px;padding:10px 14px;line-height:1.3;display:block;width:100%;box-sizing:border-box;"><div style="display:flex;flex-wrap:wrap;align-items:center;gap:5px 8px;margin-bottom:5px;"><span style="display:inline-flex;align-items:center;background:#64748b;color:#fff;padding:1px 7px;border-radius:999px;line-height:1.25;font-size:10px;font-weight:700;letter-spacing:.04em;">NOTICE</span><b>Notice title</b><span style="display:inline-flex;align-items:center;background:#fff;color:#334155;padding:1px 8px;border-radius:999px;line-height:1.25;border:1px solid #cbd5e1;">Key detail</span></div><div>Short supporting message.</div></div>
+```
+
+This pattern uses:
+
+- A pale background for the full message area.
+- A stronger left border for fast visual recognition.
+- A small uppercase label for the event type.
+- A compact chip for the most important detail, such as a date or time.
+- `flex-wrap` so the header row still works on narrow screens.
+
 ### Pattern: Collapsible details (keep banners short)
 
 ```html
diff --git a/docs/features/authentication-access/rbac/groups.md b/docs/features/authentication-access/rbac/groups.md
index ce077c3de..e6db919e6 100644
--- a/docs/features/authentication-access/rbac/groups.md
+++ b/docs/features/authentication-access/rbac/groups.md
@@ -84,3 +84,18 @@ For example, granting the "Marketing" group read access and a specific editor us
 
 *   **Read**: Users can view and use the resource.
 *   **Write**: Users can update or delete the resource.
+
+### Previewing Access (Audit)
+
+When access grants span many groups and resources, it's easy to lose track of who can see what. Open WebUI ships an admin-only **Preview Access** view that resolves every access grant for a specific user or group and lists the result in one place — no need to crawl through individual resource pages.
+
+**For a user** — In **Admin Panel > Users**, hover over a non-admin user row and click the eye-style **Preview Access** button. The modal shows every model, knowledge base, and tool the user can read, aggregated across all of their group memberships and any direct user grants.
+
+**For a group** — In **Admin Panel > Users > Groups**, open the group editor and use the **Preview Group Access** panel. The output is the same shape (models, knowledge, tools), scoped to just that group's grants.
+
+Both views are admin-only and read-only — they reflect what the access-grant table currently says without modifying it. Use them after a permission change to confirm the result matches intent, or as part of a periodic RBAC audit.
+
+Programmatic equivalents:
+
+- `GET /api/v1/users/{user_id}/preview` — user view (admin auth required)
+- `GET /api/v1/groups/id/{id}/preview` — group view (admin auth required)
diff --git a/docs/features/channels/index.md b/docs/features/channels/index.md
index 0287854ee..08c75e004 100644
--- a/docs/features/channels/index.md
+++ b/docs/features/channels/index.md
@@ -81,9 +81,14 @@ Mentioning a model in a channel runs through the same chat-completion pipeline a
 | **User tools and MCP tools** | Whatever the model is configured to call, it can call |
 | **Filters** | Inlet/outlet/stream filters apply just like in chats |
 | **Knowledge (RAG)** | Knowledge bases attached to the model are queried and injected |
+| **Attached documents** | Images **and** non-image files (PDF, DOCX, etc.) uploaded in the thread are forwarded into the model's context |
 
 In other words, a channel-summoned model is a fully-equipped agent — not a one-shot completion.
 
+:::note Document attachments in channels (v0.9.6+)
+Before v0.9.6, tagging a model in a channel only forwarded **images** from the thread — uploaded PDFs, DOCX, and other non-image documents were ignored, so summarization and document-comparison prompts silently had nothing to work with. As of v0.9.6 these files are forwarded the same way they are in a direct chat, so document workflows behave identically in channels.
+:::
+
 ### Tagging people and linking channels
 
 Use `@username` to notify teammates. Use `#channel-name` to create clickable cross-references between conversations.
diff --git a/docs/features/chat-conversations/web-search/providers/linkup.md b/docs/features/chat-conversations/web-search/providers/linkup.md
new file mode 100644
index 000000000..c9418bd75
--- /dev/null
+++ b/docs/features/chat-conversations/web-search/providers/linkup.md
@@ -0,0 +1,93 @@
+---
+sidebar_position: 23
+title: "Linkup"
+---
+
+:::warning
+
+This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration of how to customize Open WebUI for your specific use case. Want to contribute? Check out the [contributing tutorial](https://docs.openwebui.com/contributing).
+
+:::
+
+:::tip
+
+For a comprehensive list of all environment variables related to Web Search (including concurrency settings, result counts, and more), please refer to the [Environment Configuration documentation](/reference/env-configuration#web-search).
+
+:::
+
+:::tip Troubleshooting
+
+Having issues with web search? Check out the [Web Search Troubleshooting Guide](/troubleshooting/web-search) for solutions to common problems like proxy configuration, connection timeouts, and empty content.
+
+:::
+
+## Overview
+
+[Linkup](https://www.linkup.so/) is a search API built for AI applications. Integrating it with Open WebUI lets your language model perform real-time web searches and ground responses in current sources. This tutorial guides you through configuring Linkup as a web search provider.
+
+Linkup support was added in Open WebUI v0.9.6.
+
+## Prerequisites
+
+Ensure you have:
+
+- **Open WebUI Installed**: A running instance of Open WebUI (local or Docker). See the [Getting Started guide](https://docs.openwebui.com/getting-started).
+- **Linkup Account**: An account with an API key from [Linkup](https://www.linkup.so/).
+- **Admin Access**: Administrative access to your Open WebUI instance.
+- **Internet Connection**: Required for Linkup API requests.
+
+## Step-by-Step Configuration
+
+### 1. Obtain a Linkup API Key
+
+1. Log in or sign up at [Linkup](https://www.linkup.so/).
+2. Open the API keys section of your dashboard.
+3. Copy or generate a new API key. Keep it secure.
+
+### 2. Configure Open WebUI
+
+1. Log in to Open WebUI with an admin account.
+2. Open **Admin Panel → Settings → Web Search**.
+3. Enable **Web Search** by toggling it **On**.
+4. Select **linkup** from the **Web Search Engine** dropdown.
+5. Paste your Linkup API key into the **Linkup API Key** field.
+6. (Optional) Set the **Search Depth** and **Output Type** (see below).
+7. Save your settings.
+
+### 3. Test the Integration
+
+1. Start a chat session in Open WebUI.
+2. Click the **plus (+)** button in the prompt field to enable web search.
+3. Enter a query (e.g., `+latest AI news`) and confirm Linkup returns real-time results.
+
+## Search Parameters
+
+Linkup requests are built from a small set of defaults that you can override. The query (`q`) and result count (`maxResults`) are injected automatically and cannot be overridden.
+
+| Parameter | Default | Notes |
+|-----------|---------|-------|
+| `depth` | `standard` | `standard` is faster and cheaper; `deep` runs a more thorough multi-step search. |
+| `outputType` | `sourcedAnswer` | `sourcedAnswer` returns an answer plus its source pages; `searchResults` returns raw result entries. |
+| `url` | `https://api.linkup.so/v1/search` | Override only if you need to point at a different endpoint. |
+
+These map to the [`LINKUP_SEARCH_PARAMS`](/reference/env-configuration#linkup_search_params) environment variable, supplied as a JSON object. For example:
+
+```bash
+-e LINKUP_API_KEY="your_linkup_api_key"
+-e LINKUP_SEARCH_PARAMS='{"depth": "deep", "outputType": "searchResults"}'
+```
+
+The same fields are exposed in the Admin UI when the `linkup` engine is selected, so you do not need environment variables unless you prefer to manage configuration that way. See [Environment Variable Configuration](https://docs.openwebui.com/environment) for details, including the [`ENABLE_PERSISTENT_CONFIG`](/reference/env-configuration#enable_persistent_config) behavior.
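+
+If you generate the environment value from a script, a small helper can catch typos before deployment. This is a sketch only: the helper name `linkup_search_params` is hypothetical, and the allowed values are copied from the table above rather than from Linkup's API reference:
+
+```python
+import json
+
+# Values listed in the parameter table above.
+ALLOWED = {
+    "depth": {"standard", "deep"},
+    "outputType": {"sourcedAnswer", "searchResults"},
+}
+# Injected automatically according to the text above, so not overridable.
+RESERVED = {"q", "maxResults"}
+
+
+def linkup_search_params(**overrides: str) -> str:
+    """Return a JSON object string for LINKUP_SEARCH_PARAMS."""
+    for key, value in overrides.items():
+        if key in RESERVED:
+            raise ValueError(f"{key} cannot be overridden")
+        if key in ALLOWED and value not in ALLOWED[key]:
+            raise ValueError(f"unsupported {key}: {value!r}")
+    return json.dumps(overrides)
+
+
+print(linkup_search_params(depth="deep", outputType="searchResults"))
+```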
+
+## Troubleshooting
+
+- **Invalid API Key**: Ensure the key is copied correctly, without extra spaces.
+- **No Results**: Confirm the web search toggle (`+`) is enabled and your internet is active. Try `depth: deep` for sparse topics.
+- **Quota Exceeded**: Check your plan and usage on the Linkup dashboard.
+- **Settings Not Saved**: Verify admin privileges and that `webui.db` is writable.
+
+## Additional Resources
+
+- [Linkup Documentation](https://docs.linkup.so/): API reference and advanced options.
+- [Open WebUI Features](https://docs.openwebui.com/features): Details on RAG and web search.
+- [Contributing to Open WebUI](https://docs.openwebui.com/contributing): Share improvements or report issues.
diff --git a/docs/features/extensibility/index.md b/docs/features/extensibility/index.md
index e103daf5b..fa36d46b6 100644
--- a/docs/features/extensibility/index.md
+++ b/docs/features/extensibility/index.md
@@ -9,11 +9,35 @@ title: "Extensibility"
 
 Open WebUI ships with powerful defaults, but your workflows aren't default. Extensibility is how you close the gap: give models real-time data, enforce compliance rules, add new AI providers, or connect to any external service. Write a few lines of Python, point at an OpenAPI endpoint, or browse the community library. The platform adapts to you, not the other way around.
 
-There are three layers, and most teams end up using at least two:
+There are two layers, and most teams end up using both:
 
 - **In-process Python** (Tools & Functions) runs inside Open WebUI itself with zero infrastructure and instant iteration.
 - **External HTTP** (OpenAPI & MCP servers) connects to services running anywhere, from a sidecar container to a third-party SaaS.
-- **Pipeline workers** (Pipelines) offload heavy or sensitive processing to a separate container, keeping your main instance fast and clean.
+
+:::warning Pipelines are legacy
+You may still see **Pipelines** referenced as a third layer. It is **legacy and no longer recommended** — the heavy-processing problem it solved no longer exists (see [Run heavy or long-running work](#run-heavy-or-long-running-work) below). Use Functions, Tools, or an external tool server instead. See the [Pipelines pages](pipelines) for the full deprecation notice.
+:::
+
+---
+
+## Which Extension Do I Need?
+
+The names don't always map obviously to what they do. Start from what you're trying to accomplish:
+
+| I want to... | Use | Why this one |
+|---|---|---|
+| Let the model **call an API or perform an action** (and keep a secret/API key the user and model can never read) | **[Tool](plugin/tools)** | The key lives inside the tool, server-side. The model only sees the *result*, never the credential. |
+| **Add a new model or provider** to the model selector | **[Pipe Function](plugin/functions/pipe)** | A Pipe appears as a selectable "model" and handles the request however you like. |
+| **Modify messages** going in or out (redact PII, inject system text, log, translate) | **[Filter Function](plugin/functions/filter)** | Filters run on every message via `inlet`/`outlet`/`stream` without touching model config. |
+| Add a **button on a message** that runs custom code | **[Action Function](plugin/functions/action)** | Actions are user-triggered, per-message operations. |
+| Teach the model **how to approach a task** (methodology, steps, house style) | **[Skill](/features/workspace/skills)** | Skills are instructions, not code. The model reads them; they don't execute anything. |
+| Give the model **documents to retrieve from** | **[Knowledge](/features/workspace/knowledge)** | RAG over your files, attached to a model or referenced with `#`. |
+| Save a **reusable prompt** behind a slash command | **[Prompt](/features/workspace/prompts)** | Templated text with typed variables; expands when you type `/name`. |
+| Connect an **existing external service** that already speaks HTTP | **[OpenAPI / MCP server](mcp)** | Point Open WebUI at the spec; endpoints become callable tools. No glue code. |
+
+:::tip "Pipe" vs "Pipeline" — not the same thing
+This is the single most common naming mix-up. A **Pipe** is a type of **Function** (in-process Python, adds a provider to the model list). A **Pipeline** is a **separate external worker container**. They share a prefix and nothing else. If you want to add a model provider, you almost always want a **Pipe Function**, not a Pipeline.
+:::
 
 ---
 
@@ -31,9 +55,11 @@ Have an internal API? A third-party SaaS with an OpenAPI spec? An MCP server alr
 
 Functions let you intercept and transform messages before they reach the model (input filters) or before they reach the user (output filters). Help redact PII, enforce formatting rules, log to an observability platform, inject system instructions dynamically, all without touching model configuration.
 
-### Offload heavy processing
+### Run heavy or long-running work
+
+Open WebUI's backend is **fully async**. Long-running Tools and Functions (awaiting an external API, a slow query, a multi-step agent) do not block other users, and synchronous/CPU-bound plugin code is offloaded to a worker thread pool (see [`THREAD_POOL_SIZE`](/reference/env-configuration#thread_pool_size)) — so it doesn't stall the event loop either. In practice you can run heavy work **in-process** without the latency problems that older synchronous releases had.
 
-When a plugin needs GPU access, large dependencies, or isolated execution, run it as a Pipeline on a separate machine. Open WebUI talks to it over a standard API. Your main instance stays lean.
+The historical reason to push heavy pipes/filters onto a separate **Pipelines** worker — keeping the single synchronous event loop unblocked — no longer applies. If you genuinely need **GPU access, large or conflicting dependencies, hard isolation, or independent scaling**, run that work as an **external service behind an [OpenAPI or MCP tool server](mcp)**, not a Pipeline.
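+
+The general pattern of moving blocking code onto a worker thread can be illustrated with plain `asyncio`. This is a generic sketch, not Open WebUI's actual implementation, and `blocking_work` is a hypothetical stand-in for synchronous plugin code:
+
+```python
+import asyncio
+import time
+
+
+def blocking_work(seconds: float) -> str:
+    """Stand-in for synchronous, blocking plugin code (hypothetical)."""
+    time.sleep(seconds)
+    return "done"
+
+
+async def main() -> list:
+    # Run the blocking call in a worker thread so the event loop stays free.
+    heavy = asyncio.create_task(asyncio.to_thread(blocking_work, 0.5))
+    ticks = []
+    for i in range(3):
+        await asyncio.sleep(0.01)  # other coroutines keep running meanwhile
+        ticks.append(i)
+    return [ticks, await heavy]
+
+
+print(asyncio.run(main()))
+```
+
+The loop keeps ticking while `blocking_work` sleeps, which is the property that lets long-running plugin code coexist with other requests.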
 
 ### Import from the community
 
@@ -49,7 +75,6 @@ Browse hundreds of community-built Tools and Functions from the Open WebUI Commu
 | ⚙️ **Functions** | Platform extensions that add model providers (Pipes), message processing (Filters), or UI actions (Actions) |
 | 🔗 **MCP support** | Native Streamable HTTP for Model Context Protocol servers |
 | 🌐 **OpenAPI servers** | Auto-discover and expose tools from any OpenAPI-compatible endpoint |
-| 🔧 **Pipelines** | Modular plugin framework running on a separate worker for heavy or sensitive processing |
 | 📝 **Skills** | Markdown instruction sets that teach models how to approach specific tasks |
 | ⚡ **Prompts** | Slash-command templates with typed input variables and versioning |
 | 🏪 **Community library** | One-click import of community-built Tools and Functions |
@@ -62,11 +87,10 @@ Understanding which layer to use saves time:
 
 | Layer | Runs where | Best for | Trade-off |
 |-------|-----------|----------|-----------|
-| **Tools & Functions** | Inside Open WebUI process | Real-time data, filters, UI actions, new providers | Shares resources with the main server |
-| **OpenAPI / MCP** | Any HTTP endpoint | Connecting existing services, third-party APIs | Requires a running external server |
-| **Pipelines** | Separate Docker container | GPU workloads, heavy dependencies, sandboxed execution | Additional infrastructure to manage |
+| **Tools & Functions** | Inside Open WebUI process | Real-time data, filters, UI actions, new providers — including heavy/long-running work (the async backend keeps it from blocking) | Shares CPU/RAM with the main server |
+| **OpenAPI / MCP** | Any HTTP endpoint | Connecting existing services, third-party APIs, and GPU / heavy-dependency / isolated workloads | Requires a running external server |
 
-Most users start with **Tools & Functions**. They require no extra setup, have a built-in code editor, and cover the majority of use cases.
+Most users start with **Tools & Functions**. They require no extra setup, have a built-in code editor, and cover the majority of use cases. (**Pipelines** is a legacy third option, no longer recommended — see the note above.)
 
 ---
 
@@ -84,9 +108,9 @@ A healthcare organization deploys a Filter Function that scans outbound messages
 
 An engineering team uses Pipe Functions to add Anthropic, Google Vertex AI, and a self-hosted vLLM instance alongside their existing Ollama models. Users see all providers in a single model selector with no separate logins and no API key juggling.
 
-### Heavy-compute pipelines
+### GPU-bound external processing
 
-A research group runs a Retrieval-Augmented Generation pipeline that re-ranks with a cross-encoder model requiring GPU. They deploy it as a Pipeline on a dedicated GPU node. Open WebUI routes relevant queries to the pipeline automatically while keeping the main instance on commodity hardware.
+A research group needs to re-rank retrieval results with a cross-encoder model that requires a GPU. They run it as a small service on a dedicated GPU node and expose it to Open WebUI as an **[OpenAPI tool server](mcp)**. The model calls it like any other tool while the main instance stays on commodity hardware. (The async backend means lighter custom logic can simply run in-process as a Function — only the GPU dependency pushes this particular workload to a separate service.)
 
 ---
 
@@ -94,11 +118,11 @@ A research group runs a Retrieval-Augmented Generation pipeline that re-ranks wi
 
 ### Security
 
-Tools, Functions, and Pipelines execute **arbitrary Python code** on your server. Only install extensions from trusted sources, review code before importing, and restrict Workspace access to administrators. See the [Security Policy](/security) for details.
+Tools and Functions execute **arbitrary Python code** on your server. Only install extensions from trusted sources, review code before importing, and restrict Workspace access to administrators. See the [Security Policy](/security) for details.
 
 ### Resource sharing
 
-In-process Tools and Functions share CPU and memory with Open WebUI. Computationally expensive plugins should be moved to Pipelines or external services.
+In-process Tools and Functions share CPU and memory with Open WebUI. The async backend keeps long-running and blocking work from stalling other requests, but it does not create more hardware — genuinely CPU- or GPU-heavy workloads still compete for the same machine. For those, run the work as an external service behind an [OpenAPI / MCP tool server](mcp) so it scales independently.
 
 ### MCP transport
 
@@ -112,4 +136,4 @@ Native MCP support is **Streamable HTTP only**. For stdio or SSE-based MCP serve
 |-------|-------------------|
 | [**Tools & Functions**](plugin) | Writing Python Tools, Functions (Pipes, Filters, Actions), and the development API |
 | [**MCP**](mcp) | Connecting Model Context Protocol servers, OAuth setup, troubleshooting |
-| [**Pipelines**](pipelines) | Deploying the pipeline worker, building custom pipelines, directory structure |
+| [**Pipelines**](pipelines) *(legacy)* | Reference only — the deprecated separate-worker framework, superseded by Functions and Tools |
diff --git a/docs/features/extensibility/mcp.mdx b/docs/features/extensibility/mcp.mdx
index 3b68120dd..29a950bd0 100644
--- a/docs/features/extensibility/mcp.mdx
+++ b/docs/features/extensibility/mcp.mdx
@@ -32,6 +32,16 @@ Entering MCP-style configuration (with `mcpServers` in JSON) into an OpenAPI con
 2. Re-add it with the correct **Type** set to **MCP**
 :::
 
+## 🔒 MCP servers are admin-only {#mcp-servers-are-admin-only}
+
+MCP servers can only be added by **administrators**, under **Admin Settings → External Tools**. Regular users cannot register their own, by design.
+
+This is **not** the same restriction as OpenAPI. When you grant the **Direct Tool Servers** permission (per user or per group, off by default), users can add their own **OpenAPI** tool servers under **Settings → Tools**, but that path is OpenAPI-only: the connection type is locked, with no MCP option.
+
+The difference is capability. A user-supplied OpenAPI server is a stateless HTTP URL exposing a fixed set of declared endpoints. An MCP server is far more powerful: it is stateful and capability-rich (sampling, elicitation, persistent sessions, and, over stdio transports, arbitrary host command execution), and it runs inside Open WebUI's trust boundary with the connecting user's full scope. In practice, a malicious or compromised MCP server could execute code and read or exfiltrate data with that user's access, so the capability stays admin-gated. Open WebUI's own MCP support is Streamable HTTP only, but the protocol's privileged nature is why adding an MCP server is reserved for admins.
+
+To give users an MCP-backed capability without server-configuration rights, an admin adds the server once and scopes it with **Access Control** to the right users or groups.
+
 ## 🧭 When to use MCP vs OpenAPI
 
 :::tip 
@@ -128,11 +138,17 @@ Both MCP and OpenAPI tool-server connections accept a free-form **Headers** fiel
 | :--- | :--- |
 | `{{USER_ID}}` | The calling user's ID. |
 | `{{USER_NAME}}` | The calling user's display name. |
+| `{{USER_EMAIL}}` | The calling user's email address. |
+| `{{USER_ROLE}}` | The calling user's role (e.g. `admin`, `user`). |
 | `{{CHAT_ID}}` | The current chat ID (empty in non-chat contexts like the **Verify Connection** button). |
 | `{{MESSAGE_ID}}` | The current message ID (empty in non-chat contexts). |
 
 Unknown tokens are passed through as literal text. Non-string header values are coerced to strings before substitution. The same tokens are honored on custom headers attached to OpenAI-compatible model connections in **Admin Settings → Connections → OpenAI**, so you can use the feature for tenant routing or audit-trail propagation across both surfaces.
 
+:::note
+`{{USER_EMAIL}}` and `{{USER_ROLE}}` were added in v0.9.6. The same release also fixed MCP server connections, where custom-header templates were previously stored but **not** interpolated at request time — they now expand the same way they always have for direct connections and OpenAPI tool servers.
+:::
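+
+The substitution rules described above (known tokens expand, unknown tokens pass through, non-string values are coerced) can be sketched as follows. This illustrates the documented behavior only; it is not Open WebUI's code, and `render_headers` is a hypothetical name:
+
+```python
+import re
+
+TOKEN = re.compile(r"\{\{([A-Z_]+)\}\}")
+
+
+def render_headers(headers: dict, context: dict) -> dict:
+    """Expand {{TOKEN}} placeholders; unknown tokens stay as literal text."""
+
+    def expand(value) -> str:
+        # Non-string header values are coerced to strings first.
+        text = str(value)
+        return TOKEN.sub(lambda m: context.get(m.group(1), m.group(0)), text)
+
+    return {name: expand(value) for name, value in headers.items()}
+
+
+context = {"USER_ID": "u-123", "USER_ROLE": "admin", "CHAT_ID": ""}
+headers = {
+    "X-User": "{{USER_ID}}",
+    "X-Role": "{{USER_ROLE}}",
+    "X-Retries": 3,
+    "X-Other": "{{UNKNOWN}}",
+}
+print(render_headers(headers, context))
+```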
+
 ### Function Name Filter List
 
 This field restricts which tools are exposed to the LLM.
@@ -182,3 +198,7 @@ Supported and improving. The broader ecosystem is still evolving; expect occasio
 **Can I mix OpenAPI and MCP tools?**
 
 Yes. Many deployments do both.
+
+**Can users add their own MCP servers?**
+
+No. Adding MCP servers is admin-only (**Admin Settings → External Tools**). Users with the **Direct Tool Servers** permission can add their own **OpenAPI** tool servers, but not MCP. See [MCP servers are admin-only](#mcp-servers-are-admin-only) for the reasoning.
diff --git a/docs/features/extensibility/pipelines/filters.md b/docs/features/extensibility/pipelines/filters.md
index 24c197c02..ce636a7d0 100644
--- a/docs/features/extensibility/pipelines/filters.md
+++ b/docs/features/extensibility/pipelines/filters.md
@@ -5,6 +5,15 @@ title: "Filters"
 
 ## Filters
 
+:::danger Pipelines are legacy — do not use for new deployments
+**Pipelines are legacy and are no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that are built in, easier to configure, and need no separate worker container:
+
+- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe)
+- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter)
+
+This page is kept for reference and existing deployments only.
+:::
+
 Filters are used to perform actions against incoming user messages and outgoing assistant (LLM) messages. Potential actions that can be taken in a filter include sending messages to monitoring platforms (such as Langfuse or DataDog), modifying message contents, blocking toxic messages, translating messages to another language, or rate limiting messages from certain users. A list of examples is maintained in the [Pipelines repo](https://github.com/open-webui/pipelines/tree/main/examples/filters). Filters can be executed as a Function or on a Pipelines server. The general workflow can be seen in the image below.
 
 <div align="center">
diff --git a/docs/features/extensibility/pipelines/index.mdx b/docs/features/extensibility/pipelines/index.mdx
index 5676e14f7..a8bf05ddb 100644
--- a/docs/features/extensibility/pipelines/index.mdx
+++ b/docs/features/extensibility/pipelines/index.mdx
@@ -12,12 +12,14 @@ title: "Pipelines"
 
 # Pipelines: UI-Agnostic OpenAI API Plugin Framework
 
-:::warning
+:::danger Pipelines are legacy — do not use for new deployments
+**Pipelines are legacy and are no longer recommended.** They predate the in-process [Functions](/features/extensibility/plugin/functions/) (Pipes, Filters, Actions) and [Tools](/features/extensibility/plugin/tools/) system, which now covers the same use cases without running a separate worker container.
 
-**DO NOT USE PIPELINES IF!**
-
-If your goal is simply to add support for additional providers like Anthropic or basic filters, you likely don't need Pipelines . For those cases, Open WebUI Functions are a better fit—it's built-in, much more convenient, and easier to configure. Pipelines, however, comes into play when you're dealing with computationally heavy tasks (e.g., running large models or complex logic) that you want to offload from your main Open WebUI instance for better performance and scalability.
+- Custom provider / RAG / request routing (a Pipeline **pipe**) → use a [Pipe Function](/features/extensibility/plugin/functions/pipe).
+- Message pre/post-processing (a Pipeline **filter**) → use a [Filter Function](/features/extensibility/plugin/functions/filter).
+- Connecting an external HTTP service → use an [OpenAPI or MCP tool server](/features/extensibility/mcp).
 
+These pages are kept for reference and for existing deployments only. New work should target Functions, Tools, or external tool servers instead.
 :::
 
 Welcome to **Pipelines**, an [Open WebUI](https://github.com/open-webui) initiative. Pipelines bring modular, customizable workflows to any UI client supporting OpenAI API specs – and much more! Easily extend functionalities, integrate unique logic, and create dynamic workflows with just a few lines of code.
diff --git a/docs/features/extensibility/pipelines/pipes.md b/docs/features/extensibility/pipelines/pipes.md
index a02365b67..8c65aeacc 100644
--- a/docs/features/extensibility/pipelines/pipes.md
+++ b/docs/features/extensibility/pipelines/pipes.md
@@ -5,6 +5,15 @@ title: "Pipes"
 
 ## Pipes
 
+:::danger Pipelines are legacy — do not use for new deployments
+**Pipelines are legacy and are no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that are built in, easier to configure, and need no separate worker container:
+
+- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe)
+- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter)
+
+This page is kept for reference and existing deployments only.
+:::
+
 Pipes are standalone functions that process inputs and generate responses, possibly by invoking one or more LLMs or external services before returning results to the user. Examples of potential actions you can take with Pipes are Retrieval Augmented Generation (RAG), sending requests to non-OpenAI LLM providers (such as Anthropic, Azure OpenAI, or Google), or executing functions right in your web UI. Pipes can be hosted as a Function or on a Pipelines server. A list of examples is maintained in the [Pipelines repo](https://github.com/open-webui/pipelines/tree/main/examples/pipelines). The general workflow can be seen in the image below.
 
 <div align="center">
@@ -46,7 +55,7 @@ yield {"choices": [{"delta": {}, "finish_reason": "stop"}]}
 
 This is the single biggest gotcha when building an agent pipeline (LangChain, LlamaIndex, a custom planner, anything that executes its own tools and streams the result back).
 
-`delta.tool_calls` in a chunk means **"please execute this tool call for me, client"**. When Open WebUI's middleware sees it, the tool executor picks up the call, runs it, appends a `role: "tool"` message, and fires a continuation request back at the same pipeline. It does this in a loop capped by `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES` (≈30).
+`delta.tool_calls` in a chunk means **"please execute this tool call for me, client"**. When Open WebUI's middleware sees it, the tool executor picks up the call, runs it, appends a `role: "tool"` message, and fires a continuation request back at the same pipeline. It does this in a loop capped by [`CHAT_RESPONSE_MAX_TOOL_CALL_ITERATIONS`](/reference/env-configuration#chat_response_max_tool_call_iterations) (default 256; `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES`, default 30, on versions before v0.9.6).
 
 If your pipeline already executed the tool internally, emitting `delta.tool_calls` makes Open WebUI try to execute it *again* — and since the pipeline keeps emitting the same call on every retry, you get 30 copies of the response stacked on top of each other before the retry cap trips. Same thing happens if you set `finish_reason: "tool_calls"` mid-stream.
 
diff --git a/docs/features/extensibility/pipelines/tutorials.md b/docs/features/extensibility/pipelines/tutorials.md
index 9d7302b78..69d0a669a 100644
--- a/docs/features/extensibility/pipelines/tutorials.md
+++ b/docs/features/extensibility/pipelines/tutorials.md
@@ -5,6 +5,15 @@ title: "Tutorials"
 
 ## Pipeline Tutorials
 
+:::danger Pipelines are legacy — do not use for new deployments
+**Pipelines are legacy and no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that are built in, easier to configure, and need no separate worker container:
+
+- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe)
+- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter)
+
+These tutorials are kept for reference and existing deployments only.
+:::
+
 ## Tutorials Welcome
 
 Are you a content creator with a blog post or YouTube video about your pipeline setup? Get in touch
diff --git a/docs/features/extensibility/pipelines/valves.md b/docs/features/extensibility/pipelines/valves.md
index 6e333d3db..e86d1cede 100644
--- a/docs/features/extensibility/pipelines/valves.md
+++ b/docs/features/extensibility/pipelines/valves.md
@@ -5,6 +5,15 @@ title: "Valves"
 
 ## Valves
 
+:::danger Pipelines are legacy — do not use for new deployments
+**Pipelines are legacy and no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that support [Valves](/features/extensibility/plugin/development/valves) too, are built in, and need no separate worker container:
+
+- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe)
+- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter)
+
+This page is kept for reference and existing deployments only.
+:::
+
 `Valves` (see the dedicated [Valves & UserValves](/features/extensibility/plugin/development/valves) page) can also be set for `Pipeline`. In short, `Valves` are input variables that are set per pipeline.
 
 `Valves` are set as a subclass of the `Pipeline` class, and initialized as part of the `__init__` method of the `Pipeline` class.
diff --git a/docs/features/extensibility/plugin/development/events.mdx b/docs/features/extensibility/plugin/development/events.mdx
index ba8abf730..6090552d2 100644
--- a/docs/features/extensibility/plugin/development/events.mdx
+++ b/docs/features/extensibility/plugin/development/events.mdx
@@ -795,6 +795,17 @@ When Open WebUI calls your external tool (with header forwarding enabled), it in
 
 **Authentication:** Requires a valid Open WebUI API key or session token.
 
+:::warning Open WebUI does **not** forward user credentials to external tools
+The `X-OpenWebUI-User-*` and `X-Open-WebUI-Chat-Id` / `X-Open-WebUI-Message-Id` headers forwarded to your tool are **identification only** — they carry no API key or session token. The same applies to MCP custom-header template tokens (`{{USER_ID}}`, `{{USER_NAME}}`, `{{USER_EMAIL}}`, `{{USER_ROLE}}`, `{{CHAT_ID}}`, `{{MESSAGE_ID}}`): there is no `{{API_KEY}}` or `{{TOKEN}}` placeholder, and the user's own API key / session is never sent to the tool server.
+
+So an external tool **must hold its own statically configured Open WebUI API key** to call this endpoint. The endpoint's authorization check requires the caller to be the chat's owner **or an admin**, which gives you two practical options:
+
+- **Per-user key (uncommon)** — the tool server holds the specific user's API key. Only works for a single-user setup; impractical for a shared MCP server.
+- **Admin / service-account key (recommended)** — provision a dedicated admin (or service-account) user in Open WebUI, generate an API key for it, and use that key from the tool server. An admin key works for any user's chat, so a single key serves all callers; the forwarded `X-Open-WebUI-Chat-Id` + `X-Open-WebUI-Message-Id` headers tell your tool *which* chat/message to post to.
+
+Store the key as a secret on the tool server (env var, secrets manager, etc.); do not expect Open WebUI to push it for you.
+:::
+
 **Request Body:**
 
 ```json
diff --git a/docs/features/extensibility/plugin/development/rich-ui.mdx b/docs/features/extensibility/plugin/development/rich-ui.mdx
index c3d78a984..db5a7556f 100644
--- a/docs/features/extensibility/plugin/development/rich-ui.mdx
+++ b/docs/features/extensibility/plugin/development/rich-ui.mdx
@@ -16,28 +16,28 @@ To embed HTML content, your tool should return an `HTMLResponse` with the `Conte
 ```python
 from fastapi.responses import HTMLResponse
 
-def create_visualization_tool(self, data: str) -> HTMLResponse:
+def render_checklist(self, items: list[str]) -> HTMLResponse:
     """
-    Creates an interactive data visualization that embeds in the chat.
+    Renders an interactive checklist that embeds in the chat.
 
-    :param data: The data to visualize
+    :param items: The items to show in the checklist
     """
-    html_content = """
+    items_html = "".join(
+        f'<li><label><input type="checkbox"> {item}</label></li>' for item in items
+    )
+    html_content = f"""
     <!DOCTYPE html>
     <html>
     <head>
-        <title>Data Visualization</title>
-        <script src="https://cdn.plot.ly/plotly-latest.min.js"></script>
+        <title>Checklist</title>
+        <style>
+            body {{ font-family: system-ui, sans-serif; padding: 1rem; }}
+            ul {{ list-style: none; padding: 0; }}
+            li {{ padding: 0.25rem 0; }}
+        </style>
     </head>
     <body>
-        <div id="chart" style="width:100%;height:400px;"></div>
-        <script>
-            // Your interactive chart code here
-            Plotly.newPlot('chart', [{
-                y: [1, 2, 3, 4],
-                type: 'scatter'
-            }]);
-        </script>
+        <ul>{items_html}</ul>
     </body>
     </html>
     """
@@ -55,11 +55,11 @@ To provide the LLM with actionable context about the embed, return a **tuple** o
 ```python
 from fastapi.responses import HTMLResponse
 
-def create_chart(self, data: str) -> tuple:
+def render_feedback_form(self, prompt: str) -> tuple:
     """
-    Creates an interactive chart and returns context to the LLM.
+    Renders an interactive feedback form and returns context to the LLM.
 
-    :param data: The data to chart
+    :param prompt: The question to show the user above the form
     """
     html_content = "<html>...</html>"
     headers = {"Content-Disposition": "inline"}
@@ -67,16 +67,16 @@ def create_chart(self, data: str) -> tuple:
     # The LLM receives this context instead of the generic message
     result_context = {
         "status": "success",
-        "chart_type": "scatter",
-        "data_points": 42,
-        "description": "Scatter plot showing correlation between X and Y"
+        "form_type": "feedback",
+        "fields": ["rating", "comment"],
+        "description": f"Rendered a feedback form asking: {prompt!r}"
     }
 
     return HTMLResponse(content=html_content, headers=headers), result_context
 ```
 
 The context can be:
-- A **string** — sent as-is to the LLM (e.g., `"Generated a bar chart with 5 categories"`)
+- A **string** — sent as-is to the LLM (e.g., `"Rendered a 5-item checklist"`)
 - A **dict** — serialized as JSON for structured context
 - A **list** — serialized as JSON for multiple items
 
@@ -271,11 +271,11 @@ The iframe and parent window can communicate beyond just height reporting. The f
 
 ### Payload Requests
 
-The iframe can request a data payload from the parent. This is useful for passing dynamic data into the embed after it loads:
+The iframe can ask the parent for a data payload after it loads:
 
 ```html
 <script>
-  // Request payload from parent
+  // Listen for the response
   window.addEventListener('message', (e) => {
     if (e.data?.type === 'payload') {
       const data = e.data.payload;
@@ -289,26 +289,46 @@ The iframe can request a data payload from the parent. This is useful for passin
 </script>
 ```
 
-The parent responds with `{ type: 'payload', requestId: ..., payload: ... }` containing the configured payload data.
+The parent responds with `{ type: 'payload', requestId: ..., payload: ... }`.
+
+:::info Where the payload comes from
+There is no separate "set the payload" call. The payload is whatever the parent component had configured when it instantiated the iframe — and today only one path actually configures one:
+
+- ✅ **Citation-opened embeds in the chat-controls Embeds panel** — when the user clicks a citation badge whose source has an embed URL, the side panel opens and exposes **the full citation/source object** (the same dict you sent in your `source` / `citation` event via `__event_emitter__`) as the payload. To set it, emit a [`source` event](./events#source-or-citation-and-code-execution) whose `data` includes whatever you want the iframe to be able to fetch. The iframe then asks for it via the postMessage above and receives the citation object back.
+- ❌ **Inline tool-call embeds** (from a tool method returning `HTMLResponse` or `(HTMLResponse, context)`) — the parent does not configure a payload on this path, so a payload request returns `{ type: 'payload', requestId: ..., payload: null }`. Use [Tool Args Injection](#tool-args-injection-tools-only) (subject to `allowSameOrigin`) to pass data into a tool-call embed instead.
+- ❌ **`__event_emitter__({"type": "embeds", ...})` and Action embeds** — also configured without a payload; the response is `null`.
+
+In short: payload-request is the side-panel-citation channel, not a generic iframe-data channel. Pick the right rendering path for the data flow you need.
+:::
 
 ### Tool Args Injection (Tools Only)
 
-When a **Tool** returns a Rich UI embed, the tool call arguments (the parameters the model passed to the tool) are automatically injected into the iframe's `window.args`. This allows your embedded HTML to access the tool's input:
+When a **Tool** method returns a Rich UI embed inline at the tool-call display (i.e. you return an `HTMLResponse`, or a `(HTMLResponse, context)` tuple, from the tool method itself), the arguments the model passed are exposed on the iframe as `window.args` — **as a JSON string**, not a parsed object. Parse it before use:
 
 ```html
 <script>
   window.addEventListener('load', () => {
-    // window.args contains the JSON arguments the model passed to this tool
-    const args = window.args;
-    if (args) {
+    const raw = window.args;             // JSON string, or undefined
+    if (raw) {
+      const args = JSON.parse(raw);      // parse to object
       document.getElementById('output').textContent = JSON.stringify(args, null, 2);
+    } else {
+      console.warn('window.args not set — see Requirements below.');
     }
   });
 </script>
 ```
 
-:::note
-This only works for Tool embeds rendered via the tool call display. Action embeds do not have `window.args` since they are triggered by the user, not the model.
+:::warning Requires `allowSameOrigin` — otherwise `window.args` is silently `undefined`
+The args are injected from the parent page via `iframe.contentWindow.args = ...`, which the browser blocks under same-origin policy unless the iframe sandbox carries `allow-same-origin`. That is gated by the per-user **Settings → Interface → "iframe Sandbox Allow Same Origin"** toggle, which is **off by default**. If `window.args` comes back undefined and you have not changed this setting, that is the cause: turn it on and reload. See [allowSameOrigin](#allowsameorigin) above for the security trade-off.
+:::
+
+:::note Where `window.args` is set, and where it is not
+- ✅ **Tool method returning `HTMLResponse` or `(HTMLResponse, context)` tuple** — rendered inline at the "View Result from..." tool call indicator. `window.args` is injected (subject to the `allowSameOrigin` requirement above).
+- ❌ **`__event_emitter__({"type": "embeds", "data": {"embeds": [...]}})`** — rendered through the chat-controls Embeds panel, which does not wire `args` at all. `window.args` will always be undefined here, regardless of sandbox settings. This is by design: the embeds-event path has no tool call attached, so there are no args to inject.
+- ❌ **Action embeds** — triggered by the user, not the model, so there are no model-supplied args to inject.
+
+Neither ❌ path can fall back on [Payload Requests](#payload-requests): as noted above, both receive a `null` payload. To get dynamic data into those embeds, render it into the HTML itself when you build the embed.
 :::
 
 ### Auto-Injected Libraries
diff --git a/docs/features/extensibility/plugin/development/under-the-hood.mdx b/docs/features/extensibility/plugin/development/under-the-hood.mdx
new file mode 100644
index 000000000..4d6f9a983
--- /dev/null
+++ b/docs/features/extensibility/plugin/development/under-the-hood.mdx
@@ -0,0 +1,190 @@
+---
+sidebar_position: 5
+title: "Under the Hood"
+---
+
+# 🔧 Under the Hood: What the Plugin Loader Actually Does
+
+:::danger ⚠️ Critical Security Warning
+**Tools, Functions, Pipes, Filters, and Actions execute arbitrary Python code on your server.** Function creation is restricted to administrators only, and Workspace Tool creation is gated by the `workspace.tools` permission — granting that permission is equivalent to giving the user shell access to the server. Only install from trusted sources, review code before importing, and restrict creation to trusted administrators. A malicious plugin could access your file system, exfiltrate data, or compromise your entire system. For full details, see the [Plugin Security Warning](/features/extensibility/plugin/).
+:::
+
+Open WebUI's plugins (Tools, Functions = Filters / Pipes / Actions) are not sandboxed scripts running in some restricted runtime. They are **Python modules executed inside your Open WebUI process**, with full access to the standard library, any pip package, the entire `open_webui` codebase, the live FastAPI app, and the database. The documented hooks (`inlet`, `outlet`, `stream`, `pipe`, `action`) are *one* way to use that access. They are not the only way.
+
+This page documents what the loader really does and what that opens up, so you can build (or audit) plugins beyond the patterns shown on the per-type pages. It also lists the footguns that come with the territory.
+
+---
+
+## How a plugin is loaded
+
+A single loader in [`backend/open_webui/utils/plugin.py`](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/utils/plugin.py) handles every plugin type:
+
+1. The plugin's Python source is read from the database.
+2. A fresh `types.ModuleType` is created and registered in `sys.modules` as `function_{id}` (or `tool_{id}`).
+3. The source is fed to `exec(content, module.__dict__)`. Anything at module top level runs at this point.
+4. The loader looks for **one** entry-point class: `Tools`, `Pipe`, `Filter`, or `Action`. That class becomes the handle Open WebUI calls into.
+5. The module stays in `sys.modules` for the life of the process. Any side effect of step 3 (imports, monkey-patches, background tasks, route registrations) is now installed in the live application.
+
+The entry-point class is the only thing the rest of Open WebUI cares about. Everything else in the file is yours.
+
+### When the module is re-executed
+
+Inlet/outlet hooks pass `load_from_db=True`. The loader still serves from cache if the source has not changed, but it consults the database on every call to decide that. Stream hooks pass `load_from_db=False` and read straight from cache.
+
+| Hook | DB check per call? | Module re-exec'd when? |
+|---|---|---|
+| `inlet` / `outlet` (Filter) | yes | source change between calls |
+| `stream` (Filter) | no | only when another hook re-loads it |
+| Tools, Pipes, Actions | yes on dispatch | source change between calls |
+
+Practical consequences:
+
+- **Editing a Filter via the editor takes effect on the next chat for `inlet`/`outlet`.** The `stream` hook picks it up the next time an `inlet` or `outlet` call triggers a reload.
+- **Re-execution is not per-request**, so module-top-level work is paid for once per content version, not once per chat. Top-level imports, patches, and singletons are fine.
+- **Disabling or deleting a plugin** removes it from the active set. It does **not** undo anything its module top level did. The module stays in `sys.modules` and any monkey-patches it installed in other modules stay applied until the process restarts.
+
+---
+
+## What you actually have access to
+
+From any hook (and from module top level):
+
+- The full `open_webui.*` package. Examples: `from open_webui.models.chats import Chats`, `from open_webui.utils.middleware import process_chat_payload`, `from open_webui.config import ConfigVar`.
+- The live FastAPI `Request` via `__request__`, which carries `__request__.app` (the FastAPI app), `__request__.app.state` (config, caches, handlers), and `__request__.state` (per-request scratch).
+- The reserved dunder args documented in [Reserved Arguments](./reserved-args): `__user__`, `__metadata__`, `__model__`, `__request__`, `__event_emitter__`, `__event_call__`, `__features__`, `__body__`, `__id__`, `__oauth_token__`, plus stream-only and per-hook extras.
+- Events documented in [Events](./events): emit anything to the frontend, or solicit a response from the user with `event_call`.
+- Any pip package via `requirements:` frontmatter, installed at load time (gated by [`ENABLE_PIP_INSTALL_FRONTMATTER_REQUIREMENTS`](/reference/env-configuration#enable_pip_install_frontmatter_requirements)).
+- The Python stdlib, plus everything pip-installed in the container.
+
+There is no sandbox, no allowlist, no capability system. The execution model is **"this is Python, you are inside the server process"**.
+
+---
+
+## Patterns
+
+### 1. Mutate the per-request model dict from `inlet`
+
+The `__model__` you receive is **the same dict object** the rest of the request reads. Changing its keys from `inlet` changes how the rest of the pipeline behaves on this request. Example (the reasoning-content fix for DeepSeek / Kimi / MiMo):
+
+```python
+class Filter:
+    async def inlet(self, body: dict, __model__: dict | None = None) -> dict:
+        # Flip the per-request model to the code path that emits
+        # reasoning_content as a top-level field on assistant messages
+        # during the native tool-call loop.
+        if __model__ and __model__.get("provider") not in ("ollama", "llama.cpp"):
+            __model__["provider"] = "llama.cpp"
+        return body
+```
+
+Same trick works for any other field the middleware reads from the model dict: `params`, `meta`, custom keys you put there yourself and then read from another hook.
+
+### 2. Monkey-patch a backend function
+
+Because the plugin module can `import open_webui.*` and rebind module attributes:
+
+```python
+import open_webui.utils.middleware as _mw
+
+_original = _mw.process_chat_payload
+
+async def _patched(request, form_data, user, metadata, model):
+    # ...your wrapping logic, then delegate...
+    return await _original(request, form_data, user, metadata, model)
+
+_mw.process_chat_payload = _patched
+```
+
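+This runs at module load (once per source version). The patch persists in `sys.modules` for the life of the process, and deleting or disabling the plugin **does not** revert it; the only clean rollback is a process restart. Note that every re-execution after an edit captures the already-patched function as `_original`, so wrappers stack unless you guard against double-patching.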
+
+Use sparingly. Cross-plugin interference is a real risk: if two plugins patch the same function the result depends on load order, which is not deterministic.
+
+### 3. Add a new HTTP route at load
+
+```python
+def _ensure_route(app):
+    if any(getattr(r, "path", None) == "/my/route" for r in app.routes):
+        return
+    app.add_api_route("/my/route", my_handler, methods=["GET"])
+```
+
+Call `_ensure_route(__request__.app)` from the first hook that receives `__request__`; `my_handler` stands for your own endpoint function. The idempotency guard is important: the loader may re-execute on edits, and `add_api_route` will happily register the same path twice.
+
+### 4. Spawn a background task
+
+```python
+import asyncio
+
+async def _loop(app):
+    while True:
+        # ...periodic work...
+        await asyncio.sleep(60)
+
+def _start_once(app):
+    if getattr(app.state, "_my_plugin_started", False):
+        return
+    app.state._my_plugin_started = True
+    app.state._my_plugin_task = asyncio.create_task(_loop(app))  # keep a reference so the task is not garbage-collected
+```
+
+The `app.state` flag makes it "once per process" rather than "once per source version". On a clean restart it starts fresh. Call `_start_once` from an async hook: `asyncio.create_task` needs a running event loop.
+
+### 5. Stash state in `app.state`
+
+```python
+async def inlet(self, body, __request__):
+    cache = __request__.app.state.__dict__.setdefault("my_cache", {})
+    # ...read/write cache...
+    return body
+```
+
+Shared across requests and **across plugins** in the same process. There is no namespacing: pick a unique key.
+
+### 6. Use `event_emitter` for arbitrary side effects in the UI
+
+`event_emitter` accepts any event shape the frontend handles: status banners, source citations, file attachments, chat-message updates, toasts. You are not restricted to the events documented on the per-type pages. See [Events](./events) for the full catalogue.
+
+### 7. Prompt the user mid-handler with `event_call`
+
+`event_call` works like `event_emitter` but **awaits a response**. Show a form, a confirmation, or an input dialog, and block until the user answers. Useful inside Tool methods that need a human in the loop, or Action handlers that confirm before executing.
+
+### 8. Pipes as full provider replacements
+
+A `Pipe` replaces the entire LLM call. Open WebUI hands you the request and asks for a response back. Nothing in the middleware constrains what you put in that response, so:
+
+- wrap an external API (any provider, any protocol),
+- route between providers based on request shape,
+- run an entire agent inside `pipe()` and stream the agent's output back,
+- skip any model entirely and return canned content.
+
+A Pipe is the most powerful entry point precisely because the middleware steps out of the way.
+
+### 9. Tools that do more than their docstring says
+
+A `Tools` class's methods are exposed to the model as callable tools (their docstrings become JSON schema). The method body can do **anything**: call external APIs, emit UI events with `__event_emitter__`, stash data in `app.state`, monkey-patch on first call. The docstring is purely how the tool advertises itself to the model. The implementation is unconstrained.
+
+### 10. Actions as arbitrary one-shot operations
+
+`Action` renders a button on an assistant message. The handler runs server-side with the same dunder surface as Filters and Tools, against the chat that the message belongs to. Use it for "approve this", "re-run with...", "send to external system", or any one-off operation a user should be able to trigger from a specific message.
+
+---
+
+## Footguns
+
+- **No sandboxing.** Tools and Functions execute Python in your backend process as the backend user. The security policy ([Rule 10](/security/security-policy#reporting-guidelines)) treats this as intended behaviour: granting Tool or Function creation permission is equivalent to granting shell access on the host. Treat plugin authors as administrators.
+- **Stream hooks use a stale cache.** Edits to a `stream` method only take effect after another hook (or a process restart) refreshes the module. If you edit a stream filter and the change does not seem to apply, trigger an `inlet`/`outlet` reload or restart.
+- **Cross-plugin interference is not detected.** Two plugins patching the same function, registering the same route, or writing to the same `app.state` key will collide. Load order is not deterministic. Prefer additive patterns (your own namespaces, wrappers that delegate) over destructive ones.
+- **Disabling does not unload.** The module stays in `sys.modules` and any module-level side effects stay installed. Restart the process to fully revert.
+- **`requirements:` runs `pip install` on every replica at load.** In multi-replica deployments set [`ENABLE_PIP_INSTALL_FRONTMATTER_REQUIREMENTS=False`](/reference/env-configuration#enable_pip_install_frontmatter_requirements) and pre-install dependencies in your image; runtime installs can race across workers and crash. See [Scaling → Function/Tool Dependency Installation Crashes](/troubleshooting/multi-replica#9-functiontool-dependency-installation-crashes).
+- **Internal APIs are not a stable public surface.** `open_webui.utils.*`, the internal model classes, middleware helpers, and pretty much everything outside the documented dunder args and event types can rename, move, or change signatures between releases. If your monkey-patch breaks after an upgrade, that is on you to repair.
+- **The Pipelines server is out of scope here.** This page is about in-process plugins (Tools / Functions). The separate [Pipelines](/features/extensibility/pipelines/) server runs out-of-process and does not share `sys.modules` with Open WebUI: it cannot monkey-patch the main app, but it also is not constrained by it.
+
+---
+
+## When this is the wrong tool
+
+For anything you can express through the documented hooks (filters that mutate `body`, tools that call APIs and return results, actions that emit events), **stay in the documented hooks**. The patterns above are powerful but fragile: cross-plugin interaction, upgrade compatibility, and rollback all degrade the moment you start patching module internals.
+
+If your plugin needs an interface that does not exist yet, an upstream PR is more durable than a monkey-patch.
+
+If you file a bug report against a code path that your plugin is monkey-patching, expect it to be closed. Reports must reproduce against an unmodified Open WebUI ([Rule 6](/security/security-policy#reporting-guidelines)).
diff --git a/docs/features/extensibility/plugin/functions/filter.mdx b/docs/features/extensibility/plugin/functions/filter.mdx
index af374e6cf..270da77e1 100644
--- a/docs/features/extensibility/plugin/functions/filter.mdx
+++ b/docs/features/extensibility/plugin/functions/filter.mdx
@@ -3,7 +3,7 @@ sidebar_position: 3
 title: "Filter Function"
 ---
 
-# 🪄 Filter Function: Modify Inputs and Outputs
+# Filter Function: Modify Inputs and Outputs
 
 :::danger ⚠️ Critical Security Warning
 **Filter Functions execute arbitrary Python code on your server.** Function creation is restricted to administrators only. Only install from trusted sources and review code before importing. A malicious Function could access your file system, exfiltrate data, or compromise your entire system. For full details, see the [Plugin Security Warning](/features/extensibility/plugin/).
@@ -15,7 +15,7 @@ This guide will break down **what Filters are**, how they work, their structure,
 
 ---
 
-## 🌊 What Are Filters in Open WebUI?
+## What Are Filters in Open WebUI?
 
 Imagine Open WebUI as a **stream of water** flowing through pipes:
 
@@ -36,11 +36,11 @@ Filters are like **translators or editors** in the AI workflow: you can intercep
 
 ---
 
-## 🗺️ Structure of a Filter Function: The Skeleton
+## Structure of a Filter Function: The Skeleton
 
 Let's start with the simplest representation of a Filter Function. Don't worry if some parts feel technical at first—we’ll break it all down step by step!
 
-### 🦴 Basic Skeleton of a Filter
+### Basic Skeleton of a Filter
 
 ```python
 from pydantic import BaseModel
@@ -73,7 +73,7 @@ class Filter:
 
 ---
 
-### 🧲 Toggleable Filters: Making Filters User-Controllable (`self.toggle`)
+### Toggleable Filters: Making Filters User-Controllable (`self.toggle`)
 
 By default a filter that's **active and in scope** (global, or attached to the model) runs on every request — the user has no say in it. That's often what you want (PII scrubbing, logging, mandatory guardrails). Sometimes you want the opposite: let the user decide whether the filter runs for a given conversation.
 
@@ -144,9 +144,88 @@ The chip being present = the filter is enabled for the next request. The chip be
 
 ---
 
-## ⚙️ Filter Administration & Configuration
+### Owning Retrieval With `file_handler`
 
-### 🌐 Global Filters vs. Model-Specific Filters
+By default, when a user attaches a knowledge collection or uploads a file to a chat, Open WebUI runs the built-in RAG pipeline **after** every inlet filter has returned. The chat-completion handler queries the vector DB for chunks relevant to the user's last message, wraps them in `<source>` tags, appends them to the last user message (or to a system message, depending on `RAG_SYSTEM_CONTEXT`), and only then calls the LLM.
+
+This matters for filter authors: at `inlet()` time, `body["metadata"]["files"]` and `body["files"]` contain only the file/collection *references* (IDs, names, types). **The chunk text doesn't exist yet** because retrieval hasn't happened. So if you want to inspect or transform the chunks themselves (PII / PHI redaction, reranking, custom hybrid scoring, translation, chunk-level access control, anonymization), the standard inlet contract is not enough.
+
+**`file_handler = True`** is the opt-in escape hatch for exactly this case. Declared as a **module-level attribute** at the top of your filter file, it tells Open WebUI "I am handling retrieval and chunk injection myself — skip the built-in RAG step." When set, the backend strips `body["metadata"]["files"]` and `body["files"]` after your `inlet()` returns, so the chat-completion handler finds no files to retrieve over and goes straight to the LLM with whatever you injected.
+
+```python
+from pydantic import BaseModel
+from typing import Optional
+
+# Module-level attribute — sits OUTSIDE the Filter class, alongside imports.
+file_handler = True
+
+class Filter:
+    class Valves(BaseModel):
+        pass
+
+    def __init__(self):
+        self.valves = self.Valves()
+
+    async def inlet(
+        self,
+        body: dict,
+        __request__=None,
+        __user__: Optional[dict] = None,
+        __model__: Optional[dict] = None,
+    ) -> dict:
+        # body["metadata"]["files"] still contains the file/collection REFERENCES here.
+        # After this method returns, Open WebUI strips them and does NOT run its own RAG.
+        # Therefore: it is YOUR job to retrieve, transform, and inject chunks before returning.
+        return body
+```
+
+:::warning Module attribute, not `self.file_handler`
+Open WebUI reads `file_handler` from the **module object** (the file your filter lives in), not from the `Filter` instance. Setting `self.file_handler = True` inside `__init__` is silently ignored. Put the assignment at the top of the file, alongside your imports — exactly as shown above.
+:::
+
+#### When to use it
+
+- **Per-model redaction.** Apply PII / PHI scrubbing only when the request targets a remote model, while letting a self-hosted model see raw chunks. Branch on `__model__["owned_by"]` (or another signal) inside the inlet and transform chunks accordingly.
+- **Custom retrieval logic.** Hybrid BM25 + dense scoring, query rewriting, multi-collection routing, reranking with a different model than the one Open WebUI uses, result caching keyed on the rewritten query.
+- **Pre-injection transformation.** Translation, summarization, deduplication, or any transform that needs the *actual chunk text* rather than just the references.
+- **Chunk-level access control.** Filter out chunks the current user shouldn't see based on metadata attached to the source documents.
+
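As a rough illustration of the per-model redaction case, the sketch below scrubs email addresses from chunk text unless the target model is treated as self-hosted. The `EMAIL_RE` pattern, the `SELF_HOSTED_OWNERS` set, and the `redact_for_model` helper are hypothetical names introduced for this example, and treating `owned_by == "ollama"` as "self-hosted" is an assumption you would adapt to your own deployment.

```python
import re
from typing import Optional

# Hypothetical pattern: matches simple email addresses only, not general PII.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Assumption: these `owned_by` values mark models you host yourself.
SELF_HOSTED_OWNERS = {"ollama"}


def redact_for_model(chunks: list[str], model: Optional[dict]) -> list[str]:
    """Scrub emails from chunks unless the model is treated as self-hosted."""
    owned_by = (model or {}).get("owned_by")
    if owned_by in SELF_HOSTED_OWNERS:
        return list(chunks)  # self-hosted model may see the raw text
    return [EMAIL_RE.sub("[REDACTED]", chunk) for chunk in chunks]
```

Inside `inlet()`, a helper like this would be called with the retrieved chunk text and the `__model__` dict before the chunks are injected into the prompt.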
+#### The recipe
+
+1. Set `file_handler = True` at the top of your filter module.
+2. In `inlet()`, read the file references from `body["metadata"]["files"]` (and `body["files"]` for ad-hoc attachments).
+3. Retrieve chunks yourself. Two options:
+   - **HTTP**: call `POST /api/v1/retrieval/query/doc` (single collection) or `POST /api/v1/retrieval/query/collection` (multiple), passing the user's last message as the query string and the inbound request's bearer token so permissions stay scoped to the user.
+   - **In-process**: `from open_webui.retrieval.utils import get_sources_from_items` and call it directly with the same arguments the core code uses. This avoids the network hop and returns a cleaner shape (list of dicts each containing a `document` array of chunks and a parallel `metadata` array).
+4. Transform the chunks however you need. Branch on `__model__` / `__user__` if the transform is conditional (e.g. "redact only when the model is remote").
+5. Inject the transformed chunks back into `body["messages"]`. To preserve clickable citations in the UI, mirror the format Open WebUI uses internally:
+
+   ```html
+   <source id="1" name="filename.pdf" resource-id="<collection_id>" resource-type="collection">
+   ...chunk text...
+   </source>
+   ```
+
+   Plain Markdown also works if you don't care about citations being clickable in the UI — only the structured `<source>` form wires up the citation popovers.
+6. Return `body`. The built-in RAG step is skipped (because `file_handler` caused the file references to be stripped), and the LLM call goes out with your sanitized chunks already in the prompt.
+
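Steps 5 and 6 of the recipe can be sketched as a pair of small helpers. This is a minimal illustration, not Open WebUI's internal code: `format_sources` and `inject_sources` are hypothetical names, and the chunk dict keys (`name`, `resource_id`, `text`) are assumptions chosen for the example. Retrieval itself (step 3) is deliberately left out.

```python
from html import escape


def format_sources(chunks: list[dict]) -> str:
    """Render chunks as <source> blocks, numbering ids from 1."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        blocks.append(
            f'<source id="{index}" name="{escape(chunk["name"])}" '
            f'resource-id="{escape(chunk["resource_id"])}" resource-type="collection">\n'
            f'{chunk["text"]}\n</source>'
        )
    return "\n".join(blocks)


def inject_sources(body: dict, chunks: list[dict]) -> dict:
    """Append the rendered sources to the most recent user message."""
    if not chunks:
        return body
    for message in reversed(body.get("messages", [])):
        if message.get("role") != "user":
            continue
        # This sketch only handles plain-string content; list-style
        # (multimodal) content is left untouched.
        if isinstance(message.get("content"), str):
            message["content"] = f'{message["content"]}\n\n{format_sources(chunks)}'
        break  # only the most recent user message is considered
    return body
```

In a filter, `inlet()` would call `inject_sources(body, transformed_chunks)` and return the result.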
+#### Caveat: it's static, all-or-nothing per filter
+
+`file_handler` is read **once per filter, at the module level**. It is not a per-request signal and cannot be flipped based on the model, user, or chat from inside `inlet()`. When set, the built-in RAG is **always** skipped for any request where this filter is invoked — regardless of whether your `inlet()` actually called any retrieval logic on that particular request.
+
+In practice this means: if you use `file_handler = True`, your filter must handle retrieval for **every** scenario where files would normally be retrieved by the built-in path, including the cases where you'd have been happy with the default behavior. The retrieval call itself is the same on every request; only the conditional *transformation* (e.g. "only redact for remote models") branches on context.
+
+If you genuinely need per-request switching between built-in and custom retrieval (e.g. "use built-in RAG for some users, custom for others on the same model"), the cleanest approach is to gate the custom-RAG filter on `self.toggle = True` so it only runs when the user has it selected. When the filter isn't selected, it doesn't run, its `file_handler` doesn't apply, and the built-in RAG handles the request normally. Don't try to dynamically mutate `file_handler` from inside `inlet()`; the flag is read off the module object before your method is called.
+
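A minimal sketch of the toggle-gated variant described above, assuming the two flags combine as stated. `Valves` is omitted to keep the example dependency-free, and the `inlet()` body is a placeholder for your own retrieval and injection logic.

```python
# Module-level opt-in: the built-in RAG step is skipped whenever this filter runs.
file_handler = True


class Filter:
    def __init__(self):
        # Toggleable filter: it only runs when the user has selected it,
        # so unselected chats fall back to the built-in RAG path.
        self.toggle = True

    async def inlet(self, body: dict) -> dict:
        # Placeholder: retrieve, transform, and inject chunks here.
        return body
```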
+#### Why this matters compared to mutating `body["files"]` in inlet
+
+A naive alternative is to set `body["metadata"]["files"] = []` and `body["files"] = []` inside `inlet()` to suppress the built-in RAG dynamically. This works in practice but is brittle: future Open WebUI versions can add new file/collection plumbing under additional keys, and the official "I'm handling this myself" contract is `file_handler`. Prefer the documented opt-in.
+
+---
+
+## Filter Administration & Configuration
+
+### Global Filters vs. Model-Specific Filters
 
 Open WebUI provides a flexible multi-level filter system that allows you to control which filters are active, how they're enabled, and who can toggle them. Understanding this system is crucial for effective filter management.
 
@@ -191,7 +270,7 @@ POST /functions/id/{filter_id}/toggle/global
 
 ---
 
-### 🎛️ The Two-Tier Filter System
+### The Two-Tier Filter System
 
 Open WebUI uses a sophisticated two-tier system for managing filters on a per-model basis. This can be confusing at first, but it's designed to support both **always-on filters** and **user-toggleable filters**.
 
@@ -258,7 +337,7 @@ class Filter:
 
 ---
 
-### 🔄 Toggleable Filters vs. Always-On Filters
+### Toggleable Filters vs. Always-On Filters
 
 Understanding the difference between these two types is key to using the filter system effectively.
 
@@ -348,7 +427,7 @@ class WebSearchFilter:
 
 ---
 
-### 📊 Filter Execution Flow
+### Filter Execution Flow
 
 Here's the complete flow from admin configuration to filter execution:
 
@@ -386,7 +465,7 @@ Here's the complete flow from admin configuration to filter execution:
 
 ---
 
-### 📡 Filter Behavior with API Requests
+### Filter Behavior with API Requests
 
 When using Open WebUI's API endpoints directly (e.g., via `curl` or external applications), `inlet()` and `stream()` follow the same execution model as WebUI requests. `outlet()` is the one that behaves very differently for direct API callers and is covered in detail below.
 
@@ -608,7 +687,7 @@ Filters are sorted in **ascending** order by priority. A filter with `priority=0
 
 ---
 
-### 🔗 Data Passing Between Filters
+### Data Passing Between Filters
 
 When multiple filters are active, each filter in the chain receives the **modified data from the previous filter**. The returned value from one filter becomes the input to the next filter in the priority order.
 
@@ -932,6 +1011,10 @@ In the world of Open WebUI, the `inlet` function does this important prep work o
 🚀 **Your Task**:
 Modify and return the `body`. The modified version of the `body` is what the LLM works with, so this is your chance to bring clarity, structure, and context to the input.
 
+:::info Want to transform RAG chunks? `inlet()` runs **before** retrieval
+At `inlet()` time, `body["metadata"]["files"]` and `body["files"]` contain only file/collection *references* — the actual chunk text is fetched and injected later, after every inlet filter has returned. If you need to inspect or transform the chunk text itself (PII redaction, reranking, translation, chunk-level ACLs), see [Owning Retrieval With `file_handler`](#owning-retrieval-with-file_handler) for the supported opt-in.
+:::
+
 ##### Why Would You Use the `inlet`?
 1. **Adding Context**: Automatically append crucial information to the user’s input, especially if their text is vague or incomplete. For example, you might add "You are a friendly assistant" or "Help this user troubleshoot a software bug."
 
@@ -1036,7 +1119,7 @@ async def stream(self, event: dict) -> dict:
 - Each line represents a **small fragment** of the model's streamed response.
 - The **`delta.content` field** contains the progressively generated text.
 
-##### 🔄 Example: Filtering Out Emojis from Streamed Data
+##### Example: Filtering Out Emojis from Streamed Data
 ```python
 async def stream(self, event: dict) -> dict:
     for choice in event.get("choices", []):
@@ -1073,7 +1156,7 @@ The `outlet` function is like a **proofreader**: tidy up the AI's response (or m
 - **Quality scoring** - Run automated quality checks on model outputs
 
 :::info Outlet and API Requests
-`outlet()` does **not** run reliably for direct `/api/chat/completions` calls. On tagged releases it is never invoked by that endpoint. On `dev` it can run inline, but only when the caller supplies `chat_id` + `id`, owns the chat, and uses a non-streaming request — and even then the filtered content is not returned in the HTTP response. For direct API integrations that need `outlet()`, follow `/api/chat/completions` with `POST /api/chat/completed`. See [Filter Behavior with API Requests](#-filter-behavior-with-api-requests) for the full picture.
+`outlet()` does **not** run reliably for direct `/api/chat/completions` calls. On tagged releases it is never invoked by that endpoint. On `dev` it can run inline, but only when the caller supplies `chat_id` + `id`, owns the chat, and uses a non-streaming request — and even then the filtered content is not returned in the HTTP response. For direct API integrations that need `outlet()`, follow `/api/chat/completions` with `POST /api/chat/completed`. See [Filter Behavior with API Requests](#filter-behavior-with-api-requests) for the full picture.
 :::
 
 💡 **Example Use Case**: Strip out sensitive API responses you don't want the user to see:
@@ -1169,7 +1252,7 @@ Publishing a curated package on **[openwebui.com](https://openwebui.com/)** lets
 
 ---
 
-## 🚧 Potential Confusion: Clear FAQ 🛑
+## Potential Confusion: Clear FAQ
 
 ### **Q: How Are Filters Different From Pipe Functions?**
 
diff --git a/docs/features/extensibility/plugin/functions/pipe.mdx b/docs/features/extensibility/plugin/functions/pipe.mdx
index 8eb46f923..09a8b1cab 100644
--- a/docs/features/extensibility/plugin/functions/pipe.mdx
+++ b/docs/features/extensibility/plugin/functions/pipe.mdx
@@ -279,7 +279,7 @@ If you must use a synchronous third-party library in an async handler, wrap the
 You can modify this proxy Pipe to support additional service providers like Anthropic, Perplexity, and more by adjusting the API endpoints, headers, and logic within the `pipes` and `pipe` functions.
 
 :::caution Building a self-contained agent? Don't emit `delta.tool_calls`.
-If your Pipe wraps an agent (LangChain, LlamaIndex, a custom planner, …) that executes tools **internally** and then streams the final answer back to the chat, emitting `delta.tool_calls` in the stream will trigger Open WebUI's tool-execution retry loop — the middleware treats `delta.tool_calls` as "please execute this for me, client" and loops back through your pipe, duplicating the response up to `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES` (~30) times.
+If your Pipe wraps an agent (LangChain, LlamaIndex, a custom planner, …) that executes tools **internally** and then streams the final answer back to the chat, emitting `delta.tool_calls` in the stream will trigger Open WebUI's tool-execution retry loop: the middleware treats `delta.tool_calls` as "please execute this for me, client" and loops back through your pipe, duplicating the response up to [`CHAT_RESPONSE_MAX_TOOL_CALL_ITERATIONS`](/reference/env-configuration#chat_response_max_tool_call_iterations) times (default 256; before v0.9.6 the limit was `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES`, default 30).
 
 For self-contained agents, render tool executions as `<details type="tool_calls">` content blocks instead — the same shape Open WebUI itself emits after internal tool execution. See the [Pipes → Self-contained agents and `delta.tool_calls`](/features/extensibility/pipelines/pipes#self-contained-agents-and-deltatool_calls) section for the full pattern, a LangChain example, and the rule of thumb for which path to take.
 :::
diff --git a/docs/features/extensibility/plugin/tools/development.mdx b/docs/features/extensibility/plugin/tools/development.mdx
index 8c86128c0..642d4af94 100644
--- a/docs/features/extensibility/plugin/tools/development.mdx
+++ b/docs/features/extensibility/plugin/tools/development.mdx
@@ -33,6 +33,10 @@ licence: MIT
 """
 ```
 
+:::tip Metadata auto-fill (v0.9.6+)
+When you create a **new** tool (also applies to functions and skills), the editor reads the frontmatter as you paste or type code and auto-fills the **Name**, **ID**, and **Description** fields from `title` and `description` if you haven't already filled them in. It never overwrites a value you've entered, and it does not re-derive fields when editing an existing item — so you no longer need to retype metadata that's already declared in the source.
+:::
+
 ### Tools Class
 
 Tools have to be defined as methods within a class called `Tools`, with optional subclasses called `Valves` and `UserValves`, for example:
diff --git a/docs/features/extensibility/plugin/tools/index.mdx b/docs/features/extensibility/plugin/tools/index.mdx
index 1b4b354cf..6130b2c6c 100644
--- a/docs/features/extensibility/plugin/tools/index.mdx
+++ b/docs/features/extensibility/plugin/tools/index.mdx
@@ -21,12 +21,13 @@ Because there are several ways to integrate "Tools" in Open WebUI, it's importan
 
 | Type | Location in UI | Best For... | Source |
 | :--- | :--- | :--- | :--- |
-| **Native Features** | Admin/Settings | Core platform functionality | Built-in to Open WebUI |
-| **Workspace Tools** | `Workspace > Tools` | User-created or community Python scripts | [Community Library](https://openwebui.com/search) |
-| **Native MCP (HTTP)** | `Settings > Connections` | Standard MCP servers reachable via HTTP/SSE | External MCP Servers |
-| **MCP via Proxy (MCPO)** | `Settings > Connections` | Local stdio-based MCP servers (e.g., Claude Desktop tools) | [MCPO Adapter](https://github.com/open-webui/mcpo) |
-| **OpenAPI Servers** | `Settings > Connections` | Standard REST/OpenAPI web services | External Web APIs |
-| **Open Terminal** | `Settings > Integrations` | Full shell access in an isolated Docker container (always-on) | [Open Terminal](https://github.com/open-webui/open-terminal) |
+| **Native Features** | Admin/Settings | Core platform functionality (these are the [built-in system tools](#built-in-system-tools-nativeagentic-mode)) | Built-in to Open WebUI |
+| **Workspace Tools** | `Workspace > Tools` | User-created or community Python scripts — **the most powerful, least restricted option** | [Community Library](https://openwebui.com/search) |
+| **Native MCP (HTTP)** | `Settings > Connections` | Standard MCP servers reachable via HTTP/SSE | External tool server |
+| **MCP via Proxy (MCPO)** | `Settings > Connections` | Local stdio-based MCP servers (e.g., Claude Desktop tools) | External tool server (via [MCPO Adapter](https://github.com/open-webui/mcpo)) |
+| **OpenAPI Servers** | `Settings > Connections` | Standard REST/OpenAPI web services | External tool server |
+
+The last three (**MCP HTTP**, **MCPO**, **OpenAPI**) are all **external tool servers**: the tool code runs in a separate process or on a separate machine, and Open WebUI calls it over HTTP. **Native Features** are the built-in system tools that ship with Open WebUI. **Workspace Tools** are Python scripts that run in-process; for the most demanding use cases they are by far the most capable option with the fewest limitations (see below).
 
 ### 1. Native Features (Built-in)
 These are deeply integrated into Open WebUI and generally don't require external scripts.
@@ -39,8 +40,8 @@ These are deeply integrated into Open WebUI and generally don't require external
 In [**Native Mode**](#built-in-system-tools-nativeagentic-mode), these features are exposed as **Tools** that the model can call independently.
 
 ### 2. Workspace Tools (Custom Plugins)
-These are **Python scripts** that run directly within the Open WebUI environment.
-- **Capability**: Can do anything Python can do (web scraping, complex math, API calls).
+These are **Python scripts** that run directly within the Open WebUI environment. **For the most demanding use cases, Workspace Tools are by far the most powerful option with the fewest limitations** — they run in-process with full access to Python, the `open_webui` codebase, and the request context, so there is very little they *can't* do (see [Under the Hood](../development/under-the-hood) for the full extent). The external tool servers above are more constrained: they only see what you pass over HTTP and can't reach into Open WebUI itself.
+- **Capability**: Can do anything Python can do (web scraping, complex math, API calls), and hold secrets (API keys) entirely server-side so neither the user nor the model can read them.
 - **Access**: Managed via the `Workspace` menu. 
 - **Safety**: Always review code before importing, as these run on your server.
 - **⚠️ Security Warning**: Normal or untrusted users should **not** be given permission to access the Workspace Tools section. This access allows a user to upload and execute arbitrary Python code on your server, which could lead to a full system compromise.
@@ -54,6 +55,10 @@ These are **Python scripts** that run directly within the Open WebUI environment
 ### 4. OpenAPI / Function Calling Servers
 Generic web servers that provide an OpenAPI (`.json` or `.yaml`) specification. Open WebUI can ingest these specs and treat every endpoint as a tool.
 
+:::info Open Terminal — a separate code-execution integration
+Beyond the tool types above, Open WebUI also integrates with **[Open Terminal](/features/open-terminal)**: an always-on, isolated Docker container that gives a model a real shell and filesystem. Once connected, it exposes its own set of **built-in tools** (`run_command`, `read_file`, `write_file`, `grep_search`, `glob_search`, process management, and more) that the model can call directly — effectively a sandboxed code-execution and file-handling environment, distinct from the per-message [Code Interpreter](#built-in-system-tools-nativeagentic-mode) tool. See the [Open Terminal documentation](/features/open-terminal) for setup, multi-user, and security considerations.
+:::
+
 ---
 
 ## How to Install & Manage Workspace Tools
@@ -229,8 +234,10 @@ Default Mode is **not** a supported workaround even for DeepSeek — it is legac
 | `search_knowledge_bases` | Text search over KB names/descriptions. |
 | `query_knowledge_files` | Search file contents via the RAG retrieval pipeline (hybrid + rerank when enabled). Main tool for finding answers in docs. |
 | `search_knowledge_files` | Search files by filename. |
-| `view_file` | Read a user-accessible file by ID with pagination (`offset`, `max_chars`). |
+| `grep_knowledge_files` | Exact text / regex search across knowledge file content. Returns matching lines with line numbers. Complements `query_knowledge_files` (semantic) when you need literal matches. |
+| `view_file` | Read a user-accessible file by ID with character pagination (`offset`, `max_chars`) or line range (`start_line`, `end_line`, optional `line_numbers`). |
 | `view_knowledge_file` | Read a knowledge-base file by ID with pagination (`offset`, `max_chars`). |
+| `kb_exec` *(opt-in)* | Filesystem-style command interface for knowledge bases (`ls`, `tree`, `cat`, `head`, `tail`, `sed`, `grep`, `find`, `wc`, `stat`, with pipe support). Directory-aware: `ls docs/`, `tree`, `grep "x" docs/`, and path-based file refs (`docs/api/auth.md`). Replaces the discovery/read tools above when [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set. |
 | **Image Gen** | *Requires image generation enabled (per-tool) AND per-chat "Image Generation" toggle enabled.* |
 | `generate_image` | Generates a new image based on a prompt. Requires `ENABLE_IMAGE_GENERATION`. |
 | `edit_image` | Edits existing images based on a prompt and image URLs. Requires `ENABLE_IMAGE_EDIT`. |
@@ -287,12 +294,17 @@ Use this quick matrix instead of memorizing per-row caveats.
 | `query_knowledge_bases` | ❌ | ✅ |
 | `search_knowledge_files` | ✅ (auto-scoped) | ✅ (all accessible KBs) |
 | `query_knowledge_files` | ✅ (auto-scoped) | ✅ |
+| `grep_knowledge_files` | ✅ (auto-scoped) | ✅ |
 | `view_file` | ✅ (when attached items include files/collections) | ❌ |
 | `view_knowledge_file` | ✅ (when attached items include files/collections) | ✅ |
 | `view_note` | ✅ (when attached items include notes) | ❌ |
 
 Quick rule: `list_knowledge` and `list_knowledge_bases` are mutually exclusive.
 
+:::info `kb_exec` replaces the matrix when enabled
+When [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set, Open WebUI injects `kb_exec` instead of the file-oriented tools listed above. Still injected alongside it: `query_knowledge_files` (always), `view_note` (when notes are attached), and `query_knowledge_bases` + `search_knowledge_bases` (when no KB is attached). The model interacts with files through familiar shell commands. See the [Knowledge feature page](/features/workspace/knowledge#filesystem-style-access-kb_exec) for details.
+:::
+
 #### Tool Reference
 
 | Tool | Parameters | Output |
@@ -307,8 +319,10 @@ Quick rule: `list_knowledge` and `list_knowledge_bases` are mutually exclusive.
 | `search_knowledge_bases` | `query` (required), `count` (default: 5), `skip` (default: 0) | Array of `{id, name, description, file_count}` |
 | `query_knowledge_files` | `query` (required), `knowledge_ids` (optional), `count` (default: 5) | Array of chunks like `{content, source, file_id, distance?}`; note hits include `{note_id, type: "note"}` |
 | `search_knowledge_files` | `query` (required), `knowledge_id` (optional), `count` (default: 5), `skip` (default: 0) | Array of `{id, filename, knowledge_id, knowledge_name}` |
-| `view_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000) | `{id, filename, content, updated_at, created_at}` — includes `truncated`, `total_chars`, `next_offset` when paginated |
+| `grep_knowledge_files` | `pattern` (required; regex auto-detected), `file_id` (optional — single-file mode), `case_insensitive` (default: false), `count_only` (default: false) | Matching lines with file IDs, filenames, and 1-indexed line numbers (capped at 50 matches) |
+| `view_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000), `line_numbers` (default: false), `start_line` / `end_line` (optional — line-based addressing overrides `offset`/`max_chars`) | `{id, filename, content, updated_at, created_at}` — includes `truncated`, `total_chars`, `next_offset` when paginated, or `total_lines`, `showing_lines`, `next_start_line` in line mode |
 | `view_knowledge_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000) | `{id, filename, content, knowledge_id, knowledge_name}` — includes pagination metadata when truncated |
+| `kb_exec` | `command` (required) — filesystem-style command: `ls` (root) / `ls <dir>/` / `ls -a` (flat with paths), `tree` / `tree <dir>/`, `cat -n <file>`, `head -N <file>`, `tail -N <file>`, `sed -n '<a>,<b>p' <file>`, `grep [-i\|-l\|-c] "<pattern>" [<dir>/\|<file>\|*.ext]`, `find [<dir>/] "<glob>"`, `wc <file>`, `stat <file>`; supports pipes (`grep "auth" \| head -5`); files referenced by path (`docs/api/auth.md`), filename, or file ID | Plain text command output (matches/listing/tree/file content as appropriate) |
 | **Image Gen** | | |
 | `generate_image` | `prompt` (required) | `{status, message, images}` — auto-displayed |
 | `edit_image` | `prompt` (required), `image_urls` (required) | `{status, message, images}` — auto-displayed |
@@ -443,7 +457,7 @@ When the **Builtin Tools** capability is enabled, you can further control which
 | **Memory** | `search_memories`, `add_memory`, `replace_memory_content`, `delete_memory`, `list_memories` | Search and manage user memories |
 | **Chat History** | `search_chats`, `view_chat` | Search and view user chat history |
 | **Notes** | `search_notes`, `view_note`, `write_note`, `replace_note_content` | Search, view, and manage user notes |
-| **Knowledge Base** | `list_knowledge`, `list_knowledge_bases`, `search_knowledge_bases`, `query_knowledge_bases`, `search_knowledge_files`, `query_knowledge_files`, `view_file`, `view_knowledge_file` | Browse and query knowledge bases |
+| **Knowledge Base** | `list_knowledge`, `list_knowledge_bases`, `search_knowledge_bases`, `query_knowledge_bases`, `search_knowledge_files`, `query_knowledge_files`, `grep_knowledge_files`, `view_file`, `view_knowledge_file` (or `kb_exec` + `query_knowledge_files` + `view_note`/`query_knowledge_bases`/`search_knowledge_bases` as applicable when [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set) | Browse and query knowledge bases |
 | **Web Search** | `search_web`, `fetch_url` | Search the web and fetch URL content |
 | **Image Generation** | `generate_image`, `edit_image` | Generate and edit images |
 | **Code Interpreter** | `execute_code` | Execute code in a sandboxed environment |
diff --git a/docs/features/workspace/knowledge.md b/docs/features/workspace/knowledge.md
index 1ba316d90..a2365cf68 100644
--- a/docs/features/workspace/knowledge.md
+++ b/docs/features/workspace/knowledge.md
@@ -42,6 +42,8 @@ Attach specific knowledge bases to a model so it only searches what's relevant.
 | 📑 **5 extraction engines** | Tika, Docling, Azure, Mistral OCR, custom loaders |
 | 🤖 **Agentic retrieval** | Models browse, search, and read your documents autonomously |
 | 📄 **Full context mode** | Inject entire documents with no chunking |
+| 🗂️ **Nested directories** | Organize files into subdirectories with drag-and-drop reordering |
+| 🔄 **Incremental directory sync** | Mirror a local folder into the KB: only new and modified files upload, files deleted locally are removed, and the folder structure is preserved |
 | 📦 **Export and API** | Back up knowledge bases as zip files, manage via REST API |
 
 ---
@@ -76,12 +78,93 @@ With [native function calling](/features/extensibility/plugin/tools#tool-calling
 | `query_knowledge_bases` | ❌ | ✅ | Search KB names/descriptions by semantic similarity |
 | `search_knowledge_files` | ✅ (scoped) | ✅ (all) | Search files by filename |
 | `query_knowledge_files` | ✅ (scoped) | ✅ | Search file contents using the RAG pipeline |
-| `view_file` | ✅ | ❌ | Read file content with pagination (default 10K chars, cap 100K) |
+| `grep_knowledge_files` | ✅ (scoped) | ✅ | Exact text / regex search across knowledge files (returns matching lines with line numbers; auto-detects regex like `error\|warn`) |
+| `view_file` | ✅ | ❌ | Read file content with pagination (`offset`/`max_chars`) or by line range (`start_line`/`end_line`, optional `line_numbers`) |
 | `view_knowledge_file` | ✅ | ✅ | Read file content from any accessible KB |
 | `view_note` | ✅ | ❌ | Read attached notes |
 
 The key split: `list_knowledge` and `list_knowledge_bases` are mutually exclusive. Attaching a KB scopes the model to only those documents. Leaving it unscoped lets the model discover everything the user has access to.
 
+#### When to prefer `grep_knowledge_files` over `query_knowledge_files`
+
+The two search tools complement each other:
+
+| | `query_knowledge_files` | `grep_knowledge_files` |
+|---|---|---|
+| **How it matches** | Semantic / vector retrieval (with optional BM25 + rerank when [`ENABLE_RAG_HYBRID_SEARCH`](/reference/env-configuration#enable_rag_hybrid_search) is on) | Exact string match — regex auto-detected (e.g. `error\|warn`, `version \d+`) |
+| **Returns** | Relevant chunks of content | Matching lines with file ID, filename, and 1-indexed line number |
+| **Use when** | "What does the documentation say about X?" — paraphrased questions, conceptual lookups | "Find every place we mention `OPENAI_API_KEY`" — literal identifiers, error strings, version numbers |
+| **Result cap** | Top K (default 5) | 50 matches |
+| **Flags** | — | `case_insensitive`, `count_only`, `file_id` (single-file mode) |
+
+In agentic flows, a typical pattern is: `query_knowledge_files` to locate the relevant document, then `grep_knowledge_files` to pinpoint exact lines, then `view_file` (line-range mode below) to read the surrounding context.
+
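To make the output shape of a literal search concrete, here is a small stand-alone sketch of grep-style matching over in-memory text. It is an illustration of the behavior described in the table (matching lines, 1-indexed line numbers, a 50-match cap), not the tool's implementation: `grep_files` is a hypothetical helper, the result keys are assumptions, and the pattern is always treated as a regex rather than auto-detected.

```python
import re

MAX_MATCHES = 50  # result cap quoted in the comparison table


def grep_files(files: dict[str, str], pattern: str, case_insensitive: bool = False) -> list[dict]:
    """Return matching lines with filename and 1-indexed line number."""
    regex = re.compile(pattern, re.IGNORECASE if case_insensitive else 0)
    matches = []
    for filename, text in files.items():
        for line_number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                matches.append({"filename": filename, "line_number": line_number, "line": line})
                if len(matches) >= MAX_MATCHES:
                    return matches  # stop once the cap is reached
    return matches
```

The line numbers returned by a search like this are what you would pass on as `start_line` / `end_line` when reading the surrounding context.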
+#### Reading with `view_file`
+
+`view_file` supports two addressing modes:
+
+- **Character pagination** — `offset` + `max_chars` (default `10000`, hard cap `100000`). Best for streaming through a long document; the response includes `next_offset` when the file is truncated.
+- **Line range** — `start_line` + optional `end_line` (1-indexed, inclusive). Overrides `offset`/`max_chars` when set; pairs naturally with `grep_knowledge_files`' line numbers. Pass `line_numbers: true` to also get a `<n>: <line>` prefix on each returned line.
+
+The line-range response includes `total_lines`, `showing_lines`, and `next_start_line` for follow-up reads.
+
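Character pagination can be pictured as a loop that follows `next_offset` until the response is no longer truncated. The sketch below assumes `truncated` is a boolean flag that is present only on truncated pages; `read_whole_file` is a hypothetical helper, and `fake_view_file` is a stand-in that serves slices of an in-memory string so the example runs without Open WebUI.

```python
from typing import Callable


def read_whole_file(view_file: Callable[..., dict], file_id: str, max_chars: int = 10000) -> str:
    """Concatenate pages by following `next_offset` until truncation stops."""
    parts = []
    offset = 0
    while True:
        page = view_file(file_id=file_id, offset=offset, max_chars=max_chars)
        parts.append(page["content"])
        if not page.get("truncated"):
            break
        offset = page["next_offset"]
    return "".join(parts)


def fake_view_file(file_id: str, offset: int = 0, max_chars: int = 10000) -> dict:
    """Stand-in for the real tool: pages through a fixed 30-character string."""
    text = "abcdefghij" * 3
    end = min(offset + max_chars, len(text))
    page = {"id": file_id, "content": text[offset:end]}
    if end < len(text):
        page.update(truncated=True, total_chars=len(text), next_offset=end)
    return page
```

Calling `read_whole_file(fake_view_file, "f1", max_chars=7)` walks the stand-in in five pages and returns the full string.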
+### Filesystem-style access (`kb_exec`)
+
+When [`ENABLE_KB_EXEC=True`](/reference/env-configuration#enable_kb_exec) is set, Open WebUI exposes a `kb_exec` tool that gives the model a filesystem-style interface over knowledge bases.
+
+**Tools that go away**, because their function is now covered by `kb_exec` commands:
+
+- `list_knowledge` — replaced by `ls`
+- `search_knowledge_files` — replaced by `find "<glob>"`
+- `grep_knowledge_files` — replaced by `grep "<pattern>"`
+- `view_file` and `view_knowledge_file` — replaced by `cat`, `head`, `tail`, `sed -n '<a>,<b>p'`
+
+**Tools that stay injected alongside `kb_exec`**, because they do something `kb_exec` can't:
+
+- **`query_knowledge_files`** — semantic / RAG search (always)
+- **`view_note`** — when notes are attached to the model (`kb_exec` is file-only, so notes need a dedicated reader)
+- **`query_knowledge_bases`** and **`search_knowledge_bases`** — when no KB is attached to the model, so the model can still discover and search across knowledge bases by name/description
+
+`kb_exec` is experimental and **off by default**. It targets frontier models that already "think in shell": they tend to chain `ls`, `grep`, and `cat` more reliably than they orchestrate a fan-out of specialized tools.
+
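The injection rules listed above can be restated as a small decision function. This is only a summary of the documented behavior in code form; `knowledge_tools_with_kb_exec` and its two boolean parameters are hypothetical names, not part of Open WebUI.

```python
def knowledge_tools_with_kb_exec(notes_attached: bool, kb_attached: bool) -> list[str]:
    """List the knowledge tools injected when ENABLE_KB_EXEC is set."""
    # kb_exec and semantic search are always present.
    tools = ["kb_exec", "query_knowledge_files"]
    if notes_attached:
        # kb_exec is file-only, so notes keep their dedicated reader.
        tools.append("view_note")
    if not kb_attached:
        # With no KB attached, the model still needs KB discovery tools.
        tools += ["query_knowledge_bases", "search_knowledge_bases"]
    return tools
```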
+**Supported commands**
+
+| Command | Purpose |
+|---------|---------|
+| `ls`, `ls <dir>/`, `ls -a` | List the current level / a subdirectory / a flat view of every file with full paths |
+| `tree`, `tree <dir>/` | Recursive directory tree |
+| `cat -n <file>` | Read a file (optionally with line numbers) |
+| `head -N <file>` / `tail -N <file>` | First or last N lines |
+| `sed -n '<a>,<b>p' <file>` | Print lines `<a>` through `<b>` |
+| `grep "<pattern>" [<dir>/\|<file>\|*.ext]` | Exact / regex search; flags `-i` (case-insensitive), `-l` (filenames only), `-c` (counts) |
+| `find [<dir>/] "<glob>"` | Find files by glob |
+| `wc <file>` | Line / word / char counts |
+| `stat <file>` | File metadata |
+
+**Pipes**
+
+`kb_exec` parses a single pipeline, so commands compose:
+
+```text
+grep "auth" | head -5
+grep -l "TODO" docs/
+find docs/ "*.md" | head -10
+```
+
+**File references**
+
+Files can be addressed three ways — pick whichever is unambiguous:
+
+- **Path** — `docs/api/auth.md` (relative to the knowledge base root; resolves through the directory tree)
+- **Filename** — `auth.md` (errors with an "ambiguous filename" hint when the same name exists in multiple directories or KBs)
+- **File ID** — the UUID returned by `ls`, `find`, or `grep`
+
+**Behavior notes**
+
+- `kb_exec` respects the same access control as the other knowledge tools — files the user can't read are silently excluded from results.
+- The model still has `query_knowledge_files` for semantic search; reach for it when literal commands won't find a paraphrased concept.
+- `kb_exec` is built on top of the directory model and is the only tool that fully reflects the directory structure created in the UI.
+
 Autonomous exploration works best with frontier models that can intelligently chain search, browse, and synthesize. Smaller models may struggle with multi-step retrieval. Administrators can disable the **Knowledge Base** tool category per-model in **Workspace > Models > Edit > Builtin Tools**.
 
 For the full list of built-in agentic tools, see the [Native/Agentic Mode Tools Guide](/features/extensibility/plugin/tools#built-in-system-tools-nativeagentic-mode).
@@ -104,6 +187,54 @@ When native function calling is enabled, attached knowledge is **not automatical
 3. Upload files or add existing documents.
 4. Attach the knowledge base to a model in **Workspace > Models > Edit**, or reference it in chat with `#`.
 
+### Organizing into directories
+
+Knowledge bases support nested **directories** so larger document sets stay navigable. Create them from the **Add Content** menu (**+ New Directory**), then reorganize freely.
+
+**Creating and navigating**
+
+- **+ New Directory** lives next to file upload in the **Add Content** menu. Name uniqueness is enforced per parent — two siblings can't share a name, but you can reuse names in different parents.
+- Click a directory to descend into it; the **breadcrumb trail** at the top of the view always reflects the current path and lets you jump back to any ancestor in one click.
+- Directories can be **renamed** or **moved to a different parent** without affecting the files inside them.
+
+**Drag-and-drop**
+
+You can move items by dragging:
+
+- **Files** onto a directory row, into the empty area of an open directory, or onto any breadcrumb (including the root breadcrumb, which sends a file back to the top level).
+- **Directories** onto another directory to nest them, or onto a breadcrumb to move them up the tree. Moving a directory into itself or one of its descendants is blocked server-side.
+
+**Deletion semantics**
+
+Deleting a non-empty directory prompts you to choose what happens to its contents:
+
+- **Move files to parent** (default) — the directory is removed but its files and subdirectories are re-parented one level up.
+- **Delete everything** — the directory and all files/subdirectories underneath it are permanently removed.
+
+**Effect on retrieval and tools**
+
+- **Retrieval and standard RAG** still span the entire knowledge base. Directories don't shard the vector index; chunks from any subdirectory remain reachable in a single search.
+- **Agentic tools** are directory-aware:
+  - `kb_exec` (when enabled) treats subdirectories like a filesystem: `ls docs/`, `tree`, `grep "x" docs/`, and path-style refs (`docs/api/auth.md`) all work. See [Filesystem-Style Access (`kb_exec`)](#filesystem-style-access-kb_exec).
+  - The other knowledge tools (`query_knowledge_files`, `grep_knowledge_files`, `search_knowledge_files`) ignore directory boundaries and return matches from the whole KB.
+
+### Renaming files
+
+Individual files can be renamed in place from the workspace via the file's item menu — no need to re-upload. The new name is reflected everywhere the file is referenced (knowledge listings, agentic tool output, citations).
+
+### Syncing a local directory
+
+The **Add Content → Sync Directory** action mirrors a local folder into the knowledge base **incrementally**: the client hashes each local file (SHA-256), the server compares hashes and paths against what is already stored, and only **new**, **modified**, and **deleted** files are touched. Unmodified files (the typical majority) are left alone — no re-upload, no re-embedding. The local folder's subdirectory structure is mirrored in the KB; missing subdirectories are created, and subdirectories that no longer exist locally are removed.
+
+Behavior to be aware of:
+
+- Hidden files and folders (anything beginning with `.`) are skipped.
+- Files modified locally upload with a new content hash; the old file entry is removed from the KB and replaced.
+- Files removed locally are deleted from the KB during the cleanup step.
+- The action is **non-destructive** for unchanged files. Before v0.9.6, the same menu action wiped and re-uploaded everything; as of v0.9.6 it no longer does.
+
+For programmatic use, the same workflow is exposed as two endpoints under [API access](#api-access) below.
+
 ### Exporting
 
 Admins can export an entire knowledge base as a zip file via the item menu (three dots) > **Export**. Files are converted to `.txt` for universal compatibility. Regular users will not see the Export option.
@@ -112,9 +243,25 @@ Admins can export an entire knowledge base as a zip file via the item menu (thre
 
 Knowledge bases can be managed programmatically:
 
-- `POST /api/v1/files/` - Upload files
-- `GET /api/v1/files/{id}/process/status` - Check processing status
-- `POST /api/v1/knowledge/{id}/file/add` - Add files to a knowledge base
+**Files**
+
+- `POST /api/v1/files/` — Upload files. Pass `knowledge_id` (and optionally `directory_id`) in the upload metadata to have the backend **auto-link and process the file into that knowledge base server-side** — equivalent to a follow-up `POST /api/v1/knowledge/{id}/file/add`, but it does not depend on the client staying connected after upload. This is the recommended single-call path (added in v0.9.6, fixing files left unlinked when the uploader disconnected mid-processing). The server SHA-256-hashes the uploaded bytes into `file.meta.file_hash`; clients can pre-compute and send `file_hash` in metadata to skip server-side hashing (used by the incremental sync flow below).
+- `GET /api/v1/files/{id}/process/status` — Check processing status
+- `POST /api/v1/files/{id}/rename` — Rename a file
+- `POST /api/v1/knowledge/{id}/file/add` — Add files to a knowledge base
+- `POST /api/v1/knowledge/{id}/file/move` — Move a file between directories within the same KB (body: `file_id`, `directory_id` — `null` moves to the KB root)
+
+**Directories**
+
+- `POST /api/v1/knowledge/{id}/dirs/create` — Create a directory (body: `name`, optional `parent_id`)
+- `POST /api/v1/knowledge/{id}/dirs/{dir_id}/update` — Rename or re-parent a directory (body: `name` and/or `parent_id`)
+- `DELETE /api/v1/knowledge/{id}/dirs/{dir_id}/delete?move_files=true` — Delete a directory. With `move_files=true` (default), contained files are re-parented; with `move_files=false`, they're deleted along with the directory.
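+
+As a rough illustration, a request body for `dirs/create` could look like the sketch below. The field names come from the list above; the directory name and the parent ID are hypothetical placeholders, and the ID format is an assumption.
+
+```json
+{
+  "name": "api",
+  "parent_id": "<parent-directory-id>"
+}
+```
+
+Omitting `parent_id` is presumably how a directory is created at the KB root, since the field is listed as optional.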
+
+**Incremental directory sync** (added in v0.9.6)
+
+- `POST /api/v1/knowledge/{id}/sync/diff` — Submit a local manifest (`manifest: [{path, filename, checksum}]` where `checksum` is the SHA-256 of the file bytes) and receive `{added, modified, deleted, mkdir, rmdir, unmodified_count}` describing exactly what to upload, replace, delete, and which directories to create/remove. Read-only — does not mutate the KB.
+- After acting on the diff (create `mkdir` paths, upload `added` + `modified` files with their hashes via `POST /api/v1/files/`), call:
+- `POST /api/v1/knowledge/{id}/sync/cleanup` — Body: `{file_ids: [...], dir_ids: [...]}`. Removes the stale files (from the KB, vector store, and per-file collections) and the now-empty directories returned by `sync/diff`. Run this last so deletions don't outrun uploads.
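+
+To make the manifest shape concrete, here is a minimal sketch of a `sync/diff` request body with a single entry. The keys (`manifest`, `path`, `filename`, `checksum`) come from the description above; the file path is a placeholder, and both the hex encoding of the checksum and whether `path` includes the filename are assumptions.
+
+```json
+{
+  "manifest": [
+    {
+      "path": "docs/api/auth.md",
+      "filename": "auth.md",
+      "checksum": "<sha256-hex-of-file-bytes>"
+    }
+  ]
+}
+```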
 
 File processing happens asynchronously. You must poll the status endpoint until processing completes before adding files to a knowledge base, or you'll get an "empty content" error. See [API Endpoints](/reference/api-endpoints#-retrieval-augmented-generation-rag) for workflow examples.
 
@@ -144,7 +291,7 @@ Add dozens of papers to a knowledge base. The AI searches across all of them to
 
 ### Processing delay for API uploads
 
-Files uploaded via API are processed asynchronously. Attempting to use a file before processing completes will fail silently or return empty results.
+Files uploaded via API are processed asynchronously. Attempting to use a file before processing completes will fail silently or return empty results. Uploading with a `knowledge_id` (above) moves linking to the server and makes it robust to client disconnects, but it does **not** make the content instantly queryable: extraction and embedding still run in the background, so poll `GET /api/v1/files/{id}/process/status` before relying on retrieval.
 
 ### Native function calling changes behavior
 
diff --git a/docs/getting-started/advanced-topics/development.md b/docs/getting-started/advanced-topics/development.md
index 0dfeba762..eba4bd84e 100644
--- a/docs/getting-started/advanced-topics/development.md
+++ b/docs/getting-started/advanced-topics/development.md
@@ -19,10 +19,17 @@ You can test the latest changes by running the [dev Docker image](/getting-start
 
 | Requirement | Version |
 |-------------|---------|
-| **Python** | 3.11+ |
+| **Python** | 3.11 or 3.12 (see note below; 3.13 not supported yet) |
 | **Node.js** | 22.10+ |
 | **Git** | Any recent version |
 
+:::info Python version compatibility
+Open WebUI supports **Python 3.11 and 3.12**. **3.13 is not supported yet** — a few of our dependencies still need to ship 3.13-compatible releases, and until they do, installs on 3.13 will fail or break at runtime.
+
+- **For production**, use the [Docker image](/getting-started/quick-start) or the **latest Python 3.11**. This is the combination we test against most heavily.
+- **3.12 also works**, but we have seen very rare reports of odd behaviour on 3.12 that we have not reproduced on 3.11. If you are running into something inexplicable on 3.12, dropping to the latest 3.11 is the first thing to try.
+:::
+
 :::warning Separate your data
 Never share your database or data directory between dev and production. Dev builds may include database migrations that are not backward-compatible.
 :::
diff --git a/docs/getting-started/advanced-topics/hardening.md b/docs/getting-started/advanced-topics/hardening.md
index 9dc6e3751..7b41f30f9 100644
--- a/docs/getting-started/advanced-topics/hardening.md
+++ b/docs/getting-started/advanced-topics/hardening.md
@@ -551,6 +551,10 @@ Outbound HTTP requests also do not follow `3xx` redirects by default. Without th
 AIOHTTP_CLIENT_ALLOW_REDIRECTS=false
 ```
 
+:::note Playwright loader (v0.9.6+)
+Earlier versions applied URL validation and the redirect gate only to the default web loader; the Playwright-based loader (`WEB_LOADER_ENGINE=playwright` / the `playwright` Docker variant) could navigate and follow redirects to internal or blocklisted URLs unchecked. As of v0.9.6 the Playwright path enforces the same `validate_url()` and redirect rules as the default loader, so the SSRF controls above apply regardless of which web loader engine you run. If you use Playwright, ensure you are on v0.9.6 or later.
+:::
+
 ### Profile image URL forwarding
 
 The user and model profile-image endpoints can issue a `302 Found` redirect to whatever origin is stored in `profile_image_url` so that externally-hosted avatars (e.g. Gravatar via an upstream identity provider) display in the UI. That redirect causes the user's browser to make a request directly to the external origin, leaking client IP, User-Agent, and Referer headers — and an account whose `profile_image_url` was set to an attacker-controlled host can use that to deanonymize anyone who renders their avatar.
diff --git a/docs/getting-started/advanced-topics/scaling.md b/docs/getting-started/advanced-topics/scaling.md
index 829918525..3d778ded5 100644
--- a/docs/getting-started/advanced-topics/scaling.md
+++ b/docs/getting-started/advanced-topics/scaling.md
@@ -109,6 +109,7 @@ ENABLE_WEBSOCKET_SUPPORT=true
 - If you're using Redis Sentinel for high availability, also set `REDIS_SENTINEL_HOSTS` and consider setting `REDIS_SOCKET_CONNECT_TIMEOUT=5` to prevent hangs during failover.
 - For AWS Elasticache or other managed Redis Cluster services, set `REDIS_CLUSTER=true`.
 - Make sure your Redis server has `timeout 1800` and a high enough `maxclients` (10000+) to prevent connection exhaustion over time.
+- For high-concurrency WebSocket streaming, review Redis Pub/Sub output buffer limits. Large Socket.IO events can disconnect Pub/Sub clients if Redis uses small default buffers; see [WebSocket Pub/Sub Buffer Limits](/tutorials/integrations/redis#websocket-pubsub-buffer-limits).
 - A **single Redis instance** is sufficient for the vast majority of deployments, even with thousands of users. You almost certainly do not need Redis Cluster unless you have specific HA/bandwidth requirements. If you think you need Redis Cluster, first check whether your connection count and memory usage are caused by fixable configuration issues (see [Common Anti-Patterns](/troubleshooting/performance#%EF%B8%8F-common-anti-patterns)).
 - Without Redis in a multi-instance setup, you will experience [WebSocket 403 errors](/troubleshooting/multi-replica#2-websocket-403-errors--connection-failures), [configuration sync issues](/troubleshooting/multi-replica#3-model-not-found-or-configuration-mismatch), and intermittent authentication failures.
 
@@ -385,8 +386,19 @@ UVICORN_WORKERS=1
 
 # Migrations (set to false on all but one instance)
 ENABLE_DB_MIGRATIONS=false
+
+# Concurrency & DB write throttling (REQUIRED at scale — see note below)
+THREAD_POOL_SIZE=2000
+DATABASE_USER_ACTIVE_STATUS_UPDATE_INTERVAL=300
 ```
 
+:::warning Two settings people forget — and then their scaled deployment stalls
+- **`THREAD_POOL_SIZE=2000`** — Open WebUI offloads blocking work (DB calls, file I/O, sync handlers) to a thread pool whose default concurrency ceiling is only **40**. At scale, once 40 blocking operations are in flight every further request **queues**, and the whole app appears to freeze even though CPU/RAM look fine. `2000` is a *lower* bound for large instances; it is a concurrency ceiling, **not** a CPU/thread count, so a high value is not a contention risk. Never lower it. (The only exception is genuinely tiny hardware, which is not a "scaled deployment".)
+- **`DATABASE_USER_ACTIVE_STATUS_UPDATE_INTERVAL=300`** — presence tracking writes each user's `last_active_at` to the database. **Unset (the default) means this write is unthrottled — roughly one `UPDATE` + `COMMIT` per authenticated request.** At scale that is a continuous flood of tiny write transactions that saturates the connection pool for no functional gain. Set it to `300`–`500` seconds; it is mandatory for large/production deployments and free performance everywhere else.
+
+Both are read once at startup and are not configurable from the Admin UI. See [Performance → Database Optimization](/troubleshooting/performance#-database-optimization) and [Performance → High-Concurrency](/troubleshooting/performance#-high-concurrency--network-optimization).
+:::
+
 ### Security defaults to revisit at scale
 
 A few defaults that are reasonable for single-user evaluation become less so once you put the deployment behind SSO and serve real users. The full discussion lives in the [Hardening guide](/getting-started/advanced-topics/hardening); the items most often missed in enterprise rollouts:
diff --git a/docs/getting-started/essentials.mdx b/docs/getting-started/essentials.mdx
index 677ae05b8..f7bcf4b0e 100644
--- a/docs/getting-started/essentials.mdx
+++ b/docs/getting-started/essentials.mdx
@@ -219,14 +219,14 @@ If you just want RAG to work well out of the box, these settings are a solid gen
 
 Set these in **Admin Panel > Settings > Documents**:
 
-| Setting | Recommended value | Default | Why |
-|---------|-------------------|---------|-----|
-| **Text Splitter** | `token` | `character` | Token-based splitting produces more consistent chunk sizes across document types |
-| **Markdown Header Splitting** | **On** | On | Respects document structure by splitting at headings, keeping sections coherent |
-| **Chunk Size** | `2000` | `1000` | Larger chunks preserve more surrounding context per retrieval hit |
-| **Chunk Overlap** | `200` | `100` | More overlap means less chance of cutting a key sentence in half |
-| **Top K** | `15` | `3` | Retrieves more candidate chunks, giving the model a wider pool of relevant context. If you are working with local models that have constrained context sizes, lower this to `5` to avoid filling the context window with retrieved chunks |
-| **Embedding Model** | External (OpenAI or Ollama) | `all-MiniLM-L6-v2` (local CPU) | The default works for a single user but consumes ~500 MB RAM per worker. For any multi-user setup, use an external embedding API instead |
+| Setting | Default | Recommended value | Why |
+|---------|---------|-------------------|-----|
+| **Text Splitter** | `character` | `token` | Token-based splitting produces more consistent chunk sizes across document types |
+| **Markdown Header Splitting** | On | **On** | Respects document structure by splitting at headings, keeping sections coherent |
+| **Chunk Size** | `1000` | `2000` | Larger chunks preserve more surrounding context per retrieval hit |
+| **Chunk Overlap** | `100` | `200` | More overlap means less chance of cutting a key sentence in half |
+| **Top K** | `3` | `15` | Retrieves more candidate chunks, giving the model a wider pool of relevant context. If you are working with local models that have constrained context sizes, lower this to `5` to avoid filling the context window with retrieved chunks |
+| **Embedding Model** | `all-MiniLM-L6-v2` (local CPU) | External (OpenAI or Ollama) | The default works for a single user but consumes ~500 MB RAM per worker. For any multi-user setup, use an external embedding API instead |
 
 :::tip Embedding model
 The default SentenceTransformers model runs locally on CPU and is fine for a single user getting started. For anything beyond that, point at an external embeddings API: set `RAG_EMBEDDING_ENGINE=openai` with an OpenAI API key, or `RAG_EMBEDDING_ENGINE=ollama` with any Ollama embedding model (e.g., `nomic-embed-text`). This offloads the work and frees significant RAM.
diff --git a/docs/getting-started/quick-start/index.mdx b/docs/getting-started/quick-start/index.mdx
index d4f5d32d0..81f4064b6 100644
--- a/docs/getting-started/quick-start/index.mdx
+++ b/docs/getting-started/quick-start/index.mdx
@@ -22,6 +22,7 @@ import Pip from './tab-python/Pip.md';
 import Uv from './tab-python/Uv.md';
 import Conda from './tab-python/Conda.md';
 import PythonUpdating from './tab-python/PythonUpdating.md';
+import PythonCompat from './tab-python/_PythonCompat.md';
 
 # Quick Start
 
@@ -87,6 +88,7 @@ Open WebUI works on **macOS, Linux** (x86_64 and ARM64, including Raspberry Pi a
 
   </TabItem>
   <TabItem value="python" label="Python">
+    <PythonCompat />
     <Tabs>
       <TabItem value="pip" label="pip">
         <div className='mt-5'>
diff --git a/docs/getting-started/quick-start/tab-docker/DockerCompose.md b/docs/getting-started/quick-start/tab-docker/DockerCompose.md
index 8b88d3ac4..b7bd492f3 100644
--- a/docs/getting-started/quick-start/tab-docker/DockerCompose.md
+++ b/docs/getting-started/quick-start/tab-docker/DockerCompose.md
@@ -56,9 +56,15 @@ To start your services, run the following command:
 docker compose up -d
 ```
 
-## Helper Script
+## Helper Scripts
 
-A useful helper script called `run-compose.sh` is included with the codebase. This script assists in choosing which Docker Compose files to include in your deployment, streamlining the setup process.
+A set of helper scripts is included with the codebase to streamline common Docker workflows:
+
+- `docker-compose-launcher.sh` — Interactive Compose launcher with GPU auto-detection, configurable WebUI/API ports, host data mounts, and optional Playwright support. Run `./docker-compose-launcher.sh --help` for the full list of flags. Use `--drop` to tear down the project.
+- `docker-cleanup.sh` — Stops the Compose project and **deletes all volumes**, including persistent data. Prompts for confirmation before destroying data.
+- `docker-run.sh` — Builds the Open WebUI image and runs a single container, exposing it on `OPEN_WEBUI_PORT` (default `3000`).
+- `docker-ollama.sh` — Pulls and runs the official Ollama container with optional GPU passthrough, exposing it on `OLLAMA_PORT` (default `11434`).
+- `docker-update-models.sh` — Iterates through every model installed in the Ollama container and pulls the latest version.
 
 ---
 
diff --git a/docs/getting-started/quick-start/tab-docker/ManualDocker.md b/docs/getting-started/quick-start/tab-docker/ManualDocker.md
index b944625d4..8825dedf2 100644
--- a/docs/getting-started/quick-start/tab-docker/ManualDocker.md
+++ b/docs/getting-started/quick-start/tab-docker/ManualDocker.md
@@ -49,9 +49,9 @@ Visit [http://localhost:3000](http://localhost:3000).
 For production environments, pin a specific version instead of using floating tags:
 
 ```bash
-docker pull ghcr.io/open-webui/open-webui:v0.9.5
-docker pull ghcr.io/open-webui/open-webui:v0.9.5-cuda
-docker pull ghcr.io/open-webui/open-webui:v0.9.5-ollama
+docker pull ghcr.io/open-webui/open-webui:v0.9.6
+docker pull ghcr.io/open-webui/open-webui:v0.9.6-cuda
+docker pull ghcr.io/open-webui/open-webui:v0.9.6-ollama
 ```
 
 ---
diff --git a/docs/getting-started/quick-start/tab-python/_PythonCompat.md b/docs/getting-started/quick-start/tab-python/_PythonCompat.md
new file mode 100644
index 000000000..80f68c9a1
--- /dev/null
+++ b/docs/getting-started/quick-start/tab-python/_PythonCompat.md
@@ -0,0 +1,6 @@
+:::info Python version compatibility
+Open WebUI supports **Python 3.11 and 3.12**. **Python 3.13 is not supported yet** — a handful of our dependencies still need to ship 3.13-compatible releases, and until they do, installs on 3.13 will fail or break at runtime.
+
+- **For production**, run the [Docker image](#docker) or use the **latest Python 3.11**. This is the combination we test against most heavily.
+- **Python 3.12 also works**, but we have seen very rare reports of odd behaviour on 3.12 that we have not reproduced on 3.11. If something inexplicable happens on 3.12, drop to the latest 3.11 first.
+:::
diff --git a/docs/getting-started/updating.mdx b/docs/getting-started/updating.mdx
index 68a118ccd..7b9000e04 100644
--- a/docs/getting-started/updating.mdx
+++ b/docs/getting-started/updating.mdx
@@ -31,9 +31,9 @@ The `:main` tag always points to the **latest build**. It's convenient but can i
 For stability, pin a specific release tag:
 
 ```
-ghcr.io/open-webui/open-webui:v0.9.5
-ghcr.io/open-webui/open-webui:v0.9.5-cuda
-ghcr.io/open-webui/open-webui:v0.9.5-ollama
+ghcr.io/open-webui/open-webui:v0.9.6
+ghcr.io/open-webui/open-webui:v0.9.6-cuda
+ghcr.io/open-webui/open-webui:v0.9.6-ollama
 ```
 
 Browse all available tags on the [GitHub releases page](https://github.com/open-webui/open-webui/releases).
diff --git a/docs/reference/api-endpoints.md b/docs/reference/api-endpoints.md
index 450d51b68..fc3426565 100644
--- a/docs/reference/api-endpoints.md
+++ b/docs/reference/api-endpoints.md
@@ -278,7 +278,7 @@ Even in the non-streaming case, **`outlet()` does not rewrite the HTTP response
   ```
 
 :::tip
-If you need `outlet()` output over HTTP today, call `/api/chat/completions` followed by `/api/chat/completed`. Inline execution on `dev` is primarily for WebUI-shaped clients that read from the WebSocket. For more details on filter behavior, see the [Filter Function documentation](/features/extensibility/plugin/functions/filter#-filter-behavior-with-api-requests).
+If you need `outlet()` output over HTTP today, call `/api/chat/completions` followed by `/api/chat/completed`. Inline execution on `dev` is primarily for WebUI-shaped clients that read from the WebSocket. For more details on filter behavior, see the [Filter Function documentation](/features/extensibility/plugin/functions/filter#filter-behavior-with-api-requests).
 :::
 
 ### 🦙 Ollama API Proxy Support
diff --git a/docs/reference/database-schema.md b/docs/reference/database-schema.md
index 8b5ab256e..464ba831a 100644
--- a/docs/reference/database-schema.md
+++ b/docs/reference/database-schema.md
@@ -10,7 +10,7 @@ This tutorial is a community contribution and is not supported by the Open WebUI
 :::
 
 > [!WARNING]
-> This documentation reflects schema changes up to Open WebUI v0.9.5.
+> This documentation reflects schema changes up to Open WebUI v0.9.6.
 
 ## Open-WebUI Internal SQLite Database
 
diff --git a/docs/reference/env-configuration.mdx b/docs/reference/env-configuration.mdx
index ed6ad2fc9..0403f05c0 100644
--- a/docs/reference/env-configuration.mdx
+++ b/docs/reference/env-configuration.mdx
@@ -12,23 +12,23 @@ As new variables are introduced, this page will be updated to reflect the growin
 
 :::info
 
-This page is up-to-date with Open WebUI release version [v0.9.5](https://github.com/open-webui/open-webui/releases/tag/v0.9.5), but is still a work in progress to later include more accurate descriptions, listing out options available for environment variables, defaults, and improving descriptions.
+This page is up-to-date with Open WebUI release version [v0.9.6](https://github.com/open-webui/open-webui/releases/tag/v0.9.6), but is still a work in progress to later include more accurate descriptions, listing out options available for environment variables, defaults, and improving descriptions.
 
 :::
 
-### Important Note on `PersistentConfig` Environment Variables
+### Important Note on `ConfigVar` Environment Variables
 
 :::note
 
-When launching Open WebUI for the first time, all environment variables are treated equally and can be used to configure the application. However, for environment variables marked as `PersistentConfig`, their values are persisted and stored internally.
+When launching Open WebUI for the first time, all environment variables are treated equally and can be used to configure the application. However, for environment variables marked as `ConfigVar`, their values are persisted and stored internally.
 
-After the initial launch, if you restart the container, `PersistentConfig` environment variables will no longer use the external environment variable values. Instead, they will use the internally stored values.
+After the initial launch, if you restart the container, `ConfigVar` environment variables will no longer use the external environment variable values. Instead, they will use the internally stored values.
 
 In contrast, regular environment variables will continue to be updated and applied on each subsequent restart.
 
-You can update the values of `PersistentConfig` environment variables directly from within Open WebUI, and these changes will be stored internally. This allows you to manage these configuration settings independently of the external environment variables.
+You can update the values of `ConfigVar` environment variables directly from within Open WebUI, and these changes will be stored internally. This allows you to manage these configuration settings independently of the external environment variables.
 
-Please note that `PersistentConfig` environment variables are clearly marked as such in the documentation below, so you can be aware of how they will behave.
+Please note that `ConfigVar` environment variables are clearly marked as such in the documentation below, so you can be aware of how they will behave.
 
 To disable this behavior and force Open WebUI to always use your environment variables (ignoring the database), set `ENABLE_PERSISTENT_CONFIG` to `False`.
 
@@ -44,7 +44,7 @@ If you change an environment variable (like `ENABLE_SIGNUP=True`) but don't see
 Set `ENABLE_PERSISTENT_CONFIG=False` in your environment. This forces Open WebUI to read your variables directly. Note that UI-based settings changes will not persist across restarts in this mode.
 
 #### Option 2: Update via Admin UI (Recommended)
-The simplest and safest way to change `PersistentConfig` settings is directly through the **Admin Panel** within Open WebUI. Even if an environment variable is set, changes made in the UI will take precedence and be saved to the database.
+The simplest and safest way to change `ConfigVar` settings is directly through the **Admin Panel** within Open WebUI. Even if an environment variable is set, changes made in the UI will take precedence and be saved to the database.
 
 #### Option 3: Manual Database Update (Last Resort / Lock-out Recovery)
 If you are locked out or cannot access the UI, you can manually update the SQLite database via Docker:
@@ -78,7 +78,7 @@ environment variables, see our [logging documentation](https://docs.openwebui.co
 - Type: `str`
 - Default: `http://localhost:3000`
 - Description: Specifies the URL where your Open WebUI installation is reachable. Needed for search engine support and OAuth/SSO.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::warning
 
@@ -97,7 +97,7 @@ Failure to set WEBUI_URL before using OAuth/SSO will result in failure to log in
 - Type: `bool`
 - Default: `True`
 - Description: Toggles user account creation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_SIGNUP_PASSWORD_CONFIRMATION`
 
@@ -148,14 +148,14 @@ After the admin account is created, sign-up is automatically disabled for securi
 - Type: `bool`
 - Default: `True`
 - Description: Toggles email, password, sign-in and "or" (only when `ENABLE_OAUTH_SIGNUP` is set to True) elements.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_PASSWORD_CHANGE_FORM`
 
 - Type: `bool`
 - Default: `True`
 - Description: Controls visibility of the password change UI in **Settings > Account**. When set to `False`, users do not see the password update form, which is useful for SSO-focused deployments where password changes should not be presented in the UI.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_PASSWORD_AUTH`
 
@@ -181,14 +181,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b
 - Type: `str`
 - Default: `en`
 - Description: Sets the default locale for the application.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `DEFAULT_MODELS`
 
 - Type: `str`
 - Default: Empty string (' '), since `None`.
 - Description: Sets a default Language Model.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `DEFAULT_PINNED_MODELS`
 
@@ -196,14 +196,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b
 - Default: Empty string (' ')
 - Description: Comma-separated list of model IDs to pin by default for new users who haven't customized their pinned models. This provides a pre-selected set of frequently used models in the model selector for new accounts.
 - Example: `gpt-4,claude-3-opus,llama-3-70b`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `DEFAULT_MODEL_METADATA`
 
 - Type: `dict` (JSON object)
 - Default: `{}`
 - Description: Sets global default metadata (capabilities and other model info) for all models. These defaults act as a baseline — per-model overrides always take precedence. For capabilities, the defaults and per-model values are merged (per-model wins on conflicts). For other metadata fields, the default is only applied if the model has no value set. Configurable via **Admin Settings → Models**.
-- Persistence: This environment variable is a `PersistentConfig` variable. Stored at config key `models.default_metadata`.
+- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `models.default_metadata`.
 
 :::info
 
@@ -220,7 +220,7 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b
 - Type: `dict` (JSON object)
 - Default: `{}`
 - Description: Sets global default parameters (temperature, top_p, max_tokens, seed, etc.) for all models. These defaults are applied as a baseline at chat completion time — per-model parameter overrides always take precedence. Configurable via **Admin Settings → Models**.
-- Persistence: This environment variable is a `PersistentConfig` variable. Stored at config key `models.default_params`.
+- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `models.default_params`.
 
 :::info
 
@@ -240,14 +240,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b
   - `admin` - New users are automatically activated with administrator permissions.
 - Default: `pending`
 - Description: Sets the default role assigned to new users.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `DEFAULT_GROUP_ID`
 
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets the default group ID to assign to new users upon registration.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `DEFAULT_GROUP_SHARE_PERMISSION`
 
@@ -261,63 +261,63 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets a custom title for the pending user overlay.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `PENDING_USER_OVERLAY_CONTENT`
 
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets a custom text content for the pending user overlay.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_CALENDAR`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables the Calendar feature. When enabled, users can create calendars, manage events, and share calendars with other users or groups via access grants. Active automations are automatically surfaced as virtual events on a dedicated "Scheduled Tasks" calendar. Requires the `features.calendar` user permission (admins always pass).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_CHANNELS`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables channel support.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_FOLDERS`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables the folders feature, allowing users to organize their chats into folders in the sidebar.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `FOLDER_MAX_FILE_COUNT`
 
 - Type: `int`
 - Default: `("") empty string`
 - Description: Sets the maximum number of files processing allowed per folder.
-- Persistence: This environment variable is a `PersistentConfig` variable. It can be configured in the **Admin Panel > Settings > General > Folder Max File Count**. Default is none (empty string) which is unlimited.
+- Persistence: This environment variable is a `ConfigVar` variable. It can be configured in the **Admin Panel > Settings > General > Folder Max File Count**. Default is none (empty string) which is unlimited.
 
 #### `ENABLE_AUTOMATIONS`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables the Automations feature globally. When disabled, the scheduler skips automation processing, the automation API endpoints return `403 Forbidden`, automation builtin tools are not injected, and the Automations entry is hidden from the sidebar. Requires the `features.automations` user permission (admins always pass).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUTOMATION_MAX_COUNT`
 
 - Type: `int`
 - Default: `("") empty string` (unlimited)
 - Description: Sets the maximum number of automations a non-admin user can create. When set to a positive integer, users who reach this limit will receive a `403 Forbidden` error when attempting to create additional automations. Admins bypass this limit.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUTOMATION_MIN_INTERVAL`
 
 - Type: `int` (seconds)
 - Default: `("") empty string` (no minimum)
 - Description: Sets the minimum allowed interval in seconds between automation recurrences for non-admin users. When set, any automation schedule that recurs more frequently than this value will be rejected with a `400 Bad Request` error. One-time automations (`COUNT=1`) are exempt from this check. Admins bypass this limit.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::tip Common values for AUTOMATION_MIN_INTERVAL
 
@@ -347,20 +347,20 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables the notes feature, allowing users to create and manage personal notes within Open WebUI.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_MEMORIES`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables the [memory feature](/features/chat-conversations/memory), allowing models to store and retrieve long-term information about users.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `WEBHOOK_URL`
 
 - Type: `str`
 - Description: Sets a webhook for integration with Discord/Slack/Microsoft Teams.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::note Admin posture toggles vs. security boundaries
 
@@ -416,14 +416,14 @@ Treat anything in this cluster as *what the admin sees and does in the product U
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables user webhooks.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RESPONSE_WATERMARK`
 
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets a custom text that will be included when you copy a message in the chat. e.g., `"This text is AI generated"` -> will add "This text is AI generated" to every message, when copied.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `IFRAME_CSP`
 
@@ -434,12 +434,15 @@ Treat anything in this cluster as *what the admin sees and does in the product U
 #### `THREAD_POOL_SIZE`
 
 - Type: `int`
-- Default: `0`
-- Description: Sets the thread pool size for FastAPI/AnyIO blocking calls. By default (when set to `0`) FastAPI/AnyIO use `40` threads. In case of large instances and many concurrent users, it may be needed to increase `THREAD_POOL_SIZE` to prevent blocking.
+- Default: `0` (unset — the AnyIO default limit of `40` applies)
+- Description: Sets the maximum number of blocking operations that may run **concurrently** in the AnyIO worker thread pool. Open WebUI offloads synchronous/blocking work (many DB calls, file I/O, sync route handlers, some library calls) to this pool via `run_in_threadpool`. The value is a **concurrency ceiling (a token limit), not a fixed pool of pre-spawned OS threads and not a CPU-core/thread count**: worker threads are created lazily only when needed and reused, so a high value does **not** by itself create that many threads, consume CPU, or cause CPU contention while idle. It only raises how many blocking operations can be in flight simultaneously before the rest must queue.
 
-:::info
+:::warning Set this high on any real server (2000+); never lower it
+The AnyIO default of `40` is far too low for production. When more than `THREAD_POOL_SIZE` blocking operations are needed at once (many users acting at the same time, or a few users each triggering several blocking calls), every further request **waits** for a free slot. The symptom is the whole app appearing to **hang / freeze / stop responding** under load, even though CPU and memory look fine — it is pool starvation, not resource exhaustion.
 
-If you are running larger instances, you WILL NEED to set this to a higher value like multiple hundreds if not thousands (e.g. `1000`) otherwise your app may get stuck the default pool size (which is 40 threads) is full and will not react anymore.
+- **Normal servers / production:** `2000` or higher. `2000` is a *lower* bound for very large multi-user instances — going higher is fine and is **not** a CPU or contention risk (it is a ceiling, not a preallocation).
+- **Never decrease below the default.** An idle high ceiling costs effectively nothing; a low ceiling causes freezes.
+- **Exception — weak hardware (Raspberry Pi, tiny VPS, containers capped at ~250m CPU / very low RAM):** do **not** set `2000` here. Each *genuinely concurrent* blocking op still uses a real OS thread (stack memory), so on a tiny device an enormous ceiling lets a traffic burst spawn enough threads to exhaust RAM. Leave it at the default, or set a modest value (e.g. a few hundred) matched to what the device can actually absorb. This caveat applies only to constrained single-board / micro deployments — any normal server should use `2000+`.
 
 :::
 
@@ -454,21 +457,21 @@ If you are running larger instances, you WILL NEED to set this to a higher value
 - Type: `bool`
 - Default: `True`
 - Description: Toggles whether to show admin user details in the interface.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_PUBLIC_ACTIVE_USERS_COUNT`
 
 - Type: `bool`
 - Default: `True`
 - Description: Controls whether the active user count is visible to all users or restricted to administrators only. When set to `False`, only admin users can see how many users are currently active, reducing backend load and addressing privacy concerns in large deployments.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_USER_STATUS`
 
 - Type: `bool`
 - Default: `True`
 - Description: Globally enables or disables user status functionality. When disabled, the status UI (including blinking active/away indicators and status messages) is hidden across the application, and user status API endpoints are restricted.
-- Persistence: This environment variable is a `PersistentConfig` variable. It can be toggled in the **Admin Panel > Settings > General > User Status**.
+- Persistence: This environment variable is a `ConfigVar` variable. It can be toggled in the **Admin Panel > Settings > General > User Status**.
 
 #### `ENABLE_EASTER_EGGS`
 
@@ -480,7 +483,7 @@ If you are running larger instances, you WILL NEED to set this to a higher value
 
 - Type: `str`
 - Description: Sets the admin email shown by `SHOW_ADMIN_DETAILS`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENV`
 
@@ -566,13 +569,13 @@ Enabling `ENABLE_REALTIME_CHAT_SAVE` causes every single token generated by the
 
 - Type: `bool`
 - Default: `True`
-- Description: Controls whether the user and model profile-image endpoints honor an external `http(s)://` URL stored in `profile_image_url` by issuing a `302 Found` redirect to the original origin. When `False`, the redirect is suppressed and the endpoint falls through to the bundled default image instead. Set to `False` to prevent client-side IP, User-Agent, and Referer leaks to attacker-controlled origins via attacker-stored profile URLs (data URIs and same-origin/static images continue to load normally). Existing deployments that legitimately rely on external profile image URLs (e.g. Gravatar redirects served by upstream identity providers) should keep the default. **This variable is read once at startup — it is not a `PersistentConfig` and cannot be changed from the Admin UI.**
+- Description: Controls whether the user and model profile-image endpoints honor an external `http(s)://` URL stored in `profile_image_url` by issuing a `302 Found` redirect to the original origin. When `False`, the redirect is suppressed and the endpoint falls through to the bundled default image instead. Set to `False` to prevent client-side IP, User-Agent, and Referer leaks to attacker-controlled origins via attacker-stored profile URLs (data URIs and same-origin/static images continue to load normally). Existing deployments that legitimately rely on external profile image URLs (e.g. Gravatar redirects served by upstream identity providers) should keep the default. **This variable is read once at startup — it is not a `ConfigVar` and cannot be changed from the Admin UI.**
 
 #### `PROFILE_IMAGE_ALLOWED_MIME_TYPES`
 
 - Type: `str` (comma-separated MIME types)
 - Default: `image/png,image/jpeg,image/gif,image/webp`
-- Description: Allowlist of MIME types accepted when serving a base64 `data:` URI as a profile image. The MIME type is parsed from the data URI prefix and checked against this list before the response is streamed; non-allowlisted types fall through to the bundled default image. Responses also set `X-Content-Type-Options: nosniff` to prevent the browser from sniffing the body into an executable type. SVG is intentionally not in the default list because it can carry inline `<script>`. The same allowlist drives the Pydantic data-URI prefix validator, so adding a type here both serves it on the read path and accepts it on the write path. **This variable is read once at startup — it is not a `PersistentConfig` and cannot be changed from the Admin UI.**
+- Description: Allowlist of MIME types accepted when serving a base64 `data:` URI as a profile image. The MIME type is parsed from the data URI prefix and checked against this list before the response is streamed; non-allowlisted types fall through to the bundled default image. Responses also set `X-Content-Type-Options: nosniff` to prevent the browser from sniffing the body into an executable type. SVG is intentionally not in the default list because it can carry inline `<script>`. The same allowlist drives the Pydantic data-URI prefix validator, so adding a type here both serves it on the read path and accepts it on the write path. **This variable is read once at startup — it is not a `ConfigVar` and cannot be changed from the Admin UI.**
 
 #### `CHAT_RESPONSE_STREAM_DELTA_CHUNK_SIZE`
 
@@ -592,6 +595,16 @@ It is recommended to set this to a high single-digit or low double-digit value i
 
 :::
 
+#### `CHAT_RESPONSE_MAX_TOOL_CALL_ITERATIONS`
+
+- Type: `int`
+- Default: `256`
+- Description: Caps how many sequential tool-calling turns the agentic loop will run within a **single** assistant response when Native Function Calling is enabled. Each turn in which the model emits one or more tool calls counts as **one** toward this limit (multiple tool calls grouped in the same turn still count as one). The counter resets for every new message and does not carry over between messages in a conversation. It is **not** a retry budget for failed tool calls: a turn consumes one iteration whether its tool calls succeed or fail. Set to `-1` for **unlimited** (no cap). Read once at startup; it is **not** a `ConfigVar` and cannot be changed from the Admin UI. An empty or invalid value falls back to `256`. The pre-v0.9.6 name `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES` (old default `30`) is still honored as a fallback if this is unset; prefer the new name.
+
+:::info Hitting the limit is visible (since v0.9.6)
+On reaching the cap, Open WebUI emits a **`Tool-call limit reached (N iterations).`** error in the chat. Before v0.9.6 the loop stopped silently, which looked as if the model had "frozen"; upgrading makes the stop visible. `256` is high enough for normal runs. Raise it, or set `-1`, only for agents that need very long runs, accepting that a looping model then runs many more turns before stopping (or, with `-1`, is never stopped by this cap).
+:::
+
 #### `ENABLE_RESPONSES_API_STATEFUL`
 
 - Type: `bool`
@@ -626,7 +639,7 @@ It is recommended to set this to a high single-digit or low double-digit value i
 [{"id": "string", "type": "string [info, success, warning, error]", "title": "string", "content": "string", "dismissible": false, "timestamp": 1000}]
 ```
 
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -673,21 +686,21 @@ WEBUI_BANNERS="[{\"id\": \"1\", \"type\": \"warning\", \"title\": \"Your message
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables chat title generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LICENSE_KEY`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the license key to use (for Enterprise users only).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SSL_ASSERT_FINGERPRINT`
 
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the SSL assert fingerprint to use.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_COMPRESSION_MIDDLEWARE`
 
@@ -775,13 +788,19 @@ If this variable is unset or invalid, Open WebUI falls back to `AIOHTTP_CLIENT_T
 
 - Type: `bool`
 - Default: `True`
-- Description: Controls SSL/TLS verification for AIOHTTP client sessions when connecting to external APIs (e.g., Ollama Embeddings).
+- Description: Controls SSL/TLS verification for AIOHTTP client sessions when connecting to external APIs (e.g., Ollama Embeddings and, since v0.9.6, Speech-to-Text / Text-to-Speech audio endpoints). Set to `False` to skip certificate verification, for example when these upstream services use self-signed certificates or certificates issued by a private CA.
 
 #### `AIOHTTP_CLIENT_ALLOW_REDIRECTS`
 
 - Type: `bool`
 - Default: `False`
-- Description: Controls whether outbound HTTP requests across the application follow `3xx` redirects. When `False` (the default since v0.9.5), redirects are not followed — this closes a class of SSRF where a public, validated URL `302`-redirects to an internal address (RFC 1918, loopback `127.0.0.1`, cloud-metadata `169.254.169.254`) that bypasses the original allowlist check. Affected call sites include the RAG web loader, image loading and base64 conversion, OAuth pre-flight, code-interpreter login, and tool-server execution. Set to `True` only if your deployment legitimately requires redirect following (e.g. shortlink-style URLs) AND you have other SSRF protections in place — typically an egress firewall or `WEB_FETCH_FILTER_LIST` covering your internal ranges.
+- Description: Controls whether outbound HTTP requests across the application follow `3xx` redirects. When `False` (the default since v0.9.6), redirects are not followed — this closes a class of SSRF where a public, validated URL `302`-redirects to an internal address (RFC 1918, loopback `127.0.0.1`, cloud-metadata `169.254.169.254`) that bypasses the original allowlist check. Affected call sites include the RAG web loader, image loading and base64 conversion, OAuth pre-flight, code-interpreter login, and tool-server execution. Set to `True` only if your deployment legitimately requires redirect following (e.g. shortlink-style URLs) AND you have other SSRF protections in place — typically an egress firewall or `WEB_FETCH_FILTER_LIST` covering your internal ranges.
+
+#### `USER_AGENT`
+
+- Type: `str`
+- Default: `""` (unset — falls back to the underlying library default, typically `python-requests/2.x`)
+- Description: Overrides the `User-Agent` header that the web loader (`SafeWebBaseLoader`) sends on outbound fetches for web search and the `fetch_url` tool. The default Python library UA is aggressively blocked by Cloudflare, Wikipedia, and similar bot-detection layers, which manifests as empty results or 403s when scraping or searching the public web. Set this to a real browser-like UA (e.g. `Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0 Safari/537.36`) to restore access. Applies to both the synchronous `_scrape()` and the async `_fetch()` code paths.
 
 #### `AIOHTTP_CLIENT_TIMEOUT_TOOL_SERVER_DATA`
 
@@ -930,7 +949,7 @@ By default, audit logging uses **blacklist mode** — all paths are logged excep
 - Type: `bool`
 - Default: `True`
 - Description: Enables the use of Ollama APIs.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OLLAMA_BASE_URL` (`OLLAMA_API_BASE_URL` is deprecated) {#ollama_base_url}
 
@@ -948,7 +967,7 @@ By default, audit logging uses **blacklist mode** — all paths are logged excep
 - Description: Configures load-balanced Ollama backend hosts, separated by `;`. See
 [`OLLAMA_BASE_URL`](#ollama_base_url). Takes precedence over[`OLLAMA_BASE_URL`](#ollama_base_url).
 - Example: `http://host-one:11434;http://host-two:11434`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USE_OLLAMA_DOCKER`
 
@@ -969,28 +988,28 @@ By default, audit logging uses **blacklist mode** — all paths are logged excep
 - Type: `bool`
 - Default: `True`
 - Description: Enables the use of OpenAI APIs.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OPENAI_API_BASE_URL`
 
 - Type: `str`
 - Default: `https://api.openai.com/v1`
 - Description: Configures the OpenAI base API URL.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OPENAI_API_BASE_URLS`
 
 - Type: `str`
 - Description: Supports balanced OpenAI base API URLs, semicolon-separated.
 - Example: `http://host-one:11434;http://host-two:11434`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OPENAI_API_KEY`
 
 - Type: `str`
 - Description: Sets the OpenAI API key.
 - Example: `sk-124781258123`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::warning Provider Key Scope (Important)
 For OpenAI-compatible backends and proxies (including LiteLLM), configure least-privilege keys for regular user traffic whenever possible.
@@ -1003,7 +1022,7 @@ Do not use provider management/master keys unless your deployment explicitly req
 - Type: `str`
 - Description: Supports multiple OpenAI API keys, semicolon-separated.
 - Example: `sk-124781258123;sk-4389759834759834`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_OPENAI_API_PASSTHROUGH`
 
@@ -1024,14 +1043,14 @@ Do not use provider management/master keys unless your deployment explicitly req
 - Type: `str`
 - Description: The default model to use for tasks such as title and web search query generation
 when using Ollama models.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `TASK_MODEL_EXTERNAL`
 
 - Type: `str`
 - Description: The default model to use for tasks such as title and web search query generation
 when using OpenAI-compatible endpoints.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `TITLE_GENERATION_PROMPT_TEMPLATE`
 
@@ -1069,14 +1088,14 @@ JSON format: { "title": "your concise title here" }
 </chat_history>
 ```
 
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_FOLLOW_UP_GENERATION`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables follow up generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `FOLLOW_UP_GENERATION_PROMPT_TEMPLATE`
 
@@ -1108,7 +1127,7 @@ JSON format: { "follow_ups": ["Question 1?", "Question 2?", "Question 3?"] }
 </chat_history>"
 ```
 
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `TOOLS_FUNCTION_CALLING_PROMPT_TEMPLATE`
 
@@ -1143,7 +1162,7 @@ The format for the JSON response is strictly:
 }
 ```
 
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Code Execution
 
@@ -1152,49 +1171,49 @@ The format for the JSON response is strictly:
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables code execution.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_EXECUTION_ENGINE`
 
 - Type: `str`
 - Default: `pyodide`
 - Description: Specifies the code execution engine to use.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_EXECUTION_JUPYTER_URL`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the Jupyter URL to use for code execution.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_EXECUTION_JUPYTER_AUTH`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the Jupyter authentication method to use for code execution.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_EXECUTION_JUPYTER_AUTH_TOKEN`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the Jupyter authentication token to use for code execution.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_EXECUTION_JUPYTER_AUTH_PASSWORD`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the Jupyter authentication password to use for code execution.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_EXECUTION_JUPYTER_TIMEOUT`
 
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the timeout for Jupyter code execution.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Code Interpreter
 
@@ -1203,14 +1222,14 @@ The format for the JSON response is strictly:
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables code interpreter.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_INTERPRETER_ENGINE`
 
 - Type: `str`
 - Default: `pyodide`
 - Description: Specifies the code interpreter engine to use.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_INTERPRETER_BLACKLISTED_MODULES`
 
@@ -1223,42 +1242,42 @@ The format for the JSON response is strictly:
 - Type: `str`
 - Default: `None`
 - Description: Specifies the prompt template to use for code interpreter.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_INTERPRETER_JUPYTER_URL`
 
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the Jupyter URL to use for code interpreter.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_INTERPRETER_JUPYTER_AUTH`
 
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the Jupyter authentication method to use for code interpreter.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_INTERPRETER_JUPYTER_AUTH_TOKEN`
 
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the Jupyter authentication token to use for code interpreter.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_INTERPRETER_JUPYTER_AUTH_PASSWORD`
 
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the Jupyter authentication password to use for code interpreter.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CODE_INTERPRETER_JUPYTER_TIMEOUT`
 
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the timeout for the Jupyter code interpreter.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Direct Connections (OpenAPI/MCPO Tool Servers)
 
@@ -1267,7 +1286,7 @@ The format for the JSON response is strictly:
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables direct connections.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `TOOL_SERVER_CONNECTIONS`
 
@@ -1302,7 +1321,7 @@ The format for the JSON response is strictly:
   }
 ]
 ```
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::warning
 
@@ -1349,7 +1368,7 @@ The JSON data structure of `TOOL_SERVER_CONNECTIONS` might evolve over time as n
   }
 ]
 ```
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::tip Helm chart auto-configuration
 When deploying on Kubernetes with the Open WebUI Helm chart and `terminals.enabled: true`, this variable is set automatically to point at the in-cluster orchestrator service. See the [Terminals (Orchestrator) guide](/features/open-terminal/terminals/) for details.
@@ -1382,7 +1401,7 @@ The JSON data structure of `TERMINAL_SERVER_CONNECTIONS` might evolve over time
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables autocomplete generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -1395,7 +1414,7 @@ When enabling `ENABLE_AUTOCOMPLETE_GENERATION`, ensure that you also configure `
 - Type: `int`
 - Default: `-1`
 - Description: Sets the maximum input length for autocomplete generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUTOCOMPLETE_GENERATION_PROMPT_TEMPLATE`
 
@@ -1452,7 +1471,7 @@ Output:
 ```
 
 - Description: Sets the prompt template for autocomplete generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Evaluation Arena Model
 
@@ -1461,14 +1480,14 @@ Output:
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables evaluation arena models.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_MESSAGE_RATING`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables message rating feature.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_COMMUNITY_SHARING`
 
@@ -1482,7 +1501,7 @@ Output:
   - **Share Chat Modal**: "Share to Open WebUI Community" button when sharing a chat conversation
   - **Evaluation Feedbacks**: "Share to Open WebUI Community" button for contributing feedback history to the community leaderboard
   - **Stats Sync Modal**: Enables syncing usage statistics with the community
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -1497,7 +1516,7 @@ When `ENABLE_COMMUNITY_SHARING` is set to `False`, all community sharing buttons
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables tag generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `TAGS_GENERATION_PROMPT_TEMPLATE`
 
@@ -1528,7 +1547,7 @@ JSON format: { "tags": ["tag1", "tag2", "tag3"] }
 ```
 
 - Description: Sets the prompt template for tag generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### API Key Endpoint Restrictions
 
@@ -1537,7 +1556,7 @@ JSON format: { "tags": ["tag1", "tag2", "tag3"] }
 - Type: `bool`
 - Default: `False`
 - Description: Enables the API key creation feature, allowing users to generate API keys for programmatic access to Open WebUI.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -1560,7 +1579,7 @@ For API Key creation (and the API keys themselves) to work:
 - Type: `bool`
 - Default: `False`
 - Description: Enables API key endpoint restrictions for added security and configurability, allowing administrators to limit which endpoints can be accessed using API keys.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -1573,7 +1592,7 @@ This variable replaces the deprecated `ENABLE_API_KEY_ENDPOINT_RESTRICTIONS` env
 - Type: `str`
 - Description: Specifies a comma-separated list of allowed API endpoints when API key endpoint restrictions are enabled.
 - Example: `/api/v1/messages,/api/v1/channels,/api/v1/chat/completions`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::note
 
@@ -1592,7 +1611,7 @@ This variable replaces the deprecated `API_KEY_ALLOWED_ENDPOINTS` environment va
 - Type: `str`
 - Default: `x-api-key`
 - Description: Name of the HTTP header the auth middleware checks for API-key credentials. Useful when Open WebUI sits behind a reverse proxy or API gateway that consumes the `Authorization` header for its own authentication — set this to a distinct header (for example `X-OpenWebUI-Key`) so clients can deliver their Open WebUI API key without colliding with the proxy's own auth.
-- Read at startup from the process environment (not a `PersistentConfig`).
+- Read at startup from the process environment (not a `ConfigVar`).
 
 **How the auth middleware picks up a credential**, in order:
 
@@ -1625,7 +1644,7 @@ The header name is matched case-insensitively by the ASGI layer, so pick whateve
 - Type: `bool`
 - Default: `False`
 - Description: When enabled, caches the list of base models from connected Ollama and OpenAI-compatible endpoints in memory. This reduces the number of API calls made to external model providers when loading the model selector, improving performance particularly for deployments with many users or slow connections to model endpoints. Can also be configured from Admin Panel > Settings > Connections > "Cache Base Model List".
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 **How the cache works:**
 
@@ -1675,7 +1694,7 @@ For maximum performance, enable both: `ENABLE_BASE_MODELS_CACHE=True` with `MODE
 - Type: `str`
 - Default: `4w`
 - Description: Sets the JWT expiration time. Specify a number followed by a time unit (`s`, `m`, `h`, `d`, or `w`), or `-1` for no expiration.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::warning
 
@@ -1760,7 +1779,7 @@ FORWARD_SESSION_INFO_HEADER_MESSAGE_ID=X-Amzn-Bedrock-AgentCore-Runtime-Custom-M
 - Type: `bool`
 - Default: `True`
 - Description: Bypass SSL Verification for RAG on Websites.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `WEBUI_SESSION_COOKIE_SAME_SITE`
 
@@ -2174,49 +2193,49 @@ For multi-worker or multi-replica setups, you **must** configure `CHROMA_HTTP_HO
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the Elasticsearch API key.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ELASTICSEARCH_CA_CERTS`
 
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the path to the CA certificates for Elasticsearch.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ELASTICSEARCH_CLOUD_ID`
 
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the Elasticsearch cloud ID.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ELASTICSEARCH_INDEX_PREFIX`
 
 - Type: `str`
 - Default: `open_webui_collections`
 - Description: Specifies the prefix for the Elasticsearch index.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ELASTICSEARCH_PASSWORD`
 
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the password for Elasticsearch.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ELASTICSEARCH_URL`
 
 - Type: `str`
 - Default: `https://localhost:9200`
 - Description: Specifies the URL for the Elasticsearch instance.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ELASTICSEARCH_USERNAME`
 
 - Type: `str`
 - Default: Empty string (' '), since `None` is set as default.
 - Description: Specifies the username for Elasticsearch.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Milvus
 
@@ -2254,7 +2273,7 @@ If you want to use Milvus, be careful when upgrading Open WebUI (crate backups a
 - Default: `HNSW`
 - Options: `AUTOINDEX`, `FLAT`, `IVF_FLAT`, `HNSW`, `DISKANN`
 - Description: Specifies the index type to use when creating a new collection in Milvus. `AUTOINDEX` is generally recommended for Milvus standalone. `HNSW` may offer better performance but requires a clustered Milvus setup and is not meant for standalone setups.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MILVUS_METRIC_TYPE`
 
@@ -2262,28 +2281,28 @@ If you want to use Milvus, be careful when upgrading Open WebUI (crate backups a
 - Default: `COSINE`
 - Options: `COSINE`, `IP`, `L2`
 - Description: Specifies the metric type for vector similarity search in Milvus.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MILVUS_HNSW_M`
 
 - Type: `int`
 - Default: `16`
 - Description: Specifies the `M` parameter for the HNSW index type in Milvus. This influences the number of bi-directional links created for each new element during construction. Only applicable if `MILVUS_INDEX_TYPE` is `HNSW`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MILVUS_HNSW_EFCONSTRUCTION`
 
 - Type: `int`
 - Default: `100`
 - Description: Specifies the `efConstruction` parameter for the HNSW index type in Milvus. This influences the size of the dynamic list for the nearest neighbors during index construction. Only applicable if `MILVUS_INDEX_TYPE` is `HNSW`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MILVUS_IVF_FLAT_NLIST`
 
 - Type: `int`
 - Default: `128`
 - Description: Specifies the `nlist` parameter for the IVF_FLAT index type in Milvus. This is the number of cluster units. Only applicable if `MILVUS_INDEX_TYPE` is `IVF_FLAT`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MILVUS_DISKANN_MAX_DEGREE`
 
@@ -2569,7 +2588,7 @@ If set to `false`, open-webui will assume the postgreSQL database where embeddin
   - `hnsw` - Uses Hierarchical Navigable Small World graphs, generally provides better query performance.
 - Default: Not specified (pgvector will use its default)
 - Description: Specifies the index method for pgvector. The choice affects query performance and index build time.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -2582,21 +2601,21 @@ When choosing an index method, consider your dataset size and query patterns. HN
 - Type: `int`
 - Default: `16`
 - Description: HNSW index parameter that controls the maximum number of bi-directional connections per layer during index construction. Higher values improve recall but increase index size and build time. Only applicable when `PGVECTOR_INDEX_METHOD` is set to `hnsw`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `PGVECTOR_HNSW_EF_CONSTRUCTION`
 
 - Type: `int`
 - Default: `64`
 - Description: HNSW index parameter that controls the size of the dynamic candidate list during index construction. Higher values improve index quality but increase build time. Only applicable when `PGVECTOR_INDEX_METHOD` is set to `hnsw`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `PGVECTOR_IVFFLAT_LISTS`
 
 - Type: `int`
 - Default: `100`
 - Description: IVFFlat index parameter that specifies the number of inverted lists (clusters) to create. A good starting point is `rows / 1000` for up to 1M rows and `sqrt(rows)` for over 1M rows. Only applicable when `PGVECTOR_INDEX_METHOD` is set to `ivfflat`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -2947,49 +2966,49 @@ Note: this configuration assumes that AWS credentials will be available to your
   - `mineru` - Use MinerU engine
   - `paddleocr_vl` - Use a PaddleOCR-vl server (requires `PADDLEOCR_VL_TOKEN`; see below)
 - Description: Sets the content extraction engine to use for document ingestion.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MISTRAL_OCR_API_KEY`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the Mistral OCR API key to use.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MISTRAL_OCR_API_BASE_URL`
 
 - Type: `str`
 - Default: `https://api.mistral.ai/v1`
 - Description: Configures custom Mistral OCR API endpoints for flexible deployment options, allowing users to point to self-hosted or alternative Mistral OCR instances.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `EXTERNAL_DOCUMENT_LOADER_URL`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets the URL for the external document loader service.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `EXTERNAL_DOCUMENT_LOADER_API_KEY`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets the API key for authenticating with the external document loader service.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `TIKA_SERVER_URL`
 
 - Type: `str`
 - Default: `http://localhost:9998`
 - Description: Sets the URL for the Apache Tika server.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `DOCLING_SERVER_URL`
 
 - Type: `str`
 - Default: `http://docling:5001`
 - Description: Specifies the URL for the Docling server. Requires Docling version 2.0.0 or later for full compatibility with the new parameter-based configuration system.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::warning
 
@@ -3011,7 +3030,7 @@ The old individual environment variables (`DOCLING_OCR_ENGINE`, `DOCLING_OCR_LAN
 - Type: `str`
 - Default: `None`
 - Description: Sets the API key for authenticating with the Docling server. Required when the Docling server has authentication enabled.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `DOCLING_PARAMS`
 
@@ -3052,7 +3071,7 @@ The old individual environment variables (`DOCLING_OCR_ENGINE`, `DOCLING_OCR_LAN
 **dlparse** vs **dbparse**: Note that the backend names use **`dlparse`** (Deep Learning Parse), not `dbparse`. For modern Docling (v2+), `dlparse_v4` is generally recommended for the best balance of features.
 :::
 
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -3087,21 +3106,21 @@ DOCLING_PARAMS="{\"do_ocr\": true, \"ocr_engine\": \"tesseract\", \"ocr_lang\":
 - Type: `str`
 - Default: `300`
 - Description: Sets the timeout in seconds for MinerU API requests during document processing.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `PADDLEOCR_VL_BASE_URL`
 
 - Type: `str`
 - Default: `http://localhost:8080`
 - Description: Base URL of the PaddleOCR-vl server used when `CONTENT_EXTRACTION_ENGINE=paddleocr_vl`. Documents and images are POSTed to `{base_url}/layout-parsing` and the response's `layoutParsingResults[].markdown.text` is ingested page-by-page.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `PADDLEOCR_VL_TOKEN`
 
 - Type: `str`
 - Default: `""` (empty)
 - Description: Authentication token for the PaddleOCR-vl server. Sent as `Authorization: token <value>` on every layout-parsing request. **The PaddleOCR-vl engine is skipped at runtime if this value is empty** — the loader falls back to the default PyPDFLoader for the current document even when `CONTENT_EXTRACTION_ENGINE=paddleocr_vl` is set. Set this to activate the engine.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info Supported file types
 PaddleOCR-vl handles both documents and images. Extensions treated as images and dispatched with `fileType=1`: `png`, `jpg`, `jpeg`, `bmp`, `tiff`, `webp`. Everything else is dispatched with `fileType=0` (document, e.g. PDFs). Output is per-page Markdown, so downstream chunking behaves the same as other engines.
@@ -3120,56 +3139,69 @@ PaddleOCR-vl handles both documents and images. Extensions treated as images and
   - `openai` - Uses the OpenAI API for embeddings.
   - `azure_openai` - Uses Azure OpenAI Services for embeddings.
 - Description: Selects an embedding engine to use for RAG.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_EMBEDDING_MODEL`
 
 - Type: `str`
 - Default: `sentence-transformers/all-MiniLM-L6-v2`
 - Description: Sets a model for embeddings. Locally, a Sentence-Transformer model is used.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_TOP_K`
 
 - Type: `int`
 - Default: `3`
 - Description: Sets the default number of results to consider for the embedding when using RAG.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_TOP_K_RERANKER`
 
 - Type: `int`
 - Default: `3`
 - Description: Sets the default number of results to consider for the reranker when using RAG.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_RELEVANCE_THRESHOLD`
 
 - Type: `float`
 - Default: `0.0`
 - Description: Sets the relevance threshold to consider for documents when used with reranking.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_RAG_HYBRID_SEARCH`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enables the use of ensemble search with `BM25` + `ChromaDB`, with reranking using `sentence_transformers` models. When enabled, this applies to both the standard RAG retrieval pipeline and the native knowledge tools used in agentic mode.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
+
+#### `ENABLE_KB_EXEC`
+
+- Type: `bool`
+- Default: `False`
+- Description: When enabled, adds a `kb_exec` tool that lets the model interact with knowledge bases through shell-style commands (`ls`, `tree`, `cat`, `head`, `tail`, `sed`, `grep`, `find`, `wc`, `stat`) with pipe support (e.g. `grep "auth" | head -5`). Directory-aware: `ls docs/` lists a subdirectory, `tree` renders a recursive view, `grep "text" docs/` scopes the search, and files can be referenced by path (`docs/api/auth.md`), filename, or file ID.
+
+  Turning this on **replaces** the file-oriented per-purpose tools (`list_knowledge`, `search_knowledge_files`, `grep_knowledge_files`, `view_file`, `view_knowledge_file`) because they overlap with the equivalent `kb_exec` commands. Tools that do something `kb_exec` can't are **kept** alongside it:
+  - `query_knowledge_files` (semantic / RAG search) — always
+  - `view_note` — when notes are attached to the model
+  - `query_knowledge_bases` and `search_knowledge_bases` — when no knowledge is attached, so the model can still discover KBs by name/description
+
+  This feature is experimental and works best with capable frontier models that handle shell-style tool chaining well. See [Filesystem-style access](/features/workspace/knowledge#filesystem-style-access-kb_exec).
 
 #### `RAG_HYBRID_BM25_WEIGHT`
 
 - Type: `float`
 - Default: `0.5`
 - Description: Sets the weight given to the keyword search (BM25) during hybrid search. 1 means only keyword search, 0 means only vector search.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_RAG_HYBRID_SEARCH_ENRICHED_TEXTS`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enhances BM25 hybrid search by enriching indexed text with document metadata including filenames, titles, sections, and snippets. This improves keyword recall for metadata-based queries, allowing searches to match on document names and structural elements in addition to content.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -3217,7 +3249,7 @@ Provide a clear and direct response to the user's query, including inline citati
 ```
 
 - Description: Template to use when injecting RAG documents into chat completion.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Document Processing
 
@@ -3226,21 +3258,21 @@ Provide a clear and direct response to the user's query, including inline citati
 - Type: `int`
 - Default: `1000`
 - Description: Sets the document chunk size for embeddings.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CHUNK_OVERLAP`
 
 - Type: `int`
 - Default: `100`
 - Description: Specifies how much overlap there should be between chunks.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `CHUNK_MIN_SIZE_TARGET`
 
 - Type: `int`
 - Default: `0`
 - Description: Chunks smaller than this threshold will be intelligently merged with neighboring chunks when possible. This helps prevent tiny, low-quality fragments that can hurt retrieval performance and waste embedding resources. This feature only works when `ENABLE_MARKDOWN_HEADER_TEXT_SPLITTER` is enabled. Set to `0` to disable merging. For more information on the benefits and configuration, see the [RAG guide](/features/chat-conversations/rag#chunking-configuration).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_TEXT_SPLITTER`
 
@@ -3250,14 +3282,14 @@ Provide a clear and direct response to the user's query, including inline citati
   - `token`
 - Default: `character`
 - Description: Sets the text splitter for RAG models. Use `character` for RecursiveCharacterTextSplitter or `token` for TokenTextSplitter (Tiktoken-based).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_MARKDOWN_HEADER_TEXT_SPLITTER`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables markdown header text splitting as a preprocessing step before character or token splitting. When enabled, documents are first split by markdown headers (h1-h6), then the resulting chunks are further processed by the configured text splitter (`RAG_TEXT_SPLITTER`). This helps preserve document structure and context across chunks.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -3278,14 +3310,14 @@ The `markdown_header` option has been removed from `RAG_TEXT_SPLITTER`. Markdown
 - Type: `str`
 - Default: `cl100k_base`
 - Description: Sets the encoding name for TikToken.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `PDF_EXTRACT_IMAGES`
 
 - Type: `bool`
 - Default: `False`
 - Description: Extracts images from PDFs using OCR when loading documents.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `PDF_LOADER_MODE`
 
@@ -3295,19 +3327,19 @@ The `markdown_header` option has been removed from `RAG_TEXT_SPLITTER`. Markdown
   - `single` - Combines all pages into one document for better chunking across page boundaries.
 - Default: `page`
 - Description: Controls how PDFs are loaded and split into documents when using the **default content extraction engine** (PyPDFLoader). Page mode creates one document per page, while single mode combines all pages into one document, which can improve chunking quality when content spans across page boundaries. This setting has no effect when using external content extraction engines like Tika, Docling, Document Intelligence, MinerU, or Mistral OCR, as those engines have their own document handling logic.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_FILE_MAX_SIZE`
 
 - Type: `int`
 - Description: Sets the maximum size of a file in megabytes that can be uploaded for document ingestion.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_FILE_MAX_COUNT`
 
 - Type: `int`
 - Description: Sets the maximum number of files that can be uploaded at once for document ingestion.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_ALLOWED_FILE_EXTENSIONS`
 
@@ -3319,7 +3351,7 @@ The `markdown_header` option has been removed from `RAG_TEXT_SPLITTER`. Markdown
 ["pdf,docx,txt"]
 ```
 
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -3336,7 +3368,7 @@ When configuring `RAG_FILE_MAX_SIZE` and `RAG_FILE_MAX_COUNT`, ensure that the v
 - Type: `int`
 - Default: `1`
 - Description: Controls how many text chunks are embedded in a single API request when using external embedding providers (Ollama, OpenAI, or Azure OpenAI). Higher values (20-100+; maximum 16000, not recommended) may process documents faster by sending fewer but larger API requests. Some external APIs do not support batching or sending more than 1 chunk per request. In such cases you must leave this at `1`. The default of `1` is the safest option if the API does not support batching. This setting only applies to external embedding engines, not the default SentenceTransformers engine.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -3350,7 +3382,7 @@ Only increase this variable's value if it does - otherwise you might run into un
 - Type: `bool`
 - Default: `true`
 - Description: Runs embedding tasks asynchronously (parallelized) for maximum performance. Only works for Ollama, OpenAI, and Azure OpenAI; it does not affect sentence transformer setups.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::tip
 
@@ -3370,7 +3402,7 @@ If you are embedding externally via API, ensure your rate limits are high enough
 - Type: `int`
 - Default: `0`
 - Description: Limits the number of concurrent embedding API requests when async embedding is enabled. Uses an asyncio semaphore to throttle parallel requests. Set to `0` for unlimited concurrency (default behavior), or set to a positive integer to cap simultaneous requests. Useful for respecting rate limits on external embedding APIs or reducing load on local embedding servers.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::tip
 
@@ -3415,14 +3447,14 @@ This variable was introduced alongside a fix for **uvicorn worker death during d
 - Type: `str`
 - Default: `${OPENAI_API_BASE_URL}`
 - Description: Sets the OpenAI base API URL to use for RAG embeddings.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_OPENAI_API_KEY`
 
 - Type: `str`
 - Default: `${OPENAI_API_KEY}`
 - Description: Sets the OpenAI API key to use for RAG embeddings.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_EMBEDDING_OPENAI_BATCH_SIZE`
 
@@ -3437,21 +3469,21 @@ This variable was introduced alongside a fix for **uvicorn worker death during d
 - Type: `str`
 - Default: `None`
 - Description: Sets the base URL for Azure OpenAI Services when using Azure OpenAI for RAG embeddings. Should be in the format `https://{your-resource-name}.openai.azure.com`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_AZURE_OPENAI_API_KEY`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets the API key for Azure OpenAI Services when using Azure OpenAI for RAG embeddings.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_AZURE_OPENAI_API_VERSION`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets the API version for Azure OpenAI Services when using Azure OpenAI for RAG embeddings. Common values include `2023-05-15`, `2023-12-01-preview`, or `2024-02-01`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### Ollama Embeddings
 
@@ -3459,13 +3491,13 @@ This variable was introduced alongside a fix for **uvicorn worker death during d
 
 - Type: `str`
 - Description: Sets the base URL for Ollama API used in RAG models.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_OLLAMA_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for Ollama API used in RAG models.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Reranking
 
@@ -3475,20 +3507,20 @@ This variable was introduced alongside a fix for **uvicorn worker death during d
 - Options: `external`, or empty for local Sentence-Transformer CrossEncoder
 - Default: Empty string (local reranking)
 - Description: Specifies the reranking engine to use. Set to `external` to use an external reranker API (requires `RAG_EXTERNAL_RERANKER_URL`). Leave empty to use a local Sentence-Transformer CrossEncoder model.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_RERANKING_MODEL`
 
 - Type: `str`
 - Description: Sets a model for reranking results. Locally, a Sentence-Transformer model is used.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_RERANKING_BATCH_SIZE`
 
 - Type: `int`
 - Default: `32`
 - Description: Controls how many query–document pairs are scored in a single batch during local reranking. Higher values use more memory but can be faster on GPUs with sufficient VRAM. This applies to the local ColBERT/CrossEncoder reranking model's `predict()` call.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SENTENCE_TRANSFORMERS_CROSS_ENCODER_SIGMOID_ACTIVATION_FUNCTION`
 
@@ -3501,14 +3533,14 @@ This variable was introduced alongside a fix for **uvicorn worker death during d
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets the timeout in seconds for external reranker API requests during RAG document retrieval. Leave empty to use default timeout behavior.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_EXTERNAL_RERANKER_URL`
 
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets the **full URL** for the external reranking API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::warning
 
@@ -3521,7 +3553,7 @@ You **MUST** provide the full URL, including the endpoint path (e.g., `https://a
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets the API key for the external reranking API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Query Generation
 
@@ -3530,7 +3562,7 @@ You **MUST** provide the full URL, including the endpoint path (e.g., `https://a
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables retrieval query generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_QUERIES_CACHE`
 
@@ -3572,7 +3604,7 @@ Strictly return in JSON format:
 ```
 
 - Description: Sets the prompt template for query generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Document Intelligence (Azure)
 
@@ -3581,21 +3613,21 @@ Strictly return in JSON format:
 - Type: `str`
 - Default: `None`
 - Description: Specifies the endpoint for document intelligence.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `DOCUMENT_INTELLIGENCE_KEY`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the key for document intelligence.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `DOCUMENT_INTELLIGENCE_MODEL`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the model for document intelligence.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Advanced Settings
 
@@ -3604,14 +3636,14 @@ Strictly return in JSON format:
 - Type: `bool`
 - Default: `False`
 - Description: Bypasses the embedding and retrieval process.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_FULL_CONTEXT`
 
 - Type: `bool`
 - Default: `False`
 - Description: Specifies whether to use the full context for RAG.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `RAG_SYSTEM_CONTEXT`
 
@@ -3624,7 +3656,7 @@ Strictly return in JSON format:
 - Type: `bool`
 - Default: `False`
 - Description: Controls whether RAG web fetch operations can access URLs that resolve to private/local network IP addresses.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 When disabled (default), Open WebUI blocks web fetch requests to URLs that resolve to private IP addresses, including:
 
@@ -3646,7 +3678,7 @@ Only enable this setting if you need to fetch content from internal network reso
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables Google Drive integration. If set to true, and `GOOGLE_DRIVE_CLIENT_ID` & `GOOGLE_DRIVE_API_KEY` are both configured, Google Drive will appear as an upload option in the chat UI.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -3658,13 +3690,13 @@ When enabling `GOOGLE_DRIVE_INTEGRATION`, ensure that you have configured `GOOGL
 
 - Type: `str`
 - Description: Sets the client ID for Google Drive (client must be configured with Drive API and Picker API enabled).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `GOOGLE_DRIVE_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for Google Drive integration.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### OneDrive
 
@@ -3679,7 +3711,7 @@ For a step-by-step setup guide, check out our tutorial: [Configuring OneDrive &
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables the Microsoft OneDrive integration feature globally.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::warning
 
@@ -3693,14 +3725,14 @@ The authentication flow also depends on a browser pop-up window. Please ensure t
 - Type: `bool`
 - Default: `True`
 - Description: Controls whether the "Personal OneDrive" option appears in the attachment menu. The option is only shown when `ONEDRIVE_CLIENT_ID_PERSONAL` (or the legacy `ONEDRIVE_CLIENT_ID`) is configured — without a client ID, the option stays hidden even if this variable is `True`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_ONEDRIVE_BUSINESS`
 
 - Type: `bool`
 - Default: `True`
 - Description: Controls whether the "Work/School OneDrive" option appears in the attachment menu. The option is only shown when `ONEDRIVE_CLIENT_ID_BUSINESS` (or the legacy `ONEDRIVE_CLIENT_ID`) is configured — without a client ID, the option stays hidden even if this variable is `True`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ONEDRIVE_CLIENT_ID`
 
@@ -3732,7 +3764,7 @@ When configuring the App Registration in Azure, the Redirect URI must be set to
 - Type: `str`
 - Default: `None`
 - Description: Specifies the root SharePoint site URL for the work/school integration, e.g., `https://companyname.sharepoint.com`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -3745,7 +3777,7 @@ This variable is essential for the work/school integration. It should point to t
 -   Type: `str`
 -   Default: `None`
 -   Description: Specifies the Directory (tenant) ID for the work/school integration. This is obtained from your business-focused Azure App Registration.
--   Persistence: This environment variable is a `PersistentConfig` variable.
+-   Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -3760,21 +3792,21 @@ This Tenant ID (also known as Directory ID) is required for the work/school inte
 - Type: `bool`
 - Default: `False`
 - Description: Enable web search toggle.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_SEARCH_QUERY_GENERATION`
 
 - Type: `bool`
 - Default: `True`
 - Description: Only applies to Default Function Calling mode, which is legacy and no longer supported. If True: an LLM generates optimized, distilled search queries from the conversation context. If False: the user's last message is used verbatim as the web search query. Native Mode (the supported mode) uses the model's own `search_web` tool call and does not consult this setting.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `WEB_SEARCH_TRUST_ENV`
 
 - Type: `bool`
 - Default: `True`
 - Description: Routes the web-page content fetcher (the stage that scrapes result pages after a web search, and the general website/URL loader) through the proxy defined by the `http_proxy` / `https_proxy` environment variables, also honoring `no_proxy` and `.netrc`. Needed because the default fetcher uses `aiohttp`, which, unlike `requests`, ignores these proxy variables unless told to trust the environment. This does not affect the search-engine query request itself (that already respects proxy env vars), and it never overrides an explicitly configured proxy; for the Firecrawl/Tavily/Playwright loaders it only acts as a fallback when no proxy is otherwise set.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `WEB_FETCH_FILTER_LIST`
 
@@ -3795,35 +3827,35 @@ Allow only specific domains: WEB_FETCH_FILTER_LIST="example.com,trusted-site.org
 - Default: `[]`
 - Description: Comma-separated list of domains to filter web search results. Domains prefixed with `!` are blocked; domains without prefix create an allowlist (only those domains permitted).
 - Example: `wikipedia.org,github.com,!malicious-site.com`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
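
As a rough illustration of the allow/block semantics described above, here is a minimal Python sketch. The function names (`parse_filter_list`, `is_domain_allowed`) and the subdomain-matching rule are assumptions made for this example; they are not taken from Open WebUI's code.

```python
def parse_filter_list(raw: str) -> tuple[set[str], set[str]]:
    """Split a comma-separated filter string into (allowed, blocked) sets."""
    allowed, blocked = set(), set()
    for entry in raw.split(","):
        entry = entry.strip().lower()
        if not entry:
            continue
        if entry.startswith("!"):
            blocked.add(entry[1:])  # "!" prefix marks a blocked domain
        else:
            allowed.add(entry)      # no prefix contributes to the allowlist
    return allowed, blocked


def _matches(host: str, domain: str) -> bool:
    # Assumption: a listed domain also covers its subdomains.
    return host == domain or host.endswith("." + domain)


def is_domain_allowed(host: str, raw_filter: str) -> bool:
    """Return True if `host` passes the filter list."""
    allowed, blocked = parse_filter_list(raw_filter)
    host = host.lower()
    if any(_matches(host, d) for d in blocked):
        return False
    if allowed:
        # Once any allowlist entry exists, only those domains pass.
        return any(_matches(host, d) for d in allowed)
    return True
```

With the example value `wikipedia.org,github.com,!malicious-site.com`, this sketch accepts `en.wikipedia.org`, rejects `malicious-site.com`, and also rejects unlisted hosts such as `example.com` because an allowlist is present.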
 
 #### `WEB_SEARCH_RESULT_COUNT`
 
 - Type: `int`
 - Default: `3`
 - Description: Maximum number of web search results to crawl. In Native/Agentic tool calling, this is also the default `search_web` result count when the model omits `count`, and the maximum cap when the model provides `count`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
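
The "default when omitted, cap when provided" behavior can be pictured with a small Python sketch. The helper name `effective_result_count` is hypothetical and only mirrors the description above, not the actual implementation.

```python
from typing import Optional


def effective_result_count(requested: Optional[int], configured: int = 3) -> int:
    """Resolve how many results to fetch for a `search_web` tool call.

    `configured` stands in for WEB_SEARCH_RESULT_COUNT (default 3).
    """
    if requested is None:
        # The model omitted `count`: use the configured value as the default.
        return configured
    # The model provided `count`: the configured value acts as an upper bound.
    return min(requested, configured)
```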
 
 #### `WEB_SEARCH_CONCURRENT_REQUESTS`
 
 - Type: `int`
 - Default: `0`
 - Description: Limits the number of concurrent search requests to the search engine provider. Set to `0` for unlimited concurrency (default). Set to `1` for sequential execution to prevent rate limiting errors (e.g., Brave Free Tier).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
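
One common way to implement this kind of limit is a semaphore around each request. The following is a minimal sketch under that assumption; `run_searches` and `search_fn` are hypothetical names, and the real code may differ.

```python
import asyncio


async def run_searches(queries, search_fn, concurrent_requests=0):
    """Run `search_fn` for every query, optionally bounding concurrency.

    `concurrent_requests` stands in for WEB_SEARCH_CONCURRENT_REQUESTS:
    0 means unlimited, 1 means strictly sequential.
    """
    limiter = (
        asyncio.Semaphore(concurrent_requests) if concurrent_requests > 0 else None
    )

    async def guarded(query):
        if limiter is None:
            return await search_fn(query)
        async with limiter:  # waits here while the limit is reached
            return await search_fn(query)

    # gather preserves the input order of the results
    return await asyncio.gather(*(guarded(q) for q in queries))
```

Setting the limit to `1` serializes the requests, which is the configuration suggested above for rate-limited providers.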
 
 #### `WEB_FETCH_MAX_CONTENT_LENGTH`
 
 - Type: `int`
 - Default: None (no limit)
 - Description: Maximum number of characters to return from fetched URLs. When set, content exceeding this limit is truncated. Previously hardcoded at 50,000 characters. Leave empty or unset to return full content without truncation. Useful for controlling context window usage with large web pages.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `WEB_LOADER_CONCURRENT_REQUESTS`
 
 - Type: `int`
 - Default: `10`
 - Description: Specifies the number of concurrent requests used by the web loader to fetch content from web pages returned by search results. This directly impacts how many pages can be crawled simultaneously.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -3849,6 +3881,7 @@ Allow only specific domains: WEB_FETCH_FILTER_LIST="example.com,trusted-site.org
   - `serpapi` - Uses the [SerpApi](https://serpapi.com/) search engine.
   - `duckduckgo` - Uses the [DuckDuckGo](https://duckduckgo.com/) search engine.
   - `tavily` - Uses the [Tavily](https://tavily.com/) search engine.
+  - `linkup` - Uses the [Linkup](https://www.linkup.so/) search API. Requires `LINKUP_API_KEY`; tunable via `LINKUP_SEARCH_PARAMS` (search depth, output type).
   - `jina` - Uses the [Jina](https://jina.ai/) search engine.
   - `bing` - Uses the [Bing](https://www.bing.com/) search engine.
   - `exa` - Uses the [Exa](https://exa.ai/) search engine.
@@ -3860,7 +3893,7 @@ Allow only specific domains: WEB_FETCH_FILTER_LIST="example.com,trusted-site.org
   - `yacy`
   - `yandex` - Uses the [Yandex Search API](https://yandex.cloud/en/docs/search-api/api-ref/WebSearch/search).
   - `youcom` - Uses the [You.com](https://you.com/) YDC Index API for web search.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `DDGS_BACKEND`
 
@@ -3868,53 +3901,53 @@ Allow only specific domains: WEB_FETCH_FILTER_LIST="example.com,trusted-site.org
 - Default: `auto`
 - Options: `auto` (Random), `bing`, `brave`, `duckduckgo`, `google`, `grokipedia`, `mojeek`, `wikipedia`, `yahoo`, `yandex`.
 - Description: Specifies the backend to be used by the DDGS engine.
-- Persistence: This environment variable is a `PersistentConfig` variable. It can be configured in the **Admin Panel > Settings > Web Search > DDGS Backend** when DDGS is selected as the search engine.
+- Persistence: This environment variable is a `ConfigVar` variable. It can be configured in the **Admin Panel > Settings > Web Search > DDGS Backend** when DDGS is selected as the search engine.
 
 #### `BYPASS_WEB_SEARCH_EMBEDDING_AND_RETRIEVAL`
 
 - Type: `bool`
 - Default: `False`
 - Description: Bypasses the web search embedding and retrieval process.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `BYPASS_WEB_SEARCH_WEB_LOADER`
 
 - Type: `bool`
 - Default: `False`
 - Description: Bypasses the web loader when performing web search. When enabled, only snippets from the search engine are used, and the full page content is not fetched.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SEARXNG_QUERY_URL`
 
 - Type: `str`
 - Description: The [SearXNG search API](https://docs.searxng.org/dev/search_api.html) URL supporting JSON output. `<query>` is replaced with the search query. Example: `http://searxng.local/search?q=<query>`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
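
To make the `<query>` placeholder concrete, here is a minimal Python sketch of the substitution. The helper name `build_searxng_url` is hypothetical, and URL-encoding the query with `quote_plus` is an assumption of this example rather than documented behavior.

```python
from urllib.parse import quote_plus


def build_searxng_url(query_url_template: str, query: str) -> str:
    """Replace the `<query>` placeholder with the URL-encoded search query."""
    return query_url_template.replace("<query>", quote_plus(query))


url = build_searxng_url("http://searxng.local/search?q=<query>", "open webui")
# url == "http://searxng.local/search?q=open+webui"
```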
 
 #### `SEARXNG_LANGUAGE`
 
 - Type: `str`
 - Default: `all`
 - Description: This variable is used in the request to SearXNG as the "search language" (argument `language`).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `GOOGLE_PSE_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for the Google Programmable Search Engine (PSE) service.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `GOOGLE_PSE_ENGINE_ID`
 
 - Type: `str`
 - Description: The engine ID for the Google Programmable Search Engine (PSE) service.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `BRAVE_SEARCH_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for the Brave Search API. Used by both the `brave` and `brave_llm_context` engines.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -3927,123 +3960,137 @@ Brave's free tier enforces a rate limit of 1 request per second. Open WebUI auto
 - Type: `int`
 - Default: `8192`
 - Description: Maximum total tokens to retrieve per query when `WEB_SEARCH_ENGINE=brave_llm_context`. Sent to Brave's LLM Context API as `maximum_number_of_tokens`. Valid range is `1024`–`32768`. Higher values pull richer extracted passages at the cost of API quota; lower values keep responses lean. Configurable via **Admin Panel → Settings → Web Search → Context Tokens** (only shown when the `brave_llm_context` engine is selected).
-- Persistence: This environment variable is a `PersistentConfig` variable. Stored at config key `rag.web.search.brave_search_context_tokens`.
+- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `rag.web.search.brave_search_context_tokens`.
 
 #### `KAGI_SEARCH_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for Kagi Search API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MOJEEK_SEARCH_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for Mojeek Search API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SERPSTACK_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for Serpstack search API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SERPSTACK_HTTPS`
 
 - Type: `bool`
 - Default: `True`
 - Description: Configures the use of HTTPS for Serpstack requests. Free tier requests are restricted to HTTP only.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SERPER_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for Serper search API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SERPLY_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for Serply search API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SEARCHAPI_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for SearchAPI.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SEARCHAPI_ENGINE`
 
 - Type: `str`
 - Description: Sets the SearchAPI engine.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `TAVILY_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for Tavily search API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
+
+#### `LINKUP_API_KEY`
+
+- Type: `str`
+- Default: `''`
+- Description: Sets the API key for the [Linkup](https://www.linkup.so/) search API. Required when `WEB_SEARCH_ENGINE=linkup`.
+- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `rag.web.search.linkup_api_key`.
+
+#### `LINKUP_SEARCH_PARAMS`
+
+- Type: `str` (JSON object)
+- Default: `''` (parsed to `{}`; effective defaults are `{"url": "https://api.linkup.so/v1/search", "depth": "standard", "outputType": "sourcedAnswer"}`)
+- Description: Optional JSON object merged over the Linkup request defaults. Recognized keys include `depth` (`standard` or `deep`), `outputType` (`sourcedAnswer` or `searchResults`), and `url` (overrides the API endpoint). `q` (the query) and `maxResults` are injected automatically and cannot be overridden. Invalid JSON falls back to `{}`. Configurable via **Admin Panel → Settings → Web Search** when the `linkup` engine is selected.
+- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `rag.web.search.linkup_search_params`.
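
The merge behavior described for `LINKUP_SEARCH_PARAMS` can be sketched in Python as follows. The function name `build_linkup_request` and the choice to return the endpoint separately from the payload are assumptions for illustration; only the default values and key names come from the description above.

```python
import json

# Effective defaults as listed in the variable description.
LINKUP_DEFAULTS = {
    "url": "https://api.linkup.so/v1/search",
    "depth": "standard",
    "outputType": "sourcedAnswer",
}


def build_linkup_request(raw_params: str, query: str, max_results: int):
    """Merge user-supplied JSON over the defaults and inject the fixed keys."""
    try:
        overrides = json.loads(raw_params) if raw_params else {}
    except json.JSONDecodeError:
        overrides = {}  # invalid JSON falls back to {}
    if not isinstance(overrides, dict):
        overrides = {}  # assumption: non-object JSON is treated like invalid input

    merged = {**LINKUP_DEFAULTS, **overrides}
    url = merged.pop("url")  # `url` selects the endpoint rather than being sent
    # `q` and `maxResults` are applied last, so user values cannot override them.
    payload = {**merged, "q": query, "maxResults": max_results}
    return url, payload
```

For example, `LINKUP_SEARCH_PARAMS='{"depth": "deep"}'` would change only the search depth and leave the endpoint and output type at their defaults.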
 
 #### `JINA_API_KEY`
 
 - Type: `str`
 - Description: Sets the API key for Jina.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `JINA_API_BASE_URL`
 
 - Type: `str`
 - Default: `https://s.jina.ai/`
 - Description: Sets the Base URL for Jina Search API. Useful for specifying custom or regional endpoints (e.g., `https://eu-s-beta.jina.ai/`).
-- Persistence: This environment variable is a `PersistentConfig` variable. It can be configured in the **Admin Panel > Settings > Web Search > Jina API Base URL**.
+- Persistence: This environment variable is a `ConfigVar` variable. It can be configured in the **Admin Panel > Settings > Web Search > Jina API Base URL**.
 
 #### `BING_SEARCH_V7_ENDPOINT`
 
 - Type: `str`
 - Default: `https://api.bing.microsoft.com/v7.0/search`
 - Description: Sets the endpoint for Bing Search API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `BING_SEARCH_V7_SUBSCRIPTION_KEY`
 
 - Type: `str`
 - Description: Sets the subscription key for Bing Search API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `BOCHA_SEARCH_API_KEY`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets the API key for Bocha Search API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `EXA_API_KEY`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets the API key for Exa search API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SERPAPI_API_KEY`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets the API key for SerpAPI.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SERPAPI_ENGINE`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the search engine to use for SerpAPI.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AZURE_AI_SEARCH_API_KEY`
 
 - Type: `str`
 - Default: `None`
 - Description: API key (query key or admin key) for authenticating with Azure AI Search service. Required for using Azure AI Search as a web search provider.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AZURE_AI_SEARCH_ENDPOINT`
 
@@ -4051,7 +4098,7 @@ Brave's free tier enforces a rate limit of 1 request per second. Open WebUI auto
 - Default: `None`
 - Description: Azure Search service endpoint URL. Specifies which Azure Search service instance to connect to.
 - Example: `https://myservice.search.windows.net`, `https://company-search.search.windows.net`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AZURE_AI_SEARCH_INDEX_NAME`
 
@@ -4059,105 +4106,105 @@ Brave's free tier enforces a rate limit of 1 request per second. Open WebUI auto
 - Default: `None`
 - Description: Name of the search index to query within your Azure Search service. Different indexes can contain different types of searchable content.
 - Example: `my-search-index`, `documents-index`, `knowledge-base`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SOUGOU_API_SID`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets the Sogou API SID.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SOUGOU_API_SK`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets the Sogou API SK.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OLLAMA_CLOUD_WEB_SEARCH_API_KEY`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets the Ollama Cloud Web Search API Key.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `TAVILY_EXTRACT_DEPTH`
 
 - Type: `str`
 - Default: `basic`
 - Description: Specifies the extract depth for Tavily search results.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `YACY_QUERY_URL`
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets the query URL for YaCy search engine integration. Should point to a YaCy instance's search API endpoint.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `YACY_USERNAME`
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Specifies the username for authenticated access to YaCy search engine.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `YACY_PASSWORD`
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Specifies the password for authenticated access to YaCy search engine.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `EXTERNAL_WEB_SEARCH_URL`
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Specifies the URL of an external web search service API endpoint for custom search integrations.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `EXTERNAL_WEB_SEARCH_API_KEY`
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets the API key for authenticating with the external web search service.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `EXTERNAL_WEB_LOADER_URL`
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Specifies the URL of an external web content loader service for fetching and processing web pages.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `EXTERNAL_WEB_LOADER_API_KEY`
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets the API key for authenticating with the external web loader service.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `YANDEX_WEB_SEARCH_URL`
 
 - Type: `str`
 - Default: `https://searchapi.api.cloud.yandex.net/v2/web/search`
 - Description: Specifies the URL of the Yandex Web Search service API endpoint.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `YANDEX_WEB_SEARCH_API_KEY`
 
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets the API key for authenticating with the Yandex Web Search service.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `YANDEX_WEB_SEARCH_CONFIG`
 
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Optional JSON configuration string for Yandex Web Search. Can be used to set parameters like `searchType` or `region` as per the Yandex API documentation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `PERPLEXITY_API_KEY`
 
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets the API key for Perplexity API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `PERPLEXITY_SEARCH_API_URL`
 
@@ -4165,14 +4212,14 @@ Brave's free tier enforces a rate limit of 1 request per second. Open WebUI auto
 - Default: `https://api.perplexity.ai/search`
 - Description: Configures the API endpoint for Perplexity Search. Allows using custom or self-hosted Perplexity-compatible API endpoints (such as LiteLLM's `/search` endpoint) instead of the hardcoded default for the official Perplexity API. This enables flexibility in routing search requests to alternative providers or internal proxies. **Note: If using LiteLLM, append the specific provider name to the URL path.**
 - Example: `http://my-litellm-server.com/search/perplexity-search`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `PERPLEXITY_MODEL`
 
 - Type: `str`
 - Default: `sonar`
 - Description: Specifies the Perplexity AI model to use for search queries when using `perplexity` as the web search engine.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -4186,7 +4233,7 @@ If you use `perplexity_search`, this variable is not relevant to you.
 - Type: `str`
 - Default: `medium`
 - Description: Controls the amount of search context used by Perplexity AI. Options typically include `low`, `medium`, `high`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -4200,7 +4247,7 @@ If you use `perplexity`, this variable is not relevant to you.
 - Type: `str`
 - Default: Empty string (' ')
 - Description: Sets the API key for [You.com](https://you.com/) YDC Index API web search. Required when `WEB_SEARCH_ENGINE` is set to `youcom`. Obtain an API key from [You.com API](https://you.com/api).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Web Loader Configuration
 
@@ -4215,7 +4262,7 @@ If you use `perplexity`, this variable is not relevant to you.
   - `firecrawl` - Uses Firecrawl service.
   - `tavily` - Uses Tavily service.
   - `external` - Uses an external web loader API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -4231,7 +4278,7 @@ When using `playwright`, you have two options:
 - Type: `str`
 - Default: `None`
 - Description: Specifies the WebSocket URI of a remote Playwright browser instance. When set, Open WebUI will use this remote browser instead of installing browser dependencies locally. This is particularly useful in containerized environments where you want to keep the Open WebUI container lightweight and separate browser concerns. Example: `ws://playwright:3000`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::tip
 
@@ -4248,35 +4295,35 @@ Using a remote Playwright browser via `PLAYWRIGHT_WS_URL` can be beneficial for:
 - Type: `str`
 - Default: `https://api.firecrawl.dev`
 - Description: Sets the base URL for Firecrawl API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `FIRECRAWL_API_KEY`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets the API key for Firecrawl API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `FIRECRAWL_TIMEOUT`
 
 - Type: `int`
 - Default: `None`
 - Description: Specifies the timeout in milliseconds for Firecrawl requests. If not set, the default Firecrawl timeout is used.
-- Persistence: This environment variable is a `PersistentConfig` variable. It can be configured in the **Admin Panel > Settings > Web Search > Firecrawl Timeout**.
+- Persistence: This environment variable is a `ConfigVar` variable. It can be configured in the **Admin Panel > Settings > Web Search > Firecrawl Timeout**.
 
 #### `PLAYWRIGHT_TIMEOUT`
 
 - Type: `int`
 - Default: `None` (empty)
 - Description: Specifies the timeout for Playwright requests.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `WEB_LOADER_TIMEOUT`
 
 - Type: `float`
 - Default: `None` (empty)
 - Description: Specifies the request timeout in seconds for the SafeWebBaseLoader when scraping web pages. Without this setting, web scraping operations can hang indefinitely on slow or unresponsive pages. Recommended values are 10–30 seconds depending on your network conditions.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
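The unset-versus-numeric behaviour described for `WEB_LOADER_TIMEOUT` can be sketched as follows. This is a minimal illustration, not Open WebUI's implementation: the helper name `read_timeout` and the rejection of non-positive values are assumptions made for the example.

```python
import os
from typing import Mapping, Optional


def read_timeout(
    name: str = "WEB_LOADER_TIMEOUT",
    env: Mapping[str, str] = os.environ,
) -> Optional[float]:
    """Return the timeout in seconds, or None when the variable is unset or empty."""
    raw = env.get(name, "").strip()
    if not raw:
        # Unset or empty: no timeout is configured.
        return None
    timeout = float(raw)  # raises ValueError for non-numeric input
    if timeout <= 0:
        # Assumption for this sketch: a non-positive timeout is a configuration error.
        raise ValueError(f"{name} must be positive, got {timeout}")
    return timeout
```

Passing the environment as a parameter keeps the helper easy to exercise without mutating `os.environ`, for example `read_timeout(env={"WEB_LOADER_TIMEOUT": "15"})`.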
 
 :::warning
 
@@ -4290,7 +4337,7 @@ This **timeout only applies when `WEB_LOADER_ENGINE` is set to `safe_web`** or l
 
 - Type: `str`
 - Description: Sets the proxy URL for YouTube loader.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `YOUTUBE_LOADER_LANGUAGE`
 
@@ -4305,7 +4352,7 @@ Note: If none of the specified languages are available and `en` was not in your
 
 :::
 
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ## Audio
 
@@ -4322,7 +4369,7 @@ Note: If none of the specified languages are available and `en` was not in your
 - Type: `str`
 - Default: `base`
 - Description: Sets the Whisper model to use for Speech-to-Text. The backend used is faster_whisper with quantization to `int8`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `WHISPER_MODEL_DIR`
 
@@ -4372,28 +4419,28 @@ Note: If none of the specified languages are available and `en` was not in your
   - `azure` - Uses Azure Cognitive Services for Speech-to-Text.
   - `mistral` - Uses Mistral API for Speech-to-Text.
 - Description: Specifies the Speech-to-Text engine to use. When left as an empty string (the default), the backend runs a local Whisper instance. Note: The "Web API" option seen in User Settings is a frontend-only setting that uses the browser's built-in speech recognition and does not call this backend endpoint at all.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_STT_MODEL`
 
 - Type: `str`
 - Default: `whisper-1`
 - Description: Specifies the Speech-to-Text model to use for OpenAI-compatible endpoints.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_STT_OPENAI_API_BASE_URL`
 
 - Type: `str`
 - Default: `${OPENAI_API_BASE_URL}`
 - Description: Sets the OpenAI-compatible base URL to use for Speech-to-Text.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_STT_OPENAI_API_KEY`
 
 - Type: `str`
 - Default: `${OPENAI_API_KEY}`
 - Description: Sets the OpenAI API key to use for Speech-to-Text.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Speech-to-Text (Azure)
 
@@ -4402,35 +4449,35 @@ Note: If none of the specified languages are available and `en` was not in your
 - Type: `str`
 - Default: `None`
 - Description: Specifies the Azure API key to use for Speech-to-Text.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_STT_AZURE_REGION`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the Azure region to use for Speech-to-Text.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_STT_AZURE_LOCALES`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the locales to use for Azure Speech-to-Text.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_STT_AZURE_BASE_URL`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies a custom Azure base URL for Speech-to-Text. Use this if you have a custom Azure endpoint.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_STT_AZURE_MAX_SPEAKERS`
 
 - Type: `int`
 - Default: `3`
 - Description: Sets the maximum number of speakers for Azure Speech-to-Text diarization.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Speech-to-Text (Deepgram)
 
@@ -4439,7 +4486,7 @@ Note: If none of the specified languages are available and `en` was not in your
 - Type: `str`
 - Default: `None`
 - Description: Specifies the Deepgram API key to use for Speech-to-Text.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Speech-to-Text (Mistral)
 
@@ -4448,21 +4495,21 @@ Note: If none of the specified languages are available and `en` was not in your
 - Type: `str`
 - Default: `None`
 - Description: Specifies the Mistral API key to use for Speech-to-Text.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_STT_MISTRAL_API_BASE_URL`
 
 - Type: `str`
 - Default: `https://api.mistral.ai/v1`
 - Description: Specifies the Mistral API base URL to use for Speech-to-Text.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS`
 
 - Type: `bool`
 - Default: `False`
 - Description: When enabled, uses the chat completions endpoint for Mistral Speech-to-Text instead of the dedicated transcription endpoint.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Speech-to-Text (General)
 
@@ -4471,14 +4518,14 @@ Note: If none of the specified languages are available and `en` was not in your
 - Type: `str`
 - Default: `None`
 - Description: Comma-separated list of supported audio MIME types for Speech-to-Text (e.g., `audio/wav,audio/mpeg,video/*`). Leave empty to use defaults.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_STT_ALLOWED_EXTENSIONS`
 
 - Type: `str`
 - Default: `mp3,wav,m4a,webm,ogg,flac,mp4,mpga,mpeg`
 - Description: Comma-separated list of audio file extensions accepted by the Speech-to-Text upload endpoint. Uploads with extensions outside this list are rejected with `400 Invalid audio file extension`. Comparison is case-insensitive. Set to an empty value to skip the extension check (MIME-type validation via `AUDIO_STT_SUPPORTED_CONTENT_TYPES` still applies). Configurable via **Admin Settings → Audio → STT**.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
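The extension check described for `AUDIO_STT_ALLOWED_EXTENSIONS` (comma-separated, case-insensitive, skipped when empty) might look roughly like this. The function names are hypothetical and the code is a sketch of the documented behaviour, not the backend's actual code path.

```python
DEFAULT_EXTENSIONS = "mp3,wav,m4a,webm,ogg,flac,mp4,mpga,mpeg"


def parse_allowed_extensions(raw: str) -> set[str]:
    """Split a comma-separated value into a lowercase set, dropping blanks."""
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def is_extension_allowed(filename: str, raw_allowed: str = DEFAULT_EXTENSIONS) -> bool:
    """Return True if the upload would pass the extension check."""
    allowed = parse_allowed_extensions(raw_allowed)
    if not allowed:
        # An empty setting skips the extension check entirely.
        return True
    # rpartition leaves the separator empty when the name contains no dot.
    _, dot, ext = filename.rpartition(".")
    return bool(dot) and ext.lower() in allowed
```

In this sketch a file without any extension is rejected whenever the list is non-empty; that choice is an assumption, since the description above does not cover that case.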
 
 ### Text-to-Speech
 
@@ -4486,7 +4533,7 @@ Note: If none of the specified languages are available and `en` was not in your
 
 - Type: `str`
 - Description: Sets the API key for Text-to-Speech.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_TTS_ENGINE`
 
@@ -4499,28 +4546,28 @@ Note: If none of the specified languages are available and `en` was not in your
   - `azure` - Uses Azure Cognitive Services for Text-to-Speech.
   - `transformers` - Uses a local SentenceTransformers-based model for Text-to-Speech (runs on the backend).
 - Description: Specifies the Text-to-Speech engine to use on the backend. When left as an empty string (the default), no backend TTS service is configured, and audio playback relies entirely on the user's browser capabilities or frontend options like "Browser Kokoro".
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_TTS_MODEL`
 
 - Type: `str`
 - Default: `tts-1`
 - Description: Specifies the OpenAI text-to-speech model to use.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_TTS_VOICE`
 
 - Type: `str`
 - Default: `alloy`
 - Description: Sets the OpenAI text-to-speech voice to use.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_TTS_SPLIT_ON`
 
 - Type: `str`
 - Default: `punctuation`
 - Description: Sets how text is split into chunks for OpenAI text-to-speech.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Azure Text-to-Speech
 
@@ -4528,21 +4575,21 @@ Note: If none of the specified languages are available and `en` was not in your
 
 - Type: `str`
 - Description: Sets the region for Azure Text to Speech.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_TTS_AZURE_SPEECH_OUTPUT_FORMAT`
 
 - Type: `str`
 - Default: `audio-24khz-160kbitrate-mono-mp3`
 - Description: Sets the output format for Azure Text to Speech.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_TTS_AZURE_SPEECH_BASE_URL`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies a custom Azure Speech base URL for Text-to-Speech. Use this if you have a custom Azure endpoint.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Voice Mode
 
@@ -4551,13 +4598,13 @@ Note: If none of the specified languages are available and `en` was not in your
 - Type: `bool`
 - Default: `True`
 - Description: Master switch for the voice-mode system prompt. When `True`, voice-mode chats prepend either the custom `VOICE_MODE_PROMPT_TEMPLATE` (if set) or `DEFAULT_VOICE_MODE_PROMPT_TEMPLATE`. When `False`, no voice-specific system prompt is injected — the model uses only the regular system prompt and chat history. Configurable in **Admin Settings → Interface → Voice Mode Custom Prompt** (toggle).
-- Persistence: This environment variable is a `PersistentConfig` variable. Stored at config key `task.voice.prompt.enable`.
+- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `task.voice.prompt.enable`.
 
 #### `VOICE_MODE_PROMPT_TEMPLATE`
 - Type: `str`
 - Default: The value of the `DEFAULT_VOICE_MODE_PROMPT_TEMPLATE` environment variable.
 - Description: Configures a custom system prompt for voice mode interactions. Allows administrators to control how the AI responds in voice conversations (style, length, tone). Leave empty to use the default prompt optimized for voice conversations, or provide custom instructions to tailor the voice assistant's behavior. Only applied when `ENABLE_VOICE_MODE_PROMPT` is `True`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### OpenAI Text-to-Speech
 
@@ -4566,14 +4613,14 @@ Note: If none of the specified languages are available and `en` was not in your
 - Type: `str`
 - Default: `${OPENAI_API_BASE_URL}`
 - Description: Sets the OpenAI-compatible base URL to use for text-to-speech.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_TTS_OPENAI_API_KEY`
 
 - Type: `str`
 - Default: `${OPENAI_API_KEY}`
 - Description: Sets the API key to use for text-to-speech.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_TTS_OPENAI_PARAMS`
 
@@ -4581,7 +4628,7 @@ Note: If none of the specified languages are available and `en` was not in your
 - Default: `{}`
 - Description: Additional parameters for OpenAI-compatible TTS API in JSON format. Allows customization of API-specific settings.
 - Example: `{"speed": 1.0}`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
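As a rough illustration of how a JSON value such as `AUDIO_TTS_OPENAI_PARAMS` could be folded into a request body, consider the sketch below. The helper `build_tts_payload` is hypothetical, and the merge order (extra parameters applied last) is an assumption rather than confirmed Open WebUI behaviour. The `model`, `voice`, and `input` fields follow the OpenAI speech API.

```python
import json


def build_tts_payload(text: str, model: str, voice: str, raw_params: str = "{}") -> dict:
    """Merge extra JSON parameters into a basic text-to-speech request body."""
    extra = json.loads(raw_params or "{}")
    if not isinstance(extra, dict):
        raise ValueError("AUDIO_TTS_OPENAI_PARAMS must be a JSON object")
    payload = {"model": model, "voice": voice, "input": text}
    # Assumption: extra parameters are applied last and may override the base fields.
    payload.update(extra)
    return payload
```

With the documented example value, `build_tts_payload("hi", "tts-1", "alloy", '{"speed": 1.0}')` adds a `speed` key alongside the three base fields.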
 
 ### Mistral Text-to-Speech
 
@@ -4590,14 +4637,14 @@ Note: If none of the specified languages are available and `en` was not in your
 - Type: `str`
 - Default: `None`
 - Description: Sets the API key used for Mistral Text-to-Speech.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUDIO_TTS_MISTRAL_API_BASE_URL`
 
 - Type: `str`
 - Default: `https://api.mistral.ai/v1`
 - Description: Sets the base URL used for Mistral Text-to-Speech.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -4612,7 +4659,7 @@ When `AUDIO_TTS_ENGINE=mistral`, Open WebUI uses `mistral-tts-latest` when `AUDI
 - Type: `str`
 - Default: `https://api.elevenlabs.io`
 - Description: Configures custom ElevenLabs API endpoints, enabling support for EU residency API requirements and other regional deployments.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ## Image Generation
 
@@ -4623,21 +4670,21 @@ When `AUDIO_TTS_ENGINE=mistral`, Open WebUI uses `mistral-tts-latest` when `AUDI
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables image generation features.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_IMAGE_PROMPT_GENERATION`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables automatic enhancement of user prompts for better image generation results.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `IMAGE_PROMPT_GENERATION_PROMPT_TEMPLATE`
 
 - Type: `str`
 - Default: `None`
 - Description: Specifies the template to use for generating image prompts.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 `DEFAULT_IMAGE_PROMPT_GENERATION_PROMPT_TEMPLATE`:
 ```
@@ -4676,21 +4723,21 @@ Strictly return in JSON format:
   - `gemini` - Uses Gemini for image generation.
 - Default: `openai`
 - Description: Specifies the engine to use for image generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `IMAGE_GENERATION_MODEL`
 
 - Type: `str`
 - Default: ``
 - Description: Default model to use for image generation (e.g., `dall-e-3`, `gemini-2.0-flash-exp`).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `IMAGE_SIZE`
 
 - Type: `str`
 - Default: `512x512`
 - Description: Sets the default output dimensions for generated images in WIDTHxHEIGHT format (e.g., `1024x1024`). Set to `auto` to let the model determine the appropriate size (only supported by models matching `IMAGE_AUTO_SIZE_MODELS_REGEX_PATTERN`).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
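The `WIDTHxHEIGHT` format and the special `auto` value can be validated with a small parser. The sketch below is illustrative only: `parse_image_size` is a hypothetical helper, and accepting an uppercase `X` separator is an assumption made for leniency.

```python
import re
from typing import Optional, Tuple

SIZE_PATTERN = re.compile(r"^(\d+)x(\d+)$")


def parse_image_size(value: str) -> Optional[Tuple[int, int]]:
    """Return (width, height), or None when the value is `auto`."""
    value = value.strip().lower()
    if value == "auto":
        # The model chooses the size; no explicit dimensions are returned.
        return None
    match = SIZE_PATTERN.match(value)
    if match is None:
        raise ValueError(f"expected WIDTHxHEIGHT or 'auto', got {value!r}")
    return int(match.group(1)), int(match.group(2))
```

Returning `None` for `auto` lets a caller omit the size from the request when the selected model matches `IMAGE_AUTO_SIZE_MODELS_REGEX_PATTERN`.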
 
 #### `IMAGE_AUTO_SIZE_MODELS_REGEX_PATTERN`
 
@@ -4709,7 +4756,7 @@ Strictly return in JSON format:
 - Type: `int`
 - Default: `50`
 - Description: Sets the default iteration steps for image generation. Used for ComfyUI and AUTOMATIC1111 engines.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ---
 
@@ -4720,7 +4767,7 @@ Strictly return in JSON format:
 - Type: `bool`
 - Default: `True`
 - Description: When disabled, Image Editing will not be used and instead, images will be created only using the image generation engine.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `IMAGE_EDIT_ENGINE`
 
@@ -4731,21 +4778,21 @@ Strictly return in JSON format:
   - `comfyui` - Uses ComfyUI engine for image editing.
 - Default: `openai`
 - Description: Configures the engine used for image editing operations, enabling modification of existing images using text prompts.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `IMAGE_EDIT_MODEL`
 
 - Type: `str`
 - Default: ``
 - Description: Specifies the model to use for image editing operations within the selected engine (e.g., `dall-e-2`, `gemini-2.5-flash`).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `IMAGE_EDIT_SIZE`
 
 - Type: `str`
 - Default: ``
 - Description: Defines the output dimensions for edited images in WIDTHxHEIGHT format (e.g., `1024x1024`). Leave empty to preserve original dimensions.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ---
 
@@ -4758,21 +4805,21 @@ Strictly return in JSON format:
 - Type: `str`
 - Default: `${OPENAI_API_BASE_URL}`
 - Description: Sets the OpenAI-compatible base URL to use for DALL-E image generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `IMAGES_OPENAI_API_VERSION`
 
 - Type: `str`
 - Default: `${OPENAI_API_VERSION}`
 - Description: Optional setting. If provided, it sets the `api-version` query parameter when calling the image generation endpoint. Required for Azure OpenAI deployments.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `IMAGES_OPENAI_API_KEY`
 
 - Type: `str`
 - Default: `${OPENAI_API_KEY}`
 - Description: Sets the API key to use for DALL-E image generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `IMAGES_OPENAI_API_PARAMS`
 
@@ -4780,7 +4827,7 @@ Strictly return in JSON format:
 - Default: `{}`
 - Description: Additional parameters for OpenAI image generation API in JSON format. Allows customization of API-specific settings such as quality parameters for DALL-E models (e.g., `{"quality": "hd"}` for dall-e-3).
 - Example: `{"quality": "hd", "style": "vivid"}`
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### Image Editing
 
@@ -4789,21 +4836,21 @@ Strictly return in JSON format:
 - Type: `str`
 - Default: `${OPENAI_API_BASE_URL}`
 - Description: Configures the OpenAI API base URL specifically for image editing operations, allowing separate endpoints from image generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `IMAGES_EDIT_OPENAI_API_VERSION`
 
 - Type: `str`
 - Default: ``
 - Description: Specifies the OpenAI API version for image editing, enabling support for Azure OpenAI deployments with versioned endpoints.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `IMAGES_EDIT_OPENAI_API_KEY`
 
 - Type: `str`
 - Default: `${OPENAI_API_KEY}`
 - Description: Provides authentication for OpenAI image editing API requests, with support for separate keys from image generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ---
 
@@ -4822,14 +4869,14 @@ For a detailed setup guide and example configuration, please refer to the [Gemin
 - Type: `str`
 - Default: `${GEMINI_API_BASE_URL}`
 - Description: Specifies the URL to Gemini's image generation API.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `IMAGES_GEMINI_API_KEY`
 
 - Type: `str`
 - Default: `${GEMINI_API_KEY}`
 - Description: Sets the Gemini API key for image generation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `IMAGES_GEMINI_ENDPOINT_METHOD`
 
@@ -4839,7 +4886,7 @@ For a detailed setup guide and example configuration, please refer to the [Gemin
   - `generateContent` - Uses the generateContent endpoint (for Gemini 2.5 Flash and newer models).
 - Default: ``
 - Description: Specifies the Gemini API endpoint method for image generation, supporting both legacy Imagen models and newer Gemini models with image generation capabilities.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### Image Editing
 
@@ -4848,14 +4895,14 @@ For a detailed setup guide and example configuration, please refer to the [Gemin
 - Type: `str`
 - Default: `${GEMINI_API_BASE_URL}`
 - Description: Configures the Gemini API base URL for image editing operations with Gemini models.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `IMAGES_EDIT_GEMINI_API_KEY`
 
 - Type: `str`
 - Default: `${GEMINI_API_KEY}`
 - Description: Provides authentication for Gemini image editing API requests.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ---
 
@@ -4868,14 +4915,14 @@ For a detailed setup guide and example configuration, please refer to the [Gemin
 - Type: `str`
 - Default: ``
 - Description: Specifies the URL to the ComfyUI image generation API (e.g., `http://127.0.0.1:8188`).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `COMFYUI_API_KEY`
 
 - Type: `str`
 - Default: ``
 - Description: Sets the API key for ComfyUI authentication.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `COMFYUI_WORKFLOW`
 
@@ -4965,14 +5012,14 @@ For a detailed setup guide and example configuration, please refer to the [Gemin
 ```
 
 - Description: Defines the ComfyUI workflow configuration in JSON format. Export from ComfyUI using "Save (API Format)" to ensure compatibility.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `COMFYUI_WORKFLOW_NODES`
 
 - Type: `list[dict]`
 - Default: `[]`
 - Description: Specifies the ComfyUI workflow node mappings for image generation, defining which nodes handle prompt, model, dimensions, and other parameters. Configured automatically via the admin UI.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### Image Editing
 
@@ -4981,28 +5028,28 @@ For a detailed setup guide and example configuration, please refer to the [Gemin
 - Type: `str`
 - Default: ``
 - Description: Configures the ComfyUI base URL for image editing operations, enabling self-hosted ComfyUI workflows for image manipulation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `IMAGES_EDIT_COMFYUI_API_KEY`
 
 - Type: `str`
 - Default: ``
 - Description: Provides authentication for ComfyUI image editing API requests when the ComfyUI instance requires API key authentication.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `IMAGES_EDIT_COMFYUI_WORKFLOW`
 
 - Type: `str` (JSON)
 - Default: ``
 - Description: Defines the ComfyUI workflow configuration in JSON format for image editing operations. Must include nodes for image input, prompt, and output. Export from ComfyUI using "Save (API Format)".
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ##### `IMAGES_EDIT_COMFYUI_WORKFLOW_NODES`
 
 - Type: `list[dict]`
 - Default: `[]`
 - Description: Specifies the ComfyUI workflow node mappings for image editing, defining which nodes handle image input, prompt, model, dimensions, and other parameters. Configured automatically via the admin UI.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ---
 
@@ -5013,21 +5060,21 @@ For a detailed setup guide and example configuration, please refer to the [Gemin
 - Type: `str`
 - Default: ``
 - Description: Specifies the URL to AUTOMATIC1111's Stable Diffusion API (e.g., `http://127.0.0.1:7860`).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUTOMATIC1111_API_AUTH`
 
 - Type: `str`
 - Default: ``
 - Description: Sets the AUTOMATIC1111 API authentication credentials if required.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `AUTOMATIC1111_PARAMS`
 
 - Type: `str` (JSON)
 - Default: `{}`
 - Description: Additional parameters in JSON format to pass to AUTOMATIC1111 API requests (e.g., `{"cfg_scale": 7, "sampler_name": "Euler a", "scheduler": "normal"}`).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ## OAuth
 
@@ -5042,7 +5089,7 @@ You can only configure one OAUTH provider at a time. You cannot have two or more
 - Type: `bool`
 - Default: `False`
 - Description: Enables account creation when signing up via OAuth. Distinct from `ENABLE_SIGNUP`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::danger
 
@@ -5067,7 +5114,7 @@ By default, OAuth configurations are stored in the database and managed via the
 - Type: `str`
 - Default: `None`
 - Description: Overrides the default claim used to identify a user's unique ID (`sub`) from the OAuth/OIDC provider's user info response. By default, Open WebUI attempts to infer this from the provider's configuration. This variable allows you to explicitly specify which claim to use. For example, if your identity provider uses 'employee_id' as the unique identifier, you would set this variable to 'employee_id'.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_MERGE_ACCOUNTS_BY_EMAIL`
 
@@ -5075,13 +5122,13 @@ By default, OAuth configurations are stored in the database and managed via the
 - Default: `False`
 - Description: If enabled, merges OAuth accounts with existing accounts using the same email
 address. This is considered unsafe because not all OAuth providers verify email addresses, which can lead to account takeovers.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_OAUTH_WITHOUT_EMAIL`
 - Type: `bool`
 - Default: `False`
 - Description: Enables authentication with OpenID Connect (OIDC) providers that do not support or expose an email scope. When enabled, Open WebUI will create and manage user accounts without requiring an email address from the OAuth provider.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::warning
 
@@ -5109,7 +5156,7 @@ For most standard OAuth providers (Google, Microsoft, GitHub, etc.), this settin
 - Type: `bool`
 - Default: `False`
 - Description: If enabled, updates the local user profile picture with the OAuth-provided picture on login.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -5257,27 +5304,27 @@ You must also set `OPENID_PROVIDER_URL` or otherwise logout may not work.
 
 - Type: `str`
 - Description: Sets the client ID for Google OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `GOOGLE_CLIENT_SECRET`
 
 - Type: `str`
 - Description: Sets the client secret for Google OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `GOOGLE_OAUTH_SCOPE`
 
 - Type: `str`
 - Default: `openid email profile`
 - Description: Sets the scope for Google OAuth authentication.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `GOOGLE_REDIRECT_URI`
 
 - Type: `str`
 - Default: `<backend>/oauth/google/callback`
 - Description: Sets the redirect URI for Google OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `GOOGLE_OAUTH_AUTHORIZE_PARAMS`
 
@@ -5310,47 +5357,47 @@ You must also set `OPENID_PROVIDER_URL` or otherwise logout may not work.
 
 - Type: `str`
 - Description: Sets the client ID for Microsoft OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MICROSOFT_CLIENT_SECRET`
 
 - Type: `str`
 - Description: Sets the client secret for Microsoft OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MICROSOFT_CLIENT_TENANT_ID`
 
 - Type: `str`
 - Description: Sets the tenant ID for Microsoft OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MICROSOFT_OAUTH_SCOPE`
 
 - Type: `str`
 - Default: `openid email profile`
 - Description: Sets the scope for Microsoft OAuth authentication. This scope is also included in refresh token requests when `OAUTH_REFRESH_TOKEN_INCLUDE_SCOPE` is enabled, which is required by Azure AD to avoid `AADSTS90009` errors. If you use custom API scopes, include them here (e.g., `openid email profile offline_access api://<Application ID URI>/<custom_scope>`).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MICROSOFT_REDIRECT_URI`
 
 - Type: `str`
 - Default: `<backend>/oauth/microsoft/callback`
 - Description: Sets the redirect URI for Microsoft OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MICROSOFT_CLIENT_LOGIN_BASE_URL`
 
 - Type: `str`
 - Default: `https://login.microsoftonline.com`
 - Description: Sets the base login URL for Microsoft OAuth authentication. Allows configuration of alternative login endpoints for government clouds or custom deployments.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `MICROSOFT_CLIENT_PICTURE_URL`
 
 - Type: `str`
 - Default: `https://graph.microsoft.com/v1.0/me/photo/$value`
 - Description: Specifies the Microsoft Graph API endpoint for retrieving user profile pictures during OAuth authentication.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### GitHub
 
@@ -5366,27 +5413,27 @@ You must also set `OPENID_PROVIDER_URL` or otherwise logout may not work.
 
 - Type: `str`
 - Description: Sets the client ID for GitHub OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `GITHUB_CLIENT_SECRET`
 
 - Type: `str`
 - Description: Sets the client secret for GitHub OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `GITHUB_CLIENT_SCOPE`
 
 - Type: `str`
 - Default: `user:email`
 - Description: Specifies the scope for GitHub OAuth authentication.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `GITHUB_CLIENT_REDIRECT_URI`
 
 - Type: `str`
 - Default: `<backend>/oauth/github/callback`
 - Description: Sets the redirect URI for GitHub OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Feishu
 
@@ -5396,26 +5443,26 @@ See https://open.feishu.cn/document/sso/web-application-sso/login-overview
 
 - Type: `str`
 - Description: Sets the client ID for Feishu OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `FEISHU_CLIENT_SECRET`
 
 - Type: `str`
 - Description: Sets the client secret for Feishu OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `FEISHU_OAUTH_SCOPE`
 
 - Type: `str`
 - Default: `contact:user.base:readonly`
 - Description: Specifies the scope for Feishu OAuth authentication.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `FEISHU_REDIRECT_URI`
 
 - Type: `str`
 - Description: Sets the redirect URI for Feishu OAuth.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### OpenID (OIDC)
 
@@ -5423,19 +5470,19 @@ See https://open.feishu.cn/document/sso/web-application-sso/login-overview
 
 - Type: `str`
 - Description: Sets the client ID for OIDC.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_CLIENT_SECRET`
 
 - Type: `str`
 - Description: Sets the client secret for OIDC.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OPENID_PROVIDER_URL`
 
 - Type: `str`
 - Description: Path to the `.well-known/openid-configuration` endpoint.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::danger
 
@@ -5451,21 +5498,21 @@ Alternatively, if your provider does not support standard OIDC discovery (e.g.,
 - Type: `str`
 - Default: Empty string (`""`)
 - Description: Sets a custom end-session (logout) endpoint URL. When configured, Open WebUI will redirect to this URL on logout instead of attempting OIDC discovery via `OPENID_PROVIDER_URL`. This is useful for OAuth providers that do not support standard OIDC discovery for logout, such as AWS Cognito.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OPENID_REDIRECT_URI`
 
 - Type: `str`
 - Default: `<backend>/oauth/oidc/callback`
 - Description: Sets the redirect URI for OIDC.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_SCOPES`
 
 - Type: `str`
 - Default: `openid email profile`
 - Description: Sets the scope for OIDC authentication. `openid` and `email` are required.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_CODE_CHALLENGE_METHOD`
 
@@ -5474,35 +5521,35 @@ Alternatively, if your provider does not support standard OIDC discovery (e.g.,
   - `S256` - Hash `code_verifier` with SHA-256.
 - Default: Empty string (`''`), since the underlying default is `None`.
 - Description: Specifies the code challenge method for OAuth authentication. Set to `S256` when PKCE is required by the provider.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
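
As a rough illustration of what `S256` means, the sketch below derives a PKCE code challenge from a code verifier as defined in RFC 7636. The helper names `make_code_verifier` and `s256_code_challenge` are hypothetical and are not part of Open WebUI. The OAuth client normally performs this step for you.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    # RFC 7636 requires 43-128 characters; 32 random bytes encode to 43.
    return secrets.token_urlsafe(32)


def s256_code_challenge(code_verifier: str) -> str:
    # BASE64URL(SHA256(ASCII(code_verifier))) with "=" padding removed.
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorization request and the original verifier with the token request, so the provider can check that both came from the same client.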
 
 #### `OAUTH_PROVIDER_NAME`
 
 - Type: `str`
 - Default: `SSO`
 - Description: Sets the name for the OIDC provider.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_USERNAME_CLAIM`
 
 - Type: `str`
 - Default: `name`
 - Description: Sets the username claim for OpenID.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_EMAIL_CLAIM`
 
 - Type: `str`
 - Default: `email`
 - Description: Sets the email claim for OpenID.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_PICTURE_CLAIM`
 
 - Type: `str`
 - Default: `picture`
 - Description: Sets the picture (avatar) claim for OpenID.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -5515,28 +5562,28 @@ If `OAUTH_PICTURE_CLAIM` is set to `''` (empty string), then the OAuth picture c
 - Type: `str`
 - Default: `groups`
 - Description: Specifies the group claim for OAuth authentication.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_OAUTH_ROLE_MANAGEMENT`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enables role management for OAuth delegation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_OAUTH_GROUP_MANAGEMENT`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables OAuth group management.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_OAUTH_GROUP_CREATION`
 
 - Type: `bool`
 - Default: `False`
 - Description: When enabled, groups from OAuth claims that don't exist in Open WebUI will be automatically created.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_GROUP_DEFAULT_SHARE`
 
@@ -5547,35 +5594,35 @@ If `OAUTH_PICTURE_CLAIM` is set to `''` (empty string), then the OAuth picture c
   - `members` — Groups created via OAuth will only allow sharing with group members.
   - `false` — Groups created via OAuth will have sharing disabled (no one can share).
 - Description: Controls the default sharing permission for groups that are automatically created via OAuth group management. Only applies when `ENABLE_OAUTH_GROUP_CREATION` is enabled. Existing groups are not affected by this setting.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_BLOCKED_GROUPS`
 
 - Type: `str`
 - Default: `[]`
 - Description: JSON array of group names that are blocked from accessing the application. Users belonging to these groups will be denied access even if they have valid OAuth credentials.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
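
A minimal sketch of how a JSON array value such as `["contractors", "suspended"]` could be interpreted. The helpers `parse_blocked_groups` and `is_blocked` are hypothetical, and exact-name matching is an assumption; Open WebUI's own matching rules may differ.

```python
import json


def parse_blocked_groups(raw: str) -> set[str]:
    # Hypothetical helper: parse the JSON array held in OAUTH_BLOCKED_GROUPS.
    groups = json.loads(raw or "[]")
    if not isinstance(groups, list):
        raise ValueError("OAUTH_BLOCKED_GROUPS must be a JSON array")
    return {str(g) for g in groups}


def is_blocked(user_groups: list[str], blocked: set[str]) -> bool:
    # Assumption: a user is denied if any of their groups is on the list.
    return any(g in blocked for g in user_groups)
```

Because the value must be valid JSON, group names need double quotes and the whole array is usually wrapped in single quotes when set from a shell.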
 
 #### `OAUTH_ROLES_CLAIM`
 
 - Type: `str`
 - Default: `roles`
 - Description: Sets the roles claim to look for in the OIDC token.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_ALLOWED_ROLES`
 
 - Type: `str`
 - Default: `user,admin`
 - Description: Sets the roles that are allowed access to the platform.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_ADMIN_ROLES`
 
 - Type: `str`
 - Default: `admin`
 - Description: Sets the roles that are considered administrators.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_ROLES_SEPARATOR`
 
@@ -5594,7 +5641,7 @@ If `OAUTH_PICTURE_CLAIM` is set to `''` (empty string), then the OAuth picture c
 - Type: `str`
 - Default: `*`
 - Description: Specifies the allowed domains for OAuth authentication as a comma-separated list (e.g., `example1.com,example2.com`).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `OAUTH_AUDIENCE`
 
@@ -5644,7 +5691,7 @@ If you set both `OAUTH_AUDIENCE` and `OAUTH_AUTHORIZE_PARAMS` containing an `aud
 - Type: `bool`
 - Default: `False`
 - Description: When enabled, includes the configured OAuth scope in refresh token requests. Some OAuth providers, notably Microsoft Azure AD, require the scope to be explicitly provided when refreshing a token. Without it, Azure AD treats the request as the application requesting a token for itself, resulting in `AADSTS90009` errors. Enable this if you encounter token refresh failures with your OAuth provider.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -5659,117 +5706,117 @@ This setting is compliant with [RFC 6749 Section 6](https://datatracker.ietf.org
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables LDAP authentication.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_SERVER_LABEL`
 
 - Type: `str`
 - Description: Sets the label of the LDAP server.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_SERVER_HOST`
 
 - Type: `str`
 - Default: `localhost`
 - Description: Sets the hostname of the LDAP server.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_SERVER_PORT`
 
 - Type: `int`
 - Default: `389`
 - Description: Sets the port number of the LDAP server.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_ATTRIBUTE_FOR_MAIL`
 
 - Type: `str`
 - Description: Sets the attribute to use as mail for LDAP authentication.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_ATTRIBUTE_FOR_USERNAME`
 
 - Type: `str`
 - Description: Sets the attribute to use as a username for LDAP authentication.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_APP_DN`
 
 - Type: `str`
 - Description: Sets the distinguished name for the LDAP application.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_APP_PASSWORD`
 
 - Type: `str`
 - Description: Sets the password for the LDAP application.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_SEARCH_BASE`
 
 - Type: `str`
 - Description: Sets the base to search for LDAP authentication.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_SEARCH_FILTER`
 
 - Type: `str`
 - Default: `None`
 - Description: Sets additional filter conditions for LDAP user search. This filter is **appended** to the automatically-generated username filter. Open WebUI automatically constructs the username portion of the filter using `LDAP_ATTRIBUTE_FOR_USERNAME`, so you should **not** include user placeholders like `%(user)s` or `%s` — these are not supported. Use this for additional conditions such as group membership restrictions (e.g., `(memberOf=cn=allowed-users,ou=groups,dc=example,dc=com)`). Alternative to `LDAP_SEARCH_FILTERS`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
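
To illustrate what "appended to the automatically-generated username filter" can look like, here is a sketch that combines a username clause with an extra condition. The function `build_user_filter` and the exact `(&...)` layout are assumptions for illustration, not Open WebUI's actual implementation.

```python
def build_user_filter(username_attr: str, username: str, extra_filter: str = "") -> str:
    # Escape characters that are special in LDAP filters (RFC 4515).
    # The backslash must be escaped first so later escapes are not doubled.
    escaped = (
        username.replace("\\", "\\5c")
        .replace("*", "\\2a")
        .replace("(", "\\28")
        .replace(")", "\\29")
        .replace("\x00", "\\00")
    )
    # Assumed layout: AND the username clause with the extra condition.
    return f"(&({username_attr}={escaped}){extra_filter})"
```

With `uid` as the username attribute and the group restriction from the description, this yields `(&(uid=alice)(memberOf=cn=allowed-users,ou=groups,dc=example,dc=com))`, which shows why the configured value should contain only the additional condition and no user placeholder.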
 
 #### `LDAP_SEARCH_FILTERS`
 
 - Type: `str`
 - Description: Sets additional filter conditions for LDAP user search. This is an alias for `LDAP_SEARCH_FILTER`. The filter is appended to the automatically-generated username filter — do **not** include user placeholders like `%(user)s` or `%s`.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_USE_TLS`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables TLS for LDAP connection.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_CA_CERT_FILE`
 
 - Type: `str`
 - Description: Sets the path to the LDAP CA certificate file.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_VALIDATE_CERT`
 
 - Type: `bool`
 - Description: Sets whether to validate the LDAP CA certificate.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_CIPHERS`
 
 - Type: `str`
 - Default: `ALL`
 - Description: Sets the ciphers to use for LDAP connection.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_LDAP_GROUP_MANAGEMENT`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enables the group management feature.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `ENABLE_LDAP_GROUP_CREATION`
 
 - Type: `bool`
 - Default: `False`
 - Description: If a group from LDAP does not exist in Open WebUI, it will be created automatically.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `LDAP_ATTRIBUTE_FOR_GROUPS`
 
 - Type: `str`
 - Default: `memberOf`
 - Description: Specifies the LDAP attribute that contains the user's group memberships. `memberOf` is a standard attribute for this purpose in Active Directory environments.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ## SCIM
 
@@ -5778,14 +5825,14 @@ This setting is compliant with [RFC 6749 Section 6](https://datatracker.ietf.org
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables SCIM 2.0 (System for Cross-domain Identity Management) support for automated user and group provisioning from identity providers like Okta, Azure AD, and Google Workspace.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `SCIM_TOKEN`
 
 - Type: `str`
 - Default: `""`
 - Description: Sets the bearer token for SCIM authentication. This token must be provided by identity providers when making SCIM API requests. Generate a secure random token (e.g., using `openssl rand -base64 32`) and configure it in both Open WebUI and your identity provider.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
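
If `openssl` is not available, a comparable token can be generated with the Python standard library. This is a sketch only; `generate_scim_token` is a hypothetical helper name.

```python
import base64
import secrets


def generate_scim_token(num_bytes: int = 32) -> str:
    # 32 random bytes, Base64-encoded, similar to `openssl rand -base64 32`.
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
```

Set the resulting string as `SCIM_TOKEN` and configure the same value as the bearer token in the identity provider.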
 
 #### `SCIM_AUTH_PROVIDER`
 
@@ -5802,7 +5849,7 @@ This setting is compliant with [RFC 6749 Section 6](https://datatracker.ietf.org
 - Type: `bool`
 - Default: `True`
 - Description: Acts as a master switch to enable or disable the main "Controls" button and panel in the chat interface. **If this is set to False, users will not see the Controls button, and the granular permissions below will have no effect**.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_VALVES`
 
@@ -5827,7 +5874,7 @@ This setting is compliant with [RFC 6749 Section 6](https://datatracker.ietf.org
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables user permission to upload files to chats.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_WEB_UPLOAD`
 
@@ -5840,80 +5887,80 @@ This setting is compliant with [RFC 6749 Section 6](https://datatracker.ietf.org
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables user permission to delete chats.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_EDIT`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables user permission to edit chats.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_DELETE_MESSAGE`
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables user permission to delete individual messages within chats. This provides granular control over message deletion capabilities separate from full chat deletion.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_CONTINUE_RESPONSE`
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables user permission to continue AI responses. When disabled, users cannot use the "Continue Response" button, which helps prevent potential system prompt leakage through response continuation manipulation.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_REGENERATE_RESPONSE`
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables user permission to regenerate AI responses. Controls access to both the standard regenerate button and the guided regeneration menu.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_RATE_RESPONSE`
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables user permission to rate AI responses using the thumbs up/down feedback system. This controls access to the response rating functionality for evaluation and feedback collection.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_STT`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables user permission to use Speech-to-Text in chats.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_TTS`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables user permission to use Text-to-Speech in chats.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_CALL`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables user permission to make calls in chats.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_MULTIPLE_MODELS`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables user permission to use multiple models in chats.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_TEMPORARY`
 
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables user permission to create temporary chats. **Note:** Temporary chats disable backend document parsing for privacy.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_TEMPORARY_ENFORCED`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables enforced temporary chats for users.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Feature Permissions
 
@@ -5922,42 +5969,42 @@ This setting is compliant with [RFC 6749 Section 6](https://datatracker.ietf.org
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables user permission to access direct tool servers.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_FEATURES_WEB_SEARCH`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables user permission to use the web search feature.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_FEATURES_IMAGE_GENERATION`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables user permission to use the image generation feature.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_FEATURES_CODE_INTERPRETER`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables user permission to use the code interpreter feature.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_FEATURES_MEMORIES`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables user permission to use the [memory feature](/features/chat-conversations/memory).
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_FEATURES_AUTOMATIONS`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables Automations access for non-admin users. When enabled, users can access `/automations` and manage their own scheduled automations.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -5975,7 +6022,7 @@ See [Permissions](/features/authentication-access/rbac/permissions#4-features-pe
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables Calendar access for non-admin users. When enabled, users can access the Calendar feature to create calendars, manage events, and view shared calendars.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -5993,28 +6040,28 @@ See [Permissions](/features/authentication-access/rbac/permissions#4-features-pe
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables the visibility of the Folders feature (chat sidebar) to users.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_FEATURES_NOTES`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables the visibility of the Notes feature to users.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_FEATURES_CHANNELS`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables the ability for users to create their own group channels.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_FEATURES_API_KEYS`
 
 - Type: `bool`
 - Default: `False`
 - Description: Sets the permission for API key creation feature for users. When enabled, users will have the ability to create and manage API keys for programmatic access.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 :::info
 
@@ -6033,35 +6080,35 @@ For API Key creation (and the API keys themselves):
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables user permission to access workspace models.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_KNOWLEDGE_ACCESS`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables user permission to access workspace knowledge.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_PROMPTS_ACCESS`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables user permission to access workspace prompts.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_TOOLS_ACCESS`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables user permission to access workspace tools.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_SKILLS_ACCESS`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables user permission to access workspace skills.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Sharing (Private)
 
@@ -6072,42 +6119,42 @@ These settings control whether users can share workspace items with other users
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables **private sharing** of workspace models.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_KNOWLEDGE_ALLOW_SHARING`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables **private sharing** of workspace knowledge.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_PROMPTS_ALLOW_SHARING`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables **private sharing** of workspace prompts.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_TOOLS_ALLOW_SHARING`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables **private sharing** of workspace tools.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_SKILLS_ALLOW_SHARING`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables **private sharing** of workspace skills.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_NOTES_ALLOW_SHARING`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables **private sharing** of notes.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Sharing (Public)
 
@@ -6118,56 +6165,56 @@ These settings control whether users can share workspace items **publicly**.
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables **public sharing** of workspace models.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_KNOWLEDGE_ALLOW_PUBLIC_SHARING`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables **public sharing** of workspace knowledge.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_PROMPTS_ALLOW_PUBLIC_SHARING`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables **public sharing** of workspace prompts.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_TOOLS_ALLOW_PUBLIC_SHARING`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables **public sharing** of workspace tools.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_SKILLS_ALLOW_PUBLIC_SHARING`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables **public sharing** of workspace skills.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_NOTES_ALLOW_PUBLIC_SHARING`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables **public sharing** of notes.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CHAT_ALLOW_PUBLIC_SHARING`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables **public sharing** of chat conversations. When disabled, the access-control selector in the chat share modal hides the "Public" option for non-admin users — they can still create share links scoped to specific users or groups, but cannot make a chat reachable by anyone with the link. Admins always retain the ability to share chats publicly. Requires `USER_PERMISSIONS_CHAT_SHARE` (Share Chat) to be enabled for the user. Configurable per-group in **Admin Panel → Users → Groups → Permissions → Chats Public Sharing**.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_CALENDAR_ALLOW_PUBLIC_SHARING`
 
 - Type: `bool`
 - Default: `False`
 - Description: Enables or disables **public sharing** of calendars. When disabled, non-admin owners cannot attach a wildcard `read` or `write` access grant to a calendar on create or update — public principals are silently filtered out of the access grant list, so a calendar cannot be made readable or writable by every user with the Calendar feature without an admin-granted sharing permission. Per-user and per-group grants remain unaffected. Admins always retain the ability to share calendars publicly. Configurable per-group in **Admin Panel → Users → Groups → Permissions → Calendars Public Sharing**.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Access Grants
 
@@ -6184,42 +6231,42 @@ These settings control whether users can share workspace items **publicly**.
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables user permission to import workspace models.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_MODELS_EXPORT`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables user permission to export workspace models.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_PROMPTS_IMPORT`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables user permission to import workspace prompts.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_PROMPTS_EXPORT`
 
 - Type: `str`
 - Default: `True`
 - Description: Enables or disables user permission to export workspace prompts.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_TOOLS_IMPORT`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables user permission to import workspace tools.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 #### `USER_PERMISSIONS_WORKSPACE_TOOLS_EXPORT`
 
 - Type: `str`
 - Default: `False`
 - Description: Enables or disables user permission to export workspace tools.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 ### Settings Permissions
 
@@ -6228,7 +6275,7 @@ These settings control whether users can share workspace items **publicly**.
 - Type: `bool`
 - Default: `True`
 - Description: Enables or disables user / group permissions for the interface settings section in the Settings Modal.
-- Persistence: This environment variable is a `PersistentConfig` variable.
+- Persistence: This environment variable is a `ConfigVar` variable.
 
 
 ## Misc Environment Variables
@@ -6604,7 +6651,17 @@ When `DATABASE_URL` is not explicitly set, Open WebUI will attempt to construct
 
 - Type: `float`
 - Default: `None`
-- Description: Sets the minimum time interval in seconds between user active status updates in the database. Helps reduce write operations for high-traffic instances. Set to `0.0` to update on every activity.
+- Description: Minimum interval, in seconds, between writes of a user's `last_active_at` timestamp to the database (this timestamp drives online/"active" presence, evaluated over a ~3-minute window). The write is throttled **per user**: within the interval, repeat activity does not hit the database.
+
+:::warning Set this on every deployment — mandatory at scale
+**With the default (`None`, or an invalid value, which becomes `0.0`), the write is unthrottled: essentially every authenticated request issues its own `UPDATE users SET last_active_at = ...` plus a `COMMIT`.** On a busy instance this is a continuous flood of tiny write transactions that amplifies database load and consumes connection-pool capacity for no functional benefit, because presence only needs ~minute granularity.
+
+Setting a positive interval collapses thousands of these writes into at most one per user per interval. This is **free performance for any setup** and is **required** for large/production deployments — leaving it unset is a common, avoidable database bottleneck.
+
+- **Recommended:** a few hundred seconds, e.g. `300`–`500`. Presence is accurate only to within the interval; because these values are longer than the ~3-minute active window, an active user may briefly show as inactive between writes.
+- Applies regardless of database backend, and applies on weak hardware too (it only *reduces* writes — there is no downside to setting it on a Raspberry Pi).
+- Read once at startup; not a `ConfigVar` and not changeable from the Admin UI.
+:::
 
 #### `DATABASE_ENABLE_SESSION_SHARING`
 
@@ -6678,6 +6735,10 @@ To use SQLCipher with existing data, you must either start fresh (with users exp
 - Default: `None`
 - Description: Specifies the pooling strategy and size of the database pool. By default SQLAlchemy will automatically chose the proper pooling strategy for the selected database connection. A value of `0` disables pooling. A value larger `0` will set the pooling strategy to `QueuePool` and the pool size accordingly.
 
+:::warning SQLite: unset does not mean "small". It falls back to 512
+The documented default (`None`) only means "not set by you". On **SQLite**, current releases (0.9.x, async DB backend) substitute a large internal pool (currently **512** connections) when this is unset, not SQLAlchemy's small default. Each SQLite connection carries its own page cache (`DATABASE_SQLITE_PRAGMA_CACHE_SIZE`, ~64 MB by default) and mmap window (`DATABASE_SQLITE_PRAGMA_MMAP_SIZE`, ~256 MB by default), so peak memory scales with the number of simultaneously active connections. On a memory-constrained container this can OOM-kill the process during connection-heavy workflows (for example editing model or knowledge-base permissions and reloading a long model list). On small / low-spec deployments set this explicitly to a small value (e.g. `DATABASE_POOL_SIZE=8`) and lower the two PRAGMAs below. See [Performance → SQLite Memory Footprint on Constrained Containers](/troubleshooting/performance#4-sqlite-memory-footprint-on-constrained-containers).
+:::
+
 :::tip High-Concurrency Deployments
 
 For deployments with many concurrent users, consider increasing both `DATABASE_POOL_SIZE` and `DATABASE_POOL_MAX_OVERFLOW`. A good starting point is `DATABASE_POOL_SIZE=15` and `DATABASE_POOL_MAX_OVERFLOW=20`.
@@ -6773,6 +6834,10 @@ More information about this setting can be found [here](https://docs.sqlalchemy.
 - Default: `-65536`
 - Description: Sets the SQLite `PRAGMA cache_size` value. Negative values specify the cache size in KiB — the default `-65536` allocates approximately 64 MB of page cache. Larger caches reduce disk I/O for read-heavy workloads. Set to an empty string to skip this PRAGMA entirely. **This setting only applies to SQLite databases.**
 
+:::warning This cache is per connection, not global
+SQLite's page cache is allocated **per connection**, so the real worst case is `cache_size` times the number of simultaneously active pooled connections, not a single 64 MB allocation. With the default SQLite pool fallback (`DATABASE_POOL_SIZE` unset, see above) that multiplier can be large enough to OOM a memory-constrained container. On small / low-spec deployments lower this (e.g. `DATABASE_SQLITE_PRAGMA_CACHE_SIZE=-2000` for ~2 MB per connection) and cap `DATABASE_POOL_SIZE`. See [Performance → SQLite Memory Footprint on Constrained Containers](/troubleshooting/performance#4-sqlite-memory-footprint-on-constrained-containers).
+:::
+
 #### `DATABASE_SQLITE_PRAGMA_TEMP_STORE`
 
 - Type: `str`
@@ -6785,6 +6850,10 @@ More information about this setting can be found [here](https://docs.sqlalchemy.
 - Default: `268435456`
 - Description: Sets the SQLite `PRAGMA mmap_size` value in bytes. The default `268435456` enables approximately 256 MB of memory-mapped I/O, which can significantly improve read performance by avoiding syscall overhead. Set to `0` to disable mmap, or to an empty string to skip this PRAGMA entirely. **This setting only applies to SQLite databases.**
 
+:::note Also per connection
+Like the page cache above, the mmap window is established **per connection**. It is mostly virtual / file-backed rather than committed anonymous RAM, but it inflates the process `total-vm` substantially and the resident portion still counts toward a cgroup memory limit. On memory-constrained containers set `DATABASE_SQLITE_PRAGMA_MMAP_SIZE=0` and cap `DATABASE_POOL_SIZE`. See [Performance → SQLite Memory Footprint on Constrained Containers](/troubleshooting/performance#4-sqlite-memory-footprint-on-constrained-containers).
+:::
+
 #### `DATABASE_SQLITE_PRAGMA_JOURNAL_SIZE_LIMIT`
 
 - Type: `str`
@@ -6844,6 +6913,22 @@ maxclients 10000
 timeout 1800
 ```
 
+For high-concurrency websocket deployments, also review Redis Pub/Sub output buffer limits. Open WebUI uses Socket.IO over Redis Pub/Sub when `WEBSOCKET_MANAGER=redis` is enabled, and streaming responses can produce large websocket events. If Redis disconnects Pub/Sub clients under large streaming payloads, you may see `Cannot publish to redis... giving up`, Redis timeout errors, or stalled live updates. Check:
+
+```bash
+redis-cli INFO stats | grep client_output_buffer_limit_disconnections
+redis-cli SLOWLOG GET 50
+redis-cli CONFIG GET client-output-buffer-limit
+```
+
+If `client_output_buffer_limit_disconnections` increases and the slow log shows large `PUBLISH socketio ...` entries, raise the Pub/Sub buffer limit in `redis.conf`. Example:
+
+```conf
+client-output-buffer-limit normal 0 0 0 replica 268435456 67108864 60 pubsub 1073741824 268435456 180
+```
+
+This leaves normal clients unchanged. Pub/Sub clients are disconnected immediately if their output buffer reaches the 1 GB hard limit, or if it stays above the 256 MB soft limit for 180 seconds. Tune these values to your available Redis memory and expected websocket payload size.
+
 **Symptoms of this misconfiguration:**
 - Works fine for days/weeks, then suddenly all logins fail with 500 errors
 - `redis.exceptions.ConnectionError: max number of clients reached` in logs
diff --git a/docs/reference/index.md b/docs/reference/index.md
index 74bcb5602..62c252914 100644
--- a/docs/reference/index.md
+++ b/docs/reference/index.md
@@ -15,7 +15,7 @@ Open WebUI is highly configurable. This section is the canonical source of truth
 
 **Every flag, path, and secret Open WebUI reads at startup, in one place.**
 
-Over 200 environment variables control authentication, model routing, storage, logging, and more. Understand `PersistentConfig` behavior, troubleshoot ignored settings, and find the exact variable you need.
+Over 200 environment variables control authentication, model routing, storage, logging, and more. Understand `ConfigVar` behavior, troubleshoot ignored settings, and find the exact variable you need.
 
 | | |
 | :--- | :--- |
diff --git a/docs/security/vendor-dispositions/cve-2024-7040.mdx b/docs/security/vendor-dispositions/cve-2024-7040.mdx
index af33b0923..7aaf94153 100644
--- a/docs/security/vendor-dispositions/cve-2024-7040.mdx
+++ b/docs/security/vendor-dispositions/cve-2024-7040.mdx
@@ -53,7 +53,7 @@ The published CWE-639 (Authorization Bypass Through User-Controlled Key) does no
 
 - **[Rule 9](/security/security-policy#reporting-guidelines):** Admins have full system control and are expected to understand the security implications of their actions and configurations. Administrators within the same instance share a single trust boundary.
 - **[Rule 7](/security/security-policy#reporting-guidelines):** The report does not acknowledge the project's self-hosted, multi-administrator architecture in which administrators share trust at the infrastructure level.
-- **[Rule 12](/security/security-policy#reporting-guidelines):** The issue crosses no security boundary against a party other than the reporter's own peer administrators within the same trust boundary.
+- **[Rule 13](/security/security-policy#reporting-guidelines):** The issue crosses no security boundary against a party other than the reporter's own peer administrators within the same trust boundary.
 
 ---
 
diff --git a/docs/security/vendor-dispositions/cve-2026-0765.mdx b/docs/security/vendor-dispositions/cve-2026-0765.mdx
index 36fd23c42..66c3f2645 100644
--- a/docs/security/vendor-dispositions/cve-2026-0765.mdx
+++ b/docs/security/vendor-dispositions/cve-2026-0765.mdx
@@ -39,7 +39,7 @@ The published CVSS rates the privilege requirement as PR:L (Low). This is method
 
 ### Applicable Security Policy Rules
 
-- **[Rule 10](/security/security-policy#tools-functions-and-pipelines-security):** Reports involving Tools or Functions — including code execution and frontmatter-based pip installation — are closed as intended behavior.
+- **[Rule 10](/security/security-policy#reporting-guidelines):** Reports involving Tools or Functions — including code execution and frontmatter-based pip installation — are closed as intended behavior.
 - **[Rule 9](/security/security-policy#reporting-guidelines):** "Pasting untrusted code into Functions/Tools" is explicitly cited as out-of-scope.
 - **[Rule 1](/security/security-policy#reporting-guidelines):** Expected protocol behavior is not a vulnerability.
 
diff --git a/docs/security/vendor-dispositions/cve-2026-0766.mdx b/docs/security/vendor-dispositions/cve-2026-0766.mdx
index 8fb744c8b..c842a8c4f 100644
--- a/docs/security/vendor-dispositions/cve-2026-0766.mdx
+++ b/docs/security/vendor-dispositions/cve-2026-0766.mdx
@@ -39,7 +39,7 @@ The published CVSS rates the privilege requirement as PR:L (Low). This is method
 
 ### Applicable Security Policy Rules
 
-- **[Rule 10](/security/security-policy#tools-functions-and-pipelines-security):** The Tools feature is designed to execute user-provided Python code on the server. Reports involving Tools or Functions are closed as intended behavior.
+- **[Rule 10](/security/security-policy#reporting-guidelines):** The Tools feature is designed to execute user-provided Python code on the server. Reports involving Tools or Functions are closed as intended behavior.
 - **[Rule 9](/security/security-policy#reporting-guidelines):** "Pasting untrusted code into Functions/Tools" is explicitly cited as out-of-scope.
 - **[Rule 1](/security/security-policy#reporting-guidelines):** Expected protocol behavior is not a vulnerability.
 
diff --git a/docs/troubleshooting/audio.mdx b/docs/troubleshooting/audio.mdx
index d54bc8938..e0636d59e 100644
--- a/docs/troubleshooting/audio.mdx
+++ b/docs/troubleshooting/audio.mdx
@@ -342,6 +342,14 @@ environment:
    - Under STT Settings, try switching Speech-to-Text Engine to "Web API"
    - This uses the browser's built-in speech recognition
 
+### STT/TTS Failing with SSL Certificate Errors
+
+If your STT or TTS engine is hosted behind a self-signed certificate or an internal CA and requests fail with TLS verification errors (e.g. `SSLCertVerificationError`, `certificate verify failed`):
+
+- Set [`AIOHTTP_CLIENT_SESSION_SSL=False`](/reference/env-configuration#aiohttp_client_session_ssl) to disable certificate verification for these upstream calls.
+- As of v0.9.6 the audio endpoints honor this setting (previously STT/TTS ignored it and always verified, so this configuration had no effect on audio engines). If it still appears ignored, confirm you are on v0.9.6 or later.
+- Prefer installing the internal CA into the container's trust store where possible; disabling verification removes protection against man-in-the-middle interception on that connection.
+
 ---
 
 ## ElevenLabs Integration
@@ -442,6 +450,7 @@ curl http://your-tts-service:port/health
 | `AUDIO_STT_OPENAI_API_BASE_URL` | Base URL for OpenAI-compatible STT |
 | `AUDIO_STT_OPENAI_API_KEY` | API key for OpenAI-compatible STT |
 | `DEEPGRAM_API_KEY` | Deepgram API key |
+| `AIOHTTP_CLIENT_SESSION_SSL` | TLS verification for outbound STT/TTS API calls (default: `True`; set `False` for self-signed certs — honored by audio engines since v0.9.6) |
 
 For a complete list of audio environment variables, see [Environment Variable Configuration](/reference/env-configuration#audio).
 
diff --git a/docs/troubleshooting/context-window.mdx b/docs/troubleshooting/context-window.mdx
index 291cf6496..52db3423b 100644
--- a/docs/troubleshooting/context-window.mdx
+++ b/docs/troubleshooting/context-window.mdx
@@ -130,6 +130,12 @@ Enable this filter globally or attach it to specific models in **Admin Panel →
 With tool calling on, an `assistant` message that invokes tools is paired with one or more `tool` messages carrying results that share the same `tool_call_id`. If `max_turns` happens to slice the conversation in the middle of that pair — keeping the orphan half — the upstream provider returns a 400 because the tool call / result structure is invalid. The repair block drops the orphans so the window always starts on a clean boundary. This matches what production community filters for context management do; the rest of the filter is the generic trimming logic.
 :::
 
+:::note Server-side reconciliation (v0.9.6+) — error: `tool_use ids were found without tool_result blocks`
+A second source of the same 400 is **stored** output that is already incomplete — a tool result never got written (the call was interrupted, or a knowledge base changed mid-chat), or a tool call is missing while its result survived. Strict providers (Anthropic, AWS Bedrock Converse) reject this with `400 ... tool_use ids were found without tool_result blocks` (or the mirror case, a `tool_result` with no matching `tool_use`).
+
+As of v0.9.6 Open WebUI reconciles these when it reconstructs a conversation: unpaired `tool_use` / `tool_result` entries are dropped before the request is sent, so resuming a chat with an interrupted tool call no longer hard-fails. Well-formed history is untouched. This is independent of the filter above — the filter still matters because *trimming* can create fresh orphans after reconstruction, which the server-side pass (run earlier, on the stored output) does not see. If you still hit this error, confirm you are on v0.9.6 or later.
+:::
+
 ### Slightly more involved: per-model token budget
 
 Counting turns is easy to reason about but wrong in practice — 40 turns of one-liners fit in 8k tokens, five turns with a 200-page PDF attachment do not. The more useful policy is "keep everything until we're about to bust the model's context window, then drop the oldest non-system messages until we fit."
diff --git a/docs/troubleshooting/image-generation.md b/docs/troubleshooting/image-generation.md
index 77cbc0bc8..e58a82339 100644
--- a/docs/troubleshooting/image-generation.md
+++ b/docs/troubleshooting/image-generation.md
@@ -27,6 +27,11 @@ title: "Image Generation"
     - If you are using Image Editing or Image+Image generation, your custom workflow **must** have nodes configured to accept an input image (usually a `LoadImage` node replaced/linked effectively).
     - Check the default "Image Editing" workflow in the Open WebUI settings for the required node structure to ensure compatibility.
 
+- **Generation/editing fails when ComfyUI is on a private/internal address** (e.g. `10.x`, `192.168.x`, `172.16–31.x`, or `localhost`), often only after the workflow runs:
+    - Open WebUI applies SSRF protection to outbound fetches, which previously blocked retrieving the rendered image back from a ComfyUI instance on a private network.
+    - **Fixed in v0.9.6**: image URLs are now trusted when they are same-origin with the admin-configured `COMFYUI_BASE_URL` (a strict scheme + host + port match, not a string prefix), so a private-network ComfyUI works without weakening SSRF protection globally.
+    - Ensure `COMFYUI_BASE_URL` is set to the **exact** origin ComfyUI serves images from (matching scheme, host, and port). If ComfyUI returns image URLs on a different host/port than `COMFYUI_BASE_URL`, those fetches are still SSRF-validated and may be blocked. On older versions, upgrade rather than disabling SSRF protection.
+
 ### Automatic1111 Issues
 
 - **Connection Refused / "Api Not Found"** (Automatic1111 is running, but Open WebUI reports connection errors):
diff --git a/docs/troubleshooting/index.mdx b/docs/troubleshooting/index.mdx
index 470c3f6bd..3fcf20809 100644
--- a/docs/troubleshooting/index.mdx
+++ b/docs/troubleshooting/index.mdx
@@ -20,6 +20,7 @@ Use this page to find the right guide for your issue. If you're unsure where to
 | `database is locked`, data disappearing across instances | [Scaling & HA → Database](./multi-replica#4-database-corruption--locked-errors) |
 | Worker crash: `Child process [pid] died` during upload | [RAG → Worker Crashes](./rag#12-worker-dies-during-document-upload) |
 | Model ignores attached knowledge base | [RAG → Knowledge Base Not Working](./rag#13-knowledge-base-attached-to-model-not-working) |
+| Agent stops mid-task after many tool calls / `Tool-call limit reached (N iterations).` | [`CHAT_RESPONSE_MAX_TOOL_CALL_ITERATIONS`](/reference/env-configuration#chat_response_max_tool_call_iterations) |
 | `NoneType object has no attribute 'encode'` | [RAG → Embedding Error](./rag#5-400-nonetype-object-has-no-attribute-encode) |
 | CUDA out of memory during embedding | [RAG → CUDA OOM](./rag#10-cuda-out-of-memory-during-embedding) |
 | OAuth redirect loops, CSRF state mismatch | [SSO & OAuth](./sso) |
@@ -30,6 +31,7 @@ Use this page to find the right guide for your issue. If you're unsure where to
 | Image not generating, ComfyUI workflow errors | [Image Generation](./image-generation) |
 | Web search returns empty content or proxy errors | [Web Search](./web-search) |
 | Slow performance, high RAM, OOM crashes | [Performance & RAM](./performance) |
+| Container OOM-killed when editing model/knowledge permissions (SQLite) | [Performance & RAM → SQLite Memory Footprint](./performance#4-sqlite-memory-footprint-on-constrained-containers) |
 | "The prompt is too long" / context length exceeded | [Context Window / Prompt Too Long](./context-window) |
 | Forgot admin password | [Reset Admin Password](./password-reset) |
 | `no such table` or `table already exists` on startup | [Database Migration](./manual-database-migration) |
@@ -53,7 +55,7 @@ Use this page to find the right guide for your issue. If you're unsure where to
 ## Before You Start
 
 - **Update first.** Many issues are fixed in newer releases. Make sure you're on the latest version.
-- **Check PersistentConfig.** Open WebUI stores some settings in its database, which take priority over environment variables. If an env var change seems ignored, see the [Environment Variable Configuration](/reference/env-configuration#important-note-on-persistentconfig-environment-variables) page.
+- **Check ConfigVar.** Open WebUI stores some settings in its database, which take priority over environment variables. If an env var change seems ignored, see the [Environment Variable Configuration](/reference/env-configuration#important-note-on-configvar-environment-variables) page.
 
 ## Browser Compatibility
 
diff --git a/docs/troubleshooting/manual-database-migration.md b/docs/troubleshooting/manual-database-migration.md
index fd527e1ac..4685ccabc 100644
--- a/docs/troubleshooting/manual-database-migration.md
+++ b/docs/troubleshooting/manual-database-migration.md
@@ -464,6 +464,10 @@ or similar errors for other tables (e.g., `access_grant`).
 
 **Cause:** A previous migration **partially completed** — the table was created in the database, but Alembic's version tracking was not updated (typically because the migration was interrupted during the data backfill step that runs after table creation). Alembic still thinks the migration hasn't been applied, so it tries to create the table again.
 
+:::tip Migrations are idempotent as of v0.9.6
+v0.9.6 reworked the bundled Alembic migrations to introspect the live schema and **skip tables, indexes, and columns that already exist** (and to add missing primary keys to legacy peewee-era tables). Many partially-applied-schema upgrades that used to fail with this error now complete cleanly on a straight `alembic upgrade head`. If you are hitting "table already exists" on an older version, upgrading to v0.9.6+ before manual intervention is often the simplest fix. The recovery steps below remain valid for databases mutated in other ways.
+:::
+
 **Diagnosis:**
 
 ```bash title="Terminal - Identify the Stuck Migration"
diff --git a/docs/troubleshooting/multi-replica.mdx b/docs/troubleshooting/multi-replica.mdx
index af78d702d..a250bcbfa 100644
--- a/docs/troubleshooting/multi-replica.mdx
+++ b/docs/troubleshooting/multi-replica.mdx
@@ -19,6 +19,8 @@ Before troubleshooting specific errors, ensure your deployment meets these **abs
 4.  **Shared Storage:** A persistent volume (RWX / ReadWriteMany if possible, or ensuring all replicas map to the same underlying storage for `data/`) is critical for RAG (uploads/vectors) and generated images.
 5.  **External Vector Database (Required):** The default ChromaDB uses a local SQLite-backed `PersistentClient` that is **not safe for multi-worker or multi-replica deployments**. SQLite connections are not fork-safe, and concurrent writes from multiple processes will crash workers instantly. You **must** use a dedicated external Vector DB (e.g., [PGVector](/reference/env-configuration#pgvector_db_url), [MariaDB Vector](/reference/env-configuration#mariadb_vector_db_url), Milvus, Qdrant) via [`VECTOR_DB`](/reference/env-configuration#vector_db), or run ChromaDB as a [separate HTTP server](/reference/env-configuration#chroma_http_host).
 6.  **Database Session Sharing (Optional):** For PostgreSQL deployments with adequate resources, consider enabling [`DATABASE_ENABLE_SESSION_SHARING=True`](/reference/env-configuration#database_enable_session_sharing) to improve performance under high concurrency.
+7.  **Thread Pool Ceiling:** Set [`THREAD_POOL_SIZE=2000`](/reference/env-configuration#thread_pool_size) (or higher). The default concurrency ceiling for blocking operations is only **40**; at multi-user scale this is exhausted quickly and the app appears to **freeze** while CPU/RAM look fine. It is a ceiling, not a thread/CPU count — a high value is not a contention risk. Never lower it.
+8.  **Throttle Presence Writes:** Set [`DATABASE_USER_ACTIVE_STATUS_UPDATE_INTERVAL=300`](/reference/env-configuration#database_user_active_status_update_interval) (300–500s). Unset (the default) means each user's `last_active_at` is written **on essentially every request** — a continuous flood of tiny `UPDATE`/`COMMIT` transactions that saturates the connection pool at scale for no functional benefit.
 
 ---
 
diff --git a/docs/troubleshooting/performance.md b/docs/troubleshooting/performance.md
index e43fb943e..675a9d4fa 100644
--- a/docs/troubleshooting/performance.md
+++ b/docs/troubleshooting/performance.md
@@ -92,6 +92,14 @@ By default, Open WebUI saves chats **after generation is complete**. While savin
 -   **Effect**: Chats are saved only when the generation is complete (or periodically).
 -   **Recommendation**: **DO NOT ENABLE `ENABLE_REALTIME_CHAT_SAVE` in production.** It is highly recommended to keep this `False` to prevent database connection exhaustion and severe performance degradation under concurrent load. See the [Environment Variable Configuration](/reference/env-configuration#enable_realtime_chat_save) for details.
 
+### User Active-Status Write Throttling (set this on every deployment)
+
+Open WebUI tracks online/"active" presence by writing each user's `last_active_at` timestamp to the database. **By default this write is unthrottled** — essentially *every authenticated request* issues its own `UPDATE users SET last_active_at = ...` plus a `COMMIT`. On a busy instance this is a continuous flood of tiny write transactions that amplifies database load and consumes connection-pool capacity for zero functional benefit (presence only needs ~minute granularity).
+
+-   **Env Var**: `DATABASE_USER_ACTIVE_STATUS_UPDATE_INTERVAL=300`
+-   **Default**: unset (**unthrottled — writes on every request**)
+-   **Recommendation**: Set a positive interval in seconds — `300`–`500` is a good range. This collapses thousands of writes into at most one per user per interval. It is **free performance for any setup** and is effectively **mandatory for large/production deployments**; leaving it unset is a common, avoidable database bottleneck. There is no downside on weak hardware either — it only *reduces* writes. See [`DATABASE_USER_ACTIVE_STATUS_UPDATE_INTERVAL`](/reference/env-configuration#database_user_active_status_update_interval).
+
 ### Database Session Sharing
 
 Starting with v0.7.1, Open WebUI includes a database session sharing feature that can improve performance under high concurrency by reusing database sessions instead of creating new ones for each request.
@@ -206,13 +214,17 @@ Increasing the chunk size buffers these updates, sending them to the client in l
   *   *Recommendation*: Set to **5-10** for high-concurrency instances.
 
 #### Thread Pool Size
-Defines the number of worker threads available for handling requests.
-*   **Default**: 40
-*   **High-Traffic Recommendation**: **2000+**
-*   **Warning**: **NEVER decrease this value.** Even on low-spec hardware, an idle thread pool does not consume significant resources. Setting this too low (e.g., 10) **WILL cause application freezes** and request timeouts.
+Caps how many **concurrent** blocking operations (sync DB calls, file I/O, sync route handlers offloaded via `run_in_threadpool`) may run at once. This is a concurrency **ceiling**, not a fixed pool of pre-spawned OS threads and **not** a CPU-core/thread count — threads are created lazily and reused, so a high value does not spawn that many threads, burn CPU, or cause CPU contention while idle.
+*   **Default**: 40 (the AnyIO default — far too low for production)
+*   **Normal servers / production**: **2000+**. On very large instances, treat `2000` as a *lower* bound; going higher is fine and is **not** a CPU/contention risk.
+*   **Symptom if too low**: when more than `THREAD_POOL_SIZE` blocking ops are needed at once (many users at the same time, or a few users each triggering several blocking calls), further requests queue and the **whole app appears to hang/freeze** even though CPU and RAM look fine. This is pool starvation, not resource exhaustion.
+*   **Warning**: **NEVER decrease below the default.** An idle high ceiling costs effectively nothing.
+*   **Exception — weak hardware** (Raspberry Pi, tiny VPS, containers capped at ~250m CPU / very low RAM): do **not** set `2000`. Each genuinely concurrent blocking op still uses a real OS thread (stack memory), so on a tiny device a huge ceiling lets a traffic burst exhaust RAM. Leave it at the default or a modest few-hundred value matched to the device. Any normal server should use `2000+`.
 
 - **Env Var**: `THREAD_POOL_SIZE=2000`
 
+See [`THREAD_POOL_SIZE`](/reference/env-configuration#thread_pool_size) for the full explanation.
+
 #### AIOHTTP Client Timeouts
 Long LLM completions can exceed default HTTP client timeouts. Configure these to prevent requests being cut off mid-response:
 
@@ -431,6 +443,32 @@ If resource usage is critical, disable automated features that constantly trigge
     *   Admin: `Settings > Interface > Chat Title`
 4.  **Tag Generation**: `ENABLE_TAGS_GENERATION=False`
 
+### 4. SQLite Memory Footprint on Constrained Containers
+
+This issue applies **even on fast local SSD/NVMe**. It is a RAM problem, not a storage-latency one (for the latency/corruption problem on network storage, see [Disk I/O Latency](#disk-io-latency-sqlite--storage) instead). It is the most common cause of reports like "the container gets OOM-killed when I edit model or knowledge-base permissions" on small deployments.
+
+On SQLite, when `DATABASE_POOL_SIZE` is left unset, current releases (0.9.x, async DB backend) do **not** fall back to SQLAlchemy's small default pool. They fall back to a large internal pool (currently **512** connections). Each pooled connection independently:
+
+- lazily grows its **own** SQLite page cache up to the `DATABASE_SQLITE_PRAGMA_CACHE_SIZE` cap. The default `-65536` is roughly **64 MB of committed RAM per connection**.
+- memory-maps the database file up to `DATABASE_SQLITE_PRAGMA_MMAP_SIZE`, default **256 MB per connection**. This is mostly virtual and file-backed rather than committed anonymous RAM, but it inflates `total-vm` enormously and the resident portion still counts against a cgroup memory limit.
+- runs on its **own OS thread** (the async SQLite driver is one thread per connection), adding thread-stack address space.
+
+Peak memory therefore scales with the number of **simultaneously active connections**, not with the size of any single query. A workflow that fans out many short-lived connections, for example editing model or knowledge-base access control and then reloading a long model list, can briefly drive dozens of connections live. On a small container the page caches alone (number of active connections × up to 64 MB each) can exceed the memory limit, at which point the OOM killer terminates the process. A profiler (py-spy, memray) will attribute almost everything to `aiosqlite/core.py`, because that is the frame where each connection allocates its cache and materialises rows. That is the signature of many connections, not a leak inside the driver, and it is unrelated to the size of any remote vector database.
+
+:::tip Constrained / Low-Spec Containers (SQLite)
+On any deployment with a tight memory limit (small VPS, Raspberry Pi, a Docker `mem_limit` of 1 to 2 GB), set these explicitly instead of relying on the defaults:
+
+```bash
+DATABASE_POOL_SIZE=8                       # cap the SQLite pool (unset falls back to 512)
+DATABASE_SQLITE_PRAGMA_CACHE_SIZE=-2000    # ~2 MB page cache per connection instead of ~64 MB
+DATABASE_SQLITE_PRAGMA_MMAP_SIZE=0         # disable the per-connection mmap window
+```
+
+Also give the container realistic headroom. **1 GB is very low** for anything doing RAG or embeddings; aim for **2 GB or more**. Keep `DATABASE_ENABLE_SESSION_SHARING=False` (the default) on low-spec hardware; turning it on is the wrong lever here and degrades SQLite on weak hardware (see [Database Session Sharing](#database-session-sharing)).
+
+For multi-user or growing deployments the durable fix is **PostgreSQL**, not SQLite tuning.
+:::
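+
+As a rough worked example of the arithmetic above (the burst of 20 simultaneously active connections is an illustrative assumption, not a measured value, and every figure is an upper bound because the caches and mappings grow lazily):
+
+```text
+Defaults, 20 active connections:
+  page cache   20 x up to 64 MB  = up to 1280 MB committed (already above a 1 GB limit)
+  mmap window  20 x up to 256 MB = up to 5120 MB of address space (mostly virtual, file-backed)
+
+Tuned (DATABASE_POOL_SIZE=8, cache size -2000, mmap size 0), 8 pooled connections:
+  page cache    8 x ~2 MB        = ~16 MB
+  mmap window   8 x 0            = 0
+```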
+
 ---
 
 ## 🚀 Recommended Configuration Profiles
@@ -443,7 +481,7 @@ If resource usage is critical, disable automated features that constantly trigge
 3.  **Task Model**: Disable or use tiny model (`llama3.2:1b`).
 4.  **Scaling**: Keep default `THREAD_POOL_SIZE` (40).
 5.  **Disable**: Image Gen, Code Interpreter, Autocomplete, Follow-ups.
-6.  **Database**: SQLite is fine.
+6.  **Database**: SQLite is fine, but cap its memory: `DATABASE_POOL_SIZE=8`, `DATABASE_SQLITE_PRAGMA_CACHE_SIZE=-2000`, `DATABASE_SQLITE_PRAGMA_MMAP_SIZE=0`. The unset SQLite pool default is large (512); see [SQLite Memory Footprint on Constrained Containers](#4-sqlite-memory-footprint-on-constrained-containers).
 
 ### Profile 2: Single User Enthusiast
 *Target: Max Quality & Speed, Local + External APIs.*
@@ -480,6 +518,23 @@ Common Redis configuration issues that cause unnecessary scaling:
 | **Stale connections** | Redis runs out of connections or memory grows indefinitely | Set `timeout 1800` in redis.conf (kills idle connections after 30 minutes) |
 | **Low maxclients** | `max number of clients reached` errors | Set `maxclients 10000` or higher |
 | **No connection limits** | Open WebUI pods may accumulate connections that never close | Combine `timeout` with connection pool limits in your Redis client config |
+| **Low Pub/Sub output buffer limits** | WebSocket streams stall, `Cannot publish to redis... giving up`, or Redis logs client output buffer disconnections when large Socket.IO events are published | Increase the Redis `client-output-buffer-limit ... pubsub ...` setting, sized for your websocket payloads and available Redis memory |
+
+For Redis-backed websockets, Open WebUI uses Socket.IO over Redis Pub/Sub. Large streaming responses and tool events can create multi-MB `PUBLISH socketio ...` payloads. If Redis disconnects slow Pub/Sub clients, inspect:
+
+```bash
+redis-cli INFO stats | grep client_output_buffer_limit_disconnections
+redis-cli SLOWLOG GET 50
+redis-cli CONFIG GET client-output-buffer-limit
+```
+
+Example Redis configuration for deployments that need to tolerate large websocket bursts:
+
+```conf
+client-output-buffer-limit normal 0 0 0 replica 268435456 67108864 60 pubsub 1073741824 268435456 180
+```
+
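+This keeps normal client limits disabled, keeps the replica limits at 256 MB hard / 64 MB soft for 60 seconds, and raises Pub/Sub clients to a 1 GB hard limit and a 256 MB soft limit: Redis disconnects a Pub/Sub client as soon as its output buffer reaches 1 GB, or once the buffer has stayed above 256 MB for 180 consecutive seconds. Tune downward or upward based on Redis memory headroom and observed payload sizes.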
 
 ---
 
@@ -496,6 +551,7 @@ These are real-world mistakes that cause organizations to massively over-provisi
 | **Using Default (prompt-based) tool calling** | Legacy / no longer supported; injected prompts break KV cache → higher latency → more resources needed per request; cannot access built-in system tools | Switch every model to Native Mode |
 | **Not configuring Redis stale connection timeout** | Connections accumulate forever → Redis OOM → you deploy Redis Cluster | Add `timeout 1800` to redis.conf |
 | **Using base64-encoded icons in Actions/Filters** | Icon data is embedded in `/api/models` responses sent to the frontend on every page load for every model. A 500 KB base64 icon on 3 actions across 20 models = **30 MB of payload bloat** per request → slow frontend loads, high bandwidth usage, unnecessary backend memory pressure | Host icons as static files and reference them by URL in `icon_url` / `self.icon`. See [Action Function icon_url warning](/features/extensibility/plugin/functions/action#example---specifying-action-frontmatter) |
+| **Running SQLite with the default pool on a tiny container** | Unset `DATABASE_POOL_SIZE` falls back to a 512-connection pool; each connection grows its own ~64 MB page cache plus a 256 MB mmap window, so a connection-fanning workflow (editing model/KB permissions, reloading a long model list) OOM-kills a 1 GB container | Cap `DATABASE_POOL_SIZE` (e.g. `8`), set `DATABASE_SQLITE_PRAGMA_CACHE_SIZE=-2000` and `DATABASE_SQLITE_PRAGMA_MMAP_SIZE=0`, give the container ≥ 2 GB. See [SQLite Memory Footprint](#4-sqlite-memory-footprint-on-constrained-containers) |
 
 ---
 
@@ -522,3 +578,7 @@ For detailed information on all available variables, see the [Environment Config
 | `ENABLE_AUTOCOMPLETE_GENERATION` | [Autocomplete](/reference/env-configuration#enable_autocomplete_generation) |
 | `RAG_SYSTEM_CONTEXT` | [RAG System Context](/reference/env-configuration#rag_system_context) |
 | `DATABASE_ENABLE_SESSION_SHARING` | [Database Session Sharing](/reference/env-configuration#database_enable_session_sharing) |
+| `DATABASE_USER_ACTIVE_STATUS_UPDATE_INTERVAL` | [Presence Write Throttling](/reference/env-configuration#database_user_active_status_update_interval) |
+| `DATABASE_POOL_SIZE` | [Connection Pool Size](/reference/env-configuration#database_pool_size) |
+| `DATABASE_SQLITE_PRAGMA_CACHE_SIZE` | [SQLite Page Cache Size](/reference/env-configuration#database_sqlite_pragma_cache_size) |
+| `DATABASE_SQLITE_PRAGMA_MMAP_SIZE` | [SQLite mmap Size](/reference/env-configuration#database_sqlite_pragma_mmap_size) |
diff --git a/docs/troubleshooting/web-search.mdx b/docs/troubleshooting/web-search.mdx
index 8437c9614..4bc92455a 100644
--- a/docs/troubleshooting/web-search.mdx
+++ b/docs/troubleshooting/web-search.mdx
@@ -31,7 +31,7 @@ WEB_SEARCH_TRUST_ENV=True
 
 :::info
 
-This is a **PersistentConfig** variable, meaning it can be set via environment variable on startup OR configured through the Admin Panel UI. Once set in the UI, the database value takes precedence over the environment variable.
+This is a **ConfigVar** variable, meaning it can be set via environment variable on startup OR configured through the Admin Panel UI. Once set in the UI, the database value takes precedence over the environment variable.
 
 This setting tells Open WebUI's web content loader to respect the proxy settings from your environment variables (`http_proxy`, `https_proxy`). Without this, even if your search engine works through the proxy, fetching content from the returned URLs will fail.
 
@@ -72,6 +72,8 @@ If web search returns empty content or poor quality results, the issue is often
 
 - **Try different loaders**: Configure `WEB_LOADER_ENGINE` to use `playwright` for JavaScript-heavy sites or `firecrawl`/`tavily` for better extraction.
 
+- **Set a real User-Agent**: If results from Wikipedia, Cloudflare-protected sites, or other large publishers come back empty (or as 403s in logs), the default Python `requests`/`aiohttp` UA is being filtered by bot detection. Set [`USER_AGENT`](/reference/env-configuration#user_agent) to a real browser-like string (e.g. `Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0 Safari/537.36`). This affects both web search result fetching and the `fetch_url` tool.
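+
+A minimal sketch of setting it, assuming the variable is exported from a shell before Open WebUI starts. The quotes are needed there because the value contains spaces, parentheses, and a semicolon; other mechanisms (Compose files, env files) have their own quoting rules:
+
+```bash
+# Browser-like User-Agent copied from the example above; any current browser string should work.
+export USER_AGENT="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0 Safari/537.36"
+```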
+
 For more details on context window issues, see the [RAG Troubleshooting Guide](./rag).
 
 ---
diff --git a/docs/tutorials/integrations/llm-providers/azure-openai/workload-identity-auth.mdx b/docs/tutorials/integrations/llm-providers/azure-openai/workload-identity-auth.mdx
index fc8384057..bbdb5f0db 100644
--- a/docs/tutorials/integrations/llm-providers/azure-openai/workload-identity-auth.mdx
+++ b/docs/tutorials/integrations/llm-providers/azure-openai/workload-identity-auth.mdx
@@ -140,6 +140,10 @@ After deploying Open WebUI you can follow these steps to configure your Azure Op
 5. Configure your Azure OpenAI endpoint and deployment details
 6. Save the connection
 
+:::note Custom / private endpoint hostnames
+If you front Azure OpenAI with a private endpoint, API gateway, or other custom domain (that is, the base URL is **not** a standard `*.openai.azure.com` / `*.cognitiveservices.azure.com` host, which is common in AKS/enterprise setups), selecting the **Azure OpenAI** provider is sufficient: Open WebUI applies Azure request handling based on the chosen provider, not on hostname pattern matching. Older versions only recognized Azure connections by the `azure.*` hostname, so a custom-hostname Azure connection would fail **Verify Connection** and model listing. If you hit that, upgrade to v0.9.6 or later.
+:::
+
 ## Key Components Explained
 
 ### Service Account Annotations
diff --git a/docs/tutorials/integrations/redis.md b/docs/tutorials/integrations/redis.md
index f80bd43ae..1578e9748 100644
--- a/docs/tutorials/integrations/redis.md
+++ b/docs/tutorials/integrations/redis.md
@@ -188,6 +188,39 @@ The above configuration sets up a Redis container named `redis-valkey` and mount
 
 :::
 
+### WebSocket Pub/Sub Buffer Limits
+
+Open WebUI uses Socket.IO over Redis Pub/Sub when `WEBSOCKET_MANAGER=redis` is enabled. Streaming responses and tool events can generate large websocket events because some updates include accumulated message state, not only the newest token delta. If Redis disconnects Pub/Sub clients while delivering these events, users can see stalled streams, missing live updates, or log messages such as:
+
+```text
+Cannot publish to redis... retrying
+Cannot publish to redis... giving up
+redis.exceptions.TimeoutError: Timeout connecting to server
+```
+
+Check whether Redis is disconnecting Pub/Sub clients because of output buffer limits:
+
+```bash
+redis-cli INFO stats | grep client_output_buffer_limit_disconnections
+redis-cli SLOWLOG GET 50
+redis-cli CONFIG GET client-output-buffer-limit
+```
+
+If the slow log shows large `PUBLISH socketio ...` payloads and `client_output_buffer_limit_disconnections` increases, raise the Redis Pub/Sub output buffer limit. For example:
+
+```conf
+# Keep normal clients unchanged; allow larger websocket Pub/Sub bursts.
+client-output-buffer-limit normal 0 0 0 replica 268435456 67108864 60 pubsub 1073741824 268435456 180
+```
+
+This sets the Pub/Sub hard limit to 1 GB and the soft limit to 256 MB for 180 seconds: Redis disconnects a Pub/Sub client immediately once its output buffer reaches 1 GB, or once the buffer has stayed above 256 MB for 180 consecutive seconds. Tune these values for your available Redis memory and expected websocket payload size. Higher limits make Redis more tolerant of temporary slow subscribers, but they also allow each slow Pub/Sub client to buffer more memory before Redis disconnects it.
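+
+A minimal sketch of applying the same Pub/Sub limit at runtime, assuming `redis-cli` can reach the Redis instance that Open WebUI uses (add `-h`, `-p`, or `-a` as your setup requires). `CONFIG SET` changes only the client class named in the value. `CONFIG REWRITE` persists the change only if Redis was started from a config file; without one, add the line to your own Redis configuration instead:
+
+```bash
+# Raise only the pubsub class; normal and replica limits stay as they are.
+redis-cli CONFIG SET client-output-buffer-limit "pubsub 1073741824 268435456 180"
+
+# Optional: write the running configuration back to redis.conf so it survives a restart.
+redis-cli CONFIG REWRITE
+```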
+
+If you changed Redis configuration at runtime, verify the active value:
+
+```bash
+redis-cli CONFIG GET client-output-buffer-limit
+```
+
 To create a Docker network for communication between Open WebUI and Redis, run the following command:
 
 ```bash