diff --git a/docs/features/administration/banners.md b/docs/features/administration/banners.md index 058d39fe7..caad5e381 100644 --- a/docs/features/administration/banners.md +++ b/docs/features/administration/banners.md @@ -154,7 +154,84 @@ Inline styles are supported on allowed tags: Gradient background ``` -> Keep styling minimal. Overly large padding, font sizes, or complex layouts can cause banners to become tall or visually inconsistent across themes. +You can also style a full message area by wrapping the content in a block element: + +```html +
+ Notice title
+ Short supporting message. +
+``` + +> Keep styling purposeful. Large padding, large font sizes, or deeply nested layouts can make banners too tall and visually inconsistent across themes. + +--- + +## Designing effective banners + +Banners work best when they are easy to scan, visually distinct, and short enough not to interrupt normal work. + +### Structure the message + +Use a predictable structure: + +- Start with the event type or status: `Maintenance`, `Incident`, `Policy update`, `New feature`. +- Put the most important detail first: date, time, impact, or required action. +- Keep the body to one or two short sentences. +- Add one link only if users need more details. + +For longer notices, use short sections instead of one long paragraph. For multilingual notices, separate languages with a subtle `
` or use a collapsible `
` section. + +### Make severity visible + +Use the banner `type` consistently: + +- `info`: neutral announcements and product updates. +- `success`: resolved incidents or completed changes. +- `warning`: planned maintenance, degraded service, or upcoming action needed. +- `error`: active incidents or urgent action required. + +Avoid using `error` for non-urgent announcements. Users learn to ignore alerts when every message looks critical. + +### Use color carefully + +Color should support the banner type, not compete with it: + +- Use soft backgrounds for the full message area. +- Use stronger colors for small accents, labels, or left borders. +- Keep text contrast high enough to read in bright rooms and on dim screens. +- Avoid mixing many unrelated colors in one banner. + +A useful pattern is a pale background plus a stronger left border: + +```html +
+ Notice title
+ Short supporting message. +
+``` + +### Keep layouts responsive + +Banners are shown inside the application layout and must still work on narrow screens. + +- Prefer `display:flex;flex-wrap:wrap` for rows containing labels, dates, or badges. +- Avoid fixed widths. +- Use `width:100%;box-sizing:border-box` for full-width styled blocks. +- Keep icons and badges small so they do not increase banner height. +- Test the banner with a narrow browser window before using it broadly. + +### Avoid accidental extra height + +Banner content treats literal newlines as line breaks. If you use explicit `
` tags, keep the raw HTML compact and avoid adding extra blank lines or indentation in the banner content field. + +This compact style: + +```html +Notice
One short sentence.
Another short sentence. +``` + +renders more predictably than heavily formatted HTML with many line breaks. --- @@ -255,6 +332,22 @@ Service updates: Status ``` +### Pattern: Styled notice block + +Use a full-width styled block when the whole message should read as one announcement area. Keep this HTML compact when pasting it into the banner content field, especially if it also contains `
` tags. + +```html +
NOTICENotice titleKey detail
Short supporting message.
+``` + +This pattern uses: + +- A pale background for the full message area. +- A stronger left border for fast visual recognition. +- A small uppercase label for the event type. +- A compact date/time chip for the most important metadata. +- `flex-wrap` so the header row still works on narrow screens. + ### Pattern: Collapsible details (keep banners short) ```html diff --git a/docs/features/authentication-access/rbac/groups.md b/docs/features/authentication-access/rbac/groups.md index ce077c3de..e6db919e6 100644 --- a/docs/features/authentication-access/rbac/groups.md +++ b/docs/features/authentication-access/rbac/groups.md @@ -84,3 +84,18 @@ For example, granting the "Marketing" group read access and a specific editor us * **Read**: Users can view and use the resource. * **Write**: Users can update or delete the resource. + +### Previewing Access (Audit) + +When access grants span many groups and resources, it's easy to lose track of who can see what. Open WebUI ships an admin-only **Preview Access** view that resolves every access grant for a specific user or group and lists the result in one place — no need to crawl through individual resource pages. + +**For a user** — In **Admin Panel > Users**, hover over a non-admin user row and click the eye-style **Preview Access** button. The modal shows every model, knowledge base, and tool the user can read, aggregated across all of their group memberships and any direct user grants. + +**For a group** — In **Admin Panel > Users > Groups**, open the group editor and use the **Preview Group Access** panel. The output is the same shape (models, knowledge, tools), scoped to just that group's grants. + +Both views are admin-only and read-only — they reflect what the access-grant table currently says without modifying it. Use them after a permission change to confirm the result matches intent, or as part of a periodic RBAC audit. + +Programmatic equivalents: + +- `GET /api/v1/users/{user_id}/preview` — user view (admin auth required) +- `GET /api/v1/groups/id/{id}/preview` — group view (admin auth required) diff --git a/docs/features/channels/index.md b/docs/features/channels/index.md index 0287854ee..08c75e004 100644 --- a/docs/features/channels/index.md +++ b/docs/features/channels/index.md @@ -81,9 +81,14 @@ Mentioning a model in a channel runs through the same chat-completion pipeline a | **User tools and MCP tools** | Whatever the model is configured to call, it can call | | **Filters** | Inlet/outlet/stream filters apply just like in chats | | **Knowledge (RAG)** | Knowledge bases attached to the model are queried and injected | +| **Attached documents** | Images **and** non-image files (PDF, DOCX, etc.) uploaded in the thread are forwarded into the model's context | In other words, a channel-summoned model is a fully-equipped agent — not a one-shot completion. +:::note Document attachments in channels (v0.9.6+) +Before v0.9.6, tagging a model in a channel only forwarded **images** from the thread — uploaded PDFs, DOCX, and other non-image documents were ignored, so summarization and document-comparison prompts silently had nothing to work with. As of v0.9.6 these files are forwarded the same way they are in a direct chat, so document workflows behave identically in channels. +::: + ### Tagging people and linking channels Use `@username` to notify teammates. Use `#channel-name` to create clickable cross-references between conversations. diff --git a/docs/features/chat-conversations/web-search/providers/linkup.md b/docs/features/chat-conversations/web-search/providers/linkup.md new file mode 100644 index 000000000..c9418bd75 --- /dev/null +++ b/docs/features/chat-conversations/web-search/providers/linkup.md @@ -0,0 +1,93 @@ +--- +sidebar_position: 23 +title: "Linkup" +--- + +:::warning + +This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration on how to customize Open WebUI for your specific use case. Want to contribute? Check out the [contributing tutorial](https://docs.openwebui.com/contributing). + +::: + +:::tip + +For a comprehensive list of all environment variables related to Web Search (including concurrency settings, result counts, and more), please refer to the [Environment Configuration documentation](/reference/env-configuration#web-search). + +::: + +:::tip Troubleshooting + +Having issues with web search? Check out the [Web Search Troubleshooting Guide](/troubleshooting/web-search) for solutions to common problems like proxy configuration, connection timeouts, and empty content. + +::: + +## Overview + +[Linkup](https://www.linkup.so/) is a search API built for AI applications. Integrating it with Open WebUI lets your language model perform real-time web searches and ground responses in current sources. This tutorial guides you through configuring Linkup as a web search provider. + +Linkup support was added in Open WebUI v0.9.6. + +## Prerequisites + +Ensure you have: + +- **Open WebUI Installed**: A running instance of Open WebUI (local or Docker). See the [Getting Started guide](https://docs.openwebui.com/getting-started). +- **Linkup Account**: An account with an API key from [Linkup](https://www.linkup.so/). +- **Admin Access**: Administrative access to your Open WebUI instance. +- **Internet Connection**: Required for Linkup API requests. + +## Step-by-Step Configuration + +### 1. Obtain a Linkup API Key + +1. Log in or sign up at [Linkup](https://www.linkup.so/). +2. Open the API keys section of your dashboard. +3. Copy or generate a new API key. Keep it secure. + +### 2. Configure Open WebUI + +1. Log in to Open WebUI with an admin account. +2. Open **Admin Panel → Settings → Web Search**. +3. Enable **Web Search** by toggling it **On**. +4. Select **linkup** from the **Web Search Engine** dropdown. +5. Paste your Linkup API key into the **Linkup API Key** field. +6. (Optional) Set the **Search Depth** and **Output Type** (see below). +7. Save your settings. + +### 3. Test the Integration + +1. Start a chat session in Open WebUI. +2. Click the **plus (+)** button in the prompt field to enable web search. +3. Enter a query (e.g., `+latest AI news`) and confirm Linkup returns real-time results. + +## Search Parameters + +Linkup requests are built from a small set of defaults that you can override. The query (`q`) and result count (`maxResults`) are injected automatically and cannot be overridden. + +| Parameter | Default | Notes | +|-----------|---------|-------| +| `depth` | `standard` | `standard` is faster and cheaper; `deep` runs a more thorough multi-step search. | +| `outputType` | `sourcedAnswer` | `sourcedAnswer` returns an answer plus its source pages; `searchResults` returns raw result entries. | +| `url` | `https://api.linkup.so/v1/search` | Override only if you need to point at a different endpoint. | + +These map to the [`LINKUP_SEARCH_PARAMS`](/reference/env-configuration#linkup_search_params) environment variable, supplied as a JSON object. For example: + +```bash +-e LINKUP_API_KEY="your_linkup_api_key" +-e LINKUP_SEARCH_PARAMS='{"depth": "deep", "outputType": "searchResults"}' +``` + +The same fields are exposed in the Admin UI when the `linkup` engine is selected, so you do not need environment variables unless you prefer to manage configuration that way. See [Environment Variable Configuration](https://docs.openwebui.com/environment) for details and the [`ENABLE_PERSISTENT_CONFIG`](/reference/env-configuration#enable_persistent_config) behavior. + +## Troubleshooting + +- **Invalid API Key**: Ensure the key is copied correctly, without extra spaces. +- **No Results**: Confirm the web search toggle (`+`) is enabled and your internet is active. Try `depth: deep` for sparse topics. +- **Quota Exceeded**: Check your plan and usage on the Linkup dashboard. +- **Settings Not Saved**: Verify admin privileges and that `webui.db` is writable. + +## Additional Resources + +- [Linkup Documentation](https://docs.linkup.so/): API reference and advanced options. +- [Open WebUI Features](https://docs.openwebui.com/features): Details on RAG and web search. +- [Contributing to Open WebUI](https://docs.openwebui.com/contributing): Share improvements or report issues. diff --git a/docs/features/extensibility/index.md b/docs/features/extensibility/index.md index e103daf5b..fa36d46b6 100644 --- a/docs/features/extensibility/index.md +++ b/docs/features/extensibility/index.md @@ -9,11 +9,35 @@ title: "Extensibility" Open WebUI ships with powerful defaults, but your workflows aren't default. Extensibility is how you close the gap: give models real-time data, enforce compliance rules, add new AI providers, or connect to any external service. Write a few lines of Python, point at an OpenAPI endpoint, or browse the community library. The platform adapts to you, not the other way around. -There are three layers, and most teams end up using at least two: +There are two layers, and most teams end up using both: - **In-process Python** (Tools & Functions) runs inside Open WebUI itself with zero infrastructure and instant iteration. - **External HTTP** (OpenAPI & MCP servers) connects to services running anywhere, from a sidecar container to a third-party SaaS. -- **Pipeline workers** (Pipelines) offload heavy or sensitive processing to a separate container, keeping your main instance fast and clean. + +:::warning Pipelines are legacy +You may still see **Pipelines** referenced as a third layer. It is **legacy and no longer recommended** — the heavy-processing problem it solved no longer exists (see [Run heavy or long-running work](#run-heavy-or-long-running-work) below). Use Functions, Tools, or an external tool server instead. See the [Pipelines pages](pipelines) for the full deprecation notice. +::: + +--- + +## Which Extension Do I Need? + +The names don't always map obviously to what they do. Start from what you're trying to accomplish: + +| I want to... | Use | Why this one | +|---|---|---| +| Let the model **call an API or perform an action** (and keep a secret/API key the user and model can never read) | **[Tool](plugin/tools)** | The key lives inside the tool, server-side. The model only sees the *result*, never the credential. | +| **Add a new model or provider** to the model selector | **[Pipe Function](plugin/functions/pipe)** | A Pipe appears as a selectable "model" and handles the request however you like. | +| **Modify messages** going in or out (redact PII, inject system text, log, translate) | **[Filter Function](plugin/functions/filter)** | Filters run on every message via `inlet`/`outlet`/`stream` without touching model config. | +| Add a **button on a message** that runs custom code | **[Action Function](plugin/functions/action)** | Actions are user-triggered, per-message operations. | +| Teach the model **how to approach a task** (methodology, steps, house style) | **[Skill](/features/workspace/skills)** | Skills are instructions, not code. The model reads them; they don't execute anything. | +| Give the model **documents to retrieve from** | **[Knowledge](/features/workspace/knowledge)** | RAG over your files, attached to a model or referenced with `#`. | +| Save a **reusable prompt** behind a slash command | **[Prompt](/features/workspace/prompts)** | Templated text with typed variables; expands when you type `/name`. | +| Connect an **existing external service** that already speaks HTTP | **[OpenAPI / MCP server](mcp)** | Point Open WebUI at the spec; endpoints become callable tools. No glue code. | + +:::tip "Pipe" vs "Pipeline" — not the same thing +This is the single most common naming mix-up. A **Pipe** is a type of **Function** (in-process Python, adds a provider to the model list). A **Pipeline** is a **separate external worker container**. They share a prefix and nothing else. If you want to add a model provider, you almost always want a **Pipe Function**, not a Pipeline. +::: --- @@ -31,9 +55,11 @@ Have an internal API? A third-party SaaS with an OpenAPI spec? An MCP server alr Functions let you intercept and transform messages before they reach the model (input filters) or before they reach the user (output filters). Help redact PII, enforce formatting rules, log to an observability platform, inject system instructions dynamically, all without touching model configuration. -### Offload heavy processing +### Run heavy or long-running work + +Open WebUI's backend is **fully async**. Long-running Tools and Functions (awaiting an external API, a slow query, a multi-step agent) do not block other users, and synchronous/CPU-bound plugin code is offloaded to a worker thread pool (see [`THREAD_POOL_SIZE`](/reference/env-configuration#thread_pool_size)) — so it doesn't stall the event loop either. In practice you can run heavy work **in-process** without the latency problems that older synchronous releases had. -When a plugin needs GPU access, large dependencies, or isolated execution, run it as a Pipeline on a separate machine. Open WebUI talks to it over a standard API. Your main instance stays lean. +The historical reason to push heavy pipes/filters onto a separate **Pipelines** worker — keeping the single synchronous event loop unblocked — no longer applies. If you genuinely need **GPU access, large or conflicting dependencies, hard isolation, or independent scaling**, run that work as an **external service behind an [OpenAPI or MCP tool server](mcp)**, not a Pipeline. ### Import from the community @@ -49,7 +75,6 @@ Browse hundreds of community-built Tools and Functions from the Open WebUI Commu | ⚙️ **Functions** | Platform extensions that add model providers (Pipes), message processing (Filters), or UI actions (Actions) | | 🔗 **MCP support** | Native Streamable HTTP for Model Context Protocol servers | | 🌐 **OpenAPI servers** | Auto-discover and expose tools from any OpenAPI-compatible endpoint | -| 🔧 **Pipelines** | Modular plugin framework running on a separate worker for heavy or sensitive processing | | 📝 **Skills** | Markdown instruction sets that teach models how to approach specific tasks | | ⚡ **Prompts** | Slash-command templates with typed input variables and versioning | | 🏪 **Community library** | One-click import of community-built Tools and Functions | @@ -62,11 +87,10 @@ Understanding which layer to use saves time: | Layer | Runs where | Best for | Trade-off | |-------|-----------|----------|-----------| -| **Tools & Functions** | Inside Open WebUI process | Real-time data, filters, UI actions, new providers | Shares resources with the main server | -| **OpenAPI / MCP** | Any HTTP endpoint | Connecting existing services, third-party APIs | Requires a running external server | -| **Pipelines** | Separate Docker container | GPU workloads, heavy dependencies, sandboxed execution | Additional infrastructure to manage | +| **Tools & Functions** | Inside Open WebUI process | Real-time data, filters, UI actions, new providers — including heavy/long-running work (the async backend keeps it from blocking) | Shares CPU/RAM with the main server | +| **OpenAPI / MCP** | Any HTTP endpoint | Connecting existing services, third-party APIs, and GPU / heavy-dependency / isolated workloads | Requires a running external server | -Most users start with **Tools & Functions**. They require no extra setup, have a built-in code editor, and cover the majority of use cases. +Most users start with **Tools & Functions**. They require no extra setup, have a built-in code editor, and cover the majority of use cases. (**Pipelines** is a legacy third option, no longer recommended — see the note above.) --- @@ -84,9 +108,9 @@ A healthcare organization deploys a Filter Function that scans outbound messages An engineering team uses Pipe Functions to add Anthropic, Google Vertex AI, and a self-hosted vLLM instance alongside their existing Ollama models. Users see all providers in a single model selector with no separate logins and no API key juggling. -### Heavy-compute pipelines +### GPU-bound external processing -A research group runs a Retrieval-Augmented Generation pipeline that re-ranks with a cross-encoder model requiring GPU. They deploy it as a Pipeline on a dedicated GPU node. Open WebUI routes relevant queries to the pipeline automatically while keeping the main instance on commodity hardware. +A research group needs to re-rank retrieval results with a cross-encoder model that requires a GPU. They run it as a small service on a dedicated GPU node and expose it to Open WebUI as an **[OpenAPI tool server](mcp)**. The model calls it like any other tool while the main instance stays on commodity hardware. (The async backend means lighter custom logic can simply run in-process as a Function — only the GPU dependency pushes this particular workload to a separate service.) --- @@ -94,11 +118,11 @@ A research group runs a Retrieval-Augmented Generation pipeline that re-ranks wi ### Security -Tools, Functions, and Pipelines execute **arbitrary Python code** on your server. Only install extensions from trusted sources, review code before importing, and restrict Workspace access to administrators. See the [Security Policy](/security) for details. +Tools and Functions execute **arbitrary Python code** on your server. Only install extensions from trusted sources, review code before importing, and restrict Workspace access to administrators. See the [Security Policy](/security) for details. ### Resource sharing -In-process Tools and Functions share CPU and memory with Open WebUI. Computationally expensive plugins should be moved to Pipelines or external services. +In-process Tools and Functions share CPU and memory with Open WebUI. The async backend keeps long-running and blocking work from stalling other requests, but it does not create more hardware — genuinely CPU- or GPU-heavy workloads still compete for the same machine. For those, run the work as an external service behind an [OpenAPI / MCP tool server](mcp) so it scales independently. ### MCP transport @@ -112,4 +136,4 @@ Native MCP support is **Streamable HTTP only**. For stdio or SSE-based MCP serve |-------|-------------------| | [**Tools & Functions**](plugin) | Writing Python Tools, Functions (Pipes, Filters, Actions), and the development API | | [**MCP**](mcp) | Connecting Model Context Protocol servers, OAuth setup, troubleshooting | -| [**Pipelines**](pipelines) | Deploying the pipeline worker, building custom pipelines, directory structure | +| [**Pipelines**](pipelines) *(legacy)* | Reference only — the deprecated separate-worker framework, superseded by Functions and Tools | diff --git a/docs/features/extensibility/mcp.mdx b/docs/features/extensibility/mcp.mdx index 3b68120dd..29a950bd0 100644 --- a/docs/features/extensibility/mcp.mdx +++ b/docs/features/extensibility/mcp.mdx @@ -32,6 +32,16 @@ Entering MCP-style configuration (with `mcpServers` in JSON) into an OpenAPI con 2. Re-add it with the correct **Type** set to **MCP** ::: +## 🔒 MCP servers are admin-only {#mcp-servers-are-admin-only} + +MCP servers can only be added by **administrators**, under **Admin Settings → External Tools**. Regular users cannot register their own, by design. + +This is **not** the same restriction as OpenAPI. When you grant the **Direct Tool Servers** permission (per user or per group, off by default), users can add their own **OpenAPI** tool servers under **Settings → Tools**, but that path is OpenAPI-only: the connection type is locked, with no MCP option. + +The difference is capability. A user-supplied OpenAPI server is a stateless HTTP URL exposing a fixed set of declared endpoints. An MCP server is far more powerful: it is stateful and capability-rich (sampling, elicitation, persistent sessions and arbitrary host command execution over stdio transports), and it runs inside Open WebUI's trust boundary with the connecting user's full scope. In practice a malicious or compromised MCP server could execute code and read or exfiltrate data with that user's access, so the capability stays admin-gated. Open WebUI's own MCP support is Streamable HTTP only, but the protocol's privileged nature is why adding one is reserved for admins. + +To give users an MCP-backed capability without server-configuration rights, an admin adds the server once and scopes it with **Access Control** to the right users or groups. + ## 🧭 When to use MCP vs OpenAPI :::tip @@ -128,11 +138,17 @@ Both MCP and OpenAPI tool-server connections accept a free-form **Headers** fiel | :--- | :--- | | `{{USER_ID}}` | The calling user's ID. | | `{{USER_NAME}}` | The calling user's display name. | +| `{{USER_EMAIL}}` | The calling user's email address. | +| `{{USER_ROLE}}` | The calling user's role (e.g. `admin`, `user`). | | `{{CHAT_ID}}` | The current chat ID (empty in non-chat contexts like the **Verify Connection** button). | | `{{MESSAGE_ID}}` | The current message ID (empty in non-chat contexts). | Unknown tokens are passed through as literal text. Non-string header values are coerced to strings before substitution. The same tokens are honored on custom headers attached to OpenAI-compatible model connections in **Admin Settings → Connections → OpenAI**, so you can use the feature for tenant routing or audit-trail propagation across both surfaces. +:::note +`{{USER_EMAIL}}` and `{{USER_ROLE}}` were added in v0.9.6. The same release also fixed MCP server connections, where custom-header templates were previously stored but **not** interpolated at request time — they now expand the same way they always have for direct connections and OpenAPI tool servers. +::: + ### Function Name Filter List This field restricts which tools are exposed to the LLM. @@ -182,3 +198,7 @@ Supported and improving. The broader ecosystem is still evolving; expect occasio **Can I mix OpenAPI and MCP tools?** Yes. Many deployments do both. + +**Can users add their own MCP servers?** + +No. Adding MCP servers is admin-only (**Admin Settings → External Tools**). Users with the **Direct Tool Servers** permission can add their own **OpenAPI** tool servers, but not MCP. See [MCP servers are admin-only](#mcp-servers-are-admin-only) for the reasoning. diff --git a/docs/features/extensibility/pipelines/filters.md b/docs/features/extensibility/pipelines/filters.md index 24c197c02..ce636a7d0 100644 --- a/docs/features/extensibility/pipelines/filters.md +++ b/docs/features/extensibility/pipelines/filters.md @@ -5,6 +5,15 @@ title: "Filters" ## Filters +:::danger Pipelines are legacy — do not use for new deployments +**Pipelines are outdated and legacy, and are no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that are built in, easier to configure, and need no separate worker container: + +- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe) +- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter) + +This page is kept for reference and existing deployments only. +::: + Filters are used to perform actions against incoming user messages and outgoing assistant (LLM) messages. Potential actions that can be taken in a filter include sending messages to monitoring platforms (such as Langfuse or DataDog), modifying message contents, blocking toxic messages, translating messages to another language, or rate limiting messages from certain users. A list of examples is maintained in the [Pipelines repo](https://github.com/open-webui/pipelines/tree/main/examples/filters). Filters can be executed as a Function or on a Pipelines server. The general workflow can be seen in the image below.
diff --git a/docs/features/extensibility/pipelines/index.mdx b/docs/features/extensibility/pipelines/index.mdx index 5676e14f7..a8bf05ddb 100644 --- a/docs/features/extensibility/pipelines/index.mdx +++ b/docs/features/extensibility/pipelines/index.mdx @@ -12,12 +12,14 @@ title: "Pipelines" # Pipelines: UI-Agnostic OpenAI API Plugin Framework -:::warning +:::danger Pipelines are legacy — do not use for new deployments +**Pipelines are legacy and are no longer recommended.** They predate the in-process [Functions](/features/extensibility/plugin/functions/) (Pipes, Filters, Actions) and [Tools](/features/extensibility/plugin/tools/) system, which now covers the same use cases without running a separate worker container. -**DO NOT USE PIPELINES IF!** - -If your goal is simply to add support for additional providers like Anthropic or basic filters, you likely don't need Pipelines . For those cases, Open WebUI Functions are a better fit—it's built-in, much more convenient, and easier to configure. Pipelines, however, comes into play when you're dealing with computationally heavy tasks (e.g., running large models or complex logic) that you want to offload from your main Open WebUI instance for better performance and scalability. +- Custom provider / RAG / request routing (a Pipeline **pipe**) → use a [Pipe Function](/features/extensibility/plugin/functions/pipe). +- Message pre/post-processing (a Pipeline **filter**) → use a [Filter Function](/features/extensibility/plugin/functions/filter). +- Connecting an external HTTP service → use an [OpenAPI or MCP tool server](/features/extensibility/mcp). +These pages are kept for reference and for existing deployments only. New work should target Functions, Tools, or external tool servers instead. ::: Welcome to **Pipelines**, an [Open WebUI](https://github.com/open-webui) initiative. Pipelines bring modular, customizable workflows to any UI client supporting OpenAI API specs – and much more! Easily extend functionalities, integrate unique logic, and create dynamic workflows with just a few lines of code. diff --git a/docs/features/extensibility/pipelines/pipes.md b/docs/features/extensibility/pipelines/pipes.md index a02365b67..8c65aeacc 100644 --- a/docs/features/extensibility/pipelines/pipes.md +++ b/docs/features/extensibility/pipelines/pipes.md @@ -5,6 +5,15 @@ title: "Pipes" ## Pipes +:::danger Pipelines are legacy — do not use for new deployments +**Pipelines are outdated and legacy, and are no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that are built in, easier to configure, and need no separate worker container: + +- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe) +- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter) + +This page is kept for reference and existing deployments only. +::: + Pipes are standalone functions that process inputs and generate responses, possibly by invoking one or more LLMs or external services before returning results to the user. Examples of potential actions you can take with Pipes are Retrieval Augmented Generation (RAG), sending requests to non-OpenAI LLM providers (such as Anthropic, Azure OpenAI, or Google), or executing functions right in your web UI. Pipes can be hosted as a Function or on a Pipelines server. A list of examples is maintained in the [Pipelines repo](https://github.com/open-webui/pipelines/tree/main/examples/pipelines). The general workflow can be seen in the image below.
@@ -46,7 +55,7 @@ yield {"choices": [{"delta": {}, "finish_reason": "stop"}]} This is the single biggest gotcha when building an agent pipeline (LangChain, LlamaIndex, a custom planner, anything that executes its own tools and streams the result back). -`delta.tool_calls` in a chunk means **"please execute this tool call for me, client"**. When Open WebUI's middleware sees it, the tool executor picks up the call, runs it, appends a `role: "tool"` message, and fires a continuation request back at the same pipeline. It does this in a loop capped by `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES` (≈30). +`delta.tool_calls` in a chunk means **"please execute this tool call for me, client"**. When Open WebUI's middleware sees it, the tool executor picks up the call, runs it, appends a `role: "tool"` message, and fires a continuation request back at the same pipeline. It does this in a loop capped by [`CHAT_RESPONSE_MAX_TOOL_CALL_ITERATIONS`](/reference/env-configuration#chat_response_max_tool_call_iterations) (default 256; `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES`, default 30, on versions before v0.9.6). If your pipeline already executed the tool internally, emitting `delta.tool_calls` makes Open WebUI try to execute it *again* — and since the pipeline keeps emitting the same call on every retry, you get 30 copies of the response stacked on top of each other before the retry cap trips. Same thing happens if you set `finish_reason: "tool_calls"` mid-stream. diff --git a/docs/features/extensibility/pipelines/tutorials.md b/docs/features/extensibility/pipelines/tutorials.md index 9d7302b78..69d0a669a 100644 --- a/docs/features/extensibility/pipelines/tutorials.md +++ b/docs/features/extensibility/pipelines/tutorials.md @@ -5,6 +5,15 @@ title: "Tutorials" ## Pipeline Tutorials +:::danger Pipelines are legacy — do not use for new deployments +**Pipelines are outdated and legacy, and are no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that are built in, easier to configure, and need no separate worker container: + +- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe) +- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter) + +These tutorials are kept for reference and existing deployments only. +::: + ## Tutorials Welcome Are you a content creator with a blog post or YouTube video about your pipeline setup? Get in touch diff --git a/docs/features/extensibility/pipelines/valves.md b/docs/features/extensibility/pipelines/valves.md index 6e333d3db..e86d1cede 100644 --- a/docs/features/extensibility/pipelines/valves.md +++ b/docs/features/extensibility/pipelines/valves.md @@ -5,6 +5,15 @@ title: "Valves" ## Valves +:::danger Pipelines are legacy — do not use for new deployments +**Pipelines are outdated and legacy, and are no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that support [Valves](/features/extensibility/plugin/development/valves) too, are built in, and need no separate worker container: + +- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe) +- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter) + +This page is kept for reference and existing deployments only. +::: + `Valves` (see the dedicated [Valves & UserValves](/features/extensibility/plugin/development/valves) page) can also be set for `Pipeline`. In short, `Valves` are input variables that are set per pipeline. `Valves` are set as a subclass of the `Pipeline` class, and initialized as part of the `__init__` method of the `Pipeline` class. diff --git a/docs/features/extensibility/plugin/development/events.mdx b/docs/features/extensibility/plugin/development/events.mdx index ba8abf730..6090552d2 100644 --- a/docs/features/extensibility/plugin/development/events.mdx +++ b/docs/features/extensibility/plugin/development/events.mdx @@ -795,6 +795,17 @@ When Open WebUI calls your external tool (with header forwarding enabled), it in **Authentication:** Requires a valid Open WebUI API key or session token. +:::warning Open WebUI does **not** forward user credentials to external tools +The `X-OpenWebUI-User-*` and `X-Open-WebUI-Chat-Id` / `X-Open-WebUI-Message-Id` headers forwarded to your tool are **identification only** — they carry no API key or session token. The same applies to MCP custom-header template tokens (`{{USER_ID}}`, `{{USER_NAME}}`, `{{USER_EMAIL}}`, `{{USER_ROLE}}`, `{{CHAT_ID}}`, `{{MESSAGE_ID}}`): there is no `{{API_KEY}}` or `{{TOKEN}}` placeholder, and the user's own API key / session is never sent to the tool server. + +So an external tool **must hold its own statically-configured Open WebUI API key** to call this endpoint. The endpoint's authorization check requires the caller to be the chat's owner **or an admin**, which gives you two practical options: + +- **Per-user key (uncommon)** — the tool server holds the specific user's API key. Only works for a single-user setup; impractical for a shared MCP server. +- **Admin / service-account key (recommended)** — provision a dedicated admin (or service-account) user in Open WebUI, generate an API key for it, and use that key from the tool server. An admin key works for any user's chat, so a single key serves all callers; the forwarded `X-Open-WebUI-Chat-Id` + `X-Open-WebUI-Message-Id` headers tell your tool *which* chat/message to post to. + +Store the key as a secret on the tool server (env var, secrets manager, etc.); do not expect Open WebUI to push it for you. +::: + **Request Body:** ```json diff --git a/docs/features/extensibility/plugin/development/rich-ui.mdx b/docs/features/extensibility/plugin/development/rich-ui.mdx index c3d78a984..db5a7556f 100644 --- a/docs/features/extensibility/plugin/development/rich-ui.mdx +++ b/docs/features/extensibility/plugin/development/rich-ui.mdx @@ -16,28 +16,28 @@ To embed HTML content, your tool should return an `HTMLResponse` with the `Conte ```python from fastapi.responses import HTMLResponse -def create_visualization_tool(self, data: str) -> HTMLResponse: +def render_checklist(self, items: list[str]) -> HTMLResponse: """ - Creates an interactive data visualization that embeds in the chat. + Renders an interactive checklist that embeds in the chat. - :param data: The data to visualize + :param items: The items to show in the checklist """ - html_content = """ + items_html = "".join( + f'
  • ' for item in items + ) + html_content = f""" - Data Visualization - + Checklist + -
    - +
      {items_html}
    """ @@ -55,11 +55,11 @@ To provide the LLM with actionable context about the embed, return a **tuple** o ```python from fastapi.responses import HTMLResponse -def create_chart(self, data: str) -> tuple: +def render_feedback_form(self, prompt: str) -> tuple: """ - Creates an interactive chart and returns context to the LLM. + Renders an interactive feedback form and returns context to the LLM. - :param data: The data to chart + :param prompt: The question to show the user above the form """ html_content = "..." headers = {"Content-Disposition": "inline"} @@ -67,16 +67,16 @@ def create_chart(self, data: str) -> tuple: # The LLM receives this context instead of the generic message result_context = { "status": "success", - "chart_type": "scatter", - "data_points": 42, - "description": "Scatter plot showing correlation between X and Y" + "form_type": "feedback", + "fields": ["rating", "comment"], + "description": f"Rendered a feedback form asking: {prompt!r}" } return HTMLResponse(content=html_content, headers=headers), result_context ``` The context can be: -- A **string** — sent as-is to the LLM (e.g., `"Generated a bar chart with 5 categories"`) +- A **string** — sent as-is to the LLM (e.g., `"Rendered a 5-item checklist"`) - A **dict** — serialized as JSON for structured context - A **list** — serialized as JSON for multiple items @@ -271,11 +271,11 @@ The iframe and parent window can communicate beyond just height reporting. The f ### Payload Requests -The iframe can request a data payload from the parent. This is useful for passing dynamic data into the embed after it loads: +The iframe can ask the parent for a data payload after it loads: ```html ``` -The parent responds with `{ type: 'payload', requestId: ..., payload: ... }` containing the configured payload data. +The parent responds with `{ type: 'payload', requestId: ..., payload: ... }`. + +:::info Where the payload comes from +There is no separate "set the payload" call. The payload is whatever the parent component had configured when it instantiated the iframe — and today only one path actually configures one: + +- ✅ **Citation-opened embeds in the chat-controls Embeds panel** — when the user clicks a citation badge whose source has an embed URL, the side panel opens and exposes **the full citation/source object** (the same dict you sent in your `source` / `citation` event via `__event_emitter__`) as the payload. To set it, emit a [`source` event](./events#source-or-citation-and-code-execution) whose `data` includes whatever you want the iframe to be able to fetch. The iframe then asks for it via the postMessage above and receives the citation object back. +- ❌ **Inline tool-call embeds** (from a tool method returning `HTMLResponse` or `(HTMLResponse, context)`) — the parent does not configure a payload on this path, so a payload request returns `{ type: 'payload', requestId: ..., payload: null }`. Use [Tool Args Injection](#tool-args-injection-tools-only) (subject to `allowSameOrigin`) to pass data into a tool-call embed instead. +- ❌ **`__event_emitter__({"type": "embeds", ...})` and Action embeds** — also configured without a payload; the response is `null`. + +In short: payload-request is the side-panel-citation channel, not a generic iframe-data channel. Pick the right rendering path for the data flow you need. +::: ### Tool Args Injection (Tools Only) -When a **Tool** returns a Rich UI embed, the tool call arguments (the parameters the model passed to the tool) are automatically injected into the iframe's `window.args`. This allows your embedded HTML to access the tool's input: +When a **Tool** method returns a Rich UI embed inline at the tool-call display (i.e. you return an `HTMLResponse`, or a `(HTMLResponse, context)` tuple, from the tool method itself), the arguments the model passed are exposed on the iframe as `window.args` — **as a JSON string**, not a parsed object. Parse it before use: ```html ``` -:::note -This only works for Tool embeds rendered via the tool call display. Action embeds do not have `window.args` since they are triggered by the user, not the model. +:::warning Requires `allowSameOrigin` — otherwise `window.args` is silently `undefined` +The args are injected from the parent page via `iframe.contentWindow.args = ...`, which the browser blocks under same-origin policy unless the iframe sandbox carries `allow-same-origin`. That is gated by the per-user **Settings → Interface → "iframe Sandbox Allow Same Origin"** toggle, which is **off by default**. If `window.args` comes back undefined and you have not changed this setting, that is the cause: turn it on and reload. See [allowSameOrigin](#allowsameorigin) above for the security trade-off. +::: + +:::note Where `window.args` is set, and where it is not +- ✅ **Tool method returning `HTMLResponse` or `(HTMLResponse, context)` tuple** — rendered inline at the "View Result from..." tool call indicator. `window.args` is injected (subject to the `allowSameOrigin` requirement above). +- ❌ **`__event_emitter__({"type": "embeds", "data": {"embeds": [...]}})`** — rendered through the chat-controls Embeds panel, which does not wire `args` at all. `window.args` will always be undefined here, regardless of sandbox settings. This is by design: the embeds-event path has no tool call attached, so there are no args to inject. +- ❌ **Action embeds** — triggered by the user, not the model, so there are no model-supplied args to inject. + +If you need to pass dynamic data into an embed rendered via either of the ❌ paths, use the [Payload Requests](#payload-requests) pattern above instead. ::: ### Auto-Injected Libraries diff --git a/docs/features/extensibility/plugin/development/under-the-hood.mdx b/docs/features/extensibility/plugin/development/under-the-hood.mdx new file mode 100644 index 000000000..4d6f9a983 --- /dev/null +++ b/docs/features/extensibility/plugin/development/under-the-hood.mdx @@ -0,0 +1,190 @@ +--- +sidebar_position: 5 +title: "Under the Hood" +--- + +# 🔧 Under the Hood: What the Plugin Loader Actually Does + +:::danger ⚠️ Critical Security Warning +**Tools, Functions, Pipes, Filters, and Actions execute arbitrary Python code on your server.** Function creation is restricted to administrators only, and Workspace Tool creation is gated by the `workspace.tools` permission — granting that permission is equivalent to giving the user shell access to the server. Only install from trusted sources, review code before importing, and restrict creation to trusted administrators. A malicious plugin could access your file system, exfiltrate data, or compromise your entire system. For full details, see the [Plugin Security Warning](/features/extensibility/plugin/). +::: + +Open WebUI's plugins (Tools, Functions = Filters / Pipes / Actions) are not sandboxed scripts running in some restricted runtime. They are **Python modules executed inside your Open WebUI process**, with full access to the standard library, any pip package, the entire `open_webui` codebase, the live FastAPI app, and the database. The documented hooks (`inlet`, `outlet`, `stream`, `pipe`, `action`) are *one* way to use that access. They are not the only way. + +This page documents what the loader really does and what that opens up, so you can build (or audit) plugins beyond the patterns shown on the per-type pages. It also lists the footguns that come with the territory. + +--- + +## How a plugin is loaded + +A single loader in [`backend/open_webui/utils/plugin.py`](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/utils/plugin.py) handles every plugin type: + +1. The plugin's Python source is read from the database. +2. A fresh `types.ModuleType` is created and registered in `sys.modules` as `function_{id}` (or `tool_{id}`). +3. The source is fed to `exec(content, module.__dict__)`. Anything at module top level runs at this point. +4. The loader looks for **one** entry-point class: `Tools`, `Pipe`, `Filter`, or `Action`. That class becomes the handle Open WebUI calls into. +5. The module stays in `sys.modules` for the life of the process. Any side effect of step 3 (imports, monkey-patches, background tasks, route registrations) is now installed in the live application. + +The entry-point class is the only thing the rest of Open WebUI cares about. Everything else in the file is yours. + +### When the module is re-executed + +Inlet/outlet hooks pass `load_from_db=True`. The loader still serves from cache if the source has not changed, but it consults the database on every call to decide that. Stream hooks pass `load_from_db=False` and read straight from cache. + +| Hook | DB check per call? | Module re-exec'd when? | +|---|---|---| +| `inlet` / `outlet` (Filter) | yes | source change between calls | +| `stream` (Filter) | no | only when another hook re-loads it | +| Tools, Pipes, Actions | yes on dispatch | source change between calls | + +Practical consequences: + +- **Editing a Filter via the editor takes effect on the next chat for `inlet`/`outlet`.** Stream picks it up the next time an `inlet` or `outlet` triggers a reload. +- **Re-execution is not per-request**, so module-top-level work is paid for once per content version, not once per chat. Top-level imports, patches, and singletons are fine. +- **Disabling or deleting a plugin** removes it from the active set. It does **not** undo anything its module top level did. The module stays in `sys.modules` and any monkey-patches it installed in other modules stay applied until the process restarts. + +--- + +## What you actually have access to + +From any hook (and from module top level): + +- The full `open_webui.*` package. Examples: `from open_webui.models.chats import Chats`, `from open_webui.utils.middleware import process_chat_payload`, `from open_webui.config import ConfigVar`. +- The live FastAPI `Request` via `__request__`, which carries `__request__.app` (the FastAPI app), `__request__.app.state` (config, caches, handlers), and `__request__.state` (per-request scratch). +- The reserved dunder args documented in [Reserved Arguments](./reserved-args): `__user__`, `__metadata__`, `__model__`, `__request__`, `__event_emitter__`, `__event_call__`, `__features__`, `__body__`, `__id__`, `__oauth_token__`, plus stream-only and per-hook extras. +- Events documented in [Events](./events): emit anything to the frontend, or solicit a response from the user with `event_call`. +- Any pip package via `requirements:` frontmatter, installed at load time (gated by [`ENABLE_PIP_INSTALL_FRONTMATTER_REQUIREMENTS`](/reference/env-configuration#enable_pip_install_frontmatter_requirements)). +- The Python stdlib, plus everything pip-installed in the container. + +There is no sandbox, no allowlist, no capability system. The execution model is **"this is Python, you are inside the server process"**. + +--- + +## Patterns + +### 1. Mutate the per-request model dict from `inlet` + +The `__model__` you receive is **the same dict object** the rest of the request reads. Changing its keys from `inlet` changes how the rest of the pipeline behaves on this request. Example (the reasoning-content fix for DeepSeek / Kimi / MiMo): + +```python +class Filter: + async def inlet(self, body: dict, __model__: dict = None) -> dict: + # Flip the per-request model to the code path that emits + # reasoning_content as a top-level field on assistant messages + # during the native tool-call loop. + if __model__ and __model__.get("provider") not in ("ollama", "llama.cpp"): + __model__["provider"] = "llama.cpp" + return body +``` + +Same trick works for any other field the middleware reads from the model dict: `params`, `meta`, custom keys you put there yourself and then read from another hook. + +### 2. Monkey-patch a backend function + +Because the plugin module can `import open_webui.*` and rebind module attributes: + +```python +import open_webui.utils.middleware as _mw + +_original = _mw.process_chat_payload + +async def _patched(request, form_data, user, metadata, model): + # ...your wrapping logic, then delegate... + return await _original(request, form_data, user, metadata, model) + +_mw.process_chat_payload = _patched +``` + +Runs at module load (once per source version). The patch persists in `sys.modules` for the life of the process. Deleting or disabling the plugin **does not** revert the patch. The only clean rollback is a process restart. + +Use sparingly. Cross-plugin interference is a real risk: if two plugins patch the same function the result depends on load order, which is not deterministic. + +### 3. Add a new HTTP route at load + +```python +def _ensure_route(app): + if any(getattr(r, "path", None) == "/my/route" for r in app.routes): + return + app.add_api_route("/my/route", my_handler, methods=["GET"]) +``` + +Call from the first hook with access to `__request__.app`. The idempotency guard is important: the loader may re-execute on edits, and `add_api_route` will happily register the same path twice. + +### 4. Spawn a background task + +```python +import asyncio + +async def _loop(app): + while True: + # ...periodic work... + await asyncio.sleep(60) + +def _start_once(app): + if getattr(app.state, "_my_plugin_started", False): + return + app.state._my_plugin_started = True + asyncio.create_task(_loop(app)) +``` + +The `app.state` flag makes it "once per process" rather than "once per source version". On a clean restart it starts fresh. + +### 5. Stash state in `app.state` + +```python +async def inlet(self, body, __request__): + cache = __request__.app.state.__dict__.setdefault("my_cache", {}) + # ...read/write cache... + return body +``` + +Shared across requests and **across plugins** in the same process. There is no namespacing: pick a unique key. + +### 6. Use `event_emitter` for arbitrary side effects in the UI + +`event_emitter` accepts any event shape the frontend handles: status banners, source citations, file attachments, chat-message updates, toasts. You are not restricted to the events documented on the per-type pages. See [Events](./events) for the full catalogue. + +### 7. Prompt the user mid-handler with `event_call` + +`event_call` is `event_emitter` that **awaits a response**. Show a form, a confirmation, an input dialog, and block until the user answers. Useful inside Tool methods that need a human in the loop, or Action handlers that confirm before executing. + +### 8. Pipes as full provider replacements + +A `Pipe` replaces the entire LLM call. Open WebUI hands you the request and asks for a response back. Nothing in the middleware constrains what you put in that response, so: + +- wrap an external API (any provider, any protocol), +- route between providers based on request shape, +- run an entire agent inside `pipe()` and stream the agent's output back, +- skip any model entirely and return canned content. + +A Pipe is the most powerful entry point precisely because the middleware steps out of the way. + +### 9. Tools that do more than their docstring says + +A `Tools` class's methods are exposed to the model as callable tools (their docstrings become JSON schema). The method body can do **anything**: call external APIs, emit UI events with `__event_emitter__`, stash data in `app.state`, monkey-patch on first call. The docstring is purely how the tool advertises itself to the model. The implementation is unconstrained. + +### 10. Actions as arbitrary one-shot operations + +`Action` renders a button on an assistant message. The handler runs server-side with the same dunder surface as Filters and Tools, against the chat that the message belongs to. Use for "approve this", "re-run with...", "send to external system", or any one-off operation a user should be able to trigger from a specific message. + +--- + +## Footguns + +- **No sandboxing.** Tools and Functions execute Python in your backend process as the backend user. The security policy ([Rule 10](/security/security-policy#reporting-guidelines)) treats this as intended behaviour: granting Tool or Function creation permission is equivalent to granting shell access on the host. Treat plugin authors as administrators. +- **Stream hooks use a stale cache.** Edits to a `stream` method only take effect after another hook (or a process restart) refreshes the module. If you edit a stream filter and the change does not seem to apply, trigger an `inlet`/`outlet` reload or restart. +- **Cross-plugin interference is not detected.** Two plugins patching the same function, registering the same route, or writing to the same `app.state` key will collide. Load order is not deterministic. Prefer additive patterns (your own namespaces, wrappers that delegate) over destructive ones. +- **Disabling does not unload.** The module stays in `sys.modules` and any module-level side effects stay installed. Restart the process to fully revert. +- **`requirements:` runs `pip install` on every replica at load.** In multi-replica deployments set [`ENABLE_PIP_INSTALL_FRONTMATTER_REQUIREMENTS=False`](/reference/env-configuration#enable_pip_install_frontmatter_requirements) and pre-install dependencies in your image; runtime installs race across workers and crash. See [Scaling → Function/Tool Dependency Installation Crashes](/troubleshooting/multi-replica#9-functiontool-dependency-installation-crashes). +- **Internal APIs are not a stable public surface.** `open_webui.utils.*`, the internal model classes, middleware helpers, and pretty much everything outside the documented dunder args and event types can rename, move, or change signatures between releases. If your monkey-patch breaks after an upgrade, that is on you to repair. +- **The Pipelines server is out of scope here.** This page is about in-process plugins (Tools / Functions). The separate [Pipelines](/features/extensibility/pipelines/) server runs out-of-process and does not share `sys.modules` with Open WebUI: it cannot monkey-patch the main app, but it also is not constrained by it. + +--- + +## When this is the wrong tool + +For anything you can express through the documented hooks (filters that mutate `body`, tools that call APIs and return results, actions that emit events), **stay in the documented hooks**. The patterns above are powerful, but their durability is shallow: cross-plugin interaction, upgrade compatibility, and rollback all degrade the moment you start patching module internals. + +If your plugin needs an interface that does not exist yet, an upstream PR is more durable than a monkey-patch. + +If you file a bug report against a code path that your plugin is monkey-patching, expect it to be closed. Reports must reproduce against an unmodified Open WebUI ([Rule 6](/security/security-policy#reporting-guidelines)). diff --git a/docs/features/extensibility/plugin/functions/filter.mdx b/docs/features/extensibility/plugin/functions/filter.mdx index af374e6cf..270da77e1 100644 --- a/docs/features/extensibility/plugin/functions/filter.mdx +++ b/docs/features/extensibility/plugin/functions/filter.mdx @@ -3,7 +3,7 @@ sidebar_position: 3 title: "Filter Function" --- -# 🪄 Filter Function: Modify Inputs and Outputs +# Filter Function: Modify Inputs and Outputs :::danger ⚠️ Critical Security Warning **Filter Functions execute arbitrary Python code on your server.** Function creation is restricted to administrators only. Only install from trusted sources and review code before importing. A malicious Function could access your file system, exfiltrate data, or compromise your entire system. For full details, see the [Plugin Security Warning](/features/extensibility/plugin/). @@ -15,7 +15,7 @@ This guide will break down **what Filters are**, how they work, their structure, --- -## 🌊 What Are Filters in Open WebUI? +## What Are Filters in Open WebUI? Imagine Open WebUI as a **stream of water** flowing through pipes: @@ -36,11 +36,11 @@ Filters are like **translators or editors** in the AI workflow: you can intercep --- -## 🗺️ Structure of a Filter Function: The Skeleton +## Structure of a Filter Function: The Skeleton Let's start with the simplest representation of a Filter Function. Don't worry if some parts feel technical at first—we’ll break it all down step by step! -### 🦴 Basic Skeleton of a Filter +### Basic Skeleton of a Filter ```python from pydantic import BaseModel @@ -73,7 +73,7 @@ class Filter: --- -### 🧲 Toggleable Filters: Making Filters User-Controllable (`self.toggle`) +### Toggleable Filters: Making Filters User-Controllable (`self.toggle`) By default a filter that's **active and in scope** (global, or attached to the model) runs on every request — the user has no say in it. That's often what you want (PII scrubbing, logging, mandatory guardrails). Sometimes you want the opposite: let the user decide whether the filter runs for a given conversation. @@ -144,9 +144,88 @@ The chip being present = the filter is enabled for the next request. The chip be --- -## ⚙️ Filter Administration & Configuration +### Owning Retrieval With file_handler -### 🌐 Global Filters vs. Model-Specific Filters +By default, when a user attaches a knowledge collection or uploads a file to a chat, Open WebUI runs the built-in RAG pipeline **after** every inlet filter has returned. The chat-completion handler queries the vector DB for chunks relevant to the user's last message, wraps them in `` tags, appends them to the last user message (or to a system message, depending on `RAG_SYSTEM_CONTEXT`), and only then calls the LLM. + +This is important to understand for filter authors: at `inlet()` time, `body["metadata"]["files"]` and `body["files"]` contain only the file/collection *references* (IDs, names, types). **The chunk text doesn't exist yet** — retrieval hasn't happened. So if you want to inspect or transform the chunks themselves (PII / PHI redaction, reranking, custom hybrid scoring, translation, chunk-level access control, anonymization), the standard inlet contract is not enough — the data you want isn't there yet. + +**`file_handler = True`** is the opt-in escape hatch for exactly this case. Declared as a **module-level attribute** at the top of your filter file, it tells Open WebUI "I am handling retrieval and chunk injection myself — skip the built-in RAG step." When set, the backend strips `body["metadata"]["files"]` and `body["files"]` after your `inlet()` returns, so the chat-completion handler finds no files to retrieve over and goes straight to the LLM with whatever you injected. + +```python +from pydantic import BaseModel +from typing import Optional + +# Module-level attribute — sits OUTSIDE the Filter class, alongside imports. +file_handler = True + +class Filter: + class Valves(BaseModel): + pass + + def __init__(self): + self.valves = self.Valves() + + async def inlet( + self, + body: dict, + __request__=None, + __user__: Optional[dict] = None, + __model__: Optional[dict] = None, + ) -> dict: + # body["metadata"]["files"] still contains the file/collection REFERENCES here. + # After this method returns, Open WebUI strips them and does NOT run its own RAG. + # Therefore: it is YOUR job to retrieve, transform, and inject chunks below. + return body +``` + +:::warning Module attribute, not `self.file_handler` +Open WebUI reads `file_handler` from the **module object** (the file your filter lives in), not from the `Filter` instance. Setting `self.file_handler = True` inside `__init__` is silently ignored. Put the assignment at the top of the file, alongside your imports — exactly as shown above. +::: + +#### When to use it + +- **Per-model redaction.** Apply PII / PHI scrubbing only when the request targets a remote model, while letting a self-hosted model see raw chunks. Branch on `__model__["owned_by"]` (or another signal) inside the inlet and transform chunks accordingly. +- **Custom retrieval logic.** Hybrid BM25 + dense scoring, query rewriting, multi-collection routing, reranking with a different model than the one Open WebUI uses, result caching keyed on the rewritten query. +- **Pre-injection transformation.** Translation, summarization, deduplication, or any transform that needs the *actual chunk text* rather than just the references. +- **Chunk-level access control.** Filter out chunks the current user shouldn't see based on metadata attached to the source documents. + +#### The recipe + +1. Set `file_handler = True` at the top of your filter module. +2. In `inlet()`, read the file references from `body["metadata"]["files"]` (and `body["files"]` for ad-hoc attachments). +3. Retrieve chunks yourself. Two options: + - **HTTP**: call `POST /api/v1/retrieval/query/doc` (single collection) or `POST /api/v1/retrieval/query/collection` (multiple), passing the user's last message as the query string and the inbound request's bearer token so permissions stay scoped to the user. + - **In-process**: `from open_webui.retrieval.utils import get_sources_from_items` and call it directly with the same arguments the core code uses. This avoids the network hop and returns a cleaner shape (list of dicts each containing a `document` array of chunks and a parallel `metadata` array). +4. Transform the chunks however you need. Branch on `__model__` / `__user__` if the transform is conditional (e.g. "redact only when the model is remote"). +5. Inject the transformed chunks back into `body["messages"]`. To preserve clickable citations in the UI, mirror the format Open WebUI uses internally: + + ```html + + ...chunk text... + + ``` + + Plain Markdown also works if you don't care about citations being clickable in the UI — only the structured `` form wires up the citation popovers. +6. Return `body`. The built-in RAG step is skipped (because `file_handler` caused the file references to be stripped), and the LLM call goes out with your sanitized chunks already in the prompt. + +#### Caveat: it's static, all-or-nothing per filter + +`file_handler` is read **once per filter, at the module level**. It is not a per-request signal and cannot be flipped based on the model, user, or chat from inside `inlet()`. When set, the built-in RAG is **always** skipped for any request where this filter is invoked — regardless of whether your `inlet()` actually called any retrieval logic on that particular request. + +In practice this means: if you use `file_handler = True`, your filter must handle retrieval for **every** scenario where files would normally be retrieved by the built-in path, including the cases where you'd have been happy with the default behavior. The retrieval call itself is identical in both cases; only any conditional *transformation* (e.g. "only redact for remote models") branches on context. + +If you genuinely need per-request switching between built-in and custom retrieval (e.g. "use built-in RAG for some users, custom for others on the same model"), the cleanest approach is to gate the custom-RAG filter on `self.toggle = True` so it only runs when the user has it selected — when the filter isn't selected, it doesn't run, its `file_handler` doesn't apply, and the built-in RAG handles the request normally. Don't try to dynamically mutate `file_handler` from inside `inlet()`; the flag is read off the module object before your method is called. + +#### Why this matters compared to mutating `body["files"]` in inlet + +A naive alternative is to clear `body["metadata"]["files"] = []` and `body["files"] = []` inside `inlet()` to suppress the built-in RAG dynamically. This works in practice but is brittle: future Open WebUI versions can add new file/collection plumbing under additional keys, and the official "I'm handling this myself" contract is `file_handler`. Prefer the documented opt-in. + +--- + +## Filter Administration & Configuration + +### Global Filters vs. Model-Specific Filters Open WebUI provides a flexible multi-level filter system that allows you to control which filters are active, how they're enabled, and who can toggle them. Understanding this system is crucial for effective filter management. @@ -191,7 +270,7 @@ POST /functions/id/{filter_id}/toggle/global --- -### 🎛️ The Two-Tier Filter System +### The Two-Tier Filter System Open WebUI uses a sophisticated two-tier system for managing filters on a per-model basis. This can be confusing at first, but it's designed to support both **always-on filters** and **user-toggleable filters**. @@ -258,7 +337,7 @@ class Filter: --- -### 🔄 Toggleable Filters vs. Always-On Filters +### Toggleable Filters vs. Always-On Filters Understanding the difference between these two types is key to using the filter system effectively. @@ -348,7 +427,7 @@ class WebSearchFilter: --- -### 📊 Filter Execution Flow +### Filter Execution Flow Here's the complete flow from admin configuration to filter execution: @@ -386,7 +465,7 @@ Here's the complete flow from admin configuration to filter execution: --- -### 📡 Filter Behavior with API Requests +### Filter Behavior with API Requests When using Open WebUI's API endpoints directly (e.g., via `curl` or external applications), `inlet()` and `stream()` follow the same execution model as WebUI requests. `outlet()` is the one that behaves very differently for direct API callers and is covered in detail below. @@ -608,7 +687,7 @@ Filters are sorted in **ascending** order by priority. A filter with `priority=0 --- -### 🔗 Data Passing Between Filters +### Data Passing Between Filters When multiple filters are active, each filter in the chain receives the **modified data from the previous filter**. The returned value from one filter becomes the input to the next filter in the priority order. @@ -932,6 +1011,10 @@ In the world of Open WebUI, the `inlet` function does this important prep work o 🚀 **Your Task**: Modify and return the `body`. The modified version of the `body` is what the LLM works with, so this is your chance to bring clarity, structure, and context to the input. +:::info Want to transform RAG chunks? `inlet()` runs **before** retrieval +At `inlet()` time, `body["metadata"]["files"]` and `body["files"]` contain only file/collection *references* — the actual chunk text is fetched and injected later, after every inlet filter has returned. If you need to inspect or transform the chunk text itself (PII redaction, reranking, translation, chunk-level ACLs), see [Owning Retrieval With `file_handler`](#owning-retrieval-with-file_handler) for the supported opt-in. +::: + ##### Why Would You Use the `inlet`? 1. **Adding Context**: Automatically append crucial information to the user’s input, especially if their text is vague or incomplete. For example, you might add "You are a friendly assistant" or "Help this user troubleshoot a software bug." @@ -1036,7 +1119,7 @@ async def stream(self, event: dict) -> dict: - Each line represents a **small fragment** of the model's streamed response. - The **`delta.content` field** contains the progressively generated text. -##### 🔄 Example: Filtering Out Emojis from Streamed Data +##### Example: Filtering Out Emojis from Streamed Data ```python async def stream(self, event: dict) -> dict: for choice in event.get("choices", []): @@ -1073,7 +1156,7 @@ The `outlet` function is like a **proofreader**: tidy up the AI's response (or m - **Quality scoring** - Run automated quality checks on model outputs :::info Outlet and API Requests -`outlet()` does **not** run reliably for direct `/api/chat/completions` calls. On tagged releases it is never invoked by that endpoint. On `dev` it can run inline, but only when the caller supplies `chat_id` + `id`, owns the chat, and uses a non-streaming request — and even then the filtered content is not returned in the HTTP response. For direct API integrations that need `outlet()`, follow `/api/chat/completions` with `POST /api/chat/completed`. See [Filter Behavior with API Requests](#-filter-behavior-with-api-requests) for the full picture. +`outlet()` does **not** run reliably for direct `/api/chat/completions` calls. On tagged releases it is never invoked by that endpoint. On `dev` it can run inline, but only when the caller supplies `chat_id` + `id`, owns the chat, and uses a non-streaming request — and even then the filtered content is not returned in the HTTP response. For direct API integrations that need `outlet()`, follow `/api/chat/completions` with `POST /api/chat/completed`. See [Filter Behavior with API Requests](#filter-behavior-with-api-requests) for the full picture. ::: 💡 **Example Use Case**: Strip out sensitive API responses you don't want the user to see: @@ -1169,7 +1252,7 @@ Publishing a curated package on **[openwebui.com](https://openwebui.com/)** lets --- -## 🚧 Potential Confusion: Clear FAQ 🛑 +## Potential Confusion: Clear FAQ ### **Q: How Are Filters Different From Pipe Functions?** diff --git a/docs/features/extensibility/plugin/functions/pipe.mdx b/docs/features/extensibility/plugin/functions/pipe.mdx index 8eb46f923..09a8b1cab 100644 --- a/docs/features/extensibility/plugin/functions/pipe.mdx +++ b/docs/features/extensibility/plugin/functions/pipe.mdx @@ -279,7 +279,7 @@ If you must use a synchronous third-party library in an async handler, wrap the You can modify this proxy Pipe to support additional service providers like Anthropic, Perplexity, and more by adjusting the API endpoints, headers, and logic within the `pipes` and `pipe` functions. :::caution Building a self-contained agent? Don't emit `delta.tool_calls`. -If your Pipe wraps an agent (LangChain, LlamaIndex, a custom planner, …) that executes tools **internally** and then streams the final answer back to the chat, emitting `delta.tool_calls` in the stream will trigger Open WebUI's tool-execution retry loop — the middleware treats `delta.tool_calls` as "please execute this for me, client" and loops back through your pipe, duplicating the response up to `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES` (~30) times. +If your Pipe wraps an agent (LangChain, LlamaIndex, a custom planner, …) that executes tools **internally** and then streams the final answer back to the chat, emitting `delta.tool_calls` in the stream will trigger Open WebUI's tool-execution retry loop — the middleware treats `delta.tool_calls` as "please execute this for me, client" and loops back through your pipe, duplicating the response up to [`CHAT_RESPONSE_MAX_TOOL_CALL_ITERATIONS`](/reference/env-configuration#chat_response_max_tool_call_iterations) (default 256; `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES`, default 30, before v0.9.6) times. For self-contained agents, render tool executions as `
    ` content blocks instead — the same shape Open WebUI itself emits after internal tool execution. See the [Pipes → Self-contained agents and `delta.tool_calls`](/features/extensibility/pipelines/pipes#self-contained-agents-and-deltatool_calls) section for the full pattern, a LangChain example, and the rule of thumb for which path to take. ::: diff --git a/docs/features/extensibility/plugin/tools/development.mdx b/docs/features/extensibility/plugin/tools/development.mdx index 8c86128c0..642d4af94 100644 --- a/docs/features/extensibility/plugin/tools/development.mdx +++ b/docs/features/extensibility/plugin/tools/development.mdx @@ -33,6 +33,10 @@ licence: MIT """ ``` +:::tip Metadata auto-fill (v0.9.6+) +When you create a **new** tool (also applies to functions and skills), the editor reads the frontmatter as you paste or type code and auto-fills the **Name**, **ID**, and **Description** fields from `title` and `description` if you haven't already filled them in. It never overwrites a value you've entered, and it does not re-derive fields when editing an existing item — so you no longer need to retype metadata that's already declared in the source. +::: + ### Tools Class Tools have to be defined as methods within a class called `Tools`, with optional subclasses called `Valves` and `UserValves`, for example: diff --git a/docs/features/extensibility/plugin/tools/index.mdx b/docs/features/extensibility/plugin/tools/index.mdx index 1b4b354cf..6130b2c6c 100644 --- a/docs/features/extensibility/plugin/tools/index.mdx +++ b/docs/features/extensibility/plugin/tools/index.mdx @@ -21,12 +21,13 @@ Because there are several ways to integrate "Tools" in Open WebUI, it's importan | Type | Location in UI | Best For... | Source | | :--- | :--- | :--- | :--- | -| **Native Features** | Admin/Settings | Core platform functionality | Built-in to Open WebUI | -| **Workspace Tools** | `Workspace > Tools` | User-created or community Python scripts | [Community Library](https://openwebui.com/search) | -| **Native MCP (HTTP)** | `Settings > Connections` | Standard MCP servers reachable via HTTP/SSE | External MCP Servers | -| **MCP via Proxy (MCPO)** | `Settings > Connections` | Local stdio-based MCP servers (e.g., Claude Desktop tools) | [MCPO Adapter](https://github.com/open-webui/mcpo) | -| **OpenAPI Servers** | `Settings > Connections` | Standard REST/OpenAPI web services | External Web APIs | -| **Open Terminal** | `Settings > Integrations` | Full shell access in an isolated Docker container (always-on) | [Open Terminal](https://github.com/open-webui/open-terminal) | +| **Native Features** | Admin/Settings | Core platform functionality (these are the [built-in system tools](#built-in-system-tools-nativeagentic-mode)) | Built-in to Open WebUI | +| **Workspace Tools** | `Workspace > Tools` | User-created or community Python scripts — **the most powerful, least restricted option** | [Community Library](https://openwebui.com/search) | +| **Native MCP (HTTP)** | `Settings > Connections` | Standard MCP servers reachable via HTTP/SSE | External tool server | +| **MCP via Proxy (MCPO)** | `Settings > Connections` | Local stdio-based MCP servers (e.g., Claude Desktop tools) | External tool server (via [MCPO Adapter](https://github.com/open-webui/mcpo)) | +| **OpenAPI Servers** | `Settings > Connections` | Standard REST/OpenAPI web services | External tool server | + +The last three (**MCP HTTP**, **MCPO**, **OpenAPI**) are all **external tool servers**: the tool code runs on a separate process or machine and Open WebUI calls it over HTTP. **Native Features** are the built-in system tools that ship with Open WebUI. **Workspace Tools** are Python that runs in-process — for the most demanding use cases they are by far the most capable option with the fewest limitations (see below). ### 1. Native Features (Built-in) These are deeply integrated into Open WebUI and generally don't require external scripts. @@ -39,8 +40,8 @@ These are deeply integrated into Open WebUI and generally don't require external In [**Native Mode**](#built-in-system-tools-nativeagentic-mode), these features are exposed as **Tools** that the model can call independently. ### 2. Workspace Tools (Custom Plugins) -These are **Python scripts** that run directly within the Open WebUI environment. -- **Capability**: Can do anything Python can do (web scraping, complex math, API calls). +These are **Python scripts** that run directly within the Open WebUI environment. **For the most demanding use cases, Workspace Tools are by far the most powerful option with the fewest limitations** — they run in-process with full access to Python, the `open_webui` codebase, and the request context, so there is very little they *can't* do (see [Under the Hood](../development/under-the-hood) for the full extent). The external tool servers above are more constrained: they only see what you pass over HTTP and can't reach into Open WebUI itself. +- **Capability**: Can do anything Python can do (web scraping, complex math, API calls), and hold secrets (API keys) entirely server-side so neither the user nor the model can read them. - **Access**: Managed via the `Workspace` menu. - **Safety**: Always review code before importing, as these run on your server. - **⚠️ Security Warning**: Normal or untrusted users should **not** be given permission to access the Workspace Tools section. This access allows a user to upload and execute arbitrary Python code on your server, which could lead to a full system compromise. @@ -54,6 +55,10 @@ These are **Python scripts** that run directly within the Open WebUI environment ### 4. OpenAPI / Function Calling Servers Generic web servers that provide an OpenAPI (`.json` or `.yaml`) specification. Open WebUI can ingest these specs and treat every endpoint as a tool. +:::info Open Terminal — a separate code-execution integration +Beyond the tool types above, Open WebUI also integrates with **[Open Terminal](/features/open-terminal)**: an always-on, isolated Docker container that gives a model a real shell and filesystem. Once connected, it exposes its own set of **built-in tools** (`run_command`, `read_file`, `write_file`, `grep_search`, `glob_search`, process management, and more) that the model can call directly — effectively a sandboxed code-execution and file-handling environment, distinct from the per-message [Code Interpreter](#built-in-system-tools-nativeagentic-mode) tool. See the [Open Terminal documentation](/features/open-terminal) for setup, multi-user, and security considerations. +::: + --- ## How to Install & Manage Workspace Tools @@ -229,8 +234,10 @@ Default Mode is **not** a supported workaround even for DeepSeek — it is legac | `search_knowledge_bases` | Text search over KB names/descriptions. | | `query_knowledge_files` | Search file contents via the RAG retrieval pipeline (hybrid + rerank when enabled). Main tool for finding answers in docs. | | `search_knowledge_files` | Search files by filename. | -| `view_file` | Read a user-accessible file by ID with pagination (`offset`, `max_chars`). | +| `grep_knowledge_files` | Exact text / regex search across knowledge file content. Returns matching lines with line numbers. Complements `query_knowledge_files` (semantic) when you need literal matches. | +| `view_file` | Read a user-accessible file by ID with character pagination (`offset`, `max_chars`) or line range (`start_line`, `end_line`, optional `line_numbers`). | | `view_knowledge_file` | Read a knowledge-base file by ID with pagination (`offset`, `max_chars`). | +| `kb_exec` *(opt-in)* | Filesystem-style command interface for knowledge bases (`ls`, `tree`, `cat`, `head`, `tail`, `sed`, `grep`, `find`, `wc`, `stat`, with pipe support). Directory-aware: `ls docs/`, `tree`, `grep "x" docs/`, and path-based file refs (`docs/api/auth.md`). Replaces the discovery/read tools above when [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set. | | **Image Gen** | *Requires image generation enabled (per-tool) AND per-chat "Image Generation" toggle enabled.* | | `generate_image` | Generates a new image based on a prompt. Requires `ENABLE_IMAGE_GENERATION`. | | `edit_image` | Edits existing images based on a prompt and image URLs. Requires `ENABLE_IMAGE_EDIT`. | @@ -287,12 +294,17 @@ Use this quick matrix instead of memorizing per-row caveats. | `query_knowledge_bases` | ❌ | ✅ | | `search_knowledge_files` | ✅ (auto-scoped) | ✅ (all accessible KBs) | | `query_knowledge_files` | ✅ (auto-scoped) | ✅ | +| `grep_knowledge_files` | ✅ (auto-scoped) | ✅ | | `view_file` | ✅ (when attached items include files/collections) | ❌ | | `view_knowledge_file` | ✅ (when attached items include files/collections) | ✅ | | `view_note` | ✅ (when attached items include notes) | ❌ | Quick rule: `list_knowledge` and `list_knowledge_bases` are mutually exclusive. +:::info `kb_exec` replaces the matrix when enabled +When [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set, Open WebUI injects `kb_exec` instead of the file-oriented tools listed above. Still injected alongside it: `query_knowledge_files` (always), `view_note` (when notes are attached), and `query_knowledge_bases` + `search_knowledge_bases` (when no KB is attached). The model interacts with files through familiar shell commands. See the [Knowledge feature page](/features/workspace/knowledge#filesystem-style-access-kb_exec) for details. +::: + #### Tool Reference | Tool | Parameters | Output | @@ -307,8 +319,10 @@ Quick rule: `list_knowledge` and `list_knowledge_bases` are mutually exclusive. | `search_knowledge_bases` | `query` (required), `count` (default: 5), `skip` (default: 0) | Array of `{id, name, description, file_count}` | | `query_knowledge_files` | `query` (required), `knowledge_ids` (optional), `count` (default: 5) | Array of chunks like `{content, source, file_id, distance?}`; note hits include `{note_id, type: "note"}` | | `search_knowledge_files` | `query` (required), `knowledge_id` (optional), `count` (default: 5), `skip` (default: 0) | Array of `{id, filename, knowledge_id, knowledge_name}` | -| `view_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000) | `{id, filename, content, updated_at, created_at}` — includes `truncated`, `total_chars`, `next_offset` when paginated | +| `grep_knowledge_files` | `pattern` (required; regex auto-detected), `file_id` (optional — single-file mode), `case_insensitive` (default: false), `count_only` (default: false) | Matching lines with file IDs, filenames, and 1-indexed line numbers (capped at 50 matches) | +| `view_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000), `line_numbers` (default: false), `start_line` / `end_line` (optional — line-based addressing overrides `offset`/`max_chars`) | `{id, filename, content, updated_at, created_at}` — includes `truncated`, `total_chars`, `next_offset` when paginated, or `total_lines`, `showing_lines`, `next_start_line` in line mode | | `view_knowledge_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000) | `{id, filename, content, knowledge_id, knowledge_name}` — includes pagination metadata when truncated | +| `kb_exec` | `command` (required) — filesystem-style command: `ls` (root) / `ls /` / `ls -a` (flat with paths), `tree` / `tree /`, `cat -n `, `head -N `, `tail -N `, `sed -n ',p' `, `grep [-i\|-l\|-c] "" [/\|\|*.ext]`, `find [/] ""`, `wc `, `stat `; supports pipes (`grep "auth" \| head -5`); files referenced by path (`docs/api/auth.md`), filename, or file ID | Plain text command output (matches/listing/tree/file content as appropriate) | | **Image Gen** | | | | `generate_image` | `prompt` (required) | `{status, message, images}` — auto-displayed | | `edit_image` | `prompt` (required), `image_urls` (required) | `{status, message, images}` — auto-displayed | @@ -443,7 +457,7 @@ When the **Builtin Tools** capability is enabled, you can further control which | **Memory** | `search_memories`, `add_memory`, `replace_memory_content`, `delete_memory`, `list_memories` | Search and manage user memories | | **Chat History** | `search_chats`, `view_chat` | Search and view user chat history | | **Notes** | `search_notes`, `view_note`, `write_note`, `replace_note_content` | Search, view, and manage user notes | -| **Knowledge Base** | `list_knowledge`, `list_knowledge_bases`, `search_knowledge_bases`, `query_knowledge_bases`, `search_knowledge_files`, `query_knowledge_files`, `view_file`, `view_knowledge_file` | Browse and query knowledge bases | +| **Knowledge Base** | `list_knowledge`, `list_knowledge_bases`, `search_knowledge_bases`, `query_knowledge_bases`, `search_knowledge_files`, `query_knowledge_files`, `grep_knowledge_files`, `view_file`, `view_knowledge_file` (or `kb_exec` + `query_knowledge_files` + `view_note`/`query_knowledge_bases`/`search_knowledge_bases` as applicable when [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set) | Browse and query knowledge bases | | **Web Search** | `search_web`, `fetch_url` | Search the web and fetch URL content | | **Image Generation** | `generate_image`, `edit_image` | Generate and edit images | | **Code Interpreter** | `execute_code` | Execute code in a sandboxed environment | diff --git a/docs/features/workspace/knowledge.md b/docs/features/workspace/knowledge.md index 1ba316d90..a2365cf68 100644 --- a/docs/features/workspace/knowledge.md +++ b/docs/features/workspace/knowledge.md @@ -42,6 +42,8 @@ Attach specific knowledge bases to a model so it only searches what's relevant. | 📑 **5 extraction engines** | Tika, Docling, Azure, Mistral OCR, custom loaders | | 🤖 **Agentic retrieval** | Models browse, search, and read your documents autonomously | | 📄 **Full context mode** | Inject entire documents with no chunking | +| 🗂️ **Nested directories** | Organize files into subdirectories with drag-and-drop reordering | +| 🔄 **Incremental directory sync** | Mirror a local folder into the KB — only new and modified files upload, deletions are removed, mirroring folder structure | | 📦 **Export and API** | Back up knowledge bases as zip files, manage via REST API | --- @@ -76,12 +78,93 @@ With [native function calling](/features/extensibility/plugin/tools#tool-calling | `query_knowledge_bases` | ❌ | ✅ | Search KB names/descriptions by semantic similarity | | `search_knowledge_files` | ✅ (scoped) | ✅ (all) | Search files by filename | | `query_knowledge_files` | ✅ (scoped) | ✅ | Search file contents using the RAG pipeline | -| `view_file` | ✅ | ❌ | Read file content with pagination (default 10K chars, cap 100K) | +| `grep_knowledge_files` | ✅ (scoped) | ✅ | Exact text / regex search across knowledge files (returns matching lines with line numbers; auto-detects regex like `error|warn`) | +| `view_file` | ✅ | ❌ | Read file content with pagination (`offset`/`max_chars`) or by line range (`start_line`/`end_line`, optional `line_numbers`) | | `view_knowledge_file` | ✅ | ✅ | Read file content from any accessible KB | | `view_note` | ✅ | ❌ | Read attached notes | The key split: `list_knowledge` and `list_knowledge_bases` are mutually exclusive. Attaching a KB scopes the model to only those documents. Leaving it unscoped lets the model discover everything the user has access to. +#### When to prefer `grep_knowledge_files` over `query_knowledge_files` + +The two search tools complement each other: + +| | `query_knowledge_files` | `grep_knowledge_files` | +|---|---|---| +| **How it matches** | Semantic / vector retrieval (with optional BM25 + rerank when [`ENABLE_RAG_HYBRID_SEARCH`](/reference/env-configuration#enable_rag_hybrid_search) is on) | Exact string match — regex auto-detected (e.g. `error\|warn`, `version \d+`) | +| **Returns** | Relevant chunks of content | Matching lines with file ID, filename, and 1-indexed line number | +| **Use when** | "What does the documentation say about X?" — paraphrased questions, conceptual lookups | "Find every place we mention `OPENAI_API_KEY`" — literal identifiers, error strings, version numbers | +| **Result cap** | Top K (default 5) | 50 matches | +| **Flags** | — | `case_insensitive`, `count_only`, `file_id` (single-file mode) | + +In agentic flows, a typical pattern is: `query_knowledge_files` to locate the relevant document, then `grep_knowledge_files` to pinpoint exact lines, then `view_file` (line-range mode below) to read the surrounding context. + +#### Reading with `view_file` + +`view_file` supports two addressing modes: + +- **Character pagination** — `offset` + `max_chars` (default `10000`, hard cap `100000`). Best for streaming through a long document; the response includes `next_offset` when the file is truncated. +- **Line range** — `start_line` + optional `end_line` (1-indexed, inclusive). Overrides `offset`/`max_chars` when set; pairs naturally with `grep_knowledge_files`' line numbers. Pass `line_numbers: true` to also get a `: ` prefix on each returned line. + +The line-range response includes `total_lines`, `showing_lines`, and `next_start_line` for follow-up reads. + +### Filesystem-style access (`kb_exec`) + +When [`ENABLE_KB_EXEC=True`](/reference/env-configuration#enable_kb_exec) is set, Open WebUI exposes a `kb_exec` tool that gives the model a filesystem-style interface over knowledge bases. + +**Tools that go away**, because their function is now covered by `kb_exec` commands: + +- `list_knowledge` — replaced by `ls` +- `search_knowledge_files` — replaced by `find ""` +- `grep_knowledge_files` — replaced by `grep ""` +- `view_file` and `view_knowledge_file` — replaced by `cat`, `head`, `tail`, `sed -n ',p'` + +**Tools that stay injected alongside `kb_exec`**, because they do something `kb_exec` can't: + +- **`query_knowledge_files`** — semantic / RAG search (always) +- **`view_note`** — when notes are attached to the model (`kb_exec` is file-only, so notes need a dedicated reader) +- **`query_knowledge_bases`** and **`search_knowledge_bases`** — when no KB is attached to the model, so the model can still discover and search across knowledge bases by name/description + +This is experimental and **off by default**. It targets frontier models that already "think in shell" — they tend to chain `ls`, `grep`, and `cat` more reliably than they orchestrate a fan-out of specialized tools. + +**Supported commands** + +| Command | Purpose | +|---------|---------| +| `ls`, `ls /`, `ls -a` | List the current level / a subdirectory / a flat view of every file with full paths | +| `tree`, `tree /` | Recursive directory tree | +| `cat -n ` | Read a file (optionally with line numbers) | +| `head -N ` / `tail -N ` | First or last N lines | +| `sed -n ',p' ` | Print lines `` through `` | +| `grep "" [/\|\|*.ext]` | Exact / regex search; flags `-i` (case-insensitive), `-l` (filenames only), `-c` (counts) | +| `find [/] ""` | Find files by glob | +| `wc ` | Line / word / char counts | +| `stat ` | File metadata | + +**Pipes** + +`kb_exec` parses a single pipeline, so commands compose: + +```text +grep "auth" | head -5 +grep -l "TODO" docs/ +find docs/ "*.md" | head -10 +``` + +**File references** + +Files can be addressed three ways — pick whichever is unambiguous: + +- **Path** — `docs/api/auth.md` (relative to the knowledge base root; resolves through the directory tree) +- **Filename** — `auth.md` (errors with an "ambiguous filename" hint when the same name exists in multiple directories or KBs) +- **File ID** — the UUID returned by `ls`, `find`, or `grep` + +**Behavior notes** + +- `kb_exec` respects the same access control as the other knowledge tools — files the user can't read are silently excluded from results. +- The model still has `query_knowledge_files` for semantic search; reach for it when literal commands won't find a paraphrased concept. +- Built on top of the directory model — `kb_exec` is the only tool that fully reflects the directory structure created in the UI. + Autonomous exploration works best with frontier models that can intelligently chain search, browse, and synthesize. Smaller models may struggle with multi-step retrieval. Administrators can disable the **Knowledge Base** tool category per-model in **Workspace > Models > Edit > Builtin Tools**. For the full list of built-in agentic tools, see the [Native/Agentic Mode Tools Guide](/features/extensibility/plugin/tools#built-in-system-tools-nativeagentic-mode). @@ -104,6 +187,54 @@ When native function calling is enabled, attached knowledge is **not automatical 3. Upload files or add existing documents. 4. Attach the knowledge base to a model in **Workspace > Models > Edit**, or reference it in chat with `#`. +### Organizing into directories + +Knowledge bases support nested **directories** so larger document sets stay navigable. Create them from the **Add Content** menu (**+ New Directory**), then reorganize freely. + +**Creating and navigating** + +- **+ New Directory** lives next to file upload in the **Add Content** menu. Name uniqueness is enforced per parent — two siblings can't share a name, but you can reuse names in different parents. +- Click a directory to descend into it; the **breadcrumb trail** at the top of the view always reflects the current path and lets you jump back to any ancestor in one click. +- Directories can be **renamed** or **moved to a different parent** without affecting the files inside them. + +**Drag-and-drop** + +You can move items by dragging: + +- **Files** onto a directory row, into the empty area of an open directory, or onto any breadcrumb crumb (including the root crumb to send a file back to the top level). +- **Directories** onto another directory to nest them, or onto a breadcrumb crumb to move them up the tree. Moving a directory into itself or one of its descendants is blocked server-side. + +**Deletion semantics** + +Deleting a non-empty directory prompts for the action to take with its contents: + +- **Move files to parent** (default) — the directory is removed but its files and subdirectories are re-parented one level up. +- **Delete everything** — the directory and all files/subdirectories underneath it are permanently removed. + +**Effect on retrieval and tools** + +- **Retrieval and standard RAG** still span the entire knowledge base. Directories don't shard the vector index; chunks from any subdirectory remain reachable in a single search. +- **Agentic tools** are directory-aware: + - `kb_exec` (when enabled) treats subdirectories like a filesystem: `ls docs/`, `tree`, `grep "x" docs/`, and path-style refs (`docs/api/auth.md`) all work — see [Filesystem-Style Access (`kb_exec`)](#filesystem-style-access-kb_exec) below. + - The other knowledge tools (`query_knowledge_files`, `grep_knowledge_files`, `search_knowledge_files`) ignore directory boundaries and return matches from the whole KB. + +### Renaming files + +Individual files can be renamed in place from the workspace via the file's item menu — no need to re-upload. The new name is reflected everywhere the file is referenced (knowledge listings, agentic tool output, citations). + +### Syncing a local directory + +The **Add Content → Sync Directory** action mirrors a local folder into the knowledge base **incrementally**: the client hashes each local file (SHA-256), the server compares hashes and paths against what is already stored, and only **new**, **modified**, and **deleted** files are touched. Unmodified files (the typical majority) are left alone — no re-upload, no re-embedding. The local folder's subdirectory structure is mirrored in the KB; missing subdirectories are created, and subdirectories that no longer exist locally are removed. + +Behavior to be aware of: + +- Hidden files and folders (anything beginning with `.`) are skipped. +- Files modified locally upload with a new content hash; the old file entry is removed from the KB and replaced. +- Files removed locally are deleted from the KB during the cleanup step. +- The action is **non-destructive** for unchanged files. Earlier versions of the same menu action used to wipe and re-upload everything — that is no longer the case as of v0.9.6. + +For programmatic use, the same workflow is exposed as two endpoints under [API access](#api-access) below. + ### Exporting Admins can export an entire knowledge base as a zip file via the item menu (three dots) > **Export**. Files are converted to `.txt` for universal compatibility. Regular users will not see the Export option. @@ -112,9 +243,25 @@ Admins can export an entire knowledge base as a zip file via the item menu (thre Knowledge bases can be managed programmatically: -- `POST /api/v1/files/` - Upload files -- `GET /api/v1/files/{id}/process/status` - Check processing status -- `POST /api/v1/knowledge/{id}/file/add` - Add files to a knowledge base +**Files** + +- `POST /api/v1/files/` — Upload files. Pass `knowledge_id` (and optionally `directory_id`) in the upload metadata to have the backend **auto-link and process the file into that knowledge base server-side** — equivalent to a follow-up `POST /api/v1/knowledge/{id}/file/add`, but it does not depend on the client staying connected after upload. This is the recommended single-call path (added in v0.9.6, fixing files left unlinked when the uploader disconnected mid-processing). The server SHA-256-hashes the uploaded bytes into `file.meta.file_hash`; clients can pre-compute and send `file_hash` in metadata to skip server-side hashing (used by the incremental sync flow below). +- `GET /api/v1/files/{id}/process/status` — Check processing status +- `POST /api/v1/files/{id}/rename` — Rename a file +- `POST /api/v1/knowledge/{id}/file/add` — Add files to a knowledge base +- `POST /api/v1/knowledge/{id}/file/move` — Move a file between directories within the same KB (body: `file_id`, `directory_id` — `null` moves to the KB root) + +**Directories** + +- `POST /api/v1/knowledge/{id}/dirs/create` — Create a directory (body: `name`, optional `parent_id`) +- `POST /api/v1/knowledge/{id}/dirs/{dir_id}/update` — Rename or re-parent a directory (body: `name` and/or `parent_id`) +- `DELETE /api/v1/knowledge/{id}/dirs/{dir_id}/delete?move_files=true` — Delete a directory. With `move_files=true` (default), contained files are re-parented; with `move_files=false`, they're deleted along with the directory. + +**Incremental directory sync** (added in v0.9.6) + +- `POST /api/v1/knowledge/{id}/sync/diff` — Submit a local manifest (`manifest: [{path, filename, checksum}]` where `checksum` is the SHA-256 of the file bytes) and receive `{added, modified, deleted, mkdir, rmdir, unmodified_count}` describing exactly what to upload, replace, delete, and which directories to create/remove. Read-only — does not mutate the KB. +- After acting on the diff (create `mkdir` paths, upload `added` + `modified` files with their hashes via `POST /api/v1/files/`), call: +- `POST /api/v1/knowledge/{id}/sync/cleanup` — Body: `{file_ids: [...], dir_ids: [...]}`. Removes the stale files (from the KB, vector store, and per-file collections) and the now-empty directories returned by `sync/diff`. Run this last so deletions don't outrun uploads. File processing happens asynchronously. You must poll the status endpoint until processing completes before adding files to a knowledge base, or you'll get an "empty content" error. See [API Endpoints](/reference/api-endpoints#-retrieval-augmented-generation-rag) for workflow examples. @@ -144,7 +291,7 @@ Add dozens of papers to a knowledge base. The AI searches across all of them to ### Processing delay for API uploads -Files uploaded via API are processed asynchronously. Attempting to use a file before processing completes will fail silently or return empty results. +Files uploaded via API are processed asynchronously. Attempting to use a file before processing completes will fail silently or return empty results. Note that uploading with a `knowledge_id` (above) makes linking server-side and robust to client disconnects, but it does **not** make the content instantly queryable — extraction/embedding still runs in the background, so poll `GET /api/v1/files/{id}/process/status` before relying on retrieval. ### Native function calling changes behavior diff --git a/docs/getting-started/advanced-topics/development.md b/docs/getting-started/advanced-topics/development.md index 0dfeba762..eba4bd84e 100644 --- a/docs/getting-started/advanced-topics/development.md +++ b/docs/getting-started/advanced-topics/development.md @@ -19,10 +19,17 @@ You can test the latest changes by running the [dev Docker image](/getting-start | Requirement | Version | |-------------|---------| -| **Python** | 3.11+ | +| **Python** | 3.11 or 3.12 (see note below; 3.13 not supported yet) | | **Node.js** | 22.10+ | | **Git** | Any recent version | +:::info Python version compatibility +Open WebUI supports **Python 3.11 and 3.12**. **3.13 is not supported yet** — a few of our dependencies still need to ship 3.13-compatible releases, and until they do, installs on 3.13 will fail or break at runtime. + +- **For production**, use the [Docker image](/getting-started/quick-start) or the **latest Python 3.11**. This is the combination we test against most heavily. +- **3.12 also works**, but we have seen very rare reports of odd behaviour on 3.12 that we have not reproduced on 3.11. If you are running into something inexplicable on 3.12, dropping to the latest 3.11 is the first thing to try. +::: + :::warning Separate your data Never share your database or data directory between dev and production. Dev builds may include database migrations that are not backward-compatible. ::: diff --git a/docs/getting-started/advanced-topics/hardening.md b/docs/getting-started/advanced-topics/hardening.md index 9dc6e3751..7b41f30f9 100644 --- a/docs/getting-started/advanced-topics/hardening.md +++ b/docs/getting-started/advanced-topics/hardening.md @@ -551,6 +551,10 @@ Outbound HTTP requests also do not follow `3xx` redirects by default. Without th AIOHTTP_CLIENT_ALLOW_REDIRECTS=false ``` +:::note Playwright loader (v0.9.6+) +Earlier versions applied URL validation and the redirect gate only to the default web loader; the Playwright-based loader (`WEB_LOADER_ENGINE=playwright` / the `playwright` Docker variant) could navigate and follow redirects to internal or blocklisted URLs unchecked. As of v0.9.6 the Playwright path enforces the same `validate_url()` and redirect rules as the default loader, so the SSRF controls above apply regardless of which web loader engine you run. If you use Playwright, ensure you are on v0.9.6 or later. +::: + ### Profile image URL forwarding The user and model profile-image endpoints can issue a `302 Found` redirect to whatever origin is stored in `profile_image_url` so that externally-hosted avatars (e.g. Gravatar via an upstream identity provider) display in the UI. That redirect causes the user's browser to make a request directly to the external origin, leaking client IP, User-Agent, and Referer headers — and an account whose `profile_image_url` was set to an attacker-controlled host can use that to deanonymize anyone who renders their avatar. diff --git a/docs/getting-started/advanced-topics/scaling.md b/docs/getting-started/advanced-topics/scaling.md index 829918525..3d778ded5 100644 --- a/docs/getting-started/advanced-topics/scaling.md +++ b/docs/getting-started/advanced-topics/scaling.md @@ -109,6 +109,7 @@ ENABLE_WEBSOCKET_SUPPORT=true - If you're using Redis Sentinel for high availability, also set `REDIS_SENTINEL_HOSTS` and consider setting `REDIS_SOCKET_CONNECT_TIMEOUT=5` to prevent hangs during failover. - For AWS Elasticache or other managed Redis Cluster services, set `REDIS_CLUSTER=true`. - Make sure your Redis server has `timeout 1800` and a high enough `maxclients` (10000+) to prevent connection exhaustion over time. +- For high-concurrency websocket streaming, review Redis Pub/Sub output buffer limits. Large Socket.IO events can disconnect Pub/Sub clients if Redis uses small default buffers; see [WebSocket Pub/Sub Buffer Limits](/tutorials/integrations/redis#websocket-pubsub-buffer-limits). - A **single Redis instance** is sufficient for the vast majority of deployments, even with thousands of users. You almost certainly do not need Redis Cluster unless you have specific HA/bandwidth requirements. If you think you need Redis Cluster, first check whether your connection count and memory usage are caused by fixable configuration issues (see [Common Anti-Patterns](/troubleshooting/performance#%EF%B8%8F-common-anti-patterns)). - Without Redis in a multi-instance setup, you will experience [WebSocket 403 errors](/troubleshooting/multi-replica#2-websocket-403-errors--connection-failures), [configuration sync issues](/troubleshooting/multi-replica#3-model-not-found-or-configuration-mismatch), and intermittent authentication failures. @@ -385,8 +386,19 @@ UVICORN_WORKERS=1 # Migrations (set to false on all but one instance) ENABLE_DB_MIGRATIONS=false + +# Concurrency & DB write throttling (REQUIRED at scale — see note below) +THREAD_POOL_SIZE=2000 +DATABASE_USER_ACTIVE_STATUS_UPDATE_INTERVAL=300 ``` +:::warning Two settings people forget — and then their scaled deployment stalls +- **`THREAD_POOL_SIZE=2000`** — Open WebUI offloads blocking work (DB calls, file I/O, sync handlers) to a thread pool whose default concurrency ceiling is only **40**. At scale, once 40 blocking operations are in flight every further request **queues**, and the whole app appears to freeze even though CPU/RAM look fine. `2000` is a *lower* bound for large instances; it is a concurrency ceiling, **not** a CPU/thread count, so a high value is not a contention risk. Never lower it. (The only exception is genuinely tiny hardware, which is not a "scaled deployment".) +- **`DATABASE_USER_ACTIVE_STATUS_UPDATE_INTERVAL=300`** — presence tracking writes each user's `last_active_at` to the database. **Unset (the default) means this write is unthrottled — roughly one `UPDATE` + `COMMIT` per authenticated request.** At scale that is a continuous flood of tiny write transactions that saturates the connection pool for no functional gain. Set it to `300`–`500` seconds; it is mandatory for large/production deployments and free performance everywhere else. + +Both are read once at startup and are not configurable from the Admin UI. See [Performance → Database Optimization](/troubleshooting/performance#-database-optimization) and [Performance → High-Concurrency](/troubleshooting/performance#-high-concurrency--network-optimization). +::: + ### Security defaults to revisit at scale A few defaults that are reasonable for single-user evaluation become less so once you put the deployment behind SSO and serve real users. The full discussion lives in the [Hardening guide](/getting-started/advanced-topics/hardening); the items most often missed in enterprise rollouts: diff --git a/docs/getting-started/essentials.mdx b/docs/getting-started/essentials.mdx index 677ae05b8..f7bcf4b0e 100644 --- a/docs/getting-started/essentials.mdx +++ b/docs/getting-started/essentials.mdx @@ -219,14 +219,14 @@ If you just want RAG to work well out of the box, these settings are a solid gen Set these in **Admin Panel > Settings > Documents**: -| Setting | Recommended value | Default | Why | -|---------|-------------------|---------|-----| -| **Text Splitter** | `token` | `character` | Token-based splitting produces more consistent chunk sizes across document types | -| **Markdown Header Splitting** | **On** | On | Respects document structure by splitting at headings, keeping sections coherent | -| **Chunk Size** | `2000` | `1000` | Larger chunks preserve more surrounding context per retrieval hit | -| **Chunk Overlap** | `200` | `100` | More overlap means less chance of cutting a key sentence in half | -| **Top K** | `15` | `3` | Retrieves more candidate chunks, giving the model a wider pool of relevant context. If you are working with local models that have constrained context sizes, lower this to `5` to avoid filling the context window with retrieved chunks | -| **Embedding Model** | External (OpenAI or Ollama) | `all-MiniLM-L6-v2` (local CPU) | The default works for a single user but consumes ~500 MB RAM per worker. For any multi-user setup, use an external embedding API instead | +| Setting | Default | Recommended value | Why | +|---------|---------|-------------------|-----| +| **Text Splitter** | `character` | `token` | Token-based splitting produces more consistent chunk sizes across document types | +| **Markdown Header Splitting** | On | **On** | Respects document structure by splitting at headings, keeping sections coherent | +| **Chunk Size** | `1000` | `2000` | Larger chunks preserve more surrounding context per retrieval hit | +| **Chunk Overlap** | `100` | `200` | More overlap means less chance of cutting a key sentence in half | +| **Top K** | `3` | `15` | Retrieves more candidate chunks, giving the model a wider pool of relevant context. If you are working with local models that have constrained context sizes, lower this to `5` to avoid filling the context window with retrieved chunks | +| **Embedding Model** | `all-MiniLM-L6-v2` (local CPU) | External (OpenAI or Ollama) | The default works for a single user but consumes ~500 MB RAM per worker. For any multi-user setup, use an external embedding API instead | :::tip Embedding model The default SentenceTransformers model runs locally on CPU and is fine for a single user getting started. For anything beyond that, point at an external embeddings API: set `RAG_EMBEDDING_ENGINE=openai` with an OpenAI API key, or `RAG_EMBEDDING_ENGINE=ollama` with any Ollama embedding model (e.g., `nomic-embed-text`). This offloads the work and frees significant RAM. diff --git a/docs/getting-started/quick-start/index.mdx b/docs/getting-started/quick-start/index.mdx index d4f5d32d0..81f4064b6 100644 --- a/docs/getting-started/quick-start/index.mdx +++ b/docs/getting-started/quick-start/index.mdx @@ -22,6 +22,7 @@ import Pip from './tab-python/Pip.md'; import Uv from './tab-python/Uv.md'; import Conda from './tab-python/Conda.md'; import PythonUpdating from './tab-python/PythonUpdating.md'; +import PythonCompat from './tab-python/_PythonCompat.md'; # Quick Start @@ -87,6 +88,7 @@ Open WebUI works on **macOS, Linux** (x86_64 and ARM64, including Raspberry Pi a +
    diff --git a/docs/getting-started/quick-start/tab-docker/DockerCompose.md b/docs/getting-started/quick-start/tab-docker/DockerCompose.md index 8b88d3ac4..b7bd492f3 100644 --- a/docs/getting-started/quick-start/tab-docker/DockerCompose.md +++ b/docs/getting-started/quick-start/tab-docker/DockerCompose.md @@ -56,9 +56,15 @@ To start your services, run the following command: docker compose up -d ``` -## Helper Script +## Helper Scripts -A useful helper script called `run-compose.sh` is included with the codebase. This script assists in choosing which Docker Compose files to include in your deployment, streamlining the setup process. +A set of helper scripts is included with the codebase to streamline common Docker workflows: + +- `docker-compose-launcher.sh` — Interactive Compose launcher with GPU auto-detection, configurable WebUI/API ports, host data mounts, and optional Playwright support. Run `./docker-compose-launcher.sh --help` for the full list of flags. Use `--drop` to tear down the project. +- `docker-cleanup.sh` — Stops the Compose project and **deletes all volumes**, including persistent data. Prompts for confirmation before destroying data. +- `docker-run.sh` — Builds the Open WebUI image and runs a single container, exposing it on `OPEN_WEBUI_PORT` (default `3000`). +- `docker-ollama.sh` — Pulls and runs the official Ollama container with optional GPU passthrough, exposing it on `OLLAMA_PORT` (default `11434`). +- `docker-update-models.sh` — Iterates through every model installed in the Ollama container and pulls the latest version. --- diff --git a/docs/getting-started/quick-start/tab-docker/ManualDocker.md b/docs/getting-started/quick-start/tab-docker/ManualDocker.md index b944625d4..8825dedf2 100644 --- a/docs/getting-started/quick-start/tab-docker/ManualDocker.md +++ b/docs/getting-started/quick-start/tab-docker/ManualDocker.md @@ -49,9 +49,9 @@ Visit [http://localhost:3000](http://localhost:3000). For production environments, pin a specific version instead of using floating tags: ```bash -docker pull ghcr.io/open-webui/open-webui:v0.9.5 -docker pull ghcr.io/open-webui/open-webui:v0.9.5-cuda -docker pull ghcr.io/open-webui/open-webui:v0.9.5-ollama +docker pull ghcr.io/open-webui/open-webui:v0.9.6 +docker pull ghcr.io/open-webui/open-webui:v0.9.6-cuda +docker pull ghcr.io/open-webui/open-webui:v0.9.6-ollama ``` --- diff --git a/docs/getting-started/quick-start/tab-python/_PythonCompat.md b/docs/getting-started/quick-start/tab-python/_PythonCompat.md new file mode 100644 index 000000000..80f68c9a1 --- /dev/null +++ b/docs/getting-started/quick-start/tab-python/_PythonCompat.md @@ -0,0 +1,6 @@ +:::info Python version compatibility +Open WebUI supports **Python 3.11 and 3.12**. **Python 3.13 is not supported yet** — a handful of our dependencies still need to ship 3.13-compatible releases, and until they do, installs on 3.13 will fail or break at runtime. + +- **For production**, run the [Docker image](#docker) or use the **latest Python 3.11**. This is the combination we test against most heavily. +- **Python 3.12 also works**, but we have seen very rare reports of odd behaviour on 3.12 that we have not reproduced on 3.11. If something inexplicable happens on 3.12, drop to the latest 3.11 first. +::: diff --git a/docs/getting-started/updating.mdx b/docs/getting-started/updating.mdx index 68a118ccd..7b9000e04 100644 --- a/docs/getting-started/updating.mdx +++ b/docs/getting-started/updating.mdx @@ -31,9 +31,9 @@ The `:main` tag always points to the **latest build**. It's convenient but can i For stability, pin a specific release tag: ``` -ghcr.io/open-webui/open-webui:v0.9.5 -ghcr.io/open-webui/open-webui:v0.9.5-cuda -ghcr.io/open-webui/open-webui:v0.9.5-ollama +ghcr.io/open-webui/open-webui:v0.9.6 +ghcr.io/open-webui/open-webui:v0.9.6-cuda +ghcr.io/open-webui/open-webui:v0.9.6-ollama ``` Browse all available tags on the [GitHub releases page](https://github.com/open-webui/open-webui/releases). diff --git a/docs/reference/api-endpoints.md b/docs/reference/api-endpoints.md index 450d51b68..fc3426565 100644 --- a/docs/reference/api-endpoints.md +++ b/docs/reference/api-endpoints.md @@ -278,7 +278,7 @@ Even in the non-streaming case, **`outlet()` does not rewrite the HTTP response ``` :::tip -If you need `outlet()` output over HTTP today, call `/api/chat/completions` followed by `/api/chat/completed`. Inline execution on `dev` is primarily for WebUI-shaped clients that read from the WebSocket. For more details on filter behavior, see the [Filter Function documentation](/features/extensibility/plugin/functions/filter#-filter-behavior-with-api-requests). +If you need `outlet()` output over HTTP today, call `/api/chat/completions` followed by `/api/chat/completed`. Inline execution on `dev` is primarily for WebUI-shaped clients that read from the WebSocket. For more details on filter behavior, see the [Filter Function documentation](/features/extensibility/plugin/functions/filter#filter-behavior-with-api-requests). ::: ### 🦙 Ollama API Proxy Support diff --git a/docs/reference/database-schema.md b/docs/reference/database-schema.md index 8b5ab256e..464ba831a 100644 --- a/docs/reference/database-schema.md +++ b/docs/reference/database-schema.md @@ -10,7 +10,7 @@ This tutorial is a community contribution and is not supported by the Open WebUI ::: > [!WARNING] -> This documentation reflects schema changes up to Open WebUI v0.9.5. +> This documentation reflects schema changes up to Open WebUI v0.9.6. ## Open-WebUI Internal SQLite Database diff --git a/docs/reference/env-configuration.mdx b/docs/reference/env-configuration.mdx index ed6ad2fc9..0403f05c0 100644 --- a/docs/reference/env-configuration.mdx +++ b/docs/reference/env-configuration.mdx @@ -12,23 +12,23 @@ As new variables are introduced, this page will be updated to reflect the growin :::info -This page is up-to-date with Open WebUI release version [v0.9.5](https://github.com/open-webui/open-webui/releases/tag/v0.9.5), but is still a work in progress to later include more accurate descriptions, listing out options available for environment variables, defaults, and improving descriptions. +This page is up-to-date with Open WebUI release version [v0.9.6](https://github.com/open-webui/open-webui/releases/tag/v0.9.6), but is still a work in progress to later include more accurate descriptions, listing out options available for environment variables, defaults, and improving descriptions. ::: -### Important Note on `PersistentConfig` Environment Variables +### Important Note on `ConfigVar` Environment Variables :::note -When launching Open WebUI for the first time, all environment variables are treated equally and can be used to configure the application. However, for environment variables marked as `PersistentConfig`, their values are persisted and stored internally. +When launching Open WebUI for the first time, all environment variables are treated equally and can be used to configure the application. However, for environment variables marked as `ConfigVar`, their values are persisted and stored internally. -After the initial launch, if you restart the container, `PersistentConfig` environment variables will no longer use the external environment variable values. Instead, they will use the internally stored values. +After the initial launch, if you restart the container, `ConfigVar` environment variables will no longer use the external environment variable values. Instead, they will use the internally stored values. In contrast, regular environment variables will continue to be updated and applied on each subsequent restart. -You can update the values of `PersistentConfig` environment variables directly from within Open WebUI, and these changes will be stored internally. This allows you to manage these configuration settings independently of the external environment variables. +You can update the values of `ConfigVar` environment variables directly from within Open WebUI, and these changes will be stored internally. This allows you to manage these configuration settings independently of the external environment variables. -Please note that `PersistentConfig` environment variables are clearly marked as such in the documentation below, so you can be aware of how they will behave. +Please note that `ConfigVar` environment variables are clearly marked as such in the documentation below, so you can be aware of how they will behave. To disable this behavior and force Open WebUI to always use your environment variables (ignoring the database), set `ENABLE_PERSISTENT_CONFIG` to `False`. @@ -44,7 +44,7 @@ If you change an environment variable (like `ENABLE_SIGNUP=True`) but don't see Set `ENABLE_PERSISTENT_CONFIG=False` in your environment. This forces Open WebUI to read your variables directly. Note that UI-based settings changes will not persist across restarts in this mode. #### Option 2: Update via Admin UI (Recommended) -The simplest and safest way to change `PersistentConfig` settings is directly through the **Admin Panel** within Open WebUI. Even if an environment variable is set, changes made in the UI will take precedence and be saved to the database. +The simplest and safest way to change `ConfigVar` settings is directly through the **Admin Panel** within Open WebUI. Even if an environment variable is set, changes made in the UI will take precedence and be saved to the database. #### Option 3: Manual Database Update (Last Resort / Lock-out Recovery) If you are locked out or cannot access the UI, you can manually update the SQLite database via Docker: @@ -78,7 +78,7 @@ environment variables, see our [logging documentation](https://docs.openwebui.co - Type: `str` - Default: `http://localhost:3000` - Description: Specifies the URL where your Open WebUI installation is reachable. Needed for search engine support and OAuth/SSO. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. :::warning @@ -97,7 +97,7 @@ Failure to set WEBUI_URL before using OAuth/SSO will result in failure to log in - Type: `bool` - Default: `True` - Description: Toggles user account creation. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_SIGNUP_PASSWORD_CONFIRMATION` @@ -148,14 +148,14 @@ After the admin account is created, sign-up is automatically disabled for securi - Type: `bool` - Default: `True` - Description: Toggles email, password, sign-in and "or" (only when `ENABLE_OAUTH_SIGNUP` is set to True) elements. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_PASSWORD_CHANGE_FORM` - Type: `bool` - Default: `True` - Description: Controls visibility of the password change UI in **Settings > Account**. When set to `False`, users do not see the password update form, which is useful for SSO-focused deployments where password changes should not be presented in the UI. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_PASSWORD_AUTH` @@ -181,14 +181,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Type: `str` - Default: `en` - Description: Sets the default locale for the application. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_MODELS` - Type: `str` - Default: Empty string (' '), since `None`. - Description: Sets a default Language Model. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_PINNED_MODELS` @@ -196,14 +196,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Default: Empty string (' ') - Description: Comma-separated list of model IDs to pin by default for new users who haven't customized their pinned models. This provides a pre-selected set of frequently used models in the model selector for new accounts. - Example: `gpt-4,claude-3-opus,llama-3-70b` -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_MODEL_METADATA` - Type: `dict` (JSON object) - Default: `{}` - Description: Sets global default metadata (capabilities and other model info) for all models. These defaults act as a baseline — per-model overrides always take precedence. For capabilities, the defaults and per-model values are merged (per-model wins on conflicts). For other metadata fields, the default is only applied if the model has no value set. Configurable via **Admin Settings → Models**. -- Persistence: This environment variable is a `PersistentConfig` variable. Stored at config key `models.default_metadata`. +- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `models.default_metadata`. :::info @@ -220,7 +220,7 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Type: `dict` (JSON object) - Default: `{}` - Description: Sets global default parameters (temperature, top_p, max_tokens, seed, etc.) for all models. These defaults are applied as a baseline at chat completion time — per-model parameter overrides always take precedence. Configurable via **Admin Settings → Models**. -- Persistence: This environment variable is a `PersistentConfig` variable. Stored at config key `models.default_params`. +- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `models.default_params`. :::info @@ -240,14 +240,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - `admin` - New users are automatically activated with administrator permissions. - Default: `pending` - Description: Sets the default role assigned to new users. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_GROUP_ID` - Type: `str` - Default: Empty string (' ') - Description: Sets the default group ID to assign to new users upon registration. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_GROUP_SHARE_PERMISSION` @@ -261,63 +261,63 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Type: `str` - Default: Empty string (' ') - Description: Sets a custom title for the pending user overlay. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `PENDING_USER_OVERLAY_CONTENT` - Type: `str` - Default: Empty string (' ') - Description: Sets a custom text content for the pending user overlay. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_CALENDAR` - Type: `bool` - Default: `True` - Description: Enables or disables the Calendar feature. When enabled, users can create calendars, manage events, and share calendars with other users or groups via access grants. Active automations are automatically surfaced as virtual events on a dedicated "Scheduled Tasks" calendar. Requires the `features.calendar` user permission (admins always pass). -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_CHANNELS` - Type: `bool` - Default: `False` - Description: Enables or disables channel support. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_FOLDERS` - Type: `bool` - Default: `True` - Description: Enables or disables the folders feature, allowing users to organize their chats into folders in the sidebar. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `FOLDER_MAX_FILE_COUNT` - Type: `int` - Default: `("") empty string` - Description: Sets the maximum number of files processing allowed per folder. -- Persistence: This environment variable is a `PersistentConfig` variable. It can be configured in the **Admin Panel > Settings > General > Folder Max File Count**. Default is none (empty string) which is unlimited. +- Persistence: This environment variable is a `ConfigVar` variable. It can be configured in the **Admin Panel > Settings > General > Folder Max File Count**. Default is none (empty string) which is unlimited. #### `ENABLE_AUTOMATIONS` - Type: `bool` - Default: `True` - Description: Enables or disables the Automations feature globally. When disabled, the scheduler skips automation processing, the automation API endpoints return `403 Forbidden`, automation builtin tools are not injected, and the Automations entry is hidden from the sidebar. Requires the `features.automations` user permission (admins always pass). -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `AUTOMATION_MAX_COUNT` - Type: `int` - Default: `("") empty string` (unlimited) - Description: Sets the maximum number of automations a non-admin user can create. When set to a positive integer, users who reach this limit will receive a `403 Forbidden` error when attempting to create additional automations. Admins bypass this limit. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `AUTOMATION_MIN_INTERVAL` - Type: `int` (seconds) - Default: `("") empty string` (no minimum) - Description: Sets the minimum allowed interval in seconds between automation recurrences for non-admin users. When set, any automation schedule that recurs more frequently than this value will be rejected with a `400 Bad Request` error. One-time automations (`COUNT=1`) are exempt from this check. Admins bypass this limit. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. :::tip Common values for AUTOMATION_MIN_INTERVAL @@ -347,20 +347,20 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Type: `bool` - Default: `True` - Description: Enables or disables the notes feature, allowing users to create and manage personal notes within Open WebUI. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_MEMORIES` - Type: `bool` - Default: `True` - Description: Enables or disables the [memory feature](/features/chat-conversations/memory), allowing models to store and retrieve long-term information about users. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `WEBHOOK_URL` - Type: `str` - Description: Sets a webhook for integration with Discord/Slack/Microsoft Teams. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. :::note Admin posture toggles vs. security boundaries @@ -416,14 +416,14 @@ Treat anything in this cluster as *what the admin sees and does in the product U - Type: `bool` - Default: `False` - Description: Enables or disables user webhooks. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `RESPONSE_WATERMARK` - Type: `str` - Default: Empty string (' ') - Description: Sets a custom text that will be included when you copy a message in the chat. e.g., `"This text is AI generated"` -> will add "This text is AI generated" to every message, when copied. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `IFRAME_CSP` @@ -434,12 +434,15 @@ Treat anything in this cluster as *what the admin sees and does in the product U #### `THREAD_POOL_SIZE` - Type: `int` -- Default: `0` -- Description: Sets the thread pool size for FastAPI/AnyIO blocking calls. By default (when set to `0`) FastAPI/AnyIO use `40` threads. In case of large instances and many concurrent users, it may be needed to increase `THREAD_POOL_SIZE` to prevent blocking. +- Default: `0` (unset — the AnyIO default limit of `40` applies) +- Description: Sets the maximum number of **concurrent** blocking operations that may run in the AnyIO worker thread pool at once. Open WebUI offloads synchronous/blocking work (many DB calls, file I/O, sync route handlers, some library calls) to this pool via `run_in_threadpool`. The value is a **concurrency ceiling (a token limit), not a fixed pool of pre-spawned OS threads and not a CPU-core/thread count**: worker threads are created lazily only when needed and reused, so a high value does **not** by itself create that many threads, consume CPU, or cause CPU contention while idle. It only raises how many blocking operations can be in flight simultaneously before the rest must queue. -:::info +:::warning Set this high on any real server (2000+); never lower it +The AnyIO default of `40` is far too low for production. When more than `THREAD_POOL_SIZE` blocking operations are needed at once (many users acting at the same time, or a few users each triggering several blocking calls), every further request **waits** for a free slot. The symptom is the whole app appearing to **hang / freeze / stop responding** under load, even though CPU and memory look fine — it is pool starvation, not resource exhaustion. -If you are running larger instances, you WILL NEED to set this to a higher value like multiple hundreds if not thousands (e.g. `1000`) otherwise your app may get stuck the default pool size (which is 40 threads) is full and will not react anymore. +- **Normal servers / production:** `2000` or higher. `2000` is a *lower* bound for very large multi-user instances — going higher is fine and is **not** a CPU or contention risk (it is a ceiling, not a preallocation). +- **Never decrease below the default.** An idle high ceiling costs effectively nothing; a low ceiling causes freezes. +- **Exception — weak hardware (Raspberry Pi, tiny VPS, containers capped at ~250m CPU / very low RAM):** do **not** set `2000` here. Each *genuinely concurrent* blocking op still uses a real OS thread (stack memory), so on a tiny device an enormous ceiling lets a traffic burst spawn enough threads to exhaust RAM. Leave it at the default, or set a modest value (e.g. a few hundred) matched to what the device can actually absorb. This caveat applies only to constrained single-board / micro deployments — any normal server should use `2000+`. ::: @@ -454,21 +457,21 @@ If you are running larger instances, you WILL NEED to set this to a higher value - Type: `bool` - Default: `True` - Description: Toggles whether to show admin user details in the interface. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_PUBLIC_ACTIVE_USERS_COUNT` - Type: `bool` - Default: `True` - Description: Controls whether the active user count is visible to all users or restricted to administrators only. When set to `False`, only admin users can see how many users are currently active, reducing backend load and addressing privacy concerns in large deployments. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_USER_STATUS` - Type: `bool` - Default: `True` - Description: Globally enables or disables user status functionality. When disabled, the status UI (including blinking active/away indicators and status messages) is hidden across the application, and user status API endpoints are restricted. -- Persistence: This environment variable is a `PersistentConfig` variable. It can be toggled in the **Admin Panel > Settings > General > User Status**. +- Persistence: This environment variable is a `ConfigVar` variable. It can be toggled in the **Admin Panel > Settings > General > User Status**. #### `ENABLE_EASTER_EGGS` @@ -480,7 +483,7 @@ If you are running larger instances, you WILL NEED to set this to a higher value - Type: `str` - Description: Sets the admin email shown by `SHOW_ADMIN_DETAILS` -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENV` @@ -566,13 +569,13 @@ Enabling `ENABLE_REALTIME_CHAT_SAVE` causes every single token generated by the - Type: `bool` - Default: `True` -- Description: Controls whether the user and model profile-image endpoints honor an external `http(s)://` URL stored in `profile_image_url` by issuing a `302 Found` redirect to the original origin. When `False`, the redirect is suppressed and the endpoint falls through to the bundled default image instead. Set to `False` to prevent client-side IP, User-Agent, and Referer leaks to attacker-controlled origins via attacker-stored profile URLs (data URIs and same-origin/static images continue to load normally). Existing deployments that legitimately rely on external profile image URLs (e.g. Gravatar redirects served by upstream identity providers) should keep the default. **This variable is read once at startup — it is not a `PersistentConfig` and cannot be changed from the Admin UI.** +- Description: Controls whether the user and model profile-image endpoints honor an external `http(s)://` URL stored in `profile_image_url` by issuing a `302 Found` redirect to the original origin. When `False`, the redirect is suppressed and the endpoint falls through to the bundled default image instead. Set to `False` to prevent client-side IP, User-Agent, and Referer leaks to attacker-controlled origins via attacker-stored profile URLs (data URIs and same-origin/static images continue to load normally). Existing deployments that legitimately rely on external profile image URLs (e.g. Gravatar redirects served by upstream identity providers) should keep the default. **This variable is read once at startup — it is not a `ConfigVar` and cannot be changed from the Admin UI.** #### `PROFILE_IMAGE_ALLOWED_MIME_TYPES` - Type: `str` (comma-separated MIME types) - Default: `image/png,image/jpeg,image/gif,image/webp` -- Description: Allowlist of MIME types accepted when serving a base64 `data:` URI as a profile image. The MIME type is parsed from the data URI prefix and checked against this list before the response is streamed; non-allowlisted types fall through to the bundled default image. Responses also set `X-Content-Type-Options: nosniff` to prevent the browser from sniffing the body into an executable type. SVG is intentionally not in the default list because it can carry inline `