From edc4a1f2887fc355213c1f148d47ac2f5fbbabde Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Wed, 13 May 2026 21:05:31 +0200 Subject: [PATCH 01/24] 0.9.6 --- .../extensibility/plugin/tools/index.mdx | 15 +- docs/features/workspace/knowledge.md | 118 ++- .../quick-start/tab-docker/DockerCompose.md | 10 +- .../quick-start/tab-docker/ManualDocker.md | 6 +- docs/getting-started/updating.mdx | 6 +- docs/reference/database-schema.md | 2 +- docs/reference/env-configuration.mdx | 848 +++++++++--------- docs/reference/index.md | 2 +- docs/troubleshooting/index.mdx | 2 +- docs/troubleshooting/web-search.mdx | 4 +- 10 files changed, 576 insertions(+), 437 deletions(-) diff --git a/docs/features/extensibility/plugin/tools/index.mdx b/docs/features/extensibility/plugin/tools/index.mdx index 1b4b354cf6..ec238740c8 100644 --- a/docs/features/extensibility/plugin/tools/index.mdx +++ b/docs/features/extensibility/plugin/tools/index.mdx @@ -229,8 +229,10 @@ Default Mode is **not** a supported workaround even for DeepSeek — it is legac | `search_knowledge_bases` | Text search over KB names/descriptions. | | `query_knowledge_files` | Search file contents via the RAG retrieval pipeline (hybrid + rerank when enabled). Main tool for finding answers in docs. | | `search_knowledge_files` | Search files by filename. | -| `view_file` | Read a user-accessible file by ID with pagination (`offset`, `max_chars`). | +| `grep_knowledge_files` | Exact text / regex search across knowledge file content. Returns matching lines with line numbers. Complements `query_knowledge_files` (semantic) when you need literal matches. | +| `view_file` | Read a user-accessible file by ID with character pagination (`offset`, `max_chars`) or line range (`start_line`, `end_line`, optional `line_numbers`). | | `view_knowledge_file` | Read a knowledge-base file by ID with pagination (`offset`, `max_chars`). | +| `kb_exec` *(opt-in)* | Filesystem-style command interface for knowledge bases (`ls`, `tree`, `cat`, `head`, `tail`, `sed`, `grep`, `find`, `wc`, `stat`, with pipe support). Directory-aware: `ls docs/`, `tree`, `grep "x" docs/`, and path-based file refs (`docs/api/auth.md`). Replaces the discovery/read tools above when [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set. | | **Image Gen** | *Requires image generation enabled (per-tool) AND per-chat "Image Generation" toggle enabled.* | | `generate_image` | Generates a new image based on a prompt. Requires `ENABLE_IMAGE_GENERATION`. | | `edit_image` | Edits existing images based on a prompt and image URLs. Requires `ENABLE_IMAGE_EDIT`. | @@ -287,12 +289,17 @@ Use this quick matrix instead of memorizing per-row caveats. | `query_knowledge_bases` | ❌ | ✅ | | `search_knowledge_files` | ✅ (auto-scoped) | ✅ (all accessible KBs) | | `query_knowledge_files` | ✅ (auto-scoped) | ✅ | +| `grep_knowledge_files` | ✅ (auto-scoped) | ✅ | | `view_file` | ✅ (when attached items include files/collections) | ❌ | | `view_knowledge_file` | ✅ (when attached items include files/collections) | ✅ | | `view_note` | ✅ (when attached items include notes) | ❌ | Quick rule: `list_knowledge` and `list_knowledge_bases` are mutually exclusive. +:::info `kb_exec` replaces the matrix when enabled +When [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set, Open WebUI injects a single `kb_exec` tool (plus `query_knowledge_files`, and `query_knowledge_bases` when no KB is attached) instead of the discovery/read tools listed above. The model interacts with the knowledge base through familiar shell commands. See the [Knowledge feature page](/features/workspace/knowledge#agentic-knowledge-tools) for details. +::: + #### Tool Reference | Tool | Parameters | Output | @@ -307,8 +314,10 @@ Quick rule: `list_knowledge` and `list_knowledge_bases` are mutually exclusive. | `search_knowledge_bases` | `query` (required), `count` (default: 5), `skip` (default: 0) | Array of `{id, name, description, file_count}` | | `query_knowledge_files` | `query` (required), `knowledge_ids` (optional), `count` (default: 5) | Array of chunks like `{content, source, file_id, distance?}`; note hits include `{note_id, type: "note"}` | | `search_knowledge_files` | `query` (required), `knowledge_id` (optional), `count` (default: 5), `skip` (default: 0) | Array of `{id, filename, knowledge_id, knowledge_name}` | -| `view_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000) | `{id, filename, content, updated_at, created_at}` — includes `truncated`, `total_chars`, `next_offset` when paginated | +| `grep_knowledge_files` | `pattern` (required; regex auto-detected), `file_id` (optional — single-file mode), `case_insensitive` (default: false), `count_only` (default: false) | Matching lines with file IDs, filenames, and 1-indexed line numbers (capped at 50 matches) | +| `view_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000), `line_numbers` (default: false), `start_line` / `end_line` (optional — line-based addressing overrides `offset`/`max_chars`) | `{id, filename, content, updated_at, created_at}` — includes `truncated`, `total_chars`, `next_offset` when paginated, or `total_lines`, `showing_lines`, `next_start_line` in line mode | | `view_knowledge_file` | `file_id` (required), `offset` (default: 0), `max_chars` (default: 10000, cap: 100000) | `{id, filename, content, knowledge_id, knowledge_name}` — includes pagination metadata when truncated | +| `kb_exec` | `command` (required) — filesystem-style command: `ls` (root) / `ls /` / `ls -a` (flat with paths), `tree` / `tree /`, `cat -n `, `head -N `, `tail -N `, `sed -n ',p' `, `grep [-i\|-l\|-c] "" [/\|\|*.ext]`, `find [/] ""`, `wc `, `stat `; supports pipes (`grep "auth" \| head -5`); files referenced by path (`docs/api/auth.md`), filename, or file ID | Plain text command output (matches/listing/tree/file content as appropriate) | | **Image Gen** | | | | `generate_image` | `prompt` (required) | `{status, message, images}` — auto-displayed | | `edit_image` | `prompt` (required), `image_urls` (required) | `{status, message, images}` — auto-displayed | @@ -443,7 +452,7 @@ When the **Builtin Tools** capability is enabled, you can further control which | **Memory** | `search_memories`, `add_memory`, `replace_memory_content`, `delete_memory`, `list_memories` | Search and manage user memories | | **Chat History** | `search_chats`, `view_chat` | Search and view user chat history | | **Notes** | `search_notes`, `view_note`, `write_note`, `replace_note_content` | Search, view, and manage user notes | -| **Knowledge Base** | `list_knowledge`, `list_knowledge_bases`, `search_knowledge_bases`, `query_knowledge_bases`, `search_knowledge_files`, `query_knowledge_files`, `view_file`, `view_knowledge_file` | Browse and query knowledge bases | +| **Knowledge Base** | `list_knowledge`, `list_knowledge_bases`, `search_knowledge_bases`, `query_knowledge_bases`, `search_knowledge_files`, `query_knowledge_files`, `grep_knowledge_files`, `view_file`, `view_knowledge_file` (or just `kb_exec` + `query_knowledge_files` when [`ENABLE_KB_EXEC`](/reference/env-configuration#enable_kb_exec) is set) | Browse and query knowledge bases | | **Web Search** | `search_web`, `fetch_url` | Search the web and fetch URL content | | **Image Generation** | `generate_image`, `edit_image` | Generate and edit images | | **Code Interpreter** | `execute_code` | Execute code in a sandboxed environment | diff --git a/docs/features/workspace/knowledge.md b/docs/features/workspace/knowledge.md index 1ba316d90f..9db514849f 100644 --- a/docs/features/workspace/knowledge.md +++ b/docs/features/workspace/knowledge.md @@ -42,6 +42,7 @@ Attach specific knowledge bases to a model so it only searches what's relevant. | 📑 **5 extraction engines** | Tika, Docling, Azure, Mistral OCR, custom loaders | | 🤖 **Agentic retrieval** | Models browse, search, and read your documents autonomously | | 📄 **Full context mode** | Inject entire documents with no chunking | +| 🗂️ **Nested directories** | Organize files into subdirectories with drag-and-drop reordering | | 📦 **Export and API** | Back up knowledge bases as zip files, manage via REST API | --- @@ -76,12 +77,80 @@ With [native function calling](/features/extensibility/plugin/tools#tool-calling | `query_knowledge_bases` | ❌ | ✅ | Search KB names/descriptions by semantic similarity | | `search_knowledge_files` | ✅ (scoped) | ✅ (all) | Search files by filename | | `query_knowledge_files` | ✅ (scoped) | ✅ | Search file contents using the RAG pipeline | -| `view_file` | ✅ | ❌ | Read file content with pagination (default 10K chars, cap 100K) | +| `grep_knowledge_files` | ✅ (scoped) | ✅ | Exact text / regex search across knowledge files (returns matching lines with line numbers; auto-detects regex like `error|warn`) | +| `view_file` | ✅ | ❌ | Read file content with pagination (`offset`/`max_chars`) or by line range (`start_line`/`end_line`, optional `line_numbers`) | | `view_knowledge_file` | ✅ | ✅ | Read file content from any accessible KB | | `view_note` | ✅ | ❌ | Read attached notes | The key split: `list_knowledge` and `list_knowledge_bases` are mutually exclusive. Attaching a KB scopes the model to only those documents. Leaving it unscoped lets the model discover everything the user has access to. +#### When to prefer `grep_knowledge_files` over `query_knowledge_files` + +The two search tools complement each other: + +| | `query_knowledge_files` | `grep_knowledge_files` | +|---|---|---| +| **How it matches** | Semantic / vector retrieval (with optional BM25 + rerank when [`ENABLE_RAG_HYBRID_SEARCH`](/reference/env-configuration#enable_rag_hybrid_search) is on) | Exact string match — regex auto-detected (e.g. `error\|warn`, `version \d+`) | +| **Returns** | Relevant chunks of content | Matching lines with file ID, filename, and 1-indexed line number | +| **Use when** | "What does the documentation say about X?" — paraphrased questions, conceptual lookups | "Find every place we mention `OPENAI_API_KEY`" — literal identifiers, error strings, version numbers | +| **Result cap** | Top K (default 5) | 50 matches | +| **Flags** | — | `case_insensitive`, `count_only`, `file_id` (single-file mode) | + +In agentic flows, a typical pattern is: `query_knowledge_files` to locate the relevant document, then `grep_knowledge_files` to pinpoint exact lines, then `view_file` (line-range mode below) to read the surrounding context. + +#### Reading with `view_file` + +`view_file` supports two addressing modes: + +- **Character pagination** — `offset` + `max_chars` (default `10000`, hard cap `100000`). Best for streaming through a long document; the response includes `next_offset` when the file is truncated. +- **Line range** — `start_line` + optional `end_line` (1-indexed, inclusive). Overrides `offset`/`max_chars` when set; pairs naturally with `grep_knowledge_files`' line numbers. Pass `line_numbers: true` to also get a `: ` prefix on each returned line. + +The line-range response includes `total_lines`, `showing_lines`, and `next_start_line` for follow-up reads. + +### Filesystem-style access (`kb_exec`) + +When [`ENABLE_KB_EXEC=True`](/reference/env-configuration#enable_kb_exec) is set, Open WebUI replaces the per-purpose knowledge tools (`list_knowledge`, `search_knowledge_files`, `grep_knowledge_files`, `view_file`, `view_knowledge_file`, `view_note`) with a single unified `kb_exec` tool. `query_knowledge_files` stays available (and `query_knowledge_bases` is added when no KB is attached), but everything else collapses into shell-style commands. + +This is experimental and **off by default**. It targets frontier models that already "think in shell" — they tend to chain `ls`, `grep`, and `cat` more reliably than they orchestrate a fan-out of specialized tools. + +**Supported commands** + +| Command | Purpose | +|---------|---------| +| `ls`, `ls /`, `ls -a` | List the current level / a subdirectory / a flat view of every file with full paths | +| `tree`, `tree /` | Recursive directory tree | +| `cat -n ` | Read a file (optionally with line numbers) | +| `head -N ` / `tail -N ` | First or last N lines | +| `sed -n ',p' ` | Print lines `` through `` | +| `grep "" [/\|\|*.ext]` | Exact / regex search; flags `-i` (case-insensitive), `-l` (filenames only), `-c` (counts) | +| `find [/] ""` | Find files by glob | +| `wc ` | Line / word / char counts | +| `stat ` | File metadata | + +**Pipes** + +`kb_exec` parses a single pipeline, so commands compose: + +```text +grep "auth" | head -5 +grep -l "TODO" docs/ +find docs/ "*.md" | head -10 +``` + +**File references** + +Files can be addressed three ways — pick whichever is unambiguous: + +- **Path** — `docs/api/auth.md` (relative to the knowledge base root; resolves through the directory tree) +- **Filename** — `auth.md` (errors with an "ambiguous filename" hint when the same name exists in multiple directories or KBs) +- **File ID** — the UUID returned by `ls`, `find`, or `grep` + +**Behavior notes** + +- `kb_exec` respects the same access control as the other knowledge tools — files the user can't read are silently excluded from results. +- The model still has `query_knowledge_files` for semantic search; reach for it when literal commands won't find a paraphrased concept. +- Built on top of the directory model — `kb_exec` is the only tool that fully reflects the directory structure created in the UI. + Autonomous exploration works best with frontier models that can intelligently chain search, browse, and synthesize. Smaller models may struggle with multi-step retrieval. Administrators can disable the **Knowledge Base** tool category per-model in **Workspace > Models > Edit > Builtin Tools**. For the full list of built-in agentic tools, see the [Native/Agentic Mode Tools Guide](/features/extensibility/plugin/tools#built-in-system-tools-nativeagentic-mode). @@ -104,6 +173,37 @@ When native function calling is enabled, attached knowledge is **not automatical 3. Upload files or add existing documents. 4. Attach the knowledge base to a model in **Workspace > Models > Edit**, or reference it in chat with `#`. +### Organizing into directories + +Knowledge bases support nested **directories** so larger document sets stay navigable. Create them from the **Add Content** menu (**+ New Directory**), then reorganize freely. + +**Creating and navigating** + +- **+ New Directory** lives next to file upload in the **Add Content** menu. Name uniqueness is enforced per parent — two siblings can't share a name, but you can reuse names in different parents. +- Click a directory to descend into it; the **breadcrumb trail** at the top of the view always reflects the current path and lets you jump back to any ancestor in one click. +- Directories can be **renamed** or **moved to a different parent** without affecting the files inside them. + +**Drag-and-drop** + +You can move items by dragging: + +- **Files** onto a directory row, into the empty area of an open directory, or onto any breadcrumb crumb (including the root crumb to send a file back to the top level). +- **Directories** onto another directory to nest them, or onto a breadcrumb crumb to move them up the tree. Moving a directory into itself or one of its descendants is blocked server-side. + +**Deletion semantics** + +Deleting a non-empty directory prompts for the action to take with its contents: + +- **Move files to parent** (default) — the directory is removed but its files and subdirectories are re-parented one level up. +- **Delete everything** — the directory and all files/subdirectories underneath it are permanently removed. + +**Effect on retrieval and tools** + +- **Retrieval and standard RAG** still span the entire knowledge base. Directories don't shard the vector index; chunks from any subdirectory remain reachable in a single search. +- **Agentic tools** are directory-aware: + - `kb_exec` (when enabled) treats subdirectories like a filesystem: `ls docs/`, `tree`, `grep "x" docs/`, and path-style refs (`docs/api/auth.md`) all work — see [Filesystem-Style Access (`kb_exec`)](#filesystem-style-access-kb_exec) below. + - The other knowledge tools (`query_knowledge_files`, `grep_knowledge_files`, `search_knowledge_files`) ignore directory boundaries and return matches from the whole KB. + ### Exporting Admins can export an entire knowledge base as a zip file via the item menu (three dots) > **Export**. Files are converted to `.txt` for universal compatibility. Regular users will not see the Export option. @@ -112,9 +212,19 @@ Admins can export an entire knowledge base as a zip file via the item menu (thre Knowledge bases can be managed programmatically: -- `POST /api/v1/files/` - Upload files -- `GET /api/v1/files/{id}/process/status` - Check processing status -- `POST /api/v1/knowledge/{id}/file/add` - Add files to a knowledge base +**Files** + +- `POST /api/v1/files/` — Upload files +- `GET /api/v1/files/{id}/process/status` — Check processing status +- `POST /api/v1/files/{id}/rename` — Rename a file +- `POST /api/v1/knowledge/{id}/file/add` — Add files to a knowledge base +- `POST /api/v1/knowledge/{id}/file/move` — Move a file between directories within the same KB (body: `file_id`, `directory_id` — `null` moves to the KB root) + +**Directories** + +- `POST /api/v1/knowledge/{id}/dirs/create` — Create a directory (body: `name`, optional `parent_id`) +- `POST /api/v1/knowledge/{id}/dirs/{dir_id}/update` — Rename or re-parent a directory (body: `name` and/or `parent_id`) +- `DELETE /api/v1/knowledge/{id}/dirs/{dir_id}/delete?move_files=true` — Delete a directory. With `move_files=true` (default), contained files are re-parented; with `move_files=false`, they're deleted along with the directory. File processing happens asynchronously. You must poll the status endpoint until processing completes before adding files to a knowledge base, or you'll get an "empty content" error. See [API Endpoints](/reference/api-endpoints#-retrieval-augmented-generation-rag) for workflow examples. diff --git a/docs/getting-started/quick-start/tab-docker/DockerCompose.md b/docs/getting-started/quick-start/tab-docker/DockerCompose.md index 8b88d3ac40..b7bd492f3a 100644 --- a/docs/getting-started/quick-start/tab-docker/DockerCompose.md +++ b/docs/getting-started/quick-start/tab-docker/DockerCompose.md @@ -56,9 +56,15 @@ To start your services, run the following command: docker compose up -d ``` -## Helper Script +## Helper Scripts -A useful helper script called `run-compose.sh` is included with the codebase. This script assists in choosing which Docker Compose files to include in your deployment, streamlining the setup process. +A set of helper scripts is included with the codebase to streamline common Docker workflows: + +- `docker-compose-launcher.sh` — Interactive Compose launcher with GPU auto-detection, configurable WebUI/API ports, host data mounts, and optional Playwright support. Run `./docker-compose-launcher.sh --help` for the full list of flags. Use `--drop` to tear down the project. +- `docker-cleanup.sh` — Stops the Compose project and **deletes all volumes**, including persistent data. Prompts for confirmation before destroying data. +- `docker-run.sh` — Builds the Open WebUI image and runs a single container, exposing it on `OPEN_WEBUI_PORT` (default `3000`). +- `docker-ollama.sh` — Pulls and runs the official Ollama container with optional GPU passthrough, exposing it on `OLLAMA_PORT` (default `11434`). +- `docker-update-models.sh` — Iterates through every model installed in the Ollama container and pulls the latest version. --- diff --git a/docs/getting-started/quick-start/tab-docker/ManualDocker.md b/docs/getting-started/quick-start/tab-docker/ManualDocker.md index b944625d47..8825dedf29 100644 --- a/docs/getting-started/quick-start/tab-docker/ManualDocker.md +++ b/docs/getting-started/quick-start/tab-docker/ManualDocker.md @@ -49,9 +49,9 @@ Visit [http://localhost:3000](http://localhost:3000). For production environments, pin a specific version instead of using floating tags: ```bash -docker pull ghcr.io/open-webui/open-webui:v0.9.5 -docker pull ghcr.io/open-webui/open-webui:v0.9.5-cuda -docker pull ghcr.io/open-webui/open-webui:v0.9.5-ollama +docker pull ghcr.io/open-webui/open-webui:v0.9.6 +docker pull ghcr.io/open-webui/open-webui:v0.9.6-cuda +docker pull ghcr.io/open-webui/open-webui:v0.9.6-ollama ``` --- diff --git a/docs/getting-started/updating.mdx b/docs/getting-started/updating.mdx index 68a118ccdd..7b9000e04d 100644 --- a/docs/getting-started/updating.mdx +++ b/docs/getting-started/updating.mdx @@ -31,9 +31,9 @@ The `:main` tag always points to the **latest build**. It's convenient but can i For stability, pin a specific release tag: ``` -ghcr.io/open-webui/open-webui:v0.9.5 -ghcr.io/open-webui/open-webui:v0.9.5-cuda -ghcr.io/open-webui/open-webui:v0.9.5-ollama +ghcr.io/open-webui/open-webui:v0.9.6 +ghcr.io/open-webui/open-webui:v0.9.6-cuda +ghcr.io/open-webui/open-webui:v0.9.6-ollama ``` Browse all available tags on the [GitHub releases page](https://github.com/open-webui/open-webui/releases). diff --git a/docs/reference/database-schema.md b/docs/reference/database-schema.md index 8b5ab256eb..464ba831a5 100644 --- a/docs/reference/database-schema.md +++ b/docs/reference/database-schema.md @@ -10,7 +10,7 @@ This tutorial is a community contribution and is not supported by the Open WebUI ::: > [!WARNING] -> This documentation reflects schema changes up to Open WebUI v0.9.5. +> This documentation reflects schema changes up to Open WebUI v0.9.6. ## Open-WebUI Internal SQLite Database diff --git a/docs/reference/env-configuration.mdx b/docs/reference/env-configuration.mdx index 8b1da35e90..5e15c43ce2 100644 --- a/docs/reference/env-configuration.mdx +++ b/docs/reference/env-configuration.mdx @@ -12,23 +12,23 @@ As new variables are introduced, this page will be updated to reflect the growin :::info -This page is up-to-date with Open WebUI release version [v0.9.5](https://github.com/open-webui/open-webui/releases/tag/v0.9.5), but is still a work in progress to later include more accurate descriptions, listing out options available for environment variables, defaults, and improving descriptions. +This page is up-to-date with Open WebUI release version [v0.9.6](https://github.com/open-webui/open-webui/releases/tag/v0.9.6), but is still a work in progress to later include more accurate descriptions, listing out options available for environment variables, defaults, and improving descriptions. ::: -### Important Note on `PersistentConfig` Environment Variables +### Important Note on `ConfigVar` Environment Variables :::note -When launching Open WebUI for the first time, all environment variables are treated equally and can be used to configure the application. However, for environment variables marked as `PersistentConfig`, their values are persisted and stored internally. +When launching Open WebUI for the first time, all environment variables are treated equally and can be used to configure the application. However, for environment variables marked as `ConfigVar`, their values are persisted and stored internally. -After the initial launch, if you restart the container, `PersistentConfig` environment variables will no longer use the external environment variable values. Instead, they will use the internally stored values. +After the initial launch, if you restart the container, `ConfigVar` environment variables will no longer use the external environment variable values. Instead, they will use the internally stored values. In contrast, regular environment variables will continue to be updated and applied on each subsequent restart. -You can update the values of `PersistentConfig` environment variables directly from within Open WebUI, and these changes will be stored internally. This allows you to manage these configuration settings independently of the external environment variables. +You can update the values of `ConfigVar` environment variables directly from within Open WebUI, and these changes will be stored internally. This allows you to manage these configuration settings independently of the external environment variables. -Please note that `PersistentConfig` environment variables are clearly marked as such in the documentation below, so you can be aware of how they will behave. +Please note that `ConfigVar` environment variables are clearly marked as such in the documentation below, so you can be aware of how they will behave. To disable this behavior and force Open WebUI to always use your environment variables (ignoring the database), set `ENABLE_PERSISTENT_CONFIG` to `False`. @@ -44,7 +44,7 @@ If you change an environment variable (like `ENABLE_SIGNUP=True`) but don't see Set `ENABLE_PERSISTENT_CONFIG=False` in your environment. This forces Open WebUI to read your variables directly. Note that UI-based settings changes will not persist across restarts in this mode. #### Option 2: Update via Admin UI (Recommended) -The simplest and safest way to change `PersistentConfig` settings is directly through the **Admin Panel** within Open WebUI. Even if an environment variable is set, changes made in the UI will take precedence and be saved to the database. +The simplest and safest way to change `ConfigVar` settings is directly through the **Admin Panel** within Open WebUI. Even if an environment variable is set, changes made in the UI will take precedence and be saved to the database. #### Option 3: Manual Database Update (Last Resort / Lock-out Recovery) If you are locked out or cannot access the UI, you can manually update the SQLite database via Docker: @@ -78,7 +78,7 @@ environment variables, see our [logging documentation](https://docs.openwebui.co - Type: `str` - Default: `http://localhost:3000` - Description: Specifies the URL where your Open WebUI installation is reachable. Needed for search engine support and OAuth/SSO. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. :::warning @@ -97,7 +97,7 @@ Failure to set WEBUI_URL before using OAuth/SSO will result in failure to log in - Type: `bool` - Default: `True` - Description: Toggles user account creation. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_SIGNUP_PASSWORD_CONFIRMATION` @@ -148,14 +148,14 @@ After the admin account is created, sign-up is automatically disabled for securi - Type: `bool` - Default: `True` - Description: Toggles email, password, sign-in and "or" (only when `ENABLE_OAUTH_SIGNUP` is set to True) elements. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_PASSWORD_CHANGE_FORM` - Type: `bool` - Default: `True` - Description: Controls visibility of the password change UI in **Settings > Account**. When set to `False`, users do not see the password update form, which is useful for SSO-focused deployments where password changes should not be presented in the UI. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_PASSWORD_AUTH` @@ -181,14 +181,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Type: `str` - Default: `en` - Description: Sets the default locale for the application. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_MODELS` - Type: `str` - Default: Empty string (' '), since `None`. - Description: Sets a default Language Model. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_PINNED_MODELS` @@ -196,14 +196,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Default: Empty string (' ') - Description: Comma-separated list of model IDs to pin by default for new users who haven't customized their pinned models. This provides a pre-selected set of frequently used models in the model selector for new accounts. - Example: `gpt-4,claude-3-opus,llama-3-70b` -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_MODEL_METADATA` - Type: `dict` (JSON object) - Default: `{}` - Description: Sets global default metadata (capabilities and other model info) for all models. These defaults act as a baseline — per-model overrides always take precedence. For capabilities, the defaults and per-model values are merged (per-model wins on conflicts). For other metadata fields, the default is only applied if the model has no value set. Configurable via **Admin Settings → Models**. -- Persistence: This environment variable is a `PersistentConfig` variable. Stored at config key `models.default_metadata`. +- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `models.default_metadata`. :::info @@ -220,7 +220,7 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Type: `dict` (JSON object) - Default: `{}` - Description: Sets global default parameters (temperature, top_p, max_tokens, seed, etc.) for all models. These defaults are applied as a baseline at chat completion time — per-model parameter overrides always take precedence. Configurable via **Admin Settings → Models**. -- Persistence: This environment variable is a `PersistentConfig` variable. Stored at config key `models.default_params`. +- Persistence: This environment variable is a `ConfigVar` variable. Stored at config key `models.default_params`. :::info @@ -240,14 +240,14 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - `admin` - New users are automatically activated with administrator permissions. - Default: `pending` - Description: Sets the default role assigned to new users. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_GROUP_ID` - Type: `str` - Default: Empty string (' ') - Description: Sets the default group ID to assign to new users upon registration. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `DEFAULT_GROUP_SHARE_PERMISSION` @@ -261,63 +261,63 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Type: `str` - Default: Empty string (' ') - Description: Sets a custom title for the pending user overlay. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `PENDING_USER_OVERLAY_CONTENT` - Type: `str` - Default: Empty string (' ') - Description: Sets a custom text content for the pending user overlay. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_CALENDAR` - Type: `bool` - Default: `True` - Description: Enables or disables the Calendar feature. When enabled, users can create calendars, manage events, and share calendars with other users or groups via access grants. Active automations are automatically surfaced as virtual events on a dedicated "Scheduled Tasks" calendar. Requires the `features.calendar` user permission (admins always pass). -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_CHANNELS` - Type: `bool` - Default: `False` - Description: Enables or disables channel support. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_FOLDERS` - Type: `bool` - Default: `True` - Description: Enables or disables the folders feature, allowing users to organize their chats into folders in the sidebar. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `FOLDER_MAX_FILE_COUNT` - Type: `int` - Default: `("") empty string` - Description: Sets the maximum number of files processing allowed per folder. -- Persistence: This environment variable is a `PersistentConfig` variable. It can be configured in the **Admin Panel > Settings > General > Folder Max File Count**. Default is none (empty string) which is unlimited. +- Persistence: This environment variable is a `ConfigVar` variable. It can be configured in the **Admin Panel > Settings > General > Folder Max File Count**. Default is none (empty string) which is unlimited. #### `ENABLE_AUTOMATIONS` - Type: `bool` - Default: `True` - Description: Enables or disables the Automations feature globally. When disabled, the scheduler skips automation processing, the automation API endpoints return `403 Forbidden`, automation builtin tools are not injected, and the Automations entry is hidden from the sidebar. Requires the `features.automations` user permission (admins always pass). -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `AUTOMATION_MAX_COUNT` - Type: `int` - Default: `("") empty string` (unlimited) - Description: Sets the maximum number of automations a non-admin user can create. When set to a positive integer, users who reach this limit will receive a `403 Forbidden` error when attempting to create additional automations. Admins bypass this limit. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `AUTOMATION_MIN_INTERVAL` - Type: `int` (seconds) - Default: `("") empty string` (no minimum) - Description: Sets the minimum allowed interval in seconds between automation recurrences for non-admin users. When set, any automation schedule that recurs more frequently than this value will be rejected with a `400 Bad Request` error. One-time automations (`COUNT=1`) are exempt from this check. Admins bypass this limit. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. :::tip Common values for AUTOMATION_MIN_INTERVAL @@ -347,20 +347,20 @@ is also being used and set to `True`. **Never disable this if OAUTH/SSO is not b - Type: `bool` - Default: `True` - Description: Enables or disables the notes feature, allowing users to create and manage personal notes within Open WebUI. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_MEMORIES` - Type: `bool` - Default: `True` - Description: Enables or disables the [memory feature](/features/chat-conversations/memory), allowing models to store and retrieve long-term information about users. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `WEBHOOK_URL` - Type: `str` - Description: Sets a webhook for integration with Discord/Slack/Microsoft Teams. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. :::note Admin posture toggles vs. security boundaries @@ -416,14 +416,14 @@ Treat anything in this cluster as *what the admin sees and does in the product U - Type: `bool` - Default: `False` - Description: Enables or disables user webhooks. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `RESPONSE_WATERMARK` - Type: `str` - Default: Empty string (' ') - Description: Sets a custom text that will be included when you copy a message in the chat. e.g., `"This text is AI generated"` -> will add "This text is AI generated" to every message, when copied. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `IFRAME_CSP` @@ -454,21 +454,21 @@ If you are running larger instances, you WILL NEED to set this to a higher value - Type: `bool` - Default: `True` - Description: Toggles whether to show admin user details in the interface. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_PUBLIC_ACTIVE_USERS_COUNT` - Type: `bool` - Default: `True` - Description: Controls whether the active user count is visible to all users or restricted to administrators only. When set to `False`, only admin users can see how many users are currently active, reducing backend load and addressing privacy concerns in large deployments. -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENABLE_USER_STATUS` - Type: `bool` - Default: `True` - Description: Globally enables or disables user status functionality. When disabled, the status UI (including blinking active/away indicators and status messages) is hidden across the application, and user status API endpoints are restricted. -- Persistence: This environment variable is a `PersistentConfig` variable. It can be toggled in the **Admin Panel > Settings > General > User Status**. +- Persistence: This environment variable is a `ConfigVar` variable. It can be toggled in the **Admin Panel > Settings > General > User Status**. #### `ENABLE_EASTER_EGGS` @@ -480,7 +480,7 @@ If you are running larger instances, you WILL NEED to set this to a higher value - Type: `str` - Description: Sets the admin email shown by `SHOW_ADMIN_DETAILS` -- Persistence: This environment variable is a `PersistentConfig` variable. +- Persistence: This environment variable is a `ConfigVar` variable. #### `ENV` @@ -566,13 +566,13 @@ Enabling `ENABLE_REALTIME_CHAT_SAVE` causes every single token generated by the - Type: `bool` - Default: `True` -- Description: Controls whether the user and model profile-image endpoints honor an external `http(s)://` URL stored in `profile_image_url` by issuing a `302 Found` redirect to the original origin. When `False`, the redirect is suppressed and the endpoint falls through to the bundled default image instead. Set to `False` to prevent client-side IP, User-Agent, and Referer leaks to attacker-controlled origins via attacker-stored profile URLs (data URIs and same-origin/static images continue to load normally). Existing deployments that legitimately rely on external profile image URLs (e.g. Gravatar redirects served by upstream identity providers) should keep the default. **This variable is read once at startup — it is not a `PersistentConfig` and cannot be changed from the Admin UI.** +- Description: Controls whether the user and model profile-image endpoints honor an external `http(s)://` URL stored in `profile_image_url` by issuing a `302 Found` redirect to the original origin. When `False`, the redirect is suppressed and the endpoint falls through to the bundled default image instead. Set to `False` to prevent client-side IP, User-Agent, and Referer leaks to attacker-controlled origins via attacker-stored profile URLs (data URIs and same-origin/static images continue to load normally). Existing deployments that legitimately rely on external profile image URLs (e.g. Gravatar redirects served by upstream identity providers) should keep the default. **This variable is read once at startup — it is not a `ConfigVar` and cannot be changed from the Admin UI.** #### `PROFILE_IMAGE_ALLOWED_MIME_TYPES` - Type: `str` (comma-separated MIME types) - Default: `image/png,image/jpeg,image/gif,image/webp` -- Description: Allowlist of MIME types accepted when serving a base64 `data:` URI as a profile image. The MIME type is parsed from the data URI prefix and checked against this list before the response is streamed; non-allowlisted types fall through to the bundled default image. Responses also set `X-Content-Type-Options: nosniff` to prevent the browser from sniffing the body into an executable type. SVG is intentionally not in the default list because it can carry inline ` + Checklist + -
- +
    {items_html}
""" @@ -55,11 +55,11 @@ To provide the LLM with actionable context about the embed, return a **tuple** o ```python from fastapi.responses import HTMLResponse -def create_chart(self, data: str) -> tuple: +def render_feedback_form(self, prompt: str) -> tuple: """ - Creates an interactive chart and returns context to the LLM. + Renders an interactive feedback form and returns context to the LLM. - :param data: The data to chart + :param prompt: The question to show the user above the form """ html_content = "..." headers = {"Content-Disposition": "inline"} @@ -67,16 +67,16 @@ def create_chart(self, data: str) -> tuple: # The LLM receives this context instead of the generic message result_context = { "status": "success", - "chart_type": "scatter", - "data_points": 42, - "description": "Scatter plot showing correlation between X and Y" + "form_type": "feedback", + "fields": ["rating", "comment"], + "description": f"Rendered a feedback form asking: {prompt!r}" } return HTMLResponse(content=html_content, headers=headers), result_context ``` The context can be: -- A **string** — sent as-is to the LLM (e.g., `"Generated a bar chart with 5 categories"`) +- A **string** — sent as-is to the LLM (e.g., `"Rendered a 5-item checklist"`) - A **dict** — serialized as JSON for structured context - A **list** — serialized as JSON for multiple items diff --git a/docs/features/extensibility/plugin/development/under-the-hood.mdx b/docs/features/extensibility/plugin/development/under-the-hood.mdx new file mode 100644 index 0000000000..4d6f9a9837 --- /dev/null +++ b/docs/features/extensibility/plugin/development/under-the-hood.mdx @@ -0,0 +1,190 @@ +--- +sidebar_position: 5 +title: "Under the Hood" +--- + +# 🔧 Under the Hood: What the Plugin Loader Actually Does + +:::danger ⚠️ Critical Security Warning +**Tools, Functions, Pipes, Filters, and Actions execute arbitrary Python code on your server.** Function creation is restricted to administrators only, and Workspace Tool creation is gated by the `workspace.tools` permission — granting that permission is equivalent to giving the user shell access to the server. Only install from trusted sources, review code before importing, and restrict creation to trusted administrators. A malicious plugin could access your file system, exfiltrate data, or compromise your entire system. For full details, see the [Plugin Security Warning](/features/extensibility/plugin/). +::: + +Open WebUI's plugins (Tools, Functions = Filters / Pipes / Actions) are not sandboxed scripts running in some restricted runtime. They are **Python modules executed inside your Open WebUI process**, with full access to the standard library, any pip package, the entire `open_webui` codebase, the live FastAPI app, and the database. The documented hooks (`inlet`, `outlet`, `stream`, `pipe`, `action`) are *one* way to use that access. They are not the only way. + +This page documents what the loader really does and what that opens up, so you can build (or audit) plugins beyond the patterns shown on the per-type pages. It also lists the footguns that come with the territory. + +--- + +## How a plugin is loaded + +A single loader in [`backend/open_webui/utils/plugin.py`](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/utils/plugin.py) handles every plugin type: + +1. The plugin's Python source is read from the database. +2. A fresh `types.ModuleType` is created and registered in `sys.modules` as `function_{id}` (or `tool_{id}`). +3. The source is fed to `exec(content, module.__dict__)`. Anything at module top level runs at this point. +4. The loader looks for **one** entry-point class: `Tools`, `Pipe`, `Filter`, or `Action`. That class becomes the handle Open WebUI calls into. +5. The module stays in `sys.modules` for the life of the process. Any side effect of step 3 (imports, monkey-patches, background tasks, route registrations) is now installed in the live application. + +The entry-point class is the only thing the rest of Open WebUI cares about. Everything else in the file is yours. + +### When the module is re-executed + +Inlet/outlet hooks pass `load_from_db=True`. The loader still serves from cache if the source has not changed, but it consults the database on every call to decide that. Stream hooks pass `load_from_db=False` and read straight from cache. + +| Hook | DB check per call? | Module re-exec'd when? | +|---|---|---| +| `inlet` / `outlet` (Filter) | yes | source change between calls | +| `stream` (Filter) | no | only when another hook re-loads it | +| Tools, Pipes, Actions | yes on dispatch | source change between calls | + +Practical consequences: + +- **Editing a Filter via the editor takes effect on the next chat for `inlet`/`outlet`.** Stream picks it up the next time an `inlet` or `outlet` triggers a reload. +- **Re-execution is not per-request**, so module-top-level work is paid for once per content version, not once per chat. Top-level imports, patches, and singletons are fine. +- **Disabling or deleting a plugin** removes it from the active set. It does **not** undo anything its module top level did. The module stays in `sys.modules` and any monkey-patches it installed in other modules stay applied until the process restarts. + +--- + +## What you actually have access to + +From any hook (and from module top level): + +- The full `open_webui.*` package. Examples: `from open_webui.models.chats import Chats`, `from open_webui.utils.middleware import process_chat_payload`, `from open_webui.config import ConfigVar`. +- The live FastAPI `Request` via `__request__`, which carries `__request__.app` (the FastAPI app), `__request__.app.state` (config, caches, handlers), and `__request__.state` (per-request scratch). +- The reserved dunder args documented in [Reserved Arguments](./reserved-args): `__user__`, `__metadata__`, `__model__`, `__request__`, `__event_emitter__`, `__event_call__`, `__features__`, `__body__`, `__id__`, `__oauth_token__`, plus stream-only and per-hook extras. +- Events documented in [Events](./events): emit anything to the frontend, or solicit a response from the user with `event_call`. +- Any pip package via `requirements:` frontmatter, installed at load time (gated by [`ENABLE_PIP_INSTALL_FRONTMATTER_REQUIREMENTS`](/reference/env-configuration#enable_pip_install_frontmatter_requirements)). +- The Python stdlib, plus everything pip-installed in the container. + +There is no sandbox, no allowlist, no capability system. The execution model is **"this is Python, you are inside the server process"**. + +--- + +## Patterns + +### 1. Mutate the per-request model dict from `inlet` + +The `__model__` you receive is **the same dict object** the rest of the request reads. Changing its keys from `inlet` changes how the rest of the pipeline behaves on this request. Example (the reasoning-content fix for DeepSeek / Kimi / MiMo): + +```python +class Filter: + async def inlet(self, body: dict, __model__: dict = None) -> dict: + # Flip the per-request model to the code path that emits + # reasoning_content as a top-level field on assistant messages + # during the native tool-call loop. + if __model__ and __model__.get("provider") not in ("ollama", "llama.cpp"): + __model__["provider"] = "llama.cpp" + return body +``` + +Same trick works for any other field the middleware reads from the model dict: `params`, `meta`, custom keys you put there yourself and then read from another hook. + +### 2. Monkey-patch a backend function + +Because the plugin module can `import open_webui.*` and rebind module attributes: + +```python +import open_webui.utils.middleware as _mw + +_original = _mw.process_chat_payload + +async def _patched(request, form_data, user, metadata, model): + # ...your wrapping logic, then delegate... + return await _original(request, form_data, user, metadata, model) + +_mw.process_chat_payload = _patched +``` + +Runs at module load (once per source version). The patch persists in `sys.modules` for the life of the process. Deleting or disabling the plugin **does not** revert the patch. The only clean rollback is a process restart. + +Use sparingly. Cross-plugin interference is a real risk: if two plugins patch the same function the result depends on load order, which is not deterministic. + +### 3. Add a new HTTP route at load + +```python +def _ensure_route(app): + if any(getattr(r, "path", None) == "/my/route" for r in app.routes): + return + app.add_api_route("/my/route", my_handler, methods=["GET"]) +``` + +Call from the first hook with access to `__request__.app`. The idempotency guard is important: the loader may re-execute on edits, and `add_api_route` will happily register the same path twice. + +### 4. Spawn a background task + +```python +import asyncio + +async def _loop(app): + while True: + # ...periodic work... + await asyncio.sleep(60) + +def _start_once(app): + if getattr(app.state, "_my_plugin_started", False): + return + app.state._my_plugin_started = True + asyncio.create_task(_loop(app)) +``` + +The `app.state` flag makes it "once per process" rather than "once per source version". On a clean restart it starts fresh. + +### 5. Stash state in `app.state` + +```python +async def inlet(self, body, __request__): + cache = __request__.app.state.__dict__.setdefault("my_cache", {}) + # ...read/write cache... + return body +``` + +Shared across requests and **across plugins** in the same process. There is no namespacing: pick a unique key. + +### 6. Use `event_emitter` for arbitrary side effects in the UI + +`event_emitter` accepts any event shape the frontend handles: status banners, source citations, file attachments, chat-message updates, toasts. You are not restricted to the events documented on the per-type pages. See [Events](./events) for the full catalogue. + +### 7. Prompt the user mid-handler with `event_call` + +`event_call` is `event_emitter` that **awaits a response**. Show a form, a confirmation, an input dialog, and block until the user answers. Useful inside Tool methods that need a human in the loop, or Action handlers that confirm before executing. + +### 8. Pipes as full provider replacements + +A `Pipe` replaces the entire LLM call. Open WebUI hands you the request and asks for a response back. Nothing in the middleware constrains what you put in that response, so: + +- wrap an external API (any provider, any protocol), +- route between providers based on request shape, +- run an entire agent inside `pipe()` and stream the agent's output back, +- skip any model entirely and return canned content. + +A Pipe is the most powerful entry point precisely because the middleware steps out of the way. + +### 9. Tools that do more than their docstring says + +A `Tools` class's methods are exposed to the model as callable tools (their docstrings become JSON schema). The method body can do **anything**: call external APIs, emit UI events with `__event_emitter__`, stash data in `app.state`, monkey-patch on first call. The docstring is purely how the tool advertises itself to the model. The implementation is unconstrained. + +### 10. Actions as arbitrary one-shot operations + +`Action` renders a button on an assistant message. The handler runs server-side with the same dunder surface as Filters and Tools, against the chat that the message belongs to. Use for "approve this", "re-run with...", "send to external system", or any one-off operation a user should be able to trigger from a specific message. + +--- + +## Footguns + +- **No sandboxing.** Tools and Functions execute Python in your backend process as the backend user. The security policy ([Rule 10](/security/security-policy#reporting-guidelines)) treats this as intended behaviour: granting Tool or Function creation permission is equivalent to granting shell access on the host. Treat plugin authors as administrators. +- **Stream hooks use a stale cache.** Edits to a `stream` method only take effect after another hook (or a process restart) refreshes the module. If you edit a stream filter and the change does not seem to apply, trigger an `inlet`/`outlet` reload or restart. +- **Cross-plugin interference is not detected.** Two plugins patching the same function, registering the same route, or writing to the same `app.state` key will collide. Load order is not deterministic. Prefer additive patterns (your own namespaces, wrappers that delegate) over destructive ones. +- **Disabling does not unload.** The module stays in `sys.modules` and any module-level side effects stay installed. Restart the process to fully revert. +- **`requirements:` runs `pip install` on every replica at load.** In multi-replica deployments set [`ENABLE_PIP_INSTALL_FRONTMATTER_REQUIREMENTS=False`](/reference/env-configuration#enable_pip_install_frontmatter_requirements) and pre-install dependencies in your image; runtime installs race across workers and crash. See [Scaling → Function/Tool Dependency Installation Crashes](/troubleshooting/multi-replica#9-functiontool-dependency-installation-crashes). +- **Internal APIs are not a stable public surface.** `open_webui.utils.*`, the internal model classes, middleware helpers, and pretty much everything outside the documented dunder args and event types can rename, move, or change signatures between releases. If your monkey-patch breaks after an upgrade, that is on you to repair. +- **The Pipelines server is out of scope here.** This page is about in-process plugins (Tools / Functions). The separate [Pipelines](/features/extensibility/pipelines/) server runs out-of-process and does not share `sys.modules` with Open WebUI: it cannot monkey-patch the main app, but it also is not constrained by it. + +--- + +## When this is the wrong tool + +For anything you can express through the documented hooks (filters that mutate `body`, tools that call APIs and return results, actions that emit events), **stay in the documented hooks**. The patterns above are powerful, but their durability is shallow: cross-plugin interaction, upgrade compatibility, and rollback all degrade the moment you start patching module internals. + +If your plugin needs an interface that does not exist yet, an upstream PR is more durable than a monkey-patch. + +If you file a bug report against a code path that your plugin is monkey-patching, expect it to be closed. Reports must reproduce against an unmodified Open WebUI ([Rule 6](/security/security-policy#reporting-guidelines)). From e1a8ebcbd616426a56f2fdf7b78901c00f7a3071 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Wed, 27 May 2026 00:05:36 +0200 Subject: [PATCH 08/24] fix --- docs/features/extensibility/plugin/functions/filter.mdx | 4 ++-- docs/reference/api-endpoints.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/features/extensibility/plugin/functions/filter.mdx b/docs/features/extensibility/plugin/functions/filter.mdx index e66efa1c00..270da77e1c 100644 --- a/docs/features/extensibility/plugin/functions/filter.mdx +++ b/docs/features/extensibility/plugin/functions/filter.mdx @@ -1012,7 +1012,7 @@ In the world of Open WebUI, the `inlet` function does this important prep work o Modify and return the `body`. The modified version of the `body` is what the LLM works with, so this is your chance to bring clarity, structure, and context to the input. :::info Want to transform RAG chunks? `inlet()` runs **before** retrieval -At `inlet()` time, `body["metadata"]["files"]` and `body["files"]` contain only file/collection *references* — the actual chunk text is fetched and injected later, after every inlet filter has returned. If you need to inspect or transform the chunk text itself (PII redaction, reranking, translation, chunk-level ACLs), see [Owning Retrieval With `file_handler`](#file-handler-custom-rag) for the supported opt-in. +At `inlet()` time, `body["metadata"]["files"]` and `body["files"]` contain only file/collection *references* — the actual chunk text is fetched and injected later, after every inlet filter has returned. If you need to inspect or transform the chunk text itself (PII redaction, reranking, translation, chunk-level ACLs), see [Owning Retrieval With `file_handler`](#owning-retrieval-with-file_handler) for the supported opt-in. ::: ##### Why Would You Use the `inlet`? @@ -1156,7 +1156,7 @@ The `outlet` function is like a **proofreader**: tidy up the AI's response (or m - **Quality scoring** - Run automated quality checks on model outputs :::info Outlet and API Requests -`outlet()` does **not** run reliably for direct `/api/chat/completions` calls. On tagged releases it is never invoked by that endpoint. On `dev` it can run inline, but only when the caller supplies `chat_id` + `id`, owns the chat, and uses a non-streaming request — and even then the filtered content is not returned in the HTTP response. For direct API integrations that need `outlet()`, follow `/api/chat/completions` with `POST /api/chat/completed`. See [Filter Behavior with API Requests](#-filter-behavior-with-api-requests) for the full picture. +`outlet()` does **not** run reliably for direct `/api/chat/completions` calls. On tagged releases it is never invoked by that endpoint. On `dev` it can run inline, but only when the caller supplies `chat_id` + `id`, owns the chat, and uses a non-streaming request — and even then the filtered content is not returned in the HTTP response. For direct API integrations that need `outlet()`, follow `/api/chat/completions` with `POST /api/chat/completed`. See [Filter Behavior with API Requests](#filter-behavior-with-api-requests) for the full picture. ::: 💡 **Example Use Case**: Strip out sensitive API responses you don't want the user to see: diff --git a/docs/reference/api-endpoints.md b/docs/reference/api-endpoints.md index 450d51b689..fc34265657 100644 --- a/docs/reference/api-endpoints.md +++ b/docs/reference/api-endpoints.md @@ -278,7 +278,7 @@ Even in the non-streaming case, **`outlet()` does not rewrite the HTTP response ``` :::tip -If you need `outlet()` output over HTTP today, call `/api/chat/completions` followed by `/api/chat/completed`. Inline execution on `dev` is primarily for WebUI-shaped clients that read from the WebSocket. For more details on filter behavior, see the [Filter Function documentation](/features/extensibility/plugin/functions/filter#-filter-behavior-with-api-requests). +If you need `outlet()` output over HTTP today, call `/api/chat/completions` followed by `/api/chat/completed`. Inline execution on `dev` is primarily for WebUI-shaped clients that read from the WebSocket. For more details on filter behavior, see the [Filter Function documentation](/features/extensibility/plugin/functions/filter#filter-behavior-with-api-requests). ::: ### 🦙 Ollama API Proxy Support From d342b50d718ae878208424b707315454b8dd9d89 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Wed, 27 May 2026 19:15:15 +0200 Subject: [PATCH 09/24] python compat --- docs/getting-started/advanced-topics/development.md | 9 ++++++++- docs/getting-started/quick-start/index.mdx | 2 ++ .../quick-start/tab-python/_PythonCompat.md | 6 ++++++ 3 files changed, 16 insertions(+), 1 deletion(-) create mode 100644 docs/getting-started/quick-start/tab-python/_PythonCompat.md diff --git a/docs/getting-started/advanced-topics/development.md b/docs/getting-started/advanced-topics/development.md index 0dfeba762b..eba4bd84ec 100644 --- a/docs/getting-started/advanced-topics/development.md +++ b/docs/getting-started/advanced-topics/development.md @@ -19,10 +19,17 @@ You can test the latest changes by running the [dev Docker image](/getting-start | Requirement | Version | |-------------|---------| -| **Python** | 3.11+ | +| **Python** | 3.11 or 3.12 (see note below; 3.13 not supported yet) | | **Node.js** | 22.10+ | | **Git** | Any recent version | +:::info Python version compatibility +Open WebUI supports **Python 3.11 and 3.12**. **3.13 is not supported yet** — a few of our dependencies still need to ship 3.13-compatible releases, and until they do, installs on 3.13 will fail or break at runtime. + +- **For production**, use the [Docker image](/getting-started/quick-start) or the **latest Python 3.11**. This is the combination we test against most heavily. +- **3.12 also works**, but we have seen very rare reports of odd behaviour on 3.12 that we have not reproduced on 3.11. If you are running into something inexplicable on 3.12, dropping to the latest 3.11 is the first thing to try. +::: + :::warning Separate your data Never share your database or data directory between dev and production. Dev builds may include database migrations that are not backward-compatible. ::: diff --git a/docs/getting-started/quick-start/index.mdx b/docs/getting-started/quick-start/index.mdx index d4f5d32d00..81f4064b67 100644 --- a/docs/getting-started/quick-start/index.mdx +++ b/docs/getting-started/quick-start/index.mdx @@ -22,6 +22,7 @@ import Pip from './tab-python/Pip.md'; import Uv from './tab-python/Uv.md'; import Conda from './tab-python/Conda.md'; import PythonUpdating from './tab-python/PythonUpdating.md'; +import PythonCompat from './tab-python/_PythonCompat.md'; # Quick Start @@ -87,6 +88,7 @@ Open WebUI works on **macOS, Linux** (x86_64 and ARM64, including Raspberry Pi a +
diff --git a/docs/getting-started/quick-start/tab-python/_PythonCompat.md b/docs/getting-started/quick-start/tab-python/_PythonCompat.md new file mode 100644 index 0000000000..80f68c9a16 --- /dev/null +++ b/docs/getting-started/quick-start/tab-python/_PythonCompat.md @@ -0,0 +1,6 @@ +:::info Python version compatibility +Open WebUI supports **Python 3.11 and 3.12**. **Python 3.13 is not supported yet** — a handful of our dependencies still need to ship 3.13-compatible releases, and until they do, installs on 3.13 will fail or break at runtime. + +- **For production**, run the [Docker image](#docker) or use the **latest Python 3.11**. This is the combination we test against most heavily. +- **Python 3.12 also works**, but we have seen very rare reports of odd behaviour on 3.12 that we have not reproduced on 3.11. If something inexplicable happens on 3.12, drop to the latest 3.11 first. +::: From 868fce74cffe32e0ec2475dffa682cb46fc39e2b Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Wed, 27 May 2026 19:33:38 +0200 Subject: [PATCH 10/24] clarify window.args injection requirements --- .../plugin/development/rich-ui.mdx | 22 ++++++++++++++----- 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/docs/features/extensibility/plugin/development/rich-ui.mdx b/docs/features/extensibility/plugin/development/rich-ui.mdx index 6b00b86df8..a1a1c030f7 100644 --- a/docs/features/extensibility/plugin/development/rich-ui.mdx +++ b/docs/features/extensibility/plugin/development/rich-ui.mdx @@ -293,22 +293,32 @@ The parent responds with `{ type: 'payload', requestId: ..., payload: ... }` con ### Tool Args Injection (Tools Only) -When a **Tool** returns a Rich UI embed, the tool call arguments (the parameters the model passed to the tool) are automatically injected into the iframe's `window.args`. This allows your embedded HTML to access the tool's input: +When a **Tool** method returns a Rich UI embed inline at the tool-call display (i.e. you return an `HTMLResponse`, or a `(HTMLResponse, context)` tuple, from the tool method itself), the arguments the model passed are exposed on the iframe as `window.args` — **as a JSON string**, not a parsed object. Parse it before use: ```html ``` -:::note -This only works for Tool embeds rendered via the tool call display. Action embeds do not have `window.args` since they are triggered by the user, not the model. +:::warning Requires `allowSameOrigin` — otherwise `window.args` is silently `undefined` +The args are injected from the parent page via `iframe.contentWindow.args = ...`, which the browser blocks under same-origin policy unless the iframe sandbox carries `allow-same-origin`. That is gated by the per-user **Settings → Interface → "iframe Sandbox Allow Same Origin"** toggle, which is **off by default**. If `window.args` comes back undefined and you have not changed this setting, that is the cause: turn it on and reload. See [allowSameOrigin](#allowsameorigin) above for the security trade-off. +::: + +:::note Where `window.args` is set, and where it is not +- ✅ **Tool method returning `HTMLResponse` or `(HTMLResponse, context)` tuple** — rendered inline at the "View Result from..." tool call indicator. `window.args` is injected (subject to the `allowSameOrigin` requirement above). +- ❌ **`__event_emitter__({"type": "embeds", "data": {"embeds": [...]}})`** — rendered through the chat-controls Embeds panel, which does not wire `args` at all. `window.args` will always be undefined here, regardless of sandbox settings. This is by design: the embeds-event path has no tool call attached, so there are no args to inject. +- ❌ **Action embeds** — triggered by the user, not the model, so there are no model-supplied args to inject. + +If you need to pass dynamic data into an embed rendered via either of the ❌ paths, use the [Payload Requests](#payload-requests) pattern above instead. ::: ### Auto-Injected Libraries From 1fa8babf71a72edc03c8530c16fb161af82afbc6 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Wed, 27 May 2026 23:19:42 +0200 Subject: [PATCH 11/24] external tool events: no credential forwarding, use admin key --- .../extensibility/plugin/development/events.mdx | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/docs/features/extensibility/plugin/development/events.mdx b/docs/features/extensibility/plugin/development/events.mdx index ba8abf730e..6090552d21 100644 --- a/docs/features/extensibility/plugin/development/events.mdx +++ b/docs/features/extensibility/plugin/development/events.mdx @@ -795,6 +795,17 @@ When Open WebUI calls your external tool (with header forwarding enabled), it in **Authentication:** Requires a valid Open WebUI API key or session token. +:::warning Open WebUI does **not** forward user credentials to external tools +The `X-OpenWebUI-User-*` and `X-Open-WebUI-Chat-Id` / `X-Open-WebUI-Message-Id` headers forwarded to your tool are **identification only** — they carry no API key or session token. The same applies to MCP custom-header template tokens (`{{USER_ID}}`, `{{USER_NAME}}`, `{{USER_EMAIL}}`, `{{USER_ROLE}}`, `{{CHAT_ID}}`, `{{MESSAGE_ID}}`): there is no `{{API_KEY}}` or `{{TOKEN}}` placeholder, and the user's own API key / session is never sent to the tool server. + +So an external tool **must hold its own statically-configured Open WebUI API key** to call this endpoint. The endpoint's authorization check requires the caller to be the chat's owner **or an admin**, which gives you two practical options: + +- **Per-user key (uncommon)** — the tool server holds the specific user's API key. Only works for a single-user setup; impractical for a shared MCP server. +- **Admin / service-account key (recommended)** — provision a dedicated admin (or service-account) user in Open WebUI, generate an API key for it, and use that key from the tool server. An admin key works for any user's chat, so a single key serves all callers; the forwarded `X-Open-WebUI-Chat-Id` + `X-Open-WebUI-Message-Id` headers tell your tool *which* chat/message to post to. + +Store the key as a secret on the tool server (env var, secrets manager, etc.); do not expect Open WebUI to push it for you. +::: + **Request Body:** ```json From 1128ccb91d1673b4b31d09682f40175df6e05422 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Wed, 27 May 2026 23:51:51 +0200 Subject: [PATCH 12/24] swap default/recommended columns --- docs/getting-started/essentials.mdx | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/getting-started/essentials.mdx b/docs/getting-started/essentials.mdx index 677ae05b8a..f7bcf4b0e0 100644 --- a/docs/getting-started/essentials.mdx +++ b/docs/getting-started/essentials.mdx @@ -219,14 +219,14 @@ If you just want RAG to work well out of the box, these settings are a solid gen Set these in **Admin Panel > Settings > Documents**: -| Setting | Recommended value | Default | Why | -|---------|-------------------|---------|-----| -| **Text Splitter** | `token` | `character` | Token-based splitting produces more consistent chunk sizes across document types | -| **Markdown Header Splitting** | **On** | On | Respects document structure by splitting at headings, keeping sections coherent | -| **Chunk Size** | `2000` | `1000` | Larger chunks preserve more surrounding context per retrieval hit | -| **Chunk Overlap** | `200` | `100` | More overlap means less chance of cutting a key sentence in half | -| **Top K** | `15` | `3` | Retrieves more candidate chunks, giving the model a wider pool of relevant context. If you are working with local models that have constrained context sizes, lower this to `5` to avoid filling the context window with retrieved chunks | -| **Embedding Model** | External (OpenAI or Ollama) | `all-MiniLM-L6-v2` (local CPU) | The default works for a single user but consumes ~500 MB RAM per worker. For any multi-user setup, use an external embedding API instead | +| Setting | Default | Recommended value | Why | +|---------|---------|-------------------|-----| +| **Text Splitter** | `character` | `token` | Token-based splitting produces more consistent chunk sizes across document types | +| **Markdown Header Splitting** | On | **On** | Respects document structure by splitting at headings, keeping sections coherent | +| **Chunk Size** | `1000` | `2000` | Larger chunks preserve more surrounding context per retrieval hit | +| **Chunk Overlap** | `100` | `200` | More overlap means less chance of cutting a key sentence in half | +| **Top K** | `3` | `15` | Retrieves more candidate chunks, giving the model a wider pool of relevant context. If you are working with local models that have constrained context sizes, lower this to `5` to avoid filling the context window with retrieved chunks | +| **Embedding Model** | `all-MiniLM-L6-v2` (local CPU) | External (OpenAI or Ollama) | The default works for a single user but consumes ~500 MB RAM per worker. For any multi-user setup, use an external embedding API instead | :::tip Embedding model The default SentenceTransformers model runs locally on CPU and is fine for a single user getting started. For anything beyond that, point at an external embeddings API: set `RAG_EMBEDDING_ENGINE=openai` with an OpenAI API key, or `RAG_EMBEDDING_ENGINE=ollama` with any Ollama embedding model (e.g., `nomic-embed-text`). This offloads the work and frees significant RAM. From 06e59779aff78c980f55d81af4a5d563d9fa811b Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Thu, 28 May 2026 12:02:15 +0200 Subject: [PATCH 13/24] Update banners.md --- docs/features/administration/banners.md | 95 ++++++++++++++++++++++++- 1 file changed, 94 insertions(+), 1 deletion(-) diff --git a/docs/features/administration/banners.md b/docs/features/administration/banners.md index 058d39fe73..caad5e3818 100644 --- a/docs/features/administration/banners.md +++ b/docs/features/administration/banners.md @@ -154,7 +154,84 @@ Inline styles are supported on allowed tags: Gradient background ``` -> Keep styling minimal. Overly large padding, font sizes, or complex layouts can cause banners to become tall or visually inconsistent across themes. +You can also style a full message area by wrapping the content in a block element: + +```html +
+ Notice title
+ Short supporting message. +
+``` + +> Keep styling purposeful. Large padding, large font sizes, or deeply nested layouts can make banners too tall and visually inconsistent across themes. + +--- + +## Designing effective banners + +Banners work best when they are easy to scan, visually distinct, and short enough not to interrupt normal work. + +### Structure the message + +Use a predictable structure: + +- Start with the event type or status: `Maintenance`, `Incident`, `Policy update`, `New feature`. +- Put the most important detail first: date, time, impact, or required action. +- Keep the body to one or two short sentences. +- Add one link only if users need more details. + +For longer notices, use short sections instead of one long paragraph. For multilingual notices, separate languages with a subtle `
` or use a collapsible `
` section. + +### Make severity visible + +Use the banner `type` consistently: + +- `info`: neutral announcements and product updates. +- `success`: resolved incidents or completed changes. +- `warning`: planned maintenance, degraded service, or upcoming action needed. +- `error`: active incidents or urgent action required. + +Avoid using `error` for non-urgent announcements. Users learn to ignore alerts when every message looks critical. + +### Use color carefully + +Color should support the banner type, not compete with it: + +- Use soft backgrounds for the full message area. +- Use stronger colors for small accents, labels, or left borders. +- Keep text contrast high enough to read in bright rooms and on dim screens. +- Avoid mixing many unrelated colors in one banner. + +A useful pattern is a pale background plus a stronger left border: + +```html +
+ Notice title
+ Short supporting message. +
+``` + +### Keep layouts responsive + +Banners are shown inside the application layout and must still work on narrow screens. + +- Prefer `display:flex;flex-wrap:wrap` for rows containing labels, dates, or badges. +- Avoid fixed widths. +- Use `width:100%;box-sizing:border-box` for full-width styled blocks. +- Keep icons and badges small so they do not increase banner height. +- Test the banner with a narrow browser window before using it broadly. + +### Avoid accidental extra height + +Banner content treats literal newlines as line breaks. If you use explicit `
` tags, keep the raw HTML compact and avoid adding extra blank lines or indentation in the banner content field. + +This compact style: + +```html +Notice
One short sentence.
Another short sentence. +``` + +renders more predictably than heavily formatted HTML with many line breaks. --- @@ -255,6 +332,22 @@ Service updates:
Status ``` +### Pattern: Styled notice block + +Use a full-width styled block when the whole message should read as one announcement area. Keep this HTML compact when pasting it into the banner content field, especially if it also contains `
` tags. + +```html +
NOTICENotice titleKey detail
Short supporting message.
+``` + +This pattern uses: + +- A pale background for the full message area. +- A stronger left border for fast visual recognition. +- A small uppercase label for the event type. +- A compact date/time chip for the most important metadata. +- `flex-wrap` so the header row still works on narrow screens. + ### Pattern: Collapsible details (keep banners short) ```html From 20f82c699b35cf41ad5bbc61dbe7073e4fa21c70 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Thu, 28 May 2026 19:56:07 +0200 Subject: [PATCH 14/24] explain where payload-request data comes from --- .../extensibility/plugin/development/rich-ui.mdx | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/docs/features/extensibility/plugin/development/rich-ui.mdx b/docs/features/extensibility/plugin/development/rich-ui.mdx index a1a1c030f7..db5a7556f0 100644 --- a/docs/features/extensibility/plugin/development/rich-ui.mdx +++ b/docs/features/extensibility/plugin/development/rich-ui.mdx @@ -271,11 +271,11 @@ The iframe and parent window can communicate beyond just height reporting. The f ### Payload Requests -The iframe can request a data payload from the parent. This is useful for passing dynamic data into the embed after it loads: +The iframe can ask the parent for a data payload after it loads: ```html ``` -The parent responds with `{ type: 'payload', requestId: ..., payload: ... }` containing the configured payload data. +The parent responds with `{ type: 'payload', requestId: ..., payload: ... }`. + +:::info Where the payload comes from +There is no separate "set the payload" call. The payload is whatever the parent component had configured when it instantiated the iframe — and today only one path actually configures one: + +- ✅ **Citation-opened embeds in the chat-controls Embeds panel** — when the user clicks a citation badge whose source has an embed URL, the side panel opens and exposes **the full citation/source object** (the same dict you sent in your `source` / `citation` event via `__event_emitter__`) as the payload. To set it, emit a [`source` event](./events#source-or-citation-and-code-execution) whose `data` includes whatever you want the iframe to be able to fetch. The iframe then asks for it via the postMessage above and receives the citation object back. +- ❌ **Inline tool-call embeds** (from a tool method returning `HTMLResponse` or `(HTMLResponse, context)`) — the parent does not configure a payload on this path, so a payload request returns `{ type: 'payload', requestId: ..., payload: null }`. Use [Tool Args Injection](#tool-args-injection-tools-only) (subject to `allowSameOrigin`) to pass data into a tool-call embed instead. +- ❌ **`__event_emitter__({"type": "embeds", ...})` and Action embeds** — also configured without a payload; the response is `null`. + +In short: payload-request is the side-panel-citation channel, not a generic iframe-data channel. Pick the right rendering path for the data flow you need. +::: ### Tool Args Injection (Tools Only) From 93f13520319457739da4ff51626ecfef39e39374 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Thu, 28 May 2026 20:09:34 +0200 Subject: [PATCH 15/24] add which-extension-do-i-need decision guide + pipe/pipeline disambiguation --- docs/features/extensibility/index.md | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/docs/features/extensibility/index.md b/docs/features/extensibility/index.md index e103daf5b6..2b8b142f38 100644 --- a/docs/features/extensibility/index.md +++ b/docs/features/extensibility/index.md @@ -17,6 +17,28 @@ There are three layers, and most teams end up using at least two: --- +## Which Extension Do I Need? + +The names don't always map obviously to what they do. Start from what you're trying to accomplish: + +| I want to... | Use | Why this one | +|---|---|---| +| Let the model **call an API or perform an action** (and keep a secret/API key the user and model can never read) | **[Tool](plugin/tools)** | The key lives inside the tool, server-side. The model only sees the *result*, never the credential. | +| **Add a new model or provider** to the model selector | **[Pipe Function](plugin/functions/pipe)** | A Pipe appears as a selectable "model" and handles the request however you like. | +| **Modify messages** going in or out (redact PII, inject system text, log, translate) | **[Filter Function](plugin/functions/filter)** | Filters run on every message via `inlet`/`outlet`/`stream` without touching model config. | +| Add a **button on a message** that runs custom code | **[Action Function](plugin/functions/action)** | Actions are user-triggered, per-message operations. | +| Teach the model **how to approach a task** (methodology, steps, house style) | **[Skill](/features/workspace/skills)** | Skills are instructions, not code. The model reads them; they don't execute anything. | +| Give the model **documents to retrieve from** | **[Knowledge](/features/workspace/knowledge)** | RAG over your files, attached to a model or referenced with `#`. | +| Save a **reusable prompt** behind a slash command | **[Prompt](/features/workspace/prompts)** | Templated text with typed variables; expands when you type `/name`. | +| Connect an **existing external service** that already speaks HTTP | **[OpenAPI / MCP server](mcp)** | Point Open WebUI at the spec; endpoints become callable tools. No glue code. | +| Run something **heavy, GPU-bound, or sandboxed** off the main instance | **[Pipeline](pipelines)** | A separate worker container; keeps the main app lean. | + +:::tip "Pipe" vs "Pipeline" — not the same thing +This is the single most common naming mix-up. A **Pipe** is a type of **Function** (in-process Python, adds a provider to the model list). A **Pipeline** is a **separate external worker container**. They share a prefix and nothing else. If you want to add a model provider, you almost always want a **Pipe Function**, not a Pipeline. +::: + +--- + ## Why Extensibility? ### Give models real-world abilities From 251e20069d9723c415b6394daa3e4f1f1d2dd02e Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Thu, 28 May 2026 20:11:54 +0200 Subject: [PATCH 16/24] tools taxonomy: external tool servers, workspace tools power, open terminal info block --- .../extensibility/plugin/tools/index.mdx | 21 ++++++++++++------- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/docs/features/extensibility/plugin/tools/index.mdx b/docs/features/extensibility/plugin/tools/index.mdx index 05d5426ca7..6130b2c6cf 100644 --- a/docs/features/extensibility/plugin/tools/index.mdx +++ b/docs/features/extensibility/plugin/tools/index.mdx @@ -21,12 +21,13 @@ Because there are several ways to integrate "Tools" in Open WebUI, it's importan | Type | Location in UI | Best For... | Source | | :--- | :--- | :--- | :--- | -| **Native Features** | Admin/Settings | Core platform functionality | Built-in to Open WebUI | -| **Workspace Tools** | `Workspace > Tools` | User-created or community Python scripts | [Community Library](https://openwebui.com/search) | -| **Native MCP (HTTP)** | `Settings > Connections` | Standard MCP servers reachable via HTTP/SSE | External MCP Servers | -| **MCP via Proxy (MCPO)** | `Settings > Connections` | Local stdio-based MCP servers (e.g., Claude Desktop tools) | [MCPO Adapter](https://github.com/open-webui/mcpo) | -| **OpenAPI Servers** | `Settings > Connections` | Standard REST/OpenAPI web services | External Web APIs | -| **Open Terminal** | `Settings > Integrations` | Full shell access in an isolated Docker container (always-on) | [Open Terminal](https://github.com/open-webui/open-terminal) | +| **Native Features** | Admin/Settings | Core platform functionality (these are the [built-in system tools](#built-in-system-tools-nativeagentic-mode)) | Built-in to Open WebUI | +| **Workspace Tools** | `Workspace > Tools` | User-created or community Python scripts — **the most powerful, least restricted option** | [Community Library](https://openwebui.com/search) | +| **Native MCP (HTTP)** | `Settings > Connections` | Standard MCP servers reachable via HTTP/SSE | External tool server | +| **MCP via Proxy (MCPO)** | `Settings > Connections` | Local stdio-based MCP servers (e.g., Claude Desktop tools) | External tool server (via [MCPO Adapter](https://github.com/open-webui/mcpo)) | +| **OpenAPI Servers** | `Settings > Connections` | Standard REST/OpenAPI web services | External tool server | + +The last three (**MCP HTTP**, **MCPO**, **OpenAPI**) are all **external tool servers**: the tool code runs on a separate process or machine and Open WebUI calls it over HTTP. **Native Features** are the built-in system tools that ship with Open WebUI. **Workspace Tools** are Python that runs in-process — for the most demanding use cases they are by far the most capable option with the fewest limitations (see below). ### 1. Native Features (Built-in) These are deeply integrated into Open WebUI and generally don't require external scripts. @@ -39,8 +40,8 @@ These are deeply integrated into Open WebUI and generally don't require external In [**Native Mode**](#built-in-system-tools-nativeagentic-mode), these features are exposed as **Tools** that the model can call independently. ### 2. Workspace Tools (Custom Plugins) -These are **Python scripts** that run directly within the Open WebUI environment. -- **Capability**: Can do anything Python can do (web scraping, complex math, API calls). +These are **Python scripts** that run directly within the Open WebUI environment. **For the most demanding use cases, Workspace Tools are by far the most powerful option with the fewest limitations** — they run in-process with full access to Python, the `open_webui` codebase, and the request context, so there is very little they *can't* do (see [Under the Hood](../development/under-the-hood) for the full extent). The external tool servers above are more constrained: they only see what you pass over HTTP and can't reach into Open WebUI itself. +- **Capability**: Can do anything Python can do (web scraping, complex math, API calls), and hold secrets (API keys) entirely server-side so neither the user nor the model can read them. - **Access**: Managed via the `Workspace` menu. - **Safety**: Always review code before importing, as these run on your server. - **⚠️ Security Warning**: Normal or untrusted users should **not** be given permission to access the Workspace Tools section. This access allows a user to upload and execute arbitrary Python code on your server, which could lead to a full system compromise. @@ -54,6 +55,10 @@ These are **Python scripts** that run directly within the Open WebUI environment ### 4. OpenAPI / Function Calling Servers Generic web servers that provide an OpenAPI (`.json` or `.yaml`) specification. Open WebUI can ingest these specs and treat every endpoint as a tool. +:::info Open Terminal — a separate code-execution integration +Beyond the tool types above, Open WebUI also integrates with **[Open Terminal](/features/open-terminal)**: an always-on, isolated Docker container that gives a model a real shell and filesystem. Once connected, it exposes its own set of **built-in tools** (`run_command`, `read_file`, `write_file`, `grep_search`, `glob_search`, process management, and more) that the model can call directly — effectively a sandboxed code-execution and file-handling environment, distinct from the per-message [Code Interpreter](#built-in-system-tools-nativeagentic-mode) tool. See the [Open Terminal documentation](/features/open-terminal) for setup, multi-user, and security considerations. +::: + --- ## How to Install & Manage Workspace Tools From 8921d63efd3898c521db1f61fb77527ddda14fb4 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Thu, 28 May 2026 20:20:13 +0200 Subject: [PATCH 17/24] mark Pipelines legacy across all pipelines pages; drop from decision table --- docs/features/extensibility/index.md | 1 - docs/features/extensibility/pipelines/filters.md | 4 ++++ docs/features/extensibility/pipelines/index.mdx | 10 ++++++---- docs/features/extensibility/pipelines/pipes.md | 4 ++++ docs/features/extensibility/pipelines/tutorials.md | 4 ++++ docs/features/extensibility/pipelines/valves.md | 4 ++++ 6 files changed, 22 insertions(+), 5 deletions(-) diff --git a/docs/features/extensibility/index.md b/docs/features/extensibility/index.md index 2b8b142f38..b331e60cd3 100644 --- a/docs/features/extensibility/index.md +++ b/docs/features/extensibility/index.md @@ -31,7 +31,6 @@ The names don't always map obviously to what they do. Start from what you're try | Give the model **documents to retrieve from** | **[Knowledge](/features/workspace/knowledge)** | RAG over your files, attached to a model or referenced with `#`. | | Save a **reusable prompt** behind a slash command | **[Prompt](/features/workspace/prompts)** | Templated text with typed variables; expands when you type `/name`. | | Connect an **existing external service** that already speaks HTTP | **[OpenAPI / MCP server](mcp)** | Point Open WebUI at the spec; endpoints become callable tools. No glue code. | -| Run something **heavy, GPU-bound, or sandboxed** off the main instance | **[Pipeline](pipelines)** | A separate worker container; keeps the main app lean. | :::tip "Pipe" vs "Pipeline" — not the same thing This is the single most common naming mix-up. A **Pipe** is a type of **Function** (in-process Python, adds a provider to the model list). A **Pipeline** is a **separate external worker container**. They share a prefix and nothing else. If you want to add a model provider, you almost always want a **Pipe Function**, not a Pipeline. diff --git a/docs/features/extensibility/pipelines/filters.md b/docs/features/extensibility/pipelines/filters.md index 24c197c02b..07b9bed90b 100644 --- a/docs/features/extensibility/pipelines/filters.md +++ b/docs/features/extensibility/pipelines/filters.md @@ -5,6 +5,10 @@ title: "Filters" ## Filters +:::danger Pipelines are legacy — do not use for new deployments +**Pipelines are legacy and are no longer recommended.** For message pre/post-processing use an in-process [Filter Function](/features/extensibility/plugin/functions/filter) instead — it is built in, easier to configure, and needs no separate worker container. This page is kept for reference and existing deployments only. +::: + Filters are used to perform actions against incoming user messages and outgoing assistant (LLM) messages. Potential actions that can be taken in a filter include sending messages to monitoring platforms (such as Langfuse or DataDog), modifying message contents, blocking toxic messages, translating messages to another language, or rate limiting messages from certain users. A list of examples is maintained in the [Pipelines repo](https://github.com/open-webui/pipelines/tree/main/examples/filters). Filters can be executed as a Function or on a Pipelines server. The general workflow can be seen in the image below.
diff --git a/docs/features/extensibility/pipelines/index.mdx b/docs/features/extensibility/pipelines/index.mdx index 5676e14f78..a8bf05ddbb 100644 --- a/docs/features/extensibility/pipelines/index.mdx +++ b/docs/features/extensibility/pipelines/index.mdx @@ -12,12 +12,14 @@ title: "Pipelines" # Pipelines: UI-Agnostic OpenAI API Plugin Framework -:::warning +:::danger Pipelines are legacy — do not use for new deployments +**Pipelines are legacy and are no longer recommended.** They predate the in-process [Functions](/features/extensibility/plugin/functions/) (Pipes, Filters, Actions) and [Tools](/features/extensibility/plugin/tools/) system, which now covers the same use cases without running a separate worker container. -**DO NOT USE PIPELINES IF!** - -If your goal is simply to add support for additional providers like Anthropic or basic filters, you likely don't need Pipelines . For those cases, Open WebUI Functions are a better fit—it's built-in, much more convenient, and easier to configure. Pipelines, however, comes into play when you're dealing with computationally heavy tasks (e.g., running large models or complex logic) that you want to offload from your main Open WebUI instance for better performance and scalability. +- Custom provider / RAG / request routing (a Pipeline **pipe**) → use a [Pipe Function](/features/extensibility/plugin/functions/pipe). +- Message pre/post-processing (a Pipeline **filter**) → use a [Filter Function](/features/extensibility/plugin/functions/filter). +- Connecting an external HTTP service → use an [OpenAPI or MCP tool server](/features/extensibility/mcp). +These pages are kept for reference and for existing deployments only. New work should target Functions, Tools, or external tool servers instead. ::: Welcome to **Pipelines**, an [Open WebUI](https://github.com/open-webui) initiative. Pipelines bring modular, customizable workflows to any UI client supporting OpenAI API specs – and much more! Easily extend functionalities, integrate unique logic, and create dynamic workflows with just a few lines of code. diff --git a/docs/features/extensibility/pipelines/pipes.md b/docs/features/extensibility/pipelines/pipes.md index ab0bdcb2ce..e84f70f5cd 100644 --- a/docs/features/extensibility/pipelines/pipes.md +++ b/docs/features/extensibility/pipelines/pipes.md @@ -5,6 +5,10 @@ title: "Pipes" ## Pipes +:::danger Pipelines are legacy — do not use for new deployments +**Pipelines are legacy and are no longer recommended.** For custom providers, RAG, or request routing use an in-process [Pipe Function](/features/extensibility/plugin/functions/pipe) instead — it is built in, easier to configure, and needs no separate worker container. This page is kept for reference and existing deployments only. +::: + Pipes are standalone functions that process inputs and generate responses, possibly by invoking one or more LLMs or external services before returning results to the user. Examples of potential actions you can take with Pipes are Retrieval Augmented Generation (RAG), sending requests to non-OpenAI LLM providers (such as Anthropic, Azure OpenAI, or Google), or executing functions right in your web UI. Pipes can be hosted as a Function or on a Pipelines server. A list of examples is maintained in the [Pipelines repo](https://github.com/open-webui/pipelines/tree/main/examples/pipelines). The general workflow can be seen in the image below.
diff --git a/docs/features/extensibility/pipelines/tutorials.md b/docs/features/extensibility/pipelines/tutorials.md index 9d7302b78d..4f0e2155ea 100644 --- a/docs/features/extensibility/pipelines/tutorials.md +++ b/docs/features/extensibility/pipelines/tutorials.md @@ -5,6 +5,10 @@ title: "Tutorials" ## Pipeline Tutorials +:::danger Pipelines are legacy — do not use for new deployments +**Pipelines are legacy and are no longer recommended.** New work should target in-process [Functions](/features/extensibility/plugin/functions/) (Pipes, Filters, Actions), [Tools](/features/extensibility/plugin/tools/), or external [OpenAPI / MCP tool servers](/features/extensibility/mcp). These tutorials are kept for reference and existing deployments only. +::: + ## Tutorials Welcome Are you a content creator with a blog post or YouTube video about your pipeline setup? Get in touch diff --git a/docs/features/extensibility/pipelines/valves.md b/docs/features/extensibility/pipelines/valves.md index 6e333d3dba..ee785e6b19 100644 --- a/docs/features/extensibility/pipelines/valves.md +++ b/docs/features/extensibility/pipelines/valves.md @@ -5,6 +5,10 @@ title: "Valves" ## Valves +:::danger Pipelines are legacy — do not use for new deployments +**Pipelines are legacy and are no longer recommended.** Use in-process [Functions](/features/extensibility/plugin/functions/) (Pipes, Filters, Actions) or [Tools](/features/extensibility/plugin/tools/) instead — they support [Valves](/features/extensibility/plugin/development/valves) too. This page is kept for reference and existing deployments only. +::: + `Valves` (see the dedicated [Valves & UserValves](/features/extensibility/plugin/development/valves) page) can also be set for `Pipeline`. In short, `Valves` are input variables that are set per pipeline. `Valves` are set as a subclass of the `Pipeline` class, and initialized as part of the `__init__` method of the `Pipeline` class. From 45db3e3000e9ca29508a9b147fcacab3352a1041 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Thu, 28 May 2026 20:22:55 +0200 Subject: [PATCH 18/24] pipelines legacy warning: reference both pipe and filter replacements --- docs/features/extensibility/pipelines/filters.md | 7 ++++++- docs/features/extensibility/pipelines/pipes.md | 7 ++++++- docs/features/extensibility/pipelines/tutorials.md | 7 ++++++- docs/features/extensibility/pipelines/valves.md | 7 ++++++- 4 files changed, 24 insertions(+), 4 deletions(-) diff --git a/docs/features/extensibility/pipelines/filters.md b/docs/features/extensibility/pipelines/filters.md index 07b9bed90b..ce636a7d04 100644 --- a/docs/features/extensibility/pipelines/filters.md +++ b/docs/features/extensibility/pipelines/filters.md @@ -6,7 +6,12 @@ title: "Filters" ## Filters :::danger Pipelines are legacy — do not use for new deployments -**Pipelines are legacy and are no longer recommended.** For message pre/post-processing use an in-process [Filter Function](/features/extensibility/plugin/functions/filter) instead — it is built in, easier to configure, and needs no separate worker container. This page is kept for reference and existing deployments only. +**Pipelines are outdated and legacy, and are no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that are built in, easier to configure, and need no separate worker container: + +- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe) +- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter) + +This page is kept for reference and existing deployments only. ::: Filters are used to perform actions against incoming user messages and outgoing assistant (LLM) messages. Potential actions that can be taken in a filter include sending messages to monitoring platforms (such as Langfuse or DataDog), modifying message contents, blocking toxic messages, translating messages to another language, or rate limiting messages from certain users. A list of examples is maintained in the [Pipelines repo](https://github.com/open-webui/pipelines/tree/main/examples/filters). Filters can be executed as a Function or on a Pipelines server. The general workflow can be seen in the image below. diff --git a/docs/features/extensibility/pipelines/pipes.md b/docs/features/extensibility/pipelines/pipes.md index e84f70f5cd..8c65aeacc3 100644 --- a/docs/features/extensibility/pipelines/pipes.md +++ b/docs/features/extensibility/pipelines/pipes.md @@ -6,7 +6,12 @@ title: "Pipes" ## Pipes :::danger Pipelines are legacy — do not use for new deployments -**Pipelines are legacy and are no longer recommended.** For custom providers, RAG, or request routing use an in-process [Pipe Function](/features/extensibility/plugin/functions/pipe) instead — it is built in, easier to configure, and needs no separate worker container. This page is kept for reference and existing deployments only. +**Pipelines are outdated and legacy, and are no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that are built in, easier to configure, and need no separate worker container: + +- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe) +- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter) + +This page is kept for reference and existing deployments only. ::: Pipes are standalone functions that process inputs and generate responses, possibly by invoking one or more LLMs or external services before returning results to the user. Examples of potential actions you can take with Pipes are Retrieval Augmented Generation (RAG), sending requests to non-OpenAI LLM providers (such as Anthropic, Azure OpenAI, or Google), or executing functions right in your web UI. Pipes can be hosted as a Function or on a Pipelines server. A list of examples is maintained in the [Pipelines repo](https://github.com/open-webui/pipelines/tree/main/examples/pipelines). The general workflow can be seen in the image below. diff --git a/docs/features/extensibility/pipelines/tutorials.md b/docs/features/extensibility/pipelines/tutorials.md index 4f0e2155ea..69d0a669a3 100644 --- a/docs/features/extensibility/pipelines/tutorials.md +++ b/docs/features/extensibility/pipelines/tutorials.md @@ -6,7 +6,12 @@ title: "Tutorials" ## Pipeline Tutorials :::danger Pipelines are legacy — do not use for new deployments -**Pipelines are legacy and are no longer recommended.** New work should target in-process [Functions](/features/extensibility/plugin/functions/) (Pipes, Filters, Actions), [Tools](/features/extensibility/plugin/tools/), or external [OpenAPI / MCP tool servers](/features/extensibility/mcp). These tutorials are kept for reference and existing deployments only. +**Pipelines are outdated and legacy, and are no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that are built in, easier to configure, and need no separate worker container: + +- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe) +- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter) + +These tutorials are kept for reference and existing deployments only. ::: ## Tutorials Welcome diff --git a/docs/features/extensibility/pipelines/valves.md b/docs/features/extensibility/pipelines/valves.md index ee785e6b19..e86d1cede0 100644 --- a/docs/features/extensibility/pipelines/valves.md +++ b/docs/features/extensibility/pipelines/valves.md @@ -6,7 +6,12 @@ title: "Valves" ## Valves :::danger Pipelines are legacy — do not use for new deployments -**Pipelines are legacy and are no longer recommended.** Use in-process [Functions](/features/extensibility/plugin/functions/) (Pipes, Filters, Actions) or [Tools](/features/extensibility/plugin/tools/) instead — they support [Valves](/features/extensibility/plugin/development/valves) too. This page is kept for reference and existing deployments only. +**Pipelines are outdated and legacy, and are no longer recommended.** A Pipeline can run as a **pipe** or as a **filter**; both forms now have in-process replacements that support [Valves](/features/extensibility/plugin/development/valves) too, are built in, and need no separate worker container: + +- Pipeline **pipe** (custom provider, RAG, request routing) → [Pipe Function](/features/extensibility/plugin/functions/pipe) +- Pipeline **filter** (message pre/post-processing) → [Filter Function](/features/extensibility/plugin/functions/filter) + +This page is kept for reference and existing deployments only. ::: `Valves` (see the dedicated [Valves & UserValves](/features/extensibility/plugin/development/valves) page) can also be set for `Pipeline`. In short, `Valves` are input variables that are set per pipeline. From c2b60951d91b3837b6a271de21120c2d73413e39 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Thu, 28 May 2026 20:29:53 +0200 Subject: [PATCH 19/24] extensibility index: async backend makes pipelines-for-heavy-work obsolete; redirect to in-process or external tool servers --- docs/features/extensibility/index.md | 31 +++++++++++++++------------- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/docs/features/extensibility/index.md b/docs/features/extensibility/index.md index b331e60cd3..fa36d46b61 100644 --- a/docs/features/extensibility/index.md +++ b/docs/features/extensibility/index.md @@ -9,11 +9,14 @@ title: "Extensibility" Open WebUI ships with powerful defaults, but your workflows aren't default. Extensibility is how you close the gap: give models real-time data, enforce compliance rules, add new AI providers, or connect to any external service. Write a few lines of Python, point at an OpenAPI endpoint, or browse the community library. The platform adapts to you, not the other way around. -There are three layers, and most teams end up using at least two: +There are two layers, and most teams end up using both: - **In-process Python** (Tools & Functions) runs inside Open WebUI itself with zero infrastructure and instant iteration. - **External HTTP** (OpenAPI & MCP servers) connects to services running anywhere, from a sidecar container to a third-party SaaS. -- **Pipeline workers** (Pipelines) offload heavy or sensitive processing to a separate container, keeping your main instance fast and clean. + +:::warning Pipelines are legacy +You may still see **Pipelines** referenced as a third layer. It is **legacy and no longer recommended** — the heavy-processing problem it solved no longer exists (see [Run heavy or long-running work](#run-heavy-or-long-running-work) below). Use Functions, Tools, or an external tool server instead. See the [Pipelines pages](pipelines) for the full deprecation notice. +::: --- @@ -52,9 +55,11 @@ Have an internal API? A third-party SaaS with an OpenAPI spec? An MCP server alr Functions let you intercept and transform messages before they reach the model (input filters) or before they reach the user (output filters). Help redact PII, enforce formatting rules, log to an observability platform, inject system instructions dynamically, all without touching model configuration. -### Offload heavy processing +### Run heavy or long-running work + +Open WebUI's backend is **fully async**. Long-running Tools and Functions (awaiting an external API, a slow query, a multi-step agent) do not block other users, and synchronous/CPU-bound plugin code is offloaded to a worker thread pool (see [`THREAD_POOL_SIZE`](/reference/env-configuration#thread_pool_size)) — so it doesn't stall the event loop either. In practice you can run heavy work **in-process** without the latency problems that older synchronous releases had. -When a plugin needs GPU access, large dependencies, or isolated execution, run it as a Pipeline on a separate machine. Open WebUI talks to it over a standard API. Your main instance stays lean. +The historical reason to push heavy pipes/filters onto a separate **Pipelines** worker — keeping the single synchronous event loop unblocked — no longer applies. If you genuinely need **GPU access, large or conflicting dependencies, hard isolation, or independent scaling**, run that work as an **external service behind an [OpenAPI or MCP tool server](mcp)**, not a Pipeline. ### Import from the community @@ -70,7 +75,6 @@ Browse hundreds of community-built Tools and Functions from the Open WebUI Commu | ⚙️ **Functions** | Platform extensions that add model providers (Pipes), message processing (Filters), or UI actions (Actions) | | 🔗 **MCP support** | Native Streamable HTTP for Model Context Protocol servers | | 🌐 **OpenAPI servers** | Auto-discover and expose tools from any OpenAPI-compatible endpoint | -| 🔧 **Pipelines** | Modular plugin framework running on a separate worker for heavy or sensitive processing | | 📝 **Skills** | Markdown instruction sets that teach models how to approach specific tasks | | ⚡ **Prompts** | Slash-command templates with typed input variables and versioning | | 🏪 **Community library** | One-click import of community-built Tools and Functions | @@ -83,11 +87,10 @@ Understanding which layer to use saves time: | Layer | Runs where | Best for | Trade-off | |-------|-----------|----------|-----------| -| **Tools & Functions** | Inside Open WebUI process | Real-time data, filters, UI actions, new providers | Shares resources with the main server | -| **OpenAPI / MCP** | Any HTTP endpoint | Connecting existing services, third-party APIs | Requires a running external server | -| **Pipelines** | Separate Docker container | GPU workloads, heavy dependencies, sandboxed execution | Additional infrastructure to manage | +| **Tools & Functions** | Inside Open WebUI process | Real-time data, filters, UI actions, new providers — including heavy/long-running work (the async backend keeps it from blocking) | Shares CPU/RAM with the main server | +| **OpenAPI / MCP** | Any HTTP endpoint | Connecting existing services, third-party APIs, and GPU / heavy-dependency / isolated workloads | Requires a running external server | -Most users start with **Tools & Functions**. They require no extra setup, have a built-in code editor, and cover the majority of use cases. +Most users start with **Tools & Functions**. They require no extra setup, have a built-in code editor, and cover the majority of use cases. (**Pipelines** is a legacy third option, no longer recommended — see the note above.) --- @@ -105,9 +108,9 @@ A healthcare organization deploys a Filter Function that scans outbound messages An engineering team uses Pipe Functions to add Anthropic, Google Vertex AI, and a self-hosted vLLM instance alongside their existing Ollama models. Users see all providers in a single model selector with no separate logins and no API key juggling. -### Heavy-compute pipelines +### GPU-bound external processing -A research group runs a Retrieval-Augmented Generation pipeline that re-ranks with a cross-encoder model requiring GPU. They deploy it as a Pipeline on a dedicated GPU node. Open WebUI routes relevant queries to the pipeline automatically while keeping the main instance on commodity hardware. +A research group needs to re-rank retrieval results with a cross-encoder model that requires a GPU. They run it as a small service on a dedicated GPU node and expose it to Open WebUI as an **[OpenAPI tool server](mcp)**. The model calls it like any other tool while the main instance stays on commodity hardware. (The async backend means lighter custom logic can simply run in-process as a Function — only the GPU dependency pushes this particular workload to a separate service.) --- @@ -115,11 +118,11 @@ A research group runs a Retrieval-Augmented Generation pipeline that re-ranks wi ### Security -Tools, Functions, and Pipelines execute **arbitrary Python code** on your server. Only install extensions from trusted sources, review code before importing, and restrict Workspace access to administrators. See the [Security Policy](/security) for details. +Tools and Functions execute **arbitrary Python code** on your server. Only install extensions from trusted sources, review code before importing, and restrict Workspace access to administrators. See the [Security Policy](/security) for details. ### Resource sharing -In-process Tools and Functions share CPU and memory with Open WebUI. Computationally expensive plugins should be moved to Pipelines or external services. +In-process Tools and Functions share CPU and memory with Open WebUI. The async backend keeps long-running and blocking work from stalling other requests, but it does not create more hardware — genuinely CPU- or GPU-heavy workloads still compete for the same machine. For those, run the work as an external service behind an [OpenAPI / MCP tool server](mcp) so it scales independently. ### MCP transport @@ -133,4 +136,4 @@ Native MCP support is **Streamable HTTP only**. For stdio or SSE-based MCP serve |-------|-------------------| | [**Tools & Functions**](plugin) | Writing Python Tools, Functions (Pipes, Filters, Actions), and the development API | | [**MCP**](mcp) | Connecting Model Context Protocol servers, OAuth setup, troubleshooting | -| [**Pipelines**](pipelines) | Deploying the pipeline worker, building custom pipelines, directory structure | +| [**Pipelines**](pipelines) *(legacy)* | Reference only — the deprecated separate-worker framework, superseded by Functions and Tools | From 2d7b43e965dd001fe419387fcd7699d7fe1864f9 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Thu, 28 May 2026 23:49:57 +0200 Subject: [PATCH 20/24] mcp: document MCP servers are admin-only by design (vs per-user OpenAPI) Per-user MCP servers are deliberately not supported (issue #24620 closed as intended): an MCP server is a stateful, capability-rich endpoint running inside the trust boundary with the user's full scope, unlike a stateless OpenAPI URL. Adds a dedicated section + FAQ entry explaining the admin-only design and why. Co-Authored-By: Claude Opus 4.7 --- docs/features/extensibility/mcp.mdx | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/docs/features/extensibility/mcp.mdx b/docs/features/extensibility/mcp.mdx index d4ad05b089..29a950bd0e 100644 --- a/docs/features/extensibility/mcp.mdx +++ b/docs/features/extensibility/mcp.mdx @@ -32,6 +32,16 @@ Entering MCP-style configuration (with `mcpServers` in JSON) into an OpenAPI con 2. Re-add it with the correct **Type** set to **MCP** ::: +## 🔒 MCP servers are admin-only {#mcp-servers-are-admin-only} + +MCP servers can only be added by **administrators**, under **Admin Settings → External Tools**. Regular users cannot register their own, by design. + +This is **not** the same restriction as OpenAPI. When you grant the **Direct Tool Servers** permission (per user or per group, off by default), users can add their own **OpenAPI** tool servers under **Settings → Tools**, but that path is OpenAPI-only: the connection type is locked, with no MCP option. + +The difference is capability. A user-supplied OpenAPI server is a stateless HTTP URL exposing a fixed set of declared endpoints. An MCP server is far more powerful: it is stateful and capability-rich (sampling, elicitation, persistent sessions and arbitrary host command execution over stdio transports), and it runs inside Open WebUI's trust boundary with the connecting user's full scope. In practice a malicious or compromised MCP server could execute code and read or exfiltrate data with that user's access, so the capability stays admin-gated. Open WebUI's own MCP support is Streamable HTTP only, but the protocol's privileged nature is why adding one is reserved for admins. + +To give users an MCP-backed capability without server-configuration rights, an admin adds the server once and scopes it with **Access Control** to the right users or groups. + ## 🧭 When to use MCP vs OpenAPI :::tip @@ -188,3 +198,7 @@ Supported and improving. The broader ecosystem is still evolving; expect occasio **Can I mix OpenAPI and MCP tools?** Yes. Many deployments do both. + +**Can users add their own MCP servers?** + +No. Adding MCP servers is admin-only (**Admin Settings → External Tools**). Users with the **Direct Tool Servers** permission can add their own **OpenAPI** tool servers, but not MCP. See [MCP servers are admin-only](#mcp-servers-are-admin-only) for the reasoning. From eb0e9e81826915da61260b2e6cada34b165e7b12 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Fri, 29 May 2026 11:27:19 +0200 Subject: [PATCH 21/24] Update scaling.md --- docs/getting-started/advanced-topics/scaling.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/getting-started/advanced-topics/scaling.md b/docs/getting-started/advanced-topics/scaling.md index 9fa4bf896b..3d778ded54 100644 --- a/docs/getting-started/advanced-topics/scaling.md +++ b/docs/getting-started/advanced-topics/scaling.md @@ -109,6 +109,7 @@ ENABLE_WEBSOCKET_SUPPORT=true - If you're using Redis Sentinel for high availability, also set `REDIS_SENTINEL_HOSTS` and consider setting `REDIS_SOCKET_CONNECT_TIMEOUT=5` to prevent hangs during failover. - For AWS Elasticache or other managed Redis Cluster services, set `REDIS_CLUSTER=true`. - Make sure your Redis server has `timeout 1800` and a high enough `maxclients` (10000+) to prevent connection exhaustion over time. +- For high-concurrency websocket streaming, review Redis Pub/Sub output buffer limits. Large Socket.IO events can disconnect Pub/Sub clients if Redis uses small default buffers; see [WebSocket Pub/Sub Buffer Limits](/tutorials/integrations/redis#websocket-pubsub-buffer-limits). - A **single Redis instance** is sufficient for the vast majority of deployments, even with thousands of users. You almost certainly do not need Redis Cluster unless you have specific HA/bandwidth requirements. If you think you need Redis Cluster, first check whether your connection count and memory usage are caused by fixable configuration issues (see [Common Anti-Patterns](/troubleshooting/performance#%EF%B8%8F-common-anti-patterns)). - Without Redis in a multi-instance setup, you will experience [WebSocket 403 errors](/troubleshooting/multi-replica#2-websocket-403-errors--connection-failures), [configuration sync issues](/troubleshooting/multi-replica#3-model-not-found-or-configuration-mismatch), and intermittent authentication failures. From b3574bd36309274e511393a2f12a380bdc6d8c22 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Fri, 29 May 2026 11:27:41 +0200 Subject: [PATCH 22/24] Update env-configuration.mdx --- docs/reference/env-configuration.mdx | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/docs/reference/env-configuration.mdx b/docs/reference/env-configuration.mdx index 43c9244408..0403f05c07 100644 --- a/docs/reference/env-configuration.mdx +++ b/docs/reference/env-configuration.mdx @@ -6913,6 +6913,22 @@ maxclients 10000 timeout 1800 ``` +For high-concurrency websocket deployments, also review Redis Pub/Sub output buffer limits. Open WebUI uses Socket.IO over Redis Pub/Sub when `WEBSOCKET_MANAGER=redis` is enabled, and streaming responses can produce large websocket events. If Redis disconnects Pub/Sub clients under large streaming payloads, you may see `Cannot publish to redis... giving up`, Redis timeout errors, or stalled live updates. Check: + +```bash +redis-cli INFO stats | grep client_output_buffer_limit_disconnections +redis-cli SLOWLOG GET 50 +redis-cli CONFIG GET client-output-buffer-limit +``` + +If `client_output_buffer_limit_disconnections` increases and the slow log shows large `PUBLISH socketio ...` entries, raise the Pub/Sub buffer limit in `redis.conf`. Example: + +```conf +client-output-buffer-limit normal 0 0 0 replica 268435456 67108864 60 pubsub 1073741824 268435456 180 +``` + +This leaves normal clients unchanged and allows Pub/Sub clients to buffer up to 1 GB hard / 256 MB soft for 180 seconds. Tune these values to your available Redis memory and expected websocket payload size. + **Symptoms of this misconfiguration:** - Works fine for days/weeks, then suddenly all logins fail with 500 errors - `redis.exceptions.ConnectionError: max number of clients reached` in logs From 551810d62cb3671a13578000d1cb96889a3c6dbb Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Fri, 29 May 2026 11:28:11 +0200 Subject: [PATCH 23/24] Update performance.md --- docs/troubleshooting/performance.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/docs/troubleshooting/performance.md b/docs/troubleshooting/performance.md index f2ef3f9704..675a9d4fa4 100644 --- a/docs/troubleshooting/performance.md +++ b/docs/troubleshooting/performance.md @@ -518,6 +518,23 @@ Common Redis configuration issues that cause unnecessary scaling: | **Stale connections** | Redis runs out of connections or memory grows indefinitely | Set `timeout 1800` in redis.conf (kills idle connections after 30 minutes) | | **Low maxclients** | `max number of clients reached` errors | Set `maxclients 10000` or higher | | **No connection limits** | Open WebUI pods may accumulate connections that never close | Combine `timeout` with connection pool limits in your Redis client config | +| **Low Pub/Sub output buffer limits** | WebSocket streams stall, `Cannot publish to redis... giving up`, or Redis logs client output buffer disconnections when large Socket.IO events are published | Increase the Redis `client-output-buffer-limit ... pubsub ...` setting, sized for your websocket payloads and available Redis memory | + +For Redis-backed websockets, Open WebUI uses Socket.IO over Redis Pub/Sub. Large streaming responses and tool events can create multi-MB `PUBLISH socketio ...` payloads. If Redis disconnects slow Pub/Sub clients, inspect: + +```bash +redis-cli INFO stats | grep client_output_buffer_limit_disconnections +redis-cli SLOWLOG GET 50 +redis-cli CONFIG GET client-output-buffer-limit +``` + +Example Redis configuration for deployments that need to tolerate large websocket bursts: + +```conf +client-output-buffer-limit normal 0 0 0 replica 268435456 67108864 60 pubsub 1073741824 268435456 180 +``` + +This keeps normal client limits disabled and raises Pub/Sub clients to a 1 GB hard limit and 256 MB soft limit for 180 seconds. Tune downward or upward based on Redis memory headroom and observed payload sizes. --- From 82201a4ba383e81c73500ecbaeac7b80f60cf6a3 Mon Sep 17 00:00:00 2001 From: Classic298 <27028174+Classic298@users.noreply.github.com> Date: Fri, 29 May 2026 11:28:33 +0200 Subject: [PATCH 24/24] Update redis.md --- docs/tutorials/integrations/redis.md | 33 ++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/docs/tutorials/integrations/redis.md b/docs/tutorials/integrations/redis.md index f80bd43aee..1578e97488 100644 --- a/docs/tutorials/integrations/redis.md +++ b/docs/tutorials/integrations/redis.md @@ -188,6 +188,39 @@ The above configuration sets up a Redis container named `redis-valkey` and mount ::: +### WebSocket Pub/Sub Buffer Limits + +Open WebUI uses Socket.IO over Redis Pub/Sub when `WEBSOCKET_MANAGER=redis` is enabled. Streaming responses and tool events can generate large websocket events because some updates include accumulated message state, not only the newest token delta. If Redis disconnects Pub/Sub clients while delivering these events, users can see stalled streams, missing live updates, or log messages such as: + +```text +Cannot publish to redis... retrying +Cannot publish to redis... giving up +redis.exceptions.TimeoutError: Timeout connecting to server +``` + +Check whether Redis is disconnecting Pub/Sub clients because of output buffer limits: + +```bash +redis-cli INFO stats | grep client_output_buffer_limit_disconnections +redis-cli SLOWLOG GET 50 +redis-cli CONFIG GET client-output-buffer-limit +``` + +If the slow log shows large `PUBLISH socketio ...` payloads and `client_output_buffer_limit_disconnections` increases, raise the Redis Pub/Sub output buffer limit. For example: + +```conf +# Keep normal clients unchanged; allow larger websocket Pub/Sub bursts. +client-output-buffer-limit normal 0 0 0 replica 268435456 67108864 60 pubsub 1073741824 268435456 180 +``` + +This sets the Pub/Sub hard limit to 1 GB and the soft limit to 256 MB for 180 seconds. Tune these values for your available Redis memory and expected websocket payload size. Higher limits make Redis more tolerant of temporary slow subscribers, but they also allow each slow Pub/Sub client to buffer more memory before Redis disconnects it. + +If you changed Redis configuration at runtime, verify the active value: + +```bash +redis-cli CONFIG GET client-output-buffer-limit +``` + To create a Docker network for communication between Open WebUI and Redis, run the following command: ```bash