[MISC] Decommission prompt-service, old tools, SDK1 prompt module by harini-venkataraman · Pull Request #1978 · Zipstack/unstract

harini-venkataraman · 2026-05-20T13:14:32Z

What

Phase 5 of the pluggable executor migration — decommission prompt-service, old tools (classifier, structure, text_extractor), and SDK1 prompt module from the OSS repo.

Why

These components have been fully replaced by the executor-based architecture (Phases 1–4). The prompt-service Flask app, old tool containers, and SDK1 prompt module are dead code that adds maintenance burden and CI cost.

How

prompt-service/: Entire Flask service removed (controllers, retrievers, indexing, plugins). All functionality now lives in `workers/executor/`.
tools/classifier, tools/structure, tools/text_extractor: Old tool containers removed. Structure tool routing preserved via `STRUCTURE_TOOL_IMAGE_*` env vars.
unstract/sdk1/prompt.py: Dead module removed (executor uses `answer_prompt.py` directly).
tox.ini: Removed `prompt-service` test environment (directory no longer exists).
Docker: Removed `prompt.Dockerfile`, compose service blocks, debug ports.
CI: Removed prompt-service from `production-build.yaml` matrix and old tools from `docker-tools-build-push.yaml`.
Backend/config: Removed `PROMPT_HOST`/`PROMPT_PORT` from settings, sample envs, workflow-execution constants.

Safety — preserved items

| Item | Why |
| --- | --- |
| `STRUCTURE_TOOL_IMAGE_*` (3 keys) | Used for structure tool routing |
| `REMOTE_PROMPT_STUDIO_FILE_PATH` | Prompt Studio data path |
| `workers/plugins/` | Active executor plugins |
| `PromptIdeBaseTool` | Still used for IDE indexing |
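
As a rough illustration of why the `STRUCTURE_TOOL_IMAGE_*` keys must survive the cleanup, the sketch below collects every environment variable sharing that prefix. The PR does not list the three key names, so the function matches on the prefix only; the helper name `structure_tool_image_settings` is hypothetical and not taken from the repository.

```python
from __future__ import annotations

import os
from typing import Mapping

# Prefix taken from the PR description. The individual key names are not
# listed there, so this sketch deliberately matches on the prefix alone.
STRUCTURE_TOOL_IMAGE_PREFIX = "STRUCTURE_TOOL_IMAGE_"


def structure_tool_image_settings(
    env: Mapping[str, str] | None = None,
) -> dict[str, str]:
    """Return every STRUCTURE_TOOL_IMAGE_* variable found in ``env``.

    Hypothetical helper: falls back to the process environment when no
    mapping is passed in.
    """
    source = os.environ if env is None else env
    return {
        key: value
        for key, value in source.items()
        if key.startswith(STRUCTURE_TOOL_IMAGE_PREFIX)
    }
```

A deployment check could call this after loading its env file and confirm that the expected number of keys is still present.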

Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

No. All removed components are dead code — prompt-service was replaced by executor workers in Phases 1-4, old tool containers are unused (structure tool routing uses image env vars, not source), and SDK1 prompt.py had no remaining callers. Verified: zero import references to deleted modules, 263 workers tests pass with no regressions.

Relevant Docs

`architecture-migration-phases.md` (repo root) — Phase 5 plan

Related Issues or PRs

Cloud counterpart: https://github.com/Zipstack/unstract-cloud/pull/1503 (must merge together)
Phase 4 PR (predecessor): merged on `feat/execution-backend`

Dependencies Versions / Env Variables

Removed env vars:

`PROMPT_HOST` (was `http://unstract-prompt-service`)
`PROMPT_PORT` (was `3003`)

No new dependencies or env vars added.
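
Operators with their own `.env` files may want to confirm that neither removed variable is still set. The following is a minimal sketch, assuming a plain `KEY=value` env-file format with `#` comments; the function name `find_removed_env_vars` is hypothetical.

```python
from __future__ import annotations

# The two variables this PR removes, as listed above.
REMOVED_ENV_VARS = ("PROMPT_HOST", "PROMPT_PORT")


def find_removed_env_vars(
    env_text: str, removed: tuple[str, ...] = REMOVED_ENV_VARS
) -> list[str]:
    """Return the removed variable names still assigned in ``env_text``.

    Assumes one KEY=value assignment per line; blank lines and lines
    starting with '#' are skipped, and an optional 'export ' prefix is
    tolerated.
    """
    found: list[str] = []
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key = line.split("=", 1)[0].strip()
        if key in removed and key not in found:
            found.append(key)
    return found
```

An empty result means the file no longer assigns either variable; commented-out assignments are ignored on purpose.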

Notes on Testing

Workers tests: 263 pass (no regressions)
Dangling reference scan: 0 matches for deleted modules
tox.ini updated to remove prompt-service test env
Docker compose validated (removed services, updated depends_on)
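
The PR does not say how the dangling reference scan was run. One way such a scan could look is sketched below; the regex patterns are assumptions inferred from the deleted file paths and the removed `retry_prompt_service_call` helper, and `find_dangling_references` is a hypothetical name.

```python
from __future__ import annotations

import re
from pathlib import Path

# Assumed patterns: module paths inferred from the deleted files
# (prompt-service/src/unstract/prompt_service, unstract/sdk1/.../prompt.py)
# plus the retry decorator the reviews report as removed.
DELETED_MODULE_PATTERNS = (
    r"\bunstract\.prompt_service\b",
    r"\bunstract\.sdk1\.prompt\b",
    r"\bretry_prompt_service_call\b",
)


def find_dangling_references(
    root: str | Path,
    patterns: tuple[str, ...] = DELETED_MODULE_PATTERNS,
    suffixes: tuple[str, ...] = (".py",),
) -> list[tuple[str, int, str]]:
    """Return (path, line number, line) for each line matching a pattern."""
    compiled = [re.compile(pattern) for pattern in patterns]
    hits: list[tuple[str, int, str]] = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(rx.search(line) for rx in compiled):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Passing extra suffixes such as `.sh` or `.yaml` would widen the scan to scripts and CI configuration, which only a Python-only scan would miss.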

Screenshots

N/A — no UI changes.

Checklist

I have read and understood the Contribution Guidelines.

… (Phase 5) Remove prompt-service source, Dockerfiles, and docker-compose entries. Remove tools/classifier, tools/structure, tools/text_extractor directories. Remove SDK1 prompt.py module and its tests. Clean up PROMPT_HOST/PROMPT_PORT from backend settings, sample envs, docker configs, and CI workflows. Remove prompt-service from uv-lock scripts and production build workflow. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The prompt-service directory was deleted in the prior commit but tox.ini still referenced it, which would break CI test runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai · 2026-05-20T13:14:49Z

Summary by CodeRabbit

Chores
- Removed the prompt-service component and related infrastructure.
- Updated service dependencies to use x2text-service and platform-service instead.
- Streamlined Docker build configurations and compose files.
- Updated environment variable configurations and CI/CD pipelines.

Walkthrough

This pull request performs a comprehensive removal of the prompt-service microservice from the platform, along with the deprecated tool-classifier and tool-structure tools. The service dependencies are reconfigured to use platform-service for core functionality, with corresponding updates to Docker orchestration, CI/CD workflows, SDK clients, and environment configuration throughout the codebase.

Changes

Service Infrastructure Reconfiguration

| Layer / File(s) | Summary |
| --- | --- |
| Build Matrix and Service Registry Update `.github/workflows/production-build.yaml` | Production build workflow service matrix removes `prompt-service` and adds `runner`. Service count reduced from 7 to 6, with GitHub Summary build status table updated to reflect the new service set. |
| Platform Service Configuration Migration `backend/backend/settings/base.py`, `backend/sample.env` | Backend Django settings now configure `PLATFORM_HOST` and `PLATFORM_PORT` from environment, replacing the removed `PROMPT_HOST`/`PROMPT_PORT` settings. |
| Docker Compose Service Reconfiguration `docker/compose.debug.yaml`, `docker/docker-compose.yaml`, `docker/sample.compose.override.yaml` | Docker compose files updated to remove `prompt-service` container definition and debug port mapping. Backend service `depends_on` updated to use `x2text-service` instead of `prompt-service`. |
| Supported Tools Restriction and Build Simplification `.github/workflows/docker-tools-build-push.yaml` | Docker tools build workflow restricted to only `tool-sidecar` and `tool-text-extractor` with simplified build configuration logic, removing classifier and structure tool support. |
| Build Compose Service Definitions `docker/docker-compose.build.yaml` | Service definitions reorganized to remove `tool-structure`, `tool-classifier`, `prompt-service` while relocating `platform-service` build configuration. |
| Lockfile Generation Script Updates `docker/scripts/uv-lock-gen/README.md`, `docker/scripts/uv-lock-gen/uv-lock.sh` | Uv-lock generation documentation and script updated to remove `prompt-service` from default service list and update example dependencies. |
| SDK Service Client Refactoring `unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py`, `unstract/sdk1/tests/conftest.py`, `unstract/sdk1/tests/utils/test_retry_utils.py` | SDK1 retry utilities refactored to use `retry_platform_service_call` instead of `retry_prompt_service_call` for platform service calls, with test configuration and assertions updated. |
| Workflow Execution Configuration `unstract/workflow-execution/src/unstract/workflow_execution/constants.py`, `unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py` | Workflow execution module updated to remove `PROMPT_HOST`/`PROMPT_PORT` environment variable references while maintaining platform and x2text service integration. |
| Environment Sample Files and Build Configuration `workers/sample.env`, `prompt-service/.python-version`, `tools/structure/.dockerignore`, `tox.ini` | Worker environment samples, Python version pins, and build configurations updated to replace prompt service references with platform service and remove deprecated tool support. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main objective: decommissioning prompt-service and old tools. It is specific, non-generic, and directly reflects the primary changes in the changeset. |
| Description check | ✅ Passed | The PR description comprehensively addresses all required template sections with detailed content, including 'What', 'Why', 'How', impact analysis, safety considerations, testing notes, and related issues. The description is well-structured and complete. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


greptile-apps · 2026-05-20T13:19:11Z

Greptile Summary

Phase 5 of the pluggable-executor migration: removes the entire prompt-service Flask app, the tools/classifier and tools/structure old tool containers, and the unstract/sdk1/prompt.py dead module. All callsites, env vars (PROMPT_HOST/PROMPT_PORT), Docker compose blocks, CI matrix entries, retry helpers, and tox environments for these components are consistently cleaned up.

~10 800 lines deleted across 105 files — prompt-service controllers/retrievers/services/tests, classifier and structure tool sources, sdk1/prompt.py, and related lock file.
Config/infra scrubbed end-to-end: backend/settings/base.py, workers/sample.env, docker-compose.yaml, compose.debug.yaml, sample.compose.override.yaml, CI workflows, and tox.ini all updated in sync.
One script missed: docker/scripts/bump_sdk_v0_version.sh was not updated and still references the deleted prompt-service and tools/classifier directories, which will cause --bump mode to exit fatally when check_file finds the missing classifier properties.json.

Confidence Score: 4/5

Safe to merge for runtime — the deleted components are confirmed dead code — but the version-bump maintenance script will crash on its next use until updated.

The docker/scripts/bump_sdk_v0_version.sh script still iterates over CUSTOM_TOOL_DIRS which includes the now-deleted tools/classifier. The update_custom_tool_version function calls check_file unconditionally (even in --dry-run), and check_file calls exit 1 when the file is not found. Any developer running this script after the merge will hit an immediate fatal error. No production runtime is affected, but the tooling breakage is a real defect.

docker/scripts/bump_sdk_v0_version.sh — not updated alongside the deleted directories it references.
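
The failure mode Greptile describes (a hard exit on the first missing `properties.json`) can be caught before any version bump runs. The sketch below is a Python illustration of such a pre-flight check, not the shell script itself; the relative path `src/config/properties.json` follows the layout of the deleted tool directories, and `partition_tool_dirs` is a hypothetical helper.

```python
from __future__ import annotations

from pathlib import Path

# Layout assumed from the deleted tools, e.g.
# tools/classifier/src/config/properties.json.
PROPERTIES_RELATIVE_PATH = Path("src/config/properties.json")


def partition_tool_dirs(
    repo_root: str | Path, tool_dirs: list[str]
) -> tuple[list[str], list[str]]:
    """Split ``tool_dirs`` into (present, missing) by properties.json.

    A directory counts as present only when its properties.json exists
    as a regular file under ``repo_root``.
    """
    root = Path(repo_root)
    present: list[str] = []
    missing: list[str] = []
    for tool_dir in tool_dirs:
        target = root / tool_dir / PROPERTIES_RELATIVE_PATH
        (present if target.is_file() else missing).append(tool_dir)
    return present, missing
```

Reporting the `missing` list up front, instead of exiting on the first absent file, would show every stale entry in one run.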

Important Files Changed

| Filename | Overview |
| --- | --- |
| docker/scripts/bump_sdk_v0_version.sh | Not updated in this PR — still holds references to deleted prompt-service, classifier, and structure directories, causing the --bump path to fail at check_file for the missing classifier properties.json. |
| unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py | Correctly removes PROMPT_HOST and PROMPT_PORT from constructor and get_tool_environment_variables; no remaining callers verified in codebase. |
| unstract/workflow-execution/src/unstract/workflow_execution/constants.py | PROMPT_HOST and PROMPT_PORT constants cleanly removed from ToolRuntimeVariable. |
| .github/workflows/production-build.yaml | prompt-service removed from build matrix and TOTAL_SERVICES correctly decremented from 7 to 6; summary loop updated consistently. |
| .github/workflows/docker-tools-build-push.yaml | tool-classifier and tool-structure options removed; tool-sidecar made the default; tool-text-extractor retained with correct Dockerfile path. |
| tox.ini | prompt-service removed from env_list and [testenv:prompt-service] block cleanly deleted. |
| unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py | retry_prompt_service_call decorator removed; verified no remaining importers in codebase. |
| docker/docker-compose.yaml | prompt-service service block and its depends_on entry in backend cleanly removed. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Workflow Execution Request] --> B[ToolsUtils.__init__]
    B --> C[Load PLATFORM env vars]
    B --> D[Load X2TEXT env vars]
    B --> E[Load REDIS env vars]
    B -.->|REMOVED| F[PROMPT_HOST / PROMPT_PORT]
    C & D & E --> G[get_tool_environment_variables]
    G --> H[ToolSandbox / Executor]
    H --> I[workers/executor]
    I --> J[answer_prompt.py]
    I --> K[index.py]
    I --> L[postprocessor.py]
    F -.->|was| M[prompt-service Flask app DECOMMISSIONED]
    style F stroke-dasharray:5 5,color:#999
    style M fill:#fdd,stroke:#f00,color:#900


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docker/scripts/uv-lock-gen/README.md`:
- Line 5: The example sentence describing "transitive dependency changes"
references the service "workers" which is not in the enumerated list; update
that sentence to reference an existing listed service (e.g., replace "workers"
with "runner") so the example matches the enumerated services, or alternatively
add "workers" into the enumerated list; locate the sentence containing
"transitive dependency changes" and the example "unstract/sdk1" and make the
replacement/addition accordingly to keep the README consistent.


📥 Commits

Reviewing files that changed from the base of the PR and between 0559057 and 7bdff5a.

⛔ Files ignored due to path filters (5)

prompt-service/src/unstract/prompt_service/tests/integration/input/sample1.pdf is excluded by !**/*.pdf
prompt-service/uv.lock is excluded by !**/*.lock
tools/classifier/src/config/icon.svg is excluded by !**/*.svg
tools/structure/src/config/icon.svg is excluded by !**/*.svg
tools/text_extractor/src/config/icon.svg is excluded by !**/*.svg

📒 Files selected for processing (114)

.github/workflows/docker-tools-build-push.yaml
.github/workflows/production-build.yaml
backend/backend/settings/base.py
backend/sample.env
docker/compose.debug.yaml
docker/docker-compose.build.yaml
docker/docker-compose.yaml
docker/dockerfiles/prompt.Dockerfile
docker/dockerfiles/prompt.Dockerfile.dockerignore
docker/sample.compose.override.yaml
docker/scripts/uv-lock-gen/README.md
docker/scripts/uv-lock-gen/uv-lock.sh
prompt-service/.gitignore
prompt-service/.python-version
prompt-service/README.md
prompt-service/entrypoint.sh
prompt-service/pyproject.toml
prompt-service/sample.env
prompt-service/src/unstract/prompt_service/__init__.py
prompt-service/src/unstract/prompt_service/config.py
prompt-service/src/unstract/prompt_service/constants.py
prompt-service/src/unstract/prompt_service/controllers/__init__.py
prompt-service/src/unstract/prompt_service/controllers/answer_prompt.py
prompt-service/src/unstract/prompt_service/controllers/extraction.py
prompt-service/src/unstract/prompt_service/controllers/health.py
prompt-service/src/unstract/prompt_service/controllers/indexing.py
prompt-service/src/unstract/prompt_service/core/index_v2.py
prompt-service/src/unstract/prompt_service/core/retrievers/automerging.py
prompt-service/src/unstract/prompt_service/core/retrievers/base_retriever.py
prompt-service/src/unstract/prompt_service/core/retrievers/fusion.py
prompt-service/src/unstract/prompt_service/core/retrievers/keyword_table.py
prompt-service/src/unstract/prompt_service/core/retrievers/recursive.py
prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py
prompt-service/src/unstract/prompt_service/core/retrievers/router.py
prompt-service/src/unstract/prompt_service/core/retrievers/simple.py
prompt-service/src/unstract/prompt_service/core/retrievers/subquestion.py
prompt-service/src/unstract/prompt_service/dto.py
prompt-service/src/unstract/prompt_service/exceptions.py
prompt-service/src/unstract/prompt_service/extensions.py
prompt-service/src/unstract/prompt_service/helpers/__init__.py
prompt-service/src/unstract/prompt_service/helpers/auth.py
prompt-service/src/unstract/prompt_service/helpers/postprocessor.py
prompt-service/src/unstract/prompt_service/helpers/prompt_ide_base_tool.py
prompt-service/src/unstract/prompt_service/helpers/usage.py
prompt-service/src/unstract/prompt_service/helpers/variable_replacement.py
prompt-service/src/unstract/prompt_service/run.py
prompt-service/src/unstract/prompt_service/services/__init__.py
prompt-service/src/unstract/prompt_service/services/answer_prompt.py
prompt-service/src/unstract/prompt_service/services/extraction.py
prompt-service/src/unstract/prompt_service/services/indexing.py
prompt-service/src/unstract/prompt_service/services/rentrolls_extractor/interface.py
prompt-service/src/unstract/prompt_service/services/retrieval.py
prompt-service/src/unstract/prompt_service/services/variable_replacement.py
prompt-service/src/unstract/prompt_service/tests/conftest.py
prompt-service/src/unstract/prompt_service/tests/integration/test_api_endpoints.py
prompt-service/src/unstract/prompt_service/tests/sample.env.test
prompt-service/src/unstract/prompt_service/tests/unit/__init__.py
prompt-service/src/unstract/prompt_service/tests/unit/conftest.py
prompt-service/src/unstract/prompt_service/tests/unit/test_retriever_llm.py
prompt-service/src/unstract/prompt_service/utils/__init__.py
prompt-service/src/unstract/prompt_service/utils/db_utils.py
prompt-service/src/unstract/prompt_service/utils/env_loader.py
prompt-service/src/unstract/prompt_service/utils/file_utils.py
prompt-service/src/unstract/prompt_service/utils/json_repair_helper.py
prompt-service/src/unstract/prompt_service/utils/log.py
prompt-service/src/unstract/prompt_service/utils/metrics.py
prompt-service/src/unstract/prompt_service/utils/request.py
tools/classifier/.dockerignore
tools/classifier/Dockerfile
tools/classifier/README.md
tools/classifier/__init__.py
tools/classifier/requirements.txt
tools/classifier/sample.env
tools/classifier/src/config/properties.json
tools/classifier/src/config/runtime_variables.json
tools/classifier/src/config/spec.json
tools/classifier/src/helper.py
tools/classifier/src/main.py
tools/structure/.dockerignore
tools/structure/.gitignore
tools/structure/Dockerfile
tools/structure/README.md
tools/structure/__init__.py
tools/structure/requirements.txt
tools/structure/sample.env
tools/structure/src/config/properties.json
tools/structure/src/config/runtime_variables.json
tools/structure/src/config/spec.json
tools/structure/src/constants.py
tools/structure/src/helpers.py
tools/structure/src/main.py
tools/structure/src/utils.py
tools/text_extractor/.dockerignore
tools/text_extractor/.gitignore
tools/text_extractor/Dockerfile
tools/text_extractor/README.md
tools/text_extractor/__init__.py
tools/text_extractor/requirements.txt
tools/text_extractor/sample.env
tools/text_extractor/src/config/properties.json
tools/text_extractor/src/config/runtime_variables.json
tools/text_extractor/src/config/spec.json
tools/text_extractor/src/example_package/__init__.py
tools/text_extractor/src/main.py
tools/text_extractor/tests/__init__.py
tox.ini
unstract/sdk1/src/unstract/sdk1/prompt.py
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py
unstract/sdk1/tests/conftest.py
unstract/sdk1/tests/test_prompt.py
unstract/sdk1/tests/utils/test_retry_utils.py
unstract/workflow-execution/src/unstract/workflow_execution/constants.py
unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py
workers/sample.env

💤 Files with no reviewable changes (100)

tools/classifier/src/config/properties.json
tools/text_extractor/.gitignore
tools/classifier/.dockerignore
prompt-service/.gitignore
prompt-service/sample.env
docker/dockerfiles/prompt.Dockerfile.dockerignore
tools/structure/sample.env
tools/structure/src/config/properties.json
prompt-service/README.md
docker/dockerfiles/prompt.Dockerfile
tools/text_extractor/src/config/properties.json
tools/text_extractor/README.md
tools/structure/requirements.txt
prompt-service/src/unstract/prompt_service/tests/sample.env.test
tools/classifier/src/config/runtime_variables.json
prompt-service/entrypoint.sh
docker/scripts/uv-lock-gen/uv-lock.sh
tools/classifier/README.md
tools/classifier/sample.env
workers/sample.env
prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py
prompt-service/src/unstract/prompt_service/core/retrievers/base_retriever.py
prompt-service/src/unstract/prompt_service/utils/db_utils.py
tools/structure/src/constants.py
tools/structure/Dockerfile
tools/structure/README.md
prompt-service/src/unstract/prompt_service/services/rentrolls_extractor/interface.py
tools/text_extractor/src/main.py
prompt-service/src/unstract/prompt_service/controllers/health.py
tools/text_extractor/.dockerignore
tools/text_extractor/src/config/runtime_variables.json
tools/text_extractor/requirements.txt
prompt-service/src/unstract/prompt_service/services/indexing.py
tools/text_extractor/Dockerfile
prompt-service/src/unstract/prompt_service/tests/unit/test_retriever_llm.py
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py
prompt-service/src/unstract/prompt_service/controllers/__init__.py
tools/classifier/src/config/spec.json
prompt-service/src/unstract/prompt_service/utils/file_utils.py
prompt-service/src/unstract/prompt_service/tests/integration/test_api_endpoints.py
prompt-service/src/unstract/prompt_service/utils/log.py
prompt-service/src/unstract/prompt_service/utils/metrics.py
prompt-service/src/unstract/prompt_service/extensions.py
unstract/sdk1/src/unstract/sdk1/prompt.py
unstract/workflow-execution/src/unstract/workflow_execution/constants.py
prompt-service/src/unstract/prompt_service/controllers/answer_prompt.py
tools/structure/src/config/runtime_variables.json
prompt-service/src/unstract/prompt_service/constants.py
prompt-service/src/unstract/prompt_service/tests/unit/conftest.py
tools/structure/.gitignore
prompt-service/src/unstract/prompt_service/controllers/indexing.py
tools/classifier/Dockerfile
prompt-service/src/unstract/prompt_service/services/variable_replacement.py
tools/structure/src/helpers.py
tools/structure/src/config/spec.json
prompt-service/src/unstract/prompt_service/helpers/auth.py
prompt-service/src/unstract/prompt_service/core/retrievers/keyword_table.py
tools/text_extractor/src/config/spec.json
prompt-service/src/unstract/prompt_service/tests/conftest.py
unstract/sdk1/tests/utils/test_retry_utils.py
prompt-service/src/unstract/prompt_service/dto.py
prompt-service/.python-version
prompt-service/src/unstract/prompt_service/helpers/postprocessor.py
prompt-service/src/unstract/prompt_service/helpers/usage.py
docker/docker-compose.build.yaml
tools/classifier/src/main.py
backend/sample.env
prompt-service/src/unstract/prompt_service/exceptions.py
prompt-service/src/unstract/prompt_service/utils/json_repair_helper.py
prompt-service/src/unstract/prompt_service/core/retrievers/automerging.py
unstract/sdk1/tests/conftest.py
prompt-service/src/unstract/prompt_service/core/index_v2.py
prompt-service/src/unstract/prompt_service/helpers/prompt_ide_base_tool.py
tools/text_extractor/sample.env
prompt-service/src/unstract/prompt_service/services/answer_prompt.py
prompt-service/src/unstract/prompt_service/core/retrievers/recursive.py
prompt-service/src/unstract/prompt_service/core/retrievers/subquestion.py
tools/structure/src/utils.py
prompt-service/src/unstract/prompt_service/utils/request.py
docker/compose.debug.yaml
prompt-service/src/unstract/prompt_service/services/extraction.py
prompt-service/src/unstract/prompt_service/core/retrievers/router.py
prompt-service/src/unstract/prompt_service/config.py
tools/structure/.dockerignore
prompt-service/src/unstract/prompt_service/controllers/extraction.py
prompt-service/pyproject.toml
tools/classifier/src/helper.py
prompt-service/src/unstract/prompt_service/core/retrievers/simple.py
tools/classifier/requirements.txt
docker/docker-compose.yaml
prompt-service/src/unstract/prompt_service/services/retrieval.py
prompt-service/src/unstract/prompt_service/run.py
prompt-service/src/unstract/prompt_service/helpers/variable_replacement.py
unstract/sdk1/tests/test_prompt.py
tools/structure/src/main.py
prompt-service/src/unstract/prompt_service/utils/env_loader.py
backend/backend/settings/base.py
prompt-service/src/unstract/prompt_service/core/retrievers/fusion.py
unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py
docker/sample.compose.override.yaml

…1877)

* [FIX] Add hook for setting default adapters for invited users
  Add setup_default_adapters_for_user() hook to AuthenticationService and call it from set_user_organization() when an invited user joins an existing organization. This allows the cloud plugin to set up default triad adapters (LLM, embedding, vector DB, x2text) for invited users, fixing silent failures in API deployment creation.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Update backend/account_v2/authentication_controller.py
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
  Signed-off-by: Praveen Kumar <praveen@zipstack.com>
* [FIX] Improve log message for setup_default_adapters_for_user
  Address review comment: log user email and explain that default adapters will not be set when the method is not implemented.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [MISC] Rename Default Triad to Default LLM Profile in UI
  Update display label from "Default Triad" to "Default LLM Profile" in the page heading and side navigation menu.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Signed-off-by: Praveen Kumar <praveen@zipstack.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Deepak K <89829542+Deepak-Kesavan@users.noreply.github.com>

* [FIX] Wrap set_user_organization in transaction.atomic
  The new-org branch creates the org row, then calls frictionless onboarding and the initial platform key. Failures mid-flow leave an orphan org with no adapters or key, and subsequent logins skip onboarding entirely (gated on new_organization). Atomic ensures the org rolls back on any failure so retries get a clean fresh-org path.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* [MISC] Worktree skill — use --no-track to prevent accidental main pushes
  Without --no-track, a later `git push -u origin <branch>` can be reported by the server as also fast-forwarding main, landing commits on main.
* [FIX] Use logger.exception in authorization_callback
  Preserves the traceback when the OAuth callback hits the safety-net catch. Behaviour unchanged.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Athul <89829560+athul-rs@users.noreply.github.com>
Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com>
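
The rollback behaviour described in the `transaction.atomic` commit above can be illustrated without Django. The sketch below swaps in the standard-library `sqlite3` connection context manager, which commits on success and rolls back on an exception; it is not the project's code, and the table name, `create_org_with_setup`, and `org_exists` are all hypothetical.

```python
from __future__ import annotations

import sqlite3
from typing import Callable


def create_org_with_setup(
    conn: sqlite3.Connection,
    org_name: str,
    setup: Callable[[sqlite3.Connection, str], None],
) -> None:
    """Insert the org row and run follow-up setup as one transaction.

    ``with conn`` commits if the block finishes and rolls back if it
    raises, so a failing ``setup`` leaves no orphan org row behind.
    """
    with conn:
        conn.execute("INSERT INTO organization (name) VALUES (?)", (org_name,))
        setup(conn, org_name)


def org_exists(conn: sqlite3.Connection, org_name: str) -> bool:
    """Report whether an org row with this name is currently stored."""
    row = conn.execute(
        "SELECT 1 FROM organization WHERE name = ?", (org_name,)
    ).fetchone()
    return row is not None
```

If `setup` raises, a retry with the same name starts from an empty table again, which mirrors the "clean fresh-org path" the commit message aims for.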

…1930)

* UN-3386 [FEAT] Add Prompt Studio HITL change indicator plugin slot
  Wires up the host-side hooks for the prompt-change-indicator plugin (implementation lives in unstract-cloud): a dynamic-import slot in the prompt card Header for the indicator button, and a route at :orgName/review/readonly/:documentId for the read-only audit view. Both gates fall through gracefully when the plugin is absent (OSS).
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* UN-3386 [FIX] Warn when ReadOnlyReviewPage loads without ReviewLayout
  Addresses review feedback: the readonly route nests inside ReviewLayout (manual-review plugin), so a deployment that ships prompt-change-indicator without manual-review would silently fail to register the route. Log a console.warn in that case to make the misconfiguration discoverable.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* UN-3386 [FIX] Surface real plugin import errors in route loader
  Bare catch in the prompt-change-indicator dynamic import was swallowing syntax/runtime errors in the plugin file alongside the expected "plugin missing in OSS" case. Detect the missing-module messages explicitly and console.error anything else so a broken cloud plugin no longer disables the readonly route silently.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add OpenAI-compatible LLM adapter * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address review feedback for custom OpenAI adapter * Fix import formatting after rebase * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address follow-up review comments for OpenAI-compatible adapter * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Refine OpenAI compatible adapter schema naming * Reject empty model string in OpenAICompatibleLLMParameters validate_model previously produced "custom_openai/" for an empty model, surfacing as a confusing LiteLLM error at call time. Match the existing GeminiLLMParameters.validate_model pattern: strip whitespace, raise ValueError on empty input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Revert SCHEMA_PATH plumbing; rename schema to custom_openai.json Addresses Ritwik's review feedback. The new BaseAdapter.SCHEMA_PATH class variable and the conditional branch in get_json_schema() are unnecessary: OpenAICompatibleLLMAdapter.get_provider() returns "custom_openai", and the default path resolution already builds …/llm1/static/{get_provider()}.json. Renaming the schema file lets the default lookup find it and keeps the base class untouched, which is the convention every other adapter follows. 
- Rename openai_compatible.json -> custom_openai.json - Drop SCHEMA_PATH class var and the if-None branch from BaseAdapter - Drop SCHEMA_PATH override (and unused os/ClassVar imports) from OpenAICompatibleLLMAdapter - Update test_openai_compatible_schema_is_loadable to read schema via get_json_schema() instead of touching SCHEMA_PATH directly --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Hari John Kuriakose <hari@zipstack.com> Co-authored-by: Chandrasekharan M <chandrasekharan@zipstack.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Athul <athul@zipstack.com> Co-authored-by: Athul <89829560+athul-rs@users.noreply.github.com> Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com>
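
The empty-model fix in the commit above (strip whitespace, raise `ValueError` on empty input) might look roughly like the standalone sketch below. It is not the adapter's actual `validate_model`; the `custom_openai/` prefix comes from the commit message, while the already-prefixed check and the error wording are assumptions.

```python
from __future__ import annotations

# Provider prefix quoted in the commit message.
PROVIDER_PREFIX = "custom_openai/"


def validate_model(model: str | None) -> str:
    """Return the provider-prefixed model name, rejecting empty input.

    Whitespace is stripped first so that "   " is treated as empty
    rather than turning into a bare "custom_openai/" string.
    """
    cleaned = (model or "").strip()
    if not cleaned:
        raise ValueError("model must be a non-empty string")
    # Assumption: an already-prefixed name is passed through unchanged.
    if cleaned.startswith(PROVIDER_PREFIX):
        return cleaned
    return PROVIDER_PREFIX + cleaned
```

Failing at validation time gives the caller a clear message instead of the confusing LiteLLM error the commit mentions.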

* [HOTFIX] Use importlib.util.find_spec for pluggable worker discovery (#1918) * [FIX] Use importlib.util.find_spec for pluggable worker discovery _verify_pluggable_worker_exists() previously checked for the literal file `pluggable_worker/<name>/worker.py` on disk, which breaks when the plugin has been compiled to a .so (Nuitka, Cython, or any C extension) — the module is perfectly importable but the pre-check rejects it because only the .py extension is considered. Replace the filesystem check with importlib.util.find_spec(), which is Python's standard way to ask "is this module resolvable by the import system?". It honors every registered finder — source .py, compiled .so, bytecode .pyc, namespace packages, zipimports — so the function now matches what its docstring claims: verifying the module can be loaded, not that a specific file extension is present. Behavior is preserved for existing deployments: - Images with no `pluggable_worker/<name>/` subpackage → find_spec raises ModuleNotFoundError (ImportError subclass) → returns False. - Images with source .py → find_spec resolves the .py → returns True. - Images with compiled .so → find_spec resolves the .so → returns True. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Handle ValueError from find_spec in pluggable worker verification Greptile-flagged edge case: importlib.util.find_spec() can raise ValueError (not just ImportError) when sys.modules has a partially initialised module entry with __spec__ = None from a prior failed import. Broaden the except to catch both. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Resolve api-deployment worker directory from enum import path worker.py:452 did worker_type.value.replace("-", "_") to derive the on-disk dir name. 
All WorkerType enum values already use underscores, so the replace was a no-op; for API_DEPLOYMENT whose dir is "api-deployment" (hyphen), it resolved to "api_deployment" and the os.path.exists() check failed. Boot then logged a spurious "❌ Worker directory not found: /app/api_deployment" at ERROR level. The task registration path (builder + celery autodiscover via to_import_path) is unaffected, so this was purely log noise — but noise at ERROR level that masks real failures in log scans. Fix: derive the directory from the authoritative to_import_path() which already handles the hyphen case (api_deployment -> api-deployment). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [HOTFIX] Add IAM Role / Instance Profile auth mode to AWS Bedrock adapter (#1944) * [FEAT] Allow Bedrock to fall through to boto3's default credential chain Match the S3/MinIO connector pattern: when AWS access keys are left blank on the Bedrock LLM and embedding adapter forms, drop them from the kwargs dict so boto3's default credential chain handles authentication. This unlocks IAM role / instance profile / IRSA / AWS Profile scenarios on hosts that already have ambient AWS credentials (e.g. EKS workers with IRSA, EC2 with an instance profile). - llm1/static/bedrock.json: clarify access-key descriptions to mention IRSA and instance profile (already non-required at v0.163.2 base). - embedding1/static/bedrock.json: drop aws_access_key_id and aws_secret_access_key from top-level required; same description fix; expose aws_profile_name for parity with the LLM form. - base1.py: AWSBedrockLLMParameters and AWSBedrockEmbeddingParameters now strip empty access-key values from the validated kwargs before returning, so empty strings don't override boto3's default chain. AWSBedrockEmbeddingParameters fields gain explicit None defaults and an aws_profile_name field. 
Backward-compatible: existing adapters with access keys filled in continue to work unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FEAT] Add Authentication Type selector to Bedrock adapter form Add an explicit `auth_type` selector with two options, making the auth choice clear to users: - "Access Keys" (default): existing flow, keys required - "IAM Role / Instance Profile (on-prem AWS only)": no fields; relies on boto3's default credential chain (IRSA on EKS, task role on ECS, instance profile on EC2). Description on the selector explicitly notes this option is only for AWS-hosted Unstract deployments. The form-only auth_type field is stripped before LiteLLM validation in both AWSBedrockLLMParameters.validate() and AWSBedrockEmbeddingParameters. validate(). Empty access keys continue to be stripped so boto3 falls through to the default chain even when the access_keys arm is selected without values (matches the S3/MinIO connector pattern). Backward-compatible: legacy adapters without auth_type behave as "Access Keys" mode (the default), and existing keys are forwarded unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [REVIEW] Address Bedrock auth_type review feedback Fixes the P0/P1 issues raised by greptile-apps and jaseemjaskp on PR #1944. Behaviour fixes: - Stale-key leak in IAM Role mode: switching an existing adapter from Access Keys to IAM Role would carry truthy stored access keys through the strip-empty-only loop, so boto3 silently authenticated with the old long-lived credentials instead of falling through to the host's IRSA / instance-profile identity. Both LLM and embedding paths were affected. - Silent acceptance of unknown auth_type: a typo (e.g. "access_key") or a malformed payload from a non-UI client passed through the dict comprehension untouched, with no enum guard. 
- Cross-field validation gap: explicit Access Keys mode with blank or whitespace-only values silently fell through to the default credential chain instead of surfacing the misconfiguration.

Implementation:
- Add a module-level _resolve_bedrock_aws_credentials helper used by both AWSBedrockLLMParameters.validate() and AWSBedrockEmbeddingParameters.validate(), so the auth-type contract is expressed once.
- Validates auth_type against an allowlist (None | "access_keys" | "iam_role"); raises ValueError on anything else.
- iam_role: unconditionally drops aws_access_key_id and aws_secret_access_key.
- access_keys (explicit): requires non-blank values; raises ValueError if either is empty or whitespace-only.
- Legacy (auth_type absent): retains the lenient strip behaviour so pre-PR adapter configurations continue to deserialise unchanged.
- Restore aws_region_name as required (no `= None` default) on AWSBedrockEmbeddingParameters; only credentials may legitimately be absent.
- Drop the orphan aws_profile_name field from embedding1/static/bedrock.json: it was added for parity with the LLM form but lives outside the auth_type oneOf and contradicts the selector's "no further input" semantics. The LLM form already had aws_profile_name pre-PR and is left alone for backwards compatibility.

Tests:
- New tests/test_bedrock_adapter.py covers 15 cases across LLM and embedding adapters: legacy-no-auth-type, explicit access_keys with valid/blank/whitespace keys, iam_role with stale/no keys, unknown auth_type rejection, cross-field validation, and preservation of unrelated params (model_id, aws_profile_name, region, thinking).

Skipped (P2 nice-to-have):
- Comment-scope clarification, MinIO reference rewording, validate-mutates-caller's-dict, and the LLM form description nit about aws_profile_name visibility. These don't change behaviour and can be addressed in a follow-up.
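The auth-type contract listed above could look roughly like the sketch below. It is written only from the behaviour described in the commit message and is not the repository's `_resolve_bedrock_aws_credentials`; the function name without the leading underscore, the constants, and the choice to return a copy rather than mutate the caller's dict are all assumptions.

```python
from typing import Any

# Assumed names; only the key names and auth_type values come from the commit message.
CREDENTIAL_KEYS = ("aws_access_key_id", "aws_secret_access_key")
ALLOWED_AUTH_TYPES = (None, "access_keys", "iam_role")


def _is_blank(value: Any) -> bool:
    # None, empty and whitespace-only values all count as "not provided".
    return value is None or not str(value).strip()


def resolve_bedrock_aws_credentials(params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of params with credentials adjusted for the auth type."""
    resolved = dict(params)
    # auth_type is a form-only field, so it is always removed from the result.
    auth_type = resolved.pop("auth_type", None)
    if auth_type not in ALLOWED_AUTH_TYPES:
        raise ValueError(f"Unsupported auth_type: {auth_type!r}")

    if auth_type == "iam_role":
        # Drop keys unconditionally so stale stored keys cannot be used.
        for key in CREDENTIAL_KEYS:
            resolved.pop(key, None)
    elif auth_type == "access_keys":
        missing = [key for key in CREDENTIAL_KEYS if _is_blank(resolved.get(key))]
        if missing:
            raise ValueError(f"Access Keys mode requires: {', '.join(missing)}")
    else:
        # Legacy configuration without auth_type: lenient strip of blank keys.
        for key in CREDENTIAL_KEYS:
            if key in resolved and _is_blank(resolved[key]):
                del resolved[key]
    return resolved
```

With this shape, switching a stored adapter to `iam_role` removes any previously saved keys, so boto3 would fall through to its default credential chain.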
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [HOTFIX] Bump litellm to 1.83.10 from PyPI to clear CVE-2026-42208 (#1976) Hotfix for cloud v0.159.3 (OSS v0.163.4). Customer scanner flagged litellm 1.82.3 for CVE-2026-42208 (SQL injection in litellm proxy auth path, affects 1.81.16-1.83.6). We do not use litellm.proxy, but vulnerability scanners flag the installed package regardless of which code path is reachable. Bump to 1.83.10 — the exact version recommended by the upstream advisory (v1.83.10-stable) and the smallest jump that clears the CVE range while keeping python-dotenv==1.0.1 compatible (1.83.14 would force bumping python-dotenv across 7+ pyproject.toml files). Only tiktoken needed to move 0.9 -> 0.12 to satisfy litellm's pin. Switch source back to PyPI now that the PyPI quarantine is over, reversing the temporary fork in #1873. Cohere embed timeout patch: verified that litellm/llms/cohere/embed/handler.py is byte-identical between v1.82.3, v1.83.10-stable, and v1.83.14-stable (the timeout-not-forwarded bug fixed in #1848 is still present upstream — BerriAI/litellm#14635 remains OPEN). Version guard bumped 1.82.3 -> 1.83.10; 6/6 patch tests pass on the new version, confirming the monkey-patch still binds correctly. Other cleanup from #1873: - Drop git apt-install from worker-unified and tool Dockerfiles (no git-sourced deps remain in any uv.lock) - Bump tool versions: structure 0.0.100 -> 0.0.101, classifier 0.0.79 -> 0.0.80, text_extractor 0.0.75 -> 0.0.76 Note on root uv.lock churn: the v0.163.4 root uv.lock had a pre-existing corruption (banks v2.4.1 entry pointing at banks-2.2.0 wheel) that blocked incremental resolution. Regenerated from scratch. 
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Align cohere patch docstring with version-guard semantics Reviewer flagged that the docstring claimed the patch is "confirmed in every release between 1.82.3 and 1.83.14-stable", but the guard at _PATCHED_LITELLM_VERSION activates only on the exact pinned version. A future maintainer reading the old text could reasonably expect bumping to e.g. 1.83.11 to keep the fix active; in reality it silently turns off. Rewritten to reference _PATCHED_LITELLM_VERSION as the single source of truth and to drop the rot-prone "as of 2026-05-20" calendar date. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
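The version-guard semantics discussed in the docstring fix can be illustrated with a small sketch. Only the name `_PATCHED_LITELLM_VERSION` and the value 1.83.10 come from the commit messages; the function `should_apply_patch` is hypothetical and the real guard may be structured differently.

```python
# Pinned version the monkey-patch was verified against (value taken from the hotfix above).
_PATCHED_LITELLM_VERSION = "1.83.10"


def should_apply_patch(installed_version: str,
                       patched_version: str = _PATCHED_LITELLM_VERSION) -> bool:
    """Exact-match guard: the patch binds only on the pinned version."""
    return installed_version == patched_version


# Bumping to a nearby release silently disables the patch under this guard.
active_on_pin = should_apply_patch("1.83.10")
active_on_next = should_apply_patch("1.83.11")
```

Because the comparison is an exact match rather than a range, any dependency bump needs the constant updated and the patch re-verified, which is the point the reviewer asked the docstring to make explicit.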

The atomic wrap from #1954 leaves the new org row uncommitted while frictionless_onboarding makes an HTTP call to the LLMW portal mid-transaction. The portal runs on a separate DB session and, under READ COMMITTED, cannot see the uncommitted row, so the call returns 400 and the caller silently persists an adapter with an empty unstract_key. Every new signup since 2026-05-19 09:47 UTC ships a broken free-trial X2Text adapter (401 on first OCR). Hotfix only: Phase 2 (UN-3476) restructures the function so the atomic guarantee is reapplied around just the pure-DB writes, with HTTP and non-DB side effects moved outside the transaction. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
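The visibility problem behind this hotfix can be reproduced in miniature. The sketch below uses SQLite from the Python standard library purely as a stand-in for the production database, with a hypothetical `org` table; the second connection plays the role of the portal's separate session.

```python
import os
import sqlite3
import tempfile


def visibility_demo() -> tuple[int, int]:
    """Return row counts seen by a second session before and after commit."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        writer = sqlite3.connect(path)  # the session inside the atomic block
        reader = sqlite3.connect(path)  # a separate session, like the portal
        try:
            writer.execute("CREATE TABLE org (name TEXT)")
            writer.commit()
            # The INSERT opens a transaction that stays uncommitted for now.
            writer.execute("INSERT INTO org (name) VALUES ('new-org')")
            seen_mid = reader.execute("SELECT COUNT(*) FROM org").fetchall()[0][0]
            writer.commit()
            seen_after = reader.execute("SELECT COUNT(*) FROM org").fetchall()[0][0]
        finally:
            reader.close()
            writer.close()
    return seen_mid, seen_after


seen_mid, seen_after = visibility_demo()
```

The second session sees no row until the writer commits, which is why a remote call that depends on the new row has to run after the transaction rather than inside it.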

…ion-old-components # Conflicts: # prompt-service/uv.lock # tools/classifier/Dockerfile # tools/classifier/src/config/properties.json # tools/structure/Dockerfile # tools/structure/src/config/properties.json # tools/text_extractor/Dockerfile # tools/text_extractor/src/config/properties.json

chandrasekharan-zipstack · 2026-05-22T09:36:30Z

        description: "Tool to build"
        required: true
-        default: "tool-structure" # Provide a default value
+        default: "tool-sidecar" # Provide a default value


@harini-venkataraman tool sidecar can also be removed right? Essentially this workflow file itself

chandrasekharan-zipstack · 2026-05-22T09:36:56Z


            # Define services in order
-            for service in backend frontend platform-service prompt-service runner worker-unified x2text-service; do
+            for service in backend frontend platform-service runner worker-unified x2text-service; do


If we are removing the tools, we should remove the runner as well

The Phase 5 decommission commit removed classifier, structure, text_extractor, and prompt-service. However, text_extractor is still in active use by customers. This surgically restores only the text_extractor tool while keeping the other decommissions in place.
- Restore tools/text_extractor/ directory (14 files from origin/main)
- Add tool-text_extractor back to docker-compose.build.yaml
- Add tool-text-extractor back to docker-tools-build-push.yaml workflow

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-05-26T09:37:56Z

Test Results

Summary

✅ Runner Tests: 11 passed, 0 failed (11 total)
✅ SDK1 Tests: 338 passed, 0 failed (338 total)

Runner Tests - Full Report

filepath	function	passed	SUBTOTAL
runner/src/unstract/runner/clients/test_docker.py	test_logs	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup_skip	1	1
runner/src/unstract/runner/clients/test_docker.py	test_client_init	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_exists	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config_without_mount	1	1
runner/src/unstract/runner/clients/test_docker.py	test_run_container	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_for_sidecar	1	1
runner/src/unstract/runner/clients/test_docker.py	test_sidecar_container	1	1
TOTAL		11	11

SDK1 Tests - Full Report

sonarqubecloud · 2026-05-26T09:38:05Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/docker-tools-build-push.yaml:
- Around line 59-65: Replace direct inline checks of ${{
github.event.inputs.service_name }} with a single env variable (e.g.,
SERVICE_NAME) and a case whitelist that sets GITHUB_OUTPUT keys (context and
dockerfile) for known services ("tool-sidecar", "tool-text-extractor") and
otherwise prints an error and exits non‑zero to fail closed; update the branch
that currently echoes "context" and "dockerfile" to use the values chosen in the
case for SERVICE_NAME, and ensure unknown values trigger an explicit exit 1 so
$GITHUB_OUTPUT is never left unset for invalid inputs.

In `@docker/docker-compose.build.yaml`:
- Around line 33-37: The docker-compose service key tool-text_extractor
currently has an image name with an underscore
(unstract/tool-text_extractor:${VERSION}) which mismatches other places
expecting the hyphenated repo name; update the image: value for the
tool-text_extractor service to unstract/tool-text-extractor:${VERSION} (keep the
service key tool-text_extractor unchanged) so the locally built image name
matches the registry and CI naming used by run-platform.sh,
.github/workflows/docker-tools-build-push.yaml and public_tools.json.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 615c3169-ef09-4106-b415-f69172a4e10a

📥 Commits

Reviewing files that changed from the base of the PR and between 0619756 and 093a6b4.

📒 Files selected for processing (2)

.github/workflows/docker-tools-build-push.yaml
docker/docker-compose.build.yaml

coderabbitai · 2026-05-26T09:46:46Z

+          if [ "${{ github.event.inputs.service_name }}" == "tool-sidecar" ]; then
            echo "context=." >> $GITHUB_OUTPUT
-            echo "dockerfile=./tools/structure/Dockerfile" >> $GITHUB_OUTPUT
+            echo "dockerfile=docker/dockerfiles/tool-sidecar.Dockerfile" >> $GITHUB_OUTPUT
          elif [ "${{ github.event.inputs.service_name }}" == "tool-text-extractor" ]; then
            echo "context=." >> $GITHUB_OUTPUT
            echo "dockerfile=./tools/text_extractor/Dockerfile" >> $GITHUB_OUTPUT
-          elif [ "${{ github.event.inputs.service_name }}" == "tool-sidecar" ]; then
-            echo "context=." >> $GITHUB_OUTPUT
-            echo "dockerfile=docker/dockerfiles/tool-sidecar.Dockerfile" >> $GITHUB_OUTPUT
          fi


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify this workflow no longer embeds GitHub expression templates directly in shell conditionals.
rg -n -C2 '\$\{\{\s*github\.event\.inputs\.service_name\s*\}\}' .github/workflows/docker-tools-build-push.yaml
# Expected after fix: only appears in YAML fields (e.g., run-name/tags), not inside shell branching logic.

Repository: Zipstack/unstract

Length of output: 1382

🏁 Script executed:

rg -n -C2 '\$\{\{\s*github\.event\.inputs\.service_name\s*\}\}' .github/workflows/docker-tools-build-push.yaml
# Expected: no occurrences inside the bash branching logic; `${{ github.event.inputs.service_name }}` should be assigned to an env var or used in non-branching YAML fields.

Repository: Zipstack/unstract

Length of output: 1382

Harden service_name branching and fail closed for unknown values.

In .github/workflows/docker-tools-build-push.yaml (lines 59-64), bash if/elif branches embed ${{ github.event.inputs.service_name }} directly and there’s no default else/* to reject unsupported values, so the step can leave $GITHUB_OUTPUT unset for unexpected inputs. Use an env var + case whitelist with an explicit exit 1 for unknown values.

Suggested patch

       - name: Set build configuration
         id: build-config
+        env:
+          SERVICE_NAME: ${{ github.event.inputs.service_name }}
         run: |
-          if [ "${{ github.event.inputs.service_name }}" == "tool-sidecar" ]; then
-            echo "context=." >> $GITHUB_OUTPUT
-            echo "dockerfile=docker/dockerfiles/tool-sidecar.Dockerfile" >> $GITHUB_OUTPUT
-          elif [ "${{ github.event.inputs.service_name }}" == "tool-text-extractor" ]; then
-            echo "context=." >> $GITHUB_OUTPUT
-            echo "dockerfile=./tools/text_extractor/Dockerfile" >> $GITHUB_OUTPUT
-          fi
+          case "$SERVICE_NAME" in
+            tool-sidecar)
+              echo "context=." >> "$GITHUB_OUTPUT"
+              echo "dockerfile=docker/dockerfiles/tool-sidecar.Dockerfile" >> "$GITHUB_OUTPUT"
+              ;;
+            tool-text-extractor)
+              echo "context=." >> "$GITHUB_OUTPUT"
+              echo "dockerfile=./tools/text_extractor/Dockerfile" >> "$GITHUB_OUTPUT"
+              ;;
+            *)
+              echo "Unsupported service_name: $SERVICE_NAME" >&2
+              exit 1
+              ;;
+          esac
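The fail-closed allowlist idea in this suggestion is independent of shell. As a hedged illustration only, the same lookup could be expressed in Python as below; `BUILD_CONFIGS` and `resolve_build_config` are hypothetical names, and the two entries simply mirror the service names and paths quoted in the patch.

```python
# Known services and their build settings, copied from the suggested patch.
BUILD_CONFIGS: dict[str, dict[str, str]] = {
    "tool-sidecar": {
        "context": ".",
        "dockerfile": "docker/dockerfiles/tool-sidecar.Dockerfile",
    },
    "tool-text-extractor": {
        "context": ".",
        "dockerfile": "./tools/text_extractor/Dockerfile",
    },
}


def resolve_build_config(service_name: str) -> dict[str, str]:
    """Return build settings for a known service, or fail loudly."""
    try:
        # Return a copy so callers cannot alter the allowlist.
        return dict(BUILD_CONFIGS[service_name])
    except KeyError:
        # Unknown input is rejected instead of producing empty outputs.
        raise ValueError(f"Unsupported service_name: {service_name!r}") from None
```

The input is treated only as a lookup key and never interpolated into a command, which is the property the `env` plus `case` pattern gives the workflow step.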

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-          if [ "${{ github.event.inputs.service_name }}" == "tool-sidecar" ]; then
-            echo "context=." >> $GITHUB_OUTPUT
-            echo "dockerfile=./tools/structure/Dockerfile" >> $GITHUB_OUTPUT
-            echo "dockerfile=docker/dockerfiles/tool-sidecar.Dockerfile" >> $GITHUB_OUTPUT
-          elif [ "${{ github.event.inputs.service_name }}" == "tool-text-extractor" ]; then
-            echo "context=." >> $GITHUB_OUTPUT
-            echo "dockerfile=./tools/text_extractor/Dockerfile" >> $GITHUB_OUTPUT
-          elif [ "${{ github.event.inputs.service_name }}" == "tool-sidecar" ]; then
-            echo "context=." >> $GITHUB_OUTPUT
-            echo "dockerfile=docker/dockerfiles/tool-sidecar.Dockerfile" >> $GITHUB_OUTPUT
-          fi
+       - name: Set build configuration
+         id: build-config
+         env:
+           SERVICE_NAME: ${{ github.event.inputs.service_name }}
+         run: |
+          case "$SERVICE_NAME" in
+            tool-sidecar)
+              echo "context=." >> "$GITHUB_OUTPUT"
+              echo "dockerfile=docker/dockerfiles/tool-sidecar.Dockerfile" >> "$GITHUB_OUTPUT"
+              ;;
+            tool-text-extractor)
+              echo "context=." >> "$GITHUB_OUTPUT"
+              echo "dockerfile=./tools/text_extractor/Dockerfile" >> "$GITHUB_OUTPUT"
+              ;;
+            *)
+              echo "Unsupported service_name: $SERVICE_NAME" >&2
+              exit 1
+              ;;
+          esac

🧰 Tools

🪛 zizmor (1.25.2)

[error] 59-59: code injection via template expansion (template-injection): may expand into attacker-controllable code

(template-injection)

[error] 62-62: code injection via template expansion (template-injection): may expand into attacker-controllable code

(template-injection)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/docker-tools-build-push.yaml around lines 59 - 65, Replace direct inline checks of ${{ github.event.inputs.service_name }} with a single env variable (e.g., SERVICE_NAME) and a case whitelist that sets GITHUB_OUTPUT keys (context and dockerfile) for known services ("tool-sidecar", "tool-text-extractor") and otherwise prints an error and exits non‑zero to fail closed; update the branch that currently echoes "context" and "dockerfile" to use the values chosen in the case for SERVICE_NAME, and ensure unknown values trigger an explicit exit 1 so $GITHUB_OUTPUT is never left unset for invalid inputs.

coderabbitai · 2026-05-26T09:46:46Z

+  tool-text_extractor:
+    image: unstract/tool-text_extractor:${VERSION}
+    build:
+      dockerfile: tools/text_extractor/Dockerfile
+      context: ..


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inventory both naming variants to quantify blast radius before renaming.
rg -n -C2 'tool-text[-_]extractor|tool-text[_-]extractor|tool-text_extractor|tool-text-extractor'
# Expected: pick one canonical form and eliminate the other across workflows/compose/scripts/docs.

Repository: Zipstack/unstract

Length of output: 3034

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find all occurrences and show them grouped by file
rg -n --hidden --no-ignore-vcs 'tool-text[-_]extractor|tool-text_extractor' . \
  | sed -E 's/^(.*):([0-9]+):(.*)$/\1:\2:\3/' \
  | awk -F: '{file=$1":"$2; print}' >/dev/null

# Full match listing (limited context per match)
rg -n -C1 --hidden --no-ignore-vcs 'tool-text_extractor|tool-text-extractor' .

echo "---- public_tools.json image_name/image_url entries ----"
rg -n -C2 'tool-text[_-]extractor' unstract/tool-registry/tool_registry_config/public_tools.json

echo "---- docker-tools-build-push workflow ----"
rg -n -C2 'tool-text[_-]extractor' .github/workflows/docker-tools-build-push.yaml

echo "---- docker compose build yaml ----"
rg -n -C2 'tool-text[_-]extractor' docker/docker-compose.build.yaml

echo "---- run-platform.sh ignore_services / service wiring ----"
rg -n -C2 'tool-text[_-]extractor' run-platform.sh

Repository: Zipstack/unstract

Length of output: 6334

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for image/tag construction in the workflow
rg -n --hidden --no-ignore-vcs 'unstract/tool-[^"\s]+' .github/workflows/docker-tools-build-push.yaml

# Look for any usage of underscore variant outside compose
rg -n --hidden --no-ignore-vcs 'tool-text_extractor' . \
  | head -n 50

# Look for any usage of hyphen variant outside the workflow list
rg -n --hidden --no-ignore-vcs 'tool-text-extractor' . \
  | head -n 50

# Show the relevant portion of the workflow where image name is set (keep output small)
python3 - <<'PY'
import itertools, pathlib
p = pathlib.Path(".github/workflows/docker-tools-build-push.yaml")
lines = p.read_text().splitlines()
for i,l in enumerate(lines,1):
    if "service_name" in l or "image" in l or "docker build" in l or "DOCKER" in l:
        pass
# Print a window around likely blocks
keys = ["service_name", "image", "dockerfile=", "build-push", "docker/build-push-action", "repository", "tags"]
idxs=[]
for i,l in enumerate(lines,1):
    if any(k in l for k in keys):
        idxs.append(i)
if not idxs:
    idxs=[1]
start=max(1,min(idxs)-5)
end=min(len(lines),max(idxs)+30)
for i in range(start,end+1):
    print(f"{i:4d}:{lines[i-1]}")
PY

Repository: Zipstack/unstract

Length of output: 43

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Find where workflow sets service_name -> dockerfile/context (avoid exit 1 on no matches)
rg -n --hidden --no-ignore-vcs 'tool-text_extractor|tool-text-extractor|service_name|dockerfile|image|DOCKER' .github/workflows/docker-tools-build-push.yaml || true

echo "---- workflow full (first 120 lines) ----"
sed -n '1,120p' .github/workflows/docker-tools-build-push.yaml

echo "---- workflow remainder (next 120 lines, if any) ----"
sed -n '121,260p' .github/workflows/docker-tools-build-push.yaml || true

# 2) Check for underscore variant service names in other CI/scripts
echo "---- repo occurrences of tool-text_extractor (service naming) ----"
rg -n --hidden --no-ignore-vcs 'tool-text_extractor' . || true

echo "---- repo occurrences of tool-text-extractor (service naming) ----"
rg -n --hidden --no-ignore-vcs 'tool-text-extractor' . || true

Repository: Zipstack/unstract

Length of output: 5478

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- docker/docker-compose.build.yaml (lines 1-80) ----"
sed -n '1,80p' docker/docker-compose.build.yaml

echo "---- run-platform.sh (lines 300-380) ----"
sed -n '300,380p' run-platform.sh

echo "---- run-platform.sh: find where ignore_services are applied (small neighborhood) ----"
rg -n --no-ignore-vcs 'ignore_services=' run-platform.sh
rg -n --no-ignore-vcs 'ignore_services' run-platform.sh | head -n 20

Repository: Zipstack/unstract

Length of output: 3582

🏁 Script executed:

#!/bin/bash
set -euo pipefail
sed -n '160,230p' run-platform.sh

Repository: Zipstack/unstract

Length of output: 3384

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- Search compose.yaml for tool-text-extractor / tool-text_extractor ----"
rg -n --hidden --no-ignore-vcs 'tool-text-extractor|tool-text_extractor' docker/docker-compose*.yaml

echo "---- Search tool registry config (public_tools.json already saw hyphen) for underscore variant ----"
rg -n --hidden --no-ignore-vcs 'tool-text_extractor' unstract/tool-registry || true

echo "---- Search for underscore/hyphen image repo usage in general (limit output) ----"
rg -n --hidden --no-ignore-vcs 'unstract/tool-text_extractor|unstract/tool-text-extractor' . | head -n 50

Repository: Zipstack/unstract

Length of output: 811

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- docker/docker-compose.yaml tool image references ----"
rg -n --hidden --no-ignore-vcs 'tool-text-extractor|tool-text_extractor' docker/docker-compose.yaml docker/docker-compose*.yaml

echo "---- any other compose files beyond docker/docker-compose*.yaml ----"
fd -e yaml .github docker | rg -n --hidden --no-ignore-vcs 'tool-text-extractor|tool-text_extractor' || true

Repository: Zipstack/unstract

Length of output: 335

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check whether runtime references the public tool registry (likely tool image names)
rg -n --hidden --no-ignore-vcs 'public_tools\.json|tool_registry_config|tool registry|tool-registry|public_tools' . \
  | head -n 80

# Check for any other references to the image repo name patterns outside the compose build file
rg -n --hidden --no-ignore-vcs 'unstract/tool-(text[-_]extractor)' . \
  | head -n 80

# Check docker-compose.yaml for any service wiring that might reference tool image names indirectly
rg -n --hidden --no-ignore-vcs 'tool-(text[-_]extractor)|text_extractor|text-extractor|adapter|tool.*image|image_name|image_url' docker/docker-compose.yaml \
  || true

Repository: Zipstack/unstract

Length of output: 8928

🏁 Script executed:

rg -n --hidden --no-ignore-vcs 'tool-text[-_]extractor|unstract/tool-text[-_]extractor' docker docker/scripts .github/workflows unstract/tool-registry

Repository: Zipstack/unstract

Length of output: 684

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n --hidden --no-ignore-vcs 'tool-text_extractor' . && true

Repository: Zipstack/unstract

Length of output: 340

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Where tool registry fields image_name/image_url are used to run/pull containers
rg -n --hidden --no-ignore-vcs 'image_name|image_url' unstract backend workers docker | head -n 120

Repository: Zipstack/unstract

Length of output: 13915

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check whether env directories exist for both delimiters (what run-platform.sh expects)
ls -ld tool-text_extractor tool-text-extractor 2>/dev/null || true
find . -maxdepth 1 -type d -name 'tool-text*' -print

# Also verify whether compose service keys map to actual env sample.env paths
for d in tool-text_extractor tool-text-extractor; do
  if [ -d "$d" ]; then
    echo "---- $d contents ----"
    ls -la "$d" | head -n 50
    [ -f "$d/sample.env" ] && echo "has sample.env" || echo "missing sample.env"
  fi
done

Repository: Zipstack/unstract

Length of output: 43

🏁 Script executed:

rg -n --hidden --no-ignore-vcs 'tool-text[-_]extractor|unstract/tool-text[-_]extractor' docker .github/workflows unstract/tool-registry

Repository: Zipstack/unstract

Length of output: 684

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n --no-ignore-vcs 'build_services|build_services\s*\{' run-platform.sh
sed -n '230,320p' run-platform.sh

Repository: Zipstack/unstract

Length of output: 4093

Fix tool-text-extractor image repository name mismatch (tool-text_extractor vs tool-text-extractor)

docker/docker-compose.build.yaml builds unstract/tool-text_extractor:${VERSION} (underscore in the image repo name).

.github/workflows/docker-tools-build-push.yaml builds/pushes unstract/tool-text-extractor:${{ inputs.tag }} (hyphen).

unstract/tool-registry/tool_registry_config/public_tools.json publishes/uses unstract/tool-text-extractor.

run-platform.sh’s --build-local path will therefore build a differently named image than the one the platform/tool registry expects. Update docker/docker-compose.build.yaml to use the hyphenated image repo name in the image: field (leave the compose service key tool-text_extractor unchanged since run-platform.sh references it).
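A naming drift like this one could be caught mechanically. The sketch below is only an illustration of such a check; `image_repository` and `find_mismatched_sources` are hypothetical helpers, and the image references are the ones quoted in the finding above.

```python
def image_repository(image_ref: str) -> str:
    """Strip a trailing tag from an image reference, if present."""
    name, sep, tag = image_ref.rpartition(":")
    # A colon followed by a slash belongs to a registry host:port, not a tag.
    if sep and "/" not in tag:
        return name
    return image_ref


def find_mismatched_sources(expected_repo: str,
                            image_refs: dict[str, str]) -> dict[str, str]:
    """Return the sources whose image repository differs from the expected one."""
    return {
        source: ref
        for source, ref in image_refs.items()
        if image_repository(ref) != expected_repo
    }


# References as quoted in the review comment.
refs = {
    "docker/docker-compose.build.yaml": "unstract/tool-text_extractor:${VERSION}",
    ".github/workflows/docker-tools-build-push.yaml": "unstract/tool-text-extractor:${{ inputs.tag }}",
    "public_tools.json": "unstract/tool-text-extractor",
}
mismatches = find_mismatched_sources("unstract/tool-text-extractor", refs)
```

Here only the compose build file is reported, matching the reviewer's conclusion that its `image:` value is the one to change.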

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docker/docker-compose.build.yaml` around lines 33 - 37, The docker-compose service key tool-text_extractor currently has an image name with an underscore (unstract/tool-text_extractor:${VERSION}) which mismatches other places expecting the hyphenated repo name; update the image: value for the tool-text_extractor service to unstract/tool-text-extractor:${VERSION} (keep the service key tool-text_extractor unchanged) so the locally built image name matches the registry and CI naming used by run-platform.sh, .github/workflows/docker-tools-build-push.yaml and public_tools.json.

harini-venkataraman and others added 2 commits May 19, 2026 16:39

[MISC] Remove prompt-service from tox.ini env_list

2e1bc54

The prompt-service directory was deleted in the prior commit but tox.ini still referenced it, which would break CI test runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

7bdff5a

for more information, see https://pre-commit.ci

coderabbitai Bot reviewed May 20, 2026

Comment thread docker/scripts/uv-lock-gen/README.md

harini-venkataraman changed the title ~~[MISC] Phase 5: Decommission prompt-service, old tools, SDK1 prompt module~~ [MISC] Decommission prompt-service, old tools, SDK1 prompt module May 20, 2026

pk-zipstack and others added 7 commits May 21, 2026 21:46

Deepak-Kesavan approved these changes May 21, 2026

harini-venkataraman requested a review from chandrasekharan-zipstack May 22, 2026 07:08

chandrasekharan-zipstack reviewed May 22, 2026

coderabbitai Bot reviewed May 26, 2026

[MISC] Decommission prompt-service, old tools, SDK1 prompt module#1978
harini-venkataraman wants to merge 11 commits into main from feat/phase5-decommission-old-components

6 participants

-          if [ "${{ github.event.inputs.service_name }}" == "tool-sidecar" ]; then
-            echo "context=." >> $GITHUB_OUTPUT
-            echo "dockerfile=./tools/structure/Dockerfile" >> $GITHUB_OUTPUT
-            echo "dockerfile=docker/dockerfiles/tool-sidecar.Dockerfile" >> $GITHUB_OUTPUT
-          elif [ "${{ github.event.inputs.service_name }}" == "tool-text-extractor" ]; then
-            echo "context=." >> $GITHUB_OUTPUT
-            echo "dockerfile=./tools/text_extractor/Dockerfile" >> $GITHUB_OUTPUT
-          elif [ "${{ github.event.inputs.service_name }}" == "tool-sidecar" ]; then
-            echo "context=." >> $GITHUB_OUTPUT
-            echo "dockerfile=docker/dockerfiles/tool-sidecar.Dockerfile" >> $GITHUB_OUTPUT
-          fi
+       - name: Set build configuration
+         id: build-config
+         env:
+           SERVICE_NAME: ${{ github.event.inputs.service_name }}
+         run: |
+          case "$SERVICE_NAME" in
+            tool-sidecar)
+              echo "context=." >> "$GITHUB_OUTPUT"
+              echo "dockerfile=docker/dockerfiles/tool-sidecar.Dockerfile" >> "$GITHUB_OUTPUT"
+              ;;
+            tool-text-extractor)
+              echo "context=." >> "$GITHUB_OUTPUT"
+              echo "dockerfile=./tools/text_extractor/Dockerfile" >> "$GITHUB_OUTPUT"
+              ;;
+            *)
+              echo "Unsupported service_name: $SERVICE_NAME" >&2
+              exit 1
+              ;;
+          esac
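
The `case`-based dispatch in the suggestion above can be tried outside GitHub Actions as a plain shell function. The sketch below is illustrative only: `resolve_build_config` is a hypothetical helper name, the service names and Dockerfile paths are copied from the suggested change, and the temp-file fallback for `GITHUB_OUTPUT` is an assumption for local runs.

```shell
#!/usr/bin/env bash
set -euo pipefail

# resolve_build_config is a hypothetical helper name; the service names and
# Dockerfile paths are copied from the suggested change.
resolve_build_config() {
  local service_name="$1"
  case "$service_name" in
    tool-sidecar)
      echo "context=."
      echo "dockerfile=docker/dockerfiles/tool-sidecar.Dockerfile"
      ;;
    tool-text-extractor)
      echo "context=."
      echo "dockerfile=./tools/text_extractor/Dockerfile"
      ;;
    *)
      # Unknown names fail loudly instead of leaving the outputs unset.
      echo "Unsupported service_name: $service_name" >&2
      return 1
      ;;
  esac
}

# GITHUB_OUTPUT is provided by GitHub Actions; fall back to a temp file locally.
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"
resolve_build_config "tool-sidecar" >> "$GITHUB_OUTPUT"
cat "$GITHUB_OUTPUT"
```

Compared with the `if`/`elif` chain it replaces, the default `*)` branch makes an unsupported `service_name` stop the step immediately, and passing the input through an environment variable avoids interpolating `${{ ... }}` directly into the script.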

harini-venkataraman commented May 20, 2026 • edited

What

Why

How

Safety — preserved items

Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

Relevant Docs

Related Issues or PRs

Dependencies Versions / Env Variables

Notes on Testing

Screenshots

Checklist

coderabbitai Bot commented May 20, 2026 • edited

Summary by CodeRabbit

Walkthrough

Changes

Estimated code review effort

greptile-apps Bot commented May 20, 2026 • edited

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Flowchart

coderabbitai Bot left a comment

chandrasekharan-zipstack May 22, 2026

chandrasekharan-zipstack May 22, 2026

github-actions Bot commented May 26, 2026

Test Results

sonarqubecloud Bot commented May 26, 2026

Quality Gate passed

coderabbitai Bot left a comment

coderabbitai Bot May 26, 2026

coderabbitai Bot May 26, 2026
