diff --git a/README.md b/README.md index e0c73be..0194925 100644 --- a/README.md +++ b/README.md @@ -33,23 +33,25 @@ You can then update configs in the `.env` files to customize your configurations ## Bootstrap & development mode -You have two options to start this dockerized setup, depending on whether you want to reset the database: -### Option A: Run migrations & seed data (will reset DB) +You have two options to start this dockerized setup: +### Option A: Run migrations & seed data Use the prestart profile to automatically run database migrations and seed data. -This profile also resets the database, so use it only when you want a fresh start. +This does **not** reset/drop the database. ```bash -docker compose --profile prestart up +docker compose --profile prestart up prestart ``` -### Option B: Start normally without resetting DB +### Option B: Start normally -If you don't want to reset the database, start the project directly: +Start the project directly: ```bash docker compose watch ``` This will start all services in watch mode for development — ideal for local iterations. +Backend is exposed at `http://localhost:8001` (container port `8000` mapped to host `8001`). + ### Rebuilding Images ```bash @@ -77,7 +79,7 @@ This includes using Docker Compose, custom local domains, `.env` configurations, ## Release Notes -Check the file [release-notes.md](./release-notes.md). +Release notes file is not currently maintained in this repository. ## Credits diff --git a/backend/README.md b/backend/README.md index 1bb528b..bf1f38a 100644 --- a/backend/README.md +++ b/backend/README.md @@ -89,7 +89,7 @@ To test the backend run: $ bash ./scripts/test.sh ``` -The tests run with Pytest, modify and add tests to `./backend/tests/`. +The tests run with Pytest, modify and add tests to `./backend/app/tests/`. If you use GitHub Actions the tests will run automatically. 
@@ -140,6 +140,16 @@ predictions.csv contains original text, anonymized output, ground-truth masked t metrics.json contains entity-level precision, recall, and F1 per PII type. +## Validator configuration guide + +Detailed validator configuration reference: +`backend/app/core/validators/README.md` + +## API usage guide + +Detailed API usage and end-to-end request examples: +`backend/app/api/API_USAGE.md` + ### Test running stack If your stack is already up and you just want to run the tests, you can use: @@ -172,7 +182,7 @@ Make sure you create a "revision" of your models and that you "upgrade" your dat $ docker compose exec backend bash ``` -* Alembic is already configured to import your SQLModel models from `./backend/app/models.py`. +* Alembic is configured with SQLModel models under `./backend/app/models/`. * After changing a model (for example, adding a column), inside the container, create a revision, e.g.: @@ -216,6 +226,20 @@ echo -n "your-plain-text-token" | shasum -a 256 Set the resulting digest as `AUTH_TOKEN` in your `.env` / `.env.test`. +## Multi-tenant API Key Configuration + +Ban List APIs use `X-API-KEY` auth instead of bearer token auth. + +Required environment variables: +- `KAAPI_AUTH_URL`: Base URL of the Kaapi auth service used to verify API keys. +- `KAAPI_AUTH_TIMEOUT`: Timeout in seconds for auth verification calls. + +At runtime, the backend calls: +- `GET {KAAPI_AUTH_URL}/apikeys/verify` +- Header: `X-API-KEY: ApiKey ` + +If verification succeeds, tenant's scope (`organization_id`, `project_id`) is resolved from the auth response and applied to Ban List CRUD operations. + ## Guardrails AI Setup 1. Ensure that the .env file contains the correct value from `GUARDRAILS_HUB_API_KEY`. The key can be fetched from [here](https://hub.guardrailsai.com/keys). 
@@ -244,7 +268,7 @@ Enter API Key below leave empty if you want to keep existing token [HBPo] ``` To install any validator from Guardrails Hub: -``` +```bash guardrails hub install hub://guardrails/ Example - @@ -254,13 +278,13 @@ guardrails hub install hub://guardrails/ban_list ## Adding a new validator from Guardrails Hub To add a new validator from the Guardrails Hub to this project, follow the steps below. -1. In the `backend/app/models` folder, create a new Python file called `_safety_validator_config.py`. Add the following code there: +1. In the `backend/app/core/validators/config` folder, create a new Python file called `_safety_validator_config.py`. Add the following code there: -``` +```python from guardrails.hub import # validator name from Guardrails Hub from typing import List, Literal -from app.models.base_validator_config import BaseValidatorConfig +from app.core.validators.config.base_validator_config import BaseValidatorConfig class SafetyValidatorConfig(BaseValidatorConfig): type: Literal[""] @@ -272,11 +296,11 @@ class SafetyValidatorConfig(BaseValidatorConfig): For example, this is the code for [BanList validator](https://guardrailsai.com/hub/validator/guardrails/ban_list). -``` +```python from guardrails.hub import BanList from typing import List, Literal -from app.models.base_validator_config import BaseValidatorConfig +from app.core.validators.config.base_validator_config import BaseValidatorConfig class BanListSafetyValidatorConfig(BaseValidatorConfig): @@ -291,14 +315,14 @@ class BanListSafetyValidatorConfig(BaseValidatorConfig): ``` -2. In `backend/app/guardrail_config.py`, add the newly created config class to `ValidatorConfigItem`. +2. In `backend/app/schemas/guardrail_config.py`, add the newly created config class to `ValidatorConfigItem`. ## How to add custom validators? To add a custom validator to this project, follow the steps below. 1. Create the custom validator class. 
Take a look at the `backend/app/core/validators/gender_assumption_bias.py` as an example. Each custom validator should contain an `__init__` and `_validator` method. For example, -``` +```python from guardrails import OnFailAction from guardrails.validators import ( FailResult, @@ -324,12 +348,12 @@ class (Validator): # add logic for validation ``` -2. In the `backend/app/models` folder, create a new Python file called `_safety_validator_config.py`. Add the following code there: +2. In the `backend/app/core/validators/config` folder, create a new Python file called `_safety_validator_config.py`. Add the following code there: -``` +```python from typing import List, Literal -from app.models.base_validator_config import BaseValidatorConfig +from app.core.validators.config.base_validator_config import BaseValidatorConfig class SafetyValidatorConfig(BaseValidatorConfig): type: Literal[""] @@ -341,9 +365,9 @@ class SafetyValidatorConfig(BaseValidatorConfig): For example, this is the code for GenderAssumptionBias validator. -``` +```python from typing import ClassVar, List, Literal, Optional -from app.models.base_validator_config import BaseValidatorConfig +from app.core.validators.config.base_validator_config import BaseValidatorConfig from app.core.enum import BiasCategories from app.core.validators.gender_assumption_bias import GenderAssumptionBias @@ -358,4 +382,4 @@ class GenderAssumptionBiasSafetyValidatorConfig(BaseValidatorConfig): ) ``` -2. In `backend/app/guardrail_config.py`, add the newly created config class to `ValidatorConfigItem`. +3. In `backend/app/schemas/guardrail_config.py`, add the newly created config class to `ValidatorConfigItem`. 
diff --git a/backend/app/api/API_USAGE.md b/backend/app/api/API_USAGE.md new file mode 100644 index 0000000..5e0b0a3 --- /dev/null +++ b/backend/app/api/API_USAGE.md @@ -0,0 +1,364 @@ +# API Usage Guide + +This guide explains how to use the current API surface for: +- Health checks +- Validator configuration CRUD +- Runtime validator discovery +- Guardrail execution +- Ban list CRUD for multi-tenant projects + +## Base URL and Version + +All routes are mounted under: +- `/api/v1` + +Example local base URL: +- `http://localhost:8001/api/v1` + +## Authentication + +This API currently uses two auth modes: + +1. Bearer token auth (`Authorization: Bearer `) + - Used by validator config and guardrails endpoints. + - The server validates your plaintext bearer token against a SHA-256 digest stored in `AUTH_TOKEN`. +2. Multi-tenant API key auth (`X-API-KEY: `) + - Used by ban list endpoints. + - The API key is verified against `KAAPI_AUTH_URL` and resolves the tenant's scope (`organization_id`, `project_id`). + +Notes: +- `GET /utils/health-check/` is public. + +## Response Shape + +Successful API responses use the following wrapper (the health check and runtime validator discovery endpoints return bare payloads instead): + +```json +{ + "success": true, + "data": {}, + "error": null, + "metadata": null +} +``` + +Failure responses return `success: false` and an `error` message. 
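As a rough illustration of the response shape above, a client might unwrap responses as follows. This is a minimal sketch and not part of the codebase: `unwrap` and `ApiError` are hypothetical names, and the only thing assumed about the wrapper is the four keys shown above (`success`, `data`, `error`, `metadata`).

```python
import json
from typing import Any


class ApiError(Exception):
    """Hypothetical client-side error raised when `success` is false."""

    def __init__(self, message: str, data: Any = None) -> None:
        super().__init__(message)
        # Keep `data` around: failure responses may still carry a payload.
        self.data = data


def unwrap(body: str) -> Any:
    """Return `data` from a wrapped response, or raise ApiError on failure."""
    payload = json.loads(body)
    # Bare payloads (anything without a `success` key) are returned unchanged.
    if not isinstance(payload, dict) or "success" not in payload:
        return payload
    if not payload["success"]:
        raise ApiError(payload.get("error") or "unknown error", payload.get("data"))
    return payload.get("data")
```

Keeping `data` on the exception is a design choice for this sketch, so that a caller can still inspect whatever payload accompanies a failure.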
+ +## 1) Health Check + +Endpoint: +- `GET /api/v1/utils/health-check/` + +Example: + +```bash +curl -X GET "http://localhost:8001/api/v1/utils/health-check/" +``` + +Response: + +```json +true +``` + +## 2) Validator Config APIs + +These endpoints manage persisted validator configs scoped by: +- `organization_id` +- `project_id` + +Base path: +- `/api/v1/guardrails/validators/configs` + +## 2.1 Create validator config + +Endpoint: +- `POST /api/v1/guardrails/validators/configs/?organization_id=1&project_id=101` + +Example (PII input validator): + +```bash +curl -X POST "http://localhost:8001/api/v1/guardrails/validators/configs/?organization_id=1&project_id=101" \ + -H "Authorization: Bearer " \ + -H "Content-Type: application/json" \ + -d '{ + "type": "pii_remover", + "stage": "input", + "on_fail_action": "fix", + "is_enabled": true, + "entity_types": ["PERSON", "PHONE_NUMBER", "IN_AADHAAR"], + "threshold": 0.6 + }' +``` + +## 2.2 List validator configs + +Endpoint: +- `GET /api/v1/guardrails/validators/configs/?organization_id=1&project_id=101` + +Optional filters: +- `ids=&ids=` +- `stage=input|output` +- `type=uli_slur_match|pii_remover|gender_assumption_bias|ban_list` + +Example: + +```bash +curl -X GET "http://localhost:8001/api/v1/guardrails/validators/configs/?organization_id=1&project_id=101&stage=input" \ + -H "Authorization: Bearer " +``` + +## 2.3 Get validator config by id + +Endpoint: +- `GET /api/v1/guardrails/validators/configs/{id}?organization_id=1&project_id=101` + +Example: + +```bash +curl -X GET "http://localhost:8001/api/v1/guardrails/validators/configs/?organization_id=1&project_id=101" \ + -H "Authorization: Bearer " +``` + +## 2.4 Update validator config + +Endpoint: +- `PATCH /api/v1/guardrails/validators/configs/{id}?organization_id=1&project_id=101` + +Example: + +```bash +curl -X PATCH "http://localhost:8001/api/v1/guardrails/validators/configs/?organization_id=1&project_id=101" \ + -H "Authorization: Bearer " \ + -H "Content-Type: 
application/json" \ + -d '{ + "is_enabled": false, + "threshold": 0.7 + }' +``` + +## 2.5 Delete validator config + +Endpoint: +- `DELETE /api/v1/guardrails/validators/configs/{id}?organization_id=1&project_id=101` + +Example: + +```bash +curl -X DELETE "http://localhost:8001/api/v1/guardrails/validators/configs/?organization_id=1&project_id=101" \ + -H "Authorization: Bearer " +``` + +## 3) Runtime Validator Discovery + +Endpoint: +- `GET /api/v1/guardrails/` + +Purpose: +- Returns all runtime validator `type` values and their JSON schemas. + +Example: + +```bash +curl -X GET "http://localhost:8001/api/v1/guardrails/" \ + -H "Authorization: Bearer " +``` + +## 4) Guardrail Execution + +Endpoint: +- `POST /api/v1/guardrails/` + +Query params: +- `suppress_pass_logs=true|false` (default `true`) + +Request fields: +- `request_id` (UUID string) +- `organization_id` (int) +- `project_id` (int) +- `input` (text to validate) +- `validators` (runtime validator configs) + +Important: +- Runtime validators use `on_fail`. +- If you pass objects from config APIs, server normalization supports `on_fail_action` and strips non-runtime fields. 
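The normalization described in the note above happens on the server, but the same mapping can be sketched client-side to show what a runtime validator object looks like. This is an illustrative assumption, not the server's implementation: `to_runtime_validator` and `NON_RUNTIME_FIELDS` are hypothetical names, and the timestamp field names (`created_at`, `updated_at`) are guesses.

```python
from typing import Any

# Assumed set of stored-only fields. The timestamp names are guesses;
# check the actual config API response before relying on them.
NON_RUNTIME_FIELDS = {
    "id",
    "stage",
    "is_enabled",
    "organization_id",
    "project_id",
    "created_at",
    "updated_at",
}


def to_runtime_validator(stored: dict[str, Any]) -> dict[str, Any]:
    """Convert a stored validator config into a runtime-style object."""
    # Drop metadata that only matters to the config CRUD layer.
    runtime = {k: v for k, v in stored.items() if k not in NON_RUNTIME_FIELDS}
    # Rename `on_fail_action` to `on_fail`, keeping an explicit `on_fail` if present.
    if "on_fail_action" in runtime:
        action = runtime.pop("on_fail_action")
        runtime.setdefault("on_fail", action)
    return runtime
```

The function returns a new dictionary and leaves the stored config untouched, so the same stored object can be reused across requests.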
+ +Example: + +```bash +curl -X POST "http://localhost:8001/api/v1/guardrails/?suppress_pass_logs=true" \ + -H "Authorization: Bearer " \ + -H "Content-Type: application/json" \ + -d '{ + "request_id": "2a6f6d5c-5b9f-4f6b-92e4-cf7d67f87932", + "organization_id": 1, + "project_id": 101, + "input": "Amit Gupta phone number is 919611188278", + "validators": [ + { + "type": "pii_remover", + "on_fail": "fix", + "entity_types": ["PERSON", "PHONE_NUMBER"], + "threshold": 0.5 + }, + { + "type": "uli_slur_match", + "on_fail": "fix", + "languages": ["en", "hi"], + "severity": "all" + } + ] + }' +``` + +Possible success response: + +```json +{ + "success": true, + "data": { + "response_id": "d676f841-4579-4b73-bf8f-fe968af842f1", + "rephrase_needed": false, + "safe_text": "[REDACTED_PERSON_1] phone number is [REDACTED_PHONE_NUMBER_1]" + }, + "error": null, + "metadata": null +} +``` + +Possible failure response: + +```json +{ + "success": false, + "data": { + "response_id": "2f87665c-3e0f-4ea7-8d7d-2f97dfe8ec98", + "rephrase_needed": true, + "safe_text": "Please rephrase the query without unsafe content...." + }, + "error": "Validation failed", + "metadata": null +} +``` + +## 5) Ban List APIs (multi-tenant) + +These endpoints manage tenant-scoped ban lists and use `X-API-KEY` auth. 
+ +Base path: +- `/api/v1/guardrails/ban_lists` + +## 5.1 Create ban list + +Endpoint: +- `POST /api/v1/guardrails/ban_lists/` + +Example: + +```bash +curl -X POST "http://localhost:8001/api/v1/guardrails/ban_lists/" \ + -H "X-API-KEY: " \ + -H "Content-Type: application/json" \ + -d '{ + "name": "Safety Banned Terms", + "description": "Terms not allowed for this tenant policy", + "domain": "abuse", + "is_public": false, + "banned_words": ["slur_a", "slur_b"] + }' +``` + +## 5.2 List ban lists + +Endpoint: +- `GET /api/v1/guardrails/ban_lists/?domain=abuse&offset=0&limit=20` + +Example: + +```bash +curl -X GET "http://localhost:8001/api/v1/guardrails/ban_lists/?offset=0&limit=20" \ + -H "X-API-KEY: " +``` + +## 5.3 Get ban list by id + +Endpoint: +- `GET /api/v1/guardrails/ban_lists/{id}` + +Example: + +```bash +curl -X GET "http://localhost:8001/api/v1/guardrails/ban_lists/" \ + -H "X-API-KEY: " +``` + +## 5.4 Update ban list + +Endpoint: +- `PATCH /api/v1/guardrails/ban_lists/{id}` + +Example: + +```bash +curl -X PATCH "http://localhost:8001/api/v1/guardrails/ban_lists/" \ + -H "X-API-KEY: " \ + -H "Content-Type: application/json" \ + -d '{ + "description": "Updated description", + "banned_words": ["slur_a", "slur_b", "slur_c"] + }' +``` + +## 5.5 Delete ban list + +Endpoint: +- `DELETE /api/v1/guardrails/ban_lists/{id}` + +Example: + +```bash +curl -X DELETE "http://localhost:8001/api/v1/guardrails/ban_lists/" \ + -H "X-API-KEY: " +``` + +## 6) End-to-End Usage Pattern + +Recommended request flow: +1. Create/update validator configs via `/guardrails/validators/configs`. +2. List configs and select active validators for a project. +3. Send selected validators in `POST /guardrails/`. +4. Use `safe_text` as downstream text. +5. If `rephrase_needed=true`, ask user to rephrase. +6. For `ban_list` validators without inline `banned_words`, create/manage a ban list first and pass `ban_list_id`. 
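Steps 2 to 5 of the flow above can be sketched as two small helpers. This is a hedged illustration with no network calls: `build_request` and `interpret` are hypothetical names, the request fields are the ones listed under Guardrail Execution, and passing stored configs straight through relies on the server-side normalization described there.

```python
import uuid
from typing import Any


def build_request(
    text: str,
    organization_id: int,
    project_id: int,
    configs: list[dict[str, Any]],
) -> dict[str, Any]:
    """Build a POST /guardrails/ body from configs returned by the list API."""
    # Step 2: keep only enabled validators (`is_enabled` defaults to true).
    active = [c for c in configs if c.get("is_enabled", True)]
    # Step 3: assemble the request with a fresh UUID as `request_id`.
    return {
        "request_id": str(uuid.uuid4()),
        "organization_id": organization_id,
        "project_id": project_id,
        "input": text,
        "validators": active,
    }


def interpret(response: dict[str, Any]) -> tuple[str, str]:
    """Map a guardrail response to an (action, text) pair for the caller."""
    data = response.get("data") or {}
    # Step 5: a rephrase request takes priority; `safe_text` holds the prompt.
    if data.get("rephrase_needed"):
        return ("ask_user", data.get("safe_text", ""))
    # Step 4: on success, `safe_text` is the text to pass downstream.
    if response.get("success"):
        return ("forward", data["safe_text"])
    return ("error", response.get("error") or "")
```

Returning an action label rather than raising keeps the rephrase case, which arrives with `success: false`, distinct from genuine errors.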
+ +## 7) Common Errors + +- `401 Missing Authorization header` + - Add `Authorization: Bearer `. +- `401 Invalid authorization token` + - Verify plaintext token matches server-side hash. +- `401 Missing X-API-KEY header` + - Add `X-API-KEY: ` for ban list endpoints. +- `401 Invalid API key` + - Verify the API key is valid in the upstream Kaapi auth service. +- `Invalid request_id` + - Ensure `request_id` is a valid UUID string. +- `Validator already exists for this type and stage` + - Type+stage is unique per organization/project scope. +- `Validator not found` + - Confirm `id`, `organization_id`, and `project_id` match. + +## 8) Current Validator Types + +From `validators.json`: +- `uli_slur_match` +- `pii_remover` +- `gender_assumption_bias` +- `ban_list` + +Source of truth: +- `backend/app/core/validators/validators.json` +- `GET /api/v1/guardrails/` (runtime-discovered schemas/types) + +See detailed configuration notes in: +- `backend/app/core/validators/README.md` diff --git a/backend/app/api/docs/ban_lists/create_ban_list.md b/backend/app/api/docs/ban_lists/create_ban_list.md new file mode 100644 index 0000000..64febf8 --- /dev/null +++ b/backend/app/api/docs/ban_lists/create_ban_list.md @@ -0,0 +1,9 @@ +Creates a ban list for the tenant resolved from `X-API-KEY`. + +Behavior notes: +- Stores a domain-scoped list of banned words used by the `ban_list` validator. +- `is_public` defaults to `false` when omitted. + +Common failure cases: +- Missing or invalid API key. +- Payload schema validation errors. diff --git a/backend/app/api/docs/ban_lists/delete_ban_list.md b/backend/app/api/docs/ban_lists/delete_ban_list.md new file mode 100644 index 0000000..889ccca --- /dev/null +++ b/backend/app/api/docs/ban_lists/delete_ban_list.md @@ -0,0 +1,9 @@ +Deletes a ban list by id for the tenant resolved from `X-API-KEY`. + +Behavior notes: +- Deletion is restricted to owner scope. +- Tenant's scope is enforced from the API key context. 
+ +Common failure cases: +- Missing or invalid API key. +- Ban list not found in tenant's scope. diff --git a/backend/app/api/docs/ban_lists/get_ban_list.md b/backend/app/api/docs/ban_lists/get_ban_list.md new file mode 100644 index 0000000..a40fa79 --- /dev/null +++ b/backend/app/api/docs/ban_lists/get_ban_list.md @@ -0,0 +1,9 @@ +Fetches a single ban list by id for the tenant resolved from `X-API-KEY`. + +Behavior notes: +- Tenant's scope is enforced from the API key context. + +Common failure cases: +- Missing or invalid API key. +- Ban list not found in tenant's scope. +- Invalid id format. diff --git a/backend/app/api/docs/ban_lists/list_ban_lists.md b/backend/app/api/docs/ban_lists/list_ban_lists.md new file mode 100644 index 0000000..59e446a --- /dev/null +++ b/backend/app/api/docs/ban_lists/list_ban_lists.md @@ -0,0 +1,11 @@ +Lists ban lists for the tenant resolved from `X-API-KEY`. + +Behavior notes: +- Supports filtering by `domain`. +- Supports pagination via `offset` and `limit`. +- `offset` defaults to `0`. +- `limit` is optional; when omitted, no limit is applied. + +Common failure cases: +- Missing or invalid API key. +- Invalid filter/pagination values. diff --git a/backend/app/api/docs/ban_lists/update_ban_list.md b/backend/app/api/docs/ban_lists/update_ban_list.md new file mode 100644 index 0000000..90c8786 --- /dev/null +++ b/backend/app/api/docs/ban_lists/update_ban_list.md @@ -0,0 +1,10 @@ +Partially updates a ban list by id for the tenant resolved from `X-API-KEY`. + +Behavior notes: +- Supports patch-style updates; omitted fields remain unchanged. +- Tenant's scope is enforced from the API key context. + +Common failure cases: +- Missing or invalid API key. +- Ban list not found in tenant's scope. +- Payload schema validation errors. 
diff --git a/backend/app/api/docs/guardrails/list_validators.md b/backend/app/api/docs/guardrails/list_validators.md new file mode 100644 index 0000000..1a48cc1 --- /dev/null +++ b/backend/app/api/docs/guardrails/list_validators.md @@ -0,0 +1,10 @@ +Lists all available runtime validators and their JSON schemas. + +Use this endpoint to discover supported validator `type` values and validator-specific config schema before calling guardrail execution. + +Behavior notes: +- Success payload is a plain object (`{"validators": [...]}`), not the `APIResponse` wrapper. +- Validator entries include `type` (runtime identifier) and `config` (validator JSON schema). + +Common failure cases: +- Internal schema extraction/parsing error for a validator model. diff --git a/backend/app/api/docs/guardrails/run_guardrails.md b/backend/app/api/docs/guardrails/run_guardrails.md new file mode 100644 index 0000000..bd8b9e0 --- /dev/null +++ b/backend/app/api/docs/guardrails/run_guardrails.md @@ -0,0 +1,17 @@ +Runs guardrails on input text with a selected list of validators. + +Behavior notes: +- Runtime validator format uses `on_fail`; config-style payloads with `on_fail_action` are accepted and normalized. +- `suppress_pass_logs=true` skips persisting pass-case validator logs. +- The endpoint always saves a `request_log` entry for the run. +- Validator logs are also saved; with `suppress_pass_logs=true`, only fail-case validator logs are persisted. Otherwise, all validator logs are added. +- `rephrase_needed=true` means the system could not safely auto-fix the input/output and wants the user to retry with a rephrased query. +- When `rephrase_needed=true`, `safe_text` contains the rephrase prompt shown to the user. + +Failure behavior: +- `success=false` is returned when validation fails without a recoverable fix or an internal runtime error occurs. +- Common failures include invalid `request_id` format and validator errors without fallback output. 
+ +Side effects: +- Saves/updates `request_log` with request context and final response status/text. +- Saves `validator_log` entries for executed validators based on `suppress_pass_logs`. diff --git a/backend/app/api/docs/utils/health_check.md b/backend/app/api/docs/utils/health_check.md new file mode 100644 index 0000000..32317f9 --- /dev/null +++ b/backend/app/api/docs/utils/health_check.md @@ -0,0 +1,3 @@ +Service liveness probe endpoint. + +Use this endpoint to verify the API process is reachable. \ No newline at end of file diff --git a/backend/app/api/docs/validator_configs/create_validator.md b/backend/app/api/docs/validator_configs/create_validator.md new file mode 100644 index 0000000..fc2ff18 --- /dev/null +++ b/backend/app/api/docs/validator_configs/create_validator.md @@ -0,0 +1,12 @@ +Creates a validator configuration within an organization/project scope. + +The record stores base validator metadata and validator-specific config fields in one object, then returns a flattened response shape. + +Behavior notes: +- `on_fail_action` defaults to `fix` when omitted. +- `is_enabled` defaults to `true` when omitted. +- Uniqueness is enforced per `(organization_id, project_id, type, stage)`. + +Common failure cases: +- Duplicate validator for the same `(organization_id, project_id, type, stage)` combination. +- Schema/enum validation errors in validator-specific config. diff --git a/backend/app/api/docs/validator_configs/delete_validator.md b/backend/app/api/docs/validator_configs/delete_validator.md new file mode 100644 index 0000000..26b0159 --- /dev/null +++ b/backend/app/api/docs/validator_configs/delete_validator.md @@ -0,0 +1,7 @@ +Deletes a validator configuration by id within an organization/project scope. + +Behavior notes: +- Deletion is scope-aware; the id must belong to the provided organization/project. + +Common failure cases: +- Validator not found for provided scope. 
diff --git a/backend/app/api/docs/validator_configs/get_validator.md b/backend/app/api/docs/validator_configs/get_validator.md new file mode 100644 index 0000000..b795982 --- /dev/null +++ b/backend/app/api/docs/validator_configs/get_validator.md @@ -0,0 +1,10 @@ +Fetches a single validator configuration by id within an organization/project scope. + +Response data is flattened and includes both base validator fields and validator-specific config fields. + +Behavior notes: +- Scope is strictly enforced; a validator id outside the provided organization/project is treated as inaccessible. + +Common failure cases: +- Validator not found. +- Validator exists but does not match the provided scope. diff --git a/backend/app/api/docs/validator_configs/list_validators.md b/backend/app/api/docs/validator_configs/list_validators.md new file mode 100644 index 0000000..fcd683c --- /dev/null +++ b/backend/app/api/docs/validator_configs/list_validators.md @@ -0,0 +1,12 @@ +Lists validator configurations for an organization/project scope, with optional filtering. + +Each result item is flattened: +- Base fields (`id`, `type`, `stage`, `on_fail_action`, `is_enabled`, scope ids, timestamps) +- Validator-specific config fields + +Behavior notes: +- Filters are combined (logical AND) when multiple are provided. +- `ids` supports multi-value filtering. + +Common failure cases: +- Invalid filter formats (for example malformed UUID in `ids`). diff --git a/backend/app/api/docs/validator_configs/update_validator.md b/backend/app/api/docs/validator_configs/update_validator.md new file mode 100644 index 0000000..0700509 --- /dev/null +++ b/backend/app/api/docs/validator_configs/update_validator.md @@ -0,0 +1,12 @@ +Partially updates a validator configuration by id within an organization/project scope. + +Behavior notes: +- Supports patching base fields and validator-specific config fields. 
+- Validator-specific updates are merged into the existing config rather than replacing the entire config object. +- Omitted fields remain unchanged. +- Updates still honor uniqueness on `(organization_id, project_id, type, stage)`. + +Common failure cases: +- Validator not found for provided scope. +- Duplicate validator conflict after changing `type`/`stage`. +- Invalid patch payload. diff --git a/backend/app/api/routes/ban_lists.py b/backend/app/api/routes/ban_lists.py index 279963f..5776b3d 100644 --- a/backend/app/api/routes/ban_lists.py +++ b/backend/app/api/routes/ban_lists.py @@ -6,12 +6,16 @@ from app.api.deps import MultitenantAuthDep, SessionDep from app.crud.ban_list import ban_list_crud from app.schemas.ban_list import BanListCreate, BanListUpdate, BanListResponse -from app.utils import APIResponse +from app.utils import APIResponse, load_description router = APIRouter(prefix="/guardrails/ban_lists", tags=["Ban Lists"]) -@router.post("/", response_model=APIResponse[BanListResponse]) +@router.post( + "/", + description=load_description("ban_lists/create_ban_list.md"), + response_model=APIResponse[BanListResponse], +) def create_ban_list( payload: BanListCreate, session: SessionDep, @@ -23,7 +27,11 @@ def create_ban_list( return APIResponse.success_response(data=ban_list) -@router.get("/", response_model=APIResponse[list[BanListResponse]]) +@router.get( + "/", + description=load_description("ban_lists/list_ban_lists.md"), + response_model=APIResponse[list[BanListResponse]], +) def list_ban_lists( session: SessionDep, auth: MultitenantAuthDep, @@ -42,7 +50,11 @@ def list_ban_lists( return APIResponse.success_response(data=ban_lists) -@router.get("/{id}", response_model=APIResponse[BanListResponse]) +@router.get( + "/{id}", + description=load_description("ban_lists/get_ban_list.md"), + response_model=APIResponse[BanListResponse], +) def get_ban_list( id: UUID, session: SessionDep, @@ -52,7 +64,11 @@ def get_ban_list( return 
APIResponse.success_response(data=obj) -@router.patch("/{id}", response_model=APIResponse[BanListResponse]) +@router.patch( + "/{id}", + description=load_description("ban_lists/update_ban_list.md"), + response_model=APIResponse[BanListResponse], +) def update_ban_list( id: UUID, payload: BanListUpdate, @@ -69,7 +85,11 @@ def update_ban_list( return APIResponse.success_response(data=ban_list) -@router.delete("/{id}", response_model=APIResponse[dict]) +@router.delete( + "/{id}", + description=load_description("ban_lists/delete_ban_list.md"), + response_model=APIResponse[dict], +) def delete_ban_list( id: UUID, session: SessionDep, diff --git a/backend/app/api/routes/guardrails.py b/backend/app/api/routes/guardrails.py index 4700a64..def2e61 100644 --- a/backend/app/api/routes/guardrails.py +++ b/backend/app/api/routes/guardrails.py @@ -20,13 +20,16 @@ from app.schemas.guardrail_config import GuardrailRequest, GuardrailResponse from app.models.logging.request_log import RequestLogUpdate, RequestStatus from app.models.logging.validator_log import ValidatorLog, ValidatorOutcome -from app.utils import APIResponse +from app.utils import APIResponse, load_description router = APIRouter(prefix="/guardrails", tags=["guardrails"]) @router.post( - "/", response_model=APIResponse[GuardrailResponse], response_model_exclude_none=True + "/", + description=load_description("guardrails/run_guardrails.md"), + response_model=APIResponse[GuardrailResponse], + response_model_exclude_none=True, ) def run_guardrails( payload: GuardrailRequest, @@ -52,7 +55,7 @@ def run_guardrails( ) -@router.get("/") +@router.get("/", description=load_description("guardrails/list_validators.md")) def list_validators(_: AuthDep): """ Lists all validators and their parameters directly. 
diff --git a/backend/app/api/routes/utils.py b/backend/app/api/routes/utils.py index 9b138d3..c26e513 100644 --- a/backend/app/api/routes/utils.py +++ b/backend/app/api/routes/utils.py @@ -1,8 +1,12 @@ from fastapi import APIRouter +from app.utils import load_description router = APIRouter(prefix="/utils", tags=["utils"]) -@router.get("/health-check/") +@router.get( + "/health-check/", + description=load_description("utils/health_check.md"), +) def health_check() -> bool: return True diff --git a/backend/app/api/routes/validator_configs.py b/backend/app/api/routes/validator_configs.py index ed34895..9db7215 100644 --- a/backend/app/api/routes/validator_configs.py +++ b/backend/app/api/routes/validator_configs.py @@ -11,7 +11,7 @@ ValidatorUpdate, ) from app.crud.validator_config import validator_config_crud -from app.utils import APIResponse +from app.utils import APIResponse, load_description router = APIRouter( prefix="/guardrails/validators/configs", @@ -19,7 +19,11 @@ ) -@router.post("/", response_model=APIResponse[ValidatorResponse]) +@router.post( + "/", + description=load_description("validator_configs/create_validator.md"), + response_model=APIResponse[ValidatorResponse], +) def create_validator( payload: ValidatorCreate, session: SessionDep, @@ -33,7 +37,11 @@ def create_validator( return APIResponse.success_response(data=response_model) -@router.get("/", response_model=APIResponse[list[ValidatorResponse]]) +@router.get( + "/", + description=load_description("validator_configs/list_validators.md"), + response_model=APIResponse[list[ValidatorResponse]], +) def list_validators( organization_id: int, project_id: int, @@ -49,7 +57,11 @@ def list_validators( return APIResponse.success_response(data=response_model) -@router.get("/{id}", response_model=APIResponse[ValidatorResponse]) +@router.get( + "/{id}", + description=load_description("validator_configs/get_validator.md"), + response_model=APIResponse[ValidatorResponse], +) def get_validator( id: UUID, 
organization_id: int, @@ -61,7 +73,11 @@ def get_validator( return APIResponse.success_response(data=validator_config_crud.flatten(obj)) -@router.patch("/{id}", response_model=APIResponse[ValidatorResponse]) +@router.patch( + "/{id}", + description=load_description("validator_configs/update_validator.md"), + response_model=APIResponse[ValidatorResponse], +) def update_validator( id: UUID, organization_id: int, @@ -77,7 +93,11 @@ def update_validator( return APIResponse.success_response(data=response_model) -@router.delete("/{id}", response_model=APIResponse[dict]) +@router.delete( + "/{id}", + description=load_description("validator_configs/delete_validator.md"), + response_model=APIResponse[dict], +) def delete_validator( id: UUID, organization_id: int, diff --git a/backend/app/core/validators/README.md b/backend/app/core/validators/README.md new file mode 100644 index 0000000..6366a05 --- /dev/null +++ b/backend/app/core/validators/README.md @@ -0,0 +1,292 @@ +# Validator Configuration Guide + +This document describes the validator configuration model used in this codebase, including the 4 currently supported validators from `backend/app/core/validators/validators.json`. + +## Supported Validators + +Current validator manifest: +- `uli_slur_match` (source: `local`) +- `pii_remover` (source: `local`) +- `gender_assumption_bias` (source: `local`) +- `ban_list` (source: `hub://guardrails/ban_list`) + +## Configuration Model + +All validator config classes inherit from `BaseValidatorConfig` in `backend/app/core/validators/config/base_validator_config.py`. 
+ +Shared fields: +- `on_fail` (default: `fix`) + - `fix`: return transformed/redacted output when validator provides a fix + - `exception`: fail validation when validator fails (no safe replacement output) + - `rephrase`: return a user-facing rephrase prompt plus validator error details + +At the Validator Config API layer (`/guardrails/validators/configs`), configs also include: +- `type` +- `stage`: `input` or `output` +- `on_fail_action` (mapped to runtime `on_fail`) +- `is_enabled` + +## Runtime vs Stored Config + +There are two config shapes used in this project: + +1. Stored validator config (Config CRUD APIs) +- includes `stage`, `on_fail_action`, scope metadata, etc. + +2. Runtime guardrail config (POST `/guardrails/`) +- validator objects are normalized before execution +- internal metadata like `stage`, ids, timestamps are removed +- `on_fail_action` is converted to `on_fail` + +## On-Fail Actions + +This project supports three `on_fail` behaviors at runtime: + +- `fix` + - Uses Guardrails built-in fix flow (`OnFailAction.FIX`). + - If a validator returns `fix_value`, validation succeeds and API returns that transformed value as `safe_text`. + - Typical outcome: redaction/anonymization/substitution without asking user to retry. + +- `exception` + - Uses Guardrails built-in exception flow (`OnFailAction.EXCEPTION`). + - Validation fails without a fallback text; API returns failure (`success=false`) with error details. + - Use when policy requires hard rejection instead of auto-correction. + +- `rephrase` + - Uses project custom handler `rephrase_query_on_fail`. + - Returns: `"Please rephrase the query without unsafe content." + validator error message`. + - API marks `rephrase_needed=true` when returned text starts with this prefix. + - Useful when you want users to rewrite input instead of silently fixing it. + +## How Recommendation Is Chosen + +`stage` is always required in validator configuration (`input` or `output`). 
+The recommendation below is guidance on what to choose first, based on: +- where harm is most likely (`input`, `output`, or both), +- whether auto-fixes are acceptable for user experience, +- whether extra filtering at that stage creates too many false positives for the product flow. + +## How These Recommendations Were Derived + +These recommendations come from working with multiple NGOs to understand their GenAI WhatsApp bot use cases, reviewing real bot conversations/data, and then running a structured evaluation flow: +- NGO use-case discovery and conversation analysis: + - Reviewed real conversational patterns, safety failure modes, and policy expectations across partner NGO workflows. + - Identified practical risks to prioritize (harmful language, privacy leakage, bias, and deployment-specific banned terms). +- Curated validator-specific datasets: + - Built evaluation and challenge datasets per validator from NGO use-case patterns and conversation analysis. + - Included realistic multilingual/code-mixed and domain-specific examples that reflect actual bot usage contexts. +- Controlled experiments: + - Compared validators and parameter settings across stages (`input` vs `output`). + - Compared custom implementations against: + - Guardrails Hub variants (prebuilt validators/packages from the Guardrails Hub ecosystem), where applicable. + - Baseline alternatives (simpler/default checks used as reference points, such as minimal rule-based setups or prior/default configurations). + - Tracked quality metrics (precision/recall/F1 where applicable) plus manual error review. +- Stress testing: + - Used validator-specific stress-test datasets curated from NGO-inspired scenarios (adversarial phrasing, spelling variants, code-mixed text, and context-ambiguous examples). + - Focused on false-positive/false-negative behavior and user-facing impact. +- Deployment-oriented tuning: + - Converted findings into stage recommendations and default parameter guidance. 
+ - Prioritized NGO safety goals: reduce harmful content and privacy leakage while preserving usability. + +## Validator Details + +### 1) Lexical Slur Validator (`uli_slur_match`) + +Code: +- Config: `backend/app/core/validators/config/lexical_slur_safety_validator_config.py` +- Runtime validator: `backend/app/core/validators/lexical_slur.py` +- Data file: `backend/app/core/validators/utils/files/curated_slurlist_hi_en.csv` + +What it does: +- Detects lexical slurs using list-based matching. +- Normalizes text (emoji removal, encoding fix, unicode normalization, lowercase, whitespace normalization). +- Redacts detected slurs with `[REDACTED_SLUR]` when `on_fail=fix`. + +Why this is used: +- Helps mitigate toxic/abusive language in user inputs and model outputs. +- Evaluation and stress tests showed this is effective for multilingual abusive-content filtering in NGO-style conversational flows. + +Recommendation: +- `input` and `output` + - Why `input`: catches abusive wording before it reaches prompt construction, logging, or downstream tools. + - Why `output`: catches toxic generations that can still appear even with safe input. + +Parameters / customization: +- `languages: list[str]` (default: `['en', 'hi']`) +- `severity: 'low' | 'medium' | 'high' | 'all'` (default: `'all'`) +- `on_fail` + +Notes / limitations: +- Lexical matching can produce false positives in domain-specific contexts. +- Severity filtering is dependent on source slur list labels. +- Rules-based approach may miss semantic toxicity without explicit lexical matches. + +Evidence and evaluation: +- Dataset reference: `https://www.kaggle.com/c/multilingualabusivecomment/data` +- Label convention used in that dataset: + - `1` = abusive comment + - `0` = non-abusive comment +- Experiments highlighted acronym/context ambiguity (example: terms abusive in one context but neutral in another), so deployment-specific filtering is required. 
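The normalize-then-match flow described for `uli_slur_match` can be sketched roughly as follows. This is an illustration only, not the code in `lexical_slur.py`: the function names, the placeholder word set, and the whole-word matching rule are assumptions, and the emoji-removal and encoding-fix steps are omitted.

```python
import re
import unicodedata

# Hypothetical stand-in for the curated slur list; the real validator loads a CSV file.
PLACEHOLDER_SLURS = {"badword", "worseword"}


def normalize(text: str) -> str:
    """Apply unicode normalization, lowercasing, and whitespace collapsing."""
    text = unicodedata.normalize("NFKC", text)
    text = text.lower()
    return re.sub(r"\s+", " ", text).strip()


def redact_slurs(text: str, slurs: set[str]) -> tuple[str, list[str]]:
    """Return the normalized text with listed words replaced, plus the matches found."""
    normalized = normalize(text)
    found: list[str] = []

    def replace(match: re.Match) -> str:
        word = match.group(0)
        if word in slurs:
            found.append(word)
            return "[REDACTED_SLUR]"
        return word

    # Check each word token against the list (assumed matching rule).
    redacted = re.sub(r"\w+", replace, normalized)
    return redacted, found
```

Because matching here is purely lexical, a term that is abusive in one context and neutral in another is redacted either way, which is the false-positive risk noted above.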
+ +### 2) PII Remover Validator (`pii_remover`) + +Code: +- Config: `backend/app/core/validators/config/pii_remover_safety_validator_config.py` +- Runtime validator: `backend/app/core/validators/pii_remover.py` + +What it does: +- Detects and anonymizes personally identifiable information using Presidio. +- Returns redacted text when PII is found and `on_fail=fix`. + +Why this is used: +- Privacy is a primary safety requirement in NGO deployments. +- Evaluation runs for this project showed clear risk of personal-data leakage/retention in conversational workflows without PII masking. + +Recommendation: +- `input` and `output` + - Why `input`: prevents storing or processing raw user PII in logs/services. + - Why `output`: prevents model-generated leakage of names, numbers, or identifiers. + +Parameters / customization: +- `entity_types: list[str] | None` (default: all supported types) +- `threshold: float` (default: `0.5`) +- `on_fail` + +Threshold guidance: +- `threshold` is the minimum confidence score required for a detected entity to be treated as PII. +- Lower threshold -> more detections (higher recall, more false positives/over-masking). +- Higher threshold -> fewer detections (higher precision, more false negatives/missed PII). +- Start around `0.5`, then tune using real conversation samples by reviewing both missed PII and unnecessary masking. +- If the product is privacy-critical, prefer a slightly lower threshold and tighter `entity_types`; if readability is primary, prefer a slightly higher threshold. + +Supported default entity types: +- `CREDIT_CARD`, `EMAIL_ADDRESS`, `IBAN_CODE`, `IP_ADDRESS`, `LOCATION`, `MEDICAL_LICENSE`, `NRP`, `PERSON`, `PHONE_NUMBER`, `URL`, `IN_AADHAAR`, `IN_PAN`, `IN_PASSPORT`, `IN_VEHICLE_REGISTRATION`, `IN_VOTER` + +Notes / limitations: +- Rule/ML recognizers can under-detect free-text references. +- Threshold and entity selection should be tuned per deployment context. 
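To make the threshold guidance concrete, here is a minimal sketch of how `threshold` and `entity_types` could filter detections before masking. It does not call Presidio: the `Detection` record, the `mask` helper, and the `<ENTITY_TYPE>` placeholder format are assumptions for illustration, and the scores are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    """Hypothetical detection record: entity type, character span, confidence score."""
    entity_type: str
    start: int
    end: int
    score: float


def mask(
    text: str,
    detections: list[Detection],
    threshold: float = 0.5,
    entity_types: Optional[list[str]] = None,
) -> str:
    """Mask detections whose score meets the threshold and whose type is allowed."""
    kept = [
        d
        for d in detections
        if d.score >= threshold
        and (entity_types is None or d.entity_type in entity_types)
    ]
    # Replace from the end of the string so earlier offsets stay valid.
    for d in sorted(kept, key=lambda d: d.start, reverse=True):
        text = text[: d.start] + f"<{d.entity_type}>" + text[d.end :]
    return text


text = "Call Ravi at 9876543210"
detections = [
    Detection("PERSON", 5, 9, 0.55),
    Detection("PHONE_NUMBER", 13, 23, 0.9),
]
low = mask(text, detections, threshold=0.5)   # both entities masked
high = mask(text, detections, threshold=0.6)  # only the phone number masked
```

With the lower threshold the name is masked as well (higher recall); raising it to `0.6` leaves the name readable but would miss it if it really is PII.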
+ +Evidence and evaluation: +- Compared approaches: + - Custom PII validator (this codebase) + - Guardrails Hub PII validator +- Dataset 1: `https://huggingface.co/datasets/ai4privacy/pii-masking-200k` (English subset in project evaluation) + - Custom: Precision `0.614`, Recall `0.344`, F1 `0.441` + - Hub: Precision `0.54`, Recall `0.33`, F1 `0.41` +- Dataset 2: synthetic Hindi dataset (project-created) + - Reported results indicate tradeoffs by class, with the custom validator showing stronger balance for NGO deployment contexts where over-masking harms usability. + +### 3) Gender Assumption Bias Validator (`gender_assumption_bias`) + +Code: +- Config: `backend/app/core/validators/config/gender_assumption_bias_safety_validator_config.py` +- Runtime validator: `backend/app/core/validators/gender_assumption_bias.py` +- Data file: `backend/app/core/validators/utils/files/gender_assumption_bias_words.csv` + +What it does: +- Detects gender-assumptive words/phrases and substitutes neutral terms. +- Uses a curated mapping from gendered terms to neutral alternatives. + +Why this is used: +- Addresses model harm from assuming user gender or producing gender-biased language. +- Evaluation reviews and stress tests identified this as a recurring conversational quality/safety issue. + +Recommendation: +- primarily `output` + - Why `output`: the assistant response is where assumption-biased phrasing is most likely to be emitted to end users. + - Why not `input` by default: user text can be descriptive/quoted, so rewriting input can introduce false positives and intent drift. + - Use `input` too when your policy requires strict moderation of user phrasing before any model processing. + +Parameters / customization: +- `categories: list[BiasCategories] | None` (default: `[all]`) +- `on_fail` + +`BiasCategories` values: +- `generic`, `healthcare`, `education`, `all` + +Notes / limitations: +- Rule-based substitutions may affect natural fluency. 
+- Gender-neutral transformation in Hindi/romanized Hindi can be context-sensitive. +- Full assumption detection often benefits from multi-turn context and/or LLM-as-judge approaches. + +Improvement suggestions from evaluation: +- Strengthen prompt strategy so the model asks user preferences instead of assuming gendered terms. +- Fine-tune generation prompts for neutral language defaults. +- Consider external LLM-as-judge checks for nuanced multi-turn assumption detection. + +### 4) Ban List Validator (`ban_list`) + +Code: +- Config: `backend/app/core/validators/config/ban_list_safety_validator_config.py` +- Source: Guardrails Hub (`hub://guardrails/ban_list`) + +What it does: +- Blocks or redacts configured banned words using the Guardrails Hub BanList validator. + +Why this is used: +- Provides deployment-specific denylist control for terms that must never appear in inputs/outputs. +- Useful for policy-level restrictions not fully covered by generic toxicity detection. + +Recommendation: +- `input` and `output` + - Why `input`: blocks prohibited terms before model invocation and tool calls. + - Why `output`: enforces policy on generated text before it is shown to users. + +Parameters / customization: +- `banned_words: list[str]` (optional if `ban_list_id` is provided) +- `ban_list_id: UUID` (optional if `banned_words` is provided) +- `on_fail` + +Notes / limitations: +- Exact-list approach requires ongoing maintenance. +- Contextual false positives can occur for ambiguous terms. +- Runtime validation requires at least one of `banned_words` or `ban_list_id`. +- If `ban_list_id` is used, banned words are resolved from the tenant-scoped Ban List APIs. 
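The "at least one of `banned_words` or `ban_list_id`" rule for `ban_list` can be sketched as below. This is a hedged illustration, not the project's implementation: `resolve_banned_words` and the `fetch_ban_list` callable are hypothetical names, and giving inline `banned_words` precedence when both are supplied is an assumption.

```python
from typing import Callable, Optional
from uuid import UUID


def resolve_banned_words(
    banned_words: Optional[list[str]],
    ban_list_id: Optional[UUID],
    fetch_ban_list: Callable[[UUID], list[str]],
) -> list[str]:
    """Return the effective banned words, or raise if neither source is provided."""
    if banned_words:
        # Assumed precedence: inline words win over a referenced ban list.
        return list(banned_words)
    if ban_list_id is not None:
        # Hypothetical lookup standing in for the tenant-scoped Ban List APIs.
        return fetch_ban_list(ban_list_id)
    raise ValueError("Provide at least one of `banned_words` or `ban_list_id`.")
```

Passing the lookup in as a callable keeps the check testable without a running Ban List service.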
+ +## Example Config Payloads + +Example: create validator config (stored shape) + +```json +{ + "type": "pii_remover", + "stage": "input", + "on_fail_action": "fix", + "is_enabled": true, + "entity_types": ["PERSON", "PHONE_NUMBER", "IN_AADHAAR"], + "threshold": 0.6 +} +``` + +Example: runtime guardrail validator object (execution shape) + +```json +{ + "type": "pii_remover", + "on_fail": "fix", + "entity_types": ["PERSON", "PHONE_NUMBER", "IN_AADHAAR"], + "threshold": 0.6 +} +``` + +## Operational Guidance + +Default stage strategy: +- Input guardrails: `pii_remover`, `uli_slur_match`, `ban_list` +- Output guardrails: `pii_remover`, `uli_slur_match`, `gender_assumption_bias`, `ban_list` + +Tuning strategy: +- Start with conservative defaults and log validator outcomes. +- Review false positives/false negatives by validator and stage. +- Iterate on per-validator parameters (`severity`, `threshold`, `categories`, `banned_words`). + +## Related Files + +- `backend/app/core/validators/validators.json` +- `backend/app/core/validators/config/base_validator_config.py` +- `backend/app/core/validators/config/ban_list_safety_validator_config.py` +- `backend/app/core/validators/config/pii_remover_safety_validator_config.py` +- `backend/app/core/validators/config/lexical_slur_safety_validator_config.py` +- `backend/app/core/validators/config/gender_assumption_bias_safety_validator_config.py` +- `backend/app/schemas/guardrail_config.py` +- `backend/app/schemas/validator_config.py` diff --git a/backend/app/utils.py b/backend/app/utils.py index 8684602..636d893 100644 --- a/backend/app/utils.py +++ b/backend/app/utils.py @@ -1,5 +1,7 @@ import logging +import functools as ft from datetime import datetime, timezone +from pathlib import Path from pydantic import BaseModel from typing import Any, Dict, Generic, Optional, TypeVar @@ -32,6 +34,20 @@ def split_validator_payload(data: dict): return model_fields, config_fields +@ft.singledispatch +def load_description(filename: Path) 
-> str: + if not filename.exists(): + this = Path(__file__) + filename = this.parent.joinpath("api", "docs", filename) + + return filename.read_text() + + +@load_description.register +def _(filename: str) -> str: + return load_description(Path(filename)) + + class APIResponse(BaseModel, Generic[T]): success: bool data: Optional[T] = None diff --git a/deployment.md b/deployment.md index 1ca82d2..1d43258 100644 --- a/deployment.md +++ b/deployment.md @@ -2,6 +2,38 @@ ## Preparation +1. Create production env values from `.env.example`. +2. Set `ENVIRONMENT=production`. +3. Set `POSTGRES_*` for your production database. +4. Set `AUTH_TOKEN` as a SHA-256 hex digest (64 lowercase chars) of your bearer token. +5. Optionally set `GUARDRAILS_HUB_API_KEY` and `SENTRY_DSN`. + +Generate AUTH token hash: + +```bash +echo -n "your-plain-text-token" | shasum -a 256 +``` + ## Deploy the FastAPI Project -## Continuous Deployment (CD) \ No newline at end of file +Build and start backend: + +```bash +docker compose build backend +docker compose up -d backend +``` + +Run migrations and initial setup (recommended before or during first rollout): + +```bash +docker compose --profile prestart up prestart +``` + +Default host endpoint: +- API/Docs host: `http://:8001` +- Health check: `http://:8001/api/v1/utils/health-check/` + +## Continuous Deployment (CD) + +CI workflow is defined in `.github/workflows/continuous_integration.yml`. +It runs dependency install, Guardrails validator installation, migrations, pre-commit checks, and tests. 
diff --git a/development.md b/development.md index 41a6f42..936b591 100644 --- a/development.md +++ b/development.md @@ -10,11 +10,9 @@ docker compose watch * Now you can open your browser and interact with these URLs: -Backend, JSON based web API based on OpenAPI: +Backend, JSON based web API based on OpenAPI: -Automatic interactive documentation with Swagger UI (from the OpenAPI backend): - -Adminer, database web administration: +Automatic interactive documentation with Swagger UI (from the OpenAPI backend): **Note**: The first time you start your stack, it might take a minute for it to be ready. While the backend waits for the database to be ready and configures everything. You can check the logs to monitor it. @@ -34,7 +32,7 @@ docker compose logs backend The Docker Compose files are configured so that each of the services is available in a different port in `localhost`. -For the backend, we use the same port that would be used by their local development server, so, the backend is at `http://localhost:8000`. +For the backend, Docker maps container port `8000` to host port `8001`, so the backend is at `http://localhost:8001`. This way, you could turn off a Docker Compose service and start its local development service, and everything would keep working, because it all uses the same ports. @@ -47,29 +45,7 @@ And then you can run the local development server for the backend: ```bash cd backend -fastapi dev app/main.py -``` - -## Docker Compose in `localhost.tiangolo.com` - -When you start the Docker Compose stack, it uses `localhost` by default, with different ports for each service (backend, frontend, adminer, etc). - -When you deploy it to production (or staging), it will deploy each service in a different subdomain, like `api.example.com` for the backend. 
- -If you want to test that it's all working locally, you can edit the local `.env` file, and change: - -```dotenv -DOMAIN=localhost.tiangolo.com -``` - -That will be used by the Docker Compose files to configure the base domain for the services. - -The domain `localhost.tiangolo.com` is a special domain that is configured (with all its subdomains) to point to `127.0.0.1`. This way you can use that for your local development. - -After you update it, run again: - -```bash -docker compose watch +fastapi dev app/main.py --port 8001 ``` ## Docker Compose files and env vars @@ -148,22 +124,8 @@ The production or staging URLs would use these same paths, but with your own dom Development URLs, for local development. -Backend: - -Automatic Interactive Docs (Swagger UI): - -Automatic Alternative Docs (ReDoc): - -Adminer: - -### Development URLs with `localhost.tiangolo.com` Configured - -Development URLs, for local development. - -Backend: - -Automatic Interactive Docs (Swagger UI): +Backend: -Automatic Alternative Docs (ReDoc): +Automatic Interactive Docs (Swagger UI): -Adminer: \ No newline at end of file +Automatic Alternative Docs (ReDoc):