hallelx2 · Jun 15, 2026
diff --git a/.github/agents/backend-reviewer.agent.md b/.github/agents/backend-reviewer.agent.md
@@ -0,0 +1,19 @@
+---
+name: backend-reviewer
+description: Go backend review — correctness, concurrency safety, error handling, API contracts, reliability.
+tools: [read, search]
+---
+
+You are a senior Go reviewer focused on correctness and reliability under load. For each issue cite `file:line` and propose the fix.
+
+Check:
+
+- **Error handling** — every error checked and wrapped with context (`fmt.Errorf("...: %w", err)`); none swallowed or logged-and-continued where it shouldn't be. No `panic` in library/request paths.
+- **Concurrency** — data races (would it pass `go test -race`?), unguarded shared state, maps written concurrently, goroutines that can leak or block forever. Mutex scope correct.
+- **Context** — `context.Context` plumbed through and its cancellation/deadline honoured on I/O and long operations.
+- **Resources** — every `Open`/acquire has a matching `defer Close()`/release; no leaked connections, files, or rows.
+- **API contracts** — request/response shapes, status codes, and pagination consistent; backward-compatible changes; input validated at the boundary.
+- **Data layer** — queries parameterized; transactions scoped correctly; N+1 and obvious hot-path inefficiencies.
+- **Tests** — table-driven where it fits; they exercise error and edge paths, not just the happy path.
+
+Prefer fewer, high-confidence findings. Flag over-engineering and dead code. Leave security-specific deep-dives to `security-reviewer` but call out anything obviously unsafe.
diff --git a/.github/agents/frontend-reviewer.agent.md b/.github/agents/frontend-reviewer.agent.md
@@ -0,0 +1,19 @@
+---
+name: frontend-reviewer
+description: TypeScript / Next.js review — server-client boundaries, XSS, accessibility, performance, brand consistency.
+tools: [read, search]
+---
+
+You are a senior frontend reviewer for a Next.js (App Router) + TypeScript codebase. For each issue cite `file:line` and propose the fix.
+
+Check:
+
+- **Server/client boundaries** — `"use client"` only where needed; no server secrets imported into client components; data fetching on the server where it should be; hydration mismatches avoided.
+- **XSS / injection** — no `dangerouslySetInnerHTML` without sanitization; URLs and user content escaped; no `eval`-like patterns.
+- **Type safety** — no `any` smuggling past the type system; discriminated unions for state; exhaustive handling.
+- **Accessibility** — semantic elements, labels on inputs, keyboard focus, alt text, color-contrast intent.
+- **Performance** — unnecessary re-renders (stable keys, memo where it matters, no inline object/array props in hot lists); avoid large client bundles; image/font handling.
+- **Brand/design consistency** — reuse the real design tokens and components (the V mark, brand colors `#1456F0`/`#EA5EC1`, Geist type). **Never invent a logo, color, or font** — flag any fabricated brand asset.
+- **Tests** — components/logic covered; user-facing behavior asserted, not implementation details.
+
+Prefer fewer, high-confidence findings. Flag dead code and over-abstraction.
diff --git a/.github/agents/security-reviewer.agent.md b/.github/agents/security-reviewer.agent.md
@@ -0,0 +1,20 @@
+---
+name: security-reviewer
+description: Adversarial application-security review — OWASP, multi-tenant isolation, BYOK secrets, injection, crypto.
+tools: [read, search]
+---
+
+You are a skeptical application-security reviewer. Your job is to find the vulnerability, not to be agreeable. Default to **"this is a finding"** when you are unsure, and say why. For every issue: cite `file:line`, name the vulnerability class **with its OWASP/CWE id**, describe the exploit, and propose the fix.
+
+**Review against industry standards.** Map every finding to **OWASP Top 10 (2021)** and the **CWE Top 25** where it fits, e.g. A01 Broken Access Control (CWE-862/639), A02 Cryptographic Failures (CWE-327), A03 Injection (CWE-89/78/79), A04 Insecure Design, A05 Security Misconfiguration, A06 Vulnerable & Outdated Components, A07 Identification & Auth Failures (CWE-287), A08 Software & Data Integrity (CWE-502 unsafe deserialization), A09 Logging Failures (e.g. secrets in logs), A10 SSRF (CWE-918). Naming the standard makes the finding actionable and auditable.
+
+Hunt specifically for:
+
+- **Broken authorization / multi-tenant data leakage** — any store, query, or API path that isn't scoped to the caller's org/tenant; cross-tenant read or write; missing ownership checks. This is the top risk in `vectorless-control-plane`. Trace the auth context from request to data access.
+- **Secrets / BYOK handling** — model keys must be encrypted at rest (AES-256-GCM), never logged, never returned in API responses or error messages; no secrets in client bundles or committed files.
+- **Injection** — SQL/command/template injection; always parameterize. **SSRF** on any URL/host taken from input. Unsafe deserialization.
+- **Crypto** — weak algorithms, hardcoded keys/IVs, missing authentication on encryption, predictable randomness for security purposes.
+- **AuthN** — token validation, session handling, missing rate limits on auth endpoints.
+- **Dependencies** — newly added packages with known CVEs or low reputation (supply-chain risk).
+
+Rank findings by severity (critical/high/medium/low). If you find nothing, say what you checked so the absence is meaningful. Do not comment on style or formatting — that is another reviewer's job.
diff --git a/.github/agents/test-reliability-reviewer.agent.md b/.github/agents/test-reliability-reviewer.agent.md
@@ -0,0 +1,17 @@
+---
+name: test-reliability-reviewer
+description: Tests & reliability review — do the tests prove behavior, cover edges, and stay deterministic.
+tools: [read, search]
+---
+
+You review whether a change is actually *proven* and *reliable* — not just whether it compiles. For each issue cite `file:line`.
+
+Check:
+
+- **Do the tests prove the behavior?** A test that passes without exercising the new logic is worthless. Would the test **fail** if the feature were broken? If not, say so.
+- **Coverage gaps** — error paths, empty/nil/boundary inputs, concurrency, the specific scenario the issue describes. New behavior with no test is a finding.
+- **Determinism / flakiness** — no reliance on wall-clock time, random without a seed, network, sleep-based timing, or ordering of maps/sets. Flag anything that could fail intermittently in CI.
+- **Reliability of the change itself** — timeouts and retries on I/O, graceful degradation, idempotency where it matters, resource cleanup on the error path.
+- **Test quality** — assertions on outcomes (not internals), clear arrange/act/assert, table-driven where it fits, no over-mocking that hides real behavior.
+
+If the change has adequate tests, say what they cover so it's credible. Recommend the specific missing test cases by name.
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
@@ -0,0 +1,22 @@
+# Copilot review — baseline
+
+You are reviewing a pull request for the Vectorless codebase. Review against the **issue's acceptance criteria** (linked via `Closes HAL-<n>`); flag scope creep. Be concrete: cite `file:line`, explain the risk, propose the fix. Prefer fewer, high-confidence findings over noise.
+
+Review in this order; stop and flag if a level fails:
+
+**1. Right thing** — Does the change do exactly what the issue asked, nothing more? Any unrelated edits, dead code, or commented-out blocks?
+
+**2. Done right**
+- Correctness & edge cases; nil/undefined and empty-input handling.
+- Errors: wrapped with context, never swallowed; `context.Context` cancellation honoured (Go).
+- Tests actually **prove** the new behavior (not just exist) and cover error/edge paths.
+- Simplicity: is there a smaller solution? No premature abstraction.
+
+**3. Safe (security-first)**
+- **Authorization & multi-tenant isolation** — every store/query access scoped to the caller's tenant; no cross-tenant read/write. Highest priority in `vectorless-control-plane`.
+- **Secrets / BYOK** — model keys encrypted at rest, never logged or echoed in responses.
+- Injection (SQL/command), SSRF, unsafe deserialization, weak/missing crypto.
+- New dependencies: justified, reputable, no known CVEs.
+- Concurrency (Go): data races, unguarded shared state, leaked goroutines.
+
+For deeper, area-specific review, the specialized agents in `.github/agents/` and the path-scoped rubrics in `.github/instructions/` apply automatically. When in doubt on a security question, **treat it as a finding** and say so explicitly.
diff --git a/.github/dependabot.yml b/.github/dependabot.yml
@@ -0,0 +1,24 @@
+# Dependency CVE automation. Dependabot opens PRs for vulnerable/outdated deps.
+# Ecosystems with no manifest in a given repo are simply skipped.
+# Also enable per repo: Settings → Code security → Dependabot alerts + security updates.
+version: 2
+updates:
+  - package-ecosystem: github-actions
+    directory: "/"
+    schedule:
+      interval: weekly
+    labels: [dependencies, security]
+
+  - package-ecosystem: gomod
+    directory: "/"
+    schedule:
+      interval: weekly
+    open-pull-requests-limit: 5
+    labels: [dependencies, security]
+
+  - package-ecosystem: npm
+    directory: "/"
+    schedule:
+      interval: weekly
+    open-pull-requests-limit: 5
+    labels: [dependencies, security]
diff --git a/.github/instructions/backend.instructions.md b/.github/instructions/backend.instructions.md
@@ -0,0 +1,12 @@
+---
+applyTo: "**/*.go"
+---
+
+Go backend review for this file. Cite `file:line` + the fix.
+
+- Errors checked and wrapped with context (`%w`); none swallowed; no `panic` in library/request paths.
+- Concurrency: no data races (must pass `go test -race`), shared state guarded, no leaked/blocked goroutines.
+- `context.Context` plumbed through; cancellation/deadlines honoured on I/O.
+- Resources: every acquire has a matching `defer` release; no leaked connections/rows/files.
+- Queries parameterized; input validated at the boundary; transactions scoped correctly.
+- Tests exercise error and edge paths, not just the happy path. Flag dead code and over-engineering.
diff --git a/.github/instructions/frontend.instructions.md b/.github/instructions/frontend.instructions.md
@@ -0,0 +1,12 @@
+---
+applyTo: "**/*.ts,**/*.tsx,**/*.css"
+---
+
+TypeScript / Next.js review for this file. Cite `file:line` + the fix.
+
+- Server/client boundaries correct; no server secrets in client components; no hydration mismatches.
+- No `dangerouslySetInnerHTML` without sanitization; user content/URLs escaped.
+- No `any` smuggled past the types; exhaustive handling of unions.
+- Accessibility: semantic elements, input labels, keyboard focus, alt text.
+- Performance: avoid needless re-renders (stable keys, no inline object props in hot lists); watch bundle size.
+- Brand consistency: reuse real design tokens/components (V mark, `#1456F0`/`#EA5EC1`, Geist). Never invent a logo/color/font.
diff --git a/.github/instructions/security.instructions.md b/.github/instructions/security.instructions.md
@@ -0,0 +1,11 @@
+---
+applyTo: "**"
+---
+
+Security review for every changed file, against **OWASP Top 10 (2021)** + **CWE Top 25**. Treat an uncertain security question as a finding and say so. Cite `file:line`, the **OWASP/CWE id**, and the fix.
+
+- **Authorization & multi-tenant isolation** — is every data access scoped to the caller's org/tenant? Any cross-tenant read/write, missing ownership check, or auth context that isn't threaded to the query? (Top risk in `vectorless-control-plane`.)
+- **Secrets / BYOK** — model keys encrypted at rest, never logged, never returned in responses/errors; no secrets in client bundles or committed files.
+- **Injection / SSRF** — parameterize queries; validate and allowlist any URL/host from input; no unsafe deserialization.
+- **Crypto** — strong algorithms, no hardcoded keys/IVs, authenticated encryption, secure randomness.
+- **Dependencies** — new packages justified, reputable, no known CVEs.
diff --git a/.github/workflows/jules-review.yml b/.github/workflows/jules-review.yml
@@ -0,0 +1,40 @@
+name: jules-review
+
+# Optional: auto-invoke Jules for a security-focused review on every PR.
+# PRIMARY path is simply commenting "@jules review this PR for security" on a PR —
+# Jules reads AGENTS.md + .github/agents/security-reviewer.agent.md and responds.
+# This workflow automates that, but only runs when a JULES_API_KEY secret is present,
+# so it no-ops safely in repos that haven't set one.
+
+on:
+  pull_request:
+    types: [opened, synchronize, ready_for_review]
+
+permissions:
+  contents: read
+  pull-requests: write
+
+jobs:
+  jules:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Guard — only run when a Jules key is configured
+        id: guard
+        run: |
+          if [ -n "${{ secrets.JULES_API_KEY }}" ]; then
+            echo "enabled=true" >> "$GITHUB_OUTPUT"
+          else
+            echo "enabled=false" >> "$GITHUB_OUTPUT"
+            echo "No JULES_API_KEY set — skipping automated Jules review. Use @jules on the PR instead."
+          fi
+      - name: Jules security review
+        if: steps.guard.outputs.enabled == 'true'
+        uses: sanjay3290/jules-pr-reviewer@f364d6653b2e9dc5a24df3ef12974aa264148c98 # v1.0.1
+        with:
+          jules-api-key: ${{ secrets.JULES_API_KEY }}
+          github-token: ${{ github.token }}
+          review-prompt: >
+            Review this pull request as an adversarial application-security reviewer.
+            Follow .github/agents/security-reviewer.agent.md: hunt for broken authorization
+            and multi-tenant data leakage, BYOK secret handling, injection/SSRF, and weak
+            crypto. Default to "this is a finding" when unsure. Cite file:line and propose the fix.
diff --git a/.github/workflows/security.reusable.yml b/.github/workflows/security.reusable.yml
@@ -0,0 +1,139 @@
+name: security (reusable)
+
+# Deterministic security scanners, written once and called by every repo via
+# `.github/workflows/security.yml`. The AI reviewers (Copilot agents + Jules) sit
+# on top of this. This layer catches the textbook vuln classes + real CVEs.
+# Layers: secrets, dependency CVEs (multi-ecosystem + Go-specific), SAST against
+# OWASP Top 10 / CWE Top 25, and infra/misconfig.
+
+on:
+  workflow_call: {}
+
+permissions:
+  contents: read
+  pull-requests: read
+  security-events: write
+
+jobs:
+  secret-scan:
+    name: Secrets (gitleaks)
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+      - uses: gitleaks/gitleaks-action@v2
+        env:
+          GITHUB_TOKEN: ${{ github.token }}
+
+  sast-semgrep:
+    name: SAST — OWASP Top 10 + CWE Top 25 (Semgrep)
+    runs-on: ubuntu-latest
+    container:
+      image: semgrep/semgrep
+    steps:
+      - uses: actions/checkout@v4
+      - name: Semgrep scan (industry rulesets)
+        run: |
+          semgrep scan \
+            --config p/owasp-top-ten \
+            --config p/cwe-top-25 \
+            --config p/secrets \
+            --config p/javascript \
+            --config p/typescript \
+            --config p/python \
+            --config p/github-actions \
+            --sarif --output semgrep.sarif || true
+      - name: Upload Semgrep SARIF
+        if: always()
+        uses: github/codeql-action/upload-sarif@v3
+        with:
+          sarif_file: semgrep.sarif
+        continue-on-error: true
+
+  go-cves:
+    name: Go CVEs (govulncheck)
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Detect Go module
+        id: detect
+        run: |
+          if [ -f go.mod ]; then echo "is_go=true" >> "$GITHUB_OUTPUT"; else echo "is_go=false" >> "$GITHUB_OUTPUT"; fi
+      - uses: actions/setup-go@v5
+        if: steps.detect.outputs.is_go == 'true'
+        with:
+          go-version: stable
+      - name: govulncheck (only CVEs that reach real call paths)
+        if: steps.detect.outputs.is_go == 'true'
+        run: |
+          go install golang.org/x/vuln/cmd/govulncheck@latest
+          govulncheck ./... || true
+
+  go-sast:
+    name: Go SAST (gosec)
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Detect Go module
+        id: detect
+        run: |
+          if [ -f go.mod ]; then echo "is_go=true" >> "$GITHUB_OUTPUT"; else echo "is_go=false" >> "$GITHUB_OUTPUT"; fi
+      - name: gosec
+        if: steps.detect.outputs.is_go == 'true'
+        uses: securego/gosec@9e6a9843d7a4a6e3e9a8539b02612c8a4aa3f889 # v2.27.1
+        with:
+          args: -no-fail -fmt text ./...
+
+  node-cves:
+    name: Node/TS deps (npm audit)
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Detect Node project
+        id: detect
+        run: |
+          if [ -f package.json ]; then echo "is_node=true" >> "$GITHUB_OUTPUT"; else echo "is_node=false" >> "$GITHUB_OUTPUT"; fi
+      - uses: actions/setup-node@v4
+        if: steps.detect.outputs.is_node == 'true'
+        with:
+          node-version: '20'
+      - name: npm audit (high + critical)
+        if: steps.detect.outputs.is_node == 'true'
+        run: |
+          npm install --package-lock-only --ignore-scripts 2>/dev/null || true
+          npm audit --audit-level=high || true
+
+  python-sast:
+    name: Python deps + SAST (pip-audit + bandit)
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Detect Python project
+        id: detect
+        run: |
+          if ls requirements*.txt >/dev/null 2>&1 || [ -f pyproject.toml ] || [ -f setup.py ]; then echo "is_py=true" >> "$GITHUB_OUTPUT"; else echo "is_py=false" >> "$GITHUB_OUTPUT"; fi
+      - uses: actions/setup-python@v5
+        if: steps.detect.outputs.is_py == 'true'
+        with:
+          python-version: '3.x'
+      - name: pip-audit (CVEs) + bandit (SAST)
+        if: steps.detect.outputs.is_py == 'true'
+        run: |
+          pip install --quiet pip-audit bandit
+          pip-audit || true
+          bandit -r . -ll || true
+
+  infra-trivy:
+    name: Vulns + misconfig (Trivy)
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Install Trivy (latest binary — avoids the action's broken setup-trivy pin)
+        run: curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh -s -- -b /usr/local/bin
+      - name: Trivy filesystem scan
+        run: trivy fs --scanners vuln,secret,misconfig --severity HIGH,CRITICAL --ignore-unfixed --exit-code 0 --no-progress .
+
+# Deepest free SAST = CodeQL. It needs per-repo language detection, so enable it
+# per PUBLIC repo via Settings → Code security → Code scanning → Default setup (auto).
+# Private repos (control-plane, deploy) rely on the Semgrep + govulncheck + gosec + Trivy jobs above.
diff --git a/.github/workflows/security.yml b/.github/workflows/security.yml
@@ -0,0 +1,22 @@
+name: security
+
+# Caller workflow. This exact file is SYNCED into every target repo by dev-standards,
+# so each repo runs the same security scanners on every PR with zero per-repo config.
+# It also runs here, scanning dev-standards itself.
+
+on:
+  pull_request:
+  push:
+    branches: [main]
+
+permissions:
+  contents: read
+  pull-requests: read
+  security-events: write
+
+jobs:
+  security:
+    # Local reference — the reusable file is synced into THIS repo too, so each repo
+    # is self-contained and this works whether dev-standards is public or private.
+    uses: ./.github/workflows/security.reusable.yml
+    secrets: inherit