Skip to content

Composable queries: reference other library queries via {{name}} (CTE-merge) #39

Description

@BorisTyshkevich

Summary

Let a saved query reference another saved query in the same library by name, using a {{query name}} placeholder. At Run time we resolve the references and splice the referenced queries in as CTEs (WITH … AS (…)), then run the single composed query.

This gives DRY, reusable query building blocks without needing server-side CREATE VIEW (which requires DDL privileges this read-only SQL browser's users often don't have). Everything stays client-side in the library you already save / export / share.

Motivation

Today you copy-paste common subqueries between saved queries. A "metrics library" pattern — define active_users once, reference it from many queries — isn't possible without server-side views. We want that reuse purely in the client-side library.

Proposed syntax: {{name}}

{{ }} is not valid SQL anywhere, so it won't collide with the language. (Note: a %name%-style delimiter is a non-starter — % is modulo and the LIKE '%…%' wildcard, so WHERE x LIKE '%2023%' would be misread as a reference.) References must be ignored inside string literals and comments.

Mechanism: CTE-merge

Library query active_users:

SELECT user_id, count() AS events
FROM events
WHERE event_date >= today() - 30
GROUP BY user_id

A query that references it:

SELECT count() AS active
FROM {{active_users}}
WHERE events > 10

Expands at Run to:

WITH active_users AS (
  SELECT user_id, count() AS events
  FROM events
  WHERE event_date >= today() - 30
  GROUP BY user_id
)
SELECT count() AS active
FROM active_users
WHERE events > 10

CTE-merge (vs. inlining each reference as a (SELECT …) subquery) defines each fragment once and reads cleanly even when a fragment is referenced multiple times.

Resolution rules

  • Recursive: a referenced fragment may reference others; collect them all into one WITH block.
  • Cycle detection: A → B → A must error, not loop forever.
  • Missing reference: {{nope}} → clear error naming the missing query.
  • Trailing clauses stripped: drop a fragment's trailing FORMAT … / ; before it becomes a CTE (a SELECT … FORMAT JSON can't be a CTE/subquery).
  • SELECT/WITH only: refuse to splice an INSERT/DDL fragment.
  • Skip strings/comments when scanning for {{…}}.
  • Implement as a pure resolverexpandRefs(sql, libraryByName) → { sql, error } in src/core/ (100% covered) — called on the Run path; the structured ClickHouse client is unchanged.

Pros

  • Reuse without DDL rights; stays in the shareable/exportable library.
  • "Define once, reference everywhere" building blocks.
  • Pure, testable transform that drops into the existing Run path.

Risks / open design questions (feedback wanted)

  1. Error-position mapping. ClickHouse reports error offsets against the expanded SQL, but the editor shows the original. We recently added caret-to-error for Format; composed queries would regress it unless we (a) show the expanded SQL when a composed query errors, and/or (b) map positions back to the source. How important is precise caret mapping vs. just surfacing the expanded SQL on error?
  2. Names → CTE aliases. Library names are freeform (Top Airports — 2023) and aren't valid CTE identifiers. We'd derive a safe alias (sanitize/slugify) and substitute it for {{name}}. How to handle collisions after sanitization?
  3. Rename fragility. References are by human-readable name, so renaming a fragment silently breaks referrers. Options: accept it, auto-update referrers on rename, or reference by stable id (robust but not human-writable). Preference?
  4. Duplicate names in a library — ambiguous reference. Reject at expand time? Enforce unique names?
  5. Sharing. A shared link / exported library must include every referenced fragment or it breaks. Bundle transitively on share/export?
  6. Performance. ClickHouse doesn't materialize CTEs by default, so a heavy fragment referenced N times executes N times. Document it? Offer an opt-in … AS (…) materialization hint?
  7. "Runs ≠ shows." General debuggability cost of a templating layer. Mitigate with a "Preview expanded SQL" toggle in the editor + editor highlighting/autocomplete for {{refs}}.

Alternatives considered

  • Inline-as-subquery instead of CTE-merge — simpler, but duplicates a fragment used multiple times and reads worse.
  • Insert-as-CTE at edit time (a command/autocomplete that pastes the fragment into the editor as a CTE): WYSIWYG, undoable, no hidden runtime expansion, no error-mapping regression — but loses "live" propagation (editing the source fragment doesn't update referrers). Lower-risk way to get ~80% of the value; could ship first and add runtime expansion later if propagation is actually wanted.
  • Server-side views — the SQL-native answer, but needs DDL privileges and persists server-side; doesn't fit the read-only / client-library niche.

Scope guard

This is the on-ramp to reinventing dbt/Jinja. Intent is to stop at named query references + CTE-merge (no parameters, no control flow). Parameterized fragments like {{fragment(year=2023)}} are explicitly out of scope for v1 — separate discussion.

Feedback welcome on the open questions above, especially (1) error mapping and the runtime-expansion vs. insert-as-CTE tradeoff.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions