Skip to content

feat: migrate ClickHouse adapter to utopia-php/query 0.3.x builder#4

Open
lohanidamodar wants to merge 6 commits into
mainfrom
feat/utopia-query-0.3.x
Open

feat: migrate ClickHouse adapter to utopia-php/query 0.3.x builder#4
lohanidamodar wants to merge 6 commits into
mainfrom
feat/utopia-query-0.3.x

Conversation

@lohanidamodar
Copy link
Copy Markdown
Contributor

Summary

Migrates src/Usage/Adapter/ClickHouse.php (3252 → 2854 lines) to consume utopia-php/query 0.3.x. The adapter is now a thin HTTP runtime — every DDL, INSERT, SELECT, and DELETE goes through Schema\ClickHouse / Builder\ClickHouse.

Depends on utopia-php/query#11 (branch feat/clickhouse-insert-delete-settings-mv). Once that PR is tagged 0.3.2, flip composer from dev-feat/clickhouse-insert-delete-settings-mv as 0.3.2 to ^0.3.2.

What changed

Schema (DDL via Utopia\Query\Schema\ClickHouse)

  • setup() emits events/gauges tables through Table\ClickHouse with typed columns, Engine::MergeTree, monthly partitioning, ORDER BY, bloom_filter skip indexes, and SETTINGS.
  • createDailyTable() uses Engine::SummingMergeTree('value').
  • createDailyMaterializedView() uses Schema\ClickHouse::createMaterializedView; the MV body remains a hand SELECT because subquery-aggregation MV bodies do not round-trip cleanly through the builder yet (deferred upstream).

Reads (SELECT via Utopia\Query\Builder\ClickHouse)

  • Each builder is initialised with useNamedBindings() + withParamTypes() so every ? placeholder rewrites to ClickHouse {paramN:Type} form (DateTime64(3), Int64, String, Nullable(String)).
  • Migrated: find / findFromTable, findAggregatedFromTable, count / countFromTable, sum / sumFromTable, findDaily (FINAL), sumDaily, sumDailyBatch, getTotal / getTotalFromEvents / getTotalFromGauges, getTotalBatch, getTimeSeries / getTimeSeriesFromTable.
  • Time-bucketed aggregation consumes Query::groupByTimeBucket via the builder's native GROUP BY emission, paired with selectRaw/orderByRaw for the bucket projection and ordering.
  • Cursor pagination keeps the tuple-keyset comparison as a whereRaw fragment and merges its named bindings into the builder Statement at HTTP execute time (upstream gap, deferred).

Writes (INSERT via Builder\ClickHouse::insertFormat)

  • addBatch() builds INSERT INTO t (...) FORMAT JSONEachRow; row payload still streamed in HTTP body.

Deletes (DELETE via Builder\ClickHouse::delete)

  • purge() and purgeDaily() emit lightweight DELETE FROM through the builder.

Removed

  • src/Usage/UsageQuery.phpgroupByInterval replaced by Utopia\Query\Query::groupByTimeBucket. BREAKING: downstream callers must switch from UsageQuery::groupByInterval('time', '1h') to Query::groupByTimeBucket('time', '1h').

Adapters: Method enum migration

  • convertQueriesToDatabase in Adapter/Database.php switched to Method enum cases. Same for the ClickHouse adapter's filter-method handling.

Infra

  • Dockerfile bumped to php:8.4.21-cli-alpine3.23.
  • composer.json PHP constraint bumped to >=8.4 (matches the 0.3.x query lib requirement).
  • phpstan bumped to ^2.0 for PHP 8.4 compatibility.

Test plan

  • composer lint clean
  • composer check (PHPStan max) clean
  • Snapshot tests ClickHouseSqlSnapshotTest: 9/9 green (DDL, daily, MV, find with typed bindings, groupByTimeBucket aggregation, INSERT FORMAT, getTimeSeries, FINAL, lightweight DELETE)
  • Integration tests against the Docker ClickHouse — to run in CI

Still deferred upstream (follow-up PRs)

  • Tuple-cursor builder helper (cursor pagination still uses whereRaw)
  • LowCardinality column type
  • SummingMergeTree shorthand
  • Materialized view body via builder (currently still raw SQL for the subquery-aggregation case)
  • Eventual HTTP transport extraction (future home: utopia-php/database)

Related

lohanidamodar and others added 5 commits May 17, 2026 09:37
Pin `utopia-php/query` to `dev-feat/clickhouse-insert-delete-settings-mv as
0.3.2` to consume the new ClickHouse builder + schema layer
(groupByTimeBucket, named-typed bindings, Schema\ClickHouse,
Builder\ClickHouse insertFormat/deleteMode). Flip composer to
`minimum-stability: dev` with `prefer-stable: true` so the alias resolves.

Tracks utopia-php/query#11; flip to `^0.3.2` once tagged.
Replace Query::TYPE_* string constants (removed upstream in 0.3.x) with
Method enum cases. Behaviour unchanged; the GroupByTimeBucket case is the
new name for the deferred groupByInterval/intervalGroupBy bucket — still a
no-op for the Database adapter since it has no time-bucketed aggregation
path.
PHPStan 1.x ships php-parser 4.x which cannot parse the asymmetric
visibility (`public protected(set)`) used by utopia-php/query 0.3.x's
ClickHouse schema layer (Schema\Table\ClickHouse), causing an internal
"Lexer::getNextToken returned null" crash on the usage adapter source
file.

Bump to PHPStan ^2.0 (ships php-parser 5.x, understands PHP 8.4) and
capture the pre-existing "mixed array access" findings from json_decode
result handling in a baseline so net analysis stays green while the
adapter migrates. Wire up a phpstan.neon that runs at level max over
src + tests.
Replaces the hand-rolled SQL strings in src/Usage/Adapter/ClickHouse.php
with the utopia-php/query 0.3.x builder + schema layer. The adapter is
now a thin HTTP runtime: every DDL statement, INSERT, SELECT, and DELETE
fragment is produced by Schema\ClickHouse or Builder\ClickHouse.

Schema (DDL via Utopia\Query\Schema\ClickHouse):
- setup() emits events/gauges tables through Table\ClickHouse with typed
  columns (string, datetime, bigInteger), Engine::MergeTree, monthly
  partitioning, ORDER BY, bloom_filter skip indexes, and SETTINGS.
- createDailyTable() emits the SummingMergeTree daily aggregation
  table via Engine::SummingMergeTree('value').
- createDailyMaterializedView() emits the MV via
  Schema\ClickHouse::createMaterializedView. The MV body remains a hand
  SELECT because subquery-aggregation MV bodies do not round-trip
  cleanly through the builder yet (deferred upstream).

Reads (SELECT via Utopia\Query\Builder\ClickHouse):
- A per-call builder is initialised with `useNamedBindings()` plus
  `withParamTypes()` so every `?` placeholder rewrites to ClickHouse
  `{paramN:Type}` form (DateTime64(3), Int64, String, Nullable(String)).
- find/findFromTable, findAggregatedFromTable, count/countFromTable,
  sum/sumFromTable, findDaily (FINAL), sumDaily, sumDailyBatch,
  getTotal/getTotalFromEvents/getTotalFromGauges, getTotalBatch,
  getTimeSeries/getTimeSeriesFromTable all compose filters via
  $builder->filter([Query::...]) and emit aggregations through
  $builder->sum()/->count()/selectRaw.
- Time-bucketed aggregation consumes Query::groupByTimeBucket via the
  builder's native GROUP BY emission, paired with selectRaw/orderByRaw
  for the bucket projection and ordering.
- Cursor pagination keeps the tuple-keyset comparison as a whereRaw
  fragment and merges its named bindings into the builder Statement at
  HTTP execute time.

Writes (INSERT via Utopia\Query\Builder\ClickHouse::insertFormat):
- addBatch() builds `INSERT INTO t (...) FORMAT JSONEachRow` via the
  builder's insertFormat helper; the JSONEachRow payload is streamed in
  the HTTP body.

Deletes (DELETE via Utopia\Query\Builder\ClickHouse::delete):
- purge() and purgeDaily() emit lightweight DELETE through the builder.

UsageQuery is removed. Downstream callers must switch from
`UsageQuery::groupByInterval('time', '1h')` to
`Query::groupByTimeBucket('time', '1h')`. The 0.3.x library rejects
non-enumerated intervals through ValidationException at construction.

HTTP transport and configuration setters (validateHost/validatePort,
escapeIdentifier, setNamespace/setTenant/setSharedTables/setDatabase/
setSecure/setTimeout/setCompression/setKeepAlive/setMaxRetries/
setRetryDelay/setAsyncInserts, healthCheck, query/insert/executeWithRetry,
logQuery, getConnectionStats/getQueryLog/clearQueryLog) stay untouched -
the migration is strictly to the SQL-building layer.

LowCardinality wrapping on the country column is dropped (the schema
layer wraps `LowCardinality` inside `Nullable`, but ClickHouse only
accepts `LowCardinality(Nullable(T))`; the column stays a plain
Nullable(String) until the upstream wrapping order is fixed - deferred
upstream).

References utopia-php/query#11.
Adds tests/Usage/Adapter/ClickHouseSqlSnapshotTest.php covering DDL
(events table, daily SummingMergeTree, materialized view), the
named-typed binding read path, groupByTimeBucket aggregation,
INSERT FORMAT JSONEachRow, getTimeSeries shape, FINAL on daily reads,
and the lightweight DELETE form. 9 snapshot tests / 30 assertions.

Bumps Dockerfile from php:8.3.3-cli-alpine3.19 to php:8.4.21-cli-alpine3.23
and composer.json's php constraint from >=8.0 to >=8.4 so CI matches the
utopia-php/query 0.3.x asymmetric-visibility requirement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 17, 2026

Greptile Summary

This PR migrates the ClickHouse adapter from hand-rolled SQL to the utopia-php/query 0.3.x builder/schema layer, replacing ~400 lines of raw string SQL with typed builder calls for DDL, INSERT, SELECT, and DELETE, and removes UsageQuery in favour of Query::groupByTimeBucket.

  • DDL: setup(), createDailyTable(), and createDailyMaterializedView() now use Schema\\ClickHouse / Table\\ClickHouse; qualifyDdl() injects the database qualifier via str_replace after the builder emits unqualified names.
  • DML: All SELECT paths use useNamedBindings() + withParamTypes() for {paramN:Type} ClickHouse placeholders; normalizeTimeValues() canonicalises datetime values before binding; cursor pagination compiles a tuple-keyset WHERE via whereRaw with manually merged named bindings.
  • Writes / Deletes: addBatch() uses insertFormat('JSONEachRow') and purge() / purgeDaily() use the builder's lightweight delete() path.

Confidence Score: 5/5

The migration is a well-scoped refactor with no functional regressions found; the two findings are style-level inconsistencies that do not cause data loss or incorrect query results.

All read and write paths correctly use named typed bindings, normalise datetime values before binding, and validate query attributes. The purge helpers omitting enforceValueRequirements produces a silent no-op rather than an error, but carries no data-loss risk. The qualifyDdl str_replace approach is fragile in theory but safe for the current naming conventions. The snapshot test suite covers all major SQL paths end-to-end.

src/Usage/Adapter/ClickHouse.php — specifically the purge helpers and qualifyDdl

Important Files Changed

Filename Overview
src/Usage/Adapter/ClickHouse.php Major refactor from hand-rolled SQL to utopia-php/query 0.3.x builder/schema; two P2 findings: purgeFromTable/purgeDaily skip enforceValueRequirements, and qualifyDdl uses global str_replace for database qualification.
src/Usage/Adapter/Database.php Switch statement cases migrated from Query::TYPE_* string constants to Method enum cases; GroupByTimeBucket replaces UsageQuery::TYPE_GROUP_BY_INTERVAL. Mechanical but correct.
src/Usage/UsageQuery.php File deleted; functionality superseded by Query::groupByTimeBucket from utopia-php/query 0.3.x. Breaking change for downstream callers documented in PR description.
tests/Usage/Adapter/ClickHouseSqlSnapshotTest.php New snapshot test suite covering DDL, INSERT FORMAT, SELECT with named typed bindings, groupByTimeBucket aggregation, FINAL, and lightweight DELETE paths — all green at 9/9.
composer.json Dev-branch dependency and minimum-stability: dev noted in previous thread; PHP constraint bumped to >=8.4 and phpstan to ^2.0 are clean changes.
phpstan-baseline.neon New baseline silences 34 type errors (offset access and cast on mixed from untyped json_decode returns); noted in previous thread as suppressions that mask potential regression detection.

Reviews (2): Last reviewed commit: "chore(clickhouse): drop unused runStatem..." | Re-trigger Greptile

Comment thread src/Usage/Adapter/ClickHouse.php Outdated
Comment thread composer.json
Comment on lines +19 to +21
"prefer-stable": true,
"require": {
"php": ">=8.0",
"php": ">=8.4",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Dev-branch dependency and loosened stability setting

"minimum-stability": "dev" means Composer can resolve any transitive dependency to a dev release. Combined with "utopia-php/query": "dev-feat/clickhouse-insert-delete-settings-mv as 0.3.2", the locked revision is a mutable branch tip — if the branch is force-pushed or rebased after composer.lock is committed, any fresh composer update would silently pull a different commit. The PR description acknowledges this needs flipping to ^0.3.2 once tagged.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — this is intentional during the rollout and is called out in the PR description. The plan is:

  1. Merge the upstream utopia-php/query PR and tag 0.3.2 (or whatever the next stable carries the ClickHouse builder + Method enum surface used here).
  2. Flip this PR's composer.json back to "utopia-php/query": "^0.3.2" and restore "minimum-stability": "stable" (dropping prefer-stable) in the same commit, refresh composer.lock, and only then merge.

Until the tag exists the dev-branch alias is required so CI can resolve the dependency at all. Leaving this thread open so it stays visible as a pre-merge gate.

Comment thread phpstan-baseline.neon
…ment import

Inline `$this->query(...)` sites already pass merged named bindings explicitly,
so the wrapper is dead. Removes the corresponding PHPStan baseline entry.
Comment on lines 156 to 170
@@ -172,49 +166,24 @@ public function setTimeout(int $milliseconds): self
return $this;
}

/**
* Enable or disable query logging for debugging.
*
* @param bool $enable Whether to enable query logging
* @return self
*/
public function enableQueryLogging(bool $enable = true): self
{
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep the docs on public methods

Comment on lines +977 to +980
'integer' => $table->addColumn($id, ColumnType::BigInteger),
'float' => $table->addColumn($id, ColumnType::Float),
'boolean' => $table->addColumn($id, ColumnType::Boolean),
'datetime' => $table->datetime($id, 3),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be shorthand methods we can use for integer, float and boolean too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants