Skip to content

Add indexes on sp_action.rollout and rollout_group#3045

Open
clayly wants to merge 4 commits into
eclipse-hawkbit:masterfrom
clayly:perf/sp-action-rollout-indexes
Open

Add indexes on sp_action.rollout and rollout_group#3045
clayly wants to merge 4 commits into
eclipse-hawkbit:masterfrom
clayly:perf/sp-action-rollout-indexes

Conversation

@clayly
Copy link
Copy Markdown
Contributor

@clayly clayly commented May 2, 2026

Summary

Several rollout-monitoring queries on sp_action (existsByRolloutId, getStatusCountByRolloutId, getStatusCountByRolloutGroupId, etc.) filter by rollout or rollout_group. The flyway baseline does not index either column, so Postgres falls back to Seq Scan, scanning the entire sp_action table on every monitoring poll.

For a 16 k-device rollout this means 16 k action rows scanned per call, multiple times per second during dispatch and monitoring. Beyond a single rollout, the table grows over the lifetime of the deployment and the cost grows linearly.

Fix

Two composite indexes:

  • sp_idx_action_rollout_status on (tenant, rollout, status) — covers both filter and GROUP BY status patterns; enables Index Only Scan with no heap fetch
  • sp_idx_action_rollout_group on (tenant, rollout_group) — covers rollout_group=? lookups (Hibernate-generated SQL always includes tenant via the multi-tenant filter)

V1_20_2__action_rollout_indexes migrations added for POSTGRESQL, H2, and MYSQL.

Idempotency

Each migration is guarded so re-applying the SQL against a schema that already has the index is a no-op:

  • PostgreSQL and H2: CREATE INDEX IF NOT EXISTS
  • MySQL ≤ 8.x has no native IF NOT EXISTS for CREATE INDEX, so the migration uses an INFORMATION_SCHEMA lookup with PREPARE/EXECUTE. MariaDB and MySQL behave identically here.

This protects deployments that may already have created these indexes out-of-band (e.g. via a fork's repeatable migration, an operator hot-fix, or a re-run after a failed migration).

Performance evidence

Insert 16 000 rows into sp_action tied to rollout=1, spread across rollout_group=1..17. Run the hot queries 1000 times each, before and after the indexes.

PostgreSQL 16

query no index with index speedup
WHERE tenant=? AND rollout=? GROUP BY status 1.256 s 0.504 s 2.5x
WHERE tenant=? AND rollout_group=? count 0.501 s 0.027 s 18.6x
WHERE rollout=? LIMIT 1 (exists) 1.5 µs 1.4 µs 1x (early-out)

Plans switch from Seq Scan to Index Only Scan, Heap Fetches: 0. Buffer reads drop from 214 → 17 (12x).

YugabyteDB (PostgreSQL-compatible YSQL)

query no index with index speedup
WHERE tenant=? AND rollout=? GROUP BY status 4.146 s 2.841 s 1.5x
WHERE tenant=? AND rollout_group=? count 2.595 s 0.148 s 17.6x
WHERE rollout=? LIMIT 1 (exists) 21 µs 21 µs 1x (early-out)

Plain PG syntax works on YB. The leading column gets HASH-partitioned across tablets by default, optimal for the equality-on-tenant access pattern. Index Only Scan is supported on YB with Heap Fetches: 0.

YB seq scans are 3–5x slower than PG seq scans (network round-trip per page across tablets), so the index is even more critical on a YugabyteDB deployment.

Test plan

  • Schema migration applies cleanly on H2, PostgreSQL and YugabyteDB
  • Re-applying migration against a schema that already has the indexes is a no-op
  • EXPLAIN confirms Index Only Scan after migration
  • Long-running rollout test (1 k+ targets) before/after — measure UI rollout-detail responsiveness

@hawkbit-bot
Copy link
Copy Markdown

Thanks @clayly for taking the time to contribute to hawkBit! We really appreciate this. Make yourself comfortable while I'm looking for a committer to help you with your contribution.
Please make sure you read the contribution guide and signed the Eclipse Contributor Agreement (ECA).

Rollout monitoring queries (existsByRolloutId, getStatusCountByRolloutId,
getStatusCountByRolloutGroupId) filter by rollout or rollout_group on
sp_action. The flyway baseline did not index either column, so Postgres
falls back to Seq Scan on every monitoring poll. With 16k action rows
this is meaningful — the group-count query takes ~500 ms without the
index and ~27 ms with it (Index Only Scan, Heap Fetches: 0).

Bench (16k rows, 1000 iter):
- WHERE tenant=? AND rollout_group=?           18.6x faster on PG
                                               17.6x faster on YugabyteDB
- WHERE tenant=? AND rollout=? GROUP BY status  2.5x faster on PG
                                                1.5x faster on YugabyteDB

Adds V1_20_2 sibling migrations for POSTGRESQL, H2, and MYSQL.
@clayly clayly force-pushed the perf/sp-action-rollout-indexes branch from 83390ff to 51c1316 Compare May 3, 2026 10:58
@vasilchev
Copy link
Copy Markdown
Contributor

vasilchev commented May 12, 2026

Hello @clayly ,

nice catch! indeed sp_action had no indexes on rollout and rollout_group columns (including the auto-injected 'tenant') despite
both being heavily queried by rollout monitoring and scheduling.

One comment though,

Wouldn't including 'status' in 'rollout_group' index (also like for the 'rollout') be even better i.e.: (example for h2, but applicable for all h2,mysql,postgre)

CREATE INDEX IF NOT EXISTS sp_idx_action_rollout_group ON sp_action (tenant, rollout_group);

to become

CREATE INDEX IF NOT EXISTS sp_idx_action_rollout_group ON sp_action (tenant, rollout_group, status);

This way we would improve and handle the queries that includes the 'status' for 'rollout_group' related (mostly in GROUP_BY) and also actually benefit even more for 'countByRolloutIdAndRolloutGroupIdAndStatus' (this now would pick the (tenant, rollout_group, status) index -> selectin all actions with given status for given rolloutGroup first, rather then, selecting all actions with given status for given rollout first?

Per upstream review (vasilchev on PR eclipse-hawkbit#3045): widening the rollout_group
index to (tenant, rollout_group, status) lets:

  * getStatusCountByRolloutGroupId(s) satisfy
    GROUP BY rollout_group, status as an index-only scan, and
  * countByRolloutIdAndRolloutGroupIdAndStatus seek directly on
    (rollout_group, status) instead of scanning every action of the
    parent rollout for the requested status.

Applied symmetrically to H2, MYSQL, and POSTGRESQL migrations; the
POSTGRESQL header comment is extended with the rationale.
@clayly
Copy link
Copy Markdown
Contributor Author

clayly commented May 13, 2026

Thanks @vasilchev — good catch. Pushed commit 8a83e1f widening sp_idx_action_rollout_group to (tenant, rollout_group, status) across H2, MySQL and PostgreSQL migrations. As you noted this also helps countByRolloutIdAndRolloutGroupIdAndStatus and the GROUP BY rollout_group, status queries (getStatusCountByRolloutGroupId(s)) — extended the PostgreSQL header comment with that rationale.

@@ -0,0 +1,3 @@
-- See POSTGRESQL/V1_20_2__action_rollout_indexes__POSTGRESQL.sql for rationale.
CREATE INDEX IF NOT EXISTS sp_idx_action_rollout_status ON sp_action (tenant, rollout, status);
CREATE INDEX IF NOT EXISTS sp_idx_action_rollout_group ON sp_action (tenant, rollout_group, status);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'sp_idx_action_rollout_group' now can become 'sp_idx_action_rollout_group_status' (for all db migration schemas)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 340712b — renamed to sp_idx_action_rollout_group_status across H2, MySQL and PostgreSQL migrations to mirror the sp_idx_action_rollout_status naming.

-- directly on (rollout_group, status) rather than scanning all actions of
-- the parent rollout for the requested status.
--
-- IF NOT EXISTS guards against deployments that already created these indexes
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that we should add such improvements with such checks (IF NOT EXIST).

One of the issues i have is CREATE INDEX IF NOT EXIST would check existence of such index by name and if such exists our improvement will be skipped silently.
There is possibility someone created such index name but content is different - without any idea this improvement will be skipped. (and user will not be aware if not seen actual change list of original repo)

For such cases where forked repo want to slef introduce improvements maybe better to add convention with some index prefixes (i.e. my_fork_idx_action....).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree — the silent-skip risk on column-set drift outweighs the original "tolerate fork pre-creation" motivation, and we'd already hit that trap on our own fork. Dropped CREATE INDEX IF NOT EXISTS from H2/PostgreSQL and the INFORMATION_SCHEMA guard from MySQL in 340712b. Rewrote the PostgreSQL header to spell out the rationale and recommend that forks self-introducing these indexes use a fork-specific name prefix (e.g. fork_idx_action_rollout_*) so the upstream-named indexes can land cleanly on sync.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick follow-up: trimmed the fork-prefix recommendation out of the header in ed1cb5a — migration comments shouldn't double as fork-management guidance. Kept only the in-scope rationale (loud-fail vs silent-skip on name-vs-content drift).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to remove all the headers (comments) in the schema migrations themselves - it is enough in the PR description and the commit message description itself (if one wants to get familiar why this was introduced).

Regarding the the fork-managment guidance - i would suggest to the team to add a general 'recommendation' section for forks in the main README (for db schema migrations like the prefix 'my_fork_idx..' and maybe other recommendations would be put there later as well.)

clayly added 2 commits May 14, 2026 15:56
Per upstream review (vasilchev on PR eclipse-hawkbit#3045):

* Rename `sp_idx_action_rollout_group` to `sp_idx_action_rollout_group_status`
  to reflect the widened column set `(tenant, rollout_group, status)`.
  Mirrors the existing `sp_idx_action_rollout_status` naming.

* Drop `CREATE INDEX IF NOT EXISTS` from H2 / PostgreSQL and the
  INFORMATION_SCHEMA guard from MySQL. The original intent was tolerating
  fork-side pre-creation, but `CREATE INDEX IF NOT EXISTS` matches by
  index name only — a pre-existing index with the same name but a
  different column set is silently kept and the perf improvement is
  lost. A loud migration failure is the safer signal. Forks that
  pre-create these indexes should use a fork-specific name prefix
  (e.g. `fork_idx_action_rollout_*`) so the upstream-named indexes can
  land cleanly on next sync. Comment in the PostgreSQL header rewritten
  accordingly.
Migration header should not double as fork-management guidance.
Keep only the in-scope rationale (loud-fail vs silent-skip on
name-vs-content drift) and drop the fork-prefix recommendation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants