Skip to content

revamp gateway-controller#2277

Open
tharindu1st wants to merge 1 commit into
wso2:mainfrom
tharindu1st:gatewat-controller-cleanup
Open

revamp gateway-controller#2277
tharindu1st wants to merge 1 commit into
wso2:mainfrom
tharindu1st:gatewat-controller-cleanup

Conversation

@tharindu1st

Copy link
Copy Markdown
Contributor

revamp gateway-controller

@github-actions

Copy link
Copy Markdown
Contributor

Dependency Validation Results

Dependency name: github.com/envoyproxy/go-control-plane
Version: v0.14.0
Approved: ❌ No - Module not found in dependency registry

Dependency name: github.com/wso2/api-platform/common
Version: v0.0.0
Approved: ❌ No - Module not found in dependency registry

Dependency name: github.com/wso2/api-platform/gateway/gateway-controller
Version: v0.0.0
Approved: ❌ No - Module not found in dependency registry

Dependency name: google.golang.org/protobuf
Version: v1.36.11
Allowed range: >=v1.36.11
Approved: ✅ Yes


Next Steps

  1. Review the validation failures listed above
  2. Check if dependencies are in the approved dependency list
  3. Options to resolve:
    • Remove the unapproved dependencies from this PR
    • OR submit a PR to add these dependencies to the approved list in engineering-governance
  4. Once resolved, push changes to re-run validation

This PR is blocked until all dependencies are approved.

⚠️ Please verify the scope of the dependencies usage is necessary

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Summary

This change revamps the gateway-controller and introduces a new event-gateway-controller component with shared bootstrap patterns.

Gateway controller

  • Refactored startup into a reusable bootstrap package and simplified the main entrypoint.
  • Added extension hooks for schema setup, event processors, routes, and control-plane integration.
  • Introduced configurable storage/schema handling with backend-specific DDL support and registry-based config kind registration.
  • Added webhook secret service/snapshot processing flow and wired it into event handling and xDS updates.
  • Updated control-plane and policy xDS components to use optional extension-provided secret synchronization and cache injection.

Event gateway controller

  • Added a new controller entrypoint, Makefile, Dockerfile, and module definition.
  • Introduced extension bootstrap wiring, webhook secret service/processor logic, and xDS snapshot management.
  • Added backend-specific schema files and schema lookup helpers.
  • Added supporting storage and test updates to match the new interfaces and startup flow.

Walkthrough

This change adds an event-gateway controller with build packaging, schema loading, webhook-secret service and processor logic, xDS snapshot management, and extension wiring. It also moves gateway-controller startup into a bootstrap package, adds storage backend/raw-SQL support, and routes webhook-secret handling through extension hooks, policy xDS cache injection, and management API wiring.

Sequence Diagram(s)

Startup wiring

sequenceDiagram
  participant Bootstrap as bootstrap.Run
  participant Extension as EventGatewayExtension
  participant Listener as EventListener
  participant ControlPlane as controlplane.Client
  participant PolicyXDS as policyxds.Server
  Bootstrap->>Extension: ApplySchema()
  Bootstrap->>Extension: Init()
  Bootstrap->>Listener: RegisterEventProcessor(webhook-secret)
  Bootstrap->>ControlPlane: SetHMACSecretSyncer(Extension)
  Bootstrap->>PolicyXDS: WithWebhookSecretCache(cache)
Loading

Webhook secret lifecycle

sequenceDiagram
  participant Service as WebhookSecretService
  participant Storage as storage.Storage
  participant EventHub as eventhub.EventHub
  participant Listener as EventListener
  participant Processor as WebhookSecretProcessor
  participant Store as webhooksecret.WebhookSecretStore
  participant Snapshot as webhooksecretxds.SnapshotManager
  Service->>Storage: persist or delete secret
  Service->>EventHub: publish webhook-secret event
  EventHub->>Listener: deliver event
  Listener->>Processor: Process(event)
  Processor->>Store: upsert or remove secret
  Processor->>Snapshot: RefreshSnapshot()
Loading

Suggested reviewers

  • renuka-fernando
  • RakhithaRR
  • Tharsanan1
  • VirajSalaka
  • malinthaprasan
  • CrowleyRajapakse
  • pubudu538
🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (1 warning, 2 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 63.64% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title is too generic to convey the primary change in the changeset. Rename it to a specific summary of the main change, e.g. the controller bootstrap/event-gateway revamp.
Description check ❓ Inconclusive The description is just a vague phrase and omits the required template sections. Populate Purpose, Goals, Approach, tests, security, docs, and other template sections with concrete details.
✅ Passed checks (2 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain modules listed in go.work or their selected dependencies"


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (2)
gateway/gateway-controller/pkg/storage/sql_store.go (1)

53-58: 🗄️ Data Integrity & Integration | 🔵 Trivial | ⚡ Quick win

Fail fast on invalid or duplicate kind registrations.

RegisterConfigKind silently overwrites existing mappings and accepts unchecked table names, while later reads depend on this registry. Return an error or otherwise reject empty values, invalid identifiers, and remapping an existing kind to a different table during extension schema setup.

Also applies to: 306-314, 990-1014

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gateway/gateway-controller/pkg/storage/sql_store.go` around lines 53 - 58,
RegisterConfigKind currently overwrites existing kind-to-table mappings and
accepts empty or invalid inputs, so update it to fail fast instead of silently
registering bad state. Add validation for kind and tableName identifiers, and
reject attempts to remap an existing kind to a different table while allowing
the same mapping to be re-registered safely if needed. Propagate the failure
through the extension schema setup path that calls RegisterConfigKind, and
ensure the registry lookup code used by LoadFromDatabase only sees validated
entries.
gateway/gateway-controller/pkg/bootstrap/runner_test.go (1)

552-709: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add a non-nil extension case to this suite.

These updates only cover generateAuthConfig(cfg, nil), so the new extension merge path is still untested. Add a small fake extension that asserts prefixed and legacy entries are generated and that duplicate route keys are handled explicitly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gateway/gateway-controller/pkg/bootstrap/runner_test.go` around lines 552 -
709, The TestGenerateAuthConfig suite only covers generateAuthConfig(cfg, nil),
so the extension merge path is still untested. Add a non-nil fake extension case
in TestGenerateAuthConfig that exercises generateAuthConfig with an extension
and verifies both prefixed and legacy resource-role entries are produced. Use
the existing generateAuthConfig helper and ResourceRoles assertions, and
explicitly cover how duplicate route keys are handled when the extension
contributes overlapping routes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@event-gateway/event-gateway-controller/Dockerfile`:
- Around line 27-29: The Dockerfile is copying the gateway-controller module
into a path that does not match the module’s local replace target, so fix the
COPY destination used in the `gateway-controller` build stage to match the path
expected by `go.mod` and `go mod download` (the one referenced by the
`gateway-controller` module’s replace). Update both the initial dependency-copy
step and the later copy step that reuse the `gateway-controller` module so the
files land under the same `/gateway/gateway-controller` location instead of
`/gateway-controller`.

In
`@event-gateway/event-gateway-controller/pkg/eventprocessors/webhook_secret.go`:
- Line 112: The upsert path in webhook_secret processing dereferences
providerManager without checking whether encryption is configured, so
CREATE/UPDATE events can panic when extension initialization leaves it nil.
Update the webhook secret upsert handling in the relevant method around the
Decrypt call to return early for CREATE/UPDATE when providerManager is nil, and
ensure the guard is applied before any providerManager.Decrypt usage.

In `@event-gateway/event-gateway-controller/pkg/extension/extension.go`:
- Around line 276-289: The artifact-clear flow in extension.go is using
e.db.GetWebhookSecretsByArtifact, so it only iterates DB rows and can miss stale
in-memory secrets after the rows are already gone. Update the clear logic to
read the current secrets from webhookSecretStore instead, then remove each
matching entry from the in-memory store using the existing Remove path and keep
the warning/error logging in the same clear loop.

In `@event-gateway/event-gateway-controller/pkg/services/webhook_secret.go`:
- Line 129: The mutating webhook secret flows currently ignore failures from
publishWebhookSecretEvent and Regenerate is still emitting a CREATE event for an
existing secret. Update the relevant methods that call
s.publishWebhookSecretEvent to return any publish error back to the caller
instead of treating the operation as successful, and change the regeneration
path to publish an UPDATE event. Use the existing symbols
publishWebhookSecretEvent, Regenerate, and the create/update call sites in
webhook_secret.go to make the fix consistently across the affected methods.

In `@gateway/gateway-controller/pkg/bootstrap/runner.go`:
- Around line 849-853: Reject duplicate auth entries when merging extension
authz maps in runner.go. In the extension merge loop that iterates over
extensions and calls AuthzEntries, detect when a methodAndPath already exists in
relativeRoles; if the existing roles differ, fail startup instead of
overwriting. If the duplicate roles are identical, allow it or treat it as a
no-op, and make the check explicit around the relativeRoles assignment.
- Around line 429-434: The extension option collection in runner.go is silently
discarding any PolicyXDSOptions entry that does not implement
policyxds.ServerOption, which hides wiring mistakes. Update the loop over
extensions and ext.PolicyXDSOptions() so that non-server options are treated as
an error: log the extension and option type, then fail fast instead of appending
only valid options and continuing. Use the existing extension iteration in
runner.go as the place to surface the invalid xDS option immediately.
- Around line 911-918: The raw adapter methods in eventListenerAdapter and the
related adapter at the other referenced location silently ignore failed type
assertions, which hides wiring mistakes; update RegisterEventProcessorRaw (and
the matching method) to surface the mismatch by returning an error or failing
startup instead of no-oping, and adjust the caller contract so bad eventType
values are detected early rather than silently dropping event processing.

In `@gateway/gateway-controller/pkg/controlplane/client.go`:
- Around line 148-150: The HMAC syncer registration order is wrong: the Client
contract in SetHMACSecretSyncer requires wiring before Start(), but the runner
currently starts cpClient before SetControlPlaneDeps, leaving early WebSub/HMAC
callbacks without a registered syncer. Reorder the bootstrap flow in runner.go
so extension/control-plane dependencies are set on cpClient before calling
cpClient.Start(), and keep the SetHMACSecretSyncer/SetControlPlaneDeps path
consistent with the pre-Start contract.

In `@gateway/gateway-controller/pkg/storage/sqlite.go`:
- Line 76: The SQLite schema version expectation has changed, so the
unsupported-schema test in sqlite_test.go is now out of sync. Update the test
assertion that currently expects schema version 3 to match the new
currentSchemaVersion value used in sqlite.go, and adjust the test case around
the unsupported-version check so it validates against version 4.

---

Nitpick comments:
In `@gateway/gateway-controller/pkg/bootstrap/runner_test.go`:
- Around line 552-709: The TestGenerateAuthConfig suite only covers
generateAuthConfig(cfg, nil), so the extension merge path is still untested. Add
a non-nil fake extension case in TestGenerateAuthConfig that exercises
generateAuthConfig with an extension and verifies both prefixed and legacy
resource-role entries are produced. Use the existing generateAuthConfig helper
and ResourceRoles assertions, and explicitly cover how duplicate route keys are
handled when the extension contributes overlapping routes.

In `@gateway/gateway-controller/pkg/storage/sql_store.go`:
- Around line 53-58: RegisterConfigKind currently overwrites existing
kind-to-table mappings and accepts empty or invalid inputs, so update it to fail
fast instead of silently registering bad state. Add validation for kind and
tableName identifiers, and reject attempts to remap an existing kind to a
different table while allowing the same mapping to be re-registered safely if
needed. Propagate the failure through the extension schema setup path that calls
RegisterConfigKind, and ensure the registry lookup code used by LoadFromDatabase
only sees validated entries.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c3166066-14b7-4627-b852-5b58d11208e8

📥 Commits

Reviewing files that changed from the base of the PR and between 7d3b667 and 5832da8.

⛔ Files ignored due to path filters (2)
  • event-gateway/event-gateway-controller/go.sum is excluded by !**/*.sum
  • go.work is excluded by !**/*.work
📒 Files selected for processing (37)
  • event-gateway/event-gateway-controller/Dockerfile
  • event-gateway/event-gateway-controller/Makefile
  • event-gateway/event-gateway-controller/cmd/main.go
  • event-gateway/event-gateway-controller/go.mod
  • event-gateway/event-gateway-controller/pkg/eventprocessors/webhook_secret.go
  • event-gateway/event-gateway-controller/pkg/extension/extension.go
  • event-gateway/event-gateway-controller/pkg/services/webhook_secret.go
  • event-gateway/event-gateway-controller/pkg/storage/event-gateway-db.postgres.sql
  • event-gateway/event-gateway-controller/pkg/storage/event-gateway-db.sql
  • event-gateway/event-gateway-controller/pkg/storage/event-gateway-db.sqlserver.sql
  • event-gateway/event-gateway-controller/pkg/storage/schema.go
  • event-gateway/event-gateway-controller/pkg/webhooksecretxds/snapshot.go
  • gateway/gateway-controller/cmd/controller/main.go
  • gateway/gateway-controller/pkg/api/handlers/handlers.go
  • gateway/gateway-controller/pkg/api/handlers/handlers_test.go
  • gateway/gateway-controller/pkg/bootstrap/extension.go
  • gateway/gateway-controller/pkg/bootstrap/runner.go
  • gateway/gateway-controller/pkg/bootstrap/runner_test.go
  • gateway/gateway-controller/pkg/bootstrap/runtime_bootstrap.go
  • gateway/gateway-controller/pkg/bootstrap/runtime_bootstrap_test.go
  • gateway/gateway-controller/pkg/controlplane/api_deleted_test.go
  • gateway/gateway-controller/pkg/controlplane/client.go
  • gateway/gateway-controller/pkg/controlplane/client_integration_test.go
  • gateway/gateway-controller/pkg/controlplane/controlplane_test.go
  • gateway/gateway-controller/pkg/eventlistener/listener.go
  • gateway/gateway-controller/pkg/eventlistener/listener_test.go
  • gateway/gateway-controller/pkg/policyxds/combined_cache.go
  • gateway/gateway-controller/pkg/policyxds/server.go
  • gateway/gateway-controller/pkg/secrets/service_test.go
  • gateway/gateway-controller/pkg/storage/gateway-controller-db.postgres.sql
  • gateway/gateway-controller/pkg/storage/gateway-controller-db.sql
  • gateway/gateway-controller/pkg/storage/gateway-controller-db.sqlserver.sql
  • gateway/gateway-controller/pkg/storage/interface.go
  • gateway/gateway-controller/pkg/storage/sql_store.go
  • gateway/gateway-controller/pkg/storage/sqlite.go
  • gateway/gateway-controller/pkg/subscriptionxds/subscription_snapshot_test.go
  • gateway/gateway-controller/pkg/utils/mock_db_test.go
💤 Files with no reviewable changes (3)
  • gateway/gateway-controller/pkg/storage/gateway-controller-db.sqlserver.sql
  • gateway/gateway-controller/pkg/storage/gateway-controller-db.postgres.sql
  • gateway/gateway-controller/pkg/eventlistener/listener_test.go

Comment on lines +27 to +29
COPY --from=sdk-core go.mod /sdk/core/
COPY --from=common go.mod go.sum /common/
COPY --from=gateway-controller go.mod go.sum /gateway-controller/

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🔴 Critical | ⚡ Quick win

Align the gateway-controller copy destination with the local replace path.

The replace on Line 28 of event-gateway/event-gateway-controller/go.mod resolves ../../gateway/gateway-controller from /build, so go mod download looks for /gateway/gateway-controller. These lines populate /gateway-controller instead, which breaks the builder before compile.

Suggested fix
-COPY --from=gateway-controller go.mod go.sum /gateway-controller/
+COPY --from=gateway-controller go.mod go.sum /gateway/gateway-controller/
@@
-COPY --from=gateway-controller . /gateway-controller
+COPY --from=gateway-controller . /gateway/gateway-controller

Also applies to: 36-38

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@event-gateway/event-gateway-controller/Dockerfile` around lines 27 - 29, The
Dockerfile is copying the gateway-controller module into a path that does not
match the module’s local replace target, so fix the COPY destination used in the
`gateway-controller` build stage to match the path expected by `go.mod` and `go
mod download` (the one referenced by the `gateway-controller` module’s replace).
Update both the initial dependency-copy step and the later copy step that reuse
the `gateway-controller` module so the files land under the same
`/gateway/gateway-controller` location instead of `/gateway-controller`.

}

plaintext, err := l.providerManager.Decrypt(payload)
plaintext, err := p.providerManager.Decrypt(payload)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🔴 Critical | ⚡ Quick win

Guard upsert processing when encryption is unavailable.

providerManager can be nil from extension initialization, but upsert processing dereferences it unconditionally. Return early for CREATE/UPDATE events when no provider is configured.

Proposed fix
+	if p.providerManager == nil {
+		p.logger.Warn("Skipping webhook secret upsert because encryption provider is unavailable",
+			slog.String("secret_uuid", secretUUID),
+			slog.String("event_id", event.EventID))
+		return
+	}
+
 	plaintext, err := p.providerManager.Decrypt(payload)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
plaintext, err := p.providerManager.Decrypt(payload)
if p.providerManager == nil {
p.logger.Warn("Skipping webhook secret upsert because encryption provider is unavailable",
slog.String("secret_uuid", secretUUID),
slog.String("event_id", event.EventID))
return
}
plaintext, err := p.providerManager.Decrypt(payload)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@event-gateway/event-gateway-controller/pkg/eventprocessors/webhook_secret.go`
at line 112, The upsert path in webhook_secret processing dereferences
providerManager without checking whether encryption is configured, so
CREATE/UPDATE events can panic when extension initialization leaves it nil.
Update the webhook secret upsert handling in the relevant method around the
Decrypt call to return early for CREATE/UPDATE when providerManager is nil, and
ensure the guard is applied before any providerManager.Decrypt usage.

Comment on lines +276 to +289
secrets, err := e.db.GetWebhookSecretsByArtifact(artifactID)
if err != nil {
e.log.Error("Failed to fetch webhook secrets for artifact clear",
slog.String("artifact_id", artifactID),
slog.Any("error", err))
return
}
for _, ws := range secrets {
if err := e.webhookSecretStore.Remove(ws.ArtifactUUID, ws.Name); err != nil && err != webhooksecret.ErrNotFound {
e.log.Warn("Failed to remove webhook secret from memory during artifact clear",
slog.String("secret_uuid", ws.UUID),
slog.Any("error", err))
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Clear artifact secrets from the in-memory store, not from the DB result set.

If the DB rows have already been removed, this fetch returns no rows and leaves stale in-memory secrets. Use the store contents for the artifact when clearing.

Proposed fix
-	secrets, err := e.db.GetWebhookSecretsByArtifact(artifactID)
-	if err != nil {
-		e.log.Error("Failed to fetch webhook secrets for artifact clear",
-			slog.String("artifact_id", artifactID),
-			slog.Any("error", err))
-		return
-	}
-	for _, ws := range secrets {
-		if err := e.webhookSecretStore.Remove(ws.ArtifactUUID, ws.Name); err != nil && err != webhooksecret.ErrNotFound {
+	for secretName := range e.webhookSecretStore.GetAll()[artifactID] {
+		if err := e.webhookSecretStore.Remove(artifactID, secretName); err != nil && err != webhooksecret.ErrNotFound {
 			e.log.Warn("Failed to remove webhook secret from memory during artifact clear",
-				slog.String("secret_uuid", ws.UUID),
+				slog.String("artifact_id", artifactID),
+				slog.String("secret_name", secretName),
 				slog.Any("error", err))
 		}
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
secrets, err := e.db.GetWebhookSecretsByArtifact(artifactID)
if err != nil {
e.log.Error("Failed to fetch webhook secrets for artifact clear",
slog.String("artifact_id", artifactID),
slog.Any("error", err))
return
}
for _, ws := range secrets {
if err := e.webhookSecretStore.Remove(ws.ArtifactUUID, ws.Name); err != nil && err != webhooksecret.ErrNotFound {
e.log.Warn("Failed to remove webhook secret from memory during artifact clear",
slog.String("secret_uuid", ws.UUID),
slog.Any("error", err))
}
}
for secretName := range e.webhookSecretStore.GetAll()[artifactID] {
if err := e.webhookSecretStore.Remove(artifactID, secretName); err != nil && err != webhooksecret.ErrNotFound {
e.log.Warn("Failed to remove webhook secret from memory during artifact clear",
slog.String("artifact_id", artifactID),
slog.String("secret_name", secretName),
slog.Any("error", err))
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@event-gateway/event-gateway-controller/pkg/extension/extension.go` around
lines 276 - 289, The artifact-clear flow in extension.go is using
e.db.GetWebhookSecretsByArtifact, so it only iterates DB rows and can miss stale
in-memory secrets after the rows are already gone. Update the clear logic to
read the current secrets from webhookSecretStore instead, then remove each
matching entry from the in-memory store using the existing Remove path and keep
the warning/error logging in the same clear loop.

slog.Any("error", err))
}

s.publishWebhookSecretEvent("CREATE", artifactUUID, ws.UUID, ws.Name, correlationID)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Propagate publish failures and emit UPDATE for regeneration.

These mutating methods currently report success even if event publication fails, and Regenerate emits CREATE for an existing secret. Return the publish error to callers and use UPDATE for regeneration.

Proposed fix
-	s.publishWebhookSecretEvent("CREATE", artifactUUID, ws.UUID, ws.Name, correlationID)
+	if err := s.publishWebhookSecretEvent("CREATE", artifactUUID, ws.UUID, ws.Name, correlationID); err != nil {
+		return nil, "", err
+	}
 
 	return ws, plaintext, nil
@@
-	s.publishWebhookSecretEvent("CREATE", artifactUUID, ws.UUID, ws.Name, correlationID)
+	if err := s.publishWebhookSecretEvent("UPDATE", artifactUUID, ws.UUID, ws.Name, correlationID); err != nil {
+		return nil, "", err
+	}
 
 	return ws, plaintext, nil
@@
-	s.publishWebhookSecretEvent("DELETE", artifactUUID, ws.UUID, ws.Name, correlationID)
+	if err := s.publishWebhookSecretEvent("DELETE", artifactUUID, ws.UUID, ws.Name, correlationID); err != nil {
+		return err
+	}
 
 	return nil
 }
@@
-func (s *WebhookSecretService) publishWebhookSecretEvent(action, artifactUUID, secretUUID, secretName, correlationID string) {
+func (s *WebhookSecretService) publishWebhookSecretEvent(action, artifactUUID, secretUUID, secretName, correlationID string) error {
 	event := eventhub.Event{
@@
 	if err := s.eventHub.PublishEvent(s.gatewayID, event); err != nil {
 		s.logger.Error("Failed to publish webhook secret event",
@@
-			slog.Any("error", err))
+			slog.Any("error", err))
+		return fmt.Errorf("failed to publish webhook secret event: %w", err)
 	}
+	return nil
 }

Also applies to: 173-173, 199-199, 212-226

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@event-gateway/event-gateway-controller/pkg/services/webhook_secret.go` at
line 129, The mutating webhook secret flows currently ignore failures from
publishWebhookSecretEvent and Regenerate is still emitting a CREATE event for an
existing secret. Update the relevant methods that call
s.publishWebhookSecretEvent to return any publish error back to the caller
instead of treating the operation as successful, and change the regeneration
path to publish an UPDATE event. Use the existing symbols
publishWebhookSecretEvent, Regenerate, and the create/update call sites in
webhook_secret.go to make the fix consistently across the affected methods.

Comment on lines +429 to +434
for _, ext := range extensions {
for _, opt := range ext.PolicyXDSOptions() {
if serverOpt, ok := opt.(policyxds.ServerOption); ok {
serverOpts = append(serverOpts, serverOpt)
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Do not ignore invalid extension xDS options.

This loop silently drops any value that is not a policyxds.ServerOption. That makes extension wiring mistakes look like a successful startup while the extra xDS behavior never turns on. Log the extension/type and fail fast here instead of continuing.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gateway/gateway-controller/pkg/bootstrap/runner.go` around lines 429 - 434,
The extension option collection in runner.go is silently discarding any
PolicyXDSOptions entry that does not implement policyxds.ServerOption, which
hides wiring mistakes. Update the loop over extensions and
ext.PolicyXDSOptions() so that non-server options are treated as an error: log
the extension and option type, then fail fast instead of appending only valid
options and continuing. Use the existing extension iteration in runner.go as the
place to surface the invalid xDS option immediately.

Comment on lines +849 to +853
// Merge in authorization entries from each extension.
for _, ext := range extensions {
for methodAndPath, roles := range ext.AuthzEntries() {
relativeRoles[methodAndPath] = roles
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Reject duplicate auth entries from extensions.

These assignments overwrite existing role mappings on key collision. If an extension reuses a built-in route key by mistake, the controller starts with a different authorization map for that route. Detect duplicates and stop startup unless the role sets are intentionally identical.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gateway/gateway-controller/pkg/bootstrap/runner.go` around lines 849 - 853,
Reject duplicate auth entries when merging extension authz maps in runner.go. In
the extension merge loop that iterates over extensions and calls AuthzEntries,
detect when a methodAndPath already exists in relativeRoles; if the existing
roles differ, fail startup instead of overwriting. If the duplicate roles are
identical, allow it or treat it as a no-op, and make the check explicit around
the relativeRoles assignment.

Comment on lines +911 to +918
func (a *eventListenerAdapter) RegisterEventProcessorRaw(eventType interface{}, fn func(interface{})) {
et, ok := eventType.(eventhub.EventType)
if !ok {
return
}
a.l.RegisterEventProcessor(et, func(event eventhub.Event) {
fn(event)
})

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Surface raw adapter type mismatches.

Both adapters return quietly on failed type assertions. That turns extension contract mistakes into silent loss of event processing or control-plane wiring. Return an error or fail startup instead of no-oping here.

Also applies to: 926-929

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gateway/gateway-controller/pkg/bootstrap/runner.go` around lines 911 - 918,
The raw adapter methods in eventListenerAdapter and the related adapter at the
other referenced location silently ignore failed type assertions, which hides
wiring mistakes; update RegisterEventProcessorRaw (and the matching method) to
surface the mismatch by returning an error or failing startup instead of
no-oping, and adjust the caller contract so bad eventType values are detected
early rather than silently dropping event processing.

Comment on lines +148 to +150
// SetHMACSecretSyncer registers an HMAC secret syncer. Must be called before Start().
func (c *Client) SetHMACSecretSyncer(s HMACSecretSyncer) {
c.hmacSecretSyncer = s

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Register the HMAC syncer before starting the client.

Line 148 documents a pre-Start() contract, but the provided runner wiring starts cpClient before calling SetControlPlaneDeps, so early WebSub/HMAC events can hit these delegate methods with no syncer registered. Move extension dependency wiring before cpClient.Start().

Proposed ordering fix in gateway/gateway-controller/pkg/bootstrap/runner.go
 cpClient := controlplane.NewClient(
     ...
 )
-if err := cpClient.Start(); err != nil {
-    log.Error("Failed to start control plane client", slog.Any("error", err))
-    // Don't fail startup — gateway can run in degraded mode without control plane.
-}
 
 // Wire extension deps into the control plane client.
 for _, ext := range extensions {
     ext.SetControlPlaneDeps(&controlPlaneAdapter{cpClient})
 }
+
+if err := cpClient.Start(); err != nil {
+    log.Error("Failed to start control plane client", slog.Any("error", err))
+    // Don't fail startup — gateway can run in degraded mode without control plane.
+}

Also applies to: 2489-2499

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gateway/gateway-controller/pkg/controlplane/client.go` around lines 148 -
150, The HMAC syncer registration order is wrong: the Client contract in
SetHMACSecretSyncer requires wiring before Start(), but the runner currently
starts cpClient before SetControlPlaneDeps, leaving early WebSub/HMAC callbacks
without a registered syncer. Reorder the bootstrap flow in runner.go so
extension/control-plane dependencies are set on cpClient before calling
cpClient.Start(), and keep the SetHMACSecretSyncer/SetControlPlaneDeps path
consistent with the pre-Start contract.

}

const currentSchemaVersion = 3
const currentSchemaVersion = 4

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Update the SQLite schema-version test expectation.

Line 76 now reports expected 4, but the provided sqlite_test.go snippet still checks for expected 3, so that unsupported-schema test will fail.

Suggested test update
- assert.ErrorContains(t, err, "failed to initialize schema: unsupported schema version 5, expected 3; delete the database to recreate")
+ assert.ErrorContains(t, err, "failed to initialize schema: unsupported schema version 5, expected 4; delete the database to recreate")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gateway/gateway-controller/pkg/storage/sqlite.go` at line 76, The SQLite
schema version expectation has changed, so the unsupported-schema test in
sqlite_test.go is now out of sync. Update the test assertion that currently
expects schema version 3 to match the new currentSchemaVersion value used in
sqlite.go, and adjust the test case around the unsupported-version check so it
validates against version 4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant