feat(client): autonomous EdgeX 4.x edge collection container (MQTT bus) #9

Draft
tomgrv wants to merge 17 commits into develop from claude/loving-curie-6pu0c4

@tomgrv tomgrv commented Jun 19, 2026

Summary

Implements the "Container autonome de collecte EdgeX (v4, MQTT)" epic: the EdgeKit client is pivoted from a bespoke Node.js metrics agent to an autonomous EdgeX Foundry 4.x edge collection stack. Delivered as requested: design docs + client Dockerfile + static config files, replacing the old client.

The edge node:

  • uses an internal MQTT message bus (Mosquitto, loopback) instead of Redis
  • collects via device-modbus (Modbus TCP) and device-rest (REST/JSON)
  • runs app-service-configurable to filter → tag → compress → export Events to the central MQTT broker, with native Store & Forward for offline buffering and auto-reconnect
  • persists locally in embedded PostgreSQL (EdgeX 4.x default DB)
  • optionally enables core-data for local generic persistence (EDGEKIT_ENABLE_CORE_DATA)
  • is 100% statically configured (baked res/ files + env overrides), no Consul/registry

How user stories are addressed

| # | Story | Where |
|---|-------|-------|
| 1 | Autonomous EdgeX 4.x node, MQTT internal bus, single run | client/Dockerfile (all-in-one, supervisord), docker-compose.yml |
| 2 | Export to central MQTT broker, reconnect/recovery | client/res/app-service/configuration.yaml (MQTTExport, AutoReconnect) |
| 3 | Local buffering + replay on reconnect | Writable.StoreAndForward + PersistOnError (Postgres-backed) |
| 4 | Static config only, documented upgrade procedure | client/res/, client/env/edge.env, docs/edge-operations.md |
| 5 | Optional core-data + Postgres | EDGEKIT_ENABLE_CORE_DATA, core-data-wrapper.sh |
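
Story 3 (buffer locally, replay on reconnect) follows the usual store-and-forward pattern. A minimal in-memory sketch of that pattern is shown here; it is not the EdgeX Store & Forward code, which persists to Postgres. The `StoreAndForward` class and the boolean `publish` callback are hypothetical names for illustration.

```python
from collections import deque
from typing import Callable, Deque, List


class StoreAndForward:
    """Buffer payloads that fail to export and replay them, oldest first."""

    def __init__(self, publish: Callable[[bytes], bool]) -> None:
        self._publish = publish  # returns True when the broker accepted the payload
        self._pending: Deque[bytes] = deque()

    def export(self, payload: bytes) -> None:
        # Queue behind anything already waiting so ordering is preserved.
        self._pending.append(payload)
        self.retry()

    def retry(self) -> int:
        """Try to flush the backlog; return how many payloads are still pending."""
        while self._pending:
            if not self._publish(self._pending[0]):
                break  # still offline, keep the backlog
            self._pending.popleft()
        return len(self._pending)


sent: List[bytes] = []
link = {"up": False}


def publish(payload: bytes) -> bool:
    if link["up"]:
        sent.append(payload)
    return link["up"]


saf = StoreAndForward(publish)
saf.export(b"a")   # offline: buffered
saf.export(b"b")   # offline: buffered
link["up"] = True
saf.export(b"c")   # online again: backlog replayed first, then the new payload
```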

Key changes

  • client/: all-in-one Dockerfile (lifts EdgeX 4.x binaries, orchestrated by supervisord), entrypoint.sh (Postgres bootstrap), static res/ config and env/edge.env
  • Helm: client converted from Deployment to StatefulSet with per-replica persistence + headless service; values.yaml reworked for the new model
  • docker-compose / scripts / CI: updated for the new client (no more Node build)
  • docs: architecture.md rewrite, adr/0001-edgex-4x-mqtt-edge.md, edge-operations.md

⚠️ Validation status

This stack has not been booted end-to-end. It is configuration and scaffolding only; the following must be validated against the pinned EDGEX_VERSION before production use (tracked in ADR 0001 follow-ups):

  • upstream EdgeX image binary / res/ paths used by the multi-stage COPY --from
  • exact env-override key names
  • the embedded Postgres bootstrap and EdgeX DB credential handling
  • app-service-configurable profile selection

Adding a smoke test to CI and an optional one-process-per-container docker-compose.edge.yml are noted as follow-ups.

🤖 Generated with Claude Code


…ntainer

Pivot the EdgeKit client from a bespoke Node.js metrics agent to an
autonomous EdgeX Foundry 4.x edge collection stack, per the "Container
autonome de collecte EdgeX (v4, MQTT)" epic.

The edge node:
- uses an internal MQTT message bus (Mosquitto, loopback) instead of Redis
- collects via device-modbus (Modbus TCP) and device-rest (REST/JSON)
- runs app-service-configurable to filter/tag/compress and export Events to
  the central MQTT broker, with native Store & Forward for offline buffering
- persists locally in embedded PostgreSQL (EdgeX 4.x default DB)
- optionally enables core-data for local generic persistence
- is fully statically configured (baked res/ files + env overrides), no Consul

Implementation:
- client/Dockerfile: all-in-one image lifting EdgeX 4.x binaries, orchestrated
  by supervisord; entrypoint bootstraps embedded Postgres
- client/res/: static config (device profiles/devices, app-service pipeline,
  internal broker), client/env/edge.env: per-site overrides
- Helm: client converted to a StatefulSet with per-replica persistence + a
  headless service; values reworked for the new model
- docker-compose, scripts, CI updated for the new client
- docs: architecture rewrite, ADR 0001, edge-operations guide

Note: the stack has not been booted end-to-end; pinned EdgeX paths, env-override
keys and the Postgres bootstrap need validation against the target release
(see ADR 0001 follow-ups).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01K7KSPTR8ntNPbb9DCZmKjm

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request replaces the bespoke Node.js metrics agent with an autonomous EdgeX Foundry 4.x edge stack packaged as a single Docker image (edgekit-client), featuring local Modbus/REST collection, an internal MQTT bus, and Store & Forward capabilities. The review identified several critical issues: potential container failures in entrypoint.sh due to unsafe directory ownership changes and unquoted heredocs, a hardcoded database path in supervisord.conf that ignores environment overrides, and YAML type mismatches in device definitions where numeric values must be strings. Additionally, the AddTags pipeline function incorrectly uses = instead of : as a separator across multiple configuration files, and the Helm StatefulSet needs to utilize the Kubernetes downward API to ensure unique site identities are generated for each replica.
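
The YAML type mismatch noted above comes down to numeric protocol values needing to be strings. A rough sketch of the normalisation, assuming a plain nested dictionary loaded from the device file; the helper name and the `modbus-tcp` / `Address` / `Port` / `UnitID` keys are illustrative, not taken from this PR's files:

```python
def stringify_protocol_properties(protocols: dict) -> dict:
    """Return a copy in which every protocol property value is a string."""
    return {
        name: {key: str(value) for key, value in props.items()}
        for name, props in protocols.items()
    }


fixed = stringify_protocol_properties(
    {"modbus-tcp": {"Address": "10.0.0.5", "Port": 502, "UnitID": 1}}
)
```

In the YAML files themselves the equivalent fix is simply to quote the numbers.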

Comment thread client/entrypoint.sh Outdated
Comment thread client/supervisord.conf Outdated
Comment thread client/res/device-modbus/devices/modbus.devices.yaml Outdated
Comment thread client/res/device-rest/devices/rest.devices.yaml Outdated
Comment thread client/res/app-service/configuration.yaml Outdated
Comment thread client/env/edge.env Outdated
Comment thread docker-compose.yml Outdated
Comment thread client/entrypoint.sh Outdated
Comment thread helm/edgekit/templates/client-statefulset.yaml Outdated
claude and others added 16 commits June 19, 2026 12:06
- entrypoint: chown only $PGDATA (not its parent) to avoid chowning / when
  PGDATA is a root-level path; harden the Postgres bootstrap with a quoted
  heredoc + psql -v parameters and quote_ident
- supervisord: use %(ENV_PGDATA)s so an overridden PGDATA is respected
- device-modbus/device-rest: quote numeric ProtocolProperties values
  (EdgeX ProtocolProperties is map[string]string)
- AddTags pipeline: use key:value separator (colon, not '=') in app-service
  config, edge.env and docker-compose
- helm: inject POD_NAME via the downward API and build a unique per-replica
  site tag (site:<siteId>-$(POD_NAME))

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01K7KSPTR8ntNPbb9DCZmKjm
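
To make the colon separator and the per-replica site tag concrete, here is a small sketch. It assumes tags are passed as a comma-separated list of `key:value` pairs; `build_site_tag` and `parse_tags` are hypothetical helpers, and the site and pod names are made up.

```python
def build_site_tag(site_id: str, pod_name: str) -> str:
    """Build the per-replica tag in the site:<siteId>-<podName> form."""
    return f"site:{site_id}-{pod_name}"


def parse_tags(spec: str) -> dict:
    """Parse 'key:value,key:value'; reject items that lack a colon."""
    tags = {}
    for item in filter(None, (part.strip() for part in spec.split(","))):
        key, sep, value = item.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"expected key:value, got {item!r}")
        tags[key.strip()] = value.strip()
    return tags


tag = build_site_tag("plant-a", "edgekit-client-0")
parsed = parse_tags(tag)
```

An item written with `=` (for example `site=plant-a`) has no colon and is rejected by this parser, which mirrors the class of mistake fixed in this commit.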
The release workflow now builds the EdgeX client with the same pinned
EDGEX_VERSION used in client/Dockerfile and ci.yml, so released images are
built against a known EdgeX release rather than the Dockerfile default.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01K7KSPTR8ntNPbb9DCZmKjm

Edge nodes are commonly arm64 (gateways, SBCs), so both the CI validation
builds and the released images now target amd64 and arm64. Adds QEMU setup
and a shared PLATFORMS env; release images are published as multi-arch
manifest lists.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01K7KSPTR8ntNPbb9DCZmKjm

Switch the edge node from file-only configuration (-cp=false -r=false) to the
EdgeX 4.0 configuration & registry provider, per
https://docs.edgexfoundry.org/4.0/microservices/configuration/ConfigurationAndRegistry/

- add core-keeper (provider, port 59890) and core-common-config-bootstrapper
  to the all-in-one image and supervisord orchestration
- add a shared common configuration (res/common/configuration.yaml) holding
  MessageBus, Database, Registry and the metadata client; the bootstrapper
  seeds it into Keeper on first start
- slim each service's private config to service-specific settings only; the
  res/ files now seed Keeper rather than being the sole runtime source
- run every service with -cp=keeper.http://localhost:59890 -r=true (Keeper
  itself runs without those flags); drop EDGEX_USE_REGISTRY=false
- device profiles/devices remain file-based (loaded into metadata/Postgres)
- docs: ADR 0002 (amends the no-registry decision in ADR 0001), architecture,
  edge-operations (sources + upgrade/re-seed procedure) and READMEs updated

This deliberately reverses Story #4's "no registry" goal in favour of central,
rebuild-free runtime config and a real service registry (see ADR 0002). Not
booted end-to-end; Keeper paths/flags/seeding need validation against the
pinned EDGEX_VERSION.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01K7KSPTR8ntNPbb9DCZmKjm
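
The flag rule in this commit (every service points at Keeper, Keeper itself does not) can be summarised in a few lines. This is a sketch of the rule only; `service_args` is a hypothetical helper and not how supervisord.conf is actually generated.

```python
KEEPER_URL = "keeper.http://localhost:59890"


def service_args(service: str) -> list:
    """Return the config-provider/registry flags for one service."""
    if service == "core-keeper":
        return []  # Keeper is the provider, so it runs without these flags
    return [f"-cp={KEEPER_URL}", "-r=true"]


metadata_args = service_args("core-metadata")
keeper_args = service_args("core-keeper")
```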
…henticate to Postgres

Core-keeper (and every other service) reads DB credentials from
Writable.InsecureSecrets.DB.SecretData, not DATABASE_USERNAME/DATABASE_PASSWORD
(those env names don't exist in EdgeX). The embedded Postgres bootstrap set the
"postgres" role's real password to POSTGRES_PASSWORD while every service kept
reading the upstream default ("postgres"/"postgres"), causing
"password authentication failed for user postgres" at startup. Override the
actual InsecureSecrets env keys instead so they match.

EdgeX's Postgres client defaults to database "edgex_db" (a Go constant,
overridable only via the EDGEX_DBNAME env var - not a configuration.yaml key).
We were creating a database named "edgex" and never set EDGEX_DBNAME, so every
service failed with "database \"edgex_db\" does not exist" right after the
Postgres credential fix. Default POSTGRES_DB to edgex_db and export
EDGEX_DBNAME from it so the created database always matches what's queried.
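
The intent of this fix, sketched in Python rather than the shell used by the entrypoint: default `POSTGRES_DB` to `edgex_db` and always derive `EDGEX_DBNAME` from it. The `resolve_db_env` helper is hypothetical.

```python
def resolve_db_env(environ: dict) -> dict:
    """Default POSTGRES_DB to edgex_db and export EDGEX_DBNAME from it."""
    db = environ.get("POSTGRES_DB") or "edgex_db"
    # Deriving EDGEX_DBNAME keeps the created and the queried database in sync.
    return {**environ, "POSTGRES_DB": db, "EDGEX_DBNAME": db}


defaults = resolve_db_env({})
custom = resolve_db_env({"POSTGRES_DB": "site_db"})
```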
…st init

The user/password/database setup only ran inside the one-time initdb guard, so
an existing data directory from an earlier run (created when POSTGRES_DB
defaulted to "edgex") never picked up the password fix or the edgex_db
database - every service kept failing with "database edgex_db does not exist".
Run the ALTER USER / CREATE DATABASE IF NOT EXISTS step on every start so it
self-heals regardless of what an existing volume was initialised with.
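
A sketch of what "run on every start" means in practice. PostgreSQL has no `CREATE DATABASE IF NOT EXISTS` clause, so the example checks the existing databases first; how the real entrypoint performs that check is not shown in this PR text. The function names are hypothetical and the quoting is simplified.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote an SQL string literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def bootstrap_statements(existing_dbs: set, user: str, password: str, dbname: str) -> list:
    """Statements that are safe to run on every start, not only after initdb."""
    statements = [
        f"ALTER USER {quote_ident(user)} WITH PASSWORD {quote_literal(password)};"
    ]
    if dbname not in existing_dbs:
        statements.append(
            f"CREATE DATABASE {quote_ident(dbname)} OWNER {quote_ident(user)};"
        )
    return statements


# A volume initialised when POSTGRES_DB still defaulted to "edgex".
first_start = bootstrap_statements({"postgres", "edgex"}, "postgres", "s3cret", "edgex_db")
# The next start finds edgex_db already present and only resets the password.
next_start = bootstrap_statements({"postgres", "edgex", "edgex_db"}, "postgres", "s3cret", "edgex_db")
```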
core-keeper's idempotent SQL bootstrap runs CREATE EXTENSION "uuid-ossp", but
that extension ships in postgresql16-contrib, which wasn't installed -
core-keeper kept failing to set up its schema. Requires an image rebuild to
take effect.

core-keeper's own private res/configuration.yaml carries InsecureSecrets.DB by
default, so it could connect to Postgres - but core-metadata, device-modbus,
device-rest and app-service get their DB credentials from the *common* config
pushed by core-common-config-bootstrapper, which had no InsecureSecrets
section at all. Those services were logging "InsecureSecrets missing from
configuration" and never reached the database. Add the same
Writable.InsecureSecrets.DB block (matching upstream's default) to the common
config so every non-core-keeper service picks it up, with the
WRITABLE_INSECURESECRETS_DB_SECRETDATA_PASSWORD override from entrypoint.sh
applying uniformly.

- Common config's device-services/app-services groups were missing the
  Writable section upstream provides; device-modbus/device-rest unconditionally
  list common config keys under Writable and got a 404 with it entirely
  absent.
- app-service's private configuration.yaml never set Service.Host/Port, so it
  registered with Keeper without service info ("Service information not set")
  and other services couldn't reach it.

…volume

Core Keeper's stored config and the embedded Postgres data share one volume,
and Keeper won't overwrite existing config without -o/--overwrite. After a
client res/ config change, the only way to see it locally is a clean volume.
Add a script to stop and remove just edge-pgdata (leaving the broker's
mosquitto-data alone) so this doesn't require remembering the volume name.

core-common-config-bootstrapper builds override env var names from the
full map path including the top-level group key (all-services/...), so
the plain WRITABLE_INSECURESECRETS_DB_SECRETDATA_PASSWORD override only
ever patched core-keeper's own private config. The common config pushed
to Keeper (read by core-metadata, device-modbus, device-rest, and
app-service) kept the literal "postgres" default, causing
SQLSTATE 28P01 auth failures once the Postgres role's real password
diverged from that default.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01K7KSPTR8ntNPbb9DCZmKjm
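
The naming difference is easier to see side by side. The sketch assumes the override name is the configuration path upper-cased with `/`, `.` and `-` replaced by `_`; that normalisation, like the `override_env_name` helper, is an assumption for illustration and should be checked against the pinned EdgeX release.

```python
import re


def override_env_name(path: str) -> str:
    """Derive an override env var name from a slash-separated config path."""
    return re.sub(r"[/.\-]", "_", path).upper()


# Private config: no top-level group key in the path.
private_name = override_env_name("Writable/InsecureSecrets/DB/SecretData/password")
# Common config: the path starts with the group key (all-services/...).
common_name = override_env_name("all-services/Writable/InsecureSecrets/DB/SecretData/password")
```

Under that assumption the two names differ only by the `ALL_SERVICES_` prefix, which is why the unprefixed variable never reached the common config.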
- app-service: FilterByDeviceName's parameter is "DeviceNames" (per
  app-functions-sdk-go), not "FilterValues" - the latter caused the
  pipeline to fail loading entirely. Empty DeviceNames + FilterOut=false
  still passes all events through, preserving the original no-op intent.
- device-modbus: the temperature profile's "scale" attribute was quoted
  as a YAML string ("0.1") but core-metadata requires a float64, causing
  the profile (and its dependent device modbus-th-01) to fail loading.
- device-rest/device-modbus: device-sdk-go's autodiscovery loop calls
  time.ParseDuration on Device.Discovery.Interval unconditionally, even
  when disabled; an absent/empty value logged a spurious error. Added
  the standard Discovery block (Enabled: false, Interval: "30s").

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01K7KSPTR8ntNPbb9DCZmKjm
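
The pass-through behaviour relied on in the first bullet can be written down as a small predicate. This models the semantics described above in Python; it is not the app-functions-sdk-go code, and `passes_device_filter` is a hypothetical name.

```python
def passes_device_filter(device_name: str, device_names: list, filter_out: bool = False) -> bool:
    """Return True when an event from device_name continues down the pipeline."""
    if not device_names:
        return True  # empty list: nothing to match against, pass everything
    listed = device_name in device_names
    # filter_out=True drops listed devices; filter_out=False keeps only listed ones.
    return not listed if filter_out else listed


no_op = passes_device_filter("modbus-th-01", [])
kept = passes_device_filter("modbus-th-01", ["modbus-th-01"])
dropped = passes_device_filter("modbus-th-01", ["modbus-th-01"], filter_out=True)
```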
device-sdk-go rejects any deviceCommand whose resourceOperations count
exceeds Device.MaxCmdOps, which defaulted to 0 because our slimmed
device-services common config dropped the upstream Device block entirely.
This caused every AutoEvent read to fail with "exceed MaxCmdOps (0)".
Restored the standard shared device-service defaults (MaxCmdOps: 128,
MaxCmdValueLen: 256, EnableAsyncReadings, AsyncBufferSize, DataTransform).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01K7KSPTR8ntNPbb9DCZmKjm
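
Why a default of 0 breaks every read: the check described above is a simple count comparison, so any command with at least one resource operation exceeds a limit of 0. A sketch with a hypothetical `check_command` helper and made-up resource names:

```python
def check_command(resource_operations: list, max_cmd_ops: int) -> None:
    """Reject a device command whose operation count exceeds the limit."""
    if len(resource_operations) > max_cmd_ops:
        raise ValueError(f"exceed MaxCmdOps ({max_cmd_ops})")


ops = ["Temperature", "Humidity"]
check_command(ops, 128)  # restored default: accepted

try:
    check_command(ops, 0)  # missing Device block: limit falls back to 0
    rejected = False
except ValueError as err:
    rejected = True
    message = str(err)
```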
device-rest-go looks up the device's protocol parameters under the key
"REST" (the RESTProtocol constant). The lowercase "rest" key meant the
driver found no protocol block and failed every AutoEvent read with
"no end device parameters defined in the protocol list". The field names
(Host/Port/Path) were already correct.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01K7KSPTR8ntNPbb9DCZmKjm
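
The failure mode is an ordinary case-sensitive map lookup. A minimal sketch, assuming the device's protocols are held in a dictionary keyed by protocol name; `rest_parameters` is a hypothetical helper and the host, port and path values are made up.

```python
REST_PROTOCOL = "REST"  # the key the driver looks up, per the commit message


def rest_parameters(protocols: dict) -> dict:
    """Return the REST protocol block, or fail the way the driver log did."""
    try:
        return protocols[REST_PROTOCOL]
    except KeyError:
        raise LookupError("no end device parameters defined in the protocol list") from None


block = {"Host": "127.0.0.1", "Port": "8080", "Path": "/api/sensor"}
found = rest_parameters({"REST": block})

try:
    rest_parameters({"rest": block})  # lowercase key is not found
    lowercase_ok = True
except LookupError:
    lowercase_ok = False
```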
2 participants