Status: active
The storage subsystem is the system extension's data plane for stateful
workloads: it provisions backing volumes, mounts them onto NodeInstances over
the SDWAN overlay, owns the Unix-identity (chown) model for on-disk files, and
moves a stateful component's data from one volume to another without losing the
(deployment, role) binding. It sits one layer below the runtime modules in the
fleet substrate — a docker-engine or k3s-server module runs the workload;
the storage subsystem keeps that workload's persistent state where it belongs.
This document is the concept/architecture reference. For the operator procedure to run a migration end-to-end, see runbooks/storage-migration.md. For the broader substrate, see ARCHITECTURE.md.
Operator / AI agent
│ MCP (system_* storage actions) + REST (/api/v1/system/.../volumes, storage_*)
▼
Control plane (Rails 8)
├─ ProviderVolume / ProviderVolumeType / ProviderVolumeMember (backing storage)
├─ StorageAssignment (a mount of a storage onto a NodeInstance)
├─ StorageCredential / MountEncryptionKey (per-instance access + at-rest keys)
└─ StorageMigration (in-flight volume-to-volume data move)
│
│ System::Task rows (command: "storage.*") — pull-based task lease
▼
On-node powernode-agent (Go)
mounts / unmounts, writes exports.d + samba users, runs find/chown + rsync
The platform never pushes to a node. Every side effect — mount, unmount, NFS
export write, Samba user creation, chown, rsync — is enqueued as a
System::Task (app/models/system/task.rb) that the on-node agent leases over
its mTLS node_api channel and POSTs results back. The control-plane models
hold the desired state; the agent reconciles the node toward it.
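The pull model described above can be illustrated with a small in-memory sketch. It is not the real System::Task model: the `Task` struct, the `TaskQueue` class, and the status strings `pending`, `leased`, and `completed` are hypothetical names chosen for illustration, and the mTLS transport is omitted.

```ruby
# Hypothetical stand-in for System::Task rows; the real model is
# database-backed and its columns are not described in this document.
Task = Struct.new(:id, :node_id, :command, :payload, :status, :result, keyword_init: true)

class TaskQueue
  def initialize
    @tasks = []
    @next_id = 1
  end

  # Control-plane side: record the desired work. Nothing is sent to the node.
  def enqueue(node_id:, command:, payload: {})
    task = Task.new(id: @next_id, node_id: node_id, command: command,
                    payload: payload, status: "pending")
    @next_id += 1
    @tasks << task
    task
  end

  # Agent side: the node asks for its oldest pending task and leases it.
  def lease(node_id:)
    task = @tasks.find { |t| t.node_id == node_id && t.status == "pending" }
    task.status = "leased" if task
    task
  end

  # Agent side: report the outcome back to the control plane.
  def complete(task_id:, result:)
    task = @tasks.find { |t| t.id == task_id } or raise KeyError, "no task #{task_id}"
    task.status = "completed"
    task.result = result
    task
  end
end
```

The point of the sketch is the direction of control: the control plane only writes rows, and every state change on a node starts with the node asking for work.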
app/models/system/provider_volume.rb is the unit of backing storage. Key
shape:
- STATUSES = creating, available, in-use, deleting, deleted, error.
- belongs_to :volume_type (System::ProviderVolumeType), :provider_region, :availability_zone, and optionally :node_instance (the attached host).
- Attachment is modeled directly: attach_to!(instance, device_name) flips status to in-use and sets node_instance_id; detach! reverses it. Predicates can_attach? / can_detach? / can_delete? / can_snapshot? gate the transitions (e.g. can_delete? requires available || error and unattached).
- RAID is supported via ProviderVolumeMember (app/models/system/provider_volume_member.rb): RAID_LEVELS = [0, 1] (0 = striping, 1 = mirroring), with raid_capacity / active_member_count / has_minimum_members? helpers.
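The attachment gating above can be sketched in plain Ruby. Only the `can_delete?` rule (available or error, and unattached) is stated in this document; the conditions used for `can_attach?` and `can_detach?`, the `ArgumentError` raised on an illegal attach, and the `VolumeSketch` class itself are assumptions made for illustration.

```ruby
# Plain-Ruby sketch of the attach/detach gating; not the ActiveRecord model.
class VolumeSketch
  STATUSES = %w[creating available in-use deleting deleted error].freeze

  attr_reader :status, :node_instance_id, :device_name

  def initialize(status: "available")
    raise ArgumentError, "unknown status: #{status}" unless STATUSES.include?(status)
    @status = status
    @node_instance_id = nil
    @device_name = nil
  end

  def attached?
    !@node_instance_id.nil?
  end

  # Assumption: only an available, unattached volume may be attached.
  def can_attach?
    status == "available" && !attached?
  end

  # Assumption: only an in-use, attached volume may be detached.
  def can_detach?
    status == "in-use" && attached?
  end

  # From the text: status is available or error, and the volume is unattached.
  def can_delete?
    %w[available error].include?(status) && !attached?
  end

  def attach_to!(instance_id, device_name)
    raise ArgumentError, "cannot attach while #{status}" unless can_attach?
    @node_instance_id = instance_id
    @device_name = device_name
    @status = "in-use"
  end

  # Assumption: detaching returns the volume to available.
  def detach!
    raise ArgumentError, "cannot detach while #{status}" unless can_detach?
    @node_instance_id = nil
    @device_name = nil
    @status = "available"
  end
end
```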
app/models/system/provider_volume_type.rb carries
VOLUME_TYPES = gp2 gp3 io1 io2 st1 sc1 standard ssd hdd nfs iscsi smb custom
— AWS-EBS-derived tiers, generic ssd/hdd tiers, the network-filesystem
transports nfs/iscsi/smb, and custom as a catch-all. When the subsystem
needs a mount transport it reads volume_type.volume_type and maps it: the
network transports nfs/smb/iscsi mount as themselves; everything else
(EBS tiers, ssd/hdd, custom) mounts as a local block device (see
StorageMigration#promote_target_binding!).
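The transport mapping described above is small enough to state as a function. This is a sketch only: the helper name `mount_transport` and the `"block"` label for the local block device case are illustrative assumptions, not names taken from the model.

```ruby
# Volume types that mount as themselves, per the text above.
NETWORK_TRANSPORTS = %w[nfs smb iscsi].freeze

# Hypothetical helper: map a ProviderVolumeType#volume_type to a mount transport.
# Anything that is not a network transport (EBS tiers, ssd/hdd, custom) is
# treated as a local block device. The "block" label is an assumption.
def mount_transport(volume_type)
  NETWORK_TRANSPORTS.include?(volume_type) ? volume_type : "block"
end
```

For example, `mount_transport("gp3")` and `mount_transport("custom")` both fall through to the block case, while `mount_transport("nfs")` returns `"nfs"`.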
app/models/system/storage_assignment.rb is one mount of a storage onto one
NodeInstance at a mount_path. It carries:
- STATUSES = pending, provisioning, mounted, degraded, unmounting, failed, disabled.
- ENCRYPTION_MODES = inherit, none, fscrypt, luks, client_side_aes.
- The ownership model: OWNER_KINDS = service_user, operator, nobody, root. Non-service owners take static numeric IDs from BASELINE_UIDS (operator → 1000, nobody → 65534, root → 0); a service_user owner resolves to a platform-allocated UID in the 70000..99999 range.
- The chown state: CHOWN_STATES = complete, pending, running, failed, manual_required, plus chown_previous_uid/gid, chown_task_id, chown_started_at, chown_completed_at, chown_last_error. During an in-flight chown the assignment exposes effective_export_uid/gid (the previous owner) so NFS exports keep serving the old IDs and consumers don't take an EACCES storm while the agent rewrites ownership.
- Associations to StorageCredential and MountEncryptionKey (both dependent: :destroy).
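The owner-to-UID rule in the ownership model can be sketched as follows. The baseline IDs and the 70000..99999 range come from the text; the helper name `resolve_owner_uid`, the `allocated_uid:` parameter, and the choice of `ArgumentError` are assumptions, since the document does not show how the platform allocates or validates service UIDs.

```ruby
# Static IDs for non-service owners, as listed above.
BASELINE_UIDS = { "operator" => 1000, "nobody" => 65_534, "root" => 0 }.freeze

# Range reserved for platform-allocated service users, as listed above.
SERVICE_UID_RANGE = (70_000..99_999).freeze

# Hypothetical helper: resolve an owner kind to a numeric UID.
# allocated_uid stands in for whatever the platform allocator returned.
def resolve_owner_uid(owner_kind, allocated_uid: nil)
  return BASELINE_UIDS.fetch(owner_kind) if BASELINE_UIDS.key?(owner_kind)
  raise ArgumentError, "unknown owner kind: #{owner_kind}" unless owner_kind == "service_user"

  unless allocated_uid.is_a?(Integer) && SERVICE_UID_RANGE.cover?(allocated_uid)
    raise ArgumentError, "service_user UID must fall in #{SERVICE_UID_RANGE}"
  end
  allocated_uid
end
```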
- StorageCredential (app/models/system/storage_credential.rb): a per-instance credential (NFS export grant handle, Samba user, STS token, …), sealed in Vault, with an issued → active → rotating → revoked lifecycle and expiry/rotation predicates.
- MountEncryptionKey (app/models/system/mount_encryption_key.rb): the at-rest key for fscrypt/luks/client_side_aes mounts; escrowed, with material stored directly in Vault (never returned to the platform process).
app/models/system/storage_migration.rb tracks moving a stateful component's
data (e.g. /var/lib/postgresql) from one ProviderVolume to another while
preserving the (node_instance, role) binding. It is distinct from
System::Migration (cross-peer record transfer). Its state machine is detailed
below.
StorageMigration does not use AASM — it is a hand-rolled TRANSITIONS
hash validated by transition_to!. Statuses and rules below mirror the model
exactly.
STATUSES = planned approved preparing syncing verifying cutover
completed failed cancelled
TERMINAL = completed failed cancelled
- Forward path: planned → (approve) → approved → (prepare) → preparing → syncing → verifying → cutover → completed (terminal).
- Cancel path: planned, approved, or preparing → cancelled (terminal).
- Failure path: any non-terminal state → failed (terminal).
Forward transitions (TRANSITIONS map, exact):
| From | Allowed next |
|---|---|
| planned | approved, cancelled, failed |
| approved | preparing, cancelled, failed |
| preparing | syncing, cancelled, failed |
| syncing | verifying, failed |
| verifying | cutover, failed |
| cutover | completed, failed |
| completed | none (terminal) |
| failed | none (terminal) |
| cancelled | none (terminal) |
Rules:
- failed is reachable from every non-terminal state (it appears in each non-terminal's allowed list, and mark_failed!(reason:) is a shortcut that records the reason + failed_at from any non-terminal state).
- cancelled is reachable only from planned, approved, or preparing. cancel!(reason:, user:) raises ArgumentError once the sync has started (status syncing or later); cancellation is a pre-sync escape hatch only.
Key methods (all on StorageMigration):
- transition_to!(new_status, message:, details:): validates the target is a known status and a legal transition, appends an audit entry, stamps the matching timestamp (approved_at, started_at on first preparing, completed_at, failed_at, cancelled_at), and on → completed calls promote_target_binding!.
- mark_failed!(reason:): no-op if already terminal; otherwise records the reason in error_message + failed_at and audit.
- cancel!(reason:, user:): no-op if terminal; raises ArgumentError unless status ∈ {planned, approved, preparing}.
- report_progress!(bytes_copied:, bytes_total:, bytes_verified:, note:): updates the byte counters and appends an audit note (the operator-visible timeline); does not change status on its own.
- promote_target_binding!: on cutover → completed, swaps the instance's NodeInstance.config["storage_volume"] binding from source to target so post-restart agent boots and heartbeat fetches mount the new home. It is defensively wrapped: on any error it logs a warning and appends a promote_target_binding! warning: … audit entry rather than raising, which can leave a silent half-cutover (data at target, instance still bound to source) that the operator must reconcile. See the runbook's Failure modes.
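The transition table and the cancel/fail rules can be sketched as a hand-rolled hash-driven state machine in plain Ruby. The transition map is copied from the table above; the class name `MigrationSketch`, the in-memory audit log shape, and the use of `ArgumentError` for an illegal transition are assumptions (the document only states that `cancel!` raises `ArgumentError`). Timestamps, byte counters, and `promote_target_binding!` are left out.

```ruby
# Plain-Ruby sketch of a TRANSITIONS-hash state machine; not the real model.
class MigrationSketch
  TRANSITIONS = {
    "planned"   => %w[approved cancelled failed],
    "approved"  => %w[preparing cancelled failed],
    "preparing" => %w[syncing cancelled failed],
    "syncing"   => %w[verifying failed],
    "verifying" => %w[cutover failed],
    "cutover"   => %w[completed failed],
    "completed" => [],
    "failed"    => [],
    "cancelled" => []
  }.freeze
  TERMINAL    = %w[completed failed cancelled].freeze
  CANCELLABLE = %w[planned approved preparing].freeze

  attr_reader :status, :audit_log

  def initialize
    @status = "planned"
    @audit_log = []
  end

  def terminal?
    TERMINAL.include?(status)
  end

  # Validate the target, record an audit entry, then move.
  def transition_to!(new_status, message: nil)
    raise ArgumentError, "unknown status: #{new_status}" unless TRANSITIONS.key?(new_status)
    unless TRANSITIONS.fetch(status).include?(new_status)
      raise ArgumentError, "illegal transition: #{status} -> #{new_status}"
    end
    @audit_log << { from: status, to: new_status, message: message }
    @status = new_status
  end

  # Pre-sync escape hatch only: no-op when terminal, error once sync has started.
  def cancel!(reason:)
    return if terminal?
    raise ArgumentError, "cannot cancel once sync has started (#{status})" unless CANCELLABLE.include?(status)
    transition_to!("cancelled", message: reason)
  end

  # Shortcut to failed from any non-terminal state; no-op when terminal.
  def mark_failed!(reason:)
    return if terminal?
    transition_to!("failed", message: reason)
  end
end
```

Keeping the map as data makes the "exact" claim checkable: a test can compare the hash against the documented table row by row.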
The state advance happens server-side on operator/agent action; the actual
data copy (rsync) runs on the on-node Go agent, driven by the agent_contract
recipe in the migration plan (steps: mount_target, snapshot, rsync, verify, cutover, unmount_source).
The eight storage services all live in app/services/system/storage/. Each
one-paragraph summary below is grounded in the class's own top-of-file comment
and public methods.
AssignmentReconciliationService drives a StorageAssignment toward its target state. Triggered by the
assignment's after_commit, by an agent heartbeat reporting a missing mount,
and by the periodic StorageAssignmentDriftSensor
(app/services/system/fleet/sensors/storage_assignment_drift_sensor.rb). Per
assignment it: dispatches an unmount task if a mounted assignment is now
disabled; honors an exponential backoff (BACKOFF_BASE = 30s, capped at
30.minutes) encoded in error_message; ensures an Sdwan::Peer exists
(auto-enrolling via Sdwan::PeerEnroller); ensures a non-expired
StorageCredential (issuing/rotating via CredentialIssuer); ensures a
MountEncryptionKey when the effective encryption mode is not none; and
finally creates a storage.mount System::Task with a payload from
TaskPayloadBuilder.
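The backoff bound mentioned above (a 30 second base capped at 30 minutes) can be worked through numerically. The doubling factor and the `failure_count` input are assumptions: the document says only that the backoff is exponential and encoded in error_message, not how the attempt count is stored or what multiplier is used.

```ruby
BACKOFF_BASE = 30        # seconds, from the text
BACKOFF_CAP  = 30 * 60   # 30 minutes in seconds, from the text

# Hypothetical helper: delay before the next reconcile attempt.
# Assumes the delay doubles with each consecutive failure.
def backoff_seconds(failure_count)
  return 0 if failure_count <= 0
  [BACKOFF_BASE * (2**(failure_count - 1)), BACKOFF_CAP].min
end
```

Under the doubling assumption the delays run 30, 60, 120, 240, 480, 960 seconds and then hit the 1800 second cap from the seventh consecutive failure onward.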
ChownDispatchService routes a pending chown for a StorageAssignment to the correct node's agent,
which runs find -uid OLD -exec chown NEW {} + and POSTs completion to
/api/v1/system/worker_api/storage/chown_complete
(app/controllers/api/v1/system/worker_api/storage_chown_complete_controller.rb).
Storage-type routing: nfs/smb chown runs on the provider node hosting the
export (gateway or backend); ebs/local-block/fscrypt run on the consumer
node (the assignment's node_instance); object stores (s3/gcs/azure) are a
no-op marked complete inline (object ACLs are metadata, not file ownership);
external/unmanaged NFS/SMB (no platform-managed provider node) is marked
manual_required. Idempotent — re-dispatch while chown_state == "running" is a
no-op. Dispatch failures flip the assignment to chown_state = "failed" and
raise DispatchError.
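The storage-type routing above reduces to a small decision function. This sketch returns a plain hash describing the decision instead of enqueuing a task; the function name `chown_route`, its keyword arguments, and the `:action` symbols are illustrative assumptions, and the storage type strings are taken from the prose.

```ruby
# Hypothetical routing helper: decide where a chown should run.
# provider_node_id is nil when no platform-managed node hosts the export.
def chown_route(storage_type:, provider_node_id:, consumer_node_id:)
  case storage_type
  when "s3", "gcs", "azure"
    # Object ACLs are metadata, not file ownership: nothing to chown.
    { action: :complete_inline }
  when "nfs", "smb"
    if provider_node_id
      # The node hosting the export (gateway or backend) owns the files.
      { action: :dispatch, node_id: provider_node_id }
    else
      # External/unmanaged server: the platform cannot reach the files.
      { action: :manual_required }
    end
  else
    # ebs, local block, fscrypt: files live on the consumer node.
    { action: :dispatch, node_id: consumer_node_id }
  end
end
```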
CredentialIssuer issues, rotates, and revokes per-instance StorageCredentials. Flow: resolve
(or auto-enroll) the Sdwan::Peer; assemble a plain-hash context; call
storage_provider.issue_node_credential (pure data return); persist the
StorageCredential and seal the payload in Vault; then materialize the backend
side via NfsExportManager#grant! or SmbUserManager#provision_user! depending
on provider_type. rotate! issues a new credential then revokes the old;
revoke! tears down the backend grant/user and revokes the provider handle. It
deliberately re-fetches the credential via Model.find(id) (not reload) after
store_in_vault to dodge a known vault-credential cache reload bug.
GatewayProvisioningService applies to Shape 2 (gateway_proxy) only. It configures a gateway powernode to mount an
external NFS/SMB server and re-export it on its SDWAN interface, so SDWAN
clients mount the gateway (the trust boundary) rather than the upstream.
provision! / deprovision! validate the storage is gateway_proxy? (raising
ArgumentError otherwise) and enqueue storage.gateway.provision /
storage.gateway.deprovision tasks to the gateway node. V1 ships plaintext
gateway↔upstream traffic (operator must place the gateway on a trusted subnet);
TLS wrapping is a V2 item.
MountPathInferenceService maps a mount_path to an inferred assignment owner. It is used by the
owner-refactor backfill and by agent/operator surfaces wanting a sensible
default. A static, ordered INFERENCE_RULES table (more-specific patterns
first) maps well-known paths to owners — e.g. /var/lib/postgresql → service_user postgres, /var/www → www-data, /home/pnadmin → operator,
/var/log/audit → root, /tmp → nobody. The rules live in code (not config)
because the mapping is a human-encoded convention that must stay auditable. By
design it fails loud: infer(path) returns { kind: :unresolved } rather
than guessing a wrong owner, leaving the decision to the caller.
resolvable?(path) is the convenience predicate.
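An ordered, first-match rule table of the kind described above might look like the following sketch. The paths and the `{ kind: :unresolved }` fallback come from the text; the regular expressions, the hash shape for resolved owners, and the treatment of www-data as a service_user are assumptions, and the real INFERENCE_RULES table is likely longer.

```ruby
# Illustrative rule table: first match wins, so specific patterns go first.
INFERENCE_RULES = [
  [%r{\A/var/lib/postgresql(/|\z)}, { kind: :service_user, username: "postgres" }],
  [%r{\A/var/www(/|\z)},            { kind: :service_user, username: "www-data" }], # kind assumed
  [%r{\A/home/pnadmin(/|\z)},       { kind: :operator }],
  [%r{\A/var/log/audit(/|\z)},      { kind: :root }],
  [%r{\A/tmp(/|\z)},                { kind: :nobody }]
].freeze

# Returns the first matching owner, or an explicit :unresolved marker
# instead of guessing.
def infer(path)
  _pattern, owner = INFERENCE_RULES.find { |pattern, _owner| pattern.match?(path.to_s) }
  owner || { kind: :unresolved }
end

# Convenience predicate mirroring resolvable?(path).
def resolvable?(path)
  infer(path)[:kind] != :unresolved
end
```

Anchoring each pattern on a path boundary keeps a path such as /tmpfoo from matching the /tmp rule.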
NfsExportManager is the backend-side NFS export orchestrator. For Shape 1 (self_hosted) the backend
peer hosts the export directly; for Shape 2 (gateway_proxy) it is the gateway
re-exporting the upstream. grant! / revoke! enqueue storage.exports.apply
tasks for a single credential; reconcile! rewrites the whole exports file from
all enabled assignments (a rarely-invoked drift-recovery path). Per-storage
writes are serialized with a Postgres advisory lock
(pg_advisory_xact_lock, keyed on the storage UUID) so concurrent
CredentialIssuer runs can't race the exports.d file. Exports preserve
effective_export_uid/gid (the old owner during an in-flight chown).
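Because pg_advisory_xact_lock takes a 64-bit integer key, a UUID has to be reduced to one before it can be used as the lock key. The document does not say how NfsExportManager derives its key, so the hashing scheme below is purely an illustrative assumption, as is the helper name `advisory_lock_key`.

```ruby
require "digest"

# Hypothetical derivation: hash the UUID and read the first 8 bytes of the
# digest as a signed big-endian 64-bit integer, which fits a Postgres bigint.
def advisory_lock_key(storage_uuid)
  Digest::SHA256.digest(storage_uuid.to_s.downcase).unpack1("q>")
end

# Inside a database transaction the key would be used roughly like this
# (shown as a comment because it needs a live Postgres connection):
#   SELECT pg_advisory_xact_lock(<key>);
# The lock is released automatically when the transaction ends.
```

A transaction-scoped lock suits this use because a crashed writer cannot leave the exports file locked: rollback releases it.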
SmbUserManager is the backend-side per-instance Samba user provisioner. provision_user! /
deprovision_user! / rotate_user! enqueue storage.smb_user.apply tasks
(actions create / delete / set_password) to the backend node — the storage
backend (Shape 1) or the gateway running Samba (Shape 2). Credentials
(username/password) come from the sealed StorageCredential's
vault_credentials.
TaskPayloadBuilder composes the JSON task payloads the on-node agent receives via System::Task.
Builds mount / unmount / exports.apply / gateway.provision /
gateway.deprovision payloads. Mount recipes come from the provider layer's
FileManagement::Storage#node_mount_recipe(context:) (pure data — no extension
types leak into the platform provider layer); the builder layers on combined
mount options, the read-only flag, an encryption payload, the systemd unit name
(powernode-storage-<sanitized-path>.mount), and the WireGuard interface hint.
Object-storage recipes (s3fs/gcsfuse/rclone) skip the WireGuard
requirement (native egress); everything else rides SDWAN.
An ownership change on a StorageAssignment (via system_assign_storage_owner)
commits, then StorageAssignment#dispatch_chown_if_pending calls
ChownDispatchService.dispatch!. The service records chown_previous_uid/gid,
flips chown_state → running, and enqueues a storage.chown task to the node
that actually owns the files (provider node for NFS/SMB; consumer node for
block/fscrypt). While chown_state is in-flight, effective_export_uid/gid
returns the previous owner, so the NFS export (rewritten by
NfsExportManager/TaskPayloadBuilder) keeps serving the old IDs and avoids an
EACCES storm. The agent runs the recursive chown and POSTs to the
storage_chown_complete worker endpoint, which flips chown_state → complete
and clears the previous IDs. Failures land in chown_state = failed (or
manual_required for unreachable/external providers) and are surfaced + retried
via system_storage_chown_status / system_storage_chown_retry.
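The masking behaviour in this flow can be sketched with a small plain-Ruby object. It collapses the commit and dispatch steps into one method and tracks only the UID; the class name `AssignmentSketch`, the method names `change_owner!` and `chown_complete!`, and the choice of pending and running as the in-flight states are assumptions for illustration.

```ruby
# Plain-Ruby sketch of the chown lifecycle; not the ActiveRecord model.
class AssignmentSketch
  # Assumption: these two states count as "in flight".
  IN_FLIGHT = %w[pending running].freeze

  attr_reader :owner_uid, :chown_state, :chown_previous_uid

  def initialize(owner_uid:)
    @owner_uid = owner_uid
    @chown_state = "complete"
    @chown_previous_uid = nil
  end

  # Ownership change plus dispatch: remember the old owner, mark running.
  def change_owner!(new_uid)
    @chown_previous_uid = @owner_uid
    @owner_uid = new_uid
    @chown_state = "running"
  end

  # While the chown is in flight, exports keep serving the previous owner.
  def effective_export_uid
    in_flight = IN_FLIGHT.include?(chown_state) && !chown_previous_uid.nil?
    in_flight ? chown_previous_uid : owner_uid
  end

  # Agent reported success: switch exports to the new owner.
  def chown_complete!
    @chown_state = "complete"
    @chown_previous_uid = nil
  end
end
```

The explicit nil check matters because a previous owner of root has UID 0, which must still be served during the rewrite.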
Separately, AssignmentReconciliationService is the mount-level reconciler: it
re-mounts drifted/failed assignments, issues/rotates credentials, and unmounts
disabled ones, with exponential backoff on repeated failure.
These cooperate to bring a network mount up:
- MountPathInferenceService proposes the owner for a new mount (or the backfill resolves it).
- CredentialIssuer mints the per-instance StorageCredential (Vault-sealed) and calls the backend materializer.
- NfsExportManager (NFS) or SmbUserManager (SMB) writes the backend-side export entry / Samba user via a node task.
- For external upstreams behind a gateway, GatewayProvisioningService mounts the upstream on the gateway and re-exports it on the SDWAN interface.
- TaskPayloadBuilder builds the consumer-side storage.mount payload (recipe + options + encryption + WireGuard hint) and AssignmentReconciliationService dispatches it.
Reachability of an NFS upstream can be probed before recording a volume with
system_test_nfs_export (DNS + TCP 111/2049 + showmount -e; it never mounts).
All actions below are registered in the parent platform's tool registry and
dispatched by app/services/ai/tools/system_fleet_tool.rb (volumes + migrations
+ recommendations + NFS probe) and
app/services/ai/tools/system_storage_owner_tool.rb (ownership + chown).
Volume actions:

| Action | Purpose | Key params |
|---|---|---|
| system_list_volumes | List ProviderVolumes | status, transport, node_instance_id, unattached_only |
| system_get_volume | Full detail on one volume | id |
| system_create_volume | Register a ProviderVolume | name, size_gb, transport, nfs_server, nfs_export_path, … |
| system_update_volume | Update name/desc/size/status | id, … |
| system_delete_volume | Delete a volume (must be detached) | id |
| system_attach_volume | Attach to a NodeInstance | volume_id, node_instance_id, role |
| system_detach_volume | Detach from a NodeInstance | volume_id, node_instance_id |
| system_test_nfs_export | Probe an NFS server/export (no mount) | server, export_path |

Migration actions:

| Action | Purpose | Key params |
|---|---|---|
| system_migrate_storage_component | Create a planned StorageMigration + plan | node_instance_id, source_volume_id, target_volume_id, role |
| system_list_storage_migrations | List migrations (newest, cap 100) | status, node_instance_id, active_only |
| system_get_storage_migration | Fetch one (plan, bytes, audit log) | id |
| system_approve_storage_migration | planned → approved | id |
| system_cancel_storage_migration | Cancel pre-sync only | id, reason |
| system_report_storage_migration_progress | Advance phase + record bytes | id, status, bytes_copied, bytes_total, bytes_verified, note |

Ownership and chown actions:

| Action | Purpose | Key params |
|---|---|---|
| system_assign_storage_owner | Set assignment owner (triggers chown) | storage_assignment_id, owner_kind, service_user_username, shared_group_groupname |
| system_list_storage_assignments_by_owner | Audit ownership/chown across the fleet | owner_kind, service_user_username, node_instance_id, chown_state |
| system_storage_chown_status | Inspect chown state of one assignment | storage_assignment_id |
| system_storage_chown_retry | Re-dispatch a failed/manual chown | storage_assignment_id, force_complete |

Recommendation actions:

| Action | Purpose | Key params |
|---|---|---|
| system_get_storage_recommendations | Read role mount points + sizes | (none) |
| system_update_storage_recommendations | Partial-merge override | recommendations |
Permissions: ownership/chown actions gate on system.storage.read and
system.storage.assignments.update; volume actions on system.volumes.*;
migration + recommendations on system.platform.read / system.platform.scale.
For the curated MCP reference see MCP_API_REFERENCE.md.
Three storage operations change or destroy data-bearing state and warrant operator care. The end-to-end procedure + per-failure remediation lives in runbooks/storage-migration.md.
| Operation | Why it's dangerous | Guardrail |
|---|---|---|
| Migration cutover (cutover → completed) | Swaps the instance's storage_volume binding source→target; a failure in promote_target_binding! leaves a silent half-cutover (data at target, instance bound to source) | Defensive rescue + audit warning; operator must verify the binding post-cutover |
| chown (system_assign_storage_owner) | Recursive chown over an entire mount; wrong owner makes a service unable to read its own data | Loud :unresolved inference, effective_export_uid/gid masks the change until complete, failed/manual_required states + retry |
| system_delete_volume | Removes backing storage | Refuses while the volume is attached (can_delete? = available/error and unattached) |
- Models: app/models/system/{provider_volume,provider_volume_type,provider_volume_member,storage_assignment,storage_credential,mount_encryption_key,storage_migration}.rb
- Services: app/services/system/storage/ (the eight above) + app/services/system/platform/{storage_recommendations,storage_layout}.rb
- Sensor: app/services/system/fleet/sensors/storage_assignment_drift_sensor.rb
- MCP tools: app/services/ai/tools/{system_fleet_tool,system_storage_owner_tool}.rb
- Controllers: app/controllers/api/v1/system/{provider_volumes_controller,storage_assignments_controller,storage_credentials_controller}.rb, app/controllers/api/v1/system/platform/{volumes_controller,storage_migrations_controller}.rb, app/controllers/api/v1/system/node_api/{storage_volume_controller,storage_assignments_controller,storage_migrations_controller}.rb, app/controllers/api/v1/system/worker_api/{volumes_controller,storage_chown_complete_controller}.rb
- Runbook: runbooks/storage-migration.md
- Substrate overview: ARCHITECTURE.md · sensor reference: FLEET_SENSORS.md
Last verified: 2026-06-26