feat(gc): per-id INFO logs + orchestrator summary#54
Merged
Conversation
Issue #33's "output detail" requirement: `cocoon gc` was previously near-silent, only printing a single "GC completed". Now every collected item is logged at INFO under `gc.<module>` with a structured key=value payload, and the orchestrator ends each cycle with a per-module count + duration. Sample run: INFO gc.snapshot collected id=XEOU... name=ubuntu-hot-testing:v1 bytes=3221225472 last_accessed=2026-04-12T10:30:00Z reason=lru-age INFO gc.cloudhypervisor collected id=ABC123 runDir=/var/lib/cocoon/run/cloudhypervisor/ABC123 logDir=/var/log/cocoon/cloudhypervisor/ABC123 reason=orphan-runDir INFO gc.oci collected blob=b40150c1c2717d... reason=unreferenced INFO gc.cni collected id=JKL netns=cocoon-JKL nics=2 reason=orphan INFO gc.bridge collected id=MNO iface=btMNO-0 reason=orphan-tap INFO gc.Run completed: cloudhypervisor=1 cni=1 oci=4 snapshot=3 (failures: 0, duration: 230ms) Reasons surface why each id was selected: - snapshot: orphan / stale-pending / lru-all / lru-age / lru-keep / lru-size (multi-criterion joins with `+`, e.g. `lru-age+lru-size`) - cloudhypervisor/firecracker: orphan-runDir / orphan-logDir / stale-creating - images (oci, cloudimg): unreferenced - cni: orphan - bridge: orphan-tap Implementation: - pickLRU now returns `map[string]string` (id → reason); tests updated. - snapshot and hypervisor modules carry a `reasons` closure variable populated in Resolve, read in Collect — Module.Collect signature is unchanged so the framework stays untouched for the other modules. - gcCollect (hypervisor) is now private and takes the reasons map; no external callers existed. - GCCollectBlobs takes a `module` parameter so the per-blob log goes to the correct `gc.<oci|cloudimg>` namespace. - Orchestrator tracks {module → attempted-id-count} and emits a summary log via formatSummary; counts are attempts (not successes) to keep Collect signatures untouched. Failures count comes from the per-module err. - README documents the format + grep/awk examples. Compatible: no flag changes, no DB shape changes, no Module signature changes. The only visible difference is more log output (INFO level — appears by default).
- Hoist EvictionPolicy + hasCriteria above backfillSizeBytes (public declarations must precede private per file-organization rule). - Align backfillSizeBytes logger prefix to gc.snapshot (matches the other two log sites in this file; localfile.X is reserved for non-GC callers). - Extract logEvictRow shared by Collect (collected) and logWouldEvict (would-evict) — same field shape was duplicated; inlines formatTime into the single caller. - Trim three godocs of restating chatter.
…ontract Both snapshot/localfile and hypervisor maintained a closure-captured reasons map[string]string that Resolve wrote and Collect read — the same coupling smell duplicated in two places. Add snap to Collect's signature (gc.Module[S].Collect) so any map field populated by Resolve (reasons in this case, but generally any per-id metadata) is visible to Collect without the closure dance. - snapshot/localfile/gc.go: drop var (records, reasons) outer capture; reasons becomes a field of snapshotGCSnapshot, initialized in ReadDB, populated in Resolve, read in Collect via snap. - hypervisor/gc.go: drop var reasons closure; reasons becomes a field of VMGCSnapshot. - images/network bridge/cni: Collect signature gains a snap parameter; modules that don't need it use _ XSnapshot. - gc/module.go + runner.go + orchestrator.go: thread snap through the collect adapter via the snapshots map already kept for cross-module analysis. Net: same line count, removes duplicated closure-state smell, makes "per-id metadata Resolve→Collect" a first-class GC contract for any future module that needs it.
- gc/module.go: trim Resolve+Collect godoc narration that restated the signature contract. - gc/module.go: collect adapter type-asserts symmetrically with resolveTargets — return nil on mismatch instead of silently passing zero S. - hypervisor/gc.go: gcCollect now takes snap VMGCSnapshot (consistent with snapshot/localfile/gc.go which also reads snap.reasons inside Collect); reasons-map plumbing folded into the snap contract. - snapshot/localfile/gc_test.go: ReadDB called BEFORE chmod 0500 on parent so the test exercises the intended path (read-then-fail-on- remove) instead of accidentally depending on chmod 0500 still permitting parent dir reads.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements the "output detail" half of issue #33 (the LRU half landed in #53).
`cocoon gc` was effectively silent — only `INFO GC completed`. Operators couldn't tell what was actually reclaimed without inspecting filesystems. This PR adds:
Sample output
```
INFO gc.snapshot collected id=XEOU... name=ubuntu-hot-testing:v1 bytes=3221225472 last_accessed=2026-04-12T10:30:00Z reason=lru-age
INFO gc.snapshot collected id=2GQVEA... name= bytes=0 last_accessed=never reason=orphan
INFO gc.cloudhypervisor collected id=ABC123 runDir=/var/lib/cocoon/run/cloudhypervisor/ABC123 logDir=/var/log/cocoon/cloudhypervisor/ABC123 reason=orphan-runDir
INFO gc.oci collected blob=b40150c1c2717d... reason=unreferenced
INFO gc.cni collected id=JKL netns=cocoon-JKL nics=2 reason=orphan
INFO gc.bridge collected id=MNO iface=btMNO-0 reason=orphan-tap
INFO gc.Run completed: cloudhypervisor=1 cni=1 oci=4 snapshot=3 (failures: 0, duration: 230ms)
```
Reason vocabulary
Design notes
Files changed
```
README.md | 28 +++
cmd/others/handler.go | 7 +---
gc/orchestrator.go | 26 ++++
gc/orchestrator_test.go | 24 ++++ (NEW)
hypervisor/gc.go | 35 ++++-
images/gc.go | 2 +-
images/index.go | 14 +++
network/bridge/gc_linux.go | 4 +-
network/cni/gc.go | 7 ++++
snapshot/localfile/gc.go | 74 +++++++++---
snapshot/localfile/gc_test.go | 32 +++--
11 files changed, ~200 insertions
```
Verification
```
✓ go build ./... (zero warnings)
✓ make lint (0 issues × Linux + Darwin)
✓ go test ./... (all green, including new TestFormatSummary cases + updated pickLRU tests)
```
Test plan