feat: generate operational runbooks in eject bundle #249
Add day-to-day-ops and disaster-recovery runbook templates that render with workload-specific data (cloud, region, sites, cluster name). Wired into eject.Run() so `ptd eject --dry-run` produces runbooks/ in the output directory.
Claude finished @timtalbot's task

Review of PR #249: Operational Runbooks in Eject Bundle
Overall this is solid work: clean templates, good cloud-conditional logic, and thorough test coverage (especially the no-pulumi and no-auto-apply guardrails). A few things worth addressing:

1. Either remove the field or wire it up so templates render the actual cluster name.

2. Azure resource group name skips sanitization:
   data.ResourceGroup = fmt.Sprintf("rsg-ptd-%s", targetName)
   The canonical implementation in

3. Non-deterministic site ordering. Iterating
   names := maps.Keys(cfg.Sites)
   slices.Sort(names)
   for _, name := range names {
       site := cfg.Sites[name]
       data.Sites = append(data.Sites, SiteData{Name: name, Domain: site.Spec.Domain})
   }

4. The step table covers

5. These exported functions are only called from tests. The actual code path uses

None of these are blocking. The runbook content and test coverage are well done.
Address code review feedback on #249:
- Populate ClusterName from config (AWS: default_{name}-{release}-control-plane, Azure: {sanitized-name}-{release}) and render it in kubeconfig commands
- Sort sites deterministically by name to produce stable output
- Apply Azure naming sanitization (lowercase, strip non-alphanumeric) to the resource group name, matching the lib/azure/target.go convention
- Remove false claims about S3/Azure blob versioning on state and data buckets; none of these have versioning enabled
- Remove false claims about Azure storage snapshots/soft-delete
- Keep accurate content: RDS 7-day backups, Azure PG default backups, FSx 30-day automatic backups
- Add "Prevention" notes suggesting customers enable versioning post-eject
- Remove the redundant --dry-run + apply pattern from both runbooks, since `ptd ensure` already shows a preview and prompts for confirmation
- Full rebuild uses bare `ptd ensure` instead of listing every step, which also avoids skipping custom steps
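The naming and ordering fixes from the review feedback can be sketched roughly as follows. This is a sketch under stated assumptions, not the code in lib/azure/target.go: the helper names (`sanitizeAzureName`, `clusterName`, `resourceGroup`, `sortedSiteNames`) are hypothetical, the sanitization is assumed to apply to the target name before it is interpolated, and the release value is a placeholder.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sanitizeAzureName lowercases the input and drops every character outside
// [a-z0-9]. Hypothetical helper; the real convention may differ in detail.
func sanitizeAzureName(name string) string {
	var sb strings.Builder
	for _, r := range strings.ToLower(name) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			sb.WriteRune(r)
		}
	}
	return sb.String()
}

// clusterName follows the two patterns quoted in the commit message:
// AWS: default_{name}-{release}-control-plane, Azure: {sanitized-name}-{release}.
func clusterName(cloud, name, release string) string {
	if cloud == "aws" {
		return fmt.Sprintf("default_%s-%s-control-plane", name, release)
	}
	return fmt.Sprintf("%s-%s", sanitizeAzureName(name), release)
}

// resourceGroup assumes only the target name portion is sanitized.
func resourceGroup(targetName string) string {
	return fmt.Sprintf("rsg-ptd-%s", sanitizeAzureName(targetName))
}

// sortedSiteNames returns map keys in ascending order so rendered output
// does not depend on Go's randomized map iteration order.
func sortedSiteNames(sites map[string]string) []string {
	names := make([]string, 0, len(sites))
	for name := range sites {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(clusterName("aws", "demo", "r1"))
	fmt.Println(clusterName("azure", "Demo-Prod", "r1"))
	fmt.Println(resourceGroup("Demo-Prod"))
	fmt.Println(sortedSiteNames(map[string]string{"b": "b.example.com", "a": "a.example.com"}))
}
```

Sorting the keys before rendering keeps repeated `ptd eject` runs byte-for-byte comparable, which makes diffs of the generated runbooks meaningful.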
Summary
- Add runbook templates (`day-to-day-ops.md` and `disaster-recovery.md`) populated with workload-specific data (cloud, region, sites, cluster name)
- Wired into `eject.Run()` so `ptd eject --dry-run` writes runbooks to the output directory

Closes #218
Test plan
- `go test ./eject/ -run Runbook`: 19 tests pass (both clouds, all sections, no `--auto-apply`, no raw Pulumi)
- `ptd eject <target> --dry-run` against a real workload: review generated runbooks for accuracy
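The "no `--auto-apply`, no raw Pulumi" guardrails could be checked with something along these lines. This is only a sketch: `forbiddenCommands` is a hypothetical helper, "raw Pulumi" is assumed to mean any line that invokes the `pulumi` CLI directly, and the sample command lines are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// forbiddenCommands scans rendered runbook text and returns one message per
// line that uses --auto-apply or invokes the pulumi CLI directly.
func forbiddenCommands(runbook string) []string {
	var violations []string
	for i, line := range strings.Split(runbook, "\n") {
		for _, field := range strings.Fields(line) {
			switch {
			case field == "--auto-apply":
				violations = append(violations, fmt.Sprintf("line %d: uses --auto-apply", i+1))
			case strings.EqualFold(field, "pulumi"):
				violations = append(violations, fmt.Sprintf("line %d: raw pulumi command", i+1))
			}
		}
	}
	return violations
}

func main() {
	// Hypothetical runbook content for demonstration only.
	sample := "ptd ensure demo\npulumi up\nptd ensure demo --auto-apply\n"
	for _, v := range forbiddenCommands(sample) {
		fmt.Println(v)
	}
}
```

A test would render each runbook for both clouds and assert that this function returns an empty slice.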