Adventure draft: 🧪 Blind by Design (OpenFeature + flagd) #40
Closed
aepfli wants to merge 27 commits into
A pharma/lab-themed OpenFeature adventure. Three levels covering wire,
target, and operationalize, framed as a clinical trial of a
vision-amplification serum that ends up with subjects emerging blind when
the new algorithm is rolled out at 100%.
🟢 Beginner — Stand up the lab
Wire OpenFeature SDK + flagd file-mode provider into a Spring Boot
app. Prove flags.json hot-reloads without a restart.
🟡 Intermediate — Dose by cohort
Add a Spring HandlerInterceptor for request-scoped language context,
a global eval context for the framework version, and a CustomHook
for per-evaluation audit logging.
🔴 Expert — Phase 3, read the chart
Replace file-mode flagd with a remote container, finish wiring
OpenTelemetry traces + metrics through to a Grafana LGTM stack,
identify the misbehaving fractional rollout (200ms slow + 10%
"subjects emerging blind" = HTTP 500), and roll it back via
flags.json without redeploying.
Story spine: a research lab is enhancing eyesight; the new amplifier
algorithm is causing 1-in-10 subjects to emerge blind, and the lab
itself can't see because the metric exporter is unwired. Light up the
dashboard, find the bad arm, halt enrolment.
Each level ships:
- Broken-state Spring Boot app (no SDK / partial SDK / mis-wired OTel)
- verify.sh sourcing lib/scripts/loader.sh, asserting outcomes only
(port reachable, flag value resolved, hot-reload works, eval-metrics
flowing in Prometheus, traces present in Tempo, 5xx below threshold)
- docs/<level>.md with How-to-Play and docs/solutions/<level>.md
- A per-level devcontainer under .devcontainer/00-side-effects-may-vary_<NN>-<level>/
scoped to only the tooling that level needs (Beginner: JDK only;
Intermediate: + DinD; Expert: + LGTM + loadgen ports)
Source content adapted from the polyglot Fun-With-Flags-Demo
(github.com/aepfli/Fun-With-Flags-Demo) java-spring variant.
Compiles on Java 21 across all three levels.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
7957b84 to f79efae
Replace the single-container + Docker-in-Docker shape with multi-container
devcontainers backed by docker-compose. flagd, the Grafana LGTM stack, and
the k6 loadgen now run as sibling services that come up at devcontainer
boot — no `docker compose up` step inside the workspace, no DinD.
Beginner is unchanged: FILE-mode flagd is in-process, no containers needed.
Intermediate
- DinD removed.
- Adds a flagd sibling that watches the participant's flags.json.
Primary path is still FILE mode (in-JVM); the sibling is there so
participants who finish early can flip the FlagdProvider to RPC and
reach `flagd:8013` immediately, without an Expert-only setup.
- FLAGD_HOST=flagd / FLAGD_PORT=8013 exported in the workspace env.
Expert
- DinD removed; level-folder docker-compose.yaml and
docker-compose.observability.yaml deleted.
- .devcontainer/00-side-effects-may-vary_03-expert/docker-compose.yml
now declares workspace + flagd + lgtm + loadgen as siblings on one
network. Workspace bind-mounts the repo; flagd watches expert/flags.json
in place; LGTM mounts expert/dashboards.
- workspace env: FLAGD_HOST=flagd, OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4317.
- loadgen reaches the workspace as `workspace:8080` instead of
`host.docker.internal:8080`.
- post-start.sh no longer brings anything up — the sibling services are
already running. It now just prints orientation.
- post-create.sh pre-warms the Maven dependency cache the same way
Intermediate does.
- verify.sh hints updated: "Reopen the Codespace if a sibling is missing"
instead of "docker compose up".
Codespaces still forwards every published port to localhost on the host,
so docs and `verify.sh` keep using localhost:NNNN unchanged.
Compiles clean on Java 21 across all three levels.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
…ling
The flagd sibling already exposes port 8015 (the sync stream). Spell out
that participants who finish the FILE-mode task can switch to either
Resolver.RPC (port 8013) or Resolver.IN_PROCESS (port 8015) against the
same flag definitions. IN_PROCESS gives best-of-both: definitions streamed
from a central source, evaluation stays in the JVM.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Java newcomers should not have to know to find DemoApplication and run
./mvnw spring-boot:run by hand. Each level now ships a checked-in
.vscode/ directory:
- launch.json → "🧪 Run the Lab" — F5 starts DemoApplication.
Spring Boot Dashboard (already in the devcontainer
extensions) auto-discovers the same main class.
- tasks.json → "Run the Lab" + "Verify Solution" tasks. The
verify task is the default test task, so
Tasks → Run Test runs ./verify.sh.
Documentation updates: docs/{beginner,intermediate,expert}.md now
mention pressing F5 alongside the existing ./mvnw spring-boot:run path,
so participants who prefer the IDE button do not need to read the
Maven docs.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
The previous commit shipped per-level .vscode/launch.json + tasks.json
to give Java newcomers an F5 entry point. The repo's root .gitignore
intentionally excludes .vscode/ — no other adventure overrides it,
and forcing the override in this PR would set a precedent the
maintainers haven't asked for.
Spring Boot Dashboard (vscjava.vscode-spring-boot-dashboard) is already
in the devcontainer's recommended extensions and auto-detects
DemoApplication on its own. Combined with the F5 fallback (the Java
debugger picks the main class without a launch.json), participants
unfamiliar with Java still get a one-click Run.
Doc updates (docs/{beginner,intermediate,expert}.md) point at the
Spring Boot Dashboard panel by name, mention F5 as the no-config
fallback, and keep ./mvnw spring-boot:run as the terminal path.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
…files
Two friction points reported on the Intermediate level:
1. The puzzle assumes the participant knows what a Spring
HandlerInterceptor, an OpenFeature transaction context, a global
evaluation context, and a Hook are. If you have not seen those before,
the code blocks are opaque. Add a "Concepts you'll touch" section to
docs/intermediate.md that explains each concept *briefly* (what it is,
why it exists, where it sits in the request lifecycle) without giving
the solution.
2. The Codespace lands in the IDE with only the docs open; the
participant has to go hunting for the file to edit.
Same treatment for Expert: adds a Concepts section covering OTel
TracerProvider vs MeterProvider, the OpenFeature Traces/MetricsHook pair,
the flagd fractional operation, and why a flag flip beats a redeploy when
a rollout misbehaves.
Devcontainer openFiles updated:
- Intermediate now also opens OpenFeatureConfig.java (the file most of
the work hangs off) and flags.json (so the targeting rules waiting on
context are visible from the start).
- Expert now also opens OpenTelemetryConfig.java (the half-wired file the
participant fixes), OpenFeatureConfig.java (where MetricsHook gets
registered), and flags.json (where the rollout gets rolled back).
Beginner already pre-opens IndexController.java alongside its docs;
unchanged.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
…roller
Two thematic adjustments to the Intermediate puzzle reported as confusing:
1. Spring Boot's "framework version" attribute does not fit the lab story.
Replace it with `country` from the COUNTRY env var — the trial's
country of registration, fixed for the lifetime of a lab instance,
read once at startup via System.getenv("COUNTRY") and put on the
global eval context. Keep the request-scoped attribute as `race`
(was "language"), read off the request — each subject brings their
own species; humans, zyklops, etc.
Targeting:
race == zyklop → enhanced (per-subject, query-param-driven)
country == de → sharp (per-instance, env-driven)
else → blurry (default)
2. Class names that fit the metaphor instead of Spring boilerplate:
DemoApplication → Laboratory (the @SpringBootApplication)
IndexController → Trial (the @RestController)
LanguageInterceptor → RaceInterceptor (Expert-only file rename;
Intermediate has the
participant create it from
scratch, also as
RaceInterceptor.java)
Files in flight:
- intermediate/flags.json + expert/flags.json: targeting now keys on
`race` and `country`, no more sem_ver(springVersion).
- intermediate/{verify.sh, docs/intermediate.md, docs/solutions/intermediate.md}:
full rewrite of the relevant prose + assertions.
- expert/{src/.../OpenFeatureConfig.java, RaceInterceptor.java (renamed),
Laboratory.java (renamed), Trial.java (renamed),
loadgen/k6/script.js, verify.sh, docs/expert.md,
docs/solutions/expert.md}: same refactor through Expert.
- intermediate/run-germany.sh + run-austria.sh: per-country convenience
starters (COUNTRY=de / COUNTRY=at, pipe to app.log so verify can grep).
- .devcontainer/00-side-effects-may-vary_02-intermediate/docker-compose.yml:
workspace exports COUNTRY=de by default, so a plain ./mvnw spring-boot:run
or F5 already exercises the country branch.
- .devcontainer/00-side-effects-may-vary_03-expert/docker-compose.yml:
workspace also exports COUNTRY=de.
- .devcontainer/.../devcontainer.json openFiles: point at Trial.java
(was IndexController.java).
- ASCII architecture diagrams in beginner.md + intermediate.md realigned
for the new (shorter) class names.
Compiles clean on Java 21 across all three levels.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
The repo's root .gitignore intentionally excludes .vscode/. No other
adventure overrides it, and shipping a checked-in directory would set a
precedent the maintainers haven't asked for. But the participant
unfamiliar with Spring Boot really does want a "Run" button in the IDE,
and ideally a way to switch the trial country without leaving it.
Solution: have post-create.sh write the launch + task configs at codespace
boot. The files only exist in the live codespace's filesystem (they're
gitignored at the repo root, so even after generation `git status` does
not see them). Participants get the buttons, the upstream repo stays clean.
Per level:
- Beginner: one launch config, 🧪 Run the Lab. One task, Verify Solution.
- Intermediate: three launch configs (🇩🇪 Germany / 🇦🇹 Austria /
🌍 No country, each with the right COUNTRY env var); tasks for the same
plus a Verify Solution. Switching trial country is a dropdown click in the
Run and Debug view, no `./run-germany.sh` step.
- Expert: one launch config, 🧪 Run the Phase 3 Lab (env vars come from
the docker-compose workspace service). One task, Verify Solution.
Each post-create heredoc is idempotent: it only writes the file if it does
not already exist, so a participant who customises their launch config
does not get blown away on the next codespace start.
docs/intermediate.md gets a short paragraph naming the three launch
configs and explaining where they come from, so a participant who sees
"Run the Lab — Austria" in the UI knows it is intentional and that the
file is regenerated by post-create rather than checked in.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
…gitignore
The post-create.sh approach (materialise .vscode/ at codespace boot)
worked but the configs are part of the scenario design, not editor
preference — they belong checked in next to the broken-state code.
The repo root .gitignore intentionally excludes .vscode/ globally so
that personal editor settings stay out. To ship the launch + task
configs as part of this scenario without touching the global rule,
the scenario's own .gitignore re-includes them:
!*/.vscode/
!*/.vscode/**
The negation is scoped to children of this adventure folder, so
nothing changes for the rest of the repo: any other .vscode/ anywhere
(including a participant's personal one outside the scenario) is
still ignored.
Each level ships a checked-in .vscode/launch.json + .vscode/tasks.json:
Beginner — single config 🧪 Run the Lab + 🧪 Verify Solution task.
Intermediate — three launch configs (🇩🇪 Germany / 🇦🇹 Austria /
🌍 No country) so country switching is a one-click
dropdown change in Run and Debug. Plus matching
shell-script tasks and a Verify Solution task.
Expert — single config 🧪 Run the Phase 3 Lab. Verify task.
post-create.sh files revert to their pre-config-generation shape —
the heredoc generation is gone, since the files are already on disk.
docs/intermediate.md updated: dropped the "post-create materialises"
note since the configs are now plain files in .vscode/.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
…d fallback)
Two fixes against the same observed gap (codespace launched in the web
client, files did not auto-open):
1. customizations.codespaces.openFiles is unreliable for
dockerComposeFile-based devcontainers. The Codespaces orchestrator merges
devcontainer.json into a runtime config (visible in the boot log as
`--override-config /root/.codespaces/shared/merged_devcontainer.json`) and
the openFiles field can be reshaped or dropped. Adventures 01 and 03 use
single-container `image:` devcontainers, where it works; ours uses
`dockerComposeFile + service: workspace`, where it does not reliably fire.
The keyed-in openFiles paths stay (no harm if the field does fire), but
post-start.sh now also calls `code <file>`: the same CLI the editor uses
internally, works in web client and Desktop, idempotent if openFiles
already opened the file.
- Beginner opens docs/beginner.md + Trial.java
- Intermediate opens docs/intermediate.md + OpenFeatureConfig.java +
flags.json
- Expert opens docs/expert.md + OpenTelemetryConfig.java +
OpenFeatureConfig.java + flags.json
2. The post-start banners had drifted. Beginner still said "dispenser"
from before the lab/Trial rename. Intermediate suggested `?language=de`
from before the race/country swap. Both rewritten to reference the current
state (lab, Trial, ?race=, COUNTRY env var, the 🧪 launch configs in
.vscode/, and the 🧪 Verify Solution task). Expert banner gets a similar
polish + adds the launch-config nudge.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
The Intermediate level was missing the third axis of OpenFeature's
context model — invocation context, the one the call site sets right
before it asks the client. Add it as a fourth required piece, framed as
"some clinical staff don't follow protocol":
- Trial.observeSubject randomly picks dose ∈ {standard (60%), underdose
(30%), overdose (10%)}, overridable via ?dose= for testing. The
controller passes dose as the invocation-context attribute on the
client.getStringDetails(...) call.
- flags.json targeting now reads (top-to-bottom):
race == zyklop -> enhanced (zyklop biology
survives bad
dosing)
dose ∈ {underdose, overdose} -> clouded (improper dose
for non-zyklops)
country == de -> sharp
default -> blurry
The "race wins over dose" priority is the punchline: only humans (and
any other non-zyklop) suffer from a tech mis-measuring the dose.
- intermediate/verify.sh asserts all four branches deterministically:
/?race=zyklop -> enhanced
/?dose=standard (COUNTRY=de) -> sharp
/?dose=underdose -> clouded (invocation context)
/?race=zyklop&dose=underdose -> enhanced (priority correct)
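The rule order above can be modelled without flagd at all. What follows is a minimal sketch in plain Java, assuming the merged context is a flat string map; `VisionTargeting` and `resolve` are hypothetical names used only for illustration, and the real evaluation happens inside flagd from flags.json.

```java
import java.util.Map;
import java.util.Set;

/** Stdlib-only model of the Intermediate targeting order; not the flagd evaluator. */
final class VisionTargeting {
    private static final Set<String> IMPROPER_DOSES = Set.of("underdose", "overdose");

    /** Walks the rules top to bottom and returns the first matching variant. */
    static String resolve(Map<String, String> ctx) {
        if ("zyklop".equals(ctx.get("race"))) {
            return "enhanced"; // checked first, so race wins over dose
        }
        String dose = ctx.get("dose");
        if (dose != null && IMPROPER_DOSES.contains(dose)) {
            return "clouded"; // improper dose for non-zyklops
        }
        if ("de".equals(ctx.get("country"))) {
            return "sharp";
        }
        return "blurry"; // default when no branch fires
    }

    public static void main(String[] args) {
        // The four cases verify.sh is described as asserting, with COUNTRY=de.
        System.out.println(resolve(Map.of("country", "de", "race", "zyklop")));
        System.out.println(resolve(Map.of("country", "de", "dose", "standard")));
        System.out.println(resolve(Map.of("country", "de", "dose", "underdose")));
        System.out.println(resolve(Map.of("country", "de", "race", "zyklop", "dose", "underdose")));
    }
}
```

Swapping the first two `if` blocks would turn the last case into `clouded`, which is exactly the ordering mistake the priority check is meant to catch.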
Expert reuses the dose attribute for OpenTelemetry correlation. Two
new tasks added to the Phase 3 challenge:
- Author a small ContextSpanHook (10-line Hook implementation) that
copies merged-eval-context attrs (race, country, dose) onto the
active OTel span as feature_flag.context.<key>. Register it next to
TracesHook and MetricsHook in OpenFeatureConfig.
- Verify Tempo correlation: searching tags=feature_flag.context.dose=
underdose returns spans, lining up with feature_flag.variant=clouded
on the same trace.
The Expert verify.sh now generates a deterministic underdose request,
waits for the OTel batch flush, and queries Tempo for spans tagged
with the context attribute. Missing-hook gives a precise hint.
Pedagogically this gives Expert two views of the OpenFeature hook
pattern: one consuming a library hook (MetricsHook) and one authoring
a small custom hook (ContextSpanHook) — the second shows hooks as a
generic extension point, not just a place to register vendor metrics.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Span and metric attributes flow into observability backends and are
retained for days. A naive copy-paste of the ContextSpanHook with
`for (String key : ec.asMap().keySet())` would push the OpenFeature
targetingKey (often a stable user id) — and any other context attribute
the host app has set, including email or account identifiers in real
apps — straight into Tempo and Prometheus. In several regulatory
regimes that is a notifiable breach.
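The difference between iterating the whole context and reading from an allowlist can be sketched in plain Java. This assumes the evaluation context is represented as a `Map<String, String>` (a stand-in for the SDK's context type); `SpanAttributeAllowlist` and `toSpanAttributes` are hypothetical names, and only the `feature_flag.context.<key>` prefix and the three keys come from the commits in this PR.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative allowlist filter; a stand-in for what a span-enriching hook would do. */
final class SpanAttributeAllowlist {
    // Fixed list of keys that are safe to export. Anything else is never read.
    static final List<String> ALLOWED = List.of("race", "country", "dose");

    /** Copies only allowlisted keys, prefixed the way the span attributes are named. */
    static Map<String, String> toSpanAttributes(Map<String, String> evalContext) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (String key : ALLOWED) {           // iterate the allowlist, not the context
            String value = evalContext.get(key);
            if (value != null) {
                attributes.put("feature_flag.context." + key, value);
            }
        }
        return attributes;
    }

    public static void main(String[] args) {
        Map<String, String> ctx = Map.of(
                "targetingKey", "user-4711",   // hypothetical stable user id
                "email", "subject@example.org", // hypothetical PII set by the host app
                "race", "human",
                "dose", "underdose");
        // targetingKey and email never reach the output.
        System.out.println(toSpanAttributes(ctx));
    }
}
```

Because the loop runs over `ALLOWED`, a new attribute added to the context later stays out of the telemetry until someone deliberately lists it.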
The reference solution already uses a fixed allowlist
(List.of("race", "country", "dose")), but the docs and the code TODO
did not call out *why*. Add the warning in three places where a
participant may meet it:
- expert.md "Authoring your own hook to enrich spans with context"
section: callout box explaining the rule + link to the OpenTelemetry
security & privacy guidance and the semantic-conventions attribute
requirement levels.
- solutions/expert.md "Three notes worth calling out": replaces the
prior two-bullet list, adds the allowlist rule explicitly with the
OTel security link.
- OpenFeatureConfig.java TODO comment for Phase 3 task #2: a short
⚠️ paragraph covering the same ground with the security URL, so
someone skipping the docs and reading the code still gets the
warning at the call site.
No code change to the broken state — the allowlist pattern is already
the documented solution. Compiles clean on Java 21.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
… of ContextSpanHook
The Intermediate-level CustomHook was a hello-world hook — Before /
After / Error / Finally lines with no payload. Pedagogically it
demonstrated the lifecycle but added no value. Replace it with a
real audit-log hook that does the lab director's actual job:
- Reads the merged evaluation context via HookContext.getCtx() and
logs an [AUDIT] line per evaluation with race / country / dose /
variant / reason.
- WARN when the resolved variant is "clouded" (improper-dose case)
so the safety officer can grep for follow-ups; INFO otherwise.
- WARN on errors with the flag key and exception.
A fixed allowlist (race, country, dose) — same PII discipline as the
Expert ContextSpanHook, just with weaker retention. Audit logs ship
to SIEMs and live a long time; iterating over the whole context is
the same kind of mistake. The OTel security & privacy guidance is
linked from the docs.
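The shape of such an audit line can be sketched without the SDK. This is a minimal illustration, assuming the merged context is a flat string map; the exact field layout, the `flag=` field, and the names `AuditLine`, `level`, and `format` are assumptions, and only the `[AUDIT]` prefix, the audited keys, and the WARN-on-clouded rule come from the description above.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative audit-line builder; the real hook would hand the result to a logger. */
final class AuditLine {
    // Same fixed allowlist as described above; the context is never iterated.
    static final List<String> AUDITED = List.of("race", "country", "dose");

    /** WARN for the improper-dose outcome so it can be grepped, INFO otherwise. */
    static String level(String variant) {
        return "clouded".equals(variant) ? "WARN" : "INFO";
    }

    /** Builds one line per evaluation; missing attributes are printed as "-". */
    static String format(String flagKey, Map<String, String> ctx, String variant, String reason) {
        StringJoiner line = new StringJoiner(" ", "[AUDIT] ", "");
        line.add("flag=" + flagKey);
        for (String key : AUDITED) {
            line.add(key + "=" + ctx.getOrDefault(key, "-"));
        }
        line.add("variant=" + variant);
        line.add("reason=" + reason);
        return line.toString();
    }

    public static void main(String[] args) {
        Map<String, String> ctx = Map.of("race", "human", "dose", "underdose");
        String variant = "clouded";
        System.out.println(level(variant) + " " + format("vision_state", ctx, variant, "TARGETING_MATCH"));
    }
}
```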
Expert keeps CustomHook AND adds ContextSpanHook on top:
- CustomHook -> durable text audit log (safety officer's tool,
useful weeks later for forensic follow-up)
- ContextSpanHook -> real-time span enrichment in Tempo (on-call's
tool, correlation alongside feature_flag.variant)
Both serve the same data through different downstreams. The Expert
docs make this layered story explicit so the participant understands
why both stay registered.
Files:
- docs/solutions/intermediate.md: full audit-style CustomHook source +
PII rationale; replaces the Before/After hello-world.
- docs/intermediate.md: How-to-Play step 3c rewritten to ask for the
audit shape (with the allowlist call-out); Concepts section now
explains that hooks are valuable when they read the merged context,
not just when they log "got here".
- expert/.../CustomHook.java: broken-state file aligned with the
Intermediate solution shape (audit log + AUDITED allowlist).
- docs/expert.md: ContextSpanHook section reframed — both hooks stay
registered, they cover different downstreams.
- intermediate/verify.sh: grep accepts AUDIT|Before hook|After hook
so either implementation passes (older simple hooks still work).
Compiles clean on Java 21.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
CustomHook was generic; AuditHook is what the docs already called it
("audit hook", "audit log") and lines up with the OpenFeature contrib
naming (TracesHook, MetricsHook, ContextSpanHook).
Renamed:
- expert/.../CustomHook.java -> expert/.../AuditHook.java (file + class)
- expert/.../OpenFeatureConfig.java: addHooks(new AuditHook())
- docs/intermediate.md, docs/expert.md, docs/solutions/{intermediate,expert}.md
- intermediate/verify.sh: log-line and hint references
- intermediate solution prose: "A AuditHook" -> "An AuditHook"
The Intermediate level still has the participant *create* the file from
scratch; only the suggested filename in the docs changes.
Compiles clean on Java 21.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Leftover from the CustomHook -> AuditHook rename in 6e6acf5; the OBJECTIVE
string in intermediate/verify.sh wasn't covered by the docs-only sed sweep.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
The Beginner level was running flagd in FILE mode in-process — pedagogically
cleaner at the time, but it meant the participant met "flagd as a sibling
service" only at Intermediate, after they'd already built a mental model on a
mode they'd then have to throw away. The new shape keeps the lab and a flagd
container side-by-side from level 1 and makes RPC the answer everywhere.
What's in:
- New docker-compose.yml for the Beginner devcontainer with workspace +
flagd siblings (mirrors Intermediate). FLAGD_HOST=flagd is exported into
the workspace shell so a default Resolver.RPC config picks the sidecar
up automatically.
- devcontainer.json switches from `image:` to dockerComposeFile + `service:
workspace`, forwards 8013–8016, pre-opens flags.json alongside Trial.java
and the doc.
- A seed flags.json (`{"flags": {}}`) at beginner/ so the flagd container
has a valid file to mount at boot — the participant adds the
`vision_state` flag during the level.
- docs/beginner.md rewritten: new architecture diagram (lab → flagd:8013
→ flags.json on disk), new "what you'll learn" beat about remote
providers vs in-process, RPC instructions in step b, "open the existing
flags.json and add the vision_state flag" in step c.
- docs/solutions/beginner.md: OpenFeatureConfig switches to Resolver.RPC
with FLAGD_HOST/FLAGD_PORT picked up from the env. New sidebar comparing
RPC, IN_PROCESS (flagged honestly as the most common shape in real
production deployments), and FILE — with a forward reference to the
Intermediate IN_PROCESS sidebar.
- verify.sh hint copy updated for RPC-against-sidecar; the hot-reload
mechanism note now points at flagd's file watcher (read-only mount of
the workspace) rather than the SDK's.
- post-start.sh banner enumerates the four flagd ports the participant
may meet during the adventure.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
The port labels and a couple of runtime URLs were stuck on an older flagd
port layout (where 8014 was the HTTP eval gateway). Current flagd defaults:
- 8013: gRPC eval (and HTTP/JSON via gRPC-Gateway, multiplexed via cmux)
- 8014: management (Prometheus /metrics + /healthz, /readyz)
- 8015: sync stream (gRPC, used by Resolver.IN_PROCESS providers)
- 8016: OFREP HTTP eval (vendor-neutral standard)
What changed:
- Port labels in all three devcontainer.json portsAttributes blocks now
match: gRPC eval / management/metrics / sync (IN_PROCESS) / OFREP.
- post-start.sh banners enumerate the same four ports correctly.
- expert/docs/expert.md architecture diagram + the per-port reference
section explain each port's actual role, and the curl example uses :8013
(gRPC-Gateway) instead of :8014 (which only serves /metrics).
The runtime breakage that was hiding behind the wrong labels:
- expert/verify.sh hit
`http://localhost:8014/flagd.evaluation.v1.Service/ResolveBoolean` to
verify flagd reachability. That path is on the gRPC eval port (8013), not
the management port. Fixed.
- expert docker-compose.yml exported FLAGD_URL=http://flagd:8014 to the k6
loadgen, which polls `loadgen_active` over the same gRPC-Gateway path.
Same correction.
- expert/loadgen/k6/script.js default FLAGD_URL bumped to :8013 with a
comment explaining the cmux multiplexing.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
The Intermediate solution had three real holes:
- The Trial controller update was missing entirely. The level's whole
point is the third evaluation-context layer (invocation context: dose
passed at the call site), but the solution doc didn't show how to modify
Trial.java at all. Added a new Step 5 with the full controller diff
(accept ?dose=, sample one when missing, build the invocation
ImmutableContext, pass it to client.getStringDetails) and a short note on
why dose lives at the call site rather than in a filter or a
@PostConstruct.
- OpenFeatureConfig.java in the solution showed Resolver.FILE +
offlineFlagSourcePath. The broken state already uses Resolver.RPC against
the flagd sidecar, so the FILE-mode shape was wrong end-to-end. Fixed to
RPC, dropped the stale offlineFlagSourcePath line, and documented why (RPC
ignores offlineFlagSourcePath; the flagd sibling reads flags.json itself).
- The verify section pattern-matched "Before hook|After hook" in app.log,
but AuditHook prefixes its lines with [AUDIT]. Corrected the grep, and
expanded the curl examples to cover ?dose=underdose and the race+dose
precedence case (which verify.sh actually checks).
Beginner: minor. The solution showed method `helloWorld()`, while the
broken state has `observeSubject()`. Made the names match so the
participant doesn't have to either rename their broken-state method or
accept a name change as part of the diff.
Expert: the Step 1 objective recap was stale (predates the ContextSpanHook
task). Added the two missing bullets (ContextSpanHook + spans tagged with
feature_flag.context.dose=underdose) and corrected "All seven checks" →
"All eight checks" with the per-check breakdown.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
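The "sample one when missing" step and the layering of the three context levels can be sketched in plain Java. This is a stdlib-only illustration, assuming contexts are flat string maps and that later layers override earlier ones (global, then request-scoped, then invocation); `InvocationContextSketch`, `pickDose`, and `merge` are hypothetical names, and the 60/30/10 split is the one described for Trial.observeSubject earlier in this PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

/** Illustrative model of dose sampling plus context layering; not the SDK's merge code. */
final class InvocationContextSketch {

    /** Returns the ?dose= override when present, otherwise maps a roll in [0,1) onto 60/30/10. */
    static String pickDose(String override, double roll) {
        if (override != null && !override.isBlank()) {
            return override;
        }
        if (roll < 0.60) {
            return "standard";
        }
        if (roll < 0.90) {
            return "underdose";
        }
        return "overdose";
    }

    /** Merges layers left to right, so the last layer wins on key collisions. */
    @SafeVarargs
    static Map<String, String> merge(Map<String, String>... layers) {
        Map<String, String> merged = new HashMap<>();
        for (Map<String, String> layer : layers) {
            merged.putAll(layer);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> global = Map.of("country", "de");      // set once at startup
        Map<String, String> requestScoped = Map.of("race", "human"); // set per request
        String dose = pickDose(null, ThreadLocalRandom.current().nextDouble());
        Map<String, String> invocation = Map.of("dose", dose);      // set at the call site
        System.out.println(merge(global, requestScoped, invocation));
    }
}
```

Taking the roll as a parameter keeps the sampler deterministic under test, which is the same reason the `?dose=` override exists.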
The story slipped into switchboard language in places: "the lab doses the
right formulation per cohort", "the dispenser hands every one of them the
same default formulation", "the dose that's about to be administered".
That reads as if the lab is varying the trial design per subject, when the
actual story is the opposite: the protocol is fixed, the targeting is a
model of how the same trial yields different *observed outcomes* for
different subjects (different biology, dose adherence, jurisdictional
baseline).
Light language pass to land the framing consistently:
- Intermediate retitled "Dose by cohort" → "Outcome by cohort" everywhere
it appears (docs, devcontainer name, post-start banner, verify.sh
OBJECTIVE, Beginner cross-link, index.md level picker).
- The Intermediate "what's invocation context for" passage now frames
`dose` as the dose the subject *actually absorbed* (observational: missed
appointments, fast metabolisers, the usual reasons), not the dose about to
be administered.
- The mission line in index.md / README.md pivots from "stand up the lab,
dose subjects by cohort" → "stand up the lab, read the chart by cohort".
Added a short paragraph in index.md framing the trial as fixed and the
outcome as observed, as the explicit anchor for the story.
- Beginner doc: "let the formulation in flags.json decide what gets
recorded" → "let flags.json drive what gets recorded"; "the dosing
protocol" subhead → "the chart system"; the next-subject hot-reload line
drops "receives the new dose" for "has the new reading".
- Expert OpenFeatureConfig TODO: "what the dispenser handed out / what the
dispenser knew at the time" → "what the lab recorded / what the chart knew
at the time".
- Devcontainer name: also added the missing 🧪 emoji prefix to the
Intermediate name to match Beginner / Expert.
Out of scope: the original ideas/side-effects-may-vary.md pitch is left
untouched. That's the historical record from PR off-on-dev#38, not
learner-facing.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Reviewer pointed out the loaded-term-and-collision problem: the prose
already said "species" everywhere ("each subject brings their own
species — humans, zyklops, ..."), but the query parameter, the
evaluation context key, the targeting rule, and the interceptor class
all said race. Two issues at once: race carries baggage in a tutorial
context, and it collides with "race condition" in any Java reader's
head.
This commit lands the rename end-to-end:
- Class file: RaceInterceptor.java -> SpeciesInterceptor.java (git mv,
with the class name, javadoc, and local variable updated to match).
- OpenFeatureConfig: registers `new SpeciesInterceptor()`; the
ContextSpanHook TODO comment lists species/country/dose.
- AuditHook AUDITED allowlist: race -> species.
- Flag definitions in intermediate/flags.json and expert/flags.json:
`{"var": "race"}` -> `{"var": "species"}`.
- verify.sh (intermediate + expert): query strings, hint copy, and the
two FAILED_CHECKS tag names (race_targeting -> species_targeting,
priority_race_over_dose -> priority_species_over_dose).
- k6 loadgen: RACES -> SPECIES, race query param -> species, k6 tag.
- Docs (expert.md, solutions/expert.md, solutions/intermediate.md):
prose, code blocks, curl examples.
- Idea (ideas/side-effects-may-vary.md): same sweep.
- Banner: intermediate post-start.sh curl example.
Also added the missing 🧪 emoji prefix to the Intermediate devcontainer
name field so it lines up with Beginner / Expert. (Carried over from a
prior pass; the surrounding diff shows it.)
The substitutions used were `\bRace\b -> Species`, `\brace\b -> species`,
`\bRACES\b -> SPECIES`, `RaceInterceptor -> SpeciesInterceptor`. Whole-
word boundaries kept TracesHook, TracerProvider, traces, tracker, etc.
untouched. Two compound names (race_targeting, priority_race_over_dose)
weren't covered by the boundary regex and were updated by hand.
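The boundary behaviour described above can be reproduced with `java.util.regex`. This is a sketch under the assumption that Java's `\b` treats `_` as a word character the same way the original sweep tool did; the commit does not say which tool was used, and `RenameSweep` is a hypothetical name.

```java
/** Illustrative replay of the four substitutions using Java regex semantics. */
final class RenameSweep {

    /** Applies the whole-word substitutions, then the explicit class-name rename. */
    static String sweep(String text) {
        return text
                .replaceAll("\\bRace\\b", "Species")
                .replaceAll("\\brace\\b", "species")
                .replaceAll("\\bRACES\\b", "SPECIES")
                .replaceAll("RaceInterceptor", "SpeciesInterceptor");
    }

    public static void main(String[] args) {
        System.out.println(sweep("/?race=zyklop"));          // whole word, gets renamed
        System.out.println(sweep("TracesHook traces"));      // no word boundary inside, untouched
        System.out.println(sweep("race_targeting"));         // '_' is a word char, untouched
        System.out.println(sweep("new RaceInterceptor()"));  // caught by the explicit rule
    }
}
```

The third case is why the two compound tag names needed a manual edit: there is no word boundary between `race` and the underscore that follows it.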
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
(1) The architecture diagram had "FlagdProvider (FILE mode)" and the
prose underneath said "flagd is not running as a container yet". Both
are stale — broken-state OpenFeatureConfig is Resolver.RPC against the
flagd sibling, and has been since the levels were unified on the
sidecar shape. Diagram now shows the flagd sibling as a separate box
on gRPC :8013, and the prose matches.
(2) The diagram's invocation-context label read `dose=random/?dose=`,
which the reader couldn't parse on first scan. Now reads `dose ←
computed at call site, overridable with ?dose=` — verbose, but
unambiguous.
(3) The `flags.json` snippet in "Inspect the Starting Point" showed
two branches (species → country) but the concepts section showed
three (species → improper-dose → country). The starting `flags.json`
genuinely has all three; the snippet was lagging the rule. Updated to
match `intermediate/flags.json` verbatim, including the `dose ∈
{underdose, overdose}` arm.
(4) "Verify Each Cohort by Hand" tested ?species=zyklop and the country
branch but never the invocation-context (?dose=) cases that the
objective list explicitly promised. Added the two missing curls
(?dose=underdose for the invocation branch, ?species=zyklop&dose=
underdose for the precedence case), pinned the country curl with
?dose=standard so the random sampler can't trip improper-dose, and
stated Austria's expected output ("blurry — no targeting branch
fires, default applies") so the reader can tell whether they've
solved it or broken it. The tail-the-log grep also moved from "Before
hook|After hook" to '\[AUDIT\]' to match what AuditHook actually
writes.
(5) The app.log requirement was buried in step 4 — somebody who runs
./mvnw spring-boot:run directly (which the COUNTRY=de devcontainer
default encourages) would have a passing app and a failing verifier
with no obvious cause. Pulled the requirement up into a callout right
after the objective list.
(6) The audit-log PII discipline note in step 3c was substantively right
but a 110-word paragraph in the middle of an implementation step.
Restructured to lead with the rule (one short imperative sentence:
"use a fixed allowlist, never iterate the eval context") and tucked
the "why" — SIEM retention, redaction difficulty, OTel link — under
that.
Plus a small parallel improvement to the toolbox sidebar: now that the
flagd sibling is the canonical answer at every level, the IN_PROCESS
sidebar is positioned as the production-recommended shape (sync stream
on :8015, no per-call hop) rather than a future-tense bridge to Expert.
FILE mode kept as a unit-test option.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
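The allowlist rule from item (6) above ("use a fixed allowlist, never iterate the eval context") can be sketched in plain Java. This is a minimal illustration, not the adventure's AuditHook: a `Map<String, String>` stands in for the OpenFeature evaluation context, and the class name, the `AUDITED` keys, and the line layout are assumptions made for the example. Only the `[AUDIT]` prefix is taken from the text above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative only: a Map stands in for the merged evaluation context. */
final class AuditLineSketch {

    // Fixed allowlist (assumed keys). Anything not named here is never logged.
    static final List<String> AUDITED = List.of("species", "country", "dose");

    /** Builds one audit line by walking the allowlist, never the context itself. */
    static String auditLine(String flagKey, String variant, Map<String, String> evalContext) {
        StringJoiner attrs = new StringJoiner(" ");
        for (String key : AUDITED) {
            String value = evalContext.get(key);
            if (value != null) {
                attrs.add(key + "=" + value);
            }
        }
        String line = "[AUDIT] flag=" + flagKey + " variant=" + variant;
        return attrs.length() == 0 ? line : line + " " + attrs;
    }

    public static void main(String[] args) {
        Map<String, String> ctx = new LinkedHashMap<>();
        ctx.put("species", "zyklop");
        ctx.put("dose", "underdose");
        ctx.put("email", "subject@example.org"); // PII: in the context, never in the log line
        System.out.println(auditLine("vision_state", "enhanced", ctx));
    }
}
```

Because the loop walks `AUDITED` rather than the context, a new attribute added upstream cannot reach the log unless someone adds it to the allowlist on purpose.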
Two reviewer-flagged blockers in the Intermediate + Expert post-create
scripts:
- jq is required by both verify.sh scripts (parsing the JSON evaluation
details from the lab and the JSON from Prometheus / Tempo / flagd's
HTTP gateway), but only the Beginner post-create installed it. The
Java devcontainer image is Debian-based but ships without jq.
Mirrored the Beginner block (`apt-get update && apt-get install -y
--no-install-recommends jq`, guarded by `command -v jq`) into both
other scripts.
- set_tracking_context was being called with the unprefixed name
"side-effects-may-vary" in Intermediate + Expert, while Beginner (and
every other adventure) uses the "00-side-effects-may-vary" form.
Telemetry was splitting between the two identifiers. Aligned both to
the prefixed form.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
…lloWorld)
The Intermediate broken state shipped with `helloWorld()` as the GET /
handler, a Spring Initializr name that survived from an earlier draft
and never matched the level's vocabulary or the Beginner solved state
(`observeSubject()`). The Intermediate solution doc shows the final
method as `observeSubject(@RequestParam String dose)`, so a participant
copying the solution silently changed the method name on top of the
real wiring change. Reviewer flagged this as a confusing "signature
change disguised as an update". Fair.
Renamed the broken-state method to `observeSubject()`. Return type,
body, imports, mapping all unchanged. The Intermediate solution diff is
now a true body update (add `@RequestParam` for `?dose=`, build the
invocation ImmutableContext, pass it to `client.getStringDetails(...)`)
with no rename.
While in this neighbourhood, dropped the stale
`offlineFlagSourcePath("./flags.json")` line from the broken-state
OpenFeatureConfig. With `Resolver.RPC` the flagd contrib provider
ignores it; leaving it in misled learners reading the broken state.
Did not add hooks/interceptor/global-eval-context; those are still the
participant's job.
Compile clean (Java 21).
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
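The invocation context built in `observeSubject(...)` is the last of three layers (global, transaction, invocation), and later layers override earlier ones when the contexts are merged. A plain-Java sketch of that merge order follows, with maps standing in for OpenFeature evaluation contexts; the class name, the `merge` helper, and the sample values are illustrative assumptions, not code from the adventure.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: maps stand in for OpenFeature evaluation contexts. */
final class ContextLayersSketch {

    /** Merges layers in order; a key set by a later layer replaces an earlier one. */
    @SafeVarargs
    static Map<String, String> merge(Map<String, String>... layers) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> layer : layers) {
            merged.putAll(layer);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> global = Map.of("country", "de");          // per process
        Map<String, String> transaction = Map.of("species", "zyklop"); // per request
        Map<String, String> invocation = Map.of("dose", "underdose");  // per evaluation
        System.out.println(merge(global, transaction, invocation));
    }
}
```

This covers attribute merging only. Which targeting branch fires once the merged context reaches flagd is decided by the rule order in `flags.json`, not by the layer an attribute came from.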
…ode lives in solutions only
Two reviewer-flagged gaps on Expert.md — same file, addressed
together because they touch adjacent sections.
(1) targetingKey for fractional rollout was never stated. The Expert
level's vision_amplifier_v2 flag uses flagd's fractional operation
which buckets by hashing the OpenFeature targetingKey. The
SpeciesInterceptor (carried over from Intermediate) reads ?userId= and
sets it as the targetingKey via the first-arg String to ImmutableContext.
The k6 loadgen generates a fresh userId per request to spread load
across buckets. None of this was on the participant's radar — the
objective list didn't mention targetingKey, the architecture diagram
didn't annotate the userId-to-targetingKey path, and the only mention
was a passing line in the loadgen narrative. So a learner who hits the
endpoint by hand without ?userId= would see every request land in the
same bucket, and the rollback would "look like it works" by accident.
- Architecture diagram (loadgen→app arrow): now annotates that
?userId= becomes the targetingKey via the SpeciesInterceptor.
- Objective bullet (already there from a prior pass): kept,
verified it reads as a level-1 deliverable rather than as flavour.
- flagd fractional + targetingKey concept section: paragraph naming
the SpeciesInterceptor as the targetingKey source — the loadgen
narrative now reads as a demonstration of correctly-wired
bucketing, not the place where the bucketing magic happens.
- solutions/expert.md "Inspect what's already wired" (Step 2): added
a one-paragraph callback explaining that the SpeciesInterceptor
was wired in Intermediate and is what makes the Step 6 rollback
take effect immediately. Closes the gap that the rollback "just
works" because of code participants never see.
(2) ContextSpanHook full implementation lived in the concept section as
well as in the solution doc. Concept sections should motivate the
*idea* — the implementation is the answer. Reviewer flagged that the
current shape invites copy-paste over comprehension, and the PII
allowlist lesson (which is the actual learning goal of that section)
gets buried under a wall of Java imports.
- Replaced the full Java code block with a 2-3 line text-fenced
pseudocode sketch (`before(hookCtx) { span = active OTel span; for
each allowlisted key in merged eval context: span.setAttribute(...)
}`) so the shape is conveyed without being copy-pasteable.
- Added one short closing pointer to solutions/expert.md for the
full implementation including imports and the subtle correctness
notes (no-op span, why we don't need defensive guards).
- Kept the PII allowlist callout block intact — that's the actual
learning goal of the section.
Plus one small framing fix on the OTel TracerProvider/MeterProvider
concept section: added a one-sentence grounding ("spans = per-request
timing; counters = aggregate population stats; in this lab traces work,
metrics don't — that's the gap you close") so a participant without
prior OTel literacy doesn't have to infer what "tracer" and "meter"
mean from context.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
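To make point (1) above concrete, here is a small sketch of why a constant or missing targetingKey pins every request to one bucket while per-request keys spread across the rollout. It is an assumption-laden stand-in: CRC32 replaces flagd's own hash, the bucketing input is simplified to the key alone, a missing key is modelled as an empty string, and the `off` variant name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Illustrative only: not flagd's fractional implementation. */
final class FractionalBucketSketch {

    /** Maps a targeting key to a bucket in [0, 100). CRC32 is a stand-in hash. */
    static int bucket(String targetingKey) {
        CRC32 crc = new CRC32();
        crc.update(targetingKey.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    /** Returns "on" for keys whose bucket falls below the rollout percentage. */
    static String variant(String targetingKey, int percentOn) {
        return bucket(targetingKey) < percentOn ? "on" : "off";
    }

    public static void main(String[] args) {
        // Same key, same bucket: repeated calls never change variant.
        System.out.println(variant("", 50) + " " + variant("", 50));

        // Distinct keys (as the loadgen produces) spread across buckets.
        int on = 0;
        for (int i = 0; i < 1000; i++) {
            if (variant("user-" + i, 50).equals("on")) {
                on++;
            }
        }
        System.out.println(on + " of 1000 keys resolved to on");
    }
}
```

Setting `percentOn` to 0 sends every key to `off` regardless of bucket, which is the shape of the rollback the level asks for.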
Reviewer ran a density analysis across all seven docs and flagged that
the same three or four boundary concepts get re-explained 3–4 times
across the adventure. By Expert, the participant has heard "flagd has
4 ports" three times and "RPC vs IN_PROCESS vs FILE" four times. The
worst offender — intermediate.md "Concepts you'll touch" — was 53 lines
re-narrating what the architecture diagram + curl table already teach.
This commit lands the trims, one file per agent in parallel:
- **beginner.md (261 → 230, −30):** dropped the resolver-mode callout
in step b and the "Why a sidecar instead of file mode" defence in
the Architecture section (the canonical explanation lives in
solutions/beginner.md, where the participant has earned the context
to think about variants); slimmed the four-port enumeration in
Toolbox + "Access the UIs" to one line on :8013; replaced the ASCII
architecture diagram with the existing four narrative bullets that
carry the same host/port/env-var detail in less space; dropped the
"explanatory paragraph" framing the verification JSON.
- **intermediate.md (295 → 230, −64, ~22%):** replaced the 53-line
"Concepts you'll touch" section with a 4-bullet primer + a
cross-link to solutions/intermediate.md "Why This Layout Works" (the
reviewer-praised gold-standard 8-line summary of the same concepts);
dropped the resolver-mode sidebar in Toolbox (fourth instance in the
adventure, and Intermediate doesn't flip resolver modes); collapsed
"Run the Lab" from 20 lines listing four ways to start the app down
to 7 — `./run-germany.sh` as canonical, one-liner mention of
`./run-austria.sh` and the launch configs; tightened the
`tee app.log` callout from 4 lines of prose to 2.
- **expert.md (346 → 321, −25):** replaced the flagd-port enumeration
block (third instance of the 4-port walkthrough in the adventure)
with a one-liner pointing back to Beginner; dropped "Why a flag
flip beats a redeploy" subsection that restated the level's intro
paragraph; collapsed the 8-line PII allowlist callback (full
version lives in intermediate.md) to a 2-line cross-link; trimmed
the 4-bullet predicted-numbers paragraph to a one-line cue
("if those don't move, the loadgen flag isn't actually live yet").
Smaller delta than other files — the Expert sub-agent flagged that
the four prescribed trims really were ~25 lines of removable
content; the rest would require expanding scope into the
reviewer-preserved sections (intro, architecture diagram,
TracerProvider/MeterProvider concept, fractional+targetingKey
concept, implementation steps).
- **solutions/intermediate.md (291 → 271, −20):** the curl table at
the end of the solution was a near-verbatim rerun of the
participant doc's "Verify Each Cohort by Hand" — replaced with a
one-sentence pointer back to it. The "Why This Layout Works"
closing section (reviewer's "best 8 lines in the adventure") is
untouched.
solutions/beginner.md left as-is — its end-of-page resolver-modes
blockquote is the canonical explanation the other instances were
deferring to. It already carried the production-shape framing
(IN_PROCESS as the most common shape in real production deployments).
Total net: ~140 lines down. Reviewer's stretch target was ~280; the
gap is mostly Expert (~75 lines short) where further cuts would mean
expanding the trim into the preserved-by-default concept sections.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
…ver" is genuine
Reviewer caught a real bug: Expert told participants the SpeciesInterceptor
was "carried over from Intermediate," but Intermediate's solved version
only handled `?species=` and Expert silently shipped a fatter version that
also wired `?userId=` as the OpenFeature targetingKey. A participant who
solved Intermediate themselves and walked into Expert would find code in
"their" interceptor that they didn't write.
Going with the bigger fix: have Intermediate teach the userId/targetingKey
wiring too. The lesson is genuinely strengthened by it — the third
evaluation-context layer story now also lands the canonical PII identifier
on the transaction context, which makes the AuditHook PII discussion in
3c go from abstract ("targetingKey would be PII") to load-bearing ("the
targetingKey you just wired in 3a is exactly the kind of value the
allowlist keeps out of [AUDIT] lines").
Files updated:
- docs/solutions/intermediate.md SpeciesInterceptor code now matches the
Expert version byte-for-byte: read both ?species= and ?userId=, build
ImmutableContext(userId, attributes) when userId is present, otherwise
ImmutableContext(attributes). Notes section explains the targetingKey
constructor branch and the forward-looking nature of the wiring (no
Intermediate flag uses targetingKey; Expert's vision_amplifier_v2
fractional rollout is where it pays off).
- docs/intermediate.md: objective list adds a bullet for the targetingKey
wiring that names it as the canonical PII identifier the AuditHook
deliberately won't log; architecture diagram annotates `targetingKey ←
?userId=` on the transaction context arrow; the Concepts primer
rearranges to tuck ThreadLocalTransactionContextPropagator under the
three-context-layers bullet (it's a sub-concept), adds a new bullet on
targetingKey explaining ec.getTargetingKey() vs getValue("targetingKey")
and the ImmutableContext(targetingKey, attributes) constructor; step 3a
now instructs reading both ?species= and ?userId=; step 3c PII callout
rewritten to reference the userId the participant just wired in 3a as
the concrete example.
- docs/expert.md: objective bullet for SpeciesInterceptor reworded from
task-flavoured to verifiable-outcome ("you don't write this — verify
it via the variant-distribution panel after step 5"); ASCII diagram
fixed (the SpeciesInterceptor word-broke across lines as
"SpeciesIntercep- / tor"); fractional+targetingKey concept section
rewritten so it leans on "you already wired this in Intermediate"
rather than introducing the interceptor as new at this level.
- docs/solutions/expert.md: Step 2 callback rewritten — "the
SpeciesInterceptor you wrote in Intermediate" with explicit "Expert
ships it byte-for-byte unchanged"; the OpenFeatureConfig solution code
block now annotates AuditHook + TracesHook lines as "already wired in
broken state" and MetricsHook + ContextSpanHook as "you add this," so
the diff against the broken state is unambiguous.
Plus three smaller polish items the reviewer flagged in passing:
- docs/intermediate.md heading "3c. A `AuditHook`" -> "3c. An
`AuditHook`". The auto-generated slug becomes #3c-an-audithook, which
matches the existing link from expert.md (was broken before this).
Also fixes the same grammar artefact addressed earlier in verify.sh
at commit 144dd70.
- docs/intermediate.md Concepts primer reordered:
ThreadLocalTransactionContextPropagator was stranded as bullet 4 after
the Hook bullet, when it's actually a sub-concept of the transaction
context layer. Tucked into the three-context-layers bullet as a
follow-up sentence.
- docs/solutions/intermediate.md "Why This Layout Works" parenthetical
"(the sidebar)" was a dangling reference — the IN_PROCESS sidebar got
trimmed out of intermediate.md in 32d9646. Replaced with a pointer to
solutions/beginner.md where the resolver-modes overview now lives
canonically, and added a fifth bullet on targetingKey for completeness.
All three levels still compile clean (Java 21); shell scripts pass
`bash -n`.
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
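The interceptor branch described above (set a targetingKey only when `?userId=` is present, otherwise build an attributes-only context) could look roughly like this in plain Java. The `Ctx` record is a hypothetical stand-in for the SDK's context type, and `fromQuery` is an illustrative helper name; neither comes from the adventure's code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative only: Ctx stands in for an OpenFeature evaluation context. */
final class RequestContextSketch {

    /** Optional targeting key plus free-form attributes. */
    record Ctx(Optional<String> targetingKey, Map<String, String> attributes) {}

    /** Builds the per-request context from query parameters (null means absent). */
    static Ctx fromQuery(String species, String userId) {
        Map<String, String> attributes = new LinkedHashMap<>();
        if (species != null && !species.isBlank()) {
            attributes.put("species", species);
        }
        // Only set a targeting key when the caller actually supplied one.
        Optional<String> key = Optional.ofNullable(userId).filter(id -> !id.isBlank());
        return new Ctx(key, Map.copyOf(attributes));
    }

    public static void main(String[] args) {
        System.out.println(fromQuery("zyklop", "subject-17"));
        System.out.println(fromQuery("zyklop", null));
    }
}
```

In the adventure the two outcomes correspond to the `ImmutableContext(userId, attributes)` and `ImmutableContext(attributes)` constructors named in the commit message.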
Slug, title, and tracking-context tag renamed across the repo. Java
packages, branch name, and adventure number prefix unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
This was referenced Apr 30, 2026
Summary
A new adventure that walks participants through OpenFeature with flagd as the provider, framed as a clinical trial of a vision-amplification serum where the Phase 3 rollout is making roughly one in ten subjects emerge blind. Single-stack (Java + Spring Boot) per the existing convention; three levels.
The draft has come a long way since I opened it; this body is a refresh.
Level arc
- Beginner: … `Resolver.RPC` mode against a flagd sibling that the devcontainer runs alongside the workspace. Prove `flags.json` hot-reloads without a restart; flagd's file watcher does the work.
- Intermediate: `RaceInterceptor` lifts `?race=` into transaction context (per-request), the `COUNTRY` env var feeds the global context (per-process), and the `Trial` passes a `dose` attribute as invocation context at the call site (per-evaluation). An `AuditHook` records every dose dispensed (with a PII-safe attribute allowlist).
- Expert: … `ContextSpanHook` that mirrors the merged eval context onto the active span (so Tempo can answer "which dose got which variant?"), identify the misbehaving fractional rollout (`vision_amplifier_v2`: 200ms slow + 10% blind emergence = HTTP 500), and roll it back via `flags.json` without redeploying.
Why this story
The story has a built-in pun the participant resolves with their solution: the lab studying eyesight cannot see what is happening because the metrics half of OTel is unwired. Lighting up the Grafana dashboard is the first move toward saving the next batch of subjects. Feature flags become an operational lever rather than a configuration footnote.
Notable shape choices since the draft opened
- … `language` (transaction) + `springVersion` (global). Refactored so race comes from the URL, country from the env, and dose from the call site, which gave us a clean place to teach precedence (invocation wins) and to talk about why you'd choose one layer over another.
- `AuditHook` (renamed from `CustomHook`) is a real audit log. Earlier draft had a thin print-everything hook; reviewer instinct said it was filler. Rewrote it as an audit-style hook with a fixed `AUDITED` allowlist (`race`, `country`, `dose`, `targetingKey`), `WARN` on the failure variant, `INFO` otherwise. Doubles as the on-ramp for the privacy discussion in Expert.
- `ContextSpanHook` is a participant task in Expert. The merged eval context already lives on `HookContext.getCtx()`; copying it onto the active OTel span is a few lines, but the lesson is the PII allowlist: the merged context routinely carries `targetingKey` (often a user id) and would carry email/account-id in a real app. Span attributes are retained for days in Tempo/Prometheus and are hard to redact after the fact, so the task ships with a callout linking https://opentelemetry.io/docs/security/ and an inline TODO that says explicitly: do not iterate the whole context.
- … `flagd` sibling (and `lgtm` + `loadgen` for Expert), via `dockerComposeFile` + `service: workspace`. Beginner originally ran flagd in FILE mode in-process. That was pedagogically lighter, but it meant participants met "flagd as a separate service" only at Intermediate, after they'd already built a mental model on a mode they'd then have to throw away. Beginner now ships with the flagd sibling and `Resolver.RPC` as the canonical answer, so the shape is consistent across the adventure.
- `.vscode/launch.json` checked in. Three configs per level (🇩🇪 Germany, 🇦🇹 Austria, 🌍 No country) so participants can F5 between cohorts to see global-context targeting flip live. The repo's root `.gitignore` excludes `.vscode/`; per-scenario `.gitignore`s re-include with `!*/.vscode/**` so the configs ship with the broken-state code.
- … `:8015`, evaluations stay local) and explains that we lead with `Resolver.RPC` only because the wire model is easier to reason about for a first contact. Intermediate keeps its sidebar showing how to flip to IN_PROCESS against the same flagd sibling.
- … `:8014` labelled as "flagd HTTP eval" across three devcontainers and one verify.sh + the k6 loadgen; that's stale (it was the layout in older flagd). Current flagd defaults: `:8013` gRPC eval (multiplexes HTTP/JSON via gRPC-Gateway too), `:8014` management/metrics, `:8015` sync gRPC, `:8016` OFREP. Fixed labels everywhere and pointed the runtime URLs (verify.sh, compose `FLAGD_URL`, k6 default) back to `:8013` where the gRPC-Gateway path actually lives.
What's in the draft
- `ideas/blind-by-design.md`: passes `scripts/validate-idea.sh`
- `adventures/planned/00-blind-by-design/mkdocs.yaml` (🧪 emoji + nav)
- `docs/{index,beginner,intermediate,expert}.md` and `docs/solutions/*.md`
- `beginner/`, `intermediate/`, `expert/` (each with `pom.xml`, `mvnw`, `src/`, `flags.json`, `verify.sh`, scoped `.gitignore` + `.vscode/`)
- … `intermediate/run-germany.sh`, `run-austria.sh`)
- `expert/docker-compose.yaml` for the LGTM + flagd + loadgen stack used during the level
- `.devcontainer/00-blind-by-design_{01-beginner,02-intermediate,03-expert}/`: one per level
- … `flagd` siblings (compose-based)
- … `flagd`)
- … `flagd` + `lgtm` + `loadgen`
- `post-create.sh` / `post-start.sh` source the existing helpers in `lib/` and pre-open the level's first file via `code` (compose-based devcontainers don't reliably honour `customizations.codespaces.openFiles`)
Verification approach
`verify.sh` per level checks outcomes, not file contents (per `docs/contributing/adventures.md`):
- Beginner: … `vision_state`; value isn't the hard-coded `untreated` fallback (proves the flagd sibling resolved it); swapping `defaultVariant` in `flags.json` flips the response live (proves the flagd file watcher reloads).
- Intermediate: `?race=zyklop` → `enhanced` (transaction context); `?dose=standard` with `COUNTRY=de` → `sharp` (global context); `?dose=underdose` → `clouded` (invocation context); `?race=zyklop&dose=underdose` → `enhanced` (precedence: race-zyklop branch wins over improper-dose branch in `flags.json` targeting); `app.log` shows `AUDIT` lines.
- Expert: `MetricsHook` is firing (`feature_flag_evaluation_requests_total` non-zero in Prometheus); traces present in Tempo for `fun-with-flags-java-spring`; at least one trace carries `feature_flag.context.dose=underdose` (proves `ContextSpanHook` is registered); `vision_amplifier_v2` rolled back to 0% on (read from flagd's gRPC-Gateway HTTP route on `:8013`); HTTP 5xx rate below threshold.
Source
The broken-state code is adapted from the polyglot Fun-With-Flags-Demo (java-spring variant), which already has step branches matching each level's start state.
Validation
- `scripts/validate-idea.sh ideas/blind-by-design.md` ✅
- `./mvnw compile` ✅ on all three levels (Java 21)
- `bash -n` ✅ on every shell script
- `./verify.sh` ✅ end-to-end on each level against the running broken-state-then-solved app
- `mkdocs serve` renders under `00-blind-by-design` per the existing nav
What I'd love feedback on
- Is the `ContextSpanHook` task pitched at the right level for Expert (it's a small write, but the PII discussion is the actual learning goal)?
Ready to iterate based on review.