feat(observability): WS4 funnel custom events → New Relic (InstantFunnel) #259
Merged
…nel)
Wire common/analyticsevent into the api so the conversion funnel
(anonymous→provision→claim→paid) is recorded as a per-entity New Relic
custom event (InstantFunnel) alongside the existing aggregate Prometheus
counter instant_conversion_funnel_total. Closes the WS4 gap: backend→NR
custom-event bridge now emits at the funnel points.
Emitter:
- Package-level analyticsevent.Emitter in handlers (atomic, boxed for
type-stable atomic.Value), default = noop. Router builds it once at boot
(wireAnalyticsEmitter) from ANALYTICS_BACKEND (default "noop" = INERT;
"newrelic" reuses the api's existing *newrelic.Application). Fail-open:
the analyticsevent wrapper swallows panics + sanitizes PII (allowlist).
noop-default is the flag protection — no separate feature flag.
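A minimal sketch of the boot-time selection and the fail-open wrapper described above. The `Emitter` interface, `selectEmitter`, `failOpen`, and `panickyEmitter` are hypothetical names for illustration; the real `analyticsevent` API and `wireAnalyticsEmitter` signature are not shown in this PR, and the New Relic backend is replaced by a stand-in value.

```go
package main

import (
	"fmt"
	"os"
)

// Emitter is a hypothetical stand-in for analyticsevent.Emitter.
type Emitter interface {
	Emit(eventType string, attrs map[string]any)
}

// noopEmitter drops every event; it is the inert default.
type noopEmitter struct{}

func (noopEmitter) Emit(string, map[string]any) {}

// failOpen wraps a backend so a panic inside it never reaches the caller.
type failOpen struct{ inner Emitter }

func (f failOpen) Emit(eventType string, attrs map[string]any) {
	defer func() { _ = recover() }() // swallow backend panics
	f.inner.Emit(eventType, attrs)
}

// selectEmitter maps the ANALYTICS_BACKEND value to an emitter.
// Anything other than "newrelic" (including empty or unknown) yields noop.
// Assumption: newRelic is a non-nil backend built from the existing app.
func selectEmitter(backend string, newRelic Emitter) Emitter {
	switch backend {
	case "newrelic":
		return failOpen{inner: newRelic}
	default:
		return noopEmitter{}
	}
}

// panickyEmitter simulates a misbehaving backend for the demo.
type panickyEmitter struct{}

func (panickyEmitter) Emit(string, map[string]any) { panic("backend exploded") }

func main() {
	e := selectEmitter(os.Getenv("ANALYTICS_BACKEND"), panickyEmitter{})
	e.Emit("InstantFunnel", map[string]any{"funnelStep": "claim"})
	fmt.Printf("selected %T; request path unaffected\n", e)
}
```

Because the default branch returns the noop emitter, leaving `ANALYTICS_BACKEND` unset keeps the feature inert, which is how the noop default can act as the flag.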
Emit sites (10, alongside — not replacing — the Prom counter):
- provision: db/cache/nosql/vector/queue/storage/webhook NewX (anon path)
- claim: onboarding.Claim (anon→claimed)
- landing: onboarding.StartLanding (top of funnel)
- paid: billing.handleSubscriptionCharged (claimed→paid)
Attributes are PII-safe + low-cardinality: funnelStep, service, tier, env,
hashed fingerprint, opaque teamId. No raw email/token/connection string.
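As an illustration of the allowlist idea, the sketch below keeps only the attribute keys listed above and drops empty values. The function names are hypothetical, and SHA-256 truncated to 8 bytes is an assumption; the PR does not say how the fingerprint is hashed.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// allowedKeys mirrors the attribute list in the commit message.
var allowedKeys = map[string]bool{
	"funnelStep":  true,
	"service":     true,
	"tier":        true,
	"env":         true,
	"fingerprint": true,
	"teamId":      true,
}

// sanitize returns a copy holding only allowlisted keys with non-empty values.
func sanitize(attrs map[string]string) map[string]string {
	out := make(map[string]string, len(attrs))
	for k, v := range attrs {
		if allowedKeys[k] && v != "" {
			out[k] = v
		}
	}
	return out
}

// hashFingerprint is an assumed hashing scheme: hex of the first 8 bytes
// of the SHA-256 digest, so the raw fingerprint never leaves the process.
func hashFingerprint(raw string) string {
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:8])
}

func main() {
	attrs := sanitize(map[string]string{
		"funnelStep":  "provision",
		"service":     "db",
		"tier":        "", // empty optional attribute is omitted
		"fingerprint": hashFingerprint("raw-browser-fingerprint"),
		"email":       "user@example.com", // not allowlisted, dropped
	})
	fmt.Println(len(attrs), attrs)
}
```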
Observability (rule 25): new Prom counter
instant_analytics_emit_failed_total{reason} (nil_app = NR unconfigured) via
the nr failure hook; docs/OBSERVABILITY-FUNNEL-EVENTS.md documents the
InstantFunnel event + NRQL + the cohort='synthetic' exclusion. Alert +
dashboard tile live in the infra repo (no auto-apply).
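The failure-hook path can be pictured roughly as follows. This is a stdlib-only sketch: `labeledCounter` stands in for a Prometheus counter with a `reason` label, and `nrApp` and `emitFunnel` are hypothetical names, not the New Relic agent or the PR's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// labeledCounter is a stand-in for a counter vector keyed by "reason".
type labeledCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newLabeledCounter() *labeledCounter {
	return &labeledCounter{counts: map[string]int{}}
}

// Inc increments the series for the given reason.
func (c *labeledCounter) Inc(reason string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[reason]++
}

// Get returns the current value for the given reason.
func (c *labeledCounter) Get(reason string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[reason]
}

// nrApp is a hypothetical handle; a nil pointer models "NR unconfigured".
type nrApp struct{}

// emitFunnel reports failures through the hook instead of returning an error,
// so callers on the request path have nothing to handle.
func emitFunnel(a *nrApp, onFailure func(reason string)) bool {
	if a == nil {
		onFailure("nil_app")
		return false
	}
	return true // a real backend would record the custom event here
}

func main() {
	failed := newLabeledCounter()
	emitFunnel(nil, failed.Inc)
	emitFunnel(&nrApp{}, failed.Inc)
	fmt.Println("nil_app failures:", failed.Get("nil_app"))
}
```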
Tests: recording-emitter assertions per step+attrs, noop-default no-error,
PII-not-emitted, registry-iterating allowlist guard, wire-contract step
guard, NR nil-app failure-hook → counter, and a DB-backed paid-funnel
event test through the real webhook path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e 69)
The 100%-patch gate flagged analytics.go:69 (the NewNoop() fallback) uncovered —
SetAnalyticsEmitter ignores nil so the existing tests always store a non-nil
box. Add an in-package test storing emitterBox{e:nil} directly to exercise the
fallback branch. analytics.go getAnalyticsEmitter now 100%.
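For context, a sketch of why a nil-holding box is the only way to reach that branch. The shapes of `emitterBox`, `SetAnalyticsEmitter`, and `getAnalyticsEmitter` are inferred from this commit message, not copied from analytics.go; `Emitter`, `noopEmitter`, `current`, and `recorder` are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Emitter is a hypothetical stand-in for analyticsevent.Emitter.
type Emitter interface {
	Emit(eventType string, attrs map[string]any)
}

type noopEmitter struct{}

func (noopEmitter) Emit(string, map[string]any) {}

// emitterBox gives atomic.Value one concrete type to store; atomic.Value
// panics if successive Store calls use different concrete types.
type emitterBox struct{ e Emitter }

var current atomic.Value // holds emitterBox

// SetAnalyticsEmitter ignores nil, so callers can never store an empty box.
func SetAnalyticsEmitter(e Emitter) {
	if e == nil {
		return
	}
	current.Store(emitterBox{e: e})
}

// getAnalyticsEmitter falls back to noop when nothing usable is stored.
func getAnalyticsEmitter() Emitter {
	if box, ok := current.Load().(emitterBox); ok && box.e != nil {
		return box.e
	}
	return noopEmitter{} // the fallback branch the extra test exercises
}

// recorder captures event types for the demo.
type recorder struct{ events []string }

func (r *recorder) Emit(eventType string, _ map[string]any) {
	r.events = append(r.events, eventType)
}

func main() {
	current.Store(emitterBox{e: nil}) // only reachable from inside the package
	fmt.Printf("%T\n", getAnalyticsEmitter())

	r := &recorder{}
	SetAnalyticsEmitter(r)
	getAnalyticsEmitter().Emit("InstantFunnel", nil)
	fmt.Println(r.events)
}
```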
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
Wires `common/analyticsevent` (merged in common #44) into the api so the conversion funnel (anonymous→provision→claim→paid) is recorded as a per-entity New Relic custom event (`InstantFunnel`), alongside the existing aggregate Prometheus counter `instant_conversion_funnel_total`. Closes the WS4 gap from OBSERVABILITY-AND-INTELLIGENCE-PLAN.md: the backend→NR custom-event bridge now emits at the funnel points.

Prometheus answers "how many"; the NR custom event answers "for which entity / cohort". The KPIs anon→claimed (>2%) and claimed→paid (>20%) need a stable key (fingerprint bucket / teamId), which a counter can't carry.
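To make the "stable key" point concrete, here is a small sketch that computes a step-to-step conversion rate over distinct entities. The `funnelEvent` type, the `conversionRate` helper, and the sample data are hypothetical; in practice this aggregation would be done in NRQL over `InstantFunnel` events rather than in Go.

```go
package main

import "fmt"

// funnelEvent is a hypothetical in-memory view of one per-entity event.
type funnelEvent struct {
	Key  string // stable per-entity key, e.g. fingerprint bucket or teamId
	Step string // funnel step name
}

// conversionRate returns the fraction of distinct keys seen at step `from`
// that were also seen at step `to`. Repeated events per key count once.
func conversionRate(events []funnelEvent, from, to string) float64 {
	reachedFrom := map[string]bool{}
	reachedTo := map[string]bool{}
	for _, ev := range events {
		switch ev.Step {
		case from:
			reachedFrom[ev.Key] = true
		case to:
			reachedTo[ev.Key] = true
		}
	}
	if len(reachedFrom) == 0 {
		return 0
	}
	converted := 0
	for k := range reachedFrom {
		if reachedTo[k] {
			converted++
		}
	}
	return float64(converted) / float64(len(reachedFrom))
}

func main() {
	events := []funnelEvent{
		{"a", "provision"}, {"a", "provision"}, // same entity provisions twice
		{"b", "provision"}, {"c", "provision"}, {"d", "provision"},
		{"a", "claim"},
	}
	// A plain counter would see 5 provisions and 1 claim; keyed events
	// show that 1 of 4 distinct entities converted.
	fmt.Printf("%.2f\n", conversionRate(events, "provision", "claim")) // prints 0.25
}
```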
Emitter construction / config
- Package-level `analyticsevent.Emitter` in `handlers` (atomic, boxed for type-stable `atomic.Value`), default = noop.
- Router builds it once at boot (`wireAnalyticsEmitter`) from `ANALYTICS_BACKEND` (default `"noop"` = inert; `"newrelic"` reuses the api's existing `*newrelic.Application`, so there is no second NR connection).
- Fail-open: `Wrap` swallows panics and PII-sanitizes (allowlist) before any backend sees the attrs; emit calls add no error handling that could fail the request path.

Funnel emit points wired (10; ADD, never replace the Prom counter)
- provision: `db.go` / `cache.go` / `nosql.go` / `vector.go` / `queue.go` / `storage.go` / `webhook.go`, `NewX` anon path
- claim: `onboarding.go`, `Claim`
- landing: `onboarding.go`, `StartLanding`
- paid: `billing.go`, `handleSubscriptionCharged`

Enumeration:
`rg -F 'metrics.ConversionFunnel.WithLabelValues'` → 10 sites; all 10 touched.

Attributes are PII-safe + low-cardinality:
`funnelStep`, `serviceName`, `tier`, `env`, hashed `fingerprint`, opaque `teamId`. No raw email/token/connection string (allowlist backstop in the wrapper).

Observability (rule 25)
- New Prom counter `instant_analytics_emit_failed_total{reason}` (`nil_app` = NR unconfigured) via the nr failure hook.
- `docs/OBSERVABILITY-FUNNEL-EVENTS.md` documents the `InstantFunnel` event contract, NRQL starters, and the `cohort='synthetic'` exclusion for funnel analysis.
- The NR alert + dashboard tile live in the separate `infra` repo (no auto-apply).

Tests (test names)
- `internal/handlers/analytics_test.go`: `TestGetAnalyticsEmitter_DefaultsToNoop`, `TestSetAnalyticsEmitter_NilIgnored`, `TestRecordFunnelEvent_EmitsFunnelEventWithStepAndAttrs`, `TestRecordFunnelEvent_OmitsEmptyOptionalAttrs`, `TestRecordFunnelEvent_EachCanonicalStep`, `TestRecordFunnelEvent_DoesNotEmitPII`, `TestFunnelAttrs_ToMap_OnlyAllowlistedKeys`, `TestFunnelStepsMatchCanonical`
- `internal/router/analytics_wiring_test.go`: `TestWireAnalyticsEmitter_DefaultNoop`, `TestWireAnalyticsEmitter_UnknownBackendDegradesToNoop`, `TestWireAnalyticsEmitter_NewRelicNilAppFiresFailureHook`
- `internal/handlers/billing_funnel_event_test.go`: `TestBillingWebhook_SubscriptionCharged_EmitsPaidFunnelEvent` (DB-backed, real webhook path)

Gate
`make gate`: build + vet clean; all touched-package tests pass. The handlers `-p 1` run reds only on pre-existing NATS/customer-DB environmental flakes (`TestDBNew` 503 = customer-DB provisioner down, `TestQueue` "NATS health check failed — is the NATS pod running?", etc.), verified identical on clean origin/master with these files stashed. CI (which has those services) is authoritative.

🤖 Generated with Claude Code