feat(loadgen): adding loadgen binary#3335
Conversation
…spamoor Standalone Go binary that orchestrates spamoor-daemon via HTTP API to drive sustained and burst transaction load against ev-reth. Structured as a cobra CLI with subcommands: check (connectivity), regular (~1M tx/day), burst (500K probabilistic), and run (custom matrix). Includes Dockerfile, docker-compose, baseline/burst matrices, crontab for supercronic scheduling, just targets for build and smoke testing.
Replace separate regular/burst subcommands invoked by supercronic with a single `start` command that runs an infinite scheduling loop. Regular workloads fire on a ticker, bursts at random times within rolling 24h windows. All timing controlled via CLI flags with env fallbacks. - remove supercronic from Dockerfile - delete regular.go, burst.go, crontab - add start.go with scheduler loop - plumb context for graceful shutdown - unify ExecuteMatrix/ExecuteMatrixWithOverrides via matrixOpts - bump fees (BASE_FEE 20→500, TIP_FEE 2→50) - remove healthcheck from docker-compose, rely on WaitForSync
📝 WalkthroughWalkthroughAdds a new ev-loadgen CLI app: build and CI entries, Justfile recipes, Docker assets, a Cobra CLI (run/start/check), a Spamoor client abstraction, matrix parsing/validation, a runner that creates/polls spammers, scheduling (regular and burst), tests, and documentation. Changesev-loadgen Load Generator Application
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Claude finished @chatton's task in 3m 13s —— View job Code ReviewThis is a well-structured addition. The package layout is clean, the SpamoorClient interface is appropriately scoped for testability, and the fake test double is well-implemented. Several issues worth addressing before merge: 🔴 High Priority1. README documents mutex serialization that doesn't exist [ The README states: "a mutex prevents concurrent spamoor access. If burst fires during regular (or vice versa), it waits for the lock." But 2. Goroutine accumulation on slow workloads [ case <-ticker.C:
go runWorkload("regular", cfg.regularMatrix, regularTxPerRun)Goroutines are launched unconditionally on every tick. If a workload takes longer than 3. for _, sp := range existing {
if err := api.DeleteSpammer(sp.ID); err != nil {
return fmt.Errorf("delete spammer %d: %w", sp.ID, err) // ← early return
}
}On shutdown cleanup, if deleting one spammer fails (transient network error, etc.), all remaining spammers are left running. This is particularly bad for the shutdown path in var errs []error
for _, sp := range existing {
if err := api.DeleteSpammer(sp.ID); err != nil {
errs = append(errs, fmt.Errorf("delete spammer %d: %w", sp.ID, err))
}
}
return errors.Join(errs...)🟡 Medium Priority4.
5. Stale height, err := getSpamoorHeight(api)
if err != nil {
log.Printf("warning: %v", err)
} else {
if lastHeight > 0 && height > 0 {
delta := height - lastHeight // ← lastHeight not reset on errorIf 6. Potential panic in delay := time.Duration(rand.Int64N(int64(window) / int64(remaining)))If if cfg.burstPerDay > int(burstWindow.Seconds()) {
return fmt.Errorf("burst-per-day %d exceeds maximum (%d)", cfg.burstPerDay, int(burstWindow.Seconds()))
}🔵 Low Priority / Style7. A typo like 8. Inconsistent HTTP timeout [
9. Burst reset is 24h from startup, not midnight UTC [ The README says: "at startup + each midnight UTC, generates N random times across the day". The actual implementation uses 10. No context propagation in None of the interface methods accept ✅ Already Addressed (from prior CodeRabbit review)
TestingThe test coverage for the core paths is solid — baseline counter seeding, spammer start failure, completion polling, sync detection, and context cancellation are all covered. The Missing coverage:
|
|
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3335 +/- ##
==========================================
- Coverage 60.75% 60.72% -0.03%
==========================================
Files 127 127
Lines 13781 13781
==========================================
- Hits 8372 8368 -4
- Misses 4497 4500 +3
- Partials 912 913 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (10)
apps/loadgen/cmd/flags_test.go (1)
10-12: 💤 Low valueRedundant (duplicate)
startregistration.
NewRootCmd()already callsrootCmd.AddCommand(newRunCmd(), newStartCmd(), newCheckCmd()), sorootalready owns astartsubcommand. Adding anotherstartCmdhere registers a second, duplicatestartcommand onroot. The test still passes because the assertions run against the localstartCmd, butrootis never used in this test — bothrootand the extraAddCommandcan be dropped.♻️ Suggested simplification
func TestStartFlags(t *testing.T) { - root := NewRootCmd() startCmd := newStartCmd() - root.AddCommand(startCmd) err := startCmd.ParseFlags([]string{"--regular-matrix", "custom.json"}) require.NoError(t, err)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/cmd/flags_test.go` around lines 10 - 12, Remove the redundant registration of the "start" subcommand: NewRootCmd() already adds newStartCmd(), so drop the creation of root and the extra root.AddCommand(startCmd) call from the test; keep the local startCmd used for assertions (created via newStartCmd()) and remove references to NewRootCmd(), root, and root.AddCommand to avoid registering a duplicate start command.apps/loadgen/main.go (1)
17-21: 💤 Low valueBanner side-effect in
init()fires on every invocation.
init()runs beforemain()for all entrypoints, so"ev-loadgen starting"is emitted even for--help, shell completion, and other non-running subcommands, polluting their output. Consider keeping only logger configuration ininit()and moving the startup message intomain()(or aPersistentPreRun) so it prints only when a workload actually starts.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/main.go` around lines 17 - 21, The init() currently configures logging and also prints a startup banner ("ev-loadgen starting"), which runs for every entrypoint; remove the fmt.Fprintln(os.Stderr, "ev-loadgen starting") from init() so init() only does log.SetFlags and log.SetOutput, and relocate the banner to main() (or a Cobra PersistentPreRun if using Cobra) so the message is emitted only when the workload actually starts; update references in main() or the command's PersistentPreRun to print the same startup message to os.Stderr.apps/loadgen/internal/client.go (2)
67-81: ⚡ Quick winWrap returned errors with context.
GetClientsreturns bare errors from the HTTP GET and the JSON decode. Wrapping them makes failures actionable in logs/callers.♻️ Proposed wrapping
url := fmt.Sprintf("%s/api/clients", c.api.BaseURL) resp, err := c.client.Get(url) if err != nil { - return nil, err + return nil, fmt.Errorf("get clients: %w", err) } defer func() { _ = resp.Body.Close() }() if resp.StatusCode < 200 || resp.StatusCode >= 300 { body, _ := io.ReadAll(resp.Body) return nil, fmt.Errorf("get clients failed: %s", string(body)) } var raw []clientResponse if err := json.NewDecoder(resp.Body).Decode(&raw); err != nil { - return nil, err + return nil, fmt.Errorf("decode clients response: %w", err) }As per coding guidelines: "Wrap errors with context using fmt.Errorf in Go code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/internal/client.go` around lines 67 - 81, GetClients currently returns raw errors from the HTTP GET and JSON decode; update spamoorAPIClient.GetClients to wrap those errors with contextual messages (e.g., using fmt.Errorf) so callers see what operation failed. Specifically, wrap the error returned by c.client.Get(url) with context like "GET /api/clients failed" and wrap the error from json.NewDecoder(resp.Body).Decode(&raw) with context like "decode GetClients response failed"; keep existing behavior for non-2xx handling and closing resp.Body, but replace bare err returns with fmt.Errorf including the original err.
16-24: ⚖️ Poor tradeoffConsider propagating
context.Contextthrough the client.The
SpamoorClientinterface andGetClientsrely solely on the 2shttp.Clienttimeout and ignore the caller's context (e.g.,cmd.Context()fromRunCheck). Threading acontext.Contextinto these methods would honor cancellation and deadlines from upstream callers. This touches the interface surface, so it can be deferred, but it aligns with the guideline.As per coding guidelines: "Use context.Context for cancellation in Go code".
Also applies to: 67-69
apps/loadgen/docker-compose.yml (1)
5-5: ⚡ Quick winPin
spamoor-daemonto an immutable version/digest instead oflatest.Using
:latestmakes load tests non-reproducible and can silently change behavior between runs.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/docker-compose.yml` at line 5, Replace the floating image reference "image: ethpandaops/spamoor:latest" with an immutable image reference (either a specific version tag like "ethpandaops/spamoor:vX.Y.Z" or a content-addressable digest "ethpandaops/spamoor@sha256:...") so the loadgen service uses a reproducible artifact; update the compose service's image field where "image: ethpandaops/spamoor:latest" appears and ensure any CI/docs that trigger builds are updated to pin or update the tag/digest when intentional.apps/loadgen/internal/matrix.go (2)
104-114: 💤 Low value
envIntsilently masks malformed values.A non-integer value for
BENCH_NUM_SPAMMERS/BENCH_COUNT_PER_SPAMMER(e.g. a typo like"2x") silently falls back to the default instead of being rejected, so a misconfigured matrix can run with unintended values. Since these are required, validated inputs, surfacing the parse error would be safer.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/internal/matrix.go` around lines 104 - 114, The envInt helper currently swallows parse errors; change envInt(key) to return (int, error) instead of silently falling back so callers can fail fast for required inputs like BENCH_NUM_SPAMMERS and BENCH_COUNT_PER_SPAMMER; specifically, update envInt to: 1) return fallback value only when the key is absent (or keep current semantics if fallback is intended) but return a non-nil error when strconv.Atoi fails, 2) update all callers of envInt (e.g., where BENCH_NUM_SPAMMERS and BENCH_COUNT_PER_SPAMMER are read) to handle the error and surface it (return error up the stack or log and exit), and 3) ensure error messages include the env key and the invalid value to aid debugging.
79-102: ⚡ Quick winConsider validating
Timeoutduring matrix validation.
Timeoutis accepted as a free-form string but never parsed here, so an invalid duration (e.g."2x") passes validation and only surfaces as an error later when the runner parses it. Validating it up front (when non-empty) gives faster, clearer feedback at load time.♻️ Proposed early validation
if e.Probability != nil && (*e.Probability < 0 || *e.Probability > 1) { return fmt.Errorf("probability must be between 0.0 and 1.0, got %f", *e.Probability) } + if e.Timeout != "" { + if _, err := time.ParseDuration(e.Timeout); err != nil { + return fmt.Errorf("invalid timeout %q: %w", e.Timeout, err) + } + } + return nilAdd
"time"to the imports.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/internal/matrix.go` around lines 79 - 102, The validate method on Entry should also parse and validate the Timeout string early: when e.Timeout is non-empty call time.ParseDuration(e.Timeout) (add the "time" import) and return a clear fmt.Errorf if parsing fails (e.g., "invalid timeout %q: %v"), so invalid durations like "2x" fail during Entry.validate rather than later in the runner.apps/loadgen/cmd/start.go (2)
56-129: ⚖️ Poor tradeoffUse structured logging instead of
log.Printf.This scheduler (and the runner package) logs exclusively via the standard
logpackage with no levels. Consider a leveled/structured logger so operational output (info vs warning/error such as the workload-error and cleanup-failure cases) is distinguishable.As per coding guidelines: "Use structured logging with appropriate log levels".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/cmd/start.go` around lines 56 - 129, The function runScheduler uses plain log.Printf; replace it with a leveled/structured logger and use appropriate levels for events (info for normal lifecycle messages, error for failures, warn for cleanup warnings). Instantiate or accept a structured logger (e.g., logger variable or parameter) in runScheduler, then change all log.Printf calls inside runScheduler and the nested runWorkload closure to logger.Infof/logger.Warnf/logger.Errorf (or equivalent) and include structured fields like "label", "txCount", "interval", "regularMatrix", "burstMatrix", and "spamoor" to make messages machine-readable; also propagate or pass the logger into any calls that need it (or include it in context) for internal.WaitForSync, internal.ExecuteMatrixWithOverridesFromFile, and internal.DeleteAllSpammers so their failures are logged with error level and contextual fields.
79-119: ⚖️ Poor tradeoffBursts/regular workloads launched on every tick can queue up unboundedly.
Each
ticker.C/burstTimer.Cfire spawns a newgo runWorkload(...). These serialize onmu, but if a workload runs longer thanintervalthe goroutines pile up and later run stale workloads against a chain that has already moved on. Consider skipping (or coalescing) a tick when a workload is already in flight, e.g. aTryLock/busy flag rather than launching unconditionally.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/cmd/start.go` around lines 79 - 119, The tick handlers currently spawn goroutines unconditionally which can pile up because runWorkload serializes on mu; add a short-circuit "in-flight" guard (e.g., an int32 atomic flag or a token channel) to skip launching when a workload is already running: declare inFlight int32 (or a chan struct{} semaphore) near the function, change the ticker and burstTimer cases to do an atomic.CompareAndSwapInt32(&inFlight, 0, 1) (or try to receive/send on the token channel) and only call go runWorkload(...) when the CAS succeeds, and ensure runWorkload clears the flag on exit (defer atomic.StoreInt32(&inFlight, 0) or release the token) so subsequent ticks can run; keep existing mu locking inside runWorkload and preserve cleanup on ctx.Done(). Use the same mechanism for both "regular" and "burst" launches to coalesce/skips overlapping runs.apps/loadgen/internal/runner.go (1)
337-353: 💤 Low value
waitForSyncreuses a stalelastHeightafter a fetch error.If
getSpamoorHeighterrors on an iteration,lastHeightis not updated, so the next successful read computesdelta = height - lastHeightacross a longer wall-clock gap (two+ poll intervals). That inflates the delta and can prevent thedelta < syncThresholdexit even when synced. Minor, but resettinglastHeight = 0on error (forcing a fresh baseline) would make the heuristic more robust.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/internal/runner.go` around lines 337 - 353, In waitForSync's polling loop, when getSpamoorHeight returns an error the code currently leaves lastHeight unchanged which can inflate the next delta; update the error branch in waitForSync to reset lastHeight to 0 (or another sentinel) after logging the error so the next successful getSpamoorHeight call establishes a fresh baseline before computing delta against syncThreshold using pollInterval; modify the block that handles err from getSpamoorHeight to set lastHeight = 0.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/loadgen/Dockerfile`:
- Around line 11-24: The image runs as root; change the Dockerfile to create and
switch to a non-root user before executing CMD: add an unprivileged group/user
(e.g., create group "ev" and user "ev"), change WORKDIR to a non-root directory
(e.g., /home/ev), chown the copied files (ev-loadgen and the JSON matrices) to
that user (references: WORKDIR, COPY lines that place /usr/bin/ev-loadgen and
/root/*.json), and add a USER instruction so the container runs as that
unprivileged user when invoking CMD ["ev-loadgen","start"].
In `@apps/loadgen/go.mod`:
- Line 42: apps/loadgen/go.mod pulls vulnerable transitive modules
(github.com/cometbft/cometbft and github.com/cosmos/cosmos-sdk) via
github.com/celestiaorg/tastora; update the transitive chain by bumping tastora
to v0.20.0 (which brings cometbft to v1.0.1 and cosmos-sdk to v0.54.3), either
by running `go get github.com/celestiaorg/tastora@v0.20.0` or editing
apps/loadgen/go.mod to require the new versions, then run `go mod tidy` and `go
list -m all` to verify the upgraded versions and re-run the vulnerability scan
(e.g., `govulncheck ./...`) to confirm the fixes.
In `@apps/loadgen/internal/config.go`:
- Around line 39-49: The loop that maps environment variables into cfg currently
uses strconv.Atoi and falls back to the original string, which turns
BENCH_REBROADCAST="true" into the JSON string "true"; change the numeric parsing
to use strconv.ParseUint(val, 10, 64) (for example in the same loop that
iterates envMapping) and, on success, assign the parsed uint64 value into
cfg[cfgKey]; on parse failure keep the original string as before. Also update
the corresponding test in config_test.go to expect a numeric value (uint64/JSON
number) for the BENCH_REBROADCAST entry instead of the string "true". Ensure
references include the envMapping loop and the BENCH_REBROADCAST key so the
change is applied to the correct mapping.
In `@apps/loadgen/README.md`:
- Around line 7-12: The README contains fenced code blocks with ASCII diagrams
(e.g., the block starting with "ev-loadgen (this binary) --> spamoor-daemon
--> ev-reth RPC") that are missing language identifiers and trigger MD040;
update each such triple-backtick fence (including the block at lines referenced
and the other block around lines 19-23) to include an appropriate language tag
like `text` or `bash` (e.g., ```text) so markdown lint passes and the blocks
render correctly.
---
Nitpick comments:
In `@apps/loadgen/cmd/flags_test.go`:
- Around line 10-12: Remove the redundant registration of the "start"
subcommand: NewRootCmd() already adds newStartCmd(), so drop the creation of
root and the extra root.AddCommand(startCmd) call from the test; keep the local
startCmd used for assertions (created via newStartCmd()) and remove references
to NewRootCmd(), root, and root.AddCommand to avoid registering a duplicate
start command.
In `@apps/loadgen/cmd/start.go`:
- Around line 56-129: The function runScheduler uses plain log.Printf; replace
it with a leveled/structured logger and use appropriate levels for events (info
for normal lifecycle messages, error for failures, warn for cleanup warnings).
Instantiate or accept a structured logger (e.g., logger variable or parameter)
in runScheduler, then change all log.Printf calls inside runScheduler and the
nested runWorkload closure to logger.Infof/logger.Warnf/logger.Errorf (or
equivalent) and include structured fields like "label", "txCount", "interval",
"regularMatrix", "burstMatrix", and "spamoor" to make messages machine-readable;
also propagate or pass the logger into any calls that need it (or include it in
context) for internal.WaitForSync, internal.ExecuteMatrixWithOverridesFromFile,
and internal.DeleteAllSpammers so their failures are logged with error level and
contextual fields.
- Around line 79-119: The tick handlers currently spawn goroutines
unconditionally which can pile up because runWorkload serializes on mu; add a
short-circuit "in-flight" guard (e.g., an int32 atomic flag or a token channel)
to skip launching when a workload is already running: declare inFlight int32 (or
a chan struct{} semaphore) near the function, change the ticker and burstTimer
cases to do an atomic.CompareAndSwapInt32(&inFlight, 0, 1) (or try to
receive/send on the token channel) and only call go runWorkload(...) when the
CAS succeeds, and ensure runWorkload clears the flag on exit (defer
atomic.StoreInt32(&inFlight, 0) or release the token) so subsequent ticks can
run; keep existing mu locking inside runWorkload and preserve cleanup on
ctx.Done(). Use the same mechanism for both "regular" and "burst" launches to
coalesce/skips overlapping runs.
In `@apps/loadgen/docker-compose.yml`:
- Line 5: Replace the floating image reference "image:
ethpandaops/spamoor:latest" with an immutable image reference (either a specific
version tag like "ethpandaops/spamoor:vX.Y.Z" or a content-addressable digest
"ethpandaops/spamoor@sha256:...") so the loadgen service uses a reproducible
artifact; update the compose service's image field where "image:
ethpandaops/spamoor:latest" appears and ensure any CI/docs that trigger builds
are updated to pin or update the tag/digest when intentional.
In `@apps/loadgen/internal/client.go`:
- Around line 67-81: GetClients currently returns raw errors from the HTTP GET
and JSON decode; update spamoorAPIClient.GetClients to wrap those errors with
contextual messages (e.g., using fmt.Errorf) so callers see what operation
failed. Specifically, wrap the error returned by c.client.Get(url) with context
like "GET /api/clients failed" and wrap the error from
json.NewDecoder(resp.Body).Decode(&raw) with context like "decode GetClients
response failed"; keep existing behavior for non-2xx handling and closing
resp.Body, but replace bare err returns with fmt.Errorf including the original
err.
In `@apps/loadgen/internal/matrix.go`:
- Around line 104-114: The envInt helper currently swallows parse errors; change
envInt(key) to return (int, error) instead of silently falling back so callers
can fail fast for required inputs like BENCH_NUM_SPAMMERS and
BENCH_COUNT_PER_SPAMMER; specifically, update envInt to: 1) return fallback
value only when the key is absent (or keep current semantics if fallback is
intended) but return a non-nil error when strconv.Atoi fails, 2) update all
callers of envInt (e.g., where BENCH_NUM_SPAMMERS and BENCH_COUNT_PER_SPAMMER
are read) to handle the error and surface it (return error up the stack or log
and exit), and 3) ensure error messages include the env key and the invalid
value to aid debugging.
- Around line 79-102: The validate method on Entry should also parse and
validate the Timeout string early: when e.Timeout is non-empty call
time.ParseDuration(e.Timeout) (add the "time" import) and return a clear
fmt.Errorf if parsing fails (e.g., "invalid timeout %q: %v"), so invalid
durations like "2x" fail during Entry.validate rather than later in the runner.
In `@apps/loadgen/internal/runner.go`:
- Around line 337-353: In waitForSync's polling loop, when getSpamoorHeight
returns an error the code currently leaves lastHeight unchanged which can
inflate the next delta; update the error branch in waitForSync to reset
lastHeight to 0 (or another sentinel) after logging the error so the next
successful getSpamoorHeight call establishes a fresh baseline before computing
delta against syncThreshold using pollInterval; modify the block that handles
err from getSpamoorHeight to set lastHeight = 0.
In `@apps/loadgen/main.go`:
- Around line 17-21: The init() currently configures logging and also prints a
startup banner ("ev-loadgen starting"), which runs for every entrypoint; remove
the fmt.Fprintln(os.Stderr, "ev-loadgen starting") from init() so init() only
does log.SetFlags and log.SetOutput, and relocate the banner to main() (or a
Cobra PersistentPreRun if using Cobra) so the message is emitted only when the
workload actually starts; update references in main() or the command's
PersistentPreRun to print the same startup message to os.Stderr.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 864cc402-06e5-49ca-83c4-83444b6132aa
⛔ Files ignored due to path filters (5)
apps/evm/go.sumis excluded by!**/*.sumapps/grpc/go.sumis excluded by!**/*.sumapps/loadgen/go.sumis excluded by!**/*.sumapps/testapp/go.sumis excluded by!**/*.sumgo.sumis excluded by!**/*.sum
📒 Files selected for processing (29)
.github/workflows/ci.yml.gitignore.just/build.just.just/loadgen.justapps/evm/go.modapps/grpc/go.modapps/loadgen/Dockerfileapps/loadgen/README.mdapps/loadgen/cmd/check.goapps/loadgen/cmd/flags_test.goapps/loadgen/cmd/root.goapps/loadgen/cmd/run.goapps/loadgen/cmd/start.goapps/loadgen/cmd/start_test.goapps/loadgen/docker-compose.ymlapps/loadgen/go.modapps/loadgen/internal/client.goapps/loadgen/internal/config.goapps/loadgen/internal/config_test.goapps/loadgen/internal/matrix.goapps/loadgen/internal/matrix_test.goapps/loadgen/internal/runner.goapps/loadgen/internal/runner_test.goapps/loadgen/main.goapps/loadgen/matrices/baseline.jsonapps/loadgen/matrices/burst.jsonapps/testapp/go.modgo.modjustfile
| github.com/cockroachdb/pebble v1.1.5 // indirect | ||
| github.com/cockroachdb/redact v1.1.5 // indirect | ||
| github.com/cockroachdb/tokenbucket v0.0.0-20230807174530-cc333fc44b06 // indirect | ||
| github.com/cometbft/cometbft v0.38.17 // indirect |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify why vulnerable modules are present and whether upgrades are available.
# 1) Confirm dependency path
go mod why -m github.com/cometbft/cometbft
go mod why -m github.com/cosmos/cosmos-sdk
# 2) List available updates impacting these modules
go list -m -u all | rg 'github.com/(celestiaorg/tastora|cometbft/cometbft|cosmos/cosmos-sdk)'
# 3) Re-run vuln scan after update in CI/local pipeline
# (command depends on tooling in your environment)
# govulncheck ./...Repository: evstack/ev-node
Length of output: 2478
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Snippet from apps/loadgen/go.mod =="
nl -ba apps/loadgen/go.mod | sed -n '35,60p'
echo
echo "== go env (quick context) =="
(cd apps/loadgen && go env GOMOD GOPATH)
echo
echo "== Dependency paths within apps/loadgen =="
(cd apps/loadgen && go mod why -m github.com/cometbft/cometbft || true)
(cd apps/loadgen && go mod why -m github.com/cosmos/cosmos-sdk || true)
echo
echo "== Available updates impacting these modules (within apps/loadgen) =="
(cd apps/loadgen && go list -m -u all | rg 'github.com/(celestiaorg/tastora|cometbft/cometbft|cosmos/cosmos-sdk)' || true)
echo
echo "== govulncheck (if available) =="
if command -v govulncheck >/dev/null 2>&1; then
(cd apps/loadgen && govulncheck ./... || true)
else
echo "govulncheck not installed in this environment."
fiRepository: evstack/ev-node
Length of output: 159
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Snippet from apps/loadgen/go.mod (lines 35-60) =="
awk 'NR>=35 && NR<=60 {printf "%d\t%s\n", NR, $0}' apps/loadgen/go.mod
echo
echo "== go env (apps/loadgen context) =="
(cd apps/loadgen && go env GOMOD GOPATH)
echo
echo "== Dependency paths within apps/loadgen =="
(cd apps/loadgen && go mod why -m github.com/cometbft/cometbft || true)
(cd apps/loadgen && go mod why -m github.com/cosmos/cosmos-sdk || true)
echo
echo "== Available updates impacting these modules (within apps/loadgen) =="
if command -v rg >/dev/null 2>&1; then
(cd apps/loadgen && go list -m -u all | rg 'github.com/(celestiaorg/tastora|cometbft/cometbft|cosmos/cosmos-sdk)' || true)
else
(cd apps/loadgen && go list -m -u all | grep -E 'github.com/(celestiaorg/tastora|cometbft/cometbft|cosmos/cosmos-sdk)' || true)
fi
echo
echo "== govulncheck (if available) =="
if command -v govulncheck >/dev/null 2>&1; then
(cd apps/loadgen && govulncheck ./... || true)
else
echo "govulncheck not installed in this environment."
fiRepository: evstack/ev-node
Length of output: 5228
Upgrade vulnerable transitive deps in apps/loadgen/go.mod before release
apps/loadgen/go.mod lists github.com/cometbft/cometbft v0.38.17 (line 42) and github.com/cosmos/cosmos-sdk v0.50.12 (line 49) as indirect dependencies, pulled in via github.com/celestiaorg/tastora/framework/... (see go mod why paths). go list -m -u all shows available upgrades: github.com/celestiaorg/tastora v0.20.0, github.com/cometbft/cometbft v1.0.1, and github.com/cosmos/cosmos-sdk v0.54.3. Upgrade the chain and re-run the vulnerability scan in CI (e.g., govulncheck ./...).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@apps/loadgen/go.mod` at line 42, apps/loadgen/go.mod pulls vulnerable
transitive modules (github.com/cometbft/cometbft and
github.com/cosmos/cosmos-sdk) via github.com/celestiaorg/tastora; update the
transitive chain by bumping tastora to v0.20.0 (which brings cometbft to v1.0.1
and cosmos-sdk to v0.54.3), either by running `go get
github.com/celestiaorg/tastora@v0.20.0` or editing apps/loadgen/go.mod to
require the new versions, then run `go mod tidy` and `go list -m all` to verify
the upgraded versions and re-run the vulnerability scan (e.g., `govulncheck
./...`) to confirm the fixes.
- run container as non-root user - use ParseUint for numeric env values in scenario config - add language identifiers to fenced code blocks (MD040) - wrap bare errors with context in client.go - remove redundant root cmd in flags_test.go - validate timeout duration during matrix validation - remove startup banner from init() Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
apps/loadgen/cmd/start.go (3)
110-120:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReschedule bursts against the remaining day, not a fresh 24h window.
After a burst fires,
nextBurstTimer(burstsRemaining, burstWindow)ignores how much time is left beforeresetTimerfires. If an earlier burst happened late, the next timer is frequently pushed past the pending reset and gets canceled, soburstPerDayis not actually honored. Usetime.Until(nextReset)for the follow-up delay window instead of another fullburstWindow.Also applies to: 141-156
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/cmd/start.go` around lines 110 - 120, The burst scheduling currently calls nextBurstTimer(burstsRemaining, burstWindow) after each burst which resets the follow-up delay to a full burstWindow and ignores the remaining time until resetTimer fires; update the logic in the run loop where burstTimer and burstsRemaining are updated (references: burstTimer, resetTimer, burstsRemaining, nextBurstTimer, burstWindow, resetTimer.Reset) to compute the remaining window until the next reset (e.g., time.Until(nextReset)) and pass that reduced window into nextBurstTimer so subsequent bursts are scheduled only within the remaining day; apply the same change to the other similar block handling bursts (the later occurrence around where burstsRemaining is decremented and burstTimer reset) so both places use the time until the pending reset rather than a fresh burstWindow.
66-67:⚠️ Potential issue | 🟠 Major | ⚡ Quick winZero
txCounthere means "use matrix defaults", not "run nothing".
validateStartConfigallows zero values, andregularTxPerRuntruncates fractional daily budgets down to zero. Downstream,apps/loadgen/internal/runner.goonly applies theBENCH_COUNT_PER_SPAMMERoverride whentotalTxTarget > 0, so--tx-per-day=0,--burst-tx-count=0, or--tx-per-day=1 --interval=1hwill still execute the matrix at its baked-in counts. Skip launches whentxCount == 0, and either reject or carry fractional budget forward instead of truncating it away. As per coding guidelines, "Validate all inputs from external sources in Go code".Also applies to: 77-80, 84-85, 110-111, 125-137
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/cmd/start.go` around lines 66 - 67, The calculation in start.go (runsPerDay and regularTxPerRun) currently truncates fractional budgets to zero and treats zero as "run nothing"; instead preserve fractional budget or reject ambiguous zeroes and treat zero as the sentinel meaning "use matrix defaults". Update the logic around regularTxPerRun (the variables runsPerDay, regularTxPerRun) and validateStartConfig so that: 1) inputs are validated (no negatives; reject or require explicit flags if the caller intended zero vs defaults), 2) per-run budget does not silently truncate fractional daily budgets to zero — either carry the fractional remainder forward using a float accumulator on the start config (e.g., txBudgetFloat / runsPerDay) and distribute it across runs, or use rounding (math.Ceil) when appropriate so small budgets aren’t lost, and 3) explicitly treat txCount==0 as the sentinel to "use matrix defaults" (so do not skip applying matrix defaults) and ensure runner.go’s BENCH_COUNT_PER_SPAMMER behavior is preserved by checking this sentinel instead of relying on totalTxTarget > 0.
77-85:⚠️ Potential issue | 🟠 Major | ⚡ Quick winWait for workload goroutines to finish before calling
internal.DeleteAllSpammersduring shutdown.In
apps/loadgen/cmd/start.go,runWorkloadis launched as detached goroutines (immediately and on ticker/burst), but whenctx.Done()fires the scheduler immediately callsinternal.DeleteAllSpammers(api)and returns without waiting. BecauseSpamoorClientmethods don’t takecontext.Context, in-flight matrix execution (and per-entry deferredapi.DeleteSpammer) can still be running while global cleanup deletes the same spammers, leading to shutdown races, cancellation/timed-out errors, and spurious cleanup warnings. Track launched goroutines with async.WaitGrouporerrgroupand join them beforeDeleteAllSpammers.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/cmd/start.go` around lines 77 - 85, The shutdown races occur because runWorkload goroutines (launched from start.go, including the immediate goroutine and those started by the ticker/burst logic) are not awaited before calling internal.DeleteAllSpammers(api); update the scheduler to track all launched workers with a sync.WaitGroup or an errgroup tied to ctx, incrementing the counter inside the code that invokes runWorkload and every goroutine started from the ticker/burst paths, then on ctx.Done() wait for the group to finish (wg.Wait() or group.Wait()) before calling internal.DeleteAllSpammers(api) so in-flight ExecuteMatrixWithOverridesFromFile and per-entry deferred api.DeleteSpammer calls complete first.
♻️ Duplicate comments (1)
apps/loadgen/internal/config.go (1)
39-48:⚠️ Potential issue | 🟠 MajorReject invalid numeric
BENCH_*overrides instead of serializing strings.The
elsebranch still forwards bad external input as the wrong JSON type. With the current mapping, values likeBENCH_REBROADCAST=trueorBENCH_THROUGHPUT=-1become strings here and the failure is deferred to the spamoor API call instead of being caught at config build time. Please fail fast on parse errors rather than populatingcfg[cfgKey] = val. As per coding guidelines, "Validate all inputs from external sources in Go code".What JSON types does spamoor expect for the scenario config fields total_count, throughput, max_pending, max_wallets, gas_units_to_burn, base_fee, tip_fee, and rebroadcast?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/loadgen/internal/config.go` around lines 39 - 48, The code currently converts env values to uint64 with strconv.ParseUint and on parse failure writes the raw string into cfg (envMapping/cfg), which allows invalid types to pass through; instead, fail fast: when strconv.ParseUint returns an error for an envKey mapped to a numeric cfgKey, return (or propagate) an error describing the invalid override (include envKey and val) rather than assigning cfg[cfgKey]=val; for boolean-like keys such as rebroadcast, use strconv.ParseBool and likewise return an error on failure; update the block referencing envMapping, envKey, cfgKey, strconv.ParseUint and cfg to perform strict parsing and error propagation so only correctly typed numeric/boolean values populate cfg.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@apps/loadgen/cmd/start.go`:
- Around line 110-120: The burst scheduling currently calls
nextBurstTimer(burstsRemaining, burstWindow) after each burst which resets the
follow-up delay to a full burstWindow and ignores the remaining time until
resetTimer fires; update the logic in the run loop where burstTimer and
burstsRemaining are updated (references: burstTimer, resetTimer,
burstsRemaining, nextBurstTimer, burstWindow, resetTimer.Reset) to compute the
remaining window until the next reset (e.g., time.Until(nextReset)) and pass
that reduced window into nextBurstTimer so subsequent bursts are scheduled only
within the remaining day; apply the same change to the other similar block
handling bursts (the later occurrence around where burstsRemaining is
decremented and burstTimer reset) so both places use the time until the pending
reset rather than a fresh burstWindow.
- Around line 66-67: The calculation in start.go (runsPerDay and
regularTxPerRun) currently truncates fractional budgets to zero and treats zero
as "run nothing"; instead preserve fractional budget or reject ambiguous zeroes
and treat zero as the sentinel meaning "use matrix defaults". Update the logic
around regularTxPerRun (the variables runsPerDay, regularTxPerRun) and
validateStartConfig so that: 1) inputs are validated (no negatives; reject or
require explicit flags if the caller intended zero vs defaults), 2) per-run
budget does not silently truncate fractional daily budgets to zero — either
carry the fractional remainder forward using a float accumulator on the start
config (e.g., txBudgetFloat / runsPerDay) and distribute it across runs, or use
rounding (math.Ceil) when appropriate so small budgets aren’t lost, and 3)
explicitly treat txCount==0 as the sentinel to "use matrix defaults" (so do not
skip applying matrix defaults) and ensure runner.go’s BENCH_COUNT_PER_SPAMMER
behavior is preserved by checking this sentinel instead of relying on
totalTxTarget > 0.
- Around line 77-85: The shutdown races occur because runWorkload goroutines
(launched from start.go, including the immediate goroutine and those started by
the ticker/burst logic) are not awaited before calling
internal.DeleteAllSpammers(api); update the scheduler to track all launched
workers with a sync.WaitGroup or an errgroup tied to ctx, incrementing the
counter inside the code that invokes runWorkload and every goroutine started
from the ticker/burst paths, then on ctx.Done() wait for the group to finish
(wg.Wait() or group.Wait()) before calling internal.DeleteAllSpammers(api) so
in-flight ExecuteMatrixWithOverridesFromFile and per-entry deferred
api.DeleteSpammer calls complete first.
---
Duplicate comments:
In `@apps/loadgen/internal/config.go`:
- Around line 39-48: The code currently converts env values to uint64 with
strconv.ParseUint and on parse failure writes the raw string into cfg
(envMapping/cfg), which allows invalid types to pass through; instead, fail
fast: when strconv.ParseUint returns an error for an envKey mapped to a numeric
cfgKey, return (or propagate) an error describing the invalid override (include
envKey and val) rather than assigning cfg[cfgKey]=val; for boolean-like keys
such as rebroadcast, use strconv.ParseBool and likewise return an error on
failure; update the block referencing envMapping, envKey, cfgKey,
strconv.ParseUint and cfg to perform strict parsing and error propagation so
only correctly typed numeric/boolean values populate cfg.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fb41e19f-036f-4e69-a32b-b815da44cd89
📒 Files selected for processing (10)
apps/loadgen/Dockerfileapps/loadgen/README.mdapps/loadgen/cmd/flags_test.goapps/loadgen/cmd/start.goapps/loadgen/internal/client.goapps/loadgen/internal/config.goapps/loadgen/internal/config_test.goapps/loadgen/internal/matrix.goapps/loadgen/internal/runner.goapps/loadgen/main.go
💤 Files with no reviewable changes (2)
- apps/loadgen/cmd/flags_test.go
- apps/loadgen/main.go
✅ Files skipped from review due to trivial changes (1)
- apps/loadgen/README.md
🚧 Files skipped from review as they are similar to previous changes (5)
- apps/loadgen/Dockerfile
- apps/loadgen/internal/matrix.go
- apps/loadgen/internal/config_test.go
- apps/loadgen/internal/client.go
- apps/loadgen/internal/runner.go
Overview
Adding a load generating binary that uses spamoor to spam transactions both sustained over a period of time, as well as bursts.
Summary by CodeRabbit
New Features
Documentation
Tests
Chores