
Conversation

@julienrbrt
Member

Closes: #2890

Add submission strategies.

@github-actions
Contributor

github-actions bot commented Jan 13, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build      Format      Lint       Breaking    Updated (UTC)
✅ passed  ⏩ skipped  ✅ passed  ✅ passed   Jan 15, 2026, 10:32 PM

@claude
Contributor

claude bot commented Jan 13, 2026

claude bot commented Jan 13, 2026 — Claude finished @julienrbrt's task in 3m 5s


Comprehensive Code Review Complete ✅

I've completed a thorough review of the DA submission batching strategies implementation. This is an excellent enhancement that adds much-needed flexibility for optimizing DA submission costs and latency. The code quality is high overall, with good test coverage and clear separation of concerns. Below are my detailed findings organized by priority and category.


Strengths

  1. Clean Architecture: The batching strategy interface is well-designed with clear separation between strategy logic and submission mechanics
  2. Excellent Test Coverage: Comprehensive unit tests for all four strategies with edge cases covered (81.89% patch coverage)
  3. Marshalling Optimization: The recent fix to cache marshalled bytes in pending_base.go elegantly solves the double-marshalling concern raised in earlier reviews
  4. Configuration Flexibility: Well-documented configuration options with sensible defaults
  5. Proper Concurrency Control: Good use of mutexes and atomic operations to prevent race conditions
  6. Error Handling: Critical errors (like ErrOversizedItem) are properly detected and halt the node to prevent live lock

🔴 Critical Issues

1. Potential Division by Zero in Batching Strategies

Location: batching_strategy.go:65, batching_strategy.go:129

threshold := int(float64(maxBlobSize) * s.sizeThreshold)

If maxBlobSize is 0, this will result in threshold = 0, which is safe but may not be the intended behavior. More critically, if this value comes from misconfiguration, it could lead to unexpected submission behavior. Consider adding validation:

if maxBlobSize <= 0 {
    return false // or return an error/log warning
}
threshold := int(float64(maxBlobSize) * s.sizeThreshold)

Impact: Medium - Could cause unexpected behavior with misconfigured max blob sizes
Recommendation: Add validation in ShouldSubmit methods or during strategy construction
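
For illustration, a guard in the size-based ShouldSubmit could look like this (a sketch only; the exact method signature and fields in batching_strategy.go may differ):

func (s *SizeBasedStrategy) ShouldSubmit(itemCount, currentSize int, maxBlobSize uint64, sinceLast time.Duration) bool {
    if maxBlobSize == 0 {
        // Misconfigured DA limit: decline size-triggered submission rather
        // than comparing against a zero threshold.
        return false
    }
    threshold := int(float64(maxBlobSize) * s.sizeThreshold)
    return currentSize >= threshold
}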


2. Max Blob Size Mismatch Risk

Location: block/internal/common/consts.go:3

The DefaultMaxBlobSize is set to 7MB, but the PR description mentions increasing it from 2MB to 8MB, and the review comments discuss Celestia's 8MB limit with padding considerations. There's a discrepancy:

const DefaultMaxBlobSize = 7 * 1024 * 1024 // 7MB fallback blob size limit

Concerns:

  • The comment says this accounts for padding, but it's unclear if 7MB is sufficient margin
  • The submitter uses this constant in submitter.go:202 for strategy decisions
  • If padding overhead is more than 1MB (12.5%), submissions could still fail

Recommendation:

  1. Document exactly how much overhead padding adds (consult Celestia docs)
  2. Either adjust the constant or add a configuration option for padding margin
  3. Consider making this configurable per DA layer since different DAs have different overhead (see the sketch below)
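
As a sketch of recommendations 2 and 3, the margin could be made explicit and configurable (all names below are illustrative, not from the PR):

// DefaultPaddingMargin is an assumed knob: the fraction of the DA layer's
// raw blob limit reserved for share padding and envelope overhead.
const DefaultPaddingMargin = 0.125 // 1MB of an 8MB Celestia blob

// effectiveMaxBlobSize derives the usable limit from the raw DA limit.
func effectiveMaxBlobSize(rawLimit uint64, paddingMargin float64) uint64 {
    return uint64(float64(rawLimit) * (1 - paddingMargin))
}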

🟡 High Priority Issues

3. Concurrent Goroutine Leak Risk in Submission Loop

Location: submitter.go:183-228, submitter.go:237-283

The submission loop spawns goroutines using TryLock() and go func(), but there's a potential issue:

if s.headerSubmissionMtx.TryLock() {
    go func() {
        defer s.headerSubmissionMtx.Unlock()
        // ... submission logic
    }()
}

Concerns:

  1. If the context is cancelled while goroutines are running, they may not terminate cleanly
  2. The wg.Wait() in Stop() only waits for the main loops, not these spawned goroutines
  3. During shutdown, pending goroutines could continue running after the context is cancelled

Recommendation:

if s.headerSubmissionMtx.TryLock() {
    s.wg.Add(1)  // Track the goroutine
    go func() {
        defer s.wg.Done()
        defer s.headerSubmissionMtx.Unlock()
        
        // Check context before proceeding
        select {
        case <-s.ctx.Done():
            return
        default:
        }
        
        // ... submission logic
    }()
}

4. Marshalled Cache Memory Growth

Location: block/internal/cache/pending_base.go:27-28

The marshalled cache uses sync.Map without size limits:

// Marshalling cache to avoid redundant marshalling
marshalledCache sync.Map // key: uint64 (height), value: []byte

Concerns:

  1. If DA submissions are delayed (network issues, DA downtime), the cache grows unbounded
  2. Each marshalled header/data is ~several KB, so thousands of pending items could consume significant memory
  3. The cache is only cleared in setLastSubmittedHeight after successful submission

Impact: Medium - Could cause memory pressure during DA outages
Recommendation:

  1. Add a maximum cache size or TTL
  2. Consider implementing an LRU eviction policy
  3. Monitor cache size with metrics (a capped-cache sketch follows below)
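
A size-capped sketch of recommendation 1 (the cap and the evict-lowest-height policy are assumptions; the PR's cache is a plain sync.Map):

import "sync"

// boundedMarshalCache is a sketch of a size-capped replacement for the
// sync.Map; maxEntries and the eviction policy are illustrative only.
type boundedMarshalCache struct {
    mu         sync.Mutex
    entries    map[uint64][]byte // height -> marshalled bytes
    maxEntries int
}

func (c *boundedMarshalCache) put(height uint64, b []byte) {
    c.mu.Lock()
    defer c.mu.Unlock()
    if len(c.entries) >= c.maxEntries {
        // Evict the lowest height; evicted items can be re-marshalled on
        // demand, so this only trades CPU for bounded memory.
        lowest := ^uint64(0)
        for h := range c.entries {
            if h < lowest {
                lowest = h
            }
        }
        delete(c.entries, lowest)
    }
    c.entries[height] = b
}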

5. Error Handling in Size Estimation

Location: submitter.go:194-197, submitter.go:249-252

While the marshalling is now cached, the code retrieving from cache doesn't handle the case where marshalling failed during caching:

// Calculate total size (excluding signature)
totalSize := 0
for _, marshalled := range marshalledHeaders {
    totalSize += len(marshalled)
}

If GetPendingHeaders encounters a marshalling error (line pending_headers.go:68-70), it returns an error, but the error handling in the submitter only logs it:

headers, marshalledHeaders, err := s.cache.GetPendingHeaders(s.ctx)
if err != nil {
    s.logger.Error().Err(err).Msg("failed to get pending headers for batching decision")
    return
}

Recommendation: This is actually handled correctly - returning early if marshalling fails. No changes needed, but consider adding a metric to track marshalling failures.


🟠 Medium Priority Issues

6. Race Condition in Time Tracking

Location: submitter.go:177-178, submitter.go:234-235

lastSubmitNanos := s.lastHeaderSubmit.Load()
timeSinceLastSubmit := time.Since(time.Unix(0, lastSubmitNanos))

There's a tiny window where lastHeaderSubmit could be updated between Load() and time.Since(), leading to slightly inaccurate time calculations.

Impact: Very Low - Worst case is a batch submits slightly earlier/later than optimal
Recommendation: Accept this as-is, or snapshot the time at the start of the ticker loop
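
If the snapshot approach is chosen, it is a two-line change (sketch):

// Snapshot once per tick so the elapsed-time check and any subsequent
// Store calls agree on the same instant.
now := time.Now()
timeSinceLastSubmit := now.Sub(time.Unix(0, s.lastHeaderSubmit.Load()))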


7. Configuration Validation Missing

Location: batching_strategy.go:18-32

The NewBatchingStrategy function validates strategy names but doesn't validate the combination of parameters:

case "immediate":
    return &ImmediateStrategy{}, nil

Issues:

  • For "immediate" strategy, BatchSizeThreshold, BatchMaxDelay, and BatchMinItems are ignored silently
  • No warning to users that they configured unused parameters
  • Could lead to confusion when users expect BatchMinItems=10 to work with "immediate" strategy

Recommendation:

case "immediate":
    if cfg.BatchSizeThreshold != 0.8 || cfg.BatchMaxDelay.Duration != 0 || cfg.BatchMinItems != 1 {
        // Log warning that these params are ignored for immediate strategy
    }
    return &ImmediateStrategy{}, nil

8. Default BatchMaxDelay of 0 Could Be Confusing

Location: pkg/config/defaults.go:83

BatchMaxDelay: DurationWrapper{0}, // 0 means use DA BlockTime

The "0 means use DA BlockTime" convention is only documented in the comment and in the strategy constructors. Users might expect 0 to mean "no delay" instead.

Recommendation:

  1. Make the default explicit: BatchMaxDelay: DurationWrapper{6 * time.Second}
  2. Or add validation in config loading to set it to BlockTime if 0
  3. Document this behavior prominently in configuration docs (option 2 is sketched below)
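
Option 2 amounts to a small normalization step at config-load time (sketch; the exact load hook and field paths are assumed):

// Resolve the "0 means DA BlockTime" convention once, at load time, so the
// strategies never see a zero delay.
if cfg.DA.BatchMaxDelay.Duration == 0 {
    cfg.DA.BatchMaxDelay.Duration = cfg.DA.BlockTime.Duration
}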

🟢 Low Priority / Suggestions

9. Test Coverage for Edge Cases

The test coverage is good (81.89%), but a few additional test cases would be valuable:

  1. Concurrent submission attempts: Test behavior when multiple goroutines try to submit simultaneously (though the mutex should prevent this)
  2. Context cancellation during submission: Verify clean shutdown when context is cancelled mid-submission
  3. Large batch splitting: Test the limitBatchBySize function with various edge cases
  4. Cache invalidation: Test that marshalled cache is properly cleared after submissions

10. Code Documentation

Some complex logic could benefit from additional comments:

Location: submitter.go:164

checkInterval := max(s.config.DA.BlockTime.Duration/4, 100*time.Millisecond)

Recommendation: Explain why BlockTime/4 was chosen and why 100ms is the minimum
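
For example, a comment along these lines (the stated rationale is this review's guess and should be confirmed by the author):

// Poll the batching strategy a few times per DA block so size/time triggers
// fire promptly, but floor the interval at 100ms to avoid busy-looping when
// the DA block time is very short.
checkInterval := max(s.config.DA.BlockTime.Duration/4, 100*time.Millisecond)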


11. Metrics Enhancements

Consider adding these metrics for better observability:

  1. da_submission_batch_size - histogram of actual batch sizes submitted
  2. da_submission_strategy_decision_latency - time spent in ShouldSubmit
  3. da_marshalled_cache_size - current size of marshalled cache
  4. da_submission_trigger_reason - counter per strategy (size/time/immediate); a declaration sketch follows below
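
To illustrate, metrics 1 and 4 could be declared with prometheus/client_golang roughly as follows (the project's actual metrics plumbing may differ):

import "github.com/prometheus/client_golang/prometheus"

var (
    daSubmissionBatchSize = prometheus.NewHistogram(prometheus.HistogramOpts{
        Name:    "da_submission_batch_size",
        Help:    "Number of items per submitted DA batch.",
        Buckets: prometheus.ExponentialBuckets(1, 2, 12),
    })
    daSubmissionTriggerReason = prometheus.NewCounterVec(prometheus.CounterOpts{
        Name: "da_submission_trigger_reason",
        Help: "DA submissions counted by strategy trigger.",
    }, []string{"reason"}) // labels: "size", "time", "immediate"
)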

12. Performance: Unnecessary Time Operations

Location: submitter.go:224, submitter.go:279

s.lastHeaderSubmit.Store(time.Now().UnixNano())

time.Now() is called twice: once to store and potentially once in the next iteration to calculate duration. Minor optimization would be to reuse the time value.


📋 Code Quality & Style

Positive Observations:

  • ✅ Consistent error wrapping with context
  • ✅ Proper use of structured logging with zerolog
  • ✅ Good separation of concerns (strategy, submitter, cache)
  • ✅ Idiomatic Go code throughout
  • ✅ Table-driven tests with clear naming

Minor Style Suggestions:

  1. Consider extracting magic numbers to constants (e.g., checkInterval calculation)
  2. Some functions are getting long (daSubmissionLoop ~130 lines) - consider extracting submission logic to helper methods

🏗️ Architecture & Design

The overall architecture is solid:

  • Interface-based design allows easy addition of new strategies
  • Dependency injection makes testing straightforward
  • Configuration-driven behavior enables runtime tuning
  • Separation of concerns between strategy decision and submission mechanics

Suggestion for Future Enhancement: Consider making strategies pluggable at runtime without restart, possibly via a strategy registry pattern.


🔐 Security Considerations

No significant security issues identified. The code properly:

  • Uses context for cancellation
  • Validates signatures before submission
  • Checks proposer address matches genesis
  • Handles authentication tokens securely (marked with #nosec where appropriate)

📊 Performance Considerations

The marshalling cache optimization is excellent and addresses the main performance concern. Additional considerations:

  1. Memory: Monitor marshalled cache size during DA outages
  2. CPU: Marshalling happens once per item, which is optimal
  3. I/O: Store operations are minimal and well-batched
  4. Goroutines: Properly bounded with mutexes

Summary & Recommendation

This is a high-quality PR that significantly enhances the DA submission system. The batching strategies are well-designed, properly tested, and will provide meaningful cost savings and flexibility.

Recommended Actions Before Merge:

  1. 🔴 Address the goroutine lifecycle tracking (Issue #3)
  2. 🟡 Clarify/document the max blob size and padding calculations (Issue #2)
  3. 🟡 Add validation for division by zero in strategies (Issue #1)
  4. 🟠 Consider adding marshalled cache size limits (Issue #4)

Nice-to-Have (can be addressed in follow-up PRs):

  • Enhanced metrics for observability
  • Additional test coverage for concurrent scenarios
  • Configuration validation warnings for unused parameters

📝 Existing Review Comments Status

Reviewing the previous comments:

Gemini's double-fetching concern - Resolved by caching marshalled bytes
Gemini's marshalling error handling - Actually handled correctly by returning early
Division by zero warning - Still relevant (see Issue #1)
Marshalling CPU waste - Resolved with cache optimization
Padding/max blob size - Partially addressed but needs clarification (see Issue #2)


Overall Assessment: Approve with minor recommendations

The implementation is production-ready with the understanding that the critical issues listed above should be addressed soon (either before merge or in immediate follow-up). The code demonstrates strong engineering practices and will provide real value to users optimizing their DA costs.

@gemini-code-assist
Contributor

Summary of Changes

Hello @julienrbrt, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly upgrades the Data Availability (DA) layer submission process by introducing a robust and configurable batching system. Instead of immediate submission, the system can now intelligently decide when to send data based on various strategies, such as accumulating a certain amount of data, waiting for a specific time interval, or an adaptive combination of both. This enhancement aims to improve the efficiency and cost-effectiveness of data submissions, particularly for DA layers like Celestia, by better utilizing blob capacity and reducing transaction overhead. The maximum blob size has also been updated to reflect current DA layer capabilities.

Highlights

  • Flexible DA Submission Strategies: Introduced an interface (sketched after this list) and four concrete implementations for batching strategies: Immediate, Size-Based, Time-Based, and Adaptive. This allows for configurable logic to determine when to submit data to the Data Availability layer, optimizing for latency, throughput, or cost.
  • Increased Default Max Blob Size: The DefaultMaxBlobSize has been increased from 2MB to 8MB across the codebase, aligning with Celestia's current blob size limits. This change allows for larger batches of data to be submitted in a single transaction.
  • Configurable Batching Parameters: New configuration flags and fields have been added to DAConfig to control the chosen batching strategy, size thresholds, maximum delays, and minimum item counts, providing fine-grained control over DA submission behavior.
  • Integrated into Submitter Loop: The DA submission loop in submitter.go has been refactored to utilize the new batching strategies. It now periodically checks the strategy to decide whether to submit pending headers and data, improving efficiency by avoiding premature or undersized submissions.
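
For orientation, the strategy interface plausibly has a shape like the following (reconstructed from this description, not copied from batching_strategy.go):

// BatchingStrategy decides when pending items should be flushed to DA.
type BatchingStrategy interface {
    // ShouldSubmit reports whether the pending batch should be submitted
    // now, given its item count, total marshalled size in bytes, the DA
    // blob size limit, and the time since the last successful submission.
    ShouldSubmit(itemCount, currentSize int, maxBlobSize uint64, sinceLastSubmit time.Duration) bool
}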



Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces configurable batching strategies for DA submissions, a valuable feature for optimizing costs and latency. The implementation is well-structured, adding immediate, size, time, and adaptive strategies, along with comprehensive tests. My review focuses on improving the efficiency and robustness of the new logic in the submitter, particularly around size estimation and data fetching.

// Wait if current utilization is below minimum threshold
// Use epsilon for floating point comparison
const epsilon = 0.001
currentUtilization := float64(currentSize) / float64(maxBlobSize)

medium

There's a potential for division by zero if maxBlobSize is 0. In Go, floating-point division by zero results in +Inf or NaN rather than a panic, but this can lead to unexpected behavior in the comparison that follows. It would be safer to add a guard against this, similar to the pattern used in calculateBatchMetrics.
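
A guard along these lines would suffice (sketch):

// Treat a zero blob size limit as "cannot evaluate utilization" instead of
// letting float division produce +Inf or NaN.
if maxBlobSize == 0 {
    return false
}
currentUtilization := float64(currentSize) / float64(maxBlobSize)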

Comment on lines 198 to 201
data, err := h.MarshalBinary()
if err == nil {
    totalSize += len(data)
}

medium

Errors from h.MarshalBinary() are silently ignored. This could lead to an inaccurate totalSize, causing the batching strategy to make a suboptimal decision (e.g., delaying a submission). It's better to log these errors for visibility.

Additionally, this size estimation logic is duplicated for data submission. Consider extracting it into a shared helper function to improve maintainability.

data, err := h.MarshalBinary()
if err != nil {
    s.logger.Warn().Err(err).Msg("failed to marshal header for size estimation")
    continue
}
totalSize += len(data)
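
Since the same loop appears for both headers and data, a shared helper could look roughly like this (sketch; the generic constraint assumes both types expose MarshalBinary):

import "github.com/rs/zerolog"

// estimateTotalSize sums marshalled sizes, logging and skipping items that
// fail to marshal, so headers and data share one code path.
func estimateTotalSize[T interface{ MarshalBinary() ([]byte, error) }](items []T, logger zerolog.Logger) int {
    total := 0
    for _, item := range items {
        b, err := item.MarshalBinary()
        if err != nil {
            logger.Warn().Err(err).Msg("failed to marshal item for size estimation")
            continue
        }
        total += len(b)
    }
    return total
}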

Comment on lines 256 to 259
data, err := d.MarshalBinary()
if err == nil {
    totalSize += len(data)
}

medium

Similar to the header submission logic, errors from d.MarshalBinary() are silently ignored here. This can lead to inaccurate size estimations and suboptimal batching. It's better to log these errors for improved diagnostics and robustness.

data, err := d.MarshalBinary()
if err != nil {
    s.logger.Warn().Err(err).Msg("failed to marshal data for size estimation")
    continue
}
totalSize += len(data)

@codecov

codecov bot commented Jan 15, 2026

Codecov Report

❌ Patch coverage is 81.89300% with 44 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.22%. Comparing base (52825bf) to head (ef3f6fd).

Files with missing lines                           Patch %   Lines
block/internal/submitting/submitter.go             71.95%    18 Missing and 5 partials ⚠️
block/internal/submitting/da_submitter.go          72.22%    5 Missing and 5 partials ⚠️
block/internal/cache/pending_data.go               78.94%    2 Missing and 2 partials ⚠️
block/internal/submitting/batching_strategy.go     93.22%    2 Missing and 2 partials ⚠️
block/internal/cache/pending_headers.go            89.47%    1 Missing and 1 partial ⚠️
block/internal/cache/manager.go                    87.50%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2973      +/-   ##
==========================================
+ Coverage   58.77%   59.22%   +0.44%     
==========================================
  Files         101      102       +1     
  Lines        9685     9826     +141     
==========================================
+ Hits         5692     5819     +127     
- Misses       3381     3390       +9     
- Partials      612      617       +5     
Flag       Coverage Δ
combined   59.22% <81.89%> (+0.44%) ⬆️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@julienrbrt julienrbrt marked this pull request as ready for review January 15, 2026 12:40
Contributor

@tac0turtle tac0turtle left a comment


Left two comments. I think we should clean up the marshalling flow so we don't marshal data and then never use it; that's wasted CPU cycles. Not a large issue in our case, but we should strive to reduce CPU cycles.

return nil, fmt.Errorf("failed to sign envelope: %w", err)
}
// Create the envelope and marshal it
envelope, err := header.MarshalDAEnvelope(envelopeSignature)

We can open an issue for this, as we are marshalling the header twice in this flow: once in the cache (this PR; previously elsewhere) and again in the envelope, even though we sign over the encoded data. If we made this a function that takes the encoded header, it would remove the need to pass headers around on top of the marshalled data.
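
A hypothetical shape of that refactor, just to illustrate the signature change (the placeholder body stands in for the real protobuf envelope):

// marshalDAEnvelope takes the already-encoded header bytes (the same bytes
// that were signed over), so the header is marshalled exactly once per
// submission. The real implementation would populate the project's protobuf
// envelope instead of this placeholder concatenation.
func marshalDAEnvelope(encodedHeader, envelopeSignature []byte) ([]byte, error) {
    if len(encodedHeader) == 0 {
        return nil, errors.New("empty encoded header")
    }
    out := make([]byte, 0, len(encodedHeader)+len(envelopeSignature))
    out = append(out, encodedHeader...)
    out = append(out, envelopeSignature...)
    return out, nil
}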

Contributor

@tac0turtle tac0turtle left a comment


nice job

@tac0turtle tac0turtle enabled auto-merge January 15, 2026 17:24
@julienrbrt
Copy link
Member Author

checking e2e, will fix!



Successfully merging this pull request may close these issues.

[FEATURE] bump blob size to 8mb
