Skip to content

feat: Add graph.StorageMaintainer interface - BED-8353#94

Open
brandonshearin wants to merge 5 commits into
mainfrom
BED-8353
Open

feat: Add graph.StorageMaintainer interface - BED-8353#94
brandonshearin wants to merge 5 commits into
mainfrom
BED-8353

Conversation

@brandonshearin
Copy link
Copy Markdown
Contributor

@brandonshearin brandonshearin commented Jun 1, 2026

Description

Resolves: BED-8353

  • This change breaks out a piece of this PR DRAFT: Introduce regular database maintenance to datapipe #79
  • It defines a new optional interface that dawgs drivers can implement for optimizing storage via methods like vacuum, analyze, etc. Extensive Godoc strings added with help from Mr. Auggie
  • The OptimizeOption functional option pattern provides extensibility for future optimization parameters beyond just storage targets. This is the only difference between this interface proposal and the original implementation proposed in the draft PoC.

Type of Change

  • Chore (a change that does not modify the application functionality)
  • Bug fix (a change that fixes an issue)
  • New feature / enhancement (a change that adds new functionality)
  • Refactor (no behaviour change)
  • Test coverage
  • Build / CI / tooling
  • Documentation

Testing

  • Unit tests added / updated
  • Integration tests added / updated
  • Full test suite run (make test_all with CONNECTION_STRING set)

Screenshots (if appropriate):

Driver Impact

  • PostgreSQL driver (drivers/pg)
  • Neo4j driver (drivers/neo4j)

Checklist

  • Code is formatted
  • All existing tests pass
  • go.mod / go.sum are up to date if dependencies changed

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced storage optimization API for graph databases, enabling selective optimization of storage components.
    • PostgreSQL driver now supports the optimization interface.
  • Tests

    • Added tests validating optimization configuration behavior.

@brandonshearin brandonshearin self-assigned this Jun 1, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

Warning

Review limit reached

@brandonshearin, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 37 minutes and 7 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 59a24ebc-f599-410b-a136-820754275687

📥 Commits

Reviewing files that changed from the base of the PR and between b1c99f2 and 3488063.

📒 Files selected for processing (2)
  • drivers/pg/driver.go
  • graph/optimize.go

Walkthrough

Adds a graph storage optimization API (target enum, config, option builder), tests for option accumulation, and a no-op PostgreSQL driver Optimize entrypoint.

Changes

Storage Optimization Framework

Layer / File(s) Summary
Optimization API contract and configuration
graph/optimize.go
TargetStorage enum (Nodes, Relationships); OptimizeConfig holds Targets; OptimizeOption mutates the config; OptimizeTargets appends selections; StorageOptimizer interface declares OptimizeStorage(ctx, opts ...OptimizeOption) error.
Option configuration and accumulation tests
graph/optimize_test.go
TestOptimizeOptions verifies option application: no options -> Targets nil; OptimizeTargets() with no args -> Targets nil; single call preserves order; repeated calls accumulate targets.
PostgreSQL driver optimization entry point
drivers/pg/driver.go
Driver.Optimize(ctx, opts ...graph.OptimizeOption) error builds OptimizeConfig from options and returns nil (contains placeholder comment for future VACUUM/ANALYZE work).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A tiny hop toward tidy stacks,
Targets listed in neat little tracks,
Nodes and links in ordered rows,
Tests confirm how the option grows,
The driver waits to sweep and relax.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Title check ⚠️ Warning The PR title references 'graph.StorageMaintainer' but the actual changes implement 'StorageOptimizer' interface, creating a mismatch between the stated and actual implementation. Update the PR title to 'feat: Add graph.StorageOptimizer interface - BED-8353' to accurately reflect the implemented interface name.
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The PR description is well-structured, includes the ticket reference, explains the motivation, checks the appropriate type of change, indicates unit tests were added, marks PostgreSQL driver impact, and provides context about breaking out from another PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch BED-8353

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
graph/optimize.go (1)

7-14: ⚡ Quick win

Consider adding a String() method for debugging and logging.

The TargetStorage enum would benefit from a String() method to provide human-readable names when logging or debugging optimization operations.

🎯 Proposed enhancement
 const (
 	Nodes TargetStorage = iota
 	Relationships
 )
+
+func (t TargetStorage) String() string {
+	switch t {
+	case Nodes:
+		return "Nodes"
+	case Relationships:
+		return "Relationships"
+	default:
+		return fmt.Sprintf("TargetStorage(%d)", t)
+	}
+}

Note: This would require adding "fmt" to the imports.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@graph/optimize.go` around lines 7 - 14, Add a String() method on the
TargetStorage type to map enum values to human-readable names (e.g., return
"Nodes" for Nodes, "Relationships" for Relationships, and a fallback for unknown
values); update the graph/optimize.go imports to include "fmt" if using
fmt.Sprintf for the fallback. Implement the method with receiver (t
TargetStorage) String() string and reference the constants Nodes and
Relationships inside it so logging and debugging produce clear names.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@drivers/pg/driver.go`:
- Around line 180-188: The Optimize method currently builds graph.OptimizeConfig
but returns nil (no-op) which falsely signals success; update Driver.Optimize to
mark it as an explicit placeholder by adding a TODO comment and returning a
clear not-implemented error (e.g., errors.New("Driver.Optimize: not
implemented") or a package-specific ErrNotImplemented) instead of nil, and
ensure any capability advertising (StorageMaintainer) is removed or gated until
the VACUUM/ANALYZE implementation using graph.OptimizeConfig is completed so
callers can distinguish implemented vs placeholder behavior.

---

Nitpick comments:
In `@graph/optimize.go`:
- Around line 7-14: Add a String() method on the TargetStorage type to map enum
values to human-readable names (e.g., return "Nodes" for Nodes, "Relationships"
for Relationships, and a fallback for unknown values); update the
graph/optimize.go imports to include "fmt" if using fmt.Sprintf for the
fallback. Implement the method with receiver (t TargetStorage) String() string
and reference the constants Nodes and Relationships inside it so logging and
debugging produce clear names.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b7be0ac4-5cc2-4030-8d96-a44a76e2771d

📥 Commits

Reviewing files that changed from the base of the PR and between a637b4a and 681c8b3.

📒 Files selected for processing (3)
  • drivers/pg/driver.go
  • graph/optimize.go
  • graph/optimize_test.go

Comment thread drivers/pg/driver.go Outdated
Comment on lines +180 to +188
func (s *Driver) Optimize(ctx context.Context, opts ...graph.OptimizeOption) error {
config := &graph.OptimizeConfig{}
for _, opt := range opts {
opt(config)
}

// PostgreSQL VACUUM/ANALYZE implementation
return nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

No-op Optimize silently reports success.

The method builds config from the options but discards it and returns nil without performing any work. This is fine as an intentional placeholder, but as written it advertises the StorageMaintainer capability while doing nothing, so callers cannot distinguish "optimized" from "not yet implemented." Consider marking the placeholder as a TODO so it surfaces in tracking and isn't mistaken for a completed implementation.

📝 Suggested placeholder marker
-	// PostgreSQL VACUUM/ANALYZE implementation
-	return nil
+	// TODO(BED-8353): implement PostgreSQL VACUUM/ANALYZE using config.Targets.
+	// A nil/empty config.Targets means "optimize every region" per graph.OptimizeConfig.
+	return nil

Want me to open a follow-up issue to track the VACUUM/ANALYZE implementation?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@drivers/pg/driver.go` around lines 180 - 188, The Optimize method currently
builds graph.OptimizeConfig but returns nil (no-op) which falsely signals
success; update Driver.Optimize to mark it as an explicit placeholder by adding
a TODO comment and returning a clear not-implemented error (e.g.,
errors.New("Driver.Optimize: not implemented") or a package-specific
ErrNotImplemented) instead of nil, and ensure any capability advertising
(StorageMaintainer) is removed or gated until the VACUUM/ANALYZE implementation
using graph.OptimizeConfig is completed so callers can distinguish implemented
vs placeholder behavior.

@brandonshearin brandonshearin changed the title feature: Add graph.StorageMaintainer interface: BED-8353 feat: Add graph.StorageMaintainer interface - BED-8353 Jun 1, 2026
Comment thread graph/optimize.go
type OptimizeConfig struct {
// Targets is the set of storage regions to optimize. A nil or empty
// slice instructs the driver to optimize every region it knows about.
Targets []TargetStorage
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opinion: While convenient, an empty slice representing "all" supported storage targets may be misleading

Comment thread graph/optimize.go Outdated
Comment on lines +52 to +57
type StorageMaintainer interface {
// Optimize runs a storage maintenance pass against the regions
// identified by opts. With no options, every region the driver
// recognizes is optimized.
Optimize(ctx context.Context, opts ...OptimizeOption) error
}
Copy link
Copy Markdown
Member

@ddlees ddlees Jun 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really dig that this interface features Options as closures. I think it has to potential to be really powerful.

For example, we could potentially pass driver specific options and only apply them when applicable. We'll want to feel something like that out a bit but here's the general idea:

m.Optimize(ctx,
  graph.Targets(targets),
  pg.Partitions(partitions),
  pg.SkipReindex(),
)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context of having zero options is different than declaring zero storage targets.

On one hand, m.Optimize(ctx) optimizing all DAWGS storage makes perfect sense. If you're casting that from the graph db/driver, it says "optimize everything managed by this struct."

Where the expected behavior becomes murky is something like m.Optimize(OptimizeTargets(nil)). That to me could translate to, "of everything managed by this struct only optimize none of the targets."

We'll want to spend time considering how to expose targeting individual storage targets and all storage targets.

Copy link
Copy Markdown
Contributor Author

@brandonshearin brandonshearin Jun 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this API gracefully handles the edge case you are mentioning.

The foundational rule is: when OptimizeConfig.Targets is nil, the API should treat that as optimize all StorageTargets. The specific case you are describing is actually captured by the simple table test i made in optimize_test.go (test case #2).

Screenshot 2026-06-01 at 2 51 22 PM

This test shows that calling m.Optimize(OptimizeTargets()) will evaluate to an OptimizeConfig struct with a nil Targets property, which is what I think we want.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to reframe my last comment as a question. Does my understanding of this edge case align with your idea of a good design here?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess what I'm trying to say is that I can understand as both ways and that means we have to encourage/enforce that convention for each driver.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@graph/optimize.go`:
- Around line 43-56: Update the documentation and example to use the new method
name OptimizeStorage everywhere instead of the old Optimize: change the consumer
example to call m.OptimizeStorage(ctx, graph.OptimizeTargets(...)) and update
all prose references to "Optimize" (including lines mentioning the interface and
behavior) to "OptimizeStorage"; ensure the interface name StorageOptimizer and
the parameter type OptimizeOption remain unchanged and the example still shows
the type assertion against graph.Database and the correct ctx and options usage.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 28d5edeb-8a99-4a97-a2bd-2db4ac325fc1

📥 Commits

Reviewing files that changed from the base of the PR and between 681c8b3 and b1c99f2.

📒 Files selected for processing (1)
  • graph/optimize.go

Comment thread graph/optimize.go Outdated
Comment on lines +43 to +56
// A non-nil error returned from Optimize signals a driver-specific failure
// during a supported call.
//
// Consumers detect support with a type assertion against a graph.Database:
//
// if m, ok := db.(graph.StorageOptimizer); ok {
// err := m.Optimize(ctx, graph.OptimizeTargets(graph.Nodes, graph.Relationships))
// ...
// }
type StorageOptimizer interface {
// OptimizeStorage runs a storage optimization pass against the regions
// identified by opts. With no options, every region the driver
// recognizes is optimized.
OptimizeStorage(ctx context.Context, opts ...OptimizeOption) error
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Incomplete rename: doc/example still reference the old Optimize method.

The interface method is now OptimizeStorage, but the doc comments and the consumer example were not fully updated:

  • Line 49 example calls m.Optimize(...), which no longer exists on StorageOptimizer — anyone copying this snippet will fail to compile.
  • Prose references to "Optimize" remain at lines 16, 25, 31, and 43.
📝 Proposed fix for the example call
 //	if m, ok := db.(graph.StorageOptimizer); ok {
-//	    err := m.Optimize(ctx, graph.OptimizeTargets(graph.Nodes, graph.Relationships))
+//	    err := m.OptimizeStorage(ctx, graph.OptimizeTargets(graph.Nodes, graph.Relationships))
 //	    ...
 //	}

Also update the prose references at lines 16, 25, 31, and 43 from "Optimize" to "OptimizeStorage" for consistency.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// A non-nil error returned from Optimize signals a driver-specific failure
// during a supported call.
//
// Consumers detect support with a type assertion against a graph.Database:
//
// if m, ok := db.(graph.StorageOptimizer); ok {
// err := m.Optimize(ctx, graph.OptimizeTargets(graph.Nodes, graph.Relationships))
// ...
// }
type StorageOptimizer interface {
// OptimizeStorage runs a storage optimization pass against the regions
// identified by opts. With no options, every region the driver
// recognizes is optimized.
OptimizeStorage(ctx context.Context, opts ...OptimizeOption) error
// A non-nil error returned from Optimize signals a driver-specific failure
// during a supported call.
//
// Consumers detect support with a type assertion against a graph.Database:
//
// if m, ok := db.(graph.StorageOptimizer); ok {
// err := m.OptimizeStorage(ctx, graph.OptimizeTargets(graph.Nodes, graph.Relationships))
// ...
// }
type StorageOptimizer interface {
// OptimizeStorage runs a storage optimization pass against the regions
// identified by opts. With no options, every region the driver
// recognizes is optimized.
OptimizeStorage(ctx context.Context, opts ...OptimizeOption) error
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@graph/optimize.go` around lines 43 - 56, Update the documentation and example
to use the new method name OptimizeStorage everywhere instead of the old
Optimize: change the consumer example to call m.OptimizeStorage(ctx,
graph.OptimizeTargets(...)) and update all prose references to "Optimize"
(including lines mentioning the interface and behavior) to "OptimizeStorage";
ensure the interface name StorageOptimizer and the parameter type OptimizeOption
remain unchanged and the example still shows the type assertion against
graph.Database and the correct ctx and options usage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants