feat(export): stream /export/dump to R2 with DO alarm resumption (#59) #251
Open
ViperDroid wants to merge 1 commit into
…erbase#59)

The legacy /export/dump route buffers the entire dump in memory and runs synchronously, so it falls over on databases that exceed the 30s Worker timeout or the Durable Object memory ceiling (currently 1GB, soon 10GB).

This change adds a streaming path that lives inside the Durable Object:

- POST /export/dump kicks off a job, opens an R2 multipart upload, and returns 202 with a jobId. Supports format=sql|csv|json plus optional callbackUrl/table/chunkSize.
- GET /export/dump/status/:jobId returns progress (tables, rows, bytes, parts uploaded) and a downloadUrl once status is 'completed'.
- GET /export/dump/download/:jobId streams the finished object back from R2 to the client.
- DELETE /export/dump/:jobId aborts an in-flight upload.

The engine paginates 1000 rows at a time, buffers up to the R2 multipart 5 MiB minimum, flushes parts as they fill, and budgets each tick at 20s. When a tick yields, the leftover bytes are persisted to a temp R2 object (DO storage values are capped at 128 KiB and cannot hold the buffer directly). The DO alarm() handler dispatches dump work first, then falls through to the existing cron logic, so the two co-exist on the same alarm channel.

A new [[r2_buckets]] binding named DATABASE_DUMPS gates the streaming path. The legacy GET /export/dump remains untouched for small databases and existing clients.

Tests: 17 new unit tests covering the engine (mid-tick yield/resume, multipart flushing at the 5 MiB threshold, error abort, BLOB literals, empty databases, CSV/JSON formats) and the HTTP routes.
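For reviewers who want a mental model of the buffering and tick budget described above, here is a rough sketch. It is not the code in this PR: `PartSink`, `PartBuffer`, `runTick`, and `snapshot` are hypothetical names, the sink stands in for an R2 multipart upload, and cutting parts to a fixed size is an assumption of the sketch (a conservative choice when all non-final parts should match in size).

```typescript
// Sketch only: every name here is hypothetical, not taken from the PR.
const PART_BYTES = 5 * 1024 * 1024; // the 5 MiB multipart minimum cited above

// Stand-in for whatever wraps the real multipart upload.
interface PartSink {
  uploadPart(partNumber: number, body: Uint8Array): Promise<void>;
}

class PartBuffer {
  private sink: PartSink;
  private partBytes: number;
  private pending: Uint8Array;
  private nextPartNumber: number;

  constructor(
    sink: PartSink,
    partBytes: number = PART_BYTES,
    leftover: Uint8Array = new Uint8Array(0),
    nextPartNumber: number = 1,
  ) {
    this.sink = sink;
    this.partBytes = partBytes;
    this.pending = leftover;
    this.nextPartNumber = nextPartNumber;
  }

  // Append bytes and upload one fixed-size part each time the buffer fills.
  // (Copies on every write for clarity; a real engine would avoid that.)
  async write(bytes: Uint8Array): Promise<void> {
    const merged = new Uint8Array(this.pending.byteLength + bytes.byteLength);
    merged.set(this.pending, 0);
    merged.set(bytes, this.pending.byteLength);
    let offset = 0;
    while (merged.byteLength - offset >= this.partBytes) {
      const part = merged.slice(offset, offset + this.partBytes);
      await this.sink.uploadPart(this.nextPartNumber, part);
      this.nextPartNumber += 1;
      offset += this.partBytes;
    }
    this.pending = merged.slice(offset);
  }

  // State the caller must persist when a tick yields.
  snapshot(): { leftover: Uint8Array; nextPartNumber: number } {
    return { leftover: this.pending, nextPartNumber: this.nextPartNumber };
  }

  // Upload the final, possibly short, part once the dump is exhausted.
  async finish(): Promise<void> {
    if (this.pending.byteLength === 0) return;
    await this.sink.uploadPart(this.nextPartNumber, this.pending);
    this.nextPartNumber += 1;
    this.pending = new Uint8Array(0);
  }
}

type TickResult = "yielded" | "done";

// Pull pages until the source is exhausted or the time budget is spent.
// nextPage returns null when there are no more rows to encode.
async function runTick(
  nextPage: () => Promise<Uint8Array | null>,
  buffer: PartBuffer,
  budgetMs: number = 20_000,
  now: () => number = Date.now,
): Promise<TickResult> {
  const deadline = now() + budgetMs;
  while (now() < deadline) {
    const page = await nextPage();
    if (page === null) {
      await buffer.finish();
      return "done";
    }
    await buffer.write(page);
  }
  return "yielded";
}
```

On a "yielded" result, the caller would persist `snapshot()` and schedule the next alarm; on resume, it would rebuild the buffer from that snapshot so part numbers continue where they left off. The sketch does not cover the empty-database case, error abort, or the status bookkeeping.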
08847db to 2099316