multipart upload reliability: retry finish, handle 504, increase verify timeout #368

Open
Goober5000 wants to merge 1 commit into KnossosNET:main from Goober5000:claude/upload_reliability_fixes

Goober5000 (Contributor) commented Apr 5, 2026

Large uploads fail when the reverse proxy times out during chunk reassembly
and returns 504. The client treated the 504 as a failure and retried finish
without verifying server state, causing cascading 500 errors when the server
had already completed the upload and deleted the chunks.

Changes:

  • Add multiupload/finish to the GatewayTimeout hack so 504 is treated as
    tentative success (same pattern as mod/release endpoints)
  • Add bounded retry loop (max 3) to Finish() with exponential backoff;
    between retries, call multiupload/start to check the server's done flag
    via new CheckUploadDone() helper before blindly retrying
  • Increase verify_part timeout from 45s to 120s to handle disk I/O
    contention during parallel uploads
  • Remove premature completed=true assignment in Upload(); Finish() now
    manages the flag internally
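
For reviewers, the retry flow described above can be sketched roughly as follows. This is a Python illustration of the idea, not the code in this PR: `finish_with_retry`, `send_finish`, and `check_upload_done` are hypothetical stand-ins (the last one mirrors the role of `CheckUploadDone()`), and the 1-second base delay is an assumption, since the description does not state the actual backoff values.

```python
import time
from typing import Callable

HTTP_OK = 200


def finish_with_retry(
    send_finish: Callable[[], int],
    check_upload_done: Callable[[], bool],
    max_attempts: int = 3,
    base_delay: float = 1.0,  # assumed value; not taken from the PR
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the upload is known to be complete, else False.

    send_finish       -- hypothetical callable that posts the finish request
                         and returns the HTTP status code.
    check_upload_done -- hypothetical callable that asks the server whether
                         the upload is already marked done.
    """
    for attempt in range(max_attempts):
        status = send_finish()
        if status == HTTP_OK:
            return True

        # A 504 (or other failure) is ambiguous: the proxy may have given up
        # while the server kept reassembling chunks. Wait with exponential
        # backoff (base_delay, 2x, 4x, ...) before asking for server state.
        sleep(base_delay * 2 ** attempt)

        # Check the server's done flag instead of blindly retrying finish.
        if check_upload_done():
            return True

    return False
```

The point of checking state before each retry is that a second finish request against an upload that already completed would hit deleted chunks, which is the cascading 500 scenario described above.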

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Goober5000 force-pushed the claude/upload_reliability_fixes branch from 3630168 to a7845fe on April 5, 2026 02:45