
Fix(backend): Fix race condition in download queue when concurrent jobs share same destination #8931

Open
lstein wants to merge 1 commit into invoke-ai:main from lstein:lstein/bugfix/test_download_queue-failure

lstein (Collaborator) commented Feb 28, 2026

Summary

When two download jobs target the same destination directory at the same time, there is a time-of-check/time-of-use (TOCTOU) race between glob("*.downloading") and the subsequent .stat() call: if a concurrent job completes and renames its .downloading file between the two calls, .stat() raises FileNotFoundError. This surfaced as an intermittent failure in test_errors, where the broken job's error was FileNotFoundError instead of the expected HTTPError(NOT FOUND).
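The failure window can be reproduced deterministically, without threads, by doing the rename by hand between the two calls. This is only an illustrative sketch: the function name, the file names, and the temporary directory are assumptions, not the queue's actual code or layout.

```python
import tempfile
from pathlib import Path


def stat_after_concurrent_rename(dest: Path) -> str:
    """Simulate the race: glob, then a 'concurrent' rename, then stat."""
    # Hypothetical partial file left by an in-progress download.
    partial = dest / "model.safetensors.downloading"
    partial.write_bytes(b"partial data")

    # Step 1 (check): the job lists in-progress files.
    candidates = list(dest.glob("*.downloading"))

    # Step 2: another job finishes and renames its file in the meantime.
    partial.rename(dest / "model.safetensors")

    # Step 3 (use): the path captured in step 1 is now stale.
    try:
        candidates[0].stat()
    except FileNotFoundError:
        return "FileNotFoundError"
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    outcome = stat_after_concurrent_rename(Path(tmp))
```

In the real queue the rename in step 2 happens on another worker thread, which is why the failure only shows up intermittently.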

Note that this scenario rarely (if ever) occurs in real life. However, it has been causing increasingly frequent failures in the download and install manager unit tests.

Fix: In _do_download, wrap the candidates[0].stat().st_size call in a try/except for FileNotFoundError. If the file disappears between glob and stat, reset job.download_path to None and leave resume_from at 0 so the job proceeds as a fresh download.

# Before
resume_from = candidates[0].stat().st_size  # crashes if file renamed by concurrent job

# After
try:
    resume_from = candidates[0].stat().st_size
except FileNotFoundError:
    # .downloading file renamed/deleted between glob and stat; skip resume
    job.download_path = None
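The same guard can be exercised in isolation. The sketch below is a standalone approximation, not the code in _do_download: the helper name resume_point and its tuple return value are hypothetical, standing in for the job.download_path and resume_from assignments.

```python
import tempfile
from pathlib import Path
from typing import Optional, Sequence, Tuple


def resume_point(candidates: Sequence[Path]) -> Tuple[Optional[Path], int]:
    """Return (partial_file, bytes_on_disk), or (None, 0) for a fresh download.

    `candidates` is the result of an earlier glob("*.downloading"); any
    entry may have been renamed or deleted since the glob ran.
    """
    if not candidates:
        return None, 0
    try:
        return candidates[0], candidates[0].stat().st_size
    except FileNotFoundError:
        # File vanished between glob and stat: skip resume, start from byte 0.
        return None, 0


with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "a.downloading"
    present.write_bytes(b"12345")
    stale = Path(tmp) / "gone.downloading"  # never created, i.e. a stale glob hit

    resumed = resume_point([present])
    fresh = resume_point([stale])
    empty = resume_point([])
```

Treating the vanished file as "nothing to resume" costs at most a redundant download, which is cheaper than failing the job with an unrelated error.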

Related Issues / Discussions

QA Instructions

Run tests/app/services/download/test_download_queue.py::test_errors repeatedly. It previously failed intermittently due to this race.
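One way to run the test repeatedly is a small driver that counts non-zero exit codes. This is a hedged sketch: count_failures is a hypothetical helper, and the choice of invoking pytest through the current interpreter with -q is an assumption about the local setup.

```python
import subprocess
import sys
from typing import Sequence


def count_failures(cmd: Sequence[str], runs: int) -> int:
    """Run `cmd` `runs` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(list(cmd), capture_output=True)
        if result.returncode != 0:
            failures += 1
    return failures


# Test id taken from the QA instructions; the pytest flags are an assumption.
TEST_ID = "tests/app/services/download/test_download_queue.py::test_errors"
PYTEST_CMD = [sys.executable, "-m", "pytest", "-q", TEST_ID]
```

Calling count_failures(PYTEST_CMD, runs=50) from the repository root should report 0 with the fix applied; the run count is arbitrary and can be raised if the race is hard to trigger on a given machine.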

Merge Plan

Simple merge.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

…nation directory (#104)

* Initial plan

* Fix race condition in _do_download when scanning for .downloading files

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

* chore(backend): update copyright

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
github-actions bot added the python (PRs that change python files) and services (PRs that change app services) labels Feb 28, 2026


3 participants