Improve image reading with buffer validation#240

Open
hert1zm wants to merge 2 commits into PFCCLab:main from hert1zm:patch-1


hert1zm commented Feb 18, 2026

Currently, if the image set contains even a single corrupted or unreadable file, the entire application crashes during auto-detection/auto-recognition. This patch introduces a validation step on the image buffer before calling cv2.imdecode(). If the buffer is empty or invalid, the application logs a warning and skips the file instead of raising an exception. This prevents the full process from terminating unexpectedly and avoids losing progress when processing large batches of images.

Since this problem can occur in several places, consider adding a shared utility that replaces every call to cv2.imdecode(), validates the buffer first, and prevents a crash when the image set contains corrupted or unusable images:

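import logging

import cv2
import numpy as np

logger = logging.getLogger(__name__)

def imread(path):
    buf = np.fromfile(path, dtype=np.uint8)
    if buf.size == 0:
        logger.warning(
            "Failed to read the image's buffer. The file may be corrupted or in an unsupported format: %s",
            path,
        )
        # Callers are expected to treat None as "skip this file".
        return None
    # cv2.imdecode() also returns None when the bytes cannot be decoded.
    return cv2.imdecode(buf, cv2.IMREAD_COLOR)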

Add buffer check for image reading to handle empty or corrupted files.
@PFCCLab PFCCLab locked as resolved and limited conversation to collaborators Mar 22, 2026
@PFCCLab PFCCLab unlocked this conversation Mar 22, 2026
@GreatV GreatV marked this pull request as draft March 22, 2026 23:32
@GreatV GreatV marked this pull request as ready for review March 22, 2026 23:33
Copilot AI review requested due to automatic review settings March 22, 2026 23:33

Copilot AI left a comment

Pull request overview

Adds defensive handling in the auto-recognition worker so corrupted/undecodable images don’t crash the batch run, improving robustness when processing large image sets.

Changes:

  • Validate the np.fromfile() buffer before calling cv2.imdecode().
  • Log warnings and skip files when the buffer is empty or decode returns None.

Comment on lines +45 to +59
if buf.size == 0:
    logger.warning(
        "Failed to read the image's buffer. The file may be corrupted or in an unsupported format : %s",
        img_path,
    )
    self.result_dic = None
    continue
img = cv2.imdecode(buf, cv2.IMREAD_COLOR)
if img is None:
    logger.warning(
        "Failed to decode image file %s. The file may be corrupted or in an unsupported format.",
        img_path,
    )
    self.result_dic = None
    continue
Copilot AI Mar 22, 2026

continue here skips the common post-processing block ("结果保存", i.e. "save results"), including findex += 1 and progressBarValue.emit(findex). This changes behavior for decode failures/empty buffers: skipped files no longer advance the progress bar, so the progress UI can finish below len_bar and the time-left estimate becomes inaccurate. Consider letting execution fall through to the existing self.result_dic is None handling (no continue), or increment/emit progress before continuing, so every input file advances progress consistently.
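One way to keep the progress count consistent while still skipping bad files is to move the increment into a finally block, which runs even when the loop body hits continue. The sketch below is only an illustration of that idea, not the worker's actual code: process_batch, load_image, handle_image and on_progress are hypothetical names standing in for the real loop, the decode step, the recognition step and progressBarValue.emit.

```python
import logging
from typing import Callable, Iterable, Optional

logger = logging.getLogger(__name__)


def process_batch(
    paths: Iterable[str],
    load_image: Callable[[str], Optional[object]],
    handle_image: Callable[[str, object], None],
    on_progress: Callable[[int], None],
) -> int:
    """Process every path and return how many files were skipped."""
    skipped = 0
    findex = 0
    for path in paths:
        try:
            img = load_image(path)
            if img is None:
                logger.warning("Skipping unreadable image: %s", path)
                skipped += 1
                continue  # the finally block below still runs
            handle_image(path, img)
        finally:
            # Advance progress once per input file, skipped or not.
            findex += 1
            on_progress(findex)
    return skipped
```

Note that the finally block also runs when handle_image raises, so progress is emitted before the exception propagates. Whether that is desirable depends on how run() is expected to treat real failures.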

- img = cv2.imdecode(
-     np.fromfile(img_path, dtype=np.uint8), cv2.IMREAD_COLOR
- )
+ buf = np.fromfile(img_path, dtype=np.uint8)
Copilot AI Mar 22, 2026

np.fromfile(img_path, dtype=np.uint8) can raise (e.g., missing file, permission error, I/O error). Since run() catches Exception but then re-raises, these cases will still terminate the worker/app, which undermines the goal of “skip unreadable files”. Consider catching OSError/IOError around the fromfile/imdecode path, logging a warning, and continuing without re-raising.

Suggested change
- buf = np.fromfile(img_path, dtype=np.uint8)
+ try:
+     buf = np.fromfile(img_path, dtype=np.uint8)
+ except (OSError, IOError) as e:
+     logger.warning(
+         "Failed to read image file %s due to an OS/I/O error: %s",
+         img_path,
+         e,
+     )
+     self.result_dic = None
+     continue
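The read, validate and decode steps discussed in this thread could also live in one helper that returns None for any unusable file, so each call site only needs a single None check. This is a rough sketch under assumptions rather than the project's code: read_image_buffer and safe_imread are hypothetical names, and the decode parameter exists only so the helper can be exercised without OpenCV installed. In Python 3, IOError is an alias of OSError, so catching OSError alone covers both.

```python
import logging
from typing import Callable, Optional

import numpy as np

logger = logging.getLogger(__name__)


def read_image_buffer(path: str) -> Optional[np.ndarray]:
    """Return the file's raw bytes as a uint8 array, or None if unusable."""
    try:
        buf = np.fromfile(path, dtype=np.uint8)
    except OSError as exc:  # covers FileNotFoundError, PermissionError, ...
        logger.warning("Failed to read image file %s: %s", path, exc)
        return None
    if buf.size == 0:
        logger.warning("Image file is empty: %s", path)
        return None
    return buf


def safe_imread(path: str, decode: Optional[Callable] = None):
    """Read and decode an image; return None instead of raising on bad input."""
    buf = read_image_buffer(path)
    if buf is None:
        return None
    if decode is None:
        import cv2  # imported lazily; only needed for the default decoder

        img = cv2.imdecode(buf, cv2.IMREAD_COLOR)
    else:
        img = decode(buf)  # hypothetical hook, e.g. for tests
    if img is None:
        logger.warning("Failed to decode image file %s", path)
    return img
```

A caller in the batch loop would then only check `if img is None` and skip the file, while still advancing its progress counter.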

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
3 participants