rewrite_images() and saving with garbage>1 spoils fonts (xrefs) #4896

@ASTimch

Description of the bug

I've found that after calling rewrite_images() and then saving the document with garbage>1, the fonts in the PDF are broken. It seems that the xrefs somehow got mixed up. Saving with garbage>1 without rewrite_images() keeps the document well formatted, as does saving with garbage=1 after rewrite_images().

How to reproduce the bug

Is this caused by a broken PDF structure or by an issue in PyMuPDF?
(The source PDF contains images with an 'Indexed' colorspace; could that be the problem?)
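
To help narrow down the colorspace question, here is a minimal sketch that lists which image xrefs in a file use an Indexed colorspace. The helper names `looks_indexed` and `indexed_image_xrefs` are made up for illustration. The check is a plain substring match on the /ColorSpace entry and follows only one level of indirection, so treat it as a rough diagnostic rather than a full colorspace parser.

```python
def looks_indexed(colorspace_source):
    """Return True if the source text of a /ColorSpace value names an Indexed space."""
    return "/Indexed" in colorspace_source


def indexed_image_xrefs(path):
    """Return the sorted xrefs of page images whose /ColorSpace looks Indexed."""
    import fitz  # PyMuPDF, imported lazily so looks_indexed() works without it

    found = set()
    with fitz.open(path) as doc:
        for pno in range(doc.page_count):
            for img in doc.get_page_images(pno):
                xref = img[0]  # first item of each entry is the image xref
                kind, value = doc.xref_get_key(xref, "ColorSpace")
                if kind == "xref":
                    # Indirect reference such as "12 0 R": read the target object.
                    value = doc.xref_object(int(value.split()[0]))
                if looks_indexed(value):
                    found.add(xref)
    return sorted(found)
```

Running `indexed_image_xrefs("test_orig.pdf")` before and after rewrite_images() would show whether the Indexed images are the ones being rewritten.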
Code sample

import fitz

if __name__ == "__main__":
    pdf_name = "test_orig.pdf"

    # just re-save the original document
    with fitz.open(pdf_name) as doc:
        doc.ez_save("test_garbage_2.pdf", garbage=2)  # will be ok

    # save with image optimization
    with fitz.open(pdf_name) as doc:
        doc.rewrite_images(dpi_threshold=160, dpi_target=150)
        doc.ez_save("test_optimized_g1.pdf", garbage=1)  # will be ok
        doc.ez_save("test_optimized_g2.pdf", garbage=2)  # fonts(xrefs) are spoiled
        doc.ez_save("test_optimized_g3.pdf", garbage=3)  # fonts(xrefs) are spoiled
        doc.ez_save("test_optimized_g4.pdf", garbage=4)  # fonts(xrefs) are spoiled
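
To check the damage without opening each file in a viewer, the per-page font lists of two files can be compared. This is only a sketch: `font_table` and `diff_font_tables` are hypothetical helper names, and fonts are compared by base name, type and encoding rather than by xref, on the assumption that garbage>=2 compacts the xref table and so renumbers objects even in a healthy save.

```python
def font_table(path):
    """Return {page_number: sorted [(basefont, type, encoding), ...]} for a PDF."""
    import fitz  # PyMuPDF, imported lazily so diff_font_tables() works without it

    table = {}
    with fitz.open(path) as doc:
        for page in doc:
            # get_fonts() entries: (xref, ext, type, basefont, name, encoding)
            table[page.number] = sorted(
                (str(f[3]), str(f[2]), str(f[5])) for f in page.get_fonts()
            )
    return table


def diff_font_tables(before, after):
    """Return {page_number: (before_fonts, after_fonts)} for pages that differ."""
    pages = sorted(set(before) | set(after))
    return {
        p: (before.get(p, []), after.get(p, []))
        for p in pages
        if before.get(p, []) != after.get(p, [])
    }
```

For example, `diff_font_tables(font_table("test_orig.pdf"), font_table("test_optimized_g2.pdf"))` should return an empty dict if the font resources survived the save, and list the affected pages otherwise.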

Thank you.

test_orig.pdf

spoiled pdf
test_optimized_g2.pdf

PyMuPDF version

1.26.5

Operating system

Windows

Python version

3.11

Labels

upstream bug (bug outside this package)
