PDF: throw on reads from an unauthenticated encrypted file#528
Merged
Conversation
Reading an encrypted file without an installed decryptor used to silently serve undecrypted bytes. Guard read_object/read_object_stream so they throw the new UnauthenticatedReadError instead. m_is_encrypted is now set only after the /Encrypt dictionary is resolved during construction, since that read predates any decryptor and would otherwise trip the guard. Adds is_authenticated() and a test that an encrypted file reports as encrypted-but-unauthenticated and throws on parse_document(). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
9559c49 to
a0f27b6
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Stacked on #527.
Reading an encrypted file without an installed decryptor previously served undecrypted bytes silently. This guards
read_object/read_object_streamso they throw instead.Changes
UnauthenticatedReadErrorexception (no existing exception covered this —FileEncryptedError/NotEncryptedErrorare different concepts).is_encrypted() && !is_authenticated().DocumentParser::is_authenticated()(encrypted and a decryptor installed).m_is_encryptedis now set only after the/Encryptdictionary is resolved during construction. That resolution runs before any decryptor exists, so setting the flag earlier would trip the new guard on every encrypted file's open.Why it's safe
read_trailer_chain, xref streams) runs beforem_is_encryptedis set → guard passes, xref streams stay raw (ISO 32000-1 7.5.8.2).read_object_stream, so they're covered transitively.create_parser(m_decryptor)beforeparse_document(), so it's unaffected.Test
DocumentParser.read_without_authentication_throwsopens the public RC4 fixture without authenticating, assertsis_encrypted()/!is_authenticated(), and expectsparse_document()to throwUnauthenticatedReadError. Uses only the public fixture, so it runs in CI without the private submodule.