Discussed in #4993
Originally posted by kai-36 May 17, 2026
From the documentation, I understand that by default, search_for detects a word hyphenated at the end of a line and joins it with the first word on the next line. However, this is not working for me, and I'm not sure why.
This is a segment of the output from page.get_text():
If replicated on a large scale,
this would, in theory, enable the
ocean to soak up more of the
planet-warming gas driving cli-
mate change.
When I call page.search_for("climate"), it finds the occurrences of "climate" that appear whole on the page, but not the hyphenated one.
The text is extracted from a PDF file and is justified, in case that matters.
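To make the expectation concrete, here is a minimal plain-Python sketch of the joining I expected, applied to the extracted text above. It is not PyMuPDF's implementation: dehyphenate is a hypothetical helper written only for illustration, and it assumes the hyphen is immediately followed by a newline in the get_text() output.

```python
import re


def dehyphenate(text: str) -> str:
    """Join a word split by a hyphen at a line end with its continuation.

    Hypothetical helper for illustration only; not part of PyMuPDF.
    """
    # Only a hyphen directly followed by a newline is treated as a line-end
    # split, so in-line hyphens such as "planet-warming" are left alone.
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


# The segment of page.get_text() output quoted above.
extracted = (
    "If replicated on a large scale,\n"
    "this would, in theory, enable the\n"
    "ocean to soak up more of the\n"
    "planet-warming gas driving cli-\n"
    "mate change.\n"
)

joined = dehyphenate(extracted)

print("climate" in extracted)  # False: the word is split as "cli-" / "mate"
print("climate" in joined)     # True: the two halves have been joined
```

This only shows the result I expected from the default behavior. It does not explain why search_for misses the hyphenated occurrence in my PDF.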