I get a weird result when converting a single PDF with no content to a .txt file.
I am using this PHP code in a foreach for all the files found in the dir. It works ridiculously well with the -raw option if there is text available in the pdf.
system("pdftotext -raw " . escapeshellarg($page_name) . " 2>&1");
However, if there is no content, or the file contains only an image, it writes this character to the .txt file:
(view of line 1 in the .txt file)
I've tried multiple pdftotext settings, but can't seem to get rid of it.
Is there any way to tackle this with pdftotext?
Some further info: with that character, the file produced is always 1 byte, whereas I'd like it to be listed as 0 bytes in the dir.
(P.S. This is my first time adding an image. I hope it is clear!)
Because of what I just (finally) found, I will close this one with this best answer from @mkl. The part that answers this question is:
More exactly, that Worksheet PDF does not contain text drawing instructions, merely graphics drawing instructions (the results of which look like text).
pdfminer pdf2text outputs 'FF', i.e. a form feed character (0x0C).
The solution is to check for that weird character when working with files that have this content.
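A minimal sketch of that check in PHP, assuming the stray byte is a form feed (0x0C), which the 'FF' above suggests. The function names isEffectivelyEmpty and normalizeExtractedFile are hypothetical helpers made up for this illustration, not part of pdftotext or PHP. The form feed is listed explicitly in the trim() character list here rather than relying on the defaults. The pdftotext man page also documents a -nopgbrk option that suppresses form feed page breaks; I have not confirmed that it produces a 0-byte file in this case, so the check below works on the output file instead.

```php
<?php
// Hypothetical helper: true when the extracted text contains nothing
// but whitespace and form feed (0x0C) characters.
function isEffectivelyEmpty(string $text): bool
{
    // Explicit character list: space, tab, LF, CR, NUL, vertical tab, form feed.
    return trim($text, " \t\n\r\0\x0B\f") === '';
}

// Hypothetical helper: truncates the .txt file to 0 bytes when it holds
// only whitespace/form feeds. Returns true if the file was truncated.
function normalizeExtractedFile(string $txtPath): bool
{
    $text = file_get_contents($txtPath);
    if ($text === false) {
        return false; // unreadable file: leave it alone
    }
    if (isEffectivelyEmpty($text)) {
        file_put_contents($txtPath, ''); // now listed as 0 bytes
        return true;
    }
    return false;
}
```

Inside the existing foreach, this could be called on each generated .txt file right after the pdftotext call, so that image-only PDFs end up with a 0-byte text file.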