
Why does PyPDF2.PdfFileReader() require a file object as an input?

csv.reader() doesn't require a file object, nor does open(). Does PyPDF2.PdfFileReader() require a file object because of the complexity of the PDF format, or is there some other reason?

It's just a matter of how the library was written. csv.reader accepts any iterable that yields strings, which includes file objects. open is the function that opens the file, so of course it doesn't take an already-open file (although it can take an integer file descriptor that refers to one). Typically, it is better to handle the file yourself, usually within a with block so that it is closed properly.

with open('input.pdf', 'rb') as f:
    data = f.read()  # do something with the file while it is open
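
To make the points about csv.reader and open concrete, here is a minimal sketch using only the standard library. The sample rows and the temporary file are made up for illustration.

```python
import csv
import os
import tempfile

# csv.reader accepts any iterable of strings, not only file objects.
rows = list(csv.reader(["name,pages", "report.pdf,12"]))

# open() also accepts an integer file descriptor instead of a path.
fd, path = tempfile.mkstemp(suffix=".csv")
with open(fd, "w", newline="") as f:  # closing f also closes fd
    f.write("name,pages\nreport.pdf,12\n")

# A file object is itself an iterable of lines, so csv.reader takes it too.
with open(path, newline="") as f:
    rows_from_file = list(csv.reader(f))

os.remove(path)  # clean up the temporary file
```

Both calls produce the same rows, which is why csv.reader has no need to insist on a file object.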

pypdf can take a BytesIO stream or a file path as well. I actually recommend passing the file path in most cases, as pypdf will then take care of closing the file for you.
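
As a rough illustration of that design choice, the sketch below shows how a function might accept either a path or a binary stream and close only the file it opened itself. read_header is a hypothetical helper written for this answer, not pypdf's actual code, and the sample bytes are a placeholder rather than a complete PDF.

```python
import io
import os
import tempfile


def read_header(source, size=8):
    """Hypothetical helper: return the first `size` bytes of a path or binary stream."""
    if isinstance(source, (str, os.PathLike)):
        # We opened the file, so we are responsible for closing it.
        with open(source, "rb") as f:
            return f.read(size)
    # The caller owns the stream: read from it but leave it open.
    return source.read(size)


pdf_bytes = b"%PDF-1.4\n% placeholder bytes, not a complete PDF\n"

# Case 1: an in-memory stream supplied by the caller.
stream = io.BytesIO(pdf_bytes)
header_from_stream = read_header(stream)

# Case 2: a file path; the helper opens and closes the file itself.
fd, path = tempfile.mkstemp(suffix=".pdf")
with open(fd, "wb") as f:
    f.write(pdf_bytes)
header_from_path = read_header(path)
os.remove(path)  # clean up the temporary file
```

When a path is passed, the caller never holds an open file handle, so there is nothing to forget to close. When a stream is passed, closing it remains the caller's job.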
