
PyPDF2 PdfFileWriter has no attribute stream

I am trying to split a PDF into its pages and save each page as a new PDF. I tried the method from a previous question and the PyPDF2 split example, both without success. EDIT: Looking at the output files, I can see that the first page is written successfully; the PDF for the second page is then created but is empty.

Here is the code I am trying to run:

from PyPDF2 import PdfFileWriter, PdfFileReader

inputpdf = PdfFileReader(open("my_pdf.pdf", "rb"))

for i in range(inputpdf.numPages):
    output = PdfFileWriter()
    output.addPage(inputpdf.getPage(i))
    with open("document-page%s.pdf" % i, "wb") as outputStream:
        output.write(outputStream)

Here is the full error message:

Traceback (most recent call last):
  File "pdf_functions.py", line 9, in <module>
    output.write(outputStream)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 482, in write
    self._sweepIndirectReferences(externalReferenceMap, self._root)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 572, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 548, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 572, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 548, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 557, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, data[i])
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 572, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 548, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 575, in _sweepIndirectReferences
    if data.pdf.stream.closed:
AttributeError: 'PdfFileWriter' object has no attribute 'stream'

I also tried the following and confirmed that I can extract a single page:

from PyPDF2 import PdfFileWriter, PdfFileReader
inputpdf = PdfFileReader(open("/home/ubuntu/inputs/cityshape/form5.pdf", "rb"))

#for i in range(inputpdf.numPages):
output = PdfFileWriter()
output.addPage(inputpdf.getPage(2))
with open("document-page2.pdf", "wb") as outputStream:
    output.write(outputStream)

The same thing happened to me.

I was able to solve it by moving the following line inside the loop:

inputpdf = PdfFileReader(open("/home/ubuntu/inputs/cityshape/form5.pdf", "rb"))

I believe some versions of PyPDF2 have a bug where calling the PdfFileWriter.write method corrupts the state of the PdfFileReader instance. Recreating the PdfFileReader instance after each write works around this bug.

The following code should work (untested):

from PyPDF2 import PdfFileWriter, PdfFileReader

# The with block closes the input file even if a write fails.
with open("my_pdf.pdf", "rb") as pdf_in_file:
    # Read the page count once, before the loop.
    pages_no = PdfFileReader(pdf_in_file).numPages

    for i in range(pages_no):
        # Recreate the reader on every iteration to work around the bug.
        inputpdf = PdfFileReader(pdf_in_file)
        output = PdfFileWriter()
        output.addPage(inputpdf.getPage(i))
        with open("document-page%s.pdf" % i, "wb") as outputStream:
            output.write(outputStream)
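
The same workaround can also be wrapped in a small reusable function. This is only a sketch: split_pages, make_reader, make_writer and out_dir are hypothetical names that are not part of PyPDF2, and the explicit seek(0) before each new reader is a defensive assumption, not something the library is known to require. The function relies only on the legacy numPages, getPage, addPage and write calls used above.

```python
import os


def split_pages(path, make_reader, make_writer, out_dir="."):
    """Write each page of the PDF at `path` to its own file in `out_dir`.

    `make_reader(stream)` and `make_writer()` are hypothetical factory
    parameters; with legacy PyPDF2 they would be PdfFileReader and
    PdfFileWriter.
    """
    written = []
    with open(path, "rb") as src:
        # Count the pages once with a throwaway reader.
        page_count = make_reader(src).numPages
        for i in range(page_count):
            # Rewind and build a fresh reader for every page, so no state
            # left behind by the previous write() call is reused.
            src.seek(0)
            reader = make_reader(src)
            writer = make_writer()
            writer.addPage(reader.getPage(i))
            out_path = os.path.join(out_dir, "document-page%s.pdf" % i)
            with open(out_path, "wb") as dst:
                writer.write(dst)
            written.append(out_path)
    return written
```

With the legacy API this would be called as split_pages("my_pdf.pdf", PdfFileReader, PdfFileWriter). Passing the classes in as arguments keeps the function independent of the installed PyPDF2 version and lets it be exercised with stand-in objects.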


 