Reading text from a PDF file gives no result

Question

So I'm trying something very simple: I just want to read the text from a PDF file into a variable, that's it. This is what I'm getting:

Does anyone know a reliable way to just read a PDF into a text file?

Answer 1

Try the following library, pdfplumber:

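import pdfplumber

# Open the PDF; the context manager closes the file automatically
with pdfplumber.open('anyfile.pdf') as pdf_file:
    page = pdf_file.pages[0]      # first page only
    text = page.extract_text()    # text of that page as a string

print(text)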

I haven't used PyPDF2 before, but pdfplumber seems to work well for me.
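
Since the question asks for the whole PDF as text rather than just the first page, here is a minimal sketch that extends the same pdfplumber calls to every page and optionally writes the result to a text file. The function names (`join_page_texts`, `pdf_to_text`, `save_pdf_text`) and the file names are made up for illustration. It assumes the PDF has a real text layer; pages that are only scanned images generally yield no text, and some pdfplumber versions return `None` for such pages, which the helper treats as an empty string.

```python
from pathlib import Path


def join_page_texts(page_texts):
    """Join per-page strings, treating None (no extractable text) as empty."""
    return "\n".join(text or "" for text in page_texts)


def pdf_to_text(pdf_path):
    """Return the text of every page in the PDF as one string."""
    # Third-party dependency (pip install pdfplumber); imported here so the
    # pure-Python helper above can be used without it.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        # The generator is fully consumed inside the with block,
        # so the file is still open while pages are read.
        return join_page_texts(page.extract_text() for page in pdf.pages)


def save_pdf_text(pdf_path, txt_path):
    """Extract the PDF's text and write it to a UTF-8 text file."""
    Path(txt_path).write_text(pdf_to_text(pdf_path), encoding="utf-8")
```

Something like `text = pdf_to_text('anyfile.pdf')` would then put the full text in a variable, and `save_pdf_text('anyfile.pdf', 'anyfile.txt')` would write it to disk. If the result is still empty, the file probably has no text layer and would need OCR instead.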

Solution 1
0 2020-09-01 18:21:55