
How to extract data from tables in a pdf using Python?

Original post: 2020-09-17 02:38:56 · Tags: python, pdf, tabula

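I need to extract data from tables in multiple PDFs using Python. I have tested Camelot and tabula, but neither of them extracts the data accurately. The tables contain merged cells, cells with several lines of information, and so on, so both libraries get confused. Is there a good way to solve this?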

1 Answer

If that is the case, the problem is probably the underlying structure of the tables as they are encoded in the PDF.

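You could use OCR and then apply some string/regex manipulation to extract the column data from each row. github.com/cseas/ocr-table seems to work. See its input.pdf and output.txt to check whether it fits your case.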
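As a rough illustration of the OCR-plus-regex idea (this is not the code from the linked repository), here is a minimal sketch. It assumes the `pdf2image` and `pytesseract` packages plus the Tesseract and Poppler binaries are installed, that columns in the OCR output are separated by at least two spaces, and that a short line following a full row is the wrapped continuation of that row's last cell. The names `ocr_pdf_to_text` and `parse_rows` are made up for this example, and the heuristics will likely need tuning for real documents.

```python
import re

# Assumption: a run of two or more whitespace characters marks a column gap.
COLUMN_GAP = re.compile(r"\s{2,}")


def ocr_pdf_to_text(pdf_path, dpi=300):
    """Render every page to an image and OCR it.

    Needs pdf2image and pytesseract; they are imported lazily so that
    parse_rows() can be used on its own.
    """
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(pdf_path, dpi=dpi)
    # "--psm 6" tells Tesseract to treat the page as one uniform block of text.
    return "\n".join(
        pytesseract.image_to_string(page, config="--psm 6") for page in pages
    )


def parse_rows(text, n_columns):
    """Split OCR text into rows of n_columns cells.

    Heuristic (an assumption, not a general rule): a line with fewer cells
    than n_columns is appended to the last cell of the previous row.
    Lines with more cells than n_columns are skipped.
    """
    rows = []
    for line in text.splitlines():
        cells = COLUMN_GAP.split(line.strip())
        if cells == [""]:
            continue  # blank line
        if len(cells) == n_columns:
            rows.append(cells)
        elif rows and len(cells) < n_columns:
            rows[-1][-1] += " " + " ".join(cells)
    return rows


# Hypothetical usage:
# text = ocr_pdf_to_text("input.pdf")
# for row in parse_rows(text, n_columns=3):
#     print(row)
```

Merged cells usually need extra, document-specific rules on top of this, for example carrying the previous row's value forward when a column is empty.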

