
Detect and analyze text from a multi-page PDF document synchronously using Amazon Textract

Original post: 2020-06-30 11:28:56. Tags: python, amazon-web-services, ocr, amazon-textract

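This follows on from the answer at https://stackoverflow.com/a/62174368/8117673

The follow-up question is: does it affect the accuracy of Amazon Textract's text detection?

Do I need to preprocess the images to get better results from Amazon Textract?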

1 Answer

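I used the pdftoppm command to convert the PDF to PNG. In Python -> subprocess.Popen(['pdftoppm', '-png', 'Sample.pdf', 'Sample']) (each argument has to be its own list element; a single string inside the list is treated as the program name and fails unless shell=True is used).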
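The conversion step above could be wrapped roughly as follows. This is only a sketch: the helper names `build_pdftoppm_command` and `convert_pdf_to_png` and the 300 DPI default are assumptions for illustration, not part of the original answer, and it assumes `pdftoppm` (from poppler-utils) is installed and on the PATH.

```python
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, output_prefix, dpi=300):
    """Build the argument list for pdftoppm (hypothetical helper).

    -png selects PNG output and -r sets the resolution in DPI.
    The 300 DPI default is an assumption, not a value from the answer.
    """
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(output_prefix)]


def convert_pdf_to_png(pdf_path, output_prefix, dpi=300):
    """Run pdftoppm and return the generated page images in page order."""
    # check=True raises CalledProcessError if pdftoppm exits with an error.
    subprocess.run(build_pdftoppm_command(pdf_path, output_prefix, dpi), check=True)
    prefix = Path(output_prefix)
    # pdftoppm writes one file per page, named <prefix>-<page number>.png,
    # with zero-padded page numbers, so a lexicographic sort keeps page order.
    return sorted(prefix.parent.glob(prefix.name + "-*.png"))
```

For example, `convert_pdf_to_png("Sample.pdf", "Sample")` should leave files such as Sample-1.png next to the PDF and return their paths. `subprocess.run` is used here instead of `Popen` so the call waits for the conversion to finish before the images are read.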

Amazon Textract's accuracy on the PDF file was higher than on the PNG format, since the PDF is the original file.
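One way to check this on your own documents is to send each converted PNG page to the synchronous `detect_document_text` operation and look at the confidence scores Textract reports. The sketch below assumes the caller passes in a client created with `boto3.client("textract")` and that AWS credentials and a region are already configured; the helper names `extract_lines`, `detect_page_lines`, and `mean_confidence` are hypothetical.

```python
from pathlib import Path
from statistics import mean


def extract_lines(response):
    """Return (text, confidence) pairs for every LINE block in a Textract response."""
    return [
        (block["Text"], block["Confidence"])
        for block in response.get("Blocks", [])
        if block["BlockType"] == "LINE"
    ]


def detect_page_lines(client, image_path):
    """Run synchronous text detection on one page image.

    `client` is assumed to be a boto3 Textract client supplied by the caller.
    """
    image_bytes = Path(image_path).read_bytes()
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return extract_lines(response)


def mean_confidence(lines):
    """Average confidence over detected lines, or 0.0 if nothing was detected."""
    return mean(confidence for _, confidence in lines) if lines else 0.0
```

Usage would look something like `lines = detect_page_lines(boto3.client("textract"), "Sample-1.png")` followed by `mean_confidence(lines)`. Confidence is Textract's own estimate rather than measured accuracy, so comparing the extracted text against a page you have transcribed by hand is a more reliable check.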

Related questions

Analyzing a Specific Page of a PDF with Amazon Textract

Unsupported Document format while using Amazon Textract,

How do you write text extracted from PDF (using textract) to docx files in python

Using AWS Textract for processing PDF

How to extract text data from a multi page CV in a PDF format using pyPDF2?

How to detect a rotated page in a PDF document in Python?

Auto Print multi page word document to pdf

How to analyse PDF documents with Amazon Textract in a Synchronous way?

AWS start document analysis using textract not working

How to analyze specific string of text from PDF in python-3?


