How can I improve the quality of Document AI document-ocr processor results?

Original 2023-01-24 17:03:01 1 1 google-cloud-platform/ nlp/ cloud-document-ai

I have this image (first image) which I want to process using the document-ocr processor. The output I got, printed to the Python console, looks something like the second image. The output is poorly processed, and the same happens with most of my files. How can I make document-ocr understand the document and yield a perfect result?

1 answer

To set expectations, no machine learning model can give "perfect results" consistently.

Results will greatly depend on the quality of the input files. In this case, the document is a scan of a handwritten file and handwriting can vary greatly from document to document.

In this particular example, some of these words could be difficult for humans to read, so the performance for Document AI could be inconsistent.

In general, higher-quality source material yields more accurate OCR text, so scanning at a higher DPI can improve results.
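
As a rough illustration of the DPI point, the effective resolution of a scan can be estimated from its pixel dimensions and the physical page size. The sketch below is plain arithmetic. The function names, the US Letter default page size, and the 300 DPI threshold are assumptions chosen for illustration (300 DPI is a common rule of thumb for OCR, not a documented Document AI requirement).

```python
def estimate_dpi(width_px: int, height_px: int,
                 page_width_in: float = 8.5,
                 page_height_in: float = 11.0) -> float:
    """Estimate scan resolution as the smaller of the horizontal and
    vertical pixels-per-inch values.

    The default page size (US Letter) is an assumption; pass the real
    physical size of the scanned page if it differs.
    """
    if page_width_in <= 0 or page_height_in <= 0:
        raise ValueError("page dimensions must be positive")
    return min(width_px / page_width_in, height_px / page_height_in)


def needs_rescan(width_px: int, height_px: int,
                 min_dpi: float = 300.0, **page_size: float) -> bool:
    """Return True when the estimated DPI is below min_dpi.

    min_dpi=300 is a hypothetical threshold, not an official limit.
    """
    return estimate_dpi(width_px, height_px, **page_size) < min_dpi
```

For example, a 1275 x 1650 pixel scan of a US Letter page works out to about 150 DPI, so `needs_rescan(1275, 1650)` would flag it, while a 2550 x 3300 pixel scan of the same page would not be flagged.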

If the PDF file has embedded text already, then you can also use the Native PDF Parsing feature in the pretrained-ocr-v1.2-2022-11-10 processor version. This repository has some sample code for how to use it.

https://github.com/GoogleCloudPlatform/document-ai-samples/tree/main/pdf-embedded-text
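
For orientation, a request that turns on native PDF parsing might look like the sketch below. It only builds the REST URL and JSON body and does not send anything. The field names (`rawDocument`, `processOptions.ocrConfig.enableNativePdfParsing`), the URL pattern, and the `v1` API version are written as I understand the Document AI REST API and should be checked against the current reference and the repository above. The helper names and the project, location, and processor IDs are hypothetical placeholders.

```python
import base64
import json


def build_process_request(pdf_bytes: bytes) -> dict:
    """Build a JSON body for a synchronous process call with native
    PDF parsing enabled (field names are assumptions; verify them)."""
    return {
        "rawDocument": {
            # The REST API expects file content as base64 text.
            "content": base64.b64encode(pdf_bytes).decode("ascii"),
            "mimeType": "application/pdf",
        },
        "processOptions": {
            "ocrConfig": {"enableNativePdfParsing": True},
        },
    }


def build_process_url(project_id: str, location: str, processor_id: str,
                      version: str = "pretrained-ocr-v1.2-2022-11-10",
                      api_version: str = "v1") -> str:
    """Build the endpoint URL for a specific processor version."""
    return (
        f"https://{location}-documentai.googleapis.com/{api_version}/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}/processorVersions/{version}:process"
    )


if __name__ == "__main__":
    # Placeholder values; replace with real IDs and real PDF bytes.
    body = build_process_request(b"%PDF-1.4 placeholder bytes")
    url = build_process_url("my-project", "us", "my-processor-id")
    print(url)
    print(json.dumps(body, indent=2))
```

Sending the body requires an authenticated POST (for example with an OAuth access token), which is left out here; the official client libraries handle authentication and are usually the simpler route.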




solution1 1 ACCEPTED 2023-01-24 19:09:22
