
Does AWS Comprehend classify images?

I am fairly new to AWS Comprehend. I know that it can do custom classification of documents (text files). Can AWS Comprehend also classify image files? Also, when training the model, is it necessary to put the entire document text in the CSV, or will just keywords do?

The reason I ask is that I want to build a custom classifier that can classify invoices, pay stubs, and a few other such document types, which are in image formats. Can Comprehend do this? If so, how?

I Googled quite a lot but couldn't find anything relevant. I'd really appreciate your help with this.

Thank you!

Comprehend doesn't do this natively, so you would have to build a solution. One thing you could try is to combine Amazon Textract (to extract the text from the documents) with Comprehend (to classify them).

The Textract FAQ calls this out as a common use case. I couldn't find an exact example of someone doing this, but it is directly called out in the documentation.

Amazon Comprehend only works on text.

Amazon Rekognition works on images.

AWS has all the building blocks to accomplish this, but you will have to configure and build it yourself. You can use Amazon Textract to extract all the text from a document, and then pass that text to Amazon Comprehend to classify the document type.
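
As a rough illustration of that Textract-then-Comprehend flow, here is a minimal sketch using boto3. It assumes the image is a single page (for example PNG or JPEG) small enough for Textract's synchronous DetectDocumentText call, and that a custom classifier has already been trained and deployed to a real-time endpoint. The function names, the file path argument, and the endpoint ARN are hypothetical placeholders, not anything from the AWS docs.

```python
def extract_text(textract_client, image_bytes):
    """Run Textract OCR on raw image bytes and return the detected lines as one string."""
    response = textract_client.detect_document_text(Document={"Bytes": image_bytes})
    # Textract returns PAGE, LINE and WORD blocks; LINE blocks already contain
    # the words in reading order, so only those are kept here.
    lines = [b["Text"] for b in response["Blocks"] if b["BlockType"] == "LINE"]
    return "\n".join(lines)


def classify_text(comprehend_client, text, endpoint_arn):
    """Send text to a custom classifier endpoint; return (label, score) or None."""
    if not text.strip():
        return None  # nothing was extracted, so there is nothing to classify
    response = comprehend_client.classify_document(Text=text, EndpointArn=endpoint_arn)
    classes = response.get("Classes", [])
    if not classes:
        return None
    best = max(classes, key=lambda c: c["Score"])
    return best["Name"], best["Score"]


def classify_image(path, endpoint_arn):
    """Hypothetical glue function: read an image file, OCR it, classify the text."""
    import boto3  # imported here so the helpers above can be tested without AWS

    with open(path, "rb") as f:
        image_bytes = f.read()
    text = extract_text(boto3.client("textract"), image_bytes)
    return classify_text(boto3.client("comprehend"), text, endpoint_arn)
```

The two helpers take the clients as arguments so they can be exercised with stubs. In real use, something like classify_image("invoice.png", "<your endpoint ARN>") would need valid AWS credentials and a running endpoint, and very long documents may need truncating to fit Comprehend's request size limits.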

Before you can do this, you need to train Comprehend to identify the document types correctly. Configure and train a custom classifier in Amazon Comprehend by supplying a CSV file in which each row holds a classification label (for example, the document type) followed by text that would appear in that type of document. If the documents are just forms, you can use Textract's Forms feature to get only the key-value pairs, then use the keys (the labels in the form) as the text for the custom classifier.
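
To make the training-data step concrete, here is a small sketch of how the CSV might be assembled. It assumes the multi-class CSV layout (no header row, label in the first column, document text in the second, one document per row) and assumes the blocks come from a Textract AnalyzeDocument call made with the FORMS feature. The helper names and the example labels (INVOICE, PAY_STUB) are hypothetical.

```python
import csv


def form_keys(blocks):
    """Collect the text of every form key (field label) from Textract FORMS blocks."""
    by_id = {b["Id"]: b for b in blocks}
    keys = []
    for block in blocks:
        is_key = (block["BlockType"] == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if not is_key:
            continue
        words = []
        for rel in block.get("Relationships", []):
            if rel["Type"] != "CHILD":
                continue  # VALUE relationships point at the field's value, not its label
            for child_id in rel["Ids"]:
                child = by_id.get(child_id)
                if child and child["BlockType"] == "WORD":
                    words.append(child["Text"])
        if words:
            keys.append(" ".join(words))
    return keys


def write_training_csv(examples, out_file):
    """Write (label, text) pairs as rows of 'label,text' with no header."""
    writer = csv.writer(out_file, lineterminator="\n")
    for label, text in examples:
        # Collapse newlines and repeated spaces so each document stays on one row.
        writer.writerow([label, " ".join(text.split())])
```

For example, the keys from each sample invoice could be joined into one string and written as ("INVOICE", "Invoice No: Bill To: Total Due:"). Whether keys alone give good enough accuracy compared with the full text is something to verify on your own documents.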

