
Amazon Textract - How to define my key-value pairs

I have tried Textract and I can see that it extracts a few interesting key-value pairs.

I have an image dataset in which each image is annotated with a set of domain-specific key-value pairs, which are different from what Textract found.

Is there any way to make Textract look for my key-value pairs? Some kind of transfer learning, or a specific configuration of the tool?

No. There is no way to change how Textract predicts text or identifies relationships between text elements. You can keep adding your images and forms, and Textract will (in theory) train itself on them, but I doubt it will help much. You can instead take the raw text that is detected and write your own script to put it into relationships. Note that Textract returns the raw text in the order in which it finds it on the image/PDF, so it is fairly easy to come up with your own logic to map it however you want.
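As a rough illustration of the "write your own script" idea, here is a minimal sketch. It assumes a response shaped like the one returned by `boto3.client("textract").detect_document_text(...)`, i.e. a dict with a `Blocks` list whose `LINE` entries carry a `Text` field. The helper names, the sample data, and the pairing rule (the value is the rest of the key's line, or else the next line) are all made up for the example; real documents will probably need logic tuned to their layout.

```python
from typing import Dict, List, Optional


def extract_lines(response: dict) -> List[str]:
    """Return the text of every LINE block, in the order the response lists them."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def map_key_values(lines: List[str], keys: List[str]) -> Dict[str, Optional[str]]:
    """Pair each domain-specific key with the text that follows it.

    Hypothetical rule: the value is the rest of the line after the key
    (and an optional colon); if that is empty, the next line is used.
    """
    result: Dict[str, Optional[str]] = {key: None for key in keys}
    for i, line in enumerate(lines):
        for key in keys:
            if line.lower().startswith(key.lower()):
                rest = line[len(key):].lstrip(" :").strip()
                if not rest and i + 1 < len(lines):
                    rest = lines[i + 1].strip()
                result[key] = rest or None
    return result


# Hand-written sample shaped like a detect_document_text response (not real output).
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Patient ID: 12345"},
        {"BlockType": "LINE", "Text": "Diagnosis"},
        {"BlockType": "LINE", "Text": "Fractured wrist"},
        {"BlockType": "WORD", "Text": "Patient"},
    ]
}

pairs = map_key_values(extract_lines(sample_response), ["Patient ID", "Diagnosis"])
print(pairs)
```

Since the dataset is already annotated with the expected key-value pairs, those annotations can serve as a test set for whatever pairing rule you settle on.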
