Get char bounding boxes and confidence levels in ABBYY SDK

I convert an image using ABBYY's OCR SDK:

CSafePtr<IFRDocument> frDocument = ...;
frDocument->AddImageFile( "C:\\test\\input.tif" );
frDocument->Process( 0 );
frDocument->Export( "C:\\test\\output.rtf", FEF_RTF, 0 );

But now I also need the character bounding boxes and confidence levels. I can get them from Tesseract, so I assume it's possible with ABBYY's SDK as well.

How do I get the bounding boxes and confidence levels?

I eventually found how to do it: you need to use IPlainText::GetCharacterData().

GetCharacterData Method of the PlainText Object

This method returns the information about all characters in the text as a set of arrays: the page numbers on which the characters are located, the coordinates of characters' rectangles, and characters' confidences.

Example:

CSafePtr<IPlainText> plainText;
frDocument->get_PlainText( &plainText );
// One output array per attribute; element i of every array describes character i.
SAFEARRAY *pageNumbers = 0, *leftBorders = 0, *topBorders = 0, *rightBorders = 0, *bottomBorders = 0;
SAFEARRAY *confidences = 0, *isSuspicious = 0;
plainText->GetCharacterData( &pageNumbers, &leftBorders, &topBorders, &rightBorders, &bottomBorders, &confidences, &isSuspicious );
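
Because the method hands back one array per attribute, it is often easier to work with one record per character. The following is a minimal, portable sketch of that step, not part of the ABBYY SDK. It assumes the contents of each SAFEARRAY have already been copied into a std::vector (for example with the Win32 SafeArrayAccessData and SafeArrayUnaccessData functions) and that the values fit in int and bool; the element types are an assumption, so check them against the SDK documentation. CharInfo, zipCharacterData, and lowConfidenceIndices are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical per-character record; this is not an ABBYY SDK type.
struct CharInfo {
    int page;
    int left, top, right, bottom;
    int confidence;
    bool suspicious;
};

// Combine the seven parallel arrays into one record per character.
// Throws if the arrays do not all have the same length.
std::vector<CharInfo> zipCharacterData(
    const std::vector<int>& pageNumbers,
    const std::vector<int>& leftBorders,
    const std::vector<int>& topBorders,
    const std::vector<int>& rightBorders,
    const std::vector<int>& bottomBorders,
    const std::vector<int>& confidences,
    const std::vector<bool>& isSuspicious)
{
    const std::size_t n = pageNumbers.size();
    if (leftBorders.size() != n || topBorders.size() != n ||
        rightBorders.size() != n || bottomBorders.size() != n ||
        confidences.size() != n || isSuspicious.size() != n) {
        throw std::invalid_argument("character arrays differ in length");
    }

    std::vector<CharInfo> chars;
    chars.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        chars.push_back(CharInfo{pageNumbers[i], leftBorders[i], topBorders[i],
                                 rightBorders[i], bottomBorders[i],
                                 confidences[i], isSuspicious[i]});
    }
    assert(chars.size() == n);
    return chars;
}

// Return the indices of characters whose confidence is below the threshold.
// The threshold is caller-chosen; the confidence scale is an assumption here.
std::vector<std::size_t> lowConfidenceIndices(
    const std::vector<CharInfo>& chars, int threshold)
{
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < chars.size(); ++i) {
        if (chars[i].confidence < threshold) {
            result.push_back(i);
        }
    }
    return result;
}
```

With the records in hand, a call such as lowConfidenceIndices(chars, 50) picks out the characters worth reviewing; the value 50 is only a placeholder, since the source does not state the range of the confidence values.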
