Azure Search Service built-in OCR skill performing worse than Cognitive Service standalone OCR
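I tried to create an AI enrichment pipeline for PDFs using an indexer from Azure Search Service, following this tutorial: https://learn.microsoft.com/en-us/azure/search/cognitive-search-quickstart-blob. As in the example, I used the built-in OCR skill. The solution (the OCR in particular) did not perform as well as I had hoped.
I decided to isolate and test just the OCR step, and possibly find a way to improve it. I ran a simple standalone OCR from Cognitive Services - Computer Vision. As far as I know, there should be no difference between the two, because Azure Search Service uses the same OCR from Cognitive Services.
Surprisingly, the OCR used in Azure Search Service performed noticeably worse than the OCR in Cognitive Services - Computer Vision.
Both OCRs were run on the same test PDFs. The results for the simplest one (a single-page PDF whose text is embedded as images) are shown below; the difference in output format should not matter.
Output from the Azure Search Service indexer: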
"\nThis is a normal test text. It does not need OCR \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n This is a text 2. Text size vs image size 2. This is a text 2. Text size vs image size 2. \n\n This is a text 3. Text size vs image size 4. This is a text 3. Text size vs image size 4. This is a text 3. Text size vs image size 4. This is a text 3. Text size vs image size 4. \n\n This is a text 4. Text size vs image size . This is a text 4. Text size vs image size. This is a text 4. Text size vs image size . This is a text 4. Text size vs image size . This is a text 4. Text size vs image size . This is a text 4. Text size vs image size . This is a text 4. Text size vs image size . This is a text 4. Text size vs image size . This is a text 4. Text size vs image size . This is a text 4. Text size vs image size . This is a text 4. Text size vs image size . This is a text 4. Text size vs image size . This is a text 4. Text size vs image size .| \n\n\n"
Output from Azure Cognitive Services - Computer Vision OCR:
"This is a normal test text. It does not need OCR",
"This is a text 1. Text size vs image size 1.",
"This is a text 2. Text size vs image size 2.",
"This is a text 2. Text size vs image size 2.",
"This is a text 3. Text size vs image size 4.",
"This is a text 3. Text size vs image size 4.",
"This is a text 3. Text size vs image size 4.",
"This is a text 3. Text size vs image size 4.",
"This is a text 4. Text size vs image size . This is a text 4. Text size vs image size. This is",
"a text 4. Text size vs image size . This is a text 4. Text size vs image size . This is a text",
"4. Text size vs image size . This is a text 4. Text size vs image size . This is a text 4.",
"Text size vs image size . This is a text 4. Text size vs image size . This is a text 4. Text",
"size vs image size . This is a text 4. Text size vs image size . This is a text 4. Text size",
"vs image size . This is a text 4. Text size vs image size . This is a text 4. Text size vs",
"image size ."
As can be seen, the Search Service OCR completely missed text 1.
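The comparison can also be checked mechanically rather than by eye. The following is a minimal sketch, assuming it is enough to collapse whitespace and test whether each Computer Vision line occurs in the indexer's merged text. The helper names `normalize` and `missing_lines` are made up for illustration, and the inputs are shortened excerpts of the two outputs, not the full strings.

```python
import re


def normalize(text: str) -> str:
    """Collapse every run of whitespace (including newlines) to one space."""
    return re.sub(r"\s+", " ", text).strip()


def missing_lines(reference_lines, candidate_text):
    """Return the reference lines that do not occur in candidate_text."""
    haystack = normalize(candidate_text)
    missing = []
    for line in reference_lines:
        needle = normalize(line)
        # Skip empty lines and avoid reporting the same line twice.
        if needle and needle not in haystack and needle not in missing:
            missing.append(needle)
    return missing


# Shortened excerpt of the Azure Search Service indexer output.
indexer_text = (
    "\nThis is a normal test text. It does not need OCR \n\n \n\n "
    "This is a text 2. Text size vs image size 2. "
    "This is a text 2. Text size vs image size 2. \n\n "
    "This is a text 3. Text size vs image size 4. \n\n"
)

# Shortened excerpt of the Computer Vision OCR output (one entry per line).
vision_lines = [
    "This is a normal test text. It does not need OCR",
    "This is a text 1. Text size vs image size 1.",
    "This is a text 2. Text size vs image size 2.",
    "This is a text 3. Text size vs image size 4.",
]

print(missing_lines(vision_lines, indexer_text))
# prints ['This is a text 1. Text size vs image size 1.']
```

A substring check like this only catches text that is dropped entirely; it would not flag a line that was recognized with small character errors.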
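I'm not sure where this difference comes from. My guess is that the Cognitive Services OCR treats the whole page as a single image, whereas the Search Service OCR extracts the images embedded in the PDF, processes them separately, and struggles when the text in a given image is too large. Is this a reasonable direction? If so, how can I handle cases like this with a Search Service indexer?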
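One way to probe this hypothesis would be to change how the indexer hands images to the OCR skill. The sketch below builds an indexer definition with `imageAction` set to `generateNormalizedImagePerPage`, which for PDFs renders each page as one image instead of extracting the embedded images. The resource names (`pdf-indexer`, `pdf-datasource`, `pdf-skillset`, `pdf-index`) and the helper `build_indexer_definition` are placeholders, not taken from my actual setup, and whether this setting closes the gap is untested here.

```python
import json


def build_indexer_definition(name, data_source, skillset, index):
    """Build an indexer definition that renders each PDF page as one image.

    All four argument values are placeholders for illustration.
    """
    return {
        "name": name,
        "dataSourceName": data_source,
        "skillsetName": skillset,
        "targetIndexName": index,
        "parameters": {
            "configuration": {
                # Extract both document content and metadata.
                "dataToExtract": "contentAndMetadata",
                # Assumption under test: one rendered image per PDF page,
                # rather than one image per embedded picture.
                "imageAction": "generateNormalizedImagePerPage",
            }
        },
    }


definition = build_indexer_definition(
    "pdf-indexer", "pdf-datasource", "pdf-skillset", "pdf-index"
)
print(json.dumps(definition, indent=2))
```

The resulting JSON would be sent as the body when creating or updating the indexer. Comparing the OCR output under this setting with the output under `generateNormalizedImages` on the same test PDF should show whether per-image extraction is what causes text 1 to be dropped.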