ai-form-recognizer vs. cognitiveservices-computervision

Question

I'm currently using @azure/ai-form-recognizer 3.2.0 to OCR images and PDFs, like this:

const poller = await MsClient.beginRecognizeInvoices(stream, {
    onProgress: (state) => {}
});
const [ocrResult] = await poller.pollUntilDone();

How does this differ from, or relate to, @azure/cognitiveservices-computervision? I'm only interested in OCR.

Answer 1

There are several key differences between the two. Form Recognizer's primary goal is to structure data from forms and other digitized documents for further processing. The key point is that Form Recognizer provides features that contextualize the information read from those documents, which stand-alone optical character recognition does not. From the Form Recognizer documentation (emphasis mine):

Azure Form Recognizer is a cloud-based Azure Applied AI Service that uses machine-learning models to extract and analyze form fields, text, and tables from your documents. Form Recognizer analyzes your forms and documents, extracts text and data, maps field relationships as key-value pairs, and returns a structured JSON output. You quickly get accurate results that are tailored to your specific content without excessive manual intervention or extensive data science expertise. Use Form Recognizer to automate your data processing in applications and workflows, enhance data-driven strategies, and enrich document search capabilities.
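To make the "key-value pairs" point concrete, here is a minimal sketch of reading both the structured fields and the plain OCR lines from one recognized form. It assumes the 3.x result shape (a `fields` map whose entries carry `value` and `valueData.text`, plus `pages[].lines[].text`); the helper names and the sample object are hypothetical, so check them against the SDK version you actually run.

```javascript
// Sketch only: the object shape mirrors what the 3.x SDK is understood to return
// for a recognized form. `sampleForm` is a hypothetical stand-in for `ocrResult`.

/** Collect the named fields of one recognized form as { name: value } pairs. */
function fieldsToObject(form) {
  const out = {};
  for (const [name, field] of Object.entries(form.fields ?? {})) {
    // Prefer the typed value; fall back to the raw text read from the page.
    out[name] = field?.value ?? field?.valueData?.text ?? null;
  }
  return out;
}

/** Collect the plain OCR text of the same form, one string per line. */
function linesOfText(form) {
  return (form.pages ?? []).flatMap((page) =>
    (page.lines ?? []).map((line) => line.text)
  );
}

// Hypothetical sample data, not real service output.
const sampleForm = {
  fields: {
    VendorName: { value: "Contoso", valueData: { text: "Contoso" } },
    InvoiceTotal: { value: 110, valueData: { text: "$110.00" } },
  },
  pages: [{ lines: [{ text: "Contoso" }, { text: "Total $110.00" }] }],
};

console.log(fieldsToObject(sampleForm)); // structured, contextualized data
console.log(linesOfText(sampleForm));    // plain OCR text
```

If only the second function matters to you, the structured field extraction is more than you need.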

On the other hand, Azure Computer Vision provides three distinct features. While the OCR service described below resembles Form Recognizer, it is more general-purpose: it does not contextualize key/value pairs as robustly as Form Recognizer does. The service also provides higher-level AI functionality for processing images and video to identify people/celebrities, landmarks, and common objects in them (among others). From the Computer Vision documentation:

Service	Description
Optical Character Recognition (OCR)	The Optical Character Recognition (OCR) service extracts text from images. You can use the new Read API to extract printed and handwritten text from photos and documents. It uses deep-learning-based models and works with text on a variety of surfaces and backgrounds. These include business documents, invoices, receipts, posters, business cards, letters, and whiteboards. The OCR APIs support extracting printed text in several languages...
Image Analysis	The Image Analysis service extracts many visual features from images, such as objects, faces, adult content, and auto-generated text descriptions. Follow the Image Analysis quickstart to get started.
Spatial Analysis	The Spatial Analysis service analyzes the presence and movement of people on a video feed and produces events that other systems can respond to. Install the Spatial Analysis container to get started.
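For comparison, a Read API result is essentially pages of text lines with no field mapping. The sketch below flattens such a result into a single string. It assumes the result object has `status` and `analyzeResult.readResults[].lines[].text`, which is the shape the Computer Vision quickstarts appear to use; `readResultToText` and `sampleRead` are hypothetical names for illustration.

```javascript
// Sketch only: assumes the Read operation result looks like
// { status, analyzeResult: { readResults: [{ page, lines: [{ text }] }] } }.

/** Join the text of every line on every page of a finished Read result. */
function readResultToText(readOperationResult) {
  if (readOperationResult.status !== "succeeded") {
    throw new Error(`Read operation not finished: ${readOperationResult.status}`);
  }
  const pages = readOperationResult.analyzeResult?.readResults ?? [];
  return pages
    // Lines within a page are separated by a newline...
    .map((page) => (page.lines ?? []).map((line) => line.text).join("\n"))
    // ...and pages are separated by a blank line.
    .join("\n\n");
}

// Hypothetical sample data, not real service output.
const sampleRead = {
  status: "succeeded",
  analyzeResult: {
    readResults: [
      { page: 1, lines: [{ text: "INVOICE" }, { text: "Total $110.00" }] },
      { page: 2, lines: [{ text: "Thank you" }] },
    ],
  },
};

console.log(readResultToText(sampleRead));
```

Nothing in this output says which line is the vendor or the total; that interpretation is what Form Recognizer's prebuilt models add.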

At first glance, there is some overlap between the two, but on closer inspection their primary use cases are clearly delineated.

Answer 1: accepted, 2022-02-11 22:37:55
