Why are Tesseract's bounding boxes not aligned with the text in the image?

Question

I'm using the tesseract R package to recognize text within an image file. However, when I plot the bounding box for a word, the coordinates don't seem to be right.

Why is the bounding box for the word "This" not aligned with the text "This" in the image?
Is there an easier way to plot all bounding box rectangles on the image?

library(tesseract)
library(magick)
library(tidyverse)

text <- tesseract::ocr_data("http://jeroen.github.io/images/testocr.png")
image <- image_read("http://jeroen.github.io/images/testocr.png")

text <- text %>% 
  separate(bbox, c("x1", "y1", "x2", "y2"), ",") %>% 
  mutate(
    x1 = as.numeric(x1),
    y1 = as.numeric(y1),
    x2 = as.numeric(x2),
    y2 = as.numeric(y2)
  )

plot(image)
rect(
  xleft = text$x1[1], 
  ybottom = text$y1[1], 
  xright = text$x2[1], 
  ytop = text$y2[1])

Answer 1

This is simply because the x, y coordinates of an image are counted from the top left, whereas rect counts from the bottom left. The image is 480 pixels tall, so subtracting each y value from 480 flips it into the plot's coordinate system:

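plot(image)
# In image coordinates y1 is the top edge of the box and y2 the bottom edge,
# so after flipping, y2 gives ybottom and y1 gives ytop.
rect(
  xleft = text$x1[1], 
  ybottom = 480 - text$y2[1], 
  xright = text$x2[1], 
  ytop = 480 - text$y1[1])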

Or, to show that this generalizes to every word and to any image height:

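plot(image)

# Read the height from the image instead of hard-coding 480.
img_height <- magick::image_info(image)$height

# rect is vectorized, so a single call draws one box per row of text.
rect(
  xleft = text$x1, 
  ybottom = img_height - text$y2, 
  xright = text$x2, 
  ytop = img_height - text$y1,
  border = sample(128, nrow(text), replace = TRUE))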
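The flip described above can also be packaged as a small function. The following is only a sketch in base R: flip_bbox is a hypothetical helper name (it is not part of tesseract or magick), and the two bbox strings are made-up values for illustration, not actual OCR output. It assumes the bbox strings have the "x1,y1,x2,y2" form used in the question.

```r
# Hypothetical helper (not part of tesseract or magick): convert "x1,y1,x2,y2"
# bbox strings, measured from the top-left corner of the image, into the
# bottom-left-origin coordinates that rect() expects.
flip_bbox <- function(bbox, img_height) {
  parts <- strsplit(bbox, ",", fixed = TRUE)
  coords <- matrix(as.numeric(unlist(parts)), ncol = 4, byrow = TRUE)
  data.frame(
    xleft   = coords[, 1],
    ybottom = img_height - coords[, 4],  # image bottom edge (y2), flipped
    xright  = coords[, 3],
    ytop    = img_height - coords[, 2]   # image top edge (y1), flipped
  )
}

# Made-up bbox strings for illustration, on an image assumed to be 480 px tall.
boxes <- flip_bbox(c("10,20,50,40", "60,20,90,45"), img_height = 480)
print(boxes)
```

Because the result has one column per rect argument, it could be passed along as do.call(rect, boxes) after plot(image), using the bbox column as returned by ocr_data (before it is split with separate) and the height from magick::image_info(image)$height.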
