
Why are the Bounding Boxes from Tesseract not aligned on the image text?

I'm using the tesseract R package to recognize text within an image file. However, when I plot the bounding box for a word, the rectangle does not land on that word.

  1. Why is the bounding box for the word "This" not aligned with the text "This" in the image?
  2. Is there an easier way to plot all bounding box rectangles on the image?

library(tesseract)
library(magick)
library(tidyverse)

text <- tesseract::ocr_data("http://jeroen.github.io/images/testocr.png")
image <- image_read("http://jeroen.github.io/images/testocr.png")

text <- text %>% 
  separate(bbox, c("x1", "y1", "x2", "y2"), ",") %>% 
  mutate(
    x1 = as.numeric(x1),
    y1 = as.numeric(y1),
    x2 = as.numeric(x2),
    y2 = as.numeric(y2)
  )

plot(image)
rect(
  xleft = text$x1[1], 
  ybottom = text$y1[1], 
  xright = text$x2[1], 
  ytop = text$y2[1])

This is simply because image coordinates are counted from the top left, with y increasing downward, whereas rect counts from the bottom left, with y increasing upward. The x values can be used as they are; each y value has to be subtracted from the image height. The image is 480 pixels tall, so we can do:

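plot(image)
# Flip y: image rows count down from the top, rect() counts up from the bottom
rect(
  xleft = text$x1[1], 
  ybottom = 480 - text$y2[1],  # y2 is the lower edge in image coordinates
  xright = text$x2[1], 
  ytop = 480 - text$y1[1])     # y1 is the upper edge in image coordinates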


Or, to generalize this to every word and to an image of any height:

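plot(image)

# Read the height from the image instead of hard-coding 480
img_height <- magick::image_info(image)$height

rect(
  xleft = text$x1, 
  ybottom = img_height - text$y2, 
  xright = text$x2, 
  ytop = img_height - text$y1,
  # replace = TRUE avoids an error when there are more than 128 words
  border = sample(128, nrow(text), replace = TRUE))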
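
The same flip can be wrapped in a small function, so that the parsing and the subtraction happen in one place. This is only a sketch in base R: bbox_to_rect is a hypothetical helper name, not something provided by tesseract or magick, and it assumes the bbox strings have the form "x1,y1,x2,y2" with (x1, y1) as the top-left corner. The two boxes at the end are made-up values for illustration.

```r
# Hypothetical helper (not part of tesseract or magick): turn "x1,y1,x2,y2"
# bbox strings, measured from the top-left corner, into rect() arguments,
# which are measured from the bottom-left corner.
bbox_to_rect <- function(bbox, img_height) {
  # Split each string on commas and stack the pieces into a numeric matrix
  parts <- do.call(rbind, lapply(strsplit(bbox, ",", fixed = TRUE), as.numeric))
  data.frame(
    xleft   = parts[, 1],
    ybottom = img_height - parts[, 4],  # y2 is the lower edge in image coordinates
    xright  = parts[, 3],
    ytop    = img_height - parts[, 2]   # y1 is the upper edge in image coordinates
  )
}

# Made-up boxes for a 480-pixel-tall image, for illustration only
boxes <- bbox_to_rect(c("36,92,96,116", "109,92,121,116"), img_height = 480)
print(boxes)
```

With the unmodified result of ocr_data(), where bbox is still a single string column, something like do.call(rect, bbox_to_rect(text$bbox, magick::image_info(image)$height)) after plot(image) should draw every box in one call.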

