Group separate strings of an OCR result based on their coordinates in the image

Question

I use easyocr to read the key figures from an image (the display output of a measuring instrument). Because the characters in the picture have different proportions, some characters or strings that are meant to be one unit, such as a value and its unit (e.g. "230 Volt"), are recognised as separate strings ("230", "Volt"). Another example is multiline strings, where each line is recognised as a separate string. To illustrate this I prepared a picture. It's a little exaggerated, but I hope it's easy to understand.

Example picture to illustrate the problem

What I want to do

I am trying to find the elements that are on the same line or in the same column (and very close to each other) and concatenate their strings.

Example data to handle (output of easyocr)

(The coordinates run clockwise from the top-left corner to the bottom-left corner.)

([[239, 31], [563, 31], [563, 195], [239, 195]], '230', 0.7262734770774841)
([[591, 147], [661, 147], [661, 183], [591, 183]], 'Volt', 0.983400155647826)
([[801, 171], [1039, 171], [1039, 239], [801, 239]], 'This is a', 0.9870205241250117)
([[802, 256], [1232, 256], [1232, 328], [802, 328]], 'sentence with', 0.9997852752308181)
([[805, 341], [1065, 341], [1065, 427], [805, 427]], 'multiple', 0.9999849956753041)
([[212, 427], [311, 427], [311, 479], [212, 479]], 'Text', 0.9999873638153076)
([[362, 428], [474, 428], [474, 476], [362, 476]], 'More', 0.9999922513961792)
([[505, 413], [643, 413], [643, 479], [505, 479]], 'Text', 0.9999755620956421)
([[798, 428], [1136, 428], [1136, 500], [798, 500]], 'linebreaks.', 0.8525006562415545)
([[317, 601], [479, 601], [479, 669], [317, 669]], 'More', 0.9999911785125732)
([[529, 603], [665, 603], [665, 669], [529, 669]], 'Text', 0.9757571413464591)
([[699, 603], [841, 603], [841, 669], [699, 669]], 'with', 0.9999924302101135)
([[950, 608], [1182, 608], [1182, 683], [950, 683]], 'spaces.', 0.8026406194725301)

Output as DataFrame

I tried to handle the result as a DataFrame and split each point into its x and y values. I thought this view would help me, but I am still stuck.

             Text     Score  tl_x  tl_y  tr_x  tr_y  bl_x  bl_y  br_x  br_y
0             230  0.726273   239    31   563    31   239   195   563   195
1            Volt  0.983400   591   147   661   147   591   183   661   183
2       This is a  0.987021   801   171  1039   171   801   239  1039   239
3   sentence with  0.999785   802   256  1232   256   802   328  1232   328
4        multiple  0.999985   805   341  1065   341   805   427  1065   427
5            Text  0.999987   212   427   311   427   212   479   311   479
6            More  0.999992   362   428   474   428   362   476   474   476
7            Text  0.999976   505   413   643   413   505   479   643   479
8     linebreaks.  0.852501   798   428  1136   428   798   500  1136   500
9            More  0.999991   317   601   479   601   317   669   479   669
10           Text  0.975757   529   603   665   603   529   669   665   669
11           with  0.999992   699   603   841   603   699   669   841   669
12        spaces.  0.802641   950   608  1182   608   950   683  1182   683
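
A table like the one above can be built from the raw easyocr tuples by flattening each set of corner points into named columns. The following is only a minimal sketch: the helper name `to_record` is made up for illustration, only two sample rows are included, and the corner order (top-left, top-right, bottom-right, bottom-left) is taken from the note in the question.

```python
# Two of the easyocr results from the question: (corner points, text, confidence).
ocr_result = [
    ([[239, 31], [563, 31], [563, 195], [239, 195]], '230', 0.7262734770774841),
    ([[591, 147], [661, 147], [661, 183], [591, 183]], 'Volt', 0.983400155647826),
]


def to_record(points, text, score):
    """Flatten one OCR result into a dict (hypothetical helper).

    Assumes the corners are listed clockwise starting at the top-left,
    as stated in the question.
    """
    (tl_x, tl_y), (tr_x, tr_y), (br_x, br_y), (bl_x, bl_y) = points
    # Keys are ordered like the columns of the table in the question.
    return {
        "Text": text, "Score": score,
        "tl_x": tl_x, "tl_y": tl_y, "tr_x": tr_x, "tr_y": tr_y,
        "bl_x": bl_x, "bl_y": bl_y, "br_x": br_x, "br_y": br_y,
    }


records = [to_record(*item) for item in ocr_result]
for record in records:
    print(record)
```

If pandas is available, passing `records` to `pd.DataFrame(records)` should produce one row per string with the columns in the key order used here.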

What output I want to achieve:

I'm happy with a list of concatenated strings:

["230 Volt", "This is a sentence with multiple linebreaks.","More text with spaces", ...]

What I tried

I tried to sort column values that are exactly or almost the same into bins with pd.cut() and to access the strings via the bin name, but I have no idea how to code this so that it is not hardcoded and can run automatically on different pictures.
I also tried np.isclose() with a relative tolerance to group the values together.
I tried looping with a lot of if-conditions, but nothing works.

Help

I am sure there is a very simple solution to this; I am just not good enough at programming yet to see it.

What would be the best approach to find the closest neighbours (on the same line/column) and group them together?
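
One possible way to make the "same line or column and very close" idea concrete is to treat every box as a node, connect two boxes when they are close neighbours, and then read out the connected groups. The sketch below is not a tested general solution. It assumes axis-aligned boxes, uses only the standard library, and all function names and the two factors `x_factor` and `y_factor` are illustrative choices. The gap limits are expressed relative to the smaller box height, and the defaults (2.0 and 0.5) were picked by hand to fit the sample data above, so they would likely need tuning for other pictures.

```python
from itertools import combinations

ocr_result = [
    ([[239, 31], [563, 31], [563, 195], [239, 195]], '230', 0.7262734770774841),
    ([[591, 147], [661, 147], [661, 183], [591, 183]], 'Volt', 0.983400155647826),
    ([[801, 171], [1039, 171], [1039, 239], [801, 239]], 'This is a', 0.9870205241250117),
    ([[802, 256], [1232, 256], [1232, 328], [802, 328]], 'sentence with', 0.9997852752308181),
    ([[805, 341], [1065, 341], [1065, 427], [805, 427]], 'multiple', 0.9999849956753041),
    ([[212, 427], [311, 427], [311, 479], [212, 479]], 'Text', 0.9999873638153076),
    ([[362, 428], [474, 428], [474, 476], [362, 476]], 'More', 0.9999922513961792),
    ([[505, 413], [643, 413], [643, 479], [505, 479]], 'Text', 0.9999755620956421),
    ([[798, 428], [1136, 428], [1136, 500], [798, 500]], 'linebreaks.', 0.8525006562415545),
    ([[317, 601], [479, 601], [479, 669], [317, 669]], 'More', 0.9999911785125732),
    ([[529, 603], [665, 603], [665, 669], [529, 669]], 'Text', 0.9757571413464591),
    ([[699, 603], [841, 603], [841, 669], [699, 669]], 'with', 0.9999924302101135),
    ([[950, 608], [1182, 608], [1182, 683], [950, 683]], 'spaces.', 0.8026406194725301),
]


def to_rect(points):
    """Reduce four corner points to (left, top, right, bottom)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


def gap(lo_a, hi_a, lo_b, hi_b):
    """Distance between two 1-D intervals; negative when they overlap."""
    return max(lo_a, lo_b) - min(hi_a, hi_b)


def are_neighbours(a, b, x_factor, y_factor):
    """True if two rects share a line or a column and are close enough."""
    min_height = min(a[3] - a[1], b[3] - b[1])
    x_gap = gap(a[0], a[2], b[0], b[2])
    y_gap = gap(a[1], a[3], b[1], b[3])
    same_line = y_gap < 0 and x_gap <= x_factor * min_height
    same_column = x_gap < 0 and y_gap <= y_factor * min_height
    return same_line or same_column


def join_group(indices, rects, results):
    """Join one group's texts row by row, each row from left to right."""
    rows = []
    for i in sorted(indices, key=lambda i: rects[i][1] + rects[i][3]):
        first = rects[rows[-1][0]] if rows else None
        # Same row if the box overlaps vertically with the row's first box.
        if first and gap(first[1], first[3], rects[i][1], rects[i][3]) < 0:
            rows[-1].append(i)
        else:
            rows.append([i])
    ordered = [i for row in rows for i in sorted(row, key=lambda i: rects[i][0])]
    return " ".join(results[i][1] for i in ordered)


def group_ocr_strings(results, x_factor=2.0, y_factor=0.5):
    """Merge neighbouring boxes (union-find) and return the joined strings."""
    rects = [to_rect(points) for points, _text, _score in results]
    parent = list(range(len(rects)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(rects)), 2):
        if are_neighbours(rects[i], rects[j], x_factor, y_factor):
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(rects)):
        groups.setdefault(find(i), []).append(i)
    # Order groups top to bottom (then left to right) by their topmost box.
    ordered = sorted(groups.values(),
                     key=lambda g: min((rects[i][1], rects[i][0]) for i in g))
    return [join_group(g, rects, results) for g in ordered]


print(group_ocr_strings(ocr_result))
```

On the sample data this prints `['230 Volt', 'This is a sentence with multiple linebreaks.', 'Text More Text', 'More Text with spaces.']`. The margins are fairly tight (for example, "with" and "spaces." are 109 px apart while "Volt" and "This is a" are 140 px apart), which is why the limits are scaled by text height instead of being fixed pixel values.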

Answer 1

Will the text always follow a similar pattern? In the case above you get "230 Volt"; in another example, would you get, say, "320 Volt", so that the result is always formatted as "x Volt"?

If so:

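import pandas as pd

# df is the DataFrame shown in the question; join its "Text" column into one string
c1_str = ' '.join(df["Text"])

# Insert a '|' marker after each word that ends a unit
c1_str = c1_str.replace('linebreaks', 'linebreaks|')
c1_str = c1_str.replace('Volt', 'Volt|')

# Split at the markers to get one list entry per unit
mask = c1_str.split('|')
print(mask)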

Gives:

['230 Volt', ' This is a sentence with multiple linebreaks', '. More text with spaces']

Explanation:

Convert the df column to a single string and manipulate it according to the pattern you want by inserting split markers. Then convert the string to a list by splitting on the marker |.
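
The marker idea described above could be wrapped in a small function so that the terminating words are passed in instead of being hardcoded. This is only a sketch: the name `split_after_markers` and the word list are made up for illustration, and it assumes the OCR text itself never contains the "|" character. Note that it ignores the coordinates entirely, so the strings keep whatever order they are passed in.

```python
import re


def split_after_markers(words, markers):
    """Join words and split the text after every marker (hypothetical helper)."""
    text = " ".join(words)
    # Build one alternation of the literal marker strings.
    pattern = "|".join(re.escape(marker) for marker in markers)
    # Append '|' after each marker occurrence, then split on it.
    marked = re.sub(f"({pattern})", r"\1|", text)
    return [part.strip() for part in marked.split("|") if part.strip()]


# Illustrative input: the units from the question, already in reading order.
words = ["230", "Volt", "This is a", "sentence with", "multiple",
         "linebreaks.", "More", "Text", "with", "spaces."]
print(split_after_markers(words, ["Volt", "linebreaks."]))
```

This only works when every unit ends in a word known in advance, which is the condition the answer asks about at the start.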

Answer 1 was posted on 2022-11-29 16:09:27.