Detect text line by line in Google Cloud Vision using Python

Question

This is the code I used:

import os, io
from google.cloud import vision

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r'ServiceAccountToken.json'
client = vision.ImageAnnotatorClient()


def detectText(img):
    with io.open(img,'rb') as image_file:
        content = image_file.read()
    image = vision.Image(content=content)
    response = client.text_detection(image=image)
    # The first annotation holds the full text of the image.
    texts = response.text_annotations[0].description
    print(texts)


FILE_NAME = 'Scan_20220819.png'
FOLDER_NAME = r'C:\Users\RafiAbrar\Downloads\vision ai'
detectText(os.path.join(FOLDER_NAME, FILE_NAME))

The output is:

A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
Haemoglobin
ESR (Capillary Method - Alifax)
Total Count
Red Blood Cells
Platelets
White Blood Cells
Differential Count
Neutrophil
Lymphocyte
Monocyte
Eosinophil
Basophil
Red Cell Indices
P.C.V. (Hct)
M.C.V.
M.C.H.
M.C.H.C.
R.D.W.-C.V.
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
Result
11.5 g/dL
12 mm in 1st hour
3.7 X 10^12/L
257 X 10^9/L
8.0 X 10^9/L
62%
32 %
04%
02 %
00%
33%
88 fL
31 pg
35 g/dL
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
12.3%
+
Reference Value
F 11.5 - 15.5, M 13.5 - 18.0 g/dL
< 30 mm in 1st hour
F 3.8-4.8, M 4.5-5.5 X 0^12/L
150-450 X 10^9/L
04.00 11.00 X 10^9/L
40 - 75%
20 - 50 %
02-10%
01-06 %
<01 %
F: 36-46 %, M: 40 - 50 %
82 - 100 fl
27-32 pg
30-35 g/dL
11.60 14.00 %
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs

The text is not detected line by line.

I'm expecting output like: "parameter" (space) "value" (space) "unit"

Like:

Haemoglobin 11.5 g/dL

ESR 12 mm in 1st hour

RBC 3.7 X 10^12/L

..... and the list goes on

So what I'm asking is:

Either help me find a way to OCR text line by line using Python,
or help me find an appropriate logic (slicing/loop/parsing/anything) for the given output so that I can merge every test report parameter with its corresponding value.

Answer 1

Well, after 2 days of struggling, I made my own version using string methods and lists. I saved my GCV output (with coordinates) in a text file and used that file for line-by-line alignment.

GCV code:

import os, io
from google.cloud import vision

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r'ServiceAccountToken.json'
client = vision.ImageAnnotatorClient()


def Detect_Text(img):
    with io.open(img,'rb') as image_file:
        content = image_file.read()
    image = vision.Image(content=content)
    response = client.text_detection(image=image)
    texts = response.text_annotations
    for text in texts:
        # Print each annotation as "(x,y) (x,y) (x,y) (x,y) description *".
        # The trailing "*" marks the end of an entry for the alignment script.
        vertices = ['(%s,%s)' % (v.x, v.y) for v in text.bounding_poly.vertices]
        print(" ".join(vertices), text.description, "*")


FILE_NAME = 'complex.png'
FOLDER_NAME = r'C:\Users\RafiAbrar\Downloads\vision ai'
Detect_Text(os.path.join(FOLDER_NAME, FILE_NAME))
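As an alternative to printing the coordinates and saving them to a text file, the same information could be collected in memory. The following is only a sketch, not the code from this answer: `annotations_to_words` and `fake_annotation` are hypothetical names, and the stand-in objects merely copy the attribute names of the Vision response (`description`, `bounding_poly.vertices`, `x`, `y`) so that the example runs without credentials.

```python
from types import SimpleNamespace


def annotations_to_words(annotations):
    """Return (x, y, text) for every word-level annotation.

    The first annotation is skipped because it holds the whole page text.
    The x and y values come from the first vertex of each bounding polygon.
    """
    words = []
    for annotation in list(annotations)[1:]:
        first_vertex = annotation.bounding_poly.vertices[0]
        words.append((first_vertex.x, first_vertex.y, annotation.description))
    return words


# Hypothetical stand-in for a Vision annotation (assumption: only the
# attributes used above are needed).
def fake_annotation(x, y, text):
    vertex = SimpleNamespace(x=x, y=y)
    return SimpleNamespace(
        description=text,
        bounding_poly=SimpleNamespace(vertices=[vertex] * 4),
    )


sample = [
    fake_annotation(0, 0, "Haemoglobin\n11.5\ng/dL"),  # full-page entry
    fake_annotation(40, 100, "Haemoglobin"),
    fake_annotation(400, 102, "11.5"),
    fake_annotation(460, 101, "g/dL"),
]
print(annotations_to_words(sample))
```

With a real response, `response.text_annotations` would be passed in place of `sample`.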

Line-by-line string alignment code:

with open("test_str.txt", "r") as txt_file:
    file_content = txt_file.read().upper()

# Strip the parentheses and turn commas into spaces so that each entry
# becomes "x1 y1 x2 y2 x3 y3 x4 y4 description".
file_content = file_content.replace("(", "").replace(")", "").replace(",", " ")

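# Entries are separated by " *\n"; the first one is the full-page text, so drop it.
clean1 = file_content.split(" *\n")
del clean1[0]

empty_list = []
for i in clean1[:]:
    if not i.strip():
        continue  # skip the empty piece left after the final " *\n"
    clean2 = i.split(" ")
    empty_list.append(clean2)

# Keep only the first vertex (x, y) and the text; drop the other three vertices.
for i in empty_list[:]:
    del i[2:8]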

for i in empty_list[:]:
    if i[0].lstrip("-").isdigit():
        i[0] = int(i[0])
    if i[1].lstrip("-").isdigit():
        i[1] = int(i[1])
empty_list.sort(key=lambda x:x[1])
# for i in empty_list:
#     print(i)
list1 = []
# # for k in range(200):
while len(empty_list) > 0:
    list2 = []
    for i in empty_list[:]:
        if len(list2) == 0:
            list2.append(i)
            if i in empty_list[:]:
                empty_list.remove(i)
        else:
            for j in list2[:]:
                if abs(i[1] - j[1]) <= 15: # change this "15" if output not correct
                    list2.append(i)
                    list2[:] = list(map(list, set(map(tuple, list2))))
                    if i in empty_list[:]:
                        empty_list.remove(i)
    list2.sort()
    # empty_list.sort(key=lambda x: x[0])
    list1.append(list2)
    del list2
print(list1)
# full_txt = []
# full_txt2 = []
for i in list1[:]:
    for j in i:
        print("".join(j[2:]), end=" ")
    print()  # end the current text line

The output works for me as long as the image is not too skewed. The image resolution may cause some issues; it is best to convert the image to 1700*2350 px.
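The grouping idea used above (sort the words by their y coordinate, put words within about 15 px of each other on the same line, then order each line by x) can also be sketched as one small function. This is a simplified illustration under assumptions, not the exact code above: `group_into_lines` is a hypothetical name, the input is assumed to be a list of `(x, y, text)` tuples, and the sample coordinates are made up.

```python
def group_into_lines(words, tolerance=15):
    """Group (x, y, text) tuples into text lines.

    Words are visited from top to bottom. A word joins the current line when
    its y is within `tolerance` pixels of the previous word on that line,
    otherwise it starts a new line. Each line is then ordered left to right.
    """
    lines = []
    for word in sorted(words, key=lambda w: w[1]):
        if lines and abs(word[1] - lines[-1][-1][1]) <= tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    return [
        " ".join(text for _, _, text in sorted(line, key=lambda w: w[0]))
        for line in lines
    ]


# Made-up coordinates for two rows of a report.
words = [
    (400, 102, "11.5"), (40, 100, "Haemoglobin"), (460, 101, "g/dL"),
    (40, 160, "ESR"), (400, 161, "12"), (440, 159, "mm"),
]
for line in group_into_lines(words):
    print(line)
```

As with the "15" in the code above, the `tolerance` value likely needs tuning for the image resolution, and a strongly skewed scan can still merge or split rows.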

Answer 1
0 2022-09-22 04:07:58
