Line-by-Line Text Detection in Google Cloud Vision Using Python

Question

This is the code I used:

import os
from google.cloud import vision

# Point the client library at the service-account key file.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r'ServiceAccountToken.json'
client = vision.ImageAnnotatorClient()


def detectText(img):
    # Read the image as raw bytes.
    with open(img, 'rb') as image_file:
        content = image_file.read()
    image = vision.Image(content=content)
    response = client.text_detection(image=image)
    # The first annotation holds the full detected text as one string.
    texts = response.text_annotations[0].description
    print(texts)


FILE_NAME = 'Scan_20220819.png'
FOLDER_NAME = r'C:\Users\RafiAbrar\Downloads\vision ai'
detectText(os.path.join(FOLDER_NAME, FILE_NAME))

The output is:

A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
Haemoglobin
ESR (Capillary Method - Alifax)
Total Count
Red Blood Cells
Platelets
White Blood Cells
Differential Count
Neutrophil
Lymphocyte
Monocyte
Eosinophil
Basophil
Red Cell Indices
P.C.V. (Hct)
M.C.V.
M.C.H.
M.C.H.C.
R.D.W.-C.V.
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
Result
11.5 g/dL
12 mm in 1st hour
3.7 X 10^12/L
257 X 10^9/L
8.0 X 10^9/L
62%
32 %
04%
02 %
00%
33%
88 fL
31 pg
35 g/dL
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
12.3%
+
Reference Value
F 11.5 - 15.5, M 13.5 - 18.0 g/dL
< 30 mm in 1st hour
F 3.8-4.8, M 4.5-5.5 X 0^12/L
150-450 X 10^9/L
04.00 11.00 X 10^9/L
40 - 75%
20 - 50 %
02-10%
01-06 %
<01 %
F: 36-46 %, M: 40 - 50 %
82 - 100 fl
27-32 pg
30-35 g/dL
11.60 14.00 %
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs
A lot of extra stuffs

The text is not detected line by line.

I'm expecting output like: "parameter" (space) "value" (space) "unit"

Like:

Haemoglobin 11.5 g/dL

ESR 12 mm in 1st hour

RBC 3.7 X 10^12/L

..... and the list goes on

So what I'm asking is:

Either help me find a way to OCR the text line by line using Python,
or help me find appropriate logic (slicing, loops, parsing, anything) for the given output so that I can merge each test report parameter with its corresponding value.

Answer 1

Well, after two days of struggling, I made my own version using string methods and lists. I saved my GCV output (with coordinates) in a text file and used that file for the line-by-line alignment.

GCV Code:

import os
from google.cloud import vision

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r'ServiceAccountToken.json'
client = vision.ImageAnnotatorClient()


def Detect_Text(img):
    with open(img, 'rb') as image_file:
        content = image_file.read()
    image = vision.Image(content=content)
    response = client.text_detection(image=image)
    texts = response.text_annotations
    for text in texts:
        # Print the four corner coordinates, then the text, then a "*" end marker.
        vertices = ['(%s,%s)' % (v.x, v.y) for v in text.bounding_poly.vertices]
        print(" ".join(vertices), text.description, "*")


FILE_NAME = 'complex.png'
FOLDER_NAME = r'C:\Users\RafiAbrar\Downloads\vision ai'
Detect_Text(os.path.join(FOLDER_NAME, FILE_NAME))
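
The snippet above only prints the records, while the alignment step reads them from test_str.txt. The following is a minimal sketch of one way to write them to a file in the same layout. The helper names format_annotation and write_annotations are hypothetical (they are not part of the Vision client library), and the rows are assumed to be (vertices, description) pairs already extracted from response.text_annotations.

```python
from typing import Iterable, TextIO, Tuple

Vertex = Tuple[int, int]  # (x, y) corner of a bounding box


def format_annotation(vertices: Iterable[Vertex], description: str) -> str:
    """Return one record in the "(x,y) (x,y) (x,y) (x,y) text *" layout printed above."""
    coords = " ".join("(%s,%s)" % (x, y) for x, y in vertices)
    return "%s %s *" % (coords, description)


def write_annotations(rows: Iterable[Tuple[Iterable[Vertex], str]], out_file: TextIO) -> None:
    """Write one record per line to an already opened text file (hypothetical helper)."""
    for vertices, description in rows:
        out_file.write(format_annotation(vertices, description) + "\n")
```

With real data, the rows could be built as [((v.x, v.y) for v in t.bounding_poly.vertices), t.description] for each t in response.text_annotations, and the file opened with open("test_str.txt", "w") before calling write_annotations.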

Line-by-Line string alignment code:

# Read the saved GCV output and upper-case it.
with open("test_str.txt", "r") as txt_file:
    file_content = txt_file.read().upper()

# Drop the coordinate punctuation so each record becomes space-separated tokens.
# Note: this also removes these characters from the recognized text itself.
file_content = file_content.replace("(", "").replace(")", "").replace(",", " ")

# Records end with " *"; the first record is the full-page text, so discard it.
clean1 = file_content.split(" *\n")
del clean1[0]

empty_list = []
for i in clean1:
    clean2 = i.split(" ")
    if len(clean2) < 9:
        continue  # skip blank or incomplete records, such as the one after the last "*"
    del clean2[2:8]  # keep only the first vertex (x, y) and the text
    empty_list.append(clean2)

for i in empty_list:
    if i[0].lstrip("-").isdigit():
        i[0] = int(i[0])
    if i[1].lstrip("-").isdigit():
        i[1] = int(i[1])
empty_list.sort(key=lambda x: x[1])  # top to bottom

list1 = []
while len(empty_list) > 0:
    list2 = []
    for i in empty_list[:]:
        if len(list2) == 0:
            list2.append(i)
            empty_list.remove(i)
        else:
            for j in list2[:]:
                if abs(i[1] - j[1]) <= 15:  # change this "15" if the output is not correct
                    list2.append(i)
                    # Remove duplicates created when i matches several words in the row.
                    list2[:] = list(map(list, set(map(tuple, list2))))
                    if i in empty_list:
                        empty_list.remove(i)
    list2.sort()  # left to right
    list1.append(list2)
print(list1)
for i in list1:
    for j in i:
        print("".join(j[2:]), end=" ")
    print()  # start a new output line for the next row

The output works for me as long as the image is not too skewed. The image resolution may also cause some issues; it is best to resize the image to 1700*2350 px.
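
For readers who want the row-grouping idea in a more compact form, here is a minimal sketch. It is a simplified variant, not the exact logic above: each word is compared against the first word of the current row rather than against every word in it. The function name group_into_lines and the sample coordinates are hypothetical; the default tolerance of 15 pixels mirrors the threshold used in the alignment code.

```python
from typing import List, Tuple

Word = Tuple[int, int, str]  # (x, y) of the first bounding-box vertex, then the text


def group_into_lines(words: List[Word], y_tolerance: int = 15) -> List[str]:
    """Group words with similar y-coordinates into rows, each ordered left to right."""
    rows: List[List[Word]] = []
    for word in sorted(words, key=lambda w: w[1]):  # walk the page top to bottom
        # Join the current row if this word is vertically close to the row's first word.
        if rows and abs(word[1] - rows[-1][0][1]) <= y_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    # Within each row, order the words by x and join them with single spaces.
    return [" ".join(w[2] for w in sorted(row, key=lambda w: w[0])) for row in rows]


# Hypothetical sample coordinates, for illustration only.
sample = [
    (400, 102, "11.5"),
    (50, 100, "HAEMOGLOBIN"),
    (470, 98, "G/DL"),
    (50, 140, "ESR"),
    (400, 143, "12"),
]
for line in group_into_lines(sample):
    print(line)
```

Because the tolerance is measured in pixels, it would likely need adjusting if the image is resized, and a strongly skewed scan can still split one printed row across several groups.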

1 answers

solution1
0 2022-09-22 04:07:58
