I'm trying to extract data from vertical bar charts as x, y, x_axis_title, and y_axis_title. An example image is as follows.
I am currently creating a dataset of vertical bar charts like the one above, with a CSV file for each image as follows.
Would it be possible, given a certain number of images, to create a model that outputs the axis titles and the x and y values without using OCR (optical character recognition)?
Is there a specific method for building such a model, or a better approach?
I tried doing this with Tesseract OCR, but the results were somewhat inaccurate.
Any help would be appreciated!
In my experience, PaddleOCR works a lot better than Tesseract and can help you identify all the fields. Here is a good article explaining how to use it. With this OCR you should not have much trouble with the text, since most of it is clearly visible.
I don't think it will be possible to get the data without OCR. As for the x and y values, you can use OpenCV to find the points where blue meets white (the edges of the bars), then map those points against the y-axis to get the exact values.
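To make the bar-value idea concrete, here is a rough sketch in plain Python. It assumes you already have a binary mask in which 1 marks a blue (bar) pixel and 0 marks background; in practice such a mask could come from a colour threshold such as OpenCV's cv2.inRange. It also assumes you know the pixel rows of two y-axis ticks and their numeric values, for example from the OCR step. The names find_bars and pixel_to_value are made up for this illustration, and bars that touch each other would be merged into one, so treat it as a starting point rather than a finished solution.

```python
def find_bars(mask):
    """Locate bars in a binary mask (list of rows, top to bottom; 1 = bar pixel).

    Returns one (x_start, x_end, top_row) tuple per contiguous run of
    columns that contain bar pixels.
    """
    height, width = len(mask), len(mask[0])

    # Topmost bar pixel in each column, or None if the column is background.
    tops = [
        next((y for y in range(height) if mask[y][x]), None)
        for x in range(width)
    ]

    bars, start = [], None
    for x in range(width + 1):
        inside = x < width and tops[x] is not None
        if inside and start is None:
            start = x                      # a new bar begins here
        elif not inside and start is not None:
            # the bar ended at column x - 1; its top is the highest pixel seen
            bars.append((start, x - 1, min(tops[start:x])))
            start = None
    return bars


def pixel_to_value(row, tick_a, tick_b):
    """Linearly map a pixel row to an axis value.

    tick_a and tick_b are (pixel_row, axis_value) pairs for two known
    y-axis ticks (assumed to come from OCR or manual calibration).
    """
    (row_a, val_a), (row_b, val_b) = tick_a, tick_b
    return val_a + (row - row_a) * (val_b - val_a) / (row_b - row_a)
```

For each tuple returned by find_bars, passing its top_row to pixel_to_value gives the bar's height in axis units, and the x range can be matched to the nearest x-axis label found by OCR. This only works as written for a linear y-axis.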