
Model Training using a Data Set with CSV files and Images

I'm trying to extract data from vertical bar charts as x, y, x_axis_title, and y_axis_title. An example image is shown below.

2.png

I am currently building a dataset of vertical bar charts like the one above, with one CSV file per image, as follows.

2.csv

  1. Would it be possible to train a model that outputs the axis titles and the x and y values without using OCR (optical character recognition), given a certain number of images?

  2. Is there a specific method for building such a model, or a better approach?

Note: I tried doing this with Tesseract OCR, but the results were somewhat inaccurate.

Any help would be appreciated!

In my personal experience, PaddleOCR works a lot better than Tesseract and can help you identify all the text fields. Here is a good article explaining how to use PaddleOCR. With this OCR you should not really face any problems with the text, since most of it is clearly visible.

I don't think it will be possible to get the data without OCR. As for the x and y values, you can use OpenCV to find the boundaries between blue (the bars) and white (the background), which give the top of each bar, and then map those points onto the y-axis to get the exact values.
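The bar-height idea can be sketched roughly as follows. This is only a minimal illustration under several assumptions: the bars are a solid blue on a white background, the image is a BGR array such as the one returned by `cv2.imread`, and two y-axis tick positions with their numeric values are already known (for example from the OCR step). The function names, the colour thresholds, and the synthetic demo chart are all made up for this example and use only NumPy so that it runs without an image file.

```python
import numpy as np


def blue_mask(img_bgr, min_blue=150, max_other=120):
    """Boolean mask of pixels that look like 'bar blue'.

    The thresholds are assumptions; tune them for your charts.
    """
    b = img_bgr[:, :, 0].astype(int)
    g = img_bgr[:, :, 1].astype(int)
    r = img_bgr[:, :, 2].astype(int)
    return (b >= min_blue) & (g <= max_other) & (r <= max_other)


def find_bars(mask):
    """Return (left_col, right_col, top_row) for each run of blue columns."""
    col_has_blue = mask.any(axis=0)
    runs, start = [], None
    for x, on in enumerate(col_has_blue):
        if on and start is None:
            start = x                      # a bar begins at this column
        elif not on and start is not None:
            runs.append((start, x - 1))    # the bar ended one column earlier
            start = None
    if start is not None:                  # bar touching the right edge
        runs.append((start, len(col_has_blue) - 1))

    bars = []
    for left, right in runs:
        rows = np.where(mask[:, left:right + 1].any(axis=1))[0]
        bars.append((left, right, int(rows.min())))  # smallest row = bar top
    return bars


def row_to_value(row, ref_a, ref_b):
    """Linearly map a pixel row to an axis value.

    ref_a and ref_b are (pixel_row, axis_value) pairs for two known ticks.
    """
    (row_a, val_a), (row_b, val_b) = ref_a, ref_b
    return val_a + (row - row_a) * (val_b - val_a) / (row_b - row_a)


def make_demo_chart():
    """Synthetic 100x120 white image with two blue bars (hypothetical data)."""
    img = np.full((100, 120, 3), 255, dtype=np.uint8)
    img[50:90, 20:40] = (180, 119, 31)   # BGR blue, bar top at row 50
    img[30:90, 60:80] = (180, 119, 31)   # BGR blue, bar top at row 30
    return img


if __name__ == "__main__":
    img = make_demo_chart()  # with a real chart: img = cv2.imread("2.png")
    # Assumed calibration: row 90 is y = 0 and row 10 is y = 80.
    for left, right, top in find_bars(blue_mask(img)):
        value = row_to_value(top, (90, 0.0), (10, 80.0))
        print(f"bar columns {left}-{right}: y = {value}")
```

On a real chart, the two reference ticks would come from the pixel positions of OCR-detected y-axis labels, and the x labels could be matched to bars by comparing each label's horizontal position with the bar's column range.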


 