
Improve tesseract accuracy for detecting amounts of money (digits, commas, dots and symbols) possibly without preprocessing the image?

I am writing a simple Android app for detecting amounts of money (digits, commas, dots and currency symbols). I am using Tesseract, more specifically the tess-two library.

Code Snippet:

        this.tessBaseAPI = new TessBaseAPI();
        this.tessBaseAPI.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO_ONLY);
        //EXTRA SETTINGS
        this.tessBaseAPI.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "0123456789 $€+=-,.");
        //this.tessBaseAPI.setVariable(TessBaseAPI.VAR_CHAR_BLACKLIST, "!@#$%^&*()_+=-[]}{;:'\"\\|~`,./<>?");
        try {
            this.tessBaseAPI.setDebug(true);
            this.tessBaseAPI.init(path, "eng+snum"); //eng+osd+snum
            this.tessBaseAPI.setImage(bitmap);
            this.text = tessBaseAPI.getUTF8Text();
            //this.text = tessBaseAPI.getHOCRText(0);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Release native resources even if init or recognition fails
            this.tessBaseAPI.end();
        }

Sadly, I am not satisfied with Tesseract's accuracy. Preprocessing the image with a few binarization algorithms did improve it, but I would like to avoid preprocessing because the API I am using for it is heavy and time-consuming.
So, how can I configure Tesseract to improve accuracy? So far I have only tried the character whitelist. Is there anything else I can do?

Image:

(enter image description here)

I am not sure what you are doing, but running tesseract screen.png - on the command line (the trailing - sends the result to stdout) shows all the amounts correctly:

Estimating resolution as 269
10:06 > = @

€— Movimenti e richieste

AGOSTO 2022
Amazon.it -5,73€
LUGLIO 2022
Paypal -2,19€
GIUGNO 2022
Amazon.it* -16,69€

B | Ricarica con carta +15,00€
B | Ricarica con carta +20,00€
Amazon.it* -20,83€

B | Ricarica con carta +10,00€
Amazon.it* -5,95€
-0,05€

Amazon.it*
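
Since the unrestricted run already reads the amounts correctly, one option is to let Tesseract recognize the full text and pick out the amounts afterwards. The following is only a minimal sketch in plain Java, not part of tess-two: the class name AmountExtractor and the method extract are hypothetical, and the pattern assumes the format seen in the output above (optional sign, decimal comma, two decimals, trailing euro sign). It does not handle thousands separators or other currencies.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts euro amounts from OCR text.
public class AmountExtractor {
    // Optional sign, integer part, decimal comma, two decimals,
    // optional whitespace, then the euro sign (written as \u20AC).
    private static final Pattern AMOUNT =
            Pattern.compile("([+-]?)(\\d+),(\\d{2})\\s?\u20AC");

    /** Returns every matching amount in order of appearance. */
    public static List<BigDecimal> extract(String ocrText) {
        List<BigDecimal> amounts = new ArrayList<>();
        Matcher m = AMOUNT.matcher(ocrText);
        while (m.find()) {
            // BigDecimal expects a dot as decimal separator and no leading '+'.
            String sign = m.group(1).equals("-") ? "-" : "";
            amounts.add(new BigDecimal(sign + m.group(2) + "." + m.group(3)));
        }
        return amounts;
    }

    public static void main(String[] args) {
        String sample = "Amazon.it -5,73\u20AC\n"
                + "B | Ricarica con carta +15,00\u20AC\n"
                + "-0,05\u20AC";
        System.out.println(extract(sample)); // prints [-5.73, 15.00, -0.05]
    }
}
```

In the app, the string returned by getUTF8Text() would be passed to extract(). Lines without an amount, such as the month headers, are simply skipped.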
