OCR using tess-two library returning incorrect text

I am using the tess-two library to implement OCR in my Android application. The code I am using is:

    BitmapFactory.Options options = new BitmapFactory.Options();
    options.inSampleSize = 4;

    Bitmap bitmap = BitmapFactory.decodeFile(filePath, options);
    bitmap = Bitmap.createBitmap(bitmap, 0, 0, mPreview.getWidth(), mPreview.getHeight() / 2);

    bitmap = bitmap.copy(Bitmap.Config.ARGB_8888, true);

    TessBaseAPI baseApi = new TessBaseAPI();
    // eng.traineddata
    baseApi.init(Environment.getExternalStorageDirectory().toString(), "eng", TessBaseAPI.OEM_TESSERACT_CUBE_COMBINED);
    baseApi.setImage(bitmap);
    String recognizedText = baseApi.getUTF8Text();

    Log.d("Recognized Text", recognizedText);

    baseApi.end();

This is the string I got after the scan:

'r8''_, IIFP"" >- .
_ ~11 r-- _ _
3} .
' at H k
CO' f
ty, . s
_ 1 V Fre 111'};
_ _ 011g
I .1. ' Q
h.

which is not correct at all. I don't understand what I am doing wrong here. I have downloaded the language data for English. There are a few similar questions on SO, but none of them helped. My code seems to be correct. I have been struggling with this for two days. Any help will be greatly appreciated.

EDIT:

Image scanned: [image]

So, the issue was that the saved image was rotated by 90 degrees, which is why the code was unable to recognize the text correctly. Rotating the bitmap by -90 degrees did the trick. Now the text is being recognized correctly.
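
For anyone who wants to see what a -90 degree rotation does to the pixels, here is a minimal sketch in plain Java. It is not the code from my app: the class and method names are made up for illustration, and it works on a row-major int[] buffer (the layout that Bitmap.getPixels produces) instead of a Bitmap. On Android the same result is normally obtained by calling postRotate(-90) on an android.graphics.Matrix and passing that matrix to Bitmap.createBitmap(source, 0, 0, width, height, matrix, true) before handing the bitmap to baseApi.setImage.

```java
// Hypothetical helper, for illustration only (not part of tess-two or Android).
final class RotateCounterClockwise {

    /**
     * Rotates a row-major pixel buffer by -90 degrees (counter-clockwise).
     * The returned buffer describes an image whose width is the original
     * height and whose height is the original width.
     */
    static int[] rotateMinus90(int[] pixels, int width, int height) {
        if (pixels.length != width * height) {
            throw new IllegalArgumentException("buffer size does not match dimensions");
        }
        int[] out = new int[pixels.length];
        int outWidth = height; // dimensions swap after a quarter turn
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // Source pixel (x, y) lands at column y, row (width - 1 - x).
                out[(width - 1 - x) * outWidth + y] = pixels[y * width + x];
            }
        }
        return out;
    }
}
```

With a 3x2 buffer holding rows "1 2 3" and "4 5 6", the top-right pixel (3) ends up in the top-left corner of the 2x3 result, which is what a counter-clockwise quarter turn should do.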

