
Android OCR App using Tesseract

I was following the tutorial at this site:

http://gaut.am/making-an-ocr-android-app-using-tesseract/

First I imported Tess-Two from GitHub: https://github.com/rmtheis/tess-two

And linked it to my project: https://github.com/GautamGupta/Simple-Android-OCR

The app compiles and runs fine. But after taking a picture, when I hit save, it crashes.

Here is the source of the main activity:

public class SimpleAndroidOCRActivity extends Activity {
  public static final String PACKAGE_NAME = "com.datumdroid.android.ocr.simple";
  public static final String DATA_PATH = Environment
                    .getExternalStorageDirectory().toString() + "/SimpleAndroidOCR/";

  // You should have the trained data file in assets folder
  // You can get them at:
  // http://code.google.com/p/tesseract-ocr/downloads/list
  public static final String lang = "eng";

  private static final String TAG = "SimpleAndroidOCR.java";

  protected Button _button;
  // protected ImageView _image;
  protected EditText _field;
  protected String _path;
  protected boolean _taken;

  protected static final String PHOTO_TAKEN = "photo_taken";

  @Override
  public void onCreate(Bundle savedInstanceState) {

    String[] paths = new String[] { DATA_PATH, DATA_PATH + "tessdata/" };

    for (String path : paths) {
      File dir = new File(path);
      if (!dir.exists()) {
        if (!dir.mkdirs()) {
          Log.v(TAG, "ERROR: Creation of directory " + path + " on sdcard failed");
          return;
        } else {
          Log.v(TAG, "Created directory " + path + " on sdcard");
        }
      }
    }

    // lang.traineddata file with the app (in assets folder)
    // You can get them at:
    // http://code.google.com/p/tesseract-ocr/downloads/list
    // This area needs work and optimization
    if (!(new File(DATA_PATH + "tessdata/" + lang + ".traineddata")).exists()) {
      try {
        AssetManager assetManager = getAssets();
        InputStream in = assetManager.open("tessdata/" + lang + ".traineddata");
        //GZIPInputStream gin = new GZIPInputStream(in);
        OutputStream out = new FileOutputStream(DATA_PATH
                                            + "tessdata/" + lang + ".traineddata");

        // Transfer bytes from in to out
        byte[] buf = new byte[1024];
        int len;
        //while ((lenf = gin.read(buff)) > 0) {
        while ((len = in.read(buf)) > 0) {
          out.write(buf, 0, len);
        }
        in.close();
        //gin.close();
        out.close();

        Log.v(TAG, "Copied " + lang + " traineddata");
      } catch (IOException e) {
        Log.e(TAG, "Was unable to copy " + lang + " traineddata " + e.toString());
      }
    }

    super.onCreate(savedInstanceState);

    setContentView(R.layout.main);

    // _image = (ImageView) findViewById(R.id.image);
    _field = (EditText) findViewById(R.id.field);
    _button = (Button) findViewById(R.id.button);
    _button.setOnClickListener(new ButtonClickHandler());

    _path = DATA_PATH + "/ocr.jpg";
  }

  public class ButtonClickHandler implements View.OnClickListener {
    public void onClick(View view) {
      Log.v(TAG, "Starting Camera app");
      startCameraActivity();
    }
  }

  // Simple android photo capture:
  // http://labs.makemachine.net/2010/03/simple-android-photo-capture/

  protected void startCameraActivity() {
    File file = new File(_path);
    Uri outputFileUri = Uri.fromFile(file);

    final Intent intent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
    intent.putExtra(MediaStore.EXTRA_OUTPUT, outputFileUri);

    startActivityForResult(intent, 0);
  }

  @Override
  protected void onActivityResult(int requestCode, int resultCode, Intent data) {

    Log.i(TAG, "resultCode: " + resultCode);

    if (resultCode == RESULT_OK) {
      onPhotoTaken();
    } else {
      Log.v(TAG, "User cancelled");
    }
  }

  @Override
  protected void onSaveInstanceState(Bundle outState) {
    outState.putBoolean(SimpleAndroidOCRActivity.PHOTO_TAKEN, _taken);
  }

  @Override
  protected void onRestoreInstanceState(Bundle savedInstanceState) {
    Log.i(TAG, "onRestoreInstanceState()");
    if (savedInstanceState.getBoolean(SimpleAndroidOCRActivity.PHOTO_TAKEN)) {
      onPhotoTaken();
    }
  }

  protected void onPhotoTaken() {
    _taken = true;

    BitmapFactory.Options options = new BitmapFactory.Options();
    options.inSampleSize = 4;

    Bitmap bitmap = BitmapFactory.decodeFile(_path, options);

    try {
      ExifInterface exif = new ExifInterface(_path);
      int exifOrientation = exif.getAttributeInt(
                                    ExifInterface.TAG_ORIENTATION,
                                    ExifInterface.ORIENTATION_NORMAL);

      Log.v(TAG, "Orient: " + exifOrientation);

      int rotate = 0;

      switch (exifOrientation) {
        case ExifInterface.ORIENTATION_ROTATE_90:
          rotate = 90;
          break;
        case ExifInterface.ORIENTATION_ROTATE_180:
          rotate = 180;
          break;
        case ExifInterface.ORIENTATION_ROTATE_270:
          rotate = 270;
          break;
      }

      Log.v(TAG, "Rotation: " + rotate);

      if (rotate != 0) {

        // Getting width & height of the given image.
        int w = bitmap.getWidth();
        int h = bitmap.getHeight();

        // Setting pre rotate
        Matrix mtx = new Matrix();
        mtx.preRotate(rotate);

        // Rotating Bitmap
        bitmap = Bitmap.createBitmap(bitmap, 0, 0, w, h, mtx, false);
      }

      // Convert to ARGB_8888, required by tess
      bitmap = bitmap.copy(Bitmap.Config.ARGB_8888, true);

    } catch (IOException e) {
      Log.e(TAG, "Couldn't correct orientation: " + e.toString());
    }

    // _image.setImageBitmap( bitmap );

    Log.v(TAG, "Before baseApi");

    TessBaseAPI baseApi = new TessBaseAPI();
    baseApi.setDebug(true);
    baseApi.init(DATA_PATH, lang);
    baseApi.setImage(bitmap);

    String recognizedText = baseApi.getUTF8Text();

    baseApi.end();

    // You now have the text in recognizedText var, you can do anything with it.
    // We will display a stripped out trimmed alpha-numeric version of it (if lang is eng)
    // so that garbage doesn't make it to the display.

    Log.v(TAG, "OCRED TEXT: " + recognizedText);

    if ( lang.equalsIgnoreCase("eng") ) {
        recognizedText = recognizedText.replaceAll("[^a-zA-Z0-9]+", " ");
    }

    recognizedText = recognizedText.trim();

    if ( recognizedText.length() != 0 ) {
      _field.setText(_field.getText().toString().length() == 0 ? recognizedText : _field.getText() + " " + recognizedText);
      _field.setSelection(_field.getText().toString().length());
    }     
    // Cycle done.
  }

  // www.Gaut.am was here
  // Thanks for reading!
}

And I know the error is towards the end, in the following part:

TessBaseAPI baseApi = new TessBaseAPI();
baseApi.setDebug(true);
baseApi.init(DATA_PATH, lang);
baseApi.setImage(bitmap);

String recognizedText = baseApi.getUTF8Text();

baseApi.end();

Any idea why it might be crashing?

Unpack your target *.apk file and check whether it contains a libs folder with the *.so files. If that is your problem, see this link, which I answered earlier.

The issue is with your path. Download the data for the language you want to recognize from code.google.com/p/tesseract-ocr/downloads/list (e.g. tesseract-ocr-3.02.eng.tar.gz), extract it, and locate the file "yourLanguage.traineddata" (e.g. "eng.traineddata"). Put this file in the tessdata folder under your storage path on the SD card, then pass that storage path to init:

TessBaseAPI mTess = new TessBaseAPI();
String datapath = Environment.getExternalStorageDirectory() + "/tesseract/";
String language = "eng";
File dir = new File(datapath + "tessdata/");
if (!dir.exists())
    dir.mkdirs();
mTess.init(datapath, language);
mTess.setImage(yourBitmapImage);
String result = mTess.getUTF8Text();
Toast.makeText(tesswork.this, "Result : " + result, Toast.LENGTH_LONG).show();
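
Before calling init as in the snippet above, it can help to confirm that the trained-data file really is where Tesseract will look for it, namely `<datapath>/tessdata/<language>.traineddata`. The following is a minimal sketch in plain Java; the class `TessDataCheck` and its method names are hypothetical helpers made up for illustration and are not part of tess-two.

```java
import java.io.File;

/** Hypothetical helper for illustration; not part of tess-two. */
final class TessDataCheck {
    private TessDataCheck() {}

    /** Builds the expected location: <dataPath>/tessdata/<lang>.traineddata */
    static File trainedDataFile(String dataPath, String lang) {
        return new File(new File(dataPath, "tessdata"), lang + ".traineddata");
    }

    /** True only if the trained-data file exists, is a regular file, and is not empty. */
    static boolean isTrainedDataPresent(String dataPath, String lang) {
        File f = trainedDataFile(dataPath, lang);
        return f.isFile() && f.length() > 0;
    }
}
```

In the activity you could call `TessDataCheck.isTrainedDataPresent(datapath, language)` and log an error instead of calling init when it returns false. An empty file is treated as missing here because an interrupted copy from assets can leave a zero-byte file behind.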

Are you using the correct tessdata files? If not, use these: https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-304305

 <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
 <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />

There may be a permission issue when saving files to the directory; the manifest entries above request storage access.

Or check the lib folder of your project: the problem may be that libtess.so is not available for the device's CPU architecture (e.g. x86, x64, mips, armv7).
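
One way to check which architectures a built APK actually ships native libraries for is to read it as a zip archive on your development machine. Inside an APK, native libraries sit under `lib/<abi>/`. The sketch below is plain desktop Java, not Android code, and the class name `ApkNativeLibs` and its method are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper for illustration: lists the .so files in an APK, grouped by ABI folder. */
final class ApkNativeLibs {
    private ApkNativeLibs() {}

    static Map<String, List<String>> listByAbi(String apkPath) throws IOException {
        Map<String, List<String>> byAbi = new TreeMap<>();
        try (ZipFile zip = new ZipFile(apkPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                // Native libraries are stored as lib/<abi>/<name>.so
                String[] parts = entries.nextElement().getName().split("/");
                if (parts.length == 3 && parts[0].equals("lib") && parts[2].endsWith(".so")) {
                    byAbi.computeIfAbsent(parts[1], k -> new ArrayList<>()).add(parts[2]);
                }
            }
        }
        // Sort library names so the output is stable
        for (List<String> libs : byAbi.values()) {
            Collections.sort(libs);
        }
        return byAbi;
    }
}
```

If the returned map has no key matching the ABI of the device or emulator you are testing on, or that key's list lacks libtess.so, the native library cannot be loaded there, which would be consistent with a crash when TessBaseAPI is first used.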
