
Exception has been thrown by the target of an invocation when using Tesseract ocr

I'm writing a program that extracts text from images using Tesseract. The program should load every image from a directory, show each one in a picture box, and then extract its text. I downloaded the English trained data from this link and put it inside the Debug folder: https://github.com/tesseract-ocr/tessdata/blob/master/eng.traineddata
The exception is: Exception has been thrown by the target of an invocation.
The inner exception reported in the catch block was: Failed to find library "liblept172.dll" for platform x86.
Here's my code:

private void button1_Click(object sender, EventArgs e)
{
    Image image;
    string[] images = Directory.GetFiles("E:\\Omar Project\\New", "*.png");
    for (int i = 0; i < images.Length;i++)
    {
        image = Image.FromFile(images[i]);
        pictureBox1.Image = image;
        //ocr = new TesseractEngine(@"tessdata", "eng", EngineMode.Default);
        using (var engine = new TesseractEngine("E:\\Omar Project\\Extracting Text From Image Using Microsoft Office\\Extracting Text From Image Using Microsoft Office\\bin\\Debug\\eng.traineddata", "eng", EngineMode.Default))
        {
          using(var img=Pix.LoadFromFile(images[i]))
          {
              using(var page=engine.Process(img))
              {
                  richTextBox1.Text += page.GetText();
              }
          }
        }
    }

}

Create a folder named "x64" or "x86" in your project directory (where your .csproj file is located) and copy liblept1760.so and libtesseract400.so into it. Both files should now be visible in Visual Studio. For each file, set the "Copy to Output Directory" property to "Copy always".

Make sure you have the following packages installed inside your container

apt-get install -y libgif7 libjpeg62 libopenjp2-7 libpng16-16 libtiff5 libwebp6

Otherwise the dlopen call for liblept will fail and you will get the error message you mentioned.

If you don't have the liblept package installed inside your container and only copied the .so file into the x64 directory, the open command for libtesseract will fail.

To fix this, you have to create a symlink to your liblept shared object.

Run this inside your container or Dockerfile:

ln -s /app/x64/liblept1760.so /usr/lib/x86_64-linux-gnu/liblept.so.5

Make sure to use the correct source path. For the default ASP.NET Core Docker images and the layout described above, /app/x64/liblept1760.so should work.
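
The symlink step and the dependency check can be wrapped in two small shell helpers. This is only a sketch: the function names link_liblept and report_missing_deps are made up for illustration, and the paths in the commented usage lines are the ones from this answer, so adjust them to your image.

```shell
#!/bin/sh
# link_liblept SRC DEST (hypothetical helper): point DEST at SRC,
# replacing a stale link if one is already there.
link_liblept() {
    src=$1
    dest=$2
    if [ ! -f "$src" ]; then
        echo "missing source library: $src" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dest")"
    ln -sfn "$src" "$dest"
}

# report_missing_deps LIB (hypothetical helper): print the dependencies
# of LIB that the dynamic loader cannot resolve; prints nothing if all resolve.
report_missing_deps() {
    ldd "$1" 2>/dev/null | grep "not found" || true
}

# Usage with the paths from this answer (assumed layout):
# link_liblept /app/x64/liblept1760.so /usr/lib/x86_64-linux-gnu/liblept.so.5
# report_missing_deps /app/x64/libtesseract400.so
```

If report_missing_deps prints any lines, those are the shared libraries that still need to be installed or linked before libtesseract can be opened.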

Wiki Tesseract

Common reasons for failure are:

- The Visual Studio 2015 x86 & x64 Runtime is not installed (see the main readme page for where to find it).
- The x86 and x64 versions of leptonica (liblept172.dll) and tesseract (libtesseract304.dll) were not copied to their respective folders in the bin directory.
- DotNet may be reporting the wrong architecture. For instance, a known issue is that an application compiled with the "prefer 32-bit" flag will report that it is running on x64 when run on a 64-bit Windows OS (see issue #55). A common workaround is to change the CPU architecture to x86.
- You're running this on an unsupported architecture (e.g. ARM).

Further diagnosis

Tesseract will write the detected architectures and the paths it searched to the System.Diagnostics source named "Tesseract", which will be helpful for figuring out what is going on. An example configuration is provided at the end of the wiki page.

If the correct version of the library is found but fails to load (the log will tell you whether this is the case), the next step is to enable logging of binding errors and check the fusion log. In my experience Windows may also log these errors to the Windows Event Log, so it is worth checking there first. Further details can be found here:

https://blogs.msdn.microsoft.com/suzcook/2003/05/29/debugging-assembly-loading-failures/
http://stackoverflow.com/questions/255669/how-to-enable-assembly-bind-failure-logging-fusion-in-net

If you cannot resolve the issue, please file a new issue including the library version, the operating system you're executing the code on, the target architecture for the entry program, and a copy of the full standard and trace outputs. Note that you'll need to enable System.Diagnostics output for the Tesseract source, as previously mentioned.
