
Unable to find tessdata for Tesseract

Hi, I am new to Python and Tesseract. I am using the Anaconda distribution and trying to use pytesseract-ocr. When I try to get the data from an image, it gives me the following error:

tesseract imageSample1.jpg test.txt digits
// output 
Tesseract Open Source OCR Engine v3.04.01 with Leptonica
Error opening data file /anaconda/envs/_build/share/tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.

First, there is no /anaconda/envs/_build/share/tessdata/ directory on my machine; I have an anaconda3 folder. I downloaded eng.traineddata from git, but I am not sure where to put it. Am I doing something wrong? I need some help. Thank you.

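By default, Tesseract searches the tessdata directory under the prefix it was built with. On many Linux system installs that is /usr/share/tessdata; your Anaconda build looks in /anaconda/envs/_build/share/tessdata/, as the error message shows.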

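If you want Tesseract to search somewhere else, you can do one of the following:

  • Set the environment variable TESSDATA_PREFIX. For the version in your output, the error message asks for the parent directory of your "tessdata" directory.
  • Call tesseract with --tessdata-dir <pathToYourData> (the path is a separate argument, not joined with "=").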
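The two options above can be sketched from Python as follows. This is only a minimal illustration, not the only way to do it: the helper name `tessdata_settings` and the path `/opt/share/tessdata` are hypothetical, and the TESSDATA_PREFIX value follows the wording of the error message in the question (parent directory of "tessdata"). Other Tesseract versions may expect something different, so check the message your version prints.

```python
import os
from pathlib import Path


def tessdata_settings(tessdata_dir):
    """Return (prefix, config) for a directory that holds *.traineddata files.

    prefix: value for TESSDATA_PREFIX, taken as the parent of the tessdata
            directory, as the error message in the question requests.
    config: extra command-line arguments carrying --tessdata-dir.
    """
    tessdata_dir = Path(tessdata_dir)
    prefix = str(tessdata_dir.parent)
    config = f'--tessdata-dir "{tessdata_dir}"'
    return prefix, config


# Hypothetical location; replace it with the directory that actually
# contains eng.traineddata on your machine.
prefix, config = tessdata_settings("/opt/share/tessdata")

# Option 1: set the environment variable for this process; tesseract
# processes started afterwards inherit it.
os.environ["TESSDATA_PREFIX"] = prefix

# Option 2: keep `config` and pass it along with each OCR call.
print(prefix)
print(config)
```

With pytesseract, the `config` string can be passed as `pytesseract.image_to_string(image, config=config)`. Usually one of the two options is enough.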

Have you tried running the command tesseract in your command window? You should get output like this: (screenshot: tesseract output)

If not, install Tesseract on your machine first (link: tesseract download).

Note: for pytesseract to work, Tesseract itself must be installed on the system.
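The same check can be done from Python before calling pytesseract. The sketch below only assumes that the executable is named `tesseract` and is reachable through PATH; the helper names are made up for illustration.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the full path of the tesseract executable on PATH, or None."""
    return shutil.which("tesseract")


def tesseract_version(executable):
    """Return the first line printed by `<executable> --version`."""
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True
    )
    # Depending on the version, the banner may arrive on stdout or on
    # stderr, so look at both.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


exe = find_tesseract()
if exe is None:
    print("tesseract is not on PATH; install it before using pytesseract")
else:
    print(exe, "->", tesseract_version(exe))
```

If the binary is installed but not on PATH, pytesseract also lets you point to it by assigning its full path to `pytesseract.pytesseract.tesseract_cmd`.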

Hope this helps :)
