How to install leptonica+tesseract on Windows without Visual Studio to use in Anaconda?
I want to perform text recognition on images, and I want to use Python. I have installed Anaconda. Now I want to install Tesseract, but I also need to install Leptonica. I did not find any clear instructions on how to do this on Windows, and I do not want to install Visual Studio just for Leptonica. Could anybody provide clear instructions on how to install Leptonica and Tesseract on Windows without Visual Studio, for use in Anaconda? Thanks.
Here is a simple set of steps to get the Tesseract 3.05 dev version (as of 04/22/2016) working on both Windows 7 and Windows 8 machines:
1- Install Tesseract from the executable on the official tesseract-ocr page (version 3.02 for Windows will suffice).
2- Download the following two files for the Tesseract 3.05 dev version from http://domasofan.spdns.eu/tesseract/
There are 2 exe files: the tesseract-core package and the tesseract-langs package.
(yyyymmdd means year 4 digits, month 2 digits and day 2 digits.)
The app is portable, so you can install it on a USB stick or in another location.
Sub-steps to install these:
Double-click the tesseract-langs package and extract it to the same directory as tesseract-core, with \tessdata appended, inside the "Tess_temp" folder. For example, if tesseract-core was extracted to c:\Tess_temp, tesseract-langs needs to go to c:\Tess_temp\tessdata.
Now copy everything in "Tess_temp" to where Tesseract 3.02 was installed in step 1 above (usually C:\Program Files (x86)\Tesseract-OCR), replacing the 3.02 files with the 3.05 ones.
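The copy step above can also be scripted. The following is a minimal Python sketch, assuming the two folders named in this answer (c:\Tess_temp and C:\Program Files (x86)\Tesseract-OCR). The function name overlay_folder is made up for illustration, and writing under Program Files will normally require an elevated (administrator) prompt.

```python
import shutil
from pathlib import Path


def overlay_folder(src, dst):
    """Copy everything under src into dst, overwriting files that already exist.

    overlay_folder is a hypothetical helper name, not part of Tesseract.
    """
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f"source folder not found: {src}")
    # dirs_exist_ok=True (Python 3.8+) lets copytree merge into an existing folder
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


# Example call, using the paths from the steps above (adjust if yours differ):
# overlay_folder(r"c:\Tess_temp", r"C:\Program Files (x86)\Tesseract-OCR")
```

Because files with the same name are overwritten, it may be worth keeping a backup copy of the 3.02 folder before running it.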
It should now work with the 3.05 version on Windows. To check, copy a sample image test.png (containing text) to this Tesseract-OCR folder, open a cmd window and type the following commands:
Go to the Tesseract folder:
cd C:\Program Files (x86)\Tesseract-OCR
Run tesseract on test.png:
tesseract -l eng test.png test_text -psm 6
It will show you:
Tesseract Open Source OCR Engine v3.05.00dev with Leptonica
Congratulations! (Check test_text.txt for the extracted text.)
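Since the goal is to use Tesseract from Python in Anaconda, the same command can be called through the standard library. This is only a sketch: it assumes the executable is tesseract.exe in the install folder mentioned above, and build_command and ocr_image are hypothetical helper names, not part of Tesseract or Anaconda.

```python
import subprocess
from pathlib import Path

# Install folder mentioned in the answer; change it if Tesseract lives elsewhere.
TESSERACT_DIR = Path(r"C:\Program Files (x86)\Tesseract-OCR")


def build_command(image, output_base, lang="eng", psm=6, tesseract_dir=TESSERACT_DIR):
    """Build the command line used above as an argument list."""
    exe = Path(tesseract_dir) / "tesseract.exe"  # assumed executable name
    return [str(exe), "-l", lang, str(image), str(output_base), "-psm", str(psm)]


def ocr_image(image, output_base, **kwargs):
    """Run Tesseract and return the text it wrote to <output_base>.txt."""
    # check=True raises CalledProcessError if tesseract exits with an error
    subprocess.run(build_command(image, output_base, **kwargs), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

From a Python prompt started in the folder that holds test.png, print(ocr_image("test.png", "test_text")) should print the recognized text, provided the installation steps above succeeded.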