Pytesseract: "TesseractNotFound Error: tesseract is not installed or it's not in your path", how do I fix this?

Question

I am trying to run a basic and very simple piece of code in Python.

from PIL import Image
import pytesseract

im = Image.open("sample1.jpg")

text = pytesseract.image_to_string(im, lang = 'eng')

print(text)

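That is all the code does. I have already installed Tesseract for Windows through the installer. I am very new to Python and I am unsure how to proceed.

Any guidance here would be very helpful. I have tried restarting my Spyder application, but to no avail.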

Answer 1

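I see the steps scattered across different answers. Based on my recent experience with this pytesseract error on Windows, here are the steps in order, to make the error easier to resolve:

1. Install tesseract using the Windows installer available at: https://github.com/UB-Mannheim/tesseract/wiki

2. Note the tesseract path from the installation. The default installation path at the time of this edit was: C:\Users\USER\AppData\Local\Tesseract-OCR. It may change, so please check the installation path.

3. pip install pytesseract

4. Set the tesseract path in the script before calling image_to_string:

pytesseract.pytesseract.tesseract_cmd = r'C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe'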
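
The hard-coded path in step 4 only works when Tesseract sits in exactly that folder. As a minimal sketch using only the standard library, the lookup could consult PATH first and then fall back to a list of candidate locations. The helper name resolve_tesseract and the candidate list are hypothetical and are not part of pytesseract.

```python
import os
import shutil


def resolve_tesseract(candidates, name="tesseract"):
    """Return a usable path to the executable, or None if none is found."""
    # shutil.which searches the directories listed in PATH.
    found = shutil.which(name)
    if found:
        return found
    # Otherwise try each explicit candidate location in order.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    # Hypothetical candidate locations; adjust them to your own installation.
    candidates = [
        r"C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe",
        r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    ]
    print(resolve_tesseract(candidates))
```

If the result is not None, it can be assigned to pytesseract.pytesseract.tesseract_cmd before image_to_string is called.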

Answer 2

First you should install the binary:

On Linux

sudo apt-get update
sudo apt-get install libleptonica-dev
sudo apt-get install tesseract-ocr tesseract-ocr-dev
sudo apt-get install libtesseract-dev

On Mac

brew install tesseract

On Windows

Download the binary from https://github.com/UB-Mannheim/tesseract/wiki, then add pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe' to your script.

Then you should install the Python package using pip:

pip install tesseract
pip install tesseract-ocr

References: https://pypi.org/project/pytesseract/ (INSTALLATION section) and https://github.com/tesseract-ocr/tesseract/wiki#installation

Answer 3

For Windows only

1 - You need to have Tesseract OCR installed on your computer.

Get it from here: https://github.com/UB-Mannheim/tesseract/wiki

Download the suitable version.

2 - Add the Tesseract path to your system environment, i.e. edit the system variables.

3 - Run pip install pytesseract and pip install tesseract

4 - Add this line to your Python script every time:

pytesseract.pytesseract.tesseract_cmd = 'C:/OCR/Tesseract-OCR/tesseract.exe'  # your path may be different

5 - Run the code.

Answer 4

This error occurs because tesseract is not installed on your computer.

If you are using Ubuntu, install tesseract with the following command:

sudo apt-get install tesseract-ocr

For Mac:

brew install tesseract
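
To confirm that the install actually put an executable where Python can launch it, one option is to run the tool with --version from a script. This is only a sketch: the helper name tool_version is hypothetical, and it assumes the tool accepts a --version flag, as tesseract does.

```python
import subprocess
import sys


def tool_version(cmd):
    """Return the first line printed by `cmd --version`, or None if cmd cannot be run."""
    try:
        result = subprocess.run(
            [cmd, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some tools print the version on stderr
            universal_newlines=True,
        )
    except OSError:
        # Raised when the executable does not exist or is not runnable.
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    # None here means Python cannot find tesseract on PATH.
    print(tool_version("tesseract"))
    # The Python interpreter is used only to show what a successful call looks like.
    print(tool_version(sys.executable))
```

A None result from the first call corresponds to the situation in which pytesseract raises TesseractNotFoundError.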

Answer 5

From https://pypi.org/project/pytesseract/:

pytesseract.pytesseract.tesseract_cmd = '<full_path_to_your_tesseract_executable>'
# Include the above line, if you don't have tesseract executable in your PATH
# Example tesseract_cmd: 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract'

Answer 6

On Windows:

pip install tesseract

pip install tesseract-ocr

Then check the pytesseract.py file stored on your system at usr/appdata/local/programs/site-packages/python/python36/lib/pytesseract/pytesseract.py and compile that file.

Answer 7

On Mac, you can install it as shown below. This works for me.

brew install tesseract

Answer 8

You can install this package: https://github.com/UB-Mannheim/tesseract/wiki. After that, go to the path C:\Program Files (x86)\Tesseract-OCR\tesseract.exe and run the tesseract file. I think this will help you.

Answer 9

On 64-bit Windows, just add the following to the PATH environment variable: "C:\Program Files\Tesseract-OCR" and it will work.
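
If editing the system-wide variable is not possible, the same effect can be approximated for a single script by extending PATH inside the Python process before pytesseract launches the executable. The sketch below is an illustration only; the helper name prepend_to_path is hypothetical, and the folder is the one assumed in this answer.

```python
import os


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of PATH in `env` unless it is already listed."""
    env = os.environ if env is None else env
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.insert(0, directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]


if __name__ == "__main__":
    # Assumed install folder from the answer above; adjust it if yours differs.
    prepend_to_path(r"C:\Program Files\Tesseract-OCR")
    print(os.environ["PATH"].split(os.pathsep)[0])
```

The change lasts only for the current process and the programs it starts, so nothing is written to the system settings.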

Answer 10

I was able to solve it by updating the tesseract_cmd variable in the pytesseract.py file with the bin/tesseract path.

Answer 11

I faced the same problem on Windows. I tried updating the environment variable for the tesseract path, but that did not work.

What worked for me was to modify pytesseract.py, which can be found in the path C:\Program Files\Python37\Lib\site-packages\pytesseract or usually in C:\Users\YOUR USER\APPDATA\Python

I changed one line as follows:

#tesseract_cmd = 'tesseract'
tesseract_cmd = 'C:\Program Files\Tesseract-OCR\\tesseract.exe'

Note that I had to add an extra \ before tesseract, because Python otherwise interprets \t as a tab character and you get the following error message:

pytesseract.pytesseract.TesseractNotFoundError: C:\Program Files\Tesseract-OCR esseract.exe is not installed or it's not in your path
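
The escaping problem described here can be reproduced without Tesseract at all. The short sketch below writes the same Windows path three ways; the variable names are only illustrative.

```python
# The same Windows path written three ways.
mixed = "C:\\Program Files\\Tesseract-OCR\tesseract.exe"     # "\t" becomes a tab
escaped = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"  # every backslash doubled
raw = r"C:\Program Files\Tesseract-OCR\tesseract.exe"        # raw string literal

print("\t" in mixed)    # True: the path is silently corrupted
print(escaped == raw)   # True: both spell the intended path
print(repr(mixed))
```

A raw string or forward slashes avoids having to remember which letters after a backslash form an escape sequence.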

Answer 12

Step 1:

Install tesseract on your system according to your OS. The latest installers can be found at https://github.com/UB-Mannheim/tesseract/wiki

Step 2: Install the following dependency libraries using:

pip install pytesseract
pip install opencv-python
pip install numpy

Step 3: Sample code

import cv2
import numpy as np
import pytesseract
from PIL import Image
from pytesseract import image_to_string

# Path of the working folder on disk; replace with your working folder
src_path = "C:\\Users\\<user>\\PycharmProjects\\ImageToText\\input\\"
# If you don't have the tesseract executable in your PATH, include the following:
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
# Note: as a plain Python variable this has no effect on Tesseract, which reads TESSDATA_PREFIX from the environment
TESSDATA_PREFIX = 'C:/Program Files (x86)/Tesseract-OCR'

def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)

    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)

    # Write image after removed noise
    cv2.imwrite(src_path + "removed_noise.png", img)

    #  Apply threshold to get image with only black and white
    #img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)

    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)

    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))

    # Remove template file
    #os.remove(temp)

    return result


print('--- Start recognize text from image ---')
print(get_string(src_path + "image.png") )

print("------ Done -------")

Answer 13

You will need to install tesseract.

https://github.com/tesseract-ocr/tesseract/wiki

See the documentation above for installation instructions.

Answer 14

On Windows, the command path must be redirected to the default Windows tesseract installation.

On a 32-bit system, add this line after the import statements.

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

On a 64-bit system, add this line instead.

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
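
Instead of choosing between the two lines by hand, a script could build both candidate paths from the standard Windows environment variables ProgramFiles and ProgramFiles(x86). This is a sketch under the assumption that the installer used its default Tesseract-OCR folder; the helper name windows_candidates is hypothetical.

```python
import ntpath
import os


def windows_candidates(env=None):
    """List likely tesseract.exe locations under the Program Files folders."""
    env = os.environ if env is None else env
    paths = []
    # ProgramFiles and ProgramFiles(x86) are standard Windows environment variables.
    for var in ("ProgramFiles", "ProgramFiles(x86)"):
        base = env.get(var)
        if base:
            # ntpath joins with backslashes regardless of the host OS.
            candidate = ntpath.join(base, "Tesseract-OCR", "tesseract.exe")
            if candidate not in paths:
                paths.append(candidate)
    return paths


if __name__ == "__main__":
    for path in windows_candidates():
        print(path, os.path.isfile(path))
```

The first candidate that exists on disk is the one to assign to pytesseract.pytesseract.tesseract_cmd.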

Answer 15

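Maybe this happens because, even if Tesseract is correctly installed, you have not installed your language, as was my case. Fortunately, this is easy to fix, and I did not even need to mess with tesseract_cmd.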

sudo apt-get install tesseract-ocr -y
sudo apt-get install tesseract-ocr-spa -y
tesseract --list-langs

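Note that in the second line we specify -spa for Spanish.

If the installation was successful, you should get a list of your available languages, such as: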

List of available languages (3):
eng
osd
spa
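
A script can make the same check by parsing that listing before it requests a language. The sketch below assumes the output has the shape shown above, with one header line followed by one language code per line; the helper name parse_langs is hypothetical.

```python
def parse_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first non-empty line is a header such as "List of available languages (3):".
    if lines and lines[0].lower().startswith("list of available languages"):
        lines = lines[1:]
    return lines


if __name__ == "__main__":
    sample = "List of available languages (3):\neng\nosd\nspa\n"
    langs = parse_langs(sample)
    print(langs)           # ['eng', 'osd', 'spa']
    print("spa" in langs)  # True
```

In practice the sample string would be replaced by the captured output of the tesseract --list-langs command.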

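I found this in this blog post (in Spanish). There is also a post about installing Spanish on Windows (apparently not that easy).

Note: since the question uses lang = 'eng', this is probably not the answer in that specific case. But the same error can occur in other situations, which is why I posted the answer here.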

Answer 16

For Windows users only:

Install tesseract using the following command:

pip install tesseract

Then add this line to your code, and note the "\\":

pytesseract.pytesseract.tesseract_cmd = "C:\Program Files (x86)\Tesseract-OCR\\tesseract.exe"

Answer 17

It worked for me just by installing tesseract with conda.

conda install -c conda-forge tesseract

Answer 18

For Linux distributions (Ubuntu)

Try

sudo apt install tesseract-ocr
sudo apt install libtesseract-dev

Answer 19

Install tesseract using the following command

pip install tesseract

Answer 20

# {Windows 10 instructions}
# Before you use the script you need to install the dependencies.
# 1. Download tesseract from the official link:
#   https://github.com/UB-Mannheim/tesseract/wiki
# 2. Install tesseract.
#   I chose this path
#       (replace "user" in the path below with the user name on your current machine):
#       C:\Users\user\AppData\Local\Tesseract-OCR\
# 3. Install pillow for your Python version.
# * The way that works best for me is the following (I am using Python 3.7, and in my CMD I start this version by typing py -3.7).
# * If you are using another version of Python, first check how you start Python from your CMD.
# * On some machines Python is started from CMD differently.
    # [examples]
    # =================================
    # PYTHON VERSION 3.7
    # python
    # python3.7
    # python -3.7
    # python 3.7
    # python3
    # python -3
    # python 3
    # py3.7
    # py -3.7
    # py 3.7
    # py3
    # py -3
    # py 3
    # PYTHON VERSION 3.6
    # python
    # python3.6
    # python -3.6
    # python 3.6
    # python3
    # python -3
    # python 3
    # py3.6
    # py -3.6
    # py 3.6
    # py3
    # py -3
    # py 3
    # PYTHON VERSION 2.7
    # python
    # python2.7
    # python -2.7
    # python 2.7
    # python2
    # python -2
    # python 2
    # py2.7
    # py -2.7
    # py 2.7
    # py2
    # py -2
    # py 2
    # ================================
# We use pip to install the dependencies.
# Because I start Python 3.7 with the following line
    # py -3.7
# open CMD on your Windows machine and type the following line:
    # py -3.7 -m pip install pillow
# 4. Install pytesseract and tesseract for your Python version.
# * The way that works best for me is the following (I am using Python 3.7, and in my CMD I start this version by typing py -3.7).
# We use pip to install the dependencies.
# Open CMD on your Windows machine and type the following lines:
    # py -3.7 -m pip install pytesseract
    # py -3.7 -m pip install tesseract


#!/usr/bin/python
from PIL import Image
import pytesseract
import os
import getpass

def extract_text_from_image(image_file_name_arg):

    # IMPORTANT
    # If you followed the installation instructions in the comments above,
    # this is the path on my machine.
    # If you don't give the right path to tesseract.exe the script will not work.
    username = getpass.getuser()
    # The line above gets the user name for your machine automatically.
    tesseract_exe_path_installation="C:\\Users\\"+username+"\\AppData\\Local\\Tesseract-OCR\\tesseract.exe"
    pytesseract.pytesseract.tesseract_cmd=tesseract_exe_path_installation

    # Specify the directory of your image files manually, or use the line below if the images are in an "images" folder under the current working directory.
    # image_dir="D:\\GIT\\ai_example\\extract_text_from_image\\images"
    image_dir=os.getcwd()+"\\images"
    dir_seperator="\\"
    image_file_name=image_file_name_arg
    # if your image are in different format change the extension(ex. ".png")
    image_ext=".jpg"
    image_path_dir=image_dir+dir_seperator+image_file_name+image_ext

    print("=============================================================================")
    print("image used is in the following path dir:")
    print("\t"+image_path_dir)
    print("=============================================================================")

    img=Image.open(image_path_dir)
    text=pytesseract.image_to_string(img, lang="eng")
    print(text)

# Replace "image_1" with the name of your image file, without the extension.
# image_file_name_arg="image_1"
image_file_name_arg="image_2"
# image_file_name_arg="image_3"
# image_file_name_arg="image_4"
# image_file_name_arg="image_5"
extract_text_from_image(image_file_name_arg)

# ==================================
# CREATED BY: SHERIFI
# e-mail: sherif_co@yahoo.com
# git-link for script: https://github.com/sherifi/ai_example.git
# ==================================

Answer 21

For Ubuntu 18.04

If you get errors like

 tesseract is not installed or it's not in your path

 and 

 OSError: [Errno 12] Cannot allocate memory

it may be a swap memory allocation issue.

You can check this answer on allocating more swap memory. Hope it helps :)

https://askubuntu.com/questions/920595/fallocate-fallocate-failed-text-file-busy-in-ubuntu-17-04?answertab=active#tab-top

Answer 22

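This question already has many good answers, but I would like to share a great site that helped me when I could not resolve "TesseractNotFound Error: tesseract is not installed or it's not in your path". Please see this site: https://www.thetopsites.net/article/50655738.shtml

I realized I was getting this error because I had installed pytesseract with pip but forgot to install the binary. You are probably missing tesseract-ocr on your machine. See the installation instructions here: https://github.com/tesseract-ocr/tesseract/wiki

On a Mac, you can install it with Homebrew:

brew install tesseract

After that it should run fine!

On Windows 10, the following approach worked for me:

1. Go to this link, download tesseract and install it. The Windows version is available here: https://github.com/UB-Mannheim/tesseract/wiki
2. Find the script file pytesseract.py in C:\Users\User\Anaconda3\Lib\site-packages\pytesseract and open it. Change the line tesseract_cmd = 'tesseract' to: tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe' (this is the path where Tesseract-OCR is installed, so check your install location and update the path accordingly)
3. You may also need to add the environment variable C:/Program Files (x86)/Tesseract-OCR/

Hope this works for you!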

Answer 23

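This solution for Ubuntu worked for me:

I installed tesseract on Ubuntu by following the link below

https://medium.com/quantrium-tech/installing-tesseract-4-on-ubuntu-18-04-b6fcd0cbd78f

and later added the trained data language files to tessdata by following the link below

Tesseract running error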

Answer 24

There appears to be a problem with the latest version of the pip module, pytesseract=0.3.7. I downgraded it to pytesseract=0.3.6 and no longer see the error.
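
Before downgrading, it can help to confirm which version is actually installed in the interpreter that runs the script. The sketch below uses the standard library module importlib.metadata (Python 3.8 or later); the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version("pytesseract"))
```

A result of None means the package is not installed for this interpreter at all, which is a different problem from a faulty release.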

Answer 25

For Windows, just a few simple steps:

Download the Windows version from https://github.com/UB-Mannheim/tesseract/wiki
Install it

Write the following in your .py file (check the install location)

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
img_text = pytesseract.image_to_string(Image.open(filename))

Answer 26

For me, it worked when I used single quotes

pytesseract.pytesseract.tesseract_cmd = r'C:/Program Files/Tesseract-OCR/tesseract.exe'

Putting the path inside double quotes was automatically inserting unwanted characters.

Answer 27

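The tips above did not solve the problem for me, because the error named in the title appeared when I installed pytesseract (PyCharm, Python 2.7). Strangely, tesseract also worked from the command line, so the installation itself was correct.

I was able to solve the problem with the following steps:

Download pytesseract.py from the repository https://github.com/madmaze/pytesseract
Remove all syntax errors related to the differences between interpreters (2.7 and 3.*), including the try/except constructs
Import the edited script into your program as if it were your own script, and configure the tesseract_cmd variable as recommended in the repository.

After that, the image-to-text conversion worked in Python 2.7.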

Answer 28

Anaconda install:

Works for Mac, Linux and Windows

conda-forge / packages / tesseract 4.1.1

Step 1:

conda install -c conda-forge tesseract

Step 2: Find the Tesseract PATH if you do not already know it

import os

for r,s,f in os.walk("/"):
    for i in f:
        if "tesseract" in i:
            print(os.path.join(r,i))

For example, my Tesseract PATH was /anaconda/bin/tesseract

Step 3: Set the Tesseract path for pytesseract

pytesseract.pytesseract.tesseract_cmd = r'/anaconda/bin/tesseract'
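
Walking the whole filesystem from "/" as in Step 2 can take a long time. As a hedged alternative, the search could start from a narrower root and stop after a few directory levels. The helper name find_files and the max_depth parameter are hypothetical; the demo root /anaconda is only the example location from this answer.

```python
import os


def find_files(root, name, max_depth=3):
    """Yield paths under `root` whose file name equals `name`, up to `max_depth` levels deep."""
    root = os.path.abspath(root)
    base_depth = root.rstrip(os.sep).count(os.sep)
    for dirpath, dirnames, filenames in os.walk(root):
        depth = dirpath.rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            # Clearing dirnames in place stops os.walk from descending further.
            dirnames[:] = []
        if name in filenames:
            yield os.path.join(dirpath, name)


if __name__ == "__main__":
    # Assumed starting point; use the root of your own conda installation.
    for path in find_files("/anaconda", "tesseract", max_depth=2):
        print(path)
```

Unlike the substring match in Step 2, this matches the exact file name, so on Windows the name to pass would be tesseract.exe.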

Answer 29

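I tried this on my Raspberry Pi. I just changed the path from this:

'C:/Program Files/Tesseract-OCR/tesseract.exe'

(as that one is for Windows) to this:

/usr/local/lib/python3.7/dist-packages

because this is the path I see every time I run this command:

pip3 show pytesseract

To make it clearer, here is the message. Command line here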

Answer 30

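I also faced the same error when installing tesseract.

Based on my recently resolved issue, I followed the steps below:

1. Install tesseract using the Windows installer available at the given link: https://github.com/UB-Mannheim/tesseract/wiki
2. Note the tesseract path from the installation. The default installation path at the time of this edit was: C:\Users\USER\AppData\Local\Tesseract-OCR. It may change, so please check the installation path.

If it still shows the error after installation, or reports that tesseract is not installed, press Windows + R and run the path to your file (C:\Program Files\Tesseract-OCR\tesseract.exe). That worked for me.

3. pip install pytesseract

4. Set the tesseract path in the script before calling image_to_string:

For a Windows file path:

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

On Linux the installation is different, but an example Linux file path is given below

pytesseract.pytesseract.tesseract_cmd = r'/home/user/bin/tesseract'

To install opencv, refer to this question link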

Answer 31

!sudo apt install tesseract-ocr

Answer 32

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe'

This helped me.

Answer 33

I recommend that everyone watch this guy's video, he is great. Nothing else solved my problem, but this did. Link: https://youtu.be/R4zK1-1lgCQ

Answer 1: score 178, accepted, 2018-12-07 15:16:44
Answer 2: score 75, 2018-09-08 03:45:11
Answer 3: score 19, 2019-03-06 12:50:15
Answer 4: score 12, 2020-07-09 17:32:22
Answer 5: score 10, 2018-07-18 08:20:29
Answer 6: score 7, 2018-10-30 06:23:06
Answer 7: score 5, 2019-07-03 22:53:37
Answer 8: score 3, 2018-11-08 10:23:21
Answer 9: score 3, 2019-11-25 16:06:18
Answer 10: score 3, 2020-04-12 01:34:02
Answer 11: score 3, 2020-04-15 18:14:49
Answer 12: score 2, 2019-03-05 05:26:41
Answer 13: score 1, 2018-06-20 16:16:29
Answer 14: score 1, 2019-06-28 02:14:12
Answer 15: score 1, 2020-07-21 10:08:58
Answer 16: score 1, 2020-10-24 13:26:54
Answer 17: score 1, 2021-03-12 11:47:06
Answer 18: score 1, 2021-04-15 08:49:26
Answer 19: score 0, 2018-09-23 07:32:42
Answer 20: score 0, 2019-07-18 12:02:33
Answer 21: score 0, 2019-10-22 04:46:39
Answer 22: score 0, 2020-09-26 06:43:23
Answer 23: score 0, 2021-01-19 09:34:13
Answer 24: score 0, 2021-01-28 10:46:16
Answer 25: score 0, 2021-04-04 14:46:11
Answer 26: score 0, 2021-04-06 09:44:19
Answer 27: score 0, 2021-07-10 17:06:51
Answer 28: score 0, 2021-08-27 18:09:31
Answer 29: score 0, 2021-11-01 14:53:09
Answer 30: score 0, 2022-02-03 16:25:59
Answer 31: score 0, 2022-08-26 04:57:42
Answer 32: score -3, 2022-04-08 14:12:31
Answer 33: score -4, 2021-12-27 14:07:19
