Can't install tesseract-ocr package: "compile failed with error code 1 in /tmp/pip_build_root/tesseract-ocr"

Question

I'm trying to install the tesseract-ocr package for use with pytesseract and am running into an odd issue. Installing everything else with pip worked, but when I ran sudo pip install tesseract-ocr as instructed here, I got the following errors:

Command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip_build_root/tesseract-ocr/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-zsaPkE-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /tmp/pip_build_root/tesseract-ocr
Traceback (most recent call last):
  File "/usr/bin/pip", line 9, in <module>
    load_entry_point('pip==1.5.4', 'console_scripts', 'pip')()
  File "/usr/lib/python2.7/dist-packages/pip/__init__.py", line 235, in main
    return command.main(cmd_args)
  File "/usr/lib/python2.7/dist-packages/pip/basecommand.py", line 161, in main
    text = '\n'.join(complete_log)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 42: ordinal not in range(128)

I have a feeling that the traceback is causing the UnicodeDecodeError. Does anyone have any ideas on how to resolve this?

Answer 1

The link provided only mentions using pip to install pytesseract, not Tesseract-OCR.

As mentioned there, you will also need the Python Imaging Library (PIL). If it is not installed on your system, you can install Pillow instead with sudo pip install pillow.

Tesseract-OCR cannot be installed with sudo pip install tesseract-ocr because it is not a Python module like pytesseract. From what I can see, Tesseract-OCR is written mostly in C++.
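Since Tesseract-OCR is a separate program rather than a Python module, one way to see whether it is already present is to look for its executable on the PATH. The following is only a minimal sketch using the Python 3 standard library; the helper names find_tesseract and tesseract_version are made up for illustration, and it assumes the executable is called tesseract.

```python
import shutil
import subprocess


def find_tesseract(binary_name="tesseract"):
    """Return the full path of the executable, or None if it is not on PATH."""
    # Assumption: the system package installs an executable named "tesseract".
    return shutil.which(binary_name)


def tesseract_version(path):
    """Run `<path> --version` and return the first line of what it prints."""
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr in case the version is printed there
        universal_newlines=True,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found on PATH; install it with the system package manager, not pip")
    else:
        print(path, "->", tesseract_version(path))
```

If the script reports that nothing was found, installing pytesseract with pip will not help by itself; the Tesseract-OCR program has to be installed separately.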

The link given, http://code.google.com/p/tesseract-ocr/, no longer hosts Tesseract-OCR; the project has moved to https://github.com/tesseract-ocr/tesseract.

Install instructions can be found at https://github.com/tesseract-ocr/tesseract/wiki.

On Linux, use sudo apt-get install tesseract-ocr, or sudo apt-get install tesseract-ocr-all to install all languages.

On Mac, use brew install tesseract, or brew install tesseract --all-languages to install all languages. You will need Homebrew installed; it can be found at https://brew.sh.

For Windows, an installer can be found at https://github.com/tesseract-ocr/tesseract/wiki/Downloads/. The current stable version should come with all languages included.
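The per-platform instructions above can be collected into a small lookup, for example to print a hint from a setup script. This is a rough sketch, not an official installer: INSTALL_HINTS and install_hint are hypothetical names, and the Linux entry assumes a Debian/Ubuntu system where apt-get is available.

```python
import platform

# Commands and URLs as given in this answer, keyed by the value of platform.system().
# Assumption: "Linux" means a distribution that uses apt-get.
INSTALL_HINTS = {
    "Linux": "sudo apt-get install tesseract-ocr",
    "Darwin": "brew install tesseract",
    "Windows": "download the installer from https://github.com/tesseract-ocr/tesseract/wiki/Downloads/",
}

FALLBACK_HINT = "see https://github.com/tesseract-ocr/tesseract/wiki"


def install_hint(system=None):
    """Return an install hint for the given platform name (default: this machine)."""
    if system is None:
        system = platform.system()
    return INSTALL_HINTS.get(system, FALLBACK_HINT)


if __name__ == "__main__":
    print(install_hint())
```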

Answer 1: score 4, accepted, 2017-03-14 15:54:56
