Unable to configure pytesseract on Heroku

I am trying to deploy a pytesseract app to Heroku after doing a lot of research online.

I added TESSDATA_PREFIX=./.apt/usr/share/tesseract-ocr/4.00/tessdata to my Heroku config vars.

I have https://github.com/heroku/heroku-buildpack-apt in my Heroku buildpacks.

I have an Aptfile containing:

tesseract-ocr
tesseract-ocr-eng

I have

pytesseract.tesseract_cmd = '/app/.apt/usr/bin/tesseract'

in my code.

I am deploying a Flask API to Heroku, so my Procfile is: web: gunicorn app:app

The error from heroku logs:

2022-11-16T04:22:39.262113+00:00 app[web.1]: text = pytesseract.image_to_string(img, config="--psm 6")
2022-11-16T04:22:39.262115+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.10/site-packages/pytesseract/pytesseract.py", line 423, in image_to_string  
2022-11-16T04:22:39.262116+00:00 app[web.1]: return {
2022-11-16T04:22:39.262117+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.10/site-packages/pytesseract/pytesseract.py", line 426, in <lambda>
2022-11-16T04:22:39.262117+00:00 app[web.1]: Output.STRING: lambda: run_and_get_output(*args),
2022-11-16T04:22:39.262117+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.10/site-packages/pytesseract/pytesseract.py", line 288, in run_and_get_output
2022-11-16T04:22:39.262118+00:00 app[web.1]: run_tesseract(**kwargs)
2022-11-16T04:22:39.262118+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.10/site-packages/pytesseract/pytesseract.py", line 264, in run_tesseract    
2022-11-16T04:22:39.262119+00:00 app[web.1]: raise TesseractError(proc.returncode, get_errors(error_string))
2022-11-16T04:22:39.262121+00:00 app[web.1]: pytesseract.pytesseract.TesseractError: (127, '/app/.apt/usr/bin/tesseract: error while loading shared libraries: libarchive.so.13: cannot open shared object file: No such file or directory')

Is there anything I missed, or how should I solve this?

The error message indicates a missing library:

error while loading shared libraries: libarchive.so.13: cannot open shared object file: No such file or directory

The apt buildpack doesn't do dependency resolution, so you may have to explicitly include transitive dependencies.

You can search https://packages.ubuntu.com to see which packages contain the missing files. Make sure to match the Ubuntu LTS version to the Heroku stack you are using, e.g. for Heroku 22 you'll want to look at packages for Ubuntu 22.04 LTS (Jammy).
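
As a rough illustration of that lookup, the sketch below builds a "search the contents of packages" URL for a given file name and stack. The helper name contents_search_url and the STACK_TO_SUITE table are made up for this example, and the query parameters are an assumption based on what the site's own search form appears to produce, so check the resulting link in a browser before relying on it.

```python
from urllib.parse import urlencode

# Heroku stack -> Ubuntu release codename it is based on.
# (Illustrative table; extend it for whichever stack you deploy to.)
STACK_TO_SUITE = {
    "heroku-20": "focal",  # Ubuntu 20.04 LTS
    "heroku-22": "jammy",  # Ubuntu 22.04 LTS
}


def contents_search_url(filename, stack="heroku-22"):
    """Build a packages.ubuntu.com contents-search URL for a file name.

    The parameter names below are assumed from the site's search form.
    """
    query = urlencode({
        "searchon": "contents",
        "keywords": filename,
        "mode": "exactfilename",
        "suite": STACK_TO_SUITE[stack],
        "arch": "any",
    })
    return f"https://packages.ubuntu.com/search?{query}"


print(contents_search_url("libarchive.so.13"))
```

Opening the printed link should list the packages that ship a file with that exact name for the chosen Ubuntu release.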

In this case, the libarchive13 package contains libarchive.so.13. Add that to your Aptfile, commit, and redeploy. If you find other missing dependencies, repeat the process.
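
Instead of discovering missing libraries one deploy at a time, you could list them all at once with ldd, which prints "not found" next to every shared library it cannot resolve. The following is only a sketch: the function names are invented for this example, the binary path is the one from the question, and the sample output is illustrative rather than taken from a real dyno.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libarchive.so.13 => not found"
        name, sep, target = line.partition("=>")
        if sep and target.strip() == "not found":
            missing.append(name.strip())
    return missing


def check_binary(path="/app/.apt/usr/bin/tesseract"):
    """Run ldd on the binary (path assumed from the question) and list
    its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_libraries(result.stdout)


# Illustrative ldd output, not copied from a real dyno.
sample = (
    "\tlinux-vdso.so.1 (0x00007ffd3a5f2000)\n"
    "\tlibarchive.so.13 => not found\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c2a400000)\n"
)
print(missing_libraries(sample))  # ['libarchive.so.13']
```

Calling check_binary() from a one-off dyno (for example in a session started with heroku run bash) would give the list for the deployed binary. Each name can then be looked up on packages.ubuntu.com and the matching package added to the Aptfile.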
