Use Tesseract 4 - Docker Container from uwsgi-nginx-flask-docker

I had my Python project running locally, and it worked. I call Tesseract from Python using the subprocess module.

Then I deployed my project. Since I use Flask, I installed tiangolo-uwsgi-flask-nginx-docker, but Tesseract isn't installed there. As a result my project no longer works, because it cannot find tesseract. It doesn't see the Tesseract installed on my AWS instance either, because that installation lives on the host and not inside the Docker container.

That's why I would also like to use the Tesseract 4 Docker image, which comes with Tesseract installed.

I have both containers running:

c82b61361992        tesseractshadow/tesseract4re:latest   "/bin/bash"            6 seconds ago       Up 5 seconds                                      t4re
e122633ef81c        my_project:latest                 "/entrypoint.sh /sta   35 minutes ago      Up 35 minutes       0.0.0.0:80->80/tcp, 443/tcp   modest_perlman

But I don't know how to tell my_project to use Tesseract from the Tesseract container.

I read this post about connecting two Docker containers, but it left me even more lost. :)

I saw that the Tesseract Docker container is supposed to be used like this:

#!/bin/bash
docker ps -f name=t4re
TASK_TMP_DIR=TASK_$$_$(date +"%N")
echo "====== TASK $TASK_TMP_DIR started ======"
docker exec -it t4re mkdir \-p ./$TASK_TMP_DIR/
docker cp ./ocr-files/phototest.tif t4re:/home/work/$TASK_TMP_DIR/
docker exec -it t4re /bin/bash -c "mkdir -p ./$TASK_TMP_DIR/out/; cd ./$TASK_TMP_DIR/out/; tesseract ../phototest.tif phototest -l eng --psm 1 --oem 2 txt pdf hocr"
mkdir -p ./ocr-files/output/$TASK_TMP_DIR/
docker cp t4re:/home/work/$TASK_TMP_DIR/out/ ./ocr-files/output/$TASK_TMP_DIR/
docker exec -it t4re rm \-r ./$TASK_TMP_DIR/
docker exec -it t4re ls
echo "====== Result files were copied to ./ocr-files/output/$TASK_TMP_DIR/ ======"

But I have no idea how to implement this in my Python script, running from the other container.
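For orientation, the steps of the shell script could be driven from Python with subprocess roughly as follows. This is only a sketch under stated assumptions: the `docker` CLI must be reachable from wherever the script runs (which is not the case inside a stock container), `/home/work` is assumed to be the working directory inside t4re, and the helper names `build_commands` and `run_ocr` are made up for illustration.

```python
import os
import shlex
import subprocess
import time

CONTAINER = "t4re"               # container name from the `docker ps` output above
REMOTE_ROOT = "/home/work"       # assumed working directory inside t4re
OUT_ROOT = "./ocr-files/output"  # local output folder used by the shell script


def build_commands(image_path, task_dir):
    """Build the docker commands for one OCR task, mirroring the shell script."""
    name = os.path.basename(image_path)
    base = os.path.splitext(name)[0]
    remote = f"{REMOTE_ROOT}/{task_dir}"
    ocr = (
        f"mkdir -p {remote}/out/ && cd {remote}/out/ && "
        f"tesseract {shlex.quote('../' + name)} {shlex.quote(base)} "
        "-l eng --psm 1 --oem 2 txt pdf hocr"
    )
    # No -it here: subprocess provides no TTY, so `docker exec -t` would fail.
    return [
        ["docker", "exec", CONTAINER, "mkdir", "-p", remote],
        ["docker", "cp", image_path, f"{CONTAINER}:{remote}/"],
        ["docker", "exec", CONTAINER, "/bin/bash", "-c", ocr],
        ["docker", "cp", f"{CONTAINER}:{remote}/out/", f"{OUT_ROOT}/{task_dir}/"],
        ["docker", "exec", CONTAINER, "rm", "-r", remote],
    ]


def run_ocr(image_path):
    """Run every step in order and return the local folder the results land in."""
    task_dir = f"TASK_{os.getpid()}_{time.time_ns()}"
    out_dir = os.path.join(OUT_ROOT, task_dir)
    os.makedirs(out_dir, exist_ok=True)
    for cmd in build_commands(image_path, task_dir):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return out_dir
```

Calling `run_ocr("./ocr-files/phototest.tif")` would repeat what the script does for phototest.tif. Keeping the command construction separate from execution makes it possible to inspect the commands without Docker being present.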

My python-tesseract script looks quite similar to pytesseract.py; I just changed a few lines and deleted some parts I don't need.

Maybe someone knows how to do this, or can propose a better way to use Tesseract with the tiangolo Docker image.

EDIT (See the edit below)

I found the answer. Since it works for any two Docker containers, I'm going to write up a general solution that can be reused.

I have both Docker images and containers on the same instance:

CONTAINER ID        IMAGE                 COMMAND             CREATED             STATUS              PORTS                    NAMES
14524d364cff        (image)               "java -jar ..."   40 hours ago        Up 40 hours         0.0.0.0:5000->5000/tcp   api-1
3392994ae3ac        (image)               "java -jar ..."   40 hours ago        Up 40 hours         0.0.0.0:5002->5002/tcp   api-2

Up to here it's easy.

Then I wrote a docker-compose.yml:

version: '2'
services:         
  api-1:
    image: _name-of-image_
    container_name: api-1
    ports:
      - "5000:5000"
    depends_on:
      - api-2

  api-2:
    image: _name-of-image_
    container_name: api-2
    ports:
      - "5002:5002"

Then, in the Dockerfile of api-1, for example:

...
ENV API-2HOST api-2
...

And that's it.

In my particular case, I have an api-1.conf with:

accounts = {
  http = {
    host = "localhost"
    host = ${?API-2HOST}
    port = 5002
    poolBufferSize = 100
    routes = {
      authentication = "/authentication"
      login = "/login/"
      logout = "/logout"
      refreshTokens = "/refreshTokens"
    }
  }
}

and then I can easily make a request to it, so the two Docker containers can communicate. This works because Compose attaches both services to the same network, where the service name api-2 resolves as a hostname.
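For a Python service like the Flask app in the question, the same fallback could look roughly like this. It is a sketch, not part of the original setup: the helper names `api2_base_url` and `route_url` are hypothetical, while the variable name `API-2HOST`, the port and the routes are taken from the files above.

```python
import os


def api2_base_url(env=os.environ, port=5002):
    """Return the base URL of api-2, falling back to localhost like api-1.conf."""
    # Under docker-compose the variable holds the service name "api-2";
    # outside Docker it is unset, so "localhost" is used instead.
    host = env.get("API-2HOST", "localhost")
    return f"http://{host}:{port}"


def route_url(route, env=os.environ):
    """Join the base URL with one of the routes listed in api-1.conf."""
    return api2_base_url(env) + route
```

With the variable set to `api-2`, `route_url("/login/")` gives `http://api-2:5002/login/`; without it, the URL points at `localhost`, which matches the local development case.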

I hope this helps someone.

EDIT

Since this can get complicated, I created a Git project containing just a Dockerfile that gives you Flask, nginx, uWSGI and Tesseract in one image. So there's no need to use two containers.

docker-flask-nginx-uwsgi-tesseract
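Once Tesseract lives in the same image as the Flask app, calling it through subprocess works as it did locally. A minimal sketch, assuming the `tesseract` binary is on the PATH inside the container; the helper names `tesseract_command` and `image_to_text` are hypothetical and not taken from that project:

```python
import shutil
import subprocess


def tesseract_command(image_path, output_base, lang="eng"):
    """Build the tesseract command line for one image."""
    return ["tesseract", image_path, output_base, "-l", lang]


def image_to_text(image_path, lang="eng"):
    """Run tesseract on an image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed in this container")
    # Passing "stdout" as the output base makes tesseract print the text
    # instead of writing a .txt file.
    cmd = tesseract_command(image_path, "stdout", lang)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

The `shutil.which` check turns the "cannot find tesseract" situation from the question into an explicit error message rather than a failure deep inside subprocess.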
