
Run Python package with multiple .py scripts in docker-compose.yml

Summarize the problem:

The Python package opens the PDFs in a batch folder, reads the first page of each PDF, matches keywords, and dumps compatible PDFs into a source folder for the OCR scripts to pick up. The first script to receive all the PDFs is MainBankClass.py.

I am trying to use a docker-compose file to put all these Python scripts under the same network and volume, so that each OCR script starts scanning bank statements once the pre-processing is done. This link is the closest I have found so far to accomplishing the goal, but it seems that I missed some parts of it.

The different OCR scripts are called with runpy.run_path(path_name='ChaseOCR.py'), so these scripts are in the same directory as __init__.py. Here is the filesystem structure:

BankStatements
 ┣ BankofAmericaOCR
 ┃ ┣ BancAmericaOCR.py
 ┃ ┗ Dockerfile.bankofamerica
 ┣ ChaseBankStatementOCR
 ┃ ┣ ChaseOCR.py
 ┃ ┗ Dockerfile.chase
 ┣ WellsFargoStatementOCR
 ┃ ┣ Dockerfile.wellsfargo
 ┃ ┗ WellsFargoOCR.py
 ┣ BancAmericaOCR.py
 ┣ ChaseOCR.py
 ┣ Dockerfile
 ┣ WellsFargoOCR.py
 ┣ __init__.py
 ┗ docker-compose.yml

What I've tried so far:

In docker-compose.yml:

version: '3'

services:
    mainbankclass_container:
        build: 
            context: '.'
            dockerfile: Dockerfile
        volumes: 
            - /Users:/Users
        #links:
        #    - "chase_container"
        #    - "wellsfargo_container"
        #    - "bankofamerica_container"
    chase_container:
        build: .
        working_dir: /app/ChaseBankStatementOCR
        command: ./ChaseOCR.py
        volumes: 
            - /Users:/Users
    bankofamerica_container:
        build: .
        working_dir: /app/BankofAmericaOCR
        command: ./BancAmericaOCR.py
        volumes: 
            - /Users:/Users
    wellsfargo_container:
        build: .
        working_dir: /app/WellsFargoStatementOCR
        command: ./WellsFargoOCR.py
        volumes: 
            - /Users:/Users

Each Dockerfile under a bank folder is similar, except that CMD changes accordingly. For example, in the ChaseBankStatementOCR folder:

FROM python:3.7-stretch
WORKDIR /app
COPY . /app
# CMD is changed accordingly for the other two bank scripts
CMD ["python3", "ChaseOCR.py"]

The last piece is the Dockerfile outside of the bank folders:

FROM python:3.7-stretch
WORKDIR /app
COPY ./requirements.txt ./ 
RUN pip3 install --upgrade pip
RUN pip3 install -r requirements.txt
RUN pip3 install --upgrade PyMuPDF

COPY . /app

COPY ./ChaseOCR.py /app
COPY ./BancAmericaOCR.py /app
COPY ./WellsFargoOCR.py /app

EXPOSE 8080

CMD ["python3", "MainBankClass.py"]

After running docker-compose build, the containers and network are built successfully. The error occurs when I run docker run -v /Users:/Users: python3 python3 ~/BankStatementsDemoOCR/BankStatements/MainBankClass.py, and the error message is FileNotFoundError: [Errno 2] No such file or directory: 'BancAmericaOCR.py'

I am assuming that the container doesn't have BancAmericaOCR.py, but I have composed each .py file under the same network, and I don't think links is a good practice since Docker recommends using networks here. What am I missing here? Any help is much appreciated. Thanks in advance.
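One likely contributor to the FileNotFoundError is that a bare name such as 'BancAmericaOCR.py' passed to runpy.run_path is resolved against the current working directory, not against the directory of the calling script. Below is a minimal sketch in Python, assuming the OCR scripts sit next to MainBankClass.py. The names run_ocr_script and base_dir are hypothetical and not part of the original package.

```python
import runpy
from pathlib import Path


def run_ocr_script(script_name, base_dir):
    """Run an OCR script found in base_dir and return the globals it produced.

    Hypothetical helper: base_dir is the directory that holds the OCR scripts.
    """
    script_path = Path(base_dir) / script_name
    if not script_path.is_file():
        # Fail early with the full path, which is easier to debug in a container.
        raise FileNotFoundError(f"OCR script not found: {script_path}")
    # Same call as in the question, but with a path that does not depend on the cwd.
    return runpy.run_path(path_name=str(script_path))
```

Inside MainBankClass.py, base_dir would typically be Path(__file__).resolve().parent, so the lookup no longer depends on where python3 is launched from.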


You only have one container. Docker networks are for multiple containers to talk to one another. Docker Compose also defines a default bridge network for all services, so you shouldn't need to declare one even if you were still using docker-compose.

Here's a cleaned-up Dockerfile with all the scripts copied in, with the addition of an entrypoint file:

FROM python:3.7-stretch
WORKDIR /app
COPY ./requirements.txt ./  
RUN pip3 install --upgrade pip PyMuPDF && pip3 install -r requirements.txt

COPY . /app

COPY ./docker-entrypoint.sh /
# Exec form runs the script directly instead of through /bin/sh -c
ENTRYPOINT ["/docker-entrypoint.sh"]

In your entrypoint, you can loop over every file:

#!/bin/bash

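# Script names follow the files in the question: ChaseOCR.py, WellsFargoOCR.py, BancAmericaOCR.py
for b in Chase WellsFargo BancAmerica ; do
    python3 "/app/${b}OCR.py"
done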

exec python3 /app/MainBankClass.py
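The same sequencing could also be written in Python rather than bash. This is only a sketch under stated assumptions: run_scripts_in_order is a hypothetical helper, and, unlike the shell loop above, it stops at the first script that exits with a non-zero status.

```python
import subprocess
import sys
from pathlib import Path


def run_scripts_in_order(app_dir, script_names):
    """Run each script with the current interpreter, one after another.

    Hypothetical helper: app_dir is the directory holding the scripts
    (for example /app in the image), script_names is the run order.
    """
    for name in script_names:
        script = Path(app_dir) / name
        # check=True raises CalledProcessError if the script exits non-zero,
        # so later scripts do not run after a failure.
        subprocess.run([sys.executable, str(script)], check=True, cwd=str(app_dir))
```

With the file names from the question, a call might look like run_scripts_in_order("/app", ["ChaseOCR.py", "WellsFargoOCR.py", "BancAmericaOCR.py", "MainBankClass.py"]). Setting cwd to the application directory also means relative paths inside the scripts resolve against /app.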

So after days of searching regarding my case, I am closing this thread with an implementation of a single application in a single container, as suggested on this link from the Docker forum. Instead of going with docker-compose, the suggested approach is to use one container with a Dockerfile for this application, and it's working as expected.

On top of the Dockerfile, we also need networks for the different .py files to communicate. For example:

docker network create my_net
docker run -it --network my_net -v /Users:/Users --rm my_awesome_app

EDIT: No network is needed since we are only running one container.

EDIT 2: Please see the accepted answer for future reference.

Any answers are welcome if anyone has better ideas on the case.
