Error running TensorFlow in a Docker container

I'm trying to use the TensorFlow module in a Python application running in a Docker container (I am actually using Keras, but the errors come from TensorFlow).

I have models (.json and .h5 files) that I would like to load and use:

import logging
import os
from keras.models import model_from_json # library for machine learning
from numpy import array
import json

def load_models():
    global loaded_h_model
    global loaded_u_model
    global loaded_r_model
    global loaded_c_model

    modelPath = os.getenv("MODELPATH", "./models/")

    # load models
    json_h_file = open(modelPath+'model_HD.json', 'r')
    loaded_model_h_json = json_h_file.read()
    json_h_file.close()
    loaded_h_model = model_from_json(loaded_model_h_json)
    loaded_h_model.load_weights(modelPath+"model_HD.h5")

    json_u_file = open(modelPath+'model_UD.json', 'r')
    loaded_model_u_json = json_u_file.read()
    json_u_file.close()
    loaded_u_model = model_from_json(loaded_model_u_json)
    loaded_u_model.load_weights(modelPath+"model_UD.h5")

    json_r_file = open(modelPath+'model_RD.json', 'r')
    loaded_model_r_json = json_r_file.read()
    json_r_file.close()
    loaded_r_model = model_from_json(loaded_model_r_json)
    loaded_r_model.load_weights(modelPath+"model_RD.h5")

    json_c_file = open(modelPath+'model_CD.json', 'r')
    loaded_model_c_json = json_c_file.read()
    json_c_file.close()
    loaded_c_model = model_from_json(loaded_model_c_json)
    loaded_c_model.load_weights(modelPath+"model_CD.h5")

Here is the Dockerfile I use:

FROM python:3.7

# copy source code files
COPY machinelearning.py ./

# copy models files
COPY models/* ./models/

# install dependencies
RUN pip3 install --upgrade pip \
    && pip3 install h5py \
    && pip3 install tensorflow \
    && pip3 install keras

# run script
CMD [ "python", "./machinelearning.py" ]

But when I run the Docker container, I get the following warnings and errors:

2020-01-29 09:40:24.542588: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-01-29 09:40:24.542727: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-01-29 09:40:24.542743: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
Using TensorFlow backend.
2020-01-29 09:40:25.394254: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-01-29 09:40:25.394289: E tensorflow/stream_executor/cuda/cuda_driver.cc:351] failed call to cuInit: UNKNOWN ERROR (303)
2020-01-29 09:40:25.394321: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (dd231f397f1f): /proc/driver/nvidia/version does not exist
2020-01-29 09:40:25.394539: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-01-29 09:40:25.419513: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1992000000 Hz
2020-01-29 09:40:25.420250: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55cab5bf9760 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-01-29 09:40:25.420299: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version

I believe I need to install additional libraries, or a different version of TensorFlow/Keras, in my Dockerfile.

How can I solve this issue? Thanks.

First of all, you need to COPY requirements.txt /to/destination. Your requirements.txt should list each dependency together with its version number.

FROM python:latest
COPY requirements.txt /usr/src/code/

After that, run:

RUN pip3 install -r /usr/src/code/requirements.txt

instead of the following code in your Dockerfile:

RUN pip3 install --upgrade pip \
    && pip3 install h5py \
    && pip3 install tensorflow \
    && pip3 install keras 

I hope the problem will be resolved by pinning version numbers in requirements.txt rather than relying on the --upgrade flag.

Also, don't run upgrades if they aren't needed.
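
To make the "dependencies with the version number" advice easier to check, here is a minimal sketch of a helper that reports which lines of a requirements.txt are not pinned with `==`. The function name `unpinned_requirements`, the `PINNED` pattern, and the version numbers in the sample text are illustrative assumptions, not something taken from the question; substitute the versions your models were actually trained with.

```python
import re

# Matches an exactly pinned requirement such as "tensorflow==2.1.0".
# This is a simplified pattern, not a full requirement-specifier parser.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return the requirement lines in `text` that lack an exact `==` pin."""
    unpinned = []
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


# Hypothetical file content; the version numbers are placeholders.
example = """\
h5py==2.10.0
tensorflow==2.1.0
keras
"""

print(unpinned_requirements(example))  # ['keras']
```

Running something like this before building the image would flag any dependency that pip is still free to resolve to whatever the latest release happens to be. It deliberately ignores more complex cases such as URL requirements and environment markers.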
