How to access CSV output from a Docker container?

I have a Python program that generates a CSV output. Currently, I am able to run test.py on the host machine, and sample_output.csv is generated.

However, when running the program in a Docker container, I had difficulty locating the sample_output.csv file. Below are the requirements.txt and Dockerfile files.

requirements.txt:

numpy==1.19.4
pandas==1.2.0
python-dateutil==2.8.1
pytz==2020.5
scipy==1.5.4
six==1.15.0



Dockerfile:

FROM python:3

WORKDIR /demo

COPY requirements.txt ./

RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["python", "-u", "./test.py"]

The Docker image can be built by running docker build -t imagename . (with the trailing dot as the build context). However, when running docker run imagename, no CSV file appears to be generated.

I would like help locating the sample_output.csv file after running a Docker container based on this image.

test.py:

import pandas as pd
import numpy as np
import os 
from scipy.stats import uniform, exponweib
from scipy.special import gamma
from scipy.optimize import curve_fit


N_SUBSYS = 30
STEPS = 100
LIMIT = 101*STEPS
times = np.arange(0, LIMIT, STEPS)

if not os.path.exists('./output'):
    os.mkdir('./output')

    print("Directory")

class WeibullFailure():
    def __init__(self):
        N_TRAINS = 92
        LOWER_BETA = 0.9
        RANGE_BETA = 0.3
        LOWER_LOGSCALE = 4
        RANGE_LOGSCALE = 1.5
        LOWER_SIZE = 4
        RANGE_SIZE = 8
        gensize = N_TRAINS * int(uniform.rvs(LOWER_SIZE, RANGE_SIZE))
        genbeta = uniform.rvs(LOWER_BETA, RANGE_BETA)
        genscale = np.power(10, uniform.rvs(LOWER_LOGSCALE, RANGE_LOGSCALE))
        self.beta = genbeta
        self.eta = genscale
        self.size = gensize

    def generate_failures(self):
        return exponweib.rvs(
            a=1, loc=0, c=self.beta, scale=self.eta, size=self.size
        )

    def __repr__(self):
        string = f"Subsystem ~ ({self.size} Instances)"
        string += f" Weibull({self.eta:.2f}, {self.beta:.4f})"
        return string


def get_cumulative_failures(failure_times, times):
    cumulative_failures = {
        i: np.histogram(ft, times)[0].cumsum()
        for i, ft in failure_times.items()
    }
    cumulative_failures = pd.DataFrame(cumulative_failures, index=times[1:])
    return cumulative_failures


def fit_failures(cumulative_failures, subsystems):
    fitted = {}
    for i, x in cumulative_failures.items():
        size = subsystems[i].size
        popt, _ = curve_fit(
            lambda x, a, b: np.exp(a)*np.power(x, b), x.index, x.values
        )
        fitted[i] = (np.exp(-popt[0]/popt[1])*size, popt[1])
    return fitted


def kl_divergence(p1, p2):
    em_constant = 0.57721  # Euler-Mascheroni constant
    eta1, beta1 = p1
    eta2, beta2 = p2
    e11 = np.log(beta1/np.power(eta1, beta1))
    e12 = np.log(beta2/np.power(eta2, beta2))
    e2 = (beta1 - beta2)*(np.log(eta1) - em_constant/beta1)
    e3 = np.power(eta1/eta2, beta2)*gamma(beta2/beta1 + 1) - 1
    divergence = e11 - e12 + e2 + e3
    return divergence


subsystems = {i: WeibullFailure() for i in range(N_SUBSYS)}
failure_times = {i: s.generate_failures() for i, s in subsystems.items()}

cumulative_failures = get_cumulative_failures(failure_times, times)
fitted = fit_failures(cumulative_failures, subsystems)
divergences = {
    i: kl_divergence(f, [subsystems[i].eta, subsystems[i].beta])
    for i, f in fitted.items()
}

expected_failures = {i: np.power(times[1:]/s.eta, s.beta)*s.size
                     for i, s in subsystems.items()}
expected_failures = pd.DataFrame(expected_failures, index=times[1:])

modeled_failures = {i: np.power(times[1:]/f[0], f[1])*subsystems[i].size
                    for i, f in fitted.items()}
modeled_failures = pd.DataFrame(modeled_failures, index=times[1:])

cols = ['eta', 'fit_eta', 'beta', 'fit_beta', 'kl_divergence', 'n_instance']
out = pd.concat([
    pd.DataFrame({i: [s.size, s.eta, s.beta] for i, s in subsystems.items()},
                 index=['n_instance', 'eta', 'beta']).T,
    pd.DataFrame(fitted, index=['fit_eta', 'fit_beta']).T,
    pd.Series(divergences, name='kl_divergence')
], axis=1)[cols]
out.to_csv('./output/sample_output.csv')

if not os.path.exists('./output/sample_output.csv'):
    print("Hello")

I would recommend writing the output to the console instead of a file, something like this:

from io import StringIO
output = StringIO()
out.to_csv(output)
print(output.getvalue())

and then run your container, redirecting its standard output to a file on the host:

docker run <image> > output.csv
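One caveat with the redirect approach: everything the script prints to stdout ends up in output.csv, including the print("Directory") line in test.py. A minimal sketch of one way to keep the CSV clean is to send diagnostics to stderr and only the CSV to stdout. The helper names log and emit_csv are made up for illustration, and frame is assumed to be a pandas DataFrame such as out in test.py.

```python
import sys


def log(message):
    """Print a diagnostic message to stderr so it stays out of the CSV."""
    print(message, file=sys.stderr)


def emit_csv(frame, stream=None):
    """Write `frame` as CSV to `stream` (defaults to stdout).

    `frame` is assumed to be a pandas DataFrame such as `out` in test.py;
    any object with a compatible `to_csv(buffer)` method works.
    """
    if stream is None:
        stream = sys.stdout
    frame.to_csv(stream)


# In test.py this could replace the existing calls, for example:
#     log("Directory")   # instead of print("Directory")
#     emit_csv(out)      # instead of out.to_csv('./output/sample_output.csv')
```

With this in place, the shell redirect captures only the CSV, while the diagnostic messages still show up in the terminal.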

Docker containers are isolated from the host by design. When a process writes a file inside a container, the file stays in the container's own filesystem.
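To see where the file lands inside the container, it can help to log the absolute path the script actually writes to. The sketch below uses a hypothetical helper, describe_output, that resolves the relative path from test.py against the current working directory. With WORKDIR /demo in the Dockerfile, that should resolve to /demo/output/sample_output.csv.

```python
import os


def describe_output(relative_path="./output/sample_output.csv"):
    """Return the absolute path of the output file and whether it exists.

    The default mirrors the path used in test.py; the helper itself is
    only an illustration and is not part of the original script.
    """
    absolute = os.path.abspath(relative_path)
    return absolute, os.path.exists(absolute)


# Possible use at the end of test.py:
#     path, exists = describe_output()
#     print(f"CSV path: {path} (exists: {exists})")
```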

You can mount a host directory into the container at the path where the script writes its output. You can do this with the -v (volume) option:

docker run -v /host/path:/container/path ...

Multiple volumes can be specified:

docker run -v /host/path:/container/path -v /another/host/path:/another/container/path ...

After that, the host directory and all of its contents appear inside the container exactly as they are on the host, and anything your program adds or replaces there is visible on the host as well.

UPD: Looking at test.py, your output file should be in /demo/output, so you can mount a host directory there, for example your current directory: docker run -v $(pwd):/demo/output ...
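If the mount target may change, one option is to make the output directory configurable rather than hard-coding ./output. The sketch below reads it from an environment variable; OUTPUT_DIR and resolve_output_dir are hypothetical names chosen for this example, not something Docker or test.py defines.

```python
import os
from pathlib import Path


def resolve_output_dir(default="./output"):
    """Return the output directory as a Path, creating it if needed.

    OUTPUT_DIR is a hypothetical environment variable used only in this
    sketch; when it is unset, the ./output default from test.py is used.
    """
    out_dir = Path(os.environ.get("OUTPUT_DIR", default))
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


# Possible use in test.py (`out` is the DataFrame built there):
#     csv_path = resolve_output_dir() / "sample_output.csv"
#     out.to_csv(csv_path)
```

The variable could then be set with docker run's -e option to match whichever container path is mounted with -v.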
