How to access a CSV file (located on the PC's hard drive) from a Docker container with Python pandas?

I want to implement a machine learning algorithm that can operate on homomorphically encrypted data using the PySEAL library. PySEAL is released as a Docker container with an 'examples.py' file that shows some homomorphic encryption examples. I want to edit 'examples.py' to implement the ML algorithm. I am trying to import a CSV file this way:

dataset = pd.read_csv('Dataset.csv')

I have imported the pandas library successfully. I have tried many approaches to import the CSV file, but all of them failed. How can I import it?

I am new to Docker, so a detailed procedure would be really helpful.

You can either do it via the Docker build process (assuming you are the one creating the image) or through a volume mapping that the container accesses at runtime.

Building the image with Dataset.csv inside

For access through the build, you can use a COPY instruction in the Dockerfile to place the file in the container's workspace:

FROM python:3.7

# Dataset.csv must be in the build context (for example, next to the Dockerfile)
COPY Dataset.csv /app/Dataset.csv
...

You can then read the file from inside the container at /app/Dataset.csv with the pandas.read_csv() function, like this:

import pandas

data = pandas.read_csv('/app/Dataset.csv')
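If read_csv still reports that the file is missing, it helps to check where the file really is inside the container before handing the path to pandas. The sketch below is only an illustration: the names find_dataset and load_dataset and the default search directories are assumptions, with /app chosen to match the COPY destination above.

```python
from pathlib import Path


def find_dataset(filename="Dataset.csv", search_dirs=("/app", ".")):
    """Return the first existing path to `filename` in `search_dirs`.

    The function name and default directories are illustrative assumptions;
    /app matches the COPY destination used in the Dockerfile.
    """
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    searched = ", ".join(str(d) for d in search_dirs)
    raise FileNotFoundError(f"{filename} not found in: {searched}")


def load_dataset(filename="Dataset.csv", search_dirs=("/app", ".")):
    """Read the CSV into a DataFrame once its location is known."""
    import pandas as pd  # imported lazily so find_dataset works without pandas

    return pd.read_csv(find_dataset(filename, search_dirs))
```

Calling load_dataset() from examples.py would then either return the DataFrame or fail with a message listing the directories that were searched, which is usually easier to debug than a bare relative path.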

Mapping a volume share for Dataset.csv

If you don't have direct control over how the source image is built, or do not want the dataset packaged with the container (which may be the better practice depending on the use case), you can share the file through a volume mapping when starting the container. The Python code then reads it from the mapped path:

dataset = pd.read_csv('/app/Dataset.csv')

Assuming your Dataset.csv is at /my/user/dir/Dataset.csv on the host (both paths in the mapping must be absolute)

From the CLI:

docker run -v /my/user/dir:/app my-python-container
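To confirm that the mapping worked, you can list what the container actually sees in the mapped directory. This is a minimal sketch, not part of PySEAL or pandas: list_csv_files is a hypothetical helper, and its default directory /app is an assumption that should match the container side of your -v mapping.

```python
from pathlib import Path


def list_csv_files(mount_dir="/app"):
    """Return the sorted names of CSV files visible in `mount_dir`.

    `mount_dir` is an assumed default; pass the container-side path
    of your own volume mapping.
    """
    root = Path(mount_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{mount_dir} is not a directory in this container")
    return sorted(p.name for p in root.iterdir() if p.suffix.lower() == ".csv")
```

Running print(list_csv_files()) inside the container should show Dataset.csv; an empty list or an error suggests the host path or the container path in the mapping is wrong.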

The benefit of the latter solution is that you can keep editing 'Dataset.csv' on the host, and the file seen in the container reflects changes made either by you or by the Python process, should it write to the file.
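A long-running Python process can take advantage of this by re-reading the file only when it has changed. The sketch below assumes the modification time of the mounted file is updated when it is edited on the host; ReloadingFile is a hypothetical helper class, and the reader argument can be any function that takes a path, such as pandas.read_csv.

```python
import os


class ReloadingFile:
    """Re-read a file only when its modification time changes."""

    def __init__(self, path, reader):
        self.path = path
        self.reader = reader  # any callable taking a path, e.g. pandas.read_csv
        self._mtime = None
        self._cached = None

    def get(self):
        """Return the cached result, reloading if the file was modified."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            self._cached = self.reader(self.path)
            self._mtime = mtime
        return self._cached
```

For example, source = ReloadingFile('/app/Dataset.csv', pandas.read_csv) followed by repeated calls to source.get() would return the same DataFrame until the file changes on the host.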
