
How to read a file from a cloud storage bucket via a python app in a local Docker container

Let me preface this with the fact that I am fairly new to Docker, Jenkins, GCP/Cloud Storage and Python.

Basically, I would like to write a Python app that runs locally in a Docker container (alpine3.7 image) and reads chunks, line by line, from a very large text file that is dropped into a GCP cloud storage bucket. Each line should just be output to the console for now.
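For reference, this is roughly the kind of loop I am aiming for (a minimal sketch; the bucket and object names are placeholders, and it assumes a google-cloud-storage version recent enough to support Blob.open() for streaming reads):

    from google.cloud import storage

    # Placeholder names -- not my real bucket/object.
    BUCKET_NAME = 'my-bucket'
    BLOB_NAME = 'very-large-file.txt'

    client = storage.Client()  # picks up GOOGLE_APPLICATION_CREDENTIALS
    blob = client.bucket(BUCKET_NAME).blob(BLOB_NAME)

    # Stream the object line by line instead of downloading it all at once.
    # Blob.open() is only available in newer google-cloud-storage releases.
    with blob.open('r') as f:
        for line in f:
            print(line.rstrip('\n'))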

I learn best by looking at working code, and I am spinning my wheels trying to put all the pieces together using these technologies (all new to me).

I already have the key file for that cloud storage bucket on my local machine.

I am also aware of these posts:

I just need some help putting all these pieces together into a working app.

I understand that I need to set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the key file in the container. However, I don't know how to do that in a way that works well for multiple developers and multiple environments (Local, Dev, Stage and Prod).
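For example (this is an assumption on my part, not something I have working yet), I believe the key can be kept out of the image and mounted at run time, so that each developer or environment points the same image at its own key:

    # Hypothetical local run: mount the key read-only and point the env var at it.
    docker run --rm \
      -v /path/to/key.json:/secrets/key.json:ro \
      -e GOOGLE_APPLICATION_CREDENTIALS=/secrets/key.json \
      my-app-image

In Dev/Stage/Prod the same variable could presumably be set by whatever runs the container, or omitted entirely on GCP so the attached service account is used.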

This is just a simple quickstart (I am sure it can be done better) for reading a file from a Google Cloud Storage bucket via a Python app (a Docker container deployed to Google Cloud Run):

You can find more information here: link

  1. Create a directory with the following files:

    a. app.py

    import os

    from flask import Flask
    from google.cloud import storage

    app = Flask(__name__)


    @app.route('/')
    def hello_world():
        storage_client = storage.Client()
        file_data = 'file_data'
        bucket_name = 'bucket'
        temp_file_name = 'temp_file_name'
        bucket = storage_client.get_bucket(bucket_name)
        blob = bucket.get_blob(file_data)
        blob.download_to_filename(temp_file_name)
        temp_str = ''
        with open(temp_file_name, "r") as myfile:
            temp_str = myfile.read().replace('\n', '')
        return temp_str


    if __name__ == "__main__":
        app.run(debug=True, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))

    b. Dockerfile

    # Use an official Python runtime as a parent image
    FROM python:2.7-slim

    # Set the working directory to /app
    WORKDIR /app

    # Copy the current directory contents into the container at /app
    COPY . /app

    # Install any needed packages specified in requirements.txt
    RUN pip install --trusted-host pypi.python.org -r requirements.txt
    RUN pip install google-cloud-storage

    # Make port 80 available to the world outside the container
    EXPOSE 80

    # Define environment variable
    ENV NAME World

    # Run app.py when the container launches
    CMD ["python", "app.py"]

    c. requirements.txt

     Flask==1.1.1
     gunicorn==19.9.0
     google-cloud-storage==1.19.1
  2. Create a service account to access the storage from Cloud Run:

     gcloud iam service-accounts create cloudrun --description 'cloudrun'
  3. Set the permissions of the service account:

     gcloud projects add-iam-policy-binding wave25-vladoi --member serviceAccount:cloudrun@project.iam.gserviceaccount.com --role roles/storage.admin
  4. Build the container image:

     gcloud builds submit --tag gcr.io/project/hello
  5. Deploy the application to Cloud Run:

     gcloud run deploy --image gcr.io/project/hello --platform managed --service-account cloudrun@project.iam.gserviceaccount.com
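Once deployed, you can fetch the service URL and call it (a sketch; it assumes the service was named 'hello' and deployed to 'us-central1', so adjust to whatever you chose during gcloud run deploy):

     # Look up the deployed service URL and hit the endpoint.
     URL=$(gcloud run services describe hello --platform managed --region us-central1 --format 'value(status.url)')
     curl "$URL"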

EDIT:

One way to develop locally is:

  1. Your DevOps team will get the service account key.json:

     gcloud iam service-accounts keys create ~/key.json --iam-account cloudrun@project.iam.gserviceaccount.com
  2. Store the key.json file in the same working directory

  3. The Dockerfile command `COPY . /app` will copy the file into the container's /app directory

  4. Change the app.py to:

     storage.Client.from_service_account_json('key.json')
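Putting that together, the client setup in app.py could look something like this (a sketch; it falls back to default credentials when key.json is not present, e.g. when running on Cloud Run):

     import os

     from google.cloud import storage

     # Use the bundled key.json locally; fall back to default credentials otherwise.
     if os.path.exists('key.json'):
         storage_client = storage.Client.from_service_account_json('key.json')
     else:
         storage_client = storage.Client()

Just keep key.json out of version control (e.g. via .gitignore) so the key is never committed alongside the code.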
