
How to write a list to a file in Google Cloud Storage using Google Cloud Functions with Python

I am trying to write the members of a list to a file in a bucket in Cloud Storage using Cloud Functions.

I found this page showing how to upload a file to my bucket, but what I need is to loop through the members of my list and write them to the file in Cloud Storage.

I need to be able to do this using Cloud Functions, which reads from my Google Cloud SQL database. I want to be able to store the data from certain tables in my PostgreSQL database as a file in Cloud Storage.

Thanks.

  • If you simply need to loop through your list in Python and write the results to a file, you can use any of the many Python examples online or on Stack Overflow, such as this one:

     with open('your_file.txt', 'w') as f:
         for item in my_list:
             f.write("%s\n" % item)

    That is, of course, depending on what your list looks like, the data, and the file type you need to write to Cloud Storage; those would have to be adjusted according to your needs.
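    If the function never needs a local file at all, the list can instead be joined into a single string in memory and handed to Cloud Storage later. A minimal sketch, using a made-up list in place of real data:

```python
# Hypothetical list standing in for the rows pulled from the database.
my_list = ["alice", "bob", "carol"]

# Join the items into one newline-terminated string instead of
# writing them to a local file first.
content = "\n".join(str(item) for item in my_list) + "\n"
print(content)
```

    The resulting string can be passed straight to a Cloud Storage blob's upload method rather than round-tripping through the Cloud Function's filesystem.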

  • To connect from your Cloud Function to your Cloud SQL for PostgreSQL database you can follow the documentation. One example using SQLAlchemy and Unix sockets is:

     db = sqlalchemy.create_engine(
         # Equivalent URL:
         # postgres+pg8000://<db_user>:<db_pass>@/<db_name>?unix_sock=/cloudsql/<cloud_sql_instance_name>/.s.PGSQL.5432
         sqlalchemy.engine.url.URL(
             drivername='postgres+pg8000',
             username=db_user,
             password=db_pass,
             database=db_name,
             query={
                 'unix_sock': '/cloudsql/{}/.s.PGSQL.5432'.format(
                     cloud_sql_connection_name)
             }
         ),
     )

    Where db_user, db_pass and db_name would have to be replaced with your database's username, password, and database name.
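    The query pattern itself can be tried without a live Cloud SQL instance by pointing the same SQLAlchemy API at an in-memory SQLite database. This is a stand-in only; the table and rows below are made up:

```python
import sqlalchemy

# In-memory SQLite stands in for Cloud SQL, so this sketch of the
# connect/execute/fetchall pattern is runnable anywhere.
engine = sqlalchemy.create_engine("sqlite://")

with engine.connect() as conn:
    conn.execute(sqlalchemy.text("CREATE TABLE sales (id INTEGER, amount REAL)"))
    conn.execute(sqlalchemy.text("INSERT INTO sales VALUES (1, 9.99), (2, 4.50)"))
    rows = conn.execute(sqlalchemy.text("SELECT * FROM sales")).fetchall()

print(rows)
```

    Against Cloud SQL, only the create_engine() call changes; the connect-and-fetch code stays the same.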

  • The link you referenced mentions how to upload a blob to Cloud Storage using Python, as you are probably aware, so once the data is extracted from the database and written to your_file.txt, for example, you can upload it to Cloud Storage with:

     from google.cloud import storage

     def upload_blob(bucket_name, source_file_name, destination_blob_name):
         """Uploads a file to the bucket."""
         bucket_name = "your-bucket-name"
         source_file_name = "local/path/to/file/your_file.txt"
         destination_blob_name = "storage-object-name"

         storage_client = storage.Client()
         bucket = storage_client.bucket(bucket_name)
         blob = bucket.blob(destination_blob_name)

         blob.upload_from_filename(source_file_name)

         print(
             "File {} uploaded to {}.".format(
                 source_file_name, destination_blob_name
             )
         )

    Replace your-bucket-name with the name of your Cloud Storage bucket, local/path/to/file/your_file.txt with the local path to your file, and storage-object-name with the name and extension you want the file to have once it's uploaded to your bucket.

Putting all of those together, you can achieve what you're looking for.

I managed to get it done with the following Python code:

import datetime
import logging
import os
import sqlalchemy
from google.cloud import storage
import pandas as pd

# Remember - storing secrets in plaintext is potentially unsafe. Consider using
# something like https://cloud.google.com/kms/ to help keep secrets secret.
db_user = "<DB_USER>"  # os.environ.get("DB_USER")
db_pass = "<DB_PASS>"  # os.environ.get("DB_PASS")
db_name = "<DB_NAME>"  # os.environ.get("DB_NAME")
cloud_sql_connection_name = "<Cloud SQL Instance Connection Name>"  # os.environ.get("CLOUD_SQL_CONNECTION_NAME")
logger = logging.getLogger()

# [START cloud_sql_postgres_sqlalchemy_create]
db = sqlalchemy.create_engine(
    # Equivalent URL:
    # postgres+pg8000://<db_user>:<db_pass>@/<db_name>?unix_sock=/cloudsql/<cloud_sql_instance_name>/.s.PGSQL.5432
    sqlalchemy.engine.url.URL(
        drivername='postgres+pg8000',
        username=db_user,
        password=db_pass,
        database=db_name,
        query={
            'unix_sock': '/cloudsql/{}/.s.PGSQL.5432'.format(
                cloud_sql_connection_name)
        }
    ),
    # ... Specify additional properties here.
    pool_size=5,
    max_overflow=2,
    pool_timeout=30,  # 30 seconds
    pool_recycle=1800,  # 30 minutes
)

def read_source_data(request):
    bucket_name = "<YOUR_BUCKET_NAME>"
    folder_name = "sample_files"
    file_name = "test.txt"

    with db.connect() as conn:
        sales_records = conn.execute(
            "SELECT * FROM sales;"
        ).fetchall()

    if len(sales_records) > 0:
        df = pd.DataFrame(sales_records)
        df.columns = sales_records[0].keys()
        create_file(bucket_name, "{}/{}".format(folder_name, file_name), df)
        return "Done!"
    else:
        print("Nothing!")
        return "Nothing!"

def create_file(bucketname, path, records_read):
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucketname)
    blob = storage.Blob(
        name=path,
        bucket=bucket,
    )

    # Convert the DataFrame to CSV text; for a plain list, something like
    # '\n'.join(map(str, records_read)) would work instead.
    content = records_read.to_csv(index=False)

    blob.upload_from_string(
        data=content,
        content_type='text/plain',
        client=storage_client,
    )

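The pandas step in create_file — turning the query result into CSV text — can be sanity-checked in isolation. The rows and column names below are made up for illustration:

```python
import pandas as pd

# Hypothetical rows standing in for conn.execute(...).fetchall()
records = [(1, "widget", 9.99), (2, "gadget", 4.5)]
df = pd.DataFrame(records, columns=["id", "product", "amount"])

# to_csv with index=False yields a header row plus one line per record,
# which is exactly the string handed to blob.upload_from_string above.
content = df.to_csv(index=False)
print(content)
```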
I stitched this together from multiple code snippets, and as not-a-Python-developer I'm pretty sure there are better ways of getting this done. I then deployed my function using

gcloud deployment-manager deployments create

