
How to write a list to a file in Google Cloud Storage using Google Cloud Functions with Python

I am trying to write members of a list into a file in a bucket in Cloud Storage using Cloud Functions.

I found this page showing how to upload a file to my bucket, but what I need is to loop through the members of my list and write them to the file in Cloud Storage.

I need to be able to do this using Cloud Functions, which reads from my Google Cloud SQL database. I want to be able to store the data from certain tables in my PostgreSQL database as a file in Cloud Storage.

Thanks.

  • If you simply need to loop through your list in Python and write the results to a file, you can use any of the many Python examples available online or on Stack Overflow, such as this one:

      with open('your_file.txt', 'w') as f:
          for item in my_list:
              f.write("%s\n" % item)

    That will, of course, depend on what your list looks like, the data it contains, and the file type you need to write to Cloud Storage; adjust those parts according to your needs.

  • To connect from your Cloud Function to your Cloud SQL for PostgreSQL database you can follow the documentation. One example using SQLAlchemy and Unix sockets is:

      db = sqlalchemy.create_engine(
          # Equivalent URL:
          # postgres+pg8000://<db_user>:<db_pass>@/<db_name>?unix_sock=/cloudsql/<cloud_sql_instance_name>/.s.PGSQL.5432
          sqlalchemy.engine.url.URL(
              drivername='postgres+pg8000',
              username=db_user,
              password=db_pass,
              database=db_name,
              query={
                  'unix_sock': '/cloudsql/{}/.s.PGSQL.5432'.format(
                      cloud_sql_connection_name)
              }
          ),
      )

    Where db_user, db_pass and db_name would have to be replaced with your database's username, password, and the database's name.

  • The link you referenced shows how to upload a blob to Cloud Storage using Python, as you are probably aware. Once the data has been extracted from the database and written to your_file.txt, for example, you can upload it to Cloud Storage with:

      from google.cloud import storage

      def upload_blob(bucket_name, source_file_name, destination_blob_name):
          """Uploads a file to the bucket."""
          bucket_name = "your-bucket-name"
          source_file_name = "local/path/to/file/your_file.txt"
          destination_blob_name = "storage-object-name"

          storage_client = storage.Client()
          bucket = storage_client.bucket(bucket_name)
          blob = bucket.blob(destination_blob_name)

          blob.upload_from_filename(source_file_name)

          print(
              "File {} uploaded to {}.".format(
                  source_file_name, destination_blob_name
              )
          )

    Replace your-bucket-name with the name of your Cloud Storage bucket, local/path/to/file/your_file.txt with the local path to your file, and storage-object-name with the name and extension you want the file to have once it's uploaded to your Cloud Storage bucket.

Putting all those together, you can achieve what you're looking for.
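For example, here is a minimal sketch of how those pieces could be combined, assuming db is the SQLAlchemy engine created as in the snippet above; the function name, the sales table, the bucket name and the object name are placeholders to adapt to your case. Note that /tmp is the only writable directory inside a Cloud Function:

from google.cloud import storage

def export_table(request):
    # Query the table; "sales" is only an example table name.
    with db.connect() as conn:
        rows = conn.execute("SELECT * FROM sales;").fetchall()

    # Loop through the result list and write one line per row.
    # /tmp is the only writable location in the Cloud Functions environment.
    local_path = "/tmp/your_file.txt"
    with open(local_path, "w") as f:
        for row in rows:
            f.write("%s\n" % str(row))

    # Upload the local file to the bucket.
    storage_client = storage.Client()
    bucket = storage_client.bucket("your-bucket-name")
    blob = bucket.blob("storage-object-name")
    blob.upload_from_filename(local_path)

    return "Done"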

I managed to get it done with the following Python code:

import datetime
import logging
import os
import sqlalchemy
from google.cloud import storage
import pandas as pd

# Remember - storing secrets in plaintext is potentially unsafe. Consider using
# something like https://cloud.google.com/kms/ to help keep secrets secret.
db_user = "<DB_USER>"  # os.environ.get("DB_USER")
db_pass = "<DB_PASS>"  # os.environ.get("DB_PASS")
db_name = "<DB_NAME>"  # os.environ.get("DB_NAME")
cloud_sql_connection_name = "<Cloud SQL Instance Connection Name>"  # os.environ.get("CLOUD_SQL_CONNECTION_NAME")
logger = logging.getLogger()

# [START cloud_sql_postgres_sqlalchemy_create]
db = sqlalchemy.create_engine(
    # Equivalent URL:
    # postgres+pg8000://<db_user>:<db_pass>@/<db_name>?unix_sock=/cloudsql/<cloud_sql_instance_name>/.s.PGSQL.5432
    sqlalchemy.engine.url.URL(
        drivername='postgres+pg8000',
        username=db_user,
        password=db_pass,
        database=db_name,
        query={
            'unix_sock': '/cloudsql/{}/.s.PGSQL.5432'.format(
                cloud_sql_connection_name)
        }
    ),
    # ... Specify additional properties here.
    pool_size=5,
    max_overflow=2,
    pool_timeout=30,  # 30 seconds
    pool_recycle=1800,  # 30 minutes
)

def read_source_data(request):
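    # Entry point of the Cloud Function; the 'request' argument suggests an HTTP trigger and is not used below.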
    bucket_name = "<YOUR_BUCKET_NAME>"
    folder_name = "sample_files"
    file_name = "test.txt"

    with db.connect() as conn:
        sales_records = conn.execute(
            "SELECT * FROM sales;"
        ).fetchall()

    if len(sales_records) > 0:
        #for val in sales_records:
            #print(val)
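        # Build a DataFrame from the result rows; the first row's keys() provide the column names.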
        df = pd.DataFrame(sales_records)
        df.columns = sales_records[0].keys()
        create_file(bucket_name, "sample_files/test.txt", df)
        return "Done!"
    else:
        print("Nothing!")
        return "Nothing!"

def create_file(bucketname, path, records_read):
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucketname)
    blob = storage.Blob(
        name=path,
        bucket=bucket,
    )

    content = records_read.to_csv(index=False)  # '\n'.join(map(str, records_read))

    blob.upload_from_string(
        data=content,
        content_type='text/plain',
        client=storage_client,
    )

I stitched this together from multiple code snippets, and as I'm not a Python developer I'm pretty sure there are better ways of getting this done. I then deployed my function using

gcloud deployment-manager deployments create
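Note that a Python Cloud Function also needs its third-party dependencies listed in a requirements.txt file deployed alongside the code; based on the imports above, that would presumably be something like:

sqlalchemy
pg8000
pandas
google-cloud-storage

Alternatively, a function like this is more commonly deployed directly with something along the lines of gcloud functions deploy read_source_data --runtime python37 --trigger-http (the runtime and trigger flags depend on your setup).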
