
Cloud Function Sending CSV to Cloud Storage

I have a Cloud Function that is meant to create a CSV from an API call and then send that CSV to Cloud Storage.

Here is my code:

import requests
import pprint
import pandas as pd
from flatsplode import flatsplode
import csv
import datetime
import schedule
import time
import json
import numpy as np
import os
import tempfile
from google.cloud import storage

api_url = 'https://[YOUR_DOMAIN].com/api/v2/[API_KEY]/keywords/list?site_id=[SITE_ID][&start={start}][&results=100]&format=json'

def export_data(url):
    response = requests.get(url)  # Make a GET request to the URL
    payload = response.json() # Parse `response.text` into JSON
    pp = pprint.PrettyPrinter(indent=1)

    # Use the flatsplode package to quickly turn the JSON response to a DF
    new_list = pd.DataFrame(list(flatsplode(payload)))

    # Drop certain columns from the DF
    idx = np.r_[1:5,14:27,34,35]
    new_list = new_list.drop(new_list.columns[idx], axis=1)

    # Create a csv and load it to google cloud storage
    new_list = new_list.to_csv('/tmp/temp.csv')
    def upload_blob(bucket_name, source_file_name, destination_blob_name):

        storage_client = storage.Client()
        bucket = storage_client.get_bucket(bucket_name)
        blob = bucket.blob(destination_blob_name)
        blob.upload_from_file(source_file_name)

    message = "Data for CSV file"    # ERROR HERE
    csv = open(new_list, "w")
    csv.write(message)
    with open(new_list, 'r') as file_obj:
        upload_blob('data-exports', file_obj, 'data-' + str(datetime.date.today()) + '.csv')

export_data(api_url)

I attempted to write the file under /tmp so that I could then upload it to storage, but haven't had much success. The API call works like a charm and I am able to produce the CSV locally. The upload to Cloud Storage is where I get the error.

Any help is much appreciated!

Instead of using temporary storage in your Cloud Function, try converting your dataframe to a string and uploading the result to Google Cloud Storage.

Consider, for instance:

import requests
import datetime
import pandas as pd
import numpy as np
from flatsplode import flatsplode
from google.cloud import storage

api_url = 'https://[YOUR_DOMAIN].com/api/v2/[API_KEY]/keywords/list?site_id=[SITE_ID][&start={start}][&results=100]&format=json'

def export_data(url):
    response = requests.get(url)  # Make a GET request to the URL
    payload = response.json()     # Parse the response body as JSON

    # Use the flatsplode package to quickly turn the JSON response to a DF
    new_list = pd.DataFrame(list(flatsplode(payload)))

    # Drop certain columns from the DF
    idx = np.r_[1:5,14:27,34,35]
    new_list = new_list.drop(new_list.columns[idx], axis=1)

    # Convert your df to str: it is straightforward, just do not provide
    # any value for the first param path_or_buf
    csv_str = new_list.to_csv()

    # Then, upload it to cloud storage
    def upload_blob(bucket_name, data, destination_blob_name):

        storage_client = storage.Client()
        bucket = storage_client.get_bucket(bucket_name)
        blob = bucket.blob(destination_blob_name)
        # Note the use of upload_from_string here. Please, provide
        # the appropriate content type if you wish
        blob.upload_from_string(data, content_type='text/csv')

    upload_blob('data-exports', csv_str, 'data-' + str(datetime.date.today()) + '.csv')

export_data(api_url)

From what I can tell, you've got a couple of issues here.

First up, DataFrame.to_csv does not return anything if a file path or buffer is provided as an argument. So this line writes the file, but also assigns the value None to new_list:

new_list = new_list.to_csv('/tmp/temp.csv')

To fix this, simply drop the assignment - you only need the new_list.to_csv('/tmp/temp.csv') line.
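A minimal illustration of the two to_csv behaviours, using a small made-up dataframe:

```python
import pandas as pd

df = pd.DataFrame({"keyword": ["a", "b"], "rank": [1, 2]})

# With a path argument, to_csv writes the file and returns None
result = df.to_csv("/tmp/temp.csv")
print(result)  # None

# With no path argument, to_csv returns the CSV text as a string
csv_str = df.to_csv()
print(type(csv_str))  # <class 'str'>
```

This is also why the second answer above works: calling to_csv() with no path gives you a string you can hand straight to upload_from_string.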

This first error is causing the problem later on, because you can't open a file at the location None. Instead, provide a string path as the argument to open. Also, if you use the open mode 'w', the CSV data you just wrote will be overwritten. What's the format you're going for here? Do you mean to append to the file, with 'a'?

message = "Data for CSV file"    # ERROR HERE
csv = open(new_list, "w")
csv.write(message)
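For reference, here is the difference between the two modes, sketched with a throwaway file path:

```python
path = "/tmp/mode_demo.txt"

with open(path, "w") as f:   # 'w' truncates: any existing content is lost
    f.write("first line\n")

with open(path, "w") as f:   # opening with 'w' again wipes the file
    f.write("overwritten\n")

with open(path, "a") as f:   # 'a' appends to whatever is already there
    f.write("appended\n")

with open(path) as f:
    contents = f.read()      # "overwritten\nappended\n"
```

So in the original code, opening '/tmp/temp.csv' with 'w' would discard the CSV data that to_csv had just written.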

Finally, you're providing a file object where a string is expected, this time for the upload_blob function's source_file_name argument.


    with open(new_list, 'r') as file_obj:
        upload_blob('data-exports', file_obj, 'data-' + str(datetime.date.today()) + '.csv')

I think here you can skip opening the file and just pass the path to the file as the second argument.
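A minimal sketch of that change: the client library's Blob.upload_from_filename accepts a path string directly. The make_blob_name helper is hypothetical (it just reproduces the 'data-&lt;date&gt;.csv' naming from the question), and the actual upload call is left commented out since it needs GCS credentials:

```python
import datetime

def make_blob_name(prefix="data"):
    # Hypothetical helper: builds a name like 'data-2024-01-31.csv'
    return f"{prefix}-{datetime.date.today()}.csv"

def upload_blob(bucket_name, source_file_path, destination_blob_name):
    # Imported here so the sketch runs even without the GCS client installed
    from google.cloud import storage

    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    # upload_from_filename takes a path string, not an open file object
    blob.upload_from_filename(source_file_path)

# upload_blob('data-exports', '/tmp/temp.csv', make_blob_name())
```

Alternatively, keep the file object and use upload_from_file as in the original code; the point is to match the argument type to the method you call.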

