Cloud Function Sending CSV to Cloud Storage
I have a Cloud Function that is meant to create a CSV from an API call and then send that CSV to Cloud Storage.
Here is my code:
import requests
import pprint
import pandas as pd
from flatsplode import flatsplode
import csv
import datetime
import schedule
import time
import json
import numpy as np
import os
import tempfile
from google.cloud import storage

api_url = 'https://[YOUR_DOMAIN].com/api/v2/[API_KEY]/keywords/list?site_id=[SITE_ID][&start={start}][&results=100]&format=json'

def export_data(url):
    response = requests.get(url)  # Make a GET request to the URL
    payload = response.json()  # Parse `response.text` into JSON
    pp = pprint.PrettyPrinter(indent=1)

    # Use the flatsplode package to quickly turn the JSON response to a DF
    new_list = pd.DataFrame(list(flatsplode(payload)))

    # Drop certain columns from the DF
    idx = np.r_[1:5,14:27,34,35]
    new_list = new_list.drop(new_list.columns[idx], axis=1)

    # Create a csv and load it to google cloud storage
    new_list = new_list.to_csv('/tmp/temp.csv')

    def upload_blob(bucket_name, source_file_name, destination_blob_name):
        storage_client = storage.Client()
        bucket = storage_client.get_bucket(bucket_name)
        blob = bucket.blob(destination_blob_name)
        blob.upload_from_file(source_file_name)

    message = "Data for CSV file"  # ERROR HERE
    csv = open(new_list, "w")
    csv.write(message)
    with open(new_list, 'r') as file_obj:
        upload_blob('data-exports', file_obj, 'data-' + str(datetime.date.today()) + '.csv')

export_data(api_url)
I tried writing the file under /tmp so that I could upload it to storage, but haven't had much success. The API call works like a charm and I am able to get a CSV locally. The upload to Cloud Storage is where I get the error.
Any help is much appreciated!
Instead of using temporary storage in your Cloud Function, try converting your DataFrame to a string and uploading the result to Google Cloud Storage.
Consider for instance:
import requests
import pprint
import pandas as pd
from flatsplode import flatsplode
import csv
import datetime
import schedule
import time
import json
import numpy as np
import os
import tempfile
from google.cloud import storage

api_url = 'https://[YOUR_DOMAIN].com/api/v2/[API_KEY]/keywords/list?site_id=[SITE_ID][&start={start}][&results=100]&format=json'

def export_data(url):
    response = requests.get(url)  # Make a GET request to the URL
    payload = response.json()  # Parse `response.text` into JSON
    pp = pprint.PrettyPrinter(indent=1)

    # Use the flatsplode package to quickly turn the JSON response to a DF
    new_list = pd.DataFrame(list(flatsplode(payload)))

    # Drop certain columns from the DF
    idx = np.r_[1:5,14:27,34,35]
    new_list = new_list.drop(new_list.columns[idx], axis=1)

    # Convert your df to str: it is straightforward, just do not provide
    # any value for the first param path_or_buf
    csv_str = new_list.to_csv()

    # Then, upload it to cloud storage
    def upload_blob(bucket_name, data, destination_blob_name):
        storage_client = storage.Client()
        bucket = storage_client.get_bucket(bucket_name)
        blob = bucket.blob(destination_blob_name)
        # Note the use of upload_from_string here. Please, provide
        # the appropriate content type if you wish
        blob.upload_from_string(data, content_type='text/csv')

    upload_blob('data-exports', csv_str, 'data-' + str(datetime.date.today()) + '.csv')

export_data(api_url)
From what I can tell, you've got a couple of issues here.
First up, DataFrame.to_csv does not return anything if a file path or buffer is provided as an argument. So this line writes the file, but also assigns the value None to new_list:
new_list = new_list.to_csv('/tmp/temp.csv')
To fix this, simply drop the assignment: you only need the new_list.to_csv('/tmp/temp.csv') call.
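To make the return-value behaviour concrete, here is a minimal sketch. The sample DataFrame and the temporary directory are assumptions for illustration only; they are not taken from the question's API data.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the flattened API response.
df = pd.DataFrame({"keyword": ["alpha", "beta"], "rank": [1, 2]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "temp.csv")

    # With a path, to_csv writes the file and returns None.
    result_with_path = df.to_csv(csv_path)
    file_exists = os.path.exists(csv_path)

# Without a path, to_csv returns the CSV content as a string.
csv_text = df.to_csv()
```

Keeping the DataFrame and the file path in separate variables avoids accidentally replacing the DataFrame with None.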
This first error causes the problem later on, because you can't open a file at the location None. Instead, provide a string path as the argument to open. Also, if you use the open mode 'w', the existing CSV data will be overwritten. What's the format you're going for here? Do you mean to append to the file, with 'a'?
message = "Data for CSV file" # ERROR HERE
csv = open(new_list, "w")
csv.write(message)
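The difference between the two modes, and the failure when the path is None, can be seen in a small stdlib-only sketch. The file contents and the temporary directory are made up for illustration.

```python
import os
import tempfile

# Passing None as the path fails before anything is written.
try:
    open(None, "w")
except TypeError as exc:
    none_error = type(exc).__name__

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "temp.csv")

    # Start with some CSV content already on disk.
    with open(path, "w") as f:
        f.write("keyword,rank\nalpha,1\n")

    # Mode "a" keeps the existing content and adds to the end.
    with open(path, "a") as f:
        f.write("beta,2\n")
    with open(path) as f:
        after_append = f.read()

    # Mode "w" truncates the file first, so the CSV rows are lost.
    with open(path, "w") as f:
        f.write("Data for CSV file")
    with open(path) as f:
        after_write = f.read()
```

If the goal is only to upload what to_csv already wrote, the extra open and write step is probably not needed at all.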
Finally, you're providing a file object where a string is expected, this time to the upload_blob function's source_file_name argument.
with open(new_list, 'r') as file_obj:
    upload_blob('data-exports', file_obj, 'data-' + str(datetime.date.today()) + '.csv')
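I think you can skip opening the file here and just pass the path to the file as the second argument. Note that blob.upload_from_file expects an open file object, so if upload_blob receives a path it should call blob.upload_from_filename instead.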
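A rough sketch of the path-based upload follows. The helper names upload_csv_from_path and dated_blob_name are hypothetical, and the function takes an already-created bucket object so that it can be exercised without credentials. With the real library the bucket would come from google.cloud.storage, as noted in the comments.

```python
import datetime


def dated_blob_name(prefix="data-", day=None):
    """Build a blob name such as 'data-2021-01-02.csv' (hypothetical helper)."""
    day = day or datetime.date.today()
    return f"{prefix}{day.isoformat()}.csv"


def upload_csv_from_path(bucket, local_path, destination_blob_name):
    """Upload the file at local_path to the given bucket (hypothetical helper).

    `bucket` is assumed to be a google.cloud.storage Bucket, obtained for
    example with:
        from google.cloud import storage
        bucket = storage.Client().bucket('data-exports')
    """
    blob = bucket.blob(destination_blob_name)
    # upload_from_filename takes a path string, whereas upload_from_file
    # takes an open file object.
    blob.upload_from_filename(local_path, content_type="text/csv")
    return blob
```

Inside export_data, the call would then look something like upload_csv_from_path(bucket, '/tmp/temp.csv', dated_blob_name()), made after new_list.to_csv('/tmp/temp.csv') has written the file.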