Write CSV to Google Cloud Storage
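I'd like to understand how to write a multi-line CSV file to Google Cloud Storage. I'm just not following the documentation.
A related question: Unable to read csv file uploaded on google cloud storage bucket
Example: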
from google.cloud import storage
from oauth2client.client import GoogleCredentials
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "<pathtomycredentials>"
a=[1,2,3]
b=['a','b','c']
storage_client = storage.Client()
bucket = storage_client.get_bucket("<mybucketname>")
blob=bucket.blob("Hummingbirds/trainingdata.csv")
for eachrow in range(3):
    blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]))
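This leaves you with a single line on Google Cloud Storage:
3,c
So clearly each call opens a new file and writes just that line.
OK, how about adding a newline delimiter?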
for eachrow in range(3):
    blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]) + "\n")
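That adds the newline, but each call still writes from the beginning again.
Can someone illustrate what the right approach is? I could combine all my rows into one string, or write a temporary file, but that seems ugly.
Maybe open it as a file?
Please see the answer below; I hope it helps.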
import pandas as pd
data = [['Alex','Feb',10],['Bob','jan',12]]
df = pd.DataFrame(data,columns=['Name','Month','Age'])
print(df)
Output
   Name  Month  Age
0  Alex    Feb   10
1   Bob    jan   12
Add a row
row = ['Sally','Oct',15]
df.loc[len(df)] = row
print(df)
Output
    Name  Month  Age
0   Alex    Feb   10
1    Bob    jan   12
2  Sally    Oct   15
Use gsutil to write/copy to a GCP bucket
df.to_csv('text.csv', index = False)
!gsutil cp 'text.csv' 'gs://BucketName/folderName/'
Python code (docs: https://googleapis.dev/python/storage/latest/index.html)
from google.cloud import storage
def upload_to_bucket(bucket_name, blob_path, local_path):
    bucket = storage.Client().bucket(bucket_name)
    blob = bucket.blob(blob_path)
    blob.upload_from_filename(local_path)
    # Blob exposes public_url (there is no plain .url attribute)
    return blob.public_url
# method call
bucket_name = 'bucket-name' # do not give gs:// ,just bucket name
blob_path = 'path/folder name inside bucket'
local_path = 'local_machine_path_where_file_resides' #local file path
upload_to_bucket(bucket_name, blob_path, local_path)
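The blob.upload_from_string(data) method creates a new object whose contents are exactly the contents of the string data. It overwrites the existing object rather than appending to it.
The simplest solution is to write the whole CSV to a temporary file and then upload that file to GCS with the blob.upload_from_filename(filename) function.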
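A minimal sketch of that temp-file approach, assuming `blob` is a `google.cloud.storage` Blob created as in the question. The helper name `upload_rows_via_temp_file` is made up for illustration and is not part of the library; only `upload_from_filename` is a real Blob method.

```python
import csv
import os
import tempfile


def upload_rows_via_temp_file(blob, rows):
    """Write all rows to one temporary CSV file, then upload it in a single call.

    Hypothetical helper: `blob` is assumed to behave like a
    google.cloud.storage Blob (only upload_from_filename is used).
    """
    # newline="" lets the csv module control the line endings itself.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".csv", newline="", delete=False
    ) as tmp:
        csv.writer(tmp).writerows(rows)
        tmp_path = tmp.name
    try:
        # One upload creates the whole object, so rows no longer overwrite each other.
        blob.upload_from_filename(tmp_path, content_type="text/csv")
    finally:
        # Clean up the temporary file whether or not the upload succeeded.
        os.remove(tmp_path)


# Example call with the data from the question (needs real credentials and a bucket):
# upload_rows_via_temp_file(blob, zip([1, 2, 3], ["a", "b", "c"]))
```

Note that `csv.writer` ends rows with `\r\n` by default; pass `lineterminator="\n"` if plain newlines are preferred.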
from google.cloud import storage
from oauth2client.client import GoogleCredentials
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "<pathtomycredentials>"
a=[1,2,3]
b=['a','b','c']
storage_client = storage.Client()
bucket = storage_client.get_bucket("<mybucketname>")
blob=bucket.blob("Hummingbirds/trainingdata.csv")
# build up the complete csv string
csv_string_to_upload = ''
for eachrow in range(3):
    # add the lines
    csv_string_to_upload = csv_string_to_upload + str(a[eachrow]) + ',' + b[eachrow] + '\n'
# upload the complete csv string
blob.upload_from_string(
data=csv_string_to_upload,
content_type='text/csv'
)
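The same idea can also be sketched with the standard `csv` module instead of manual concatenation, so that values containing commas or quotes get escaped properly. `rows_to_csv_string` is a name invented for this example; the commented-out upload call assumes the same `blob` as in the snippet above.

```python
import csv
import io


def rows_to_csv_string(rows):
    """Serialize an iterable of rows into one CSV-formatted string."""
    buffer = io.StringIO()
    # lineterminator="\n" matches the "\n" used in the snippet above.
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


a = [1, 2, 3]
b = ["a", "b", "c"]
csv_string_to_upload = rows_to_csv_string(zip(a, b))

# Upload the complete string in one call (needs a real blob, so left commented):
# blob.upload_from_string(csv_string_to_upload, content_type="text/csv")
```

For the sample data this produces the three lines `1,a`, `2,b` and `3,c`, each terminated by a newline.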
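I came across this post after running into exactly the same problem. After some struggling, I found that the best solution for me was to upload the .csv file as bytes. Here is how I did it: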
# Hypothetical wrapper: the original snippet was a function body (note the return),
# so df, bucket, csv_path and the date strings are assumed to be passed in.
def upload_report(df, bucket, csv_path, start_date_str, end_date_str):
    new_csv_filename = csv_path + "report_" + start_date_str + "-" + end_date_str + ".csv"
    df.to_csv(new_csv_filename, index=False)
    # upload the file to the storage
    blob = bucket.blob(new_csv_filename)
    with open(new_csv_filename, 'rb') as f:  # open the file in read-bytes mode
        blob.upload_from_file(f)  # upload_from_file now uploads the file as bytes
    blob.make_public()
    # generate a download url and return it
    return blob.public_url
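If the local file is not needed afterwards, a bytes upload can probably skip the disk entirely. Below is a rough sketch that assumes `blob` is a `google.cloud.storage` Blob; `upload_csv_bytes` is an illustrative name, not a library function.

```python
import io


def upload_csv_bytes(blob, csv_text):
    """Upload CSV text as UTF-8 bytes from an in-memory buffer.

    Hypothetical helper: `blob` is assumed to behave like a
    google.cloud.storage Blob (only upload_from_file is used).
    """
    buffer = io.BytesIO(csv_text.encode("utf-8"))
    # A fresh BytesIO starts at position 0, so the whole payload is read.
    blob.upload_from_file(buffer, content_type="text/csv")


# With pandas, df.to_csv(index=False) returns the CSV as a string when no path is given:
# upload_csv_bytes(blob, df.to_csv(index=False))
```

This keeps the single-upload behaviour of the answer above and just swaps the file on disk for a buffer in memory.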