
Write CSV to Google Cloud Storage

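I would like to understand how to write a multi-line CSV file to Google Cloud Storage. I am just not following the documentation.

This is close to: Unable to read csv file uploaded on google cloud storage bucket

Example: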

from google.cloud import storage
from oauth2client.client import GoogleCredentials
import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "<pathtomycredentials>"

a=[1,2,3]

b=['a','b','c']

storage_client = storage.Client()
bucket = storage_client.get_bucket("<mybucketname>")

blob=bucket.blob("Hummingbirds/trainingdata.csv")

for eachrow in range(3):
    blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]))

This leaves you with a single line on Google Cloud Storage:

3,c

Clearly, it opens a new file each time and writes that one line.

OK, how about adding a newline delimiter?

for eachrow in range(3):
    blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]) + "\n")

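That adds the newline, but it still writes from the beginning each time.

Can someone show what the right approach is? I could combine all my rows into one string, or write a temporary file, but that seems ugly.

Maybe open it as a file?

Please refer to the answer below; I hope it helps.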

import pandas as pd

data = [['Alex', 'Feb', 10], ['Bob', 'jan', 12]]
df = pd.DataFrame(data, columns=['Name', 'Month', 'Age'])
print(df)

Output

   Name Month  Age
0  Alex   Feb   10
1   Bob   jan   12

Append a row

row = ['Sally', 'Oct', 15]
df.loc[len(df)] = row  # assign at the next integer index to append
print(df)

Output

    Name Month  Age
0   Alex   Feb   10
1    Bob   jan   12
2  Sally   Oct   15

Write/copy to a GCP bucket using gsutil

df.to_csv('text.csv', index=False)
# the leading "!" runs a shell command from a Jupyter/IPython cell
!gsutil cp 'text.csv' 'gs://BucketName/folderName/'

Python code (docs: https://googleapis.dev/python/storage/latest/index.html)

from google.cloud import storage

def upload_to_bucket(bucket_name, blob_path, local_path):
    bucket = storage.Client().bucket(bucket_name)
    blob = bucket.blob(blob_path)
    blob.upload_from_filename(local_path)
    return blob.public_url  # only reachable if the object is publicly readable

# method call
bucket_name = 'bucket-name' # do not give gs:// ,just bucket name
blob_path = 'path/folder name inside bucket'
local_path = 'local_machine_path_where_file_resides' #local file path
upload_to_bucket(bucket_name, blob_path, local_path)

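The blob.upload_from_string(data) method creates a new object whose contents are exactly the contents of the string data. It overwrites an existing object rather than appending to it.

The simplest solution is to write the whole CSV to a temporary file and then upload that file to GCS with blob.upload_from_filename(filename).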
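A minimal sketch of that temporary-file approach, assuming the rows are available as a list of tuples. The helper name `upload_rows_via_tempfile` is hypothetical (it is not part of google-cloud-storage), and `blob` is expected to be a `google.cloud.storage.Blob` such as the one created with `bucket.blob(...)` in the question.

```python
import csv
import os
import tempfile


def upload_rows_via_tempfile(blob, rows):
    """Write rows to a temporary CSV file, upload it once, then delete it."""
    # newline="" lets the csv module control line endings itself
    with tempfile.NamedTemporaryFile(
        "w", suffix=".csv", newline="", delete=False
    ) as tmp:
        csv.writer(tmp).writerows(rows)
        tmp_path = tmp.name
    try:
        # a single upload replaces the object with the full file contents
        blob.upload_from_filename(tmp_path, content_type="text/csv")
    finally:
        os.remove(tmp_path)


# Usage with a real Blob (needs credentials, so left as a comment):
# upload_rows_via_tempfile(blob, list(zip([1, 2, 3], ['a', 'b', 'c'])))
```

The answer's own code, which follows, takes the in-memory route instead and uploads one combined string.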

from google.cloud import storage
import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "<pathtomycredentials>"

a=[1,2,3]

b=['a','b','c']

storage_client = storage.Client()
bucket = storage_client.get_bucket("<mybucketname>")

blob=bucket.blob("Hummingbirds/trainingdata.csv")

# build up the complete csv string
csv_string_to_upload = ''

for eachrow in range(3):
    # add the lines
    csv_string_to_upload = csv_string_to_upload + str(a[eachrow]) + ',' + b[eachrow] + '\n'

# upload the complete csv string
blob.upload_from_string(
    data=csv_string_to_upload,
    content_type='text/csv'
)
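If a field can contain a comma, a quote, or a newline, manual concatenation produces a malformed file. One possible variation, sketched under the assumption that the rows are plain Python sequences, lets the standard `csv` module build the string in memory. The helper name `rows_to_csv_string` is hypothetical.

```python
import csv
import io


def rows_to_csv_string(rows):
    """Return the rows as one CSV-formatted string, quoting fields when needed."""
    buffer = io.StringIO()
    # lineterminator="\n" matches the '\n' used in the loop above
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


a = [1, 2, 3]
b = ['a', 'b', 'c']
csv_string_to_upload = rows_to_csv_string(zip(a, b))

# With a real Blob, a single call then uploads everything:
# blob.upload_from_string(csv_string_to_upload, content_type='text/csv')
```

The upload call stays the same; only the way the string is assembled changes.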

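I only came across this post after running into exactly the same problem. After some effort, I found that the best solution for me was to upload the .csv file as bytes. Here is how I did it: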

new_csv_filename = csv_path + "report_" + start_date_str + "-" + end_date_str + ".csv"
df.to_csv(new_csv_filename, index=False)
# upload the file to the storage
blob = bucket.blob(new_csv_filename)
with open(new_csv_filename, 'rb') as f:  # here we open the file with read bytes option
    blob.upload_from_file(f)   # upload from file is now uploading the file as bytes
blob.make_public()
# generate a download url and return it
return blob.public_url 
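The local file is not strictly required for a bytes upload. As a hedged sketch, assuming the CSV text is already in memory (for a DataFrame, `df.to_csv(index=False)` called without a path returns it as a string), the text can be encoded and passed through `io.BytesIO`. The helper name `upload_csv_text_as_bytes` is hypothetical; `blob` is assumed to be a `google.cloud.storage.Blob`.

```python
import io


def upload_csv_text_as_bytes(blob, csv_text, encoding="utf-8"):
    """Encode CSV text and upload it from an in-memory binary stream."""
    stream = io.BytesIO(csv_text.encode(encoding))
    # rewind=True asks the client to seek to the start of the stream first
    blob.upload_from_file(stream, rewind=True, content_type="text/csv")


# Usage with a real Blob and DataFrame (needs credentials, so left as a comment):
# upload_csv_text_as_bytes(blob, df.to_csv(index=False))
```

This avoids leaving a report file on disk, at the cost of holding the whole CSV in memory.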

