Convert a Pandas DataFrame to a Parquet file bytes object

Question

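I have a Pandas DataFrame that I want to write to Azure File Storage as a Parquet file.

So far I have not been able to convert the DataFrame directly into a bytes object that I can then upload to Azure. My current workaround is to save it as a Parquet file on the local drive and then read that file back as a bytes object, which I upload to Azure.

Can anyone tell me how to convert a Pandas DataFrame directly into a "Parquet file" bytes object without writing it to disk? The I/O operation really slows things down, and it feels like very ugly code...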

# Write the data_frame to a Parquet file on the local drive
data_frame.to_parquet('temp_p.parquet', engine='auto', compression='snappy')

# Read the Parquet file back as bytes
with open('temp_p.parquet', mode='rb') as f:
    file_content = f.read()

# Upload the bytes object to Azure (the file is already closed at this point)
service.create_file_from_bytes(share_name, file_path, file_name, file_content, index=0, count=len(file_content))

I am looking for something like the following, where transform_functionality is a placeholder for a method that returns a bytes object:

my_bytes = data_frame.transform_functionality()
service.create_file_from_bytes(share_name, file_path, file_name, my_bytes, index=0, count=len(my_bytes))

Answer 1

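I found a solution and am posting it here in case anyone else needs to do the same thing. After writing the DataFrame to a buffer with to_parquet, I get the bytes object from the buffer with its .getvalue() method, like this: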

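from io import BytesIO

# Write the Parquet data into an in-memory buffer instead of a file on disk
buffer = BytesIO()
data_frame.to_parquet(buffer, engine='auto', compression='snappy')

# Upload the buffer's contents to Azure
service.create_file_from_bytes(share_name, file_path, file_name, buffer.getvalue(), index=0, count=buffer.getbuffer().nbytes)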

Answer 1 details: score 10, accepted, 2019-01-16 07:13:31
