
Upload a dataframe as a zipped CSV directly to S3 without saving it on the local machine

How can I upload a dataframe as a zipped CSV to an S3 bucket without saving it on my local machine first?

I already have a connection to that bucket:

self.s3_output = S3(bucket_name='test-bucket', bucket_subfolder='')

We can make a file-like object with BytesIO and zipfile from the standard library.

# Python 3.7+
from io import BytesIO
import zipfile

# .to_csv returns a string when called with no path argument
s = df.to_csv()

# Keep a reference to the buffer so it can be uploaded
# after the archive is finalized
buffer = BytesIO()
with zipfile.ZipFile(buffer, mode="w") as z:
    z.writestr("df.csv", s)

buffer.seek(0)  # rewind before uploading

You'll want to refer to upload_fileobj in order to customize how the upload behaves.

yourclass.s3_output.upload_fileobj(buffer, ...)
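To make the flow concrete, here is a minimal self-contained sketch. The frame contents, bucket, and key are placeholders; the upload line assumes boto3 and valid credentials, so it is shown commented out, and a local round-trip stands in as a sanity check:

```python
import zipfile
from io import BytesIO

import pandas as pd

# Toy frame standing in for your dataframe (hypothetical data)
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Build the zip archive entirely in memory
buffer = BytesIO()
with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as z:
    z.writestr("df.csv", df.to_csv(index=False))

buffer.seek(0)  # rewind so the upload reads from the start

# Hypothetical bucket/key; needs boto3 and AWS credentials:
# import boto3
# boto3.client("s3").upload_fileobj(buffer, "test-bucket", "df.zip")

# Sanity check: the archive round-trips back to the original frame
with zipfile.ZipFile(buffer) as z:
    restored = pd.read_csv(BytesIO(z.read("df.csv")))
assert restored.equals(df)
```

Note that `seek(0)` matters: `upload_fileobj` reads from the buffer's current position, which sits at the end of the data right after writing.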

This works equally well for zip and gz:

import boto3
import gzip
import pandas as pd
from io import BytesIO, TextIOWrapper

s3_client = boto3.client(
    service_name="s3",
    endpoint_url=your_endpoint_url,
    aws_access_key_id=your_access_key,
    aws_secret_access_key=your_secret_key,
)

# Your file name inside the archive
your_filename = "test.csv"

s3_path = "path/to/your/s3/compressed/file/test.csv.gz"
bucket = "your_bucket"
df = your_df

gz_buffer = BytesIO()

with gzip.GzipFile(
    filename=your_filename,
    mode="w",
    fileobj=gz_buffer,
) as gz_file:
    df.to_csv(TextIOWrapper(gz_file, "utf8"), index=False)

# Upload after the GzipFile is closed so the buffer holds the full stream
s3_client.put_object(
    Bucket=bucket, Key=s3_path, Body=gz_buffer.getvalue()
)
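As a quick check that bytes written this way really form a valid gzip stream, the same buffer can be read back with pandas. This is a local round-trip sketch with a toy frame in place of your_df; no S3 call is involved, and the explicit `.encode()` sidesteps any text-wrapper buffering:

```python
import gzip
from io import BytesIO

import pandas as pd

# Toy frame standing in for your_df (hypothetical data)
df = pd.DataFrame({"x": [1, 2], "y": ["a", "b"]})

gz_buffer = BytesIO()
with gzip.GzipFile(filename="test.csv", mode="w", fileobj=gz_buffer) as gz_file:
    # Encode the CSV text to bytes and write it into the gzip stream
    gz_file.write(df.to_csv(index=False).encode("utf-8"))

# pandas decompresses the same bytes transparently
restored = pd.read_csv(BytesIO(gz_buffer.getvalue()), compression="gzip")
assert restored.equals(df)
```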

