
Save a pandas df into a file-like object with gzip compression

I am trying to save a pandas DataFrame into an in-memory JSON buffer and then upload the file to S3 using the following code:

from io import StringIO

json_buffer = StringIO()  # text (str) buffer
df.to_json(json_buffer, orient='records', date_format='iso', compression='gzip')
json_file_name = file_to_load.split(".")[0] + ".json"
s3_conn.put_object(Body=json_buffer.getvalue(), Bucket=s3_bucket, Key=f"{target_path}{json_file_name}")

When I try to apply compression I get this error:

RuntimeWarning: compression has no effect when passing a non-binary object as input.

How can I still apply compression and save the JSON file to S3 as .gz?

Thank you!

Got it to work. The warning is raised because StringIO is a text buffer while gzip output is binary, so the fix is to write through gzip.GzipFile into a BytesIO. I will share how it worked out for me using BytesIO and gzip:

from io import BytesIO
import gzip

json_buffer = BytesIO()  # binary buffer, so compressed bytes can be written

# write gzip-compressed JSON into the in-memory buffer
with gzip.GzipFile(mode='w', fileobj=json_buffer) as gz_file:
    df.to_json(gz_file, orient='records', date_format='iso')

json_file_name = file_to_load.split(".")[0] + ".json.gz"
s3_conn.put_object(Body=json_buffer.getvalue(), Bucket=s3_bucket, Key=f"{target_path}{json_file_name}")
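An alternative sketch, assuming pandas 1.2 or newer (where to_json can write compressed output directly to a binary file-like object): pass a BytesIO straight to to_json and let compression='gzip' take effect, since the warning only fires for non-binary buffers.

from io import BytesIO
import pandas as pd

# with a binary buffer, compression='gzip' is applied (pandas >= 1.2 assumed)
json_buffer = BytesIO()
df.to_json(json_buffer, orient='records', date_format='iso', compression='gzip')

s3_conn.put_object(Body=json_buffer.getvalue(), Bucket=s3_bucket, Key=f"{target_path}{json_file_name}")

# optional sanity check: read the compressed payload back
df_check = pd.read_json(BytesIO(json_buffer.getvalue()), orient='records', compression='gzip')

This avoids managing the GzipFile wrapper yourself, at the cost of requiring a recent pandas version.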
