I created a DataFrame with pandas and used to_parquet(...) to write it directly to S3.
The arguments are:
df.to_parquet('s3://bucket/fn.parquet', compression='gzip', engine='fastparquet', partition_cols=['col1'])
When I read it back with pandas.read_parquet(url), the DataFrame loads fine.
But when I use modin.pandas.read_parquet(url), I get the following error:
  File "/home/mguo/anaconda3/envs/testenv/lib/python3.7/site-packages/s3fs/core.py", line 1779, in __init__
    self.req_kw["IfMatch"] = self.details["ETag"]
KeyError: 'ETag'
Below are my versions:
python==3.7.3
pandas==1.2.4
modin==0.10.0
s3fs==2021.6.0
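A version list like the one above can be gathered programmatically, which helps when comparing a failing environment with a working one. This is a minimal sketch, not part of the original post: installed_versions is a hypothetical helper name, and importlib.metadata is only in the standard library from Python 3.8 onward (on 3.7, as used here, the importlib_metadata backport would be needed instead).

```python
from importlib.metadata import PackageNotFoundError, version  # stdlib on Python 3.8+


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None instead of raising.
    (Hypothetical helper, for illustration only.)
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # The packages involved in the S3 parquet read path from the question.
    report = installed_versions(["pandas", "modin", "s3fs", "fastparquet"])
    for name, ver in report.items():
        print(f"{name}=={ver}")
```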
This issue was tracked on GitHub here and fixed here.
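For an environment that cannot pick up the fix yet, one possible stopgap is to try the Modin reader first and fall back to plain pandas when the KeyError from the traceback appears. This is a sketch under assumptions, not what the fix itself does: read_with_fallback is a hypothetical helper, and the fallback gives up Modin's parallel read because pandas loads the whole dataset first.

```python
from typing import Any, Callable


def read_with_fallback(
    url: str,
    primary: Callable[[str], Any],
    fallback: Callable[[str], Any],
) -> Any:
    """Call primary(url); if it raises KeyError, call fallback(url) instead.

    KeyError is caught because that is the exception shown in the
    traceback (KeyError: 'ETag'). Any other exception propagates.
    (Hypothetical helper, for illustration only.)
    """
    try:
        return primary(url)
    except KeyError:
        return fallback(url)


# Intended usage (needs modin, pandas, s3fs and S3 access, so it is
# left commented out here):
#
#   import pandas
#   import modin.pandas as mpd
#
#   df = read_with_fallback(
#       "s3://bucket/fn.parquet",
#       mpd.read_parquet,
#       # Read with pandas, then wrap the result in a Modin DataFrame.
#       lambda u: mpd.DataFrame(pandas.read_parquet(u)),
#   )
```

Passing the readers in as arguments keeps the helper independent of either library, so it can be checked without S3 access.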
Another user posted a link to the GitHub issue in an answer here, but it was deleted. Mods, if you see this post, please don't delete it.