How to read a parquet file from S3 using Dask with a specific AWS profile
How do I read a parquet file on S3 using Dask and a specific AWS profile (stored in a credentials file)? Dask uses s3fs, which uses boto. This is what I have tried:
>>> import os
>>> import s3fs
>>> import boto3
>>> import dask.dataframe as dd
>>> os.environ['AWS_SHARED_CREDENTIALS_FILE'] = "~/.aws/credentials"
>>> fs = s3fs.S3FileSystem(anon=False, profile_name="some_user_profile")
>>> fs.exists("s3://some.bucket/data/parquet/somefile")
True
>>> df = dd.read_parquet('s3://some.bucket/data/parquet/somefile')
NoCredentialsError: Unable to locate credentials
Never mind, that was easy, but I did not find any reference online, so here it is: the `S3FileSystem` instance created above is not used by `dd.read_parquet`, which builds its own filesystem internally. The profile has to be passed through `storage_options` instead:
>>> import os
>>> import dask.dataframe as dd
>>> os.environ['AWS_SHARED_CREDENTIALS_FILE'] = "/path/to/credentials"
>>> df = dd.read_parquet('s3://some.bucket/data/parquet/somefile',
...                      storage_options={"profile_name": "some_user_profile"})
>>> df.head()
# works
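One common cause of `NoCredentialsError` even with `storage_options` set is a profile name that does not match any `[section]` header in the credentials file. A minimal stdlib sanity check, sketched here with a throwaway file and placeholder keys (in practice you would read the real path set in `AWS_SHARED_CREDENTIALS_FILE`):

```python
import configparser
import os
import tempfile

# Write a minimal AWS-style credentials file (hypothetical profile, placeholder keys)
creds = """[some_user_profile]
aws_access_key_id = PLACEHOLDER_KEY_ID
aws_secret_access_key = PLACEHOLDER_SECRET
"""
path = os.path.join(tempfile.mkdtemp(), "credentials")
with open(path, "w") as f:
    f.write(creds)

# The name passed via storage_options must match a [section] header exactly
config = configparser.ConfigParser()
config.read(path)
print("some_user_profile" in config)   # True
print("missing_profile" in config)     # False
```

The AWS credentials file is plain INI, so `configparser` can list its profiles without touching boto at all.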