
How to read parquet file from s3 using dask with specific AWS profile

How can I read a parquet file on s3 using dask and a specific AWS profile (stored in a credentials file)? Dask uses s3fs, which uses boto. This is what I have tried:

>>> import os
>>> import s3fs
>>> import boto3
>>> import dask.dataframe as dd

>>> os.environ['AWS_SHARED_CREDENTIALS_FILE'] = "~/.aws/credentials"

>>> fs = s3fs.S3FileSystem(anon=False, profile_name="some_user_profile")
>>> fs.exists("s3://some.bucket/data/parquet/somefile")
True
>>> df = dd.read_parquet('s3://some.bucket/data/parquet/somefile')
NoCredentialsError: Unable to locate credentials

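Never mind, that was easy, but I did not find any reference online, so here it is. The fs object created above is never handed to dask; dd.read_parquet builds its own s3fs filesystem, so the profile has to be passed through storage_options: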

>>> import os
>>> import dask.dataframe as dd
>>> os.environ['AWS_SHARED_CREDENTIALS_FILE'] = "/path/to/credentials"

>>> df = dd.read_parquet('s3://some.bucket/data/parquet/somefile',
...                      storage_options={"profile_name": "some_user_profile"})
>>> df.head()
# works
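
If this is needed in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the function names `s3_storage_options` and `read_parquet_with_profile` and the `legacy_key` flag are made up for illustration. The flag exists because, as far as I know, more recent s3fs releases expect the key `profile` rather than `profile_name`; treat that as an assumption and check it against the s3fs version you have installed.

```python
import os


def s3_storage_options(profile, credentials_file=None, legacy_key=True):
    """Build the storage_options dict that dask forwards to s3fs.

    Hypothetical helper, not part of dask or s3fs.
    """
    if credentials_file is not None:
        # botocore reads this variable to locate a non-default credentials file
        os.environ["AWS_SHARED_CREDENTIALS_FILE"] = os.path.expanduser(credentials_file)
    # "profile_name" is the key used in the answer above; "profile" is assumed
    # to be the key for newer s3fs versions (verify for your installation)
    key = "profile_name" if legacy_key else "profile"
    return {key: profile}


def read_parquet_with_profile(path, profile, **option_kwargs):
    """Read a parquet dataset from S3 using the given AWS profile."""
    # Imported lazily so that s3_storage_options can be used without dask installed
    import dask.dataframe as dd

    options = s3_storage_options(profile, **option_kwargs)
    return dd.read_parquet(path, storage_options=options)
```

Usage would then look like `df = read_parquet_with_profile('s3://some.bucket/data/parquet/somefile', 'some_user_profile', credentials_file='/path/to/credentials')`, which requires dask, s3fs and valid credentials for that profile.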

