
AWS Athena gives an error when trying to query files in S3 that have already been catalogued in the Glue Data Catalog

I am trying to build a data lake on S3 from files in .csv.gz format, and then cleanse and process the data further within the AWS environment itself. I first used AWS Glue to create a data catalog (the crawler was able to identify all the tables). The tables from the catalog are also available in AWS Athena, but when I try to run a SELECT * on a table, I get the following error.

Error opening Hive split s3://BUCKET_NAME/HEADER FOLDER/FILENAME.csv.gz (offset=0, length=44354) using org.apache.hadoop.mapred.TextInputFormat: Permission denied on S3 path: s3://BUCKET_NAME/HEADER FOLDER/FILENAME.csv.gz

Could it be that the file cannot be accessed as is because it is in CSV.GZ format, or do I need to give the user or role specific access to these files?

You need to fix your permissions. The error says that the principal (user or role) that ran the query does not have permission to read an object on S3.
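As a rough illustration of what "fixing permissions" might look like, here is a minimal sketch in Python that builds an IAM policy document granting read access to the bucket. The function name `build_athena_read_policy` and the placeholder bucket and prefix are assumptions for illustration, not something from the original post; the actions listed (`s3:GetObject`, `s3:ListBucket`, `s3:GetBucketLocation`) are the usual S3 read permissions, but your setup may need more, for example KMS permissions if the objects are encrypted, or a change to the bucket policy.

```python
import json


def build_athena_read_policy(bucket: str, prefix: str = "") -> dict:
    """Return an IAM policy document allowing read access to one bucket/prefix.

    Hypothetical helper for illustration only; attach the resulting JSON to
    the user or role that runs the Athena query.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    # Object-level ARN: every key under the prefix (or the whole bucket).
    object_arn = f"{bucket_arn}/{prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arn,
            },
            {
                # Object-level read applies to the keys inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": object_arn,
            },
        ],
    }


if __name__ == "__main__":
    # BUCKET_NAME is the placeholder used in the error message above.
    print(json.dumps(build_athena_read_policy("BUCKET_NAME"), indent=2))
```

Athena can read gzip-compressed CSV files directly, so the .csv.gz format by itself is unlikely to be the cause of a "Permission denied" error.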
