How to suppress column headers in AWS Athena query results?

Question

I'm running a SELECT Athena query on an S3 bucket manifest. I then want to use the results of that query, in .csv format, in an S3 Batch operation.

My query runs fine and I am able to access the .csv output via S3 Batch, but since the first row is actually column headers, S3 Batch throws an unrecoverable error because it thinks that the manifest is now referring to multiple buckets.

How can I easily strip the column headers out of my results? I would prefer to just do it in SQL. The file size makes using standard Unix tools prohibitive. I could use AWS Glue, but this seems like overkill for just suppressing headers in a SQL query.

Answer 1

Here's a hacky way to get around it:

SELECT bucket as "my-bucket-name", key as "fakekey"
from your_athena_table

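This makes the header row look like the rest of the file, so it will not break the S3 Batch copy job. Use your real bucket name as the alias in place of my-bucket-name, so that the header row appears to refer to the same bucket as every other row. You will have just one failed record, for the nonexistent key fakekey.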
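
To see why the aliasing trick works, here is a minimal sketch that uses Python's built-in sqlite3 module as a local stand-in for Athena (an assumption made purely for illustration; Athena's actual CSV output, including its quoting, may differ). The sample rows and object keys are hypothetical. It runs the same SELECT, writes a header plus rows as CSV, and checks that every row, header included, names a single bucket.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for the Athena table; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE your_athena_table (bucket TEXT, "key" TEXT)')
conn.executemany(
    "INSERT INTO your_athena_table VALUES (?, ?)",
    [("my-bucket-name", "photos/a.jpg"), ("my-bucket-name", "photos/b.jpg")],
)

# Same aliasing idea as the answer: the column names become a plausible data row.
cursor = conn.execute(
    'SELECT bucket AS "my-bucket-name", "key" AS "fakekey" '
    'FROM your_athena_table ORDER BY "key"'
)

# Emit the header followed by the data rows, as a CSV export with headers would.
header = [col[0] for col in cursor.description]
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(cursor)
lines = buffer.getvalue().splitlines()

# Collect the first column of every line, header included.
buckets = {row[0] for row in csv.reader(lines)}

print(lines[0])  # the header row: my-bucket-name,fakekey
print(buckets)   # only one distinct bucket value across all lines
```

Because the header's first field equals the bucket name, a consumer that treats every line as data sees a single bucket; the only bad entry is the one pointing at fakekey.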

Answer 1
4 2020-03-26 19:21:05