
Find the source of an Athena query result

We have thousands of files stored in S3. These files are exposed to Athena so that we can query them. While debugging, I found that Athena returns multiple blank rows when I query for a specific id. Given that there are thousands of files, I am not sure where that data is coming from.

Is there a way to see the source file for each row in an Athena result?

The Presto Hive connector exposes a hidden column, "$path", which contains the path of the file that a particular row was read from.

Note: the column name is actually $path, but you need to double-quote it in SQL (write it as "$path"), because $ is otherwise illegal in an identifier.
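
To make the quoting concrete, here is a minimal sketch that builds such a query as a string. The helper names, the table name `my_table`, the column `id`, and the value `abc-123` are all hypothetical placeholders rather than anything from the question, and the sketch assumes the id column is string-typed and the table name is a single unqualified identifier. It groups by "$path" so each source file appears once with the number of matching rows it contributed.

```python
def quote_identifier(name: str) -> str:
    """Double-quote an SQL identifier, escaping any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_source_file_query(table: str, id_column: str, id_value: str) -> str:
    """Build a query listing which files hold the rows matching an id.

    Assumption: id_value is compared as a string literal.
    """
    # Escape single quotes so the value is a valid SQL string literal.
    literal = "'" + id_value.replace("'", "''") + "'"
    return (
        # "$path" must be double-quoted because $ is not legal in a bare identifier.
        f'SELECT "$path" AS source_file, COUNT(*) AS row_count\n'
        f"FROM {quote_identifier(table)}\n"
        f"WHERE {quote_identifier(id_column)} = {literal}\n"
        f'GROUP BY "$path"\n'
        f"ORDER BY row_count DESC"
    )


# Hypothetical table, column, and id value, for illustration only.
print(build_source_file_query("my_table", "id", "abc-123"))
```

The printed statement can be pasted into the Athena query editor. To see the offending rows themselves next to their file, replacing the aggregate with `SELECT "$path", *` and dropping the GROUP BY and ORDER BY clauses should work the same way.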

