简体   繁体   中英

How to get input file name as column in AWS Athena external tables

I have external tables created in AWS Athena to query S3 data, however, the location path has 1000+ files. So I need the corresponding filename of the record to be displayed as a column in the table.

select file_name , col1 from table where file_name = "test20170516"

In short, I need to know INPUT__FILE__NAME(hive) equivalent in AWS Athena Presto or any other ways to achieve the same.

您可以使用 $path 伪列执行此操作。

select "$path" from table

If you need just the filename, you can extract it with regeexp_extract() .

To use it in Athena on the "$path" you can do something like this:

SELECT regexp_extract("$path", '[^/]+$') AS filename from table;

If you need the filename without the extension, you can do:

SELECT regexp_extract("$path", '[ \w-]+?(?=\.)') AS filename_without_extension from table;

Here is the documentation on Presto Regular Expression Functions

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM