AWS Firehose to S3: partition folder names like year=YYYY, month=MM, day=DD, hour=HH
Currently, AWS Firehose by default delivers data to S3 under folders partitioned as YYYY/MM/DD/HH, e.g. 2017/10/26/18
But I would like the folders to look like this:
Year=2017/Month=10/Day=26/Hour=18
Is there a way to make Firehose use the layout above by default?
I tried triggering an SNS topic to invoke a Lambda function that renames the folders to year=yyyy, month=mm, etc., but the problem is that Firehose takes some time to create the default partitioned folders. So I am not sure how to achieve this without possible conflicts, such as the Lambda being invoked before the folder has been created.
Ideally there would be an AWS-native way to handle this, but I have not found one yet.
Any suggestions would be appreciated. Thanks!
Using Dynamic Partitioning, you can use the following expression in the S3 bucket prefix of the Kinesis Firehose configuration:
input/kinesis-realtime/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/
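To check what folder names an expression like this produces, here is a minimal local sketch in Python. It is not Firehose's own implementation. The helper name expand_prefix is hypothetical, it handles only the four patterns discussed in this thread, and it assumes the timestamp is evaluated in UTC.

```python
import re
from datetime import datetime, timezone

# Java DateTimeFormatter patterns used in this thread, mapped to strftime codes.
# Only these four are handled; this table is an assumption of the sketch.
_JAVA_TO_STRFTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H"}


def expand_prefix(prefix: str, when: datetime) -> str:
    """Expand !{timestamp:<pattern>} placeholders for one instant (in UTC)."""
    utc = when.astimezone(timezone.utc)

    def _sub(match: re.Match) -> str:
        pattern = match.group(1)
        if pattern not in _JAVA_TO_STRFTIME:
            raise ValueError(f"pattern not handled by this sketch: {pattern}")
        return utc.strftime(_JAVA_TO_STRFTIME[pattern])

    return re.sub(r"!\{timestamp:([^}]+)\}", _sub, prefix)


prefix = (
    "year=!{timestamp:yyyy}/month=!{timestamp:MM}/"
    "day=!{timestamp:dd}/hour=!{timestamp:HH}/"
)
sample = datetime(2017, 10, 26, 18, 5, tzinfo=timezone.utc)
print(expand_prefix(prefix, sample))
# year=2017/month=10/day=26/hour=18/
```

Note that each placeholder starts with "!{"; a typo such as ":{" is not matched by the regular expression above and is left in the output unchanged.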
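Set the S3 prefix option to 'year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/hour=!{timestamp:HH}/' to get a folder structure like year=2017/month=10/day=26/hour=18. Use lowercase yyyy for the year, since uppercase YYYY means week-based year in Java date patterns.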
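For anyone setting this prefix programmatically rather than in the console, the following is a rough sketch using boto3. The bucket ARN, role ARN, stream name, and the error prefix layout are all placeholders. The ErrorOutputPrefix is included on the assumption that Firehose expects one when Prefix contains expressions. The helper names are hypothetical.

```python
def build_s3_destination(bucket_arn: str, role_arn: str) -> dict:
    """Build an ExtendedS3DestinationConfiguration with a Hive-style prefix."""
    return {
        "RoleARN": role_arn,
        "BucketARN": bucket_arn,
        # Timestamp expressions are expanded by Firehose at delivery time.
        "Prefix": (
            "year=!{timestamp:yyyy}/month=!{timestamp:MM}/"
            "day=!{timestamp:dd}/hour=!{timestamp:HH}/"
        ),
        # Assumed layout for failed records; adjust to your own conventions.
        "ErrorOutputPrefix": (
            "errors/!{firehose:error-output-type}/"
            "year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/"
        ),
    }


def create_stream(name: str, bucket_arn: str, role_arn: str) -> dict:
    """Create a Direct PUT delivery stream; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the config builder works without it

    client = boto3.client("firehose")
    return client.create_delivery_stream(
        DeliveryStreamName=name,
        DeliveryStreamType="DirectPut",
        ExtendedS3DestinationConfiguration=build_s3_destination(
            bucket_arn, role_arn
        ),
    )


# Placeholder ARNs for illustration only.
config = build_s3_destination(
    "arn:aws:s3:::example-bucket",
    "arn:aws:iam::123456789012:role/example-firehose-role",
)
print(config["Prefix"])
```

Because Firehose writes the objects under the final key names directly, no follow-up Lambda rename is needed with this approach.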