Can Kinesis Firehose receive content uncompressed from CloudWatch Logs subscription?
I'm using Kinesis Firehose to copy application logs from CloudWatch Logs into S3 buckets.
However, there is a problem with this flow. I've often noticed that the Lambda transform function fails because its output exceeds the 6 MiB response payload limit for synchronous Lambda invocation. It makes sense that this happens, because the input is compressed but the output is not. Transforming the data this way seems like the only way to get the file extension and MIME type set correctly on the resulting object in S3.
Is there any way to deliver the input to the Lambda transform function uncompressed? That would align the input and output sizes. I've already tried reducing the buffer size on the Firehose delivery stream, but the buffer size limit appears to apply to the compressed data, not the raw data.
CloudWatch Logs always delivers in compressed format, which is a benefit from a cost and performance perspective. But I understand your frustration with not having the correct file extension in S3.
What you could do: 1) Have your Lambda decompress on read and compress on write.
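A minimal sketch of option 1, assuming the standard CloudWatch Logs subscription payload shape (base64-encoded, GZIP-compressed records with a `logEvents` array — the field names here follow that format; the exact processing step is up to you):

```python
import base64
import gzip
import json

def handler(event, context):
    """Firehose transform sketch: decompress each CloudWatch Logs record,
    extract the log messages, then re-compress the output so the response
    stays within the 6 MiB synchronous invocation limit."""
    output = []
    for record in event["records"]:
        # CloudWatch Logs pushes GZIP-compressed, base64-encoded payloads.
        payload = gzip.decompress(base64.b64decode(record["data"]))
        message = json.loads(payload)

        # Emit each log event on its own line, then compress again so the
        # output size stays comparable to the compressed input size.
        lines = "\n".join(e["message"] for e in message.get("logEvents", [])) + "\n"
        compressed = gzip.compress(lines.encode("utf-8"))

        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(compressed).decode("ascii"),
        })
    return {"records": output}
```

Because the function now writes compressed bytes, the objects Firehose lands in S3 contain GZIP data, which is what makes the `.gz` extension (option 2 below handles the rename) accurate.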
2) Create an S3 event trigger on ObjectCreated that renames the file with the correct extension. Due to the way Firehose writes to S3 you cannot use a suffix filter, so your Lambda will need to check whether it has already done the rename.
Lambda logic:
if object does not end with .gz
then
    aws s3 mv object object.gz
end if
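The pseudocode above could be fleshed out roughly like this — a sketch of an S3 ObjectCreated handler, assuming the usual S3 event notification shape (S3 has no rename operation, so the "move" is a copy plus a delete):

```python
import urllib.parse

def needs_rename(key):
    """True if the delivered object is missing the .gz extension.
    This check is what prevents an infinite trigger loop, since the
    renamed object fires the ObjectCreated event again."""
    return not key.endswith(".gz")

def handler(event, context):
    # boto3 is available in the Lambda runtime; imported lazily here
    # so the suffix logic above stays testable without AWS access.
    import boto3
    s3 = boto3.client("s3")
    for rec in event["Records"]:
        bucket = rec["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(rec["s3"]["object"]["key"])
        if not needs_rename(key):
            continue  # already has .gz; this is our own rename event
        # "Rename" = copy to the new key, then delete the original.
        s3.copy_object(Bucket=bucket, Key=key + ".gz",
                       CopySource={"Bucket": bucket, "Key": key})
        s3.delete_object(Bucket=bucket, Key=key)
```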
No, it doesn't seem possible to change whether the input from CloudWatch Logs is compressed. CloudWatch Logs will always push GZIP-compressed payloads onto the Kinesis stream.
For confirmation, take a look at kinesis-firehose-cloudwatch-logs-processor, the AWS reference implementation of the newline handler for CloudWatch Logs. This handler accepts GZIP-compressed input and returns the decompressed messages as output. To work around the 6 MiB limit and avoid "body size is too long" error messages, the reference handler slices the input into two parts: payloads that fit within the 6 MiB limit, and the remainder. The remainder is re-inserted into Kinesis using PutRecordBatch.