
Is Kinesis Firehose really real-time processing?

I have a pipeline built on Kinesis Firehose whose source data arrives as PUT requests to Firehose, but the total time from source to target (S3) is too slow, roughly 3 minutes. That does not qualify as real time. I have attached the window settings:

[screenshot: Firehose window settings]

And finally, in the target (S3 destination settings):

[screenshot: S3 destination settings]

How can I reduce the source-to-target time to less than 1 minute?

Kinesis Firehose for S3 allows you to set the buffer interval to a value between 60 and 900 seconds, as well as a buffer size of 1 to 128 MiB. Whichever limit is reached first triggers the write to S3.

The frequency of data delivery to Amazon S3 is determined by the Amazon S3 Buffer size and Buffer interval value that you configured for your delivery stream. Kinesis Data Firehose buffers incoming data before it delivers it to Amazon S3. You can configure the values for Amazon S3 Buffer size (1–128 MB) or Buffer interval (60–900 seconds). The condition satisfied first triggers data delivery to Amazon S3. When data delivery to the destination falls behind data writing to the delivery stream, Kinesis Data Firehose raises the buffer size dynamically. It can then catch up and ensure that all data is delivered to the destination.

https://docs.aws.amazon.com/firehose/latest/dev/basic-deliver.html#frequency
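The "whichever condition is satisfied first" rule can be illustrated with a bit of arithmetic. The following is only a simplified model, not how Firehose is implemented: it assumes a constant ingest rate and ignores the dynamic buffer growth and delivery overhead mentioned in the quote. The function name and its default values are made up for illustration.

```python
def seconds_until_flush(ingest_mb_per_s: float,
                        buffer_size_mb: float = 5.0,
                        buffer_interval_s: float = 300.0) -> float:
    """Estimate how long data waits in the buffer before delivery to S3.

    Simplified model: the buffer is flushed as soon as either the size
    limit or the interval limit is reached, whichever comes first.
    """
    if ingest_mb_per_s <= 0:
        # Without a steady ingest rate only the interval can trigger delivery.
        return buffer_interval_s
    seconds_to_fill = buffer_size_mb / ingest_mb_per_s
    return min(buffer_interval_s, seconds_to_fill)


# Low-volume stream (10 KB/s): the size limit is never hit in time,
# so the interval decides when data is written.
low_volume_wait = seconds_until_flush(0.01, buffer_size_mb=1.0, buffer_interval_s=60.0)

# High-volume stream (1 MB/s): the 5 MB buffer fills long before the interval ends.
high_volume_wait = seconds_until_flush(1.0, buffer_size_mb=5.0, buffer_interval_s=300.0)
```

With a low-volume source like a single PUT request, the interval is almost always the deciding limit, which is why the minimum of 60 seconds acts as a floor on the latency.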

So you can increase the chance of more timely delivery by decreasing the buffer interval and buffer size, at the cost of increasing the number of files you end up with in S3.
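As a rough sketch of how the buffering could be lowered programmatically, the snippet below uses the boto3 Firehose client calls describe_delivery_stream and update_destination. It assumes the stream has a single extended S3 destination; the helper names and the stream name "my-delivery-stream" are hypothetical, and the limits checked are the ones quoted above.

```python
def build_buffering_hints(size_mb: int = 1, interval_s: int = 60) -> dict:
    """Build a BufferingHints dict, checked against the limits quoted above."""
    if not 1 <= size_mb <= 128:
        raise ValueError("buffer size must be between 1 and 128 MB")
    if not 60 <= interval_s <= 900:
        raise ValueError("buffer interval must be between 60 and 900 seconds")
    return {"SizeInMBs": size_mb, "IntervalInSeconds": interval_s}


def apply_minimum_buffering(client, stream_name: str) -> dict:
    """Set the smallest allowed buffer on the stream's first destination.

    `client` is expected to be a boto3 Firehose client, e.g.
    boto3.client("firehose"). Assumes one extended S3 destination.
    """
    description = client.describe_delivery_stream(
        DeliveryStreamName=stream_name
    )["DeliveryStreamDescription"]
    hints = build_buffering_hints(size_mb=1, interval_s=60)
    client.update_destination(
        DeliveryStreamName=stream_name,
        CurrentDeliveryStreamVersionId=description["VersionId"],
        DestinationId=description["Destinations"][0]["DestinationId"],
        ExtendedS3DestinationUpdate={"BufferingHints": hints},
    )
    return hints


# Usage (requires AWS credentials; the stream name is a placeholder):
#   import boto3
#   apply_minimum_buffering(boto3.client("firehose"), "my-delivery-stream")
```

The same values can also be set in the console under the destination settings; the code is only useful if the change has to be scripted.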


If you really need sub-minute latency here, consider using a Kinesis Data Stream instead: have a Lambda function read from that stream and then write the data to S3 from the Lambda. See the docs for more details.
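A minimal sketch of such a Lambda function might look like the following. It assumes the standard Kinesis event shape (base64-encoded payloads under Records[].kinesis.data) and writes each batch as one newline-delimited object. The bucket name, the TARGET_BUCKET variable, the key prefix and the optional s3_client parameter are assumptions made for illustration; error handling and retries are left out.

```python
import base64
import os
import time
import uuid

# Hypothetical bucket name, read from an environment variable if present.
BUCKET = os.environ.get("TARGET_BUCKET", "my-target-bucket")


def decode_records(event: dict) -> list:
    """Return the raw payload bytes of every Kinesis record in the event."""
    return [
        base64.b64decode(record["kinesis"]["data"])
        for record in event.get("Records", [])
    ]


def build_key(prefix: str = "events/") -> str:
    """Build a unique object key from the current time and a random suffix."""
    return f"{prefix}{int(time.time())}-{uuid.uuid4().hex}.jsonl"


def handler(event, context, s3_client=None):
    """Lambda entry point: write one S3 object per invocation batch."""
    payloads = decode_records(event)
    if not payloads:
        return {"written": 0}
    if s3_client is None:
        # Imported lazily so the pure helpers work without boto3 installed.
        import boto3
        s3_client = boto3.client("s3")
    key = build_key()
    s3_client.put_object(Bucket=BUCKET, Key=key, Body=b"\n".join(payloads))
    return {"written": len(payloads), "key": key}
```

With this approach the latency is governed by the batch size and batch window configured on the Lambda event source mapping rather than by the Firehose buffer, but every invocation still creates a separate object, so the trade-off against file count remains.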
