
How to better transfer SQS queue messages into Redshift?

I have an SQS queue that my application constantly sends messages to (about 5-15 messages per second). I need to take the message data and put it into Redshift. Right now, I have a background service which gets X messages from the queue every Y minutes, writes them to a file in S3, and loads the data into Redshift using the COPY command.
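For reference, a minimal sketch of that kind of pipeline using boto3 and the Redshift Data API might look like the following (the queue URL, bucket, cluster, IAM role, and table names are all hypothetical placeholders):

```python
import time
import uuid

import boto3

sqs = boto3.client("sqs")
s3 = boto3.client("s3")
rsd = boto3.client("redshift-data")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/events"  # hypothetical
BUCKET = "my-staging-bucket"  # hypothetical


def drain_queue(max_messages=1000):
    """Poll SQS in batches of 10 (the API maximum) until we have enough."""
    messages = []
    while len(messages) < max_messages:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,  # SQS hard limit per call
            WaitTimeSeconds=5,       # long polling to reduce empty receives
        )
        batch = resp.get("Messages", [])
        if not batch:
            break
        messages.extend(batch)
        # NOTE: deleting here means a failed COPY later loses these messages;
        # a production version would delete only after a successful load.
        sqs.delete_message_batch(
            QueueUrl=QUEUE_URL,
            Entries=[{"Id": m["MessageId"], "ReceiptHandle": m["ReceiptHandle"]}
                     for m in batch],
        )
    return messages


def load_to_redshift(messages):
    """Write the batch to S3 as newline-delimited JSON, then COPY it."""
    key = f"staging/{uuid.uuid4()}.json"
    body = "\n".join(m["Body"] for m in messages)
    s3.put_object(Bucket=BUCKET, Key=key, Body=body.encode("utf-8"))
    rsd.execute_statement(
        ClusterIdentifier="my-cluster",  # hypothetical
        Database="analytics",            # hypothetical
        DbUser="loader",                 # hypothetical
        Sql=f"COPY events FROM 's3://{BUCKET}/{key}' "
            f"IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy' "
            f"FORMAT AS JSON 'auto';",
    )


while True:
    batch = drain_queue()
    if batch:
        load_to_redshift(batch)
    time.sleep(60)  # "every Y minutes"
```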

This implementation has some problems:

  1. In my service, I get X messages at a time, and because of the SQS limits, Amazon allows receiving at most 10 messages per call (meaning that if I want to get 1000 messages, I need to make 100 network calls).

  2. My service doesn't scale as the application scales -> when there are 30 (or 300) messages per second, my service won't be able to handle all the messages.

  3. Using AWS Firehose is a little inconvenient the way I see it, because shards are not scalable (I would need to add shards manually), but maybe I'm wrong here...

As a result of those things, I need something that will be as scalable and efficient as possible. Any ideas?

For the purpose you have described, I think AWS would say that Kinesis Data Streams plus Kinesis Data Firehose is a more appropriate combination of services than SQS.

Yes, like you said, you do have to configure the shards. But just one shard can handle 1000 incoming records/sec. Also, there are ways to automate the scaling, for example as AWS has documented here.
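For illustration, producing into a stream is a single boto3 call per record; a rough sketch (the stream name and partition key field are assumptions, not from your setup):

```python
import json

import boto3

kinesis = boto3.client("kinesis")


def send_event(event: dict) -> None:
    # One shard absorbs up to 1,000 records (or 1 MB) per second;
    # the partition key determines which shard a record lands on.
    kinesis.put_record(
        StreamName="app-events",  # hypothetical stream name
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=str(event.get("user_id", "default")),
    )


send_event({"user_id": 42, "action": "click"})
```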

One further advantage of using Kinesis Data Firehose is that you can create a delivery stream which pushes the data straight into Redshift if you wish.
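So, assuming a delivery stream already configured with a Redshift destination (the stream name below is hypothetical), the producer side could be as simple as this, and the manual S3 batching and COPY logic from the question goes away:

```python
import json

import boto3

firehose = boto3.client("firehose")

# Newline-delimited JSON records, as Redshift's COPY with JSON expects.
records = [{"Data": (json.dumps({"n": i}) + "\n").encode("utf-8")}
           for i in range(10)]

# Firehose buffers these, stages them in S3, and issues the COPY itself.
firehose.put_record_batch(
    DeliveryStreamName="events-to-redshift",  # hypothetical
    Records=records,
)
```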
