
Kafka to S3 - How to load slices from Kafka to S3

It's not clear to me whether there is some kind of plugin that consumes data from Kafka topics and writes it to S3.

I already found this topic, but I have not solved the issue yet. There is also this project, but honestly it is hard to use because the last commit is from 2 years ago.

My main goal was to consume directly from Kafka in Spark jobs, but I think that could get complicated, so populating S3 with slices of events from Kafka would be enough for me.

Also, is there any consumer example in Scala? It is kind of funny, because Kafka is built in Scala but the documentation code is in Java. =p

I appreciate any help.

Updated:

Camus may be an option too.

This tool from Pinterest was the perfect answer for me.

Secor

StreamX ( https://github.com/qubole/streamx ), which is based on the Kafka-Connect framework, can help copy data from Kafka to S3 reliably. It is feature-rich and supports multiple output formats and different partitioning mechanisms.
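
For anyone wondering what writing "slices" of events to S3 amounts to, here is a minimal sketch of the batching idea only. It is not how Secor, Camus, or StreamX are implemented. The `Record` type, the `write_slices` helper, the `raw/topic/partition/offset.log` key layout, and the `upload` callback are all hypothetical names chosen for illustration; in a real job the records would come from a Kafka consumer and `upload` would wrap an S3 client call.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    """Stand-in for a consumed Kafka message (hypothetical type)."""
    topic: str
    partition: int
    offset: int
    value: bytes


def slice_key(prefix, first):
    # Assumed layout: prefix/topic/partition/<zero-padded first offset>.log
    return f"{prefix}/{first.topic}/{first.partition}/{first.offset:020d}.log"


def write_slices(records, upload, prefix="raw", max_records=3):
    """Buffer records per (topic, partition); upload each full slice.

    Returns the records that are still buffered and not yet uploaded.
    """
    buffers = defaultdict(list)
    for rec in records:
        buf = buffers[(rec.topic, rec.partition)]
        buf.append(rec)
        if len(buf) >= max_records:
            # One object per slice, one event per line.
            body = b"\n".join(r.value for r in buf) + b"\n"
            upload(slice_key(prefix, buf[0]), body)
            buf.clear()
    return {tp: list(buf) for tp, buf in buffers.items() if buf}


# Demo with an in-memory dict standing in for the S3 bucket.
uploaded = {}
events = [Record("clicks", 0, i, f"event-{i}".encode()) for i in range(7)]
leftover = write_slices(events, uploaded.__setitem__, max_records=3)

for key in sorted(uploaded):
    print(key)
```

Naming each object after the first offset it contains makes a retried upload overwrite the same key instead of creating a duplicate. A real consumer would also want to commit offsets only after the upload succeeds, and to flush the leftover buffer on a timer so small partitions do not wait forever.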
