
Read a CSV file in real time using Kafka Connect

How can I integrate Kafka Connect with a Kafka producer in Java so that I can read a CSV file in real time? I am having a hard time finding any sources on this.

Right now I am using Scala IDE to run a simple Kafka producer class, but I have no idea how to use it with Kafka Connect.

Kafka Connect already has a producer built in. You just need to use the right connector plugin. For reading a CSV file, the FileStreamSource connector that ships with Kafka should work. You can start Kafka Connect in standalone mode (see the Kafka docs) and configure the connector. The example config files that ship with Kafka should help you get started.
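
As a rough sketch, a minimal standalone setup might look like the following. The worker properties file config/connect-standalone.properties ships with Kafka; the connector config filename, file path, and topic name below are placeholders you would replace with your own:

    # csv-file-source.properties (hypothetical filename)
    name=csv-file-source
    connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
    tasks.max=1
    # CSV file to tail (placeholder path)
    file=/tmp/input.csv
    # Topic each line of the file is published to (placeholder name)
    topic=csv-lines

    # Start the standalone worker with this connector config
    bin/connect-standalone.sh config/connect-standalone.properties csv-file-source.properties

Note that FileStreamSource publishes each line of the file as a plain string record; parsing the individual CSV columns would still have to happen downstream (e.g. in a consumer or a stream processor).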

A Kafka Connect connector for reading CSV files already exists: https://github.com/jcustenborder/kafka-connect-spooldir .

You can see an example of it in action here: https://www.confluent.io/blog/ksql-in-action-enriching-csv-events-with-data-from-rdbms-into-AWS/


Disclaimer: I wrote the above article and work for Confluent, on whose blog it was published.
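
For orientation, a spooldir CSV source config might look roughly like the sketch below. The directory paths, topic name, and file pattern are placeholders, and depending on the connector version you may instead need explicit key.schema/value.schema definitions rather than schema generation:

    name=csv-spooldir-source
    connector.class=com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector
    tasks.max=1
    # Directory watched for new CSV files (placeholder)
    input.path=/data/unprocessed
    # Where successfully processed and failed files are moved (placeholders)
    finished.path=/data/processed
    error.path=/data/error
    # Only pick up files matching this pattern (backslash doubled for properties-file escaping)
    input.file.pattern=^orders.*\\.csv$
    topic=orders
    # Treat the first row as column names
    csv.first.row.as.header=true
    # Let the connector derive a schema from the header row (newer versions)
    schema.generation.enabled=true

Unlike FileStreamSource, this connector parses the rows into structured records, so the column values arrive in Kafka as fields rather than as one string per line.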

Just another Kafka Connect plugin for reading CSV files: https://github.com/streamthoughts/kafka-connect-file-pulse

Hope this project can help people looking for a similar solution.


Disclaimer: I am one of the contributors to this project

The issue with the usual spooldir connector is that the CSV file has to be available inside the Kafka Connect pod/container, which may not be a safe approach for large files.
