I'm doing real-time streaming from Twitter and wonder whether there is a way to extract only the messages and certain values from a Kafka topic.
You can use ksqlDB to do this. For example:
ksql> CREATE STREAM TWEETS WITH (KAFKA_TOPIC='twitter_01', VALUE_FORMAT='Avro');
ksql> SELECT USER->SCREENNAME, TEXT FROM TWEETS WHERE TEXT LIKE '%cool%' EMIT CHANGES;
+-------------------+------------------------------------------------+
|USER__SCREENNAME   |TEXT                                            |
+-------------------+------------------------------------------------+
|MobileGist         |This is super cool!! Great work @houchens_kim!  |
You can also write the results of this query to a new Kafka topic if you want:
ksql> CREATE STREAM COOL_TWEETS AS SELECT USER->SCREENNAME, TEXT FROM TWEETS WHERE TEXT LIKE '%cool%' EMIT CHANGES;
Since you tagged Python, it's worth pointing out that you can call ksqlDB from Python through its REST API.
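As a rough sketch of what calling the REST API from Python could look like, the snippet below posts a query to ksqlDB's /query endpoint using only the standard library. The server address (http://localhost:8088), the helper names (build_query_request, parse_chunk, run_query) and the line-by-line framing of the streamed response are assumptions for illustration, so check them against your own ksqlDB version before relying on this.

```python
import json
import urllib.request

# Assumption: a ksqlDB server listening on its default port on this machine.
KSQLDB_URL = "http://localhost:8088"


def build_query_request(sql, offset_reset="earliest", base_url=KSQLDB_URL):
    """Build (but do not send) a POST request for the /query endpoint."""
    payload = {
        "ksql": sql,
        # Read the topic from the beginning rather than only new records.
        "streamsProperties": {"ksql.streams.auto.offset.reset": offset_reset},
    }
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


def parse_chunk(raw_line):
    """Turn one line of the streamed response into a dict, or None.

    Assumption: the server streams a JSON array with one object per line,
    so the array brackets and trailing commas are stripped before parsing.
    """
    text = raw_line.decode("utf-8") if isinstance(raw_line, bytes) else raw_line
    text = text.strip().rstrip(",")
    if text.startswith("["):
        text = text[1:]
    if text.endswith("]"):
        text = text[:-1]
    text = text.strip()
    return json.loads(text) if text else None


def run_query(sql):
    """Yield the column values of each row as the server streams them."""
    with urllib.request.urlopen(build_query_request(sql)) as response:
        for raw_line in response:
            chunk = parse_chunk(raw_line)
            if chunk and "row" in chunk:
                yield chunk["row"]["columns"]


# Example use (needs a running server, and never ends for a push query):
# for screen_name, text in run_query(
#     "SELECT USER->SCREENNAME, TEXT FROM TWEETS "
#     "WHERE TEXT LIKE '%cool%' EMIT CHANGES;"
# ):
#     print(screen_name, text)
```

Keeping the request building and the line parsing in separate functions means both can be checked without a server; only run_query touches the network.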
You didn't mention what type of data you are receiving. Tweets, yes, but as CSV? JSON? Avro? Protobuf?
The short answer is "yes". Just as you can open a text file and read data out of it, you can read data out of a Kafka record. The records just happen to be streaming in constantly.
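To make that concrete, here is a minimal sketch that assumes the tweets arrive as JSON (the question does not say which format is in use). The topic name tweets_json, the broker address, the group id and the field names user.screen_name and text are all assumptions for illustration. The consumer loop uses the third-party confluent-kafka package, while extract_fields is plain Python and works on any bytes you hand it.

```python
import json


def extract_fields(raw_value):
    """Return (screen_name, text) from one JSON-encoded tweet, or None.

    Assumption: the record value is a JSON object shaped like
    {"user": {"screen_name": ...}, "text": ...}.
    """
    try:
        tweet = json.loads(raw_value)
    except (TypeError, ValueError):
        return None  # empty value or not valid JSON
    if not isinstance(tweet, dict):
        return None
    user = tweet.get("user")
    if not isinstance(user, dict):
        user = {}
    return user.get("screen_name"), tweet.get("text")


def consume(topic="tweets_json", servers="localhost:9092"):
    """Print the selected fields of every tweet whose text contains 'cool'."""
    # Third-party dependency: pip install confluent-kafka
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": servers,
        "group.id": "tweet-extractor",    # hypothetical consumer group
        "auto.offset.reset": "earliest",  # start from the oldest record
    })
    consumer.subscribe([topic])
    try:
        while True:
            msg = consumer.poll(1.0)  # wait up to one second for a record
            if msg is None or msg.error():
                continue
            fields = extract_fields(msg.value())
            if fields and fields[1] and "cool" in fields[1]:
                print(fields)
    finally:
        consumer.close()
```

If the topic holds Avro or Protobuf instead, only extract_fields needs to change; the polling loop stays the same.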