Analyze messages from Kafka consumer
I set up a Kafka consumer-producer system, and I need to process the transmitted messages. These are lines from a JSON file, like:
ConsumerRecord(topic=u'json_data103052', partition=0, offset=676, timestamp=1542710197257, timestamp_type=0, key=None, value='{"Name": "Simone", "Surname": "Zimbolli", "gender": "Other", "email": "zzz@uiuc.edu", "country": "Nigeria", "date": "11/07/2018"}', checksum=354265828, serialized_key_size=-1, serialized_value_size=189)
I am looking for an easy-to-implement solution to
Does anybody have suggestions on how to proceed? Thanks.
I am having issues using Spark, so I would prefer to avoid it. I am scripting in Python using Jupyter.
Here is my code:
from kafka import KafkaConsumer
from random import randint
from time import sleep

bootstrap_servers = ['localhost:9092']

# Get the topic name from the kafka producer
%store -r topicName
print(topicName)

consumer = KafkaConsumer(bootstrap_servers=bootstrap_servers,
                         auto_offset_reset='earliest')
consumer.subscribe([topicName])

# Print every record as it arrives
for message in consumer:
    print(message)
I think the Kafka Streams API is what you need. It has all the features you need for windowing. You can find more information about Kafka Streams here:
https://kafka.apache.org/documentation/streams/
For your scenario, Kafka Streams seems suitable. It supports the following four window types:
Tumbling time window - time-based, fixed-size, non-overlapping, gap-less windows
Hopping time window - time-based, fixed-size, overlapping windows
Sliding time window - time-based, fixed-size, overlapping windows that work on differences between record timestamps
Session window
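To make the first two window types concrete, here is a minimal plain-Python sketch of how a record timestamp (in milliseconds, like the `timestamp` field of a ConsumerRecord) could be assigned to tumbling and hopping windows. This is not the Kafka Streams API; the function names `tumbling_window_start` and `hopping_window_starts` are made up for illustration, and windows are assumed to be aligned to timestamp 0 and to cover the half-open range [start, start + size).

```python
def tumbling_window_start(timestamp_ms, window_ms):
    """Return the start of the single tumbling window containing timestamp_ms."""
    # Tumbling windows do not overlap, so each timestamp maps to exactly one start.
    return timestamp_ms - (timestamp_ms % window_ms)


def hopping_window_starts(timestamp_ms, window_ms, advance_ms):
    """Return the starts of every hopping window [start, start + window_ms)
    that contains timestamp_ms, in ascending order."""
    starts = []
    # Latest window start (a multiple of advance_ms) that is <= timestamp_ms.
    start = timestamp_ms - (timestamp_ms % advance_ms)
    # Walk backwards while the window still covers the timestamp.
    while start >= 0 and start > timestamp_ms - window_ms:
        starts.append(start)
        start -= advance_ms
    return sorted(starts)


def count_per_tumbling_window(timestamps_ms, window_ms):
    """Count how many timestamps fall into each tumbling window."""
    counts = {}
    for ts in timestamps_ms:
        key = tumbling_window_start(ts, window_ms)
        counts[key] = counts.get(key, 0) + 1
    return counts


# One-minute tumbling window for the timestamp of the sample record.
print(tumbling_window_start(1542710197257, 60000))
# One-minute windows that advance every 30 seconds overlap, so a timestamp
# can belong to more than one of them.
print(hopping_window_starts(1542710197257, 60000, 30000))
```

In a consumer loop, `message.timestamp` could be passed to these helpers to group records without Spark, at the cost of keeping the window state in memory yourself.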
For Python, there is a library that may be useful to you: https://github.com/wintoncode/winton-kafka-streams
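Whichever library you end up using, the record value in the question is a JSON string, so it probably needs to be decoded before any processing. A minimal sketch, assuming the values are UTF-8 encoded JSON objects like the sample shown in the question; `parse_value` is a hypothetical helper name, not part of kafka-python.

```python
import json


def parse_value(raw):
    """Decode a record value (bytes or text) holding a JSON object into a dict."""
    if isinstance(raw, bytes):
        raw = raw.decode('utf-8')
    return json.loads(raw)


# Sample value copied from the ConsumerRecord in the question.
sample = ('{"Name": "Simone", "Surname": "Zimbolli", "gender": "Other", '
          '"email": "zzz@uiuc.edu", "country": "Nigeria", "date": "11/07/2018"}')

record = parse_value(sample)
print(record["country"])  # prints: Nigeria
```

Inside the consumer loop this would be called as `parse_value(message.value)`. As far as I know, kafka-python's KafkaConsumer also accepts a `value_deserializer` argument, so passing `value_deserializer=parse_value` should make `message.value` arrive as a dict already.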