
How to check whether a key already exists in a Kafka topic?

I want a function, say checkKey(), that works as follows:

def checkKey(key):
    # Pseudocode: topic_name stands for the set of keys currently in the topic
    return key in topic_name

I could not find this in Kafka's documentation. I am aware that Kafka can deduplicate data by key, keeping only the latest value written for each key. However, I do not want a blind overwrite; I want to know whether the key already exists. If it does, I want to update its value as follows:

def updateValue(key):
    if checkKey(key):
        # Pseudocode: increment the value stored under this key in the topic
        topic_name[key] += 1

I need to do this in Python, so a code sample would be very helpful.

You can use Kafka Streams for that. Define a KTable for your topic with StreamsBuilder.table(), giving the state store a name via Materialized.as("store-name"). You can then query it using Interactive Queries; see that page for more examples, but it's as simple as streams.store("store-name", QueryableStoreTypes.keyValueStore()).get(key). (From Kafka Streams 2.5 onward the store is obtained with streams.store(StoreQueryParameters.fromNameAndType("store-name", QueryableStoreTypes.keyValueStore())) instead.)
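Kafka Streams itself is a Java library, so the following is only a conceptual model in Python of what a KTable materializes, not the Kafka Streams API. It assumes a changelog of (key, value) pairs where a value of None acts as a delete marker (tombstone); the names Record and materialize are made up for this sketch.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical record shape for this sketch: (key, value).
# A value of None models a tombstone, i.e. "delete this key".
Record = Tuple[str, Optional[int]]


def materialize(changelog: Iterable[Record]) -> Dict[str, int]:
    """Fold a changelog into a table holding the latest value per key."""
    table: Dict[str, int] = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)  # tombstone: remove the key if present
        else:
            table[key] = value    # later records overwrite earlier ones
    return table


changelog = [("a", 1), ("b", 5), ("a", 2), ("b", None)]
table = materialize(changelog)
print("a" in table, table.get("a"))  # True 2
print("b" in table)                  # False
```

Checking whether a key exists is then a plain lookup in the materialized table, which is what the state store query does on the Java side.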

Kafka is not a table, it's a queue (more precisely, an append-only log), so there is no lookup by key. To see whether a key exists in a topic, you need to read the entire topic or, if at all possible, keep a local copy of the topic that you update as new records arrive. You might be able to restrict your search to a single partition if you know your partitioning logic.
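A minimal Python sketch of the full-scan approach, assuming the kafka-python client and a broker at localhost:9092. The helper names find_latest, read_topic, and check_key are made up for illustration, and the idle timeout used to decide that the end of the topic has been reached is a heuristic rather than a guarantee.

```python
from typing import Iterable, Iterator, Optional, Tuple

KeyValue = Tuple[Optional[bytes], Optional[bytes]]


def find_latest(records: Iterable[KeyValue], key: bytes) -> Tuple[bool, Optional[bytes]]:
    """Scan (key, value) pairs and return (exists, latest_value) for key."""
    found, latest = False, None
    for k, v in records:
        if k != key:
            continue
        if v is None:
            found, latest = False, None  # treat a null value as a delete
        else:
            found, latest = True, v
    return found, latest


def read_topic(topic: str, bootstrap_servers: str = "localhost:9092",
               partition: Optional[int] = None, idle_ms: int = 5000) -> Iterator[KeyValue]:
    """Yield every (key, value) in the topic, or in one partition if given."""
    # Imported lazily so the pure helper above works without kafka-python.
    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(
        bootstrap_servers=bootstrap_servers,
        enable_auto_commit=False,
        consumer_timeout_ms=idle_ms,  # stop iterating after this much silence
    )
    try:
        if partition is None:
            partitions = consumer.partitions_for_topic(topic) or set()
        else:
            partitions = {partition}
        if not partitions:
            return
        consumer.assign([TopicPartition(topic, p) for p in sorted(partitions)])
        consumer.seek_to_beginning()
        for message in consumer:
            yield message.key, message.value
    finally:
        consumer.close()


def check_key(key: bytes, topic: str, **kwargs) -> bool:
    """Return True if the latest record for key in the topic is not a delete."""
    found, _ = find_latest(read_topic(topic, **kwargs), key)
    return found


# In-memory demonstration of the scan logic (no broker needed):
sample = [(b"a", b"1"), (b"b", b"2"), (b"a", b"3")]
print(find_latest(sample, b"a"))
```

With a running broker, check_key(b"some-key", "topic-name") would scan the whole topic; passing partition=N limits the scan to one partition. Every call rereads the topic, so for repeated lookups it is usually cheaper to keep the scanned records in a local dictionary and update it as new messages arrive.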

That being said, Confluent has a streaming SQL engine called KSQL (now ksqlDB) that might help you; see its documentation.

