
Kafka producer skipping messages

I am trying to write data from a file to a Kafka topic. My code looks like this:

    Properties properties = new Properties();
    properties.put("bootstrap.servers", <bootstrapServers>);
    properties.put("key.serializer", StringSerializer.class.getCanonicalName());
    properties.put("value.serializer", StringSerializer.class.getCanonicalName());
    properties.put("retries", 100);
    properties.put("linger.ms", 5);
    properties.put("acks", "all");

    KafkaProducer<Object, String> producer = new KafkaProducer<>(properties);

    try (BufferedReader bf = new BufferedReader(new InputStreamReader(new FileInputStream(filePath), "UTF-8"))) {
        String line;
        int count = 0;
        while ((line = bf.readLine()) != null) {
            count++;
            // send() only queues the record; the returned Future is ignored here
            producer.send(new ProducerRecord<>(topicName, line));
        }
        producer.flush();
        Logger.log("Done producing data messages. Total no of records produced:" + count);
    } catch (IOException e) {
        // Only IOException can be thrown here, since send() is called without get()
        Throwables.propagate(e);
    } finally {
        producer.close();
    }

The size of the data is above 1 million records.

When I check the offsets on the brokers with the following command, only about half of the messages (around 500,000) have been written to the topic:

./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list <broker_list> --time -1 --topic <topic_name>

Output of the above command:

topic_name:1:292954
topic_name:0:296787

What changes should I make to my approach to make sure that all the messages are written to the topic?

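send() is asynchronous: it returns as soon as the record has been added to the producer's buffer, not when the broker has acknowledged it. You may be checking the offsets before all the messages have been processed.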
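
One way to find out whether records are still pending or have actually failed is to count the completion callbacks instead of the calls to send(). The sketch below is only an illustration: DeliveryTracker and its methods are hypothetical names, not part of the Kafka client. Its onCompletion method takes the same kind of arguments as Kafka's Callback.onCompletion(RecordMetadata, Exception), where the exception is null on success, and main() simulates the callbacks so the class runs without a broker.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper (not a Kafka class): counts how many records were
// handed to send() and how many completion callbacks reported success or failure.
class DeliveryTracker {
    private final AtomicLong sent = new AtomicLong();
    private final AtomicLong acknowledged = new AtomicLong();
    private final AtomicLong failed = new AtomicLong();

    // Call once for every record passed to producer.send(...).
    void recordSent() {
        sent.incrementAndGet();
    }

    // Mirrors the argument order of Kafka's Callback.onCompletion:
    // a null exception means the record was acknowledged.
    void onCompletion(Object metadata, Exception exception) {
        if (exception == null) {
            acknowledged.incrementAndGet();
        } else {
            failed.incrementAndGet();
        }
    }

    long sent() { return sent.get(); }
    long acknowledged() { return acknowledged.get(); }
    long failed() { return failed.get(); }

    // True only when every sent record has been acknowledged and none failed.
    boolean allDelivered() {
        return failed.get() == 0 && acknowledged.get() == sent.get();
    }

    public static void main(String[] args) {
        DeliveryTracker tracker = new DeliveryTracker();
        // Simulated callbacks: three sends, two acknowledgements, one failure.
        for (int i = 0; i < 3; i++) {
            tracker.recordSent();
        }
        tracker.onCompletion(new Object(), null);
        tracker.onCompletion(new Object(), null);
        tracker.onCompletion(null, new RuntimeException("simulated send failure"));
        System.out.println("sent=" + tracker.sent()
                + " acknowledged=" + tracker.acknowledged()
                + " failed=" + tracker.failed()
                + " allDelivered=" + tracker.allDelivered());
    }
}
```

In the loop from the question this would be wired in as tracker.recordSent(); followed by producer.send(new ProducerRecord<>(topicName, line), tracker::onCompletion);. Once producer.flush() has returned, tracker.allDelivered() should tell you whether every record was acknowledged, and a non-zero failed() count means some sends ended with an exception that the original code never looks at.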
