
How to create a Kafka compacted topic

I have a Kafka application with a producer that publishes messages to a topic. A consumer then reads the messages from that topic, applies some logic to them, and produces the results to another topic. I'm using ProducerRecord and ConsumerRecords.

I want my app to create two compacted topics and then use them. If the compacted topics already exist, it should just display a message and continue.

My SimpleProducer class:

package com.kafkatest.demo;

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {

    public static void main(String[] args) throws Exception {

        String topicName = "nodesTopic";
        String key = "Key1";
        String value = "Value-1";

        String key1 = "Key2";
        String value1 = "Value-2";

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092,localhost:9093");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Producer<String, String> producer = new KafkaProducer<>(props);

        // Six records, but only two distinct keys; on a compacted topic only the
        // latest record per key is eventually retained.
        producer.send(new ProducerRecord<>(topicName, key, value));
        producer.send(new ProducerRecord<>(topicName, key1, value1));
        producer.send(new ProducerRecord<>(topicName, key, value));
        producer.send(new ProducerRecord<>(topicName, key, value));
        producer.send(new ProducerRecord<>(topicName, key, value));
        producer.send(new ProducerRecord<>(topicName, key, value));

        producer.close();
        System.out.println("SimpleProducer Completed.");
    }
}

My SimpleConsumer class:

package com.kafkatest.demo;

import java.time.Duration;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleConsumer {

    public static void main(String[] args) {

        // Producer used to forward the processed records to forecastTopic.
        Properties props1 = new Properties();
        props1.put("bootstrap.servers", "localhost:9092,localhost:9093");
        props1.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props1.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<>(props1);

        Duration duration = Duration.of(2, ChronoUnit.MINUTES);
        String topicName = "nodesTopic";

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "consumer-tutorial");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        consumer.subscribe(Arrays.asList(topicName));

        try {
            while (true) {
                try {
                    Thread.sleep(5000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                ConsumerRecords<String, String> records = consumer.poll(duration);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println(record.offset() + ": " + record.value());
                    System.out.println("Record: " + record.value().toLowerCase());
                    // Forward the lowercased value plus the batch size to forecastTopic.
                    // Note: every record is sent with the same key ("Key"), so a
                    // compacted forecastTopic would eventually keep only the last one.
                    producer.send(new ProducerRecord<>("forecastTopic", "Key",
                            record.offset() + ". " + record.value().toLowerCase()));
                    producer.send(new ProducerRecord<>("forecastTopic", "Key",
                            record.offset() + ". " + records.count()));
                }
            }
        } finally {
            producer.close();
            consumer.close();
        }
    }
}

When I run bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic forecastTopic --from-beginning and run my producer a couple of times, I get:

0. value-1
0. 6
1. value-2
1. 6
2. value-1
2. 6
3. value-1
3. 6
4. value-1
4. 6
5. value-1
5. 6
6. value-1
6. 6
7. value-2
7. 6
8. value-1
8. 6
9. value-1
9. 6
10. value-1
10. 6
11. value-1
11. 6
12. value-1
12. 6
13. value-2
13. 6
14. value-1
14. 6
15. value-1
15. 6
16. value-1
16. 6
17. value-1
17. 6
18. value-1
18. 6
19. value-2
19. 6
20. value-1
20. 6
21. value-1
21. 6
22. value-1
22. 6
23. value-1
23. 6
24. value-1
24. 6
25. value-2
25. 6
26. value-1
26. 6
27. value-1
27. 6
28. value-1
28. 6
29. value-1
29. 6
30. value-1
30. 6
31. value-2
31. 6
32. value-1
32. 6
33. value-1
33. 6
34. value-1
34. 6
35. value-1
35. 6
36. value-1
36. 6
37. value-2
37. 6
38. value-1
38. 6
39. value-1
39. 6
40. value-1
40. 6
41. value-1
41. 6
42. value-1
42. 6
43. value-2
43. 6
44. value-1
44. 6
45. value-1
45. 6
46. value-1
46. 6
47. value-1
47. 6
48. value-1
48. 12
49. value-2
49. 12
50. value-1
50. 12
51. value-1
51. 12
52. value-1
52. 12
53. value-1
53. 12
54. value-1
54. 12
55. value-2
55. 12
56. value-1
56. 12
57. value-1
57. 12
58. value-1
58. 12
59. value-1
59. 12
60. value-1
60. 6
61. value-2
61. 6
62. value-1
62. 6
63. value-1
63. 6
64. value-1
64. 6
65. value-1
65. 6
66. value-1
66. 6
67. value-2
67. 6
68. value-1
68. 6
69. value-1
69. 6
70. value-1
70. 6
71. value-1
71. 6
72. value-1
72. 6
73. value-2
73. 6
74. value-1
74. 6
75. value-1
75. 6
76. value-1
76. 6
77. value-1
77. 6
78. value-1
78. 6
79. value-2
79. 6
80. value-1
80. 6
81. value-1
81. 6
82. value-1
82. 6
83. value-1
83. 6

I put log.cleanup.policy=compact in the server.properties file, but it doesn't seem to work, because all 83 offsets are still in the topic.

Thank you.

In case somebody finds this helpful, this is the CLI command to use if you want to compact a specific topic and don't want to set all your topics to compacted at the server level.

bin/kafka-topics --alter --topic my_topic_name --zookeeper my_zookeeper:2181 --config cleanup.policy=compact

This assumes the command is run from the Confluent base directory. I believe you have to change it to call bin/kafka-topics.sh if you're using a plain Apache Kafka distribution.
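With a plain Apache Kafka distribution, the equivalent would presumably be the following (note that on recent Kafka versions the --zookeeper flag and altering configs through kafka-topics have been removed in favor of --bootstrap-server and kafka-configs.sh):

bin/kafka-topics.sh --alter --topic my_topic_name --zookeeper my_zookeeper:2181 --config cleanup.policy=compact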

When you set log.cleanup.policy=compact in server.properties, it becomes the default policy for newly created topics. If you change server.properties after creating your topic, your topic's configuration won't change.

You can alter your topic configuration to set cleanup.policy=compact.
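On brokers that support it, this can be done with kafka-configs.sh; the topic name below is the one from the question, so adjust it and the bootstrap server to your setup:

kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type topics --entity-name nodesTopic --add-config cleanup.policy=compact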

Since compaction is performed by the log cleaner, you may also want to set a specific delete.retention.ms on your topic; the default retention for delete markers is 24 hours.
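Again with kafka-configs.sh, for example (the value here is illustrative, not a recommendation):

kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type topics --entity-name nodesTopic --add-config delete.retention.ms=60000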

Last, compaction does not occur on the active segment, which is why you still see every offset in the topic. See Kafka Log Compaction not starting.
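If you just want to observe compaction in a test, one option is to make segments roll quickly and lower the cleaner's dirty ratio, so the cleaner has non-active segments to work on. The values below are illustrative test settings, not production ones:

kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type topics --entity-name forecastTopic --add-config segment.ms=10000,min.cleanable.dirty.ratio=0.01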

Use this command:

kafka-topics --create --topic topic_name --bootstrap-server localhost:9092 --config "cleanup.policy=compact"

Command to create a compacted Kafka topic:

kafka-topics.sh --bootstrap-server localhost:9092 --create --topic foo --config "cleanup.policy=compact"

Describe the topic (where you can see cleanup.policy=compact):

kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic foo 

Please note that you should not keep cleanup.policy in server.properties; the broker-level default is log.cleanup.policy, and it applies to every newly created topic.
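Finally, if you want the application itself to create the two compacted topics and just print a message and continue when they already exist, as the question asks, a minimal sketch using Kafka's AdminClient might look like this. The topic names are taken from the question; the partition count and replication factor are illustrative:

package com.kafkatest.demo;

import java.util.Arrays;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;
import org.apache.kafka.common.errors.TopicExistsException;

public class CompactedTopicCreator {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092,localhost:9093");

        try (AdminClient admin = AdminClient.create(props)) {
            // Topic names from the question; 1 partition and replication
            // factor 1 are illustrative values for a local setup.
            for (String name : Arrays.asList("nodesTopic", "forecastTopic")) {
                NewTopic topic = new NewTopic(name, 1, (short) 1)
                        .configs(Collections.singletonMap(
                                TopicConfig.CLEANUP_POLICY_CONFIG,
                                TopicConfig.CLEANUP_POLICY_COMPACT));
                try {
                    // Create topics one at a time so an already-existing
                    // topic doesn't fail the whole batch.
                    admin.createTopics(Collections.singletonList(topic)).all().get();
                    System.out.println("Created compacted topic " + name);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                } catch (ExecutionException e) {
                    if (e.getCause() instanceof TopicExistsException) {
                        System.out.println("Topic " + name + " already exists, continuing.");
                    } else {
                        throw new RuntimeException(e);
                    }
                }
            }
        }
    }
}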
