
Kafka Cassandra connector doesn't actually write to database

I am using the open-source Kafka Cassandra connector from here: https://github.com/tuplejump/kafka-connect-cassandra

I followed the tutorial and the setup instructions. However, the connector doesn't insert any data into my database. Here is the content of my sink.properties file:

name=cassandra-sink-connector
connector.class=com.tuplejump.kafka.connect.cassandra.CassandraSink
tasks.max=1
topics=hello-mqtt-kafka
cassandra.sink.route.hello-mqtt-kafka=devices_data.messages

I run Kafka, Cassandra and Zookeeper, and they are all working. I send some messages to the topic "hello-kafka". For test purposes I have a console consumer running, and it sees all the messages:

    {"schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"},{"type":"string","optional":false,"field":"text"}],"optional":false,"name":"devices.schema"},"payload":{"id":75679795,"text":"example5"}}
    {"schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"},{"type":"string","optional":false,"field":"text"}],"optional":false,"name":"devices.schema"},"payload":{"id":86874233,"text":"example6"}}

Here is the schema for my Cassandra table:

CREATE TABLE IF NOT EXISTS devices_data.messages (
        id int,
        created text,
        message text,
        PRIMARY KEY (id, created))
        WITH ID = 2de24390-03d5-11e7-a32a-ed242ef1cc00
        AND CLUSTERING ORDER BY (created ASC)
        AND bloom_filter_fp_chance = 0.01
        AND dclocal_read_repair_chance = 0.1
        AND crc_check_chance = 1.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND min_index_interval = 128
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND read_repair_chance = 0.0
        AND speculative_retry = '99PERCENTILE'
        AND comment = ''
        AND caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
        AND compaction = { 'max_threshold': '32', 'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
        AND compression = { 'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor' }

Now, I have the connector running and it doesn't throw any errors, but when I run a select query from cqlsh I see that my data was not inserted into Cassandra. I followed the setup instructions, and the worker logs don't show any problems either. For debugging purposes I passed some wrongly formatted data to Kafka, and the connector reported an error in the message format. So it definitely sees the messages, but for some reason it doesn't insert them into the database.

I have been stuck on this bug for hours and have no idea what could be wrong... I would really appreciate any help or ideas about what I might be missing.

Here are the logs from the connector. The last lines about 'Committing offsets' repeat all the time, but nothing appears in the database.

[2017-03-13 15:46:40,540] INFO Kafka version : 0.10.0.0 (org.apache.kafka.common.utils.AppInfoParser:83)
[2017-03-13 15:46:40,540] INFO Kafka commitId : b8642491e78c5a13 (org.apache.kafka.common.utils.AppInfoParser:84)
[2017-03-13 15:46:40,547] INFO Created connector cassandra-sink-connector (org.apache.kafka.connect.cli.ConnectStandalone:91)
[2017-03-13 15:46:40,554] INFO Configured 1 Kafka - Cassandra mappings. (com.tuplejump.kafka.connect.cassandra.CassandraSinkTask:86)
[2017-03-13 15:46:40,955] INFO Did not find Netty's native epoll transport in the classpath, defaulting to NIO. (com.datastax.driver.core.NettyUtil:83)
[2017-03-13 15:46:42,039] INFO Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor) (com.datastax.driver.core.policies.DCAwareRoundRobinPolicy:95)
[2017-03-13 15:46:42,040] INFO New Cassandra host localhost/127.0.0.1:9042 added (com.datastax.driver.core.Cluster:1475)
[2017-03-13 15:46:42,041] INFO Connected to Cassandra cluster: Test Cluster (com.tuplejump.kafka.connect.cassandra.CassandraCluster:81)
[2017-03-13 15:46:42,114] INFO com.datastax.driver.core.SessionManager@63cd5271 created. (com.tuplejump.kafka.connect.cassandra.CassandraCluster:84)
[2017-03-13 15:46:42,146] INFO CassandraSinkTask starting with 1 routes. (com.tuplejump.kafka.connect.cassandra.CassandraSinkTask:189)
[2017-03-13 15:46:42,149] INFO Sink task WorkerSinkTask{id=cassandra-sink-connector-0} finished initialization and start (org.apache.kafka.connect.runtime.WorkerSinkTask:208)
[2017-03-13 15:46:42,294] INFO Discovered coordinator ismop-virtual-machine:9092 (id: 2147483647 rack: null) for group connect-cassandra-sink-connector. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:505)
[2017-03-13 15:46:42,302] INFO Revoking previously assigned partitions [] for group connect-cassandra-sink-connector (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:280)
[2017-03-13 15:46:42,309] INFO (Re-)joining group connect-cassandra-sink-connector (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:326)
[2017-03-13 15:46:42,319] INFO Successfully joined group connect-cassandra-sink-connector with generation 1 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:434)
[2017-03-13 15:46:42,320] INFO Setting newly assigned partitions [hello-mqtt-kafka-0] for group connect-cassandra-sink-connector (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:219)
[2017-03-13 15:46:43,337] INFO Reflections took 4909 ms to scan 64 urls, producing 3915 keys and 28184 values  (org.reflections.Reflections:229)
[2017-03-13 15:46:50,462] INFO WorkerSinkTask{id=cassandra-sink-connector-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSinkTask:244)
[2017-03-13 15:47:00,460] INFO WorkerSinkTask{id=cassandra-sink-connector-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSinkTask:244)
[2017-03-13 15:47:10,455] INFO WorkerSinkTask{id=cassandra-sink-connector-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSinkTask:244)
[2017-03-13 15:47:20,455] INFO WorkerSinkTask{id=cassandra-sink-connector-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSinkTask:244)

Apache Kafka includes a consumer group admin tool: bin/kafka-consumer-groups.sh

The consumer group your connector uses to read events from Kafka is called "connect-cassandra-sink-connector" (you can see it in the log snippet you posted).

I suggest using this tool to check:

  1. Is the consumer all caught up (i.e. the last offset it committed is the end of the log)? If so, try writing new events to the topic and see if those get written. Maybe it just started late and missed the early events?
  2. Does it appear to make progress at all? If yes, it looks like it thinks it is successfully writing to Cassandra. If not, it is failing to read from Kafka. Check for errors in the broker log, and maybe increase the log level for Connect to see if anything interesting shows up.
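
To make the two checks concrete, here is a minimal Python sketch of how the numbers from the tool could be interpreted. It assumes you run the tool in describe mode for the group "connect-cassandra-sink-connector" twice, a little while apart, and copy the CURRENT-OFFSET and LOG-END-OFFSET columns for the partition by hand. The `OffsetSample` and `diagnose` names are made up for this illustration and are not part of Kafka.

```python
from typing import NamedTuple


class OffsetSample(NamedTuple):
    """One reading for a single partition, copied from the tool's output."""
    current_offset: int   # CURRENT-OFFSET column: last offset the group committed
    log_end_offset: int   # LOG-END-OFFSET column: end of the partition's log


def diagnose(first: OffsetSample, second: OffsetSample) -> str:
    """Compare two readings taken some time apart (hypothetical helper)."""
    lag = second.log_end_offset - second.current_offset
    if lag == 0:
        # Check 1: the consumer has read everything in the topic.
        return "caught up: write new events and see whether they reach Cassandra"
    if second.current_offset > first.current_offset:
        # Check 2, "yes" branch: offsets advance, so the sink believes it writes.
        return "making progress: the connector thinks it is writing to Cassandra"
    # Check 2, "no" branch: the committed offset did not move between readings.
    return "no progress: the connector is failing to read from Kafka"


if __name__ == "__main__":
    # Example readings (invented numbers, not taken from the question).
    print(diagnose(OffsetSample(3, 10), OffsetSample(7, 12)))
```

If the result is "caught up", the early messages were probably consumed (or skipped) before you looked, so only newly produced messages tell you anything.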
