
Kafka Cassandra connector doesn't actually write to the database

I am using the open source Kafka Cassandra connector from here: https://github.com/tuplejump/kafka-connect-cassandra

I followed the tutorial and setup instructions. However, the connector doesn't insert any data into my database. Here is the content of my sink.properties file:

name=cassandra-sink-connector
connector.class=com.tuplejump.kafka.connect.cassandra.CassandraSink
tasks.max=1
topics=hello-mqtt-kafka
cassandra.sink.route.hello-mqtt-kafka=devices_data.messages

Kafka, Cassandra and ZooKeeper are all running and working. I send some messages to the topic "hello-kafka". For testing purposes I have a console consumer running, and it sees all the messages:

    {"schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"},{"type":"string","optional":false,"field":"text"}],"optional":false,"name":"devices.schema"},"payload":{"id":75679795,"text":"example5"}}
    {"schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"},{"type":"string","optional":false,"field":"text"}],"optional":false,"name":"devices.schema"},"payload":{"id":86874233,"text":"example6"}}

Here is the schema for my Cassandra table:

CREATE TABLE IF NOT EXISTS devices_data.messages (
        id int,
        created text,
        message text,
        PRIMARY KEY (id, created))
        WITH ID = 2de24390-03d5-11e7-a32a-ed242ef1cc00
        AND CLUSTERING ORDER BY (created ASC)
        AND bloom_filter_fp_chance = 0.01
        AND dclocal_read_repair_chance = 0.1
        AND crc_check_chance = 1.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND min_index_interval = 128
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND read_repair_chance = 0.0
        AND speculative_retry = '99PERCENTILE'
        AND comment = ''
        AND caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
        AND compaction = { 'max_threshold': '32', 'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
        AND compression = { 'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor' }

The connector is now running and doesn't throw any errors, but when I run a SELECT query from cqlsh I see that my data was not inserted into Cassandra. I followed the setup instructions, and the worker logs don't show any problems either. For debugging purposes I passed some wrongly formatted data to Kafka, and the connector reported an error in the message format. So it definitely sees the messages, but for some reason it doesn't insert them into the database.

I have been stuck on this bug for hours and have no idea what could be wrong. I would really appreciate any help or ideas about what I might be missing.

Here are the logs from the connector. The last line, 'Committing offsets', repeats continuously, but nothing appears in the database.

[2017-03-13 15:46:40,540] INFO Kafka version : 0.10.0.0 (org.apache.kafka.common.utils.AppInfoParser:83)
[2017-03-13 15:46:40,540] INFO Kafka commitId : b8642491e78c5a13 (org.apache.kafka.common.utils.AppInfoParser:84)
[2017-03-13 15:46:40,547] INFO Created connector cassandra-sink-connector (org.apache.kafka.connect.cli.ConnectStandalone:91)
[2017-03-13 15:46:40,554] INFO Configured 1 Kafka - Cassandra mappings. (com.tuplejump.kafka.connect.cassandra.CassandraSinkTask:86)
[2017-03-13 15:46:40,955] INFO Did not find Netty's native epoll transport in the classpath, defaulting to NIO. (com.datastax.driver.core.NettyUtil:83)
[2017-03-13 15:46:42,039] INFO Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor) (com.datastax.driver.core.policies.DCAwareRoundRobinPolicy:95)
[2017-03-13 15:46:42,040] INFO New Cassandra host localhost/127.0.0.1:9042 added (com.datastax.driver.core.Cluster:1475)
[2017-03-13 15:46:42,041] INFO Connected to Cassandra cluster: Test Cluster (com.tuplejump.kafka.connect.cassandra.CassandraCluster:81)
[2017-03-13 15:46:42,114] INFO com.datastax.driver.core.SessionManager@63cd5271 created. (com.tuplejump.kafka.connect.cassandra.CassandraCluster:84)
[2017-03-13 15:46:42,146] INFO CassandraSinkTask starting with 1 routes. (com.tuplejump.kafka.connect.cassandra.CassandraSinkTask:189)
[2017-03-13 15:46:42,149] INFO Sink task WorkerSinkTask{id=cassandra-sink-connector-0} finished initialization and start (org.apache.kafka.connect.runtime.WorkerSinkTask:208)
[2017-03-13 15:46:42,294] INFO Discovered coordinator ismop-virtual-machine:9092 (id: 2147483647 rack: null) for group connect-cassandra-sink-connector. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:505)
[2017-03-13 15:46:42,302] INFO Revoking previously assigned partitions [] for group connect-cassandra-sink-connector (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:280)
[2017-03-13 15:46:42,309] INFO (Re-)joining group connect-cassandra-sink-connector (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:326)
[2017-03-13 15:46:42,319] INFO Successfully joined group connect-cassandra-sink-connector with generation 1 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:434)
[2017-03-13 15:46:42,320] INFO Setting newly assigned partitions [hello-mqtt-kafka-0] for group connect-cassandra-sink-connector (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:219)
[2017-03-13 15:46:43,337] INFO Reflections took 4909 ms to scan 64 urls, producing 3915 keys and 28184 values  (org.reflections.Reflections:229)
[2017-03-13 15:46:50,462] INFO WorkerSinkTask{id=cassandra-sink-connector-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSinkTask:244)
[2017-03-13 15:47:00,460] INFO WorkerSinkTask{id=cassandra-sink-connector-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSinkTask:244)
[2017-03-13 15:47:10,455] INFO WorkerSinkTask{id=cassandra-sink-connector-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSinkTask:244)
[2017-03-13 15:47:20,455] INFO WorkerSinkTask{id=cassandra-sink-connector-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSinkTask:244)

Apache Kafka includes a consumer group admin tool: bin/kafka-consumer-groups.sh

The consumer group in your connector that should be reading events from Kafka is called "connect-cassandra-sink-connector" (you can see it in the log snippet you posted).

I suggest using this tool to check:

  1. Is the consumer all caught up (i.e., is the last offset it read at the end of the log)? If so, try writing new events to the topic and see whether those get written. Maybe it just started late and missed the early events?
  2. Does it appear to make progress at all? If yes, it apparently believes it is writing to Cassandra successfully. If not, it is failing to read from Kafka. Check the broker log for errors, and consider raising the log level for Connect to see whether anything interesting shows up.
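
Both checks above come down to describing the consumer group and comparing its current offset with the log-end offset. Here is a minimal sketch, assuming you run it from the Kafka installation directory. The broker address `localhost:9092` is a placeholder and the `describe_group` function name is made up for illustration; the log in the question reports the coordinator at `ismop-virtual-machine:9092`, so substitute whatever applies to your setup.

```shell
# Assumed placeholder: replace with your broker's host:port.
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"
# Group name taken from the connector log in the question.
GROUP="connect-cassandra-sink-connector"

# Print the current offset, log-end offset and lag per partition of the group.
# Kafka 0.10.x needs --new-consumer for groups that store offsets in Kafka;
# later releases deprecated and then removed the flag, so drop it there.
describe_group() {
  bin/kafka-consumer-groups.sh --new-consumer \
    --bootstrap-server "$BOOTSTRAP" \
    --describe --group "$GROUP"
}
```

Call `describe_group` once, produce a few more messages to the topic, then call it again. If the current offset moves forward and the lag returns to zero, the sink task is consuming the records and the problem most likely lies on the Cassandra side of the connector.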
