
Debezium MySQL Connector: are table snapshots taken in a single thread?

I am going through the Debezium MySQL Connector source code and trying to understand the table snapshot logic.

1) Looking at the execute() method of the class 'io.debezium.connector.mysql.SnapshotReader', it seems all the table snapshots are taken in a single thread. Is this true? For a database with a large number of tables, does it really not process the tables in parallel?

https://github.com/debezium/debezium/blob/master/debezium-connector-mysql/src/main/java/io/debezium/connector/mysql/SnapshotReader.java

2) Also, it seems the snapshot is taken with a "SELECT * FROM {table}" query. If the snapshot operation fails (due to a DB connection failure, a connector restart, etc.), does it resume from the previous position using the Kafka Connect offset mechanism?

  1. Yes. A single thread is used for the snapshot, even for a large database.

  2. No. The snapshot is not resumed from a stored offset.

If the connector fails, is rebalanced, or stops before the snapshot is complete, the connector will begin a new snapshot when it is restarted.

Refer : https://debezium.io/docs/connectors/mysql/#snapshots

The reason for both answers is the snapshot mechanism. The snapshot is taken in a single transaction so that every table is read from the same consistent view. First, a transaction is limited to a single DB connection. Using multiple threads with that single connection would not help: each thread would just wait for the connection to be released by another thread.
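To illustrate the point about threads queuing on one connection, here is a minimal sketch. It does not use Debezium or JDBC at all: `SingleConnection` is a hypothetical stand-in for the one connection that holds the snapshot transaction, the lock models the fact that only one statement can run on it at a time, and the `Thread.sleep` call stands in for the table scan. All names are assumptions made up for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class SharedConnectionDemo {

    // Hypothetical stand-in for the single DB connection that owns the snapshot transaction.
    static final class SingleConnection {
        private final ReentrantLock lock = new ReentrantLock();
        private final AtomicInteger active = new AtomicInteger();
        private final AtomicInteger maxActive = new AtomicInteger();

        void scanTable(String table) throws InterruptedException {
            lock.lock(); // only one thread can use the connection at a time
            try {
                int now = active.incrementAndGet();
                maxActive.accumulateAndGet(now, Math::max);
                Thread.sleep(5); // pretend to run the table scan for `table`
            } finally {
                active.decrementAndGet();
                lock.unlock();
            }
        }

        int maxConcurrentScans() {
            return maxActive.get();
        }
    }

    // Submits one scan per table to a thread pool and reports how many scans ever overlapped.
    static int maxConcurrentScans(int threads, int tables) {
        SingleConnection connection = new SingleConnection();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tables; i++) {
            final String table = "table_" + i;
            pool.submit(() -> {
                try {
                    connection.scanTable(table);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return connection.maxConcurrentScans();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent scans: " + maxConcurrentScans(4, 8));
    }
}
```

However many worker threads are configured, the scans never overlap, so the extra threads only add waiting.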

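Second, resuming a snapshot from Kafka Connect offsets has many issues. Which offset would it resume from? By the time the connector restarts, the table may already have been modified, so the rows read before the failure and the rows read after it would no longer come from the same consistent view.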
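The restart behaviour described above can be pictured as a simple decision at startup. This is only a sketch of the idea, not Debezium's actual code: the `snapshot_completed` offset key, the `Action` enum, and the `onStartup` method are all hypothetical names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotRestartSketch {

    // Hypothetical offset key; not taken from Debezium's source.
    static final String SNAPSHOT_COMPLETED = "snapshot_completed";

    enum Action { START_NEW_SNAPSHOT, RESUME_FROM_BINLOG }

    // Decide what to do at startup based on the last stored offset (may be null).
    static Action onStartup(Map<String, Object> storedOffset) {
        if (storedOffset == null) {
            return Action.START_NEW_SNAPSHOT; // nothing recorded yet
        }
        // A partially finished snapshot is never resumed; it is thrown away and redone.
        boolean completed = Boolean.TRUE.equals(storedOffset.get(SNAPSHOT_COMPLETED));
        return completed ? Action.RESUME_FROM_BINLOG : Action.START_NEW_SNAPSHOT;
    }

    public static void main(String[] args) {
        Map<String, Object> interrupted = new HashMap<>();
        interrupted.put(SNAPSHOT_COMPLETED, false);
        System.out.println(onStartup(interrupted));
    }
}
```

The sketch has no "resume at table N, row M" branch because there is no position inside the snapshot that would still be consistent after a restart.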
