I did configure the standalone Debezium and tested the streaming. After that I created a pipeline as follows
pipeline.apply("Read from DebeziumIO",
DebeziumIO.<String>read()
.withConnectorConfiguration(
DebeziumIO.ConnectorConfiguration.create()
.withUsername("user")
.withPassword("password")
.withHostName("hostname")
.withPort("1433")
.withConnectorClass(SqlServerConnector.class)
.withConnectionProperty("database.server.name", "customer")
.withConnectionProperty("database.dbname", "test001")
.withConnectionProperty("database.include.list", "test002")
.withConnectionProperty("include.schema.changes", "true")
.withConnectionProperty("database.history.kafka.bootstrap.servers", "kafka:9092")
.withConnectionProperty("database.history.kafka.topic", "schema-changes.inventory")
.withConnectionProperty("connect.keep.alive", "false")
.withConnectionProperty("connect.keep.alive.interval.ms", "200")
).withFormatFunction(new SourceRecordJson.SourceRecordJsonMapper()).withCoder(StringUtf8Coder.of())
)
When I start the pipeline using DirectRunner, datastream is not captured by the pipeline. In my pipeline code I just added code to dump the data into console for the time being.
Also from the log I observe that the Debezium is being started and stopped frequently. Is that by design? Also when there is a change made into the DB (INSERT/DELETE/UPDATE), I dont find it being reflected in the logs.
So my question is,
Restarting debezium multiple times can it cause performance impacts. Since it creates a jdbc connection.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.