DebeziumIO read with SQL Server not streaming with Apache Beam in GCP

Question

I configured a standalone Debezium instance and verified that streaming works. After that I created a pipeline as follows:

pipeline.apply("Read from DebeziumIO",
    DebeziumIO.<String>read()
        .withConnectorConfiguration(
            DebeziumIO.ConnectorConfiguration.create()
                .withUsername("user")
                .withPassword("password")
                .withHostName("hostname")
                .withPort("1433")
                .withConnectorClass(SqlServerConnector.class)
                .withConnectionProperty("database.server.name", "customer")
                .withConnectionProperty("database.dbname", "test001")
                .withConnectionProperty("database.include.list", "test002")
                .withConnectionProperty("include.schema.changes", "true")
                .withConnectionProperty("database.history.kafka.bootstrap.servers", "kafka:9092")
                .withConnectionProperty("database.history.kafka.topic", "schema-changes.inventory")
                .withConnectionProperty("connect.keep.alive", "false")
                .withConnectionProperty("connect.keep.alive.interval.ms", "200"))
        .withFormatFunction(new SourceRecordJson.SourceRecordJsonMapper())
        .withCoder(StringUtf8Coder.of()));

When I start the pipeline using DirectRunner, the data stream is not captured by the pipeline. For the time being, the pipeline code only dumps the data to the console.

Also, from the log I can see that Debezium is being started and stopped frequently. Is that by design? And when a change is made in the DB (INSERT/DELETE/UPDATE), I don't see it reflected in the logs.

So my questions are:

Is the configuration I provided sufficient?
Why is the pipeline not triggered when there is a change?
What additional steps do I need to perform to get it working?

Answer 1

Restarting Debezium multiple times can cause a performance impact, since each restart creates a new JDBC connection.
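
On the first question (whether the configuration is sufficient), one rough way to sanity-check it is to collect the keys passed through withConnectionProperty in a plain java.util.Properties and flag any that the SQL Server connector may not read. The sketch below is only an illustration under stated assumptions: ConfigCheck, questionProperties and unexpectedKeys are hypothetical names, not part of Beam or Debezium, and the set of "expected" keys is an assumption based on the Debezium 1.x SQL Server connector documentation, so it should be checked against the Debezium version actually in use.

```java
import java.util.List;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of Beam or Debezium.
class ConfigCheck {

    // ASSUMPTION: a subset of keys the Debezium 1.x SQL Server connector reads.
    // Verify this list against the documentation for your Debezium version.
    static final Set<String> ASSUMED_SQLSERVER_KEYS = Set.of(
        "database.server.name",
        "database.dbname",
        "table.include.list",
        "include.schema.changes",
        "database.history.kafka.bootstrap.servers",
        "database.history.kafka.topic");

    // The key/value pairs passed to withConnectionProperty(...) in the question.
    static Properties questionProperties() {
        Properties p = new Properties();
        p.setProperty("database.server.name", "customer");
        p.setProperty("database.dbname", "test001");
        p.setProperty("database.include.list", "test002");
        p.setProperty("include.schema.changes", "true");
        p.setProperty("database.history.kafka.bootstrap.servers", "kafka:9092");
        p.setProperty("database.history.kafka.topic", "schema-changes.inventory");
        p.setProperty("connect.keep.alive", "false");
        p.setProperty("connect.keep.alive.interval.ms", "200");
        return p;
    }

    // Returns, in sorted order, the keys that are not in the assumed set.
    static List<String> unexpectedKeys(Properties props) {
        TreeSet<String> unexpected = new TreeSet<>(props.stringPropertyNames());
        unexpected.removeAll(ASSUMED_SQLSERVER_KEYS);
        return List.copyOf(unexpected);
    }

    public static void main(String[] args) {
        for (String key : unexpectedKeys(questionProperties())) {
            System.out.println("Check this key against the connector docs: " + key);
        }
    }
}
```

Under the assumed key set, this flags database.include.list, connect.keep.alive and connect.keep.alive.interval.ms. As far as I can tell these are documented for the Debezium MySQL connector rather than the SQL Server one, so they may simply be ignored here. That would not by itself explain the missing change events, but it narrows down which settings are worth reviewing.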

solution1
0 2022-04-05 13:28:11
