Kafka JDBC Sink Connector, insert values in batches

Question

I receive a lot of messages per second (50,000 to 100,000) over HTTP and want to save them to PostgreSQL. I decided to use the Kafka JDBC Sink connector for this purpose.

The messages are saved to the database one record at a time, not in batches. I want to insert records into PostgreSQL in batches of 500-1000 records.

I found some answers to this problem in the issue: How to use batch.size?

I tried the related options in the configuration, but they seem to have no effect.

My Kafka JDBC Sink PostgreSQL configuration (etc/kafka-connect-jdbc/postgres.properties):

name=test-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=3

# The topics to consume from - required for sink connectors like this one
topics=jsonb_pkgs

connection.url=jdbc:postgresql://localhost:5432/test?currentSchema=test
auto.create=false
auto.evolve=false

insert.mode=insert
connection.user=postgres
table.name.format=${topic}

connection.password=pwd

batch.size=500
# based on 500*3000byte message size
fetch.min.bytes=1500000
fetch.wait.max.ms=1500
max.poll.records=4000

I also added options to connect-distributed.properties:

consumer.fetch.min.bytes=1500000
consumer.fetch.wait.max.ms=1500

Although each partition receives more than 1000 records per second, the records are saved to PostgreSQL one at a time.

Edit: the consumer options were added to another file with the correct names.

I also added options to etc/schema-registry/connect-avro-standalone.properties:

# based on 500*3000 byte message size
consumer.fetch.min.bytes=1500000
consumer.fetch.wait.max.ms=1500
consumer.max.poll.records=4000
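
The comment in the configuration derives fetch.min.bytes from a target batch of 500 records at roughly 3000 bytes each. A minimal sketch of that arithmetic follows; the function name and its parameters are illustrative only and are not part of Kafka or Kafka Connect.

```python
def fetch_min_bytes(target_records: int, avg_record_bytes: int) -> int:
    """Bytes a fetch should accumulate so that it carries
    roughly `target_records` records of `avg_record_bytes` each."""
    if target_records <= 0 or avg_record_bytes <= 0:
        raise ValueError("both arguments must be positive")
    return target_records * avg_record_bytes


# Values taken from the configuration comment: 500 records of ~3000 bytes.
print(fetch_min_bytes(500, 3000))
```

This only sizes the fetch from the broker. Note also that the Kafka consumer documentation spells the wait setting fetch.max.wait.ms, so the fetch.wait.max.ms spelling used above may not be recognised.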

Answer 1

I realised that I had misunderstood the documentation. The records are inserted into the database one by one. The number of records inserted in one transaction depends on batch.size and consumer.max.poll.records. I expected batch insert to be implemented differently. I would like to have an option to insert records like this:

INSERT INTO table1 (First, Last)
VALUES
    ('Fred', 'Smith'),
    ('John', 'Smith'),
    ('Michael', 'Smith'),
    ('Robert', 'Smith');

But that seems impossible.
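
To illustrate how batch.size and consumer.max.poll.records interact as described in this answer, here is a minimal Python sketch of how one poll of records could be split into write batches. It assumes the sink flushes whenever its buffer reaches batch.size and flushes any remainder at the end of each poll; split_into_batches is a hypothetical helper, not the connector's API.

```python
from typing import Iterator, List, Sequence


def split_into_batches(records: Sequence[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive chunks of at most `batch_size` records."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


# One poll returning max.poll.records=4000 records, written with batch.size=500.
polled = [{"id": i} for i in range(4000)]
batches = list(split_into_batches(polled, 500))
print(len(batches), len(batches[0]))  # 8 batches of 500 records each
```

Under that assumption, a poll that returns fewer records than batch.size yields a single smaller batch, so the number of records per poll acts as an upper bound on how many records can be written together.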

3 2019-12-01 16:21:21
