Kafka Connect sink configuration problem - "Ignoring invalid task provided offset -- partition not assigned"

We are trying to run a Kafka Connect worker on GCP with Kubernetes, with one source connector configured on PostgreSQL, one sink connector syncing to BigQuery, and managed Confluent Kafka. The Kafka topics for offsets, config and status are configured per specification with 25, 1 and 5 partitions respectively, a compact cleanup policy, and a retention of 7 days.

Connectors are started through the REST API. The source connector seems to be working fine, but after some time the sink connector starts logging these warnings:

[2021-09-06 08:13:12,429] WARN WorkerSinkTask{id=master-gcp-bq-sink-0} Ignoring invalid task provided offset sometable-1/OffsetAndMetadata{offset=500, leaderEpoch=null, metadata=''} -- partition not assigned, assignment=[com_sync_master_dev.schema.table-1, com_sync_master_dev.schema.table-0] (org.apache.kafka.connect.runtime.WorkerSinkTask)

Furthermore, every restart of the sink connector starts from the beginning, as if it cannot read the offset to resume from.

Before the issue, the broker loses connection, the connector stops, and then a rebalance is started.


[2021-09-09 07:55:51,291] INFO [Worker clientId=connect-1, groupId=database-sync] Group coordinator *************.europe-west3.gcp.confluent.cloud:9092 (id: 2147483636 rack: null) is unavailable or invalid due to cause: session timed out without receiving a heartbeat response.isDisconnected: false. Rediscovery will be attempted. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2021-09-09 07:55:51,295] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Skipping offset commit, task opted-out by returning no offsets from preCommit (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:55:51,295] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Finished offset commit successfully in 0 ms for sequence number 5: null (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:55:51,298] INFO [Worker clientId=connect-1, groupId=database-sync] Discovered group coordinator *************.europe-west3.gcp.confluent.cloud:9092 (id: 2147483636 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2021-09-09 07:55:51,300] DEBUG Putting 500 records in the sink. (com.wepay.kafka.connect.bigquery.BigQuerySinkTask)
[2021-09-09 07:55:51,301] INFO [Worker clientId=connect-1, groupId=database-sync] Discovered group coordinator *************.europe-west3.gcp.confluent.cloud:9092 (id: 2147483636 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2021-09-09 07:55:51,302] INFO [Worker clientId=connect-1, groupId=database-sync] Group coordinator *************.europe-west3.gcp.confluent.cloud:9092 (id: 2147483636 rack: null) is unavailable or invalid due to cause: coordinator unavailable.isDisconnected: false. Rediscovery will be attempted. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2021-09-09 07:55:56,732] DEBUG re-attempting insertion (com.wepay.kafka.connect.bigquery.write.row.AdaptiveBigQueryWriter)
[2021-09-09 07:55:56,735] DEBUG table insertion completed successfully (com.wepay.kafka.connect.bigquery.write.row.AdaptiveBigQueryWriter)
[2021-09-09 07:55:56,739] DEBUG Wrote 500 rows over 1 successful calls and 0 failed calls. (com.wepay.kafka.connect.bigquery.write.batch.TableWriter)
[2021-09-09 07:55:56,736] INFO [Worker clientId=connect-1, groupId=database-sync] Broker coordinator was unreachable for 3000ms. Revoking previous assignment Assignment{error=0, leader='connect-1-fd48e893-1729-4df4-8d1e-3370c1e76e1f', leaderUrl='http://confluent-bigquery-connect:8083/', offset=555, connectorIds=[master-gcp-bq-sink, master-gcp-source], taskIds=[master-gcp-bq-sink-0, master-gcp-source-0], revokedConnectorIds=[], revokedTaskIds=[], delay=0} to avoid running tasks while not being a member the group (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator)

Offsets for the sink connector are always restarted from 0, and WorkerSinkTask is skipping the last commit. Logs:

[2021-09-09 07:29:25,177] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Skipping offset commit, no change since last commit (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:29:25,177] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Finished offset commit successfully in 0 ms for sequence number 1345: null (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:50:39,281] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Initializing and starting task for topics com_sync_master_dev.someshema.sometable (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:50:39,300] INFO WorkerSinkTask{id=master-gcp-bq-sink-0} Sink task finished initialization and start (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:50:39,595] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Partitions assigned [com_sync_master_dev.someshema.sometable-1, com_sync_master_dev.someshema.sometable-0] (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:50:39,795] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Assigned topic partition com_sync_master_dev.someshema.sometable-1 with offset 0 (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:50:39,817] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Assigned topic partition com_sync_master_dev.someshema.sometable-0 with offset 0 (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:51:39,308] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Skipping offset commit, task opted-out by returning no offsets from preCommit (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:51:39,308] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Finished offset commit successfully in 0 ms for sequence number 1: null (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:52:39,355] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Skipping offset commit, task opted-out by returning no offsets from preCommit (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 07:52:39,355] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Finished offset commit successfully in 0 ms for sequence number 2: null (org.apache.kafka.connect.runtime.WorkerSinkTask)
...
[2021-09-09 08:01:03,158] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Initializing and starting task for topics com_sync_master_dev.someshema.sometable (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 08:01:03,168] INFO WorkerSinkTask{id=master-gcp-bq-sink-0} Sink task finished initialization and start (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 08:01:03,381] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Partitions assigned [com_sync_master_dev.someshema.sometable-1, com_sync_master_dev.someshema.sometable-0] (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 08:01:03,410] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Assigned topic partition com_sync_master_dev.someshema.sometable-1 with offset 0 (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 08:01:03,762] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Assigned topic partition com_sync_master_dev.someshema.sometable-0 with offset 0 (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 08:02:03,145] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Skipping offset commit, task opted-out by returning no offsets from preCommit (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 08:02:03,145] DEBUG WorkerSinkTask{id=master-gcp-bq-sink-0} Finished offset commit successfully in 0 ms for sequence number 1: null (org.apache.kafka.connect.runtime.WorkerSinkTask)
....
[2021-09-09 08:09:17,085] WARN WorkerSinkTask{id=master-gcp-bq-sink-0} Ignoring invalid task provided offset sometable-0/OffsetAndMetadata{offset=395300, leaderEpoch=null, metadata=''} -- partition not assigned, assignment=[com_sync_master_dev.someshema.sometable-1, com_sync_master_dev.someshema.sometable-0] (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2021-09-09 08:09:17,085] WARN WorkerSinkTask{id=master-gcp-bq-sink-0} Ignoring invalid task provided offset sometable-1/OffsetAndMetadata{offset=380428, leaderEpoch=null, metadata=''} -- partition not assigned, assignment=[com_sync_master_dev.someshema.sometable-1, com_sync_master_dev.someshema.sometable-0] (org.apache.kafka.connect.runtime.WorkerSinkTask)

Source configuration:

{
"name": "master-gcp-source",
"config": {
  "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
  "plugin.name": "pgoutput",
  "key.converter": "io.confluent.connect.avro.AvroConverter",
  "key.converter.basic.auth.credentials.source": "******",
  "key.converter.schema.registry.basic.auth.user.info":"*****",
  "key.converter.schema.registry.url": "https://************.gcp.confluent.cloud",
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "errors.tolerance": "none",
  "errors.deadletterqueue.topic.name":"dlq_postgres_source",
  "errors.deadletterqueue.topic.replication.factor": 1,
  "errors.deadletterqueue.context.headers.enable":true,
  "errors.log.enable":true,
  "errors.log.include.messages":true,
  "value.converter.basic.auth.credentials.source": "******",
  "value.converter.schema.registry.basic.auth.user.info":"***************",
  "value.converter.schema.registry.url": "https://************.gcp.confluent.cloud",
  "transforms.extractKey.type":"org.apache.kafka.connect.transforms.ExtractField$Key",
  "database.hostname": "hostname",
  "database.port": "5432",
  "database.user": "some_db_user",
  "database.password": "***********",
  "database.dbname" : "master",
  "database.server.name": "com_master_dev",
  "database.sslmode": "require",
  "table.include.list": "schema.table",
  "table.ignore.builtin": true,
  "heartbeat.interval.ms": "5000",
  "tasks.max": "1",
  "slot.drop.on.stop": false,
  "xmin.fetch.interval.ms": 0,
  "interval.handling.mode": "numeric",
  "binary.handling.mode": "bytes",
  "sanitize.field.names": true,
  "slot.max.retries":6,
  "slot.retry.delay.ms": 10000,
  "event.processing.failure.handling.mode": "fail",
  "slot.name": "debezium",
  "publication.name": "dbz_publication",
  "decimal.handling.mode": "precise",
  "snapshot.lock.timeout.ms": "10000",
  "snapshot.mode":"initial",
  "output.data.format": "AVRO",
  "transforms": "unwrap",
  "offset.flush.interval.ms": "0",
  "offset.flush.timeout.ms" : "20000",
  "max.batch.size": "1024",
  "max.queue.size":"4096",
  "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState"
}
}

Sink configuration:

{
"name": "master-gcp-bq-sink",
"config": {
  "connector.class": "com.wepay.kafka.connect.bigquery.BigQuerySinkConnector",
  "tasks.max": "1",
  "key.converter": "io.confluent.connect.avro.AvroConverter",
  "key.converter.basic.auth.credentials.source": "*********",
  "key.converter.schema.registry.basic.auth.user.info":"************",
  "key.converter.schema.registry.url": "https://*********.europe-west3.gcp.confluent.cloud",
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "value.converter.basic.auth.credentials.source": "*******",
  "value.converter.schema.registry.basic.auth.user.info":"****************************",
  "value.converter.schema.registry.url": "https://*********.europe-west3.gcp.confluent.cloud",
  "config.action.reload": "restart",
  "topics": "com_master_dev.schema.table",
  "project": "dev",
  "defaultDataset": "schema",
  "keyfile": "{********}",
  "keySource": "JSON",
  "errors.tolerance": "none",
  "errors.deadletterqueue.topic.name":"dlq_bigquery_sink",
  "errors.deadletterqueue.topic.replication.factor": 3,
  "errors.deadletterqueue.context.headers.enable":true,
  "errors.log.enable":true,
  "errors.log.include.messages":true,
   "data.format":"AVRO",
  "upsertEnabled": true,
  "deleteEnabled": false,
  "allowNewBigQueryFields": "true",
  "sanitizeTopics": true,
  "sanitizeFieldNames": true,
  "autoCreateTables": true,
  "timePartitioningType": "DAY",
  "kafkaKeyFieldName":"key_placeholder",
  "mergeIntervalMs": "60000",
  "mergeRecordsThreshold": "-1",
  "transforms": "unwrap",
  "consumer.override.session.timeout.ms":"60000",
  "consumer.override.fetch.max.bytes": "1048576",
  "consumer.override.request.timeout.ms":"60000",   
  "consumer.override.reconnect.backoff.max.ms":"10000",
  "consumer.override.reconnect.backoff.ms":"250",
  "consumer.override.partition.assignment.strategy":"org.apache.kafka.clients.consumer.CooperativeStickyAssignor", // also tried with RoundRobinAssignor
  "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
  "transforms": "RegexTransformation",
  "transforms.RegexTransformation.type":"org.apache.kafka.connect.transforms.RegexRouter",
  "transforms.RegexTransformation.regex":"(com_sync_master_dev.schema.)(.*)",
  "transforms.RegexTransformation.replacement": "$2"
}
}

What are we missing? How can we avoid invalid task offsets and make sure that the sink connector continues from the previous offset?

In Kafka, a consumer of a topic needs to belong to a consumer group to take advantage of committed offsets. Otherwise a failed sink will simply respawn with no idea which offset it was reading from the last time it was alive.

When a member of a consumer group reads from a topic, it commits its progress to the broker, so that the group knows how far it has read.
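To make the commit-and-resume idea concrete, here is a minimal in-memory sketch. It is a toy model and not the Kafka client API: `OffsetStore` and `resume_position` are hypothetical names, and falling back to 0 stands in for an "earliest" reset policy. One detail that is not an assumption: by default Kafka Connect puts the consumers of a sink connector into a group named `connect-<connector name>`, which for this connector would be `connect-master-gcp-bq-sink`.

```python
from typing import Dict, Tuple

TopicPartition = Tuple[str, int]


def sink_group_id(connector_name: str) -> str:
    """Default consumer group name Kafka Connect derives for a sink connector."""
    return f"connect-{connector_name}"


class OffsetStore:
    """Toy stand-in for the broker-side committed-offset store (hypothetical)."""

    def __init__(self) -> None:
        self._committed: Dict[Tuple[str, TopicPartition], int] = {}

    def commit(self, group: str, tp: TopicPartition, offset: int) -> None:
        # Record the next offset the group should read for this partition.
        self._committed[(group, tp)] = offset

    def resume_position(self, group: str, tp: TopicPartition) -> int:
        # Without a committed offset, fall back to the start of the partition,
        # which models an "earliest" reset policy.
        return self._committed.get((group, tp), 0)


store = OffsetStore()
group = sink_group_id("master-gcp-bq-sink")
tp = ("com_sync_master_dev.someshema.sometable", 0)

before_commit = store.resume_position(group, tp)  # nothing committed yet
store.commit(group, tp, 500)
after_commit = store.resume_position(group, tp)   # resumes from the commit
print(group, before_commit, after_commit)
```

If no commit ever reaches the store, every restart sees the fallback position, which is the behaviour described in the question.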

It seems that the offset commit is skipped for some reason, which is why the sink always starts from offset 0.
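The warnings in the question hint at where the commit gets lost: the offsets handed back by the task are keyed by `sometable-0` and `sometable-1`, while the assigned partitions are `com_sync_master_dev.someshema.sometable-0` and `-1`. Below is a rough model of that check, assuming the worker simply drops any task-provided offset whose topic-partition is not in its assignment. `filter_committable` is a hypothetical name and this is not Connect's actual code; the values are copied from the 08:09:17 log lines.

```python
from typing import Dict, List, Tuple

TopicPartition = Tuple[str, int]


def filter_committable(
    assignment: List[TopicPartition],
    task_offsets: Dict[TopicPartition, int],
) -> Tuple[Dict[TopicPartition, int], List[TopicPartition]]:
    """Split task-provided offsets into committable and ignored ones."""
    assigned = set(assignment)
    committable: Dict[TopicPartition, int] = {}
    ignored: List[TopicPartition] = []
    for tp, offset in task_offsets.items():
        if tp in assigned:
            committable[tp] = offset
        else:
            # Corresponds to the "partition not assigned" warning.
            ignored.append(tp)
    return committable, ignored


# Assignment and task-provided offsets as they appear in the warnings.
assignment = [
    ("com_sync_master_dev.someshema.sometable", 1),
    ("com_sync_master_dev.someshema.sometable", 0),
]
task_offsets = {("sometable", 0): 395300, ("sometable", 1): 380428}

committable, ignored = filter_committable(assignment, task_offsets)
print("committable:", committable)
print("ignored:", ignored)
```

With the topic names mismatched, nothing is committable, which would be consistent with every restart beginning at offset 0. The short names look like what the `RegexRouter` replacement `$2` in the sink configuration would produce, so that transform is a plausible place to look, although the logs alone do not prove it.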
