
Unable to send data from Kafka topic to Elasticsearch

I am trying to build a data pipeline using MongoDB (as the source), Elasticsearch (as the sink) and Kafka. I have successfully received data from MongoDB in my Kafka topic. Here is a sample of the data captured from MongoDB:

{"_id": {"_data": "825E88FED8000000012B022C0100296E5A10044D2CA180FAF94580B30CFA4B3CC80E1546645F696400645E88FED793AFA61A58411B2A0004"}, "operationType": "insert", "clusterTime": {"$timestamp": {"t": 1586036440, "i": 1}}, "fullDocument": {"_id": {"$oid": "5e88fed793afa61a58411b2a"}, "name": "Lefèvre Mathis", "phoneNumber": 87640262, "phoneNumber2": 98462768, "phoneNumber3": 50591075, "email": "LefèvreMathis@gmail.com", "websiteUrl": "www.LefèvreMathis.fr", "legalInformation": {"companyName": "Duval EI", "siren": 7.3887975858196E13, "nic": 28866, "siret": 7.3887975858196E13, "ape": "49.53", "tva": "FR-1173030343", "description": "Blanditiis et placeat voluptas hic et. Quae et autem inventore ut enim fugit. Nihil velit in ut magnam."}, "professionType": {"type": "Hotel", "category": "professionnel"}, "operator": {"name": "Orange"}, "address": [{"city": "Paris", "street": "Quartier Les Halles, Paris 1er Arrondissement, Paris, Île-de-France, France métropolitaine, 75001, France", "zipCode": 75001, "latitude": "48.86330665", "longitude": "2.348370623761905"}], "openingTimeSet": [{"day": "Lundi", "opening": "08:00", "closing": "18:00"}, {"day": "Mardi", "opening": "08:00", "closing": "18:00"}, {"day": "Mercredi", "opening": "08:00", "closing": "18:00"}, {"day": "Jeudi", "opening": "08:00", "closing": "18:00"}, {"day": "Vendredi", "opening": "08:00", "closing": "18:00"}, {"day": "Samedi", "opening": "08:00", "closing": "18:00"}, {"day": "Dimanche", "opening": "08:00", "closing": "18:00"}], "_class": "com.sofrecom.elasticsearch.model.Subscriber"}, "ns": {"db": "elasticsearchApp", "coll": "subscriber"}, "documentKey": {"_id": {"$oid": "5e88fed793afa61a58411b2a"}}}

The problem is that when I run my ES sink connector, I get this exception:

Caused by: org.apache.kafka.connect.errors.DataException: Converting byte[] to Kafka Connect data failed due to serialization error: 
at org.apache.kafka.connect.json.JsonConverter.toConnectData(JsonConverter.java:355)
at org.apache.kafka.connect.storage.Converter.toConnectData(Converter.java:86)
at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$1(WorkerSinkTask.java:485)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
... 13 more

Caused by: org.apache.kafka.common.errors.SerializationException: java.io.CharConversionException: Invalid UTF-32 character 0x658b027b (above 0x0010ffff) at char #1, byte #7)
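
The last line of the trace says the parser read a 4-byte unit as a UTF-32 character and found a value above the largest Unicode code point (0x0010ffff). The small Python sketch below only checks that arithmetic for the value quoted in the message. Reading this as a sign that the record bytes are not plain UTF-8 JSON text is an interpretation, not something the trace states, and the helper name is made up for illustration.

```python
# Largest valid Unicode code point; anything above it cannot be a character.
MAX_CODE_POINT = 0x10FFFF

# The 4-byte unit quoted in the exception message.
unit = bytes.fromhex("658b027b")
value = int.from_bytes(unit, "big")


def is_valid_utf32_unit(raw: bytes) -> bool:
    """Return True if the 4 bytes decode as one big-endian UTF-32 character."""
    try:
        raw.decode("utf-32-be")
        return True
    except UnicodeDecodeError:
        return False


# The quoted value is far above the Unicode range, so decoding is rejected.
print(hex(value), value > MAX_CODE_POINT)
print(is_valid_utf32_unit(unit))
# For comparison, "{" encoded as UTF-32 (big-endian) is a valid unit.
print(is_valid_utf32_unit("{".encode("utf-32-be")))
```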

Here is my kafka-connect configuration:

  CONNECT_BOOTSTRAP_SERVERS: kafka:9092
  CONNECT_REST_ADVERTISED_HOST_NAME: connect
  CONNECT_REST_PORT: 8083
  CONNECT_GROUP_ID: compose-connect-group
  CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
  CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
  CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
  CONNECT_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
  CONNECT_VALUE_CONVERTER:  org.apache.kafka.connect.json.JsonConverter
  CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
  CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
  CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR:  1
  CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR:  1
  CONNECT_STATUS_STORAGE_REPLICATION_FACTOR:  1
  CONNECT_PLUGIN_PATH: '/usr/share/java,/etc/kafka-connect/jars'
  CONNECT_CONFLUENT_TOPIC_REPLICATION_FACTOR: 1

My es-sink connector:

{ "name": "sink", "config": { "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector", "connection.url": "http://172.21.0.4:9200", "type.name": "subscriber", "topics": "test5.elasticsearchApp.subscriber", "key.ignore": "false","value.converter.schemas.enable": "false","schema.ignore": "true","value.converter":"org.apache.kafka.connect.json.JsonConverter" } }

And the mongodb-source-connector:

{ "name": "mongo-source", "config": { "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector","tasks.max":1,"connection.uri":"mongodb://mongo1:27017,mongo2:27017","database":"elasticsearchApp","collection":"subscriber", "topic.prefix":"test15","value.converter":"org.apache.kafka.connect.storage.StringConverter"} }

When I try to use the JSON converter in my MongoDB connector, I get my payload as a string when consuming from the Kafka topic:

{"schema":{"type":"string","optional":false},"payload":"{\"_id\": {\"_data\": \"825E89EA94000000012B022C0100296E5A10044D2CA180FAF94580B30CFA4B3CC80E1546645F696400645E89EA94FC56002500157F490004\"}, \"operationType\": \"insert\", \"clusterTime\": {\"$timestamp\": {\"t\": 1586096788, \"i\": 1}}, \"fullDocument\": {\"_id\": {\"$oid\": \"5e89ea94fc56002500157f49\"}, \"name\": \"Lefèvre Mathis\", \"phoneNumber\": 87640262, \"phoneNumber2\": 98462768, \"phoneNumber3\": 50591075, \"email\": \"LefèvreMathis@gmail.com\", \"websiteUrl\": \"www.LefèvreMathis.fr\", \"legalInformation\": {\"companyName\": \"Duval EI\", \"siren\": 7.3887975858196E13, \"nic\": 28866, \"siret\": 7.3887975858196E13, \"ape\": \"49.53\", \"tva\": \"FR-1173030343\", \"description\": \"Blanditiis et placeat voluptas hic et. Quae et autem inventore ut enim fugit. Nihil velit in ut magnam.\"}, \"professionType\": {\"type\": \"Hotel\", \"category\": \"professionnel\"}, \"operator\": {\"name\": \"Orange\"}, \"address\": [{\"city\": \"Paris\", \"street\": \"Quartier Les Halles, Paris 1er Arrondissement, Paris, Île-de-France, France métropolitaine, 75001, France\", \"zipCode\": 75001, \"latitude\": \"48.86330665\", \"longitude\": \"2.348370623761905\"}], \"openingTimeSet\": [{\"day\": \"Lundi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Mardi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Mercredi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Jeudi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Vendredi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Samedi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Dimanche\", \"opening\": \"08:00\", \"closing\": \"18:00\"}], \"_class\": \"com.sofrecom.elasticsearch.model.Subscriber\"}, \"ns\": {\"db\": \"elasticsearchApp\", \"coll\": \"subscriber\"}, \"documentKey\": {\"_id\": {\"$oid\": \"5e89ea94fc56002500157f49\"}}}"}
  1. If you do not want the Mongo connector to produce string payloads, do not use this:

    "value.converter":"org.apache.kafka.connect.storage.StringConverter"
  2. You will need this in the sink, because the JSON in your topic contains both a schema and a payload field:

     "value.converter.schemas.enable": "true"
  3. You will need to use an Elasticsearch index mapping to parse the string, because Connect will not do that for you.
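
Points 1 and 2 can be sketched as edits to the two connector definitions from the question. This is only an illustration in Python: the helper names are hypothetical, and it assumes that once the per-connector StringConverter override is removed, the source falls back to the worker-level JsonConverter shown in the kafka-connect configuration.

```python
import json

STRING_CONVERTER = "org.apache.kafka.connect.storage.StringConverter"

# Connector definitions as posted in the question.
original_sink = {
    "name": "sink",
    "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "connection.url": "http://172.21.0.4:9200",
        "type.name": "subscriber",
        "topics": "test5.elasticsearchApp.subscriber",
        "key.ignore": "false",
        "value.converter.schemas.enable": "false",
        "schema.ignore": "true",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    },
}
original_source = {
    "name": "mongo-source",
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
        "tasks.max": 1,
        "connection.uri": "mongodb://mongo1:27017,mongo2:27017",
        "database": "elasticsearchApp",
        "collection": "subscriber",
        "topic.prefix": "test15",
        "value.converter": STRING_CONVERTER,
    },
}


def drop_string_converter(connector: dict) -> dict:
    """Point 1: return a copy without the StringConverter override (hypothetical helper)."""
    config = {
        key: value
        for key, value in connector["config"].items()
        if not (key == "value.converter" and value == STRING_CONVERTER)
    }
    return {"name": connector["name"], "config": config}


def with_schemas_enabled(connector: dict) -> dict:
    """Point 2: return a copy that expects the schema/payload envelope (hypothetical helper)."""
    config = dict(connector["config"])
    config["value.converter.schemas.enable"] = "true"
    return {"name": connector["name"], "config": config}


print(json.dumps(drop_string_converter(original_source), indent=2))
print(json.dumps(with_schemas_enabled(original_sink), indent=2))
```

The printed JSON documents have the same shape as the connector definitions in the question, with only the two converter settings changed.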

I am not sure whether there is a bug in the Mongo connector. I have never used it, but I would expect the JSON converter to work, or at least Avro.
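
To see why point 3 matters, here is a rough Python sketch of what the envelope in the topic contains. The record is a shortened, made-up version of the one in the question, and unwrap_envelope is a hypothetical stand-in for what a schema-aware JSON converter passes on, not Kafka Connect code. After the envelope is unwrapped, the payload is still a single string, and a second parse is needed before the document fields become available.

```python
import json

# Shortened, hypothetical record in the same shape as the one in the question:
# a schema/payload envelope whose payload is itself JSON-encoded text.
inner = {"operationType": "insert", "fullDocument": {"name": "Lefèvre Mathis"}}
record_value = json.dumps(
    {"schema": {"type": "string", "optional": False}, "payload": json.dumps(inner)}
)


def unwrap_envelope(raw: str):
    """Stand-in for a schema-aware converter: parse the envelope, return the payload."""
    envelope = json.loads(raw)
    return envelope["schema"]["type"], envelope["payload"]


schema_type, payload = unwrap_envelope(record_value)

# The schema declares a string, and the payload really is one string value.
print(schema_type, type(payload).__name__)

# Only a second parse turns that string into a structured document.
document = json.loads(payload)
print(document["fullDocument"]["name"])
```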

