
Unable to push Json in KAFKA topic

I am trying to push data in JSON format into a KAFKA topic, without success.

I used the following AVRO SCHEMA:

{"schemaType":"AVRO","schema":"{\"title\":\"json pipeline\",\"name\":\"MyClass\",\"type\":\"record\",\"namespace\":\"com.acme.avro\",\"fields\":[{\"name\":\"web\",\"type\":{\"name\":\"web\",\"type\":\"record\",\"fields\":[{\"name\":\"test\",\"type\":{\"name\":\"test\",\"type\":\"record\",\"fields\":[{\"name\":\"createdDate\",\"type\":\"string\"},{\"name\":\"modifiedDate\",\"type\":\"string\"},{\"name\":\"createdBy\",\"type\":\"string\"},{\"name\":\"modifiedBy\",\"type\":\"string\"},{\"name\":\"enabled\",\"type\":\"int\"},{\"name\":\"savedEvent\",\"type\":\"int\"},{\"name\":\"testId\",\"type\":\"int\"},{\"name\":\"testName\",\"type\":\"string\"},{\"name\":\"type\",\"type\":\"string\"},{\"name\":\"interval\",\"type\":\"int\"},{\"name\":\"httpInterval\",\"type\":\"int\"},{\"name\":\"url\",\"type\":\"string\"},{\"name\":\"protocol\",\"type\":\"string\"},{\"name\":\"networkMeasurements\",\"type\":\"int\"},{\"name\":\"mtuMeasurements\",\"type\":\"int\"},{\"name\":\"bandwidthMeasurements\",\"type\":\"int\"},{\"name\":\"bgpMeasurements\",\"type\":\"int\"},{\"name\":\"usePublicBgp\",\"type\":\"int\"},{\"name\":\"alertsEnabled\",\"type\":\"int\"},{\"name\":\"liveShare\",\"type\":\"int\"},{\"name\":\"httpTimeLimit\",\"type\":\"int\"},{\"name\":\"httpTargetTime\",\"type\":\"int\"},{\"name\":\"httpVersion\",\"type\":\"int\"},{\"name\":\"pageLoadTimeLimit\",\"type\":\"int\"},{\"name\":\"pageLoadTargetTime\",\"type\":\"int\"},{\"name\":\"followRedirects\",\"type\":\"int\"},{\"name\":\"includeHeaders\",\"type\":\"int\"},{\"name\":\"sslVersionId\",\"type\":\"int\"},{\"name\":\"verifyCertificate\",\"type\":\"int\"},{\"name\":\"useNtlm\",\"type\":\"int\"},{\"name\":\"authType\",\"type\":\"string\"},{\"name\":\"contentRegex\",\"type\":\"string\"},{\"name\":\"identifyAgentTrafficWithUserAgent\",\"type\":\"int\"},{\"name\":\"probeMode\",\"type\":\"string\"},{\"name\":\"pathTraceMode\",\"type\":\"string\"},{\"name\":\"description\",\"type\":\"string\"},{\"name\":\"numPathTraces\",\"type\"
:\"int\"},{\"name\":\"apiLinks\",\"type\":{\"type\":\"array\",\"items\":{\"name\":\"apiLinks_record\",\"type\":\"record\",\"fields\":[{\"name\":\"rel\",\"type\":\"string\"},{\"name\":\"href\",\"type\":\"string\"}]}}},{\"name\":\"sslVersion\",\"type\":\"string\"}]}},{\"name\":\"pageLoad\",\"type\":{\"type\":\"array\",\"items\":{\"name\":\"pageLoad_record\",\"type\":\"record\",\"fields\":[{\"name\":\"agentName\",\"type\":\"string\"},{\"name\":\"countryId\",\"type\":\"string\"},{\"name\":\"date\",\"type\":\"string\"},{\"name\":\"agentId\",\"type\":\"int\"},{\"name\":\"roundId\",\"type\":\"int\"},{\"name\":\"responseTime\",\"type\":\"int\"},{\"name\":\"totalSize\",\"type\":\"int\"},{\"name\":\"numObjects\",\"type\":\"int\"},{\"name\":\"numErrors\",\"type\":\"int\"},{\"name\":\"domLoadTime\",\"type\":\"int\"},{\"name\":\"pageLoadTime\",\"type\":\"int\"},{\"name\":\"permalink\",\"type\":\"string\"}]}}}]}},{\"name\":\"pages\",\"type\":{\"name\":\"pages\",\"type\":\"record\",\"fields\":[{\"name\":\"current\",\"type\":\"int\"}]}}]}"
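(Aside: the "schema" field in the payload above is itself a JSON-escaped Avro schema string. A stdlib-only sketch of unescaping such a payload, with the inner schema trimmed to one field purely for illustration since the full one is long:)

```python
import json

# A registry payload in the same shape as the one above; the inner
# schema is trimmed to a single field purely for illustration.
payload = ('{"schemaType":"AVRO","schema":'
           '"{\\"name\\":\\"MyClass\\",\\"type\\":\\"record\\",'
           '\\"fields\\":[{\\"name\\":\\"web\\",\\"type\\":\\"string\\"}]}"}')

outer = json.loads(payload)                 # parse the registry envelope
avro_schema = json.loads(outer["schema"])   # un-escape the inner schema string

print(avro_schema["name"])                          # MyClass
print([f["name"] for f in avro_schema["fields"]])   # ['web']
```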

This AVRO schema was pushed successfully to my SchemaRegistry.

Then in my producer I used the AvroSerializer:

import os
import time
import json
import sys
import requests

from confluent_kafka import Producer
from confluent_kafka import SerializingProducer
from confluent_kafka.serialization import StringSerializer
from confluent_kafka.schema_registry.schema_registry_client import SchemaRegistryClient
from confluent_kafka.schema_registry.json_schema import JSONSerializer
from confluent_kafka.schema_registry.avro import AvroSerializer

from utils import set_logger

from confluent_kafka.admin import AdminClient, NewTopic

TOPIC = os.environ.get("MY_TOPIC_IN")
WEB_PAGE_LOAD_URL = os.environ.get("URL")
ACCOUNT_GROUP_ID_1000EYES = os.environ.get("ACCOUNT_ID")
TE_BEARER = os.environ.get("TE_BEARER")
LOGGER = set_logger("producer_logger")

def metrics(test_id):
    res = {}
    #url= WEB_PAGE_LOAD_URL + '{}.json?aid{}'.format(test_id, ACCOUNT_GROUP_ID_1000EYES)
    url= "{}{}.json?aid{}".format(WEB_PAGE_LOAD_URL, test_id, ACCOUNT_GROUP_ID_1000EYES)
    session = requests.session()
    headers = {'Authorization': TE_BEARER}
    rep=session.get(url, headers=headers)
    res = rep.json()
    print(res)
    return res

if __name__ == "__main__":

    conf={"bootstrap.servers":"json_kafka:29094"}
    admin_client = AdminClient(conf)
    topic_list = [NewTopic("my_topic_in", 1, 1)]
    admin_client.create_topics(new_topics=topic_list)

    if sys.argv[1] == "json" :

        schema_registry_url = {"url": "http://json_schema-registry:8083"}
        sr = SchemaRegistryClient(schema_registry_url)
        subjects = sr.get_subjects()
        '''retrieve json schema in schema registry'''
        for subject in subjects:
            #print(subject)
            schema = sr.get_latest_version(subject)
            print(schema.subject)
            if schema.subject == "{}-value".format(TOPIC) :
                my_schema=schema.schema.schema_str
                json_serializer = JSONSerializer(my_schema,sr,to_dict=None,conf=None)
                '''create json producer'''
                json_producer_conf = {'bootstrap.servers':'json_kafka:29094' ,
                                      'key.serializer': StringSerializer('utf_8'),
                                      'value.serializer': AvroSerializer}
                                      
                producer = SerializingProducer(json_producer_conf)

    elif sys.argv[1]=="string":
        string_producer_conf = {'bootstrap.servers':'json_kafka:29094',
            'enable.idempotence': 'true'}
        '''create string producer '''
        producer = Producer(string_producer_conf)

    while True:

        response_json=metrics(1136837) #300

        raw_json = json.dumps(response_json,indent=4)

        print(raw_json)

        try:
            #producer.produce(topic=TOPIC, value=raw_json)
            producer.produce(topic=TOPIC, value=raw_json)
            producer.poll(1)

        except Exception as e:
            LOGGER.error("There is a problem with the topic {}\n".format(TOPIC))
            LOGGER.error("The problem is: {}!".format(e))

        LOGGER.info("Produced into Kafka topic: {}.".format(TOPIC))
        LOGGER.info("Waiting for the next round...")
        time.sleep(300)

Then when I launch my producer, I get the following error:

ERROR The problem is: KafkaError{code=_VALUE_SERIALIZATION,val=-161,str="'SerializationContext' object has no attribute 'strip'"}!

Note: when I use "string" as the argument to my producer, it works fine.

I have tried many things without success and really don't understand the error message. Any help would be greatly appreciated, thanks.

The problem with your code is that you are passing the actual class AvroSerializer to the value.serializer property, rather than an instance.
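To see why that surfaces as the 'strip' error: SerializingProducer calls the configured serializer as serializer(value, ctx). If value.serializer is the class itself, that call *constructs* an AvroSerializer with (value, ctx) as its arguments, so (assuming a confluent_kafka version where the schema string is the second positional argument) the SerializationContext lands where a schema string is expected. A stdlib-only sketch with stand-in classes, none of which are the real confluent_kafka internals:

```python
class SerializationContext:
    """Stand-in for confluent_kafka.serialization.SerializationContext."""
    def __init__(self, topic, field):
        self.topic = topic
        self.field = field

class AvroSerializerStandIn:
    """Stand-in for AvroSerializer: __init__ expects a schema *string*
    as its second argument and normalizes it (e.g. via .strip())."""
    def __init__(self, schema_registry_client, schema_str=None):
        schema_str.strip()  # fine for a str, fails for a context object

def serialize_value(value, serializer):
    # SerializingProducer calls the configured serializer as
    # serializer(value, ctx); if `serializer` is a *class*, this
    # constructs an instance with (value, ctx) instead of serializing.
    ctx = SerializationContext("my_topic_in", "value")
    return serializer(value, ctx)

try:
    serialize_value('{"pages": {"current": 1}}', AvroSerializerStandIn)
except AttributeError as e:
    print(e)  # 'SerializationContext' object has no attribute 'strip'
```

Passing an *instance* instead makes serializer(value, ctx) invoke its __call__, which is what the library expects.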

As the example code shows, you need to create an instance with the schema, the registry URL, and a serializer function handle. You would then return a dict from the AvroSerializer's serializer function handle, rather than producing a string from json.dumps... If you want to send an actual JSON string, then you don't need to use the AvroSerializer, since it sends binary Avro data.

Reducing the code to the important parts...

class User:
  def __init__(self, ...):
     pass

def user_to_dict(user, ctx):
    return dict(...)


schema_registry_conf = {'url': 'http://...'}
schema_registry_client = SchemaRegistryClient(schema_registry_conf)

avro_serializer = AvroSerializer(schema_str,
                                 schema_registry_client,
                                 user_to_dict)  # object serializer function defined here

producer_conf = {'bootstrap.servers': '...',
                 'key.serializer': StringSerializer('utf_8'),
                 'value.serializer': avro_serializer}

producer = SerializingProducer(producer_conf)

...
while True:
    # Serve on_delivery callbacks from previous calls to produce()
    producer.poll(0.0)
    try:
        # ... get fields 
        user = User(...)  # create an object
        producer.produce(topic=topic, key='...', value=user,  # sending the object
                         on_delivery=delivery_report)
    ...
