Unable to push JSON to a Kafka topic
I am trying to push data in JSON format to a Kafka topic, but without success.
I used the following Avro schema:
{"schemaType":"AVRO","schema":"{\"title\":\"json pipeline\",\"name\":\"MyClass\",\"type\":\"record\",\"namespace\":\"com.acme.avro\",\"fields\":[{\"name\":\"web\",\"type\":{\"name\":\"web\",\"type\":\"record\",\"fields\":[{\"name\":\"test\",\"type\":{\"name\":\"test\",\"type\":\"record\",\"fields\":[{\"name\":\"createdDate\",\"type\":\"string\"},{\"name\":\"modifiedDate\",\"type\":\"string\"},{\"name\":\"createdBy\",\"type\":\"string\"},{\"name\":\"modifiedBy\",\"type\":\"string\"},{\"name\":\"enabled\",\"type\":\"int\"},{\"name\":\"savedEvent\",\"type\":\"int\"},{\"name\":\"testId\",\"type\":\"int\"},{\"name\":\"testName\",\"type\":\"string\"},{\"name\":\"type\",\"type\":\"string\"},{\"name\":\"interval\",\"type\":\"int\"},{\"name\":\"httpInterval\",\"type\":\"int\"},{\"name\":\"url\",\"type\":\"string\"},{\"name\":\"protocol\",\"type\":\"string\"},{\"name\":\"networkMeasurements\",\"type\":\"int\"},{\"name\":\"mtuMeasurements\",\"type\":\"int\"},{\"name\":\"bandwidthMeasurements\",\"type\":\"int\"},{\"name\":\"bgpMeasurements\",\"type\":\"int\"},{\"name\":\"usePublicBgp\",\"type\":\"int\"},{\"name\":\"alertsEnabled\",\"type\":\"int\"},{\"name\":\"liveShare\",\"type\":\"int\"},{\"name\":\"httpTimeLimit\",\"type\":\"int\"},{\"name\":\"httpTargetTime\",\"type\":\"int\"},{\"name\":\"httpVersion\",\"type\":\"int\"},{\"name\":\"pageLoadTimeLimit\",\"type\":\"int\"},{\"name\":\"pageLoadTargetTime\",\"type\":\"int\"},{\"name\":\"followRedirects\",\"type\":\"int\"},{\"name\":\"includeHeaders\",\"type\":\"int\"},{\"name\":\"sslVersionId\",\"type\":\"int\"},{\"name\":\"verifyCertificate\",\"type\":\"int\"},{\"name\":\"useNtlm\",\"type\":\"int\"},{\"name\":\"authType\",\"type\":\"string\"},{\"name\":\"contentRegex\",\"type\":\"string\"},{\"name\":\"identifyAgentTrafficWithUserAgent\",\"type\":\"int\"},{\"name\":\"probeMode\",\"type\":\"string\"},{\"name\":\"pathTraceMode\",\"type\":\"string\"},{\"name\":\"description\",\"type\":\"string\"},{\"name\":\"numPathTraces\",\"type\"
:\"int\"},{\"name\":\"apiLinks\",\"type\":{\"type\":\"array\",\"items\":{\"name\":\"apiLinks_record\",\"type\":\"record\",\"fields\":[{\"name\":\"rel\",\"type\":\"string\"},{\"name\":\"href\",\"type\":\"string\"}]}}},{\"name\":\"sslVersion\",\"type\":\"string\"}]}},{\"name\":\"pageLoad\",\"type\":{\"type\":\"array\",\"items\":{\"name\":\"pageLoad_record\",\"type\":\"record\",\"fields\":[{\"name\":\"agentName\",\"type\":\"string\"},{\"name\":\"countryId\",\"type\":\"string\"},{\"name\":\"date\",\"type\":\"string\"},{\"name\":\"agentId\",\"type\":\"int\"},{\"name\":\"roundId\",\"type\":\"int\"},{\"name\":\"responseTime\",\"type\":\"int\"},{\"name\":\"totalSize\",\"type\":\"int\"},{\"name\":\"numObjects\",\"type\":\"int\"},{\"name\":\"numErrors\",\"type\":\"int\"},{\"name\":\"domLoadTime\",\"type\":\"int\"},{\"name\":\"pageLoadTime\",\"type\":\"int\"},{\"name\":\"permalink\",\"type\":\"string\"}]}}}]}},{\"name\":\"pages\",\"type\":{\"name\":\"pages\",\"type\":\"record\",\"fields\":[{\"name\":\"current\",\"type\":\"int\"}]}}]}"
This Avro schema was successfully pushed to my Schema Registry.
Then, in my producer, I used AvroSerializer:
import os
import time
import json
import sys
import requests
from confluent_kafka import Producer
from confluent_kafka import SerializingProducer
from confluent_kafka.serialization import StringSerializer
from confluent_kafka.schema_registry.schema_registry_client import SchemaRegistryClient
from confluent_kafka.schema_registry.json_schema import JSONSerializer
from confluent_kafka.schema_registry.avro import AvroSerializer
from utils import set_logger
from confluent_kafka.admin import AdminClient, NewTopic

TOPIC = os.environ.get("MY_TOPIC_IN")
WEB_PAGE_LOAD_URL = os.environ.get("URL")
ACCOUNT_GROUP_ID_1000EYES = os.environ.get("ACCOUNT_ID")
TE_BEARER = os.environ.get("TE_BEARER")
LOGGER = set_logger("producer_logger")

def metrics(test_id):
    res = {}
    #url= WEB_PAGE_LOAD_URL + '{}.json?aid{}'.format(test_id, ACCOUNT_GROUP_ID_1000EYES)
    url = "{}{}.json?aid{}".format(WEB_PAGE_LOAD_URL, test_id, ACCOUNT_GROUP_ID_1000EYES)
    session = requests.session()
    headers = {'Authorization': TE_BEARER}
    rep = session.get(url, headers=headers)
    res = rep.json()
    print(res)
    return res

if __name__ == "__main__":
    conf = {"bootstrap.servers": "json_kafka:29094"}
    admin_client = AdminClient(conf)
    topic_list = [NewTopic("my_topic_in", 1, 1)]
    admin_client.create_topics(new_topics=topic_list)
    if sys.argv[1] == "json":
        schema_registry_url = {"url": "http://json_schema-registry:8083"}
        sr = SchemaRegistryClient(schema_registry_url)
        subjects = sr.get_subjects()
        # retrieve json schema in schema registry
        for subject in subjects:
            #print(subject)
            schema = sr.get_latest_version(subject)
            print(schema.subject)
            if schema.subject == "{}-value".format(TOPIC):
                my_schema = schema.schema.schema_str
                json_serializer = JSONSerializer(my_schema, sr, to_dict=None, conf=None)
        # create json producer
        json_producer_conf = {'bootstrap.servers': 'json_kafka:29094',
                              'key.serializer': StringSerializer('utf_8'),
                              'value.serializer': AvroSerializer}
        producer = SerializingProducer(json_producer_conf)
    elif sys.argv[1] == "string":
        string_producer_conf = {'bootstrap.servers': 'json_kafka:29094',
                                'enable.idempotence': 'true'}
        # create string producer
        producer = Producer(string_producer_conf)
    while True:
        response_json = metrics(1136837)  #300
        raw_json = json.dumps(response_json, indent=4)
        print(raw_json)
        try:
            #producer.produce(topic=TOPIC, value=raw_json)
            producer.produce(topic=TOPIC, value=raw_json)
            producer.poll(1)
        except Exception as e:
            LOGGER.error("There is a problem with the topic {}\n".format(TOPIC))
            LOGGER.error("The problem is: {}!".format(e))
        LOGGER.info("Produced into Kafka topic: {}.".format(TOPIC))
        LOGGER.info("Waiting for the next round...")
        time.sleep(300)
When I launch my producer, I get the following error:
ERROR The problem is: KafkaError{code=_VALUE_SERIALIZATION,val=-161,str="'SerializationContext' object has no attribute 'strip'"}!>
Note: when I pass "string" as the argument to my producer, it works well.
I have tried many things without success and really don't understand the error message. Any help would be appreciated, thanks.
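The problem with your code is that you passed the AvroSerializer class itself to the value.serializer property, not an instance of it. SerializingProducer calls whatever it is given as serializer(value, ctx), so with the class this ends up invoking the constructor, and the SerializationContext appears to land in the parameter that expects the schema string, which would explain the 'strip' error.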
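To make the class-versus-instance difference concrete, here is a minimal sketch that runs without Kafka. ToySerializer, ToyContext, produce and try_produce are hypothetical stand-ins written for this illustration (they are not part of confluent_kafka), and the constructor signature is an assumption chosen to mimic the failure in the question.

```python
import json


class ToySerializer:
    """Hypothetical stand-in for AvroSerializer: built once, called per message."""

    def __init__(self, registry_client, schema_str):
        self.registry_client = registry_client
        # The constructor expects a string here, so it calls a str method on it.
        self.schema_str = schema_str.strip()

    def __call__(self, value, ctx):
        # A real Avro serializer would emit binary Avro; JSON bytes keep the toy simple.
        return json.dumps(value).encode("utf-8")


class ToyContext:
    """Hypothetical stand-in for SerializationContext."""

    def __init__(self, topic):
        self.topic = topic


def produce(serializer, value, topic):
    """Mimics a serializing producer: it simply calls serializer(value, ctx)."""
    return serializer(value, ToyContext(topic))


def try_produce(serializer, value, topic):
    """Returns the serialized bytes, or the error text if serialization fails."""
    try:
        return produce(serializer, value, topic)
    except AttributeError as exc:
        return "error: {}".format(exc)


# Passing the class: the "call" runs __init__ with (value, ctx) as its arguments.
result_with_class = try_produce(ToySerializer, {"a": 1}, "my_topic")

# Passing an instance: the call runs __call__ and returns bytes.
instance = ToySerializer(registry_client=None, schema_str='{"type": "string"}')
result_with_instance = try_produce(instance, {"a": 1}, "my_topic")

print(result_with_class)
print(result_with_instance)
```

In the first case the context object is handed to the constructor where a schema string is expected, so the failure surfaces as a missing 'strip' attribute, which resembles the error reported in the question.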
As shown in the example code, you need to create an instance with the schema, the Schema Registry client (built from its URL), and a serializer function handle. That function should return a dict; you should not produce a string with json.dumps. If you want to send an actual JSON string, you don't need AvroSerializer at all, as it sends binary Avro data.
Reducing the code to the important parts...
class User:
    def __init__(self, ...):
        pass

def user_to_dict(user, ctx):
    return dict(...)

schema_registry_conf = {'url': 'http://...'}
schema_registry_client = SchemaRegistryClient(schema_registry_conf)
# Keyword arguments are used because the positional order of these
# parameters is not the same in every confluent-kafka release
avro_serializer = AvroSerializer(schema_str=schema_str,
                                 schema_registry_client=schema_registry_client,
                                 to_dict=user_to_dict)  # object serializer function defined here
producer_conf = {'bootstrap.servers': '...',
                 'key.serializer': StringSerializer('utf_8'),
                 'value.serializer': avro_serializer}
producer = SerializingProducer(producer_conf)
...
while True:
    # Serve on_delivery callbacks from previous calls to produce()
    producer.poll(0.0)
    try:
        # ... get fields
        user = User(...)  # create an object
        producer.produce(topic=topic, key='...', value=user,  # sending the object
                         on_delivery=delivery_report)
    ...
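Applied to the question's producer, the value handed to produce() would be the dict returned by rep.json() rather than the string from json.dumps. The following is a minimal sketch under that assumption: response_to_dict is a hypothetical name, the sample payload is heavily trimmed and would not satisfy the full registered schema, and the confluent_kafka wiring appears only in comments because it needs a running broker and registry.

```python
import json


def response_to_dict(response, ctx):
    """Hypothetical to_dict callable for AvroSerializer.

    The API response is already a dict, so it is returned unchanged.
    ctx is the SerializationContext supplied by the producer.
    """
    return response


# Hypothetical, trimmed payload used only for illustration.
response_json = {"pages": {"current": 1}}

# dict: the kind of value an Avro serializer encodes against the schema.
value_for_avro = response_to_dict(response_json, None)

# str: suitable for the plain Producer ("string" mode), not for AvroSerializer.
value_as_string = json.dumps(response_json)

# Assumed wiring with confluent_kafka (not executed here):
#   avro_serializer = AvroSerializer(schema_registry_client=sr,
#                                    schema_str=my_schema,
#                                    to_dict=response_to_dict)
#   producer = SerializingProducer({'bootstrap.servers': '...',
#                                   'key.serializer': StringSerializer('utf_8'),
#                                   'value.serializer': avro_serializer})
#   producer.produce(topic=TOPIC, value=response_json)
```

Since the value is already a plain dict, the to_dict callable could probably be omitted altogether; it is kept here only to mirror the structure of the example above.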