
How to deserialize Avro data from an MQTT message?

I am receiving serialized (Avro) data as an MQTT message. The message looks something like this: Objavro.codecnullavro.schemaº{"type": "record", "name": "User", "namespace": "example.avro", "fields": [{"type": "string", "name": "name"}, {"type": ["int", "null"], "name": "favorite_number"}, {"type": ["string", "null"], "name": "favorite_color"}]} Œpq+±)žJ@xX·,Alyssa €Ben redŒpq+±)žJ@xX·

I have to deserialize this data using Python 3 with the known schema user.avsc:

{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number",  "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}

The deserialized data should look something like this:

{u'favorite_color': None, u'favorite_number': 256, u'name': u'Alyssa'}
{u'favorite_color': u'red', u'favorite_number': 7, u'name': u'Ben'}

The example at https://avro.apache.org/docs/current/gettingstartedpython.html writes and reads the data through DataFileWriter and DataFileReader. I would like to do this on the fly instead: as each message arrives, the Python code deserializes the data and prints it.

The MQTT subscription logic is already in place; for now it just prints the incoming message. I would like it to print the deserialized data for each incoming message.

I tried the following (deserialization logic):

import avro.schema
from avro.io import DatumReader, DatumWriter
import io

schema = avro.schema.parse(open("user.avsc", "rb").read())
# message passed here is incoming message
bytes_reader = io.BytesIO(bytes(message, encoding='utf-8'))
decoder = avro.io.BinaryDecoder(bytes_reader)

reader = avro.io.DatumReader(schema)
data = reader.read(decoder)
print(data)

The above code fails (TypeError: ord() expected a character, but string of length 0 found) because I couldn't figure out the right format to pass to the reader.read() method. I used io.BytesIO because the data arrives as a string, which I cannot pass directly, and the example on the Apache page clearly reads the data in binary format and deserializes from that.

Thank you

If the message you get from MQTT is a string (and not bytes), you probably will not be able to deserialize it. Avro binary that has already been turned into a string cannot simply be encoded as UTF-8 and deserialized, because the resulting bytes are not the original ones. You need the actual binary.
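
To illustrate the point, here is a minimal sketch rather than a definitive fix. It assumes the MQTT client can hand over the raw payload as bytes (paho-mqtt, for example, exposes it as msg.payload) and that the avro package is installed. The sample in the question starts with Obj and embeds the schema, which suggests Avro's object container file format, so the sketch reads it with DataFileReader over an in-memory buffer instead of a bare DatumReader. The helper names is_avro_container and read_container are made up for this example.

```python
import io

AVRO_MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def is_avro_container(payload: bytes) -> bool:
    """Return True if the payload starts with the Avro container magic."""
    return payload[:4] == AVRO_MAGIC


def read_container(payload: bytes) -> list:
    """Decode every record in an Avro container payload (needs the avro package)."""
    # Imported lazily so the rest of this sketch runs without avro installed.
    from avro.datafile import DataFileReader
    from avro.io import DatumReader

    # The writer schema is embedded in the container header, so user.avsc
    # is not needed just to read the records back.
    reader = DataFileReader(io.BytesIO(payload), DatumReader())
    try:
        return list(reader)
    finally:
        reader.close()


# Why re-encoding a string fails: Avro writes the int 256 as bytes 0x80 0x04.
raw = b"\x80\x04"
# Assumption: somewhere upstream the bytes were turned into text (latin-1 here).
as_text = raw.decode("latin-1")
round_tripped = bytes(as_text, encoding="utf-8")  # same call as in the question
print(raw, round_tripped)  # the two differ, so the decoder sees corrupted input
```

In a subscriber callback this would be called as read_container(msg.payload), assuming msg.payload is bytes; each returned item is one record as a dict. Checking is_avro_container first is a cheap way to tell a container payload from a bare datum or from text that has already been mangled.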
