Microsoft Azure, in some cases, dumps data in Avro format. From my perspective the data in question is simply JSON records, so I just want my JSON data back from the Avro file.
I am looking at how to 'deserialize' avro data, and the examples here:
https://avro.apache.org/docs/1.8.1/gettingstartedjava.html
make the claim:
Data in Avro is always stored with its corresponding schema, meaning we can always read a serialized item regardless of whether we know the schema ahead of time.
Unfortunately, the examples do require knowing the schema ahead of time:
DatumReader<GenericRecord> datumReader = new GenericDatumReader<GenericRecord>(schema);
DataFileReader<GenericRecord> dataFileReader = new DataFileReader<GenericRecord>(file, datumReader);
I must be missing something. I just want my data out of Avro in text/JSON format. Is there any way of doing that without knowing the schema? Can't Avro just read it out of the file itself?
Why write code when there's already a tool to get json?
java -jar avro-tools-1.8.2.jar tojson data.avro > output.json
http://central.maven.org/maven2/org/apache/avro/avro-tools/1.8.2/avro-tools-1.8.2.jar
Otherwise, the schema is stored in the file's header, so your code has to read it from there before decoding the records, which is exactly what the source code of the tool above does.
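To make "the file has a schema" concrete, here is a minimal sketch that pulls the embedded schema JSON out of an Avro container file using only the JDK. It assumes the standard object container layout (the 4-byte magic `Obj` followed by 0x01, then a metadata map whose `avro.schema` entry holds the writer's schema). The class and method names are made up for illustration and are not part of the Avro library.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an Avro API: reads only the file header.
class AvroHeaderSchema {

    // Avro stores int/long values as variable-length zigzag integers.
    static long readLong(InputStream in) throws IOException {
        long raw = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) throw new EOFException("truncated varint");
            raw |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;
            shift += 7;
            if (shift > 63) throw new IOException("varint too long");
        }
        return (raw >>> 1) ^ -(raw & 1); // undo zigzag encoding
    }

    // Avro strings and bytes are a length (long) followed by that many bytes.
    static byte[] readBytes(InputStream in) throws IOException {
        long len = readLong(in);
        if (len < 0 || len > Integer.MAX_VALUE) throw new IOException("bad length " + len);
        byte[] buf = new byte[(int) len];
        new DataInputStream(in).readFully(buf);
        return buf;
    }

    // Returns the JSON text stored under the "avro.schema" metadata key.
    static String readSchemaJson(InputStream in) throws IOException {
        byte[] magic = new byte[4];
        new DataInputStream(in).readFully(magic);
        if (magic[0] != 'O' || magic[1] != 'b' || magic[2] != 'j' || magic[3] != 1) {
            throw new IOException("not an Avro object container file");
        }
        String schema = null;
        // The metadata is an Avro map: blocks of key/value pairs, ended by a 0 count.
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) {   // negative count: the block's byte size follows
                count = -count;
                readLong(in);  // skip the byte size
            }
            for (long i = 0; i < count; i++) {
                String key = new String(readBytes(in), StandardCharsets.UTF_8);
                byte[] value = readBytes(in);
                if (key.equals("avro.schema")) {
                    schema = new String(value, StandardCharsets.UTF_8);
                }
            }
        }
        if (schema == null) throw new IOException("header has no avro.schema entry");
        return schema;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(readSchemaJson(in));
        }
    }
}
```

In practice you would not parse the header by hand. With the Avro Java library, constructing the reader with the no-argument constructor, `new GenericDatumReader<GenericRecord>()`, and passing it to `DataFileReader` lets the library pick up the writer's schema from the header for you.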
You need to provide the reader's schema so that Avro can perform schema resolution.