Reading Avro files in Java application with Hortonworks Schema Registry
I have an application that writes files in Avro format (multiple records per file), but I cannot read them in another Java app. Here's what I've tried:
Map<String, Object> registryConfig = new HashMap<>();
registryConfig.put("schema.registry.client.class.loader.cache.size", 10L);
registryConfig.put("schema.registry.url", "http://localhost:9090/api/v1");
registryConfig.put("schema.registry.client.class.loader.cache.expiry.interval.secs", 10L);
registryConfig.put("schema.registry.deserializer.schema.cache.size", 10L);
registryConfig.put("schema.registry.client.schema.metadata.cache.size", 10L);
registryConfig.put("schema.registry.client.schema.text.cache.expiry.interval.secs", 10000L);
registryConfig.put("schema.registry.client.schema.version.cache.expiry.interval.secs", 10000L);
registryConfig.put("schema.registry.client.schema.metadata.cache.expiry.interval.secs", 10L);
registryConfig.put("specific.avro.reader", false);
registryConfig.put("schema.registry.client.schema.version.cache.size", 10L);
registryConfig.put("schema.registry.client.schema.version.text.size", 10L);
registryConfig.put("schemaregistry.deserializer.schema.cache.expiry.secs", 10000L);
SchemaRegistryClient registryClient = new SchemaRegistryClient(registryConfig);
AvroSnapshotDeserializer deserializer = new AvroSnapshotDeserializer(registryClient);
deserializer.init(registryConfig);
Path p = Paths.get("/tmp/dump.avro");
InputStream is = Files.newInputStream(p);
deserializer.deserialize(is);
But it throws:
Exception in thread "main" com.hortonworks.registries.schemaregistry.serdes.avro.exceptions.AvroException: Unknown protocol id [79] received while deserializing the payload
at com.hortonworks.registries.schemaregistry.serdes.avro.AvroSnapshotDeserializer.checkProtocolHandlerExists(AvroSnapshotDeserializer.java:70)
at com.hortonworks.registries.schemaregistry.serdes.avro.AvroSnapshotDeserializer.retrieveProtocolId(AvroSnapshotDeserializer.java:63)
at com.hortonworks.registries.schemaregistry.serdes.avro.AvroSnapshotDeserializer.retrieveProtocolId(AvroSnapshotDeserializer.java:32)
at com.hortonworks.registries.schemaregistry.serde.AbstractSnapshotDeserializer.deserialize(AbstractSnapshotDeserializer.java:141)
at com.hortonworks.registries.schemaregistry.serde.AbstractSnapshotDeserializer.deserialize(AbstractSnapshotDeserializer.java:55)
at com.hortonworks.registries.schemaregistry.serde.SnapshotDeserializer.deserialize(SnapshotDeserializer.java:60)
I know it would be difficult for you to reproduce this problem, as it requires my schema registry and a file. I hope, though, that I am just doing something silly here. Any help would be appreciated.
Okay... I've realized that the 79 from the error message is the ASCII code of the letter O. I then double-checked whether my files are REALLY using the schema registry, and it turns out they aren't. They are just Avro files with an embedded schema. Thus, I don't need Hortonworks' AvroSnapshotDeserializer; a simple DataFileReader will do.
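The diagnosis can be checked without any Avro or registry dependency. An Avro object container file starts with the four magic bytes 'O', 'b', 'j', 1, so a deserializer that treats the first byte as a protocol id will see 79. The sketch below only inspects that header; the class name AvroMagicCheck and its method names are made up for illustration, and /tmp/dump.avro is the path from the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of Avro or the Hortonworks registry.
public class AvroMagicCheck {
    // Magic header of an Avro object container file: 'O' (79), 'b', 'j', 1.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    /** Returns true if the given bytes begin with the Avro container magic. */
    public static boolean hasAvroMagic(byte[] header) {
        return header.length >= AVRO_MAGIC.length
                && Arrays.equals(Arrays.copyOf(header, AVRO_MAGIC.length), AVRO_MAGIC);
    }

    /** Reads the first four bytes of the file and compares them to the magic. */
    public static boolean isAvroContainerFile(Path path) throws IOException {
        try (InputStream in = Files.newInputStream(path)) {
            byte[] header = new byte[AVRO_MAGIC.length];
            int read = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    break;
                }
                read += n;
            }
            return read == header.length && hasAvroMagic(header);
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file from the question; pass another path as the first argument.
        Path p = Paths.get(args.length > 0 ? args[0] : "/tmp/dump.avro");
        System.out.println(p + " is an Avro container file: " + isAvroContainerFile(p));
    }
}
```

If the check returns true, the file carries its own schema. It should then be readable with Avro's org.apache.avro.file.DataFileReader, constructed from a java.io.File and a GenericDatumReader for GenericRecord and iterated record by record, with no schema registry client involved.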