Apache Avro Parquet java.lang.NoSuchFieldError: NULL_VALUE
I have been stuck on this for 3 days. I am trying to read Parquet files with Apache Avro: I simply read one file from a list of files, then iterate until all files are complete.
The code works fine within its own Scala file; however, I suspect the problem has something to do with the dependencies and the external lib that I am including.
Has anyone else hit a similar error and been able to solve it?
Code
override def generateData(): Option[GenericRecord] = {
  val conf: Configuration = new Configuration()
  conf.setBoolean(AvroReadSupport.AVRO_COMPATIBILITY, true)
  if (filePaths.isEmpty) {
    dataSourceComplete()
    None
  } else {
    x += 1
    var line = parquetReader.read()
    if (line == null) {
      // Current file is exhausted: pop the next path and open a new reader
      println(x)
      val nextFile = filePaths.last
      filePaths = filePaths.init
      println(nextFile)
      parquetReader = AvroParquetReader
        .builder[GenericRecord](HadoopInputFile.fromPath(new Path(nextFile), conf))
        .withConf(conf)
        .build()
      line = parquetReader.read()
    }
    // Option(...) rather than Some(...): the next file could itself be empty,
    // in which case line is still null here
    Option(line)
  }
}
Error
Uncaught error from thread [Raphtory-akka.actor.default-dispatcher-17]: NULL_VALUE, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[Raphtory]
java.lang.NoSuchFieldError: NULL_VALUE
at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:246)
at org.apache.parquet.avro.AvroSchemaConverter.convert(AvroSchemaConverter.java:231)
at org.apache.parquet.avro.AvroReadSupport.prepareForRead(AvroReadSupport.java:130)
at org.apache.parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:183)
at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:156)
at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:135)
at com.raphtory.ethereum.spout.EthereumTransactionSpout.generateData(EthereumTransactionSpout.scala:59)
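A `java.lang.NoSuchFieldError` at runtime usually means parquet-avro was compiled against a newer Avro than the one your classloader actually picked up (the `NULL_VALUE` field lives on Avro's `JsonProperties` and only exists in Avro 1.9+). As a diagnostic sketch, you can ask the JVM which jar a class was loaded from; substitute `classOf[org.apache.avro.Schema]` for the placeholder class below to see which Avro release is really on the classpath:

```scala
// Diagnostic sketch: report which jar actually provides a class at runtime.
// Swap in classOf[org.apache.avro.Schema] (or any parquet-avro class) for the
// placeholder to trace the conflicting dependency.
object WhichJar {
  def jarOf(cls: Class[_]): String = {
    val src = cls.getProtectionDomain.getCodeSource
    // getCodeSource is null for classes loaded by the bootstrap loader (JDK)
    if (src == null) "bootstrap/JDK classpath" else src.getLocation.toString
  }

  def main(args: Array[String]): Unit =
    println(jarOf(classOf[scala.collection.immutable.List[_]]))
}
```

If the printed location is a jar you did not expect (e.g. something bundled inside `lib/raphtory.jar` or pulled in transitively), that is the version conflict to resolve.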
This is my build.sbt
scalaVersion := "2.12.11"
Compile / unmanagedJars += baseDirectory.value / "lib/raphtory.jar"
val AkkaVersion = "2.6.14"
libraryDependencies ++= Seq(
"com.lightbend.akka" %% "akka-stream-alpakka-avroparquet" % "3.0.3",
"com.typesafe.akka" %% "akka-stream" % AkkaVersion
)
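Before resorting to shading, a less invasive first step in sbt is to force a single Avro version across the whole dependency graph with `dependencyOverrides` (the `1.10.2` version number here is illustrative, not taken from the question):

```scala
// build.sbt (sketch): pin Avro so a transitive dependency cannot drag in an
// older release that lacks the field parquet-avro expects.
dependencyOverrides += "org.apache.avro" % "avro" % "1.10.2"
```

Note this cannot help if the conflicting Avro classes are baked into an unmanaged jar such as `lib/raphtory.jar`; in that case shading (below) is the way out.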
I came across the same issue, so I thought I'd share. I learned about shading jars: in my application, some dependency libraries introduce conflicts with the Avro version. So I shade the Avro library in my pom.xml, i.e. rename the package so it no longer conflicts with anything else.
The first thing is to add the maven-shade-plugin, which lets you (a) create an uber JAR and (b) shade its contents (see the plugin documentation for more info). Here is a snippet:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.6.2</version>
  ...
Then I shade the library:
<relocation>
  <pattern>org.apache.avro</pattern>
  <shadedPattern>[RENAME-HERE].shaded.org.apache.avro</shadedPattern>
</relocation>
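The question itself uses sbt rather than Maven; the sbt-assembly plugin offers the equivalent relocation via `ShadeRule` (a sketch; the plugin version and the `myshaded` prefix are illustrative):

```scala
// project/plugins.sbt (version illustrative)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.2.0")

// build.sbt: rename org.apache.avro inside the uber jar so it cannot clash
// with an Avro version bundled elsewhere on the classpath.
assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("org.apache.avro.**" -> "myshaded.avro.@1").inAll
)
```

After this, `sbt assembly` produces a jar whose Avro classes live under the renamed package, so they can no longer collide with another Avro release.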