
Spark Catalyst optimizer cast exception

I have two classes (Foo and Bar), each of which implements the same interface.

The application has a method that checks some condition on objects of that interface:

private boolean check(Interface1 obj)
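
For context, here is a minimal sketch of the surrounding types as I understand them (the class and interface names are taken from the question; the name property is a placeholder I'm assuming for illustration — Encoders.bean only needs getters/setters and a public no-arg constructor):

public interface Interface1 extends java.io.Serializable {
    String getName();                          // placeholder property
}

public class Foo implements Interface1 {
    private String name;                       // placeholder bean property
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

public class Bar implements Interface1 {
    private String name;                       // placeholder bean property
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}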

I apply this method to Datasets of both classes:

Dataset<Foo> foos = getStapSession()....load().as(Encoders.bean(Foo.class));
Dataset<Bar> bars = getStapSession()....load().as(Encoders.bean(Bar.class));

foos.filter((FilterFunction<Foo>) this::check).collectAsList();
bars.filter((FilterFunction<Bar>) this::check).collectAsList();

This fails with a cast error:

Caused by: java.lang.ClassCastException: test.Bar cannot be cast to test.Foo
    at org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters$$anonfun$org$apache$spark$sql$catalyst$optimizer$CombineTypedFilters$$combineFilterFunction$1.apply(objects.scala:85)
    at org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters$$anonfun$org$apache$spark$sql$catalyst$optimizer$CombineTypedFilters$$combineFilterFunction$1.apply(objects.scala:85)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(generated.java:273)
....

Foo and Bar are different classes with different properties, and the Datasets are created in different places (in several source files).

EDIT:

.filter((FilterFunction<Foo>)obj -> check(obj))

and

.filter((FilterFunction<Bar>)obj -> check(obj))

both work correctly. So the problem seems to be connected with the method reference this::check.
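
Since the stack trace points at the CombineTypedFilters optimizer rule, another workaround that might help (untested here; spark.sql.optimizer.excludedRules is only available from Spark 2.4 on) is to exclude that rule when building the session:

import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder()
        // Skip the rule that merges adjacent typed filters, which is where
        // the ClassCastException above is thrown.
        .config("spark.sql.optimizer.excludedRules",
                "org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters")
        .getOrCreate();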

It's not a bug in Spark; it's a JDK lambda-deserialization bug (bug ID: JDK-8154236): deserialization of a lambda can cause a ClassCastException. You can see a similar description of this problem in the Spark issue tracker (SPARK-9135).
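
To see why the two method references can be conflated, one can inspect how a serializable lambda is serialized. A minimal probe sketch (the helper below is mine, not part of Spark; the output values assume Interface1 lives in the test package like Foo and Bar): every serializable lambda gets a synthetic writeReplace() that returns a java.lang.invoke.SerializedLambda, and for both this::check references the recorded implementation method is the same erased check(Interface1).

import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

// Extract the SerializedLambda behind any serializable lambda instance.
static SerializedLambda probe(Object lambda) throws Exception {
    Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
    writeReplace.setAccessible(true);
    return (SerializedLambda) writeReplace.invoke(lambda);
}

// Both casts of this::check report the same erased implementation method:
// probe((FilterFunction<Foo>) this::check).getImplMethodSignature()
//   -> "(Ltest/Interface1;)Z"
// probe((FilterFunction<Bar>) this::check).getImplMethodSignature()
//   -> "(Ltest/Interface1;)Z"
// Only getInstantiatedMethodType() differs — presumably the distinction
// that gets lost on deserialization per the bug report.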
