Apache Spark filter by Some
I have the following leftOuterJoin operation:
val totalsAndProds = transByProd.leftOuterJoin(products)
println(totalsAndProds.first())
which prints:
(19,([Ljava.lang.String;@261ea657,Some([Ljava.lang.String;@25290bca)))
Then I try to apply the following filter operation:
totalsAndProds.filter(x => x._2 == Some).first
but it fails with the following exception:
Exception in thread "main" java.lang.UnsupportedOperationException: empty collection
at org.apache.spark.rdd.RDD$$anonfun$first$1.apply(RDD.scala:1380)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.first(RDD.scala:1377)
at com.example.spark.WordCount$.main(WordCount.scala:98)
at com.example.spark.WordCount.main(WordCount.scala)
What am I doing wrong, and why does the filter operation return an empty collection?
Your predicate is wrong: your RDD's element type is (Int, (Array[String], Option[Array[String]])), so _._2 is of type (Array[String], Option[Array[String]]), not Option[Array[String]]. On top of that, Some in x._2 == Some refers to the companion object of the Some class rather than to "any defined value", so the comparison is false for every element. The filter therefore keeps nothing, and first throws on the empty result.
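To see the difference without a cluster, here is a minimal sketch that uses a plain Scala List as a stand-in for the RDD. The object name PredicateCheck and the sample rows are made up for illustration; only the element type matches the question.

```scala
object PredicateCheck {
  // Hypothetical sample rows with the same shape as the leftOuterJoin result.
  val rows: List[(Int, (Array[String], Option[Array[String]]))] = List(
    (19, (Array("total-19"), Some(Array("product-19")))),
    (20, (Array("total-20"), None))
  )

  // x._2 is the inner tuple and Some is a companion object, so they are never equal.
  // The ascription to Any is only there so that the compiler does not object to
  // a comparison it can already tell is always false.
  val wrong = rows.filter(x => (x._2: Any) == Some)

  // The Option sits one level deeper, at x._2._2.
  val right = rows.filter(x => x._2._2.isDefined)

  def main(args: Array[String]): Unit = {
    println(wrong.size)       // prints 0
    println(right.map(_._1))  // prints List(19)
  }
}
```

The same reasoning should carry over to the RDD, since filter applies the predicate to each element in the same way.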
Try:
totalsAndProds.filter{ case (_, (_, s)) => s.isDefined }
Example below:
scala> val rdd = sc.parallelize(List((19, (Array("a"), Some(Array("a"))))))
rdd: org.apache.spark.rdd.RDD[(Int, (Array[String], Some[Array[String]]))] = ParallelCollectionRDD[0] at parallelize at <console>:24
scala> rdd.filter{ case (_, (_, s)) => s.isDefined }
res0: org.apache.spark.rdd.RDD[(Int, (Array[String], Some[Array[String]]))] = MapPartitionsRDD[1] at filter at <console>:27
scala> rdd.filter{ case (_, (_, s)) => s.isDefined }.collect
res1: Array[(Int, (Array[String], Some[Array[String]]))] = Array((19,(Array(a),Some([Ljava.lang.String;@5307fee))))
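If the goal is also to get rid of the Option wrapper, pattern matching on Some inside a partial function does the filtering and the unwrapping in one step. The sketch below again uses a plain List with made-up sample rows (the name UnwrapMatches is hypothetical); RDD also offers a collect overload that takes a partial function, so the same case clause should be usable there.

```scala
object UnwrapMatches {
  // Hypothetical sample rows in the shape produced by leftOuterJoin.
  val rows: List[(Int, (Array[String], Option[Array[String]]))] = List(
    (19, (Array("t19"), Some(Array("p19")))),
    (20, (Array("t20"), None))
  )

  // Rows whose right side is None do not match the pattern and are dropped;
  // matching rows come out with the Some already unwrapped.
  val matched: List[(Int, (Array[String], Array[String]))] =
    rows.collect { case (id, (total, Some(product))) => (id, (total, product)) }

  def main(args: Array[String]): Unit = {
    matched.foreach { case (id, (total, product)) =>
      println(s"$id ${total.mkString(",")} ${product.mkString(",")}")  // prints 19 t19 p19
    }
  }
}
```

Note that on an RDD, collect without arguments pulls everything to the driver, while the partial-function overload is a lazy transformation that returns another RDD.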