Spark java.lang.UnsupportedOperationException: empty collection
When I run this code, I get an empty collection error in some cases:
val result = df
.filter(col("channel_pk") === "abc")
.groupBy("member_PK")
.agg(sum(col("price") * col("quantityOrdered")) as "totalSum")
.select("totalSum")
  .rdd.map(_(0).asInstanceOf[Double]).reduce(_ + _)
The error happens at this line:
.rdd.map(_(0).asInstanceOf[Double]).reduce(_ + _)
When the collection is empty, I want result to be equal to 0. How can I do that?
The error appears only at that line because that is the first time you perform an action; before that, Spark doesn't execute anything (laziness). Your df is simply empty. You can verify this by adding the following before the aggregation:
assert(!df.take(1).isEmpty)
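The same failure mode can be reproduced without Spark: Scala's own `reduce` throws `java.lang.UnsupportedOperationException` on an empty collection, because it has no starting value to return, whereas `fold` with an explicit zero element does not. A minimal sketch, using plain Scala collections as a stand-in for the RDD:

```scala
import scala.util.Try

// reduce has no starting value, so on an empty collection it
// throws UnsupportedOperationException -- the same exception
// class the Spark job reports.
val failed = Try(List.empty[Double].reduce(_ + _))
println(failed.isFailure) // true

// fold takes an explicit zero element, so an empty collection
// simply yields that zero.
val total = List.empty[Double].fold(0.0)(_ + _)
println(total) // 0.0
```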
When collection is empty, I want result to be equal to 0. How can I do it?
Before you do the aggregation, just check whether the dataframe has any rows:
val result = if(df.take(1).isEmpty) 0 else df
.filter(col("channel_pk") === "abc")
.groupBy("member_PK")
.agg(sum(col("price") * col("quantityOrdered")) as "totalSum")
.select("totalSum")
.rdd.map(_(0).asInstanceOf[Double]).reduce(_ + _)
Or you can use count instead:
val result = if(df.count() == 0) 0 else df
.filter(col("channel_pk") === "abc")
.groupBy("member_PK")
.agg(sum(col("price") * col("quantityOrdered")) as "totalSum")
.select("totalSum")
.rdd.map(_(0).asInstanceOf[Double]).reduce(_ + _)
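Both checks above run an extra Spark job just to test for emptiness. A possible alternative (my suggestion, not part of the original answer) is to replace `reduce` with `fold`, which takes a zero element and therefore returns 0.0 on an empty RDD with no separate check; `RDD.fold` mirrors the collection method sketched here:

```scala
// Hypothetical helper standing in for the values produced by
// df.select("totalSum").rdd.map(_(0).asInstanceOf[Double]).
// fold's zero element makes the empty case return 0.0 directly,
// so no take(1)/count guard is needed.
def sumOrZero(values: Seq[Double]): Double =
  values.fold(0.0)(_ + _)

println(sumOrZero(Seq(1.5, 2.5))) // 4.0
println(sumOrZero(Seq.empty))     // 0.0
```

On the actual pipeline this would be a one-line change: `.rdd.map(_(0).asInstanceOf[Double]).fold(0.0)(_ + _)`.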