Spark java.lang.UnsupportedOperationException: empty collection
When I run this code, I get an empty collection error in some cases:
val result = df
.filter(col("channel_pk") === "abc")
.groupBy("member_PK")
.agg(sum(col("price") * col("quantityOrdered")) as "totalSum")
.select("totalSum")
  .rdd.map(_(0).asInstanceOf[Double]).reduce(_ + _)
The error happens at this line:
.rdd.map(_(0).asInstanceOf[Double]).reduce(_ + _)
When the collection is empty, I want result to be equal to 0. How can I do that?
The error appears only at that line because that is the first time you perform an action; before that, Spark doesn't execute anything (laziness). Your df is simply empty. You can verify this by adding the following before the aggregation:
assert(!df.take(1).isEmpty)
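The same failure mode can be reproduced without Spark: Scala's own `reduce` throws `java.lang.UnsupportedOperationException` on an empty collection, because it has no starting value to return, whereas `fold` with an explicit zero element does not. A minimal sketch, using plain Scala collections as a stand-in for the RDD:

```scala
import scala.util.Try

// reduce has no starting value, so on an empty collection it
// throws UnsupportedOperationException -- the same exception
// class the Spark job reports.
val failed = Try(List.empty[Double].reduce(_ + _))
println(failed.isFailure) // true

// fold takes an explicit zero element, so an empty collection
// simply yields that zero.
val total = List.empty[Double].fold(0.0)(_ + _)
println(total) // 0.0
```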
When collection is empty, I want result to be equal to 0. How can I do it?
Before you do the aggregation, just check whether the dataframe has any rows:
val result = if(df.take(1).isEmpty) 0 else df
.filter(col("channel_pk") === "abc")
.groupBy("member_PK")
.agg(sum(col("price") * col("quantityOrdered")) as "totalSum")
.select("totalSum")
.rdd.map(_(0).asInstanceOf[Double]).reduce(_ + _)
Or you can use count instead:
val result = if(df.count() == 0) 0 else df
.filter(col("channel_pk") === "abc")
.groupBy("member_PK")
.agg(sum(col("price") * col("quantityOrdered")) as "totalSum")
.select("totalSum")
.rdd.map(_(0).asInstanceOf[Double]).reduce(_ + _)
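Both checks above run an extra Spark job just to test for emptiness. A possible alternative (my suggestion, not part of the original answer) is to replace `reduce` with `fold`, which takes a zero element and therefore returns 0.0 on an empty RDD with no separate check; `RDD.fold` mirrors the collection method sketched here:

```scala
// Hypothetical helper standing in for the values produced by
// df.select("totalSum").rdd.map(_(0).asInstanceOf[Double]).
// fold's zero element makes the empty case return 0.0 directly,
// so no take(1)/count guard is needed.
def sumOrZero(values: Seq[Double]): Double =
  values.fold(0.0)(_ + _)

println(sumOrZero(Seq(1.5, 2.5))) // 4.0
println(sumOrZero(Seq.empty))     // 0.0
```

On the actual pipeline this would be a one-line change: `.rdd.map(_(0).asInstanceOf[Double]).fold(0.0)(_ + _)`.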