How can I trick Scala's map method into producing more than one output per input item?

Question

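A fairly complex algorithm is applied to a list of Spark Dataset rows (the list is obtained using groupByKey and flatMapGroups). Most rows are transformed 1:1 from input to output, but in some cases more than one output per input is required. The input row schema can change at any time. map() fits the 1:1 transformation requirement very well, but is there a way to use it to produce 1:n output?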

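The only workaround I have found relies on the foreach method, which carries unpleasant overhead because an initial empty list has to be created (remember that, unlike in the simplified example below, the real-life list structure changes unpredictably).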

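My original problem is too complex to share here, but this example demonstrates the concept. Take a list of integers. Each one should be transformed into its square, and if the input is even, it should also be transformed into half of the original value: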

val X = Seq(1, 2, 3, 4, 5)

val y = X.map(x => x * x) //map is intended for 1:1 transformation so it works great here

val z = X.map(x => for(n <- 1 to 5) (n, x * x)) //this attempt FAILS - a for loop without yield returns Unit, so z is a list of five empty () values

// this workaround works, but the newX definition is problematic
var newX = List[Int]() //in reality defining as head of the input list and dropping result's tail at the end
// foreach returns Unit, so there is nothing useful to assign its result to
X.foreach(x => {
  newX = x*x :: newX
  if(x % 2 == 0) newX = (x / 2) :: newX
})

newX

Is there a better way than the foreach construct?

Answer 1

.flatMap produces any number of outputs from a single input.

val X = Seq(1, 2, 3, 4, 5)

X.flatMap { x =>
  // every input yields its square; even inputs also yield half of the original value
  if (x % 2 == 0) Seq(x * x, x / 2) else Seq(x * x)
}
// => Seq[Int] = List(1, 4, 1, 9, 16, 2, 25)

flatMap in more detail

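In X.map(f), f is a function that maps each input to a single output. By contrast, in X.flatMap(g), the function g maps each input to a sequence of outputs. flatMap then takes all the sequences produced (one for each element of X) and concatenates them.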
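To make the "map, then concatenate" description concrete, here is a minimal sketch using the squares-and-halves rule from the question. The names xs, g, viaMapFlatten and viaFlatMap are made up for this illustration and do not come from the original post.

```scala
// Illustrative names only; none of these appear in the original post.
val xs = Seq(1, 2, 3, 4, 5)

// g maps each input to a sequence: always the square, plus the half for even inputs.
val g: Int => Seq[Int] = x => if (x % 2 == 0) Seq(x * x, x / 2) else Seq(x * x)

// map alone gives a nested Seq[Seq[Int]]; flatten concatenates the inner sequences.
val viaMapFlatten = xs.map(g).flatten

// flatMap does both steps in a single call.
val viaFlatMap = xs.flatMap(g)
```

Both values should hold the same elements, which is one way to read flatMap: it behaves like map followed by flatten.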

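The neat thing is that .flatMap works not only for sequences, but for all sequence-like objects. For an Option, for example, Option(x)#flatMap(g) allows g to return an Option. Similarly, Future(x)#flatMap(g) allows g to return a Future.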
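As a rough illustration of the Option case, the sketch below chains two steps that can each fail. The helpers parseInt and half are hypothetical and exist only for this example; the point is that the function passed to flatMap returns an Option itself, and the result stays a flat Option rather than a nested Option[Option[Int]].

```scala
import scala.util.Try

// Hypothetical helper: parse a string, yielding None if it is not a valid Int.
def parseInt(s: String): Option[Int] = Try(s.toInt).toOption

// Hypothetical helper: halve a number, but only when it is even.
def half(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None

// flatMap chains the two steps; a None at either step makes the whole result None.
val evenInput = parseInt("8").flatMap(half)
val oddInput  = parseInt("7").flatMap(half)
val badInput  = parseInt("abc").flatMap(half)
```

Here an Option plays the role of a sequence with zero or one element, so "the number of outputs depends on the input" still applies.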

Whenever the number of elements you return depends on the input, you should think of flatMap.
