
Scala: Breaking out of foldLeft

Suppose we have a Seq: val ourSeq = Seq(10,5,3,5,4).

I want to return a new list that reads from the left and stops when it sees a duplicate number (e.g. Seq(10,5,3), since 5 is repeated).

I was thinking of using foldLeft, like this:

ourSeq.foldLeft(Seq())(op = (temp, curr) => {

  if (!temp.contains(curr)) {
    temp :+ curr 
  } else break

})

but as far as I understand, there is no way to break out of a foldLeft?

Although it can be accomplished with a foldLeft() without any breaking out, I would argue that fold is the wrong tool for the job.
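For reference, here is a minimal sketch of how the foldLeft approach mentioned above could look. It assumes the accumulator carries a Boolean flag (a detail chosen here for illustration, not taken from the question) that records whether a duplicate has already been seen. Note that foldLeft still visits every element after the flag is set.

```scala
val ourSeq = Seq(10, 5, 3, 5, 4)

// Accumulator: (elements kept so far, whether a duplicate has been seen).
// Once the flag is true, every remaining element is ignored, but foldLeft
// still walks over each of them.
val (prefix, _) = ourSeq.foldLeft((Vector.empty[Int], false)) {
  case ((kept, true), _) => (kept, true)
  case ((kept, false), curr) =>
    if (kept.contains(curr)) (kept, true) else (kept :+ curr, false)
}
```

Here prefix holds 10, 5, 3, but the fold has still stepped through the trailing 5 and 4.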

I'm rather fond of unfold(), which was introduced in Scala 2.13.0.

val ourSeq = Seq(10,5,3,5,4)
Seq.unfold((Set.empty[Int],ourSeq)){ case (seen,ns) =>
  Option.when(ns.nonEmpty && !seen(ns.head)) {
    (ns.head, (seen+ns.head, ns.tail))
  }
}
//res0: Seq[Int] = Seq(10, 5, 3)

You are correct that it's not possible to break out of foldLeft. It would theoretically be possible to get the correct result with foldLeft, but you're still going to iterate the whole data structure. It's better to use an algorithm that already understands how to terminate early, and since you want to take a prefix, takeWhile will suffice.

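import scala.collection.mutable

val ourSeq = Seq(10, 5, 3, 5, 4)

// Mutable set of the values seen so far; used only inside the predicate below.
val seen = mutable.Set.empty[Int]
val untilDups = ourSeq.takeWhile { x =>
  if (seen contains x) {
    false      // duplicate: stop taking elements
  } else {
    seen += x  // first occurrence: remember it and keep going
    true
  }
}
println(untilDups) // List(10, 5, 3)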

If you wanted to be totally immutable about this, you could wrap the whole thing in some kind of lazy fold that uses an immutable Set to keep its data. And that's certainly how I'd do it in Haskell. But this is Scala; we have mutability, and we may as well use it locally when it suits us.
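For comparison, one possible immutable variant is sketched below. It is not literally the lazy fold described above: it swaps in Iterator.scanLeft to build the running set of previously seen values and zips that with the elements. It assumes Scala 2.13, where Iterator.scanLeft is evaluated on demand, so takeWhile can stop early. The name seenBefore is made up for this sketch.

```scala
val ourSeq = Seq(10, 5, 3, 5, 4)

// seenBefore yields Set(), Set(10), Set(10, 5), ...:
// for each position, the set of values that came before it.
val seenBefore = ourSeq.iterator.scanLeft(Set.empty[Int])(_ + _)

// Pair each element with the values seen before it and stop at the first repeat.
val untilDups = ourSeq.iterator
  .zip(seenBefore)
  .takeWhile { case (x, seen) => !seen(x) }
  .map { case (x, _) => x }
  .toList
```

No mutable state is visible here, at the cost of being somewhat harder to read than the takeWhile version with a local mutable Set.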

This can be done using a recursive function:

def uniquePrefix[T](ourSeq: Seq[T]): List[T] = {
  @annotation.tailrec
  def loop(rem: List[T], res: List[T]): List[T] = 
    rem match {
      case hd::tail if !res.contains(hd) =>
        loop(tail, res :+ hd)
      case _ =>
        res
    }

  loop(ourSeq.toList, Nil)
}

This appears more complicated, but once you are familiar with the general pattern, recursive functions are simple to write and more powerful than fold operations.

If you are working on large collections, this version is more efficient because it is O(n):

def distinctPrefix[T](ourSeq: Seq[T]): List[T] = {
  @annotation.tailrec
  def loop(rem: List[T], found: Set[T], res: List[T]): List[T] = 
    rem match {
      case hd::tail if !found.contains(hd) =>
        loop(tail, found + hd, hd +: res)
      case _ =>
        res.reverse
    }

  loop(ourSeq.toList, Set.empty, Nil)
}

This version works with any Seq, and there are other options using Iterator etc., as described in the comments. You would need to be more specific about the type of the collection in order to create an optimised algorithm.

def uniquePrefix[T](ourSeq: Seq[T]): List[T] = {
  @annotation.tailrec
  def loop(rem: Seq[T], res: List[T]): List[T] = 
    rem.take(1) match {
      case Seq(hd) if !res.contains(hd) =>
        loop(rem.drop(1), res :+ hd)
      case _ =>
        res
    }

  loop(ourSeq, Nil)
}
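As an illustration of the Iterator option mentioned above, here is a rough sketch. It is not the code from the comments: the name distinctPrefixIter and the use of a mutable HashSet with a List builder are assumptions made for this example, and IterableOnce requires Scala 2.13.

```scala
import scala.collection.mutable

// Hypothetical helper: consumes any IterableOnce and stops pulling elements
// as soon as a repeated value is found.
def distinctPrefixIter[T](xs: IterableOnce[T]): List[T] = {
  val it = xs.iterator
  val seen = mutable.HashSet.empty[T]
  val out = List.newBuilder[T]
  var going = true
  while (going && it.hasNext) {
    val x = it.next()
    // HashSet.add returns false when the value was already present.
    if (seen.add(x)) out += x else going = false
  }
  out.result()
}
```

Calling distinctPrefixIter(Seq(10, 5, 3, 5, 4)) gives the same 10, 5, 3 prefix, and because it only needs an iterator it also accepts lazily produced input.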

Another option is to use the function inits:

ourSeq.inits.dropWhile(curr => curr.distinct.size != curr.size).next()

Code run at Scastie.
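To see why the inits one-liner works, it may help to look at what inits yields: every prefix of the sequence, longest first, so the first prefix without repeats is the answer. A small sketch (the names prefixes and firstDistinct are only for illustration):

```scala
val ourSeq = Seq(10, 5, 3, 5, 4)

// inits yields every prefix, from the full sequence down to the empty one.
val prefixes = ourSeq.inits.toList

// The first prefix whose elements are all distinct is the one we want.
val firstDistinct = prefixes.find(p => p.distinct.size == p.size)
```

Since each candidate prefix is checked with distinct, this approach can do quadratic work in the worst case, so it is best suited to short sequences.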
