简体   繁体   中英

What is the equivalent of Scala's Seq.Span in F#?

To quote Scala's documentation :

def span(p: (A) => Boolean): (Seq[A], Seq[A])

Splits this iterable collection into a prefix/suffix pair according to a predicate.

Note: c span p is equivalent to (but possibly more efficient than) (c takeWhile p, c dropWhile p), provided the evaluation of the predicate p does not cause any side-effects.

Note: might return different results for different runs, unless the underlying collection type is ordered.

  • p - the test predicate
  • returns - a pair consisting of the longest prefix of this iterable collection whose elements all satisfy p, and the rest of this iterable collection.
  • Definition Classes - IterableOps → IterableOnceOps
  • Note - Reuse: After calling this method, one should discard the iterator it was called on, and use only the iterators that were returned. Using the old iterator is undefined, subject to change, and may result in changes to the new iterators as well.

When looking at the F# documentation for Seq I don't see any equivalent.

groupBy, partition, splitAt, none of them match what I'm trying to do. It's like doing a takeWhile and skipWhile at the same time but instead of needing two iterations you only need one where the function would return a tuple of (takeWhile, skipWhile).

The output should match the following function

module List
let span (predicate: 'a -> bool) (list: 'a list): ('a list * 'a list) =
    (list |> List.takeWhile predicate, list |> List.skipWhile predicate)

But only need one iteration since my sequence may be infinite.

[1;2;3;4] |> List.span (fun i -> i % 2 = 1) => ([1], [2;3;4])

Your explanation is not what Seq.span does in Scala: it partitions one sequence in two, putting all input elements into first tuple value as long as predicate function returns true. Once function returns false, all remaining elements will be pushed into second tuple value.

Example in F# would be like:

[1;2;3;4] |> Seq.span (fun i -> i % 2 = 1) => ([1], [2;3;4])

This can be achieved quite easily using mutualy recursive functions:

module Seq

open System.Collections.Generic

let span (predicate: 'a -> bool) (seq: 'a seq): ('a seq * 'a seq) =
    let rec insertLeft predicate (e: IEnumerator<'a>) (left: ResizeArray<'a>) (right: ResizeArray<'a>) =
        if e.MoveNext() then
            if predicate e.Current then
                left.Add e.Current
                insertLeft predicate e left right
            else
                // once predicate returned false, all consecutive elements land in right list
                right.Add e.Current 
                insertRight e right
    and insertRight (e: IEnumerator<'a>) (right: ResizeArray<'a>) =
        if e.MoveNext() then 
            right.Add e.Current
            insertRight e right
    let left = ResizeArray<_>()
    let right = ResizeArray<_>()
    use enumerator = seq.GetEnumerator()
    insertLeft predicate enumerator left right
    (upcast left, upcast right)

I have a helper function which I find quite useful for this sort of thing:

module Seq =
    let groupAdjacentBy f xs =
        let mutable prevKey, i = None, 0
        xs
        |> Seq.groupBy (fun x ->
            let key = f x
            if prevKey <> Some key then
                i <- i + 1
                prevKey <- Some key
            (i, key))
        |> Seq.map (fun ((_, k), v) -> (k, v))

Note that it uses locally contained mutation in the implementation because it's the simplest way to reuse the existing Seq.groupBy .

This is actually a grouping function but it only puts items in the same group if they are adjacent to each other. To my mind this is a very general way of solving problems that need several uses of takeWhile and skipWhile , but simpler because it's all done in one pass. The grouping function returns a group key of any type rather than just a boolean, adding more flexibility.

Here's an example usage with a grouping function that returns a boolean:

[ 1; 2; -1; -2; 3; 4; -5 ]
|> Seq.groupAdjacentBy (fun x -> x > 0) // positive?
|> Seq.map snd
// seq [seq [1; 2]; seq [-1; -2]; seq [3; 4]; seq [-5]]

In this example the first two lines return the groups with their keys ( true , false , true , false respectively). You can then use these keys in your logic, but if you don't care about them then the Seq.map snd will discard them. And you're left with a seq<seq<int>> as shown above.

This is what I've come up with, it's not a great answer because it requires you eagerly iterate through the first one so if you return true long enough you will blow without of memory. I will leave the question open for a few more days and if I don't see a better answer I will mark this one. Again I would really like a better answer that completely works with an infinite sequence in any situation.

module Seq

let span (predicate: 'a -> bool) (sequence: 'a seq): ('a seq * 'a seq) =
    let enumerator = sequence.GetEnumerator()
    let isNotDone = ref (enumerator.MoveNext())
    let first = seq {
        let e = enumerator
        if !isNotDone then
            while (!isNotDone && predicate e.Current) do
                yield enumerator.Current
                isNotDone := e.MoveNext() }
    let second = seq {
        use e = enumerator
        if !isNotDone then
            yield e.Current
            while e.MoveNext() do
                yield e.Current }
    let eagerFirst = List.toSeq (Seq.toList first)
    (eagerFirst, second)

Looks like you want something like the following, you'd just pass the same Seq (or List or whatever) to filter twice as follows:

//Returns a tuple of seq<'a> * seq<'a> (which looks like what you want).
let scalaSpanTuple test sq = (Seq.filter test sq, Seq.filter (test>>not) sq)

or:

//Returns a list of 2 lists of 'a
let scalaSpanList test ls = [List.filter test ls; List.filter (test>>not) ls]

etc.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM