简体   繁体   中英

How to write efficient list/seq functions in F#? (mapFoldWhile)

I was trying to write a generic mapFoldWhile function, which is just mapFold but requires the state to be an option and stops as soon as it encounters a None state.

I don't want to use mapFold because it will transform the entire list, but I want it to stop as soon as an invalid state (ie None ) is found.

This was myfirst attempt:

let mapFoldWhile (f : 'State option -> 'T -> 'Result * 'State option) (state : 'State option) (list : 'T list) =
  let rec mapRec f state list results =
    match list with 
    | [] -> (List.rev results, state)
    | item :: tail -> 
      let (result, newState) = f state item
      match newState with 
      | Some x -> mapRec f newState tail (result :: results)
      | None -> ([], None)
  mapRec f state list []

The List.rev irked me, since the point of the exercise was to exit early and constructing a new list ought to be even slower.

So I looked up what F#'s very own map does, which was:

let map f list = Microsoft.FSharp.Primitives.Basics.List.map f list

The ominous Microsoft.FSharp.Primitives.Basics.List.map can be found here and looks like this:

let map f x = 
    match x with
    | [] -> []
    | [h] -> [f h]
    | (h::t) -> 
        let cons = freshConsNoTail (f h)
        mapToFreshConsTail cons f t
        cons

The consNoTail stuff is also in this file:

// optimized mutation-based implementation. This code is only valid in fslib, where mutation of private
// tail cons cells is permitted in carefully written library code.
let inline setFreshConsTail cons t = cons.(::).1 <- t
let inline freshConsNoTail h = h :: (# "ldnull" : 'T list #)

So I guess it turns out that F#'s immutable lists are actually mutable because performance? I'm a bit worried about this, having used the prepend-then-reverse list approach as I thought it was the "way to go" in F#.

I'm not very experienced with F# or functional programming in general, so maybe (probably) the whole idea of creating a new mapFoldWhile function is the wrong thing to do, but then what am I to do instead?

I often find myself in situations where I need to "exit early" because a collection item is "invalid" and I know that I don't have to look at the rest. I'm using List.pick or Seq.takeWhile in some cases, but in other instances I need to do more ( mapFold ).

Is there an efficient solution to this kind of problem ( mapFoldWhile in particular and "exit early" in general) with functional programming concepts, or do I have to switch to an imperative solution / use a Collections.Generics.List ?

In most cases, using List.rev is a perfectly sufficient solution.

You are right that the F# core library uses mutation and other dirty hacks to squeeze some more performance out of the F# list operations, but I think the micro-optimizations done there are not particularly good example. F# list functions are used almost everywhere so it might be a good trade-off, but I would not follow it in most situations.

Running your function with the following:

let l = [ 1 .. 1000000 ]

#time 
mapFoldWhile (fun s v -> 0, s) (Some 1) l

I get ~240ms on the second line when I run the function without changes. When I just drop List.rev (so that it returns the data in the other order), I get around ~190ms. If you are really calling the function frequently enough that this matters, then you'd have to use mutation (actually, your own mutable list type), but I think that is rarely worth it.

For general "exit early" problems, you can often write the code as a composition of Seq.scan and Seq.takeWhile . For example, say you want to sum numbers from a sequence until you reach 1000. You can write:

input
|> Seq.scan (fun sum v -> v + sum) 0
|> Seq.takeWhile (fun sum -> sum < 1000)

Using Seq.scan generates a sequence of sums that is over the whole input, but since this is lazily generated, using Seq.takeWhile stops the computation as soon as the exit condition happens.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM