MailboxProcessor performance problems

Question

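I have been trying to design a system that allows a large number of concurrent users to be represented in memory at the same time. When I set out to design this system, I immediately thought of some sort of actor-based solution akin to Erlang.

The system has to be done in .NET, so I started working on a prototype in F# using MailboxProcessor, but I have run into serious performance problems. My initial idea was to use one actor (MailboxProcessor) per user to serialize communication for that user.

I have isolated a small piece of code that reproduces the problem I am seeing: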

open System.Threading;
open System.Diagnostics;

type Inc() =

    let mutable n = 0;
    let sw = new Stopwatch()

    member x.Start() =
        sw.Start()

    member x.Increment() =
        if Interlocked.Increment(&n) >= 100000 then
            printf "UpdateName Time %A" sw.ElapsedMilliseconds

type Message
    = UpdateName of int * string

type User = {
    Id : int
    Name : string
}

[<EntryPoint>]
let main argv = 

    let sw = Stopwatch.StartNew()
    let incr = new Inc()
    let mb = 

        Seq.initInfinite(fun id -> 
            MailboxProcessor<Message>.Start(fun inbox -> 

                let rec loop user =
                    async {
                        let! m = inbox.Receive()

                        match m with
                        | UpdateName(id, newName) ->
                            let user = {user with Name = newName};
                            incr.Increment()
                            do! loop user
                    }

                loop {Id = id; Name = sprintf "User%i" id}
            )
        ) 
        |> Seq.take 100000
        |> Array.ofSeq

    printf "Create Time %i\n" sw.ElapsedMilliseconds
    incr.Start()

    for i in 0 .. 99999 do
        mb.[i % mb.Length].Post(UpdateName(i, sprintf "User%i-UpdateName" i));

    System.Console.ReadLine() |> ignore

    0

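Just creating the 100k actors takes around 800 ms on my quad-core i7. Submitting the UpdateName message to each of the actors and waiting for them to complete then takes about 1.8 seconds.

Now, I realize there is overhead from all the queueing: on the ThreadPool, setting/resetting AutoResetEvents internally in the MailboxProcessor, and so on. But is this really the expected performance? From reading MSDN and various blogs on MailboxProcessor, I got the idea that it is meant to be akin to Erlang actors, but given the abysmal performance I am seeing, that does not seem to hold true in reality?

I also tried a modified version of the code that uses 8 MailboxProcessors, each holding a Map<int, User> used to look up a user by id. It yielded some improvement, bringing the total time for the UpdateName operation down to 1.2 seconds. But it still feels very slow. The modified code is here: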

open System.Threading;
open System.Diagnostics;

type Inc() =

    let mutable n = 0;
    let sw = new Stopwatch()

    member x.Start() =
        sw.Start()

    member x.Increment() =
        if Interlocked.Increment(&n) >= 100000 then
            printf "UpdateName Time %A" sw.ElapsedMilliseconds

type Message
    = CreateUser of int * string
    | UpdateName of int * string

type User = {
    Id : int
    Name : string
}

[<EntryPoint>]
let main argv = 

    let sw = Stopwatch.StartNew()
    let incr = new Inc()
    let mb = 

        Seq.initInfinite(fun id -> 
            MailboxProcessor<Message>.Start(fun inbox -> 

                let rec loop users =
                    async {
                        let! m = inbox.Receive()

                        match m with
                        | CreateUser(id, name) ->
                            do! loop (Map.add id {Id=id; Name=name} users)

                        | UpdateName(id, newName) ->
                            match Map.tryFind id users with
                            | None -> 
                                do! loop users

                            | Some(user) ->
                                incr.Increment()
                                do! loop (Map.add id {user with Name = newName} users)
                    }

                loop Map.empty
            )
        ) 
        |> Seq.take 8
        |> Array.ofSeq

    printf "Create Time %i\n" sw.ElapsedMilliseconds

    for i in 0 .. 99999 do
        mb.[i % mb.Length].Post(CreateUser(i, sprintf "User%i-UpdateName" i));

    incr.Start()

    for i in 0 .. 99999 do
        mb.[i % mb.Length].Post(UpdateName(i, sprintf "User%i-UpdateName" i));

    System.Console.ReadLine() |> ignore

    0

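So my question is: am I doing something wrong? Have I misunderstood how MailboxProcessor is supposed to be used? Or is this the expected performance?

Update:

I got hold of some people at ##fsharp @ irc.freenode.net, who informed me that sprintf is very slow, and as it turns out that is where a large part of my performance problem came from. However, after removing the sprintf calls above and using the same name for every User, I still end up with about 400 ms for the operations, which feels really slow.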
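The sprintf cost mentioned in the update can be checked in isolation. The following is a minimal sketch, not part of the original benchmark: the helper name `time` and the comparison against plain string concatenation are assumptions made for illustration, and the timings will vary by machine and F# version.

```fsharp
open System.Diagnostics

// Hypothetical helper: build 100,000 names with the given function,
// print the elapsed time, and return the last name that was built.
let time (label: string) (build: int -> string) =
    let n = 100000
    let sw = Stopwatch.StartNew()
    let mutable last = ""
    for i in 0 .. n - 1 do
        last <- build i
    printfn "%s: %fs" label sw.Elapsed.TotalSeconds
    last

// Same string, built two ways: format-based vs. plain concatenation.
let viaSprintf = time "sprintf" (fun i -> sprintf "User%i-UpdateName" i)
let viaConcat = time "concat " (fun i -> "User" + string i + "-UpdateName")
```

Both calls produce identical strings, so any difference in the printed times comes from how the string is built rather than from MailboxProcessor.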

Answer 1

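"Now, I realize there is overhead from all the queueing: on the ThreadPool, setting/resetting AutoResetEvents internally in the MailboxProcessor, and so on."

And from printf, Map, Seq and contention for your global mutable Inc. You are also leaking heap-allocated stack frames. In fact, only a small proportion of the time taken to run your benchmark has anything to do with MailboxProcessor.

"But is this really the expected performance?"

I am not surprised by the performance of your program, but it does not say much about the performance of MailboxProcessor.

"From reading MSDN and various blogs on MailboxProcessor, I got the idea that it is meant to be akin to Erlang actors, but given the abysmal performance I am seeing, that does not seem to hold true in reality?"

MailboxProcessor is conceptually somewhat similar to part of Erlang. The abysmal performance you are seeing is due to a variety of things, some of which are quite subtle and will affect any such program.

"So my question is: am I doing something wrong?"

I think you are doing a few things wrong. Firstly, the problem you are trying to solve is not clear, so this sounds like an XY problem. Secondly, you are trying to benchmark the wrong things (for example, you are complaining about the microseconds required to create a MailboxProcessor, but you may intend to create one only when a TCP connection is established, which takes several orders of magnitude longer). Thirdly, you have written a benchmark program that measures the performance of some things but have attributed your observations to completely different things.

Let's look at your benchmark program in more detail. Before we do anything else, let's fix some bugs. You should always use sw.Elapsed.TotalSeconds to measure time because it is more precise. You should always recur in an async workflow using return! and not do!, or you will leak stack frames.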
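A minimal sketch of both fixes together, assuming a hypothetical counting agent (the names `Msg`, `Incr`, `Get` and `makeCounter` are illustrative and do not come from the question): the loop recurs with return!, and elapsed time is read from Elapsed.TotalSeconds.

```fsharp
open System.Diagnostics

// Hypothetical message type, for illustration only.
type Msg =
    | Incr
    | Get of AsyncReplyChannel<int>

// Hypothetical agent that counts the Incr messages it receives.
let makeCounter () =
    MailboxProcessor<Msg>.Start(fun inbox ->
        let rec loop n =
            async {
                let! m = inbox.Receive()
                match m with
                | Incr ->
                    // Recur with return!, not do!, as the answer advises.
                    return! loop (n + 1)
                | Get reply ->
                    reply.Reply n
                    return! loop n
            }
        loop 0)

let sw = Stopwatch.StartNew()
let counter = makeCounter ()
for _ in 1 .. 100000 do
    counter.Post Incr
// A mailbox is FIFO, so this reply arrives only after every Incr was handled.
let total = counter.PostAndReply(fun reply -> Get reply)
printfn "%d messages in %fs" total sw.Elapsed.TotalSeconds
```

Using PostAndReply as the final message is one simple way to wait for a single agent to drain its queue without a shared counter.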

My initial timings are:

Creation stage: 0.858s
Post stage: 1.18s

Next, let's run a profile to make sure our program really is spending most of its time thrashing the F# MailboxProcessor:

77%    Microsoft.FSharp.Core.PrintfImpl.gprintf(...)
 4.4%  Microsoft.FSharp.Control.MailboxProcessor`1.Post(!0)

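Clearly not what we had hoped. Thinking more abstractly, we are generating lots of data using things like sprintf and then applying it, but we are doing the generation and the application together. Let's separate out our initialization code: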

let ids = Array.init 100000 (fun id -> {Id = id; Name = sprintf "User%i" id})
...
    ids
    |> Array.map (fun id ->
        MailboxProcessor<Message>.Start(fun inbox -> 
...
            loop id
...
    printf "Create Time %fs\n" sw.Elapsed.TotalSeconds
    let fxs =
      [|for i in 0 .. 99999 ->
          mb.[i % mb.Length].Post, UpdateName(i, sprintf "User%i-UpdateName" i)|]
    incr.Start()
    for f, x in fxs do
      f x
...

Now we get:

Creation stage: 0.538s
Post stage: 0.265s

So creation is 60% faster and posting is 4.5x faster.

Let's try completely rewriting your benchmark:

do
  for nAgents in [1; 10; 100; 1000; 10000; 100000] do
    let timer = System.Diagnostics.Stopwatch.StartNew()
    use barrier = new System.Threading.Barrier(2)
    let nMsgs = 1000000 / nAgents
    let nAgentsFinished = ref 0
    let makeAgent _ =
      new MailboxProcessor<_>(fun inbox ->
        let rec loop n =
          async { let! () = inbox.Receive()
                  let n = n+1
                  if n=nMsgs then
                    let n = System.Threading.Interlocked.Increment nAgentsFinished
                    if n = nAgents then
                      barrier.SignalAndWait()
                  else
                    return! loop n }
        loop 0)
    let agents = Array.init nAgents makeAgent
    for agent in agents do
      agent.Start()
    printfn "%fs to create %d agents" timer.Elapsed.TotalSeconds nAgents
    timer.Restart()
    for _ in 1..nMsgs do
      for agent in agents do
        agent.Post()
    barrier.SignalAndWait()
    printfn "%fs to post %d msgs" timer.Elapsed.TotalSeconds (nMsgs * nAgents)
    timer.Restart()
    for agent in agents do
      use agent = agent
      ()
    printfn "%fs to dispose of %d agents\n" timer.Elapsed.TotalSeconds nAgents

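This version waits until an agent has received nMsgs messages before that agent increments the shared counter, which greatly reduces the performance impact of the shared counter. The program also examines performance with different numbers of agents. On this machine I get: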

Agents  M msgs/s
     1    2.24
    10    6.67
   100    7.58
  1000    5.15
 10000    1.15
100000    0.36
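The benchmark above prints elapsed seconds, while the table reports millions of messages per second. The conversion is presumably along the following lines; the helper name `throughput` is an assumption, not something taken from the benchmark.

```fsharp
// Hypothetical helper: convert the measured post time into millions of
// messages per second, using the same message count as the benchmark above.
let throughput (nAgents: int) (seconds: float) =
    let nMsgs = 1000000 / nAgents           // messages sent to each agent
    float (nMsgs * nAgents) / seconds / 1e6 // total messages / time, in millions

// Example: 100 agents finishing in half a second is 2 M msgs/s.
let example = throughput 100 0.5
```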

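So it appears that part of the reason for the low msg/s rate you are seeing is the unusually large number (100,000) of agents. With 10 to 1,000 agents, the F# implementation is over 10x faster than with 100,000 agents.

So if you can make do with this kind of performance, you should be able to write your entire application in F#. If you need to eke out much more performance, I would recommend a different approach. You might not even have to give up F# (and you can certainly use it for prototyping) if you adopt a design like the Disruptor. In practice, I have found that the time spent on serialization on .NET tends to be much larger than the time spent in F# async and MailboxProcessor.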

Answer 2

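After eliminating sprintf, I got about 12 seconds (Mono on Mac is not that fast). Taking Phil Trelford's suggestion of using a Dictionary instead of a Map, it went down to 600 ms. I haven't tried it on Windows/.NET.

The code change is simple enough, and local mutability is perfectly acceptable to me: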

let mb = 
    Seq.initInfinite(fun id -> 
        MailboxProcessor<Message>.Start(fun inbox -> 
            let di = System.Collections.Generic.Dictionary<int,User>()
            let rec loop () =
                async {
                    let! m = inbox.Receive()

                    match m with
                    | CreateUser(id, name) ->
                        di.Add(id, {Id=id; Name=name})
                        return! loop ()

                    | UpdateName(id, newName) ->
                        match di.TryGetValue id with
                        | false, _ -> 
                            return! loop ()

                        | true, user ->
                            incr.Increment()
                            di.[id] <- {user with Name = newName}
                            return! loop ()
                }

            loop ()
        )
    ) 
    |> Seq.take 8
    |> Array.ofSeq


Answer 1
16 2013-07-01 09:25:13

Answer 2
2 2013-06-28 20:38:58
