
How to dynamically add items to a PowerShell ArrayList and process them recursively using a runspace pool?

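I have a for loop that iterates over an ArrayList and, while doing so, adds more items to the list and processes those as well (iteratively). I am trying to convert this function to run concurrently using a RunspacePool.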

Here is the normal code, without runspaces:

$array = [System.Collections.ArrayList]@(1, 2, 3, 4, 5)
Write-Host "Number of items in array before loop: $($array.Count)"
for ($i = 0; $i -lt $array.Count; $i++) {
    Write-Host "Counter: $i`tArray: $array"
    if ($array[$i] -in @(1, 2, 3, 4, 5)) {
        $array.Add($array[$i] + 3) | Out-Null
    }
}
Write-Host "Array: $array"
Write-Host "Number of items in array after loop: $($array.Count)"

The output is:

Number of items in array before loop: 5
Counter: 0      Array: 1 2 3 4 5
Counter: 1      Array: 1 2 3 4 5 4
Counter: 2      Array: 1 2 3 4 5 4 5
Counter: 3      Array: 1 2 3 4 5 4 5 6
Counter: 4      Array: 1 2 3 4 5 4 5 6 7
Counter: 5      Array: 1 2 3 4 5 4 5 6 7 8
Counter: 6      Array: 1 2 3 4 5 4 5 6 7 8 7
Counter: 7      Array: 1 2 3 4 5 4 5 6 7 8 7 8
Counter: 8      Array: 1 2 3 4 5 4 5 6 7 8 7 8
Counter: 9      Array: 1 2 3 4 5 4 5 6 7 8 7 8
Counter: 10     Array: 1 2 3 4 5 4 5 6 7 8 7 8
Counter: 11     Array: 1 2 3 4 5 4 5 6 7 8 7 8
Array: 1 2 3 4 5 4 5 6 7 8 7 8
Number of items in array after loop: 12

Here is the runspace version I am trying to implement:

$pool = [RunspaceFactory]::CreateRunspacePool(1, 10)
$pool.Open()
$runspaces = @()

$scriptblock = {
    Param ($i, $array)
    # Start-Sleep 1 # <------ Output varies significantly if this is enabled
    Write-Output "$i value: $array"
    if ($i -in @(1, 2, 3, 4, 5)) {
        $array.Add($i + 3) | Out-Null
    }
}

$array = [System.Collections.ArrayList]::Synchronized(([System.Collections.ArrayList]$(1, 2, 3, 4, 5)))
Write-Host "Number of items in array before loop: $($array.Count)"
for ($i = 0; $i -lt $array.Count; $i++) {
    $runspace = [PowerShell]::Create().AddScript($scriptblock).AddArgument($array[$i]).AddArgument($array)
    $runspace.RunspacePool = $pool
    $runspaces += [PSCustomObject]@{ Pipe = $runspace; Status = $runspace.BeginInvoke() }
}

while ($runspaces.Status -ne $null) {
    $completed = $runspaces | Where-Object { $_.Status.IsCompleted -eq $true }
    foreach ($runspace in $completed) {
        $runspace.Pipe.EndInvoke($runspace.Status)
        $runspace.Status = $null
    }
}
Write-Host "array: $array"
Write-Host "Number of items in array after loop: $($array.Count)"
$pool.Close()
$pool.Dispose()

The output without the Start-Sleep call is as expected:

Number of items in array before loop: 5
Current value: 1        Array: 1 2 3 4 5
Current value: 2        Array: 1 2 3 4 5 4
Current value: 3        Array: 1 2 3 4 5 4 5
Current value: 4        Array: 1 2 3 4 5 4 5 6
Current value: 5        Array: 1 2 3 4 5 4 5 6 7
Current value: 4        Array: 1 2 3 4 5 4 5 6 7 8
Current value: 5        Array: 1 2 3 4 5 4 5 6 7 8 7
Current value: 6        Array: 1 2 3 4 5 4 5 6 7 8 7
Current value: 7        Array: 1 2 3 4 5 4 5 6 7 8 7
Current value: 8        Array: 1 2 3 4 5 4 5 6 7 8 7
Current value: 7        Array: 1 2 3 4 5 4 5 6 7 8 7 8
Current value: 8        Array: 1 2 3 4 5 4 5 6 7 8 7 8
Array: 1 2 3 4 5 4 5 6 7 8 7 8
Number of items in array after loop: 12

The output with the Start-Sleep call enabled:

Number of items in array before loop: 5
Current value: 1        Array: 1 2 3 4 5
Current value: 2        Array: 1 2 3 4 5 4
Current value: 3        Array: 1 2 3 4 5 4 5
Current value: 4        Array: 1 2 3 4 5 4 5 6
Current value: 5        Array: 1 2 3 4 5 4 5 6 7
Array: 1 2 3 4 5 4 5 6 7 8
Number of items in array after loop: 10

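I understand that this happens because the for loop exits before the sleep has completed, so only the first 5 items are ever submitted to the runspace pool.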

Is there a way to add more items to the ArrayList dynamically and still process them concurrently using runspaces?

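The core of the "working" behaviour is that PowerShell runs the "non-sleep" scriptblocks faster than the for loop creates them, so the loop sees the new items added by previous iterations before it reaches the end of the array. As a result, it has to process every item before it exits and moves on to the while loop.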

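When you add Start-Sleep, the balance shifts: running a scriptblock now takes longer than creating one, so the for loop reaches the end of the array before the earliest iterations have added their new items.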
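The loop can only pick up those items because the synchronized ArrayList is handed to each runspace by reference, not as a copy. A minimal sketch of that sharing (the variable names here are chosen for illustration and are not from the question):

```powershell
# A synchronized ArrayList shared between the caller and an in-process runspace.
$list = [System.Collections.ArrayList]::Synchronized([System.Collections.ArrayList]@(1, 2, 3))

# The scriptblock receives the same list object, not a copy of it.
$ps = [PowerShell]::Create().AddScript({
    param($l)
    $null = $l.Add(99)
}).AddArgument($list)

# Invoke() runs synchronously, so the addition is complete when it returns.
$null = $ps.Invoke()
$ps.Dispose()

Write-Host "Count seen by the caller: $($list.Count)"
```

With BeginInvoke() instead of Invoke(), the same addition happens at some later point, which is why the for loop may or may not see it in time.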

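The following script fixes that by combining your for and while loops into one loop that repeatedly alternates between (i) creating new threads and (ii) checking whether they have finished, and that only exits once all the work is done.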

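However, multithreading is hard, so it is best to assume I have made a mistake somewhere and to test properly before you release this into your live workflow...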

$scriptblock = {
    Param ($i, $array)
    # random sleep to simulate variable-length workloads. this is
    # more likely to flush out error conditions than a fixed sleep 
    # period as threads will finish out-of-turn more often
    Start-Sleep (Get-Random -Minimum 1 -Maximum 10)
    Write-Output "$i value: $array"
    if ($i -in @(1, 2, 3, 4, 5)) {
        $array.Add($i + 3) | Out-Null
    }
}

$pool = [RunspaceFactory]::CreateRunspacePool(1, 10)
$pool.Open()

# note - your "$runspaces" variable is misleading as you're creating 
# "PowerShell" objects, and a "Runspace" is a different thing entirely,
# so I've called it $instances instead
# see https://docs.microsoft.com/en-us/dotnet/api/system.management.automation.powershell?view=powershellsdk-7.0.0
#  vs https://docs.microsoft.com/en-us/dotnet/api/system.management.automation.runspaces.runspace?view=powershellsdk-7.0.0
$instances = @()

$array = [System.Collections.ArrayList]::Synchronized(([System.Collections.ArrayList]$(1, 2, 3, 4, 5)))
Write-Host "Number of items in array before loop: $($array.Count)"

while( $true )
{

    # start PowerShell instances for any items in $array that don't already have one.
    # on the first iteration this will seed the initial instances, and in
    # subsequent iterations it will create new instances for items added to
    # $array since the last iteration.
    while( $instances.Length -lt $array.Count )
    {
        $instance = [PowerShell]::Create().AddScript($scriptblock).AddArgument($array[$instances.Length]).AddArgument($array);
        $instance.RunspacePool = $pool
        $instances += [PSCustomObject]@{ Value = $instance; Status = $instance.BeginInvoke() }
    }

    # watch out because there's a race condition here. it'll need very unlucky 
    # timing, *but* an instance might have added an item to $array just after
    # the while loop finished, but before the next line runs, so there *could* 
    # be an item in $array that hasn't had an instance created for it even
    # if all the current instances have completed

    # is there any more work to do? (try to mitigate the race condition
    # by checking again for any items in $array that don't have an instance
    # created for them)
    $active = @( $instances | Where-Object { -not $_.Status.IsCompleted } )
    if( ($active.Length -eq 0) -and ($instances.Length -eq $array.Count) )
    {
        # instances have been created for every item in $array,
        # *and* they've run to completion, so there's no more work to do
        break;
    }

    # if there are incomplete instances, wait for a short time to let them run
    # (this is to avoid a "busy wait" - https://en.wikipedia.org/wiki/Busy_waiting)
    Start-Sleep -Milliseconds 250;

}

# all the instances have completed, so end them
foreach ($instance in $instances)
{
    $instance.Value.EndInvoke($instance.Status);
}

Write-Host "array: $array"
Write-Host "Number of items in array after loop: $($array.Count)"
$pool.Close()
$pool.Dispose()

Example output:

Number of items in array before loop: 5
1 value: 1 2 3 4 5 6 5 7
2 value: 1 2 3 4 5 6
3 value: 1 2 3 4 5
4 value: 1 2 3 4 5 6 5
5 value: 1 2 3 4 5 6 5 7 4
6 value: 1 2 3 4 5 6 5 7
5 value: 1 2 3 4 5 6 5 7 4 8
7 value: 1 2 3 4 5 6 5 7
4 value: 1 2 3 4 5 6 5 7 4 8 8
8 value: 1 2 3 4 5 6 5 7 4 8 8
8 value: 1 2 3 4 5 6 5 7 4 8 8
7 value: 1 2 3 4 5 6 5 7 4 8 8 7

Note that the order of the items in the array will vary depending on the length of the random sleeps in $scriptblock.

There are probably further improvements that could be made, but this at least seems to work...
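One possible improvement, offered as a sketch rather than a tested replacement for the loop above: the IAsyncResult returned by BeginInvoke() exposes an AsyncWaitHandle, so the caller can block on that handle with a timeout instead of sleeping for a fixed interval. The 30-second limit below is an arbitrary value chosen for illustration.

```powershell
# A single instance that takes a little while to finish.
$ps = [PowerShell]::Create().AddScript({
    Start-Sleep -Milliseconds 300
    'done'
})
$async = $ps.BeginInvoke()

# Block until the instance completes or 30 seconds pass (arbitrary limit).
$finished = $async.AsyncWaitHandle.WaitOne(30000)

if ($finished) {
    # EndInvoke() returns the output collected from the success stream.
    $result = $ps.EndInvoke($async)
    Write-Host "Result: $result"
}
$ps.Dispose()
```

WaitOne() returns as soon as the work completes, so the caller does not sit through the remainder of a fixed sleep interval.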

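This answer attempts to provide a better solution to the producer-consumer problem using BlockingCollection<T>, which provides an implementation of the producer/consumer pattern.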

To clarify the issue with my previous answer, as the OP pointed out in a comment:

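"If the starting count of the queue (say, 2) is less than the maximum number of threads (say, 5), then only that many threads (2, in this case) remain active, no matter how many items are added to the queue later. Only the starting number of threads processes the rest of the items in the queue. In my case, the starting count is usually one. Then I make an irm (alias for Invoke-RestMethod) request and add some 10~20 items. These are processed by the first thread only. The other threads go to the Completed state right at the start. Is there a workaround for this?"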
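The behaviour described in that comment follows from the loop condition used with ConcurrentQueue<T>: TryDequeue does not wait for new items, it returns $false immediately when the queue is empty, so a thread that finds nothing to do leaves its while loop for good. A small single-threaded sketch:

```powershell
$queue = [System.Collections.Concurrent.ConcurrentQueue[int]]::new()
$queue.Enqueue(1)

$item = 0
$first  = $queue.TryDequeue([ref] $item)   # takes the only item
$second = $queue.TryDequeue([ref] $item)   # queue is empty now, returns at once

Write-Host "First: $first, second: $second"

# Items enqueued later are not seen by a loop that has already exited.
$queue.Enqueue(2)
Write-Host "Items left unprocessed: $($queue.Count)"
```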

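For this example, the runspaces use the TryTake(T, TimeSpan) method overload, which blocks the thread and waits for up to the specified timeout. On each loop iteration, the runspaces also update a synchronized hashtable with the result of their TryTake(..) call.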
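As a quick illustration of that overload in isolation (a single-threaded sketch with a 200 ms timeout picked only for the demo): TryTake returns right away when an item is available, and blocks for roughly the timeout before returning $false when the collection is empty.

```powershell
$bc   = [System.Collections.Concurrent.BlockingCollection[int]]::new()
$wait = [timespan]::FromMilliseconds(200)
$bc.Add(42)

$item  = 0
$got   = $bc.TryTake([ref] $item, $wait)    # item available: returns right away
$taken = $item
Write-Host "Got: $got, item: $taken"

$sw = [System.Diagnostics.Stopwatch]::StartNew()
$gotAgain = $bc.TryTake([ref] $item, $wait) # empty: blocks for about 200 ms
$sw.Stop()
Write-Host "Got again: $gotAgain after $($sw.ElapsedMilliseconds) ms"

$bc.Dispose()
```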

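The main thread uses the synchronized hashtable to wait until all runspaces have reported a $false status; when that happens, it sends the exit signal to the threads by calling .CompleteAdding().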
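To show what that exit signal does to the collection, here is a single-threaded sketch: once CompleteAdding() has been called and the last item has been taken, IsCompleted becomes $true, which is the condition the while loop inside each runspace checks.

```powershell
$bc = [System.Collections.Concurrent.BlockingCollection[int]]::new()
$bc.Add(1)
$bc.CompleteAdding()

# One item is still waiting to be consumed, so the collection is not completed yet.
$before = $bc.IsCompleted

$item = 0
$null  = $bc.TryTake([ref] $item)
$after = $bc.IsCompleted

# Adding after CompleteAdding() throws an InvalidOperationException.
$addFailed = $false
try { $bc.Add(2) } catch { $addFailed = $true }

Write-Host "IsCompleted before: $before, after: $after, add failed: $addFailed"
$bc.Dispose()
```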

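Even though it is not perfect, this solves the problem of some threads exiting the loop early, and it tries to ensure that all threads end at the same time (when there are no more items in the collection).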

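The producer logic is very similar to the previous answer; in this case, however, each thread waits a random amount of time between $timeout.Seconds - 5 and $timeout.Seconds + 5 on each loop iteration.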
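One detail worth keeping in mind when tweaking $timeout: TimeSpan.Seconds is only the seconds component of the value (0 to 59), while TotalSeconds is the whole duration, and the upper bound passed to Random.Next(min, max) is exclusive. A quick sketch, where the 90-second value is just an example:

```powershell
$short = [timespan]::FromSeconds(5)
$long  = [timespan]::FromSeconds(90)

Write-Host "5 s  -> Seconds: $($short.Seconds), TotalSeconds: $($short.TotalSeconds)"
Write-Host "90 s -> Seconds: $($long.Seconds), TotalSeconds: $($long.TotalSeconds)"

# With a 5-second timeout the delay is drawn from 0..9 seconds,
# because the upper bound of Next() is exclusive.
$min   = $short.Seconds - 5
$max   = $short.Seconds + 5
$delay = [random]::new().Next($min, $max)
Write-Host "Delay: $delay (min $min, max $max exclusive)"
```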

The results you can expect from this demo can be found in this gist.

using namespace System.Management.Automation.Runspaces
using namespace System.Collections.Concurrent
using namespace System.Threading

try {
    $threads = 20
    $bc      = [BlockingCollection[int]]::new()
    $status  = [hashtable]::Synchronized(@{ TotalCount = 0 })

    # set a timer, all threads will wait for it before exiting
    # this timespan should be tweaked depending on the task at hand
    $timeout = [timespan]::FromSeconds(5)

    foreach($i in 1, 2, 3, 4, 5) {
        $bc.Add($i)
    }


    $scriptblock = {
        param([timespan] $timeout, [int] $threads)

        $id = [runspace]::DefaultRunspace
        $status[$id.InstanceId] = $true
        $syncRoot = $status.SyncRoot
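        $release  = {
            # PulseAll must be called while the lock is still held,
            # so signal the waiting threads first and release afterwards
            [Threading.Monitor]::PulseAll($syncRoot)
            [Threading.Monitor]::Exit($syncRoot)
        }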

        # will use this to simulate random delays
        $min = $timeout.Seconds - 5
        $max = $timeout.Seconds + 5

        [ref] $target = $null
        while(-not $bc.IsCompleted) {
            # NOTE from `Hashtable.Synchronized(Hashtable)` MS Docs:
            #
            #    The Synchronized method is thread safe for multiple readers and writers.
            #    Furthermore, the synchronized wrapper ensures that there is only
            #    one writer writing at a time.
            #
            #    Enumerating through a collection is intrinsically not a
            #    thread-safe procedure. Even when a collection is synchronized,
            #    other threads can still modify the collection, which causes the
            #    enumerator to throw an exception.

            # Mainly doing this (lock on the sync hash) to get the Active Count
            # Not really needed and only for demo purposes

            # if we can't lock on this object in 200ms go next iteration
            if(-not [Threading.Monitor]::TryEnter($syncRoot, 200)) {
                continue
            }

            # if there are no items in queue, send `$false` to the main thread
            if(-not ($status[$id.InstanceId] = $bc.TryTake($target, $timeout))) {
                # release the lock and signal the threads they can get a handle
                & $release
                # and go next iteration
                continue
            }

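            # if there was an item in queue, get the active count
            # (boolean entries only: the 'TotalCount' integer must be skipped,
            # because `1 -eq $true` evaluates to true)
            $active = @($status.Values).Where({ $_ -is [bool] -and $_ }).Count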
            # add 1 to the total count
            $status['TotalCount'] += 1
            # and release the lock
            & $release

            Write-Host (
                ('Target Value: {0}' -f $target.Value).PadRight(20) + '|'.PadRight(5) +
                ('Items in Queue: {0}' -f $bc.Count).PadRight(20)   + '|'.PadRight(5) +
                ('Runspace Id: {0}' -f $id.Id).PadRight(20)         + '|'.PadRight(5) +
                ('Active Runspaces [{0:D2} / {1:D2}]' -f $active, $threads)
            )

            $ran = [random]::new()
            # start a simulated delay
            Start-Sleep $ran.Next($min, $max)

            # get a random number between 0 and 10
            $ran = $ran.Next(11)
            # if the number is greater than the Dequeued Item
            if ($ran -gt $target.Value) {
                # enumerate starting from `$ran - 2` up to `$ran`
                foreach($i in ($ran - 2)..$ran) {
                    # enqueue each item
                    $bc.Add($i)
                }
            }

            # Send 1 to the Success Stream, this will help us check
            # if the test succeeded later on
            1
        }
    }

    $iss    = [initialsessionstate]::CreateDefault2()
    $rspool = [runspacefactory]::CreateRunspacePool(1, $threads, $iss, $Host)
    $rspool.ApartmentState = [ApartmentState]::STA
    $rspool.ThreadOptions  = [PSThreadOptions]::UseNewThread
    $rspool.InitialSessionState.Variables.Add([SessionStateVariableEntry[]]@(
        [SessionStateVariableEntry]::new('bc', $bc, 'Producer Consumer Collection')
        [SessionStateVariableEntry]::new('status', $status, 'Monitoring hash for signaling `.CompleteAdding()`')
    ))
    $rspool.Open()

    $params = @{
        Timeout = $timeout
        Threads = $threads
    }

    $rs = for($i = 0; $i -lt $threads; $i++) {
        $ps = [powershell]::Create($iss).AddScript($scriptblock).AddParameters($params)
        $ps.RunspacePool = $rspool

        @{
            Instance    = $ps
            AsyncResult = $ps.BeginInvoke()
        }
    }

    while($status.ContainsValue($true)) {
        Start-Sleep -Milliseconds 200
    }

    # send signal to stop
    $bc.CompleteAdding()

    [int[]] $totalCount = foreach($r in $rs) {
        try {
            $r.Instance.EndInvoke($r.AsyncResult)
            $r.Instance.Dispose()
        }
        catch {
            Write-Error $_
        }
    }
    Write-Host ("`nTotal Count [ IN {0} / OUT {1} ]" -f $totalCount.Count, $status['TotalCount'])
    Write-Host ("Items in Queue: {0}" -f $bc.Count)
    Write-Host ("Test Succeeded: {0}" -f (
        [Linq.Enumerable]::Sum($totalCount) -eq $status['TotalCount'] -and
        $bc.Count -eq 0
    ))
}
finally {
    ($bc, $rspool).ForEach('Dispose')
}

Note that this answer does not solve the OP's problem well. See this answer for a better take on the producer-consumer problem.


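This is a different approach from mclayton's helpful answer; hopefully both answers can lead you to solving your problem. This example uses a ConcurrentQueue<T> and consists of multiple threads performing the same action.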

As you can see, in this case we start only 5 threads, which will try to dequeue the items concurrently.

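If the randomly generated number between 0 and 10 is greater than the dequeued item, the thread creates an array running from the random number - 2 up to the random number and enqueues those values. This tries to simulate, badly, what you posted in the comments: "The actual problem involves Invoke-RestMethod (irm) calls to multiple endpoints, and based on their results, I may have to query more similar endpoints".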

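Note that for this example I am using $threads = $queue.Count, but this should not always be the case. Don't start too many threads, or you might kill your session! Also be aware that you might overload your network if you query multiple endpoints at the same time. I would say: never use more threads than $queue.Count.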
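A minimal sketch of capping the thread count, assuming a hypothetical upper limit $maxThreads that you would choose for your own environment (it is not part of the answer's code):

```powershell
$queue = [System.Collections.Concurrent.ConcurrentQueue[int]]::new()
foreach ($i in 1..20) { $queue.Enqueue($i) }

# Hypothetical cap; pick a value that suits your machine and network.
$maxThreads = 5

# Never start more threads than there are items, and never more than the cap.
$threads = [Math]::Min($queue.Count, $maxThreads)
Write-Host "Starting $threads threads for $($queue.Count) queued items"
```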

The results you get from the code below will vary greatly from run to run.

using namespace System.Management.Automation.Runspaces
using namespace System.Collections.Concurrent

try {
    $queue = [ConcurrentQueue[int]]::new()
    foreach($i in 1, 2, 3, 4, 5) {
        $queue.Enqueue($i)
    }
    $threads = $queue.Count

    $scriptblock = {
        [ref] $target = $null
        while($queue.TryDequeue($target)) {
            [pscustomobject]@{
                'Target Value'      = $target.Value
                'Elements in Queue' = $queue.Count
            }

            # get a random number between 0 and 10
            $ran = Get-Random -Maximum 11
            # if the number is greater than the Dequeued Item
            if ($ran -gt $target.Value) {
                # enumerate starting from `$ran - 2` up to `$ran`
                foreach($i in ($ran - 2)..$ran) {
                    # enqueue each item
                    $queue.Enqueue($i)
                }
            }
        }
    }

    $iss    = [initialsessionstate]::CreateDefault2()
    $rspool = [runspacefactory]::CreateRunspacePool(1, $threads, $iss, $Host)
    $rspool.InitialSessionState.Variables.Add([SessionStateVariableEntry]::new(
        'queue', $queue, ''
    ))
    $rspool.Open()

    $rs = for($i = 0; $i -lt $threads; $i++) {
        $ps = [powershell]::Create().AddScript($scriptblock)
        $ps.RunspacePool = $rspool

        @{
            Instance    = $ps
            AsyncResult = $ps.BeginInvoke()
        }
    }

    foreach($r in $rs) {
        try {
            $r.Instance.EndInvoke($r.AsyncResult)
            $r.Instance.Dispose()
        }
        catch {
            Write-Error $_
        }
    }
}
finally {
    $rspool.ForEach('Dispose')
}
