
How to Improve the Performance of this PowerShell Code

I have a PowerShell script that reads a 4000 KB text file (approximately 88,500 lines). This is the first time my code has had to do this much work. The script below took over 2 minutes to run and consumed around 20% CPU (see Task Manager screenshot). Can I improve performance by making different code choices?

# extractUniqueBaseNames.ps1    --- copy first UPPERCASE word in each line of text, remove duplicates & store

$listing = 'C:\roll-conversion dump\LINZ Place Street Index\StreetIndexOutput.txt'

[array]$tempStorage = $null
[array]$Storage = $null

# select only CAPITALISED first string (at least two chars or longer) from listings
Select-String -Pattern '(\b[A-Z]{2,}\b[$\s])' -Path $listing -CaseSensitive |
    ForEach-Object {$newStringValue = $_.Matches.Value -replace '$\s', '\n' 
                    $tempStorage += $newStringValue 
                    }

    $Storage += $tempStorage | Select-Object -Unique

I have also added the following line to write the results to a new text file (this was not included in the previous Task Manager reading):

$Storage | Out-File -Append atest.txt

Since I am at an early stage of my development, I would appreciate any suggestions that would improve the performance of this kind of PowerShell script.

If I understand your code correctly, this should do the same thing, but faster and more efficiently.

Reference documentation:

using namespace System.IO
using namespace System.Collections.Generic

try {
    $re      = [regex] '(\b[A-Z]{2,}\b[$\s])'
    $reader  = [StreamReader] 'some\path\to\inputfile.txt'
    $stream  = [File]::Open('some\path\to\outputfile.txt', [FileMode]::Append, [FileAccess]::Write)
    $writer  = [StreamWriter]::new($stream)
    $storage = [HashSet[string]]::new()

    while(-not $reader.EndOfStream) {
        $match = $re.Match($reader.ReadLine())
        # Match() always returns an object, so test .Success to see if the line matched
        if($match.Success) {
            # drop the trailing whitespace captured by the pattern
            $line = $match.Value.TrimEnd()
            # if the value hasn't been found before
            if($storage.Add($line)) {
                $writer.WriteLine($line)
            }
        }
    }
}
finally {
    ($reader, $writer, $stream).ForEach('Dispose')
}
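
Much of the time in the question's script probably goes into `$tempStorage += ...`. PowerShell arrays are fixed-size, so each `+=` allocates a new array and copies every existing element into it. The sketch below contrasts that pattern with a `HashSet[string]`, which removes duplicates during a single pass. It uses made-up sample values rather than the real street index, and `$sample`, `$slow`, `$seen`, `$slowUnique` and `$fastUnique` are illustrative names that do not appear in the original scripts.

```powershell
# Hypothetical sample data standing in for the matched words (not the real file):
# 5000 values, of which 500 are distinct.
$sample = 1..5000 | ForEach-Object { 'NAME{0}' -f ($_ % 500) }

# Pattern from the question: += rebuilds the whole array on every iteration,
# then Select-Object -Unique filters the duplicates afterwards.
$slow = @()
foreach ($item in $sample) { $slow += $item }
$slowUnique = $slow | Select-Object -Unique

# HashSet pattern: Add() returns $true only for values not seen before,
# so duplicates are dropped as they are encountered.
$seen = [System.Collections.Generic.HashSet[string]]::new()
$fastUnique = foreach ($item in $sample) {
    if ($seen.Add($item)) { $item }
}
```

Both approaches should yield the same unique values in first-seen order. Wrapping each half in `Measure-Command { ... }` is one way to compare them on your own machine, and the gap is expected to widen as the input grows.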
