How to Improve the Performance of this PowerShell Code
I have a PowerShell script which reads a 4000 KB text file (approximately 88,500 lines). This is the first time I have had my code do this much work. The script below took over 2 minutes to run and consumed around 20% CPU, as reported by Task Manager.
Can I improve performance using different code choices?
# extractUniqueBaseNames.ps1 --- copy first UPPERCASE word in each line of text, remove duplicates & store
$listing = 'C:\roll-conversion dump\LINZ Place Street Index\StreetIndexOutput.txt'
[array]$tempStorage = $null
[array]$Storage = $null
# select only CAPITALISED first string (at least two chars or longer) from listings
Select-String -Pattern '(\b[A-Z]{2,}\b[$\s])' -Path $listing -CaseSensitive |
    ForEach-Object {
        $newStringValue = $_.Matches.Value -replace '$\s', '\n'
        $tempStorage += $newStringValue
    }
$Storage += $tempStorage | Select-Object -Unique
I have also added the following line to output the results to a new text file (this was not included in the Task Manager reading above):
$Storage | Out-File -Append atest.txt
Since I am at an early stage of my development, I would appreciate any suggestions that would improve the performance of this kind of PowerShell script.
If I understand your code correctly, this should do the same thing, but faster and more efficiently.
Reference documentation:
StreamReader Class
StreamWriter Class
Regex Class
File.Open Method
using namespace System.IO
using namespace System.Collections.Generic

try {
    $re      = [regex] '(\b[A-Z]{2,}\b[$\s])'
    $reader  = [StreamReader] 'some\path\to\inputfile.txt'
    $stream  = [File]::Open('some\path\to\outputfile.txt', [FileMode]::Append, [FileAccess]::Write)
    $writer  = [StreamWriter]::new($stream)
    $storage = [HashSet[string]]::new()

    while(-not $reader.EndOfStream) {
        # Regex.Match always returns a Match object, so test .Success
        # rather than the object itself
        $match = $re.Match($reader.ReadLine())
        if($match.Success) {
            # same replacement as in the question's script
            $line = $match.Value -replace '$\s', '\n'
            # if the line hasn't been found before
            if($storage.Add($line)) {
                $writer.WriteLine($line)
            }
        }
    }
}
finally {
    ($reader, $writer, $stream).ForEach('Dispose')
}
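Most of the gain probably comes from how the results are collected. Appending with += to a PowerShell array creates a new, larger array on every iteration, and Select-Object -Unique then needs a second pass over everything. HashSet[string].Add is close to constant time and returns $false for a value already present, so duplicates are filtered as the lines are read. The sketch below compares the two approaches on a few made-up lines. The sample data, the variable names ($sampleLines, $viaArray, $viaSet, $ordered) and the simplified pattern without the trailing [$\s] are assumptions for illustration, not part of the original script.

```powershell
# Hypothetical sample data standing in for the lines of the input file.
$sampleLines = @(
    'QUEEN Street, Auckland'
    'KING Road, Wellington'
    'QUEEN Street, Dunedin'
    'lowercase line with no match'
    'HIGH Street, Christchurch'
)

# Simplified pattern (assumption): first run of two or more capital letters.
# A [regex] object is case-sensitive by default.
$re = [regex] '\b[A-Z]{2,}\b'

# Approach 1: array += copies the whole array on each addition,
# then Select-Object -Unique de-duplicates in a second pass.
[array]$viaArray = @()
foreach ($line in $sampleLines) {
    $m = $re.Match($line)
    if ($m.Success) { $viaArray += $m.Value }
}
$viaArray = @($viaArray | Select-Object -Unique)

# Approach 2: the HashSet rejects duplicates while adding.
# Add() returns $true only the first time a value is seen.
$viaSet  = [System.Collections.Generic.HashSet[string]]::new()
$ordered = [System.Collections.Generic.List[string]]::new()
foreach ($line in $sampleLines) {
    $m = $re.Match($line)
    if ($m.Success -and $viaSet.Add($m.Value)) { $ordered.Add($m.Value) }
}

# Both approaches yield the same unique names in first-seen order.
$viaArray -join ','   # QUEEN,KING,HIGH
$ordered  -join ','   # QUEEN,KING,HIGH
```

On five lines the difference is not measurable. With tens of thousands of lines the repeated array copies in the first approach add up, which is why the answer's script writes each new value as soon as the HashSet accepts it.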