How to speed up PowerShell Get-ChildItem over UNC
DIR or GCI is slow in PowerShell, but fast in CMD. Is there any way to speed this up?
In CMD.exe, after a sub-second delay, this responds as fast as the CMD window can keep up:
dir \\remote-server.domain.com\share\folder\file*.*
In PowerShell (v2), after a 40+ second delay, this responds with noticeable slowness (maybe 3-4 lines per second):
gci \\remote-server.domain.com\share\folder\file*.*
I'm trying to scan logs on a remote server, so maybe there's a faster approach.
get-childitem \\$s\logs -include $filemask -recurse | select-string -pattern $regex
Okay, this is how I'm doing it, and it seems to work.
$files = cmd /c "$GETFILESBAT \\$server\logs\$filemask"
foreach( $f in $files ) {
    if( $f.length -gt 0 ) {
        select-string -Path $f -pattern $regex | foreach-object { $_ }
    }
}
Then $GETFILESBAT points to this:
@dir /a-d /b /s %1
@exit
I'm writing and deleting this BAT file from the PowerShell script, so I guess it's a PowerShell-only solution, but it doesn't use only PowerShell.
My preliminary performance metrics show this to be eleventy-thousand times faster.
I tested gci vs. cmd dir vs. FileIO.FileSystem.GetFiles from @Shawn Melton's referenced link.
The bottom line is that, for daily use on local drives, GetFiles is the fastest. By far. CMD DIR is respectable. Once you introduce a slower network connection with many files, CMD DIR is slightly faster than GetFiles. Then Get-ChildItem... wow, this ranges from not too bad to horrible, depending on the number of files involved and the speed of the connection.
Some test runs. I've moved GCI around in the tests to make sure the results were consistent.
10 iterations of scanning c:\windows\temp for *.tmp files
.\test.ps1 "c:\windows\temp" "*.tmp" 10
GetFiles ... 00:00:00.0570057
CMD dir ... 00:00:00.5360536
GCI ... 00:00:01.1391139
GetFiles is 10x faster than CMD dir, which itself is more than 2x faster than GCI.
10 iterations of scanning c:\windows\temp for *.tmp files with recursion
.\test.ps1 "c:\windows\temp" "*.tmp" 10 -recurse
GetFiles ... 00:00:00.7020180
CMD dir ... 00:00:00.7644196
GCI ... 00:00:04.7737224
GetFiles is a little faster than CMD dir, and both are almost 7x faster than GCI.
10 iterations of scanning an on-site server on another domain for application log files
.\test.ps1 "\\closeserver\logs\subdir" "appname*.*" 10
GetFiles ... 00:00:00.3590359
CMD dir ... 00:00:00.6270627
GCI ... 00:00:06.0796079
GetFiles is about 2x faster than CMD dir, which is itself about 10x faster than GCI.
One iteration of scanning a distant server on another domain for application log files, with many files involved
.\test.ps1 "\\distantserver.company.com\logs\subdir" "appname.2011082*.*"
CMD dir ... 00:00:00.3340334
GetFiles ... 00:00:00.4360436
GCI ... 00:11:09.5525579
CMD dir is fastest going to the distant server with many files, but GetFiles is respectably close. GCI on the other hand is a couple of thousand times slower.
Two iterations of scanning a distant server on another domain for application log files, with many files
.\test.ps1 "\\distantserver.company.com\logs\subdir" "appname.20110822*.*" 2
CMD dir ... 00:00:00.9360240
GetFiles ... 00:00:01.4976384
GCI ... 00:22:17.3068616
More or less linear increase as test iterations increase.
One iteration of scanning a distant server on another domain for application log files, with fewer files
.\test.ps1 "\\distantserver.company.com\logs\othersubdir" "appname.2011082*.*" 10
GetFiles ... 00:00:00.5304170
CMD dir ... 00:00:00.6240200
GCI ... 00:00:01.9656630
Here GCI is not too bad, GetFiles is 3x faster, and CMD dir is close behind.
Conclusion
GCI needs a -raw or -fast option that does not try to do so much. In the meantime, GetFiles is a healthy alternative that is only occasionally a little slower than CMD dir, and usually faster (due to spawning CMD.exe?).
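To tie the conclusion back to the original log-scanning task, here is a minimal sketch that lists files without Get-ChildItem and pipes them to Select-String. It is not from the original post: the function name Search-LogFiles and its parameters are hypothetical, and it swaps in System.IO.Directory.GetFiles (core .NET) for the Microsoft.VisualBasic GetFiles that was benchmarked above, so its timings are not covered by those measurements.

```powershell
# Hypothetical helper (assumption, not the poster's code): enumerate files
# with System.IO.Directory.GetFiles, then search each one with Select-String.
function Search-LogFiles {
    param(
        [Parameter(Mandatory = $true)][string]$Path,      # full path, e.g. a UNC folder
        [Parameter(Mandatory = $true)][string]$FileMask,  # e.g. appname*.*
        [Parameter(Mandatory = $true)][string]$Regex,
        [switch]$Recurse
    )

    $option = [System.IO.SearchOption]::TopDirectoryOnly
    if ($Recurse) { $option = [System.IO.SearchOption]::AllDirectories }

    # GetFiles returns plain path strings rather than FileInfo objects.
    # Pass a full path: .NET resolves relative paths against the process
    # working directory, which can differ from the PowerShell location.
    $files = [System.IO.Directory]::GetFiles($Path, $FileMask, $option)

    foreach ($file in $files) {
        Select-String -Path $file -Pattern $Regex
    }
}
```

Usage would look something like `Search-LogFiles -Path "\\$s\logs" -FileMask $filemask -Regex $regex -Recurse`, reusing the variables from the question.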
For reference, here's the test.ps1 code.
param ( [string]$path, [string]$filemask, [switch]$recurse=$false, [int]$n=1 )

# Load the assembly that provides Microsoft.VisualBasic.FileIO.FileSystem
[reflection.assembly]::loadwithpartialname("Microsoft.VisualBasic") | Out-Null

# Test 1: FileSystem.GetFiles
write-host "GetFiles... " -nonewline
$dt = get-date
for ($i = 0; $i -lt $n; $i++) {
    if ($recurse) {
        [Microsoft.VisualBasic.FileIO.FileSystem]::GetFiles( $path,
            [Microsoft.VisualBasic.FileIO.SearchOption]::SearchAllSubDirectories, $filemask
        ) | out-file ".\testfiles1.txt"
    }
    else {
        [Microsoft.VisualBasic.FileIO.FileSystem]::GetFiles( $path,
            [Microsoft.VisualBasic.FileIO.SearchOption]::SearchTopLevelOnly, $filemask
        ) | out-file ".\testfiles1.txt"
    }
}
$dt2 = get-date
write-host $dt2.subtract($dt)

# Test 2: CMD dir (bare format, files only)
write-host "CMD dir... " -nonewline
$dt = get-date
for ($i = 0; $i -lt $n; $i++) {
    if ($recurse) {
        cmd /c "dir /a-d /b /s $path\$filemask" | out-file ".\testfiles2.txt"
    }
    else {
        cmd /c "dir /a-d /b $path\$filemask" | out-file ".\testfiles2.txt"
    }
}
$dt2 = get-date
write-host $dt2.subtract($dt)

# Test 3: Get-ChildItem
write-host "GCI... " -nonewline
$dt = get-date
for ($i = 0; $i -lt $n; $i++) {
    if ($recurse) {
        get-childitem "$path\*" -include $filemask -recurse | out-file ".\testfiles0.txt"
    }
    else {
        get-childitem "$path\*" -include $filemask | out-file ".\testfiles0.txt"
    }
}
$dt2 = get-date
write-host $dt2.subtract($dt)
Here is a good explanation on why Get-ChildItem is slow by Lee Holmes. If you take note of the comment from "Anon 11 Mar 2010 11:11 AM" at the bottom of the page, his solution might work for you.
Anon's Code:
# SCOPE: SEARCH A DIRECTORY FOR FILES (W/WILDCARDS IF NECESSARY)
# Usage:
# $directory = "\\SERVER\SHARE"
# $searchterms = "filename[*].ext"
# PS> $Results = Search $directory $searchterms
[reflection.assembly]::loadwithpartialname("Microsoft.VisualBasic") | Out-Null
Function Search {
    # Parameters $Path and $SearchString
    param ([Parameter(Mandatory=$true, ValueFromPipeline = $true)][string]$Path,
        [Parameter(Mandatory=$true)][string]$SearchString
    )
    try {
        # .NET GetFiles method to look for files
        # BENEFITS : Possibly running as background job (haven't looked into it yet)
        [Microsoft.VisualBasic.FileIO.FileSystem]::GetFiles(
            $Path,
            [Microsoft.VisualBasic.FileIO.SearchOption]::SearchAllSubDirectories,
            $SearchString
        )
    } catch { $_ }
}
I tried some of the suggested methods with a large number of files (~190,000). As mentioned in Kyle's comment, GetFiles isn't very useful here, because it takes nearly forever.
cmd dir was better than Get-ChildItem in my first tests, but it seems GCI speeds up a lot if you use the -Force parameter. With this, the time needed was about the same as for cmd dir.
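One way to check the -Force observation on your own share is to time both variants side by side. The sketch below is an assumption-laden illustration, not the poster's test: Measure-GciForce is a hypothetical name, and it only reports timings without explaining them. Note that -Force also returns hidden and system items, so the two runs may not list exactly the same files, and the second run can benefit from caching.

```powershell
# Hypothetical helper (assumption): time Get-ChildItem with and without -Force.
function Measure-GciForce {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [switch]$Recurse
    )

    # Measure-Command returns a TimeSpan for each script block.
    $plain  = Measure-Command { Get-ChildItem -Path $Path -Recurse:$Recurse | Out-Null }
    $forced = Measure-Command { Get-ChildItem -Path $Path -Recurse:$Recurse -Force | Out-Null }

    New-Object PSObject -Property @{
        Plain  = $plain   # elapsed time without -Force
        Forced = $forced  # elapsed time with -Force
    }
}
```

Running it a few times, and swapping the order of the two measurements, helps rule out caching effects, in the same spirit as moving GCI around in the earlier test runs.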
PS: In my case I had to exclude most of the files because of their extension. This was done with -Exclude in gci and with a | where in the other commands. So the results for just searching files might differ slightly.
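The "| where" filtering mentioned above could look roughly like the following sketch. The poster's actual filter is not shown, so the function name Select-ExcludingExtension and its parameters are hypothetical; it drops paths whose extension is on an exclusion list, which is meant to approximate what -Exclude does for gci.

```powershell
# Hypothetical filter (assumption): drop paths whose extension is excluded.
function Select-ExcludingExtension {
    param(
        [Parameter(ValueFromPipeline = $true)][string]$FullName,
        [string[]]$Extension   # extensions to drop, with leading dot, e.g. '.tmp'
    )
    process {
        # Skip blank lines; -notcontains compares case-insensitively.
        if ($FullName -and ($Extension -notcontains [System.IO.Path]::GetExtension($FullName))) {
            $FullName
        }
    }
}

# Example use with bare dir output ($path is a placeholder):
# cmd /c "dir /a-d /b /s $path" | Select-ExcludingExtension -Extension '.tmp', '.bak'
```

Filtering after the listing means every file name still crosses the network, so this affects which results you keep rather than how long the listing takes.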
Here's an interactive reader that parses cmd /c dir (which can handle UNC paths), and will collect the 3 most important properties for most people: full path, size, timestamp.
Usage would be something like
$files_with_details = $faster_get_files.GetFileList($unc_compatible_folder)
and there's a helper function to check the combined size:
$faster_get_files.GetSize($files_with_details)
$faster_get_files = New-Module -AsCustomObject -ScriptBlock {
    #$DebugPreference = 'Continue' #verbose, this will take figuratively forever
    #$DebugPreference = 'SilentlyContinue'
    $directory_filter = "Directory of (.+)"
    $file_filter = "(\d+/\d+/\d+)\s+(\d+:\d+ \w{2})\s+([\d,]+)\s+(.+)" # [1] is day, [2] is time (AM/PM), [3] is size, [4] is filename
    $extension_filter = "(.+)[\.](\w{3,4})" # [1] is leaf, [2] is extension
    $directory = ""
    function GetFileList ($directory = $this.directory) {
        if ([System.IO.Directory]::Exists($directory)) {
            # Gather raw file list
            Write-Information "Gathering files..."
            $files_raw = cmd /c dir "$directory\*.*" /s/a-d
            # Parse file list
            Write-Information "Parsing file list..."
            $files_with_details = foreach ($line in $files_raw) {
                Write-Debug "starting line {$($line)}"
                Switch -regex ($line) {
                    $this.directory_filter {
                        $directory = $matches[1]
                        break
                    }
                    $this.file_filter {
                        Write-Debug "parsing matches {$($matches.Values -join ";")}"
                        $date = $matches[1]
                        $time = $matches[2] # am/pm style
                        $size = $matches[3]
                        $filename = $matches[4]
                        # we do a second match here so as to not append a fake period to files without an extension, otherwise we could do a single match up above
                        Write-Debug "parsing extension from {$($filename)}"
                        if ($filename -match $this.extension_filter) {
                            $file_leaf = $matches[1]
                            $file_extension = $matches[2]
                        } else {
                            $file_leaf = $filename
                            $file_extension = ""
                        }
                        [pscustomobject][ordered]@{
                            "fullname" = [string]"$($directory)\$($filename)"
                            "filename" = [string]$filename
                            "folder" = [string]$directory
                            "file_leaf" = [string]$file_leaf
                            "extension" = [string]$file_extension
                            "date" = get-date "$($date) $($time)"
                            # strip thousands separators; [long] so files over 2 GB do not overflow
                            "size" = [long]($size -replace ',', '')
                        }
                        break
                    }
                } # finish directory/file test
            } # finish all files
            return $files_with_details
        } # finish directory exists test
        else {
            # directory doesn't exist
            throw "Directory not found"
        }
    }
    function GetSize($files_with_details) {
        $combined_size = ($files_with_details|measure -Property size -sum).sum
        $pretty_size_gb = "$([math]::Round($combined_size / 1GB, 4)) GB"
        return $pretty_size_gb
    }
    Export-ModuleMember -Function * -Variable *
}