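PowerShell regex reading multiple lines
I'm trying to read a file and match across multiple lines with a regular expression, but I'm running into problems. The file I'm trying to read looks like this: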
I 09/07/20 05:55PM [Backup Set] Starting backup to CrashPlan Central: 122 files (93.30MB) to back up
I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s: 147 files (197.90MB) backed up, 5.30MB encrypted and sent @ 323.5Kbps (Effective rate: 2.7Mbps)
I 09/07/20 06:00PM - Unable to backup 1 file (next attempt within 15 minutes)
I 09/07/20 06:15PM [Backup Set] Starting backup to CrashPlan Central: 27 files (250MB) to back up
I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s: 28 files (250MB) backed up, 5MB encrypted and sent @ 302.5Kbps (Effective rate: 4.3Mbps)
I 09/07/20 06:34PM [Backup Set] Starting backup to CrashPlan Central: 18 files (169.30KB) to back up
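The lines appear to end with CR LF. Ultimately, I want to find every line containing "Completed backup to" that is not immediately followed by a line containing "Unable to backup". However, I'm having trouble with even the simplest queries.
Here is how I'm reading in the text: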
PS C:\temp> $rawtext = Get-Content '.\new 1.txt' -raw
PS C:\temp> $rawtext.GetType()
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True String System.Object
PS C:\temp> $rawtext | Measure-Object -Line
Lines Words Characters Property
----- ----- ---------- --------
6
And here are the results of some simple regex queries:
PS C:\temp> Select-String -InputObject $rawtext -pattern '^.*Completed.*$' # returns nothing
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?m)^.*Completed.*$' # returns the entire contents of $rawtext
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?ms)^.*Completed.*$' # also returns the entire contents of $rawtext
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?ms)^.*Completed.*\r\n$' # returns nothing
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?ms)^.*Completed.*\r\n' # returns the entire contents of $rawtext
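I expected at least one of these queries to return each line containing "Completed". Clearly PowerShell doesn't handle multiple lines the way I assumed. Can someone explain how to work with multiline regexes in PowerShell?
FWIW, the following command successfully gets what I want in an OSX terminal, and is basically what I'd like to replicate in PoSH: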
completedBackups=$(sed '/Completed[[:space:]]backup[[:space:]]to/!d;$!N;/\n.*Unable[[:space:]]to[[:space:]]backup[[:space:]]/!P;D' $f)
You can do the following:
$rawtext = Get-Content '.\new 1.txt' -Raw
$rawtext | Select-String -Pattern '(?m)^.*?Completed backup to.*$(?!\r?\n.*Unable to backup)' -AllMatches |
Foreach-Object {$_.Matches.Value}
Explanation:
(?m) is multiline mode, which allows ^ and $ to match at the start and end of each line.
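(?!) is a negative lookahead, which consumes no characters. From the end of the line ($) we look ahead and require that what follows is not an optional carriage return (\r?) and a line feed (\n), followed by any characters (.*, staying on one line because (?s) is not used) and then Unable to backup.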
The -AllMatches switch tells the command to keep matching after the first successful match.
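Using the -Raw switch is helpful because it lets us easily look ahead into the next line of text. Without -Raw, we would need to keep track of the previous lines piped into Select-String. That is doable, but it takes a different approach.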
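As a rough illustration of that different approach, the sketch below walks an array of lines by index and checks the line after each hit. It assumes the lines are already in memory (as Get-Content returns them without -Raw); the $lines sample is abbreviated from the log in the question, and the variable names are only illustrative.

```powershell
# Abbreviated sample lines; in practice: $lines = Get-Content '.\new 1.txt'
$lines = @(
    'I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s'
    'I 09/07/20 06:00PM - Unable to backup 1 file (next attempt within 15 minutes)'
    'I 09/07/20 06:15PM [Backup Set] Starting backup to CrashPlan Central: 27 files (250MB) to back up'
    'I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s'
    'I 09/07/20 06:34PM [Backup Set] Starting backup to CrashPlan Central: 18 files (169.30KB) to back up'
)

$completed = for ($i = 0; $i -lt $lines.Count; $i++) {
    # Skip lines that are not "Completed backup to" entries
    if ($lines[$i] -notmatch 'Completed backup to') { continue }

    # Peek at the following line, if there is one
    $next = if ($i + 1 -lt $lines.Count) { $lines[$i + 1] } else { '' }

    # Emit the line only when the next line does not report a failure
    if ($next -notmatch 'Unable to backup') { $lines[$i] }
}

$completed
```

With this sample, only the 06:19PM entry should be emitted, since the 06:00PM entry is followed by an "Unable to backup" line.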
(?s), or single-line mode, changes how . matches: in single-line mode, . also matches newline characters.
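To make the difference between the two modes concrete, here is a small sketch using the .NET [regex] class directly. The three-line $text value is made up for illustration and uses CR LF line endings like the log file.

```powershell
# Hypothetical sample with CR LF line endings
$text = "one`r`ntwo`r`nthree"

# Without (?m), ^ anchors only at the start of the whole string
$noMultiline = [regex]::Matches($text, '^\w+').Count         # 1

# With (?m), ^ also matches at the start of every line
$multiline = [regex]::Matches($text, '(?m)^\w+').Count       # 3

# In .NET, $ in multiline mode matches before LF, not before CR LF,
# so a strict \w+$ only succeeds on the last line of CR LF text
$strictEnd = [regex]::Matches($text, '(?m)^\w+$').Count      # 1

# Allowing an optional CR before $ restores one match per line
$crTolerant = [regex]::Matches($text, '(?m)^\w+\r?$').Count  # 3

# Without (?s), . stops at LF (but does consume the CR)
$dotDefault = [regex]::Match($text, 'one.*').Value.Length      # 4

# With (?s), . crosses line boundaries and runs to the end of the text
$dotSingle = [regex]::Match($text, '(?s)one.*').Value.Length   # 15
```

This is also why a pattern ending in .*$ still works on CR LF input: the .* swallows the trailing carriage return, so matched values may carry a trailing CR that can be removed with TrimEnd().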
Because Select-String returns MatchInfo objects, you need to access the Value property of the object's Matches property to get the text that actually matched.
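This probably also explains why several queries in the question appeared to return the entire contents of $rawtext: by default the console shows the MatchInfo object's Line, and for a single -Raw string that is the whole input. A minimal sketch, using a made-up three-line string:

```powershell
# Hypothetical sample; one string containing three CR LF separated lines
$text = "alpha`r`nbeta Completed`r`ngamma"

$hit = $text | Select-String -Pattern '(?m)^.*Completed.*$'

# Line holds the entire input object, here the whole multi-line string
$wholeInput = $hit.Line

# Matches.Value holds only what the regex matched; TrimEnd() drops the
# trailing CR that .* picks up on CR LF input
$matchedOnly = $hit.Matches.Value.TrimEnd()

$matchedOnly
```

So the regex itself matched a single line; it was the default display of the result that made it look as if everything had matched.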
Why not do this...
# Create the data file
'
I 09/07/20 05:55PM [Backup Set] Starting backup to CrashPlan Central: 122 files (93.30MB) to back up
I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s: 147 files (197.90MB) backed up, 5.30MB encrypted and sent @ 323.5Kbps (Effective rate: 2.7Mbps)
I 09/07/20 06:00PM - Unable to backup 1 file (next attempt within 15 minutes)
I 09/07/20 06:15PM [Backup Set] Starting backup to CrashPlan Central: 27 files (250MB) to back up
I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s: 28 files (250MB) backed up, 5MB encrypted and sent @ 302.5Kbps (Effective rate: 4.3Mbps)
I 09/07/20 06:34PM [Backup Set] Starting backup to CrashPlan Central: 18 files (169.30KB) to back up
' |
Out-File -FilePath 'D:\Temp\BackUpLog.txt'
(Get-Content -Path 'D:\temp\BackUpLog.txt').GetType()
# Results
<#
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True Object[] System.Array
#>
((Get-Content -Path 'D:\temp\BackUpLog.txt') |
Measure-Object -Line).Lines
# Results
<#
6
#>
(Get-Content -Path 'D:\temp\BackUpLog.txt' -Raw).GetType()
# Results
<#
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True String System.Object
#>
((Get-Content -Path 'D:\temp\BackUpLog.txt' -Raw) |
Measure-Object -Line).Lines
# Results
<#
6
#>
# Get-Content without -Raw already returns one string per line, so no Split is needed
# Use Select-String with pattern and -AllMatches
Get-Content -Path 'D:\temp\BackUpLog.txt' |
Select-String -Pattern 'Completed backup to' -AllMatches
# Use the -match operator to filter the array of lines
(Get-Content -Path 'D:\temp\BackUpLog.txt') -match 'Completed backup to'
# Results of both are
<#
I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s: 147 files (197.90MB) backed up, 5.30MB encrypted and sent @ 323.5Kbps (Effective rate: 2.7Mbps)
I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s: 28 files (250MB) backed up, 5MB encrypted and sent @ 302.5Kbps (Effective rate: 4.3Mbps)
#>
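Note that these results still include the 06:00PM entry, which is followed by an "Unable to backup" line. One possible way to drop such entries while staying line-based is Select-String's -Context parameter, which attaches the following line to each match. The sketch below assumes a throwaway sample file in the temp directory (the file name and the abbreviated log lines are made up for illustration).

```powershell
# Write an abbreviated sample log to a temporary file (hypothetical path)
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'BackUpLog-sample.txt'
@(
    'I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s'
    'I 09/07/20 06:00PM - Unable to backup 1 file (next attempt within 15 minutes)'
    'I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s'
    'I 09/07/20 06:34PM [Backup Set] Starting backup to CrashPlan Central: 18 files (169.30KB) to back up'
) | Set-Content -Path $path

# -Context 0, 1 captures zero lines before and one line after each match
$kept = Select-String -Path $path -Pattern 'Completed backup to' -Context 0, 1 |
    # Keep the match only if the captured following line is not a failure notice
    Where-Object { -not ($_.Context.PostContext -match 'Unable to backup') } |
    ForEach-Object { $_.Line }

$kept

Remove-Item -Path $path
```

With this sample, only the 06:19PM entry should remain.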