
PowerShell regex reading multiple lines

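I'm trying to read a file and match across multiple lines with a regular expression, but I'm running into problems. The file I'm trying to read looks like this: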

I 09/07/20 05:55PM [Backup Set] Starting backup to CrashPlan Central: 122 files (93.30MB) to back up
I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s: 147 files (197.90MB) backed up, 5.30MB encrypted and sent @ 323.5Kbps (Effective rate: 2.7Mbps)
I 09/07/20 06:00PM  - Unable to backup 1 file (next attempt within 15 minutes)
I 09/07/20 06:15PM [Backup Set] Starting backup to CrashPlan Central: 27 files (250MB) to back up
I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s: 28 files (250MB) backed up, 5MB encrypted and sent @ 302.5Kbps (Effective rate: 4.3Mbps)
I 09/07/20 06:34PM [Backup Set] Starting backup to CrashPlan Central: 18 files (169.30KB) to back up

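The lines appear to end in CR LF. Ultimately, I want to find every line that contains "Completed backup to" and is not immediately followed by a line containing "Unable to backup". However, I'm having trouble with even the simplest queries.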

Here is how I read in the text:

PS C:\temp> $rawtext = Get-Content '.\new 1.txt' -raw

PS C:\temp> $rawtext.GetType()

IsPublic IsSerial Name                 BaseType
-------- -------- ----                 --------
True     True     String               System.Object


PS C:\temp> $rawtext | Measure-Object -Line

Lines Words Characters Property
----- ----- ---------- --------
    6                       

And here are the results of some simple regex queries:

PS C:\temp> Select-String -InputObject $rawtext -pattern '^.*Completed.*$' # returns nothing
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?m)^.*Completed.*$' # returns the entire contents of $rawtext
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?ms)^.*Completed.*$' # also returns the entire contents of $rawtext
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?ms)^.*Completed.*\r\n$' # returns nothing
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?ms)^.*Completed.*\r\n' # returns the entire contents of $rawtext

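I would expect at least one of these queries to return each line containing "Completed". Clearly, though, PowerShell does not handle multiple lines the way I assumed. Can someone explain how to work with multiline regular expressions in PowerShell?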

FWIW, the following command successfully gets what I want in the OS X terminal, and it is essentially what I want to replicate in PoSH:

completedBackups=$(sed '/Completed[[:space:]]backup[[:space:]]to/!d;$!N;/\n.*Unable[[:space:]]to[[:space:]]backup[[:space:]]/!P;D' $f)

You can do the following:

$rawtext = Get-Content '.\new 1.txt' -Raw
$rawtext | Select-String -Pattern '(?m)^.*?Completed backup to.*$(?!\r?\n.*Unable to backup)' -AllMatches |
    Foreach-Object {$_.Matches.Value}

Explanation:

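(?m) is multiline mode, which lets ^ and $ match at the start and end of each line. In .NET, $ matches just before \n, so in a CRLF file the \r is consumed by .* and ends up at the end of each match.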

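(?!) is a negative lookahead, which consumes no characters. So from the end of the line ($), we look ahead and require that we do NOT find an optional carriage return (\r?) and a line feed (\n), followed by any characters (.*, confined to a single line because we are not using (?s)) and then Unable to backup.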
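To make the lookahead concrete, here is a minimal sketch that runs the same pattern through the .NET [regex] class instead of Select-String. The sample text is a shortened, hypothetical stand-in for the log file, and the Trim() call is an addition to drop the trailing carriage return that .* picks up in CRLF text.

```powershell
# Hypothetical sample with CRLF line endings, standing in for the real log file.
$sample = @(
    'I 06:00PM Completed backup to A'
    'I 06:00PM  - Unable to backup 1 file'
    'I 06:19PM Completed backup to B'
    'I 06:34PM Starting backup to C'
) -join "`r`n"

$pattern = '(?m)^.*?Completed backup to.*$(?!\r?\n.*Unable to backup)'

# [regex]::Matches returns every match; Trim() removes the trailing CR from each value.
$kept = [regex]::Matches($sample, $pattern) | ForEach-Object { $_.Value.Trim() }
$kept
```

Only the "Completed backup to B" line should come back, because the "A" line is followed directly by an "Unable to backup" line and the lookahead rejects it. Note that [regex] is case-sensitive by default, while Select-String is not.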

The -AllMatches switch tells the command to keep looking for matches after the first successful one.
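The effect of -AllMatches can be seen by counting matches with and without it. This is a small sketch on a made-up three-line string, not on the original log:

```powershell
# Hypothetical three-line input passed to Select-String as ONE string.
$sample = "a Completed`r`nb other`r`nc Completed"

$first = $sample | Select-String -Pattern '(?m)^.*Completed'
$all   = $sample | Select-String -Pattern '(?m)^.*Completed' -AllMatches

$first.Matches.Count   # only the first match is recorded
$all.Matches.Count     # every match in the string is recorded
```

Because the whole text is a single input string, omitting -AllMatches would silently drop every match after the first.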

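Using the -Raw switch is helpful because it lets us easily look ahead to the next line of text. Without -Raw, we would need to keep track of the previous lines piped into Select-String. That is doable, but it takes a different approach.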
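As one possible shape for the non-Raw route, the sketch below walks an array of lines by index and peeks at the following line. This is a swapped-in technique (an index loop rather than tracking lines through the Select-String pipeline), and the $lines array is a hypothetical stand-in for the output of Get-Content without -Raw.

```powershell
# Hypothetical stand-in for: $lines = Get-Content '.\new 1.txt'
$lines = @(
    'I 06:00PM Completed backup to A'
    'I 06:00PM  - Unable to backup 1 file'
    'I 06:19PM Completed backup to B'
    'I 06:34PM Starting backup to C'
    'I 06:40PM Completed backup to D'
)

$kept = for ($i = 0; $i -lt $lines.Count; $i++) {
    if ($lines[$i] -notmatch 'Completed backup to') { continue }
    # For the last line there is no next line; use an empty string so nothing matches.
    $next = if ($i + 1 -lt $lines.Count) { $lines[$i + 1] } else { '' }
    if ($next -notmatch 'Unable to backup') { $lines[$i] }
}
$kept
```

Here the "B" and "D" lines are kept. Since Get-Content already strips line endings, no carriage returns need trimming in this variant.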

(?s), or single-line mode, matters when you use . to match characters: in single-line mode, . also matches newline characters.
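The difference is easiest to see side by side. In this sketch (made-up input, [regex]::Match used for brevity), the same pattern is run with and without the s option:

```powershell
# Hypothetical three-line input with CRLF line endings.
$sample = "one`r`ntwo Completed`r`nthree"

# Without (?s), . stops at each line feed, so the match stays on one line.
$default    = [regex]::Match($sample, '(?m)^.*Completed.*$').Value

# With (?s), . crosses line feeds, so greedy .* swallows the whole string.
$singleLine = [regex]::Match($sample, '(?ms)^.*Completed.*$').Value

$default.Trim()            # just the line containing Completed
$singleLine -eq $sample    # True: the match spans the entire input
```

This is consistent with the (?ms) attempts in the question returning the entire contents of $rawtext.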

Because Select-String returns MatchInfo objects, you need to read the Value property of each object's Matches property to get the lines that actually matched.
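A short sketch of what the MatchInfo object holds when the input is one raw string (the input here is invented for illustration). The Line property is the whole input, which is likely why the default output looks like "the entire contents", while Matches.Value holds the individual hits:

```powershell
# Hypothetical raw text, passed to Select-String as a single string.
$sample = "first Completed`r`nsecond line`r`nthird Completed"

$hit = $sample | Select-String -Pattern '(?m)^.*Completed.*$' -AllMatches

$hit.Line -eq $sample                               # True: Line is the full input string
$hit.Matches.Count                                  # number of individual matches
$values = $hit.Matches.Value | ForEach-Object { $_.Trim() }   # Trim() drops any trailing CR
$values
```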

Why not just do this...

# Create the data file
'
I 09/07/20 05:55PM [Backup Set] Starting backup to CrashPlan Central: 122 files (93.30MB) to back up
I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s: 147 files (197.90MB) backed up, 5.30MB encrypted and sent @ 323.5Kbps (Effective rate: 2.7Mbps)
I 09/07/20 06:00PM  - Unable to backup 1 file (next attempt within 15 minutes)
I 09/07/20 06:15PM [Backup Set] Starting backup to CrashPlan Central: 27 files (250MB) to back up
I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s: 28 files (250MB) backed up, 5MB encrypted and sent @ 302.5Kbps (Effective rate: 4.3Mbps)
I 09/07/20 06:34PM [Backup Set] Starting backup to CrashPlan Central: 18 files (169.30KB) to back up
' | 
Out-File -FilePath 'D:\Temp\BackUpLog.txt'


(Get-Content -Path 'D:\temp\BackUpLog.txt').GetType()
# Results
<#
IsPublic IsSerial Name                                     BaseType                                                                                                
-------- -------- ----                                     --------                                                                                                
True     True     Object[]                                 System.Array  
#>
((Get-Content -Path 'D:\temp\BackUpLog.txt') | 
Measure-Object -Line).Lines
# Results
<#
6
#>


(Get-Content -Path 'D:\temp\BackUpLog.txt' -Raw).GetType()
# Results
<#
IsPublic IsSerial Name                                     BaseType                                                                                                
-------- -------- ----                                     --------                                                                                                
True     True     String                                   System.Object 
#>


((Get-Content -Path 'D:\temp\BackUpLog.txt' -Raw) | 
Measure-Object -Line).Lines
# Results
<#
6
#>

# Use Select-String with a simple pattern
# (Get-Content without -Raw already returns one string per line, so no Split is needed)
Get-Content -Path 'D:\temp\BackUpLog.txt' | 
Select-String -Pattern 'Completed backup to'
# Use the -match operator to filter the array of lines
(Get-Content -Path 'D:\temp\BackUpLog.txt') -match 'Completed backup to'
# Results of both are (every "Completed backup to" line; the following line is not checked)
<#
I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s: 147 files (197.90MB) backed up, 5.30MB encrypted and sent @ 323.5Kbps (Effective rate: 2.7Mbps)
I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s: 28 files (250MB) backed up, 5MB encrypted and sent @ 302.5Kbps (Effective rate: 4.3Mbps)
#>

