Matching and capturing multiple RegEx statements

I'm running the code below on a 2008 R2 file server to extract quota info, since the PS FSRM module isn't available. Matching against the $RegEx variable works fine as long as the pattern contains only 2 capture strings: the $matches[1] and $matches[2] values are added to the object array as expected. But when I try to add a 3rd capture, or in this case 5 captures, I get no output at all. Nothing in $matches and nothing in $objArr.

$RegEx = 'Quota Path:\s+(.*)[\s\S]*?' +
         'Source Template:\s+(.*)\s+' +
         'Limit:\s+(.*)\s+' +
         'Used:\s+(.*)\s+' +
         'Available:\s+(.*)'
$objArr = @()

$objArr = (dirquota qu l | Out-String) -replace '\r\n', "`n" -split '\n\n' |
          where {$_ -match $RegEx} |
          foreach {
            New-Object -TypeName psobject -Property ([ordered]@{
              QuotaPath  = $matches[1]
              Template   = $matches[2]
              QuotaLimit = $matches[3]
              Used       = $matches[4]
              Available  = $matches[5]
            })
          }

What I don't understand is that I can rearrange the captures and any combination of 2 will work, so the capture strings seem to be correct to some degree, but as soon as I try to add a 3rd or more, I get nothing. I'm sure I'm missing something in the way the RegEx capture strings are formatted.

The command dirquota qu l | Out-String outputs a string as follows:

...

Quota Path:             E:\DirA\SubdirA\SubdirA1
Share Path:             \\SERVER\SubdirA\SubdirA1
                        \\SERVER\E\DirA\SubdirA\SubdirA1
                        \\SERVER\DirA\SubdirA\SubdirA1
Source Template:        TemplateA (Matches template)
Quota Status:           Enabled
Limit:                  500.00 MB (Hard)
Used:                   6.00 KB (0%)
Available:              499.99 MB
Peak Usage:             6.00 KB (4/1/2015 12:27 PM)
Thresholds:
   Warning ( 80%):      Event Log
   Limit (100%):        Event Log

Quota Path:             E:\DirB\SubdirB\SubdirB1
Share Path:             \\SERVER\SubdirB\SubdirB1
                        \\SERVER\E\DirB\SubdirB\SubdirB1
                        \\SERVER\DirB\SubdirB\SubdirB1
Source Template:        TemplateB (Matches template)
Quota Status:           Enabled
Limit:                  500.00 MB (Hard)
Used:                   1.00 KB (0%)
Available:              500.00 MB
Peak Usage:             1.00 KB (7/12/2016 12:09 PM)
Thresholds:
   Warning ( 80%):      Event Log
   Limit (100%):        Event Log

...
  • I recently read in an answer that the validity of the $matches collection across pipeline boundaries isn't guaranteed.
  • Therefore I removed the where and test the match inside the foreach instead,
  • get the data from a file,
  • use a new, second RegEx to split the file into chunks starting with (and including) Quota Path,
  • broke the RegEx down on RegEx101.com (see the link in the code),
  • use named capture groups to keep better track of the captures,
  • and pipe the resulting $objArr to Out-GridView.

# https://www.regex101.com/r/3WrfYk/1
$File = ".\quota.txt"
# dirquota qu l | Set-Content $File
$Delimiter = 'Quota Path:'
$Escaped   = [regex]::Escape($Delimiter)
$Split     = "(?!^)(?=$Escaped)"
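# (?s): . also matches line breaks; (?m): ^ and $ match at line boundaries; (?i): ignore case.
# \r? keeps a trailing carriage return (CRLF files) out of the captured values.
$RegEx = '(?smi)^Quota Path:\s+(?<QuotaPath>.*?)\r?$.*?' `
         + '^Source Template:\s+(?<Template>.*?)\r?$.*?' `
         + '^Limit:\s+(?<QuotaLimit>.*?)\r?$.*?' `
         + '^Used:\s+(?<Used>.*?)\r?$.*?' `
         + '^Available:\s+(?<Available>.*?)\r?$'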
$objArr = @()
$objArr = ((Get-Content $File -Raw) -split $Split)|
  foreach {
    if ($_ -match $RegEx) {
       New-Object -TypeName psobject -Property (
       [ordered]@{ QuotaPath  = $matches.QuotaPath 
                   Template   = $matches.Template  
                   QuotaLimit = $matches.QuotaLimit
                   Used       = $matches.Used      
                   Available  = $matches.Available
                })
    } # if
} # foreach
$objArr|select QuotaPath,Template,QuotaLimit,Used,Available|out-gridview
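
The splitting step can be tried in isolation. The following is a minimal sketch, assuming a shortened, made-up two-record input rather than real dirquota output. It shows that the lookahead (?=...) splits in front of each Quota Path: without consuming it, and that (?!^) keeps the split from happening at the very start of the text.

```powershell
# Hypothetical, shortened input: two records with LF line endings.
$text = "Quota Path: E:\DirA`nUsed: 6.00 KB`nQuota Path: E:\DirB`nUsed: 1.00 KB`n"

$Delimiter = 'Quota Path:'
$Escaped   = [regex]::Escape($Delimiter)   # escape regex metacharacters in the delimiter
$Split     = "(?!^)(?=$Escaped)"           # zero-width: split before the delimiter, not at the start

$chunks = $text -split $Split

$chunks.Count                                      # 2
$chunks | ForEach-Object { ($_ -split "`n")[0] }   # first line of each chunk
```

Because the split pattern is zero-width, every chunk still begins with its own Quota Path: line, which is what the anchored ^Quota Path: in the main pattern relies on.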

You're not getting the desired results because your regular expression simply doesn't match. There's an additional line between the Source Template and Limit records that your modified regular expression doesn't account for:

...
Quota Path:             E:\DirA\SubdirA\SubdirA1
Share Path:             \\SERVER\SubdirA\SubdirA1
                        \\SERVER\E\DirA\SubdirA\SubdirA1
                        \\SERVER\DirA\SubdirA\SubdirA1
Source Template:        TemplateA (Matches template)
Quota Status:           Enabled
Limit:                  500.00 MB (Hard)
Used:                   6.00 KB (0%)
Available:              499.99 MB
...

The Source Template:\s+(.*)\s+ part of your regular expression matches the (sub)string "Source Template:" followed by one or more whitespace characters (\s+), a grouped match of all characters up to (but not including) the next newline ((.*)), and again one or more whitespace characters (\s+). However, since the next part of your regular expression is Limit:\s+(.*)\s+, you'd only get a match if the line after Source Template: began with Limit:.
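
This can be checked directly on a shortened record. The following is a minimal sketch, assuming LF line endings and a three-line excerpt of the output above; the variable names are only for illustration. The same prefix matches when the next literal is the text that really follows the Source Template line, and fails when it is Limit:.

```powershell
# Three-line excerpt of one record (LF line endings), taken from the sample output.
$record = "Source Template:        TemplateA (Matches template)`n" +
          "Quota Status:           Enabled`n" +
          "Limit:                  500.00 MB (Hard)`n"

# Fails: after (.*) and \s+ the next text is "Quota Status:", not "Limit:".
$toLimit  = $record -match 'Source Template:\s+(.*)\s+Limit:'

# Succeeds: "Quota Status:" really is what follows the line break.
$toStatus = $record -match 'Source Template:\s+(.*)\s+Quota Status:'

"followed by Limit:        $toLimit"    # False
"followed by Quota Status: $toStatus"   # True
```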

Basically, the pattern Source Template:\s+(.*)\s+ only consumes this, the Source Template line plus the line break after it:

Source Template:        TemplateA (Matches template)

when you actually need it to consume this, everything up to the start of the Limit: line:

Source Template:        TemplateA (Matches template)
Quota Status:           Enabled

To make it include the additional lines, you need to change

'Source Template:\s+(.*)\s+'

into

'Source Template:\s+(.*)[\s\S]+?'

The character class [\s\S] matches any character, not just whitespace characters (\s), and the quantifier +? makes a non-greedy match of one or more characters. That way the expression includes all text up to the next occurrence of the string Limit:.
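
Put together, the question's pattern with only that one change can be tried against a single record. The following is a sketch, assuming the shortened record below (taken from the question's sample output) with line endings normalized to LF, and PowerShell 3.0 or later for [ordered], as in the question's code.

```powershell
# One shortened record from the sample output, normalized to LF line endings.
$record = @'
Quota Path:             E:\DirA\SubdirA\SubdirA1
Share Path:             \\SERVER\SubdirA\SubdirA1
Source Template:        TemplateA (Matches template)
Quota Status:           Enabled
Limit:                  500.00 MB (Hard)
Used:                   6.00 KB (0%)
Available:              499.99 MB
Peak Usage:             6.00 KB (4/1/2015 12:27 PM)
'@ -replace '\r\n', "`n"

# Same pattern as in the question, except [\s\S]+? after the Source Template capture.
$RegEx = 'Quota Path:\s+(.*)[\s\S]*?' +
         'Source Template:\s+(.*)[\s\S]+?' +
         'Limit:\s+(.*)\s+' +
         'Used:\s+(.*)\s+' +
         'Available:\s+(.*)'

if ($record -match $RegEx) {
    $obj = New-Object -TypeName psobject -Property ([ordered]@{
        QuotaPath  = $matches[1]
        Template   = $matches[2]
        QuotaLimit = $matches[3]
        Used       = $matches[4]
        Available  = $matches[5]
    })
    $obj
}
```

Limit:, Used: and Available: sit on consecutive lines, so the plain \s+ between them is enough there; only the gap after Source Template: needs the wider [\s\S]+?.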
