PowerShell: Select line preceding a match (Select-String -Context issue when using an input string variable)
I need to return the line preceding a match in a multi-line string variable.
It seems that when a string variable is used as the input, Select-String considers the entire string as having matched. As such, the Context properties are "outside" either end of the string and are null.
Consider the example below:
$teststring = @"
line1
line2
line3
line4
line5
"@
Write-Host "Line Count:" ($teststring | Measure-Object -Line).Lines #verify PowerShell does regard input as a multi-line string (it does)
Select-String -Pattern "line3" -InputObject $teststring -AllMatches -Context 1,0 | % {
$_.Matches.Value #this prints the exact match
$_.Context #output shows all context properties to be empty
$_.Context.PreContext[0] #this would ideally output first line before the match
$_.Context.PreContext[0] -eq $null #but instead is null
}
Am I misunderstanding something here?
What is the best way to return "line2" when matching for "line3"?
Thanks!
Edit: Additional requirements I neglected to state: I need the line above ALL matched lines, for a string of indeterminate length. E.g., when searching the below for "line3" I need to return "line2" and "line5".
line1
line2
line3
line4
line5
line3
line6
Select-String operates on arrays of input, so rather than a single, multiline string you must provide an array of lines for -Context and -AllMatches to work as intended:
$teststring = @"
line1
line2
line3
line4
line5
line3
line6
"@
$teststring -split '\r?\n' | Select-String -Pattern "line3" -AllMatches -Context 1,0 | % {
"line before: " + $_.Context.PreContext[0]
"matched part: " + $_.Matches.Value # Prints what the pattern matched
}
This yields:
line before: line2
matched part: line3
line before: line5
matched part: line3
$teststring -split '\r?\n' splits the multi-line string into an array of lines; \r?\n handles either newline style (CRLF or LF-only).
Note that it is crucial to use the pipeline to provide Select-String's input; if you used -InputObject, the array would be coerced back to a single string.
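To make the pipeline versus -InputObject difference concrete, here is a minimal sketch. The variable names ($text, $lines, $viaPipeline, $viaParameter) are illustrative only and do not come from the answer.

```powershell
# Illustrative sample; variable names are assumptions, not part of the original answer.
$text  = "line1`nline2`nline3"
$lines = $text -split '\r?\n'

# Pipeline input: each line is a separate input object, so context lines exist.
$viaPipeline = $lines | Select-String -Pattern 'line3' -Context 1,0
$viaPipeline.Context.PreContext[0]

# -InputObject: the collection is treated as one combined string,
# so there is no separate "line before" to report (matching the behavior in the question).
$viaParameter = Select-String -Pattern 'line3' -InputObject $lines -Context 1,0
$viaParameter.Context.PreContext[0]
```

Only the pipeline form should print line2; the second lookup is expected to come back empty.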
Select-String is convenient, but slow.
Especially for a single string already in memory, a solution using the .NET Framework's [Regex]::Matches() method will perform much better, though it is more complex.
Note that PowerShell's own -match and -replace operators are built on the same .NET class, but do not expose all of its functionality; -match, which does report capture groups in the automatic $Matches variable, is not an option here, because it only ever returns 1 match.
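The single-match limitation of -match can be seen in a short sketch. The simplified pattern '(.*)\n.*line3' and the variable names are assumptions chosen for illustration; the sample string uses LF-only line breaks.

```powershell
# Illustrative sample with LF-only line breaks; names are assumptions.
$text = "line1`nline2`nline3`nline4`nline5`nline3`nline6"
$pattern = '(.*)\n.*line3'

# -match stops at the first match; $Matches holds only that one.
$firstOnly = if ($text -match $pattern) { $Matches[1] }

# [regex]::Matches() returns every match in the string.
$allBefore = [regex]::Matches($text, $pattern) | ForEach-Object { $_.Groups[1].Value }

"via -match:            $firstOnly"
"via [regex]::Matches(): $($allBefore -join ', ')"
```

With this input, -match reports only line2, while [regex]::Matches() reports both line2 and line5.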
The following is essentially the same approach as in mjolinor's answer, but with several problems corrected[1].
# Note: The sample string is defined so that it contains LF-only (\n)
# line breaks, merely to simplify the regex below for illustration.
# If your script file uses LF-only line breaks, the
# `-replace '\r?\n', "`n"` call isn't needed.
$teststring = @"
line1
line2
line3
line4
line5
line3
line6
"@ -replace '\r?\n', "`n"
[Regex]::Matches($teststring, '(?:^|(.*)\n).*(line3)') | ForEach-Object {
"line before: " + $_.Groups[1].Value
"matched part: " + $_.Groups[2].Value
}
Regex (?:^|(.*)\n).*(line3) uses 2 capture groups ((...)) to capture both the (matching part of) the line to match and the line before ((?:...) is an auxiliary non-capturing group that is needed for precedence):
- (?:^|(.*)\n) matches either the very start of the string (^) or (|) any, possibly empty, sequence of non-newline characters (.*) followed by a newline (\n); this ensures that the line to match is also found when there is no preceding line (i.e., if the line to match is the first one).
- (line3) is the group defining the line to match; it is preceded by .* to match the behavior in the question, where pattern line3 is found even if it is only part of a line.
To match full lines only, a stricter variant is:
(?:^|(.*)\n)(line3)(?:\n|$)
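As a rough comparison of the two patterns (the one with .* before (line3), and the stricter one ending in (?:\n|$)), the sketch below runs both against a sample in which line3 also appears inside a longer line. The sample text and variable names are made up for illustration.

```powershell
# Illustrative sample: 'xline3x' contains line3 only as part of a longer line.
$text = "line1`nxline3x`nline2`nline3"

# Partial-line pattern: line3 may appear anywhere within the matched line.
$partial = @([regex]::Matches($text, '(?:^|(.*)\n).*(line3)') |
    ForEach-Object { $_.Groups[1].Value })

# Full-line pattern: line3 must be the entire line.
$full = @([regex]::Matches($text, '(?:^|(.*)\n)(line3)(?:\n|$)') |
    ForEach-Object { $_.Groups[1].Value })

"partial-line matches, lines before: $($partial -join ', ')"
"full-line matches, lines before:    $($full -join ', ')"
```

The partial-line pattern reports line1 and line2 as preceding lines, whereas the full-line pattern reports only line2.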
[Regex]::Matches() finds all matches and returns them as a collection of System.Text.RegularExpressions.Match objects, which the ForEach-Object cmdlet call can then operate on to extract the capture-group matches ($_.Groups[<n>].Value).
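One detail worth illustrating: when the matching line is the first line, capture group 1 does not participate in the match, so its Value is an empty string. The sketch below, which is an assumption-laden illustration rather than part of the answer, uses the group's Success property to tell "no preceding line" apart from "empty preceding line". The property names Match and LineBefore are invented for this example.

```powershell
# Illustrative sample in which the first line itself matches.
$text = "line3`nline2`nline3"

$results = [regex]::Matches($text, '(?:^|(.*)\n).*(line3)') | ForEach-Object {
    [pscustomobject]@{
        Match      = $_.Groups[2].Value
        # Groups[1].Success is $false when the ^ alternative matched (no line before).
        LineBefore = if ($_.Groups[1].Success) { $_.Groups[1].Value } else { $null }
    }
}

$results
```

Here the first result has no preceding line ($null) and the second has line2.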
[1] As of this writing:
- There is no need to match twice: the enclosing if ($teststring -match $pattern) { ... } is unnecessary.
- Inline option (?m) is not needed, because . does not match newlines by default.
- (.+?) captures only nonempty lines (and ?, the non-greedy quantifier, is not needed).
- If the line of interest is the first line, i.e., if there's no line before, it won't be matched.
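Two of the points in this list can be checked quickly. The sketch below is only an illustration with made-up sample strings: it shows that . does not match a newline unless the (?s) option is set, and that (.+) cannot capture an empty preceding line while (.*) can.

```powershell
# . does not match \n by default; (?s) (single-line mode) changes that.
$sample = "a`nb"
$defaultDot = [regex]::IsMatch($sample, 'a.b')
$singleLine = [regex]::IsMatch($sample, '(?s)a.b')

# With an empty line before the match, (.+) finds nothing to capture, whereas (.*) does.
$withEmptyLine = "line1`n`nline3"
$plusCount = [regex]::Matches($withEmptyLine, '(.+)\nline3').Count
$starCount = [regex]::Matches($withEmptyLine, '(.*)\nline3').Count

"default dot matches newline: $defaultDot"
"(?s) dot matches newline:    $singleLine"
"(.+) matches: $plusCount, (.*) matches: $starCount"
```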
You can use a multi-line regex, with the -match operator:
$teststring = @"
line1
line2
line3
line4
line5
line3
line6
"@
$pattern =
@'
(?m)
(.+?)
line3
'@
if ($teststring -match $pattern) {
    [Regex]::Matches($teststring, $pattern) |
        ForEach-Object { $_.Groups[1].Value }
}