How do I parse this string in PowerShell?

Question

I have a block of text that I need to parse, but I'm unsure how to go about it. The text is saved in a variable, which we can call $block for simplicity's sake, and it includes all the whitespace shown below.

I would like the result to be an iterable list, the first value being Health_AEPOEP_Membership_Summary - Dev and the second being Health_AEPOEP_YoY_Comparison_Summary - Dev. Assume the list of workbooks can be longer (up to 50) or shorter (minimum 1 workbook), and that all workbooks are formatted similarly (in terms of name_with_underscores - Dev). I'd try the $block.split(" ") method, but it produces many empty entries that may be hard to enumerate and account for.


                    Workbooks : Health_AEPOEP_Membership_Summary - Dev [Project: Health - Dev]
                                Health_AEPOEP_YoY_Comparison_Summary - Dev [Project: Health - Dev]

Any help is much appreciated!

Answer 1

If the text is in a file, this becomes a little easier, and I would recommend this approach:

switch -Regex -File $file {
    '(\w+_.+?- Dev)' { $Matches[1] }
}

Regex details

() - capture group

\w+ - match one or more word characters (letters, digits, or underscores)

_ - match a literal underscore

.+? - match one or more of any character, as few as possible, so the match stops at the first "- Dev" rather than at the one inside [Project: Health - Dev]

- Dev - literal match of dash, space, Dev
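As a self-contained way to try the file-based approach, the sketch below writes the sample text to a temporary file and runs the switch over it. The file name workbooks-sample.txt and the variable $names are assumptions made for this demo, and the pattern uses a lazy .+? so that each match ends at the first "- Dev" on the line.

```powershell
# Write the sample text to a temporary file (the file name is an assumption for this demo).
$file = Join-Path ([System.IO.Path]::GetTempPath()) 'workbooks-sample.txt'
@'
                    Workbooks : Health_AEPOEP_Membership_Summary - Dev [Project: Health - Dev]
                                Health_AEPOEP_YoY_Comparison_Summary - Dev [Project: Health - Dev]
'@ | Set-Content -Path $file

# switch -File tests the pattern against each line of the file in turn.
# The lazy .+? stops at the first "- Dev" instead of the one inside the brackets.
$names = switch -Regex -File $file {
    '(\w+_.+?- Dev)' { $Matches[1] }
}

# Show the captured names, then clean up the temporary file.
$names
Remove-Item -Path $file
```

Lines that do not match the pattern, such as blank lines, simply produce no output, so no extra filtering is needed.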

If it's already in a variable, it depends on whether it's a string array or a single string. Assuming it's a single string, I'd recommend this approach:

$regex = [regex]'(\w+_.+)(?=(\s\[.+))'

$regex.Matches($block).value

Health_AEPOEP_Membership_Summary - Dev
Health_AEPOEP_YoY_Comparison_Summary - Dev

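Regex details

Same as above, but without the literal - Dev at the end; the greedy .+ instead backs off until the lookahead below succeeds. The following is added:

(?=) - positive lookahead: asserts that the text that follows matches, without including it in the result

\s\[.+ - match a whitespace character, a left square bracket, and one or more characters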

Simply add a variable assignment $strings = before either of these to capture the output. Either would work on one workbook or 500.
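For illustration, here is a minimal sketch of capturing the matches and iterating over them. It assumes $block might be either a single string or an array of lines, so it joins with a newline first (a no-op for a single string). The variable $text and the "Workbook:" label are made up for this example, and the inner capture group of the lookahead is omitted because it is not needed.

```powershell
# Sample input as an array of lines; a single multi-line string works the same way.
$block = @(
    '                    Workbooks : Health_AEPOEP_Membership_Summary - Dev [Project: Health - Dev]'
    '                                Health_AEPOEP_YoY_Comparison_Summary - Dev [Project: Health - Dev]'
)

# Joining leaves a single string unchanged and flattens an array into one string.
$text = $block -join "`n"

# Match the name, stopping where whitespace plus "[" follows (lookahead, not captured).
$regex = [regex]'(\w+_.+)(?=\s\[.+)'

# @() keeps the result an array even when there is only one workbook.
$strings = @($regex.Matches($text).Value)

foreach ($name in $strings) {
    "Workbook: $name"
}
```

Wrapping the result in @() is a small design choice: with a single match, member enumeration would otherwise return a bare string, and indexing into it would yield characters rather than names.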

Answer 2

You could write a multi-line regex pattern and try to extract the names, but it might be easier to reason about if you just break it into simple(r) steps:

$string = @'

                    Workbooks : Health_AEPOEP_Membership_Summary - Dev [Project: Health - Dev]
                                Health_AEPOEP_YoY_Comparison_Summary - Dev [Project: Health - Dev]



'@

# Split into one string per line
$strings = $string -split '\r?\n'

# Remove leading whitespace
$strings = $strings -replace '^\s*' 

# Remove `Workbooks : ` prefix (strings that don't match will be left untouched)
$strings = $strings -replace '^Workbooks :\s*' 

# Remove `[Project: $NAME]` suffix
$strings = $strings -replace '\s*\[Project: [^\]]+\]'

# Get rid of empty lines
$strings = $strings |Where-Object Length

$strings now contains the two workbook names.
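If the same clean-up is needed in more than one place, the steps can be folded into a small function. The sketch below is one possible way to do that; Get-WorkbookName is a hypothetical helper name, not a built-in cmdlet, and it assumes every line follows the name [Project: ...] layout shown in the question.

```powershell
# Get-WorkbookName is a hypothetical helper, not a built-in cmdlet.
function Get-WorkbookName {
    param([string]$Text)

    # Split into lines, strip leading whitespace plus the optional "Workbooks : " prefix,
    # strip the "[Project: ...]" suffix, then drop lines that end up empty.
    $Text -split '\r?\n' |
        ForEach-Object { $_ -replace '^\s*(Workbooks :\s*)?' -replace '\s*\[Project: [^\]]+\]\s*$' } |
        Where-Object Length
}

$block = @'

                    Workbooks : Health_AEPOEP_Membership_Summary - Dev [Project: Health - Dev]
                                Health_AEPOEP_YoY_Comparison_Summary - Dev [Project: Health - Dev]

'@

# @() keeps the result an array even when only one workbook is listed.
$names = @(Get-WorkbookName -Text $block)
$names
```

Keeping each -replace narrow makes it easy to adjust one step, for example if the prefix text changes, without touching the others.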

2 answers

Answer 1
1 2020-11-18 18:19:16

Answer 2
1 Accepted 2020-11-18 18:27:14
