Extract multiple lines of text between two keywords from shell command output in PowerShell

I have a shell command whose output I'd like to extract data from using PowerShell. The data I need always sits between two keywords, and the number of lines between them can vary.

The output can look something like this.

Sites:
System1: 
RPAs: OK
Volumes: 
  WARNING: Storage group DR_UCS_01-08 contains both replicated and unreplicated volumes. ; CS_TX
  WARNING: Storage group DR_UCS_21-28 contains both replicated and unreplicated volumes. ; CS_TX
  WARNING: Storage group DR_UCS_31-38 contains both replicated and unreplicated volumes. ; CS_TX
Splitters: OK
System2: 
RPAs: OK
Volumes: 
  WARNING: Storage group MA_UCS_1 contains both replicated and unreplicated volumes. ; CS_MA
  WARNING: Storage group MA_UCS_2 contains both replicated and unreplicated volumes. ; CS_MA
  WARNING: Storage group MA_UCS_3 contains both replicated and unreplicated volumes. ; CS_MA
Splitters: OK
WAN: OK
System: OK

I would like to capture part of this data and store it in a variable (or a text file, if that is easier) so it can be reused later in the script. For example, I would like to capture everything between System1: and System2:, which would produce:

RPAs: OK
Volumes: 
  WARNING: Storage group DR_UCS_01-08 contains both replicated and unreplicated volumes. ; CS_TX
  WARNING: Storage group DR_UCS_21-28 contains both replicated and unreplicated volumes. ; CS_TX
  WARNING: Storage group DR_UCS_31-38 contains both replicated and unreplicated volumes. ; CS_TX
Splitters: OK

I've been trying different regex combinations with no success. I've had moderate success with the code below, but it doesn't seem to handle the warning lines, and I can't get Out-File to work with it either; only Write-Host works, which doesn't help me much.

$RP = plink -l User -pw Password 192.168.1.100 "get_system_status summary=no" #extract from

$script = $RP

$in = $false

$script | %{
if ($_.Contains("System1"))
    { $in = $true }
elseif ($_.Contains("System2"))
    { $in = $false; }
elseif ($in)
    { Write-Host $_ }
}

Ideally I'd like to be able to take this script and use it to parse data from any shell command. I'm currently lost and almost ready to give up on this.

Try this regex:

$result = ($text | Select-String 'System1:\s*\r\n((.*\r\n)*)\s*System2:' -AllMatches)
$result.Matches[0].Groups[1].Value

Where $text is your original input as a single string; Select-String tests each input string separately, so an array of lines has to be joined first. Note that you might have to adjust your line endings from \r\n to \n depending on your input. You may also get more than one match; I can't tell from your sample.

The regex starts by matching System1:\s*\r\n, which is System1: followed by any amount of whitespace and then a newline. It ends the match with the literal System2:. The inner part, .*\r\n, matches any run of characters followed by a newline. The outer part, (.*\r\n)*, repeats that pattern. Finally, the whole construct is grouped as ((.*\r\n)*) so that all the matching lines can be extracted as the result.
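Output from a native command usually arrives as an array of lines, and Select-String tests each of those strings on its own, so a pattern that spans lines only matches once the lines are joined. A minimal sketch of that step, assuming the sample lines in $lines stand in for the real command output (they are illustrative data, not captured from plink) and using \r?\n so either line ending is accepted:

```powershell
# Stand-in for the array of lines a native command returns (illustrative sample data).
$lines = @(
    'Sites:'
    'System1: '
    'RPAs: OK'
    'Volumes: '
    '  WARNING: Storage group DR_UCS_01-08 contains both replicated and unreplicated volumes. ; CS_TX'
    'Splitters: OK'
    'System2: '
    'RPAs: OK'
)

# Select-String matches each input string separately, so join the lines first.
$text = $lines -join "`r`n"

# \r?\n accepts both Windows (\r\n) and Unix (\n) line endings.
$pattern = 'System1:\s*\r?\n((.*\r?\n)*)\s*System2:'
$result = $text | Select-String -Pattern $pattern -AllMatches

# Group 1 holds every line between the two keywords.
$between = $result.Matches[0].Groups[1].Value
$between
```

Because $between is an ordinary string, it can be stored, reused later in the script, or piped to Out-File.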

One option is to join the text with newlines, then use -split with a multi-line regex:

$text = 
(@'
Sites:
System1: 
RPAs: OK
Volumes: 
  WARNING: Storage group DR_UCS_01-08 contains both replicated and unreplicated volumes. ; CS_TX
  WARNING: Storage group DR_UCS_21-28 contains both replicated and unreplicated volumes. ; CS_TX
  WARNING: Storage group DR_UCS_31-38 contains both replicated and unreplicated volumes. ; CS_TX
Splitters: OK
System2: 
RPAs: OK
Volumes: 
  WARNING: Storage group MA_UCS_1 contains both replicated and unreplicated volumes. ; CS_MA
  WARNING: Storage group MA_UCS_2 contains both replicated and unreplicated volumes. ; CS_MA
  WARNING: Storage group MA_UCS_3 contains both replicated and unreplicated volumes. ; CS_MA
Splitters: OK
WAN: OK
System: OK
'@).split("`n") |
foreach {$_.trim()} 

$text -join "`n" -split '(?ms)(?=^System\d+:\s*)' -match '^System\d+:'

System1:
RPAs: OK
Volumes:
WARNING: Storage group DR_UCS_01-08 contains both replicated and unreplicated volumes. ; CS_TX
WARNING: Storage group DR_UCS_21-28 contains both replicated and unreplicated volumes. ; CS_TX
WARNING: Storage group DR_UCS_31-38 contains both replicated and unreplicated volumes. ; CS_TX
Splitters: OK

System2:
RPAs: OK
Volumes:
WARNING: Storage group MA_UCS_1 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group MA_UCS_2 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group MA_UCS_3 contains both replicated and unreplicated volumes. ; CS_MA
Splitters: OK
WAN: OK
System: OK
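To reuse one of these blocks later in a script, the split result can be indexed by system name. A minimal sketch, assuming a shortened sample input and a hashtable named $bySystem (both are illustrative, not part of the original answer):

```powershell
# Shortened, illustrative sample; in practice this would be the joined command output.
$text = @(
    'Sites:'
    'System1:'
    'RPAs: OK'
    'Splitters: OK'
    'System2:'
    'RPAs: OK'
    'Splitters: OK'
    'WAN: OK'
) -join "`n"

# Split just before every "SystemN:" header and keep only the chunks that start with one.
$blocks = $text -split '(?m)(?=^System\d+:)' -match '^System\d+:'

# Index each block by its header name so it can be looked up later in the script.
$bySystem = @{}
foreach ($block in $blocks) {
    # Split into at most two parts: the header line and the rest of the block.
    $name, $body = $block.Trim() -split "`n", 2
    $bySystem[$name.TrimEnd(':')] = $body
}

$bySystem['System1']
```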

Edit: a more generic solution that just captures the output between two specific keywords:

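# (?s) lets "." match newlines; .+? stops at the first System2: after System1:.
$regex = '(?ms)System1:(.+?)System2:'

# A regex that spans lines needs a single string, so join the lines first.
$text = $text -join "`n"

# Trim the captured text, then split it back into individual lines.
$OutputText = 
[regex]::Matches($text,$regex) |
 foreach {$_.groups[1].value.Trim() -split "`n"}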
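Since the question asks for something that works with any shell command, the same idea can be wrapped in a function that takes the two keywords as parameters. This is only a sketch: Get-TextBetween and its parameter names are hypothetical, and the sample input stands in for real command output.

```powershell
# Get-TextBetween is a hypothetical helper name, not a built-in cmdlet.
function Get-TextBetween {
    param(
        [Parameter(Mandatory)] [string[]] $InputLines,
        [Parameter(Mandatory)] [string]   $Start,
        [Parameter(Mandatory)] [string]   $End
    )

    # Join the lines so one regex can span them; (?s) lets "." match newlines.
    $joined = $InputLines -join "`n"

    # Escape the keywords so characters such as "." or "(" are taken literally.
    $regex = '(?s){0}(.+?){1}' -f [regex]::Escape($Start), [regex]::Escape($End)

    foreach ($match in [regex]::Matches($joined, $regex)) {
        # Emit the captured text line by line, without surrounding blank lines.
        $match.Groups[1].Value.Trim() -split "`n"
    }
}

# Illustrative input standing in for the output of a native command.
$sample = 'Sites:', 'System1: ', 'RPAs: OK', 'Volumes: ', 'Splitters: OK', 'System2: ', 'RPAs: OK'

$section = Get-TextBetween -InputLines $sample -Start 'System1:' -End 'System2:'
$section
```

The function writes its results to the pipeline instead of calling Write-Host, so the output can be assigned to a variable as above or piped straight to Out-File.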

I tried to adapt this script for my own use: I wanted to do the same thing, but capture what's between <text> and </text> (a note file from a Kobo reader). I eventually got it working, and it looks like this:

$text = @"
<text>The deaths I see are frequently undignified; the dying very often have not accepted or understood their situation, the truth denied them by well-intentioned relatives and doctors. Their death has been stolen from them.
</text>
            </fragment>
        </target>
        <content>
                <text>It is indeed impossible to imagine our own death; and whenever we attempt to do so, we can perceive that we are in fact still present as </text>
"@
$regex = '(?ms)<text>(.+?)</text>'

#Test
$OutputText = [regex]::Matches($text,$regex) | 
foreach {$_.groups[1].value }
Write-Host $OutputText

#Output
[regex]::Matches($text,$regex) | 
foreach {$_.groups[1].value } |
Out-File c:\temp\kobo\example_out.txt -Encoding utf8
