How to Select-String from Multiple Lines with PowerShell

I have the file test.dat, shown below:

        <category>Games</category>
</game>

        <category>Applications</category>
</game>

        <category>Demos</category>
</game>

        <category>Games</category>
        <description>MLB 2002 (USA)</description>
</game>

        <category>Bonus Discs</category>
</game>

        <category>Multimedia</category>
</game>

        <category>Add-Ons</category>
</game>

        <category>Educational</category>
</game>

        <category>Coverdiscs</category>
</game>

        <category>Video</category>
</game>

        <category>Audio</category>
</game>

        <category>Games</category>
</game>

How do I use Get-Content and Select-String to print the following to the terminal, given the file above as input? This is the output I need:

            <category>Games</category>
    </game>
            <category>Games</category>
    </game>

This is the command I'm currently using, but it isn't working:

Get-Content '.\test.dat' | Select-String -pattern '(^\s+<category>Games<\/category>\n^\s+<\/game>$)'

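First, you need to read the whole file in as a single string. By default Get-Content emits one string per line, so Select-String tests each line on its own and a pattern can never match across a line break. The -Raw switch returns the file as one string: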

Get-Content '.\test.dat' -Raw
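
To see the difference, here is a minimal sketch that uses a small in-memory sample (hypothetical, standing in for test.dat) and a simplified pattern. It compares piping the lines one at a time with piping a single joined string:

```powershell
# Hypothetical two-line sample; joined with an explicit LF so the result
# does not depend on how this script was saved.
$lines = @(
    '        <category>Games</category>',
    '    </game>'
)
$text = $lines -join "`n"

# Simplified multi-line pattern: a Games category followed by </game> on the next line.
$pattern = '<category>Games</category>\r?\n\s*</game>'

# Line by line: each string is tested on its own, so the line break in the
# pattern can never be satisfied.
$perLine = @($lines | Select-String -Pattern $pattern)

# Single string: the pattern can span the line break.
$whole = @($text | Select-String -Pattern $pattern)

"Per-line matches:   $($perLine.Count)"   # 0
"Whole-text matches: $($whole.Count)"     # 1
```

Piping `$lines` behaves like Get-Content without -Raw, while `$text` behaves like Get-Content with -Raw.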

Since it seems you want to exclude the entry that also has a <description> element, you can use this pattern. It matches only entries where the category line ends without trailing whitespace and </game> follows directly on the next line:

'(?s)\s+<category>Games\S+\r?\n</game>'

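Select-String returns a MatchInfo object, and the matched text is in the Value property of each item in its Matches property. The -AllMatches switch is needed so that every match in the string is returned, not just the first. You can extract the values in a few different ways.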

Get-Content '.\test.dat' -Raw |
    Select-String '(?s)\s+<category>Games\S+\r?\n</game>' -AllMatches |
        ForEach-Object Matches | ForEach-Object Value

or

$output = Get-Content '.\test.dat' -Raw |
    Select-String '(?s)\s+<category>Games\S+\r?\n</game>' -AllMatches

$output.Matches.Value

or

(Get-Content '.\test.dat' -Raw |
    Select-String '(?s)\s+<category>Games\S+\r?\n</game>' -AllMatches).Matches.Value

Output

        <category>Games</category>
</game>


        <category>Games</category>
</game>
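
The blank lines in the output above appear because the leading \s+ also matches the line break in front of each entry. If you want only the indentation, one option is to anchor at the start of the line and match spaces and tabs only. This is a sketch under assumptions: the sample text is hypothetical, and it assumes </game> may be indented, as in the desired output in the question.

```powershell
# Hypothetical sample standing in for test.dat, joined with explicit LF line endings.
$text = @(
    '        <category>Games</category>',
    '    </game>',
    '        <category>Games</category>',
    '        <description>MLB 2002 (USA)</description>',
    '    </game>',
    '        <category>Games</category>',
    '    </game>'
) -join "`n"

# (?m) lets ^ match at the start of every line. [ \t]* matches indentation
# but, unlike \s+, never a line break.
$pattern = '(?m)^[ \t]*<category>Games</category>\r?\n[ \t]*</game>'

# Collect the text of every match; the entry with a <description> is skipped.
$values = @(($text | Select-String -Pattern $pattern -AllMatches).Matches.Value)
$values
```

Each value then starts with the indentation of the category line instead of a line break.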

You could also use the [regex] type accelerator:

$str = Get-Content '.\test.dat' -Raw

[regex]::Matches($str,'(?s)\s+<category>Games\S+\r?\n</game>').value

EDIT

Based on your additional info, my understanding is that you want to remove any Games category entries that are otherwise empty. We can simplify this greatly by using a here-string as the pattern:

$pattern = @'
        <category>Games</category>
    </game>

'@

The extra blank line is intentional: it captures the final newline character. You could also write it like this:

$pattern = @'
        <category>Games</category>
    </game>\r?\n
'@
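
One caveat with the here-string form: the line break between the two pattern lines is matched literally, so it has to agree with the line endings in the file (CRLF or LF). A minimal sketch, assuming you want to keep the here-string, rewrites each literal line break in the pattern to the regex text \r?\n. The variable names and sample strings are hypothetical.

```powershell
$pattern = @'
        <category>Games</category>
    </game>
'@

# Turn every literal line break in the pattern into the regex text \r?\n.
# In the replacement operand, backslashes are plain characters, so '\r?\n'
# is inserted as-is and then interpreted by the regex engine when matching.
$portable = $pattern -replace '\r?\n', '\r?\n'

# Hypothetical samples with Windows and Unix line endings.
$crlfText = "        <category>Games</category>`r`n    </game>`r`n"
$lfText   = "        <category>Games</category>`n    </game>`n"

$crlfText -match $portable   # True
$lfText   -match $portable   # True
```

With this normalization the same pattern matches regardless of how the script or the data file stores its line breaks.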

Now if we do a replace with that pattern, you should see what I believe is your expected final result. Omitting the replacement string makes -replace delete each match:

(Get-Content $inputfile -Raw) -replace $pattern

And to finish it off, you can put the above command inside a Set-Content command. Because the Get-Content command is enclosed in parentheses, the file is read completely into memory before it is written to.

Set-Content -Path $inputfile -Value ((Get-Content $inputfile -Raw) -replace $pattern)

EDIT 2

Well, it seems to work in the ISE but not in the PowerShell console. In case you run into the same thing, try this:

$pattern = '(?s)\s+<category>Games</category>\r?\n\s+</game>'

Set-Content -Path $inputfile -Value ((Get-Content $inputfile -Raw) -replace $pattern)
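
To try this without touching the real file, here is a minimal sketch that runs the same replacement against a scratch file. The temporary file and its sample content are hypothetical, and the sample assumes an indented </game>, as in the desired output in the question.

```powershell
# Hypothetical scratch file so the real test.dat is left alone.
$inputfile = [System.IO.Path]::GetTempFileName()

# Sample content with Windows (CRLF) line endings.
$sample = @(
    '        <category>Demos</category>',
    '    </game>',
    '        <category>Games</category>',
    '    </game>',
    '        <category>Games</category>',
    '        <description>MLB 2002 (USA)</description>',
    '    </game>'
) -join "`r`n"
[System.IO.File]::WriteAllText($inputfile, $sample)

$pattern = '(?s)\s+<category>Games</category>\r?\n\s+</game>'

# The inner parentheses make Get-Content finish reading before Set-Content
# opens the same file for writing.
Set-Content -Path $inputfile -Value ((Get-Content $inputfile -Raw) -replace $pattern)

# Read the result back, show it, and clean up.
$result = Get-Content $inputfile -Raw
$result
Remove-Item $inputfile
```

Only the Games entry without a description is removed; the Demos entry and the Games entry with a description remain.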
