Notepad++ Regex to find group of lines with condition

Question

Given this example text:

<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ABB</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="FM"/>
...lines...
...........
</abr:rules>
<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ADE</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="CM"/>
...lines...
...........
</abr:rules> (end of group)

I would like to find and remove everything from <abr:rules> to </abr:rules> in every group whose subapplication IS NOT "CM". The organization and application are always the same, and <abr:code> can be any string.

What I tried so far is

<abr:rules>\n<abr:ruleTypeDefinition>\n<abr:code>[a-zA-Z0-9]{3,}<\/abr:code>\n<abr:ownership>\n<.*"(FM|PSD|SSC)"\/>\n(?s).*?\n<\/abr:rules>\n

which works but only because I know the other subapplication names.

Is there any way to do it with regex only?

Answer 1

Try the following find and replace:

Find:

<abr:rules>((?!subapplication=).)*subapplication="(?!CM")[^"]+"((?!</abr:rules>).)*</abr:rules>

Replace:

(empty string)


Note: The above pattern only works if the ". matches newline" option is enabled in the Notepad++ Replace dialog. If you don't want to enable it, use [\S\s] in place of each dot.
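To check the pattern outside Notepad++, here is a minimal Python sketch. It assumes Python's re module handles these lookaheads the same way as Notepad++'s regex engine for this input, with re.DOTALL standing in for ". matches newline". The SAMPLE text is a shortened copy of the question's example with the elided lines left out.

```python
import re

# Same pattern as in the answer; re.DOTALL lets "." match newlines.
PATTERN = re.compile(
    r'<abr:rules>((?!subapplication=).)*subapplication="(?!CM")[^"]+"'
    r'((?!</abr:rules>).)*</abr:rules>',
    re.DOTALL,
)

# Shortened version of the example text from the question.
SAMPLE = """<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ABB</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="FM"/>
</abr:rules>
<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ADE</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="CM"/>
</abr:rules>
"""

# Replace every matching group with the empty string.
cleaned = PATTERN.sub("", SAMPLE)
print(cleaned)
```

Only the group whose subapplication is "CM" should remain. The pattern does not consume the line break after </abr:rules>, so an empty line is left where each removed group was.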

Answer 2

You should not use regex for XML; you can read why here: https://stackoverflow.com/a/1732454/3763374

Instead, you can use an XML parser and select the elements with XPath.
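As a rough illustration of the parser route, here is a sketch using Python's standard xml.etree.ElementTree, which supports a limited XPath subset. Several things are assumptions that do not come from the question: the file is well-formed XML, the abr prefix is bound to a namespace (the URI urn:example:abr is made up), the <abr:rules> groups sit directly under a single root element (called abr:root here), and the helper name remove_other_subapplications is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; use the one declared in your real file.
NS = {"abr": "urn:example:abr"}

# Well-formed stand-in for the question's example (wrapper element assumed).
SAMPLE = """<abr:root xmlns:abr="urn:example:abr">
  <abr:rules>
    <abr:ruleTypeDefinition>
      <abr:code>ABB</abr:code>
      <abr:ownership>
        <abr:owner organization="NT" application="DCS" subapplication="FM"/>
      </abr:ownership>
    </abr:ruleTypeDefinition>
  </abr:rules>
  <abr:rules>
    <abr:ruleTypeDefinition>
      <abr:code>ADE</abr:code>
      <abr:ownership>
        <abr:owner organization="NT" application="DCS" subapplication="CM"/>
      </abr:ownership>
    </abr:ruleTypeDefinition>
  </abr:rules>
</abr:root>"""


def remove_other_subapplications(root, keep="CM"):
    """Remove each <abr:rules> child whose owner has a subapplication other than `keep`."""
    # Copy the list first so removing children does not disturb the iteration.
    for rules in list(root.findall("abr:rules", NS)):
        owner = rules.find(".//abr:owner", NS)
        # Groups without an owner element are left untouched.
        if owner is not None and owner.get("subapplication") != keep:
            root.remove(rules)
    return root


root = remove_other_subapplications(ET.fromstring(SAMPLE))
remaining_codes = [code.text for code in root.iter("{urn:example:abr}code")]
print(remaining_codes)
```

Unlike the regex, this does not depend on line breaks, attribute order, or where the subapplication attribute appears within the group.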


Answer 1: 2 ACCEPTED 2018-04-13 15:05:15

Answer 2: 2 2018-04-13 16:08:39
