Regex: find XML node which has certain subnodes (in Sublime Text)

I am looking for a regular expression (for Sublime Text) that selects XML elements which have a certain sub-element. I can select all the elements with this:

(?s)<wp:comment>.+?</wp:comment>

This works perfectly, but I want to find only the blocks which contain:

<wp:comment_approved>0</wp:comment_approved> 

And not the ones which contain:

<wp:comment_approved>1</wp:comment_approved>

So I need a lookaround (lookahead or lookbehind) or a conditional expression, but I can't get it right. When I try:

(?s)<wp:comment>.+?comment_approved>1.+?</wp:comment>   

It selects more elements in a single match than it should.

It seems very simple, but I can't find the right answer anywhere.
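The over-selection described above can be reproduced outside the editor. The following is only a sketch: it assumes Python's `re` module behaves like Sublime Text's engine for this pattern, and `SAMPLE` is a made-up two-comment document, not the asker's real export.

```python
import re

# Hypothetical sample: an unapproved comment followed by an approved one.
SAMPLE = """<wp:comment>
    <wp:comment_approved>0</wp:comment_approved>
</wp:comment>
<wp:comment>
    <wp:comment_approved>1</wp:comment_approved>
</wp:comment>"""

# The attempted pattern from the question, unchanged.
NAIVE = re.compile(r"(?s)<wp:comment>.+?comment_approved>1.+?</wp:comment>")

match = NAIVE.search(SAMPLE)

# The lazy .+? is still allowed to run across </wp:comment> and the next
# <wp:comment>, so one match starts in the first comment and ends in the second.
print(match.group(0).count("<wp:comment>"))
```

Lazy quantifiers only prefer the shortest match from a given starting point. They do not stop the match from crossing an element boundary, which is why both comments end up in one selection.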

I suppose this would work:

(?s)<wp:comment>(?:(?!<wp:comment>).)+?+<wp:comment_approved>0.+?+</wp:comment>

Note the possessive matching ( .+?+ ), meant to avoid unnecessary backtracking. (The documented possessive form is ++ ; whether +?+ is accepted depends on the regex engine.)

OK, here is the answer to the problem, explained. The goal is to find only the comments which are not approved.

<xml>
    <node>bla</node>
    <wp:comment>
        <node>bla</node>
        <node>bli</node>
        <wp:comment_approved>1</wp:comment_approved>
        <node></node>
        <node></node>
    </wp:comment>
    <wp:comment>
        <node>ble</node>
        <node>blu</node>
        <wp:comment_approved>0</wp:comment_approved>
        <node></node>
        <node></node>
    </wp:comment>
</xml>

This is the regex to use in Sublime Text's Find on that XML:

(?s)<wp:comment>(?:(?!<wp:comment>).)+?<wp:comment_approved>0.+?</wp:comment>

(?s)           -> dot also matches newlines ("single line" mode)
<wp:comment>   -> find the opening tag
(?: ... )      -> group but do not capture the submatch
(?! ... )      -> negative lookahead
<wp:comment>(?:(?!<wp:comment>).)+?
               -> find <wp:comment> plus everything after it,
                  but never step past the start of a new <wp:comment>.
                  This prevents selecting two or more comments
                  in one match.
<wp:comment_approved>0.+?</wp:comment>
               -> then find '<wp:comment_approved>0'
                  plus everything after it, up to the first </wp:comment>.

So:

first find the start (main pattern), then

find everything, but not a new start

find the sub-pattern

find the rest

find the end (main pattern)
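The five steps can be checked against the sample document with a short script. This is a sketch under assumptions: it uses Python's `re` module rather than Sublime Text's own engine, the names `XML` and `UNAPPROVED` are made up for illustration, and `re.DOTALL` stands in for the inline `(?s)`.

```python
import re

# Sample document from the answer above.
XML = """<xml>
    <node>bla</node>
    <wp:comment>
        <node>bla</node>
        <node>bli</node>
        <wp:comment_approved>1</wp:comment_approved>
        <node></node>
        <node></node>
    </wp:comment>
    <wp:comment>
        <node>ble</node>
        <node>blu</node>
        <wp:comment_approved>0</wp:comment_approved>
        <node></node>
        <node></node>
    </wp:comment>
</xml>"""

UNAPPROVED = re.compile(
    r"""
    <wp:comment>                # 1. start of a comment (main pattern)
    (?:(?!<wp:comment>).)+?     # 2. everything, but never a new start
    <wp:comment_approved>0      # 3. the sub-pattern: not approved
    .+?                         # 4. the rest of the comment
    </wp:comment>               # 5. end of the comment (main pattern)
    """,
    re.DOTALL | re.VERBOSE,     # DOTALL plays the role of (?s)
)

matches = UNAPPROVED.findall(XML)
print(len(matches))   # number of unapproved comments found
print(matches[0])     # the full text of the first one
```

On this sample the approved comment is skipped because step 2 cannot run past the second `<wp:comment>`, so the attempt that starts at the first comment fails and the search resumes at the second one.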
