
Regular expression to find whether the string contains the given xml element

I have the following XML:

<person>
    <id>1</id>
    <name>John</name>
    <phone>235 234</phone>
    <address>
        <street>1</street>
        <city>A</city>
        <state>B</state>
        <country>C</country>
    </address>
</person>

I converted this XML to a string. The XML is dynamic: some documents have all of these elements, some lack a given element, and some have additional elements.

Given the XML string, I want a regular expression that tells me whether a given element (supplied as input) is present in the string.

How do I write a regular expression for this?

All the commenters are right. There are better methods than a regular expression search for finding out whether an XML element contains a specified element or its tag.
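As a rough illustration of one such method, here is a minimal sketch that uses an XML parser instead of a regular expression. It assumes Python and its standard xml.etree.ElementTree module, since the question does not name a language; the helper name has_element is hypothetical.

```python
import xml.etree.ElementTree as ET

def has_element(xml_text: str, tag: str) -> bool:
    """Return True if any descendant of the root element has the given tag."""
    root = ET.fromstring(xml_text)
    # ".//tag" searches all levels below the root element.
    return root.find(".//" + tag) is not None

sample = """<person>
    <id>1</id>
    <name>John</name>
    <address>
        <city>A</city>
    </address>
</person>"""

print(has_element(sample, "city"))
print(has_element(sample, "email"))
```

Because the parser understands nesting and CDATA, this check does not depend on how the XML text happens to be laid out.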

But if you really want to do this task with a regular expression search, the following works for your example:

<person>(?:(?!</person>)[\S\s])+<XXX\b(?:(?!</person>)[\S\s])+</person>

This expression matches everything from the start tag <person> to the end tag </person> if that span contains <XXX, where XXX is the element to find within the person element.
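To take the element name as input, XXX can be substituted into the expression at run time. The following is only a sketch, assuming Python's re module; the function name person_contains and the sample data are hypothetical, and the name is passed through re.escape in case it contains characters that are special in a regular expression.

```python
import re

def person_contains(xml_text: str, element: str) -> bool:
    """Apply the expression above with XXX replaced by the given element name."""
    name = re.escape(element)
    pattern = (
        r"<person>(?:(?!</person>)[\S\s])+<"
        + name
        + r"\b(?:(?!</person>)[\S\s])+</person>"
    )
    return re.search(pattern, xml_text) is not None

SAMPLE = """<person>
    <id>1</id>
    <name>John</name>
    <phone>235 234</phone>
    <address>
        <street>1</street>
    </address>
</person>"""

print(person_contains(SAMPLE, "phone"))
print(person_contains(SAMPLE, "email"))
```

The first + quantifier requires at least one character between <person> and the searched start tag, so the expression relies on the indented layout shown in the question; an element that follows <person> with nothing in between is not found.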

Note: This regular expression works only if the person element does not itself contain another person element and there is no CDATA section containing </person>, <person or <XXX.
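The CDATA restriction can be seen with a small experiment. This sketch assumes Python and a made-up input in which the text <email> appears only inside a CDATA section; it compares the expression with a parser-based check from the standard xml.etree.ElementTree module.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input: "<email>" is character data here, not an element.
xml_text = "<person><note><![CDATA[<email>]]></note></person>"

pattern = (
    r"<person>(?:(?!</person>)[\S\s])+<email\b"
    r"(?:(?!</person>)[\S\s])+</person>"
)
regex_says = re.search(pattern, xml_text) is not None

# The parser treats the CDATA content as plain text of <note>.
root = ET.fromstring(xml_text)
parser_says = root.find(".//email") is not None

print("regex:", regex_says, "parser:", parser_says)
```

The expression reports a match because it only sees the characters <email, while the parser reports that no email element exists.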

The expression only checks that the start tag of element XXX is found and does not check for the end tag, because the question does not make clear whether every element must be present with a start and an end tag or whether some could also be empty elements of the form <XXX />.

For an explanation of this regular expression, read my answer on Deleting duplicate values using find and replace in a text editor.
