Java RegExp: finding the correct regular expression

Question

I am struggling to find the correct regular expression for extracting strings according to the following criteria:

I have an XML fragment with multiple tags. Each element starts with <ABC_xxxx> and ends with </ABC_xxxx>.

The xxxx part changes for each element. For example:

 <ABC_A1S1>1234</ABC_A1S1>
 <ABC_uw3ey>1234</ABC_uw3ey>
 <ABC_PD4frfr5>1234</ABC_PD4frfr5>

etc.

The number of x characters is not fixed!

I want to extract each element, including the tags themselves.

How can I do that?

Thanks.

Answer 1

Assuming that no such elements are nested inside each other, try this:

\<ABC(\w+)\>[^\<]+\<\/ABC(\1)\>

Explanation:

\<ABC(\w+)\> is the opening tag, which starts with ABC. The characters after ABC are captured in a group (hence the parentheses) because we need them later. Note that \w matches letters, digits and the underscore, so for the first example the group captures _A1S1.
[^\<]+ is the body of the element: one or more characters, none of which is an opening angle bracket.
\<\/ABC(\1)\> is the closing tag, which starts with ABC and must continue with exactly the characters that followed ABC in the opening tag. \1 is a backreference to the first captured group.
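
As a minimal sketch of how the pattern above might be used from Java (the class name AbcElementExtractor and the method name extract are made up for illustration), each regex backslash is doubled inside the string literal and Matcher.find() walks the input one element at a time. In Java's regex flavour the backslashes before <, > and / are optional; they are kept here only to mirror the pattern as written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
public class AbcElementExtractor {

    // The pattern from this answer as a Java string literal:
    // every regex backslash has to be written as "\\".
    private static final Pattern ELEMENT =
            Pattern.compile("\\<ABC(\\w+)\\>[^\\<]+\\<\\/ABC(\\1)\\>");

    /** Returns every matching element, tags included, in order of appearance. */
    public static List<String> extract(String xml) {
        List<String> elements = new ArrayList<>();
        Matcher matcher = ELEMENT.matcher(xml);
        while (matcher.find()) {
            // group() with no argument is the whole match, i.e. the
            // opening tag, the body and the closing tag together.
            elements.add(matcher.group());
        }
        return elements;
    }

    public static void main(String[] args) {
        String xml = "<ABC_A1S1>1234</ABC_A1S1>\n"
                   + "<ABC_uw3ey>1234</ABC_uw3ey>\n"
                   + "<ABC_PD4frfr5>1234</ABC_PD4frfr5>";
        for (String element : extract(xml)) {
            System.out.println(element);
        }
    }
}
```

Because of the backreference, an element whose closing tag name differs from its opening tag name is skipped rather than matched.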

Important note: XML is not a regular language, so regular expressions cannot parse it in the general case. E.g., imagine two or more such elements nested inside each other. Use an XML parser to parse XML.
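
For comparison, here is a rough sketch of the parser-based route using the DOM and Transformer APIs that ship with the JDK. The class name AbcElementParser, the method name extract and the wrapping <root> element are assumptions made for this example: a fragment with several top-level elements is not a well-formed document on its own, so it is wrapped before parsing.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
public class AbcElementParser {

    /** Returns each top-level ABC_ element of the fragment, serialized with its tags. */
    public static List<String> extract(String fragment) throws Exception {
        // Assumption: the fragment has no single root, so add one.
        String wrapped = "<root>" + fragment + "</root>";

        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(wrapped)));

        // An identity transformer turns a DOM node back into text.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> elements = new ArrayList<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes and unrelated elements.
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && child.getNodeName().startsWith("ABC_")) {
                StringWriter out = new StringWriter();
                serializer.transform(new DOMSource(child), new StreamResult(out));
                elements.add(out.toString());
            }
        }
        return elements;
    }

    public static void main(String[] args) throws Exception {
        String fragment = "<ABC_A1S1>1234</ABC_A1S1>\n"
                        + "<ABC_uw3ey>1234</ABC_uw3ey>";
        for (String element : extract(fragment)) {
            System.out.println(element);
        }
    }
}
```

Unlike the regular expression, this handles nested elements, since the parser tracks the element structure itself. The serialized text may differ from the source in details such as entity or whitespace representation.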

Answer 2

Try this:

<ABC_([^>]*)>([^<]*)<\/ABC_([^>]*)>
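
A small sketch of using this second pattern from Java follows; the class name AbcPairMatcher and the method name extract are invented for the example. The pattern captures the opening and closing names in separate groups (1 and 3) without tying them together, so on its own it would also accept a pair such as <ABC_x>2</ABC_y>. The sketch therefore compares the two groups explicitly, which is an addition on my part and not something the answer states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
public class AbcPairMatcher {

    // The pattern from this answer; the regex backslash before '/'
    // is doubled because this is a Java string literal.
    private static final Pattern ELEMENT =
            Pattern.compile("<ABC_([^>]*)>([^<]*)<\\/ABC_([^>]*)>");

    /** Returns matched elements whose opening and closing names agree. */
    public static List<String> extract(String xml) {
        List<String> elements = new ArrayList<>();
        Matcher matcher = ELEMENT.matcher(xml);
        while (matcher.find()) {
            String openingName = matcher.group(1);
            String closingName = matcher.group(3);
            // Added check: keep only properly paired tags.
            if (openingName.equals(closingName)) {
                elements.add(matcher.group());
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        String xml = "<ABC_A1S1>1234</ABC_A1S1><ABC_uw3ey>1234</ABC_uw3ey>";
        for (String element : extract(xml)) {
            System.out.println(element);
        }
    }
}
```

Here group 2 holds the element body, and because it uses * rather than +, empty elements are matched as well.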

2 answers

Answer 1
1 2016-09-28 10:56:33

Answer 2
0 2016-09-28 10:56:36
