
Regex match for tags within tags and last matching tag

I am trying to parse some XML tags whose data contains escaped strings. Some samples:

other tags with or without newlines
<tag name="abc1" type="bcd" value="test"><tag name="abc2" type="bcd" value="test">  
other tags other tags with or without newlines
<tag name="abc2" type="bcd" value="<w:test xmlns:wst=&quot;http://schemas.xmlsoap.org/ws/2005/02/trust&quot;><a xmlns:&quot;a:b:c:ddd:&quot;>XEduAjr8MoV</a></w:test>">

Basically, I need to find the values in tags that are embedded within other strings. Something like this:

<tag name="wwww" type="wwww" value="SOME HTML ESCAPED STRING WITH NEWLINES">

Here is what I have:

<tag name="(?<name>\w*)" type="(?<id>\w*)" value="(?<value>.*)">

I am using this C# code:

var regex = new Regex(regstr, RegexOptions.Multiline);
MatchCollection mc = regex.Matches(sourcestring);

I am running into a problem where multiple matches get combined into one because of (?<value>.*) when both tags are on the same line: <tag name="abc1" type="bcd" value="test"><tag name="abc2" type="bcd" value="test">. Is there any way to get around this? Is there a better way?

It's not advisable to parse XML files with regex patterns. One reason is that XML involves/requires deep nesting.

It's well known that you should not use regex to parse XHTML unless the markup is simple, with no complex tags and no unusual characters.

However, if you want to use regex for your specific example, you have to use non-greedy (or lazy) quantifiers:

<tag name="(?<name>\w*?)" type="(?<id>\w*?)" value="(?<value>.*?)">
                                                       HERE ---^
also I put it here ---^------------------^ 
since it is more secure, but it is not needed
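
To make the difference concrete, here is a minimal C# sketch that runs the greedy pattern from the question and the lazy pattern side by side on the same-line sample. The class and method names (LazyQuantifierDemo, Find) are made up for illustration, and RegexOptions.Singleline is an assumption on my part: the question uses Multiline, but Singleline is the option that lets . match newlines inside a value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyQuantifierDemo
{
    // Greedy pattern from the question: .* runs on to the LAST "> it can reach.
    public const string Greedy =
        @"<tag name=""(?<name>\w*)"" type=""(?<id>\w*)"" value=""(?<value>.*)"">";

    // Lazy pattern: .*? stops at the FIRST "> that completes a match.
    public const string Lazy =
        @"<tag name=""(?<name>\w*)"" type=""(?<id>\w*)"" value=""(?<value>.*?)"">";

    // Hypothetical helper; Singleline (an assumption) lets '.' match '\n'.
    public static MatchCollection Find(string pattern, string input)
    {
        return Regex.Matches(input, pattern, RegexOptions.Singleline);
    }

    public static void Main()
    {
        // Two tags on one line, as in the question.
        string source =
            "<tag name=\"abc1\" type=\"bcd\" value=\"test\">" +
            "<tag name=\"abc2\" type=\"bcd\" value=\"test\">";

        Console.WriteLine(Find(Greedy, source).Count); // 1 (both tags merged)
        Console.WriteLine(Find(Lazy, source).Count);   // 2

        foreach (Match m in Find(Lazy, source))
        {
            Console.WriteLine(m.Groups["name"].Value + " = " + m.Groups["value"].Value);
        }
    }
}
```

With the greedy pattern the single match's value group swallows everything up to the final ">, which is the merged result described in the question.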

Working demo
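
A related variant, not part of the answer above, is to replace the dot with a negated character class, [^"]*. Since the inner quotes in the samples are escaped as &quot;, the value can never contain a literal ", so the class stops at the closing quote and also matches newlines without any extra option. The sketch below assumes that escaping holds for all input; TagValueExtractor and Extract are hypothetical names, and WebUtility.HtmlDecode is used to turn the escaped value back into plain text.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class TagValueExtractor
{
    // [^"]* cannot run past the closing quote and matches '\n' as well,
    // so neither a lazy quantifier nor RegexOptions.Singleline is needed.
    // Assumption: quotes inside a value are always escaped as &quot;.
    private static readonly Regex TagPattern = new Regex(
        @"<tag name=""(?<name>\w*)"" type=""(?<id>\w*)"" value=""(?<value>[^""]*)"">");

    // Hypothetical helper: returns (name, decoded value) pairs in document order.
    public static List<KeyValuePair<string, string>> Extract(string input)
    {
        var results = new List<KeyValuePair<string, string>>();
        foreach (Match m in TagPattern.Matches(input))
        {
            // HtmlDecode converts entities such as &quot; back to characters.
            string decoded = WebUtility.HtmlDecode(m.Groups["value"].Value);
            results.Add(new KeyValuePair<string, string>(m.Groups["name"].Value, decoded));
        }
        return results;
    }

    public static void Main()
    {
        // First value spans two lines; second contains escaped markup.
        string source =
            "<tag name=\"abc1\" type=\"bcd\" value=\"line one\nline two\">" +
            "<tag name=\"abc2\" type=\"bcd\" value=\"<a x=&quot;y&quot;>text</a>\">";

        foreach (var pair in Extract(source))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

If a value could ever contain an unescaped ", this pattern would cut it short, so it is only a reasonable choice when the escaping is guaranteed.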
