Extracting values using regex
I want to extract the value "64,111" from this piece of text (HTML markup).
<tr>
<th id="abc-xyz">Page <span class="sub">avg</span></th>
<td headers="abc-xyz">
10th Aug, 2011 </td>
<td headers="abc-xyz">64,111</td>
</tr>
I am currently using this regex:
Match m2 = Regex.Match(text, @"\<td headers=""abc-xyz""\>(.*?)\</td\>", RegexOptions.IgnoreCase);
But I get no results. Please tell me what I am doing wrong.
Escape the double quotes with a backslash (\"). In a regular (non-verbatim) string literal, the backslash of \s has to be doubled as well:
Match m2 = Regex.Match(text, "(?<=<td\\sheaders=\"abc-xyz\">).*(?=</td>)",
    RegexOptions.IgnoreCase);
Instead of ".", use a character class that excludes the stop character. That is, instead of ">(.*)<" you want ">([^<]*)<".
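To make the difference concrete, here is a minimal sketch comparing the greedy dot with the negated character class. It assumes the two cells from the question sit on a single line (an assumption chosen to expose the greedy behaviour) and C# 9 or later for top-level statements; the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// The two cells from the question, placed on one line (an assumption for this sketch).
string text = "<td headers=\"abc-xyz\">10th Aug, 2011 </td><td headers=\"abc-xyz\">64,111</td>";

// Greedy dot: the capture runs on to the LAST "</td>" on the line,
// so it swallows the first closing tag and the second opening tag.
string greedy = Regex.Match(text, "<td headers=\"abc-xyz\">(.*)</td>").Groups[1].Value;

// Negated class: each capture stops at the first '<', giving one match per cell.
MatchCollection cells = Regex.Matches(text, "<td headers=\"abc-xyz\">([^<]*)</td>");
string last = cells[cells.Count - 1].Groups[1].Value;

Console.WriteLine(greedy);
Console.WriteLine(last);
```

Unlike ".", the class [^<] also matches line breaks, so the same pattern keeps working when a cell's text spans several lines, as the date cell in the question does.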
I assume you know that this is no substitute for real parsing, which regex can't do, so I won't preach about that. There's already a really funny response somewhere on this site to that effect.
Well, there is more than one way to skin a cat.
Parsing XML is not limited to regex, so here is one way to do it using LINQ to XML.
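// Requires: using System.Linq; using System.Xml.Linq;
// Note: FirstOrDefault() returns the first matching cell in document order.
string found = (from td in XElement.Parse(myxml).Elements("td")
                where td.HasAttributes
                let headers = td.Attribute("headers")
                where headers != null && headers.Value == "abc-xyz" && !td.HasElements
                select td.Value).FirstOrDefault();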
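For reference, a self-contained sketch of the same LINQ to XML approach, run against the row from the question. It assumes the markup is well-formed XML (real-world HTML often is not) and C# 9 or later for top-level statements; the method-syntax query and the `values` name are illustrative choices, not part of the original answer. Because the date cell also carries headers="abc-xyz" and comes first, this sketch collects every matching cell and takes the last one.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// The row from the question; it happens to parse as XML.
string myxml = @"<tr>
<th id=""abc-xyz"">Page <span class=""sub"">avg</span></th>
<td headers=""abc-xyz"">
10th Aug, 2011 </td>
<td headers=""abc-xyz"">64,111</td>
</tr>";

// Every leaf <td> whose headers attribute is "abc-xyz", with whitespace trimmed.
// Casting a missing XAttribute to string yields null, so no explicit null check is needed.
var values = XElement.Parse(myxml)
    .Elements("td")
    .Where(td => (string)td.Attribute("headers") == "abc-xyz" && !td.HasElements)
    .Select(td => td.Value.Trim())
    .ToList();

// The first match is the date cell, so take the last one to get the number.
string found = values.LastOrDefault();
Console.WriteLine(found);
```

If the position of the number cell is not guaranteed, filtering `values` by content (for example, keeping only entries made of digits and commas) may be more robust than relying on order.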