Regular expression to read tags in HTML
<td width="100%"><h1>Chicago, IL Weather</h1></td>
I want to get the text inside the h1 tag, and I want to do this with a regular expression in C#. Can anybody tell me the solution?
// Match the first h1 element; [\u0000-\uFFFF] also matches newlines.
System.Text.RegularExpressions.Regex bodyRegex = new System.Text.RegularExpressions.Regex(@"<h1[^>]*>([\u0000-\uFFFF]+?)</h1>");
System.Text.RegularExpressions.Match bodyMatch = bodyRegex.Match(line);
if (bodyMatch.Success)
{
    // Group 1 holds the text between the tags, so this also works
    // when the opening tag carries attributes, e.g. <h1 class="title">.
    FileContent = bodyMatch.Groups[1].Value;
}
With this you can get the value of the first h1 tag.
Give this a shot:
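// The named group TagText captures everything between the opening and closing h1 tags.
String h1Regex = "<h1[^>]*?>(?<TagText>.*?)</h1>";
// Singleline lets '.' match newlines, so multi-line headings are found too.
MatchCollection mc = Regex.Matches(Data, h1Regex, RegexOptions.Singleline);
foreach (Match m in mc) {
    Console.WriteLine(m.Groups["TagText"].Value);
}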
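For anyone who wants to try this end to end, here is a minimal, self-contained sketch built around the same pattern and run against the markup from the question. The class name H1Extractor and the method ExtractH1Texts are made up for illustration, and the IgnoreCase option and the Trim() call are my own additions, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this example.
public static class H1Extractor
{
    // Same pattern as in the answer: a lazy match between <h1 ...> and </h1>.
    // IgnoreCase is an added assumption so that <H1> is matched as well.
    private static readonly Regex H1Pattern = new Regex(
        @"<h1[^>]*?>(?<TagText>.*?)</h1>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Returns the inner text of every h1 element found, in document order.
    public static List<string> ExtractH1Texts(string html)
    {
        var results = new List<string>();
        foreach (Match m in H1Pattern.Matches(html))
        {
            results.Add(m.Groups["TagText"].Value.Trim());
        }
        return results;
    }

    public static void Main()
    {
        // The markup from the question.
        string html = "<td width=\"100%\"><h1>Chicago, IL Weather</h1></td>";
        foreach (string text in ExtractH1Texts(html))
        {
            Console.WriteLine(text); // prints: Chicago, IL Weather
        }
    }
}
```

Returning a list rather than printing inside the loop keeps the matching separate from the output, which makes it easier to check the result or reuse it elsewhere.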
Why do you want to use regex? I know it is the fastest way, but it has disadvantages too, such as: 1. It hurts code readability.
Unless you absolutely have to, drop the regex and use an HTML parser (like the above-mentioned HtmlAgilityPack).
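To show what the parser route might look like, here is a rough sketch. It assumes the third-party HtmlAgilityPack NuGet package is installed; the class name H1Parser and the method GetFirstH1Text are made up for illustration.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

// Hypothetical helper class, named only for this example.
public static class H1Parser
{
    // Returns the text of the first h1 element, or null when there is none.
    public static string GetFirstH1Text(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // XPath query: the first h1 anywhere in the document.
        HtmlNode h1 = doc.DocumentNode.SelectSingleNode("//h1");

        // DeEntitize turns entities such as &amp; back into plain characters.
        return h1 == null ? null : HtmlEntity.DeEntitize(h1.InnerText).Trim();
    }

    public static void Main()
    {
        // The markup from the question.
        string html = "<td width=\"100%\"><h1>Chicago, IL Weather</h1></td>";
        Console.WriteLine(GetFirstH1Text(html));
    }
}
```

Unlike the regex, this keeps working when the heading contains nested tags or when attributes on the h1 include a ">" character.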