Get anchor tag HREF and VALUE
I have a string that looks like this:
<a href="http://forum.tibia.com/forum/?action=board&boardid=476">Amera</a><br><font class="ff_info">This board is for general discussions related to the game world Amera.</font>
How can I ignore/remove everything after the </a> and then get only the URL, http://forum.tibia.com/forum/?action=board&boardid=476, and the value, Amera?
So afterwards, I want two variables with their values, like:
string url = "http://forum.tibia.com/forum/?action=board&boardid=476";
and
string value = "Amera";
I tried this to get the value:
string value = System.Text.RegularExpressions.Regex.Replace(MYSTRING, "(<[a|A][^>]*>|)", "");
But it returns:
Amera</a><br><font class="ff_info">This board is for general discussions related to the game world Amera.</font>
For getting the URL, maybe try this regex pattern: /href=\\"(.*)\\"/
...And to get the value between > and </a> (here, Amera), use a pattern like: >(.+?)</a>
...although this seems far from perfect...
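The two patterns above are written with / delimiters, which C# does not use. A minimal sketch of how they might be applied with System.Text.RegularExpressions follows. The class and method names (AnchorPatterns, GetHref, GetText) are hypothetical. The sketch also assumes a lazy (.*?) in place of the greedy (.*), because on the string from the question the greedy form would run on to the last quotation mark (the one after ff_info).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative, not from the answer.
public static class AnchorPatterns
{
    // Lazy (.*?) stops at the first closing quote after href=".
    public static string GetHref(string html) =>
        Regex.Match(html, "href=\"(.*?)\"").Groups[1].Value;

    // Captures the text between the first '>' and the first "</a>".
    public static string GetText(string html) =>
        Regex.Match(html, ">(.+?)</a>").Groups[1].Value;
}
```

Both methods return an empty string when nothing matches, since Groups[1].Value is empty for a failed match. Neither pattern checks that the text it finds belongs to an a tag, which is one reason this approach is far from perfect.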
If the a tag won't contain any other attributes, you can use just this for the URL only:
\bhref="(.*?)"
And a slightly more complex one for both URL and text:
<a\b[^>]*?\bhref="([^"]*?)"[^>]*?>(.*?)<\/a>
So in C# code (quotation marks need to be escaped!):
var html = "<a href=\"http://forum.tibia.com/forum/?action=board&boardid=476\">Amera</a><br><font class=\"ff_info\">This board is for general discussions related to the game world Amera.</font>";
var match = Regex.Match(html, "<a\\b[^>]*?\\bhref=\"([^\"]*?)\"[^>]*?>(.*?)<\\/a>", RegexOptions.IgnoreCase);
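if (match.Success) {
    // Groups[n] is a Group object; .Value gives the captured string.
    var url = match.Groups[1].Value;  // http://forum.tibia.com/forum/?action=board&boardid=476
    var text = match.Groups[2].Value; // Amera
}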
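Since the escaping is easy to get wrong, here is a sketch of the same pattern written as a C# verbatim string (@"..."), where backslashes stay single and each quotation mark is doubled. The AnchorParser class, the TryParse method and the group names url and text are hypothetical additions for illustration. The \/ escape is dropped because .NET does not need it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the pattern from the answer above.
public static class AnchorParser
{
    // Verbatim string: "" is one literal quote, \b stays a single backslash.
    private static readonly Regex Anchor = new Regex(
        @"<a\b[^>]*?\bhref=""(?<url>[^""]*?)""[^>]*?>(?<text>.*?)</a>",
        RegexOptions.IgnoreCase);

    // Returns true and fills url/text for the first anchor found;
    // on failure both outputs are empty strings.
    public static bool TryParse(string html, out string url, out string text)
    {
        var m = Anchor.Match(html);
        url = m.Groups["url"].Value;
        text = m.Groups["text"].Value;
        return m.Success;
    }
}
```

Named groups avoid relying on the numeric order of the captures if the pattern is edited later.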
Try this (using HtmlAgilityPack):
var dc = new HtmlAgilityPack.HtmlDocument();
dc.LoadHtml("<a href='http://forum.tibia.com/forum/?action=board&boardid=476'>Amera</a><br><font class='ff_info'>This board is for general discussions related to the game world Amera.</font>");
// "//a[@href]" finds anchors with an href anywhere in the document.
// Note: SelectNodes returns null (not an empty list) when nothing matches.
foreach (HtmlAgilityPack.HtmlNode link in dc.DocumentNode.SelectNodes("//a[@href]"))
{
    string url = link.Attributes["href"].Value; // http://forum.tibia.com/forum/?action=board&boardid=476
    string value = link.InnerText; // Amera
}