How to extract a URL from an HTML string using a regular expression

Question

I need to extract the URL from the following string:

<p> Feb 24 - <a href="http://austin.daylife.org/apa/2867907745.html">$390 / 2br - 600ft&sup2; - Sleeps 4-Walk to SXSW-SOCO-Perfect Location</a> - <font size="-1"> (South 5th)</font> <span class="p"> pic</span></p>

How can I do this with a regular expression in C#?

Answer 1

Use the following regular expression:

http(s)?://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?

Edit: a simpler expression:

http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?
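
As a rough illustration of how the simpler expression could be applied in C#, here is a minimal sketch. The pattern is written with the host dot escaped as `\.` and a `*` after the path character class, which is my assumption about the intended form. The `UrlExtractor` class and `FirstUrl` method are hypothetical names introduced only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlExtractor
{
    // Scheme, one or more dot-separated host labels, then an optional path.
    // Assumption: dot escaped and path class repeated with '*'.
    private static readonly Regex UrlPattern =
        new Regex(@"http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?");

    // Returns the first URL found in the input, or null when none is found.
    public static string FirstUrl(string html)
    {
        Match m = UrlPattern.Match(html);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string html = "<p> Feb 24 - <a href=\"http://austin.daylife.org/apa/2867907745.html\">$390 / 2br</a></p>";
        Console.WriteLine(FirstUrl(html));
        // prints "http://austin.daylife.org/apa/2867907745.html"
    }
}
```

Note that the path character class contains a space, so in plain text (where a URL is not terminated by a quote) the match may run on past the end of the URL.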

Answer 2

This works for me:

        // Requires: using System.Text.RegularExpressions;
        string source = " <p> Feb 24 - <a href=\"http://austin.daylife.org/apa/2867907745.html\">$390 / 2br - 600ft&sup2; - Sleeps 4-Walk to SXSW-SOCO-Perfect Location</a> - <font size=\"-1\"> (South 5th)</font> <span class=\"p\"> pic</span></p> ";
        // Capture the href value as "url" and the link text as "text".
        Regex regex = new Regex("<a[^>]*? href=\"(?<url>[^\"]+)\"[^>]*?>(?<text>.*?)</a>");
        var m = regex.Match(source);
        // Groups[...] returns a Group, so read .Value to get the string.
        string url = m.Groups["url"].Value;
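
If the HTML contains more than one link, the same named-group pattern can probably be used with `Regex.Matches` to collect them all. The sketch below is only an illustration: the `LinkExtractor` class and `ExtractLinks` method are hypothetical names, and the `IgnoreCase` and `Singleline` options are my additions, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Same pattern as in the answer; the options are an assumption added here
    // so that <A HREF=...> and link text spanning several lines also match.
    private static readonly Regex AnchorPattern = new Regex(
        "<a[^>]*? href=\"(?<url>[^\"]+)\"[^>]*?>(?<text>.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns one (url, text) pair per anchor tag found in the input.
    public static List<KeyValuePair<string, string>> ExtractLinks(string html)
    {
        var links = new List<KeyValuePair<string, string>>();
        foreach (Match m in AnchorPattern.Matches(html))
        {
            links.Add(new KeyValuePair<string, string>(
                m.Groups["url"].Value, m.Groups["text"].Value));
        }
        return links;
    }

    public static void Main()
    {
        string html = "<a href=\"http://a.example/1\">one</a> and " +
                      "<a class=\"x\" href=\"http://b.example/2\">two</a>";
        foreach (var link in ExtractLinks(html))
        {
            Console.WriteLine(link.Key + " -> " + link.Value);
        }
    }
}
```

This only recognizes `href` values in double quotes, so single-quoted or unquoted attributes would be skipped.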

Solution 1
1 Accepted 2012-02-24 10:36:49

Solution 2
1 2012-02-24 10:55:26
