How to create a regex for retrieving HTML code in Java?
Here is a piece of HTML code:
<a class="context_link" href="/thuc-don/41-Thit-vit-ram-sa-gung.html">
<img src="http://monngonmoingay.com/uploads/monan/201205170430310000000_thit" +
"-vit-ram-sa-gung-48aq570.png" alt="Thịt vịt ram sả gừng " />
I use a regex to get the link from this code:
String pat = "<a\\s+class=\"context_link\"\\s+href=\"(.+)\"";
Pattern pattern = Pattern.compile(pat,Pattern.DOTALL | Pattern.UNIX_LINES);
Matcher math = pattern.matcher(source);
while(math.find()){Log.i("Value",math.group(1));}
When I check whether it matches, the result is always false.
Can anyone help me fix this error?
If you're trying to extract the href, you should use the Jsoup library.
Here is a working example:
import java.io.IOException;
import org.jsoup.Jsoup;

public class Test {
    public static void main(String[] args) throws IOException {
        // The HTML snippet from the question, as a Java string literal
        String source = "<a class=\"context_link\" href=\"/thuc-don/41-Thit-vit-ram-sa-gung.html\"> <img src=\"http://monngonmoingay.com/uploads/monan/201205170430310000000_thit\" + \"-vit-ram-sa-gung-48aq570.png\" alt=\"Thịt vịt ram sả gừng \" />";
        // Parse the HTML, take the first <a> element, and read its href attribute
        String link = Jsoup.parse(source).select("a").first().attr("href");
        System.out.println("Your link :" + link);
    }
}
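If adding a library is not an option, a regex can still work for this specific snippet. The sketch below is not part of the original answer. It assumes the anchor tag always has `class` before `href`, as in the question, and it replaces the greedy `(.+)` with `([^"]+)` so the capture stops at the first closing quote instead of running on to the last quote in the input. The class name `HrefRegexSketch` and the helper `extractLinks` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefRegexSketch {
    // Matches <a class="context_link" href="..." and captures the href value.
    // [^"]+ cannot cross a double quote, so the group ends at the href's closing quote.
    private static final Pattern LINK =
            Pattern.compile("<a\\s+class=\"context_link\"\\s+href=\"([^\"]+)\"");

    // Hypothetical helper: returns every captured href in the order it appears.
    static List<String> extractLinks(String source) {
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK.matcher(source);
        // find() scans for the pattern anywhere in the input;
        // matches() would require the whole input to match the pattern.
        while (matcher.find()) {
            links.add(matcher.group(1));
        }
        return links;
    }

    public static void main(String[] args) {
        String source = "<a class=\"context_link\" href=\"/thuc-don/41-Thit-vit-ram-sa-gung.html\">"
                + "<img src=\"http://example.com/x.png\" alt=\"x\" />";
        System.out.println(extractLinks(source));
    }
}
```

This only holds up while the markup keeps exactly this attribute order and quoting. For anything less predictable, the Jsoup approach above is the safer choice.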