How do I create a regular expression to retrieve HTML code in Java?

Question

A piece of HTML code:

<a class="context_link" href="/thuc-don/41-Thit-vit-ram-sa-gung.html">
        <img src="http://monngonmoingay.com/uploads/monan/201205170430310000000_thit" +
                "-vit-ram-sa-gung-48aq570.png" alt="Thịt vịt ram sả gừng " />

So I use a regex to get the link from the code:

String pat = "<a\\s+class=\"context_link\"\\s+href=\"(.+)\"";
Pattern pattern = Pattern.compile(pat, Pattern.DOTALL | Pattern.UNIX_LINES);
Matcher matcher = pattern.matcher(source);
while (matcher.find()) {
    Log.i("Value", matcher.group(1));
}

When I check whether it matches, the result is always false.

Can anyone help me fix this error?

Answer 1

If you're trying to extract the href, you should use the Jsoup library.


Here is a working example:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class Test {

    public static void main(String[] args) {
        // The HTML fragment from the question, kept as posted.
        String source = "<a class=\"context_link\" href=\"/thuc-don/41-Thit-vit-ram-sa-gung.html\">        <img src=\"http://monngonmoingay.com/uploads/monan/201205170430310000000_thit\" +                \"-vit-ram-sa-gung-48aq570.png\" alt=\"Thịt vịt ram sả gừng \" />";

        // Parse the fragment and take the first <a> element, if there is one.
        Element anchor = Jsoup.parse(source).select("a").first();
        String link = (anchor != null) ? anchor.attr("href") : "";
        System.out.println("Your link: " + link);
    }
}
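
If a regex is still preferred for a fragment this small, the following is a minimal sketch rather than a robust HTML parser. It assumes the anchor always has class before href and uses double quotes, as in the snippet from the question. The class name HrefRegexSketch, the method extractLinks, and the example.com image URL are made up for illustration. It also shows one possible reason a check can come back false: matches() only succeeds when the whole input matches the pattern, while find() looks for it anywhere in the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefRegexSketch {

    // [^"]+ stops at the first closing quote, so the capture cannot run on
    // into later attributes the way a greedy (.+) with DOTALL does.
    private static final Pattern LINK = Pattern.compile(
            "<a\\s+class=\"context_link\"\\s+href=\"([^\"]+)\"");

    // Returns every href captured from anchors that match the pattern.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK.matcher(html);
        // find() scans for the pattern anywhere in the input.
        while (matcher.find()) {
            links.add(matcher.group(1));
        }
        return links;
    }

    public static void main(String[] args) {
        // Hypothetical input modeled on the question's snippet.
        String source = "<a class=\"context_link\" href=\"/thuc-don/41-Thit-vit-ram-sa-gung.html\">\n"
                + "        <img src=\"http://example.com/pic.png\" alt=\"demo\" />";

        // prints [/thuc-don/41-Thit-vit-ram-sa-gung.html]
        System.out.println(extractLinks(source));

        // matches() requires the entire input to match, so this prints false.
        System.out.println(LINK.matcher(source).matches());
    }
}
```

For anything beyond a fixed, known fragment (reordered attributes, single quotes, nested markup), the Jsoup approach above is the safer choice.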

Solution 1
0 Accepted 2013-10-08 08:15:59
