
Java Pattern: find the text between HTML tags

I want to find the text 'ABCD' in:

String text = "<div class=\"aaaa\">1234</div>"
            + "   <li class=\"pcs05\">ABCD</li>";

Pattern p = Pattern.compile("<li class=[^A-Za-z0-9]>(\\S+)</li>");
Matcher m = p.matcher(text);
if(m.find()){
    System.out.println(m.group(1));
}

But it doesn't print anything.

String text = "<div class=\"aaaa\">1234</div>";
text += "<li class=\"pcs05\">ABCD</li>";
Pattern p = Pattern.compile("<li class=\"[A-Za-z0-9]+\">(\\S+)</li>");
Matcher m = p.matcher(text);
if (m.find()) {
    System.out.println(m.group(1));
}
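
To see why the pattern from the question finds nothing, note that [^A-Za-z0-9] matches exactly one character that is not a letter or digit, so it cannot cover the quoted value "pcs05". The sketch below runs both patterns against the same input; the class name LiTextDemo and the helper firstGroup are illustrative names, not part of the original posts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiTextDemo {
    // Pattern from the question: [^A-Za-z0-9] consumes a single non-alphanumeric
    // character (here the opening quote), then expects '>' but finds 'p'.
    public static final Pattern ORIGINAL =
            Pattern.compile("<li class=[^A-Za-z0-9]>(\\S+)</li>");

    // Corrected pattern: literal quotes around one or more letters or digits.
    public static final Pattern FIXED =
            Pattern.compile("<li class=\"[A-Za-z0-9]+\">(\\S+)</li>");

    // Hypothetical helper: returns capture group 1 of the first match,
    // or null when the pattern does not match anywhere in the input.
    public static String firstGroup(Pattern p, String html) {
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "<div class=\"aaaa\">1234</div>"
                    + "   <li class=\"pcs05\">ABCD</li>";
        System.out.println(firstGroup(ORIGINAL, text)); // prints: null
        System.out.println(firstGroup(FIXED, text));    // prints: ABCD
    }
}
```

Keep in mind that (\\S+) only captures text without whitespace, so an item such as <li class="pcs05">AB CD</li> would not match the corrected pattern either.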

The preferred tool for this kind of task is an HTML or XML parser (for more information, see "Can you provide some examples of why it is hard to parse XML and HTML with a regex?"). One of the simpler parsers I like to use is jsoup. A nice thing about it is that it supports CSS query syntax.

So your code could look like:

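import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

String text = "<div class=\"aaaa\">1234</div>"
            + "   <li class=\"pcs05\">ABCD</li>";

// Parse the fragment and select every <li> element with a CSS query
Document doc = Jsoup.parse(text);
String liValue = doc.select("li").text();

System.out.println(liValue);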

Output: ABCD

