How to use HTML Parser to get what's in a div tag or another tag in Java

Question

I want to get the text inside a tag, e.g.

<div id="title">    MotoGP  </div>

I want to extract "MotoGP" from here. I'm using org.htmlparser.

I've tried

NodeList nodes = parser.extractAllNodesThatMatch(new AndFilter(new TagNameFilter("div"),
        new HasAttributeFilter("id", "title")));

SimpleNodeIterator nodeIterator = nodes.elements();
while (nodeIterator.hasMoreNodes()) {
    Node node = nodeIterator.nextNode();
    HeadingTag tag = (HeadingTag) node;
    System.out.println(tag.getStringText());
}

Answer 1

Something like this should work:

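// Parser.createParser is a static factory on org.htmlparser.Parser
Parser p = Parser.createParser(html /* actual html String */,
    charset /* null for default */);

NodeList nl = p.extractAllNodesThatMatch(
    new HasAttributeFilter("id", "title")); // or other id...

// if you want the text of the 1st matching node:
// getText() on a tag returns the tag's own contents (div id="title"),
// so use toPlainTextString() for the text between the tags
if (nl.size() > 0) {
    System.out.println(nl.elementAt(0).toPlainTextString().trim());
}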
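
For completeness, here is a self-contained sketch that combines the div-plus-id filter from the question with the lookup above. It assumes the org.htmlparser jar is on the classpath; the class name DivTextExtractor and the helper textOfDiv are made up for illustration and are not part of the library. It stays on the generic Node interface, since a <div> is not parsed as a HeadingTag and the cast in the question would most likely fail with a ClassCastException.

```java
import org.htmlparser.Node;
import org.htmlparser.Parser;
import org.htmlparser.filters.AndFilter;
import org.htmlparser.filters.HasAttributeFilter;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

public class DivTextExtractor {

    /**
     * Hypothetical helper: returns the trimmed text of the first <div>
     * whose id attribute equals the given id, or null if nothing matches.
     */
    public static String textOfDiv(String html, String id) throws ParserException {
        // A null charset asks the library to use its default encoding.
        Parser parser = Parser.createParser(html, null);

        // Match only <div> tags that also carry the requested id.
        NodeList matches = parser.extractAllNodesThatMatch(
                new AndFilter(new TagNameFilter("div"),
                              new HasAttributeFilter("id", id)));

        if (matches.size() == 0) {
            return null;
        }

        // toPlainTextString() gives the text between the open and close tags.
        Node first = matches.elementAt(0);
        return first.toPlainTextString().trim();
    }

    public static void main(String[] args) throws ParserException {
        String html = "<div id=\"title\">    MotoGP  </div>";
        System.out.println(textOfDiv(html, "title"));
    }
}
```

Returning null for "no match" keeps the sketch short; in real code you may prefer an Optional or an exception, depending on how the caller should react to a missing element.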
