Using org.htmlparser, I want to get the tbody node by its id:
Parser htmlParser = Parser.createParser("<table id='_table' border='0' cellspacing='0' cellpadding='0' class='tableRegion' width='100%' ><thead><tr><td>1</td><td>2</td></tr></thead><tbody id='_table_body' ><tr><td>4</td><td>5</td></tr></tbody></table>","gbk");
NodeFilter filter = new HasAttributeFilter("id", "_table_body");
NodeFilter f = new AndFilter(new TagNameFilter("tr"), new HasParentFilter(filter));
NodeList nodelist1 = htmlParser.parse(filter); //Tag (144[0,144],173[0,173]): tbody id='_table_body'
NodeList nodelist2 = htmlParser.parse(f); //
Why doesn't nodelist1 contain <tr><td>4</td><td>5</td></tr>?
If you select the <tbody> node, you should expect to get:
<tbody id='_table_body' ><tr><td>4</td><td>5</td></tr></tbody>
rather than
<tr><td>4</td><td>5</td></tr>
The latter is a child node of the <tbody> element, not the element itself. In other words, your code (using filter) appears to be returning the right node: a filter that matches on the id selects the element carrying that id, not its children.
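To make the element-versus-child distinction concrete, here is a minimal sketch. It deliberately does not use org.htmlparser: it relies on the JDK's built-in XML DOM parser, which only works here because the sample markup happens to be well-formed XML. The class name TbodyVsChildren and the helper findTbodyById are hypothetical names made up for this illustration, and the table attributes are trimmed for brevity.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TbodyVsChildren {
    // Same structure as the markup in the question, with attributes trimmed.
    static final String HTML =
        "<table id='_table'><thead><tr><td>1</td><td>2</td></tr></thead>"
        + "<tbody id='_table_body'><tr><td>4</td><td>5</td></tr></tbody></table>";

    /** Hypothetical helper: returns the first tbody whose id attribute equals id, or null. */
    static Element findTbodyById(String markup, String id) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(markup)));
            NodeList bodies = doc.getElementsByTagName("tbody");
            for (int i = 0; i < bodies.getLength(); i++) {
                Element candidate = (Element) bodies.item(i);
                if (id.equals(candidate.getAttribute("id"))) {
                    return candidate;
                }
            }
            return null;
        } catch (Exception e) {
            // Only expected if the markup is not well-formed XML.
            throw new IllegalStateException("Could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        // Selecting by id yields the tbody element itself ...
        Element tbody = findTbodyById(HTML, "_table_body");
        System.out.println(tbody.getTagName());        // prints "tbody"

        // ... and the row is reached by stepping down to its child.
        Node row = tbody.getFirstChild();
        System.out.println(row.getNodeName());         // prints "tr"
        System.out.println(row.getTextContent());      // prints "45"
    }
}
```

The same idea applies whatever parser you use: match the element by id first, then ask that element for its children if the rows are what you are after.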