Cannot get all matched nodes while using htmlparser to parse a website

Question

I'm using htmlparser to parse a website, but I've run into a really weird problem. I'm trying to get all <li> nodes on a web page, and my code is as follows:

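import org.htmlparser.Node;
import org.htmlparser.Parser;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.util.NodeList;

String url = "http://s.1688.com/selloffer/offer_search.htm?keywords=%BD%A8%B2%C4&n=y&categoryId=";
Parser parser = new Parser(url);
parser.setEncoding("gb2312");

// Collect every <li> tag the parser finds on the page
NodeList list = parser.extractAllNodesThatMatch(new TagNameFilter("li"));
// NodeList list = parser.parse(new CssSelectorNodeFilter("li[class=\"sm-offerShopwindow\"]"));
System.out.print(list.size() + "\n");
for (int i = 0; i < list.size(); i++) {
    Node li = list.elementAt(i);
    // getText() returns the tag's own text (name and attributes)
    System.out.print("text:" + li.getText() + "\n");
}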

But the printed list size is always 20. It seems the parser doesn't traverse all the nodes on that page. Why? Thanks for any advice.

Answer 1

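Even the top browsers don't always agree on how to parse all the weird content that pretends to be HTML, and the web has evolved a great deal since 2006. So I wouldn't be surprised if such old software can't cope with modern HTML.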
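
One way to check whether the parser is really dropping nodes is to count the opening <li> tags in the raw page source and compare that number with list.size(). The sketch below is only a rough diagnostic, not part of htmlparser: the class LiTagCounter and the method countOpeningTags are hypothetical names introduced here, and the regex count is approximate (it would also count tags inside comments or scripts). Fetching the page source into a string is left out. If the raw count is also around 20, one possible explanation (an assumption, not confirmed in this thread) is that the remaining items are added by JavaScript after the page loads, which a static parser never sees.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the htmlparser library.
class LiTagCounter {

    // Counts opening tags with the given name, e.g. <li> or <li class="x">.
    // Closing tags (</li>) and longer names such as <link> are not counted.
    static int countOpeningTags(String html, String tagName) {
        Pattern pattern = Pattern.compile(
                "<" + Pattern.quote(tagName) + "(?=[\\s>/])",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(html);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Small inline sample; in practice, pass the raw source of the page.
        String sample = "<ul><li>a</li><LI class=\"x\">b</li><link rel=\"x\"></ul>";
        System.out.println(countOpeningTags(sample, "li")); // prints 2
    }
}
```

If the raw count is much higher than what extractAllNodesThatMatch returns, the parser is the likely culprit; if the two numbers agree, the missing items are probably not in the downloaded HTML at all.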
