
Trouble parsing table with jsoup

I have been stuck on this for a few days now. I am attempting to create an app for the forum fightlockdown (it's an MMA forum).

The area where I am running into trouble is on pages such as http://fightlockdown.com/forum/forumdisplay.php?f=1, where I would like to display each section in the table as a row, but I am having trouble grabbing only the sections, i.e. UFC, The Ultimate Fighter, etc.

The closest I have been able to get is grabbing all of the anchor tags, but there are obviously other anchors on the page, which could throw off my results if I don't remove them from the returned Elements correctly.

I have not been able to figure out how to select the table to narrow down my results: as far as I can tell, doc.select("table.tborder") does not yield any results, and neither does doc.select("td.alt1Active").

Any help would be very much appreciated. Thanks in advance.

You are not very specific about what you are looking for, so I'll throw some code out there and see if it's what you need.

On this page specifically, the divs you are trying to pull each have one of two classes associated with them. This code selects those divs, iterates over them, and prints out the anchor tags inside each one.

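    // Fetch the forum page (get() throws IOException on network errors).
    Document doc = Jsoup.connect("http://fightlockdown.com/forum/forumdisplay.php?f=1").get();
    // Match divs carrying either of the two section classes.
    for (Element div : doc.select("div.forumold_lock, div.old_lockwindowbg")) {
        // Print every anchor found inside the current div.
        System.out.println(div.select("a"));
    }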
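
If you want just the section names and their links rather than the raw anchor markup, something along these lines might work. This is only a sketch: the class name SectionLinks, the helper sectionLinks, and the inline HTML (including the f=2 and f=3 links) are made up for illustration so the example runs without hitting the network, and it assumes the first anchor in each div is the section link, which may not hold on the real page.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class SectionLinks {

    // Hypothetical helper: returns "name -> absolute URL" for the first
    // anchor inside each div that has one of the two section classes.
    static List<String> sectionLinks(Document doc) {
        List<String> rows = new ArrayList<>();
        for (Element div : doc.select("div.forumold_lock, div.old_lockwindowbg")) {
            Element link = div.select("a[href]").first(); // null if the div has no link
            if (link != null) {
                rows.add(link.text() + " -> " + link.absUrl("href"));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up markup standing in for the real page (assumption).
        String html =
                "<div class=\"forumold_lock\"><a href=\"forumdisplay.php?f=2\">UFC</a></div>"
              + "<div class=\"old_lockwindowbg\"><a href=\"forumdisplay.php?f=3\">The Ultimate Fighter</a></div>"
              + "<div class=\"footer\"><a href=\"index.php\">Home</a></div>";

        // The base URI lets absUrl() resolve relative hrefs.
        Document doc = Jsoup.parse(html, "http://fightlockdown.com/forum/");

        for (String row : sectionLinks(doc)) {
            System.out.println(row);
        }
    }
}
```

Against the live page you would pass the Document returned by Jsoup.connect(...).get() to sectionLinks instead; anchors outside the two div classes, like the footer link in the sample, are skipped.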

Let me know if you need any more help.
