Receiving table data from an external website in Android using jSoup

Question

Inside my Android app I want to receive some table data from an external website.

Let's say website page X has this table inside its HTML:

<table summary="Foo" border="0" bgcolor="#ffffff" cellpadding="0"> </table>

How would I receive the strings inside all the cells of the second column of the table (top to bottom)?

So far I have done the following:

Created an AsyncTask.
Used jSoup to scrape the external website.

I used the following code inside my AsyncTask:

ArrayList<String> list = new ArrayList<String>(); // table data
Document document = Jsoup.connect(url).get();
Elements nextTurns = document.select(":contains(Foo) td:eq(1)");
for (Element nextTurn : nextTurns) {
    list.add(nextTurn.text());
}

When I run the code, it seems to hang at the document.select statement and the GC goes crazy. After a very long time it does get past the document.select statement, and most of the data it returns is correct, but the list still contains random other elements from the website.

I am pretty sure this is completely wrong:

Elements nextTurns = document.select(":contains(Foo) td:eq(1)");

But I am unsure how to fix it, because the table also lacks any IDs. And I find this page confusing.

How can I fix the select statement and/or the for loop so that it fills the ArrayList with data from the second table column?

Edit: after removing :contains(Foo) it is now really fast, so that's one problem fewer. I still need help with traversing the DOM to the second column of the table without picking up a bunch of random parts of the website.

Answer 1

Guessing from your post, this should be the correct selector:

document.select("table[summary=Foo] tr");

Loop through the rows returned above and, for each row, take the second <td>, which is at index 1 of that row's cells.
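A minimal sketch of that loop, assuming the table really is the only one with summary="Foo". The class name SecondColumn, the helper secondColumn and the sample HTML in main are made up for illustration; in the app the Document would come from Jsoup.connect(url).get() inside the AsyncTask. Note that :contains(Foo) matches on element text rather than on the summary attribute, which is likely why the original selector picked up unrelated parts of the page; the attribute selector [summary=Foo] restricts the match to the table itself.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SecondColumn {

    /** Collects the text of the second <td> of every row in the table with summary="Foo". */
    static List<String> secondColumn(Document document) {
        List<String> list = new ArrayList<>();
        // Attribute selector: only rows inside the table whose summary attribute is "Foo".
        for (Element row : document.select("table[summary=Foo] tr")) {
            Elements cells = row.select("td");
            // Skip rows without a second cell (for example header rows that use <th>).
            if (cells.size() > 1) {
                list.add(cells.get(1).text());
            }
        }
        return list;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real page; a second table checks that it is ignored.
        String html = "<table summary=\"Foo\"><tr><td>a1</td><td>b1</td></tr>"
                + "<tr><td>a2</td><td>b2</td></tr></table>"
                + "<table summary=\"Bar\"><tr><td>x</td><td>y</td></tr></table>";
        System.out.println(secondColumn(Jsoup.parse(html)));
    }
}
```

If the rows contain nested tables, row.select("td") would also return the nested cells, so the index may need adjusting in that case.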
