How to parse a table without an id attribute using jsoup
How do I parse a table that has no id attribute? I'm trying to parse the table at lines 2290 to 3153 of this page source: http://pastebin.com/DjGHED5t
It isn't obvious to me how to do it. What I have now is:
import java.util.*;
import java.io.*;
import java.awt.*;
import javax.swing.*;
import org.jsoup.*;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.awt.event.KeyEvent;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class test{
public static void main (String []args){
String Ticker = "KO";
URL url = new URL("http://toolbox.investools.com/graphs/fundamentalAnalysis.iedu?report=BS&symbol="+(Ticker));
Document doc = Jsoup.parse(url, 3000);
Elements table = doc.select(table);
Iterator<Element> ite = table.select(table[width="100%"] [bgcolor="#CCCCCC"] [cellpadding="0"] [cellspacing="2"]);
String[][] balanceSheetInfo = new String [46][11];
while (ite.hasNext()){
for (int row = 0, row_size = balanceSheetInfo[row].length; row < row_size; row++){
for (int col = 0, col_size = balanceSheetInfo.length; col < col_size; col++){
if(ite.hasNext()){
balanceSheetInfo[col][row] = input.next();
System.out.printf("%s",balanceSheetInfo[col][row]); }
}
}
}
}
}
But I am getting "symbol not found" errors. I am not strong with jsoup and scraping, as this is the first project I have used it in. If someone could guide me, it would be greatly appreciated.
Read your code:
Elements table = doc.select(table);
You're using the table variable (in doc.select(table)) before it's even declared. The Element.select() method takes a String as its argument. You need
Elements table = doc.select("table");
with double quotes, which will select all the table elements.
The next line has the same problem:
table.select(table[width="100%"] [bgcolor="#CCCCCC"] [cellpadding="0"] [cellspacing="2"]);
should be
table.select("table[width=\"100%\"] [bgcolor=\"#CCCCCC\"] [cellpadding=\"0\"] [cellspacing=\"2\"]");
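Quoting the selectors gets rid of the "symbol not found" errors, but two more things are worth checking. select() returns an Elements collection, not an Iterator<Element>, so that assignment will not compile as written. Also, in CSS selector syntax a space between the bracketed parts is the descendant combinator, so to match one table that carries all four attributes the parts have to be written back to back. Here is a minimal sketch of that, assuming the target table really has those four attribute values (taken from the question, not checked against the live page). The class name TableSketch, the helper parseTable and the sample HTML are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class TableSketch {

    // Hypothetical helper: returns the cell text of the first matching table, row by row.
    public static List<List<String>> parseTable(String html) {
        Document doc = Jsoup.parse(html);

        // No spaces between the [attr=value] parts, so all four must be on the same <table>.
        // The attribute values are the ones from the question (an assumption about the page).
        Element table = doc.select(
                "table[width=100%][bgcolor=#CCCCCC][cellpadding=0][cellspacing=2]").first();

        List<List<String>> rows = new ArrayList<>();
        if (table == null) {
            return rows; // nothing matched
        }
        // Elements is iterable, so a for-each loop replaces the manual Iterator.
        for (Element tr : table.select("tr")) {
            List<String> cells = new ArrayList<>();
            for (Element cell : tr.select("td, th")) {
                cells.add(cell.text());
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample markup standing in for the real page.
        String html = "<table width=\"100%\" bgcolor=\"#CCCCCC\" cellpadding=\"0\" cellspacing=\"2\">"
                + "<tr><td>Cash</td><td>100</td></tr>"
                + "<tr><td>Inventory</td><td>200</td></tr>"
                + "</table>";
        for (List<String> row : parseTable(html)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

For the real page you would get the Document with Jsoup.connect(url).get() or Jsoup.parse(url, timeoutMillis) instead of parsing a string, and handle the IOException those calls can throw. Collecting rows into a list also avoids guessing the 46 by 11 array size up front.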