Using JSoup in Java to parse a text file containing an HTML table

Question

I am really unsure how to extract the information I need to put into a database; the code below just prints the whole file.

File input = new File("shipMove.txt");
Document doc = Jsoup.parse(input, null);    
System.out.println(doc.toString());

My HTML is here, starting at line 61. I need to get the items under the column headings, and also the MMSI number, which is not under a column heading but in the href attribute of a link. I haven't used JSoup for anything other than fetching the HTML from the web page. The only tutorials I can find use PHP, which I'd rather not use.

Answer 1

The best way to get that information is to use Jsoup's selector API. Using selectors, your code will look something like this (pseudocode!):

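import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Passing null as the charset lets Jsoup detect it, falling back to UTF-8
File input = new File("shipMove.txt");
Document doc = Jsoup.parse(input, null);
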
Elements matches = doc.select("<your selector here>");

for( Element element : matches )
{
    // do something with found elements
}

Good documentation is available here: Use selector-syntax to find elements. If you still get stuck, please describe your problem.

Here are some hints for the selector you can use:

// Select the table with class 'shipinfo'
Elements tables = doc.select("table.shipinfo");

// Iterate over all tables found (since there is only one, you can use first() instead)
for( Element element : tables )
{
    // Select all 'td' tags of that table
    Elements tdTags = element.select("td"); 

    // Iterate over all 'td' tags found
    for( Element td : tdTags )
    {
        // Print its text if not empty
        final String text = td.text();

        if( !text.isEmpty() )
        {
            System.out.println(text);
        }
    }
}
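The MMSI number lives in a link rather than in a cell's text, so it needs one extra step. With Jsoup you can select the links with the selector "a[href]" and read the raw value with attr("href"); what remains is plain string handling. Here is a minimal sketch of that last step using only the standard library. The class name MmsiExtractor, the sample href values, and the assumption that the number appears as a query parameter named "mmsi" are all hypothetical, since the actual HTML is not shown here, so adjust the pattern to match your file.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MmsiExtractor {

    // ASSUMPTION: the link carries the number as a query parameter named
    // "mmsi" followed by digits. Change this pattern to fit the real hrefs.
    private static final Pattern MMSI_PARAM =
            Pattern.compile("[?&]mmsi=(\\d+)", Pattern.CASE_INSENSITIVE);

    // Returns the MMSI digits found in the href, or an empty Optional
    // when the href is null or does not contain the parameter.
    public static Optional<String> extractMmsi(String href) {
        if (href == null) {
            return Optional.empty();
        }
        Matcher matcher = MMSI_PARAM.matcher(href);
        if (matcher.find()) {
            return Optional.of(matcher.group(1));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical href values standing in for link.attr("href")
        String[] hrefs = {
            "shipdetails.aspx?mmsi=235012345",
            "shipdetails.aspx?id=7&MMSI=244660000",
            "about.html"
        };
        for (String href : hrefs) {
            System.out.println(href + " -> " + extractMmsi(href).orElse("(no MMSI)"));
        }
    }
}
```

Inside the loop over the table you would call extractMmsi on each link's href and store the result next to the cell texts of the same row. Returning an Optional keeps rows without a matching link from causing a NullPointerException.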

Answer 1 (0, accepted, 2014-04-12 17:51:58)