Getting information from an HTML file

Question

I'm writing a program that gets information from a web page and puts it in an Excel file.

The problem is that I can't find a way to search for the tag that contains the specific info.

Here is my code (so far):

    private void getAll() throws IOException {
        for (int i = 0; i < 250; i++) {
            URL vurl = new URL("http://www.bamart.be/nl/artists/detail/" + i);
            BufferedReader reader = new BufferedReader(new InputStreamReader(vurl.openStream()));
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.equalsIgnoreCase("<div class=\"subcontent\">")) {
                    System.out.println("Found info!");
                }

                printInfo(line, i);
            }
            reader.close(); // release the connection before fetching the next page
        }
    }

    private void printInfo(String info, int i) {
        System.out.println("/***********************************************/");
        System.out.println("************\t" + info + "**********************/");
        System.out.println("/************" + " Artist page:" + i + " of 999 **********************/");
    }

The "Found info!" println never comes up, even though the tag is in the HTML file.

Answer 1

if (line.equalsIgnoreCase("<div class=\"subcontent\">")) { }

This if statement checks for exact equality (ignoring case), but the line may contain other content as well, such as leading whitespace.

What you might want instead is something like:

if (line.toLowerCase().contains("<div class=\"subcontent\">")) { }
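
To make the difference concrete, here is a minimal sketch that puts the two checks side by side. The class name SubcontentLineMatcher and its methods are hypothetical helpers introduced only for illustration; they are not part of the asker's program. It works on an HTML string rather than a live URL so it can be tried without network access.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; names are illustrative, not from the original code.
final class SubcontentLineMatcher {
    private static final String MARKER = "<div class=\"subcontent\">";

    // Exact comparison: true only when the whole line is the marker and nothing else.
    static boolean matchesExactly(String line) {
        return line.equalsIgnoreCase(MARKER);
    }

    // Substring comparison: true when the marker appears anywhere in the line.
    static boolean containsMarker(String line) {
        return line.toLowerCase(Locale.ROOT).contains(MARKER);
    }

    // Returns the 1-based numbers of the lines that contain the marker.
    static List<Integer> findMarkerLines(String html) {
        List<Integer> hits = new ArrayList<>();
        String[] lines = html.split("\r?\n");
        for (int i = 0; i < lines.length; i++) {
            if (containsMarker(lines[i])) {
                hits.add(i + 1);
            }
        }
        return hits;
    }
}
```

An indented line such as `    <div class="subcontent">` fails matchesExactly but passes containsMarker, which is most likely why the original println never appears. Note that this substring check still misses tags with extra attributes or a different attribute order.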

Answer 2

Try using Jsoup, starting from this example.
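
The example this answer points to is not reproduced here. As a rough sketch of the approach, assuming the jsoup library is on the classpath, the following parses an HTML string and collects the text of every div with class "subcontent". The class name SubcontentExtractor and the method extractSubcontent are hypothetical names chosen for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; requires the third-party jsoup library.
final class SubcontentExtractor {

    // Parses the HTML and returns the text of each <div class="subcontent">.
    static List<String> extractSubcontent(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        // CSS selector: any div element carrying the class "subcontent".
        for (Element div : doc.select("div.subcontent")) {
            texts.add(div.text());
        }
        return texts;
    }
}
```

Because jsoup builds a document tree, the match no longer depends on how the tag is laid out across lines or on surrounding whitespace. To work on a live page instead of a string, `Jsoup.connect(url).get()` returns a Document that can be queried with the same `select` call.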

Answer 1
0, accepted, 2012-06-22 15:53:32

Answer 2
0, 2012-06-22 15:56:35