
"Exception in thread "main" java.lang.NullPointerException" error when running web scraper program

I'm fairly new to web scraping and have limited knowledge of Java.

Every time I run this code, I get the error:

Exception in thread "main" java.lang.NullPointerException

    at sws.SWS.scrapeTopic(SWS.java:38)
    at sws.SWS.main(SWS.java:26)
Java Result: 1
BUILD SUCCESSFUL (total time: 0 seconds)

My code is:

import java.io.*;
import java.net.*;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SWS
{
    /**
     * @param args the command line arguments
     */
    public static void main(String[] args)
    {
        scrapeTopic("wiki/Python");
    }

    public static void scrapeTopic(String url)
    {
        String html = getUrl("http://www.wikipedia.org/" + url);
        Document doc = Jsoup.parse(html);
        String contentText = doc.select("#mw-content-text > p").first().text();
        System.out.println(contentText);
    }

    public static String getUrl(String Url)
    {
        URL urlObj = null;
        try
        {
            urlObj = new URL(Url);
        }
        catch(MalformedURLException e)
        {
            System.out.println("The url was malformed");
            return "";
        }

        URLConnection urlCon = null;
        BufferedReader in = null;
        String outputText = "";
        try
        {
            urlCon = urlObj.openConnection();
            in = new BufferedReader(new InputStreamReader(urlCon.getInputStream()));
            String line = "";
            while ((line = in.readLine()) != null)
            {
                outputText += line;
            }
            in.close();
        }
        catch(IOException e)
        {
            System.out.println("There was a problem connecting to the url");
            return "";
        }
        return outputText;
    }
}

I've been staring at my screen for some time now and am in need of help!

Thanks in advance.

In the following code:

 String contentText = doc.select("#mw-content-text > p").first().text();

If doc.select("#mw-content-text > p") doesn't find any element that matches the query, it returns an empty Elements collection. Calling first() on an empty collection returns null, and calling text() on that null throws the NullPointerException.

Check the jsoup documentation pages for Element.select and Elements.first().
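As a minimal sketch of guarding against that case (the class name SafeFirst and the helper firstText are hypothetical, not part of the original code, and jsoup is assumed to be on the classpath), the result of first() can be checked before text() is called:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SafeFirst {
    /** Hypothetical helper: text of the first match, or null when nothing matches. */
    static String firstText(Document doc, String cssQuery) {
        // select() returns an empty Elements when nothing matches, never null
        Element first = doc.select(cssQuery).first();
        // first() returns null for an empty Elements, so check before calling text()
        return (first == null) ? null : first.text();
    }

    public static void main(String[] args) {
        // Inline HTML instead of a network fetch, so the sketch runs offline
        Document match = Jsoup.parse("<div id=\"mw-content-text\"><p>Hello</p></div>");
        Document noMatch = Jsoup.parse(""); // what scrapeTopic parses if getUrl returned ""

        System.out.println(firstText(match, "#mw-content-text > p"));
        System.out.println(firstText(noMatch, "#mw-content-text > p"));
    }
}
```

With the guard in place, an empty page produces a null result that the caller can handle, rather than an exception inside scrapeTopic.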

In this line

doc.select("#mw-content-text > p").first().text();

Your doc.select evidently doesn't find anything, so it returns an empty Elements and first() returns null. You then call text() on null, and that's why it ends with an error.

Your code works fine for me exactly as it is.

To debug and diagnose whether the possible errors noted in the other answers apply, you'd likely do well to use some temporary variables and step through the code in a debugger.

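// Needs two extra imports: org.jsoup.nodes.Element and org.jsoup.select.Elements
public static void scrapeTopic(String url)
{
    String html = getUrl("http://www.wikipedia.org/" + url);
    Document doc = Jsoup.parse(html);

    Elements select = doc.select("#mw-content-text > p"); // empty (not null) when nothing matches
    Element first = select.first();                        // null when select is empty
    String contentText = first.text();                     // throws NullPointerException if first is null

    System.out.println(contentText);
}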
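One more value worth inspecting while stepping through is the string returned by getUrl, since the question's getUrl returns "" from both of its catch blocks. The following is only a sketch of such a check; the class HtmlCheck and the method describe are hypothetical names introduced for illustration:

```java
public class HtmlCheck {
    /** Hypothetical helper: a short diagnostic for the string returned by getUrl. */
    static String describe(String html) {
        if (html == null || html.isEmpty()) {
            // getUrl in the question returns "" on a malformed URL or an IOException
            return "empty response: getUrl failed or the server sent no body";
        }
        return "received " + html.length() + " characters";
    }

    public static void main(String[] args) {
        System.out.println(describe(""));
        System.out.println(describe("<p>Hi</p>"));
    }
}
```

If the diagnostic reports an empty response, the selector is not the problem: the page was never downloaded, so there is nothing for doc.select to match.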
