Java - Read page source from URL does not work

I am using the code below to read the page source from a URL. It works for almost all URLs, but not for this URL: it just returns the URL itself.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public static String getURLSource(String url) throws IOException
{
    URL urlObject = new URL(url);
    URLConnection urlConnection = urlObject.openConnection();
    //urlConnection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11");

    return toString(urlConnection.getInputStream());
}

// Reads the whole stream as UTF-8 text.
private static String toString(InputStream inputStream) throws IOException
{
    try (BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8)))
    {
        String inputLine;
        StringBuilder stringBuilder = new StringBuilder();
        while ((inputLine = bufferedReader.readLine()) != null)
        {
            // readLine() strips the line terminator, so add it back to keep the lines separate
            stringBuilder.append(inputLine).append('\n');
        }

        return stringBuilder.toString();
    }
}

What is the problem, and how can I modify the code so that it works properly? Thanks.

You have to use HttpsURLConnection, because the URL is https.
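A minimal sketch of what that could look like, not a confirmed fix for the URL in the question. The class name PageSource, the helper names open and readAll, and the short "Mozilla/5.0" User-Agent value are placeholders chosen for illustration. The sketch assumes the URL really is an https URL, so the cast is valid, and it sets a User-Agent because the question's code has that line commented out and some servers may respond differently without one.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import javax.net.ssl.HttpsURLConnection;

class PageSource {

    // Prepares the connection. Nothing is sent over the network until
    // the input stream is requested.
    static HttpsURLConnection open(String url) throws IOException {
        // URI.create(url).toURL() is equivalent to new URL(url) for a valid URL
        HttpsURLConnection connection =
                (HttpsURLConnection) URI.create(url).toURL().openConnection();
        // Placeholder value (assumption): a browser-like User-Agent header
        connection.setRequestProperty("User-Agent", "Mozilla/5.0");
        connection.setInstanceFollowRedirects(true);
        return connection;
    }

    // Reads the whole stream as UTF-8 text, one line at a time.
    static String readAll(InputStream inputStream) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(inputStream, StandardCharsets.UTF_8))) {
            StringBuilder builder = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() strips the line terminator, so add one back
                builder.append(line).append('\n');
            }
            return builder.toString();
        }
    }

    static String getURLSource(String url) throws IOException {
        HttpsURLConnection connection = open(url);
        try {
            return readAll(connection.getInputStream());
        } finally {
            connection.disconnect();
        }
    }
}
```

It would be called as String html = PageSource.getURLSource("https://example.com/"); where example.com stands in for the real address. If the page still comes back empty or unexpected, checking connection.getResponseCode() before reading the stream may help narrow down whether the server is redirecting or rejecting the request.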


 