jsoup throws 204 status despite a status code check
I am connecting to a list of URLs through jsoup. Here is a snippet of my code:
for (int j = 0; j < unq_urls.size(); j++) {
    Response response2 = Jsoup.connect(unq_urls.get(j))
            .userAgent("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.21 (KHTML, like Gecko) Chrome/19.0.1042.0 Safari/535.21")
            .timeout(100*1000)
            .ignoreContentType(true)
            .execute();
    if (response2.statusCode() == 200) {
        ...}
}
When the connection is executed, jsoup throws the following error:
org.jsoup.HttpStatusException: HTTP error fetching URL. Status=204, URL=https://www.google.com/gen_204?reason=EmptyURL
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:459)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:475)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:475)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:434)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:181)
at cseapiandparsing.CSE_Author_Name_Dis.<init>(CSE_Author_Name_Dis.java:187)
at cseapiandparsing.CSE_Author_Name_Dis.main(CSE_Author_Name_Dis.java:263)
How can I overcome this? I want jsoup to move on to the next URL if it cannot connect to a particular one.
Relatedly, jsoup also throws a timeout error when connecting to a URL takes too long. To deal with that I have already set the .timeout(100*1000) option. However, is there a way to move on to the next URL if the attempt for the current one takes too long?
Thanks in advance.
I believe you are looking for a try-catch mechanism here.
Surround the Jsoup.connect part with a try clause, then handle the exceptions gracefully in your catch clause, which in your case means continuing to the next loop iteration.
To skip the current URL if it takes too long, simply set the timeout() value to your desired waiting period. If the request exceeds that period, jsoup throws a timeout exception, which is again caught by the catch clause. Try the code I posted below:
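for (int j = 0; j < unq_urls.size(); j++) {
    try {
        Response response2 = Jsoup.connect(unq_urls.get(j))
                .userAgent("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.21 (KHTML, like Gecko) Chrome/19.0.1042.0 Safari/535.21")
                .timeout(100*1000) // timeout is given in milliseconds
                .ignoreContentType(true)
                .execute();
        // keep the status check and the processing inside the try block,
        // because response2 is only in scope here
        if (response2.statusCode() == 200) {
            ...}
    } catch (Exception e) {
        continue; // move on to the next URL if an exception occurs
    }
}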
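If you would rather not catch every Exception, here is a minimal, self-contained sketch of the same skip-and-continue idea that catches only IOException. As far as I know, both jsoup's HttpStatusException and java.net.SocketTimeoutException are subclasses of it. The StatusFetcher interface, the collectOk method and the example.com URLs are hypothetical names made up for this illustration. The stub fetcher stands in for the real Jsoup.connect(...).execute() call so that the example runs without jsoup on the classpath.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipFailingUrls {

    /** Hypothetical stand-in for "connect to the URL and return its HTTP status code". */
    @FunctionalInterface
    public interface StatusFetcher {
        int fetchStatus(String url) throws IOException;
    }

    /** Returns the URLs that answered with status 200; URLs that throw are skipped. */
    public static List<String> collectOk(List<String> urls, StatusFetcher fetcher) {
        List<String> ok = new ArrayList<>();
        for (String url : urls) {
            try {
                int status = fetcher.fetchStatus(url);
                if (status == 200) {
                    ok.add(url);
                }
            } catch (IOException e) {
                // Reached for HTTP status errors and timeouts alike:
                // log the problem and let the loop move on to the next URL.
                System.err.println("Skipping " + url + ": " + e.getMessage());
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        // Stub that simulates one status error and one timeout (assumed behaviour,
        // used here instead of a real network call).
        StatusFetcher stub = url -> {
            if (url.contains("gen_204")) {
                throw new IOException("HTTP error fetching URL. Status=204");
            }
            if (url.contains("slow")) {
                throw new SocketTimeoutException("Read timed out");
            }
            return 200;
        };

        List<String> urls = Arrays.asList(
                "https://example.com/a",
                "https://www.google.com/gen_204?reason=EmptyURL",
                "https://example.com/slow",
                "https://example.com/b");

        // prints [https://example.com/a, https://example.com/b]
        System.out.println(collectOk(urls, stub));
    }
}
```

With jsoup, the fetcher would presumably be a lambda wrapping the Jsoup.connect(url)...execute().statusCode() chain from the question. If you prefer to inspect non-200 status codes yourself instead of catching an exception, jsoup's Connection also offers ignoreHttpErrors(true), which should let the statusCode() check handle a response such as 204.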