Jsoup unable to connect (sometimes)
I'm using jsoup in a project, but I find that sometimes the following statement fails to fetch the document:
Document document = Jsoup.connect(url).timeout(30000).get();
The strange thing is that a browser opens the URL in less than 2 seconds, while jsoup has this problem. Another strange thing is that most of the time jsoup works well.
What's wrong?
Some websites look at the User-Agent string of the connecting client to decide what content to deliver. It may be that the user agent jsoup sends by default does not look like a browser to the server. So my suggestion would be to set the user agent explicitly, like this:
Document document = Jsoup.connect(url)
    .userAgent("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
    .timeout(30000)
    .get();
Another possibility is that the web server needs certain cookies to be set correctly. You need to look at the exact traffic between a browser and the website to find out more (use the Network tab in the browser's developer tools).
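If the Network tab shows the browser sending a Cookie request header, one way to replay it could look like the sketch below. It turns a copied header value into a map of cookie names to values. CookieHeader and parse are hypothetical names chosen for illustration (they are not part of jsoup), and the sketch assumes the header uses the usual "name=value; name2=value2" form.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of jsoup. */
final class CookieHeader {
    private CookieHeader() {}

    /**
     * Splits a Cookie request header value such as "a=1; b=2"
     * into an ordered name -> value map.
     */
    static Map<String, String> parse(String header) {
        Map<String, String> cookies = new LinkedHashMap<>();
        if (header == null) {
            return cookies;
        }
        for (String pair : header.split(";")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed fragments
            }
            String name = pair.substring(0, eq).trim();
            // Everything after the first '=' is the value, so values
            // that themselves contain '=' are kept intact.
            String value = pair.substring(eq + 1).trim();
            if (!name.isEmpty()) {
                cookies.put(name, value);
            }
        }
        return cookies;
    }
}
```

The resulting map could then be handed to jsoup with Jsoup.connect(url).cookies(CookieHeader.parse(copiedHeader)) before calling get(). Cookies copied from a browser session usually expire, so this is mainly useful for checking whether missing cookies are the cause.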
Without the URL that gives you the problems, I fear this is all the advice I can offer.