
Jsoup unable to connect (sometimes)

I'm using jsoup in a project, but I find that sometimes the following statement fails to fetch the document:

 Document document = Jsoup.connect(url).timeout(30000).get();

The strange thing is that I can open the URL in a browser in less than 2 seconds, while jsoup has this problem. Another strange thing is that most of the time, jsoup works well.

What's wrong?

Some websites look at the user-agent string of the connecting client to decide what content to deliver. It may be that the user-agent jsoup sends by default is not one the server recognizes as a browser. So my suggestion would be to experiment with the user-agent like this:

Document document = Jsoup.connect(url)
   .userAgent("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
   .timeout(30000)
   .get();

Another possibility would be that the web server needs some cookies to be set correctly. You need to look at the exact traffic between a browser and the website to find out more. (Use the Network tab in the browser's developer tools.)
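To make the cookie idea concrete, here is a minimal sketch of the round trip a browser performs: read the Set-Cookie values from a first response and send them back in a Cookie header on the next request. It uses only the JDK so it can run without a network connection; the class name CookieReplay, both helper methods, and the sample header values are made up for illustration and are not taken from the site in question. The comments note what I believe is the equivalent jsoup usage.

```java
import java.net.HttpCookie;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, for illustration only.
class CookieReplay {

    // Collect cookie name/value pairs from raw Set-Cookie header values,
    // ignoring attributes such as Path or HttpOnly.
    static Map<String, String> parseSetCookie(List<String> setCookieHeaders) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String header : setCookieHeaders) {
            for (HttpCookie cookie : HttpCookie.parse(header)) {
                cookies.put(cookie.getName(), cookie.getValue());
            }
        }
        return cookies;
    }

    // Build the value of the Cookie request header to send back to the server.
    static String toCookieHeader(Map<String, String> cookies) {
        StringJoiner joiner = new StringJoiner("; ");
        cookies.forEach((name, value) -> joiner.add(name + "=" + value));
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Assumed sample values, as they might appear in the Network tab.
        List<String> fromServer = List.of(
                "JSESSIONID=abc123; Path=/; HttpOnly",
                "lang=en; Path=/");

        Map<String, String> cookies = parseSetCookie(fromServer);
        System.out.println("Cookie: " + toCookieHeader(cookies));

        // With jsoup, the same round trip would look roughly like:
        //   Connection.Response first = Jsoup.connect(url).execute();
        //   Document doc = Jsoup.connect(url).cookies(first.cookies()).get();
    }
}
```

If the page only loads reliably once such cookies are present, comparing the browser's request headers with what jsoup sends should show which ones are missing.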

Without the URL that gives you problems, I fear this is all the advice I can offer.

