
JSOUP throws url status 503 in Eclipse but URL works fine in browser

Specifically, this happens with amazon.com. I receive a 503 error for their domain, but I can successfully parse other domains.

I am using the line

Document doc = Jsoup.connect(url).timeout(30000).get();

to connect to the URL.

You have to set a User-Agent:

Document doc = Jsoup.connect(url).timeout(30000).userAgent("Mozilla/17.0").get();

(Or another one; it is best to use the User-Agent string of a real browser.)

Otherwise you'll get blocked.
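As a minimal sketch of the advice above, the connection setup can be wrapped in a small helper that also reports the status code when the server still refuses the request. The class name `BrowserLikeFetch`, its method names, and the specific User-Agent string are assumptions made for illustration; only the jsoup calls themselves come from the library.

```java
import java.io.IOException;

import org.jsoup.Connection;
import org.jsoup.HttpStatusException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class BrowserLikeFetch {
    // Assumption: an example desktop browser User-Agent; substitute the string your own browser sends.
    static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";

    // Configures the request; nothing is sent over the network yet.
    static Connection browserLike(String url) {
        return Jsoup.connect(url)
                .userAgent(USER_AGENT)
                .timeout(30000); // milliseconds, the same value as in the question
    }

    // Sends a GET request and parses the response body into a Document.
    static Document fetch(String url) throws IOException {
        try {
            return browserLike(url).get();
        } catch (HttpStatusException e) {
            // jsoup throws this for non-2xx responses such as 503.
            System.err.println("HTTP " + e.getStatusCode() + " for " + e.getUrl());
            throw e;
        }
    }
}
```

Calling `BrowserLikeFetch.fetch(url)` in place of the original `Jsoup.connect(url).timeout(30000).get()` keeps the behaviour the same apart from the added header and the logged status code.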

Please see also: Jsoup: select(div[class=rslt prod]) returns null when it shouldn't

You can try:

val ret = Jsoup.connect(url)
  .userAgent("Mozilla/5.0 Chrome/26.0.1410.64 Safari/537.31")
  .timeout(2 * 1000)               // 2 seconds
  .followRedirects(true)
  .maxBodySize(1024 * 1024 * 3)    // 3 MB max
  //.ignoreContentType(true)       // enable when downloading XML, JSON, etc.
  .get()

It may work; perhaps amazon.com needs followRedirects set to true.
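The snippet above uses `val`, so it reads as Scala or Kotlin. A rough Java equivalent is sketched below, with one addition that is not part of the answer: `ignoreHttpErrors(true)` together with `execute()`, so the status code can be inspected instead of an exception being thrown. The class and method names (`StatusCheckFetch`, `configure`, `fetchOrNull`) are hypothetical.

```java
import java.io.IOException;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class StatusCheckFetch {
    // Same options as the answer's snippet, plus ignoreHttpErrors (an addition for diagnosis).
    static Connection configure(String url) {
        return Jsoup.connect(url)
                .userAgent("Mozilla/5.0 Chrome/26.0.1410.64 Safari/537.31")
                .timeout(2 * 1000)              // 2 seconds
                .followRedirects(true)          // jsoup already follows redirects by default; set explicitly here
                .maxBodySize(1024 * 1024 * 3)   // 3 MB max
                .ignoreHttpErrors(true);        // do not throw on 4xx/5xx responses
    }

    // Returns the parsed page, or null if the server answered with a non-200 status.
    static Document fetchOrNull(String url) throws IOException {
        Connection.Response response = configure(url).execute();
        if (response.statusCode() != 200) {
            System.err.println("Got HTTP " + response.statusCode()
                    + " " + response.statusMessage() + " for " + url);
            return null;
        }
        return response.parse();
    }
}
```

Logging the status this way makes it easier to tell whether a change such as a different User-Agent actually moved the response from 503 to 200.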
