I am trying to scrape a site (www.oddsportal.com) with JSoup, but I have run into an issue.
String url = "http://www.oddsportal.com/matches/";
Document doc = null;
System.out.println("Connecting to " + url + "...");
try {
    doc = Jsoup.connect(url).get();
} catch (IOException e1) {
    e1.printStackTrace();
}
When I connect and do a "get", I get the following:
Connecting to http://www.oddsportal.com/matches/...
org.jsoup.HttpStatusException: HTTP error fetching URL. Status=456, URL=http://www.oddsportal.com/matches/
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:435)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:410)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:164)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:153)
What could be the cause? There seems to be no standard HTTP 456 status code, so I assume it is some sort of site-specific code. The site has a login function, but logging in is not required to view the content. Other sites I have tried work just fine.
The request goes through if you include a user agent. The documentation helps here:
Document doc = Jsoup.connect("http://example.com").userAgent("Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0").get();
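Applied to the URL from the question, the same idea might look like the sketch below. The class name OddsPortalFetch and the helper buildConnection are made up for illustration, and the user agent string is simply the one from the line above. Whether the site still accepts this user agent is an assumption, so the sketch also catches HttpStatusException separately and prints the status code it carries.

```java
import java.io.IOException;

import org.jsoup.Connection;
import org.jsoup.HttpStatusException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class OddsPortalFetch {
    // Browser-like user agent string, copied from the example above.
    static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0";

    // Hypothetical helper: configures the request without sending it.
    static Connection buildConnection(String url) {
        return Jsoup.connect(url).userAgent(USER_AGENT);
    }

    public static void main(String[] args) {
        String url = "http://www.oddsportal.com/matches/";
        System.out.println("Connecting to " + url + "...");
        try {
            Document doc = buildConnection(url).get();
            System.out.println("Fetched page with title: " + doc.title());
        } catch (HttpStatusException e) {
            // Raised for error responses such as the 456 in the question.
            System.err.println("HTTP status " + e.getStatusCode() + " for " + e.getUrl());
        } catch (IOException e) {
            // Any other I/O problem (DNS failure, timeout, and so on).
            e.printStackTrace();
        }
    }
}
```

HttpStatusException is a subclass of IOException, so it has to be caught first. If the site rejects the request again, the printed status code shows whether it is still the same 456.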