
Getting an error using JSoup. Why?

I'm trying to log in to a fantasy football website and extract data from it.

I get the following error,

    Jul 24, 2015 8:01:12 PM StatsCollector main
    SEVERE: null
    org.jsoup.HttpStatusException: HTTP error fetching URL. Status=403, URL=http://fantasy.premierleague.com/
        at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:537)
        at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:493)
        at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:205)
        at StatsCollector.main(StatsCollector.java:26)

whenever I try this code. Where am I going wrong?

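    import java.io.IOException;
    import java.util.Map;
    import java.util.logging.Level;
    import java.util.logging.Logger;

    import org.jsoup.Connection;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    public class StatsCollector {

        public static void main(String[] args) {
            try {
                String url = "http://fantasy.premierleague.com/";

                // Initial GET of the landing page (the response is not used afterwards).
                Connection.Response response = Jsoup.connect(url)
                        .method(Connection.Method.GET)
                        .execute();

                // POST the credentials to the same URL.
                Connection.Response res = Jsoup
                        .connect(url)
                        .data("ismEmail", "example@googlemail.com", "id_password", "examplepassword")
                        .method(Connection.Method.POST)
                        .execute();

                // Reuse the cookies from the login response for the next request.
                Map<String, String> loginCookies = res.cookies();

                Document doc = Jsoup.connect("http://fantasy.premierleague.com/transfers")
                        .cookies(loginCookies)
                        .get();

                String title = doc.title();
                System.out.println(title);
            } catch (IOException ex) {
                Logger.getLogger(StatsCollector.class.getName()).log(Level.SEVERE, null, ex);
            }
        }
    }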

    Response res = Jsoup
            .connect(url)
            .data("ismEmail", "example@googlemail.com", "id_password", "examplepassword")
            .method(Method.POST)
            .execute();

Are you actually running this exact code? It looks like example code with placeholders instead of real login credentials, which would explain the HTTP 403 error you received.

Edit 1

My bad. I took a look at the login form on that site, and it seems you confused the id of the input elements ("ismEmail" and "id_password") with the name that gets sent with the form ("email" and "password"). Does this work for you?

    Response res = Jsoup
            .connect(url)
            .data("email", "example@googlemail.com", "password", "examplepassword")
            .method(Method.POST)
            .execute();
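
To illustrate why the name matters and the id does not: a form POST normally sends its fields as a URL-encoded body keyed by each input's name attribute. The following is only a rough standard-library sketch of that encoding, not JSoup's actual implementation; the class name FormBodySketch and its encode method are made up for this example.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, for illustration only (not part of JSoup).
class FormBodySketch {

    // Builds an application/x-www-form-urlencoded body from name/value pairs.
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "="
                    + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // The keys are the inputs' name attributes, not their ids.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("email", "example@googlemail.com");
        fields.put("password", "examplepassword");
        System.out.println(encode(fields));
    }
}
```

The server only ever sees the keys in that body, so a field posted under its id ("ismEmail") simply looks like an unknown parameter.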

Edit 2

Okay, this was stuck in my head, because signing in to a website with JSoup should not be that hard. I created an account there and tried it myself. Code first:

    String url = "https://users.premierleague.com/PremierUser/j_spring_security_check";

    Response res = Jsoup
            .connect(url)
            .followRedirects(false)
            .timeout(2_000)
            .data("j_username", "<USER>")
            .data("j_password", "<PASSWORD>")
            .method(Method.POST)
            .execute();

    Map<String, String> loginCookies = res.cookies();

    Document doc = Jsoup.connect("http://fantasy.premierleague.com/squad-selection/")
            .cookies(loginCookies)
            .get();

So what is happening here? First, I realized that the target of the login form was wrong. The page seems to be built on Spring, so the form target and field names use the Spring Security defaults: j_spring_security_check , j_username and j_password . I then kept running into a read timeout until I set followRedirects(false) . I can only guess why this helped; maybe it is a protection against crawlers?

Finally, I connect to the squad selection page, and the parsed response contains my personal view and data. This code works for me; would you give it a try?
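
For anyone wondering what carrying the login cookies over to the next request amounts to: the login response sets cookies through Set-Cookie headers, and the follow-up request sends them back in a Cookie header. Below is a simplified standard-library sketch of that round trip. It is not JSoup's own code; the class CookieSketch, its methods and the cookie names are all invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, for illustration only (not part of JSoup).
class CookieSketch {

    // Extracts the name=value pair from each Set-Cookie header value,
    // ignoring attributes such as Path, Secure or HttpOnly.
    static Map<String, String> parseSetCookie(List<String> headerValues) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String header : headerValues) {
            String pair = header.split(";", 2)[0].trim();
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip malformed entries without a name
            }
            cookies.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return cookies;
    }

    // Joins the cookies into the value of a Cookie request header.
    static String toCookieHeader(Map<String, String> cookies) {
        StringJoiner header = new StringJoiner("; ");
        cookies.forEach((name, value) -> header.add(name + "=" + value));
        return header.toString();
    }

    public static void main(String[] args) {
        // Made-up header values standing in for a real login response.
        Map<String, String> cookies = parseSetCookie(List.of(
                "JSESSIONID=abc123; Path=/; HttpOnly",
                "remember=yes; Secure"));
        System.out.println(toCookieHeader(cookies));
    }
}
```

If the map returned by res.cookies() is empty after the login POST, the session was most likely never established, and that is worth checking before requesting any protected page.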
