
How to solve "HTTP error fetching URL. Status=503" with Jsoup (I tried all the solutions)

For my Big Data project, I have to develop a Jsoup script that fetches the 2018 meteorological data for Paris and stores it.

public static final String USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36";

    for (int numberDay = 1; numberDay < 32; numberDay++) {
        // In the URL, day 1 is written "1er" (French for "1st"); every other day is the plain number
        String day = (numberDay == 1) ? numberDay + "er" : String.valueOf(numberDay);
        String url = "https://www.infoclimat.fr/observations-meteo/archives/" + day + "/" + listMois.get(1) + "/2018/paris-montsouris/07156.html";
        System.out.println(url);
        //Document doc = Jsoup.connect(url).userAgent("Mozilla").get();
        Document doc = Jsoup.connect(url).userAgent(USER_AGENT).get();

        // call the DataCollect method
        dataCollect.GetData(doc);
    }
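The URL construction in the loop can be isolated into a small helper so that the "1er" special case is easy to check without hitting the site. This is only a sketch: the class name `ArchiveUrls`, its method names, and the month value "janvier" used in the example are assumptions for illustration, not part of the original script.

```java
// Hypothetical helper that mirrors the URL pattern used in the loop.
final class ArchiveUrls {

    private static final String BASE =
            "https://www.infoclimat.fr/observations-meteo/archives/";
    private static final String SUFFIX = "/2018/paris-montsouris/07156.html";

    private ArchiveUrls() {}

    // Day 1 is written "1er" in the URL; every other day is the plain number.
    static String dayToken(int day) {
        return day == 1 ? "1er" : Integer.toString(day);
    }

    // month is the French month name as it appears in the URL
    // (for example "janvier", which is an assumed value here).
    static String buildUrl(int day, String month) {
        return BASE + dayToken(day) + "/" + month + SUFFIX;
    }
}
```

With this in place, the loop body would only need `ArchiveUrls.buildUrl(numberDay, listMois.get(1))`.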

I tried all the suggested User-Agent solutions, but I still get the same error (the URL works in the browser):

 Exception in thread "main" org.jsoup.HttpStatusException: HTTP error fetching URL. Status=503

The error appears on the 8th day, so the site seems to detect that it is a robot after 8 requests.

I was able to solve the problem by making the thread sleep to increase the time between requests:

Thread.sleep(5000);
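If a fixed pause is not always enough, the sleep can be combined with a retry whose delay grows after each failure. The following is a minimal sketch under stated assumptions, not a definitive fix: `PoliteFetcher` and `fetchWithRetry` are hypothetical names, the doubling delay is one possible policy among many, and the helper takes a plain `Callable` so it does not depend on Jsoup itself.

```java
import java.util.concurrent.Callable;

// Hypothetical helper (not part of Jsoup): retries a request,
// sleeping between failed attempts.
final class PoliteFetcher {

    private PoliteFetcher() {}

    // Runs the request up to maxAttempts times. The pause before the next
    // attempt starts at initialDelayMs and doubles after every failure.
    // If all attempts fail, the last exception is rethrown.
    static <T> T fetchWithRetry(Callable<T> request, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (InterruptedException e) {
                // Do not retry when the thread was asked to stop.
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
        }
        throw last;
    }

    // Possible use inside the loop (requires Jsoup on the classpath):
    //   Document doc = PoliteFetcher.fetchWithRetry(
    //           () -> Jsoup.connect(url).userAgent(USER_AGENT).get(), 3, 5000);
}
```

The `Thread.sleep(5000)` between successful requests is still needed; the retry only covers the case where the server answers 503 anyway.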
