
requests.exceptions.ConnectionError Python

I have a problem: I need to find broken picture URLs. This is my script:

import requests
import csv
import time
from termcolor import colored  # assumed source of colored(), which the script calls below
with open(nazwa_pliku) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=';')
    count=0
    mapa = []
    id = 1
    next(csv_reader)
    next(csv_reader)

    for row in csv_reader:
        if row[1] != "":
            ID=row[0]
            NUMBER=row[1]
            PICTURES=row[2].split('|')


            for url in PICTURES:
                url="https://sw67383.mywebshop.io/upload_dir/shop/"+url

                result = requests.get(url, stream=True)
                if result.status_code != 200:

                    print(colored("Brak: ", "red"), url)  # "Brak" = "missing"
                    object = {
                        "PRODUCT_ID": ID,
                        "NUMBER":NUMBER,
                        "PHOTO":url,

                    }
                    count += 1
                    mapa.append(object)
                else:
                    print(colored(str(id)+" Poprawny: ", "green"), url)  # "Poprawny" = "correct"
                id+=1

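    # "Liczba brakujących zdjęć" = "number of missing pictures"
    # format() has to be applied to the string, not to the return value of print()
    print(colored("Liczba Brakujących zdjęć: ", "yellow") + "{}/{}".format(count, id))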
    return mapa  
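
To make the expected input concrete, here is a minimal sketch of the row handling from the script above, run against an in-memory sample instead of a real file. The sample rows, the content of the two header lines and the helper name `iter_picture_urls` are assumptions for illustration; only the `;` delimiter, the `|` separator and the base URL come from the script.

```python
import csv
import io

BASE_URL = "https://sw67383.mywebshop.io/upload_dir/shop/"

# Hypothetical sample data: two header lines, then ID;NUMBER;PICTURES rows.
SAMPLE = (
    "export;v1;\n"
    "ID;NUMBER;PICTURES\n"
    "1;A-100;maxtone/a_1.jpg|maxtone/a_2.jpg\n"
    "2;;maxtone/skipped.jpg\n"
    "3;B-200;maxtone/b_1.jpg\n"
)


def iter_picture_urls(csv_file):
    """Yield (ID, NUMBER, full URL) for every picture of every usable row."""
    reader = csv.reader(csv_file, delimiter=";")
    next(reader)  # skip the first header line
    next(reader)  # skip the second header line
    for row in reader:
        if row[1] != "":  # rows without a NUMBER are ignored, as in the script
            for name in row[2].split("|"):
                yield row[0], row[1], BASE_URL + name


rows = list(iter_picture_urls(io.StringIO(SAMPLE)))
for product_id, number, url in rows:
    print(product_id, number, url)
```

Keeping the parsing separate from the HTTP requests makes it possible to check the CSV handling on its own before any URL is requested.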

 

I read the picture paths from a CSV file and request each URL, but sometimes I get a connection error and I don't know why. It could be my internet connection or the server.

This is the error I get:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='sw67383.mywebshop.io', port=443): Max retries exceeded with url: /upload_dir/shop/maxtone/MAXTON_4306_4.jpg (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object.....

What can I do to avoid this problem?

I need to check 3000 picture URLs, and many more in the future.

EDIT: I changed it like this:

    for row in csv_reader:
        if row[1] != "":
            ID=row[0]
            NUMBER=row[1]
            PICTURES=row[2].split('|')

            for url in PICTURES:
                url="https://sw67383.mywebshop.io/upload_dir/shop/"+url

                try:
                    result = requests.get(url, stream=True)
                    if result.status_code != 200:
                        print(colored("Brak: ", "red"), url)  # "Brak" = "missing"
                        object = {
                            "PRODUCT_ID": ID,
                            "NUMBER": NUMBER,
                            "PHOTO": url,
                        }
                        count += 1
                        mapa.append(object)
                    else:
                        print(colored(str(id) + " Poprawny: ", "green"), url)  # "Poprawny" = "correct"
                    id += 1
                except requests.ConnectionError:
                    # "Problem z połączeniem z adresem" = "connection problem with address"
                    print("Problem z połączeniem z adresem: {} ".format(url))

Now I can tell when a connection problem (a "time out") happens, but that does not tell me whether the picture link itself is bad (404). So maybe I should save these URLs to the object too, and then manually verify whether each one is correct or wrong?
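
One way to keep both cases apart is to store a status next to each URL. The following is only a sketch: the helper names `check_url` and `make_record`, the `get` parameter (there so the function can be tried without a network), the 10-second timeout and the status label `CONNECTION_ERROR` are all assumptions, not part of the original script.

```python
import requests


def check_url(url, get=requests.get, timeout=10):
    """Classify one URL as "OK", "BRAK" (missing) or "CONNECTION_ERROR"."""
    try:
        result = get(url, stream=True, timeout=timeout)
    except (requests.exceptions.ConnectionError, requests.exceptions.Timeout):
        # The server could not be reached, so nothing is known about the picture.
        return "CONNECTION_ERROR"
    try:
        return "OK" if result.status_code == 200 else "BRAK"
    finally:
        result.close()  # with stream=True the connection stays open until closed


def make_record(product_id, number, url, status):
    """Build the same kind of dict the script appends to mapa, plus a status."""
    return {
        "PRODUCT_ID": product_id,
        "NUMBER": number,
        "PHOTO": url,
        "COMMUNICATE": status,
    }
```

Entries marked `CONNECTION_ERROR` can then be requested again later or checked by hand, while `BRAK` entries are links the server answered with a non-200 status.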

OK, I found a way to avoid the problem:

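# Also needed at the top of the script (most likely these two imports):
#   from requests.adapters import HTTPAdapter
#   from urllib3.util.retry import Retry

            for url in PICTURES:
                url="https://sw67383.mywebshop.io/upload_dir/shop/"+url
                # Retry failed connection attempts up to 3 times, with a growing delay.
                session = requests.Session()
                retry = Retry(connect=3, backoff_factor=0.5)
                adapter = HTTPAdapter(max_retries=retry)
                session.mount('http://', adapter)
                session.mount('https://', adapter)

                result = session.get(url)
                if result.status_code != 200:
                    print(colored("Brak: ", "red"), url)  # "Brak" = "missing"
                    object = {
                        "PRODUCT_ID": ID,
                        "NUMBER": NUMBER,
                        "PHOTO": url,
                        "COMMUNICATE": "BRAK",
                    }
                    count += 1
                    mapa.append(object)
                else:
                    print(colored(str(id) + " Poprawny: ", "green"), url)  # "Poprawny" = "correct"
                id += 1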
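
Since the retry settings are the same for every picture, the session does not have to be rebuilt inside the loop. A minimal sketch, assuming one shared session is acceptable: `build_session`, `check_pictures`, the 10-second timeout and the `CONNECTION_ERROR` label are illustrative names and values, not part of the original script. Retries make connection errors less likely but cannot rule them out, so the exception is still caught.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

BASE_URL = "https://sw67383.mywebshop.io/upload_dir/shop/"


def build_session(connect_retries=3, backoff_factor=0.5):
    """Create one Session that retries failed connection attempts."""
    session = requests.Session()
    retry = Retry(connect=connect_retries, backoff_factor=backoff_factor)
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def check_pictures(session, pictures, timeout=10):
    """Return (url, status) pairs for every picture that is not a plain 200."""
    problems = []
    for name in pictures:
        url = BASE_URL + name
        try:
            result = session.get(url, timeout=timeout)
        except requests.exceptions.ConnectionError:
            # All retries were used up and the server still could not be reached.
            problems.append((url, "CONNECTION_ERROR"))
            continue
        if result.status_code != 200:
            problems.append((url, "BRAK"))
    return problems
```

A call would look like `check_pictures(build_session(), PICTURES)`, with the session created once before the loop over the CSV rows. Reusing the session also lets requests keep the connection to the server open between pictures.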


 