
Python 3: Trying to iterate through a CSV and get HTTP response codes

I am attempting to read a CSV file that contains a long list of URLs. I need to iterate through the list and find the URLs that return a 301, 302, or 404 response. When I test the script it exits with code 0, so I know it runs without errors, but it does not do what I need. I am new to Python and to working with files; my experience has been primarily UI automation. Any suggestions would be appreciated. Below is the code.

import csv
import requests
import responses
from urllib.request import urlopen
from bs4 import BeautifulSoup

f = open('redirect.csv', 'r')
contents = []
with open('redirect.csv', 'r') as csvf:  # Open file in read mode
    urls = csv.reader(csvf)
    for url in urls:
        contents.append(url)  # Add each url to list contents
    


def run():
    resp = urllib.request.urlopen(url)
    print(self.url, resp.getcode())
    run()


print(run)

Assuming you have a CSV similar to the following (the header is URL):

URL
https://duckduckgo.com
https://bing.com

You can do something like this using the requests library.

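import csv
import requests

errors = []
with open('urls.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    # Iterate through each row of the CSV file
    for row in reader:
        url = row['URL']  # the key must match the CSV header exactly
        try:
            # Do not follow redirects, otherwise a 301/302 is never seen
            r = requests.get(url, allow_redirects=False, timeout=10)
        except requests.RequestException:
            # Skip URLs that fail to connect or time out
            continue
        if r.status_code in (301, 302, 404):
            # print(f"{r.status_code}: {url}")
            errors.append([url, r.status_code])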

Uncomment the print statement if you want to see the results in the terminal. As written, the code appends each URL and its status code to the errors list, which you can print or process further.
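If you want to reuse or test this logic, one option is to wrap it in a function. The following is only a sketch under a few assumptions: the names find_flagged, fetch_status and FLAGGED are made up for illustration, the 10-second timeout is an arbitrary choice, and the CSV header is assumed to be URL as in the sample above. Redirects are deliberately not followed, because requests.get follows them by default and would then report the status of the final page rather than the 301 or 302.

```python
import csv

import requests

# Status codes we want to report (taken from the question).
FLAGGED = {301, 302, 404}


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url without following redirects."""
    response = requests.get(url, allow_redirects=False, timeout=timeout)
    return response.status_code


def find_flagged(csv_file, fetch=fetch_status, column="URL"):
    """Return (url, status) pairs for rows whose status is in FLAGGED.

    csv_file is any open text file with a header row; fetch is the
    callable used to look up a status code, so it can be swapped out
    in tests to avoid real network traffic.
    """
    flagged = []
    for row in csv.DictReader(csv_file):
        url = row[column].strip()
        try:
            status = fetch(url)
        except requests.RequestException as exc:
            # Connection errors and timeouts are reported, not hidden.
            print(f"request failed for {url}: {exc}")
            continue
        if status in FLAGGED:
            flagged.append((url, status))
    return flagged


if __name__ == "__main__":
    with open("urls.csv", newline="") as csvfile:
        for url, status in find_flagged(csvfile):
            print(f"{status}: {url}")
```

Passing the lookup function as a parameter is a design choice, not a requirement: it lets you check the filtering logic with a fake lookup before pointing the script at a long list of real URLs.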


 