I have the following problem: I need to download about 100 .csv files from a web site. Each download takes 3-4 clicks, and I have to repeat the whole download every 3-4 months or so. So I wrote a Python 3 script to do this for me, using urllib.request.urlopen().
The manual download works fine, but when I use the script, the returned file is not a .csv. It is an HTML file containing the message "The requested URL was rejected. Please consult with your administrator." I tried wget and curl in bash and got the same result.
Add a User-Agent header to your request. The server is probably rejecting requests that do not look like they come from a browser. Your code will look like this:
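import requests

url = 'http://example.com/filename.doc'
filename = url.split('/')[-1]
# Send a browser-like User-Agent so the server does not reject the request
headers = {'User-Agent': 'Mozilla/5.0'}
r = requests.get(url, allow_redirects=True, headers=headers)
r.raise_for_status()  # stop on an HTTP error instead of saving an error page
# The target directory must already exist
with open(f'directory/{filename}', 'wb') as f:
    f.write(r.content)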
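Since the question uses urllib.request.urlopen(), the same idea can be sketched with the standard library alone. This is only a minimal sketch: the helper names (build_request, looks_like_rejection, download_csv) are made up for illustration, the 'Mozilla/5.0' value is an assumption that may need to be a fuller browser string for your server, and the HTML check is a simple heuristic based on the rejection message quoted in the question.

```python
import urllib.request

# Assumed browser-like value; some servers may want a fuller string.
USER_AGENT = "Mozilla/5.0"


def build_request(url: str) -> urllib.request.Request:
    """Return a Request that carries a User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def looks_like_rejection(body: bytes) -> bool:
    """Heuristic: True if the body starts like HTML or holds the rejection text."""
    head = body[:1024].lstrip().lower()
    return (
        head.startswith((b"<!doctype html", b"<html"))
        or b"the requested url was rejected" in head
    )


def download_csv(url: str, dest_path: str) -> None:
    """Fetch url and write it to dest_path, refusing to save a rejection page."""
    with urllib.request.urlopen(build_request(url), timeout=30) as response:
        body = response.read()
    if looks_like_rejection(body):
        raise RuntimeError(f"Got an HTML page instead of a CSV for {url}")
    with open(dest_path, "wb") as out:
        out.write(body)
```

With about 100 files, you could call download_csv() in a loop over your list of URLs. The check matters there because a rejected request would otherwise be saved silently as a file with a .csv name.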