I am trying to download a zip file that is stored here:
http://e4ftl01.cr.usgs.gov/MEASURES/SRTMGL1.003/2000.02.11/N45W074.SRTMGL1.hgt.zip
If you paste this into the browser and hit enter, it will download the.zip folder.
If you inspect the browser while this is happening, you will see that there is an internal redirect going on:
And eventually the zip gets downloaded.
I am trying to automate this downloading using the python requests library by doing the following:
import requests
requests.get(url,
allow_redirects=True,
headers={'User-Agent':'Chrome/107.0.0.0'})
I've tried tons of combinations, using the full header string from the HTML inspection, forcing verify=True, with and without redirects, adding a HTTPBasicAuth
user/pass that says is required although the file seems to download fine without any credentials.
Honestly no clue what I'm missing, this is not my expertise. I keep getting this error:
>>> requests.get(url,
... allow_redirects=True,
... headers={'User-Agent':'Chrome/107.0.0.0'})
Traceback (most recent call last):
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\util\connection.py", line 95, in create_connection
raise err
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\util\connection.py", line 85, in create_connection
sock.connect(sa)
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine
actively refused it
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connectionpool.py", line 398, in _make_request
conn.request(method, url, **httplib_request_kw)
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connection.py", line 239, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "C:\Users\dere\Miniconda3\envs\blender\lib\http\client.py", line 1282, in request
self._send_request(method, url, body, headers, encode_chunked)
File "C:\Users\dere\Miniconda3\envs\blender\lib\http\client.py", line 1328, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "C:\Users\dere\Miniconda3\envs\blender\lib\http\client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "C:\Users\dere\Miniconda3\envs\blender\lib\http\client.py", line 1037, in _send_output
self.send(msg)
File "C:\Users\dere\Miniconda3\envs\blender\lib\http\client.py", line 975, in send
self.connect()
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connection.py", line 205, in connect
conn = self._new_conn()
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x0000016B76A6FE50>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\adapters.py", line 489, in send
resp = conn.urlopen(
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\util\retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='e4ftl01.cr.usgs.gov', port=80): Max retries exceeded with url: /MEASURES/SRTMGL1.003/2000.02.11/N45W074.SRTMGL1.hgt.zip (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000016B76A6FE50>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\adapters.py", line 565, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='e4ftl01.cr.usgs.gov', port=80): Max retries exceeded with url: /MEASURES/SRTMGL1.003/2000.02.11/N45W074.SRTMGL1.hgt.zip (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000016B76A6FE50>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
Can someone help me arrive at the code that will result in a successful request? I know how to write the request into a zip afterwards..
Change http to Https: This should work
import requests
# download zip file from url
url = "https://e4ftl01.cr.usgs.gov/MEASURES/SRTMGL1.003/2000.02.11/N45W074.SRTMGL1.hgt.zip"
r = requests.get(url)
with open("N45W074.SRTMGL1.hgt.zip", "wb") as f:
f.write(r.content)
Thanks to @cnemri for the tip.
It was a combination of changing http to https and also including this specific cookie header in the request header:
headers = {'Cookie': '_gid=GA1.2.48775707.1669266346; _hjSessionUser_606685=eyJpZCI6IjQ1MzEzM2QzLTI3MWEtNWM0YS04M2YzLWRmMmMzNDk4NjY1ZSIsImNyZWF0ZWQiOjE2NjkyNjYzNDU2MjUsImV4aXN0aW5nIjp0cnVlfQ==; ERS_production_2=b7dfade669180a6d6d250b030ffc3cf2UZZa8%2F471MfcV%2FaNuFtu6Pli%2BpP8jKOIwR4JvQjm%2B6DLPjl679vVf5SCDk7C5TLQsj7qckIev6lmtGb6Mes5RKDHUs%2BBp3EAjKW2%2BMCUpV%2Fnx0z1pdaCQQ%3D%3D; EROS_SSO_production_secure=eyJjcmVhdGVkIjoxNjY5MjY5MjEzLCJ1cGRhdGVkIjoiMjAyMi0xMS0yMyAyMzo1MzoyOCIsImF1dGhUeXBlIjoiRVJTIiwiYXV0aFNlcnZpY2UiOiJFUk9TIiwidmVyc2lvbiI6MS4xLCJzdGF0ZSI6ImVjMDY3YmMzYmRhMzBiZTUyNTkxYTNiZTYwMTMwZWNmNjAwMWU1Y2JlMGMxZmNkYTU4Y2Y4OTY0YjRlNTJkOTEiLCJpZCI6InY5U1Q4YVl3M1hRKCsyIiwic2VjcmV0IjoiPl8lMCZPYiY9MUVkclRLZTt4fHVZLVcyU1VvTiA1SE17KiQqaG12OSxvPl5%2BdyJ9; _ga_0YWDZEJ295=GS1.1.1669269458.1.0.1669269460.0.0.0; _ga=GA1.2.1055277358.1669266346; _ga_71JPYV1CCS=GS1.1.1669306300.2.1.1669306402.0.0.0; DATA=Y3_gg9uD6LmunsfDHneR9wAAARQ'}
then, this worked:
requests.get(url, headers=headers)
I'm not sure if this is the "solution" or just a workaround I've discovered. Still appreciate any input. I think it may just be credentials that are stored?
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.