Get status_code with max_retries setting for requests.head
As seen here, max_retries can be set for requests.Session(), but I only need the head.status_code to check if a url is valid and active.
Is there a way to just get the head within a mounted session?
import requests

def valid_active_url(url):
    try:
        site_ping = requests.head(url, allow_redirects=True)
    except requests.exceptions.ConnectionError:
        print('Error trying to connect to {}.'.format(url))
    try:
        if (site_ping.status_code < 400):
            return True
        else:
            return False
    except Exception:
        return False
    return False
Based on the docs, I am thinking one option is to find out whether the session.mount method results return a status code (which I haven't found yet).

In terms of that first approach, I have tried:
s = requests.Session()
a = requests.adapters.HTTPAdapter(max_retries=3)
s.mount('http://redirected-domain.com', a)
resp = s.get('http://www.redirected-domain.org')
resp.status_code
Are we only using s.mount() to get in and set max_retries? That seems redundant, aside from the fact that the http connection would be pre-established.
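On what s.mount() does: as far as I understand the Requests API, it registers an adapter for a URL prefix, and the session picks the adapter with the longest matching prefix for each request. The sketch below sends no traffic; it only asks the session which adapter it would use. The domain names are the placeholders from the question, and retries_for is a made-up helper name.

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
retrying_adapter = HTTPAdapter(max_retries=3)
session.mount('http://redirected-domain.com', retrying_adapter)

def retries_for(url):
    """Return the retry total of the adapter the session would pick for url."""
    return session.get_adapter(url).max_retries.total

# Starts with the mounted prefix, so the retrying adapter is selected.
same_prefix = retries_for('http://redirected-domain.com/some/page')
# Different host, so the session falls back to its default 'http://' adapter.
other_host = retries_for('http://www.redirected-domain.org')
print(same_prefix, other_host)
```

If that reading is right, the snippet in the question mounts the adapter on the .com prefix but requests the www .org host, so the max_retries=3 setting would not apply to that request.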
Also, resp.status_code returns 200 where I am expecting a 301 (which is what requests.head returns).
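The 200 versus 301 difference is probably down to redirect handling rather than the adapter: in Requests, get follows redirects by default, while head does not unless allow_redirects=True is passed. A small sketch to make that visible; compare_head_and_get is a hypothetical helper name, not part of the library.

```python
import requests

def compare_head_and_get(url):
    """Show the status codes head and get report for the same url."""
    head_resp = requests.head(url)  # allow_redirects defaults to False for head
    get_resp = requests.get(url)    # allow_redirects defaults to True for get
    return {
        'head_status': head_resp.status_code,
        'get_status': get_resp.status_code,
        # resp.history lists the intermediate redirect responses, oldest first
        'get_redirect_chain': [r.status_code for r in get_resp.history],
    }
```

For a URL that redirects, I would expect head_status to be the redirect code itself and get_status to be the code of the final page, with the redirect showing up in get_redirect_chain.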
NOTE: resp.ok might be all I need for my purposes here.
After a mere two hours of developing the question, the answer took five minutes:
def valid_url(url):
    if (url.lower() == 'none') or (url == ''):
        return False
    try:
        s = requests.Session()
        a = requests.adapters.HTTPAdapter(max_retries=5)
        s.mount(url, a)
        resp = s.head(url)
        return resp.ok
    except requests.exceptions.MissingSchema:
        # If it's missing the schema, run again with schema added
        return valid_url('http://' + url)
    except requests.exceptions.ConnectionError:
        print('Error trying to connect to {}.'.format(url))
        return False
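A possible variation, sketched under my own assumptions: mount the adapter once on the 'http://' and 'https://' prefixes and reuse the session, so every URL checked gets the same retry setting without a new session per call. The names make_retrying_session and url_responds_ok, the timeout value, and the choice to default to plain http when no scheme is given are all mine, not from the docs.

```python
import requests
from requests.adapters import HTTPAdapter

def make_retrying_session(max_retries=5):
    """Build a Session whose http:// and https:// adapters retry connections."""
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=max_retries)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session

def url_responds_ok(url, session, timeout=10):
    """Return True if a HEAD request to url gets a status code below 400."""
    if not url or url.lower() == 'none':
        return False
    if '://' not in url:
        url = 'http://' + url  # assumption: fall back to plain http
    try:
        return session.head(url, timeout=timeout).ok
    except requests.exceptions.RequestException:
        # Base class for connection errors, timeouts, invalid URLs, and so on
        return False
```

Usage would look like creating the session once with make_retrying_session() and then calling url_responds_ok(url, session) for each URL in a list.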
Based on this answer, it looks like the head request will be slightly less resource intensive than the get, particularly if the url contains a large amount of data.
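To illustrate why head is lighter: a HEAD response carries the status line and headers but no body, so nothing large is downloaded. A rough sketch; peek_headers is a made-up helper name, the timeout is an arbitrary choice, and whether Content-Length is present depends on the server.

```python
import requests

def peek_headers(url, timeout=10):
    """Issue a HEAD request and report what came back without a body."""
    resp = requests.head(url, allow_redirects=True, timeout=timeout)
    return {
        'status': resp.status_code,
        # Size the server says a GET would return, if it reports one
        'content_length': resp.headers.get('Content-Length'),
        # Bytes actually received as a body for the HEAD request
        'body_bytes': len(resp.content),
    }
```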
The requests.adapters.HTTPAdapter is the built-in adapter for the urllib3 library that underlies the Requests library.
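Because HTTPAdapter sits on top of urllib3, max_retries also accepts a urllib3 Retry object instead of an integer. Per the Requests docs, the plain integer form only covers failed DNS lookups, socket connections and connection timeouts, and passing a Retry object is the suggested route to finer control. A sketch, with the backoff and status values chosen arbitrarily for illustration:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

retry_policy = Retry(
    total=3,                           # overall cap on retry attempts
    backoff_factor=0.5,                # delay between attempts grows with each retry
    status_forcelist=[502, 503, 504],  # also retry on these response codes
)

session = requests.Session()
adapter = HTTPAdapter(max_retries=retry_policy)
session.mount('http://', adapter)
session.mount('https://', adapter)
```

After this, session.head(url) would go through the adapter carrying that policy for any http or https URL.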
On another note, I'm not sure what the correct term or phrase is for what I'm checking here. A url could still be valid if it returns an error code.
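One way to separate the two notions, offered as a sketch rather than settled terminology: whether a url is well-formed can be checked from the string alone, while whether it is reachable or active needs a request like the one in valid_url. The helper name below is hypothetical and uses only the standard library.

```python
from urllib.parse import urlparse

def is_wellformed_http_url(url):
    """True if url has an http(s) scheme and a host, regardless of status code."""
    parts = urlparse(url)
    return parts.scheme in ('http', 'https') and bool(parts.netloc)
```

Under this split, a url that answers with a 404 would still count as well-formed, just not as pointing at an available page.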