Getting status_code with the requests max_retries setting

Question

As seen here, max_retries can be set for a requests.Session(), but I only need the status_code of a HEAD response to check whether a URL is valid and active.

Is there a way to send just a HEAD request within a mounted session?

import requests

def valid_active_url(url):
    try:
        site_ping = requests.head(url, allow_redirects=True)
    except requests.exceptions.ConnectionError:
        print('Error trying to connect to {}.'.format(url))
        # No response object exists at this point, so stop here
        return False
    # Any status below 400 counts as valid and active
    return site_ping.status_code < 400

Based on the docs, I am thinking I need to either:

see if the session.mount method returns a status code (which I haven't found yet)
roll my own retry method, perhaps with a decorator like this or this, or a (less elegant) loop like this.
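
The second option could look roughly like the following. This is a generic sketch, not code taken from any of the linked answers; the names retry, times, delay, and flaky are hypothetical, and flaky only simulates a failing call so the example runs offline:

```python
import functools
import time


def retry(times=3, delay=0.0, exceptions=(Exception,)):
    """Hypothetical decorator: call the function up to `times` times."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == times:
                        raise  # out of attempts, let the error propagate
                    time.sleep(delay)
        return wrapper
    return decorator


calls = []


@retry(times=3, exceptions=(ConnectionError,))
def flaky():
    # Simulated request: fails twice, then succeeds
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError('simulated failure')
    return 'ok'


result = flaky()
print(result, len(calls))
```

In real use the decorated function would wrap the requests.head call, with exceptions set to requests.exceptions.ConnectionError.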

For the first approach, I have tried:

s = requests.Session()
a = requests.adapters.HTTPAdapter(max_retries=3)
s.mount('http://redirected-domain.com', a)
resp = s.get('http://www.redirected-domain.org')
resp.status_code

Are we only using s.mount() to get in and set max_retries? It seems redundant, aside from the HTTP connection being pre-established.

Also, resp.status_code returns 200 where I am expecting a 301 (which is what requests.head returns).

NOTE: resp.ok might be all I need for my purposes here.

Answer 1

After a mere two hours of developing the question, the answer took five minutes:

import requests

def valid_url(url):
    # Treat empty or placeholder values as invalid
    if (url.lower() == 'none') or (url == ''):
        return False
    try:
        s = requests.Session()
        a = requests.adapters.HTTPAdapter(max_retries=5)
        # Use the retrying adapter for this URL (mount matches by prefix)
        s.mount(url, a)
        resp = s.head(url)
        # resp.ok is True for any status code below 400
        return resp.ok
    except requests.exceptions.MissingSchema:
        # If it's missing the schema, run again with schema added
        return valid_url('http://' + url)
    except requests.exceptions.ConnectionError:
        print('Error trying to connect to {}.'.format(url))
        return False

Based on this answer, it looks like a HEAD request will be slightly less resource intensive than a GET, particularly if the URL points to a large amount of data.
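
The 200 versus 301 difference mentioned in the question comes from the defaults: GET requests follow redirects unless told otherwise, while HEAD requests do not. Here is a minimal sketch of that behavior. It assumes a throwaway local server so that it runs without touching a real site; RedirectHandler and the /old and /new paths are made up for illustration:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /old redirects permanently to /new."""

    def _respond(self):
        if self.path == '/old':
            self.send_response(301)
            self.send_header('Location', '/new')
        else:
            self.send_response(200)
        self.send_header('Content-Length', '0')
        self.end_headers()

    do_GET = do_HEAD = _respond

    def log_message(self, *args):
        # Keep the demo output quiet
        pass


server = HTTPServer(('127.0.0.1', 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = 'http://127.0.0.1:{}/old'.format(server.server_port)

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for this local demo

head_resp = session.head(url)  # allow_redirects defaults to False for HEAD
get_resp = session.get(url)    # allow_redirects defaults to True for GET
history_codes = [r.status_code for r in get_resp.history]

print(head_resp.status_code)  # status of the first response
print(get_resp.status_code)   # status after following redirects
print(history_codes)          # statuses of the intermediate redirect responses

server.shutdown()
server.server_close()
```

Passing allow_redirects=True to head, as the function in the question does, makes HEAD report the final status as well.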

requests.adapters.HTTPAdapter is the built-in adapter for the urllib3 library that underlies the Requests library.
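
Note that mount registers the adapter for a URL prefix, so it only applies to requests whose URL starts with that prefix. A small sketch of the matching, using Session.get_adapter to inspect which adapter would be chosen (the variable names are illustrative, and no request is sent):

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
retrying = HTTPAdapter(max_retries=5)

# Mount on the scheme prefixes so every http(s) URL uses the retrying adapter
session.mount('http://', retrying)
session.mount('https://', retrying)

# get_adapter() returns the adapter whose mounted prefix matches the URL
uses_retrying = session.get_adapter('http://www.redirected-domain.org') is retrying
print(uses_retrying)
print(retrying.max_retries.total)

# A fresh session with one host-specific mount: other hosts keep the default adapter
narrow = requests.Session()
narrow.mount('http://redirected-domain.com', retrying)
other_host = narrow.get_adapter('http://www.redirected-domain.org') is retrying
print(other_host)
```

This suggests the snippet in the question, which mounts http://redirected-domain.com but requests http://www.redirected-domain.org, would not have used the retrying adapter at all.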

On another note, I'm not sure what the correct term is for what I'm checking here. A URL could still be valid even if it returns an error code.
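
One way to keep the two notions apart is to report whether any response came back separately from the status code. The following is only a sketch under that assumption; check_url and its (reachable, status_code) return shape are hypothetical, not part of Requests:

```python
import requests


def check_url(url, session=None, timeout=5):
    """Hypothetical helper: return (reachable, status_code)."""
    session = session or requests.Session()
    try:
        resp = session.head(url, allow_redirects=True, timeout=timeout)
    except requests.exceptions.RequestException:
        # No HTTP response at all: bad URL, DNS failure, refused connection, timeout
        return (False, None)
    # The server answered, even if the status is an error such as 404
    return (True, resp.status_code)


# A URL without a scheme is rejected before any network traffic happens
no_scheme = check_url('www.example.com')
print(no_scheme)
```

A caller can then treat (True, 404) as "the server is there but the page is not", which is different from (False, None).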

Answer 1 (accepted, 2019-03-12 17:14:05)
