Can I set max_retries for requests.request?

The Python requests module is simple and elegant, but one thing bugs me. It is possible to get a requests.exceptions.ConnectionError with a message like:

Max retries exceeded with url: ...

This implies that requests can attempt to access the data several times. But there is not a single mention of this possibility anywhere in the docs. Looking at the source code, I didn't find any place where I could alter the default (presumably 0) value.

So is it possible to somehow set the maximum number of retries for requests?

This will not only change the max_retries but also enable a backoff strategy which makes requests to all http:// addresses sleep for a period of time before retrying (up to a total of 5 times):

import requests
from urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter

s = requests.Session()

retries = Retry(total=5,
                backoff_factor=0.1,
                status_forcelist=[ 500, 502, 503, 504 ])

s.mount('http://', HTTPAdapter(max_retries=retries))

s.get('http://httpstat.us/500')

As per the documentation for Retry: if the backoff_factor is 0.1, then sleep() will sleep for [0.1s, 0.2s, 0.4s, ...] between retries. It will also force a retry if the status code returned is 500, 502, 503 or 504.
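That sleep schedule can be sketched directly from urllib3's formula, backoff_factor * (2 ** (retry_number - 1)). Note that backoff_delays is a made-up helper for illustration, not part of urllib3, and depending on the urllib3 version the very first retry may sleep 0 s:

```python
# Reproduce the delays urllib3 inserts between successive retries.
def backoff_delays(backoff_factor, retries):
    return [backoff_factor * (2 ** (n - 1)) for n in range(1, retries + 1)]

print(backoff_delays(0.1, 4))  # sleep schedule for backoff_factor=0.1
```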

Various other options to Retry allow for more granular control:

  • total – Total number of retries to allow.
  • connect – How many connection-related errors to retry on.
  • read – How many times to retry on read errors.
  • redirect – How many redirects to perform.
  • method_whitelist – Set of uppercased HTTP method verbs that we should retry on.
  • status_forcelist – A set of HTTP status codes that we should force a retry on.
  • backoff_factor – A backoff factor to apply between attempts.
  • raise_on_redirect – Whether, if the number of redirects is exhausted, to raise a MaxRetryError, or to return a response with a response code in the 3xx range.
  • raise_on_status – Similar meaning to raise_on_redirect: whether we should raise an exception, or return a response, if status falls in the status_forcelist range and retries have been exhausted.
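Combining several of these options looks like the following sketch; the values here are arbitrary and purely illustrative:

```python
from urllib3.util.retry import Retry

# Illustrative policy: tolerate more connection errors than read errors,
# follow at most 5 redirects, and return the last response instead of
# raising once the status retries run out.
retries = Retry(
    total=10,
    connect=5,
    read=2,
    redirect=5,
    status_forcelist=[500, 502, 503, 504],
    backoff_factor=0.5,
    raise_on_status=False,
)
```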

NB: raise_on_status is relatively new, and has not yet made it into a release of urllib3 or requests. The raise_on_status keyword argument appears to have made it into the standard library no earlier than Python 3.6.

To make requests retry on specific HTTP status codes, use status_forcelist. For example, status_forcelist=[503] will retry on status code 503 (service unavailable).

By default, the retry only fires for these conditions:

  • Could not get a connection from the pool.
  • TimeoutError
  • HTTPException raised (from http.client in Python 3, else httplib). These seem to be low-level HTTP exceptions, like a URL or protocol not formed correctly.
  • SocketError
  • ProtocolError

Notice that these are all exceptions that prevent a regular HTTP response from being received. If any regular response is generated, no retry is done. Without using the status_forcelist, even a response with status 500 will not be retried.
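This behaviour can be checked without any network traffic by asking urllib3's Retry.is_retry() directly (a real method on the Retry class, used here purely for illustration):

```python
from urllib3.util.retry import Retry

# Without status_forcelist, a 500 response does not trigger a retry...
plain = Retry(total=3)
print(plain.is_retry('GET', 500))

# ...but with status_forcelist, the same response does.
forced = Retry(total=3, status_forcelist=[500])
print(forced.is_retry('GET', 500))
```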

To make it behave in a manner which is more intuitive for working with a remote API or web server, I would use the above code snippet, which forces retries on statuses 500, 502, 503 and 504, all of which are not uncommon on the web and (possibly) recoverable given a big enough backoff period.

EDITED: Import the Retry class directly from urllib3.

It is the underlying urllib3 library that does the retrying. To set a different maximum retry count, use alternative transport adapters:

import requests
from requests.adapters import HTTPAdapter

s = requests.Session()
s.mount('http://stackoverflow.com', HTTPAdapter(max_retries=5))

The max_retries argument takes an integer or a Retry() object; the latter gives you fine-grained control over what kinds of failures are retried (an integer value is turned into a Retry() instance which only handles connection failures; errors after a connection is made are by default not handled, as these could lead to side effects).
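The two forms can be mounted side by side; the host names below are placeholders:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

s = requests.Session()

# An integer only retries connection failures...
s.mount('https://api.example.com', HTTPAdapter(max_retries=3))

# ...while a Retry() object can also retry on chosen response statuses.
s.mount('https://flaky.example.com',
        HTTPAdapter(max_retries=Retry(total=3, status_forcelist=[503])))
```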


Old answer, predating the release of requests 1.2.1:

The requests library doesn't really make this configurable, nor does it intend to (see this pull request). Currently (requests 1.1), the retries count is set to 0. If you really want to set it to a higher value, you'll have to set this globally:

import requests

requests.adapters.DEFAULT_RETRIES = 5

This constant is not documented; use it at your own peril, as future releases could change how this is handled.

Update: and this did change; in version 1.2.1 the option to set the max_retries parameter on the HTTPAdapter() class was added, so that now you have to use alternative transport adapters, see above. The monkey-patch approach no longer works, unless you also patch the HTTPAdapter.__init__() defaults (very much not recommended).

Be careful, Martijn Pieters's answer isn't suitable for version 1.2.1+. You can't set it globally without patching the library.

You can do this instead:

import requests
from requests.adapters import HTTPAdapter

s = requests.Session()
s.mount('http://www.github.com', HTTPAdapter(max_retries=5))
s.mount('https://www.github.com', HTTPAdapter(max_retries=5))

After struggling a bit with some of the answers here, I found a library called backoff that worked better for my situation. A basic example:

import backoff
import requests

@backoff.on_exception(
    backoff.expo,
    requests.exceptions.RequestException,
    max_tries=5,
    giveup=lambda e: e.response is not None and e.response.status_code < 500
)
def publish(self, data):
    r = requests.post(url, timeout=10, json=data)
    r.raise_for_status()

I'd still recommend giving the library's native functionality a shot, but if you run into any problems or need broader control, backoff is an option.

A cleaner way to gain higher control might be to package the retry logic into a function, make that function retriable using a decorator, and whitelist the exceptions.

I have created the same here: http://www.praddy.in/retry-decorator-whitelisted-exceptions/

Reproducing the code from that link:

import functools
import time

def retry(exceptions, delay=0, times=2):
    """
    A decorator for retrying a function call with a specified delay in case of a set of exceptions

    Parameter List
    -------------
    :param exceptions: A tuple of all exceptions that need to be caught for retry
                       e.g. retry(exceptions=(Timeout, ReadTimeout))
    :param delay: Amount of delay (seconds) needed between successive retries.
    :param times: Number of times the function should be retried.
    """
    def outer_wrapper(function):
        @functools.wraps(function)
        def inner_wrapper(*args, **kwargs):
            final_excep = None
            for counter in range(times):  # xrange in Python 2
                if counter > 0:
                    time.sleep(delay)
                final_excep = None
                try:
                    return function(*args, **kwargs)
                except exceptions as e:
                    final_excep = e  # or log it

            if final_excep is not None:
                raise final_excep
        return inner_wrapper

    return outer_wrapper


@retry(exceptions=(TimeoutError, ConnectTimeoutError), delay=0, times=3)
def call_api():
    ...

You can use the requests library to accomplish all of this in one go. The following code will retry 3 times if you receive a 429, 500, 502, 503 or 504 status code, each time with a longer delay set through "backoff_factor". See https://findwork.dev/blog/advanced-usage-python-requests-timeouts-retries-hooks/ for a nice tutorial.

import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry

retry_strategy = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
    method_whitelist=["HEAD", "GET", "OPTIONS"]  # renamed allowed_methods in urllib3 >= 1.26
)
adapter = HTTPAdapter(max_retries=retry_strategy)
http = requests.Session()
http.mount("https://", adapter)
http.mount("http://", adapter)

response = http.get("https://en.wikipedia.org/w/api.php")
Another (crude) approach is to simply loop until the request succeeds, with no retry limit:

page = None
while page is None:
    try:
        page = requests.get(url, timeout=5, proxies=proxies)
    except Exception:
        page = None
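A bounded variant of such a loop gives up and re-raises the last error after a fixed number of attempts. Note that fetch_with_retries is a hypothetical helper name, not a requests API:

```python
import time
import requests

def fetch_with_retries(url, attempts=5, delay=1.0, **kwargs):
    # Retry up to `attempts` times, sleeping `delay` seconds between tries,
    # and re-raise the last exception if every attempt fails.
    last_exc = None
    for n in range(attempts):
        if n > 0:
            time.sleep(delay)
        try:
            return requests.get(url, timeout=5, **kwargs)
        except requests.exceptions.RequestException as exc:
            last_exc = exc
    raise last_exc
```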
