简体   繁体   English

为什么我们要在python正则表达式中使用re.purge()?

[英]Why should we use re.purge() in python regular expression?

What is significance of clearing cache while working with re in Python. 在Python中使用re时清除缓存有什么意义。 Does it help in performance or memory management? 它对性能或内存管理有帮助吗? What happens if we ignore it. 如果我们忽略它会发生什么。 Where should re.purge() called? re.purge()应该在哪里调用?

Most code will not need to worry about purging the re module cache. 大多数代码都不需要担心清除re模块缓存。 It brings very little memory benefit, and can actually hurt performance if you purged it. 它带来的内存利益非常小,如果你清除它,实际上可能会损害性能。

The cache is used to store compiled regular expression objects when you use the top-level re.* functions directly rather than use re.compile(pattern) . 当您直接使用顶级re.*函数而不是使用re.compile(pattern)时,缓存用于存储已编译的正则表达式对象 For example, if you used re.search(r'<some pattern>', string_value) in a loop, then the re module would compile '<some pattern>' only once and store it in the cache, avoiding having to re-compile the pattern each time. 例如,如果在循环中使用了re.search(r'<some pattern>', string_value) ,那么re模块只会编译一次'<some pattern>'并将其存储在缓存中,从而避免重新每次编译模式。

How many such objects are cached and how the cache is managed is an implementation detail, really, but regular expression objects are light-weight objects, taking up at most a few hundred bytes, and Python won't store more than a few hundred of these (Python 3.7 stores up to 512). 实际上,有多少这样的对象被缓存以及如何管理缓存是一个实现细节,但正则表达式对象是轻量级对象,最多占用几百个字节,Python不会存储超过几百个这些(Python 3.7存储最多512个)。

The cache is also automatically managed, so purging is not normally needed at all. 缓存也是自动管理的,因此通常根本不需要清除。 Use it if you specifically need to account for regular expression compilation time in a repeated time trial test involving re.* functions, or are testing the caching functionality itself. 如果您特别需要在涉及re.*函数的重复计时测试中考虑正则表达式编译时间,或者正在测试缓存功能本身,请使用它。 The only locations in the Python standard library that call re.purge() are in tests (specifically in the test_re unittests for the re module and the reference leak test in the regression test suite). 调用re.purge()的Python标准库中的唯一位置是在测试中(特别是在re模块的test_re单元测试和回归测试套件中的参考泄漏测试中)。

If your code is creating a lot of regular expression objects that you intent to keep using, it is better to use re.compile() and keep your own references to those compiled expression objects. 如果您的代码正在创建许多您打算继续使用的正则表达式对象,那么最好使用re.compile()并保留对这些已编译表达式对象的引用。 See the re.compile() documentation : 请参阅re.compile()文档

The sequence 序列

 prog = re.compile(pattern) result = prog.match(string) 

is equivalent to 相当于

 result = re.match(pattern, string) 

but using re.compile() and saving the resulting regular expression object for reuse is more efficient when the expression will be used several times in a single program. 但是当在单个程序中多次使用表达式时,使用re.compile()并保存生成的正则表达式对象以便重用更有效。

Note : The compiled versions of the most recent patterns passed to re.compile() and the module-level matching functions are cached, so programs that use only a few regular expressions at a time needn't worry about compiling regular expressions. 注意 :传递给re.compile()最新模式的编译版本和模块级匹配函数被缓存,因此一次只使用几个正则表达式的程序不必担心编译正则表达式。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM