
Use Python's string.replace vs re.sub

For Python 2.5 and 2.6, should I be using string.replace or re.sub for basic text replacements?

In PHP, this was explicitly stated, but I can't find a similar note for Python.

As long as you can make do with str.replace(), you should use it. It avoids all the pitfalls of regular expressions (like escaping), and is generally faster.
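To make the escaping pitfall concrete, here is a minimal sketch. It assumes Python 3 (the question targets 2.5/2.6, where these calls behave the same way), and the sample string and variable names are made up for illustration.

```python
import re

price = "Total: $5.00 (was $5.00)"

# str.replace() treats both arguments as literal text.
literal = price.replace("$5.00", "$4.50")

# re.sub() treats the first argument as a pattern. Here "$" means
# "end of string" and "." means "any character", so the unescaped
# pattern finds nothing to replace.
unescaped = re.sub("$5.00", "$4.50", price)

# re.escape() makes the pattern match the literal text again.
escaped = re.sub(re.escape("$5.00"), "$4.50", price)

print(literal)
print(unescaped)
print(escaped)
```

The unescaped call fails silently: it returns the input unchanged rather than raising an error, which is what makes this kind of bug easy to miss.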

str.replace() should be used whenever possible. It's more explicit, simpler, and faster.

In [1]: import re

In [2]: text = """For python 2.5, 2.6, should I be using string.replace or re.sub for basic text replacements.
In PHP, this was explicitly stated but I can't find a similar note for python.
"""

In [3]: timeit text.replace('e', 'X')
1000000 loops, best of 3: 735 ns per loop

In [4]: timeit re.sub('e', 'X', text)
100000 loops, best of 3: 5.52 us per loop
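Outside IPython, roughly the same comparison can be run with the standard timeit module. This is a sketch assuming Python 3; the precompiled-pattern case is an addition that is not part of the measurement above, and the absolute numbers will vary by machine and Python version.

```python
import re
import timeit

text = (
    "For python 2.5, 2.6, should I be using string.replace or re.sub "
    "for basic text replacements.\n"
    "In PHP, this was explicitly stated but I can't find a similar note "
    "for python.\n"
)

# Extra case (not in the original measurement): a precompiled pattern.
pattern = re.compile("e")
runs = 10000

t_replace = timeit.timeit(lambda: text.replace("e", "X"), number=runs)
t_sub = timeit.timeit(lambda: re.sub("e", "X", text), number=runs)
t_compiled = timeit.timeit(lambda: pattern.sub("X", text), number=runs)

for label, seconds in [
    ("str.replace", t_replace),
    ("re.sub", t_sub),
    ("compiled pattern.sub", t_compiled),
]:
    # Report the average cost of one call in microseconds.
    print("%-22s %.2f us per call" % (label, seconds / runs * 1e6))
```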

String manipulation is usually preferable to regex when you can figure out how to adapt it. Regex is incredibly powerful, but it's usually slower, and usually harder to write, debug, and maintain.

That being said, notice the number of "usually"s in the above paragraph! It's possible (and I've seen it done) to write a zillion lines of string manipulation for something you could've done with a 20-character regex. It's also possible to waste valuable time using "efficient" string functions on tasks a good regex engine could do almost as fast. Then there's maintainability: regex can be horribly complex, but sometimes a regex will be simpler and easier to read than a giant block of procedural code.

Regex is fantastic for its intended purpose: searching for highly-variable needles in highly-variable haystacks. Think of it as a precision torque wrench: it's the perfect tool for a specific set of jobs, but it makes a lousy hammer.
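As a rough illustration of the "fixed needle" versus "variable needle" distinction, consider the sketch below. It assumes Python 3, and the log line and date format are invented for the example.

```python
import re

log = "Deployed on 2009-11-03, rolled back on 2009-11-05."

# Fixed needle: the text to find never changes, so str.replace() is enough.
fixed = log.replace("rolled back", "reverted")

# Variable needle: any YYYY-MM-DD date, rewritten as DD/MM/YYYY.
# Doing this with plain string methods would need manual scanning and slicing.
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", log)

print(fixed)
print(swapped)
```

Here the regex version is arguably the more readable one, because the pattern describes the shape of the text being matched in a single line.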

Some guidelines you should follow when you aren't sure what to use:

If the answer to any of these questions is "yes", you probably want string manipulation. Otherwise, consider regex.

Another thing to consider: if you are doing fairly complex replacements, str.translate() might be what you're looking for.
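A minimal sketch of str.translate() for several single-character replacements in one pass follows. It assumes Python 3, where str.maketrans() accepts a dict whose values may be longer strings; in Python 2, which the question targets, string.maketrans() only supports one-to-one character mappings. The sample input is made up.

```python
# Map each special character to its HTML entity (Python 3 only).
table = str.maketrans({"<": "&lt;", ">": "&gt;", "&": "&amp;"})

raw = "a < b && c > d"
escaped = raw.translate(table)

# The three-argument form deletes every character listed in the third argument.
no_vowels = "regular expression".translate(str.maketrans("", "", "aeiou"))

print(escaped)
print(no_vowels)
```

Because each input character is looked up exactly once, the "&" in an entity that was just produced is not escaped a second time, which is a common bug when chaining several str.replace() calls in the wrong order.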
