
(Python) Which is more efficient for splitting a string on multiple separators? 1) Using multiple replace calls and then split, or 2) using regular expressions

Example:

(Case 1)

# First use the replace method to convert the different separators into a single one, then use the split method

text = "python is, an easy;language; to, learn."
text_one_delimiter = text.replace("# ", ", ").replace("% ", ", ").replace("; ", ", ").replace("- ", ", ")

print(text_one_delimiter.split(", "))

(Case 2)

# Use a regular expression to split on multiple separators

import re

text = "python is# an% easy;language- to, learn."
print(re.split('; |, |# |% |- ', text))
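Before timing the two approaches, it may help to confirm that they return the same pieces for the same input. The following is a minimal sketch, not part of the original question: the helper names `split_with_replace` and `split_with_regex` and the `SEPARATORS` list are hypothetical, introduced only for illustration. Note that the sample string contains `;` without a trailing space, so it is not treated as a separator by either approach.

```python
import re

# The separators used in the question (each is a character followed by a space).
SEPARATORS = ["; ", ", ", "# ", "% ", "- "]


def split_with_replace(text):
    """Case 1: normalize every separator to ', ' and then split once."""
    for sep in SEPARATORS:
        text = text.replace(sep, ", ")
    return text.split(", ")


def split_with_regex(text):
    """Case 2: split on an alternation of all separators."""
    # re.escape guards against separators that are regex metacharacters.
    pattern = "|".join(re.escape(sep) for sep in SEPARATORS)
    return re.split(pattern, text)


text = "python is# an% easy;language- to, learn."
print(split_with_replace(text))
print(split_with_regex(text))
# Both print: ['python is', 'an', 'easy;language', 'to', 'learn.']
```

Both helpers agree on this sample; for other separator sets (for example, ones that overlap each other) the two approaches are not guaranteed to behave identically, so a check like this is worth repeating on representative data.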

The timeit module is useful for comparing the speed of code snippets. It can be used as follows:

import timeit
case1 = '''text = "python is, an easy;language; to, learn."
text_one_delimiter = text.replace("# ", ", ").replace("% ", ", ").replace("; ", ", ").replace("- ", ", ")
text_one_delimiter.split(", ")'''
case2_setup = "import re"
case2 = '''text = "python is# an% easy;language- to, learn."
re.split('; |, |# |% |- ', text)'''
print(timeit.timeit(case1))
print(timeit.timeit(case2,case2_setup))

Output (will depend on your machine):

1.1250261999999793
2.2901268999999616

Note that I excluded the print calls from the timed code and moved import re into the setup, since otherwise the import statement would be executed needlessly on every iteration. The conclusion is that, in this particular case, the method with multiple .replace calls is faster than re.split.

(tested in Python 3.7.3)
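The two snippets timed above use slightly different input strings, so a like-for-like comparison might look like the sketch below. It is an illustration under stated assumptions rather than the original benchmark: the function names, the `best_time` helper, and the precompiled-pattern variant are additions not present in the answer, and the iteration count is lowered from timeit's default of 1,000,000 to keep the run short. No timings are quoted because they depend on the machine and Python version.

```python
import re
import timeit

# Same input for every variant, taken from case 2 of the question.
TEXT = "python is# an% easy;language- to, learn."

# Assumption: a precompiled pattern is added as a third variant for comparison.
PATTERN = re.compile("; |, |# |% |- ")


def with_replace(text=TEXT):
    """Chain of str.replace calls followed by a single split."""
    normalized = (
        text.replace("# ", ", ")
        .replace("% ", ", ")
        .replace("; ", ", ")
        .replace("- ", ", ")
    )
    return normalized.split(", ")


def with_re_split(text=TEXT):
    """re.split with the pattern given as a string on every call."""
    return re.split("; |, |# |% |- ", text)


def with_compiled(text=TEXT):
    """split method of a pattern compiled once at import time."""
    return PATTERN.split(text)


def best_time(func, number=100_000, repeat=3):
    """Return the fastest of several timing runs, in seconds."""
    return min(timeit.repeat(func, number=number, repeat=repeat))


results = {
    func.__name__: best_time(func)
    for func in (with_replace, with_re_split, with_compiled)
}
for name, seconds in results.items():
    print(f"{name}: {seconds:.4f} s")
```

Taking the minimum of several repeats reduces noise from other processes. Results for a short string like this one may not carry over to long inputs or to larger sets of separators, so it is worth rerunning the comparison on data that resembles the real workload.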
