
String splitting in Python using regex

I'm trying to split a string in Python so that I get everything before the first match of a certain regex.

Example string: "Some.File.Num10.example.txt"

I need everything before this part: "Num10", regex: r'Num\d\d' (the number will vary, and possibly what comes after it).

Any ideas on how to do this?

>>> import re
>>> s = "Some.File.Num10.example.txt"
>>> p = re.compile(r"Num\d{2}")
>>> match = p.search(s)
>>> s[:match.start()]
'Some.File.'

This would be more efficient than doing a split because search doesn't have to scan the whole string; it stops at the first match. In your example it wouldn't make a difference, as the strings are short, but if your string is very long and you know that the match is going to be near the beginning, this approach would be faster.
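One thing to watch with the search() approach: search() returns None when the pattern is not found, so slicing with match.start() would raise an AttributeError. A minimal sketch of a guard, assuming you want the whole string back when nothing matches (the helper name text_before is made up for illustration):

```python
import re

def text_before(pattern, s):
    """Return the part of s before the first match of pattern.

    Assumption: if the pattern does not match at all, return s unchanged.
    """
    match = re.search(pattern, s)
    if match is None:
        return s
    return s[:match.start()]

print(text_before(r"Num\d{2}", "Some.File.Num10.example.txt"))  # Some.File.
print(text_before(r"Num\d{2}", "Some.File.example.txt"))        # Some.File.example.txt
```

Returning an empty string or raising an error in the no-match case would be equally reasonable; it depends on what the caller expects.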

I just wrote a small program to profile search() and split() and confirmed the above assertion.
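The profiling program itself isn't shown, but a comparison along these lines could be sketched with timeit. The long input string, the function names, and the repeat count below are all made up for illustration, and actual timings will depend on your machine and Python version:

```python
import re
import timeit

PATTERN = re.compile(r"Num\d{2}")

# Hypothetical long input: the match is near the start, followed by a lot of text.
LONG_TEXT = "Some.File.Num10." + "example." * 10_000 + "txt"

def prefix_with_search(s):
    # Stops scanning as soon as the first match is found.
    m = PATTERN.search(s)
    return s[:m.start()] if m else s

def prefix_with_split(s):
    # Scans the whole string for every match and builds a list of pieces.
    return PATTERN.split(s)[0]

search_time = timeit.timeit(lambda: prefix_with_search(LONG_TEXT), number=200)
split_time = timeit.timeit(lambda: prefix_with_split(LONG_TEXT), number=200)
print(f"search: {search_time:.4f}s  split: {split_time:.4f}s")
```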

>>> import re
>>> text = "Some.File.Num10.example.txt"
>>> re.split(r'Num\d{2}',text)[0]
'Some.File.'
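If you prefer split() but only need the first piece, the maxsplit argument tells re.split() to stop after the first match and return the rest of the string as the last element. A small sketch (I have not measured how this compares with search()):

```python
import re

text = "Some.File.Num10.example.Num42.txt"

# maxsplit=1: split at the first match only; the remainder stays in one piece.
parts = re.split(r"Num\d{2}", text, maxsplit=1)
print(parts)     # ['Some.File.', '.example.Num42.txt']
print(parts[0])  # Some.File.

# If the pattern never matches, split() returns the whole string as a single element.
print(re.split(r"Num\d{2}", "Some.File.txt", maxsplit=1))  # ['Some.File.txt']
```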

You can use Python's re.split():

import re

my_str = "This is a string."

re.split(r"\W+", my_str)

['This', 'is', 'a', 'string', '']
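The trailing empty string in that result is there because the string ends with a separator (the final "."). If that is unwanted, here are two possible ways around it, sketched with the same example string:

```python
import re

my_str = "This is a string."

# Option 1: split as before, then drop the empty pieces.
words = [w for w in re.split(r"\W+", my_str) if w]

# Option 2: match the words themselves instead of the separators.
same_words = re.findall(r"\w+", my_str)

print(words)       # ['This', 'is', 'a', 'string']
print(same_words)  # ['This', 'is', 'a', 'string']
```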

