Split a line with matching regex special characters
I want to split a line into the parts matched by each special (non-literal) expression of a regex. As it is hard to explain, here are a few examples:
('.+abcd[0-9]+\\.mp3', 'Aabcd09.mp3')
-> [ 'A', '09' ]
.+ is a special expression of the regex, and its match is what I want.
[0-9]+ is another special expression, and I want what it matches too.
('.+\\..+_[0-9]+\\.mp3', 'A.abcd_09.mp3')
-> [ 'A', 'abcd', '09' ]
.+ is the first special expression of the regex; it matches A
.+ is the second special expression of the regex; it matches abcd
[0-9]+ is the third special expression of the regex; it matches 09
Do you know how to achieve this? I didn't find anything.
It looks like you first need a so-called tokenizer/lexer to parse the regular expression. It will let you split the base regex into sub-expressions. Then apply these sub-expressions to the original string and print out the matches.
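A minimal sketch of that idea, under some assumptions: the names tokenize and split_by_special are made up for illustration, and the tokenizer only understands a small subset of regex syntax (literals, escapes, ".", "[...]" classes and simple quantifiers; no groups, alternation or anchors). Instead of running each sub-expression separately, it wraps every special sub-expression in a capture group and matches the whole pattern once, so greedy parts like .+ still backtrack correctly.

```python
import re

# Quantifiers that may follow an atom: *, +, ?, {m}, {m,}, {m,n}
QUANTIFIER = re.compile(r'[*+?]|\{\d+(?:,\d*)?\}')
# Escapes that stand for a character class rather than a literal character
CLASS_ESCAPES = set('dDwWsS')


def tokenize(pattern):
    """Yield (atom, is_special) pairs; each atom keeps its trailing quantifier."""
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == '\\':
            # Escaped character: special only for class escapes such as \d
            atom = pattern[i:i + 2]
            special = atom[1:] in CLASS_ESCAPES
            i += 2
        elif ch == '[':
            # Character class: scan to the closing bracket
            j = i + 1
            if j < n and pattern[j] == '^':
                j += 1
            if j < n and pattern[j] == ']':
                j += 1  # a leading ']' is a literal member of the class
            while j < n and pattern[j] != ']':
                j += 2 if pattern[j] == '\\' else 1
            atom = pattern[i:j + 1]
            special = True
            i = j + 1
        else:
            atom = ch
            special = ch == '.'
            i += 1
        # Attach a quantifier, plus an optional lazy '?', to the atom
        m = QUANTIFIER.match(pattern, i)
        if m:
            atom += m.group()
            i = m.end()
            if i < n and pattern[i] == '?':
                atom += '?'
                i += 1
        yield atom, special


def split_by_special(pattern, text):
    """Return what each special sub-expression matched, or None if no match."""
    grouped = ''.join(f'({atom})' if special else atom
                      for atom, special in tokenize(pattern))
    m = re.fullmatch(grouped, text)
    return list(m.groups()) if m else None


print(split_by_special('.+abcd[0-9]+\\.mp3', 'Aabcd09.mp3'))
print(split_by_special('.+\\..+_[0-9]+\\.mp3', 'A.abcd_09.mp3'))
```

For the two examples in the question, the grouped patterns become (.+)abcd([0-9]+)\.mp3 and (.+)\.(.+)_([0-9]+)\.mp3. A pattern that uses syntax outside the supported subset would need a fuller parser.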
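You can try this:
import re
s = ['Aabcd09.mp3', 'A.abcd_09.mp3']
# Raw string so that \. and \d reach the regex engine unchanged.
# Alternatives: leading letter | letters between '.' and '_' | digits before '.mp3'
new_s = [re.findall(r'(?<=^)[a-zA-Z]|(?<=\.)[a-zA-Z]+(?=_)|\d+(?=\.mp3)', i) for i in s]
print(new_s)
Output: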
[['A', '09'], ['A', 'abcd', '09']]