Split a line by the matching special characters of a regex

Question

I want to split a line into the substrings matched by each special expression of a regex. It is hard to explain, so here are a few examples:

('.+abcd[0-9]+\\.mp3', 'Aabcd09.mp3') -> [ 'A', '09' ]

.+ is a special expression of the regex, and I want the text it matches (A)
[0-9]+ is another special expression, and I want the text it matches too (09)

('.+\\..+_[0-9]+\\.mp3', 'A.abcd_09.mp3') -> [ 'A', 'abcd', '09' ]

.+ is the first special expression of the regex; it matches A
.+ is the second special expression of the regex; it matches abcd
[0-9]+ is the third special expression of the regex; it matches 09

Do you know how to achieve this? I couldn't find anything.

Answer 1

It looks like you need a so-called tokenizer/lexer to parse the regular expression first. That lets you split the base regex into its sub-expressions. Then apply the sub-expressions to the original string and print out what each one matches.
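One way to picture this is the minimal sketch below. It is not a full regex parser: the helper name split_by_special and the TOKEN pattern are made up for illustration, and it only understands the syntax used in the question (a dot or a [...] class with an optional +, * or ? quantifier, escaped characters, and plain literals). Rather than applying each sub-expression separately, it wraps every special sub-expression in a capture group and matches the rebuilt pattern once, which is a swapped-in simplification of the idea above.

```python
import re

# Hypothetical tokenizer for a small regex subset (an assumption, not a full parser).
TOKEN = re.compile(r"""
    (?P<special>(?:\[(?:\\.|[^\]\\])+\]|\.)[+*?]?)   # [...] class or dot, optional quantifier
  | (?P<literal>\\.|[^\\\[\].+*?()|{}^$])            # escaped character or plain literal
""", re.VERBOSE)

def split_by_special(pattern, text):
    """Return the substrings of text matched by each special sub-expression."""
    parts = []
    pos = 0
    while pos < len(pattern):
        m = TOKEN.match(pattern, pos)
        if m is None:
            raise ValueError(f"unsupported regex syntax at position {pos}")
        tok = m.group()
        # Special sub-expressions become capture groups; literals are kept as they are.
        parts.append(f"({tok})" if m.lastgroup == "special" else tok)
        pos = m.end()
    match = re.fullmatch("".join(parts), text)
    return list(match.groups()) if match else None

print(split_by_special('.+abcd[0-9]+\\.mp3', 'Aabcd09.mp3'))
print(split_by_special('.+\\..+_[0-9]+\\.mp3', 'A.abcd_09.mp3'))
```

Anything outside that subset (groups, alternation, anchors, {m,n} repetition, lazy quantifiers) raises ValueError instead of being silently mis-tokenized.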

Answer 2

You can try this:

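import re
s = ['Aabcd09.mp3', 'A.abcd_09.mp3']
# Alternatives: leading letter | letters between '.' and '_' | digits before '.mp3'
pattern = r'^[a-zA-Z]|(?<=\.)[a-zA-Z]+(?=_)|\d+(?=\.mp3)'  # raw string keeps \. and \d intact
new_s = [re.findall(pattern, i) for i in s]
print(new_s)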

Output:

[['A', '09'], ['A', 'abcd', '09']]
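The pattern above is written by hand for these two file names. If the patterns from the question can be edited directly, a simpler variant (a sketch, assuming you are free to add parentheses to them yourself) is to wrap each special sub-expression in a capture group and read the groups from a single match:

```python
import re

# The question's patterns with each special sub-expression parenthesized by hand.
cases = [
    (r'(.+)abcd([0-9]+)\.mp3', 'Aabcd09.mp3'),
    (r'(.+)\.(.+)_([0-9]+)\.mp3', 'A.abcd_09.mp3'),
]

results = []
for pattern, text in cases:
    m = re.fullmatch(pattern, text)
    # groups() returns the captured substrings in left-to-right order.
    results.append(list(m.groups()) if m else None)

print(results)
```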

Answer 1
0 2018-01-23 15:51:56

Answer 2
0 2018-01-23 16:12:01
