Split a string after a certain integer character pattern

Question

I have a string stored in the variable mystring. I want to split the string after a "character, 4-digit integer, character" pattern, i.e. (4-digit-integer). I suppose this can be done using a Python regex.

mystring = 'Lorem Ipsum (2018) Amet (Lorem Dolor Amet Elit)'

Desired output:

splitstring = ['Lorem Ipsum (2018)', 'Amet (Lorem Dolor Amet Elit)']

Answer 1

If you don't mind doing some filtering, you could do:

import re

string = 'Lorem Ipsum (2018) Amet (Lorem Dolor Amet Elit)'
# Raw string so that \d, \( and \s reach the regex engine unchanged
result = [m for m in re.split(r'([^\d(]+\(\d{4}\))\s+', string) if m]
print(result)

Output

['Lorem Ipsum (2018)', 'Amet (Lorem Dolor Amet Elit)']

When split is used with a capturing group, the result also includes the text matched by the group, in this case ([^\d(]+\(\d{4}\)), i.e. one or more characters that are neither digits nor an opening parenthesis, followed by exactly four digits enclosed in parentheses. Note that the trailing whitespace \s+ sits outside the group and is therefore left out of the result. Because the match starts at the beginning of the string, split also returns an empty string before it, which is what the "if m" filter removes.
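To make the filtering step concrete, here is a small sketch that shows what re.split returns before and after the empty strings are dropped. The variable names text, raw_parts and filtered are only illustrative and do not come from the answer.

```python
import re

text = 'Lorem Ipsum (2018) Amet (Lorem Dolor Amet Elit)'

# The capturing group keeps the matched prefix in the result;
# the trailing \s+ is outside the group, so it is discarded.
raw_parts = re.split(r'([^\d(]+\(\d{4}\))\s+', text)
print(raw_parts)
# ['', 'Lorem Ipsum (2018)', 'Amet (Lorem Dolor Amet Elit)']

# Drop the empty string that precedes the first match.
filtered = [part for part in raw_parts if part]
print(filtered)
# ['Lorem Ipsum (2018)', 'Amet (Lorem Dolor Amet Elit)']
```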

Answer 2

Here is a simple way to do it.

Since parentheses have a special meaning in regular expressions (they delimit capturing groups), you need to escape them, e.g. \( for an opening parenthesis. Then you can search for a parenthesized four-digit number such as (2018) and split the text accordingly:

import re
s = 'Lorem Ipsum (2018) Amet (Lorem Dolor Amet Elit)'
match = re.search(r'\(\d{4}\)', s)

split_string = [ s[:match.end()], s[match.end():] ]
print(split_string) 
# ['Lorem Ipsum (2018)', ' Amet (Lorem Dolor Amet Elit)']
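Note that the second element above keeps its leading space, so it does not quite match the desired output, and match would be None if the string contained no such pattern. A possible variant that handles both cases is sketched below; the helper name split_after_pattern is hypothetical, and returning the unsplit string in a one-element list when nothing matches is an assumption about what the asker would want.

```python
import re

def split_after_pattern(text):
    """Split text once, right after the first '(dddd)' group."""
    match = re.search(r'\(\d{4}\)', text)
    if match is None:
        # Assumption: with no match, return the text unsplit.
        return [text]
    head = text[:match.end()]
    # Strip the whitespace that separated the two parts.
    tail = text[match.end():].lstrip()
    return [head, tail] if tail else [head]

print(split_after_pattern('Lorem Ipsum (2018) Amet (Lorem Dolor Amet Elit)'))
# ['Lorem Ipsum (2018)', 'Amet (Lorem Dolor Amet Elit)']
```

A fixed-width lookbehind gives the same split for this input in a single call: re.split(r'(?<=\(\d{4}\))\s+', s, maxsplit=1).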
