
How to split a string by character sets that are different in Python

I want to split a string into a list, breaking it wherever the character changes. For example, given the string ccaaawq, I want my program to give me ['cc', 'aaa', 'w', 'q']. Since there is no single delimiter between the pieces, I'm wondering what the best approach to this problem is. Thanks in advance for your answers.

You can use itertools.groupby:

from itertools import groupby

s = "ccaaawq"

out = ["".join(g) for _, g in groupby(s)]
print(out)

Prints:

['cc', 'aaa', 'w', 'q']
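To make it clearer what groupby is doing here, the following is a small sketch (the variable name runs and the second input "aabaa" are illustrative, not from the answer). With no key function, groupby yields one (character, group) pair per run of consecutive equal characters, and each group is a lazy iterator that has to be consumed before moving on to the next one.

```python
from itertools import groupby

s = "ccaaawq"

# Each group is an iterator over one run of equal characters;
# list() consumes it so the run length can be counted.
runs = [(char, len(list(group))) for char, group in groupby(s)]
print(runs)  # [('c', 2), ('a', 3), ('w', 1), ('q', 1)]

# Only adjacent repeats are merged: a character that shows up
# again later in the string starts a new run.
print(["".join(g) for _, g in groupby("aabaa")])  # ['aa', 'b', 'aa']
```

Because grouping is by adjacency only, no sorting is needed (or wanted) for this problem.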

Here is a regex approach using re.findall:

import re

inp = "ccaaawq"
# findall returns (whole run, single character) tuples; keep the whole run
output = [x[0] for x in re.findall(r'((.)\2*)', inp)]
print(output)  # ['cc', 'aaa', 'w', 'q']

The above works by matching any one character followed by that same character zero or more times. The inner group (.) captures the single character, the backreference \2* matches its repeats, and the outer group captures the whole run. Because the pattern has two groups, re.findall returns a list of tuples, and we take the first element of each tuple to get the run.
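As a possible variation on the regex answer (a sketch, not part of the original answer), re.finditer lets you read the whole match directly, so only one capture group is needed and there are no tuples to index. The name pairs is only used here to show what re.findall returns for the two-group pattern.

```python
import re

inp = "ccaaawq"

# (.) captures one character; \1* matches further repeats of it.
# m.group(0) is the entire match, i.e. the whole run.
output = [m.group(0) for m in re.finditer(r"(.)\1*", inp)]
print(output)  # ['cc', 'aaa', 'w', 'q']

# For comparison: findall with two groups yields one tuple per match.
pairs = re.findall(r"((.)\2*)", inp)
print(pairs)  # [('cc', 'c'), ('aaa', 'a'), ('w', 'w'), ('q', 'q')]
```

Note that . does not match a newline by default, so input containing line breaks would need the re.DOTALL flag for those characters to be included.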
