regex split at start of word ending in character
I am again trying to work out how to split a string in Python that has the following format:
'aaaa bbbb cccc:dd eeee:ff ggg hhhh iiii:jjjj kkkk:llll:mm nnn:ooo pppp qqqq:rrr'
into the following list items:
'aaaa bbbb'
'cccc:dd'
'eeee:ff ggg hhhh'
'iiii:jjjj'
'kkkk:'
'llll:mm'
'nnn:ooo pppp'
'qqqq:rrr'
I am looking to split at the start of each word that ends with a colon (':').
Any suggestions would be much appreciated :)
The following works for the provided example:
import re
string = 'aaaa bbbb cccc:dd eeee:ff ggg hhhh iiii:jjjj kkkk:llll:mm nnn:ooo pppp qqqq:rrr'
result = []
# split the string at each word followed by a colon
# the pattern is wrapped in a capture group so the matched tokens
# are kept in the list returned by re.split
parts = re.split(r"(\w+:)", string)
# if there is any text before the first delimiter token,
# add it to the results
if parts[0]:
    result.append(parts[0].strip())
# pair each delimiter token (odd indices, starting at 1)
# with the text that follows it (even indices, starting at 2)
groups = zip(parts[1::2], parts[2::2])
# join each pair into one string and strip surrounding whitespace
result.extend(["".join(group).strip() for group in groups])
print(result)
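An alternative that avoids re-joining the pieces, offered only as a sketch: find where each colon-terminated word starts with re.finditer and slice the string between those positions. The helper name split_at_colon_words is made up for illustration, and it assumes that a "word" is whatever \w+ matches, as in the answer above.

```python
import re


def split_at_colon_words(text):
    """Split text at the start of every word that ends with a colon.

    Hypothetical helper for illustration; "word" means a run of \\w characters.
    """
    # Start offsets of each non-overlapping "word:" token.
    starts = [m.start() for m in re.finditer(r"\w+:", text)]
    # Slice boundaries: beginning of text, every token start, end of text.
    bounds = [0] + starts + [len(text)]
    pieces = (text[a:b].strip() for a, b in zip(bounds, bounds[1:]))
    # Drop the empty leading piece that appears when text starts with a token.
    return [p for p in pieces if p]


sample = 'aaaa bbbb cccc:dd eeee:ff ggg hhhh iiii:jjjj kkkk:llll:mm nnn:ooo pppp qqqq:rrr'
print(split_at_colon_words(sample))
```

Because the matches are non-overlapping, 'kkkk:llll:mm' still yields 'kkkk:' and 'llll:mm' as separate items, matching the list requested in the question.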