Regex split at start of word ending in character

I am again trying to work out how to split a string in Python that has the following format:

'aaaa bbbb cccc:dd eeee:ff ggg hhhh iiii:jjjj kkkk:llll:mm nnn:ooo pppp qqqq:rrr'

into the following list items:

'aaaa bbbb' 
'cccc:dd'
'eeee:ff ggg hhhh'
'iiii:jjjj'
'kkkk:'
'llll:mm'
'nnn:ooo pppp'
'qqqq:rrr'

I am looking to split at the start of each word that ends with a colon (':').

Any suggestions would be much appreciated :)

The following works for the provided example:

import re

string = 'aaaa bbbb cccc:dd eeee:ff ggg hhhh iiii:jjjj kkkk:llll:mm nnn:ooo pppp qqqq:rrr'
result = []

# Split at each word that ends with a colon. The capturing group
# keeps each matched "word:" token in the list returned by re.split.
parts = re.split(r"(\w+:)", string)

# Any non-blank text before the first "word:" token becomes its own item.
if parts[0].strip():
    result.append(parts[0].strip())

# Pair each "word:" token (odd indices) with the text that follows it
# (even indices, starting at 2).
groups = zip(parts[1::2], parts[2::2])

# Join each pair into one string and strip surrounding whitespace.
result.extend(["".join(group).strip() for group in groups])

print(result)
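
As a possible alternative, the same split can be sketched with a zero-width lookahead, so the "word:" tokens never have to be re-joined. This assumes Python 3.7 or later, where re.split accepts patterns that can match an empty string. The function name split_before_colon_words is made up for illustration and is not part of the original answer.

```python
import re


def split_before_colon_words(text):
    """Split text just before every word that is immediately followed by ':'.

    Hypothetical helper for illustration; requires Python 3.7+ because the
    pattern below only ever matches the empty string.
    """
    # (?=\b\w+:) matches the empty position at the start of a word
    # whose word characters run straight into a colon.
    pieces = re.split(r"(?=\b\w+:)", text)
    # Strip the spaces left at the piece edges and drop empty pieces
    # (for example when the text itself starts with "word:").
    return [piece.strip() for piece in pieces if piece.strip()]


string = 'aaaa bbbb cccc:dd eeee:ff ggg hhhh iiii:jjjj kkkk:llll:mm nnn:ooo pppp qqqq:rrr'
print(split_before_colon_words(string))
```

Because the lookahead consumes nothing, 'kkkk:llll:mm' still comes out as 'kkkk:' and 'llll:mm', the same as in the approach above.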
