How to keep the delimiters in place when splitting a string with multiple delimiters?

import re

p = re.compile(r"([?.;])")

ss = re.split(p, 'This is a test? This is a test?good.bad')

for s in ss:
    print(s)

The result is:

This is a test
?
 This is a test
?
good
.
bad

I would like the result to be:

This is a test?
This is a test?
good.
bad

Why does it put each delimiter on a separate line?

EDIT: I think I understand why it does that. The question is how to produce the result I want.
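For reference, the behaviour in the question comes from the capturing group in the pattern: when the pattern contains parentheses, `re.split` also returns the matched delimiters as separate list items. A minimal sketch, using a shortened sample string of my own rather than the one above:

```python
import re

sample = 'a?b.c'  # shortened sample string, chosen only for illustration

# Without a capturing group the delimiters are discarded.
without_group = re.split(r"[?.;]", sample)

# With a capturing group each delimiter comes back as its own list item.
with_group = re.split(r"([?.;])", sample)

print(without_group)
print(with_group)
```

So the delimiters are not lost, they just arrive as separate elements that still have to be glued back onto the preceding text.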

You can join each delimiter back onto the item that precedes it:

ss = re.split(p, 'This is a test? This is a test?good.bad')
result = [a + b for a, b in zip(ss[::2], ss[1::2])] + (ss[-1:] if len(ss) % 2 else [])
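The same pairing idea can be wrapped in a small helper. This is only a sketch: the names `DELIMITERS` and `split_keep_delimiters` are made up for illustration, and dropping an empty trailing piece is my own assumption about what is wanted when the text ends with a delimiter.

```python
import re

DELIMITERS = re.compile(r"([?.;])")  # same pattern as in the question


def split_keep_delimiters(text):
    """Split text, leaving each delimiter attached to the piece before it."""
    parts = DELIMITERS.split(text)
    # With one capturing group, parts alternates text, delimiter, text, ...
    # and always has odd length, so the last item is trailing text.
    joined = [a + b for a, b in zip(parts[::2], parts[1::2])]
    if parts[-1]:  # assumption: skip the empty tail left by a final delimiter
        joined.append(parts[-1])
    return joined


result = split_keep_delimiters('This is a test? This is a test?good.bad')
print(result)
```

Note that the second piece keeps its leading space; call `strip()` on each item if that is not wanted.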

A comment said you must use the pattern p. Here's a way to join the pairs up after a split. Because the split list has an odd number of items, zip_longest pads the shorter slice with None, so the last piece is paired with None instead of being dropped. The None is replaced with an empty string before concatenating.

import re
from itertools import zip_longest

p = re.compile(r"([?.;])")

ss = re.split(p, 'This is a test? This is a test?good.bad')

for a,b in zip_longest(ss[::2],ss[1::2]):
    print(a+(b if b else ''))

Output:

This is a test?
 This is a test?
good.
bad
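If you are not required to keep the pattern p, another option is to split on a zero-width lookbehind, so the delimiter is never removed in the first place. This is a sketch under two assumptions: it swaps in a different pattern from the one in the question, and it needs Python 3.7 or later, where `re.split` can split on empty matches. It also strips whitespace, which gets rid of the leading space seen in the output above.

```python
import re

text = 'This is a test? This is a test?good.bad'

# Zero-width lookbehind: split at the position right after ?, . or ;
# without consuming the delimiter (needs Python 3.7 or later).
pieces = re.split(r"(?<=[?.;])", text)

# Drop surrounding whitespace and any empty piece left by a trailing delimiter.
cleaned = [piece.strip() for piece in pieces if piece.strip()]

for piece in cleaned:
    print(piece)
```

This prints the four lines asked for in the question, with no leading space on the second one.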
