How to make punctuation a separate item when using split()
I'm writing a program that compresses text by representing it as a sequence of numbers, but I don't know how to get the program to recognise punctuation as a separate item in the list.
E.g., in this sentence with a comma, the comma means that the words 'comma,' and 'comma' are different when using split(). I want to have 'comma' ',' 'comma' instead.
I don't want to get rid of the punctuation; I want it as a separate item in the list.
You can use re.split like this:
>>> import re
>>> import string
>>> re.split('([{}])'.format(re.escape(string.punctuation)), "comma,comma")
['comma', ',', 'comma']
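The one-liner above only splits on punctuation, so a full sentence would still need splitting on whitespace too. Here is a rough sketch of how the two could be combined. The function name `tokenize` is just something I made up for illustration, and it assumes every character in `string.punctuation` should become its own item:

```python
import re
import string

# Capturing group, so re.split keeps each punctuation character in the result.
PUNCT_PATTERN = "([{}])".format(re.escape(string.punctuation))

def tokenize(text):
    """Split text on whitespace, then peel punctuation off into separate items.

    Hypothetical helper, not part of any library.
    """
    tokens = []
    for chunk in text.split():
        # re.split yields empty strings when punctuation sits at the start
        # or end of a chunk (e.g. 'comma,'), so drop those.
        tokens.extend(piece for piece in re.split(PUNCT_PATTERN, chunk) if piece)
    return tokens

print(tokenize("in this sentence with a comma, the comma matters"))
```

Be aware that this also breaks up apostrophes and hyphens, so "don't" comes out as 'don', "'", 't'. If that matters for your compression scheme, you would probably want to narrow the set of characters in the pattern.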