Using Python, how do I split on multiple delimiters and keep only one in my output list?

I'm a very green Python user, so go easy on me; the docs haven't helped me understand what I'm missing. Similar to "RE split multiple arguments | (or) returns none python", I need to split a string on multiple delimiters. That question only covers keeping none or keeping both delimiters, but I need to keep only one of them. Note that it dates from 2012, so it likely targets a much earlier version of Python than 3.6, which I'm using.

My data:

line = 'APPLE,ORANGE CHERRY APPLE'

I want a list returned that looks like:

['APPLE', ',', 'ORANGE', 'CHERRY', 'APPLE']

I need to keep the comma so I can remove duplicate components later. I have that part working, if only I could get the list created properly. Here's what I've got:

list = re.split(r'\s|(,)',line)
print(list)

My logic here is to split on space and comma but only keep the comma, which makes sense to me. Nope:

['APPLE', ',', 'ORANGE', None, 'CHERRY', None, 'APPLE']

I've also tried what is mentioned in the question linked above, putting the entire alternation in a capture group:

re.split(r'(\s|(,))',line)

Nope again:

['APPLE', ',', ',', 'ORANGE', ' ', None, 'CHERRY', ' ', None, 'APPLE']

What am I missing? I know it's related to how my capture groups are set up, but I can't figure it out. Thanks in advance!

I suggest using a matching approach with re.findall instead of splitting:

re.findall(r'[^,\s]+|,', line)

See the regex demo. The [^,\s]+|, pattern matches:

  • [^,\s]+ - one or more characters other than a comma or whitespace
  • | - or
  • , - a comma.

See a Python demo:

import re
line = 'APPLE,ORANGE CHERRY APPLE'
l = re.findall(r'[^,\s]+|,', line)
print(l) # => ['APPLE', ',', 'ORANGE', 'CHERRY', 'APPLE']
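
As a rough sketch of how the findall pattern behaves on messier input, the snippet below wraps it in a helper. The name tokenize and the second sample string are illustrative assumptions, not part of the original answer.

```python
import re

# Hypothetical helper name; the pattern is the one from the answer above.
TOKEN_RE = re.compile(r'[^,\s]+|,')

def tokenize(line):
    """Return the words and commas in line, dropping all whitespace."""
    return TOKEN_RE.findall(line)

print(tokenize('APPLE,ORANGE CHERRY APPLE'))
# => ['APPLE', ',', 'ORANGE', 'CHERRY', 'APPLE']

# Runs of whitespace never produce empty strings, and each comma is its own token.
print(tokenize('  APPLE ,,  ORANGE\tCHERRY '))
# => ['APPLE', ',', ',', 'ORANGE', 'CHERRY']
```

Because the pattern matches the pieces to keep rather than the separators, there are no capture groups involved and so no None placeholders to clean up.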

Without using regex, you can do it like this:

res = line.replace(',', ' , ').split()
print(res)

Output:

['APPLE', ',', 'ORANGE', 'CHERRY', 'APPLE']

Filter out the None values:

import re
line = 'APPLE,ORANGE CHERRY APPLE'
print([m for m in re.split(r'\s+|(,)', line) if m])
# => ['APPLE', ',', 'ORANGE', 'CHERRY', 'APPLE']
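
As for why the None entries show up in the first place: re.split includes every capturing group in its result for each match, and a group that did not take part in the match is reported as None. Here is a small sketch of that behaviour. The variable names and the second sample string are only for illustration.

```python
import re

line = 'APPLE,ORANGE CHERRY APPLE'

# Each match contributes group 1 to the result: ',' when the comma branch
# matched, None when the whitespace branch matched.
raw = re.split(r'\s+|(,)', line)
print(raw)
# => ['APPLE', ',', 'ORANGE', None, 'CHERRY', None, 'APPLE']

# Adjacent delimiters (a comma followed by a space) also yield an empty string.
raw_adjacent = re.split(r'\s+|(,)', 'APPLE, ORANGE')
print(raw_adjacent)
# => ['APPLE', ',', '', None, 'ORANGE']

# A truthiness filter drops both None and '' in one pass.
tokens = [piece for piece in raw_adjacent if piece]
print(tokens)
# => ['APPLE', ',', 'ORANGE']
```

Filtering on truthiness rather than on "is not None" is a deliberate choice here, since it also removes the empty strings produced by back-to-back delimiters.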
