Split string without non-characters

I'm trying to split a string that looks like this, for example:

':foo [bar]'

Using str.split() on this of course returns [':foo', '[bar]'].

But how can I make it return just ['foo', 'bar'], containing only these characters?

I don't like regular expressions, but do like Python, so I'd probably write this as

>>> s = ':foo [bar]'
>>> ''.join(c for c in s if c.isalnum() or c.isspace())
'foo bar'
>>> ''.join(c for c in s if c.isalnum() or c.isspace()).split()
['foo', 'bar']

The ''.join idiom is a little strange, I admit, but you can almost read the rest in English: "join every character for the characters in s if the character is alphanumeric or the character is whitespace, and then split that".
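If you need this in more than one place, the same filter-then-split idea could be wrapped in a small helper. This is only a sketch; the name split_words is made up for illustration and is not from any library.

```python
def split_words(s):
    """Keep only alphanumeric and whitespace characters, then split on whitespace."""
    # Drop every character that is neither alphanumeric nor whitespace.
    cleaned = ''.join(c for c in s if c.isalnum() or c.isspace())
    # split() with no argument splits on any run of whitespace.
    return cleaned.split()

print(split_words(':foo [bar]'))  # ['foo', 'bar']
```

Keep in mind that punctuation inside a word is dropped as well, so "a-b" comes back as "ab" rather than as two words.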

Alternatively, if you know which symbols you want to remove, and that they will always be on the outside of each word with the words still separated by spaces, you might try something like

>>> s = ':foo [bar]'
>>> s.split()
[':foo', '[bar]']
>>> [word.strip(':[]') for word in s.split()]
['foo', 'bar']
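If you don't know the exact symbols in advance, one possible variation is to strip everything in string.punctuation (the ASCII punctuation characters) from both ends of each word. A rough sketch, with strip_outer_punctuation being a hypothetical helper name:

```python
import string

def strip_outer_punctuation(s):
    # str.strip(chars) only removes characters from the two ends of each word,
    # so punctuation in the middle of a word is left alone.
    return [word.strip(string.punctuation) for word in s.split()]

print(strip_outer_punctuation(':foo [bar]'))     # ['foo', 'bar']
print(strip_outer_punctuation("(don't) stop!"))  # ["don't", 'stop']
```

A token made up entirely of punctuation would come back as an empty string, so you may want to filter those out afterwards.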

Do str.split() as normal, and then parse each element to remove the non-letters. Something like:

>>> my_string = ':foo [bar]'
>>> parts = [''.join(c for c in s if c.isalpha()) for s in my_string.split()]
>>> parts
['foo', 'bar']

You'll have to pass through the list [':foo', '[bar]'] and strip out all non-letter characters, using regular expressions. Check "Regex replace (in Python) - a simpler way?" for examples and references to documentation.
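As a rough illustration of that idea (assuming "letter" means an ASCII letter, which the question doesn't spell out), each element of the split list can be run through re.sub():

```python
import re

words = ':foo [bar]'.split()  # [':foo', '[bar]']

# [^A-Za-z] matches any single character that is not an ASCII letter;
# replacing each match with '' leaves only the letters.
cleaned = [re.sub(r'[^A-Za-z]', '', word) for word in words]

print(cleaned)  # ['foo', 'bar']
```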

You have to try regular expressions.

Use re.sub() to remove the :, [ and ] characters, and then split the resulting string using whitespace as the delimiter.

>>> st = ':foo [bar]'
>>> import re
>>> new_st = re.sub(r'[\[\]:]','',st)
>>> new_st.split(' ')
['foo', 'bar']
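One thing to watch with split(' '): if the input can contain several spaces in a row, it returns empty strings between them. A different technique from the re.sub() answer above, offered only as a sketch, is to pull the words out directly with re.findall(). The helper name extract_words is hypothetical, and note that \w also matches digits and underscores.

```python
import re

def extract_words(s):
    # \w+ matches each run of letters, digits and underscores,
    # so brackets, colons and any amount of whitespace are skipped.
    return re.findall(r'\w+', s)

print(extract_words(':foo [bar]'))    # ['foo', 'bar']
print(extract_words(':foo   [bar]'))  # ['foo', 'bar']
```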
