Python split without creating blanks
I understand why it is important for split to create blanks, thanks to this question, but sometimes it is necessary not to grab them.
Let's say you parsed some CSS and got the following strings:
s1 = 'background-color:#000;color:#fff;border:1px #ccc dotted;'
s2 = 'color:#000;background-color:#fff;border:1px #333 dotted'
Both are valid CSS, even though the second string lacks a semicolon at the end. When you split the strings, you get the following:
>>> s1.split(';')
['background-color:#000', 'color:#fff', 'border:1px #ccc dotted', '']
>>> s2.split(';')
['color:#000', 'background-color:#fff', 'border:1px #333 dotted']
That extra semicolon creates a blank item in the list. If I want to manipulate the result further, I need to test the beginning and end of each list and remove those items if they are blank, which is not that bad but seems avoidable.
Is there a method that is essentially the same as split but does not include trailing blank items? Or is there simply a way to remove them, just as a string has strip to remove trailing whitespace?
Simply remove the empty items by passing None as the function to filter():
filter(None, s1.split(';'))
Demo:
>>> s1 = 'background-color:#000;color:#fff;border:1px #ccc dotted;'
>>> filter(None, s1.split(';'))
['background-color:#000', 'color:#fff', 'border:1px #ccc dotted']
Calling filter() with None removes all 'empty' or numeric 0 items, that is, anything that would evaluate to false in a boolean context.
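The demo above shows a list coming straight back from filter(), which is Python 2 behaviour. Here is a minimal sketch of the same idea for Python 3, where filter() returns a lazy iterator and needs to be wrapped in list(). The sample list `items` is made up for illustration.

```python
# Python 3: filter() returns an iterator, so wrap it in list() to get a list.
s1 = 'background-color:#000;color:#fff;border:1px #ccc dotted;'
declarations = list(filter(None, s1.split(';')))
print(declarations)

# filter(None, ...) drops every falsy item, not only empty strings.
items = ['a', '', 0, None, 'b', [], 1]  # hypothetical sample data
kept = list(filter(None, items))
print(kept)  # ['a', 'b', 1]
```

Because every falsy value is dropped, this is only safe when items such as 0 or None are not meaningful in the list. The output of str.split contains only strings, so that holds here.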
filter(None, ...) also eats list comprehensions for breakfast (timings from Python 2, where filter() returns a list):
>>> import timeit
>>> timeit.timeit('filter(None, a)', "a = [1, 2, 3, None, 4, 'five', ''] * 100")
9.410392045974731
>>> timeit.timeit('[i for i in a if i]', "a = [1, 2, 3, None, 4, 'five', ''] * 100")
44.9318630695343
You can use a list comprehension to filter out the empty strings, as an empty string is considered False:
>>> s1 = 'background-color:#000;color:#fff;border:1px #ccc dotted;'
>>> [i for i in s1.split(';') if i]
['background-color:#000', 'color:#fff', 'border:1px #ccc dotted']
Alternatively, you can rstrip() the semicolon first:
>>> s1.rstrip(';').split(';')
['background-color:#000', 'color:#fff', 'border:1px #ccc dotted']
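The two approaches in this answer are not quite interchangeable. rstrip() only touches the end of the string, while the list comprehension drops empty items anywhere. A small sketch of the difference, using a made-up string `s3` with an empty declaration in the middle:

```python
# Hypothetical input: note the doubled semicolon in the middle.
s3 = 'color:#000;;border:1px #333 dotted;'

# rstrip() removes only trailing semicolons, so the interior blank survives.
stripped_only = s3.rstrip(';').split(';')
print(stripped_only)  # ['color:#000', '', 'border:1px #333 dotted']

# The list comprehension drops every empty item, wherever it appears.
filtered = [part for part in s3.split(';') if part]
print(filtered)  # ['color:#000', 'border:1px #333 dotted']
```

Which result is wanted depends on whether interior blanks carry meaning in the data being parsed.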
Apply str.strip to the string before doing the split:
>>> s1 = 'background-color:#000;color:#fff;border:1px #ccc dotted;'
...
>>> s1.strip(';').split(';')
['background-color:#000', 'color:#fff', 'border:1px #ccc dotted']
This works for both leading and trailing ';':
>>> s1 = ';background-color:#000;color:#fff;border:1px #ccc dotted;'
>>> s1.strip(';').split(';')
['background-color:#000', 'color:#fff', 'border:1px #ccc dotted']
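One edge case may be worth keeping in mind with strip-then-split. If the string is empty or consists only of semicolons, split still returns a list containing one empty string. The sketch below compares this with a filtering list comprehension; the helper names are hypothetical.

```python
def strip_then_split(s):
    # Strip leading/trailing semicolons, then split on the rest.
    return s.strip(';').split(';')

def split_then_filter(s):
    # Split first, then drop every empty item.
    return [part for part in s.split(';') if part]

for sample in ('', ';', ';;'):
    print(repr(sample), strip_then_split(sample), split_then_filter(sample))
# Each sample gives [''] for strip_then_split and [] for split_then_filter.
```

If an empty input is possible, the caller would need to handle the `['']` result separately when using strip before split.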
I am not sure why you would want to avoid this, as a strip before the split is going to be faster than both the list comprehension (LC) and filter:
>>> s1 = ';background-color:#000;color:#fff;border:1px #ccc dotted;'*1000
>>> %timeit filter(None, s1.split(';'))
1000 loops, best of 3: 638 us per loop
>>> %timeit s1.strip(';').split(';')
1000 loops, best of 3: 570 us per loop
>>> %timeit [i for i in s1.split(';') if i]
100 loops, best of 3: 931 us per loop