
Remove empty strings from tuples inside a list

Right now I have three lists that are generated by re.findall, and I am trying to remove the empty strings from the tuples inside each list. The numeric strings should be converted to integers in the process too:

Got: [('', '', '1', '1')]

Expected: [(1, 1)]

Got: [('', '', '20', '500'), ('21', 'failed', '', '')]

Expected: [(20, 500), (21, 'failed')]

Got: [('3', 'failed', '', ''), ('', '', '48', '23'), ('', '', '96', '0')]

Expected: [(3, 'failed'), (48, 23), (96, 0)]

Any ideas?

A list comprehension wrapped around a tuple constructor:

>>> lst = [('', '', '20', '500'), ('21', 'failed', '', '')]
>>> [(tuple(int(x) if x.isdigit() else x for x in _ if x)) for _ in lst]
[(20, 500), (21, 'failed')]

For each tuple ( _ ) in lst, construct a tuple from a generator expression. The tuple constructor on its own is:

tuple(int(x) if x.isdigit() else x for x in _ if x)

This may look confusing, so I'll break it down. For each string x in the tuple _ (which is one of the tuples in lst), the generator expression decides what goes into the new tuple. The trailing if x checks that the string is not empty. (An empty string is falsy.) If x is non-empty, the generator expression yields either x or int(x), depending on whether x consists only of digits. (Trying to turn a non-numeric string into an integer raises a ValueError.)
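The three building blocks described above can be checked separately. This is only a small illustration, not part of the original answer; the '-5' case is added here to show that isdigit() does not recognise a leading sign, so such strings would be kept as strings.

```python
# An empty string is falsy, so "if x" skips it; any non-empty string is truthy.
print(bool(''), bool('0'))  # False True

# str.isdigit() is True only when every character is a digit.
print('20'.isdigit(), 'failed'.isdigit(), '-5'.isdigit())  # True False False

# int() raises ValueError for a non-numeric string.
try:
    int('failed')
except ValueError as exc:
    print('ValueError:', exc)
```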

For each tuple _ in lst, this builds a new tuple that matches the original, except that the empty (falsy) strings are filtered out and any digit strings are converted to ints.
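To try the comprehension on all three lists from the question, it can be wrapped in a small helper. This is a minimal sketch; the name clean_rows is made up for illustration and is not from the original answer.

```python
def clean_rows(rows):
    """Drop empty strings from each tuple and convert digit-only strings to int."""
    return [tuple(int(x) if x.isdigit() else x for x in row if x) for row in rows]


print(clean_rows([('', '', '1', '1')]))
# [(1, 1)]
print(clean_rows([('', '', '20', '500'), ('21', 'failed', '', '')]))
# [(20, 500), (21, 'failed')]
print(clean_rows([('3', 'failed', '', ''), ('', '', '48', '23'), ('', '', '96', '0')]))
# [(3, 'failed'), (48, 23), (96, 0)]
```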

The code above is equivalent to:

new_lst = []

for _ in lst: # For each tuple in lst
    temp_tuple = () # Start a fresh tuple for this row
    for x in _: # For each string in tuple
        if x: # Only add to tuple if string is not empty
            if x.isdigit(): # If x is a digit in string form
                temp_tuple += (int(x),) # Convert to int
            else:
                temp_tuple += (x,) # Keep string
    new_lst.append(temp_tuple)

How about this:

def sanitize(t):
    for i in t:
        try:
            yield int(i)
        except ValueError:
            yield i

inputs = [('3', 'failed', '', ''), ('', '', '48', '23'), ('', '', '96', '0')]
# list() is needed on Python 3, where map returns an iterator rather than a list
list(map(tuple, map(sanitize, [filter(None, i) for i in inputs])))

This gives:

[(3, 'failed'), (48, 23), (96, 0)]

filter is a function that operates on a sequence and, when called with None as its first argument, returns only the "truthy" elements. Empty strings are falsy. map is another function that takes a sequence and runs each element of that sequence through a given function, in this case sanitize, which converts a string to an int if it can and otherwise returns the string unchanged.
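As a small illustration of the two built-ins on their own (assuming Python 3, where both return lazy iterators that have to be materialized with list() or tuple()):

```python
row = ('', '', '48', '23')

# filter(None, ...) keeps only the truthy items, so the empty strings are dropped.
non_empty = list(filter(None, row))
print(non_empty)  # ['48', '23']

# map applies the given function to every item of the sequence.
as_ints = list(map(int, non_empty))
print(as_ints)  # [48, 23]
```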

We use yield rather than return in the sanitize function as an easy way to hand yet another sequence to the next map call. Alternatively, we could build a list inside the function and return it.
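The list-building alternative might look like the sketch below. The name sanitize_to_list is hypothetical, and a list comprehension stands in for the nested map calls; the conversion logic is the same try/except as in sanitize.

```python
def sanitize_to_list(t):
    """Return a list in which every string that int() accepts is converted."""
    result = []
    for i in t:
        try:
            result.append(int(i))
        except ValueError:
            result.append(i)  # not an integer literal, keep the string
    return result


inputs = [('3', 'failed', '', ''), ('', '', '48', '23'), ('', '', '96', '0')]
cleaned = [tuple(sanitize_to_list(filter(None, row))) for row in inputs]
print(cleaned)
# [(3, 'failed'), (48, 23), (96, 0)]
```

Unlike the isdigit() check, int() also accepts signed strings such as '-5', so the two approaches can differ on inputs of that kind.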
