
Extract longest strings from sublists within a list (Python)

I have a list of sublists, and each sublist contains strings.

The strings usually have different lengths, but some can be the same length.

Below is an example of the list:

sequences = [['aaa'],['aaaa','bb'],[],['aaaaaa','bb','cccccc']]

I want to extract the longest string from each sublist, and if two strings are equally long, take both of them:

example_output = [['aaa'],['aaaa'],[],['aaaaaa','cccccc']]

Usually I would set a threshold in a for loop: if a string is longer than the threshold, append it to a temporary list, and after each iteration append that list to the result. But I don't have a threshold value in this case.

If possible, I would like to avoid using lambda and helper functions, since this will sit inside another function.

You can use the length of the longest string seen so far as the threshold (maxlen in the code below):

def get_longest(seq):
    maxlen = -1
    ret = []
    for el in seq:
        if len(el) > maxlen:
            ret = [el]
            maxlen = len(el)
        elif len(el) == maxlen:
            ret.append(el)
    return ret

sequences = [['aaa'],['aaaa','bb'],[],['aaaaaa','bb','cccccc']]
example_output = list(map(get_longest, sequences))
print(example_output)

This produces:

[['aaa'], ['aaaa'], [], ['aaaaaa', 'cccccc']]
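
Since the question asks to avoid helper functions, the same running-maximum idea can also be written inline as two nested loops. This is only a sketch of that variant; the names longest_per_sublist and longest are made up for illustration and do not come from the answer above.

```python
sequences = [['aaa'], ['aaaa', 'bb'], [], ['aaaaaa', 'bb', 'cccccc']]

longest_per_sublist = []
for sublist in sequences:
    maxlen = -1      # lower than any real length, so the first string always wins
    longest = []     # strings of length maxlen seen so far in this sublist
    for s in sublist:
        if len(s) > maxlen:
            # strictly longer string: discard the earlier, shorter candidates
            longest = [s]
            maxlen = len(s)
        elif len(s) == maxlen:
            # tie with the current maximum: keep it as well
            longest.append(s)
    longest_per_sublist.append(longest)

print(longest_per_sublist)
# [['aaa'], ['aaaa'], [], ['aaaaaa', 'cccccc']]
```

An empty sublist never enters the inner loop, so it contributes an empty list without any special casing.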

This answer is not the most efficient, but it is easy to understand.

You can first compute the maximum length of each sublist (here with a generator expression), then extract the strings with those lengths.

lengths = (max(len(s) for s in sublist) if sublist else 0 for sublist in sequences)
[[s for s in sublist if len(s) == l] for l, sublist in zip(lengths, sequences)]
# -> [['aaa'], ['aaaa'], [], ['aaaaaa', 'cccccc']]

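In Python 2, itertools.izip is preferable to zip in this case because it does not build an intermediate list. In Python 3, the built-in zip is already lazy and itertools.izip no longer exists.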
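
On Python 3 the two-step approach might look like the sketch below. It assumes Python 3.4 or later, where max accepts a default argument, so the empty sublist needs no separate conditional; the name result is only illustrative.

```python
sequences = [['aaa'], ['aaaa', 'bb'], [], ['aaaaaa', 'bb', 'cccccc']]

# Step 1: lazily compute the maximum string length of each sublist.
# default=0 is returned for an empty sublist instead of raising ValueError.
lengths = (max(map(len, sublist), default=0) for sublist in sequences)

# Step 2: keep only the strings whose length equals that maximum.
# The built-in zip pulls one length at a time from the generator.
result = [[s for s in sublist if len(s) == n] for n, sublist in zip(lengths, sequences)]

print(result)
# [['aaa'], ['aaaa'], [], ['aaaaaa', 'cccccc']]
```

Note that lengths is a generator and can be consumed only once; build it as a list instead if the lengths are needed again later.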

I'll give it a shot with the following (a bit obscure :)):

example_output = [list(filter(lambda x: len(x)==len(max(sub_lst, key=len)), sub_lst)) for sub_lst in sequences]
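
The one-liner above calls max(sub_lst, key=len) again for every element and relies on a lambda, which the question hoped to avoid. A possible lambda-free variant, given here only as a sketch, binds the longest length once per sublist using a one-element inner loop (the name longest is illustrative). It assumes Python 3.4 or later for the default argument of max.

```python
sequences = [['aaa'], ['aaaa', 'bb'], [], ['aaaaaa', 'bb', 'cccccc']]

example_output = [
    [s for s in sub_lst if len(s) == longest]
    for sub_lst in sequences
    # one-element list: evaluates the longest length exactly once per sublist;
    # default='' makes an empty sublist yield a length of 0
    for longest in [len(max(sub_lst, key=len, default=''))]
]

print(example_output)
# [['aaa'], ['aaaa'], [], ['aaaaaa', 'cccccc']]
```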
