
Select items from a list based on the length of each item

I have a big list of items (the list may sometimes hold 1 million items). I want to filter the elements of this list based on the length of each item, i.e. I want to collect the items that are either shorter than 7 characters or longer than 24 characters. The code I wrote is:

# returnNumbers is the list that holds a million items
invalidLengthNumbers = []
for num in returnNumbers:
    if len(num) < 7 or len(num) > 24:
        invalidLengthNumbers.append(num)

I'm not sure whether there is a better way of doing this, as going through 1 million items takes time.

You want to take an iterative approach, really.

Your code can be replaced with a list comprehension:

invalidLengthNumbers = [num for num in returnNumbers if len(num) < 7 or len(num) > 24]

or, shorter and with only one len() call per item, by taking advantage of comparison chaining:

invalidLengthNumbers = [num for num in returnNumbers if not 7 <= len(num) <= 24]

but that'll only be marginally faster.
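
As a quick sanity check that the two comprehensions select the same items, here is a minimal sketch. The sample strings are made up for illustration and are not from the question; they sit right at the boundaries (lengths 6, 7, 24 and 25).

```python
# Hypothetical sample data; the real list would hold up to a million strings.
returnNumbers = ["1" * 6, "1" * 7, "1" * 24, "1" * 25]

# Two len() calls per item.
with_or = [num for num in returnNumbers if len(num) < 7 or len(num) > 24]

# One len() call per item, using comparison chaining.
with_chain = [num for num in returnNumbers if not 7 <= len(num) <= 24]

# Both forms keep only the items outside the 7..24 range.
print(with_or == with_chain)
```

Lengths 7 and 24 are inside the accepted range, so only the 6- and 25-character strings end up in the result.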

If you need to loop over invalidLengthNumbers later, don't use an intermediary list. Loop and filter over returnNumbers directly. Perhaps even returnNumbers itself can be replaced by a generator, and filtering that generator can be done iteratively too.

def produceReturnNumbers():
    for somevalue in someprocess:
        yield some_other_value_based_on_somevalue

from itertools import ifilter  # Python 2 only; in Python 3 the built-in filter() is already lazy

for invalid in ifilter(lambda n: not 7 <= len(n) <= 24, produceReturnNumbers()):
    # do something with invalid
    pass

Now you no longer have a list of 1 million items. You have a generator that will produce 1 million items as needed without holding them all in memory.
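
For a version that runs as-is, here is a rough sketch of the same idea in Python 3, where a generator expression (or the built-in filter()) plays the role of ifilter. The names produce_return_numbers and invalid_length, and the way the sample strings are generated, are assumptions made for illustration; they stand in for whatever process really produces the numbers.

```python
def produce_return_numbers(count):
    """Hypothetical stand-in for the real source: yields strings of length 0..29."""
    for i in range(count):
        yield "9" * (i % 30)


def invalid_length(numbers, low=7, high=24):
    """Lazily yield only the items whose length falls outside low..high."""
    return (n for n in numbers if not low <= len(n) <= high)


# Count the invalid items one at a time; no million-item list is ever built.
invalid_count = sum(1 for _ in invalid_length(produce_return_numbers(1000000)))
print(invalid_count)
```

Because both stages are generators, each item is produced, tested and discarded before the next one is created, so memory use stays flat regardless of how many items flow through.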
