
How do I remove a list from a list of lists if an item is non-numeric or greater than a specific value?

I need to complete an assignment which involves cleansing a list of lists in Python. If a sub-list contains an item that is non-numeric or numeric but greater than 20, I need to remove the sub-list and add it to a separate list.

My current code correctly removes some sub-lists but not others. I think it is because of two consecutive sub-lists with errors but I haven't been able to fix this. My code:

datalist = [['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['1', '30', '2', '5', '7'], ['14', '1', '2', '9', '3'], ['6', '9', '16', '0', ''], ['14', '11', 'forteen', '8', '20'], ['12', '11', '8', '15', '7'], ['18', '9', '9', '22', '4'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0'], ['einundzwanzig', '14', '1', '2', '4']]

invalidList = []

def validate(myList): #non-numeric values or values greater than 20 must be removed from myList and added to invalidList
    for lst in myList: # check each list
            for item in lst: # check element in each list
                try:
                    val = int(item)
                    if val >20:
                        raise ValueError
                except ValueError:
                    invalidList.append(lst)
                    myList.remove(lst)

    return myList

The problematic sublist is:

['14', '11', 'forteen', '8', '20']

Actual output:

>>> print(validate(datalist)) # this should be the cleansed list
[['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['14', '1', '2', '9', '3'], ['14', '11', 'forteen', '8', '20'], ['12', '11', '8', '15', '7'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0']]

>>> print(invalidList)
[['1', '30', '2', '5', '7'], ['6', '9', '16', '0', ''], ['18', '9', '9', '22', '4'], ['einundzwanzig', '14', '1', '2', '4']]

Expected output:

>>> print(validate(datalist)) # this should be the cleansed list
[['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['14', '1', '2', '9', '3'], ['12', '11', '8', '15', '7'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0']]

>>> print(invalidList)
[['1', '30', '2', '5', '7'], ['6', '9', '16', '0', ''], ['14', '11', 'forteen', '8', '20'], ['18', '9', '9', '22', '4'], ['einundzwanzig', '14', '1', '2', '4']]

Thanks in advance :)

The problem is that you modify the list while you are iterating over it, which leads to unexpected results. I'd advise against removing the element inside the loop: just mark it for deletion and remove it right before returning.

This is an example way of doing that, without modifying much of your code:

datalist = [['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['1', '30', '2', '5', '7'], ['14', '1', '2', '9', '3'], ['6', '9', '16', '0', ''], ['14', '11', 'forteen', '8', '20'], ['12', '11', '8', '15', '7'], ['18', '9', '9', '22', '4'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0'], ['einundzwanzig', '14', '1', '2', '4']]

invalidList = []

def validate(myList): # non-numeric values or values greater than 20 must be removed from myList and added to invalidList
    for lst in myList: # check each list
        for item in lst: # check each element in the list
            try:
                val = int(item)
                if val > 20:
                    raise ValueError
            except ValueError:
                invalidList.append(lst[:]) # store a copy: the next line empties the original list object
                lst.clear() # mark the sub-list for deletion by emptying it
                break # no need to check the remaining items
    return [elem for elem in myList if elem] # empty lists evaluate to False, so they are dropped

Returned values:

>>> validate(datalist)
[['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['14', '1', '2', '9', '3'], ['12', '11', '8', '15', '7'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0']]
>>> invalidList
[['1', '30', '2', '5', '7'], ['6', '9', '16', '0', ''], ['14', '11', 'forteen', '8', '20'], ['18', '9', '9', '22', '4'], ['einundzwanzig', '14', '1', '2', '4']]
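
If the caller's list itself has to end up cleansed (as in the original attempt), one possible variation is to build the list of kept sub-lists first and then write it back with slice assignment. This is only a sketch: the names validate_in_place, rejected, limit and sample are made up for illustration and are not part of the question.

```python
def validate_in_place(rows, rejected, limit=20):
    """Move sub-lists with a non-integer or too-large item from rows to rejected."""
    kept = []
    for row in rows:
        try:
            # all() stops at the first item above the limit
            ok = all(int(item) <= limit for item in row)
        except ValueError:  # int() rejects strings such as '' or 'forteen'
            ok = False
        (kept if ok else rejected).append(row)
    rows[:] = kept  # slice assignment updates the caller's list object
    return rows


# Shortened sample data (hypothetical, modelled on the question's datalist)
sample = [['1', '30'], ['6', ''], ['14', 'forteen'], ['5', '20']]
rejected = []
validate_in_place(sample, rejected)
print(sample)    # [['5', '20']]
print(rejected)  # [['1', '30'], ['6', ''], ['14', 'forteen']]
```

Nothing is removed while the loop is running, so no sub-list can be skipped, and the sub-lists themselves are left untouched.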

Why is it happening?

When you remove an item from the middle of a list, all later elements shift one position to the left.

So after a removal, the next element moves into the removed element's position, but the loop still advances to the following position.

When your list contains two invalid sub-lists in a row, the second one is always skipped because it has moved into the position the loop just left, as annotated below:

[['16', '10', '8', '3', '7'], #ok
 ['8', '9', '19', '20', '4'], #ok
 ['6', '8', '16', '5', '0'], #ok
 ['1', '30', '2', '5', '7'], #removed
 ['14', '1', '2', '9', '3'], #skipped! but ok
 ['6', '9', '16', '0', ''], #removed
 ['14', '11', 'forteen', '8', '20'], #skipped! but should've been removed
 ['12', '11', '8', '15', '7'], #ok
 ['18', '9', '9', '22', '4'], #removed
 ['1', '3', '14', '18', '20'], #skipped! but ok
 ['5', '3', '19', '20', '0'], #ok
 ['einundzwanzig', '14', '1', '2', '4']] #removed
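
The same effect can be reproduced on a much smaller list. The following is a minimal sketch with made-up data (the names items_buggy, items_fixed and visited are only for illustration); it also shows that looping over a shallow copy avoids the skip.

```python
# Removing from the list that is being iterated: 'bad2' is never visited.
items_buggy = ['a', 'bad1', 'bad2', 'b']
visited = []
for item in items_buggy:
    visited.append(item)
    if item.startswith('bad'):
        items_buggy.remove(item)
print(visited)      # ['a', 'bad1', 'b']
print(items_buggy)  # ['a', 'bad2', 'b']

# Iterating over a shallow copy (items_fixed[:]) while removing from the original.
items_fixed = ['a', 'bad1', 'bad2', 'b']
for item in items_fixed[:]:
    if item.startswith('bad'):
        items_fixed.remove(item)
print(items_fixed)  # ['a', 'b']
```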

Here is one approach using any(). It builds the valid and invalid lists separately instead of removing items during iteration.

Example:

datalist = [['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['1', '30', '2', '5', '7'], ['14', '1', '2', '9', '3'], ['6', '9', '16', '0', ''], ['14', '11', 'forteen', '8', '20'], ['12', '11', '8', '15', '7'], ['18', '9', '9', '22', '4'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0'], ['einundzwanzig', '14', '1', '2', '4']]

def validate(myList):
    invalidList = []
    validList = []
    for i in myList:
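        # isdigit() is False for '' and for strings with any non-digit character ('forteen', '1a'), so int() is only reached for digit strings
        if any(not j.isdigit() or int(j) > 20 for j in i):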
            invalidList.append(i)
        else:
            validList.append(i)

    return validList, invalidList

print(validate(datalist))  

Output:

([['16', '10', '8', '3', '7'],
  ['8', '9', '19', '20', '4'],
  ['6', '8', '16', '5', '0'],
  ['14', '1', '2', '9', '3'],
  ['12', '11', '8', '15', '7'],
  ['1', '3', '14', '18', '20'],
  ['5', '3', '19', '20', '0']],

 [['1', '30', '2', '5', '7'],
  ['6', '9', '16', '0', ''],
  ['14', '11', 'forteen', '8', '20'],
  ['18', '9', '9', '22', '4'],
  ['einundzwanzig', '14', '1', '2', '4']])

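You can build the valid list with a one-liner. The isdigit() check has to come first: because "and" short-circuits, int() is only called on sub-lists that consist entirely of digit strings.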

validlist = [sublist for sublist in datalist if all(i.isdigit() for i in sublist) and max([int(i) for i in sublist])<=20]

Output:

[['16', '10', '8', '3', '7'], 
['8', '9', '19', '20', '4'],
['6', '8', '16', '5', '0'],
['14', '1', '2', '9', '3'],
['12', '11', '8', '15', '7'],
['1', '3', '14', '18', '20'],
['5', '3', '19', '20', '0']]
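
The one-liner only produces the valid list. One way to get the rejected sub-lists as well is to move the condition into a small predicate and use it in two comprehensions. This is a sketch, not part of the original answer: is_clean, limit, sample and invalidlist are hypothetical names, and sample is a shortened stand-in for datalist.

```python
def is_clean(sublist, limit=20):
    """True if every item is a digit string whose value is at most limit."""
    # isdigit() is checked first, so int() never sees '' or 'forteen'
    return all(s.isdigit() and int(s) <= limit for s in sublist)


# Shortened sample data (hypothetical, modelled on the question's datalist)
sample = [['16', '10'], ['1', '30'], ['6', ''], ['14', 'forteen'], ['5', '20']]

validlist = [s for s in sample if is_clean(s)]
invalidlist = [s for s in sample if not is_clean(s)]

print(validlist)    # [['16', '10'], ['5', '20']]
print(invalidlist)  # [['1', '30'], ['6', ''], ['14', 'forteen']]
```

Because the predicate uses all() rather than max(), an empty sub-list does not raise an error; it simply counts as valid.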

