I need to complete an assignment which involves cleansing a list of lists in Python. If a sub-list contains an item that is non-numeric or numeric but greater than 20, I need to remove the sub-list and add it to a separate list.
My current code correctly removes some sub-lists but not others. I think it is because of two consecutive sub-lists with errors but I haven't been able to fix this. My code:
datalist = [['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['1', '30', '2', '5', '7'], ['14', '1', '2', '9', '3'], ['6', '9', '16', '0', ''], ['14', '11', 'forteen', '8', '20'], ['12', '11', '8', '15', '7'], ['18', '9', '9', '22', '4'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0'], ['einundzwanzig', '14', '1', '2', '4']]
invalidList = []
def validate(myList): #non-numeric values or values greater than 20 must be removed from myList and added to invalidList
    for lst in myList: # check each list
        for item in lst: # check element in each list
            try:
                val = int(item)
                if val >20:
                    raise ValueError
            except ValueError:
                invalidList.append(lst)
                myList.remove(lst)
    return myList
The problematic sublist is:
['14', '11', 'forteen', '8', '20']
Actual output:
>>> print(validate(datalist)) # this should be the cleansed list
[['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['14', '1', '2', '9', '3'], ['14', '11', 'forteen', '8', '20'], ['12', '11', '8', '15', '7'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0']]
>>> print(invalidList)
[['1', '30', '2', '5', '7'], ['6', '9', '16', '0', ''], ['18', '9', '9', '22', '4'], ['einundzwanzig', '14', '1', '2', '4']]
Expected output:
>>> print(validate(datalist)) # this should be the cleansed list
[['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['14', '1', '2', '9', '3'], ['12', '11', '8', '15', '7'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0']]
>>> print(invalidList)
[['1', '30', '2', '5', '7'], ['6', '9', '16', '0', ''], ['14', '11', 'forteen', '8', '20'],['18', '9', '9', '22', '4'], ['einundzwanzig', '14', '1', '2', '4']]
Thanks in advance :)
The problem is that you modify your list while looping over it, which leads to unexpected results. I'd advise against removing the element inside the loop - just "mark it" for deletion and remove it just before returning.
This is one way of doing that, without modifying much of your code:
datalist = [['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['1', '30', '2', '5', '7'], ['14', '1', '2', '9', '3'], ['6', '9', '16', '0', ''], ['14', '11', 'forteen', '8', '20'], ['12', '11', '8', '15', '7'], ['18', '9', '9', '22', '4'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0'], ['einundzwanzig', '14', '1', '2', '4']]
invalidList = []
def validate(myList): #non-numeric values or values greater than 20 must be removed from myList and added to invalidList
    for lst in myList: # check each list
        for item in lst: # check element in each list
            try:
                val = int(item)
                if val >20:
                    raise ValueError
            except ValueError:
                invalidList.append(lst[:]) # copy the invalid list - otherwise the next line would empty it too, because they share the same list object
                lst.clear() # this empties the invalid list in place, marking it for deletion
    return [elem for elem in myList if elem] # an empty list evaluates to False
Returned values:
>>> validate(datalist)
[['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['14', '1', '2', '9', '3'], ['12', '11', '8', '15', '7'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0']]
>>> invalidList
[['1', '30', '2', '5', '7'], ['6', '9', '16', '0', ''], ['14', '11', 'forteen', '8', '20'], ['18', '9', '9', '22', '4'], ['einundzwanzig', '14', '1', '2', '4']]
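If you would rather keep removing sub-lists in place, another option is to loop over a shallow copy of the list, so that removals from the original cannot disturb the iteration. The following is only a sketch under that assumption: the shortened sample data, the extra invalid_list parameter and the added break are mine, not part of the original code.

```python
def validate(my_list, invalid_list):
    # my_list[:] is a shallow copy; we iterate over the copy
    # and remove from the original.
    for lst in my_list[:]:
        for item in lst:
            try:
                if int(item) > 20:
                    raise ValueError
            except ValueError:
                invalid_list.append(lst)
                my_list.remove(lst)
                break  # one bad item is enough, stop checking this sub-list
    return my_list


# Shortened, hypothetical sample with three invalid sub-lists in a row
data = [['16', '10'], ['1', '30'], ['6', ''], ['14', 'forteen'], ['12', '11']]
invalid = []
print(validate(data, invalid))  # [['16', '10'], ['12', '11']]
print(invalid)  # [['1', '30'], ['6', ''], ['14', 'forteen']]
```

The break also prevents a second remove() call on a sub-list that contains more than one bad item.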
When you remove an item from the middle of a list, all later elements shift left by one position.
This means that after a removal, the next element moves into the removed element's slot... but the loop still advances to the following position.
When your list contains two invalid elements in a row, the second one is always skipped because it moves into the slot the loop has just finished with, as annotated below:
[['16', '10', '8', '3', '7'], #ok
['8', '9', '19', '20', '4'], #ok
['6', '8', '16', '5', '0'], #ok
['1', '30', '2', '5', '7'], #removed
['14', '1', '2', '9', '3'], #skipped! but ok
['6', '9', '16', '0', ''], #removed
['14', '11', 'forteen', '8', '20'], #skipped! but should've been removed
['12', '11', '8', '15', '7'], #ok
['18', '9', '9', '22', '4'], #removed
['1', '3', '14', '18', '20'], #skipped! but ok
['5', '3', '19', '20', '0'], #ok
['einundzwanzig', '14', '1', '2', '4']] #removed
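The skipping shown above can be reproduced with a much smaller list. This is just an illustrative sketch; the labels 'a', 'BAD1', 'BAD2' and 'b' are made up for the demonstration.

```python
items = ['a', 'BAD1', 'BAD2', 'b']
visited = []

for item in items:
    visited.append(item)          # record what the loop actually looks at
    if item.startswith('BAD'):
        items.remove(item)        # later elements shift left by one

print(visited)  # ['a', 'BAD1', 'b']
print(items)    # ['a', 'BAD2', 'b']
```

'BAD2' is never visited: it slides into index 1 right after 'BAD1' is removed, while the loop moves on to index 2.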
This is one approach using any().
Example:
datalist = [['16', '10', '8', '3', '7'], ['8', '9', '19', '20', '4'], ['6', '8', '16', '5', '0'], ['1', '30', '2', '5', '7'], ['14', '1', '2', '9', '3'], ['6', '9', '16', '0', ''], ['14', '11', 'forteen', '8', '20'], ['12', '11', '8', '15', '7'], ['18', '9', '9', '22', '4'], ['1', '3', '14', '18', '20'], ['5', '3', '19', '20', '0'], ['einundzwanzig', '14', '1', '2', '4']]
def validate(myList):
    invalidList = []
    validList = []
    for i in myList:
        if any(j=='' or j.isalpha() or int(j) > 20 for j in i):
            invalidList.append(i)
        else:
            validList.append(i)
    return validList, invalidList

print(validate(datalist))
Output:
([['16', '10', '8', '3', '7'],
['8', '9', '19', '20', '4'],
['6', '8', '16', '5', '0'],
['14', '1', '2', '9', '3'],
['12', '11', '8', '15', '7'],
['1', '3', '14', '18', '20'],
['5', '3', '19', '20', '0']],
[['1', '30', '2', '5', '7'],
['6', '9', '16', '0', ''],
['14', '11', 'forteen', '8', '20'],
['18', '9', '9', '22', '4'],
['einundzwanzig', '14', '1', '2', '4']])
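One caveat with the check above: a mixed token such as '1a' is neither empty nor purely alphabetic, so it reaches int(j) and raises a ValueError. If your data could contain such values, a variant that lets int() decide and catches the error may be safer. This is a sketch, and the is_valid helper and the sample data are hypothetical names of mine.

```python
def is_valid(sublist):
    """Return True if every item parses as an int no greater than 20."""
    try:
        return all(int(item) <= 20 for item in sublist)
    except ValueError:  # raised by int() for '', 'forteen', '1a', ...
        return False


def validate(my_list):
    valid, invalid = [], []
    for sublist in my_list:
        (valid if is_valid(sublist) else invalid).append(sublist)
    return valid, invalid


sample = [['1', '20'], ['1a', '3'], ['', '2'], ['21', '0']]
result = validate(sample)
print(result)
# ([['1', '20']], [['1a', '3'], ['', '2'], ['21', '0']])
```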
You can achieve what you want with a one-liner as follows:
validlist = [sublist for sublist in datalist if all(i.isdigit() for i in sublist) and max([int(i) for i in sublist])<=20]
Output:
[['16', '10', '8', '3', '7'],
['8', '9', '19', '20', '4'],
['6', '8', '16', '5', '0'],
['14', '1', '2', '9', '3'],
['12', '11', '8', '15', '7'],
['1', '3', '14', '18', '20'],
['5', '3', '19', '20', '0']]
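Note that the one-liner only builds the cleansed list. If you also need the rejected sub-lists, as in the question, the same condition can be pulled into a small predicate and used twice. This is a sketch; the name ok and the shortened sample data are assumptions of mine.

```python
def ok(sublist):
    # isdigit() is checked first, so int() is only called on digit strings
    return all(i.isdigit() and int(i) <= 20 for i in sublist)


sample = [['8', '20'], ['1', '30'], ['6', ''], ['forteen', '8']]
validlist = [s for s in sample if ok(s)]
invalidlist = [s for s in sample if not ok(s)]

print(validlist)    # [['8', '20']]
print(invalidlist)  # [['1', '30'], ['6', ''], ['forteen', '8']]
```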