I have a nested list:
l = [['GILTI', 'was', 'intended', 'to', 'to', 'stifle', 'multinationals', 'was'],
     ['like', 'technology', 'and', 'and', 'pharmaceutical', 'companies', 'like']]
How can I detect two identical consecutive elements and delete one of them, without using set or a similar operation? This is the desired output:
l = [['GILTI', 'was', 'intended', 'to', 'stifle', 'multinationals', 'was'],
     ['like', 'technology', 'and', 'pharmaceutical', 'companies', 'like']]
I tried using itertools groupby like this:
from itertools import groupby
[i[0] for i in groupby(l)]
I also tried an OrderedDict:
from collections import OrderedDict
temp_lis = []
for x in l:
    temp_lis.append(list(OrderedDict.fromkeys(x)))
temp_lis
Out:
[['GILTI', 'was', 'intended', 'to', 'stifle', 'multinationals'],
['like', 'technology', 'and', 'pharmaceutical', 'companies']]
The second solution might look like it works. However, it is wrong because it also deletes non-consecutive repeated elements (e.g. 'was' and 'like'). How can I get the desired output shown above?
You can use groupby like so:
[[k for k, g in groupby(x)] for x in l]
This keeps one element from each run of identical consecutive elements.
If you instead need to remove consecutively repeated elements completely, use:
[[k for k, g in groupby(x) if len(list(g)) == 1] for x in l]
Example:
from itertools import groupby
l = [['GILTI', 'was', 'intended', 'to','to', 'stifle', 'multinationals', 'was'],
['like' ,'technology', 'and', 'and','pharmaceutical', 'companies', 'like']]
print([[k for k, g in groupby(x)] for x in l])
# [['GILTI', 'was', 'intended', 'to', 'stifle', 'multinationals', 'was'],
# ['like', 'technology', 'and', 'pharmaceutical', 'companies', 'like']]
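For comparison, here is a small sketch of the second variant mentioned above, which drops every run of consecutive duplicates entirely instead of keeping one copy. The names `data` and `removed` are only illustrative stand-ins, not part of the original answer.

```python
from itertools import groupby

# Same sample data as in the question (the name `data` is an assumption).
data = [['GILTI', 'was', 'intended', 'to', 'to', 'stifle', 'multinationals', 'was'],
        ['like', 'technology', 'and', 'and', 'pharmaceutical', 'companies', 'like']]

# Keep a key only when its run of consecutive equal elements has length 1.
removed = [[k for k, g in groupby(x) if len(list(g)) == 1] for x in data]
print(removed)
# [['GILTI', 'was', 'intended', 'stifle', 'multinationals', 'was'],
#  ['like', 'technology', 'pharmaceutical', 'companies', 'like']]
```

Note that 'to' and 'and' disappear completely here, while the non-consecutive repeats 'was' and 'like' are kept.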
A custom generator solution:
def deduped(seq):
    first = True
    for el in seq:
        if first or el != prev:
            yield el
        prev = el
        first = False
[list(deduped(seq)) for seq in l]
# => [['GILTI', 'was', 'intended', 'to', 'stifle', 'multinationals', 'was'],
# ['like', 'technology', 'and', 'pharmaceutical', 'companies', 'like']]
EDIT: The previous version couldn't handle None being the first element.
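One possible alternative to the `first` flag, sketched here as an assumption rather than as the answer's own code, is a unique sentinel object. Because the sentinel is checked by identity, a leading None is treated like any other element. The names `_MISSING` and `deduped_with_sentinel` are hypothetical.

```python
_MISSING = object()  # hypothetical sentinel; it is never one of the list's elements

def deduped_with_sentinel(seq):
    prev = _MISSING
    for el in seq:
        # Yield el unless it repeats the element immediately before it.
        if prev is _MISSING or el != prev:
            yield el
        prev = el

print(list(deduped_with_sentinel([None, None, 'a', 'a', None])))
# [None, 'a', None]
```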
Example:
l = [['GILTI', 'was', 'intended', 'to', 'to', 'stifle', 'multinationals', 'was'],
     ['like', 'technology', 'and', 'and', 'pharmaceutical', 'companies', 'like']]
result = []
for sublist in l:
    new_list = []
    for index, x in enumerate(sublist):
        # skip this element if the next element is the same
        if index + 1 < len(sublist) and x == sublist[index + 1]:
            continue
        # append the element, which is not followed by a duplicate
        new_list.append(x)
    # append the de-duplicated sublist to the result list
    result.append(new_list)
print(result)
Output:
[['GILTI', 'was', 'intended', 'to', 'stifle', 'multinationals', 'was'],
['like', 'technology', 'and', 'pharmaceutical', 'companies', 'like']]
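The same index-based idea can also be written more compactly. The following is only a sketch: it compares each element with the previous one rather than the next one, and the names `data` and `compact` are illustrative assumptions.

```python
# Same sample data as in the question (the name `data` is an assumption).
data = [['GILTI', 'was', 'intended', 'to', 'to', 'stifle', 'multinationals', 'was'],
        ['like', 'technology', 'and', 'and', 'pharmaceutical', 'companies', 'like']]

# Keep the first element, then every element that differs from its predecessor.
compact = [[x for i, x in enumerate(sub) if i == 0 or x != sub[i - 1]]
           for sub in data]
print(compact)
# [['GILTI', 'was', 'intended', 'to', 'stifle', 'multinationals', 'was'],
#  ['like', 'technology', 'and', 'pharmaceutical', 'companies', 'like']]
```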