Finding unique elements in nested list of strings
Similar to the query posted at this URL:
https://stackoverflow.com/questions/54477996/finding-unique-elements-in-nested-list/
I have another query.
I have a list imported from Pandas, and I need a single list as output containing all the unique elements:
[Ac, Ad, An, Bi, Co, Cr, Dr, Fa, Mu, My, Sc]
Once I have all the unique elements, I want to check the count of each of these elements within the whole list. Can someone advise how I can accomplish that?
mylist = df.Abv.str.split().tolist()
mylist
[['Ac,Cr,Dr'],
 ['Ac,Ad,Sc'],
 ['Ac,Bi,Dr'],
 ['Ad,Dr,Sc'],
 ['An,Dr,Fa'],
 ['Bi,Co,Dr'],
 ['Dr,Mu'],
 ['Ac,Co,My'],
 ['Co,Dr'],
 ['Ac,Ad,Sc'],
 ['An,Ac,Ad'],
]
I have tried different things but can't seem to make it work.
I tried converting it into a string and applying the split function to that string, but to no avail.
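A likely reason the split had no effect, assuming the Abv column holds comma-separated strings with no spaces in them: str.split() called without an argument splits on whitespace, so each string comes back whole inside a one-element list. A minimal sketch in plain Python, using one value from the list above (the variable names are illustrative only):

```python
row = 'Ac,Cr,Dr'

# With no argument, split() separates on whitespace only; there is none here,
# so the whole string is returned as a single item.
whitespace_split = row.split()

# Passing the comma as the separator yields the individual codes.
comma_split = row.split(',')

print(whitespace_split)  # ['Ac,Cr,Dr']
print(comma_split)       # ['Ac', 'Cr', 'Dr']
```

If that assumption holds, splitting on ',' at some point is what every approach needs to do before counting.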
You can create a dictionary with the list values as keys and their counts as values.
Your code may look like this:
mylists = [['Ac,Cr,Dr'],
           ['Ac,Ad,Sc'],
           ['Ac,Bi,Dr'],
           ['Ad,Dr,Sc'],
           ['An,Dr,Fa'],
           ['Bi,Co,Dr'],
           ['Dr,Mu'],
           ['Ac,Co,My'],
           ['Co,Dr'],
           ['Ac,Ad,Sc'],
           ['An,Ac,Ad'],
           ]
unique = {}
for mylist in mylists:
    for text in mylist:
        # each sublist holds one comma-separated string, so split it first
        for elem in text.split(','):
            unique[elem] = unique[elem] + 1 if elem in unique else 1
unique.keys() gives the unique elements, and the count of any value can be read from the dictionary, e.g. unique['Ad'].
You can use collections.Counter to make a dictionary of the counts of the elements. This will also give you easy access to a list of all unique elements. It looks like you have a list of lists where each sublist contains a single string. You will need to split these before you add them to the counter.
from collections import Counter
count = Counter()
mylist = [['Ac,Cr,Dr'],
['Ac,Ad,Sc'],
['Ac,Bi,Dr'],
['Ad,Dr,Sc'],
['An,Dr,Fa'],
['Bi,Co,Dr'],
['Dr,Mu'],
['Ac,Co,My'],
['Co,Dr'],
['Ac,Ad,Sc'],
['An,Ac,Ad'],
]
for arr in mylist:
    count.update(arr[0].split(','))
print(count)  # dictionary of symbols: counts
print(list(count.keys()))  # list of all unique elements
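As a possible follow-up, a Counter can also report the elements in alphabetical order (the order of the list requested in the question) or ranked by frequency. A small sketch, with the data shortened to three rows for brevity; `rows`, `sorted_unique` and `ranked` are illustrative names, not part of the answer above:

```python
from collections import Counter

rows = [['Ac,Cr,Dr'], ['Ac,Ad,Sc'], ['Ac,Bi,Dr']]  # shortened sample

count = Counter()
for arr in rows:
    count.update(arr[0].split(','))

sorted_unique = sorted(count)   # alphabetical list of the unique elements
ranked = count.most_common(2)   # the two most frequent (element, count) pairs

print(sorted_unique)  # ['Ac', 'Ad', 'Bi', 'Cr', 'Dr', 'Sc']
print(ranked)         # [('Ac', 3), ('Dr', 2)]
```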
You can do it this way in Python 3:
mylist = [['Ac,Cr,Dr'],
['Ac,Ad,Sc'],
['Ac,Bi,Dr'],
['Ad,Dr,Sc'],
['An,Dr,Fa'],
['Bi,Co,Dr'],
['Dr,Mu'],
['Ac,Co,My'],
['Co,Dr'],
['Ac,Ad,Sc'],
['An,Ac,Ad'],
]
uniquedict = {}
for sublist in mylist:
    for item in sublist[0].split(','):
        if item in uniquedict:
            uniquedict[item] += 1
        else:
            uniquedict[item] = 1
print(uniquedict)
print(list(uniquedict.keys()))
{'Ac': 6, 'Cr': 1, 'Dr': 7, 'Ad': 4, 'Sc': 3, 'Bi': 2, 'An': 2, 'Fa': 1, 'Co': 3, 'Mu': 1, 'My': 1}
['Ac', 'Cr', 'Dr', 'Ad', 'Sc', 'Bi', 'An', 'Fa', 'Co', 'Mu', 'My']
You can take advantage of the very powerful tools offered by collections, itertools and functools and get a one-line solution.
If your lists contain only one element:
from collections import Counter
from itertools import chain

if __name__ == '__main__':
    mylist = [
        ['Ac,Cr,Dr'],
        ['Ac,Ad,Sc'],
        ['Ac,Bi,Dr'],
        ['Ad,Dr,Sc'],
        ['An,Dr,Fa'],
        ['Bi,Co,Dr'],
        ['Dr,Mu'],
        ['Ac,Co,My'],
        ['Co,Dr'],
        ['Ac,Ad,Sc'],
        ['An,Ac,Ad'],
    ]
    # if lists contain only one element
    occurrence_count = Counter(chain(*map(lambda x: x[0].split(','), mylist)))
    items = list(occurrence_count.keys())  # items, with no repetitions
    all_items = list(occurrence_count.elements())  # all items, grouped by value
    ac_occurrences = occurrence_count['Ac']  # occurrences of 'Ac'
    print(f"Unique items: {items}")
    print(f"All list elements: {all_items}")
    print(f"Occurrences of 'Ac': {ac_occurrences}")
And this is what you get:
Unique items: ['Ac', 'Cr', 'Dr', 'Ad', 'Sc', 'Bi', 'An', 'Fa', 'Co', 'Mu', 'My']
All list elements: ['Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Cr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Ad', 'Ad', 'Ad', 'Ad', 'Sc', 'Sc', 'Sc', 'Bi', 'Bi', 'An', 'An', 'Fa', 'Co', 'Co', 'Co', 'Mu', 'My']
Occurrences of 'Ac': 6
Otherwise, if your lists have more than one element:
from collections import Counter
from itertools import chain
from functools import partial

if __name__ == '__main__':
    mylist_complex = [
        ['Ac,Cr,Dr', 'Ac,Ad,Sc'],
        ['Ac,Ad,Sc', 'Ac,Bi,Dr'],
        ['Ac,Bi,Dr', 'Ad,Dr,Sc'],
        ['Ad,Dr,Sc', 'An,Dr,Fa'],
        ['An,Dr,Fa', 'Bi,Co,Dr'],
        ['Bi,Co,Dr', 'Dr,Mu'],
        ['Dr,Mu', 'Ac,Co,My'],
        ['Ac,Co,My', 'Co,Dr'],
        ['Co,Dr', 'Ac,Ad,Sc'],
        ['Ac,Ad,Sc', 'An,Ac,Ad'],
        ['An,Ac,Ad', 'Ac,Cr,Dr'],
    ]
    # if lists contain more than one element
    occurrence_count_complex = Counter(chain(*map(lambda x: chain(*map(partial(str.split, sep=','), x)), mylist_complex)))
    items = list(occurrence_count_complex.keys())  # items, with no repetitions
    all_items = list(occurrence_count_complex.elements())  # all items, grouped by value
    ac_occurrences = occurrence_count_complex['Ac']  # occurrences of 'Ac'
    print(f"Unique items: {items}")
    print(f"All list elements: {all_items}")
    print(f"Occurrences of 'Ac': {ac_occurrences}")
And this is what you get in this case:
Unique items: ['Ac', 'Cr', 'Dr', 'Ad', 'Sc', 'Bi', 'An', 'Fa', 'Co', 'Mu', 'My']
All list elements: ['Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Cr', 'Cr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Ad', 'Ad', 'Ad', 'Ad', 'Ad', 'Ad', 'Ad', 'Ad', 'Sc', 'Sc', 'Sc', 'Sc', 'Sc', 'Sc', 'Bi', 'Bi', 'Bi', 'Bi', 'An', 'An', 'An', 'An', 'Fa', 'Fa', 'Co', 'Co', 'Co', 'Co', 'Co', 'Co', 'Mu', 'Mu', 'My', 'My']
Occurrences of 'Ac': 12
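If the nested map/partial call is hard to read, a generator expression should give the same counts. This is only a sketch on a shortened two-row sample; `sample` and `counts` are illustrative names:

```python
from collections import Counter

sample = [['Ac,Cr,Dr', 'Ac,Ad,Sc'], ['Ac,Ad,Sc', 'Ac,Bi,Dr']]  # shortened sample

# Walk every sublist, every string in it, and every comma-separated item.
counts = Counter(
    item
    for sublist in sample
    for text in sublist
    for item in text.split(',')
)

print(counts['Ac'])    # 4
print(sorted(counts))  # ['Ac', 'Ad', 'Bi', 'Cr', 'Dr', 'Sc']
```

The three for clauses read in the same order as the equivalent nested loops, which some readers find easier to follow than chained map calls.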
Try the following:
from itertools import chain
mylist = [['Ac,Cr,Dr'],
['Ac,Ad,Sc'],
['Ac,Bi,Dr'],
['Ad,Dr,Sc'],
['An,Dr,Fa'],
['Bi,Co,Dr'],
['Dr,Mu'],
['Ac,Co,My'],
['Co,Dr'],
['Ac,Ad,Sc'],
['An,Ac,Ad']
]
flat_list = list(chain.from_iterable(mylist))
unique_list = set(','.join(flat_list).split(','))
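The snippet above stops at the set of unique elements. One way to also get the counts the question asks for, sketched under the same assumption that every string is comma-separated, is to keep the split list and pass it to collections.Counter. The data is shortened here, and `all_items` and `counts` are illustrative names:

```python
from collections import Counter
from itertools import chain

mylist = [['Ac,Cr,Dr'], ['Ac,Ad,Sc'], ['Dr,Mu']]  # shortened sample

flat_list = list(chain.from_iterable(mylist))
all_items = ','.join(flat_list).split(',')  # every element, repeats included

unique_list = sorted(set(all_items))  # sorted, since sets have no fixed order
counts = Counter(all_items)

print(unique_list)   # ['Ac', 'Ad', 'Cr', 'Dr', 'Mu', 'Sc']
print(counts['Ac'])  # 2
```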