How to get a count of a specific element in a nested list in Python
count_freq data
3 [['58bcd029', 2, 'expert'],
['58bcd029', 2, 'user'],
['58bcd029', 2, 'expert']]
2 [['58bcd029', 2, 'expert'],
['58bcd029', 2, 'expert']]
1 [['1ee429fa', 1, 'expert']]
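So I want to get the count of 'expert' and 'user' from each row of the dataframe, that is, from each nested list. After getting the counts of experts and users, I want to store the respective ids in another list. I tried converting the lists to a dictionary and counting by key, but it did not work. Can anyone help me do this?
I want the dataframe in this format: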
count_freq count_expert ids count_user ids
3 2 ['58bcd029','58bcd029'] 1 ['58bcd029']
2 2 ['58bcd029','58bcd029'] 0 []
1 1 ['1ee429fa'] 0 []
One solution could be:
import pandas as pd
data = pd.DataFrame({
'col': [[['58bcd029', 2, 'expert'],
['58bcd029', 2, 'user'],
['58bcd029', 2, 'expert']],
[['58bcd029', 2, 'expert'],
['58bcd029', 2, 'expert']],
[['1ee429fa', 1, 'expert']]]
})
print(data)
col
0 [[58bcd029, 2, expert], [58bcd029, 2, user], [...
1 [[58bcd029, 2, expert], [58bcd029, 2, expert]]
2 [[1ee429fa, 1, expert]]
# Count each role by flattening every [id, count, role] entry in the cell.
data['count_expert'] = data['col'].apply(lambda x: [item for sublist in x for item in sublist].count('expert'))
data['count_user'] = data['col'].apply(lambda x: [item for sublist in x for item in sublist].count('user'))
# Collect the ids per role. Note that set() drops duplicate ids, so each id is listed once.
data['ids_expert'] = data['col'].apply(lambda x: list(set([sublist[0] for sublist in x if sublist[2] == 'expert'])))
data['ids_user'] = data['col'].apply(lambda x: list(set([sublist[0] for sublist in x if sublist[2] == 'user'])))
# For the purpose of illustration, I just selected these columns, but `col` is also there.
print(data[['count_expert', 'count_user', 'ids_expert', 'ids_user']])
count_expert count_user ids_expert ids_user
0 2 1 [58bcd029] [58bcd029]
1 2 0 [58bcd029] []
2 1 0 [1ee429fa] []
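Note that the requested format repeats an id once per matching entry (for example ['58bcd029','58bcd029']), while the output above lists each id only once. If the duplicates are wanted, a minimal sketch along the following lines might work. It assumes every entry has the shape [id, count, role]; the helper name `summarize_roles` and the variable `cells` are made up for illustration and are not part of the original answer.

```python
def summarize_roles(rows):
    """Count 'expert' and 'user' entries in one cell and collect their ids.

    Assumes each element of `rows` looks like [id, count, role].
    Duplicate ids are kept, one per matching entry.
    """
    ids_expert = [row[0] for row in rows if row[2] == 'expert']
    ids_user = [row[0] for row in rows if row[2] == 'user']
    return {
        'count_expert': len(ids_expert),
        'ids_expert': ids_expert,
        'count_user': len(ids_user),
        'ids_user': ids_user,
    }


# The same three cells as in the example dataframe.
cells = [
    [['58bcd029', 2, 'expert'], ['58bcd029', 2, 'user'], ['58bcd029', 2, 'expert']],
    [['58bcd029', 2, 'expert'], ['58bcd029', 2, 'expert']],
    [['1ee429fa', 1, 'expert']],
]

summaries = [summarize_roles(cell) for cell in cells]
for summary in summaries:
    print(summary)
```

With the dataframe from the answer, the same helper could presumably be applied per row, e.g. pd.DataFrame(data['col'].apply(summarize_roles).tolist()), which should give one column per dictionary key. Comparing only the role field (row[2]) also avoids miscounting if an id or count ever happened to equal 'expert' or 'user'.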