How to check whether a list of items is present in each row of a DataFrame in Python
I have a data_file with 88k rows and 76 columns.
I want to count how many rows contain every item of a list: subset = [40,49].
I am comparing one row at a time, as shown below:
My code:
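import numpy as np

counter = 0
# index=False keeps the row label out of the comparison
for row in data_file.itertuples(index=False):
    # count the row only if every item of subset appears in it
    if all(np.isin(subset, row)):
        counter += 1
print('Total occurrences of subset: ', subset, '= ', counter)
print('--------------------------')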
Execution time: 6.6398055266834035
Is there a better way to compare all rows at once and save some time? I need to check many subsets, so the total running time of my code is high.
Thanks,
Gopi
np.sum((data_file==subset[0]).any(axis=1) & (data_file==subset[1]).any(axis=1))
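The one-liner above is hard-coded for a two-item subset. A minimal sketch of how it might be generalized to a subset of any length is below; the helper name count_rows_containing and the small sample data_file are assumptions made up for illustration, not part of the original post.

```python
import numpy as np
import pandas as pd


def count_rows_containing(df, subset):
    """Count the rows of df that contain every value in subset (in any column)."""
    values = df.to_numpy()
    # Start with every row as a candidate, then drop rows missing an item.
    mask = np.ones(len(df), dtype=bool)
    for item in subset:
        # True for rows where at least one column equals this item.
        mask &= (values == item).any(axis=1)
    return int(mask.sum())


# Hypothetical sample data standing in for the real 88k x 76 data_file.
data_file = pd.DataFrame({
    "a": [40, 1, 49, 40],
    "b": [49, 40, 2, 3],
    "c": [5, 6, 40, 49],
})
subset = [40, 49]
total = count_rows_containing(data_file, subset)
print('Total occurrences of subset: ', subset, '= ', total)
```

Each pass over the loop is one vectorized comparison against the whole table, so the Python-level loop runs once per item in the subset rather than once per row. To check many subsets, the same function could be called once per subset.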