
Fast way to find the number of common items in Python

I have a large data set of 145,000 items (a bill of materials), and I want to check the percentage of items shared between two bills of materials.

Two nested for loops and the other methods I have tried all take about the same time.

What is the fastest way to do this?

FirstBill and SecondBill are the lists with the components in them:

def shared_fraction(FirstBill, SecondBill):  # function name assumed; the original snippet omitted the def line
    CommonChild = 0
    for FKid in FirstBill:
        for SKid in SecondBill:
            if FKid == SKid:
                CommonChild += 1
    return CommonChild / len(FirstBill)


It is close to optimal to build a set from one of the lists and test the items of the other list against it:

# Python program to illustrate the intersection 
# of two lists in most simple way 
def intersection(lst1, lst2): 
    temp = set(lst2) 
    lst3 = [value for value in lst1 if value in temp ] 
    return lst3 

# Driver Code 
lst1 = [4, 9, 1, 17, 11, 26, 28, 54, 69] 
lst2 = [9, 9, 74, 21, 45, 11, 63, 28, 26] 
#print(intersection(lst1, lst2)) 

quantity = len(intersection(lst1, lst2))
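Note that the list comprehension above counts every repeat in lst1 if a bill lists the same component more than once. A minimal sketch of a duplicate-aware count, assuming pairwise (multiset) matching is what you want, could use collections.Counter. The function name common_count and the sample data are hypothetical, not part of the original answer:

```python
from collections import Counter

def common_count(first_bill, second_bill):
    """Count shared items, matching duplicates pairwise (multiset intersection)."""
    # Counter & Counter keeps each key with the minimum of its two counts.
    shared = Counter(first_bill) & Counter(second_bill)
    return sum(shared.values())

# Hypothetical sample data with repeated ids.
lst1 = [4, 9, 9, 9, 11]
lst2 = [9, 9, 74, 11, 63]
quantity = common_count(lst1, lst2)  # 9 is matched twice, 11 once
```

Building the two counters is linear in the list lengths, so this should scale about as well as the set version.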

Assuming that ids in the bills are unique, a simpler answer would be:

percentage = sum([1 for fkid in FirstBill if fkid in SecondBill]) / len(FirstBill) * 100

or, avoiding a rescan of SecondBill for every item, since set lookups take constant time on average:

percentage = len(set(FirstBill).intersection(set(SecondBill))) / len(FirstBill) * 100
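To check the speed difference on your own data, here is a rough timing sketch using timeit. The synthetic bills, their sizes, and the function names are assumptions for illustration only. The list-membership variant is quadratic, so it is timed on a small sample rather than on 145,000 items:

```python
import random
import timeit

def percent_shared_list(first_bill, second_bill):
    # `in` on a list scans it, so this is O(len(first_bill) * len(second_bill)).
    return sum(1 for k in first_bill if k in second_bill) / len(first_bill) * 100

def percent_shared_set(first_bill, second_bill):
    # Building the set once makes each lookup O(1) on average.
    second = set(second_bill)
    return sum(1 for k in first_bill if k in second) / len(first_bill) * 100

# Hypothetical data: unique integer ids, sized down so the slow variant finishes quickly.
random.seed(0)
first_bill = random.sample(range(20_000), 2_000)
second_bill = random.sample(range(20_000), 2_000)

for fn in (percent_shared_list, percent_shared_set):
    seconds = timeit.timeit(lambda: fn(first_bill, second_bill), number=3)
    print(f"{fn.__name__}: {seconds:.4f} s for 3 runs")
```

Both functions return the same percentage; only the lookup structure differs, which is where the time goes on large bills.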


 