I have a large data of 145000 items (a bill of materials) and I want to check the % of shared items between two bill of materials.
Two for loops or other methods always run in similar time periods.
What is the fastest way to do this?
First&secondbill are the lists with components in them:
for FKid in FirstBill:
for SKid in SecondBill:
CommonChild = (CommonChild + 1) if FKid == SKid else CommonChild
return CommonChilds / len(FirstBill)
Kinda optimal to use one set
# Python program to illustrate the intersection
# of two lists in most simple way
def intersection(lst1, lst2):
temp = set(lst2)
lst3 = [value for value in lst1 if value in temp ]
return lst3
# Driver Code
lst1 = [4, 9, 1, 17, 11, 26, 28, 54, 69]
lst2 = [9, 9, 74, 21, 45, 11, 63, 28, 26]
#print(intersection(lst1, lst2))
quantity = len(intersection(lst1, lst2))
Assuming that ids in the bills are unique, a simpler answer would be:
percentage = sum([1 for fkid in FirstBill if fkid in SecondBill]) / len(FirstBill) * 100
or
percentage = len(set(FirstBill).intersection(set(SecondBill))) / len(FirstBill) * 100
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.