For loop takes too long to run
I have two lists of dicts: prices_distincts and prices. They are connected through hash_brand_artnum, and both are sorted by hash_brand_artnum. I do not understand why the loop runs for so long:
If the length of prices_distincts is 100,000, it runs for 30 min.
But if the length of prices_distincts is 10,000, it runs for 10 sec.
Code:
prices_distincts = [{'hash_brand_artnum':1202},...,..]
prices = [{'hash_brand_artnum':1202,'price':12.077},...,...]

for prices_distinct in prices_distincts:
    for price in list(prices):
        if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']:
            print price['hash_brand_artnum']
            #print prices
            del prices[0]
        else:
            continue
I need to look for items with the same prices. The relation between prices_distincts and prices is one to many, and I want to group the prices that have an equal price['hash_brand_artnum'].
It runs so long because your algorithm is O(N^2): 100000^2 = 10000000000 and 10000^2 = 100000000. So the factor between the two numbers is 100, and the factor between 30 min and 10 sec is 180, which is the same order of magnitude.
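To make the quadratic growth concrete, here is a minimal sketch that counts inner-loop iterations for two input sizes. It is written for Python 3, the function name count_comparisons is made up for this illustration, and it assumes both lists have the same length and ignores the del prices[0] from the question, so the counts are an upper bound, not a measurement of the original code.

```python
def count_comparisons(n):
    """Count inner-loop iterations of a nested loop over two lists of length n."""
    distincts = list(range(n))  # stand-in for prices_distincts
    prices = list(range(n))     # stand-in for prices (assumed same length)
    comparisons = 0
    for d in distincts:
        for p in prices:
            comparisons += 1    # one key comparison per inner iteration
    return comparisons


small = count_comparisons(100)
large = count_comparisons(1000)
# Ten times more input gives a hundred times more comparisons.
print(small, large, large // small)
```

The same ratio holds when going from 10,000 to 100,000 items, which is why the running time grows by roughly two orders of magnitude.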
EDIT: It's hard to say from your code and such a small amount of data, and I don't know what your task is, but I think that your dictionaries are not very useful. Maybe try this:
>>> prices_distincts = [{'hash_brand_artnum':1202}, {'hash_brand_artnum':14}]
>>> prices = [{'hash_brand_artnum':1202, 'price':12.077}, {'hash_brand_artnum':14, 'price':15}]
# turning first list of dicts into simple list of numbers
>>> dist = [x['hash_brand_artnum'] for x in prices_distincts]
# turning second list of dicts into dict where number is a key and price is a value
>>> pr = {x['hash_brand_artnum']:x["price"] for x in prices}
Now you can iterate through your numbers and get the prices:
>>> for d in dist:
...     print d, pr[d]
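Because the question describes a one-to-many relation, a plain dict keyed by hash_brand_artnum would keep only the last price for each key. A possible variant, sketched here in Python 3, collects every price per key with collections.defaultdict. The second price for key 1202 is invented sample data, and the name prices_by_key is an assumption for this illustration.

```python
from collections import defaultdict

prices_distincts = [{'hash_brand_artnum': 1202}, {'hash_brand_artnum': 14}]
prices = [
    {'hash_brand_artnum': 1202, 'price': 12.077},
    {'hash_brand_artnum': 1202, 'price': 13.5},  # invented duplicate key
    {'hash_brand_artnum': 14, 'price': 15},
]

# Map each key to the list of all its prices in one pass over prices.
prices_by_key = defaultdict(list)
for item in prices:
    prices_by_key[item['hash_brand_artnum']].append(item['price'])

# One dict lookup per distinct key instead of a scan over the whole list.
for d in prices_distincts:
    key = d['hash_brand_artnum']
    print(key, prices_by_key.get(key, []))
```

Building the dict is one pass over prices and each lookup takes constant time on average, so the total work grows linearly with the input. This approach does not rely on the lists being sorted.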
As @RomanPekar mentioned, your algorithm is running slowly because its complexity is O(n^2). To fix it, you should write it as an O(n) algorithm:
import itertools as it

# izip is Python 2 only; in Python 3 use the built-in zip
for price, prices_distinct in it.izip(prices, prices_distincts):
    if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']:
        # do stuff
        pass
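Pairing the two lists element by element only lines up when each key appears exactly once in both lists. Since the question says the relation is one to many and both lists are sorted by hash_brand_artnum, one possible single-pass alternative is a merge-style walk with an index into prices. This is a sketch in Python 3 under those assumptions (ascending order, comparable keys); the function name match_sorted and the second price for key 1202 are invented for the illustration.

```python
def match_sorted(prices_distincts, prices):
    """Yield (key, [prices]) using one forward pass over both sorted lists."""
    i = 0
    n = len(prices)
    for d in prices_distincts:
        key = d['hash_brand_artnum']
        # Skip price entries whose key is smaller than the current key.
        while i < n and prices[i]['hash_brand_artnum'] < key:
            i += 1
        # Collect every price entry that shares the current key.
        group = []
        while i < n and prices[i]['hash_brand_artnum'] == key:
            group.append(prices[i]['price'])
            i += 1
        yield key, group


prices_distincts = [{'hash_brand_artnum': 14}, {'hash_brand_artnum': 1202}]
prices = [
    {'hash_brand_artnum': 14, 'price': 15},
    {'hash_brand_artnum': 1202, 'price': 12.077},
    {'hash_brand_artnum': 1202, 'price': 13.5},  # invented duplicate key
]

result = list(match_sorted(prices_distincts, prices))
for key, group in result:
    print(key, group)
```

The index i only moves forward, so each entry of prices is visited once and the total work is proportional to len(prices_distincts) + len(prices).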
If prices grows more or less in step with prices_distincts, then if you multiply the size of prices_distincts by 10, your original 10 seconds will be multiplied by 10, then again by 10 (the second for loop), and then by ~2 because of the "list(prices)" (which, by the way, should definitely be done outside the loop):
10sec*10*10*2 = 2000sec = 33min
This conversion is usually expensive.
prices_distincts = [{'hash_brand_artnum':1202},...,..]
prices = [{'hash_brand_artnum':1202,'price':12.077},...,...]
list_prices = list(prices)

for prices_distinct in prices_distincts:
    for price in list_prices:
        if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']:
            print price['hash_brand_artnum']
            #print prices
            del prices[0]
        else:
            continue