
For loop runs too long

I have two lists of dicts: prices_distincts and prices.

They are linked through hash_brand_artnum, and both are sorted by hash_brand_artnum. I do not understand why the loop runs for so long:

  1. If the length of prices_distincts is 100,000, it runs for 30 min.

  2. But if the length of prices_distincts is 10,000, it runs for 10 sec.

Code:

 prices_distincts = [{'hash_brand_artnum':1202},...,..]
 prices = [{'hash_brand_artnum':1202,'price':12.077},...,...]

 for prices_distinct in prices_distincts:
    for price in list(prices):            
        if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']:

            print price['hash_brand_artnum']
            #print prices
            del prices[0]
        else:
            continue

I need to look for items with the same prices. The relation between prices_distincts and prices is one to many. I also need to group the prices that share the same price['hash_brand_artnum'].

It takes so long because your algorithm is O(N^2): 100000^2 = 10000000000 and 10000^2 = 100000000. The ratio between these two numbers is 100, and the ratio between 30 min and 10 sec is 180, which is the same order of magnitude.
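The arithmetic above can be checked with a short sketch. The helper name `nested_loop_steps` is hypothetical, and the sketch assumes both lists have the same length n, which the question does not state exactly.

```python
def nested_loop_steps(n):
    """Comparisons made by two nested loops over lists of length n (assumed equal)."""
    return n * n

small = nested_loop_steps(10000)     # 100,000,000 comparisons
large = nested_loop_steps(100000)    # 10,000,000,000 comparisons

work_ratio = large // small          # growth predicted by O(N^2)
time_ratio = (30 * 60) // 10         # measured: 30 min versus 10 sec

print(work_ratio)
print(time_ratio)
```

The predicted growth (100) and the measured growth (180) differ by less than a factor of 2, which is consistent with quadratic behaviour plus some extra per-iteration cost.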

EDIT: It's hard to say from your code and such a small amount of data, and I don't know what your task is, but I think your dictionaries are not very useful in this form. Maybe try this:

>>> prices_distincts = [{'hash_brand_artnum':1202}, {'hash_brand_artnum':14}]
>>> prices = [{'hash_brand_artnum':1202, 'price':12.077}, {'hash_brand_artnum':14, 'price':15}]
# turning first list of dicts into simple list of numbers
>>> dist = [x['hash_brand_artnum'] for x in prices_distincts]
# turning second list of dicts into dict where number is a key and price is a value
>>> pr = {x['hash_brand_artnum']:x["price"] for x in prices}

Now you can iterate through your numbers and get the prices:

>>> for d in dist:
...     print d, pr[d]
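Because the question describes a one-to-many relation, a dict that maps each key to a single price would keep only the last price seen for that key. A minimal sketch of a grouping variant follows, assuming the goal is to collect every price per hash_brand_artnum. The names `prices_by_key` and `grouped` and the extra sample price 13.5 are illustrative and not from the original post.

```python
from collections import defaultdict

prices_distincts = [{'hash_brand_artnum': 1202}, {'hash_brand_artnum': 14}]
prices = [
    {'hash_brand_artnum': 1202, 'price': 12.077},
    {'hash_brand_artnum': 1202, 'price': 13.5},   # illustrative second price
    {'hash_brand_artnum': 14, 'price': 15},
]

# Build the index once: key -> list of all prices with that key.
prices_by_key = defaultdict(list)
for p in prices:
    prices_by_key[p['hash_brand_artnum']].append(p['price'])

# One dict lookup per distinct key instead of a scan over the whole list.
grouped = {}
for d in prices_distincts:
    key = d['hash_brand_artnum']
    grouped[key] = prices_by_key.get(key, [])
```

Building the index is one pass over prices, and each lookup is constant time on average, so the total work grows linearly with the two list lengths rather than with their product.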

As @RomanPekar mentioned, your algorithm is slow because its complexity is O(n^2). To fix it, you should write it as an O(n) algorithm:

import itertools as it

for price, prices_distinct in it.izip(prices, prices_distincts):
    if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']:
        pass  # do stuff
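Note that izip pairs the elements by position, so it only matches when the two lists line up one to one. For the one-to-many case in the question, a linear pass can instead advance a single index through prices. The sketch below assumes both lists are sorted in ascending order by hash_brand_artnum. The function name `merge_groups` and the sample data are hypothetical.

```python
def merge_groups(prices_distincts, prices):
    """Group prices per distinct key in one pass over both sorted lists."""
    result = []
    i = 0
    n = len(prices)
    for d in prices_distincts:
        key = d['hash_brand_artnum']
        # Skip prices whose key is smaller than the current distinct key.
        while i < n and prices[i]['hash_brand_artnum'] < key:
            i += 1
        # Collect every price that shares the current key.
        group = []
        while i < n and prices[i]['hash_brand_artnum'] == key:
            group.append(prices[i]['price'])
            i += 1
        result.append((key, group))
    return result


# Illustrative data, sorted ascending by key.
sample_distincts = [{'hash_brand_artnum': 14}, {'hash_brand_artnum': 1202}]
sample_prices = [
    {'hash_brand_artnum': 14, 'price': 15},
    {'hash_brand_artnum': 1202, 'price': 12.077},
    {'hash_brand_artnum': 1202, 'price': 13.5},
]
groups = merge_groups(sample_distincts, sample_prices)
```

The index i only moves forward, so each element of prices is visited once and the total work is proportional to len(prices_distincts) + len(prices).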

If prices grows more or less in step with prices_distincts, then multiplying the size of prices_distincts by 10 multiplies your original 10 seconds by 10, then by 10 again (the second for loop), and then by ~2 because of the list(prices) copy (which, by the way, should definitely be done outside the loop):

10 sec * 10 * 10 * 2 = 2000 sec = 33 min

Copying the list with list(prices) on every pass of the outer loop is usually expensive, so do it once before the loop:

 prices_distincts = [{'hash_brand_artnum':1202},...,..]
 prices = [{'hash_brand_artnum':1202,'price':12.077},...,...]
 list_prices = list(prices)

 for prices_distinct in prices_distincts:
    for price in list_prices:            
        if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']:

            print price['hash_brand_artnum']
            #print prices
            del prices[0]
        else:
            continue
