
Complexity of list comprehension vs for loop

I have two algorithms in Python which convert a list of tuples into a dictionary:

def _prep_high_low_data_for_view(self, low_high_list):
    dates = []
    prices = []
    labels = []
    for (x, y, z) in low_high_list:
        dates.append(x)
        prices.append(y)
        labels.append(z)

    return {'date': dates,
            'price': prices,
            'label': labels
            }

The second one being:

def _prep_high_low_data_for_view(self, low_high_list):
    return {'date': [date for date, _, _ in low_high_list],
            'price': [price for _, price, _ in low_high_list],
            'label': [label for _, _, label in low_high_list],
            }

Both algorithms are equivalent in terms of what they do. Is the second algorithm worse in terms of complexity, because there are three separate list comprehensions?

You could build the 3 lists using zip:

dates, prices, labels = zip(*low_high_list)

placed in a one-line function:

def third_function(low_high_list):
    return dict(zip(["date", "price", "label"], zip(*low_high_list)))

It will run faster, on average, than second_function() from Florian_H's answer.
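To see what `zip(*low_high_list)` is doing, here is a tiny sketch with made-up sample rows (the dates, prices and labels are hypothetical, not from the question):

```python
# Three (date, price, label) rows, as in the question's input format.
rows = [("2024-01-01", 10.5, "low"),
        ("2024-01-02", 12.0, "high"),
        ("2024-01-03", 11.2, "low")]

# zip(*rows) transposes the list of rows into per-column tuples,
# so one pass yields all three columns at once.
dates, prices, labels = zip(*rows)
print(dates)   # ('2024-01-01', '2024-01-02', '2024-01-03')
print(prices)  # (10.5, 12.0, 11.2)
print(labels)  # ('low', 'high', 'low')
```

This is why the one-liner needs only a single traversal of the input instead of three.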

TESTS AND RESULTS:

def third_function(low_high_list):
    return dict(zip(["date", "price", "label"], zip(*low_high_list)))

def fourth_function(low_high_list):
    dates, prices, labels = zip(*low_high_list)
    return {"date": dates, "price": prices, "label": labels}


import random

lst = [tuple(random.randint(0, 100) for _ in range(3)) for _ in range(10000)]

from timeit import timeit
count = 1000

t0 = timeit(lambda:first_function(lst), number=count)
print("first_function: ",f"{t0:.3f}","1x" )

t = timeit(lambda:second_function(lst), number=count)
print("second_function:",f"{t:.3f}",f"{t0/t:.1f}x" )

t = timeit(lambda:third_function(lst), number=count)
print("third_function: ",f"{t:.3f}",f"{t0/t:.1f}x" )

t = timeit(lambda:fourth_function(lst), number=count)
print("fourth_function:",f"{t:.3f}",f"{t0/t:.1f}x" )

# first_function:  1.338 1x
# second_function: 0.818 1.6x
# third_function:  0.426 3.1x
# fourth_function: 0.375 3.6x
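One caveat the timings above do not show: `zip` yields tuples, whereas the original functions build lists. If callers need mutable lists, each column can be converted; this variant (the name `fifth_function` is made up here, not from the answers) is a sketch:

```python
def fifth_function(low_high_list):
    # zip(*...) transposes rows into column tuples; list() makes them mutable
    dates, prices, labels = (list(col) for col in zip(*low_high_list))
    return {"date": dates, "price": prices, "label": labels}

result = fifth_function([(1, 2.0, "a"), (4, 5.0, "b")])
print(result)  # {'date': [1, 4], 'price': [2.0, 5.0], 'label': ['a', 'b']}
```

The extra `list()` calls cost a little, but the function still makes effectively one pass over the input.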

Yes and no.

It's basically O(n) vs O(3n), but in complexity analysis O(3n) is simplified to O(n).

So yeah, both of these are algorithms with a complexity of O(n), but the first one makes a single pass over the data while the second makes three.

As Markus Meskanen mentioned, the first algorithm does three times fewer operations, but why not just try it? Here is your code with random values and time measurement.

import random, datetime

def first_function(low_high_list):
    dates = []
    prices = []
    labels = []
    for (x, y, z) in low_high_list:
        dates.append(x)
        prices.append(y)
        labels.append(z)

    return {'date': dates,
            'price': prices,
            'label': labels
            }


def second_function(low_high_list):
    return {'date': [date[0] for date in low_high_list],
            'price': [price[1] for price in low_high_list],
            'label': [label[2] for label in low_high_list],
            }


lst = [[random.randint(0, 100) for _ in range(3)] for _ in range(10000)]

print("first_function:")
tmp = datetime.datetime.now()
first_function(lst)
print(datetime.datetime.now() - tmp)

print("\nsecond_function:")
tmp = datetime.datetime.now()
second_function(lst)
print(datetime.datetime.now() - tmp)

And voila, the second function is about twice as fast as the first...

[output]
first_function:
0:00:00.004001

second_function:
0:00:00.002001

So it seems that even though the second function traverses the list three times instead of once, in this case list comprehension is still twice as fast as looping with append.

Averaged over 1000 runs, it is still roughly twice as fast:

0:00:00.002820
0:00:00.001568
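Part of the comprehension's edge comes from the repeated `out.append` attribute lookup inside the explicit loop, while a comprehension runs its loop in specialized bytecode. A micro-benchmark sketch (the function names here are made up for illustration) that also shows the common trick of caching the bound method:

```python
from timeit import timeit

data = [(i, i + 1, i + 2) for i in range(10000)]

def with_append(rows):
    out = []
    for x, _, _ in rows:
        out.append(x)          # attribute lookup on every iteration
    return out

def with_bound_append(rows):
    out = []
    append = out.append        # look the method up once, reuse it
    for x, _, _ in rows:
        append(x)
    return out

def with_comprehension(rows):
    return [x for x, _, _ in rows]

# All three build the same list; only the speed differs.
assert with_append(data) == with_bound_append(data) == with_comprehension(data)

for fn in (with_append, with_bound_append, with_comprehension):
    print(fn.__name__, f"{timeit(lambda: fn(data), number=200):.3f}s")
```

On typical CPython builds, caching the bound method narrows the gap but the comprehension usually stays fastest; exact numbers will vary by machine and interpreter version.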
