Why is my program faster than the one using a Python built-in function?
OK, so I was doing a puzzle on Coderbyte, and here is what the puzzle stated:
Have the function SimpleMode(arr) take the array of numbers stored in arr and return the number that appears most frequently (the mode). For example: if arr contains [10, 4, 5, 2, 4] the output should be 4. If there is more than one mode return the one that appeared in the array first (i.e. [5, 10, 10, 6, 5] should return 5 because it appeared first). If there is no mode return -1. The array will not be empty.
So here is my program:
import time
from random import randrange

def SimpleMode(arr):
    bestMode=0
    numTimes=0
    for x in range(len(arr)):
        if len(arr)>0:
            currentNum=arr[0]
            currentMode=0
            while currentNum in arr:
                currentMode+=1
                arr.remove(currentNum)
            if currentMode>numTimes:
                numTimes=currentMode
                bestMode=currentNum
        else: break
    if numTimes==1: bestMode=-1
    return bestMode

start_time = time.time()
numbers = [randrange(1,10) for x in range(0, 1000)]
print(SimpleMode(numbers))
print("--- %s seconds ---" % (time.time() - start_time))
And here is a much simpler program which someone else wrote:
import time
from random import randrange

def SimpleMode(arr):
    best = -1
    best_count = 1
    for c in arr:
        if arr.count(c) > best_count:
            best = c
            best_count = arr.count(c)
    return best

start_time = time.time()
numbers = [randrange(1,10) for x in range(0, 1000)]
print(SimpleMode(numbers))
print("--- %s seconds ---" % (time.time() - start_time))
Now, I know that timing it this way depends on what my CPU is doing and so on, so this is not the most accurate method. Leaving that aside, what I found was that my computer took 0.012000 seconds to run my program, yet it took 0.025001 seconds to run the second program.
Now here is where I am puzzled. The program I wrote myself takes less than half the time of the other program, which uses a built-in Python function and has only one for loop, whereas mine has a while loop inside a for loop.
Can anyone provide any insight into this?
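On the timing caveat: a steadier measurement than a single time.time() difference is possible with the standard-library timeit module. The sketch below is only an illustration. The name simple_mode_count and the repeat counts are arbitrary choices, not part of the original post. It times the count-based version on the same kind of input.

```python
import timeit
from random import randrange


def simple_mode_count(arr):
    # The count-based version from the question, unchanged in behaviour.
    best = -1
    best_count = 1
    for c in arr:
        if arr.count(c) > best_count:
            best = c
            best_count = arr.count(c)
    return best


numbers = [randrange(1, 10) for _ in range(1000)]

# Run the call 5 times per trial, 3 trials, and keep the fastest trial:
# the minimum is the least disturbed by whatever else the machine is doing.
trials = timeit.repeat(lambda: simple_mode_count(numbers), number=5, repeat=3)
per_call = min(trials) / 5
print("--- %.6f seconds per call ---" % per_call)
```

A function that removes items from its argument would need a fresh copy for every run, for example by timing lambda: f(list(numbers)), otherwise only the first run sees the full list.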
The second program calls count on every iteration (and a second time whenever it finds a new best), and since count is O(n) (that is, it has to walk through the entire list, just like a for loop would), the time quickly adds up.
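To make the cost of count visible, here is a small sketch that tallies how often it is called. CountingList and simple_mode_with_count are hypothetical helper names introduced just for this illustration. The function body is the count-based program from the question.

```python
class CountingList(list):
    """A list that records how many times .count() is called (illustrative helper)."""

    def __init__(self, *args):
        super().__init__(*args)
        self.count_calls = 0

    def count(self, value):
        self.count_calls += 1
        # list.count scans every element, so each call costs len(self) comparisons.
        return super().count(value)


def simple_mode_with_count(arr):
    best = -1
    best_count = 1
    for c in arr:
        if arr.count(c) > best_count:
            best = c
            best_count = arr.count(c)
    return best


data = CountingList([5, 10, 10, 6, 5])
result = simple_mode_with_count(data)
print(result)            # 5
print(data.count_calls)  # 6: one call per element, plus one more when 5 became the best
```

Each of those calls walks all five elements, so even this tiny input costs 30 element comparisons. For n elements the total grows roughly with n squared.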
That said, your program can be reduced even further:
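import collections

def SimpleMode(arr):
    if not arr:
        return -1
    counts = collections.Counter(arr)
    # Highest count wins; ties go to the value that appears first in arr.
    best = max(counts, key=lambda k: (counts[k], -arr.index(k)))
    # The puzzle asks for -1 when no value repeats.
    return best if counts[best] > 1 else -1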
In addition, note that your initial program mutates its input (it effectively destroys the list you pass it because of the .remove calls, which will suck if you wanted to do anything with arr after calling SimpleMode).
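The mutation is easy to see on a small input. In this sketch simple_mode_remove is just the remove-based program from the question under a different (made-up) name, and passing list(...) is one possible way to protect the caller's data.

```python
def simple_mode_remove(arr):
    # The remove-based approach from the question, reproduced as-is.
    best_mode = 0
    num_times = 0
    for _ in range(len(arr)):
        if len(arr) > 0:
            current_num = arr[0]
            current_mode = 0
            while current_num in arr:
                current_mode += 1
                arr.remove(current_num)  # this changes the caller's list
            if current_mode > num_times:
                num_times = current_mode
                best_mode = current_num
        else:
            break
    if num_times == 1:
        best_mode = -1
    return best_mode


numbers = [5, 10, 10, 6, 5]
print(simple_mode_remove(numbers))  # 5
print(numbers)                      # []  (the input has been emptied)

safe = [5, 10, 10, 6, 5]
result = simple_mode_remove(list(safe))  # work on a shallow copy instead
print(safe)                              # [5, 10, 10, 6, 5]
```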
And finally, in Python the [1, 2, 3, 4] construct is called a list, not an array. There exists something called an array in Python, and it's not this (most of the time it's a NumPy array, but it can also be an array from the array module in the stdlib).
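A quick way to see the difference, using only the standard library (NumPy is left out so the snippet runs without extra installs). The variable names are arbitrary.

```python
from array import array

as_list = [1, 2, 3, 4]               # built-in list: can hold any mix of objects
as_array = array("i", [1, 2, 3, 4])  # array.array: one fixed element type (signed int here)

print(type(as_list).__name__)   # list
print(type(as_array).__name__)  # array

as_list.append("four")          # fine for a list
try:
    as_array.append("four")     # rejected: not an integer
except TypeError as exc:
    print("array refused it:", exc)
```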
Your code does everything in a single pass.
The other code contains hidden nested loops in the arr.count() calls.
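The hidden loop inside arr.count() can be avoided altogether by tallying in a dictionary. This is a minimal sketch rather than anything from the original posts. The name simple_mode_single_pass is made up, and it assumes Python 3.7+, where dicts keep insertion order.

```python
def simple_mode_single_pass(arr):
    # One walk over the input: count every value as it is seen.
    counts = {}
    for value in arr:
        counts[value] = counts.get(value, 0) + 1

    # One walk over the distinct values. Because dicts preserve insertion
    # order, values are visited in order of first appearance, and the strict
    # ">" keeps the earliest value when counts tie.
    best, best_count = -1, 1
    for value, n in counts.items():
        if n > best_count:
            best, best_count = value, n
    return best


print(simple_mode_single_pass([10, 4, 5, 2, 4]))    # 4
print(simple_mode_single_pass([5, 10, 10, 6, 5]))   # 5
print(simple_mode_single_pass([1, 2, 3]))           # -1
```

Unlike the remove-based version, this leaves the input list untouched.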