I want a function to generate a list of length n containing an arithmetic sequence of numbers between 0 and 1, but put in a random order.
For example, for the function
def randSequence(n):
    ...
    return myList
randSequence(10)
returns
[0.5, 0.3, 0.9, 0.8, 0.6, 0.2, 0.4, 0.0, 0.1, 0.7]
and
randSequence(5)
returns
[0.4, 0.0, 0.2, 0.8, 0.6]
Currently, I have it so it generates the sequence of numbers in one loop, and randomizes it in another, as follows:
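import random

def randSequence(n):
    step = 1 / n
    setList = []
    myList = []
    # Build the ordered sequence 0, 1/n, 2/n, ...
    for i in range(n):
        setList.append(i * step)
    # Draw elements one at a time from random positions
    for i in range(n):
        index = random.randint(0, len(setList) - 1)
        myList.append(setList.pop(index))
    return myList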
Unfortunately, this solution is slow, especially for large numbers (like n > 1,000,000). Is there a better way to write this code, or even better, is there a function that can do this task for me?
@HeapOverflow suggested exchanging the second loop for the shuffle function:
def randSequence(n):
    step = 1 / n
    myList = []
    for i in range(n):
        myList.append(i * step)
    random.shuffle(myList)
    return myList
This is an order of magnitude faster than before. From past experience, I suspect that the pop function on lists is rather slow and was the main bottleneck in the second loop.
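For reference, the standard library can also produce the shuffled indices directly. A minimal sketch, assuming random.sample over range(n) is acceptable here (the name randSequenceSample is made up for illustration); note that i / n may differ from i * step in the last floating-point digit:

```python
import random

def randSequenceSample(n):
    # random.sample(range(n), n) returns 0..n-1 in random order, without repeats
    return [i / n for i in random.sample(range(n), n)]
```

This avoids the explicit loop but is otherwise equivalent in spirit to building the list and shuffling it.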
First, I'd like to point out that the main reason for the bad performance of your code is this line:
myList.append(setList.pop(index))
The time complexity of list.pop from the middle of a list is roughly O(n), since popping from the middle forces Python to shift every later element down by one position. This makes the overall complexity O(n^2). You can drastically improve performance by making the changes in place, e.g.:
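import random

def randSequenceInplace(n):
    'a.k.a. Fisher-Yates'
    step = 1 / n
    lst = [step * i for i in range(n)]
    for i in range(n - 1):
        # Pick a random position from the not-yet-fixed tail (bounds are inclusive)
        index = random.randint(i, n - 1)
        # Swap in place; this replaces myList.append(setList.pop(index))
        lst[i], lst[index] = lst[index], lst[i]
    return lst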
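If you would rather keep the pop-based structure of the original code, one possible variant is to swap the chosen element into the last slot first, so that every pop removes the final element, which is O(1). This is only a sketch; the name randSequenceSwapPop is made up here:

```python
import random

def randSequenceSwapPop(n):
    # Hypothetical name; same idea as the original loop, but pops only from the end
    step = 1 / n
    setList = [i * step for i in range(n)]
    myList = []
    while setList:
        index = random.randrange(len(setList))
        # Move the chosen element to the last slot, then pop it in O(1)
        setList[index], setList[-1] = setList[-1], setList[index]
        myList.append(setList.pop())
    return myList
```

It uses a second list, so the in-place version is still leaner on memory, but it shows that the slowdown comes from where the pop happens and not from pop itself.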
For completeness, you can go with a vectorized numpy solution or use the previously mentioned random.shuffle to get even better performance. Timings:
n = 10**6
%time randSequence(n)
# CPU times: user 1min 22s, sys: 33 ms, total: 1min 22s
# Wall time: 1min 22s
%time randSequenceInplace(n)
# CPU times: user 1.33 s, sys: 1.91 ms, total: 1.33 s
# Wall time: 1.33 s
%timeit np.random.permutation(n) / n
# 10 loops, best of 3: 22.4 ms per loop
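Note that the numpy expression above yields an array, not a list. A minimal sketch of wrapping it in a function, assuming numpy is installed (randSequenceNumpy and its seed parameter are illustrative additions, not part of the original code):

```python
import numpy as np

def randSequenceNumpy(n, seed=None):
    # seed is an optional, illustrative parameter for reproducible output
    rng = np.random.default_rng(seed)
    # permutation(n) yields 0..n-1 in random order; dividing by n scales into [0, 1)
    return (rng.permutation(n) / n).tolist()
```

Drop the .tolist() call if an array works for you, since converting a large array to a Python list adds some overhead.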