
Speeding up this Python code for large input

I wrote this Python code to do a particular computation in a bigger project. It works fine for smaller values of N, but it doesn't scale up very well for large values, and even though I ran it for a number of hours to collect the data, I was wondering if there is a way to speed it up.

import numpy as np

def FillArray(arr):
    while 0 in arr:
        ind1 = np.random.randint(0, N)
        if arr[ind1] == 0:
            if ind1 == 0:
                arr[ind1] = 1
                arr[ind1 + 1] = 2
            elif ind1 == len(arr) - 1:
                arr[ind1] = 1
                arr[ind1 - 1] = 2
            else:
                arr[ind1] = 1
                arr[ind1 + 1] = 2
                arr[ind1 - 1] = 2
    return arr

N=50000

dist = []
for i in range(1000):
    arr = [0 for x in range(N)]
    dist.append(FillArray(arr).count(2))

For N = 50,000, it currently takes slightly over a minute on my computer for one iteration to fill the array. So if I want to simulate this, let's say, 1000 times, it takes many hours. Is there something I can do to speed this up?

Edit 1: I forgot to mention what it actually does. I have a list of length N, initialized with a zero in each entry. I then pick a random index between 0 and N; if the list contains a zero at that index, I replace it with 1 and replace its neighbouring entries with 2, to indicate that they are not filled by a 1 but can no longer be filled. I keep doing this until the whole list is populated with 1s and 2s, and then I count how many entries contain 2; that count is the result of the computation. In other words, I want to find out how many entries remain unfilled when an array is filled randomly under this constraint.
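To make the rule above concrete, here is a minimal, self-contained restatement of that process with a seeded RNG for reproducibility (the helper name `fill` is mine, not from the question):

```python
import random

def fill(n, rng=random.Random(42)):
    # 0 = empty, 1 = filled, 2 = blocked (neighbour of a 1)
    arr = [0] * n
    while 0 in arr:
        i = rng.randrange(n)        # pick a uniformly random index
        if arr[i] == 0:
            arr[i] = 1              # fill it with a 1 ...
            if i > 0:
                arr[i - 1] = 2      # ... and block both neighbours
            if i < n - 1:
                arr[i + 1] = 2
    return arr

res = fill(10)
print(res, res.count(2))
```

Note that a neighbour of a freshly chosen zero can never already hold a 1 (a 1's neighbours are always 2), so overwriting it with 2 is safe.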

Obviously I do not claim that this is the most efficient way to find this number, so I am hoping there is a better alternative in case this code can't be sped up.

As @SylvainLeroux noted in the comments, the approach of trying to find what zero you're going to change by drawing a random location and hoping it's zero is going to slow down when you start running out of zeros. Simply choosing from the ones you know are going to be zero will speed it up dramatically. Something like

def faster(N):
    # pad on each side
    arr = np.zeros(N+2)
    arr[0] = arr[-1] = -1 # ignore edges
    while True:
        # zeros left
        zero_locations = np.where(arr == 0)[0]
        if not len(zero_locations):
            break # we're done
        np.random.shuffle(zero_locations)
        for zloc in zero_locations:
            if arr[zloc] == 0:
                arr[zloc-1:zloc+2] = [2, 1, 2]
    return arr[1:-1] # remove edges

will be much faster (times on my old notebook):

>>> %timeit faster(50000)
10 loops, best of 3: 105 ms per loop
>>> %time [(faster(50000) == 2).sum() for i in range(1000)]
CPU times: user 1min 46s, sys: 4 ms, total: 1min 46s
Wall time: 1min 46s

We could improve this by vectorizing more of the computation, but depending on your constraints this might already suffice.
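One way to vectorize further (my own sketch, not from the answer above) rests on the observation that visiting cells in a uniformly random order is equivalent to giving every cell an i.i.d. random priority and processing cells in increasing priority. A cell can then claim a 1 in parallel whenever its priority beats both of its still-empty neighbours, which removes the inner Python loop entirely; the function name `faster_vec` is mine:

```python
import numpy as np

def faster_vec(N, rng=np.random.default_rng(0)):
    # i.i.d. priorities: processing cells in priority order reproduces
    # the uniform-random sequential process
    pri = rng.random(N)
    state = np.zeros(N, dtype=np.int8)  # 0 empty, 1 filled, 2 blocked
    while (state == 0).any():
        empty = state == 0
        p = np.where(empty, pri, np.inf)
        # a cell may take a 1 now if its priority beats both neighbours'
        left = np.empty(N);  left[0] = np.inf;  left[1:] = p[:-1]
        right = np.empty(N); right[-1] = np.inf; right[:-1] = p[1:]
        winners = empty & (p < left) & (p < right)
        state[winners] = 1
        # block the neighbours of the newly placed 1s
        blocked = np.zeros(N, dtype=bool)
        blocked[:-1] |= winners[1:]
        blocked[1:] |= winners[:-1]
        state[blocked & (state == 0)] = 2
    return state
```

Each pass resolves all current "local minima" at once, so the number of passes grows only logarithmically with N; this parallel local-minimum selection produces exactly the same final configuration as the sequential greedy by priority.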

First I will reformulate the problem from tri-variate to bi-variate. What you are doing is splitting the vector of length N into two smaller vectors at a random point k.

Let's assume you start with a vector of zeros. You put a '1' at a randomly selected index k, which leaves two smaller vectors of zeros: [0..k-2] and [k+2..N-1]. There is no need for a third state. You repeat the process until exhaustion, i.e. until the remaining sub-vectors are empty.

Using recursion this is reasonably fast even on my iPad mini with Pythonista.

import numpy as np
from random import randint

def SplitArray(l, r):
    # place a 1 at a random index k in [l, r], then recurse on the two
    # remaining sub-vectors; l > r means the sub-vector is empty
    if l > r:
        return []
    k = randint(l, r)
    arr[k] = 1
    return SplitArray(l, k - 2) + [k] + SplitArray(k + 2, r)

N = 50000
L = 1000
dist = np.zeros(L)
for i in range(L):
    arr = [0] * N
    SplitArray(0, N - 1)
    dist[i] = arr.count(0)

print(dist, np.mean(dist), np.std(dist))

However, if you would like to make it really fast, the bivariate problem can be coded very effectively and naturally with bit arrays instead of storing 1 and 0 in arrays of integers (or, worse, floats in numpy arrays). The bit manipulation should be quick, and in some cases you could easily get close to machine-level speed.

Something along these lines (this is an idea, not optimal code):

from bitarray import bitarray
from random import randint
import numpy as np

def SplitArray(l, r):
    # same splitting scheme, but setting bits in a bitarray
    if l > r:
        return []
    k = randint(l, r)
    arr[k] = 1
    return SplitArray(l, k - 2) + [k] + SplitArray(k + 2, r)

N = 50000
L = 1000
dist = np.zeros(L)
for i in range(L):
    arr = bitarray(N)
    arr.setall(0)
    SplitArray(0, N - 1)
    dist[i] = arr.count(0)   # bitarray counts its zero bits in C

print(np.mean(dist), np.std(dist))

(This uses the third-party bitarray package.)

The solution converges very nicely, so perhaps half an hour spent looking for an analytical solution would make this whole MC exercise unnecessary?
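For what it's worth, this looks like the classic "unfriendly seating" / discrete Rényi parking problem; if I remember the result correctly, the expected fraction of 1s tends to (1 − e⁻²)/2 ≈ 0.4323 as N grows, i.e. roughly 0.5677·N entries end up as 2. A quick, stack-based (recursion-free) check of that recollection against the splitting simulation (the helper name `ones_fraction` is mine):

```python
import math
import random

def ones_fraction(n, rng):
    # iterative version of the interval-splitting process above
    stack = [(0, n - 1)]
    ones = 0
    while stack:
        l, r = stack.pop()
        if l > r:
            continue               # empty sub-vector
        k = rng.randint(l, r)      # place a 1 at k ...
        ones += 1
        stack.append((l, k - 2))   # ... and split around it
        stack.append((k + 2, r))
    return ones / n

rng = random.Random(0)
est = sum(ones_fraction(20000, rng) for _ in range(10)) / 10
print(est, (1 - math.exp(-2)) / 2)   # the two numbers should be close
```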
