
Optimize function slicing numpy arrays

I have the following function, which takes a numpy array of floats and an integer as its arguments. Each row of the array counts is the result of one experiment. I want to randomly draw a list of experiments (with replacement) and add them up, then repeat this process to create many sample groups.

import numpy as np

def my_function(counts, nSamples):
    ''' Create multiple randomly drawn (with replacement)
        samples from the raw data '''
    nSat, nRegions = counts.shape
    sampleData = np.zeros((nSamples, nRegions))
    for i in range(nSamples):
        # draw nSat row indices with replacement
        rc = np.random.randint(0, nSat, size=nSat)
        # gather the drawn rows and sum them column-wise
        sampleData[i] = counts[rc].sum(axis=0)
    return sampleData

This function seems quite slow. Typically counts has around 100,000 rows (and 4 columns) and nSamples is around 2000. I have tried numba and implicit for loops to speed up the code, with no success. What other methods could increase the speed?

I have run cProfile on the function and got the following output.

         8005 function calls in 60.208 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   60.208   60.208 <string>:1(<module>)
     2000    0.010    0.000   13.306    0.007 _methods.py:31(_sum)
        1   40.950   40.950   60.208   60.208 optimize_bootstrap.py:25(bootstrap)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     2000    5.938    0.003    5.938    0.003 {method 'randint' of 'mtrand.RandomState' objects}
     2000   13.296    0.007   13.296    0.007 {method 'reduce' of 'numpy.ufunc' objects}
     2000    0.015    0.000   13.321    0.007 {method 'sum' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
        1    0.000    0.000    0.000    0.000 {range}

Are you sure that

rc = np.random.randint(0,nSat,size=nSat)

is what you want, rather than size=someconstant? As written, every sample draws nSat indices, so you are summing over as many rows as the whole array contains, with many repeats.


Edit: does it help to replace the slicing altogether with a matrix product?

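rcvec = np.zeros(nSat, dtype=int)   # np.int was removed in NumPy 1.24; plain int works
for j in rc:                        # j rather than i, so the outer sample index i is not overwritten
    rcvec[j] += 1                   # count how often each row was drawn
sampleData[i] = rcvec.dot(counts)   # weighted sum of rows replaces counts[rc].sum(axis=0)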

(Maybe there is a function in numpy that can give you rcvec faster.)
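
np.bincount looks like one such function: called with minlength=nSat, it returns how many times each row index occurs in rc, which is exactly rcvec. Below is a minimal sketch under stated assumptions. The names counts, nSat, rc and rcvec come from the posts above, while the helper name bootstrap_bincount, the seed parameter and the use of np.random.default_rng are illustrative choices of mine. Whether this beats the original loop depends on the array sizes and the NumPy build, so it is worth timing on real data.

```python
import numpy as np

def bootstrap_bincount(counts, nSamples, seed=None):
    """Resample rows with replacement and sum each resample
    as a count-weighted dot product instead of a gather + sum."""
    # Assumption: the Generator API is used here instead of np.random.randint.
    rng = np.random.default_rng(seed)
    nSat, nRegions = counts.shape
    sampleData = np.zeros((nSamples, nRegions))
    for i in range(nSamples):
        rc = rng.integers(0, nSat, size=nSat)
        # rcvec[k] = number of times row k was drawn in this resample
        rcvec = np.bincount(rc, minlength=nSat)
        sampleData[i] = rcvec.dot(counts)
    return sampleData
```

The dot product avoids building the temporary copy counts[rc] of shape (nSat, nRegions) on every iteration, which is where the time in the sum step of the original loop goes.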

Simply generate all the indices in one go by passing a 2D size to np.random.randint, use them to index into the counts array, and then sum along the first axis, just as you were doing in the loopy version.

Thus, a vectorized and therefore faster version would look like this:

RC = np.random.randint(0,nSat,size=(nSat, nSamples))
sampleData_out = counts[RC].sum(axis=0)
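
One caveat about the fully vectorized form: counts[RC] materialises an array of shape (nSat, nSamples, nRegions) before the sum is taken. With the sizes quoted in the question (about 100,000 rows, 4 columns, 2000 samples) that is roughly 8e8 float64 values, or about 6.4 GB, on top of the index array RC itself. A possible compromise is to vectorize over chunks of samples. The sketch below is an illustration rather than part of the original answer. The name bootstrap_chunked and the chunk and seed parameters are hypothetical, and np.random.default_rng is my substitution for np.random.randint.

```python
import numpy as np

def bootstrap_chunked(counts, nSamples, chunk=50, seed=None):
    """Vectorized bootstrap sums, computed `chunk` samples at a time
    so the temporary counts[RC] array stays bounded in size."""
    rng = np.random.default_rng(seed)
    nSat, nRegions = counts.shape
    sampleData = np.empty((nSamples, nRegions))
    for start in range(0, nSamples, chunk):
        stop = min(start + chunk, nSamples)
        # One column of row indices per sample in this chunk.
        RC = rng.integers(0, nSat, size=(nSat, stop - start))
        # counts[RC] has shape (nSat, stop - start, nRegions);
        # summing over axis 0 adds up the drawn rows of each sample.
        sampleData[start:stop] = counts[RC].sum(axis=0)
    return sampleData
```

With chunk=50 and the sizes above, the temporary array holds about 100,000 x 50 x 4 float64 values, roughly 160 MB. The chunk size trades memory against the number of Python-level iterations and would need tuning on the actual machine.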

