How to speed up a loop in Python

Question

I would like to speed up this short piece of code:

      max_x=array([max(x[(id==dummy)]) for dummy in ids])

x and id are NumPy arrays with the same dimensions, and ids is a smaller array. What is a fast way to do this with vectorized operations?
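For reference, here is a minimal runnable version of the loop above. The toy values are assumptions made up for illustration and are not part of the original question; it can serve as a baseline for checking any faster variant.

```python
import numpy as np

# Toy data, assumed purely for illustration.
x = np.array([0.5, 2.0, 1.5, 3.0, 0.1, 4.0])
id = np.array([0, 1, 0, 2, 1, 2])  # same shape as x (shadows the builtin, as in the question)
ids = np.array([0, 2])             # the subset of ids we want maxima for

# One boolean mask and one Python-level max() per requested id.
max_x = np.array([max(x[id == dummy]) for dummy in ids])
print(max_x)
```

Each iteration scans the whole of x, so the cost grows with len(ids) times x.size.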

Answer 1

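Unless id has some structure, this is not easy to vectorize (as far as I know). The bottleneck is probably evaluating id==dummy so often. The only solution I can think of is to sort, and because np.max() has no reduce functionality it still needs quite a bit of Python code (edit: there actually is a reduce function available through np.fmax). For x of shape 1000x1000 and id/ids in 0..100 this is about 3x faster, but since it is rather complex it is only worthwhile for larger problems with many ids: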

import numpy as np

def max_at_ids(x, id, ids):
    # create a 1D view of x and id:
    r_x = x.ravel()
    r_id = id.ravel()
    sorter = np.argsort(r_id)

    # create new sorted arrays, so that equal ids are contiguous:
    r_id = r_id[sorter]
    r_x = r_x[sorter]

    # unfortunately there is no reduce functionality for np.max...

    ids = np.unique(ids) # create a sorted, unique copy, just in case

    # w gives the places where the sorted id array changes value:
    w = np.where(r_id[:-1] != r_id[1:])[0] + 1

I originally provided this solution, which did a pure Python loop over the slices, but below is a shorter (and faster) version:

    # The result array:
    max_x = np.empty(len(ids), dtype=r_x.dtype)
    start_idx = 0
    # w is empty when id holds a single value; the only slice then runs to the end
    end_idx = w[0] if len(w) else None
    i_ids = 0
    i_w = 0

    while i_ids < len(ids) and i_w < len(w) + 1:
        if ids[i_ids] == r_id[start_idx]:
            max_x[i_ids] = r_x[start_idx:end_idx].max()
            i_ids += 1
            i_w += 1
        elif ids[i_ids] > r_id[start_idx]:
            i_w += 1
        else:
            i_ids += 1
            continue # skip updating start_idx/end_idx

        start_idx = end_idx
        # Set it to None for the last slice (might be faster to do differently)
        end_idx = w[i_w] if i_w < len(w) else None

    return ids, max_x

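Edit: an improved version for calculating the maximum of each slice:

There is a way to remove the Python loop by using np.fmax.reduceat, which may outperform the previous version by a lot if the slices are small (and is actually quite elegant):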

# just add 0 at the start of w so the first slice begins at index 0
# (or calculate the first slice by hand and use the out=... keyword argument
# to avoid even this copy).
w = np.concatenate(([0], w))
max_x = np.fmax.reduceat(r_x, w)
return ids, max_x

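There are probably still some small things that would make this a little faster. If id/ids has some structure, it should be possible to simplify the code, and perhaps to use a different approach for a much larger speedup. Otherwise the speedup from this code should be large, as long as there are many (unique) ids and the x/id arrays are not very small. Note that the code enforces np.unique(ids), which is probably a reasonable assumption.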
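The pieces above can be put together into one self-contained function. This is only a sketch under stated assumptions: the name max_at_ids_reduceat is hypothetical, the input arrays are assumed to be non-empty, and requested ids that never occur in id are dropped from the result (a choice made here, not something the answer specifies). Since np.fmax.reduceat produces one maximum per id present in id, np.searchsorted is used to pick out the requested ones.

```python
import numpy as np

def max_at_ids_reduceat(x, id, ids):
    """Maximum of x for each requested id (hypothetical helper name)."""
    r_x = np.asarray(x).ravel()
    r_id = np.asarray(id).ravel()

    # Sort so that equal ids form contiguous runs.
    sorter = np.argsort(r_id, kind="stable")
    r_id = r_id[sorter]
    r_x = r_x[sorter]

    # Start index of every run of equal ids, including the first run at 0.
    starts = np.concatenate(([0], np.flatnonzero(r_id[:-1] != r_id[1:]) + 1))
    present = r_id[starts]                     # sorted unique ids found in `id`
    group_max = np.fmax.reduceat(r_x, starts)  # one maximum per run

    # Map the requested ids onto the runs; ids absent from `id` are dropped
    # (an assumption of this sketch).
    ids = np.unique(ids)
    pos = np.minimum(np.searchsorted(present, ids), len(present) - 1)
    found = present[pos] == ids
    return ids[found], group_max[pos[found]]

x = np.array([[0.5, 2.0, 1.5], [3.0, 0.1, 4.0]])
id = np.array([[0, 1, 0], [2, 1, 2]])
found_ids, max_x = max_at_ids_reduceat(x, id, [2, 0, 7, 2])
print(found_ids, max_x)
```

Here the id 7 does not occur in id, so only the maxima for 0 and 2 are returned.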

Answer 2

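Using x[(id==dummy)].max() instead of the built-in max should give some speedup, because the built-in iterates over the array elements in Python while the array method runs in compiled code.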
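A quick way to check this on your own data is to time both variants. The sketch below uses made-up random data (array size, id range and repeat count are all assumptions), so the numbers it prints say nothing about any particular workload.

```python
import timeit
import numpy as np

# Made-up test data; sizes are assumptions chosen only to keep the run short.
rng = np.random.default_rng(0)
x = rng.random(100_000)
id = rng.integers(0, 100, size=x.shape)
ids = np.arange(100)

def with_builtin_max():
    # Built-in max() iterates over the selected elements one by one.
    return np.array([max(x[id == dummy]) for dummy in ids])

def with_ndarray_max():
    # ndarray.max() reduces the selected elements in compiled code.
    return np.array([x[id == dummy].max() for dummy in ids])

# Both variants must agree before the timings mean anything.
assert np.array_equal(with_builtin_max(), with_ndarray_max())

print("builtin max:", timeit.timeit(with_builtin_max, number=3))
print("ndarray.max:", timeit.timeit(with_ndarray_max, number=3))
```

Either way the mask id == dummy is still evaluated once per id, so this change only removes the Python-level iteration inside max.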

Answer 3

scipy.ndimage.maximum does exactly that:

import numpy as np
from scipy import ndimage as nd

N = 100  # number of values
K = 10   # number of classes

# generate random data
x   = np.random.rand(N)
ID  = np.random.randint(0, K, N)  # random class id for each x[i]
ids = np.random.randint(0, K, 5)  # select 5 random classes

# do what you ask: maximum of x over each requested class
max_per_id = nd.maximum(x, labels=ID, index=ids)

print(dict(zip(ids, max_per_id)))

If you want to compute the maximum for all ids, set ids = ID (or ids = np.unique(ID) to get one entry per class).

Note that if a class in ids is not found in ID (i.e. no x is labeled with that class), the maximum returned for that class is 0.
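A returned 0 cannot be told apart from a genuine maximum of 0. If that matters, one option is to compute the per-class maximum with plain NumPy and mark absent classes explicitly. The sketch below is not scipy's implementation: it swaps in np.maximum.at, the helper name max_per_label and its fill parameter are hypothetical, and it assumes the labels are non-negative integers.

```python
import numpy as np

def max_per_label(x, labels, index, fill=np.nan):
    """Per-label maximum with an explicit fill value for absent labels.

    Hypothetical helper; assumes non-negative integer labels.
    """
    x = np.asarray(x, dtype=float).ravel()
    labels = np.asarray(labels).ravel()
    index = np.asarray(index)

    n = int(max(labels.max(), index.max())) + 1
    out = np.full(n, -np.inf)
    # Unbuffered in-place maximum: repeated labels are all taken into account.
    np.maximum.at(out, labels, x)

    # Track which labels actually occur, so absent ones can be flagged.
    seen = np.zeros(n, dtype=bool)
    seen[labels] = True

    result = out[index]            # fancy indexing returns a copy
    result[~seen[index]] = fill
    return result

res = max_per_label([0.5, 2.0, 1.5, 3.0], [0, 1, 0, 3], [3, 2, 0])
print(res)
```

In this example class 2 never occurs in the labels, so its entry is the fill value rather than 0.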


Answer 1
3, accepted, 2012-08-15 14:02:55

Answer 2
1, 2012-08-15 10:45:49

Answer 3
0, 2014-01-10 11:02:31
