Iterate over `True` entries of Boolean Numpy array

I want to loop over each index i at which the Boolean array X is True.

Is there something more efficient/pythonic than wrapping np.nonzero inside np.nditer as follows?

for i in np.nditer(np.nonzero(X), flags=['zerosize_ok']):
    myfunction(Y[i],Z2[Z[i]])

The problem here is that it iterates twice instead of just once, and it occupies memory: first, np.nonzero iterates through X and stores the indices in a big array, then np.nditer iterates through that array.

Is there a command (somewhat similar to np.nditer, so to speak) for efficiently iterating over the True entries of a Boolean array directly, without first listing them all explicitly with np.nonzero? (Iterating over all entries and checking each with an if statement is probably less efficient than an iterator offered by NumPy, if one exists.)

People downvote because looping over the entries of a NumPy array is a big no-no. We use NumPy because it is fast; treating every element by itself, rather than operating on the whole array at once, gives you Python-level performance rather than NumPy/C performance.

Wanting to exclude values by supplying an array of true and false values is very common and is called masking. In NumPy you do it by indexing the data array with the Boolean array. E.g. np.array([1,2,3])[np.array([True,False,True])] gives you np.array([1, 3]).

So basically try arranging things in a way that you can do

myfunction(Y[mask],Z2[Z[mask]]).

There are a couple of techniques for doing that. One way is to write myfunction using only NumPy functions; another is to use decorators like numba.vectorize, numba.guvectorize or numba.njit, among others.
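As a rough illustration of the first technique, here is a minimal sketch in which myfunction is written with NumPy operations only, so it can take the masked arrays in one call. The body of myfunction (a plain product) and the sample values are assumptions made up for this example; the original function is not shown in the question.

```python
import numpy as np

def myfunction(y, z2_vals):
    # Hypothetical body: an element-wise product, which works on whole arrays
    return y * z2_vals

Y = np.array([10, 20, 30, 40])
Z = np.array([3, 2, 1, 0])             # values are indices into Z2
Z2 = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([True, False, True, False])

# One call on the masked arrays, no Python-level loop
result = myfunction(Y[mask], Z2[Z[mask]])

# The same values computed one index at a time, for comparison only
looped = np.array([myfunction(Y[i], Z2[Z[i]]) for i in np.flatnonzero(mask)])
print(result, looped)
```

Whether this rearrangement is possible depends on what myfunction really does; if it cannot be expressed with array operations, the Numba decorators mentioned above are the other route.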

How about using numpy.vectorize with a Boolean selector as an index?

np.vectorize takes a function and returns a vectorized version of that function that accepts arrays. You can then run the function on the whole array, or on a subset of it, using a selector.

In NumPy you can select a subset of an array by indexing it with a list of integer positions.

When called with a single condition, the numpy.where function returns the indices of the array elements for which that condition is true.
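A small sketch of those two building blocks, using made-up sample arrays: np.where on a Boolean array returns a tuple with one index array per dimension, and indexing with those positions selects the same elements as indexing with the Boolean array itself.

```python
import numpy as np

X = np.array([True, False, True, True, False])
Y = np.array([10, 11, 12, 13, 14])

idx_tuple = np.where(X)   # tuple holding one index array per dimension
idx = idx_tuple[0]        # positions of the True entries in this 1-D array

by_list = Y[[0, 2, 3]]    # indexing with a list of integer positions
by_index = Y[idx]         # indexing with the positions found by np.where
by_mask = Y[X]            # indexing with the Boolean array directly

print(idx, by_index, by_mask)
```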

Putting that together:

import numpy as np
import random

random.seed(0) # For repeatability

def myfunction(y, z, z2):
    return y*z2[z]

Y = np.arange(100)      # Array of any length
Z = np.arange(len(Y))   # Must be the same length as Y; its values index into Z2
Z2 = np.arange(len(Y))  # Must be long enough for every index stored in Z
X = np.array([random.choice([True, False]) for _ in range(len(Y))])  # Throw some coins

# X is now a Boolean array of True/False
# print( "\n".join(f"{n}: {v}" for n, v in enumerate(X)))
print(np.where(X)) # Where gets the indexes of True items

# Vectorise our function
vfunc = np.vectorize(myfunction, excluded={2}) # We exclude {2} as it's the Z2 array

# Do the work
res = vfunc(Y[np.where(X)],Z[np.where(X)],Z2)

# Check our work
for outpos, inpos in enumerate(np.where(X)[0]):
    assert myfunction(Y[inpos], Z[inpos], Z2) == res[outpos], f"Mismatch at in={inpos}, out={outpos}"

# Print the results
print(res)
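One caveat: the NumPy documentation describes np.vectorize as a convenience whose implementation is essentially a for loop, so it is not expected to run faster than an explicit loop. For the particular myfunction used above, the same result can also be obtained by calling the function directly on the selected arrays. The sketch below checks that, assuming the same myfunction and array shapes as in the example; the random mask built with np.random.default_rng is a substitution made for this sketch.

```python
import numpy as np

def myfunction(y, z, z2):
    return y * z2[z]

rng = np.random.default_rng(0)      # seeded for repeatability
Y = np.arange(100)
Z = np.arange(len(Y))
Z2 = np.arange(len(Y))
X = rng.random(len(Y)) < 0.5        # random Boolean mask

# Element-by-element evaluation through np.vectorize
vfunc = np.vectorize(myfunction, excluded={2})
via_vectorize = vfunc(Y[X], Z[X], Z2)

# Whole-array evaluation: y * z2[z] already works on arrays
direct = myfunction(Y[X], Z[X], Z2)

print(np.array_equal(via_vectorize, direct))
```

This shortcut only applies when the function body is made of operations that accept arrays; otherwise np.vectorize or a Numba decorator remains the fallback.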
