Numpy find number of occurrences in a 2D array
Is there a numpy function to count the number of occurrences of a certain value in a 2D numpy array? For example:
np.random.random((3,3))
array([[ 0.68878371, 0.2511641 , 0.05677177],
[ 0.97784099, 0.96051717, 0.83723156],
[ 0.49460617, 0.24623311, 0.86396798]])
How do I find the number of times 0.83723156 occurs in this array?
import numpy as np
arr = np.random.random((3,3))
# mark the elements that are exactly equal to the target value
condition = arr == 0.83723156
# count the elements that satisfy the condition
np.count_nonzero(condition)
The value of condition is a boolean array representing whether each element of the array satisfied the condition. np.count_nonzero counts how many nonzero elements are in the array. In the case of booleans, it counts the number of elements with a True value.
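As a minimal sketch of the above, here is the same count on the fixed array from the question, so the result is reproducible. The variable names total and per_row are only illustrative, and the per-row count assumes a NumPy version whose np.count_nonzero accepts the axis argument.

```python
import numpy as np

# The array from the question, entered as literals so the result is reproducible.
arr = np.array([[0.68878371, 0.2511641, 0.05677177],
                [0.97784099, 0.96051717, 0.83723156],
                [0.49460617, 0.24623311, 0.86396798]])

# Boolean array with the same shape as arr: True where the value matches exactly.
condition = arr == 0.83723156

# Total number of matches in the whole 2D array.
total = int(np.count_nonzero(condition))

# Number of matches in each row (axis=1 counts along the columns of each row).
per_row = np.count_nonzero(condition, axis=1)

print(total)    # 1
print(per_row)  # [0 1 0]
```

The exact comparison works here only because the array element and the target come from the same literal; values produced by arithmetic usually need a tolerance.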
To deal with floating point accuracy, you could do something like this instead:
condition = np.fabs(arr - 0.83723156) < 0.001
For floating point arrays, np.isclose is a much better option than either comparing for exact equality or defining a custom range.
>>> a = np.array([[ 0.68878371, 0.2511641 , 0.05677177],
[ 0.97784099, 0.96051717, 0.83723156],
[ 0.49460617, 0.24623311, 0.86396798]])
>>> np.isclose(a, 0.83723156).sum()
1
Note that real numbers are not represented exactly in a computer, which is why np.isclose will work while == doesn't:
>>> (0.1 + 0.2) == 0.3
False
Instead:
>>> np.isclose(0.1 + 0.2, 0.3)
True
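The tolerance used by np.isclose can also be passed explicitly through its rtol and atol parameters. The sketch below is only an illustration: the last array element has been changed to a hypothetical value that differs from the target by about 4e-8, and the choice of atol=1e-6 is an assumption, not a recommendation.

```python
import numpy as np

# The question's array, except that the last element is a hypothetical value
# that differs from the target by about 4e-8 (for illustration only).
a = np.array([[0.68878371, 0.2511641, 0.05677177],
              [0.97784099, 0.96051717, 0.83723156],
              [0.49460617, 0.24623311, 0.8372316]])
target = 0.83723156

# Exact comparison only finds the element that is bit-for-bit equal.
exact_count = int((a == target).sum())

# isclose with an explicit absolute tolerance (relative tolerance disabled)
# also finds the nearby value.
close_count = int(np.isclose(a, target, rtol=0.0, atol=1e-6).sum())

print(exact_count)  # 1
print(close_count)  # 2
```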
To count the number of times x appears in any array, you can simply sum the boolean array that results from a == x:
>>> import numpy
>>> col = numpy.arange(3)
>>> cols = numpy.tile(col, 3)
>>> (cols == 1).sum()
3
It should go without saying, but I'll say it anyway: this is not very useful with floating point numbers unless you specify a range, like so:
>>> a = numpy.random.random((3, 3))
>>> ((a > 0.5) & (a < 0.75)).sum()
2
This general principle works for all sorts of tests. For example, if you want to count the number of floating point values that are integral:
>>> a = numpy.random.random((3, 3)) * 10
>>> a
array([[ 7.33955747, 0.89195947, 4.70725211],
[ 6.63686955, 5.98693505, 4.47567936],
[ 1.36965745, 5.01869306, 5.89245242]])
>>> a.astype(int)
array([[7, 0, 4],
[6, 5, 4],
[1, 5, 5]])
>>> (a == a.astype(int)).sum()
0
>>> a[1, 1] = 8
>>> (a == a.astype(int)).sum()
1
You can also use np.isclose() as described by Imanol Luengo, depending on what your goal is. But often, it's more useful to know whether values are in a range than to know whether they are arbitrarily close to some arbitrary value.
The problem with isclose is that its default tolerance values (rtol and atol) are arbitrary, and the results it generates are not always obvious or easy to predict. To deal with complex floating point arithmetic, it does even more floating point arithmetic! A simple range is much easier to reason about precisely. (This is an expression of a more general principle: first, do the simplest thing that could possibly work.)
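To illustrate how the default tolerances can surprise, here is a small sketch with values near zero. It relies on the documented test |a - b| <= atol + rtol * |b| and the documented defaults rtol=1e-05 and atol=1e-08; the array tiny and the range bounds are made up for the example.

```python
import numpy as np

# Hypothetical data: three small values that differ by factors of 2 and 5.
tiny = np.array([1e-9, 2e-9, 5e-9])

# With the default atol=1e-08, every one of them counts as "close" to 1e-9,
# because all the differences are smaller than the absolute tolerance.
default_count = int(np.isclose(tiny, 1e-9).sum())

# With a purely relative tolerance, only the matching value is close.
relative_count = int(np.isclose(tiny, 1e-9, rtol=1e-5, atol=0.0).sum())

# An explicit range states the intent directly.
range_count = int(((tiny > 0.5e-9) & (tiny < 1.5e-9)).sum())

print(default_count)   # 3
print(relative_count)  # 1
print(range_count)     # 1
```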
Still, isclose and its cousin allclose have their uses. I usually use them to see if a whole array is very similar to another whole array, which doesn't seem to be your question.
In case it is of use to anyone: for very large 2D arrays, if you want to count how many times each element appears in the entire array, you can flatten the array into a list and then count how many times each element appears:
from itertools import chain
from collections import Counter
# the large 2D array is called arr
flatten_arr = list(chain.from_iterable(arr))
# map each distinct value to its number of occurrences
dico_nodeid_appearence = Counter(flatten_arr)
# how many times x appears in arr
dico_nodeid_appearence[x]
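A different way to get the same per-value counts, not the Counter approach above, is np.unique with return_counts=True, which stays inside NumPy and flattens the array itself. This is only a sketch on a small made-up integer array; the name occurrences is illustrative.

```python
import numpy as np

# Small hypothetical array standing in for the large one.
arr = np.array([[1, 2, 2],
                [3, 2, 1],
                [1, 1, 4]])

# np.unique flattens the input and returns the sorted distinct values
# together with how often each one occurs.
values, counts = np.unique(arr, return_counts=True)

# Build a plain dict for lookups by value.
occurrences = dict(zip(values.tolist(), counts.tolist()))

print(occurrences)     # {1: 4, 2: 3, 3: 1, 4: 1}
print(occurrences[2])  # 3
```

As with ==, this counts exact matches only, so it is best suited to integer or otherwise exactly representable data.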