Counting the number of elements in a numpy ndarray

Question

How do I count how many times each value occurs in an ndarray?

What I want to do is run a OneHotEncoder on all the values that appear at least N times in my ndarray.

I also want to replace every value that appears fewer than N times with another element that does not appear in the array (let's call it new_value).

So, for example, I have:

import numpy as np

# The cells have different lengths, so recent NumPy versions need dtype=object.
a = np.array([[[2], [2,3], [3,34]],
              [[3], [4,5], [3,34]],
              [[3], [2,3], [3,4] ]], dtype=object)

With threshold N=2, I want something like:

# Pseudocode: count() and this use of OneHotEncoder are only illustrative.
b = [OneHotEncoder(a[:,[i]])[0] if count(a[:,[i]])>=2
     else OneHotEncoder(new_value) for i in range(a.shape[1])]

Just to illustrate the substitutions I want (ignoring the OneHotEncoder and using new_value=10), my array should look like:

a = np.array([[[10], [2,3], [3,34]],
              [[3], [10], [3,34]],
              [[3], [2,3], [10] ]], dtype=object)

Answer 1

How about something like this?

First, count how many times each unique element appears in the array:

>>> a=np.random.randint(0,5,(3,3))
>>> a
array([[0, 1, 4],
       [0, 2, 4],
       [2, 4, 0]])
>>> ua,uind=np.unique(a,return_inverse=True)
>>> count=np.bincount(uind)
>>> ua
array([0, 1, 2, 4]) 
>>> count
array([3, 1, 2, 3])

The ua and count arrays show that 0 appears 3 times, 1 appears once, 2 appears twice, and 4 appears 3 times.
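On NumPy 1.9 and later, `np.unique` can also return the counts directly, which avoids the separate `np.bincount` step. A minimal sketch using the same array as above (the name `counts` is only illustrative):

```python
import numpy as np

a = np.array([[0, 1, 4],
              [0, 2, 4],
              [2, 4, 0]])

# return_counts=True yields the number of occurrences of each unique value
ua, counts = np.unique(a, return_counts=True)

for value, n in zip(ua, counts):
    print(value, "appears", n, "time(s)")
```

Here `ua` and `counts` line up index by index, just like `ua` and `count` in the session above.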

import numpy as np

def mask_fewest(arr,thresh,replace):
    #ua has all of the unique elements, count has the number of times
    #each appears.
    ua,uind=np.unique(arr,return_inverse=True)
    uind=uind.ravel()  #newer NumPy versions may return uind with arr's shape
    count=np.bincount(uind)

    #Find which elements do not appear at least `thresh` times and create a mask
    #(@Jamie's suggestion to make the rep_mask faster).
    rep_mask = np.isin(uind, np.where(count < thresh)[0])

    #Replace elements based on the above mask; note that arr is modified in place.
    arr.flat[rep_mask]=replace

    return arr


>>> a=np.random.randint(2,8,(4,4))
>>> print(a)
[[6 7 7 3]
 [7 5 4 3]
 [3 5 2 3]
 [3 3 7 7]]


>>> print(mask_fewest(a,2,10))
[[10  7  7  3]
 [ 7  5 10  3]
 [ 3  5 10  3]
 [ 3  3  7  7]]
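To connect the masking step to the one-hot encoding mentioned in the question, here is a rough sketch that masks rare values and then builds a one-hot matrix with plain NumPy. The helper `one_hot_frequent` and its return values are hypothetical names, and the comparison trick is only a stand-in for a real encoder; with scikit-learn you would more likely pass `masked.reshape(-1, 1)` to `OneHotEncoder().fit_transform`.

```python
import numpy as np

def one_hot_frequent(arr, thresh, new_value):
    """Replace values seen fewer than `thresh` times, then one-hot encode.

    Hypothetical helper; a plain-NumPy stand-in for a real encoder.
    """
    flat = np.asarray(arr).ravel()
    values, inverse, counts = np.unique(
        flat, return_inverse=True, return_counts=True)
    # counts[inverse] gives, for every element, how often its value occurs
    masked = np.where(counts[inverse] < thresh, new_value, flat)
    # Categories are the values that survive masking (new_value included)
    categories = np.unique(masked)
    # Row i has a single 1 in the column of masked[i]'s category
    encoded = (masked[:, None] == categories[None, :]).astype(int)
    return masked.reshape(np.shape(arr)), categories, encoded

a = np.array([[6, 7, 7, 3],
              [7, 5, 4, 3],
              [3, 5, 2, 3],
              [3, 3, 7, 7]])
masked, categories, encoded = one_hot_frequent(a, 2, 10)
print(masked)
print(categories)
print(encoded.shape)
```

Unlike `mask_fewest`, this sketch leaves the input array untouched and returns a new masked copy.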

For the example in the question (let me know whether you intended a 2D array or a 3D array):

>>> a
[[[2] [2, 3] [3, 34]]
 [[3] [4, 5] [3, 34]]
 [[3] [2, 3] [3, 4]]]


>>> mask_fewest(a,2,10)
[[10 [2, 3] [3, 34]]
 [[3] 10 [3, 34]]
 [[3] [2, 3] 10]]
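If the cells really are Python lists of different lengths, sorting them through `np.unique` may not behave the same on every NumPy version. One alternative, sketched here under the assumption that the data can be kept as nested lists, is to count tuple versions of the cells with `collections.Counter`. The helper name `mask_rare_cells` is hypothetical.

```python
from collections import Counter

def mask_rare_cells(rows, thresh, new_value):
    """Replace list-valued cells that occur fewer than `thresh` times."""
    # Lists are unhashable, so count tuple versions of each cell
    counts = Counter(tuple(cell) for row in rows for cell in row)
    return [[cell if counts[tuple(cell)] >= thresh else new_value
             for cell in row]
            for row in rows]

a = [[[2], [2, 3], [3, 34]],
     [[3], [4, 5], [3, 34]],
     [[3], [2, 3], [3, 4]]]

result = mask_rare_cells(a, 2, 10)
for row in result:
    print(row)
```

This returns a new nested list and leaves `a` unchanged, so the original data stays available for comparison.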

Answer 1: score 6, accepted, 2013-07-24 23:51:11