Counting occurrences of columns in a numpy array

Question

Given a 2 x d numpy array M, I want to count the number of occurrences of each column of M. That is, I'm looking for a general version of bincount.

What I have tried so far: (1) converted the columns to tuples, (2) hashed the tuples (via hash) to natural numbers, (3) used numpy.bincount.

This seems rather clumsy. Is anybody aware of a more elegant and efficient way?

Answer 1

You can use collections.Counter:

>>> import numpy as np
>>> a = np.array([[ 0,  1,  2,  4,  5,  1,  2,  3],
...               [ 4,  5,  6,  8,  9,  5,  6,  7],
...               [ 8,  9, 10, 12, 13,  9, 10, 11]])
>>> from collections import Counter
>>> Counter(map(tuple, a.T))
Counter({(2, 6, 10): 2, (1, 5, 9): 2, (4, 8, 12): 1, (5, 9, 13): 1, (3, 7, 11):
1, (0, 4, 8): 1})
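If you need the result back as arrays rather than a Counter, something along these lines should work. This is only a sketch: the names `col_counts`, `unique_cols` and `counts` are mine, and the column order follows first appearance, which assumes Python 3.7+ dict ordering.

```python
from collections import Counter

import numpy as np

a = np.array([[0, 1, 2, 4, 5, 1, 2, 3],
              [4, 5, 6, 8, 9, 5, 6, 7],
              [8, 9, 10, 12, 13, 9, 10, 11]])

# Each column becomes a hashable tuple, so Counter can tally it.
col_counts = Counter(map(tuple, a.T))

# Look up a single column; a column that never occurs counts as 0.
print(col_counts[(1, 5, 9)])  # 2
print(col_counts[(7, 7, 7)])  # 0

# Unpack into arrays: the unique columns (as columns again) and their counts.
unique_cols = np.array(list(col_counts.keys())).T
counts = np.array(list(col_counts.values()))
```

Here `unique_cols` has shape (3, 6) and `counts[i]` is the number of times `unique_cols[:, i]` appears in `a`.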

Answer 2

Given:

a = np.array([[ 0,  1,  2,  4,  5,  1,  2,  3],
              [ 4,  5,  6,  8,  9,  5,  6,  7],
              [ 8,  9, 10, 12, 13,  9, 10, 11]])
b = np.transpose(a)

A more efficient solution than hashing (still requires manipulation):
I create a view of the array with the flexible data type np.void (see here) such that each row becomes a single element. Converting to this shape allows np.unique to operate on it.
```
%%timeit
c = np.ascontiguousarray(b).view(np.dtype((np.void, b.dtype.itemsize * b.shape[1])))
_, idx, cnt = np.unique(c, return_index=True, return_counts=True)

# counts are in the last column; remember the original array is transposed
>>> np.concatenate((b[idx], cnt[:, None]), axis=1)
array([[ 0,  4,  8,  1],
       [ 1,  5,  9,  2],
       [ 2,  6, 10,  2],
       [ 3,  7, 11,  1],
       [ 4,  8, 12,  1],
       [ 5,  9, 13,  1]])
10000 loops, best of 3: 65.4 µs per loop
```
The counts are appended to the unique columns of a.
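As a side note, and not part of the timings in this answer: NumPy 1.13 and later accept an `axis` argument in np.unique, which avoids the manual np.void view. A minimal sketch, assuming a recent enough NumPy (the variable names are mine):

```python
import numpy as np

a = np.array([[0, 1, 2, 4, 5, 1, 2, 3],
              [4, 5, 6, 8, 9, 5, 6, 7],
              [8, 9, 10, 12, 13, 9, 10, 11]])

# axis=1 treats each column as a single item (needs NumPy >= 1.13).
unique_cols, counts = np.unique(a, axis=1, return_counts=True)

# Stack the counts under the unique columns: the last row holds the counts.
result = np.vstack((unique_cols, counts))
print(result)
```

The unique columns come back sorted lexicographically, so `counts` here is `[1, 2, 2, 1, 1, 1]`, matching the transposed table above.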

Your hashing solution:

%%timeit
array_hash = [hash(tuple(row)) for row in b]
uniq, idx, cnt = np.unique(array_hash, return_index=True, return_counts=True)
np.concatenate((b[idx], cnt[:, None]), axis=1)
10000 loops, best of 3: 89.5 µs per loop

Update: Eph's solution is the most efficient and elegant.

%%timeit
Counter(map(tuple, a.T))
10000 loops, best of 3: 38.3 µs per loop
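The `%%timeit` cells above only run in IPython. To repeat the comparison in a plain script, a rough harness like the following should do. The function names are mine, absolute numbers will differ by machine and NumPy version, and the ranking on this 8-column array may not carry over to much larger inputs.

```python
import timeit
from collections import Counter

import numpy as np

a = np.array([[0, 1, 2, 4, 5, 1, 2, 3],
              [4, 5, 6, 8, 9, 5, 6, 7],
              [8, 9, 10, 12, 13, 9, 10, 11]])
b = np.ascontiguousarray(a.T)  # one row per original column


def with_counter():
    # Tally columns as tuples.
    return Counter(map(tuple, a.T))


def with_void_view():
    # View each row of b as one opaque np.void item, then deduplicate.
    c = b.view(np.dtype((np.void, b.dtype.itemsize * b.shape[1])))
    _, idx, cnt = np.unique(c, return_index=True, return_counts=True)
    return b[idx], cnt


for fn in (with_counter, with_void_view):
    # Best of 3 repeats, 1000 calls each, reported per call.
    best = min(timeit.repeat(fn, number=1000, repeat=3)) / 1000
    print(f"{fn.__name__}: {best * 1e6:.1f} µs per call")
```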


2 solutions

Solution 1
4, accepted, 2015-12-12 05:09:58

Solution 2
2, 2015-12-12 04:42:31
