Count the number of duplicates for each row of a 2D NumPy array

Is there a simple way in Python to count the duplicates between different rows? For example:

Row1: 12  13  20  25  45  46  
Row2: 14  24  30  38  39  47  
Row3:  1   9  15  21  29  39  
Row4:  2   6  14  19  26  45  
Row5:  5  23  25  27  32  40  
Row6:  6   8  25  26  27  45  

I want to compare Row6 to the previous n rows. If n=5, the output should be something like this: [2 0 0 3 2] (one count per row, Row1 through Row5, of the values that row shares with Row6).

Of course, I can compare each value in Row6 to each value of every other row in a loop, and increment a counter for each row.
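For reference, the loop described above could be sketched roughly as follows. The helper name count_shared_loop and the array name rows are made up for illustration and do not come from any library.

```python
import numpy as np

# The six example rows from the question.
rows = np.array([
    [12, 13, 20, 25, 45, 46],
    [14, 24, 30, 38, 39, 47],
    [ 1,  9, 15, 21, 29, 39],
    [ 2,  6, 14, 19, 26, 45],
    [ 5, 23, 25, 27, 32, 40],
    [ 6,  8, 25, 26, 27, 45],
])

def count_shared_loop(rows, n):
    """Hypothetical helper: count the values each of the n rows
    before the last one shares with the last row."""
    last = rows[-1]
    counts = []
    for row in rows[-(n + 1):-1]:   # the n rows preceding the last
        counter = 0
        for a in last:
            for b in row:
                if a == b:
                    counter += 1
        counts.append(counter)
    return counts

print(count_shared_loop(rows, 5))  # [2, 0, 0, 3, 2]
```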

But do you know of an existing function in Python that does this?

If you're working with NumPy arrays, use a broadcasted comparison (here df is assumed to be a pandas DataFrame holding the six rows):

>>> n = 5
>>> v = df.values 
>>> v
array([[12, 13, 20, 25, 45, 46],
       [14, 24, 30, 38, 39, 47],
       [ 1,  9, 15, 21, 29, 39],
       [ 2,  6, 14, 19, 26, 45],
       [ 5, 23, 25, 27, 32, 40],
       [ 6,  8, 25, 26, 27, 45]])
>>> (v[None, -(n+1):-1, None] == v[-1, :, None]).sum(-1).sum(-1).squeeze()
array([2, 0, 0, 3, 2])
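The same idea can be written as a self-contained script without the DataFrame. This is only a sketch: the names prev, last, matches and alt are chosen here for illustration, and the np.isin line is an alternative that is not part of the answer above; it gives the same counts only when the last row contains no repeated values.

```python
import numpy as np

v = np.array([
    [12, 13, 20, 25, 45, 46],
    [14, 24, 30, 38, 39, 47],
    [ 1,  9, 15, 21, 29, 39],
    [ 2,  6, 14, 19, 26, 45],
    [ 5, 23, 25, 27, 32, 40],
    [ 6,  8, 25, 26, 27, 45],
])
n = 5

prev = v[-(n + 1):-1]   # the n rows before the last, shape (n, 6)
last = v[-1]            # the row to compare against, shape (6,)

# prev[:, :, None] has shape (n, 6, 1); comparing it with last (shape (6,))
# broadcasts to (n, 6, 6): every value of each previous row against
# every value of the last row.
matches = prev[:, :, None] == last

# Summing over the last two axes counts the matching pairs per row.
counts = matches.sum(axis=(1, 2))
print(counts)  # [2 0 0 3 2]

# Alternative (an assumption of this sketch): membership test per element.
alt = np.isin(prev, last).sum(axis=1)
print(alt)     # [2 0 0 3 2]
```

The broadcasted version builds an (n, 6, 6) boolean array, which is fine for small rows but grows with the square of the row length.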

You could use np.unique from NumPy:

>>> import numpy as np 
>>> np.unique([1, 1, 2, 2, 3, 3])
array([1, 2, 3])
>>> a = np.array([[1, 1], [2, 3]])
>>> np.unique(a)
array([1, 2, 3])

You would still need to loop over the n rows and check the length of the resulting array each time. There may well be something more suitable elsewhere in NumPy.
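One possible reading of that suggestion, as a minimal sketch: concatenate each previous row with the last row, run np.unique on the result, and treat the shortfall in length as the number of shared values. This assumes the values within each individual row are distinct, as in the example data; the variable names are illustrative.

```python
import numpy as np

v = np.array([
    [12, 13, 20, 25, 45, 46],
    [14, 24, 30, 38, 39, 47],
    [ 1,  9, 15, 21, 29, 39],
    [ 2,  6, 14, 19, 26, 45],
    [ 5, 23, 25, 27, 32, 40],
    [ 6,  8, 25, 26, 27, 45],
])
n = 5
last = v[-1]

counts = []
for row in v[-(n + 1):-1]:
    combined = np.concatenate([row, last])
    # A value present in both rows collapses into a single entry in
    # np.unique, so the missing length equals the number of shared values
    # (only valid if neither row repeats a value internally).
    counts.append(len(row) + len(last) - len(np.unique(combined)))

print(counts)  # [2, 0, 0, 3, 2]
```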
