Sorting 2D array by the first n rows

How can I sort the columns of a NumPy array by its first two rows?

For example,

import numpy as np

A = np.array([[9, 2, 2],
              [4, 5, 6],
              [7, 0, 5]])

I'd like to sort the columns by the first two rows, so that I get back:

A=array([[2, 2, 9],
         [5, 6, 4],
         [0, 5, 7]])

Thank you!

One approach is to reduce the first k rows, over which we want to take the argsort, to a 1D array that is easier to handle. One idea is to normalize each of those rows, multiply them by successively decreasing powers of 10 so that earlier rows weigh more, sum them column-wise, and then use argsort on the result. Note: this method becomes numerically unstable for high values of k and is meant for values up to ~20. The weighted sum also only approximates a true row-by-row comparison: if two entries in a higher-priority row are close relative to that row's maximum, the lower rows can outweigh the difference.

import numpy as np

def sort_on_first_k_rows(x, k):
    # normalize each of the first k rows so that its max value is 1
    # (assumes every row maximum is positive)
    a = (x[:k] / x[:k].max(axis=1, keepdims=True)).astype('float64')
    # multiply row i by 10^(k-1-i), so earlier rows contribute more
    # to the final sum; 10.0 keeps the powers in floating point
    a_pow = a * 10.0**np.arange(a.shape[0] - 1, -1, -1)[:, None]
    # argsort the per-column sums and reorder the columns of x
    return x[:, a_pow.sum(0).argsort()]

Checking with the shared example:

sort_on_first_k_rows(A, 2)
array([[2, 2, 9],
       [5, 6, 4],
       [0, 5, 7]])
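
To see what the function actually sorts on, the intermediate key can be computed by hand for the shared example. The sketch below repeats the normalization and weighting steps outside the function (the variable names are illustrative, not from the answer). It then runs the same steps on a small made-up input where the key does not reproduce the row-by-row order, because close values in the first row are outweighed by the second row.

```python
import numpy as np

A = np.array([[9, 2, 2],
              [4, 5, 6],
              [7, 0, 5]])
k = 2

# Normalize each of the first k rows by its maximum (assumes positive maxima).
normalized = A[:k] / A[:k].max(axis=1, keepdims=True)
# Weight row 0 by 10**1 and row 1 by 10**0, then sum per column.
weights = 10.0 ** np.arange(k - 1, -1, -1)[:, None]
key = (normalized * weights).sum(axis=0)
order = key.argsort()
print(key)    # one float per column
print(order)  # column indices in sorted order: [1 2 0]

# Made-up counterexample: 99 < 100 in row 0, but after normalization and
# weighting the gap is only 0.1, while row 1 can contribute up to 1.0.
B = np.array([[100, 99],
              [0, 9]])
key_b = (B / B.max(axis=1, keepdims=True) * weights).sum(axis=0)
order_b = key_b.argsort()
print(order_b)  # [0 1]: the column starting with 100 is placed first
```

For the shared example the first-row values are far enough apart that the key gives the expected order; the second input shows that this is not guaranteed in general.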

Or with another example:

A=np.array([[9, 2, 2, 1, 5, 2, 9],
            [4, 7, 6, 0, 9, 3, 3],
            [7, 0, 5, 0, 2, 1, 2]])

sort_on_first_k_rows(A, 2)
array([[1, 2, 2, 2, 5, 9, 9],
       [0, 3, 6, 7, 9, 3, 4],
       [0, 1, 5, 0, 2, 2, 7]])
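
For comparison, NumPy also provides `np.lexsort`, which sorts by several keys without building a combined float key. This is a different technique from the weighted-sum approach above, not a rewrite of it. A minimal sketch, assuming row 0 is the primary key; the function name `sort_columns_lexsort` is made up for illustration:

```python
import numpy as np

def sort_columns_lexsort(x, k):
    """Sort the columns of x by its first k rows (row 0 is the primary key)."""
    # np.lexsort treats the LAST key as the primary one, so reverse the rows.
    order = np.lexsort(x[:k][::-1])
    return x[:, order]

A = np.array([[9, 2, 2, 1, 5, 2, 9],
              [4, 7, 6, 0, 9, 3, 3],
              [7, 0, 5, 0, 2, 1, 2]])
result = sort_columns_lexsort(A, 2)
print(result)
```

Because the comparison works on the original values, the result should not depend on the magnitude of the entries or on how large k is.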

The pandas library is very flexible for sorting DataFrames, but only by columns. So I suggest transposing your array and converting it to a DataFrame like this (note that you need to specify column names so that you can define the sorting criteria later):

import pandas as pd

df = pd.DataFrame(A.transpose(), columns=['col'+str(i) for i in range(len(A))])

Then sort it and convert it back like this:

A_new = df.sort_values(['col0', 'col1'], ascending=[True, True]).to_numpy().transpose()
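
The two steps can be wrapped into one function that sorts by the first k rows. This is a sketch under the assumption that the input is a 2D NumPy array; the helper name `sort_columns_pandas` and its parameter `k` are not from the answer:

```python
import numpy as np
import pandas as pd

def sort_columns_pandas(a, k):
    # One DataFrame column per row of `a`, named col0, col1, ...
    names = ['col' + str(i) for i in range(len(a))]
    df = pd.DataFrame(a.transpose(), columns=names)
    # Sort by the first k columns, i.e. the first k rows of `a`.
    return df.sort_values(names[:k]).to_numpy().transpose()

A = np.array([[9, 2, 2],
              [4, 5, 6],
              [7, 0, 5]])
A_new = sort_columns_pandas(A, 2)
print(A_new)
```

With the array from the question and k=2, this should give back the column order requested there.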
