
Sorting Pandas DataFrame by matching multiple rows

Suppose I have a DataFrame like this:

data=np.array([[-1.5625e-05,-1.5625e-05,-4.6875e-05],
[-1.5625e-05,-1.5625e-05,-1.5625e-05],
[-1.5625e-05,1.5625e-05,-4.6875e-05],
[-1.5625e-05,1.5625e-05,-1.5625e-05],
[1.5625e-05,-1.5625e-05,-4.6875e-05],
[1.5625e-05,-1.5625e-05,-1.5625e-05],
[1.5625e-05,1.5625e-05,-4.6875e-05],
[1.5625e-05,1.5625e-05,-1.5625e-05]])

df=pd.DataFrame(data=data,columns=['x','y','z'])

and a NumPy array:

coord=np.array([[-1.5625e-05,-1.5625e-05,-4.6875e-05],
[-1.5625e-05,1.5625e-05,-4.6875e-05],
[1.5625e-05,-1.5625e-05,-4.6875e-05],
[1.5625e-05,1.5625e-05,-4.6875e-05],
[-1.5625e-05,-1.5625e-05,-1.5625e-05],
[-1.5625e-05,1.5625e-05,-1.5625e-05],
[1.5625e-05,-1.5625e-05,-1.5625e-05],
[1.5625e-05,1.5625e-05,-1.5625e-05]])

The number of rows in the pandas DataFrame and in the coord array is always the same. As you can see, the DataFrame and coord contain the same rows, but in a different order. I would like to sort the DataFrame according to the order of the coord array (e.g. so that df.x==coord[:,0] & df.y==coord[:,1] & df.z==coord[:,2]).

You can sort by the column values with sort_values. It returns a new DataFrame, so assign the result:

df = df.sort_values(['x', 'y', 'z'], ascending=[True, True, True])

The full code:

import numpy as np
import pandas as pd

data=np.array([[-1.5625e-05,-1.5625e-05,-4.6875e-05],
[-1.5625e-05,-1.5625e-05,-1.5625e-05],
[-1.5625e-05,1.5625e-05,-4.6875e-05],
[-1.5625e-05,1.5625e-05,-1.5625e-05],
[1.5625e-05,-1.5625e-05,-4.6875e-05],
[1.5625e-05,-1.5625e-05,-1.5625e-05],
[1.5625e-05,1.5625e-05,-4.6875e-05],
[1.5625e-05,1.5625e-05,-1.5625e-05]])

df= pd.DataFrame(data=data,columns=['x','y','z'])

# sort_values returns a new DataFrame, so assign the result back to df
df = df.sort_values(['x', 'y', 'z'], ascending=[True, True, True])

print(df)

The output:

          x         y         z
0 -0.000016 -0.000016 -0.000047
1 -0.000016 -0.000016 -0.000016
2 -0.000016  0.000016 -0.000047
3 -0.000016  0.000016 -0.000016
4  0.000016 -0.000016 -0.000047
5  0.000016 -0.000016 -0.000016
6  0.000016  0.000016 -0.000047
7  0.000016  0.000016 -0.000016
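Sorting by x, then y, then z leaves this sample in its original order, which is not the order of coord. In the coord array from the question, the rows happen to be ordered by z first, then x, then y. The following is a minimal sketch that reproduces that order. It relies on an observation about this sample data, not on a general rule, and the names df_sorted and matches are only illustrative:

```python
import numpy as np
import pandas as pd

data = np.array([[-1.5625e-05, -1.5625e-05, -4.6875e-05],
                 [-1.5625e-05, -1.5625e-05, -1.5625e-05],
                 [-1.5625e-05, 1.5625e-05, -4.6875e-05],
                 [-1.5625e-05, 1.5625e-05, -1.5625e-05],
                 [1.5625e-05, -1.5625e-05, -4.6875e-05],
                 [1.5625e-05, -1.5625e-05, -1.5625e-05],
                 [1.5625e-05, 1.5625e-05, -4.6875e-05],
                 [1.5625e-05, 1.5625e-05, -1.5625e-05]])

coord = np.array([[-1.5625e-05, -1.5625e-05, -4.6875e-05],
                  [-1.5625e-05, 1.5625e-05, -4.6875e-05],
                  [1.5625e-05, -1.5625e-05, -4.6875e-05],
                  [1.5625e-05, 1.5625e-05, -4.6875e-05],
                  [-1.5625e-05, -1.5625e-05, -1.5625e-05],
                  [-1.5625e-05, 1.5625e-05, -1.5625e-05],
                  [1.5625e-05, -1.5625e-05, -1.5625e-05],
                  [1.5625e-05, 1.5625e-05, -1.5625e-05]])

df = pd.DataFrame(data=data, columns=['x', 'y', 'z'])

# Assumption: the key order z, x, y is read off the sample coord array.
# sort_values returns a new DataFrame, so the result is assigned.
df_sorted = df.sort_values(['z', 'x', 'y']).reset_index(drop=True)

# Check that the sorted rows line up with coord row by row.
matches = np.array_equal(df_sorted.to_numpy(), coord)
print(matches)
```

If coord follows no sortable pattern, a fixed key order like this will not work and the rows have to be matched explicitly.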

You could do it like this:

  • sort both arrays the same way
  • set the index of the DataFrame to that of coord
  • sort by that index to recover the original ordering of coord

Code:

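# Put coord in a DataFrame with the same column names as df
df2 = pd.DataFrame(coord, columns=list("xyz"))
sort_cols = list("yxz")
# Sort both frames by the same keys so their rows line up one to one
df = df.sort_values(sort_cols)
df2 = df2.sort_values(sort_cols)
# Give each df row the original position of its matching coord row
df.index = df2.index
# Sorting by that index restores the order of coord
df = df.sort_index()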

This should return df sorted in the same order as coord (output):

          x         y         z
0 -0.000016 -0.000016 -0.000047
1 -0.000016  0.000016 -0.000047
2  0.000016 -0.000016 -0.000047
3  0.000016  0.000016 -0.000047
4 -0.000016 -0.000016 -0.000016
5 -0.000016  0.000016 -0.000016
6  0.000016 -0.000016 -0.000016
7  0.000016  0.000016 -0.000016
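As an alternative to the approach above, and not part of the original answer, a left merge on the coordinate columns can also reorder the rows, since a left merge keeps the row order of the left frame. The sketch below assumes the coordinates in df and coord are exactly equal as floats and that each coordinate triple occurs only once. The value column is hypothetical and only shows that any extra data moves with its row:

```python
import numpy as np
import pandas as pd

data = np.array([[-1.5625e-05, -1.5625e-05, -4.6875e-05],
                 [-1.5625e-05, -1.5625e-05, -1.5625e-05],
                 [-1.5625e-05, 1.5625e-05, -4.6875e-05],
                 [-1.5625e-05, 1.5625e-05, -1.5625e-05],
                 [1.5625e-05, -1.5625e-05, -4.6875e-05],
                 [1.5625e-05, -1.5625e-05, -1.5625e-05],
                 [1.5625e-05, 1.5625e-05, -4.6875e-05],
                 [1.5625e-05, 1.5625e-05, -1.5625e-05]])

coord = np.array([[-1.5625e-05, -1.5625e-05, -4.6875e-05],
                  [-1.5625e-05, 1.5625e-05, -4.6875e-05],
                  [1.5625e-05, -1.5625e-05, -4.6875e-05],
                  [1.5625e-05, 1.5625e-05, -4.6875e-05],
                  [-1.5625e-05, -1.5625e-05, -1.5625e-05],
                  [-1.5625e-05, 1.5625e-05, -1.5625e-05],
                  [1.5625e-05, -1.5625e-05, -1.5625e-05],
                  [1.5625e-05, 1.5625e-05, -1.5625e-05]])

df = pd.DataFrame(data, columns=list("xyz"))
# Hypothetical extra column: the original row position of each row in df.
df["value"] = np.arange(len(df))

# Left frame is coord, so the merged result follows the row order of coord.
# Matching uses exact float equality on x, y and z.
coord_df = pd.DataFrame(coord, columns=list("xyz"))
reordered = coord_df.merge(df, on=list("xyz"), how="left")
print(reordered)
```

Here the value column of reordered reads 0, 2, 4, 6, 1, 3, 5, 7, which is the original df position of each coord row. If the coordinates come from separate computations and may differ by rounding error, exact matching will fail, and rounding both sides to a common precision first is one possible workaround.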


 