Sort a Pandas DataFrame by a column in another DataFrame - pandas

Question

Let's say I have a Pandas DataFrame with two columns, like:

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [100, 200, 300, 400]})
print(df)

   a    b
0  1  100
1  2  200
2  3  300
3  4  400

And let's say I also have a Pandas Series, like:

s = pd.Series([1, 3, 2, 4])
print(s)

0    1
1    3
2    2
3    4
dtype: int64

How can I sort the a column into the same order as the s series, with the corresponding values in each row moving together?

My desired output would have a in the order 1, 3, 2, 4, with b following as 100, 300, 200, 400.

Is there any way to achieve this?

Please check the self-answer below.

Answer 1

What about:

(
    df.assign(s=s)
    .sort_values(by='s')
    .drop('s', axis=1)
)
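
One thing worth noting about the chain above: assign(s=s) aligns s to df by index, and sort_values then orders the rows by the values of s. That matches the requested order here because [1, 3, 2, 4] happens to be its own inverse permutation. A minimal sketch of a variant that keeps the same assign/sort/drop shape but sorts by the position of each a value in the series; the helper column name _pos and the second series s2 are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [100, 200, 300, 400]})
s2 = pd.Series([3, 1, 2, 4])  # hypothetical order that is not its own inverse

# Map each value in s2 to its position: 3 -> 0, 1 -> 1, 2 -> 2, 4 -> 3.
pos = pd.Series(range(len(s2)), index=s2.values)

result = (
    df.assign(_pos=df['a'].map(pos))  # _pos is a throwaway helper column
    .sort_values(by='_pos')
    .drop('_pos', axis=1)
)
print(result)
```

This assumes the values in s2 are unique and that every value in a appears in s2.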

Answer 2

I have run into this issue quite often, so I thought I'd share my solutions in Pandas.

Solutions:

Solution 1:

Use set_index to convert the a column to the index, reindex to change the order, rename_axis to change the index name back to a, then reset_index to convert a from the index back to a column:

print(df.set_index('a').reindex(s).rename_axis('a').reset_index('a'))

Solution 2:

Use set_index to convert the a column to the index, loc to change the order, then reset_index to convert a from the index back to a column:

print(df.set_index('a').loc[s].reset_index())
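
Solutions 1 and 2 give the same result for the example data, but they appear to differ when the series contains a value that is not in the a column. A small sketch of that difference, using a made-up series s_missing; the KeyError behaviour is what recent pandas versions (1.0 and later, as far as I know) do for list-like keys with missing labels:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [100, 200, 300, 400]})
s_missing = pd.Series([1, 5, 2])  # hypothetical: 5 does not occur in column a

indexed = df.set_index('a')

# reindex keeps the unknown label and fills its row with NaN.
reindexed = indexed.reindex(s_missing)
print(reindexed)

# loc refuses to look up a label that is not in the index.
try:
    indexed.loc[s_missing]
    loc_raised = False
except KeyError:
    loc_raised = True
print('loc raised KeyError:', loc_raised)
```

Which behaviour is preferable depends on whether a missing value should be treated as an error or as a gap.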

Solution 3:

Use map with list.index to look up the position in the a column of each value in s, then pass those positions to iloc to reorder the rows:

print(df.iloc[list(map(df['a'].tolist().index, s))])
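
The list.index call above is a linear scan for every element of s. One possible alternative, sketched here under the assumption that the values in the a column are unique, is to let pandas compute all the positions at once with Index.get_indexer:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [100, 200, 300, 400]})
s = pd.Series([1, 3, 2, 4])

# Position in df['a'] of each value in s; -1 would mark a value that is not found.
positions = pd.Index(df['a']).get_indexer(s)

result = df.iloc[positions]
print(result)
```

As with Solution 3, the original index labels travel with the rows, so the result is not renumbered.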

Solution 4:

Use the built-in sorted with a key argument to sort the rows by the position of their a value in s, then use pd.DataFrame to build a new DataFrame from the sorted rows:

print(pd.DataFrame(sorted(df.values.tolist(), key=lambda x: s.tolist().index(x[0])), columns=df.columns))
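
In the one-liner above, the key function rebuilds s.tolist() and scans it for every row. A possible variant, sketched under the assumption that the values in s are unique, precomputes the positions once in a dictionary (the name rank is just illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [100, 200, 300, 400]})
s = pd.Series([1, 3, 2, 4])

# Build the value -> position lookup once instead of once per row.
rank = {value: i for i, value in enumerate(s.tolist())}

# Sort the rows (as plain lists) by the rank of their first element, the a value.
rows = sorted(df.values.tolist(), key=lambda row: rank[row[0]])

result = pd.DataFrame(rows, columns=df.columns)
print(result)
```

Because a new DataFrame is built from plain lists, the result gets a fresh 0..n-1 index, the same as in Solution 4.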

Timings:

Timed with the code below:

import pandas as pd
from timeit import timeit

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [100, 200, 300, 400]})
s = pd.Series([1, 3, 2, 4])

# Solution 1: set_index + reindex
def u10_1():
    return df.set_index('a').reindex(s).rename_axis('a').reset_index('a')

# Solution 2: set_index + loc
def u10_2():
    return df.set_index('a').loc[s].reset_index()

# Solution 3: iloc + list.index
def u10_3():
    return df.iloc[list(map(df['a'].tolist().index, s))]

# Solution 4: sorted with a key function
def u10_4():
    return pd.DataFrame(sorted(df.values.tolist(), key=lambda x: s.tolist().index(x[0])), columns=df.columns)

# Call each function 1000 times and print the total time in seconds
print('u10_1:', timeit(u10_1, number=1000))
print('u10_2:', timeit(u10_2, number=1000))
print('u10_3:', timeit(u10_3, number=1000))
print('u10_4:', timeit(u10_4, number=1000))

Output:

u10_1: 3.012849470495621
u10_2: 3.072132612502147
u10_3: 0.7498072134665241
u10_4: 0.8109911930595484
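
The numbers above come from a four-row frame, so they mostly measure fixed per-call overhead. Solutions 3 and 4 call list.index once per row, which is a linear scan each time, so their work grows quadratically with the number of rows. A sketch for re-timing at a larger size; the size n, the shuffled series, and the function names are arbitrary choices for illustration, and no particular outcome is claimed:

```python
import numpy as np
import pandas as pd
from timeit import timeit

n = 1000  # arbitrary size, chosen only for illustration
rng = np.random.default_rng(0)
df = pd.DataFrame({'a': np.arange(n), 'b': np.arange(n) * 100})
s = pd.Series(rng.permutation(n))  # a shuffled copy of the values in a

def via_reindex():
    # Same approach as Solution 1.
    return df.set_index('a').reindex(s).rename_axis('a').reset_index()

def via_list_index():
    # Same approach as Solution 3.
    return df.iloc[list(map(df['a'].tolist().index, s))]

# Both approaches should produce the same rows in the same order.
assert via_reindex()['b'].tolist() == via_list_index()['b'].tolist()

print('reindex   :', timeit(via_reindex, number=10))
print('list.index:', timeit(via_list_index, number=10))
```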

@Allen has a pretty good answer too.


2 answers

Answer 1
3 2020-01-27 04:38:55

Answer 2
2 accepted 2020-01-27 04:36:25
