
dplyr R arrange function equivalent in pandas

I have a data frame similar to the one below, and I want to arrange its rows according to the vector my_order, as shown.

R code:

df = data.frame(A = c("apple","cherry","orange","banana"), B = c(25,37,15,28))
df
       A  B
1  apple 25
2 cherry 37
3 orange 15
4 banana 28

my_order = c(2,3,4,1)
dplyr::arrange(df,my_order)
       A  B
1 banana 28
2  apple 25
3 cherry 37
4 orange 15

My question is: how can I do this in Python? Is there a function in pandas equivalent to dplyr::arrange()?

Python code:

import pandas as pd

df = pd.DataFrame({'A': ["apple","cherry","orange","banana"], 'B': [25,37,15,28]})
print(df)
        A   B
0   apple  25
1  cherry  37
2  orange  15
3  banana  28

my_order = [1,2,3,0]
df.iloc[my_order]
        A   B
1  cherry  37
2  orange  15
3  banana  28
0   apple  25

Okay, I figured it out. The values you pass to arrange act as a sort key for each row, not as row positions. You can do the same thing with iloc, but you have to argsort the keys first to get the inverse permutation.

import numpy as np

my_order = [2,3,4,1]
df.iloc[np.argsort(my_order)]  # pd.np was removed in pandas 2.0, so call numpy directly

        A   B
3  banana  28
0   apple  25
1  cherry  37
2  orange  15
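
To see why the argsort step is needed, it may help to look at the intermediate positions. The following is a minimal sketch, assuming numpy is imported directly as np; the names positions and result are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ["apple", "cherry", "orange", "banana"], 'B': [25, 37, 15, 28]})
my_order = [2, 3, 4, 1]  # one sort key per row, as passed to dplyr::arrange

# argsort returns the row positions that would put the keys in ascending order
positions = np.argsort(my_order)
print(positions.tolist())  # [3, 0, 1, 2]

# iloc then picks rows by those positions, which matches the R output
result = df.iloc[positions]
print(result['A'].tolist())  # ['banana', 'apple', 'cherry', 'orange']
```

Passing my_order straight to iloc would instead treat the values as row positions, which is why the attempt in the question gives a different order.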

I am not sure what the right function is.

A workaround:

import pandas as pd

df = pd.DataFrame({'A': ["apple","cherry","orange","banana"], 'B': [25,37,15,28]})

print(df)

df['index']=[2,3,4,1]
df.set_index('index',inplace=True)
df.sort_index(inplace=True)

print(df)
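
A variant of the same idea that leaves the original frame untouched might look like the sketch below. It attaches the keys as a temporary column instead of replacing the index; the column name _key is a hypothetical choice for this example and assumes no existing column uses that name.

```python
import pandas as pd

df = pd.DataFrame({'A': ["apple", "cherry", "orange", "banana"], 'B': [25, 37, 15, 28]})
my_order = [2, 3, 4, 1]

# Attach the keys as a temporary column, sort by it, then drop it again.
# '_key' is a hypothetical column name chosen for this sketch.
sorted_df = (
    df.assign(_key=my_order)
      .sort_values('_key')
      .drop(columns='_key')
)
print(sorted_df)
```

Because assign, sort_values and drop each return a new frame, df itself keeps its original row order and index.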

Check with:

df.loc[pd.Series(my_order,index=df.index).sort_values().index]
Out[42]: 
        A   B
3  banana  28
0   apple  25
1  cherry  37
2  orange  15
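
If this reordering is needed in several places, it could be wrapped in a small helper. The sketch below is only an illustration: arrange_by is a hypothetical name, not a pandas function, and the choices of a stable sort and a renumbered index are assumptions made to resemble the R output.

```python
import numpy as np
import pandas as pd

def arrange_by(frame, keys):
    """Reorder the rows of `frame` so that `keys` (one per row) end up ascending."""
    if len(keys) != len(frame):
        raise ValueError("keys must have one entry per row")
    # kind='stable' keeps rows with equal keys in their original relative order
    positions = np.argsort(np.asarray(keys), kind='stable')
    # reset_index(drop=True) renumbers the rows from 0, similar to R's renumbering
    return frame.iloc[positions].reset_index(drop=True)

df = pd.DataFrame({'A': ["apple", "cherry", "orange", "banana"], 'B': [25, 37, 15, 28]})
out = arrange_by(df, [2, 3, 4, 1])
print(out)
```

Dropping the reset_index call would keep the original row labels, as in the output shown above.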

Now you don't have to learn the pandas API to turn your R code into Python code!

With datar:

>>> from datar import f
>>> from datar.tibble import tibble
>>> from datar.dplyr import arrange
>>> df = tibble(A = ["apple","cherry","orange","banana"], B = [25,37,15,28])
>>> df
        A   B
0   apple  25
1  cherry  37
2  orange  15
3  banana  28
>>> my_order = [2,3,4,1]
>>> df >> arrange(my_order)
        A   B
0  banana  28
1   apple  25
2  cherry  37
3  orange  15

I am the author of the package. Feel free to submit issues if you have any questions.
