Pandas pivot table Nested Sorting
Given this data frame and pivot table:
import pandas as pd
df = pd.DataFrame({'A': ['x', 'y', 'z', 'x', 'y', 'z'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'C': [7, 5, 3, 4, 1, 6]})
df
   A    B  C
0  x  one  7
1  y  one  5
2  z  one  3
3  x  two  4
4  y  two  1
5  z  two  6
table = pd.pivot_table(df, index=['A', 'B'], aggfunc='sum')
table
A  B
x  one    7
   two    4
y  one    5
   two    1
z  one    3
   two    6
Name: C, dtype: int64
I want to sort the pivot table so that the order of 'A' is z, x, y, and, within each value of 'A', the rows of 'B' are ordered by the values of column 'C' in descending order.
Like this:
A  B
z  two    6
   one    3
x  one    7
   two    4
y  one    5
   two    1
Name: C, dtype: int64
Thanks in advance!
I don't believe there is an easy way to accomplish your objective. The following solution first sorts your table in descending order based on the values of column C. It then concatenates each slice based on your desired order.
order = ['z', 'x', 'y']
table = table.reset_index().sort_values('C', ascending=False)
>>> pd.concat([table.loc[table.A == val, :].set_index(['A', 'B']) for val in order])
       C
A B
z two  6
  one  3
x one  7
  two  4
y one  5
  two  1
custom_order = ['z', 'x', 'y']
kwargs = dict(axis=0, level=0, drop_level=False)
new_table = pd.concat(
    [table.xs(idx_v, **kwargs).sort_values(ascending=False) for idx_v in custom_order]
)
Or, as a one-liner:
pd.concat([table.xs(i, drop_level=False).sort_values(ascending=False) for i in list('zxy')])
custom_order is your desired order. kwargs is a convenient way to improve readability (in my opinion). Key elements to note: axis=0 and level=0 might be important for you if you want to leverage this further. However, those are also the default values and can be left out.
drop_level=False is the key argument here. It is necessary to keep the idx_v level that we are taking an xs of, so that pd.concat puts it all together in the way we'd like.
I use a list comprehension in almost the exact same manner as Alexander within the pd.concat call.
print(new_table)
A  B
z  two    6
   one    3
x  one    7
   two    4
y  one    5
   two    1
Name: C, dtype: int64
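To illustrate why drop_level=False matters in the explanation above, here is a minimal sketch on a small hand-built Series (the data below is made up for the illustration, not taken from the pivot table). It compares the index that xs returns with and without the argument.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('x', 'one'), ('x', 'two'), ('z', 'one'), ('z', 'two')],
    names=['A', 'B'],
)
s = pd.Series([7, 4, 3, 6], index=index, name='C')

# Default behaviour: the selected level (A) is dropped from the result.
dropped = s.xs('z')

# With drop_level=False the result keeps the full (A, B) MultiIndex,
# so slices for different A values can be concatenated back together.
kept = s.xs('z', drop_level=False)

print(list(dropped.index.names))
print(list(kept.index.names))
```

Without drop_level=False, the concatenated result would only carry level B, and the rows for different values of A could no longer be told apart.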
If you can read in column A as categorical data, then it becomes much more straightforward. Setting your categories as list('zxy') and specifying ordered=True uses your custom ordering.
You can read in your data using something similar to:
'A':pd.Categorical(['x','y','z','x','y','z'], list('zxy'), ordered=True)
Alternatively, you can read in the data as you currently are, then use astype to convert A to categorical:
df['A'] = df['A'].astype(pd.CategoricalDtype(categories=list('zxy'), ordered=True))
Once A is categorical, you can pivot the same as before, and then sort with:
table = table.sort_values(ascending=False).sort_index(level=0, sort_remaining=False)
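For reference, here is an end-to-end sketch of the categorical approach on a recent pandas version. It makes a few assumptions that are not in the answer itself: values='C' and observed=True are passed to pivot_table so that the result is a DataFrame without unused category combinations, and the final ordering is done with a column-based sort_values rather than an index sort.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['x', 'y', 'z', 'x', 'y', 'z'],
    'B': ['one', 'one', 'one', 'two', 'two', 'two'],
    'C': [7, 5, 3, 4, 1, 6],
})

# Ordered categorical: sorting A now follows z < x < y instead of the alphabet.
df['A'] = df['A'].astype(pd.CategoricalDtype(categories=list('zxy'), ordered=True))

table = pd.pivot_table(df, index=['A', 'B'], values='C', aggfunc='sum', observed=True)

# Sort on columns: A in category order, then C descending within each A.
result = (
    table.reset_index()
         .sort_values(['A', 'C'], ascending=[True, False])
         .set_index(['A', 'B'])
)
print(result)
```

Sorting on the flattened columns makes both sort keys explicit, so the result does not depend on the index sort preserving the earlier value ordering.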