Pandas pivot_table preserve order
>>> df
A B C D
0 foo one small 1
1 foo one large 2
2 foo one large 2
3 foo two small 3
4 foo two small 3
5 bar one large 4
6 bar one small 5
7 bar two small 6
8 bar two large 7
>>> table = pd.pivot_table(df, values='D', index=['A', 'B'],
... columns=['C'], aggfunc='sum')
>>> table
small large
foo one 1 4
two 6 NaN
bar one 5 4
two 6 7
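I want the output to look like the table above, but what I actually get is sorted: bar comes before foo, and so on.
I don't think pivot_table has a sort option, but groupby does: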
df.groupby(['A', 'B', 'C'], sort=False)['D'].sum().unstack('C')
Out:
C small large
A B
foo one 1.0 4.0
two 6.0 NaN
bar one 5.0 4.0
two 6.0 7.0
You pass the grouping columns to groupby, and use unstack on the ones you want to appear as column values.
If you don't want the index names, rename them to None:
df.groupby(['A', 'B', 'C'], sort=False)['D'].sum().rename_axis([None, None, None]).unstack(level=2)
Out:
small large
foo one 1.0 4.0
two 6.0 NaN
bar one 5.0 4.0
two 6.0 7.0
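For anyone who wants to run the groupby approach end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the data shown in the question; the variable name result is only illustrative.

```python
import pandas as pd

# Data reconstructed from the question's DataFrame.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# sort=False keeps the groups in order of first appearance;
# unstack("C") then moves the C level into the columns.
result = (
    df.groupby(["A", "B", "C"], sort=False)["D"]
      .sum()
      .unstack("C")
)

print(result)
```

The missing foo/two/large combination shows up as NaN, which is why the values are floats rather than integers.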
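When pivot_table builds the table, it sorts the index alphabetically. This affects not only foo and bar; you may notice that small and large are sorted as well. If you want foo on top, you need to sort the levels again. Older pandas offered sortlevel for this; it has since been removed, and sort_index with the level argument does the same job. For the output wanted here, sort on A and on C:
table.sort_index(level=["A", "B"], ascending=[False, True], sort_remaining=False, inplace=True)
table.sort_index(level="C", axis=1, ascending=False, sort_remaining=False, inplace=True)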
print(table)
Output:
C small large
A B
foo one 1.0 4.0
two 6.0 NaN
bar one 5.0 4.0
two 6.0 7.0
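Note that the sort-based approach above works here only because the desired order happens to be reverse alphabetical for A and C. As an alternative, the following sketch reindexes the pivot table to the order in which the labels first appear in the source data. This is a swapped-in technique, not part of the original answer, and the names row_order, col_order and ordered are hypothetical.

```python
import pandas as pd

# Data reconstructed from the question's DataFrame.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Default pivot_table: rows and columns come back sorted.
table = pd.pivot_table(df, values="D", index=["A", "B"],
                       columns="C", aggfunc="sum")

# Row labels in order of first appearance (drop_duplicates keeps the first hit).
row_order = pd.MultiIndex.from_frame(df[["A", "B"]].drop_duplicates())
# Column labels in order of first appearance.
col_order = df["C"].unique()

# Reindex both axes to the original order.
ordered = table.reindex(index=row_order, columns=col_order)

print(ordered)
```

Because the order is taken from the data itself, this does not depend on the labels sorting in any particular way.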
To remove the index names A, B, and C:
table.columns.name = None
table.index.names = (None, None)
Starting with pandas 1.3.0, you can pass sort=False to pd.pivot_table:
>>> import pandas as pd
>>> df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
... "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
... "C": ["small", "large", "large", "small","small", "large", "small", "small", "large"],
... "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
... "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
>>> pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'],
... aggfunc='sum', sort=False)
C large small
A B
foo one 4.0 1.0
two NaN 6.0
bar one 4.0 5.0
two 7.0 6.0
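In the output above the rows keep their original order, but the columns still come out as large, small. If the column order matters too, one possible follow-up is to select the columns in order of first appearance. This is a sketch that assumes pandas 1.3.0 or later; the exact column behaviour of sort=False may differ between versions, and the names table and col_order are illustrative.

```python
import pandas as pd

# Data reconstructed from the question's DataFrame.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# sort=False (pandas >= 1.3.0) keeps the row groups in order of appearance.
table = pd.pivot_table(df, values="D", index=["A", "B"], columns=["C"],
                       aggfunc="sum", sort=False)

# Put the columns in order of first appearance, whatever order they came back in.
col_order = df["C"].unique().tolist()
table = table[col_order]

print(table)
```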