Is there a function in Pandas pivot table to add the difference of multiple columns?
I have the following pandas DataFrame:
import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
                         "bar", "bar", "bar", "bar", "foo"],
                   "B": ["one", "one", "one", "two", "two",
                         "one", "one", "two", "two", "two"],
                   "C": ["small", "large", "large", "small",
                         "small", "large", "small", "small",
                         "large", "large"],
                   "D": [1, 2, 2, 3, 3, 4, 5, 6, 7, 8],
                   })
With the following output:
print(df)
     A    B      C  D
0  foo  one  small  1
1  foo  one  large  2
2  foo  one  large  2
3  foo  two  small  3
4  foo  two  small  3
5  bar  one  large  4
6  bar  one  small  5
7  bar  two  small  6
8  bar  two  large  7
9  foo  two  large  8
Then I build a pivot table as follows:
table = pd.pivot_table(df, values='D', index=['A'],
                       columns=['B', 'C'])
With the following output:
print(table)
B     one         two
C   large small large small
A
bar     4     5     7     6
foo     2     1     8     3
How can I add the difference between large and small (large - small) for both one and two (the diff column in the table below)? The expected output would be:
B     one              two
C   large small diff large small diff
A
bar     4     5   -1     7     6    1
foo     2     1    1     8     3    5
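I have seen some previous answers, but they only dealt with a single column. Ideally, this would also be done using aggfunc.
Also, how can I convert the table back to the initial format? The expected output would be: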
      A    B      C   D
0   foo  one  small   1
1   foo  one  large   2
2   foo  one  large   2
3   foo  two  small   3
4   foo  two  small   3
5   bar  one  large   4
6   bar  one  small   5
7   bar  two  small   6
8   bar  two  large   7
9   foo  two  large   8
10  bar  one   diff  -1
11  bar  two   diff   1
12  foo  one   diff   1
13  foo  two   diff   5
Thanks in advance for your help!
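# Within each "B" group, diff(-1) subtracts the next column (small) from large;
# the all-NaN small columns are dropped and large is relabeled as diff.
# Note: the axis argument of groupby is deprecated in recent pandas releases.
diffs = (table.groupby(level="B", axis="columns")
              .diff(-1).dropna(axis="columns")
              .rename(columns={"large": "diff"}, level="C"))
# Join the diff columns, then select by "B" so each group stays together.
new = table.join(diffs).loc[:, table.columns.get_level_values("B").unique()]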
to get
>>> new
B     one              two
C   large small diff large small diff
A
bar     4     5   -1     7     6    1
foo     2     1    1     8     3    5
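The second part of the question (getting back to the long format) is not covered above, and the axis argument of groupby is deprecated in recent pandas releases. The following is a minimal sketch of one possible alternative, not the only way to do it: it selects the large and small sub-tables with xs, subtracts them, and then melts the differences back into rows. The names diff, wide, diff_long and long are illustrative and not from the original post, and the hard-coded column order assumes only the labels shown in the question. It does not use aggfunc.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar", "foo"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7, 8],
})

# pivot_table aggregates with the mean by default.
table = pd.pivot_table(df, values="D", index=["A"], columns=["B", "C"])

# xs drops the "C" level, so both operands have columns indexed by "B" only
# and the subtraction aligns on "B".
diff = table.xs("large", level="C", axis=1) - table.xs("small", level="C", axis=1)

# Wide result: give the differences a ("B", "diff") column label, append them,
# and order the columns as large, small, diff within each "B" group.
wide_diff = diff.copy()
wide_diff.columns = pd.MultiIndex.from_product(
    [diff.columns, ["diff"]], names=["B", "C"]
)
column_order = pd.MultiIndex.from_product(
    [["one", "two"], ["large", "small", "diff"]], names=["B", "C"]
)
wide = pd.concat([table, wide_diff], axis=1).reindex(columns=column_order)

# Long result: melt the differences into rows, label them "diff" in column "C",
# and append them below the original rows.
diff_long = (
    diff.reset_index()
        .melt(id_vars="A", var_name="B", value_name="D")
        .assign(C="diff")[["A", "B", "C", "D"]]
        .sort_values(["A", "B"], ignore_index=True)
)
long = pd.concat([df, diff_long], ignore_index=True)

print(wide)
print(long)
```

Because the differences come from averaged values, column D in the combined frame is stored as floats rather than integers. Only the diff rows are derived from the pivot table here; the original rows are taken from df directly, since the averaged pivot table cannot recover the individual duplicate rows.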