Pivot Tables with Pandas
I have the following data saved as a pandas DataFrame:
Animal Day age Food kg
1 1 3 17 0.1
1 1 3 22 0.7
1 2 3 17 0.8
2 2 7 15 0.1
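For anyone who wants to reproduce this, the table above can be rebuilt roughly as follows. This is only a sketch: the dtypes (integers for everything except kg) are an assumption, since the question does not state them.

```python
import pandas as pd

# Rebuild the sample frame from the question.
# Assumption: all columns are integers except kg, which is a float.
df = pd.DataFrame({
    "Animal": [1, 1, 1, 2],
    "Day": [1, 1, 2, 2],
    "age": [3, 3, 3, 7],
    "Food": [17, 22, 17, 15],
    "kg": [0.1, 0.7, 0.8, 0.1],
})

print(df)
```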
With pivot I get the following:
output = df.pivot(index=["Animal", "Food"], columns="Day", values="kg") \
    .add_prefix("Day") \
    .reset_index() \
    .rename_axis(None, axis=1)
>>> output
Animal Food Day1 Day2
0 1 17 0.1 0.8
1 1 22 0.7 NaN
2 2 15 NaN 0.1
However, I would like the age column (and other columns) to still be included. It is also possible that the age value for a given animal is not always the same; in that case it doesn't matter which age value is taken.
Animal Food Age Day1 Day2
0 1 17 3 0.1 0.8
1 1 22 3 0.7 NaN
2 2 15 7 NaN 0.1
How do I need to change the code above?
IIUC, what you want is to pivot the weight but aggregate the age.
To my knowledge, you need to do both operations separately: one with pivot, the other with groupby (here I used first for the example, but this could be any aggregation), and then join:
(df.pivot(index=["Animal", "Food"],
          columns="Day",
          values="kg",
          )
   .add_prefix('Day')
   .join(df.groupby(['Animal', 'Food'])['age'].first())
   .reset_index()
)
I am adding an unambiguous example (here the age of Animal 1 changes on Day 2).
Input:
Animal Day age Food kg
0 1 1 3 17 0.1
1 1 1 3 22 0.7
2 1 2 4 17 0.8
3 2 2 7 15 0.1
Output:
Animal Food Day1 Day2 age
0 1 17 0.1 0.8 3
1 1 22 0.7 NaN 3
2 2 15 NaN 0.1 7
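The pivot, groupby and join steps can also be put together as a self-contained script. This is a minimal sketch using the input with the changing age; the names keys, wide, age and result are only illustrative, and first could be swapped for any other aggregation.

```python
import pandas as pd

# Input where the age of Animal 1 changes on Day 2.
df = pd.DataFrame({
    "Animal": [1, 1, 1, 2],
    "Day": [1, 1, 2, 2],
    "age": [3, 3, 4, 7],
    "Food": [17, 22, 17, 15],
    "kg": [0.1, 0.7, 0.8, 0.1],
})

keys = ["Animal", "Food"]

# Step 1: pivot only the weights, one column per day.
wide = df.pivot(index=keys, columns="Day", values="kg").add_prefix("Day")

# Step 2: aggregate age separately; "first" keeps the first value per group.
age = df.groupby(keys)["age"].first()

# Step 3: join both results on the shared (Animal, Food) index.
result = wide.join(age).reset_index()

print(result)
```

Because the aggregation is a separate step, further columns can be carried along by selecting several of them in the groupby, for example df.groupby(keys)[["age", "other"]].first(), where other stands for any extra column in your data.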
Use pivot and add the other columns to the index:
>>> index_cols = df.columns[~df.columns.isin(['Day', 'kg'])].tolist()
>>> df.pivot(index=index_cols, columns='Day', values='kg') \
...     .add_prefix('Day').reset_index().rename_axis(columns=None)
Animal age Food Day1 Day2
0 1 3 17 0.1 0.8
1 1 3 22 0.7 NaN
2 2 7 15 NaN 0.1
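Note that putting age in the index makes it part of the row key. A minimal sketch, assuming the input in which the age of Animal 1 changes on Day 2 (the names index_cols and wide are only illustrative), suggests that the differing ages then land on separate rows instead of being collapsed into one:

```python
import pandas as pd

# Assumed input: same Animal/Food pair (1, 17) has age 3 on Day 1 and age 4 on Day 2.
df = pd.DataFrame({
    "Animal": [1, 1, 1, 2],
    "Day": [1, 1, 2, 2],
    "age": [3, 3, 4, 7],
    "Food": [17, 22, 17, 15],
    "kg": [0.1, 0.7, 0.8, 0.1],
})

# Every column except Day and kg becomes part of the pivot index.
index_cols = [c for c in df.columns if c not in ("Day", "kg")]

wide = (
    df.pivot(index=index_cols, columns="Day", values="kg")
    .add_prefix("Day")
    .reset_index()
    .rename_axis(columns=None)
)

# Animal 1 / Food 17 now appears twice, once per distinct age.
print(wide)
```

So this variant fits best when the extra columns are constant within each Animal/Food group; if they can vary, aggregating them separately is probably the safer route.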