Pandas pivot table cannot aggregate in 2 levels
I am exploring the Titanic dataset from seaborn and want to build a pivot table using this code:
import numpy as np
import pandas as pd
import seaborn as sns
titanic = sns.load_dataset('titanic')
age = pd.cut(titanic['age'], [0, 18, 80])
# fare = pd.cut(titanic['fare'], [0, 250, 500]) #1 - this does not work
fare = pd.qcut(titanic['fare'], 3) #2 this works as intended
titanic.pivot_table('survived', ['sex', age], ['class', fare])
The problem with #1 is that it does not split the fare into bins for Second and Third class, only for First class.
Results:
Does anyone know why this happens?
Thank you, much appreciated!
Run this:
pd.crosstab(titanic['class'], pd.cut(titanic['fare'],[0,10,50,74,100,200,300,500,1000]))
Output:
fare    (0, 10]  (10, 50]  (50, 74]  (74, 100]  (100, 200]  (200, 300]  (500, 1000]
class
First         1        71        42         44          33          17            3
Second        0       171         7          0           0           0            0
Third       320       153        14          0           0           0            0
Note: The highest fare in both the "Second" and "Third" classes is less than 75.
So, in your first example, all of your Second and Third class fares land in the single bucket below 250, and only First class has fares to split across both bins.
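To see the same effect without downloading the dataset, here is a minimal sketch on a small made-up frame. The rows and the names `df`, `fixed_bins`, and `quantile_bins` are invented for illustration; only the shape of the problem (cheap Second/Third fares, a few expensive First fares) mirrors the Titanic data. It compares the fixed edges from #1 with the quantile edges from #2.

```python
import pandas as pd

# Hypothetical stand-in for the titanic data; the values are invented.
df = pd.DataFrame({
    "class": ["First", "First", "First", "Second", "Second",
              "Third", "Third", "Third"],
    "fare": [30.0, 263.0, 80.0, 13.0, 73.5, 7.25, 8.05, 69.55],
})

# Highest fare per class: Second and Third stay far below 250.
max_fare = df.groupby("class")["fare"].max()
print(max_fare)

# Approach #1: fixed edges. Every fare below 250 gets the same bin.
fixed_bins = pd.cut(df["fare"], [0, 250, 500])
print(pd.crosstab(df["class"], fixed_bins))

# Approach #2: quantile edges are computed from the data itself,
# so the cheaper classes are spread over more than one bin.
quantile_bins = pd.qcut(df["fare"], 3)
print(pd.crosstab(df["class"], quantile_bins))
```

With fixed edges, only the single First class fare above 250 reaches the second bin. `pd.qcut` places its edges where the fares actually are, which is why #2 produces several fare groups for every class.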