What is an effective way to build a pivot table that has the columns of a pandas dataset as its rows?
Let's take the following dataset as an example:
make address all 3d our over length_total y
0 0.0 0.64 0.64 0.0 0.32 0.0 278 1
1 0.21 0.28 0.5 0.0 0.14 0.28 1028 1
2 0.06 0.0 0.71 0.0 1.23 0.19 2259 1
3 0.15 0.0 0.46 0.1 0.61 0.0 1257 1
4 0.06 0.12 0.77 0.0 0.19 0.32 749 1
5 0.0 0.0 0.0 0.0 0.0 0.0 21 1
6 0.0 0.0 0.25 0.0 0.38 0.25 184 1
7 0.0 0.69 0.34 0.0 0.34 0.0 261 1
8 0.0 0.0 0.0 0.0 0.9 0.0 25 1
9 0.0 0.0 1.42 0.0 0.71 0.35 205 1
10 0.0 0.0 0.0 0.0 0.0 0.0 23 0
11 0.48 0.0 0.0 0.0 0.48 0.0 37 0
12 0.12 0.0 0.25 0.0 0.0 0.0 491 0
13 0.08 0.08 0.25 0.2 0.0 0.25 807 0
14 0.0 0.0 0.0 0.0 0.0 0.0 38 0
15 0.24 0.0 0.12 0.0 0.0 0.12 227 0
16 0.0 0.0 0.0 0.0 0.75 0.0 77 0
17 0.1 0.0 0.21 0.0 0.0 0.0 571 0
18 0.51 0.0 0.0 0.0 0.0 0.0 74 0
19 0.3 0.0 0.15 0.0 0.0 0.15 155 0
I want to build a pivot table from this dataset in which each of the columns (make, address, all, 3d, our, over, length_total) is averaged within each value of the column y. The following table is the expected result:
y
1 0
make 0.048 0.183
address 0.173 0.008
all 0.509 0.098
3d 0.01 0.02
our 0.482 0.123
over 0.139 0.052
length_total 626.7 250
Is it possible to get the desired result with the pivot_table method of a pandas DataFrame? If so, how?
Is there a more effective way to do this?
Some people like using stack or unstack, but I prefer good ol' pd.melt to "flatten" or "unpivot" a frame:
>>> df_m = pd.melt(df, id_vars="y")
>>> df_m.pivot_table(index="variable", columns="y")
                value
y                   0        1
variable
3d              0.020    0.010
address         0.008    0.173
all             0.098    0.509
length_total  250.000  626.700
make            0.183    0.048
our             0.123    0.482
over            0.052    0.139
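(If you want to preserve the original column order as the new row order, you can use .loc to index into this result, something like df2.loc[df.columns].dropna(), where df2 is the pivoted frame. Recent pandas versions raise a KeyError when .loc is given a label that is missing from the index, as y is here, so df2.reindex(df.columns).dropna() is the safer spelling there.)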
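For anyone who wants to reproduce the steps above, here is a minimal sketch that rebuilds the dataset from the question by hand, melts it, pivots it, and then restores the original column order. The names df2 and ordered are only illustrative, and reindex is used instead of .loc on the assumption that a recent pandas version is installed.

```python
import pandas as pd

# Dataset from the question, typed in column by column
# (first ten rows have y == 1, last ten have y == 0).
df = pd.DataFrame({
    "make":    [0.0, 0.21, 0.06, 0.15, 0.06, 0.0, 0.0, 0.0, 0.0, 0.0,
                0.0, 0.48, 0.12, 0.08, 0.0, 0.24, 0.0, 0.1, 0.51, 0.3],
    "address": [0.64, 0.28, 0.0, 0.0, 0.12, 0.0, 0.0, 0.69, 0.0, 0.0,
                0.0, 0.0, 0.0, 0.08, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "all":     [0.64, 0.5, 0.71, 0.46, 0.77, 0.0, 0.25, 0.34, 0.0, 1.42,
                0.0, 0.0, 0.25, 0.25, 0.0, 0.12, 0.0, 0.21, 0.0, 0.15],
    "3d":      [0.0, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
                0.0, 0.0, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "our":     [0.32, 0.14, 1.23, 0.61, 0.19, 0.0, 0.38, 0.34, 0.9, 0.71,
                0.0, 0.48, 0.0, 0.0, 0.0, 0.0, 0.75, 0.0, 0.0, 0.0],
    "over":    [0.0, 0.28, 0.19, 0.0, 0.32, 0.0, 0.25, 0.0, 0.0, 0.35,
                0.0, 0.0, 0.0, 0.25, 0.0, 0.12, 0.0, 0.0, 0.0, 0.15],
    "length_total": [278, 1028, 2259, 1257, 749, 21, 184, 261, 25, 205,
                     23, 37, 491, 807, 38, 227, 77, 571, 74, 155],
    "y": [1] * 10 + [0] * 10,
})

# Long format: one row per (y, original column name, value).
df_m = pd.melt(df, id_vars="y")

# pivot_table aggregates with the mean by default.
df2 = df_m.pivot_table(index="variable", columns="y")

# Rows back in the original column order; the all-NaN row that
# reindex creates for "y" is removed by dropna.
ordered = df2.reindex(df.columns).dropna()
print(ordered.round(3))
```

The columns of the result are a two-level index with "value" on top, so a single cell is addressed as ordered.loc["make", ("value", 1)].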
Melting does the flattening: it keeps y as a column and puts the old column names into a new column called "variable" (the name can be changed if you like):
>>> pd.melt(df, id_vars="y").head()
   y variable  value
0  1     make   0.00
1  1     make   0.21
2  1     make   0.06
3  1     make   0.15
4  1     make   0.06
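As a small illustration of changing those names, pd.melt accepts var_name and value_name arguments. The sketch below uses a tiny stand-in frame with made-up values, and the names "feature" and "freq" are hypothetical choices, not something from the question.

```python
import pandas as pd

# Tiny stand-in frame (hypothetical values), just to show the renamed columns.
small = pd.DataFrame({"make": [0.0, 0.21], "our": [0.32, 0.14], "y": [1, 0]})

# var_name replaces the default "variable", value_name replaces "value".
long_df = pd.melt(small, id_vars="y", var_name="feature", value_name="freq")
print(long_df)
```

If the names are changed, the later pivot_table call has to use the new one, for example index="feature".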
After that we can call pivot_table as we ordinarily would.
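On the question of whether there is a more direct route: one alternative that is not part of the answer above is to group by y, take the mean, and transpose. The sketch below uses a small stand-in frame with hypothetical values and checks that it agrees with the melt-and-pivot result; treat it as a cross-check rather than a recommendation.

```python
import pandas as pd

# Small stand-in frame (hypothetical values).
small = pd.DataFrame({
    "make": [0.0, 0.21, 0.48, 0.12],
    "our":  [0.32, 0.14, 0.48, 0.0],
    "y":    [1, 1, 0, 0],
})

# Alternative: per-group means, then transpose so columns become rows.
by_group = small.groupby("y").mean().T

# Melt-and-pivot route from the answer; ["value"] drops the top column level.
via_melt = pd.melt(small, id_vars="y").pivot_table(
    index="variable", columns="y"
)["value"]

# Largest absolute difference between the two results.
max_diff = abs(by_group.values - via_melt.loc[by_group.index].values).max()
print(by_group)
print(max_diff)
```

The grouped version keeps the rows in the original column order, so no reordering step is needed afterwards.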