Groupby agg with join does not produce the expected output
I have a dataframe as below:
Wash_Month Wash_Day
0 3 2
1 4 3
And the expected output is:
#d={'Wash_Month':'Wash_Month/Wash_Day','Wash_Day':'Wash_Month/Wash_Day'}
#df.T.astype(str).groupby(d).agg(','.join)
Out[329]:
0 1
Wash_Month/Wash_Day 3,2 4,3
As you can see, I first do the transpose T. If we groupby with axis=1 and remove the T, I expected the same output.
df.astype(str).groupby(d,axis=1).agg(','.join)
Out[330]:
Wash_Month/Wash_Day
0 Wash_Month,Wash_Day
1 Wash_Month,Wash_Day
The output does not match the expected output. Is there a specific problem with agg with join when groupby uses axis=1? Other agg functions like sum work as normal:
df.astype(str).groupby({'Wash_Month':'Wash_Month/Wash_Day','Wash_Day':'Wash_Month/Wash_Day'}, axis=1).sum()
Out[332]:
Wash_Month/Wash_Day
0 32.0 # str 3 + str 2
1 43.0
As for why the result becomes a float rather than a str, check the link.
Appreciate your help :-)
Here is a hint:
def f(x):
    print(x)
    print(type(x))
    return 1
df.astype(str).groupby(d,axis=1).agg(f)
Output:
Wash_Month Wash_Day
0 3 2
1 4 3
<class 'pandas.core.frame.DataFrame'>
Note that the object passed to f is a DataFrame.
As opposed to:
def f(x):
    print(x)
    print(type(x))
    return 1
df.T.astype(str).groupby(d).agg(f)
Output:
Wash_Month 3
Wash_Day 2
Name: 0, dtype: object
<class 'pandas.core.series.Series'>
Wash_Month 4
Wash_Day 3
Name: 1, dtype: object
<class 'pandas.core.series.Series'>
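Here f gets called with each Series, so ','.join concatenates the values. In the axis=1 case f gets the whole DataFrame instead, and iterating over a DataFrame yields its column labels, hence 'join' is concatenating the column headers.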
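A minimal sketch of the difference, assuming the frame from the question is rebuilt by hand: str.join simply iterates over whatever it is given, so a DataFrame contributes its column labels while a Series contributes its values.

```python
import pandas as pd

# Rebuild the frame from the question and cast to str, as in the original calls
df = pd.DataFrame({"Wash_Month": [3, 4], "Wash_Day": [2, 3]}).astype(str)

# Iterating over a DataFrame yields its column labels,
# so join produces the headers rather than the cell values
joined_frame = ",".join(df)

# Iterating over a Series (here, row 0) yields its values
joined_row = ",".join(df.loc[0])

print(joined_frame)  # Wash_Month,Wash_Day
print(joined_row)    # 3,2
```

This matches the two outputs in the question: headers when the function receives a DataFrame, values when it receives a Series.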
I can't explain it without digging through the source code, but it appears that the groupby along with astype(str) is causing agg to act differently in each situation.
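One possible workaround, not taken from the posts above, is to skip groupby along axis=1 and join row by row with apply. This is only a sketch: the mapping d is the one from the question, while the helper name groups is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"Wash_Month": [3, 4], "Wash_Day": [2, 3]})
d = {"Wash_Month": "Wash_Month/Wash_Day", "Wash_Day": "Wash_Month/Wash_Day"}

# Invert the column -> label mapping into label -> list of columns
# ("groups" is a hypothetical helper name, not part of pandas)
groups = {}
for col, label in d.items():
    groups.setdefault(label, []).append(col)

# apply(..., axis=1) passes each row as a Series, so join sees the values
result = pd.DataFrame({
    label: df[cols].astype(str).apply(",".join, axis=1)
    for label, cols in groups.items()
})

print(result)
```

Because each row reaches the function as a Series, the joined strings contain the cell values rather than the column headers.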