
Groupby agg with join does not produce the expected output

I have a DataFrame as below:

   Wash_Month  Wash_Day
0           3         2
1           4         3

And the expected output is:

d={'Wash_Month':'Wash_Month/Wash_Day','Wash_Day':'Wash_Month/Wash_Day'}
df.T.astype(str).groupby(d).agg(','.join)
Out[329]: 
                       0    1
Wash_Month/Wash_Day  3,2  4,3

As you can see, I first transpose the frame with T.

If I group with axis=1 instead and remove the T, I expect the same output.

df.astype(str).groupby(d,axis=1).agg(','.join)
Out[330]: 
   Wash_Month/Wash_Day
0  Wash_Month,Wash_Day
1  Wash_Month,Wash_Day

This output does not match the expected output. Is there a specific problem with agg and join when grouping with axis=1?

Other agg functions such as sum work as normal:

df.astype(str).groupby({'Wash_Month':'Wash_Month/Wash_Day','Wash_Day':'Wash_Month/Wash_Day'}, axis=1).sum()
Out[332]: 
   Wash_Month/Wash_Day
0                 32.0 # str 3 + str 2
1                 43.0

As for why the result becomes a float rather than a str, check the link.

Appreciate your help :-)

Here is a hint:

def f(x):
    print(x)
    print(type(x))
    return 1

df.astype(str).groupby(d,axis=1).agg(f)

Output:

  Wash_Month Wash_Day
0          3        2
1          4        3
<class 'pandas.core.frame.DataFrame'>

Note that the object passed to f is a DataFrame.

As opposed to:

def f(x):
    print(x)
    print(type(x))
    return 1

df.T.astype(str).groupby(d).agg(f)

Output:

Wash_Month    3
Wash_Day      2
Name: 0, dtype: object
<class 'pandas.core.series.Series'>
Wash_Month    4
Wash_Day      3
Name: 1, dtype: object
<class 'pandas.core.series.Series'>

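Here f gets called once per Series, so ','.join concatenates the values. In the axis=1 case f is called with the whole DataFrame, and iterating over a DataFrame yields its column labels, hence 'join' is concatenating the column headers.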
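The difference between joining a DataFrame and joining a Series can be reproduced without groupby at all. A minimal sketch, rebuilding the question's frame by hand (the names joined_frame and joined_row are only for illustration):

```python
import pandas as pd

# Rebuild the frame from the question and cast everything to str.
df = pd.DataFrame({'Wash_Month': [3, 4], 'Wash_Day': [2, 3]}).astype(str)

# Iterating over a DataFrame yields its column labels, not its values,
# so join sees 'Wash_Month' and 'Wash_Day'.
joined_frame = ','.join(df)

# Iterating over a Series yields its values, so join sees '3' and '2'.
joined_row = ','.join(df.loc[0])

print(joined_frame)  # Wash_Month,Wash_Day
print(joined_row)    # 3,2
```

This matches the two outputs seen in the question: column headers when the function receives a DataFrame, values when it receives a Series.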

I can't explain it without digging through the source code, but it appears that groupby combined with astype(str) causes agg to act differently in each situation.
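If the goal is just the joined strings, one possible workaround is to skip groupby with axis=1 and join row by row instead. This is a sketch, not necessarily how pandas intends it to be done. It assumes the mapping d from the question, and the names groups and result are hypothetical helpers introduced here:

```python
import pandas as pd

df = pd.DataFrame({'Wash_Month': [3, 4], 'Wash_Day': [2, 3]})
d = {'Wash_Month': 'Wash_Month/Wash_Day', 'Wash_Day': 'Wash_Month/Wash_Day'}

# Invert the mapping: group label -> list of columns in that group.
groups = {}
for col, label in d.items():
    groups.setdefault(label, []).append(col)

# apply(..., axis=1) passes each row as a Series, so ','.join
# concatenates the row's values rather than the column labels.
result = pd.DataFrame({
    label: df[cols].astype(str).apply(','.join, axis=1)
    for label, cols in groups.items()
})

print(result)
```

Calling result.T gives the same orientation as the transposed version in the question, with the group label as the index.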
