I have a dataframe as below:
Wash_Month Wash_Day
0 3 2
1 4 3
And the expected output is:
#d={'Wash_Month':'Wash_Month/Wash_Day','Wash_Day':'Wash_Month/Wash_Day'}
#df.T.astype(str).groupby(d).agg(','.join)
Out[329]:
0 1
Wash_Month/Wash_Day 3,2 4,3
As you can see, I first transpose with T.
If we groupby with axis=1 and remove the T, I expected the same output.
df.astype(str).groupby(d,axis=1).agg(','.join)
Out[330]:
Wash_Month/Wash_Day
0 Wash_Month,Wash_Day
1 Wash_Month,Wash_Day
The output does not match what I expected. Is there a specific problem with agg and join when groupby is used with axis=1? Other agg functions such as sum work as normal:
df.astype(str).groupby({'Wash_Month':'Wash_Month/Wash_Day','Wash_Day':'Wash_Month/Wash_Day'}, axis=1).sum()
Out[332]:
Wash_Month/Wash_Day
0 32.0 # str 3 + str 2
1 43.0
As for why the result becomes a float rather than a str, check the link.
Appreciate your help :-)
Here is a hint:
def f(x):
    print(x)
    print(type(x))
    return 1
df.astype(str).groupby(d,axis=1).agg(f)
Output:
Wash_Month Wash_Day
0 3 2
1 4 3
<class 'pandas.core.frame.DataFrame'>
Note that the object passed to f is a DataFrame.
As opposed to:
def f(x):
    print(x)
    print(type(x))
    return 1
df.T.astype(str).groupby(d).agg(f)
Output:
Wash_Month 3
Wash_Day 2
Name: 0, dtype: object
<class 'pandas.core.series.Series'>
Wash_Month 4
Wash_Day 3
Name: 1, dtype: object
<class 'pandas.core.series.Series'>
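Here f gets called once per Series, so ','.join concatenates the values. In the axis=1 case f receives the whole DataFrame, and iterating over a DataFrame yields its column labels, hence ','.join concatenates the column headers.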
I can't explain it without digging through the source code, but it appears that groupby together with astype(str) causes agg to act differently in each situation.
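The difference between joining a DataFrame and joining a Series can be checked directly, without groupby. The snippet below is a minimal sketch: the frame is rebuilt from the question's data, and the row-wise apply at the end is only one possible workaround that I am suggesting, not something from the original post. The names s, joined_frame, joined_row and combined are made up for illustration.

```python
import pandas as pd

# Rebuild the frame from the question and convert everything to str.
df = pd.DataFrame({"Wash_Month": [3, 4], "Wash_Day": [2, 3]})
s = df.astype(str)

# Iterating over a DataFrame yields its column labels, not its values,
# so joining the whole frame concatenates the headers.
joined_frame = ",".join(s)

# Iterating over a Series yields its values, so joining one row works.
joined_row = ",".join(s.iloc[0])

# Possible workaround (an assumption, not the asker's code):
# apply with axis=1 passes each row to the function as a Series.
combined = s.apply(",".join, axis=1).rename("Wash_Month/Wash_Day")

print(joined_frame)
print(joined_row)
print(combined.tolist())
```

Because apply with axis=1 hands each row to the function as a Series, the join sees the cell values rather than the column labels.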