Pandas group by to aggregate string field

Question

My df is this:

1   2   3
A  abc  ab
A  abc  cc
A  abc  ab

I'd like to collapse the duplicate records to get

1   2   3
A  abc  ab
A  abc  cc

or, even better, to have a single field with the concatenated string:

   1  
A_abc_ab
A_abc_cc

Pandas GroupBy doesn't seem to work with strings:

df = df.groupby(['1','2','3'])

returns

<pandas.core.groupby.DataFrameGroupBy object at 0x7f4a37549bd0>

Answer 1

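You are not applying groupby correctly. Calling groupby on its own only returns a GroupBy object, which is what you are seeing. After groupby you have to call an aggregation such as group.aggregate() to reduce each group's cells with some function.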
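As a rough sketch of what "groupby followed by an aggregation" could look like on the frame from the question: the DataFrame construction and the choice of size() as the aggregation are assumptions made for illustration, not something the question asked for.

```python
import pandas as pd

# Sample frame rebuilt from the question (column names are the strings '1', '2', '3').
df = pd.DataFrame(
    [["A", "abc", "ab"], ["A", "abc", "cc"], ["A", "abc", "ab"]],
    columns=["1", "2", "3"],
)

# groupby alone only builds a GroupBy object; an aggregation such as
# size() is what reduces each group to a single row.
counts = df.groupby(["1", "2", "3"]).size().reset_index(name="count")
print(counts)
```

Each distinct (1, 2, 3) combination ends up as one row, with the number of original rows it covered in the hypothetical count column.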

What you probably want instead is this:

df.apply('-'.join, axis=1)

which produces

0    A-abc-ab
1    A-abc-cc
2    A-abc-ab
dtype: object

Of course, you can call drop_duplicates before or after joining.
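A minimal sketch of both orderings, assuming the same sample frame as in the question (the variable names are only illustrative):

```python
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame(
    [["A", "abc", "ab"], ["A", "abc", "cc"], ["A", "abc", "ab"]],
    columns=["1", "2", "3"],
)

# Option 1: join every row first, then drop the repeated strings.
joined_then_dedup = df.apply("-".join, axis=1).drop_duplicates()

# Option 2: drop the repeated rows first, then join what is left.
dedup_then_joined = df.drop_duplicates().apply("-".join, axis=1)

print(joined_then_dedup.tolist())
print(dedup_then_joined.tolist())
```

On this data both orderings give the same two strings. Deduplicating first means fewer rows go through the row-wise apply, which may matter on larger frames.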

Answer 2

Moving from this:

1   2   3
A  abc  ab
A  abc  cc
A  abc  ab

To this:

1   2   3
A  abc  ab
A  abc  cc

Doesn't involve grouping at all! You're just dropping duplicates:

In [9]: df.drop_duplicates()
Out[9]: 
   1    2   3
0  A  abc  ab
1  A  abc  cc

You can then use apply to concatenate:

In [10]: df.drop_duplicates().apply('_'.join, axis=1)
Out[10]: 
0    A_abc_ab
1    A_abc_cc
dtype: object
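If the row-wise apply turns out to be slow on a larger frame, one possible alternative is the vectorised Series.str.cat. This is a sketch under the assumption that all three columns already hold strings; non-string columns would need astype(str) first.

```python
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame(
    [["A", "abc", "ab"], ["A", "abc", "cc"], ["A", "abc", "ab"]],
    columns=["1", "2", "3"],
)

unique = df.drop_duplicates()

# Concatenate column '1' with columns '2' and '3', using '_' as the separator.
combined = unique["1"].str.cat([unique["2"], unique["3"]], sep="_")
print(combined.tolist())
```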

2 answers

Answer 1: score 4, posted 2014-09-03 15:29:08
Answer 2: score 3, accepted, posted 2014-09-03 15:32:22
