My DataFrame (df) looks like this:
1 2 3
A abc ab
A abc cc
A abc ab
I'd like to group identical records so that I end up with:
1 2 3
A abc ab
A abc cc
or, even better, a single field holding the concatenated string:
1
A_abc_ab
A_abc_cc
Pandas GroupBy doesn't seem to work with strings:
df = df.groupby(['1','2','3'])
returns
<pandas.core.groupby.DataFrameGroupBy object at 0x7f4a37549bd0>
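You are not applying groupby correctly. Calling groupby only builds a GroupBy object, which is what you are seeing. After groupby you also have to call an aggregation on the result, for example group.aggregate(), to reduce the cells of each group with some function.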
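As a minimal sketch of that point, the snippet below rebuilds the frame from the question (assuming the column labels are the strings '1', '2' and '3') and reduces each group with size(). The choice of size() and the column name "count" are only for illustration; any aggregation would do.

```python
import pandas as pd

# Sample frame mirroring the question; the string column labels are an assumption.
df = pd.DataFrame(
    [["A", "abc", "ab"], ["A", "abc", "cc"], ["A", "abc", "ab"]],
    columns=["1", "2", "3"],
)

# groupby alone only builds a GroupBy object; an aggregation reduces it.
# size() counts the rows in each group, and reset_index turns the
# group keys back into ordinary columns.
counts = df.groupby(["1", "2", "3"]).size().reset_index(name="count")
print(counts)
```

Each distinct record now appears once, with the number of times it occurred next to it.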
You probably want this instead:
df.apply('-'.join, axis=1)
which produces
0 A-abc-ab
1 A-abc-cc
2 A-abc-ab
dtype: object
Of course, you can call drop_duplicates before or after joining.
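Both orders can be sketched as follows, assuming the same three-column frame of strings as in the question. The variable names before and after are hypothetical, chosen just for this example.

```python
import pandas as pd

# Sample frame mirroring the question; every cell is a string.
df = pd.DataFrame(
    [["A", "abc", "ab"], ["A", "abc", "cc"], ["A", "abc", "ab"]],
    columns=["1", "2", "3"],
)

# Deduplicate the rows first, then join each remaining row into one string.
before = df.drop_duplicates().apply("-".join, axis=1)

# Join every row first, then deduplicate the resulting Series.
after = df.apply("-".join, axis=1).drop_duplicates()

print(before.tolist())
print(after.tolist())
```

For this data both orders give the same two strings. Dropping duplicates first means fewer rows have to be joined.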
Moving from this:
1 2 3
A abc ab
A abc cc
A abc ab
To this:
1 2 3
A abc ab
A abc cc
doesn't involve grouping at all; you're just dropping duplicates:
In [9]: df.drop_duplicates()
Out[9]:
1 2 3
0 A abc ab
1 A abc cc
You can then use apply to concatenate:
In [10]: df.drop_duplicates().apply('_'.join, axis=1)
Out[10]:
0 A_abc_ab
1 A_abc_cc
dtype: object
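If the goal is the single-field layout from the question (one column labelled 1), the joined Series can be turned back into a DataFrame. This is only a sketch: it assumes all cells are strings (str.join fails on other types), and the names joined and result are made up for the example.

```python
import pandas as pd

# Sample frame mirroring the question; every cell is a string.
df = pd.DataFrame(
    [["A", "abc", "ab"], ["A", "abc", "cc"], ["A", "abc", "ab"]],
    columns=["1", "2", "3"],
)

# Drop duplicate rows, then join each row's values with an underscore.
joined = df.drop_duplicates().apply("_".join, axis=1)

# Wrap the Series in a one-column DataFrame labelled '1' and
# renumber the rows from zero.
result = joined.to_frame(name="1").reset_index(drop=True)
print(result)
```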