I have a simple dataframe like this:
df = pd.DataFrame({'class':['a','b','c','d','e'],
'name':['Adi','leon','adi','leo','andy'],
'age':['9','8','9','9','8'],
'score':['40','90','35','95','85']})
The result looks like this:
class name age score
a Adi 9 40
b leon 8 90
c adi 9 35
d leo 9 95
e andy 8 85
How can I combine the rows named 'Adi' and 'adi' into a single row? They are the same person, so the score for 'Adi' should be 75, not 40 and 35.
You could use pandas.DataFrame.groupby and pandas.DataFrame.aggregate after first making the name column lowercase:
import pandas as pd
df = pd.DataFrame({
'class': ['a', 'b', 'c', 'd', 'e'],
'name': ['Adi', 'leon', 'adi', 'leo', 'andy'],
'age': ['9', '8', '9', '9', '8'],
'score': ['40', '90', '35', '95', '85']
})
df['name'] = df['name'].str.lower()
df['score'] = df['score'].astype(int)
aggregate_funcs = {
'class': lambda s: ', '.join(set(s)),
'age': lambda s: ', '.join(set(s)),
'score': sum
}
df = df.groupby(df['name']).aggregate(aggregate_funcs)
print(df)
Output:
class age score
name
adi c, a 9 75
andy e 8 85
leo d 9 95
leon b 8 90
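A variant of the same idea, offered as a minimal sketch rather than the answer's exact method: because set() has no guaranteed order, the joined class string may come out as 'c, a' or 'a, c'. Sorting the unique values makes the result reproducible, and as_index=False keeps name as a regular column. The helper join_unique and the variable merged are names introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'class': ['a', 'b', 'c', 'd', 'e'],
    'name': ['Adi', 'leon', 'adi', 'leo', 'andy'],
    'age': ['9', '8', '9', '9', '8'],
    'score': ['40', '90', '35', '95', '85'],
})

def join_unique(s):
    # Hypothetical helper: join the distinct values in sorted order
    # so the output does not depend on set iteration order.
    return ', '.join(sorted(set(s)))

merged = (
    # Normalise the names and convert scores to integers without mutating df.
    df.assign(name=df['name'].str.lower(), score=df['score'].astype(int))
      .groupby('name', as_index=False)
      .agg({'class': join_unique, 'age': join_unique, 'score': 'sum'})
)
print(merged)
```

Here the row for adi carries both classes ('a, c') and the summed score, so no information from either original row is lost.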
drop_duplicates() works well for this if you are using pandas. Sum the scores per name first, then drop the duplicate rows:
df['name'] = df['name'].str.lower()
df['score'] = df['score'].astype(int)
df['score'] = df['score'].groupby(df['name']).transform('sum')
df.drop_duplicates(subset='name',keep='first',inplace=True)
Output:
class name age score
0 a adi 9 75
1 b leon 8 90
3 d leo 9 95
4 e andy 8 85
If you set keep='last' instead, you get this output:
class name age score
1 b leon 8 90
2 c adi 9 75
3 d leo 9 95
4 e andy 8 85
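For anyone who wants to run this end to end, here is a self-contained sketch of the drop_duplicates approach. It assumes the same sample data as the question and returns new frames instead of using inplace=True, so both keep settings can be compared side by side. The variable names first and last are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'class': ['a', 'b', 'c', 'd', 'e'],
    'name': ['Adi', 'leon', 'adi', 'leo', 'andy'],
    'age': ['9', '8', '9', '9', '8'],
    'score': ['40', '90', '35', '95', '85'],
})

# Normalise names so 'Adi' and 'adi' fall into the same group.
df['name'] = df['name'].str.lower()
df['score'] = df['score'].astype(int)

# Replace each score with the total for that name (same length as df).
df['score'] = df.groupby('name')['score'].transform('sum')

# Keep one row per name; keep= decides which duplicate survives.
first = df.drop_duplicates(subset='name', keep='first')
last = df.drop_duplicates(subset='name', keep='last')

print(first)
print(last)
```

Note that this approach keeps the class and age of only one of the duplicate rows ('a' with keep='first', 'c' with keep='last'), whereas the groupby/aggregate answer joins them.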