I have a dataframe in Python Pandas with only two columns. The first one has repeated values like the following:
A B
apple 0.5
apple 0.8
apple 1.4
orange 0.4
orange 1.1
melon 0.3
melon 0.1
melon 0.9
melon 1.2
I want to create a new dataframe containing the mean of column B for each distinct value in column A. For example:
A B
apple 0.9
orange 0.75
melon 0.625
The file has about 2.5 million rows, so I cannot do this in Excel. How can it be done in Pandas?
If df is your dataframe, you can group by 'A' and take the mean with:
g = df.groupby('A').mean()
This returns (for a dataframe whose 'A' column holds the keys 1, 2 and 3):
B
A
1 0.900
2 0.750
3 0.625
EDIT: if you're not familiar with pandas and you've got an external file, you can import it with:
import pandas
df = pandas.read_csv(yourfile)  # yourfile: path to your CSV file
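For illustration, here is a minimal sketch of the import step. It assumes the file is comma-separated with a header row naming the columns A and B; the in-memory text below is only a stand-in for a real file, and a different delimiter would need the sep argument of read_csv.

```python
import io

import pandas as pd

# Stand-in for a real file on disk (assumed layout: header row, comma-separated).
csv_text = "A,B\napple,0.5\napple,0.8\norange,0.4\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Check that column B was parsed as a number before averaging it.
print(df.dtypes)
```

With a real file, passing its path in place of io.StringIO(csv_text) should be enough.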
EDIT2: g = df.groupby('A').mean() also works with your edited dataframe of fruits:
B
A
apple 0.900
melon 0.625
orange 0.750
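Note that the output above is sorted by key and uses 'A' as the index, while the question asks for a dataframe with columns A and B in the original order. One possible way to get that shape, as a sketch rather than the only option, is to pass sort=False and as_index=False to groupby. The dataframe below is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": ["apple"] * 3 + ["orange"] * 2 + ["melon"] * 4,
    "B": [0.5, 0.8, 1.4, 0.4, 1.1, 0.3, 0.1, 0.9, 1.2],
})

# sort=False keeps groups in order of first appearance;
# as_index=False keeps 'A' as a regular column instead of the index.
result = df.groupby("A", sort=False, as_index=False)["B"].mean()
print(result)
```

Calling reset_index() on the result of df.groupby('A').mean() would also turn 'A' back into a column, but the rows would still be sorted alphabetically.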