简体   繁体   中英

Add a new column after groupby in pandas

I am using pandas to process a csv file with the following structure

date, student, score, outof
20040601,mark,80,100
20040601,jack,40,100
20040602,mark,60,100
20040602,jack,30,100

I want to group the above data by date and then calculate the percentage for each date and then plot it.

I am able to do groupby, by using the following code

import pandas as pd
data = pd.read_csv("csv_file")
grouped_by_date = data.groupby('date')

I am also able to calculate the percentage for each line using the following code

import pandas as pd
data = pd.read_csv("csv_file")
date['percentage'] = data['score']/data['outof']

But I am not able to do it on the grouped data. How to do it on the grouped by data?

Edit :

The following is the desired output that I want

date, percentage
20040601,60
20040602,45

The following is what I would do:

>>> df.set_index(['date'], inplace=True)
>>> df['percentDate'] = df.groupby(level=0).apply(lambda x: np.mean(x['score']/x['outof']))
>>> df
Out[391]: 
         student  score  outof  percentDate
date                                         
20040601    mark     80    100           0.60
20040601    jack     40    100           0.60
20040602    mark     60    100           0.45
20040602    jack     30    100           0.45

First of all, since you want to set the values into the dataframe as a column, it's nice to set the index according to what you group-by: it makes setting the values later on easier (to me).

Second, Observe that I have used numpy's mean() function: since score/outof will return a column of observations (one row per student), you have to average out over that.

Simply:

data['percentage'] = data.score/data.outof
data.groupby('date').mean()

Yields:

          score  outof  percentage
date                              
20040601     60    100        0.60
20040602     45    100        0.45

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM