
How to change values of a dataframe through groupby in pandas

Here I have a dataframe:

print(te_c.head(10))

   Price          Vol   tradeDate
0     9            1  1994-05-09
1     3            2  1994-05-10
2     3            2  1994-05-10
3     3            2  1994-05-10
4     3            2  1994-05-10
5     4            3  1994-05-11
6     4            3  1994-05-11
7     4            3  1994-05-11
8     4            3  1994-05-11
9     5            4  1994-05-12

and a list:

te_index = list(range(1, te_c.drop_duplicates('tradeDate').shape[0] + 1))

Now I want to group te_c by 'tradeDate' and, for the i-th group, multiply every 'Vol' value by te_index[i] (with i in range(len(te_index))), then overwrite 'Vol' with the result. The desired output looks like this:

   Price          Vol   tradeDate
0     9            1   1994-05-09
1     3            4   1994-05-10
2     3            4   1994-05-10
3     3            4   1994-05-10
4     3            4   1994-05-10
5     4            9   1994-05-11
6     4            9   1994-05-11
7     4            9   1994-05-11
8     4            9   1994-05-11
9     5           16   1994-05-12

So I tried a for loop over the groupby object:

for name, group in te_c.groupby('tradeDate'):
  i = 0
  for j in range(group.shape[0]):
      group.ix[j, 'Vol'] = group.ix[j, 'Vol'] * te_index[i]
  i += 1

However, my code didn't work and raised:

KeyError: 0L

I also tried .apply(), but I couldn't work out how to multiply each group by its own te_index value. How should I write this?

EDIT: This is only one of the calculations I want to run on the dataframe. I also want to compute things like group.ix[j, 'Vol'] = group.ix[j, 'Vol'] * te_index[-(i+1)] / sum(te_index) or group.ix[j, 'Price'] = group.ix[j, 'Price'] * (te_index[i] * weights[i]).
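On the KeyError itself: a likely cause is that each group keeps the row labels of the original frame, so a label 0 lookup only succeeds in the first group. The sketch below rebuilds the sample data by hand (an assumption about how te_c is constructed) and uses .loc, since .ix was removed in pandas 1.0. Note also that the loop resets i to 0 on every group, and that assigning into group would not, as far as I can tell, write back to te_c.

```python
import pandas as pd

# Sample frame rebuilt from the printout in the question.
te_c = pd.DataFrame({
    'Price': [9, 3, 3, 3, 3, 4, 4, 4, 4, 5],
    'Vol': [1, 2, 2, 2, 2, 3, 3, 3, 3, 4],
    'tradeDate': ['1994-05-09'] + ['1994-05-10'] * 4
                 + ['1994-05-11'] * 4 + ['1994-05-12'],
})

# The second group still carries the labels it had in te_c.
group = te_c.groupby('tradeDate').get_group('1994-05-10')
print(list(group.index))

# Looking up label 0 in this group fails, which mirrors the error above.
try:
    group.loc[0, 'Vol']
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```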

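You can use the group numbers stored in the grouper attribute of a groupby object. group_info[0] holds, for every row, the zero-based number of the group that row belongs to, so adding 1 gives the multiplier for that row: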

te_c['new_vol'] = te_c.Vol.mul(te_c.groupby('tradeDate').grouper.group_info[0] + 1)

   Price  Vol   tradeDate  new_vol
0      9    1  1994-05-09        1
1      3    2  1994-05-10        4
2      3    2  1994-05-10        4
3      3    2  1994-05-10        4
4      3    2  1994-05-10        4
5      4    3  1994-05-11        9
6      4    3  1994-05-11        9
7      4    3  1994-05-11        9
8      4    3  1994-05-11        9
9      5    4  1994-05-12       16
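The grouper attribute is an internal detail and, as far as I know, is deprecated in recent pandas releases. A minimal sketch of the same idea using the public ngroup() method follows; it assumes the sample data is rebuilt by hand and that groups are numbered in sorted key order, which is the groupby default.

```python
import pandas as pd

# Sample frame rebuilt from the printout in the question.
te_c = pd.DataFrame({
    'Price': [9, 3, 3, 3, 3, 4, 4, 4, 4, 5],
    'Vol': [1, 2, 2, 2, 2, 3, 3, 3, 3, 4],
    'tradeDate': ['1994-05-09'] + ['1994-05-10'] * 4
                 + ['1994-05-11'] * 4 + ['1994-05-12'],
})

# ngroup() returns the zero-based group number of every row,
# aligned with the index of te_c; add 1 to get 1, 2, 3, ...
group_no = te_c.groupby('tradeDate').ngroup() + 1

te_c['new_vol'] = te_c['Vol'] * group_no
print(te_c)
```

To overwrite the original column instead of adding a new one, assign the product to te_c['Vol'].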

You can also store group_info[0] + 1 as a new column to make the other calculations easier:

te_c['group_info'] = te_c.groupby('tradeDate').grouper.group_info[0] + 1

   Price  Vol   tradeDate  group_info
0      9    1  1994-05-09           1
1      3    2  1994-05-10           2
2      3    2  1994-05-10           2
3      3    2  1994-05-10           2
4      3    2  1994-05-10           2
5      4    3  1994-05-11           3
6      4    3  1994-05-11           3
7      4    3  1994-05-11           3
8      4    3  1994-05-11           3
9      5    4  1994-05-12           4
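With a per-row group number available, the two formulas from the EDIT can be written as vectorised expressions by indexing a NumPy array with that number. This is only a sketch: it takes the group number from the public ngroup() method rather than from grouper, the weights values are made up for illustration, and the column names vol_rev and price_w are hypothetical.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the printout in the question.
te_c = pd.DataFrame({
    'Price': [9, 3, 3, 3, 3, 4, 4, 4, 4, 5],
    'Vol': [1, 2, 2, 2, 2, 3, 3, 3, 3, 4],
    'tradeDate': ['1994-05-09'] + ['1994-05-10'] * 4
                 + ['1994-05-11'] * 4 + ['1994-05-12'],
})

# Zero-based group number of every row, as a plain NumPy array.
i = te_c.groupby('tradeDate').ngroup().to_numpy()

te_index = np.arange(1, i.max() + 2)        # 1, 2, 3, 4
weights = np.array([0.1, 0.2, 0.3, 0.4])    # hypothetical, one per group

# Vol * te_index[-(i+1)] / sum(te_index), evaluated for all rows at once.
te_c['vol_rev'] = te_c['Vol'] * te_index[-(i + 1)] / te_index.sum()

# Price * (te_index[i] * weights[i]), evaluated for all rows at once.
te_c['price_w'] = te_c['Price'] * te_index[i] * weights[i]
print(te_c)
```

Indexing te_index and weights with the array i picks, for each row, the entry that belongs to that row's group, so no explicit loop over the groups is needed.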
