Pandas: how to group data by day and apply multiple functions to the grouped data?

Question

In my code I have a pandas DataFrame with a column for the day and a column called value. I would like to group the DataFrame by day, find the minimum and maximum value for each day, average the min and max, and then subtract that average from the value column in the DataFrame.

The closest I have been able to get is:

temp_max = var.groupby(['day']).max()
temp_min = var.groupby(['day']).min()

answer = var.groupby(['day'])['value'].apply(lambda x : x - (temp_max['value'] - temp_min['value']) / 2 )

Input:

    Unnamed: 0  hrs                   vt                   rt      value
0       119899    1  2017-03-01 07:00:00  2017-03-01 06:00:00  67.910011
1       119900    2  2017-03-01 08:00:00  2017-03-01 06:00:00  52.970033
2       119901    3  2017-03-01 09:00:00  2017-03-01 06:00:00  49.010011
3       119902    4  2017-03-01 10:00:00  2017-03-01 06:00:00  47.030000
4       119903    5  2017-03-01 11:00:00  2017-03-01 06:00:00  45.949989
5       119904    6  2017-03-01 12:00:00  2017-03-01 06:00:00  45.949989

Output:

1    0           NaN
 1     41.540022
 2     31.549989
 3     29.570005
 4     36.949989
 5     38.030000
 6     40.010011
 7     33.980000
 8     47.030000
 9           NaN
 10          NaN
 11          NaN
 12          NaN
 13          NaN
 14          NaN
 15          NaN
 16          NaN
2    1           NaN
     2           NaN
     3           NaN
     4           NaN
     5           NaN
     6           NaN
     7           NaN
     8           NaN
     17          NaN
     18          NaN
     19          NaN
     20          NaN
     21          NaN
             ...    
6    4           NaN
     5           NaN
     6           NaN
     7           NaN
     8           NaN
     53          NaN
     54          NaN
     55          NaN
     56          NaN
7    1           NaN
     2           NaN
     3           NaN
     4           NaN
     5           NaN
     6           NaN
     7           NaN
     8           NaN
     57          NaN
     58          NaN
     59          NaN
     60          NaN
8    1           NaN
     2           NaN
     3           NaN
     4           NaN
     5           NaN
     6           NaN
     7           NaN
     8           NaN
     61          NaN

The values appear to be correct, but I was hoping to keep my original DataFrame and just update the values in place. Is there a different way I should be approaching this? Thanks in advance!
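
A note on the NaN values in the output above: a likely cause is index alignment. `temp_max['value']` and `temp_min['value']` are indexed by day, while each group `x` is indexed by the original row labels, so the subtraction only produces numbers where a row label happens to equal a day label. The snippet also subtracts half the range, `(max - min) / 2`, rather than the average `(max + min) / 2` described in the question. The sketch below reproduces the alignment effect on made-up data; the frame contents and the names `half_range` and `misaligned` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy data: two days, rows labelled 0..3
var = pd.DataFrame({"day": [1, 1, 2, 2], "value": [10.0, 20.0, 30.0, 50.0]})

day_max = var.groupby("day")["value"].max()  # indexed by day: 1, 2
day_min = var.groupby("day")["value"].min()  # indexed by day: 1, 2
half_range = (day_max - day_min) / 2

# The rows belonging to day 1 carry row labels 0 and 1
group = var.loc[var["day"] == 1, "value"]

# Subtraction aligns on labels: row labels {0, 1} against day labels {1, 2}
misaligned = group - half_range
print(misaligned)
```

Only row label 1 coincides with a day label here, so it is the only entry that is not NaN. The match is made by label, not by the day the row belongs to, which is why the numbers in such a result should be treated with caution.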

Answer 1

How about something like this?

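import pandas as pd

frames = []
for day, frame in var.groupby('day'):
    frame = frame.copy()  # work on a copy to avoid SettingWithCopyWarning
    # subtract the average of the day's min and max from every value
    frame['value'] = frame['value'] - (frame['value'].max() + frame['value'].min()) / 2
    frames.append(frame)

# DataFrame.append was removed in pandas 2.0; concatenate the pieces instead
new_frame = pd.concat(frames)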

You could do it in one line using a list comprehension and groupby, but it looks a bit ugly:

var.loc[:, 'value'] = pd.concat([frm.value.apply(lambda x: x - (frm.value.min() + frm.value.max()) / 2) for d, frm in var.groupby('day')])

I believe that would accomplish what you're trying to do, although it is not particularly readable!
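
Another option, sketched here under the assumption that the goal is to leave the frame's rows and order untouched, is `groupby(...).transform`, which returns a result aligned to the original index. The toy data and the names `stats` and `midpoint` are hypothetical; the `agg` call is included only to show several functions applied per day in one step.

```python
import pandas as pd

# Hypothetical toy data standing in for the asker's frame
var = pd.DataFrame({"day": [1, 1, 1, 2, 2],
                    "value": [10.0, 14.0, 20.0, 30.0, 50.0]})

grouped = var.groupby("day")["value"]

# Several aggregations per day in one call (one row per day)
stats = grouped.agg(["min", "max"])

# transform broadcasts each day's min/max back onto the original rows
midpoint = (grouped.transform("min") + grouped.transform("max")) / 2

# Update the original frame; no rows are reordered or dropped
var["value"] = var["value"] - midpoint
print(var)
```

Because `midpoint` shares the index of `var`, the subtraction lines up row by row and no concatenation or explicit loop is needed.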

Answer 1
1 accepted, 2017-03-30 15:18:16
