Pandas: Subtracting dates in columns, appending the difference

I was wondering if there is a way to select dates and subtract them from each other to get a difference. The difference I need is not in days, but in hours and minutes.

This difference will also vary from day to day, because for each day I want the last timestamp minus the first timestamp.

Below is the dataframe I am working with:

                     OfficeTemp  OutdoorTemp  SolarDiffuseRate  
2006-01-01 07:15:00   19.915275       0.8125             0.000   
2006-01-01 07:30:00   20.463506       0.8125             0.000   
2006-01-01 07:45:00   20.885112       0.8125             0.000   
2006-01-01 08:00:00   21.499246       0.8125             0.000   
2006-01-02 07:15:00   20.463326      11.5125             0.000   
2006-01-02 07:30:00   21.122635      11.5125             0.000   
2006-01-03 07:15:00   20.224612       6.9625             0.000   
2006-01-03 07:30:00   20.820027       6.9625             0.000   
2006-01-03 07:45:00   21.272505       6.9625             0.000   
2006-01-04 07:15:00   20.007434       3.0625             0.000   
2006-01-04 07:30:00   20.564662       3.0625             0.000   
2006-01-04 07:45:00   20.991727       3.0625             0.000   
2006-01-05 07:15:00   20.046861       8.0000             0.000   
2006-01-05 07:30:00   20.592663       8.0000             0.000   
2006-01-05 07:45:00   21.023338       8.0000             0.000   
2006-01-06 09:00:00   17.527457       3.8875            31.875   
2006-01-06 09:15:00   17.588175       4.7500            73.875   
2006-01-06 09:30:00   17.638827       4.7500            73.875   

The index is the datetime column, and as you can see, the number of samples per day differs even though most days start at the same time, so the time difference varies. Some days span 45 minutes, whilst others span more or less.

How would I calculate the difference per day, and append it to a Difference column?我将如何计算每天的差异,并将其附加到差异列中?

This works:

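# Assumes the timestamps are in a 'DateTime' column (e.g. after df.reset_index()).
# Group by calendar date rather than day-of-month, so days in different months stay separate.
df['diff'] = (df.groupby(df['DateTime'].dt.date)['DateTime']
                .transform(lambda x: (x.max() - x.min()).total_seconds() / 60))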


              DateTime  OfficeTemp  OutdoorTemp  SolarDiffuseRate  diff
0  2006-01-01 07:15:00   19.915275       0.8125             0.000  45.0
1  2006-01-01 07:30:00   20.463506       0.8125             0.000  45.0
2  2006-01-01 07:45:00   20.885112       0.8125             0.000  45.0
3  2006-01-01 08:00:00   21.499246       0.8125             0.000  45.0
4  2006-01-02 07:15:00   20.463326      11.5125             0.000  15.0
5  2006-01-02 07:30:00   21.122635      11.5125             0.000  15.0
6  2006-01-03 07:15:00   20.224612       6.9625             0.000  30.0
7  2006-01-03 07:30:00   20.820027       6.9625             0.000  30.0
8  2006-01-03 07:45:00   21.272505       6.9625             0.000  30.0
9  2006-01-04 07:15:00   20.007434       3.0625             0.000  30.0
10 2006-01-04 07:30:00   20.564662       3.0625             0.000  30.0
11 2006-01-04 07:45:00   20.991727       3.0625             0.000  30.0
12 2006-01-05 07:15:00   20.046861       8.0000             0.000  30.0
13 2006-01-05 07:30:00   20.592663       8.0000             0.000  30.0
14 2006-01-05 07:45:00   21.023338       8.0000             0.000  30.0
15 2006-01-06 09:00:00   17.527457       3.8875            31.875  30.0
16 2006-01-06 09:15:00   17.588175       4.7500            73.875  30.0
17 2006-01-06 09:30:00   17.638827       4.7500            73.875  30.0
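
Since the question keeps the timestamps in the index rather than in a DateTime column, here is a minimal sketch that works on the DatetimeIndex directly, assuming the index is already of datetime type and has no duplicate timestamps. The sample frame uses a few rows from the question; the names stamps, by_day, span, Difference and DifferenceMinutes are only illustrative.

```python
import pandas as pd

# Small sample in the question's shape: timestamps are the index, not a column.
idx = pd.to_datetime([
    "2006-01-01 07:15:00", "2006-01-01 07:30:00",
    "2006-01-01 07:45:00", "2006-01-01 08:00:00",
    "2006-01-02 07:15:00", "2006-01-02 07:30:00",
    "2006-01-06 09:00:00", "2006-01-06 09:15:00", "2006-01-06 09:30:00",
])
df = pd.DataFrame(
    {"OfficeTemp": [19.915275, 20.463506, 20.885112, 21.499246,
                    20.463326, 21.122635, 17.527457, 17.588175, 17.638827]},
    index=idx,
)

# Turn the index into a Series so groupby/transform can operate on the timestamps.
stamps = df.index.to_series()
# normalize() floors each timestamp to midnight, giving one key per calendar day.
by_day = stamps.groupby(df.index.normalize())

# Last minus first timestamp of each day, broadcast back to every row of that day.
span = by_day.transform("max") - by_day.transform("min")

df["Difference"] = span                                  # Timedelta, hours and minutes
df["DifferenceMinutes"] = span.dt.total_seconds() / 60   # plain float minutes
```

Keeping the Difference column as a Timedelta preserves the hours-and-minutes form the question asks for, while DifferenceMinutes matches the numeric diff column shown above.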
