简体   繁体   中英

How to sum python pandas dataframe in certain time range

I have a dataframe like this

df

    order_date    amount
0   2015-10-02      1
1   2015-12-21      15
2   2015-12-24      3
3   2015-12-26      4
4   2015-12-27      5
5   2015-12-28      10

I would like to sum on df["amount"] based on range from df["order_date"] to df["order_date"] + 6 days

    order_date    amount   sum
0   2015-10-02      1       1 
1   2015-12-21      15      27  //comes from 15 + 3 + 4 + 5
2   2015-12-24      3       22  //comes from 3 + 4 + 5 + 10
3   2015-12-26      4       19
4   2015-12-27      5       15
5   2015-12-28      10      10

the data type of order_date is datetime have tried to use iloc but it did not work well... if anyone has any idea/example on who to work on this, please kindly let me know.

If pandas rolling allowed left-aligned window (default is right-aligned) then the answer would be a simple single liner: df.set_index('order_date').amount.rolling('7d',min_periods=1,align='left').sum() , however forward-looking has not been implemented yet (ie rolling does not accept an align parameter). So, the trick I came up with is to "reverse" the dates temporarily. Solution:

df.index = pd.to_datetime(pd.datetime.now() - df.order_date)
df['sum'] = df.sort_index().amount.rolling('7d',min_periods=1).sum()
df.reset_index(drop=True)

Output:

  order_date  amount   sum
0 2015-10-02       1   1.0
1 2015-12-21      15  27.0
2 2015-12-24       3  22.0
3 2015-12-26       4  19.0
4 2015-12-27       5  15.0
5 2015-12-28      10  10.0

Expanding on my comment:

from datetime import timedelta

df['sum'] = 0
for i in range(len(df)):
    dt1 = df['order_date'][i]
    dt2 = dt1 + timedelta(days=6)
    df['sum'][i] = sum(df['amount'][(df['order_date'] >= dt1) & (df['order_date'] <= dt2)])

There's probably a much better way to do this but it works...

There is my way for this problem. It works.. (I believe there should be a much better way to do this.)

    import pandas as pd

    df['order_date']=pd.to_datetime(pd.Series(df.order_date))
    Temp=pd.DataFrame(pd.date_range(start='2015-10-02', end='2017-01-01'),columns=['STDate'])
    Temp=Temp.merge(df,left_on='STDate',right_on='order_date',how='left')
    Temp['amount']=Temp['amount'].fillna(0)
    Temp.sort(['STDate'],ascending=False,inplace=True)
    Temp['rolls']=pd.rolling_sum(Temp['amount'],window =7,min_periods=0)
    Temp.loc[Temp.STDate.isin(df.order_date),:].sort(['STDate'],ascending=True)


    STDate  Unnamed: 0 order_date  amount  rolls


0  2015-10-02         0.0 2015-10-02     1.0    1.0
80 2015-12-21         1.0 2015-12-21    15.0   27.0
83 2015-12-24         2.0 2015-12-24     3.0   22.0
85 2015-12-26         3.0 2015-12-26     4.0   19.0
86 2015-12-27         4.0 2015-12-27     5.0   15.0
87 2015-12-28         5.0 2015-12-28    10.0   10.0

Set order_date to be DatetimeIndex , so that you can use df.ix[time1:time2] to get the time range rows, then filter amount column and sum them.

You can try with :

from datetime import timedelta
df = pd.read_fwf('test2.csv')
df.order_date = pd.to_datetime(df.order_date)
df =df.set_index(pd.DatetimeIndex(df['order_date']))
sum_list = list()
for i in range(len(df)):
    sum_list.append(df.ix[df.ix[i]['order_date']:(df.ix[i]['order_date'] + timedelta(days=6))]['amount'].sum())
df['sum'] = sum_list
df

Output:

            order_date  amount  sum
2015-10-02  2015-10-02  1       1
2015-12-21  2015-12-21  15      27
2015-12-24  2015-12-24  3       22
2015-12-26  2015-12-26  4       19
2015-12-27  2015-12-27  5       15
2015-12-28  2015-12-28  10      10
import datetime

df['order_date'] = pd.to_datetime(df['order_date'], format='%Y-%m-%d')
df.set_index(['order_date'], inplace=True)

# Sum rows within the range of six days in the future
d = {t: df[(df.index >= t) & (df.index <= t + datetime.timedelta(days=6))]['amount'].sum()
        for t in df.index}

# Assign the summed values back to the dataframe
df['amount_sum'] = [d[t] for t in df.index]

df is now:

            amount  amount_sum
order_date                    
2015-10-02     1.0         1.0
2015-12-21    15.0        27.0
2015-12-24     3.0        22.0
2015-12-26     4.0        19.0
2015-12-27     5.0        15.0
2015-12-28    10.0        10.0

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM