
How to calculate a row-by-row running sum that restarts every 60 minutes, in the same column, using Python

I have a dataset with a time column and a value column. I want to sum the values row by row within every 60 minutes.

date             x
8/6/2018 6:15    0
8/6/2018 6:20    2.89295
8/6/2018 6:25    2.89295
8/6/2018 6:30    2.89295
8/6/2018 6:35    2.89295
8/6/2018 6:40    2.89295
8/6/2018 6:45    2.89295
8/6/2018 6:50    2.89295
8/6/2018 6:55    2.89295
8/6/2018 7:00    2.89295
8/6/2018 7:05    2.89295
8/6/2018 7:10    2.89295
8/6/2018 7:15    2.89295
8/6/2018 7:20    2.89295
8/6/2018 7:25    2.89295
8/6/2018 7:30    2.89295
8/6/2018 7:35    2.89295
8/6/2018 7:40    2.89295
8/6/2018 7:45    3.155946
8/6/2018 7:50    3.155946
8/6/2018 7:55    3.155946
8/6/2018 8:00    3.155946
8/6/2018 8:05    3.155946
8/6/2018 8:10    3.155946
8/6/2018 8:15    3.155946

Expected output:

I want to add up the five-minute values row by row until 60 minutes have passed, and then start again for the next 60 minutes.

That means:

date             x          new_x
8/6/2018 6:15    0          0
8/6/2018 6:20    2.89295    2.89295
8/6/2018 6:25    2.89295    2.89295 + 2.89295 = 5.7859
8/6/2018 6:30    2.89295    2.89295 + 2.89295 + 2.89295 = 8.67885
8/6/2018 6:35    2.89295    2.89295 + 2.89295 + 2.89295 + 2.89295 = 11.5718
8/6/2018 6:40    2.89295
8/6/2018 6:45    2.89295    (and so on, until one hour has passed)
8/6/2018 6:50    2.89295
8/6/2018 6:55    2.89295
8/6/2018 7:00    2.89295
8/6/2018 7:05    2.89295
8/6/2018 7:10    2.89295
8/6/2018 7:15    2.89295    2.89295 + 2.89295 + 2.89295 + 2.89295 + ... = 34.7154
8/6/2018 7:20    2.89295    2.89295 (one hour has passed, so a new hour starts and the sum restarts from the value at that time)

I don't know how to compute a sum that keeps growing like this and then restarts. Can anyone help me solve this problem?

I tried to see whether this was possible using pandas Grouper and the cumulative sum function, but I could not find a way. It is possible with a hard boundary at the end of the clock hour, e.g. if you wanted to reset the sum at 7:00 rather than after 7:15, but not in the way you want. Maybe someone can suggest something along those lines. Meanwhile, here is a simple solution that uses more Python code.
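For comparison, the hard-boundary variant mentioned above could look roughly like the sketch below. It is not what the question asks for: the sum restarts at every clock hour (7:00, 8:00, ...) instead of 60 minutes after the first row of each window. The column names Date, Value and NewX and the four sample rows are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data, only to show where the sum restarts
dataset = pd.DataFrame({
    'Date': pd.to_datetime([
        '2018-08-06 06:50', '2018-08-06 06:55',
        '2018-08-06 07:00', '2018-08-06 07:05',
    ]),
    'Value': [1.0, 2.0, 3.0, 4.0],
})

# Round every timestamp down to the start of its clock hour
hour_start = dataset['Date'].dt.floor('60min')

# Cumulative sum inside each clock hour; it restarts at 7:00 here
dataset['NewX'] = dataset.groupby(hour_start)['Value'].cumsum()
print(dataset)
```

Because the groups are fixed clock hours, this cannot reproduce a window that starts at 6:15 and runs through 7:15.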

I have put some comments inline to help you understand this. It assumes you are holding your data in a DataFrame and that the Date column holds datetime values, not strings. Otherwise, you might need to convert the strings to dates before running the loop below.
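If the Date column is still stored as strings, a conversion along these lines might be enough. This is a sketch under assumptions: the format string treats 8/6/2018 as month/day/year, which the question does not confirm (swap %m and %d if the data is day/month), and the three sample rows are only illustrative.

```python
import pandas as pd

# Hypothetical rows in the same text format as the question's data
dataset = pd.DataFrame({
    'Date': ['8/6/2018 6:15', '8/6/2018 6:20', '8/6/2018 7:20'],
    'Value': [0, 2.89295, 2.89295],
})

# Assumption: month/day/year order; adjust the format if it is day/month
dataset['Date'] = pd.to_datetime(dataset['Date'], format='%m/%d/%Y %H:%M')

# Subtracting two converted values gives a Timedelta,
# which is what the total_seconds() comparison relies on
gap = dataset['Date'].iloc[2] - dataset['Date'].iloc[0]
print(gap.total_seconds())
```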

# Take the first date and keep it as the start of the current 60-minute window
lastDate = dataset['Date'].iloc[0]
# Initialize the running sum to 0
cumulativeSum = 0
for i in dataset.index:
    # Time difference between this row and the start of the current window
    dateDiff = dataset.at[i, 'Date'] - lastDate
    if dateDiff.total_seconds() > 3600:
        # More than 60 minutes have passed: reset the sum and
        # make this row's date the start of the next window
        cumulativeSum = 0
        lastDate = dataset.at[i, 'Date']
    # Add the current value to the running sum and store it in the new column
    cumulativeSum = cumulativeSum + dataset.at[i, 'Value']
    dataset.at[i, 'NewX'] = cumulativeSum
print(dataset)
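To check the loop against the expected output in the question, the same logic can be wrapped in a small function and run on the question's data. This is only a sketch: the function name cumsum_with_reset and its window_seconds parameter are made up for illustration, and the sample frame is rebuilt with pd.date_range on the assumption that the rows are exactly five minutes apart.

```python
import pandas as pd

def cumsum_with_reset(dates, values, window_seconds=3600):
    """Running sum that restarts once a row is more than
    window_seconds after the first row of the current window."""
    result = []
    window_start = None
    running = 0.0
    for date, value in zip(dates, values):
        if window_start is None:
            window_start = date
        elif (date - window_start).total_seconds() > window_seconds:
            # Window exceeded: this row starts the next window
            window_start = date
            running = 0.0
        running += value
        result.append(running)
    return result

# The question's data: 6:15 to 8:15 in five-minute steps (25 rows)
dataset = pd.DataFrame({
    'Date': pd.date_range('2018-08-06 06:15', periods=25, freq='5min'),
    'Value': [0.0] + [2.89295] * 17 + [3.155946] * 7,
})
dataset['NewX'] = cumsum_with_reset(dataset['Date'], dataset['Value'])
print(dataset)
```

With this data the row at 7:15 still belongs to the first window, because its distance from 6:15 is exactly 3600 seconds and the comparison is strictly greater than. The sum restarts at 7:20, as in the expected output.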


 