numpy(.ma) array: Number of values since last value change?

Question

I have a status signal (measured from a heat pump) in a numpy.ma array, together with timestamps. What I want is the lengths of the periods during which it was on and the lengths of the periods during which it was off. (NOT the daily running time or something similar, that would be easy.)

What I have (as an example; in fact I have minute values over 16 months):

Time    Value
12:00   0
12:01   1
12:02   1
...
12:29   1
12:30   1
12:31   0
...
12:41   0
12:42   1
...
13:01   1
13:02   0
...and so on

And what I want as output:

running_time_was:
time   value (=minutes)
12:31  30
13:02  20

off_time_was:
time   value (=minutes)
12:42  11

(The exact timestamps don't matter much, but there should be some.)

I already asked the people I know who know Python (they had no idea either) and tried searching the internet, but I just don't know what to search for. So maybe someone could at least give me a hint about which words to type into Google? :)

PS: Wow, Stack Overflow is fantastic! I was already amazed by its usability as a passive user, but the asking interface is even better :)

Answer 1

Well, I finally worked it out myself. But a million thanks to Joe Kington for his answer, which inspired me and taught me the whole approach :)

Script

import numpy as np

def main():
    # Generate some data...
    t = np.linspace(0, 10*np.pi, 30)
    x = np.sin(t)
    condition = np.where(x>0,1,0)

    onarray,offarray = on_off_times(condition)

    print("Condition: ", condition)
    print("Ontimes:   ", onarray)
    print("Offtimes:  ", offarray)


def on_off_times(condition):

    changing = np.diff(condition)       # -1 when turning off, +1 when turning on, 0 otherwise
    idx, = changing.nonzero()           # indices of the change points
    times = np.diff(idx)                # number of samples between consecutive changes
    times = np.r_[0, times]             # the first duration can't be calculated, so it is set to 0

    ontimes = np.where(changing[idx] < 0, times, 0)    # when turning off: the preceding on-time (0 elsewhere)
    offtimes = np.where(changing[idx] > 0, times, 0)   # when turning on: the preceding off-time (0 elsewhere)

    onarray = np.r_[changing.copy(), 0]     # copy of "changing" with a 0 appended, so it has the size of "condition"
    offarray = np.r_[changing.copy(), 0]

    np.put(onarray, idx, ontimes)       # overwrite the change points with the durations
    np.put(offarray, idx, offtimes)

    return onarray,offarray

main()

This yields:

Condition:  [0 1 1 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0]
Ontimes:    [0 0 2 0 0 0 0 0 3 0 0 0 0 0 3 0 0 0 0 0 3 0 0 0 0 0 3 0 0 0]
Offtimes:   [0 0 0 0 0 3 0 0 0 0 0 3 0 0 0 0 0 3 0 0 0 0 0 3 0 0 0 0 0 0]

Things to mention:

Yes, the code gives a different output than I requested in my question. Sorry for being imprecise, but the output the function gives is what I actually need.
Now I just have to mask the zeros in the outputs, which will be easy.
If anybody has a nicer way to put() the data into a new array, please edit :)
How do I link to the user Joe? Or how can I thank him?
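
For the masking and put() points, here is a minimal sketch, not the original code: it fills zero-initialised arrays by index assignment instead of put(), then hides the zeros with np.ma.masked_equal. The short condition array is simply the first ten values of the Condition output above, and the prepend argument of np.diff assumes NumPy 1.16 or later.

```python
import numpy as np

def on_off_times(condition):
    """Sketch: same result as the function above, built from zero arrays."""
    condition = np.asarray(condition, dtype=int)
    changing = np.diff(condition)            # +1 when turning on, -1 when turning off
    idx, = changing.nonzero()                # indices of the change points
    # Samples between consecutive changes; prepending idx[:1] makes the first
    # (unknown) duration 0 and keeps "times" the same length as "idx"
    times = np.diff(idx, prepend=idx[:1])

    onarray = np.zeros(condition.size, dtype=int)
    offarray = np.zeros(condition.size, dtype=int)
    turned_off = changing[idx] < 0
    onarray[idx[turned_off]] = times[turned_off]      # an on-period just ended
    offarray[idx[~turned_off]] = times[~turned_off]   # an off-period just ended
    return onarray, offarray

condition = np.array([0, 1, 1, 0, 0, 0, 1, 1, 1, 0])
onarray, offarray = on_off_times(condition)

# Mask the zeros so that only the change points carry a value
on_masked = np.ma.masked_equal(onarray, 0)
off_masked = np.ma.masked_equal(offarray, 0)
print(on_masked.compressed())    # [2 3]
print(off_masked.compressed())   # [3]
```

compressed() returns only the unmasked durations; the masked arrays themselves keep the original length, so they still line up with the timestamps.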

Answer 2

Basically, you have a boolean array and you want to find the start and stop of contiguous regions.

It's far better to avoid looping over each item of a numpy array.

There are a few different ways of doing it, but I usually do something similar to this (which I probably originally got from here):

import numpy as np
def contiguous_regions(condition):
    """Finds contiguous True regions of the boolean array "condition". Returns
    a 2D array where the first column is the start index of the region and the
    second column is the end index."""

    # Find the indices of changes in "condition"
    idx, = np.diff(condition).nonzero()

    # Prepend or append the start or end indices to "idx"
    # if there's a block of "True"'s at the start or end...
    if condition[0]:
        idx = np.r_[0, idx]
    if condition[-1]:
        idx = np.r_[idx, len(condition)-1]

    return idx.reshape((-1,2))

As a quick example:

import numpy as np

def main():
    # Generate some data...
    t = np.linspace(0, 6*np.pi, 100)
    x = np.sin(t)
    condition = x > 0

    regions = contiguous_regions(condition)
    lengths = regions[:,1] - regions[:,0]

    for reg, length in zip(regions, lengths):
        print('Condition was True for {0} seconds'.format(length))
        print('    From time {0}s to {1}s'.format(*reg))

def contiguous_regions(condition):
    idx, = np.diff(condition).nonzero()

    if condition[0]:
        idx = np.r_[0, idx]
    if condition[-1]:
        idx = np.r_[idx, len(condition)-1]

    return idx.reshape((-1,2))

main()

This yields:

Condition was True for 16 seconds
    From time 0s to 16s
Condition was True for 16 seconds
    From time 33s to 49s
Condition was True for 16 seconds
    From time 66s to 82s
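
To get from regions to the on and off durations asked for in the question, one option is to run the same kind of detection on the condition and on its inverse. The sketch below is a variant, not the function above: it pads the array with False on both sides and returns half-open (start, stop) pairs, so stop - start is the number of samples in the run. The function name run_bounds, the status values and the date are all made up for illustration.

```python
import numpy as np

def run_bounds(condition):
    """Return (start, stop) index pairs of contiguous True runs, stop exclusive."""
    condition = np.asarray(condition, dtype=bool)
    # Pad with False on both sides so runs touching either end are closed
    padded = np.r_[False, condition, False]
    # Every change in the padded array is either a run start or a run stop
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return edges.reshape(-1, 2)

# Made-up status signal, one sample per minute starting at 12:00
status = np.array([0, 1, 1, 1, 0, 0, 1, 1, 0], dtype=bool)
times = np.datetime64('2011-11-10T12:00') + np.arange(status.size) * np.timedelta64(1, 'm')

on_runs = run_bounds(status)
off_runs = run_bounds(~status)
on_minutes = on_runs[:, 1] - on_runs[:, 0]      # [3 2]
off_minutes = off_runs[:, 1] - off_runs[:, 0]   # [1 2 1]

for (start, stop), minutes in zip(on_runs, on_minutes):
    print('on  from {0} for {1} min'.format(times[start], minutes))
for (start, stop), minutes in zip(off_runs, off_minutes):
    print('off from {0} for {1} min'.format(times[start], minutes))
```

Runs that touch the start or end of the data are reported too, but their lengths are only lower bounds, because the signal may have been in that state before or after the recording.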

Answer 3

I would have done this the following way:

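import numpy as np

# read data from file ("datafn" is a placeholder name; one "HH:MM state" pair per line is assumed)
datafn = 'data.txt'
# 'U5' reads the timestamps as text, so they print without a b'...' prefix on Python 3
dt = {'names' : ('ts', 'state'), 'formats' : ('U5','i4')}
data = np.loadtxt(datafn, dtype = dt)
# minutes counter
mc = 1
# current state, current timestamp
cs = data[0]['state']
ct = data[0]['ts']
# states dictionary
states = {1 : [], 0 : []}

for minute in data[1:]:
    if cs != minute['state']:
        # the run that just ended started at ct and lasted mc minutes
        states[cs].append([ct, mc])
        mc = 1  # the current minute already belongs to the new run
        cs = minute['state']
        ct = minute['ts']
    else:
        mc += 1
# Printing the result
print('On time')
for [ts, mc] in states[1]:
     print('%s\t%i' % (ts, mc))
print('Off time')
for [ts, mc] in states[0]:
     print('%s\t%i' % (ts, mc))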

Extremely untested but you can get the logic.
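
The same run-length logic can also be sketched with itertools.groupby from the standard library, which groups consecutive samples that share a state. This is an alternative formulation, not the loop above, and the sample data is made up.

```python
from itertools import groupby

# Made-up sample: (timestamp, state) pairs, one per minute
samples = [("12:00", 0), ("12:01", 1), ("12:02", 1), ("12:03", 1),
           ("12:04", 0), ("12:05", 0), ("12:06", 1)]

states = {1: [], 0: []}
for state, run in groupby(samples, key=lambda s: s[1]):
    run = list(run)
    # Store the timestamp at which the run started and its length in minutes
    states[state].append((run[0][0], len(run)))

print(states[1])  # [('12:01', 3), ('12:06', 1)]
print(states[0])  # [('12:00', 1), ('12:04', 2)]
```

Unlike the loop, this also records the run that is still in progress when the data ends.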

Answer 4

(Possible) Answer

You could try this:

Since = 0
for i in range(1, Data.shape[0]):
    #Switched off
    if Data[i, 1] == 0.0 and Data[i - 1, 1] == 1.0:
        print("{0} for {1}min".format(Data[i, 0], i - Since))
    #Switched on
    elif Data[i, 1] == 1.0 and Data[i - 1, 1] == 0.0:
        Since = i

You loop through the whole array (Data), which has the timestamps in its first column and, in its second column, a 1.0 or 0.0 depending on whether the heater was on or off.

You detect a change of state by comparing the current on/off value with the previous one. From those two values you can tell whether the heater was switched off or switched on. All you need to do then is save the current index in Since whenever the heater is switched on; when it is switched off again, i - Since is the number of minutes it was running.

Script

With the following script you can set up a data array and run the code above and see how it works:

import datetime
import numpy as np

#Setting up OnOff array
OnOff = np.concatenate((np.zeros((7,)), np.ones((20,)), np.zeros((3,)), np.ones((5,)), np.zeros((4,)), np.ones((16,)), np.zeros((2,)), np.ones((2,)), np.zeros((1,))))

#Setting up time array
start = datetime.time(12, 00)
TimeStamps = []

for i in range(OnOff.size):
    TimeStamps.append(datetime.time(12 + i // 60, i % 60))  # integer division, so this also runs on Python 3

TimeStamps = np.array(TimeStamps)

#Concatenating both arrays to a single array
Data = np.hstack((np.reshape(TimeStamps, (TimeStamps.size, 1)), np.reshape(OnOff, (OnOff.size, 1))))

Since = 0
for i in range(1, Data.shape[0]):
    #Switched off
    if Data[i, 1] == 0.0 and Data[i - 1, 1] == 1.0:
        print("{0} for {1}min".format(Data[i, 0], i - Since))
    #Switched on
    elif Data[i, 1] == 1.0 and Data[i - 1, 1] == 0.0:
        Since = i

The output is

12:27:00 for 20min
12:35:00 for 5min
12:55:00 for 16min
12:59:00 for 2min
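
The loop above only reports the on-periods. A possible extension that collects the off-periods as well is sketched below; the function name switch_durations is made up, and it works on a plain sequence of 0/1 values rather than on the Data array.

```python
def switch_durations(values):
    """Return (on_runs, off_runs) as lists of (end_index, length) pairs."""
    on_runs, off_runs = [], []
    since = 0  # index at which the current state began
    for i in range(1, len(values)):
        if values[i] != values[i - 1]:
            # The state that just ended lasted from index "since" to i - 1
            target = on_runs if values[i - 1] == 1 else off_runs
            target.append((i, i - since))
            since = i
    return on_runs, off_runs

# Same on/off pattern as the OnOff array in the script above
on_off = [0] * 7 + [1] * 20 + [0] * 3 + [1] * 5 + [0] * 4 + [1] * 16 + [0] * 2 + [1] * 2 + [0]
on_runs, off_runs = switch_durations(on_off)
print(on_runs)   # [(27, 20), (35, 5), (55, 16), (59, 2)]
print(off_runs)  # [(7, 7), (30, 3), (39, 4), (57, 2)]
```

The first off-period starts with the data, so its length of 7 is only a lower bound, and the final run is not reported because it never ends within the data.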

4 answers

solution1 3 ACCEPTED 2011-11-11 16:42:18
solution2 1 2011-11-10 16:21:09
solution3 0 2011-11-10 09:35:07
solution4 0 2011-11-10 10:04:48
