Most efficient way in Pandas to assign tuples to segments

Question

I have written the following piece of code, which assigns tuples to segments. A segment is a container of tuples and spans a certain time interval, whereas a tuple has just one timestamp.

However, my data has about 30,000 tuples and this step is run quite often, so the program spends a lot of time in this method.

Is there a more efficient way to handle this?

for timestamp, row in tuples.iterrows():  # row is one tuple (a pd.Series)
    # The assert below expects at most one segment to accept a timestamp.
    this_seg = [s for s in segments if s.can_have(timestamp)]
    assert len(this_seg) <= 1
    for s in this_seg:
        s.append(row)
return segments

Here is some more context:

A segment is an instance of the class Segment, which has the following constructor:

def __init__(self, ts_max, ts_min):
    self._df = pd.DataFrame({})
    self._ts_max = ts_max
    self._ts_min = ts_min

The method can_have checks whether the given timestamp could be part of the segment, i.e. whether the timestamp lies between ts_min and ts_max.

tuples is a pandas DataFrame, which has timestamps as its index and some other features as columns.
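For concreteness, the setup described above might look roughly like the sketch below. Only the constructor comes from the question; the bodies of can_have and append, the inclusive bounds, and the sample data are assumptions made for illustration.

```python
import pandas as pd


class Segment:
    def __init__(self, ts_max, ts_min):
        self._df = pd.DataFrame({})
        self._ts_max = ts_max
        self._ts_min = ts_min

    def can_have(self, timestamp):
        # Assumption: both bounds are inclusive.
        return self._ts_min <= timestamp <= self._ts_max

    def append(self, row):
        # Assumption: each tuple (a pd.Series named by its timestamp)
        # becomes one more row of the segment's DataFrame.
        new_row = row.to_frame().T
        self._df = new_row if self._df.empty else pd.concat([self._df, new_row])


# Hypothetical sample data: timestamps as the index, one feature column.
tuples = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(
        ["2018-12-17 10:00", "2018-12-17 10:30", "2018-12-17 12:00"]
    ),
)
segments = [
    Segment(ts_max=pd.Timestamp("2018-12-17 10:59"),
            ts_min=pd.Timestamp("2018-12-17 10:00")),
    Segment(ts_max=pd.Timestamp("2018-12-17 11:59"),
            ts_min=pd.Timestamp("2018-12-17 11:00")),
]

# Same row-by-row assignment as in the question.
for timestamp, row in tuples.iterrows():
    for s in segments:
        if s.can_have(timestamp):
            s.append(row)
```

With an append like this one, every call copies the segment's DataFrame, so the cost grows with both the number of tuples and the number of segments checked per tuple.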

Answer 1

iterrows is about the slowest way to loop over a DataFrame in pandas, because it builds a new Series for every row. It's not clear from your question exactly what you're trying to do, but this tutorial offers several faster replacements for iterrows.

https://realpython.com/fast-flexible-pandas/
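As one possible vectorized replacement, here is a minimal sketch that assumes the segments do not overlap (which the assert in the question suggests), that the bounds are inclusive, and that the segment bounds are available as arrays. The function name assign_segment_ids and the sample data are hypothetical. It uses numpy.searchsorted to find the candidate segment for all timestamps at once, then groupby to split the DataFrame per segment.

```python
import numpy as np
import pandas as pd


def assign_segment_ids(timestamps, ts_min, ts_max):
    """Return, for each timestamp, the position of its segment, or -1 if none.

    Assumes non-overlapping segments with inclusive bounds.
    """
    ts = np.asarray(timestamps)
    ts_min = np.asarray(ts_min)
    ts_max = np.asarray(ts_max)
    order = np.argsort(ts_min)  # searchsorted needs sorted start times
    starts, ends = ts_min[order], ts_max[order]
    # Position of the last segment whose start is <= the timestamp.
    pos = np.searchsorted(starts, ts, side="right") - 1
    safe_pos = np.clip(pos, 0, None)  # avoid indexing with -1
    inside = (pos >= 0) & (ts <= ends[safe_pos])
    return np.where(inside, order[safe_pos], -1)


# Hypothetical sample data.
tuples = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0, 4.0]},
    index=pd.to_datetime([
        "2018-12-17 10:00", "2018-12-17 10:30",
        "2018-12-17 11:15", "2018-12-17 12:00",
    ]),
)
seg_min = pd.to_datetime(["2018-12-17 10:00", "2018-12-17 11:00"])
seg_max = pd.to_datetime(["2018-12-17 10:59", "2018-12-17 11:59"])

seg_ids = assign_segment_ids(tuples.index, seg_min, seg_max)

# One sub-frame per segment; rows that fit no segment (id -1) are dropped.
matched = seg_ids >= 0
per_segment = {
    seg: frame for seg, frame in tuples[matched].groupby(seg_ids[matched])
}
```

Each sub-frame in per_segment could then be handed to its segment in a single step instead of appending row by row.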

solution1
2 ACCEPTED 2018-12-17 14:05:02
