sum rows of column based on a repeating range of values in another with pandas groupby

Question

I have a DataFrame with several thousand rows that looks something like:

Index Chan Pick
1      1   0.001
2      2   0.001
3      3   0.001
4      4   0.001
5      1   0.003
6      2   0.003
7      3   0.003
8      1   0.006
9      2   0.006
10     1   0.002
11     2   0.002
12     3   0.002
13     4   0.002
14     5   0.002
15     6   0.002

The Chan (channel) column holds values from 1 to 24. A block may contain all 24 values or only a few (2, 6, and so on, as shown above). The values in the Pick column are usually the same within each block of channel values.

I need the average value in the Pick column for each channel block. For example, the first block averages to 0.001 and the second to 0.003. Here the Pick values within a block are all the same, but that will not always be the case.

I know I need to use something similar to:

df.groupby('Chan')['Pick'].mean()

but I don't know how to account for the fact that Chan counts up from 1 to at most 24 and then the pattern starts over (i.e. a block can run from 1 to 4, or 1 to 22, or 1 to 17, etc.).

Answer 1

A channel block starts whenever the Chan value is exactly 1. We can exploit this property to accomplish the task.

Let channel_id be a variable that labels each block with a unique, increasing identifier. We can define it as follows:

channel_id = (df["Chan"] == 1).cumsum()

where (df["Chan"] == 1) creates a boolean mask that is True where each block starts, and cumsum turns that mask into a running count, so the identifier stays constant within a block and increases by one each time a new block starts.

Now we just have to group by this identifier and take the mean of the Pick column:
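To make the mask-and-cumsum step concrete, here is a small sketch that rebuilds the sample data from the question and prints the resulting identifiers. The variable name starts is only illustrative and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Chan": [1, 2, 3, 4, 1, 2, 3, 1, 2, 1, 2, 3, 4, 5, 6],
    "Pick": [0.001] * 4 + [0.003] * 3 + [0.006] * 2 + [0.002] * 6,
})

starts = df["Chan"] == 1      # True on the first row of each block
channel_id = starts.cumsum()  # running count of block starts

print(channel_id.tolist())
# [1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4]
```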

df.groupby(channel_id)["Pick"].mean()
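Putting the two steps together on the sample data from the question might look like the sketch below. Renaming the identifier to "block" is an optional addition, assumed here only so that the result's index is not labelled "Chan".

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Chan": [1, 2, 3, 4, 1, 2, 3, 1, 2, 1, 2, 3, 4, 5, 6],
    "Pick": [0.001] * 4 + [0.003] * 3 + [0.006] * 2 + [0.002] * 6,
})

# One identifier per block; "block" is a hypothetical label for the index.
channel_id = (df["Chan"] == 1).cumsum().rename("block")

# Mean of Pick within each block (one row per block).
block_means = df.groupby(channel_id)["Pick"].mean()
print(block_means)
```

For this sample the four block means come out at roughly 0.001, 0.003, 0.006 and 0.002, up to floating-point rounding.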

You can also do everything in one line, without the intermediate variable.
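A possible one-line form is sketched below, again on the sample data from the question. The second variant is not from the original answer: it is an assumption-laden alternative that starts a new block whenever Chan fails to increase, which may help if some blocks do not begin at 1.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Chan": [1, 2, 3, 4, 1, 2, 3, 1, 2, 1, 2, 3, 4, 5, 6],
    "Pick": [0.001] * 4 + [0.003] * 3 + [0.006] * 2 + [0.002] * 6,
})

# One-liner: group directly on the cumulative count of block starts.
means = df.groupby((df["Chan"] == 1).cumsum())["Pick"].mean()

# Hypothetical variant (not in the original answer): a block starts
# wherever Chan does not increase relative to the previous row.
# The first diff is NaN, which compares as False, so ids start at 0.
means_by_reset = df.groupby(df["Chan"].diff().le(0).cumsum())["Pick"].mean()

print(means.tolist())
print(means_by_reset.tolist())
```

On this sample both groupings produce the same four blocks, so the two lists of means match.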

solution1
0 ACCEPTED 2020-07-28 21:58:39
