Group by one column and then average each of the rest of the columns. Pandas dataframe

Question

There are two things I'm trying to do so that I can take the mean of each of 144 columns for each group in the dataframe.

I have 144 columns for different pressure readings, and then a column for 'cycle'. There are around 70 cycles. I want to group the dataframe by 'cycle' and then calculate the mean of each column for each cycle.

I have successfully grouped the data using:

cycles = df.groupby('cycle')

I am having trouble with the logic for averaging each of the remaining columns. The following averages all the columns together, which isn't what I want:

for cycle, group in cycles:
    cycles.mean()

I'd appreciate any help to do this or a simpler method if there is one.

Answer 1

You just have to specify the axis along which you want to calculate the mean, like so:

group_means = {}
for cycle, group in cycles:
    # keep one Series of per-column means for each cycle
    group_means[cycle] = group.mean(axis=0)

axis=0 takes the mean down the rows, giving one value per column; axis=1 takes the mean across the columns, giving one value per row.
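
To make the two axis values concrete, here is a minimal sketch on a made-up three-row frame. The column names p1 and p2 are hypothetical stand-ins for the pressure columns, not names from the question.

```python
import pandas as pd

# Hypothetical toy data: two stand-in pressure columns, three rows.
df = pd.DataFrame({"p1": [1.0, 2.0, 3.0], "p2": [10.0, 20.0, 30.0]})

# axis=0 collapses the rows: one mean per column.
col_means = df.mean(axis=0)

# axis=1 collapses the columns: one mean per row.
row_means = df.mean(axis=1)

print(col_means)
print(row_means)
```

col_means is indexed by column name and row_means by the original row index.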

Answer 2

The for loop casts every column except "cycle" to float; I assume "cycle" is an "object" (string) column. You then create a groupby object called "cycles" keyed on "cycle" and apply an aggregate function to it, "mean" in your case.

for column in df.loc[:, df.columns != 'cycle']:
    df[column] = df[column].astype(float)


cycles = df.groupby("cycle")
# a groupby mean is already computed per column, so it takes no axis argument
cycles.mean()

or directly

df.groupby("cycle").mean()
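
As a rough end-to-end illustration of the one-liner, here is a sketch on toy data. Only the 'cycle' column name comes from the question; p1, p2 and the values are made up.

```python
import pandas as pd

# Hypothetical toy data: 'cycle' plus two stand-in pressure columns.
df = pd.DataFrame({
    "cycle": ["a", "a", "b", "b"],
    "p1": [1.0, 3.0, 5.0, 7.0],
    "p2": [10.0, 30.0, 50.0, 70.0],
})

# One row per cycle, one column per pressure reading.
cycle_means = df.groupby("cycle").mean()

print(cycle_means)
```

The result is indexed by 'cycle'; calling .reset_index() on it turns 'cycle' back into an ordinary column if that is more convenient.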

2 answers

solution1
1 ACCEPTED 2019-11-13 15:19:43

solution2
0 2019-11-13 15:46:57
