
Group by one column and then average each of the rest of the columns. Pandas dataframe

There are two things I'm trying to do so that I can take the mean of each of 144 columns for each group in the dataframe.

I have 144 columns for different pressure readings, and then a column for 'cycle'. There are around 70 cycles. I want to group the dataframe by 'cycle' and then calculate the mean of each column for each cycle.

I have successfully grouped the data using:

cycles = df.groupby('cycle')

What I am having trouble with is the logic for averaging each of the remaining columns. The following averages all the columns together, which isn't what I want:

for cycle, group in cycles:
    cycles.mean()

I'd appreciate any help to do this or a simpler method if there is one.

You just have to specify the axis along which you want to calculate the mean, like so:

group_means = {}
for cycle, group in cycles:
    # Keep one result per cycle instead of overwriting a single variable
    group_means[cycle] = group.mean(axis=0)

axis=0 averages down the rows, giving one mean per column; axis=1 averages across the columns, giving one mean per row.
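To make the axis distinction concrete, here is a minimal sketch on a made-up two-column frame. The names df_demo, p1 and p2 are hypothetical and only stand in for the real pressure columns.

```python
import pandas as pd

# Hypothetical toy data: two "pressure" columns, three rows.
df_demo = pd.DataFrame({"p1": [1.0, 2.0, 3.0], "p2": [10.0, 20.0, 30.0]})

# axis=0: collapse the rows, one mean per column.
col_means = df_demo.mean(axis=0)

# axis=1: collapse the columns, one mean per row.
row_means = df_demo.mean(axis=1)

print(col_means)
print(row_means)
```

With 144 pressure columns, the axis=0 result would have 144 entries, one per column.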

The for loop below casts every column except "cycle" to float (I assume "cycle" is an "object" (string) column). Then you create a groupby object called "cycles" keyed on "cycle" and apply an aggregate function to it, "mean" in your case.

for column in df.loc[:, df.columns != 'cycle']:
    df[column] = df[column].astype(float)


cycles = df.groupby("cycle")
# mean() on a groupby object already averages each column within each group
cycles.mean()

or directly

df.groupby("cycle").mean()
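As a rough end-to-end illustration of the one-liner, the sketch below builds a small made-up frame (two pressure columns, p1 and p2, instead of 144, with invented values) and takes the per-cycle mean of every column. The as_index=False variant is just one optional way to keep "cycle" as a regular column.

```python
import pandas as pd

# Hypothetical sample: two cycles, two pressure columns.
df = pd.DataFrame({
    "cycle": [1, 1, 2, 2],
    "p1": [1.0, 3.0, 5.0, 7.0],
    "p2": [10.0, 30.0, 50.0, 70.0],
})

# One row per cycle, one column per pressure reading; "cycle" becomes the index.
per_cycle = df.groupby("cycle").mean()

# Same means, but "cycle" stays an ordinary column.
per_cycle_flat = df.groupby("cycle", as_index=False).mean()

print(per_cycle)
print(per_cycle_flat)
```

No explicit loop is needed: the groupby object applies mean() to each column within each group.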
