
How to get the average of each row, excluding values less than or greater than a specific value, and add it as a new last column (Python, pandas)

The following is my input data frame, after computing the row average:
   a  b  c  d  avg
0  1  4  7  8  5
1  3  4  5  6  4.5
2  6  8  2  9  6.25
3  2  9  5  6  5.5


Required output after applying the criterion:

   a  b  c  d  avg   avg_criteria
0  1  4  7  8  5     7.5 (<=5)
1  3  4  5  6  4.5   5.5 (<=4.5)
2  6  8  2  9  6.25  8.5 (<=6.25)
3  2  9  5  6  5.5   7.5 (<=5.5)

This is the code I have tried:

import pandas as pd

# read file
df_input_data = pd.DataFrame(pd.read_excel(file_path, header=2).dropna(axis=1, how='all'))

# add a column after calculating the average
df_avg = df_input_data.assign(Avg=df_input_data.mean(axis=1, skipna=True))

# criteria
criteria = df_input_data.iloc[:, :] >= df_avg.iloc[1][-1]

# creating output data frame
df_output = df_input_data.assign(Avg_criteria=criteria)


I am unable to solve this issue. I have tried several approaches and searched for a solution many times.

From what I understand, you can use df.where (or df.mask) to keep only the values that are greater than or equal to the row's avg, and then take the row-wise mean of what remains:

m = df.drop(columns="avg")
m.where(m.ge(df['avg'], axis=0)).mean(axis=1)

0    7.5
1    5.5
2    8.5
3    7.5
dtype: float64
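To see why this works, it can help to look at the intermediate frame that where produces before the mean is taken. The following is a minimal sketch that rebuilds the sample data from the question by hand (the name `kept` is only for illustration); values below the row's avg become NaN, and mean skips NaN by default:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "a": [1, 3, 6, 2],
    "b": [4, 4, 8, 9],
    "c": [7, 5, 2, 5],
    "d": [8, 6, 9, 6],
})
df["avg"] = df.mean(axis=1)

m = df.drop(columns="avg")

# Compare every value with the avg of its own row (axis=0 aligns on the index);
# where() keeps values that pass the test and replaces the rest with NaN
kept = m.where(m.ge(df["avg"], axis=0))
print(kept)

# mean() ignores NaN by default, so only the kept values are averaged
print(kept.mean(axis=1))
```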

print(df.assign(Avg_criteria=m.where(m.ge(df['avg'], axis=0)).mean(axis=1)))

   a  b  c  d   avg  Avg_criteria
0  1  4  7  8  5.00           7.5
1  3  4  5  6  4.50           5.5
2  6  8  2  9  6.25           8.5
3  2  9  5  6  5.50           7.5
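Since the question title asks about both "less than" and "greater than", the same pattern can be flipped by swapping the comparison. This is only a sketch under assumptions: the column names `avg_ge` and `avg_lt` and the list `value_cols` are made up for illustration, and the data is typed in by hand rather than read from the Excel file:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "a": [1, 3, 6, 2],
    "b": [4, 4, 8, 9],
    "c": [7, 5, 2, 5],
    "d": [8, 6, 9, 6],
})

value_cols = ["a", "b", "c", "d"]  # hypothetical: the columns to average
vals = df[value_cols]
df["avg"] = vals.mean(axis=1)

# Mean of the values that are >= the row's avg (smaller values excluded)
df["avg_ge"] = vals.where(vals.ge(df["avg"], axis=0)).mean(axis=1)

# Mean of the values that are < the row's avg (larger values excluded)
df["avg_lt"] = vals.where(vals.lt(df["avg"], axis=0)).mean(axis=1)

print(df)
```

Selecting the value columns explicitly avoids accidentally including avg (or any other derived column) in the comparison.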
