
How to sort, calculate, and find the maximum in Python

Recently I have wanted to do some sorting and calculation and to find the maximum in a data frame. For example:

import pandas as pd

data = {'Name':['Penny','Ben','Benny','Mark'], 
        'Eng':[5,1,4,3], 
        'Math':[1,5,3,2],
        'Physics':[2,5,3,1],
        'Sports':[4,5,2,3],
        'Total':[12,16,12,9]}

df1=pd.DataFrame(data, columns=['Name','Eng','Math','Physics','Sports','Total']) 
df1


I want to get the range of each student's scores across the different subjects, and I found the function

numpy.ptp

which returns the range of values (maximum - minimum) along an axis, so I did this:

import numpy as np

cols_of_interest = ['Eng','Math','Sports','Physics']
np.ptp(df1[cols_of_interest].values, axis=1)

Result

array([4, 4, 2, 2])

Once I have the result, the information from the data frame is lost. For example, the students with the largest range should be Penny (4) and Ben (4). When the data is large, how can I merge these values back into the data frame and find the maximum?

Also, for cols_of_interest = ['Eng','Math','Sports','Physics'] , when the list is long (say 100 subjects), is there an elegant way to apply np.ptp?

Many thanks!!

Simply assign the output of np.ptp to a new column:

df1['max_range'] = np.ptp(df1[cols_of_interest].values, axis=1)

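Finally, you can find the max with max_val = df1['max_range'].max() , or use df1['max_range'].idxmax() if you want the index of the max value. Note that idxmax() returns only the first row that attains the maximum, so with ties (Penny and Ben here) it reports only one of them.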
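To list every student who ties for the largest range, one option is a boolean mask against the maximum. The following is a minimal sketch using the data from the question; the variable name top is only illustrative.

```python
import numpy as np
import pandas as pd

data = {'Name': ['Penny', 'Ben', 'Benny', 'Mark'],
        'Eng': [5, 1, 4, 3],
        'Math': [1, 5, 3, 2],
        'Physics': [2, 5, 3, 1],
        'Sports': [4, 5, 2, 3],
        'Total': [12, 16, 12, 9]}
df1 = pd.DataFrame(data)

cols_of_interest = ['Eng', 'Math', 'Sports', 'Physics']

# Row-wise range (max - min), stored next to the original data
df1['max_range'] = np.ptp(df1[cols_of_interest].values, axis=1)

# Keep every row whose range equals the overall maximum (handles ties)
max_val = df1['max_range'].max()
top = df1.loc[df1['max_range'] == max_val, ['Name', 'max_range']]
print(top)
```

If only the ordering matters, df1.sort_values('max_range', ascending=False) should put the widest ranges first.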

is there any elegant way to apply np.ptp?

You can access the column labels of a dataframe with df1.columns . This returns an Index of column labels (not a plain list); drop the names you do not want, here 'Name' and 'Total', and use the remaining labels to select the columns you pass to np.ptp .
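With many subjects, it may be easier to name the columns to exclude than the ones to keep. The following is a sketch under that assumption; the names excluded and subjects are illustrative, and the max/min subtraction is shown as a pandas-only alternative to np.ptp rather than as part of the original answer.

```python
import numpy as np
import pandas as pd

data = {'Name': ['Penny', 'Ben', 'Benny', 'Mark'],
        'Eng': [5, 1, 4, 3],
        'Math': [1, 5, 3, 2],
        'Physics': [2, 5, 3, 1],
        'Sports': [4, 5, 2, 3],
        'Total': [12, 16, 12, 9]}
df1 = pd.DataFrame(data)

# Columns that are not subject scores (assumed known in advance)
excluded = ['Name', 'Total']

# Everything else is treated as a subject, however many there are
subjects = df1.drop(columns=excluded)

# Row-wise range with NumPy
df1['max_range'] = np.ptp(subjects.to_numpy(), axis=1)

# Equivalent pandas-only computation, which keeps the row index
alt_range = subjects.max(axis=1) - subjects.min(axis=1)
print(df1[['Name', 'max_range']])
```

The subject columns are taken before max_range is added, so the new column is not mistaken for a subject.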


 