
Find maximum and minimum values of three columns in Python

I would like to know how to find the difference between the maximum and minimum values of three columns in Python. (The columns are named POPESTIMATE2010 through POPESTIMATE2012.) Then I need to find the largest result among all my records. In other words, which county has had the largest absolute change in population within the period 2010-2012?

E.g., if a county's population over the 3-year period is 100, 80, 130, then its largest change in the period would be |130-80| = 50.

Here is my code:

import pandas as pd
census_df = pd.read_csv('census.csv')

def answer_one():
    return ((census_df['POPESTIMATE2010'],census_df ['POPESTIMATE2011'],census_df ['POPESTIMATE2012']).max()-(census_df['POPESTIMATE2010'],census_df ['POPESTIMATE2011'],census_df ['POPESTIMATE2012']).min()).max()

answer_one()

I'm not sure what the end result should be, but if you want the column with the biggest difference between its max and min values, you can do it like this:

>>> df = pd.DataFrame({'a':[3,4,6], 'b':[22,15,6], 'c':[7,18,9]})
>>> df
   a   b   c
0  3  22   7
1  4  15  18
2  6   6   9
>>> diff = df.max() - df.min()
>>> diff
a     3
b    16
c    11
dtype: int64
>>> diff.nlargest(1)
b    16
dtype: int64

And if you need just the number:

>>> diff.max()
16

And if you want the difference between the max and min values in each row, do the same along the other axis:

>>> diff = df.max(axis=1) - df.min(axis=1)
>>> diff
0    19
1    14
2     3
dtype: int64
>>> diff.max()
19
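
Applied to the question's columns, the row-wise version might look like the sketch below. The three-row frame is made-up sample data (the first row reuses the 100, 80, 130 example from the question), and the county names are placeholders, not values from census.csv.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
sample = pd.DataFrame(
    {
        "CTYNAME": ["County A", "County B", "County C"],
        "POPESTIMATE2010": [100, 500, 40],
        "POPESTIMATE2011": [80, 510, 45],
        "POPESTIMATE2012": [130, 505, 42],
    }
)

cols = ["POPESTIMATE2010", "POPESTIMATE2011", "POPESTIMATE2012"]

# Largest minus smallest estimate within each row (axis=1 works across columns).
change = sample[cols].max(axis=1) - sample[cols].min(axis=1)

# Row label of the biggest change, then the county name stored in that row.
top_row = change.idxmax()
top_county = sample.loc[top_row, "CTYNAME"]
print(top_county, change[top_row])
```

Selecting the three columns first keeps any other numeric columns in the file from influencing the result.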

import pandas as pd
d = {'a':[1,2,3], 'b':[4,5,6], 'c':[7,8,9]}
df = pd.DataFrame(d)

def answer_one():
    # df.max() and df.min() give one value per column; max()/min() reduce those to a single number
    max_1 = max(df.max())
    min_1 = min(df.min())
    return max_1 - min_1

print(answer_one())

And if you want to use a selected group of columns:

max_1 = max(df[['a','b']].max())

max(list) gives you the max element in the list.

min(list) gives you the min element in the list.

The rest I assume should be fairly straightforward to understand!

You need to clean your data first and keep only the columns you need. Then transpose the data frame, take the difference between the max and min of each (transposed) column, and finally call idxmax on the resulting diff series.

import pandas as pd
census_df = pd.read_csv('census.csv')
ans_df = census_df[census_df["SUMLEV"] == 50]    
ans_df = ans_df[["STNAME", "CTYNAME", "POPESTIMATE2010", "POPESTIMATE2011", "POPESTIMATE2012"]]
ans_df = ans_df.set_index(["STNAME", "CTYNAME"])
diff = ans_df.T.max() - ans_df.T.min()
diff.idxmax()[1]
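
To see why the final `[1]` is needed, here is a self-contained sketch on invented rows. The state and county names and all numbers are made up; only the column names and the `SUMLEV == 50` filter come from the code above. It uses `axis=1` instead of transposing, which should give the same series.

```python
import pandas as pd

# Invented rows; SUMLEV 40 is assumed to mark a state-level summary row, 50 a county row.
toy = pd.DataFrame(
    {
        "SUMLEV": [40, 50, 50],
        "STNAME": ["State X", "State X", "State X"],
        "CTYNAME": ["State X", "County A", "County B"],
        "POPESTIMATE2010": [1000, 600, 400],
        "POPESTIMATE2011": [1100, 650, 450],
        "POPESTIMATE2012": [1300, 640, 660],
    }
)

# Keep county rows only, then index them by (state, county).
counties = toy[toy["SUMLEV"] == 50].set_index(["STNAME", "CTYNAME"])
pop = counties[["POPESTIMATE2010", "POPESTIMATE2011", "POPESTIMATE2012"]]

# Row-wise spread between the largest and smallest estimate.
diff = pop.max(axis=1) - pop.min(axis=1)

# With a two-level index, idxmax() returns a (STNAME, CTYNAME) tuple.
winner = diff.idxmax()
print(winner)     # the full index label
print(winner[1])  # just the county name
```

In this toy data the state-level row has the largest spread of all, so skipping the `SUMLEV` filter would return the state instead of a county.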

I had the same problem; here is how I solved it:

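def answer_one():
    # County-level rows only, indexed by state and county name
    f1 = census_df[census_df['SUMLEV'] == 50].set_index(['STNAME','CTYNAME'])
    f1 = f1.loc[:, ['POPESTIMATE2010','POPESTIMATE2011','POPESTIMATE2012','POPESTIMATE2013',
                    'POPESTIMATE2014','POPESTIMATE2015']].stack()
    # Group the stacked values by county and take max minus min within each group
    grouped = f1.groupby(level=['STNAME','CTYNAME'])
    f2 = grouped.max() - grouped.min()
    return f2.idxmax()[1]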
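
As a rough illustration of what `stack()` does in this approach, the sketch below runs the same idea on two invented counties. It assumes a recent pandas release, where grouping by index level is written as `groupby(level=...)`; the names and values are made up.

```python
import pandas as pd

# Invented data; only the column names mirror the census file.
toy = pd.DataFrame(
    {
        "STNAME": ["State X", "State X"],
        "CTYNAME": ["County A", "County B"],
        "POPESTIMATE2010": [600, 400],
        "POPESTIMATE2011": [650, 450],
        "POPESTIMATE2012": [640, 660],
    }
).set_index(["STNAME", "CTYNAME"])

# stack() moves the column labels into a third index level,
# turning the 2x3 frame into one long Series with six entries.
stacked = toy.stack()

# Group the long Series back by county and compare extremes per group.
by_county = stacked.groupby(level=["STNAME", "CTYNAME"])
spread = by_county.max() - by_county.min()
print(spread.idxmax()[1])
```

The result matches the row-wise `max(axis=1) - min(axis=1)` calculation; stacking is just another route to the same per-county spread.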

