pandas grouping aggregation across multiple columns in a dataframe

I would like to derive the min and max for each year, region, and weather type from a pandas DataFrame. The dataframe looks like this:

year   jan   feb   mar   apr   may   jun   aug   sept   oct   nov   dec    region  weathertype
1862   42.0  8.2   82.7  46.7  72.7  61.6  81.9  45.9   76.8  34.9  44.8   Anglia  Rain
1863   58.3  15.7  24.0  17.5  27.9  75.2  38.5  71.5   71.7  77.5  32.0   Anglia  Rain
1864   20.5  30.3  81.5  13.8  59.5  26.5  12.3  19.2   42.1  25.5  79.9   Anglia  Rain 

What I need are the min and max for each region and year, computed across the month columns of each row, with the result added to the existing dataframe as two new columns:

year   min   max
1862   8.2   82.7
1863   15.7  77.5
1864   12.3  81.5

My approach has been to use this code:

weather_data['max_value'] = weather_data.groupby(['year','region','weathertype'])['jan','feb','mar','apr','may','jun','jul','aug','sep','oct',  'nov','dec'].transform(np.min)

However, this produces a non-aggregated subset of the data that duplicates the existing frame, so the assignment to a single column fails with the following error:

Wrong number of items passed 12, placement implies 1

I then melted the dataframe into a long, rather than wide format:

year    region    Option_1    variable    value
1862    Anglia    Rain        jan         42.0
1863    Anglia    Rain        jan         58.3
1864    Anglia    Rain        jan         20.5

I then tried this code to produce what I needed:

weather_data['min_value'] = weather_data['value'].groupby(weather_data['region','Option_1']).transform(np.min)

but this produces a KeyError when the columns are passed in single brackets, as above, while double brackets, [['region','Option_1']], produce:
Grouper for <class 'pandas.core.frame.DataFrame'> not 1-dimensional

Any suggestions at this point are gratefully received.

I would do:

(df.set_index(['year','region','weathertype'])
  .assign(min=lambda x: x.min(axis=1),
          max=lambda x: x.max(axis=1)
         )
  .reset_index())

Output:

      year  region    weathertype      jan    feb    mar    apr    may    jun    aug    sept    oct    nov    dec    min    max
--  ------  --------  -------------  -----  -----  -----  -----  -----  -----  -----  ------  -----  -----  -----  -----  -----
 0    1862  Anglia    Rain            42      8.2   82.7   46.7   72.7   61.6   81.9    45.9   76.8   34.9   44.8    8.2   82.7
 1    1863  Anglia    Rain            58.3   15.7   24     17.5   27.9   75.2   38.5    71.5   71.7   77.5   32     15.7   77.5
 2    1864  Anglia    Rain            20.5   30.3   81.5   13.8   59.5   26.5   12.3    19.2   42.1   25.5   79.9   12.3   81.5
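
For reference, here is a minimal, self-contained sketch of the same idea on the three sample rows from the question. It also covers the melted (long) layout the question tried. The names min_value, max_value, keys, months, wide and long_df are my own choices for illustration, and the long frame keeps the column name weathertype rather than Option_1. The sample has no jul column, so the month list is derived from the frame instead of being typed out.

```python
import pandas as pd

# Sample rows copied from the question (the sample has no 'jul' column).
df = pd.DataFrame({
    "year": [1862, 1863, 1864],
    "jan": [42.0, 58.3, 20.5],
    "feb": [8.2, 15.7, 30.3],
    "mar": [82.7, 24.0, 81.5],
    "apr": [46.7, 17.5, 13.8],
    "may": [72.7, 27.9, 59.5],
    "jun": [61.6, 75.2, 26.5],
    "aug": [81.9, 38.5, 12.3],
    "sept": [45.9, 71.5, 19.2],
    "oct": [76.8, 71.7, 42.1],
    "nov": [34.9, 77.5, 25.5],
    "dec": [44.8, 32.0, 79.9],
    "region": ["Anglia"] * 3,
    "weathertype": ["Rain"] * 3,
})

keys = ["year", "region", "weathertype"]
# Every column that is not a key is treated as a month column.
months = [c for c in df.columns if c not in keys]

# Wide format: row-wise min/max over the month columns only.
wide = df.assign(
    min_value=df[months].min(axis=1),
    max_value=df[months].max(axis=1),
)

# Long format: melt, then broadcast each group's min/max back to its rows.
long_df = df.melt(id_vars=keys, var_name="variable", value_name="value")
grouped = long_df.groupby(keys)["value"]  # a list of column names, not a DataFrame
long_df["min_value"] = grouped.transform("min")
long_df["max_value"] = grouped.transform("max")

print(wide[["year", "min_value", "max_value"]])
```

In the wide layout no groupby is needed as long as each (year, region, weathertype) combination occupies a single row, which is the case in the sample. In the long layout, passing a list of column names to groupby avoids both the KeyError and the "not 1-dimensional" Grouper error from the question.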

