Groupby column and find min and max of each group

I have the following dataset:

        Day    Element  Data_Value
6786    01-01   TMAX    112
9333    01-01   TMAX    101
9330    01-01   TMIN    60
11049   01-01   TMIN    0
6834    01-01   TMIN    25
11862   01-01   TMAX    113
1781    01-01   TMAX    115
11042   01-01   TMAX    105
1110    01-01   TMAX    111
651     01-01   TMIN    44
11350   01-01   TMIN    83
1798    01-02   TMAX    70
4975    01-02   TMAX    79
12774   01-02   TMIN    0
3977    01-02   TMIN    60
2485    01-02   TMAX    73
4888    01-02   TMIN    31
11836   01-02   TMIN    26
11368   01-02   TMAX    71
2483    01-02   TMIN    26

I want to group by Day, then find the overall min of TMIN and the max of TMAX for each day and put these into a data frame, so that I get an output like:

Day    DayMin    DayMax
01-01  0         115
01-02  0         79

I know I need to do:

df.groupby(by='Day')

but I am stuck on the next step. Should I create columns to store the TMAX and TMIN values?

You can use assign + abs, followed by groupby + agg:

df = (df.assign(Data_Value=df['Data_Value'].abs())
       .groupby(['Day'])['Data_Value'].agg([('Min' , 'min'), ('Max', 'max')])
       .add_prefix('Day'))

df 
       DayMin  DayMax
Day                  
01-01       0     115
01-02       0      79
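Note that the snippet above takes the min and max over every row of a day, so it relies on the smallest reading being a TMIN and the largest a TMAX, and abs() drops the sign of any negative reading. A minimal sketch that filters on Element first, assuming the sample rows from the question (the names tmin, tmax, and result are only illustrative):

```python
import pandas as pd

# Sample rows copied from the question (original index labels omitted).
df = pd.DataFrame({
    "Day": ["01-01"] * 11 + ["01-02"] * 9,
    "Element": ["TMAX", "TMAX", "TMIN", "TMIN", "TMIN", "TMAX",
                "TMAX", "TMAX", "TMAX", "TMIN", "TMIN",
                "TMAX", "TMAX", "TMIN", "TMIN", "TMAX",
                "TMIN", "TMIN", "TMAX", "TMIN"],
    "Data_Value": [112, 101, 60, 0, 25, 113, 115, 105, 111, 44, 83,
                   70, 79, 0, 60, 73, 31, 26, 71, 26],
})

# Lowest TMIN reading and highest TMAX reading per day.
tmin = df.loc[df["Element"] == "TMIN"].groupby("Day")["Data_Value"].min()
tmax = df.loc[df["Element"] == "TMAX"].groupby("Day")["Data_Value"].max()

# Align the two Series on Day, name the columns, and make Day a column again.
result = pd.concat([tmin.rename("DayMin"), tmax.rename("DayMax")], axis=1).reset_index()
print(result)
```

On this sample both approaches give the same numbers; they would differ only if a TMAX reading were lower than every TMIN reading on some day, or if negative values were present.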

Use groupby + apply with a custom function:

In [5265]: def maxmin(x):
      ...:     mx = x[x.Element == 'TMAX'].Data_Value.max()
      ...:     mn = x[x.Element == 'TMIN'].Data_Value.min()
      ...:     return pd.Series({'DayMin': mn, 'DayMax': mx})
      ...:

In [5266]: df.groupby('Day').apply(maxmin)
Out[5266]:
       DayMax  DayMin
Day
01-01     115       0
01-02      79       0

Also, to turn Day back into a regular column:

In [5268]: df.groupby('Day').apply(maxmin).reset_index()
Out[5268]:
     Day  DayMax  DayMin
0  01-01     115       0
1  01-02      79       0

Or use query instead of boolean indexing inside maxmin, replacing x[x.Element == 'TMAX'] with x.query("Element == 'TMAX'").
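For reference, a sketch of what the query-based variant of maxmin could look like. It assumes a handful of rows taken from the question's data, and it selects the columns explicitly before apply so the function only sees Element and Data_Value:

```python
import pandas as pd

# A few rows from the question's data, enough to cover both days.
df = pd.DataFrame({
    "Day": ["01-01", "01-01", "01-01", "01-02", "01-02", "01-02"],
    "Element": ["TMAX", "TMIN", "TMAX", "TMAX", "TMIN", "TMIN"],
    "Data_Value": [112, 0, 115, 79, 0, 31],
})

def maxmin(x):
    # query() filters rows with a string expression instead of a boolean mask.
    mx = x.query("Element == 'TMAX'")["Data_Value"].max()
    mn = x.query("Element == 'TMIN'")["Data_Value"].min()
    return pd.Series({"DayMin": mn, "DayMax": mx})

# Apply per day, then move Day from the index back into a column.
result = df.groupby("Day")[["Element", "Data_Value"]].apply(maxmin).reset_index()
print(result)
```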

Create duplicate columns and find the min and max using agg, i.e.:

ndf = (df.assign(DayMin=df['Data_Value'].abs(), DayMax=df['Data_Value'].abs())
         .groupby('Day')
         .agg({'DayMin': 'min', 'DayMax': 'max'}))

ndf
       DayMax  DayMin
Day                  
01-01     115       0
01-02      79       0

In case you want the min and max for both TMIN and TMAX, use groupby(['Day', 'Element']).
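A short sketch of grouping on both keys, again assuming a few rows taken from the question's data (the name stats is illustrative):

```python
import pandas as pd

# A few rows from the question's data, enough to cover both days.
df = pd.DataFrame({
    "Day": ["01-01", "01-01", "01-01", "01-02", "01-02", "01-02"],
    "Element": ["TMAX", "TMIN", "TMAX", "TMAX", "TMIN", "TMIN"],
    "Data_Value": [112, 0, 115, 79, 0, 31],
})

# One row per (Day, Element) pair, holding that pair's min and max.
stats = df.groupby(["Day", "Element"])["Data_Value"].agg(["min", "max"])
print(stats)
```

The result is indexed by (Day, Element), so a single value can be looked up with, for example, stats.loc[("01-01", "TMAX"), "max"].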
