
How to sum values in a column using conditional statements of other columns in a pandas dataframe?

I have a dataframe with 5 columns and 25552 rows. It is structured as follows:

mydf.head(4)

station       date         Lat    Lon       prcp
USC00397992   1998-10-01   44.26  -99.44    0.5
USC00397993   1998-10-01   44.01  -100.35   1.2
USC00397994   1998-10-01   45.65  -97.12    1.1
USC00397995   1998-10-01   43.90  -99.52    0.7

The station column contains many distinct stations, and the date column ranges from 1998-10-01 to 1999-06-30. Each distinct station has its own Lat and Lon. The prcp column records the precipitation for each date. I want to find the sum of the prcp values for each station over the date range 1999-05-01 to 1999-05-07. I want output like this:

station       Lat     Lon      sum_from_May1_to_May7
USC00397992   44.26   -99.44   2.5 (for instance)
.             .       .        .
.             .       .        .

First, filter your dataframe to the date range:

df2 = df.loc[(df.date >= '1999-05-01') & (df.date <= '1999-05-07')]

Then simply group by station and sum:

df2.groupby('station').prcp.sum()

If you don't want rows with different Lat and Lon grouped together, then group by all three columns:

df2.groupby(['station', 'Lat', 'Lon']).prcp.sum()
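The filter and group-by steps above can be put together as a minimal sketch. The sample rows and the `result` variable are made up for illustration, and the sketch assumes the date column holds ISO-formatted strings (YYYY-MM-DD), which compare in chronological order. Calling reset_index turns the grouped Series back into a flat table shaped like the output requested in the question.

```python
import pandas as pd

# Hypothetical sample rows in the same layout as the question's dataframe.
df = pd.DataFrame({
    "station": ["USC00397992", "USC00397992", "USC00397992",
                "USC00397993", "USC00397993"],
    "date": ["1999-04-30", "1999-05-01", "1999-05-07",
             "1999-05-03", "1999-05-08"],
    "Lat": [44.26, 44.26, 44.26, 44.01, 44.01],
    "Lon": [-99.44, -99.44, -99.44, -100.35, -100.35],
    "prcp": [0.5, 1.0, 1.5, 0.25, 2.0],
})

# Assumption: dates are ISO strings, so plain string comparison is chronological.
df2 = df.loc[(df.date >= "1999-05-01") & (df.date <= "1999-05-07")]

# Sum prcp per station/Lat/Lon, then flatten the group keys back into columns.
result = (
    df2.groupby(["station", "Lat", "Lon"])["prcp"]
       .sum()
       .reset_index(name="sum_from_May1_to_May7")
)
print(result)
```

Rows dated 1999-04-30 and 1999-05-08 fall outside the range and do not contribute to the sums.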

If you don't want to group by Lat and Lon, aggregate them with 'first' instead:

df[(df['date'] >= pd.Timestamp(1999, 5, 1)) & (df['date'] <= pd.Timestamp(1999, 5, 7))]\
     .groupby('station').agg({'prcp': 'sum', 'Lat': 'first', 'Lon': 'first'})
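Comparing against pd.Timestamp only works when the date column has a datetime dtype, so a string column has to be converted first. The sketch below shows one way to do that, assuming the dates are stored as strings. The sample rows and the names `in_week` and `summary` are hypothetical. It uses Series.between, which includes both endpoints by default, and named aggregation to pick the output column names directly.

```python
import pandas as pd

# Hypothetical sample rows; dates start out as strings.
df = pd.DataFrame({
    "station": ["USC00397992", "USC00397992", "USC00397993", "USC00397993"],
    "date": ["1999-05-01", "1999-05-07", "1999-05-03", "1999-05-08"],
    "Lat": [44.26, 44.26, 44.01, 44.01],
    "Lon": [-99.44, -99.44, -100.35, -100.35],
    "prcp": [1.0, 1.5, 0.25, 2.0],
})

# Convert to datetime so the column can be compared with Timestamp bounds.
df["date"] = pd.to_datetime(df["date"])

# Boolean mask for 1999-05-01 through 1999-05-07, endpoints included.
in_week = df["date"].between(pd.Timestamp("1999-05-01"), pd.Timestamp("1999-05-07"))

# One row per station: keep the first Lat/Lon seen and sum the precipitation.
summary = (
    df[in_week]
    .groupby("station", as_index=False)
    .agg(
        Lat=("Lat", "first"),
        Lon=("Lon", "first"),
        sum_from_May1_to_May7=("prcp", "sum"),
    )
)
print(summary)
```

Taking the first Lat and Lon per station is only safe if each station really has a single coordinate pair, as the question states.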


