
Applying Pandas groupby to multiple columns

I have a dataset with several columns of daily data going back several years. Each column measures the same variable. I've calculated daily, monthly, and yearly statistics for each column, and now want to do the same with all columns combined, so that I get one statistic for each day, month, and year rather than a separate one per column.

So far I've been using Pandas groupby, with something like this:

sum_daily_files = daily_files.groupby(daily_files.Date.dt.day).sum()
sum_monthly_files = daily_files.groupby(daily_files.Date.dt.month).sum()
sum_yearly_files = daily_files.groupby(daily_files.Date.dt.year).sum()

Any suggestions on how I might use Pandas, or any other package, to combine the statistics? Thanks so much!

Edit

Here's a snippet of my dataframe:

Date                 site1  site2  site3  site4  site5  site6
2010-01-01 00:00:00      2      0      1      1      0      1
2010-01-02 00:00:00      7      5      1      3      1      1
2010-01-03 00:00:00      3      3      2      2      2      1
2010-01-04 00:00:00      0      0      0      0      0      0
2010-01-05 00:00:00      0      0      0      0      0      1

I had to type it in by hand because I was having trouble copying it over, so my apologies. Basically, the data covers six different sites from 2010 to 2019 and records how much snow (in inches) each site received on each day.

(Your question needs to be clarified.)

Is this what you want?

all_sum_daily_files = sum_daily_files.sum(axis=1)  # or daily_files.sum(axis=1)
all_sum_monthly_files = sum_monthly_files.sum(axis=1)
all_sum_yearly_files = sum_yearly_files.sum(axis=1)

If your data is already daily, there is no need to calculate a daily sum first; you can use daily_files.sum(axis=1) directly.
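As a minimal sketch of that idea, the example below rebuilds the five sample rows from the question and sums across the six site columns before grouping. It assumes Date is a datetime column and moves it into the index so that only the site columns are summed. The names sites, all_sites_daily, all_sites_monthly, and all_sites_yearly are illustrative and do not appear in the original post. Note also that grouping by Date.dt.month, as in the question, pools every January across all years. If one total per calendar month is wanted instead, grouping by a monthly period is one option, and that is what the sketch does.

```python
import pandas as pd

# Sample rows copied from the question; the real data runs from 2010 to 2019.
daily_files = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2010-01-01", "2010-01-02", "2010-01-03", "2010-01-04", "2010-01-05"]
        ),
        "site1": [2, 7, 3, 0, 0],
        "site2": [0, 5, 3, 0, 0],
        "site3": [1, 1, 2, 0, 0],
        "site4": [1, 3, 2, 0, 0],
        "site5": [0, 1, 2, 0, 0],
        "site6": [1, 1, 1, 0, 1],
    }
)

# Move Date into the index so that only the site columns are summed.
sites = daily_files.set_index("Date")

# One value per day: total snowfall across all six sites.
all_sites_daily = sites.sum(axis=1)

# One value per calendar month (e.g. 2010-01) and per year, across all sites.
all_sites_monthly = all_sites_daily.groupby(all_sites_daily.index.to_period("M")).sum()
all_sites_yearly = all_sites_daily.groupby(all_sites_daily.index.year).sum()

print(all_sites_daily)
print(all_sites_monthly)
print(all_sites_yearly)
```

Swapping .sum() for .mean() or .max() in the groupby calls would give other combined statistics on the same grouping.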


 