Pandas groupby column without a MultiIndex

Question

I have a dataframe with data about train stations over one month, in which three of the columns are indexes: Station, Date, Hour. It could look like this:

Station    Date       Hour    Passengers
Berlin HBF 2012-12-24 12:00   1000
Berlin HBF 2012-12-24 13:00   2000
Berlin HBF 2012-12-24 14:00   1000
Berlin HBF 2012-12-24 15:00   1000
....
Stuttgart  2012-12-24 12:00   500

Since I am only interested in the sums for each station over the month, I would like to group by Station, Date, and Hour, so that the end result looks like this:

Station    Passengers 
Berlin HBF 4000 
....
Stuttgart  500

But I can't get pandas to produce this result. I tried: byStation = traindata.groupby(['Station', 'Date', 'Hour']).agg(np.sum()) But that simply returns a multi-index, with all rows...

Answer 1

It looks like you want to group by "Station" only and sum the "Passengers" column. You do not need a multi-index here. Your solution will create one, but since it is the same as the one in your raw data, it's quite useless.

This one should work:

traindata.groupby("Station").Passengers.sum()
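
For reference, here is a minimal sketch of that approach on made-up data. The rows below are a hypothetical subset shaped like the table in the question, not the asker's real data. The `as_index=False` and `groupby(level=...)` variants are not part of the answer above; they are alternatives that may help if you want a flat DataFrame back, or if Station, Date and Hour really are index levels rather than ordinary columns.

```python
import pandas as pd

# Hypothetical sample data (assumption: same column names as in the question).
traindata = pd.DataFrame(
    {
        "Station": ["Berlin HBF", "Berlin HBF", "Berlin HBF", "Stuttgart"],
        "Date": ["2012-12-24"] * 4,
        "Hour": ["12:00", "13:00", "14:00", "12:00"],
        "Passengers": [1000, 2000, 1000, 500],
    }
)

# Group by Station only: the result is a Series indexed by Station.
totals = traindata.groupby("Station")["Passengers"].sum()

# Same sums, but Station stays a regular column instead of becoming the index.
flat = traindata.groupby("Station", as_index=False)["Passengers"].sum()

# If Station, Date and Hour are index levels, group on the level name instead.
indexed = traindata.set_index(["Station", "Date", "Hour"])
by_level = indexed.groupby(level="Station")["Passengers"].sum()

print(totals)
print(flat)
print(by_level)
```

Grouping by all three keys gives one group per row whenever each (Station, Date, Hour) combination is unique, which would explain why the original attempt returned every row.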

Answer 1: score 2, accepted, 2014-03-31 14:23:13
