Resampling with 'how=count' causes problems

Question

I have a simple pandas dataframe that has measurements at various times:

                     volume
t
2013-10-13 02:45:00      17
2013-10-13 05:40:00      38
2013-10-13 09:30:00      29
2013-10-13 11:40:00      25
2013-10-13 12:50:00      11
2013-10-13 15:00:00      17
2013-10-13 17:10:00      15
2013-10-13 18:20:00      12
2013-10-13 20:30:00      20
2013-10-14 03:45:00       9
2013-10-14 06:40:00      30
2013-10-14 09:40:00      43
2013-10-14 11:05:00      10

I'm doing some basic resampling and plotting, such as the daily total volume, which works fine:

df.resample('D',how='sum').head()   

            volume
t
2013-10-13     184
2013-10-14     209
2013-10-15     197
2013-10-16     309
2013-10-17     317

But for some reason, when I try to get the total number of entries per day, it returns a multi-index Series instead of a dataframe:

df.resample('D',how='count').head()

2013-10-13  volume     9
2013-10-14  volume     9
2013-10-15  volume     7
2013-10-16  volume     9
2013-10-17  volume    10

I can fix the data so it's easily plotted with a simple unstack call, i.e. df.resample('D',how='count').unstack(), but why does calling resample with how='count' behave differently than with how='sum'?
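For reference, the unstack workaround mentioned above can be sketched on a hand-built Series. This is only an illustration: the names `days`, `counts`, and `frame` are made up here, the Series is constructed manually to mimic the output shown (it is not produced by resample), and the values are copied from the first three rows of that output.

```python
import pandas as pd

# Hand-built stand-in for the Series shown above: a two-level index of
# (day, column name) with one count per day. Values copied from the output.
days = pd.to_datetime(["2013-10-13", "2013-10-14", "2013-10-15"])
index = pd.MultiIndex.from_product([days, ["volume"]])
counts = pd.Series([9, 9, 7], index=index)

# unstack() moves the innermost index level ("volume") into the columns,
# leaving a DataFrame indexed by day that can be plotted directly.
frame = counts.unstack()
print(frame)
```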

Answer 1

It does appear that combining resample with count leads to some odd behavior in how the resulting object is structured (well, at least up to pandas 0.13.1). See here for a slightly different but related context: Count and Resampling with a multi-index

You can use the same strategy here:

>>> df
                     volume
date                       
2013-10-13 02:45:00      17
2013-10-13 05:40:00      38
2013-10-13 09:30:00      29
2013-10-13 11:40:00      25
2013-10-13 12:50:00      11
2013-10-13 15:00:00      17
2013-10-13 17:10:00      15
2013-10-13 18:20:00      12
2013-10-13 20:30:00      20
2013-10-14 03:45:00       9
2013-10-14 06:40:00      30
2013-10-14 09:40:00      43
2013-10-14 11:05:00      10

So here is your issue:

>>> df.resample('D',how='count')

2013-10-13  volume    9
2013-10-14  volume    4

You can fix the issue by passing a dict in the resample call to specify that count applies to the volume column:

>>> df.resample('D',how={'volume':'count'})

            volume
date              
2013-10-13       9
2013-10-14       4
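
A note for readers on newer pandas: as far as I know, the how= keyword was deprecated in pandas 0.18 and has since been removed, so the calls above will not run there. A minimal sketch of the equivalent method-chaining form follows, using only the 13 sample rows from the question. The variable names are illustrative, and whether the odd Series output still occurs was not checked here.

```python
import pandas as pd

# Sample rows copied from the question.
times = pd.to_datetime([
    "2013-10-13 02:45", "2013-10-13 05:40", "2013-10-13 09:30",
    "2013-10-13 11:40", "2013-10-13 12:50", "2013-10-13 15:00",
    "2013-10-13 17:10", "2013-10-13 18:20", "2013-10-13 20:30",
    "2013-10-14 03:45", "2013-10-14 06:40", "2013-10-14 09:40",
    "2013-10-14 11:05",
])
volume = [17, 38, 29, 25, 11, 17, 15, 12, 20, 9, 30, 43, 10]
df = pd.DataFrame({"volume": volume}, index=times)
df.index.name = "date"

# Resample first, then aggregate: the replacement for how='sum' / how='count'.
daily_sum = df.resample("D").sum()
daily_count = df.resample("D").count()

# Per-column aggregation, analogous to how={'volume': 'count'} above.
daily_count_by_column = df.resample("D").agg({"volume": "count"})
print(daily_count_by_column)
```

On this sample, both count forms give a DataFrame with a single volume column, so no unstack step is needed.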

Answer 1: 6, accepted, 2014-05-16 05:14:30
