Possible bug in pandas groupby and resample

Question

I am new to pandas and would like advice on whether this is a possible bug.

I have a dataframe with a non-unique datetime index. col1 is a group variable and col2 holds the values.

I want to resample the hourly values to years, grouped by the group variable. I do this with the following command:

df_resample = df.groupby('col1').resample('Y').mean()

This works fine and creates a MultiIndex of col1 and the DatetimeIndex, and col1 is no longer a column in the dataframe.

However, if I change mean() to max(), this is not the case. col1 is then part of the MultiIndex, but the column is also still present in the dataframe. Isn't this a bug?

Sorry, but I don't know how to present dummy data as a dataframe in this post.

Edit: code example:

from datetime import datetime, timedelta
import pandas as pd

data = {'category':['A', 'B', 'C'],
        'value_hour':[1,2,3]}
days = pd.date_range(datetime.now(), datetime.now() + timedelta(2), freq='D')

df = pd.DataFrame(data, index=days)

df_mean = df.groupby('category').resample('Y').mean()
df_max = df.groupby('category').resample('Y').max()
print(df_mean, df_max)
                        
category                value_hour              
A        2021-12-31         1.0
B        2021-12-31         2.0
C        2021-12-31         3.0     

category              category  value_hour                           
A        2021-12-31        A           1
B        2021-12-31        B           2
C        2021-12-31        C           3
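If only value_hour is needed, one way to get the same column layout from both aggregations is to select that column before resampling. The following is a minimal sketch under that assumption, not a statement about whether the behaviour above is a bug. It uses fixed dates instead of datetime.now() so the result is reproducible, and the 'YS' (year start) frequency is an assumption made here because it is accepted by both older and newer pandas releases; the labels therefore become 2021-01-01 rather than 2021-12-31.

```python
import pandas as pd

# Fixed dates (an assumption) so the example is reproducible.
days = pd.date_range("2021-09-27", periods=3, freq="D")
df = pd.DataFrame(
    {"category": ["A", "B", "C"], "value_hour": [1, 2, 3]},
    index=days,
)

# Selecting the value column first means the grouping column
# cannot appear among the aggregated columns.
df_mean = df.groupby("category")["value_hour"].resample("YS").mean().to_frame()
df_max = df.groupby("category")["value_hour"].resample("YS").max().to_frame()

print(df_mean)
print(df_max)
```

Both results then carry category only as an index level, with value_hour as the single column.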

Trying to drop the category column from df_max gives a KeyError:

df_max.drop('category')

File "C:\Users\mav\Anaconda3\envs\EWDpy\lib\site-packages\pandas\core\indexes\base.py", line 3363, in get_loc
raise KeyError(key) from err

KeyError: 'category'

Answer 1

Concerning the KeyError: the problem is that you are trying to drop a row labelled "category" instead of the column. By default, drop works on the row index (axis=0). To drop a column, add axis=1, as in the following code:

df_max.drop('category', axis=1)

axis=1 indicates that the label refers to the columns rather than the rows.
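To make the difference concrete, here is a small self-contained sketch. The frame is a hand-built, hypothetical stand-in for df_max (category appears both as an index level and as a column, as in the output in the question), so it does not depend on how a particular pandas version handles the resample step. It also shows drop(columns=...), which is equivalent to axis=1.

```python
import pandas as pd

# Hypothetical stand-in for df_max: "category" is both an index level
# and a regular column.
index = pd.MultiIndex.from_arrays(
    [["A", "B", "C"], pd.to_datetime(["2021-12-31"] * 3)],
    names=["category", None],
)
df_max = pd.DataFrame(
    {"category": ["A", "B", "C"], "value_hour": [1, 2, 3]},
    index=index,
)

# Without axis=1, drop searches the row labels and fails.
try:
    df_max.drop("category")
except KeyError as err:
    print("KeyError:", err)

# Both calls drop the column and return a new dataframe;
# df_max itself is left unchanged.
by_axis = df_max.drop("category", axis=1)
by_keyword = df_max.drop(columns="category")

print(by_axis)
```

Note that drop returns a new dataframe, so the result has to be assigned (for example df_max = df_max.drop(columns='category')) for the change to persist.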


Answer 1
0 Accepted 2021-09-29 22:46:05
