TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex' and I can't figure out why
I have this DataFrame in pandas:
key date story_point Story point
0 SOF-158 2019-06-04 09:51:01.143000+02:00 3.0 3.0
1 SOF-152 2019-05-24 09:10:23.483000+02:00 3.0 3.0
2 SOF-151 2019-05-24 09:10:14.978000+02:00 3.0 3.0
3 SOF-150 2019-05-24 09:10:23.346000+02:00 3.0 3.0
4 SOF-149 2019-05-24 09:10:23.024000+02:00 3.0 3.0
5 SOF-148 2019-05-24 09:10:23.190000+02:00 3.0 3.0
6 SOF-146 2019-05-24 09:10:22.840000+02:00 5.0 5.0
7 SOF-142 2019-04-15 10:50:03.946000+02:00 2.0 2.0
8 SOF-141 2019-03-29 10:54:08.677000+01:00 2.0 2.0
9 SOF-139 2019-04-15 10:44:56.033000+02:00 3.0 3.0
10 SOF-138 2019-04-15 10:48:53.874000+02:00 3.0 3.0
11 SOF-129 2019-03-28 11:56:17.221000+01:00 5.0 5.0
12 SOF-128 2019-03-29 11:34:47.552000+01:00 1.0 1.0
13 SOF-106 2019-03-25 10:15:43.231000+01:00 5.0 5.0
14 SOF-105 2019-03-25 10:15:43.252000+01:00 3.0 3.0
15 SOF-103 2019-03-29 11:55:45.984000+01:00 8.0 8.0
16 SOF-102 2019-03-25 10:15:43.210000+01:00 8.0 8.0
17 SOF-101 2019-03-25 10:15:43.179000+01:00 8.0 8.0
18 SOF-100 2019-03-29 12:08:16.525000+01:00 13.0 13.0
19 SOF-99 2019-03-19 12:48:58.168000+01:00 1.0 1.0
20 SOF-98 2019-03-19 12:47:28.172000+01:00 13.0 13.0
21 SOF-91 2019-03-08 11:53:19.456000+01:00 3.0 3.0
22 SOF-89 2019-04-05 09:32:39.517000+02:00 8.0 8.0
23 SOF-88 2019-03-25 10:15:42.927000+01:00 5.0 5.0
24 SOF-87 2019-04-05 09:32:25.519000+02:00 8.0 8.0
At a certain point I need to group by week, so I used resample:
weekly_summary["story_point"] = df.story_point.resample('W').sum()
But I get this error, and I can't figure out why:
Traceback (most recent call last):
File "main.py", line 98, in <module>
main()
File "main.py", line 44, in main
analyze_project(project)
File "main.py", line 70, in analyze_project
weekly_summary["story_point"] = df.story_point.resample('W').sum()
File "/Users/xxxx/anaconda/envs/xxx/lib/python3.6/site-packages/pandas/core/generic.py", line 8449, in resample
level=level,
File "/Users/xxx/anaconda/envs/xxx/lib/python3.6/site-packages/pandas/core/resample.py", line 1306, in resample
return tg._get_resampler(obj, kind=kind)
File "/Users/xxx/anaconda/envs/xxxx/lib/python3.6/site-packages/pandas/core/resample.py", line 1443, in _get_resampler
"but got an instance of %r" % type(ax).__name__
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'
Convert the date column to datetimes and add the on parameter to resample:
df['date'] = pd.to_datetime(df['date'])
weekly_summary = df.story_point.resample('W', on='date').sum()
If you need a new column:
weekly_summary['weekly'] = df.story_point.resample('W', on='date').transform('sum')
Or create a DatetimeIndex:
df['date'] = pd.to_datetime(df['date'])
df = df.set_index('date')
weekly_summary = df.story_point.resample('W').sum()
If you need a new column:
df['weekly'] = df.story_point.resample('W').transform('sum')
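One thing to watch with the sample data in the question: the timestamps mix +01:00 and +02:00 offsets (winter and summer time). If the date column holds strings, a plain pd.to_datetime call may not produce a single datetime64 column for mixed offsets, depending on the pandas version. A minimal sketch, assuming string input and that converting everything to UTC is acceptable (the four rows are copied from the question; the name weekly is only illustrative):

```python
import pandas as pd

# Four rows copied from the question; dates are assumed to be strings here.
df = pd.DataFrame({
    "key": ["SOF-158", "SOF-152", "SOF-146", "SOF-141"],
    "date": [
        "2019-06-04 09:51:01.143000+02:00",
        "2019-05-24 09:10:23.483000+02:00",
        "2019-05-24 09:10:22.840000+02:00",
        "2019-03-29 10:54:08.677000+01:00",
    ],
    "story_point": [3.0, 3.0, 5.0, 2.0],
})

# utc=True normalizes the mixed offsets into one timezone-aware dtype.
df["date"] = pd.to_datetime(df["date"], utc=True)

# With a DatetimeIndex in place, resample works directly on the column.
weekly = df.set_index("date")["story_point"].resample("W").sum()

# Show only the weeks that actually contain issues.
print(weekly[weekly > 0])
```

The weekly bins are labelled in UTC here; if local week boundaries matter, converting the column with .dt.tz_convert before resampling is one option.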
The first part of jezrael's answer was giving me KeyError: 'The grouper name date is not found'. It turns out that the on= argument can be used only if resample() is called on a DataFrame (not on a column). So the following works.
# for weekly aggregate
weekly_summary = df.resample('W', on='date')['story_point'].sum()
# if you want to assign the summary back to df
df['weekly'] = df.resample('W', on='date')['story_point'].transform('sum')
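Another way to broadcast each week's total back onto its rows is groupby with pd.Grouper(key=..., freq=...). This swaps resample for groupby, so treat it as an alternative sketch rather than the method used above. The data below is made up for illustration:

```python
import pandas as pd

# Made-up sample: two issues in each of two consecutive weeks.
df = pd.DataFrame({
    "date": pd.to_datetime(["2019-03-25", "2019-03-29", "2019-04-01", "2019-04-05"]),
    "story_point": [5.0, 2.0, 1.0, 8.0],
})

# Group rows into weekly bins on the date column, then write each
# bin's sum back to every row that belongs to it.
weekly_bins = pd.Grouper(key="date", freq="W")
df["weekly"] = df.groupby(weekly_bins)["story_point"].transform("sum")

print(df)
```

Because transform returns a result aligned with the original rows, the assignment keeps the DataFrame's existing index and row order.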
If the index is a datetime index, then resample() can be called on a column, so
weekly_summary = df.set_index('date')['story_point'].resample('W').sum()
works just fine.
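To tie the thread together, here is a small self-contained sketch that reproduces the original TypeError on a default RangeIndex and then checks that the two fixes give the same weekly totals. The data is made up, and the names error, by_column and by_index are only illustrative:

```python
import pandas as pd

# Made-up sample with a default RangeIndex, as in the question.
df = pd.DataFrame({
    "date": pd.to_datetime(["2019-03-25", "2019-03-29", "2019-04-01", "2019-04-05"]),
    "story_point": [5.0, 2.0, 1.0, 8.0],
})

# Resampling the column fails because its index is not datetime-like.
try:
    df["story_point"].resample("W").sum()
    error = None
except TypeError as exc:
    error = str(exc)
print(type(df.index).__name__, "->", error)

# Fix 1: resample the DataFrame and point at the date column with on=.
by_column = df.resample("W", on="date")["story_point"].sum()

# Fix 2: make the date column the index, then resample the column.
by_index = df.set_index("date")["story_point"].resample("W").sum()

print(by_column)
print(by_index)
```

Checking type(df.index) before calling resample is a quick way to tell which of the two forms applies.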