Pandas TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'

I've got some order data that I want to analyse. What I'm currently interested in is: how often has each SKU been bought in each month?

Here is a small example:

import datetime
import pandas as pd
import numpy as np

d = {'sku': ['RT-17']}
df_skus = pd.DataFrame(data=d)
print(df_skus)

d = {'date': ['2017/02/17', '2017/03/17', '2017/04/17', '2017/04/18', '2017/05/02'], 'item_sku': ['HT25', 'RT-17', 'HH30', 'RT-17', 'RT-19']}
df_orders = pd.DataFrame(data=d)
print(df_orders)

for i in df_orders.index:
    print("\n toll")
    df_orders.loc[i,'date']=pd.to_datetime(df_orders.loc[i, 'date'])

df_orders = df_orders[df_orders["item_sku"].isin(df_skus["sku"])]
monthly_sales = df_orders.groupby(["item_sku", pd.Grouper(key="date",freq="M")]).size()
monthly_sales = monthly_sales.unstack(0) 

print(monthly_sales)

That works fine, but if I use my real order data (from CSV), after some minutes I get:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'

The problem comes from this line:

monthly_sales = df_orders.groupby(["item_sku", pd.Grouper(key="date",freq="M")]).size()

Is it possible to skip over the error? I tried a try/except block:

try:
    monthly_sales = df_orders.groupby(["item_sku", pd.Grouper(key="date",freq="M")]).size()
    monthly_sales = monthly_sales.unstack(0)
except:
    print("\n Here seems to be one issue")

Then for print(monthly_sales) I get:

Empty DataFrame
Columns: [txn_id, date, item_sku, quantity]
Index: []

So it seems something in my data empties or breaks the grouping? How can I 'clean' my data?
Or I'd even be fine with losing the data of a sale here and there if I can just 'skip' over the error. Is this possible?

When reading your CSV, use the parse_dates argument:

df_orders = pd.read_csv('file.csv', parse_dates=['date'])

This automatically converts date to datetime. If that doesn't work, you'll need to load the column as a string and then use the errors='coerce' argument with pd.to_datetime, which turns unparseable values into NaT:

df_orders['date'] = pd.to_datetime(df_orders['date'], errors='coerce')

Note that you can pass Series objects (amongst other things) to pd.to_datetime.
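If you want to see which rows are causing the trouble before dropping them, one option is to keep the raw strings around and look at the entries that came back as NaT. This is only a minimal sketch: the sample data (including the 'not a date' value) is made up to stand in for the real CSV, and the names raw_dates, bad_rows and df_clean are just for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for the real CSV; one date is malformed.
df_orders = pd.DataFrame({
    "date": ["2017/03/17", "not a date", "2017/04/18"],
    "item_sku": ["RT-17", "RT-17", "RT-17"],
})

# Keep the raw strings so the offending values can be inspected afterwards.
raw_dates = df_orders["date"].copy()

# Unparseable values become NaT instead of raising an exception.
df_orders["date"] = pd.to_datetime(df_orders["date"], errors="coerce")

# The original strings of the rows whose date could not be parsed.
bad_rows = raw_dates[df_orders["date"].isna()]
print(bad_rows)

# Drop those rows if losing a few sales is acceptable.
df_clean = df_orders.dropna(subset=["date"])
print(df_clean["date"].dtype)
```

Checking the dtype at the end is a quick way to confirm the column really is datetime before grouping on it.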

Next, filter and group as you've been doing, and it should work.

df_orders[df_orders["item_sku"].isin(df_skus["sku"])]\
     .groupby(['item_sku', pd.Grouper(key='date', freq='M')]).size()

item_sku  date      
RT-17     2017-03-31    1
          2017-04-30    1
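As a side note, newer pandas releases (2.2 and later, as far as I'm aware) deprecate the "M" alias in pd.Grouper in favour of "ME". A variation that avoids the alias question is to group on monthly periods via .dt.to_period("M") instead of pd.Grouper. This is a different technique from the one above, sketched here on the sample data from the question; the names filtered and month are only illustrative.

```python
import pandas as pd

df_skus = pd.DataFrame({"sku": ["RT-17"]})
df_orders = pd.DataFrame({
    "date": ["2017/02/17", "2017/03/17", "2017/04/17", "2017/04/18", "2017/05/02"],
    "item_sku": ["HT25", "RT-17", "HH30", "RT-17", "RT-19"],
})

# Convert the whole column at once; bad values would become NaT.
df_orders["date"] = pd.to_datetime(df_orders["date"], errors="coerce")

# Keep only the SKUs of interest.
filtered = df_orders[df_orders["item_sku"].isin(df_skus["sku"])]

# Group by SKU and calendar month (a Period rather than a month-end timestamp),
# then pivot the SKUs into columns.
month = filtered["date"].dt.to_period("M").rename("month")
monthly_sales = filtered.groupby(["item_sku", month]).size().unstack(0)
print(monthly_sales)
```

The result is indexed by month (2017-03, 2017-04) rather than by month-end date, which may or may not be what you want for plotting.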
