Why am I getting an "unsupported operand type(s) for -: 'str' and 'str'" error?
I am doing business customer segmentation, but when I run my code I get this error:
unsupported operand type(s) for -: 'str' and 'str'
The error is located on this line of code:
# Aggregate data by each customer
customers = df_fix.groupby(['CustomerID']).agg({
    'InvoiceDate': lambda x: str(snapshot_date - x.max()).days,
    'InvoiceNo': 'count',
    'TotalSum': 'sum'})
Here is my entire program:
# Import The Libraries
# ! pip install xlrd
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Import The Dataset
df = pd.read_csv('path/data.csv',encoding='latin1')
df = df[df['CustomerID'].notna()]
# Create TotalSum column
df_fix["TotalSum"] = df_fix["Quantity"] * df_fix["UnitPrice"]
# Sample the dataset
df_fix = df.sample(10000, random_state = 42)
# Convert to show date only
from datetime import datetime
df_fix["InvoiceDate"] = pd.to_datetime(df_fix["InvoiceDate"], errors='coerce', utc=True).dt.strftime('%Y-%m-%d')
# Create date variable that records recency
import datetime
snapshot_date = max(df_fix.InvoiceDate)+str(datetime.timedelta(days=1))
# Aggregate data by each customer
customers = df_fix.groupby(['CustomerID']).agg({
    'InvoiceDate': lambda x: (snapshot_date - x.max()).days,
    'InvoiceNo': 'count',
    'TotalSum': 'sum'})
Please assist me.
You should keep the datetime type when doing the calculation:
df_fix["InvoiceDate"] = pd.to_datetime(df_fix["InvoiceDate"], errors='coerce', utc=True)
# Create date variable that records recency
snapshot_date = max(df_fix.InvoiceDate)+pd.Timedelta(days=1)
# Aggregate data by each customer
customers = df_fix.groupby(['CustomerID']).agg({
    'InvoiceDate': lambda x: (snapshot_date - x.max()).days,
    'InvoiceNo': 'count',
    'TotalSum': 'sum'})
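To see the approach above work end to end, here is a minimal sketch on a tiny made-up table. The rows and values are assumptions for illustration only and do not come from the asker's data.csv; the column names follow the question.

```python
import pandas as pd

# Toy rows standing in for the asker's invoice data (values are invented).
df_fix = pd.DataFrame({
    "CustomerID": [1, 1, 2],
    "InvoiceNo": ["A1", "A2", "B1"],
    "InvoiceDate": ["2011-12-01 08:26", "2011-12-05 10:00", "2011-12-09 12:50"],
    "TotalSum": [10.0, 5.0, 20.0],
})

# Parse to datetime and keep it that way (no .dt.strftime afterwards).
df_fix["InvoiceDate"] = pd.to_datetime(df_fix["InvoiceDate"], errors="coerce")

# Timestamp + Timedelta is still a Timestamp, so subtraction works later.
snapshot_date = df_fix["InvoiceDate"].max() + pd.Timedelta(days=1)

customers = df_fix.groupby("CustomerID").agg({
    "InvoiceDate": lambda x: (snapshot_date - x.max()).days,  # recency in days
    "InvoiceNo": "count",                                     # frequency
    "TotalSum": "sum",                                        # monetary value
})
print(customers)
```

Because both snapshot_date and x.max() are timestamps here, their difference is a Timedelta, which is what provides the .days attribute.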
Your snapshot_date is no longer a datetime object after you converted it into a string with the following line:
snapshot_date = max(df_fix.InvoiceDate)+str(datetime.timedelta(days=1))
You can check the output of your snapshot_date with print(snapshot_date) to figure out how to convert it back to a datetime object.
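The following sketch reproduces the situation on two made-up dates, to show what the print would reveal. Note that in the question the InvoiceDate column was already turned into strings by .dt.strftime, so both operands of the subtraction are str. The variable names as_text, snapshot_text and error_message are placeholders introduced here, and the conversion at the end is only one possible way back to datetime.

```python
import datetime
import pandas as pd

dates = pd.Series(pd.to_datetime(["2011-12-01", "2011-12-09"]))
as_text = dates.dt.strftime("%Y-%m-%d")  # values are now plain strings

# str + str is concatenation, so this yields text rather than a date.
snapshot_text = max(as_text) + str(datetime.timedelta(days=1))
print(repr(snapshot_text))

# Subtracting two strings raises the TypeError from the question.
error_message = None
try:
    snapshot_text - as_text.max()
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# One way back: parse the strings again and add a pandas Timedelta.
snapshot_date = pd.to_datetime(max(as_text)) + pd.Timedelta(days=1)
recency = (snapshot_date - pd.to_datetime(as_text).max()).days
print(recency)
```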
I solved the problem by replacing this line of code:
df_fix["InvoiceDate"] = pd.to_datetime(df_fix["InvoiceDate"], errors='coerce', utc=True).dt.strftime('%Y-%m-%d')
with this line of code:
df_fix["InvoiceDate"] = pd.to_datetime(df_fix["InvoiceDate"], errors='coerce')
The problem is now solved.
Thank you all for your help.
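If the original goal of the strftime call was to show the date only, a possible alternative is to drop the time of day while keeping the datetime dtype. This is a sketch under that assumption, using made-up values; Series.dt.normalize() sets every timestamp to midnight, and the names raw, parsed and date_only are placeholders.

```python
import pandas as pd

# Invented sample values, including one entry that cannot be parsed.
raw = pd.Series(["2011-12-01 08:26:00", "2011-12-09 12:50:00", "not a date"])

parsed = pd.to_datetime(raw, errors="coerce")  # unparseable entries become NaT
date_only = parsed.dt.normalize()              # midnight, still a datetime dtype

# Cheap guard before doing date arithmetic on the column.
is_datetime = pd.api.types.is_datetime64_any_dtype(date_only)
print(is_datetime)

# max() skips NaT, and the result is still a Timestamp.
snapshot_date = date_only.max() + pd.Timedelta(days=1)
print(snapshot_date)
```

Formatting with strftime can then be left to the point where the dates are displayed, so the calculations never see strings.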