Python Pandas - Dataframe total by year appearing in one row
I have the following code that summarizes data into the table listed below under OUTPUT.
df['Customer ID'] = df['Ship To Customer'] + df['ZipCleaned']
df.Date = pd.to_datetime(df.Date)
min_dates = df.groupby(['Customer ID'])['Date'].min()
df['First_Purchase_Date'] = df.apply(lambda row: min_dates.loc[row['Customer ID']], axis=1)
df['New Customer'] = df['Date'] <= df['First_Purchase_Date']
df['Existing Customer'] = df['Date'] > df['First_Purchase_Date']
df['Total Customers'] = (df['New Customer']==True).value_counts() + (df['Existing Customer']==True).value_counts()
df['revenue'] = pd.to_numeric(df['revenue'])
df['Date'] = pd.to_datetime(df['Date'], unit='s')
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month
df['First_Purchase_Date'] = pd.to_datetime(df['First_Purchase_Date'], unit='s')
FPRANGE = df.First_Purchase_Date.between('2014-01-01','2019-12-03') #customer first purchase dates we want to include in the dataset
Table = df.loc[FPRANGE].groupby(df['Year'])[['New Customer', 'Existing Customer', 'Total Customers','revenue']].sum() #with date filter on first purchase date
print(Table)
OUTPUT
      New Customer  Existing Customer  Total Customers    revenue
Year
2014          7.00               2.00           156.00  11,869.47
2015          1.00               3.00             0.00   9,853.93
2016          5.00               3.00             0.00   4,058.53
2017          9.00               3.00             0.00   8,056.37
2018         12.00               7.00             0.00  22,031.23
2019         16.00              10.00             0.00  97,142.42
Notice that the Total Customers column has a value for 2014 but shows 0.00 for every other year. I want to see the total number of customers for each year (i.e. New Customer + Existing Customer). I have tried several different approaches but can't seem to get this right.
The solution suggested above works:
Table['Total Customers'] = Table['New Customer'] + Table['Existing Customer']
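Why the original column misbehaves: value_counts() returns a short Series indexed by True/False rather than one value per row, so assigning it to df['Total Customers'] aligns on the index and leaves most rows as NaN, which the grouped sum then reports as 0. Below is a minimal sketch of the fix on made-up data. The toy DataFrame and its values are assumptions for illustration only, and the use of transform("min") in place of the row-wise apply is a swapped-in alternative, not the asker's code.

```python
import pandas as pd

# Hypothetical toy data standing in for the asker's DataFrame.
df = pd.DataFrame({
    "Customer ID": ["A", "A", "B", "B", "C"],
    "Date": pd.to_datetime(
        ["2014-01-05", "2015-03-01", "2014-06-10", "2014-09-12", "2015-02-20"]
    ),
    "revenue": [100.0, 50.0, 75.0, 25.0, 40.0],
})

# First purchase date per customer, broadcast back onto every row.
df["First_Purchase_Date"] = df.groupby("Customer ID")["Date"].transform("min")

# Row-level flags, as in the question.
df["New Customer"] = df["Date"] <= df["First_Purchase_Date"]
df["Existing Customer"] = df["Date"] > df["First_Purchase_Date"]
df["Year"] = df["Date"].dt.year

# Aggregate first, then derive the total from the aggregated columns.
table = df.groupby("Year")[["New Customer", "Existing Customer", "revenue"]].sum()
table["Total Customers"] = table["New Customer"] + table["Existing Customer"]
print(table)
```

Note that, as in the accepted fix, the total counts rows flagged as new or existing in each year, so a customer with several purchases in one year is counted once per purchase.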