Python pandas: calculate revenue from price and quantity
I have a dataframe that looks like the following:
df
Out[327]:
date store property_name property_value
0 2013-06-20 1 price 101
1 2013-06-20 2 price 201
2 2013-06-21 1 price 301
3 2013-06-21 2 price 401
4 2013-06-20 1 quantity 1000
5 2013-06-20 2 quantity 2000
6 2013-06-21 1 quantity 3000
7 2013-06-21 2 quantity 4000
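For anyone who wants to reproduce this, a frame with the same contents can be built roughly as below. The dtypes are an assumption (dates kept as plain strings, values as integers), since the printed output above does not show them.

```python
import pandas as pd

# Rebuild the example frame shown above.
# Assumption: dates are plain strings and property_value is an integer column.
df = pd.DataFrame({
    "date": ["2013-06-20", "2013-06-20", "2013-06-21", "2013-06-21"] * 2,
    "store": [1, 2, 1, 2] * 2,
    "property_name": ["price"] * 4 + ["quantity"] * 4,
    "property_value": [101, 201, 301, 401, 1000, 2000, 3000, 4000],
})
print(df)
```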
I would like to calculate revenue for each date and each store, then append it to the bottom of the dataframe. For example, for 2013-06-20 and store 2: revenue = 201 * 2000 = 402000.
Below is my code, but I know it is not efficient for larger dataframes:
import pandas as pd

dates = df['date'].unique()
stores = df['store'].unique()
df_len = len(df)
for date in dates:
    for store in stores:
        mask_price = (df['date']==date) & (df['store']==store) & (df['property_name']=='price')
        mask_quantity = (df['date']==date) & (df['store']==store) & (df['property_name']=='quantity')
        price = df.loc[mask_price,'property_value'].iloc[0]
        quantity = df.loc[mask_quantity,'property_value'].iloc[0]
        # append one revenue row at the next free integer label
        df.loc[df_len,'date'] = date
        df.loc[df_len,'store'] = store
        df.loc[df_len,'property_name'] = 'revenue'
        df.loc[df_len,'property_value'] = price*quantity
        df_len = df_len + 1
Thank you in advance for your help :)
This is one way.
price = df[df['property_name'] == 'price'].set_index(['date', 'store'])['property_value']
quantity = df[df['property_name'] == 'quantity'].set_index(['date', 'store'])['property_value']
rev = (price * quantity).reset_index().assign(property_name='revenue')
df = pd.concat([df, rev], ignore_index=True)
Explanation
Create price and quantity series via slicing, indexed by date and store.
Calculate rev via price * quantity, which aligns on the index; add the property_name column.
Concatenate along axis=0 by default (index).
Result
date property_name property_value store
0 2013-06-20 price 101 1
1 2013-06-20 price 201 2
2 2013-06-21 price 301 1
3 2013-06-21 price 401 2
4 2013-06-20 quantity 1000 1
5 2013-06-20 quantity 2000 2
6 2013-06-21 quantity 3000 1
7 2013-06-21 quantity 4000 2
8 2013-06-20 revenue 101000 1
9 2013-06-20 revenue 402000 2
10 2013-06-21 revenue 903000 1
11 2013-06-21 revenue 1604000 2
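To illustrate why multiplying on the index is safe, here is a small sketch that shuffles the rows first and still gets the same revenue figures. The helper name `by_key` and the rebuilt sample frame are my own assumptions for illustration, not part of the answer above.

```python
import pandas as pd

# Sample frame with the same contents as in the question (dtypes assumed).
df = pd.DataFrame({
    "date": ["2013-06-20", "2013-06-20", "2013-06-21", "2013-06-21"] * 2,
    "store": [1, 2, 1, 2] * 2,
    "property_name": ["price"] * 4 + ["quantity"] * 4,
    "property_value": [101, 201, 301, 401, 1000, 2000, 3000, 4000],
})

# Shuffle the rows so price and quantity rows no longer line up by position.
shuffled = df.sample(frac=1, random_state=0)

def by_key(frame, name):
    """Hypothetical helper: pick one property and index it by (date, store)."""
    rows = frame[frame["property_name"] == name]
    return rows.set_index(["date", "store"])["property_value"]

# Series arithmetic matches values by index label, not by row position.
rev = by_key(shuffled, "price") * by_key(shuffled, "quantity")
rev_by_key = rev.to_dict()  # keys are (date, store) tuples
print(rev_by_key[("2013-06-20", 2)])
```

Because the match is done on the (date, store) labels, the row order of the input does not affect the result.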
Another way of doing it:
prices = df[df['property_name'] == 'price']
quantities = df[df['property_name'] == 'quantity']
res = prices.merge(quantities,on=['date','store'],how='left')
res['property_value'] = res['property_value_x']*res['property_value_y']
res['property_name'] = 'revenue'
res = res[['date','store','property_name','property_value']]
# DataFrame.append was removed in pandas 2.0; pd.concat stacks the frames the same way
res = pd.concat([prices, quantities, res])
This follows the same logic as the first answer.
Hope that helps.
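For the merge-based approach, a self-contained sketch might look like the following. It is a variation rather than the answer's exact code: the `suffixes` names are my own choice, the join is an inner join instead of `how='left'`, and `validate="one_to_one"` is added on the assumption that each (date, store) pair has exactly one price and one quantity.

```python
import pandas as pd

# Sample frame with the same contents as in the question (dtypes assumed).
df = pd.DataFrame({
    "date": ["2013-06-20", "2013-06-20", "2013-06-21", "2013-06-21"] * 2,
    "store": [1, 2, 1, 2] * 2,
    "property_name": ["price"] * 4 + ["quantity"] * 4,
    "property_value": [101, 201, 301, 401, 1000, 2000, 3000, 4000],
})

prices = df[df["property_name"] == "price"]
quantities = df[df["property_name"] == "quantity"]

# validate="one_to_one" makes merge raise if a (date, store) pair
# appears more than once on either side.
merged = prices.merge(
    quantities,
    on=["date", "store"],
    how="inner",
    suffixes=("_price", "_quantity"),
    validate="one_to_one",
)

# Build the revenue rows in the same long layout as the original frame.
revenue = merged[["date", "store"]].assign(
    property_name="revenue",
    property_value=merged["property_value_price"] * merged["property_value_quantity"],
)

# Stack the revenue rows under the original rows with a fresh 0..n-1 index.
result = pd.concat([df, revenue], ignore_index=True)
print(result)
```

Explicit suffixes avoid the `_x` / `_y` column names, and the validation step guards against silently duplicated rows if the data ever contains repeated keys.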