Pandas aggregation subtraction based on column value
Suppose I have a DataFrame:
'name' 'quantity' 'day'
'A' 1 'Monday'
'A' 10 'Sunday'
'A' 5 'Friday'
'B' 2 'Monday'
'B' 30 'Sunday'
'B' 5 'Thursday'
What I need to build is another DataFrame where, for each name, I subtract the Monday quantity from the Sunday quantity. I guess I need a groupby on the name and then an agg with a function, but I am not sure how to filter so that only those two days are considered.
Following the example, the end result I seek is:
'name' 'sub_quantity'
'A' 9
'B' 28
setup
import pandas as pd
from io import StringIO
txt = """name quantity day
A 1 Monday
A 10 Sunday
A 5 Friday
B 2 Monday
B 30 Sunday
B 5 Thursday"""
df = pd.read_csv(StringIO(txt), sep=r'\s+')  # delim_whitespace is deprecated since pandas 2.2
option 1
unstack
d1 = df.set_index(['name', 'day']).quantity.unstack()
d1.Sunday.sub(d1.Monday)
name
A 9.0
B 28.0
dtype: float64
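The result above is float64 because unstacking introduces NaN for days a name does not have. unstack also raises a ValueError when a (name, day) pair occurs more than once. A minimal sketch of one way around that, assuming duplicates should be summed (that choice, and the sample frame df_dup, are assumptions for illustration and not part of the original question):

```python
import pandas as pd

# Hypothetical data: same layout as the question, plus a duplicate (A, Sunday) row.
df_dup = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B"],
    "quantity": [1, 10, 4, 2, 30],
    "day": ["Monday", "Sunday", "Sunday", "Monday", "Sunday"],
})

# pivot_table aggregates duplicate (name, day) pairs instead of raising.
wide = df_dup.pivot_table(index="name", columns="day",
                          values="quantity", aggfunc="sum")

# Sunday minus Monday, one value per name.
sub_quantity = wide["Sunday"].sub(wide["Monday"]).rename("sub_quantity")
print(sub_quantity)
```

Swapping aggfunc for "first", "max" or "mean" changes how duplicates are resolved; which one is right depends on what the data means.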
option 2
query
s = df.set_index('name').query('day == "Sunday"').quantity
m = df.set_index('name').query('day == "Monday"').quantity
s - m
name
A 9
B 28
Name: quantity, dtype: int64
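The subtraction here relies on index alignment, so a name that has a Sunday row but no Monday row comes out as NaN. A small sketch of the difference between plain subtraction and Series.sub with fill_value, assuming a missing day should count as 0 (that is an assumption; the sample frame df_missing is hypothetical):

```python
import pandas as pd

# Hypothetical data: B has no Monday row.
df_missing = pd.DataFrame({
    "name": ["A", "A", "B"],
    "quantity": [1, 10, 30],
    "day": ["Monday", "Sunday", "Sunday"],
})

# Filter first, then index by name so the two Series align on it.
s = df_missing[df_missing["day"] == "Sunday"].set_index("name")["quantity"]
m = df_missing[df_missing["day"] == "Monday"].set_index("name")["quantity"]

plain = s - m                     # B has no partner in m, so it becomes NaN
filled = s.sub(m, fill_value=0)   # missing side is treated as 0 before subtracting
print(plain)
print(filled)
```

Both results are float64, because the aligned intermediate contains a missing value.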
option 3
xs
d1 = df.set_index(['day', 'name']).quantity
d1.xs('Sunday') - d1.xs('Monday')
name
A 9
B 28
Name: quantity, dtype: int64
option 4
cute apply
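def obnoxious(x):
    # idxmax on a boolean Series returns the label of the first True
    s = x.day.eq('Sunday').idxmax()
    m = x.day.eq('Monday').idxmax()
    q = 'quantity'
    # DataFrame.get_value was removed in pandas 1.0; .at is the replacement
    return x.at[s, q] - x.at[m, q]

df.groupby('name').apply(obnoxious)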
name
A 9
B 28
dtype: int64
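The question asked specifically about a groupby with a filter. One possible sketch along those lines, not taken from the answer above: keep only the Sunday and Monday rows, give Monday a negative sign, and sum per name. The sign dictionary and the variable names are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B", "B"],
    "quantity": [1, 10, 5, 2, 30, 5],
    "day": ["Monday", "Sunday", "Friday", "Monday", "Sunday", "Thursday"],
})

# Sunday counts positively, Monday negatively; every other day is dropped.
sign = {"Sunday": 1, "Monday": -1}
subset = df[df["day"].isin(list(sign))]

# Multiply each quantity by its sign, then sum within each name.
signed = subset["quantity"] * subset["day"].map(sign)
result = (signed.groupby(subset["name"])
                .sum()
                .rename("sub_quantity")
                .reset_index())
print(result)
```

Unlike the options above, this version sums duplicate rows for the same day and treats a missing Sunday or Monday as 0, which may or may not be the behaviour you want.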