Pandas aggregate subtraction based on column values

Question

Suppose I have a DataFrame:

'name'     'quantity'   'day'
'A'         1           'Monday'
'A'         10          'Sunday'
'A'         5           'Friday'
'B'         2           'Monday'
'B'         30          'Sunday'
'B'         5           'Thursday'

What I need to build is another DataFrame in which, for each name, the Monday quantity is subtracted from the Sunday quantity. I guess I need a groupby on name followed by an agg with a function, but I am not sure how to filter so that only those two days are considered.

Following the example, the end result I seek is:

'name'     'sub_quantity'
'A'         9 
'B'         28

Answer 1

setup

import pandas as pd
from io import StringIO

txt = """name     quantity   day
A         1           Monday
A         10          Sunday
A         5           Friday
B         2           Monday
B         30          Sunday
B         5           Thursday"""

# sep=r'\s+' splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(StringIO(txt), sep=r'\s+')

option 1
unstack

d1 = df.set_index(['name', 'day']).quantity.unstack()

d1.Sunday.sub(d1.Monday)

name
A     9.0
B    28.0
dtype: float64
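The float64 result above comes from unstack filling the missing name/day pairs (A has no Thursday, B has no Friday) with NaN. A minimal sketch of one way around that, assuming only the Sunday and Monday rows matter: filter to those two days before reshaping, so no gaps are introduced. The variable names `wanted`, `wide` and `result` are illustrative, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B", "B"],
    "quantity": [1, 10, 5, 2, 30, 5],
    "day": ["Monday", "Sunday", "Friday", "Monday", "Sunday", "Thursday"],
})

# Keep only the two days of interest before reshaping
wanted = df[df["day"].isin(["Sunday", "Monday"])]

# One row per name, one column per remaining day
wide = wanted.set_index(["name", "day"])["quantity"].unstack()

# Sunday minus Monday, shaped like the table requested in the question
result = (wide["Sunday"] - wide["Monday"]).rename("sub_quantity").reset_index()
print(result)
```

If some name lacks one of the two days, its entry would still come out as NaN, so this only sidesteps the float conversion when every name has both days.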

option 2
query

s = df.set_index('name').query('day == "Sunday"').quantity
m = df.set_index('name').query('day == "Monday"').quantity
s - m

name
A     9
B    28
Name: quantity, dtype: int64

option 3
xs

d1 = df.set_index(['day', 'name']).quantity
d1.xs('Sunday') - d1.xs('Monday')

name
A     9
B    28
Name: quantity, dtype: int64
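For something closer to the groupby-then-aggregate shape the question asks about, here is a rough sketch (not one of the original options): give Sunday a weight of +1 and Monday a weight of -1, then sum per name. The names `sign` and `signed` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B", "B"],
    "quantity": [1, 10, 5, 2, 30, 5],
    "day": ["Monday", "Sunday", "Friday", "Monday", "Sunday", "Thursday"],
})

# +1 for Sunday, -1 for Monday; every other day maps to NaN
sign = df["day"].map({"Sunday": 1, "Monday": -1})

# Rows for other days become NaN and are skipped by sum()
signed = df["quantity"] * sign

result = (
    signed.groupby(df["name"])
    .sum()
    .astype(int)
    .rename("sub_quantity")
    .reset_index()
)
print(result)
```

One caveat with this approach: a name that has only one of the two days still gets a number (just the Sunday value, or the negated Monday value) rather than NaN, so it assumes both days are present for every name.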

option 4
cute apply

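def obnoxious(x):
    # idxmax on a boolean Series returns the label of the first True
    s = x.day.eq('Sunday').idxmax()
    m = x.day.eq('Monday').idxmax()
    q = 'quantity'
    # get_value was removed in pandas 1.0; .at is the scalar accessor
    return x.at[s, q] - x.at[m, q]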

df.groupby('name').apply(obnoxious)

name
A     9
B    28
dtype: int64
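One thing to watch with the idxmax trick: if a group has no Sunday (or no Monday) row, the boolean Series is all False and idxmax still returns the first label, so the function would quietly subtract unrelated rows. A hedged sketch of a guarded variant follows; the helper name `sunday_minus_monday` and the extra group C are invented for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "A", "B", "B", "C"],
    "quantity": [1, 10, 2, 30, 7],
    "day": ["Monday", "Sunday", "Monday", "Sunday", "Friday"],
})

def sunday_minus_monday(group):
    # Hypothetical helper: select each day's quantity explicitly
    sunday = group.loc[group["day"].eq("Sunday"), "quantity"]
    monday = group.loc[group["day"].eq("Monday"), "quantity"]
    # Return NaN when either day is missing instead of using the wrong row
    if sunday.empty or monday.empty:
        return float("nan")
    return sunday.iloc[0] - monday.iloc[0]

# Selecting the columns explicitly keeps the grouping column out of apply
result = df.groupby("name")[["day", "quantity"]].apply(sunday_minus_monday)
print(result)
```

Here group C has neither day, so it ends up as NaN rather than a misleading number.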


Answer 2

Solution with pivot, subtracting with sub:

# DataFrame.pivot takes column names; one row per name, one column per day
df = df.pivot(index='name', columns='day', values='quantity')
print(df.Sunday.sub(df.Monday).reset_index(name='sub_quantity'))
  name  sub_quantity
0    A           9.0
1    B          28.0
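pivot raises a ValueError when the same name/day pair appears more than once. If that can happen in the real data, pivot_table with an aggregation function is one possible alternative. The sketch below assumes duplicates should be summed, which is a guess about the intended semantics; the duplicated Sunday row for A is invented for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B"],
    "quantity": [1, 10, 4, 2, 30],
    "day": ["Monday", "Sunday", "Sunday", "Monday", "Sunday"],
})

# aggfunc="sum" collapses the two Sunday rows for A into one cell (10 + 4)
wide = df.pivot_table(index="name", columns="day", values="quantity", aggfunc="sum")

result = wide["Sunday"].sub(wide["Monday"]).reset_index(name="sub_quantity")
print(result)
```

Swapping "sum" for "first", "last" or "mean" changes how duplicates are resolved.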

Answer 1
4 (accepted) 2016-11-28 15:29:25

Answer 2
3 2016-11-28 15:42:16