Create two columns from the same column but in different ways

Question

From the table below, I would like to create two columns that aggregate 'amount' depending on the values of 'number' and 'type'.

number	type	amount
1	A	10
1	A	20
2	A	10
3	B	20
2	B	10
1	B	20

Here's the table I would like to get. The first column I want to create is 'amount A', which is the sum of 'amount' over the rows with 'A' in 'type', grouped by 'number'. The other one, 'amount A+B', is the sum of 'amount' over all rows grouped by 'number', regardless of the value of 'type'.

number	amount A	amount A+B
1	30	50
2	10	20
3	0	20

The only approach I came up with is to create subsets and build the two columns separately, but I wonder if there is a more efficient way.

Answer 1

You can try this:

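import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'number': [1, 1, 2, 3, 2, 1],
    'type': ['A', 'A', 'A', 'B', 'B', 'B'],
    'amount': [10, 20, 10, 20, 10, 20],
})

out = (
    df.astype({'number': 'category'})
    .query('type == "A"')
    # observed=False is passed explicitly so that categories with no
    # 'A' rows (here: 3) are kept; the default differs across pandas versions
    .groupby(['number'], observed=False)['amount'].sum()
    .to_frame('amount A')
)

# The assignment aligns on the 'number' index
out['amount A+B'] = df.groupby('number')['amount'].sum()

print(out)
        amount A  amount A+B
number                      
1             30          50
2             10          20
3              0          20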

One of the tricks is to convert the 'number' column to a categorical, so that the result contains a sum for every number even if that number never appears with type 'A'.
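To see what the categorical conversion buys, here is a small sketch that compares the two groupings on the question's data. The names `only_a`, `plain` and `cat` are just illustrative, and a boolean mask is used in place of `query` purely for brevity.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "number": [1, 1, 2, 3, 2, 1],
    "type": ["A", "A", "A", "B", "B", "B"],
    "amount": [10, 20, 10, 20, 10, 20],
})

# Plain integer key: number 3 has no 'A' rows, so it is absent from the result.
only_a = df[df["type"] == "A"]
plain = only_a.groupby("number")["amount"].sum()

# Categorical key: the categories (1, 2, 3) are fixed before filtering,
# so observed=False still reports number 3, with an empty sum.
df_cat = df.astype({"number": "category"})
cat = (
    df_cat[df_cat["type"] == "A"]
    .groupby("number", observed=False)["amount"]
    .sum()
)

print(plain)
print(cat)
```

With the plain integer key the result only has entries for 1 and 2, while the categorical version also has an entry for 3 whose sum is 0.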

Once we do that, we can very easily perform a groupby across the numbers both with and without restricting to the rows where type == "A".
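If a single groupby is preferred, one possible alternative (not the approach used in the answer above) is to zero out the non-'A' amounts in a helper column and aggregate both columns in one pass. This is only a sketch under the assumption that every row's 'amount' is numeric; the helper column name `amount_a` is made up for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "number": [1, 1, 2, 3, 2, 1],
    "type": ["A", "A", "A", "B", "B", "B"],
    "amount": [10, 20, 10, 20, 10, 20],
})

out = (
    # Helper column (hypothetical name): the amount for 'A' rows, 0 otherwise
    df.assign(amount_a=df["amount"].where(df["type"] == "A", 0))
    .groupby("number")
    # Named aggregation; the dict unpacking allows column names with spaces
    .agg(**{
        "amount A": ("amount_a", "sum"),
        "amount A+B": ("amount", "sum"),
    })
)

print(out)
```

Because every 'number' is present in the full table, no categorical conversion is needed here; numbers without 'A' rows simply sum their zeros.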

Answer 1 (accepted): 2022-11-22 14:42:32
