Group a DataFrame by one column and sum the values of selected items in another column
My current DataFrame is:
import pandas as pd

dd = [[1001,'green apple',1,7],[1001,'red apple',1,2],[1001,'grapes',1,5],[1002,'green apple',2,4],[1002,'red apple',2,4],[1003,'red apple',3,8],[1004,'mango',4,2],[1004,'red apple',4,6]]
df = pd.DataFrame(dd, columns=['colID','colString','custID','colQuantity'])
   colID    colString  custID  colQuantity
0   1001  green apple       1            7
1   1001    red apple       1            2
2   1001       grapes       1            5
3   1002  green apple       2            4
4   1002    red apple       2            4
5   1003    red apple       3            8
6   1004        mango       4            2
7   1004    red apple       4            6
So far I have only managed to filter the rows that contain red and green apples, using this code:
selection = ['green apple','red apple']
# True for rows whose colString contains any of the selected items
mask = df.colString.apply(lambda x: any(item in x for item in selection))
df = df[mask]
Current output:
   colID    colString  custID  colQuantity
0   1001  green apple       1            7
1   1001    red apple       1            2
3   1002  green apple       2            4
4   1002    red apple       2            4
5   1003    red apple       3            8
7   1004    red apple       4            6
The final output I want is the sum of the green apple and red apple quantities that share the same colID:
colID  custID  colQuantity
 1001       1            9
 1002       2            8
You can use isin to index the DataFrame and then groupby.sum:
# Keep the selected items, then sum colQuantity per (colID, colString) pair
(df[df.colString.isin(['green apple', 'red apple'])]
    .groupby(['colID','colString'], as_index=False)['colQuantity']
    .sum())
   colID    colString  colQuantity
0   1001  green apple            7
1   1001    red apple            2
2   1002  green apple            4
3   1002    red apple            4
4   1003    red apple            8
5   1004    red apple            6
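Note that this grouped result still has one row per colID and colString pair, so it does not yet match the desired output in the question, which lists only 1001 and 1002. The sketch below is one possible way to get there, not the answerer's method. It assumes the intent is to keep only the colID values that contain every item in selection, and then sum colQuantity per colID and custID. The names filtered, n_items, complete and result are introduced here for illustration.

```python
import pandas as pd

dd = [[1001, 'green apple', 1, 7], [1001, 'red apple', 1, 2],
      [1001, 'grapes', 1, 5], [1002, 'green apple', 2, 4],
      [1002, 'red apple', 2, 4], [1003, 'red apple', 3, 8],
      [1004, 'mango', 4, 2], [1004, 'red apple', 4, 6]]
df = pd.DataFrame(dd, columns=['colID', 'colString', 'custID', 'colQuantity'])

selection = ['green apple', 'red apple']

# Keep only the rows whose colString is exactly one of the selected items.
filtered = df[df['colString'].isin(selection)]

# Assumption: a colID qualifies only if it contains every selected item.
# Count the distinct selected items per colID, aligned to the filtered rows.
n_items = filtered.groupby('colID')['colString'].transform('nunique')
complete = filtered[n_items == len(selection)]

# Sum the quantities per colID (custID is carried along as a grouping key).
result = (complete
          .groupby(['colID', 'custID'], as_index=False)['colQuantity']
          .sum())
print(result)
```

With the sample data this leaves two rows, 1001 with a total of 9 and 1002 with a total of 8. If every colID with at least one selected item should be kept instead, the n_items filter can be dropped and the groupby applied to filtered directly.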