Pandas Group by sum of all the values of the group and another column as comma separated

I want to group by one column (tag) and sum up the corresponding quantities (qty). The related reference numbers (reference column) should be joined into a single comma-separated string per tag.

import pandas as pd

tag = ['PO_001045M100960','PO_001045M100960','PO_001045MSP2526','PO_001045M870191', 'PO_001045M870191', 'PO_001045M870191']
reference= ['PA_000003', 'PA_000005', 'PA_000001', 'PA_000002', 'PA_000004', 'PA_000009']
qty=[4,2,2,1,1,1]

df = pd.DataFrame({'tag' : tag, 'reference':reference, 'qty':qty})

      tag           reference   qty
PO_001045M100960    PA_000003   4
PO_001045M100960    PA_000005   2
PO_001045MSP2526    PA_000001   2
PO_001045M870191    PA_000002   1
PO_001045M870191    PA_000004   1
PO_001045M870191    PA_000009   1

If I use df.groupby('tag')['qty'].sum().reset_index(), I get the following result.

      tag           qty
PO_001045M100960    6
PO_001045M870191    3
PO_001045MSP2526    2

I need an additional column where the reference numbers are listed under their respective tags, like this:

      tag           qty   reference
PO_001045M100960    6     PA_000003, PA_000005
PO_001045M870191    3     PA_000002, PA_000004, PA_000009
PO_001045MSP2526    2     PA_000001

How can I achieve this?

Thanks.

Use pandas.DataFrame.groupby followed by agg, passing one aggregation per column:

df.groupby('tag').agg({'qty': 'sum', 'reference': ', '.join})

Output:

                                        reference  qty
tag                                                   
PO_001045M100960             PA_000003, PA_000005    6
PO_001045M870191  PA_000002, PA_000004, PA_000009    3
PO_001045MSP2526                        PA_000001    2
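
To get the flat layout asked for in the question (tag as a regular column, with qty before reference), the same aggregation can be followed by reset_index(). The sketch below is self-contained and assumes pandas 0.25 or later, since it uses named aggregation to fix the column order; the variable name result is only for illustration.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'tag': ['PO_001045M100960', 'PO_001045M100960', 'PO_001045MSP2526',
            'PO_001045M870191', 'PO_001045M870191', 'PO_001045M870191'],
    'reference': ['PA_000003', 'PA_000005', 'PA_000001',
                  'PA_000002', 'PA_000004', 'PA_000009'],
    'qty': [4, 2, 2, 1, 1, 1],
})

# Named aggregation: output column = (input column, aggregation).
# 'sum' adds up the quantities; ', '.join concatenates the reference strings.
result = (
    df.groupby('tag')
      .agg(qty=('qty', 'sum'), reference=('reference', ', '.join))
      .reset_index()  # turn the 'tag' index back into a regular column
)

print(result)
```

groupby sorts the group keys by default, so the rows come out ordered by tag; pass sort=False to groupby if the order of first appearance should be kept instead.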

Note: if the reference column is numeric, ', '.join will not work, because str.join only accepts strings. In that case, use lambda x: ', '.join(str(i) for i in x)
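
A small sketch of the numeric case follows. The data here is hypothetical (short tags 'A' and 'B' with integer references) and is not taken from the question; it only shows the lambda converting each value to a string before joining.

```python
import pandas as pd

# Hypothetical data: same shape as the question, but reference holds integers.
df_num = pd.DataFrame({
    'tag': ['A', 'A', 'B'],
    'reference': [3, 5, 1],
    'qty': [4, 2, 2],
})

# str(i) converts each number to text so that ', '.join accepts it.
result_num = df_num.groupby('tag').agg(
    {'qty': 'sum', 'reference': lambda x: ', '.join(str(i) for i in x)}
)

print(result_num)
```

Another option under the same assumption is to convert the column up front with df_num['reference'].astype(str) and then use ', '.join directly.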
