Create a column of list from another column and display only unique values in pandas dataframe
I am new to pandas and am trying to use groupby to create a list in a new column. My DataFrame has 3 columns, and I added a 4th column (New_List) that holds a list built from another column, as shown in the table below, using this code:
new_df = df.join(pd.Series(df.groupby(by='NO_ACCOUNTS').apply(lambda x: list(x.Bucket)), name="list_of_b"), on='NO_ACCOUNTS')
Account_Number  Bucket  Number_Transactions  New_List
ABA             APP     155                  [APP]
ABC             APP     1352                 [APP]
AAA             APP     90                   [API,APP]
AAA             API     5                    [API,APP]
I would like to get the following output with 3 columns:
Account_Number  Number_Transactions  New_List
ABA             155                  [APP]
ABC             1352                 [APP]
AAA             95                   [API,APP]
You can aggregate the two columns:
out = (df.groupby("Account_Number", sort=False, as_index=False)
         .agg(Number_Transactions=("Number_Transactions", "sum"),
              New_List=("Bucket", list)))
This first groups by Account_Number, keeping the groups in their original order with sort=False and keeping Account_Number as a regular column rather than the index with as_index=False. It then aggregates the Number_Transactions column by summation and assigns the result to a column of the same name. Similarly, it aggregates the Bucket column with list and assigns the result to the New_List column in the output. The result is:
>>> out
  Account_Number  Number_Transactions    New_List
0            ABA                  155       [APP]
1            ABC                 1352       [APP]
2            AAA                   95  [APP, API]
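For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the question's table, so the dtypes are an assumption. The second aggregation is also not part of the answer above: it is one possible way to keep only unique Bucket values per account, sorted so that the AAA row comes out as [API, APP] as in the desired output. The helper name unique_sorted is made up for illustration.

```python
import pandas as pd

# Sample data copied by hand from the question's table (assumed dtypes).
df = pd.DataFrame({
    "Account_Number": ["ABA", "ABC", "AAA", "AAA"],
    "Bucket": ["APP", "APP", "APP", "API"],
    "Number_Transactions": [155, 1352, 90, 5],
})

# The aggregation from the answer: sum the transactions, collect buckets in a list.
out = (df.groupby("Account_Number", sort=False, as_index=False)
         .agg(Number_Transactions=("Number_Transactions", "sum"),
              New_List=("Bucket", list)))
print(out)


def unique_sorted(values):
    """Hypothetical helper: drop duplicate values and return them sorted."""
    return sorted(set(values))


# Variant (an assumption about what "unique values" should mean here):
# the same grouping, but each list holds distinct buckets in sorted order.
out_unique = (df.groupby("Account_Number", sort=False, as_index=False)
                .agg(Number_Transactions=("Number_Transactions", "sum"),
                     New_List=("Bucket", unique_sorted)))
print(out_unique)
```

With the sample data no account repeats a bucket, so the only visible difference in the variant is the ordering inside the list for AAA. If the order of first appearance matters more than alphabetical order, list(dict.fromkeys(values)) could be used in the helper instead of sorted(set(values)).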