I am new to pandas and am trying to use groupby to create a list in a new column. My DataFrame has 3 columns, and I created a 4th column (New_List) holding a list built from another column, using the code below:
new_df = df.join(pd.Series(df.groupby(by='NO_ACCOUNTS').apply(lambda x: list(x.Bucket)), name="list_of_b"), on='NO_ACCOUNTS')
Account_Number  Bucket  Number_Transactions  New_List
ABA             APP     155                  [APP]
ABC             APP     1352                 [APP]
AAA             APP     90                   [API,APP]
AAA             API     5                    [API,APP]
I am looking to get the desired output with 3 columns:
Account_Number  Number_Transactions  New_List
ABA             155                  [APP]
ABC             1352                 [APP]
AAA             95                   [API,APP]
You can aggregate the two columns:
out = (df.groupby("Account_Number", sort=False, as_index=False)
         .agg(Number_Transactions=("Number_Transactions", "sum"),
              New_List=("Bucket", list)))
This first groups by Account_Number, keeping the groups in their original order with sort=False and keeping the key as a regular column rather than the index with as_index=False. It then aggregates the Number_Transactions column by summation and assigns the result to a column of the same name. Similarly, it aggregates the Bucket column with list and assigns the result to the New_List column in the output. This gives:
>>> out
  Account_Number  Number_Transactions    New_List
0            ABA                  155       [APP]
1            ABC                 1352       [APP]
2            AAA                   95  [APP, API]
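
For anyone who wants to run this end to end, here is a minimal, self-contained sketch. The sample DataFrame is reconstructed from the table in the question, so treat it as an assumption about the real data. The second aggregation, out_sorted, is not part of the answer above: it is a hypothetical variant for the case where the lists should be de-duplicated and alphabetically ordered, as in the desired output ([API, APP]), instead of following row order ([APP, API]).

```python
import pandas as pd

# Sample data reconstructed from the question's table (an assumption).
df = pd.DataFrame({
    "Account_Number": ["ABA", "ABC", "AAA", "AAA"],
    "Bucket": ["APP", "APP", "APP", "API"],
    "Number_Transactions": [155, 1352, 90, 5],
})

# Named aggregation: one row per account, groups kept in order of first
# appearance (sort=False), group key kept as a column (as_index=False).
out = (df.groupby("Account_Number", sort=False, as_index=False)
         .agg(Number_Transactions=("Number_Transactions", "sum"),
              New_List=("Bucket", list)))

# Hypothetical variant: sorted, de-duplicated bucket names per account.
out_sorted = (df.groupby("Account_Number", sort=False, as_index=False)
                .agg(Number_Transactions=("Number_Transactions", "sum"),
                     New_List=("Bucket", lambda s: sorted(set(s)))))

print(out)
print(out_sorted)
```

Using list keeps every bucket value in row order, duplicates included, while the sorted(set(...)) variant drops duplicates, so which one fits depends on what New_List is meant to represent.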