
Create a column of list from another column and display only unique values in pandas dataframe

I am new to pandas and am trying to use groupby to build a list in a new column. My DataFrame has 3 columns, and I added a 4th column (New_List) holding a list built from the Bucket column. The code below produces the table that follows:

new_df = df.join(pd.Series(df.groupby(by='Account_Number').apply(lambda x: list(x.Bucket)), name="New_List"), on='Account_Number')

Account_Number   Bucket  Number_Transactions     New_List
   ABA            APP          155                 [APP]
   ABC            APP          1352                [APP]
   AAA            APP          90                  [API,APP]
   AAA            API          5                   [API,APP]

I am looking to get the desired output with 3 columns:

Account_Number     Number_Transactions     New_List
   ABA                      155                 [APP]
   ABC                      1352                [APP]
   AAA                      95                  [API,APP]

You can aggregate the two columns:

out = (df.groupby("Account_Number", sort=False, as_index=False)
         .agg(Number_Transactions=("Number_Transactions", "sum"),
              New_List=("Bucket", list)))

This groups by Account_Number, where sort=False keeps the groups in their order of first appearance and as_index=False keeps Account_Number as a regular column instead of making it the index. It then sums the Number_Transactions column into an output column of the same name, and collects each group's Bucket values into a list that is assigned to the New_List column. This gives:

>>> out

  Account_Number  Number_Transactions    New_List
0            ABA                  155       [APP]
1            ABC                 1352       [APP]
2            AAA                   95  [APP, API]
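
Note that aggregating with list keeps every Bucket value, so an account with repeated buckets would get duplicates in New_List. Since the question title asks for unique values only, here is a minimal sketch of one way to deduplicate. The sample frame is rebuilt from the question's table, and unique_sorted is a hypothetical helper name, not part of pandas. It assumes the Bucket values are sortable (plain strings here).

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "Account_Number": ["ABA", "ABC", "AAA", "AAA"],
    "Bucket": ["APP", "APP", "APP", "API"],
    "Number_Transactions": [155, 1352, 90, 5],
})


def unique_sorted(values):
    """Hypothetical helper: distinct values of one group, as a sorted list."""
    return sorted(set(values))


out = (df.groupby("Account_Number", sort=False, as_index=False)
         .agg(Number_Transactions=("Number_Transactions", "sum"),
              New_List=("Bucket", unique_sorted)))

print(out)
```

Sorting also makes the order inside each list deterministic, which matches the [API, APP] order shown in the desired output. If the original order of appearance matters more, something like list(dict.fromkeys(values)) inside the helper would drop duplicates while preserving it.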

