
Group by all the columns except the first one, but aggregate the first column as a list

Let's say, I have this dataframe:

import pandas as pd

df = pd.DataFrame({'col_1': ['yes', 'no'], 'test_1': ['a', 'b'], 'test_2': ['a', 'b']})

I want to group by all the columns except the first one, and collect the first column's values for rows whose group keys are the same.

This is what I'm trying:

col_names = df.columns.to_list()

df_out = df.groupby([col_names[1:]])[col_names[0]].agg(list)
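
One likely culprit in the attempt above is the extra pair of brackets: `[col_names[1:]]` is a list containing a list, whereas `groupby` is normally given a flat list of column labels. A minimal sketch of the corrected call follows. The sample rows are an assumption: they are chosen so that both rows share the same `test_1`/`test_2` values, because the frame in the question as written would produce two separate groups.

```python
import pandas as pd

# Assumed sample data: both rows share the same test_1/test_2 values.
df = pd.DataFrame({'col_1': ['yes', 'no'], 'test_1': ['a', 'a'], 'test_2': ['b', 'b']})

col_names = df.columns.to_list()

# Pass the flat list of key columns (no extra brackets), then collect col_1 into lists.
df_out = df.groupby(col_names[1:])[col_names[0]].agg(list).reset_index()

# Restore the original column order (col_1 first).
df_out = df_out[col_names]
```

After `reset_index()` the key columns come first, so the final reindexing step is only there to put `col_1` back in front.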

This is the data frame I want to end up with:

df = pd.DataFrame({'col_1': [['yes','no']], 'test_1':['a'], 'test_2':['b']})

If I have more rows, I want the same principle to apply: rows whose values match in every column from the second to the last (col_names[1:]) should have their first-column values joined into a list.

Using pandas agg() method

df = df.groupby(df.columns.difference(["col_1"]).tolist()).agg(
    lambda x: x.tolist()).reset_index()
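
To see how this behaves with more rows, here is a small sketch on hypothetical data (the three rows and the value `'maybe'` are made up for illustration). It builds the key list with a list comprehension rather than `columns.difference`, since `difference` sorts the labels by default and may therefore change the order of the key columns.

```python
import pandas as pd

# Hypothetical data with more rows; rows 0 and 2 share the same key columns.
df = pd.DataFrame({
    'col_1': ['yes', 'no', 'maybe'],
    'test_1': ['a', 'b', 'a'],
    'test_2': ['x', 'y', 'x'],
})

# Every column except col_1, kept in the original column order.
keys = [c for c in df.columns if c != 'col_1']

# Group on the key columns and collect the remaining column into lists.
out = df.groupby(keys).agg(lambda x: x.tolist()).reset_index()
```

Each distinct combination of `test_1` and `test_2` becomes one row in `out`, and `col_1` holds the list of values that fell into that group.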


 