简体   繁体   中英

Groupby pandas dataframe keeping unique values for some columns and list other columns

I want to group the following output by material_id keeping the unique values of material_description and MPN, but list the plant_id. picture for reference

def search_output(materials):
 df=pd.DataFrame(materials)
 df_ref = df.loc[:, df.columns!='@search.score'].groupby('material_id').agg({lambda 
 x:list(x)})
 return df_ref

This currently groups by material_id and list other columns.

The following code i use to keep unique values grouped by material_id, but now I am missing the plant_id list column.

df_t = df.loc[:, df.columns!='@search.score'].groupby('material_id' ['material_description','MPN'].agg(['unique'])

picture for reference#2

I'm looking for a way to combine the two. A way to group by a column, keep unique values of specific columns and list other columns at the same time.

Hope you can help - and sorry for the pictures, but can't figure out how to add output otherwise:)

You can create dictionary by lists - first for aggregation by unique and for all another columns by list with dict.fromkeys , join them an pass to GroupBy.agg :

print (df)
   material_id material_description MPN  A  B
0            1               descr1   a  b  c
1            1               descr2   a  d  e
2            1               descr1   b  b  c
3            2               descr3   a  b  c
4            2               descr4   a  b  c
5            2               descr4   a  b  c


u_cols = ['material_description','MPN']
d = {c: 'unique' if c in u_cols else list for c in df.columns.drop('material_id')}

df_ref = df.loc[:, df.columns!='@search.score'].groupby('material_id').agg(d)
print (df_ref)
            material_description     MPN          A          B
material_id                                                   
1               [descr1, descr2]  [a, b]  [b, d, b]  [c, e, c]
2               [descr3, descr4]     [a]  [b, b, b]  [c, c, c]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM