Groupby one column and apply 2 columns into list pandas

Question

I have a dataframe:

id  name value    flag
1    a     x        F
1    b     y        A
2    c     z        B
3    d     m        Q

I want to group by id and collect the value column into a new column as a list.

I can do:

df.groupby('id')['value'].apply(list).reset_index()

Is there a way to group by 'id' but collect two columns (name and value) into a list?

My desired output:


id    col
 1    [[a,x],[b,y]]
 2    [[c,z]]
 3    [[d,m]]

Answer 1

Convert the columns to a NumPy array with .values and then to nested lists, either inside the groupby or separately as a new Series:

# Parentheses let the method chain span several lines
df = (df.groupby('id')
        .apply(lambda x: x[['name','value']].values.tolist())
        .reset_index(name='col'))
print(df)
   id               col
0   1  [[a, x], [b, y]]
1   2          [[c, z]]
2   3          [[d, m]]

Or:

s = pd.Series(df[['name','value']].values.tolist(), index=df.index)
df = s.groupby(df['id']).apply(list).reset_index(name='col')
print (df)
   id               col
0   1  [[a, x], [b, y]]
1   2          [[c, z]]
2   3          [[d, m]]

Alternatively, if tuples inside the lists are acceptable:

s = pd.Series(list(zip(df['name'],df['value'])), index=df.index)
df = s.groupby(df['id']).apply(list).reset_index(name='col')
print (df)
   id               col
0   1  [(a, x), (b, y)]
1   2          [(c, z)]
2   3          [(d, m)]
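
Recent pandas releases warn when apply operates on the grouping column itself, so selecting the two columns before calling apply may be preferable. The following is a minimal sketch of that variant, assuming the sample frame from the question; the name `out` is only illustrative.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    'id': [1, 1, 2, 3],
    'name': ['a', 'b', 'c', 'd'],
    'value': ['x', 'y', 'z', 'm'],
    'flag': ['F', 'A', 'B', 'Q'],
})

# Select the two columns first, so each group passed to apply
# contains only 'name' and 'value'
out = (
    df.groupby('id')[['name', 'value']]
      .apply(lambda g: g.values.tolist())
      .reset_index(name='col')
)
print(out)
```

Writing the result to a separate name also keeps the original df intact for the other variants.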

Answer 2

Use zip inside apply, i.e.:

df.groupby('id').apply(lambda x: list(zip(x['name'],x['value'])))

id
1    [(a, x), (b, y)]
2            [(c, z)]
3            [(d, m)]
dtype: object

To match your exact output, use to_frame and reset_index, i.e.:

df.groupby('id').apply(lambda x: list(zip(x['name'],x['value']))).to_frame('col').reset_index()

   id               col
0   1  [(a, x), (b, y)]
1   2          [(c, z)]
2   3          [(d, m)]
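
The output above holds tuples, whereas the question shows nested lists. If lists are required, each zipped pair can be converted explicitly. This is a sketch assuming the sample frame from the question; `out` is an illustrative name.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    'id': [1, 1, 2, 3],
    'name': ['a', 'b', 'c', 'd'],
    'value': ['x', 'y', 'z', 'm'],
    'flag': ['F', 'A', 'B', 'Q'],
})

# zip pairs name with value row by row; list(pair) turns each tuple into a list
out = (
    df.groupby('id')[['name', 'value']]
      .apply(lambda g: [list(pair) for pair in zip(g['name'], g['value'])])
      .to_frame('col')
      .reset_index()
)
print(out)
```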

Answer 3

You can use numpy's stack function to convert the two columns to one column of lists, and then use pandas' own groupby function.

Imports and building dataframe:

import pandas as pd
import numpy as np

df = pd.DataFrame(
    [[1,'a','x','F'],
     [1,'b','y','A'],
     [2,'c','z','B'],
     [3,'d','m','Q']],
    columns=['id','name','value','flag']
).set_index('id')

The function:

df.assign(col=list(np.stack(df[['name','value']].values))) \
    .groupby(level=0)['col'].apply(list).to_frame()

Which returns:

                 col
id                 
1   [[a, x], [b, y]]
2           [[c, z]]
3           [[d, m]]
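
One thing to be aware of: as far as I can tell, list(np.stack(...)) produces NumPy arrays, so the inner pairs above are arrays rather than Python lists even though they print the same way. If plain lists are needed, .values.tolist() can be used instead. A sketch under that assumption, with `pairs` and `out` as illustrative names:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 'a', 'x', 'F'],
     [1, 'b', 'y', 'A'],
     [2, 'c', 'z', 'B'],
     [3, 'd', 'm', 'Q']],
    columns=['id', 'name', 'value', 'flag'],
).set_index('id')

# One [name, value] list per row, as plain Python lists
pairs = df[['name', 'value']].values.tolist()

# Attach the pairs as a column, then collect them per index value
out = df.assign(col=pairs).groupby(level=0)['col'].apply(list).to_frame()
print(out)
```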

Answer 4

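Corrected version of my earlier solution, shown on dummy data. Note that each group ends up as a tuple of (x, y) tuples rather than a list:

import pandas as pd

# Dummy frame: 'i' is the grouping key, 'x' and 'y' are the columns to pair up
df = pd.DataFrame({"i": [i % 3 for i in range(20)], "x": range(20), "y": range(20)})
df = (
    df.groupby('i')
      .apply(lambda group: tuple(zip(group['x'], group['y'])))  # receives a sub-frame per group
      .reset_index()  # the paired column is named 0 by default
)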

4 answers

solution1
3 ACCEPTED 2017-11-30 12:56:42

solution2
2 2017-11-30 12:10:39

solution3
1 2017-11-30 12:25:01

solution4
0 2017-11-30 11:45:50