Python Pandas - Group and join lines with multiple columns

Question

I am working with a dataframe that has some rows that need to be grouped (with a join) using a key.

Basically I have this dataframe:

d = {'process': [1, 2, 2, 3, 3], 'notes_txt': ['TESTE 1', 'TESTE A ', 'TESTE A ',  'TESTE B ', 'TESTE B '],'notes_cont': ['Process 1: 0 errors', 'Process 1:', '0 errors',  'Process 2:', '5 errors'], 'notes_cont_pt2': ['via script', 'via script 1', 'script 2', 'via script 2', 'script 5']}
df = pd.DataFrame(data=d)

And my desired output is:

I tried this code for a single column, and it works fine:

import pandas as pd

d = {'process': [1, 2, 2, 3, 3], 'notes_txt': ['TESTE 1', 'TESTE A ', 'TESTE A ',  'TESTE B ', 'TESTE B '],'notes_cont': ['Process 1: 0 errors', 'Process 1:', '0 errors',  'Process 2:', '5 errors'], 'notes_cont_pt2': ['via script', 'via script 1', 'script 2', 'via script 2', 'script 5']}
df = pd.DataFrame(data=d)
df = df.groupby(['process','notes_txt'])['notes_cont'].apply(' '.join).reset_index()
print(df)

Grouping works when I join one column, but when I try to join two columns I get this error:

Traceback (most recent call last):
    df = df.groupby(['process','notes_txt'])['notes_cont']['notes_cont_pt2'].apply(' '.join).reset_index()
  File "base.py", line 258, in __getitem__
    .format(selection=self._selection))
IndexError: Column(s) notes_cont already selected
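A note on the traceback: `['notes_cont']` already narrows the grouped object to a single column, so a second `[...]` on it is rejected. The sketch below reproduces that and shows that passing both names in one list is accepted instead. The variable names `grouped`, `chained_error`, `both` and `joined` are only illustrative, and the single-space separator is an assumption carried over from the one-column code.

```python
import pandas as pd

d = {'process': [1, 2, 2, 3, 3],
     'notes_txt': ['TESTE 1', 'TESTE A ', 'TESTE A ', 'TESTE B ', 'TESTE B '],
     'notes_cont': ['Process 1: 0 errors', 'Process 1:', '0 errors', 'Process 2:', '5 errors'],
     'notes_cont_pt2': ['via script', 'via script 1', 'script 2', 'via script 2', 'script 5']}
df = pd.DataFrame(data=d)

grouped = df.groupby(['process', 'notes_txt'])

# Selecting a column twice in a chain fails: the first [...] already
# fixed the selection to 'notes_cont'.
try:
    grouped['notes_cont']['notes_cont_pt2']
    chained_error = None
except IndexError as exc:
    chained_error = str(exc)
print(chained_error)

# Selecting both columns at once, with a list, gives a grouped frame
# whose columns can each be joined.
both = grouped[['notes_cont', 'notes_cont_pt2']]
joined = both.agg(' '.join).reset_index()
print(joined)
```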

I've also tried this:

df = df.groupby(['process','notes_txt'])['notes_cont', 'notes_cont_pt2'].apply(' '.join).reset_index()

But it gives me this output:

Answer 1

IIUC, you can use GroupBy.agg with a dict that maps each column to its own join function:

df.groupby(['process','notes_txt'],as_index = False).agg({'notes_cont':''.join,
                                                          'notes_cont_pt2':','.join})

   process notes_txt           notes_cont         notes_cont_pt2
0        1   TESTE 1  Process 1: 0 errors             via script
1        2  TESTE A    Process 1:0 errors  via script 1,script 2
2        3  TESTE B    Process 2:5 errors  via script 2,script 5
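For completeness, here is a self-contained sketch of the same approach that can be run as-is. It assumes you want a single space between the joined pieces of `notes_cont` (as in the question's one-column code) rather than the empty separator used above; the names `separators` and `out` are only illustrative.

```python
import pandas as pd

d = {'process': [1, 2, 2, 3, 3],
     'notes_txt': ['TESTE 1', 'TESTE A ', 'TESTE A ', 'TESTE B ', 'TESTE B '],
     'notes_cont': ['Process 1: 0 errors', 'Process 1:', '0 errors', 'Process 2:', '5 errors'],
     'notes_cont_pt2': ['via script', 'via script 1', 'script 2', 'via script 2', 'script 5']}
df = pd.DataFrame(data=d)

# Assumption: one separator per column; change these to taste.
separators = {'notes_cont': ' ', 'notes_cont_pt2': ','}

# Build the {column: join function} mapping that agg expects.
out = (
    df.groupby(['process', 'notes_txt'], as_index=False)
      .agg({col: sep.join for col, sep in separators.items()})
)
print(out)
```

Keeping the separators in a dict makes it easy to add more text columns later without repeating the `groupby` call.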


solution1
3 ACCEPTED 2020-02-20 16:09:19