
Python Pandas - Group and join lines with multiple columns

I am working with a dataframe in which some rows need to be grouped (with a join) using a key.

Basically I have this dataframe:

d = {'process': [1, 2, 2, 3, 3], 'notes_txt': ['TESTE 1', 'TESTE A ', 'TESTE A ',  'TESTE B ', 'TESTE B '],'notes_cont': ['Process 1: 0 errors', 'Process 1:', '0 errors',  'Process 2:', '5 errors'], 'notes_cont_pt2': ['via script', 'via script 1', 'script 2', 'via script 2', 'script 5']}
df = pd.DataFrame(data=d)

And my desired output is:

(image not included: desired output)

I am trying with this code for one column (and it works fine):

import pandas as pd

d = {'process': [1, 2, 2, 3, 3], 'notes_txt': ['TESTE 1', 'TESTE A ', 'TESTE A ',  'TESTE B ', 'TESTE B '],'notes_cont': ['Process 1: 0 errors', 'Process 1:', '0 errors',  'Process 2:', '5 errors'], 'notes_cont_pt2': ['via script', 'via script 1', 'script 2', 'via script 2', 'script 5']}
df = pd.DataFrame(data=d)
df = df.groupby(['process','notes_txt'])['notes_cont'].apply(' '.join).reset_index()
print(df)

Grouping works when I join a single column, but when I try to join two columns I get an error:

Traceback (most recent call last):
    df = df.groupby(['process','notes_txt'])['notes_cont']['notes_cont_pt2'].apply(' '.join).reset_index()
  File "base.py", line 258, in __getitem__
    .format(selection=self._selection))
IndexError: Column(s) notes_cont already selected

I've tried with this:

df = df.groupby(['process','notes_txt'])['notes_cont', 'notes_cont_pt2'].apply(' '.join).reset_index()

But it gives me this output:

(image not included: resulting output)

IIUC, you can use GroupBy.agg with a dict that maps each column to its own join function:

df.groupby(['process','notes_txt'],as_index = False).agg({'notes_cont':''.join,
                                                          'notes_cont_pt2':','.join})

   process notes_txt           notes_cont         notes_cont_pt2
0        1   TESTE 1  Process 1: 0 errors             via script
1        2  TESTE A    Process 1:0 errors  via script 1,script 2
2        3  TESTE B    Process 2:5 errors  via script 2,script 5
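
If the same separator is wanted for both columns (the question used ' ' rather than '' and ','), a minimal sketch along the same lines is to select both columns with a list and pass a single function to agg. The names `out` and `joined_labels` below are only illustrative. The sketch also shows a likely reason the apply attempt did not give the expected result: iterating over a DataFrame yields its column labels, so ' '.join on a whole group joins the labels rather than the cell values.

```python
import pandas as pd

d = {
    'process': [1, 2, 2, 3, 3],
    'notes_txt': ['TESTE 1', 'TESTE A ', 'TESTE A ', 'TESTE B ', 'TESTE B '],
    'notes_cont': ['Process 1: 0 errors', 'Process 1:', '0 errors', 'Process 2:', '5 errors'],
    'notes_cont_pt2': ['via script', 'via script 1', 'script 2', 'via script 2', 'script 5'],
}
df = pd.DataFrame(data=d)

# Iterating a DataFrame yields column labels, so joining a whole
# DataFrame (which is what apply passes per group) joins the labels.
joined_labels = ' '.join(df[['notes_cont', 'notes_cont_pt2']])

# Select the columns with a list (double brackets); agg then calls
# ' '.join on each selected column separately within every group.
out = (
    df.groupby(['process', 'notes_txt'])[['notes_cont', 'notes_cont_pt2']]
      .agg(' '.join)
      .reset_index()
)
print(out)
```

With this version the rows for process 2 become 'Process 1: 0 errors' and 'via script 1 script 2'. The dict form above remains the better fit when each column needs a different separator.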
