简体   繁体   中英

pandas.DataFrame: How to merge rows with a common column value in the same pandas.DataFrame

I have a pandas.DataFrame that looks like that:

index projectid question answer
0 1 'q1' 'str1'
1 1 'q2' 'str2'
2 1 'q3' 'str3'
3 2 'q1' 'str4'
4 2 'q3' 'str6'

And I would like to format it like that:

index projectid question1 answer1 question2 answer2 question3 answer3
0 1 'q1' 'str1' 'q2' 'str2' 'q3' 'str3'
1 2 'q1' 'str4' None None 'q3' 'str6'

Not every project has the same number of question but questions are shared for each project. So when a specific question isn't in a project, I would like cells to be filled up with None values.

I didn't found any way to do it with join or concat, but I don't know how to properly use it.

I would like to improve my pandas skills so my question is:
Is there any way to do it with pandas treatment or doing it manually by treating my DataFrames with iterrows is the only way?

Thank you !

You can use cumcount before pivoting to get your suffixes:

df['idx'] = df.groupby('projectid').cumcount() + 1
df = df.pivot(index='projectid',columns='idx')[['question','answer']]
df.columns = [''.join(map(str, col)) for col in df.columns]
print(df)

Output::

          question1 question2 question3 answer1 answer2 answer3
projectid
1              'q1'      'q2'      'q3'  'str1'  'str2'  'str3'
2              'q1'      'q3'       NaN  'str4'  'str6'     NaN

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM