Pandas: Merge 2 dataframes based on a column's values; for multiple rows containing the same column value, append those to different columns

Question

I have two dataframes, dataframe1 and dataframe2. Both hold the same data in one particular column; let's call this column 'share1' in dataframe1 and 'share2' in dataframe2.

The issue is that there are cases where dataframe1 has only one row with a particular value in 'share1' (let's call it 'c34z'), but dataframe2 has multiple rows with the value 'c34z' in the 'share2' column.

In the new merged dataframe, I would like each additional matching row from dataframe2 to be placed in a new column rather than a new row.

So the number of added columns in the new dataframe will be the maximum number of duplicates of any value in 'share2'. For rows whose value appears only once in 'share2', the rest of the added columns will be blank.

Answer 1

You can use cumcount to create an additional key, then pivot df2:

newdf2=df2.assign(key=df2.groupby('share2').cumcount(),v=df2.share2).pivot_table(index='share2',columns='key',values='v',aggfunc='first')

After this, use reindex to align the pivoted frame (newdf2) with df1, then concat it to df1:

newdf2=newdf2.reindex(df1.share1)
newdf2.index=df1.index
yourdf=pd.concat([df1,newdf2],axis=1)
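
To make the steps in this answer easier to try out, here is a minimal self-contained sketch. The sample data, the extra 'name' and 'val' columns, and the 'val_' prefix are assumptions for illustration only; the question does not show its data. It pivots a payload column ('val') rather than 'share2' itself, and uses pivot instead of pivot_table since each (share2, key) pair is unique after cumcount.

```python
import pandas as pd

# Hypothetical sample data; only 'share1' and 'share2' come from the question.
df1 = pd.DataFrame({"share1": ["c34z", "a11b", "zz99"], "name": ["x", "y", "z"]})
df2 = pd.DataFrame({
    "share2": ["c34z", "c34z", "a11b", "c34z"],
    "val": ["v1", "v2", "v3", "v4"],
})

# Number the repeats within each share2 value: 0, 1, 2, ...
occurrence = df2.groupby("share2").cumcount()

# One row per share2 value, one column per occurrence number.
wide = (
    df2.assign(key=occurrence)
       .pivot(index="share2", columns="key", values="val")
       .add_prefix("val_")
)

# Align the wide frame to df1's row order, then attach it side by side.
aligned = wide.reindex(df1["share1"])
aligned.index = df1.index
result = pd.concat([df1, aligned], axis=1)
print(result)
```

Rows of df1 whose value never appears in df2 (here 'zz99') end up with NaN in every added column, which matches the "blank" cells the question asks for.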

Answer 2

Loading Data:

import pandas as pd
df1 = {'key': ['c34z', 'c34z_2'], 'value': ['x', 'y']}
df2 = {'key': ['c34z', 'c34z_2', 'c34z_2'], 'value': ['c34z_value', 'c34z_2_value', 'c34z_2_value']}
df1 = pd.DataFrame(df1)
df2 = pd.DataFrame(df2)

Convert df2 by grouping and pivoting, so that each key gets one row and each repeated value gets its own column:

df2_pivot = df2.groupby('key')['value'].apply(lambda s: s.reset_index(drop=True)).unstack().reset_index()

Merge df1 and df2_pivot:

df_merged = pd.merge(df1, df2_pivot, on='key')
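
Note that pd.merge defaults to an inner join, so keys in df1 with no match in df2 would be dropped. The sketch below is one possible variant, not part of the original answer: it adds a hypothetical 'no_match' row to df1, builds the wide table with cumcount and unstack, renames the integer columns with an assumed 'value_' prefix, and uses how='left' to keep every df1 row.

```python
import pandas as pd

# Same data as above, plus a hypothetical 'no_match' key that is absent from df2.
df1 = pd.DataFrame({"key": ["c34z", "c34z_2", "no_match"], "value": ["x", "y", "z"]})
df2 = pd.DataFrame({
    "key": ["c34z", "c34z_2", "c34z_2"],
    "value": ["c34z_value", "c34z_2_value", "c34z_2_value"],
})

# Same reshaping idea, written with cumcount + set_index + unstack.
df2_pivot = (
    df2.assign(n=df2.groupby("key").cumcount())
       .set_index(["key", "n"])["value"]
       .unstack()
       .add_prefix("value_")
       .reset_index()
)

# how="left" keeps every df1 row; unmatched keys get NaN in the added columns.
df_merged = df1.merge(df2_pivot, on="key", how="left")
print(df_merged)
```

Prefixing the pivoted columns also avoids mixing string and integer column labels in the merged frame.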

2 answers

solution1
1 ACCEPTED 2019-04-21 01:30:10

solution2
1 2019-04-21 01:33:59
