
Pandas: Combine two data-frames with different shape based on one common column

I have a df with columns:

Student_id      subject       marks
1               English       70
1               math          90
1               science       60
1               social        80
2               English       90
2               math          50
2               science       70
2               social        40

I have another data frame, df1, with columns:

Student_id      Year_of_join   column_with_info
1                2020           some_info1
1                2020           some_info2
1                2020           some_info3
2                2019           some_info4
2                2019           some_info5

I want to combine the two data frames above (loaded from .csv files) into something like res_df below:

Student_id      subject       marks  Year_of_join   column_with_info
1               English       70     2020           some_info1
1               math          90     2020           some_info2
1               science       60     2020           some_info3
1               social        80     NaN            NaN
2               English       90     2019           some_info4
2               math          50     2019           some_info5
2               science       70     NaN            NaN
2               social        40     NaN            NaN

Note: I want to join the datasets on Student_id. Both contain the same unique Student_id values, but the datasets have different shapes (a different number of rows per student).

PS: The res_df above is just an example of how the data might look after combining the two data frames. It could also look like this:

Student_id      subject       marks  Year_of_join   column_with_info
1               English       70     NaN            NaN
1               math          90     2020           some_info1
1               science       60     2020           some_info2
1               social        80     2020           some_info3
2               English       90     NaN            NaN
2               math          50     NaN            NaN
2               science       70     2019           some_info4
2               social        40     2019           some_info5

Thanks in advance for the help. Please help me solve this.

Use GroupBy.cumcount to create a helper column that numbers the rows within each Student_id, then do a left merge on Student_id and that helper column:

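# Number the rows within each Student_id group (0, 1, 2, ...)
df['g'] = df.groupby('Student_id').cumcount()
df1['g'] = df1.groupby('Student_id').cumcount()

# Left join on the id plus the helper column, then drop the helper
df = df.merge(df1, on=['Student_id','g'], how='left').drop('g', axis=1)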
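
For reference, here is a self-contained sketch of the same idea on the sample data from the question. The helper name merge_by_position and its from_end flag are hypothetical, introduced only for illustration. It assumes the rows of each frame are already in the order you want them paired. Passing ascending=False to cumcount counts from the end of each group, which should produce the second layout from the question (NaN rows first).

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Student_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "subject": ["English", "math", "science", "social"] * 2,
    "marks": [70, 90, 60, 80, 90, 50, 70, 40],
})
df1 = pd.DataFrame({
    "Student_id": [1, 1, 1, 2, 2],
    "Year_of_join": [2020, 2020, 2020, 2019, 2019],
    "column_with_info": ["some_info1", "some_info2", "some_info3",
                         "some_info4", "some_info5"],
})


def merge_by_position(left, right, key="Student_id", from_end=False):
    """Hypothetical helper: pair rows by their position within each key group."""
    ascending = not from_end
    # assign() returns copies, so the input frames are not modified
    left_g = left.assign(g=left.groupby(key).cumcount(ascending=ascending))
    right_g = right.assign(g=right.groupby(key).cumcount(ascending=ascending))
    # Left join keeps every row of `left`; unmatched positions become NaN
    return left_g.merge(right_g, on=[key, "g"], how="left").drop(columns="g")


res_df = merge_by_position(df, df1)                     # NaN rows last
res_df_end = merge_by_position(df, df1, from_end=True)  # NaN rows first

print(res_df)
print(res_df_end)
```

Because some rows have no match, Year_of_join contains NaN and is therefore stored as a float column in the result.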
