I have a df with columns:
Student_id subject marks
1 English 70
1 math 90
1 science 60
1 social 80
2 English 90
2 math 50
2 science 70
2 social 40
I have another df1 with columns:
Student_id Year_of_join column_with_info
1 2020 some_info1
1 2020 some_info2
1 2020 some_info3
2 2019 some_info4
2 2019 some_info5
I want to combine the two data frames above (loaded from .csv files) into something like res_df below:
Student_id subject marks year_of_join column_with_info
1 English 70 2020 some_info1
1 math 90 2020 some_info2
1 science 60 2020 some_info3
1 social 80 NaN NaN
2 English 90 2019 some_info4
2 math 50 2019 some_info5
2 science 70 NaN NaN
2 social 40 NaN NaN
Note: I want to join the datasets based on Student_id. Both contain the same unique Student_id values, but the two datasets have different shapes (a different number of rows per student).
PS: The res_df above is just an example of how the data might look after combining the two data frames. It could also look like this:
Student_id subject marks year_of_join column_with_info
1 English 70 NaN NaN
1 math 90 2020 some_info1
1 science 60 2020 some_info2
1 social 80 2020 some_info3
2 English 90 NaN NaN
2 math 50 NaN NaN
2 science 70 2019 some_info4
2 social 40 2019 some_info5
Thanks in advance for the help.
Use GroupBy.cumcount to create a helper column that numbers the rows within each Student_id, then merge on Student_id plus that helper column with a left join:
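# Number the rows within each Student_id (0, 1, 2, ...) in both frames
df['g'] = df.groupby('Student_id').cumcount()
df1['g'] = df1.groupby('Student_id').cumcount()
# Left join on the id plus the row counter, then drop the helper column
df = df.merge(df1, on=['Student_id','g'], how='left').drop('g', axis=1)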
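For reference, here is a minimal, self-contained sketch of the same idea applied to the sample data from the question. The helper name merge_by_position and its from_end flag are hypothetical, introduced only for illustration; it also assumes neither frame already has a column named g. Passing ascending=False to cumcount numbers the rows from the end of each group, which should give the second layout from the question (NaN rows first).

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Student_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "subject": ["English", "math", "science", "social"] * 2,
    "marks": [70, 90, 60, 80, 90, 50, 70, 40],
})
df1 = pd.DataFrame({
    "Student_id": [1, 1, 1, 2, 2],
    "Year_of_join": [2020, 2020, 2020, 2019, 2019],
    "column_with_info": ["some_info1", "some_info2", "some_info3",
                         "some_info4", "some_info5"],
})


def merge_by_position(left, right, key="Student_id", from_end=False):
    """Hypothetical helper: pair the n-th row of each `key` group in `left`
    with the n-th row of the same group in `right` (left join)."""
    asc = not from_end
    # assign() returns copies, so the caller's frames are not modified
    left_g = left.assign(g=left.groupby(key).cumcount(ascending=asc))
    right_g = right.assign(g=right.groupby(key).cumcount(ascending=asc))
    return left_g.merge(right_g, on=[key, "g"], how="left").drop(columns="g")


# First layout: extra rows of df are left unmatched at the end of each group
res_df = merge_by_position(df, df1)
# Second layout: rows are counted from the end, so unmatched rows come first
res_df_alt = merge_by_position(df, df1, from_end=True)

print(res_df)
print(res_df_alt)
```

Unlike the snippet above, this version uses assign, so df and df1 keep their original columns. Note that Year_of_join will be stored as floats in the result, because the unmatched rows hold NaN.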