
Python Merging data frames

In Python, I have a pandas DataFrame that looks like this:

Name    ID
Anna    1
Polly   1
Sarah   2
Max     3
Kate    3
Ally    3
Steve   3

And a second DataFrame that looks like this:

Name    ID
Dan     1
Hallie  2
Cam     2
Lacy    2
Ryan    3
Colt    4
Tia     4

How can I merge the two DataFrames so that the ID column looks like this?

Name    ID
Anna    1
Polly   1
Sarah   2
Max     3
Kate    3
Ally    3
Steve   3
Dan     4
Hallie  5
Cam     5
Lacy    5
Ryan    6
Colt    7
Tia     7

This is just a minimal reproducible example; my actual data set has thousands of values. I'm merging data frames and want the IDs to continue in numerical order from the previous data frame instead of restarting from one each time. I know that I can reset the index if ID is a unique identifier, but in this case more than one person can have the same ID. How can I account for that?

In the example above, the final DataFrame can be obtained by adding the maximum ID of the first DataFrame to every ID in the second, and then concatenating the two. For example:

Name  ID in second df   ID in final_df
Dan   1                 4

The value in final_df is 1 + 3 = 4, where 3 is the maximum ID in the first DataFrame. The same offset is applied to every row of the second DataFrame.

Code:

import pandas as pd

df = pd.DataFrame({'Name':['Anna','Polly','Sarah','Max','Kate','Ally','Steve'],'ID':[1,1,2,3,3,3,3]})
df1 = pd.DataFrame({'Name':['Dan','Hallie','Cam','Lacy','Ryan','Colt','Tia'],'ID':[1,2,2,2,3,4,4]})

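# Offset every ID in the second frame by the largest ID in the first
max_df = df['ID'].max()
df1['ID'] = df1['ID'] + max_df
# ignore_index=True gives the combined frame a fresh 0..n-1 index
final_df = pd.concat([df, df1], ignore_index=True)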
print(final_df)
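
Since the question mentions merging more than two data frames, the same idea can be extended with a running offset. The following is only a sketch: the helper name `concat_with_running_ids` and the small sample frames are made up for illustration, and it assumes every frame is non-empty and that its IDs are positive integers starting at 1.

```python
import pandas as pd


def concat_with_running_ids(frames, id_col="ID"):
    """Concatenate frames, shifting each frame's IDs past the previous maximum.

    Hypothetical helper; assumes non-empty frames with integer IDs starting at 1.
    """
    shifted = []
    offset = 0
    for frame in frames:
        frame = frame.copy()                    # leave the caller's frames untouched
        frame[id_col] = frame[id_col] + offset  # continue after the previous frame
        offset = frame[id_col].max()            # largest ID used so far
        shifted.append(frame)
    return pd.concat(shifted, ignore_index=True)


# Illustrative data, not the data from the question
first = pd.DataFrame({"Name": ["Anna", "Polly", "Sarah"], "ID": [1, 1, 2]})
second = pd.DataFrame({"Name": ["Dan", "Hallie"], "ID": [1, 2]})
third = pd.DataFrame({"Name": ["Ryan", "Colt"], "ID": [1, 1]})

combined = concat_with_running_ids([first, second, third])
print(combined)
```

Rows that shared an ID within one frame still share an ID after the shift, because the same offset is added to every row of that frame. If a frame's IDs do not start at 1 or contain gaps, the combined IDs will carry the same gaps.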

