简体   繁体   中英

Merging two dataframes with an added condition

I got two dataframes that I need to merge as per follows:

Df1

Name Type Speed  
a x  1  
a y  0  
a z  1

Df2

Type Fast Slow  
x 2 3  
y 3 5  
z 4 6

Df3 - DESIRED OUTCOME

Name Type Speed Time  
a x 1 2  
a y 0 5  
a z 1 4

So basically I need to create a new 'Time' column that displays times from either 'Fast' or 'Slow' column based on 'Speed' column and the object 'Type' . I have literally no idea how to do this so any help would be much appreciated! Thanks in advance. Apologies for the confusing explanation..

Use merge + np.where for a more succinct solution:

v = df1.merge(df2, on=['Type'])
v['Time'] = np.where(v['Speed'], v.pop('Fast'), v.pop('Slow'))

  Name Type  Speed  Time
0    a    x      1     2
1    a    y      0     5
2    a    z      1     4

Use melt for reshape first, then map for correct match Speed and last merge with left join:

df = df2.melt('Type', var_name='Speed', value_name='Time')
df['Speed'] = df['Speed'].map({'Fast':1, 'Slow':0})
print (df)
  Type  Speed  Time
0    x      1     2
1    y      1     3
2    z      1     4
3    x      0     3
4    y      0     5
5    z      0     6

df3 = df1.merge(df, how='left', on=['Type','Speed'])
print (df3)
  Name Type  Speed  Time
0    a    x      1     2
1    a    y      0     5
2    a    z      1     4

If performance is important merge is not necessary - map by Series created by set_index with numpy.where - df1['Speed'] is 0 and 1 , so is processes like False s and True s:

s1 = df2.set_index('Type')['Fast']
s2 = df2.set_index('Type')['Slow']

df1['Time'] = np.where(df1['Speed'], df1['Type'].map(s1), df1['Type'].map(s2))
print (df1)
  Name Type  Speed  Time
0    a    x      1     2
1    a    y      0     5
2    a    z      1     4

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM