How to compare two data frames on a column and replace with values from another column

Question

I have two data frames, df1 and df2:

id      first       last  size
  A 1978-01-01 1979-01-01     2
  B 2000-01-01 2000-01-01     1
  C 1998-01-01 2000-01-01     3
  D 1998-01-01 1998-01-01     1
  E 1999-01-01 2000-01-01     2

  id  token       
  A     ZA.00 
  B     As.11
  C     SD.34

Expected output:

id          first       last        size
  ZA.00     1978-01-01 1979-01-01     2
  As.11     2000-01-01 2000-01-01     1
  SD.34     1998-01-01 2000-01-01     3
  D         1998-01-01 1998-01-01     1
  E         1999-01-01 2000-01-01     2

If an id in df1 is also present in df2, the id in df1 should be replaced with the corresponding token value; ids with no match stay unchanged. How can I achieve this?

Answer 1

Using merge and combine_first:

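# Left merge keeps every row of df1; token is NaN where df2 has no matching id
df = df1.merge(df2, on='id', how='left')
# Take token where present, otherwise fall back to the original id
df['id'] = df['token'].combine_first(df['id'])
df = df.drop(columns='token')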
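For reference, here is a minimal self-contained sketch of the merge approach, assuming the two frames from the question are rebuilt by hand (the names `merged` and `result` are only illustrative). It uses a left merge so that ids that exist only in df2 would not add rows.

```python
import pandas as pd

# Frames rebuilt from the question (dates kept as plain strings for brevity)
df1 = pd.DataFrame({
    "id": list("ABCDE"),
    "first": ["1978-01-01", "2000-01-01", "1998-01-01", "1998-01-01", "1999-01-01"],
    "last": ["1979-01-01", "2000-01-01", "2000-01-01", "1998-01-01", "2000-01-01"],
    "size": [2, 1, 3, 1, 2],
})
df2 = pd.DataFrame({"id": list("ABC"), "token": ["ZA.00", "As.11", "SD.34"]})

# After the left merge, token is NaN for the ids missing from df2 (D and E)
merged = df1.merge(df2, on="id", how="left")

# combine_first keeps token where it is not null and falls back to id otherwise
merged["id"] = merged["token"].combine_first(merged["id"])
result = merged.drop(columns="token")
print(result)
```

Unlike the replace variant, this leaves df1 itself untouched and returns a new frame.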

Another way is to use replace with a dictionary built from df2.values (this assumes df2 has exactly the two columns id and token, in that order). Note that this changes df1 itself:

df1['id'] = df1['id'].replace(dict(df2.values))

      id       first        last  size
0  ZA.00  1978-01-01  1979-01-01     2
1  As.11  2000-01-01  2000-01-01     1
2  SD.34  1998-01-01  2000-01-01     3
3      D  1998-01-01  1998-01-01     1
4      E  1999-01-01  2000-01-01     2
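To make the dictionary step explicit, the sketch below builds the id-to-token mapping by column name rather than by position, which avoids depending on the column order of df2. The variable names `mapping` and `replaced` are illustrative, and only two columns of df1 are reproduced.

```python
import pandas as pd

df1 = pd.DataFrame({"id": list("ABCDE"), "size": [2, 1, 3, 1, 2]})
df2 = pd.DataFrame({"id": list("ABC"), "token": ["ZA.00", "As.11", "SD.34"]})

# Build the lookup by column name: {'A': 'ZA.00', 'B': 'As.11', 'C': 'SD.34'}
mapping = dict(zip(df2["id"], df2["token"]))

# replace swaps only the values found as keys; D and E are left as they are
replaced = df1["id"].replace(mapping)
print(replaced.tolist())
```

Assigning `replaced` back to `df1["id"]` gives the same table as shown above.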

Answer 2

Use map and fillna:

df1['id'] = df1['id'].map(df2.set_index('id')['token']).fillna(df1['id'])
df1

Output:

      id       first        last  size
0  ZA.00  1978-01-01  1979-01-01     2
1  As.11  2000-01-01  2000-01-01     1
2  SD.34  1998-01-01  2000-01-01     3
3      D  1998-01-01  1998-01-01     1
4      E  1999-01-01  2000-01-01     2

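map accepts a Series as its argument: each id is looked up in the index of that Series, and ids without a match become NaN, which fillna then replaces with the original id.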
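The two steps can be looked at separately. This is a small sketch, assuming the ids in df2 are unique (map needs a uniquely indexed Series for the lookup); `lookup` and `mapped` are illustrative names and only two columns of df1 are reproduced.

```python
import pandas as pd

df1 = pd.DataFrame({"id": list("ABCDE"), "size": [2, 1, 3, 1, 2]})
df2 = pd.DataFrame({"id": list("ABC"), "token": ["ZA.00", "As.11", "SD.34"]})

# Series indexed by id, with token as the values
lookup = df2.set_index("id")["token"]

# Step 1: look up every id; D and E have no entry and come back as NaN
mapped = df1["id"].map(lookup)
print(mapped.isna().tolist())

# Step 2: fill the gaps with the original ids (aligned on the row index)
df1["id"] = mapped.fillna(df1["id"])
print(df1["id"].tolist())
```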

Answer 3

If you do not want to merge the DataFrames, you can use apply instead: convert the small DataFrame to a dictionary and map it onto the id column of the other DataFrame.

from io import StringIO #used to get string to df

import pandas as pd

id_ =list('ABC')
token = 'ZA.00 As.11 SD.34'.split()
dt = pd.DataFrame(list(zip(id_,token)),columns=['id','token'])

a ='''
id first last size
A 1978-01-01 1979-01-01 2
B 2000-01-01 2000-01-01 1
C 1998-01-01 2000-01-01 3
D 1998-01-01 1998-01-01 1
E 1999-01-01 2000-01-01 2
'''

df =pd.read_csv(StringIO(a), sep=' ')

# These last two statements are all you need
mp = dict(zip(dt['id'], dt['token']))

# Look up each id in the mapping; keep the original id when there is no match
df['id'] = df['id'].apply(lambda x: mp.get(x, x))

3 answers

Answer 1
1 2018-08-22 14:58:15

Answer 2
1 2018-08-22 20:16:38

Answer 3
0 2018-08-22 20:10:48
