
Set values in one Pandas dataframe based on rows in a second dataframe

I have two dataframes df1 and df2 and I want to create a new column in df1 and set values in that column to 0 where rows in df1 are contained in df2. More specifically:

import pandas as pd

sample_data_1 = {'col1': ['80', '8080'], 'col2': ['0.0.0.0', '143.21.7.165']}
df1 = pd.DataFrame(data=sample_data_1)

sample_data_2 = {'col1': ['80', '8080', '1', '8888'], 'col2': ['0.0.0.0', '143.21.7.165', '1', '5.5.5.5'], 'col3': ['1', '2', '3', '4']}
df2 = pd.DataFrame(data=sample_data_2)



   col1          col2
0    80       0.0.0.0
1  8080  143.21.7.165

   col1          col2 col3
0    80       0.0.0.0    1
1  8080  143.21.7.165    2
2     1             1    3
3  8888       5.5.5.5    4

I would like to add a column to df1 and set those values to 0 where col1 and col2 in df1 match col1 and col2 in df2. The resultant dataframe should look like:

   col1          col2  score
0    80       0.0.0.0      0
1  8080  143.21.7.165      0

When the dataframe sizes are the same, I can do a straight comparison using the .loc function and logical ANDs, but when they have different shapes I get "unable to compare series" exceptions. Thoughts?

Thanks for the help!

You can use df.merge (an inner join on both columns keeps only the rows of df1 that also appear in df2):

In [2735]: df1 = df1.merge(df2, on=['col1','col2']).drop(columns='col3').assign(score=0)

In [2737]: df1 
Out[2737]: 
   col1          col2  score
0    80       0.0.0.0      0
1  8080  143.21.7.165      0
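
Note that the inner merge above drops any df1 row that has no match in df2. If those rows should be kept, one possible variation is a left merge with indicator=True, sketched below. The third df1 row ('443', '10.0.0.1') is a hypothetical addition for illustration and is not part of the question's data, and leaving NaN for unmatched rows is an assumption about the desired behaviour.

```python
import numpy as np
import pandas as pd

# Question's sample data, plus one hypothetical df1 row that is NOT in df2.
df1 = pd.DataFrame({'col1': ['80', '8080', '443'],
                    'col2': ['0.0.0.0', '143.21.7.165', '10.0.0.1']})
df2 = pd.DataFrame({'col1': ['80', '8080', '1', '8888'],
                    'col2': ['0.0.0.0', '143.21.7.165', '1', '5.5.5.5'],
                    'col3': ['1', '2', '3', '4']})

keys = ['col1', 'col2']

# Left-join against the de-duplicated key columns of df2 so that every
# df1 row survives and no row is multiplied by repeated keys in df2.
merged = df1.merge(df2[keys].drop_duplicates(), on=keys,
                   how='left', indicator=True)

# '_merge' is 'both' where the (col1, col2) pair exists in df2.
matched = merged['_merge'].eq('both')

result = merged.drop(columns='_merge')
# 0 for matched rows, NaN for rows that are absent from df2.
result['score'] = np.where(matched, 0, np.nan)
print(result)
```

Because NaN is a float, the score column comes out as float64 here; choose a different fill value if an integer column is needed.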

If the entries in col1 are unique (no repeated values), you could instead set col1 as the index and reindex df2 by df1's keys. Precisely:

df = df2.set_index('col1').reindex(df1.set_index('col1').index)
df['score']=0
df.reset_index(inplace=True)

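Check membership by zipping the key columns of each dataframe into (col1, col2) tuples and testing each df1 tuple against the set of df2 tuples. This returns a list of booleans.

Then use np.where(condition, value_if_true, value_if_false) to calculate your desired output:

import numpy as np

# Set of (col1, col2) pairs present in df2
pairs = set(zip(df2.col1, df2.col2))
# 0 where the df1 row appears in df2, 'not available' otherwise
# (the two outputs have mixed types, so np.where stores the 0 as the string '0')
df1['score'] = np.where([p in pairs for p in zip(df1.col1, df1.col2)], 0, 'not available')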

   col1          col2 score
0    80       0.0.0.0     0
1  8080  143.21.7.165     0
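
The same row-membership check can also be written without a Python-level loop by turning the key columns into a MultiIndex and calling isin. This is only a sketch of one alternative, not the answerer's method; the third df1 row is again a hypothetical non-matching row added for illustration.

```python
import pandas as pd

# Question's sample data, plus one hypothetical df1 row that is NOT in df2.
df1 = pd.DataFrame({'col1': ['80', '8080', '443'],
                    'col2': ['0.0.0.0', '143.21.7.165', '10.0.0.1']})
df2 = pd.DataFrame({'col1': ['80', '8080', '1', '8888'],
                    'col2': ['0.0.0.0', '143.21.7.165', '1', '5.5.5.5'],
                    'col3': ['1', '2', '3', '4']})

keys = ['col1', 'col2']

# Each row's (col1, col2) pair becomes one entry of a MultiIndex.
df1_keys = pd.MultiIndex.from_frame(df1[keys])
df2_keys = pd.MultiIndex.from_frame(df2[keys])

# Boolean array: True where the df1 pair also occurs in df2.
mask = df1_keys.isin(df2_keys)

# Assign 0 only to the matching rows; other rows are left as NaN.
df1.loc[mask, 'score'] = 0
print(df1)
```

This leaves df1's shape and row order untouched, which a merge does not guarantee when df2 contains repeated key pairs.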
