Using pandas, how can I compare the values between 2 columns from two dataframes and push them to a new dataframe?

Question

So I'm new to Python and I'm trying to use Pandas to make a new dataframe using values from two existing ones. Basically using these dataframes:

df1=
     A    B
a  '1'  '3'
b  '4'  '3'
c  '3'  '2'
d  '9'  '1'

df2=
     C    D
a  '5'  '1'
b  '2'  '0'
c  '4'  '2'
d  '1'  '9'

I need to create a loop that will compare the value of each line in df1[A] to the value of each line in df2[C]. If the values are equal, I need to join df1[A, B] and df2[D] and push that line to a third dataframe. So the result should look like this for the examples above:

dfnew=
     A    B    D
a  '1'  '3'  '9'
b  '4'  '3'  '2'

Since not all the values I'm working with will be integers, I also need to treat the values as strings.

I've been checking out other similar questions but none of the answers seem to get me exactly what I need done.

Answer 1

I think you need merge with the default inner join, followed by drop:

df = pd.merge(df1, df2, left_on='A', right_on='C').drop('C', axis=1)

Another solution is to rename the column before joining:

df = pd.merge(df1, df2.rename(columns={'C':'A'}), on='A')

print(df)
     A    B    D
0  '1'  '3'  '9'
1  '4'  '3'  '2'
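
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the values in the question are plain strings (the quote characters are taken to be display only) and rebuilds df1 and df2 by hand. The astype(str) step is my addition to cover the "treat the values as strings" requirement, and dfnew_labeled is a hypothetical name for a variant that keeps the a, b row labels, since merge otherwise returns a fresh 0..n-1 index.

```python
import pandas as pd

# Sample frames rebuilt from the question; values are assumed to be strings.
df1 = pd.DataFrame(
    {"A": ["1", "4", "3", "9"], "B": ["3", "3", "2", "1"]},
    index=list("abcd"),
)
df2 = pd.DataFrame(
    {"C": ["5", "2", "4", "1"], "D": ["1", "0", "2", "9"]},
    index=list("abcd"),
)

# Cast the key columns to str so mixed int/str data compares consistently.
df1["A"] = df1["A"].astype(str)
df2["C"] = df2["C"].astype(str)

# Inner join on A == C, then drop the now redundant key column C.
dfnew = pd.merge(df1, df2, left_on="A", right_on="C").drop("C", axis=1)
print(dfnew)

# Optional: carry df1's row labels through the merge.
dfnew_labeled = (
    df1.reset_index()                        # move labels into an "index" column
    .merge(df2, left_on="A", right_on="C")
    .drop(columns=["C"])
    .set_index("index")                      # restore the labels
    .rename_axis(None)                       # remove the leftover index name
)
print(dfnew_labeled)
```

No explicit loop is needed here; the merge does the row-by-row comparison internally.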

Note:

Values in the joined columns have to be unique; otherwise merge returns one row for every matching pair.
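
To illustrate the uniqueness point, here is a small sketch with made-up data in which the key '1' appears twice on the right-hand side. The frames left and right and the flag keys_unique are hypothetical names. The validate argument of merge is one way to have pandas check the keys for you, assuming a pandas version that supports it.

```python
import pandas as pd

left = pd.DataFrame({"A": ["1", "4"], "B": ["3", "3"]})
# Hypothetical data: the key "1" occurs twice in column C.
right = pd.DataFrame({"C": ["1", "1", "4"], "D": ["9", "7", "2"]})

# Without a check, the A == "1" row appears once per matching right-hand row.
dup = pd.merge(left, right, left_on="A", right_on="C").drop("C", axis=1)
print(dup)

# validate="one_to_one" makes merge raise if either key column has duplicates.
try:
    pd.merge(left, right, left_on="A", right_on="C", validate="one_to_one")
    keys_unique = True
except pd.errors.MergeError:
    keys_unique = False
print(keys_unique)
```

If duplicates are expected, dropping them first (for example with drop_duplicates on the key column) is another option, but which row to keep depends on your data.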

Answer 2

You can also use pd.Series.map

df1.assign(D=df1.A.map(dict(zip(df2.C, df2.D)))).dropna()

     A    B    D
a  '1'  '3'  '9'
b  '4'  '3'  '2'

Details
With just the map and assign we are left with rows that we need to drop.

df1.assign(D=df1.A.map(df2.set_index('C').D))

     A    B    D
a  '1'  '3'  '9'
b  '4'  '3'  '2'
c  '3'  '2'  NaN
d  '9'  '1'  NaN

I decided to drop them with a simple dropna. To be more precise, we should probably restrict the dropna to the D column, so that rows with NaN in other columns are not dropped by accident.

df1.assign(D=df1.A.map(df2.set_index('C').D)).dropna(subset=['D'])

     A    B    D
a  '1'  '3'  '9'
b  '4'  '3'  '2'
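
Put together as a runnable sketch, again assuming the question's values are plain strings and rebuilding the frames by hand (lookup and mapped are hypothetical intermediate names):

```python
import pandas as pd

# Sample frames rebuilt from the question; values are assumed to be strings.
df1 = pd.DataFrame(
    {"A": ["1", "4", "3", "9"], "B": ["3", "3", "2", "1"]},
    index=list("abcd"),
)
df2 = pd.DataFrame(
    {"C": ["5", "2", "4", "1"], "D": ["1", "0", "2", "9"]},
    index=list("abcd"),
)

# Lookup Series: index holds the C values, data holds the D values.
lookup = df2.set_index("C")["D"]

# map() looks each A value up in the Series index; misses become NaN.
mapped = df1.assign(D=df1["A"].map(lookup))

# Keep only the rows where a match was found.
dfnew = mapped.dropna(subset=["D"])
print(dfnew)
```

Unlike merge, this keeps df1's row labels (a, b). One caveat: dict(zip(df2.C, df2.D)) silently keeps the last D for a repeated C, while the set_index variant is, as far as I know, expected to raise when C contains duplicates.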

We could use other ways as well. But then that wasn't really what this question was about.

solution1
3 ACCEPTED 2018-01-16 06:29:39

solution2
1 2018-01-16 07:51:45
