Pandas: set the value of one column of a dataframe with condition on another column of another dataframe

Question

I have two dataframes df1 and df2, with two columns each:

df1                                            df2
c1 c2                                          c2 c3

I want to create a new column c3 for df1 that will be:

equal to column c3 of df2 when df1.c2 = df2.c2
NaN otherwise

which is basically what the vlookup function is doing in Excel.

So far I've tried this:

df1["c3"] = np.nan

for i in df1.c2.unique():
    for j in df2.c2.unique():
        if i == j:
            df1.loc[df1.c2 == i, "c3"] = df2.loc[df2.c2 == j, "c3"]
        else:
            pass

But when I print my resulting df1, c3 remains unchanged... I checked my df1.loc and df2.loc by printing them separately in the loop, and they're both aiming at the right value...

Can anyone help me fix this?

PS: For further context, I am trying to associate the pygal country codes to the corresponding country, in order to plot them in a world map.

df1 = my dataset

df1.c1 = relevant data

df1.c2 = country name

df1.c3 = country code

df2 = pygal country code table

df2.c2 = country name

df2.c3 = country code

Answer 1

Something like this with NumPy's np.where(), assuming df1 and df2 have the same length and index so their rows can be compared one to one:

df1['c3'] = np.where(df1['c2'] == df2['c2'], df2['c3'], np.nan)

Sort of like if() in Excel.
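
For a VLOOKUP-style match by value rather than by row position, one option is Series.map() with a lookup built from df2. This is a minimal sketch, not a definitive solution: the sample data below is made up for illustration, and it assumes each country name should resolve to a single code, so duplicate names in df2 are dropped first.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df1 = pd.DataFrame({"c1": [10, 20, 30], "c2": ["France", "Japan", "Atlantis"]})
df2 = pd.DataFrame({"c2": ["Japan", "France", "Brazil"], "c3": ["jp", "fr", "br"]})

# Build a lookup Series indexed by country name (c2) holding the code (c3).
# drop_duplicates keeps the first code for each name so the index is unique.
lookup = df2.drop_duplicates("c2").set_index("c2")["c3"]

# Map each name in df1.c2 through the lookup; names absent from df2 become NaN.
df1["c3"] = df1["c2"].map(lookup)

# A left join expresses the same lookup; it returns a new frame
# instead of modifying df1 in place.
merged = df1[["c1", "c2"]].merge(df2[["c2", "c3"]], on="c2", how="left")
```

Both approaches match on the value of c2, so the row order and length of df2 do not need to line up with df1.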

solution1
0 2021-04-21 00:20:26