Pandas: set the value of one column of a dataframe with condition on another column of another dataframe
I have two dataframes, df1 and df2, with two columns each:
df1        df2
c1  c2     c2  c3
I want to create a new column c3 in df1 that, for each row, holds the df2.c3 value from the row where df2.c2 equals df1.c2, which is basically what the VLOOKUP function does in Excel.
So far I've tried this:
df1["c3"] = np.nan
for i in df1.c2.unique():
    for j in df2.c2.unique():
        if i == j:
            df1.loc[df1.c2 == i, "c3"] = df2.loc[df2.c2 == j, "c3"]
        else:
            pass
But when I print the resulting df1, c3 remains unchanged. I checked the df1.loc and df2.loc selections by printing them separately inside the loop, and both point at the right values.
Can anyone help me fix this?
PS: For further context, I am trying to associate the pygal country codes with the corresponding countries, in order to plot them on a world map.
df1 = my dataset
df1.c1 = relevant data
df1.c2 = country name
df1.c3 = country code
df2 = pygal country code table
df2.c2 = country name
df2.c3 = country code
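One plausible explanation for c3 staying unchanged, assuming the two dataframes do not share row labels for matching countries: when the right-hand side of a .loc assignment is a Series, pandas aligns it to the target rows by index label rather than by position, so labels that do not match produce NaN. The sketch below reproduces this with hypothetical data (the values and index labels are invented for illustration) and shows that dropping the index with to_numpy() makes the assignment positional.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the two frames deliberately use different index labels.
df1 = pd.DataFrame({"c1": [10, 20], "c2": ["France", "Spain"]}, index=[0, 1])
df2 = pd.DataFrame({"c2": ["Spain", "France"], "c3": ["es", "fr"]}, index=[5, 6])

# Object dtype so that string codes can be stored later without upcasting.
df1["c3"] = pd.Series(np.nan, index=df1.index, dtype="object")

# The right-hand side is a Series labelled 6. The selected df1 row is labelled 0,
# so index alignment finds no match and writes NaN.
df1.loc[df1.c2 == "France", "c3"] = df2.loc[df2.c2 == "France", "c3"]
aligned_all_nan = df1["c3"].isna().all()

# Converting to a plain array removes the labels, so values are assigned by position.
df1.loc[df1.c2 == "France", "c3"] = df2.loc[df2.c2 == "France", "c3"].to_numpy()
print(df1)
```

The positional form only works when the number of selected rows on both sides is compatible, so it is shown here as a diagnostic rather than as a recommended fix.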
Something like this, using NumPy's np.where():
df1['c3'] = np.where(df1['c2'] == df2['c2'], df2['c3'], np.nan)
Sort of like if() in Excel.
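Note that comparing df1['c2'] with df2['c2'] works row by row, so it needs both columns to have the same length and index. For a VLOOKUP-style match on the country name regardless of row order, a common alternative is Series.map with a lookup Series, or a left merge. The following is a minimal sketch with hypothetical sample data, assuming the country names in df2.c2 are unique:

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataset and the pygal table.
df1 = pd.DataFrame({"c1": [1.5, 2.5, 3.5], "c2": ["France", "Spain", "Atlantis"]})
df2 = pd.DataFrame({"c2": ["Spain", "France", "Italy"], "c3": ["es", "fr", "it"]})

# Build a name -> code lookup Series, then map each country name through it.
# Names with no match in df2 become NaN. Assumes df2.c2 has no duplicates.
lookup = df2.set_index("c2")["c3"]
df1["c3"] = df1["c2"].map(lookup)

# The same lookup expressed as a left join, which keeps every row of df1.
merged = df1[["c1", "c2"]].merge(df2, on="c2", how="left")
print(df1)
```

Rows left with NaN in c3 are the country names that have no counterpart in df2, which is a quick way to spot spelling differences between the two tables.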