Pandas df: fill values in new column with specific values from another column (condition with multiple columns)
I have a dataframe:
import pandas as pd
df = pd.DataFrame({'col1': ['a', 'b', 'c', 'd'], 'col2': ['b', 'c', 'd', 'e'], 'col3': [1.0, 2.0, 3.0, 4.0]})
  col1 col2  col3
0    a    b   1.0
1    b    c   2.0
2    c    d   3.0
3    d    e   4.0
My goal is to create an additional col4 that contains specific values from col3, subject to a condition: for each row x, look at the value in col1; if there is another row y anywhere in the df where this value is present in col2, take the col3 value from row y and put it into col4 of the original row x. Otherwise, leave col4 empty for this row, e.g. NaN.
So the expected output for col4 is: NaN, 1, 2, 3. The first row has no value because no row in the dataframe has 'a' in col2. Unlike in this example, the rows can be completely unsorted in the df!
Expected output:
  col1 col2  col3  col4
0    a    b   1.0   NaN
1    b    c   2.0   1.0
2    c    d   3.0   2.0
3    d    e   4.0   3.0
I have tried using .mask but no luck so far. Thanks for any help!
You can left join the dataframe to itself using col1 on the left side and col2 on the right side. Then rename col3 from the right side of the join to col4 and drop the rest of the right-side columns. Example:
df = df.merge(df, left_on='col1', right_on='col2', how='left', suffixes=('', '_'))
df = df.rename(columns={'col3_': 'col4'})
df = df[['col1', 'col2', 'col3', 'col4']]
df looks like:
  col1 col2  col3  col4
0    a    b     1   NaN
1    b    c     2   1.0
2    c    d     3   2.0
3    d    e     4   3.0
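One thing to watch with the self-join: if a value occurs more than once in col2, the left merge returns one output row per match, so the result can have more rows than the original df. As an alternative, here is a minimal sketch using Series.map with a lookup built from col2 to col3. It is not part of the answer above, and keeping only the first match for a duplicated col2 value is an assumption, since the question does not say what should happen in that case. The name lookup is just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['a', 'b', 'c', 'd'],
    'col2': ['b', 'c', 'd', 'e'],
    'col3': [1.0, 2.0, 3.0, 4.0],
})

# Build a lookup Series: index = col2 values, values = col3 values.
# Assumption: if a col2 value is duplicated, keep only its first occurrence,
# because map() needs a unique index to look values up.
lookup = df.drop_duplicates('col2').set_index('col2')['col3']

# Look up each col1 value in the lookup; values without a match become NaN.
df['col4'] = df['col1'].map(lookup)

print(df)
```

Because the lookup is keyed on the col2 values rather than on row position, the row order of df does not matter, and the original index and row count are preserved.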