Replacing values in pandas dataframe column with same row value from another column
I have a pandas dataframe that looks like this:
val_1 val_2 Flag
Date
2018-08-27 221.0 121.0 0
2018-08-28 222.0 122.0 1
2018-08-29 223.0 123.0 0
2018-08-30 224.0 124.0 2
2018-08-31 225.0 125.0 0
I want to change the Flag column values to values from the other columns, based on a condition on Flag. Namely, if Flag is 1, replace it with val_1 from the same row, and if Flag is 2, replace it with val_2. The output that I am looking for would look like this:
val_1 val_2 Flag
Date
2018-08-27 221.0 121.0 0
2018-08-28 222.0 122.0 222.0
2018-08-29 223.0 123.0 0
2018-08-30 224.0 124.0 124.0
2018-08-31 225.0 125.0 0
I know that I can use .loc like this:
df.loc[df['Flag'] == 1, ['Flag']] =
I don't know what goes on the right-hand side of the code.
IIUC:
new_vals = df.lookup(df.index, df.columns[df.Flag-1])
df['Flag'] = df.Flag.mask(df.Flag>0, new_vals)
Note: as commented by @Erfan, this would also work:
df['Flag'] = df.lookup(df.index, df.columns[df.Flag-1])
Output:
val_1 val_2 Flag
Date
2018-08-27 221.0 121.0 0
2018-08-28 222.0 122.0 222
2018-08-29 223.0 123.0 0
2018-08-30 224.0 124.0 124
2018-08-31 225.0 125.0 0
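DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the snippets above only run on older versions. The following is a minimal sketch of the same idea using NumPy positional indexing instead. The frame construction is an assumption made to keep the example self-contained, and the names row_pos and col_pos are illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, with Date as the index (construction assumed).
df = pd.DataFrame(
    {
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 0],
    },
    index=pd.to_datetime(
        ["2018-08-27", "2018-08-28", "2018-08-29", "2018-08-30", "2018-08-31"]
    ).rename("Date"),
)

# Flag - 1 is the column position to read from: 1 -> 0 (val_1), 2 -> 1 (val_2),
# and 0 -> -1, which is the last column (Flag itself), so a 0 stays 0.
col_pos = df["Flag"].to_numpy() - 1
row_pos = np.arange(len(df))

# Pick one value per row by (row, column) position.
df["Flag"] = df.to_numpy()[row_pos, col_pos]
print(df)
```

Like the lookup version, this relies on val_1 and val_2 being the first two columns and Flag being the last one. The resulting Flag column is float, because the values are read from a float array.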
Another way is to use np.where, which has the form numpy.where(condition, yes, no).
In this case, I use a nested np.where, so that it reads
np.where(Flag == 1, take val_1, (take x)), where "take x" is another np.where that handles Flag == 2:
df['Flag']=np.where(df['Flag']==1,df['val_1'],(np.where(df['Flag']==2,df['val_2'],df['Flag'])))
df
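For a version that can be run as-is, here is a small sketch that rebuilds the sample data and applies the same nested np.where. The frame construction is an assumption (only the column names and values come from the question), and the Date index is left out for brevity.

```python
import numpy as np
import pandas as pd

# Sample values from the question (Date index omitted for brevity).
df = pd.DataFrame(
    {
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 0],
    }
)

# Outer where handles Flag == 1; its "no" branch is a second where
# that handles Flag == 2 and otherwise keeps the existing Flag value.
df["Flag"] = np.where(
    df["Flag"] == 1,
    df["val_1"],
    np.where(df["Flag"] == 2, df["val_2"], df["Flag"]),
)
print(df["Flag"].tolist())
```

Both conditions are evaluated against the original Flag values, because the right-hand side is computed in full before the column is overwritten. The result is a float column, since integer flags and float values are combined into one array.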
There are a few ways you could do this. Firstly, your initial code is very close; you just need to finish the assignment:
df.loc[df['Flag'] == 1, 'Flag'] = df['val_1']
print(df)
Date val_1 val_2 Flag
0 2018-08-27 221.0 121.0 0.0
1 2018-08-28 222.0 122.0 222.0
2 2018-08-29 223.0 123.0 0.0
3 2018-08-30 224.0 124.0 2.0
4 2018-08-31 225.0 125.0 0.0
What you're doing here is filtering your dataframe and replacing the values where the condition matches, in this instance where Flag is equal to 1.
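The single .loc assignment above only covers Flag == 1. A sketch of extending it to both flags could look like the following; the frame construction and the mask names is_one and is_two are assumptions for illustration.

```python
import pandas as pd

# Sample values from the question (Date index omitted for brevity).
df = pd.DataFrame(
    {
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 0],
    }
)

# Build both masks from the original Flag values before changing anything,
# so a value written in the first step can never be matched by the second.
is_one = df["Flag"] == 1
is_two = df["Flag"] == 2

# Cast up front so the column can hold the float values being written.
df["Flag"] = df["Flag"].astype(float)

df.loc[is_one, "Flag"] = df.loc[is_one, "val_1"]
df.loc[is_two, "Flag"] = df.loc[is_two, "val_2"]
print(df)
```

Computing the masks first matters when the replacement values could themselves equal 1 or 2; with the sample data it makes no difference, but it keeps the two steps independent.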
Since you're making multiple assignments, let's use np.select:
import numpy as np
conditions = [df['Flag'].eq(1),
df['Flag'].eq(2)]
choices = [df['val_1'],df['val_2']]
df['Flag'] = np.select(conditions,choices,default=df['Flag'])
What this does is evaluate all of the conditions you have and pick the matching choice for each row, leaving the default as the original column. You can add more conditions, and combine alternatives with the | (pipe) operator, wrapping each comparison in its own parentheses, i.e.
[(df['Flag'] == 1) | (df['Flag'] == 2)]
Output of the np.select assignment above:
Date val_1 val_2 Flag
0 2018-08-27 221.0 121.0 0.0
1 2018-08-28 222.0 122.0 222.0
2 2018-08-29 223.0 123.0 0.0
3 2018-08-30 224.0 124.0 124.0
4 2018-08-31 225.0 125.0 0.0
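To illustrate the point about combining conditions, here is a small sketch of an OR mask used inside np.select. The rule applied (take val_1 whenever Flag is 1 or 2) is hypothetical and only there to show the syntax; the frame construction and the names one_or_two and same_mask are likewise assumptions.

```python
import numpy as np
import pandas as pd

# Sample values from the question (Date index omitted for brevity).
df = pd.DataFrame(
    {
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 0],
    }
)

# Each comparison gets its own parentheses: | binds more tightly than ==,
# so leaving them out changes how the expression is grouped.
one_or_two = (df["Flag"] == 1) | (df["Flag"] == 2)

# Series.isin builds the same mask without any operator-precedence concerns.
same_mask = df["Flag"].isin([1, 2])

# Hypothetical rule for illustration: use val_1 wherever Flag is 1 or 2.
result = np.select([one_or_two], [df["val_1"]], default=df["Flag"])
print(result)
```

For a plain "is one of these values" test, isin is usually easier to read than a chain of | comparisons.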