
Pandas - Replace other columns in row with 0 if a specific column has a value of 1

Here is an example dataframe:

X Y Z 
1 0 1
0 1 0
1 1 1

Now, here is the rule I've come up with:

  • X is left as is
  • If Y is equal to 1, set the corresponding value in X to 0
  • If Z is equal to 1, set the corresponding values in X and Y to 0

The final dataframe should look like this:

X Y Z 
0 0 1
0 1 0
0 0 1
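For anyone who wants to reproduce this, the two tables above can be built as follows. This is only a setup sketch; the variable names `df` and `expected` are chosen here for illustration.

```python
import pandas as pd

# Example input from the question (one list per column)
df = pd.DataFrame({"X": [1, 0, 1], "Y": [0, 1, 1], "Z": [1, 0, 1]})

# Desired result after applying the rule
expected = pd.DataFrame({"X": [0, 0, 0], "Y": [0, 1, 0], "Z": [1, 0, 1]})

print(df)
print(expected)
```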

My first thought at a solution is this:

df_null_list = ['X']

for i in ['Y', 'Z']:
    df[df[i] == 1][df_null_list] = 0
    df_null_list.append(i)

When I do this and sum across the y axis, I start getting values of 2 and 4, which don't make sense. Note that I'm referring to when I ran this on the actual dataset.

Do you have any suggestions for improvements or alternative solutions?

Use mask:

df['X'] = df['X'].mask(df.Y == 1, 0)
df[['X', 'Y']] = df[['X', 'Y']].mask(df.Z == 1, 0)

Another solution with DataFrame.loc:

df.loc[df.Y == 1, 'X'] = 0
df.loc[df.Z == 1, ['X', 'Y']] = 0

print(df)
   X  Y  Z
0  0  0  1
1  0  1  0
2  0  0  1
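The loop from the question can be kept if each assignment goes through a single .loc call. In `df[df[i] == 1][df_null_list] = 0` the first indexing step returns a new object, so the assignment never reaches `df`. Here is a minimal sketch of the loop version; the helper name `zero_earlier_columns` is made up for this example.

```python
import pandas as pd


def zero_earlier_columns(df, columns):
    """Wherever a column equals 1, set all columns listed before it to 0."""
    out = df.copy()
    for pos, col in enumerate(columns[1:], start=1):
        earlier = columns[:pos]
        # One .loc call selects rows and columns together, so the
        # assignment writes into `out` rather than into a temporary copy.
        out.loc[out[col] == 1, earlier] = 0
    return out


df = pd.DataFrame({"X": [1, 0, 1], "Y": [0, 1, 1], "Z": [1, 0, 1]})
result = zero_earlier_columns(df, ["X", "Y", "Z"])
print(result)
```

The columns are processed left to right, which matches the order of the two .loc lines above.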

You can generalize this to keeping the last 1 in each row as 1 and setting everything else to 0. For performance, operate on the underlying numpy array:

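import numpy as np

a = df.values
# Column index of the last 1 in each row: argmax on the reversed row, mapped back
idx = (a.shape[1] - a[:, ::-1].argmax(1)) - 1
t = np.zeros(a.shape)
t[np.arange(a.shape[0]), idx] = 1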

array([[0., 0., 1.],
       [0., 1., 0.],
       [0., 0., 1.]])

If you need the result back as a DataFrame:

pd.DataFrame(t, columns=df.columns, index=df.index).astype(int)

   X  Y  Z
0  0  0  1
1  0  1  0
2  0  0  1
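One caveat with the argmax approach: for a row that contains no 1 at all, argmax returns 0 on the reversed row, so the last column of that row would be set to 1. The example data has no such row, but if the real dataset might, a guard along these lines could help. This is a sketch; the all-zero third row is added here purely for illustration.

```python
import numpy as np
import pandas as pd

# Third row contains no 1 (added for illustration)
df = pd.DataFrame({"X": [1, 0, 0], "Y": [0, 1, 0], "Z": [1, 0, 0]})

a = df.to_numpy()
n_rows, n_cols = a.shape

# Position of the last 1 in each row; for an all-zero row this
# falls back to the last column, which is why the guard is needed.
idx = n_cols - 1 - (a[:, ::-1] == 1).argmax(axis=1)

t = np.zeros(a.shape, dtype=int)
# Write a 1 only in rows that actually contain a 1
has_one = (a == 1).any(axis=1)
t[np.arange(n_rows)[has_one], idx[has_one]] = 1

result = pd.DataFrame(t, columns=df.columns, index=df.index)
print(result)
```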

Another solution could be to perform an expanding operation on the rows axis using numpy:

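import numpy as np

df1 = df.copy() == 1
# raw=True passes each window as a numpy array, so x[-1] is positional
df1.iloc[:,::-1].expanding(axis=1).apply(
                 lambda x: x[-1] * np.prod(np.logical_not(x[:-1])),
                 raw=True
                 ).iloc[:,::-1]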

     X    Y    Z
0  0.0  0.0  1.0
1  0.0  1.0  0.0
2  0.0  0.0  1.0
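The same condition (a cell is 1 and no 1 appears to its right) can also be written without a per-window Python callback, by using a reversed cumulative sum. This is a different technique from the expanding call above, offered only as a sketch on the example data.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 0, 1], "Y": [0, 1, 1], "Z": [1, 0, 1]})

ones = df.to_numpy() == 1
# Number of 1s from each cell to the end of its row (inclusive)
ones_from_here = ones[:, ::-1].cumsum(axis=1)[:, ::-1]
# Keep a cell only if it is 1 and it is the only 1 from there to the row end
keep = ones & (ones_from_here == 1)

result = pd.DataFrame(keep.astype(int), columns=df.columns, index=df.index)
print(result)
```

A row with no 1 stays all zeros here, and the result keeps an integer dtype.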

