pandas DataFrame: replace values in multiple columns with a value from another column

Question

I've got a pandas DataFrame where I want to replace certain values in a selection of columns with the value from another column in the same row.

I did the following:

df[cols[23:30]] = df[cols[23:30]].apply(lambda x: x.replace(99, df['col1']))
df[cols[30:36]] = df[cols[30:36]].apply(lambda x: x.replace(99, df['col2']))

cols is a list of column names.
99 is treated as a missing value, which I want to replace with the (already calculated) mean for the given class (i.e., col1 or col2, depending on the selection).

It works, but replacing all those values takes longer than seems necessary. I figure there must be a computationally quicker way of achieving the same result.

Any suggestions?

Answer 1

You can try:

import numpy as np

df[cols[23:30]] = np.where(df[cols[23:30]] == 99, df[['col1'] * (30-23)], df[cols[23:30]])

df[cols[30:36]] = np.where(df[cols[30:36]] == 99, df[['col2'] * (36-30)], df[cols[30:36]])

df[["col1"] * n] creates a DataFrame in which the same column is repeated n times, so it has the same shape as the n columns you want to update. The comparison with 99 acts as the mask: wherever it is true, np.where takes the value from the repeated column in the same row; otherwise it keeps the value that is already there.
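To see the idea on a small scale, here is a minimal sketch on a made-up three-row frame. The column names a and b, the list name target, and the sample values are assumptions for illustration only; they are not from the question. It also shows a variant that relies on NumPy broadcasting instead of repeating the column, which should give the same result under these assumptions.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): "col1" holds the precomputed replacement values,
# "a" and "b" stand in for the slice of columns selected via cols[...].
df = pd.DataFrame({
    "col1": [1.5, 2.5, 3.5],
    "a": [99.0, 10.0, 11.0],
    "b": [20.0, 99.0, 99.0],
})
target = ["a", "b"]

# Approach from the answer: repeat "col1" once per target column so that
# the replacement block has the same shape as the block being updated.
repeated = np.where(df[target] == 99, df[["col1"] * len(target)], df[target])

# Variant: reshape "col1" to a column vector and let NumPy broadcast it
# across the target columns, which avoids building the repeated frame.
block = df[target].to_numpy()
broadcast = np.where(block == 99, df["col1"].to_numpy()[:, None], block)

# Write the result back into the selected columns.
df[target] = broadcast
print(df)
```

Both variants run a single vectorized pass over the whole block rather than calling replace once per column, which is where any speed-up would come from. It is worth timing them on the real data before settling on one.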

Solution 1
0 Accepted 2019-11-11 11:55:28
