pandas DataFrame: replace values in multiple columns with the value from another
I've got a pandas DataFrame where I want to replace certain values in a selection of columns with the value from another column in the same row.
I did the following:
df[cols[23:30]] = df[cols[23:30]].apply(lambda x: x.replace(99, df['col1']))
df[cols[30:36]] = df[cols[30:36]].apply(lambda x: x.replace(99, df['col2']))
cols is a list of column names. It works, but replacing all those values takes longer than seems necessary. I figured there must be a computationally quicker way of achieving the same result.
Any suggestions?
You can try:
import numpy as np
df[cols[23:30]] = np.where(df[cols[23:30]] == 99, df[['col1'] * (30-23)], df[cols[23:30]])
df[cols[30:36]] = np.where(df[cols[30:36]] == 99, df[['col2'] * (36-30)], df[cols[30:36]])
df[["col1"] * n] creates a DataFrame with the same column repeated n times, so its shape matches the n columns you want to update. np.where then takes the value from that repeated column wherever 99 is encountered and otherwise keeps the value that is already there.
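To make the shape-matching idea concrete, here is a minimal sketch on a toy DataFrame. The column names "a" and "b", the list name target, and the data are made up for illustration; they stand in for a slice such as cols[23:30]. The second variant, which relies on NumPy broadcasting instead of repeating the column, is an alternative that is not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: "a" and "b" stand in for df[cols[23:30]],
# and 99 is the placeholder value to be replaced.
original = pd.DataFrame({
    "col1": [1, 2, 3],
    "a": [99, 5, 99],
    "b": [7, 99, 99],
})
target = ["a", "b"]

# Variant 1: the pattern from the answer. col1 is repeated once per target
# column so that all three arguments to np.where have the same shape.
df = original.copy()
df[target] = np.where(df[target] == 99, df[["col1"] * len(target)], df[target])

# Variant 2 (an assumption-level alternative): reshape col1 to a column
# vector of shape (n_rows, 1) and let NumPy broadcast it across the block.
df_b = original.copy()
block = df_b[target].to_numpy()
df_b[target] = np.where(block == 99, df_b["col1"].to_numpy()[:, None], block)

print(df)
print(df_b.equals(df))
```

Both variants work on whole blocks at once rather than column by column, which is where the speed-up over apply with replace would be expected to come from; timing it on your own data is the only way to confirm how large the gain is.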