I've got a pandas DataFrame where I want to replace certain values in a selection of columns with the value from another column in the same row.
I did the following:
df[cols[23:30]] = df[cols[23:30]].apply(lambda x: x.replace(99, df['col1']))
df[cols[30:36]] = df[cols[30:36]].apply(lambda x: x.replace(99, df['col2']))
cols is a list of column names. It works, but replacing all those values takes longer than seems necessary. I figured there must be a (computationally) quicker way of achieving the same result.
Any suggestions?
You can try:
import numpy as np
df[cols[23:30]] = np.where(df[cols[23:30]] == 99, df[['col1'] * (30-23)], df[cols[23:30]])
df[cols[30:36]] = np.where(df[cols[30:36]] == 99, df[['col2'] * (36-30)], df[cols[30:36]])
df[["col1"] * n] creates a DataFrame with the same column repeated n times, so its shape matches the n columns you want to update. np.where then takes the value from that repeated column wherever 99 is encountered, and otherwise keeps the value that is already there.
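As a rough sketch of the same idea on made-up data (the column names "a" and "b" and all values here are hypothetical stand-ins for the real slice of cols), the repeated-column frame can also be swapped for a column vector that NumPy broadcasts across the target columns. This is a variation on the approach above, not the exact code from it:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: "a" and "b" stand in for df[cols[23:30]].
df = pd.DataFrame({
    "col1": [10, 20, 30],
    "a": [99, 2, 3],
    "b": [4, 99, 99],
})
target = ["a", "b"]

# Shape (n_rows, 1): NumPy broadcasts this across every target column.
fill = df["col1"].to_numpy()[:, None]
vals = df[target].to_numpy()

# Take col1's value where the cell equals 99, keep the existing value otherwise.
df[target] = np.where(vals == 99, fill, vals)
```

Whether this is noticeably faster than repeating the column depends on the data, so it is worth timing both on the real frame.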