How to replace values in a column in pandas using regex and a conditional
I'm trying to replace certain values in a pandas DataFrame column using regex, but I want to apply the regex based on values in another column.
A basic example:
index col1 col2
1 yes foobar
2 yes foo
3 no foobar
Using the following:
df.loc[df['col1'] == 'yes', 'col2'].replace({r'(fo)o(?!bar)' :r'\1'}, inplace=True, regex=True)
I expected the following result:
index col1 col2
1 yes foobar
2 yes fo
3 no foobar
However, it doesn't seem to be working. It doesn't throw any errors or a SettingWithCopy warning; it just does nothing. Is there an alternative way to do this?
To avoid chained assignment, assign the result back and remove inplace=True:
mask = df['col1'] == 'yes'
df.loc[mask, 'col2'] = df.loc[mask, 'col2'].replace({r'(fo)o(?!bar)' :r'\1'}, regex=True)
print(df)
col1 col2
1 yes foobar
2 yes fo
3 no foobar
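A self-contained sketch of why the original call appears to do nothing and why assigning back works. It assumes the sample frame from the question; the names `selection`, `before_fix` and `after_fix` are illustrative only. The likely cause is that selecting with a boolean mask through `.loc` returns a new Series, so `inplace=True` modifies that temporary object rather than `df`.

```python
import pandas as pd

df = pd.DataFrame(
    {"col1": ["yes", "yes", "no"], "col2": ["foobar", "foo", "foobar"]},
    index=[1, 2, 3],
)
mask = df["col1"] == "yes"
pattern = {r"(fo)o(?!bar)": r"\1"}

# Boolean-mask selection with .loc yields a new Series, so an in-place
# replace changes only that temporary object, not df.
selection = df.loc[mask, "col2"]
selection.replace(pattern, regex=True, inplace=True)
before_fix = df["col2"].tolist()  # df has not been modified at this point

# Assigning the replaced values back through .loc updates df itself.
df.loc[mask, "col2"] = df.loc[mask, "col2"].replace(pattern, regex=True)
after_fix = df["col2"].tolist()
```

Comparing `before_fix` with `after_fix` should show that only the assignment through `.loc` changes row 2.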
Using np.where:
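import numpy as np
# regex=True is needed on pandas >= 2.0, where str.replace treats the pattern literally by default
df.assign(
col2=np.where(df.col1.eq('yes'), df.col2.str.replace(r'(fo)o(?!bar)', r'\1', regex=True), df.col2)
)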
col1 col2
1 yes foobar
2 yes fo
3 no foobar
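A runnable version of the np.where approach, again assuming the sample frame from the question (the names `replaced` and `result` are illustrative). Two details are worth noting: `regex=True` is passed explicitly because `Series.str.replace` defaults to literal matching since pandas 2.0, and `df.assign` returns a new DataFrame, so the result has to be kept or assigned back to `df`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"col1": ["yes", "yes", "no"], "col2": ["foobar", "foo", "foobar"]},
    index=[1, 2, 3],
)

# Run the regex over the whole column, then keep the replaced value
# only in rows where col1 is 'yes'; other rows keep their original value.
replaced = df["col2"].str.replace(r"(fo)o(?!bar)", r"\1", regex=True)
result = df.assign(col2=np.where(df["col1"].eq("yes"), replaced, df["col2"]))
```

Unlike the masked `.loc` assignment, this applies the regex to every row and discards the result where the condition is false, which is simpler to read but does slightly more work on large frames.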