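Conditionally replace part of a string in a column of a DataFrame
I want to replace a string in a column only when the value does not already contain a particular delimiter. If the delimiter is already present in a row, I don't want to touch that row. I have about 3.5 million records.
Below is a sample set. I want to replace " is " with ":":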
One:1
Two:2
Three is 3
Four is IV:4
The output should look like this:
One:1
Two:2
Three:3
Four is IV:4
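Option 1
In place, with update
# regex=True is needed on pandas >= 2.0, where str.replace matches literally by default
df.update(
    df.myValues.loc[
        lambda x: ~x.str.contains(':')
    ].str.replace(r'\s+is\s+', ':', regex=True))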
myValues
0 One:1
1 Two:2
2 Three:3
3 Four is IV:4
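For reference, a self-contained sketch of why `update` leaves the other rows alone: it aligns on the index (and on the Series name as the column) and only overwrites the rows present in the Series it receives. The sample frame is rebuilt from the question's data; `changed` is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame(
    ["One:1", "Two:2", "Three is 3", "Four is IV:4"], columns=["myValues"]
)

# Only rows without ":" survive the filter, so this Series has a single entry.
changed = (
    df["myValues"]
    .loc[lambda s: ~s.str.contains(":", regex=False)]
    .str.replace(r"\s+is\s+", ":", regex=True)
)
changed_index = list(changed.index)

# update() modifies df in place, touching only the index labels found in `changed`.
df.update(changed)
result = df["myValues"].tolist()
print(result)
```

Because the Series keeps the name `myValues`, no column has to be named explicitly in the call to `update`.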
Option 2
Inline, using map
f = lambda x: x if ':' in x else x.replace(' is ', ':')
df.assign(myValues=list(map(f, df.myValues)))
myValues
0 One:1
1 Two:2
2 Three:3
3 Four is IV:4
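First, filter out all strings that contain ":". Then, for all remaining rows, replace " is " with ":". (In your example the spaces around "is" are removed as well, which is why I replace " is " rather than "is".)
import pandas as pd
df = pd.DataFrame(["One:1", "Two:2", "Three is 3", "Four is IV:4"], columns=["myValues"])
for idx, v in df[~df.myValues.str.contains(":")].iterrows():
    # assign through df.loc[row, column]; chained assignment may write to a temporary copy
    df.loc[idx, "myValues"] = v.myValues.replace(" is ", ":")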
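With about 3.5 million rows, a row-by-row loop can be slow. Below is a minimal vectorized sketch of the same filter-then-replace idea, assuming the column holds only strings; `has_colon` and `replaced` are illustrative names, and `Series.where` is a swapped-in technique rather than part of the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    ["One:1", "Two:2", "Three is 3", "Four is IV:4"], columns=["myValues"]
)

# True for rows that already contain the delimiter and must stay untouched.
has_colon = df["myValues"].str.contains(":", regex=False)

# Replace on the whole column (literal match), then keep the original
# value wherever has_colon is True.
replaced = df["myValues"].str.replace(" is ", ":", regex=False)
df["myValues"] = df["myValues"].where(has_colon, replaced)

where_result = df["myValues"].tolist()
print(where_result)
```

`Series.where(cond, other)` keeps the original value where `cond` is True and takes the value from `other` elsewhere, so no explicit loop is needed.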
Try a one-liner that uses loc and no loop:
import pandas as pd
df = pd.DataFrame(["One:1", "Two:2", "Three is 3", "Four is IV:4", "Five is V"], columns=["myValues"])
df.loc[~df['myValues'].str.contains(':'), 'myValues'] = df.loc[~df['myValues'].str.contains(':'), 'myValues'].str.replace(' is ', ':')
print(df)
myValues
0 One:1
1 Two:2
2 Three:3
3 Four is IV:4
4 Five:V
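The choice of pattern matters here. A minimal sketch, using a hypothetical extra value "This is 3", of how replacing the bare substring "is" differs from replacing " is " together with its surrounding spaces:

```python
import pandas as pd

s = pd.Series(["Three is 3", "This is 3"])

# Bare substring: the spaces are kept and the "is" inside "This" is hit as well.
bare = s.str.replace("is", ":", regex=False).tolist()

# Spaces included in the pattern: only the standalone word is replaced.
spaced = s.str.replace(" is ", ":", regex=False).tolist()

print(bare)    # ['Three : 3', 'Th: : 3']
print(spaced)  # ['Three:3', 'This:3']
```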
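Besides .contains(), you can also use plain string operations:
import pandas as pd
df = pd.DataFrame(["One:1", "Two:2", "Three is 3", "Four is IV:4"], columns=["myValues"])
target = [":" not in e for e in df.myValues]
# df.loc[mask, column] avoids chained assignment, which may not write back to df
df.loc[target, "myValues"] = df.loc[target, "myValues"].str.replace(" is ", ":")
Result: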
myValues
0 One:1
1 Two:2
2 Three:3
3 Four is IV:4
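One caveat with `":" not in e`: it raises a TypeError if the column contains missing values, because NaN is a float. Below is a sketch of a helper that skips such rows, assuming missing values should simply be left as they are; `replace_if_missing` and its parameters are hypothetical names, not part of pandas.

```python
import numpy as np
import pandas as pd


def replace_if_missing(series, old=" is ", new=":", delimiter=":"):
    """Replace `old` with `new` only in values that lack `delimiter`."""
    # na=True treats missing values as "already has the delimiter", so they are skipped.
    needs_fix = ~series.str.contains(delimiter, regex=False, na=True)
    out = series.copy()
    out[needs_fix] = series[needs_fix].str.replace(old, new, regex=False)
    return out


s = pd.Series(["One:1", "Three is 3", np.nan, "Four is IV:4"])
fixed = replace_if_missing(s)
print(fixed.tolist())
```

The helper returns a new Series, so it can be assigned back with `df["myValues"] = replace_if_missing(df["myValues"])`.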