Pandas Dataframe replace part of string with value from another column
I am having an issue when I try to replace part of a string with a value from another column. I want to replace 'Length' with df['Length'].
df["Length"]= df["Length"].replace('Length', df['Length'], regex = True)
Below is my data:
Input:
**Formula** **Length**
Length 5
Length+1.5 6
Length-2.5 5
Length 4
5 5
Expected Output:
**Formula** **Length**
5 5
6+1.5 6
5-2.5 5
4 4
5 5
However, the code I used above replaces my entire cell instead of only 'Length'. I am getting the output below. I found this happens because df['column'] is used as the replacement; if I use any other string as the replacement, the offset after it (e.g. -2.5) is not replaced.
**Formula** **Length**
5 5
6 6
5 5
4 4
5 5
May I know whether there is a replace method that takes its values from another column?
Thank you.
If you want to replace with values from another column, it is necessary to use DataFrame.apply:
df["Formula"]= df.apply(lambda x: x['Formula'].replace('Length', str(x['Length'])), axis=1)
print (df)
Formula Length
0 5 5
1 6+1.5 6
2 5-2.5 5
3 4 4
4 5 5
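For anyone who wants to run this end to end, here is a minimal sketch. The DataFrame construction is my assumption about how the question's data looks (Formula stored as strings, Length as integers); the replace step is the same row-wise apply as above.

```python
import pandas as pd

# Assumed reconstruction of the sample data from the question
df = pd.DataFrame({
    "Formula": ["Length", "Length+1.5", "Length-2.5", "Length", "5"],
    "Length": [5, 6, 5, 4, 5],
})

# Row by row, substitute the literal text 'Length' with that row's Length value
df["Formula"] = df.apply(
    lambda row: row["Formula"].replace("Length", str(row["Length"])),
    axis=1,
)
print(df)
```

Note that this is plain str.replace on each cell, so only the substring 'Length' changes and the offset after it is kept.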
Or use a list comprehension:
df["Formula"]= [x.replace('Length', str(y)) for x, y in df[['Formula','Length']].to_numpy()]
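One thing to watch out for: if the Formula column holds a real number in some rows (the last row of the sample might be the number 5 rather than the string '5'), calling .replace on that cell fails because numbers have no such method. Below is a hedged variant of the list comprehension that assumes such mixed data and converts each cell with str() first. It also pairs the columns with zip instead of to_numpy(), which is just my preference here, not a requirement.

```python
import pandas as pd

# Assumed sample data; note the last Formula entry is a number, not a string
df = pd.DataFrame({
    "Formula": ["Length", "Length+1.5", "Length-2.5", "Length", 5],
    "Length": [5, 6, 5, 4, 5],
})

# str(f) guards against non-string cells, which have no .replace method
df["Formula"] = [
    str(f).replace("Length", str(n))
    for f, n in zip(df["Formula"], df["Length"])
]
print(df["Formula"].tolist())
```

If every Formula value is already a string, the str(f) call is harmless and the result is the same as the version above.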
Just wanted to add that the list comprehension is much faster (roughly 9x in the timing below):
import pandas as pd
df = pd.DataFrame({'a': ['aba'] * 1000000, 'c': ['c'] * 1000000})
%timeit df.apply(lambda x: x['a'].replace('b', x['c']), axis=1)
# 1 loop, best of 5: 11.8 s per loop
%timeit [x.replace('b', str(y)) for x, y in df[['a', 'c']].to_numpy()]
# 1 loop, best of 5: 1.3 s per loop
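The %timeit lines above only work in IPython or Jupyter. If you want to repeat the comparison in a plain script, something like the following sketch should do. The function names with_apply and with_listcomp are made up for this example, and the frame is much smaller than in the benchmark above so that it finishes quickly; the absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

# Smaller frame than in the benchmark above so the sketch runs quickly
df = pd.DataFrame({"a": ["aba"] * 10_000, "c": ["c"] * 10_000})

def with_apply():
    # Row-wise apply: builds a Series object for every row
    return df.apply(lambda x: x["a"].replace("b", x["c"]), axis=1)

def with_listcomp():
    # Plain Python loop over the underlying array of row values
    return [x.replace("b", str(y)) for x, y in df[["a", "c"]].to_numpy()]

# Both approaches should produce the same strings
assert with_apply().tolist() == with_listcomp()

t_apply = timeit.timeit(with_apply, number=3)
t_list = timeit.timeit(with_listcomp, number=3)
print(f"apply: {t_apply:.3f}s, list comprehension: {t_list:.3f}s")
```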