How to replace part of a string using a regular expression?
I have a dataframe like this:
import pandas as pd
data={'well':['10B23','10B23','10B23','10B23','10B23','10B23'],
'tag':['15B22|TestSep_OutletFlow','15B22|TestSep_GasOutletFlow','15B22|TestSep_WellNum','15B22|TestSep_GasPresValve','15B22|TestSep_Temp','WHT']}
df=pd.DataFrame(data)
df
well tag
0 10B23 15B22|TestSep_OutletFlow
1 10B23 15B22|TestSep_GasOutletFlow
2 10B23 15B22|TestSep_WellNum
3 10B23 15B22|TestSep_GasPresValve
4 10B23 15B22|TestSep_Temp
5 10B23 WHT
Now I want to replace everything before the | in the tag column with a string such as 11A22, so the dataframe after the replacement should look like this:
well tag
0 10B23 11A22|TestSep_OutletFlow
1 10B23 11A22|TestSep_GasOutletFlow
2 10B23 11A22|TestSep_WellNum
3 10B23 11A22|TestSep_GasPresValve
4 10B23 11A22|TestSep_Temp
5 10B23 WHT
I was thinking of using a regular expression with groups and replacing the group with a string. What I had in mind was this:
df['tag2']=df['tag'].str.replace(r'([a-z0-9]*)|TestSep_[a-z0-9]*','11A22',regex=True)
But this is the result I got:
well tag tag2
0 10B23 15B22|TestSep_OutletFlow 11A2211A22B11A2211A22|11A2211A2211A22O11A2211A...
1 10B23 15B22|TestSep_GasOutletFlow 11A2211A22B11A2211A22|11A2211A2211A22G11A2211A...
2 10B23 15B22|TestSep_WellNum 11A2211A22B11A2211A22|11A2211A2211A22W11A2211A...
3 10B23 15B22|TestSep_GasPresValve 11A2211A22B11A2211A22|11A2211A2211A22G11A2211A...
4 10B23 15B22|TestSep_Temp 11A2211A22B11A2211A22|11A2211A2211A22T11A2211A22
5 10B23 WHT 11A22W11A22H11A22T11A22
Thanks for your help.
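The pipe (|) is a special character in regular expressions (it separates alternatives), so you need to escape it as \| to match it literally.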
df["tag2"] = df["tag"].str.replace(r"^\w*\|", "11A22|", regex=True)
Output:
print(df)
well tag tag2
0 10B23 15B22|TestSep_OutletFlow 11A22|TestSep_OutletFlow
1 10B23 15B22|TestSep_GasOutletFlow 11A22|TestSep_GasOutletFlow
2 10B23 15B22|TestSep_WellNum 11A22|TestSep_WellNum
3 10B23 15B22|TestSep_GasPresValve 11A22|TestSep_GasPresValve
4 10B23 15B22|TestSep_Temp 11A22|TestSep_Temp
5 10B23 WHT WHT
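To see why the pattern from the question misbehaves, here is a minimal sketch using Python's standard re module directly, on the assumption that str.replace with regex=True applies the same matching rules. The names broken, fixed and samples are made up for illustration. Without the backslash, the pattern is read as "([a-z0-9]*) OR TestSep_[a-z0-9]*", and the first alternative can match an empty string at almost every position, which is where the repeated 11A22 comes from.

```python
import re

# Pattern from the question: the unescaped | splits it into two alternatives,
# and the first one, ([a-z0-9]*), also matches the empty string.
broken = r'([a-z0-9]*)|TestSep_[a-z0-9]*'

# Pattern from the answer: \| is a literal pipe, and ^ anchors the match
# to the start of the string, so only the prefix is replaced.
fixed = r'^\w*\|'

samples = ['15B22|TestSep_Temp', 'WHT']

broken_results = [re.sub(broken, '11A22', s) for s in samples]
fixed_results = [re.sub(fixed, '11A22|', s) for s in samples]

for s, b, f in zip(samples, broken_results, fixed_results):
    print(f'{s!r}: broken -> {b!r}, fixed -> {f!r}')
```

Because the fixed pattern requires a literal | after the prefix, a value with no pipe (such as WHT) does not match and is left unchanged.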