Pandas: replace characters in a string based on a regex?

Question

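I want to replace some characters in strings in pandas (based on a match against the whole string) while leaving the rest of the string unchanged.

For example, in a numeric string, replace a dash with a decimal point unless the dash is at the start of the string: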

'26.15971' -> '26.15971'

'1030899' -> '1030899'

'26-404700' -> '26.404700'

'-26-403268' -> '-26.403268'

Code:

import pandas as pd

# --- simple dataframe
df = pd.DataFrame({'col1': ['26.15971', '1030899', '26-404700']})

# --- regex that only matches items of interest
regex_match = r'^\d{1,2}-\d{1,8}'
df.col1.str.match(regex_match)

# --- not sure how to replace only the middle hyphens?
# something like  df.col1.str.replace(r'^\d{1,2}(-)\d{1,8}', r'^\d{1,2}\.\d{1,8}') ??
# unclear how to get a repl that only alters a capture group and leaves the rest
# of the string unchanged

Answer 1

You can try a regex replacement that uses lookarounds:

df["col1"] = df["col1"].str.replace(r"(?<=\d)-(?=\d)", ".", regex=True)

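The regex (?<=\d)-(?=\d) targets every dash that sits between two digits and replaces it with a dot. A leading minus sign has no digit in front of it, so it is left alone.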
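As a quick check, the lookaround replacement can be run against the four sample strings from the question. This is a minimal sketch: the `fixed` column name is made up for illustration, and `regex=True` is passed explicitly because recent pandas versions treat the pattern as a literal string by default.

```python
import pandas as pd

# The four sample strings from the question.
df = pd.DataFrame({"col1": ["26.15971", "1030899", "26-404700", "-26-403268"]})

# Replace a dash only when it has a digit on both sides.
# The lookbehind/lookahead assert the digits without consuming them,
# so only the dash itself is replaced.
df["fixed"] = df["col1"].str.replace(r"(?<=\d)-(?=\d)", ".", regex=True)

print(df)
```

The leading minus in `-26-403268` is preserved because the lookbehind fails at the start of the string.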

We can also solve this with capture groups:

df["col1"] = df["col1"].str.replace(r"(\d{2,3})-(\d{4,8})", r"\1.\2", regex=True)
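If the replacement should only apply when the whole string has the expected shape, a stricter variant of the capture-group approach may be closer to what the question asks for. The sketch below is an assumption rather than part of the original answer: it anchors the pattern with `^` and `$`, reuses the digit counts from the question (`\d{1,2}` and `\d{1,8}`), and adds an optional leading minus sign to the first group. The `pattern` and `fixed` names and the extra date-like row are illustrative only.

```python
import pandas as pd

# Sample strings from the question, plus one illustrative row that
# contains dashes but should NOT be treated as a decimal number.
df = pd.DataFrame(
    {"col1": ["26.15971", "1030899", "26-404700", "-26-403268", "2020-10-08"]}
)

# Group 1: optional minus sign and 1-2 digits; group 2: 1-8 digits.
# The anchors make the pattern match only when the entire string fits.
pattern = r"^(-?\d{1,2})-(\d{1,8})$"

# \1 and \2 re-insert the captured groups, so only the middle dash changes.
df["fixed"] = df["col1"].str.replace(pattern, r"\1.\2", regex=True)

print(df)
```

Strings that do not match the full pattern, such as the date-like row, pass through unchanged, whereas the lookaround version would rewrite every dash between digits.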

Solution 1
1 Accepted 2020-10-08 02:48:58