
Regular Expression to replace some dots with commas in customers' comments

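I need to write a regular expression that replaces '.' with ',' in some patients' comments about drugs. They should have used commas after mentioning side effects, but some of them used dots. For example: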

text = "the drug side-effects are: night mare. nausea. night sweat. bad dream. dizziness. severe headache.  I suffered. she suffered. she told I should change it."

I wrote regex code to detect one word (e.g., headache) or two words (e.g., night mare) surrounded by two dots:

Detecting one word surrounded by two dots:

text = re.sub(r'(\.)(\s*\w+\s*\.)', r',\2 ', text)

Detecting two words surrounded by two dots:

text = re.sub(r'(\.)(\s*\w+\s\w+\s*\.)', r',\2 ', text)

This is the output:

the drug side-effects are: night mare, nausea,  night sweat.  bad dream, dizziness,  severe headache.   I suffered, she suffered.  she told I should change it.

But it should be:

the drug side-effects are: night mare, nausea,  night sweat,  bad dream, dizziness,  severe headache.   I suffered. she suffered.  she told I should change it.

My code did not replace the dot after "night sweat" with ','. In addition, if a sentence starts with a subject pronoun (such as I or she), I do not want to change the dot before it to a comma, even if the sentence has only two words (such as "I suffered"). I do not know how to add this condition to my code.

Any suggestions? Thanks!

You can use the following pattern:

\.(\s*(?!(?:i|she)\b)\w+(?:\s+\w+)?\s*)(?=[^\w\s]|$)

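This matches a dot, then captures one or two words, where the first word is not one of the pronouns you mentioned (you will most likely need to extend that list). The captured words must be followed by a character that is neither a word character nor whitespace (e.g. . ! : ,) or by the end of the string. Because that final check is a lookahead, the following dot is not consumed and can start the next match.

You then have to replace the match with ,\\1 (in Python, the raw string r',\1').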

In Python:

import re
text = "the drug side-effects are: night mare. nausea. night sweat. bad dream. dizziness. severe headache.  I suffered. she suffered. she told I should change it."
text = re.sub(r'\.(\s*(?!(?:i|she)\b)\w+(?:\s+\w+)?\s*)(?=[^\w\s]|$)', r',\1', text, flags=re.I)
print(text)

Output:

the drug side-effects are: night mare, nausea, night sweat, bad dream, dizziness, severe headache.  I suffered. she suffered. she told I should change it.
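To make the pieces of the pattern easier to follow, here is the same expression written with re.VERBOSE. This is only a sketch: the layout, the comments, the names PATTERN, sample and result, and the shortened sample text are illustrative and not part of the original answer.

```python
import re

# Same pattern as in the answer, split over several lines with re.VERBOSE.
PATTERN = re.compile(r"""
    \.                      # the dot that may become a comma
    (                       # group 1: the text kept after the comma
        \s*                 #   optional whitespace
        (?!(?:i|she)\b)     #   the first word must not be one of these pronouns
        \w+                 #   first word
        (?:\s+\w+)?         #   optional second word
        \s*                 #   optional trailing whitespace
    )
    (?=[^\w\s]|$)           # lookahead: punctuation or end of string, not consumed
""", re.VERBOSE | re.IGNORECASE)

# Shortened sample (an assumption for illustration).
sample = "are: night mare. nausea. night sweat. bad dream.  I suffered."
result = PATTERN.sub(r",\1", sample)
print(result)
```

Because the closing dot is only checked by the lookahead and never consumed, it remains available as the opening dot of the next match, which is why consecutive items such as "nausea" and "night sweat" are both handled.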

This may not be completely fail-safe, and you may have to extend the pattern for some edge cases.
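One way to extend the pronoun list without editing the pattern by hand is to build the alternation from a Python list. The sketch below assumes a hypothetical list PRONOUNS and hypothetical helper names build_pattern and dots_to_commas; none of them come from the original answer, and the list is not exhaustive.

```python
import re

# Hypothetical pronoun list; extend it to suit your data.
PRONOUNS = ["i", "you", "he", "she", "it", "we", "they"]

def build_pattern(pronouns):
    """Build the answer's pattern with a configurable pronoun alternation."""
    # Escape each entry so it is treated literally inside the regex.
    alternation = "|".join(re.escape(p) for p in pronouns)
    return re.compile(
        r"\.(\s*(?!(?:" + alternation + r")\b)\w+(?:\s+\w+)?\s*)(?=[^\w\s]|$)",
        flags=re.IGNORECASE,
    )

def dots_to_commas(text, pronouns=PRONOUNS):
    """Replace a dot with a comma when one or two non-pronoun-led words follow."""
    return build_pattern(pronouns).sub(r",\1", text)

print(dots_to_commas("nausea. dizziness. severe headache. They stopped. we agreed."))
```

The \b after the alternation keeps words that merely begin with a pronoun, such as "itching", from being treated as pronouns.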

