Regular Expression to replace some dots with commas in customers' comments
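I need to write a regular expression that replaces '.' with ',' in some patients' comments about a drug. They should have used commas after each side effect they mention, but some of them used dots instead. For example: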
text = "the drug side-effects are: night mare. nausea. night sweat. bad dream. dizziness. severe headache. I suffered. she suffered. she told I should change it."
I wrote regular expression code to detect one word (e.g., headache) or two words (e.g., night mare) surrounded by two dots:
text= re.sub (r'(\.)(\s*\w+\s*\.)',r',\2 ', text )
text = re.sub (r'(\.)(\s*\w+\s\w+\s*\.)',r',\2 ', text11 )
This is the output:
the drug side-effects are: night mare, nausea, night sweat. bad dream, dizziness, severe headache. I suffered, she suffered. she told I should change it.
But it should be:
the drug side-effects are: night mare, nausea, night sweat, bad dream, dizziness, severe headache. I suffered. she suffered. she told I should change it.
My code does not replace the dot after "night sweat" with ','. In addition, if a sentence starts with a subject pronoun (such as I or she), I do not want to change the dot before it to a comma, even if the sentence has only two words (such as "I suffered"). I do not know how to add this condition to my code.
Any suggestions? Thanks!
You can use the following pattern:
\.(\s*(?!(?:i|she)\b)\w+(?:\s+\w+)?\s*)(?=[^\w\s]|$)
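This matches a dot, then captures one or two words where the first word is not one of the pronouns you mentioned (you will most likely need to extend that list). The captured words must be followed by a character that is neither a word character nor whitespace (for example . ! : ,) or by the end of the string. Because that following character is checked with a lookahead and not consumed, it can serve as the leading dot of the next match.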
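To make the pieces of the pattern easier to follow, here is the same expression written with re.VERBOSE so that each part carries a comment. This is only an annotated sketch; the names PATTERN, sample and captured are illustrative and not part of the original answer.

```python
import re

# Same pattern as above, spread out with re.VERBOSE for readability.
PATTERN = re.compile(
    r"""
    \.                   # the dot that may become a comma
    (                    # group 1: everything kept after the comma
        \s*              #   optional whitespace after the dot
        (?!(?:i|she)\b)  #   first word must not be one of these pronouns
        \w+              #   first word
        (?:\s+\w+)?      #   optional second word
        \s*              #   optional trailing whitespace
    )
    (?=[^\w\s]|$)        # next must be punctuation or end of string (not consumed)
    """,
    re.VERBOSE | re.IGNORECASE,
)

sample = (
    "the drug side-effects are: night mare. nausea. night sweat. bad dream. "
    "dizziness. severe headache. I suffered. she suffered. "
    "she told I should change it."
)

# List what group 1 captures for each dot that would be replaced.
captured = [m.group(1).strip() for m in PATTERN.finditer(sample)]
print(captured)
```

Each captured item is the one- or two-word phrase that follows a dot that would be turned into a comma.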
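You then have to replace each match with ,\\1 (written as r',\1' in a raw string), that is, a comma followed by the captured words.
In Python: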
import re
text = "the drug side-effects are: night mare. nausea. night sweat. bad dream. dizziness. severe headache. I suffered. she suffered. she told I should change it."
text = re.sub(r'\.(\s*(?!(?:i|she)\b)\w+(?:\s+\w+)?\s*)(?=[^\w\s]|$)', r',\1', text, flags=re.I)
print(text)
Output:
the drug side-effects are: night mare, nausea, night sweat, bad dream, dizziness, severe headache. I suffered. she suffered. she told I should change it.
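A likely reason the original attempt skipped every other dot is that its pattern consumed the closing dot, so that dot was no longer available as the opening dot of the next match. The small comparison below illustrates the difference on a toy string; the variable names and the toy input are made up for illustration.

```python
import re

toy = "x: a. b. c. d. e."

# Consuming version: the trailing dot is part of the match,
# so the scan resumes after it and the next item is skipped.
consuming = re.sub(r"\.(\s*\w+)\.", r",\1.", toy)

# Lookahead version: the trailing dot is only checked, not consumed,
# so it can start the next match.
lookahead = re.sub(r"\.(\s*\w+)(?=\.)", r",\1", toy)

print(consuming)  # x: a, b. c, d. e.
print(lookahead)  # x: a, b, c, d, e.
```

re.sub only replaces non-overlapping matches, which is why the lookahead form is needed when adjacent items share a delimiter.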
This may not be completely fail-safe, and you may have to extend the pattern for some edge cases.
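One way to extend the pronoun list without editing the pattern by hand is to build the alternation from a Python list. The sketch below assumes a small, non-exhaustive pronoun list; PRONOUNS, build_pattern and dots_to_commas are hypothetical names chosen for this example, not something from the original answer.

```python
import re

# Assumed, non-exhaustive list of sentence-starting pronouns; adjust to your data.
PRONOUNS = ["i", "you", "he", "she", "it", "we", "they"]


def build_pattern(pronouns):
    """Compile the dot-to-comma pattern with the given pronouns excluded."""
    alternation = "|".join(re.escape(p) for p in pronouns)
    return re.compile(
        r"\.(\s*(?!(?:" + alternation + r")\b)\w+(?:\s+\w+)?\s*)(?=[^\w\s]|$)",
        flags=re.IGNORECASE,
    )


def dots_to_commas(text, pronouns=PRONOUNS):
    """Replace a dot with a comma before one- or two-word items not led by a pronoun."""
    return build_pattern(pronouns).sub(r",\1", text)


sample = "side effects: nausea. dizziness. they stopped. we agreed."
result = dots_to_commas(sample)
print(result)  # side effects: nausea, dizziness. they stopped. we agreed.
```

The \b after the alternation keeps short entries such as "i" from excluding ordinary words that merely begin with that letter.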