
Replacing words containing at least one element of a list - Python / RegEx

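I have a list of special characters. I want to apply a RegEx function that replaces every word containing at least one element of the list with a comma (",").

So far, with the code below, I know how to replace the characters themselves, but I don't know how to replace the whole words.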


import re

characters_list = ['%', ':', '(']

text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."

# Replaces only the matching characters, not the whole words that contain them
regex = re.compile('|'.join(map(re.escape, characters_list)))
text = regex.sub(",", text)


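I want the string "text" to become:

", , About , of the Earth's surface is , and the oceans hold about , of all Earth's water."

(All words containing at least one element of my list "characters_list" have been changed to a comma.)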

Here is a solution without regular expressions:

>>> ' '.join((',' if any(c in word for c in characters_list) else word) for word in text.split())
", , About , of the Earth's surface is , and the oceans hold about , of all Earth's water."
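The one-liner above can be wrapped in a small helper for reuse. This is only a sketch: the function name `replace_words` and its `replacement` parameter are hypothetical, not part of the original answer. Note that `str.split()` with no argument collapses runs of whitespace, so the original spacing is not preserved.

```python
def replace_words(text, characters, replacement=','):
    """Replace every whitespace-separated word that contains any of `characters`.

    Hypothetical helper for illustration; not from the original answer.
    """
    words = text.split()  # splits on any whitespace and drops empty strings
    return ' '.join(
        replacement if any(c in word for c in characters) else word
        for word in words
    )


characters_list = ['%', ':', '(']
text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."

result = replace_words(text, characters_list)
print(result)
```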

Using re.findall:

' '.join([',' if re.findall('|'.join(map(re.escape,characters_list)), s) else s 
          for s in text.split(' ')])

Output:

", , About , of the Earth's surface is , and the oceans hold about , of all Earth's water."
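Since only a yes/no answer is needed for each word, the same idea could also be written with a pattern compiled once and `search`, which stops at the first hit instead of collecting every match. A minimal sketch under that assumption; the variable names `pattern` and `result` are chosen for illustration.

```python
import re

characters_list = ['%', ':', '(']
text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."

# Compile the alternation once instead of rebuilding it for every word
pattern = re.compile('|'.join(map(re.escape, characters_list)))

# A word becomes "," as soon as one special character is found in it
result = ' '.join(',' if pattern.search(word) else word
                  for word in text.split(' '))
print(result)
```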

Using re.sub:

import re

characters_list = ['%' , ':' , '(']
text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."

print(re.sub( '|'.join(r'(?:[^\s]*{}[^\s]*)'.format(re.escape(c)) for c in characters_list), ',', text ))

Prints:

, , About , of the Earth's surface is , and the oceans hold about , of all Earth's water.
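When every element of the list is a single character, the alternation can probably be collapsed into one character class. The sketch below assumes exactly that (multi-character elements would need the alternation form instead); `char_class` and `pattern` are illustrative names.

```python
import re

characters_list = ['%', ':', '(']
text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."

# Assumption: each list element is one character, so they fit in a single [...] class.
# re.escape keeps characters such as ']', '^' or '-' literal inside the class.
char_class = ''.join(map(re.escape, characters_list))

# \S* on both sides extends the match to the whole whitespace-delimited word
pattern = re.compile(r'\S*[{}]\S*'.format(char_class))

result = pattern.sub(',', text)
print(result)
```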

I think that, if you want the code to be easier to read, you can write it like this:

import re

text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."
# \S*[%:(]\S* matches a whole word containing %, : or ( at any position
replaced = re.sub(r'\S*[%:(]\S*', ',', text)

print(replaced)

Output:

, , About , of the Earth's surface is , and the oceans hold about , of all Earth's water.
