
Replacing words containing at least one element of a list - Python / RegEx

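I have a list of special characters. I would like to apply a RegEx function that replaces every word containing at least one element of the list with a comma (",").

So far, with the code below, I know how to replace the characters themselves, but I don't know how to replace the whole word.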


import re

characters_list = ['%', ':', '(']

text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."

# Matches any single character from the list, so only those characters are replaced
regex = re.compile('|'.join(map(re.escape, characters_list)))
text = regex.sub(",", text)


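I would like the string "text" to become:

", , About , of the Earth's surface is , and the oceans hold about , of all Earth's water."

(Every word containing at least one element of my list "characters_list" has been changed to a comma.)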

Here is a solution without regular expressions:

>>> ' '.join((',' if any(c in word for c in characters_list) else word) for word in text.split())
", , About , of the Earth's surface is , and the oceans hold about , of all Earth's water."

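The one-liner above can be wrapped in a small function, as a minimal sketch. The name `replace_words` and its `replacement` parameter are hypothetical and not part of the original answer. Note that `text.split()` splits on any run of whitespace, so the original spacing is normalized to single spaces.

```python
def replace_words(text, characters, replacement=","):
    """Replace each whitespace-separated word that contains any of `characters`."""
    return " ".join(
        replacement if any(c in word for c in characters) else word
        for word in text.split()
    )


characters_list = ['%', ':', '(']
text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."

result = replace_words(text, characters_list)
print(result)
# , , About , of the Earth's surface is , and the oceans hold about , of all Earth's water.
```
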
Using re.findall:

' '.join([',' if re.findall('|'.join(map(re.escape,characters_list)), s) else s 
          for s in text.split(' ')])

Output:

", , About , of the Earth's surface is , and the oceans hold about , of all Earth's water."

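A possible variation on the re.findall approach, offered only as a sketch: since the test just needs to know whether a word contains a match, `re.search` (which stops at the first match) is enough, and the pattern can be compiled once outside the loop. The variable names `pattern` and `result` are illustrative.

```python
import re

characters_list = ['%', ':', '(']
text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."

# Compile the alternation of escaped characters once
pattern = re.compile('|'.join(map(re.escape, characters_list)))

# search() returns a match object (truthy) as soon as one character is found
result = ' '.join(',' if pattern.search(word) else word for word in text.split(' '))
print(result)
```

For this input the result should be the same string as the re.findall version produces.
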
Using re.sub:

import re

characters_list = ['%' , ':' , '(']
text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."

print(re.sub( '|'.join(r'(?:[^\s]*{}[^\s]*)'.format(re.escape(c)) for c in characters_list), ',', text ))

Prints:

, , About , of the Earth's surface is , and the oceans hold about , of all Earth's water.
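
One practical difference between the re.sub approach and the split/join approaches is whitespace handling: re.sub only touches the matched words, so the spacing between words is left as it was. The sketch below illustrates this on a hypothetical `sample` string containing a double space and a newline; it writes `\S*` in place of the equivalent `[^\s]*`.

```python
import re

characters_list = ['%', ':', '(']
sample = "a%  b\nc:"  # hypothetical input: double space and a newline

# Same construction as above, with \S* instead of [^\s]*
pattern = re.compile(
    '|'.join(r'(?:\S*{}\S*)'.format(re.escape(c)) for c in characters_list)
)

with_sub = pattern.sub(',', sample)
with_split = ' '.join(
    ',' if any(c in w for c in characters_list) else w for w in sample.split()
)

print(repr(with_sub))    # ',  b\n,'  (original whitespace preserved)
print(repr(with_split))  # ', b ,'    (whitespace collapsed to single spaces)
```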

I think that if you want to make the code easier to read, you could write it like this:

import re

text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."
replaced = re.sub(r'\S+(%|:|\()', ',', text)

print(replaced) 

Output:

, , About , of the Earth's surface is , and the oceans hold about , of all Earth's water.
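
The pattern `\S+(%|:|\()` works for this particular text, but it needs at least one character before the special character and stops matching right after it. A word such as `%x` is therefore left alone, and `a%b` becomes `,b`. As a hedged alternative, a character class surrounded by `\S*` covers those cases and can be built from `characters_list` (assumed non-empty). The names `char_class`, `robust`, and `sample` are illustrative only.

```python
import re

characters_list = ['%', ':', '(']
text = "Fun%( fact: About 71,3% of the Earth's surface is water-covered: and the oceans hold about 96.5% of all Earth's water."

# Build a character class such as [%:\(] from the (non-empty) list
char_class = '[' + ''.join(map(re.escape, characters_list)) + ']'
robust = re.compile(r'\S*' + char_class + r'\S*')

sample = "a%b %x y"  # hypothetical edge cases
original_result = re.sub(r'\S+(%|:|\()', ',', sample)
robust_result = robust.sub(',', sample)

print(original_result)  # ,b %x y
print(robust_result)    # , , y
print(robust.sub(',', text))
```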
